The Cheminformatics Network Blog

Cheminformatics, Bioinformatics, Systems Biology, Network Theory, Drug Design, Computational Chemistry and Computational Biology

Friday, March 24, 2006

The 2nd SBS HTS Data Mining Competition

The second HTS data mining competition offers three data sets for analysis and prediction:
1. The first data set has been supplied by researchers from the National Centre for Chemical Genomics (NCGC) and comes from a biochemical assay looking for inhibitors of pyruvate kinase. This biochemical assay allows modelers, as well as dockers, to test their methods in a real-world situation. As is common in screening, there is a short timeline for submitting results on this data set, because the data has been submitted for publication. The submission deadline for this data set will therefore be the date on which the complete data set is published and made public by the NCGC through PubChem.
2. The second data set is also from the NCGC and consists of the results of a glucocerebrosidase biochemical assay, for which crystal structures are available for the dockers to use for prediction. This data set will soon be available as well.
3. The third and final data set comes from a cell-based screen run by the ICCB screening group at Harvard University. Because a cell-based assay can pick up compounds that interact with any aspect of the biology of the system, it precludes docking. However, the data set will be presented with the plate and layout information as well as the primary reader data, which will allow the modelers to test their correction algorithms as well as their data modeling methods.
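
To make "correction algorithms" a little more concrete, here is a minimal sketch of one common approach to plate positional effects: median polish, which strips additive row and column trends from the raw reader values and leaves residuals in which real hits stand out. The announcement does not say which correction methods entrants are expected to use or how the plate data will be formatted, so the function name `median_polish`, the list-of-rows plate layout, and the toy numbers below are all assumptions made for illustration only.

```python
from statistics import median


def median_polish(plate, n_iter=10):
    """Return residuals after removing additive row and column effects.

    `plate` is a list of rows, each a list of raw reader values.
    This layout is an assumption; the real competition files may differ.
    """
    resid = [list(row) for row in plate]
    for _ in range(n_iter):
        # Subtract each row's median from that row.
        for row in resid:
            m = median(row)
            for j in range(len(row)):
                row[j] -= m
        # Subtract each column's median from that column.
        for j in range(len(resid[0])):
            m = median(row[j] for row in resid)
            for row in resid:
                row[j] -= m
    return resid


# Toy 4 x 6 plate (made-up numbers): a baseline signal of 100 plus a
# row gradient and a column gradient, with one inhibited well at (1, 2).
row_effect = [0, 5, 10, 15]
col_effect = [0, 2, 4, 6, 8, 10]
plate = [[100 + r + c for c in col_effect] for r in row_effect]
plate[1][2] -= 50  # simulated hit

residuals = median_polish(plate)
for row in residuals:
    print(row)
```

On this toy plate the row and column gradients are removed and the simulated hit is the only well left with a large residual. Real screening data would also need a robust scale estimate (for example, dividing by the plate's median absolute deviation) before hits are called, which this sketch leaves out.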

The data sets are available from http://ncgc.nih.gov/pub/ncgcsbs/