The OpenEye SAMPL challenge
Finally, something exciting. Some colleagues and I are taking part in the SAMPL (Statistical Assessment of the Modeling of Proteins and Ligands) challenge issued by OpenEye Software for their upcoming annual March meeting in Santa Fe, NM. OpenEye is well known for its ligand-based similarity searching tools, which have proven superior to many others for virtual screening. I am looking forward both to visiting the state, a dream I have had since I was a kid, and to working on the challenge.
The challenge is essentially to carry out the procedures for finding and ranking actives that are now a standard part of modeling in the pharmaceutical industry and elsewhere. The company will hand out three data sets that differ slightly from one another. Each set will contain a couple of thousand ligands, with the actives mixed in among lots of decoys. Sometimes a protein structure for the ligands might be thrown in. The goals are well-established and standard:
1. Virtual screening: find the actives, identify the decoys.
2. Crystallographic pose determination: find the correct crystallographic conformation for a few ligands in the active site.
3. Estimating binding affinity: the hardest task, probably the holy grail of the industry. What more could we want if we could correctly rank order compounds beforehand in a project and estimate their binding affinity?
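As a rough illustration of how a ranked list from the first task could be judged, here is a minimal Python sketch of two metrics commonly used for virtual screening: ROC AUC and the enrichment factor. This is an assumption for illustration only; how the organizers will actually score submissions is not stated here, and the function names and the toy scores and labels are hypothetical.

```python
def roc_auc(scores, is_active):
    """Probability that a randomly chosen active outscores a randomly
    chosen decoy (ties count as half). Higher score = predicted more active."""
    actives = [s for s, a in zip(scores, is_active) if a]
    decoys = [s for s, a in zip(scores, is_active) if not a]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def enrichment_factor(scores, is_active, fraction=0.1):
    """Hit rate in the top `fraction` of the ranked list divided by the
    hit rate in the whole set."""
    n = len(scores)
    total_actives = sum(1 for a in is_active if a)
    if n == 0 or total_actives == 0:
        raise ValueError("need a non-empty set with at least one active")
    n_top = max(1, int(round(n * fraction)))
    # Rank compounds from best score to worst.
    ranked = sorted(zip(scores, is_active), key=lambda p: p[0], reverse=True)
    hits_top = sum(1 for _, a in ranked[:n_top] if a)
    return (hits_top / n_top) / (total_actives / n)


# Toy data (made up): ten compounds, three of them active.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print("ROC AUC:", roc_auc(toy_scores, toy_labels))
print("EF at 20%:", enrichment_factor(toy_scores, toy_labels, fraction=0.2))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of actives from decoys; the enrichment factor instead rewards putting actives at the very top of the list, which is usually what matters when only a small fraction of compounds can be tested.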
Literature searching is discouraged, and the honor system is in effect. You can use whatever tools you can access. Participants include many well-known academic groups as well as people from both Big Pharma and "Small" Pharma. Depending on the data set, we can take on all three of the above tasks or just a subset. Once we finish one set, we submit the results before a deadline, and the next set is then released to us. The goal is not to win; in fact it's a win-win situation, because we will always end up learning something interesting. We will inevitably learn valuable lessons about ligand preparation, docking, solvation energy estimation, and other aspects of both ligand-based and structure-based design. Rather, the goal is to see and analyze how people throughout the country tackle some standard problems in early-stage drug discovery.
This should be fruitful and fun.