Protein target prediction for identifying molecules with performance-enhancing potential

Principal investigator: J. Mitchell
Country: United Kingdom
Institution: University of St. Andrews
Year approved: 2010
Status: Completed
Themes: Methods

Project description

Code: 10C3JM

In this project, we will develop protein target prediction software to allow the performance-enhancing potential of a molecule to be identified from its chemical structure. This will be made freely available to WADA and to national anti-doping agencies. A Web interface, suitable for intranet deployment, will be underpinned by state-of-the-art predictive software linking molecules to the protein targets against which they are active; both on-target and off-target activities are covered equally well.

We will use machine learning methods to identify substances with potential for use as doping agents. We will primarily employ the Random Forest algorithm, in which we have particular expertise and which has given us excellent results in previous studies. By means of hybrid descriptors combining the geometrical detail of Ultrafast Shape Recognition (USR) with the chemical information provided by MACCS descriptors, we will encompass both key aspects of molecular recognition: shape and chemistry. We will use protein target prediction to obtain the on- and off-target bioactivities of molecules with known and unknown doping potential. The profile of activities across a representative panel of protein targets is the molecule’s “bioactivity spectrum”. We will use the bioactivity spectra of known performance-enhancing molecules as references against which to compare the predicted spectra of compounds whose performance-enhancing potential is unknown. Thus, we will classify molecules, including both licensed pharmaceuticals and other drug-like compounds, as potentially performance-enhancing or otherwise.
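To make the hybrid-descriptor idea concrete, the following is a minimal sketch, not the project's software. It computes the 12 shape descriptors of Ultrafast Shape Recognition (commonly abbreviated USR) from atomic coordinates and appends a vector of substructure key bits. The function names, the toy coordinates, and the three-bit key vector are assumptions made for illustration; real MACCS keys would come from a cheminformatics toolkit. The three moments are taken here as the mean, variance, and third central moment, although published implementations differ in how they normalize them.

```python
import math

def usr_descriptors(coords):
    """Return 12 USR-style shape descriptors for a list of (x, y, z) atoms."""
    n = len(coords)
    # Reference point 1: the molecular centroid (ctd).
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_ctd = [math.dist(ctd, c) for c in coords]
    # Reference point 2: the atom closest to the centroid (cst).
    cst = coords[d_ctd.index(min(d_ctd))]
    # Reference point 3: the atom farthest from the centroid (fct).
    fct = coords[d_ctd.index(max(d_ctd))]
    d_fct = [math.dist(fct, c) for c in coords]
    # Reference point 4: the atom farthest from fct (ftf).
    ftf = coords[d_fct.index(max(d_fct))]

    descriptors = []
    for ref in (ctd, cst, fct, ftf):
        dists = [math.dist(ref, c) for c in coords]
        mean = sum(dists) / n
        variance = sum((d - mean) ** 2 for d in dists) / n
        third_moment = sum((d - mean) ** 3 for d in dists) / n
        descriptors.extend([mean, variance, third_moment])
    return descriptors

def usr_similarity(a, b):
    """Score in (0, 1]: 1 / (1 + mean absolute descriptor difference)."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)) / len(a))

def hybrid_descriptor(coords, key_bits):
    """Concatenate shape descriptors with binary substructure keys.

    key_bits is a hypothetical stand-in for a MACCS key vector.
    """
    return usr_descriptors(coords) + [float(b) for b in key_bits]

# Toy example: four atoms at made-up positions and three made-up key bits.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
toy_vector = hybrid_descriptor(toy_coords, [1, 0, 1])
```

Because the shape descriptors depend only on interatomic distances, they are unchanged by translating or rotating the molecule, so no structural alignment is needed before comparison. A vector such as `toy_vector` is the kind of fixed-length input that a Random Forest implementation, for example scikit-learn's `RandomForestClassifier`, could be trained on.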

While the use of illegal performance-enhancing substances continues to threaten both the integrity of sporting competition and the health of athletes, our software will allow early identification of potential doping molecules. These compounds can then be prioritized for experimental testing, while no further experiments need to be conducted on those with negative in silico predictions. The use of this computational technology will massively reduce the need for animal or human experiments.
