Finding Chemical Probes and Modulators – The Hunt for New Chemical Reagents and Medicines


This blog post is a continuation of last week’s blog on finding biological assay data; it is intended for researchers who use PubChem.

Your research focuses on a protein (receptor or enzyme) for which you’d like to identify a chemical probe or modulator. The probe could help to identify the subcellular location of a protein. A modulator may help to determine the biological effects of a particular protein’s activity. Additionally, finding a novel chemical that binds to your protein might assist you in exploring the use of a new class of therapeutics in drug design.

At NCBI, the PubChem BioAssay database stores biological activity assay information, which makes it possible to find experimentally measured targets for millions of chemicals. This blog post shows a simple workflow to download a table (with raw and kinetic data) of chemicals that have been determined to bind to a particular gene/protein target.

From the Gene page (example: Human cAMP-dependent Protein Kinase catalytic subunit, PRKACA):

Figure 1. The Gene page for human PRKACA (protein kinase, cAMP-dependent, catalytic, alpha). Highlighted on the right-hand side of the page is the BioAssay by Target (Summary) link.

1. On the right-hand side, in the “Related information” section, click on “BioAssay by Target (Summary)”, which will take you to a table view of all chemicals designated as “active” against this particular gene/protein target in PubChem.
Figure 2. BioActivity table for PRKACA. Total BioAssays and Data Row are highlighted.

NOTE: The number of Total BioAssays is shown above the table, along with the number of Data Rows, which indicates the number of Substances tested in those BioAssays. Within a BioAssay, a Substance may have been tested more than once, and the same chemical (identified by the same CID) may have been tested in multiple assays.

2. This table has already been filtered for chemicals designated “active”. If you want, you can slide the bar over and narrow the list to include only the specific BioActivity Type measured (Ki, IC50, EC50, AC50, Potency) – defined here.

3. To download the table, click on the “Data download” link on the right-hand side of the page. This will download the full table in CSV format with a file name in this format: GeneID_5566_assaydata.csv. The first row of the downloaded table states: “The table below shows PubChem BioAssay data for gene “gene symbol” (Gene ID: “GeneID”).” Columns include these types of data:

  • Row #
  • SID
  • CID
  • Outcome – as defined by the assay submitter
  • Activity Type Measured
  • Activity Concentration [in µM]
  • AID
  • BioAssay Title
  • BioAssay Type
  • Protein sequence ID – as provided by the assay submitter
  • Relevant PMID – as provided by the assay submitter
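Once the file is downloaded, a short script can take over the filtering you would otherwise do by hand. The sketch below is one possible way to do it in Python, not an official PubChem tool. It assumes the layout described above: a one-line description in the first row, followed by a header row and the data rows. The exact header strings (for example `Activity Concentration [in uM]`) and every value in `SAMPLE_CSV` are made-up placeholders for illustration, so check them against your own file before relying on the result.

```python
import csv
import io

# Made-up sample in the layout described above. These are NOT real PubChem
# values, and the exact header strings are an assumption.
SAMPLE_CSV = """The table below shows PubChem BioAssay data for gene PRKACA (Gene ID: 5566).
Row #,SID,CID,Outcome,Activity Type Measured,Activity Concentration [in uM],AID,BioAssay Title
1,1001,11,active,IC50,0.5,9001,Hypothetical assay A
2,1002,11,active,IC50,0.2,9002,Hypothetical assay B
3,1003,22,active,Ki,1.5,9001,Hypothetical assay A
4,1004,33,active,IC50,,9003,Hypothetical assay C
"""

# Assumed header names; adjust to match the downloaded file.
TYPE_COL = "Activity Type Measured"
CONC_COL = "Activity Concentration [in uM]"


def load_rows(handle):
    """Read the table, skipping the one-line description in the first row."""
    reader = csv.reader(handle)
    next(reader)           # description line
    header = next(reader)  # column names
    return [dict(zip(header, row)) for row in reader if row]


def best_per_cid(rows, activity_type):
    """Return (CID, lowest concentration) pairs for one activity type.

    Rows without a numeric concentration are skipped. The same CID can
    appear in several assays, so only its lowest value is kept.
    """
    best = {}
    for row in rows:
        if row.get(TYPE_COL) != activity_type:
            continue
        try:
            conc = float(row.get(CONC_COL, ""))
        except ValueError:
            continue
        cid = row["CID"]
        if cid not in best or conc < best[cid]:
            best[cid] = conc
    return sorted(best.items(), key=lambda item: item[1])


rows = load_rows(io.StringIO(SAMPLE_CSV))
for cid, conc in best_per_cid(rows, "IC50"):
    print(f"CID {cid}: lowest IC50 = {conc} uM")
```

To run this on a real download, replace `io.StringIO(SAMPLE_CSV)` with something like `open("GeneID_5566_assaydata.csv", newline="")`. Keeping only the lowest value per CID is a simplification chosen for this sketch; depending on your question, you may prefer to keep every measurement and compare assays side by side.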

If you are interested in other ways of finding biological assay data, you may want to check out last week’s post. In that post, we provided a workflow to help you find and download a table of potential gene targets for a particular chemical. This information could be useful for identifying potential cross-reacting targets and predicting medication side effects.
