This blog post is a continuation of last week’s blog on finding biological assay data; it is intended for researchers who use PubChem.
Your research focuses on a protein (receptor or enzyme) for which you’d like to identify a chemical probe or modulator. The probe could help to identify the subcellular location of a protein. A modulator may help to determine the biological effects of a particular protein’s activity. Additionally, finding a novel chemical that binds to your protein might assist you in exploring the use of a new class of therapeutics in drug design.
At NCBI, the PubChem BioAssay database stores biological activity assay information, which makes it possible to find experimentally measured targets for millions of chemicals. This blog post shows a simple workflow to download a table (with raw and kinetic data) of chemicals that have been determined to bind to a particular gene/protein target.
From the Gene page (example: Human cAMP-dependent Protein Kinase catalytic subunit, PRKACA):

1. On the right-hand side, in the “Related information” section, click on “BioAssay by Target (Summary)”, which will take you to a table view of all chemicals designated as “active” against this particular gene/protein target in PubChem.

NOTE: The Total BioAssays count is shown above the table, together with the number of Data Rows, which indicates the number of Substances tested in the BioAssays. Please note that within a BioAssay, a Substance may have been tested more than once, and the same chemical may have been tested in multiple assays, as indicated by the same CID.
2. This table has already been filtered for designated “active” chemicals. If you want, you can slide the bar over and narrow down the list to include only the precise BioActivity Type measured (Ki, IC50, EC50, AC50, Potency) – defined here.
3. To download the table, click on the “Data download” link on the right-hand side of the page. This will download the full table in CSV format with a file name in this format: GeneID_5566_assaydata.csv. The first row of the downloaded table states: “The table below shows PubChem BioAssay data for gene “gene symbol” (Gene ID: “GeneID”).” Columns include these types of data:
- Row #
- SID
- CID
- Outcome – as defined by the assay submitter
- Activity Type Measured
- Activity Concentration [in uM]
- AID
- BioAssay Title
- BioAssay Type
- Protein sequence ID – as provided by the assay submitter
- Relevant PMID – as provided by the assay submitter
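Once the file is downloaded, the table can be read with a short script. The following is a minimal Python sketch, not an official PubChem tool. It assumes that the descriptive first row is followed by a header row containing the column names "CID", "AID", and "Activity Type Measured" exactly as listed above; the real header spelling may differ, so check your own file. The function names `read_assay_table` and `unique_cids` are hypothetical helpers made up for this illustration.

```python
import csv
import io


def read_assay_table(text):
    """Parse the downloaded CSV text into a list of dicts, one per data row.

    Assumption: the header row is the first row that contains both a
    "CID" and an "AID" cell; everything before it (such as the
    descriptive first row) is skipped.
    """
    rows = list(csv.reader(io.StringIO(text)))
    for i, row in enumerate(rows):
        cells = [cell.strip() for cell in row]
        if "CID" in cells and "AID" in cells:
            header = cells
            data = rows[i + 1:]
            break
    else:
        raise ValueError("no header row with CID and AID columns found")
    # Skip blank lines and map each remaining row onto the header names.
    return [dict(zip(header, r)) for r in data if any(c.strip() for c in r)]


def unique_cids(records, activity_type=None):
    """Return the distinct CIDs in first-seen order.

    If activity_type is given (for example "IC50"), only rows whose
    "Activity Type Measured" cell matches it are considered.
    """
    seen = []
    for rec in records:
        measured = rec.get("Activity Type Measured", "").strip()
        if activity_type is not None and measured != activity_type:
            continue
        cid = rec.get("CID", "").strip()
        if cid and cid not in seen:
            seen.append(cid)
    return seen
```

To use it on a downloaded file, read the file contents first, for example `text = open("GeneID_5566_assaydata.csv", newline="").read()`, then call `unique_cids(read_assay_table(text), "IC50")`. Because the same chemical may appear in several assays under the same CID, collapsing rows by CID gives the number of distinct chemicals rather than the number of data rows.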
If you are interested in different ways of finding biological assay data, you may want to check out last week’s post. In that post, we provided a workflow to help you find and download a table of potential gene targets for a particular chemical. This information could be useful for identifying potential cross-reacting targets and predicting medication side effects.