Institutional Repositories in PubMed: a new quick way to free full texts


New icons that let you click through to free full texts are starting to appear in PubMed. They take you directly to the copy of the publication uploaded to an institutional repository (IR). Here’s an example:

[Icon: Deep Blue]

This one is from Deep Blue, University of Michigan’s Library IR. When you see it on a publication like this one on Ebola, you can get free access to the publication there.

The icons only appear when there is no free full text via the journal or PMC (PubMed Central). So far, only 4 IRs with eligible publications are participating – you can see which ones they are here. They already expand access to around 25,000 publications.

The NCBI program that enables this is LinkOut. You can read more about it in the NLM Technical Bulletin. IRs can apply by email to join LinkOut. And if you are an author at an institution with a repository, support your IR and enable more people to read your work.

Complete RefSeq genome annotation results represented in UCSC genome browser


NCBI’s RefSeq project provides comprehensive annotation of the human and other eukaryotic genomes through a combination of curation and an evidence-based eukaryotic genome annotation pipeline. Our curated records, ‘Known RefSeqs’, can be identified by the accession prefix (NM_, NR_, NG_, NP_). Model RefSeq records (XM_, XR_, and XP_ accession prefixes) are predicted based on transcript evidence (RNA-Seq and more) and protein support from Known RefSeqs, Swiss-Prot, and select INSDC records.
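The prefix convention described above can be sketched in a few lines of Python. This is only an illustration, not NCBI code: the function name `classify_refseq`, the "other" fallback label, and the sample accession strings are assumptions made for the example, and only the prefixes listed in the paragraph above are recognized.

```python
# Prefixes taken from the paragraph above; nothing else is assumed to be known.
KNOWN_PREFIXES = {"NM_", "NR_", "NG_", "NP_"}   # curated 'Known RefSeqs'
MODEL_PREFIXES = {"XM_", "XR_", "XP_"}          # predicted 'Model RefSeqs'


def classify_refseq(accession: str) -> str:
    """Return 'known', 'model', or 'other' based on the accession prefix.

    'other' is a hypothetical catch-all for any prefix not listed above.
    """
    prefix = accession.strip().upper()[:3]
    if prefix in KNOWN_PREFIXES:
        return "known"
    if prefix in MODEL_PREFIXES:
        return "model"
    return "other"


if __name__ == "__main__":
    # Illustrative accession strings only.
    for acc in ["NM_000546.6", "XM_000001.1", "NC_000001.11"]:
        print(acc, classify_refseq(acc))
```

Matching on the first three characters is enough here because every prefix in the paragraph is two letters followed by an underscore.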

We recognize that many scientists access genome annotation data from one of three sources – NCBI, Ensembl, or UCSC. NCBI provides access to the human (and other) genome annotation results in the Genome Data Viewer, by BLAST and FTP, and per gene in NCBI’s Gene resource. Ensembl provides RefSeq annotation information based directly on the FTP content that NCBI releases. In the past, UCSC has provided a partial dataset of RefSeq human genome annotation content by aligning Known RefSeq transcripts to the genome using BLAT. With this approach, additional model RefSeq transcript variants, non-transcribed pseudogenes, and immunoglobulin and T-cell receptor regions were not available through UCSC services. In rare cases, the independent alignment method also produced small differences in exon structure compared to NCBI’s placement details, and some ambiguous placements for transcripts that originate from very similar paralogs but are uniquely placed within the NCBI dataset.
