NCBI hidden Markov models (HMM) release 4.0 now available!

Release 4.0 of the NCBI hidden Markov models (HMM) used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available from our FTP site. You can search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package. This release contains 17,443 models, including 94 new models since the … Continue reading NCBI hidden Markov models (HMM) release 4.0 now available!

The Protein Family Model resource is now available!

The new Protein Family Model resource  (Figure 1) provides a way for you to search across the evidence used by the NCBI annotation pipelines to name and classify proteins. You can find protein families by gene symbol, protein function, and many other terms. You have access to related proteins in the family and publications describing … Continue reading The Protein Family Model resource is now available!

RefSeq Release 202 is public

RefSeq release 202 is accessible online, via FTP and through NCBI’s Entrez programming utilities, E-utilities. This full release incorporates genomic, transcript, and protein data available as of September 8, 2020, and contains 255,571,455 records, including 186,755,483 proteins, 33,077,068 RNAs, and sequences from 104,969  organisms. The release is provided in several directories as a complete dataset … Continue reading RefSeq Release 202 is public

Updated and improved collection of RefSeq representative genome assemblies now available

We have updated the collection of representative genome assemblies for Bacteria and Archaea. As announced in April, this set is now recalculated three times a year. We selected a total of 11,727 prokaryotic assemblies to represent their respective species among the 192,000 assemblies in RefSeq. Six hundred and thirty-five species were included in the collection … Continue reading Updated and improved collection of RefSeq representative genome assemblies now available

Updated protein family models used by PGAP available for download

Release 3.0 of the NCBI protein family models used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available from our FTP site. You can search this collection of hidden Markov models (HMMs) against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package. The 3.0 release contains 17,350 models: 12,864 … Continue reading Updated protein family models used by PGAP available for download

Recalculation of prokaryotic reference and representative genome assemblies

We have updated the collection of representative and reference assemblies for Bacteria and Archaea to better reflect the taxonomic breadth of the prokaryotes in RefSeq.  We chose the 11,478 representative assemblies in the new collection from the 180,000+ prokaryotic assemblies … Continue reading

Protein family models used by PGAP are now available for download

A new release of the NCBI protein families profiles used by PGAP (the Prokaryotic Genome Annotation Pipeline) is now available. You can search this collection of Hidden Markov models (HMM) against your favorite prokaryotic proteins to identify their function using … Continue reading

NCBI staff to present 3 posters at Advances in Genome Biology and Technology (AGBT), February 2020

Next week, NCBI staff will attend AGBT in Marco Island, Florida. On Tuesday, February 25, 2020, three posters from NCBI staff will be on display from 4:40 p.m. – 6:10 p.m. during the Poster Session and Wine Reception in the … Continue reading

NCBI staff to present 3 posters at Advances in Genome Biology and Technology (AGBT), February 2020

Next week, NCBI staff will attend AGBT in Marco Island, Florida. On Tuesday, February 25, 2020, three posters from NCBI staff will be on display from 4:40 p.m. – 6:10 p.m. during the Poster Session and Wine Reception in the … Continue reading

New PGAP release with Singularity, no-internet option, and Taxonomy Check

A new version of the Prokaryotic Genome Annotation Pipeline (PGAP) with several important features is now available on Github. In response to several requests we have added the option of running PGAP with Singularity, Podman or any other Docker-compatible executable … Continue reading