Now Available: RefSeq Release 223

Check out RefSeq release 223, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. What’s included in this release? As of March 4, 2024, this full release incorporates genomic, transcript, and protein data containing: 425,594,654 records; 316,329,937 proteins; 60,886,133 RNAs; sequences from 147,591 organisms. The release is provided in several directories as …

Join NCBI at TAGC 2024

March 6-10 in Washington, D.C. We look forward to seeing you in person at The Allied Genetics Conference (TAGC), March 6-10, 2024, in the Washington, D.C. metro area. NCBI staff will participate in a variety of activities and events, including hosting a hands-on workshop: Exploring and downloading NCBI data with NCBI Datasets. We’re also excited …

New RefSeq Annotations Now Available!

From October through January, the NCBI Eukaryotic Genome Annotation Pipeline released seventy new annotations in RefSeq! New annotations: Alnus glutinosa (eudicot), Amyelois transitella (moth), Anolis sagrei ordinatus (brown anole), Apis cerana (Asiatic honeybee), Balaenoptera ricei (Rice’s whale), Bombus pascuorum (bee), Bos javanicus (banteng), Bos taurus (cattle), Cannabis sativa (eudicot), Colius striatus (speckled mousebird), Cololabis saira (Pacific saury) …

Updated Bacterial and Archaeal Reference Genome Collection is Available!

Download the updated bacterial and archaeal reference genome collection! This collection of 18,943 genomes was built by selecting the “best” genome assembly for each species among the 330,000+ prokaryotic genomes in RefSeq (except for E. coli, for which two assemblies were selected as reference). You can speed up your sequence searches by running them against …

RefSeq Release 222 Now Available!

Check out RefSeq release 222, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. What’s included in this release? As of January 8, 2024, this full release incorporates genomic, transcript, and protein data containing: 411,137,832 records; 304,562,770 proteins; 59,343,570 RNAs; sequences from 145,371 organisms. The release is provided in several directories as …

Now Available: NCBI Hidden Markov Models (HMM) Release 14.0!

Download release 14.0 of the NCBI protein profile Hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP)! Search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package. What’s new? Release 14.0 contains: 16,383 HMMs maintained by NCBI; 252 new HMMs since release 13.0; several …

Gene Ontology (GO) Terms for NCBI RefSeq Eukaryotic Genomes

Are you interested in more functional information about protein-coding genes? We’ve expanded NCBI RefSeq’s Eukaryote Genome Annotation Pipeline (EGAP) to include Gene Ontology (GO) terms computed for most protein-coding genes. We are using the latest version of InterProScan, which now includes analysis based on PANTHER reference trees, on all NCBI RefSeq eukaryotic genomes. That means having …

RefSeq Release 221

RefSeq release 221 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. What’s included in this release? As of November 6, 2023, this full release incorporates genomic, transcript, and protein data containing: 404,657,610 records; 300,054,945 proteins; 57,882,313 RNAs; sequences from 143,819 organisms. The release is provided in several directories as a …

Now Available! Compare NCBI RefSeq and UniProt Datasets

Do you need to compare and combine data based on NCBI RefSeq and UniProt datasets, and aren’t sure which proteins are comparable? For many years, NCBI Gene has provided information about the relationships between RefSeq and UniProt accessions courtesy of data imported from UniProt, but the tremendous growth of both datasets has led to large …

New Annotations in RefSeq!

In July, August, and September, the NCBI Eukaryotic Genome Annotation Pipeline released fifty-six new annotations in RefSeq! New annotations: Achroia grisella (moth), Acipenser ruthenus (sterlet), Ahaetulla prasina (snake), Alligator mississippiensis (American alligator), Ammospiza caudacuta (bird), Ammospiza nelsoni (bird), Anopheles bellator (mosquito), Anopheles coustani (mosquito), Anopheles ziemanni (mosquito), Arachis stenosperma (eudicot), Carassius carassius (crucian carp), Centropristis striata (black seabass) …