##### HOMD Genomic Reference Sequences, Version 11.0, Original NCBI sequences
##### Updated date: 2025-03-06
#####
##### Genomic data are provided with the GCA IDs as the file names; see GCA_ID_info.txt (or GCA_ID_info.csv) for meta information
#####
##### Sequences are provided in the following files, taken from the original NCBI protein.faa, cds_from_genomic.fna, and genomic.fna files:
#####
##### 1. faa: proteins in fasta sequence format
##### 2. ffn: protein-encoding genes (DNA sequences) in fasta format
##### 3. fna: genomic sequences (complete or contigs) in fasta format
##### 4. gff: genome annotation in gff format
#####
##### The above files are provided as individual genomes (with GCA IDs) in the corresponding folders,
##### as well as all genomes in a single file all_genomes.XXX, where XXX is the respective file extension
#####
##### If a file is missing, it means NCBI did not provide that file. All genomes should have at least the genomic.fna.gz file;
##### all the missing files are recorded in the file "missing_files.txt".
##### If you have any questions, please contact George Chen by email: tchen@forsyth.org
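#####
##### Example (a minimal sketch, not an official HOMD tool): reading the fasta files (faa, ffn, fna) described above with Python.
##### The helper names open_maybe_gzip and read_fasta are hypothetical, and the sketch only assumes standard fasta layout
##### (a ">" header line followed by one or more sequence lines), optionally gzip-compressed.

```python
import gzip
from pathlib import Path
from typing import Iterable, Iterator, TextIO, Tuple


def open_maybe_gzip(path: Path) -> TextIO:
    """Open a plain or gzip-compressed text file for reading (hypothetical helper)."""
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fasta(handle: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a fasta-formatted text stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)
```

##### Usage, assuming a file path of your own (the path below is a placeholder, not an actual file in this release):
#####     with open_maybe_gzip(Path("path/to/genome.fna.gz")) as handle:
#####         for header, sequence in read_fasta(handle):
#####             print(header, len(sequence))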