GBREL.TXT          Genetic Sequence Data Bank
                         15 December 1990

                      GenBank(R) Release 66.0

                  Distribution Tape Release Notes

  41057 loci, 51306092 bases, from 50908 reported sequences

This document describes the data written on GenBank distribution tapes.
The examples used are from the current release. If you have any
questions or comments about the data bank, the distribution tape, or
this document, please call (415)962-7364 or write to:

     GenBank
     c/o IntelliGenetics Inc.
     700 East El Camino Real
     Mountain View, California 94040
     USA

The electronic mail address is: GENBANK@GENBANK.BIO.NET

1. INTRODUCTION

1.1 Release 66.0

Release 66.0 has 41,057 loci representing 51,306,092 bases. Release 65.0
had 39,553 loci with 49,179,285 bases. Release 66.0 is thus 4.3% larger
in bases than Release 65.0. A statistical summary of Release 66.0 is
presented in Appendix A.

1.2 Organization of This Document

This introduction notes the changes to GenBank since the last release.
The next section describes the contents of the tape files. The third
section illustrates the formats of the tape files. The fourth section
describes the proposed changes planned for future releases. The fifth
section describes known problems in this release. The last section
contains notes about the administration of GenBank.

1.3 Recent Changes in the Data Bank

1.3.1 Changes in This Release

1.3.1.1 Data from EMBL and the DNA Data Bank of Japan

New sequence data from EMBL Release 23 have been incorporated into this
release of GenBank.

Sequence data from Release 7.0 of the DNA Data Bank of Japan (DDBJ) have
also been incorporated into this release. Entries with accession numbers
beginning with the letter `D' have been created and annotated by DDBJ.
Release 66 contains approximately 467 entries from DDBJ.

1.3.1.2 Changes in the Source and Definition Lines

The Source and Definition lines for new entries are now generated by
the GenBank database software from information in the GenBank
relational database, rather than being entered by an annotator. These
lines include information such as the organism name, the name and type
of the molecule, and the gene product. An example is:

DEFINITION  A.auricula-judae (mushroom) 5S ribosomal RNA.
SOURCE      A.auricula-judae (mushroom) ribosomal RNA.

In Release 66.0, these lines contain the same information as before,
but in a format that is less English-like. This change applies only to
new entries.

1.3.1.3 Changes in Locus Names

Several locus names have been changed to make the application of the
organism codes consistent throughout the data bank. These changes are
listed in Appendix B.

1.3.1.4 Date on LOCUS Line

The date on the LOCUS line now indicates the actual date on which the
data first appeared in the GenBank relational database or the date of
the last revision. Previously, this was the date of the release in
which the data first appeared or was revised.

1.3.1.5 Release 66 "Close-of-Data"

The freeze date for data to appear in Release 66 was November 22, 1990.
This is the date on which flatfile generation began. The process of
converting the data from the relational database into the flatfile
ASCII format takes several days to complete. New data continue to be
added during this time; if data are added before their division has
been processed, they may appear in the release (even though dated after
the freeze date).

1.3.2 Changes in Earlier Releases

The following changes in GenBank format were implemented in previous
releases and described in their Release Notes. They are described here
again for those users who may not have received those releases.

1.3.2.1 International Feature Table Format (Release 64)

The EMBL Data Library and GenBank, with the assistance of the DNA Data
Bank of Japan, have developed a standard feature table to be
implemented by all three data banks. The new feature table is designed
to be more understandable and useful. The common feature table will
also make software development easier and allow simpler data conversion
between data banks.

The new feature table was implemented in Release 64.0 (June 1990).
Details of the new format are described in a document entitled: `The
DDBJ/EMBL/GenBank feature table: Definition, Version 1.01, September
10, 1988.' Copies of this document are available on request at the
address given on the first page of these release notes. Section 3.5.11
provides further information on the new feature table format.

1.3.2.2 Minor Differences in GenBank Format (Release 64)

Beginning in the second quarter of 1990, the GenBank data bank has been
maintained in relational format. The data bank will continue to be
distributed in the standard flatfile format described in this document.

Starting with Release 64.0, the standard flatfile GenBank data bank
(generated from the relational database tables) contains a few minor
format differences from previous releases. These differences are
described in the remainder of this section. Information about the
relational database format can be requested by calling (505) 665-2177
or by writing to:

     GenBank
     Group T-10, Mail Stop K710
     Los Alamos National Laboratories
     Los Alamos, NM 87545

1.3.2.2.1 Keyword Order

The order of keyword phrases on the KEYWORDS line is alphabetical.

1.3.2.2.2 Accession Number Order

The order of non-primary accession numbers on the ACCESSION line is not
necessarily preserved from one release to the next.

1.3.2.2.3 Reference Order

The order of the references in an entry is not necessarily preserved
from one release to the next.
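
Because the order of secondary accession numbers is not guaranteed to
be stable, a program that compares the same entry across two releases
may want to ignore that order. The following is a minimal Python
sketch of such a comparison, not part of the GenBank distribution. It
assumes the caller has already split each ACCESSION line into a list
of accession numbers with the primary number first; the function name
`same_accessions' is hypothetical.

```python
def same_accessions(old, new):
    """Compare two ACCESSION lists, ignoring the order of secondaries.

    Assumption: the first element of each list is the primary accession
    number; the remaining elements are secondary numbers whose order may
    differ from one release to the next.
    """
    if not old or not new:
        return old == new
    # The primary number must match exactly; secondaries are compared
    # as an unordered collection.
    return old[0] == new[0] and sorted(old[1:]) == sorted(new[1:])
```

The same order-insensitive approach can be applied to the references
of an entry.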

1.3.2.2.4 Changes in Taxonomy

The taxonomic classification of many organisms was changed to ensure
uniformity throughout the data bank. A few organisms are listed as
unclassified as a result of inconsistencies in the data. The annotation
staff is addressing these inconsistencies.

1.3.2.2.5 Inconsistencies in Reference Information

Inconsistencies in reference information have not been preserved. All
occurrences of a single reference are identical, usually matching the
first occurrence in the data bank.

1.3.2.2.6 Formatting Changes

Certain text fields (for example, definition, comment, and title) have
been reformatted slightly in some entries. In most cases, the actual
text remains unchanged.

1.3.2.2.7 Comment Field Formatting

Any special formatting in the comment field may not be preserved. This
may be corrected in future releases.

1.3.2.2.8 Reference Line Changes

The reference lines for sites and review papers have been slightly
modified. All of these papers are classified as `sites', with the
original text describing the citation appearing in the comment field.

Also, when a reference is cited elsewhere in an entry, in most cases
the entire citation appears in square brackets. Previously, only the
number of the reference appeared in square brackets.

2. ORGANIZATION OF TAPE FILES

2.1 Tape Formats

The GenBank data bank is available in three formats on three different
physical media (see Section 5.4 for further details on which formats
are available on each medium), and on CD ROM.

GenBank is available on 9-track, unlabelled, industry-standard, ASCII
magnetic tapes. These tapes have been written in fixed-length records
of 80 characters each, with no carriage-return or line-feed characters.
Each record corresponds to one line in the data bank; trailing blanks
have been added to the lines to make them all exactly 80 characters
long. (A completely blank line is therefore represented by 80 blanks.)
The label affixed to the tape reel indicates its block size and
density.
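
The fixed-length record layout described above can be converted to and
from ordinary lines of text with a few lines of code. The following
Python sketch is illustrative only and is not supplied on the tape. It
assumes that the tape contents have already been read into a single
string; the names `unpack_records' and `pack_records' are hypothetical.

```python
RECORD_LENGTH = 80  # characters per record on the unlabelled tapes


def unpack_records(data, record_length=RECORD_LENGTH):
    """Split a fixed-length record stream into lines.

    Trailing blanks are removed, so a record of 80 blanks becomes an
    empty line.
    """
    if len(data) % record_length != 0:
        raise ValueError("data is not a whole number of records")
    return [
        data[i:i + record_length].rstrip(" ")
        for i in range(0, len(data), record_length)
    ]


def pack_records(lines, record_length=RECORD_LENGTH):
    """Pad each line with trailing blanks and join without separators."""
    for line in lines:
        if len(line) > record_length:
            raise ValueError("line longer than one record")
    return "".join(line.ljust(record_length) for line in lines)
```

For example, `pack_records(["LOCUS", "", "//"])' yields 240 characters,
the middle 80 of which are blanks, and `unpack_records' applied to that
result returns the original three lines.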
If no specifications are received from you, the tape is written with a
fixed block size of 160 records (12,800 characters) and a density of
6250 bpi (bits per inch). We also offer tapes written at a density of
1600 bpi and a block size of 40 records (3200 characters).

GenBank is also available as a VAX/VMS Backup saveset (on 9-track tapes
or TK-50 cartridges) or as compressed Unix tar archives (on 9-track
tapes and Sun 1/4" QIC 24 format tape cartridges).

The GenBank tape distribution files are also available on ISO-9660
compatible CD ROM. The data are written as ASCII files with
variable-length records. Each record corresponds to one line in the
data bank and ends with a carriage-return and a line-feed character.

The data on the tapes have both uppercase and lowercase characters.
Upon special request, the unlabelled, 9-track tapes can be written
using uppercase characters only (Section 6.4 specifies which formats
are available in uppercase only).

2.2 Files

GenBank consists of twenty-two files in all magnetic tape
distributions. The list which follows describes each of the files
included in the distribution. In the following sections there are
additional lists indicating the breakdown of files on the various media
and formats.

2.2.1 File Descriptions

      1. GBREL.TXT - Release notes (this document).
      2. GBSDR.TXT - Short directory of the data bank.
      3. GBNEW.TXT - List of new or substantially revised entries.
      4. GBACC.IDX - Index of the entries according to accession number.
      5. GBKEY.IDX - Index of the entries according to keyword phrase.
      6. GBAUT.IDX - Index of the entries according to author.
      7. GBJOU.IDX - Index of the entries according to journal citation.
      8. GBHGM.IDX - Index of the entries according to gene symbol.
      9. GBDAT.FRM - Forms for submitting sequences or corrections to
                     GenBank.
     10. GBPRI.SEQ - Primate sequence entries.
     11. GBROD.SEQ - Rodent sequence entries.
     12. GBMAM.SEQ - Other mammalian sequence entries.
     13. GBVRT.SEQ - Other vertebrate sequence entries.
     14. GBINV.SEQ - Invertebrate sequence entries.
     15. GBPLN.SEQ - Plant sequence entries (including fungi and algae).
     16. GBORG.SEQ - Eukaryotic organelle sequence entries.
     17. GBBCT.SEQ - Bacterial sequence entries.
     18. GBRNA.SEQ - Structural RNA sequence entries.
     19. GBVRL.SEQ - Viral sequence entries.
     20. GBPHG.SEQ - Phage sequence entries.
     21. GBSYN.SEQ - Synthetic and chimeric sequence entries.
     22. GBUNA.SEQ - Unannotated sequence entries.

2.2.2 Fixed Length Records

Approximately 197 MB of disk space is required for the Release 66.0
files in fixed-length record format.

All the files fit on two 6250 bpi tapes and are divided between the
tapes as follows:

     Tape 1     GBREL.TXT  GBSDR.TXT  GBNEW.TXT  GBACC.IDX  GBKEY.IDX
                GBAUT.IDX  GBJOU.IDX  GBHGM.IDX  GBDAT.FRM  GBPRI.SEQ
                GBROD.SEQ  GBMAM.SEQ  GBVRT.SEQ  GBINV.SEQ  GBPLN.SEQ
                GBORG.SEQ
     Tape 2     GBBCT.SEQ  GBRNA.SEQ  GBVRL.SEQ  GBPHG.SEQ  GBSYN.SEQ
                GBUNA.SEQ

At 1600 bpi, seven tapes are required and the files are divided among
the tapes as follows:

     Tape 1     GBREL.TXT  GBSDR.TXT  GBNEW.TXT  GBACC.IDX  GBKEY.IDX
                GBAUT.IDX  GBJOU.IDX  GBHGM.IDX  GBDAT.FRM  GBMAM.SEQ
     Tape 2     GBPRI.SEQ
     Tape 3     GBROD.SEQ  GBORG.SEQ
     Tape 4     GBVRT.SEQ  GBBCT.SEQ
     Tape 5     GBINV.SEQ  GBPLN.SEQ  GBPHG.SEQ
     Tape 6     GBVRL.SEQ  GBSYN.SEQ
     Tape 7     GBUNA.SEQ  GBRNA.SEQ

2.2.3 VAX/VMS Backup Saveset

Saveset files are in directory order rather than in the order shown for
the formats above. The files are in compressed format (see Section 2.3
for details).

Approximately 139 MB of disk space is required for Release 66.0 files
in VAX/VMS Backup Saveset format. The files archived in the Backup
Saveset use variable-length records, not the 80-character fixed-length
records described above.

All files fit on one 6250 bpi tape. At 1600 bpi, two tapes are
required. The division of the files between the two tapes was not
available at the time these Release Notes were prepared.
The files will appear in the following order:

     AAAREADME.TXT
     DCOMPRESS.CLD
     DCOMPRESS.EXE
     DECMPRESS.COM
     GBACC_IDX.Z
     GBAUT_IDX.Z
     GBBCT_SEQ.Z
     GBDAT_FRM.Z
     GBHGM_IDX.Z
     GBINV_SEQ.Z
     GBJOU_IDX.Z
     GBKEY_IDX.Z
     GBMAM_SEQ.Z
     GBNEW_TXT.Z
     GBORG_SEQ.Z
     GBPHG_SEQ.Z
     GBPLN_SEQ.Z
     GBPRI_SEQ.Z
     GBREL_TXT.Z
     GBRNA_SEQ.Z
     GBROD_SEQ.Z
     GBSDR_TXT.Z
     GBSYN_SEQ.Z
     GBUNA_SEQ.Z
     GBVRL_SEQ.Z
     GBVRT_SEQ.Z

NOTE: When the files are uncompressed (as instructed in Section 2.3)
the `.Z' will be removed from the end of the file name and the
characters after the underscore will become the file extension. For
example, `GBACC_IDX.Z' will be named `GBACC.IDX'.

One TK-50 cartridge is required; the files are in directory order and
are compressed as described above.

2.2.4 Unix tar Format

The files are compressed with the Unix compress utility before the tar
command is executed; they must therefore be uncompressed before use
(see Section 2.4 below for details).

Approximately 45 MB of disk space is required for the Release 66.0
files when in the compressed format; the uncompressed files require
approximately 139 MB. The tar file uses variable-length records; the
records are not padded to 80 characters with space characters. To get
fixed-length, 80-character records, first uncompress the .Z files. Then
use dd with the conv=block and cbs=80 options set to filter the file.
Padding the records requires approximately 58 MB of additional disk
space.

In the Unix tar file, the files are in directory order rather than in
the order shown for the fixed-length record formats. In addition, the
file names are in lowercase letters.

All files fit on one 6250 bpi tape or Sun cartridge.
At 1600 bpi, two tapes are required, and the files are divided between
the tapes as follows:

Unix Tar File Order:

     Tape 1              Tape 2

     gbacc.idx.Z         gbpln.seq.Z
     gbaut.idx.Z         gbpri.seq.Z
     gbbct.seq.Z         gbrel.txt.Z
     gbdat.frm.Z         gbrna.seq.Z
     gbhgm.idx.Z         gbrod.seq.Z
     gbinv.seq.Z         gbsdr.txt.Z
     gbjou.idx.Z         gbsyn.seq.Z
     gbkey.idx.Z         gbuna.seq.Z
     gbmam.seq.Z         gbvrl.seq.Z
     gbnew.txt.Z         gbvrt.seq.Z
     gborg.seq.Z
     gbphg.seq.Z

NOTE: When the files are uncompressed the `.Z' extension is removed
from the file names.

2.2.5 File Sizes

The following table indicates the approximate sizes of the individual
files in this release. Since minor changes to some of the files may
occur after the release notes are printed, these sizes should not be
used to determine file integrity. They are provided as an aid to
planning only.

The columns in the table have the following meanings:

     (1) - Sizes (in bytes) of the fixed-length record files (described
           in Section 2.2.2)
     (2) - Sizes (in bytes) of the compressed files included in the
           Unix tarfile (Section 2.2.4)
     (3) - Sizes (in bytes) of the files in the Unix tarfile after
           uncompression (Section 2.2.4)
     (4) - Sizes (in blocks) of the compressed files included in the
           VMS Backup saveset (Sections 2.2.3 and 2.3)
     (5) - Sizes (in blocks) of the files in the VMS Backup saveset
           after decompression (Sections 2.2.3 and 2.3)

File                 (1)       (2)        (3)     (4)     (5)
-------------------------------------------------------------
GBACC.IDX        3616640    558701    1641276    1101    3294
GBAUT.IDX        9839040   1978063    5541281    4040   11106
GBBCT.SEQ       20356960   4995239   15062023   10510   30263
GBDAT.FRM          39600      8450      21155      19      43
GBHGM.IDX         168160     42557     127617      86     254
GBINV.SEQ       13488080   3145845    9751044    6643   19599
GBJOU.IDX        4474400    750259    2454640    1530    4928
GBKEY.IDX        3676480    866411    2497752    1754    4981
GBMAM.SEQ        6563120   1524109    4708672    3223    9463
GBNEW.TXT          96800     16960      55668      37     113
GBORG.SEQ        5763760   1375541    4241530    2893    8524
GBPHG.SEQ        2378720    560271    1671525    1179    3363
GBPLN.SEQ       14084400   3392955   10325013    7124   20746
GBPRI.SEQ       32726160   7473028   23428438   15823   47079
GBREL.TXT         367680     96837     298595     201     600
GBRNA.SEQ        4694720    821816    2794838    1798    5636
GBROD.SEQ       30187600   6727065   21356518   14252   42920
GBSDR.TXT        3289440   1074335    3285363    2181    6578
GBSYN.SEQ        2838560    611437    1841890    1308    3706
GBUNA.SEQ       11974000   2752439    8627808    5717   17370
GBVRL.SEQ       18141760   4437537   13647843    9285   27420
GBVRT.SEQ        7638320   1747319    5408466    3705   10877
AAAREADME.TXT                                       2       2
DCOMPRESS.CLD                                       4       4
DCOMPRESS.EXE                                     150     150
DECMPRESS.COM                                       2       2
-------------------------------------------------------------
Totals         196404400  44957174  138788955   94567  279021

NOTE: The sizes of the CD ROM files are approximately the same as those
of the uncompressed Unix tar files (Column 3). The addition of
carriage-return/line-feed characters at the end of each line in the CD
ROM files increases the total size of the distribution by approximately
3 MB.

2.3 Loading Data Bank Files in VAX/VMS Backup Format

In order to use the VAX/VMS Backup Saveset format, you must be running
release 5.0 or greater of the VMS operating system. If you are not
running release 5.0 or greater, you should order the unlabelled ASCII
format instead of VAX/VMS Backup.

The following command should be used to load the saveset into the
current directory on your disk:

     BACKUP/LOG MSA0:GENBANK []

(NOTE: Replace `MSA0' with the identifier for your tape device.)

The following command should be used to uncompress the files. NOTE: If
you do not want to keep all of the files, delete those you do not want
before you run the uncompress procedure. The uncompress routine works
on all the files in the directory that have a `.Z' extension.

     @DECMPRESS

The following commands were used to create the VAX/VMS Backup Saveset.
NOTE: The `...' indicates that the following line is a continuation and
should be typed without a break.

For 6250 bpi tape:

     BACKUP/DENSITY=6250/BUFFER=5/VERIFY/INTERCHANGE/...
     LIST=GB.LST GB1:[GENBANK.PROD]GB*.* TAPE:GENBANK

For 1600 bpi tape:

     BACKUP/DENSITY=1600/BUFFER=5/VERIFY/INTERCHANGE/...
     LIST=GB.LST GB1:[GENBANK.PROD]GB*.* TAPE:GENBANK

For TK-50 cartridge:

     BACKUP/BUFFER=5/VERIFY/INTERCHANGE/...
     LIST=GB.LST GB1:[GENBANK.PROD]GB*.* TAPE:GENBANK

2.4 Loading Data Bank Files in Unix tar Format

The following commands should be used to load the Unix tar files into
the current directory on your disk:

     tar xvfb /dev/rmt8 126 gb*.Z
     uncompress gb*.Z

(NOTE: Replace `rmt8' with the identifier for your device.)

The following command was used to write the tarfile on the distribution
tape:

For 6250 and 1600 bpi tapes (execute the command twice, once for each
tape, for 1600 bpi):

     tar cvfb /dev/rmt8 20 gb*.Z

For Sun cartridge:

     tar cvfb /dev/rst8 126 gb*.Z

3. FILE FORMATS

3.1 File Header Information

Each of the twenty-two files on the distribution tape begins with the
same header, except for the first line, which contains the file name,
and the sixth line, which contains the title of the file.

The first line of the file contains the file name in character
positions 1 to 9 and the full data bank name (Genetic Sequence Data
Bank) starting in column 20. The brief names of the files in this
release are listed in Section 2.2.

The second line contains the date of the current release in the form
`day month year', beginning in position 26.

The fourth line contains the current GenBank release number. The
release number appears in positions 41 to 45 and consists of two
numbers separated by a decimal point. The number to the left of the
decimal is the major release number. The digit to the right of the
decimal indicates the version of the major release; it is zero for the
first version.

The sixth line contains a title for the file.

The eighth line lists the number of entries (loci), number of bases (or
base pairs), and number of reports of sequences in this release of
GenBank. These numbers are right-justified at fixed positions. The
number of entries appears in positions 1 to 7, the number of bases in
positions 15 to 22, and the number of reports in positions 36 to 40.
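
Because the three counts on the eighth header line sit at fixed
positions, they can be read by slicing rather than by searching for
words. The Python sketch below is an illustration only. It assumes the
line is laid out exactly as described above; the function name
`parse_header_counts' is hypothetical.

```python
def parse_header_counts(line):
    """Extract (loci, bases, reports) from the eighth header line.

    Positions in these notes are counted from 1, so positions 1 to 7
    correspond to the Python slice [0:7], positions 15 to 22 to
    [14:22], and positions 36 to 40 to [35:40].
    """
    loci = int(line[0:7])        # right-justified, leading blanks allowed
    bases = int(line[14:22])
    reports = int(line[35:40])
    return loci, bases, reports
```

Applied to the header line of this release, the function returns the
three values 41057, 51306092, and 50908.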
(There are more reports of sequences than entries since reported
sequences that overlap or duplicate each other are combined into single
entries.)

The third, fifth, seventh, and ninth lines are blank.

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
GBACC.IDX          Genetic Sequence Data Bank
                         15 December 1990

                      GenBank(R) Release 66.0

                       Accession Number Index

  41057 loci, 51306092 bases, from 50908 reported sequences

---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 1. Sample File Header

3.2 Directory Files

3.2.1 Short Directory File

The short directory file contains brief descriptions of all of the
sequence entries contained in this release. These descriptions are in
thirteen groups, one group for each of the thirteen sequence entry data
files. The first record at the beginning of a group of entries contains
the name of the group in uppercase characters, beginning in position
21. The organism groups are PRIMATE, RODENT, OTHER MAMMAL, OTHER
VERTEBRATE, INVERTEBRATE, PLANT, ORGANELLE, BACTERIAL, STRUCTURAL RNA,
VIRAL, PHAGE, SYNTHETIC, and UNANNOTATED. The second record is blank.

Each record in the short directory contains the sequence entry name
(LOCUS) in the first 12 positions, followed by a brief definition of
the sequence beginning in column 13. The definition is truncated (at
the end of a word) to leave room at the right margin for at least one
space, the sequence length, and the letters `bp'. The length of the
sequence is printed right-justified to column 77, followed by the
letters `bp' in columns 78 and 79.

The next-to-last record for a group has `ZZZZZZZZZZ' in its first ten
positions (where the entry name would normally appear). The last record
is a blank line.
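
The short directory layout just described can be read with a small
amount of code. The following Python sketch is illustrative only and is
not part of the distribution. It assumes each record has already been
obtained as a string (with or without trailing blanks); the function
name `parse_short_directory_record' is hypothetical.

```python
def parse_short_directory_record(record):
    """Return (locus, definition, length) for one short directory record.

    Returns None for blank records, group-name records, and the
    ZZZZZZZZZZ record that marks the end of a group.
    """
    record = record.rstrip()
    if not record or record.startswith("ZZZZZZZZZZ"):
        return None
    if not record.endswith("bp"):
        return None  # for example, a group name such as INVERTEBRATE
    locus = record[0:12].strip()       # positions 1 to 12
    body = record[12:-2].rstrip()      # definition and length, minus `bp'
    # The length is the last blank-separated token before `bp'.
    definition, _, length = body.rpartition(" ")
    return locus, definition.rstrip(), int(length)
```

The sketch locates the length by splitting at the last blank rather
than by a fixed column, so it also accepts records whose trailing
blanks have been stripped.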

An example of the short directory file format, showing the descriptions
of the last entries in the Other Vertebrate sequence data file and the
first entries of the Invertebrate sequence data file, is reproduced
below:

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
ZEFHOX21    Zebrafish Hox-2.1 gene homologue (ZF-21).                     291bp
ZEFRZF21    Zebrafish mRNA for homeotic protein ZF-21.                   2073bp
ZEFZF54     Zebrafish homeotic gene ZF-54.                                246bp
ZEFZFEN     Zebrafish engrailed-like homeobox sequence.                   327bp
ZZZZZZZZZZ

                    INVERTEBRATE

ACAACTI     Amoeba (A. castellanii) actin gene-i.                       1571bp
ACAJJE      A.castellanii 18S ribosomal RNA.                              241bp
ACAJJEA     A.castellanii 18S ribosomal RNA.                              258bp
ACAJJEB     A.castellanii 18S ribosomal RNA.                              257bp
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 2. Short Directory File

3.2.2 New and Updated Entry File

The directory of new and updated entries is a list of those entries
that have been newly added or that have undergone substantive revision
in this release. These entries are listed in the same order in which
they appear in the actual data files; they are divided into thirteen
groups, one group for each of the thirteen sequence entry data files.
The first record at the beginning of a group of entries designates that
group, beginning in position 21. The second record is blank and the
third record has asterisks in its first ten positions.

Within each group, the entries are listed alphabetically. For each
entry, the new and updated entry file gives the information included
under the LOCUS and DEFINITION keywords in the same format in which
they appear in the actual sequence entry; these categories are
described in Section 3.5.2. After the last record of an entry comes a
record containing asterisks in its first ten positions.

At the end of each group, a dummy entry contains only a LOCUS line with
the entry name `ZZZZZZZZZZ'.
Therefore, the next-to-last record has ten asterisks in its first ten
positions; the last record of the group is blank.

The following excerpt from the current release shows the last new or
revised entry from the Other Vertebrate sequence data file, followed by
the first new or revised entry from the Invertebrate sequence data
file:

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
**********
LOCUS       RANCRYR23     266 bp ds-DNA             VRT       20-SEP-1990
DEFINITION  R.temporaria rho-crystallin gene, exon X.
**********
LOCUS       ZZZZZZZZZZ
**********

                    INVERTEBRATE

**********
LOCUS       ACAJJE        241 bp ss-rRNA            INV       05-NOV-1990
DEFINITION  A.castellanii 18S ribosomal RNA.
**********
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 3. New and Updated Entry File

3.3 Index Files

There are five files containing indices to the entries in this release:

     Accession number index file
     Keyword phrase index file
     Author name index file
     Journal citation index file
     Gene symbol index file

The index keys (accession numbers, keywords, authors, journals, and
gene symbols) of an index are sorted alphabetically. (The index keys
for the keyword phrases and author names appear in uppercase characters
even though they appear in mixed case in the sequence entries.) Under
each index key, the names of the sequence entries containing that index
key are listed alphabetically. Each sequence name is also followed by
its data file division and primary accession number. The following
codes are used to designate the data file divisions:

      1. PRI - primates
      2. ROD - rodents
      3. MAM - other mammals
      4. VRT - other vertebrates
      5. INV - invertebrates
      6. PLN - plants, fungi, and algae
      7. ORG - organelles
      8. BCT - bacteria
      9. RNA - structural RNAs
     10. VRL - viruses
     11. PHG - bacteriophage
     12. SYN - synthetic sequences
     13. UNA - unannotated sequences

The index key begins in column 1 of a record.
An 11-character field for the sequence entry name starts in position 14
of a record, followed by a 3-character field for the data file
division, starting at position 25 and ending at position 27, and a
6-character field for the primary accession number, starting at
position 29 and ending at position 34. All entries in the fields are
left-justified. Beginning at positions 36 and 58, the three fields
repeat, so three sets of sequence information can appear in one record.
If there are more than three entry names, the next records are used;
the index key is not repeated.

For the accession number and human gene symbol index files, the entry
names begin in the same record as the index key, since the key is
always fewer than 12 characters. In the other index files, the entry
names begin on the record following the index key record.

3.3.1 Accession Number Index File

Accession numbers consist of a single letter followed by five digits.
They provide an unchanging designation for the data with which they are
associated, and we encourage you to cite accession numbers whenever you
refer to data from the data bank.

The primary accession number is the first accession number of an entry.
It is unique to that entry. Citation of that number will enable other
investigators to locate the data no matter what entry name changes or
other data bank reorganizations may occur. The accession numbers,
however, carry no intrinsic information about the data.

In addition to the primary accession number, some entries have
secondary accession numbers. Secondary accession numbers arise for a
number of reasons. For example, a single accession number may initially
be assigned to the sequence in an article. If it is later discovered
that the sequence must be entered into the data bank as multiple
entries, each entry would receive a new primary accession number; the
previous accession number would appear as the secondary accession
number in each entry.
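
The fixed positions given in Section 3.3 for index records, together
with the accession number format described above, can be expressed as
a short parsing routine. The Python sketch below is illustrative only.
It handles records of the accession number and gene symbol index files
and continuation records (where the key field is blank); the key-only
records of the other index files would have to be recognized
separately. The names `parse_index_record' and `is_accession' are
hypothetical, and the restriction to an uppercase letter is an
assumption based on the examples in these notes.

```python
import re

# A single letter followed by five digits (uppercase is an assumption).
ACCESSION_RE = re.compile(r"^[A-Z][0-9]{5}$")

# 0-based offsets of the three repeating groups, corresponding to
# positions 14, 36, and 58 in the 1-based numbering used in the text.
GROUP_STARTS = (13, 35, 57)


def is_accession(text):
    """Return True if text looks like an accession number."""
    return ACCESSION_RE.match(text) is not None


def parse_index_record(record):
    """Return (index_key, [(entry_name, division, accession), ...])."""
    record = record.rstrip().ljust(79)
    key = record[0:13].strip()  # blank on continuation records
    groups = []
    for start in GROUP_STARTS:
        name = record[start:start + 11].strip()
        division = record[start + 11:start + 14].strip()
        accession = record[start + 15:start + 21].strip()
        if name:
            groups.append((name, division, accession))
    return key, groups
```

A continuation record yields an empty key, so a caller can attach its
groups to the most recent non-empty key.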

The following excerpt from the accession number index file illustrates
the format of the index:

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
J00316       HUMTBB11P  PRI J00316
J00317       HUMTBB46P  PRI J00317
J00318       HUMUG1     PRI J00318
J00319       HUMUG1PA   PRI J00319
J00320       HUMVIPMR1  PRI L00154 HUMVIPMR2  PRI L00155 HUMVIPMR3  PRI L00156
             HUMVIPMR4  PRI L00157 HUMVIPMR5  PRI L00158
J00321       BABA1AT    PRI J00321
J00322       CHPRSA     PRI J00322
J00323       AGMRSASPC  PRI J00323
J00324       BABATIII   PRI J00324
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 4. Accession Number Index File

If the same accession number is found in more than one entry (a result
of the infrequent occasions when a single entry is split into two or
more separate entries), then the additional entries and groups in which
the number appears are also given.

3.3.2 Keyword Phrase Index File

Keyword phrases consist of names for gene products and other
characteristics of sequence entries. There are approximately 12,000
keyword phrases.
An excerpt from the keyword phrase index file is shown below:

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
DNA GYRASE
             ECOGYRA    BCT X06744 ECORECF    BCT K02179 ECORECFA   BCT X04341
DNA HELICASE
             ECOHELIV   BCT J04726 ECOUVRD    BCT X00738
DNA INVERTASE
             ECOPIN     BCT K00676 ECOPINP    BCT K03521 PMUGINMOM  PHG V01463
             STAINVSA   BCT M36694
DNA LIGASE
             ECOLIG     BCT M24278 ECOLIGA    BCT M30255 PT4G30     PHG X00039
             PT6LIG55   PHG M38465 PT7CG      PHG J02518 YSCCDC9    PLN X03246
             YSPCDC17   PLN X05107
DNA LIGASE I
             HUMLIGAA   PRI M36067
DNA MATURATION
             HS1CAS     VRL M22962
DNA METHYLASE
             HEHMTS     BCT J02677
DNA METHYLATION
             HEHMTS     BCT J02677 HUMSPM1    PRI X06585 HUMSPM2    PRI X06586
             HUMSPM3    PRI X06587 HUMSPM4    PRI X06588 HUMSPM5    PRI X07490
             HUMSPM6    PRI X07491 HUMSPM7    PRI X07492 HUMSPM8    PRI X07493
             HUMSPM9    PRI X07494
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 5. Keyword Phrase Index File

3.3.3 Author Name Index File

The author name index file lists all of the author names that appear in
the citations. An excerpt from the author name index file is shown
below:

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
JACKOWSKI,S.
             ECOPANF    BCT M30953
JACKS,C.M.
             MUSRP32A   ROD M35397 MUSRPL32A  ROD M23453
JACKS,T.
             MMTGXPPR   VRL M16766
JACKSON,A.
             BOVMHBOLA  MAM M21044 BOVMHBOLB  MAM M21043
JACKSON,A.O.
             BSMRVPS    SYN M28702 M23023     UNA M23023 MBSRNAG    VRL M11511
             MBSRNAGND  VRL M16577 MBSRNAGSA  VRL M11509 MBSRNAGSB  VRL M11510
             MBSRNAGT   VRL M16576 SAPCAP     VRL M17182 SYENCP     VRL M17210
             SYERNA     VRL M13950 SYESC6     VRL M35689
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 6. Author Name Index File

3.3.4 Journal Citation Index File

The journal citation index file lists all of the citations that appear
in the references. All citations are truncated to 80 characters.
An excerpt from the citation index file is shown below:

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
(IN) THE CELL NUCLEUS, VOLUME VIII: 261-305; ACADEMIC PRESS, NEW YORK (1981)
             RATUR5A    RNA K00783
(IN) THE IMMUNE SYSTEM: 132-138; S. KARGER, NEW YORK (1981).
             HUMIGHVX   PRI M35415
(IN) THE LENS: TRANSPARANCY AND CATARACT: 171-179; EURAGE, RIJSWIJK (1986)
             RANCRYG2A  VRT K02264 RANCRYG4A  VRT K02266 RANCRYG5A  VRT M22529
             RANCRYG6A  VRT M22530 RANCRYR    VRT X00659
(IN) UCLA SYMP. MOL. CELL. BIOL. NEW SER., VOL. 77: 339-352; ALAN R. LISS, INC.
             BOVTRNB2A  MAM M36431 HUMTRNB    PRI M36429 HUMTRNB1   PRI M36430
(IN) UCLA SYMPOSIA: 575-584; ALAN R. LISS, INC., NEW YORK (1987)
             PFAHGPRT   INV M54896
(IN) VIRUS RESEARCH. PROCEEDINGS OF 1973 ICN-UCLA SYMPOSIUM: 533-544; ACADEMIC
             LAMCG      PHG J02459
ACTA BIOCHIM. POL. 24, 301-318 (1977)
             LUPTRFJ    RNA K00345 LUPTRFN    RNA K00346
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 7. Journal Citation Index File

3.3.5 Cross-Reference To Gene Symbol Libraries

The gene symbol file contains the gene symbols used in the Genome Data
Base and other gene symbols, such as those for the E. coli genes. The
gene symbols are found in the feature table and have the form:
/gene="gene symbol"; an example is found in Section 3.5.11.5.
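
Since gene symbols appear in the feature table in the form
/gene="gene symbol", they can be collected with a simple pattern
match. The Python sketch below is illustrative only. It assumes the
feature table text is available as a string and that a symbol does not
itself contain a double quote; the function name `gene_symbols' is
hypothetical.

```python
import re

# Matches /gene="..." and captures the text between the quotes.
GENE_QUALIFIER = re.compile(r'/gene="([^"]*)"')


def gene_symbols(feature_text):
    """Return all gene symbols found in a piece of feature table text."""
    return GENE_QUALIFIER.findall(feature_text)
```

The symbols are returned in the order in which they occur, including
any repeats.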

An example of the format of the gene symbol index file follows:

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
INFC         ECOHIMA    BCT K02844 ECOTHRINF  BCT V00291
INHA         HUMINHA    PRI M13981 HUMINHAA   PRI M13144 HUMINHAG1  PRI X04445
             HUMINHAG2  PRI X04446 HUMINHAG2  PRI X04446
INHBA        HUMINHBA   PRI M13436
INHBB        HUMINHBB   PRI M13437 HUMINHBB1  PRI M31668 HUMINHBB2  PRI M31669
             HUMINHBB2  PRI M31669 HUMINHIB   PRI M31682
INS          HUMINS01   PRI J00265 HUMINS01   PRI J00265 HUMINSPR   PRI M10039
             HUMINV2    PRI M13903
INSR         HUMINSR    PRI M10051 HUMINSR01  PRI M23100 HUMINSR02  PRI M32823
             HUMINSR03  PRI M32824 HUMINSR04  PRI M32825 HUMINSR05  PRI M32826
             HUMINSR06  PRI M32827 HUMINSR07  PRI M32828 HUMINSR08  PRI M32829
             HUMINSR09  PRI M32830 HUMINSR10  PRI M32831 HUMINSR11  PRI M32832
             HUMINSR12  PRI M32833 HUMINSR13  PRI M32834 HUMINSR14  PRI M32835
             HUMINSR15  PRI M32836 HUMINSR16  PRI M32837 HUMINSR17  PRI M32838
             HUMINSR18  PRI M32839 HUMINSR19  PRI M32840 HUMINSR20  PRI M32841
             HUMINSR21  PRI M32842 HUMINSR22  PRI M32972 HUMINSRA   PRI X02160
             HUMINSRA01 PRI M27195 HUMINSRA02 PRI M27197 HUMINSRB   PRI J03466
             HUMINSRC   PRI M29929 HUMINSRD   PRI M29930 HUMINSRMUT PRI M27196
             HUMIRSRE   PRI J05043
INT1         HUMINT1G   PRI X03072
INT1L1       HUMIRP     PRI X07876
IRGA         VCHIRGA    BCT M37773 VCHIRGA    BCT M37773 VCHIRGA    BCT M37773
             VCHIRGA    BCT M37773 VCHIRGA    BCT M37773 VCHIRGB    BCT M55988
             VCHIRGB    BCT M55988
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 8. Gene Symbol Index File

3.4 GenBank Data Submission Form and Error/Suggestion Report Form

The distribution tape includes a data submission form in the file
GBDAT.FRM. Due to the large volume of new sequence data, we encourage
authors to complete this form and return it to the address listed on
the form. This will enable data to be entered more quickly into the
data bank. You can complete the form with any text editor.
You can send the completed form to GenBank on tape or floppy diskette, or electronically via INTERNET or BITNET (the electronic mail address is: gb-sub%life@lanl.gov). We can use information saved on any computer medium from any computer system. You can also print the form, fill it in by hand, and send it to the mailing address given at the beginning of the form. The second form in this file is the GenBank Error/Suggestion Report Form. It is separated from the Data Submission Form by a form-feed character (<CTRL>L, ASCII octal value 014, ASCII decimal value 12). We encourage all GenBank users to report any errors to the data bank staff using this form. Like the GenBank Data Submission Form, it may be printed and filled in by hand and sent by mail to the address given at the beginning of the form. It may also be filled out using a text editor and sent to GenBank by electronic mail at the address given at the top of the form. If you have an IBM PC or compatible computer, or a Macintosh personal computer, we request that you use the Authorin program for submitting sequences to the data bank. See section 5.5 for information about obtaining the Authorin program at no charge. 3.5 Sequence Entry Files The distribution tape contains thirteen sequence entry data files, one for each division of GenBank. Each file contains the entries for one group of organisms. 3.5.1 File Organization Each of these files has the same format and consists of two parts: header information (described in section 3.1) and sequence entries for that division (described in the following sections). 3.5.2 Entry Organization In the second portion of a sequence entry file (containing the sequence entries for that division), each record (line) consists of two parts. The first part is found in positions 1 to 10 and may contain: 1. A keyword, beginning in column 1 of the record (e.g., REFERENCE is a keyword). 2. 
A subkeyword beginning in column 3, with columns 1 and 2 blank (e.g., AUTHORS is a subkeyword of REFERENCE).
3. Blank characters, indicating that this record is a continuation of the information under the keyword or subkeyword above it.
4. A code, beginning in column 6, indicating the nature of an entry (feature key) in the FEATURES table; these codes are described in Section 3.5.11.1 below.
5. A number, ending in column 9 of the record. This number occurs in the portion of the entry describing the actual nucleotide sequence and designates the numbering of sequence positions.
6. Two slashes (//) in positions 1 and 2, marking the end of an entry.

The second part of each sequence entry record contains the information appropriate to its keyword, in positions 13 to 80 for keywords and positions 11 to 80 for the sequence.

The following is a brief description of each entry field. Detailed information about each field may be found in Sections 3.5.4 to 3.5.13.

LOCUS - A short unique name for the entry, chosen to suggest the sequence's definition. Mandatory keyword/exactly one record.
DEFINITION - A concise description of the sequence. Mandatory keyword/one or more records.
ACCESSION - The primary accession number is a unique, unchanging code assigned to each entry. (Please use this code when citing information from GenBank.) Mandatory keyword/one or more records.
KEYWORDS - Short phrases describing gene products and other information about an entry. Mandatory keyword in all annotated entries/one or more records.
SEGMENT - Information on the order in which this entry appears in a series of discontinuous sequences from the same molecule. Optional keyword (only in segmented entries)/exactly one record.
SOURCE - Common name of the organism or the name most frequently used in the literature. Mandatory keyword in all annotated entries/one or more records/includes one subkeyword.
ORGANISM - Formal scientific name of the organism (first line) and taxonomic classification levels (second and subsequent lines). Mandatory subkeyword in all annotated entries/two or more records. REFERENCE - Citations for all articles containing data reported in this entry. Includes four subkeywords and may repeat. Mandatory keyword/one or more records. AUTHORS - Lists the authors of the citation. Mandatory subkeyword/one or more records. TITLE - Full title of citation. Optional subkeyword (present in all but unpublished citations)/one or more records. JOURNAL - Lists the journal name, volume, year, and page numbers of the citation. Mandatory subkeyword/one or more records. STANDARD - Lists information about the degree to which the entry has been annotated and the level of review to which it has been subjected. Mandatory subkeyword/exactly one record. COMMENT - Cross-references to other sequence entries, comparisons to other collections, notes of changes in LOCUS names, and other remarks. Optional keyword/one or more records/may include blank records. FEATURES - Table containing information on portions of the sequence that code for proteins and RNA molecules and information on experimentally determined sites of biological significance. Optional keyword/one or more records. BASE COUNT - Summary of the number of occurrences of each base code in the sequence. Mandatory keyword/exactly one record. ORIGIN - Specification of how the first base of the reported sequence is operationally located within the genome. Where possible, this includes its location within a larger genetic map. Mandatory keyword/exactly one record. - The ORIGIN line is followed by sequence data (multiple records). // - Entry termination symbol. Mandatory at the end of an entry/exactly one record. 3.5.3 Sample Sequence Data File An example of a complete sequence entry file follows. (This example has only two entries.) 
Note that in this example, as throughout the data bank, numbers in square brackets indicate items in the REFERENCE list. For example, in ACARR58S, [1] refers to the paper by Mackay, et al. 1 10 20 30 40 50 60 70 79 ---------+---------+---------+---------+---------+---------+---------+--------- GBSMP.SEQ Genetic Sequence Data Bank 15 December 1990 GenBank(R) Release 66.0 Structural Rna Sequences 2 loci, 280 bases, from 2 reported sequences LOCUS AAURRA 118 bp ss-rRNA RNA 16-JUN-1986 DEFINITION A.auricula-judae (mushroom) 5S ribosomal RNA. ACCESSION K03160 KEYWORDS 5S ribosomal RNA; ribosomal RNA. SOURCE A.auricula-judae (mushroom) ribosomal RNA. ORGANISM Auricularia auricula-judae Eukaryota; Plantae; Thallobionta; Basidiomycotina; Phragmobasidiomycetes; Heterobasidiomycetidae; Eutremellales; Auriculariaceae; Auricularia; auricula-judae. REFERENCE 1 (bases 1 to 118) AUTHORS Huysmans,E., Dams,E., Vandenberghe,A. and De Wachter,R. TITLE The nucleotide sequences of the 5S rRNAs of four mushrooms and their use in studying the phylogenetic position of basidiomycetes among the eukaryotes JOURNAL Nucleic Acids Res. 11, 2871-2880 (1983) STANDARD full staff_review FEATURES Location/Qualifiers rRNA 1..118 /note="5S ribosomal RNA" BASE COUNT 27 a 34 c 34 g 23 t ORIGIN 5' end of mature rRNA. 1 atccacggcc ataggactct gaaagcactg catcccgtcc gatctgcaaa gttaaccaga 61 gtaccgccca gttagtacca cggtggggga ccacgcggga atcctgggtg ctgtggtt // LOCUS ACARR58S 162 bp ss-rRNA RNA 15-MAR-1989 DEFINITION A.castellanii (amoeba) 5.8S ribosomal RNA. ACCESSION K00471 KEYWORDS 5.8S ribosomal RNA; ribosomal RNA. SOURCE A.castellani (amoeba; strain ATCC 30010) rRNA. ORGANISM Acanthamoeba castellanii Eukaryota; Animalia; Protozoa; Sarcomastigophora; Sarcodina; Rhizopoda; Lobosa; Gymnamoeba; Amoebida; Acanthopodina; Acanthamoebidae; Acanthamoeba; castellanii. REFERENCE 1 (bases 1 to 162) AUTHORS Mackay,R.M. and Doolittle,W.F. 
  TITLE     Nucleotide sequences of AcanthamoebA.castellanii 5S and 5.8S
            ribosomal ribonucleic acids: Phylogenetic and comparative
            structural analyses
  JOURNAL   Nucleic Acids Res. 9, 3321-3334 (1981)
  STANDARD  simple staff_review
COMMENT     [1] also sequenced A.castellanii 5S rRNA <K03160>.
FEATURES             Location/Qualifiers
     rRNA            1..162
                     /note="5.8S rRNA"
BASE COUNT       40 a     39 c     44 g     39 t
ORIGIN      5' end of mature rRNA.
        1 aactcctaac aacggatatc ttggttctcg cgaggatgaa gaacgcagcg aaatgcgata
       61 cgtagtgtga atcgcaggga tcagtgaatc atcgaatctt tgaacgcaag ttgcgctctc
      121 gtggtttaac cccccgggag cacgttcgct tgagtgccgc tt
//
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 9. Sample Sequence Data File

3.5.4 LOCUS Format

The pieces of information contained in the LOCUS record are always found in fixed positions. The locus name (or entry name), which is always ten characters or fewer, begins in position 13. The locus name is designed to help group entries with similar sequences: the first three characters usually designate the organism; the fourth and fifth characters can be used to show other group designations, such as gene product; for segmented entries the last character is one of a series of sequential integers.

The number of bases or base pairs in the sequence ends in position 29. The letters `bp' are in positions 31 to 32. Positions 34 to 36 give the number of strands of the sequence. Positions 37 to 40 give the type of molecule sequenced. If the sequence has a special topology, a notation (such as `circular') is included in positions 43 to 52.

GenBank sequence entries are divided among thirteen taxonomic divisions. Each entry's division is identified by a three-letter code in positions 53 to 55. See Section 3.3 for the division codes.

Positions 63 to 73 of the record contain the date the entry was entered or underwent any substantial revisions, such as the addition of newly published data, in the form dd-MMM-yyyy.
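Because every item in the LOCUS record sits at a fixed position, the record can be split by slicing. The following Python sketch illustrates this under stated assumptions: the function name parse_locus and the dictionary keys are invented for this example, and the sample record is assembled field by field from the AAURRA values in Example 9 rather than copied from a tape file. Positions are 1-based and inclusive, as in this section.

```python
def parse_locus(record):
    """Split a LOCUS record into fields using 1-based, inclusive positions."""

    def field(first, last):
        # Convert 1-based inclusive positions to a Python slice.
        return record[first - 1:last].strip()

    if field(1, 12) != "LOCUS":
        raise ValueError("not a LOCUS record")
    return {
        "name": field(13, 22),
        "length": int(field(23, 29)),          # right-justified count
        "strands": field(34, 36),              # blank, ss-, ds-, or ms-
        "molecule": field(37, 40),             # blank, DNA, RNA, tRNA, ...
        "topology": field(43, 52) or "linear",  # blank implies linear
        "division": field(53, 55),
        "date": field(63, 73),                 # dd-MMM-yyyy
    }


# Sample record built piece by piece so the positions are explicit.
sample = (
    "LOCUS".ljust(12)      # positions 1-12
    + "AAURRA".ljust(10)   # positions 13-22
    + "118".rjust(7)       # positions 23-29
    + " bp "               # positions 30-33 (bp in 31-32)
    + "ss-" + "rRNA"       # positions 34-36 and 37-40
    + " " * 2              # positions 41-42
    + "".ljust(10)         # positions 43-52, blank
    + "RNA"                # positions 53-55
    + " " * 7              # positions 56-62
    + "16-JUN-1986"        # positions 63-73
)
print(parse_locus(sample))
```

Slicing by position, rather than splitting on blanks, keeps the fields aligned when an optional item such as the strand count or the topology is blank.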
The detailed format for the LOCUS record is as follows:

Positions  Contents
1-12       LOCUS
13-22      Locus name
23-29      Length of sequence, right-justified
31-32      bp
34-36      Blank, ss- (single-stranded), ds- (double-stranded), or
           ms- (mixed-stranded)
37-40      Blank, DNA, RNA, tRNA (transfer RNA), rRNA (ribosomal RNA),
           mRNA (messenger RNA), or uRNA (small nuclear RNA)
43-52      Blank (implies linear), circular, or tandem
53-55      The division code (see Section 3.3)
63-73      Date, in the form dd-MMM-yyyy (e.g., 15-DEC-1990)

3.5.5 DEFINITION Format

The DEFINITION record gives a brief description of the sequence, proceeding from general to specific. It starts with the common name of the source organism, then gives the criteria by which this sequence is distinguished from the remainder of the source genome, such as the gene name and what it codes for, or the protein name and mRNA, or some description of the sequence's function (if the sequence is non-coding). If the sequence has a coding region, the description may be followed by a completeness qualifier, such as cds (complete coding sequence). The length is limited to three lines and the last line must end with a period.

3.5.6 ACCESSION Format

This field contains a series of six-character identifiers (accession numbers: first character a letter, the remainder digits). The primary (first) accession number occupies positions 13 to 18; subsequent accession numbers occupy positions 20 to 25, 27 to 32, 34 to 39, 41 to 46, 48 to 53, 55 to 60, 62 to 67, and 69 to 74. No punctuation occurs between accession numbers or after the final accession number; accession numbers are separated only by one space.

3.5.7 KEYWORDS Format

The KEYWORDS field does not appear in unannotated entries, but is required in all annotated entries. Keywords are separated by semicolons; a keyword may be a single word or a phrase consisting of several words. Each line in the keywords field ends in a semicolon; the last line ends with a period.
If no keywords are included in the entry, the KEYWORDS record contains only a period. 3.5.8 SEGMENT Format The SEGMENT keyword is used when two (or more) entries of known relative orientation are separated by a short (<10 kb) stretch of DNA. It is limited to one line of the form `n of m', where `n' is the segment number of the current entry and `m' is the total number of segments. 3.5.9 SOURCE Format The SOURCE field consists of two parts. The first part is found after the SOURCE keyword and contains free-format information including an abbreviated form of the organism name followed by a molecule type; multiple lines are allowed, but the last line must end with a period. The second part consists of information found after the ORGANISM subkeyword. The formal scientific name for the source organism (genus and species, where appropriate) is found on the same line as ORGANISM. The records following the ORGANISM line list the taxonomic classification levels, separated by semicolons and ending with a period. 3.5.10 REFERENCE Format The REFERENCE field consists of five parts: the keyword REFERENCE, and the subkeywords AUTHORS, TITLE (optional), JOURNAL and STANDARD. The REFERENCE line contains the number of the particular reference and (in parentheses) the range of bases in the sequence entry reported in this citation. Additional prose notes may also be found within the parentheses. The numbering of the references does not reflect publication dates or priorities. The AUTHORS line lists the authors in the order in which they appear in the cited article. Last names are separated from initials by a comma (no space); there is no comma before the final `and'. The list of authors ends with a period. The TITLE line is an optional field, although it appears in the majority of entries. It does not appear in unpublished sequence data entries that have been deposited directly into the GenBank data bank, the EMBL Nucleotide Sequence Data Library, or the DNA Data Bank of Japan. 
The TITLE field does not end with a period. The JOURNAL line gives the appropriate literature citation for the sequence in the entry. The word `Unpublished' will appear after the JOURNAL subkeyword if the data did not appear in the scientific literature, but was directly deposited into the data bank. For published sequences the JOURNAL line gives the Thesis, Journal, or Book citation, including the year of publication, the specific citation, or In press. The STANDARD line contains information about: The degree to which the entry has been annotated: `unannotated' for unannotated entries which include citation and sequence only. `simple' for unannotated entries which include the organism name and protein coding regions as well as the citation and sequence. `full' for fully annotated entries which include all the data items that were described by the author. The level of modification and review: `automatic' for data subjected only to automated (i.e., software) checks. `staff_entry' for data that passed both automated and annotator checks. `staff_review' for data that passed previous review levels as well as a review by senior annotators and/or outside experts. The format for the STANDARD line is: annotation degree <SPACE> review level 3.5.11 FEATURES Format This release uses the new feature table format. This format has been designed jointly by GenBank, the EMBL Nucleotide Sequence Data Library, and the DNA Data Bank of Japan, and will be common to all three data banks. The feature table contains information about genes and gene products, as well as regions of biological significance reported in the sequence. The feature table contains information on regions of the sequence that code for proteins and RNA molecules. It also enumerates differences between different reports of the same sequence, and provides cross-references to other data collections, as described in more detail below. 
The first line of the feature table is a header that includes the keyword `FEATURES' and the column header `Location/Qualifiers.' Each feature consists of a descriptor line containing a feature key and a location (see sections below for details). If the location does not fit on this line, a continuation line may follow. If further information about the feature is required, one or more lines containing feature qualifiers may follow the descriptor line.

The feature key begins in column 6 and may be no more than 15 characters in length. The location begins in column 22. Feature qualifiers begin on subsequent lines at column 22. Location, qualifier, and continuation lines may extend from column 22 to 80.

Feature tables are optional. However, a feature table must include one header line and at least one feature descriptor line.

The sections below provide a brief introduction to the new feature table format. For a thorough description of the new feature table format, see the document `The DDBJ/EMBL/GenBank Feature Table: Definition.' If you would like a copy of this publication, contact GenBank at the address shown on the front page of these Release Notes.

3.5.11.1 Feature Key Names

The first column of the feature descriptor line contains the feature key. It starts at column 6 and can continue to column 20. The list of valid feature keys is shown below.

allele            Related strain contains alternative gene form
attenuator        Sequence related to transcription termination
CAAT_signal       `CAAT box' in eukaryotic promoters
CDS               Sequence coding for amino acids in protein (includes
                  stop codon)
cellular          Region of cellular DNA
conflict          Independent determinations differ
D-loop            Displacement loop
enhancer          Cis-acting enhancer of promoter function
exon              Region that codes for part of spliced mRNA
GC_signal         `GC box' in eukaryotic promoters
iDNA              Intervening DNA eliminated by recombination
insertion_seq     Insertion sequence (IS), a small transposon
intron            Transcribed region excised by mRNA splicing
LTR               Long terminal repeat
mat_peptide       Mature peptide coding region (does not include stop
                  codon)
misc_binding      Miscellaneous binding site
misc_difference   Miscellaneous difference feature
misc_feature      Region of biological significance that cannot be
                  described by any other feature
misc_recomb       Miscellaneous recombination feature
misc_RNA          Miscellaneous transcript feature not defined by other
                  RNA keys
misc_signal       Miscellaneous signal
misc_structure    Miscellaneous DNA or RNA structure
modified_base     The indicated base is a modified nucleotide
mRNA              Messenger RNA
mutation          A mutation alters the sequence here
old_sequence      Presented sequence revises a previous version
polyA_signal      Signal for cleavage & polyadenylation
polyA_site        Site at which polyadenine is added to mRNA
precursor_RNA     Any RNA species that is not yet the mature RNA product
prim_transcript   Primary (unprocessed) transcript
primer_bind       Non-covalent primer binding site
promoter          A region involved in transcription initiation
protein_bind      Non-covalent protein binding site on DNA or RNA
provirus          Proviral sequence
RBS               Ribosome binding site
rep_origin        Replication origin for duplex DNA
repeat_region     Sequence containing repeated subsequences
repeat_unit       One repeated unit of a repeat_region
rRNA              Ribosomal RNA
satellite         Satellite repeated sequence
scRNA             Small cytoplasmic RNA
sig_peptide       Signal peptide coding region
snRNA             Small nuclear RNA
stem_loop         Hair-pin loop structure in DNA or RNA
TATA_signal       `TATA box' in eukaryotic promoters
terminator        Sequence causing transcription termination
transit_peptide   Transit peptide coding region
transposon        Transposable element (TN)
tRNA              Transfer RNA
unsure            Authors are unsure about the sequence in this region
variation         A related population contains stable mutation
virion            Virion (encapsidated) viral sequence
- (hyphen)        Placeholder
-10_signal        `Pribnow box' in prokaryotic promoters
-35_signal        `-35 box' in prokaryotic promoters
3'clip            3'-most region of a precursor transcript removed in
                  processing
3'UTR             3' untranslated region (trailer)
5'clip            5'-most region of a precursor transcript removed in
                  processing
5'UTR             5' untranslated region (leader)

3.5.11.2 Feature Location

The second column of the feature descriptor line designates the location of the feature in the sequence. The location descriptor begins at position 22. Several conventions are used to indicate sequence location.

Base numbers in location descriptors refer to numbering in the entry, which is not necessarily the same as the numbering scheme used in the published report. The first base in the presented sequence is numbered base 1. Sequences are presented in the 5' to 3' direction.

Location descriptors can be one of the following:

1. A single base;
2. A contiguous span of bases;
3. A site between two bases;
4. A single base chosen from a range of bases;
5. A single base chosen from among two or more specified bases;
6. A joining of sequence spans;
7. A reference to an entry other than the one to which the feature belongs (i.e., a remote entry), followed by a location descriptor referring to the remote sequence;
8. A literal sequence (a string of bases enclosed in quotation marks).

A site between two residues, such as an endonuclease cleavage site, is indicated by listing the two bases separated by a caret (e.g., 23^24).
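The four simplest descriptor forms covered in this section (a single base, a span written with two periods, a site between bases written with a caret, and one base from a range written with a single period) can be told apart by their separator. The Python sketch below is an illustration only: the function name classify_location, the kind labels, and the dictionary keys are assumptions made for this example. It deliberately returns None for operators, remote-entry references, and literal sequences rather than attempting to parse them.

```python
import re

# Optional '<', first base, then optionally a separator, optional '>', last base.
# The two-period alternative is listed first so that '..' is not read as '.'.
_SIMPLE = re.compile(r'^(<?)(\d+)(?:(\.\.|\^|\.)(>?)(\d+))?$')

_KINDS = {
    None: "single base",
    "..": "span",
    "^": "site between bases",
    ".": "one base in range",
}


def classify_location(text):
    """Classify a simple location descriptor; return None for anything else."""
    match = _SIMPLE.match(text.strip())
    if match is None:
        return None  # join(...), replace(...), remote entries, literals, etc.
    lt, first, sep, gt, last = match.groups()
    start = int(first)
    end = int(last) if last is not None else start
    return {
        "kind": _KINDS[sep],
        "start": start,
        "end": end,
        "beyond_start": lt == "<",  # feature begins before the first base given
        "beyond_end": gt == ">",    # feature ends after the last base given
    }


for descriptor in ("3204", "5..1261", "<1..267", "1..>66", "105^106", "23.79"):
    print(descriptor, classify_location(descriptor))
```

Descriptors such as join(M15053:283..355,1..13571) need a recursive parser; a routine like this one could serve as its innermost step.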
A single residue chosen from a range of residues is indicated by the number of the first and last bases in the range separated by a single period (e.g., 23.79). The symbols < and > indicate that the end point of the range is beyond the specified base number. A contiguous span of bases is indicated by the number of the first and last bases in the range separated by two periods (e.g., 23..79). The symbols < and > indicate that the end point of the range is beyond the specified base number. Starting and ending positions can be indicated by base number or by one of the operators described below. Operators are prefixes that specify what must be done to the indicated sequence to locate the feature. The following are the operators available, along with their most common format and a description. complement (location): The feature is complementary to the location indicated. Complementary strands are read 5' to 3'. join (location, location, .. location): The indicated elements should be placed end to end to form one contiguous sequence. order (location, location, .. location): The elements are found in the specified order in the 5' to 3' direction, but nothing is implied about the rationality of joining them. group (location, location, .. location): The elements are related and should be grouped together, but no order is implied. one-of (location, location, .. location): The element can be any one, but only one, of the items listed. replace (location, location): The first location indicated should be replaced by the sequence from the second location; used for insertions, deletions, and variants. 3.5.11.3 Feature Qualifiers Qualifiers provide additional information about features. They take the form of a slash (/) followed by a qualifier name and, if applicable, an equal sign (=) and a qualifier value. Feature qualifiers begin at column 22. Qualifiers convey many types of information. Their values can, therefore, take several forms: 1. Free text; 2. 
Controlled vocabulary or enumerated values; 3. Citations or reference numbers; 4. Sequences; 5. Feature labels.

Text qualifier values must be enclosed in double quotation marks. The text can consist of any printable characters (ASCII values 32-126 decimal). If the text string includes double quotation marks, each must be `escaped' by placing a double quotation mark in front of it (e.g., /note="This is an example of ""escaped"" quotation marks").

Some qualifiers require values selected from a limited set of choices. For example, the `/direction' qualifier has only three values: `left,' `right,' or `both.' These are called controlled vocabulary qualifier values. Controlled qualifier values are not case sensitive; they can be entered in any combination of upper- and lowercase without changing their meaning.

Citation or published reference numbers for the entry should be enclosed in square brackets ([]) to distinguish them from other numbers. Multiple citations are separated by commas (e.g., [1],[2],[3]).

A literal sequence of bases (e.g., "atgcatt") should be enclosed in quotation marks. Literal sequences are distinguished from free text by context. Qualifiers that take free text as their values do not take literal sequences, and vice versa.

The `/label=' qualifier takes a feature label as its value. Although feature labels are optional, they allow unambiguous references to the feature. The feature label identifies a feature within an entry; when combined with the accession number and the name of the data bank from which it came, it is a unique tag for that feature. Feature labels must be unique within an entry, but can be the same as a feature label in another entry. Feature labels are not case sensitive; they can be entered in any combination of upper- and lowercase without changing their meaning.

The following is a list of valid feature qualifiers.
/anticodon Location of the anticodon of tRNA and the amino acid for which it codes /bound_moiety Moiety bound /citation Reference to a citation providing the claim of or evidence for a feature /codon Specifies a codon that is different from any found in the reference genetic code /codon_start Indicates the reading frame of a protein coding region /cons_splice Identifies intron splice sites that do not conform to the 5'-GT... AG-3' splice site consensus /direction Direction of DNA replication /EC_number Enzyme Commission number for the enzyme product of the sequence /evidence Value indicating the nature of supporting evidence /frequency Frequency of the occurrence of a feature /function Function attributed to a sequence /gene Symbol of the gene corresponding to a sequence region /label A label used to permanently identify a feature /mod_base Abbreviation for a modified nucleotide base /note Any comment or additional information /number A number indicating the order of genetic elements (e.g., exons or introns) in the 5' to 3' direction /organism Name of organism if different from that contained in the entry's ORGANISM field /partial Differentiates between complete regions and partial ones /phenotype Phenotype conferred by the feature /product Name of a product encoded by the sequence /pseudo Indicates that this feature is a non-functional version of the element named by the feature key /rpt_family Type of repeated sequence; `Alu' or `Kpn,' for example /rpt_type Organization of repeated sequence /rpt_unit Identity of repeat unit that constitutes a repeat_region /standard_name Accepted standard name for this feature /transl_except Translational exception: single codon, the translation of which does not conform to the reference genetic code /type Name of a strain if different from that in the SOURCE field /usedin Indicates that feature is used in a compound feature in another entry 3.5.11.4 Cross-Reference Information One type of information in the feature table lists 
cross-references to the annual compilation of transfer RNA sequences in Nucleic Acids Research, which has kindly been sent to us on tape by Dr. Sprinzl. Each tRNA entry of the feature table contains a /note= qualifier that includes a reference such as `(NAR: 1234)' to identify code 1234 in the NAR compilation. When such a cross-reference appears in an entry that contains a gene coding for a transfer RNA molecule, it refers to the code in the tRNA gene compilation. Similar cross-references in entries containing mature transfer RNA sequences refer to the companion compilation of tRNA sequences published by D.H. Gauss and M. Sprinzl in Nucleic Acids Research. See section 3.5.11.5 for an example.

The feature tables of human entries contain cross-references to the Genome Data Base (GDB) in Baltimore, MD. GDB includes information on mapped genes, probes, and restriction fragment length polymorphisms. Each entry in that data bank contains the official symbol for the gene or locus. GDB assigns each gene a unique identifier that remains associated with that gene, regardless of changes in gene names. In entries that contain sequences for mapped genes a /note= qualifier includes this identifier placed within single quotes following the term `/hgml_locus_uid='. The /note qualifier also includes the map location in single quotes following the term `/map='. The gene symbol formerly designated `/nomgen=' is contained in the /gene qualifier. See section 3.5.11.5 for an example.

For more information about the Genome Data Base, contact:

Genome Data Base
1830 East Monument Street
Baltimore, MD 21205
Telephone: (203) 786-5515

3.5.11.5 Feature Table Examples

In the first example a number of key names, feature locations, and qualifiers are illustrated, taken from different sequences. The first table entry is a coding region consisting of a simple span of bases and including a /gene qualifier.
In the second table entry, an NAR cross-reference is given (see the previous section for a discussion of these cross-references). The third and fourth table entries use the symbols `<' and `>' to indicate that the beginning or end of the feature is beyond the range of the presented sequence. In the fifth table entry, the symbol `^' indicates that the feature is between bases. In the sixth table entry, the replace operator is shown.

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
     CDS             5..1261
                     /note="alpha-1-antitrypsin precursor
                     /map=`14q32.1' /hgml_locus_uid=`LX0081X'"
                     /gene="PI"
     tRNA            1..87
                     /note="Leu-tRNA-CAA (NAR: 1057)"
                     /anticodon=(pos:35..37,aa:Leu)
     mRNA            1..>66
                     /note="alpha-1-acid glycoprotein mRNA"
     transposon      <1..267
                     /note="insertion element IS5"
     misc_recomb     105^106
                     /note="B.subtilis DNA end/IS5 DNA start"
     conflict        replace(258..258,"t")
                     /citation=[2]
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 10. Feature Table Entries

The next example shows the representation for a CDS that spans more than one entry.

1       10        20        30        40        50        60        70       79
---------+---------+---------+---------+---------+---------+---------+---------
LOCUS       HUMAPOB1      840 bp ds-DNA             PRI       15-JUN-1989
DEFINITION  Human apolipoprotein B-100 gene, exons 1 and 2.
ACCESSION   M15053
KEYWORDS    apolipoprotein B-100.
SEGMENT     1 of 2
. . .
FEATURES             Location/Qualifiers
     sig_peptide     283..354
                     /note="apolipoprotein B-100 signal peptide"
     precursor_RNA   155..>840
                     /note="apoB100 mRNA"
     intron          356..669
                     /note="apoB100 intron A"
     intron          709..>840
                     /note="apoB100 intron B"
. . .
//
LOCUS       HUMAPOB2    13872 bp ss-mRNA            PRI       15-JUN-1989
DEFINITION  Human apolipoprotein B-100 mRNA, starting at exon 3.
ACCESSION   M15051 M15054
KEYWORDS    apolipoprotein B-100.
SEGMENT     2 of 2
. . .
FEATURES             Location/Qualifiers
     precursor_RNA   <1..13872
                     /note="apoB100 mRNA"
     variation       3204
                     /note="g in lambda-B25; c in lambda B1"
     CDS             join(M15053:283..355,M15053:670..708,
                     1..13571)
                     /note="apolipoprotein B-100 precursor"
     mat_peptide     join(M15053:355..355,M15053:670..708,
                     1..13568)
                     /note="apolipoprotein B-100"
. . .
//
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Example 11. Joining Sequences

3.5.12 ORIGIN Format

The ORIGIN record may be left blank, may appear as `Unreported.', or may give a local pointer to the sequence start, usually involving an experimentally determined restriction cleavage site or the genetic locus (if available). The ORIGIN record ends in a period if it contains data, but does not include the period if the record is left empty (in contrast to the KEYWORDS field, which contains a period rather than being left blank).

3.5.13 SEQUENCE Format

The nucleotide sequence for an entry is found in the records following the ORIGIN record. The sequence is reported in the 5' to 3' direction. There are sixty bases per record, listed in groups of ten bases followed by a blank, starting at position 11 of each record. The number of the first nucleotide in the record is given in columns 4 to 9 (right justified) of the record.

4. FUTURE RELEASES

4.1 Changes Planned for Release 67.0

No changes are planned for Release 67.0.

5. KNOWN PROBLEMS WITH THE GENBANK DATABASE

5.1 Incorrect Gene Symbols in Entries and Index

The /gene qualifier should contain gene symbols. In this release, however, the /gene qualifier for many entries incorrectly contains values other than the gene symbol, such as the product or standard name of the gene. The gene symbol index (GBHGM.IDX) is created from the data in the /gene qualifier and therefore contains data other than gene symbols. These errors will be corrected as soon as possible.

6.
GENBANK ADMINISTRATION IntelliGenetics Inc., a developer and distributor of molecular biology computer programs and instrumentation, is the primary contractor for the GenBank data bank. IntelliGenetics maintains the computerized data center and oversees data distribution on all media. Under an arrangement with IntelliGenetics, Los Alamos National Laboratory (LANL) gathers, annotates, and organizes sequence data and transmits it to IntelliGenetics. LANL is operated by the University of California for the Department of Energy. The electronic mail address of LANL is GENBANK@LANL.GOV; their telephone number is (505) 665-2177. The IntelliGenetics address is on the front page of these release notes. 6.1 Registered Trademark Notice GenBank (R) is a registered trademark of the U.S. Department of Health and Human Services for the Genetic Sequence Data Bank operated by IntelliGenetics and Los Alamos National Laboratory under contract with the National Institutes of Health. 6.2 GenBank Sponsorship GenBank is sponsored by the National Institute of General Medical Sciences, NIH; The National Library of Medicine, NIH; and the U.S. Department of Energy. 6.3 Citing GenBank If you have used GenBank in your research, we would appreciate it if you would include a reference to GenBank in all publications related to that research. You may also wish to note that the GenBank data bank is publicly available from IntelliGenetics. When citing data in GenBank, it is appropriate to give the sequence name, primary accession number, and the publication in which the sequence first appeared. If the data are unpublished, we urge you to contact the group which submitted the data to GenBank to see if there is a recent publication or if they have determined any revisions or extensions of the data. It is also appropriate to list a reference for GenBank itself. The following publication, which describes the GenBank data bank, should be cited: Bilofsky, H.S. and Burks, C. 
     The GenBank (R) Genetic Sequence Data Bank.
     Nucl. Acids Res. 16: 1861-1864 (1988)

The following statement is an example of how you may cite GenBank data.
It cites the sequence, its primary accession number, the group who
determined the sequence, and GenBank. The numbers in parentheses refer to
the GenBank citation above and to the REFERENCE in the GenBank sequence
entry.

     `We scanned the GenBank (1) data bank for sequence similarities and
     found one sequence (2), GenBank accession number J01016, which
     showed significant similarity...'

     (1) Bilofsky, H.S. and Burks, C. Nucl. Acids Res. 16: 1861-1864 (1988)
     (2) Nellen, W. and Gallwitz, D. J. Mol. Biol. 159: 1-18 (1982)

6.4 GenBank Distribution Formats and Media

The GenBank data bank is available in three formats on three physical
media. The three formats are fixed-length 80-character records, VAX/VMS
Backup saveset, and compressed Unix tar archive format. The three media
are industry-standard 9-track magnetic tapes, Sun 1/4" QIC 24 format
cartridges, and TK-50 cartridges. The following chart specifies which
formats are available in each medium. To request a change in the format,
media, or density of the tapes you receive, write to the address (or call
the telephone number) on the first page of these release notes.

                            FILE FORMATS

TAPE MEDIA                  Unlabelled ASCII        VAX/VMS         Unix
                            (fixed-length records)  Backup Saveset  tar tarfile

9-track, 2400' reel
     1600 bpi               MU                      M               M
     6250 bpi               MU                      M               M
TK-50 cartridge (DEC)       NA                      M               NA
1/4" QIC 24 cartridge (Sun) NA                      NA              M

MU   tapes are available in both mixed-case and uppercase-only formats
M    tapes are available only in mixed-case characters
NA   not available

Table 1. Tape Media and Formats

6.5 Request for Direct Submission of Sequence Data

The growth of nucleotide sequence data is close to exponential. Both the
proposed Human Genome sequencing project and the increasing automation of
sequencing make it clear that GenBank is going to continue to grow
rapidly. A successful GenBank requires that the data enter the data bank
as soon as possible after publication, that the annotations be as
complete as possible, and that the sequence and annotation data be
accurate. All three of these requirements are best met if authors of
sequence data submit their data directly to GenBank in a usable form. It
is especially important that these submissions be in computer-readable
form.

GenBank must rely on direct author submission of data to ensure that it
achieves its goals of complete, accurate, and timely data. To assist
researchers in entering their own sequence data, GenBank has developed
AUTHORIN, an easy-to-use program that enables authors to enter a
sequence, annotate it, and submit it to GenBank or any of the other data
banks. The IBM PC compatible and Macintosh versions of AUTHORIN may be
obtained by completing the enclosed AUTHORIN request card or by
contacting GenBank at the address shown on the front of these release
notes. Versions for the VAX and Sun workstations are also planned and
will be announced in future release notes as they become available.

For those who are unable to use the AUTHORIN program, GenBank has a
printed data submission form. This form is now standardized among EMBL,
DDBJ, GenBank, PIR, MIPS, and JIPID. GenBank also provides a
corresponding computer-readable data submission form that can be used for
electronic mail and floppy disk submissions. The GenBank Data Submission
Form (located in the file GBDAT.FRM) can be used to submit your sequence
and annotations. Electronic mail submissions should go to the address
"GB-SUB%LIFE@LANL.GOV"; direct mail should go to our postal address in
Los Alamos, which is on the data submission form.

6.6 Request for Corrections and Comments

We welcome your suggestions for improvements to GenBank. We are
especially interested to learn of errors or inconsistencies in the data.
Please use the GenBank Error/Suggestion Report Form, which is part of
this distribution of GenBank (located in the file GBDAT.FRM), to send
your suggestions and corrections to the address on the first page of
these release notes. Please be certain to indicate the GenBank release
number (e.g., Release 66.0) and the primary accession number of the entry
to which your comments apply; it is helpful if you also give the entry
name and the current contents of any data field for which you are
recommending a change.

6.7 Disclaimer

IntelliGenetics Inc., Los Alamos National Laboratory, and the United
States Government make no representations or warranties regarding the
content or accuracy of the information. IntelliGenetics Inc., Los Alamos
National Laboratory, and the United States Government also make no
representations or warranties of merchantability or fitness for a
particular purpose and accept no responsibility for any consequences of
the receipt or use of the information.

APPENDIX A. Statistical Summary

Division                Entries       Bases   Reports

PRIMATE                    7511     9003383      9493
RODENT                     7652     7841099      9176
OTHER MAMMALIAN            1552     1935603      1817
OTHER VERTEBRATE           1876     2142926      2263
INVERTEBRATE               3195     4005462      3811
PLANT                      2976     4659180      3636
ORGANELLE                  1271     1848854      1569
BACTERIAL                  4293     6992664      5528
STRUCTURAL RNA             1647      445723      1946
VIRAL                      3707     6439492      4751
PHAGE                       593      682556       880
SYNTHETIC                  1028      516186      1129
UNANNOTATED                3756     4792964      4909

Total (13 divisions)      41057    51306092     50908

Sequences with greater than 30,000 bp

Locus       Div   Accession     Length

ADBCG       VRL   J01917       35937bp
CHKMYHE     VRT   J02714       31111bp
HS11UL      VRL   D00317      108360bp
HS4         VRL   V01555      172282bp
HS5HCMVU    VRL   X04650       43275bp
HUMADAG     PRI   M13792       36741bp
HUMFIXG     PRI   K02402       38059bp
HUMGHCSA    PRI   J03071       66495bp
HUMHBB      PRI   J00179       73326bp
HUMHPRTB    PRI   M26434       56736bp
HUMTPA      PRI   K03021       36594bp
LAMCG       PHG   J02459       48502bp
MPOCPCG     ORG   X04465      121024bp
MUSBGCXD    ROD   X14061       55856bp
PT7CG       PHG   J02518       39936bp
RABBGLOB    MAM   M18818       44594bp
RATCRYG     ROD   M19359       54670bp
TOBCPCG     ORG   Z00044      155844bp
VACCG       VRL   M35027      191737bp
VAZCG       VRL   X04370      124884bp
X14112      UNA   X14112      152260bp
X14720      UNA   X14720       35100bp
X15423      UNA   X15423       47081bp
X15917      UNA   X15917       40469bp
X17012      UNA   X17012       30000bp
X17403      UNA   X17403      229354bp

APPENDIX B. Entries with a change in locus name

Accession   Rel 65.0      Rel 66.0
--------    ---------     ---------
J03132      HUMICAM1      HUMICAM1M
J04134      MUSCNR        MUSCNRA
K00319      BEACPTRMF     PHVCPTRMF
K00336      BEACPTRF      PHVCPTRF
M22244      BOVSVSP       BOVSVSPA
M24637      CHKCEK        CHKCEK1
M29035      BSUPEPF       BSUPEPFA
M29579      RATGLUSA      RATGLUS
M30953      ECOPANA       ECOPANF
M31077      HUMSTATHG1    HUMSTATH1
M31612      TRBESAGC      TRBESAGCA
M31628      STMHISOP      STMHISOPA
M32331      HUMICAM1AA    HUMICAM1
M32639      HUMSTATHG2    HUMSTATH2
M36473      ACLP322P      ACCP322P
X02297      PARSP51A3     PARSP51A4
X03499      DDITRNV       DDITGNV
X03500      DDITRNE       DDITGNE
X04083      TVMXGG        TVMXCG
X12646      HUMRPHO2A     HUMAPP2A
X12656      HUMPP2A       HUMBPP2A
X17615      ECOFHUE       X17615
Y00448      ECOK2KORB     RK2KORB

APPENDIX C. Number of entries, reports, and bases by organism

PRIMATE

     Key   Name                            Reports Entries    Bases
-------------------------------------------------------------------------------
  1. AGM   Cercopithecus aethiops               61      42    31391
  2. ATR   Aotus trivirgatus                     7       7     7557
  3. BAB   Papio anubis                          3       3     2653
  4. BAB   Papio doguera                         1       1     2000
  5. BAB   Papio hamadryas                       7       7    11052
  6. BAB   Papio papio                           1       1      343
  7. BAB   Papio sp.                             3       3     1601
  8. CEB   Cebus sp.                             2       2      190
  9. CEP   Cebus apella                          7       7     1819
 10. CHP   Pan paniscus                          1       1     1683
 11. CHP   Pan troglodytes                      68      53    72174
 12. COL   Colobus polykomos                     2       2     1494
 13. GCR   Galago crassicaudatus                38      38    16345
 14. GIB   Hylobates lar                         8       6    17024
 15. GOR   Gorilla gorilla                      19      11    19990
 16. GSE   Galago senegalensis                   1       1      369
 17. HUM   Homo sapiens                       9156    7235  8687688
 18. LEM   Cheirogaleus medius                   1       1     1899
 19. LEM   Lemur albifrons                       1       1     1786
 20. LEM   Lemur macaco                          3       3     5590
 21. LEM   Lemur sp.                             1       1     1380
 22. MAC   Macaca fascicularis                   8       7     6355
 23. MAC   Macaca mulatta                       40      28    33283
 24. MAC   Macaca nemestrina                     1       1     1115
 25. MAC   Macaca radiata                        2       2      342
 26. MAC   Macaca sp.                            7       6     7502
 27. MNK   Ateles geoffroyi                      4       4    13966
 28. MNK   Monkey                               11      11     2081
 29. ORA   Pongo pygmaeus                       19      17    36667
 30. SOE   Saguinus oedipus                      4       4     5207
 31. TAR   Tarsius sp.                           6       5    10837
     Total                                    9493    7511  9003383

RODENT

     Key   Name                            Reports Entries    Bases
-------------------------------------------------------------------------------
  1. DIP   Dipodomys ordii                       1       1     3318
  2. GPI   Cavia cobaya                          1       1      491
  3. GPI   Cavia cutleri                         3       2     4959
  4. GPI   Cavia porcellus                      23      22    36118
  5. GPI   Cavia sp.                             2       2     3226
  6. HAM   Cricetulus griseus                   32      28    38686
  7. HAM   Cricetulus longicaudatus             44      31    20752
  8. HAM   Cricetulus sp.                       37      33    31583
  9. HAM   Cricetus cricetus                     6       6    11446
 10. HAM   Mesocricetus auratus                 81      58    77935
 11. HAM   Mesocricetus sp.                     94      62    47934
 12. MAR   Marmota monax                         8       6     8563
 13. MUS   Mus caroli                           18      17    13904
 14. MUS   Mus domesticus                       24      20    12906
 15. MUS   Mus muscaris                         56      56    28343
 16. MUS   Mus musculus                       5821    4884  4299073
 17. MUS   Mus pahari                            7       7     7763
 18. MUS   Mus platythrix                        1       1      315
 19. MUS   Mus sp.                              11      11     9130
 20. MUS   Mus spretus                           9       7     7454
 21. PER   Peromyscus leucopus                   3       3     3640
 22. PER   Peromyscus maniculatus               11      11     1791
 23. RAT   Rattus colletti                       4       4     7849
 24. RAT   Rattus fuscipes                       1       1     1161
 25. RAT   Rattus leucopus                       3       3     3481
 26. RAT   Rattus norvegicus                  2599    2144  2797456
 27. RAT   Rattus rattus                       209     177   284465
 28. RAT   Rattus sordidus                       1       1     1161
 29. RAT   Rattus sp.                           56      45    61254
 30. RAT   Rattus tunneyi                        1       1     1161
 31. RAT   Rattus villosissimus                  2       2     3369
 32. SEH   Spalax ehrenbergi                     7       5    10412
     Total                                    9176    7652  7841099

OTHER MAMMALIAN

     Key   Name                            Reports Entries    Bases
-------------------------------------------------------------------------------
  1. AXI   Axis axis                             2       2     1758
  2. BOV   Bos bovis                             1       1     2363
  3. BOV   Bos taurus                          755     642   741804
  4. CAT   Felis catus                          40      40    23293
  5. CAT   Felis domesticus                      3       3     4748
  6. CAT   Felis silvestris                      8       6     9516
  7. CAT   Felis sp.                             1       1     3534
  8. DAV   Dasyurus viverrinus                   2       2      939
  9. DOG   Canis familiaris                     55      45    52602
 10. DOG   Canis lupus                           8       8     7867
 11. DOG   Canis sp.                            19      18    22470
 12. GOT   Capra hircus                         46      41    42286
 13. HRS   Equus caballus                       66      38    32167
 14. LEE   Lepus capensis                        1       1      434
 15. LEE   Lepus europaeus                       1       1     3646
 16. MMU   Muntiacus muntjak                     1       1      807
 17. MVI   Mustela vison                         3       3     1909
 18. OPO   Didelphis virginiana                  9       9     7139
 19. ORC   Orcinus orca                          1       1     1579
 20. PIG   Sus scrofa                          190     155   242401
 21. RAB   Basilea sp.                           1       1      377
 22. RAB   Oryctolagus cuniculus               420     363   525433
 23. RAB   Oryctolagus sp.                      57      57    70289
 24. RAB   Sylvilagus floridanus                 1       1     1065
 25. SEA   Halichoerus grypus                    3       3     2288
 26. SHP   Ovis aries                           69      61    87157
 27. SHP   Ovis sp.                             41      38    35507
 28. SUN   Suncus murinus                        8       6     6231
 29. VMP   Desmodus rotundus                     2       1     1725
 30. WAL   Macropus eugenii                      1       1      754
 31. WAL   Macropus robustus                     1       1     1465
 32. WAL   Macropus rufus                        1       1       50
     Total                                    1817    1552  1935603

OTHER VERTEBRATE

     Key   Name                            Reports Entries    Bases
-------------------------------------------------------------------------------
  1. ANG   Anguilla australis                    1       1     2191
  2. APT   Ascaphus truei                        2       1     1897
  3. BUJ   Bufo japonicus                        2       2     1116
  4. CHK   Gallus domesticus                   158     120   174064
  5. CHK   Gallus gallus                      1040     851  1029779
  6. CPL   Carcharhinus plumbeus                 4       4     1821
  7. CRC   Caiman crocodylus                     5       5     2902
  8. DUK   Aix sp.                               1       1      165
  9. DUK   Anas platyrhynchos                   15      14    21280
 10. DUK   Cairina moschata                     30      22    25269
 11. FAL   Falco columbarius                     1       1      174
 12. FSA   Myxine glutinosa                      1       1      915
 13. FSA   Petromyzon marinus                    5       5     9661
 14. FSB   Acipenser transmontano                2       1     1140
 15. FSB   Anarhichas lupus                      1       1     3395
 16. FSB   Carassius auratus                    12      11     9222
 17. FSB   Catostomus commersoni                 4       3     1936
 18. FSB   Ctenopharyngobon idella               1       1     4243
 19. FSB   Cyprinus carpio                      18      17    16913
 20. FSB   Electrophorus electricus              5       5     8692
 21. FSB   Elops saurus                          9       5     3870
 22. FSB   Fundulus heteroclitus                 1       1     1673
 23. FSB   Ictalurus punctatus                   7       6     5679
 24. FSB   Limanda ferruginea                    1       1      416
 25. FSB   Lophius americanus                    8       5     2942
 26. FSB   Macrozoarces americanus               3       3     2657
 27. FSB   Misgurnus fossilis                    2       2     1697
 28. FSB   Oncorhynchus keta                    30      26    23515
 29. FSB   Oncorhynchus kisutch                  2       2     3221
 30. FSB   Oncorhynchus tschawytscha             3       3     1862
 31. FSB   Paralichthys olivaceus                7       5     4859
 32. FSB   Pseudopleuronectus americanus         8       6     3002
 33. FSB   Salmo gairdneri                      52      48    40813
 34. FSB   Salmo irideus                         1       1     1278
 35. FSB   Salmo salar                           8       6     6764
 36. FSB   Thunnus thynnus                       1       1      911
 37. FSC   Torpedo californica                  21      13    23035
 38. FSC   Torpedo marmorata                     7       7     9992
 39. GOO   Anser anser                           2       2     4906
 40. GRE   Geoclemys reevessi                    1       1      239
 41. HEF   Heterodontus francisci               34      29    19173
 42. KRY   Kryptophaneron alfredi                1       1     1230
 43. LSE   Laticauda semifasciata                1       1      483
 44. LSL   Laticauda laticaudata                 2       1      632
 45. MRG   Mergus serrator                       1       1     2574
 46. NEW   Cynops pyrrhogaster                   1       1      629
 47. NVI   Notophthalmus viridescens            10      10     1458
 48. ORN   Oreochromis mossambicus               2       1      237
 49. ORN   Oreochromis niloticus                 1       1      847
 50. PAG   Pagrus major                          2       1      906
 51. PGN   Columba sp.                           2       2     1665
 52. PHS   Phasianus colchicus                   1       1      739
 53. PHU   Phyllomedusa bicolor                  1       1      781
 54. PHU   Phyllomedusa sauvagei                 2       2     1315
 55. PLS   Phylloscopus trochilus                6       3     2593
 56. PYU   Pyura stolonifera                     7       7     1029
 57. QUL   Coturnix coturnix                    50      35    36670
 58. QUL   Coturnix japonica                     1       1      311
 59. RAN   Rana catesbeiana                     10       8     6949
 60. RAN   Rana pipiens                          4       3     1625
 61. RAN   Rana temporaria                      19      14     7422
 62. SCC   Scyliorhinus caniculus                2       2      667
 63. SEQ   Seriola quinqueradiata                1       1      879
 64. SKT   Raja erinacea                         8       8    10209
 65. SMD   Triturus vulgaris                     4       4      901
 66. SMR   Pleurodeles waltlii                   5       4     2305
 67. SNK   Aipysurus laevis                      6       3     1332
 68. SNK   Bothrops atrox                        8       7    11423
 69. SNK   Crotalus durissus                     3       2     1263
 70. SNK   Elaphe radiata                        1       1     2483
 71. SNK   Naja naja                             1       1      312
 72. SNK   Natrix tessellata                     1       1      312
 73. SNK   Notechis scutatus                     2       1      621
 74. SRA   Hemitripterus americanus              2       2     3294
 75. TKY   Meleagris gallopavo                  16      14     6836
 76. VIA   Vipera ammodytes                      2       1      607
 77. XEB   Xenopus borealis                     27      26    19249
 78. XEC   Xenopus clivii                        6       6     1406
 79. XEL   Xenopus laevis                      503     440   505221
 80. XET   Xenopus tropicalis                   11       8    14850
 81. XIP   Xiphophorus maculatus                 4       2      983
 82. XIP   Xiphophorus sp.                       2       1     4135
 83. ZEF   Brachydanio rerio                     8       6     4264
     Total                                    2263    1876  2142926

INVERTEBRATE

     Key   Name                            Reports Entries    Bases
-------------------------------------------------------------------------------
  1. ACA   Acanthamoeba castellanii             32      28    28156
  2. ACP   Acropora formosa                      2       2      236
  3. ACP   Acropora latistella                   3       3      354
  4. ADO   Acheta domesticus                     1       1      541
  5. AEI   Aequipecten irradians                 2       2     1253
  6. AEV   Aequorea victoria                     4       4     2595
  7. AME   Apis melifica                         2       2      897
  8. AMF   Apis mellifera                        8       8     2082
  9. APL   Aplysia californica                  35      32    28418
 10. APL   Aplysia sp.                           9       9     9086
 11. APO   Antheraea polyphemus                 33      33    16883
 12. APY   Antheraea yamamai                     1       1     1200
 13. ARB   Arbacia punctulata                    3       2     5078
 14. BBO   Babesia bovis                         4       4     2482
 15. BBO   Babesia rodhaini                      1       1     3238
 16. BMO   Bombyx mandarina                      3       3     2712
 17. BMO   Bombyx mori                         152     120    87988
 18. BPL   Brachionus plicatilis                 1       1      120
 19. BRP   Brugia malayi                         9       8     8730
 20. BRP   Brugia pahangi                        2       2     4893
 21. BUG   Bugula neritina                       1       1      120
 22. CAF   Calanus finmarchicus                  1       1     1487
 23. CAR   Carcinoscorpius rotundicauda          1       1      983
 24. CAV   Calliphora vicina                    11       7    16088
 25. CBL   Trichoplusia ni                       1       1     2475
 26. CCA   Caledia captiva                      12      12    11713
 27. CEI   Ceratitis capitata                    1       1     1115
 28. CEL   Caenorhabditis briggsae               7       4     5300
 29. CEL   Caenorhabditis elegans              139     108   228239
 30. CER   Calliphora erythrocephala             6       6     1677
 31. CHI   Chironomus pallidivittatus           21      16     5529
 32. CHI   Chironomus tentans                   22      20    20059
 33. CHI   Chironomus thummi                    32      30    32150
 34. CLM   Clam sp.                              1       1     2163
 35. CLM   Spisula solidissima                   2       2     3577
 36. CPD   Colpidium campylum                    1       1       76
 37. CPD   Colpidium colpoda                     1       1       76
 38. CRB   Cardisoma guanhumi                    3       3     1992
 39. CRB   Gecarcinus lateralis                  4       4     6714
 40. CRB   Geryon quinquedens                    1       1       85
 41. CRB   Limulus polyphemus                    4       4     4886
 42. CRB   Paguras pollicaris                    4       4      419
 43. CUL   Culex pipiens                         1       1     3105
 44. DDI   Dictyostelium discoideum            282     215   215990
 45. DDI   Dictyostelium sp.                     2       1     4439
 46. DEM   Dermasterias imbricata                2       2     1016
 47. DEP   Dermatophagoides pteronyssinus        2       1      824
 48. DIA   Diadromus pulchellus                  1       1      324
 49. DIC   Dicyema misakiense                    1       1      116
 50. DIR   Dirofilaria immitis                   2       2     2791
 51. DRE   Drosophila erecta                     4       4     4847
 52. DRF   Drosophila funebris                   2       1     3002
 53. DRG   Drosophila gymnobasis                10      10     1793
 54. DRH   Drosophila heteroneura                1       1      922
 55. DRH   Drosophila hydei                     18      15    26936
 56. DRH   Drosophila irilis                     1       1     1602
 57. DRI   Drosophila grimshawi                  3       3     3860
 58. DRL   Drosophila silvarentis                2       2      356
 59. DRM   Drosophila mauritiana                 6       4     8528
 60. DRN   Drosophila navojoa                    2       1     2334
 61. DRN   Drosophila nebulosa                   2       2     2991
 62. DRO   Drosophila melanogaster            1242    1001  1581360
 63. DRO   Drosophila subobscura                 2       2     4827
 64. DRP   Drosophila pseudoobscura             14       7    38083
 65. DRQ   Drosophila sechellia                  3       2     5559
 66. DRR   Drosophila orena                      7       4     7293
 67. DRS   Drosophila simulans                  23      18    21926
 68. DRS   Drosophila sp.                        2       2     5761
 69. DRT   Drosophila teissieri                  4       4     7787
 70. DRU   Drosophila mulleri                    1       1     6778
 71. DRV   Drosophila virilis                   47      37    72725
 72. DRW   Drosophila mojavensis                 6       5    21612
 73. DRY   Drosophila yakuba                     1       1     1853
 74. ECC   Echinococcus granulosus               2       2      846
 75. EIM   Eimeria acervulina                    4       2     1764
 76. EIM   Eimeria tenella                       5       4     4629
 77. ENH   Entamoeba histolytica                15      14    10247
 78. EPH   Ephydatia mulleri                     1       1     2895
 79. EUC   Eurypelma californicum                1       1     1579
 80. EWA   Euplotes aediculatus                  1       1     1882
 81. EWC   Euplotes crassus                      2       2     1427
 82. EWE   Euplotes eurystomus                   2       1      930
 83. EWR   Euplotes raikovi                      1       1      593
 84. FFL   Luciola cruciata                      1       1     1985
 85. FHE   Fasciola hepatica                     1       1      894
 86. GCH   Glaucoma chattoni                     3       3      564
 87. GLA   Giardia lamblia                      14      10     9487
 88. GLA   Giardia sp.                           1       1      831
 89. GLY   Glycera dibranchiata                  1       1      745
 90. GMO   Glossina austeni                      1       1      653
 91. GMO   Glossina fuscipes                     1       1      239
 92. GMO   Glossina morsitans                    7       7     3244
 93. GMO   Glossina palpalis                     1       1      236
 94. HAE   Haemonchus contortus                  4       4     3557
 95. HCE   Hyalophora cecropia                  11      11     6608
 96. HEL   Heliothis virescens                   1       1     2977
 97. HIR   Hirudo medicinalis                    1       1      379
 98. HLT   Haliotis corrugata                    1       1      650
 99. HLT   Haliotis rufescens                    1       1      642
100. HOL   Holothuria polii                      5       5     1964
101. HOL   Holothuria tubulosa                   1       1      441
102. HYD   Hydra sp.                             2       2     4555
103. LAN   Lingula anatina                       1       1      120
104. LAR   Lampetra reissneri                    1       1      120
105. LEI   Leishmania donovani                  10       6    10013
106. LEI   Leishmania enriettae                  4       4     1562
107. LEI   Leishmania enriettii                  4       4     4213
108. LEI   Leishmania major                     10       8     9788
109. LEI   Leishmania sp.                        4       4     8884
110. LEI   Leishmania tropica                    1       1     1851
111. LIT   Litomosoides carinii                  2       2      214
112. LMI   Locusta migratoria                    5       5     3039
113. LOA   Loa loa                               1       1      839
114. LUM   Lumbricus terrestris                  2       2     5061
115. LYM   Lymnaea stagnalis                     2       2     1764
116. MDO   Musca domestica                       3       2     2916
117. MOT   Manduca sexta                        25      22    40693
118. MSQ   Aedes aegypti                         4       4     8350
119. MSQ   Anopheles gambiae                    21      21     7322
120. NEM   Ascaris lumbricoides                 35      34    17038
121. NEM   Ascaris suum                          2       2     6079
122. NEP   Nephila clavipes                      1       1     2336
123. NGR   Naegleria gruberi                     4       4     6389
124. OCT   Octopus dofleini                      1       1     1315
125. OCT   Paroctopus defleini                   1       1     1675
126. OFA   Oxytricha fallax                     34      13     9421
127. OMM   Ommastrephes sloanei                  2       2     1825
128. ONG   Onchocerca sp.                        2       2      214
129. ONG   Onchocerca volvulus                  25      24    15477
130. ONO   Oxytricha nova                       21      21    20172
131. OWE   Owenia fusiformis                     1       1     1548
132. PAA   Parascaris sp.                        2       2      215
133. PAL   Paracentrotus lividus                11       9    13481
134. PAR   Paramecium aurelia                    5       5     1190
135. PAR   Paramecium primaurelia                8       8    12822
136. PAR   Paramecium tetraurelia               20      17     9174
137. PBA   Plasmodium gallinaceum                1       1      799
138. PBE   Plasmodium berghei                    9       9    11176
139. PBS   Plasmodium brasilianum                2       2     3010
140. PCH   Plasmodium chabaudi                  11       9    17151
141. PCR   Philosamia cynthia ricini             2       2      240
142. PCY   Plasmodium cynomolgi                  6       6     7875
143. PFA   Plasmodium falciparum               168     138   216729
144. PFA   Plasmodium fragile                    2       1     2307
145. PIO   Pisaster ochraceus                    5       5     9699
146. PKN   Plasmodium knowlesi                  11       9     8880
147. PLM   Plasmodium malariae                   1       1     1545
148. PLM   Plasmodium reichenowi                 1       1      654
149. PLO   Plasmodium lophurae                   6       5     6087
150. PMC   Pneumocystis carinii                 10       6     5670
151. PMI   Prorocentrum micans                   2       1     2451
152. PNG   Panagrellus redivivus                 2       2      322
153. PNG   Panagrellus silusiae                  1       1      682
154. PPY   Photinus pyralis                      1       1     2387
155. PVI   Plasmodium vivax                     13       7     8966
156. PYO   Plasmodium yoelii                    15      11    21793
157. SAR   Sarcocystis gigantea                  3       3      473
158. SCA   Schistocerca americana                3       3     3458
159. SCA   Schistocerca nitans                   2       2      711
160. SCI   Sciara coprophila                     2       2      822
161. SCM   Schistosoma japonicum                10       9     9090
162. SCM   Schistosoma mansoni                  58      52    46518
163. SCR   Androctonus australis                 7       7     2563
164. SEM   Parastichopus parvimensis             1       1     1458
165. SHR   Artemia salina                       21      20    14517
166. SHR   Artemia sp.                          11       8     6407
167. SLE   Stylonychia lemnae                    4       4     7237
168. SLU   Stylonychia pustulata                 5       5     2820
169. SPE   Sarcophaga peregrina                  7       7     6405
170. SPF   Spodoptera frugiperda                 5       5     7382
171. SQD   Loligo pealii                         1       1     3693
172. STF   Asterina pectinifera                  1       1     2180
173. SUH   Hemicentrotus pulcherrimus            1       1     2413
174. SUL   Lytechinus pictus                    14      13    10476
175. SUL   Lytechinus variegatus                16      16    12102
176. SUP   Psammechinus miliaris                50      38    25964
177. SUS   Strongylocentrotus drobachiensis      4       4     1256
178. SUS   Strongylocentrotus franciscanus       2       2     3832
179. SUS   Strongylocentrotus purpuratus       167     158   114993
180. SUT   Tripneustes gratilla                 13       7    10207
181. SUU   Sea urchin                           12      12     2632
182. TAE   Taenia solium                         3       2     3737
183. TAT   Tachypleus tridentatus                1       1      946
184. TCA   Tribolium castaneum                   1       1      707
185. TCK   Boophilus microplus                   2       1     2225
186. TCS   Trichostrongylus colubriformis        2       2     1987
187. TEC   Tetrahymena cosmopolitanis            1       1      511
188. TEH   Tetrahymena hyperangularis            2       2      767
189. TEL   Tetrahymena leucophrys                1       1       75
190. TEM   Tetrahymena malaccensis               1       1      507
191. TEN   Tenebrio molitor                     23      23     5207
192. TEP   Tetrahymena paravorax                 1       1       74
193. TEP   Tetrahymena pigmentosa                4       4     3072
194. TES   Tetrahymena sonneborni                1       1      511
195. TET   Tetrahymena thermophila              64      61    49075
196. TEU   Tetrahymena patula                    1       1       75
197. TEX   Tetrahymena vorax                     1       1       75
198. TEY   Tetrahymena pyriformis               13      11    10560
199. THE   Theileria annulata                    2       2     2859
200. THE   Theileria parva                       3       3     6729
201. TOX   Toxoplasma gondii                    10      10    12582
202. TRB   Trypanosoma brucei                  252     223   288704
203. TRC   Trypanosoma cruzi                    35      33    31153
204. TRE   Trypanosoma equiperdum                6       4     3816
205. TRF   Crithidia fasciculata                16      12    20239
206. TRI   Trichomonas vaginalis                 1       1      582
207. TRL   Leptomonas collosoma                  1       1      154
208. TRL   Leptomonas seymouri                   6       4     2130
209. TRO   Trypanosoma congolense                9       9     8879
210. TRV   Trypanosoma vivax                     2       2      857
211. TRY   Trypanosoma rangeli                   1       1      153
212. TSR   Trichinella spiralis                  3       2     2138
213. UCA   Urechis caupo                         1       1      718
214. VAH   Vargula hilgendorfii                  1       1     1818
215. VAI   Vairimorpha necatrix                  1       1     1244
216. VUI   Eupelmus vuilleti                     1       1      106
217. WSP   Dolichovespula maculata               2       2     1367
218. WUC   Wuchereria bancrofti                  1       1     1323
     Total                                    3811    3195  4005462

PLANT

     Key   Name                            Reports Entries    Bases
-------------------------------------------------------------------------------
  1. ABG   Absidia glauca                        2       1     1011
  2. ACK   Achlya ambisexualis                   1       1     1121
  3. ACK   Achlya bisexualis                     1       1     1809
  4. ACK   Achlya klebsiana                      1       1     1254
  5. ACT   Actinidia chinensis                   8       5     5231
  6. ACT   Actinidia deliciosa                   1       1      634
  7. AEG   Aegilops tauschii                     1       1      421
  8. ALC   Allium cepa                           3       3     1135
  9. ALF   Medicago sativa                      29      18    26748
 10. AMA   Antirrhinum majus                    11       7    11860
 11. APE   Acremonium chrysogenum                2       2     2410
 12. ASA   Aspergillus awamori                   4       3     7873
 13. ASG   Aspergillus niger                    12       9    18564
 14. ASL   Aessosporon salmonicolor              1       1      119
 15. ASN   Aspergillus nidulans                 49      42    89815
 16. ASO   Aspergillus oryzae                   13      10    14743
 17. AST   Avena sativa                          8       8    22641
 18. ATH   Arabidopsis thaliana                 94      76   136042
 19. AVO   Persea americana                      3       3     4672
 20. BJE   Bjerkandera adusta                    1       1      118
 21. BLY   Hordeum vulgare                      91      72    86901
 22. BNA   Brassica napus                       25      21    23597
 23. BOL   Brassica campestris                   8       5     5320
 24. BOL   Brassica juncea                       3       2      356
 25. BOL   Brassica oleracea                    12       9     3859
 26. BRM   Bremia lactucae                       1       1     2869
 27. BRN   Bertholletia excelsa                  1       1      621
 28. CAG   Canavalia gladiata                    3       2     4797
 29. CCI   Coprinus cinereus                     2       2     6091
 30. CEN   Canavalia ensiformis                  1       1     1027
 31. CFU   Caldariomyces fumago                  2       1     2787
 32. CHE   Chenopodium rubrum                    6       3     1865
 33. CHL   Chlorella protothecoides              2       1     1332
 34. CHL   Chlorella sp.                         6       6     1019
 35. CIP   Mesembryanthemum crystallinum        11       8    25401
 36. CLC   Claviceps purpurea                    2       2      758
 37. CLI   Citrus limon                          3       3     4958
 38. CLR   Clarkia unguiculata                   2       1     2040
 39. CNA   Citrullus vulgaris                    1       1     1334
 40. COA   Convolvulus arvensis                  4       4     4549
 41. COC   Cochliobolus heterostrophus           2       2     2634
 42. COG   Colletotrichum capsici                1       1     2557
 43. COG   Colletotrichum gloeosporioides        1       1     1749
 44. COG   Colletotrichum graminicola            2       2     5286
 45. COT   Gossypium hirsutum                    9       9    13403
 46. CPA   Carica papaya                         3       3     2037
 47. CPA   Cyanophora paradoxa                   1       1      528
 48. CRE   Chlamydomonas reinhardtii            49      38    52939
 49. CTR   Catharanthus roseus                   1       1     1740
 50. CUC   Cucurbita maxima                      5       3    15360
 51. CUC   Cucurbita moschata                    2       1     1781
 52. CUC   Cucurbita pepo                        8       8     7363
 53. CUS   Cucumis melo                          1       1     1137
 54. CUS   Cucumis sativus                      13      12    12742
 55. DAR   Daucus carota                        11       9    11820
 56. DBI   Dolichos biflorus                     3       3     5384
 57. DUN   Dunaliella salina                     3       2     2541
 58. EGR   Euglena gracilis                      9       6     9011
 59. EPA   Endothia parasitica                   5       3     3809
 60. EPK   Ephedra kokanica                      2       1      120
 61. ERG   Erysiphe graminis                     1       1     2475
 62. FIL   Filobasidium capsuligenum             1       1      118
 63. FIL   Filobasidium floriforme               1       1      118
 64. FLX   Linum usitatissimum                   5       5     1726
 65. FSO   Fusarium oxysporum                    2       2     1347
 66. FSO   Fusarium solani                       3       3     5508
 67. FSO   Fusarium sporotrichioides             2       1     1908
 68. FTR   Flaveria trinervia                    1       1      752
 69. GCO   Gracilaria tikvahiae                  1       1     1771
 70. GCO   Gracilaria verrucosa                  1       1     1771
 71. GNG   Gnetum gnemon                         2       1      120
 72. GRO   Gracilariopsis sp.                    1       1     1782
 73. HAS   Hansenula anomala                     2       1     2132
 74. HAS   Hansenula polymorpha                  2       1     3637
 75. HEV   Hevea brasiliensis                    1       1     1008
 76. HNN   Helianthus annuus                     6       4     5752
 77. HRA   Armoracia rusticana                   2       2     5828
 78. IPB   Ipomoea batatas                       9       7    11859
 79. LGI   Lemna gibba                           6       6     5024
 80. LGI   Lemna minor                           1       1      119
 81. LIL   Lilium henryi                         1       1     9345
 82. LOL   Lolium perenne                        2       2     2082
 83. LUP   Lupinus luteus                       20      15     5280
 84. MAQ   Marsilia quadrifolia                  2       1      155
 85. MIN   Matthiola incana                      1       1      509
 86. MRA   Mucor racemosus                       8       7     6601
 87. MRM   Mucor circinelloides                  1       1     4399
 88. MRM   Mucor miehei                          2       2     3316
 89. MRP   Mucor pusillus                        1       1     1965
 90. MZE   Zea mays                            235     193   295885
 91. MZE   Zea mexicana                          2       2      360
 92. NAN   Nanochlorum eucaryotum                2       1     1796
 93. NEU   Neurospora crassa                   116     102   168075
 94. OCH   Ochromonas danica                     1       1     1789
 95. PAN   Podospora anserina                    3       3     1901
 96. PAP   Papaver somniferum                    2       2      864
 97. PBL   Phycomyces blakesleeanus              3       3      545
 98. PCP   Physcomitrella patens                 1       1     2544
 99. PEA   Pisum sativum                        81      72    85166
100. PEC   Penicillium chrysogenum               7       6    20758
101. PEP   Penicillium patulum                   1       1     6357
102. PET   Petunia hybrida                      37      31    28787
103. PET   Petunia sp.                          25      24    13154
104. PHA   Phanerochaete chrysosporium          14      10    18832
105. PHN   Pharbitis nil                        10       5      534
106. PHO   Petroselinum hortense                 1       1     1431
107. PHT   Phytophthora megasperma               4       2     3031
108. PHV   Phaseolus lunatus                     1       1      926
109. PHV   Phaseolus vulgaris                   48      37    43358
110. PIN   Pinus contorta                        1       1      745
111. PIN   Pinus sylvestris                      2       1      583
112. PIN   Pinus thunbergii                      4       2     1889
113. PMI   Prorocentrum micans                   2       1     3408
114. POA   Polytomella agilis                    3       3     6616
115. POM   Polystichum munitum                   4       4     4645
116. POP   Populus sp.                          17      10     3776
117. POT   Solanum tuberosum                    59      52    81544
118. PSJ   Psathyrostachys juncea                2       2     2035
119. PTE   Porphyra umbilicalis                  2       1      121
120. PUM   Petroselinum crispum                 10      10     6270
121. PYL   Pylaiella littoralis                  2       1     1644
122. RAD   Raphanus sativus                      9       6     2470
123. RCC   Ricinus communis                      6       6    10485
124. RCH   Rhizopus chinensis                    1       1     1133
125. RCH   Rhizopus niveus                       2       2     3448
126. RCH   Rhizopus oryzae                       1       1     2290
127. RDT   Rhodotorula rubra                     2       1     3586
128. RHD   Rhodosporidium toruloides             2       2     3181
129. RHP   Parasponia andersonii                 1       1     1520
130. RHP   Parasponia rhizobium                  2       2     5530
131. RIC   Oryza sativa                         81      56    78071
132. RYE   Secale cereale                        3       3     5870
133. SAL   Sinapis alba                         12       8     8316
134. SCN   Schwanniomyces occidentalis           2       1     2292
135. SCO   Schizophyllum commune                 5       5     4836
136. SES   Sesbania rostrata                     6       5     3948
137. SIP   Silene pratensis                      4       4     3165
138. SLM   Physarum polycephalum                59      49    52689
139. SOY   Glycine max                         144     111   165611
140. SPI   Spinacia oleracea                    37      30    34346
141. SRG   Sorghum bicolor                       9       5     6336
142. SRG   Sorghum sp.                           2       1     4638
143. SSI   Scilla siberica                       4       4      204
144. SSY   Sisymbrium irio                       2       1      433
145. TDA   Thaumatococcus daniellii              1       1      931
146. TFR   Trifolium repens                      2       1     1268
147. THI   Thinopyrum elongatum                  1       1     1375
148. TLA   Thermomyces lanuginosus               6       5     3804
149. TOB   Nicotiana alata                       1       1      804
150. TOB   Nicotiana plumbaginifolia            12      11    22286
151. TOB   Nicotiana rustica                     2       2      593
152. TOB   Nicotiana sylvestris                  4       4     1382
153. TOB   Nicotiana tabacum                    67      51    71765
154. TOM   Lycopersicon esculentum              91      72    94920
155. TOM   Lycopersicon peruvianum               1       1      480
156. TRD   Tripsacum dactyloides                 6       6     1528
157. TRH   Trichosanthes kirilowii               2       2     2237
158. TRR   Trichoderma reesei                    4       4     8343
159. TRT   Trema tomentosa                       2       1     1727
160. URO   Uromyces appendiculatus               1       1     1449
161. USM   Ustilago maydis                       3       2     2656
162. VFA   Vicia faba                           39      33    29444
163. VIR   Vigna mungo                           2       1     1314
164. VIR   Vigna radiata                         5       4     4354
165. VVC   Volvox carteri                       11       9    17792
166. WHT   Triticum aestivum                   100      79   115801
167. WHT   Triticum durum                        1       1      898
168. WHT   Triticum sp.                          8       8     8441
169. WHT   Triticum vulgare                      1       1      965
170. YS1   Zygosaccharomyces fermentati          1       1     5416
171. YS2   Saccharomycopsis fibuligera           3       3     9339
172. YS4   Candida boidinii                      2       2     1863
173. YS5   Candida glabrata                      3       3     2758
174. YSA   Candida albicans                     11       8    13498
175. YSB   Candida tropicalis                   16      13    23751
176. YSC   Saccharomyces cerevisiae           1131     933  1716468
177. YSCTY Transposable element TY1             53      44    52430
178. YSD   Saccharomyces diastaticus             4       4     4319
179. YSD   Saccharomyces douglassi               1       1     4072
180. YSE   Candida pelliculosa                   1       1     5327
181. YSF   Candida maltosa                       5       4     8167
182. YSG   Saccharomyces carlsbergensis         23      20    38053
183. YSH   Hansenula wingei                      3       3      720
184. YSI   Saccharomyces fibuligera              2       2     6761
185. YSJ   Yarrowia lipolytica                   8       7    17412
186. YSK   Kluyveromyces lactis                 41      32    86137
187. YSM   Hansenula polymorpha                  3       3     8018
188. YSN   Kluyveromyces fragilis                1       1     4193
189. YSO   Zygosaccharomyces rouxii              5       3    15025
190. YSP   Schizosaccharomyces japonicus         1       1      108
191. YSP   Schizosaccharomyces malidevorans      1       1      107
192. YSP   Schizosaccharomyces octosporus        1       1      109
193. YSP   Schizosaccharomyces pombe           135     114   206826
194. YSQ   Pichia pastoris                       3       3      899
195. YSS   Cephalosporium acremonium             5       5     2757
196. YST   Yeast sp.                            38      37    17556
197. YSU   Candida utilis                        4       4     7578
198. YSV   Saccharomyces uvarum                  1       1     2001
199. YSW   Kluyveromyces drosophilarum           1       1     4757
200. YSX   Saccharomyces rosei                   1       1      278
201. YSY   Saccharomyces kluyveri                5       4     2875
202. YSZ   Zygosaccharomyces bailii              1       1     5415
203. ZAM   Zamia pumila                          1       1     1813
     Total                                    3636    2976  4659180

ORGANELLE

     Key   Name                                     Reports Entries    Bases
-------------------------------------------------------------------------------
  1. ABGMT Mitochondrion Absidia glauca                   1       1      596
  2. ACUCP Chloroplast Acorus calamus                     1       1      103
  3. AEGCP Chloroplast Aegilops crassa                    2       1     1436
  4. AEGCP Chloroplast Aegilops squarrosa                 1       1      203
  5. AKOMT Mitochondrion Akodon aerosus                   6       6     2406
  6. AKOMT Mitochondrion Akodon andinus                   1       1      401
  7. AKOMT Mitochondrion Akodon boliviensis               2       2      802
  8. AKOMT Mitochondrion Akodon jelskii                   3       3     1203
  9. AKOMT Mitochondrion Akodon juninensis                1       1      401
 10. AKOMT Mitochondrion Akodon kofordi                   1       1      401
 11. AKOMT Mitochondrion Akodon mollis                    1       1      401
 12. AKOMT Mitochondrion Akodon puer                      1       1      401
 13. AKOMT Mitochondrion Akodon subfuscus                 3       3     1203
 14. AKOMT Mitochondrion Akodon torques                   3       3     1203
 15. ALFCP Chloroplast Medicago sativa                    3       3     3460
 16. AMDCP Chloroplast Acetabularia mediterranea          1       1     1175
 17. AMFMT Mitochondrion Apis mellifera                   1       1     2949
 18. AMHCP Chloroplast Amaranthus hybridus                1       1     1187
 19. AMTMT Mitochondrion Ambystoma tigrinum               2       1      225
 20. ASIMT Mitochondrion Ascobolus immersus               2       1     5142
 21. ASNMT Mitochondrion Aspergillus amstelodami          1       1      624
 22. ASNMT Mitochondrion Aspergillus nidulans            22      18    32837
 23. ASTCP Chloroplast Avena sativa                       1       1     1623
 24. ATHCP Chloroplast Arabidopsis thaliana               3       2     1499
 25. ATHMT Mitochondrion Arabidopsis thaliana             2       1      880
 26. ATPCP Chloroplast Atriplex patula                    1       1     1786
 27. ATPCP Chloroplast Atriplex rosea                     1       1     1790
 28. BETMT Mitochondrion Beta vulgaris                    4       4     5919
 29. BLSMT Mitochondrion Boletus satanas                  2       1      341
 30. BLYCP Chloroplast Hordeum vulgare                   18      13    23006
 31. BNACP Chloroplast Brassica napus                     1       1     1633
 32. BOLCP Chloroplast Brassica oleracea                  1       1      543
 33. BOLMT Mitochondrion Brassica oleracea                1       1      549
 34. BOLMT Mitochondrion Brassica sp.                     2       2      770
 35. BOMMT Mitochondrion Bolomys amoenus                  2       2      802
 36. BOVMT Mitochondrion Bos taurus                       6       5    19563
 37. CEUMT Mitochondrion Cervus unicolor                  1       1     2682
 38. CHECP Chloroplast Chenopodium album                  1       1      207
 39. CHKMT Mitochondrion Gallus domesticus                1       1     3571
 40. CHKMT Mitochondrion Gallus gallus                    4       3     1153
 41. CHLCP Chloroplast Chlorella ellipsoidea             12       9    15063
 42. CHPMT Mitochondrion Pan troglodytes                  1       1      896
 43. CNAMT Mitochondrion Citrullus lanatus                1       1     4512
 44. COCMT Mitochondrion Cochliobolus heterostrophus      1       1     1827
 45. CODCP Chloroplast Codium fragile                     4       4      304
 46. COOCP Chloroplast Coleochaete orbicularis            1       1     1587
 47. CPACP Chloroplast Cyanophora paradoxa               11       8     2748
 48. CPACY Cyanelle Cyanophora paradoxa                   5       5     5317
 49. CRECP Chloroplast Chlamydomonas moewusii             3       2     8107
 50. CRECP Chloroplast Chlamydomonas reinhardtii         42      33    37766
 51. CRECP Chloroplast Chlamydomonas sp.                  3       2     2346
 52. CREMT Mitochondrion Chlamydomonas reinhardtii       26      22    21535
 53. CRRMT Mitochondrion Corcorax melanorhamphos          1       1      239
 54. CRYCP Chloroplast Cryptomonas phi                    2       1      715
 55. DARMT Mitochondrion Daucus carota                    1       1      690
 56. DIPMT Mitochondrion Dipodomys californicus           1       1      239
 57. DIPMT Mitochondrion Dipodomys heermanni              1       1      239
 58. DIPMT Mitochondrion Dipodomys panamintinus           2       2      478
 59. DRMMT Mitochondrion Drosophila mauritania            1       1      976
 60. DROMT Mitochondrion Drosophila melanogaster          9       6    16133
 61. DRSMT Mitochondrion Drosophila simulans              1       1      975
 62. DRVMT Mitochondrion Drosophila virilis               1       1      191
 63. DRYMT Mitochondrion Drosophila yakuba                9       3    19938
 64. EGRCP Chloroplast Euglena gracilis                  35      24    62419
 65. EQQMT Mitochondrion Equus quagga                     2       2      229
 66. FHEMT Mitochondrion Fasciola hepatica                2       1      708
 67. FRGMT Mitochondrion Rana catesbeiana                 9       5    12109
 68. FSBMT Mitochondrion Acipenser transmontano           2       1      156
 69. FSBMT Mitochondrion Cichlosoma centrarchus           1       1      239
 70. FSBMT Mitochondrion Cichlosoma citrinellum           1       1      239
 71. FSBMT Mitochondrion Cichlosoma labiatum              1       1      239
 72. FSBMT Mitochondrion Cichlosoma nicaraguense          1       1      239
 73. FSBMT Mitochondrion Cyprinus carpio                  3       3      873
 74. FSBMT Mitochondrion Julidochromis regani             1       1      239
 75. FSBMT Mitochondrion Salmo gairdneri                  2       2      855
 76. FTRCP Chloroplast Flaveria bidentis                  1       1     1839
 77. FTRCP Chloroplast Flaveria pringlei                  1       1     1842
 78. GCOCP Chloroplast Gracilaria tenuistipitata          1       1     1930
 79. GIBMT Mitochondrion Hylobates lar                    1       1      896
 80. GORMT Mitochondrion Gorilla gorilla                  1       1      896
 81. HAMMT Mitochondrion Cricetulus sp.                   1       1      880
 82. HNNMT Mitochondrion Helianthus annuus                1       1     1336
 83. HUMMT Mitochondrion Homo sapiens                    43      36    37585
 84. HYRMT Mitochondrion Hydropotes inermis               1       1     2680
 85. IPBCP Chloroplast Ipomoea batatas                    1       1     2004
 86. LEIKP Kinetoplast Leishmania aethiopica              1       1      376
 87. LEIKP Kinetoplast Leishmania major                   2       1     1031
 88. LEIKP Kinetoplast Leishmania mexicana                3       3     2134
 89. LEIKP Kinetoplast Leishmania tarentolae             22      16    28301
 90. LEIMT Mitochondrion Leishmania tarentolae            1       1      189
 91. LEMMT Mitochondrion Lemur catta                      1       1      895
 92. LIGCP Chloroplast Ligularia calthifolia              1       1      103
 93. LMIMT Mitochondrion Locusta migratoria               3       3     5118
 94. LUAMT Mitochondrion Lupinus angustifolius            2       2     1330
 95. LUPMT Mitochondrion Lupinus luteus                   2       1      630
 96. MACMT Mitochondrion Macaca fascicularis              2       2     1598
 97. MACMT Mitochondrion Macaca fuscata                   1       1      896
 98. MACMT Mitochondrion Macaca mulatta                   1       1      896
 99. MACMT Mitochondrion Macaca sylvanus                  1       1      896
100. MCXMT Mitochondrion Microxus mimus                   2       2      802
101. MMUMT Mitochondrion Muntiacus reevesi                1       1     2682
102. MPOCP Chloroplast Marchantia polymorpha             10       1   121024
103. MSQMT Mitochondrion Aedes albopictus                 9       9     3448
104. MUSMT Mitochondrion Mus musculus                    22      15    20678
105. MZECP Chloroplast Zea mays                          72      55    54150
106. MZECP Chloroplast Zea perennis                       2       2     1456
107. MZEMT Mitochondrion Zea mays                        53      45    80743
108. NEUMT Mitochondrion Neurospora crassa               44      39    55773
109. NEUMT Mitochondrion Neurospora intermedia            5       4    10804
110. NRACP Chloroplast Neurachne munroi                   1       1     1990
111. NRACP Chloroplast Neurachne tenuifolia               1       1     2010
112. OBECP Chloroplast Oenothera berteriana               6       4     1813
113. OBEMT Mitochondrion Oenothera berteriana            18      15    25036
114. OBOCP Chloroplast Oenothera odorata                  2       2      964
115.
ODOMT Mitochondrion Odocoileus virginianus 1 1 2677 116. OHOCP Chloroplast Oenothera hookeri 2 2 2132 117. OLICP Chloroplast Olisthodiscus luteus 1 1 714 118. ORAMT Mitochondrion Pongo pygmaeus 1 1 895 119. OSPMT Mitochondrion Oenothera sp. 2 2 1635 120. PALMT Mitochondrion Paracentrotus lividus 17 17 21974 121. PANMT Mitochondrion Podospora anserina 21 20 65750 122. PARMT Mitochondrion Paramecium aurelia 9 9 8110 123. PARMT Mitochondrion Paramecium primaurelia 4 3 5645 124. PARMT Mitochondrion Paramecium sp. 34 17 12563 125. PARMT Mitochondrion Paramecium tetraurelia 4 4 5844 126. PEACP Chloroplast Pisum sativum 36 30 55118 127. PEAMT Mitochondrion Pisum sativum 8 6 10694 128. PENCP Chloroplast Pennisetum americanum 2 1 325 129. PETCP Chloroplast Petunia hybrida 7 7 5806 130. PETMT Mitochondrion Petunia hybrida 4 3 2954 131. PETMT Mitochondrion Petunia parodii 1 1 1774 132. PFAMT Mitochondrion Plasmodium falciparum 1 1 935 133. PGYMT Mitochondrion Paragyrodon sphaerosporus 2 1 337 134. PHVMT Mitochondrion Phaseolus vulgaris 1 1 88 135. PIGMT Mitochondrion Sus scrofa 2 2 686 136. PILCP Chloroplast Pilayella littoralis 1 1 353 137. PMGMT Mitochondrion Placopecten magellanicus 5 5 4580 138. POGMT Mitochondrion Thomomys townsendi 1 1 239 139. PZOCP Chloroplast Pelargonium zonale 2 2 463 140. RADMT Mitochondrion Raphanus sativus 2 2 5752 141. RATMT Mitochondrion Rattus norvegicus 35 26 23507 142. RATMT Mitochondrion Rattus rattus 4 4 4388 143. RHS Mitochondrion Rhizopogon achraeceorubens 2 1 341 144. RHS Mitochondrion Rhizopogon subcaerulescens 2 1 341 145. RICCP Chloroplast Oryza sativa 11 9 12553 146. RICMT Mitochondrion Oryza sativa 5 5 8084 147. RYECP Chloroplast Secale cereale 9 7 9269 148. SAIMT Mitochondrion Saimiri sciureus 1 1 893 149. SALCP Chloroplast Sinapis alba 8 6 9874 150. SAOCP Chloroplast Saponaria officinalis 1 1 1252 151. SCOMT Mitochondrion Schizophyllum commune 1 1 1120 152. SLMMT Mitochondrion Physarum polycephalum 1 1 1536 153. 
SNICP Chloroplast Solanum nigrum 1 1 1501 154. SOLCP Chloroplast Spirodela oligorhiza 9 9 6538 155. SOYCP Chloroplast Glycine max 14 9 16045 156. SOYMT Mitochondrion Glycine max 5 5 8683 157. SPFMT Mitochondrion Spodoptera frugiperda 1 1 446 158. SPICP Chloroplast Spinacia oleracea 47 38 79017 159. SRGCP Chloroplast Sorghum bicolor 1 1 862 160. SRGMT Mitochondrion Sorghum sp. 4 2 4768 161. STFMT Mitochondrion Asterina pectinifera 1 1 3849 162. SUIMT Mitochondrion Suillus cavipes 2 1 339 163. SUSMT Mitochondrion Strongylocentrotus drobachiensis 2 2 965 164. SUSMT Mitochondrion Strongylocentrotus franciscanus 3 3 1276 165. SUSMT Mitochondrion Strongylocentrotus intermedius 2 2 960 166. SUSMT Mitochondrion Strongylocentrotus pallidus 2 2 961 167. SUSMT Mitochondrion Strongylocentrotus purpuratus 5 4 16929 168. TARMT Mitochondrion Tarsius syrichta 1 1 895 169. TETMT Mitochondrion Tetrahymena thermophila 1 1 53 170. TEYMT Mitochondrion Tetrahymena pyriformis 14 13 12462 171. TOBCP Chloroplast Nicotiana acuminata 1 1 2052 172. TOBCP Chloroplast Nicotiana debneyi 3 3 4016 173. TOBCP Chloroplast Nicotiana otophora 1 1 2052 174. TOBCP Chloroplast Nicotiana plumbaginifolia 6 4 4169 175. TOBCP Chloroplast Nicotiana tabacum 47 40 200231 176. TOBMT Mitochondrion Nicotiana plumbaginifolia 2 1 1740 177. TOBMT Mitochondrion Nicotiana tabacum 4 3 4074 178. TOMCP Chloroplast Lycopersicon esculentum 1 1 103 179. TOMMT Mitochondrion Lycopersicon esculentum 2 1 558 180. TRBKP Kinetoplast Trypanosoma brucei 27 21 37228 181. TRBMT Mitochondrion Trypanosoma brucei 4 4 2285 182. TRCKP Kinetoplast Trypanosoma cruzi 27 27 11864 183. TREKP Kinetoplast Trypanosoma equiperdum 2 2 2017 184. TREKP Kinetoplast Trypanosoma evansi 3 2 1998 185. TRFKP Kinetoplast Crithidia fasciculata 19 18 12549 186. TRFKP Kinetoplast Crithidia oncopelti 6 3 1231 187. TRFMT Mitochondrion Crithidia fasciculata 2 1 2034 188. TRFMT Mitochondrion Crithidia oncopelti 1 1 149 189. TRLKP Kinetoplast Leptomonas sp. 
                                                             1       1     2568
190. TRWKP   Kinetoplast Trypanosoma lewisi                  2       2     2036
191. VFACP   Chloroplast Vicia faba                          6       6     9547
192. VFAMT   Mitochondrion Vicia faba                        4       4     9356
193. WARMT   Mitochondrion Pomatostomus isidori              1       1      239
194. WARMT   Mitochondrion Pomatostomus ruficeps             1       1      239
195. WARMT   Mitochondrion Pomatostomus superciliosus        1       1      239
196. WARMT   Mitochondrion Pomatostomus temporalis           1       1      239
197. WHTCP   Chloroplast Triticum aestivum                  28      26    26229
198. WHTMT   Mitochondrion Triticum aestivum                33      24    28296
199. XELMT   Mitochondrion Xenopus laevis                    6       5    25196
200. XERMT   Mitochondrion Xerocomus chrysenteron            2       1      339
201. YSCMT   Mitochondrion Saccharomyces cerevisiae        191     171   142613
202. YSGMT   Mitochondrion Saccharomyces carlsbergensis      1       1      149
203. YSKMT   Mitochondrion Kluyveromyces lactis             78      39     3699
204. YSKMT   Mitochondrion Kluyveromyces thermotolerans      5       3     1287
205. YSLMT   Mitochondrion Torulopsis glabrata              10       9     6200
206. YSPMT   Mitochondrion Schizosaccharomyces pombe        10       9    13361
207. YSSMT   Mitochondrion Cephalosporium acremonium         2       2     3029
208. YSTMT   Mitochondrion Yeast sp.                         4       4     5196
209. YSUMT   Mitochondrion Candida utilis                    1       1      306
210. YSVMT   Mitochondrion Saccharomyces uvarum              3       3     2296

     Total                                                1569    1271  1848854

BACTERIAL

     Key     Name                                      Reports Entries    Bases
-------------------------------------------------------------------------------
  1. ABC     Acetobacter aceti                               1       1     1624
  2. ABC     Acetobacter xylinum                             1       1     9540
  3. AC2     Plasmid pAC27                                   1       1      803
  4. ACC     Acinetobacter calcoaceticus                    13      10    27286
  5. ACC     Acinetobacter sp.                               4       3     5193
  6. ACH     Achromobacter sp.                               1       1     2414
  7. ACL     Acholeplasma laidlawii                          3       2     1858
  8. ACN     Actinobacillus actinomycetemcomitans            1       1     3842
  9. ACN     Actinobacillus pleuropneumoniae                 1       1     3831
 10. ACO     Acetogenium kivui                               1       1     2477
 11. ACY     Actinomyces naeslundii                          1       1     2160
 12. ACY     Actinomyces viscosus                            2       1     1850
 13. AER     Aeromicrobium erythreus                         1       1     1463
 14. AFA     Alcaligenes eutrophus                           9       9    23911
 15. AFA     Alcaligenes faecalis                            4       4     8591
 16. AFA     Plasmid pJP4                                    7       5    11105
 17. AHA     Aphanocapsa sp.                                 1       1     1920
 18.
AMC Acidaminococcus fermentans 2 1 3245 19. AMS Ampullariella sp. 1 1 1892 20. ANA Anabaena 7120 8 6 15235 21. ANA Anabaena sp. 27 19 26386 22. ANI Anacystis nidulans 32 26 36901 23. ANN Actinoplanes missouriensis 2 1 1639 24. APM Anaplasma marginale 5 4 11487 25. AQU Agmenellum quadruplicatum 3 3 5497 26. ARF Archaeoglobus fulgidus 3 2 1727 27. ARG Arthrobacter sp. 1 1 2075 28. ATU Agrobacterium rhizogenes 8 6 14645 29. ATU Agrobacterium sp. 1 1 1599 30. ATU Agrobacterium tumefaciens 38 31 62767 31. AVH Azotobacter chroococcum 1 1 1654 32. AVI Azotobacter vinelandii 24 20 80709 33. AZS Azospirillum brasilense 1 1 1910 34. BAD Bacillus caldolyticus 1 1 1147 35. BAL Bacillus caldotenax 5 5 5550 36. BAM Bacillus amyloliquefaciens 19 16 17715 37. BAN Bacillus anthracis 4 4 14401 38. BBR Bacillus brevis 11 10 25879 39. BC1 Plasmid pBC16 1 1 257 40. BCC Bacillus coagulans 2 2 2433 41. BCE Bacillus cereus 18 14 17016 42. BCI Bacillus circulans 9 6 12823 43. BCQ Bacillus Q 5 5 786 44. BFI Bacillus firmus 1 1 1434 45. BIF Bifidobacterium longum 1 1 1767 46. BLI Bacillus licheniformis 21 18 19815 47. BLL Bacillus lautus 1 1 2323 48. BMA Bacillus macerans 1 1 2744 49. BME Bacillus megaterium 24 19 31050 50. BNG Bacteroides gingivalis 1 1 1420 51. BNO Bacteroides nodosus 6 5 4828 52. BNR Bacteroides fragilis 6 5 8556 53. BOR Borrelia burgdorferei 7 6 3181 54. BPE Bordetella bronchiseptica 1 1 4936 55. BPE Bordetella parapertussis 3 3 4749 56. BPE Bordetella pertussis 21 15 42985 57. BPO Bacillus polymyxa 4 3 8764 58. BPU Bacillus pumilus 12 9 8915 59. BRL Brevibacterium epidermidis 2 1 1721 60. BRL Brevibacterium lactofermentum 9 8 22049 61. BRU Brucella abortus 2 2 5253 62. BS2 Plasmid pBS2 1 1 2279 63. BSN Bacillus natto 1 1 676 64. BSP Bacillus sp. 22 21 48210 65. BSS Bacillus sphaericus 16 12 30172 66. BST Bacillus stearothermophilus 41 39 62425 67. BSU Bacillus subtilis 274 216 357850 68. BTH Bacillus thuringiensis 67 52 149803 69. 
BTT Thermoactinomyces thalpophilus 2 2 1036 70. BUT Butyrivibrio fibrisolvens 2 2 4411 71. C11 Plasmid pColBM-C1139 1 1 2149 72. C1B Plasmid Colicin B4 3 3 1561 73. CAJ Campylobacter coli 3 3 4985 74. CAJ Campylobacter fetus 1 1 3974 75. CAJ Campylobacter jejuni 4 4 5410 76. CB2 Plasmid Colicin B2 1 1 360 77. CCR Caulobacter crescentus 24 24 15105 78. CD1 Plasmid Colicin D 1 1 1099 79. CDC Caldocellum saccharolyticum 6 5 16338 80. CE1 Plasmid Colicin E1 42 31 17722 81. CE2 Plasmid Colicin E2 5 5 4629 82. CE3 Plasmid Colicin E3 1 1 392 83. CE5 Plasmid Colicin E5-099 2 2 2401 84. CE8 Plasmid Colicin E8 1 1 1268 85. CE9 Plasmid Colicin E9 2 2 2148 86. CEC Plasmid Colicin E3-CA38 8 3 4883 87. CEC Plasmid Colicin E6-CT14 2 2 4675 88. CFI Cellulomonas fimi 7 7 5787 89. CFI Cellulomonas uda 1 1 1828 90. CFR Citrobacter freundii 6 5 4522 91. CFX Chloroflexus aurantiacus 4 3 3568 92. CGF Chlorogloeopsis fritschii 1 1 210 93. CH1 Plasmid pCHL1 2 2 8101 94. CHT Chlamydia psittaci 5 5 6932 95. CHT Chlamydia trachomatis 36 22 61362 96. CIA Plasmid Colicin Ia 1 1 3727 97. CIB Plasmid Colicin Ib 5 5 10934 98. CIB Plasmid Colicin Ib-P9 2 2 2373 99. CLA Plasmid Colicin A 4 4 3950 100. CLD Plasmid CloDF13 13 1 9957 101. CLK Plasmid Colicin K 2 2 815 102. CLN Plasmid Colicin V 1 1 412 103. CLN Plasmid Colicin V-K30 1 1 1465 104. CLN2 Plasmid Colicin V2-K94 1 1 550 105. CLO Clostridium acetobutylicum 9 7 14479 106. CLO Clostridium acidiurici 1 1 2266 107. CLO Clostridium botulinum 1 1 4835 108. CLO Clostridium cellulolyticum 1 1 2405 109. CLO Clostridium difficile 4 3 10451 110. CLO Clostridium innocuum 1 1 1544 111. CLO Clostridium pasteurianum 25 17 18163 112. CLO Clostridium perfringens 10 9 7118 113. CLO Clostridium sordellii 1 1 1504 114. CLO Clostridium tetani 3 3 10529 115. CLO Clostridium thermoaceticum 1 1 1965 116. CLO Clostridium thermocellum 10 8 16656 117. CLO Clostridium thermohydrosulfuricum 1 1 4839 118. CLO Clostridium thermosulfurogenes 2 2 4258 119. 
CLT Calothrix sp. 12 12 1936 120. CLV Plasmid ColVBtrp 1 1 441 121. CN2 Plasmid pCN2 1 1 117 122. CN3 Plasmid pCN3 1 1 114 123. COR Corynebacterium diphtheriae 2 1 2529 124. COR Corynebacterium glutamicum 9 6 16234 125. COR Corynebacterium nephridii 1 1 615 126. COR Corynebacterium sp. 2 2 2512 127. COX Coxiella burnetii 3 3 5956 128. CPC Cryptococcus albidus 2 1 2984 129. CPC Cryptococcus neoformans 1 1 2029 130. CYA Cyanobacterium nostoc 2 2 4220 131. CYT Cytophaga lytica 1 1 1509 132. DCG Dictyoglomus thermophilum 3 3 6900 133. DEI Deinococcus radiodurans 2 2 4970 134. DMO Desulfurococcus mobilis 7 7 9730 135. DSB Desulfobacterium autotrophicum 1 1 1376 136. DSB Desulfobacterium niacini 1 1 1375 137. DSB Desulfobacterium vacuolatum 1 1 1383 138. DSF Desulfococcus multivorans 1 1 1372 139. DSI Desulfomicrobium baculatus 1 1 1379 140. DSL Desulfomonas pigra 1 1 1381 141. DSO Desulfotomaculum orientis 1 1 1402 142. DSO Desulfotomaculum ruminis 1 1 1368 143. DSP Desulfobacter curvatus 1 1 1396 144. DSP Desulfobacter hydrogenophilus 1 1 1390 145. DSP Desulfobacter latus 1 1 1373 146. DSP Desulfobacter sp. 2 2 2869 147. DSU Desulfobulbus propionicus 1 1 1371 148. DSU Desulfobulbus sp. 1 1 1365 149. DSV Desulfosarcina variabilis 1 1 1527 150. DVU Desulfovibrio africanus 1 1 1382 151. DVU Desulfovibrio baarsii 2 2 1589 152. DVU Desulfovibrio baculatus 2 1 2589 153. DVU Desulfovibrio desulfuricans 5 5 4678 154. DVU Desulfovibrio fructosovorans 1 1 3180 155. DVU Desulfovibrio gigas 2 2 4126 156. DVU Desulfovibrio multispirans 1 1 186 157. DVU Desulfovibrio salexigens 2 2 2107 158. DVU Desulfovibrio sapovorans 1 1 1395 159. DVU Desulfovibrio vulgaris 9 9 11185 160. EAM Erwinia amylovora 2 2 1641 161. ECA Erwinia carotovora 14 12 17877 162. ECB Erwinia herbicola 1 1 4902 163. ECH Erwinia chrysanthemi 19 13 27706 164. ECO Escherichia coli 1716 1188 1848929 165. ECO Escherichia fergusonii 1 1 3133 166. ECO F sex factor plasmid 3 3 5370 167. 
ECO Plasmid Colicin BM-Cl139 3 3 3707 168. ECO Plasmid pCU1 1 1 2056 169. ECO Plasmid pF166 1 1 2133 170. EHP Ectothiorhodospira halophila 1 1 121 171. EHR Ehrlichia risticii 2 1 1498 172. EHV Ectothiorhodospira vacuolata 1 1 120 173. ENC Enterococcus faecium 1 1 1900 174. ENR Plasmid ENTR 2 2 1273 175. ENS Plasmid ENT 1 1 866 176. ENT Enterobacter aerogenes 6 6 5330 177. ENT Enterobacter agglomerans 3 3 1272 178. ENT Enterobacter cloacae 8 8 9262 179. ETA Edwardsiella tarda 2 1 306 180. EUB Eubacterium sp. 6 5 10586 181. FA3 Plasmid pFA3 1 1 1597 182. FDI Fremyella diplosiphon 26 20 29933 183. FIB Fibrobacter succinogenes 2 2 3736 184. FPL Plasmid F 30 23 37662 185. FRA Frankia sp. 3 2 3758 186. FRN Francisella tularensis 1 1 1233 187. FVB Flavobacterium heparinum 1 1 1528 188. FVB Flavobacterium okeanokoites 8 8 9873 189. FVB Flavobacterium sp. 4 4 6028 190. GS5 Plasmid pGS05 1 1 1357 191. HAF Hafnia alvei 1 1 2961 192. HAL Halobacterium cutirubrum 7 5 8712 193. HAL Halobacterium halobium 36 26 45031 194. HAL Halobacterium salinarium 1 1 606 195. HAL Halobacterium sp. 14 13 22487 196. HAL Haloferax volcanii 1 1 3566 197. HCL Heliobacterium chlorum 1 1 1512 198. HCU Halobacterium cutirubrum 2 1 3116 199. HEC Helicobacter felis 3 2 2887 200. HEC Helicobacter mustelae 2 1 1435 201. HEH Haemophilus haemolyticus 2 2 3186 202. HEI Haemophilus influenzae 96 41 17798 203. HEP Haemophilus parainfluenza 5 5 853 204. HLF Haloferax sp. 4 3 1187 205. HMO Halococcus morrhuae 2 2 4402 206. HPT Herpetosiphon aurantiacus 1 1 1484 207. HV2 Plasmid pHV2 1 1 6354 208. IM13 Plasmid pIM13 1 1 2246 209. INC Plasmid incB 1 1 352 210. INC Plasmid incI-1 1 1 418 211. INC Plasmid incI-gamma 1 1 417 212. INS Insertion sequence 10 10 4266 213. INS Insertion sequence IS1 5 4 3243 214. INS Insertion sequence IS150 2 1 1443 215. INS Insertion sequence IS186 2 2 2677 216. INS Insertion sequence IS2 4 4 517 217. INS Insertion sequence IS26 1 1 859 218. INS Insertion sequence IS30 1 1 1221 219. 
INS Insertion sequence IS4 1 1 1426 220. INS Insertion sequence IS476 1 1 1225 221. INS Insertion sequence IS493 1 1 1641 222. INS Insertion sequence IS5 3 2 1570 223. INS Insertion sequence IS891 1 1 1351 224. INS Insertion sequence ISHS1 1 1 1449 225. JD1 Plasmid pJD1 2 1 4207 226. JS3 Plasmid pJS37 3 3 252 227. KAE Klebsiella aerogenes 18 16 23367 228. KCI Kluyvera citrophila 1 1 2734 229. KPN Klebsiella pneumoniae 71 55 109716 230. KPN Plasmid pJHC-MW1 1 1 1352 231. KPO Klebsiella oxytoca 2 2 4901 232. KY1 Plasmid pKY1 1 1 3022 233. KYM Plasmid pKYM 1 1 2083 234. LAC Lactococcus lactis 13 12 24807 235. LAE Listonella ordalii 2 1 120 236. LAE Listonella tubiashii 2 1 120 237. LB1 Plasmid p1 1 1 533 238. LB3 Lactobacillus 30a 3 2 2189 239. LBA Lactobacillus acidophilus 1 1 400 240. LBB Lactobacillus bulgaricus 1 1 536 241. LBD Lactobacillus delbrueckii 7 4 5405 242. LBH Lactobacillus helveticus 1 1 3292 243. LBP Lactobacillus plantarum 3 2 3664 244. LBP Plasmid pC30il 1 1 2140 245. LBP Plasmid pLP1 1 1 2093 246. LCA Lactobacillus casei 6 6 9787 247. LCO Lactobacillus confusus 1 1 1320 248. LEP Leptospira biflexa 2 2 4788 249. LEP Leptospira interrogans 2 1 3244 250. LIS Listeria monocytogenes 3 2 3940 251. LM0 Plasmid pLM020 1 1 2330 252. LPN Legionella pneumophila 2 2 2005 253. LS1 Plasmid pLS11 1 1 253 254. MBA Methanobacterium ivanovii 2 1 1353 255. MBF Methanobacterium formicicum 1 1 3597 256. MBH Methanobrevibacter smithii 3 3 7221 257. MBI Methanobacterium thermoautotrophicum 10 7 26621 258. MBI Plasmid pME2001 1 1 1440 259. MBO Moraxella bovis 2 2 5044 260. MBO Moraxella lacunata 1 1 969 261. MBO Moraxella sp. 2 1 3034 262. MCL Mastigocladus laminosus 1 1 1701 263. MEC Micromonospora echinospora 1 1 398 264. MEF Methanothermus fervidus 9 8 18721 265. MEH Methanospirillum hungatei 1 1 295 266. MEN Methanolobus tindarius 1 1 128 267. MES Methanosarcina barkeri 5 3 13117 268. MLC Methylococcus capsulatus 1 1 2463 269. MLU Micrococcus luteus 10 9 19465 270. 
MLY Micrococcus lysodeikticus 1 1 166 271. MPL Mycoplasma-like organism 1 1 1535 272. MSG Mycobacterium bovis 7 6 7320 273. MSG Mycobacterium leprae 7 5 11027 274. MSG Mycobacterium tuberculosis 15 9 18387 275. MSG Plasmid pAL5000 1 1 4837 276. MTB Methylobacterium extorquens 1 1 4500 277. MTB Methylobacterium sp. 1 1 2791 278. MTB Methylobacterium specialis 2 1 2211 279. MTF Methylobacillus flagellatum 1 1 1349 280. MV1 Plasmid pMV158 1 1 2436 281. MVA Methanococcus vannielii 19 17 28753 282. MVO Methanococcus voltae 11 10 15241 283. MVT Methanococcus thermolithotrophicus 4 3 2820 284. MXA Myxococcus xanthus 16 15 27127 285. MXB Lysobacter enzymogenes 3 2 3218 286. MYC Mycoplasma capricolum 14 13 21175 287. MYC Mycoplasma hyopneumoniae 3 3 1928 288. MYC Mycoplasma mycoides 4 4 2716 289. MYC Mycoplasma sp. 41 37 51236 290. MYC Plasmid pADB201 1 1 1717 291. NAH Plasmid NAH7 (from P. putida) 6 5 3771 292. NAT Natronobacterium pharaonis 1 1 1015 293. NG2 Plasmid pNG2 1 1 1810 294. NGO Neisseria flavescens 1 1 1228 295. NGO Neisseria gonorrhoeae 63 55 50544 296. NGO Neisseria meningitidis 12 8 9940 297. NOC Nocardia mediterranei 3 3 450 298. NOS Nostoc commune 1 1 4241 299. NR1 Plasmid NR1 4 3 6463 300. NT1 Plasmid NTP1 2 2 1440 301. NT1 Plasmid NTP16 1 1 2730 302. P15 Plasmid P15A 2 2 1226 303. P18X Plasmid pACYC184 2 2 171 304. P23 Plasmid pMM2-3 2 2 182 305. P307 Plasmid P307 3 3 4629 306. P53 Plasmid pMM5-3 4 4 429 307. P55 Plasmid pMM5-5 4 4 420 308. PAC Plasmid P177 1 1 345 309. PAM Plasmid PAM177 1 1 1443 310. PAS Pasteurella haemolytica 4 3 15958 311. PAZ Plasmid pAZ1 1 1 808 312. PB0 Plasmid pUB110 9 8 12606 313. PB2 Plasmid pUB112 1 1 901 314. PBF4 Plasmid pBF4 1 1 1041 315. PBW Plasmid pBWH77 2 2 1623 316. PC1 Plasmid pC194 2 2 3946 317. PC2 Plasmid pC221 2 1 4555 318. PDE Paracoccus denitrificans 10 7 17422 319. PDG Plasmid pDGO100 2 2 3683 320. PDU Plasmid pDU1358 2 2 5076 321. PE1 Plasmid pE194 7 3 5039 322. PE2 Plasmid pED208 2 2 5640 323. 
PHL Plasmid pHly152 1 1 8215 324. PI25 Plasmid pI258 5 4 12140 325. PIJ Plasmid pIJ101 2 2 9188 326. PIP Plasmid pIP401 2 2 383 327. PIP Plasmid pIP630 1 1 1883 328. PIP11 Plasmid pIP1100 1 1 1386 329. PIP404 Plasmid pIP404 4 3 15188 330. PJH Plasmid pJH1 1 1 1489 331. PJM1 Plasmid pJM1 1 1 3581 332. PJR Plasmid PJR225 1 1 1527 333. PKL Plasmid pKLH1 1 1 160 334. PKL Plasmid pKLH102 2 2 351 335. PKL Plasmid pKLH104 1 1 131 336. PKL Plasmid pKLH2 2 2 674 337. PKL Plasmid pKLH201 1 1 153 338. PKM Plasmid pKM101 1 1 1797 339. PLB Plasmid pLB1 1 1 2190 340. PLM Plasmid pAA3.7X 3 1 9583 341. PLP Plasmid pSa 1 1 1447 342. PME Plasmid pMEA100 1 1 150 343. PMM Plasmid pMM110 1 1 240 344. PMO Plasmid pMON234 1 1 997 345. PNE Plasmid pNE131 2 1 2355 346. PNS Plasmid pNS1 1 1 3879 347. PNS Plasmid pNS1981 4 3 1819 348. PO2 Plasmid pOAD2 2 2 2914 349. PR1 Plasmid R1 13 10 7500 350. PR2 Plasmid R1126 1 1 428 351. PR6 Plasmid R6-5 2 1 858 352. PRC Plasmid R 1 1 1487 353. PRI Plasmid PRI13 2 1 2234 354. PRM Morganella morganii 2 2 1831 355. PRM Proteus mirabilis 7 6 16319 356. PRM Proteus vulgaris 13 8 14186 357. PRO Providencia sp. 1 1 1135 358. PRO Providencia stuartii 1 1 3889 359. PRS Propionibacterium shermanii 3 2 4951 360. PS1 Streptomyces lividans plasmid pS1 1 1 75 361. PSA Plasmid pSA2100 1 1 98 362. PSC Plasmid pSC101 16 10 16807 363. PSE Plasmid pCMS1 1 1 1322 364. PSE Pseudomonas aeruginosa 99 77 110225 365. PSE Pseudomonas amyloderamosa 5 2 4488 366. PSE Pseudomonas cepacia 2 2 5867 367. PSE Pseudomonas fluorescens 10 8 17694 368. PSE Pseudomonas fragi 2 2 1682 369. PSE Pseudomonas paucimobilis 1 1 1080 370. PSE Pseudomonas pseudoalcaligenes 1 1 2040 371. PSE Pseudomonas putida 28 26 68302 372. PSE Pseudomonas sp. 25 22 39293 373. PSE Pseudomonas syringae 11 10 27701 374. PSE Pseudomonas testosteroni 2 2 2435 375. PSE TOL Plasmid (from Pseudomonas putida) 11 7 9788 376. PSE Zoogloea ramigera 1 1 1524 377. PSM SYM megaplasmid(from R. meliloti) 9 8 5150 378. 
PSN Plasmid pSN2 1 1 1288 379. PT1 Plasmid pT181 7 4 5479 380. PTB Plasmid pTB913 1 1 1200 381. PWM Plasmid pWM5 1 1 569 382. PWP Plasmid pWP7b 1 1 1370 383. PWR Plasmid PWR60 1 1 4832 384. PYR Pyrodictium occultum 4 4 2077 385. PYW Pyrococcus woesi 2 1 124 386. R10 Plasmid R100 25 17 26157 387. R11 Plasmid R1162 5 5 2389 388. R12 Plasmid R124 1 1 272 389. R14 Plasmid R144 1 1 801 390. R26 Plasmid R26 1 1 1541 391. R27 Plasmid R27 1 1 1507 392. R36 Plasmid R386 1 1 441 393. R37 Plasmid R387 1 1 1160 394. R38 Plasmid R388 2 2 3204 395. R41 Plasmid R401 1 1 1857 396. R45 Plasmid R485 1 1 591 397. R46 Plasmid R46 3 3 2859 398. R48 Plasmid R483 1 1 1618 399. R53 Plasmid R538 3 2 1712 400. R65 Plasmid R65 2 2 1380 401. R67 Plasmid R67 1 1 293 402. R6K Plasmid R6K 7 6 1894 403. R75 Plasmid R751 4 4 1697 404. R77 Plasmid R773 1 1 4347 405. RA1 Plasmid RA1 1 1 758 406. RBH Plasmid pRBH1 2 2 1521 407. RBL Rhodopseudomonas acidophila 1 1 1491 408. RBL Rhodopseudomonas blastica 1 1 12368 409. RCA Rhodobacter capsulatus 32 26 52831 410. RDC Rhodocyclus purpureus 1 1 1478 411. REI Plasmid pRE-I 1 1 439 412. RER Rhodococcus erythropolis 2 1 2070 413. RER Rhodococcus fascians 2 1 121 414. RGN Plasmid RGN238 2 1 2427 415. RHA Azorhizobium caulinodans 4 2 3849 416. RHB Bradyrhizobium japonicum 23 18 34554 417. RHF Rhizobium fredii 1 1 2862 418. RHH Rhizobium phaseoli 4 4 3681 419. RHI Bradyrhizobium sp. 2 2 6665 420. RHI Rhizobium sp. 8 7 10894 421. RHJ Rhizobium japonicum 10 8 10225 422. RHL Plasmid pRL1JI 5 1 12055 423. RHL Rhizobium leguminosarum 13 13 19168 424. RHM Rhizobium meliloti 52 45 93945 425. RHR Rhizobium IRc78 2 2 2199 426. RHT Rhizobium trifolii 5 5 6886 427. RIA Plasmid Ri 1 1 21126 428. RIR Rickettsia conorii 1 1 539 429. RIR Rickettsia prowazekii 5 4 9706 430. RIR Rickettsia rickettsii 4 4 8555 431. RIR Rickettsia tsutsugamushi 2 2 5186 432. RIR Rickettsia typhi 2 2 1067 433. RIR Rochalimaea quintana 1 1 1493 434. RK2 Plasmid RK2 10 7 9952 435. 
RMV Rhodomicrobium vannielii 1 1 1484 436. ROS Roseburia cecicola 1 1 1031 437. RP1 Plasmid RP1 1 1 2709 438. RP4 Plasmid RP4 5 5 2997 439. RSF Plasmid RSF1010 7 7 10521 440. RSP Rhodospirillum rubrum 11 11 21804 441. RSS Rhodobacter sphaeroides 17 17 14217 442. RTS Plasmid Rts1 2 1 1855 443. RUA Ruminobacter amylophilus 2 1 2867 444. RUM Ruminococcus albus 1 1 2180 445. RVI Rhodopseudomonas viridis 5 4 3885 446. SA2 Plasmid pSAM2 3 3 866 447. SAC Sulfolobus acidocaldarius 6 5 15339 448. SAU Stigmatella aurantiaca 2 1 1300 449. SB2 Plasmid pSB24.2 2 2 7412 450. SCP Plasmid SCP1 1 1 2513 451. SE2 Plasmid pSE211 2 2 3017 452. SER Saccharopolyspora erythraea 10 7 8075 453. SHD Shigella dysenteriae 7 7 11010 454. SHF Plasmid pMYSH6000 1 1 4472 455. SHF Shigella flexneri 12 10 24401 456. SHS Shigella sonnei 11 11 6738 457. SLP1 Plasmid SLP1 3 3 630 458. SMA Serratia marcescens 31 25 35311 459. SMA Serratia sp. 1 1 2570 460. SME Spiroplasma melliferum 1 1 1510 461. SME Spiroplasma sp. 1 1 5025 462. SMY Plasmid pSL2 1 1 345 463. SMY1 Plasmid pSL1 2 2 633 464. SPA Spirochaeta aurantia 1 1 1257 465. SPO Sporolactobacillus laevis 1 1 118 466. SPO Sporosarcina ureae 2 1 116 467. SPU Spirulina platensis 1 1 5273 468. SSO Sulfolobus shibatae 1 1 1495 469. SSO Sulfolobus solfataricus 6 6 5873 470. SSP Sulfolobus sp. 7 4 19617 471. STA Plasmid pT48 1 1 2475 472. STA Staphylococcus aureus 75 56 85940 473. STA Staphylococcus carnosus 1 1 720 474. STA Staphylococcus epidermidis 1 1 423 475. STA Staphylococcus haemolyticus 1 1 1087 476. STA Staphylococcus hyicus 1 1 2212 477. STA Staphylococcus mutans 1 1 2288 478. STA Staphylococcus simulans 1 1 1486 479. STA Staphylococcus staphylolyticus 1 1 1825 480. STM Streptomyces antibioticus 1 1 1567 481. STM Streptomyces avidinii 1 1 638 482. STM Streptomyces azureus 1 1 1521 483. STM Streptomyces clavuligerus 6 4 5745 484. STM Streptomyces coelicolor 15 12 17749 485. STM Streptomyces fradiae 6 6 11455 486. 
STM Streptomyces glaucescens 6 4 6431 487. STM Streptomyces griseus 16 12 25048 488. STM Streptomyces hygroscopicus 7 7 6374 489. STM Streptomyces lavendulae 2 2 2078 490. STM Streptomyces limosus 1 1 2291 491. STM Streptomyces lividans 25 20 14900 492. STM Streptomyces plicatus 2 2 1245 493. STM Streptomyces rochei 4 4 2390 494. STM Streptomyces sp. 47 39 49732 495. STM Streptomyces thermotolerans 1 1 1260 496. STM Streptomyces vinaceus 1 1 1119 497. STR Plasmid pAM-beta-1 3 3 6001 498. STR Plasmid pMK157 1 1 1920 499. STR Streptococcus equisimilis 1 1 2568 500. STR Streptococcus faecalis 6 6 8343 501. STR Streptococcus lactis 8 7 8222 502. STR Streptococcus mutans 15 13 46945 503. STR Streptococcus pneumoniae 49 34 55746 504. STR Streptococcus pyogenes 24 18 36619 505. STR Streptococcus sanguis 3 3 8094 506. STR Streptococcus sobrinus 1 1 4995 507. STR Streptococcus sp. 17 14 31406 508. STV Streptoverticillum sp. 1 1 1130 509. STY Plasmid R1767 1 1 1519 510. STY Plasmid R64 1 1 482 511. STY Salmonella infantis 1 1 3430 512. STY Salmonella potsdam 1 1 1727 513. STY Salmonella rubislaw 1 1 1479 514. STY Salmonella sp. 7 6 6050 515. STY Salmonella typhimurium 174 142 204225 516. SYC Synechocystis sp. 15 11 18768 517. SYN Synechococcus sp. 15 15 38283 518. TBA Thermophilic bacterium 6 3 11617 519. TDT Trichodesmium thiebautii 1 1 357 520. TFE Plasmid pTF-FC2 1 1 329 521. TFE Thiobacillus acidophilus 9 4 599 522. TFE Thiobacillus ferrooxidans 9 7 14624 523. TFE Thiobacillus sp. 1 1 2172 524. THA Thermoplasma acidophilum 3 3 7703 525. THC Thermococcus celer 2 2 1312 526. THF Thermomonospora fusca 1 1 264 527. THP Thermofilum pendens 1 1 240 528. THR Thermomicrobium roseum 1 1 1528 529. TIP Plasmid pTiAch5 1 1 1164 530. TIP Plasmid pTiAg162 1 1 420 531. TIP Plasmid pTiB6S3 1 1 4203 532. TIP Plasmid pTiC58 4 4 5214 533. TIP Plasmid Ti (from A. tumefaciens) 61 52 120107 534. TMO Thermotoga maritima 2 2 2763 535. TRN Transposon gamma-delta 6 3 1092 536. 
TRN Transposon Tn10 1 1 830 537. TRN Transposon Tn21 1 1 1333 538. TRN Transposon Tn2501 1 1 480 539. TRN Transposon Tn3 2 2 389 540. TRN Transposon Tn3411 2 1 2925 541. TRN Transposon Tn4521 1 1 1315 542. TRN Transposon Tn5 1 1 2040 543. TRN Transposon Tn501 1 1 86 544. TRN Transposon Tn602 4 4 639 545. TRN10 Transposon Tn10 11 4 6024 546. TRN15 Transposon Tn1525 1 1 1721 547. TRN16 Transposon Tn1681 1 1 658 548. TRN17 Transposon Tn1721 8 8 1797 549. TRN1771 Transposon Tn1771 3 3 348 550. TRN21 Transposon Tn21 4 4 6671 551. TRN25 Transposon Tn2501 1 1 1539 552. TRN26 Transposon Tn2680 1 1 194 553. TRN3 Transposon Tn3 10 8 6351 554. TRN34 Transposon Tn3411 1 1 1321 555. TRN43 Transposon Tn4351 2 1 1982 556. TRN431 Transposon Tn431 3 3 2405 557. TRN4551 Transposon Tn4551 1 1 2080 558. TRN4556 Transposon Tn4556 2 2 86 559. TRN5 Transposon Tn5 9 6 4978 560. TRN501 Transposon Tn501 4 4 7310 561. TRN554 Transposon Tn554 5 1 6691 562. TRN7 Transposon Tn7 7 7 7535 563. TRN9 Transposon Tn9 2 2 1362 564. TRN903 Transposon Tn903 6 6 6118 565. TRN916 Transposon Tn916 1 1 1740 566. TRN917 Transposon Tn917 3 3 6353 567. TRNCAM Transposon Tn-Cam204 1 1 921 568. TRP Treponema pallidum 7 6 8045 569. TTE Thermoproteus tenax 8 8 4399 570. TTH Thermus aquaticus 6 5 7004 571. TTH Thermus caldophilus 1 1 1229 572. TTH Thermus flavus 1 1 1771 573. TTH Thermus thermophilus 23 15 28221 574. TTV Thermoproteus tenax virus 1 2 1 13669 575. URE Ureaplasma urealyticum 2 2 3982 576. VCH Vibrio cholerae 17 15 18430 577. VI1 Plasmid pVI150 1 1 972 578. VIB Aeromonas hydrophila 7 6 7237 579. VIB Aeromonas sobria 2 1 2510 580. VIB Photobacterium leiognathi 5 4 10047 581. VIB Photobacterium sp. 5 5 8715 582. VIB Vibrio alginolyticus 5 4 12901 583. VIB Vibrio anguillarum 1 1 4379 584. VIB Vibrio fischeri 4 4 5791 585. VIB Vibrio harveyi 12 11 14283 586. VIB Vibrio parahaemolyticus 2 2 2861 587. VIB Vibrio sp. 3 2 1390 588. VIT Vitreoscilla sp. 2 1 689 589. VIT Vitreoscilla stercoraria 1 1 745 590. 
     VVU     Vibrio vulnificus                               1       1     2237
591. W10     Plasmid pWR100                                  2       1     4761
592. WOL     Wolinella succinogenes                          1       1       91
593. WP1     Plasmid pWP113a                                 1       1     1316
594. WP1     Plasmid pWP116a                                 1       1     1336
595. WP1     Plasmid pWP14a                                  1       1     1336
596. XAA     Xanthobacter autotrophicus                      1       1     3041
597. XAN     Xanthomonas campestris                          3       3     4683
598. XEN     Xenorhabdus luminescens                         1       1     2553
599. YEP     Plasmid pYV03                                   2       1     3316
600. YEP     Yersinia bercovieri                             2       2      257
601. YEP     Yersinia enterocolitica                        26      24    18913
602. YEP     Yersinia pestis                                 4       4     4462
603. YEP     Yersinia pseudotuberculosis                    10       8    15439
604. ZMO     Zymomonas mobilis                               9       8    13565

     Total                                                5528    4293  6992664

STRUCTURAL RNA

     Key     Name                                      Reports Entries    Bases
-------------------------------------------------------------------------------
  1. AAU     Auricularia auricula-judae                      1       1      118
  2. ABC     Acetobacter sp.                                 2       2      236
  3. ACA     Acanthamoeba castellanii                        3       3      400
  4. ACC     Acinetobacter calcoaceticus                     2       2     1652
  5. ACH     Achromobacter cycloclastes                      1       1      120
  6. ACH     Achromobacter xylosoxidans                      1       1      114
  7. ACL     Acholeplasma entomophilum                       1       1     1476
  8. ACL     Acholeplasma modicum                            1       1     1473
  9. ACN     Actinobacillus actinomycetemcomitans            3       3      494
 10. ACN     Actinobacillus equuli                           2       2      445
 11. ACN     Actinobacillus hominis                          3       3      494
 12. ACN     Actinobacillus lignieresii                      3       3     1931
 13. ACS     Avian sarcoma virus                             1       1       75
 14. ACY     Actinomyces bovis                               1       1     1368
 15. ACY     Actinomyces israelii                            2       2     1879
 16. ACY     Actinomyces naeslundii                          1       1     1378
 17. ACY     Actinomyces odontolyticus                       1       1     1359
 18. ACY     Actinomyces pyogenes                            2       1     1361
 19. ACY     Actinomyces viscosus                            1       1     1351
 20. AED     Agaricus edulis                                 1       1      118
 21. AEQ     Actinia equina                                  2       1      120
 22. AFA     Alcaligenes eutrophus                           1       1     1511
 23. AFA     Alcaligenes faecalis                            6       6     3410
 24. AKK     Akkesiphycus lubricum                           2       1      118
 25. ALF     Medicago sativa                                 1       1      119
 26. ALL     Asteroleplasma anaerobium                       1       1     1471
 27. ALR     Rous sarcoma virus                              1       1       75
 28. AMG     Acyrthosiphon magnoliae                         2       2      281
 29. AMO     Amoebidium parasiticum                          1       1      119
 30. AMP     Amoeba proteus                                  1       1      419
 31. ANC     Ancylobacter aquaticus                          1       1      117
 32. ANI     Anacystis nidulans                              4       4      371
 33. ANM     Anisodoris nobilis                              6       3      994
 34.
ANP Anaeroplasma abactoclasticum 1 1 1453
35. ANP Anaeroplasma bactoclasticum 1 1 1436
36. ANP Anaeroplasma varium 1 1 1436
37. APE Acremonium persicinum 1 1 119
38. APL Aplysia kurodai 1 1 119
39. APN Anthoceros punctatus 1 1 118
40. APR Antheraea pernyi 1 1 120
41. APU Aeromonas punctata 1 1 109
42. AQU Agmenellum quadruplicatum 1 1 76
43. ARB Arbacia punctulata 9 3 1049
44. ARG Arthrobacter globiformis 4 3 1774
45. ARG Arthrobacter luteus 1 1 122
46. ARG Arthrobacter oxidans 2 1 121
47. ARG Arthrobacter sp. 2 1 121
48. ARN Argulus nobilis 1 1 1843
49. ARO Arhodomonas oleiferhydrans 1 1 1487
50. ARU Arundinaria gigantea 1 1 50
51. ASC Acinetospora crinita 2 1 118
52. ASE Aquaspirillum serpens 1 1 116
53. ASF Aspergillus flavus 1 1 119
54. ASG Aspergillus niger 1 1 119
55. ASN Aspergillus nidulans 4 4 476
56. AST Avena sativa 1 1 50
57. ATT Atractiella solani 1 1 119
58. ATU Agrobacterium tumefaciens 1 1 120
59. AUT Aureobacterium testaceum 1 1 120
60. AVI Azotobacter vinelandii 1 1 120
61. AXY Amphibacillus xylanus 2 1 116
62. BAC Bacillus acidocaldarius 1 1 117
63. BAE Batrachospermum ectocarpum 1 1 121
64. BAS Basidiobolus magnus 1 1 120
65. BBR Bacillus brevis 2 2 1674
66. BDE Bdellovibrio stolpii 1 1 1553
67. BEG Beggiatoa alba 1 1 120
68. BFI Bacillus firmus 1 1 116
69. BGA Blue Green Algae 1 1 76
70. BGL Bacillus globigii 1 1 116
71. BHA Beneckea harveyi 1 1 122
72. BJA Blepharisma japonicum 2 2 476
73. BLI Bacillus licheniformis 1 1 116
74. BLK Blakeslea trispora 1 1 120
75. BLT Blastobacter viscosus 1 1 118
76. BLY Hordeum vulgare 9 7 487
77. BME Bacillus megaterium 1 1 116
78. BMO Bombyx mori 14 13 1362
79. BNA Brassica napus 2 2 196
80. BNC Bacteroides asaccharolyticus 1 1 48
81. BNG Bacteroides gingivalis 1 1 53
82. BNI Bacteroides intermedius 1 1 52
83. BNO Bacteroides nodosus 1 1 1532
84. BOV Bos taurus 21 18 1426
85. BPA Bacillus pasteurii 1 1 117
86. BPL Brachionus plicatilis 1 1 121
87. BRA Branchiostoma belcheri 1 1 120
88.
BRA Branchiostoma californiense 6 3 974
89. BRL Brevibacterium helvolum 2 1 120
90. BRL Brevibacterium linens 1 1 123
91. BRP Brugia pahangi 1 1 363
92. BRU Brucella abortus 2 1 1429
93. BSI Blastocladiella simplex 1 1 118
94. BST Bacillus stearothermophilus 10 10 845
95. BSU Bacillus subtilis 16 14 1153
96. BVO Bresslaua vorax 1 1 120
97. BVU Beta vulgaris 1 1 120
98. CAI Capniomyces stellatus 1 1 121
99. CAO Carpopeltis crispata 1 1 121
100. CAU Caulobacter spinosum 1 1 117
101. CBC Caseobacter polymorphus 1 1 121
102. CCI Coprinus cinereus 1 1 118
103. CCO Crypthecodinium cohnii 4 4 492
104. CDB Cardiobacterium hominis 1 1 1470
105. CEL Caenorhabditis elegans 9 7 713
106. CET Ceratobasidium cornigerum 1 1 118
107. CFI Cellulomonas biazotea 1 1 120
108. CHA Chaetopterus sp. 6 3 975
109. CHB Chlorobium limicola 2 2 1615
110. CHB Chlorobium phaeobacteroides 1 1 110
111. CHF Chordaria flagelliformis 2 1 118
112. CHH Chaetomorpha moniligera 1 1 120
113. CHK Gallus gallus 25 23 2774
114. CHL Chlorella pyrenoidosa 2 1 119
115. CHL Chlorella sp. 5 3 2082
116. CHO Chilomonas paramecium 1 1 124
117. CHR Chromobacterium fluviatile 1 1 1473
118. CHR Chromobacterium violaceum 1 1 1475
119. CHS Christiansenia pallida 1 1 120
120. CLL Callinectes sapidus 1 1 1861
121. CLM Spisula solidissima 6 3 937
122. CLO Clostridium aminovalericum 2 1 1554
123. CLO Clostridium barkeri 2 1 1527
124. CLO Clostridium bifermentans 1 1 117
125. CLO Clostridium butyricum 2 2 234
126. CLO Clostridium carnis 2 1 117
127. CLO Clostridium pasteurianum 4 3 1745
128. CLO Clostridium ramosum 1 1 1530
129. CLO Clostridium sticklandii 2 2 1501
130. CLO Clostridium tyrobutyricum 4 4 464
131. COE Coemansia mojavensis 1 1 120
132. COR Corynebacterium aquaticum 1 1 120
133. COR Corynebacterium glutamicum 1 1 121
134. COR Corynebacterium sp. 2 1 1366
135. COR Corynebacterium xerosis 2 2 243
136. COT Gossypium hirsutum 1 1 118
137. COX Coxiella burnetii 2 1 1484
138. CPA Cyanophora paradoxa 3 3 356
139.
CRA Coprinus radiatus 1 1 118
140. CRB Limulus polyphemus 6 3 977
141. CRE Chlamydomonas reinhardtii 3 3 399
142. CRE Chlamydomonas sp. 1 1 118
143. CRS Cryptochiton stelleri 6 3 923
144. CTU Coleosporium tussilaginis 1 1 118
145. CUN Cunninghamella elegans 1 1 120
146. CUR Curtobacterium citreum 1 1 122
147. CVN Chromatium vinosum 1 1 1526
148. CYR Cycas revoluta 1 1 120
149. CYT Cytophaga aquatilis 1 1 111
150. CYT Cytophaga heparina 1 1 114
151. CYT Cytophaga johnsonae 1 1 116
152. DAC Dryopteris acuminata 1 1 121
153. DDE Dacrymyces deliquescens 1 1 118
154. DDI Dictyostelium discoideum 7 6 1170
155. DIT Diatoma tenue 1 1 118
156. DJA Dugesia japonica 1 1 120
157. DJA Dugesia tigrina 6 3 962
158. DOG Canis lupus 1 1 149
159. DOG Canis sp. 2 2 191
160. DPS Dipsacomyces acuminosporus 1 1 119
161. DRO Drosophila melanogaster 42 36 5318
162. DSA Desulfuromonas acetoxidans 1 1 1522
163. DSM Desulfomonile tiedjei 1 1 1505
164. DSP Desulfobacter postgatei 1 1 1519
165. DSV Desulfosarcina variabilis 1 1 1527
166. DUK Cairina moschata 1 1 78
167. DVU Desulfovibrio vulgaris 1 1 120
168. EAL Enchytraeus albidus 1 1 120
169. EAR Equisetum arvense 2 1 120
170. EBI Eisenia bicyclis 1 1 118
171. ECO Escherichia coli 148 116 14668
172. EFI Efibulobasidium albescens 1 1 118
173. EGR Euglena gracilis 4 4 391
174. EHP Ectothiorhodospira halophila 1 1 1494
175. EIK Eikenella corrodens 4 4 5933
176. EJA Entosphenus japonicus 2 2 241
177. EMP Emplectonema gracile 2 2 239
178. ERL Erythrobacter longus 1 1 119
179. ERP Protomonas extorquens 1 1 116
180. ERY Erysipelothrix rhusiopathiae 1 1 1487
181. ESE Endophyllum sempervivi 1 1 118
182. ESP Euphausia sperba 1 1 75
183. EUT Eucidaris tribuloides 6 3 923
184. EVA Exobasidium vaccinii 1 1 118
185. EWO Euplotes woodruffi 1 1 120
186. EXI Exidia glandulosa 1 1 118
187. FAE Faenia rectivirgula 1 1 1246
188. FBC Flexibacter sp. 1 1 117
189. FSB Misgurnus fossilis 3 3 399
190. FSB Oncorhynchus keta 1 1 75
191.
FSB Salmo gairdneri 2 2 282
192. FSO Fusarium culmorum 2 2 387
193. FSO Fusarium decemcellulare 6 6 1161
194. FSO Fusarium graminearum 2 2 387
195. FSO Fusarium javanicum 4 4 768
196. FSO Fusarium moniliforme 6 6 1159
197. FSO Fusarium nivale 2 2 384
198. FSO Fusarium oxysporum 3 3 1010
199. FSO Fusarium solani 4 4 767
200. FVB Flavobacterium sp. 1 1 121
201. GBI Ginkgo biloba 1 1 120
202. GCL Gymnosporangium clavariaeforme 1 1 118
203. GCO Gracilaria compressa 3 2 242
204. GEA Gelidium amansii 2 2 241
205. GEM Gemmata obscuriglobus 1 1 108
206. GEN Genistelloides hibernus 1 1 122
207. GLA Giardia lamblia 1 1 127
208. GLC Gloiopeltis complanata 1 1 120
209. GOL Golfingia gouldii 6 3 973
210. GRA Graphiola phoenicis 1 1 118
211. HAL Halobacterium volcanii 56 46 5074
212. HAM Mesocricetus sp. 2 1 94
213. HAP Halichondria panicea 1 1 120
214. HAZ Haemophilus aphrophilus 3 3 494
215. HCU Halobacterium cutirubrum 15 13 1050
216. HDI Hymenolepis diminuta 2 2 215
217. HEA Haemophilus aegypticus 1 1 116
218. HEI Haemophilus influenzae 3 3 1917
219. HJA Halichondria japonica 1 1 120
220. HLF Haloferax mediterranei 2 1 123
221. HMO Halococcus morrhuae 2 2 309
222. HOC Haliclona oculata 1 1 120
223. HPT Herpetosiphon aurantiacus 1 1 117
224. HRO Halocynthia roretzi 2 1 119
225. HSA Hymeniacidon sanguinea 2 2 276
226. HUM Homo sapiens 51 45 6394
227. HYD Hydra sp. 6 3 966
228. HYF Hydrurus foetidus 1 1 118
229. HYV Hyphomicrobium sp. 1 1 119
230. HYV Hyphomicrobium vulgare 1 1 119
231. IGU Iguana iguana 1 1 120
232. ISO Isosphaera pallida 1 1 111
233. JLA Aurelia aurita 2 2 240
234. JLC Chrysaora quinquecirrha 1 1 120
235. JLN Nemopsis dofleini 1 1 120
236. JLS Spirocodon saltatrix 1 1 121
237. KAB Kabatiella microsticta 1 1 120
238. KIN Kingella denitrificans 1 1 1475
239. KIN Kingella indologenes 1 1 1474
240. KIN Kingella kingae 1 1 1476
241. LAE Listonella aestuarianus 1 1 119
242. LAN Lingula anatina 1 1 119
243. LAN Lingula reevi 6 3 919
244.
LAP Lamprometra palmata 9 3 1044
245. LBK Lactobacillus kandleri 2 1 1528
246. LBM Lactobacillus minor 2 1 1524
247. LBR Lactobacillus brevis 1 1 117
248. LBT Lactobacillus halotolerans 2 1 1529
249. LCA Lactobacillus casei 2 1 1574
250. LCA Lactobacillus catenaforme 1 1 1549
251. LCO Lactobacillus confusus 2 1 1525
252. LEI Leishmania enriettii 1 1 68
253. LEU Leuconostoc cremoris 2 1 1493
254. LEU Leuconostoc lactis 2 1 1499
255. LEU Leuconostoc mesenteroides 2 1 1554
256. LEU Leuconostoc oenos 2 1 1510
257. LEU Leuconostoc paramesenteroides 2 1 1524
258. LGE Lineus geniculatus 1 1 120
259. LHE Lophocolea heterophylla 1 1 119
260. LND Linderina macrospora 1 1 119
261. LPN Fluoribacter bozemanae 6 6 796
262. LPN Fluoribacter dumoffii 6 6 825
263. LPN Fluoribacter gormanii 3 3 385
264. LPN Legionella pneumophila 11 10 1252
265. LSY Leptosynapta inhaerens 6 3 1051
266. LTT Leptothrix discophora 1 1 117
267. LUM Lumbricus sp. 6 3 976
268. LUP Lupinus luteus 5 5 380
269. LVI Lactobacillus viridescens 4 3 1816
270. LVI Lactobacillus vitulinus 1 1 1477
271. LYC Lycopodium clavatum 1 1 121
272. LYO Lycoperdon pyriforme 1 1 118
273. MAG Methylomonas agile 1 1 119
274. MAG Methylomonas methanica 2 2 1400
275. MAG Methylomonas rubra 1 1 119
276. MBF Methanobacterium formicicum 1 1 1476
277. MBI Methanobacterium thermoautotrophicum 4 4 415
278. MES Methanosarcina barkeri 1 1 130
279. MET Metridium senile 6 3 963
280. MGL Metasequoia glyptostroboides 1 1 120
281. MJU Microstroma juglandis 1 1 121
282. MLC Methylococcus capsulatus 3 3 1469
283. MLM Moloney murine leukemia virus 1 1 74
284. MLU Micrococcus luteus 2 2 238
285. MLY Micrococcus lysodeikticus 1 1 120
286. MNI Mnium rugicum 2 1 157
287. MOR Mortierella formosensis 1 1 120
288. MPO Marchantia polymorpha 1 1 119
289. MSE Megasphaera elsdenii 1 1 1567
290. MSG Mycobacterium asiaticum 2 1 1368
291. MSG Mycobacterium aurum 2 1 1349
292. MSG Mycobacterium avium 4 2 2735
293. MSG Mycobacterium chelonei 2 1 1355
294.
MSG Mycobacterium chitae 2 1 1359
295. MSG Mycobacterium fallax 2 1 1348
296. MSG Mycobacterium flavescens 2 1 1357
297. MSG Mycobacterium gordonae 2 1 1373
298. MSG Mycobacterium kansasii 2 1 1369
299. MSG Mycobacterium leprae 1 1 313
300. MSG Mycobacterium neoaurum 2 1 1354
301. MSG Mycobacterium nonchromogenicum 2 1 1376
302. MSG Mycobacterium paratuberculosis 2 1 1367
303. MSG Mycobacterium phlei 2 1 1357
304. MSG Mycobacterium senegalense 2 1 1356
305. MSG Mycobacterium smegmatis 1 1 77
306. MSG Mycobacterium sp. 4 2 2715
307. MSG Mycobacterium terrae 2 1 1363
308. MSG Mycobacterium thermoresistible 2 1 1359
309. MSG Mycobacterium triviale 2 1 1351
310. MSG Mycobacterium tuberculosis 1 1 116
311. MSL Mytilus edulis 1 1 119
312. MTB Methylobacterium extorquens 2 2 1471
313. MTB Methylobacterium organophilum 2 2 1431
314. MTB Methylobacterium sp. 1 1 1052
315. MTE Methylosporovibrio methanica 1 1 1306
316. MUS Mus musculus 40 40 4555
317. MYA Mya arenaria 6 3 927
318. MYC Mycoplasma capricolum 3 3 259
319. MYC Mycoplasma hyopneumoniae 3 3 1799
320. MYC Mycoplasma mycoides 7 6 1885
321. MYC Mycoplasma sp. 24 24 34006
322. MYL Methylosinus trichosporium 2 2 1575
323. MYM Methylophilus methylotrophus 2 2 1619
324. MYP Methylocystis parvus 2 2 1433
325. MZE Zea mays 1 1 50
326. NDU Nematospiroides dubius 1 1 360
327. NEC Nectria haematococca 6 6 1152
328. NEM Ascaris suum 22 22 1251
329. NEU Neurospora crassa 6 6 1100
330. NGO Neisseria denitrificans 1 1 1478
331. NGO Neisseria gonorrhoeae 1 1 1486
332. NIF Nitella flexilis 1 1 121
333. NIT Nitrobacter winogradskyi 1 1 117
334. OCE Oceanospirillum linum 1 1 1542
335. ONG Onchocerca gibsoni 1 1 363
336. OPW Ophiocoma wendtii 9 3 1036
337. PAE Palaemonetes kadiakensis 1 1 1877
338. PAR Paramecium caudatum 1 1 366
339. PAR Paramecium primaurelia 1 1 366
340. PAR Paramecium tetraurelia 1 1 120
341. PAS Pasteurella multocida 3 3 1926
342. PBL Phycomyces blakesleeanus 1 1 120
343. PBR Perinereis brevicirris 1 1 120
344.
PCL Prochloron sp. 1 1 122
345. PCR Philosamia cynthia ricini 2 2 289
346. PDE Paracoccus denitrificans 1 1 117
347. PEA Pisum sativum 8 8 824
348. PEC Penicillium chrysogenum 1 1 119
349. PEP Penicillium patulum 1 1 119
350. PEU Penaeus aztecus 1 1 1902
351. PFA Plasmodium falciparum 1 1 78
352. PGO Phascolopsis gouldii 1 1 120
353. PHS Phasianus colchicus 1 1 95
354. PHV Phaseolus vulgaris 1 1 75
355. PHY Pythium hydnosporum 1 1 118
356. PIL Pilayella littoralis 1 1 118
357. PIR Phlyctochytrium irregulare 1 1 118
358. PIS Pimelobacter simplex 1 1 120
359. PIV Pivellula marina 2 1 2885
360. PLA Platygloea peniophorae 1 1 119
361. PLC Planococcus citreus 2 1 116
362. PLC Planococcus kocurii 2 1 116
363. PLE Phleogena faginea 1 1 119
364. PLL Pirella marina 1 1 110
365. PLL Pirella sp. 2 2 222
366. PLT Planctomyces brasiliensis 1 1 110
367. PLT Planctomyces limnophilus 1 1 111
368. PLT Planctomyces staleyi 1 1 1525
369. PMC Pneumocystis carinii 1 1 120
370. PMI Prorocentrum micans 1 1 364
371. PNC Pseudonocardia thermophila 1 1 1246
372. PNU Psilotum nudum 2 2 171
373. POC Procaris ascensionis 1 1 1874
374. POO Prosthecochloris aestuarii 1 1 110
375. POR Porocephalus crotali 1 1 1830
376. POS Pleurotus ostreatus 1 1 118
377. PPO Puccinia poarum 1 1 118
378. PRA Procambarus leonensis 1 1 1869
379. PRE Planocera reticulata 1 1 120
380. PRM Proteus vulgaris 4 4 1925
381. PSE Pseudomonas aeruginosa 2 2 1637
382. PSE Pseudomonas cepacia 2 2 1589
383. PSE Pseudomonas fluorescens 2 1 120
384. PSE Pseudomonas sp. 1 1 118
385. PT4 Bacteriophage T4 17 12 979
386. PT5 Bacteriophage T5 9 9 711
387. PTE Porphyra tenera 1 1 121
388. PTR Plagiomnium trichomanes 1 1 119
389. PYE Porphyra yezoensis 1 1 121
390. QUL Coturnix coturnix 1 1 136
391. RAB Oryctolagus cuniculus 15 11 4955
392. RAT Rattus norvegicus 60 45 6084
393. RAT Rattus rattus 4 4 230
394. RCA Rhodobacter capsulatus 2 2 235
395. RCY Russula cyanoxantha 1 1 119
396. REC Renobacter vacuolatum 1 1 116
397.
RER Rhodococcus equi 2 1 1360
398. RER Rhodococcus erythropolis 1 1 121
399. RHC Rhizoctonia crocorum 1 1 119
400. RHZ Rhizoctonia hiemalis 1 1 119
401. RIC Oryza sativa 2 2 417
402. RIF Riftia pachyptila 6 3 929
403. RIR Rickettsia rickettsii 2 1 1443
404. RIR Rickettsia typhi 2 1 1444
405. RMA Rhodopseudomonas marina 1 1 1417
406. RPA Rhodopseudomonas palustris 1 1 119
407. RRU Rhodospirillum rubrum 2 2 161
408. RSP Rhodospirillum rubrum 1 1 1446
409. RSS Rhodobacter sphaeroides 1 1 115
410. RTO Rhabditis tokai 1 1 119
411. RYE Secale cereale 3 3 356
412. SAC Sulfolobus acidocaldarius 2 2 204
413. SAG Schizochytrium aggregatum 1 1 119
414. SAH Saccharum officinarum 1 1 50
415. SAU Stigmatella aurantiaca 2 2 239
416. SCC Scyliorhinus caniculus 1 1 120
417. SCL Styela clava 6 3 967
418. SCM Schistosoma mansoni 2 2 215
419. SCS Saccharopolyspora hirsuta 1 1 1284
420. SCU Thyone briareus 6 3 1059
421. SEP Septobasidium carestianum 1 1 119
422. SFE Saprolegnia ferax 1 1 118
423. SFU Sargassum fulvellum 1 1 118
424. SHE Shewanella hanedai 2 1 120
425. SHP Ovis sp. 1 1 76
426. SHR Artemia salina 2 2 282
427. SJA Sabellastarte japonica 1 1 120
428. SLI Synechococcus lividus 1 1 120
429. SLM Physarum polycephalum 6 5 845
430. SME Spiroplasma sp. 11 11 14826
431. SMI Smittium culisetae 1 1 121
432. SNL Arion rufus 3 2 276
433. SNL Helix pomatia 1 1 119
434. SOB Scenedesmus obliquus 5 5 407
435. SOF Sepia officinalis 2 1 120
436. SOS Stichopus oshimae 1 1 120
437. SPG Saprospira grandis 1 1 121
438. SPI Spinacia oleracea 3 2 205
439. SPL Spirillum volutans 1 1 1492
440. SPM Spirobolus marginatus 6 3 977
441. SPO Sporolactobacillus inulinus 1 1 117
442. SPS Spirogyra sp. 1 1 120
443. SQD Illex illecebrosus 1 1 120
444. SRG Sorghum bicolor 1 1 50
445. SSO Sulfolobus solfataricus 1 1 126
446. SSP Sulfolobus sp. 1 1 131
447. STA Staphylococcus aureus 1 1 115
448. STA Staphylococcus epidermidis 5 3 264
449. STC Stentor coeruleus 1 1 353
450. STE Stella humosa 1 1 117
451.
STF Asteria amurensis 2 2 195
452. STF Asterias forbesi 9 3 1045
453. STF Asterina pectinifera 1 1 120
454. STM Streptomyces griseus 1 1 120
455. STN Stenopus hispidus 1 1 1885
456. STR Streptococcus cremoris 1 1 117
457. STR Streptococcus faecalis 1 1 117
458. STR Streptococcus sp. 1 1 1577
459. STY Salmonella typhimurium 7 6 459
460. SUD Pseudocentrotus depressus 1 1 120
461. SUE Heliocidaris erythrogramma 6 3 1043
462. SUE Heliocidaris tuberculata 6 3 910
463. SUH Hemicentrotus pulcherrimus 1 1 120
464. SUL Lytechinus pictus 6 3 1046
465. SUP Psammechinus miliaris 8 8 422
466. SUS Strongylocentrotus purpuratus 6 3 988
467. SYB Syntrophospora bryantii 1 1 1532
468. SYC Synechocystis sp. 1 1 76
469. SYN Synechococcus lividus 1 1 119
470. SYW Syntrophomonas wolfei 1 1 1532
471. TAM Tatlockia maceachernii 3 3 458
472. TAM Tatlockia micdadei 9 9 1286
473. TAN Tilletiaria anomala 1 1 118
474. TAP Taphrina deformans 1 1 119
475. TCO Tilletiaria controversa 1 1 118
476. TET Tetrahymena thermophila 7 7 615
477. TEY Tetrahymena pyriformis 3 3 623
478. TFE Acidiphilium cryptum 1 1 122
479. TFE Thiobacillus acidophilus 1 1 120
480. TFE Thiobacillus ferrooxidans 2 2 240
481. TFE Thiobacillus intermedius 1 1 117
482. TFE Thiobacillus neapolitanus 1 1 119
483. TFE Thiobacillus novellus 1 1 120
484. TFE Thiobacillus perometabolis 1 1 116
485. TFE Thiobacillus sp. 1 1 117
486. TFE Thiobacillus thiooxidans 1 1 121
487. TFE Thiobacillus thioparus 1 1 118
488. TFE Thiobacillus versutus 1 1 116
489. TFE Thiomicrospira pelophila 1 1 118
490. TFE Thiomicrospira sp. 1 1 117
491. THA Artificial gene 3 3 276
492. THC Thermococcus celer 2 2 1611
493. THR Thermomicrobium roseum 1 1 127
494. THT Thiothrix nivea 1 1 122
495. THT Thiothrix sp. 1 1 120
496. THV Thiovulum sp. 1 1 123
497. TLA Thermomyces lanuginosus 2 2 276
498. TLP Torulopsis utilis 1 1 121
499. TOB Nicotiana tabacum 2 2 152
500. TOR Trichosporon oryzae 1 1 118
501. TRB Trypanosoma brucei 2 2 106
502.
TRD Tripsacum dactyloides 1 1 50
503. TRF Crithidia fasciculata 7 7 1305
504. TRI Trichomonas vaginalis 1 1 341
505. TTE Thermoproteus tenax 1 1 1504
506. TTH Thermus aquaticus 2 2 243
507. TTH Thermus sp. 2 2 243
508. TTH Thermus thermophilus 5 4 354
509. TUL Tulasnella violea 1 1 118
510. TUM Tuberoidobacter mutans 1 1 116
511. TVI Thraustochytrium visurgense 1 1 119
512. UPE Ulva pertusa 1 1 120
513. URE Ureaplasma urealyticum 1 1 1464
514. UTH Uthatobasidium fusisporum 1 1 118
515. UUN Urechis unicinctus 1 1 120
516. VCH Vibrio cholerae 1 1 119
517. VER Verrucomicrobium spinosum 1 1 116
518. VFA Vicia faba 2 2 327
519. VIB Aeromonas hydrophila 1 1 118
520. VIB Aeromonas media 1 1 119
521. VIB Aeromonas salmonicida 1 1 119
522. VIB Alteromonas colwelliana 2 1 120
523. VIB Alteromonas putrifaciens 1 1 120
524. VIB Photobacterium angustum 1 1 120
525. VIB Photobacterium leiognathi 1 1 120
526. VIB Photobacterium sp. 1 1 120
527. VIB Plesiomonas shigelloides 1 1 120
528. VIB Vibrio alginolyticus 1 1 121
529. VIB Vibrio anguillarum 1 1 120
530. VIB Vibrio carchariae 1 1 120
531. VIB Vibrio cincinnatii 1 1 120
532. VIB Vibrio damsela 1 1 120
533. VIB Vibrio fischeri 1 1 120
534. VIB Vibrio fluvialis 2 2 240
535. VIB Vibrio gazogenes 1 1 120
536. VIB Vibrio logei 1 1 120
537. VIB Vibrio marinus 2 2 236
538. VIB Vibrio metschnitovii 1 1 120
539. VIB Vibrio mimicus 1 1 120
540. VIB Vibrio natriegens 1 1 121
541. VIB Vibrio nereis 2 1 121
542. VIB Vibrio parahaemolyticus 2 2 239
543. VIB Vibrio pelagius 1 1 120
544. VIB Vibrio proteolyticus 1 1 120
545. VIB Vibrio psychroerythus 1 1 119
546. VIB Vibrio sp. 5 4 478
547. VIT Vitreoscilla sp. 2 2 234
548. VIT Vitreoscilla stercoraria 2 2 1608
549. VVU Vibrio vulnificus 1 1 120
550. WHT Triticum aestivum 17 14 1729
551. WHT Triticum sp. 2 2 152
552. WHT Triticum vulgare 2 2 282
553. WLB Wolbachia persica 2 1 1475
554. WOL Wolinella succinogenes 1 1 1503
555. XEB Xenopus borealis 3 3 477
556.
XEL Xenopus laevis 20 17 4158
557. XET Xenopus tropicalis 2 2 242
558. XYL Xylella fastidiosa 1 1 1493
559. YSA Candida albicans 1 1 121
560. YSC Saccharomyces cerevisiae 62 47 4920
561. YSG Saccharomyces carlsbergensis 4 2 242
562. YSK Kluyveromyces lactis 1 1 121
563. YSP Schizosaccharomyces pombe 8 8 630
564. YSR Pichia membranaefaciens 1 1 120
565. YST Yeast sp. 7 6 492
566. YSU Candida utilis 12 10 815
567. Unidentified 177 169 19082

Total 1946 1647 445723

VIRAL

Key Name Reports Entries Bases
-------------------------------------------------------------------------------
1. AA2 Adeno associated virus 9 6 7879
2. AAF Avian musculoaponeurotic fibrosarcoma virus 2 1 3171
3. AC2 Avian carcinoma virus 14 11 18641
4. ACB Avian erythroblastosis virus 19 15 18700
5. ACE Avian endogenous virus 5 5 2772
6. ACF Fujinami sarcoma virus 3 2 7503
7. ACM Avian myelocytomatosis retrovirus 10 8 11975
8. ACR Avian reticuloendotheliosis virus 12 8 8401
9. ACS Avian sarcoma virus 15 13 16357
10. AD4 Mastadenovirus h40 4 4 10795
11. AD4 Mastadenovirus h41 3 3 8920
12. ADA Mastadenovirus s30 5 5 1318
13. ADB Mastadenovirus 2 82 5 36399
14. ADB Mastadenovirus c2 1 1 196
15. ADC Mastadenovirus h3 12 11 9026
16. ADD Mastadenovirus h4 7 5 5078
17. ADE Mastadenovirus 7 1 1 2718
18. ADE Mastadenovirus h5 35 11 30276
19. ADG Mastadenovirus h7 15 6 13245
20. ADG Mastadenovirus s7 6 5 4931
21. ADI Mastadenovirus 9 2 2 332
22. ADJ Mastadenovirus 10 1 1 135
23. ADL Mastadenovirus 2 2 1 430
24. ADL Mastadenovirus h12 41 23 19901
25. ADR Mastadenovirus 18 2 2 364
26. ADT Tupaia adenovirus 4 4 3784
27. ADU Mastadenovirus 19 1 1 154
28. ADV Adenovirus VA 8 7 11146
29. ADV Mastadenovirus 2 2 1549
30. ADV Mastadenovirus h40 1 1 1849
31. ADV Mastadenovirus h41 2 1 1939
32. ADX Mastadenovirus bos1 1 1 159
33. ADX Mastadenovirus mus 7 5 9449
34. ADY Eggdrop syndrome-1976 virus 1 1 52
35. ADZ Mastadenovirus 31 2 2 300
36. ADZ Mastadenovirus bos3 1 1 2849
37. ADZ Mastadenovirus c2 2 2 3689
38.
AEA Avian adenovirus 3 3 576
39. AEC Canine adenovirus 4 4 805
40. AEE Equine adenovirus 4 4 617
41. AIN Aino virus 1 1 850
42. ALE Rous associated virus 9 8 5119
43. ALK Avian leukemia virus 2 2 454
44. ALM Avian myeloblastosis virus 6 6 6465
45. ALR Rous sarcoma virus 162 136 65936
46. ALV Avian leukosis virus 12 12 5400
47. APH Foot and mouth disease virus 125 120 77985
48. ARE Avian retrovirus 2 2 698
49. ARE Avian retrovirus IC10 3 2 6013
50. ARR Adult diarrhea rotavirus 2 2 1445
51. ASB Avocado sunblotch viroid 20 19 4715
52. ASS Apple scar skin viroid 1 1 329
53. ASV African swine fever virus 4 3 6376
54. BBM Broad bean mottle virus 3 3 680
55. BBV Black beetle virus 4 3 4893
56. BCT Beet curly top virus 1 1 2993
57. BEC Bovine enteritic coronavirus 1 1 1710
58. BEV Bovine enterovirus 1 1 7414
59. BIM Bovine immunodeficiency-like virus 1 1 8482
60. BLC Bunyamwera virus 3 3 12294
61. BLC Bunyavirus La Crosse 19 18 9070
62. BLC Germiston bunyavirus 2 2 5514
63. BLV Bovine leukemia virus 20 20 31133
64. BNY Beet necrotic yellow vein mosaic virus 6 6 17031
65. BOO Boolarra virus 2 1 1305
66. BRV Berne virus 2 2 376
67. BTV Bluetongue virus 31 23 43490
68. BVD Bovine viral diarrhea virus 1 1 12573
69. BWY Beet western yellow virus 5 3 7958
70. BYD Barley yellow dwarf virus 3 2 6280
71. CAD Canine distemper virus 4 4 6857
72. CAN Carnation etched ring virus 1 1 7932
73. CAP Capripoxvirus 4 4 10417
74. CAS Cassava latent virus 2 2 5503
75. CASNS Cas NS1 retrovirus 1 1 2711
76. CBV Choristoneura biennis virus 1 1 1173
77. CCC Cadang-cadang coconut viroid 3 3 779
78. CCP Cricket paralysis virus 1 1 1594
79. CEA Caprine arthritis encephalitis virus 9 8 11854
80. CEV Citrus exocortis viroid 4 4 1484
81. CFD Coconut foliar decay virus 1 1 1291
82. CHM Chloris striate mosaic virus 1 1 2750
83. CHV Chlorella virus 2 2 3727
84. CMV Carnation mottle virus 1 1 4003
85. CNV Cucumber necrosis virus 1 1 4701
86. CO4 Coliphage N4 2 2 1759
87. COB Bovine coronavirus 8 5 15070
88.
CPB Chlorella PBCV-1 virus 1 1 4265
89. CPE Euxoa scandens cytoplasmic polyhedrosis virus 1 1 882
90. CPF Cucumber pale fruit viroid 3 2 604
91. CPR Chandipura virus 1 1 1751
92. CPV Cowpox virus 20 20 17648
93. CRV Cymbidium ringspot virus 2 1 4733
94. CSO Campoletis sonorensis virus 8 6 9418
95. CSV Chrysanthemum stunt viroid 4 3 1040
96. CTN Coconut tinangaja viroid 1 1 254
97. CXB Coxsackievirus B1 2 2 8844
98. CXB Coxsackievirus B3 5 5 20481
99. CXB Coxsackievirus B4 1 1 7395
100. CYM Clover yellow mosaic potexvirus 1 1 1051
101. CYS Lymphocystis disease virus of fish 3 3 5310
102. DEN Dengue virus 105 103 90462
103. DHV Dhori virus 1 1 1479
104. DMB Thymotropic retrovirus type B 1 1 285
105. DNV Densonucleosis virus 1 1 4277
106. DPF Dapple peach fruit disease viroid 1 1 297
107. DPP Dapple plum and peach fruit disease viroid 1 1 297
108. DUG Dugbe nairovirus 1 1 1712
109. EAE Equine arthritis encephalitis virus 1 1 2580
110. EBO Ebola virus 2 2 3178
111. ECV Echo 11 virus 1 1 98
112. ECV Echo 6 virus 1 1 99
113. ECV Echo 9 virus 2 2 615
114. EEE Eastern equine encephalomyelitis virus 5 5 5163
115. EEV Venezuelan equine encephalitis virus 7 6 8300
116. EEW Western equine encephalitis virus 2 2 4521
117. EIA Equine infectious anemia virus 18 11 24136
118. EMC Encephalomyocarditis virus 10 9 27249
119. FCG Gardner-Arnstein Feline Leukemia oncovirus B 2 2 3863
120. FCL Feline calicivirus 2 2 6358
121. FCR RD114 retrovirus 1 1 126
122. FCS Feline sarcoma virus 7 7 14248
123. FCV Feline leukemia virus 17 15 38510
124. FHV Flock house virus 2 1 1400
125. FIP Feline infectious peritonitis virus 1 1 4500
126. FIV Feline immunodeficiency virus 6 3 19318
127. FLA Influenza virus type A 504 412 430270
128. FLB Influenza virus type B 85 72 102701
129. FLC Influenza virus type C 29 29 46959
130. FMV Figwort mosaic virus 1 1 7743
131. FPV Fowlpox virus 6 6 25819
132. FV3 Frog virus 3 3 2 2273
133. GPB Granulosis virus 1 1 999
134.
GPR Gottfried porcine rotavirus 1 1 3302
135. GSB GS virus 1 1 307
136. GSH Ground squirrel hepatitis virus 1 1 3311
137. GVI Grapevine chrome mosaic virus 4 2 11653
138. GVI Grapevine viroid 3 2 665
139. GVT Trichoplusia ni granulosis virus 1 1 998
140. GYS Grapevine yellow speckle viroid 3 2 730
141. HAN Hantaan virus 6 5 9735
142. HBD Duck hepatitis B virus 6 5 12249
143. HBH Heron hepatitis B virus 1 1 3027
144. HCV Hog cholera virus 3 2 24567
145. HIV Human immunodeficiency virus type 1 101 53 187276
146. HIV Human immunodeficiency virus type 2 13 9 75733
147. HIV Human lymphotropic virus type III 1 1 261
148. HIV Human T-cell lymphotropic virus type II 7 3 10520
149. HJV Highlands J virus 2 2 505
150. HL1 Human lymphotropic virus type I 15 13 26925
151. HL2 Human lymphotropic virus type II 4 4 5400
152. HLV Hop latent viroid 2 1 256
153. HOB Human coronavirus 4 3 3550
154. HOJ HoJo virus 1 1 3613
155. HOM Mus hortulanus virus 3 3 4668
156. HOP Hop Stunt Viroid 10 7 2094
157. HPA Hepatitis A virus 13 11 42881
158. HPB Hepatitis B virus 74 70 76139
159. HPC Hepatitis C virus 3 2 7893
160. HPD Hepatitis delta virus 4 3 3523
161. HPE Hepatitis E virus 2 1 2570
162. HPU Duck hepatitis virus 2 1 3021
163. HPV Hepatitis virus 2 2 4553
164. HRD Human retrovirus type D 1 1 8785
165. HRV Human rhinovirus 11 9 31251
166. HS1 Herpes simplex virus type 1 148 109 315961
167. HS2 Herpes simplex virus type 2 37 31 54029
168. HS4 Epstein-Barr virus 88 68 310429
169. HS5 Human cytomegalovirus 44 40 136382
170. HS5 Murine cytomegalovirus 4 3 6921
171. HS5 Simian cytomegalovirus 1 1 880
172. HS6 Human herpesvirus type 6 2 2 26298
173. HSB Bovine herpesvirus type 1 9 9 8227
174. HSC Simian cytomegalovirus 2 2 2294
175. HSE Equine herpesvirus type 1 19 17 45132
176. HSG Gallid herpesvirus type 1 5 5 10607
177. HSK Gallid herpesvirus type 2 4 4 11325
178. HSL Feline herpesvirus 1 1 1619
179. HSL Herpesvirus ateles 1 1 2577
180. HSM Gallid herpesvirus type 1 2 2 3367
181.
HSO Herpesvirus papio 2 2 806
182. HSP Human spumaretrovirus 3 3 12095
183. HSS Herpesvirus saimiri 30 29 22133
184. HSS Pseudorabies virus 18 14 38325
185. HST Herpesvirus tamarinus 2 1 2556
186. HSU Herpesvirus tupaia 1 1 863
187. HSV Herpes simplex virus 1 1 501
188. HSY Herpesvirus sylvilagus 1 1 559
189. HTV Human adult T-cell leukemia virus 3 2 3556
190. IBA Avian infectious bronchitis virus 24 24 31163
191. IBB Infectious bronchitis virus 3 2 7215
192. IBD Infectious bursal disease virus of chickens 4 3 9138
193. IHN Infectious hematopoietic necrosis virus 2 2 2961
194. INS Insect iridescent virus type 22 1 1 2183
195. IPN Infectious pancreatic necrosis virus 2 1 3097
196. IRI Iridescent virus type 1 1 1 2461
197. IRI Iridescent virus type 6 3 3 10327
198. JEV Japanese encephalitis virus 4 4 18496
199. KUN Kunjin virus 1 1 10664
200. KVS Killer virus of S.cerevisiae 8 6 1872
201. LCV Lymphocytic choriomeningitis virus 11 11 19716
202. LDV Lactate dehydrogenase-elevating virus 6 6 1684
203. LEE Lee virus 1 1 3616
204. LSV Lassa virus 4 4 10307
205. MAA Alfalfa mosaic virus 28 19 15793
206. MAV Myeloblastosis-associated virus 1 1 1173
207. MBG Bean golden mosaic virus 4 4 10465
208. MBG Bean yellow mosaic virus 1 1 1015
209. MBR Brome mosaic virus 16 12 9903
210. MBS Barley stripe mosaic virus 11 8 13655
211. MBV Middleburg virus 4 3 3394
212. MCA Cauliflower mosaic virus 28 25 41207
213. MCC Cowpea chlorotic mottle virus 7 6 6379
214. MCF Mink cell focus-forming virus 8 8 8202
215. MCG Cucumber green mottle mosaic virus 2 2 2421
216. MCM Maize chlorotic mottle virus 2 1 4437
217. MCP Cowpea mosaic virus 8 8 10170
218. MCV Cucumber mosaic virus 53 52 34142
219. MDP Aleutian mink disease parvovirus 4 2 8255
220. MEA Measles virus 30 25 83843
221. MEV Maus-Elberfeld virus 1 1 54
222. MGR Maguari bunyavirus 1 1 945
223. MHV Murine hepatitis virus 39 33 39489
224. MLA Abelson murine leukemia virus 8 6 10848
225. MLE Mouse RFV endogenous retrovirus 2 2 684
226.
MLF Friend mink cell focus-inducing virus 5 4 7000
227. MLF Friend murine leukemia virus 2 2 4170
228. MLF Friend spleen focus-forming virus 9 9 13488
229. MLG Gross passage A murine leukemia virus 2 2 1220
230. MLK Kirsten murine leukemia virus 1 1 1335
231. MLM Moloney murine leukemia virus 56 42 35727
232. MLN Murine non-leukeminogenic retrovirus 1 1 529
233. MLO AKV murine leukemia virus 7 2 9000
234. MLR Rauscher spleen focus-forming virus 2 2 2244
235. MLS Soule murine leukemia virus 2 2 1310
236. MLT Tikaut murine leukemia virus 1 1 641
237. MLV Murine leukemia virus 53 44 48928
238. MLX Xenotropic murine leukemia virus 1 1 3060
239. MMT Mouse mammary tumor virus 29 28 37314
240. MNC Narcissus mosaic potexvirus 1 1 6955
241. MOK Mokola lyssavirus 2 2 152
242. MOP Mopeia virus 1 1 3419
243. MPM Mouse polyomavirus 1 1 1155
244. MPV Monkeypox virus 1 1 1276
245. MRV Marburg virus 1 1 59
246. MSB Southern bean mosaic virus 2 2 793
247. MSC Sugarcane mosaic virus 1 1 1782
248. MSH Harvey murine sarcoma virus 4 4 3226
249. MSJ FBJ murine osteosarcoma virus 1 1 4226
250. MSK Kirsten murine sarcoma virus 2 2 1933
251. MSN Solanum nodiflorum mottle virus 1 1 377
252. MSR FBR murine osteosarcoma virus 1 1 3811
253. MSV Murine sarcoma virus 5 5 5020
254. MSY Myeloproliferative sarcoma virus 3 3 5305
255. MTG Tomato golden mosaic virus 3 3 6342
256. MTR Tobacco rattle virus 7 7 20386
257. MTS Lucerne transient streak virus 3 3 970
258. MTV Tobacco mosaic virus 22 8 16050
259. MTV Velvet tobacco mottle virus 1 1 366
260. MTY Andean potato latent virus 1 1 96
261. MTY Clitoria yellow vein virus 1 1 120
262. MTY Eggplant mosaic virus 3 3 6469
263. MTY Kennedya yellow mosaic virus 1 1 83
264. MTY Ononis yellow mosaic virus 2 2 6342
265. MTY Turnip yellow mosaic virus 20 16 15958
266. MUM Mumps virus 11 9 13622
267. MUR Murine retrovirus SL3-2 1 1 492
268. MVE Murray Valley encephalitis virus 2 2 5994
269. MVM Minute virus of mice 9 7 16222
270.
MYX Myxoma virus 3 2 4473
271. MZS Maize streak virus 5 4 8139
272. NDV Newcastle disease virus 49 47 98864
273. NEV Nephropathia epidemica 2 2 5466
274. NOD Nodamura virus 2 1 1335
275. NPA Antheraea pernyi nuclear polyhedrosis virus 1 1 285
276. NPA Autographa californica nuclear polyhedrosis virus 45 41 67447
277. NPB Bombyx mori nuclear polyhedrosis virus 3 3 3931
278. NPG Galleria mellonella nuclear polyhedrosis virus 5 5 2556
279. NPM Mamestra brassicae nuclear polyhedrosis virus 1 1 2598
280. NPO Orgyia pseudotsugata polyhedrosis virus 8 8 16937
281. NPS Spodoptera frugiperda nuclear polyhedrosis virus 1 1 1557
282. OLV Ovine lentivirus 2 2 18512
283. ONN O'Nyong-nyong virus 2 1 11835
284. ORF Orf virus 2 1 5003
285. PCB Baboon endogenous virus 8 7 20105
286. PCC Colobus type C cpc-1 endogenous retrovirus 2 2 373
287. PCE Chimpanzee type C endogenous retrovirus 2 2 430
288. PCG Gibbon leukemia virus 5 4 9202
289. PCM Macaca endogenous retrovirus 1 1 126
290. PCM Macaca mulatta type C retrovirus 4 4 938
291. PCS Simian sarcoma virus 12 9 10868
292. PEB Pea early browning virus 2 1 7073
293. PEV Subacute sclerosing panencephalitis virus 3 3 3444
294. PIB Bovine parainfluenza virus type 3 3 1 8700
295. PIC Pichinde Arenavirus 8 8 12637
296. PIF Human parainfluenza virus type 3 29 27 46139
297. PLV Potato leaf roll virus 5 3 6650
298. PLY Budgerigar fledgling disease virus 1 1 4980
299. PLY Polyomavirus 131 39 35440
300. PLY Polyomavirus BK 2 2 799
301. PLY Polyomavirus JC 3 3 765
302. PMP Papaya mosaic potexvirus 2 2 1039
303. PMS Simian paramyxovirus (SV5) 1 1 1382
304. PMV Pepper mottle virus 1 1 1480
305. POL Poliovirus 120 106 78406
306. POV Porcine parvovirus 1 1 3670
307. PPA Avian papillomavirus 2 2 786
308. PPB Bovine papillomavirus 14 14 32821
309. PPC Hamster papovavirus 2 2 10672
310. PPD Deer papillomavirus 1 1 8374
311. PPE European Elk papillomavirus 4 3 8842
312. PPH Human papillomavirus 40 38 101284
313.
PPI Micromys minutus papillomavirus 3 3 487 314. PPL Lymphotropic papovavirus 1 1 5270 315. PPM Monkey B-lymphotropic papovavirus 4 4 10920 316. PPR Reindeer papillomavirus 2 2 930 317. PPV Plum pox potyvirus 3 3 13827 318. PRH Prospect Hill virus 1 1 1675 319. PRV Porcine rotavirus 11 9 12190 320. PSV Peanut stunt virus 1 1 393 321. PTP Punta toro phlebovirus 6 6 7130 322. PTV Potato spindle tuber viroid 5 5 1795 323. PV1 Parvovirus H1 3 2 5302 324. PV3 Parvovirus H3 1 1 125 325. PVA Raccoon parvovirus 2 1 2410 326. PVB Papovavirus BKV 38 25 18937 327. PVB Parvovirus B19 4 4 11325 328. PVC Canine parvovirus 4 4 10016 329. PVD Bovine parvovirus 2 1 5517 330. PVF Feline panleukopenia virus 10 6 16703 331. PVM Mink enteritis virus 4 2 4888 332. PVR Kilham rat virus 1 1 125 333. PVR Parvovirus R1 2 2 548 334. PVS Potato virus S 1 1 3552 335. PVX Potato virus X 8 6 22573 336. PVY Potato virus Y 8 5 17509 337. RAV Rabies virus 21 20 32337 338. RBF Malignant rabbit fibroma virus 3 3 446 339. RBF Rabbit fibroma virus 15 15 28212 340. RBV Rabbit rotavirus 1 1 1036 341. RCM Red clover mottle virus 1 1 3543 342. RDV Rice dwarf virus 5 3 5209 343. REO Reovirus sp. 2 2 2903 344. REO Reovirus type 1 21 19 14348 345. REO Reovirus type 2 14 12 6823 346. REO Reovirus type 3 48 34 25499 347. RML Rauscher murine leukemia virus 3 3 395 348. RNM Red clover necrotic mosaic virus 3 2 5338 349. RO1 Rotavirus sp. 3 3 4074 350. RO1 Rotavirus subgroup 1 3 2 2712 351. RO2 Rotavirus subgroup 2 6 6 5955 352. ROB Bovine rotavirus 21 16 24214 353. ROH Human rotavirus 6 6 8164 354. ROR Rhesus rotavirus 2 2 3424 355. ROT Simian rotavirus SA11 19 15 20730 356. RPF Rinderpest virus 5 5 10323 357. RPV Raccoonpox virus 1 1 2195 358. RRV Ross river virus 4 3 19686 359. RSB Bovine syncytial virus 1 1 1201 360. RSH Human respiratory syncytial virus 33 21 19332 361. RSV Rat sarcoma virus 1 1 1380 362. RUB Rubella virus 12 7 26119 363. RVF Rift Valley fever virus 1 1 3884 364. 
SAM Satellite arabis mosaic virus 1 1 300 365. SAP Satellite panicum mosaic virus 1 1 826 366. SFS Sandfly fever Sicilian virus 2 1 1747 367. SFV Semliki forest virus 12 3 15380 368. SHV Simian hepatitis A virus 6 4 4331 369. SIG Sigma virus 1 1 1718 370. SIN Sindbis virus 16 6 18450 371. SIV Simian immunodeficiency virus 31 23 106170 372. SIV Simian immunodeficiency virus 1 1 7759 373. SIV Simian immunodeficiency virus 3 3 9130 374. SLO St. Louis encephalitis virus 8 8 5391 375. SMF Simian foamy virus 1 1 3534 376. SND Parainfluenza virus 39 31 59944 377. SND Parainfluenza virus type 4A 2 2 3767 378. SNV Spleen necrosis virus 9 8 3909 379. SPV Spiroplasma virus 1 1 4421 380. SRV Sapporo rat virus 2 2 5420 381. SSH Snowshoe hare bunyavirus 11 10 6726 382. STL Simian T-cell lymphotropic virus type I 3 3 8227 383. STT St. Thomas 3 rotavirus 1 1 1062 384. SUV Subterranean clover mottle virus 2 2 720 385. SV4 Rhesus macaque polyomavirus 180 42 15361 386. SV5 Simian virus 5 4 4 5586 387. SVC Spring viremia of carp virus 2 2 778 388. SVD Swine vesicular disease virus 2 2 7475 389. SYE Sonchus yellow net virus 3 3 2822 390. TAC Tacaribe virus 4 3 10607 391. TAS Tomato apical stunt viroid 3 2 723 392. TBE Tick-borne encephalitis virus 7 5 32678 393. TBR Tomato black ring virus 11 11 18222 394. TBS Tomato bushy stunt virus 3 2 5172 395. TBV Tick-borne virus 1 1 1586 396. TCV Turnip crinkle virus 5 5 6433 397. TEV Tobacco etch virus 4 3 21315 398. TGE Transmissible gastroenteritis virus 14 9 20969 399. TME Theiler's murine encephalomyelitis virus 4 4 26220 400. TMG Tobacco mild green mosaic virus 3 2 7768 401. TNC Tobacco necrosis virus 1 1 3660 402. TNS Satellite tobacco necrosis virus 4 2 1380 403. TOA Tomato aspermy virus 5 5 943 404. TOS Tomato ringspot virus 2 2 3096 405. TPM Tomato plant macho viroid 1 1 360 406. TRS Tobacco ringspot virus 3 3 790 407. TSV Tobacco streak virus 3 3 2525 408. TVM Tobacco vein mottling virus 4 2 9892 409. 
UST Ustilago maydis P6 virus 1 1 1234 410. UUK Uukuniemi virus 2 2 4951 411. VAC Vaccinia virus 97 88 389431 412. VAR Variola virus 1 1 1274 413. VAZ Varicella-zoster virus 11 7 131533 414. VLV Visna virus 2 2 9690 415. VSV Vesicular stomatitis virus 169 146 192746 416. VYS Saccharomyces cerevisiae virus ScV1 1 1 819 417. WCP White clover mosaic virus 4 3 13303 418. WDV Wheat dwarf virus 2 2 2829 419. WHV Woodchuck hepatitis virus 7 7 19239 420. WMS Woolly monkey sarcoma virus 2 1 1431 421. WNF West Nile virus 7 4 11434 422. WTV Wound tumor virus 11 9 14409 423. YFV Flavivirus febricis 5 3 22289 424. ZYM Zucchini yellow mosaic virus 1 1 1374 Total 4751 3707 6439492 PHAGE Key Name Reports Entries Bases ------------------------------------------------------------------------------- 1. AL3 Bacteriophage alpha3 7 5 1299 2. BAZ Bacteriophage Z 1 1 370 3. BBF Bacteriophage BF23 3 3 1434 4. BEO Corynebacteriophage omega 1 1 1880 5. BET Corynebacteriophage beta 3 2 4162 6. BEU Corynebacteriophage gamma 2 2 139 7. BFR Bacteriophage fr 4 4 2053 8. BM2 Bacteriophage PM2 3 3 1025 9. BNF Bacteriophage NF 6 5 3258 10. BO1 Bacteriophage Bo1 1 1 205 11. BP2 Bacteriophage P21 2 2 191 12. BPH Bacteriophage phi-11 2 2 2041 13. BT1 Bacteriophage T1 1 1 1091 14. BU3 Bacteriophage U3 1 1 201 15. BZ3 Bacteriophage Bz13 1 1 218 16. C31 Bacteriophage phi-c31 1 1 3413 17. CF1 Bacteriophage Cf16 1 1 500 18. CP1 Bacteriophage Cp-1 5 3 3364 19. CP5 Bacteriophage Cp-5 4 2 1850 20. CP7 Bacteriophage Cp-7 5 3 4792 21. CP9 Bacteriophage Cp-9 1 1 1253 22. CPT Bacteriophage Cp-T1 1 1 730 23. D18 Bacteriophage D108 9 7 3935 24. F1C Bacteriophage f1 16 13 16373 25. F2C Bacteriophage f2 1 1 58 26. FR1 Bacteriophage fr1 1 1 205 27. G14 Bacteriophage G14 1 1 113 28. H19B Bacteriophage H19B 2 2 3301 29. H30 Bacteriophage H30 1 1 1905 30. H44 Bacteriophage H4489A 1 1 3222 31. HB3 Bacteriophage HB-3 1 1 1319 32. HP1 Bacteriophage HP1 4 2 10673 33. IKE Bacteriophage Ike 3 2 7200 34. 
J93 Bacteriophage 933J 2 1 1499 35. JP3 Bacteriophage Jp34 2 2 1070 36. JP5 Bacteriophage Jp501 1 1 205 37. K5T Bacteriophage BK5-T 5 5 2070 38. KU1 Bacteriophage Ku1 1 1 220 39. L17 Bacteriophage L17 2 2 240 40. L54 Bacteriophage L54a 1 1 1626 41. LAM Bacteriophage lambda 120 24 55603 42. LP7 Bacteriophage LP7 1 1 2110 43. M13 Bacteriophage M13 12 8 8218 44. M13MP7 Bacteriophage M13mp7 1 1 60 45. M13MP8 Bacteriophage M13mp8 3 3 240 46. M13MP9 Bacteriophage M13mp9 2 2 318 47. M2Y Bacteriophage M2Y 2 2 336 48. MS2 Bacteriophage MS2 16 8 4679 49. OX2 Bacteriophage Ox2 2 2 2641 50. P15 Bacteriophage phi-105 1 1 1306 51. P16 Bacteriophage 16-3 1 1 720 52. P18 Bacteriophage 186 1 1 3561 53. P21 Bacteriophage phi-21 2 2 949 54. P22 Bacteriophage P22 17 15 18461 55. P29 Bacteriophage phi-29 18 15 29805 56. P42 Bacteriophage 42D 2 2 1986 57. P434 Bacteriophage 434 7 5 2933 58. P80 Bacteriophage phi-80 7 6 4714 59. P82 Bacteriophage 82 1 1 1200 60. P93 Bacteriophage 933W 2 1 1661 61. PA2 Bacteriophage PA-2 1 1 2816 62. PF1D Bacteriophage Pf1 1 1 435 63. PF3 Bacteriophage Pf3 4 4 12981 64. PFD Bacteriophage fd 12 7 7334 65. PFI Bacteriophage Fi 1 1 78 66. PG4 Bacteriophage G4 12 8 7247 67. PGA Bacteriophage Ga 4 4 4022 68. PH1 Bacteriophage H1 1 1 98 69. PH15 Bacteriophage phi-15 3 3 2352 70. PH2 Bacteriophage 21 1 1 1688 71. PH3 Bacteriophage phi-3T 2 2 3422 72. PH5 Bacteriophage phi-105 6 5 3851 73. PH6 Bacteriophage phi-6 7 7 13619 74. PHC Lactococcus 1 bacteriophage 1 1 1654 75. PHI Bacteriophage phi-H 1 1 2465 76. PHK Bacteriophage phi-K 3 2 426 77. PHM Bacteriophage phi-vML3 1 1 1208 78. PK3 Bacteriophage K3 7 6 6630 79. PM1 Bacteriophage M1 1 1 1714 80. PM2 Bacteriophage M2 1 1 1820 81. PMU Bacteriophage Mu 48 37 18049 82. PP1 Bacteriophage P1 43 41 20939 83. PP2 Bacteriophage P2 11 10 7614 84. PP4 Bacteriophage P4 9 8 14159 85. PP7 Bacteriophage P7 6 5 3315 86. PQB Bacteriophage Q-beta 14 13 1866 87. PR4 Bacteriophage PR4 2 2 240 88. PR5 Bacteriophage PR5 2 2 238 89. 
PR722 Bacteriophage PR722 2 2 240 90. PRD1 Bacteriophage PRD1 9 9 9061 91. PS2 Bacteriophage PBS2 1 1 720 92. PSP Bacteriophage Sp 3 2 4542 93. PST Bacteriophage ST 1 1 246 94. PT2 Bacteriophage T2 9 6 8743 95. PT3 Bacteriophage T3 21 18 16851 96. PT4 Bacteriophage T4 120 67 128557 97. PT5 Bacteriophage T5 29 26 26837 98. PT6 Bacteriophage T6 4 3 2938 99. PT7 Bacteriophage T7 41 18 47176 100. PVK Bacteriophage VK 1 1 246 101. PX1 Bacteriophage phi-X174 40 15 7239 102. PZA Bacteriophage PZA 3 1 19366 103. R17 Bacteriophage R17 9 7 463 104. RB1 Bacteriophage RB18 1 1 674 105. RB5 Bacteriophage RB51 1 1 700 106. RHO Bacteriophage Rho11s 2 1 2187 107. S13 Bacteriophage S13 2 1 5386 108. SF6 Bacteriophage SF6 1 1 996 109. SP1 Bacteriophage SPO1 20 20 6864 110. SP2 Bacteriophage SPO2 1 1 3040 111. SP6 Bacteriophage SP6 5 4 2948 112. SP8 Bacteriophage SP82 5 5 1527 113. SPB Bacteriophage SP-beta 4 3 2224 114. SPC Bacteriophage S-phi-C 1 1 1377 115. SPP Bacteriophage SPP1 2 2 1558 116. SPR Bacteriophage SPR 3 1 2129 117. ST1 Bacteriophage ST-1 2 2 844 118. T12 Bacteriophage T12 1 1 1837 119. TH1 Bacteriophage TH1 1 1 220 120. TW1 Bacteriophage TW19 1 1 76 121. TW2 Bacteriophage TW28 1 1 260 Total 880 593 682556 SYNTHETIC Key Name Reports Entries Bases ------------------------------------------------------------------------------- 1. ACC Cloning vector 1 1 1337 2. AD2 Artificial gene 1 1 128 3. ADB Artificial gene 8 8 573 4. ADH Artificial gene 1 1 106 5. ADL Artificial gene 3 3 273 6. ADV Artificial gene 1 1 106 7. ALM Avian myeloblastosis virus 1 1 337 8. ALR Rous sarcoma virus 4 4 413 9. AMH Artificial gene 1 1 234 10. APH Artificial gene 1 1 400 11. ARB Artificial gene 3 3 1180 12. ARC Cloning vector 2 2 760 13. ARE Artificial gene 1 1 255 14. ARG Artificial gene 1 1 249 15. ARH Artificial gene 5 5 440 16. ARI Artificial gene 1 1 465 17. ARL Artificial gene 2 2 424 18. ARM Artificial gene 1 1 457 19. ARN Cloning vector 1 1 333 20. ARP Cloning vector 6 6 1079 21. 
ARS Artificial gene 1 1 529 22. ART Artificial gene 1 1 60 23. ARY Artificial gene 1 1 264 24. ATH Artificial gene 1 1 417 25. BAM Artificial gene 1 1 85 26. BKV BK Virus 6 3 1560 27. BOV Bos taurus 18 18 6440 28. BSF Cloning vector 2 2 54 29. BSM Cloning vector 1 1 54 30. BSU Bacillus subtilis 10 9 3921 31. BTH Artificial gene 2 2 104 32. CAR Artificial gene 1 1 3616 33. CEL Caenorhabditis elegans 1 1 186 34. CHK Gallus sp. 4 4 701 35. CHS Artificial gene 1 1 478 36. CMVMUS Artificial gene 1 1 1376 37. COT Artificial gene 1 1 7876 38. CRO Artificial gene 2 2 198 39. CVC Cloning vector 1 1 46 40. CVE Cloning vector 1 1 60 41. CVJ Cloning vector 5 5 390 42. CVK Cloning vector 1 1 120 43. CYN Artificial gene 1 1 282 44. DRO Drosophila sp. 4 4 4200 45. E6V Artificial gene 2 2 206 46. ECO Escherichia coli 124 111 24710 47. EGF Artificial gene 1 1 299 48. ERY Artificial gene 1 1 217 49. EXP Cloning vector 2 2 123 50. EZZ Cloning vector 1 1 60 51. FCN Artificial gene 1 1 42 52. FCS Cloning vector 2 2 136 53. FLA Artificial gene 1 1 69 54. FLU Influenza virus 6 6 861 55. FSB Artificial gene 2 2 747 56. GFA Artificial gene 1 1 176 57. HAL Artificial gene 4 3 1633 58. HBV Hepatitis B virus 3 3 315 59. HCY Artificial gene 1 1 313 60. HET Hetropolymeric DNA 2 2 594 61. HIR Artificial gene 1 1 220 62. HIV Artificial gene 6 6 879 63. HL1 Artificial gene 4 4 238 64. HNR Artificial gene 1 1 90 65. HPB Artificial gene 1 1 556 66. HS1 Artificial gene 1 1 780 67. HS2 Artificial gene 2 2 129 68. HS5 Human cytomegalovirus 1 1 210 69. HSV Herpes Simplex Virus 6 6 323 70. HUM Artificial human gene 80 71 22856 71. HY3 Plasmid pHY300PLK 1 1 4870 72. IFH Cloning vector 1 1 63 73. IL1 Artificial gene 1 1 88 74. INS Artificial gene 1 1 232 75. ISN Insertion element 6 6 378 76. JRD Cloning vector 3 3 6852 77. KAN Cloning vector 3 3 210 78. KPN Klebsiella pneumoniae 2 2 354 79. KY1 Artificial gene 1 1 171 80. LAC Cloning vector 2 2 1173 81. LAM Bacteriophgage lambda 4 4 336 82. 
LET Artificial gene 1 1 212 83. LGT Cloning vector lambda gt11 1 1 210 84. LHM Artificial gene 1 1 232 85. LOR Cloning vector 1 1 5614 86. M13 Cloning vector M13 8 8 643 87. M13MP7 Cloning vector M13mp7 2 2 120 88. M13MP8 Cloning vector M13mp8 1 1 382 89. M13MP9 Cloning vector M13mp9 1 1 60 90. M13TG103 Cloning vector M13tg103 1 1 66 91. M13TG114 Cloning vector M13tg114 1 1 60 92. M13TG115 Cloning vector M13tg115 1 1 66 93. M13TG117 Cloning vector M13tg117 1 1 63 94. M13TG120 Cloning vector M13tg120 1 1 54 95. M13TG130 Cloning vector M13tg130 1 1 93 96. M13TG131 Cloning vector M13tg131 1 1 93 97. MBO Artificial gene 1 1 91 98. MBR Artificial gene 3 3 157 99. MCA Cauliflower mosaic virus 2 2 139 100. MCV Cucumber mosaic virus 5 5 284 101. MHI Mouse-human hybrid 4 4 1574 102. MLE Artificial gene 1 1 936 103. MLF Artificial gene 1 1 213 104. MLM Artificial gene 2 2 178 105. MML Cloning vector 12 4 24042 106. MNV Artificial gene 1 1 87 107. MP7 Artificial gene 2 1 69 108. MP8 Artificial gene 2 1 60 109. MP9 Artificial gene 2 1 60 110. MS2 Artificial gene 1 1 100 111. MSM Artificial gene 2 2 331 112. MUS Mus musculus 38 38 4298 113. NEU Artificial gene 2 2 171 114. NNL Plasmid pNNL 1 1 815 115. NPA Autographa californica nuclear polyhedrosis virus 3 3 922 116. OVC Artificial gene 1 1 738 117. P17X Plasmid pACYC177 7 6 4190 118. P18X Plasmid pACYC184 5 4 4593 119. P23 Artificial gene 1 1 119 120. PAC Cloning vector 1 1 83 121. PAH Artificial gene 2 2 107 122. PBD Cloning vector 1 1 79 123. PBG Cloning vector 6 3 12379 124. PBR Plasmid pBR322 43 23 7123 125. PBR313 Plasmid pBR313 1 1 200 126. PBR322SV Plasmid pBR322/SV40 hybrid 5 5 209 127. PBR325 Plasmid pBR325 3 3 319 128. PBR327 Plasmid pBR327 3 3 3334 129. PBR329 Plasmid pBR329 1 1 4150 130. PBR345 Plasmid pBR345 2 2 1024 131. PBRH4 Plasmid pBRH4 1 1 71 132. PCE Cloning vector 1 1 510 133. PCG86 Plasmid pCG86 2 2 654 134. PCZ Plasmid pCZ 2 2 208 135. PDPL Plasmid PDPL13 1 1 79 136. PEA Artificial gene 1 1 1004 137. 
PEM Cloning vector pEMBL8m 4 2 7878 138. PES Cloning vector 1 1 99 139. PF1 Bacteriophage f1 1 1 254 140. PFD Artificial gene 1 1 85 141. PFE Plasmid pFE 2 2 180 142. PFH Plasmid pFH 1 1 120 143. PFL Cloning vector 2 1 4588 144. PFR Plasmid pFR 4 4 341 145. PFX Artificial gene 2 1 3627 146. PHP Plasmid pHP45 1 1 155 147. PHS Plamsid pHS 3 3 2877 148. PHV100 Cloning vector 1 1 396 149. PHV33 Artificial gene 13 13 650 150. PIC Plasmid pIC 5 5 477 151. PIG Artificial pig gene 3 3 440 152. PIP1088 Plasmid pIP1088 2 2 142 153. PIVX Cloning vector pi-VX 1 1 902 154. PJSC73 Plasmid pJSC73 1 1 3564 155. PK18 Cloning vector 1 1 2661 156. PKN Plasmodium knowlesi 2 1 360 157. PKT Artificial gene 3 3 264 158. PKU Cloning vector 3 2 7825 159. PL2 Artificial gene 2 2 240 160. PL5 Artificial gene 2 2 310 161. PLB Cloning vector 1 1 852 162. PLF Cloning vector 2 1 3641 163. PLY Artificial gene 1 1 66 164. PMB9 Cloning vector 1 1 138 165. PMC1843 Plasmid pMC1843 1 1 62 166. PMK20 Artificial gene 2 1 4028 167. PMT Cloning vector 1 1 2854 168. PMU Artificial gene 4 4 576 169. POG Cloning vector 1 1 352 170. POL Artificial gene 2 2 129 171. POLY Cloning vector 6 3 6226 172. PORI17 Plasmid pOri17 2 2 490 173. PPI Cloning vector 1 1 4734 174. PPUC Cloning vector 1 1 75 175. PQB Artificial gene 1 1 64 176. PRK Cloning vector 2 2 839 177. PRT Artificial gene 1 1 711 178. PRW1707 Plasmid pRW1707 1 1 66 179. PRW1718 Plasmid pRW1718 1 1 72 180. PRW1724 Plasmid pRW1724 1 1 66 181. PRW1725 Plasmid pRW1725 1 1 66 182. PSE Artificial gene 2 2 139 183. PSI Cloning vector 1 1 81 184. PSKS104 Plasmid pSKS104 1 1 69 185. PSKS105 Plasmid pSKS105 1 1 60 186. PSKS106 Plasmid pSKS106 1 1 60 187. PSKS107 Plasmid pSKS107 1 1 46 188. PSMF Cloning vector 2 2 259 189. PSP Cloning vector 3 3 215 190. PSR Artificial gene 1 1 138 191. PSS Cloning vector 2 2 475 192. PT4 Bacteriophage T4 4 4 725 193. PT7 Bacteriophage T7 2 2 282 194. PTK Plasmid pTK 1 1 68 195. PTL Cloning vector 1 1 51 196. 
PTN Plasmid pTN 1 1 355 197. PTR Plasmid pTr 1 1 137 198. PTU Artificial gene 4 4 883 199. PTZ Plasmid pTZ12 1 1 2517 200. PUC Cloning vector 2 1 3914 201. PUEX Cloning vector 2 1 6728 202. PVH51 Plasmid pVH51 1 1 3847 203. PX1 Bacteriophage phi-X174 1 1 59 204. PYM Artificial gene 1 1 252 205. PYR Artificial gene 1 1 158 206. PZ189 Cloning vector 1 1 153 207. R38 Plasmid R388 1 1 1167 208. R67 Plasmid R67 1 1 353 209. R6K Cloning vector 2 2 176 210. RAD Artificial gene 1 1 955 211. RAT Rattus sp. 14 13 1864 212. RET Cloning vector 2 2 780 213. RMT Artificial gene 6 6 638 214. RNA Artificial gene 1 1 328 215. ROT Artificial gene 2 2 141 216. RRNA Artificial gene 1 1 136 217. RSC1 Plasmid Rsc13 3 1 7894 218. RSF1050 Plasmid RSF1050 1 1 104 219. RSP Artificial gene 1 1 100 220. RSV Rous Sarcoma Virus 3 3 450 221. RTS Artificial gene 1 1 280 222. S100 Artificial gene 1 1 283 223. SAA Bacteriophage sigma-11-AA248 1 1 83 224. SAU Staphylcoccus aureus 1 1 60 225. SFV Semliki forest virus 3 3 171 226. SHI Cloning vector 3 3 428 227. SHU Artificial gene 3 3 272 228. SIN Cloning vector 2 2 596 229. SLM Artificial gene 10 10 817 230. SOMINS Artificial gene 1 1 226 231. SOP Artificial gene 1 1 111 232. SP02 Bacteriophage SP02 1 1 487 233. SP6 Artificial gene 1 1 78 234. SPI Artificial gene 2 2 295 235. SRU Artificial gene 1 1 252 236. STA Artificial gene 4 4 619 237. STM Artificial gene 1 1 71 238. STY Salmonella sp. 1 1 135 239. SV4 Simian Virus 40 13 13 2415 240. SVA Artificial gene 1 1 213 241. SYN Plasmid pDSP1 161 146 49857 242. SYN Synthetic sequence 51 48 131042 243. T13 Artificial gene 1 1 223 244. T4L Artificial gene 1 1 518 245. TAC Artificial gene 1 1 842 246. THA Artificial gene 1 1 641 247. THY Plasmid pUC8 2 1 503 248. TI Plasmid Ti 4 3 6552 249. TN3 Artificial gene 1 1 110 250. TNP Artificial gene 1 1 192 251. TNS Cloning vector 2 2 144 252. TOB Artificial gene 1 1 788 253. TRN28 Cloning vector 2 2 284 254. TRN3 Transposon Tn3 10 10 1119 255. 
TRN5 Artificial gene 1 1 80 256. TRNB Artificial gene 1 1 84 257. TU4 Cloning vector 2 2 350 258. VAC Cloning vector 4 4 683 259. VCH Artificial gene 1 1 444 260. VEC Cloning vector 1 1 143 261. VTR Cloning vector 1 1 148 262. WHL Artificial gene 1 1 507 263. XEL Xenopus laevis 13 13 1227 264. YSC Saccharomyces cerevisiae 41 40 10154 265. YSE Artificial gene 2 1 1795 266. YST Artificial gene 1 1 82 267. ZMO Artificial gene 3 3 740 Total 1129 1028 516186 UNANNOTATED Key Name Reports Entries Bases ------------------------------------------------------------------------------- 1. Unidentified 4909 3756 4792964 Total 4909 3756 4792964
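
Users who wish to check the Total line of a division against its rows may find the following Python sketch useful. It is illustrative only and is not part of the GenBank distribution. It assumes each row has the shape "number, key, name, reports, entries, bases" and treats the last three integers on a line as the counts, so names that end in a digit (such as "type 3") are kept whole. The names Row, parse_rows and column_totals are hypothetical, and rows without a key (the single UNANNOTATED row) are not handled.

```python
import re
from typing import List, NamedTuple, Tuple


class Row(NamedTuple):
    key: str
    name: str
    reports: int
    entries: int
    bases: int


# Assumed row shape: "<n>. <KEY> <name ...> <reports> <entries> <bases>".
# The name is matched lazily, so the final three integers become the counts.
ROW_RE = re.compile(r"^\s*\d+\.\s+(\S+)\s+(.*?)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_rows(text: str) -> List[Row]:
    """Parse table rows; header, ruler and Total lines do not match and are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            key, name, reports, entries, bases = m.groups()
            rows.append(Row(key, name, int(reports), int(entries), int(bases)))
    return rows


def column_totals(rows: List[Row]) -> Tuple[int, int, int]:
    """Sum the Reports, Entries and Bases columns."""
    return (
        sum(r.reports for r in rows),
        sum(r.entries for r in rows),
        sum(r.bases for r in rows),
    )


# A few rows copied from the tables above, used only as sample input.
SAMPLE = """\
1. AL3 Bacteriophage alpha3 7 5 1299
2. BAZ Bacteriophage Z 1 1 370
3. BBF Bacteriophage BF23 3 3 1434
294. PIB Bovine parainfluenza virus type 3 3 1 8700
"""

if __name__ == "__main__":
    print(column_totals(parse_rows(SAMPLE)))
```

Running the same two functions over all rows of one division and comparing the result with that division's Total line gives a quick consistency check.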