
         UniProtKB/Swiss-Prot protein knowledgebase release 57.15 statistics


1.  INTRODUCTION

Release 57.15 of 02-Mar-10 of UniProtKB/Swiss-Prot contains 515203 sequence entries,
comprising 181334896 amino acids abstracted from 187376 references. 

463 sequences have been added since release 57.14, the sequence data of
84 existing entries has been updated and the annotations of
471856 entries have been revised.

Number of fragments: 8451
Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 28874


Protein existence (PE):           entries     %

1: Evidence at protein level        68618   13.3%
2: Evidence at transcript level     66497   12.9%
3: Inferred from homology          364268   70.7%
4: Predicted                        14285    2.8%
5: Uncertain                         1535    0.3%
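
The percentages in the PE table can be recomputed from the entry counts. The following is a minimal sketch; the names `pe_counts` and `pe_percentages` are illustrative and not part of any UniProt tool, and rounding to one decimal place is assumed from the layout of the table.

```python
# Entry counts per protein existence (PE) level, copied from the table above.
pe_counts = {
    "1: Evidence at protein level": 68618,
    "2: Evidence at transcript level": 66497,
    "3: Inferred from homology": 364268,
    "4: Predicted": 14285,
    "5: Uncertain": 1535,
}

# The five levels together should account for every entry in the release.
total_entries = sum(pe_counts.values())


def pe_percentages(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


for label, pct in pe_percentages(pe_counts).items():
    print(f"{label:<34}{pe_counts[label]:>7}{pct:>7.1f}%")
```

The five counts add up to the 515203 entries quoted in the introduction, so every entry carries exactly one PE level.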

The growth of the database is summarized below.

   


2.  TAXONOMIC ORIGIN

   Total number of species represented in this release of UniProtKB/Swiss-Prot: 12042

   The first twenty species represent 107335 sequences:  20.8 % of the total
   number of entries.
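
   The share quoted above can be reproduced from the first twenty rows of
   table 2.2 and the release total given in the introduction. This is only a
   sketch; the variable names are illustrative.

```python
# Frequencies of the twenty most represented species (table 2.2, rows 1-20).
top20 = [20265, 16224, 8876, 7483, 6558, 5748, 4974, 4368, 4258, 4137,
         3284, 3216, 3058, 2608, 2369, 2208, 2153, 1993, 1782, 1773]

TOTAL_ENTRIES = 515203  # sequence entries in release 57.15

top20_sequences = sum(top20)
top20_share = 100 * top20_sequences / TOTAL_ENTRIES

print(f"{top20_sequences} sequences: {top20_share:.1f} % of the total")
```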


   2.1 Table of the frequency of occurrence of species

        Species represented 1x: 5233
                            2x: 1698
                            3x:  894
                            4x:  573
                            5x:  422
                            6x:  349
                            7x:  244
                            8x:  204
                            9x:  183
                           10x:  103
                       11- 20x:  583
                       21- 50x:  368
                       51-100x:  175
                         >100x: 1013


   2.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1      20265  Homo sapiens (Human)
       2      16224  Mus musculus (Mouse)
       3       8876  Arabidopsis thaliana (Mouse-ear cress)
       4       7483  Rattus norvegicus (Rat)
       5       6558  Saccharomyces cerevisiae (Baker's yeast)
       6       5748  Bos taurus (Bovine)
       7       4974  Schizosaccharomyces pombe (Fission yeast)
       8       4368  Escherichia coli (strain K12)
       9       4258  Bacillus subtilis
      10       4137  Dictyostelium discoideum (Slime mold)
      11       3284  Caenorhabditis elegans
      12       3216  Xenopus laevis (African clawed frog)
      13       3058  Drosophila melanogaster (Fruit fly)
      14       2608  Danio rerio (Zebrafish) (Brachydanio rerio)
      15       2369  Oryza sativa subsp. japonica (Rice)
      16       2208  Pongo abelii (Sumatran orangutan)
      17       2153  Gallus gallus (Chicken)
      18       1993  Escherichia coli O157:H7
      19       1782  Methanocaldococcus jannaschii (Methanococcus jannaschii)
      20       1773  Haemophilus influenzae
      21       1767  Salmonella typhimurium
      22       1668  Escherichia coli O6
      23       1666  Shigella flexneri
      24       1561  Mycobacterium tuberculosis
      25       1520  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
      26       1364  Sus scrofa (Pig)
      27       1341  Salmonella typhi
      28       1273  Pseudomonas aeruginosa
      29       1213  Mycobacterium bovis
      30       1159  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
      31       1015  Synechocystis sp. (strain PCC 6803)
      32        995  Yersinia pestis
      33        991  Archaeoglobus fulgidus
      34        940  Vibrio cholerae
      35        929  Salmonella paratyphi A
      36        923  Staphylococcus aureus (strain N315)
      37        922  Staphylococcus aureus (strain Mu50 / ATCC 700699)
      38        912  Rhizobium meliloti (Sinorhizobium meliloti)
      39        909  Acanthamoeba polyphaga mimivirus (APMV)
      40        896  Staphylococcus aureus (strain COL)
      41        894  Staphylococcus aureus (strain MW2)
      42        888  Staphylococcus aureus (strain MSSA476)
      43        885  Staphylococcus aureus (strain MRSA252)
      44        882  Oryctolagus cuniculus (Rabbit)
      45        879  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
      46        879  Salmonella choleraesuis
      47        869  Shigella sonnei (strain Ss046)
      48        863  Yersinia pseudotuberculosis
      49        835  Escherichia coli O9:H4 (strain HS)
      50        829  Escherichia coli O139:H28 (strain E24377A / ETEC)
      51        824  Shigella boydii serotype 4 (strain Sb227)
      52        818  Escherichia coli (strain UTI89 / UPEC)
      53        817  Ashbya gossypii (Yeast) (Eremothecium gossypii)
      54        814  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
      55        800  Shigella dysenteriae serotype 1 (strain Sd197)
      56        795  Candida albicans (Yeast)
      57        794  Vibrio parahaemolyticus
      58        789  Kluyveromyces lactis (Yeast) (Candida sphaerica)
      59        785  Escherichia coli (strain SMS-3-5 / SECEC)
      60        778  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
      61        776  Pasteurella multocida
      62        773  Aquifex aeolicus
      63        771  Neurospora crassa
      64        765  Escherichia coli (strain K12 / DH10B)
      65        764  Canis familiaris (Dog)
      66        759  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
      67        759  Escherichia coli (strain K12 / BW2952)
      68        757  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
      69        757  Escherichia coli (strain 55989 / EAEC)
      70        757  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
      71        756  Escherichia coli O8 (strain IAI1)
      72        756  Staphylococcus epidermidis (strain ATCC 12228)
      73        751  Escherichia coli O45:K1 (strain S88 / ExPEC)
      74        750  Escherichia coli (strain SE11)
      75        750  Shigella flexneri serotype 5b (strain 8401)
      76        748  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
      77        747  Candida glabrata (Yeast) (Torulopsis glabrata)
      78        742  Escherichia coli O157:H7 (strain EC4115 / EHEC)
      79        738  Streptomyces coelicolor
      80        738  Photorhabdus luminescens subsp. laumondii
      81        731  Vibrio vulnificus
      82        730  Bacillus halodurans
      83        726  Escherichia coli O81 (strain ED1a)
      84        722  Bacillus anthracis
      85        722  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
      86        719  Salmonella enteritidis PT4 (strain P125109)
      87        715  Vibrio vulnificus (strain YJ016)
      88        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
      89        713  Yersinia pestis bv. Antiqua (strain Nepal516)
      90        713  Salmonella paratyphi A (strain AKU_12601)
      91        712  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
      92        711  Staphylococcus aureus (strain NCTC 8325)
      93        710  Salmonella newport (strain SL254)
      94        709  Salmonella heidelberg (strain SL476)
      95        709  Yersinia pestis bv. Antiqua (strain Antiqua)
      96        709  Salmonella agona (strain SL483)
      97        708  Salmonella schwarzengrund (strain CVM19633)
      98        706  Escherichia coli O1:K1 / APEC
      99        699  Salmonella dublin (strain CT_02021853)
     100        697  Enterobacter sp. (strain 638)
     101        696  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
     102        696  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
     103        687  Mycoplasma pneumoniae
     104        685  Pan troglodytes (Chimpanzee)
     105        685  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
     106        684  Pseudomonas syringae pv. tomato
     107        682  Salmonella gallinarum (strain 287/91 / NCTC 13346)
     108        682  Klebsiella pneumoniae (strain 342)
     109        676  Anabaena sp. (strain PCC 7120)
     110        670  Pseudomonas putida (strain KT2440)
     111        666  Yersinia pestis (strain Pestoides F)
     112        665  Staphylococcus aureus (strain USA300)
     113        664  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
     114        661  Mycobacterium leprae
     115        658  Rhizobium sp. (strain NGR234)
     116        653  Serratia proteamaculans (strain 568)
     117        651  Zea mays (Maize)
     118        646  Escherichia coli
     119        645  Bradyrhizobium japonicum
     120        641  Staphylococcus aureus (strain bovine RF122 / ET3-1)
     121        638  Bacillus cereus (strain ATCC 14579 / DSM 31)
     122        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
     123        635  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
     124        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
     125        620  Shewanella oneidensis
     126        617  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
     127        615  Treponema pallidum
     128        613  Ralstonia solanacearum (Pseudomonas solanacearum)
     129        608  Staphylococcus haemolyticus (strain JCSC1435)
     130        608  Enterobacter sakazakii (strain ATCC BAA-894)
     131        602  Rhizobium loti (Mesorhizobium loti)
     132        602  Staphylococcus saprophyticus subsp. saprophyticus 
     133        600  Methanobacterium thermoautotrophicum
     134        598  Yersinia pestis bv. Antiqua (strain Angola)
     135        598  Salmonella paratyphi C (strain RKS4594)
     136        598  Emericella nidulans (Aspergillus nidulans)
     137        596  Listeria monocytogenes
     138        595  Photobacterium profundum (Photobacterium sp. (strain SS9))
     139        593  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
     140        592  Yarrowia lipolytica (Candida lipolytica)
     141        590  Bacillus cereus (strain ATCC 10987)
     142        589  Xanthomonas campestris pv. campestris
     143        588  Listeria innocua
     144        585  Rickettsia prowazekii
     145        584  Helicobacter pylori (Campylobacter pylori)
     146        582  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
     147        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
     148        579  Neisseria meningitidis serogroup B
     149        576  Brucella suis
     150        572  Brucella melitensis
     151        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
     152        567  Bacillus thuringiensis subsp. konkukian
     153        565  Helicobacter pylori J99 (Campylobacter pylori J99)
     154        562  Buchnera aphidicola subsp. Schizaphis graminum
     155        560  Bacillus cereus (strain ZK / E33L)
     156        560  Pseudomonas syringae pv. syringae (strain B728a)
     157        557  Pseudomonas aeruginosa (strain UCBPP-PA14)
     158        556  Neisseria meningitidis serogroup A
     159        555  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
     160        555  Xanthomonas axonopodis pv. citri (Citrus canker)
     161        553  Vibrio fischeri (strain ATCC 700601 / ES114)
     162        551  Pseudomonas fluorescens (strain Pf0-1)
     163        549  Oceanobacillus iheyensis
     164        545  Caulobacter crescentus (Caulobacter vibrioides)
     165        545  Clostridium acetobutylicum
     166        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
     167        538  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
     168        529  Listeria monocytogenes serotype 4b (strain F2365)
     169        523  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
     170        522  Sodalis glossinidius (strain morsitans)
     171        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
     172        521  Xylella fastidiosa
     173        519  Streptococcus pneumoniae
     174        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
     175        510  Chromobacterium violaceum
     176        509  Thermotoga maritima
     177        509  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
     178        507  Bordetella parapertussis
     179        507  Buchnera aphidicola subsp. Baizongia pistaciae
     180        507  Pseudomonas aeruginosa (strain PA7)
     181        505  Bordetella pertussis
     182        504  Haemophilus ducreyi
     183        504  Geobacillus kaustophilus
     184        503  Staphylococcus aureus (strain Newman)
     185        500  Pseudomonas entomophila (strain L48)
     186        498  Brucella abortus
     187        497  Rickettsia conorii
     188        496  Bacillus clausii (strain KSM-K16)
     189        492  Haemophilus influenzae (strain 86-028NP)
     190        492  Deinococcus radiodurans
     191        490  Xanthomonas campestris pv. campestris (strain 8004)
     192        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
     193        490  Clostridium perfringens
     194        488  Bacillus amyloliquefaciens (strain FZB42)
     195        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
     196        487  Shewanella sp. (strain MR-7)
     197        485  Aspergillus fumigatus (Sartorya fumigata)
     198        484  Pseudomonas aeruginosa (strain LESB58)
     199        484  Shewanella sp. (strain MR-4)
     200        483  Mannheimia succiniciproducens (strain MBEL55E)
     201        483  Mycoplasma genitalium
     202        483  Staphylococcus aureus (strain Mu3 / ATCC 700698)
     203        482  Streptomyces avermitilis
     204        481  Corynebacterium glutamicum (Brevibacterium flavum)
     205        480  Proteus mirabilis (strain HI4320)
     206        480  Caenorhabditis briggsae
     207        478  Oryza sativa subsp. indica (Rice)
     208        475  Methanosarcina acetivorans
     209        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
     210        472  Burkholderia sp. (strain 383) (Burkholderia cepacia 
     211        472  Pseudomonas putida (strain F1 / ATCC 700007)
     212        472  Brucella abortus (strain 2308)
     213        472  Thermosynechococcus elongatus (strain BP-1)
     214        468  Enterococcus faecalis (Streptococcus faecalis)
     215        466  Acinetobacter sp. (strain ADP1)
     216        465  Pyrococcus horikoshii
     217        465  Xanthomonas campestris pv. vesicatoria (strain 85-10)
     218        465  Pseudomonas putida (strain GB-1)
     219        464  Rhodopseudomonas palustris
     220        464  Shewanella frigidimarina (strain NCIMB 400)
     221        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
     222        462  Shewanella sp. (strain ANA-3)
     223        461  Burkholderia mallei (Pseudomonas mallei)
     224        460  Ralstonia eutropha  (Cupriavidus necator 
     225        458  Lactobacillus plantarum
     226        457  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
     227        457  Pyrococcus abyssi
     228        457  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
     229        456  Methanosarcina mazei (Methanosarcina frisia)
     230        455  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
     231        454  Staphylococcus aureus (strain JH1)
     232        453  Rickettsia felis (Rickettsia azadi)
     233        453  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
     234        452  Shewanella baltica (strain OS185)
     235        452  Pseudomonas putida (strain W619)
     236        452  Halobacterium salinarium (Halobacterium halobium)
     237        448  Staphylococcus aureus (strain JH9)
     238        448  Thermoanaerobacter tengcongensis
     239        448  Streptococcus mutans
     240        447  Methylococcus capsulatus
     241        447  Aeromonas salmonicida (strain A449)
     242        446  Ovis aries (Sheep)
     243        446  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
     244        445  Vibrio fischeri (strain MJ11)
     245        444  Pseudomonas mendocina (strain ymp)
     246        443  Hahella chejuensis (strain KCTC 2396)
     247        441  Streptococcus pyogenes serotype M6
     248        441  Dechloromonas aromatica (strain RCB)
     249        440  Pyrococcus furiosus
     250        439  Nicotiana tabacum (Common tobacco)


   
   2.3  Taxonomic distribution of the sequences

   

   Kingdom        sequences (% of the database)
    Archaea           18183 (  4%)
    Bacteria         323233 ( 63%)
    Eukaryota        158932 ( 31%)
    Viruses           14855 (  3%)


   Within Eukaryota:

   

    Category            sequences (% of Eukaryota) (% of the complete database)
     Human                  20266 ( 13%)           (  4%)
     Other Mammalia         44572 ( 28%)           (  9%)
     Other Vertebrata       15995 ( 10%)           (  3%)
     Viridiplantae          28783 ( 18%)           (  6%)
     Fungi                  25145 ( 16%)           (  5%)
     Insecta                 7719 (  5%)           (  1%)
     Nematoda                4041 (  3%)           (  1%)
     Other                  12411 (  8%)           (  2%)
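
   Each row of the Eukaryota table carries two percentages with different
   denominators: the Eukaryota subtotal and the complete database. The sketch
   below recomputes both; rounding to whole percents is assumed from the
   table, and the helper name `whole_percent` is illustrative.

```python
TOTAL_ENTRIES = 515203  # all entries in release 57.15

# Sequence counts per category, copied from the table above.
eukaryota = {
    "Human": 20266,
    "Other Mammalia": 44572,
    "Other Vertebrata": 15995,
    "Viridiplantae": 28783,
    "Fungi": 25145,
    "Insecta": 7719,
    "Nematoda": 4041,
    "Other": 12411,
}


def whole_percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


eukaryota_total = sum(eukaryota.values())

# Map each category to (% of Eukaryota, % of the complete database).
rows = {
    name: (whole_percent(n, eukaryota_total), whole_percent(n, TOTAL_ENTRIES))
    for name, n in eukaryota.items()
}

for name, (of_euk, of_db) in rows.items():
    print(f"{name:<20}{eukaryota[name]:>8} ({of_euk:>3}%)   ({of_db:>3}%)")
```

   Because each value is rounded independently, the "% of Eukaryota" column
   sums to 101 rather than 100; the underlying counts still add up to the
   158932 eukaryotic sequences listed in the kingdom table.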



3.  SEQUENCE SIZE

   Distribution of the sequences by size (excluding fragments)

               From   To  Number             From   To   Number
                  1-  50    8374             1001-1100     3461
                 51- 100   39851             1101-1200     2393
                101- 150   55768             1201-1300     1901
                151- 200   55885             1301-1400     1784
                201- 250   54401             1401-1500     1419
                251- 300   47843             1501-1600      630
                301- 350   48255             1601-1700      496
                351- 400   41252             1701-1800      409
                401- 450   33736             1801-1900      390
                451- 500   27114             1901-2000      322
                501- 550   19180             2001-2100      193
                551- 600   13689             2101-2200      261
                601- 650   11453             2201-2300      274
                651- 700    8151             2301-2400      168
                701- 750    6789             2401-2500      129
                751- 800    4797             >2500         1000
                801- 850    4121
                851- 900    4739
                901- 950    3605
                951-1000    2519
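
   The size table places two range/count pairs on each row, which makes it
   awkward to read programmatically. The sketch below shows one way to parse
   that layout and check it against the totals in the introduction. The
   regular expressions and the function name `parse_size_bins` are
   assumptions made for this example, not part of any UniProt tool.

```python
import re

# Size table copied from the text above (two range/count pairs per row).
SIZE_TABLE = """\
   1-  50    8374             1001-1100     3461
  51- 100   39851             1101-1200     2393
 101- 150   55768             1201-1300     1901
 151- 200   55885             1301-1400     1784
 201- 250   54401             1401-1500     1419
 251- 300   47843             1501-1600      630
 301- 350   48255             1601-1700      496
 351- 400   41252             1701-1800      409
 401- 450   33736             1801-1900      390
 451- 500   27114             1901-2000      322
 501- 550   19180             2001-2100      193
 551- 600   13689             2101-2200      261
 601- 650   11453             2201-2300      274
 651- 700    8151             2301-2400      168
 701- 750    6789             2401-2500      129
 751- 800    4797             >2500         1000
 801- 850    4121
 851- 900    4739
 901- 950    3605
 951-1000    2519
"""

RANGE_BIN = re.compile(r"(\d+)-\s*(\d+)\s+(\d+)")  # e.g. "51- 100   39851"
OPEN_BIN = re.compile(r">(\d+)\s+(\d+)")           # e.g. ">2500   1000"


def parse_size_bins(text):
    """Return (low, high, count) tuples; high is None for the open-ended bin."""
    bins = []
    for line in text.splitlines():
        for low, high, count in RANGE_BIN.findall(line):
            bins.append((int(low), int(high), int(count)))
        for low, count in OPEN_BIN.findall(line):
            bins.append((int(low) + 1, None, int(count)))
    return sorted(bins, key=lambda b: b[0])


bins = parse_size_bins(SIZE_TABLE)
non_fragment_total = sum(count for _, _, count in bins)
print(len(bins), non_fragment_total)
```

   The bin counts sum to 506752, which equals the 515203 entries of the
   release minus the 8451 fragments reported in the introduction.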

   


   The average sequence length in UniProtKB/Swiss-Prot is 351 amino acids.
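
   As a rough cross-check, the average can be compared with the totals quoted
   in the introduction. The text does not state how the figure was derived,
   so the sketch below only shows that the integer part of the ratio of total
   amino acids to total entries agrees with the reported value; truncation
   (rather than rounding) is an assumption.

```python
TOTAL_AMINO_ACIDS = 181334896  # amino acids in release 57.15
TOTAL_ENTRIES = 515203         # sequence entries in release 57.15

exact_average = TOTAL_AMINO_ACIDS / TOTAL_ENTRIES       # floating-point ratio
truncated_average = TOTAL_AMINO_ACIDS // TOTAL_ENTRIES  # integer part only

print(f"exact: {exact_average:.2f}, truncated: {truncated_average}")
```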

   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
   The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.


4.  JOURNAL CITATIONS

   Note: the following citation statistics reflect the number of distinct
         journal citations.

   Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2048


   4.1 Table of the frequency of journal citations

        Journals cited 1x:  659
                       2x:  282
                       3x:  139
                       4x:  103
                       5x:   87
                       6x:   62
                       7x:   35
                       8x:   39
                       9x:   39
                      10x:   24
                  11- 20x:  161
                  21- 50x:  165
                  51-100x:   96
                    >100x:  157


   4.2  List of the most cited journals in UniProtKB/Swiss-Prot

   Nb    Citations   Journal name
   --    ---------   -------------------------------------------------------------
    1        17746   Journal of Biological Chemistry
    2         8225   Proceedings of the National Academy of Sciences of the U.S.A.
    3         4987   Journal of Bacteriology
    4         4491   Gene
    5         4481   Biochemical and Biophysical Research Communications
    6         4290   Nucleic Acids Research
    7         3933   FEBS Letters
    8         3791   Biochemistry
    9         3713   The EMBO Journal
   10         3382   Molecular and Cellular Biology
   11         3199   Nature
   12         3082   European Journal of Biochemistry
   13         2999   Journal of Molecular Biology
   14         2959   Biochimica et Biophysica Acta
   15         2646   Cell
   16         2471   Genomics
   17         2155   Biochemical Journal
   18         2100   Science
   19         2024   Journal of Virology
   20         1747   Molecular Microbiology
   21         1556   Journal of Cell Biology
   22         1489   Plant Molecular Biology
   23         1353   Genes and Development
   24         1347   Virology
   25         1304   Nature Genetics
   26         1303   Molecular and General Genetics
   27         1301   Human Molecular Genetics
   28         1286   Plant Physiology
   29         1199   The American Journal of Human Genetics
   30         1167   Oncogene
   31         1154   Journal of Biochemistry
   32         1139   Development
   33         1082   Human Mutation
   34         1004   Molecular Biology of the Cell
   35         1001   Journal of Immunology
   36          973   Genetics
   37          879   Structure
   38          868   Journal of General Virology
   39          864   Infection and Immunity
   40          840   The Plant Cell
   41          814   Archives of Biochemistry and Biophysics
   42          793   Molecular Cell
   43          790   Blood
   44          756   Yeast
   45          743   Microbiology
   46          718   Developmental Biology
   47          718   The Plant Journal
   48          714   Journal of Cell Science
   49          662   Cancer Research
   50          648   FEMS Microbiology Letters
   51          635   Current Biology
   52          590   Human Genetics
   53          586   Mechanisms of Development
   54          585   Nature Structural Biology
   55          538   Acta Crystallographica, Section D
   56          533   Protein Science
   57          527   Journal of Neuroscience
   58          523   Current Genetics
   59          519   Applied and Environmental Microbiology
   60          504   Toxicon
   61          499   Journal of Clinical Investigation
   62          496   Neuron
   63          469   Mammalian Genome
   64          452   American Journal of Physiology
   65          445   Immunogenetics
   66          440   The Journal of Experimental Medicine
   67          436   Molecular Endocrinology
   68          419   Molecular and Biochemical Parasitology
   69          407   Journal of Neurochemistry
   70          396   The Journal of Clinical Endocrinology and Metabolism
   71          385   Endocrinology
   72          376   Journal of Molecular Evolution
   73          365   DNA and Cell Biology
   74          355   Proteins
   75          354   DNA Sequence
   76          351   Molecular Biology and Evolution
   77          350   Bioscience, Biotechnology, and Biochemistry
   78          346   Journal of Medical Genetics
   79          314   Brain Research. Molecular Brain Research
   80          292   Plant and Cell Physiology
   81          290   Experimental Cell Research
   82          289   Biological Chemistry Hoppe-Seyler
   83          288   Peptides
   84          287   Nature Cell Biology
   85          285   Comparative Biochemistry and Physiology
   86          280   Tissue Antigens
   87          279   Antimicrobial Agents and Chemotherapy
   88          277   Journal of Investigative Dermatology
   89          274   Cytogenetics and Cell Genetics
   90          267   Molecular Pharmacology
   91          255   Biology of Reproduction
   92          247   Journal of General Microbiology
   93          245   Genome Research
   94          241   Neurology
   95          239   RNA
   96          238   Developmental Dynamics
   97          237   Developmental Cell
   98          231   Virus Research
   99          215   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
  100          205   DNA Research
  101          204   Planta
  102          203   European Journal of Immunology
  103          202   Molecular Plant-Microbe Interactions
  104          199   Biochimie
  105          199   Annals of Neurology
  106          193   European Journal of Human Genetics
  107          193   Genes to Cells
  108          189   Eukaryotic Cell
  109          181   Immunity
  110          179   Journal of Human Genetics
  111          173   The New England Journal of Medicine
  112          171   Molecular and Cellular Endocrinology
  113          168   Nature Structural and Molecular Biology
  114          167   Investigative Ophthalmology and Visual Science
  115          164   Archives of Microbiology
  116          163   American Journal of Medical Genetics
  117          163   Molecular Phylogenetics and Evolution
  118          159   DNA
  119          156   EMBO Reports
  120          155   Insect Biochemistry and Molecular Biology
  121          153   Hemoglobin
  122          152   The FASEB Journal
  123          151   Bioorganicheskaia Khimiia
  124          149   The FEBS Journal
  125          148   Molecular Reproduction and Development
  126          148   Diabetes
  127          147   Molecular Immunology
  128          145   Archives of Virology
  129          142   Glycobiology
  130          142   Clinical Genetics
  131          136   General and Comparative Endocrinology
  132          135   International Journal of Cancer
  133          135   Animal Genetics
  134          135   Molecular Genetics and Metabolism
  135          132   Molecular and Cellular Neuroscience
  136          130   British Journal of Haematology
  137          128   Journal of Cellular Biochemistry
  138          125   Biological Chemistry
  139          123   American Journal of Medical Genetics. Part A
  140          122   Molecular Genetics and Genomics
  141          121   Journal of the American Chemical Society
  142          120   Agricultural and Biological Chemistry
  143          119   Nature Immunology
  144          118   BMC Genomics
  145          118   Journal of Lipid Research
  146          114   Proteomics
  147          113   Thrombosis and Haemostasis
  148          113   Circulation Research
  149          113   Neuroscience Letters
  150          113   Journal of Protein Chemistry


5.  STATISTICS FOR SOME LINE TYPES

The following tables summarize the total number of some UniProtKB/Swiss-Prot lines,
the number of entries with at least one such line, and the average number
of such lines per entry. Subtypes are ranked by their total number of lines.

                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----

References (RL)                       916523                 1.78
   Journal                            724695     385502      1.41       1
   Submitted to EMBL/GenBank/DDBJ     179159     165908      0.35       2
   Submitted to other databases        10622       9218      0.02       3
   Book citation                         635        621     <0.01       4
   Plant Gene Register                   560        548     <0.01       5
   Thesis                                395        392     <0.01       6
   Unpublished observations              294        290     <0.01       7
   Patent                                157        155     <0.01       8
   Worm Breeder's Gazette                  6          6     <0.01       9

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 285894
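
The "Average per entry" column in these line-type tables is consistent with dividing the total number of lines by all 515203 entries of the release, not by the number of entries that contain such a line. The sketch below illustrates this reading; the function name `format_average` and the "<0.01" formatting rule are inferred from the tables and should be treated as assumptions.

```python
TOTAL_ENTRIES = 515203  # all entries in release 57.15


def format_average(total_lines, total_entries=TOTAL_ENTRIES):
    """Format lines-per-entry the way the tables appear to: two decimals,
    with values that would print as 0.00 shown as "<0.01"."""
    formatted = f"{total_lines / total_entries:.2f}"
    return "<0.01" if formatted == "0.00" else formatted


# (total number of lines, number of entries with at least one such line)
samples = {
    "References (RL)": (916523, None),
    "Journal": (724695, 385502),
    "Submitted to EMBL/GenBank/DDBJ": (179159, 165908),
    "Book citation": (635, 621),
}

for name, (total_lines, _entries_with_line) in samples.items():
    print(f"{name:<34}{total_lines:>8}  {format_average(total_lines):>6}")
```

For example, journal references give 724695 / 515203, which prints as 1.41; dividing by the 385502 entries that contain a journal reference would give about 1.88 instead.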

                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----
Comments (CC)                        2174677                 4.22                                         
   ALLERGEN                              460        460     <0.01      26                                 
   ALTERNATIVE PRODUCTS                18657      18657      0.04      12                                 
   BIOPHYSICOCHEMICAL PROPERTIES        2918       2918      0.01      22                                 
   BIOTECHNOLOGY                         256        254     <0.01      28                                 
   CATALYTIC ACTIVITY                 218486     199288      0.42       5                                 
   CAUTION                              6803       6664      0.01      19                                 
   COFACTOR                            99099      91004      0.19       7                                 
   DEVELOPMENTAL STAGE                  8728       8728      0.02      16                                 
   DISEASE                              4344       2940      0.01      20                                 
   DISRUPTION PHENOTYPE                 2441       2441     <0.01      23                                 
   DOMAIN                              31032      27605      0.06      10                                 
   ENZYME REGULATION                    7742       7742      0.02      18                                 
   FUNCTION                           383140     367142      0.74       2                                 
   INDUCTION                           11469      11469      0.02      15                                 
   INTERACTION                         12308      12308      0.02      14                                 
   MASS SPECTROMETRY                    4235       3196      0.01      21                                 
   MISCELLANEOUS                       30073      27800      0.06      11                                 
   PATHWAY                            126370     115423      0.25       6                                 
   PHARMACEUTICAL                         83         83     <0.01      29                                 
   POLYMORPHISM                          771        739     <0.01      24                                 
   PTM                                 35224      28544      0.07       8                                 
   RNA EDITING                           603        603     <0.01      25                                 
   SEQUENCE CAUTION                    12702      12702      0.02      13                                 
   SIMILARITY                         598859     490601      1.16       1                                 
   SUBCELLULAR LOCATION               296364     291347      0.58       3                                 
   SUBUNIT                            220331     220331      0.43       4                                 
   TISSUE SPECIFICITY                  32472      32472      0.06       9                                 
   TOXIC DOSE                            417        406     <0.01      27                                 
   WEB RESOURCE                         8290       6581      0.02      17                                 

Total number of comment topics: 29


                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----
Features (FT)                        3204025                 6.22                                         
   ACT_SITE                           127567      76218      0.25       9                                 
   BINDING                            200356      57370      0.39       4                                 
   CA_BIND                              3651       1479      0.01      35                                 
   CARBOHYD                            95872      24560      0.19      13                                 
   CHAIN                              521717     510497      1.01       1                                 
   COILED                              18142      12239      0.04      26                                 
   COMPBIAS                            48780      25480      0.09      18                                 
   CONFLICT                           115522      40534      0.22      10                                 
   CROSSLNK                             4794       3114      0.01      34                                 
   DISULFID                            94293      25035      0.18      14                                 
   DNA_BIND                            10866      10000      0.02      29                                 
   DOMAIN                             142726      85177      0.28       6                                 
   HELIX                              130288      13628      0.25       8                                 
   INIT_MET                            14802      14802      0.03      27                                 
   LIPID                               10518       6700      0.02      30                                 
   METAL                              271321      66752      0.53       3                                 
   MOD_RES                            177672      58392      0.34       5                                 
   MOTIF                               31998      20591      0.06      22                                 
   MUTAGEN                             30053       7163      0.06      24                                 
   NON_CONS                             1545        634     <0.01      36                                 
   NON_STD                               348        273     <0.01      38                                 
   NON_TER                             11476       8710      0.02      28                                 
   NP_BIND                            104535      68224      0.20      12                                 
   PEPTIDE                              8491       5423      0.02      32                                 
   PROPEP                              10415       8781      0.02      31                                 
   REGION                              91553      50349      0.18      15                                 
   REPEAT                              87521      12938      0.17      16                                 
   SIGNAL                              33689      33679      0.07      21                                 
   SITE                                36775      21756      0.07      20                                 
   STRAND                             130850      12739      0.25       7                                 
   TOPO_DOM                           115001      23624      0.22      11                                 
   TRANSIT                              6466       6380      0.01      33                                 
   TRANSMEM                           336584      68799      0.65       2                                 
   TURN                                31089      10765      0.06      23                                 
   UNSURE                               1105        350     <0.01      37                                 
   VAR_SEQ                             38660      16599      0.08      19                                 
   VARIANT                             79053      16425      0.15      17                                 
   ZN_FING                             27931      12202      0.05      25                                 

Total number of feature keys: 38
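
The "Average per entry" column in these tables appears to be the total
number divided by the number of entries in the release (515203, from the
introduction), printed with two decimals, with values that would round to
0.00 shown as "<0.01". The following is a minimal Python sketch under that
assumption; the function name average_per_entry is illustrative and is not
part of any UniProt tool.

```python
# Number of sequence entries in release 57.15 (from the introduction).
TOTAL_ENTRIES = 515203


def average_per_entry(total_number, total_entries=TOTAL_ENTRIES):
    """Format an 'Average per entry' cell the way the tables appear to.

    Assumption: the value is total_number / total_entries with two
    decimals, and anything that would print as 0.00 is shown as '<0.01'.
    """
    text = f"{total_number / total_entries:.2f}"
    return "<0.01" if text == "0.00" else text


if __name__ == "__main__":
    # A few rows taken from the feature table above.
    for key, total in [("ACT_SITE", 127567), ("CHAIN", 521717), ("NON_STD", 348)]:
        print(f"{key:<10} {total:>8} {average_per_entry(total):>6}")
```

Under this assumption the sketch gives 0.25 for ACT_SITE, 1.01 for CHAIN
and <0.01 for NON_STD, which agrees with the corresponding rows.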



                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank      Category
------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
Cross-references (DR)               12981936                25.20                                                           
   2DBase-Ecoli                           84         84     <0.01     116      2D gel databases                             
   Aarhus/Ghent-2DPAGE                   126         96     <0.01     113      2D gel databases                             
   AGD                                   823        817     <0.01      89      Organism-specific databases                  
   ANU-2DPAGE                             23         23     <0.01     123      2D gel databases                             
   ArachnoServer                         462        458     <0.01      97      Organism-specific databases                  
   ArrayExpress                        58045      58045      0.11      38      Gene expression databases                    
   Bgee                                37641      37638      0.07      44      Gene expression databases                    
   BindingDB                             297        297     <0.01     106      Other                                        
   BioCyc                             160537     147681      0.31      21      Enzyme and pathway databases                 
   BRENDA                              65155      62359      0.13      35      Enzyme and pathway databases                 
   BuruList                              330        330     <0.01     105      Organism-specific databases                  
   CAZy                                 5647       5026      0.01      65      Protein family/group databases               
   CGD                                   554        550     <0.01      94      Organism-specific databases                  
   CleanEx                             30211      29564      0.06      46      Gene expression databases                    
   COMPLUYEAST-2DPAGE                     59         59     <0.01     118      2D gel databases                             
   Cornea-2DPAGE                          67         67     <0.01     117      2D gel databases                             
   CTD                                 63479      62910      0.12      37      Organism-specific databases                  
   CYGD                                 6628       6524      0.01      64      Organism-specific databases                  
   dictyBase                            4260       4137      0.01      73      Organism-specific databases                  
   DIP                                 11496      11391      0.02      56      Protein-protein interaction databases        
   DisProt                               397        394     <0.01     100      3D structure databases                       
   DOSAC-COBS-2DPAGE                     150        150     <0.01     112      2D gel databases                             
   DrugBank                             5317       1626      0.01      67      Other                                        
   EchoBASE                             4159       4124      0.01      75      Organism-specific databases                  
   ECO2DBASE                             351        299     <0.01     104      2D gel databases                             
   EcoGene                              4354       4351      0.01      71      Organism-specific databases                  
   eggNOG                             216413     216413      0.42      18      Phylogenomic databases                       
   EMBL                               848161     505513      1.65       3      Sequence databases                           
   Ensembl                             90110      69668      0.17      28      Genome annotation databases                  
   euHCVdb                                55         44     <0.01     119      Organism-specific databases                  
   EuPathDB                              231        231     <0.01     110      Organism-specific databases                  
   FlyBase                              5463       5087      0.01      66      Organism-specific databases                  
   Gene3D                             235321     193263      0.46      17      Family and domain databases                  
   GeneCards                           21070      19810      0.04      50      Organism-specific databases                  
   GeneDB_Spombe                        4976       4931      0.01      69      Organism-specific databases                  
   GeneFarm                             2690       2675      0.01      81      Organism-specific databases                  
   GeneID                             467139     448177      0.91       6      Genome annotation databases                  
   Genevestigator                      64358      64358      0.12      36      Gene expression databases                    
   GenomeReviews                      376632     356619      0.73       9      Genome annotation databases                  
   GermOnline                          41923      41316      0.08      43      Gene expression databases                    
   GlycoSuiteDB                          280        280     <0.01     107      PTM databases                                
   GO                                2152314     481476      4.18       1      Ontologies                                   
   Gramene                              4290       4290      0.01      72      Organism-specific databases                  
   H-InvDB                             10859       9799      0.02      57      Organism-specific databases                  
   HAMAP                              307274     307130      0.60      15      Family and domain databases                  
   HGNC                                19528      19356      0.04      51      Organism-specific databases                  
   HOGENOM                            359249     359249      0.70      10      Phylogenomic databases                       
   HOVERGEN                            74264      74264      0.14      31      Phylogenomic databases                       
   HPA                                  8704       6562      0.02      60      Organism-specific databases                  
   HSC-2DPAGE                             85         85     <0.01     115      2D gel databases                             
   HSSP                                28888      28888      0.06      47      3D structure databases                       
   InParanoid                          65753      65753      0.13      32      Phylogenomic databases                       
   IntAct                              21754      21754      0.04      49      Protein-protein interaction databases        
   InterPro                          1583334     486831      3.07       2      Family and domain databases                  
   IPI                                 88263      63320      0.17      29      Sequence databases                           
   KEGG                               438773     417044      0.85       8      Genome annotation databases                  
   LegioList                             760        758     <0.01      90      Organism-specific databases                  
   Leproma                               664        661     <0.01      93      Organism-specific databases                  
   ListiList                            1185       1177     <0.01      86      Organism-specific databases                  
   MaizeGDB                              472        467     <0.01      96      Organism-specific databases                  
   MEROPS                               9946       9624      0.02      58      Protein family/group databases               
   MGI                                 16104      16053      0.03      53      Organism-specific databases                  
   MIM                                 15806      12440      0.03      55      Organism-specific databases                  
   MypuList                              203        203     <0.01     111      Organism-specific databases                  
   NextBio                             48682      48682      0.09      41      Other                                        
   NMPDR                              130076     130072      0.25      24      Genome annotation databases                  
   OGP                                   377        377     <0.01     103      2D gel databases                             
   OMA                                353254     353254      0.69      11      Phylogenomic databases                       
   Orphanet                             3674       2131      0.01      78      Organism-specific databases                  
   OrthoDB                             55415      55415      0.11      39      Phylogenomic databases                       
   PANTHER                            184791     169613      0.36      20      Family and domain databases                  
   Pathway_Interaction_DB               4567       1665      0.01      70      Enzyme and pathway databases                 
   PDB                                 65686      15487      0.13      34      3D structure databases                       
   PDBsum                              65686      15488      0.13      33      3D structure databases                       
   PeptideAtlas                         5167       5167      0.01      68      Proteomic databases                          
   PeroxiBase                            677        665     <0.01      92      Protein family/group databases               
   Pfam                               661344     466824      1.28       4      Family and domain databases                  
   PharmGKB                            15809      15798      0.03      54      Organism-specific databases                  
   PHCI-2DPAGE                           247        247     <0.01     109      2D gel databases                             
   PhosphoSite                         19301      19301      0.04      52      PTM databases                                
   PhosSite                              267        267     <0.01     108      PTM databases                                
   PhotoList                             738        738     <0.01      91      Organism-specific databases                  
   PhylomeDB                          121182     121182      0.24      25      Phylogenomic databases                       
   PIR                                115021     105057      0.22      26      Sequence databases                           
   PIRSF                               82100      82100      0.16      30      Family and domain databases                  
   PMAP-CutDB                           1394       1394     <0.01      84      Other                                        
   PMMA-2DPAGE                            52         52     <0.01     120      2D gel databases                             
   PptaseDB                               34         34     <0.01     121      Protein family/group databases               
   PRIDE                               53651      53651      0.10      40      Proteomic databases                          
   PRINTS                             136299     117891      0.26      23      Family and domain databases                  
   ProDom                              27781      27452      0.05      48      Family and domain databases                  
   ProMEX                                439        439     <0.01      98      Proteomic databases                          
   PROSITE                            456482     290969      0.89       7      Family and domain databases                  
   ProtClustDB                        323922     323922      0.63      13      Phylogenomic databases                       
   PseudoCAP                            1212       1203     <0.01      85      Organism-specific databases                  
   Rat-heart-2DPAGE                       28         28     <0.01     122      2D gel databases                             
   Reactome                             7331       4257      0.01      62      Enzyme and pathway databases                 
   REBASE                                379        358     <0.01     102      Protein family/group databases               
   RefSeq                             487669     448457      0.95       5      Sequence databases                           
   REPRODUCTION-2DPAGE                  1030        942     <0.01      88      2D gel databases                             
   RGD                                  7364       7360      0.01      61      Organism-specific databases                  
   SagaList                              389        388     <0.01     101      Organism-specific databases                  
   SGD                                  6641       6540      0.01      63      Organism-specific databases                  
   Siena-2DPAGE                          103        103     <0.01     114      2D gel databases                             
   SMART                              141828     109431      0.28      22      Family and domain databases                  
   SMR                                345706     345706      0.67      12      3D structure databases                       
   STRING                             203542     203530      0.40      19      Protein-protein interaction databases        
   SubtiList                            4200       4191      0.01      74      Organism-specific databases                  
   SUPFAM                             311933     247029      0.61      14      Family and domain databases                  
   SWISS-2DPAGE                         1182       1182     <0.01      87      2D gel databases                             
   TAIR                                 8959       8847      0.02      59      Organism-specific databases                  
   TCDB                                 3293       3252      0.01      80      Protein family/group databases               
   TIGR                                33909      33142      0.07      45      Genome annotation databases                  
   TIGRFAMs                           280320     261597      0.54      16      Family and domain databases                  
   TubercuList                          1587       1551     <0.01      83      Organism-specific databases                  
   UCSC                                48477      39503      0.09      42      Genome annotation databases                  
   UniGene                             91817      80915      0.18      27      Sequence databases                           
   VectorBase                            418        404     <0.01      99      Genome annotation databases                  
   World-2DPAGE                          507        507     <0.01      95      2D gel databases                             
   WormBase                             3818       3733      0.01      77      Organism-specific databases                  
   WormPep                              4055       3275      0.01      76      Organism-specific databases                  
   Xenbase                              3663       3590      0.01      79      Organism-specific databases                  
   ZFIN                                 2515       2504     <0.01      82      Organism-specific databases                  

Total number of cross-referenced databases: 123
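
The cross-reference table can be read mechanically, for example to total
the cross-references per category. The sketch below assumes that each data
row starts with whitespace, that the database name contains no spaces, and
that the columns are separated by runs of spaces as shown above. The names
parse_dr_row and totals_by_category are hypothetical helpers written for
this illustration.

```python
import re
from collections import defaultdict

# name, total number, number of entries, average, rank, category
ROW = re.compile(
    r"^\s+(\S+)\s+(\d+)\s+(\d+)\s+(<?\d+\.\d+)\s+(\d+)\s+(\S.*?)\s*$"
)


def parse_dr_row(line):
    """Return a dict for one data row, or None for headers and summary lines."""
    match = ROW.match(line)
    if match is None:
        return None
    name, total, entries, _average, rank, category = match.groups()
    return {
        "name": name,
        "total": int(total),
        "entries": int(entries),
        "rank": int(rank),
        "category": category,
    }


def totals_by_category(lines):
    """Sum the 'Total number' column per category over all parsable rows."""
    totals = defaultdict(int)
    for line in lines:
        row = parse_dr_row(line)
        if row is not None:
            totals[row["category"]] += row["total"]
    return dict(totals)


if __name__ == "__main__":
    sample = [
        "   PDB                                 65686      15487      0.13      34      3D structure databases   ",
        "   PDBsum                              65686      15488      0.13      33      3D structure databases   ",
        "   GO                                2152314     481476      4.18       1      Ontologies               ",
    ]
    print(totals_by_category(sample))
```

Header lines and the "Cross-references (DR)" summary line do not match the
pattern and are skipped, so the whole table can be passed in unchanged.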

6.  AMINO ACID COMPOSITION

   6.1  Composition in percent for the complete database

   Ala (A) 8.28   Gln (Q) 3.94   Leu (L) 9.67   Ser (S) 6.50
   Arg (R) 5.54   Glu (E) 6.77   Lys (K) 5.85   Thr (T) 5.32
   Asn (N) 4.05   Gly (G) 7.09   Met (M) 2.43   Trp (W) 1.07
   Asp (D) 5.45   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.91
   Cys (C) 1.36   Ile (I) 5.99   Pro (P) 4.68   Val (V) 6.88

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00

   

   [Figure: amino acid composition chart, colored by residue class]

   Legend: gray = aliphatic, red = acidic, green = small hydroxy,
           blue = basic, black = aromatic, white = amide, yellow = sulfur


   6.2  Classification of the amino acids by their frequency

   Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
   Phe, Tyr, Met, His, Cys, Trp
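
   The ordering in 6.2 follows from sorting the percentages in 6.1 in
   descending order. A short Python sketch of that step, using only the
   values listed in 6.1 (the variable and function names are illustrative):

```python
# Composition in percent for the complete database (section 6.1).
COMPOSITION = {
    "Ala": 8.28, "Gln": 3.94, "Leu": 9.67, "Ser": 6.50,
    "Arg": 5.54, "Glu": 6.77, "Lys": 5.85, "Thr": 5.32,
    "Asn": 4.05, "Gly": 7.09, "Met": 2.43, "Trp": 1.07,
    "Asp": 5.45, "His": 2.27, "Phe": 3.86, "Tyr": 2.91,
    "Cys": 1.36, "Ile": 5.99, "Pro": 4.68, "Val": 6.88,
}


def by_frequency(composition):
    """Return residue names ordered from most to least frequent."""
    return sorted(composition, key=composition.get, reverse=True)


if __name__ == "__main__":
    print(", ".join(by_frequency(COMPOSITION)))
```

   No two residues share the same percentage in this release, so the sort
   order is unambiguous.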


7.  MISCELLANEOUS STATISTICS

4446 entries are encoded on a mitochondrion, and 3555 are encoded on a plasmid.

12174 entries are encoded on a plastid, of which 21 are encoded on apicoplasts,
11616 on chloroplasts, 44 on organellar chromatophores, 145 on cyanelles,
149 on non-photosynthetic plastids and 199 on unspecified types of plastid.
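
The plastid breakdown can be checked against the stated total with a few
lines of Python. This is only a consistency check on the figures quoted
above; the dictionary and function names are illustrative.

```python
PLASTID_TOTAL = 12174

# Per-type counts as quoted in the text.
PLASTID_COUNTS = {
    "apicoplast": 21,
    "chloroplast": 11616,
    "organellar chromatophore": 44,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}


def shares(counts):
    """Return each type's share of all plastid-encoded entries, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    assert sum(PLASTID_COUNTS.values()) == PLASTID_TOTAL
    for name, pct in shares(PLASTID_COUNTS).items():
        print(f"{name:<28} {pct:6.2f} %")
```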

Number of entries with at least one sequence correction: 68420