Data :: Escherichia coli (E.coli)

Published by onesixx on

https://www.kaggle.com/elikplim/ecoli-data-set

https://rdrr.io/cran/MoTBFs/man/ecoli.html

About this file
 

from: https://archive.ics.uci.edu/ml/datasets/ecoli

Source:

Creator and Maintainer:

Kenta Nakai
Institute of Molecular and Cellular Biology, Osaka University
1-3 Yamada-oka, Suita 565, Japan
nakai ‘@’ imcb.osaka-u.ac.jp
http://www.imcb.osaka-u.ac.jp/nakai/psort.html

Donor:

Paul Horton (paulh ‘@’ cs.berkeley.edu)

See also: yeast database

Data Set Information:

The references below describe a predecessor to this dataset and its development. They also give results (not cross-validated) for classification by a rule-based expert system with that version of the dataset.

Reference: “Expert System for Predicting Protein Localization Sites in Gram-Negative Bacteria”, Kenta Nakai & Minoru Kanehisa, PROTEINS: Structure, Function, and Genetics 11:95-110, 1991.

Reference: “A Knowledge Base for Predicting Protein Localization Sites in Eukaryotic Cells”, Kenta Nakai & Minoru Kanehisa, Genomics 14:897-911, 1992.

Attribute Information:

  1. Sequence Name: Accession number for the SWISS-PROT database
  2. mcg: McGeoch’s method for signal sequence recognition.
  3. gvh: von Heijne’s method for signal sequence recognition.
  4. lip: von Heijne’s Signal Peptidase II consensus sequence score. Binary attribute.
  5. chg: Presence of charge on N-terminus of predicted lipoproteins. Binary attribute.
  6. aac: Score of discriminant analysis of the amino acid content of outer membrane and periplasmic proteins.
  7. alm1: Score of the ALOM membrane spanning region prediction program.
  8. alm2: Score of the ALOM program after excluding putative cleavable signal regions from the sequence.
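
The attribute list above maps naturally onto column names when reading the data file. The following is a minimal sketch in Python, assuming the file is whitespace-separated with one protein per line and that a ninth column holds the localization class label. That column is not described in the list above, so the name `site` is an assumption, as are the function name `parse_ecoli` and the sample rows, which are hypothetical values in the assumed layout rather than records quoted from the dataset.

```python
# Column names follow the attribute list above. "site" (the class label)
# is an assumed ninth column that the list itself does not describe.
COLUMNS = ["sequence_name", "mcg", "gvh", "lip", "chg",
           "aac", "alm1", "alm2", "site"]

# Hypothetical rows in the assumed whitespace-separated layout.
SAMPLE = """\
PROT_A  0.49  0.29  0.48  0.50  0.56  0.24  0.35  cp
PROT_B  0.07  0.40  0.48  0.50  0.54  0.35  0.44  cp
"""


def parse_ecoli(text):
    """Parse whitespace-separated records into a list of dicts."""
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(COLUMNS)} fields, got {len(fields)}"
            )
        row = dict(zip(COLUMNS, fields))
        # Attributes 2-8 (mcg through alm2) are numeric scores.
        for name in COLUMNS[1:-1]:
            row[name] = float(row[name])
        records.append(row)
    return records


records = parse_ecoli(SAMPLE)
print(records[0]["sequence_name"], records[0]["mcg"])
```

To use this on a downloaded copy of the data, read the file contents into a string and pass it to `parse_ecoli`. The sequence name is kept as a plain string because it is an identifier rather than a predictive feature.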

 

