Synthesizing Regularity Exposing Attributes in Large Protein Databases
Item
-
Title
-
en_US
Synthesizing Regularity Exposing Attributes in Large Protein Databases
-
Creator
-
en_US
de la Maza, Michael
-
Date
-
2004-10-20T19:55:04Z
-
Date Available
-
2004-10-20T19:55:04Z
-
Date Issued
-
en_US
1993-05-01
-
Identifier
-
en_US
AITR-1444
-
Abstract
-
en_US
This thesis describes a system that synthesizes regularity exposing attributes from large protein databases. After processing primary and secondary structure data, this system discovers an amino acid representation that captures what are thought to be the three most important amino acid characteristics (size, charge, and hydrophobicity) for tertiary structure prediction. A neural network trained using this 16 bit representation achieves a performance accuracy on the secondary structure prediction problem that is comparable to the one achieved by a neural network trained using the standard 24 bit amino acid representation. In addition, the thesis describes bounds on secondary structure prediction accuracy, derived using an optimal learning algorithm and the probably approximately correct (PAC) model.
-
Extent
-
en_US
90 p.
-
204397 bytes
-
794429 bytes
-
Format
-
application/octet-stream
-
application/pdf
-
Language
-
en_US
-
Relation
-
en_US
AITR-1444
-
Subject
-
en_US
representation reformulation
-
en_US
secondary structure prediction
-
en_US
genetic algorithms
-
en_US
neural networks
-
en_US
clustering algorithm
-
en_US
decision tree systems