Synthesizing Regularity Exposing Attributes in Large Protein Databases

Item

Title
en_US Synthesizing Regularity Exposing Attributes in Large Protein Databases
Creator
en_US de la Maza, Michael
Date
2004-10-20T19:55:04Z
Date Available
2004-10-20T19:55:04Z
Date Issued
en_US 1993-05-01
Identifier
en_US AITR-1444
Abstract
en_US This thesis describes a system that synthesizes regularity exposing attributes from large protein databases. After processing primary and secondary structure data, this system discovers an amino acid representation that captures what are thought to be the three most important amino acid characteristics (size, charge, and hydrophobicity) for tertiary structure prediction. A neural network trained using this 16-bit representation achieves a performance accuracy on the secondary structure prediction problem that is comparable to the one achieved by a neural network trained using the standard 24-bit amino acid representation. In addition, the thesis describes bounds on secondary structure prediction accuracy, derived using an optimal learning algorithm and the probably approximately correct (PAC) model.
Extent
en_US 90 p.
204397 bytes
794429 bytes
Format
application/octet-stream
application/pdf
Language
en_US
Relation
en_US AITR-1444
Subject
en_US representation reformulation
en_US secondary structure prediction
en_US genetic algorithms
en_US neural networks
en_US clustering algorithm
en_US decision tree systems