
Large scale hierarchical clustering of protein sequences

BMC Bioinformatics. Vol. 2005. Issue 6. Berlin, Heidelberg: BMC Springer Nature 2005, p. 15

Year of publication: 2005

ISBN/ISSN: 1471-2105

Publication type: Journal article

Language: English

DOI/URN: 10.1186/1471-2105-6-15

Verified: Library

Abstract


Background: Searching a biological sequence database with a query sequence looking for homologues has become a routine operation in computational biology. In spite of the high degree of sophistication of currently available search routines it is still virtually impossible to identify quickly and clearly a group of sequences that a given query sequence belongs to.

Results: We report on our developments in grouping all known protein sequences hierarchically into superfamily and family clusters. Our graph-based algorithms take into account the topology of the sequence space induced by the data itself to construct a biologically meaningful partitioning. We have applied our clustering procedures to a non-redundant set of about 1,000,000 sequences resulting in a hierarchical clustering which is being made available for querying and browsing at http://systers.molgen.mpg.de/.

Conclusions: Comparisons with other widely used clustering methods on various data sets show the abilities and strengths of our clustering methods in producing a biologically meaningful grouping of protein sequences.

  • Single Linkage
  • Distance Graph
  • Family Cluster
  • Single Linkage Cluster
  • Twilight Zone

Authors


Stoye, Jens (Author)
Vingron, Martin (Author)

Classification


DFG subject area:
Computer science

DDC subject group:
Natural sciences

Linked persons