dc.contributor | Department of Computer Science | |
dc.creator (Author) | Lin, Hsin-Nan;Notredame, Cédric;Chang, Jia-Ming;Sung, Ting-Yi;Hsu, Wen-Lian | |
dc.creator (Author) | 張家銘 | zh_TW |
dc.date (Date) | 2011 | |
dc.date.accessioned | 27-Apr-2016 15:30:13 (UTC+8) | - |
dc.date.available | 27-Apr-2016 15:30:13 (UTC+8) | - |
dc.date.issued (Upload time) | 27-Apr-2016 15:30:13 (UTC+8) | - |
dc.identifier.uri (URI) | http://nccur.lib.nccu.edu.tw/handle/140.119/86640 | - |
dc.description.abstract (Abstract) | Most sequence alignment tools can successfully align protein sequences with higher levels of sequence identity. The accuracy of the corresponding structure alignment, however, decreases rapidly when considering distantly related sequences (<20% identity). In this range of identity, alignments optimized so as to maximize sequence similarity are often inaccurate from a structural point of view. Over the last two decades, most multiple protein aligners have been optimized for their capacity to reproduce structure-based alignments while using sequence information. Methods currently available differ essentially in the similarity measurement between aligned residues using substitution matrices, Fourier transform, sophisticated profile-profile functions, or, more recently, consistency-based approaches. In this paper, we present a flexible similarity measure for residue pairs to improve the quality of protein sequence alignment. Our approach, called SymAlign, relies on the identification of conserved words found across a sizeable fraction of the considered dataset, and supported by evolutionary analysis. These words are then used to define a position-specific substitution matrix that better reflects the biological significance of local similarity. The experimental results show that the SymAlign scoring scheme can be incorporated within T-Coffee to improve sequence alignment accuracy. We also demonstrate that SymAlign is less sensitive to the presence of structurally non-similar proteins. In the analysis of the relationship between sequence identity and structure similarity, SymAlign can better differentiate structurally similar proteins from non-similar proteins. We show that protein sequence alignments can be significantly improved using a similarity estimation based on weighted n-grams. In our analysis of the alignments thus produced, sequence conservation becomes a better indicator of structural similarity.
SymAlign also provides alignment visualization that can display sub-optimal alignments on dot-matrices. The visualization makes it easy to identify well-supported alternative alignments that may not have been identified by dynamic programming. SymAlign is available at http://bio-cluster.iis.sinica.edu.tw/SymAlign/. | |
dc.format.extent | 396529 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.relation (Relation) | PLoS One, 6(12), e27872 | |
dc.title (Title) | Improving the Alignment Quality of Consistency Based Aligners with an Evaluation Function Using Synonymous Protein Words | |
dc.type (Type) | article | |
dc.identifier.doi (DOI) | 10.1371/journal.pone.0027872 | |
dc.doi.uri (DOI) | http://dx.doi.org/10.1371/journal.pone.0027872 | |