Proceedings of the 5th International Conference on Advanced Design and Manufacturing Engineering

A Novel Method for Similarity Analysis of Protein Sequences

Authors
Longlong Liu, Tingting Zhao, Maojuan Liu
Corresponding Author
Longlong Liu
Available Online October 2015.
DOI
10.2991/icadme-15.2015.413
Keywords
protein sequence, similarity analysis, feature vector, cluster analysis, mitochondria NADH dehydrogenase
Abstract

This paper presents a new method for similarity analysis of protein sequences. Representative information is extracted from each protein sequence, based on the positions, proportion differences, and physicochemical properties of the 20 amino acids, and converted into a numeric vector; the similarity of protein sequences is then analyzed by studying the similarity between their vectors. To allow comparison of protein sequences of different lengths, every sequence is first mapped to a fixed-length vector whose entries encode the relative positions of the amino acids. The percentages of the 20 amino acids in the sequence are then combined with 3 physicochemical properties to form a physicochemical information vector. Finally, these are synthesized into a one-dimensional feature vector with 80 elements that represents the protein sequence. The shortest distance method is applied for cluster analysis of the feature vectors in order to analyze the similarity of the protein sequences. In the numerical experiment, similarity analysis is conducted on mitochondrial NADH dehydrogenase from 9 different species. The results are consistent with known biological facts, which validates the effectiveness of the model to a certain extent.
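
The pipeline described in the abstract above (a fixed-length feature vector per sequence, followed by shortest-distance clustering) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say how the 80 elements are laid out or which 3 physicochemical properties are used, so the sketch assumes 20 mean relative positions plus 20 amino acid fractions weighted by each of 3 property scales (20 + 60 = 80). The names `feature_vector`, `single_linkage`, and `TOY_PROPERTIES` are hypothetical, and the property values are placeholders rather than real data. The shortest distance method is taken here to mean single-linkage clustering with Euclidean distance.

```python
from itertools import combinations
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Placeholder scales, NOT real physicochemical data. The abstract does not
# name the 3 properties, so substitute real scales before any actual use.
TOY_PROPERTIES = [
    {aa: float((i * (k + 1)) % 7) for i, aa in enumerate(AMINO_ACIDS)}
    for k in range(3)
]


def feature_vector(seq, properties=TOY_PROPERTIES):
    """Map a protein sequence of any length to 80 numbers (assumed layout)."""
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    positions = {aa: [] for aa in AMINO_ACIDS}
    for i, aa in enumerate(seq, start=1):
        if aa in positions:  # skip non-standard symbols
            positions[aa].append(i / n)  # relative position in (0, 1]
    # 20 elements: mean relative position of each amino acid (0.0 if absent)
    mean_pos = [sum(p) / len(p) if p else 0.0 for p in positions.values()]
    # Fraction of the sequence made up by each amino acid
    fractions = [len(positions[aa]) / n for aa in AMINO_ACIDS]
    # 60 elements: each fraction weighted by each of the 3 property scales
    weighted = [
        f * prop[aa]
        for prop in properties
        for aa, f in zip(AMINO_ACIDS, fractions)
    ]
    return mean_pos + weighted


def cluster_distance(a, b, vectors):
    """Shortest Euclidean distance between any member of a and any of b."""
    return min(dist(vectors[i], vectors[j]) for i in a for j in b)


def single_linkage(vectors):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [frozenset([i]) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], vectors),
        )
        merges.append((sorted(a), sorted(b), cluster_distance(a, b, vectors)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Made-up toy sequences, for illustration only
toy_sequences = ["MKTAYIAKQR", "MKTAYIAKQL", "GGGGSSSSPP"]
toy_merges = single_linkage([feature_vector(s) for s in toy_sequences])
```

Each entry of the merge history lists the two clusters joined (as sequence indices) and the distance at which they were joined, which is enough to draw a dendrogram. The pairwise search is quadratic per merge, which is acceptable for a handful of sequences such as the 9 species mentioned in the abstract.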

Copyright
© 2015, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 5th International Conference on Advanced Design and Manufacturing Engineering
Series
Advances in Engineering Research
Publication Date
October 2015
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Longlong Liu
AU  - Tingting Zhao
AU  - Maojuan Liu
PY  - 2015/10
DA  - 2015/10
TI  - A Novel Method for Similarity Analysis of Protein Sequences
BT  - Proceedings of the 5th International Conference on Advanced Design and Manufacturing Engineering
PB  - Atlantis Press
SP  - 2216
EP  - 2220
SN  - 2352-5401
UR  - https://doi.org/10.2991/icadme-15.2015.413
DO  - 10.2991/icadme-15.2015.413
ID  - Liu2015/10
ER  -