Iqbal, M.J. and Faye, I. and Said, A.M. and Samir, B.B. (2013) A distance-based feature-encoding technique for protein sequence classification in bioinformatics. In: UNSPECIFIED.
Full text not available from this repository.

Abstract
Bioinformatics has been emerging as a new research dimension since the last century by combining computer science and biology techniques for the automatic analysis of biological sequence data. The volume of the biological data gathered under different sequencing projects is increasing exponentially. These sequences contain extremely important information about genes, their structure and function. Computational techniques which involve machine learning and pattern recognition are becoming very useful on Bioinformatics data like DNA and protein. Protein classification into different groups could be used for knowing the structure or the function of unknown protein sequence. The process of classifying protein amino acid sequences into a family/superfamily is a very complex problem. However, from among other major issues in a protein classification, the critical one is an accurate representation of amino acid sequence during the feature extraction. In this work, we have proposed a distance-based feature-encoding method; the proposed technique has been tested with different classifiers, which have shown better results than the previously available techniques for superfamily classification of protein sequences. The maximum average classification accuracy obtained was 91.2. The dataset used in the experiments was taken from the well known UniProtKB protein database. © 2013 IEEE.
Item Type: | Conference or Workshop Item (UNSPECIFIED) |
---|---|
Additional Information: | Cited by 5; Conference of 2nd IEEE International Conference on Computational Intelligence and Cybernetics, IEEE CYBERNETICSCOM 2013; Conference Date: 3 December 2013 through 4 December 2013; Conference Code: 107263 |
Uncontrolled Keywords: | Amino acids; Artificial intelligence; Bioinformatics; Data mining; Encoding (symbols); Feature extraction; Proteins; Biological sequence data; Classification accuracy; Computational technique; Feature-encoding; Protein Classification; Protein sequence classification; Superfamily; Superfamily classification; Computer aided diagnosis |
Depositing User: | Mr Ahmad Suhairi UTP |
Date Deposited: | 09 Nov 2023 15:52 |
Last Modified: | 09 Nov 2023 15:52 |
URI: | https://khub.utp.edu.my/scholars/id/eprint/3808 |