Prediction of non-coding RNAs in the transcriptome of the fungus Paracoccidioides brasiliensis using machine learning
AUTHOR(S)
Roberto Ternes Arrial
PUBLICATION DATE
2008
ABSTRACT
Paracoccidioides brasiliensis (Pb) is a saprophytic, dimorphic fungus of clinical importance because its propagules, when inhaled by humans, cause the disease known as paracoccidioidomycosis. The Pb transcriptome, published in 2005, pointed to several potential drug targets, but a significant share of the sequenced transcripts still lack identified homologous proteins. This work suggests that these RNAs may be non-coding RNAs (ncRNAs), a class of biologically functional molecules that do not code for any protein product. To test this hypothesis, a strictly computational approach was taken: known examples of mRNAs and ncRNAs were used to train two machine learning algorithms, naive Bayes (nB) and Support Vector Machines (SVM). Several programs, some available from the literature and some developed locally, were used to extract properties from transcripts and their corresponding protein products, so that the machine learning algorithms could discriminate between mRNA and ncRNA. Several performance measures show that both algorithms induced classifiers able to discriminate the two classes of RNA, and also indicate that SVM has a significant advantage in ncRNA detection. Mean accuracy, as estimated by 10-fold cross-validation, was 92.4% for SVM and 75.3% for nB. When applied to the Pb transcriptome, SVM and nB detect 970 and 262 ncRNAs, respectively, most of which are singlets and unannotated transcripts, two characteristics that support the possibility that these transcripts are real ncRNAs. Comparison with related work indicates that the described program improves computational speed without sacrificing accuracy. This work describes the design of PORTRAIT, a computational program for ab initio analysis specialized in detecting ncRNAs in transcriptomes of poorly characterized organisms.
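The abstract reports mean accuracy estimated by 10-fold cross-validation. As a rough illustration of how such an estimate is computed, the sketch below cross-validates a minimal Gaussian naive Bayes classifier written in plain Python. It is not PORTRAIT's implementation. The function names, the two features (a longest-ORF fraction and a GC fraction), and the synthetic data are all assumptions made for this example; the abstract does not list the features actually used.

```python
import math
import random


def fit_gaussian_nb(rows, labels):
    """Estimate per-class prior, feature means and feature variances."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[cls] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one feature row."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def cross_validate(rows, labels, k=10, seed=0):
    """Mean accuracy over k folds; each sample is held out exactly once.

    Assumes there are at least k samples, so that no fold is empty.
    """
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        model = fit_gaussian_nb([rows[i] for i in train],
                                [labels[i] for i in train])
        hits = sum(predict(model, rows[i]) == labels[i] for i in held_out)
        accuracies.append(hits / len(held_out))
    return sum(accuracies) / k


# Synthetic, well-separated toy data (hypothetical features, not real transcripts):
# each row is [longest-ORF fraction, GC fraction].
rng = random.Random(1)
rows, labels = [], []
for label, orf_mean, gc_mean in (("mRNA", 0.8, 0.55), ("ncRNA", 0.2, 0.45)):
    for _ in range(50):
        rows.append([rng.gauss(orf_mean, 0.05), rng.gauss(gc_mean, 0.05)])
        labels.append(label)

mean_accuracy = cross_validate(rows, labels, k=10)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

Because the toy classes are deliberately easy to separate, the accuracy printed here says nothing about the 92.4% and 75.3% figures reported in the abstract; it only shows the mechanics of averaging held-out accuracy across folds.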
SUBJECT(S)
non-coding RNAs; support vector machines; transcriptome; Paracoccidioides brasiliensis; machine learning; molecular biology
Related Documents
- Expression analysis of intronic non-coding RNAs in breast tumors
- Expression analysis of intronic noncoding RNAs in renal cell carcinomas
- Analysis of the partial transcriptome of the fungus Paracoccidioides brasiliensis recovered after infection of murine peritoneal macrophages
- Study of the biosynthesis and regulation of intronic non-coding RNAs in human cells
- Visualization of genomic data from the fungus Paracoccidioides brasiliensis