Geração e análise comparativa de seqüências genômicas de Trypanosoma rangeli / Generation and comparative analysis of genomic sequences of Trypanosoma rangeli

AUTHOR(S)
PUBLICATION DATE

2006

ABSTRACT

The hemoflagellate protozoan parasite Trypanosoma (Herpetosoma) rangeli Tejera, 1920 (Kinetoplastida: Trypanosomatidae) shares several species of invertebrate and vertebrate hosts with T. cruzi, the etiological agent of Chagas disease. The genomes of three trypanosomatid species of major importance to human health (the Tri-Tryps) were recently described, but non-pathogenic species, T. rangeli among them, have not been well studied. Two distinct approaches have been used in the genomics of several species: GSS (Genome Sequence Survey), which aims to generate sequences from randomly generated genomic DNA clones, and EST (Expressed Sequence Tags), which targets sequences from cDNA libraries. In the present study, 1,720 genomic sequences from T. rangeli SC58 were generated by GSS. In addition, an integrated system for sequence analysis and annotation named GARSA (Genomic Analysis Resources for Sequence Annotation) was developed. This system makes it possible to run 21 bioinformatics programs, ranging from simple sequence analysis and trimming to phylogenetic and protein domain analyses, in a user-friendly and intuitive manner. After analysis of the 1,720 sequences, a total of 915 were grouped into 375 non-redundant sequences (GSS-nr). The G+C content of the coding regions was 55%. Similarity searches with BLAST and InterPro returned hits for 68% of the sequences; 53% matched hypothetical proteins from organisms of the same family, especially T. cruzi. Sequences related to the mRNA editing process (DEAD box helicase) were also found, as were sequences from the parasite coat such as trans-sialidases, metalloproteases and mucins. Functional annotation was carried out using the Gene Ontology Consortium vocabulary; most assignments fell under molecular function and were related to RNA helicases, serine peptidases and ligands. For 31% of the generated sequences it was not possible to infer functions from similarity searches. These sequences may therefore represent unknown sequences, T. rangeli-specific sequences or even intergenic regions. To date there are no reports on the T. rangeli genome, which makes the present work the first large-scale exploration of the parasite's genome.

SUBJECT(S)

genome; GSS; annotation; molecular biology; Trypanosoma rangeli; computational biology; bioinformatics
