WORKFLOW PARA BIOINFORMÁTICA / WORKFLOW FOR BIOINFORMATICS
AUTHOR(S)
MELISSA LEMOS
PUBLICATION DATE
2004
ABSTRACT
Genome projects usually start with a sequencing phase, in which experimental data, usually DNA sequences, are generated without any biological interpretation. DNA sequences encode the information responsible for producing protein and RNA sequences, while protein sequences participate in all biological phenomena, such as cell replication, energy production, immunological defense, muscular contraction, neurological activity and reproduction. DNA, RNA and protein sequences are called biosequences in this thesis. The fundamental challenge researchers face lies in analyzing these sequences to derive biologically relevant information. During the analysis phase, researchers use a variety of analysis programs and access large data sources holding Molecular Biology data. The growing number of Bioinformatics data sources and analysis programs has greatly facilitated the analysis phase, but it also creates a demand for systems that make these computational resources easier to use. Given this scenario, this thesis addresses the use of workflows to compose Bioinformatics analysis programs that access data sources, thereby facilitating the analysis phase. An ontology modeling the analysis programs and data sources commonly used in Bioinformatics is first described. This ontology is derived from a careful study, also summarized in the thesis, of the computational resources that Bioinformatics researchers presently use. A framework for biosequence analysis management systems is described next. The framework is divided into two major components. The first component is a Bioinformatics workflow management system that helps researchers define, validate, optimize and run workflows combining Bioinformatics analysis programs. The second component is a Bioinformatics data management system that helps researchers manage large volumes of Bioinformatics data. The framework also includes an ontology manager that stores Bioinformatics ontologies, such as the one described earlier in the thesis.
Lastly, instantiations of the Bioinformatics workflow management system framework are described. The instantiations cover three commonly found types of working environment, referred to as the personal environment, the laboratory environment and the community environment. For each of these instantiations, aspects related to workflow optimization and execution are discussed.
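To make the idea of defining, validating and running a workflow of analysis programs more concrete, the following is a minimal sketch in Python. It is not taken from the thesis: the `Program` record, the `validate` and `execute` functions, the type labels "DNA" and "RNA", and the two toy programs are all hypothetical names chosen for illustration. The sketch assumes that an ontology entry records, for each program, the kind of biosequence it consumes and produces, and that validation means checking that consecutive steps agree on those types.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Program:
    """Hypothetical ontology entry: an analysis program with typed input and output."""
    name: str
    input_type: str
    output_type: str
    run: Callable[[str], str]


def validate(workflow: List[Program]) -> List[str]:
    """Return the type mismatches between consecutive steps (empty list if valid)."""
    errors = []
    for prev, nxt in zip(workflow, workflow[1:]):
        if prev.output_type != nxt.input_type:
            errors.append(
                f"{prev.name} produces {prev.output_type}, "
                f"but {nxt.name} expects {nxt.input_type}"
            )
    return errors


def execute(workflow: List[Program], data: str) -> str:
    """Validate the workflow, then feed each step's output into the next step."""
    errors = validate(workflow)
    if errors:
        raise ValueError("; ".join(errors))
    for step in workflow:
        data = step.run(data)
    return data


# Toy stand-ins for real analysis programs (assumptions for illustration only).
def reverse_complement(dna: str) -> str:
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(dna.upper()))


def transcribe(dna: str) -> str:
    return dna.upper().replace("T", "U")


REVCOMP = Program("reverse_complement", "DNA", "DNA", reverse_complement)
TRANSCRIBE = Program("transcribe", "DNA", "RNA", transcribe)

if __name__ == "__main__":
    print(execute([REVCOMP, TRANSCRIBE], "ATGC"))
    print(validate([TRANSCRIBE, REVCOMP]))
```

In this sketch, swapping the two steps is rejected at validation time because the first step would produce RNA while the second expects DNA. A real system along the lines described above would presumably draw these type descriptions from its ontology manager rather than from hard-coded records.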
SUBJECT(S)
ontology, software framework, database, bioinformatics, workflow
Related Documents
- Ferramentas de bioinformática para proteômica
- PROVENIÊNCIA PARA WORKFLOWS DE BIOINFORMÁTICA
- Investigação de técnicas de classificação hierárquica para problemas de bioinformática
- Gerenciamento de workflows cientificos em bioinformatica
- Desenvolvimento de ferramentas de bioinformática para análises de expressão gênica em larga escala