PROVENIÊNCIA PARA WORKFLOWS DE BIOINFORMÁTICA / PROVENANCE FOR BIOINFORMATICS WORKFLOWS

AUTHOR(S)
SOURCE

IBICT - Instituto Brasileiro de Informação em Ciência e Tecnologia

PUBLICATION DATE

27/04/2011

ABSTRACT

Many scientific experiments are designed as computational workflows, which can be implemented in traditional programming languages. In the bioinformatics domain, workflows are often built from ad hoc scripts. Scientific Workflow Management Systems (SWMS) have emerged as an alternative to such scripts. One SWMS feature that has received much attention from the scientific community is the automatic capture of provenance data. Provenance data let users track which resources and parameters were used to obtain the results, along with other information required to validate and publish an experiment. In the present work we identify several data provenance challenges in the SWMS context: (i) the heterogeneity of data representation schemes, which hinders understanding and interoperability; (ii) the storage of consumed and produced data; and (iii) the reproducibility of a specific execution. These challenges motivated the proposal of a conceptual data provenance scheme for workflow representation. We implemented an extension of one SWMS (Bioside) that captures provenance data and stores them using the proposed conceptual scheme, focusing on requirements commonly found in bioinformatics workflows.

SUBJECT(S)

workflow bioinformatics provenance
