A proposal for cache management in a data integration system

AUTHOR(S)
PUBLICATION DATE

2007

ABSTRACT

Data Integration Systems (DIS) provide the user with a unified view of data stored in many different data sources. These data sources are independent, and each has its own schema designed to meet the needs of its users. Each DIS works with a group of data sources related to a specific domain and must obtain from each one the data needed to answer user queries. The query results are translated according to a global schema (the mediator schema), combined, and shown to the user. Web-based data integration systems such as Integra (a DIS developed at the Centro de Informática, UFPE, and used to implement our contributions) face several challenges, since data source availability is a very important factor. Furthermore, the cost of always accessing data at the sources may be very high. For this reason, some DIS keep a cache of query results that are of interest to the system. When the user submits a query that is already stored in the cache, the system does not need to access the data sources to answer it, which speeds up processing. The main purpose of this master's thesis is to propose a Cache Manager for a data integration system. The Cache Manager is composed of three modules. The first controls the cache space, deciding which queries enter the cache and which are kept there. The second identifies whether a query submitted by the user is a subquery of a query already stored in the cache (the query containment technique). The third performs partial replacement of a query, allowing better use of the cache space.
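
As a rough illustration of how the first two modules could interact, the following is a minimal Python sketch, not the thesis implementation. It rests on several assumptions that do not come from the abstract: queries are modeled as conjunctions of equality predicates over a single collection (the thesis works with XQuery), the replacement policy is plain least-recently-used, and the names `Query`, `CacheManager`, `lookup` and `store` are hypothetical. Partial replacement is not modeled.

```python
from collections import OrderedDict


class Query:
    """Hypothetical query model: equality predicates over one collection."""

    def __init__(self, collection, predicates):
        self.collection = collection
        self.predicates = frozenset(predicates.items())

    def is_contained_in(self, other):
        # self is contained in other when every predicate of other also
        # appears in self, i.e. self is at least as restrictive.
        return (self.collection == other.collection
                and other.predicates <= self.predicates)

    def matches(self, row):
        return all(row.get(key) == value for key, value in self.predicates)


class CacheManager:
    """Assumed LRU policy; the abstract does not name the actual policy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # Query -> rows, least recent first

    def lookup(self, query):
        # Answer from the cache if some cached query contains this one.
        for cached, rows in list(self.entries.items()):
            if query.is_contained_in(cached):
                self.entries.move_to_end(cached)  # mark as recently used
                return [row for row in rows if query.matches(row)]
        return None  # cache miss: the data sources must be queried

    def store(self, query, rows):
        self.entries[query] = rows
        self.entries.move_to_end(query)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

In this sketch, a cached query for `{"year": 2007}` can answer a later query for `{"year": 2007, "lang": "pt"}` by filtering the cached rows, whereas a query for `{"lang": "pt"}` alone is a miss because the cached result may lack matching rows from other years.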

SUBJECT(S)

data integration systems, cache, query containment, replacement strategies, partial query replacement, XQuery, computer science
