How Data Integration Works

Networked Databases

For data integration systems that rely on information that changes frequently, a data warehouse approach isn't ideal. One way that IT experts try to address this issue is to design systems that pull data directly from individual data sources. Since there's no centralized database dedicated to analyzing, categorizing and integrating the data in preparation for user queries, those responsibilities fall to other parts of the system.

IT experts define data integration systems in terms of schemata. The unified view produced from a processed query is the global schema. The structure of the various data sources and the way they relate to one another is the source schema. The way the global and source schemata interrelate is called mapping. Think of the source schema as a blueprint for all the data within the system, while the global schema is a blueprint for the view presented in response to a query.
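To make those three terms concrete, here is a minimal sketch in Python. The source names (airport_feed, city_sensors), the attribute names and the helper mapping_is_complete are all hypothetical, invented purely for illustration; a real system would express its schemata and mappings in a far richer language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A named table plus the attributes (columns) it exposes."""
    name: str
    attributes: tuple


# Source schema: how each (hypothetical) data source structures its own data.
source_schema = {
    "airport_feed": Relation("observations", ("station", "temp_f")),
    "city_sensors": Relation("readings", ("city", "celsius")),
}

# Global schema: the single unified view a user's query is written against.
global_schema = Relation("weather", ("location", "temp_c"))

# Mapping: for each source, which source attribute feeds each global attribute.
# (A real mapping would also carry conversions, such as Fahrenheit to Celsius.)
mapping = {
    "airport_feed": {"location": "station", "temp_c": "temp_f"},
    "city_sensors": {"location": "city", "temp_c": "celsius"},
}


def mapping_is_complete(global_rel, sources, source_mapping):
    """True if every source maps each global attribute to one of its own attributes."""
    for source_name, relation in sources.items():
        correspondences = source_mapping.get(source_name, {})
        for attr in global_rel.attributes:
            if correspondences.get(attr) not in relation.attributes:
                return False
    return True
```

Calling mapping_is_complete(global_schema, source_schema, mapping) checks that the blueprint for the view can actually be filled in from the blueprint for the data.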

There are two main approaches to resolving queries in a data integration system: global-as-view and local-as-view. Each approach focuses on a particular part of the overall system, and each has its advantages and disadvantages.

In a global-as-view approach, the focus is on the global schema: each element of the global schema is defined as a query over the data sources. As long as the set of data sources remains stable, the global-as-view approach works well. It's easy to change the setup of the global schema, so it's not difficult to analyze the same overall set of data in different ways. However, adding data sources to the system or removing them from it is problematic, because each change affects the system as a whole.
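A rough sketch of the global-as-view idea, assuming two made-up sources held in memory: the global relation is simply a function (weather_view, a hypothetical name) written as a query over the sources, and a user query is answered by running that function and filtering the result.

```python
# Hypothetical source data; a real system would query live sources instead.
airport_feed = [{"station": "ATL", "temp_f": 86.0}]
city_sensors = [{"city": "Oslo", "celsius": 12.5}]


def weather_view():
    """Global relation 'weather', defined directly as a query over the sources."""
    rows = []
    for record in airport_feed:
        # Convert Fahrenheit to Celsius so every row uses the global unit.
        temp_c = round((record["temp_f"] - 32) * 5 / 9, 1)
        rows.append({"location": record["station"], "temp_c": temp_c})
    for record in city_sensors:
        rows.append({"location": record["city"], "temp_c": record["celsius"]})
    return rows


def warm_locations(threshold_c):
    """A user query phrased only in terms of the global schema."""
    return [row["location"] for row in weather_view() if row["temp_c"] > threshold_c]
```

Writing a new query like warm_locations is cheap, because it only touches the global view. Adding a third source is the costly part: weather_view itself has to be edited, since the view definition names every source explicitly.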

The local-as-view technique takes the opposite approach. It focuses on the data sources, describing each one in terms of the global schema. As long as the global schema remains constant, it's easy to add data sources to the system or remove them from it. The system looks for the same kinds of data and relationships within the new data sources. In this approach, changing the parameters of the global schema is difficult. If you want to analyze the data sources in a new way, you'll have to redefine the entire system.
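Here is a deliberately simplified sketch of the local-as-view idea. Real local-as-view systems rewrite queries against full view definitions; this version reduces each source description to the set of global attributes the source can supply. The names SourceDescription, register and answer, along with the sample sources, are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SourceDescription:
    """Describes one source in terms of the (fixed) global schema."""
    name: str
    provides: frozenset          # global attributes this source can supply
    fetch: Callable[[], list]    # returns rows already expressed in global terms


registry = []


def register(source):
    """Adding a source only means adding its description; nothing else changes."""
    registry.append(source)


def answer(wanted):
    """Collect rows for the wanted global attributes from every source covering them."""
    wanted = frozenset(wanted)
    rows = []
    for source in registry:
        if wanted <= source.provides:
            for row in source.fetch():
                rows.append({attr: row[attr] for attr in sorted(wanted)})
    return rows


register(SourceDescription(
    "city_sensors",
    frozenset({"location", "temp_c"}),
    lambda: [{"location": "Oslo", "temp_c": 12.5}],
))
register(SourceDescription(
    "tide_gauges",
    frozenset({"location", "tide_m"}),
    lambda: [{"location": "Dover", "tide_m": 3.2}],
))
```

Registering another source later requires no change to answer, which is the flexibility the approach is known for. The trade-off shows up on the other side: if the global attributes were renamed or restructured, every registered description would have to be rewritten.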

So that's the story on data integration. The next time you look at a weather map or call up a filtered selection of data, you'll be aware of the complex series of processes going on in the background making it all possible.

