Data Warehouse Automation – A Review

A. S. Kavitha, R. Kavitha

Abstract


Business enterprises invest heavily in data warehouses that give them real, constant and up-to-date data for decision making. Traditionally, data warehouses are kept current through periodic updates, which introduce a delay between operational data and warehouse data. These updates are triggered at a fixed time; some organizations schedule them for the evening, when the systems carry little load. A fixed schedule does not work in every case: many companies run day and night without a break, and in those situations periodic updates leave the warehouse stale. The delay depends on the update interval: as the interval grows, so does the difference between operational and warehouse data. The most recent data is unavailable for analysis because it still resides in the operational data sources. For timely and effective decision making, the warehouse should be updated as soon as possible. Extraction, Transformation and Loading (ETL) tools are designed to update the warehouse. When the warehouse is refreshed, the process often stalls because resources are overloaded. The update time should therefore be chosen carefully so that resources are used efficiently. Updating the warehouse is not a one-time task but a cyclic process. This paper introduces automation for ETL: the proposed framework selects the best time to run the process, so that the warehouse is updated automatically as soon as resources are available, without compromising data warehouse usage.
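
The abstract does not specify how the framework decides that resources are available. As a rough illustration only, the idea of triggering a refresh once load stays low can be sketched as below. Everything here is an assumption made for the example and not taken from the paper: the names `LoadSample`, `first_idle_time` and `refresh_when_idle`, the utilization metric on a 0.0 to 1.0 scale, and the default threshold and window length.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class LoadSample:
    """One observation of system load (hypothetical metric, 0.0 to 1.0)."""
    timestamp: int       # e.g. minutes since monitoring started
    utilization: float   # fraction of resources currently in use


def first_idle_time(
    samples: Iterable[LoadSample],
    threshold: float = 0.3,
    min_consecutive: int = 3,
) -> Optional[int]:
    """Return the timestamp of the sample that completes the first run of
    `min_consecutive` samples below `threshold`, or None if no such run
    occurs. The threshold and run length are assumed tuning knobs."""
    run_length = 0
    for sample in samples:
        if sample.utilization < threshold:
            run_length += 1
            if run_length >= min_consecutive:
                return sample.timestamp
        else:
            # Load went back up, so the quiet period is broken.
            run_length = 0
    return None


def refresh_when_idle(
    samples: Iterable[LoadSample],
    run_etl: Callable[[int], None],
    threshold: float = 0.3,
    min_consecutive: int = 3,
) -> bool:
    """Call `run_etl(timestamp)` at the first confirmed idle moment.
    Returns True if a refresh was triggered, False otherwise."""
    trigger = first_idle_time(samples, threshold, min_consecutive)
    if trigger is None:
        return False
    run_etl(trigger)
    return True


if __name__ == "__main__":
    loads = [0.8, 0.2, 0.1, 0.5, 0.2, 0.25, 0.1, 0.9]
    history = [LoadSample(t, u) for t, u in enumerate(loads)]
    refresh_when_idle(history, lambda t: print(f"ETL refresh started at t={t}"))
```

Requiring several consecutive low readings, instead of reacting to a single one, is one simple way to avoid starting a refresh during a brief dip in load; because the warehouse update is cyclic, such a check would be repeated after each refresh.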

Keywords


ETL, Updating, Loading, Data Warehouse

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.