Title: The Logical Data Warehouse - Design, Architecture, and Technology
Introduction
Business intelligence has changed dramatically in recent years. The time-to-market for new reports and analyses must be shortened, new data sources must be made available to business users more quickly, and self-service BI and data science must be supported. More and more users want to work with zero-latency data, the adoption of new technologies, such as Hadoop, Spark, and NoSQL, must be easy, and analysis of streaming data and big data is required.
The classic data warehouse architecture has served many organizations well. But it’s not the right architecture for this new world of BI. It’s time for organizations to migrate gradually to a more flexible architecture: the logical data warehouse architecture. This architecture, introduced by Gartner, is based on a decoupling of reporting and analyses on the one hand, and data sources on the other hand.
Classic data warehouse architectures are made up of a chain of databases, such as the staging area, the central data warehouse, and several data marts, plus countless ETL programs needed to pump data through the chain. Integrating self-service BI products with this architecture is not easy, certainly not if users want to access the source systems. Delivering 100% up-to-date data to support operational BI is difficult to implement. And how do we embed new storage technologies into this architecture?
With the logical data warehouse architecture, new data sources can be hooked up to the data warehouse more quickly, self-service BI can be supported properly, operational BI is easy to implement, the adoption of new technology is much easier, and the processing of big data becomes a technological evolution rather than a revolution.
The technology to create a logical data warehouse is available, and many organizations have already completed the migration successfully; a migration that is based on a step-by-step process and not on a full rip-and-replace approach.
This practical seminar explains the architecture and discusses the available products. It also shows how organizations can migrate their existing architecture to this new one, with tips and design guidelines to make the migration as efficient as possible.
Subjects
1. Challenges for the Classic Data Warehouse
- Integrating big data with existing data and making it available for reporting and analytics
- Supporting self-service BI, self-service data preparation, and data science
- Faster time-to-market for reports
- Polyglot persistency – processing data stored in classic SQL, Hadoop, and NoSQL systems
- Operational business intelligence, or analyzing zero-latency data
2. The Logical Data Warehouse Architecture
- The essence: decoupling of reporting and data sources
- From batch integration to on-demand integration of data
- The impact on flexibility and productivity – an improved time-to-market for reports
- Examples of organizations operating a logical data warehouse
- Can a logical data warehouse really work without a physical data warehouse?
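To make the shift from batch integration to on-demand integration concrete, here is a minimal Python sketch. Two in-memory SQLite databases stand in for remote source systems, and all table names and data are invented for illustration; a real data virtualization server would expose the same idea as a virtual table over live sources.

```python
import sqlite3

# Hypothetical source 1: a CRM system with customer data.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Acme"), (2, "Globex")])

# Hypothetical source 2: an order-entry system.
orders = sqlite3.connect(":memory:")
orders.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
orders.executemany("INSERT INTO orders VALUES (?, ?)",
                   [(1, 100.0), (1, 50.0), (2, 75.0)])

def customer_revenue():
    """A 'virtual table': the two sources are joined at query time.
    No copy of the data is materialized in advance, unlike an ETL chain."""
    totals = {cid: amt for cid, amt in orders.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id")}
    return [(name, totals.get(cid, 0.0))
            for cid, name in crm.execute("SELECT id, name FROM customers")]
```

The point of the sketch is the decoupling: the report consumes `customer_revenue()` without knowing how many sources sit behind it, so a source can be replaced without changing the report.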
3. Implementing a Logical Data Warehouse with Data Virtualization Servers
- Why data virtualization?
- Market overview: AtScale, Data Virtuality, Denodo Platform, Fraxses, IBM Data Virtualization Manager for z/OS, Red Hat JBoss Data Virtualization, Stone Bond Enterprise Enabler, TIBCO Data Virtualization, and Zetaris
- Importing non-relational data, such as XML and JSON documents, web services, NoSQL, and Hadoop data
- The importance of an integrated business glossary and centralization of metadata specifications
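Importing non-relational data usually means flattening nested documents into rows that SQL-oriented tools can consume. The following Python sketch shows the idea for a JSON document; the document structure and field names are invented for illustration:

```python
import json

# A hypothetical JSON order document, as it might arrive from a
# web service or a NoSQL store.
doc = """{"order": {"id": 7, "customer": "Acme",
          "lines": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 5}]}}"""

def flatten_order(raw):
    """Flatten one nested order document into relational rows of the
    form (order_id, customer, sku, qty): one row per order line."""
    order = json.loads(raw)["order"]
    return [(order["id"], order["customer"], line["sku"], line["qty"])
            for line in order["lines"]]
```

A data virtualization server performs this kind of mapping declaratively, so that the same nested source appears to BI tools as an ordinary flat table.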
4. Improving the Query Performance of Data Virtualization Servers
- How does caching really work?
- Using caching to minimize interference on transactional systems
- Speeding up queries by caching data in analytical SQL database servers
- Which virtual tables should be cached?
- Query optimization techniques and the explain feature
- Smart drivers/connectors can help improve query performance
- How can SQL-on-Hadoop engines speed up query performance?
- Working with multiple data virtualization servers in a distributed environment to minimize network traffic
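The caching idea behind several of these topics can be sketched in a few lines of Python. This is a deliberately simplified time-to-live cache (class and attribute names are invented for illustration), not a real product feature: repeat queries are served from a local copy, and the underlying transactional source is only queried when the copy is stale.

```python
import time

class CachedVirtualTable:
    """Minimal TTL-cache sketch for a virtual table: serve repeat
    queries locally and only hit the source when the copy is stale."""

    def __init__(self, fetch, ttl_seconds=60.0):
        self.fetch = fetch          # callable that queries the real source
        self.ttl = ttl_seconds      # how long a cached copy stays valid
        self._rows = None
        self._loaded_at = 0.0
        self.source_hits = 0        # how often the source was actually queried

    def rows(self):
        now = time.monotonic()
        if self._rows is None or now - self._loaded_at > self.ttl:
            self._rows = self.fetch()
            self._loaded_at = now
            self.source_hits += 1
        return self._rows
```

Because every read within the TTL window skips the source entirely, interference on the transactional system drops in proportion to how often the virtual table is queried.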
5. Migrating to a Logical Data Warehouse
- An A-to-Z roadmap
- Guidelines for the development of a logical data warehouse
- Three different modeling methods: outside-in, inside-out, and middle-out
- The value of a canonical data model
- Security considerations
- Step-by-step dismantling of the existing architecture
- The focus on sharing metadata specifications for integration, transformation, and cleansing
6. Self-Service BI and the Logical Data Warehouse
- Why self-service BI can lead to “report chaos”
- Centralizing and reusing metadata specifications with a logical data warehouse
- Upgrading self-service BI to managed self-service BI
- Implementing Gartner’s bimodal environment
7. Big Data and the Logical Data Warehouse
- New data storage technologies for big data, including Hadoop, MongoDB, and Cassandra
- The rise of the polyglot persistent environment: each application its own optimal database technology
- Design rules for integrating big data and the data warehouse seamlessly
- Big data is too “big” to copy
- Offloading cold data with a logical data warehouse
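Offloading cold data works because a virtual table can hide the split between stores: the consumer runs one query, and the query's own predicate decides whether the cold store is touched at all. A minimal Python sketch of this routing, with invented data and an invented cutoff date for illustration:

```python
from datetime import date

# Hypothetical split: recent ("hot") rows stay in the warehouse,
# older ("cold") rows have been offloaded to cheaper storage.
HOT = [(date(2024, 5, 1), 100), (date(2024, 6, 1), 200)]
COLD = [(date(2020, 1, 1), 10), (date(2021, 1, 1), 20)]
CUTOFF = date(2024, 1, 1)       # everything before this was offloaded

def sales_since(start):
    """One virtual table over both stores: the consumer never sees the
    split. Cold storage is only read when the query needs old data."""
    rows = [r for r in HOT if r[0] >= start]
    if start < CUTOFF:          # predicate decides the routing
        rows += [r for r in COLD if r[0] >= start]
    return sorted(rows)
```

A query for recent data never touches the cold store, so the offloading is invisible both functionally and, for most queries, in performance.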
8. Physical Data Lakes or Virtual Data Lakes?
- What is a data lake?
- Is developing a physical data lake realistic when working with big data?
- Developing a virtual data lake with data virtualization servers
- Can the logical data warehouse and the virtual data lake be combined?
9. Implementing Operational BI with a Logical Data Warehouse
- Examples of operational reporting and operational analytics
- Extending a logical data warehouse with operational data for real-time analytics
- “Streaming” data in a logical data warehouse
- The coupling of data replication and data virtualization
10. Making Data Vault More Flexible with a Logical Data Warehouse
- What exactly is Data Vault?
- Using a logical data warehouse to make data in a Data Vault available for reporting and analytics
- The structured SuperNova design technique for developing virtual data marts
- SuperNova turns a Data Vault into a flexible database
11. The Logical Data Warehouse and the Environment
- Design principles for defining data quality rules in a logical data warehouse
- How data preparation can be integrated with a logical data warehouse
- Shifting tasks in the BI Competency Center (BICC)
- Which new development and design skills are important?
- The impact on the entire design and development process
12. Closing Remarks
Learning Objectives
In this seminar Rick van der Lans answers the following questions:
- What are the practical benefits of the logical data warehouse architecture, and how does it differ from the classic architecture?
- How can organizations migrate step by step, and successfully, to this flexible logical data warehouse architecture?
- What are the possibilities and limitations of the various available products?
- How do data virtualization products work?
- How can big data be added transparently to the existing BI environment?
- How can self-service BI be integrated with the classical forms of BI?
- How can users be granted access to 100% up-to-date data without disrupting the operational systems?
- What are the real-life experiences of organizations that have already implemented a logical data warehouse?
Geared to: Business intelligence specialists; data analysts; data warehouse designers; business analysts; data scientists; technology planners; technical architects; enterprise architects; IT consultants; IT strategists; systems analysts; database developers; database administrators; solutions architects; data architects; IT managers.
Related Whitepapers:
Data Fabrics for Frictionless Data Access; April 2021; sponsored by TIBCO Software
Raising the Bar for Data Virtualization; September 2020; sponsored by Intenda
Overcoming Cloud Data Silos with Data Virtualization; June 2020; sponsored by TIBCO Software
Modernizing Data Architectures for a Digital Age Using Data Virtualization; October 2019; sponsored by Denodo Technologies
The Business Benefits of Data Virtualization; May 2019; sponsored by Denodo Technologies
The Fusion of Distributed Data Lakes - Developing Modern Data Lakes; February 2019; sponsored by TIBCO Software
Unifying Data Delivery Systems Through Data Virtualization; October 2018; sponsored by Fraxses
Architecting the Multi-Purpose Data Lake With Data Virtualization; April 2018; sponsored by Denodo
Data Virtualization in the Time of Big Data; December 2017; sponsored by TIBCO Software
Developing a Bi-Modal Logical Data Warehouse Architecture Using Data Virtualization; September 2016; sponsored by Denodo
Designing a Logical Data Warehouse; February 2016; sponsored by Red Hat
Designing a Data Virtualization Environment; A Step-By-Step Approach; January 2016
Migrating to Virtual Data Marts using Data Virtualization; Simplifying Business Intelligence Systems; January 2015; sponsored by Cisco
Re-think Data Integration: Delivering Agile BI Systems With Data Virtualization; March 2014; sponsored by Red Hat
Creating an Agile Data Integration Platform using Data Virtualization; May 2013; sponsored by Stone Bond Technologies
Data Virtualization for Business Intelligence Agility; February 2012; sponsored by Cisco (Composite Software)
Related Articles and Blogs:
A Decentralized Master Data Solution using Data Virtualization
Streamlining External Data Access to Enrich Analytics
The Data Mesh, the New Kid on the Data Architecture Block
Developing a Data Fabric
Making Big Data Easy with Data Virtualization
Data Herding Is Not Data Integration!
Benefits of Data Virtualization to Data Scientists
Eight Data Virtualization Features to Help an Organization Become Data-Driven, June 2020
Data Virtualization and SnowflakeDB: A Powerful Combination, January 2020
Spark and Data Virtualization: Competitors or Cooperators, October 2019
Simplifying Big Data Projects with Data Virtualization, March 2019
Easy Database Migration with Data Virtualization, January 2019
Data Virtualization and the Fulfilling of Ted Codd's Dream
Data Virtualization or SQL-on-Hadoop for Logical Data Architectures?
Simplifying Big Data Integration with Data Virtualization
Data Virtualization for Developing Customer-Facing Apps
Do Data Scientists Really Ask for Physical Data Lakes
Do We Really Deploy ETL in Our Data Warehouse Architectures
Challenges for Developing Data Lakes
OLAP-on-Hadoop on the Rise
The Big BI Dilemma
The Logical Data Warehouse Architecture is Tolerant to Change
The Need for Flexible, Bi-Modal Data Warehouse Architectures
The Roots of the Logical Data Warehouse Architecture
The Logical Data Warehouse Architecture is Not the Same as Data Virtualization
Data Virtualization is Not the Same as Data Federation
Data Virtualization and Data Vault: Double Agility
Convergence of Data Virtualization and SQL-on-Hadoop Engines
Data Virtualization: Where Do We Stand Today?
What is Data Virtualization?