Performance Analytic EDW Architecture
An enterprise data warehouse (EDW) provides immediate access to a single source of current, accurate enterprise information to support reporting and analytics. Developing an EDW is an iterative, ongoing process because business dynamics change and business user analytic questions become more sophisticated as their analytic skills increase.
The BI EDW Architecture helps clients understand the objective, data flow and processes involved in developing an EDW. An EDW project plan considers the extract transform and load (ETL) processes that extract and integrate data from multiple application data sources into either Stage databases or Operational Data Stores; organize the data in a dimensional EDW model; and, deploy the data into functional data marts for business user access.
Depending on the latency requirement (e.g., real-time, hourly, daily, weekly, etc.), ETL processes extract data from one database, transform the data if needed into another data type and/or calculated field, and load the resulting data into another database. The BI ETL Solution is a framework for planning. In Industry and Strategic Business Unit Solutions, where industry-standard application data sources and dimensional model targets are reasonably well-defined, the ETL processes are also more well-defined.
In the BI ETL Solution, there are three ETL Processes: Source to ODS/Stage, ODS/Stage to Data Warehouse and Data Warehouse to Data Marts.
Whenever reports are run against an online transaction processing (OLTP) application database, “record lock-out” is invoked, which prevents data entry transactions from being applied to the database. If a customer service representative is talking to a customer and entering data when a report is run, the application freezes and the customer must wait…and wait…and wait until the report has finished running. Not the best way to get high customer satisfaction ratings.
To prevent record lock-out, application data is extracted from application databases during off-peak hours and loaded into either an operational data store (ODS) or a Staging database. Many times, ODS are provided by application vendors for applications reporting. These can be used as data sources for the EDW. When an ODS is not provided, the ETL process extracts required application data and simply stores it in a database as quickly and efficiently as possible.
In some situations, a preliminary database between the ODS/Stage and the EDW may be required: an Integration database, where snow-flake schemas are used to differentiate rapidly- and slowly-changing dimensional data. Most situations, however, are very satisfactorily addressed with dimensional, star-schema data warehouses. There are many details to consider, such as Primary and Foreign Keys, data types, granularity, period synchronization, and others. For more information about data warehouse details, you might want to read The Data Warehouse Lifecycle Toolkit, by Ralph Kimball, considered by many to be the EDW Developer’s Bible.
Industry and SBU Solutions contain dimension and fact tables that reflect standard business terms and relationships.
One of the primary reasons for the high value of a data warehouse is that the different terms used by the business units, which are the source of chaos in management meetings because everyone is “speaking a different language,” are standardize in an EDW data dictionary. The process for establishing and maintaining the EDW Data Dictionary is referred to as Data Governance. The process involves as much political mediation as data profiling, so getting everyone to agree on the term for Customer, Product, Location, Employee, and other cross-business unit terms requires a strong executive sponsor and a patient and persistent Data Governance Coordinator.
BI Industry and SBU Data Governance Solutions contain a data dictionary with standard terms and calculate fields.
Data Mart Models
Reporting and analytic communities in small to medium businesses (SMBs) that mainly use standard reports with little ad hoc reporting can be satisfied with only an EDW. When reporting and analytic communities become large and active, however, data warehouse reporting run times become long and frustrating for users. This is when data marts can be an invaluable component of an EDW architecture.
BI Industry and SBU Data Mart Solutions contain significant examples.