Having read the very interesting article on MediaHype 2.0 I thought I should share a few words about Dimensional Modelling especially since the in-vogue phrase “Data as the next “Intel Inside“ was referred to. Yes, everyone knows that data is fundamentally the cornerstone of the Enterprise but many Enterprises still struggle to maximise the knowledge they can extract from their Megabytes/Gigabytes/Terabytes (and ultimately Petabytes!) of Data.
The basic goal of most applications developed is to very effectively insert, delete and update records but this unfortunately has a high cost and impact on querying/reporting. Third Normal Form databases are not easy for Business Users to navigate and a simple request for a few columns & rows of data can result in complex queries with multiple outer-joins to many tables. Business Users often want to compare some KPI for a particular instance in time with the same KPI as it was in a different time frame, adding to the complexity of queries. Additionally the database being queried has many millions of rows but we possibly can’t add indexes to assist querying the database as this could affect our OLTP application.
It would be so much easier for our Business Users if we modelled the data so that it will support reporting requirements in a Reporting Database – Data Mart or Data Warehouse – in my experience Dimensional Modelling is the key to delivering what our Business Users need.
Instead of juggling many tables and multiple complex joins we could have all the necessary values readily available in a Dimensional Model for the Business Users. (Transporting the data into the Dimensional Model is a topic for another day i.e. ETL process)
Dimensional Modelling allows for the removal of all the table joins associated with traditional ER Modelling reducing complexity, potential for mistakes due to joining tables incorrectly and improves performance as table-joins slow down queries.
Instead of trying to aggregate measures by Month we have already organised the data and facts to allow this type of analysis by Month, along with Day, Year, Quarter etc. as well as other predefined date attributes that the business will use e.g. Holiday, Weekday. (Yes – the Date Dimension is one of my favourites!)
We don’t have to worry about adversely affecting our OLTP application so we can add the necessary indexes to assist our queries. Additionally, the Business Users don’t have to worry about complex joins as all records in the fact table have corresponding records in each of the Dimensions. We also have the capability to capture data changes effectively in our Dimensions without having to use complex (or reporting unfriendly) journal tables or logs.
The two main drivers when deciding to use Dimensional Modelling are
1.a business need for high-performance query access that you are restricted from obtaining due to OLTP application priorities and
2. a need for ‘Ease of Use’, in particular for Business Users, who primarily require analytical capabilities as well as transactional reporting.
If this has grabbed your attention you now need to turn to Ralph Kimball, original publisher of Dimensional Modelling techniques, for further reading. I’ll let his WebSite speak for him.
Those of you interested in Reporting Tools will know we have had new releases from 4 of the Big Business Intelligence Vendors in Autumn 2005. The OLAP Report has an excellent article by Nigel Pendse comparing each of the 4 releases and also includes a piece on Oracle & IBM’s stance in the BI marketplace today.
So to conclude – yes, data is the fundamental cornerstone of the Enterprise and those of us lucky to work in the Reporting Area have exciting times ahead to look forward to as more Enterprises discover the opportunities available with Dimensional Modelling.