When should you use data abstraction in your data warehouse?
(Short answer: when it is profitable for your company)
Cisco has a nice introduction to the best practice of using data abstraction in your Enterprise Data Warehouse (EDW). They argue that the best practice is to transform your data from its original form into the form your business needs.
From an IT perspective, we often jump on the “Rolls Royce” solution rather than figuring out what the customer actually needs. We build polished, good-looking solutions in scenarios where a quick solution that does the job would serve just as well.
From a business perspective, it is crucial that your deliverables are cost-effective and have a short time to market. In other words: the IT solution must make more money than it costs. All in all, do a profitability study and make sure you have a positive business case.
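As a rough illustration of such a profitability check, here is a minimal Python sketch. Everything in it is an assumption made up for the example: the `Option` fields, the `net_benefit` formula (no discounting, benefits and running costs only start after launch) and all the figures. A real business case would use your own numbers and probably a more careful model.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One candidate solution. All figures are hypothetical placeholders."""
    name: str
    build_cost: float        # one-off cost to build the solution
    annual_run_cost: float   # yearly cost to operate it once it is live
    annual_benefit: float    # yearly money the business expects to gain
    months_to_market: int    # delay before the solution goes live


def net_benefit(option: Option, horizon_years: int) -> float:
    """Total benefit minus total cost over the horizon (no discounting)."""
    # Benefits and running costs only count for the time the solution is live.
    live_years = max(horizon_years - option.months_to_market / 12, 0)
    benefit = option.annual_benefit * live_years
    cost = option.build_cost + option.annual_run_cost * live_years
    return benefit - cost


def is_profitable(option: Option, horizon_years: int) -> bool:
    """True when the solution makes more money than it costs."""
    return net_benefit(option, horizon_years) > 0


# Made-up numbers, purely for illustration.
quick = Option("quick solution", 200_000, 50_000, 300_000, 3)
rolls_royce = Option("Rolls Royce", 1_500_000, 100_000, 400_000, 18)

for option in (quick, rolls_royce):
    print(option.name, net_benefit(option, 3), is_profitable(option, 3))
```

With these invented figures and a three-year horizon, the quick solution comes out positive and the “Rolls Royce” solution negative, mostly because of its build cost and its longer time to market. Change the inputs and the conclusion can flip, which is the point of doing the study first.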
Why is this important? Because building a data warehouse is expensive. A “Rolls Royce” solution might cost more than you will get funding for. Keep in mind that between 70% and 80% of corporate BI projects fail, according to Gartner. Don’t be too ambitious.
I don’t believe EDW projects are much different. But, of course, there are other reasons a warehouse project can fail.
Make sure your BI or EDW project is profitable for your business. Then figure out whether you can afford the cost of a best-practice, abstracted and standardised data warehouse. Don’t implement an expensive solution just because everybody else does. Look at how it affects time to market for your EDW: how will it change the time it takes to integrate new data or a new source? (Also, be careful about running large IT projects.)
That said, at some point most mature EDW initiatives will implement a data abstraction layer in their warehouse.
By the way: I believe the “Rolls Royce” solution is often chosen by IT because it is what most of us are taught at colleges and universities. Maybe we should introduce a course on “cost-effective solutions”?