Abstract data warehousing and Rolls Royces

When should you use data abstraction in your data warehouse?
(Short answer: when it is profitable for your company.)

Cisco has a nice introduction to the best practice of using data abstraction in your Enterprise Data Warehouse (EDW). They argue that the best practice is to transform your data from its original form into the form your business needs.

From an IT perspective, we often jump on the “Rolls Royce” solution rather than figuring out what the customer actually needs. We often build polished, good-looking solutions in scenarios where a quick solution that does the job would serve just as well.

From a business perspective, it is crucial that your deliverables are cost effective and have a short time to market. In other words: the IT solution must make more money than it costs. All in all, do a profitability study and make sure you have a positive business case.
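To make the "more money than it costs" test concrete, here is a minimal sketch in Python. The function name and every figure are made up for illustration, and it ignores discounting and risk, so treat it as a starting point for a real profitability study rather than a finished model.

```python
def net_benefit(yearly_benefit, build_cost, yearly_run_cost, years):
    """Total benefit minus total cost over the period (no discounting)."""
    total_benefit = yearly_benefit * years
    total_cost = build_cost + yearly_run_cost * years
    return total_benefit - total_cost


# Hypothetical numbers: a quick solution versus a "Rolls Royce" solution,
# both evaluated over three years.
quick = net_benefit(yearly_benefit=150_000, build_cost=200_000,
                    yearly_run_cost=50_000, years=3)
rolls_royce = net_benefit(yearly_benefit=200_000, build_cost=900_000,
                          yearly_run_cost=100_000, years=3)

print("Quick solution:", quick)
print("Rolls Royce solution:", rolls_royce)
```

With these invented numbers the quick solution comes out positive and the expensive one does not, even though the expensive one delivers more benefit per year.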

Why is this important? Because building a data warehouse is expensive. Building a “Rolls Royce” solution might cost more than you will get funding for. Keep in mind that between 70% and 80% of corporate BI projects fail, according to Gartner. Don’t be too ambitious.

I don’t believe EDW projects are much different. But, of course, there are more reasons why a warehouse project fails.

Make sure your BI or EDW project is profitable for your business. Then figure out if you can afford the cost of a best-practice abstract / standardised data warehouse. Don’t implement an expensive solution just because everybody else does it. Look at how this affects time to market for your EDW. How will it affect the time it takes to integrate new data or a new source? (Also, be careful about running large IT projects.)

That said, at some point most mature EDW initiatives will implement a data abstraction layer in their warehouse.

By the way: I believe the “Rolls Royce” solution is often chosen by IT because it is what most of us are taught at colleges and universities. Maybe we should introduce a topic called “cost effective solutions”?

Predictive Business Intelligence

Lately, I have spent some time on the future of BI. We know we need reporting and data warehouses. But what about going beyond data warehousing and business reporting?

Instead of asking the question:
How many leads did our last campaign result in?

We turn the question around to:
Which campaign should we run? This question is based on common sense and insight into customers.

Eventually we will ask:
How many leads will our next campaign return? Now, our answer will rely on statistical data.
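
As a rough illustration of what "relying on statistical data" could look like, here is a minimal Python sketch that fits a straight line to past campaigns and uses it to estimate the next one. The campaign figures, the function names and the choice of a simple least-squares line are all my own assumptions for the example; a real predictive model would need more data and more care.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Estimate y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical history: budget spent and leads generated per past campaign.
budgets = [1000, 2000, 3000, 4000]
leads = [48, 105, 149, 198]

slope, intercept = fit_line(budgets, leads)
next_budget = 5000  # planned budget for the next campaign (made up)
print("Estimated leads:", round(predict(slope, intercept, next_budget)))
```

The first question (how many leads did we get?) only needs the `leads` list. The last question needs a model fitted on that history, which is the step from hindsight to prediction.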

Hindsight vs Prediction:
“[…]you still need business intelligence to know what really happened in the past, but you also need predictive analytics to optimize your resources as you look to make decisions and take actions for the future[…]”

For more information regarding this topic:
I have found that the blogs at EMC.com provide thoughts and concepts from one of the bigger players in the market. Look up the articles by Bill Schmarzo for more thoughts on this topic.

Source: EMC.com
BI Analyst vs Data Scientist, what is the difference? 

Start focusing on predictive analysis. Discover who your most valuable customers are.

Statistical abilities of the Data Scientist.

IT Transformation storymap (in PDF). IT and related processes are always changing. This illustration and its linked articles bring up this topic.