By Evan Levy
When companies initially embark on BI development initiatives, they often underestimate the complexity involved. Some begin BI in the first place because their packaged applications don’t deliver the reporting functionality they need. Others embark on BI because the data they need to analyze is spread across multiple, disparate application systems. While positioning a data warehouse to integrate and store historical data from packaged applications like ERP or CRM is a reasonable and proven approach, many companies try to repurpose the development methods associated with these packages to deliver BI.
But comparing development methods and skill sets for these two divergent types of systems is like comparing apple picking to fruit-salad making. The methodology for building a data warehouse closely resembles traditional code development with lower-level programming languages. To build a data warehouse successfully, a team needs skills in business requirements gathering, functional requirements definition, specification and design, data modeling, and database design, as well as all the skills associated with loading the data and coding the application. This is clearly a complex mix of technical knowledge for delivering a business solution, spanning everything from storage allocation to workload management to systems integration to application programming. The fact is you’re building something from scratch.
The packaged application world is complex in its own right, but it’s also very different, as are the skills and methodologies involved in building these environments. Most IT organizations accustomed to implementing packages use third-party firms to install and configure these systems. Their staff members don’t have the necessary skills to build these solutions, and often require training and multiple years of hands-on use to be proficient in supporting these systems. In addition, most organizations forget that implementing their business applications typically takes a year or longer.
When was the last time you were allowed a full year to implement your data warehouse? And was your team even half the size of the packaged app’s development team?
By Evan Levy
One of the most misunderstood roles on a BI team is the project manager. All too often the role is defined as an administrative set of activities focused on writing and maintaining the project plan, tracking the budget, and monitoring task completion. Unfortunately, IT management rarely understands the importance of domain knowledge (having BI experience) and leadership skills.
Assigning a BI project manager who has no prior BI experience is an accident waiting to happen. Imagine a homeowner who retains a construction company to build a new house, only to discover that the foreman has never built one before. You’d want the foreman to have fundamental knowledge of demolition, framing, plumbing, wiring, and so on, and to be able to recognize whether the work was being done the right way.
Unfortunately, IT managers think they can place certified project managers on BI teams without any knowledge of BI-specific development processes, business decision-making, data content, or technology. We often find ourselves coaching these project managers on the differences in BI development, or introducing concepts like staging areas and federated queries. This is time that could be better spent transferring knowledge and formalizing development processes with a more seasoned project lead.
For a project team to be successful, the project manager should have strong leadership skills. The ability to communicate a common goal and maintain focus is both an art and a science. But BI project managers often behave more like bureaucrats, requesting task completion percentages and reviewing labor hours. They are rarely invested in whether the project is adhering to development standards, whether permanent staff is preparing to take ownership of the code, or whether the developers are collaborating.
An effective BI project manager should be a project leader, one who understands that success is not a completed project plan or a balanced budget spreadsheet, but a project that delivers usable data and fulfills business requirements. The BI project manager should instill the belief that success means delivery against business goals, not mere task completion.
By Evan Levy
Sometimes we find clients who overestimate their need for analytics. IT is often focused on using BI to analyze a problem exhaustively, when exhaustive analysis just isn’t necessary. Sometimes the analytics requirements simply aren’t that sophisticated.
Twenty years ago, WalMart knew when it needed to pull a product from the shelf. This didn’t require advanced analytics to drill down into the category, the affinities, the seasonality, or the purchaser. It was simple: if the product didn’t sell after six days, free up the shelf space and move on. After all, there were other products to sell.
Why does this matter? Because we get so wrapped up in new, more sophisticated technologies that we forget about our requirements. Sometimes we just need to know what the problem and the resulting action are. We don’t necessarily need to know the "why" every time. Often, all business users want is information that’s good enough to support the decision they need to make.
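The six-day shelf rule described above is the whole point: a trivial threshold check, no analytics engine required. Here is a minimal sketch of such a rule; the function and field names are hypothetical, not WalMart's actual system.

```python
from datetime import date

def should_pull_product(first_stocked: date, units_sold: int,
                        today: date, max_days: int = 6) -> bool:
    """Simple decision rule: if a product hasn't sold anything after
    max_days on the shelf, free up the space. No drill-down, no
    affinity or seasonality analysis -- just a threshold."""
    days_on_shelf = (today - first_stocked).days
    return days_on_shelf >= max_days and units_sold == 0

# Stocked a week ago, zero units sold -> pull it.
print(should_pull_product(date(2024, 1, 1), 0, date(2024, 1, 8)))  # True
```

That single comparison supports the decision the business actually needs to make, which is often all the "analytics" required.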
I have every tool known to man. I won’t take you into my garage right now or into my home office. (I gotta leave something for another blog.) But come into my kitchen for a sec. Kitchen gadgets are a tool-lover’s dream. You can pulverize stuff in the kitchen just like you can on your workbench in the garage. Only it tastes a lot better.
My three favorite kitchen gadgets, in order, are:
- The Thunder Stick™. The Thunder Stick is more than just a simple hand blender. It’s 8,000 RPM in your hand. Think of the possibilities. With a Thunder Stick you can turn granulated sugar into powdered sugar. You can make milk frothy. I really haven’t tested all the possibilities.
- The Vita Mix™. This is a 5-horsepower blender. You can make flour out of wheat. The friction and speed of the blades will boil water! Sure, you can make a nice smoothie with it too. (And if you want to add in some iron, just throw in a Phillips-head screw!) The recipe book is two inches thick, for heaven’s sake.
- A butter knife. It’s smooth.
Jill says I only like movies where someone gets blown away with an automatic weapon or something gets destroyed with explosives. But the fact that a butter knife is on my list of favorite kitchen gadgets proves her wrong.
In my last blog post, I described the reality of so-called analytical data integration, which is really just a fancy name for ETL. Now let's talk about so-called operational data integration. I'm assuming that when the vendors talk about this, it's the same thing as "data integration for operational systems." Most business applications use point-to-point solutions to retrieve and integrate data for their own specific processing needs. This is ETL in reverse: it's a "pull" process as opposed to a "push" process.
Unfortunately, this involves a lot of duplicate processing just so individual applications can access individual records from source systems. And as with their analytical brethren, the moment a source system changes, exponential work is necessary to support the modification. Multiply this by thousands of data elements and dozens of source systems, and you’ll find a farm of silos and hundreds (if not thousands) of data integration jobs. It’s not an uncommon problem.
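The scaling problem described above is the classic point-to-point explosion: with every system pulling directly from every other, the number of feeds grows roughly quadratically, while a shared integration hub grows linearly. A quick back-of-the-envelope illustration (the system counts are hypothetical):

```python
def point_to_point_interfaces(n_systems: int) -> int:
    # Every system pulling from every other system:
    # n * (n - 1) directed feeds to build and maintain.
    return n_systems * (n_systems - 1)

def hub_interfaces(n_systems: int) -> int:
    # Each system connects to a shared hub once in each direction:
    # 2 * n feeds, regardless of how many peers exist.
    return 2 * n_systems

for n in (5, 12, 30):
    print(f"{n} systems: point-to-point={point_to_point_interfaces(n)}, "
          f"hub={hub_interfaces(n)}")
```

At a dozen source systems, that is 132 point-to-point feeds versus 24 through a hub, and each source-system change ripples through every feed that touches it.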
In most BI environments we begin with a large batch data movement process. We build our ETL so it can run overnight. But data volumes grow until overnight isn’t enough, so the next evolution is building "trickle load" ETL. The issue here is that data integration is less about how the data is used than about when the data is needed and what level of data quality is required. Most operational systems don’t clean the data; they just move it. And most ETL jobs for data warehouses will standardize the formatting, but they won’t change the values. (And if they do fix the values, they don’t communicate those changes back to the source systems.)
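The distinction above, standardizing formats without correcting values, is worth making concrete. A toy sketch (the record fields and rules are hypothetical, not from any particular ETL tool):

```python
def standardize_format(record: dict) -> dict:
    """Typical warehouse-ETL behavior: normalize casing, whitespace,
    and date layout, but leave the underlying values untouched.
    A wrong value (e.g. an invalid state code) passes straight through."""
    return {
        "name": record["name"].strip().title(),
        "state": record["state"].strip().upper(),
        "signup": record["signup"].replace("/", "-"),
    }

raw = {"name": "  acme corp ", "state": "tx", "signup": "2024/01/05"}
print(standardize_format(raw))
# A record with state "zz" would come out as "ZZ": formatting is
# standardized, but the value is neither validated nor corrected,
# and nothing flows back to the source system.
```

Fixing the values themselves (and feeding corrections back upstream) is a separate, bi-directional problem that this kind of one-way job never addresses.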
If I have specialized data needs, I should build specialized integration logic. If I have commodity or standard needs for data that everyone uses, the data should be highly cleansed.
So it's not about analytical versus operational data integration. It's not even about how the data is used. It's really about one-way versus bi-directional data provisioning. As usual, the word integration is used too loosely. In either case, the presumption that the target is a relational database is naïve. And whether it's for analytical or operational integration is beside the point.