The Misunderstanding of Master Data Management
Not long ago, I was asked to review a client’s program initiative that was focused on constructing a new customer repository that would establish a single version of truth. The client was very excited about using Master Data Management (MDM) to deliver their new customer view. The problem statement was well thought out: their customer data is spread across 11 different systems; users and developers retrieve data from different sources; reports reflect conflicting details; and an enormous amount of manual effort is required to manage the data. The project’s benefits were also well thought out: increased data quality, improved reporting accuracy, and improved end user data access. And, (as you can probably imagine), the crowning objective of the project was going to be creating a Single View of the Customer. The program’s stakeholders had done a good job of communicating the details: they reviewed the existing business challenges, identified the goals and objectives, and even provided a summary of high-level requirements. They were going to house all of their customer data on an MDM hub. There was only one problem: they needed a customer data mart, not an MDM hub.
I hate the idea of discussing technical terms and details with either business or IT staff. It gets particularly uncomfortable when someone was misinformed about a new technology (and this happens all the time when vendors roll out new products to their sales force). I won’t count the number of times that I’ve seen projects implemented with the wrong technology, because the organization wanted to get a copy of the latest and greatest technical toy. A few of my colleagues and I used to call this the “bright shiny project syndrome”. While it’s perfectly acceptable to acquire a new technology to solve a problem, it can be a very expensive to purchase a technology and force fit a solution that it doesn’t easily address.
It’s frequent that folks confuse the function and purpose of Master Data Management with Data Warehousing. I suspect the core of the problem is that when folks hear about the idea of “reference data” or a “golden record”, they have this mental picture of a single platform containing all of the data. While I can’t argue with the benefit of having all the data in one place (data warehousing has been around for more than 20 years), that’s not what MDM is about. Data Warehousing became popular because of its success in storing a company’s historical data to support cross-functional (multi-subject area) analysis. MDM is different; it’s focused on reconciling and tracking a single subject area’s reference data across the multitude of systems that create that data. Some examples of a subject area include customer, product, and location.
If you look at the single biggest obstacle in data integration, it’s dealing with all of the complexity of merging data from different systems. It’s fairly common for different application systems to use different reference data (The CRM system, the Sales system, and the Billing system each use different values to identify a single customer). The only way to link data from these different systems is to compare the reference data (names, addresses, phone numbers, etc.) from each system with the hope that there are enough identical values in each to support the match. The problem with this approach is that it simply doesn’t work when a single individual may have multiple name variations, multiple addresses, and multiple phone numbers. The only reasonable solution is the use of advanced algorithms that are specially designed to support the processing and matching of specific subject area details. That’s the secret sauce of MDM – and that’s what’s contained within a commercial MDM product.
The MDM hub not only contains master records (the details identifying each individual subject area entry), it also contains a cross reference list of each individual subject area entry along with the linkage details to every other application system. And, it’s continually updated as the values change within each individual system. The idea is that an MDM hub is a high performance, transactional system focused on matching and reconciling subject area reference data. While we’ve illustrated how this capability simplifies data warehouse development, this transactional capability also enables individual application systems to move and integrate data between transactional systems more efficiently too.
The enormous breadth and depth of corporate data makes it impractical to store all of our data within a single system. It’s become common practice to prune and trim the contents of our data warehouses to limit the breadth and history of data. If you consider recent advances with big data, cloud computing, and SaaS, it becomes even more apparent that storing all of a company’s subject area data in a single place isn’t practical. That’s one of the reasons that most companies have numerous data marts and operational applications integrating and loading their own data to support their highly diverse and unique business needs. An MDM hub is focused on tracking specific subject area details across multiple systems to allow anyone to find, gather, and integrate the data they need from any system.
I recently crossed paths with the above mentioned client. Their project was wildly successful – they ended up deploying both an MDM hub and a customer data mart to address their needs. They mentioned that one of the “aha” moments that occurred during our conversation was when they realized that they needed to refocus everyone’s attention towards the business value and benefits of the project instead of the details and functions of MDM. While I was thrilled with their program’s success, I was even more excited to learn that someone was finally able to compete against the “bright shiny project syndrome” and win.
Photo “Dirt Pile 2” courtesy of CoolValley via Flickr (Creative Commons license).