Your Company’s Data Supply Chain

photo by BotheredByBees

At Baseline Consulting we've been talking for several years about the concept of a data supply chain. But IT executives are only now starting to catch on to its importance.

Over the past 15 years there has been a big push to standardize on off-the-shelf software. This allowed IT organizations to buy instead of build. We've migrated from proprietary architectures to Windows and Linux standards. We've gone from custom-built applications to packaged CRM and ERP applications. IT adopted this approach because its value lies in automating business processes and supporting analysis, not in inventing new technologies. The problem is that moving data between all of these "packaged systems" still requires custom code.

There's no question that middleware provides value: it delivers the pre-built data pipes. Unfortunately, these are toolkits requiring developers to write code to connect their packages to the pipes. Most CIOs are blissfully unaware of the amount of custom coding middleware requires. Trust me: IT spends an enormous amount of money on supporting such data migration solutions. Many IT shops still view middleware as sacred ground.

The data warehousing world has enthusiastically adopted ETL tools to reduce custom coding so it can focus on the issues of data accuracy and usability. One fact lost in translation is that ETL integrates data; it's more than just a pipe. The application world has adopted EAI, ESB, and orchestration to move data more quickly. However, there's no integration. Each application is responsible for integrating the data it receives.

So, there's even more custom code. Code to connect an application to the pipes. Code to integrate and clean up the data it receives from the pipes.

Custom code to move data around isn't the answer. Orchestration, message passing, and data movement just create a labyrinth of pipes. There are no economies of scale. The data doesn't get better.

Walmart learned years ago that it was impractical to have a custom (and separate) distribution system for every supplier. They knew the cost benefits of a standard distribution system; this meant they needed to standardize the size of the trailers, the size of the boxes, and the way the boxes were packed and shipped. The benefit of a supply chain is that standardization occurs at the most cost-effective point: the source. Walmart's distribution success was measured by its ability to accept new suppliers and manage more shipments.

Most CIOs don't recognize that they have a data supply chain. Instead of building a custom distribution system for each supplier (each business application), they should be focused on a single data supply chain. Middleware supports the creation of custom distribution solutions, but not the standardization of data. A data supply chain can only be successful if the data is standardized. Otherwise everyone is forced to write custom code to standardize, clean, and integrate the data.
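
To make the idea of standardizing at the source a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the CustomerRecord layout, the two source systems ("CRM" and "ERP"), and their field names do not come from any particular product. The point is only that each supplier packs its data into the same standard box before shipping it, so the consumer needs no cleanup code of its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    """Hypothetical standard 'box' that every source system must ship."""
    customer_id: str
    name: str
    country: str  # two-letter, upper-case country code


def standardize_crm(row: dict) -> CustomerRecord:
    """Hypothetical CRM source: cleans its own data before publishing it."""
    return CustomerRecord(
        customer_id=str(row["CustID"]).strip(),
        name=row["FullName"].strip().title(),
        country=row["Country"].strip().upper()[:2],
    )


def standardize_erp(row: dict) -> CustomerRecord:
    """Hypothetical ERP source: different raw fields, same standard output."""
    return CustomerRecord(
        customer_id=str(row["cust_no"]).strip(),
        name=f'{row["first"]} {row["last"]}'.strip().title(),
        country=row["ctry_cd"].strip().upper(),
    )


def count_by_country(records):
    """A consumer: it relies on the standard and does no cleanup itself."""
    counts = {}
    for record in records:
        counts[record.country] = counts.get(record.country, 0) + 1
    return counts


records = [
    standardize_crm({"CustID": " 1001 ", "FullName": "ada lovelace", "Country": "gb"}),
    standardize_erp({"cust_no": 2002, "first": "grace", "last": "hopper", "ctry_cd": "us "}),
]
print(count_by_country(records))
```

Adding a third supplier in this sketch means writing one more standardize function at the source; none of the consumers change. That is the economy of scale a labyrinth of custom pipes never delivers.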


About Evan Levy

Evan Levy is a management consultant and partner at IntegralData. In addition to his day-to-day job responsibilities, Evan speaks, writes, and blogs about the challenges of managing and using data to support business decision making.

3 responses to “Your Company’s Data Supply Chain”

  1. Ron Dimon says :

    I like the concept of data supply chain – the first visual I thought of when I read this post was this information architecture diagram I put together here:
    http://businessfoundation.typepad.com/bf_blog/2008/11/world-class-information-architecture.html
    And then I got to thinking about the reason for a data supply chain: to manage the supply & demand of actionable information in an organization (to support & enable the delivery of value to stakeholders). The visual we use for that is an enterprise value map showing the information (KPI) value chain:
    http://www.business-foundation.com/BusinessIntelligenceBluepr.html
    Looking forward to more from you on the subject.
    -Ron

  2. Daragh O Brien (Publicity Director, IAIDQ) says :

    Evan
    I really liked this post. It complements one I wrote a while back on my personal blog about processes as a production line for information (not a 100% accurate analogy, but close enough) and how what a lot of people view as being “information quality” simply isn’t.
    http://obriend.info/2009/03/29/end-to-end-information-producton-line/
    My involvement with the IAIDQ is motivated by a desire to push back against the ‘fix it up at the end’ school of quality thinking and actually get people (and businesses) thinking in terms of designing the quality in from the beginning.

  3. evan levy says :

    The premise of the data supply chain is to suggest a slightly different paradigm for applications and data within a business. All too often IT organizations are focused on the application processing with little focus on the actual data.
    It’s important to realize that applications actually have two responsibilities: conducting processing in an accurate and efficient manner and sharing data with others. Unfortunately very little attention is paid to the usability of data after the actual application process.
    The data supply chain simply portrays the fact that an application system takes in data (whether from another system or user data entry) and produces data delivered to either people (bills, invoices, etc.) or systems (data).
    Ron’s links reference diagrams that identify various systems existing within a typical IT organization. While Ron understands that systems consume and produce data, most IT environments organize their architecture based on processing (reporting, OLTP, file sharing, etc.) with little attention paid to common data.
    The premise of the supply chain is to realize that an architecture should be highly sensitive to the actual processing and movement of data. A supply chain is a very efficient system where producers are linked closely to consumers.
    Most IT organizations publish a directory of their systems along with their processing contents (when particular production processes run). Unfortunately, very few IT organizations publish or maintain a listing of the data contents associated with each system.
    The reason for adopting a data supply chain paradigm is to address the data dependencies that exist between systems and users.
