I wrote last time about the challenges that companies have in their transition to becoming data driven. Much has been written about the necessity of the business audience needing to embrace change. I thought I’d spend a few words discussing the other participant in a company’s data-driven transition: the Information Technology (IT) organization.
One of the issues that folks rarely discuss is that many IT organizations haven’t positioned themselves to support a data-driven culture. While most have spent a fortune on technology, the focus is always about installing hardware, building platforms, acquiring software, developing architectures, and delivering applications. IT environments focus on streamlining the construction and maintenance of systems and applications. While this is important, that’s only half the solution for a data-driven organization. A data-driven culture (or philosophy) requires that all of a company’s business data is accessible and usable. Data has to be packaged for sharing and use.
Part of the journey to becoming data-driven is ensuring that there’s a cultural adjustment within IT to support the delivery of applications and data. It’s not just about dropping data files onto servers for users to copy. It’s about investing in the necessary methods and practices to ensure that data is available and usable (without requiring lots of additional custom development).
Some of the indicators that your IT organization isn’t prepared or willing to be data-driven include
- There’s no identified Single Version Of Truth (SVOT).
There should be one place where the data is stored. While this is obvious, the lack of a single agreed to data location creates the opportunity to have multiple data repositories and multiple (and conflicting) sets of numbers. Time is wasted disputing accuracy instead of being focused on business analysis and decision making.
- Data sharing is a courtesy, not an obligation.
How can a company be data driven if finding and accessing data requires multiple meetings and multiple approvals for every request? If we’re going to run the business by the numbers, we can’t waste time begging or pleading for data from the various system owners. Every application system should have two responsibilities: processing transactions and sharing data.
- There’s no investment in data reuse.
The whole idea of technology reuse has been a foundational philosophy for IT for more than 20 years: build once, use often. While most IT organizations have embraced this for application development, it’s often overlooked for data. Unfortunately, data sharing activities are often built as a one-off, custom endeavor. Most IT teams manage 100’s or 1000’s of file extract programs (for a single system) and have no standard for moving data packets between applications. There’s no reuse; every new request gets their own extract or service/connection.
- Data accuracy and data correction is not a responsibility
Most IT organizations have invested in data quality tools to address data correction and accuracy, but few ever use them. It’s surprising to me that any shop allows new development to occur without requiring the inclusion of a data inspection and correction process. How can a business person become data driven if they can’t trust the data? How can you expect them to change if the IT hasn’t invested in fixing the data when it’s created (or at least shared)?
It’s important to consider that enabling IT to support a data-driven transition isn’t realistic without investment. You can’t expect staff members that are busy with their existing duties to absorb additional responsibilities (after all, most IT organizations have a backlog). If a company wants to transition to a data-driven philosophy, you have to allow the team members to learn new skills to support the additional activities. And, there needs to be staff members available to do the work.
There’s only one reason to transition to being a data-driven organization; it’s about more profit, more productivity, and more business success. Consequently, there should be funds available to allow IT to support the transition.
I just read this article by Ethan Knox, “Is Your Company Too Dumb to be Data Driven” and was intrigued to read what many people have discussed for years. I’ve spent nearly half my career helping clients make the transition from running the business by tribal knowledge and gut instinct to running the business by facts and numbers. It’s a hard transition. One that takes vision, motivation, discipline, and courage to change. It also takes a willingness to learn something new.
While this article covers a lot of ground, I wanted to comment on one of points made in the article: the mistake of “build it and they will come”. This occurs when an organization is enthusiastic about data and decides to build a data warehouse (or data lake) and load it with all the data from the company’s core application systems (sales, finance, operations, etc.) The whole business case depends on the users flocking to the system, using new business intelligence or reporting tools, and uncovering numerous high value business insights. All too often, the results reflect a large monolithic data platform that contains lots of content but hasn’t been designed to support analysis or decision making by the masses.
There are numerous problems with this approach – and the path to data and analytics enlightenment is littered with mistakes where companies took this approach. Don’t assume that successful companies that have embraced data and analytics didn’t make this mistake (it’s a very common mistake). Successful companies were those that were willing to learn from their mistakes – and have a culture where new project efforts are carefully scoped to allow mistakes, learning, and evolution. It’s not that they’re brilliant; successful companies understand that transitioning to being data driven company requires building knowledge. And, the process of learning takes time, includes mistakes, requires self-analysis, and must be managed and mentored carefully. They design their projects assuming mistakes and surprises occur, so they fail fast and demand continual measurement and corrective action. It’s not about the methodology or development approach. A fail-fast philosophy can work with any type of development methodology (agile, iterative, waterfall). The path to data enlightenment will include lots of mistakes.
Do you remember high school math? When you were presented with a new concept, you were given homework that allowed you to learn, gain experience, and understand the concept through the act of “doing”. Homework was often graded based on effort, not accuracy (if you did it, you got credit, whether or not it was correct). Where is it written that (upon graduation) learning something new wouldn’t require the act of “doing” and making mistakes to gain enlightenment? By the way, who has ever succeeded without making mistakes?
The point the article frequently references it that business engagement is critical. It’s not about the users participating a few times (requirements gathering and user acceptance testing); it’s about users being engaged to review results and participate in the measurement and corrective action. It’s about evolving from a culture where the relationship is customer/ provider to a team where everyone succeeds or fails based on business measurement.
It’s not that a company is too dumb to succeed with data; it’s that they’re often too fearful of mistakes to succeed. And in the world of imperfect data, exploding data volumes, frequent technology changes, and a competitive business environment, mistakes are an indication of learning. Failure isn’t a reflection of mistakes, it’s a reflection of poor planning, lack of measurement, and an inability to take corrective action.
The concept of “Production” in the area of Information Technology is well understood. It means something (usually an application or system) is ready to support business processing in a reliable manner. Production environments undergo thorough testing to ensure that there’s minimal likelihood of a circumstance where business activities are affected. The Production label isn’t thrown around recklessly; if a system is characterized as Production, there are lots of business people dependent on those systems to get their job done.
In order to support Production, most IT organizations have devoted resources focused solely on maintaining Production systems to ensure that any problem is addressed quickly. When user applications are characterized as Production, there’s special processes (and manpower) in place to address installation, training, setup, and ongoing support. Production systems are business critical to a company.
One of the challenges in the world of data is that most IT organizations view their managed assets as storage, systems, and applications. Data is treated not as an asset, but as a byproduct of an application. Data storage is managed based on application needs (online storage, archival, backup, etc.) and data sharing is handled as a one-off activity. This might have made sense in the 70’s and 80’s when most systems were vendor specific and sharing data was rare; however, in today’s world of analytics and data-driven decision making, data sharing has become a necessity. We know that every time data is created, there are likely 10-12 business activities requiring access to that data.
Data sharing is a production business need.
Unfortunately, the concept of data sharing in most companies is a handled as a one-off, custom event. Getting a copy of data often requires tribal knowledge, relationships, and a personal request. While there’s no arguing that many companies have data warehouses (or data marts, data lakes, etc.), adding new data to those systems is where I’m focused. Adding new data or integrating 3rd party content into a report takes a long time because data sharing is always an afterthought.
Think I’m exaggerating or incorrect? Ask yourself the following questions…
- Is there a documented list of data sources, their content, and a description of the content at your company?
- Do your source systems generate standard extracts, or do they generate 100s (or 1000’s) of nightly files that have been custom built to support data sharing?
- How long does it take to get a copy of data (that isn’t already loaded on the data warehouse)?
- Is there anyone to contact if you want to get a new copy of data?
- Is anyone responsible for ensuring that the data feeds (or extracts) that currently exist are monitored and maintained?
While most IT organizations have focused their code development efforts on reuse, economies-of-scale, and reliability, they haven’t focused their data development efforts in that manner. And one of the most visible challenges is that many IT organizations don’t have a funding model to support data development and data sharing as a separate discipline. They’re focused on building and delivering applications, not building and delivering data. Supporting data sharing as a production business need means adjusting IT responsibilities and priorities to reflect data sharing as a responsibility. This means making sure there are standard extracts (or data interfaces) that everyone can access, data catalogs available containing source system information, and staff resources devoted to sharing and supporting data in a scalable, reliable, and cost-efficient manner. It’s about having an efficient data supply chain to share data within your company. It’s because data sharing is a production business need.
Or, you could continue building everything in a one-off custom manner.