Please Stop Telling Everyone You Have an Enterprise Data Warehouse – Because You Don’t

One of the biggest misconceptions among business and clinical leaders in healthcare is the notion that most organizations have an enterprise data warehouse. Let me be the bearer of bad news – they don’t, which means you may not either. Very few organizations have a true enterprise data warehouse; that is, a place where all of their data is integrated and modeled for analysis, from source systems across the organization, independent of care setting, technology platform, how it’s collected, or how it’s used. Some organizations have data warehouses, but these are often limited to a single vendor’s source system and the data within that vendor’s application (e.g., McKesson’s HBI and Epic’s Clarity). This means that you are warehousing data from only one source, and thus analyzing and making decisions from only one piece of a big puzzle. I’d also bet that the data you’ve started integrating is financial and maybe operational. I understand: save the hard stuff (quality and clinical data) for last.

This misconception is not limited to a single group in healthcare. I’ve heard this from OR Managers, Patient Safety & Quality staff, Service Line Directors, physicians, nurses, and executives.

You say, “Yes we have a data warehouse”…

I say, “Tell me some of the benefits” and “what is your ROI in this technology?”

So, what is it? Can you provide quantitative evidence of the benefits you’ve realized from your investment and use of your “data warehouse”?  If you’re struggling, consider this:

  • When you ask for a performance metric, say Length of Stay (LOS), do you get the same results every time you ask independent of where your supporting data came from or who you asked?
  • Do you have to ask for pieces of information from disparate places or “data handlers” in order to answer your questions? A report from an analyst, a spreadsheet from a source system SME, a tweak here and a tweak there, and voilà! A number whose calculation you can’t easily recreate, that changes over time, and that requires proprietary knowledge from the report writer to produce.
  • What is the loss in your productivity, as a manager or decision maker, in getting access to this data? More importantly, how much time do you have left to actually analyze, understand and act on the data once you’ve received it?
  • Can you quickly and easily track, measure and report all patient data throughout the continuum of care? Clinical, quality, financial, and operational? Third-party collected (e.g., HCAHPS Patient Satisfaction)? Third-party calculated (e.g., CMS Core Measures)? Market share?
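
The first question above (does LOS come back the same regardless of source?) is easy to test on a small scale. What follows is only a minimal sketch in Python: the two extracts, their field names, and the "same-day stay counts as one day" rule are all hypothetical, standing in for whatever your billing and clinical systems actually hold.

```python
from datetime import date
from statistics import mean

# Hypothetical extracts of the same three encounters from two source systems.
# Field names and values are illustrative only, not taken from any real product.
billing_extract = [
    {"encounter_id": "E1", "admit": date(2024, 1, 1), "discharge": date(2024, 1, 4)},
    {"encounter_id": "E2", "admit": date(2024, 1, 2), "discharge": date(2024, 1, 3)},
    {"encounter_id": "E3", "admit": date(2024, 1, 5), "discharge": date(2024, 1, 5)},
]
clinical_extract = [
    {"encounter_id": "E1", "admit": date(2024, 1, 1), "discharge": date(2024, 1, 4)},
    {"encounter_id": "E2", "admit": date(2024, 1, 2), "discharge": date(2024, 1, 4)},
    {"encounter_id": "E3", "admit": date(2024, 1, 5), "discharge": date(2024, 1, 5)},
]


def stay_days(encounter):
    """Length of stay in whole days: discharge date minus admit date."""
    return (encounter["discharge"] - encounter["admit"]).days


def average_los(encounters, same_day_counts_as_one=False):
    """Average LOS; the flag models one analyst's (assumed) business rule."""
    stays = []
    for enc in encounters:
        days = stay_days(enc)
        if same_day_counts_as_one and days == 0:
            days = 1
        stays.append(days)
    return mean(stays)


def los_disagreements(extract_a, extract_b):
    """Encounter IDs whose LOS differs between the two extracts."""
    los_b = {enc["encounter_id"]: stay_days(enc) for enc in extract_b}
    return sorted(
        enc["encounter_id"]
        for enc in extract_a
        if enc["encounter_id"] in los_b and stay_days(enc) != los_b[enc["encounter_id"]]
    )


print(los_disagreements(billing_extract, clinical_extract))  # ['E2']
```

In this toy data the two sources disagree on one discharge date, and the optional same-day rule changes the average again, so the "same" metric has at least three defensible values depending on who you ask. That is the symptom the questions above are probing for.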

Beyond eliminating the lost productivity and the manual, time-consuming process of piecing together data from disparate places and sources, a true enterprise data warehouse gives you a single version of the truth. No matter how many new applications and source systems you add, business rules you create, definitions you standardize, or analyses you perform, you will get the same answer every time. You can ask any question of an enterprise data warehouse. You don’t have to consider, “Wait, what source system will give me this data? And who knows how to get that data for me?”

In the event you do have an enterprise data warehouse, you should be seeing some of these benefits:

  1. Accurate and trusted, real-time, data-driven decision making
    • Savings: Allocate and deploy resources for localized intervention, ensuring the most efficient use of scarce resources based upon the trusted information available.
  2. Consistent definition and understanding of data and measures reported across the organization
    • Savings: Less time and money spent resolving differences in how people report the same information from different source systems
  3. Strong master data – you have a single, consistent definition for a Patient, Provider, Location, Service Line, and Specialty.
    • Savings: less time resolving differences in patient and provider identifiers when measuring performance; elimination of duplicate or incomplete patient records
  4. A return on the money you spend in your operating budget for analysts and decision support
    • Savings: quantitative improvements from projects and initiatives targeted at clinical outcomes, cost reductions, lean process efficiencies, and others
    • Savings: less time collecting data, more time analyzing and improving processes, operations and outcomes
  5. More informed and evidence-based negotiations with surgeons, anesthesiologists, payers, vendors, and suppliers
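
To make the master data point (item 3) concrete, here is a deliberately naive sketch in Python of how candidate duplicate patient records might be flagged across source systems. The records, field names, and the name-plus-date-of-birth match key are assumptions for illustration; real patient matching typically relies on more robust techniques (probabilistic matching, additional identifiers, manual review).

```python
from collections import defaultdict

# Hypothetical patient records from two source systems (illustrative values).
records = [
    {"source": "ehr", "id": "P-001", "name": "Smith, John ", "dob": "1970-03-15"},
    {"source": "billing", "id": "88231", "name": "SMITH, JOHN", "dob": "1970-03-15"},
    {"source": "ehr", "id": "P-002", "name": "Doe, Jane", "dob": "1985-11-02"},
]


def match_key(record):
    """Naive match key: case-folded, whitespace-normalized name plus date of birth."""
    name = " ".join(record["name"].casefold().split())
    return (name, record["dob"])


def candidate_duplicates(records):
    """Group records by match key and keep only keys seen more than once."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append((record["source"], record["id"]))
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


for key, ids in candidate_duplicates(records).items():
    print(key, ids)
```

Here the same person appears under two unrelated identifiers; without a shared definition of Patient, any per-patient measure would count that person twice.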

In the end, you want an enterprise data warehouse that can accommodate the enterprise data pipeline from when data is captured, through its transformations, to its consumption. Can yours?

Data Profiling: The BI Grail

In Healthcare analytics, as in analytics for virtually all other businesses, the landscape facing the Operations, Finance, Clinical, and other organizations within the enterprise is almost always populated by a rich variety of systems which are prospective sources for decision support analysis.   I propose that we insert into the discussion some ideas about the inarguable value of, first, data profiling, and second, a proactive data quality effort as part of any such undertaking.

Whether done from the ground up or when the scope of an already successful initial project is envisioned to expand significantly, all data integration/warehousing/business intelligence efforts benefit from the proper application of these disciplines and the actions taken based upon their findings, early, often, and as aggressively as possible.

I like to say sometimes that in data-centric applications, the framework and mechanisms that make up a solution are in some respects even more abstract than in traditional OLTP applications. Up to the point at which a dashboard or report is consumed by a user, the entire application virtually IS the data, sans the bells, whistles, and widgets that are the more “material” aspects of GUI/OLTP development efforts:

  • Data entry applications, forms, websites, etc. all exist generally outside the reach of the project being undertaken.
  • Many assertions and assumptions are usually made about the quality of that data.
  • Many, if not most, of those turn out not to be true, or at least not entirely accurate, despite the very earnest efforts of all involved.

What this means in terms of risk to the project cannot be overstated. Because that risk is largely unknown in most instances, it can be neither qualified nor quantified. It often turns what seems, on the face of it, to be a relatively simple “build machine X” with gear A, chain B, and axle C project into “build machine X” with gear A (with missing teeth), chain B (not missing any links but definitely rusty and needing some polishing), and axle C (which turns out not to even exist, though it is much discussed, maligned, or even praised depending upon who is in the room and how big the company is).

Enter The Grail.   If there is a Grail in data integration and business intelligence, it may well be data profiling and quality management, on its own or as a precursor to true Master Data Management (if that hasn’t already become a forbidden term for your organization due to past failed tries at it).

Data Profiling gives us a pre-emptive strike against our preconceived notions about the quality and content of our data.   It gives us not only quantifiable metrics by which to measure and modify our judgement of the task before us, but frequently results in various business units spinning off immediately into the scramble to improve upon what they honestly did not realize was so flawed.
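
For readers who have not seen profiling output, the kind of quantifiable metrics meant here can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a substitute for a profiling tool: the sample rows and column names are hypothetical, and only missing-value counts, distinct counts, and repeated values are computed.

```python
from collections import Counter

# Hypothetical sample rows from a source extract (illustrative values only).
rows = [
    {"patient_id": "1", "gender": "F", "discharge_date": "2024-01-04"},
    {"patient_id": "2", "gender": "f", "discharge_date": ""},
    {"patient_id": "2", "gender": "M", "discharge_date": "2024-01-09"},
    {"patient_id": "4", "gender": None, "discharge_date": "01/12/2024"},
]


def profile(rows):
    """Per-column counts of rows, missing values, distinct values, and repeats."""
    columns = sorted({column for row in rows for column in row})
    report = {}
    for column in columns:
        values = [row.get(column) for row in rows]
        present = [value for value in values if value not in (None, "")]
        counts = Counter(present)
        report[column] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "repeated_values": sorted(v for v, n in counts.items() if n > 1),
        }
    return report


for column, stats in profile(rows).items():
    print(column, stats)
```

Even on four rows the report surfaces a repeated value in what was assumed to be a unique identifier, inconsistent coding of gender, and a missing discharge date: exactly the sort of finding that sends a business unit scrambling.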

Data Quality efforts, following comprehensive profiling and whatever proactive quality correction is possible, give a project the chance to fix problems without changing the source systems per se. They do so before the business intelligence solution becomes either a burned-out husk on the side of the EPM highway (failed because of poor data) or, at the least, a de facto data profiling tool in its own right, coughing out whatever data doesn’t work instead of serving its intended purpose: to deliver key business performance information based upon a solid data foundation in which all have confidence.
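
One way to picture fixing problems without changing source systems is a cleansing step that standardizes what it can and sets aside what it cannot, with the reason attached. The Python sketch below assumes a hypothetical two-column row, a hypothetical gender code mapping, and two accepted date formats; none of these come from a specific product or standard.

```python
from datetime import datetime

# Assumed standardization rules for this sketch (not from any real system).
GENDER_MAP = {"f": "F", "female": "F", "m": "M", "male": "M"}
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

sample_rows = [
    {"gender": "f", "discharge_date": "01/12/2024"},
    {"gender": None, "discharge_date": "2024-01-04"},
]


def clean_row(row):
    """Return (cleaned_copy, issues); the source row itself is never modified."""
    cleaned = dict(row)
    issues = []

    gender = (row.get("gender") or "").strip().casefold()
    if gender in GENDER_MAP:
        cleaned["gender"] = GENDER_MAP[gender]
    else:
        issues.append("gender missing or unrecognized")

    raw_date = (row.get("discharge_date") or "").strip()
    for date_format in DATE_FORMATS:
        try:
            parsed = datetime.strptime(raw_date, date_format)
        except ValueError:
            continue
        cleaned["discharge_date"] = parsed.date().isoformat()
        break
    else:
        # No accepted format matched, including the empty-string case.
        issues.append("discharge_date missing or unparseable")

    return cleaned, issues


def split_clean_and_rejected(rows):
    """Route rows with no issues to the load set; keep the rest with reasons."""
    clean, rejected = [], []
    for row in rows:
        cleaned, issues = clean_row(row)
        if issues:
            rejected.append({"row": row, "issues": issues})
        else:
            clean.append(cleaned)
    return clean, rejected


clean, rejected = split_clean_and_rejected(sample_rows)
print(clean)
print(rejected)
```

The rejected list is the important design choice: bad rows are reported back to the business with a reason, rather than silently loaded or left for the dashboard to cough out later.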

The return on investment for such an effort is measurable, sustainable, and so compelling as an argument that no serious BI undertaking, large or small, should go forward without it.   Whether in Healthcare, Financial Services, Manufacturing, or another vertical,  its value is, I submit, inarguable.

Data Darwinism – Evolving your data environment

In my previous posts, the concept of Data Darwinism was introduced, as well as the types of capabilities that allow a company to set itself apart from its competition.   Data Darwinism is the practice of using an organization’s data to survive, adapt, compete and innovate in a constantly changing and increasingly competitive business environment.   If you take an honest and objective look at how and why you are using data, you might find out that you are on the wrong side of the equation.  So the question is “how do I move up the food chain?”

The goal of evolving your data environment is to change from using your data in a reactionary manner and just trying to survive, to proactively using your data as a foundational component to constantly innovate to create a competitive advantage.

The plan is simple on the surface, but not always so easy in execution.   It requires an objective assessment of where you are compared to where you need to be, a plan/blueprint/roadmap to get from here to there, and flexible, iterative execution.


As mentioned before, taking an objective look at where you are compared to where you need to be is the first critical step. This is often an interesting conversation among different parts of the organization that have competing interests and objectives. Many organizations can’t get past this first step. People get caught up in politics and self-interest and lose sight of the goal: to move the organization forward into a position of competitive advantage. Other organizations don’t have the in-house expertise or discipline to conduct the assessment. However, until this can be done, you remain vulnerable to other organizations that have moved past this step.


Great, now you’ve done the assessment; you know what your situation is and what your strengths and weaknesses are. Without a roadmap of how to get to your data utopia, you’re going nowhere. The roadmap is really a blueprint of inter-related capabilities that need to be implemented incrementally over time to constantly move the organization forward. Now, I’ve seen this step end very badly for organizations that make some fundamental mistakes. They try to do too much at once. They make the roadmap too rigid to adapt to changing business needs. They take a form-over-substance approach. All of these can be fatal to an organization. The key to the roadmap is three-fold:

  • Flexible – This is not a sprint.   Evolving your data environment takes time.   Your business priorities will change, the external environment in which you operate will change, etc.   The roadmap needs to be flexible enough to enable it to adapt to these types of challenges.
  • Prioritized – There will be an impulse to move quickly and do everything at once.   That almost never works.   It is important to align the roadmap’s priorities with the overall priorities of the organization.
  • Realistic – Just as you had to take an objective, and possibly painful, look at where you were with respect to your data, you have to take a similar look at what can be done given any number of constraints all organizations face.   Funding, people, discipline, etc. are all factors that need to be considered when developing the roadmap.   In some cases, you might not have the internal skill sets necessary and have to leverage outside talent.   In other cases, you will have to implement new processes, organizational constructs and enabling technologies to enable the movement to a new level.  

Execute Iteratively

The capabilities you need to implement will build upon each other and it will take time for the organization to adapt to the changes.   Taking an iterative approach that focuses on building capabilities based on the organization’s business priorities will greatly increase your chance of success.  It also gives you a chance to evaluate the capabilities to see if they are working as anticipated and generating the expected returns.   Since you are taking an iterative approach, you have the opportunity to make the necessary changes to continue moving forward.

The path to innovation is not always an easy one.   It requires a solid, yet flexible, plan to get there and persistence to overcome the obstacles that you will encounter.   However, in the end, it’s a journey well worth the effort.

Data Darwinism – Capabilities that provide a competitive advantage

In my previous post, I introduced the concept of Data Darwinism, which states that for a company to be the ‘king of the jungle’ (and remain so), they need to have the ability to continually innovate.   Let’s be clear, though.   Innovation must be aligned with the strategic goals and objectives of the company.   The landscape is littered with examples of innovative ideas that didn’t have a market.  

So that raises the question “What are the behaviors and characteristics of companies that are at the top of the food chain?”   The answer to that question can go in many different directions.   With respect to Data Darwinism, the following hierarchy illustrates the categories of capabilities that an organization needs to demonstrate to truly become a dominant force.


The impulse will be for an organization to want to immediately jump to implementing capabilities that they think will allow them to be at the top of the pyramid.   And while this is possible to a certain extent, you must put in place certain foundational capabilities to have a sustainable model.     Examples of capabilities at this level include data integration, data standardization, data quality, and basic reporting.

Without clean, integrated, accurate data that is aligned with the intended business goals, the ability to implement the more advanced capabilities is severely limited.    This does not mean that all foundational capabilities must be implemented before moving on to the next level.  Quite the opposite actually.   You must balance the need for the foundational components with the return that the more advanced capabilities will enable.


Transitional capabilities are those that allow an organization to move from silo’d, isolated, often duplicative efforts to a more ‘centralized’ platform in which to leverage their data.    Capabilities at this level of the hierarchy start to migrate towards an enterprise view of data and include such things as a more complete, integrated data set, increased collaboration, basic analytics and ‘coordinated governance’.

Again, you don’t need to fully instantiate the capabilities at this level before building capabilities at the next level.   It continues to be a balancing act.


Transformational capabilities are those that allow the company to start to truly differentiate itself from its competition.   They don’t fully deliver the innovative capabilities that set it head and shoulders above other companies, but rather set the stage for them.   This stage can be challenging for organizations, as it can require a significant change in mind-set compared to the current way the organization conducts its operations.   Capabilities at this level of the hierarchy include more advanced analytical capabilities (such as true data mining), targeted access to data by users, and ‘managed governance’.


Innovative capabilities are those that truly set a company apart from its competitors.   They allow for innovative product offerings, unique methods of handling the customer experience and new ways in which to conduct business operations.   Amazon is a great example of this.   Their ability to customize the user experience and offer ‘recommendations’ based on a wealth of user buying  trend data has set them apart from most other online retailers.    Capabilities at this level of the hierarchy include predictive analytics, enterprise governance and user self-service access to data.

The bottom line is that moving up the hierarchy requires vision, discipline and a pragmatic approach.   The journey is not always an easy one, but the rewards more than justify the effort.

Check back for the next installment of this series “Data Darwinism – Evolving Your Data Environment.”

Data Darwinism – Are you on the path to extinction?

Most people are familiar with Darwinism.  We’ve all heard the term survival of the fittest.   There is even a humorous take on the subject with the annual Darwin Awards, given to those individuals who have removed themselves from the gene pool through, shall we say, less than intelligent choices.

Businesses go through ups and downs, transformations, up-sizing/down-sizing, centralization/decentralization, etc.   In other words, they are trying to adapt to current and future events in order to grow.   Just as in the animal kingdom, some will survive and dominate, some will not fare as well.   In today’s challenging business environment, while many are trying to merely survive, others are prospering, growing and dominating.

So what makes the difference between being the king of the jungle or being prey?   The ability to make the right decisions in the face of uncertainty.     This is often easier said than done.   However, at the core of making the best decisions is making sure you have the right data.   That brings us back to the topic at hand:  Data Darwinism.   Data Darwinism can be defined as:

“The practice of using an organization’s data to survive, adapt, compete and innovate in a constantly changing and increasingly competitive business environment.”

When asked to assess where they are on the Data Darwinism continuum, many companies will say that they are at the top of the food chain, that they are very fast at getting data to make decisions, that they don’t see data as a problem, etc.   However, when truly asked to objectively evaluate their situation, they often come up with a very different, and often frightening, picture. 

  It’s as simple as looking at your behavior when dealing with data:

If you find yourself exhibiting more of the behaviors on the left side of the picture above, you might be a candidate for the next Data Darwin Awards.

Check back for the next installment of this series “Data Darwinism – Capabilities that Provide a Competitive Advantage.”