Please Stop Telling Everyone You Have an Enterprise Data Warehouse – Because You Don’t

One of the biggest misconceptions among business and clinical leaders in healthcare is the notion that most organizations have an enterprise data warehouse. Let me be the bearer of bad news – they don’t, which means you may not either. Very few organizations have a true enterprise data warehouse; that is, a place where all of their data is integrated and modeled for analysis, drawn from source systems across the organization independent of care setting, technology platform, how it’s collected, or how it’s used.  Some organizations have data warehouses, but these are often limited to the vendor source system they sit on and the data within that vendor application (e.g., McKesson’s HBI and Epic’s Clarity). That means you are warehousing data from only one source, and thus analyzing and making decisions from only one piece of a big puzzle. I’d also bet that the data you’ve started integrating is financial and maybe operational. I understand – save the hard stuff (quality and clinical data) for last.

This misconception is not limited to a single group in healthcare. I’ve heard this from OR Managers, Patient Safety & Quality staff, Service Line Directors, physicians, nurses, and executives.

You say, “Yes, we have a data warehouse”…

I say, “Tell me some of the benefits,” and “What is your ROI on this technology?”

So, what is your ROI?  Can you provide quantitative evidence of the benefits you’ve realized from your investment in and use of your “data warehouse”?  If you’re struggling, consider this:

  • When you ask for a performance metric, say Length of Stay (LOS), do you get the same result every time, regardless of where the supporting data came from or whom you asked?
  • Do you have to ask for pieces of information from disparate places or “data handlers” in order to answer your questions? A report from an analyst, a spreadsheet from a source-system SME, a tweak here and a tweak there, and voilà! A number whose calculation you can’t easily recreate, that changes over time, and that requires proprietary knowledge from the report writer to reproduce.
  • What is the loss in your productivity, as a manager or decision maker, in getting access to this data? More importantly, how much time do you have left to actually analyze, understand and act on the data once you’ve received it?
  • Can you quickly and easily track, measure, and report all patient data throughout the continuum of care? Clinical, quality, financial, and operational? Third-party collected (e.g., HCAHPS patient satisfaction)? Third-party calculated (e.g., CMS Core Measures)? Market share?

Beyond ending the lost productivity and the manual, time-consuming process of piecing together data from disparate places and sources, a true enterprise data warehouse gives you a single version of the truth.  No matter how many new applications and source systems you add, business rules you create, definitions you standardize, or analyses you perform, you will get the same answer every time.  You can ask any question of an enterprise data warehouse.  You don’t have to consider, “Wait, which source system will give me this data? And who knows how to get that data for me?”
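
To make “same answer every time” concrete, here is a minimal sketch of what a single version of the truth looks like in practice: one canonical metric definition that every report and dashboard calls, rather than each analyst re-deriving it.  The midnight-to-midnight convention and the field names are assumptions for illustration, not a prescribed standard.

```python
from datetime import datetime

def length_of_stay_days(admit: datetime, discharge: datetime) -> int:
    """Canonical LOS: midnight-to-midnight day count, with a minimum of 1.

    Every report and dashboard calls this one definition, so the answer
    does not depend on who ran the report or which source system fed it.
    """
    days = (discharge.date() - admit.date()).days
    return max(days, 1)

# A late-evening admit and an early-morning discharge still count the nights:
print(length_of_stay_days(datetime(2011, 3, 1, 23, 50),
                          datetime(2011, 3, 3, 0, 15)))   # -> 2
```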

In the event you do have an enterprise data warehouse, you should be seeing some of these benefits:

  1. Accurate and trusted, real-time, data-driven decision making
    • Savings: Allocate and deploy resources for localized intervention, ensuring the most efficient use of scarce resources based on trusted information.
  2. Consistent definition and understanding of data and measures reported across the organization
    • Savings: Less time and money spent resolving differences in how people report the same information from different source systems.
  3. Strong master data – you have a single, consistent definition of a Patient, Provider, Location, Service Line, and Specialty.
    • Savings: Less time resolving differences in patient and provider identifiers when measuring performance; elimination of duplicate or incomplete patient records.
  4. A return on the money you spend in your operating budget for analysts and decision support
    • Savings: Quantitative improvements from projects and initiatives targeted at clinical outcomes, cost reductions, lean process efficiencies, and more.
    • Savings: Less time collecting data, more time analyzing and improving processes, operations, and outcomes.
  5. More informed, evidence-based negotiations with surgeons, anesthesiologists, payers, vendors, and suppliers

In the end, you want an enterprise data warehouse that can accommodate the enterprise data pipeline from when data is captured, through its transformations, to its consumption. Can yours?

Electronic Medical Records ≠ Accurate Data

As our healthcare systems race to implement Electronic Medical Records, or EMRs, the amount of data that will be available and accessible for a single patient is about to explode.  “As genetic and genomic information becomes more readily available, we soon may have up to 1,000 health facts available for each particular patient,” notes Patrick Soon-Shiong, executive director of the UCLA Wireless Health Institute and executive chairman of Abraxis BioScience, Inc., a Los Angeles-based biotech firm dedicated to delivering therapeutics and technologies that treat cancer and other illnesses.  The challenge is clear: how can a healthcare organization manage the accuracy of 1,000 health facts per patient?

As the volume of individual data elements expands to encompass 1,000 health facts per patient, there is an urgent need for electronic tools to manage the quality, timeliness, and origination of those data.  One key example is simply making sure that each patient has a unique identifier with which to attach and connect the individual health facts.  This may seem like a mundane detail, but it is absolutely critical to uniquely identify and unambiguously associate each key health fact with the right patient, at the right time.  Whenever patients are admitted to a health system, they are typically assigned a unique medical record number that both clinicians and staff use to identify, track, and cross-reference their records.  Ideally, every patient receives a single, unique identifier.  Reality, however, tells a different story: many patients wind up incorrectly possessing multiple medical record numbers, while others wind up incorrectly sharing the same identifier.

These errors, known respectively as master person index (MPI) duplicates and overlays, can cause physicians and other caregivers to unknowingly make treatment decisions based on incomplete or inaccurate data, posing a serious risk to patient safety.  Thus, it is no wonder that improving the accuracy of patient identification repeatedly heads The Joint Commission’s national patient safety goals list on an annual basis.
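
For a sense of how MPI tooling surfaces these errors, here is a hedged sketch of duplicate detection: score the demographic similarity of two registrations and queue borderline pairs for human review.  The fields, weights, and thresholds are invented for illustration; commercial MPI engines use far more sophisticated probabilistic matching.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1]; real MPIs use probabilistic matching."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

# Illustrative weights only - date of birth matters more than first name.
WEIGHTS = {"last_name": 0.30, "first_name": 0.20, "dob": 0.35, "ssn_last4": 0.15}

def duplicate_score(rec_a: dict, rec_b: dict) -> float:
    return sum(w * similarity(str(rec_a[f]), str(rec_b[f]))
               for f, w in WEIGHTS.items())

a = {"last_name": "Smith", "first_name": "Jon",  "dob": "1970-02-14", "ssn_last4": "1234"}
b = {"last_name": "Smith", "first_name": "John", "dob": "1970-02-14", "ssn_last4": "1234"}

score = duplicate_score(a, b)
if score > 0.95:                      # near-certain duplicate
    print(f"auto-merge candidate: {score:.3f}")
elif score > 0.80:                    # borderline - a human decides
    print(f"queue for review: {score:.3f}")
```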

Assembling an accurate, complete, longitudinal view of a patient’s record is comparable to assembling a giant jigsaw puzzle.  Pieces of that puzzle are scattered widely across the individual systems and points of patient contact within a complex web of hospitals, outpatient clinics, and physician offices.  Moreover, accurately linking them to their rightful owner requires the consolidation and correction of the aforementioned MPI errors.  To accomplish this task, every hospital nationwide must either implement an MPI solution directly, hire a third party to clean up “dirty” MPI and related data, or implement some other reliable and verifiable approach.  Otherwise, these fundamental uncertainties will continue to hamper the effective and efficient delivery of the core clinical services of the extended health system.

Unfortunately, for most healthcare systems this isn’t simply a one-time clean-up job.  The challenge of maintaining the data integrity of the MPI has only just begun.  That’s because neither an identity resolution solution, nor MPI software technology, nor a one-time clean-up will address the root causes of these MPI errors on its own.  In the great majority of cases, more fundamental issues underlie the MPI errors: flawed registration procedures; inadequate or poorly trained staff; naming conventions that vary from one operational setting or culture to another; widespread use of nicknames; and even confusion caused by name changes due to marriage and divorce – or simple misspelling.
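
One of those root causes – nicknames and variant spellings – can be attacked before records ever reach the MPI.  Below is a toy sketch of given-name normalization; the nickname table is a tiny hypothetical sample, where real reference tables run to thousands of entries.

```python
# A tiny hypothetical nickname table; production tables run to thousands of rows.
NICKNAMES = {
    "bill": "william", "liz": "elizabeth", "bob": "robert",
    "peggy": "margaret", "jack": "john",
}

def normalize_given_name(name: str) -> str:
    """Lowercase, trim, and expand common nicknames to a canonical form."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)

print(normalize_given_name(" Bill "))    # -> william
print(normalize_given_name("William"))   # -> william: both now match downstream
```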

To address these challenges, institutions must combine both an MPI technology solution, which includes human intervention, and the reengineering of patient registration processes or other points of contact where patient demographics are captured or updated.  Unless these two elements are in place, providers’ ability to improve patient safety and quality of care will be impaired because the foundation underpinning the MPI will slowly deteriorate.

Another solution is the use of data profiling software tools.  These tools identify common patterns of data errors, including erroneous data entry, to focus and drive needed revisions or other improvements in business processes.  Effective data profiling tools can run automatically, using business rules to surface the exceptions – the inaccurate records – that need to be addressed.  As the number of individual health facts increases for each patient, the need for automated data accuracy checking will continue to grow, and the extended health system will need to address these issues.
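
As a rough sketch of what “business rules focused on exceptions” can look like, consider the fragment below: each rule flags records whose values are missing or malformed so the right people can correct them at the source.  The field names and rules are illustrative assumptions, not any vendor’s actual rule syntax.

```python
import re

# Each rule names a check and a predicate; records failing a predicate become
# exceptions to route to the right person. Field names are assumptions.
RULES = [
    ("mrn_present",  lambda r: bool(r.get("mrn", "").strip())),
    ("dob_iso_date", lambda r: bool(re.fullmatch(r"\d{4}-\d{2}-\d{2}", r.get("dob", "")))),
    ("gender_coded", lambda r: r.get("gender") in {"M", "F", "U"}),
]

def profile(records):
    """Return (rule_name, record) pairs for every rule a record violates."""
    return [(name, rec) for rec in records for name, ok in RULES if not ok(rec)]

patients = [
    {"mrn": "12345", "dob": "1970-02-14", "gender": "M"},   # clean
    {"mrn": "",      "dob": "02/14/1970", "gender": "X"},   # three exceptions
]

for rule, rec in profile(patients):
    print(f"exception: {rule} -> {rec}")
```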

When healthcare providers make critical patient care decisions, they need to have confidence in the accuracy and integrity of the electronic data.  Instead of a physician or nurse having to assemble and scan dozens of electronic patient records in order to catch a medication error or an overlooked allergy, these data profiling tools can scan thousands of records, apply business rules to identify the critical data inaccuracies, including missing or incomplete data elements, and notify the right people to take action to correct them.

In the age of computer-based medical records, electronic data accuracy has become a key element of patient safety – as critical as data completeness.  What better way to manage data accuracy than with smart electronic tools for data profiling?  Who knows?  The life you save or improve may be your own.

How does a data-driven healthcare organization work?

As the pressure for accountability and transparency increases for healthcare organizations, the spotlight is squarely on data: how the organization gathers, validates, stores, and reports it.  In addition, the increasing level of regulatory reporting is driving home the need to certify data – applying rigor and measurement to its quality, audit, and lineage.  As a result, a healthcare organization must develop an Enterprise Information Management approach that zeroes in on treating data as a strategic asset.  While treating data as an asset would seem obvious given the number of IT systems necessary to run a typical healthcare organization, the explosion in the volume and types of digital data collected (e.g., video, digital photos, audio files) has overwhelmed the ability to locate, analyze, and organize it.

A typical example of this problem arises when an organization decides to implement Business Intelligence or performance indicators with an electronic dashboard.  There are many challenges in linking data sources to corporate performance measures.  When the same data element exists in multiple places (e.g., patient IDs, encounter events), there must be a decision about the authoritative source, or “single version of the truth.”  Then there is the infamous data collision problem: Americans move around, and organizations end up with multiple addresses for what appears to be the same person – or, worse yet, multiple lists of prescribed medications that don’t match.  Reconciling these discrepancies requires returning to the original source of the information – the patient – to bring it up to date.  Each of us can relate to filling out the form on the clipboard in the doctor’s office multiple times.  Finally, there is the problem of sparseness – we have part of the data needed to track performance, but not enough to complete the calculation.  The list can go on and on, but it boils down to having the right data, at the right time, and using it in the right manner.
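
A common way to resolve such collisions is a survivorship rule: when the same patient carries conflicting values in different systems, keep the one most recently confirmed, breaking ties by how much each source is trusted.  The sketch below is hypothetical; the source rankings and dates are assumptions for illustration.

```python
from datetime import date

SOURCE_RANK = {"registration": 1, "billing": 2, "legacy_emr": 3}   # 1 = most trusted

addresses = [
    {"source": "legacy_emr",   "verified": date(2008, 5, 1),  "addr": "12 Oak St"},
    {"source": "registration", "verified": date(2011, 1, 20), "addr": "98 Elm Ave"},
    {"source": "billing",      "verified": date(2011, 1, 20), "addr": "98 Elm Avenue"},
]

# Survivorship: prefer the most recently verified value; break ties by source rank.
best = min(addresses,
           key=lambda a: (-a["verified"].toordinal(), SOURCE_RANK[a["source"]]))
print(best["addr"])   # -> 98 Elm Ave (registration wins the tie with billing)
```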

Wouldn’t the solution simply be to create an Enterprise Data Warehouse or Operational Data Store that holds all of the cleansed, de-duplicated, latest data elements?  Certainly!  Big IF coming up: IF your organization has data governance to establish a framework for auditability of data; IF your organization can successfully map source application systems to the target enterprise store; IF your organization can establish master data management for all the key reference tables; IF your organization can agree on standard terminologies; and, most importantly, IF you can convince every employee who creates data that quality matters – not just today, but always.

One solution is to borrow a key idea that made personal computers a success – build an abstraction layer.  The operating system of a personal computer achieves flexibility by hiding the complexity of different hardware from the casual user through a hardware abstraction layer that most of us think of as drivers.  Video drivers, CD drivers, and USB drivers provide the modularity and flexibility that adapt the PC to new uses.  The same principle applies to data-driven healthcare organizations.  Most healthcare applications tout their ability to be the data warehouse solution.  However, each application’s need to improve over time introduces change and version-control issues – and thus instability – into the enterprise data warehouse.  Moving the data into an enterprise data warehouse instead creates the abstraction layer, and the extract, transform, and load (ETL) process acts like the drivers in the PC example.  Then, as the healthcare applications change over time, they do not disrupt the Enterprise Data Warehouse, its related data marts and, most importantly, the performance management systems that run the business.  It is not always necessary to move the data in order to create the abstraction layer, but there are other benefits to that approach, including the retirement of legacy applications.
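
In code, the “driver” idea might look like the following sketch: each source system gets an adapter that maps its native records into one canonical warehouse shape, so upgrading or replacing an application changes only its adapter.  All class and field names here are hypothetical.

```python
from abc import ABC, abstractmethod

class SourceAdapter(ABC):
    """The 'driver' contract: every source yields the canonical encounter shape."""
    @abstractmethod
    def extract_encounters(self) -> list[dict]: ...

class LegacyAdtAdapter(SourceAdapter):
    def extract_encounters(self) -> list[dict]:
        raw = [("12345", "2011-03-01", "2011-03-03")]        # stand-in for a real feed
        return [{"mrn": m, "admit": a, "discharge": d} for m, a, d in raw]

class NewEmrAdapter(SourceAdapter):
    def extract_encounters(self) -> list[dict]:
        raw = [{"patientId": "67890", "admitTs": "2011-04-10", "dischargeTs": "2011-04-12"}]
        return [{"mrn": r["patientId"], "admit": r["admitTs"], "discharge": r["dischargeTs"]}
                for r in raw]

def load_warehouse(adapters: list[SourceAdapter]) -> list[dict]:
    # The warehouse sees only the canonical shape, whichever application it came from.
    return [enc for a in adapters for enc in a.extract_encounters()]

print(load_warehouse([LegacyAdtAdapter(), NewEmrAdapter()]))
```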

In summary, a strong data-driven healthcare organization must train its people in, and communicate, the importance of data as a support for performance management, and it must win buy-in from the moment of data acquisition through the entire lifecycle of each key data element.  The payoffs are big: revenue optimization, risk mitigation, and elimination of redundant costs.  When a healthcare organization focuses on treating data as a strategic asset, it changes the outcome for everyone in the organization and restores trust and reliability in key decision making.

Data Profiling: The BI Grail

In healthcare analytics, as in analytics for virtually every other business, the landscape facing the Operations, Finance, Clinical, and other organizations within the enterprise is almost always populated by a rich variety of systems that are prospective sources for decision support analysis.  I propose that we insert into the discussion some ideas about the inarguable value of, first, data profiling and, second, a proactive data quality effort as part of any such undertaking.

Whether an effort starts from the ground up or the scope of an already successful initial project is set to expand significantly, all data integration, warehousing, and business intelligence efforts benefit from the proper application of these disciplines – and from acting on their findings early, often, and as aggressively as possible.

I like to say that in data-centric applications, the framework and mechanisms that comprise a solution are in some respects even more abstract than traditional OLTP applications because, up to the point at which a dashboard or report is consumed by a user, the entire application virtually IS the data – sans the bells, whistles, and widgets that are the more “material” aspects of GUI/OLTP development efforts:

  • Data entry applications, forms, websites, etc. all exist generally outside the reach of the project being undertaken.
  • Many assertions and assumptions are usually made about the quality of that data.
  • Many, if not most, of those turn out not to be true, or at least not entirely accurate, despite the very earnest efforts of all involved.

What this means in terms of risk to the project cannot be overstated.  Because that risk is largely unknown in most instances, it can be neither qualified nor quantified.  It often turns what seems, on the face of it, to be a relatively simple “build machine X” project – with gear A, chain B, and axle C – into one with gear A (missing teeth), chain B (not missing any links, but definitely rusty and in need of polishing), and axle C (which turns out not to exist at all, though it is much discussed, maligned, or even praised, depending upon who is in the room and how big the company is).

Enter The Grail.  If there is a Grail in data integration and business intelligence, it may well be data profiling and quality management, on its own or as a precursor to true Master Data Management (if that hasn’t already become a forbidden term for your organization due to past failed tries at it).

Data profiling gives us a pre-emptive strike against our preconceived notions about the quality and content of our data.  It gives us not only quantifiable metrics by which to measure and modify our judgment of the task before us, but frequently sends various business units scrambling to improve data they honestly did not realize was so flawed.
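
Those quantifiable metrics can be as simple as fill rates and distinct counts per column – two numbers that alone routinely surprise the business units that own the data.  A minimal sketch, with invented field names:

```python
from collections import Counter

def profile_column(rows: list[dict], col: str) -> dict:
    """Fill rate, distinct count, and top values for one column."""
    values = [r.get(col) for r in rows]
    filled = [v for v in values if v not in (None, "")]
    return {
        "column": col,
        "fill_rate": round(len(filled) / len(values), 2) if values else 0.0,
        "distinct": len(set(filled)),
        "top_values": Counter(filled).most_common(3),
    }

rows = [
    {"gender": "M", "zip": "46240"},
    {"gender": "",  "zip": "46240"},
    {"gender": "M", "zip": None},
]
print(profile_column(rows, "gender"))   # fill_rate 0.67, distinct 1
```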

Data quality efforts, following comprehensive profiling and whatever proactive correction is possible, give a project the chance to fix problems without changing source systems per se – and before the business intelligence solution becomes either a burned-out husk on the side of the EPM highway (failed because of poor data) or, at the least, a de facto data profiling tool in its own right, coughing out whatever data doesn’t work instead of serving its intended purpose: delivering key business performance information based upon a solid data foundation in which all have confidence.

The return on investment for such an effort is measurable, sustainable, and so compelling an argument that no serious BI undertaking, large or small, should go forward without it.  Whether in healthcare, financial services, manufacturing, or another vertical, its value is, I submit, inarguable.

Tackling the Tough One: Master Data Management for the Healthcare Enterprise

One of the big struggles in healthcare is the difficulty of Master Data Management.  A typical regional hospital organization can have upwards of 200 healthcare applications, multiple versions of systems and, of course, many, many “hidden” departmental applications.  In that situation, Master Data Management for the enterprise as a whole can seem like a daunting task.  Experience dictates that those who are successful in this effort start with one important weapon: data and application governance.

Data and application governance is often compared to building police stations, but it is much more than that.  Governance in healthcare must begin with an understanding of data as an asset to the enterprise.  For example, an Enterprise Master Patient Index (EMPI) is a key asset that lets healthcare providers verify the identity of a patient independent of how they enter the healthcare delivery system.  Patients are more than a surgical case, an outpatient visit, or a pharmacy visit.  Master data management in healthcare is the cornerstone of treating patients across the entire continuum of care, independent of applications and location of care.  Bringing the ambulatory, acute care, and home care settings into one view assures patients that a healthcare organization is managing the entire enterprise.
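
At its simplest, an EMPI is a crosswalk from each facility’s local identifiers to one enterprise person ID, as in the hypothetical sketch below (all identifiers invented):

```python
# Crosswalk from (care setting, local identifier) to one enterprise person ID.
EMPI = {
    ("acute_hospital", "MRN-00123"):  "E-000042",
    ("ambulatory",     "AMB-98765"):  "E-000042",
    ("home_care",      "HC-5551212"): "E-000042",
}

def enterprise_id(setting: str, local_id: str):
    """Resolve a local identifier to the single enterprise person ID, if known."""
    return EMPI.get((setting, local_id))

# One patient, three care settings, one identity:
print(enterprise_id("ambulatory", "AMB-98765"))    # -> E-000042
print(enterprise_id("home_care",  "HC-5551212"))   # -> E-000042
```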

Tracking healthcare providers and their credentials across multiple hospitals, clinics, and offices is another master data management challenge.  While there are specialized applications for managing physicians’ credentials, there are no enterprise-level views that encompass all types of healthcare professionals in a large healthcare organization and their respective certifications.  In addition, this provider provisioning should be closely aligned with security and access to protected health information.  A well-designed governance program can supervise the creation of this key master data and its integration across the organization.

An enterprise view of master data provides a core foundation for exploiting an organization’s data to its full potential and pays dividends well beyond the required investment.  Healthcare organizations face many upcoming challenges with reference data as part of master data management, especially as the mandated change from ICD-9 to ICD-10 codes approaches.  Hierarchies are the magic behind business analytics – the ability to define roll-ups and drill-downs of information.  Core business concepts should be implemented as master data – how does the organization view itself?  The benefits of a carefully defined and well-governed master data management program are many: consistent reporting of trusted information, a common enterprise understanding of information, the cost efficiencies of reliable data, improved decision making from trusted authoritative sources and, most importantly in healthcare, improved quality of care.
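
To illustrate the roll-up idea, here is a toy sketch in which unit-level counts roll up through a facility level to a region without re-querying source systems; the hierarchy and the numbers are invented for illustration.

```python
from collections import defaultdict

# Each node points at its parent: unit -> facility -> region.
PARENT = {
    "ICU-A": "Memorial Hospital", "Med/Surg-2": "Memorial Hospital",
    "Memorial Hospital": "North Region", "Lakeside Clinic": "North Region",
}

cases = {"ICU-A": 120, "Med/Surg-2": 340, "Lakeside Clinic": 85}   # leaf-level counts

totals = defaultdict(int)
for node, count in cases.items():
    totals[node] += count
    parent = PARENT.get(node)
    while parent:                        # carry each count up every level
        totals[parent] += count
        parent = PARENT.get(parent)

print(totals["Memorial Hospital"])   # -> 460
print(totals["North Region"])        # -> 545
```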

Data and application governance is the key to success with master data management.  Just like an inventory, key data elements, tables, and reference data must be cataloged and carefully managed.  Master data must be guarded by three types of key people: a data owner, a data steward, and a data guardian.  The data owner takes responsibility for the creation and maintenance of the key asset.  The data steward is the subject matter expert who determines the quality of the master data and its appropriate application and security.  Finally, the data guardian is the information technology professional who oversees the database, ensures the proper back-up and recovery of the data assets, and manages the delivery of the information.  In all three roles, accountability is important, overseen by an enterprise information management (EIM) group composed of key data owners and executive IT management.

In summary, master data management provides the thread that ties all other data in the enterprise together.  It is worth the challenge to create, maintain, and govern it properly.  For success, pick the right people, understand the process, and use reliable technology.