Data Profiling: The BI Grail

In Healthcare analytics, as in analytics for virtually every other business, the landscape facing the Operations, Finance, Clinical, and other organizations within the enterprise is almost always populated by a rich variety of systems that are prospective sources for decision support analysis. I propose that we insert into the discussion some ideas about the value of, first, data profiling, and second, a proactive data quality effort as part of any such undertaking.

Whether a project is built from the ground up or the scope of an already successful initial effort is set to expand significantly, all data integration, warehousing, and business intelligence efforts benefit from the proper application of these disciplines, and from acting on their findings early, often, and as aggressively as possible.

I like to say that in data-centric applications, the framework and mechanisms that comprise a solution are in some respects even more abstract than in traditional OLTP applications because, up to the point at which a dashboard or report is consumed by a user, the entire application virtually IS the data, sans the bells, whistles, and widgets that are the more “material” aspects of GUI/OLTP development efforts:

  • Data entry applications, forms, websites, and the like generally exist outside the reach of the project being undertaken.
  • Many assertions and assumptions are usually made about the quality of the data they produce.
  • Many, if not most, of those turn out not to be true, or at least not entirely accurate, despite the very earnest efforts of all involved.

What this means in terms of risk to the project cannot be overstated. Because that risk is largely unknown in most instances, it can be neither qualified nor quantified. It often turns what seems, on the face of it, to be a relatively simple “build machine X” project with gear A, chain B, and axle C into building machine X with gear A (missing some teeth), chain B (not missing any links, but definitely rusty and in need of polishing), and axle C (which turns out not to exist at all, though it is much discussed, maligned, or even praised depending upon who is in the room and how big the company is).

Enter the Grail. If there is a Grail in data integration and business intelligence, it may well be data profiling and quality management, on its own or as a precursor to true Master Data Management (if that hasn’t already become a forbidden term in your organization due to past failed attempts at it).

Data profiling gives us a pre-emptive strike against our preconceived notions about the quality and content of our data. It yields quantifiable metrics by which to measure and revise our judgment of the task before us, and it frequently sends business units scrambling to improve data they honestly did not realize was so flawed.
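To make the idea concrete, here is a minimal sketch of a first-pass profile in Python with pandas. The file and column names (mrn, weight_kg) are hypothetical stand-ins, and a real effort would likely use a dedicated profiling tool, but even this much surfaces the null rates, duplicate keys, and out-of-range values that preconceived notions rarely survive:

```python
import pandas as pd

# Hypothetical extract of patient demographics; in practice this would be
# pulled from the source system, not a CSV on disk.
df = pd.read_csv("patient_demographics.csv")

# Column-by-column profile: type, percent null, cardinality, numeric range.
profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "null_pct": (df.isna().mean() * 100).round(1),
    "distinct": df.nunique(),
    "min": df.min(numeric_only=True),
    "max": df.max(numeric_only=True),
})
print(profile)

# Targeted checks against the assumptions we were handed:
print("duplicate MRNs:", df["mrn"].duplicated().sum())        # "MRN is unique"
bad_weight = (df["weight_kg"] < 1) | (df["weight_kg"] > 400)  # "weights are in kg"
print("implausible weights:", bad_weight.sum())
```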

Data quality efforts, following comprehensive profiling and whatever proactive correction is possible, give a project the chance to fix problems without changing the source systems per se, and to do so before the business intelligence solution becomes either a burned-out husk on the side of the EPM highway (failed because of poor data) or, at best, a de facto data profiling tool in its own right, coughing out whatever data doesn’t work instead of serving its intended purpose: delivering key business performance information based upon a solid data foundation in which all have confidence.
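As a sketch of what fixing problems without touching the source systems can look like: validation rules applied in a staging pass, with failing rows quarantined and routed back to the owning business unit rather than silently loaded. The rules and field names here are invented for illustration:

```python
import pandas as pd

def apply_quality_rules(staged: pd.DataFrame):
    """Split staged rows into clean and quarantined sets. Only the copy in
    the staging area is touched; the source system is never modified."""
    rules = {
        "missing_mrn": staged["mrn"].isna(),
        "future_dob": pd.to_datetime(staged["dob"], errors="coerce") > pd.Timestamp.today(),
        "bad_gender_code": ~staged["gender"].isin(["M", "F", "U"]),
    }
    failures = pd.DataFrame(rules)
    bad = failures.any(axis=1)

    # Quarantined rows carry the names of the rules they broke, so the
    # owning business unit knows exactly what to fix.
    quarantine = staged[bad].assign(
        failed_rules=failures[bad].apply(lambda r: ",".join(r.index[r]), axis=1))
    return staged[~bad], quarantine
```

Clean rows flow on to the warehouse; quarantined rows go back to the people who can actually correct them at the source.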

The return on investment for such an effort is measurable, sustainable, and so compelling an argument that no serious BI undertaking, large or small, should go forward without it. Whether in Healthcare, Financial Services, Manufacturing, or another vertical, its value is, I submit, inarguable.

Physicians Insist: Leave No Data Behind

“I want it all.” This sentiment is shared by nearly all of the clinicians we’ve met with, from the largest integrated health systems (IHS) to the smallest physician practices, when asked what data they want access to once an aggregation solution like a data warehouse is implemented. From discussions with organizations throughout the country and across care settings, we understand a problem that plagues many of these solutions: the disparity between what clinical users would like and what technical support staff can provide.

For instance, when building a Surgical Data Mart, an IHS can collect standard patient demographics from a number of its transactional systems. Ask the doctors, “Which ‘patient weight’ would you like to keep: the one from your OR system (Picis), your registration system (HBOC), or your EMR (Epic)?” and, sure enough, they will respond, “All three.” Unfortunately, the doctors often do not consider the cost and effort associated with providing three versions of the same data element to end consumers before answering, “I want it all.” And therein lies our principle for accommodating this request: Leave No Data Behind. We are not alone in supporting it.
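A minimal sketch of how “all three” can be honored without ambiguity: keep one row per observation, tagged with its source system and timestamp, rather than forcing a single winner at load time. The structures below are hypothetical, not the actual Picis, HBOC, or Epic interfaces:

```python
import pandas as pd

# One row per (patient, source) observation; nothing is discarded, and
# every value keeps its lineage.
weights = pd.DataFrame([
    {"mrn": "12345", "source": "Picis (OR)",          "weight_kg": 82.5, "recorded": "2009-03-01 07:15"},
    {"mrn": "12345", "source": "HBOC (registration)", "weight_kg": 84.0, "recorded": "2009-03-01 06:02"},
    {"mrn": "12345", "source": "Epic (EMR)",          "weight_kg": 83.1, "recorded": "2009-02-27 14:40"},
])

# Casual reporting can still be offered a single "preferred" value (here,
# the most recently recorded) while the mart retains all three.
preferred = weights.sort_values("recorded").groupby("mrn").tail(1)
print(preferred)
```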

By now you’ve all heard that Microsoft is making a play in healthcare with its Amalga platform. MS will continue its strategy of integrating expertise through acquisition, and so far it seems to be working. MS claims that one advantage of Amalga is its ability to store and manage an effectively unlimited amount of data associated with a patient encounter, across care settings and over time, for a truly horizontal and vertical view of the patient experience. Simply put: No Data Left Behind. The other major players (GE, Siemens, Google) are shoring up their offerings through partnerships that highlight the importance of access to, and management of, huge volumes of clinical and patient data.

Why is the concept of No Data Left Behind important? Clinicians have stated emphatically, “We do not know what questions we’ll be expected to answer in 3-5 years, whether driven by new quality initiatives or by regulatory compliance, and therefore we’d like all the raw and unfiltered data we can get.” Additionally, the recent popularity of clinical dashboards and alerts (or “interventional informatics”) in clinical settings further supports this claim. While alerts can be useful, helping to prevent errors, decrease costs, and improve quality, studies suggest that the accuracy of alerts is critical for clinician acceptance; the type of alert and its placement and integration in the clinical workflow are also very important in determining its usefulness. As mentioned above, many organizations understand the need to accommodate the “I want it all” demand, but few combine this with expertise in the aggregation, presentation, and appropriate distribution of this information for improved decision making and tangible quality, compliance, and bottom-line impacts. Fortunately, there are a few of us who’ve witnessed and collaborated with institutions to help evolve from theory to strategy to solution.
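Because accuracy is what wins or loses clinician acceptance, an alerting rule is usually written to stay silent unless the underlying data is both fresh and plausible. A hypothetical sketch (the thresholds and field names are invented, not drawn from any cited study):

```python
from datetime import datetime, timedelta

def weight_gain_alert(readings, limit_kg=2.0, window=timedelta(days=2)):
    """Hypothetical interventional-informatics rule: flag rapid weight gain,
    but only when at least two recent, plausible readings exist. A false
    alarm costs clinician trust faster than a missed one."""
    recent = [r for r in readings
              if datetime.now() - r["taken_at"] <= window   # freshness check
              and 1.0 < r["weight_kg"] < 400.0]             # plausibility check
    if len(recent) < 2:
        return None  # not enough trusted data; stay silent rather than guess
    recent.sort(key=lambda r: r["taken_at"])
    delta = recent[-1]["weight_kg"] - recent[0]["weight_kg"]
    return f"Weight up {delta:.1f} kg in {window.days} days" if delta >= limit_kg else None
```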

Providers must formulate a strategy to capitalize on the mountains of data that will come once the healthcare industry figures out how to integrate technology across its outdated, paper-laden landscape. Producers and payers must implement the proper technology and processes to consume this data via enterprise performance management front-ends so that the entire value chain becomes more seamless. The emphasis on data presentation (think BI, alerting, and predictive analytics) continues to dominate the headlines and budget requests. Healthcare institutions, though, understand that these kinds of advanced analytics require the appropriate clinical and technical expertise to implement. Organizations, now more than ever, are embarking on this journey. We’ve had the opportunity to help overcome the challenges of siloed systems, latent data, and an incomplete view of the patient experience, helping institutions realize the promise of an EMR, the benefits of integrated data sets, and the decision-making power of consolidated, timely reporting. None of these initiatives will be successful, though, with incomplete data sets; a successful enterprise data strategy, therefore, always embraces the principle of No Data Left Behind.

Change: Opportunity or Agony?

“Embracing change” is a common mantra. Experiencing change, however, is a certain reality, and with it comes a series of choices for everyone involved. Perhaps the game of Jenga™ demonstrates these choices. As you may know, Jenga consists of wooden blocks shaped like tiny beams. The game starts with the beams stacked tightly, three per layer, alternating layers vertically and horizontally. The object is to dislodge any block from the tower by hand and place it on a new layer at the very top, expanding the tower upward until it topples from lack of support below or is blown over by a strong gust of wind.

Like the Jenga tower, a business also grows by using its assets, strengths, and opportunities to build customers and market share.

To continue comparing Jenga to running an enterprise, consider two different perspectives. The player of the game is like the executives of the organization, moving structural blocks around to expand it. The executive has a 360-degree view of the tower, with the ability to stress-test the blocks before committing them to the move, and can scan the environment for threats to the construction, such as a shaky playing table or strong winds.

The contrasting view is that of the employees affected by the move, perhaps visualized as tiny ants clinging to the moved block. These individuals have an intimate knowledge of this specific block. They know each dent, scratch, and slight change in color. They know how snugly it fits against the neighboring blocks (the nitty-gritty necessary to accomplish a job) and how it informally interacts with the others. But this internal perspective lacks the comprehensive view. From within the safety of the tightly built fortress, workers may not sense the unstable foundation or feel the gusts.

As a block is selected, those associated with it can be hurled into significant change. One’s first reaction to the vibration may be to grab on as hard as possible to the comfort of the block. Despite the desperation, it takes very little time to see that the forces are overpowering and a significant change is imminent. At that point, there are really two broad choices: resist or cooperate.

The first choice, resistance, can lead to demise. To explain this, let’s consider the two forms of resistance: denial and defiance.

Denying the seriousness of the changing forces can severely cripple an industry. Current examples of underestimating the impact of an impending change can be seen in traditional media. After initial reluctance, newspapers, magazines, and local broadcast television and radio eventually adopted the Internet, but by applying each traditional medium’s paradigm to the new forum, they used it merely as another broadcasting and publishing vehicle. Newspapers, for example, started by replicating their publications online and updating the sites daily, after street publication. Internet users expecting more immediate news discontinued their subscriptions to the physical newspapers and started viewing news on new Internet news sites that refreshed content frequently.

The other form of resistance, defiance, can cause alienation from peers who tire of negative attitudes. Excessive defiant behavior can lead to dismissal by those who perceive it as obstructive.

In contrast, the option of cooperation can lead to quite different outcomes. If the change comes from competitive or industry pressure, adapting to the new opportunities it creates can put you in the driver’s seat. The Internet sites that enabled the viewer to customize content offer an example of seizing the opportunity to lead an industry. In Jenga, a beam moved to the top is exposed to uncomfortable drafts, unfamiliar elements, and added visibility. The gusts and vulnerability can be threatening, and the fall is farther if the beam is knocked off. However, the experiences gained are the essence of leadership.

Another recent example is the trend to eliminate travel expenses. Geographically dispersed employees, trainers, and consultants can overcome this obstacle by mastering the various technologies needed to be productive remotely. As organizations adopt these methods, the paradigms of phone etiquette, correspondence, and meeting presentations will morph into new standards. Those of us who have adapted will benefit professionally.

Other gloomy headlines report that many companies have fallen, or, as in Jenga, that their towers have toppled. Those who have fallen into the heap are left with the challenge of adapting to a new reality. After some brushing off, their skills can be applied to a new tower. Existing knowledge and tools will be augmented by wisdom for the next cycle of industry changes.

As professionals, we need to recognize that external forces will cause us to make some hard decisions. To react with leadership, we should seek opportunities in the changes, communicate the realities, and urge others to accept them.

It’s the End of the World As We Know It!

The Holidays are a great time for watching “End of the World” shows on the History Channel. They were a great comfort, actually almost encouraging, because all of the prophecies target 2012. “The Bible Code II”, “The Mayan Prophecies”, and the big 2012 special compendium of End of the World scenarios, covering Nostradamus to obscure German prophets, all agree that 2012 is the big one (Dec 21, to be exact!). What a relief! The rest of the news had been trending toward canned goods, shotguns, and gold by the end of the year. We really have almost a whole 48 months before everything goes bang (I wasn’t ready anyway; procrastination rules!).

Unfortunately, we need to do some IT planning and budgeting for the new year, and we probably should have some thoughts going out 36 months (after that, see the first paragraph). As I discussed in a prior blog, the reporting, BI/CPM/EPM, and analytics efforts are the strongest priority, followed by rational short-term cost savings efforts. All organizations must see where they are heading and keep as much water bailed out of the corporate boat as possible. Easy call, job done!

Then again, a horrifying thought occurred to me: what if one of these initiatives should fail? (See my nightmares in prior blog posts on data and analytics.) I am not saying I’m the Mad Hatter and the CEO is the Red Queen, but my head is feeling a bit loosely attached at the moment. Management cannot afford a failed project in this environment, and neither can the CIO in any company (remember, CIO = Career Is Over).

The best way to ensure successful project delivery (and guarantee my ringside lawn chair and six-pack at Armageddon in 2012) lies in building on best practice and solid technical architecture. For example, the most effective architecture is to use a layer of indirection between the CPM application (like Planning & Budgeting) and the source data systems (ERP, custom transactional). This layer of indirection serves as a data staging area, allowing transfer to and from fixed layouts for simplified initial installation and maintenance. In addition, the staging area can be used for data cleansing and rationalization operations to prevent polluting CPM cubes with uncontrolled errors and changes. In terms of best practice, libraries and tools should be used in all circumstances to encapsulate knowledge, rather than custom procedures or manual operations. Another best practice is to get procedural control of the Excel and Access jungle of wild and woolly data, which stands ready to crash any implementation and cause failure and embarrassment to the IT staff (and former CIO). When systems fail, it is usually a failure of confidence in the validity or timeliness of the information, whether presented in a dashboard or a simple report.
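As a sketch of that layer of indirection (the shape of the idea, not any vendor’s implementation; all names below are hypothetical): sources map onto a fixed staging layout, cleansing happens in staging, and only validated rows reach the CPM application.

```python
# Stable contract with the source systems: however the ERP exports change,
# staging always presents these columns to the CPM load.
FIXED_LAYOUT = ["entity", "account", "period", "amount"]

def stage(raw_rows):
    """Map each source's export onto the fixed layout; extra source fields
    are ignored, so source-side changes don't ripple into the CPM app."""
    return [{col: row.get(col) for col in FIXED_LAYOUT} for row in raw_rows]

def cleanse(staged):
    """Reject rows that would pollute the cube; keep them for review."""
    clean, rejected = [], []
    for row in staged:
        ok = row["amount"] is not None and row["account"]
        (clean if ok else rejected).append(row)
    return clean, rejected

def load_to_cpm(clean_rows):
    # Placeholder for the vendor's load utility; only cleansed rows pass.
    print(f"loading {len(clean_rows)} validated rows into the planning cube")

raw = [{"entity": "East", "account": "4000", "period": "2009-01",
        "amount": 125000.0, "erp_internal_flag": "X"}]
clean, rejected = cleanse(stage(raw))
load_to_cpm(clean)
```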

CPM, EPM, and analytics systems distill and convey incredibly refined information, and decisions of significant consequence, to restructure and invest, are being made within organizations based on it. The information and the decisions are only as good as the underlying data going into them. So skimping on a proper implementation can put the CIO’s paycheck at serious risk (Ouch!).

The Fog Has Engulfed Us, Captain! What Do We Do?

The current business environment reminds me of being socked into a fog bank within minutes of a pleasant summer sail. The entire episode puts the pucker-factor meter in the red zone. One minute it is clear sun and a nice breeze; the next you can’t see your hand in front of your face. Your other senses become more acute: suddenly you hear the splash of the waves on the rocks you cannot see (funny, I didn’t hear that a minute ago). The engines of power boats are closer, seeming to come at you from every quarter (PT-109, how bad can it be?).

As you sit in the cockpit with your canned-air fog horn and US Coast Guard approved paddle, you realize that the portable marine radio you bought will not save your sorry carcass (though at least the Coast Guard can retrieve your drowned body as you go down). You kick yourself for not buying that radar instead of the case of wine as a boating accessory (in fact, you think of downing some of that right now to ease your passing). What you would not give for just a little visibility.

That’s what running a business feels like right now (makes you want to puke, doesn’t it, what fun). My kingdom for some visibility! Sure, you can see what the others are doing: cut a few heads here, shut a facility there. Is that the right thing to do? Are you killing your future seed corn, or bailing out the water that would sink the company? Ugh! In this situation, you really wish your company’s reporting could be the radar that tells you where and where not to go (sure wish I had bought that CPM package rather than that sales meeting in Napa Valley). With dashboards, planning and budgeting, consolidation, and operational BI, I would have a much better sense of what to feed and what to kill to take advantage of my competitors coming out of this economic fog (Aye, Captain! In the Bay of the Blind, the One-Eyed Man is Admiral!). Wishing and regrets won’t get you much, and capital investment at this point seems to be a dirty word (yep, there it is on George Carlin’s list).

In the case of my sailing experience, the way I got out of the fog and fear was to dig out the depth finder the former owner had left behind and the charts I had bought because it seemed like a good idea at the time. I then proceeded to steer the sailboat in circles, matching the readings on the depth finder with the depth readings on the chart based on my dead reckoning of my location (you reckon wrong, you’re dead). Needless to say, it worked: the fog cleared, and I was within a quarter mile of where I should have been (Cool!). Just straightening out existing corporate reports and cleaning existing data is the equivalent of using the depth finder and charts already on hand (yes, I know the difference between capital and expense). In fact, that effort usually saves money by eliminating old, unused reports (oh, I feel so green!).

In any case, take that solid first step, and then get those state-of-the-art visibility tools of BI/CPM/EPM when the current problems clear or things become so dire as to require dry-dock repairs. That way, the pucker meter won’t be buried in the red the next time this happens, and it will happen.


Why creating actionable information from your existing systems is so difficult

With all the easy-to-use business intelligence tools and technology we have today, why is it so difficult to create actionable information from the wealth of data in our organizations?

To answer that, one needs to understand, at a high level, the systems we have built and how they got that way. Your core business systems have evolved over time, budget cycle by budget cycle, with no eye toward the overall enterprise. Systems were built to support core business functions: Payroll/HR, General Ledger, Inventory, and so on. They were transactional in nature, designed to meet immediate requirements (e.g., cutting payroll checks, tracking inventory, managing an assembly line) that did not include getting business intelligence out. Over time these systems became islands of data, popularly known as silos.

Add the fact that silos are structured differently and that common data like product and customer is typically not standardized, and answering questions across silos becomes difficult and labor-intensive.
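A sketch of why those cross-silo questions are so labor-intensive: the same customer carries a different identifier (and spelling) in each system, so someone must build and maintain a crosswalk before any join works. All IDs below are invented:

```python
import pandas as pd

# The same customer, as two different silos know it.
billing = pd.DataFrame([{"cust_id": "C-0042", "name": "Acme Corp",
                         "revenue": 18000}])
support = pd.DataFrame([{"acct_no": "A9917", "name": "ACME Corporation",
                         "open_tickets": 7}])

# There is no shared key, and a naive join on name fails, so a
# hand-maintained crosswalk becomes the price of an enterprise answer.
crosswalk = pd.DataFrame([{"cust_id": "C-0042", "acct_no": "A9917"}])

answer = (billing.merge(crosswalk, on="cust_id")
                 .merge(support, on="acct_no", suffixes=("_billing", "_support")))
print(answer[["cust_id", "acct_no", "revenue", "open_tickets"]])
```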

As these systems matured, the owners of each silo developed departmental business intelligence needs of their own. So, as budget became available, they added a data warehouse or data mart on top of their silo.

The result is larger silos, with larger sunk investments and still no ability to provide enterprise answers or actionable information. This approach works for immediate departmental BI needs, but if the business asks a question of data that resides in two or more of the silos, getting the answer usually involves a significant IT effort. By the time IT responds, the business has moved on to a different question. The business analyst starts gluing spreadsheets together to provide some insight, kicking off the next activity in the BI food chain: manual analytics.