You are currently browsing the monthly archive for August 2008.
To structure or not to structure: that is the question. Whether it is nobler in the mind to suffer the slings and arrows of metadata, ontology, and sixth canonical normal form, or to take up arms against 30 years of data structure dogma and piety and, by opposing them, convert to Web 2.0 search technology (and potentially ruin my career, the remaining shards anyway). Lately, I have felt as torn as Hamlet: stay with my data heritage or end it all with radical Web 2.0 abandon.
I came out of the late 1970s as a DBA (database administrator), religiously putting flat files and hierarchical DBMSs (database management systems, IMS specifically) to the sword and evangelizing the purity of the CODASYL model and teleprocessing systems. Naturally, I was put to the sword in turn by Codd, Date, and relational DBMSs. Later, we fought back with object-oriented databases, but being older and wiser, detente reigned. The only good data was analyzed and structured data, fourth normal form (sixth is extreme) at a minimum, all carefully placed in some DBMS so it could be transacted, searched, and reported. Ultimately, this drive to structured data has led to Business Intelligence (BI, an oxymoron, like military intelligence), Corporate Performance Management (CPM) and Executive Dashboards (picture the Elmo dashboard toys you strap to the baby’s crib: spin it, ring it, beep it, ha ha ha).
Like Galileo, unfortunately, I tasted some forbidden fruit and it has haunted me for years. I was first taunted by the BLOB construct, which allowed unstructured data to be put in a database container without the DBMS caring what it was. Properly tagged, the unsearchable could be found, but it was still labor intensive. My second taste came from being one of the sorry set of individuals to develop on Apple’s Newton platform (great haiku, bad handwriting recognition). The development platform and runtime were a rich object soup giving incredible flexibility as to what constituted data and instruction. Now, I am severely tempted by HTML and Search in the guise of Web 2.0 (tie me to the stake and light me up, I confess).
Building out Enterprise Data Models for the average corporation or, even more difficult, Biomedical Data Stores for life sciences are extremely labor intensive, frustrating, and often futile endeavors. The difficulty (cost, time) is directly correlated to the need for precise metadata and ontology. Deriving, documenting, and retrofitting are massive efforts (and definitely not for the ADHD among us, who me?). All of the investment is up front, before the first benefit can be realized (real scary career-wise). However, this is the “right”, dogmatic, safe way to handle data.
This is why our data stores are embarrassing data dumps (landfills, complete with dozers and sea gulls). It is the difficulty and cost of proper classification and maintenance of data in a structured environment that feeds this end. Think of it as data entropy, devolving to the most basic disorganized state. If this basic unstructured state is where data is going, why not just leave it in the “natural” state? Use human cognitive effort and Web 2.0 tools to promote the best and most useful data to the top of the heap and let the stuff of dubious integrity drop and disappear into the gravel in the bottom of the big data (fish) tank. Rather than spending all that up-front investment before the first benefit, the process would be one of steady refinement over time.
The raw data permeating the Web is greater than any structured data store and seems infinite in type and variety. Like the ocean, people dip out what they require and what interests them with ever-increasing success. The rate of evolution of the supporting technology is astronomical. If we could put half the effort into molecule discovery that we put into Britney Spears antics, the world would be a much better place.
Web 2.0 is giving me flashbacks to an old TV commercial for Prego spaghetti sauce; “Tomatoes, in there! Garlic, in there! Carrots, in there! Half of Italy, in there!…” It seemed no matter what you asked for it was in that bottle of sauce. Being a sauce, how could you really tell what was in there, or if it was really needed? Plus, the tomatoes colored everything red so who knows? Now we have another bottle of technical sauce here called Web 2.0; it’s in there! It’s colored all Internet so how can you tell what is really in there, or if it is really needed?
Good question. It seems like every vendor says they’re on the bottle’s list of ingredients, in fact the most important one. It would be funny if it were not so pathetic. Unfortunately, the smell here is not a nice bubbling spaghetti sauce, closer to a warm crock of….., you get the concept. Every vendor out there seems to believe companies will blindly buy anything labeled Web 2.0. Rather, the CIOs are more apt to remember the Internet bubble and where that approach got them the last time.
What is required is more definition of what Web 2.0 is, and why we in IT need to move in that direction. To get that basic understanding, we need to break out that old spaghetti sauce pan again to boil out all the fancy analysis and obsequious technology. Lo and behold! What remains is a simple concept: the inmates are now in control of the asylum. Users of the Internet have turned the tables on the big players in the space; they are no longer happy being spoon-fed from a portal. The denizens want to hunt it on their own terms, see it their own way, save it and dispose of it as they please. If you stand in their way, this mob of Internet hunter-gatherers will crush you with the loss of their eyeballs (poor Yahoo, poor eBay, happy Facebook, happy iPhone).
If this basic principle is followed like a lodestone, much that is occurring in the Internet space becomes more illuminating and the proper path forward (with supporting technology) is a great deal clearer to discern. For example, the winning companies embrace openness and external developers. There is no way their internal staff can create, and the site push, enough content and functionality to stay on top. The Tao of a top site is to be one with the masses; following and attempting to push is uncool. Allowing users to mash up specialty widgets into cool personal discoveries is winning; monetization will ultimately follow.
By this point, you are thinking — how is all this ethereal philosophic spew helping me? I need to get something together that can be called Web 2.0 or my IT existence is at risk! Do not worry Grasshopper (I’m showing my ’70s again, rats!) I’ll put forward a corporate-friendly straw man. If SharePoint is used to enable a project, process, or department; it is so Web 1.0 (boring!). If we put the entire corporation up on SharePoint, acting like a corporate Facebook, we are getting there. If we template it such that we now have ubiquitous collaboration; optimizing and moving our corporate intellectual property (IP) at light speed much nicer. But for ultimate coolness, we need to commit heresy and wire a Google search appliance in, after adding all of our corporate content to the pile: documents, presentations, everything. Then the cherry on top, flatten key data bases to HTML and toss them in. Now, with proper organizational change management (Yes Billy! You can run with scissors, points down please), employees can use all of the power contained in Web 2.0 to maximize unstructured corporate data for speed and profit. Mangiare! Mangiare!
The basket of technology comprising Web 2.0 is a wonderful thing and worthy of all of the press and commentary it receives, but what really scares me is the state of data in this new world. Data sits in the basement of this wonderful technology edifice, ugly, dirty, surrounded by squalor, and chained in place. It is much more fun to just buy the next storage array (disk is cheap, infinite, what power bill?) than it is to grind through it, clean it up, validate it, and ensure proper governance and ontology.
What is Web 2.0 for, if not to expose more content? And data is the ultimate content. Knowing what is hiding in the basement, there are going to be a lot of embarrassed organizations (Lucy, you got some ’splaining to do!). Imagine how difficult it is going to be to link and synchronize content and data in the Web 2.0 environment. Imagine explaining the project delays and failures of Web 2.0 initiatives when the beast in the basement gets a grip on them.
Normally, the technology will be blamed. Nobody wants to admit they store the corporate crown jewels in the local landfill. Nobody will buy the new products fast enough. The server farms being built to support Cloud Computing will sit spinning and melting Arctic Ice in vain (Microsoft’s container-based approach is cool). This could seriously impact the market capitalization of our top tech giants Microsoft, Oracle, Google, Amazon. Oh no! It could crash the stock market and bring on tech and financial Armageddon given our weakened state! Even worse, my own career is at stake! The devil with them, they are all rolling in money, I could starve!
Now that I have my inner chimp back in the box, we need to put together a mitigation strategy to allow for a steady phased improvement of the data situation in tandem with Web 2.0 initiatives. It is too much to expect anybody to clean up the toxic data dump in one sitting, and we cannot tag Web 2.0 with the entire bill from years of neglect (just toss it in the basement, no one goes there). If we do not ask IT to own up to the issue and instead allow projects to fail, senior management (fade to The Office) will assume the technology is at fault and will not allocate the resources needed to make this key technological transition.
During an informal forum recently, (whose members shall remain nameless to protect my sorry existence a few more years), analytics projects came up as a topic. The question was a simple one. All of the industry analysts and surveys said analytic products and projects would be hot and soak up the bulk of the meager discretionary funds availed a CIO by his grateful company. If true, why were things so quiet? Why no “thundering” successes?
My answer was to put forward the “typical” project plan of a hypothetical predictive analytics project as a straw man to explore the topic:
- First, spend $50 to $100K on product selection.
- Second, hire a contractor in the product selected and tell him you want a forecasting model for revenue and cost.
- The contractor says fine, I’ll set up default questions, by the way where is the data?
- The contractor is pointed to the users. He successively moves down the organization until he passes through the hands-on user actually driving the applications and reporting (ultimately fingering IT as the source of all data). On the way the contractor finds a fair amount of the data he needs in Excel spreadsheets and Access databases on the user’s PCs (at this point a CFO in the group hails me as Nostradamus because that is where his data resides).
- IT gets some extracts together containing the remaining data required that seems to meet the needs the contractor described (as far as they can tell, then IT hits the Staple’s Easy Button — got to get back to keeping the lights on and the mainline applications running!).
- Contractor puts the extracts in the analytics product, does some back testing with whatever data he has, makes some neat graphics and charts and declares victory.
- Senior management is thrilled, the application is quite cool and predicts last month spot on. Next month even looks close to the current Excel spreadsheet forecast.
- During the ensuing quarter, the cool charts and graphs look stranger and stranger until the model flames out with bizarre error messages.
- The conclusion is drawn that the technology is obviously not ready for prime time and that lazy CIO should have warned us. It’s his problem and he should fix it, isn’t that why we keep him around?
At this point there are a number of shaking heads and muffled chuckles; we have seen this passion play before. The problem is not any product’s fault or really any individual’s fault (it is that evil nobody again, the bane of my life). The problem lies in the project approach.
So what would a better approach be? The following straw man ensued from the discussion:
- First, in this case, skip the product selection. There are only two leading commercial products for predictive analytic modeling (SAS, SPSS). Flip a coin (if you have a three-headed coin look at an open source solution, R or ESS), maybe it’s already on your shelf, blow the dust off. Better yet, would a standard planning and budgeting package fit (Oracle/Hyperion)? The next step should give us that answer anyway, no need to rush to buy, vendors are always ready to sell you something (especially at month/quarter end – my, that big a discount!).
- Use the money saved for a strategic look at the questions that will be asked of the model: What are the key performance indicators for the industry? Are there any internal benchmarks, industry benchmarks or measures? Will any external data be needed to ensure optimal (correct?) answers to the projected questions?
- Now take this information and do some data analysis (much like dumpster diving). The key is to find the correct data in a format that is properly governed and updated (no Excel or Access need apply). The key is accurate sustainability of all data inputs, remember our friend GIGO (I feel 20 years old all over again!). This should sound very much like a standard Data Quality and Governance Project (boring, but necessary evil to prevent future embarrassment to the guilty).
- Now that all of the data is dropped into a cozy data mart and supporting extracts are targeted there, set up all production jobs to keep everything fresh.
- This is also a great time to give that contractor or consultant the questions and analysis done earlier, so it will be at hand with a companion sustainable datamart. Now iterations begin – computation, aggregation, correlation, derivation, deviation, visualization, (Oh My!). The controlled environment holds everybody’s feet to the fire and provides excellent history to tune the model with.
- A reasonable model should result, enjoy!
No approach is perfect, and all have their risks, but this one has a better probability of success than most.
Sun was right: the network is the computer. Or rather, the Internet is the computer, if we believe all of the major Internet players and vendors racing across the plains to stake their claim for the next big gold nugget. Has anybody heard this before? It hits me like “deja vu” all over again. I have been to this movie before as SaaS, Utility Computing, ASP….TimeSharing, etc. (ugh). It is really sad when one of the players tries to trademark “cloud” (Dell).
All that being said, the goal is the Holy Grail of both the bedraggled CIO and the proud IT industry. If Cloud Computing works as envisioned, it would revolutionize application development, deployment, support, and back-up/recovery. The current installed-base of PCs would become mere appliances, distributed data centers could be consolidated, software could be designed and maintained at the application level of granularity. Development platforms, like Microsoft’s Oslo, would allow visual editing and mashing of entire applications residing in the Internet/Cloud, in whole or parts. Gone is worrying about software stacks, hardware, bandwidth, security, and back-up. I am in Nirvana, floating on a Cloud (bad pun).
Stepping back from the precipice of sarcasm, there is merit to the concept, approached with a jaundiced eye. Applications, regardless of industry or user, begin as ideas and unfortunately are easily lost amid the grinding detail of instantiation in software, hardware and bandwidth. Even the early Cloud platforms provide an opportunity to experiment quickly with innovative ideas. In a past life a venture capitalist told me, “If I could just complete my bad concepts quickly, I could make a fortune on my one good idea of fourteen!” Well, the Cloud would do it: platform, QA, and customers all in one. As a mere CIO, I could see it as an effective platform for fast, geographically distributed, collaborative development or for quick one-off applications (to be brought in-house if proven).
It will be interesting to see this trend move forward and it is certainly worthy of our R&D effort, in any case, because the pay-back is so compelling. Plus, who knows, maybe Sun was right.

I have to admit, I never subscribed to a daily newspaper. I love news and get it in many ways. When I ride the train to Manhattan, I always look for the paper another passenger has left behind. On rare occasions I have even been observed digging through the tubs of used newspapers at the Hoboken station. Killing time on a train with no internet connection is when I desperately need a newspaper. Mostly, I get my news online. What could be better than up-to-the-minute, targeted news?
Well, in a futile effort to turn back time, the Philadelphia Inquirer is trying to fend off declining print circulation by giving the print edition more relevancy. How? By instituting a policy of “print first” and instructing staff not to break stories on line.
I’ve talked recently with a few people in the publishing industry. All share the same core problem: advertising for their online media does not compensate for the decline in print ad revenues and smaller circulation. None of them thought that this trend is reversible, or that restricting online content will suddenly raise demand for the print product.
It brings up a few good questions.
Is the printed newspaper as we know it doomed to extinction?
I think so. Good content still has a market, and it will still be paid for by advertising before, during and around the content. Since so many of us carry our electronic reading devices with us everywhere (from the laptop, through the iPhone, to the Kindle), the need for a printed bundle of broad content will diminish.
How will publishers make money then? What is the future business model?
We all know that charging users for online content works only in very special places where the content is of high professional value or the employer, rather than the consumer, pays the bill. Many of the paid subscribers of the WSJ.com or the Harvard Business Review HBR.com do not foot the bill themselves. Nobody else is able to charge for content. Salon.com have tried every possible avenue in the last few years and settled on just tons of advertising.
What can the local newspaper do? What will publishing 2.0 look like? Become the center of local information and local community. Open up and think about the paper not as an employer of journalists, but as a provider of unique, easy-to-access, valued content.
The content will come from many sources: AP, local reporters, local analysts, community journalists, local and global bloggers, etc. The publisher returns to being a content publisher rather than a content producer. By sifting through the mountains of content, editors can clean, categorize, source, filter, tag and recommend content so we as users get relevant content we care about and are willing to tolerate the ads that accompany it. (In one of the successful models I’ve seen, the main benefit of the premium account is the removal of most ads.)
Advertising budgets are moving online (33% increase in the last year) consistently with the increase of time we spend online. As ad technology, bandwidth and targeting algorithms improve, a publisher that can deliver a highly segmented audience with a high quality ad experience will be able to ask for top dollars.
Online advertising rates will increase. As we move towards 100% trackable media, the difference between TV and the computer will diminish. Both channels will deliver similar content supported by ads. Advertisers will pay by reach, and if the quality of the experience is the same, an interactive experience where the user can click the ad and go to the advertiser should be worth more!
The newspaper will become a local media center that is more open, interactive, customized and relevant. Eventually, even profitable.
In a comment on our first blog post, Mark Farrell raised an excellent point about data assumptions and their impact on the integration timeline. He started our list of best practices by pointing out that clear ownership of the master and transactional data is important. Here are some additional data readiness tasks that can shorten the integration timeline if they are undertaken before the deal closes:
- Make sure all data models and data dictionaries are up to date. This will make it easier to map data from your acquisition’s systems into your (target) systems.
- Spot check data quality in your target systems. If you notice a lot of duplicate records, missing fields, or fields used for purposes other than what is defined in the data model/dictionary, your data migration will be more difficult.
- Begin defining your detailed plan for data cleansing and migration early, and specify the tests you will perform to make sure the migration is successful. Establish clear business ownership/sign-off authority for the test results. Budget time into your data migration plan for several test data loads to minimize risk on the final cutover day.
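The spot check described above can be sketched in a few lines of Python. This is a minimal illustration only, not a prescribed tool: the `spot_check` helper, the field names, and the sample records are all hypothetical, and a real check would read from the target system rather than an in-memory list.

```python
from collections import Counter


def spot_check(records, key_fields, required_fields):
    """Count duplicate keys and missing required fields in a list of dict records.

    Hypothetical helper for illustration; field names are assumptions.
    """
    # Build a key tuple per record and count how often each key occurs.
    key_counts = Counter(tuple(r.get(f) for f in key_fields) for r in records)
    duplicates = {key: n for key, n in key_counts.items() if n > 1}

    # A field counts as missing when it is absent, None, or blank.
    missing = Counter()
    for r in records:
        for f in required_fields:
            value = r.get(f)
            if value is None or str(value).strip() == "":
                missing[f] += 1

    return {"duplicates": duplicates, "missing": dict(missing)}


# Made-up sample records standing in for an extract from the target system.
sample = [
    {"customer_id": "C001", "name": "Acme", "postal_code": "02139"},
    {"customer_id": "C001", "name": "Acme Corp", "postal_code": ""},
    {"customer_id": "C002", "name": "Globex", "postal_code": None},
]

report = spot_check(
    sample,
    key_fields=["customer_id"],
    required_fields=["name", "postal_code"],
)
print(report)
```

Running a check like this on a sample before the deal closes gives an early, rough sense of how much cleansing the migration plan needs to budget for.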
In our Business Intelligence (BI) strategy consulting with healthcare clients, we are often asked how to design a metrics program so the data that ultimately populates the dashboards and drill-downs is inherently actionable. They ask: How do we design the BI data collection and presentation systems to focus our corrective actions and other interventions most effectively?
In our experience, metrics facilitate action when they exhibit four key characteristics:
- A clear definition – the meaning of the metric must be clear and unambiguous. In a surgical services context, the definition of a late start for a scheduled procedure must be precise, and agreed upon by everyone concerned. If three minutes past scheduled start is defined as late, there should be no haggling that four minutes is close enough to be considered on-time. When late surgical starts are aggregated up to a service line or an entire system, especially when used comparatively, the metric must represent a homogeneous population with regard to the definition of the metric.
- Clear attribution & dimensional focus – the attributes that describe or annotate the specific metric must be clearly defined, and must allow for focused response along some dimension that makes sense to the business operation. Most often these will align with one or more of the following:
  - Organizationally-focused – staff and other aggregated resources (e.g. departments, service lines, programs, care setting) are organized and accountable in alignment with the mission of the enterprise or segment thereof. It is clear where the action is required in the organization in order to achieve the desired effect or outcome being measured.
  - Process-focused – specific processes or standard operating procedures (e.g. standard orders or order sets, care plans, clinical pathways, standard or research protocols) are implemented and tracked for performance and compliance. It is clear in which specific process or activities action is required, in order to achieve the desired effect or outcome being measured.
  - Specific Resource-focused – specific resources (e.g. individual staff or teams, facilities, materials, equipment) are monitored for performance and compliance with standards for quality, operations or regulations. It is clear with which types or instances of these specific resources action is required, in order to achieve the desired effect or outcome being measured.
  - Other Primary Entity-focused – specific critical entities that exist in the operational context being measured, each described by a potentially diverse set of differentiating attributes. For example, in a clinical context, patients are critical entities. The set of clinical, demographic, diagnostic, prognostic, treatment, outcome or other differentiating characteristics on patients is routinely examined and analyzed for potential patterns, and possible interventions.
- Timeliness – the metric must be captured and available to responders in sufficient time to allow an appropriate response. Metrics can evolve from being primarily retrospective, to real-time reporting, to predictive, each of which enables and facilitates a different type of action. At a minimum, they must be reported in sufficient time for a meaningful response to occur.
- Accountability – with any of the above, someone in the organization must be responsible and accountable for appropriate action and assessment. The responsible party(ies) must be ready to analyze the situation and deploy the appropriate resources, to take specific needed actions in response to the position or value of each metric relative to relevant performance standards or expectations.
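To make the "clear definition" and "dimensional focus" characteristics concrete, here is a minimal Python sketch of the late-start example. It assumes the three-minute threshold mentioned above and aggregates by a hypothetical `service_line` attribute; the function names and sample cases are illustrative assumptions, not part of any particular BI product.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed definition: a case is late at three minutes past scheduled start.
LATE_THRESHOLD = timedelta(minutes=3)


def is_late_start(scheduled, actual, threshold=LATE_THRESHOLD):
    """Apply one unambiguous definition of 'late' to every case."""
    return actual - scheduled >= threshold


def late_start_rate(cases):
    """Aggregate late starts per service line using the same definition."""
    totals = defaultdict(int)
    late = defaultdict(int)
    for case in cases:
        line = case["service_line"]
        totals[line] += 1
        if is_late_start(case["scheduled"], case["actual"]):
            late[line] += 1
    return {line: late[line] / totals[line] for line in totals}


# Made-up cases for illustration only.
cases = [
    {"service_line": "orthopedics",
     "scheduled": datetime(2008, 8, 1, 7, 30),
     "actual": datetime(2008, 8, 1, 7, 32)},   # 2 minutes: on time
    {"service_line": "orthopedics",
     "scheduled": datetime(2008, 8, 1, 9, 0),
     "actual": datetime(2008, 8, 1, 9, 4)},    # 4 minutes: late
    {"service_line": "cardiac",
     "scheduled": datetime(2008, 8, 1, 8, 0),
     "actual": datetime(2008, 8, 1, 8, 3)},    # exactly 3 minutes: late
]

rates = late_start_rate(cases)
print(rates)
```

Because the threshold lives in one place, every service line is measured against the same definition, which is what keeps the aggregated metric comparable.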
Other factors such as high confidence in data quality and its source, effective communications to responders, and authority to act are also critical elements. Metrics programs and BI systems with these characteristics have taken a good first step toward enabling the focus and the improvements for which they are ultimately designed.
A key task during IT Due Diligence is assessing the strength of the IT leadership team. Martha Heller, in her recent article in CIO, defines the SVP of Technology and Operations as a cool new role and career path for CIOs.
We’re always a bit relieved when we see this role on the org chart as we begin an IT Due Diligence investigation, but of course we do a bit of probing to determine if the wearer of the title truly has what it takes to lead the organization through the 12-18 months of rapid business change that should follow any M&A deal.
Several clues to the real quality of the SVP of Tech and Ops leadership can be found by:
- Asking for and reviewing the business case or strategy document for any recent significant technology initiative. Big red flag if they can’t produce one at all.
- Determining if the overall architecture is documented, and under change control from all required perspectives: software, hardware, information, and business process perspectives. The SVP loses points if the documentation doesn’t exist or doesn’t account for planned future implementations of business and technology changes.
- Snooping around for departmental application or information silos. This usually takes some field work, as the IT leadership’s architecture documentation may not reveal what all the business units are hiding in remote offices.
Other factors come into play as well, but these are the top three, because they are the most important ways an effective technology and operations leader can turn IT from a cost center into a true business asset and an engine of growth.
McKinsey and Company released a research report last week titled “Building the web 2.0 enterprises” (free registration required). It is a global survey of about 2000 executives about the use, adoption, priorities and satisfaction with web 2.0 tools and technologies.
The summary in their words:
“Companies are using more Web 2.0 tools and technologies than they were last year, sometimes for more complex business purposes, according to McKinsey’s second annual survey on Web 2.0. Companies that are satisfied with their use of these tools are starting to see changes throughout the enterprise.”
A few thoughts and observations from the findings and from our own experience with implementing Enterprise 2.0 solutions internally at Edgewater and for clients:
1. The technologies that are being implemented.
Social networking is now in second place after web services. It is not clear how social networking is defined and if the focus is internal or external. From what we’ve seen, there are at least 3 different ways companies use social networking technologies:
- Internal social networking: the goal of these tools is to help people stay in contact, share activities and be able to find expertise inside the organization. From the much discussed use of Facebook as an internal social network by Serena Software to the creation of SharePoint profiles, the tools that currently exist are very limited in their support and address only what Andy McAfee calls the Strong circle, the group of people you interact with on a regular basis anyway. A true internal social network that will spur interaction and discovery across the enterprise is yet to emerge.
- Internal Collaboration: it is not on the list but internal forums and collaborative tools for projects are one of the oldest and most used aspects of an active intranet. Many may associate these activities as part of a social network.
- External social network for customers or partners. Here as well, collaborative environment and Social Network seem to be used interchangeably. There are a lot of forums, discussions and member interaction, but due to their limited scope, these communities rarely develop into a full-fledged social network.
The second point of interest here is the relatively low rating of some of the emerging trends like Tagging, Prediction markets and Mashups. We see a lot of interest in these upcoming technologies and expect to see them rise in priority in the future.
2. The cultural implications of adopting Web 2.0. It is good to see that in many organizations the change is not just in the tools that are introduced but also in the organizational culture and governance. 
The tight correlation between the level of satisfaction with web 2.0 tools and the degree the organization had changed indicates that they are tightly coupled. Introducing new tools to a rigid organization will result in failure. A successful implementation has to consider attitude and cultural changes as much as tools and technologies.
3. Who is leading the change: the role of IT. It is not surprising to see in the survey results that only 16% of the respondents indicated that IT had initiated the introduction of Web 2.0 tools, and that in the cases where it did, the results showed the lowest level of satisfaction. We’ve seen similar trends with our clients, as these tools introduce chaos into the environment corporate IT is trying desperately to control and maintain. IT is responsible for keeping the security levels in place, ensuring availability, backup, searchability and integrating these services into the existing infrastructure. Since many of these tools are from open source or startup organizations, IT is justifiably playing the role of the gatekeeper. A successful strategy must marry the business needs and opportunities with the prudence of a supported environment, but in keeping with the agile approach that is inherent in web 2.0, IT must be willing to give up some control. Otherwise web 2.0 initiatives will take too long to implement and will be too restrictive for an organization to embrace. In many cases, this is our role as strategy and technology consultants: to bridge the gap and set a cohesive strategy everyone can agree upon and execute.
This content is the property of Edgewater Technology and is published as part of the edgewater technology blog. Any unauthorized use of this content is prohibited.





