Big Boxes Vs Big Data: Dealing With Legacy Information

The IT industry is all about new, better, best, especially if it’s got a good acronym. Software defined networking? A survey earlier this year found the majority of respondents wanted SDN even though most of them couldn’t define what SDN is. How about Big Data? At the recent Teradata user conference – Unleashing the Power of Data – fewer than 600 public- and private-sector organizations were represented, yet those attendees accounted for the bulk of entities doing anything significant with Big Data.

I remember asking a CIO when they would be moving to a new Microsoft software release and being told that the company’s plans were to deploy the previous release – not the current or new version – within the next six months. It seems the real world moves a lot slower than the IT vendors would like, and a case in point is legacy data. In all the excitement about Big Data, the data deluge of new forms of information, everybody is forgetting about legacy data.

“Companies of all sizes, across all industries, today are buried in the quicksand of their own unusable content and having a hard time staying afloat,” said Mark Gross, founder and CEO of Data Conversion Laboratory (DCL), which digitizes, converts, and reorganizes content. “There’s a lot of content that has been stored away in places difficult to get hold of, but now there’s a way to get access to it.”

The company has been around for 30 years and has seen technologies change six or seven times over that period, he said. It recently announced the Automated Conversion System (ACS) to deliver large volumes of high-quality converted documents and metadata. The technology, which had been in use for months prior to the announcement, transforms documents of varying visual quality into searchable XML that can be stored, searched, and accessed via end-user systems such as content management.

While current OCR tools do a great job on clean content, they get fooled by the non-textual content of complex documents, and accuracy degrades, stated Gross. DCL’s new process overcomes this degradation by automatically extracting the extraneous content and reinserting it later into the converted document.

According to Gartner’s Magic Quadrant for Enterprise Content Management, released at the end of September, the ECM market grew 7.2% last year, to $4.7 billion. It said many enterprises are moving beyond the basic uses of ECM (such as secure file storage in organized libraries) to tackle deeper business requirements.

Gross said the two primary drivers for digitizing information are to make it searchable and to free up the space wasted on storing documents. While contracts and forms need to be retained in some format for GRC purposes, doing so on paper is expensive, with just one box of files costing $375. Multiply that by thousands or tens of thousands of boxes, and the hard costs alone – never mind making use of that information – illustrate the challenge large organizations are facing.
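The arithmetic here is simple but worth making concrete. A quick sketch using the $375-per-box figure cited above (the box counts are hypothetical, chosen only to show the scale):

```python
# Back-of-the-envelope paper-storage hard costs, based on the article's
# figure of $375 to store one box of files. Box counts are illustrative.
COST_PER_BOX = 375  # USD per box of paper files


def storage_cost(boxes: int) -> int:
    """Hard cost of storing a given number of boxes of paper files."""
    return boxes * COST_PER_BOX


for boxes in (1_000, 10_000, 50_000):
    print(f"{boxes:>6} boxes -> ${storage_cost(boxes):,}")
# 10,000 boxes alone comes to $3,750,000 -- before anyone has searched
# or reused a single page of that content.
```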

One of the other issues that ECM and Big Data must deal with is that not all data is created equal. The majority of data (69%) stored by enterprises has no value, but it still costs a bundle in storage resources. According to HP, the amount of wasted data – digital landfill – can exceed 80% in many cases.

In the not-too-distant future, success or failure will be largely determined by organizations’ ability to make smarter decisions faster. “In the face of accelerating business processes and a myriad of distractions, real-time operational intelligence systems are moving from ‘nice to have’ to ‘must have for survival’,” said Rita Sallam, Research VP Analyst, Gartner. “There is growing quantifiable evidence that data-driven decision making enabled by business analytics solutions provides a competitive difference,” said Dan Vesset, Program VP, Business Analytics at IDC.

The Latest Big Data Data

Last week Forbes provided an update on the current state of Big Data, including:

- An IBM survey found that the leaders (the top 19% who identified themselves as substantially outperforming their industry and market) are 166% more likely to make most decisions based on data; cite growth as the key source of value from analytics (75%); measure the impact of analytics investments (80%); and have some form of shared analytics resources (85%).

- Forbes Market Insights found that of the organizations that used big data at least half the time in their marketing campaigns, three in five (60%) said that they had exceeded their goals; of the companies that used big data less than half the time, only one in three could say the same.

- Bain found that those with the most advanced analytics capabilities are outperforming competitors by wide margins, with the leaders twice as likely to be in the top quartile of financial performance within their industries, five times as likely to make decisions much faster than market peers, three times as likely to execute decisions as intended, and twice as likely to use data very frequently when making decisions.

- TEKsystems found that 90% of IT leaders and 84% of IT professionals believe investments of time, money and resources into big data initiatives are worthwhile; only 14% of IT leaders report big data concepts are regularly applied in their organizations; and more than 50% of IT leaders question the validity of their data.

- A Gartner survey on big data investment plans found that 64% of organizations are investing or planning to invest in big data technology, compared with 58% in 2012, yet fewer than 8% of survey respondents have actually deployed.

- A survey by Enterprise Management Associates (EMA) and 9sight Consulting, sponsored by Pentaho, found users moving from pilot implementations to big data production: implementations in production rose from 27% in 2012 to 34.3% this year, and 68% of companies are running two or more big data projects as part of their big data initiatives.


Author: Steve Wexler
