While the financial services market may be the poster child for Big Data adoption – after Google, largely credited with its creation – MapR Technologies is seeing widespread use across multiple verticals and all sizes of companies, said marketing VP Jack Norris. “Right now it’s a collection of large and small organizations that are using Hadoop… using data sources for competitive advantages.” However, along with big opportunities, Big Data also presents a number of big challenges.
IDC just predicted that big data will continue on its growth path, with investment in technologies and services growing to nearly $10 billion in 2013. But the focus of this investment will see an important shift in 2013, as more VC funding and M&A goes toward the upper half of the big data stack: analytics and discovery tools, and analytic applications.
Nemertes Research said nearly 30% of organizations have initiated big data projects, with another 5.6% expected to do so in the next 18 months. The top driver of Big Data projects is the need to analyze data already on hand, followed by customer-facing initiatives, but these projects create technology challenges around storage and management, as well as organizational challenges.
At the end of October, Gartner said the outlook for big data is creating much excitement, but there is also trouble ahead. “By 2015, 4.4 million IT jobs globally will be created to support big data, generating 1.9 million IT jobs in the United States,” said Peter Sondergaard, senior vice president at Gartner and global head of Research. “But there is a challenge. There is not enough talent in the industry… (and) only one-third of the IT jobs will be filled.”
Security will be another issue, blogged Jon Oltsik, Senior Principal Analyst, Enterprise Strategy Group. ESG considers data to be big once the volume exceeds the capability and boundaries of traditional IT infrastructure. Difficulties include capture, storage, search, sharing, analysis, and visualization. When applied to analytics, big data can also be characterized by the speed with which organizations require data processing, data integration, and data analytics tasks be completed in order to spot business trends, prevent diseases, combat crime, etc.
Forty-four percent of enterprise security professionals believe that security data collection and analysis would be considered “big data” at their organizations today, while another 44% believe that security data collection and analysis will become “big data” at their organizations within the next 24 months.
To be clear, this does not mean that CISOs are actively hiring data scientists, implementing Hadoop, and sending CISSPs out for training on Cassandra, Hive, MapReduce, or Pig. It does indicate, however, that they are collecting massive amounts of data and that existing security analytics tools can no longer keep up. As a result, IT risk continues to increase, a very scary scenario.
MapR, which bills itself as the Hadoop leader (Apache Hadoop is the open-source software framework for data-intensive distributed applications), released an enterprise edition, M7, at the end of October. It provides instant recovery from hardware and software failures, disaster recovery and full data protection with snapshots and mirroring. There are some real limitations to Hadoop, which is a fundamentally flawed architecture, said Norris. “MapR rewrote it (Hadoop’s HBase distributed database, used by 45% of Hadoop users) to offer full data protection, ease of integration, dependability and performance features in a streamlined architecture.”
ESG’s Evan Quinn, Senior Principal Analyst, recently blogged that M7 “transforms the heretofore not-enterprise-class HBase, Hadoop’s default non-relational database, into something far more enterprise-friendly. MapR argues that if you are going to load Hadoop, then why not use it for compute as well, rather than one further step down the big data process post-extraction. In short, how many databases do you really want involved in big data?”
MapR stepped off the ledge and directly addressed commonly understood weaknesses in the Hadoop platform. ESG believes MapR M7 will entice customers and partners in the embedded and industry-specific analytics big data arenas.
The bottom line is that Big Data is big, and getting bigger. “In 2011, big data formed a new driver in almost every category of IT spending,” said Mark Beyer, research vice president at Gartner. “However, through 2018, big data requirements will gradually evolve from differentiation to ‘table stakes’ in information management practices and technology. By 2020, big data features and functionality will be non-differentiating and routinely expected from traditional enterprise vendors and part of their product offerings.”