EMC’s Making Waves In The Data Lake

EMC has been floating its Data Lake Foundation concept for the better part of a year, but today the company is launching a flotilla of new products and solutions intended to eliminate silos, simplify management, improve utilization, scale massively, support existing and emerging standards, stay secure, and deliver in-place analytics with faster time to results. Among the goodies: a 2.5X increase in capacity with the HD400 platform, up to 50PB within a single cluster, whose density (3.2PB per rack) should help cut operational expenses, including power, cooling and datacenter floor space, by 50%; the OneFS 7.2 operating system, which supports more current protocol versions, including HDFS 2.3 and 2.4; and support for both file and object storage.

Depending upon the organization and the industry, data is growing at least 50% per year, and this represents “both a problem and an opportunity,” said Suresh Sathyamurthy, Sr. Director, Product Marketing for EMC’s Emerging Technologies Division. The data lake concept offers a way to address both, he said.

There are two major takeaways from today’s announcements, said Sathyamurthy: scalability, and software updates that keep EMC current with the ecosystem. And while they provide additional capabilities for the thousands of customers performing analytics on the Isilon platform, the announcements also open the door to a lot more prospects, especially service providers. “Even with the volume of customers we have today… we believe we are the market share leader… we’re just scratching the surface.” EMC is looking at 3-4x growth next year, he said.

Nick Kirsch, EMC’s VP & Chief Technology Officer, ETD, recently noted that data lakes are here to stay. He quoted IDC, which said “data lakes should be a part of every workflow in the enterprise.”
He divided the data lake market into two segments: the first involves using intelligent software-defined storage resource management to efficiently store petabytes of data and make it available with multiprotocol access; the second is a hyper-converged data lake, complete with apps, compute resources, and networks, delivered as an integrated appliance. In both cases, the decision comes down to the unique challenges businesses face in delivering performance, managing growth, and gaining insights from their data.
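The multiprotocol point is easiest to see in practice. The sketch below is ours, not EMC’s tooling: it assumes a cluster whose namespace is NFS-mounted at a hypothetical /mnt/datalake and also exposed over HDFS at a hypothetical host isilon.example.com, writes a record through the file interface, then reads the same path back through the Hadoop interface with the pyarrow library, with no copy or ETL step in between.

# Minimal multiprotocol sketch: one namespace, two access protocols.
# Hypothetical names throughout: /mnt/datalake (NFS mount), isilon.example.com (HDFS endpoint).
# Requires a Hadoop client (libhdfs) alongside pyarrow.
from pyarrow import fs

NFS_MOUNT = "/mnt/datalake"        # POSIX/NFS view of the cluster namespace
HDFS_HOST = "isilon.example.com"   # HDFS view of the same namespace
HDFS_PORT = 8020                   # default HDFS RPC port

# 1) Ingest a record through the file (NFS) interface with ordinary file I/O.
with open(f"{NFS_MOUNT}/ingest/events.csv", "w") as out:
    out.write("timestamp,sensor,value\n2015-02-24T00:00:00Z,s1,42\n")

# 2) Read the same data back through the HDFS interface for analytics --
#    no data movement in between, which is the point of a multiprotocol data lake.
hdfs = fs.HadoopFileSystem(HDFS_HOST, port=HDFS_PORT)
with hdfs.open_input_stream("/ingest/events.csv") as stream:
    print(stream.read().decode())

Whether that namespace lives on dedicated scale-out NAS or inside a hyper-converged appliance, the workflow is the same; only the packaging differs.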

According to a recent blog from ETD’s David Noy, VP Product Management, EMC received recognition for its Isilon Scale-Out NAS family of products in the Gartner report Critical Capabilities for Scale-Out File System Storage. In the report, which was created to identify the contenders in the scale-out file storage industry, Isilon rated highest in three of five use cases: Overall (4.17 out of 5), Commercial HPC (4.2) and Archiving (4.25), highlighting the value of scalability and performance for enterprise workloads.

Noy wrote that the world of scale-out storage is evolving rapidly, and new architectures, such as object storage, will emerge as an alternative to on-premises deployments thanks to the promise of low entry costs, rapid scalability and a growing ecosystem of software and service provider offerings. “Customers should be looking to scale-out data lake foundations that can not only support today’s enterprise file-based or NAS workloads but can also provide the bridge to next-generation cloud deployments and workflows. The best scale-out NAS solutions will bring together the ability to organize massive amounts of data as well as provide insight to make that data actionable via modern analytics frameworks.”

Last week Gartner published its inaugural Magic Quadrant for Data Warehouse and Data Management Solutions for Analytics, in which EMC’s Pivotal was placed in the Visionaries quadrant. Leaders were Oracle, Teradata, IBM, Microsoft, SAP and HP; challengers included Amazon Web Services, Cloudera and MapR Technologies.

According to the report, organizations require solutions capable of managing and processing external data in combination with their traditional internal sources, and may even include data from the Internet of Things. This is creating new demands on the data warehouse market — for broader data management solutions for analytics, with features and functionality that represent a significant augmentation to existing enterprise data warehouse strategies.

Accessing and analyzing the global pool of information will change everything, said Gartner’s Doug Laney. “By 2020, information will be used to reinvent, digitalize or eliminate 80% of business processes and products from a decade earlier.”

There is no time to wait, he wrote. Business and IT leaders must make concerted efforts to shift from an inward focus on information management and value generation to participation in the growing global pool of information assets.

The Big Data technology and services market is exploding, according to IDC. The segment will grow at a 26.4% compound annual growth rate to $41.5 billion through 2018, about six times the growth rate of the overall information technology market. Additionally, IDC believes that by 2020 line-of-business buyers will help drive analytics beyond its historical sweet spot of relational (performance management) workloads to the double-digit growth rates of real-time intelligence and exploration/discovery of the unstructured worlds.
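For context, a quick back-of-the-envelope check of those numbers, assuming (the article does not say) that the 26.4% rate compounds over the five years from 2013 to 2018, backs out the implied starting market size:

# Sanity-check the IDC forecast: what base grows to $41.5B by 2018 at a 26.4% CAGR?
# Assumption (not stated in the article): the rate compounds over 2013-2018, i.e. five years.
cagr = 0.264
target_2018 = 41.5            # $ billions
years = 5

implied_base = target_2018 / (1 + cagr) ** years
print(f"Implied 2013 market size: ${implied_base:.1f}B")   # prints roughly $12.9B

At that rate the segment roughly doubles every three years, which is what puts its growth at about six times that of the overall IT market.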

Author: Steve Wexler
