Cleversafe Uses Commodity Hardware To Help Tame Big Data
Having somehow gone from the worst category name ever to the future of the new information economy, ‘Big Data’ has picked up considerable market momentum, expected to account for $32 billion of IT spending in 2013 and $232 billion through 2016, according to Gartner. By 2018, big data requirements will evolve from differentiation to ‘table stakes’ in information management practices and technology, and Cleversafe, a CIA-backed object storage vendor, wants a seat at the table with the big players.
Chicago-based Cleversafe is tackling petabyte-and-beyond big data storage with a solution that it says drives up to 90% of the storage cost out of the business. “We are talking to customers in the 10s of petabytes growing to 100s of petabytes,” said Russ Kennedy, VP of product strategy and customer solutions.
In August the company announced the Intel Xeon-based 3000 series of storage appliances, which will deliver exabyte-scale throughput for ingesting and storing data. Expected to ship shortly, the object-based Dispersed Storage systems can capture data at 1 terabyte per second at exabyte capacity. Finally being able to achieve that scale is a serious shift in business for any company, said Kennedy.
A month earlier Cleversafe announced plans to build the first Dispersed Compute Storage solution by combining the power of Hadoop MapReduce with its Dispersed Storage System. The company said that running MapReduce on its Dispersed Storage Network (dsNet) system on the same platform, and replacing the Hadoop Distributed File System (HDFS), which relies on three copies to protect data, will significantly improve reliability and allow analytics at a scale previously unattainable through traditional HDFS configurations.
“There isn’t an industry today that’s untouched by Big Data or a company that wouldn’t benefit from the intrinsic value of that data if they could collect, organize, store and analyze it in a cost-effective manner,” said John Webster, Senior Partner at Evaluator Group, in a prepared statement. “Cleversafe’s approach to combining dispersed storage and Hadoop for analytics is a groundbreaking step for the industry and for any company to effectively bridge storage and large-scale computation.”
By combining distributed storage with computation, the new announcement should interest the fast-growing pool of Hadoop users, wrote Taneja Group analyst Christine Taylor. Hadoop enables deep business analytics on large volumes of semi-structured and unstructured data dispersed across multiple servers, but it provides data protection by replicating three copies of the data in case a server fails. With data reaching petabyte and exabyte levels, this becomes a very cumbersome process. There is also the issue that HDFS uses one server for metadata operations, risking a single point of failure.
With Cleversafe there is no single point of failure on the data or metadata side and no need to replicate three copies of the Hadoop data. While this is not the cheapest way to go, Taylor said, the additional cost will be partially offset by the savings users will accrue by eliminating HDFS’s 3x replication, with all of its associated storage, space and power requirements.
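The savings argument comes down to simple arithmetic: triple replication stores every byte three times, while a dispersal (erasure-coding) scheme splits data into n slices of which any k suffice to rebuild it, for an expansion factor of n/k. A minimal sketch of that comparison is below; the 10-of-16 parameters are an illustrative assumption, not Cleversafe’s actual configuration.

```python
# Rough storage-overhead comparison: HDFS-style 3x replication vs. a
# k-of-n information-dispersal (erasure-coding) scheme.
# Note: the 10-of-16 dispersal parameters are an assumed example,
# not a documented Cleversafe configuration.

def replication_raw_tb(data_tb: float, copies: int = 3) -> float:
    """Raw capacity needed when every byte is stored `copies` times."""
    return data_tb * copies

def dispersal_raw_tb(data_tb: float, k: int = 10, n: int = 16) -> float:
    """Raw capacity when data is cut into n slices, any k of which
    can rebuild it -- an expansion factor of n/k."""
    return data_tb * n / k

data = 1000.0  # a 1 PB (1,000 TB) data set
rep = replication_raw_tb(data)  # 3,000 TB of raw disk
ida = dispersal_raw_tb(data)    # 1,600 TB of raw disk
print(f"3x replication:     {rep:.0f} TB raw")
print(f"10-of-16 dispersal: {ida:.0f} TB raw")
print(f"raw-capacity savings: {100 * (1 - ida / rep):.0f}%")
```

Under these assumed parameters, dispersal needs roughly half the raw disk of triple replication while still tolerating the loss of several slices, which is the kind of offset Taylor describes.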
“And with Cleversafe, you’ll have a more scalable and reliable solution to your big data needs going forward. Certainly this is the way a large part of the enterprise market will be going, thanks to relentless data growth and businesses’ need to make sense of it all so they can increase profitability and competitiveness.”
Cleversafe’s future does look bright, said Kennedy. “The growth of data is a megatrend, and more information is being generated than can be stored.”
Data is doubling every two years; over the next decade enterprises will manage 50x more data and files will grow 75x, even as enterprise storage system expenditures grow less than 4% per year for the next few years. Big data is expected to account for more than half of the world’s data in the next five years, according to a study from Internet Research Group and Infineta Systems. And according to a Deloitte survey, more than 90% of Fortune 500 companies will have a big data initiative under way by the end of 2012.
And budget constraints are the biggest Big Data challenge. “When you cross the petabyte threshold, the cost of storing data, even on a public cloud, is pretty expensive,” said Kennedy. “So we’re seeing customers looking at bringing data back in house… back under their control.”