Jun 18, 2018

Big Data Juggernaut III

Anil Vaidya

In my earlier blog I mentioned that Hadoop is synonymous with Big Data. True, many people think that Hadoop is Big Data. That is only partially true; today many other technologies and options fit the same bill. For instance, NoSQL databases can also be counted in the same arena. IoT deployments have been a big contributor to the generation of humongous amounts of data, and ‘Splunk’ holds a top spot in this specialized space. It offers facilities to store data generated by IoT devices and machines. Besides IoT, one wonders how much data is created by social media. NoSQL databases come in very handy here.

I opened by talking about Hadoop. Today many vendors offer solutions that are based on the Hadoop framework but with better facilities, such as faster processing. ‘Spark’ is one such Apache project, and Spark-based products have become very popular with techies. While it is true that the Spark framework differs from the Hadoop framework, its origin can be traced to Hadoop. In the same vein, the famous HDFS is being replaced by object storage in various forms. All leading cloud vendors offer a combination of object storage and Spark in place of HDFS and Hadoop.

I think it is time that techies and others get to know many technologies in an integrated way. Two of these that I will talk about in my next blog are Python and its Spark avatar. Anyone working on Big Data will necessarily have to possess expertise in many technologies to be able to generate value for business. I will offer some thoughts as we continue.
