Are you drowning in your Data Lake?

Today more than ever, every business is focused on collecting data and applying analytics to stay competitive. Big Data Analytics has passed the hype stage and become an essential part of business plans.

Data Lake is the latest buzzword for dumping every element of data you can find, internally or externally. If you Google the term "data lake", you will get more than 14 million results. With the arrival of Hadoop, everyone wants to empty their silos of data warehouses and data marts into a data lake.

The idea behind a data lake is to have one central platform to store and analyze every kind of data relevant to the enterprise. With digital transformation, the volume of data generated every day has multiplied several times over, and businesses are collecting consumer data, Internet of Things data and other data for further analysis.

As storage has become cheaper, more data is being stored in its raw format in the hope of finding nuggets of information, but as the volume grows, finding them becomes difficult. It is like using your smartphone to click photographs left, right and center: when you want to show one specific photograph to someone, it is very hard to find.

Data lakes, if not maintained properly, can grow aimlessly and consume the entire budget. Some companies have data lakes overflowing from on-premises systems into the cloud.

Most data lakes lack governance and the tools and skills to handle large volumes of disparate data, and many lack a compelling business case. But the water (the data) in your data lake has to be crystal clear and drinkable, or else the lake will become a swamp.

Before jumping on the bandwagon of creating a data lake that may cost thousands of dollars and take months to implement, you should ask these questions.

  • What data do we want to store in the data lake?
  • How much data will be stored?
  • How will we access these massive amounts of data and get value from them easily?

Here are some guidelines to avoid drowning in your data lake.

  • First and foremost, create one or more business use cases that lay out exactly what will be done with the data that gets collected. That exercise keeps you from dumping data that serves no purpose.
  • Determine the returns you want to get out of the data lake. Developing a data lake is not a casual undertaking, so it needs to deliver clear business benefits.
  • Make sure your overall big data and analytics initiatives are designed to exploit the data lake fully and help achieve business goals.
  • Instead of falling into vendor traps and their buzzwords, focus on your needs and determine the best way to meet them.
  • Deliver the data to a wide audience so they can check it and come back with feedback while creating value.
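
One way to put the first two guidelines into practice is to make the business use case a condition for ingestion. The sketch below is only an illustration under stated assumptions: `DatasetEntry`, `LakeCatalog` and the storage budget in gigabytes are hypothetical names invented for this example, not part of any data lake product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetEntry:
    """Hypothetical catalog record for one dataset headed for the lake."""
    name: str
    owner: str
    use_cases: List[str]  # business use cases this dataset is meant to serve
    size_gb: float


@dataclass
class LakeCatalog:
    """Hypothetical gatekeeper: nothing is ingested without a use case."""
    budget_gb: float
    entries: Dict[str, DatasetEntry] = field(default_factory=dict)

    def used_gb(self) -> float:
        # Total size of everything registered so far.
        return sum(e.size_gb for e in self.entries.values())

    def register(self, entry: DatasetEntry) -> None:
        # Reject data that nobody has a stated plan for.
        if not entry.use_cases:
            raise ValueError(f"{entry.name}: no business use case, not ingesting")
        # Reject data that would push the lake past its agreed budget.
        if self.used_gb() + entry.size_gb > self.budget_gb:
            raise ValueError(f"{entry.name}: would exceed the storage budget")
        self.entries[entry.name] = entry


catalog = LakeCatalog(budget_gb=100)
catalog.register(DatasetEntry("clickstream", "marketing", ["churn prediction"], 60))
```

With a check like this in place, a dataset with an empty `use_cases` list, or one that would overflow the budget, is turned away with an error instead of quietly adding to the swamp.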

Many cloud vendors can help you build a data lake, such as Microsoft Azure and Amazon Web Services (Amazon S3).

Because they make data available to data scientists and anyone else who needs it, for as long as they need it, data lakes are a powerful lever for innovation and disruption across industries.

Sandeep Raut

  • 7th Rank in Global Top 100 Digital Transformation Influencers
  • Delivered speech at India Analytics & Big Data Summit at Bangalore on "How Machine Learning is helping in Digital Transformation" on 4th Feb 2016
  • Delivered Thought Leadership speech at Unicom - India Analytics & Big Data Summit on "Big Data Analytics disrupting industry"
  • Delivered speech at IIT Mumbai on "Analysing Big data for disruptive innovation"
  • Delivered a keynote speech at Rizvi College of Engineering on "Fraud Detection & Prevention using Analytics"
  • Director for Digital Transformation in Syntel.
  • Has more than 29 years of IT Services / Consulting / Off-shoring experience
  • Over 18 years in Business Intelligence space.
  • Had helped organizations in establishing the BI-Analytics Services CoEs.
  • Had spearheaded several marquee accounts and was significantly instrumental in building new business for the practice as well.
  • Had successfully initiated, mentored & deployed various strategic consulting services & solutions like Digital Transformation, BI Strategy Planning, BI Offshorization, BI Development/Deployment, Campaign Management, Inventory Optimization which resulted into multi-million dollar business.
  • Had developed & managed Customer relations with Global players across USA, UK & Asia Pacific.

Specialties: Digital Transformation, BI & Big Data Analytics Banking and Financial Services, Healthcare LifeSciences, Insurance, Retail Manufacturing - Supply Chain Management
