Analytics | Cloud Computing | Commentary | IT | Open Source

A Two-Stage Approach to Financial Return for Data Lakes

paulbarsch
Last updated: July 11, 2016 12:18 pm
6 Min Read

Depending on whether you build your data lake in the cloud or on premises, it could take anywhere from six months to a year, and in some cases longer, to be fully deployed. In our “want it now” culture, that timeframe might be too long for some CIOs. But companies that stay faithful to creating and using a big data platform should see not only breakeven but a real return as well.

Plainly speaking, a data lake done right takes time. Laying the foundation for analytics, transforming data so it is usable, securing and protecting it, and managing it properly are all gradual tasks. A real data lake and its associated applications don’t happen overnight, or at least shouldn’t.

What is a data lake? To avoid too much sales-speak, here’s a working definition: a data lake is a group of data stores where you can capture, manage, and analyze massive amounts of raw data. It’s a home for messy data such as web logs, social data, text files, PDFs, and more. You can also keep non-messy data in a data lake, the kind that traditionally belongs in a relational table. Data lakes are mostly built on open source Hadoop; however, other platforms such as NoSQL databases or cloud object storage (e.g., Amazon’s S3) work just as well.
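
To make the “capture raw data as-is” idea concrete, here is a minimal Python sketch of landing messy files in cloud object storage. It is illustrative only: the bucket name, prefix layout, and file paths are hypothetical, and it assumes boto3 is installed and AWS credentials are already configured.

```python
# A minimal sketch of landing raw, "messy" files in an S3-backed data lake.
# Bucket name, prefix layout, and file paths are hypothetical.
from datetime import date

import boto3

s3 = boto3.client("s3")
bucket = "acme-data-lake"  # hypothetical bucket

# Raw files are stored as-is, partitioned by load date, so they can be
# transformed, secured, and analyzed later.
raw_files = ["web_logs/access.log", "social/tweets.json", "docs/report.pdf"]
prefix = f"raw/{date.today():%Y/%m/%d}"

for path in raw_files:
    s3.upload_file(path, bucket, f"{prefix}/{path}")
```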

Through a financial lens, a two-stage approach is a good way of looking at a data lake project. In stage 1, most of your time will be spent identifying use cases, examining current data sets and data sources, assessing your current technologies, and determining what new systems you’ll need that can scale as your business grows. You’ll also want to build a plan for data quality, data management, security, and metadata management throughout your “data pipelines.”
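
For the data quality and metadata planning mentioned above, a lightweight profiling step per incoming file is one common starting point. The sketch below is a rough illustration, assuming CSV inputs; the required columns and the metadata fields are invented for the example, not a prescribed schema.

```python
# A minimal sketch of a stage 1 data-quality check that also produces
# basic metadata to store alongside the file in the lake.
import csv
from datetime import datetime, timezone

REQUIRED_COLUMNS = ["order_id", "customer_id", "amount"]  # hypothetical schema


def profile_csv(path: str) -> dict:
    """Count rows and flag records missing required fields."""
    total = bad = 0
    with open(path, newline="") as f:
        for record in csv.DictReader(f):
            total += 1
            if any(not record.get(col) for col in REQUIRED_COLUMNS):
                bad += 1
    return {
        "source_file": path,
        "row_count": total,
        "missing_required_pct": round(100 * bad / max(total, 1), 2),
        "profiled_at": datetime.now(timezone.utc).isoformat(),
    }
```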

Stage 1 will continue as you either go straight to the cloud or start jumping through the fiery hoops of capital allocation requests for hardware with your CFO. These efforts could add anywhere from one to three months to your project (maybe more). And of course you’ll want to run proofs of concept and then pilot various use cases. All in all, stage 1, which is preparing for comprehensive analytics, is an exercise in delayed gratification.

In stage 1 of a data lake build, your incoming revenues will mostly be tiny, and your initial costs could be substantial. Initial cash outlays include hardware, data management software, and engineering costs to build out needed functionality. Don’t forget incremental utility, overhead, and supply costs for the data lake project. Consider stage 1 the “investment” part of your data lake effort while you get operational.

Now we get to stage 2 of your data lake project. In this round you’ll be in full production and doing “magical things” such as enabling Apache Spark and its various algorithmic libraries, turning your data scientists loose on raw data sets, and possibly building BI-like reports for business analysts from data in the lake. You might also realize cost savings from consolidating various data facilities in your organization and possibly offloading ETL processes to Hadoop.
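
As a rough illustration of that stage 2 work, here is a minimal PySpark sketch that reads raw web logs landed earlier and writes a BI-style summary back to the lake. The storage paths, column names, and event values are hypothetical, and it assumes a Spark environment already configured to read from the lake’s storage.

```python
# A minimal stage 2 sketch: curate raw lake data into a BI-ready table.
# Paths, columns, and event names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("stage2-reporting").getOrCreate()

# Raw, semi-structured web logs landed during stage 1.
logs = spark.read.json("s3a://acme-data-lake/raw/web_logs/")

# A simple daily conversion report for business analysts.
report = (
    logs.withColumn("day", F.to_date("timestamp"))
        .groupBy("day")
        .agg(
            F.count("*").alias("page_views"),
            F.sum(F.when(F.col("event") == "purchase", 1).otherwise(0)).alias("purchases"),
        )
)

# Write the curated result back to the lake for BI tools to query.
report.write.mode("overwrite").parquet("s3a://acme-data-lake/curated/daily_conversions/")
```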

In truth, your data lake project, no matter how many stages, should be viewed in totality. Stage 1, or whatever you choose to call it, will likely not show stand-alone profitability from an NPV perspective, and it’s hard to get to stage 2 without doing stage 1. Combining stages 1 and 2 should give you an “accept” from a capital budgeting viewpoint.
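
As a back-of-the-envelope illustration of that capital budgeting point, the sketch below compares stage 1 on its own with the combined two-stage project. Every cash flow and the discount rate are invented for the example.

```python
# A minimal NPV sketch: stage 1 alone is negative, but the combined
# project clears the hurdle. All figures are hypothetical.
def npv(rate: float, cash_flows: list[float]) -> float:
    """Discount yearly cash flows, with year 0 first."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

rate = 0.10
stage1 = [-500_000, -200_000]                   # build-out years: mostly outlays
stage2 = [0, 0, 300_000, 400_000, 450_000]      # returns arrive in later years
combined = [a + b for a, b in zip(stage1 + [0] * 3, stage2)]

print(f"Stage 1 alone:  {npv(rate, stage1):,.0f}")    # negative NPV
print(f"Stages 1 and 2: {npv(rate, combined):,.0f}")  # positive NPV -> "accept"
```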

There are a few cautions. The above analysis isn’t black and white; there could be instances where you start earning more than you are investing during the first stage of your data lake project. In most cases, though, it’s good to remain cognizant that you might not earn a financial return (on a discounted basis) until you’re well down the road of your data lake initiative. And that may be OK with your leadership team, as long as expectations for financial return are adjusted accordingly.

Read the other articles in this series:

  • NPV Considerations for Open Source Big Data Projects
  • Big Data ROI? Not Likely in Year 1