Big Data Principles and Practices

About the Course

Name: Big Data Principles and Practices

Code: DIGTR-112

Sectors: Digital Transformation and Innovation

Date                  Days  Venue                  Fees
19 - 23 Feb 2022      5     Live Online Classroom  $1,550
05 - 09 Jun 2022      5     Dubai, UAE             $3,950
28 Aug - 01 Sep 2022  5     Live Online Classroom  $1,550
19 - 23 Nov 2022      5     Live Online Classroom  $1,550


The concept of big data has been around for years; most organizations now understand that if they capture all the data that streams into their businesses, they can apply analytics and get significant value from it. Big data analytics helps organizations harness their data and use it to identify new opportunities. That, in turn, leads to smarter business moves, more efficient operations, higher profits and happier customers.

This training course is designed to provide participants with the key concepts and knowledge of big data – the landscape, the technology behind it, business drivers, and strategic possibilities. “Big data” is a hot buzzword, but most organizations struggle to put it to practical use. Without assuming any prior knowledge of Apache Hadoop or big data management, this course teaches participants how to put big data to work and realize its benefits.


After completing this course, participants will be able to:

  • Gain a thorough understanding of Big Data technologies, their benefits, and the value that they can deliver to industries, companies, and functions
  • Analyze, process, and extract information from extremely complex and large data sets
  • Build an organization-wide Big Data program
  • Develop the maturity of Big Data within their organization
  • Apply a variety of use cases to drive ideation
  • Act lean and agile in pursuit of Big Data objectives

Training Methodology

This training course is designed to be highly interactive and participatory. To ensure maximum comprehension and retention, this training will utilize a variety of proven virtual learning methods such as break-out sessions for group discussions and brainstorming, virtual icebreakers, recorded videos, case studies, and readings.


Day 01

  • The concepts
      • Load data how you find it
      • Process it when you can
      • Project it into various schemas on the fly
      • Push it back to where you need it
  • The basics
      • What it’s good for
      • What it can’t do / disadvantages
      • Most common use cases for big data
  • Value Creation with Big Data
  • Introduction to Big Data technologies
  • Trends in Big Data
  • Big Data applications, use cases and best practices across industries and functions
  • Data sourcing strategies and challenges
  • Ideation phase: creating first successes
  • From ideation to Proof-of-Concept and minimum viable product

Day 02

  • Managing Big Data Transformation
      • Big Data Maturity model
      • Developing a Big Data roadmap
      • What does good look like: determining your Big Data end game
      • Orchestrating Big Data maturity across data, technology and people
  • Hadoop – the free platform for working with big data
      • History
      • Yahoo
      • Platform fragmentation
      • What usage looks like in the enterprise
  • Introduction to HDFS
      • Robustness
      • Data Replication
      • Gotchas

Day 03

  • MapReduce – the core big data function
      • Map explained
      • Sort and shuffle explained
      • Reduce explained
  • YARN
      • How it fits
      • How it works
      • Resource Manager
      • Application Master
  • Pig
      • What it is
      • How it works
      • Compatibilities
      • Advantages
      • Disadvantages
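
As a taste of the MapReduce session above, the three phases it names (map, sort and shuffle, reduce) can be sketched in a few lines of plain Python. This is only an illustrative, single-machine sketch and not Hadoop code: the function names (map_phase, shuffle_and_sort, reduce_phase, word_count) and the sample sentences are assumptions made for this example, and a real Hadoop job distributes each phase across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    """Sort and shuffle: group all values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle_and_sort(mapped))


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["big data big value", "data at scale"]
    print(word_count(sample))
```

Word counting is used here because it is the customary first MapReduce exercise: the map step needs no knowledge of other lines, and the reduce step needs only the values for a single key, which is what allows both steps to run in parallel on a cluster.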

Day 04

  • Processing Data
      • The Piggy Bank
      • Loading and Illustrating the data
      • Writing a Query
      • Storing the Result
  • Hive
      • Data warehousing
      • What it is, what it’s not
      • Language compatibilities
      • Advantages
  • What it is
  • Complex workflow environments
  • Reducing time-to-market
  • Frequency execution
  • How it works with other big data tools

Day 05

  • Flume – stream, collect, store and analyze high-volume log data
      • How it works: event, source, sink, channel, agent, and client
      • How it works illustrated
      • How it works demonstrated
  • Move over 2012 Big Data tools: Apache Spark is the new power tool
      • The new open-source cluster framework
      • When Spark performs 100 times faster
      • Performance comparison of Spark and Hadoop
      • What else can it do?
  • Big Data Leadership
      • Lean/agile working in support of Big Data transformation
      • Key success factors for adoption of Big Data at speed
      • Required skills & competencies for successful digital transformation
      • Understand the mindset of digital disruptors
