Give us a Call: M: (+971) 55 253 7914

Live Online Training Schedules

Now participate in a live online course for a highly discounted fee of only $850, as a limited introductory offer.
(Terms and conditions apply)
Code   Date                    Format        Fees
DM04   25 Jan - 29 Jan, 2021   Live Online   $1800
DM04   26 Apr - 28 Apr, 2021   Live Online   $1300
DM04   26 Jul - 30 Jul, 2021   Live Online   $1800
DM04   25 Oct - 27 Oct, 2021   Live Online   $1300

Classroom Training Schedules

Code   Date                    Venue       Fees
DM04   18 Apr - 22 Apr, 2021   Abu Dhabi   $4250
DM04   04 Oct - 08 Oct, 2021   London      $4750

Did you know you can also choose your own preferred dates & location? Customise Schedule

Course Overview

Analysing large datasets requires a correspondingly large cluster of computers. Using that many machines effectively depends on distributed file systems, such as the Hadoop Distributed File System (HDFS), and on parallel computational models, such as Hadoop MapReduce and Spark.

In this Big Data Analytics with Spark Training Course, you will learn where the bottlenecks lie in massively parallel computation projects, and how to use Spark to minimise them.

This Big Data Analytics with Spark Training Course will teach you how to conduct supervised and unsupervised machine learning on substantial datasets using the Machine Learning Library (MLlib), and you will gain hands-on experience using PySpark.

What skills are covered in this Big Data Spark training course? This program will provide you with knowledge and expertise in Scala programming, Spark installation, Resilient Distributed Datasets (RDD), SparkSQL, Spark Streaming, Spark ML Programming, and GraphX programming.

This Zoe training course will empower you with crucial, in-demand Apache Spark skills and guide you to build a competitive advantage for an exciting career as a Hadoop developer.

Course Objectives

Upon completing this Big Data Analytics with Spark Training Course successfully, participants will be able to:

  • Obtain an overview of Big Data & Hadoop including HDFS and YARN (Yet Another Resource Negotiator)
  • Gain comprehensive knowledge of various tools that fall in the Spark ecosystem
  • Understand how to ingest data in HDFS using Sqoop & Flume
  • Program Spark using PySpark
  • Identify the computational trade-offs in a Spark application
  • Model data through statistical and machine learning methods
  • Handle real-time data feeds through a publish-subscribe messaging system such as Kafka
  • Gain exposure to many real-life industry-based projects
  • Study projects from diverse domains such as banking, telecommunications, social media, and government

Training Methodology

This is an interactive Big Data Analytics with Spark Training program and will consist of the following training approaches:

  • Lectures
  • Seminars & Presentations
  • Group Discussions
  • Assignments
  • Case Studies & Functional Exercises

Similar to all our courses, this program also follows the ‘Do-Review-Learn-Apply’ model.

Organisational Benefits

Companies that send their employees to participate in this Big Data Analytics with Spark Training Course can benefit in the following ways:

  • Adopt technology that is being used successfully by companies across many domains around the globe
  • Attract more investors towards your business – Forbes reports that 56% of enterprises will increase their investment in big data over the next three years
  • Provide your workforce with flexible and cost-effective professional development opportunities
  • Analyse case studies in this domain and be able to apply successful techniques in your organisation
  • Comprehend the principles and practice of Big Data Analytics and the context in which this operates

Personal Benefits

Professionals who participate in this Big Data Analytics with Spark Training Course can benefit in the following ways:

  • Obtain strong hands-on experience in various industry-based use cases and projects that incorporate Big Data and Spark tools as part of the solution strategy
  • Have your questions answered by industry professionals who have experience working on real-life Big Data and analytics projects
  • Develop your skills to increase your professional demand – McKinsey predicts that by 2020 there will be a shortage of data experts
  • Advance your career in the field of Big Data & Analytics with our Big Data Analytics with Spark Training Course

Who Should Attend?

This Big Data Analytics with Spark Training Course would be suitable for:

  • Developers and Architects
  • BI/ETL/DW Professionals
  • Senior IT Professionals
  • Testing Professionals
  • Mainframe Professionals
  • Freshers
  • Big Data Enthusiasts
  • Software Architects, Engineers and Developers
  • Data Scientists and Analytics Professionals

Course Outline

MODULE 1: INTRODUCTION TO BIG DATA HADOOP AND SPARK

  • What is Big Data?
  • Big Data Customer Scenarios
  • Big Data and Hadoop
  • How Does Hadoop Solve the Big Data Problem?
  • What is Hadoop?
  • Hadoop’s Key Characteristics
  • Hadoop Ecosystem and HDFS
  • Hadoop Core Components
  • Rack Awareness and Block Replication
  • YARN and its Advantage
  • Hadoop Cluster and its Architecture
  • Hadoop: Different Cluster Modes
  • Why is Spark needed?
  • What is Spark?
  • How does Spark differ from other frameworks?
  • Spark at Yahoo!

MODULE 2: INTRODUCTION TO SCALA FOR APACHE SPARK

  • What is Scala?
  • Why Scala for Spark?
  • Scala in other Frameworks
  • Control Structures in Scala
  • Foreach loop, Functions and Procedures
  • Collections in Scala – Array
  • Introduction to Scala REPL
  • Basic Scala Operations
  • Variable Types in Scala
  • ArrayBuffer, Map, Tuples, Lists, and more
  • Scala REPL Detailed Demo

MODULE 3: FUNCTIONAL PROGRAMMING AND OOPS CONCEPTS IN SCALA

  • Auxiliary Constructor and Primary Constructor
  • Singletons
  • Extending a Class
  • Overriding Methods
  • Traits as Interfaces and Layered Traits
  • OOPs Concepts
  • Functional Programming
  • Higher-Order Functions
  • Anonymous Functions
  • Class in Scala
  • Getters and Setters
  • Custom Getters and Setters
  • Properties with only Getters

MODULE 4: DEEP DIVE INTO APACHE SPARK FRAMEWORK

  • Submitting Spark Job
  • Spark Web UI
  • Data Ingestion using Sqoop
  • Building and Running Spark Application
  • Spark Application Web UI
  • Spark’s Place in the Hadoop Ecosystem
  • Spark Components & its Architecture
  • Spark Deployment Modes
  • Introduction to Spark Shell
  • Writing your first Spark Job Using SBT
  • Configuring Spark Properties

MODULE 5: PLAYING WITH SPARK RDDS

  • RDD Persistence
  • WordCount Program Using RDD Concepts
  • Passing Functions to Spark
  • Loading data in RDDs
  • Saving data through RDDs
  • RDD Transformations
  • Challenges in Existing Computing Methods
  • Probable Solution & How RDD Solves the Problem
  • What is RDD, Its Operations, Transformations & Actions
  • Data Loading and Saving Through RDDs
  • Key-Value Pair RDDs
  • Other Pair RDDs, Two Pair RDDs
  • RDD Lineage
  • RDD Actions and Functions
  • RDD Partitions
  • WordCount through RDDs
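
The WordCount exercise in this module can be previewed without a cluster. The sketch below is plain Python, not Spark code: it only mimics the flatMap, map and reduceByKey steps that a Spark RDD word count typically chains together. The function name word_count and the sample lines are illustrative assumptions and are not taken from the course material.

```python
from collections import defaultdict
from functools import reduce


def word_count(lines):
    """Count words with a flatMap -> map -> reduceByKey style pipeline.

    Plain-Python illustration only; no Spark API is used here.
    """
    # "flatMap" step: split every line into words and flatten into one list
    words = [w.lower() for line in lines for w in line.split()]
    # "map" step: pair each word with a count of 1
    pairs = [(w, 1) for w in words]
    # "reduceByKey" step: group the pairs by word, then sum each group
    grouped = defaultdict(list)
    for word, n in pairs:
        grouped[word].append(n)
    return {word: reduce(lambda a, b: a + b, ns) for word, ns in grouped.items()}


# Hypothetical sample input
sample = ["spark makes big data simple", "big data needs big clusters"]
counts = word_count(sample)
print(counts)
```

In Spark itself the same three steps would run in parallel across the partitions of an RDD; the sketch keeps everything in one process so the logic is easy to follow.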

MODULE 6: DATAFRAMES AND SPARK SQL

  • Need for Spark SQL
  • What is Spark SQL?
  • Spark SQL Architecture
  • Spark – Hive Integration
  • Spark SQL – Creating Data Frames
  • Loading and Transforming Data through Different Sources
  • Stock Market Analysis
  • SQL Context in Spark SQL
  • User-Defined Functions
  • Data Frames & Datasets
  • Interoperating with RDDs
  • JSON and Parquet File Formats
  • Loading Data through Different Sources

MODULE 7: MACHINE LEARNING USING SPARK MLLIB

  • Why Machine Learning?
  • What is Machine Learning?
  • Where is Machine Learning Used?
  • Face Detection: USE CASE
  • Different Types of Machine Learning Techniques
  • Introduction to MLlib
  • Features of MLlib and MLlib Tools
  • Various ML algorithms supported by MLlib

MODULE 8: DEEP DIVE INTO SPARK MLLIB

  • K-Means Clustering
  • Linear Regression
  • Logistic Regression
  • Decision Tree
  • Random Forest
  • Machine Learning MLlib

MODULE 9: UNDERSTANDING APACHE KAFKA AND APACHE FLUME

  • What is Apache Flume?
  • Need for Apache Flume
  • Basic Flume Architecture
  • Flume Sources
  • Flume Sinks
  • Flume Channels
  • Flume Configuration
  • Need for Kafka
  • What is Kafka?
  • Core Concepts of Kafka
  • Kafka Architecture
  • Where is Kafka Used?
  • Understanding the Components of Kafka Cluster
  • Configuring Kafka Cluster
  • Kafka Producer and Consumer Java API
  • Integrating Apache Flume and Apache Kafka
  • Configuring Single Node Single Broker Cluster
  • Configuring Single Node Multi Broker Cluster
  • Producing and consuming messages
  • Flume Commands
  • Setting up Flume Agent
  • Streaming Twitter Data into HDFS

MODULE 10: STREAMING – MULTIPLE BATCHES

  • Why is Streaming Necessary?
  • Drawbacks in Existing Computing Methods
  • What is Spark Streaming?
  • Spark Streaming Features
  • Spark Streaming Workflow
  • How Uber Uses Streaming Data
  • Streaming Context & DStreams
  • Transformations on DStreams
  • Important Windowed Operators
  • Slice, Window and ReduceByWindow Operators
  • Stateful Operators
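
The windowed operators listed above combine several consecutive micro-batches into one result. The following is a rough plain-Python sketch of that idea, not Spark Streaming code: the helper reduce_by_window and its parameters are hypothetical names chosen for illustration, window length and slide interval are counted in batches rather than seconds, and only complete windows are emitted, which is a simplification.

```python
def reduce_by_window(batches, window_length, slide_interval, func):
    """Reduce the elements of sliding windows of micro-batches.

    batches: list of lists, one inner list per batch interval.
    window_length, slide_interval: sizes measured in number of batches.
    func: associative function that combines two elements.
    Plain-Python illustration only; no Spark API is used here.
    """
    results = []
    # A window ends every slide_interval batches, starting at the first
    # position where window_length batches are available.
    for end in range(window_length, len(batches) + 1, slide_interval):
        # Flatten the batches that fall inside the current window
        window = [x for batch in batches[end - window_length:end] for x in batch]
        if window:
            acc = window[0]
            for x in window[1:]:
                acc = func(acc, x)
            results.append(acc)
    return results


# Hypothetical stream of four micro-batches
batches = [[1, 2], [3], [4, 5], [6]]
sums = reduce_by_window(batches, window_length=2, slide_interval=1,
                        func=lambda a, b: a + b)
print(sums)
```

With a window of two batches sliding by one batch, each batch contributes to two consecutive results, which is why windowed operators suit rolling totals and moving averages.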

MODULE 11: APACHE SPARK STREAMING – DATA SOURCES

  • Apache Spark Streaming: Data Sources
  • Apache Flume and Apache Kafka Data Sources
  • Example: Using a Kafka Direct Data Source
  • Perform Twitter Sentiment Analysis Using Spark Streaming
  • Streaming Data Source Overview
  • Different Streaming Data Sources

MODULE 12: SPARK GRAPHX

  • Key concepts of Spark GraphX
  • GraphX algorithms and their implementations
Note
Customized Schedule is available for all courses irrespective of dates on the Calendar. Please get in touch with us for details.

Address

Middle East – Head Office

918, Blue Bay Towers,
Al Abraj Street (Marassi Drive),
Business Bay, Dubai, UAE

UK Office

337, Forest Road, London,
England, E17 5JR

Ph: +44 7443 559344

Join Our Team as a Trainer

We are proud to have a team of highly experienced expert training professionals from all over the world.

If you feel you have the right skills to become a part of our training team, please use the link below.

Apply Now

Copyright © Zoe Talent Solutions 2020

Due to the recent COVID-19 pandemic, we've decided to offer live online webinar training classes for all of our courses.


You can participate in our training sessions from the safety and comfort of your own home or office. This helps you fulfil your learning requirements while benefiting from a substantial discount on regular course fees.


Register for a live online session now.


To register for a course, please fill in the information request form on the course page, or reach us through live chat or WhatsApp.
