This 3-day training will teach you how to get the most out of the latest version of Apache Spark when it comes to Spark development and analytics.
The course is entirely hands-on: you will work through many exercises and workshops covering both programming and analytics use cases. The training is designed for both developers and analysts. You need basic programming skills, but no prior Big Data or Spark experience.
The recommended length of this training is three days. On request, we can extend it to four days to cover the more advanced topics.
At the end of this course you will understand and be hands-on with:
Spark’s capabilities and its place in the Big Data ecosystem
Spark SQL, DataFrames and Datasets
Real-time scalable data analytics with Spark Streaming
Machine Learning using Spark
Writing performant Spark applications by understanding Spark’s internals and optimisations
S. P. JAIN – Mumbai – Student
The topics in the course were explained clearly. Use cases and the course content practice related to real world problems and it was well conceptualised.
Spark Overview
Data Analytics with Spark:
DataFrames
Using Spark SQL and combining it with DataFrames
Unified analysis of data coming from different sources and formats
Programming Spark:
Using the RDD API
Using the Spark UI to evaluate Spark applications:
Analysing Spark's internal mechanics through the Spark UI
Writing performant Spark applications
Debugging Spark Applications
Tungsten and Catalyst: Advanced Optimisations
Spark Streaming
Real-time data processing with Spark
Implementing Continuous Applications with Structured Streaming
Machine Learning with Spark
Implementing machine learning pipelines with Spark
Supervised learning: Model Building, Predictions and Validation
Spark in the cloud:
How to configure and use Spark on Amazon Web Services.
Scala Tutorial
Introduction to Machine Learning
Advanced Spark programming
Spark Deployment techniques
Advanced Machine Learning with H2O. Integrating H2O into Spark.
PySpark and the IPython notebook
Using Spark in R