This three-day, hands-on Big Data and Apache Hadoop course shows you how the different components of the Big Data ecosystem fit together. Through an overview of use cases and a comparison of current Big Data technologies, by the end of the course you will be able to reason about your own Big Data problems and choose the right tool for each of them.
The topics in the course were explained clearly. The use cases and hands-on exercises related to real-world problems, and the course was well conceptualised.
S. P. JAIN – Mumbai – Student
Big Data and Hadoop: the Hadoop ecosystem overview, distributed computing workshop
HDFS and MapReduce, HDFS deep dive, MapReduce design patterns
YARN, Hadoop’s resource manager
MapReduce applications using the Hadoop Streaming API (word-count sketch below)
SQL-based solutions: Hive, HCatalog, and interoperability with Spark SQL (Hive sketch below)
Spark overview, the Spark DataFrame API, and Spark SQL (DataFrame sketch below)
Big Data in the cloud: cloud computing basics, Amazon Web Services, and Hadoop in the cloud with EMR (EMR sketch below)
ETL and orchestration systems with the Luigi framework (Luigi sketch below)
Technical Big Data Architecture use cases
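To give a flavour of the hands-on modules, a few illustrative sketches follow. The first is a minimal word-count job in the spirit of the Hadoop Streaming API, where the mapper and reducer are plain scripts that read stdin and write stdout; the file name, the map/reduce mode switch, and the sample data are assumptions for illustration, not course material.

```python
#!/usr/bin/env python3
"""Minimal word-count sketch for Hadoop Streaming.

The same file is run as the mapper ("map" mode) and the reducer
("reduce" mode); Hadoop Streaming pipes HDFS input through stdin
and collects whatever the script prints on stdout.
"""
import sys


def mapper():
    # Emit one "word<TAB>1" record per token.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")


def reducer():
    # Input arrives sorted by key, so all counts for a word are contiguous.
    current, count = None, 0
    for line in sys.stdin:
        word, value = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")


if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```

A script like this would typically be submitted with the hadoop-streaming jar bundled with your distribution, passing it as both the -mapper and the -reducer command.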
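For the Hive and Spark SQL interoperability topic, a minimal sketch assuming a configured Hive metastore and a hypothetical web_logs table; the same aggregation could equally be run from Hive itself.

```python
from pyspark.sql import SparkSession

# Hive support lets Spark SQL read tables registered in the Hive metastore.
spark = (SparkSession.builder
         .appName("hive-interop")
         .enableHiveSupport()
         .getOrCreate())

# "web_logs" is a hypothetical Hive table with a timestamp column "ts".
daily_hits = spark.sql("""
    SELECT to_date(ts) AS day, COUNT(*) AS hits
    FROM web_logs
    GROUP BY to_date(ts)
    ORDER BY day
""")
daily_hits.show()
```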
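For the Spark DataFrame topic, a sketch showing the same aggregation expressed once with the DataFrame API and once as Spark SQL; the sales.csv file and its country and amount columns are assumed for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dataframe-demo").getOrCreate()

# "sales.csv" with columns country and amount is an assumed sample file.
sales = spark.read.csv("sales.csv", header=True, inferSchema=True)

# DataFrame API: total revenue per country, highest first.
by_country = (sales.groupBy("country")
              .agg(F.sum("amount").alias("revenue"))
              .orderBy(F.desc("revenue")))

# The same result expressed in Spark SQL over a temporary view.
sales.createOrReplaceTempView("sales")
by_country_sql = spark.sql(
    "SELECT country, SUM(amount) AS revenue FROM sales "
    "GROUP BY country ORDER BY revenue DESC")

by_country.show()
```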
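For the cloud topic, a sketch of launching an EMR cluster with boto3; the region, release label, instance types, and IAM role names are assumptions that you would adapt to your own AWS account.

```python
import boto3

# Sketch of starting a Hadoop/Spark/Hive cluster on EMR with boto3.
emr = boto3.client("emr", region_name="eu-west-1")

response = emr.run_job_flow(
    Name="training-cluster",
    ReleaseLabel="emr-6.15.0",  # illustrative release label
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}, {"Name": "Hive"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        # Keep the cluster alive for interactive use; terminate it when done.
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("Cluster id:", response["JobFlowId"])
```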
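For the orchestration topic, a minimal Luigi pipeline sketch with two dependent tasks; the task names, file paths, and fake log lines are invented for illustration.

```python
import datetime
import luigi


class ExtractLogs(luigi.Task):
    """Pretend extraction step: writes raw request lines to a local file."""
    date = luigi.DateParameter()

    def output(self):
        return luigi.LocalTarget(f"raw/{self.date}.txt")

    def run(self):
        with self.output().open("w") as out:
            out.write("GET /index.html\nGET /about.html\n")


class CountRequests(luigi.Task):
    """Transform step: counts the lines produced by ExtractLogs."""
    date = luigi.DateParameter()

    def requires(self):
        return ExtractLogs(self.date)

    def output(self):
        return luigi.LocalTarget(f"counts/{self.date}.txt")

    def run(self):
        with self.input().open() as raw, self.output().open("w") as out:
            out.write(str(sum(1 for _ in raw)))


if __name__ == "__main__":
    # The local scheduler is enough for experimenting; production runs use luigid.
    luigi.build([CountRequests(date=datetime.date(2024, 1, 1))],
                local_scheduler=True)
```

Luigi only re-runs a task whose output target is missing, which is what makes pipelines like this restartable after a failure.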