This course is best suited for those who want to gain Big Data analytics skills to:
- Analyze huge datasets quickly
- Build, deploy and run Spark applications on Spark clusters
- Process continual streams of data with Spark Streaming
- Frame big data analysis problems as Apache Spark scripts
- Develop distributed code using the Scala programming language
- Optimize Spark jobs through partitioning, caching, and other techniques
- Transform structured data using Spark SQL and DataFrames
- Traverse and analyze graph structures using GraphX
Course Outline
- Introduction to Big Data & Apache Spark
- Getting started with Apache Spark
- Introduction to Online Lab
- Understanding resilient distributed datasets (RDDs)
- Working with key/value pairs
- Loading and saving your data
- Advanced Apache Spark programming
- Running Apache Spark on a cluster
- Introduction to Apache Spark libraries
- Apache Spark Streaming