Note: These notes are not an exhaustive reference on Apache Spark.
- Introduction
- Architecture
- Partitioning
- Spark Execution
- Scheduling
- Shuffling
- Optimizer
- RDD
- Spark SQL - Structured API
- Join
- Key/Value data
- Testing
- Spark Streaming
... still in progress!
The main sources for these notes are Spark: The Definitive Guide, High Performance Spark, and the Coursera course Big Data Analysis with Scala and Spark.