Apache Spark Intermediate
This test measures your ability to work with Structured Streaming, MLlib, and Spark optimization techniques. To improve your skills, check the MLlib Guide and watch this practical Intermediate Spark Tutorial.
1. What is the difference between RDD and DataFrame?
2. What is the purpose of broadcast variables?
3. What is the purpose of accumulators?
4. What is the difference between map and flatMap?
5. What is the purpose of the cache() method?
6. What is the difference between repartition() and coalesce()?
7. What is the purpose of the SparkSession?
8. What is the difference between DataFrame and Dataset?
9. What is the purpose of the Spark UI?
10. What is the difference between narrow and wide transformations?
11. What is the purpose of the DAG scheduler?
12. What is the difference between persist() and cache()?
13. What is the purpose of the task scheduler?
14. What is the difference between reduceByKey and groupByKey?
15. What is the purpose of the Spark SQL Catalyst optimizer?
