Spark SQL and DataFrames • Spark GraphX • Spark MLlib • Spark Streaming
Lightning-fast cluster computing
Creating a DataFrame from Hive
Place your hive-site.xml, core-site.xml (for security configuration), and hdfs-site.xml (for HDFS configuration) files in Spark's conf/ directory.
Creating a DataFrame from MySQL
Transforming and querying DataFrames
https://spark.apache.org/docs/1.6.2/api/python/pyspark.sql.html#
Working with data in a DataFrame
Query DataFrame using columns
Extracting data from rows
Iterative algorithms in Spark: PageRank
Neighbor contribution function
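A minimal sketch of the contribution step in plain Python, not the slide's own code: each page splits its current rank evenly among the pages it links to. The names compute_contribs and pagerank, the three-page links dictionary, and the 0.85 damping factor are assumptions made for illustration. In Spark, a function like this would typically be applied with flatMap over the joined link and rank RDDs, and the contributions summed with reduceByKey.

```python
from collections import defaultdict


def compute_contribs(neighbors, rank):
    """Yield (neighbor, contribution) pairs.

    A page divides its rank equally among all pages it links to.
    """
    neighbors = list(neighbors)
    for neighbor in neighbors:
        yield neighbor, rank / len(neighbors)


def pagerank(links, iterations=10, damping=0.85):
    """Iterate PageRank over links, a dict of source page -> destination pages.

    Assumes every destination page also appears as a source page.
    """
    ranks = {page: 1.0 for page in links}  # every page starts with rank 1.0
    for _ in range(iterations):
        contribs = defaultdict(float)
        for page, neighbors in links.items():
            for dest, contrib in compute_contribs(neighbors, ranks[page]):
                contribs[dest] += contrib  # sum contributions per destination
        # Recompute each rank from the contributions it received.
        ranks = {
            page: (1 - damping) + damping * contribs[page] for page in links
        }
    return ranks


# Hypothetical three-page graph, grouped by source page.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
```

Because the link structure is reused unchanged in every iteration while the ranks are recomputed, the link data is the natural candidate to keep in memory across iterations.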
Page links grouped by source page
Persisting the link pair RDD
MLlib in Spark
https://spark.apache.org/docs/2.0.2/ml-guide.html
Why MLlib?
https://docs.databricks.com/spark/latest/mllib/decision-trees.html
Spark Streaming
http://spark.apache.org/docs/latest/streaming-programming-guide.html