Summary

Introduction
Log Analysis with Spark
    Section 1: Introduction to Apache Spark
        First Log Analyzer in Spark
        Spark SQL
        Spark Streaming
            Windowed Calculations: window()
            Cumulative Calculations: updateStateByKey()
            Reusing Code from Batching: transform()
    Section 2: Importing Data
        Batch Import
            Importing from Files
                S3
                HDFS
            Importing from Databases
        Streaming Import
            Built In Methods for Streaming Import
            Kafka
    Section 3: Exporting Data
        Small Datasets
        Large Datasets
            Save the RDD to Files
            Save the RDD to a Database
    Section 4: Log Analyzer Application
Twitter Streaming Language Classifier
    Collect a Dataset of Tweets
    Examine the Tweets and Train a Model
        Examine with Spark SQL
        Train with Spark MLLib
        Run Examine And Train
    Apply the Model in Real-time
Weather TimeSeries Data Application with Cassandra
    Overview
    Running the Example