Spark Streaming

Target Audience
Programmers and developers familiar with Apache Spark who wish to expand their skill sets


Expected Duration
161 minutes

Spark Streaming leverages Spark’s language-integrated API to perform streaming analytics. This design lets application code written for batch processing be reused to join streams against historical data or to run ad hoc queries on stream state. In this course, you will learn how to work with different input streams, perform transformations on streams, and tune performance.
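
The code-reuse idea can be illustrated without a cluster. The following is a minimal plain-Python sketch, not Spark code: a stream is modelled as a sequence of small batches, and one function is applied unchanged to a full batch dataset and to each micro-batch. The function name `join_with_history`, the record shape, and the sample data are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]  # (user_id, clicks): hypothetical record shape


def join_with_history(records: Iterable[Record],
                      history: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Inner-join records against historical data keyed by user_id."""
    return [(user, clicks, history[user])
            for user, clicks in records
            if user in history]


# Historical (static) data, e.g. loaded once from storage.
history = {"alice": "premium", "bob": "free"}

# Batch processing: one call over the whole dataset.
batch = [("alice", 3), ("bob", 1), ("carol", 7)]
batch_result = join_with_history(batch, history)

# Streaming, modelled as micro-batches: the same function runs per batch.
micro_batches = [[("alice", 3)], [("bob", 1), ("carol", 7)]]
stream_result = [row
                 for mb in micro_batches
                 for row in join_with_history(mb, history)]
```

In Spark Streaming itself, this pattern corresponds to applying an RDD-level function to each batch of a DStream, for example through the `transform` operation.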


Streaming Analytics

  • start the course
  • describe what a DStream is
  • recall how TCP socket input streams are ingested
  • describe how file input streams are read
  • recall how Akka Actor input streams are received
  • describe how Kafka input streams are consumed
  • recall how Flume input streams are ingested
  • set up Kinesis input streams
  • configure Twitter input streams
  • implement custom input streams
  • describe receiver reliability

Transformations on DStreams

  • use the updateStateByKey operation
  • perform transform operations
  • perform window operations
  • perform join operations
  • use output operations on DStreams
  • use DataFrame and SQL operations on streaming data
  • use learning algorithms with MLlib
  • persist stream data in memory
  • enable and configure checkpointing
  • deploy applications
  • monitor applications
  • reduce batch processing times
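
Two of the operations listed above, stateful updates by key and windowing, can be sketched in plain Python to show their semantics. This is a simplified model under stated assumptions, not the Spark API: each micro-batch is a list of (key, value) pairs, the window slides by one batch at a time, and the helper names `update_state` and `windowed_counts` are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

Batch = List[Tuple[str, int]]  # one micro-batch of (key, value) pairs


def update_state(state: Dict[str, int], batch: Batch) -> Dict[str, int]:
    """Fold one micro-batch into the running per-key totals."""
    new_state = dict(state)
    for key, value in batch:
        new_state[key] = new_state.get(key, 0) + value
    return new_state


def windowed_counts(batches: List[Batch], window: int) -> List[Dict[str, int]]:
    """Per-key sums over the last `window` batches, emitted after each batch."""
    recent: Deque[Batch] = deque(maxlen=window)  # old batches fall out automatically
    results = []
    for batch in batches:
        recent.append(batch)
        totals: Dict[str, int] = {}
        for b in recent:
            for key, value in b:
                totals[key] = totals.get(key, 0) + value
        results.append(totals)
    return results


batches = [[("a", 1)], [("a", 2), ("b", 1)], [("b", 4)]]

# Stateful update: totals accumulate across every batch seen so far.
state: Dict[str, int] = {}
for batch in batches:
    state = update_state(state, batch)

# Windowed sums: only the two most recent batches contribute.
windows = windowed_counts(batches, window=2)
```

The contrast is the point of the sketch: the stateful total keeps growing over the life of the stream, while a windowed result forgets batches that have slid out of the window.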

Performance Tuning

  • set the right batch interval
  • tune memory usage
  • describe fault tolerance semantics




