The Basics of Scalding Programming

This path is intended for programmers who want to learn the basics of Scalding programming. A working knowledge of Cascading or Scala is helpful.


Expected Duration
125 minutes

Scalding is a Scala library, built on top of Cascading, that abstracts low-level Hadoop MapReduce operations such as map and reduce. In this course, you will learn to create simple Scalding programs using functions and classes.


Getting Started with Scalding

  • start the course
  • identify the features and users of Scalding, and the platforms supported in Scalding
  • download and install the Simple Build Tool (sbt)
  • download and install Scalding
  • describe the basics of REPL and run the Scalding REPL

Basic Programming in Scalding

  • create a Scalding program
  • start Scalding in local mode
  • use Scalding to understand the basic pipeline functionality
  • identify how to write and save data, and describe sinks
  • describe how Scalding uses snapshots to partially persist data

The Components of Scalding Programming

  • use Scalding to read text and identify text data sources
  • use Scalding functions to manipulate text
  • use Scalding group functions to aggregate data
  • describe how Scalding infers data types in saving computations
  • describe the SQL aggregation functions used in Scalding
  • use list value SQL clauses in Scalding
  • use advanced SQL aggregation techniques in Scalding

Using Scalding Functions

  • use map-like functions in Scalding
  • use filter and collect functions in Scalding
  • use the project function in Scalding
  • use grouping functions in Scalding
  • use join operations in Scalding
  • use record objects in Scalding

Scalding and Hadoop MapReduce

  • describe the functions Scalding uses for mappers
  • describe the functions Scalding uses for reducers
  • describe how Scalding uses Scala, Cascading, and Java in MapReduce jobs

Practice: Creating Programs in Scalding

  • create a comprehensive Scalding program




