Hadoop in the Cloud

Developers interested in expanding their knowledge of Hadoop from the operations perspective


Expected Duration
173 minutes

Amazon Web Services, also known as AWS, is a secure cloud-computing platform offered by Amazon.com. This course introduces AWS and its most prominent tools, such as IAM, S3, and EC2. Additionally, we will cover how to install, configure, and use a Hadoop cluster on AWS. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.


Amazon Web Services

  • start the course
  • describe how cloud computing can be used as a solution for Hadoop
  • recall some of the most common services of the EC2 service bundle
  • recall some of the most common services that Amazon offers

Setup of AWS

  • describe how the AWS credentials are used for authentication
  • create an AWS account
  • describe the use of AWS access keys
  • describe AWS Identity and Access Management (IAM)
  • set up AWS IAM

AWS System Security

  • describe the use of SSH key pairs for remote access

AWS S3 and EC2

  • set up S3 and import data
  • provision a micro instance of EC2

Setup of AWS Cluster

  • prepare to install and configure a Hadoop cluster on AWS
  • create an EC2 baseline server
  • create an Amazon Machine Image (AMI)
  • create an Amazon cluster
  • describe what the command line interface is used for

Moving Data

  • use the command line interface
  • describe the various ways to move data into AWS

Elastic MapReduce

  • recall the advantages and limitations of using Hadoop in the cloud
  • recall the advantages and limitations of using AWS EMR
  • describe EMR end-user connections and EMR security levels
  • set up an EMR cluster
  • run an EMR job from the web console
  • run an EMR job with Hue
  • run an EMR job with the command line interface

Practice: Cloud Computing

  • write an Elastic MapReduce script for AWS




