Apache HBase Fundamentals: Advanced API, Administration, and MapReduce

Target Audience
Administrators and developers who need experience using HBase

Prerequisite
None

Expected Duration
118 minutes

Description
Administering Apache HBase is a fundamental skill to understand. HBase can be managed through the Java client API, and it can be integrated with MapReduce for tasks, such as bulk loading, that help improve performance. This course shows how to implement filters that limit the results returned from a scan operation, how to administer an HBase cluster and instance, and how to perform backup and restore operations. It also covers using MapReduce with HBase as a data source and a data sink.

Objectives

Filters

  • start the course
  • use utility filters that extend the FilterBase class to filter scan results
  • use comparison filters to limit the scan results using comparison operators and a comparator instance
  • use custom filters to extend or change the behavior of an existing filter to achieve finer-grained control over the scan results

Cluster Administration

  • use the HBaseAdmin API to check the status of the master server, connection instance, and the configuration used by the instance
  • view a list of all the user-space tables in HBase and the instance for each table
  • disable and delete tables from HBase
  • complete a major compaction using the HBase shell
  • merge regions in the same table using the Merge utility
  • stop and decommission a RegionServer
  • perform a rolling restart on the entire cluster
  • add a new node to HBase
  • view metrics to monitor HBase

Snapshots and Backups

  • take a snapshot
  • use a snapshot to clone a table and move it to another cluster
  • export and restore a snapshot to another cluster
  • perform a full shutdown backup of HBase
  • perform a backup of HBase on a live cluster
  • restore HBase

MapReduce

  • use the TableOutputFormat class to set up a table as the output of a MapReduce process, using HBase as the data sink
  • set up a table as an input to a MapReduce process using HBase as the data source
  • use MapReduce to bulk load data directly into the HBase file system, bypassing the HBase API
  • use the getSplits method of the TableInputFormatBase class to create custom splitters when using an HBase table as a data source
  • access other HBase tables from within a MapReduce job by creating a Table instance in the setup method of Mapper

Practice: Manage HBase

  • perform HBase cluster and node maintenance
