Python for Data Science – Complex Data Engineering in Python

Individuals who want a deeper understanding of data science through more advanced techniques and operations

Prerequisite
None

Expected Duration
161 minutes

Description
Data scientists have a vast toolset available to them, with many moving parts, especially in Python. This course maps out that toolset, starting with data analysis in pandas, then moving on to machine learning with the SciPy stack, working with prediction data, and an introduction to the scikit-learn toolset. It then covers data visualization with matplotlib, time series, and other data engineering operations.

Objectives

Data Analysis Using pandas

  • start the course
  • describe the basic and common functionality of pandas for data science
  • describe the primary pandas data structures
  • describe hierarchical indexing in pandas
  • perform basic data query operations on a pandas DataFrame
  • perform aggregation operations on a pandas DataFrame
  • perform basic merge operations with pandas DataFrames
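
As a rough illustration of the query, aggregation, and merge objectives above, the following sketch runs all three on two small DataFrames. The data, column names, and threshold are invented for this example and are not taken from the course material.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3, 2],
    "amount": [20.0, 35.5, 12.5, 50.0, 4.5],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Chloe"],
})

# Basic query: keep only the orders above a threshold.
large_orders = orders[orders["amount"] > 15]

# Aggregation: total and mean order amount per customer.
totals = (
    orders.groupby("customer_id")["amount"]
    .agg(["sum", "mean"])
    .reset_index()
)

# Merge: attach customer names to the aggregated totals.
report = totals.merge(customers, on="customer_id", how="left")
print(report)
```

A left merge is used here so that every aggregated row is kept even if a customer record were missing; an inner merge would be an equally reasonable choice for this data.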

Machine Learning with scikit-learn

  • describe the functionality and use of core packages and sub-packages in the SciPy stack
  • use the scikit-learn library to perform basic data standardization
  • use the scikit-learn library to perform basic data normalization
  • use the scikit-learn library to perform simple linear regression analysis
  • perform supervised learning by using the scikit-learn library to perform optical recognition of hand-written digits
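
The standardization, normalization, and linear regression objectives above might look roughly like this in scikit-learn. This is a minimal sketch on made-up data. It assumes "normalization" means rescaling each sample to unit norm, which is how scikit-learn uses the term; the course may intend a different definition, such as min-max scaling.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler, normalize

# Hypothetical data: y = 2x + 1 exactly, so the fit is easy to check.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = 2.0 * X.ravel() + 1.0

# Standardization: rescale each feature to zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)

# Normalization (assumed meaning): rescale each row to unit L2 norm.
rows = np.array([[3.0, 4.0], [1.0, 0.0]])
rows_unit = normalize(rows, norm="l2")

# Simple linear regression: recover the slope and intercept.
model = LinearRegression().fit(X, y)
slope, intercept = model.coef_[0], model.intercept_
print(slope, intercept)
```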

Data Visualization with Python

  • use the Python matplotlib library to plot and display a simple 2D line plot and set its line properties
  • use the Python matplotlib library to create and customize multiple plots in a single figure
  • use the Python matplotlib library to create and customize a box plot
  • use the Python matplotlib library to create and display a heat map
  • use the Python matplotlib library to place legends and annotations on a 2D line plot
  • use pandas to create a scatter plot matrix
  • use the Python matplotlib library to create a 3D plot
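
A few of the plotting objectives above (line properties, multiple plots in one figure, a box plot, a legend, and an annotation) can be sketched together as follows. The data is synthetic, and the styling choices and the non-interactive `Agg` backend are assumptions made so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# Two plots side by side in a single figure.
fig, (ax_line, ax_box) = plt.subplots(1, 2, figsize=(8, 3))

# 2D line plot with explicit line properties, a legend, and an annotation.
ax_line.plot(x, np.sin(x), color="tab:blue", linestyle="--",
             linewidth=2, label="sin(x)")
ax_line.annotate("peak", xy=(np.pi / 2, 1.0), xytext=(3.0, 0.8),
                 arrowprops={"arrowstyle": "->"})
ax_line.legend(loc="lower left")
ax_line.set_title("Line plot")

# Box plot of two synthetic samples drawn with a fixed seed.
rng = np.random.default_rng(0)
ax_box.boxplot([rng.normal(0, 1, 50), rng.normal(1, 2, 50)])
ax_box.set_title("Box plot")

fig.tight_layout()
```

To view the result, call `fig.savefig(...)` with a file name of your choice, or switch to an interactive backend and call `plt.show()`.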

Time Series and Forecasting Data

  • create, slice, and resample time series data in Python
  • use pandas to create and manipulate Timedeltas in Python
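
The time series objectives above can be illustrated with a short pandas sketch. The hourly series and its dates are hypothetical, chosen only so that the slices and daily means are easy to verify by hand.

```python
import pandas as pd

# Hypothetical hourly series covering two full days.
step = pd.Timedelta(hours=1)
index = pd.date_range("2024-01-01", periods=48, freq=step)
series = pd.Series(range(48), index=index)

# Slice: partial-string indexing selects every reading from the first day.
first_day = series.loc["2024-01-01"]

# Resample: downsample the hourly values to daily means.
daily_mean = series.resample("D").mean()

# Timedeltas: durations come from subtracting timestamps,
# and can be added back to a timestamp.
span = series.index[-1] - series.index[0]
shifted = series.index[0] + pd.Timedelta(days=1, hours=6)

print(daily_mean)
print(span, shifted)
```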

Data Engineering with Python

  • identify key concepts in Python data cleansing
  • perform data preprocessing and text mining in Python

Working with Databases

  • use pandas to access a MySQL database

Inferential Statistics

  • use the SciPy package to describe various forms of statistical distribution

Practice: Integrations in Data Science

  • manage other concepts and processes in data science
