Data Exploration

Individuals with some programming and math experience working toward implementing data science in their everyday work


Expected Duration
59 minutes

Once data has been transformed into a usable format, the next step is preliminary data exploration. In this course, you’ll explore practical tools and techniques for exploring data.


Introduction to Data Exploration

  • start the course
  • use csvgrep to search for data in CSV files
  • use csvstat to explore values in CSV data
  • use csvsql to query CSV data like a SQL database
  • use gnuplot to quickly plot data on the command line
  • use wc to count words, characters, and lines within a text file
  • explore a subdirectory tree from the command line
  • use natural language processing to count word frequencies in a text document
  • take random samples from a list of records
  • find the top rows by value and percent in a data set
  • find repeated records in a data set
  • identify outliers using standard deviation
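
As a rough illustration of the last four objectives above (random sampling, top rows by value and percent, repeated records, and outliers by standard deviation), here is a minimal Python sketch using only the standard library. The records, function names, and the two-standard-deviation threshold are assumptions made up for this example; the course itself may use different tools and data.

```python
import random
import statistics
from collections import Counter

# Hypothetical records: (name, value) pairs invented for illustration.
records = [
    ("a", 10), ("b", 12), ("c", 11), ("d", 13), ("e", 12),
    ("f", 11), ("g", 10), ("h", 95), ("b", 12), ("i", 12),
]

def random_sample(rows, k, seed=0):
    """Return k rows chosen at random without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(rows, k)

def top_rows(rows, n):
    """Return the n rows with the largest values."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]

def top_percent(rows, percent):
    """Return the top `percent` of rows by value (at least one row)."""
    n = max(1, round(len(rows) * percent / 100))
    return top_rows(rows, n)

def repeated_records(rows):
    """Return each distinct row that appears more than once."""
    counts = Counter(rows)
    return [row for row, count in counts.items() if count > 1]

def outliers(rows, threshold=2.0):
    """Return rows whose value lies more than `threshold` sample
    standard deviations from the mean (threshold is an assumed default)."""
    values = [value for _, value in rows]
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [row for row in rows if abs(row[1] - mean) > threshold * sd]

print(random_sample(records, 3))
print(top_rows(records, 2))
print(top_percent(records, 20))
print(repeated_records(records))
print(outliers(records))
```

A fixed threshold of two standard deviations is only one common convention; with small samples, a single extreme value inflates the standard deviation itself, so the cutoff usually needs tuning per data set.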

Practice: Exploring Word Frequencies

  • perform a word frequency count on a classic book from Project Gutenberg
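
A word frequency count of this kind might be sketched in Python as follows. This is an assumption about one possible approach, not the course's own solution: it uses a simple regular-expression tokenizer rather than a full natural language processing library, and the sample sentence stands in for the text of a downloaded book.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    # Assumed tokenization rule: a word is a run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# Stand-in text; in practice this would be the contents of a book file.
sample = "It was the best of times, it was the worst of times"
freq = word_frequencies(sample)

# Show the three most frequent words with their counts.
print(freq.most_common(3))
```

To apply it to a whole book, read the file into a string first and pass that string to `word_frequencies`.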




