Data munging with pandas

Data munging (also called data wrangling) is the process of transforming raw data into a set of data tables that can be used for downstream purposes such as analytics.

In class:

  • Using the pandas notebook:
    • Additional concepts not covered
    • Work on the exercises from the notebook

After class

Learning outcomes

  • Add new derived (transformed) variables
  • Separate a character variable into new variables
  • Convert variables to numeric or factor
  • Perform basic string manipulations
  • Rename variables
  • Filter out different observations:
    • Conditional selection
    • Tabulate the frequency of a variable
    • Missing values
    • Replace values
    • Duplicates
  • (Using pipes)
  • Sorting data
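
The outcomes above can be sketched in one short pandas example. This is a minimal illustration, not the notebook's own code: the toy table, its column names (name, height_cm, group), and the replacement value are all hypothetical. It also assumes that "pipes" corresponds to pandas method chaining and that "factor" corresponds to the pandas category dtype.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
raw = pd.DataFrame({
    "name": ["Ann Lee", "Bob Ray", "Bob Ray", "Cy Orr"],
    "height_cm": ["170", "182", "182", None],
    "group": ["a", "b", "b", "a"],
})

# Method chaining plays the role of pipes: each step returns a new DataFrame.
df = (
    raw
    .drop_duplicates()                       # drop exact duplicate rows
    .rename(columns={"group": "team"})       # rename a variable
    .assign(
        # convert text to numeric; values that cannot be parsed become NaN
        height_cm=lambda d: pd.to_numeric(d["height_cm"], errors="coerce"),
        # string manipulation, then conversion to a categorical variable
        team=lambda d: d["team"].str.upper().astype("category"),
        # separate one character column into two new variables
        first=lambda d: d["name"].str.split(" ").str[0],
        last=lambda d: d["name"].str.split(" ").str[1],
    )
    .assign(height_m=lambda d: d["height_cm"] / 100)  # new derived variable
    .sort_values("height_cm", ascending=False)        # sort; NaN goes last
    .reset_index(drop=True)
)

# Conditional selection: keep only rows that satisfy a condition.
tall = df[df["height_cm"] > 175]

# Tabulate the frequency of a variable.
team_counts = df["team"].value_counts()

# Missing values: count them, then fill them with the column mean.
n_missing = df["height_cm"].isna().sum()
filled = df["height_cm"].fillna(df["height_cm"].mean())

# Replace specific values (the mapping is a made-up example).
renamed = df["first"].replace({"Cy": "Cyrus"})

print(df)
print(team_counts)
```

Writing the steps as a single chain keeps the original table (raw) unchanged, which makes it easy to rerun or reorder individual steps while working through the exercises.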