Data-munging with Pandas
Data munging (also called data wrangling) is the process of transforming raw data into a set of data tables that can be used for a variety of downstream purposes, such as analytics.
In class:
- Using the pandas notebook:
- additional concepts not covered
- Work on the exercises from the notebook
After class
Learning outcomes
- Add new manipulated variables
- Separate character (string) variables into new variables
- Convert variables to numeric or categorical (factor) types
- Some string manipulations
- Rename variables
- Filter out different observations
  - Conditional selection
  - Tabulate the frequency of a variable
  - Missing values
  - Replace values
  - Duplicates
- (Using pipes)
- Sorting data
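
The learning outcomes above can be sketched in a few lines of pandas. This is only an illustrative sketch, not the course notebook: the data frame `raw`, its column names, and its values are made up for the example, and method chaining is shown as the closest pandas analogue to pipes.

```python
import pandas as pd

# Hypothetical raw data, invented purely for illustration.
raw = pd.DataFrame({
    "name": ["Smith, Anna", "Jones, Bo", "Jones, Bo", "Lee, Cy"],
    "height_cm": ["170", "182", "182", None],
    "sex": ["f", "m", "m", "M"],
})
df = raw.copy()

# Convert a string variable to numeric (None becomes NaN).
df["height_cm"] = pd.to_numeric(df["height_cm"])

# Add a new manipulated variable.
df["height_m"] = df["height_cm"] / 100

# Separate a character variable into new variables.
df[["last", "first"]] = df["name"].str.split(", ", expand=True)

# A simple string manipulation.
df["last"] = df["last"].str.upper()

# Replace values, then convert to a categorical (factor-like) type.
df["sex"] = df["sex"].replace({"f": "female", "m": "male", "M": "male"})
df["sex"] = df["sex"].astype("category")

# Rename a variable.
df = df.rename(columns={"height_cm": "height"})

# Conditional selection of observations.
tall = df[df["height"] > 175]

# Tabulate the frequency of a variable.
sex_counts = df["sex"].value_counts()

# Count missing values in a variable.
n_missing = df["height"].isna().sum()

# Drop duplicated observations.
deduped = df.drop_duplicates()

# Method chaining: drop missing heights, then sort in descending order.
result = (
    deduped
    .dropna(subset=["height"])
    .sort_values("height", ascending=False)
)
print(result)
```

Each step is kept on its own line so it can be matched to one outcome in the list; in practice, several of these steps could be combined into a single chain.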