Getting and Cleaning Data


Below are the top discussions from Reddit that mention this online Coursera course from Johns Hopkins University.

Before you can work with data you have to get some.

Data Manipulation, Regular Expression (REGEX), R Programming, Data Cleansing


Taught by
Jeff Leek, PhD
Associate Professor, Biostatistics
and 2 more instructors

Offered by
Johns Hopkins University

Reddit Posts and Comments

2 posts • 8 mentions • top 1 shown below

r/datascience • comment
1 point • HailSeitan999

Because DataCamp is known for tolerating sexual harassment and their CEO seems like a child, here are some alternative recs about cleaning and the other stuff that makes up 95% of data jobs:

here's what to look for, every time: https://twitter.com/b0rk/status/1182288624018247685

here's a coursera course on it: https://www.coursera.org/learn/data-cleaning

here's how SQL works: https://twitter.com/b0rk/status/1184571894722449409

here's how tidyverse works, including various read/write libraries + opinions on data types + code reusability that are generally applicable: http://r4ds.had.co.nz/

hopefully a helpful (+ free) set of alternatives!

I'd also say material on the hard choices around interpolation vs exclusion is hard to find, probably because those choices are so data-, resource-, and context-specific, but they're still good to be aware of.
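To make the interpolation vs exclusion trade-off concrete, here's a minimal sketch in plain Python. The function names and the sample readings are made up for illustration, and linear interpolation is only one of many possible fill strategies; which one is appropriate depends on the data.

```python
def drop_missing(values):
    """Exclusion: keep only the observed values, discarding the gaps."""
    return [v for v in values if v is not None]


def interpolate_linear(values):
    """Interpolation: fill interior gaps on a straight line between the
    nearest observed neighbours. Leading and trailing gaps stay None,
    because there is no second neighbour to draw the line to."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Hypothetical evenly spaced sensor readings with missing entries.
readings = [10.0, None, None, 16.0, 18.0, None]

print(drop_missing(readings))
print(interpolate_linear(readings))
```

Exclusion shrinks the dataset and loses the positions of the gaps, while interpolation keeps the shape but invents values that were never measured. Neither is free, which is why it ends up being a case-by-case call.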