Showing 3 changed files with 70 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
# Data Wrangling (emphasis on `dplyr`) | ||
|
||
|
||
```{r echo = FALSE} | ||
library(knitr) | ||
opts_chunk$set(message = FALSE, warning = FALSE, cache = TRUE) | ||
options(width = 100, dplyr.width = 100) | ||
library(ggplot2) | ||
theme_set(theme_light()) | ||
``` | ||
|
||
|
||
|
||
## Introduction | ||
|
||
Data is rarely ready to use as it arrives; there's invariably something amiss. Data wrangling (a.k.a. data carpentry) is the process of getting it ready for analysis. | ||
|
||
|
||
## Theory and methods | ||
|
||
|
||
[Stat 545: Data wrangling, exploration, and analysis with R](http://stat545.com/index.html) -- course materials associated with the University of British Columbia's Statistics 545 course. Prepared in large part by Dr. Jenny Bryan. | ||
|
||
|
||
### Tidy evaluation | ||
|
||
* programming with `dplyr` | ||
|
||
Edwin Thoen, 2017-08-25, [Tidy evaluation, most common actions](https://edwinth.github.io/blog/dplyr-recipes/) | ||
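
As a rough illustration of what "programming with `dplyr`" means in practice, the sketch below wraps a grouped summary in a function that accepts unquoted column names. The helper `group_mean()` and its argument names are hypothetical, and the `{{ }}` (embrace) operator assumes a reasonably recent `dplyr`/`rlang`; material from 2017 typically shows the equivalent `enquo()` and `!!` pattern instead.

```r
library(dplyr)

# Hypothetical helper: mean of one column within groups of another.
# `{{ }}` passes the caller's unquoted column names through to dplyr verbs.
group_mean <- function(data, group_col, value_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarise(mean_value = mean({{ value_col }}, na.rm = TRUE))
}

# Usage with a built-in data set: mean mpg for each cylinder count
result <- group_mean(mtcars, cyl, mpg)
result
```

Without the embrace operator, `group_by(group_col)` would look for a column literally named `group_col`, which is the problem tidy evaluation addresses.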
|
||
### Reading messy files | ||
|
||
Luis D. Verde, 2018-12-14, [Tidyeval meets PDF table hell](http://luisdva.github.io/rstats/Tidyeval-pdf-hell/) -- great solution to the common problem of broken rows ("values that are broken up into two lines for whatever reason (often to optimize space on a page in a table in a typeset pdf)"). | ||
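
To make the broken-row problem concrete, here is a minimal sketch of one way to stitch continuation rows back together. It is not necessarily the approach taken in the linked post. The data are made up, and the sketch assumes that a continuation row can be recognised by an `NA` in the key column (`species` here).

```r
library(dplyr)

# Made-up example: rows 2 and 4 are continuations of the rows above them
broken <- tibble(
  species = c("Canis lupus", NA, "Vulpes vulpes", NA),
  notes   = c("widespread in", "the Holarctic", "introduced to", "Australia")
)

fixed <- broken %>%
  # A new record starts wherever the key column is present
  mutate(record = cumsum(!is.na(species))) %>%
  group_by(record) %>%
  # Keep the key from the first row and paste the text fragments together
  summarise(
    species = first(species),
    notes   = paste(notes, collapse = " ")
  ) %>%
  select(-record)

fixed
```

Real PDF extractions are rarely this tidy, so the rule for spotting a continuation row usually needs adjusting to the table at hand.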
|
||
|
||
### Working with dates | ||
|
||
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Updated Turing Test concept:<br>A spreadsheet of dates, hand-entered by interns more than a decade ago, featuring such well-known time formats as "1996ish", "1941/xd01944", "1955?" and "WWII."<br>I'm not worried about AI until someone shows me the algorithm that can make sense of this. <a href="https://t.co/IhzofigX2b">pic.twitter.com/IhzofigX2b</a></p>— Brooke Watson (@brookLYNevery1) <a href="https://twitter.com/brookLYNevery1/status/954368989181902848?ref_src=twsrc%5Etfw">January 19, 2018</a></blockquote> | ||
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> | ||
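
A small base R sketch of a first pass over values like these: pull out the first four-digit run as a year and leave everything else as `NA` for manual review. The strings are copied from the tweet as they appear above; treating the first four digits as the intended year is an assumption, and it says nothing about what the original data enterers meant.

```r
# Strings as quoted in the tweet above
raw_dates <- c("1996ish", "1941/xd01944", "1955?", "WWII.")

# Assumption: the first run of four digits is the intended year
has_year <- grepl("[0-9]{4}", raw_dates)
year <- ifelse(
  has_year,
  sub("^.*?([0-9]{4}).*$", "\\1", raw_dates, perl = TRUE),
  NA_character_
)
year <- as.integer(year)

# Keep the raw value next to the parsed one so the NAs can be reviewed by hand
data.frame(raw_dates, year)
```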
|
||
|
||
## R | ||
|
||
Arranged by package | ||
|
||
### `dplyr` | ||
|
||
**package** | ||
|
||
CRAN: [dplyr: A Grammar of Data Manipulation](https://CRAN.R-project.org/package=dplyr) | ||
|
||
GitHub: [hadley/dplyr](https://github.com/hadley/dplyr) | ||
|
||
**articles** | ||
|
||
* [Introduction to dplyr](http://stat545.com/block009_dplyr-intro.html), part of the UBC [STAT545: Data wrangling, exploration, and analysis with R](http://stat545.com/index.html) course materials | ||
|
||
|
||
* Gary Hutson, 2018-05-24, [DPLYR: A Beginners Guide](https://www.r-bloggers.com/dplyr-a-beginners-guide/) | ||
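
For readers who have not seen `dplyr` before, the sketch below chains the core verbs that introductions like those above cover. It uses the built-in `mtcars` data; the cut-off of 100 horsepower and the column name `kpl` are arbitrary choices for illustration.

```r
library(dplyr)

summary_tbl <- mtcars %>%
  filter(hp > 100) %>%                 # keep rows that meet a condition
  mutate(kpl = mpg * 0.4251) %>%       # add a column: miles/gallon to km/litre
  group_by(cyl) %>%                    # define groups
  summarise(
    n_cars   = n(),                    # one row per group
    mean_kpl = mean(kpl)
  ) %>%
  arrange(desc(mean_kpl))              # sort the result

summary_tbl
```

Each verb takes a data frame and returns a data frame, which is what lets the steps be piped together with `%>%`.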
|
||
-30- |