RETL

RETL is an R package that provides tools for writing ETL jobs in R. It builds on R’s wide range of APIs for various types of data sources.

It is intended to be used together with the Rflow and RETLflow packages as a universal API to data stored in databases, files, and Excel sheets. RETL relies heavily on the data.table package for fast data transformations.

Installation

RETL can be installed from GitHub by running:

devtools::install_github("vh-d/RETL")

Examples

library(RETL)
library(magrittr)

# establish connections
my_db    <- DBI::dbConnect(RSQLite::SQLite(), "path/to/my.db")
your_csv <- "path/to/your.csv"
your_db  <- DBI::dbConnect(RMariaDB::MariaDB(), group = "your-db")

Pipes

# simple extract and load
etl_read(from = your_csv) %>% etl_write(to = my_db, name = "customers")

# extract -> transform -> load
etl_read(from = my_db, name = "orders") %>% # db query: EXTRACT from a database
  dtq(, order_year := year(order_date)) %>% # data.table query: TRANSFORM (adding a new column)
  etl_write(to = your_db, name = "customers") # LOAD to a db
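
The transform step above adds a column using data.table's `:=` operator. As a rough sketch of what that step amounts to in plain data.table, without RETL (the sample `orders` table and its values are made up for illustration):

```r
library(data.table)

# Hypothetical sample data standing in for the "orders" table
orders <- data.table(
  id         = 1:3,
  order_date = as.IDate(c("2020-01-15", "2021-06-30", "2021-12-01"))
)

# Add a column by reference; year() is provided by data.table
orders[, order_year := year(order_date)]
```

Because `:=` modifies the table in place, no copy of `orders` is made, which is part of what makes data.table fast on large extracts.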

Other tools

set_index(table = "customers", c("id", "order_year"), your_db)
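
For comparison, a multi-column index can also be created by hand through DBI. This is only a sketch under assumptions: it uses an in-memory SQLite database, a made-up `customers` table and a hypothetical index name, and it is not a description of how `set_index()` is implemented.

```r
library(DBI)

# In-memory SQLite database used purely for illustration
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table with the two columns to be indexed
dbWriteTable(con, "customers",
             data.frame(id = 1:3, order_year = c(2020L, 2021L, 2021L)))

# Create a composite index on (id, order_year); the index name is an assumption
dbExecute(con, "CREATE INDEX idx_customers_id_order_year ON customers (id, order_year)")

# List the indexes SQLite now holds for the table
indexes <- dbGetQuery(con, "PRAGMA index_list('customers')")

dbDisconnect(con)
```

The `CREATE INDEX` syntax shown here is SQLite's; other backends such as MariaDB accept a similar statement but may differ in details.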