R DT data.table Join
This is part of PML notes. HDD: r_data_table_start
Prepare Data
library(dplyr)
library(readr)
library(data.table)
hero = "
name, alignment, gender, publisher
Magneto, bad, male, MarvelDuplicate
Storm, good, female, MarvelDuplicate
Batman, good, male, DC
Joker, bad, male, DC
Catwoman, bad, female, DC
Hellboy, good, male, Dark Horse Comics
"
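# skip = 1 drops the blank first line of the string; trim_ws strips the spaces after the commas
hero = read_csv(hero, trim_ws = TRUE, skip = 1)
# convert the tibble returned by read_csv to a data.table
hero = data.table(hero)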
publisher = "
publisher, yr_founded
DC, 1934
MarvelDuplicate, 1939
MarvelDuplicate, 8888
Image, 1992
"
# parse the inline CSV the same way, then convert to a data.table
publisher = read_csv(publisher, trim_ws = TRUE, skip = 1)
publisher = data.table(publisher)
For all the join commands, if a key is set on both data.tables, then on = 'key_col_name' can be omitted.
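A minimal sketch of that rule, using small stand-in tables rather than the full data above. The names hero_dt and publisher_dt are made up for this example, and the tables are built directly with data.table() so the snippet runs on its own.

```r
library(data.table)

# Hypothetical stand-ins for the hero and publisher tables
hero_dt <- data.table(
  name      = c("Magneto", "Batman", "Hellboy"),
  publisher = c("MarvelDuplicate", "DC", "Dark Horse Comics")
)
publisher_dt <- data.table(
  publisher  = c("DC", "MarvelDuplicate", "MarvelDuplicate", "Image"),
  yr_founded = c(1934, 1939, 8888, 1992)
)

# No key set yet: the join column has to be named with `on`
with_on <- publisher_dt[hero_dt, on = "publisher"]

# Set the same key on both tables (setkey sorts each table by reference)
setkey(hero_dt, publisher)
setkey(publisher_dt, publisher)

# Keys are set on both tables, so `on` can be left out
without_on <- publisher_dt[hero_dt]

# merge() also falls back to the shared key columns when `by` is not given
merged <- merge(hero_dt, publisher_dt)
```

Both bracket joins keep every row of hero_dt, so Hellboy appears with a missing yr_founded and Magneto appears twice because MarvelDuplicate is listed twice in publisher_dt. Only the row order differs, since setkey() re-sorts the tables. merge() does an inner join by default, so Hellboy is dropped there.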