R

Install & Use R & RStudio

2017-05-08. Category & Tags: R, Install, Ubuntu, Linux, Rstudio

R IN WINDOWS #

Download here and install.

R IN UBUNTU 18.04 #

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 && \
sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/' && \
sudo apt update && \
sudo apt install -y r-base r-base-dev libcurl4-openssl-dev libssl-dev build-essential && \
sudo -i R # start R as root so that installed packages are available to all users

Commonly used packages:

install.packages(c('devtools', 'digest', 'repr', 'IRdisplay', 'crayon', 'pbdZMQ', 'ggplot2', 'IRkernel', 'ggpubr'))

DigitalOcean

Optional: run sudo chmod 777 /usr/local/lib/R/site-library so that any user can install packages into the shared library.

R IN UBUNTU 16.04 #

Option 1:

...

R DT data.table Join

2016-10-11. Category & Tags: DT, R, Data.table, Join

This is part of PML notes. HDD: r_data_table_start

Prepare Data #

library(dplyr)
library(readr)
library(data.table)
hero = "
name,       alignment, gender,   publisher
Magneto,    bad,       male,     MarvelDuplicate
Storm,      good,      female,   MarvelDuplicate
Batman,     good,      male,     DC
Joker,      bad,       male,     DC
Catwoman,   bad,       female,   DC
Hellboy,    good,      male,     Dark Horse Comics
"
hero = read_csv(hero, trim_ws = TRUE, skip = 1)
hero = data.table(hero)

publisher = "
publisher,   yr_founded
DC,              1934
MarvelDuplicate, 1939
MarvelDuplicate, 8888
Image,           1992
"
publisher = read_csv(publisher, trim_ws = TRUE, skip = 1)
publisher = data.table(publisher)

For all the join commands, if the same key is set on both data.tables, then on = 'key_col_name' can be omitted.

...

Plot in R, ggplot2

2016-10-11. Category & Tags: GGplot, GGplot2, R, 3D Plot

Python alternatives to ggplot2: pygg (NOT working) and ggpy (ggplot in py) from yhat (NOT working and no longer maintained). Please use rpy2 to “source()” R files instead.

Note: this post mainly covers data preparation; for the plotting code, see:

3D PLOT #

scatter #

OBS: mind the order of the commands.

surface #

OBS: mind the order of the commands.

...

Notes of Practical Machine Learning (Coursera PML)

2016-06-28. Category & Tags: Practical Machine Learning, PML, R, Notes, Coursera, Johns Hopkins University, Data.table, DT

It has been a long time since I started using R. Recently I found some old notes and wanted to archive them digitally; this blog post serves that purpose.

DT (data.table) #

data.table cheat sheet

These data.table (DT) instructions are also available on my GitHub, inspired by this ref.

//TODO #

//TODO: summarize Solve common R problems efficiently with data.table, which is a must-read. backup
//TODO: summarize High-performance Solution in R
//TODO: check if to summarize Data Analysis in R using data.table
//TODO: Advanced tips and tricks with data.table
//TODO: The official “Getting Started” of DT
//TODO: check tablewrangling.Rmd

...

Cross-Read & -Write R, Py, Matlab, Binary Files

2016-06-01. Category & Tags: R, Binary, Mat, Matlab, Python, NumPy, Pandas

Note: feather-format is designed to transfer data between Py & R [stackoverflow, feather-doc].

.FE #

OBS: feather only supports the data.frame type; plain arrays are not supported.

py (feather-format) #

Requires: pip install feather-format. (OBS: feather-format NOT feather.)

write:

import feather
import pandas as pd

# Example frame with a string-based index.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['one', 'two', 'three'])

# Drop the string index before writing, since feather cannot store it.
feather.write_dataframe(df.reset_index(drop=True), 'df.fe')

(The feather package does the writing, even though the df is created by pandas.)

read:

import feather
df = feather.read_dataframe('df.fe')
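
Since feather only stores data frames, a plain NumPy array has to be wrapped in a DataFrame before it can be written. The following is a minimal sketch of that wrapping step; the helper names array_to_frame and frame_to_array and the column names c0, c1, ... are made up for illustration, and the string column names are there on the assumption that feather rejects non-string names.

```python
import numpy as np
import pandas as pd


def array_to_frame(arr):
    """Wrap a 2-D array in a DataFrame with string column names (hypothetical helper)."""
    arr = np.asarray(arr)
    if arr.ndim != 2:
        raise ValueError("expected a 2-D array")
    # Assumption: feather wants string column names, so generate c0, c1, ...
    columns = [f"c{i}" for i in range(arr.shape[1])]
    return pd.DataFrame(arr, columns=columns)


def frame_to_array(df):
    """Turn the DataFrame back into a plain NumPy array (hypothetical helper)."""
    return df.to_numpy()


arr = np.arange(6).reshape(2, 3)
frame = array_to_frame(arr)        # this frame can be passed to feather.write_dataframe
restored = frame_to_array(frame)   # after reading, convert back to an array
```

The resulting frame can then be written with feather.write_dataframe(frame, 'arr.fe') as shown above, where 'arr.fe' is just an example file name.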

py-pandas (canNOT read) #

write: df.reset_index(drop=True).to_feather('df.fe'). Note that feather canNOT handle a string-based index; an alternative is drop=False, which turns the index into a regular column.
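
The difference between the two reset_index variants can be sketched as follows, reusing the example frame from the write section. The variable names dropped and kept are only illustrative, and the to_feather call is left as a comment because it needs a feather backend (pyarrow, as far as I know) to be installed.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    index=["one", "two", "three"],
)

# drop=True discards the string index; only A, B, C remain.
dropped = df.reset_index(drop=True)

# drop=False moves the index into a regular column (named "index" by default),
# so the row labels are kept in the file.
kept = df.reset_index(drop=False)

print(list(dropped.columns))  # ['A', 'B', 'C']
print(list(kept.columns))     # ['index', 'A', 'B', 'C']

# kept.to_feather("df.fe")    # write step; requires a feather backend
```

After reading the file back, kept.set_index('index') restores the labels as the index.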

...