
NLP Framework Comparison

Conclusion

Use spaCy for ready-to-use production work; use NLTK to try new things.

ref en

en backup

in cn

cn backup

Machine Learning ML Books

Favoured

MLAPP (Kevin Murphy, Machine Learning: A Probabilistic Perspective) is more comprehensive, insightful and interesting, and contains more "real" examples/problems. However, the material is presented somewhat out of order, which can make it difficult to follow as a first book.

Recommender System

This is a detailed reproduction of ref.

Sunny Summary

3 steps:

preprocessing.py: preprocessing to extract the author, average sentence length, average word length, punctuation profile, sentiment scores, and part-of-speech profiles/tags (only in code, not written to the CSV).

TFIDF.py: content-wise k-means......
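
The style features listed in the preprocessing step can be sketched roughly as below. This is a minimal, stdlib-only sketch and not the code in preprocessing.py: the function name `stylometric_features`, the regex-based sentence and word splitting, and the per-character normalization of the punctuation profile are all assumptions. Sentiment scores and part-of-speech tags are left out because they need an external NLP library.

```python
import re
import string
from collections import Counter


def stylometric_features(text):
    """Return simple style features for one document (hypothetical helper)."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words are runs of letters/apostrophes; digits and symbols are ignored.
    words = re.findall(r"[A-Za-z']+", text)
    # Count each punctuation mark, then normalize by document length.
    punct = Counter(ch for ch in text if ch in string.punctuation)
    n_chars = max(len(text), 1)
    return {
        "avg_sentence_len": len(words) / max(len(sentences), 1),  # words per sentence
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "punct_profile": {ch: n / n_chars for ch, n in punct.items()},
    }


sample = "Hello, world! This is a test."
features = stylometric_features(sample)
print(features)
```

One row of such features per document (plus the author) could then be written to a CSV with the `csv` module.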

Caffe Installation, Hello World

Note: tested with Ubuntu 16.04.1 using /root; for newer Ubuntu versions (>= 17.04), check here.

Installation & Self-Tests

Use the installation script here.

(Sunny only added the conditional USE_CUDNN=1; the rest is the same as ref. You may want to set USE_CUDNN to 0 if no GPU is used.)

Timing:......

Hadoop HDFS

(Update: tested v2.7.2 on Ubuntu 18)

Install

OBS: security warning!

Note: change the core-site.xml and hdfs-site.xml content before running.

Note: change the HDUSER username before running.
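
For orientation, here is a rough sketch of what the two config files look like. Hadoop site files are a `<configuration>` element holding `<property>` entries with a `<name>` and a `<value>`. The helper `build_site_xml` is hypothetical, and the values below (`hdfs://localhost:9000`, replication `1`) are placeholder assumptions for a single-node setup, not the content shipped with the install script.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Build a Hadoop-style <configuration> document from a dict (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


# Placeholder values (assumptions): adjust host, port and replication to your cluster.
core_site = build_site_xml({"fs.defaultFS": "hdfs://localhost:9000"})
hdfs_site = build_site_xml({"dfs.replication": 1})

print(core_site)
print(hdfs_site)
```

`fs.defaultFS` sets the default file system URI and `dfs.replication` the block replication factor. Check the actual files for any other properties before running.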

# w/ java

curl https://raw.githubusercontent.com/SunnyBingoMe/install-hadoop/master/setup-hadoop | bash

E......

Install R in Ubuntu

UBUNTU 16.04

Option 1:

sudo apt-get install -y r-base r-base-dev libcurl4-openssl-dev libssl-dev build-essential

Option 2 (to get a newer R version):

With a PPA. Ref: DigitalOcean.

sudo apt install apt-transport-https && \

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825......