Hadoop

Hadoop HDFS

2017-06-19. Category & Tags: Default Hadoop, HDFS

(Updated: tested with Hadoop 3.1.3 on Ubuntu 18.04.)

Install

    sudo mkdir -p /usr/lib/jvm
    sudo tar -zxvf ./jdk-8u162-x64.tar.gz -C /usr/lib/jvm
    vim ~/.bashrc

    # add these lines to ~/.bashrc
    export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_162
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH

    # new terminal
    java -version
    echo $JAVA_HOME

    sudo tar -zxf hadoop-3.1.3.tar.gz -C /usr/local
    sudo mv /usr/local/hadoop-3.1.3/ /usr/local/hadoop
    sudo chown -R hadoop:hadoop /usr/local/hadoop
    /usr/local/hadoop/bin/hadoop version

Config (pseudo-distributed)

    cd /usr/local/hadoop

Note: once the configuration below is added to core-site.xml, stand-alone (single-machine) mode may no longer work?

    gedit ./etc/hadoop/core-site.xml

    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Abase for other temporary directories. ...
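The JDK part of the install steps is easy to get wrong (a mistyped extraction directory leaves JAVA_HOME pointing at nothing), so a small check can help before running Hadoop. The following is a minimal sketch, not part of Hadoop or the JDK: `check_java_home` is a hypothetical helper name, and it only assumes that a working JDK has an executable `bin/java` under its home directory.

```shell
# check_java_home: hypothetical helper, not shipped with Hadoop or the JDK.
# Verifies that a JDK home directory contains an executable bin/java.
# Usage: check_java_home [DIR]   (DIR defaults to $JAVA_HOME)
check_java_home() {
    # Take the directory from the first argument, else from $JAVA_HOME.
    home="${1:-${JAVA_HOME:-}}"
    if [ -z "$home" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$home/bin/java" ]; then
        echo "no executable java under $home/bin"
        return 1
    fi
    echo "ok: $home"
}
```

After opening a new terminal, running `check_java_home` with no argument checks the value exported in ~/.bashrc; passing a path such as /usr/lib/jvm/jdk1.8.0_162 checks that directory directly.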

Spark

2016-11-15. Category & Tags: Spark, Hadoop

Related: The Internals of Apache Spark 2.4.0 | GitBook

For usage after installation, see scale-py chapters 8 & 9.

StreamProcessing comparison: /stream-processing ~~ spark-vs-h2o ~~

The following content was tested on Ubuntu 16 (before 2019) and 18.04 (after 2018). It covers how to install Spark in standalone/YARN/Mesos mode.

Note: the username is assumed to be hpc.

STANDALONE (ONE-SINGLE-NODE)

One-node standalone mode is likely to be our first time trying Spark, so we make the installation as easy as possible. ...