Hadoop

Hadoop HDFS

2017-06-19. Category & Tags: Default Hadoop, HDFS

(Update: tested v2.7.2 on Ubuntu 18.)

Install

OBS: security warning!
Note: change the core-site.xml and hdfs-site.xml content before running.
Note: change the HDUSER username before running.

# w/ java
curl https://raw.githubusercontent.com/SunnyBingoMe/install-hadoop/master/setup-hadoop | bash

Env & Hd-Structure Config (Multi-Node Only)

Make sure the master node and the hadoop user (e.g. $hduser) can access the worker nodes using passphrase-less ssh keys. If a different user ($hduser) rather than the installer user ($user) is to run hadoop, we need to run setup_profile and setup_environment after su $hduser (bash will report some minor errors, no worries. ...

Spark

2016-11-15. Category & Tags: Spark, Hadoop

Related: The Internals of Apache Spark 2.4.0 | GitBook
For usage after installation, see scale-py chapters 8 & 9.
StreamProcessing comparison: /stream-processing ~~ spark-vs-h2o ~~

The following content was tested on Ubuntu 16 (before 2019) & 18.04 (after 2018). This covers how to install Spark in standalone/YARN/Mesos mode. OBS: assuming username: hpc.

STANDALONE (ONE-SINGLE-NODE)

One-node standalone mode is likely to be our first time trying Spark, so we make the installation as easy as possible. ...