Spark Master/Slave: Single-Node Installation

Hi Sparklearners,

This article walks through the steps for setting up a Spark master/slave cluster on a single node. Follow them to get a Spark cluster with one master and multiple slaves (workers) running on your own machine.


1. Download Spark. On Linux, use the wget command below to fetch Spark 1.3.0. This is the source package, which is built in a later step. If you need the latest version, refer to the official Spark site.
$wget http://d3kbcqa49mib13.cloudfront.net/spark-1.3.0.tgz

2. To run Spark on your machine we also need Scala. Download it with the wget command below and wait until the download finishes. If you need the latest version of Scala, visit the official Scala site.
$wget http://www.scala-lang.org/files/archive/scala-2.10.4.tgz

3. Once spark-1.3.0.tgz and scala-2.10.4.tgz are on your machine, untar both files.

$tar -zxvf spark-1.3.0.tgz
$tar -zxvf scala-2.10.4.tgz

After untarring, set the Scala and Spark paths in your .bashrc and source the file. The paths below assume both archives were extracted in /home/bigdata; adjust them to your own location.

$vi ~/.bashrc
export SCALA_HOME=/home/bigdata/scala-2.10.4
export SPARK_HOME=/home/bigdata/spark-1.3.0
export PATH=$HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin:$PATH
$source ~/.bashrc
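Before moving on, it can help to confirm that the exported variables really point at the extracted directories. The following is a minimal sketch; check_home is a hypothetical helper name, not something shipped with Spark or Scala.

```shell
# check_home is a hypothetical helper, not part of Spark or Scala.
# Usage: check_home VARIABLE_NAME
check_home() {
  name=$1
  # Look up the value of the variable whose name was passed in.
  eval "value=\${$name:-}"
  if [ -n "$value" ] && [ -d "$value" ]; then
    echo "$name ok: $value"
  else
    echo "$name is unset or not a directory: $value"
    return 1
  fi
}
```

After sourcing .bashrc, running "check_home SCALA_HOME && check_home SPARK_HOME" reports whether both directories exist.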

Map the master hostname to the machine's IP address in /etc/hosts (replace 10.0.0.7 with your own IP address):

$sudo vi /etc/hosts

10.0.0.7 datadotz_master
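To double-check the mapping, the entry can be read back from the hosts file. This is a rough sketch; lookup_host is a hypothetical helper, and it only inspects the file it is given (by default /etc/hosts), not DNS.

```shell
# lookup_host is a hypothetical helper for illustration.
# Usage: lookup_host NAME [FILE]
# Prints the first IP address mapped to NAME in a hosts-format file.
lookup_host() {
  awk -v name="$1" '
    /^[[:space:]]*#/ { next }   # skip comment lines
    {
      # Field 1 is the IP address; the remaining fields are host names.
      for (i = 2; i <= NF; i++)
        if ($i == name) { print $1; found = 1; exit }
    }
    END { exit found ? 0 : 1 }  # non-zero status when NAME is absent
  ' "${2:-/etc/hosts}"
}
```

With the entry above in place, "lookup_host datadotz_master" should print the IP address you configured. On most Linux systems "getent hosts datadotz_master" gives a similar check through the system resolver.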

Now install git, build Spark with sbt, and create spark-env.sh from its template:

$sudo apt-get install git
$cd spark-1.3.0
$sbt/sbt assembly
$cd conf
$cp spark-env.sh.template spark-env.sh

$vi spark-env.sh

export SCALA_HOME=/home/bigdata/scala-2.10.4
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_INSTANCES=2
export SPARK_WORKER_DIR=/home/bigdata/sparkdata
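With these settings the machine runs two worker instances, and the sketch below assumes each instance is allowed the full SPARK_WORKER_MEMORY, so the total is the product of the two values. total_worker_memory_mb is a hypothetical helper that only understands values ending in g or m.

```shell
# total_worker_memory_mb is a hypothetical helper for illustration.
# Usage: total_worker_memory_mb INSTANCES MEMORY   (MEMORY like 1g or 512m)
# Prints the combined worker memory in megabytes.
total_worker_memory_mb() {
  instances=$1
  mem=$2
  case $mem in
    *g|*G) mb=$(( ${mem%[gG]} * 1024 )) ;;   # gigabytes to megabytes
    *m|*M) mb=${mem%[mM]} ;;                 # already in megabytes
    *) echo "unsupported memory value: $mem" >&2; return 1 ;;
  esac
  echo $(( instances * mb ))
}
```

For the values used here, "total_worker_memory_mb 2 1g" gives the amount of RAM the two workers may claim together, which should fit comfortably within the machine's free memory.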

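Create the slaves file from its template and list the host on which the workers should run, either by hostname (datadotz_master) or by IP address (10.0.0.7):

$cp slaves.template slaves
$vi slaves

datadotz_master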

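Create spark-defaults.conf from its template (Spark reads spark-defaults.conf, not the .template file) and set the master URL:

$cp spark-defaults.conf.template spark-defaults.conf
$vi spark-defaults.conf

spark.master spark://datadotz_master:7077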
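To confirm which master URL is configured, the spark.master line can be read back from the properties file. This is a minimal sketch; get_conf is a hypothetical helper, and it assumes the whitespace-separated "key value" layout shown above.

```shell
# get_conf is a hypothetical helper for illustration.
# Usage: get_conf KEY FILE
# Prints the value of KEY from a whitespace-separated properties file.
get_conf() {
  awk -v key="$1" '
    $1 == key { print $2; found = 1; exit }   # commented lines start with "#", so they never match
    END { exit found ? 0 : 1 }                # non-zero status when KEY is absent
  ' "$2"
}
```

From the conf directory, "get_conf spark.master spark-defaults.conf" should print the URL you entered.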

After completing these configurations, go back to the spark-1.3.0 folder in the terminal and run the following commands to start the master and the slaves:

$cd /home/bigdata/spark-1.3.0

$sbin/start-master.sh
$sbin/start-slaves.sh

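Check the running Java processes with jps. With SPARK_WORKER_INSTANCES=2 the output should list one Master and two Worker processes, each preceded by its process ID:

$jps

Master
Worker
Worker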
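The same check can be scripted. The sketch below assumes jps prints one "<pid> <name>" pair per line and that the Spark daemons appear as Master and Worker; count_daemons is a hypothetical helper name.

```shell
# count_daemons is a hypothetical helper for illustration.
# Usage: jps | count_daemons NAME
# Reads "<pid> <name>" lines on stdin and prints how many have the given name.
count_daemons() {
  awk -v name="$1" '$2 == name { n++ } END { print n + 0 }'
}
```

For the configuration in this article, "jps | count_daemons Worker" is expected to report two workers and "jps | count_daemons Master" one master.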

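Finally, open the master web UI in a browser at http://localhost:8080 to confirm that the workers have registered with the master.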

———————————-

Article written by DataDotz Team

DataDotz is a Chennai-based BigData team primarily focused on consulting and training on technologies such as Apache Hadoop, Apache Spark, NoSQL (HBase, Cassandra, MongoDB), Search and Cloud Computing.

Note: DataDotz also provides classroom-based Apache Kafka training in Chennai. The course includes Cassandra, MongoDB, Scala and Apache Spark training. For more details on Apache Spark training in Chennai, please visit http://datadotz.com/training/