-
To run Apache Spark on DigitalOcean, first create an Ubuntu LTS droplet.
-
Install Python and Java:
sudo apt update && sudo apt install -y openjdk-8-jdk-headless python3
-
Download the Spark package:
wget https://downloads.apache.org/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz
-
Unpack it and move it to /opt:
tar xvf spark-3.1.2-bin-hadoop3.2.tgz && sudo mv spark-3.1.2-bin-hadoop3.2 /opt/spark
-
Add the environment variables to your profile (single quotes keep $PATH and $SPARK_HOME from expanding at write time), then reload it:
echo 'export SPARK_HOME=/opt/spark' >> ~/.profile
echo 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin' >> ~/.profile
echo 'export PYSPARK_PYTHON=/usr/bin/python3' >> ~/.profile
source ~/.profile
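To confirm the variables are actually visible to Python after sourcing the profile, a quick stdlib-only sanity check can help. `missing_spark_vars` is a hypothetical helper, not part of Spark:

```python
import os

def missing_spark_vars(env=None):
    """Return the names of the profile variables above that are not set.

    Hypothetical helper for this tutorial; checks only the variables we
    just wrote to ~/.profile.
    """
    env = os.environ if env is None else env
    required = ("SPARK_HOME", "PYSPARK_PYTHON")
    return [name for name in required if name not in env]

if __name__ == "__main__":
    missing = missing_spark_vars()
    if missing:
        print("Not set:", ", ".join(missing), "- run `source ~/.profile` first")
    else:
        print("Spark environment variables look good")
```

If anything is reported missing, re-source the profile or log out and back in.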
To use PySpark with Hyperopt and EVOMAN, install pip and the Python libraries:
sudo apt install -y python3-pip
pip3 install numpy pyspark pygame
Hyperopt is the exception: a bugfix it needs has not been released yet, so we install it from source:
git clone https://github.com/hyperopt/hyperopt.git
cd hyperopt/
pip3 install .
cd ..
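With the libraries in place, the search we will later drive from PySpark can be sketched as a plain Hyperopt loop. The objective below is a toy stand-in (minimized at x = 2.0); the real one would evaluate an EVOMAN controller:

```python
# Minimal sketch of a Hyperopt search; `objective` is a hypothetical
# placeholder for the EVOMAN evaluation function.

def objective(x):
    # Toy loss, minimized at x = 2.0.
    return (x - 2.0) ** 2

def run_search(max_evals=50):
    # Imported lazily so this file also loads before hyperopt is installed.
    from hyperopt import fmin, hp, tpe

    return fmin(
        fn=objective,
        space=hp.uniform("x", -5.0, 5.0),
        algo=tpe.suggest,
        max_evals=max_evals,
    )
```

On the droplet, `run_search()` returns a dict of the best parameters found, e.g. an `"x"` value near 2.0 for this toy loss.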
Finally, suppress the ALSA warnings (a headless droplet has no sound card):
-
Create a new file /etc/asound.conf (with sudo) and insert the following:

pcm.!default {
    type plug
    slave.pcm "null"
}
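If you'd rather not touch system config, there is a Python-side alternative, assuming the warnings come from pygame's SDL audio probing (as they typically do on headless servers): point SDL at its dummy drivers before importing pygame.

```python
import os

# Select SDL's dummy audio driver before pygame is imported, so it never
# probes ALSA. SDL_AUDIODRIVER is a standard SDL environment variable.
os.environ["SDL_AUDIODRIVER"] = "dummy"
os.environ["SDL_VIDEODRIVER"] = "dummy"  # also handy on a headless droplet

# import pygame  # import only after the variables are set
```

Both approaches achieve the same thing; the /etc/asound.conf route silences ALSA system-wide, while this one is per-process.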
You're ready to go!