@alexott
Created January 11, 2019 16:02
Docker image for Zeppelin + DSE
# Build an image with Zeppelin 0.8 downloaded from the Apache servers
docker build .
# You can build a custom version by specifying Z_URL_PREFIX (if you have already downloaded Zeppelin), DSE_VERSION, and Z_VERSION, like this:
docker build --build-arg Z_URL_PREFIX=http://local-ip --build-arg DSE_VERSION=6.0.4 .
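When overriding Z_URL_PREFIX, it can help to check which URL the build will actually fetch, since the tarball has to be served under exactly that name. The sketch below mirrors how the Dockerfile composes Z_DOWNLOAD_URL from its ARG values; the function name `zeppelin_download_url` is hypothetical and only illustrates the string composition, it is not part of the image.

```shell
# Hypothetical helper: prints the URL that the Dockerfile's wget step would fetch.
# $1 = Zeppelin version (defaults to 0.8.0, the Dockerfile's Z_VERSION default)
# $2 = URL prefix (defaults to the Apache archive path for that version)
zeppelin_download_url() {
    z_version="${1:-0.8.0}"
    z_url_prefix="${2:-http://archive.apache.org/dist/zeppelin/zeppelin-${z_version}}"
    echo "${z_url_prefix}/zeppelin-${z_version}-bin-all.tgz"
}

# Default build: tarball comes from the Apache archive.
zeppelin_download_url

# Custom prefix, as in the --build-arg example above.
zeppelin_download_url 0.8.0 http://local-ip
```

With the custom prefix, the local server is therefore expected to serve the file as `zeppelin-0.8.0-bin-all.tgz` directly under the prefix.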
ARG DSE_VERSION=6.7.0
FROM datastax/dse-server:${DSE_VERSION}
ARG Z_VERSION="0.8.0"
ENV Z_HOME="/zeppelin"
ARG Z_URL_PREFIX=http://archive.apache.org/dist/zeppelin/zeppelin-${Z_VERSION}
ARG Z_TARBALL=zeppelin-${Z_VERSION}-bin-all.tgz
ARG Z_DOWNLOAD_URL=${Z_URL_PREFIX}/${Z_TARBALL}
USER root
RUN echo "Download Zeppelin binary" && \
    wget -O /tmp/${Z_TARBALL} ${Z_DOWNLOAD_URL} && \
    cd / && tar -zxvf /tmp/${Z_TARBALL} && \
    rm -rf /tmp/${Z_TARBALL} && \
    mv /zeppelin-${Z_VERSION}-bin-all ${Z_HOME} && \
    chown -R dse ${Z_HOME}
EXPOSE 8080
USER dse
WORKDIR ${Z_HOME}
ENTRYPOINT [ "dse", "exec", "bin/zeppelin.sh" ]
After Zeppelin is started, go to "Interpreter", select the "spark" interpreter, and do the following:
1. Find Analytics master IP by executing "dsetool status" on the DSE cluster node - it will be in the first line, right
2. Change "master" property to "dse://<Analytics-Master-IP>?"
3. Add property "spark.cassandra.connection.host" with the same IP
4. Save settings
After that, you can create a Spark notebook, and its jobs will be shown in the DSE Analytics Spark Master.
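Both interpreter properties in steps 2 and 3 are derived from the same Analytics master IP, so it may be convenient to print them together before pasting them into the Zeppelin UI. This is only a sketch: the function name `spark_interpreter_props` and the address `10.0.0.5` are placeholders, and the IP still has to be looked up with "dsetool status" as described in step 1.

```shell
# Hypothetical helper: prints the two "spark" interpreter properties
# for a given Analytics master IP, in name=value form.
spark_interpreter_props() {
    master_ip="$1"
    printf 'master=dse://%s?\n' "${master_ip}"
    printf 'spark.cassandra.connection.host=%s\n' "${master_ip}"
}

# 10.0.0.5 is a placeholder; substitute the IP reported by "dsetool status".
spark_interpreter_props 10.0.0.5
```

The values are still entered by hand on the "Interpreter" page; the helper only keeps the two properties consistent with each other.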