@robcowie
robcowie / spark_notes.md
Last active April 11, 2022 22:30
Apache Spark Notes

Install Apache Spark (OSX)

$ brew install apache-spark

Run the Spark Python shell

A Python shell with a preconfigured SparkContext (available as sc). It is…

@oleewere
oleewere / spark-hdp-install
Last active September 21, 2015 11:59
install-spark-to-ambari-hdp
#!/bin/bash
: ${HDP_VERSION:=2.2.0.0-2041}
: ${SPARK_VERSION:=1.2.0}
: ${SPARK_DIST_PREFIX_VERSION:=2.2.0.0-82}
: ${SPARK_HADOOP_VERSION:=2.6.0}
SPARK_DIST_VERSION=$SPARK_VERSION.$SPARK_DIST_PREFIX_VERSION-bin-$SPARK_HADOOP_VERSION.$HDP_VERSION
SPARK_ASSEMBLY_VERSION=$SPARK_VERSION.$SPARK_DIST_PREFIX_VERSION-hadoop$SPARK_HADOOP_VERSION.$HDP_VERSION
@staltz
staltz / introrx.md
Last active September 24, 2024 19:53
The introduction to Reactive Programming you've been missing
@alexanderlz
alexanderlz / hdfs_list_running_jobs.sh
Created May 20, 2012 12:22
hadoop cli - oneliner to list running jobs with duration and slots usage
hadoop job -list | grep job_ | awk 'BEGIN{FS="\t";OFS=","};{print $1,strftime("%H:%M:%S", (systime()-int($3/1000)),1),"\""$4"\"","\""$6"\""}'
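The one-liner above formats each job's elapsed time with `strftime("%H:%M:%S", ...)`, which treats the duration as a clock time and so wraps around after 24 hours. A minimal Python sketch of the same calculation without the wrap-around follows. It assumes, as the `$3/1000` in the awk script implies, that the third field is the job start time in epoch milliseconds; `format_elapsed` is a hypothetical helper name, not part of the original gist.

```python
import time


def format_elapsed(start_ms, now_s=None):
    """Return elapsed time since start_ms (epoch milliseconds) as HH:MM:SS.

    Hours are not wrapped at 24, so long-running jobs stay readable.
    """
    if now_s is None:
        now_s = time.time()
    # Same arithmetic as the awk script: now minus start time in whole seconds.
    elapsed = max(0, int(now_s) - int(start_ms) // 1000)
    hours, rem = divmod(elapsed, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# A job that started 90000 seconds (25 hours) before "now".
print(format_elapsed(start_ms=1_000_000, now_s=1_000 + 90_000))  # prints 25:00:00
```

Passing `now_s` explicitly keeps the function deterministic for testing; leaving it out uses the current time, as `systime()` does in the awk version.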
ADD JAR s3://<s3-bucket>/jars/hive_contrib-0.5.jar;
CREATE TEMPORARY FUNCTION now as 'com.mt.utils.udf.Now';
CREATE TEMPORARY FUNCTION user_agent_f as 'com.mt.utils.UserAgent';
set hive.merge.mapredfiles=true;
set hive.merge.mapfiles=true;
set hive.merge.size.per.task=500000000;
CREATE EXTERNAL TABLE data
@dln
dln / jmx.scala
Created May 3, 2011 09:45
Reading some JMX MBeans from a Scala script
#!/bin/sh
exec scala "$0" "$@"
!#
import scala.collection.JavaConversions._
import java.lang.management.{ManagementFactory, MemoryMXBean}
import java.net.URI
import javax.management.JMX
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}
sed 's/'`echo -e "\01"`'/,/g' input_file.txt > output_file.csv
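The `sed` one-liner above replaces every `\01` (Ctrl-A) byte with a comma, but it does not quote fields that already contain commas, so such rows gain extra columns. A minimal Python sketch of the same conversion with CSV quoting follows. It assumes line-oriented text input with no embedded newlines inside fields; `ctrl_a_to_csv` is a hypothetical helper name, not part of the original gist.

```python
import csv
import io


def ctrl_a_to_csv(src, dst):
    """Copy rows from a Ctrl-A (\\x01) delimited text stream to a CSV stream."""
    writer = csv.writer(dst, lineterminator="\n")
    for line in src:
        # Split on the Ctrl-A byte; csv.writer quotes any field containing a comma.
        writer.writerow(line.rstrip("\n").split("\x01"))


# In-memory demonstration; real use would pass open file objects instead.
sample = io.StringIO("1\x01hello, world\x01x\n2\x01plain\x01y\n")
out = io.StringIO()
ctrl_a_to_csv(sample, out)
print(out.getvalue(), end="")
# prints:
# 1,"hello, world",x
# 2,plain,y
```

For files on disk, the same function can be called with `open("input_file.txt", newline="")` and `open("output_file.csv", "w", newline="")` as `src` and `dst`.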