@nihil0
nihil0 / post.md
Last active July 30, 2024 09:34
Deploying Databricks Workflows with Serverless Compute using Terraform

Introduction

Serverless compute for workflows lets you run your Databricks jobs without configuring or deploying infrastructure, so you can focus on implementing your data processing and analysis pipelines. Databricks manages the compute resources, including optimizing and scaling them for your workloads. Autoscaling and Photon are enabled automatically, which helps keep resource utilization efficient.

Additionally, serverless compute for workflows features auto-optimization, which selects the appropriate resources such as instance types, memory, and processing engines based on your workload. It also automatically retries failed jobs, ensuring smooth and efficient execution of your data workflows.

On 15.7.2024, [Serverless Compute for Notebooks, Workflows, and Delta Live Tables went into GA](https://www.databricks.com/blog/announcing-general-availability-serverless-compute-notebo

@wriglz
wriglz / osm_extract.sql
Created October 3, 2022 16:07
SQL to query OSM data on Google BigQuery
SELECT
osm_id,
feature_type,
osm_timestamp,
geometry,
-- Here we are going to extract a couple of attributes from the all_tags array:
(SELECT value FROM UNNEST(all_tags) WHERE key = 'name' ) AS name,
(SELECT value FROM UNNEST(all_tags) WHERE key = 'addr:city') AS city
FROM
bigquery-public-data.geo_openstreetmap.planet_features
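As a rough illustration of what the two scalar subqueries over `all_tags` do, the lookup can be mimicked in plain Python. This is only a sketch: it assumes `all_tags` behaves like a list of key/value records, and the helper name `tag_value` and the sample row are hypothetical, made up for the example.

```python
from typing import Optional


def tag_value(all_tags: list, key: str) -> Optional[str]:
    """Return the value of the first tag whose key matches, or None if absent."""
    for tag in all_tags:
        if tag["key"] == key:
            return tag["value"]
    return None


# Hypothetical row shaped like the all_tags column (invented data, not real OSM output)
row = {
    "all_tags": [
        {"key": "name", "value": "Example Cafe"},
        {"key": "amenity", "value": "cafe"},
    ]
}

name = tag_value(row["all_tags"], "name")       # tag is present
city = tag_value(row["all_tags"], "addr:city")  # tag is missing, so this is None
print(name, city)
```

One difference worth noting: a SQL scalar subquery fails if more than one element matches the key, whereas this helper simply returns the first match.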
@chriswhong
chriswhong / README.md
Created February 28, 2020 17:55
Using ogr2ogr to load a CSV into Postgres

The absolute easiest way to get a CSV into a PostgreSQL table is to use ogr2ogr with AUTODETECT_TYPE=YES.

I learned a while back that this is what CartoDB uses to import your CSV into PostGIS (with a lot of other parameters added).

ogr2ogr -f PostgreSQL PG:"host=localhost user=postgres dbname=postgres password=password"  docs.csv -oo AUTODETECT_TYPE=YES
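If the import needs to be scripted, the same command can be assembled from Python. The following is a minimal sketch, assuming GDAL's `ogr2ogr` is on the PATH; the helper names `build_ogr2ogr_args` and `load_csv` are hypothetical, and the connection string is the placeholder from the command above.

```python
import shlex
import subprocess


def build_ogr2ogr_args(csv_path: str, pg_conn: str) -> list:
    """Build the ogr2ogr argument list for loading a CSV into PostgreSQL."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",
        f"PG:{pg_conn}",              # destination datasource
        csv_path,                     # source CSV
        "-oo", "AUTODETECT_TYPE=YES", # let the CSV driver guess column types
    ]


def load_csv(csv_path: str, pg_conn: str) -> None:
    """Run the import; raises CalledProcessError if ogr2ogr exits non-zero."""
    subprocess.run(build_ogr2ogr_args(csv_path, pg_conn), check=True)


args = build_ogr2ogr_args(
    "docs.csv",
    "host=localhost user=postgres dbname=postgres password=password",
)
# Print the shell-quoted command for inspection instead of running it
print(shlex.join(args))
```

Passing the arguments as a list avoids shell quoting problems with the spaces inside the connection string.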
@Guts
Guts / qgis_deploy_install_upgrade_ltr.ps1
Last active February 23, 2024 19:19
Use the OSGeo4W installer's command-line options in a real-life example: downloading and installing the QGIS LTR full meta-package
#Requires -RunAsAdministrator
<#
.Synopsis
Download the OSGeo4W installer then download and install QGIS LTR (through the 'full' meta-package).
.DESCRIPTION
This script will:
1. change the current directory to the user downloads folder
2. download the OSGeo4W installer
3. launch it passing command-line parameters to DOWNLOAD packages required for QGIS LTR FULL
@ghandic
ghandic / pandas_s3.py
Last active June 5, 2023 11:40
Load csv from S3 directly into memory and write to S3 directly from memory by extending pd.DataFrame class
import boto3
import pandas as pd
from io import StringIO
class S3DataFrame(pd.DataFrame):
"""
# Make a dataframe and upload it as csv
s3df = S3DataFrame({'h1':[1], 'h2':[2]})
s3df.to_s3(Bucket='bucket-name',
@reyemtm
reyemtm / index.html
Last active December 5, 2023 02:03
Leaflet Vector Grid with Interactivity
<!DOCTYPE html>
<html>
<head>
<meta charset=utf-8 />
<title>Leaflet Vector Grid</title>
<meta name='viewport' content='initial-scale=1,maximum-scale=1,user-scalable=no' />
<link rel="stylesheet" href="https://unpkg.com/leaflet@1.3.1/dist/leaflet.css" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/leaflet/1.3.1/leaflet.js"></script>
<script src="https://unpkg.com/leaflet.vectorgrid@latest/dist/Leaflet.VectorGrid.bundled.js"></script>
@iacovlev-pavel
iacovlev-pavel / geoserver-install.sh
Last active May 8, 2022 06:55
Install GeoServer on Ubuntu 18.04
# Java runtime (OpenJDK 8 JRE)
apt-get install openjdk-8-jre
# PostgreSQL and PostGIS
apt-get install postgresql postgresql-contrib postgis postgresql-10-postgis-2.4
# Create "geoserver" database
sudo -u postgres createuser -P geoserver
sudo -u postgres createdb -O geoserver geoserver
sudo -u postgres psql -c "CREATE EXTENSION postgis; CREATE EXTENSION postgis_topology;" geoserver
@lukeplausin
lukeplausin / bash_aws_jq_cheatsheet.sh
Last active August 29, 2024 20:27
AWS, JQ and bash command cheat sheet. How to query, cut and munge things in JSON generally.
# Count total EBS based storage in AWS
aws ec2 describe-volumes | jq "[.Volumes[].Size] | add"
# Count total EBS storage with a tag filter
aws ec2 describe-volumes --filters "Name=tag:Name,Values=CloudEndure Volume qjenc" | jq "[.Volumes[].Size] | add"
# Describe instances concisely
aws ec2 describe-instances | jq '[.Reservations | .[] | .Instances | .[] | {InstanceId: .InstanceId, State: .State, SubnetId: .SubnetId, VpcId: .VpcId, Name: (.Tags[]|select(.Key=="Name")|.Value)}]'
# Wait until $instance_id is running and then immediately stop it again
aws ec2 wait instance-running --instance-id $instance_id && aws ec2 stop-instances --instance-id $instance_id
# Get 10th instance in the account
@drmalex07
drmalex07 / convert-geojson-to-wkt.py
Created May 12, 2014 22:13
Convert GeoJSON to/from WKT in Python. #python #geojson #geometry
import json
import geojson
from shapely.geometry import shape
o = {
"coordinates": [[[23.314208, 37.768469], [24.039306, 37.768469], [24.039306, 38.214372], [23.314208, 38.214372], [23.314208, 37.768469]]],
"type": "Polygon"
}
s = json.dumps(o)
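
The preview above relies on shapely for the conversion. For intuition about what a GeoJSON-to-WKT conversion produces, here is a dependency-free sketch that handles only the Polygon case. It is a swapped-in illustration, not the gist's method, and the function name `polygon_to_wkt` is hypothetical.

```python
def polygon_to_wkt(geom: dict) -> str:
    """Convert a GeoJSON Polygon mapping to a WKT string (Polygon only)."""
    if geom.get("type") != "Polygon":
        raise ValueError("only Polygon geometries are handled in this sketch")
    rings = []
    for ring in geom["coordinates"]:
        # Each position is [x, y]; WKT separates x and y with a space
        points = ", ".join(f"{x} {y}" for x, y in ring)
        rings.append(f"({points})")
    return "POLYGON (" + ", ".join(rings) + ")"


# Same polygon as in the snippet above
o = {
    "coordinates": [[[23.314208, 37.768469], [24.039306, 37.768469],
                     [24.039306, 38.214372], [23.314208, 38.214372],
                     [23.314208, 37.768469]]],
    "type": "Polygon",
}
wkt = polygon_to_wkt(o)
print(wkt)
```

For anything beyond simple polygons (multi-geometries, 3D coordinates, validation), a library such as shapely remains the safer choice.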
@graydon
graydon / country-bounding-boxes.py
Created April 23, 2014 00:03
country bounding boxes
# extracted from http://www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip
# under public domain terms
country_bounding_boxes = {
'AF': ('Afghanistan', (60.5284298033, 29.318572496, 75.1580277851, 38.4862816432)),
'AO': ('Angola', (11.6400960629, -17.9306364885, 24.0799052263, -4.43802336998)),
'AL': ('Albania', (19.3044861183, 39.624997667, 21.0200403175, 42.6882473822)),
'AE': ('United Arab Emirates', (51.5795186705, 22.4969475367, 56.3968473651, 26.055464179)),
'AR': ('Argentina', (-73.4154357571, -55.25, -53.628348965, -21.8323104794)),
'AM': ('Armenia', (43.5827458026, 38.7412014837, 46.5057198423, 41.2481285671)),