import os
from datetime import timedelta
from airflow import DAG
from airflow.contrib.operators.emr_add_steps_operator import EmrAddStepsOperator
from airflow.contrib.operators.emr_create_job_flow_operator import EmrCreateJobFlowOperator
from airflow.contrib.sensors.emr_step_sensor import EmrStepSensor
from airflow.utils.dates import days_ago
DAG_ID = os.path.basename(__file__).replace('.py', '')
curl -s https://gist.githubusercontent.com/wongcyrus/a4e726b961260395efa7811cab0b4516/raw/6a045f51acb2338bb2149024a28621db2abfcaab/resize.sh | bash /dev/stdin 60
- hosts: all
  gather_facts: True
  become: True
  roles:
    - role: fluentbit
      fluentbit_inputs:
        - systemd:
            - Tag: docker
            - Systemd_Filter: _SYSTEMD_UNIT=docker.service
        - cpu:
fopina / handlers_main.yml
Created July 9, 2019 23:18
fluentbit ansible role
---
- name: Restart Fluentbit
  service:
    name: td-agent-bit
    enabled: true
    state: restarted
zodvik / benchmark-commands.txt
Last active December 18, 2022 12:45
Kafka (1.0.0) Benchmark Commands
Producer
Setup
bin/kafka-topics.sh --zookeeper localhost:2181/kafka-local --create --topic test-rep-one --partitions 6 --replication-factor 1
bin/kafka-topics.sh --zookeeper localhost:2181/kafka-local --create --topic test-rep-two --partitions 6 --replication-factor 3
Single thread, no replication
bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --print-metrics --topic test-rep-one --num-records 6000000 --throughput 100000 --record-size 100 --producer-props bootstrap.servers=kafka_host:9092 buffer.memory=67108864 batch.size=8196
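The flags in the command above bound what the run can report. A rough sketch of that arithmetic, assuming --throughput acts as an upper limit on records per second and that every record is exactly --record-size bytes (the function name is made up for illustration):

```python
# Back-of-the-envelope bounds for the ProducerPerformance run above.
NUM_RECORDS = 6_000_000        # --num-records
RECORD_SIZE_BYTES = 100        # --record-size
MAX_RECORDS_PER_SEC = 100_000  # --throughput, treated here as an upper bound


def benchmark_bounds(num_records, record_size, max_rate):
    """Return (total_bytes, min_duration_s, max_bytes_per_s) for a throttled run."""
    total_bytes = num_records * record_size
    min_duration_s = num_records / max_rate
    max_bytes_per_s = max_rate * record_size
    return total_bytes, min_duration_s, max_bytes_per_s


total, duration, rate = benchmark_bounds(
    NUM_RECORDS, RECORD_SIZE_BYTES, MAX_RECORDS_PER_SEC
)
print(f"payload: {total / 1e6:.0f} MB, at least {duration:.0f} s, at most {rate / 1e6:.0f} MB/s")
```

If the tool reports noticeably less than the computed ceiling, the producer or broker is the bottleneck rather than the throttle.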
Single-thread, async 3x replication
jrepp / pywin32service.py
Created December 8, 2017 23:09
Python windows service
# from https://stackoverflow.com/questions/32404/is-it-possible-to-run-a-python-script-as-a-service-in-windows-if-possible-how
import win32serviceutil
import win32service
import win32event
import servicemanager
import socket
class AppServerSvc(win32serviceutil.ServiceFramework):
    _svc_name_ = "TestService"
msfidelis / main.tf
Last active November 30, 2017 03:36
provider "aws" {
  region = "${var.region}"
}
resource "aws_launch_configuration" "webcluster" {
  image_id        = "${var.ami}"
  instance_type   = "${var.instance_type}"
  security_groups = ["${aws_security_group.websg.id}"]
  key_name        = "${aws_key_pair.myawskeypair.key_name}"
  user_data       = "${file("user-data/bootstrap.sh")}"
marwei / how_to_reset_kafka_consumer_group_offset.md
Created November 9, 2017 23:39
How to Reset Kafka Consumer Group Offset

Kafka 0.11.0.0 (Confluent 3.3.0) added support for manipulating a consumer group's offsets via the kafka-consumer-groups CLI command.

  1. List the topics to which the group is subscribed
kafka-consumer-groups --bootstrap-server <kafkahost:port> --group <group_id> --describe

Note the values under "CURRENT-OFFSET" and "LOG-END-OFFSET". "CURRENT-OFFSET" is the offset the consumer group has reached in each partition; "LOG-END-OFFSET" is the offset of the latest message in that partition.

  2. Reset the consumer offset for a topic (preview)
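A minimal sketch of how the preview and the real reset differ, written as a small Python helper that only builds the argument list. The helper name, host, group, and topic are hypothetical. `--to-earliest` is just one of the available reset strategies and is chosen here as an assumption. `--dry-run` shows the planned offsets without changing anything, while `--execute` applies them.

```python
import shlex


def reset_offsets_command(bootstrap_server, group, topic, execute=False):
    """Build the kafka-consumer-groups argument list for a reset to the earliest offset.

    With execute=False the command only previews the new offsets (--dry-run);
    with execute=True it applies them (--execute).
    """
    args = [
        "kafka-consumer-groups",
        "--bootstrap-server", bootstrap_server,
        "--group", group,
        "--topic", topic,
        "--reset-offsets",
        "--to-earliest",  # assumed strategy, picked for illustration
    ]
    args.append("--execute" if execute else "--dry-run")
    return args


# Hypothetical values, for illustration only.
preview = reset_offsets_command("localhost:9092", "example-group", "example-topic")
print(shlex.join(preview))
```

The returned list can be passed to `subprocess.run` as-is. The consumer group has to be inactive for a reset to be applied.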
rcillo / attach-eni.py
Last active May 30, 2019 15:13
This gist contains code that attaches an ENI to a running EC2 instance and configures the network accordingly
# -*- coding: utf-8 -*-
"""
The MIT License (MIT)
Copyright (c) 2015 Zalando SE
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
#!/usr/bin/env python3
'''
This script sets a retention on all the logGroups that don't have one set
'''
import boto3
import logging
# Retention period, in days, applied to log groups that have none set
RETENTION = 30
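
A minimal sketch of what the docstring describes, not the author's code. It assumes a boto3 CloudWatch Logs client is passed in, pages through `describe_log_groups`, and calls `put_retention_policy` for every group that has no `retentionInDays` key. The function name is hypothetical.

```python
def set_missing_retention(logs_client, retention_days=30):
    """Apply retention_days to every log group that has no retention policy.

    logs_client is expected to behave like boto3.client("logs").
    Returns the names of the log groups that were updated.
    """
    updated = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate():
        for group in page["logGroups"]:
            # Log groups without a policy carry no "retentionInDays" key.
            if "retentionInDays" in group:
                continue
            logs_client.put_retention_policy(
                logGroupName=group["logGroupName"],
                retentionInDays=retention_days,
            )
            updated.append(group["logGroupName"])
    return updated
```

With AWS credentials configured, it could be called as `set_missing_retention(boto3.client("logs"), RETENTION)`. Taking the client as a parameter keeps the function testable with a stub.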