Jesús Amieiro Becerra amieiro

tamuhey /
Last active July 27, 2024 14:46
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

Natural Language Processing (NLP) has made great progress in recent years thanks to neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially for tokenization. In this article, I describe an algorithm that simplifies one such process: calculating the correspondence between tokens (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm. Here are the library and the demo site links:

kuznero /
Last active August 10, 2024 23:23
How to compact VirtualBox's VDI file size?

Source: StackOverflow

1. Run defrag in the guest (Windows only)

2. Nullify free space:

With a Linux Guest run this:

sudo dd if=/dev/zero | pv | sudo dd of=/bigemptyfile bs=4096k
sudo rm -rf /bigemptyfile
petertwise /
Last active January 2, 2024 20:17
Customized WordPress install script with WP CLI
# Install Wordpress with Square Candy default using WP CLI
read -p 'Site URL ( ' url
read -p 'Site Title: ' title
read -p 'WP Admin username: ' admin_user
read -sp 'WP Admin password: ' admin_password
read -p '
WP Admin email: ' admin_email
read -p 'Database name: ' dbname
read -p 'Database user: ' dbuser
kasperhartwich / delete-from-slack.php
Last active January 26, 2021 08:10
Script to delete old files from Slack
#!/usr/bin/env php
if (count($argv)<2) {
echo $argv[0] . ' <token> <until>' . PHP_EOL;
echo 'Example: ' . $argv[0] . ' abcd-12345678-123456789-12345 \'-3 months\'' . PHP_EOL;
(2014) Main source ->
I just managed to sniff Instagram traffic and fixed the code
-- Have fun - - Batuhan Katırcı
--- for your questions, comment @
nocturnalgeek / MailinatorAliases
Last active August 21, 2024 01:58
A list of alternate domains that point to
SeanPONeil / .bash_prompt
Created September 13, 2012 20:05
Sexy Solarized Bash Prompt, inspired by "Extravagant Zsh Prompt"
# Sexy Solarized Bash Prompt, inspired by "Extravagant Zsh Prompt"
# Customized for the Solarized color scheme by Sean O'Neil
if [[ $COLORTERM = gnome-* && $TERM = xterm ]] && infocmp gnome-256color >/dev/null 2>&1; then TERM=gnome-256color; fi
if tput setaf 1 &> /dev/null; then
tput sgr0
if [[ $(tput colors) -ge 256 ]] 2>/dev/null; then
BASE03=$(tput setaf 234)
BASE02=$(tput setaf 235)
BASE01=$(tput setaf 240)
heldr / addfont.cmd
Last active March 25, 2024 14:14
add windows fonts by command line
TITLE Adding Fonts..
REM Filename: ADD_Fonts.cmd
REM Script to ADD TrueType and OpenType Fonts for Windows
REM By Islam Adel
REM 2012-01-16
REM How to use:
kraft001 / solarized.bash
Created June 8, 2012 05:47
solarized Gnome Terminal + Tmux + Vim
# store all solarized files in one place
mkdir ~/.solarized
cd ~/.solarized
git clone
eval `dircolors ~/.solarized/dircolors-solarized/dircolors.256dark`
ln -s ~/.solarized/dircolors-solarized/dircolors.256dark ~/.dir_colors
git clone