Zhejun Zheng MarioZZJ
MarioZZJ / flatten-openalex-jsonl.py
Last active November 12, 2022 07:30 — forked from richard-orr/flatten-openalex-jsonl.py
Flatten OpenAlex JSON Lines files into TSV files readable by Spark
# python 3.8+ required
import csv
import glob
import gzip
import json
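import os

# Directory containing the downloaded OpenAlex snapshot (input)
SNAPSHOT_DIR = 'openalex-snapshot'
# Directory for the flattened output files (TSV, despite the variable name)
CSV_DIR = 'tsv-files'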