Last active
June 14, 2020 20:59
Strip output of Jupyter notebooks on git add
[filter "strip_ipynb"]
    clean = python3 strip_ipynb.py
*.ipynb filter=strip_ipynb
#! /usr/bin/env python
# this script filters output from ipython notebooks, for use in git repos
# http://stackoverflow.com/questions/18734739/using-ipython-notebooks-under-version-control
#
# put this file in a `bin` directory in your home directory, then run the following commands:
#
#    chmod a+x ~/bin/ipynb_output_filter.py
#    echo -e "*.ipynb \t filter=dropoutput_ipynb" >> ~/.gitattributes
#    git config --global core.attributesfile ~/.gitattributes
#    git config --global filter.dropoutput_ipynb.clean ~/bin/ipynb_output_filter.py
#    git config --global filter.dropoutput_ipynb.smudge cat

# works with Notebook versions 3, 4 and 5 (iPython/Jupyter versions 2, 3 and 4)
import sys
from nbformat import read, write, NO_CONVERT

json_in = read(sys.stdin, NO_CONVERT)

# detect earlier versions
if ('worksheets' in json_in):
    # versions prior to 4 had a 'worksheets' field with a single element
    sheet = json_in.worksheets[0]
else:
    sheet = json_in

for cell in sheet.cells:
    if "outputs" in cell:
        cell.outputs = []
    if "prompt_number" in cell:
        cell.prompt_number = None
    if "execution_count" in cell:
        cell.execution_count = None

write(json_in, sys.stdout, NO_CONVERT)
This automatically strips the output cells of Jupyter notebooks when files are added to a git repository.
This is useful for keeping generated output out of the repo.
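To make the effect concrete, here is a minimal sketch of the same clearing logic using only the standard `json` module instead of `nbformat`. It assumes the nbformat 4 layout (a top-level `cells` list); the function name `strip_outputs` and the sample notebook are made up for illustration and are not part of the gist.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Clear outputs and execution counts in an nbformat-4 style notebook dict."""
    for cell in notebook.get("cells", []):
        if "outputs" in cell:
            cell["outputs"] = []
        if "execution_count" in cell:
            cell["execution_count"] = None
    return notebook


# A hand-written, minimal notebook used only for illustration.
raw = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "source": ["print('hi')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    ],
})

cleaned = strip_outputs(json.loads(raw))
# The code cell keeps its source but loses its output and execution count.
print(json.dumps(cleaned["cells"][0], indent=1))
```

The source of each cell is left untouched, so only the parts of the notebook that change on every run are dropped from what git stores.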
Copy the Python script and the .gitattributes file to the repo, set the config, and make the script executable.
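The setup steps could be scripted roughly as follows. This is only a sketch: the helper names (`ensure_gitattributes`, `make_executable`, `register_filter`) are hypothetical, it assumes a POSIX system, and `register_filter` assumes `git` is on the `PATH` and that `repo` is an existing git repository.

```python
import stat
import subprocess
from pathlib import Path

ATTR_LINE = "*.ipynb filter=strip_ipynb"


def ensure_gitattributes(repo: Path) -> None:
    """Append the filter line to .gitattributes unless it is already there."""
    attrs = repo / ".gitattributes"
    lines = attrs.read_text().splitlines() if attrs.exists() else []
    if ATTR_LINE not in lines:
        lines.append(ATTR_LINE)
        attrs.write_text("\n".join(lines) + "\n")


def make_executable(script: Path) -> None:
    """Equivalent of `chmod a+x` on the script."""
    mode = script.stat().st_mode
    script.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def register_filter(repo: Path) -> None:
    """Set the clean filter in the repo-local git config (requires git)."""
    subprocess.run(
        ["git", "config", "filter.strip_ipynb.clean", "python3 strip_ipynb.py"],
        cwd=repo,
        check=True,
    )
```

Calling the three helpers in order on the repository root (with `make_executable` pointed at `strip_ipynb.py`) mirrors the manual steps; `ensure_gitattributes` is safe to run more than once.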
The config might also be set with:
git config filter.strip_ipynb.clean "python3 strip_ipynb.py"

Based on the following:
https://gist.github.com/minrk/6176788
https://gist.github.com/waylonflinn/010f0a1a66760adf914f
https://github.com/cfriedline/ipynb_template/blob/master/nbstripout
https://github.com/kynan/nbstripout