A few pointers for dealing with files on the filesystem from inside Python:
- don't modify sys.path
- don't use relative paths unless they are relative to something known, such as a module's location
- always use os.path.join
- don't rely on the environment
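
To illustrate the pointers about relative paths and os.path.join, here is a minimal sketch. It shows that a bare relative path changes meaning when the working directory changes, while a path joined onto a fixed anchor does not. The scratch directories and the file created in them are made up for the demonstration.

```python
import os
import shutil
import tempfile

# Hypothetical setup: two scratch directories, only one of which holds the file.
anchor = os.path.realpath(tempfile.mkdtemp())
elsewhere = os.path.realpath(tempfile.mkdtemp())
with open(os.path.join(anchor, 'some_file.csv'), 'w') as f:
    f.write('a,b\n1,2\n')

relative = 'some_file.csv'                        # relative to whatever the cwd happens to be
anchored = os.path.join(anchor, 'some_file.csv')  # relative to a known location

original_cwd = os.getcwd()
try:
    os.chdir(anchor)
    found_from_anchor = os.path.exists(relative)     # resolves against the cwd
    os.chdir(elsewhere)
    found_from_elsewhere = os.path.exists(relative)  # same string, different location
    anchored_still_found = os.path.exists(anchored)  # unaffected by the cwd
finally:
    # Restore the working directory and remove the scratch directories.
    os.chdir(original_cwd)
    shutil.rmtree(anchor)
    shutil.rmtree(elsewhere)
```

The relative path is only found while the process happens to sit in the right directory; the anchored path is found from anywhere.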
Now for a pattern that I strongly suggest:
Start with a code tree where your data files are set aside but still live inside the tree:
repo_root
|- clover/
|  |- __init__.py
|  |- some_package/
|     |- __init__.py
|     |- some_code.py
|     |- data/
|        |- some_file.csv
In this case, our data file (some_file.csv) is inside the data directory of our package (some_package).
In code that wants to access some_file.csv, you should retrieve the data directory path relative to some_package. You can do this via import and the __file__ attribute:
    import os

    import clover.some_package

    # Directory holding the data files, resolved relative to the package itself
    DATA_DIR = os.path.join(
        os.path.dirname(clover.some_package.__file__),
        'data')

    with open(os.path.join(DATA_DIR, 'some_file.csv')) as some_file:
        # do something with your csv file
        contents = some_file.read()
This works because clover.some_package.__file__ gives you the path to clover/some_package/__init__.py regardless of which directory you started Python from. Now we can use os.path.join to create a constant (DATA_DIR) with the location of our data, which we can use wherever we want.
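
If you want to try the pattern without the clover tree, the following is a rough, self-contained sketch. It builds a throwaway package in a temporary directory (demo_package is a hypothetical stand-in for clover/some_package), loads it from its file with importlib so that sys.path stays untouched, changes the working directory, and then builds DATA_DIR from __file__ as described above.

```python
import importlib.util
import os
import shutil
import tempfile

# Hypothetical throwaway package standing in for clover/some_package.
root = os.path.realpath(tempfile.mkdtemp())
pkg_dir = os.path.join(root, 'demo_package')
os.makedirs(os.path.join(pkg_dir, 'data'))
init_path = os.path.join(pkg_dir, '__init__.py')
with open(init_path, 'w') as f:
    f.write('')
with open(os.path.join(pkg_dir, 'data', 'some_file.csv'), 'w') as f:
    f.write('a,b\n1,2\n')

# Load the package directly from its file so the demo does not modify sys.path.
spec = importlib.util.spec_from_file_location('demo_package', init_path)
demo_package = importlib.util.module_from_spec(spec)
spec.loader.exec_module(demo_package)

original_cwd = os.getcwd()
try:
    # Move somewhere unrelated before building the path.
    os.chdir(tempfile.gettempdir())
    DATA_DIR = os.path.join(os.path.dirname(demo_package.__file__), 'data')
    with open(os.path.join(DATA_DIR, 'some_file.csv')) as some_file:
        first_line = some_file.readline().strip()
finally:
    # Restore the working directory and remove the throwaway package.
    os.chdir(original_cwd)
    shutil.rmtree(root)
```

In a real project you would simply import the package as shown earlier; the importlib loading here exists only so the sketch can run on its own.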