@lennymd
Created October 12, 2019 01:21
# we'll use the json library to read the JSON file and the csv library to save the file as a CSV file.
import csv, json
# This is the path to your json file. In this case I'm running the python script from the same folder where my json is.
path_to_file = "art_of_asia.json"
# The first thing we do is load the JSON file using Python, which turns the whole object into a dictionary, with a lot of dictionaries inside.
with open(path_to_file) as file:
    data = json.load(file)
# You can programmatically set the collection to the "value" of the "title" part of the json, but I wasn't sure if that title shows up in every json file, or only on the last one.
collection = "Art of Asia"
# All the object information is in the "objects" list, where each entry is a dictionary. We'll loop through this to get the information.
objects = data["objects"]
'''
There seem to be 6 fields for each object:
01. primaryMaker: the individual or group who made the piece. It's not always present in the data
02. primaryMedia: a relative url to a picture of the object
03. displayDate: the period when the object was created
04. invno: the inventory locator for the object. Looks kind of like a book record locator
05. id: the system database id for the object
06. title: the name/title of the object
Inside these fields there is sometimes a "label" property, which seems to be the readable name of the field. The important property is the "value" property, which is always present and has the information of interest.
'''
# We'll be appending lists to this list and then we'll turn it into a CSV file. For headers, I added the readable versions of the fields, but modified them so they are slightly easier for a computer to parse if more things need to be done programmatically.
object_list = [["primary_maker", "primary_media", "date", "object_number", "id", "title","collection"]]
# loop through the objects one by one.
for index, obj in enumerate(objects):
    # This print is a check to make sure that we have the total number of objects we're supposed to have in a file.
    # enumerate() gives the position directly; objects.index() would rescan the list and repeat the first match for duplicate objects.
    print(index)
    '''
    Although there are 6 fields, some objects are missing the "primaryMaker" field. I've decided to check for this field and the others to make sure that we can catch any blanks. It's definitely not the most elegant solution, but it works.
    If the field is not found, the column will have a "not found" message that you can modify in this next variable.
    '''
    not_found = "not found in JSON"
    # For each object I start things as not_found so we can just replace the things that exist, since we're checking each field. If we knew only the primaryMaker field is empty on occasion, this part could be coded differently.
    object_info = [not_found, not_found, not_found, not_found, not_found, not_found, collection]
    # Making a list of the object's keys at the highest level lets us check how many and which fields are in the object's json without focusing on the values of the fields.
    fields = list(obj)
    # Check that primaryMaker exists. If it does, update the first column of the object info. Then do the same pattern for the rest.
    if "primaryMaker" in fields:
        object_info[0] = obj["primaryMaker"]["value"]
    if "primaryMedia" in fields:
        object_info[1] = obj["primaryMedia"]["value"]
    if "displayDate" in fields:
        object_info[2] = str(obj["displayDate"]["value"])
    if "invno" in fields:
        object_info[3] = str(obj["invno"]["value"])
    if "id" in fields:
        object_info[4] = str(obj["id"]["value"])
    if "title" in fields:
        object_info[5] = obj["title"]["value"]
    # Once we've checked all the fields, let's add the object's info to the list we'll convert to a CSV later.
    object_list.append(object_info)
# create/open this file, and overwrite everything in it with the rows from object_list. If you want to keep different files change the file name before the .csv in the open() method.
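# newline="" is what the csv module recommends so that no blank rows appear between records on Windows.
with open("lowe-objects.csv", "w", newline="") as my_csv:
    csv.writer(my_csv, delimiter=',').writerows(object_list)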
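
The comments above admit that the field-by-field checks are not the most elegant solution, and that the collection name could be read from the JSON instead of being hard-coded. A minimal sketch of both ideas follows. The helper names (`field_value`, `object_row`, `collection_name`) and the sample data are hypothetical, and the sketch assumes the top-level "title" entry, when present, has the same {"value": ...} shape as the object fields. Unlike the script above, it converts every value to a string.

```python
# Field names as they appear in the JSON, in the same order as the CSV columns.
FIELDS = ["primaryMaker", "primaryMedia", "displayDate", "invno", "id", "title"]


def field_value(record, field, default="not found in JSON"):
    """Return record[field]["value"] as a string, or default if it is missing."""
    entry = record.get(field)
    if isinstance(entry, dict) and "value" in entry:
        return str(entry["value"])
    return default


def object_row(record, collection):
    """Build one CSV row: the six field values followed by the collection name."""
    return [field_value(record, field) for field in FIELDS] + [collection]


def collection_name(data, fallback="Art of Asia"):
    """Use the top-level title if the file has one, otherwise the fallback."""
    return field_value(data, "title", default=fallback)


# Made-up sample data standing in for the parsed JSON file.
sample_data = {
    "title": {"label": "Title", "value": "Art of Asia"},
    "objects": [{"id": {"value": 101}, "title": {"value": "Vase"}}],
}
rows = [object_row(obj, collection_name(sample_data)) for obj in sample_data["objects"]]
print(rows)
```

Keeping the field names in one list means a new field only needs a new entry in `FIELDS` and a matching header column, rather than another `if` block.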