Object stores such as OpenStack Swift, MinIO, and Ceph are worth considering. They offer advantages in scalability and accessibility, plus features such as arbitrary object metadata. Most of them also expose S3-compatible APIs, which can simplify integration.
If we instead need or want to read files from a traditional mounted filesystem, GlusterFS is one option worth considering. With its Distributed Replicated configuration, storage space can be scaled by adding more disks, and access speed can be scaled by adding more nodes. It is supported by Kubernetes and has systems for asynchronous replication, usually used for syncing the data to another datacenter. It can support replicated applications via the `ReadWriteMany` access mode.
- Using an object store: versioning is built in, so record the version ID in the HTTP server's database.
- Using a file store: too many files in a single directory can cause operational bottlenecks. A directory tree in a form such as `user_id/resource_id/version_id/filename.txt` allows the relative filepath to be calculated on the fly via information in the HTTP server's database.
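
As a minimal sketch of the file store layout above, the relative path can be derived from values the HTTP server already keeps in its database. The function name `relative_path` and the component checks are assumptions made for illustration, not part of any particular system.

```python
from pathlib import PurePosixPath


def relative_path(user_id: str, resource_id: str, version_id: str, filename: str) -> PurePosixPath:
    """Build user_id/resource_id/version_id/filename from database fields."""
    parts = (user_id, resource_id, version_id, filename)
    for part in parts:
        # Reject empty components and anything that could escape the tree.
        if not part or part in (".", "..") or "/" in part or "\\" in part:
            raise ValueError(f"unsafe path component: {part!r}")
    return PurePosixPath(*parts)


print(relative_path("42", "7", "3", "filename.txt"))  # prints 42/7/3/filename.txt
```

Joining the result onto the mount point of the shared filesystem gives the full path, so no per-file path needs to be stored.
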
We want to ensure certain actions have been taken for each uploaded file. Examples are virus scanning, parsing and storing metadata, resizing images, etc. A message broker (AKA message queue) with acknowledgments and at-least-once delivery could well be a top design choice.
- Receive PUT request from user, do not return yet
- Store file
  - If failure between this step and the completion of the next two is a concern, consider:
    - (using an object store): set a short TTL on the object, after which it will be automatically removed, and then update that TTL to infinite after the next two steps.
    - (using a file store): first take out an etcd distributed lock named after the filename, version, and user ID. If the next steps fail, the PUT request fails and the user tries again; the lock already exists but there is no file, so continue with the upload. If the next steps succeed, remove the lock.
- Add file info to database to support GET requests and the like. Use transactions for replica safety.
- Send a message containing the object ID or filepath to each action's queue, waiting for acknowledgments that the messages were received (e.g. the `scan_for_viruses` queue, the `load_metadata` queue, etc.).
- Return success to PUT request
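
The upload steps above can be sketched roughly as follows, using the object store variant with a short TTL. Everything here is a hypothetical in-memory stand-in: `InMemoryObjectStore`, `handle_put`, the 300-second TTL, and the dict used as a database are assumptions for illustration, and a real broker client would block on a publish confirmation where this sketch simply enqueues.

```python
import queue

ACTION_QUEUES = ("scan_for_viruses", "load_metadata")  # queue names from the text


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store that supports per-object TTLs."""

    def __init__(self):
        self.objects = {}  # key -> (data, ttl_seconds or None)

    def put(self, key, data, ttl_seconds=None):
        self.objects[key] = (data, ttl_seconds)

    def clear_ttl(self, key):
        data, _ = self.objects[key]
        self.objects[key] = (data, None)  # None means "keep forever"


def handle_put(store, db, queues, key, data):
    """Return only after the file, the database row, and every message are in place.

    Any exception propagates, which the HTTP layer would turn into a failed PUT.
    """
    store.put(key, data, ttl_seconds=300)  # short TTL: an orphan expires if we fail below
    db[key] = {"size": len(data)}          # add file info to the database
    for name in ACTION_QUEUES:
        queues[name].put(key)              # one message per action queue
    store.clear_ttl(key)                   # both steps succeeded, so make the object permanent
    return "success"


store, db = InMemoryObjectStore(), {}
queues = {name: queue.Queue() for name in ACTION_QUEUES}
handle_put(store, db, queues, "user/resource/1/filename.txt", b"hello")
```

If the database write or a publish fails, the object keeps its TTL and disappears on its own, so the user's retry starts from a clean state.
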
- Receive message from queue, do not acknowledge yet
- Process (e.g. scan for viruses, load metadata, etc.)
- Inform broader system of completion. It's recommended to avoid the tight coupling that connecting directly to the HTTP server's database would create. A couple of options are:
  - (using an object store): update the object metadata with info like `scanned for viruses on 7/12/2018`, etc.
  - (using an object store or file store): send messages to separate "done" queues to which the HTTP server subscribes, updating the database accordingly (e.g. the `scan_for_viruses_done` queue, etc.).
- Acknowledge original message, profit!
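
The worker steps above come down to acknowledging only after the work and the completion notice have both succeeded. The sketch below illustrates that ordering with a hypothetical `InMemoryBroker`; its `receive`, `ack`, and `requeue_unacked` methods are assumptions that mimic at-least-once delivery and do not correspond to any specific broker's API.

```python
from collections import deque


class InMemoryBroker:
    """Hypothetical stand-in for a queue with acknowledgments and at-least-once delivery."""

    def __init__(self):
        self.ready = deque()  # messages waiting for delivery
        self.unacked = {}     # delivery tag -> message handed out but not yet acknowledged
        self._next_tag = 0

    def publish(self, body):
        self.ready.append(body)

    def receive(self):
        body = self.ready.popleft()
        self._next_tag += 1
        self.unacked[self._next_tag] = body
        return self._next_tag, body

    def ack(self, tag):
        del self.unacked[tag]

    def requeue_unacked(self):
        """Simulate the broker noticing a dead consumer and redelivering its messages."""
        self.ready.extend(self.unacked.values())
        self.unacked.clear()


def work_once(work_queue, done_queue, process):
    tag, body = work_queue.receive()  # receive, do not acknowledge yet
    process(body)                     # e.g. scan for viruses; may raise
    done_queue.publish(body)          # inform the broader system via a "done" queue
    work_queue.ack(tag)               # acknowledge only after everything succeeded
```

Because a message can be delivered more than once under this scheme, `process` should be safe to run again on the same file.
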
- Failure anywhere in the processing workers will cause the message broker to redeliver the message that was not acknowledged.
- Failure anywhere in the upload logic will cause a failed PUT request and the user will retry.
- Both sides can be run in replica and scaled horizontally.