@ToniRV
Last active September 7, 2024 11:41
Download uHumans2 Dataset
#!/usr/bin/env python3
import os
import errno
import gdown
# Google Drive file IDs for each rosbag, grouped by scene.
ids = {
    "apartment_scene": {
        "uHumans2_apartment_s1_00h.bag": '1kU_drpyG7glQ8pJyeztpiy214WbBEtbM',
        "uHumans2_apartment_s1_01h.bag": '1jp7HrRsfGbmC-z757wwXDEpgNPHw0SbK',
        "uHumans2_apartment_s1_02h.bag": '1ai2p6QVNFaPJfEFOOec6URp5Anu0MBoo'
    },
    "office_scene": {
        "uHumans2_office_s1_00h.bag": '1CA_1Awu-bewJKpDrILzWok_H_6cOkGDb',
        "uHumans2_office_s1_06h.bag": '1zECekG47mlGafaJ84vCbwcx3Dz03NuvN',
        "uHumans2_office_s1_12h.bag": '1Of7s_QTE9nL1Hd69SFW1R5uDiJDiQrZr'
    },
    "subway_scene": {
        "uHumans2_subway_s1_00h.bag": '1ChL1SW1tfZrCjn5XEf4GJG8nm_Cb5AEm',
        "uHumans2_subway_s1_24h.bag": '1ifatqW3hzL9yo8m7Jt3BCqIr-6kzIols',
        "uHumans2_subway_s1_36h.bag": '1xFG565R-9LKXC60Rfx-7fruy3BP4TrHl'
    },
    "neighborhood_scene": {
        "uHumans2_neighborhood_s1_00h.bag": '1p_Uv4RLbl1GtjRxu2tldopFRKmgg_Vsy',
        "uHumans2_neighborhood_s1_24h.bag": '1LXloULyuohBzFLumBoScBMlFrT5nRPcE',
        "uHumans2_neighborhood_s1_36h.bag": '1AwgGpqe2g12T2Lm4Nilz4EaNm2_OtfHL'
    }
}
url = 'https://drive.google.com/uc?id='
def create_full_path_if_not_exists(filename):
    """Create the parent directory of filename if it does not exist yet."""
    dir_name = os.path.dirname(filename)
    if not os.path.exists(dir_name):
        try:
            print('Creating non-existent path: %s' % dir_name)
            os.makedirs(dir_name)
        except OSError as exc:  # Guard against race condition
            if exc.errno != errno.EEXIST:
                print("Could not create path: " + dir_name)

def ensure_dir(dir_path):
    """Ensure that the directory dir_path exists, creating it if necessary.
    Returns True on success; os.makedirs raises OSError if creation fails."""
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    return True

def run(args):
    assert(ensure_dir(args.output_dir))
    for dataset_name in ids.keys():
        # One sub-directory per scene.
        assert(ensure_dir(os.path.join(args.output_dir, dataset_name)))
        for rosbag_name in ids[dataset_name].keys():
            print("Downloading rosbag: %s" % rosbag_name)
            rosbag_url = url + ids[dataset_name][rosbag_name]
            gdown.download(rosbag_url, os.path.join(args.output_dir, dataset_name, rosbag_name), quiet=False)
    print("Done downloading dataset.")
    return True

def parser():
    import argparse
    basic_desc = "Download the uHumans2 dataset from Google Drive."
    shared_parser = argparse.ArgumentParser(add_help=True, description="{}".format(basic_desc))
    input_opts = shared_parser.add_argument_group("input options")
    output_opts = shared_parser.add_argument_group("output options")
    output_opts.add_argument(
        "--output_dir", type=str, help="Path to the output directory where the datasets will be saved.", required=True)
    # NOTE: main_parser and its subparsers are built but never returned or used.
    main_parser = argparse.ArgumentParser(description="{}".format(basic_desc))
    sub_parsers = main_parser.add_subparsers(dest="subcommand")
    sub_parsers.required = True
    return shared_parser

import argcomplete
import sys

if __name__ == '__main__':
    # Use a distinct name so the parser() function is not shadowed.
    arg_parser = parser()
    argcomplete.autocomplete(arg_parser)
    args = arg_parser.parse_args()
    if run(args):
        sys.exit(os.EX_OK)
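
A minimal sketch of a variant of the download loop that skips bags already on disk and keeps going when a single download fails, which may help when Google Drive rejects some files (for example with quota errors) but not others. The names `download_all`, `download_fn`, and `url_prefix` are hypothetical and not part of the script above; the downloader is passed in as a parameter so the loop does not depend on how a particular gdown version reports failure.

```python
import os


def download_all(ids, output_dir, download_fn,
                 url_prefix="https://drive.google.com/uc?id="):
    """Download every bag in `ids`, skipping existing files.

    download_fn(url, path) is a hypothetical callable, assumed to return a
    truthy value on success and to return None or raise on failure.
    Returns the list of bag names that could not be downloaded.
    """
    failed = []
    for scene, bags in ids.items():
        scene_dir = os.path.join(output_dir, scene)
        os.makedirs(scene_dir, exist_ok=True)
        for bag_name, file_id in bags.items():
            path = os.path.join(scene_dir, bag_name)
            if os.path.exists(path):
                # Already present from an earlier run; do not fetch again.
                print("Skipping existing file: %s" % path)
                continue
            try:
                ok = download_fn(url_prefix + file_id, path)
            except Exception as exc:  # e.g. quota or permission errors
                print("Failed to download %s: %s" % (bag_name, exc))
                ok = None
            if not ok:
                failed.append(bag_name)
    return failed
```

With gdown, this could presumably be called as `download_all(ids, args.output_dir, lambda u, p: gdown.download(u, p, quiet=False))` and re-run later for whatever names it returns. Note that the existence check does not detect partially downloaded files, so an interrupted bag would need to be deleted by hand before retrying.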

karnikram commented Jan 13, 2021

Hi, thank you for the dataset and this script.

I've been able to download the apartment sequences using this script, but I haven't been able to download any of the subway or neighborhood sequences: I get a permission denied error. E.g.:

Permission denied: https://drive.google.com/uc?id=1ifatqW3hzL9yo8m7
Maybe you need to change permission over 'Anyone with the link'?

Moreover, for two sequences, uHumans2_office_s1_06h.bag and uHumans2_subway_s1_00h.bag, I get the following error (from gdown):


        Too many users have viewed or downloaded this file recently
        try accessing the file again later. If the file you are try
        access is particularly large or is shared with many people,
        take up to 24 hours to be able to view or download the file
        still can't access a file after 24 hours, contact your doma
        administrator.

Downloading the files individually using the browser works, but is there any other way to download using the command line?

Author

ToniRV commented Jan 13, 2021

Hi @karnikram,

Thanks for reaching out. I've checked all the links and they are set to Anyone with the link already.
It looks like the ID in the link you showed is truncated from the original one: 1ifatqW3hzL9yo8m7Jt3BCqIr-6kzIols.
Do you have an idea of why that happened?

For the other errors, I have also seen that, and I'm afraid that is a limitation of Google Drive...
I might have to switch to Dropbox to fix this.
Let me know if the problem persists and I'll consider switching.

Thank you!

Toni


karnikram commented Jan 13, 2021

Thanks Toni. The id might have been truncated during copy-paste.
Could you instead share the link to the main dataset directory that contains all the files? That way I can copy the directory to my gdrive and from there I should be able to download the files using rclone.
I managed to add the individual files to my own gdrive and then download via CLI using rclone.
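
The truncated ID discussed above can be caught before any download is attempted. A minimal sketch, assuming the only trusted IDs are the ones listed in the script; `check_file_id` is a hypothetical helper, not part of the gist or of gdown. It reports whether an ID matches a known one exactly or is only a proper prefix of one.

```python
def check_file_id(candidate, known_ids):
    """Classify `candidate` against the IDs listed in the script.

    Returns ("ok", candidate) for an exact match, ("truncated", full_id)
    when candidate is a proper prefix of exactly one known ID, and
    ("unknown", None) otherwise.
    """
    if candidate in known_ids:
        return ("ok", candidate)
    matches = [k for k in known_ids if candidate and k.startswith(candidate)]
    if len(matches) == 1:
        return ("truncated", matches[0])
    return ("unknown", None)


# Two of the subway IDs from the script, used here as sample data.
known = ["1ChL1SW1tfZrCjn5XEf4GJG8nm_Cb5AEm", "1ifatqW3hzL9yo8m7Jt3BCqIr-6kzIols"]
print(check_file_id("1ifatqW3hzL9yo8m7", known))
# → ('truncated', '1ifatqW3hzL9yo8m7Jt3BCqIr-6kzIols')
```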