@ajelenak
ajelenak / h5stat-extra.py
Last active September 19, 2024 12:13
Additional HDF5 dataset chunk statistics
import argparse
import json
import operator
from collections import defaultdict
from dataclasses import dataclass
from functools import partial, reduce
import os
from typing import Any, Union
from configparser import ConfigParser
from pathlib import Path
ajelenak / h5ublock.py
Created May 19, 2023 00:00
Create a copy of HDF5 file with specified user block. User block is filled with zero bytes.
import argparse
from warnings import warn
import h5py
WARN_UBLOCK_SIZE = 10 * 1024 * 1024
COPY_BLOCK_SIZE = 10 * 1024 * 1024
parser = argparse.ArgumentParser(
    description='Add empty user block to an HDF5 file.',
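The HDF5 format accepts a user block size of 0 or a power of two that is at least 512 bytes, so a tool that adds a user block would likely want to validate the requested size first. Below is a minimal sketch of that check in plain Python. The helper names `is_valid_userblock_size` and `next_valid_userblock_size` are hypothetical and are not taken from the gist.

```python
def is_valid_userblock_size(size: int) -> bool:
    """Return True if size is 0 or a power of two that is >= 512."""
    if size == 0:
        return True
    # A power of two has exactly one bit set, so size & (size - 1) is zero.
    return size >= 512 and (size & (size - 1)) == 0


def next_valid_userblock_size(min_bytes: int) -> int:
    """Return the smallest valid user block size holding min_bytes bytes."""
    if min_bytes <= 0:
        return 0
    size = 512
    while size < min_bytes:
        size *= 2
    return size


# Hypothetical request: room for 1000 bytes of user data.
requested = 1000
print(is_valid_userblock_size(requested))
print(next_valid_userblock_size(requested))
```

With h5py, a size that passes this check could then be given to `h5py.File` through its `userblock_size` argument when creating the copy.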
ajelenak / HDF5 Ecosystem.plantuml
Created September 24, 2021 13:51
HDF5 Ecosystem (PlantUML)
@startmindmap HDF5 Ecosystem
<style>
node {
    RoundCorner 40
    MaximumWidth 300
    FontName Helvetica
    FontSize 18
}
rootNode {
ajelenak / h5comprat.py
Created June 24, 2021 23:54
How to compute and display compression ratios of HDF5 datasets in an HDF5 file using Python
import sys
import h5py
def comp_ratio(name, obj):
    if isinstance(obj, h5py.Dataset) and obj.chunks is not None:
        if obj.id.get_create_plist().get_nfilters():
            stor_size = obj.id.get_storage_size()
            if stor_size != 0:
                ratio = float(obj.nbytes) / float(stor_size)
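The preview above divides a dataset's in-memory size (`obj.nbytes`) by its on-disk storage size, and skips datasets whose storage size is zero. The arithmetic can be sketched on its own, without h5py, as shown below. The function name `compression_ratio` and the sample numbers are hypothetical and only illustrate the calculation.

```python
from typing import Optional


def compression_ratio(logical_bytes: int, stored_bytes: int) -> Optional[float]:
    """Return logical size divided by stored size, or None if nothing is stored."""
    if stored_bytes == 0:
        # Nothing has been written to storage, so the ratio is undefined.
        return None
    return logical_bytes / stored_bytes


# Hypothetical numbers: 4 MiB of array data occupying 1 MiB in the file.
ratio = compression_ratio(4 * 1024 * 1024, 1024 * 1024)
print(f"compression ratio: {ratio:.2f}")
```

A ratio above 1 means the filters reduced the stored size. A ratio below 1 means the stored chunks take more space than the raw data.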
ajelenak / HDF5 Universe.plantuml
Last active April 22, 2020 14:11
UML diagram of the HDF5 software components
@startuml HDF5 Universe
title HDF5 Universe
together {
    folder "Abstract\nData Model" as ADM
    folder "Programming\nModel" as PM
    folder Library as L
}
ajelenak / h5-to-zarr.py
Last active March 1, 2023 16:04
Python code to extract HDF5 chunk locations and add them to Zarr metadata.
# Requirements:
# HDF5 library version 1.10.5 or later
# h5py version 3.0 or later
# pip install git+https://github.com/HDFGroup/zarr-python.git@hdf5
import logging
from urllib.parse import urlparse, urlunparse
import numpy as np
import h5py
import zarr
ajelenak / .zmetadata.json
Created February 6, 2020 22:00
Zarr consolidated metadata (.zmetadata) with HDF5 chunk file locations.
{
  "metadata": {
    ".zattrs": {
      "Conventions": "UGRID-0.9.0",
      "_FillValue": -99999.0,
      "_NCProperties": "version=1|h5netcdfversion=0.6.1|hdf5libversion=1.10.2",
      "a00": 0.35,
      "agrid": "grid",
      "b00": 0.3,
      "c00": 0.35,
ajelenak / store_info.py
Last active December 1, 2021 04:03
Python script for reporting HDF5 dataset storage information for HDF5 files either in a file system or S3.
#!/usr/bin/env python3
"""
Print storage information for every HDF5 dataset in a file.
Run "store_info.py --help" for information.
"""
from os import SEEK_SET
import argparse
import json
from functools import partial
ajelenak / cloud-access-to-hdf5.ipynb
Last active August 5, 2024 04:07
Access HDF5 Files in S3
ajelenak / DAS-HDF5 and xarray.ipynb
Created April 18, 2019 16:29
Exploring a DAS-HDF5 with xarray