I have MRI data from a group of patients with brain damage (due to stroke, head injury, etc.). For each patient, I have a binary MRI image (3D array), where a value of 1 or 0 indicates the presence or absence of damage at a particular voxel (3D pixel). Each image is ~870k voxels.
I want to return a list of patients with damage at a given voxel.
Here's the stupid/inelegant way to do this:

    voxel_id = 12345  # Index of the voxel I'm interested in
    patient_matches = []
    for patient in patient_list:
        patient_img_file = '/path/to/images/{}.MRI'.format(patient)  # Path to this patient's image.
        img = load_image(patient_img_file)  # Load the image as a NumPy ndarray.
        img_flat = img.ravel()  # Flatten to 1D.
        vox_val = img_flat[voxel_id]
        if vox_val == 1:
            patient_matches.append(patient)
- Load and flatten each image, so I have a 1x870k vector for each patient.
- Read these vectors into a DB table (`damage`), with 1 row per patient and 1 column per voxel.
- Query: `select patient_id from damage where vox12345 = 1`
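One wrinkle with the table layout above: most SQL engines won't accept 870k columns (SQLite, for instance, caps columns per table far below that), so a sparse one-row-per-damaged-voxel schema is more practical. A minimal sketch with SQLite and toy 8-voxel "images" (all names and data here are hypothetical stand-ins):

```python
import sqlite3
import numpy as np

# Toy stand-in for the real images: 3 patients, 8-voxel binary "images".
images = {
    'p01': np.array([0, 1, 0, 0, 1, 0, 0, 0]),
    'p02': np.array([0, 0, 0, 0, 1, 0, 1, 0]),
    'p03': np.array([0, 1, 0, 0, 0, 0, 0, 0]),
}

conn = sqlite3.connect(':memory:')
conn.execute('create table damage (patient_id text, voxel_id integer)')
for patient, img in images.items():
    flat = img.ravel()
    # Store only the damaged voxels: one (patient, voxel) row per 1-valued voxel.
    rows = [(patient, int(v)) for v in np.flatnonzero(flat)]
    conn.executemany('insert into damage values (?, ?)', rows)

voxel_id = 4
matches = [r[0] for r in conn.execute(
    'select patient_id from damage where voxel_id = ?', (voxel_id,))]
print(sorted(matches))  # ['p01', 'p02']
```

An index on `voxel_id` would keep per-voxel lookups fast as the table grows.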
So, with bit packing, you're going to have 870k/8, or ~100KB per patient? How many patients do you have? (Tens? Hundreds? Thousands?)
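To make that size estimate concrete, NumPy's `packbits` packs 8 voxels per byte (the 870k-voxel image below is a hypothetical stand-in for the real data):

```python
import numpy as np

n_voxels = 870_000
img_flat = np.zeros(n_voxels, dtype=np.uint8)  # hypothetical flattened binary image
img_flat[12345] = 1

packed = np.packbits(img_flat)  # 8 voxels per byte, MSB first
print(packed.nbytes)            # 108750 bytes, i.e. ~106 KB per patient

# Read back a single voxel without unpacking the whole image:
voxel_id = 12345
byte_val = packed[voxel_id // 8]
bit = (byte_val >> (7 - voxel_id % 8)) & 1
print(bit)  # 1
```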
I think this is tractable just treating it as binary data, either mmapping or not. 1000 patients is under 1GB, so pop it in memory, do some lookups, and you're there. If you're doing more than a handful of searches, I'd preprocess your flat images, then read them in and run a bunch of searches.
You could even throw them into a NumPy array of bool with shape (features, patients), and just use index projection to get the vector of matches. That's going to scale well until you run out of memory.
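A sketch of that in-memory approach, with one row per voxel and one column per patient as suggested above (toy 8-voxel data; names are hypothetical):

```python
import numpy as np

patients = np.array(['p01', 'p02', 'p03'])

# shape = (voxels, patients): each column is one patient's flattened image.
data = np.array([
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
], dtype=bool)

voxel_id = 4
matches = patients[data[voxel_id]]  # boolean mask over the patient axis
print(matches)  # ['p01' 'p02']
```

Unpacked bools cost 1 byte per voxel, so 1000 patients at 870k voxels is ~870MB, consistent with the "under 1GB" estimate.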