@swang373
Created May 16, 2018 04:06
Dump histograms in parallel using Varial
#!/usr/bin/env python
import glob
import sys
import varial.tools
from varial_ext.treeprojector import TreeProjectorFileBased
# The following all-caps variables are module-level constants that I would tweak by hand.
# Most of them are parameters passed to the treeprojection_mr_impl.map_projection
# function, so check its docstring for more information.
# https://github.com/HeinerTholen/Varial/blob/acdbf65003aed3feff23c9a2edf4383d0714bfc6/varial_ext/treeprojection_mr_impl.py#L19-L40
# FILENAMES maps each sample name to its list of input ROOT files. It is an
# important parameter, and you should plug the logic from run_plot_from_tree.py
# in here so you can benefit from the routine Heiner wrote that takes care of
# matching an input_pattern.
FILENAMES = {
    'Run2016BToG_MET_ReMiniAOD': glob.glob('/blah/blah/VHbbAnalysisNtuples/NanoAOD/Run2016BToG_MET_ReMiniAOD/*.root'),
}
# The histograms for Varial to generate. The dictionary key becomes the
# name of the TH1 object saved in the output ROOT file. The dictionary value
# holds the tuple (variable, histogram_title, nbins, xmin, xmax). The variable
# can be anything from a plain branch name to a TTreeFormula expression,
# i.e. anything allowed by TTree.Draw.
HISTOGRAMS = {
    'H_mass': ('H_mass', 'Higgs Mass [GeV]', 50, 0, 500),
}
# The maximum number of multiprocessing workers.
MAXJOBS = 8
# Any selections to apply.
SELECTION = 'cutFlow>=2'
# Any weights to apply.
WEIGHT = 'weight'


def main():
    params = {'treename': 'Events', 'histos': HISTOGRAMS}
    # This is a list of tuples of (title, selection, weight).
    sec_sel_weight = [('Histograms', SELECTION, WEIGHT)]
    # Create the TreeProjector object, shove it into a parallelized tool chain, and run it.
    # It will generate an output directory in the current working directory called HistosFromTree.
    # Inside that output directory will be a subdirectory called TreeProjectorFileBased. And finally
    # within that subdirectory will be ROOT files named after the keys in the FILENAMES dictionary.
    # Each ROOT file contains a TDirectoryFile named Histograms which contains all of the requested
    # histograms for that corresponding sample.
    projector = TreeProjectorFileBased(FILENAMES, params, sec_sel_weight, add_aliases_to_analysis=False)
    plotter = varial.tools.ToolChainParallel('HistosFromTree', [projector], n_workers=MAXJOBS)
    varial.tools.Runner(plotter)


if __name__ == '__main__':
    status = main()
    sys.exit(status)
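
The comments in main() describe where the output ends up. The following is a minimal sketch of reading a histogram back, assuming that layout holds as described and that each file is named after the sample key plus a '.root' extension (the extension is my assumption, not something the script states). The helper names output_file, histogram_key, and load_histogram are hypothetical and are not part of Varial; only the PyROOT calls (TFile.Open, IsZombie, Get, SetDirectory, Close) are standard ROOT.

```python
import os

# Assumed layout, taken from the comments in the script above:
#   <base_dir>/HistosFromTree/TreeProjectorFileBased/<sample>.root
# with each file holding a directory named after the section title.
TOOLCHAIN_NAME = 'HistosFromTree'
PROJECTOR_NAME = 'TreeProjectorFileBased'
SECTION = 'Histograms'


def output_file(sample, base_dir='.'):
    """Return the path where the ROOT file for one sample is expected."""
    return os.path.join(base_dir, TOOLCHAIN_NAME, PROJECTOR_NAME, sample + '.root')


def histogram_key(name, section=SECTION):
    """Return the in-file key of a histogram, e.g. 'Histograms/H_mass'."""
    return '{0}/{1}'.format(section, name)


def load_histogram(sample, name, base_dir='.'):
    """Fetch one histogram with PyROOT, or None if the file or key is missing."""
    import ROOT  # imported lazily so the path helpers work without ROOT

    path = output_file(sample, base_dir)
    if not os.path.exists(path):
        return None
    root_file = ROOT.TFile.Open(path)
    if not root_file or root_file.IsZombie():
        return None
    hist = root_file.Get(histogram_key(name))
    if not hist:
        root_file.Close()
        return None
    hist.SetDirectory(0)  # detach so the histogram survives closing the file
    root_file.Close()
    return hist
```

With the constants from the script, load_histogram('Run2016BToG_MET_ReMiniAOD', 'H_mass') would look for the key Histograms/H_mass in the file for that sample. Returning None on a missing file or key keeps the helper usable in a loop over many samples, some of which may not have been processed yet.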