@rghv404
Created April 20, 2021 18:58
Split dataset and process the parts before reducing to union
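// Split the aggregated dataset into 5 equally weighted random parts,
// process each part independently, then union the results back into one Dataset.
// equallyWeightedParts and methodName are defined elsewhere and not shown here.
dfAggregated
  .randomSplit(equallyWeightedParts(5))
  .map((dataset: Dataset[Models.AggregatedDataSet]) => methodName(
    allMeasurementsOnAssets = assetAggregatedMeasurements,
    allMeasurementsOnScorecards = dataset,
    broadcastFactorWeights = broadcastFactorWeights,
    broadcastInterpolationFunctions = broadcastInterpolationFunctions,
    spark = spark
  ))
  .reduce(_ union _)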
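
The snippet above calls `equallyWeightedParts` without showing it. A minimal, self-contained sketch of the same split, process, union pattern follows. It assumes the helper simply returns `n` equal weights (an assumption, not the author's actual definition), and it swaps in a hypothetical `processPart` step and a toy `spark.range` input in place of `methodName` and `dfAggregated`.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SplitProcessUnion {
  // Hypothetical helper (assumed, not from the gist): n equal weights for randomSplit.
  def equallyWeightedParts(n: Int): Array[Double] = {
    require(n > 0, "n must be positive")
    Array.fill(n)(1.0 / n)
  }

  // Hypothetical stand-in for the per-part processing step (methodName in the gist).
  def processPart(part: DataFrame): DataFrame =
    part.withColumn("doubled", col("id") * 2)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("split-process-union")
      .getOrCreate()

    // Toy input standing in for dfAggregated.
    val input: DataFrame = spark.range(100).toDF("id")

    val result: DataFrame = input
      .randomSplit(equallyWeightedParts(5), 42L) // Array of 5 DataFrames, fixed seed
      .map(processPart)                          // process each part on its own
      .reduce(_ union _)                         // stitch the parts back together

    // Every input row lands in exactly one part, so the row count is preserved here.
    assert(result.count() == input.count())

    spark.stop()
  }
}
```

Part sizes are only approximately equal, since `randomSplit` samples rows by weight rather than cutting the dataset into exact fifths. `union` matches columns by position, so the processing step should return the same schema for every part.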