@umarhussain88
Created November 3, 2020 14:26
from collections import ChainMap


def generate_new_keys(*args, key='Carid', name='Carname'):
    """
    Takes any number of dataframes and returns a dict that maps each name
    whose key duplicates a key in an earlier dataframe to a new unique id.
    The groupby columns default to 'Carid' and 'Carname'.
    """
    # build one {key: name} dictionary per dataframe.
    dicts_ = [arg.groupby(key)[name].first().to_dict() for arg in args]
    # merge the dicts on key; where keys collide, the earliest dataframe wins.
    merged_dicts = dict(ChainMap(*dicts_))
    # collect the names that dropped out of the merge because their key was a duplicate.
    delta = [v for each_dict in dicts_ for k, v in each_dict.items()
             if v not in merged_dicts.values()]
    # get the first free key after the current maximum.
    start_key = max(merged_dicts.keys()) + 1
    # create a new sequence with one key per dropped name.
    sequence = range(start_key, start_key + len(delta))
    # return a {name: new_key} dictionary.
    return {name_: number for name_, number in zip(delta, sequence)}
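
The merge step relies on how ChainMap resolves duplicate keys: the earliest mapping wins, so a name stored under a colliding key in a later mapping drops out of the merged result. The sketch below walks through that behaviour with plain dictionaries standing in for the per-dataframe groupby output. The car names and key values are hypothetical sample data, not taken from the gist.

```python
from collections import ChainMap

# Hypothetical {key: name} dicts, standing in for what
# df.groupby('Carid')['Carname'].first().to_dict() would return per dataframe.
first = {1: "Audi", 2: "BMW"}
second = {2: "Fiat", 3: "Volvo"}  # key 2 collides with `first`

dicts_ = [first, second]

# ChainMap looks keys up in the first mapping that contains them,
# so key 2 resolves to "BMW" and "Fiat" is left out of the merge.
merged = dict(ChainMap(*dicts_))

# Names absent from the merged values are the ones that lost their key.
dropped = [v for d in dicts_ for v in d.values() if v not in merged.values()]

# Hand out fresh keys starting just above the largest existing key.
start_key = max(merged) + 1
new_keys = dict(zip(dropped, range(start_key, start_key + len(dropped))))

print(new_keys)  # {'Fiat': 4}
```

Because membership is checked against names rather than keys, a name that appears under a colliding key in one dataframe but also survives elsewhere in the merge would not be flagged, so this approach assumes names are unique across the inputs.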