import re
import pandas as pd
# maps a keyword found in 'sme_domain' to [sme_level1, sme_level2, sme_level3]
smeLevelDict = {
    'Data Ingestion': ['Experience Platform', 'Data Ingestion and Management', 'Data Ingestion'],
    'General': ['Experience Platform', 'Overview', 'General'],
    'CJA': ['Experience Cloud', 'Data Insights And Audiences', 'CJA'],
    'Schema': ['Experience Platform', 'Data Modeling', 'Schemas'],
    'Query Service': ['Experience Platform', 'Data Science and Queries', 'Query Service'],
    'Dataset': ['Experience Platform', 'Data Ingestion and Management', 'Dataset'],
    'AJO': ['Experience Cloud', 'Customer Journey', 'Journey Optimizer'],
    'Profile': ['Experience Platform', 'Customer Data', 'Real-time Customer Profile'],
    'Identity': ['Experience Platform', 'Customer Data', 'Identity Service'],
    'Source Connector': ['Experience Platform', 'Data Ingestion and Management', 'Sources'],
    'Destination Connectors': ['Experience Platform', 'Activation', 'Destinations'],
    'Segment': ['Experience Platform', 'Customer Data', 'Segmentation Service'],
    'License': ['Experience Platform', 'Administration', 'License Usage'],
    'Field': ['Experience Platform', 'Data Modeling', 'Schemas'],
    'Data Prep': ['Experience Platform', 'Data Ingestion and Management', 'Data Prep'],
    'B2B': ['Experience Platform', 'Customer Data', 'Real-time Customer Data Platform'],
    'JO': ['Experience Cloud', 'Customer Journey', 'Journey Orchestration'],
    'DULE': ['Experience Platform', 'Governance, Privacy, and Security', 'Data Governance'],
    'Event': ['Experience Platform', 'Data Collection', 'Event Forwarding'],
    'Hashed Email': ['Experience Platform', 'Customer Data', 'Identity Service'],
    'Destinations': ['Experience Platform', 'Activation', 'Destinations'],
    'UPSERT': ['Experience Platform', 'Data Ingestion and Management', 'Data Ingestion'],
    'Data Distiller': ['Experience Platform', 'Data Science and Queries', 'Query Service'],
    'Flow Service': ['Experience Platform', 'Data Ingestion and Management', 'Dataflows'],
    'Eloqua': ['Experience Platform', 'Data Ingestion and Management', 'Sources'],
    'TTL': ['Experience Platform', 'Customer Data', 'Real-time Customer Profile'],
    'data load': ['Experience Platform', 'Administration', 'License Usage'],
    'RTCDP': ['Experience Platform', 'Customer Data', 'Real-time Customer Data Platform'],
    'Guard Rails': ['Experience Platform', 'Data Ingestion and Management', 'Data Ingestion'],
    'Data security': ['Experience Platform', 'Governance, Privacy, and Security', 'Data Governance'],
    'Edge Network': ['Experience Platform', 'Data Collection', 'Edge Network Server API'],
    'AEP Architecture': ['Experience Platform', 'Overview', 'General'],
    'API': ['Experience Platform', 'Overview', 'General'],
    'WebSDK': ['Experience Platform', 'Data Collection', 'WebSDK'],
    'MobileSDK': ['Experience Platform', 'Data Collection', 'MobileSDK'],
}
smeLevelColumns = ['sme_level1', 'sme_level2', 'sme_level3']
df = pd.read_csv('sme_concepts_qs_in.csv')
# drop rows with an empty 'question' column, then reset the index so the
# index-based join further down pairs each row with its own levels
df = df.dropna(axis=0, subset=['question']).reset_index(drop=True)
# fill 'System' where sme_name is empty
df['sme_name'] = df['sme_name'].fillna('System')

def lookup_sme_levels(domain):
    # return the levels of the first smeLevelDict key found in the domain text
    # (case-insensitive); fail with a readable error when nothing matches
    text = str(domain).lower()
    for key, levels in smeLevelDict.items():
        if re.search(re.escape(key.lower()), text):
            return levels
    raise ValueError(f'no smeLevelDict key matches sme_domain {domain!r}')

df['sme_levels'] = df['sme_domain'].map(lookup_sme_levels)
df_out = pd.DataFrame(df['sme_levels'].to_list(), columns=smeLevelColumns)
df_out = df_out.join(df)
df_out = df_out.drop(columns=['sme_domain', 'sme_levels'])
# reorder columns
df_out = df_out[['sme_name', 'question', 'sme_level1', 'sme_level2', 'sme_level3']]
df_out.to_csv('sme_concepts_qs_out.csv', index=False)
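A note on the join step: `DataFrame.join` aligns rows by index label, not by position. A frame rebuilt from a list gets a fresh 0..n-1 index, so it only pairs up correctly with a filtered frame whose index is also 0..n-1. The sketch below uses made-up toy data (the column names `l1`, `l2`, `l3` and all values are illustrative only) to show the difference.

```python
import pandas as pd

# Toy data: the middle row has no question and will be dropped.
toy = pd.DataFrame({
    'question': ['q0', None, 'q2'],
    'sme_levels': [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']],
})

filtered = toy.dropna(subset=['question'])  # keeps index labels 0 and 2
levels = pd.DataFrame(filtered['sme_levels'].to_list(),
                      columns=['l1', 'l2', 'l3'])  # fresh index labels 0 and 1

# Joining as-is pairs label 1 of `levels` with a row that no longer exists.
misaligned = levels.join(filtered)

# Resetting the filtered frame's index makes the labels line up again.
aligned = levels.join(filtered.reset_index(drop=True))

print(misaligned[['l1', 'question']])
print(aligned[['l1', 'question']])
```

In the misaligned result the second row keeps its levels but loses its question; in the aligned result each question sits next to its own levels.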
Thanks for knocking this out. Sorry for the late review. Could you make the following changes?
'CJA': ['Experience Platform', 'Data Insights And Audiences', 'CJA'],
CJA is in Experience Cloud
'B2B': ['Experience Platform', 'Overview', 'General'],
I believe B2B is really RTCDP B2B which we are currently just classifying as RTCDP. However, looking at the five B2B questions I think 3 of them should be schema and we can overwrite the other 2.
'Hashed Email': ['Experience Cloud', 'Customer Journey', 'Journey Optimizer'],
I think this should go to identity service instead
https://experienceleague.adobe.com/docs/id-service/using/reference/hashing-support.html?lang=en
'UPSERT': ['Experience Platform', 'Customer Data', 'Real-time Customer Profile'],
There's only one question in this domain: "How to update specific column or row data in AEP". Let's map it to Data Ingestion.
'Data Distiller': ['Experience Platform', 'Data Ingestion and Management', 'Data Ingestion'],
This should map to Query Service. Data Distiller is the SKU that customers can purchase that allows them to manipulate data with Query Service.
'API': ['Experience Platform', 'Overview', 'General'],
There are 3 API questions. One of them is general; we should manually map the other two to the domains they are in.
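For the manual mappings mentioned above (the B2B and API questions), one possible approach is a per-question override table applied after the keyword lookup. This is only a sketch: `manual_overrides`, `apply_overrides`, and the question strings are hypothetical placeholders and are not taken from the script or the CSV.

```python
import pandas as pd

level_columns = ['sme_level1', 'sme_level2', 'sme_level3']

# Hypothetical override table: exact question text -> [level1, level2, level3].
# The question string is a placeholder, not a real row from the CSV.
manual_overrides = {
    'Placeholder API question about schemas':
        ['Experience Platform', 'Data Modeling', 'Schemas'],
}

def apply_overrides(df_out, overrides):
    """Return a copy of df_out with the level columns replaced for listed questions."""
    result = df_out.copy()
    for question, levels in overrides.items():
        mask = result['question'] == question
        # Assign one column at a time so an empty mask is a harmless no-op.
        for column, level in zip(level_columns, levels):
            result.loc[mask, column] = level
    return result

# Toy output frame where both questions were first classified as General.
demo = pd.DataFrame({
    'sme_name': ['System', 'System'],
    'question': ['Placeholder API question about schemas',
                 'Another placeholder question'],
    'sme_level1': ['Experience Platform', 'Experience Platform'],
    'sme_level2': ['Overview', 'Overview'],
    'sme_level3': ['General', 'General'],
})

fixed = apply_overrides(demo, manual_overrides)
print(fixed)
```

Keeping the overrides in a separate table leaves the keyword dictionary unchanged and makes the hand-mapped questions easy to audit later.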
Thanks for the review @ken-russell! I have updated the above script and regenerated the csv.
Looks good, thanks!
Good work on this. Hoping we can reuse it for future annotation work.