Last active June 4, 2020 09:06
Gist: franklinokech/96f65be38970747cfdde69c8a353e26e
This gist collects key data-wrangling snippets for Python pandas.
# Insert a new column at a given position (df.insert modifies df in place)
idx = 0
new_col = [7, 8, 9]  # can be a list, a Series, an array or a scalar
df.insert(loc=idx, column='A', value=new_col)
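A minimal runnable sketch of the insert above. The sample frame and its column names `B` and `C` are assumptions made up for illustration; only `df.insert` comes from the snippet.

```python
import pandas as pd

# Hypothetical sample frame; 'B' and 'C' are illustrative column names
df = pd.DataFrame({'B': [1, 2, 3], 'C': [4, 5, 6]})

# Insert column 'A' at position 0; the call changes df in place and returns None
df.insert(loc=0, column='A', value=[7, 8, 9])

print(list(df.columns))  # ['A', 'B', 'C']
```

Note that `df.insert` raises a `ValueError` if a column named `A` already exists, unless `allow_duplicates=True` is passed.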
# Build a composite column by joining a string column and a date column with '-'
df['composite_column'] = df['string_col'] + '-' + df['date_column'].astype(str)
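The snippet above relies on the default string form that `astype(str)` gives a date column. If a specific date layout is wanted in the composite value, one possible variant (a sketch, with made-up sample data) formats the dates explicitly with `dt.strftime`:

```python
import pandas as pd

# Hypothetical sample data using the column names from the snippet above
df = pd.DataFrame({
    'string_col': ['a', 'b'],
    'date_column': pd.to_datetime(['2020-06-04', '2020-06-05']),
})

# Format the dates explicitly instead of relying on the default str() form
df['composite_column'] = (
    df['string_col'] + '-' + df['date_column'].dt.strftime('%d-%m-%Y')
)

print(df['composite_column'].tolist())  # ['a-04-06-2020', 'b-05-06-2020']
```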
# Convert a pandas column to datetime dtype (values are expected as day-month-year, e.g. 04-06-2020)
df['date_column'] = pd.to_datetime(df['date_column'], format='%d-%m-%Y')
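A small worked example of the conversion above, assuming the raw values are strings in day-month-year order. The sample values are invented; `errors='coerce'` is an optional addition that turns unparseable entries into `NaT` instead of raising.

```python
import pandas as pd

# Hypothetical raw data: two valid dates and one bad value
df = pd.DataFrame({'date_column': ['04-06-2020', '31-12-2019', 'not a date']})

# Parse with an explicit format; invalid entries become NaT rather than raising
df['date_column'] = pd.to_datetime(
    df['date_column'], format='%d-%m-%Y', errors='coerce'
)

print(df['date_column'].dtype)         # a datetime64 dtype
print(df['date_column'].isna().sum())  # 1
```

Without `errors='coerce'`, the bad value would raise an exception, which may be preferable when silent data loss is a concern.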
# Convert a given column to lower case
df['column_name'] = df['column_name'].str.lower()
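As a quick illustration of the lower-casing step, here is a sketch on invented sample values. It also shows that missing values pass through `.str.lower()` unchanged.

```python
import pandas as pd

# Hypothetical sample values, including a missing entry
df = pd.DataFrame({'column_name': ['Nairobi', 'MOMBASA', None]})

# .str.lower() lower-cases each string and leaves missing values as missing
df['column_name'] = df['column_name'].str.lower()

print(df['column_name'].tolist()[:2])  # ['nairobi', 'mombasa']
```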
# Write a DataFrame to a Google Sheet from Google Colab
from google.colab import auth
from google.auth import default  # replaces the deprecated oauth2client
import gspread
from gspread_dataframe import set_with_dataframe

auth.authenticate_user()
creds, _ = default()
gc = gspread.authorize(creds)

# Open the spreadsheet by title
sh = gc.open('Google Sheet File Name')
# Select a worksheet (tab) by title
worksheet = sh.worksheet('Tab within File')
# Write the DataFrame to the worksheet, starting at cell A1 (overwrites existing cells)
set_with_dataframe(worksheet, df)
# Get frequency percentage by values in column 'City'
frequency = empDfObj['City'].value_counts(normalize=True)
print("Frequency of values as percentage in column 'City':")
print(frequency * 100)
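A runnable sketch of the frequency calculation above. The city values are invented for illustration; `empDfObj` and `City` follow the snippet.

```python
import pandas as pd

# Hypothetical sample data for the 'City' column
empDfObj = pd.DataFrame({'City': ['Nairobi', 'Nairobi', 'Kisumu', 'Mombasa']})

# normalize=True returns fractions that sum to 1; multiply by 100 for percentages
frequency = empDfObj['City'].value_counts(normalize=True) * 100

print(frequency['Nairobi'])  # 50.0
```

By default `value_counts` ignores missing values; passing `dropna=False` would include them in the percentages.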
# Left join: keep every row of df_left and attach matching rows from df_right
df_merged = pd.merge(left=df_left, right=df_right, on='primary_key', how='left')
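To see what the left join does with keys that have no match, here is a sketch on two small invented frames. The `name` and `score` columns are assumptions; `indicator=True` is an optional extra that adds a `_merge` column showing where each row came from.

```python
import pandas as pd

# Hypothetical frames sharing the key column 'primary_key'
df_left = pd.DataFrame({'primary_key': [1, 2, 3], 'name': ['a', 'b', 'c']})
df_right = pd.DataFrame({'primary_key': [1, 2], 'score': [10, 20]})

# Left join keeps all three left rows; key 3 has no match, so its score is NaN
df_merged = pd.merge(
    df_left, df_right, on='primary_key', how='left', indicator=True
)

# Rows present only in the left frame are flagged 'left_only'
unmatched = df_merged[df_merged['_merge'] == 'left_only']
print(unmatched['primary_key'].tolist())  # [3]
```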
# Prepend a string to every value in a column
df['col'] = 'str' + df['col'].astype(str)
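A short sketch of the prefixing step on an invented integer column. The prefix `'ID-'` is a hypothetical stand-in for `'str'`.

```python
import pandas as pd

# Hypothetical integer column
df = pd.DataFrame({'col': [101, 102]})

# astype(str) converts the numbers to text so the prefix can be concatenated
df['col'] = 'ID-' + df['col'].astype(str)

print(df['col'].tolist())  # ['ID-101', 'ID-102']
```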
# Remove leading and trailing whitespace from column names
df.columns = [x.strip() for x in df.columns]
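An equivalent way to clean the column names, sketched on an invented frame, is the vectorised `Index.str.strip()`. Like the list comprehension, it assumes every column name is a string.

```python
import pandas as pd

# Hypothetical frame whose column names carry stray whitespace
df = pd.DataFrame({' name ': ['a'], 'city  ': ['b']})

# Strip leading and trailing whitespace from every column name
df.columns = df.columns.str.strip()

print(list(df.columns))  # ['name', 'city']
```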