Created
July 25, 2017 02:50
Python Data Visualizations
# Figures: https://www.kaggle.com/benhamner/python-data-visualizations/notebook
# First, we'll import pandas, a data processing and CSV file I/O library
import pandas as pd
# We'll also import seaborn, a Python graphing library
import warnings  # the seaborn version used here emits many warnings, which we'll ignore
warnings.filterwarnings("ignore")
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="white", color_codes=True)
# Next, we'll load the Iris flower dataset, which is in the "../input/" directory
iris = pd.read_csv("../input/Iris.csv")  # the iris dataset is now a pandas DataFrame
# Let's see what's in the iris data. A Jupyter notebook displays the value of the last
# expression in a cell; in a plain script we have to print it explicitly
print(iris.head())
# Let's see how many examples we have of each species
print(iris["Species"].value_counts())
# The first way we can plot things is using the .plot method of pandas DataFrames
# We'll use this to make a scatterplot of the Iris features.
# Scatter plot
iris.plot(kind="scatter", x="SepalLengthCm", y="SepalWidthCm")
# One piece of information missing in the plot above is what species each plant is
# We'll use seaborn's FacetGrid to color the scatterplot by species
# Scatter plot colored by species
# (seaborn 0.9 renamed the `size` argument to `height`; use `size=5` on older versions)
sns.FacetGrid(iris, hue="Species", height=5) \
   .map(plt.scatter, "SepalLengthCm", "SepalWidthCm") \
   .add_legend()
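# We can look at an individual feature in seaborn through a boxplot
# (outside a notebook, open a new figure first so the boxplot is not drawn on the FacetGrid axes)
plt.figure()
sns.boxplot(x="Species", y="PetalLengthCm", data=iris)
# Another useful seaborn plot is the pairplot, which shows the bivariate relation
# between each pair of features
#
# From the pairplot, we'll see that the Iris-setosa species is separated from the other
# two across all feature combinations
# Pairwise feature combinations
# (seaborn 0.9 renamed the `size` argument to `height`; use `size=3` on older versions)
sns.pairplot(iris.drop("Id", axis=1), hue="Species", height=3)
# Display the figures when running as a script; a notebook renders them inline
plt.show()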
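# The separation of Iris-setosa noted above can also be checked numerically.
# What follows is a minimal sketch and not part of the original notebook:
# feature_ranges is a hypothetical helper that returns the per-species minimum
# and maximum of one column. If the maximum for one species lies below the
# minimum for the others, that feature alone separates it.
import pandas as pd  # already imported above; repeated so this snippet runs on its own

def feature_ranges(df, feature, by="Species"):
    """Return a DataFrame with the min and max of `feature` for each group in `by`."""
    return df.groupby(by)[feature].agg(["min", "max"])

# Usage, assuming the `iris` DataFrame loaded earlier in this file:
# print(feature_ranges(iris, "PetalLengthCm"))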