@johnptmcdonald
Created November 15, 2019 22:19
# pip install scikit-learn numpy matplotlib
from sklearn.cluster import KMeans
import numpy as np
import matplotlib.pyplot as plt
# make a fake set of data
samples = np.array([
[1, 2],
[11, 12],
[1, 2],
[12, 10],
[4, 1],
[15, 12],
[10, 9],
[4, 2],
[2, 1],
[5, 1],
[14, 13],
])
# plot the data to clearly illustrate the two groups
plt.scatter(samples[:, 0], samples[:, 1])
plt.show()
# create a KMeans model that will look for two groups
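# (n_init and random_state are set explicitly so repeated runs give the same result)
model = KMeans(n_clusters=2, n_init=10, random_state=0)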
# fit the model to the data
model.fit(samples)
# ask the model which points correspond to which group
labels = model.predict(samples)
print('samples:', samples)
print('labels:', labels)
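# Optional sketch (an addition, not part of the original gist): the two arrays
# printed above are hard to compare by eye. The hypothetical helper
# group_by_label collects the points that share a label, assuming the labels
# are in the same order as the points.
from collections import defaultdict

def group_by_label(points, point_labels):
    """Return a dict mapping each label to the list of points that carry it."""
    groups = defaultdict(list)
    for point, point_label in zip(points, point_labels):
        groups[int(point_label)].append(list(point))
    return dict(groups)

# Possible usage with the variables defined above:
# print(group_by_label(samples.tolist(), labels))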
# Given a new point, give it the correct group label
new_datapoints = [[2, 5]]
label = model.predict(new_datapoints)
print(f"The point {new_datapoints[0]} belongs to group {label[0]}")
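# Optional sketch (an addition, not part of the original gist): predict() puts
# a point in the group whose centroid is closest. The hypothetical helper
# nearest_centroid shows that rule by hand, assuming plain Euclidean distance.
# math.dist requires Python 3.8 or newer.
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to the given point."""
    distances = [math.dist(point, centroid) for centroid in centroids]
    return distances.index(min(distances))

# With the model fitted above, this would be expected to agree with predict():
# nearest_centroid(new_datapoints[0], model.cluster_centers_.tolist())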