Ever wondered whether there is a way to explain clustered data?
Machine learning practitioners know SHAP as a go-to tool for explaining ML models. Have you ever thought about how SHAP could be used to explain clusters?
Here is one way to do it:
After normalising the data, run the K-Means algorithm. To find the optimal number of clusters, you can use the elbow method. The clusters are now formed.
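A minimal sketch of this step, assuming scikit-learn and a synthetic 4-feature dataset from `make_blobs` as a stand-in for real data. The range of k values and the final choice of 4 clusters are illustrative assumptions, not something the method prescribes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 300 points, 4 features, 4 underlying groups.
X, _ = make_blobs(n_samples=300, n_features=4, centers=4, random_state=0)

# Normalise so every feature has zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Elbow method: fit K-Means for several k and record the inertia
# (within-cluster sum of squared distances) of each fit.
inertias = {}
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = km.inertia_

for k, value in inertias.items():
    print(k, round(value, 1))

# The "elbow" is the k after which inertia stops dropping sharply.
# It is usually read off a plot by eye; 4 is assumed here because the
# synthetic data was generated with 4 groups.
best_k = 4
kmeans = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X_scaled)
cluster_ids = kmeans.labels_  # one cluster id per data point
```

Plotting `inertias` against k (for example with matplotlib) makes the elbow easier to spot than reading the printed numbers.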
Now comes the explanation part: take the cluster ID as the label for each data point. (E.g. if the dataset has 4 features, it now has one more column, the cluster ID, which serves as the label.)
Now train a RandomForestClassifier on this new dataset, with the original features as X and the cluster ID as y. Pass the trained classifier to SHAP and draw the SHAP summary plot. You can now see why a particular cluster formed and which features had the most impact on it.
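The classifier and SHAP steps could look roughly like this. It is a sketch under assumptions: the data is the same synthetic stand-in, the feature names and the helper `plot_cluster_shap` are hypothetical, and the `shap` package is imported inside the helper so the rest runs without it. Depending on the shap version, `shap_values` returns either a list of per-class arrays or a single 3-D array, so the helper normalises both to a list.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Stand-in data and clustering (assumptions, as in any toy example).
feature_names = ["f1", "f2", "f3", "f4"]  # hypothetical feature names
X, _ = make_blobs(n_samples=300, n_features=4, centers=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)
cluster_ids = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_scaled)

# Original features are X, the cluster id is the label y.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_scaled, cluster_ids)

# Sanity check: the classifier should reproduce the clusters well,
# otherwise its explanations say little about the clustering.
train_acc = model.score(X_scaled, cluster_ids)
print("training accuracy:", train_acc)


def plot_cluster_shap(model, X, feature_names):
    """Hypothetical helper: SHAP summary plot of feature impact per cluster."""
    import shap  # third-party package, imported lazily

    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)
    # Newer shap versions return (samples, features, classes);
    # older ones return a list with one array per class.
    if isinstance(shap_values, np.ndarray) and shap_values.ndim == 3:
        shap_values = [shap_values[:, :, k] for k in range(shap_values.shape[2])]
    class_names = [f"cluster {c}" for c in model.classes_]
    shap.summary_plot(shap_values, X, feature_names=feature_names,
                      class_names=class_names)
    return shap_values


# Requires the shap package and a plotting backend:
# shap_values = plot_cluster_shap(model, X_scaled, feature_names)
```

With the list form, passing a single entry such as `shap_values[0]` to `shap.summary_plot` focuses the plot on one cluster, which helps when asking why that specific cluster formed.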
#clustering #machinelearning #ml
Credits: https://towardsdatascience.com/how-to-make-clustering-explainable-1582390476cc