4.3. Unsupervised Learning

Learning Outcome

Students will be able to use algorithms to draw inferences from datasets consisting of input data without labeled responses.

Sample Tasks

  • Identify data that is relevant to K-means clustering.

  • Describe the basic steps of the K-means clustering algorithm.

  • Interpret an elbow graph to determine the optimal number of clusters.

  • List the advantages and disadvantages of K-means clustering.

  • Use a command such as kmeans() in R to solve applications of K-means clustering.

[OhioDoHEducation21]
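To make the "basic steps" and "elbow graph" tasks above concrete, here is a minimal plain-Python sketch of the standard k-means iteration (often called Lloyd's algorithm). The function name `kmeans`, the toy `points`, and the evenly spaced choice of starting centers are all assumptions made for illustration. Library implementations use more careful initialization and stopping rules.

```python
import math


def kmeans(points, k, max_iter=100):
    """Illustrative k-means on a list of coordinate tuples."""
    # Step 1: pick k starting centers. Here we take points spread evenly
    # through the list (a simplification chosen so the run is repeatable).
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each center to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Inertia: total squared distance from each point to its own center.
    inertia = sum(
        math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels)
    )
    return labels, centers, inertia


# Toy data: three tight groups of three points each.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]

# Inertia for k = 1..5; plotting these against k gives an elbow graph.
inertias = {k: kmeans(points, k)[2] for k in range(1, 6)}
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.3f}")
```

On this toy data the inertia falls steeply until k = 3 and changes little after that. That bend is the "elbow" used to choose the number of clusters.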

Our reading, from Chapter 5 ("Machine Learning") of the Python Data Science Handbook [Van16], explains k-means clustering and how to perform it with scikit-learn (sklearn).
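As a rough preview of the sklearn workflow, the sketch below fits k-means to synthetic data and collects the inertia values needed for an elbow graph. The dataset from `make_blobs` and the specific parameter values (300 samples, 4 centers, `random_state=0`) are assumptions chosen for illustration, not necessarily the reading's own example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data: 300 two-dimensional points scattered around 4 centers.
# We discard the true group labels because k-means does not use them.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# Fit k-means with k = 4. n_init and random_state are set explicitly
# so that repeated runs give the same result.
model = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
labels = model.labels_            # cluster index assigned to each point
centers = model.cluster_centers_  # one row of coordinates per cluster

# Inertia (within-cluster sum of squared distances) for k = 1..8.
# Plotting these values against k produces an elbow graph.
ks = range(1, 9)
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in ks
]
for k, value in zip(ks, inertias):
    print(k, round(value, 1))
```

R users can follow the same pattern with `kmeans()`, which likewise takes the data and the number of centers.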

Reading Questions

  • What goes wrong if k is too small?

  • What goes wrong if k is too large?

  • What is a confusion matrix?