Intro cluster ml #935

Draft
wants to merge 8 commits into
base: main

Conversation

dsbuddy
Collaborator

@dsbuddy dsbuddy commented May 27, 2024

No description provided.

@drelliche
Contributor

Link to rendered course

@drelliche
Contributor

This module is very much in conversation with #908, from which it was broken out. Here is some general feedback:

General Structure

Learning objectives

I think these were just copied over when you split the module in two. For this intro module, the learning objectives might be more along these lines. Learners will be able to:

  • Define crucial vocabulary, including "clustering" and "supervised vs. unsupervised."
  • Understand how the k-means clustering algorithm works, broadly speaking.
  • Identify appropriate applications of k-means clustering.

Pre-Reqs

Since no programming will be done in this module, the pre-reqs should also be updated

Timing

I think 20 minutes is an underestimate, but let's see where it ends up after editing

Quizzes

I love that there are so many quiz questions. I think we might want to move and group them into specific "Quiz" sections. We will also want to double check that they align with the learning objectives once those are smoothed out.

Examples

The biomedical examples are great. I think they would be more effective if each one were used to illustrate a specific concept or algorithm type/part. The classic example of customer segmentation can probably be omitted and replaced with a (general or specific) example of patient segmentation.

Suggested Table of Contents

I want to propose a few changes to the headings/titles and flow of the sections that I think will make the same content easier for learners to follow. This is a suggestion, not a mandate 😄 There are probably things about this suggested structure that overemphasize the wrong things, so let's talk about it.


Introduction to Clustering

What is Clustering?

Example: Patient Stratification

This seems to be a classic example of clustering

Quiz: Clustering

Key Vocabulary

Encouragement box here! Use one of the biomedical examples to illustrate outcome/response/dependent variables/labels and input/predictors/features, etc.

Supervised vs Unsupervised Learning

For this and the next 3 sections, you already have a description. Would it be possible to pair each description with one of your biomedical research examples to illustrate the concept?

Normalization

Distance to the centroid

Visualization

Quiz: what vocabulary is it crucial people know?

Types of Clustering

"There are many different clustering algorithms, each with its own strengths and weaknesses. Some of the most common clustering algorithms include K-Means clustering, hierarchical clustering, and Gaussian Mixture Models (GMMs)" Expand a little more on this.

K-Means Clustering

This is the "how it works" description

A K-Means Example

Is there a video/image/illustration/example that you can link to or embed?
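
If no existing illustration fits, one option is to generate a small figure or worked table from a toy example. The sketch below is only an illustration under stated assumptions: the data points, the `kmeans` function name, and the hand-picked starting centroids are all hypothetical, and the module would not need to show this code to learners. It runs the two standard k-means steps (assign each point to its nearest centroid, then move each centroid to the mean of its points) until nothing changes.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on a list of 2-D points.

    `centroids` holds the starting centers. They are passed in explicitly
    (an assumption made here for reproducibility) instead of chosen at random.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous center.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two visibly separate groups of "patients",
# each described by two made-up measurements.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Plotting `points` colored by `labels` at each iteration would give the kind of step-by-step picture this section is asking for.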

Potential Pitfalls

This is the "important notes" page

Quiz: K-Means Clustering

Additional Resources

Feedback

@drelliche drelliche marked this pull request as draft July 17, 2024 15:28
3 participants