- Goal: learn about different types of clustering algorithms and apply them to datasets in notebooks.
- Dates: from 31 August to 6 September
- Where: #project-of-the-week in DataTalks.Club (join Slack here: https://datatalks.club/slack.html)
For more information about the "Project of the Week" initiative at DataTalks.Club, see README.md.
If you want to receive reminders about this event, sign up here
- Scikit-Learn
- Jupyter notebooks
Note: this is a suggested list of technologies; you can choose alternatives instead.
This is only a proposed plan; you don't have to follow it day by day.
- Come up with a project idea
- Select the dataset for your project
- Create a GitHub repository
- Share your progress in Slack and on social media
- Learn about K-means clustering
- Push your changes to GitHub
- Share your progress in Slack and on social media
Suggested materials:
Found good materials? Create a PR with links!
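
To get started with K-means, here is a minimal sketch using scikit-learn. The synthetic data from `make_blobs`, the choice of three clusters, and the variable names are assumptions made for illustration; replace them with your own dataset and settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Illustrative data: three well-separated 2D blobs (not a real project dataset).
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 0], [5, 5]],
    cluster_std=0.5,
    random_state=42,
)

# K-means needs the number of clusters up front; 3 is an assumption here.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# Sort the learned centers by their first coordinate for easier reading.
centers = kmeans.cluster_centers_
centers_sorted = centers[np.argsort(centers[:, 0])]

print("Cluster centers:\n", centers_sorted)
print("Inertia (within-cluster sum of squares):", kmeans.inertia_)
```

On real data the number of clusters is usually unknown, so it is common to fit several values of `n_clusters` and compare inertia or silhouette scores.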
- Learn about Mean-shift Clustering
- Push your changes to GitHub
- Share your progress in Slack and on social media
Suggested materials:
- 📺 StatQuest: Hierarchical Clustering
- 📺 Unsupervised Machine Learning - Hierarchical Clustering with Mean Shift Scikit-learn and Python
Found good materials? Create a PR with links!
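
Mean-shift does not need the number of clusters in advance; it needs a bandwidth instead. The sketch below is one possible starting point with scikit-learn. The synthetic data and the `quantile=0.2` setting are assumptions for illustration, not recommended values.

```python
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

# Illustrative data: three well-separated 2D blobs (not a real project dataset).
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 0], [5, 5]],
    cluster_std=0.5,
    random_state=42,
)

# Estimate a kernel bandwidth from the data; the quantile is an assumption
# and usually needs tuning for a real dataset.
bandwidth = estimate_bandwidth(X, quantile=0.2, random_state=42)

# Mean-shift moves each point toward the nearest density peak (mode).
ms = MeanShift(bandwidth=bandwidth)
labels = ms.fit_predict(X)
n_clusters = len(ms.cluster_centers_)

print("Estimated bandwidth:", bandwidth)
print("Number of clusters found:", n_clusters)
```

A smaller bandwidth tends to produce more clusters and a larger one fewer, so it is worth trying several values.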
- Learn about Agglomerative Hierarchical Clustering
- Push your changes to GitHub
- Share your progress in Slack and on social media
Suggested materials:
- 📺 Hierarchical Clustering | Agglomerative and Divisive Hierarchical Clustering Explained | Edureka
- 📺 Agglomerative Clustering: how it works
- 📺 How to Perform Hierarchical Clustering in Python (Step by Step)
Found good materials? Create a PR with links!
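
For agglomerative hierarchical clustering, a minimal sketch could look like the one below. It uses scikit-learn for the flat cluster labels and SciPy for the full merge tree. The synthetic data, Ward linkage, and three clusters are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Illustrative data: three well-separated 2D blobs (not a real project dataset).
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 0], [5, 5]],
    cluster_std=0.5,
    random_state=42,
)

# Bottom-up merging: every point starts as its own cluster, and the two
# closest clusters (by Ward's criterion) are merged until 3 remain.
agg = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = agg.fit_predict(X)
cluster_sizes = np.bincount(labels)

# The full merge history; each row records one merge step.
# It can be passed to scipy.cluster.hierarchy.dendrogram for plotting.
Z = linkage(X, method="ward")

print("Cluster sizes:", cluster_sizes)
print("Linkage matrix shape:", Z.shape)
```

Plotting the dendrogram from `Z` is a common way to decide where to cut the tree when the number of clusters is not known.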
- Learn about DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
- Push your changes to GitHub
- Share your progress in Slack and on social media
Suggested materials:
- 📺 Clustering with DBSCAN, Clearly Explained!!!
- 📺 DBSCAN Clustering Easily Explained with Implementation
- 📺 DBSCAN Algorithm In Python | DBSCAN clustering Algorithm example | Density based clustering python
Found good materials? Create a PR with links!
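
DBSCAN groups points that lie in dense regions and marks isolated points as noise, so it can find non-spherical clusters. The sketch below tries it on the two-moons toy dataset from scikit-learn. The data and the `eps` and `min_samples` values are assumptions for illustration and will need tuning on real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Illustrative data: two interleaving half circles with a little noise.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)

# eps: neighborhood radius; min_samples: neighbors needed for a core point.
# Both values are assumptions chosen for this toy dataset.
db = DBSCAN(eps=0.2, min_samples=5).fit(X)
labels = db.labels_

# DBSCAN labels noise points as -1, so exclude that label from the count.
n_clusters = len(set(labels.tolist())) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))

print("Clusters found:", n_clusters)
print("Noise points:", n_noise)
```

Because `eps` is a distance, features on very different scales usually need to be standardized before running DBSCAN.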
- Learn about Expectation–Maximization (EM) Clustering using Gaussian Mixture Models (GMM)
Suggested materials:
- 📺 Clustering (4): Gaussian Mixture Models and EM
- 📺 EM Algorithm In Machine Learning | Expectation-Maximization | Machine Learning Tutorial | Edureka
Found good materials? Create a PR with links!
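
Unlike K-means, a Gaussian Mixture Model fitted with EM gives soft assignments: each point gets a probability of belonging to each component. A minimal sketch with scikit-learn follows. The synthetic data and the choice of three components are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Illustrative data: three well-separated 2D blobs (not a real project dataset).
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 0], [5, 5]],
    cluster_std=0.5,
    random_state=42,
)

# fit() runs the EM iterations: the E-step computes responsibilities,
# the M-step updates the weights, means, and covariances.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=42)
gmm.fit(X)

labels = gmm.predict(X)       # hard assignment: most likely component
probs = gmm.predict_proba(X)  # soft assignment: one probability per component

print("Converged:", gmm.converged_)
print("Component weights:", np.round(gmm.weights_, 3))
print("BIC:", gmm.bic(X))
```

Comparing `gmm.bic(X)` across different values of `n_components` is one common way to choose the number of components.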
- Continue exploring more about this topic
- Polish the documentation for your project
- Push your changes to GitHub
- Share your progress in Slack and on social media
- Give us feedback
- Add the link to your project to this Project of the Week GitHub page
- 🏫 Advanced ML course from Google: Clustering
- 🏫 Cluster Analysis in Data Mining (Coursera)
- 🗒️ The 5 Clustering Algorithms Data Scientists Need to Know
- 🗒️ Awesome Clustering Resources
- 🗒️ Clustering — When You Should Use it and Avoid It
- 🗒️ How to Answer Business Questions Using Cluster Analysis
Notebooks:
- 💾 Clustering datasets on Kaggle
- 💾 UCI Machine Learning Repository: Clustering
- 💾 Clustering basic benchmark
Note: if you know of other good resources about clustering, send a PR.
- 🏫 Course
- 💾 Dataset
- 🗒️ Article
- 📺 Video tutorial
- 💻 Code
List of projects from our participants:
- Customer Segmentation by Esteban Encina
- Project link 2
- ...
- (Create a PR)