This repository contains the civic project outcomes of team click.ai, produced in cooperation with Sandbox Network, Inc. (Korea).
This notebook analyzes thumbnail data following the data science lifecycle. The civic project is a collaboration between Minerva University (Fall 2021, Sophomore, Seoul) and Sandbox Network. Sandbox is a leading multi-channel network (MCN) in South Korea that supports over 450 digital creators and their content.
Data science life cycle reference:
https://towardsdatascience.com/stoend-to-end-data-science-life-cycle-6387523b5afc
Information about Sandbox:
https://www.kedglobal.com/kunicornsView/kun0004
https://www.kedglobal.com/newsView/ked202107220014
Sources:
- Face detection with Haar Cascade Classifier
  - https://towardsdatascience.com/face-detection-with-haar-cascade-727f68dafd08
  - https://github.com/suniljs6/Counting-number-of-faces-in-a-picture-using-python-opencv
- Text detection with EAST (Efficient and Accurate Scene Text detector)
  - https://towardsdatascience.com/scene-text-detection-and-recognition-using-east-and-tesseract-6f07c249f5de
  - https://medium.com/technovators/scene-text-detection-in-python-with-east-and-craft-cbe03dda35d5
  - https://github.com/ZER-0-NE/EAST-Detector-for-text-detection-using-OpenCV/blob/master/opencv_text_detection_image.py
Data science lifecycle stages:
- Business understanding
- Data understanding
- Data preparation (cleaning + processing)
- Exploratory data analysis
- Modelling
- Evaluation
- Deployment
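
To make the Haar cascade face detection listed under Sources more concrete, here is a minimal sketch of counting faces per thumbnail with OpenCV. It is not the project's actual code: the function names `count_faces` and `faces_per_thumbnail`, the choice of the bundled `haarcascade_frontalface_default.xml` cascade, and the `scaleFactor` / `minNeighbors` values are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable


def count_faces(image_path: str) -> int:
    """Count frontal faces in one image with OpenCV's bundled Haar cascade.

    Hypothetical helper for illustration; requires the opencv-python package.
    """
    # Imported lazily so the module still loads where OpenCV is not installed.
    import cv2

    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising when a file is unreadable.
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Haar cascades operate on single-channel (grayscale) images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    # Assumed, untuned parameters: a smaller scaleFactor or minNeighbors
    # finds more candidate faces but also more false positives.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces)


def faces_per_thumbnail(
    paths: Iterable[str],
    counter: Callable[[str], int] = count_faces,
) -> Dict[str, int]:
    """Map each thumbnail path to its detected face count."""
    return {path: counter(path) for path in paths}
```

Calling `faces_per_thumbnail(["thumb_001.jpg", "thumb_002.jpg"])` (hypothetical file names) would return a dictionary of face counts that can be joined to the thumbnail data as a feature. The `counter` argument lets a different detector be swapped in without changing the aggregation step.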