110th Place Solution for Home Depot Product Search Relevance
scikit-learn
pandas
numpy
pyenchant
keras
xgboost
The project configuration can be changed in configs.py.
Note: Use python3 as the default interpreter.
Before running any of the files, copy the data to the input/ folder. The project structure should then look like this:
Kaggle_HomeDepot
└───input
│ train.csv
│ test.csv
│ attributes.csv
│ product_descriptions.csv
│ sample_submission.csv
└───scripts
│ README.md
│ ...
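As a quick sanity check before running anything, the layout above can be verified with a few lines of Python. This is only a minimal sketch: the helper name missing_input_files and the constant EXPECTED_INPUT_FILES are hypothetical and not part of the project, and the file names are simply copied from the tree above.

```python
from pathlib import Path

# File names copied from the project layout shown above.
EXPECTED_INPUT_FILES = [
    "train.csv",
    "test.csv",
    "attributes.csv",
    "product_descriptions.csv",
    "sample_submission.csv",
]


def missing_input_files(project_root="."):
    """Return the expected data files that are absent from <project_root>/input.

    Hypothetical helper for illustration; it is not shipped with the project.
    """
    input_dir = Path(project_root) / "input"
    return [name for name in EXPECTED_INPUT_FILES
            if not (input_dir / name).is_file()]
```

Calling missing_input_files("Kaggle_HomeDepot") should return an empty list once all five CSV files have been copied into input/.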
To regenerate the results, run the files below in the order listed.
generate_settings.py
- Generates settings for the project.
preprocess.py
- Initial cleaning of the data.
feature_generater.py
- Cleans the data and generates TF-IDF features.
features_distance.py
- Generates distance and counting features.
generate_dataset_svd50x3_distance.py
- Combines all the individual features and generates a dataset.
stacked_generalization.py
- Trains all the machine learning models and stacks their results to create the submission.
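The run order above can also be scripted. The following is a rough sketch, not part of the project: the function run_pipeline and the PIPELINE list are hypothetical names, and it assumes the scripts live in the scripts/ folder and are meant to be launched from inside it. Adjust scripts_dir if your checkout differs.

```python
import subprocess
import sys

# Script names in the order listed above.
PIPELINE = [
    "generate_settings.py",
    "preprocess.py",
    "feature_generater.py",
    "features_distance.py",
    "generate_dataset_svd50x3_distance.py",
    "stacked_generalization.py",
]


def run_pipeline(scripts_dir="scripts", dry_run=False):
    """Run each script in order with the current Python interpreter.

    Assumption: the scripts sit in `scripts_dir` and expect it as the
    working directory. With dry_run=True nothing is executed and the
    commands are only returned.
    """
    commands = [[sys.executable, name] for name in PIPELINE]
    if not dry_run:
        for command in commands:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, cwd=scripts_dir, check=True)
    return commands
```

Calling run_pipeline(dry_run=True) first is a cheap way to confirm the order before starting what is likely a long run. Because of check=True, a failure in an early step raises an exception instead of letting later steps run on incomplete features.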