Ensembled 12 CatBoost, XGBoost, and LightGBM models via stacking, with an Elastic-Net logistic regression meta-model
I competed in a team of 3 in a Columbia Engineering competition open to Master's students, and placed 4th out of 200 participants.
This competition was adapted from Avazu's highly popular Kaggle competition in 2015, linked here.
Avazu is a programmatic advertising platform that uses machine learning to decide which mobile advertisements get pushed to which consumers. Its aim is to maximize advertising effectiveness by ensuring the correct target group receives the advertisements they are most interested in.
The goal of this competition is to predict consumer click-through rates on mobile advertisements, i.e., whether a consumer clicks on a given advertisement. In online advertising, click-through rate is a key metric for evaluating ad performance, used in both sponsored search and real-time bidding.
This competition uses 4 million rows with ~30 features containing information about the consumer's mobile device, the mobile advertisement, the website on which the advertisement was encountered, etc. Each row represents a specific user on a specific mobile device encountering a specific mobile advertisement, along with the target: whether the user clicked on it.
- Obtain training and test sets via a time-based split: Because the data is temporal, splitting chronologically prevents leakage from the future into the past
- Feature engineering to identify unique consumers from device data
- Feature engineering to extract time-based features (e.g., hour of day, day of week)
- Feature cleaning: Group rare categorical values into a single bucket
- Encode categorical features: Using Hash Encoding, Ordered Target Encoding, Ordinal Encoding
- Train & tune hyperparameters of CatBoost, XGBoost, and LightGBM models using Bayesian Optimization
- Ensemble models: Using Stacking, with Elastic Net Logistic Regression as meta-model
- Re-run on full data
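The time-based split in the steps above can be sketched as follows. Avazu's raw data encodes timestamps in an `hour` column formatted `YYMMDDHH`, so a chronological split is just a threshold on that column; the cutoff value and toy rows here are illustrative assumptions.

```python
import pandas as pd

# Toy rows in Avazu's YYMMDDHH format (values made up for illustration)
df = pd.DataFrame({
    "hour": [14102100, 14102200, 14102300, 14102400],
    "click": [0, 1, 0, 1],
})

cutoff = 14102300  # assumed cutoff: later days held out for validation
train = df[df["hour"] < cutoff]
test = df[df["hour"] >= cutoff]
```

Splitting by time rather than randomly mirrors how the model is used in production: it is always trained on the past and scored on the future.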
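One common way to approximate a unique consumer in this dataset is to concatenate device fields, and the time features come from the same `hour` column; the specific field combination and feature names below are illustrative assumptions, not necessarily the exact features we engineered.

```python
import pandas as pd

# Toy rows with Avazu-style columns (values are made up for illustration)
df = pd.DataFrame({
    "hour": [14102113, 14102209],          # YYMMDDHH
    "device_id": ["a99f214a", "c357dbff"],
    "device_ip": ["ddd2926e", "96809ac8"],
    "device_model": ["44956a24", "711ee120"],
})

# Assumed proxy for a unique consumer: concatenate device fields
df["user_id"] = df["device_id"] + "_" + df["device_ip"] + "_" + df["device_model"]

# Parse YYMMDDHH and extract time-based features
ts = pd.to_datetime(df["hour"].astype(str), format="%y%m%d%H")
df["hour_of_day"] = ts.dt.hour
df["day_of_week"] = ts.dt.dayofweek
```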
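Rare-value cleaning can be as simple as bucketing infrequent category levels under one label, which keeps the encoders from overfitting to values seen only a handful of times; the threshold and the `"RARE"` label below are assumptions for illustration.

```python
import pandas as pd

s = pd.Series(["a", "a", "a", "b", "c", "a"])

threshold = 2  # assumed minimum frequency to keep a level as-is
counts = s.value_counts()
rare = counts[counts < threshold].index

# Replace every infrequent level with a shared "RARE" bucket
cleaned = s.where(~s.isin(rare), "RARE")
```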
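Two of the encoders listed above can be sketched briefly: hash encoding maps high-cardinality categories into a fixed number of buckets, and ordered target encoding replaces a category with the expanding mean of the target over earlier rows only, so each row never sees its own label (this is the scheme CatBoost applies natively). The bucket count, column names, and choice of md5 are assumptions for the sketch.

```python
import hashlib
import pandas as pd

def hash_encode(value: str, n_buckets: int = 2**10) -> int:
    # Hashing trick: deterministic map from category string to bucket index
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % n_buckets

sites = pd.Series(["siteA", "siteB", "siteA"])
hashed = sites.map(hash_encode)  # identical categories share a bucket

# Ordered target encoding: expanding mean of the target over prior rows,
# computed in row (time) order so each row only uses the past
df = pd.DataFrame({"cat": ["x", "x", "y", "x"], "click": [1, 0, 1, 1]})
prior = df["click"].mean()  # global prior for a category's first occurrence
df["cat_te"] = (
    df.groupby("cat")["click"]
    .transform(lambda s: s.shift().expanding().mean())
    .fillna(prior)
)
```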
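Bayesian optimization itself is just a loop of "fit a surrogate, pick the next point": the toy below uses a Gaussian-process surrogate with a lower-confidence-bound acquisition over a stand-in objective, since running real CatBoost/XGBoost/LightGBM cross-validation here would be heavy. In practice a library such as Optuna or scikit-optimize wraps this loop for you; everything below is an illustrative sketch, not our exact tuner.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Stand-in for CV log-loss as a function of one hyperparameter
    # (e.g. learning rate); a real run would train and cross-validate here
    return (x - 0.3) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(4, 1))          # random initial evaluations
y = np.array([objective(x[0]) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(10):
    gp.fit(X, y)
    cand = rng.uniform(0, 1, size=(256, 1))  # candidate pool
    mu, sigma = gp.predict(cand, return_std=True)
    # Lower confidence bound: evaluate where mean - k*std is lowest,
    # trading off exploitation (low mean) and exploration (high std)
    x_next = cand[np.argmin(mu - 1.0 * sigma)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

best = X[np.argmin(y)][0]  # best hyperparameter value found
```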
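The stacking step maps directly onto scikit-learn's `StackingClassifier`: base models produce out-of-fold probabilities and the meta-model learns to combine them. Here the 12 gradient-boosted base models are stood in for by two light sklearn estimators so the sketch runs anywhere, and the elastic-net meta-model is a `LogisticRegression` with the `saga` solver (the only sklearn solver supporting the `elasticnet` penalty). Dataset and hyperparameters are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Stand-in base learners (the actual project stacked 12 CatBoost,
# XGBoost, and LightGBM models)
base = [
    ("gbm", GradientBoostingClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Elastic-net logistic regression meta-model
meta = LogisticRegression(penalty="elasticnet", solver="saga",
                          l1_ratio=0.5, max_iter=5000)

# Base models are stacked on out-of-fold predicted probabilities
stack = StackingClassifier(estimators=base, final_estimator=meta,
                           cv=3, stack_method="predict_proba")
stack.fit(X, y)
proba = stack.predict_proba(X)[:, 1]
```

Feeding the meta-model probabilities (`stack_method="predict_proba"`) rather than hard labels preserves each base model's confidence, which the elastic-net regression can then weight and shrink.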
Check out this project on my website here :)