Recommendation systems are critical for improving user engagement by suggesting relevant products or services. Two popular approaches include:
- Collaborative Filtering: Recommends items by finding users (or items) with similar interaction histories.
- Content-Based Filtering: Recommends items whose attributes resemble those a user has already engaged with.
However, these methods face challenges with sparse or insufficient data. To address these limitations, we develop a Hybrid Recommendation System that combines both approaches, leveraging the LightFM library.
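To make the distinction concrete, here is a minimal pure-Python sketch of the two signals a hybrid model draws on. The users, items, and tags are invented for illustration and are not taken from the project's dataset. LightFM learns embeddings rather than computing these similarities directly, so treat this as intuition only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Collaborative signal: users described by the items they bought (toy data).
purchases = {
    "u1": {"mug": 1, "candle": 1},
    "u2": {"mug": 1, "candle": 1, "card": 1},
    "u3": {"lamp": 1},
}

# Content signal: items described by their attributes (toy data).
item_tags = {
    "mug": {"kitchen": 1, "ceramic": 1},
    "bowl": {"kitchen": 1, "ceramic": 1},
    "lamp": {"lighting": 1},
}

# Overlap in purchase history between two users.
user_sim = cosine(purchases["u1"], purchases["u2"])
# Overlap in attributes between two items.
item_sim = cosine(item_tags["mug"], item_tags["bowl"])

print(f"user similarity u1/u2: {user_sim:.3f}")
print(f"item similarity mug/bowl: {item_sim:.3f}")
```

Note that `bowl` has no purchases in the toy data, so a purely collaborative method has nothing to go on, while the content signal still links it to `mug`. This is the kind of gap a hybrid model is meant to cover.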
The dataset contains transactional data from a UK-based online retail company specializing in unique gifts for various occasions. Key features include:
- User IDs: Identifiers for customers.
- Item IDs: Identifiers for products.
- Transactions: User-item interactions.
- Ratings/Feedback: Implicit or explicit feedback data.
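As a rough sketch of how transaction rows could become implicit-feedback interactions, the snippet below maps raw IDs to contiguous indices and sums quantities per user-item pair. The field layout `(customer_id, product_code, quantity)`, the sample values, and the helper name `build_interactions` are assumptions made for illustration; the actual columns in the input file may differ.

```python
from collections import defaultdict

# Toy transaction rows: (customer_id, product_code, quantity). Hypothetical values.
transactions = [
    ("C001", "P10", 2),
    ("C001", "P20", 1),
    ("C002", "P10", 5),
    ("C001", "P10", 1),   # repeat purchase of the same product
    ("C003", "P30", -1),  # non-positive quantity, dropped during cleaning
]

def build_interactions(rows):
    """Map raw IDs to contiguous indices and sum quantities per (user, item)."""
    user_index, item_index = {}, {}
    weights = defaultdict(int)
    for user, item, qty in rows:
        if qty <= 0:  # skip rows that carry no positive signal
            continue
        u = user_index.setdefault(user, len(user_index))
        i = item_index.setdefault(item, len(item_index))
        weights[(u, i)] += qty
    return user_index, item_index, dict(weights)

user_index, item_index, interactions = build_interactions(transactions)
print(user_index)
print(item_index)
print(interactions)
```

The resulting `(user, item) -> weight` mapping has the same shape as a sparse interaction matrix, which is the input format matrix-factorization libraries generally expect.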
The goal is to build a Hybrid Recommendation System and compare the loss functions provided by the LightFM library:
- WARP (Weighted Approximate-Rank Pairwise)
- Logistic Loss
- BPR (Bayesian Personalized Ranking)
- Programming Language: Python
- Libraries:
- Load essential Python libraries for data manipulation and model building.
- Import the dataset and combine user and item features into a single dataset for analysis.
- Clean and preprocess the data for input into the LightFM model.
- Divide the dataset into training and testing sets to evaluate model performance.
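- Train LightFM models using:
  - WARP Loss Function: Optimizes the top of the ranked list; suited to implicit (positive-only) feedback.
  - Logistic Loss Function: Suited to data that contains both positive and negative interactions.
  - BPR Loss Function: Pairwise ranking loss for implicit feedback that ranks observed items above unobserved ones.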
- Integrate content-based and collaborative filtering data to train the hybrid model.
- Use the trained hybrid model to recommend items for users based on their interactions and preferences.
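The three losses named in the training step differ in what they penalize. The sketch below shows simplified textbook forms for a single example, written in plain Python. It is not LightFM's implementation, which lives in compiled code and differs in detail; the function names and the `margin` default are assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logistic_loss(score, label):
    """Pointwise loss; label is +1 (positive) or -1 (negative)."""
    return math.log(1.0 + math.exp(-label * score))

def bpr_loss(pos_score, neg_score):
    """Pairwise loss; small when the positive item outscores the negative."""
    return -math.log(sigmoid(pos_score - neg_score))

def warp_rank_estimate(pos_score, sampled_neg_scores, n_items, margin=1.0):
    """WARP-style rank estimate.

    Tries sampled negatives in order until one violates the margin, then
    estimates the positive item's rank from the number of tries needed.
    Returns 0 if no sampled negative violates the margin (no update needed).
    """
    for trials, neg in enumerate(sampled_neg_scores, start=1):
        if neg > pos_score - margin:
            return (n_items - 1) // trials
    return 0

print(logistic_loss(0.0, 1))
print(bpr_loss(3.0, 0.0), bpr_loss(0.0, 3.0))
print(warp_rank_estimate(2.0, [0.1, 0.5, 1.5, 0.2], n_items=5))
```

A violation found on the first try implies the positive item is probably ranked low, so the estimated rank (and hence the update) is large. A violation found only after many tries implies the item is already near the top.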
.
├── input/ # Contains input data files (e.g., `data.xlsx`).
├── src/ # Source code folder.
│   ├── engine.py # Main script to execute the pipeline.
│   └── ML_pipeline/ # Modular Python functions for preprocessing and modeling.
│       ├── data_preparation.py # Functions for data preprocessing.
│       ├── model_training.py # Functions for training LightFM models.
│       └── recommendation.py # Functions to generate recommendations.
├── output/ # Stores saved models and results.
├── lib/ # Reference materials and Jupyter notebooks.
├── requirements.txt # Lists dependencies and versions.
└── README.md # Project documentation.
git clone <repository_url>
cd <repository_folder>
Install all required Python libraries using:
pip install -r requirements.txt
Execute the pipeline by running the `engine.py` file:
python src/engine.py
- Check the `output/` folder for saved models and generated recommendations.
- Review reference notebooks in the `lib/` folder for detailed explanations.
If you encounter issues installing the `lightfm` package, try the following steps:
Run the following commands in your terminal:
python -m pip install --upgrade pip
pip install --upgrade wheel
pip install --upgrade setuptools
Close the terminal and retry installing `lightfm`.
Download and install the required tools from: Microsoft Visual C++ Build Tools
Retry installing `lightfm` after the installation.
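To confirm whether the installation succeeded in the interpreter you are actually using, a small check like the following may help. It relies only on the standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util
import sys

def is_installed(package_name):
    """Return True if the package can be located by the current interpreter."""
    return importlib.util.find_spec(package_name) is not None

# Shows which Python is running, useful when several environments exist.
print("Python executable:", sys.executable)
print("lightfm importable:", is_installed("lightfm"))
```

If the second line prints `False` after a seemingly successful install, the package was likely installed into a different environment than the one running the script.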
- Hybrid Model Performance:
  - Combined the strengths of collaborative and content-based filtering.
  - Achieved personalized and accurate recommendations.
- Generated Recommendations:
  - Delivered tailored item suggestions based on user preferences and item features.
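One common way to quantify results like these is precision@k on held-out interactions. The sketch below shows the idea for a single user in plain Python; the scores, item names, and helper names are invented for illustration and are not output from this project. LightFM also ships its own `precision_at_k` in `lightfm.evaluation`, which operates on sparse matrices instead.

```python
def top_n(scores, seen, n):
    """Return the n highest-scoring item IDs the user has not interacted with."""
    candidates = [(item, s) for item, s in scores.items() if item not in seen]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:n]]

def precision_at_k(recommended, held_out, k):
    """Fraction of the top-k recommendations that appear in the held-out set."""
    if k <= 0:
        return 0.0
    return sum(1 for item in recommended[:k] if item in held_out) / k

# Toy values for one user; in practice the scores come from a trained model.
scores = {"mug": 0.9, "card": 0.7, "lamp": 0.4, "bowl": 0.8, "candle": 0.1}
seen_in_training = {"mug"}
held_out_purchases = {"bowl", "lamp"}

recs = top_n(scores, seen_in_training, n=3)
p_at_3 = precision_at_k(recs, held_out_purchases, k=3)
print(recs, round(p_at_3, 3))
```

Excluding items already seen in training keeps the recommendation list from simply echoing past purchases, which would otherwise inflate the apparent quality of the suggestions.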
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch:
git checkout -b feature-name
- Commit your changes:
git commit -m "Add feature"
- Push your branch:
git push origin feature-name
- Open a pull request.
This project is licensed under the MIT License. See the `LICENSE` file for details.
For any questions or suggestions, please reach out to:
- Name: Abhinav Navneet
- Email: mailme.AbhinavN@gmail.com
- GitHub: AjNavneet
Special thanks to:
- LightFM for providing a robust library for hybrid recommendations.
- The Python open-source community for excellent tools and resources.