The Copper Industry Sales and Leads Prediction Project deals with sales and pricing data that are often skewed and noisy, which makes manual predictions slow and unreliable. To overcome this, the project develops two machine learning models: a regression model that predicts 'Selling_Price' and a classification model that predicts lead status (WON or LOST).
- Python: The programming language used throughout the project.
- Pandas and NumPy: Libraries used for data manipulation and preprocessing.
- Scikit-Learn: A powerful machine learning library that includes tools for regression and classification models.
- Streamlit: A user-friendly library for creating web applications with minimal code, perfect for building an interactive interface for our models.
- Identify and address skewness and outliers in the dataset.
- Transform data into a format suitable for model training.
- Clean and preprocess data, handling missing values.
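
The preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper `preprocess`, the `skew_threshold` parameter, the `quantity_tons` column, and the synthetic data are all assumptions made for the example. Median imputation, a `log1p` transform, and IQR clipping are one common set of choices among several.

```python
import numpy as np
import pandas as pd


def preprocess(df, skew_threshold=1.0):
    """Impute missing values, reduce right skew, and cap outliers (illustrative helper)."""
    out = df.copy()  # leave the caller's frame untouched
    for col in out.select_dtypes(include="number").columns:
        # Fill missing values with the column median, which is robust to outliers.
        out[col] = out[col].fillna(out[col].median())
        # Log-transform strongly right-skewed columns; log1p needs non-negative values here.
        if out[col].skew() > skew_threshold and (out[col] >= 0).all():
            out[col] = np.log1p(out[col])
        # Cap remaining outliers at the 1.5 * IQR fences (values are clipped, rows are kept).
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out


# Synthetic, right-skewed data standing in for the real dataset (assumption).
rng = np.random.default_rng(0)
raw = pd.DataFrame({
    "quantity_tons": rng.lognormal(mean=3.0, sigma=1.0, size=500),
    "Selling_Price": rng.lognormal(mean=6.0, sigma=0.8, size=500),
})
raw.loc[::50, "quantity_tons"] = np.nan  # inject some missing values

clean = preprocess(raw)
print("skew before:", raw.skew().round(2).to_dict())
print("skew after: ", clean.skew().round(2).to_dict())
```

Clipping keeps every row, which matters when the dataset is small. Dropping outlier rows instead would be an equally reasonable choice for a larger dataset.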
- Use a machine learning regression model to predict the continuous variable 'Selling_Price'.
- Apply data normalization and feature scaling before training.
- Develop a classification model to predict lead statuses (WON or LOST).
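
A minimal sketch of the two modeling tasks is shown below, using scikit-learn pipelines so that scaling is fitted on the training split only. The model choices (`Ridge` and `LogisticRegression`), the feature names `log_quantity` and `thickness`, and the synthetic data are assumptions for illustration; the project may well use different estimators and columns.

```python
import numpy as np
import pandas as pd
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features and targets standing in for the real dataset (assumption).
rng = np.random.default_rng(42)
n = 1000
X = pd.DataFrame({
    "log_quantity": rng.normal(3.0, 1.0, n),
    "thickness": rng.uniform(0.5, 5.0, n),
})
selling_price = np.exp(
    5.0 + 0.3 * X["log_quantity"] + 0.2 * X["thickness"] + rng.normal(0.0, 0.1, n)
)
status = np.where(X["thickness"] + rng.normal(0.0, 0.5, n) > 2.75, "WON", "LOST")

X_train, X_test, price_train, price_test, status_train, status_test = train_test_split(
    X, selling_price, status, test_size=0.25, random_state=0
)

# Regression: scale the features, and fit on log1p(price) to tame the skewed target.
# Predictions are mapped back to the original price scale with expm1.
regressor = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    func=np.log1p,
    inverse_func=np.expm1,
)
regressor.fit(X_train, price_train)
r2 = regressor.score(X_test, price_test)

# Classification: scale the features, then predict the WON / LOST label.
classifier = make_pipeline(StandardScaler(), LogisticRegression())
classifier.fit(X_train, status_train)
accuracy = classifier.score(X_test, status_test)

print(f"Selling_Price R^2 on held-out data: {r2:.3f}")
print(f"Lead status accuracy on held-out data: {accuracy:.3f}")
```

Wrapping the scaler and the estimator in one pipeline means the same transformation is applied automatically at prediction time, which avoids leaking test-set statistics into training.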
- Create an interactive Streamlit web page.
- Enter column values to get the predicted 'Selling_Price' or lead status (WON/LOST).
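
One way such a page might be laid out is sketched below. Everything here is hypothetical: the `FEATURES` list, the helpers `train_demo_models`, `predict_one`, and `render_app`, and the tiny stand-in models fitted on synthetic data. A real app would load the project's trained models instead of fitting demo ones.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression

FEATURES = ["quantity_tons", "thickness"]  # hypothetical input columns


def train_demo_models(seed=0):
    """Fit tiny stand-in models on synthetic data (assumption; not the project's models)."""
    rng = np.random.default_rng(seed)
    X = pd.DataFrame(rng.uniform(1.0, 10.0, size=(200, 2)), columns=FEATURES)
    price = 100.0 + 20.0 * X["quantity_tons"] + 5.0 * X["thickness"]
    status = np.where(X["thickness"] > 5.5, "WON", "LOST")
    regressor = LinearRegression().fit(X, price)
    classifier = LogisticRegression(max_iter=1000).fit(X, status)
    return regressor, classifier


def predict_one(model, values):
    """Build a one-row frame from a dict of user inputs and return the single prediction."""
    row = pd.DataFrame([values], columns=FEATURES)
    return model.predict(row)[0]


def render_app():
    """Draw the page: pick a task, enter column values, show the prediction."""
    import streamlit as st  # imported here so the helpers above work without Streamlit

    regressor, classifier = train_demo_models()
    st.title("Copper Selling Price and Lead Status Prediction")
    task = st.radio("Prediction", ["Selling_Price", "Lead status"])
    values = {
        name: st.number_input(name, min_value=0.0, value=5.0) for name in FEATURES
    }
    if st.button("Predict"):
        if task == "Selling_Price":
            st.write(f"Predicted Selling_Price: {predict_one(regressor, values):.2f}")
        else:
            st.write(f"Predicted lead status: {predict_one(classifier, values)}")
```

To try it, the block could be saved as a hypothetical `app.py` with a final `render_app()` call and launched with `streamlit run app.py`. Keeping `predict_one` separate from the page code makes the prediction logic testable without a browser.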
This project addresses critical challenges in the copper industry by employing machine learning techniques for sales and lead prediction. The developed models enhance decision-making efficiency, providing a robust solution for accurate 'Selling_Price' predictions and lead classification.