Added waste management through rl techniques #1152

Panchadip-128
Contributor

The project aims to develop a reinforcement learning (RL) agent to optimize waste collection in a simulated environment, minimizing overflow events and improving efficiency.

Environment and State Representation:
The state is represented by four features:
- Waste Level: current waste level (0 to 1).
- Time of Day: a random value representing the time (0 to 24 hours).
- Weather Condition: a random value (0 to 1) indicating the weather.
- Distance to Collection Point: a random value (0 to 10) representing the distance to the waste collection point.
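
A minimal sketch of how such a four-feature state could be sampled. It is not taken from the PR's notebook: the `WasteState` class and `sample_state` helper are hypothetical names, and drawing the waste level uniformly at random is an assumption (the description only gives its range).

```python
import random
from dataclasses import dataclass


@dataclass
class WasteState:
    waste_level: float  # 0 to 1
    time_of_day: float  # 0 to 24 hours
    weather: float      # 0 to 1
    distance: float     # 0 to 10

    def as_vector(self):
        """Return the four features in a fixed order for the agent."""
        return [self.waste_level, self.time_of_day, self.weather, self.distance]


def sample_state(rng):
    """Draw one state; uniform sampling is an assumption of this sketch."""
    return WasteState(
        waste_level=rng.uniform(0.0, 1.0),
        time_of_day=rng.uniform(0.0, 24.0),
        weather=rng.uniform(0.0, 1.0),
        distance=rng.uniform(0.0, 10.0),
    )


rng = random.Random(0)  # seeded so runs are repeatable
state = sample_state(rng)
print(state.as_vector())
```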

Action Space:
The agent can choose between two actions:
- Wait (0): do not collect waste.
- Collect Waste (1): proceed with waste collection.
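
Since the description tracks an exploration rate (epsilon), action selection is presumably epsilon-greedy. The sketch below shows that rule for the two actions; `choose_action` and the `q_values` list are hypothetical names, not identifiers from the PR.

```python
import random

WAIT, COLLECT = 0, 1  # action encoding from the description


def choose_action(q_values, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else pick the best action.

    q_values is a two-element sequence of estimated values, indexed by action.
    """
    if rng.random() < epsilon:
        return rng.choice([WAIT, COLLECT])
    return max((WAIT, COLLECT), key=lambda a: q_values[a])


rng = random.Random(0)
print(choose_action([0.2, 0.9], epsilon=0.0, rng=rng))  # greedy choice
```

With `epsilon=0.0` the choice is purely greedy; with `epsilon=1.0` it is purely random.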

Reward Structure:
The reward system is designed to encourage efficient waste collection:
- Timely collection, when the waste level exceeds the threshold: +10.
- Premature collection, when the waste level is below the threshold: -5.
- Each time step: -1, to penalize waiting.
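
One possible reading of this reward scheme, as a sketch. Two assumptions are made here that the description does not confirm: the threshold value (`THRESHOLD = 0.7` is a placeholder) and that the -1 penalty applies only on steps where the agent waits.

```python
WAIT, COLLECT = 0, 1
THRESHOLD = 0.7  # hypothetical value; the description does not state the threshold


def reward(waste_level, action, threshold=THRESHOLD):
    """Return the reward for taking `action` at the given waste level."""
    if action == COLLECT:
        # +10 for timely collection, -5 for premature collection
        return 10 if waste_level > threshold else -5
    # Assumption: the per-step -1 is charged when the agent waits
    return -1


print(reward(0.9, COLLECT), reward(0.2, COLLECT), reward(0.9, WAIT))
```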

Training Process:
The agent is trained over 100 episodes. Each episode simulates up to 20 time steps, and at each step the agent makes a decision based on the current state. The agent learns from experience using a replay memory and updates its policy through Q-learning.
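
The loop below is a simplified sketch of this training process, not the PR's implementation. It swaps in a tabular Q-table over a discretized waste level (the notebook may well use a neural network instead) and invents toy dynamics in `step`. The threshold, learning rate, discount, batch size, and epsilon schedule are all assumed values; only the 100 episodes and 20 steps come from the description.

```python
import random
from collections import deque

WAIT, COLLECT = 0, 1
EPISODES, MAX_STEPS = 100, 20      # from the description
THRESHOLD = 0.7                    # assumed
N_BINS = 10                        # assumed discretization of waste level
ALPHA, GAMMA = 0.1, 0.95           # assumed learning rate and discount
BATCH_SIZE = 16                    # assumed replay batch size


def bin_of(level):
    """Map a waste level in [0, 1] to a table row."""
    return min(int(level * N_BINS), N_BINS - 1)


def step(level, action, rng):
    """Toy dynamics (assumed): collecting empties the bin, then waste accumulates."""
    if action == COLLECT:
        r = 10 if level > THRESHOLD else -5
        level = 0.0
    else:
        r = -1
    return min(1.0, level + rng.uniform(0.0, 0.2)), r


def train(seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_BINS)]  # Q-table: rows are bins, columns are actions
    memory = deque(maxlen=2000)              # replay memory of (s, a, r, s') tuples
    epsilon, episode_rewards = 1.0, []
    for _ in range(EPISODES):
        level, total = rng.uniform(0.0, 1.0), 0.0
        for _ in range(MAX_STEPS):
            s = bin_of(level)
            if rng.random() < epsilon:
                a = rng.choice([WAIT, COLLECT])
            else:
                a = max((WAIT, COLLECT), key=lambda x: q[s][x])
            level, r = step(level, a, rng)
            memory.append((s, a, r, bin_of(level)))
            total += r
            if len(memory) >= BATCH_SIZE:
                # Q-learning update on a random batch of past transitions
                for ms, ma, mr, ms2 in rng.sample(list(memory), BATCH_SIZE):
                    target = mr + GAMMA * max(q[ms2])
                    q[ms][ma] += ALPHA * (target - q[ms][ma])
        epsilon = max(0.01, epsilon * 0.95)  # decay exploration after each episode
        episode_rewards.append(total)
    return q, episode_rewards


q_table, rewards = train()
print(len(rewards), "episodes trained")
```

Episodes here are simply cut off after 20 steps with no terminal-state handling, which keeps the sketch short.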

Evaluation Metrics:
Performance is evaluated using:
- Average Reward per Episode: measures the effectiveness of the agent's actions.
- Epsilon Decay: tracks the exploration rate, indicating how the agent balances exploration vs. exploitation.
- Overflow Events: counts the occurrences where the waste level exceeds the maximum capacity following the previous update.
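
These three metrics could be computed along the following lines. The helper names are hypothetical, and the decay rate, floor, and capacity are assumed defaults rather than values from the PR.

```python
def average_reward(episode_totals):
    """Mean of per-episode total rewards (0.0 for an empty list)."""
    return sum(episode_totals) / len(episode_totals) if episode_totals else 0.0


def epsilon_schedule(episodes, start=1.0, decay=0.95, floor=0.01):
    """Exploration rate at the start of each episode under multiplicative decay."""
    eps, values = start, []
    for _ in range(episodes):
        values.append(eps)
        eps = max(floor, eps * decay)
    return values


def count_overflows(waste_levels, capacity=1.0):
    """Number of recorded waste levels that exceed the maximum capacity."""
    return sum(1 for level in waste_levels if level > capacity)


print(epsilon_schedule(3))
print(count_overflows([0.5, 1.2, 1.0, 1.5]))
```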

Visualization:
The results are visualized using Matplotlib to plot:
- Average rewards per episode, showing the agent's learning progression and the rewards gained when a specified condition is met.
- Epsilon decay over episodes, illustrating the shift from exploration to exploitation.
- Overflow events per episode, highlighting improvements in waste management.
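
A rough sketch of how the three plots might be laid out with Matplotlib. `plot_training_curves` is a hypothetical helper, and the numbers passed in the usage line are placeholders to exercise the function, not results from the PR.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_training_curves(avg_rewards, epsilons, overflows):
    """Draw the three per-episode series side by side and return the figure."""
    episodes = list(range(1, len(avg_rewards) + 1))
    titles = (
        "Average reward per episode",
        "Epsilon decay",
        "Overflow events per episode",
    )
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))
    for ax, series, title in zip(axes, (avg_rewards, epsilons, overflows), titles):
        ax.plot(episodes, series)
        ax.set_title(title)
        ax.set_xlabel("Episode")
    fig.tight_layout()
    return fig


# Placeholder values only, to show the call shape
fig = plot_training_curves([-12.0, -4.0, 3.0], [1.0, 0.95, 0.9025], [3, 2, 1])
fig.savefig("training_curves.png")
```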


Thank you for submitting your pull request! 🙌 We'll review it as soon as possible. If there are any specific instructions or feedback regarding your PR, we'll provide them here. Thanks again for your contribution! 😊

@Panchadip-128
Contributor Author

Panchadip-128 commented Oct 24, 2024

@Niketkumardheeryan Please reassign this issue with labels. Thank you.

@Panchadip-128
Contributor Author

@Niketkumardheeryan

@Panchadip-128
Contributor Author

@Niketkumardheeryan Please review, as GSSoC ends in 2 days.

@Niketkumardheeryan
Owner

@Panchadip-128 Add details of the dataset you used, add your name in the .ipynb file, and remove unnecessary files.

@Panchadip-128
Contributor Author

@Niketkumardheeryan I have added the necessary dataset information and the keys to be added. Please note that an RL model like this waste management system, which uses epsilon decay, typically requires no predefined dataset, as the agent learns by interacting with a simulated environment.
Thanks and regards

@Niketkumardheeryan merged commit 2584e05 into Niketkumardheeryan:master on Oct 29, 2024