This program compares the performance of the Multiagent Rollout algorithm against the Ordinary Rollout algorithm and the Base Policy in the context of the Spiders and Flies problem.
The Spiders and Flies problem (Bertsekas, D. P., 2024, A Course in Reinforcement Learning, Athena Scientific, pp. 103-105) involves multiple spiders and stationary flies on a 2-dimensional grid. The spiders aim to catch all flies as quickly as possible. The control space is structured to have a separate control for each spider: with 2 spiders, a joint control is a list of two individual controls, e.g. ['west', 'north']. This implementation provides three different strategies an agent can use, Base Policy, Ordinary Rollout, and Multiagent Rollout, and compares their performance.
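For instance, a minimal sketch of how such a joint control space can be enumerated; the move names and the helper below are assumptions for illustration, not this repo's actual API:

```python
from itertools import product

# Assumed per-spider control set; the actual set in this repo may differ
# (for example, it may also include a 'stay' move).
MOVES = ['north', 'south', 'east', 'west']

# Joint control space for 2 spiders: one individual control per spider,
# e.g. ['west', 'north'] moves spider 0 west and spider 1 north.
joint_controls = [list(pair) for pair in product(MOVES, repeat=2)]
print(len(joint_controls))  # 4^2 = 16 joint controls
```

Note that the joint control space grows exponentially with the number of spiders, which is exactly what multiagent rollout is designed to avoid.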
- Base Policy: moves each spider one step toward its nearest fly, measured by Manhattan distance.
- Ordinary Rollout: computes and compares Q-factors for every joint control, i.e., each combination of the spiders' possible moves, with the base policy applied thereafter.
- Multiagent Rollout: chooses controls one spider at a time. For the current spider, we compute and compare the Q-factor of each of its controls, assuming that the previous spiders keep the controls they have already chosen and that the future spiders act according to the base policy (see the sketch below).
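The following is a minimal sketch of one multiagent rollout decision step, assuming hypothetical helpers that are not necessarily the repo's actual API: `base_policy(state, spider)` returns a spider's base-policy move, `simulate(state, controls)` applies a joint control and returns `(next_state, stage_cost)`, and `rollout_cost(state)` runs the base policy to completion and returns the accumulated cost.

```python
def multiagent_rollout_step(state, spiders, moves, base_policy, simulate, rollout_cost):
    chosen = []  # controls already fixed by earlier spiders in this step
    for i in range(len(spiders)):
        best_move, best_q = None, float('inf')
        for move in moves:
            # Earlier spiders keep their chosen controls, the current spider
            # tries `move`, and all later spiders follow the base policy.
            trial = chosen + [move] + [base_policy(state, s) for s in spiders[i + 1:]]
            next_state, stage_cost = simulate(state, trial)
            q = stage_cost + rollout_cost(next_state)  # Q-factor of `move`
            if q < best_q:
                best_move, best_q = move, q
        chosen.append(best_move)
    return chosen  # one control per spider
```

The key point is cost: each spider evaluates only its own moves, so the work per step grows linearly with the number of spiders rather than exponentially as in ordinary rollout.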
Graphics.py
Contains code for rendering spiders and flies and handling graphic elements of the game.
Spider.py
- Contains the code for returning the control set for a given state, and the core logic for all three algorithms.
- Base Policy class: Implements the Manhattan distance-based base policy (a sketch follows this list).
- Ordinary Rollout class: Implements the standard rollout strategy.
- Multiagent Rollout class: Implements the multiagent rollout strategy.
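For orientation, here is a minimal sketch of the computation behind the Base Policy class; the function names, the tuple-based positions, and the axis conventions are assumptions for illustration and may not match Spider.py.

```python
# Hypothetical sketch of the Manhattan-distance base policy.
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def base_policy_move(spider_pos, fly_positions):
    """Move one step along one axis toward the nearest remaining fly."""
    target = min(fly_positions, key=lambda fly: manhattan(spider_pos, fly))
    dx, dy = target[0] - spider_pos[0], target[1] - spider_pos[1]
    if dx != 0:
        return 'east' if dx > 0 else 'west'   # direction names assumed
    if dy != 0:
        return 'north' if dy > 0 else 'south'
    return 'stay'  # spider is already on the fly's square
```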
Fly.py
Contains code for simulating the random movement of flies (currently unused, as the flies are stationary).
Game.py
The core of the program: defines the rules of the game, sets up the environment and metadata, keeps track of the game state and the grid, and places the spiders and flies.
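To show how the pieces fit together, here is a hypothetical driver loop; the constructors, attributes, and method names are illustrative assumptions and may differ from what Game.py and Spider.py actually define.

```python
from Game import Game
from Spider import MultiagentRollout

# Assumed constructor arguments; see Game.py for the actual interface.
game = Game(grid_size=10, num_spiders=2, num_flies=5)
policy = MultiagentRollout(game)

steps = 0
while not game.all_flies_caught():                  # assumed query on game state
    controls = policy.choose_controls(game.state)   # one control per spider
    game.apply(controls)                            # move spiders, remove caught flies
    steps += 1
print(f"All flies caught in {steps} steps")
```

Swapping in the Base Policy or Ordinary Rollout class for `MultiagentRollout` in the same loop is what makes the performance comparison between the three strategies straightforward.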