This is a modified codebase for the paper ProgPrompt: Generating Situated Robot Task Plans using Large Language Models. It contains code for replicating the paper's results on the VirtualHome dataset, and is based on the original code release. Several errors in the original code release have been rectified in this codebase. We present our results and analysis of the project in Benchmark_Analysis.pdf, where we highlight the key issues and weaknesses and suggest possible improvements.
We have included the setup commands in setup.sh. To execute them, run `sh setup.sh`.
Here is an overview of the steps:
- Create a conda environment (or your own virtualenv): `conda create -n progprompt python==3.9`
- Install dependencies: `pip install -r requirements.txt`
- Clone VirtualHome and install it from its root directory by running: `pip install -e .`
- Finally, download the VirtualHome Unity simulator and make sure it runs. The simulator can be run either locally or on a virtual X server.
This was tested on VirtualHome commit f84ee28a75b23318ee1bf652862b1c993269cd06.
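Collected in one place, the steps above might look like the following sketch of setup.sh. The VirtualHome clone URL and directory layout are assumptions; adjust them to match your setup:

```sh
# Sketch of the setup steps; the clone URL and paths are assumptions.
conda create -n progprompt python==3.9
conda activate progprompt
pip install -r requirements.txt

# Clone VirtualHome and install it in editable mode, pinned to the
# commit this codebase was tested against.
git clone https://github.com/xavierpuigf/virtualhome.git
cd virtualhome
git checkout f84ee28a75b23318ee1bf652862b1c993269cd06
pip install -e .
```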
We have made a script named runs.sh, which contains all the commands used to test the experiments in which we varied the parameters one at a time. Parameters are passed to the code via flags, as shown in runs.sh. Update the script with your openai-key. The default parameters for the baseline reference are:
- `--gpt-version gpt-3.5-turbo-instruct`
- `--test-set test_unseen`
- `--env-id 0`
- `--prompt-task-examples default`
- `--prompt-num-examples 3`
- `--prompt-task-examples-ablation none`
To run runs.sh, execute `sh runs.sh`.
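For reference, a single baseline invocation from runs.sh might look like the sketch below. The entry-point script name and the environment variable used for the key are assumptions; substitute whatever runs.sh actually invokes:

```sh
# Hypothetical baseline run; the script name and key variable are assumptions.
export OPENAI_API_KEY="sk-..."   # your openai-key
python scripts/run_eval.py \
    --gpt-version gpt-3.5-turbo-instruct \
    --test-set test_unseen \
    --env-id 0 \
    --prompt-task-examples default \
    --prompt-num-examples 3 \
    --prompt-task-examples-ablation none
```

Changing one flag at a time relative to this baseline reproduces the single-parameter ablations described above.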
[1] ProgPrompt homepage: Generating Situated Robot Task Plans using Large Language Models.
[2] ProgPrompt paper: prompting the LLM with program-like specifications of the available actions and objects in an environment to improve plan quality and admissibility.
[3] VirtualHome: an interactive platform to simulate complex household activities via programs.
A joint benchmarking and analysis effort by Aryan Dua and Gurarmaan Singh.