np.int32 vs np.float32 observations #6

ronTohar1 · 2023-04-29T13:27:49Z

hey, I noticed there is a problem in battleships.py where the observation space is of type int and the returned observation in reset or step are of type np.float32.
This causes a problem with env_checker of gym of course and therefore also with the use of RL libraries such as SB3 as they use this information and it creates conflicts.

The text was updated successfully, but these errors were encountered:

thomashirtz · 2023-04-29T15:15:25Z

I just committed some changes, please let me know if it has resolved the issue, if not, please give me a minimal minimal reproducible example of the issue :)

ronTohar1 · 2023-04-29T17:07:54Z

It did fix my problem, but I did have to do some changes to the step function (regardless of the current new commit) to make the code work.

I changed the step() function
from

def step(self, raw_action: Union[int, tuple]) -> Tuple[np.ndarray, int, bool, dict]:
        if isinstance(raw_action, int):

To

def step(self, raw_action: Union[int, tuple]) -> Tuple[np.ndarray, int, bool, dict]:
        if isinstance(raw_action, int) or isinstance(raw_action, np.int32) or isinstance(raw_action, np.int64):

This change had been made way before the aforementioned problem arose.

For example, the gym environment checker didn't pass before the change to the step() function, as well as the following piece of code:

import gymnasium
import gym_battleship

env = gymnasium.make("GymV21Environment-v0", env_id="Battleship-v0")
obs,_ = env.reset()
done = False
while not done:
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    env.render()

This happened because for some reason, sample() of the action apce gave an numpy.int64 value. (This error is also the reason the env_checker didn't 'approve' the environment)

P.S I just can't understand why you'd put an np.float32 in the dtype as it is only integers, but I guess it doesn't make any difference.

thomashirtz · 2023-04-30T01:49:25Z

Ok, I'll take a closer look for fixing the code when I have time

Please let me know if there are any other thing not so right that you notice in the code :)

For the observation space, I remember that there were a clear reason why I did that and I was quite upset to have to switch to float instead of int. However I can't remember now. I may switch back to int and see if someone experience an issue.

Thank you for your message :)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

np.int32 vs np.float32 observations #6

np.int32 vs np.float32 observations #6

ronTohar1 commented Apr 29, 2023

thomashirtz commented Apr 29, 2023

ronTohar1 commented Apr 29, 2023

thomashirtz commented Apr 30, 2023

np.int32 vs np.float32 observations #6

np.int32 vs np.float32 observations #6

Comments

ronTohar1 commented Apr 29, 2023

thomashirtz commented Apr 29, 2023

ronTohar1 commented Apr 29, 2023

thomashirtz commented Apr 30, 2023