[Soft-Float] - Initial Interpreter Implementation of Ps2's floating point unit specification #12001

Open: wants to merge 15 commits into master

@GitHubProUser67 GitHubProUser67 commented Nov 12, 2024

This Pull Request implements the first take ever on real Soft-Float support in PCSX2.

This work is a combination of several prior efforts and research.

Credits:

This pull request should be tested with every game that requires a clamping mode, rounding mode, or float patch (cf. GameDatabase).

Currently, on the interpreters, this PR fixes:

Any other game using the NegDiv hack or any other FPU rounding mode.

This sets the floor for Soft-Float in PCSX2, a long-awaited contribution.

@GitHubProUser67 GitHubProUser67 changed the title [Soft-Float] - Initial Intepreter Implementation of Ps2's floating point uint specification [Soft-Float] - Initial Intepreter Implementation of Ps2's floating point unit specification Nov 12, 2024
Contributor

@github-actions github-actions bot left a comment

Thank you for submitting a contribution to PCSX2

As this is your first pull request, please be aware of the contributing guidelines.

Additionally, as per recent changes in GitHub Actions, your pull request will need to be approved by a maintainer before GitHub Actions can run against it. You can find more information about this change here.

Please be patient until this happens. In the meantime if you'd like to confirm the builds are passing, you have the option of opening a PR on your own fork, just make sure your fork's master branch is up to date!

@GitHubProUser67 GitHubProUser67 changed the title [Soft-Float] - Initial Intepreter Implementation of Ps2's floating point unit specification [Soft-Float] - Initial Interpreter Implementation of Ps2's floating point unit specification Nov 12, 2024
@seta-san
Contributor

Does this work on the recompilers or just interpreter?

@refractionpcsx2
Member

You should try reading the title.

…oint unit specification.

This Pull Request implements the first take ever on real Soft-Float support in PCSX2.

This work is a combination of several prior efforts and research.

Credits:

- https://www.gregorygaines.com/blog/emulating-ps2-floating-point-nums-ieee-754-diffs-part-1/

- https://github.com/GitHubProUser67/MultiServer3/blob/main/BackendServices/CastleLibrary/EmotionEngine.Emulator/Ps2Float.cs

- https://github.com/Goatman13/pcsx2/tree/accurate_int_add_sub

- PCSX2 Team for their help and support in this massive journey.

This pull request should be tested with every game that requires a clamping or rounding mode (cf. GameDatabase).

Currently, on the interpreters, this PR fixes:

- PCSX2#354

- PCSX2#11507

- PCSX2#10519

- PCSX2#8068

- PCSX2#7642

- PCSX2#5257

Note that this implementation, while technically fixing Gran Turismo 4 and Klonoa 2, makes those games crash: very high floats are passed into the emulator code, which then fails later in the process. This has not been ironed out yet.

Other than that, this sets the floor for Soft-Float in PCSX2, a long-awaited contribution.
@seta-san
Contributor

You should try reading the title.

I don’t know why my brain just skimmed over that.

@MrCK1
Member

MrCK1 commented Nov 12, 2024

I ran a bunch of tests for the "Test Drive Unlimited" AI during the demo scene after sitting idle on the menu. No combination of settings/interpreters seems to have any effect on its behavior. There must be something else going on.

@AmyRoxwell

AmyRoxwell commented Nov 13, 2024

I can confirm that, with the multiplication setting, MHG and Dos no longer have bugged bounces! (Plus the build works well on Linux! [as well as running the interpreter can, of course :'D])
Screenshot_20241112_234106
Monster Hunter G_SLPM-65869_20241112234029
Monster Hunter 2_SLPM-66280_20241112235154

@Shoegzer

Shouldn't this help with #2990 as well?

@AmyRoxwell

Shouldn't this help with #2990 as well?

I remember seeing on the public dev channel that Stuntman no longer had AI issues with this and that the car AI no longer failed? It's been a while.

@Blackbird88
Contributor

Blackbird88 commented Nov 13, 2024

Tourist Trophy NTSC works now too! For the first time, its License Tests work!
Requires: EE MUL/DIV

Some of the tests still don't work however.

Works
image
image

Hangs
image

@Tokman5
Contributor

Tokman5 commented Nov 13, 2024

I tested the demo replays of Tokyo Xtreme Racer Zero (see issue #5597) and noticed that the car movement in interpreter mode is now closer to the movement in recompiler mode.
However, there are still slight differences between the two, and both are far from matching the console playback.

comp.mp4

@Goatman13
Contributor

Goatman13 commented Nov 13, 2024

Nice work!

I have tested only 2 games so far. The first is Hype Time Quest, which, if I remember correctly, needs accurate FPU mul; it works fine without a patch.

The second game is THPS4. Here, accurate VU1 add/sub fixes the issue that is normally fixed by VU1 rounding in the database, but accurate mul/div breaks the game's graphics. Of course, the game doesn't need accurate VU1 mul/div to work correctly, but something seems to be off when it is enabled.

Accurate vu1 add/sub/sqrt:
image
Accurate vu1 mul/div (same place in game):
image

Edit: Maybe THPS4 is just passing some big float value to GS with accurate mul/div. That may not necessarily be an issue with the float operation itself.
Additionally, I tested Burnout 2, and accurate VU1 add/sub fixed the issue that was previously fixed by the VU1 rounding mode (white car parts).

Edit2: Freaky Flyers is fixed with accurate mul/div on VU1. Awesome! We weren't even sure whether that was a floating-point issue until now.

The game sends some super low floats to the Mul unit.

On PS2, floats with a zero exponent should return zero, but this is not the case in Mul: the multiplier can work with denormals internally.

I love when undocumented stuff is used by some games for their 3D engine ^^.
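
For illustration, a minimal sketch of the general "zero exponent means zero" rule described above. It is not code from this PR: the names `hasZeroExponent` and `flushDenormal` are hypothetical, keeping the sign bit on the flushed result is an assumption, and the Mul unit's internal denormal handling is deliberately not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Bit layout of a 32-bit float: 1 sign bit, 8 exponent bits, 23 mantissa bits.
constexpr std::uint32_t kSignMask = 0x80000000u;
constexpr std::uint32_t kExponentMask = 0x7F800000u;

// True for zero and for denormals (exponent field is all zeros).
constexpr bool hasZeroExponent(std::uint32_t bits)
{
	return (bits & kExponentMask) == 0;
}

// Hypothetical helper: flush a zero-exponent value to zero.
// Assumption: the sign bit is preserved, so the result is +0 or -0.
constexpr std::uint32_t flushDenormal(std::uint32_t bits)
{
	return hasZeroExponent(bits) ? (bits & kSignMask) : bits;
}

// Small usage example.
inline void flushDenormalExample()
{
	assert(flushDenormal(0x00000001u) == 0x00000000u); // smallest positive denormal
	assert(flushDenormal(0x80400000u) == 0x80000000u); // negative denormal
	assert(flushDenormal(0x3F800000u) == 0x3F800000u); // 1.0f is left untouched
}
```

A multiplier that works with denormals internally would skip this flush on its inputs, which is the special case the commit message describes.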
@GitHubProUser67
Author

GitHubProUser67 commented Nov 13, 2024

The Tony Hawk case is fixed; the game uses an undocumented behaviour in its 3D engine.

The PS2 has no denormal support... except in the Mul unit, apparently.

The behaviour is now emulated properly.

@Shoegzer

I remember seeing on the public dev channel that Stuntman no longer had AI issues with this and that the car AI no longer failed? It's been a while.

@AmyRoxwell Can you provide a reference to this? There's no indication of it being fixed in #2990, and I assume the devs would have closed it if it were. In any event it involves pathing in Driver 3 as well.

@weirdbeardgame
Contributor

weirdbeardgame commented Nov 13, 2024

2024-11-10.14-28-32.mp4

This also affects the Fatal Frame 1 issue, meaning this plus the current GameDB patch will end up being the ultimate fix.

More accurate approach to compare.
@Goatman13
Contributor

#3200 is fixed when tested with b7f3806.
It requires accurate mul/div for the FPU.
I haven't tested them, but the Krome Studios games that have patches in the GameDB should benefit from this PR too, such as Spyro, Star Wars, and one of the Transformers games.

@ghost

ghost commented Nov 13, 2024

This PR's EE interpreter fixes #11636's GameDB issue.

Implements accurate SQRT options and removes the Tri-Ace hack, which is no longer needed on the interpreter.
@AmyRoxwell

I remember seeing on the public dev channel that Stuntman no longer had AI issues with this and that the car AI no longer failed? It's been a while.

@AmyRoxwell Can you provide a reference to this? There's no indication of it being fixed in #2990, and I assume the devs would have closed it if it were. In any event it involves pathing in Driver 3 as well.

I meant while using this PR, not that it has been fixed. Sorry for the misunderstanding. But if it's not mentioned in the PR, maybe what it needs isn't covered by this initial implementation.

@GitHubProUser67
Author

Driv3r seemed fine when it was tested. Stuntman NTSC is a lot better but can still "slightly" deviate.

I suspect it is, once again, the interpreter rounding/clamping values somewhere.

@Shoegzer

@AmyRoxwell Ah, I understand you now. That's great news.

@GitHubProUser67 Thanks, nice to see Driv3r is looking better. Would it make sense to list these games in your OP?

@Goatman13
Contributor

I would say the main issue came from the fact it was widely assumed the PS2 and PS3 shared the same "float" calculation behaviours, while in actual fact, they do not.

That was the plan as far as I know. Probably someone underestimated how important a 1:1 implementation was for compatibility. The PS3's CELL implemented a special Altivec/VMX mode to make PPE core vector floats compatible with the SPU, but also with PS2 emulation in mind. Later it turned out that they had really miscalculated the importance of accurate floating-point math in PS2 emulation, to the point that the first fixes in ~2008-2010 were strictly per game, like "at this PC, add 0x01 to this reg, and return execution to the recompiler". Soft floats, including accurate DIV, came later, when new people were hired. :) In the end only VU1 uses a real SPE; everything else uses the PPE VMX compatibility mode.

The current Div/Sqrt system is exactly like a PS3 emulating the PS2 in terms of results, which means way closer (and fixes the VAST majority of games), but still one bit off in some edge cases.

Add/Sub/Mul are PS2 perfect.

And that's indeed a great outcome. Nice work. :)
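
To make "one bit off" concrete, here is a minimal sketch, not taken from this PR, of how two results could be compared by their distance in units in the last place (ULP). The names `orderedKey` and `ulpDistance` are hypothetical, and the helper assumes both inputs are raw 32-bit sign-magnitude float bit patterns.

```cpp
#include <cassert>
#include <cstdint>

// Map a sign-magnitude float bit pattern onto a signed integer line so that
// neighbouring representable values differ by exactly 1.
// Both +0 and -0 map to 0.
constexpr std::int64_t orderedKey(std::uint32_t bits)
{
	const std::int64_t magnitude = bits & 0x7FFFFFFFu;
	return (bits & 0x80000000u) ? -magnitude : magnitude;
}

// Number of representable steps between two bit patterns.
constexpr std::uint64_t ulpDistance(std::uint32_t a, std::uint32_t b)
{
	const std::int64_t d = orderedKey(a) - orderedKey(b);
	return static_cast<std::uint64_t>(d < 0 ? -d : d);
}

// Small usage example.
inline void ulpDistanceExample()
{
	assert(ulpDistance(0x3F800000u, 0x3F800000u) == 0); // identical results
	assert(ulpDistance(0x3F800000u, 0x3F800001u) == 1); // one bit off
	assert(ulpDistance(0x00000001u, 0x80000001u) == 2); // straddling zero
}
```

A test harness could log every Div/Sqrt result whose distance from a hardware capture is nonzero, which is one way to hunt down the remaining edge cases.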

@GitHubProUser67
Author

Thank you, Goatman13! The PCSX2 team also gets massive credit here for their help and support. Of course, the goal is to be PS2 perfect on this one. There is one game, Mortal Kombat: Shaolin Monks, which doesn't fully work yet with the Soft-Float div, because the game does loads of Log2 on top of existing Log2 results, which makes it rely on rounding in both directions...

…eed-up simple calculations.

This greatly improves performance while using Soft-Floats.
@TellowKrinkle
Member

BTW the reason macOS builds are failing is they have PCH off, which means you don't get a bunch of headers auto-included. In particular, you need an #include "common/Pcsx2Defs.h" at the top of PS2float.c to get the u32, etc defines.

…s to their respectives place.

I don't like the SIMD way of doing it, it can be slower and less practical to use (expensive casting).
@GitHubProUser67
Author

The only code style issue remaining now is the global VU TMP.

Comment on lines 125 to 130
	for (s32 i = 31; i >= 0; i--)
	{
		if (((value >> i) & 1) != 0)
			return i;
	}
	return -1;
Member

@TellowKrinkle TellowKrinkle Dec 20, 2024

Suggested change
-	for (s32 i = 31; i >= 0; i--)
-	{
-		if (((value >> i) & 1) != 0)
-			return i;
-	}
-	return -1;
+#ifdef _MSC_VER
+	unsigned long bit;
+	return _BitScanReverse(&bit, value) ? bit : -1;
+#else
+	return value ? __builtin_clz(value) ^ 31 : -1;
+#endif

And you can drop the rest of the changes and make BitScanReverse8 and clz just call this (or pick one and stop using the rest).
As a side note, not handling the value == 0 case will be a bit cheaper, and in my experience in pretty much all cases, the caller has already checked and is only calling this if they know value is nonzero, so the separate check here is just wasted cycles.
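
For readers who want to try the suggestion in isolation, here is a self-contained sketch built around it. The function names `bitScanReverse32` and `bitScanReverse32Loop` are hypothetical (the PR's own helper names are not shown in this thread), and `std::uint32_t` stands in for PCSX2's `u32`. The loop version is kept only as a reference to cross-check the intrinsic.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _MSC_VER
#include <intrin.h>
#endif

// Index of the highest set bit (0..31), or -1 when value is zero.
inline int bitScanReverse32(std::uint32_t value)
{
#ifdef _MSC_VER
	unsigned long bit;
	return _BitScanReverse(&bit, value) ? static_cast<int>(bit) : -1;
#else
	// __builtin_clz(0) is undefined, hence the explicit zero check.
	return value ? (__builtin_clz(value) ^ 31) : -1;
#endif
}

// Reference implementation: the bit-by-bit loop the suggestion replaces.
inline int bitScanReverse32Loop(std::uint32_t value)
{
	for (int i = 31; i >= 0; i--)
	{
		if (((value >> i) & 1) != 0)
			return i;
	}
	return -1;
}

// Small usage example: both versions agree on a few inputs.
inline void bitScanReverseExample()
{
	assert(bitScanReverse32(1u) == 0);
	assert(bitScanReverse32(0x80000000u) == 31);
	assert(bitScanReverse32(0x00012345u) == bitScanReverse32Loop(0x00012345u));
}
```

If every caller already guarantees a nonzero argument, the zero check can be dropped as the side note says, at the cost of undefined behaviour for `__builtin_clz(0)` should that guarantee ever be broken.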

Author

OK, I will do that when possible and push. I admit I lacked experience with the BitScan part.

@GitHubProUser67
Author

I declare Final Fantasy X fixed on PCSX2!

GT4/Tourist Trophy are an edge case that we can hopefully sort out soon.

…ftfloat on some extra obscure VU ops and fixes a denormals check on the checkDivideByZero method in the FPU.

Fixes:

- Final Fantasy X (fully playable)
- Klonoa 2

Partially Fixes:

- Mortal Kombat: Shaolin Monks
- Gran Turismo 4 (a game patch will be necessary to skip the Licence Test CRC check).
- Tourist Trophy (a game patch will be necessary to skip the Licence Test CRC check).

The stopgap div measure is not yet enough to fully fix GT4/Tourist Trophy (they will need different rounding modes per licence). Currently this is bridged by the Div Rounding Mode setting.
@AmyRoxwell

AmyRoxwell commented Dec 28, 2024

Hopefully this can someday be implemented in the recompiler. Having this many games fixed (and some outright going from broken to playable) is really nice to see! Keep up the great work you're doing!

@LoStraniero91

LoStraniero91 commented Dec 30, 2024

#5963

Superman Returns
Needs VU1 - MUL/DIV to fix SPS
Note: the SPS is fixed with the "VU Overflow hack", but this causes buildings to fade in and out near the screen edges when flying at high altitude. The soft-float setting completely solves both issues.

#11207

Colin McRae Rally 3
AI on super special stages will desync from its course even on YOLO Mode (EE/VU0/VU1 Interpreters and all soft-floats enabled)

@user18081972

so this PR does not fix the bugs in Colin McRae Rally 3, correct?

@LoStraniero91

so this PR does not fix the bugs in Colin McRae Rally 3, correct?

Correct. Even tried with EE Cache and full floats, no dice.

@seta-san
Contributor

so this PR does not fix the bugs in Colin McRae Rally 3, correct?

Correct. Even tried with EE Cache and full floats, no dice.

either floating point needs to be perfect or it’s a timing issue

@vcx33
Contributor

vcx33 commented Dec 30, 2024

I declare Final Fantasy X fixed on PCSX2!

GT4/Tourist Trophy are an edge case that we can hopefully sort out soon.

For FFX, the bug where the boss turns invisible doesn't happen when using the interpreter, but it doesn't seem to be affected by soft floats and seems to be a regression. This will have to be retested on the recompiler.

…lags + uses built-in clz for Add/Sub.

Fixes a TON of games.

The flags are not yet in use in the interpreters; this will ideally be committed next (requires VU code changes).

The Div/Sqrt method is unoptimized for now, the team is working on a faster equivalent.
@GitHubProUser67
Author

And there we have our New-Year present! Fully accurate FPU in PCSX2!

Aside from a few things remaining to adjust (flags and a newer Div/Sqrt method), we now have a solid basis that replicates the PS2 operations.

The last commit fixes:

@seta-san
Contributor

seta-san commented Jan 4, 2025

Jesus. This is amazing. Many of the oldest issues are suddenly just fixed. This is truly God's work that you, Tellow, and everyone else have done here.

@LoStraniero91

And there we have our New-Year present! Fully accurate FPU in PCSX2!

Aside from a few things remaining to adjust (flags and a newer Div/Sqrt method), we now have a solid basis that replicates the PS2 operations.

The last commit fixes:

I tested Colin McRae Rally 3 with this commit, and the AI still goes out of sync and crashes into walls. I should also add that the AI keeps accelerating against the wall, then "remembers" that it could reverse: I can see it trying to back away from the wall, but it stays in place and goes nowhere. After almost 3 minutes, the AI "dies", probably because it thinks it has finished the race.
Very temperamental little game.
image

@Kaibayugi2002

Kaibayugi2002 commented Jan 5, 2025

@LoStraniero91
I'm assuming that only testers can download this at the moment. So unless you're a tester, we, the general public, don't have access yet.

@Valtekken
Contributor

@LoStraniero91 I'm assuming that only testers can download this at the moment. So unless you're a tester, we, the general public, don't have access yet.

You can download it yourself from the artifacts right below the comments.

@kamfretoz
Contributor

kamfretoz commented Jan 5, 2025

I'm assuming that only testers can download this at the moment. So unless you're a tester, we, the general public, don't have access yet.

You only need a GitHub account to download the CI artifacts.

And you can download it from here

EDIT:
Alternatively, you can go to the "Checks" tab near the top of the page and choose the OS you are using.

@NIWDERED-07

And there we have our New-Year present! Fully accurate FPU in PCSX2!

Aside from a few things remaining to adjust (flags and a newer Div/Sqrt method), we now have a solid basis that replicates the PS2 operations.

The last commit fixes:

Wonderful :O, I look forward to them soon being able to repair Driver 3. It makes me want to play it again :D

@Tokman5
Contributor

Tokman5 commented Jan 5, 2025

Demo replays of Tokyo Xtreme Racer 0 are MUCH closer to the ones running on console than ever!
Unfortunately, they are still not equal (I don't know if this is solely due to FPU accuracy). It affects the behavior of non-player cars, so some replays are still broken. Even so, the last commit is still a huge step.

We use a faster checked method that achieves the same result.