Document automated checks under o!TR/Data Processing/Automated Checks #30

Open: wants to merge 18 commits into base: master. Changes from 8 commits.
Binary file added docs/images/co23-example.png
7 changes: 6 additions & 1 deletion docs/otr.tree
@@ -13,7 +13,9 @@
 <toc-element topic="Initial-Ratings.md"/>
 <toc-element topic="Match-Cost.md"/>
 </toc-element>
-<toc-element topic="Score-Modifications.md"/>
+<toc-element topic="Score-Modifications.md">
+<toc-element topic="Automated-Checks.md"/>
+</toc-element>
 <toc-element topic="Team.md"/>
 <toc-element topic="Contact.md"/>
 <toc-element topic="Contributions.md">
@@ -27,6 +29,9 @@
 <toc-element topic="API-Configuration.md"/>
 <toc-element topic="Code-Quality.md"/>
 </toc-element>
+<toc-element topic="Related-Services.md">
+<toc-element topic="DataWorkerService.md"/>
+</toc-element>
 </toc-element>
 <toc-element topic="o-TR-Database.md">
 <toc-element topic="Database-Setup.md"/>
5 changes: 5 additions & 0 deletions docs/redirection-rules.xml
@@ -18,4 +18,9 @@
 <description>Created after removal of "Setup" from osu! Tournament Rating</description>
 <accepts>Setup.html</accepts>
 </rule>
+<rule id="4f4b2c10">
+<description>
+<![CDATA[Created after removal of "Related Tools & Services" from osu! Tournament Rating]]></description>
+<accepts>DataWorkerService.html</accepts>
+</rule>
 </rules>
259 changes: 259 additions & 0 deletions docs/topics/Automated-Checks.md
@@ -0,0 +1,259 @@
# Automated Checks

The [](DataWorkerService.md) has numerous responsibilities, one of which is a data processing step known as automated
checks. These checks run once an entity reaches the corresponding processing step. They evaluate the entity against
concrete rules and assign it a preliminary verification status, which a human reviewer then confirms or overrides.

## Core Principles

When designing this system, we did so with the following principles in mind:

1. Human reviewers have authority over whether an entity is `Verified` or `Rejected`. As such, the system will never automatically assign these designations.
2. The automatic application of the `PreRejected` status must be as accurate as possible, based on concrete rules.
3. The process must be as transparent as possible. As such, the system tracks all changes to entities in the `audit` tables. Additionally, all entities have a `RejectionReason` enum which defines a combination of reasons why it was marked as rejected by either the system or a human reviewer.
4. Entities which are not `Verified` are never included in the tournament rating algorithm.
   * This has the added benefit of ensuring all generated statistics are valid. Even with manually submitted data, humans make mistakes. If unverified data were introduced into the rating and statistics systems, users would notice invalid statistics and the rating ladder itself would not be completely accurate.

## Entities

The following entities are part of this processing pipeline:

* `Tournaments`
* `Matches`
* `Games`
* `GameScores`

## Statuses

Each entity has `VerificationStatus`, `ProcessingStatus`, and `RejectionReason` fields. The DataWorkerService reads and updates these fields as each entity moves through the processing flow.

### `VerificationStatus`

Each entity shares the same `VerificationStatus` type. This type contains the following statuses:

* `None`: The entity has yet to be processed automatically.
* `PreRejected`: Based on the system's rules, this entity should be rejected.
* `PreVerified`: The system did not find anything wrong, awaiting human review.
* `Rejected`: A human marked this entity as rejected.
* `Verified`: A human marked this entity as verified.
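
The split between system-assigned and human-assigned statuses can be sketched as follows. This is a minimal illustration, not the project's implementation: the Python names and the `assign_by_system` helper are hypothetical, and only the five status names come from the list above.

```python
from enum import Enum


class VerificationStatus(Enum):
    NONE = "None"
    PRE_REJECTED = "PreRejected"
    PRE_VERIFIED = "PreVerified"
    REJECTED = "Rejected"
    VERIFIED = "Verified"


# Statuses the automated checks may assign; Verified and Rejected are reserved for humans.
SYSTEM_ASSIGNABLE = {VerificationStatus.PRE_REJECTED, VerificationStatus.PRE_VERIFIED}


def assign_by_system(status: VerificationStatus) -> VerificationStatus:
    """Return the status if the system may assign it, otherwise raise (hypothetical helper)."""
    if status not in SYSTEM_ASSIGNABLE:
        raise PermissionError(f"{status.value} can only be assigned by a human reviewer")
    return status
```

Guarding the assignment in one place is one way to enforce the first core principle, since any automated code path that tries to set `Verified` or `Rejected` fails loudly.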

### `ProcessingStatus`

Each entity type has its own `ProcessingStatus` type, which indicates how far along an entity is in the processing pipeline.

For example, consider `TournamentProcessingStatus`:

1. `NeedsApproval`: The tournament is submitted but awaiting approval from a verifier.
2. `NeedsMatchData`: Match data needs to be fetched via the osu! API.
3. `NeedsAutomationChecks`: The tournament, and all of its children, are awaiting automation checks.
4. `NeedsVerification`: Awaiting human review.
5. `NeedsStatCalculation`: After human review, process statistics (must be complete before it is eligible for inclusion in the rating system).
6. `Done`: Processing is completed. `Verified` tournaments with this status are eligible for inclusion in the rating system.

### `RejectionReason`

Each entity has a custom `RejectionReason` type with various flags which may cause it to be marked as `PreRejected`. Flags can be combined with each other to form a set of reasons. For example, a `Game` could be marked as `PreRejected` by the system due to `NoScores` and `BeatmapNotPooled`.
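
Combining flags into a set of reasons can be sketched with Python's `enum.Flag`. The member names mirror the `Game` example above, but their values and the use of a zero-valued `NONE` member to represent "no reason" are assumptions made for illustration.

```python
from enum import Flag, auto


class GameRejectionReason(Flag):
    NONE = 0  # assumption: "no rejection reason" is modelled as the empty flag set
    NO_SCORES = auto()
    BEATMAP_NOT_POOLED = auto()
    INVALID_MODS = auto()


# Each failed check adds its flag to the running set of reasons.
reason = GameRejectionReason.NONE
reason |= GameRejectionReason.NO_SCORES
reason |= GameRejectionReason.BEATMAP_NOT_POOLED

# Individual reasons can be tested without losing the others.
has_no_scores = GameRejectionReason.NO_SCORES in reason
has_invalid_mods = GameRejectionReason.INVALID_MODS in reason
```

Because the reasons are stored as one combined value, a reviewer can see every rule an entity failed rather than only the first one.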

## Flow

Automation checks are performed in the following order.

```Mermaid
flowchart LR;
GameScore --> Game --> Match --> Tournament
```

> This allows the parent entities to have the context of how their children fared during the automated checks process.
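
One way to obtain this children-first order is a post-order walk of the entity tree. The sketch below assumes a hypothetical `Entity` class with a `children` list; how the DataWorkerService actually schedules the checks is not stated here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    kind: str
    children: list[Entity] = field(default_factory=list)


def check_order(entity: Entity) -> list[str]:
    """Return entity kinds in the order their checks would run (children before parents)."""
    order: list[str] = []
    for child in entity.children:
        order.extend(check_order(child))
    # The parent is visited last, so its children's results are already available.
    order.append(entity.kind)
    return order


tournament = Entity("Tournament", [Entity("Match", [Entity("Game", [Entity("GameScore")])])])
print(check_order(tournament))
```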

### Tournament

```Mermaid
flowchart TD;
A[Is the count of PreVerified and/or Verified matches > 0?]
B[Apply NoVerifiedMatches flag to RejectionReason]
C[Is this count >= 80% of the total match count?]
D[Apply NotEnoughVerifiedMatches flag to RejectionReason]
PreTerm[Is the RejectionReason null?]
TermPositive[Change VerificationStatus to PreVerified]
TermNegative[Change VerificationStatus to PreRejected]

A -- No --> B --> PreTerm
A -- Yes --> C
C -- No --> D --> PreTerm
C -- Yes --> PreTerm
PreTerm -- Yes --> TermPositive
PreTerm -- No --> TermNegative
```
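
The tournament flow can be sketched as a small function. It assumes the first check is meant to test whether at least one match passed, and that a zero-valued `NONE` flag stands in for a null `RejectionReason`; the function and parameter names are hypothetical.

```python
from enum import Flag, auto


class TournamentRejectionReason(Flag):
    NONE = 0  # assumption: stands in for a null RejectionReason
    NO_VERIFIED_MATCHES = auto()
    NOT_ENOUGH_VERIFIED_MATCHES = auto()


def check_tournament(valid_matches: int, total_matches: int):
    """Return (rejection reason, resulting status) for a tournament.

    valid_matches counts matches that are PreVerified or Verified.
    """
    reason = TournamentRejectionReason.NONE
    if valid_matches <= 0:
        reason |= TournamentRejectionReason.NO_VERIFIED_MATCHES
    elif valid_matches * 5 < total_matches * 4:
        # Integer form of "fewer than 80% of all matches".
        reason |= TournamentRejectionReason.NOT_ENOUGH_VERIFIED_MATCHES
    status = "PreRejected" if reason else "PreVerified"
    return reason, status
```

For example, `check_tournament(7, 10)` yields `NOT_ENOUGH_VERIFIED_MATCHES` and `PreRejected`, while `check_tournament(8, 10)` passes with `PreVerified`.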

### Match

```Mermaid
flowchart TD;
A[Is the count of games > 2?]
B[Do any games besides the first 2 have a
RejectionReason of BeatmapNotPooled?]
C[Apply UnexpectedBeatmapsFound to WarningFlags]
D[Is the EndTime property equal to
2007-09-17-00:00:00?]
E[Apply NoEndTime flag to RejectionReason]
H[Is the match name structured in a typical
format?]
I[Apply UnexpectedNameFormat to WarningFlags]
J[Does the match name start with the tournament's
abbreviation?]
K[Apply NamePrefixMismatch flag to RejectionReason]
L[Is the tournament's lobby size equal to 1?]
M[Are the games structured in a way which supports
conversion to TeamVS?]
N[Attempt to convert a full set of Head to Head games to TeamVS]
O[Apply FailedTeamVsConversion flag to RejectionReason, repeat
for all child games]
P[Convert all games to TeamVS, mark all games as PreVerified]

F[Is the count of games equal to 0?]
G[Apply NoGames flag to RejectionReason]
Q[What is the count of PreVerified and/or Verified games?]
Q1[0]
Q2[1 or 2]
Q3[4 or 5]
Q4[&gt;5]
Q_A[Apply NoValidGames flag to RejectionReason]
Q_B[Apply UnexpectedGameCount flag to RejectionReason]
Q_C[Apply LowGameCount to WarningFlags]

PreTerm[Is the RejectionReason null?]
TermPositive[Change VerificationStatus to PreVerified]
TermNegative[Change VerificationStatus to PreRejected]

A -- Yes --> B
B -- Yes --> C --> D
A -- No --> D
B -- No --> D
D -- Yes --> E --> H
D -- No --> H

H -- No --> I --> J
H -- Yes --> J
J -- No --> K --> L
J -- Yes --> L
L -- No --> F
L -- Yes --> M
M -- Yes --> N
M -- No --> F
N -- Fail --> O --> F
N -- Success --> P --> F
F -- No --> Q
F -- Yes --> G --> PreTerm
Q --> Q1
Q --> Q2
Q --> Q3
Q --> Q4
Q1 --> Q_A --> PreTerm
Q2 --> Q_B --> PreTerm
Q3 --> Q_C --> PreTerm
Q4 --> PreTerm
PreTerm -- Yes --> TermPositive
PreTerm -- No --> TermNegative
```
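
The flowchart distinguishes rejection reasons, which decide the final status, from warning flags, which do not. The sketch below covers only a subset of the Match checks to show that distinction. The `Match` fields, the regular expression for a "typical" name, and the case-insensitive prefix comparison are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from enum import Flag, auto

# Placeholder end time named in the flowchart; it signals a missing end time.
PLACEHOLDER_END_TIME = datetime(2007, 9, 17)
# Hypothetical "typical" name format: "ABBR: (Team A) vs (Team B)".
TYPICAL_NAME = re.compile(r"^.+: \(.+\) vs\.? \(.+\)$", re.IGNORECASE)


class MatchRejectionReason(Flag):
    NONE = 0
    NO_END_TIME = auto()
    NAME_PREFIX_MISMATCH = auto()
    NO_GAMES = auto()


class MatchWarningFlags(Flag):
    NONE = 0
    UNEXPECTED_NAME_FORMAT = auto()


@dataclass
class Match:
    name: str
    end_time: datetime
    game_count: int


def check_match(match: Match, abbreviation: str):
    """Return (rejection reasons, warnings, status) for a subset of the Match checks."""
    reason = MatchRejectionReason.NONE
    warnings = MatchWarningFlags.NONE
    if match.end_time == PLACEHOLDER_END_TIME:
        reason |= MatchRejectionReason.NO_END_TIME
    if not TYPICAL_NAME.match(match.name):
        warnings |= MatchWarningFlags.UNEXPECTED_NAME_FORMAT
    # Case-insensitive comparison is an assumption.
    if not match.name.lower().startswith(abbreviation.lower()):
        reason |= MatchRejectionReason.NAME_PREFIX_MISMATCH
    if match.game_count == 0:
        reason |= MatchRejectionReason.NO_GAMES
    # Only rejection reasons decide the status; warnings are informational.
    status = "PreRejected" if reason else "PreVerified"
    return reason, warnings, status
```

A match with an unusual name but the correct prefix therefore ends up `PreVerified` with a warning attached, which gives reviewers a hint without blocking the match.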

### Game

```Mermaid
flowchart TD;
A[Is the beatmap null?]
B[Is there a known mappool for the tournament?]
C[Of all games in the tournament, is the beatmap
used exactly once?]
D[Apply BeatmapUsedOnce to WarningFlags]
E[Is the beatmap in the known mappool for the tournament?]
F[Apply BeatmapNotPooled flag to RejectionReason]
G[Is the EndTime property equal to
2007-09-17-00:00:00?]
H[Apply NoEndTime flag to RejectionReason]
I[Are invalid mods present at the game level?]
J[Apply InvalidMods flag to RejectionReason]
K[Does the ruleset match the tournament's ruleset?]
L[Apply RulesetMismatch flag to RejectionReason]
M[Is the count of scores 0?]
N[Apply NoScores flag to RejectionReason]
O[Is the count of PreVerified and/or Verified scores 0?]
P[Apply NoValidScores flag to RejectionReason]
Q[Is the count of PreVerified and/or Verified scores
half that of the tournament's LobbySize?]
R[Apply LobbySizeMismatch flag to RejectionReason]
S[Is the ScoringType ScoreV2?]
T[Apply InvalidScoringType flag to RejectionReason]
U[Is the TeamType TeamVs?]
V[Apply InvalidTeamType flag to RejectionReason]
PreTerm[Is the RejectionReason null?]
TermPositive[Change VerificationStatus to PreVerified]
TermNegative[Change VerificationStatus to PreRejected]

A -- Yes --> G
A -- No --> B
B -- Yes --> C
B -- No --> G
C -- Yes --> D --> E
C -- No --> E
E -- Yes --> G
E -- No --> F --> G
G -- Yes --> H --> I
G -- No --> I
I -- Yes --> J --> K
I -- No --> K
K -- Yes --> M
K -- No --> L --> M
M -- Yes --> N --> S
M -- No --> O
O -- Yes --> P --> S
O -- No --> Q
Q -- Yes --> S
Q -- No --> R --> S
S -- No --> T --> U
S -- Yes --> U
U -- No --> V --> PreTerm
U -- Yes --> PreTerm
PreTerm -- Yes --> TermPositive
PreTerm -- No --> TermNegative
```
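
The beatmap portion of the Game flow (nodes A through F) is the least linear part, so a short sketch may help. It assumes an empty set represents "no known mappool" and returns flag names as plain strings; the function and its parameters are hypothetical.

```python
from __future__ import annotations


def check_beatmap(beatmap_id: int | None, mappool: set[int], times_used: int):
    """Return (rejection flags, warning flags) for the beatmap portion of the Game checks.

    times_used is how often the beatmap appears across all games in the tournament.
    """
    rejections: set[str] = set()
    warnings: set[str] = set()
    # No beatmap, or no known mappool: the pool-related checks are skipped entirely.
    if beatmap_id is None or not mappool:
        return rejections, warnings
    if times_used == 1:
        warnings.add("BeatmapUsedOnce")
    if beatmap_id not in mappool:
        rejections.add("BeatmapNotPooled")
    return rejections, warnings
```

Note that, following the flowchart, the `BeatmapUsedOnce` warning is only considered when a mappool is known.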

### GameScore

```Mermaid
flowchart TD;
A[Is the score value > 1,000?]
B[Apply ScoreBelowMinimum flag to RejectionReason]
C[Does the score contain invalid mods?]
D[Apply InvalidMods flag to RejectionReason]
E[Does the ruleset match the tournament's ruleset?]
F[Apply RulesetMismatch flag to RejectionReason]
PreTerm[Is the RejectionReason null?]
TermPositive[Change VerificationStatus to PreVerified]
TermNegative[Change VerificationStatus to PreRejected]

A -- No --> B --> C
A -- Yes --> C
C -- Yes --> D --> E
C -- No --> E
E -- Yes --> PreTerm
E -- No --> F --> PreTerm
PreTerm -- Yes --> TermPositive
PreTerm -- No --> TermNegative
```
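
The GameScore flow is simple enough to sketch in full. The threshold and flag names come from the flowchart; the function signature is hypothetical, and because the rules for which mods count as invalid are not listed here, that decision is passed in as a boolean.

```python
from enum import Flag, auto

MINIMUM_SCORE = 1_000  # scores must be strictly greater than this value


class ScoreRejectionReason(Flag):
    NONE = 0  # assumption: stands in for a null RejectionReason
    SCORE_BELOW_MINIMUM = auto()
    INVALID_MODS = auto()
    RULESET_MISMATCH = auto()


def check_score(score: int, has_invalid_mods: bool, ruleset: str, tournament_ruleset: str):
    """Return (rejection reason, resulting status) for a single GameScore."""
    reason = ScoreRejectionReason.NONE
    if not score > MINIMUM_SCORE:
        reason |= ScoreRejectionReason.SCORE_BELOW_MINIMUM
    if has_invalid_mods:
        reason |= ScoreRejectionReason.INVALID_MODS
    if ruleset != tournament_ruleset:
        reason |= ScoreRejectionReason.RULESET_MISMATCH
    status = "PreRejected" if reason else "PreVerified"
    return reason, status
```

Every check runs regardless of earlier failures, so a score that fails several rules carries all of the corresponding flags.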

## FAQ

### How can a human manually mark all entities as `Verified`?

Most of the issues which require manual intervention are at the `Match` and `Game` levels. For example, if a `Match` has too many invalid games, it will be marked as `PreRejected` and require manual intervention. The same is true for `Game`s.

For `GameScore` entities, there are very concrete rules which can easily determine whether it should be `Rejected`, for example if the `Score` value is below the minimum.

We also have a web interface which allows reviewers to mark an entity - and all of its children - as `Verified` or `Rejected`. Generally speaking, if at a glance everything is marked as `PreVerified`, very little effort is required to manually approve these submissions. If the opposite is true, it's likely that the submission contains invalid data.

### In what cases should a human reviewer override a `PreRejected` status?

One example of where this should happen is [Corsace Open 2023](https://osu.ppy.sh/community/forums/topics/1794106?n=1). The system marked numerous matches in this tournament as `PreRejected` because their names do not consistently use the same prefix. In this case, the human reviewer should manually override the system's `PreRejected` status (assuming the `RejectionReason`s are of type `MatchRejectionReason.NamePrefixMismatch`).

![CleanShot 2024-11-29 at 09.12.53@2x.png](../images/co23-example.png)
5 changes: 5 additions & 0 deletions docs/topics/DataWorkerService.md
@@ -0,0 +1,5 @@
# DataWorkerService

These docs are currently under construction!

<toc depth="10" />
4 changes: 4 additions & 0 deletions docs/topics/Related-Services.md
@@ -0,0 +1,4 @@

# Related Services

Start typing here...
6 changes: 1 addition & 5 deletions docs/topics/Score-Modifications.md
@@ -21,8 +21,4 @@ This data includes:

 The o!TR team manipulates raw match data in the following ways:

-* Multiplies all score values that have the [`EZ` modifier](https://osu.ppy.sh/wiki/en/Gameplay/Game_modifier/Easy) by **1.75x**.
-
-## Removal of user information
-
-At this time, the o!TR team does not provide a mechanism for removing a user's information. Users who close or [delete their osu! account](https://osu.ppy.sh/wiki/en/Help_centre/Account#account-deletion) may have their data automatically removed or anonymized from our systems without any formal request to us.
+* Multiplies all score values that have the [`EZ` modifier](https://osu.ppy.sh/wiki/en/Gameplay/Game_modifier/Easy) by **1.75x**.