Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ChobbyLauncher crash reporting #2978

Closed
wants to merge 3 commits into from

Conversation

Mankarse
Copy link
Contributor

The primary purpose of these improvements is to collect more desync data.

See this example issue: Mankarse/Zero-K-Debug#34

I have split the changes into 5 commits, ordered from least to most controversial.

The first three commits are uncontroversial improvements.
The remaining two commits might be controversial, because they use GitHub in an unusual way.

1) Updated Octokit.net to the latest version
This contains a necessary bugfix. You should apply this, even if you do nothing else with my Pull Request.

2) Improve the algorithm in CrashReportHelper.Truncate
The new algorithm is required for the later commits to work.
It is still valuable even if they later commits are not used, because it is much faster.
The only downside is that its implementation is much larger.

3) Improve parsing of infolog_full.txt
Splits infolog by game, and extracts GameID, Desync locations, GameState file names.
Includes GameState file names in the Issue description (if they are relevant).

4) Use Issue Labels to link desync issues arising from the same game

Adds "GameStateFor-{gameID}" and "HasDesyncGameState" labels whenever a crash report includes GameState for a desync.

Add "HasLinkedDesyncIssue" label when there is an Issue with a matching "GameStateFor-{gameID}" label.

Adds links to existing Issues to the description of the new Issue.

This will create an unbounded number of Labels (one for each gameID).

Please suggest better names for these Labels, if you have any ideas.

5) Upload files to GitHub
This works by creating a "Release" for every 250 issues, and then uploading the files to that release.

The main risk of changes 4 and 5 is that they increase the chance that the ZeroK-RTS/CrashReports will hit
usage limits.

I am not aware of any documented limits that will be exceeded, but it is possible that GitHub will not like the unusual way that CrashReports will be being used.

@Licho1
Copy link
Member

Licho1 commented Apr 3, 2024

Have you tested this on both recent and older infologs?
Regarding release upload, it is a bit fishy.
Let's think about alternatives.
We could store it on our server machine or upload it to blobs which are used by server to store replays.

In github it could also be attached to wiki I think.

Hmm @GoogleFrog do you have some feedback on this?

Otherwise it looks fine to me.

We could also keep it in releases but it means we have to clean that periodically.

@Mankarse
Copy link
Contributor Author

Mankarse commented Apr 4, 2024

Thanks for looking at this, and thanks for the feedback.

Testing on older infolog.txt

In reality, the parsing will be run on infolog_full.txt, not infolog.txt.
Older infolog_full.txt are not retained, so I performed most of my testing on infolog.txt instead of infolog_full.txt.
The earliest infolog.txt that I have is from early 2023.
My testing did not reveal any problems related to the age or engine version of the infolog.txt, but I did encounter the occasional problem where the parser would produce slightly incorrect results. These appeared to be fixable, but there is a need to balance correctness in parsing against being excessively dependent on aspects of the infolog format that could change.

I will perform more detailed testing and analysis over the weekend.

File uploading

Over the past year, there were ~38000 issues created in CrashReports.
Of these, ~2600 were desync issues.

Going off the infolog/GameState files that I have, the expected (compressed) file sizes are:

Situation Size (MB)
Worst case (very large infolog, multiple GameState) 3
Average case (average sized infolog and single GameState) 1
Average case (infolog only, no GameState) 0.2

Multiplying this out, we get yearly growth estimate:

Situation Size (GB) Details
All worst case (unrealistic) 114 3MB * 38000
All average case 9.7 1MB * 2600 + 0.2MB*(38000-2600)
Only upload for desync (Average case) 2.6 1MB * 2600

This immediately rules out using a GitHub wiki (since GitHub wikis are Git repositories). From https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-large-files-on-github#repository-size-limits

We recommend repositories remain small, ideally less than 1 GB, and less than 5 GB is strongly recommended. ... If your repository excessively impacts our infrastructure, you might receive an email from GitHub Support asking you to take corrective action.

For GitHub Releases, there is no problem. From https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-large-files-on-github#distributing-large-binaries

We don't limit the total size of the binary files in the release or the bandwidth used to deliver them. However, each individual file must be smaller than 2 GiB.

As for uploading it to the server or to Azure Blobs: the amount of data doesn't prohibit this, as long as the files aren't retained for more than a year or two. If you have a preferred method, I'm willing to implement it (but it does look like a lot more work for me, compared to the GitHub releases version that I have already done).

In the current version, an untruncated infolog_full.txt is uploaded for every issue. The size estimate shows that the total data use could be substantially reduced if files were only uploaded for desync issues. What are your thoughts on this?

@GoogleFrog
Copy link
Contributor

@Licho1 I like the sound of wiki over releases. Otherwise I don't have much feedback as I don't know the technical details of the current system.

@Mankarse Mankarse force-pushed the master branch 2 times, most recently from a878f6e to 36d45f2 Compare April 14, 2024 11:07
Adds "GameStateFor-{gameID}" and "HasDesyncGameState" labels
 whenever a crash report includes GameState for a desync.

Add "HasLinkedDesyncIssue" label
 when there is an Issue with a matching "GameStateFor-{gameID}" label.

Adds links to existing Issues to the description of the new Issue.
This will only work if the ZeroK-RTS/CrashReports repository
has a branch named "main".

Currently the files that are uploaded are:
 infolog_full.txt (not truncated)
 GameState files (if there is a desync)
@Mankarse Mankarse closed this Aug 31, 2024
@Mankarse
Copy link
Contributor Author

Closing for now. If there is interest in the future I'll open a new pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants