Running queries: Compress large columns before transfer #1165

Draft
wants to merge 1 commit into main
Conversation

chadbaldwin
Contributor

This is not necessarily a complete implementation, but it is fully operational as far as I've tested locally.

Curious to see what you think about this type of optimization.

I use DBADash primarily over a VPN connection. The Running Queries grid is regularly the slowest to load, especially now that I have plan collection enabled. After some testing, I found that the majority of the wait time was network IO.

Compressing the two larger columns (batch_text and text) and only pulling the query plan when it's actually requested significantly improved usage over a VPN.
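For illustration only (not part of the PR): SQL Server's COMPRESS() emits GZIP-format varbinary, so a client can round-trip the compressed column with any standard gzip library. A minimal Python sketch, assuming NVARCHAR data (UTF-16):

```python
import gzip

# Sample batch text; repetition makes the compression benefit obvious.
batch_text = "SELECT * FROM dbo.Orders WHERE OrderDate >= '2024-01-01' " * 50

# NVARCHAR is UTF-16; SQL Server's COMPRESS() produces GZIP-format bytes,
# so gzip.compress/gzip.decompress model the server/client round trip.
compressed = gzip.compress(batch_text.encode("utf-16-le"))
restored = gzip.decompress(compressed).decode("utf-16-le")

assert restored == batch_text
print(f"original: {len(batch_text.encode('utf-16-le'))} bytes, "
      f"compressed: {len(compressed)} bytes")
```

The payload shrinks dramatically for repetitive SQL text, which is why the savings grow with the running-query count.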

Some test case results from my local machine over a VPN connection:

| Running Query Count | Before (ms) | After (ms) |
| ---: | ---: | ---: |
| 5 | 503 | 559 |
| 10 | 1,683 | 669 |
| 20 | 2,054 | 811 |
| 50 | 11,349 | 1,159 |
| 100 | 17,796 | 1,446 |
| 200 | 34,594 | 2,620 |
| 400 | 123,737 | 3,741 |

Since my changes to RunningQueries_Get are not backward compatible, I could see making the column compression optional. That way, if any users rely on RunningQueries_Get to return the columns that were removed, those columns would stay put, and a parameter could control whether they get swapped out for compressed versions.

@chadbaldwin
Contributor Author

chadbaldwin commented Dec 27, 2024

I also had a bunch of other ideas going around in my head while building this to make it a more flexible change that could be used in many other places. But I didn't want to go too overboard with it, and since this is a mostly behind-the-scenes change, it could always be changed later without breaking existing installations.

For example, you could have a standard naming convention coupled with the data type. All columns in the repository database matching "*_compressed varbinary(*)" could be detected and auto-decompressed when pulled.

Or maybe there's a better way to approach it than having to build a bunch of compressed/decompressed logic into various views, funcs and procs, etc.
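To sketch the naming-convention idea (purely hypothetical, not in the PR): the client could scan the result set's column names and transparently decompress anything ending in `_compressed`, again assuming GZIP-format varbinary holding UTF-16 text:

```python
import gzip

def decompress_row(columns, row):
    """Return a dict for one result row, decompressing any *_compressed
    column back to text under its base name (illustrative convention)."""
    out = {}
    for name, value in zip(columns, row):
        if name.endswith("_compressed") and isinstance(value, (bytes, bytearray)):
            # Strip the suffix so callers see the original column name.
            out[name[: -len("_compressed")]] = (
                gzip.decompress(value).decode("utf-16-le")
            )
        else:
            out[name] = value
    return out

# Simulated result set standing in for a real data reader.
cols = ["session_id", "batch_text_compressed"]
row = (51, gzip.compress("SELECT 1".encode("utf-16-le")))
print(decompress_row(cols, row))
```

One generic decoder at the data-access layer would then cover every view or proc that adopts the convention, instead of per-object decompress logic.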

@DavidWiseman
Collaborator

Thanks for the PR. I would guess that most of the gains come from removing the query plan column - I might be wrong though. It's already compressed so it makes sense to decompress in the app. Also, the column is large and only needed when clicking to view the query plan - removing it completely as you have done is the best approach.
I'm less certain that compressing batch_text and text will be universally beneficial as there is some overhead to compress/decompress - still it's probably worth it on slow connections and the extra overhead of compression is likely to be small in other cases. I'm OK with doing this if you've tested the benefit. Any extra work to compress/decompress the text is probably offset by the gains in not doing this for the query plan column.

RunningQueries_Get doesn't need to be backward compatible. As long as there is a major version bump, client apps will be prompted to update.

Options for further improvements (which don't need to be included in this PR)

The statement text is calculated from the batch_text; this calculation could also be done in the app, reducing the amount of data sent. There might also be an argument for storing the data compressed to reduce the storage cost. There would be some overhead in compressing on insert, but this could be done in the app.
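As a sketch of moving that calculation client-side: SQL Server's DMVs (e.g. sys.dm_exec_requests) report statement_start_offset and statement_end_offset as byte positions into the UTF-16 batch text, with -1 meaning "to the end of the batch". Extracting the statement in the app is then just a substring (hypothetical helper, not the PR's code):

```python
def extract_statement(batch_text: str, start_offset: int, end_offset: int) -> str:
    # Offsets from sys.dm_exec_requests are byte positions into the
    # UTF-16 batch text, so halve them to get character positions.
    start = start_offset // 2
    # An end offset of -1 means the statement runs to the end of the batch.
    end = len(batch_text) if end_offset == -1 else end_offset // 2
    return batch_text[start:end]

batch = "SELECT 1;SELECT 2;"
# Bytes 18..36 cover the second statement (chars 9..18).
print(extract_statement(batch, 18, 36))
```

With only the offsets sent alongside the (already transferred) batch text, the server never needs to ship a second copy of the statement.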

I haven't tested any of the changes yet - these are just my initial thoughts. It seems like a good fix. Once you are happy with the PR, let me know and I'll give it a proper test when I get time.

@chadbaldwin
Contributor Author

Ah that's a good point. I'll play around with the code and see how much of an impact each of the changes on their own make.

@chadbaldwin
Contributor Author

I did a quick test a few days ago and I think you're right. Simply removing the query plan made a pretty significant improvement on its own. There is still additional improvement from the pre-compression and decompression, but I would now argue it's not enough to justify the change and the risk of introducing new issues.

I'll do a bit more testing, but I will more than likely strip this PR down to just the query plan removal so that it's as simple a change as possible while making the largest impact.
