Research task: Create a microbenchmark setup to test the efficiency of WebSockets vs HTTP2/3 + SSE #301

Open
aeplay opened this issue Aug 12, 2024 · 49 comments · May be fixed by #597
Labels: 💎 Bounty · help wanted (Extra attention is needed) · perf (Performance/scalability improvement)

Comments

@aeplay
Contributor

aeplay commented Aug 12, 2024

In order to get an idea how best to proceed with #233, it would be good to have ballpark numbers of the performance characteristics of WebSockets vs HTTP requests + Server-Sent Events for our needs.

Setup

We can get this data - completely decoupled from the internals of Jazz - by creating some synthetic microbenchmarks.

  • We need a client & server setup that can handle WebSockets, HTTP 2 & 3 and Server-Sent Events.
  • The client should be in TypeScript, using only Browser-native APIs (WebSocket, fetch, EventSource)
  • For servers, I would like to evaluate
    • A) a plain Node.JS server (or higher-level wrapper of that)
    • B) uWebSockets.js, for both web sockets and HTTP3 requests
    • C) plain Node.JS server behind a Caddy proxy that handles HTTP3 & SSL
    • all of them should be single threaded, on a single node

Simulation details

Original data use that needs to be simulated

Currently, Jazz uses WebSockets to sync CoValue state between the client and syncing/persistence server.

The communication typically consists of three scenarios:

  1. Loading a CoValue
  • The client doesn't have a CoValue's edit history or only has it partially and requests the rest from the sync server.
  • The sync server responds with the missing parts of the CoValue edit history
  2. Creating a CoValue
  • The client creates a new CoValue locally and wants to upload it completely to the sync server
  3. Subscribing to and mutating a CoValue
  • A client is interested in realtime changes to a CoValue and subscribes to it
  • The server sends individual changes to the CoValue to the client as they happen
  • The client sends individual changes to the CoValue to the server when the user modifies the CoValue locally

Websockets vs Requests & SSE

Currently, 1., 2. and 3. all happen over WebSockets, with one packet per request/response/incoming update.

For using Requests and SSE instead, we would use Requests & Responses for 1. and 2., while for 3. we listen to incoming updates with Server-Sent Events and publish outgoing updates as a Request with no expected Response.
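To make the split above concrete, here is a minimal TypeScript sketch of which primitive would carry each kind of message under the two approaches. The names (`Scenario`, `Channel`, `channelFor`) are made up for illustration and are not part of Jazz.

```typescript
// Protocol labels follow the "Different protocols" list in this issue.
type Protocol = "WS" | "H2" | "H3";

// Scenario 3 is split into its two directions.
type Scenario = "load" | "create" | "incomingUpdate" | "outgoingUpdate";

type Channel =
  | "websocket-message"
  | "http-request-response"
  | "sse-event"
  | "http-request-no-response";

// Returns the primitive that would carry a given scenario.
function channelFor(protocol: Protocol, scenario: Scenario): Channel {
  // With WebSockets only, everything travels as a message on the one socket.
  if (protocol === "WS") return "websocket-message";

  // With HTTP/2 or HTTP/3 + SSE, the scenarios are split by direction.
  switch (scenario) {
    case "load":
    case "create":
      return "http-request-response";
    case "incomingUpdate":
      return "sse-event";
    case "outgoingUpdate":
      // The request still gets an HTTP status, but its body is not used.
      return "http-request-no-response";
  }
}
```

For example, `channelFor("H3", "incomingUpdate")` gives `"sse-event"`, while every scenario maps to `"websocket-message"` under `"WS"`.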

Simulation spec

There are roughly two classes of CoValues: structured CoValues (thousands of <50 byte edits) and binary-data CoValues (few edits that are each 100kB).

Since we are only interested in the data transmission performance, we can model the scenarios using packets containing random data:

  1. Loading a CoValue
  • structured: A 100 byte request from the client and a 5kB response from the server
  • binary: A 100 byte request from the client and a 50MB response from the server, streamed in 100kB chunks
  2. Creating a CoValue
  • structured: A 10kB request from the client and a 10 byte response from the server
  • binary: A 50MB request from the client, streamed in 100kB chunks and a 10 byte response from the server
  3. Subscribing to and mutating a CoValue
  • structured: 50 byte incoming SSE messages/WebSocket packets, mutations are 50 byte outgoing messages as a request/WebSocket packet
    • assume one client creating a mutation that is published to 10 other clients
  • binary: 100kB incoming SSE messages/WebSocket packets, mutations are 100kB outgoing messages as a request/WebSocket packet

No extra HTTP headers should be set (other than what the browser sets by default, and these should be minimised if possible).
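The load and create sizes above can be written down as data, which also makes the chunk arithmetic explicit. This is only a sketch: the `TransferSpec` shape and helper names are hypothetical, and it assumes decimal units (1 kB = 1000 bytes), which the spec does not state.

```typescript
// Assumption: decimal units. With binary units the chunk count would differ.
const KB = 1000;
const MB = 1000 * KB;

interface TransferSpec {
  requestBytes: number;
  responseBytes: number;
  chunkBytes?: number; // set when the large side is streamed in chunks
}

type SpecKey = "load/structured" | "load/binary" | "create/structured" | "create/binary";

const specs: Record<SpecKey, TransferSpec> = {
  "load/structured": { requestBytes: 100, responseBytes: 5 * KB },
  "load/binary": { requestBytes: 100, responseBytes: 50 * MB, chunkBytes: 100 * KB },
  "create/structured": { requestBytes: 10 * KB, responseBytes: 10 },
  "create/binary": { requestBytes: 50 * MB, responseBytes: 10, chunkBytes: 100 * KB },
};

// Number of chunks needed for the streamed (larger) side of a transfer.
function chunkCount(spec: TransferSpec): number {
  if (spec.chunkBytes === undefined) return 1;
  const streamed = Math.max(spec.requestBytes, spec.responseBytes);
  return Math.ceil(streamed / spec.chunkBytes);
}

// Random filler data. Math.random is fine here because the content only
// needs to be incompressible-looking, not cryptographically secure.
function randomPayload(bytes: number): Uint8Array {
  const out = new Uint8Array(bytes);
  for (let i = 0; i < bytes; i++) out[i] = Math.floor(Math.random() * 256);
  return out;
}
```

Under the decimal-unit assumption, both binary transfers come out at 500 chunks of 100kB each.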

Target metrics

The main variables we are interested in are

  1. Loading a CoValue
  • a) How many CoValues can be "simulation loaded" at once on a client? Do we get head-of-line blocking effects? (you can do structured/binary separately)
  • b) How many clients can "simulation load" a CoValue at the same time per server?
  2. Creating a CoValue
  • c) How many CoValues can be "simulation created" at once on a client? Do we get head-of-line blocking effects? (you can do structured/binary separately)
  • d) How many clients can "simulation create" a CoValue at the same time per server?
  3. Subscribing to and mutating a CoValue
  • e) How many updates per second can we push from a client?
  • f) How many updates per second can a server handle (receive and broadcast to interested clients)?
  • g) What data throughput can we achieve for binary CoValue updates?
  • h) What's the latency like?
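For metrics e) to h), raw samples have to be reduced to a few comparable numbers. A minimal sketch, assuming latencies are collected in milliseconds and that a nearest-rank percentile is precise enough for ballpark figures; the helper names are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((x, y) => x - y);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  const value = sorted[rank - 1];
  if (value === undefined) throw new RangeError("p must be between 0 and 100");
  return value;
}

interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

// Metric h): summarise latency samples (milliseconds).
function summarize(latenciesMs: number[]): LatencySummary {
  return {
    p50: percentile(latenciesMs, 50),
    p95: percentile(latenciesMs, 95),
    p99: percentile(latenciesMs, 99),
  };
}

// Metrics e) and f): updates per second over a measured window.
function updatesPerSecond(updateCount: number, elapsedMs: number): number {
  return (updateCount * 1000) / elapsedMs;
}

// Metric g): throughput in (decimal) megabytes per second.
function megabytesPerSecond(bytes: number, elapsedMs: number): number {
  return bytes / 1_000_000 / (elapsedMs / 1000);
}
```

Reporting percentiles rather than a mean keeps a few slow outliers (for example, a reconnect) from hiding the typical case.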

Variables

It would be good to get results for the metrics above assuming

Different network conditions

  • I Ideal network conditions
  • II 4G speeds
  • III 3G speeds
  • IV connections with high packet loss, including so bad that we need to reconnect WebSockets
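One way conditions I to III could be simulated locally is Chrome's DevTools Protocol command `Network.emulateNetworkConditions`, which takes `offline`, `latency` (ms) and `downloadThroughput` / `uploadThroughput` (bytes per second, `-1` to disable throttling). The sketch below only builds those parameters. The numbers for 4G and 3G are illustrative placeholders, not measured or agreed values, and the `NetworkProfile` shape is hypothetical.

```typescript
interface NetworkProfile {
  rttMs: number;
  downKbps: number | null; // kilobits per second; null = unthrottled
  upKbps: number | null;
}

// Placeholder values (assumptions), to be replaced with agreed figures.
const profiles: Record<"I" | "II" | "III", NetworkProfile> = {
  I: { rttMs: 0, downKbps: null, upKbps: null },
  II: { rttMs: 50, downKbps: 9000, upKbps: 3000 },
  III: { rttMs: 150, downKbps: 1600, upKbps: 750 },
};

// Parameter object for the CDP command Network.emulateNetworkConditions.
interface CdpNetworkConditions {
  offline: boolean;
  latency: number; // added latency in milliseconds
  downloadThroughput: number; // bytes per second, -1 disables throttling
  uploadThroughput: number; // bytes per second, -1 disables throttling
}

function toCdpParams(profile: NetworkProfile): CdpNetworkConditions {
  const bytesPerSecond = (kbps: number | null): number =>
    kbps === null ? -1 : (kbps * 1000) / 8;
  return {
    offline: false,
    latency: profile.rttMs,
    downloadThroughput: bytesPerSecond(profile.downKbps),
    uploadThroughput: bytesPerSecond(profile.upKbps),
  };
}
```

With Playwright and Chromium, such an object could be sent through a CDP session (`context.newCDPSession(page)` followed by `session.send("Network.emulateNetworkConditions", params)`). These four parameters do not cover condition IV: packet loss would need a different mechanism, for example OS-level traffic shaping such as `tc netem` on Linux. It is also worth verifying that the throttling applies to WebSocket traffic in the Chrome version used.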

Different protocols

  • WS: WebSockets only (with reconnect on timeout)
  • H2: HTTP2 + SSE
  • H3: HTTP3 + SSE

You don't need to actually deploy a server anywhere if you can simulate these conditions locally. Just make sure to note down your hardware specs and use exactly one thread/core for the server.

Dimensions summary

So in total we have the following dimensions:

  • Server tech: A, B and C
  • Target metrics: a), b), c), d), e), f), g), h)
  • Network condition: I, II, III, IV
  • Protocols: WS, H2, H3
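Taken at face value, the four dimensions multiply out to 3 × 8 × 4 × 3 = 288 combinations. A small sketch that enumerates them (the `Run` type and `allRuns` function are hypothetical names, and it ignores that some server/protocol pairs may turn out not to be available):

```typescript
const servers = ["A", "B", "C"] as const;
const metrics = ["a", "b", "c", "d", "e", "f", "g", "h"] as const;
const networks = ["I", "II", "III", "IV"] as const;
const protocols = ["WS", "H2", "H3"] as const;

interface Run {
  server: (typeof servers)[number];
  metric: (typeof metrics)[number];
  network: (typeof networks)[number];
  protocol: (typeof protocols)[number];
}

// Cartesian product of all four dimensions.
function allRuns(): Run[] {
  const runs: Run[] = [];
  for (const server of servers) {
    for (const metric of metrics) {
      for (const network of networks) {
        for (const protocol of protocols) {
          runs.push({ server, metric, network, protocol });
        }
      }
    }
  }
  return runs;
}
```

Enumerating the runs up front makes it easier to see which cells of the results table are still missing.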

Deliverable

  • Create a completely new project in a new folder in this monorepo called 'experiments'
  • Create an independent pnpm workspace in there
  • Use whatever test-harness / benchmarking libraries / node setups / browser automation you think are best
  • Make sure the setup is replicable / can be run by anyone
  • Post results from your machine in a README in the same folder

I realise this spec is a lot, so feel free to ask lots of clarifying questions before & after accepting the task!

@aeplay
Contributor Author

aeplay commented Aug 12, 2024

/bounty $2000


algora-pbc bot commented Aug 12, 2024

💎 $3,000 bounty • Garden Computing

Steps to solve:

  1. Start working: Comment /attempt #301 with your implementation plan
  2. Submit work: Create a pull request including /claim #301 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to garden-co/jazz!


| Attempt | Started (GMT+0) | Solution |
|---|---|---|
| 🟢 @ayewo | Aug 12, 2024, 4:59:43 PM | #597 |

aeplay added the help wanted (Extra attention is needed) and perf (Performance/scalability improvement) labels on Aug 12, 2024
aeplay changed the title from "Research task: Create a microbenchmark setup to test the efficiency of WebSockets vs HTTP1/2/3 + SSE" to "Research task: Create a microbenchmark setup to test the efficiency of WebSockets vs HTTP2/3 + SSE" on Aug 12, 2024
aeplay pinned this issue on Aug 12, 2024
@ayewo

ayewo commented Aug 12, 2024

Hi @aeplay

I see your project just joined Algora, so welcome!

Different projects pick different styles of working so I'm curious, how do you want attempts at this bounty to shake out:

  • first-come first served i.e. the first dev to ask nicely gets assigned to work on the issue;
  • battle royale i.e. let multiple devs attempt and you'll evaluate PRs as they roll in?

The risk with style 1. is that the assigned dev might take too long to show progress (if they are inexperienced or experienced but busy with a day job).

The risk with style 2. is since there is only 1 bounty reward, anyone willing to work on this risks getting blind-sided by other devs who open PRs to claim the bounty.

@aeplay
Contributor Author

aeplay commented Aug 12, 2024

Hey @ayewo good question! This is what I'm most curious about in this bounty model as well.

For this task I would say first-come first-serve, since it is quite a detailed project and I would hate anyone to waste their effort. There is no super urgent deadline for it, so I would be happy to let the first serious contender iterate on it with my input.

@ayewo

ayewo commented Aug 12, 2024

Excellent! In that case, I'd like to /attempt #301 this.

| Algora profile | Completed bounties | Tech |
|---|---|---|
| @ayewo | 22 bounties from 5 projects | TypeScript, Rust, JavaScript & more |

@aeplay
Contributor Author

aeplay commented Aug 12, 2024

@ayewo yes! Let's gooo

Please ask for clarifications here, I'm in GMT+1 and mostly available during normal work hours, but also during other times on my phone for quick answers

@ayewo

ayewo commented Aug 12, 2024

@aeplay I'm also in GMT+1 :) and just joined your Discord.

I take it you prefer clarifications happen here in the open, right?

@aeplay
Contributor Author

aeplay commented Aug 12, 2024

yes please, don't worry about making this issue noisy, that's what it's for

@ayewo

ayewo commented Aug 12, 2024

Roger that. Would appreciate it if you can assign the issue to me, otherwise there will be drive-by attempts from other devs feigning ignorance of our conversation above.

@aeplay
Contributor Author

aeplay commented Aug 12, 2024

done, thanks for walking me through this

@DhairyaMajmudar

Hi @aeplay, I'm interested in the client and CI/CD workflow parts of the project. If it's okay, can we split the project, @ayewo?

@aeplay
Contributor Author

aeplay commented Aug 12, 2024

Hey @DhairyaMajmudar I appreciate your offer but would like to keep this focused on one person attempting it. Thank you!

@DhairyaMajmudar

> Hey @DhairyaMajmudar I appreciate your offer but would like to keep this focused on one person attempting it. Thank you!

That's fine!

@aeplay
Contributor Author

aeplay commented Aug 12, 2024

And @ayewo just clarifying: there is no CI/CD aspect to this project - it's all meant to be run manually.

@ayewo

ayewo commented Aug 12, 2024

@aeplay Yes, understood. You want this to be single-thread and launched locally.

(I built a microbenchmark recently using a combination of PowerShell (on Windows) and Bash (on Linux), but each was executed remotely on EC2 instances using Terraform.)

@MentalGear

Hint: next time, you may want to keep applications open a bit longer so you can evaluate a few applicants. It doesn't have to be first-come, first-served or battle royale.

@aeplay
Contributor Author

aeplay commented Aug 13, 2024

> Hint: next time, you may want to keep applications open a bit longer so you can evaluate a few applicants. It doesn't have to be first-come, first-served or battle royale.

Makes sense, but this time I wanted to move quickly and @ayewo seemed eager and capable so I just went with him

@ayewo

ayewo commented Aug 14, 2024

I'd like to share some progress on the research I've done so far and ask a few questions.

I looked into the HTTP protocol versions supported by servers A, B, and C and it seems that only Caddy supports all three versions of the HTTP protocol natively (i.e. HTTP/1.1, HTTP/2 and HTTP/3).

HTTP Versions Supported by Web Servers [1]

| # | Server | HTTP/1.1 | HTTP/2 | HTTP/3 |
|---|---|---|---|---|
| A | Node.js v22.6.0 | ✔️ | ✔️ | ❌ |
| B | uWebSockets.js v20.47.0 | ✔️ | ❌ | ⚠️ (experimental, development paused) |
| C | Caddy v2.8.4 | ✔️ | ✔️ (and HTTP/2 over cleartext (H2C)) | ✔️ |

Node.js doesn't yet support HTTP/3 natively. I came across a 3rd-party repo (https://github.com/endel/webtransport-nodejs) that claims to offer HTTP/3 support, but I didn't look too closely.

Since you also want to test against 3 different protocols:

  • WS: WebSockets only (with reconnect on timeout)
  • H2: HTTP2 + SSE
  • H3: HTTP3 + SSE

I tried to map servers A, B, C to the 3 protocols to see what is possible:

Web Server to Web Protocol Mapping

| # | Server | Layer 7 | Layer 6 [2] | Layer 4 | Supported |
|---|---|---|---|---|---|
| A1 | Node.js | HTTP/1.1 + WebSocket (WS) | TLSv1.3 (Optional) | TCP | ✔️ |
| A2 | Node.js | HTTP/2 + Server-Sent Events (SSE) | TLSv1.3 (Optional) | TCP | ✔️ |
| A3 | Node.js | HTTP/3 + SSE | TLSv1.3 (Mandatory) | UDP (QUIC) | |
| B1 | uWebSockets.js | HTTP/1.1 + WS | TLSv1.3 (Optional) | TCP | ✔️ |
| B2 | uWebSockets.js | HTTP/2 + SSE | TLSv1.3 (Optional) | TCP | |
| B3 | uWebSockets.js | HTTP/3 + SSE | TLSv1.3 (Mandatory) | UDP (QUIC) | |
| C1 | Caddy | HTTP/1.1 + WS | TLSv1.3 (Optional) | TCP | ✔️ |
| C2 | Caddy | HTTP/2 + SSE | TLSv1.3 (Optional) | TCP | ✔️ |
| C3 | Caddy | HTTP/3 + SSE | TLSv1.3 (Mandatory) | UDP (QUIC) | ✔️ |

Questions

  1. Is my understanding of the server to protocol mapping correct?

  2. In the Simulation spec section, you wrote:

  3. Subscribing to and mutating a CoValue
  • structured: 50 byte incoming SSE messages/WebSocket packets, mutations are 50 byte outgoing messages as a request/WebSocket packet
    • assume one client creating a mutation that is published to 10 other clients

For the actual test, does this imply that after each server is started, 10 clients will be spawned that will subscribe to a CoValue, then 1 client will mutate the CoValue triggering a notification by the server to those 10 clients?

Footnotes

  1. The emojis also link to the relevant docs.

  2. HTTP and TLS are both layer 4 protocols in the TCP/IP model but I opted for the OSI model here to keep things clear.

@aeplay
Contributor Author

aeplay commented Aug 15, 2024

Hey @ayewo, thanks for sharing your research results in such a well-structured format.

  1. It matches what I was aware of. For uWebsockets.js please can you try the experimental HTTP3 support and let me know how it goes?

  2. Your understanding is correct, and just to be clear, I am not expecting you to do anything with Jazz/actual CoValues, we are just simulating their traffic patterns by sending (client -> server) and then broadcasting (server -> 10 clients) random data
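The traffic pattern confirmed here (one client sends, the server rebroadcasts to 10 others) can be modelled in memory before any real transport is involved. A minimal sketch with hypothetical names (`BroadcastHub`, `simulateFanOut`); it stands in for the server's subscription bookkeeping only and does no networking.

```typescript
type Listener = (payload: Uint8Array) => void;

// Keeps track of subscribers and forwards a payload to everyone but the sender.
class BroadcastHub {
  private listeners = new Map<number, Listener>();
  private nextId = 1;

  subscribe(listener: Listener): number {
    const id = this.nextId++;
    this.listeners.set(id, listener);
    return id;
  }

  // Returns how many subscribers the payload was delivered to.
  publish(senderId: number, payload: Uint8Array): number {
    let delivered = 0;
    this.listeners.forEach((listener, id) => {
      if (id === senderId) return; // do not echo back to the writer
      listener(payload);
      delivered++;
    });
    return delivered;
  }
}

// One writer plus `subscriberCount` readers; returns bytes received per reader.
function simulateFanOut(subscriberCount: number, payloadBytes: number): number[] {
  const hub = new BroadcastHub();
  const received: number[] = [];
  const writerId = hub.subscribe(() => {
    throw new Error("writer should not receive its own mutation");
  });
  for (let i = 0; i < subscriberCount; i++) {
    received.push(0);
    hub.subscribe((payload) => {
      received[i] = (received[i] ?? 0) + payload.length;
    });
  }
  hub.publish(writerId, new Uint8Array(payloadBytes));
  return received;
}
```

With `simulateFanOut(10, 50)`, each of the 10 readers sees one 50-byte update, so a single structured mutation costs the server one inbound and ten outbound messages.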

@aeplay
Contributor Author

aeplay commented Aug 15, 2024

Some more clarifications:

  1. In all cases I'd like you to contrast using only WebSockets (for outgoing requests and their responses, and incoming notifications) vs HTTP (for outgoing requests and their responses) + SSE (incoming notifications)
  2. uWebsockets should support HTTP2, right?
  3. With Caddy, I'm only really interested in using it as a HTTP3-handling reverse proxy in front of Node.JS

So the full mapping would look like this

Web Server to Web Protocol Mapping

| # | Server | Layer 7 | Layer 6 | Layer 4 | Port | Supported |
|---|---|---|---|---|---|---|
| A1 | Node.js | WebSockets only | TLSv1.3 (Optional) | TCP | 3001 | ✔️ |
| A2 | Node.js | HTTP/1 + Server-Sent Events (SSE) | TLSv1.3 (Optional) | TCP | 3002 | ✔️ |
| A3 | Node.js | HTTP/2 + Server-Sent Events (SSE) | TLSv1.3 (Optional) | TCP | 3003 | ✔️ |
| A4 | Node.js | HTTP/3 + SSE | TLSv1.3 (Mandatory) | UDP (QUIC) | 3004 | |
| B1 | uWebSockets.js | WebSockets only | TLSv1.3 (Optional) | TCP | 4001 | ✔️ |
| B2 | uWebSockets.js | HTTP/1 + SSE | TLSv1.3 (Optional) | TCP | 4002 | ✔️ |
| B3 | uWebSockets.js | HTTP/2 + SSE | TLSv1.3 (Optional) | TCP | 4003 | ✔️ |
| B4 | uWebSockets.js | HTTP/3 + SSE | TLSv1.3 (Mandatory) | UDP (QUIC) | 4004 | ⚠️ (try) |
| C1 | Caddy (in front of Node.JS) | HTTP/3 + SSE | TLSv1.3 (Mandatory) | UDP (QUIC) | 5001 | ✔️ |
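The mapping above can also be kept as data so that a benchmark runner iterates over it one case at a time. This is a sketch only: the `ServerCase` shape is hypothetical, and the blank "Supported" cell for A4 is recorded as `"no"` on the basis of the earlier remark that Node.js has no native HTTP/3 support.

```typescript
type Support = "yes" | "no" | "try";

interface ServerCase {
  id: string;
  server: string;
  layer7: string;
  transport: "TCP" | "UDP (QUIC)";
  port: number;
  support: Support;
}

// Transcribed from the table as posted in this comment.
const cases: ServerCase[] = [
  { id: "A1", server: "Node.js", layer7: "WebSockets only", transport: "TCP", port: 3001, support: "yes" },
  { id: "A2", server: "Node.js", layer7: "HTTP/1 + SSE", transport: "TCP", port: 3002, support: "yes" },
  { id: "A3", server: "Node.js", layer7: "HTTP/2 + SSE", transport: "TCP", port: 3003, support: "yes" },
  { id: "A4", server: "Node.js", layer7: "HTTP/3 + SSE", transport: "UDP (QUIC)", port: 3004, support: "no" },
  { id: "B1", server: "uWebSockets.js", layer7: "WebSockets only", transport: "TCP", port: 4001, support: "yes" },
  { id: "B2", server: "uWebSockets.js", layer7: "HTTP/1 + SSE", transport: "TCP", port: 4002, support: "yes" },
  { id: "B3", server: "uWebSockets.js", layer7: "HTTP/2 + SSE", transport: "TCP", port: 4003, support: "yes" },
  { id: "B4", server: "uWebSockets.js", layer7: "HTTP/3 + SSE", transport: "UDP (QUIC)", port: 4004, support: "try" },
  { id: "C1", server: "Caddy (in front of Node.JS)", layer7: "HTTP/3 + SSE", transport: "UDP (QUIC)", port: 5001, support: "yes" },
];

// Cases worth attempting: everything except the ones known to be unsupported.
const runnable: ServerCase[] = cases.filter((c) => c.support !== "no");
```

Keeping the `"try"` state separate from `"yes"` lets the README report experimental cases (such as B4) as attempted rather than silently dropping them.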

@ayewo

ayewo commented Aug 15, 2024

Thanks for the confirmation.

  1. OK. I'll make a note to investigate HTTP/3 support in uWebSockets.js.
  2. Understood.

Please note that I have updated the 2nd table to remove the port numbers. I imagine each of the servers A,B & C will be started as a standalone process so they could simply listen on the same port i.e. localhost:3000 instead of listening on individual ports localhost:3001, localhost:4001 etc that I originally used.

@ayewo

ayewo commented Aug 15, 2024

  1. Yes, this was in fact one of my follow-up questions. I imagine that the set-up in A1 is essentially your baseline, i.e. what you are using today. The others (A2-A3, B1-B3 and C1-C3) are what the synthetic benchmark would be uncovering, correct?
  2. From my research, it seems uWebsockets.js doesn't support HTTP/2 at all, only HTTP/1.1.
  3. Understood. It was super clear in your original description that Caddy would serve as a reverse proxy.

@aeplay
Contributor Author

aeplay commented Aug 15, 2024

yeah makes sense re ports - we can run the different cases in succession

Just double checked re uWebSockets and HTTP2 - you're right, that's surprising. Remove that case then, but try HTTP1 + SSE, please

@aeplay
Contributor Author

aeplay commented Aug 15, 2024

> 1. Yes, this was in fact one of my follow-up questions. I imagine that the set-up in A1 is essentially your baseline, i.e. what you are using today. The others (A2-A3, B1-B3 and C1-C3) are what the synthetic benchmark would be uncovering, correct?

This is exactly the case, correct

@ayewo

ayewo commented Aug 15, 2024

Another question: can I assume that all protocol combinations will use TLS in the benchmarks?
TLS is optional in HTTP/1.1 and HTTP/2 (h2c), but HTTP/3 does not work over plaintext, so it is the only one of these protocols where TLS is mandatory.

@aeplay
Contributor Author

aeplay commented Aug 15, 2024

yes please assume and use TLS for everything (local certs are ok), because one thing I am interested in is how long it takes to bootstrap a connection - which is most noticed on interrupted connections. I'm expecting Websockets + TLS to be the longest and HTTP3 + SSE + TLS to be the fastest in this regard.
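The expectation above can be sanity-checked with back-of-the-envelope round-trip counting. The sketch below is an idealised model, not a measurement. It assumes a fresh connection, TLS 1.3 without session resumption or 0-RTT, no TCP Fast Open, no packet loss, WebSockets opened via the HTTP/1.1 Upgrade handshake, and a client that already knows the server speaks HTTP/3 (in practice browsers usually learn this from an earlier connection, for example via an Alt-Svc header). The names are hypothetical.

```typescript
type Protocol = "WS" | "H2" | "H3";

// Round trips needed before the first application message can be sent.
const setupRoundTrips: Record<Protocol, number> = {
  WS: 3, // TCP handshake + TLS 1.3 handshake + HTTP/1.1 Upgrade exchange
  H2: 2, // TCP handshake + TLS 1.3 handshake
  H3: 1, // QUIC combines the transport and TLS 1.3 handshakes
};

// Time until the reply to the first request arrives: setup plus one more
// round trip for the request/response itself.
function msUntilFirstResponse(protocol: Protocol, rttMs: number): number {
  return (setupRoundTrips[protocol] + 1) * rttMs;
}
```

On a 100 ms round trip this model gives 400 ms for WS, 300 ms for H2 and 200 ms for H3, which matches the expected ordering. The benchmark would show how far real stacks deviate from it.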

@ayewo

ayewo commented Aug 15, 2024

Got it.

@ayewo

ayewo commented Aug 15, 2024

More questions.

The Simulation spec talks about simulating the transfer of structured and binary data. But looking at the main differences (source) between WebSockets and SSE in the table below:

| WebSockets | Server-Sent Events |
|---|---|
| Two-way message transmission | One-way message transmission (server to client) |
| Supports binary and UTF-8 data transmission | Supports UTF-8 data transmission only |
| Supports a large number of connections per browser | Supports a limited number of connections per browser (six) |

  1. SSE only supports UTF-8 data transmission. I guess for SSE, this implies the use of base64 to encode and decode binary each way?

  2. What about the 50MB limit? Is it the final payload size prior to being base64 encoded?

  3. For the client that will interact with the sync server using browser-native APIs (WebSocket, fetch, EventSource), is using a (headless) Chrome instance from playwright sufficient? Or do you want the browser client to be configurable? In other words, does the tester only get to use Chrome, or can they pick from any of the browsers supported by playwright (Chrome, Edge, Safari (WebKit) or Firefox), as long as those browser-native APIs are properly supported?

@aeplay
Contributor Author

aeplay commented Aug 15, 2024

  1. Use base64 encoding everywhere
  2. 50mb prior to encoding
  3. Just chrome is fine
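For a sense of what "prior to encoding" means on the wire: base64 turns every 3 raw bytes into 4 characters, padding the last group. A small sketch of the arithmetic, with hypothetical helper names and assuming an SSE event of the simple form `data: <base64>` followed by a blank line:

```typescript
// Length of the padded base64 encoding of `rawBytes` bytes.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Size of a single-line SSE event carrying that payload:
// "data: " + base64 payload + "\n\n" (the blank line ends the event).
function sseFrameLength(rawBytes: number): number {
  return "data: ".length + base64Length(rawBytes) + "\n\n".length;
}

// Relative size increase caused by base64 alone.
function base64Overhead(rawBytes: number): number {
  return base64Length(rawBytes) / rawBytes - 1;
}
```

A 100,000-byte chunk becomes 133,336 characters and a 50,000,000-byte value becomes 66,666,668, so roughly a third more data is transferred than the nominal sizes in the spec suggest.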

aeplay unpinned this issue on Aug 20, 2024
@ayewo

ayewo commented Aug 28, 2024

> 1. Use base64 encoding everywhere
> 2. 50mb prior to encoding

Re: 1 & 2

Can you relax this so that base64 encoding is not necessary for loading/creating binary CoValues?

In other words, base64 encoding will only be used for delivering subscription events over a WebSocket or SSE?

It's much easier to split a 50MB binary file, as is, and stream it in 100KB chunks in either direction (server->client and client->server) than to do so with base64 encoding added to the mix.

@aeplay
Contributor Author

aeplay commented Aug 31, 2024

Hey @ayewo sorry for the late reply.

Yes happy to relax this.

Ideally (to be most similar to cojson) you could base64 encode the individual chunks - but if it's simpler to have them binary wherever possible just do that - it's not really relevant to the main concern.
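Either way, the splitting step itself is small. A minimal sketch of cutting a binary payload into fixed-size chunks (the `splitIntoChunks` name is hypothetical); each chunk could then be sent as-is or base64 encoded individually, as suggested above.

```typescript
// Splits `data` into consecutive chunks of at most `chunkBytes` bytes.
// subarray() returns views onto the same buffer, so nothing is copied.
function splitIntoChunks(data: Uint8Array, chunkBytes: number): Uint8Array[] {
  if (!Number.isInteger(chunkBytes) || chunkBytes <= 0) {
    throw new RangeError("chunkBytes must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += chunkBytes) {
    chunks.push(data.subarray(offset, offset + chunkBytes));
  }
  return chunks;
}
```

Only the final chunk can be shorter than `chunkBytes`. For a payload that is an exact multiple of the chunk size, such as 50MB in 100kB chunks with decimal units, every chunk is full.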

Thank you

@ayewo

ayewo commented Sep 16, 2024

Hey @aeplay

Brief status update:

Right now, I have all 3 use cases working for text and binary CoValues over:

  • HTTP/1 + SSE (Node.js)
  • HTTP/2 + SSE (Node.js)

I'm still trying to finish the WebSocket implementation, and I'm hoping I can reuse most of the SSE browser client code to build the browser client that will interact with the WebSocket server.

(PS: I’ve been really poor at sharing updates on my progress because I have been dealing with regular interruptions. So sorry about that.)

@aeplay
Contributor Author

aeplay commented Sep 19, 2024

Hey @ayewo no worries, thanks for your update - looking forward to it!

@aeplay
Contributor Author

aeplay commented Oct 14, 2024

Hey @ayewo any updates on this? :)

@ayewo

ayewo commented Oct 14, 2024

@aeplay
Still working on it :).

I should open a PR tomorrow or Wednesday, God willing.

@ayewo

ayewo commented Oct 17, 2024

> I should open a PR tomorrow or Wednesday, God willing.

Turns out my estimate was off by a few days as there are parts of the benchmarks’ plumbing that are not yet finished. Sorry about that.

When I started, I must have interpreted the Deliverable section as saying that a PR should only be opened when the code is close to done. But re-reading it now, I realize I should have simply asked for your preference:

  • are you fine with me opening a draft PR for WIP code,
  • or should I open a PR when the code is close to being done?

Right now, I have most things working. The only requirement I haven't touched at all is simulating the various Network conditions: I, II, III, IV.

@aeplay
Contributor Author

aeplay commented Oct 17, 2024

That sounds wonderful, I would love to see a WIP draft PR. Thank you!

ayewo linked a pull request on Oct 17, 2024 that will close this issue

algora-pbc bot commented Oct 17, 2024

💡 @ayewo submitted a pull request that claims the bounty. You can visit your bounty board to reward.

@ayewo

ayewo commented Oct 17, 2024

I've opened a draft PR and included basic instructions at the end on how to set it up locally for testing.

@aeplay
Contributor Author

aeplay commented Nov 5, 2024

Any news @ayewo ?

@ayewo

ayewo commented Nov 5, 2024

Hey @aeplay
Sorry I haven't been able to share any updates yet.

I've been AFK for a few weeks because I traveled. I should be returning home this weekend, God willing.

@ayewo

ayewo commented Dec 20, 2024

Hey @aeplay

I now have all web servers working—6 of them—and it is clear I really underestimated the amount of plumbing required to have everything working, end-to-end.

As requested, I investigated using HTTP/3 with the uWebSockets.js project and it turned out to be unworkable, as I mentioned in my preliminary findings, because support was experimental and development had been paused.

Right now, their experimental HTTP/3 implementation exposed via uWS.H3App on Node.js seg faults on macOS, so I could only use HTTP/1.1 in the benchmarks. (They never implemented HTTP/2.)

(Sorry for the long hiatus between updates.)

@aeplay
Contributor Author

aeplay commented Jan 6, 2025

Thanks for your continued work on this @ayewo

If you could share numerical results either here or on the PR, even if it is from a WIP stage that would be super interesting for me to get a quick idea before running it myself

Please be mindful of your time budget considering the bounty amount and feel free to stop as soon as you have insightful results. Just ping me then!

@ayewo

ayewo commented Jan 6, 2025

@aeplay thanks for finally responding.

Ha, I'm already underwater with respect to bounty amount : effort required ratio anyway ... :(

Right now, I don't have much availability to run those numbers until this or next weekend. I hope that's fine?

@aeplay
Contributor Author

aeplay commented Jan 10, 2025

/bounty $1000

@algora-pbc

algora-pbc commented Jan 10, 2025

💎 A new bounty of $1,000 has been added by aeplay!

💰 Current prize pool: $3,000

@aeplay
Contributor Author

aeplay commented Jan 10, 2025

@ayewo I hope this effectively increases the bounty to $3000

The timing is fine - I'd appreciate if you focused on getting some numbers and having the minimal extra work done to be able to reproduce them over any kind of polish, and then I'll mark it as accepted!

@ayewo

ayewo commented Jan 10, 2025

> I hope this effectively increases the bounty to $3000

@aeplay thanks for the kind gesture of upping the bounty, really appreciate it.

You will probably have to delete the "/bounty $1000" comment and then comment anew with "/bounty $3000" for the Algora bot to pick it up, though.

> The timing is fine -

OK. I might not be able to do anything this weekend because I have several things lined up already. Thanks for understanding.

> I'd appreciate if you focused on getting some numbers and having the minimal extra work done to be able to reproduce them over any kind of polish, and then I'll mark it as accepted!

Got it.

@ayewo

ayewo commented Jan 21, 2025

Hi @aeplay

Just a brief progress update:

When I adapted the playwright tests for the k6 load testing tool, I noticed that the tests became flaky, especially for the scenario involving 10 clients that subscribe to mutations. Failures seem to be random, even under light load.

So, I'm trying out an alternative load testing tool, artillery, to see if it fares better. Luckily, it has out-of-the-box support for playwright, so I'm hoping I'll be able to avoid the flakiness I'm seeing with k6.


5 participants