Handling of non-ASCII characters in ALPN #178

Open
StarlightIbuki opened this issue Oct 28, 2024 · 3 comments

@StarlightIbuki

The Rust implementation treats all non-ASCII bytes as '9', while the documentation says they are represented by the first and last hex digits of their numeric (hex) representation.
Should we update the documentation?

@john-althouse
Collaborator

The documentation describes the correct method.
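
For reference, the documented rule can be sketched roughly as follows. This is a minimal sketch, not the project's code: `ja4_alpn` is a hypothetical helper name, it works on the raw bytes of the first ALPN value, and the "00" result for a missing ALPN is an assumption based on my reading of the JA4 specification.

```rust
/// Returns the two-character ALPN component of a JA4 fingerprint,
/// computed from the raw bytes of the first ALPN value.
/// (Hypothetical helper, for illustration only.)
fn ja4_alpn(first_alpn: &[u8]) -> String {
    let (first, last) = match (first_alpn.first(), first_alpn.last()) {
        (Some(&f), Some(&l)) => (f, l),
        // Assumption: no ALPN value is reported as "00".
        _ => return "00".to_owned(),
    };
    if first.is_ascii_alphanumeric() && last.is_ascii_alphanumeric() {
        // Common case: the first and last characters are used verbatim.
        return format!("{}{}", first as char, last as char);
    }
    // Otherwise take the first and last digits of the hex encoding of the
    // whole value: the high nibble of the first byte and the low nibble of
    // the last byte.
    format!("{:x}{:x}", first >> 4, last & 0x0f)
}
```

Under this rule `h2` stays `h2`, `http/1.1` becomes `h1`, and a two-byte value `ab cd` gives `ad` rather than `99`.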

@vvv
Collaborator

vvv commented Jan 20, 2025

The problem is reproducible with this capture file: badcurveball-alpn-not-isalnum.pcap.gz

❯ diff -u <(xxd pcap/badcurveball.pcap) <(xxd pcap/badcurveball-alpn-not-isalnum.pcap)
--- /dev/fd/11  2025-01-20 19:09:30.086699004 +0200
+++ /dev/fd/12  2025-01-20 19:09:30.086730588 +0200
@@ -33,7 +33,7 @@
 00000200: 6c6c 7465 7374 2e63 6f6d 0017 0000 ff01  lltest.com......
 00000210: 0001 0000 0a00 0a00 08ca ca00 1d00 1700  ................
 00000220: 1800 0b00 0201 0000 2300 0000 1000 0e00  ........#.......
-00000230: 0c02 6832 0868 7474 702f 312e 3100 0500  ..h2.http/1.1...
+00000230: 0c02 abcd 0868 7474 702f 312e 3100 0500  .....http/1.1...
 00000240: 0501 0000 0000 000d 0014 0012 0403 0804  ................
 00000250: 0401 0503 0805 0501 0806 0601 0201 0012  ................
 00000260: 0000 0033 002b 0029 caca 0001 0000 1d00  ...3.+.)........
  • Expected ALPN component of the JA4 fingerprint: ad (first hex digit of 0xab, last hex digit of 0xcd)
  • Actual value: 99
❯ diff -u <(ja4 -- pcap/badcurveball.pcap) <(ja4 -- pcap/badcurveball-alpn-not-isalnum.pcap)
--- /dev/fd/11  2025-01-20 19:12:17.155384616 +0200
+++ /dev/fd/12  2025-01-20 19:12:17.156051324 +0200
@@ -5,7 +5,7 @@
   src_port: 55318
   dst_port: 443
   tls_server_name: bad.curveballtest.com
-  ja4: t13d1615h2_46e7e9700bed_45f260be83e2
+  ja4: t13d161599_46e7e9700bed_45f260be83e2
   ja4s: t1205h1_c02b_845f7282a956
   tls_certs:
   - x509:

@vvv
Collaborator

vvv commented Jan 20, 2025

Analysis

The ja4 Rust app uses rtshark v2.6.0:

rtshark = "=2.6.0" # CAUTION: rtshark >= 2.7.0 breaks JA4 (TLS client) and JA4L-C/S fingerprints

rtshark::Metadata in that version of the library does not expose the original (raw) bytes:

❯ cargo run -q --manifest-path rust/Cargo.toml --bin ja4 -- pcap/badcurveball-alpn-not-isalnum.pcap >/dev/null
[ja4/src/tls.rs:189:13] md = Metadata {
    name: "tls.handshake.extensions_alpn_str",
    value: "��",
    display: "ALPN Next Protocol: ��",
    size: 2,
    position: 256,
}
[ja4/src/tls.rs:189:13] md.value() = "��"
[ja4/src/tls.rs:189:13] hex::encode(md.value()) = "efbfbdefbfbd"
[ja4/src/tls.rs:189:13] md.display() = "ALPN Next Protocol: ��"
[ja4/src/tls.rs:189:13] hex::encode(md.display()) = "414c504e204e6578742050726f746f636f6c3a20efbfbdefbfbd"

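As you can see, the output does not include the original bytes abcd: each of the two bytes has been replaced by U+FFFD (efbfbd in UTF-8), so the raw ALPN value cannot be recovered from value or display.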
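
The loss of information can be reproduced without tshark. The sketch below is only an illustration of the effect, assuming the substitution behaves like Rust's `String::from_utf8_lossy`; I'm not claiming that is where the replacement actually happens in the tshark/rtshark pipeline. `to_hex` and `lossy_roundtrip_hex` are hypothetical helper names, and `to_hex` stands in for `hex::encode` to keep the example dependency-free.

```rust
/// Hex-encodes a byte slice (stand-in for `hex::encode`).
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Decodes `raw` as UTF-8, substituting U+FFFD for every invalid sequence,
/// then returns the hex encoding of the resulting string's bytes.
fn lossy_roundtrip_hex(raw: &[u8]) -> String {
    let text = String::from_utf8_lossy(raw);
    to_hex(text.as_bytes())
}
```

For the bytes `ab cd`, `lossy_roundtrip_hex` returns `efbfbdefbfbd`, the same hex as in the debug output above, while a valid value such as `h2` round-trips unchanged as `6832`.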

Patch used to produce the debug output above:
diff --git a/rust/ja4/src/tls.rs b/rust/ja4/src/tls.rs
index 3e953e1..c115990 100644
--- a/rust/ja4/src/tls.rs
+++ b/rust/ja4/src/tls.rs
@@ -184,6 +184,17 @@ impl ClientStats {
             .first("tls.handshake.extensions_server_name")
             .ok()
             .map(str::to_owned);
+        //* XXX <<<<<<<
+        if let Ok(md) = tls.find("tls.handshake.extensions_alpn_str") {
+            dbg!(
+                md,
+                md.value(),
+                hex::encode(md.value()),
+                md.display(),
+                hex::encode(md.display())
+            );
+        }
+        // XXX >>>>>>> */
         let alpn = tls
             .first("tls.handshake.extensions_alpn_str")
             .map_or((None, None), first_last);

rtshark::Metadata in the latest rtshark (v3.1.0) does contain the raw bytes and provides the rtshark::Metadata::raw_value method to access them:

        Metadata {
            name: "tls.handshake.extensions_alpn_str",
            value: "��",
            raw_value: Some(
                "abcd",
            ),
            display: Some(
                "ALPN Next Protocol: ��",
            ),
            size: Some(
                2,
            ),
            position: Some(
                256,
            ),
        },
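
Given a hex string like the `raw_value` shown above, the ALPN component could be derived along these lines. This is a hedged sketch: it takes the hex string as a plain `&str` and does not call rtshark at all, since I haven't checked the exact signature of `raw_value`; `decode_hex` and `alpn_from_raw_hex` are hypothetical names.

```rust
/// Decodes a hex string such as "abcd" into bytes; None on malformed input.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Computes the two-character ALPN component from the hex-encoded raw value.
/// Returns None if the input is empty or not valid hex.
fn alpn_from_raw_hex(raw_hex: &str) -> Option<String> {
    let bytes = decode_hex(raw_hex)?;
    let (first, last) = (*bytes.first()?, *bytes.last()?);
    Some(if first.is_ascii_alphanumeric() && last.is_ascii_alphanumeric() {
        // Alphanumeric ends: use the characters themselves.
        format!("{}{}", first as char, last as char)
    } else {
        // Otherwise: first and last digits of the hex representation.
        let hex = raw_hex.to_ascii_lowercase();
        format!("{}{}", &hex[..1], &hex[hex.len() - 1..])
    })
}
```

With the raw value `abcd` this yields `ad`, and with `6832` (the bytes of `h2`) it yields `h2`.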

Solution

  • Upgrade to the latest rtshark library (v3.1.0).
  • Fix compilation errors.
  • Fix tests.

Caveats

Upgrading rtshark is the right thing to do, but it's not trivial.

  • Fixing compilation errors is a straightforward task.
  • I tried to upgrade rtshark to v2.7.0 once, and that broke JA4+ fingerprints. AFAIR, v2.7.0 of rtshark dropped some data required for JA4 fingerprinting from the tshark output. If we are lucky, that data will be present in the latest version of rtshark (v3.1.0). If not, we'll have to fork rtshark and modify it so that all the data needed for JA4+ fingerprinting is exposed.
