Support for B-frames in decode order #61

Open
fkaa opened this issue Jun 24, 2022 · 6 comments
Labels
interop interoperability problem with another (often broken) RTSP implementation

Comments

@fkaa

fkaa commented Jun 24, 2022

I have an IP camera that supports B-frames. The client example doesn't seem to support the non-monotonic timestamps it sends:

.\target\debug\examples\client.exe client mp4 test.mp4 --url "rtsp://172.25.127.123"
I20220624 13:08:23.257 main client::mp4] Using h264 video stream
I20220624 13:08:23.260 main client::mp4] No suitable audio stream found
W20220624 13:08:23.260 main retina::client] Connecting via TCP to known-broken RTSP server "GStreamer". See <https://github.com/scottlamb/retina/issues/17>. Consider using UDP instead!
3433148934 (mod-2^32: 3433148934), npt 0.000: 20801-byte video frame
I20220624 13:08:23.391 main client::mp4] new video params: VideoParameters { rfc6381_codec: "avc1.640033", pixel_dimensions: (2688, 1512), pixel_aspect_ratio: None, frame_rate: Some((540000, 27000000)), extra_data: Length: 42 (0x2a) bytes
0000:   01 64 00 33  ff e1 00 1b  67 64 00 33  ad 00 c5 30   .d.3....gd.3...0
0010:   0a 80 2f fe  59 b8 08 08  0d 28 00 20  f5 80 0c df   ../.Y....(. ....
0020:   e6 00 20 01  00 04 68 ee  3c b0                      .. ...h.<. }
3433156134 (mod-2^32: 3433156134), npt 0.080: 130-byte video frame
E20220624 13:08:23.570 main client] Fatal: Timestamp jumped -5400 (-0.060 sec) from 3433156134 to 3433150734 (mod-2^32: 3433150734), npt 0.020; policy is to allow 0..10 sec only

conn: 172.25.127.122:64589(me)->172.25.127.123:554@2022-06-24T13:08:23
stream: TCP, interleaved channel ids 0-1
ssrc: 8e3abf51
seq: 00002164
pkt: 23059@2022-06-24T13:08:23

Not sure what the warning about TCP is about
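
For reference, the numbers in the fatal error above are consistent with a 90 kHz RTP clock, the rate RFC 6184 specifies for H.264. Below is a minimal sketch of that arithmetic. The helper names `rtp_delta` and `delta_secs` are hypothetical and are not Retina's API, and the sketch works on raw 32-bit timestamps rather than the extended timestamps Retina logs.

```rust
/// RTP clock rate for H.264 video (RFC 6184 specifies 90 kHz).
const CLOCK_RATE_HZ: f64 = 90_000.0;

/// Signed difference between two 32-bit RTP timestamps.
/// Hypothetical helper: wrapping subtraction keeps the result correct
/// across a 2^32 wraparound as long as the true jump is under 2^31 ticks.
fn rtp_delta(from: u32, to: u32) -> i32 {
    to.wrapping_sub(from) as i32
}

/// The same difference expressed in seconds.
fn delta_secs(from: u32, to: u32) -> f64 {
    f64::from(rtp_delta(from, to)) / CLOCK_RATE_HZ
}
```

With the values from the log, `rtp_delta(3433156134, 3433150734)` is -5400 ticks, which is the -0.060 sec jump that the client rejects.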

@scottlamb
Owner

scottlamb commented Jun 24, 2022

> I have an IP camera that supports B-frames.

Interesting! What's the make/model? I've never seen this before. I'm sure I have .mp4 files around with B-frames, though, so hopefully I can point GStreamer at them to replicate this for testing.

> Not sure what the warning about TCP is about

Yeah that seems wrong. It's supposed to say this only for certain live555 versions.

@fkaa
Author

fkaa commented Jun 24, 2022

> Interesting! What's the make/model?

It's an AXIS Q3536-LVE, but basically any Axis camera with their latest SoC should work.

> I've never seen this before.

Same! FWIW it seems like FFmpeg struggles a bit with the stream as well

I can provide a Wireshark capture if you'd be interested, but AFAICT it's just sending video packets in decode order, with the RTP time indicating when it should be presented. It should be identical to how H.264/H.265 is stored in Matroska files.
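
To illustrate the difference between decode order and presentation order, here is a minimal sketch that takes the presentation timestamps in the order frames arrive and returns the arrival indices in the order they should be displayed. The function name `presentation_order` is hypothetical, and the sketch assumes timestamps have already been made relative to the first frame so that 32-bit wraparound can be ignored.

```rust
/// Given PTS values in decode (arrival) order, return the arrival indices
/// sorted into presentation order. Hypothetical helper for illustration;
/// assumes wraparound has already been handled by the caller.
fn presentation_order(pts_in_decode_order: &[i64]) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..pts_in_decode_order.len()).collect();
    // Stable sort by PTS: frames with equal PTS keep their arrival order.
    indices.sort_by_key(|&i| pts_in_decode_order[i]);
    indices
}
```

For the three frames in the log above, the timestamps relative to the first frame are 0, 7200, and 1800 ticks, so the third frame to arrive is the second one to be shown.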

@scottlamb scottlamb added the interop interoperability problem with another (often broken) RTSP implementation label Jun 24, 2022
@scottlamb
Owner

scottlamb commented Oct 2, 2022

Good news: I think I can reproduce this behavior with open source software rather than having to buy a camera. That will help with implementation.

  1. Prepare a .mp4 with B-frames. Big Buck Bunny doesn't use them but you can re-encode it to do so: ffmpeg -i BigBuckBunny.mp4 -codec:v libx264 -preset veryslow $HOME/bbb-reencoded.mp4
  2. Run rtsp-simple-server with the stock config in one terminal
  3. In another terminal, publish to the server: ffmpeg -re -stream_loop -1 -i bbb-reencoded.mp4 -c copy -f rtsp rtsp://localhost:8554/mystream
  4. In a third terminal, start Retina: cargo run -p client -- mp4 --url rtsp://localhost:8554/mystream out.mp4

Bad news: I'm confused about (or maybe just disappointed by) how this works:

  • When writing a .mp4 file, we have to make up the DTS somehow. (For background, see ISO/IEC 14496-12 section 8.6.1.1. There's an example in Table 2. This wiki page has links to a free download of that standard and several others.) I guess we need to basically assume DTS=PTS as we go through, but then when we see an entry that jumps backwards in PTS, we retroactively move the DTS earlier for all frames we've already seen with a later PTS so they precede this frame's DTS. At least in theory, that can require adjusting DTSs all the way back to the beginning of the file if they're dense, because two frames can't have the same DTS. (In practice, the times advance at 90 kHz, and we'd never actually support that frame rate, so maybe we can just error out if that happens.)

  • It's weird if at any point in the stream, you don't know if you've seen all the frames with PTS up to the current RTP timestamp or if more are yet to come. In particular:

    • You don't know how much buffering you should do when presenting a live view. If you guess low, you drop B frames.
    • Similarly, if you hit Ctrl-C while recording a .mp4, you might not actually have a prefix of the stream; you might be missing B-frames. That's surprising.
  • It also just seems to complicate the situation with cameras' flaky timestamps. They sometimes jump back in time due to (incorrectly) basing the RTP time on the wall clock (rather than a monotonic clock as they ought to), and stepping that back when using SNTP. Now when we see that, we have to wonder if it's supposed to jump back due to PTS/DTS differences or if it's the time jump behavior.
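
The retroactive DTS adjustment described in the first bullet can be sketched roughly as follows. This is only an illustration of that idea, not Retina's implementation and not the approach of any other library. The function name `infer_dts` is hypothetical, timestamps are assumed to be relative to the start of the file, and treating a DTS below zero as an error is an assumption standing in for the "just error out" case.

```rust
/// Assign a DTS to every frame, given PTS values in decode (arrival) order.
/// Start from DTS = PTS; when a new frame's PTS is not later than the DTS of
/// frames already seen, move those earlier DTSs back so the sequence stays
/// strictly increasing. Hypothetical sketch; negative DTS is an error here.
fn infer_dts(pts_in_decode_order: &[i64]) -> Result<Vec<i64>, String> {
    let mut dts: Vec<i64> = Vec::with_capacity(pts_in_decode_order.len());
    for &pts in pts_in_decode_order {
        dts.push(pts);
        // Walk backwards, pulling each conflicting DTS one tick before its
        // successor. DTSs only ever decrease, so DTS <= PTS always holds.
        let mut i = dts.len() - 1;
        while i > 0 && dts[i - 1] >= dts[i] {
            let pulled = dts[i] - 1;
            if pulled < 0 {
                return Err(format!("no room for a DTS before frame {}", i));
            }
            dts[i - 1] = pulled;
            i -= 1;
        }
    }
    Ok(dts)
}
```

For PTS values 0, 7200, 1800 in arrival order, this yields DTS values 0, 1799, 1800. The DTSs are valid but unevenly spaced, and earlier frames must stay editable until no later frame can still force an adjustment.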

@scottlamb
Owner

> I can provide a Wireshark capture if you'd be interested

Yes, please. I think that would be helpful, just in case the camera is doing something different from my ffmpeg scenario. E.g. maybe it conveys the intended DTS via an RTP extension or something.

I believe MPEG-TS conveys both DTS and PTS. (See spec, Table 2-21, PTS_DTS_flag.) Seems weird if H.264-over-RTP doesn't.
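
As a rough illustration of that field, the sketch below reads the two-bit PTS_DTS_flags value from the start of a PES packet. The type and function names are hypothetical, and the sketch assumes the packet carries a stream type (such as video) that includes the optional PES header. It does not parse the timestamps themselves.

```rust
/// Meaning of the 2-bit PTS_DTS_flags field in a PES packet header.
#[derive(Debug, PartialEq)]
enum PtsDtsFlags {
    Neither,   // '00': no timestamps present
    Forbidden, // '01': not allowed by the spec
    PtsOnly,   // '10': PTS only
    PtsAndDts, // '11': both PTS and DTS
}

/// Read PTS_DTS_flags from the first bytes of a PES packet.
/// Hypothetical helper; returns None if the buffer is too short, lacks the
/// 00 00 01 start code prefix, or lacks the '10' marker bits in byte 6.
fn pts_dts_flags(pes: &[u8]) -> Option<PtsDtsFlags> {
    if pes.len() < 8 || !pes.starts_with(&[0, 0, 1]) {
        return None;
    }
    if pes[6] >> 6 != 0b10 {
        return None;
    }
    // The flags are the top two bits of byte 7.
    Some(match pes[7] >> 6 {
        0b00 => PtsDtsFlags::Neither,
        0b10 => PtsDtsFlags::PtsOnly,
        0b11 => PtsDtsFlags::PtsAndDts,
        _ => PtsDtsFlags::Forbidden,
    })
}
```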

Curid added a commit to Curid/retina that referenced this issue Mar 26, 2023
@scottlamb
Owner

scottlamb commented Aug 3, 2023

> I believe MPEG-TS conveys both DTS and PTS. (See spec, Table 2-21, PTS_DTS_flag.) Seems weird if H.264-over-RTP doesn't.

I think RTP really doesn't. There's an expired draft for conveying the DTS via an RTP extension, but AFAIK no one has implemented it.

Curid's PR #81 adds support for inferring the DTS for H.264 by copying the approach used by gortsplib. gstreamer also has something for this (h264timestamper). As mentioned in PR comments, I haven't compared the two yet.

@Curid
Contributor

Curid commented Aug 4, 2023

I made a Docker image that serves a B-frame test stream.

docker run -it --network=host curid/test-stream
ffplay rtsp://127.0.0.1:8554/1
Image source
# https://nix.dev/tutorials/nixos/building-and-running-docker-images
let 
  # nixos 23.05
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/bd836ac5e5a7358dea73cb74a013ca32864ccb86.tar.gz") {};
  ffmpeg = (
    pkgs.callPackage (
      { lib, stdenv, buildPackages, removeReferencesTo, fetchgit, pkg-config, yasm }:
      stdenv.mkDerivation (
        rec {
          pname = "ffmpeg";
          version = "6.0";

          src = fetchgit {
            url = "https://git.ffmpeg.org/ffmpeg.git";
            rev = "n${version}";
            sha256 = "sha256-ZFBj9vjY39E2yJIWXzcMbOKlnA7K27sCPujbAAy5pi8=";
          };

          configurePlatforms = [];
          setOutputFlags = false; # Only accepts some of them.
          configureFlags = [
            "--disable-everything"
            "--disable-ffprobe"
            "--enable-shared"
            "--enable-protocol=file,pipe"
            "--enable-demuxer=mov"
            "--enable-decoder=h264"
            "--enable-muxer=rtsp"
            "--pkg-config=${buildPackages.pkg-config.targetPrefix}pkg-config"
            "--bindir=${placeholder "bin"}/bin"
            "--libdir=${placeholder "lib"}/lib"
            "--incdir=${placeholder "dev"}/include"
          ];

          postConfigure = let
            toStrip = lib.remove "data" outputs; # We want to keep references to the data dir.
          in
            "remove-references-to ${lib.concatStringsSep " " (map (o: "-t ${placeholder o}") toStrip)} config.h";

          nativeBuildInputs = [ removeReferencesTo pkg-config yasm ];
          buildFlags = [ "all" ];
          doCheck = false;
          outputs = [ "bin" "lib" "dev" "out" ];
          enableParallelBuilding = true;
        }
      )
    )
  {} );
in pkgs.dockerTools.buildLayeredImage {
  name = "curid/test-stream";
  tag = "latest";
  contents = [
    ffmpeg
    pkgs.mediamtx
    (pkgs.writeScriptBin "init.sh" ''
      #!${pkgs.runtimeShell}
      mediamtx &
      ffmpeg -v warning -an -re -stream_loop -1 -i video.mp4 -c copy -f rtsp \
        -rtsp_transport tcp rtsp://127.0.0.1:8554/1 #>/dev/null 2>&1 </dev/null
    '') 
    ./files
  ];
  config = {
    Cmd = [ "init.sh" ];
    Env = [
      "MTX_RTSPADDRESS=127.0.0.1:8554"
      "MTX_PATHS_1=rtsp://1"
      "MTX_PROTOCOLS=tcp"
      "MTX_RTMPDISABLE=yes"
      "MTX_HLSDISABLE=yes"
      "MTX_WEBRTCDISABLE=yes"
    ];
  };
}

The WebRTC example doesn't work for me on any branch, but it may work for you with some tweaking. The mp4 muxer in the client example would need major changes to support B-frames.

Curid added a commit to Curid/retina that referenced this issue Aug 4, 2023

3 participants