Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a new idea of a "transcoder" #205

Merged
merged 2 commits into from
Apr 12, 2022
Merged

a new idea of a "transcoder" #205

merged 2 commits into from
Apr 12, 2022

Conversation

sentriz
Copy link
Owner

@sentriz sentriz commented Mar 10, 2022

  • can encode tracks to mp3, opus, etc as usual
  • can also decode to pcm audio (to be used in jukebox situations later on)
  • guesses the Content-Length for on the fly transcoding
  • introduces a Transcoder interface. implemented by FFmpegTranscoder, CachingTranscoder (takes another Transcoder), NoneTranscoder
  • profiles are stored as strings (split by shlex) so that users could customise them in future
  • cache keys are more "exact" (not just checking profile bitrate and replay gain field changes, but the entire ffmpeg command string and filepath)

TODO

  • see how DSub and other clients react to having a Content-Length header for on the fly transcodes

@spijet thoughts on this one?

OpusRG = NewProfile("audio/ogg", 96, `ffmpeg -v 0 -i <file> -ss <seek> -map 0:a:0 -vn -b:a <bitrate> -c:a libopus -vbr on -af "volume=replaygain=track:replaygain_preamp=6dB:replaygain_noclip=0, alimiter=level=disabled, asidedata=mode=delete:type=REPLAYGAIN" -metadata replaygain_album_gain= -metadata replaygain_album_peak= -metadata replaygain_track_gain= -metadata replaygain_track_peak= -metadata r128_album_gain= -metadata r128_track_gain= -f opus -`)

PCM16le = NewProfile("audio/wav", 0, `ffmpeg -v 0 -i <file> -ss <seek> -c:a pcm_s16le -ac 2 -f s16le -`)
)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a bit more tidy without line wrapping 😎
image

@sentriz sentriz force-pushed the develop branch 3 times, most recently from 3ee7bf2 to b275899 Compare March 10, 2022 01:28
@sentriz sentriz force-pushed the develop branch 2 times, most recently from 4711434 to f49c2c9 Compare March 15, 2022 18:06
@sentriz sentriz force-pushed the develop branch 2 times, most recently from 2617db6 to f444678 Compare March 23, 2022 23:42
@spijet
Copy link
Contributor

spijet commented Mar 25, 2022

Hello @sentriz!

Sorry for slow response — a lot of stuff has been happening lately.

This looks very interesting to me, and having encoding profiles stored as strings actually reminds me of SubSonic, where the user could set up multi-stage transcoding rules (e.g. use some_command to decode some specific audio format to raw PCM and then pass it to ffmpeg to encode it to target format via pipe). This is useful for some formats that are somehow not supported by ffmpeg out of the box, like emulated retro-game audio or tracker modules.

I hope I'll be able to carve out some time and check the code in more detail this weekend. :)

@sentriz
Copy link
Owner Author

sentriz commented Mar 25, 2022

thanks! yeah it would be much appreciated if you checked it out.

also I don't think I mentioned yet, the reason for the refactoring is to share the code with jukebox. I'm working on making it take whatever file and decode to PCM. which removes the beep dependency which is giving me trouble at the moment

@sentriz sentriz force-pushed the develop branch 2 times, most recently from c7c5324 to ffd6614 Compare March 25, 2022 18:35
@sentriz sentriz force-pushed the develop branch 3 times, most recently from 477d624 to 05c07e6 Compare April 12, 2022 22:50
@sentriz sentriz merged commit 2496741 into master Apr 12, 2022
@spijet
Copy link
Contributor

spijet commented May 20, 2022

Oh fug, it's been 2 months now. D:
Looking at it right now.

Also, before I forget it again for another year — what do you think about introducing decoder profiles too? We could use similar kind of profiles to add a decoding step in transcoding chain, which would decode some sound format to PCM and then feed it into ffmpeg or to the sound output (in jukebox mode). This could be useful for some less-common music formats (e.g. tracker modules and emulated retro music).

@spijet
Copy link
Contributor

spijet commented May 20, 2022

As for Content-Length header, some clients actually use it to estimate the track length. Subsonic API even offers a query parameter for this, so the client could ask the server to (not) add this header for transcoded music.

Copy link
Contributor

@spijet spijet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a couple of questions/comments from me in case it's not too late to review the code yet. :)

r.Handle("/getCoverArt{_:(?:\\.view)?}", ctrl.HR(ctrl.ServeGetCoverArt))
r.Handle("/stream{_:(?:\\.view)?}", ctrl.HR(ctrl.ServeStream))
r.Handle("/download{_:(?:\\.view)?}", ctrl.HR(ctrl.ServeStream))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you want to serve transcoded music via download API too? IIRC, many clients use it to download source (i.e. unmodified) tracks as-is, and to pre-cache stuff they just use the stream API instead.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah very nice point. i really wasn't sure about this. but what i find annoying is that when i'm on a train streaming music, dsub will transcode the first track that i select so it stream nice and quick. but then for the tracks that it caches ahead, it uses download.view. so since it's downloading straight flacs, it just hangs

you're right that download should just serve the raw file so i'll put it back. but any ideas how to fix my issue? 😁

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe you should file a bug report to dSub developer(s)? :)
In my case, iSub (yet another SubSonic API client for iOS, this time with Opus support) preloads next track (and pre-caches tracks when I tap a "download to cache" button on a playlist) using /stream, as expected. IIRC, some Android client I used to use had different settings for Wi-Fi and 3G/LTE/etc modes, and the default setting was to download and (pre-)cache unmodified tracks in all cases.


// Store as simple strings, since we may let the user provide their own profiles soon
var (
MP3 = NewProfile("audio/mpeg", 128, `ffmpeg -v 0 -i <file> -ss <seek> -map 0:a:0 -vn -b:a <bitrate> -c:a libmp3lame -af "volume=replaygain=track:replaygain_preamp=6dB:replaygain_noclip=0, alimiter=level=disabled, asidedata=mode=delete:type=REPLAYGAIN" -metadata replaygain_album_gain= -metadata replaygain_album_peak= -metadata replaygain_track_gain= -metadata replaygain_track_peak= -metadata r128_album_gain= -metadata r128_track_gain= -f mp3 -`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I was afraid of — ultra-long ffmpeg command lines! :D

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahaha yeah it's a lot. but i'm kinda into it :D


type Profile struct {
bitrate BitRate // the default bitrate, but the user can request a different one
seek time.Duration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does seek do (or indicate) in this case? Is it like a smallest block of time the format allows to seek or something?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its passed to ffmpeg as the -ss argument which says seek to and then start encoding. only used for jukebox at the moment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so it's sorta like "skip N seconds from start" instead. Got it. :)

func cacheKey(cmd string, args []string) string {
// the cache is invalid whenever transcode command (which includes the
// absolute filepath, bit rate args, replay gain args, etc.) changes
sum := md5.New()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use xxhash here, since it seems to be faster than MD5 (I used it as a hash key for cached tracks in my original PR, IIRC, and it's still here in dependency list).

Here's an issue in go-guerrilla project with some "xxhash vs MD5" benchmarks.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i think xxhash is very cool. but in this case we're dealing with really small data, and I like to keep the 3rd party libraries miminal. on my laptop it seems to be 20µs to hash a profile and args

20220520_204159

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Well, if the xxhash module is not used anywhere else (probably not), then MD5 is a good choice. A bit vintage, but it works! :)

@spijet
Copy link
Contributor

spijet commented May 20, 2022

WTF, GitHub kept showing this PR as "Open" even after I dropped the browser cache and refreshed the page, but as soon as I submitted my review, it turned into "Merged". :D

Once again, my apologies for being so late to the party.

@sentriz
Copy link
Owner Author

sentriz commented May 20, 2022

howdy @spijet 👋

what do you think about introducing decoder profiles too? We could use similar kind of profiles to add a decoding step in transcoding chain, which would decode some sound format to PCM and then feed it into ffmpeg or to the sound output (in jukebox mode). This could be useful for some less-common music formats (e.g. tracker modules and emulated retro music).

yep that's exactly what i'm doing 👍 see the transcode.PCM16le profile, it's used in this PR
#211

As for Content-Length header, some clients actually use it to estimate the track length. Subsonic API even offers a query parameter for this, so the client could ask the server to (not) add this header for transcoded music.

woah cool I didn't know the API had a param for this, i'll check it out

@spijet
Copy link
Contributor

spijet commented May 20, 2022

I just checked the diff of a local mini-fork I'm still running at home (based on commit 4a4298a :D), and there I just set Content-Length using a transcoded file size, if it exists in cache:

		cacheFile, cfErr := os.Stat(path)
		if cfErr != nil {
			log.Printf("failed to stat cache file `%s`: %v", path, cfErr)
		} else {
			contentLength := fmt.Sprintf("%d", cacheFile.Size())
			w.Header().Set("Content-Length", contentLength)
		}
 		http.ServeFile(w, r, path)

Pretty stupid, but it works. :D

@spijet
Copy link
Contributor

spijet commented May 20, 2022

see the transcode.PCM16le profile, it's used in this PR

Will do, and will check #211 too.

(Also, it's probably time to update my Gonic installation) :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants