-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decoding of the first frame #53
Comments
Maybe I mean granule. I don't really know. |
mp3 have concept of bit-reservoir, i.e. frame can use unused tail bits (because encoder rate control fails to use all bits) for next frame. If first frame become unavailable, decoder can't decode second frame (if bit-reservoir used), but if it's still passed to decoder, bit-reservoir become full and decoder can decode 3-rd frame. The number of frames before should have 511 payload bytes in the worst case to completely fill the reservoir. |
That leaves more questions. |
This is standard first frame lame creates to hide metadata. This frame almost empty, lame creates Info/Xing block in unused space (not used for bit reservoir too) so other applications can find it. Here where lame creates it https://github.com/gypified/libmp3lame/blob/master/libmp3lame/VbrTag.c#L900 About Huffman decode you may want to check: https://github.com/lieff/minimp3/blob/master/minimp3.h#L773 PS: If you interested in python implementation take a look at https://bitbucket.org/portalfire/pymp3 |
I was more interested in Line 766 in e9df076
I'm not going to re-implement all that stuff. Maybe I'll edit your code to make it more readable. Currently I have one more question. Imagine I don't have a MP3 file, but I wrote an utility that generates data that looks like a frame. Synthesizing a frame, I mean. Can I decode that frame alone, or do I need the previous frame? When I use the short mode, do I need previous granules to decode the next granule, or I can decode the granule alone? L3_read_side_info seems useful for me. How do I call it on the frame? |
Can you put comments? |
ireg it's region counter, there up to 3 regions can have own Huffman table, decoding starts from gr_info->table_select[0] table (it's not necessarily zero). Even if frame (have 1 or 2 granules) do not use bit-reservoir, first frame alone will be not in good shape because of filterbank must be feeded, 2nd frame+ will be in good enough shape. Side info encoded right after header and CRC: https://github.com/lieff/minimp3/blob/master/minimp3.h#L1699 You are right, this code is not optimized for reading but for small size (both code and binary) and performance and I'm not going to change that. Note that canonical mpg123 have even more optimized Huffman: https://github.com/georgi/mpg123/blob/master/src/libmpg123/layer3.c#L706 If you are looking for most understandable and commented code, I think it's reference code "MPEG-2 Audio Simulation Software Distribution 10". |
Why the code can't be like that? Just cast a pointer of a frame to (mp3_header*) and get all the fancy stuff. Also, readability can be mixed with speed. Also I don't see a point in making headers smaller. |
I'm not against using bitfields, but it must be tested with at least following compilers: Code size must be roughly same or less, and profiling must not show slowdown. Usually compiler do same |
There's no problems with endian if you use char. |
That's why minimp3 reads input by chars in get_bits. It's not very optimal, usually optimized decoders uses big machine word load and byte swap instruction if available for architecture. |
Did you say "But there's a problem when you map a structure mp3_header sizeof(mp3_header)>sizeof(char) on input stream directly, so, you must properly handle that problem."? |
This construct can have problems, I've experienced:
Modern compilers should be good, but list above should be checked. |
Bit fields are implementation-defined. If your compiler wants, it may choose a completely different bit packing layout than what you expect. Please don't make minimp3 rely on such implementation-defined behaviour, it's bound to fail on at least some obscure platforms where the current code works just fine. |
Yeah, I'm completely forgot that it's also implementation defined
RIP bit fields for minimp3 use-case then. |
The mess in the code leads to bugs. Line 1626 in e9df076
should be if (i + k + hdr_padding(mp3) + nextfb + HDR_SIZE > mp3_bytes ) break; if(!hdr_compare(mp3, mp3 + k + nextfb)) continue; EDIT: Oops. I meant if (i + fb + hdr_padding(mp3) + nextfb + HDR_SIZE > mp3_bytes ) break; Which does mean if (i + k + nextfb + HDR_SIZE > mp3_bytes ) break; if(!hdr_compare(mp3, mp3 + k + nextfb)) continue; |
|
There already more strict check 100% RIP it's not because of alignment, but because of implementation defined bits location, even in uchar. |
Next match? mp3_bytes seems to be filesize minus offset in bytes. So if i + k + hdr_padding(mp3) + nextfb + HDR_SIZE > mp3_bytes, it's EOF. There's nothing after EOF. No next match. |
If you understand "implementation defined bits location" so well, could you please explain what deep meaning does this phrase have? |
It means that the bits can be packed in any way that the compiler sees fit. Just because you tested for example three different compilers on x86 (or whatever) that behaved the same doesn't mean that on another platform, they will behave identically. In particular this part of the quote from the standard is important:
On some non-x86 platforms it can be more optimal to pack the fields in reverse order (top to bottom rather than bottom to top). The compiler is free to do so. "I tested several compilers" is not a proof for the lack of a compiler that doesn't do something different. |
Yes, next match is possible, because at each k position there padding possibilities 0, 1 and 4, next match can have different padding. |
So any buffer that has equal or greater size than one bitfield can hold that bitfield. That doesn't mean anything because we have one great bitfield and it's first. C doesn't align first elements.
That means adjacent bitfields are one big bitfield, and they keep the same order.
If? Well. I said in GCC and Clang an element is only put in the next unit if it does not specify how much bits does it take, and thus, does not belong to a bitfield. But "implementation-defined". So, if a bitfield already contains 7 bits, we can't know if the next 2 bits will be placed immediately or after 1 bit. But we can! assert(sizeof(typename)==x) is our friend.
Within. RIP. But not for me, because I'm not going to be cross-platform. |
How about:
|
I was wrong with padding. But
That is a memory error. It could seek after the allocated memory. |
Also note that there no point to optimize this loop when iterations count is low. But imagine this loop goes 1000-2000 iterations (till MAX_FREE_FORMAT_FRAME_SIZE). This case is good optimization candidate.
i.e. make one branch instead of two (and make something with buffer overread), which can be measurable improvement in busy loop at least on x86/x64. PS: There no overread in initial code, C standard guarantees that |
So, if the loop reaches EOF it simply repeats before it reaches MAX_FREE_FORMAT_FRAME_SIZE. The part "i + 2*k < mp3_bytes - HDR_SIZE" guarantees that it stops after continue. Line 1620 in e9df076
is not executed. It skips directly to the Line 1633 in e9df076
There's no memory error but using break could save 3 branches after the EOF. The count of the branches while the loop is the same. The only thing - fusing it would make the code shorter with (almost) no affect on performance, because my version is faster only when the loop reaches EOF otherwise it's the same speed. |
The rule - if only one of many is evaluated, there's branching. Otherwise there's not. |
Your code.
|
Your mp3d_find_frame is 1366 bytes, my OO mp3frame_getNext is 1508. |
Right. It ignores CRC, and that is obvious, but it does nothing with bs_frame. Am I correct? |
Nope, it updates bs_frame->pos field. |
"if ((bs->pos += n) > bs->limit)" |
This equivalent to
|
I do not update len alone. Pythonic way is to increment the pointer and decrement the length. |
That can be useful to detect bitfield order. Credits:
So, I have to be careful that no bitfields are on the edge. 8, 3+2+2+1, 4+2+1+1,2+2+1+1+2, you got the idea. |
It's polymorphism.
|
Yeah, I know that trick and use same idea detect float point structure: https://github.com/lieff/lvg/blob/master/swf/avm1.c#L100 |
https://pastebin.com/Dwf3c1gg |
Um. Didn't know zero-length bitfiels are align commands and cannot have names. |
And even a void takes 1 byte. RIP? |
There no guarantee that this should be at end of structure for LE. Also note that sizeof(uint) can be != 4 for some architectures, especially micro-controllers + unit mentioned in standard can be anything. And also there more endian variants possible https://github.com/lieff/lvg/blob/master/swf/avm1.c#L150 |
Everything's okay there. Again, I wrote a Python script that checks the alignment for all probabilities of ifdefs. |
I think bit-fields still can be used in such low-level code, but when you create and use bit-fields objects for internal structures and do not try assume it's structure for read/write in files etc. It's just not worth it. |
My solution is std::false_type but it's only C++. |
Maybe Nim? |
Still not implemented but looks promising. |
I'm aware about Nim, but it's more like another scripting alternative for lieff/lvg#1 |
https://pastebin.com/61LTNf8G |
Line 548 in e9df076
doesn't look right because of Line 582 in e9df076
Oh. Silly me. 0x0f0f is 16 bits, everything's fine. Btw, what is contained in 4 bits that's zeroed? Is it correct to zero'em in mp3 file? Value of region_count[2] in "tight" case is rather 0, but I'm unsure. |
This is tricky part to simplify scale factors reading. Only long types 0,1,3 uses scfsi bits and this bits must be cleared because garbage can be here which ignored by reference decoder, but simplified L3_read_scalefactors looks at them first. |
Again - is it correct to zero'em in mp3 file? |
I've optimized your parse_ext even further. Edit: nothing without a silly bug. It worked just because there was a dot. I didn't count the shift.
|
That loop Line 414 in e9df076
is redundant. Edit: no. I said this because I added that before to make everything initialized. But, should it really be after reading scalefactors? |
Yes, scfsi bits can be zero in mp3 file. |
When a frame following the first frame is decoded it uses the previous frame. How is the first frame decoded, and what is used instead of the previous frame?
The text was updated successfully, but these errors were encountered: