
Understanding The Sample Tables: An Example

essential61 edited this page Dec 29, 2019 · 21 revisions

This discussion applies to the case where the mp4 file is self-contained, and has an 'mdat' box containing the media data and a 'moov' box containing the metadata that references the media data.

./images/iso_file.png

(diagram from ISO/IEC 14496-12 – MPEG-4 Part 12)

Annex A of ISO/IEC 14496-12 – MPEG-4 Part 12 describes how the media data is laid out within the file as an interleaved set of samples, and how the sample table box container (stbl) holds a set of tables that are used to identify the position of individual samples within the file.

The standard is well written, but a worked example makes the relationship between the different tables (stco, stsz, stsc, etc.) much easier to understand, which is why I've written this page.

First, some definitions from the standard:

  • chunk: contiguous set of samples for one track
  • sample: all the data associated with a single timestamp

It is quite possible (and reasonably common) for a chunk to contain only one sample, but a chunk usually seems to contain n samples, where n is a single- or double-digit number.

The data for the example comes from a real file that has two tracks (track 1 is an audio track, track 2 is a video track) and an mdat box that starts 121915 bytes from the start of the file.

Each track has an stbl container and hence its own set of sample tables.

stco: chunk offset box

Gives the byte offset of each chunk, measured from the start of the file.

For track 1 the start of the table looks like this:

Has header:
{"size": 1312, "type": "stco"}

Has values:
{
    "version": 0,
    "flags": "0x000000",
    "entry_count": 324,
    "entry_list": [
        { "chunk_offset": 121923 } ,
        { "chunk_offset": 897412 } ,
        { "chunk_offset": 1170432 } ,
        { "chunk_offset": 1426814 } ,

and for track 2 the start of the table looks like this:

Has header:
{"size": 1224, "type": "stco"}

Has values:
{
    "version": 0,
    "flags": "0x000000",
    "entry_count": 302,
    "entry_list": [
        { "chunk_offset": 130635 },
        { "chunk_offset": 904603 },
        { "chunk_offset": 1177851 },
        { "chunk_offset": 1434346 },

So the first chunk of track 1 (byte offset 121923) starts immediately after the 8 bytes of the mdat header (121915 + 8 = 121923). This is not always the case; sometimes there appears to be some kind of preamble at the start of the mdat before the referenced chunks. The first chunk of track 1 is followed by the first chunk of track 2, which in turn is followed by the second chunk of track 1, and so on.
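The stco layout is simple enough to decode by hand. Here is a minimal sketch, assuming the 32-bit stco variant (not co64) and a buffer that holds the whole box including its 8-byte header. The function name `parse_stco` is my own, and the bytes are rebuilt from the first four track 1 offsets listed above rather than read from the real file.

```python
import struct
from typing import List

def parse_stco(box: bytes) -> List[int]:
    """Decode a complete 'stco' box (header included) into chunk offsets."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"stco":
        raise ValueError(f"not an stco box: {box_type!r}")
    # FullBox fields: 1 byte version + 3 bytes flags, read as one uint32.
    version_flags, entry_count = struct.unpack_from(">II", box, 8)
    if version_flags >> 24 != 0:
        raise ValueError("unsupported stco version")
    # Each entry is a 32-bit big-endian offset from the start of the file.
    return list(struct.unpack_from(f">{entry_count}I", box, 16))

# Rebuild a four-entry box from the first track 1 offsets (illustration only).
offsets = [121923, 897412, 1170432, 1426814]
payload = struct.pack(">II", 0, len(offsets)) + struct.pack(">4I", *offsets)
box = struct.pack(">I4s", 8 + len(payload), b"stco") + payload

MDAT_START = 121915    # position of the mdat box in the example file
MDAT_HEADER_SIZE = 8   # 4-byte size + 4-byte type

print(parse_stco(box))  # [121923, 897412, 1170432, 1426814]
print(parse_stco(box)[0] == MDAT_START + MDAT_HEADER_SIZE)  # True
```

The same arithmetic explains the header sizes shown above: 16 bytes of fixed fields plus 4 bytes per entry gives 16 + (4 × 324) = 1312 for track 1 and 16 + (4 × 302) = 1224 for track 2.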

stsc: sample to chunk box

This table is used to work out how many samples a given chunk contains.

For track 1 the table looks like this:

Has header:
{"size": 52, "type": "stsc"}


Has values:
{
    "version": 0,
    "flags": "0x000000",
    "entry_count": 3,
    "entry_list": [
        { "first_chunk": 1,
            "samples_per_chunk": 12,
            "samples_description_index": 1 } ,
        { "first_chunk": 2,
            "samples_per_chunk": 11,
            "samples_description_index": 1 } ,
        { "first_chunk": 324,
            "samples_per_chunk": 5,
            "samples_description_index": 1 } 
    ]
}

and for track 2 the start of the table looks like this:

Has header:
{"size": 1180, "type": "stsc"}

Has values:
{
    "version": 0,
    "flags": "0x000000",
    "entry_count": 97,
    "entry_list": [
        { "first_chunk": 1,
            "samples_per_chunk": 31,
            "samples_description_index": 1 },
        { "first_chunk": 2,
            "samples_per_chunk": 30,
            "samples_description_index": 1 },
        { "first_chunk": 17,
            "samples_per_chunk": 29,
            "samples_description_index": 1 },
        { "first_chunk": 18,
            "samples_per_chunk": 28,
            "samples_description_index": 1 },

Each entry applies from its first_chunk up to (but not including) the first_chunk of the next entry. From this we can see that track 1/chunk 1 contains 12 contiguous samples, chunks 2 through 323 contain 11 samples each, and chunk 324 (the last chunk, since the track 1 stco table lists 324 entries) contains 5 samples. As a check, 12 + (322 × 11) + 5 = 3559, which matches the sample_count in the track 1 stsz table.

For track 2, chunk 1 contains 31 contiguous samples, chunks 2 through 16 contain 30 samples each, chunk 17 contains 29 samples, chunk 18 contains 28 samples, and so on.
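Because stsc is run-length encoded, it can help to expand it into one sample count per chunk. The following is a rough sketch of that expansion using the track 1 values above. The function name `samples_per_chunk` and the tuple representation are my own choices for illustration, and the chunk count is taken from the track 1 stco entry_count.

```python
from typing import List, Tuple

def samples_per_chunk(stsc: List[Tuple[int, int]], chunk_count: int) -> List[int]:
    """Expand run-length stsc entries into one sample count per chunk.

    stsc holds (first_chunk, samples_per_chunk) pairs; first_chunk is 1-based.
    chunk_count is the number of chunks in the track (stco entry_count).
    """
    counts: List[int] = []
    for i, (first_chunk, n_samples) in enumerate(stsc):
        # A run ends just before the next entry's first_chunk,
        # or at the last chunk of the track for the final entry.
        next_first = stsc[i + 1][0] if i + 1 < len(stsc) else chunk_count + 1
        counts.extend([n_samples] * (next_first - first_chunk))
    return counts

track1_stsc = [(1, 12), (2, 11), (324, 5)]
track1_counts = samples_per_chunk(track1_stsc, chunk_count=324)

print(track1_counts[:3], track1_counts[-1])  # [12, 11, 11] 5
print(sum(track1_counts))                    # 3559
```

The total of 3559 agrees with the sample_count reported in the track 1 stsz table, which is a useful sanity check when reading an unfamiliar file.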

stsz: sample size box

This table gives the size in bytes of each individual sample within a given track. A sample_size of 0 means the samples differ in size, so each one is listed individually in entry_list.

For track 1 the start of the table looks like this:

Has header:
{"size": 14256, "type": "stsz"}

Has values:
{
    "version": 0,
    "flags": "0x000000",            
    "sample_size": 0,
    "sample_count": 3559,
    "entry_list": [
        { "entry_size": 682 } ,
        { "entry_size": 683 } ,
        { "entry_size": 682 } ,
        { "entry_size": 683 } ,
        { "entry_size": 683 } ,
        { "entry_size": 682 } ,
        { "entry_size": 683 } ,
        { "entry_size": 886 } ,
        { "entry_size": 862 } ,
        { "entry_size": 786 } ,
        { "entry_size": 702 } ,
        { "entry_size": 698 } ,
        { "entry_size": 684 } ,
        { "entry_size": 713 } ,
        { "entry_size": 661 } ,
        { "entry_size": 638 } ,
        { "entry_size": 653 } ,
        { "entry_size": 640 } ,
        { "entry_size": 687 } ,
        { "entry_size": 633 } ,
        { "entry_size": 614 } ,
        { "entry_size": 619 } ,
        { "entry_size": 649 } ,
        { "entry_size": 663 } ,
        { "entry_size": 696 } ,
        { "entry_size": 675 } ,
        { "entry_size": 664 } ,
        { "entry_size": 654 } ,
        { "entry_size": 647 } ,
        { "entry_size": 651 } ,
        { "entry_size": 690 } ,
        { "entry_size": 739 } ,
        { "entry_size": 686 } ,
        { "entry_size": 654 } ,
        { "entry_size": 664 } ,
        { "entry_size": 677 } ,
        { "entry_size": 684 } ,
        { "entry_size": 686 } ,
        { "entry_size": 730 } ,
        { "entry_size": 665 } ,
        { "entry_size": 665 } ,
        { "entry_size": 655 } ,
        { "entry_size": 694 } ,
        { "entry_size": 697 } ,
        { "entry_size": 715 } ,
        { "entry_size": 669 } ,
        { "entry_size": 667 } ,

and for track 2 the start of the table looks like this:

Has header:
{"size": 34160, "type": "stsz"}

Has values:
{
    "version": 0,
    "flags": "0x000000",
    "sample_size": 0,
    "sample_count": 8535,
    "entry_list": [
        { "entry_size": 532641 },
        { "entry_size": 53341 },
        { "entry_size": 14903 },
        { "entry_size": 6930 },
        { "entry_size": 1965 },
        { "entry_size": 384 },
        { "entry_size": 405 },
        { "entry_size": 2383 },
        { "entry_size": 433 },
        { "entry_size": 522 },
        { "entry_size": 9499 },
        { "entry_size": 2297 },
        { "entry_size": 415 },
        { "entry_size": 466 },
        { "entry_size": 3349 },
        { "entry_size": 506 },
        { "entry_size": 557 },
        { "entry_size": 86826 },
        { "entry_size": 18558 },
        { "entry_size": 8331 },
        { "entry_size": 2051 },
        { "entry_size": 420 },
        { "entry_size": 515 },
        { "entry_size": 2484 },
        { "entry_size": 403 },
        { "entry_size": 423 },
        { "entry_size": 9043 },
        { "entry_size": 2859 },
        { "entry_size": 368 },
        { "entry_size": 429 },
        { "entry_size": 3071 },
        { "entry_size": 360 },
        { "entry_size": 443 },
        { "entry_size": 83602 },
        { "entry_size": 17848 },
        { "entry_size": 8262 },
        { "entry_size": 1783 },
        { "entry_size": 405 },
        { "entry_size": 393 },
        { "entry_size": 2476 },
        { "entry_size": 435 },
        { "entry_size": 453 },
        { "entry_size": 8338 },
        { "entry_size": 2303 },
        { "entry_size": 423 },
        { "entry_size": 384 },
        { "entry_size": 2728 },
        { "entry_size": 403 },
        { "entry_size": 437 },
        { "entry_size": 88092 },
        { "entry_size": 18975 },
        { "entry_size": 9010 },
        { "entry_size": 2161 },
        { "entry_size": 368 },
        { "entry_size": 451 },
        { "entry_size": 2766 },
        { "entry_size": 480 },
        { "entry_size": 425 },
        { "entry_size": 8764 },
        { "entry_size": 2417 },
        { "entry_size": 444 },
        { "entry_size": 602 },
        { "entry_size": 2245 },
        { "entry_size": 512 },
        { "entry_size": 417 },
        { "entry_size": 78171 },
        { "entry_size": 17933 },
        { "entry_size": 7793 },
        { "entry_size": 1995 },
        { "entry_size": 371 },
        { "entry_size": 413 },
        { "entry_size": 2758 },
        { "entry_size": 431 },
        { "entry_size": 414 },
        { "entry_size": 8474 },
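
Putting the three tables together gives the file position of every sample: start at the chunk offset from stco, take the number of samples in that chunk from stsc, and step forward by each sample's size from stsz. The sketch below does this for the first chunks of both tracks, using only values from the tables above. The function name `sample_offsets` is my own, and the per-chunk sample counts are written out by hand rather than expanded from stsc.

```python
from typing import List

def sample_offsets(chunk_offsets: List[int],
                   chunk_sample_counts: List[int],
                   sample_sizes: List[int]) -> List[int]:
    """Return the file offset of each sample covered by the given chunks."""
    offsets: List[int] = []
    sample_index = 0
    for chunk_offset, n_samples in zip(chunk_offsets, chunk_sample_counts):
        position = chunk_offset
        for size in sample_sizes[sample_index:sample_index + n_samples]:
            offsets.append(position)
            position += size  # samples within a chunk are stored back to back
        sample_index += n_samples
    return offsets

# Track 1 (audio): chunks 1 and 2 hold 12 and 11 samples.
t1_sizes = [682, 683, 682, 683, 683, 682, 683, 886, 862, 786, 702, 698,
            684, 713, 661, 638, 653, 640, 687, 633, 614, 619, 649]
t1_offsets = sample_offsets([121923, 897412], [12, 11], t1_sizes)

# Track 2 (video): chunk 1 holds 31 samples.
t2_sizes = [532641, 53341, 14903, 6930, 1965, 384, 405, 2383, 433, 522,
            9499, 2297, 415, 466, 3349, 506, 557, 86826, 18558, 8331,
            2051, 420, 515, 2484, 403, 423, 9043, 2859, 368, 429, 3071]
t2_offsets = sample_offsets([130635], [31], t2_sizes)

# The end of each chunk should line up with the start of the next one.
print(t1_offsets[11] + t1_sizes[11])  # 130635 = track 2 / chunk 1 offset
print(t2_offsets[30] + t2_sizes[30])  # 897412 = track 1 / chunk 2 offset
print(t1_offsets[22] + t1_sizes[22])  # 904603 = track 2 / chunk 2 offset
```

Each chunk ends exactly where the next chunk of the other track begins, which confirms the interleaved layout described earlier for this particular file.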