Simplify Draft 'pymd' Box #100

dukesook · 2024-06-03T17:17:34Z

Concerning the Image Pyramid Entity Group Box 'pymd' in WG03N1157_23524 HEIF Ed3 Amd2 Prelim WD:

The following four variables seem redundant and could be removed to greatly simplify the 'pymd' box:

tile_size_x
tile_size_y
tiles_in_layer_row_minus1
tiles_in_layer_column_minus1

pymd	ImageGrid
tiles_in_layer_row_minus1	rows_minus_one
tiles_in_layer_column_minus1	columns_minus_one
tile_size_x	(output_width / columns_minus_one)
tile_size_y	(output_height / rows_minus_one)

If the gridding is handled within the codec, then the codec specific boxes would be used instead. For example, the uncompressed codec would use the uncC box:

pymd	uncC
tiles_in_layer_row_minus1	num_tile_rows_minus_one
tiles_in_layer_column_minus1	num_tile_cols_minus_one
tile_size_x	(ispe image_width / num_tile_cols_minus_one)
tile_size_y	(ispe image_height / num_tile_rows_minus_one)

I imagine these variables were originally placed here for convenience. However, there are already well defined mechanisms for accessing them. In my opinion, duplicating them in the pymd box adds complexity for very little gain.

farindk · 2024-07-18T18:25:50Z

We could even leave away the layer_binning entry. It appears to be equivalent to ceil(base width/overview width) and it is thus redundant. That would eliminate all custom data from the pymd box, leaving just the list of pyramid image item IDs, sorted according to increasing resolution.

farindk · 2024-07-23T13:55:06Z

Another reason for leaving away layer_binning is that it restricts us to ~~power-of-two~~ integer scaling factors only, while I think it would be perfectly fine to also have finer resolution granularity, like a factor $\sqrt 2$ per layer. For example, layers with image widths 1000, 707, 500.

farindk · 2024-08-04T22:53:28Z

Is there any important reason for the restriction that every pyramid level should have the same tile size? I think this is overly restrictive. There are good reasons why tile sizes would vary for different layers. For example, the base layer might have the original uncompressed data, while overview layers use h265 with a tile size dictated by the encoder hardware). Or building the lower resolutions is much easier if this can be done for each base-level tile separately.

If it is important for an application to have the same tile size in each level, the encoder is free to do so even without enforcing this, and having similar tile sizes could be signaled as a flag (but it's also obvious when looking at the individual pyramid levels).

PileofBones86 · 2024-09-27T20:15:31Z

FYI, the layer_binning isn't restricted to power-of-two scaling. It is restricted to integer factors. For instance, the examples in the document show 2x2 binning as a layer_binning of 2 and 4x4 binning as a layer_binning of 4. One can also choose to implement a 3x3 binning as a layer_binning of 3, etc.

farindk · 2024-09-27T20:25:09Z

the layer_binning isn't restricted to power-of-two scaling

Right. This was a semantic typo. I mean integer scale factor. I've corrected it above.

fmaze · 2024-11-06T08:32:49Z

I imagine these variables were originally placed here for convenience

Indeed, the purpose was that the reader has all the information it needs in a single place. Depending on which kinds of overviews are composing the pyramid, the information may be more or less complicated to gather (grid, tiled coded image with constrained extents, uncompressed image). For instance, for a grid image the example above is not totally correct due to possible implicit cropping, To be accurate you would have to get the width/height of the input images.

I agree it is worth simplifying in some cases, what about using a flag so that this information would be conditionally present or not?

farindk · 2024-11-06T10:52:28Z

A conditional flag is one option, but I think that the "difficulty" to get the information is no strong argument. Any decoder that is able to decode the images obviously also knows how the get the image sizes. And from there, all parameters can be trivially computed. Furthermore, it is not so clear why anyone other than the decoder would be interested in the number of tiles in each layer.

I can see two use-cases:

a multiplexing/demultiplexing application might want to know the size of each layer without being able to decode it. For example, a network server might want to know which resolution layer to stream. This could be solved by making the ispe property mandatory when an image is used in a pymd. The ispe is usually present anyway so that this will not be a big change and double (and thus possibly inconsistent) information is avoided.
it might be useful information whether each resolution layer uses the same tile size. This would be a good case for your conditional flag. If the flag is set, the pymd is restricted to use the same tile size in every layer and the tile size is indicated in the pymd header.

Thus, the specification would be:

ispe is mandatory for each layer image
the header is defined like this:

aligned(8) class ImagePyramidEntityGroup
extends EntityToGroupBox ('pymd', version = 0, flags) {
        if (flags & 1) {
  	  unsigned int(16) tile_size_x;
	  unsigned int(16) tile_size_y;
        }
}

cconcolato · 2024-11-07T14:17:38Z

AMD1 of HEIF has been revised. We invite experts to have a new look at the text.

cconcolato added the HEIF label Jun 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify Draft 'pymd' Box #100

Simplify Draft 'pymd' Box #100

dukesook commented Jun 3, 2024

farindk commented Jul 18, 2024

farindk commented Jul 23, 2024 •

edited

Loading

farindk commented Aug 4, 2024

PileofBones86 commented Sep 27, 2024

farindk commented Sep 27, 2024

fmaze commented Nov 6, 2024

farindk commented Nov 6, 2024

cconcolato commented Nov 7, 2024

Simplify Draft 'pymd' Box #100

Simplify Draft 'pymd' Box #100

Comments

dukesook commented Jun 3, 2024

farindk commented Jul 18, 2024

farindk commented Jul 23, 2024 • edited Loading

farindk commented Aug 4, 2024

PileofBones86 commented Sep 27, 2024

farindk commented Sep 27, 2024

fmaze commented Nov 6, 2024

farindk commented Nov 6, 2024

cconcolato commented Nov 7, 2024

farindk commented Jul 23, 2024 •

edited

Loading