Simplify Draft 'pymd' Box #100
Comments
We could even leave away the |
Another reason for leaving away |
Is there any important reason for the restriction that every pyramid level should have the same tile size? I think this is overly restrictive. There are good reasons why tile sizes would vary between layers. For example, the base layer might hold the original uncompressed data, while overview layers use H.265 with a tile size dictated by the encoder hardware. Building the lower resolutions is also much easier if it can be done for each base-level tile separately. If an application needs the same tile size in each level, the encoder is free to do that without the specification enforcing it, and uniform tile sizes could be signaled with a flag (although this is also obvious when looking at the individual pyramid levels).
FYI, the layer_binning isn't restricted to power-of-two scaling. It is restricted to integer factors. For instance, the examples in the document show 2x2 binning as a layer_binning of 2 and 4x4 binning as a layer_binning of 4. One can also choose to implement a 3x3 binning as a layer_binning of 3, etc.
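To make the integer-factor point concrete, here is a minimal Python sketch of how a reader might derive a layer's pixel dimensions from the base image size and a layer_binning value. The function name `layer_size`, the example dimensions, and the rounding rule (ceiling division, so partial bins at the right and bottom edges are kept) are assumptions for illustration and are not taken from the draft text.

```python
def layer_size(base_width: int, base_height: int, layer_binning: int) -> tuple[int, int]:
    """Return (width, height) of a layer binned by an integer factor.

    Hypothetical helper: the rounding rule below is an assumption,
    not something stated in the HEIF draft.
    """
    if layer_binning < 1:
        raise ValueError("layer_binning must be a positive integer")
    # Ceiling division without floats: keep partial bins at the edges.
    width = -(-base_width // layer_binning)
    height = -(-base_height // layer_binning)
    return width, height


if __name__ == "__main__":
    # Any integer factor works, not only powers of two.
    for binning in (1, 2, 3, 4):
        print(binning, layer_size(4096, 3072, binning))
```

A factor of 3 is handled exactly like 2 or 4; only the edge rounding would need to match whatever the specification finally mandates.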
Right. That was a semantic typo: I meant an integer scale factor. I've corrected it above. |
Indeed, the purpose was that the reader has all the information it needs in a single place. Depending on which kinds of overviews make up the pyramid, the information may be more or less complicated to gather (grid, tiled coded image with constrained extents, uncompressed image). For instance, for a grid image the example above is not entirely correct because of possible implicit cropping. To be accurate, you would have to get the width and height of the input images. I agree it is worth simplifying in some cases. What about using a flag so that this information is only conditionally present?
A conditional flag is one option, but I think the "difficulty" of getting the information is not a strong argument. Any decoder that is able to decode the images obviously also knows how to get the image sizes, and from there all parameters can be trivially computed. Furthermore, it is not clear why anyone other than the decoder would be interested in the number of tiles in each layer. I can see two use-cases:
Thus, the specification would be:
AMD1 of HEIF has been revised. We invite experts to have a new look at the text.
Concerning the Image Pyramid Entity Group Box 'pymd' in WG03N1157_23524 HEIF Ed3 Amd2 Prelim WD:
The following four variables seem redundant and could be removed to greatly simplify the 'pymd' box:
If the gridding is handled within the codec, then the codec specific boxes would be used instead. For example, the uncompressed codec would use the uncC box:
I imagine these variables were originally placed here for convenience. However, there are already well-defined mechanisms for accessing them. In my opinion, duplicating them in the pymd box adds complexity for very little gain.
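As a rough illustration of the redundancy argument, the sketch below shows how a reader could compute the tile grid of each layer from values it already has to obtain in order to decode the images. The function `tile_grid`, the parameter names, and the example layer sizes are hypothetical and do not correspond to fields in the draft; the sketch also assumes that edge tiles may be partially filled, so the counts round up.

```python
def tile_grid(layer_width: int, layer_height: int,
              tile_width: int, tile_height: int) -> tuple[int, int]:
    """Return (columns, rows) of tiles needed to cover one layer.

    Hypothetical helper: assumes tiles at the right and bottom edges
    may be only partially filled (implicit cropping).
    """
    if min(layer_width, layer_height, tile_width, tile_height) < 1:
        raise ValueError("all dimensions must be positive")
    columns = -(-layer_width // tile_width)   # ceiling division
    rows = -(-layer_height // tile_height)
    return columns, rows


if __name__ == "__main__":
    # Illustrative layers; tile sizes are allowed to differ per layer.
    layers = [
        (4096, 3072, 512, 512),
        (2048, 1536, 256, 256),
        (1000, 750, 256, 256),
    ]
    for width, height, tile_w, tile_h in layers:
        print((width, height), (tile_w, tile_h), tile_grid(width, height, tile_w, tile_h))
```

Because the computation works per layer, it does not depend on every layer sharing one tile size.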