Reducing metadata overhead, especially with tiled images #106
This seems like a reasonable thing to add. It kind of relates to the exploration on compressed […].

Could you share a range of the expected file size of a file representative of the use case you mentioned?
Is this use case sharing other complex ISOBMFF/HEIF features?
A simple encoding that matches the typical data is usually better than […]. I tried to compress the byte sequence consisting of […]. The 512-byte sequence […]. For the extreme case of 256x256 tiles, we have the […].
Fair point. I would have expected deflate to be able to do better here, but it's been a long time since I looked into exactly how it codes stuff. I thought it could handle incrementing sequences, but I must have misremembered.
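As a quick way to reproduce the intuition here, a minimal sketch (Python; the 10974-item incrementing ID sequence is hypothetical, sized to match the tile count mentioned later in the thread) comparing deflate against a trivial start+count range encoding:

```python
import struct
import zlib

# 10974 consecutive 4-byte big-endian item IDs, mirroring the kind of
# incrementing sequence a box full of tile item IDs would contain.
raw = b"".join(struct.pack(">I", i) for i in range(1, 10975))

compressed = zlib.compress(raw, 9)

# A trivial "range" encoding needs only a start ID and a count:
range_encoded_size = struct.calcsize(">II")  # 8 bytes

print(len(raw), len(compressed), range_encoded_size)
```

Deflate shrinks the sequence noticeably (the high bytes repeat, and nearby 4-byte groups are near-matches), but it has no model for arithmetic progressions, so it cannot get anywhere near the 8 bytes a purpose-built range encoding needs.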
There are a couple of different aspects to this - "storage size on disk" and "actually transferred bytes" might be ways to describe it.

Reducing the […]. On a huge image, reducing the […]. If the […]. If […].

As the user zooms in, we can select which parts of the image we want using Range-header byte ranges, to only download the parts of the file we will show to the user. The actual transferred bytes could be (and often will be, in this use case) much less than the total size. So while the total image size is still important, it may not be that useful a metric here.
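As a small illustration of the byte-range idea (the offsets are hypothetical, not tied to any real file), mapping one tile's extent to an HTTP Range header value:

```python
def tile_range_header(offset: int, length: int) -> str:
    """Build the HTTP Range header value for one tile's extent.

    The Range header uses inclusive byte positions, so the last byte
    of the extent is offset + length - 1.
    """
    return f"bytes={offset}-{offset + length - 1}"

# Hypothetical extent: a tile stored at offset 1_048_576, 30_000 bytes long.
print(tile_range_header(1_048_576, 30_000))  # bytes=1048576-1078575
```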
Got it, thank you for the detailed explanation. What would be the reduced […]?
In order to put some numbers on @y-guyon's comment above, I decided to take a look at the largest HEIF file I have created. It's a 93356x32872 HEIF file using 10974 512x512 tiles, with each tile having […].
Size-wise, the […]. Assuming 4-byte item IDs:

Taking the […]

Taking the […]

So deflate definitely helps. If used on the full […]
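A rough back-of-envelope for the `iloc` payload of such a file, using the field widths of a version-2 `iloc` entry from the ISOBMFF box definition, and assuming a single extent per item, 4-byte item IDs, 4-byte extent offsets/lengths, and no base offset (the exact numbers from the comment above are not preserved in this thread, so this is only an estimate):

```python
# Rough per-item cost of a version-2 iloc entry (field widths from the
# ISOBMFF iloc definition), assuming: single extent per item, 4-byte
# item IDs, 4-byte extent offset/length, no base_offset, no extent index.
ITEM_ID = 4          # item_ID (32 bits in iloc version 2)
CONSTR = 2           # reserved + construction_method
DATA_REF = 2         # data_reference_index
EXTENT_COUNT = 2     # extent_count
EXTENT = 4 + 4       # extent_offset + extent_length

per_item = ITEM_ID + CONSTR + DATA_REF + EXTENT_COUNT + EXTENT  # 18 bytes
tiles = 10974
print(per_item * tiles)  # 197532 bytes, roughly 193 KiB of iloc payload
```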
For the iloc entries in that specific file, is it all single extent per item, and all in […]? Of those extents, how many follow on such that the first byte of item n is immediately after the last byte of item n-1? My interest is in whether offset + extent for the previous item can determine the offset of this item. So you could get an effect of "items 1 to 10974 have extents [ ], first item at (base) offset x".
Yes
All of them
Yes, that's definitely something that could be done. The file basically has something like this: […]
For all of these, the offset is the offset+length of the previous item. So the only thing that would really be needed would be to store the starting offset, the range of item IDs and the length of each item. In this case we could get away with using a 2-byte length for each entry, which would save a lot of space. If you want to get fancy you could also experiment with trying to predict the length from the previous length, but that could break down fast depending on your content and won't really save you that much unless we start using variable-length integers.
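The scheme described above could be sketched roughly like this (hypothetical helper names; a 64-bit base offset, a 32-bit first item ID and item count, then one 2-byte length per item, with each item's offset reconstructed by accumulation):

```python
import struct

def pack_lengths(base_offset: int, first_id: int, lengths: list[int]) -> bytes:
    """Pack a base offset, an item-ID range, and one 2-byte length per
    item (so every length must fit in 16 bits)."""
    header = struct.pack(">QII", base_offset, first_id, len(lengths))
    return header + b"".join(struct.pack(">H", n) for n in lengths)

def unpack_offsets(blob: bytes) -> dict[int, tuple[int, int]]:
    """Rebuild {item_ID: (offset, length)}; each item starts where the
    previous one ended."""
    base, first_id, count = struct.unpack_from(">QII", blob)
    items, offset = {}, base
    for i in range(count):
        (n,) = struct.unpack_from(">H", blob, 16 + 2 * i)
        items[first_id + i] = (offset, n)
        offset += n
    return items

blob = pack_lengths(4096, 1, [30000, 28500, 31000])
print(unpack_offsets(blob))
```

With this layout, 10974 single-extent items cost 16 bytes of header plus 2 bytes each, instead of a full iloc entry per item.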
Here is an example image with 256x255 tiles: large tiled images (the image file is quite large as it uses JPEG compression).
The h265 bitstream uses an adjustable number of bits for the size of each tile, but to simplify parsing and be in line with how other boxes work, we could use either 4 or 8 bytes for each tile.
If we further allowed tile sizes to be stored at lengths other than 4 or 8 bytes, this could be reduced even further, to maybe at most half of that. I can write a detailed proposal for an […]
You can't really disagree that it helps, I gave you actual numbers for an actual file. Just using deflate on the full […]. That doesn't mean you can't get even better results by implementing the range schemes for […].
@leo-barnes Ah, sorry, I misunderstood your experiment. I thought you were applying […]
HEIF images that consist of many tiles contain a lot of redundant information:

- one `infe` item entry per tile, with usually identical content,
- `ipma` property associations for each tile, usually referring to the same properties,
- the `iref` of type `dimg`, defining the grid image tiles, usually consists of a long list of consecutive item IDs (1, 2, 3, 4, 5, 6, ...).

On images with many tiles (#105), this can lead to significant overhead: e.g. 21 bytes for each `infe`, 2 or 4 bytes for each `iref` reference, and usually >=6 bytes for each `ipma` entry. For 256x256 tiles, this sums to at least 2 MB of unnecessary overhead.
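The 2 MB figure checks out as simple arithmetic over the per-tile byte costs quoted above:

```python
# Sanity check of the overhead estimate: per-tile costs quoted in the
# issue (21 bytes per infe, 4 bytes per iref reference, 6 bytes per
# ipma entry) multiplied by a 256x256 tile grid.
tiles = 256 * 256                      # 65536 tiles
per_tile = 21 + 4 + 6                  # infe + iref + ipma, in bytes
total = tiles * per_tile
print(total)  # 2031616 bytes, i.e. about 2 MB
```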
In order to reduce this amount of data, I propose the following changes:

- Extend `infe` to hold not only an `item_ID`, but a range of `item_ID`s. The content of a single `infe` will then define all items in the given range. This could be implemented with a new `infe` box version or flag that enables storing an `item_ID_count`. The range would then be defined by `item_ID` to `item_ID + item_ID_count`.
- Each `ipma` entry could store a range of `item_ID`s instead of a single `item_ID` to which it relates, in order to eliminate the unnecessary copies.
- `iref` could be extended with a `version=2`, operating as follows: […] This defines a simple run-length encoding, but still allows assigning non-consecutive item IDs. For example, the `iref` list `[1,2,3,4,5,6,7,8,9,10]` would be encoded simply as `{1,9}` (`to_itemID=1`, `count_minus_one=9`), and `[1,2,3,4,5, 12, 7,8,9,10]` as `[{1,4}, {12,0}, {7,3}]`.

Other, improved coding schemes are also possible, of course.
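The proposed run-length encoding can be sketched as follows (hypothetical function names; the pair layout follows the `{to_itemID, count_minus_one}` notation from the examples above):

```python
def encode_refs(item_ids: list[int]) -> list[tuple[int, int]]:
    """Run-length encode a reference list as (to_itemID, count_minus_one)
    pairs: each pair covers a run of consecutive item IDs."""
    runs: list[tuple[int, int]] = []
    for item_id in item_ids:
        if runs and item_id == runs[-1][0] + runs[-1][1] + 1:
            to_id, count = runs[-1]
            runs[-1] = (to_id, count + 1)  # extend the current run
        else:
            runs.append((item_id, 0))      # start a new run
    return runs

def decode_refs(runs: list[tuple[int, int]]) -> list[int]:
    """Expand the pairs back into the full reference list."""
    return [to_id + i for to_id, count in runs for i in range(count + 1)]

print(encode_refs(list(range(1, 11))))                # [(1, 9)]
print(encode_refs([1, 2, 3, 4, 5, 12, 7, 8, 9, 10]))  # [(1, 4), (12, 0), (7, 3)]
```

This reproduces both worked examples from the proposal, and decoding is a pure round-trip, so a reader or writer can be tested against the other.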