Add end-to-end render tests with real styles #5333

Open
louwers opened this issue Jan 12, 2025 · 14 comments
Labels
enhancement New feature or request PR is more than welcomed Extra attention is needed

Comments

@louwers
Collaborator

louwers commented Jan 12, 2025

User Story

As a MapLibre developer, I want to ensure my changes do not introduce regressions in existing styles as MapLibre GL JS evolves.

Something I have been thinking about for MapLibre Native, but might be a good idea for MapLibre GL JS as well, is to have end-to-end render tests that use real styles.

Rationale & Impact

We have a wide variety of render tests, but sometimes they do not capture interactions between features. They also do not capture the impact of a change to rendering behavior that is made consciously. One example is the changes to geometry-type, where a certain feature turned out to be used in more styles than was anticipated.

I think having a holistic overview of rendering changes and seeing how they affect multiple styles will be helpful for development and making changes with confidence. Tile providers will also be more eager to recommend their users to use the latest version of MapLibre GL JS, if they know their styles already went through some form of testing on our end.

Implementation

We cannot use 'live' styles, since they change all the time. So we would check in open source styles, or perhaps even proprietary styles (with permission).

We will make sure all data that is needed for rendering the section of the map that we test is available offline during the test. We could for example store that data in an S3 bucket and serve it with Martin during the test.

It should be easy to add a new style, to add a new section of the map for an existing style, and to update the expected images of pre-existing tests.


If this is considered a good idea, I would be happy to work on it and to make sure we share the tests between MapLibre GL JS and MapLibre Native. I am not very familiar with the tests that are part of MapLibre GL JS, so maybe something similar already exists.

@HarelM
Collaborator

HarelM commented Jan 12, 2025

The render tests are doing exactly that, and we can add more complicated styles if needed.
There's nothing preventing us from doing it right now, unless there's a need for a lot of data. That is usually not the case when you take a screenshot of the current screen, as there are limited tiles and features in any given screenshot.

@HarelM HarelM added enhancement New feature or request PR is more than welcomed Extra attention is needed labels Jan 12, 2025
@louwers
Collaborator Author

louwers commented Jan 12, 2025

Right. Yes they would just be render tests. Just more complex. And with other people's styles. Similar to how TypeScript has types of third party libraries in their source tree to ensure conformance.

There is just some work needed to pull the data, possibly store it outside the repo, and set it up during a CI run. Also, if we want to use proprietary styles and data, we would need to ask for permission. And that should then definitely not be part of this repo.

@hyperknot

hyperknot commented Jan 12, 2025

// Note: I don't have knowledge about the existing testing infrastructure.

One thing I can imagine would work is to have a fixed-version, full-planet OpenMapTiles render in PMTiles format stored on Cloudflare or S3.

Then 20 detailed locations around the world would be rendered into PNG files using some of the official styles, like Liberty, Bright, etc. Basically these styles from Overpass-Ultra:

Finally, on each PR, these screenshots would be compared with https://github.com/mapbox/pixelmatch, which was developed for this very purpose.

For the full planet dataset, here is the latest MBTiles render from OpenFreeMap. This would need to be converted to PMTiles.
https://btrfs.openfreemap.com/areas/planet/20250108_001002_pt/tiles.mbtiles

@louwers
Collaborator Author

louwers commented Jan 12, 2025

Using PMTiles is an interesting idea. We would need to add the PMTiles plugin as a dependency to the render tests.

@hyperknot

Yes, the simplest solution would be the plugin, or @bdon can probably help set up a Cloudflare or AWS worker which generates XYZ tiles. I think using PMTiles for this purpose would be a perfect fit, as it would be able to supply a fixed version forever without requiring a running server.

@HarelM
Collaborator

HarelM commented Jan 12, 2025

PMTiles is overkill if you ask me. We already have tiles in this repo, and we can also use the demotiles repo if we need a few more tiles.
The render tests already use a static server to serve tiles and pixelmatch for comparison.
Don't forget that when these kinds of tests fail, someone will need to debug them and understand why they failed. The more complicated they are, the more time this will take...

@louwers
Collaborator Author

louwers commented Jan 12, 2025

I think that if a test using a complex style fails, it will be a pain to pinpoint the exact issue regardless of what data source is used. But in most cases the failures will just be small pixel differences, and the issue does not need to be investigated. Instead, the expected image can just be updated to match the new behavior.

The main purpose of these 'real style render tests' is just to catch large regressions. The normal render tests should be used for day-to-day debugging and development. If an end-to-end render test fails dramatically, but the normal render tests pass, that would be a good opportunity to create a new normal render test.
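
The distinction drawn above, between small pixel differences that only call for updating the expected image and large regressions that need investigation, could be sketched roughly as follows. This is not the existing render test harness and not the pixelmatch API; `compareRgba`, the `Verdict` labels, and all three thresholds are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical helper: classifies the difference between two rendered images.
// Inputs are raw RGBA buffers (4 bytes per pixel) of identical dimensions.
type Verdict = "pass" | "minor-diff" | "large-regression";

interface DiffResult {
    changedPixels: number;
    totalPixels: number;
    changedFraction: number;
    verdict: Verdict;
}

function compareRgba(
    expected: Uint8Array,
    actual: Uint8Array,
    channelTolerance = 8,   // assumed: per-channel difference treated as noise
    minorFraction = 0.001,  // assumed: up to 0.1% changed pixels still passes
    largeFraction = 0.02,   // assumed: above 2% counts as a large regression
): DiffResult {
    if (expected.length !== actual.length || expected.length % 4 !== 0) {
        throw new Error("Images must be RGBA buffers of identical size");
    }
    const totalPixels = expected.length / 4;
    let changedPixels = 0;
    for (let i = 0; i < expected.length; i += 4) {
        // A pixel counts as changed if any channel differs by more than the tolerance.
        for (let c = 0; c < 4; c++) {
            if (Math.abs(expected[i + c] - actual[i + c]) > channelTolerance) {
                changedPixels++;
                break;
            }
        }
    }
    const changedFraction = totalPixels === 0 ? 0 : changedPixels / totalPixels;
    let verdict: Verdict = "large-regression";
    if (changedFraction <= minorFraction) {
        verdict = "pass";
    } else if (changedFraction <= largeFraction) {
        verdict = "minor-diff"; // candidate for simply updating the expected image
    }
    return { changedPixels, totalPixels, changedFraction, verdict };
}
```

With a scheme like this, a `minor-diff` result could be reported as a signal without failing the build, while only `large-regression` would block. Whether that split is appropriate, and where the thresholds should sit, would have to be decided per style.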

@hyperknot

I think these could be thought of as End-to-end (E2E) tests, where a very complex scenario (like a city center) is rendered and we just compare the percentage of pixels changed.

As an alternative to hosting a full-planet PMTiles file, maybe some city center tiles could be extracted, like Zurich, Paris, London, Stockholm, etc. But some bugs might only be visible at low zoom levels, on continents, oceans, etc. At some point, hosting a PMTiles file might be the simplest solution in terms of dev time.
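
To get a feel for how many tiles such an extract would involve, the XYZ tiles covering a bounding box at one zoom level can be enumerated with the standard Web Mercator tile formula. The sketch below is only illustrative: `lonLatToTile` and `tilesForBbox` are hypothetical helper names, the bounding box is assumed not to cross the antimeridian, and which tile zoom a renderer actually requests for a given map zoom depends on the tile size and source configuration, so that would need to be checked separately.

```typescript
interface TileId {
    z: number;
    x: number;
    y: number;
}

// Standard Web Mercator ("slippy map") tile indexing, as used by XYZ tile schemes.
function lonLatToTile(lon: number, lat: number, z: number): TileId {
    const n = 2 ** z; // number of tiles per axis at this zoom
    const latRad = (lat * Math.PI) / 180;
    const x = Math.floor(((lon + 180) / 360) * n);
    const y = Math.floor(
        ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n,
    );
    // Clamp so that lon = 180 or extreme latitudes still map to a valid tile.
    const clamp = (v: number): number => Math.min(n - 1, Math.max(0, v));
    return { z, x: clamp(x), y: clamp(y) };
}

// Lists every tile touched by the bounding box at zoom z.
// Assumption: west <= east (no antimeridian crossing) and south <= north.
function tilesForBbox(
    west: number,
    south: number,
    east: number,
    north: number,
    z: number,
): TileId[] {
    const topLeft = lonLatToTile(west, north, z);
    const bottomRight = lonLatToTile(east, south, z);
    const tiles: TileId[] = [];
    for (let x = topLeft.x; x <= bottomRight.x; x++) {
        for (let y = topLeft.y; y <= bottomRight.y; y++) {
            tiles.push({ z, x, y });
        }
    }
    return tiles;
}
```

Running `tilesForBbox` over the zoom levels a test needs gives the tile list to copy out of a larger archive. Fonts and sprites referenced by the style are not covered by this sketch.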

@louwers
Collaborator Author

louwers commented Jan 12, 2025

Having worldwide data available means that if you want to add a regression test for a particular spot, you can just create a style.json with the bbox and zoom level and generate the expected image. That's pretty useful, I think. MapLibre Native has received quite a few rendering bug reports that only show up in very specific locations.

But there's also something to be said for keeping the rendering tests completely offline.

In any case, I'll keep maintainability and debuggability in mind when working on a suggested approach.

@bdon
Contributor

bdon commented Jan 13, 2025

For our basemap tileset we use this visual testing suite: https://maps.protomaps.com/visualtests/

This iterates through examples.json and renders a set center/zoom level with any choice of style1/tileset1 and style2/tileset2 and compares them with pixelmatch.

This is made for a different use case, because we assume the map rendering library (MapLibre GL JS 5.0) is constant and vary the tilesets/styles, since that's what we're developing and tracking regressions for. It's useful to be able to compare two arbitrary points in the past.

So for this use case you would want to keep the tileset/style fixed and be able to pick two releases of MapLibre.
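
The inverted setup described here, with style and tileset held fixed and the renderer release varied, might be planned along the following lines. The shapes below are hypothetical: they do not reproduce the actual examples.json schema, and `RenderCase`, `RenderJob`, and `planComparisons` are names invented for this sketch.

```typescript
// Hypothetical description of one location to render.
interface RenderCase {
    name: string;
    center: [number, number]; // [longitude, latitude]
    zoom: number;
    style: string;   // assumed: path to a checked-in style JSON
    tileset: string; // assumed: path or URL of a frozen tileset
}

// One render to perform: a case bound to a specific renderer release.
interface RenderJob {
    caseName: string;
    rendererVersion: string;
    style: string;
    tileset: string;
    center: [number, number];
    zoom: number;
}

interface Comparison {
    caseName: string;
    baseline: RenderJob;
    candidate: RenderJob;
}

// For every case, pair a render from the baseline release with one from the
// candidate release. Style, tileset, center, and zoom are identical on both sides,
// so any pixel difference is attributable to the renderer.
function planComparisons(
    cases: RenderCase[],
    baselineVersion: string,
    candidateVersion: string,
): Comparison[] {
    if (baselineVersion === candidateVersion) {
        throw new Error("Baseline and candidate versions must differ");
    }
    return cases.map((c) => {
        const job = (rendererVersion: string): RenderJob => ({
            caseName: c.name,
            rendererVersion,
            style: c.style,
            tileset: c.tileset,
            center: c.center,
            zoom: c.zoom,
        });
        return {
            caseName: c.name,
            baseline: job(baselineVersion),
            candidate: job(candidateVersion),
        };
    });
}
```

Each resulting pair would then be rendered and the two images compared, for example with pixelmatch.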

@prusswan

A live site with the ability to mix and match styles and tilesets would be convenient for ad hoc testing, along with (optionally) a function to export test tiles in PMTiles format for targeted, offline investigation.

@JerPScott

@louwers I am a member of an AWS team using MapLibre for map rendering. I am considering implementing something similar to what you describe here in our own build pipelines to validate our styles with particular MapLibre versions. Having these more complex rendering tests within the MapLibre repo and run against new release candidates would be great from my perspective. I do think that the false positive problem is something to be mindful of, with small pixel changes likely occurring in the rendering of a more complex style.

I will have to do some asking around internally to make sure it doesn't expose the project to licensing issues, but I don't see any good reason not to include one or some of our styles in the repo for rendering tests.

Using the recent changes to geometry type as an example, I would ideally look for these tests to serve as a signal about support problems in upcoming releases and not as a blocker to features.

@louwers
Collaborator Author

louwers commented Jan 22, 2025

@JerPScott Thanks for reaching out. To avoid licensing issues as much as possible, we could also consider using online styles hosted by third parties such as AWS. However, that would require setting up, sharing and maintaining a versioned, unchanging API that we could use.

@HarelM
Collaborator

HarelM commented Jan 22, 2025

This won't solve the licensing issues, I think.
It would be great to have some complicated styles as part of this repo. It would also be great if you could invest a small amount of time updating them (or adding new updated ones) from time to time, so that we'd know we are always covering your use cases.

6 participants