I think the status page only indirectly, if at all, monitors whether HAFAS requests themselves are working. Maybe one could switch to monitoring the /health endpoint directly, which would of course entail many additional requests towards HAFAS.
At least with regard to DB's HAFAS API (and v6.db.transport.rest), this seems obsolete now that it has likely been shut off for good.
However, let me make a more general point that applies to the other HAFAS-based *.transport.rest APIs: the /health endpoint obviously does not use caching. If you all use it to monitor the availability of the API, you'll quickly exhaust the shared resource "requests from the server's single static IP to HAFAS", so you effectively prioritise your personal insight into whether the API is available over everyone's access to it. To keep the rate of requests low, I don't see any solution other than making the /health endpoint private.
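Just to sketch what "private" could look like: assuming an Express-based setup (hafas-rest-api uses Express), the health route could be gated behind a shared secret. The `HEALTH_TOKEN` environment variable and the `x-health-token` header are hypothetical, not anything hafas-rest-api reads today.

```ts
import express from 'express'

const app = express()
const token = process.env.HEALTH_TOKEN

// Gate /health behind a shared secret, so that public monitoring cannot
// exhaust the uncached requests towards HAFAS.
app.use('/health', (req, res, next) => {
	if (!token || req.get('x-health-token') !== token) {
		res.status(404).end() // behave as if the endpoint does not exist
		return
	}
	next() // fall through to the actual (uncached) health check
})
```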
As an alternative, I suggest monitoring "user-need-driven" requests (those for actual public transport data), e.g. their success/error rate and the time of the last successful one.
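A minimal sketch of what that could look like, again assuming an Express app; the middleware and the log format are my own invention, not part of hafas-rest-api. An external tool (Datadog, Grafana Loki, …) could then compute success/error rates and the time of the last successful request from these log lines.

```ts
import express from 'express'

const app = express()

// Log every user-driven response with its outcome, one JSON object per line.
app.use((req, res, next) => {
	const started = Date.now()
	res.on('finish', () => {
		console.log(JSON.stringify({
			path: req.path,
			status: res.statusCode,
			ok: res.statusCode < 500,
			durationMs: Date.now() - started,
			time: new Date().toISOString(),
		}))
	})
	next()
})
```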
It might also be worthwhile to add Prometheus-/OpenMetrics-compatible metrics to hafas-rest-api and expose them to the public, so you can ingest and monitor them.
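A sketch using the prom-client package; the metric names and the /metrics route are assumptions, not something hafas-rest-api currently exposes.

```ts
import express from 'express'
import {Counter, Gauge, register} from 'prom-client'

const app = express()

// count responses by status code class, e.g. 2xx/4xx/5xx
const responses = new Counter({
	name: 'hafas_rest_api_responses_total',
	help: 'responses served, by status code class',
	labelNames: ['status'],
})
// unix timestamp of the last successful response
const lastSuccess = new Gauge({
	name: 'hafas_rest_api_last_success_timestamp_seconds',
	help: 'unix timestamp of the last successful response',
})

app.use((req, res, next) => {
	res.on('finish', () => {
		responses.inc({status: `${Math.floor(res.statusCode / 100)}xx`})
		if (res.statusCode < 400) lastSuccess.setToCurrentTime()
	})
	next()
})

// expose the metrics in the Prometheus/OpenMetrics text format
app.get('/metrics', async (req, res) => {
	res.set('Content-Type', register.contentType)
	res.end(await register.metrics())
})
```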
The API status page (https://stats.uptimerobot.com/57wNLs39M/793274556) claims 100% uptime on days when my personal monitoring (Datadog) indicates otherwise.
Most/all of the failures are caused by 503 errors.
How exactly does UptimeRobot check whether the service is operational?