
Challenge #12- Size, precision, speed - pick two: implementation #2

Open
EsperanzaCuartero opened this issue Jan 28, 2021 · 1 comment
Labels
stream-1 Stream 1 - Software development for weather, climate and atmosphere

Comments

@EsperanzaCuartero
Contributor

EsperanzaCuartero commented Jan 28, 2021

Challenge 12 - Size, precision, speed - pick two: implementation

Stream 1 - Software development for weather, climate and atmosphere

Goal

This project is a follow-up of the ESoWC 2020 data encoding optimisation challenge.
Based on the results and findings of that completed project, we will implement an improved data packing configuration in our production streams. We would also like to analyze some new atmospheric composition and meteorological datasets.

Mentors and skills

  • Mentors: @miha-at-ecmwf @juanjodd
  • Skills required:
    • Some knowledge of meteorological data formats (GRIB, NetCDF) and of libraries to decode and manipulate them (ecCodes, netcdf, cdo, nco, ...)
    • Some knowledge about data encoding (data packing, accuracy, compression methods)
    • Knowledge of a software library to compute and present the results
    • Some familiarity with Chemical Transport Modelling (CTM) or Numerical Weather Prediction (NWP) would be beneficial for better appreciating this challenge

Note: This challenge is funded by Copernicus. Only nationals from the European Union and ECMWF Member States are eligible to apply (see Terms and Conditions).


Challenge description

Data and software
We plan to use the CAMS global real-time forecast dataset, the ecCodes and NetCDF libraries to test different configurations and estimate data encoding errors, and a software library (Python, R or Julia) to compute and present the results.

What is the current problem?
Due to a non-optimal data encoding configuration, there is a lot of artificial precision in our data. As a result, datasets are expensive to archive and move, and difficult to use.

What could be the solution?
We would like to remove artificial precision from the encoded fields without any loss of information. At the same time, we need to be conscious of operational constraints, so that the data encoding and decoding steps do not become prohibitively expensive. The desired solution would be a combination of data encoding settings and steps that achieves this goal.
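To make the trade-off between bit width and encoding error concrete, here is a minimal sketch of linear quantisation in the spirit of GRIB simple packing (a reference value plus a power-of-two scale). It is an illustration under stated assumptions, not the ecCodes implementation: the function names `pack_simple`, `unpack_simple` and `max_abs_error` are hypothetical, and the decimal scale factor used by real GRIB packing is left out.

```python
import math

def pack_simple(values, nbits):
    """Quantise floats to unsigned integers that fit in nbits (illustrative only)."""
    ref = min(values)                      # reference value: the field minimum
    span = max(values) - ref
    max_int = (1 << nbits) - 1             # largest integer representable in nbits
    # Smallest power-of-two scale for which the whole range fits in nbits.
    exp = math.ceil(math.log2(span / max_int)) if span > 0 else 0
    scale = 2.0 ** exp
    packed = [round((v - ref) / scale) for v in values]
    return ref, exp, packed

def unpack_simple(ref, exp, packed):
    """Reverse the quantisation; the result differs from the input by at most scale / 2."""
    scale = 2.0 ** exp
    return [ref + p * scale for p in packed]

def max_abs_error(values, nbits):
    """Largest absolute difference between original and unpacked values."""
    ref, exp, packed = pack_simple(values, nbits)
    restored = unpack_simple(ref, exp, packed)
    return max(abs(a - b) for a, b in zip(values, restored))
```

Comparing `max_abs_error(field, nbits)` for several bit widths against the known accuracy of a variable gives a rough indication of how many bits carry real information and how many only store artificial precision.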

Ideas for the implementation
Things to address: choosing more appropriate packing methods, encoding float arrays, and exploring the use of suitable data compression algorithms.
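One way to explore the combination of float encoding and compression is to round away the low mantissa bits of each 32-bit float and then apply a general-purpose compressor. The sketch below is only an assumption about how such an experiment could look, not the method chosen for this challenge: `round_mantissa` and `compressed_size` are hypothetical helper names, zlib stands in for whichever compressor is eventually evaluated, and the rounding assumes finite values well below the float32 maximum.

```python
import struct
import zlib

def round_mantissa(value, keepbits):
    """Round a float32 to `keepbits` explicit mantissa bits (out of 23)."""
    drop = 23 - keepbits
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    if drop > 0:
        half = 1 << (drop - 1)                      # half of the last kept bit
        mask = ~((1 << drop) - 1) & 0xFFFFFFFF      # clears the dropped bits
        bits = (bits + half) & mask                 # round to nearest, then truncate
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def compressed_size(values, level=6):
    """Size in bytes of the zlib-compressed float32 representation of `values`."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(zlib.compress(raw, level))
```

Keeping `keepbits` mantissa bits bounds the relative rounding error at roughly 2^-(keepbits + 1), while the zeroed trailing bits give the compressor repeated patterns to exploit. Comparing `compressed_size` before and after rounding, together with the time taken, covers the size, precision and speed axes of the challenge.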


ESoWC

@EsperanzaCuartero EsperanzaCuartero added the stream-1 Stream 1 - Software development for weather, climate and atmosphere label Jan 28, 2021
@jwagemann jwagemann changed the title from "Challenge #31 - Jupyter widgets to help process and explore meteorological data" to "Challenge #12- Size, precision, speed - pick two : implementation" Jan 28, 2021
@EsperanzaCuartero EsperanzaCuartero changed the title from "Challenge #12- Size, precision, speed - pick two : implementation" to "Challenge #12- Size, precision, speed - pick two: implementation" Jan 29, 2021
@jwagemann

Hi,
join us for the ECMWF Summer of Weather Code Ask Me Anything session and learn all things ESoWC.

When: Wednesday, 24 March 2021 at 4 pm GMT

What: learn everything about ESoWC - how it works, the challenges this year, some tips for your proposal and listen to ESoWC experiences from previous participants

How: register here.


6 participants