Add project overview #1

Status: Open (5 commits to merge into base: main)

maxrjones

This PR gives a short overview of the goals and some potential milestones for building ndquirk.

I'm tagging some people who've previously expressed interest in providing feedback on this project in case you'd like to share thoughts 🙏 @abarciauskas-bgse @sharkinsspatial @jhamman @omshinde @chuckwondo @moradology @jsignell. Thanks for your consideration!


### Resources

- [https://github.com/great-expectations/great_expectations/issues/1942](https://github.com/great-expectations/great_expectations/issues/1942)

I think if you took xarray-schema and added support for types of validation that xarray could not perform (e.g., union types, functional validators), then you would end up with something pretty similar to calling Great Expectations on a pandas DataFrame:

https://docs.greatexpectations.io/docs/0.18/oss/guides/connecting_to_your_data/fluent/in_memory/connect_in_memory_data/
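
To make the two ideas above concrete, here is a minimal sketch of what "functional validators" and "union types" could look like in plain Python. It is not the xarray-schema, Great Expectations, or Pandera API; the names `any_of`, `is_instance`, and `validate_attrs` are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, List, Mapping

# A functional validator is just a callable that returns True or False.
Validator = Callable[[Any], bool]


def is_instance(*types: type) -> Validator:
    """Build a validator that checks the value's type (hypothetical helper)."""
    def check(value: Any) -> bool:
        return isinstance(value, types)
    return check


def any_of(*validators: Validator) -> Validator:
    """Union-style check: passes when at least one validator passes."""
    def check(value: Any) -> bool:
        return any(v(value) for v in validators)
    return check


def validate_attrs(attrs: Mapping[str, Any],
                   rules: Mapping[str, Validator]) -> List[str]:
    """Return a list of human-readable failures (empty when all rules pass)."""
    failures = []
    for name, rule in rules.items():
        if name not in attrs:
            failures.append(f"{name}: missing")
        elif not rule(attrs[name]):
            failures.append(f"{name}: failed validation (got {attrs[name]!r})")
    return failures
```

A rule set might then mix both styles, for example `{"units": any_of(is_instance(str), lambda v: v is None), "scale_factor": lambda v: v > 0}`, and be applied to a plain attribute dictionary such as the `attrs` of an already opened dataset.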

Oh, or like Pandera, I guess.

Member Author

This approach would work right off the bat for a subset of the expectations, but I don't expect it to be entirely sufficient. xarray-schema currently relies on having loaded, or at least opened, an Xarray dataset before it can validate anything. IMO a lot of the value of this tool would come from explaining why datasets cannot simply be opened with Xarray (e.g., https://github.com/briannapagan/quirky-data-checker/blob/main/results/results_GES_DISC_total_quirks.png).
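
As a rough illustration of the "explain why it won't open" idea, the sketch below wraps an arbitrary opener and records the failure instead of raising. It is an assumption about how such a check could be structured, not how ndquirk works; `explain_open_failure` and the shape of its report are hypothetical.

```python
from typing import Any, Callable, Dict


def explain_open_failure(path: str,
                         opener: Callable[[str], Any]) -> Dict[str, Any]:
    """Try to open `path` with `opener` and report what happened.

    `opener` is any callable that takes a path and returns a dataset-like
    object; passing it in keeps this sketch independent of a specific library.
    """
    try:
        dataset = opener(path)
    except Exception as err:  # Record the failure rather than propagating it.
        return {
            "path": path,
            "opened": False,
            "error_type": type(err).__name__,
            "message": str(err),
        }

    # Release the file handle if the returned object supports closing.
    close = getattr(dataset, "close", None)
    if callable(close):
        close()

    return {"path": path, "opened": True, "error_type": None, "message": None}
```

With Xarray installed, `xarray.open_dataset` could be passed as the `opener`. A fuller tool would presumably map the recorded error type and message to a known quirk, which this sketch does not attempt.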

Co-authored-by: Alex I. Mandel <[email protected]>

danielfromearth commented Jan 10, 2025

Adding a note here that this relates to NASA Earthdata's Data Product Development Guide. For example, see Sections 6.1 and 6.2 in the "Tools for Data Product Testing" section for related data format and compliance checking tools.

@abarciauskas-bgse

👋🏽 @bilts we would love to get your feedback as well!
