Tests for coarse filtering by geography, time window, and completeness #20

Open
GolanTrev opened this issue Aug 28, 2024 · 3 comments

@GolanTrev
Collaborator

No description provided.

@GolanTrev
Collaborator Author

More generally, this script should be the FIRST step in the pipeline that transforms the data in a meaningful way. That includes re-projecting and generating new columns like "timestamp"; the user would then probably write the processed data to a new folder.

Apache Sedona? Spark (SQL) for subsetting dates.

What I would expect is for daphmeio to handle anything related to column names, folder structure, S3, data types, and WRITING to file. So, functions should receive an optional dict parameter that helps find alternate col_names.
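
A rough sketch of what the optional dict could look like, using a time-window filter as the example. Everything here is an assumption for illustration: the default names in `DEFAULT_COL_NAMES`, the helper `resolve_col_names`, and the `filter_time_window` signature are hypothetical and not existing daphmeio API. Records are plain dicts with integer timestamps to keep the sketch dependency-free.

```python
# Hypothetical defaults; in practice these would come from daphmeio.
DEFAULT_COL_NAMES = {
    "user_id": "uid",
    "timestamp": "timestamp",
    "latitude": "latitude",
    "longitude": "longitude",
}


def resolve_col_names(col_names=None):
    """Merge caller-supplied overrides on top of the default column names."""
    overrides = col_names or {}
    unknown = set(overrides) - set(DEFAULT_COL_NAMES)
    if unknown:
        raise KeyError(f"unknown column keys: {sorted(unknown)}")
    return {**DEFAULT_COL_NAMES, **overrides}


def filter_time_window(records, start, end, col_names=None):
    """Keep records whose timestamp falls in the half-open window [start, end)."""
    ts_col = resolve_col_names(col_names)["timestamp"]
    return [r for r in records if start <= r[ts_col] < end]
```

A caller whose data uses a different timestamp column would pass only the override, e.g. `filter_time_window(records, 10, 20, col_names={"timestamp": "unix_ts"})`, and every other name would fall back to the defaults.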

@GolanTrev
Collaborator Author

@thom-li have you thought about this or made any progress on it? Want to meet and work on it?

@GolanTrev
Collaborator Author

  • Complete tests in filter_tests.py
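
As a starting point, a test for the geography filter might look something like the sketch below. The function `filter_bbox`, its signature, and the inclusive-edge behaviour are assumptions made up for this example; the real filter under test and the actual layout of filter_tests.py may differ.

```python
import unittest


def filter_bbox(records, bbox, lat_key="latitude", lon_key="longitude"):
    """Hypothetical stand-in for the real filter.

    Keeps records inside bbox = (min_lon, min_lat, max_lon, max_lat),
    with the edges treated as inside.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        r for r in records
        if min_lat <= r[lat_key] <= max_lat and min_lon <= r[lon_key] <= max_lon
    ]


class TestFilterBbox(unittest.TestCase):
    def setUp(self):
        self.bbox = (-76.0, 39.0, -74.0, 41.0)
        self.inside = {"latitude": 40.0, "longitude": -75.0}
        self.outside = {"latitude": 10.0, "longitude": -75.0}
        self.on_edge = {"latitude": 39.0, "longitude": -76.0}

    def test_keeps_points_inside(self):
        result = filter_bbox([self.inside, self.outside], self.bbox)
        self.assertEqual(result, [self.inside])

    def test_edge_is_inclusive(self):
        self.assertEqual(filter_bbox([self.on_edge], self.bbox), [self.on_edge])

    def test_empty_input(self):
        self.assertEqual(filter_bbox([], self.bbox), [])
```

Saved in a test module, this can be run with `python -m unittest`; the time-window and completeness filters could get the same inside/outside/boundary treatment.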


3 participants