2025 Project Proposals
acarbonetto committed Jan 8, 2025
1 parent 0e4acd3 commit f558203
84 changes: 84 additions & 0 deletions docs/3.xProposals.md
# Project Proposal

## Summary

The following areas are proposed as priorities for Q1 2025, with a look ahead to Q2 2025.

### Milestone: Spark/OpenSearch PPL

Status: In Progress

This milestone aims to close the gap between the functions offered by the Spark-PPL grammar and those offered by the OpenSearch-PPL grammar.

Proposed tasks:
1. JSON Scalar/Object/Array Functions - as proposed in the [RFC](https://github.com/opensearch-project/sql/issues/3028), this adds to the SQL plugin PPL grammar a Splunk-like API for reading flattened JSON strings as JSON values, with an interface for managing JSON objects and JSON arrays.
2. Add the [appendcol](https://github.com/opensearch-project/sql/issues/3172) function to OpenSearch-PPL.
3. Add the [iplocation](https://github.com/opensearch-project/sql/issues/3037) function to OpenSearch-PPL. This also adds the geospatial plugin as a testing dependency.
4. Support [Relative Date-time Strings](https://github.com/opensearch-project/opensearch-spark/issues/991) for Spark-PPL.
5. Support [earliest/latest](https://github.com/opensearch-project/opensearch-spark/issues/957) datetime functions for Spark-PPL.

### Milestone: 3.0 release priority fixes

Status: Proposed

According to the [OpenSearch 3.0 release schedule](https://github.com/opensearch-project/.github/issues/252), Release 3.0 is scheduled to go to alpha midway through February. This provides a unique opportunity to clean up or deprecate old work, including the legacy engine, which can still give unexpected results for users.

Proposed tasks:
1. Lucene/JDK build fixes - placeholder for any impact on the `main` plugin branch. See [announcement on Slack](https://opensearch.slack.com/archives/C051D137M7G/p1735946771951749), [PR](https://github.com/opensearch-project/OpenSearch/pull/16366).
2. [Deprecate Legacy Engine for 3.x](https://github.com/opensearch-project/sql/issues/787). Despite the [open legacy issues](https://github.com/opensearch-project/sql/issues?q=is%3Aissue+is%3Aopen+label%3Alegacy), this would be an opportune time to disable the legacy engine as a fallback mechanism, as it currently does not have a significant impact on users. We should consider including a configuration option that re-enables the engine for corner cases or testing.

### Milestone: PPL Language Migration

Status: Not Started

An existing proposal is to merge the PPL language parser definition files into a common repository to encourage community involvement. This would be the first step toward making the PPL language universal and removing the existing dependencies on OpenSearch-PPL and Spark-PPL.

Proposed Tasks:
1. Grammar Unification Proposal
2. Prototype PPL Grammar repository
3. Cleanup and migrate feature branch (Spark)
4. Cleanup and migrate feature branch (OpenSearch)
5. Execute migrations from feature branch to main
6. Implement validation REPL
7. CI to validate REPL
8. Documentation: Extension Mechanism
9. Community blog/demo

### Milestone: Docker Test Framework

Status: In Progress

Introduce an open-source alternative to the Spark EMR Serverless environment currently hosted by Amazon. This enables community developers to contribute without any third-party dependencies.

Proposed Tasks:
1. Integration with Async API
2. Alter Scala tests to integrate with Docker
3. Integrate SBT build
4. Update CI
5. Community blog/demo

### Milestone: Community RFCs and Issues

Status: Ongoing

Placeholder effort for resolving community discussions and bugs, as well as increasing community engagement.

Proposed Tasks:
1. Ongoing effort to promote community engagement
2. Propose and fix (SQL/PPL) [dynamic field type(flat_object) support](https://github.com/opensearch-project/sql/issues/3067)
3. Fix documentation for [parse syntax doesn't work](https://github.com/opensearch-project/sql/issues/3206)
4. Add CI for ML-Commons plugin integration
5. Fix SQL grammar for [Enable scalar function aggregation in PPL](https://github.com/opensearch-project/sql/issues/45)
6. Propose and fix SQL grammar for [Missing V2 Aggregation Functions](https://github.com/opensearch-project/sql/issues/1889), including:
   - histogram
   - date_histogram
   - percentiles
   - topHits
   - geo_polygon
   - geo_bounding_box
   - geo_distance
   - geo_intersects
   - stats
   - extended_stats
   - ... GROUP BY terms(...
   - ... GROUP BY range(...
