Dynamic Adjustment of flush_timeout and bulk_size for OpenSearch Sink #5326

Open
dinujoh opened this issue Jan 13, 2025 · 0 comments
Labels
enhancement New feature or request

dinujoh commented Jan 13, 2025

Is your feature request related to a problem? Please describe.
Implement Dynamic Adjustment of flush_timeout and bulk_size for OpenSearch Sink.

Currently, the OpenSearch sink in Data Prepper uses static values for flush_timeout and bulk_size. A single static configuration cannot suit every traffic pattern: values tuned for throughput add latency when traffic is low or sporadic. We need a mechanism that adjusts these properties dynamically based on the observed traffic pattern.

Describe the solution you'd like
Implement Adaptive Batching:
Instead of relying on statically configured flush_timeout and bulk_size values, implement an adaptive batching mechanism that adjusts the batch size based on the current traffic rate. This could involve starting with smaller batches and gradually increasing the size as traffic increases, and vice versa.

  1. Monitor the incoming traffic rate to the OpenSearch sink.
  2. For high TPS (Transactions Per Second) scenarios:
    • Use default values for flush_timeout and bulk_size to optimize for throughput.
  3. For low TPS or sporadic request patterns:
    • Set flush_timeout to -1 for immediate flushing.
    • Potentially reduce bulk_size to ensure timely processing of smaller batches.
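
The three steps above could be sketched roughly as follows. This is only an illustration in Python, not Data Prepper code: `TrafficMonitor`, `SinkSettings`, `choose_settings`, the 100 TPS threshold, and the concrete setting values (including the assumed defaults of 60000 ms and 5 MiB) are all hypothetical and would need to be replaced with values appropriate to the deployment.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SinkSettings:
    flush_timeout_ms: int  # -1 requests immediate flushing, as proposed above
    bulk_size_mib: int


# Hypothetical values chosen for illustration only.
DEFAULT_SETTINGS = SinkSettings(flush_timeout_ms=60_000, bulk_size_mib=5)
LOW_TRAFFIC_SETTINGS = SinkSettings(flush_timeout_ms=-1, bulk_size_mib=1)
LOW_TPS_THRESHOLD = 100.0


class TrafficMonitor:
    """Sliding-window estimate of events per second (step 1)."""

    def __init__(self, window_seconds: float = 10.0):
        self.window_seconds = window_seconds
        self._samples = deque()  # (timestamp, event_count) pairs
        self._total = 0

    def record(self, now: float, event_count: int) -> None:
        self._samples.append((now, event_count))
        self._total += event_count
        self._evict(now)

    def _evict(self, now: float) -> None:
        # Drop samples that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._samples and self._samples[0][0] <= cutoff:
            _, count = self._samples.popleft()
            self._total -= count

    def rate(self, now: float) -> float:
        self._evict(now)
        return self._total / self.window_seconds


def choose_settings(tps: float) -> SinkSettings:
    """Steps 2 and 3: defaults for high TPS, immediate flushing otherwise."""
    if tps >= LOW_TPS_THRESHOLD:
        return DEFAULT_SETTINGS
    return LOW_TRAFFIC_SETTINGS
```

Timestamps are passed in explicitly so the monitor can be exercised without a real clock; a caller would typically supply `time.monotonic()`.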

Benefits:

  1. Improved performance across varying traffic patterns.
  2. Reduced latency for low-traffic scenarios.
  3. Better resource utilization during high-traffic periods.
  4. Enhanced user experience without manual configuration changes.

Potential Challenges:

  1. Determining optimal thresholds and adjustment strategies.
  2. Ensuring thread-safety in dynamic property adjustments.
  3. Avoiding frequent oscillations in settings.
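
One possible way to address challenges 2 and 3 together is a mode switch with hysteresis guarded by a lock. The sketch below is an assumption about how this might look, not a proposed implementation: the class name `ModeSwitch` and the thresholds of 50 and 150 TPS are made up for illustration. The mode only changes when the rate crosses the far threshold, so a rate hovering between the two thresholds does not cause flapping.

```python
import threading


class ModeSwitch:
    """Tracks a 'low' or 'high' traffic mode using two thresholds."""

    def __init__(self, low_tps: float = 50.0, high_tps: float = 150.0):
        if low_tps >= high_tps:
            raise ValueError("low_tps must be below high_tps")
        self._low_tps = low_tps
        self._high_tps = high_tps
        self._mode = "low"
        # The lock keeps updates from concurrent sink workers consistent.
        self._lock = threading.Lock()

    def update(self, tps: float) -> str:
        """Feed the latest observed rate and return the resulting mode."""
        with self._lock:
            if self._mode == "low" and tps >= self._high_tps:
                self._mode = "high"
            elif self._mode == "high" and tps <= self._low_tps:
                self._mode = "low"
            # Rates between the thresholds leave the mode unchanged.
            return self._mode
```

The gap between the two thresholds is the tuning knob: a wider gap means fewer switches but slower reaction to genuine traffic changes.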

Describe alternatives you've considered (Optional)

  1. Traffic-Based Worker Scaling:
    Implement a system that scales the number of worker threads processing the sink based on the incoming traffic. This could help manage both high and low traffic scenarios without changing the flush_timeout or bulk_size.

  2. Time-Based Flushing with Backpressure:
    Instead of using a fixed flush_timeout, implement a time-based flushing mechanism with backpressure. This would flush based on time for low traffic, but could delay flushing if the system is under high load, effectively adapting to traffic patterns.

  3. Machine Learning-Based Prediction:
    Implement a machine learning model that predicts traffic patterns and adjusts sink parameters proactively, rather than reactively. This could be particularly effective for systems with recurring traffic patterns.

  4. Hybrid Approach:
    Combine multiple strategies, such as using adaptive batching for high-traffic scenarios and immediate flushing for low-traffic periods, switching between modes based on observed patterns.
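
As a rough illustration of alternative 2 (time-based flushing with backpressure), the flush decision could be expressed as a small policy object. Everything here is hypothetical: `FlushPolicy`, its field names, and its numeric values are assumptions for the sketch, and using the number of in-flight bulk requests as the load signal is just one possible choice.

```python
from dataclasses import dataclass


@dataclass
class FlushPolicy:
    flush_interval_ms: int = 1_000   # normal time-based flush interval
    max_delay_ms: int = 10_000       # upper bound on how long backpressure may delay a flush
    max_in_flight: int = 4           # in-flight bulk requests treated as "under load"
    bulk_size_bytes: int = 5 * 1024 * 1024

    def should_flush(self, buffered_bytes: int, elapsed_ms: int, in_flight: int) -> bool:
        if buffered_bytes == 0:
            return False  # nothing to send
        if buffered_bytes >= self.bulk_size_bytes:
            return True   # batch is full
        if elapsed_ms >= self.max_delay_ms:
            return True   # do not delay indefinitely
        if in_flight >= self.max_in_flight:
            return False  # backpressure: defer while the system is busy
        return elapsed_ms >= self.flush_interval_ms
```

Under low traffic the interval check dominates and small batches go out promptly; under load the in-flight check holds flushes back until either the batch fills or the maximum delay is reached.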
