
fix: Switch to sequential processing in batch_process to resolve thread-safety issues #169

Merged · 4 commits · Jan 10, 2025

Conversation

fg-nava
Contributor

@fg-nava fg-nava commented Jan 9, 2025


Ticket

https://navalabs.atlassian.net/browse/DST-688

Changes

  • Modified batch_process.py to use sequential processing instead of parallel processing
  • Added explanatory comments about the thread-safety concerns with LiteLLM

Context for reviewers

This PR addresses the Docker deployment crashes (CPU >1400%) seen when processing CSVs with multiple rows. Investigation revealed that the high CPU usage was caused by thread-safety issues in the underlying LiteLLM client libraries when running in parallel.

The fix is simple but effective: we've removed the parallel processing implementation and switched to sequential processing. This change resolves the CPU spike issues we were seeing in Docker deployments.
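The diff itself is not reproduced here, but the shape of the change can be sketched as follows. This is an illustrative sketch only: `_process_question` below is a hypothetical stand-in for the real per-row worker that calls the LiteLLM-backed chat engine.

```python
from concurrent.futures import ThreadPoolExecutor


def _process_question(row: dict) -> str:
    # Hypothetical per-row worker; the real one calls the LiteLLM-backed
    # chat engine, whose client libraries are not thread-safe.
    return f"answer for {row['question']}"


def batch_process_parallel(rows: list[dict]) -> list[str]:
    # Before: rows fanned out across a thread pool. Concurrent calls into
    # the non-thread-safe client drove CPU usage past 1400% in Docker.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(_process_question, rows))


def batch_process_sequential(rows: list[dict]) -> list[str]:
    # After: one row at a time. Slower in principle, but CPU usage stays
    # stable and results come back in input order.
    return [_process_question(row) for row in rows]
```

Both functions return results in input order; the sequential version simply trades wall-clock parallelism for stability.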

Follow-up Items (to be tracked in separate tickets):

  1. UI feedback issue: "File processed, results attached" message appears in logs but not in UI
  2. Assessment needed: Determine if parallel processing is necessary for performance and if so, investigate thread-safe alternatives
  3. Test coverage: Add integration tests for batch runs of _process_question (currently only have mock calls to chat engine)
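For follow-up item 2, one candidate worth assessing (an assumption for illustration, not part of this PR) is keeping the thread pool but giving each worker thread its own client instance via `threading.local`, so no non-thread-safe state is shared. `FakeClient` below is a hypothetical stand-in for the LiteLLM-backed chat engine:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeClient:
    # Hypothetical stand-in for the LiteLLM-backed chat engine client.
    def ask(self, question: str) -> str:
        return f"answer for {question}"


_local = threading.local()


def _get_client() -> FakeClient:
    # Lazily create one client per worker thread, so threads never
    # share the (assumed) non-thread-safe client state.
    if not hasattr(_local, "client"):
        _local.client = FakeClient()
    return _local.client


def batch_process_per_thread_client(rows: list[dict]) -> list[str]:
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda r: _get_client().ask(r["question"]), rows))
```

Whether this actually resolves the CPU spikes depends on where the thread-safety problem lives in the client libraries, which is exactly what the follow-up assessment would need to determine.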

Testing

Tested locally by:

  • Ran batch processing with multi-row CSVs
  • Confirmed CPU usage remains stable
  • Verified all rows are processed successfully in sequential order

The change eliminates the >1400% CPU spikes previously observed in Docker deployments while maintaining functionality.

@fg-nava fg-nava requested a review from a team January 9, 2025 19:50

github-actions bot commented Jan 9, 2025

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines  Covered  Coverage  Threshold  Status
2845   2543     89%       80%        🟢

New Files

No new covered files...

Modified Files

File                      Coverage  Status
app/src/batch_process.py  100%      🟢
TOTAL                     100%      🟢

updated for commit: 025092d by action🐍

Review thread on app/src/batch_process.py (outdated, resolved)
@yoomlam yoomlam changed the title DST-688: Switch to sequential processing in batch_process to resolve thread-safety issues fix: Switch to sequential processing in batch_process to resolve thread-safety issues Jan 9, 2025
@fg-nava
Contributor Author

fg-nava commented Jan 10, 2025

This can be closed without merging as it is upstream of #170

@fg-nava fg-nava closed this Jan 10, 2025
@yoomlam
Contributor

yoomlam commented Jan 10, 2025

@fg-nava You should have merged this PR so that the title "fix: Switch to sequential processing in batch_process" is included as a separate commit message in main. Then merge #170, which will have a separate commit message that says "fix: Ensure UI feedback message for batch processing completion"

@fg-nava fg-nava reopened this Jan 10, 2025
@fg-nava fg-nava merged commit 9596f57 into main Jan 10, 2025
8 checks passed
@fg-nava fg-nava deleted the DST-688-batch-processing-limitations branch January 10, 2025 18:46