Revive Batch processing - Claude and GPT #40
Hi Lisa, I have a couple of questions and comments. We can go one of two directions with this:
Let me know. I just pushed a branch that allows you to run evaluations (without the batch option checked); I've integrated that process with the context service, and it seems to be working well. Do you view the results in the chat logs? Or, as I think you mentioned in another issue, do you want them logged to the database?
I'd rather keep the context service separate, so we can decide if and when it's useful and have better control over it. We could even add a failover where we abandon it if it takes too long. At the moment it IS helping load departmental results, but it also seems to have degraded some top task results; see the sample files attached above with 40 top questions. Right now I can only view batch jobs (which I believe are distinct from evaluations in some LLM systems) by signing in to the Anthropic Console and downloading them from there. I've attached the sample you created on Dec 23, saved as plain .txt because I couldn't attach JSON to this post. Yes, issue #41 is about logging them to the database, which is very much needed for the evaluation process. One thing that confuses me about the evaluation file is that the question itself (from the 'Problem Details' column in the input file) is not in the output, and neither is the referring URL (if provided); those were included previously in the batch output. Ideally the output should also include the evaluation tag so that batches can be distinguished from user output.
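The request above (carry the question, referring URL, and an evaluation tag into the output) could be handled by joining batch results back to the input rows. This is a hypothetical sketch, not the project's actual output format: the field names and the `eval-${i}` custom_id scheme are assumptions.

```javascript
// Hypothetical sketch: enrich each batch result with the input question,
// referring URL, and an evaluation tag by joining on custom_id.
// Field names and the `eval-${i}` id scheme are assumptions.
function mergeBatchOutput(inputRows, results) {
  const byId = new Map(results.map(r => [r.custom_id, r.answer]));
  return inputRows.map((row, i) => ({
    question: row['Problem Details'],
    referringUrl: row['URL'] ?? null,
    tag: 'evaluation', // lets batch rows be told apart from real user chats
    answer: byId.get(`eval-${i}`) ?? null,
  }));
}
```

Records in this shape would also be straightforward to log to the database per issue #41.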
Okay, I'll update the process.
A fundamental part of our evaluation system is to run batches of questions.
I did have batch processing working for Sonnet but have not tried again since ContextService was added, so I assume some changes will be needed. Note that batch calls cause tags to be added so that the AI doesn't ask clarifying questions (the systemPrompt in base.js tells it not to ask clarifying questions when the evaluation tag is present).
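The tag check described above might look something like this. A hypothetical sketch only: the real logic lives in the systemPrompt in base.js, and the function name, tag name, and wording here are assumptions.

```javascript
// Hypothetical sketch of the evaluation-tag check; the real logic lives
// in base.js. Function, tag, and instruction text are assumptions.
function buildSystemPrompt(basePrompt, tags = []) {
  // Batch runs are tagged as evaluations so the model answers directly
  // instead of asking clarifying questions no human is there to answer.
  if (tags.includes('evaluation')) {
    return basePrompt + '\nThis is an automated evaluation: do not ask clarifying questions.';
  }
  return basePrompt;
}
```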
I was never able to get batch processing working for ChatGPT, although the files are there. Ideally it should work on both.
With the tool changes to the API, and with the Context Service added, we need to get batch processing running again for Claude. Batches are started from the admin page by loading a file like the ones attached below: use one you've downloaded and cleaned from the Feedback viewer, or any CSV file with a column labelled 'Problem Details' containing the questions and an optional 'URL' column with a referring URL. The admin code is required to enable file upload (a temporary fix for testing). See https://docs.anthropic.com/en/docs/build-with-claude/message-batches
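Mapping the parsed CSV rows into Message Batches requests might look roughly like this. A minimal sketch under assumptions, not the project's actual code: the column names follow the issue ('Problem Details', optional 'URL'), while the model name, max_tokens, custom_id scheme, and URL wrapper are illustrative choices.

```javascript
// Sketch of turning parsed CSV rows into Anthropic Message Batches
// requests. Model, max_tokens, and custom_id scheme are assumptions.
function rowsToClaudeBatchRequests(rows) {
  return rows.map((row, i) => ({
    custom_id: `eval-${i}`,
    params: {
      model: 'claude-3-5-sonnet-20241022',
      max_tokens: 1024,
      messages: [{
        role: 'user',
        // Include the referring URL with the question when one is provided.
        content: row['URL']
          ? `${row['Problem Details']}\n<referring-url>${row['URL']}</referring-url>`
          : row['Problem Details'],
      }],
    },
  }));
}
```

An array in this shape is what the Node SDK's `anthropic.messages.batches.create({ requests })` expects, per the docs linked above.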
Then get it running for GPT: https://platform.openai.com/docs/guides/batch
Top40-findability-FR.csv
Top40-findability.csv