
Try out Deepseek #557

Open
kongzii opened this issue Nov 21, 2024 · 7 comments

Comments

@kongzii
Contributor

kongzii commented Nov 21, 2024

https://api-docs.deepseek.com/news/news1120

πŸ” o1-preview-level performance on AIME & MATH benchmarks.
πŸ’‘ Transparent thought process in real-time.
πŸ› οΈ Open-source models & API coming soon!

@empeje

empeje commented Jan 9, 2025

This is a fascinating topic, and I have a couple of questions:

  • When transitioning between models, do you use any form of "unit tests" or benchmarks to identify potential regressions in the new implementation?
  • Is it possible for an outsider like me to experiment with this task and collaborate on the same repo?

@gabrielfior
Contributor

Hey @empeje - feel free to experiment and collaborate on this repo.
We have plenty of unit/integration tests. Regarding benchmarks, we also have some, mainly for the betting strategies (see https://github.com/gnosis/prediction-market-agent-tooling/blob/main/examples/monitor/match_bets_with_langfuse_traces.py), but we haven't done much work on benchmarks across models.
@kongzii anything to add here?

@empeje

empeje commented Jan 13, 2025

Thanks @gabrielfior for the reply; I'm going to experiment with it.

It's nice that tests and benchmarks are available.

@kongzii
Contributor Author

kongzii commented Jan 14, 2025

We do have a benchmark to compare agent implementations as well: https://github.com/gnosis/prediction-market-agent/blob/main/prediction_market_agent/agents/think_thoroughly_agent/benchmark.py#L101. But of course it's not as accurate as simply running the agents for a while and then checking their predictions retrospectively once the markets have resolved.
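
To make that retrospective check concrete, here is a minimal sketch of scoring stored predictions against resolved outcomes (the data structure and function names are made up for illustration; they are not the actual repo API):

# Illustrative only: hypothetical record of an agent's prediction on a
# market that has since resolved.
from dataclasses import dataclass

@dataclass
class ResolvedPrediction:
    market_id: str
    predicted_p_yes: float  # probability the agent gave to YES
    resolved_yes: bool      # actual outcome after resolution

def accuracy(predictions: list[ResolvedPrediction], threshold: float = 0.5) -> float:
    # Fraction of resolved markets where the agent picked the right side.
    if not predictions:
        return 0.0
    correct = sum((p.predicted_p_yes >= threshold) == p.resolved_yes for p in predictions)
    return correct / len(predictions)

def brier_score(predictions: list[ResolvedPrediction]) -> float:
    # Mean squared error between predicted probability and outcome (lower is better).
    return sum((p.predicted_p_yes - float(p.resolved_yes)) ** 2 for p in predictions) / len(predictions)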

@empeje

empeje commented Jan 14, 2025

An update on my research: it looks like we can test DeepSeek directly by just changing the OpenAI base URL and passing a DeepSeek token there. See the following code from their official documentation:

# pip3 install langchain_openai
# python3 deepseek_v2_langchain.py
from langchain_openai import ChatOpenAI

# DeepSeek exposes an OpenAI-compatible API, so ChatOpenAI works
# once the base URL and API key are overridden.
llm = ChatOpenAI(
    model='deepseek-chat',
    openai_api_key='<your api key>',
    openai_api_base='https://api.deepseek.com',
    max_tokens=1024,
)

response = llm.invoke("Hi!")
print(response.content)

@empeje

empeje commented Jan 16, 2025

If I were to implement DeepSeek support, what is the common way to do it in this codebase? I imagine I could just change the model type, but I'm wondering whether we should have a feature flag to switch between agents so we can compare models.

@kongzii
Contributor Author

kongzii commented Jan 16, 2025

a feature flag to switch between this agent so we can compare models.

Yes, that's good thinking. A good agent family to start with is the Prophet family; check out this file.

There, you will see that each agent is a separate class, and each class comes with its own model configuration:

(screenshot: the Prophet agent classes, each with its own model configuration)

This was fine until now, as all Prophets use OpenAI; now, however, you will also need to change the API base URL, not just the model string, as you showed above. So that's something you need to add in there.

The Prophet agent itself is defined in another repository, so you may need to make some modifications there as well.
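
As a rough, hypothetical sketch of what such a per-agent configuration could look like (class and attribute names are illustrative, not the actual ones in the Prophet repos):

# Hypothetical sketch: a Prophet-style agent configuration that also carries
# an API base URL, so a DeepSeek-backed agent can reuse the OpenAI-compatible
# client shown above. Names are illustrative, not the repo's actual classes.
from langchain_openai import ChatOpenAI

class DeepSeekProphetAgent:
    model: str = "deepseek-chat"
    # Extra field compared to the existing OpenAI-only Prophet agents:
    openai_api_base: str = "https://api.deepseek.com"

    def build_llm(self, api_key: str) -> ChatOpenAI:
        # ChatOpenAI accepts a custom base URL, so the same code path can
        # serve both OpenAI and DeepSeek models.
        return ChatOpenAI(
            model=self.model,
            openai_api_key=api_key,
            openai_api_base=self.openai_api_base,
            max_tokens=1024,
        )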
