
Try out Deepseek #557

Open
kongzii opened this issue Nov 21, 2024 · 7 comments

Comments

@kongzii
Contributor

kongzii commented Nov 21, 2024

https://api-docs.deepseek.com/news/news1120

πŸ” o1-preview-level performance on AIME & MATH benchmarks.
πŸ’‘ Transparent thought process in real-time.
πŸ› οΈ Open-source models & API coming soon!

@empeje

empeje commented Jan 9, 2025

This is a fascinating topic, and I have a couple of questions:

  • When transitioning between models, do you use any form of "unit tests" or benchmarks to identify potential regressions in the new implementation?
  • Is it possible for an outsider like me to experiment with this task and collaborate on the same repo?

@gabrielfior
Contributor

Hey @empeje - feel free to experiment and collaborate on this repo.
We have plenty of unit/integration tests. Regarding benchmarks, we also have some, mainly for the betting strategies (see https://github.com/gnosis/prediction-market-agent-tooling/blob/main/examples/monitor/match_bets_with_langfuse_traces.py), but we haven't done much work on benchmarks across models.
@kongzii anything to add here?

@empeje

empeje commented Jan 13, 2025

Thanks @gabrielfior for the reply; I'm going to experiment with it.

It's nice that tests and benchmarks are available.

@kongzii
Contributor Author

kongzii commented Jan 14, 2025

We do have a benchmark to compare agent implementations as well: https://github.com/gnosis/prediction-market-agent/blob/main/prediction_market_agent/agents/think_thoroughly_agent/benchmark.py#L101. But of course it's not as accurate as simply running the agents for a while and then checking their predictions retrospectively once the markets have resolved.
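
To make that retrospective check concrete, here is a minimal sketch of scoring stored predictions against resolved outcomes (the data structure and function names are made up for illustration; they are not the actual repo API):

# Illustrative only: hypothetical record of an agent's prediction on a
# market that has since resolved.
from dataclasses import dataclass

@dataclass
class ResolvedPrediction:
    market_id: str
    predicted_p_yes: float  # probability the agent gave to YES
    resolved_yes: bool      # actual outcome after resolution

def accuracy(predictions: list[ResolvedPrediction], threshold: float = 0.5) -> float:
    # Fraction of resolved markets where the agent picked the right side.
    if not predictions:
        return 0.0
    correct = sum((p.predicted_p_yes >= threshold) == p.resolved_yes for p in predictions)
    return correct / len(predictions)

def brier_score(predictions: list[ResolvedPrediction]) -> float:
    # Mean squared error between predicted probability and outcome (lower is better).
    return sum((p.predicted_p_yes - float(p.resolved_yes)) ** 2 for p in predictions) / len(predictions)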

@empeje

empeje commented Jan 14, 2025

An update on my research: it looks like we can test DeepSeek directly by just changing the OpenAI base URL and passing a DeepSeek token there. See the following code from their official documentation:

# pip3 install langchain_openai
# python3 deepseek_v2_langchain.py
from langchain_openai import ChatOpenAI

# DeepSeek exposes an OpenAI-compatible API, so ChatOpenAI works
# once the base URL and API key are overridden.
llm = ChatOpenAI(
    model='deepseek-chat',
    openai_api_key='<your api key>',
    openai_api_base='https://api.deepseek.com',
    max_tokens=1024,
)

response = llm.invoke("Hi!")
print(response.content)

@empeje

empeje commented Jan 16, 2025

If I were to implement DeepSeek support, what is the common way to do it in this codebase? I imagine I could just change the model type, but I'm wondering whether we should have a feature flag to switch between agents so we can compare models.

@kongzii
Contributor Author

kongzii commented Jan 16, 2025

a feature flag to switch between this agent so we can compare models.

Yes, that's good thinking. A good agent family to start with is the Prophet family; check out this file.

There, you will see that each agent is a separate class, and each class comes with its own model configuration:

(screenshot: the Prophet agent classes, each with its own model configuration)

This was fine until now, as all Prophets use OpenAI; now, however, you will also need to change the API base URL, not just the model string, as you showed above. So that's something you need to add in there.

The Prophet agent itself is defined in another repository, so you may need to make some modifications there as well.
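
As a rough, hypothetical sketch of what such a per-agent configuration could look like (class and attribute names are illustrative, not the actual ones in the Prophet repos):

# Hypothetical sketch: a Prophet-style agent configuration that also carries
# an API base URL, so a DeepSeek-backed agent can reuse the OpenAI-compatible
# client shown above. Names are illustrative, not the repo's actual classes.
from langchain_openai import ChatOpenAI

class DeepSeekProphetAgent:
    model: str = "deepseek-chat"
    # Extra field compared to the existing OpenAI-only Prophet agents:
    openai_api_base: str = "https://api.deepseek.com"

    def build_llm(self, api_key: str) -> ChatOpenAI:
        # ChatOpenAI accepts a custom base URL, so the same code path can
        # serve both OpenAI and DeepSeek models.
        return ChatOpenAI(
            model=self.model,
            openai_api_key=api_key,
            openai_api_base=self.openai_api_base,
            max_tokens=1024,
        )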
