Something wrong when parsing the JSON data returned by Ollama model #2996

Open
jerry0li opened this issue Sep 27, 2024 · 5 comments
@jerry0li

Description

I'd like to create a connector to call my self-hosted Ollama model, but the predict call fails with a MalformedJsonException. I don't know what is happening under the hood.

To Reproduce

Steps to reproduce the behavior:

  • add trusted endpoints
PUT /_cluster/settings
{
    "persistent": {
        "plugins.ml_commons.trusted_connector_endpoints_regex": [
          "^https://runtime\\.sagemaker\\..*[a-z0-9-]\\.amazonaws\\.com/.*$",
          "^https://api\\.openai\\.com/.*$",
          "^https://api\\.cohere\\.ai/.*$",
          "^http://10.0.221.10:11434/.*$",
          "^https://bedrock-runtime\\..*[a-z0-9-]\\.amazonaws\\.com/.*$"
        ]
    }
}

http://10.0.221.10:11434 is my self-hosted endpoint for the Ollama service. It works, so that part is not the problem.

  • create new connector
PUT /_plugins/_ml/connectors/FHngK5IBq9eck1Nzi3LO
{
  "name": "ollama test Connector",
  "version": "1",
  "description": "The connector to ollama k8s service for gemma-2b model",
  "protocol": "http",
  "parameters": {
    "endpoint": "10.0.221.10:11434",
    "model": "gemma:2b"
  },
  "actions": [
    {
      "action_type": "PREDICT",
      "method": "POST",
      "url": "http://${parameters.endpoint}/api/generate",
      "headers": {
        "content-type": "application/json"
      },
      "request_body": "{\"prompt\": \"${parameters.prompt}\", \"model\": \"${parameters.model}\"}"
    }
  ]
}

The model then deploys successfully: its status shows as Responding when I click the Machine Learning button in the sidebar.

  • test the LLM
POST /_plugins/_ml/models/zmyRCI4BqnlXhWIYdcO-/_predict
{
  "parameters": {
    "prompt": "hello"
  }
}

Error message:

{
  "error": {
    "root_cause": [
      {
        "type": "json_syntax_exception",
        "reason": "json_syntax_exception: com.google.gson.stream.MalformedJsonException: Use JsonReader.setLenient(true) to accept malformed JSON at line 2 column 2 path $"
      }
    ],
    "type": "m_l_exception",
    "reason": "m_l_exception: Fail to execute PREDICT in aws connector",
    "caused_by": {
      "type": "json_syntax_exception",
      "reason": "json_syntax_exception: com.google.gson.stream.MalformedJsonException: Use JsonReader.setLenient(true) to accept malformed JSON at line 2 column 2 path $",
      "caused_by": {
        "type": "i_o_exception",
        "reason": "Use JsonReader.setLenient(true) to accept malformed JSON at line 2 column 2 path $"
      }
    }
  },
  "status": 500
}

OpenSearch logs:

[2024-09-27T07:58:07,285][ERROR][o.o.m.e.a.r.MLSdkAsyncHttpResponseHandler] [opensearch-cluster-master-0] Failed to process response body: {"model":"gemma:2b","created_at":"2024-09-27T07:58:05.003890864Z","response":"Hello","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:05.124841745Z","response":"!","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:05.240781677Z","response":" 👋","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:05.339080794Z","response":" It","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:05.444416871Z","response":"'","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:05.554094564Z","response":"s","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:05.656953818Z","response":" a","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:05.771084706Z","response":" pleasure","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:05.90307112Z","response":" to","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.000891211Z","response":" hear","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.096494405Z","response":" from","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.192240104Z","response":" you","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.293269298Z","response":".","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.40386598Z","response":" What","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.506886552Z","response":" can","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.607956432Z","response":" I","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.734589096Z","response":" do","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.864589349Z","response":" for","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:06.979008785Z","response":" you","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:07.083738548Z","response":" today","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:07.181483798Z","response":"?","done":false}
{"model":"gemma:2b","created_at":"2024-09-27T07:58:07.288493514Z","response":"","done":true,"context":[106,1645,108,17534,107,108,106,2516,108,4521,235341,169692,1165,235303,235256,476,15241,577,4675,774,692,235265,2439,798,590,749,604,692,3646,235336,107,108],"total_duration":2394408066,"load_duration":2340939,"prompt_eval_duration":106764000,"eval_count":22,"eval_duration":2284379000}

com.google.gson.JsonSyntaxException: com.google.gson.stream.MalformedJsonException: Use JsonReader.setLenient(true) to accept malformed JSON at line 2 column 2 path $

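As we can see, OpenSearch received the response from my self-hosted Ollama service. The problem is in parsing the JSON: the body is a sequence of JSON objects, one per line, rather than a single JSON document, which matches the error position (line 2 column 2).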

Thank you for reading this and helping a dizzy developer.
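
A minimal Python sketch of the failure described above, for illustration only. It is not ml-commons code: the two chunks are abridged from the log, and the function names `parse_as_single_document` and `join_streamed_chunks` are hypothetical. It shows that a parser expecting exactly one JSON value rejects the streamed body, while parsing line by line recovers the text.

```python
import json

# Two chunks abridged from the log above, separated by a newline as in the streamed body.
body = (
    '{"model":"gemma:2b","response":"Hello","done":false}\n'
    '{"model":"gemma:2b","response":"!","done":false}'
)


def parse_as_single_document(text):
    """Mimic a strict parser that expects exactly one JSON value in the body."""
    return json.loads(text)


def join_streamed_chunks(text):
    """Parse each non-empty line as its own JSON object and join the 'response' pieces."""
    chunks = [json.loads(line) for line in text.splitlines() if line.strip()]
    return "".join(chunk.get("response", "") for chunk in chunks)


try:
    parse_as_single_document(body)
except json.JSONDecodeError as err:
    # The second object on line 2 is what a single-document parser trips over.
    print("single-document parse failed at line", err.lineno)

print(join_streamed_chunks(body))
```

Python's `json` module is used here only as a stand-in for the Gson parser named in the stack trace; the exact error text differs, but the cause is the same kind of multi-object body.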

@ylwu-amzn
Collaborator

ylwu-amzn commented Oct 1, 2024

The connector doesn't currently support streaming; that is tracked in #2484.

Can you try changing request_body in the connector to:

"request_body": "{\"prompt\": \"${parameters.prompt}\", \"model\": \"${parameters.model}\", \"stream\": false}"

@ylwu-amzn ylwu-amzn self-assigned this Oct 1, 2024
@jerry0li
Author

Hi @ylwu-amzn, I did what you suggested, but it failed. Below are the request I sent, the corresponding error response, and my connector configuration.

Hope to hear from you soon.

Thanks a lot.

  • request
POST /_plugins/_ml/models/ge4WipIB5i35lTuUYjdK/_predict
{
  "parameters": {
    "prompt": "hello"
  }
}
  • response
{
  "error": {
    "root_cause": [
      {
        "type": "status_exception",
        "reason": "Error from remote service: {\"error\":\"json: cannot unmarshal string into Go struct field GenerateRequest.stream of type bool\"}"
      }
    ],
    "type": "status_exception",
    "reason": "Error from remote service: {\"error\":\"json: cannot unmarshal string into Go struct field GenerateRequest.stream of type bool\"}"
  },
  "status": 400
}
  • below is my connector configuration
{
  "name": "ollama test Connector",
  "version": "1",
  "description": "The connector to ollama k8s service for gemma-2b model",
  "protocol": "http",
  "parameters": {
    "endpoint": "10.0.221.10:11434",
    "stream": "false",
    "model": "gemma:2b"
  },
  "actions": [
    {
      "action_type": "PREDICT",
      "method": "POST",
      "url": "http://${parameters.endpoint}/api/generate",
      "headers": {
        "content-type": "application/json"
      },
      "request_body": """{"prompt": "${parameters.prompt}", "model": "${parameters.model}", "stream": "{parameters.stream}"}"""
    }
  ]
}

@mingshl mingshl moved this to In Progress in ml-commons projects Nov 5, 2024
@ylwu-amzn
Collaborator

Change request_body to:

"request_body": """{"prompt": "${parameters.prompt}", "model": "${parameters.model}", "stream": ${parameters.stream}}"""
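
To illustrate why the quotes around the placeholder matter, here is a small Python sketch. It assumes nothing about how ml-commons performs substitution internally: `string.Template` is only a stand-in that happens to share the `${name}` syntax, and the placeholder names drop the `parameters.` prefix because `Template` identifiers cannot contain dots. A quoted placeholder produces a JSON string, which is consistent with the "cannot unmarshal string into Go struct field GenerateRequest.stream of type bool" error above; an unquoted one produces a JSON boolean.

```python
import json
from string import Template

# Hypothetical parameter values mirroring the connector configuration in this thread.
params = {"prompt": "hello", "model": "gemma:2b", "stream": "false"}

# Placeholder wrapped in quotes: the substituted value becomes the JSON string "false".
quoted = Template('{"prompt": "${prompt}", "model": "${model}", "stream": "${stream}"}')

# Placeholder without quotes: the substituted value becomes the JSON literal false.
unquoted = Template('{"prompt": "${prompt}", "model": "${model}", "stream": ${stream}}')

quoted_body = json.loads(quoted.substitute(params))
unquoted_body = json.loads(unquoted.substitute(params))

print(type(quoted_body["stream"]).__name__)    # type of the quoted variant
print(type(unquoted_body["stream"]).__name__)  # type of the unquoted variant
```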

@ylwu-amzn
Collaborator

@jerry0li Can you confirm whether this works?

@jerry0li
Author

@ylwu-amzn Sorry for the late response. I'm afraid it doesn't work, and it won't work unless the Ollama response format below is supported.

  • root cause: the Ollama response format differs from that of OpenAI, Claude, etc.

  • Ollama response:

{
  "model": "llama3.2",
  "created_at": "2023-12-12T14:13:43.416799Z",
  "message": {
    "role": "assistant",
    "content": "Hello! How are you today?"
  },
  "done": true,
  "total_duration": 5191566416,
  "load_duration": 2154458,
  "prompt_eval_count": 26,
  "prompt_eval_duration": 383809000,
  "eval_count": 298,
  "eval_duration": 4799921000
}
  • And ChatGPT response:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "logprobs": null,
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21,
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  }
}

Does that make things clear?

Glad to hear from you!

Thanks!
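
The difference between the two response shapes can be sketched in Python as follows. This is only an illustration of where the generated text lives in each payload; `extract_text` is a hypothetical helper, not part of ml-commons or of either API, and the sample payloads are abridged from the examples above.

```python
def extract_text(payload):
    """Return the assistant text from any of the response shapes seen in this thread."""
    if "choices" in payload:
        # OpenAI-style: text is nested under choices[0].message.content.
        return payload["choices"][0]["message"]["content"]
    if "message" in payload:
        # Ollama shape shown above: text is under message.content.
        return payload["message"]["content"]
    # Ollama /api/generate shape from the earlier log: text is under "response".
    return payload.get("response", "")


# Abridged from the Ollama example above.
ollama_payload = {
    "model": "llama3.2",
    "message": {"role": "assistant", "content": "Hello! How are you today?"},
    "done": True,
}

# Abridged from the ChatGPT example above.
openai_payload = {
    "model": "gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "\n\nHello there, how may I assist you today?",
            },
            "finish_reason": "stop",
        }
    ],
}

print(extract_text(ollama_payload))
print(extract_text(openai_payload).strip())
```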
