Support for Tools in Vertex AI Gemini #1169

Merged
merged 1 commit on Jan 22, 2025

Conversation

sberyozkin
Contributor

@sberyozkin sberyozkin commented Dec 17, 2024

Hi @geoand, I've done a lot of copying from the similar Ollama Tools support code, and I have to admit I'm not sure I'm on the right track. The Ollama code has some similarities, but it is much more involved than the Gemini code, which does not offer non-streaming support yet.

Can you please have a quick look tomorrow and point out any obvious errors before I attempt to test it :-) ?

@sberyozkin sberyozkin changed the title from "Support Tools for Vertex AI Gemini" to "Support for Tools in Vertex AI Gemini" on Dec 17, 2024
@geoand
Collaborator

geoand commented Dec 18, 2024

Thanks for looking into it @sberyozkin!

I don't see any obvious errors, but only proper testing will show the way :)

@sberyozkin
Contributor Author

@geoand Sure, I'll experiment

@sberyozkin sberyozkin force-pushed the gemini_tools branch 2 times, most recently from c7edcc5 to 5987cb8 Compare December 23, 2024 18:12
@sberyozkin
Contributor Author

I've prototyped the integration test, with a single Tool for now, and it fails, so the next task for early 2025 is to fix it :-) and then make it work for 2 tools

@geoand
Collaborator

geoand commented Dec 24, 2024

🎉

@MisterK91

MisterK91 commented Jan 13, 2025

👍 +1
@sberyozkin would be really thankful for this feature!

@sberyozkin
Contributor Author

@MisterK91 Sure, I'll resume working on it in a few days

@sberyozkin
Contributor Author

@geoand How do I run the integration tests for Ollama? There is no test code there.

@geoand
Collaborator

geoand commented Jan 17, 2025

I would just take some sample and use Ollama with it

@sberyozkin
Contributor Author

@geoand I've made a bit of progress: the integration test now fails at Jackson response deserialization time:

Cannot deserialize value of type `java.util.ArrayList<io.quarkiverse.langchain4j.vertexai.runtime.gemini.FunctionCall>` from Object value (token `JsonToken.START_OBJECT`)
 at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 8, column: 29] (through reference chain: io.quarkiverse.langchain4j.vertexai.runtime.gemini.GenerateContentResponse["candidates"]->java.util.ArrayList[0]->io.quarkiverse.langchain4j.vertexai.runtime.gemini.GenerateContentResponse$Candidate["content"]->io.quarkiverse.langchain4j.vertexai.runtime.gemini.GenerateContentResponse$Candidate$Content["parts"]->java.util.ArrayList[0]->io.quarkiverse.langchain4j.vertexai.runtime.gemini.GenerateContentResponse$Candidate$Part["functionCall"])
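
The message says Jackson met a JSON object (`START_OBJECT`) under `functionCall` while the mapped field was declared as an `ArrayList<FunctionCall>`. A minimal sketch of a model where each part holds a single call object is below. The record names are simplified, hypothetical stand-ins for the nested classes named in the reference chain, not the PR's actual code, and Jackson annotations are left out to keep the sketch dependency-free. That several calls would arrive as several parts is an assumption.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins for the response model classes.
record FunctionCall(String name, Map<String, Object> args) {}

// One part carries either text or a single function call (an object, not a list).
record Part(String text, FunctionCall functionCall) {}

record Content(String role, List<Part> parts) {}

final class PartShapeSketch {
    // Collects the function calls found across the parts of one content entry.
    static List<FunctionCall> functionCalls(Content content) {
        return content.parts().stream()
                .map(Part::functionCall)
                .filter(call -> call != null)
                .toList();
    }
}
```

With this shape the list of calls is derived from the parts, instead of being expected inside one part.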

I was not really focusing earlier on using Jackson correctly when I was copying some related Ollama code, but it looks much closer now... If you can spot something obvious in the Jackson related code in this PR, let me know please, np if nothing obvious can be seen, I'll look further sometime next week.

@sberyozkin sberyozkin force-pushed the gemini_tools branch 2 times, most recently from 98768e1 to bbc7e9b Compare January 17, 2025 19:11
@sberyozkin
Contributor Author

It must be something straightforward, I'll have a look next week

@sberyozkin
Contributor Author

Never mind, this just requires a bit more concentration

@geoand
Collaborator

geoand commented Jan 20, 2025

👍🏽

@sberyozkin
Contributor Author

sberyozkin commented Jan 20, 2025

I'm getting some inconsistent responses from Gemini, using the same request:

Request1:

Request:
- method: POST
- url: https://europe-west2-aiplatform.googleapis.com/v1/projects/${quarkus-project}/locations/europe-west2/publishers/google/models/gemini-pro:generateContent
- headers: [Authorization: Bearer ya...71], [Content-Type: application/json], [User-Agent: Quarkus REST Client], [content-length: 507]

- body: {"contents":[{"role":"user","parts":[{"text":"Write a short 1 paragraph poem about Java programming language.Set an author name to the model or deployment name which created the poem.Please also get the poem printed."}]}],"tools":[{"functionDeclarations":[{"name":"printThePoem","description":"Print the poem","parameters":{"type":"object","properties":{"poem":{"type":"string","description":"The text of the poem that must be printed"}},"required":["poem"]}}]}],"generationConfig":{"maxOutputTokens":8192}}

Note the printThePoem declaration above.
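
For readers who want to see how the `tools` section of that body is put together, here is a small sketch that builds the same nested structure with plain maps. The class and method names are hypothetical; only the JSON keys (`functionDeclarations`, `parameters`, `properties`, `required`) are taken from the logged request.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ToolDeclarationSketch {
    // Builds one "functionDeclarations" entry for a tool with a single required string parameter.
    static Map<String, Object> stringParamTool(String name, String description,
                                               String param, String paramDescription) {
        Map<String, Object> property = new LinkedHashMap<>();
        property.put("type", "string");
        property.put("description", paramDescription);

        Map<String, Object> parameters = new LinkedHashMap<>();
        parameters.put("type", "object");
        parameters.put("properties", Map.of(param, property));
        parameters.put("required", List.of(param));

        Map<String, Object> declaration = new LinkedHashMap<>();
        declaration.put("name", name);
        declaration.put("description", description);
        declaration.put("parameters", parameters);
        return declaration;
    }

    // Wraps the declarations into the top-level "tools" array seen in the request body.
    static List<Map<String, Object>> tools(List<Map<String, Object>> declarations) {
        return List.of(Map.<String, Object>of("functionDeclarations", declarations));
    }
}
```

Calling `stringParamTool("printThePoem", "Print the poem", "poem", "The text of the poem that must be printed")` yields the declaration used in the request above.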

Response1:

2025-01-20 13:25:26,132 INFO  [io.qua.lan.ver.run.gem.VertxAiGeminiRestApi$VertxAiClientLogger] (vert.x-eventloop-thread-3) Response:
- status code: 200
- headers: [Content-Type: application/json; charset=UTF-8], [Vary: X-Origin], [Vary: Referer], [Date: Mon, 20 Jan 2025 13:25:37 GMT], [Server: scaffolding on HTTPServer2], [X-XSS-Protection: 0], [X-Frame-Options: SAMEORIGIN], [X-Content-Type-Options: nosniff], [Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000], [Accept-Ranges: none], [Vary: Origin,Accept-Encoding], [Transfer-Encoding: chunked]
- body: {
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "Upon the JVM's vast stage, it gracefully performs its show,\nWith objects and classes, an intricate dance it knows.\nFrom threads to exceptions, its versatility does astound,\nA symphony of bytes, where logic and code resound.\n\n-- Bard, the Language Model"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
         ...
      ],
      "avgLogprobs": -1.2840580118113551
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 51,
    "candidatesTokenCount": 58,
    "totalTokenCount": 109
  },
  "modelVersion": "gemini-1.0-pro-002"
}

Note no function call is suggested...

Next, Request2, which is exactly the same as Request1:

2025-01-20 13:26:38,842 INFO  [io.qua.lan.ver.run.gem.VertxAiGeminiRestApi$VertxAiClientLogger] (vert.x-eventloop-thread-3) Request:
- method: POST
- url: https://europe-west2-aiplatform.googleapis.com/v1/projects/${quarkus-project}/locations/europe-west2/publishers/google/models/gemini-pro:generateContent
- headers: [Authorization: Bearer ya...71], [Content-Type: application/json], [User-Agent: Quarkus REST Client], [content-length: 507]
- body: {"contents":[{"role":"user","parts":[{"text":"Write a short 1 paragraph poem about Java programming language.Set an author name to the model or deployment name which created the poem.Please also get the poem printed."}]}],"tools":[{"functionDeclarations":[{"name":"printThePoem","description":"Print the poem","parameters":{"type":"object","properties":{"poem":{"type":"string","description":"The text of the poem that must be printed"}},"required":["poem"]}}]}],"generationConfig":{"maxOutputTokens":8192}}

Again, printThePoem is included.

And Response2:

2025-01-20 13:26:41,037 INFO  [io.qua.lan.ver.run.gem.VertxAiGeminiRestApi$VertxAiClientLogger] (vert.x-eventloop-thread-3) Response:
- status code: 200
- headers: [Content-Type: application/json; charset=UTF-8], [Vary: X-Origin], [Vary: Referer], [Date: Mon, 20 Jan 2025 13:26:52 GMT], [Server: scaffolding on HTTPServer2], [X-XSS-Protection: 0], [X-Frame-Options: SAMEORIGIN], [X-Content-Type-Options: nosniff], [Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000], [Accept-Ranges: none], [Vary: Origin,Accept-Encoding], [Transfer-Encoding: chunked]
- body: {
  "candidates": [
    {
      "finishReason": "MALFORMED_FUNCTION_CALL",
      "finishMessage": "Malformed function call: \nprint(default_api.printThePoem(poem='In realms of code, where logic flows,\\nJava stands tall, a name that knows.\\nFrom humble beans, a language grew,\\nRobust and vast, a vibrant hue.\\nObjects dance, with classes tied,\\nEncapsulation's shield, where secrets hide.\\nThreads interweave, a symphony,\\nConcurrency's grace, for all to see.\\nWith JVM's embrace, it roams afar,\\nOn countless devices, near and far.\\nFrom desktop screens to mobile might,\\nJava's reign, a dazzling sight.\\nSo raise a cup, to this code's delight,\\nJava, the language, shining bright.\\n\\n-- Bard'))\n"
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 51,
    "totalTokenCount": 51
  },
  "modelVersion": "gemini-1.0-pro-002"
}

I have no idea what `Malformed function call: \nprint(default_api.printThePoem(poem=...` means, or why the function declaration was ignored during the first call; I can't find anything related on the web.
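
Both responses above come back with HTTP 200, so the outcome has to be read from the candidate itself: Response1 has a text part with `finishReason` `STOP`, Response2 has no content and `finishReason` `MALFORMED_FUNCTION_CALL`, and a well-formed call (as in the gemini-1.5-pro response later in this thread) has a `functionCall` part, again with `STOP`. A rough sketch of telling these apart is below; the types are hypothetical and trimmed down, not the PR's response classes.

```java
import java.util.List;

final class CandidateOutcomeSketch {
    enum Outcome { TEXT, FUNCTION_CALL, MALFORMED_FUNCTION_CALL, OTHER }

    // Hypothetical, trimmed-down view of one candidate from the response.
    record Part(String text, String functionCallName) {}
    record Candidate(List<Part> parts, String finishReason) {}

    static Outcome classify(Candidate candidate) {
        // A malformed call arrives without any content, only a finish reason.
        if ("MALFORMED_FUNCTION_CALL".equals(candidate.finishReason())) {
            return Outcome.MALFORMED_FUNCTION_CALL;
        }
        List<Part> parts = candidate.parts() == null ? List.of() : candidate.parts();
        // STOP is reported for both text and function calls, so inspect the parts.
        if (parts.stream().anyMatch(p -> p.functionCallName() != null)) {
            return Outcome.FUNCTION_CALL;
        }
        if (parts.stream().anyMatch(p -> p.text() != null)) {
            return Outcome.TEXT;
        }
        return Outcome.OTHER;
    }
}
```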

@MisterK91, @waloeen, do you have some comments about the above ?

@sberyozkin sberyozkin force-pushed the gemini_tools branch 2 times, most recently from 2f75fa0 to 2ea9d8c Compare January 20, 2025 14:07
@sberyozkin
Contributor Author

sberyozkin commented Jan 20, 2025

After changing the model name to gemini-1.5-pro (version 1.0 is to be deprecated shortly, https://ai.google.dev/gemini-api/docs/models/gemini ), using the same request shown above, I got:

2025-01-20 15:47:42,149 INFO  [io.qua.lan.ver.run.gem.VertxAiGeminiRestApi$VertxAiClientLogger] (vert.x-eventloop-thread-3) Response:
- status code: 200
- headers: [Content-Type: application/json; charset=UTF-8], [Vary: X-Origin], [Vary: Referer], [Date: Mon, 20 Jan 2025 15:47:53 GMT], [Server: scaffolding on HTTPServer2], [X-XSS-Protection: 0], [X-Frame-Options: SAMEORIGIN], [X-Content-Type-Options: nosniff], [Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000], [Accept-Ranges: none], [Vary: Origin,Accept-Encoding], [Transfer-Encoding: chunked]
- body: {
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "functionCall": {
              "name": "printThePoem",
              "args": {
                "poem": "Oh, Java, with your bytecode bright,\nYou run on servers day and night.\nYour generics dance, your threads entwine,\nA robust brew, truly divine. - Gemini"
              }
            }
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        ...
      ],
      "avgLogprobs": -0.34194592996077106
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 51,
    "candidatesTokenCount": 44,
    "totalTokenCount": 95
  },
  "modelVersion": "gemini-1.5-pro-001"
}

(I got the same response once, a few days back, with gemini-1.0-pro.)

Repeating the same request then leads to the same MALFORMED_FUNCTION_CALL error again.

It does look like Gemini model tooling support is very unstable at the moment.
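
Since an identical request sometimes succeeds and sometimes returns MALFORMED_FUNCTION_CALL, one possible client-side mitigation is a bounded retry. This is only a sketch of that idea and is not something this PR implements; the `Result` type and method name are hypothetical, and every retry costs another model call.

```java
import java.util.function.Supplier;

final class MalformedCallRetrySketch {
    // Hypothetical summary of one model response.
    record Result(String finishReason, String payload) {}

    // Repeats the call while the model reports a malformed function call, up to maxAttempts times.
    static Result callWithRetry(Supplier<Result> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Result last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            last = call.get();
            if (!"MALFORMED_FUNCTION_CALL".equals(last.finishReason())) {
                return last;
            }
        }
        // Still malformed after all attempts; the caller decides how to report it.
        return last;
    }
}
```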

I think I'll get the integration test working based on the output in this comment, which should be sufficient for users to continue experimenting, and we can keep tuning it as necessary.

@sberyozkin
Contributor Author

Hi @geoand

I think I may have found a bug: Quarkus LangChain4j continues calling the AI when the AI response contains tool executions with a FinishReason.STOP, which apparently happens when the tool executions have only void responses. If you agree, please look into it. I can see where it is happening, but it is too sensitive an update for me to look into at this stage.

The other question: if one uses ChatLanguageModel directly (as in the demo), does it mean Tool invocations have to be done manually?
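
If tool requests do have to be dispatched by hand when the model is called directly, the loop would look roughly like the sketch below. All types here are hypothetical and deliberately not the LangChain4j classes. The "Success" placeholder for a tool that returns nothing mirrors the text sent back for the void tool in the follow-up request logged later in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ManualToolLoopSketch {
    // Hypothetical message and reply types for illustration only.
    record Message(String role, String text) {}
    record ToolRequest(String name, String argument) {}
    record Reply(String text, List<ToolRequest> toolRequests) {}

    static String chat(Function<List<Message>, Reply> model,
                       Map<String, Function<String, String>> tools,
                       String userPrompt, int maxTurns) {
        List<Message> history = new ArrayList<>();
        history.add(new Message("user", userPrompt));
        for (int turn = 0; turn < maxTurns; turn++) {
            Reply reply = model.apply(List.copyOf(history));
            if (reply.toolRequests().isEmpty()) {
                return reply.text(); // plain answer: the conversation is done
            }
            for (ToolRequest request : reply.toolRequests()) {
                Function<String, String> tool = tools.get(request.name());
                if (tool == null) {
                    throw new IllegalStateException("Unknown tool: " + request.name());
                }
                String result = tool.apply(request.argument());
                // Record the call and its result so the next model turn can see them.
                history.add(new Message("model", "call " + request.name()));
                history.add(new Message("tool", result == null ? "Success" : result));
            }
        }
        throw new IllegalStateException("No final answer after " + maxTurns + " turns");
    }
}
```

The `maxTurns` bound is a design choice of this sketch: it keeps a model that keeps requesting tools from looping forever.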

@sberyozkin
Contributor Author

This is minor, I'm just recording it here so that an enhancement request can be opened later: using the @P annotation (from LangChain4j) on a tool parameter loses the required status for this parameter, but it does correctly pick up the description.

@sberyozkin
Contributor Author

OK, it is getting closer. FYI, continuing the flow causes a 400 from Gemini, as the follow-up request does not currently match its expectations:

- method: POST
- url: https://europe-west2-aiplatform.googleapis.com/v1/projects/${project-id}/locations/europe-west2/publishers/google/models/gemini-1.5-pro:generateContent
- headers: [Authorization: Bearer ya...71], [Content-Type: application/json], [User-Agent: Quarkus REST Client], [content-length: 448]
- body: {"contents":[{"role":"user","parts":[{"text":"Write a short 1 paragraph poem about Java programming language.Set an author name to the model or deployment name which created the poem.Please also get the poem printed"}]}],"tools":[{"functionDeclarations":[{"name":"printThePoem","description":"Print the poem","parameters":{"type":"object","properties":{"poem":{"type":"string"}},"required":["poem"]}}]}],"generationConfig":{"maxOutputTokens":8192}}

2025-01-20 18:38:18,914 INFO  [io.qua.lan.ver.run.gem.VertxAiGeminiRestApi$VertxAiClientLogger] (vert.x-eventloop-thread-2) Response:
- status code: 200
- headers: [Content-Type: application/json; charset=UTF-8], [Vary: X-Origin], [Vary: Referer], [Date: Mon, 20 Jan 2025 18:38:30 GMT], [Server: scaffolding on HTTPServer2], [X-XSS-Protection: 0], [X-Frame-Options: SAMEORIGIN], [X-Content-Type-Options: nosniff], [Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000], [Accept-Ranges: none], [Vary: Origin,Accept-Encoding], [Transfer-Encoding: chunked]
- body: {
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "functionCall": {
              "name": "printThePoem",
              "args": {
                "poem": "Objects dance in virtual space,\nA symphony of classes, each in place.\nWith threads entwined, they build and strive,\nA digital world, where Java thrives.\n\n-- Gemini"
              }
            }
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        ...
      ],
      "avgLogprobs": -0.25292624983676643
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 41,
    "candidatesTokenCount": 43,
    "totalTokenCount": 84
  },
  "modelVersion": "gemini-1.5-pro-001"
}


Objects dance in virtual space,
A symphony of classes, each in place.
With threads entwined, they build and strive,
A digital world, where Java thrives.

-- Gemini

2025-01-20 18:38:18,956 INFO  [io.qua.lan.ver.run.gem.VertxAiGeminiRestApi$VertxAiClientLogger] (vert.x-eventloop-thread-2) Request:
- method: POST
- url: https://europe-west2-aiplatform.googleapis.com/v1/projects/${project-id}/locations/europe-west2/publishers/google/models/gemini-1.5-pro:generateContent
- headers: [Authorization: Bearer ya...71], [Content-Type: application/json], [User-Agent: Quarkus REST Client], [content-length: 764]
- body: {"contents":[{"role":"user","parts":[{"text":"Write a short 1 paragraph poem about Java programming language.Set an author name to the model or deployment name which created the poem.Please also get the poem printed"}]},{"role":"model","parts":[{"text":"null","functionCall":[{"name":"printThePoem","args":{"poem":"Objects dance in virtual space,\nA symphony of classes, each in place.\nWith threads entwined, they build and strive,\nA digital world, where Java thrives.\n\n-- Gemini"}}]}]},{"role":"tool","parts":[{"text":"Success"}]}],"tools":[{"functionDeclarations":[{"name":"printThePoem","description":"Print the poem","parameters":{"type":"object","properties":{"poem":{"type":"string"}},"required":["poem"]}}]}],"generationConfig":{"maxOutputTokens":8192}}

2025-01-20 18:38:18,985 INFO  [io.qua.lan.ver.run.gem.VertxAiGeminiRestApi$VertxAiClientLogger] (vert.x-eventloop-thread-2) Response:
- status code: 400
- headers: [Vary: X-Origin], [Vary: Referer], [Content-Type: application/json; charset=UTF-8], [Date: Mon, 20 Jan 2025 18:38:30 GMT], [Server: scaffolding on HTTPServer2], [X-XSS-Protection: 0], [X-Frame-Options: SAMEORIGIN], [X-Content-Type-Options: nosniff], [Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000], [Accept-Ranges: none], [Vary: Origin,Accept-Encoding], [Transfer-Encoding: chunked]
- body: {
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"functionCall\" at 'contents[1].parts[0]': Proto field is not repeating, cannot start list.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "contents[1].parts[0]",
            "description": "Invalid JSON payload received. Unknown name \"functionCall\" at 'contents[1].parts[0]': Proto field is not repeating, cannot start list."
          }
        ]
      }
    ]
  }
}

But I believe it should already work for tools with void responses and FinishReason.STOP (once it is fixed)...
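
The 400 above points at `contents[1].parts[0]`: `functionCall` is not a repeated field, so sending it as a JSON array is rejected, and the `"text":"null"` next to it looks like a null value serialized as a string. Below is a sketch of follow-up turns that avoid both, built with plain maps. The class and method names are hypothetical. The `functionResponse` part follows the shape in the Gemini function-calling documentation as I understand it; the role to use for that turn and the `content` key inside `response` are assumptions, not something confirmed in this thread.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FollowUpContentsSketch {
    // Model turn: "functionCall" is a single object inside the part, and no "text" key is emitted.
    static Map<String, Object> modelFunctionCallTurn(String name, Map<String, Object> args) {
        Map<String, Object> call = new LinkedHashMap<>();
        call.put("name", name);
        call.put("args", args);
        return turn("model", Map.of("functionCall", call));
    }

    // Tool result turn, assuming a "functionResponse" part; result must not be null.
    static Map<String, Object> functionResponseTurn(String role, String name, Object result) {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("name", name);
        response.put("response", Map.of("content", result));
        return turn(role, Map.of("functionResponse", response));
    }

    // One entry of "contents": a role plus a single part.
    private static Map<String, Object> turn(String role, Map<String, Object> part) {
        Map<String, Object> content = new LinkedHashMap<>();
        content.put("role", role);
        content.put("parts", List.of(part));
        return content;
    }
}
```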

In any case, once multi-turn function calling starts working, I'll open it for review a bit later.

@sberyozkin sberyozkin force-pushed the gemini_tools branch 2 times, most recently from 4942784 to 97d0f50 Compare January 20, 2025 18:58
@geoand
Collaborator

geoand commented Jan 21, 2025

> I think I may have found a bug: Quarkus LangChain4j continues calling the AI when the AI response contains tool executions with a FinishReason.STOP, which apparently happens when the tool executions have only void responses. If you agree, please look into it. I can see where it is happening, but it is too sensitive an update for me to look into at this stage.

How do I reproduce this?

@sberyozkin
Contributor Author

Hi @geoand, I may have been wrong about it, I'll get the method returning some value working first against live Gemini and then double check.
To reproduce, you'd just remove the if branch in the GeminiResource in the integration test

@geoand
Collaborator

geoand commented Jan 21, 2025

Do you mean the integration test of this PR?

@sberyozkin
Contributor Author

@geoand Sorry, was offline for a bit, yeah, of this PR, but let me make it work with live Gemini first

@geoand
Collaborator

geoand commented Jan 21, 2025

Okay, let me know when you want me to have a look

@sberyozkin sberyozkin force-pushed the gemini_tools branch 3 times, most recently from 7237cb7 to 60a7e51 Compare January 21, 2025 17:48
@sberyozkin
Contributor Author

@geoand It is getting very close now, I got a multi-turn conversation working with this response:

Hello Sergey Beryozkin, I hope you enjoy reading this poem:

**Java**

Oh, Java, with your camelCase grace,
You power the web at a rapid pace.
From Spring's embrace to servlets so grand,
Your virtual machine, a helping hand.
You've brewed a legacy, that's clear to see,
A language so strong, and forever will be.

**Gemini** 

where the user name is returned from a Tool which requires user authentication, which is what I really wanted to try.

Let me quickly check what happens with a function with a void response

@sberyozkin
Contributor Author

Hmm. "From Spring's embrace to servlets so grand", looks like Gemini needs a bit more tuning :-)

@sberyozkin
Contributor Author

@geoand void tool methods work fine, sorry about the noise

@sberyozkin sberyozkin marked this pull request as ready for review January 21, 2025 18:49
@sberyozkin sberyozkin requested a review from a team as a code owner January 21, 2025 18:49


@sberyozkin
Contributor Author

Opening for review. It will need a follow-up for the function calling mode, https://ai.google.dev/gemini-api/docs/function-calling#function_calling_mode, but what is here now worked for the demo in multi-turn mode.

Occasional intermittent malformed function call errors can be returned from both gemini-1.5-pro and gemini-1.5-flash, but hopefully this will be stabilized at the Gemini end.

I'll deal with https://ai.google.dev/gemini-api/docs/function-calling#function_calling_mode after this PR is merged, I need to catch up with a few Q. issues now :-)
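
For context on that follow-up, the linked page describes a function calling mode that is sent alongside the tool declarations. The sketch below builds the corresponding request fragment with plain maps. The field names `functionCallingConfig`, `mode` and `allowedFunctionNames` and the modes AUTO, ANY and NONE are as I understand them from that documentation; the class and method names are hypothetical, and nothing here is part of this PR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FunctionCallingModeSketch {
    enum Mode { AUTO, ANY, NONE }

    // Builds the value intended for a top-level "toolConfig" key of the request body.
    static Map<String, Object> toolConfig(Mode mode, List<String> allowedFunctionNames) {
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("mode", mode.name());
        // Restricting the callable functions is only meaningful when a call is forced (ANY).
        if (mode == Mode.ANY && !allowedFunctionNames.isEmpty()) {
            config.put("allowedFunctionNames", allowedFunctionNames);
        }
        return Map.of("functionCallingConfig", config);
    }
}
```

For example, `toolConfig(Mode.ANY, List.of("printThePoem"))` would ask the model to always answer with a call to `printThePoem`.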


quarkus-bot bot commented Jan 21, 2025

Status for workflow Build (on pull request)

This is the status report for running Build (on pull request) on commit b506fbc.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

Collaborator

@geoand geoand left a comment


This is really nice @sberyozkin, thanks a lot!

@geoand
Collaborator

geoand commented Jan 22, 2025

> Occasional intermittent malformed function call errors can be returned from both gemini-1.5-pro and gemini-1.5-flash, but hopefully this will be stabilized at the Gemini end.

Right, tool calling can sometimes be hit and miss with other models as well

@geoand geoand merged commit 722dd98 into quarkiverse:main Jan 22, 2025
71 checks passed
@sberyozkin
Contributor Author

Thanks @geoand, I have to admit, it wasn't boring working on this issue 😀

@sberyozkin sberyozkin deleted the gemini_tools branch January 22, 2025 12:20
Development

Successfully merging this pull request may close these issues.

Support for @Tools with Gemini?
3 participants