[Bug]: Controlled generation fails, truncating the response when it includes markdown code #1372
Comments
Does this occur with
Hi @holtskinner, it occurs on both of them. I updated the Relevant log output to include
Thanks for your feedback. I think this could be a bug in Controlled Generation relating to the markdown output. I'm going to report it to the product team.
@LBUPD33 Question on your use case: why do you need the model output to be in JSON, since you're just reading in the one field
@holtskinner The provided code is only for demonstrating the bug; I went for the simplest way to reproduce it. The real use case included multiple fields, such as
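(For illustration, a multi-field Controlled Generation config along those lines might look like the sketch below. The field names and schema are assumptions, since the reporter's actual fields were cut off above.)

```python
from vertexai.generative_models import GenerationConfig

# Hypothetical multi-field schema -- the real field names were truncated
# in the comment above, so these are placeholders.
response_schema = {
    "type": "OBJECT",
    "properties": {
        "response_content": {"type": "STRING"},
        "response_type": {"type": "STRING"},
        "sources": {"type": "ARRAY", "items": {"type": "STRING"}},
    },
    "required": ["response_content"],
}

generation_config = GenerationConfig(
    response_mime_type="application/json",
    response_schema=response_schema,
)
```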
Thanks for the context. This issue seems to be with how the Controlled Generation backend parses the markdown-formatted JSON blocks created by Gemini. The product team is working on a fix. In the meantime, I found a workaround that should handle your use case (not using Controlled Generation and doing manual markdown parsing):

```python
import json
import re

import vertexai
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    SafetySetting,
)

gcp_project = "your-gcp-project-id"  # Replace with your own project ID


def extract_json_block(markdown_string: str) -> str | None:
    """
    Extracts the outermost JSON code block from a markdown string.

    Args:
        markdown_string: The markdown string to extract the code block from.

    Returns:
        The outer JSON code block if found, otherwise None.
    """
    pattern = r"```json\n(\{\n.*?\n\})\n```"
    match = re.search(pattern, markdown_string, re.DOTALL | re.MULTILINE)
    return match.group(1) if match else None


def multiturn_generate_content():
    vertexai.init(project=gcp_project, location="us-central1")
    model = GenerativeModel(
        "gemini-1.5-flash-002",
        system_instruction=[textsi_1],
    )
    chat = model.start_chat()
    response = chat.send_message(
        [
            """Use Markdown to format your answer. Could you explain me what is map() function in javascript and give me an example? Respond in the following JSON structure:
{
    "response_content": "Insert Answer Here",
    "response_type": "markdown" # Or whatever this should be
}
"""
        ],
        generation_config=generation_config,
        safety_settings=safety_settings,
    )
    # Parse the ```json ... ``` block out of the markdown response by hand
    # instead of relying on Controlled Generation.
    json_block = extract_json_block(response.text)
    json_object = json.loads(json_block)
    print(json_object["response_content"])
    print(json_object["response_type"])


textsi_1 = """"""  # System instruction left empty in this example
generation_config = GenerationConfig(
    max_output_tokens=8192, temperature=0.5, top_p=0.95
)
safety_settings = [
    SafetySetting(
        category=SafetySetting.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold=SafetySetting.HarmBlockThreshold.OFF,
    ),
    SafetySetting(
        category=SafetySetting.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=SafetySetting.HarmBlockThreshold.OFF,
    ),
    SafetySetting(
        category=SafetySetting.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        threshold=SafetySetting.HarmBlockThreshold.OFF,
    ),
    SafetySetting(
        category=SafetySetting.HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold=SafetySetting.HarmBlockThreshold.OFF,
    ),
]

multiturn_generate_content()
```
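As a quick sanity check of the regex, here is a standalone run of extract_json_block on an invented markdown response (the sample text is made up for illustration):

```python
# Invented sample wrapped in a ```json fence, mirroring what Gemini returns
# when asked for markdown-formatted JSON.
sample = (
    "Here is the JSON you asked for:\n"
    "```json\n"
    "{\n"
    '"response_content": "The map() method creates a new array.",\n'
    '"response_type": "markdown"\n'
    "}\n"
    "```\n"
)
print(extract_json_block(sample))
# Prints the inner {...} block, which json.loads() can then parse.
```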
@holtskinner Thank you, good news that the product team is working on a fix!
File Name
GCP Vertex Studio & Google Colab
What happened?
Context
Trying to reproduce a Gemini chatbot for internal use: dedicated system instructions and restricted access (using gauth).
Technical constraints
Bug
Output gets truncated when it tries to include markdown code.
Tests done
Vertex Studio Params
Python code
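(The reporter's snippet was not captured in this extract. Below is a minimal sketch of the kind of controlled-generation call under test, reconstructed as an assumption from the prompt and the outputs that follow.)

```python
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="your-gcp-project-id", location="us-central1")

# Single-field schema assumed from the truncated '{"response_content": "'
# visible in the outputs below.
response_schema = {
    "type": "OBJECT",
    "properties": {"response_content": {"type": "STRING"}},
    "required": ["response_content"],
}

model = GenerativeModel("gemini-1.5-pro-002")
response = model.generate_content(
    "Use Markdown to format your answer. Could you explain me what is "
    "map() function in javascript and give me an example?",
    generation_config=GenerationConfig(
        response_mime_type="application/json",
        response_schema=response_schema,
    ),
)
print(response)  # Observed: generation stops right after the opening field
```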
Output
1.5 Flash 001
1.5 Flash 002
1.5 Pro 001

```
candidates {
  content {
    role: "model"
    parts { text: "{\n\"response_content\": \"" }
  }
  finish_reason: STOP
  citation_metadata {
    citations {
      start_index: 1064
      end_index: 1187
      uri: "https://www.spritely.net/how-to-put-multiple-objects-in-one-var-array-javascript/"
    }
  }
  avg_logprobs: -3.1444194316864014
}
usage_metadata {
  prompt_token_count: 30
  candidates_token_count: 8
  total_token_count: 38
}
model_version: "gemini-1.5-pro-001"
```

1.5 Pro 002

```
candidates {
  content {
    role: "model"
    parts { text: "{\"response_content\": \"" }
  }
  finish_reason: STOP
  citation_metadata {
    citations {
      start_index: 220
      end_index: 343
      uri: "https://falytom.com/javascript-array-methods-examples/"
    }
  }
  avg_logprobs: -11.301958719889322
}
usage_metadata {
  prompt_token_count: 30
  candidates_token_count: 6
  total_token_count: 36
}
model_version: "gemini-1.5-pro-002"
```