
Added generator for OpenRouter.ai #1051

Open

pedramamini wants to merge 5 commits into main
Conversation

@pedramamini

Addresses the feature request for adding an OpenRouter.ai generator, which is OpenAI-compatible. Related issue:

#692

Verification

Steps to verify that the generator works:

$ python3 -m garak --model_type openrouter --model_name anthropic/claude-3.5-sonnet --probes encoding
garak LLM vulnerability scanner v0.10.1.post1 ( https://github.com/NVIDIA/garak ) at 2024-12-24T00:31:33.473234
📜 logging to /Users/pedram/.local/share/garak/garak.log
🦜 loading generator: OpenRouter: anthropic/claude-3.5-sonnet
📜 reporting to /Users/pedram/.local/share/garak/garak_runs/garak.ebdebda1-5db4-43f5-a128-826b46f3cd70.report.jsonl
🕵️  queue of probes: encoding.InjectAscii85, encoding.InjectBase16, encoding.InjectBase2048, encoding.InjectBase32, encoding.InjectBase64, encoding.InjectBraille, encoding.InjectEcoji, encoding.InjectHex, encoding.InjectMorse, encoding.InjectNato, encoding.InjectROT13, encoding.InjectUU, encoding.InjectZalgo
probes.encoding.InjectAscii85:  22%|█████████████████▊                                                                | 13/60 

Tests pass.

github-actions bot (Contributor) commented Dec 24, 2024

DCO Assistant Lite bot: All contributors have signed the DCO ✍️ ✅

@pedramamini (Author)

I have read the DCO Document and I hereby sign the DCO

@pedramamini (Author)

Recheck

github-actions bot added a commit that referenced this pull request Dec 24, 2024
@erickgalinkin (Collaborator)

Thanks so much for your contribution @pedramamini! As of #1021, OpenAICompatible is a first-class generator -- I'm not intimately familiar with OpenRouter, but based on your code, I'd think one could simply point an OpenAICompatible generator to the appropriate OpenRouter endpoint. If there's something unique in the functionality here that limits the use of OpenAICompatible, please let me know so I can review more thoroughly. Thanks!

@jmartin-tech (Collaborator) left a comment

I agree with @erickgalinkin; this generator may not be needed at this time due to the promotion of OpenAICompatible.

The following config.json can support this functionality in versions 0.10.1 and newer:

{
   "openai": {
      "OpenAICompatible": {
         "uri": "https://openrouter.ai/api/v1",
         "max_tokens": 2000
      }
   }
}

Called as:

python -m garak -m openai.OpenAICompatible -n openai/gpt-4-turbo-preview -G config.json

There may be some value in having a named generator to provide a quick-reference uri, or if OpenRouter happens to have some other consistent default that should be set, such as the max_tokens value here.
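
For illustration, such a named generator might be little more than a subclass carrying defaults (a sketch only: the class name and ENV_VAR value are assumptions here, and it assumes OpenAICompatible applies "uri" and "max_tokens" from DEFAULT_PARAMS as described above):

    # Sketch -- not code from this PR. The uri and max_tokens values come
    # from this review; the class name and env var name are hypothetical.
    from garak.generators.openai import OpenAICompatible

    class OpenRouterGenerator(OpenAICompatible):
        """Named generator pinning the OpenRouter.ai endpoint and defaults."""

        ENV_VAR = "OPENROUTER_API_KEY"  # assumed name
        DEFAULT_PARAMS = OpenAICompatible.DEFAULT_PARAMS | {
            "uri": "https://openrouter.ai/api/v1",
            "max_tokens": 2000,
        }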

Given all that, I have done a bit of general review here to offer some clarity on how the current code may be adjusted.

garak/generators/openrouter.py (outdated)
Comment on lines +57 to +75
def _load_client(self):
    """Initialize the OpenAI client with OpenRouter.ai base URL"""
    import openai

    self.client = openai.OpenAI(
        api_key=self._get_api_key(),
        base_url="https://openrouter.ai/api/v1"
    )

    # Determine if we're using chat or completion based on model
    self.generator = self.client.chat.completions

def _get_api_key(self):
    """Get API key from environment variable"""
    import os

    key = os.getenv(self.ENV_VAR)
    if not key:
        raise ValueError(f"Please set the {self.ENV_VAR} environment variable with your OpenRouter API key")
    return key

@jmartin-tech (Collaborator):

This code is not needed: the api_key handling is built into the base class via the ENV_VAR set above, and the uri value should be set via DEFAULT_PARAMS, as noted in another comment.

Code for _load_client() seems incomplete as there is no detection of chat vs completion for setting self.generator.

Suggested change: delete the _load_client() and _get_api_key() overrides above (removal only; the suggestion contains no replacement code).
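
For reference, the chat-vs-completion detection that the comment above notes is missing might look like this (a sketch; the set of completion-only models is illustrative, not garak's actual list):

    # Illustrative sketch only: choose the API surface by model family.
    # COMPLETION_MODELS is a hypothetical set, not garak's real model list;
    # chat is assumed to be the default surface.
    COMPLETION_MODELS = {"gpt-3.5-turbo-instruct"}

    if self.name in COMPLETION_MODELS:
        self.generator = self.client.completions
    else:
        self.generator = self.client.chat.completions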

Comment on lines +88 to +158
def _log_completion_details(self, prompt, response):
    """Log completion details at DEBUG level"""
    logging.debug("=== Model Input ===")
    if isinstance(prompt, str):
        logging.debug(f"Prompt: {prompt}")
    else:
        logging.debug("Messages:")
        for msg in prompt:
            logging.debug(f"- Role: {msg.get('role', 'unknown')}")
            logging.debug(f"  Content: {msg.get('content', '')}")

    logging.debug("\n=== Model Output ===")
    if hasattr(response, 'usage'):
        logging.debug(f"Prompt Tokens: {response.usage.prompt_tokens}")
        logging.debug(f"Completion Tokens: {response.usage.completion_tokens}")
        logging.debug(f"Total Tokens: {response.usage.total_tokens}")

    logging.debug("\nGenerated Text:")
    # OpenAI response object always has choices
    for choice in response.choices:
        if hasattr(choice, 'message'):
            logging.debug(f"- Message Content: {choice.message.content}")
            if hasattr(choice.message, 'role'):
                logging.debug(f"  Role: {choice.message.role}")
            if hasattr(choice.message, 'function_call'):
                logging.debug(f"  Function Call: {choice.message.function_call}")
        elif hasattr(choice, 'text'):
            logging.debug(f"- Text: {choice.text}")

        # Log additional choice attributes if present
        if hasattr(choice, 'finish_reason'):
            logging.debug(f"  Finish Reason: {choice.finish_reason}")
        if hasattr(choice, 'index'):
            logging.debug(f"  Choice Index: {choice.index}")

    # Log model info if present
    if hasattr(response, 'model'):
        logging.debug(f"\nModel: {response.model}")
    if hasattr(response, 'system_fingerprint'):
        logging.debug(f"System Fingerprint: {response.system_fingerprint}")

    logging.debug("==================")

def _call_model(self, prompt: Union[str, List[dict]], generations_this_call: int = 1):
    """Call model and handle both logging and response"""
    try:
        # Ensure client is initialized
        if self.client is None or self.generator is None:
            self._load_client()

        # Create messages format for the API call
        messages = [{"role": "user", "content": prompt}] if isinstance(prompt, str) else prompt

        # Make a single API call to get the response
        raw_response = self.generator.create(
            model=self.name,
            messages=messages,
            n=generations_this_call if "n" not in self.suppressed_params else None,
            max_tokens=self.max_tokens if hasattr(self, 'max_tokens') else None
        )

        # Log the completion details
        self._log_completion_details(prompt, raw_response)

        # Return the full response content
        return [choice.message.content for choice in raw_response.choices]

    except Exception as e:
        logging.error(f"Error in model call: {str(e)}")
        return [None]

@jmartin-tech (Collaborator):

Again, this code is not needed: it adds a large amount of noise to the logs and is less comprehensive in handling chat vs completion than the default _call_model() implementation from OpenAICompatible.

Suggested change: delete the _log_completion_details() and _call_model() overrides above (removal only; the suggestion contains no replacement code).

@pedramamini (Author)

@jmartin-tech, I much appreciate the effort you put in here on the code review. Let me kick the tires with the config-based approach you both recommend. An issue I was having that led to this generator addition is that I wasn't receiving the complete response back from the LLM at my detector.

Those debug statements should default to off, though they are certainly unnecessary scaffolding outside of initial development efforts.

Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev>
Signed-off-by: Pedram Amini <pedram.amini@gmail.com>
@pedramamini (Author)

@jmartin-tech, forgive my ignorance, as Garak is fairly new to me and I'm nowhere near fully grasping the architecture of what you've built here. But on initial testing, the JSON-based OpenAICompatible config only partially returns LLM responses, while the dedicated OpenRouter module here returns complete results. A config-based approach would mean less code clutter if we can get it to work. Any recommendations?

@jmartin-tech (Collaborator)

@pedramamini, did you include passing max_tokens in your json config?

Can you offer details on when this occurs? Possibly specific targeted models and probe combinations. Comparable reports may also give some detail to review. I get fairly long responses in my initial tests.

I am testing with free-tier models. I have noted that when the free-tier rate limit is reached, openrouter.ai responses do not raise the openai.RateLimitError exception, which causes the current code to attempt to access None in response.choices instead of hitting a backoff.
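
For illustration, a defensive guard against that failure mode might look like this (a sketch: the create() call is abbreviated, and GarakBackoffTrigger as the retry signal is an assumption, not necessarily how garak's backoff is wired):

    # Sketch: openrouter.ai can return a body with no choices when rate
    # limited, instead of raising openai.RateLimitError. Guard before access.
    from garak.exception import GarakBackoffTrigger  # assumed retry hook

    raw_response = self.generator.create(model=self.name, messages=messages)
    if not getattr(raw_response, "choices", None):
        raise GarakBackoffTrigger("empty choices in openrouter.ai response")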

@pedramamini (Author)

Yes, I pass max_tokens and set it to 200k. I'm talking to Anthropic's Claude 3.5 Sonnet, and in my particular case I have personal keys within OpenRouter to avoid any usage limitations. The issue appears to be in how the messages are aggregated and bubbled back up. I'll have to work on reproducing the matter with data I can share.

No major rush on taking this PR.
