[Feature]: Add Support for OpenAI's o1 Family of Reasoning Models #441

Open
robellegate opened this issue Jan 12, 2025 · 0 comments

robellegate commented Jan 12, 2025

Description

OpenAI's family of reasoning models does not support system messages. If you try to use one with OpenCommit today, the API request fails with an HTTP 400 error.

Addresses #442

Steps to Reproduce

1. Set OpenCommit to use a reasoning model

oco config set OCO_MODEL=o1-mini

2. Change max tokens for input/output (optional)

Use the appropriate values from here for the model specified above.

oco config set OCO_TOKENS_MAX_INPUT=128000
oco config set OCO_TOKENS_MAX_OUTPUT=65536

3. Stage some changes and attempt to generate a commit message

$ oco
┌  open-commit
│
◇  9 staged files:
  .cspell.json
  .editorconfig
  .github/dependabot.yml
  .github/workflows/ci.yml
  .github/workflows/spellcheck.yml
  .gitignore
  .markdownlint.yaml
  .vscode/extensions.json
  lychee.toml
│
◇  ✖ Failed to generate the commit message
BadRequestError3: 400 Unsupported value: 'messages[0].role' does not support 'system' with this model.
    at APIError3.generate (path/to/lib/node_modules/opencommit/out/cli.cjs:60664:14)
    at OpenAI.makeStatusError (path/to/lib/node_modules/opencommit/out/cli.cjs:60213:22)
    at OpenAI.makeRequest (path/to/lib/node_modules/opencommit/out/cli.cjs:60256:24)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async OpenAiEngine.generateCommitMessage (path/to/lib/node_modules/opencommit/out/cli.cjs:63730:28)
    at async generateCommitMessageByDiff (path/to/lib/node_modules/opencommit/out/cli.cjs:64405:27)
    at async generateCommitMessageFromGitDiff (path/to/lib/node_modules/opencommit/out/cli.cjs:64617:25)
    at async trytm (path/to/lib/node_modules/opencommit/out/cli.cjs:64584:18)
    at async commit (path/to/lib/node_modules/opencommit/out/cli.cjs:64786:35) {
  status: 400,
  headers: {
    'access-control-expose-headers': 'X-Request-ID',
    'alt-svc': 'h3=":443"; ma=86400',
    'cf-cache-status': 'DYNAMIC',
    'cf-ray': 'REDACTED',
    connection: 'keep-alive',
    'content-length': '221',
    'content-type': 'application/json',
    date: 'Sun, 12 Jan 2025 00:31:44 GMT',
    'openai-organization': 'user-REDACTED',
    'openai-processing-ms': '13',
    'openai-version': '2020-10-01',
    server: 'cloudflare',
    'set-cookie': '__cf_bm=REDACTED; path=/; expires=Sun, 12-Jan-25 01:01:44 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None, _cfuvid=REDACTED; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None',
    'strict-transport-security': 'max-age=31536000; includeSubDomains; preload',
    'x-content-type-options': 'nosniff',
    'x-ratelimit-limit-requests': '500',
    'x-ratelimit-limit-tokens': '200000',
    'x-ratelimit-remaining-requests': '499',
    'x-ratelimit-remaining-tokens': '195903',
    'x-ratelimit-reset-requests': '120ms',
    'x-ratelimit-reset-tokens': '1.228s',
    'x-request-id': 'REDACTED'
  },
  request_id: 'REDACTED',
  error: {
    message: "Unsupported value: 'messages[0].role' does not support 'system' with this model.",
    type: 'invalid_request_error',
    param: 'messages[0].role',
    code: 'unsupported_value'
  },
  code: 'unsupported_value',
  param: 'messages[0].role',
  type: 'invalid_request_error'
}
│
└  ✖ 400 Unsupported value: 'messages[0].role' does not support 'system' with this model.
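
The error object above exposes `status`, `code`, and `param`, so this failure could in principle be recognized at runtime rather than by model name. A minimal sketch, assuming the thrown error keeps the shape shown in the log; `ApiErrorLike` and `isUnsupportedSystemRoleError` are hypothetical names, not part of OpenCommit or the OpenAI SDK:

```typescript
// Hypothetical shape: only the fields visible in the log above are modeled.
interface ApiErrorLike {
  status?: number;
  code?: string | null;
  param?: string | null;
}

// Returns true when an error looks like the "system role unsupported" rejection.
function isUnsupportedSystemRoleError(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false;
  const e = err as ApiErrorLike;
  return (
    e.status === 400 &&
    e.code === 'unsupported_value' &&
    typeof e.param === 'string' &&
    // Matches 'messages[0].role', 'messages[3].role', and so on.
    /^messages\[\d+\]\.role$/.test(e.param)
  );
}
```

A check like this could back a retry path that strips system messages only after the API has rejected them, at the cost of one wasted request.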

Suggested Solution

Here’s a suggested approach for handling models that do not support a system role. The overall idea is:

  1. Detect that a requested model does not allow system messages.
  2. Filter out system messages from the conversation before sending the request to the API.
  3. Fall back to an alternative approach (either merge the system content into the first user message or omit it entirely) so that the API no longer fails with a 400 error.

Below is an outline of the solution flow and how you could integrate it into the existing OpenCommit codebase:


1. Distinguish Reasoning Models from Other Models

Add a small helper in your OpenAiEngine (or in a utility module) to identify if the requested model is known to disallow system messages. The easiest approach is a small map or list to check against, for example:

function isReasoningModel(modelName: string): boolean {
  const reasoningFamilyModels = [
    'o1-mini',
    'o1',
    'o1-128k',
    // Extend with any future reasoning model IDs
  ];
  return reasoningFamilyModels.includes(modelName.toLowerCase());
}
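
An exact-match list will miss a model ID that carries a date suffix. The variant below is a sketch under the assumption that dated snapshots follow a `<base>-YYYY-MM-DD` naming pattern; the base list and the name `isReasoningModelByPrefix` are illustrative only:

```typescript
// Assumed base IDs, taken from the list above; extend as needed.
const REASONING_MODEL_BASES: readonly string[] = ['o1', 'o1-mini'];

// Assumed snapshot suffix format: "-YYYY-MM-DD".
const DATE_SUFFIX = /^-\d{4}-\d{2}-\d{2}$/;

function isReasoningModelByPrefix(modelName: string): boolean {
  const name = modelName.trim().toLowerCase();
  return REASONING_MODEL_BASES.some(
    (base) =>
      name === base ||
      // Accept the base ID followed by nothing but a date suffix.
      (name.startsWith(base) && DATE_SUFFIX.test(name.slice(base.length)))
  );
}
```

Requiring the remainder to be a date keeps unrelated IDs that merely start with `o1` from matching by accident.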

2. Filter Out system Messages When Appropriate

Inside generateCommitMessage(), intercept the incoming messages. If the user’s config specifies one of the reasoning models, then remove or transform any message that has role: 'system'.

A. Simple Omission

public async generateCommitMessage(
  messages: Array<OpenAI.Chat.Completions.ChatCompletionMessageParam>
): Promise<string | null> {
  // 1) If model is from the reasoning family, filter out system messages.
  if (isReasoningModel(this.config.model)) {
    messages = messages.filter((m) => m.role !== 'system');
  }

  // 2) Continue as normal with temperature, top_p, etc.
  const params = {
    model: this.config.model,
    messages,
    temperature: 0,
    top_p: 0.1,
    max_tokens: this.config.maxTokensOutput
  };

  // 3) Remainder of existing code
  try {
    const completion = await this.client.chat.completions.create(params);
    const message = completion.choices[0].message;
    return message?.content || null;
  } catch (error) {
    throw error;
  }
}

B. Merge “System” Messages into the First User Prompt (Optional)

If you’d rather not lose the system message altogether, you can combine it into the first user message (or prepend it to the user’s existing question). This avoids an outright omission of potentially important instructions. For example:

if (isReasoningModel(this.config.model)) {
  // Gather the text of all system messages.
  // Only plain-string content is handled; array-style content parts are skipped.
  const systemPayload = messages
    .filter((m) => m.role === 'system' && typeof m.content === 'string')
    .map((m) => m.content as string)
    .join('\n');

  // Remove system messages
  messages = messages.filter((m) => m.role !== 'system');

  if (systemPayload) {
    const first = messages[0];
    if (first?.role === 'user' && typeof first.content === 'string') {
      // Prepend to the first user message without mutating the caller's object
      messages[0] = { ...first, content: `${systemPayload}\n\n${first.content}` };
    } else {
      // No leading plain-text user message: insert a new user message at index 0
      messages.unshift({ role: 'user', content: systemPayload });
    }
  }
}

This preserves the “system” instructions as part of the user’s prompt so that the model still sees them, while the request stays valid for models that do not support the system role.
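
The merge can also be pulled out into a pure function, which makes it testable without the OpenAI client. This is only a sketch: `ChatMessage` is a simplified stand-in for the SDK's message type (string content only), and `mergeSystemIntoFirstUser` is a hypothetical name.

```typescript
// Simplified stand-in for the SDK message type; only string content is modeled.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Returns a new array with system text folded into the first user message.
function mergeSystemIntoFirstUser(messages: readonly ChatMessage[]): ChatMessage[] {
  const systemText = messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content)
    .join('\n');

  // Copy the non-system messages so the input is never mutated.
  const rest: ChatMessage[] = messages
    .filter((m) => m.role !== 'system')
    .map((m) => ({ ...m }));

  if (!systemText) return rest;

  const first = rest[0];
  if (first && first.role === 'user') {
    rest[0] = { ...first, content: `${systemText}\n\n${first.content}` };
  } else {
    rest.unshift({ role: 'user', content: systemText });
  }
  return rest;
}

const merged = mergeSystemIntoFirstUser([
  { role: 'system', content: 'Write a conventional commit message.' },
  { role: 'user', content: 'diff --git a/README.md b/README.md' },
]);
console.log(merged.length); // 1
```

Inside `generateCommitMessage()`, the call would sit behind the `isReasoningModel(...)` check so that other models keep their system message unchanged.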


3. Surface a Warning or Note to Users

Because removing or collapsing the system content might degrade the commit-message quality, you could choose to surface an explanatory log message:

if (isReasoningModel(this.config.model)) {
  console.warn(
    `⚠️ The '${this.config.model}' model does not support 'system' messages. ` +
    `All system instructions are being merged into the user prompt.`
  );
}

This makes it clear to users that generation will still succeed, but that some of the benefit of a dedicated system prompt may be lost.
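
If `generateCommitMessage()` ends up being called more than once in a single run, the warning would repeat. A small guard can limit it to once per model. The sketch below is an assumption-laden illustration: `warnSystemRoleUnsupported` and the injectable `log` parameter are hypothetical, not existing OpenCommit code.

```typescript
// Models that have already triggered the warning in this process.
const warnedModels = new Set<string>();

// Logs the warning the first time a model is seen; returns whether it logged.
function warnSystemRoleUnsupported(
  model: string,
  log: (msg: string) => void = console.warn
): boolean {
  if (warnedModels.has(model)) return false;
  warnedModels.add(model);
  log(
    `The '${model}' model does not support 'system' messages. ` +
      `System instructions are being merged into the user prompt.`
  );
  return true;
}
```

Passing the logger as a parameter keeps the helper easy to test and lets the CLI route the message through its own output layer.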


Summary

By automatically detecting and omitting (or merging) system instructions for models that don’t support them, OpenCommit can gracefully handle new or specialized OpenAI models. This approach will:

  • Prevent 400 “unsupported_value” errors caused by the system role.
  • Optionally preserve system content in the first user message so that advanced instructions aren’t lost.
  • Provide a better user experience by avoiding abrupt failures and giving users a workable fallback.

Alternatives

No response

Additional Context

This issue and #442 were drafted with the help of Gitingest and ChatGPT using the o1 reasoning model.
