
how do you connect to the server after running it? #7

Open
liconge opened this issue Dec 9, 2024 · 1 comment
Comments

@liconge commented Dec 9, 2024

Do I just run the mlx_lm command or the streamlit command in another terminal? If so, those commands open a page served on a port different from the omni server's default port. What is the difference?

@madroidmaq (Owner) commented Dec 11, 2024

@liconge I'm not quite sure if I fully understand your question.

If you're not sure how to use the project after it's launched, it's actually very simple: you just need to add one line to your current project, setting the OpenAI SDK's base_url to http://localhost:10240/v1.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:10240/v1",  # point the SDK at the local server
    api_key="mlx-omni-server",  # not actually checked; any placeholder works
)
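Once the client is pointed at the local server, any standard OpenAI SDK call goes through it. As a quick sanity check, something like the following should work (the model id here is just the one used in the agent example below; substitute whichever model your server is serving):

# Quick sanity check: send a plain chat completion to the local server.
# Assumes mlx-omni-server is already running on its default port (10240).
response = client.chat.completions.create(
    model="mlx-community/Qwen2.5-3B-Instruct-4bit",  # example model id
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)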

If you're not sure what to build on top of this project, take a look at the examples in the examples directory. You can also quickly try it with other libraries that support configuring an OpenAI client, such as the tools-based example I added in #6.

from openai import OpenAI
from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo

# Use mlx-omni-server to provide local OpenAI service
client = OpenAI(
    base_url="http://localhost:10240/v1",
    api_key="mlx-omni-server",  # not-needed
)

web_agent = Agent(
    model=OpenAIChat(
        client=client,
        id="mlx-community/Qwen2.5-3B-Instruct-4bit",
    ),
    tools=[DuckDuckGo()],
    instructions=["Always include sources"],
    show_tool_calls=True,
    markdown=True,
)
web_agent.print_response("Tell me about Apple MLX?", stream=False)
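
Because the server exposes the standard OpenAI API, the same pattern should work with any library that lets you supply a pre-configured OpenAI client; only the base_url needs to change.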

Running it produces the following output:

[screenshot of the agent's response]
