A lightweight caching proxy server for embedding APIs that stores embeddings in SQLite, reducing API calls and improving response times for repeated requests.
- Caches embeddings in an SQLite database
- Works with OpenAI-compatible embedding APIs (including local alternatives such as Ollama)
- Supports multiple embedding models simultaneously
- Simple HTTP API interface
- Zero configuration required for basic usage
- Cross-platform support (Linux, macOS, Windows)
Download the latest binary for your operating system from the releases page.
Run the server:
./proxemb
The server will start with default settings (localhost:35248) and create an SQLite database in the current directory.
Requirements:
- Go 1.21 or later
- Make (optional)
To compile:
make
Or manually with Go:
go build -o proxemb
Options:
-api-key string
OpenAI API key
-api-url string
OpenAI API URL (default "http://localhost:11434/v1/")
-db string
Path to the SQLite database (default "embeddings.db")
-log-file string
Log file path
-web-host string
Web server host (default "localhost")
-web-port string
Web server port (default "35248")
- Start the server with default settings:
./proxemb
- Start with custom API endpoint (e.g., for OpenAI):
./proxemb -api-url "https://api.openai.com/v1/" -api-key "your-api-key"
- Start with custom port and host:
./proxemb -web-host "0.0.0.0" -web-port "8080"
Send POST requests to the root endpoint with the following JSON structure:
{
"model": "text-embedding-3-small",
"input": "Your text to embed"
}
Example using curl:
curl -X POST http://localhost:35248/ \
-H "Content-Type: application/json" \
-d '{"model":"text-embedding-3-small","input":"Hello, world!"}'
Response format:
{
"embedding": [0.123, 0.456, ...]
}
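For programmatic use, a minimal Go client might look like the sketch below. The endpoint, default port, and JSON fields mirror the request and response shown above; the struct and variable names are illustrative only, not part of the project.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// embedRequest matches the request body documented above.
type embedRequest struct {
	Model string `json:"model"`
	Input string `json:"input"`
}

// embedResponse matches the response body documented above.
type embedResponse struct {
	Embedding []float64 `json:"embedding"`
}

func main() {
	body, err := json.Marshal(embedRequest{
		Model: "text-embedding-3-small",
		Input: "Hello, world!",
	})
	if err != nil {
		log.Fatal(err)
	}

	// POST to the proxy's root endpoint (default localhost:35248).
	resp, err := http.Post("http://localhost:35248/", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("received %d-dimensional embedding\n", len(out.Embedding))
}
```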
The SQLite database is automatically created and contains two tables:
- models: Stores model names and their IDs
- hashes: Stores the cached embeddings with MD5 hashes of input texts
The database file location can be specified using the -db flag.
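As an illustration of the caching scheme described above, the sketch below shows how an MD5 hex digest of an input text can be computed in Go. The server's actual key format and lookup logic may differ; this only demonstrates the hashing step.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

func main() {
	input := "Hello, world!"
	sum := md5.Sum([]byte(input)) // 16-byte MD5 digest of the input text
	key := hex.EncodeToString(sum[:])
	fmt.Println(key) // hex string that would identify this input in the cache
}
```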