Search Results Missing Document Metadata in R2R v3.2.41 #1615

Open
Jacob411 opened this issue Nov 20, 2024 · 1 comment

Comments

@Jacob411

Describe the bug
Search results from the /search endpoint in R2R v3.2.41 are missing document metadata (particularly titles) that is present in the document store and accessible via the /documents_overview endpoint. This affects both agent and direct search results.

To Reproduce
Steps to reproduce the behavior:

  1. Start R2R v3.2.41 Docker container with default configuration
  2. Upload any document with metadata (including title)
  3. Make a GET request to /documents_overview and note the complete metadata
  4. Make a POST request to the /search endpoint
  5. Observe that the search results are missing the expected metadata fields
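
Steps 3 and 4 can be sketched as a small script. This is a minimal sketch, not the project's client: only the `/documents_overview` and `/search` paths and their HTTP methods come from this report. The base URL `http://localhost:7272/v2`, the `{"query": ...}` request body, and all helper names are assumptions for illustration and may need adjusting for your deployment.

```python
import json
import urllib.request

# Assumption: where the Docker container is reachable; adjust as needed.
BASE_URL = "http://localhost:7272/v2"


def build_overview_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Step 3: GET /documents_overview."""
    return urllib.request.Request(f"{base_url}/documents_overview", method="GET")


def build_search_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Step 4: POST /search. The JSON body shape is an assumption."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def reproduce(query: str = "What is the capital of France?") -> None:
    """Print both responses so their metadata can be compared by eye."""
    print(json.dumps(fetch_json(build_overview_request()), indent=2))
    print(json.dumps(fetch_json(build_search_request(query)), indent=2))
```

Calling `reproduce()` against a running instance prints both responses side by side; nothing is sent until it is called.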

Expected behavior
Search results should include complete document metadata in the response:

"metadata": {
    "associated_query": "What is the capital of France?",
    "title": "example_document.pdf"
}

Current behavior
Instead, search results only contain limited metadata:

"metadata": {
    "version": "v0",
    "chunk_order": 67,
    "document_type": "pdf",
    "associated_query": "ds"
}
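
To make the gap concrete, the expected and current payloads can be compared key by key. A minimal sketch; `missing_metadata_keys` is a hypothetical helper and not part of R2R.

```python
def missing_metadata_keys(expected: dict, actual: dict) -> list[str]:
    """Return the expected metadata keys that are absent from actual, sorted."""
    return sorted(set(expected) - set(actual))


# Metadata copied from the expected and current examples in this report.
expected = {
    "associated_query": "What is the capital of France?",
    "title": "example_document.pdf",
}
current = {
    "version": "v0",
    "chunk_order": 67,
    "document_type": "pdf",
    "associated_query": "ds",
}

print(missing_metadata_keys(expected, current))  # ['title']
```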

Screenshots

  • No title in search metadata.
image
  • Title available on overview
image

Environment:

  • R2R Version: 3.2.41
  • Deployment: Docker
  • Configuration: Default settings
  • OS: N/A (Docker container); macOS for the frontend

Additional context

  • Issue occurs with default search configuration
  • Documents have correct metadata when viewed through /documents_overview
  • Issue affects all search results consistently
  • Impacts both vector search and agent-based search results
  • Currently requires additional API calls to retrieve complete document information
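
Until search results carry document metadata, the extra lookup mentioned in the last point can be done client-side by joining search results against the overview. A minimal sketch, assuming each search result has a `document_id` field and each overview entry has `id`, `title`, and `metadata` fields. These field names and the `enrich_results` helper are assumptions, not confirmed by this report.

```python
def enrich_results(search_results: list[dict], documents: list[dict]) -> list[dict]:
    """Merge document-level metadata into each search result's metadata.

    Assumed (hypothetical) shapes:
      search result:  {"document_id": ..., "metadata": {...}, ...}
      overview entry: {"id": ..., "title": ..., "metadata": {...}}
    """
    docs_by_id = {doc["id"]: doc for doc in documents}
    enriched = []
    for result in search_results:
        # Results whose document is unknown are passed through unchanged.
        doc = docs_by_id.get(result.get("document_id"), {})
        merged = dict(doc.get("metadata") or {})
        if doc.get("title") is not None:
            merged["title"] = doc["title"]
        # Chunk-level metadata wins when both sides define the same key.
        merged.update(result.get("metadata") or {})
        # Build a new dict so the caller's search results are not mutated.
        enriched.append({**result, "metadata": merged})
    return enriched
```

This costs one overview request per batch of results rather than one lookup per result, but it is a workaround, not a fix for the missing fields.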
@NolanTrem
Collaborator

This is an awesome catch, thanks for flagging it! We're currently wrapping up a large release and will have a chance to circle back to fixing this once it's out. If you find the bug and are willing to make a PR in the meantime, we'd love to have you as a contributor!
