
[Bug]: running with asyncio loop in uvicorn causes WorkflowHandler.pending to not be handled correctly in Phoenix (Arize) #17436

Open
SvenWesterlaken opened this issue Jan 6, 2025 · 15 comments
Labels: bug (Something isn't working), triage (Issue needs to be triaged/prioritized)

Comments

SvenWesterlaken commented Jan 6, 2025

Bug Description

I am running my Python environment for LlamaIndex inside a Docker container. To prevent startup errors after introducing ragas into the project, I added --loop asyncio to the start command. Everything seems to be working fine except for one thing: the traces of a workflow don't combine into a single grouped trace. It's basically the same problem as #16283 - Phoenix shows "<WorkflowHandler pending>" and splits up the steps.
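(For context on the symptom: "<WorkflowHandler pending>" is simply the default repr of an asyncio Future subclass that hasn't finished yet, so it shows up whenever the handler is rendered before it is awaited. A minimal sketch; the class below is a stand-in, not the real WorkflowHandler:)

import asyncio

class WorkflowHandler(asyncio.Future):  # stand-in for the real handler class
    pass

async def main():
    handler = WorkflowHandler()
    print(handler)           # -> <WorkflowHandler pending>
    handler.set_result("Hi, world!")
    print(await handler)     # -> Hi, world!

asyncio.run(main())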

The docker image start command before (and screenshot of output in phoenix):

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80", "--reload"]

[screenshot: Phoenix trace view with the workflow steps grouped under one trace]

The docker image start command after

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80", "--reload", "--loop", "asyncio"]
Note the two arguments added at the end

[screenshots: Phoenix shows "<WorkflowHandler pending>" and the workflow steps split into separate traces]

I want to note that I don't get any unusual logs; there's probably a mismatch in context due to the async behaviour that is now wrapped around the application. For further context: I use FastAPI as the main entry point to access my code and execute functions.

Version

0.12.10

Steps to Reproduce

1. Set up a simple workflow

from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step
import time

class MyWorkflow(Workflow):
    @step
    async def my_step(self, ev: StartEvent) -> StopEvent:
        time.sleep(3)
        return StopEvent(result="Hi, world!")

w = MyWorkflow(timeout=10, verbose=False)

2. Run the workflow with the span handler

from fastapi import FastAPI
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from phoenix.otel import register

tracer_provider = register(
  # project_name="default", # Default is 'default'
  endpoint="http://localhost:6006/v1/traces",
)

# Instrument the application
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)

app = FastAPI()

@app.get("/")
async def read_root():
    result = await w.run()  # `w` is the MyWorkflow instance from step 1
    return result

3. Set up Docker with Phoenix

Pull & run latest Phoenix image from Docker Hub:

docker pull arizephoenix/phoenix:latest
docker run -p 6006:6006 arizephoenix/phoenix:latest 

Or use the setup from #16283, as I think it will yield the same output, but I haven't confirmed this.

This will expose Phoenix on localhost:6006.

4. Run the application with uvicorn

Make sure it's installed

Run: uvicorn main:app --loop asyncio

5. Execute a GET request to the FastAPI app

Call localhost:8000/ in a web browser or with wget and it should execute the workflow.

Relevant Logs/Tracebacks

No response

SvenWesterlaken added the bug and triage labels on Jan 6, 2025

dosubot bot commented Jan 6, 2025

@SvenWesterlaken Greetings! I'm here to assist you with any bugs, questions, or contributions you might have. Let me know how I can help!

The issue you're experiencing with the WorkflowHandler.pending status not being handled correctly when using --loop asyncio in Uvicorn might be related to how asynchronous contexts are managed in LlamaIndex version 0.12.10. This version may have introduced changes in async handling that affect workflow tracing, particularly when combined with the --loop asyncio option.

Here are a few steps you can take to troubleshoot and potentially resolve the issue:

  1. Check Async Context Management: Ensure that the async context is correctly managed throughout your application. The use of --loop asyncio can sometimes lead to context mismatches if the application or its dependencies are not fully compatible with this event loop.

  2. Review Tracing Configuration: Double-check your tracing setup with Phoenix and ensure that the LlamaIndexInstrumentor is correctly configured. Make sure that the tracer provider is properly set up to handle async workflows.

  3. Test Without --loop asyncio: Temporarily remove the --loop asyncio option to see if the issue persists. This can help determine if the problem is specifically related to the asyncio loop.

  4. Update Dependencies: Ensure all related packages, including LlamaIndex, FastAPI, and Uvicorn, are up to date. Sometimes, compatibility issues arise from outdated dependencies.

  5. Consult Documentation: Review any available documentation or release notes for LlamaIndex version 0.12.10 to identify any known issues or changes related to async handling and tracing.

If these steps do not resolve the issue, it may be necessary to further investigate the specific interactions between Uvicorn's asyncio loop and LlamaIndex's tracing mechanisms.


@logan-markewich (Collaborator)

Ah yeah, I think I've heard of this before. I need to look into it, but to be honest, I have no idea what the exact issue/fix will be here. Suggestions/help very welcome.

@logan-markewich (Collaborator)

Every workflow step (and the workflow.run() function) is wrapped by this span wrapper.

workflow.run() returns a custom subclassed asyncio future object, which I think is messing with the async context? I've dug into this before, and it's a bit confusing.
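A minimal sketch (not LlamaIndex's actual code) of the kind of context mismatch being suspected: an asyncio task snapshots the contextvars Context at the moment it is created, so anything the caller sets or resets afterwards is invisible inside the task, and vice versa.

import asyncio
import contextvars

active_span_id = contextvars.ContextVar("active_span_id", default=None)

async def workflow_step():
    # The task copied the context when it was created, so it still sees "run-span".
    print("step sees:", active_span_id.get())

async def main():
    active_span_id.set("run-span")
    pending = asyncio.ensure_future(workflow_step())  # context snapshot taken here
    active_span_id.set(None)                          # later changes are invisible to the task
    await pending
    print("caller sees:", active_span_id.get())

asyncio.run(main())
# step sees: run-span
# caller sees: None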

SvenWesterlaken (Author) commented Jan 6, 2025

Thanks for the fast response @logan-markewich! I can only add from my side that the asyncio global loop is the cause in my case. Removing it solves it, so I'd say it's something with the global async loop, but I think you already got that from my initial message haha. I might be able to look into it later this week, but so far I don't think I have enough experience to be of much help.

SvenWesterlaken (Author) commented Jan 6, 2025

Don't know if this is correct @logan-markewich, but it seems that the problem mainly arises from the double context and thus resetting the context around the span id. This causes the span to exit prematurely and forgo the saved parent_id, which detaches the steps.

context.run(active_span_id.reset, token)

It is worth noting though that the steps themselves do combine correctly; only the span for the run of the workflow itself looks like it has problems.

Probably now that the global loop is the plain asyncio loop, the context that gets created doesn't need to be reset?

Might be related: django/asgiref#267 and a related PR to fix it: instana/python-sensor#666
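To illustrate the "double context" idea with a minimal, self-contained sketch (this is not the dispatcher's actual code): bookkeeping done inside a copied contextvars Context, the way context.run(...) does, only mutates the copy, so the caller's context and the copy can end up disagreeing about the active span.

import contextvars

active_span_id = contextvars.ContextVar("active_span_id", default=None)

def exit_span():
    # Pretend this is the span-exit bookkeeping that clears the active span.
    active_span_id.set(None)
    return active_span_id.get()

active_span_id.set("workflow-span")            # span "entered" in the caller's context

snapshot = contextvars.copy_context()          # the extra context a loop integration may hold
print("inside the copy:", snapshot.run(exit_span))   # -> None
print("caller still sees:", active_span_id.get())    # -> workflow-span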

@logan-markewich (Collaborator)

Thanks for the tip, this actually helps a lot. Let me see what I can cook up here

logan-markewich (Collaborator) commented Jan 6, 2025

@SvenWesterlaken Actually, I'm unable to reproduce using the code you gave 👀 What versions of the packages do you have? Here are my relevant versions (omg, so many otel packages lol)

python==3.10
llama-index-core==0.12.10
arize-phoenix-otel==0.6.1
openinference-instrumentation==0.1.20
openinference-instrumentation-llama-index==3.1.2
openinference-semantic-conventions==0.1.12
opentelemetry-api==1.29.0
opentelemetry-exporter-otlp==1.29.0
opentelemetry-exporter-otlp-proto-common==1.29.0
opentelemetry-exporter-otlp-proto-grpc==1.29.0
opentelemetry-exporter-otlp-proto-http==1.29.0
opentelemetry-instrumentation==0.50b0
opentelemetry-proto==1.29.0
opentelemetry-sdk==1.29.0
opentelemetry-semantic-conventions==0.50b0

Also running the latest Phoenix docker image (v7.3.2 is printed in the docker logs)

[screenshot of the resulting Phoenix trace]

@SvenWesterlaken (Author)

Let me try to create a local reproduction later today and send you the package versions.

I was a bit cooked yesterday after spending so much time coding under pressure, so I couldn't get myself to put in the effort to create a full reproduction and test it. I also couldn't just copy-paste my repo code as it has some sensitive info etc., so I quickly grabbed the reproduction from the related issue in the hope that it would be enough for people to figure out the issue, as I normally expect reactions after a week haha. So once again, thanks for the quick responses and help @logan-markewich!

@SvenWesterlaken (Author)

Packages

As for the packages @logan-markewich (I deleted quite a lot of rows, as the full list was three times as long; in case a package you need to verify is missing, let me know):

arize-phoenix==7.5.1
arize-phoenix-evals==0.18.0
arize-phoenix-otel==0.6.1
datasets==3.2.0
deepeval==0.10.7
fastapi==0.115.6
fastapi-cli==0.0.7
fastembed==0.5.0
huggingface-hub==0.23.5
langchain==0.3.14
langchain-community==0.3.14
langchain-core==0.3.29
langchain-openai==0.2.14
langchain-text-splitters==0.3.4
langcodes==3.5.0
langsmith==0.2.10
language_data==1.3.0
llama-cloud==0.1.7
llama-index==0.12.9
llama-index-agent-openai==0.4.1
llama-index-cli==0.4.0
llama-index-core==0.12.10.post1
llama-index-embeddings-fastembed==0.3.0
llama-index-embeddings-openai==0.3.1
llama-index-extractors-entity==0.3.0
llama-index-indices-managed-llama-cloud==0.6.3
llama-index-llms-openai==0.3.12
llama-index-multi-modal-llms-openai==0.4.2
llama-index-program-openai==0.3.1
llama-index-question-gen-openai==0.3.0
llama-index-readers-file==0.4.2
llama-index-readers-llama-parse==0.4.0
llama-index-storage-chat-store-redis==0.4.0
llama-index-storage-docstore-mongodb==0.3.0
llama-index-storage-index-store-mongodb==0.4.0
llama-index-storage-kvstore-mongodb==0.3.0
llama-index-vector-stores-qdrant==0.4.2
llama-parse==0.5.19
nest-asyncio==1.6.0
openai==1.59.3
openinference-instrumentation==0.1.20
openinference-instrumentation-llama-index==3.1.2
openinference-semantic-conventions==0.1.12
opentelemetry-api==1.29.0
opentelemetry-exporter-otlp==1.29.0
opentelemetry-exporter-otlp-proto-common==1.29.0
opentelemetry-exporter-otlp-proto-grpc==1.29.0
opentelemetry-exporter-otlp-proto-http==1.29.0
opentelemetry-instrumentation==0.50b0
opentelemetry-proto==1.29.0
opentelemetry-sdk==1.29.0
opentelemetry-semantic-conventions==0.50b0
python-dotenv==1.0.1
qdrant-client==1.12.2
ragas==0.2.9

Reproduction

1. Set up the workflow from the RAG-with-reranking example

But without the ingestion step, to simplify it.

# workflow.py
from llama_index.core.response_synthesizers import CompactAndRefine
from llama_index.core.postprocessor.llm_rerank import LLMRerank
from llama_index.core.workflow import (
    Context,
    Workflow,
    StartEvent,
    StopEvent,
    step,
)

from llama_index.llms.openai import OpenAI
from app.models.events import RetrieverEvent, RerankEvent
import logging

logger = logging.getLogger('main_app')

class RAGWorkflow(Workflow):
  @step
  async def retrieve(self, ctx: Context, event: StartEvent) -> RetrieverEvent | None:
    query = event.get("query")
    index = event.get("index")

    if not query:
      return None

    logger.debug(f"Query the database with: {query}")

    # store the query in the global context
    await ctx.set("query", query)

    if index is None:
      logger.warning("Index is empty, load some documents before querying!")
      return None

    retriever = index.as_retriever(similarity_top_k=25)
    nodes = await retriever.aretrieve(query)
    
    logger.debug(f"Retrieved {len(nodes)} nodes.")
    return RetrieverEvent(nodes=nodes)

  @step
  async def rerank(self, ctx: Context, ev: RetrieverEvent) -> RerankEvent:
      # Rerank the nodes
      
      ranker = LLMRerank(
          choice_batch_size=5, top_n=5, llm=OpenAI(model="gpt-4o-mini")
      )
      
      query = await ctx.get("query", default=None)
      
      new_nodes = ranker.postprocess_nodes(
          ev.nodes, 
          query_str=query
      )
      
      logger.debug(f"Reranked nodes to {len(new_nodes)}")
      return RerankEvent(nodes=new_nodes)

  @step
  async def synthesize(self, ctx: Context, ev: RerankEvent) -> StopEvent:
      """Return a streaming response using reranked nodes."""
      llm = OpenAI(model="gpt-4o-mini")
      summarizer = CompactAndRefine(llm=llm, streaming=True, verbose=True)
      query = await ctx.get("query", default=None)

      response = await summarizer.asynthesize(query, nodes=ev.nodes)
      return StopEvent(result=response)

2. Run the workflow from a basic endpoint using FastAPI & set up tracing

Make sure to add an __init__.py so the relative import works.

# main.py
from fastapi import FastAPI
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from phoenix.otel import register
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from .workflow import RAGWorkflow

tracer_provider = register(
  # project_name="default", # Default is 'default'
  endpoint="http://phoenix:6006/v1/traces",
)

# Instrument the application
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)

# Setup fastapi
app = FastAPI()

# Note that this is just a simple debug endpoint
@app.get("/")
async def read_root():
  
  reader = SimpleDirectoryReader(
    input_files=["./app/test.txt"],
  )
  
  docs = reader.load_data()
  
  index = VectorStoreIndex.from_documents(docs, embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"))
  
  w = RAGWorkflow()
  result = await w.run(query="What does the context say?", index=index)
  
  async for chunk in result.async_response_gen():
    print(chunk, end="", flush=True)
    
  return 'completed'

3. Create a file called test.txt

Just create a txt file with a simple sentence to be indexed; in my case the content didn't matter, as long as there is something to index.

4. Run Phoenix in a local Docker container

docker pull arizephoenix/phoenix:latest
docker run -p 6006:6006 arizephoenix/phoenix:latest 

This will expose Phoenix on localhost:6006; it should pull version 7.5.1 of Phoenix.

5. Run the app using uvicorn

Make sure uvicorn is installed globally.

Run: uvicorn main:app --loop asyncio --reload

6. Access the API endpoint to start the workflow

Call localhost:8000/ in a web browser, Postman, or with wget and it should execute the workflow.

It should return the string "completed" on completion.

7. Go to Phoenix to see the result

[screenshot of the resulting Phoenix trace]


In case it still can't be reproduced, there might be something with the Docker container?
What you can try is to replace step 5 with running the app inside a Docker container:

FROM python:3.10

WORKDIR /code

COPY ./requirements.txt ./
RUN pip install -r requirements.txt

COPY ./app ./app

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80", "--reload", "--loop", "asyncio"]

The code lives in a subfolder called 'app' next to the Dockerfile.

@logan-markewich (Collaborator)

Thank you! I will give this a shot.

I'm actually just realizing that I ran without setting the loop type to asyncio (so using the default uvloop). I wonder if that's the key difference. I will find out

SvenWesterlaken (Author) commented Jan 7, 2025

> Thank you! I will give this a shot.
>
> I'm actually just realizing that I ran without setting the loop type to asyncio (so using the default uvloop). I wonder if that's the key difference. I will find out

Yes, that's the key difference actually haha. If I don't run with the asyncio loop it works as expected. @logan-markewich

Unfortunately, I can't take that out, as it will crash ragas, which needs the asyncio loop instead of uvloop. So that's also some background on how I stumbled upon this bug.
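(A plausible explanation for that constraint, assuming the crash comes from nest_asyncio, which ragas uses to re-enter a running loop: nest_asyncio can only patch the pure-Python asyncio event loop and refuses uvloop's loop. A quick check:)

import asyncio
import nest_asyncio
import uvloop

def try_patch(loop):
    try:
        nest_asyncio.apply(loop)
        print("patched:", type(loop).__name__)
    except ValueError as exc:
        print("cannot patch:", type(loop).__name__, "->", exc)
    finally:
        loop.close()

try_patch(asyncio.new_event_loop())   # plain asyncio loop: patching works
try_patch(uvloop.new_event_loop())    # uvloop loop: nest_asyncio raises ValueError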

@logan-markewich (Collaborator)

@SvenWesterlaken there is likely a way to structure your code without needing nest_asyncio -- but yea ok, let me see if I can do something for this

@logan-markewich (Collaborator)

After consulting the arize folks, I think this issue will be very hard to fix 😓 In the meantime, I'm happy to help you redesign your code so that you don't need nest_asyncio -- is there a specific block of code that is requiring this?
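(For reference, the usual redesign being offered looks something like the sketch below; the names are purely illustrative and not taken from the issue's code. nest_asyncio is typically only needed when synchronous code calls asyncio.run() or loop.run_until_complete() while a loop is already running, so keeping the call chain async end to end removes the need for it.)

import asyncio
from fastapi import FastAPI

app = FastAPI()

async def evaluate_async(answer: str) -> float:
    # Stand-in for whatever async work (e.g. an evaluation call) needs the loop.
    await asyncio.sleep(0)
    return 1.0

def evaluate_blocking(answer: str) -> float:
    # Problematic pattern: re-entering the loop from sync code is what forces nest_asyncio.
    return asyncio.run(evaluate_async(answer))

@app.get("/score")
async def score():
    # Preferred: stay async all the way down and just await.
    return {"score": await evaluate_async("some answer")}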

@SvenWesterlaken (Author)

Let me get back to you later today with some error messages and some context on the code that causes it. Thank you already for your time! If you want, we can leave this issue open and continue communication elsewhere @logan-markewich? If not, I will post the details here.

logan-markewich (Collaborator) commented Jan 7, 2025

Are you on our Discord? Feel free to ping me there (you'll see my name, Logan M, in every channel lol) -- if not, we can just continue here.
