Continuing the discussion from StackOverflow #2

Open
tomkcook opened this issue Aug 8, 2024 · 4 comments

Comments

@tomkcook

tomkcook commented Aug 8, 2024

There's a reason that it's not easy to do this. It gives you new and very subtle ways to shoot yourself in the foot.

The problem

First, let's state the problem, with a vaguely real-world example. You have a library of existing code which is all synchronous. It does lots of useful things and is basically the glue that holds your company's systems together. Somewhere near the bottom of the library is a function that retrieves a number from an external server:

import requests

def get_number() -> int:
    return int(requests.get("https://myserver.com/the_number").content)

Then let's say things change and, instead of getting that number from an HTTP service, you have to get it from D-Bus. You really don't want all of glib as a dependency, so pydbus isn't an option. Never mind, there's dbus_next, which is 100% pure Python! Yay!

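import asyncio

from dbus_next.aio import MessageBus

NUMBER_INTERFACE = "..."
NUMBER_PATH = "..."

async def _get_number() -> int:
    bus = await MessageBus().connect()
    intro = await bus.introspect(NUMBER_INTERFACE, NUMBER_PATH)
    obj = bus.get_proxy_object(NUMBER_INTERFACE, NUMBER_PATH, intro)
    ifc = obj.get_interface(NUMBER_INTERFACE)
    # Proxy interface accessors are coroutines, so the result must be awaited.
    return await ifc.get_number()

def get_number() -> int:
    # Synchronous wrapper: drive the coroutine to completion on the event loop.
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(_get_number())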

Fine. Now, suppose that someone uses the top level of your library to implement a web service using quart, which runs HTTP endpoints in an asyncio loop. Now you have a problem, because as soon as the library, running in a quart endpoint, calls get_number(), it will raise an exception: RuntimeError: This event loop is already running.
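The failure can be reproduced with the standard library alone. This is only a minimal sketch, not the real library: `_get_number` here is a hypothetical coroutine that returns a fixed value instead of talking to D-Bus, and `endpoint` is a made-up name standing in for a quart handler.

```python
import asyncio


async def _get_number() -> int:
    # Hypothetical stand-in for the D-Bus call; returns a fixed value.
    return 42


def get_number() -> int:
    # Synchronous wrapper written the same way as in the example above.
    loop = asyncio.get_event_loop()
    coro = _get_number()
    try:
        return loop.run_until_complete(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def endpoint() -> str:
    # Plays the role of an async web handler calling into the sync library.
    try:
        return str(get_number())
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


def demo() -> str:
    return asyncio.run(endpoint())
```

Calling `demo()` returns the error text rather than the number, because `get_number()` is asked to run a loop that is already busy running `endpoint()`.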

You show up at StackOverflow asking about the problem and you get a big pile of unhelpful answers that amount to "Yeah, don't do that. Why don't you rewrite your whole library in asyncio the way God intended it to be?" But you have a big ol' pile of other synchronous code that also uses your library; rewriting the library as asyncio code means either rewriting all that other stuff as asyncio code, or maintaining two versions of your library, one asyncio and one synchronous.

The obvious solution

The obvious solution to this is to make asyncio event loops reentrant, so that the above code doesn't raise a RuntimeError, it just gets on and executes the function. There is even a library for doing exactly that. But you don't want to do that.

Why don't I want to do that?

You don't want to do that because it breaks all your concurrency assumptions. It gives you new, subtle and extremely-difficult-to-debug ways to shoot yourself in the foot.

Let's go back to our example. Let's say someone is using our get_number() function like this:

class Foo:
    def __init__(self):
        self.value = 0

    async def bar(self):
        if self.value < 10:
            self.value = get_number()

In traditional multithreaded code, this is a disaster. We have a race condition. Another thread might intervene between when we read the value in the line if self.value < 10: and the line where we write the value self.value = get_number(). In traditional multithreaded code, we would need to put a lock around this to prevent concurrent updates.
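For comparison, here is roughly what the locked version looks like in threaded code. It is a sketch under assumptions: `_get_number` is a hypothetical stand-in that returns a fixed value and counts its calls, and `run_threads` is a made-up helper for exercising it.

```python
import threading


class Foo:
    def __init__(self) -> None:
        self.value = 0
        self.fetches = 0  # how many times the stand-in fetch ran
        self._lock = threading.Lock()

    def _get_number(self) -> int:
        # Hypothetical stand-in for the library's get_number().
        self.fetches += 1
        return 42

    def bar(self) -> None:
        # The lock makes the check and the write a single critical section.
        with self._lock:
            if self.value < 10:
                self.value = self._get_number()


def run_threads(n: int = 8) -> Foo:
    foo = Foo()
    threads = [threading.Thread(target=foo.bar) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return foo
```

With the lock held across both lines, only the first thread to get in sees `value < 10`, so the fetch happens once no matter how the threads are scheduled.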

But in asyncio code, the above is absolutely fine. We know that the only place where context-switching can happen is where there is an await. We can write code like this as though it is single threaded because it is. There is no way other code can intervene between the last two lines.
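A small sketch of that guarantee, assuming a hypothetical synchronous `_get_number` that never touches the event loop (the fixed return value and the `fetches` counter are only there for illustration):

```python
import asyncio


class Foo:
    def __init__(self) -> None:
        self.value = 0
        self.fetches = 0  # how many times the stand-in fetch ran

    def _get_number(self) -> int:
        # Hypothetical synchronous stand-in; it never yields to the event loop.
        self.fetches += 1
        return 42

    async def bar(self) -> None:
        # No await between the check and the write, so no other task can
        # run in between them.
        if self.value < 10:
            self.value = self._get_number()


async def main(n: int = 8) -> Foo:
    foo = Foo()
    # Run n copies of bar() as concurrent tasks on one event loop.
    await asyncio.gather(*(foo.bar() for _ in range(n)))
    return foo
```

Each task runs `bar()` from start to finish before the next one gets a turn, so only the first one fetches, and no lock is needed.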

Now suppose you make asyncio event loops reentrant, so that asyncio code can call synchronous code that then calls asyncio code again. You just broke all your concurrency assumptions. We can no longer tell by looking at code where the context switches might occur, because any synchronous function might be hiding a re-entry to the asyncio event loop and might context switch when it is called. You now need to protect all these critical sections with locks again.
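The effect can be imitated without a reentrant loop. In the sketch below, an explicit `await asyncio.sleep(0)` stands in for the hidden re-entry; that substitution is an assumption made for illustration, and with a truly reentrant loop the switch would be invisible in the source of `bar_unsafe`. The names `bar_unsafe`, `bar_locked`, and `run` are hypothetical.

```python
import asyncio


class Foo:
    def __init__(self) -> None:
        self.value = 0
        self.fetches = 0  # how many times the stand-in fetch ran
        self._lock = asyncio.Lock()

    async def _get_number(self) -> int:
        # Stand-in for a call that re-enters the loop: other tasks run here.
        await asyncio.sleep(0)
        self.fetches += 1
        return 42

    async def bar_unsafe(self) -> None:
        # Check-then-act with a context switch in the middle.
        if self.value < 10:
            self.value = await self._get_number()

    async def bar_locked(self) -> None:
        # The lock restores the critical section around check and write.
        async with self._lock:
            if self.value < 10:
                self.value = await self._get_number()


async def run(method_name: str, n: int = 8) -> int:
    foo = Foo()
    await asyncio.gather(*(getattr(foo, method_name)() for _ in range(n)))
    return foo.fetches
```

In the unsafe variant every task passes the check before any of them writes, so each one fetches. The locked variant fetches once, at the cost of bringing back exactly the locking that asyncio was supposed to let you skip.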

To the extent that your project uses asyncio for a good reason (and not just because someone said, "Hey, asyncio looks cool, let's use that!") it is almost certainly for this reason: asyncio lets you write concurrent code without having to worry that context switches can happen absolutely anywhere. It's easy to tell where they happen and it's easy to see where you need locks to prevent concurrent updates. Most code that deals with state that's shared between tasks can be written in the simple, obvious way; you only need to worry about concurrency where there's an await in your code. If you take all that away, what reason do you have for using asyncio at all? Might just as well use traditional threads.

@luochen1990
Owner

Any synchronous function might be hiding a re-entry to the asyncio event loop and might context switch when it is called

I don't think this would be a very serious issue. Treating every synchronous function as atomic is just a legacy habit, an accident of how earlier implementations were designed; the GIL is a similar legacy that may soon be discarded. By contrast, Haskell once replaced all the underlying libraries of type IO a with asynchronous implementations without affecting user-level code, because it used an m:n lightweight thread model from the start and made no assumptions about the atomicity of processes unless you explicitly mark a piece of code as atomic. I believe this is also likely to be the future direction of Python and many other popular languages, if they stop positioning themselves as scripting languages and hope to take on more. Of course, I acknowledge that your analysis of the current situation is accurate and very practical. Still, I have a glimmer of hope for a better future in which we no longer need to repeat ourselves by providing almost identical synchronous and asynchronous implementations of the same function.

@luochen1990
Owner

Paste the link of StackOverflow here

@luochen1990
Owner

@tomkcook I have provided my new solution in this repo. Your SO question is answered via this test case. Are you satisfied with my new solution?

@luochen1990
Owner

Today I tried:

  1. gevent, which does not work well together with async functions.
  2. nest_asyncio, which I checked and found is no longer maintained.
  3. uvloop, which has the same limitation as asyncio.

Just for the record.
