Async Colors and Misconceptions

asynchronicity by Bast on Thursday December 10th, 2020

I was reading "What Color is Your Function?" (https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/) the other day, and I came to a bit of a realization about why async can be so painful in Python, especially at the boundary between synchronous and asynchronous Python code.

The issue lies within the following code:

# asynchronous
async def do_my_things():
    await repeat("do it", 5)

# synchronous
def do_my_things():
    repeat("do it", 5)

Spot it? No? It's hiding pretty well. I'll give you a clue.

It's a syntactical error.

No?

Let's spread things around, maybe that will help.

# asynchronous
async def do_my_things():
    args = ("do it", 5)
    prepped_func = repeat(*args)
    result = await prepped_func

# synchronous
def do_my_things():
    args = ("do it", 5)
    prepped_func = repeat
    result = prepped_func(*args)

Now do you see it?

You may notice that there's an additional type of thing here: the prepared function (a coroutine object in Python's asyncio, which becomes a 'Task' once it is scheduled on the event loop). For synchronous (blue) functions, nothing like it exists outside of the Python bytecode. But this actually isn't the mistake. It's a natural, mandatory consequence of an asynchronous ecosystem: there has to be some way to represent an "uncalled" or "running" function. So which line actually has the problem?

It's the last one.

    result = await prepped_func
# vs
    result = prepped_func(*args)

() is doing two things: prepping a function, and calling a function. That's the syntactical flaw that plagues users of blended ecosystems. Async functions are "called" with await, not ()!
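To make the two jobs concrete, here is a small sketch in today's Python. The body of repeat() and the calls list are made up for illustration, since nothing above defines them. Calling with () only builds a coroutine object, and nothing inside repeat() runs until the await.

```python
import asyncio

calls = []  # hypothetical log of which repeat() bodies have actually started


async def repeat(phrase, times):
    # Assumed body: record that we started, then return the phrase repeated.
    calls.append(phrase)
    return [phrase] * times


async def do_my_things():
    calls.clear()
    prepped_func = repeat("do it", 5)   # () only preps: a coroutine object
    started_after_call = list(calls)    # still empty, nothing has run yet
    result = await prepped_func         # await is what actually "calls" it
    started_after_await = list(calls)
    return started_after_call, started_after_await, result
```

Running asyncio.run(do_my_things()) shows the log is empty after the () and only fills in after the await.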

This syntactical confusion causes the following scenarios to happen all the time:

result = prepped_func() # missing await

and nobody can tell whether you were intending to manage the task yourself or not, at least until it goes out of scope and gets garbage collected. Then you get a warning ("RuntimeWarning: coroutine '...' was never awaited") with no backtrace.
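Here is a rough sketch of that failure mode, assuming CPython (where dropping the last reference finalizes the coroutine right away) and an invented body for repeat(). It records the warning instead of printing it, so you can see what Python has to say about the forgotten await.

```python
import gc
import warnings


async def repeat(phrase, times):
    # Assumed body, purely for illustration.
    return [phrase] * times


def forget_await():
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = repeat("do it", 5)    # missing await: nothing has run
        kind = type(result).__name__   # what we actually got back
        del result                     # the coroutine goes out of scope...
        gc.collect()                   # ...and is collected without ever running
    return kind, [str(w.message) for w in caught]
```

The returned kind is "coroutine", not a list, and the recorded message names the function that was never awaited but says nothing about where the call happened.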

A Potential Solution!

The default should have been the opposite. Async functions should have been awaited automatically when called, and should have required a keyword to receive a task instead.

For example:

async def do_my_things():
    repeat("do it", 5)

No confusion. You called it with (), and () always behaves the same.

async def do_many_things():
    a = task_for repeat("do it", 5)
    b = task_for repeat("do that", 5)
    asyncio.gather(a, b)

It makes this case more verbose, but I think it should be: you're intentionally invoking concurrency here. You should not accidentally do things that you didn't expect to, and the expected behavior for () is to call a function. Having to remember to color the call when you're inside certain kinds of functions is backwards: it's opt-in when it should be opt-out.
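For comparison, the closest thing to task_for in Python as it exists today is probably asyncio.create_task(). This is only a sketch of the same shape in current syntax, with an invented body for repeat(); the explicit await on gather is exactly the part the proposal would make implicit.

```python
import asyncio


async def repeat(phrase, times):
    # Assumed body: yield to the event loop once per repetition.
    out = []
    for _ in range(times):
        await asyncio.sleep(0)
        out.append(phrase)
    return out


async def do_many_things():
    # create_task() plays the role of the hypothetical task_for keyword:
    # it is the explicit "I want a task, not a result" marker.
    a = asyncio.create_task(repeat("do it", 5))
    b = asyncio.create_task(repeat("do that", 5))
    return await asyncio.gather(a, b)
```

asyncio.gather() returns results in the order the tasks were passed in, so the first list is all "do it" and the second is all "do that".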

This does make one case confusing:

async def repeat(phrase, times): pass
def other_repeat(phrase, times): pass

async def do_two_things():
    repeat("do it", 5)
    other_repeat("do more of it", 15)

other_repeat() doesn't yield back to the event loop, but repeat() does. This means things can change arbitrarily during repeat(), but not during other_repeat(): whatever changes during other_repeat() is determined entirely by what's inside it, rather than by every other coroutine that happens to be running.
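A small sketch of that difference, written in today's Python with explicit awaits. The function bodies, the state dict, and meddler() are all hypothetical, just enough to show another coroutine getting a turn during repeat() but not during other_repeat().

```python
import asyncio

state = {"value": 0}  # hypothetical shared state


async def repeat(phrase, times):
    for _ in range(times):
        await asyncio.sleep(0)   # yields back to the event loop


def other_repeat(phrase, times):
    for _ in range(times):
        pass                     # never yields; nothing else can run


async def meddler():
    state["value"] += 1          # some other coroutine changing things


async def do_two_things():
    state["value"] = 0
    task = asyncio.create_task(meddler())    # scheduled, not yet run
    other_repeat("do more of it", 15)
    after_sync = state["value"]              # untouched: no yield happened
    await repeat("do it", 5)
    after_async = state["value"]             # meddler() ran during the await
    await task
    return after_sync, after_async
```

The value is unchanged after the synchronous call and has moved by the time the awaited one returns.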

However, I consider this the exception that proves the rule. The asynchronous state of a program is something you always need to keep in mind when running asynchronous code, and even more so with threaded code, where there are fewer guarantees about when things will be moved out from under you. So it's not actually a big deal that await yields control to another coroutine. You should be treating coroutines like threads anyway, since you have minimal guarantees that they won't change things. Instead of abusing a technicality in the implementation, use the appropriate features for this case, like locks, semaphores, and containerizing your state, so that at no point in your (GIL-mandated) single-threaded code do your invariants break across function calls.
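As one example of using the appropriate feature, here is a minimal sketch with asyncio.Lock. The accounts dict and transfer() are invented for illustration. The lock keeps the read-then-write on the balance intact even though the coroutine yields in the middle of it.

```python
import asyncio


async def transfer(accounts, lock, src, dst, amount):
    async with lock:                 # only one transfer at a time
        balance = accounts[src]
        await asyncio.sleep(0)       # other coroutines get to run here
        accounts[src] = balance - amount
        accounts[dst] += amount


async def main():
    accounts = {"a": 100, "b": 0}    # hypothetical shared state
    lock = asyncio.Lock()
    await asyncio.gather(
        *(transfer(accounts, lock, "a", "b", 10) for _ in range(5))
    )
    return accounts
```

Without the lock, all five transfers would read the same starting balance before any of them wrote it back, and the total across both accounts would no longer add up to 100.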