Use the `anyio` compatibility framework, so downstream integrations can use asyncio or trio? #1423
Yes, we'd definitely be down. This could actually dovetail very well with some work we are about to take on to improve our scheduler. My biggest concern is around user …
Does all of that sound right? So we're up for PRs but also would want to lean in and work directly on making this happen. I propose that I take a close look at this early-to-middle of next week and then ping you back to discuss particulars?
Yep, that all sounds right to me! Happy to chat about this whenever 😁
One not critical but possibly relevant issue I've discovered is that our terminal UI framework (textual) uses asyncio. Use of textual is optional in Inspect (it is used for …)
Looks like this was discussed a while ago, and we can probably work on Textual support in parallel with Inspect 🙂
Have started moving things to anyio and it's going along well. As I transition from uses of `asyncio.gather()`, a typical call site looks like this:

```python
tasks: list[Awaitable[dict[str, SampleScore]]] = [
    run_score_task(state, Target(sample.target), scorers, progress)
    for (sample, state) in zip(log.samples, states)
]
scores = await asyncio.gather(*tasks)
```

To port this to anyio I created a `tg_collect_or_raise()` function, so the call site becomes:

```python
scores = await tg_collect_or_raise(tasks)
```

Here's the implementation:

```python
async def tg_collect_or_raise(coros: list[Awaitable[T]]) -> list[T]:
    """Runs all of the passed coroutines collecting their results.

    If an exception occurs in any of the tasks then the other tasks
    are cancelled and the exception is raised.

    Args:
       coros: List of coroutines

    Returns:
       List of results if no exceptions occurred.

    Raises:
       Exception: The first exception occurring in any of the coroutines.
    """
    results: list[tuple[int, T]] = []
    first_exception: Exception | None = None

    async with anyio.create_task_group() as tg:

        async def run_task(task: Awaitable[T], index: int) -> None:
            nonlocal first_exception
            try:
                result = await task
                results.append((index, result))
            except Exception as exc:
                if first_exception is None:
                    first_exception = exc
                tg.cancel_scope.cancel()

        for i, coro in enumerate(coros):
            tg.start_soon(run_task, coro, i)

    if first_exception:
        raise first_exception

    # sort results by original index and return just the values
    return [r for _, r in sorted(results)]
```

There are also simpler cases, e.g. printing errors. For Docker container cleanup we used to do this:

```python
tasks = [cleanup_fn(project, False) for project in projects]
results = await asyncio.gather(*tasks, return_exceptions=True)

# report errors
for result in results:
    if result is not None:
        print(f"Error cleaning up Docker environment: {result}")
```

And now we do this:

```python
tasks = [cleanup_fn(project, False) for project in projects]
async with anyio.create_task_group() as tg:
    for task_coro in tasks:
        tg.start_soon(print_exceptions, task_coro, "cleaning up Docker environment")

async def print_exceptions(coro: Awaitable[T], context: str) -> None:
    try:
        await coro
    except Exception as ex:
        print(f"Error {context}: {ex}")
```

LMK if we are on the right track here....
Definitely on the right track! A few points of idiom, which might help in future: …
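One point of idiom that surfaces in the reply below is never creating a coroutine object without managing it — i.e., handing `start_soon` the async function plus its arguments rather than calling the function first. A minimal sketch of the contrast, assuming a hypothetical `cleanup` function:

```python
import anyio

async def cleanup(name: str) -> None:
    """Hypothetical cleanup work."""
    await anyio.sleep(0)

async def main() -> None:
    names = ["a", "b", "c"]
    async with anyio.create_task_group() as tg:
        # Idiomatic: start_soon receives the async function and its args,
        # so no unmanaged coroutine object ever exists.
        for name in names:
            tg.start_soon(cleanup, name)

    # Avoided: building coroutine objects up front (the asyncio.gather style),
    # e.g. coros = [cleanup(n) for n in names] -- if anything raises before
    # each one is started, those coroutines leak unawaited.

anyio.run(main)
```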
Incredibly helpful! Thank you for your patience :-) Got it re: never creating an async function w/o managing it. I've re-written `print_exceptions` as:

```python
async def print_exceptions(
    context: str,
    func: Callable[[Unpack[PosArgsT]], Awaitable[Any]],
    *args: Unpack[PosArgsT],
) -> None:
    try:
        await func(*args)
    except Exception as ex:
        print(f"Error {context}: {ex}")
```

But I kind of agree with you that the right call is probably just catching and printing the …

I have a new `tg_collect()`:

```python
async def tg_collect(
    funcs: list[Callable[[], Awaitable[T]]], exception_group: bool = False
) -> list[T]:
    """Runs all of the passed async functions and collects their results.

    The results will be returned in the same order as the input `funcs`.

    Args:
       funcs: List of async functions.
       exception_group: `True` to raise an ExceptionGroup or
          `False` (the default) to raise only the first exception.

    Returns:
       List of results of type T.

    Raises:
       Exception: First exception occurring in failed tasks
          (for `exception_group == False`, the default)
       ExceptionGroup: Exceptions that occurred in failed tasks
          (for `exception_group == True`)
    """
    try:
        results: list[tuple[int, T]] = []

        async with anyio.create_task_group() as tg:

            async def run_task(index: int) -> None:
                result = await funcs[index]()
                results.append((index, result))

            for i in range(0, len(funcs)):
                tg.start_soon(run_task, i)

        # sort results by original index and return just the values
        return [r for _, r in sorted(results)]
    except ExceptionGroup as ex:
        if exception_group:
            raise
        else:
            raise ex.exceptions[0]
```

Order matters in some of the call sites (thus the sort by original index). Thanks again for your time and input here. Hoping to not just "make it work" but rather do everything the way it ought to be done for structured concurrency.
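To connect this back to the earlier `asyncio.gather()` call site, a sketch of how it might now be invoked — reusing the names from that example, with `functools.partial` producing the zero-argument callables the signature expects:

```python
import functools

# each partial defers the call until the task group actually starts it,
# so no unmanaged coroutine objects are created up front
scores = await tg_collect(
    [
        functools.partial(run_score_task, state, Target(sample.target), scorers, progress)
        for (sample, state) in zip(log.samples, states)
    ]
)
```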
Update: Things continue to go well and streams are indeed great for many of our scenarios.

One thing we've identified which might require additional work is our interface to S3. We use s3fs, which both makes use of asyncio as well as some really weird idioms for running async code from sync contexts. We could certainly have an initial limitation that S3 logging doesn't work w/ the Trio back end, but I'm imagining you all do need S3?

I think the path to remedying this is not too bad -- an httpx client that uses botocore for auth/signing wouldn't be terribly hard to build. There is a bunch of fancy footwork in s3fs for handling files > 5 GB (multipart uploads are required for that) and of course the usual retry stuff. Let us know if this is an immediate requirement as well as if you might have inclination to work on this. Our log client doesn't do that much (push and pull files, make and list directories, etc.). This is something I think we will want to do anyway in the normal course of things, so if it's not on your immediate wish list then be assured we'll probably get to it in the next few months anyway.
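For illustration, a minimal sketch of that idea — a hypothetical `s3_get` helper that signs with botocore and sends with httpx (which itself runs on anyio, so it works on either backend); multipart uploads and retries omitted:

```python
import httpx
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from botocore.session import get_session

async def s3_get(bucket: str, key: str, region: str) -> bytes:
    # sign a plain HTTPS request with botocore's SigV4 signer
    url = f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
    request = AWSRequest(method="GET", url=url)
    credentials = get_session().get_credentials()
    SigV4Auth(credentials, "s3", region).add_auth(request)

    # send it with httpx's AsyncClient (backend-agnostic via anyio)
    async with httpx.AsyncClient() as client:
        response = await client.get(url, headers=dict(request.headers))
        response.raise_for_status()
        return response.content
```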
Good call on both halves of this IMO; it's worth doing eventually but probably not right now.
I've written effectively-identical helpers; sometimes lists (or dicts) are just what you want! I think about … If ordering matters I'd usually just have one producer task; if I had concurrent producers and also cared about order and needed streaming (so the list-collecting approach wouldn't fit):

```python
async def stream_to_ordered(
    send_stream: MemoryObjectSendStream[T],
    recv_stream: MemoryObjectReceiveStream[tuple[int, T]],
) -> None:
    next_idx = 0
    buffer: dict[int, T] = {}
    async with send_stream, recv_stream:
        async for idx, value in recv_stream:
            buffer[idx] = value
            while next_idx in buffer:
                await send_stream.send(buffer.pop(next_idx))
                next_idx += 1
```

I'm inclined to leave this until later on the roadmap, but …
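For what it's worth, a sketch of how this might be wired up — hypothetical names, assuming anyio 4's generic `create_memory_object_stream` and the `stream_to_ordered` helper above:

```python
import anyio
from anyio.streams.memory import MemoryObjectSendStream

async def produce(idx: int, send: MemoryObjectSendStream[tuple[int, str]]) -> None:
    # hypothetical worker: finishes in reverse order to exercise the reordering
    async with send:  # close this clone when done
        await anyio.sleep(0.01 * (5 - idx))
        await send.send((idx, f"result-{idx}"))

async def main() -> None:
    unordered_send, unordered_recv = anyio.create_memory_object_stream[tuple[int, str]](5)
    ordered_send, ordered_recv = anyio.create_memory_object_stream[str](5)
    async with anyio.create_task_group() as tg:
        tg.start_soon(stream_to_ordered, ordered_send, unordered_recv)
        async with unordered_send:  # close the original; clones keep it open
            for i in range(5):
                tg.start_soon(produce, i, unordered_send.clone())
        async with ordered_recv:
            async for value in ordered_recv:
                print(value)  # result-0 ... result-4, in index order

anyio.run(main)
```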
Okay, we've got this integrated now (on …). The S3 limitation we will leave for now but will come back to it soon. Hopefully we also get some engagement on the textual front. LMK how you get on with this and if there are other things we need to button down.
Over at Anthropic, we're enthusiastic about evals, and often use Inspect. My only complaint about this is that because Trio is our async framework of choice, and Inspect is built directly on asyncio, there's an awkward incompatibility in the underlying async runtime.

Fortunately, this is fixable: the anyio framework offers a very nice structured-concurrency interface (think `asyncio.TaskGroup`, or Trio-style), which can run seamlessly on top of either asyncio or Trio backends. In fact, anyio is already so widely used - for example, by libraries like `httpx` - that you're already (transitively) depending on it!

So... would you be interested in accepting PRs to incrementally migrate onto anyio? There'd be no change for asyncio users, but those of us on Trio would have a much easier time integrating Inspect into our workflows.
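As a minimal illustration of that backend neutrality (a sketch, not Inspect code):

```python
import anyio

async def main() -> None:
    # structured concurrency: both tasks complete (or are cancelled) together
    async with anyio.create_task_group() as tg:
        tg.start_soon(anyio.sleep, 0.1)
        tg.start_soon(anyio.sleep, 0.1)

anyio.run(main)                  # asyncio backend (the default)
anyio.run(main, backend="trio")  # identical code on Trio (requires trio installed)
```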