WIP: Expand docs on unstructuring #515

Draft
wants to merge 1 commit into
base: main
2 changes: 1 addition & 1 deletion docs/basics.md
@@ -72,7 +72,7 @@ This new hook can be used directly or registered to a converter (the original in
```


Now if we use this hook to structure a `Model`, through ✨the magic of function composition✨ that hook will use our old `int_hook`.
Now if we use this hook to structure a `Model`, through ✨the magic of function composition✨ that hook will use the pre-existing (default) `Model` hook, which takes advantage of our `int_hook`.

```python
>>> converter.structure({"a": "1"}, Model)
15 changes: 9 additions & 6 deletions docs/indepth.md
@@ -23,15 +23,16 @@ The new copy may be changed through the `copy` arguments, but will retain all ma
This feature is supported for Python 3.9 and later.
```

By default, collections are unstructured by unstructuring all their contents and then returning a new collection of the same type containing the unstructured contents.

Overriding collection unstructuring in a generic way can be a very useful feature.
A common example is using a JSON library that doesn't support sets, but expects lists and tuples instead.

Using ordinary unstructuring hooks for this is unwieldy due to the semantics of
[singledispatch](https://docs.python.org/3/library/functools.html#functools.singledispatch);
in other words, you'd need to register hooks for all specific types of set you're using (`set[int]`, `set[float]`,
`set[str]`...), which is not useful.
If users simply call `converter.unstructure(my_collection)`, the converter knows only `my_collection.__class__` (for example, `set`) and not any more specific type (for example, `set[int]`). One can therefore call `register_unstructure_hook` for `set` to customize the conversion of these objects. Unfortunately, such a hook handles every `set` regardless of its contents and, short of inspecting those contents, cannot tell that it is actually working with a `set[int]`. For example, a generic `set` unstructuring hook cannot assume that the contents are sortable, whereas a hook specific to `set[int]` could.

If users do specify `unstructure_as`, ordinary unstructuring hooks are unwieldy for collections: it is not possible to call `register_unstructure_hook(set[int], ...)`, and when `unstructure_as` is `set[int]`, the hook registered for `set` is not called.

Function-based hooks can be used instead, but come with their own set of challenges - they're complicated to write efficiently.
Function-based hooks can be used instead: when deciding whether they apply to a given argument, they have access to its declared type. They come with their own challenges, however; in particular, they are complicated to write efficiently, because each one must be given the opportunity to decide whether it will handle every object that is being unstructured.

The {class}`Converter` supports easy customizations of collection unstructuring using its `unstruct_collection_overrides` parameter.
For example, to unstructure all sets into lists, use the following:
@@ -41,10 +42,12 @@ For example, to unstructure all sets into lists, use the following:
>>> from collections.abc import Set
>>> converter = cattrs.Converter(unstruct_collection_overrides={Set: list})

>>> converter.unstructure({1, 2, 3})
>>> converter.unstructure({1, 2, 3}, unstructure_as=Set[int])
[1, 2, 3]
```

Specifically, when the converter is unstructuring and encounters a collection that matches a key in `unstruct_collection_overrides`, it unstructures all the contents of the collection and passes a generator that yields them to the override function. For mapping types the generator yields key-value pairs; for sequence types it yields the elements. Unfortunately, the override function does not have access to the declared type of the collection (here `Set[int]`), so it has to make do with the runtime type. In contrast with a simple `register_unstructure_hook` for `set`, the override function is called regardless of which parameterized generic type is supplied to `unstructure_as`.
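
Because an override only sees runtime values, a function passed in `unstruct_collection_overrides` has to be defensive about the contents it receives. The following is a minimal sketch, assuming the override is handed an iterable of already-unstructured elements as described above. `sorted_if_possible` is a hypothetical name, and the generator expressions at the bottom only simulate what the converter would pass; no _cattrs_ call is made here.

```python
def sorted_if_possible(elements):
    """Hypothetical override: return a sorted list when the elements are comparable."""
    items = list(elements)  # consume the iterable exactly once
    try:
        return sorted(items)
    except TypeError:
        # Mixed or unorderable contents: fall back to the original order.
        return items


# Simulate the input the converter is described as passing: a generator of
# already-unstructured elements (an assumption for illustration only).
ints = sorted_if_possible(x for x in {3, 1, 2})
mixed = sorted_if_possible(x for x in [1, "a"])
```

Such a function could then be supplied as `unstruct_collection_overrides={Set: sorted_if_possible}` in place of `list`.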

Going even further, the `Converter` contains heuristics to support the following Python types, in order of decreasing generality:

- `typing.Sequence`, `typing.MutableSequence`, `list`, `deque`, `tuple`
25 changes: 25 additions & 0 deletions docs/usage.md
@@ -216,3 +216,28 @@ _cattrs_ will now structure both key names into `new_field` on your class.
converter.structure({"new_field": "foo"}, MyInternalAttr)
converter.structure({"old_field": "foo"}, MyInternalAttr)
```

## Chaining hooks

When customizing conversion, one might want to pre-process a value before structuring or unstructuring it, or post-process the result afterwards. One might be tempted to write something like:

```python
def structure_thing(thing, type_):
    # do something to thing
    structured_thing = converter.structure(thing, type_)
    # do something to structured_thing
    return structured_thing
```

Unfortunately, once `structure_thing` is registered as the hook for `type_`, this results in infinite recursion (ending in a `RecursionError`), because the `converter.structure` call dispatches to `structure_thing` again. Instead, use the converter's `get_structure_hook` method to obtain the existing hook for the type, and call that hook directly.

```python
# Capture the existing hook before registering the replacement.
_base_hook = converter.get_structure_hook(type_)

def structure_thing(thing, type_):
    # do something to thing
    structured_thing = _base_hook(thing, type_)
    # do something to structured_thing
    return structured_thing
```

Note that this captures the hook that is current at definition time, so later changes to the hook for that type will not affect `structure_thing`. The order in which such chained hooks are defined must therefore be considered carefully, particularly given subclass matching and the more complex predicates possible with `register_structure_hook_func`.
8 changes: 8 additions & 0 deletions src/cattrs/converters.py
@@ -272,6 +272,14 @@ def __init__(
        self._struct_copy_skip = self._structure_func.get_num_fns()

    def unstructure(self, obj: Any, unstructure_as: Any = None) -> Any:
        """Unstructure an object.

        :param obj: The object to unstructure.
        :param unstructure_as: The type to unstructure as. If not provided, the
            type of the object (``obj.__class__``) will be used. Using ``unstructure_as``
            can allow specification of generics, for example to ensure that ``list[A]``
            is unstructured as ``list[A]`` rather than as ``list``.
        """
        return self._unstructure_func.dispatch(
            obj.__class__ if unstructure_as is None else unstructure_as
        )(obj)