Non-deterministic behavior on dict-like objects with attributes #243

Open
ArthurKantor opened this issue Aug 15, 2022 · 4 comments

@ArthurKantor

Reproduce with:

import glom

class Attributes(dict):
    pass

target = {'a': Attributes({'at1': 1, 'at2': 2})}

spec = glom.Assign('a.at3', 3)
res = glom.glom(target, spec)
print(res)

Sometimes the above prints
{'a': {'at1': 1, 'at2': 2}}

and sometimes

{'a': {'at1': 1, 'at2': 2, 'at3': 3}}

Some analysis

This seems to be caused by get_handler() sometimes returning setitem and sometimes returning setattr. Maybe the correct behavior in this case is to disallow ambiguous paths, but in any case it should return the same handler consistently.
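
To see why the choice of handler changes the printed result, here is a minimal sketch that uses only the standard library, not glom. It assumes the two handlers behave like the built-in setattr and operator.setitem; the Attributes class is the one from the reproduction above.

```python
import operator


class Attributes(dict):
    """Dict subclass; without __slots__ its instances also get a __dict__."""


via_attr = Attributes({'at1': 1, 'at2': 2})
via_item = Attributes({'at1': 1, 'at2': 2})

# Attribute-style assignment succeeds, but the value lands in the
# instance __dict__, so it never shows up among the mapping's keys.
setattr(via_attr, 'at3', 3)

# Item-style assignment stores the value as a regular dict entry.
operator.setitem(via_item, 'at3', 3)

print(via_attr, vars(via_attr))  # {'at1': 1, 'at2': 2} {'at3': 3}
print(via_item, vars(via_item))  # {'at1': 1, 'at2': 2, 'at3': 3} {}
```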

The inconsistent handler is, in turn, caused by different orderings in the type tree for the assign op (I think the type tree should be constant):

OrderedDict([(<class 'object'>,
              OrderedDict([(<class 'glom.core._ObjStyleKeys'>, OrderedDict()),
                           (<class 'glom.core._AbstractIterable'>,
                            OrderedDict([(<class 'dict'>, OrderedDict()),
                                         (<class 'list'>, OrderedDict()),
                                         (<class 'tuple'>,
                                          OrderedDict())]))]))])

for the first result, and

OrderedDict([(<class 'object'>,
              OrderedDict([(<class 'glom.core._AbstractIterable'>,
                            OrderedDict([(<class 'dict'>, OrderedDict()),
                                         (<class 'list'>, OrderedDict()),
                                         (<class 'tuple'>, OrderedDict())])),
                           (<class 'glom.core._ObjStyleKeys'>,
                            OrderedDict())]))])

for the second.
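
Why the ordering matters can be sketched as follows. This is an illustration under assumptions, not glom's actual lookup code: it assumes the handler is picked by walking the top-level branches in order and taking the first one that accepts the target. The predicates has_instance_dict and is_dict are hypothetical stand-ins for the glom.core._ObjStyleKeys and glom.core._AbstractIterable/dict branches, and pick_handler is a made-up helper.

```python
import operator


class Attributes(dict):
    pass


def has_instance_dict(obj):
    # Hypothetical stand-in for the "object-style keys" branch.
    return hasattr(obj, '__dict__')


def is_dict(obj):
    # Hypothetical stand-in for the iterable/dict branch.
    return isinstance(obj, dict)


def pick_handler(obj, branches):
    """Return the handler of the first branch whose predicate accepts obj."""
    for predicate, handler in branches:
        if predicate(obj):
            return handler
    raise TypeError(f'no handler for {type(obj).__name__}')


# The same two branches, in the two orders shown in the type trees above.
order_a = [(has_instance_dict, setattr), (is_dict, operator.setitem)]
order_b = [(is_dict, operator.setitem), (has_instance_dict, setattr)]

target = Attributes(at1=1, at2=2)
print(pick_handler(target, order_a).__name__)  # setattr
print(pick_handler(target, order_b).__name__)  # setitem

# A plain dict has no instance __dict__, so it is unaffected by the order.
print(pick_handler({}, order_a).__name__)  # setitem
```

Under this assumption, only types that match more than one branch, such as a dict subclass without __slots__, are sensitive to the ordering.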

@ArthurKantor
Author

This seems related to #233.

@kurtbrose
Collaborator

Thanks for the reproducing case and analysis. I agree it should be consistent and am surprised it isn't.

@tobyX

tobyX commented Oct 16, 2024

Hello!

I have a similar problem. I'm using tomlkit to parse some TOML files and want to update them according to our standard.

tomlkit uses dict-like objects, and I also found that setitem is used only sometimes, while setattr is used most often.

I read up on glom and found that you recommend using register to tell glom how to handle this.
Is there a better way? I currently have to register 14 types to get a reliable result...

import operator

import glom
import tomlkit.items

glom.register(tomlkit.items.AoT, get=operator.getitem)
glom.register(tomlkit.items.Array, get=operator.getitem)
glom.register(tomlkit.items.Bool, get=operator.getitem)
glom.register(tomlkit.items.Comment, get=operator.getitem)
glom.register(tomlkit.items.Date, get=operator.getitem)
glom.register(tomlkit.items.DateTime, get=operator.getitem)
glom.register(tomlkit.items.Float, get=operator.getitem)
glom.register(tomlkit.items.InlineTable, get=operator.getitem)
glom.register(tomlkit.items.Integer, get=operator.getitem)
glom.register(tomlkit.items.Item, get=operator.getitem)
glom.register(tomlkit.items.Null, get=operator.getitem)
glom.register(tomlkit.items.Table, get=operator.getitem)
glom.register(tomlkit.items.String, get=operator.getitem)
glom.register(tomlkit.items.Whitespace, get=operator.getitem)

PYPROJECT_TOML = """[tool.black]
line_length = 120
"""
parsed_toml = tomlkit.parse(PYPROJECT_TOML)

print(parsed_toml)

glom.assign(parsed_toml, "tool.black.line_length", 200)

print(parsed_toml)

Basically, I would prefer to be able to tell glom to treat parsed_toml as a dict every time. It is a dict with additional features. ;)

@mahmoud
Owner

mahmoud commented Nov 24, 2024

Hey @tobyX, that's related, but not super related; I think it'd make a great separate issue or discussion. I'm really glad you found the registry workaround.

But since we're here, I think the approach I'd take would be to find a common parent type for all of these and register that. You can probably bring that down to just a couple lines and make it less brittle, as well.

I haven't used tomlkit much, but just from using type(...).mro(), I'd probably start with:

import glom
import tomlkit
import operator

glom.register(tomlkit.items.Item, get=operator.getitem)
glom.register(tomlkit.container.Container, get=operator.getitem)
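
The search for a common parent type can be sketched with type.mro() alone. The classes below are hypothetical stand-ins, not tomlkit's real hierarchy, and common_bases is a made-up helper; with tomlkit installed, the same helper could be pointed at the classes from the longer registration list.

```python
def common_bases(classes):
    """Return ancestors shared by every class, in the MRO order of the first."""
    first, *rest = classes
    shared = [base for base in first.mro()
              if all(base in cls.mro() for cls in rest)]
    # object is shared by everything and is not a useful registration target.
    return [base for base in shared if base is not object]


# Hypothetical stand-in hierarchy, for illustration only.
class FakeItem:
    pass


class FakeTable(FakeItem, dict):
    pass


class FakeInlineTable(FakeItem, dict):
    pass


class FakeInteger(FakeItem, int):
    pass


# Prints a list holding FakeItem and dict: both are shared ancestors.
print(common_bases([FakeTable, FakeInlineTable]))

# Prints a list holding only FakeItem: FakeInteger does not derive from dict.
print(common_bases([FakeTable, FakeInlineTable, FakeInteger]))
```

Registering the shared ancestors found this way, rather than each leaf type, is what keeps the registration short.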

Now that said, even that shouldn't really be necessary because both of those actually inherit from builtin types (list and dict respectively). In fact, I'm noticing the registration you've used is only for get and not assign, which is used later in the example. Thus, for assignment, glom is already falling back to using the default operators, because tomlkit has a well-designed Pythonic class/interface hierarchy.

Let me know if that makes sense, hope this helps!
