Compiler: Construct-Dataclasses #256

Closed
wants to merge 6 commits into from

@MatrixEditor MatrixEditor commented Aug 4, 2023

Compiler

I wrote a "compiler" that generates Python files in which the struct definitions are expressed as construct-dataclasses types. Unlike the default Construct compiler, it also supports generating documentation comments and bit structures. The layout of the exported Python files differs slightly from the output of the default Construct compiler:

# typing is used for type annotations on dataclass fields
import typing as t
import dataclasses as dc

# default imports
from construct import *
from construct.lib import *
from construct_dataclasses import *

# -- Enum section --
# ...

# Each dataclass is named with a trailing "_t"
@container
@dc.dataclass
class png_t:
    magic: bytes = csfield(Bytes(8))
    ihdr_len: int = csfield(Int32ub)
    ihdr_type: bytes = csfield(Bytes(4))
    ihdr: png__ihdr_chunk_t = csfield(LazyBound(lambda: png__ihdr_chunk))
    ihdr_crc: bytes = csfield(Bytes(4))
    chunks: t.List[png__chunk_t] = csfield(
        RepeatUntil(
            lambda obj_, list_, this: (
                # This PR also fixes an issue with the Construct compiler where a direct
                # reference to _io was made
                (obj_.type == "IEND") or (stream_iseof(this._io))
            ),
            LazyBound(lambda: png__chunk),
        )
    )


# The actual parser instance stores the original name
png = DataclassStruct(png_t)

# Schema definition and __all__ attribute for wildcard imports at the end
_schema = png
__all__ = [...]

This approach is used so that static code analyzers can see the constructor signature of the generated dataclasses (I know it is not very pretty, but using @dataclass_struct makes the constructor arguments invisible to them).
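
As a rough runtime illustration of the point about constructor signatures, the sketch below uses only the standard library. The class `png_header_t` and its plain default values are hypothetical stand-ins; the generated code would use `csfield(...)` instead. It shows that applying the standard dataclass decorator explicitly yields an `__init__` with one parameter per field, which is the signature that tools can pick up.

```python
import dataclasses as dc
import inspect


@dc.dataclass
class png_header_t:
    # Hypothetical stand-in fields; generated code would use csfield(...) here.
    magic: bytes = b"\x89PNG\r\n\x1a\n"
    ihdr_len: int = 13


# The generated __init__ exposes one parameter per field, in declaration order.
signature = inspect.signature(png_header_t)
param_names = list(signature.parameters)
print(param_names)

# Keyword construction works as the signature suggests.
header = png_header_t(ihdr_len=13)
```

This only inspects the signature at runtime; how a particular static analyzer treats a custom decorator is not shown here.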

Why?

  1. Dataclass objects are more useful than the default Container or ListContainer objects, and they can easily be converted to JSON for further processing.
  2. Since this is not a "real" language, I suppose it can be integrated more easily.
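
The JSON conversion mentioned in the first point could look roughly like the following sketch. It uses only the standard library; `chunk_t`, `image_t`, and `encode_bytes` are hypothetical names chosen for illustration and are not part of the generated output. Because `bytes` values are not JSON serializable by default, the sketch assumes they should be rendered as hex strings.

```python
import dataclasses as dc
import json
import typing as t


@dc.dataclass
class chunk_t:  # hypothetical, simplified chunk
    length: int
    type: str
    crc: bytes


@dc.dataclass
class image_t:  # hypothetical container for parsed chunks
    magic: bytes
    chunks: t.List[chunk_t]


def encode_bytes(value: object) -> str:
    """json.dumps fallback: render bytes as hex, reject anything else."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).hex()
    raise TypeError(f"not JSON serializable: {type(value).__name__}")


image = image_t(
    magic=b"\x89PNG\r\n\x1a\n",
    chunks=[chunk_t(length=0, type="IEND", crc=b"\xaeB`\x82")],
)

# asdict() recurses into nested dataclasses and lists of dataclasses.
as_json = json.dumps(dc.asdict(image), default=encode_bytes, indent=2)
print(as_json)
```

The same conversion is harder to write against Container objects, which do not declare their fields up front.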

Other Features

  • This pull request also fixes an issue in the Construct compiler where, in certain situations, it writes a bare "_io" instead of the this._io reference.
  • Documentation comments are generated using a class taken from Generate docs in Construct compiler #243
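
To illustrate why the bare "_io" reference matters, here is a minimal sketch in plain Python. It does not use Construct itself: `stream_at_eof` is a hypothetical stand-in for an end-of-stream check, and a `SimpleNamespace` plays the role of the parsing context passed to the lambda as `this`. A bare name is resolved against the enclosing scopes rather than the context object, so it fails unless a variable called `_io` happens to exist.

```python
import io
from types import SimpleNamespace


def stream_at_eof(stream: io.BytesIO) -> bool:
    """Hypothetical EOF check: True when no bytes remain to be read."""
    position = stream.tell()
    at_eof = stream.read(1) == b""
    stream.seek(position)  # restore the position so the check has no side effect
    return at_eof


# Stand-in for the parsing context that is passed to the lambda as `this`.
this = SimpleNamespace(_io=io.BytesIO(b""))

broken = lambda obj_, list_, this: stream_at_eof(_io)      # bare name
fixed = lambda obj_, list_, this: stream_at_eof(this._io)  # context attribute

try:
    broken(None, [], this)
    broken_error = None
except NameError as exc:
    broken_error = exc

fixed_result = fixed(None, [], this)
print(type(broken_error).__name__, fixed_result)
```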

@MatrixEditor MatrixEditor marked this pull request as ready for review August 9, 2023 14:06
MatrixEditor and others added 2 commits September 29, 2023 10:50
+ Fixed an issue where an invalid number of braces would be generated when a type hint was missing

2 participants