
Improved pratt ergonomics, added Mini ML example #685

Merged 12 commits into main on Oct 25, 2024
Conversation

@zesterer (Owner) commented Oct 21, 2024

API-wise, this only really includes one breaking change (bar trivial technicalities relating to semi-internal systems): it removes the overloading that pratt operators previously had. The result is that the compiler is much more likely to infer the correct types in user code without annotation.
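For illustration, a minimal sketch of the resulting style (based on the pratt API as it appears in later published chumsky docs; exact signatures at this commit may differ slightly). With the overloading gone, each fold closure has a single fixed shape, so inference works without annotations:

    use chumsky::pratt::*;
    use chumsky::prelude::*;

    let int = text::int::<_, extra::Err<Rich<char>>>(10)
        .from_str()
        .unwrapped()
        .padded();
    let op = |c| just(c).padded();

    let expr = int.pratt((
        // One fixed closure shape per operator kind: prefix folds take
        // (op, rhs, extras), infix folds take (lhs, op, rhs, extras).
        prefix(2, op('-'), |_, x: i64, _| -x),
        infix(left(1), op('+'), |x, _, y, _| x + y),
        infix(left(2), op('*'), |x, _, y, _| x * y),
    ));

    assert_eq!(expr.parse("-1 + 2 * 3").into_result(), Ok(5));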

@zesterer changed the title from "Began working on Mini ML example" to "Added Mini ML example" on Oct 21, 2024
@zesterer changed the title from "Added Mini ML example" to "Improved pratt ergonomics, added Mini ML example" on Oct 22, 2024
@TheOnlyTails

If we can try adding the features from #600 and #682 together in this PR, that would be amazing (I love the changes so far). My language has ~40 constructs that would constitute a Pratt "operator", so it would be really appreciated.

@zesterer (Owner Author)

I can give it a go tomorrow :)

@zesterer (Owner Author)

If we can try adding the features from #600 and #682 together in this PR, that would be amazing

(image attachment)

@TheOnlyTails

I am in genuine awe, this is amazing!

@TheOnlyTails

Just to be sure, the correct way to use this is by using a Vec of boxed operators instead of a tuple of operators, right?

@zesterer (Owner Author)

Just to be sure, the correct way to use this is by using a Vec of boxed operators instead of a tuple of operators, right?

You can do both. A tuple still works as before (and will be faster, since the compiler can inline things much more aggressively). But in cases where you simply can't construct a tuple (for example, if your parser gets built up while the program is running), a Vec will work too.

choice already works in a similar way.
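Concretely, here's how the two forms look with choice (a sketch; the keyword parsers are stand-ins, and the Vec form assumes choice's dynamic support works as described above):

    use chumsky::prelude::*;

    // Static form: a tuple of parsers, known at compile time and
    // aggressively inlinable.
    let keyword = choice((just::<_, &str, extra::Default>("let"), just("in"), just("fn")));
    assert_eq!(keyword.parse("in").into_result(), Ok("in"));

    // Dynamic form: the alternatives are only known at runtime, so each
    // parser is boxed and the lot collected into a Vec.
    let words = vec!["let", "in", "fn"];
    let keyword_dyn = choice(
        words
            .into_iter()
            .map(|w| just::<_, &str, extra::Default>(w).boxed())
            .collect::<Vec<_>>(),
    );
    assert_eq!(keyword_dyn.parse("fn").into_result(), Ok("fn"));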

@zesterer (Owner Author)

@TheOnlyTails I've found a solution to your other problem too. pratt::Operator is now implemented recursively for tuples, so you can nest them inside one another to an arbitrary depth, allowing for an unbounded number of operators in a single pratt combinator (like choice).
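Sketched out, reusing the hypothetical int and op helpers from the earlier example:

    // Inner tuples implement `Operator` themselves, so groups nest to any
    // depth, sidestepping the fixed arity limit of a single tuple.
    let expr = int.pratt((
        (
            infix(left(2), op('*'), |x, _, y, _| x * y),
            infix(left(2), op('/'), |x, _, y, _| x / y),
        ),
        (
            infix(left(1), op('+'), |x, _, y, _| x + y),
            infix(left(1), op('-'), |x, _, y, _| x - y),
        ),
    ));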

@zesterer merged commit 9d2b187 into main on Oct 25, 2024
4 checks passed
@mlgiraud commented Nov 4, 2024

I like this change, but it seems to have a negative performance impact on parsing times compared to having all the operators in a single larger tuple. For my use case this results in an overall compile-time increase of around 100-200ms (parsing takes up most of the time at the moment), from around 1.3 seconds to around 1.5 seconds. The profiler doesn't really give me much useful information about where the time is spent in release mode. I think I might have to try profiling it in valgrind, but it's currently not worth the effort. Just FYI that this seems to be a bit slower.

@zesterer (Owner Author) commented Nov 4, 2024

I like this change, but it seems to have a negative performance impact on parsing times compared to having all the operators in a single larger tuple. [...]

Thanks for letting me know. Just to clarify, are you referring to the compilation time of your compiler itself, or the time rustc requires to compile your compiler?

Additionally, have you made any changes to your code, such as boxing operators or using a Vec as the operator group?

@mlgiraud commented Nov 4, 2024

EDIT:
And yes, I am referring to the actual runtime of my parser, not rustc's runtime. I have not checked whether the change has a major impact on rustc's runtime.

I had to refactor the code a bit because the spanned function was removed in favor of map_span, but this shouldn't really make much of a difference, right? The other change is that I no longer provide a single large tuple of pratt parsers, but a tuple of tuples, where each sub-tuple groups operators by category, e.g. arithmetic, prefix, or comparison. I am not using vectors for the pratt parser.

A small excerpt:

    let operation = atom.clone().pratt((
        (
            prefix(34, unary_operator(UOp::IntNegate), fold_unary_operation),
            prefix(33, unary_operator(UOp::BoolNegate), fold_unary_operation),
            prefix(32, unary_operator(UOp::BitNegate), fold_unary_operation),
            prefix(31, unary_operator(UOp::FloatNegate), fold_unary_operation),
        ),
        (
            infix(left(30), binary_operator(BOp::IntMul), fold_binary_operation),
            infix(left(29), binary_operator(BOp::IntDiv), fold_binary_operation),
            infix(left(28), binary_operator(BOp::IntSDiv), fold_binary_operation),
            infix(left(27), binary_operator(BOp::IntRemainder), fold_binary_operation),
            infix(left(26), binary_operator(BOp::IntSRemainder), fold_binary_operation),
            infix(left(25), binary_operator(BOp::FloatDiv), fold_binary_operation),
            infix(left(24), binary_operator(BOp::FloatMul), fold_binary_operation),
            infix(left(23), binary_operator(BOp::IntAdd), fold_binary_operation),
            infix(left(22), binary_operator(BOp::IntSubtract), fold_binary_operation),
            infix(left(21), binary_operator(BOp::FloatAdd), fold_binary_operation),
            infix(left(20), binary_operator(BOp::FloatSubtract), fold_binary_operation),
        ),
        // ...remaining operator groups elided...
    ));

Before, all pratt parsers were simply in a single tuple.

@zesterer (Owner Author) commented Nov 4, 2024

but this shouldn't really make that much of a difference, right?

Correct, it shouldn't. I'd be quite surprised if the codegen was any different at all, actually.

The other change is only that i now no longer provide a single large tuple of pratt parsers, but a tuple of tuples

That might have had an impact. Chumsky will try each operator in turn to see if it matches, so if more likely operators have now been shuffled closer to the bottom of the list, this could have a small impact on performance. How much that actually matters in practice depends on how complex your parser is and what inputs you're feeding it, of course.
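To illustrate the ordering effect (a hedged sketch reusing the earlier hypothetical int and op helpers, not measured advice):

    // Operators are tried in turn, so listing the statistically common
    // ones first means fewer failed probes on typical inputs.
    let expr = int.pratt((
        infix(left(1), op('+'), |x, _, y, _| x + y), // very common: try first
        infix(left(2), op('*'), |x, _, y, _| x * y),
        infix(left(2), op('%'), |x, _, y, _| x % y), // rarer: try last
    ));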

One option is that I could enable more aggressive inlining. If I make a branch for this, would you be happy to point your project at it and give it a go?

@mlgiraud commented Nov 5, 2024

Sure, I can try out anything, just let me know what branch to use. I will at some point also make my sources public; if the issue still persists, you could just take a look there. The order of the operators is still the same, except that they are now grouped into tuples by category, so this shouldn't make a difference, right?

@zesterer (Owner Author) commented Nov 5, 2024

except that they are now grouped in tuples by category, so this shouldn't make a difference, right?

I would hope not, provided the inliner is doing its job.

I've pushed a branch called inline-pratt. You can point to it like so:

[dependencies]
chumsky = { git = "https://github.com/zesterer/chumsky.git", branch = "inline-pratt" }

Let me know how it goes!

Edit: thinking about it more, I think there's a possibility it could be due to rustc poorly optimising moves... I will do some investigation on this too.

@mlgiraud commented Nov 5, 2024

I'm trying it out right now and will update with the results. I don't think Rust optimizes moves that poorly. I tried passing my AST by reference, for example, but it turned out that moving the whole tree and destructuring it performed better than walking through the references. I need to investigate that further, though. However, this is currently not my main priority, since the compile time is not that critical to my application.

EDIT:
The inlining doesn't seem to make any difference. Maybe it's not due to the updated pratt parsing, but some other change made between the last alpha release and the current master. I can try to bisect, or do you maybe have an idea which change could have caused this?

EDIT2:
I measured again, and it does seem to be slightly faster, but only by around 10-20 ms, so still slower than on the last alpha.
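(For the bisect, one low-effort approach is to pin chumsky to individual commits between the alpha and master via Cargo's rev key and re-run the timing at each step; <commit-hash> below is a placeholder:)

[dependencies]
chumsky = { git = "https://github.com/zesterer/chumsky.git", rev = "<commit-hash>" }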
