Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Account for trailing comments in span handling #527

Merged
merged 1 commit into from
Feb 14, 2024

Conversation

clarkf
Copy link
Contributor

@clarkf clarkf commented Jan 2, 2024

Fixes #464

When using taplo rules to limit configuration to specific portions of a document, trailing comments would break formatting.

Node#text_ranges, when considering tables, was returning ranges for each interior element and the table header, which excluded trailing comments. When the formatter walks the code and is trying to determine which rules apply to a table, it checks whether the table is a subset of the rule "span". When there is a trailing comment, this check fails, and the rule will never apply.

Since the dom Node for tables contains only a reference to the table header, we can get the text range of the entire table, including trailing or interior comments, by traversing up to the parent and taking its text range. If this is unavailable, the previous behavior is used.

Note that this modifies the behavior of crate::dom::Node::text_ranges, which is part of the public API.

Copy link
Collaborator

@ia0 ia0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so familiar with the code to know if that's a correct change. I think my suggestion minimizes the semantic diff while still fixing the issue (if I'm not wrong).

if let Some(mut r) = v.syntax().map(|s| s.text_range()) {
for range in &ranges {
r = r.cover(*range);
ranges.insert(0, r);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior seems related to what you're doing. I wonder if both can't be combined. In particular, I wonder if we should not preserve all the key and entry ranges. I have something like that in mind:

Node::Table(v) => {
    let entries = v.entries().read();
    for (k, entry) in ... { ... } // no change up to here
    let mut cover = None; // range that covers the node
    if let Some(mut r) = v.syntax().map(...) {
        for range in &ranges { ... }
        cover = Some(r); // this replaces the insert(0)
    }
    if let Some(parent) = v.syntax().and_then(|n| n.parent()) {
        cover = Some(match cover {
            None => parent.text_range(),
            Some(r) => r.cover(parent.text_range()),
        )};
    }
    if let Some(r) = cover {
        ranges.insert(0, r);
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right; that's definitely a safer approach. I'll update.

Admittedly, I'm still a bit confused by the point of returning multiple ranges, especially when (in most cases) the first one is a superset of the rest. But the LSP implementation depends on this method, so it's probably best to keep it as is.

Copy link
Owner

@tamasfe tamasfe Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still a bit confused by the point of returning multiple ranges, especially when (in most cases) the first one is a superset of the rest.

The same table can appear anywhere in a document

[foo]

[bar]
field = true

[foo.baz]

In this case foo should have 2 disjointed ranges, merging them into one would mean that the table spans the entire document, which is not true useful, it's especially important when trying to find out which table field belongs to just by looking at the cursor position.

That's the idea anyway, I might have added a range that unifies all for some reason and then pick the narrowest one, it was a while ago, the individual ranges I believe are still important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, I thought it had something to do with dotted table headings and the additional_syntaxes stuff.

I'm not able to find the call-site where the narrowest range might be used to identify a symbol, but, I think this change shouldn't affect it. From all the testing I've done, the only difference I can find is when there's non-critical data (i.e. a comment) trailing a table. In that case, the first (largest) range includes that comment, where previously it didn't.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to play a bit with this change and indeed it seems to only change the case it's supposed to fix. In particular I wanted to see if something like that would get reordered:

[foo]
a = 1
c = 3
# b
b = 2

And it seems it doesn't.

However, you're current change is different from my suggestion in that it will prepend 2 new elements instead of extending the first one (which is always added when there's a parent, because if parent() returns Some then syntax() does too). The question being whether it's better to have a new (probably larger range) first or update the existing added range to be probably larger. I'm not sure on that one. I'm fine with both solutions:

  • Merge this PR as is.
  • Extend the first "artificial" range to also cover the parent if there's one.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, and that was an oversight. I've adjusted it to expand the (single) artificial range to cover the parent if available. Looks suspiciously like your original suggestion!

@clarkf clarkf force-pushed the fix-span-trailing-comment branch 2 times, most recently from c3834c4 to 6cb4ce6 Compare January 3, 2024 15:10
Fixes tamasfe#464

When using taplo rules to limit configuration to specific portions of a
document, trailing comments would break formatting.

`Node#text_ranges`, when considering tables, was returning ranges
excluding trailing comments. When the formatter walks the code and is
trying to determine which rules apply to a table, it checks whether the
table is a subset of the rule "span". When there is a trailing comment,
this check fails, and the rule will never apply.

Since the dom `Node` for tables contains only a reference to the table
header, we can get the text range of the entire table, including
trailing or interior comments, by traversing up to the parent and taking
its text range. If this is unavailable, the previous behavior is used.

Note that this modifies the behavior of `crate::dom::Node::text_ranges`,
which is part of the public API.

Signed-off-by: Clark Fischer <[email protected]>
@clarkf clarkf force-pushed the fix-span-trailing-comment branch from 6cb4ce6 to 3f9678b Compare January 3, 2024 15:36
Copy link
Collaborator

@ia0 ia0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good to me. I was actually commenting to do this exact match you did and seen you pushed it in the meantime.

@ia0 ia0 requested a review from panekj January 3, 2024 15:41
@JounQin JounQin merged commit c950bdc into tamasfe:master Feb 14, 2024
4 checks passed
panekj added a commit that referenced this pull request Jul 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Setting keys with reorder_keys = true does not sort when the last key has a trailing comment
4 participants