Account for trailing comments in span handling #527

clarkf · 2024-01-02T18:24:21Z

Fixes #464

When using taplo rules to limit configuration to specific portions of a document, trailing comments would break formatting.

Node#text_ranges, when considering tables, was returning ranges for each interior element and the table header, which excluded trailing comments. When the formatter walks the code and is trying to determine which rules apply to a table, it checks whether the table is a subset of the rule "span". When there is a trailing comment, this check fails, and the rule will never apply.

Since the dom Node for tables contains only a reference to the table header, we can get the text range of the entire table, including trailing or interior comments, by traversing up to the parent and taking its text range. If this is unavailable, the previous behavior is used.

Note that this modifies the behavior of crate::dom::Node::text_ranges, which is part of the public API.

ia0

I'm not so familiar with the code to know if that's a correct change. I think my suggestion minimizes the semantic diff while still fixing the issue (if I'm not wrong).

ia0 · 2024-01-02T22:27:07Z

crates/taplo/src/dom/node.rs

-                if let Some(mut r) = v.syntax().map(|s| s.text_range()) {
-                    for range in &ranges {
-                        r = r.cover(*range);
+                        ranges.insert(0, r);


This behavior seems related to what you're doing. I wonder if both can't be combined. In particular, I wonder if we should not preserve all the key and entry ranges. I have something like that in mind:

Node::Table(v) => { let entries = v.entries().read(); for (k, entry) in ... { ... } // no change up to here let mut cover = None; // range that covers the node if let Some(mut r) = v.syntax().map(...) { for range in &ranges { ... } cover = Some(r); // this replaces the insert(0) } if let Some(parent) = v.syntax().and_then(|n| n.parent()) { cover = Some(match cover { None => parent.text_range(), Some(r) => r.cover(parent.text_range()), )}; } if let Some(r) = cover { ranges.insert(0, r); } }

You're right; that's definitely a safer approach. I'll update.

Admittedly, I'm still a bit confused by the point of returning multiple ranges, especially when (in most cases) the first one is a superset of the rest. But the LSP implementation depends on this method, so it's probably best to keep it as is.

I'm still a bit confused by the point of returning multiple ranges, especially when (in most cases) the first one is a superset of the rest.

The same table can appear anywhere in a document

[foo] [bar] field = true [foo.baz]

In this case foo should have 2 disjointed ranges, merging them into one would mean that the table spans the entire document, which is not ~~true~~ useful, it's especially important when trying to find out which table field belongs to just by looking at the cursor position.

That's the idea anyway, I might have added a range that unifies all for some reason and then pick the narrowest one, it was a while ago, the individual ranges I believe are still important.

Thanks for the clarification, I thought it had something to do with dotted table headings and the additional_syntaxes stuff.

I'm not able to find the call-site where the narrowest range might be used to identify a symbol, but, I think this change shouldn't affect it. From all the testing I've done, the only difference I can find is when there's non-critical data (i.e. a comment) trailing a table. In that case, the first (largest) range includes that comment, where previously it didn't.

I tried to play a bit with this change and indeed it seems to only change the case it's supposed to fix. In particular I wanted to see if something like that would get reordered:

[foo] a = 1 c = 3 # b b = 2

And it seems it doesn't.

However, you're current change is different from my suggestion in that it will prepend 2 new elements instead of extending the first one (which is always added when there's a parent, because if parent() returns Some then syntax() does too). The question being whether it's better to have a new (probably larger range) first or update the existing added range to be probably larger. I'm not sure on that one. I'm fine with both solutions:

Merge this PR as is.

Extend the first "artificial" range to also cover the parent if there's one.

What do you think?

You're right, and that was an oversight. I've adjusted it to expand the (single) artificial range to cover the parent if available. Looks suspiciously like your original suggestion!

Fixes tamasfe#464 When using taplo rules to limit configuration to specific portions of a document, trailing comments would break formatting. `Node#text_ranges`, when considering tables, was returning ranges excluding trailing comments. When the formatter walks the code and is trying to determine which rules apply to a table, it checks whether the table is a subset of the rule "span". When there is a trailing comment, this check fails, and the rule will never apply. Since the dom `Node` for tables contains only a reference to the table header, we can get the text range of the entire table, including trailing or interior comments, by traversing up to the parent and taking its text range. If this is unavailable, the previous behavior is used. Note that this modifies the behavior of `crate::dom::Node::text_ranges`, which is part of the public API. Signed-off-by: Clark Fischer <[email protected]>

ia0

Thanks! This looks good to me. I was actually commenting to do this exact match you did and seen you pushed it in the meantime.

This reverts commit c950bdc. Fixes #634 Re-open #464

ia0 reviewed Jan 2, 2024

View reviewed changes

clarkf force-pushed the fix-span-trailing-comment branch 2 times, most recently from c3834c4 to 6cb4ce6 Compare January 3, 2024 15:10

clarkf force-pushed the fix-span-trailing-comment branch from 6cb4ce6 to 3f9678b Compare January 3, 2024 15:36

ia0 approved these changes Jan 3, 2024

View reviewed changes

ia0 requested a review from panekj January 3, 2024 15:41

JounQin approved these changes Feb 14, 2024

View reviewed changes

JounQin merged commit c950bdc into tamasfe:master Feb 14, 2024
4 checks passed

rami3l mentioned this pull request Jul 14, 2024

Regression: [rule.formatting] completely overrides the default outside of the specified sections #634

Closed

panekj added a commit that referenced this pull request Jul 27, 2024

Revert "Account for trailing comments in span handling (#527)"

4d08035

This reverts commit c950bdc. Fixes #634 Re-open #464

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Account for trailing comments in span handling #527

Account for trailing comments in span handling #527

clarkf commented Jan 2, 2024

ia0 left a comment

ia0 Jan 2, 2024

clarkf Jan 3, 2024

tamasfe Jan 3, 2024 •

edited

Loading

clarkf Jan 3, 2024

ia0 Jan 3, 2024

clarkf Jan 3, 2024

ia0 left a comment

Account for trailing comments in span handling #527

Account for trailing comments in span handling #527

Conversation

clarkf commented Jan 2, 2024

ia0 left a comment

Choose a reason for hiding this comment

ia0 Jan 2, 2024

Choose a reason for hiding this comment

clarkf Jan 3, 2024

Choose a reason for hiding this comment

tamasfe Jan 3, 2024 • edited Loading

Choose a reason for hiding this comment

clarkf Jan 3, 2024

Choose a reason for hiding this comment

ia0 Jan 3, 2024

Choose a reason for hiding this comment

clarkf Jan 3, 2024

Choose a reason for hiding this comment

ia0 left a comment

Choose a reason for hiding this comment

tamasfe Jan 3, 2024 •

edited

Loading