Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: Introduce definition for sorting according to lexicographic code unit order #3299

Merged
merged 1 commit into from
Apr 25, 2024

Conversation

ptomato
Copy link
Contributor

@ptomato ptomato commented Mar 22, 2024

There are a few places where Lists of Strings are sorted with the awkward phrasing "ordered as if an Array of the same values had been sorted using %Array.prototype.sort% using undefined as comparefn." Encapsulate this sorting into a new AO, SortStringListByCodeUnit. I find this clearer about the intention (the current language leaves it implicit that the sorting is by code unit), and more concise.

This phrasing is used in several places in ECMA-402 as well, so this change will bring additional benefit there.

spec.html Outdated
</dl>
<emu-alg>
1. Let _result_ be a copy of _strings_.
1. [declared="comparefn"] Sort _result_ into the same order as if an Array of the same values had been sorted using %Array.prototype.sort% with *undefined* as _comparefn_.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, and I'd love to see it taken even further by building the methods on top of AOs rather than the other way around:

  • Introduce new AO SortList(_list_, _SortCompare_), replacing SortIndexedProperties step 4 with a call to it.
  • Introduce new AO CompareStringsByCodeUnit(_x_, _y_), replacing CompareArrayElements step 7 onwards with a call to it (and possibly also updating IsLessThan to use it).
  • Redefine this SortStringListByCodeUnit AO to something like
    1. Let _SortCompare_ be a new Abstract Closure with parameters (_x_, _y_) that performs the following steps when called:
      1. Return CompareStringsByCodeUnit(_x_, _y_).
    1. Return SortList(_strings_, _SortCompare_)
    

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's deeper than I had hoped to dig with this PR; how strong is that preference? If there's broad support for that direction I'm happy to do it. But if it's likely to result in back and forth, I'd prefer to keep the smaller scope and open an issue to discuss further changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no objection to pursuing that in a followup if it is contentious.

@michaelficarra michaelficarra added the editor call to be discussed in the next editor call label Mar 25, 2024
ljharb pushed a commit that referenced this pull request Mar 27, 2024
@bakkot
Copy link
Contributor

bakkot commented Mar 27, 2024

Editors discussed this and would prefer to just say something like "sorted according to lexicographic code unit order" in each of the places we need to do this. @syg specifically prefers that over introducing an AO which adds a layer of indirection. I don't think there's any ambiguity there.

We can do that ourselves if you'd like, or of course you are welcome to do so here.

@michaelficarra michaelficarra removed the editor call to be discussed in the next editor call label Mar 27, 2024
@michaelficarra
Copy link
Member

Also, we may want to <dfn> lexicographic to remove any possible ambiguity about how length affects order.

@ptomato ptomato force-pushed the sortstringlistbycodeunit branch from 4c190c8 to 4a93acb Compare March 28, 2024 18:12
@ptomato
Copy link
Contributor Author

ptomato commented Mar 28, 2024

Makes sense, done. I've also taken a stab at the <dfn>, which summarizes and refers to IsLessThan.

spec.html Show resolved Hide resolved
Copy link
Contributor

@bakkot bakkot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@ptomato ptomato force-pushed the sortstringlistbycodeunit branch from 4a93acb to 788a065 Compare March 28, 2024 18:21
@ptomato ptomato force-pushed the sortstringlistbycodeunit branch from 788a065 to ff720a9 Compare March 29, 2024 21:18
@ptomato ptomato changed the title Editorial: Introduce SortStringListByCodeUnit operation Editorial: Introduce definition for sorting according to lexicographic code unit order Mar 29, 2024
@bakkot bakkot added the ready to merge Editors believe this PR needs no further reviews, and is ready to land. label Apr 24, 2024
There are a few places where Lists of Strings are sorted with the awkward
phrasing "ordered as if an Array of the same values had been sorted using
%Array.prototype.sort% using *undefined* as _comparefn_." Change the
wording of this, and add a definition in the section about List
specification types.  I find this clearer about the intention (the
current language leaves it implicit that the sorting is by code unit),
and more concise.

This phrasing is used in several places in ECMA-402 as well, so this change will bring additional benefit there.
@ljharb ljharb force-pushed the sortstringlistbycodeunit branch from ff720a9 to ed82ebb Compare April 25, 2024 04:46
@ljharb ljharb merged commit ed82ebb into tc39:main Apr 25, 2024
7 checks passed
@ptomato ptomato deleted the sortstringlistbycodeunit branch April 25, 2024 04:56
ptomato added a commit to tc39/proposal-temporal that referenced this pull request Apr 30, 2024
No need for the extra AO SortStringListByCodeUnit. ECMA-262 editors
decided to go with the wording "sorted according to lexicographic code
unit order".

The change to AvailableNamedTimeZoneIdentifiers has now been made directly
in ECMA-262, so no need to carry that diff in the Temporal spec anymore.

See tc39/ecma262#3299.
Ms2ger pushed a commit to tc39/proposal-temporal that referenced this pull request May 2, 2024
No need for the extra AO SortStringListByCodeUnit. ECMA-262 editors
decided to go with the wording "sorted according to lexicographic code
unit order".

The change to AvailableNamedTimeZoneIdentifiers has now been made directly
in ECMA-262, so no need to carry that diff in the Temporal spec anymore.

See tc39/ecma262#3299.
ptomato added a commit to ptomato/ecma402 that referenced this pull request May 7, 2024
ben-allen pushed a commit to tc39/ecma402 that referenced this pull request May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
editorial change ready to merge Editors believe this PR needs no further reviews, and is ready to land.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants