Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use strings.Cut and strings.CutPrefix where possible #4470

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Oct 23, 2024

For the most part, this is a switch from strings.Split[N] to strings.Cut,
and from strings.HavePrefix/strings.TrimPrefix to strings.CutPrefix.

Using strings.Cut (added in Go 1.18, see 1) results in faster and
cleaner code with less allocations (as we're not using a slice).

Using strings.CutPrefix (added in Go 1.20, see 2) results in
slightly cleaner easier-to-read code.

There are a few other cleanups and nits here and there.
Please see individual commits for details.

@kolyshkin kolyshkin added the kind/refactor refactoring label Oct 23, 2024
@kolyshkin kolyshkin requested review from dqminh and rata and removed request for dqminh November 2, 2024 00:37
@rata
Copy link
Member

rata commented Nov 5, 2024

@kolyshkin I guess this is due to performance? Did you run some numbers or we just tested before this is faster and we are just using strings.Cut in more places? IMHO, it would be nice to say the reason on the commits.

If something breaks due to this, it would be nice to know that the commit just changed to this due to performance, so a revert is probably safe until a new fixed version is cooked. But that is not that easy to know if we don't explain why we change it in the commit.

@kolyshkin
Copy link
Contributor Author

kolyshkin commented Nov 6, 2024

@kolyshkin I guess this is due to performance? Did you run some numbers or we just tested before this is faster and we are just using strings.Cut in more places? IMHO, it would be nice to say the reason on the commits.

If something breaks due to this, it would be nice to know that the commit just changed to this due to performance, so a revert is probably safe until a new fixed version is cooked. But that is not that easy to know if we don't explain why we change it in the commit.

Thanks, this makes sense. Individual commit messages as well as this PR description updated; PTAL.

@kolyshkin kolyshkin force-pushed the strings-cut branch 2 times, most recently from 819df22 to 62a4183 Compare December 5, 2024 00:04
@kolyshkin
Copy link
Contributor Author

@opencontainers/runc-maintainers PTAL. This is a minor cleanup, aiming for more readable and faster code with less allocations.

@kolyshkin kolyshkin added this to the 1.3.0 milestone Dec 5, 2024
return nil, fmt.Errorf("invalid --cgroup argument: %s", c)
}
if len(cs) == 1 { // no controller: prefix
if ctr, path, ok := strings.Cut(c, ":"); ok {
Copy link
Member

@lifubang lifubang Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I havn't checked that whether this will lead to some regressions or not, for example, if c == "a:b:c:d:e":
When using SplitN, cs[1] will be "b";
When using Cut, path will be "b:c:d:e".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the old code would error out when c == "a:b:c:d:e".

The new code would not, assigning b:c:d:e to path.

To me, the new code is (ever so slightly) more correct, since : is a valid symbol which should be allowed in path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, was looking at that as well; agreed, looks like new code is more correct here.

I still need to have a quick look at the splitting on commas, and the if len(args) != 1 { check below. Probably will check out this branch locally

libcontainer/cgroups/fs/cpuacct.go Outdated Show resolved Hide resolved
libcontainer/cgroups/fs/cpuacct.go Show resolved Hide resolved
libcontainer/cgroups/fs2/defaultpath.go Show resolved Hide resolved
case "avg10":
pv = &data.Avg10
case "avg60":
pv = &data.Avg60
case "avg300":
pv = &data.Avg300
case "total":
v, err := strconv.ParseUint(kv[1], 10, 64)
v, err := strconv.ParseUint(val, 10, 64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but I started to look in some other codebases to dismantle these errors. The default error returned is something like;

invalid total PSI value: strconv.ParseUint: parsing "some invalid value": invalid syntax

Which tends to be a bit "implementation detail", and strconv.ParseUint is not very relevant to the user. So in some cases, dismantling the error can be useful; https://cs.opensource.google/go/go/+/refs/tags/go1.23.4:src/strconv/atoi.go;l=20-49

https://go.dev/play/p/E9Dx1YBzPiN

var numErr *strconv.NumError
if errors.As(err, &numErr) {
	fmt.Printf("invalid  %s PS value (%s): %w", key, numErr.Num, numErr.Err)
}

Which would be something like;

invalid total PSI value (some invalid value): invalid syntax

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strconv.ParseUint is not very relevant to the user.

Indeed it isn't, but at least it says which function returned the error, and any developer (and some advanced users) will deduce that an unsigned integer was expected.

Generally speaking, to make this error more user-friendly, the error text should say something like "expected unsigned integer value, got %q: %w", and the best place to fix this is the actual strconv package.

In this case, though, we're parsing kernel-generated data (and we actually check that it comes from the kernel, by checking the filesystem type in cgroups.OpenFile), and so the chance of getting a non-number here (and thus receive an error) is close to 0 here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it really was just me rambling a bit, having dealt too many times with users "I have the same error" (because the strconv.ParseUint .... didn't resonate with them, and they thought it was a bug, not recognising "invalid input here".

Probably something for a rainy afternoon to look if I can come with some nice approach for these. 😄

libcontainer/cgroups/fscommon/rdma.go Show resolved Hide resolved
libcontainer/cgroups/fscommon/utils.go Outdated Show resolved Hide resolved
Comment on lines 86 to 87
arr := strings.Split(line, " ")
if len(arr) == 2 && arr[0] == key {
val, err := ParseUint(arr[1], 10, 64)
k, v, ok := strings.Cut(line, " ")
if ok && k == key {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if relevant (and probably new code is more correct?); previously, we would ignore cases with more than 2 fields. I.e., key foo bar would be ignored; in the new code, we'd try to parse foo bar as value.

That said; see my other comment; this would potentially be an issue if we try to parse something that may currently have 2 cols, and in future 3 cols; not sure if that should error or silently ignore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, to me the new behavior is also more correct, and if suddenly a third column appears, it means our parsing code is not adequate and we should fix it.

Not relevant to your comment, but by looking at the code, it might be further improved to do something like this:

key += " "
for _, line := range lines {
	if v, ok := strings.CutPrefix(line, key); ok {
		val, err := ParseUint(v, 10, 64)
		...
	}

to avoid allocating memory for k.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by looking at the code, it might be further improved

Added a separate commit.

exec.go Outdated
Comment on lines 133 to 149
// No controller: prefix.
if len(args) != 1 {
return nil, fmt.Errorf("invalid --cgroup argument: %s (missing <controller>: prefix)", c)
}
paths[""] = c
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was trying to somewhat grasp what this case is for;

  • if zero args are passed, we return early (start of function)
  • if 1 arg is passed, it MUST have a controller prefix
  • But if 2 (or more) args are passed, then we don't have to?

So;

# valid: single arg
--cgroup a,b:/hello-world

# valid: single arg without controller
--cgroup /hello-world

# invalid: multiple args, one without controller
--cgroup a,b:/hello-world --cgroup /hello-world
--cgroup /hello-world --cgroup a,b:/hello-world 

So, if no controller is passed, it's for everything which is only allowed if no other (per-controller) paths are specified, correct? (so not allowed to be combined).

☝️ Perhaps that's something we should capture in a comment on the function

Also;

  • perhaps we need to change the ok check to also check if controller is not empty, or is ,,,,,:/hello-world something that should be considered valid?
  • are duplicate controllers valid? They currently overwrite previous values (foo,foo,foo:/bar, foo:/baz)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if 1 arg is passed, it MUST have a controller prefix

Looks like you got it wrong. The correct version is:

  • if 1 arg is passed, it can be with or without controller prefix;
  • if more than 1 arg is passed, they all MUST have a controller prefix.

Specifying more than 1 argument without controller prefix doesn't make sense (i.e. invalid input), and this is what the code is capturing.

In short, a version with controllers is for cgroup v1, as you can (and usually have to) specify different paths for different controllers. A version without controllers is for cgroup v2. This is described in runc-exec(8) man page.

Your other comments (about more comments, allowing empty values for controllers, and allowing to specify duplicated controllers) are all valid, and are addressed by the additional commit (second one in this PR).

This comment was marked as resolved.

@kolyshkin
Copy link
Contributor Author

@lifubang @thaJeztah addressed all your comments, please take another look.

@kolyshkin kolyshkin changed the title Use strings.Cut where possible Use strings.Cut and strings.CutPrefix where possible Jan 7, 2025
@kolyshkin
Copy link
Contributor Author

Added a few commits (at the end) which uses strings.CutPrefix where appropriate. Those are easier to review as they don't change functionality, just make the code more readable.

@@ -65,9 +65,9 @@ func rootlessEUIDMount(config *configs.Config) error {
if _, err := config.HostUID(uid); err != nil {
return fmt.Errorf("cannot specify uid=%d mount option for rootless container: %w", uid, err)
}
continue
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is added so when we've found uid=... in opt, we don't also try it for gid= but go to the next option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative here could be to else if str, ok := strings.CutPrefix(opt, "gid="); ok {, to make it clearer each iteration handles either one or the other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my original commit had if / else if but it results in a larger, harder to review commit, so I switched to continue.

Also note that continue (or else) is optional here, as missing it will not result is a bug, but merely in a very slight performance degradation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thaJeztah ended up with one more commit changing this to if/else (and other minor nits).

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!!

exec.go Outdated
Comment on lines 133 to 149
// No controller: prefix.
if len(args) != 1 {
return nil, fmt.Errorf("invalid --cgroup argument: %s (missing <controller>: prefix)", c)
}
paths[""] = c

This comment was marked as resolved.

libcontainer/cgroups/fs2/defaultpath.go Show resolved Hide resolved
case "avg10":
pv = &data.Avg10
case "avg60":
pv = &data.Avg60
case "avg300":
pv = &data.Avg300
case "total":
v, err := strconv.ParseUint(kv[1], 10, 64)
v, err := strconv.ParseUint(val, 10, 64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it really was just me rambling a bit, having dealt too many times with users "I have the same error" (because the strconv.ParseUint .... didn't resonate with them, and they thought it was a bug, not recognising "invalid input here".

Probably something for a rainy afternoon to look if I can come with some nice approach for these. 😄

libcontainer/cgroups/fscommon/rdma.go Show resolved Hide resolved
libcontainer/cgroups/fscommon/utils.go Show resolved Hide resolved
@@ -65,9 +65,9 @@ func rootlessEUIDMount(config *configs.Config) error {
if _, err := config.HostUID(uid); err != nil {
return fmt.Errorf("cannot specify uid=%d mount option for rootless container: %w", uid, err)
}
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative here could be to else if str, ok := strings.CutPrefix(opt, "gid="); ok {, to make it clearer each iteration handles either one or the other.

Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with less allocations (as we're not using a slice).

This part of code is covered by tests in tests/integration/exec.bats.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <[email protected]>
1. Document the function.
2. Add sanity checks for empty and repeated controllers.

Reported-by: Sebastiaan van Stijn <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
Nowadays strings.Fields are as fast as strings.SplitN so remove TODO.

Signed-off-by: Kir Kolyshkin <[email protected]>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with less allocations (as we're not using a slice). This
also drops the check for extra dash (we're unlikely to get it from the
kernel anyway).

While at it, rename min/max -> from/to to avoid collision with Go
min/max builtins.

This code is tested by TestCPUSetStats* tests.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <[email protected]>
Remove extra global constants that are only used in a single place and
make it harder to read the code.

Rename nanosecondsInSecond -> nsInSec.

This code is tested by unit tests.

Signed-off-by: Kir Kolyshkin <[email protected]>
For cgroup v2, we always expect /proc/$PID/cgroup contents like this:

> 0::/user.slice/user-1000.slice/[email protected]/app.slice/vte-spawn-f71c3fb8-519d-4e2d-b13e-9252594b1e05.scope

So, it does not make sense to parse it using strings.Split, we can just
cut the prefix and return the rest.

Code tested by TestParseCgroupFromReader.

Signed-off-by: Kir Kolyshkin <[email protected]>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with less allocations (as we're not using a slice).

The code is tested by testCgroupResourcesUnified.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <[email protected]>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with less allocations (as we're not using a slice).

This code is tested by TestStatCPUPSI.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <[email protected]>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with less allocations (as we're not using a slice).

Also, use switch in parseRdmaKV.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <[email protected]>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with less allocations (as we're not using a slice).

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <[email protected]>
Using strings.CutPrefix (added in Go 1.20, see [1]) results in faster and
cleaner code with less allocations (as the code only allocates memory
for the value, and does it once).

While at it, improve the function documentation.

[1]: golang/go#42537

Signed-off-by: Kir Kolyshkin <[email protected]>
It makes sense to report an error if a key or a value is empty,
as we don't expect anything like this.

Reported-by: Sebastiaan van Stijn <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
1. GetCgroupParamUint: drop strings.TrimSpace since it was already
   done by GetCgroupParamString.

2. GetCgroupParamInt: use GetCgroupParamString, drop strings.TrimSpace.

Signed-off-by: Kir Kolyshkin <[email protected]>
Using strings.HasPrefix with strings.TrimPrefix results in doing the
same thing (checking if prefix exists) twice. In this case, using
strings.TrimPrefix right away is sufficient.

Signed-off-by: Kir Kolyshkin <[email protected]>
Using strings.CutPrefix (available since Go 1.20) instead of
strings.HasPrefix and/or strings.TrimPrefix makes the code
a tad more straightforward.

No functional change.

Signed-off-by: Kir Kolyshkin <[email protected]>
1. Avoid splitting mount data into []string if it does not contain
   options we're interested in. This should result in slightly less
   garbage to collect.

2. Use if / else if instead of continue, to make it clearer that
   we're processing one option at a time.

3. Print the whole option as a sting in an error message; practically
   this should not have any effect, it's just simpler.

Signed-off-by: Kir Kolyshkin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/refactor refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants