Provide content length for the put method #78

Conversation

murasakiakari
Contributor

@murasakiakari murasakiakari commented Oct 29, 2023

Some WebDAV server implementations (such as Nextcloud) behave unexpectedly when no content length is set on the PUT request. This PR therefore adds a contentLength parameter to the *Client.put() method so that the correct length of the body is sent to the server, which prevents the behavior mentioned above.

@murasakiakari murasakiakari marked this pull request as ready for review October 29, 2023 14:31
@chripo
Member

chripo commented Oct 29, 2023

Thanks!
Could you add / update our tests?

@murasakiakari
Contributor Author

murasakiakari commented Oct 29, 2023

Thanks for your message.
I have some ideas on how to write a unit test after taking a rest; I will update the unit tests ASAP.

@murasakiakari
Contributor Author

The unit test has been added. Thank you for the reminder.

@buzz

buzz commented Dec 12, 2024

If this could be merged and released, I'd be very happy. It's currently a blocker for me using Kopia. Thank you 🙏

buzz added a commit to buzz/kopia that referenced this pull request Dec 12, 2024
@jkowalski
Collaborator

As Kopia maintainer I would also love to see this merged.

@chripo
Member

chripo commented Dec 13, 2024 via email

@chripo chripo merged commit 3de34da into studio-b12:master Dec 17, 2024
@chripo
Member

chripo commented Dec 17, 2024

thanks for your contribution.
released in v0.10.0

contentLength, err = io.Copy(io.Discard, stream)
buffer := bytes.NewBuffer(make([]byte, 0, 1024 * 1024 /* 1MB */))

contentLength, err = io.Copy(buffer, stream)
Member

this sucks for large files.

Contributor Author

@murasakiakari murasakiakari Dec 17, 2024

Thank you for reporting this. I will take a look first and provide a patch ASAP.

Contributor Author

@murasakiakari murasakiakari Dec 17, 2024

@chripo I see that the issue is related to bytes.growSlice allocating a lot of memory during copying.
However, because of the limitations of the Reader interface, I can only think of three other approaches:

  1. perform a manual GC after each bytes.growSlice call, but this needs a custom copy function and I do not think it is good practice
  2. copy the stream to disk first to measure it, but the extra I/O makes this more expensive
  3. add a new function that lets the user provide the content length; this takes the least effort but requires more changes in the user's code base

There is also a fourth option: the server implementation should handle the content length correctly without relying on the client. Still, I think we need to provide the correct value if we send one (although sending 0 is conventional in Go's default HTTP client, and even in HTTP client implementations in other languages).

May I know your opinion on how to handle this problem? Thanks a lot :)

Collaborator

Another pattern commonly used in Go is to dynamically check whether the io.Reader can also provide its length, by trying a type assertion to:

  interface {
    Len() int
  }

(possibly also with other method signatures (Length() int64, Size() int) and so on)

This will support bytes.Buffer and other types that are buffers with Len(). Any other reader can be quickly wrapped in a simple structure that implements Read() and Len() only.

Collaborator

I think it is a terrible idea to buffer the whole stream in memory.

I see that the issue is related to bytes.growSlice allocate many memory during copying.

That's not all, what if the whole thing does not even fit into memory?

  1. perform manual GC after each time of bytes.growSlice, but it needs a custom copy function and I think it is not a good practice

please no


I think this library should not try too hard to determine a content length. The implementation via io.Seeker is fine. Maybe add another check for whether the stream is a *bytes.Buffer and call its Len() method. This should cover most cases, including *os.File.
Anything else would be unexpected to me as a library user. That includes guessing a Len() int, Length() int or Size() int method via an anonymous interface, which might not return the value we want for the content length (for example, https://pkg.go.dev/bufio#Reader.Size returns the size of the buffer, not of the underlying reader), and it includes buffering the whole stream in memory.

In cases where we cannot determine the size of the stream easily, we should just set the contentLength to -1, meaning "unknown size" or keep it at 0.

  2. copy to disk space first for calculation, but io is more expensive in this way

If a library user really needs the content length to be set, the buffering to disk should be a deliberate decision and done beforehand. WriteStream can then use a *os.File as reader.

  3. make a new function that allow user to provide content length, this is most effortless but more changes in needed in user code base

This seems reasonable but would also make the library user responsible for providing the correct value.

Contributor Author

Thank you, everyone, for the great ideas. I would like to provide a fix that gets the content length from the buffer types, as is done in http.NewRequestWithContext, keeps the current seeker method, and leaves the value at 0 if it cannot be determined.

Member

nice collaboration!
