Provide content length for the put method #78

murasakiakari · 2023-10-29T14:24:51Z

Some WebDAV server implementations (such as Nextcloud) behave unexpectedly when no content length is set in the PUT request. I therefore introduce a contentLength parameter to the *Client.put() method so that the correct content length of the body is sent to the server, which prevents the behavior mentioned above.

Signed-off-by: Ben Tam <[email protected]>
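
For readers unfamiliar with the underlying behavior, here is a minimal sketch (not code from this PR) of what the description refers to. It assumes only the standard net/http and httptest packages; `opaqueReader` and `observedLength` are hypothetical names made up for the illustration. When the body type gives net/http no way to infer a size and `ContentLength` is left at 0, the PUT goes out without a Content-Length header, and the receiving Go server reports the length as -1 (unknown).

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// opaqueReader (hypothetical) hides every method except Read, so net/http
// cannot infer the body size the way it can for a *strings.Reader.
type opaqueReader struct{ r io.Reader }

func (o opaqueReader) Read(p []byte) (int, error) { return o.r.Read(p) }

// observedLength PUTs a 5-byte body to a throwaway test server and returns
// the content length the server saw for that request.
func observedLength(setLength bool) (int64, error) {
	seen := make(chan int64, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(io.Discard, r.Body) // drain the upload
		seen <- r.ContentLength
	}))
	defer srv.Close()

	const payload = "hello"
	req, err := http.NewRequest(http.MethodPut, srv.URL, opaqueReader{strings.NewReader(payload)})
	if err != nil {
		return 0, err
	}
	if setLength {
		// The caller supplies the size explicitly, as the PR proposes.
		req.ContentLength = int64(len(payload))
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	for _, set := range []bool{false, true} {
		n, err := observedLength(set)
		fmt.Printf("length set by client: %v, length seen by server: %d, err: %v\n", set, n, err)
	}
}
```

A Go test server copes with either case; the point of the PR is that some other WebDAV servers do not, so the client should send the length when it knows it.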

chripo · 2023-10-29T15:03:07Z

Thanks!
Could you add / update our tests?

murasakiakari · 2023-10-29T18:14:34Z

Thanks for your message.
After taking a rest, I have an idea of how to write a unit test for this. I will update the unit tests ASAP.

murasakiakari · 2023-10-31T15:29:50Z

The unit test has been added. Thank you for the reminder.

buzz · 2024-12-12T20:27:00Z

If this could be merged and released, I'd be very happy. It's currently a blocker for me using Kopia. Thank you 🙏

jkowalski · 2024-12-13T05:06:00Z

As Kopia maintainer I would also love to see this merged.

chripo · 2024-12-13T07:32:03Z

hi all. my fault, i missed some notifications. going to review and merge it next week. thanks for the ping.

chripo · 2024-12-17T10:45:38Z

thanks for your contribution.
released in v0.10.0

chripo · 2024-12-17T10:52:16Z

client.go

-		contentLength, err = io.Copy(io.Discard, stream)
+		buffer := bytes.NewBuffer(make([]byte, 0, 1024 * 1024 /* 1MB */))
+
+		contentLength, err = io.Copy(buffer, stream)


this sucks for large files.

murasakiakari · 2024-12-17

Thank you for reporting this. I will take a look first and provide a patch ASAP.

murasakiakari · 2024-12-17

@chripo I see that the issue is related to bytes.growSlice allocating a lot of memory during copying.
However, because of the limitations of the Reader interface, I can only think of three other ways:

1. Perform a manual GC after each call to bytes.growSlice. This needs a custom copy function, and I do not think it is good practice.

2. Copy the stream to disk first to calculate the length, but the extra I/O makes this more expensive.

3. Add a new function that lets the user provide the content length. This takes the least effort, but it needs more changes in the user's code base.

There is also a further option: the server implementation should handle the content length correctly without relying on the client. Still, I think we need to provide the correct value if we send one (although sending 0 is conventional in Go's default HTTP client, and even in HTTP client implementations in other languages).

May I know your opinion on how to handle this problem? Thx a lot :)

jkowalski · 2024-12-17

Another pattern commonly used in Go is to dynamically check whether the io.Reader can also provide its length, by trying a type assertion to:

interface { Len() int }

(possibly also with other method signatures, such as Length() int64 or Size() int, and so on)

This will support bytes.Buffer and other types that are buffers with Len(). Any other reader can be quickly wrapped in a simple structure that implements only Read() and Len().
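
The pattern described in this comment can be sketched roughly as follows. This is an illustration only, not code from the PR; `lenReader`, `lengthOf`, and `sizedReader` are hypothetical names chosen for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// lenReader is the optional interface being probed for. *bytes.Buffer,
// *bytes.Reader and *strings.Reader all satisfy it.
type lenReader interface {
	Len() int
}

// lengthOf reports the number of unread bytes if r exposes Len().
// ok is false when the length cannot be determined this way.
func lengthOf(r io.Reader) (n int64, ok bool) {
	if lr, isLen := r.(lenReader); isLen {
		return int64(lr.Len()), true
	}
	return 0, false
}

// sizedReader (hypothetical) wraps a reader whose size the caller already
// knows. It implements only Read and Len.
type sizedReader struct {
	R         io.Reader
	Remaining int
}

func (s *sizedReader) Read(p []byte) (int, error) {
	n, err := s.R.Read(p)
	s.Remaining -= n
	return n, err
}

func (s *sizedReader) Len() int { return s.Remaining }

func main() {
	// A buffer exposes Len() directly.
	fmt.Println(lengthOf(bytes.NewBufferString("hello")))
	// *io.LimitedReader has no Len(), so the probe fails.
	fmt.Println(lengthOf(io.LimitReader(strings.NewReader("hello"), 3)))
	// Wrapping an arbitrary reader makes its known size discoverable.
	fmt.Println(lengthOf(&sizedReader{R: strings.NewReader("hello"), Remaining: 5}))
}
```

The wrapper keeps the responsibility for a correct size with the caller, which is the trade-off raised later in the thread.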

ueffel · 2024-12-17

I think it is a terrible idea to buffer the whole stream in memory.

> I see that the issue is related to bytes.growSlice allocate many memory during copying.

That's not all: what if the whole thing does not even fit into memory?

> perform manual GC after each time of bytes.growSlice, but it needs a custom copy function and I think it is not a good practice

please no

I think this library should not try too hard to determine a content length. The implementation via io.Seeker is fine. Maybe add another check for whether the stream is a *bytes.Buffer and call its Len() method. This should cover most cases, including *os.File.
Anything else would be unexpected for me as a library user. That includes guessing at Len() int, Length() int or Size() int via an anonymous interface, which might not return the value we want for the content length (for example, https://pkg.go.dev/bufio#Reader.Size returns the size of the buffer, not of the underlying reader), and it includes buffering the whole stream in memory.

In cases where we cannot easily determine the size of the stream, we should just set contentLength to -1, meaning "unknown size", or keep it at 0.

> copy to disk space first for calculation, but io is more expensive in this way

If a library user really needs the content length to be set, the buffering to disk should be a deliberate decision and done beforehand. WriteStream can then use a *os.File as reader.

> make a new function that allow user to provide content length, this is most effortless but more changes in needed in user code base

This seems reasonable but would also make the library user responsible for providing the correct value.
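
The "buffer to disk beforehand" suggestion above could look roughly like this on the caller's side. It is a sketch under assumptions, not part of the library: `spoolToTemp` and `opaque` are hypothetical helpers, and the final read merely stands in for handing the file to an upload call such as WriteStream.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// spoolToTemp (hypothetical) copies r into a temporary file and rewinds it.
// It returns the file, the number of bytes written, and a cleanup function
// that closes and deletes the file.
func spoolToTemp(r io.Reader) (*os.File, int64, func(), error) {
	tmp, err := os.CreateTemp("", "upload-*")
	if err != nil {
		return nil, 0, nil, err
	}
	cleanup := func() {
		tmp.Close()
		os.Remove(tmp.Name())
	}
	size, err := io.Copy(tmp, r)
	if err == nil {
		// Rewind so the next reader starts at the first byte.
		_, err = tmp.Seek(0, io.SeekStart)
	}
	if err != nil {
		cleanup()
		return nil, 0, nil, err
	}
	return tmp, size, cleanup, nil
}

// opaque (hypothetical) hides Len and Seek to stand in for a stream of
// unknown size, such as a network response body.
func opaque(s string) io.Reader { return struct{ io.Reader }{strings.NewReader(s)} }

func main() {
	f, size, cleanup, err := spoolToTemp(opaque("hello webdav"))
	if err != nil {
		fmt.Println("spool failed:", err)
		return
	}
	defer cleanup()

	// f is an *os.File positioned at offset 0, so a seeker-based length
	// check can size it. Reading it back here stands in for the upload.
	n, _ := io.Copy(io.Discard, f)
	fmt.Println(size, n)
}
```

The disk I/O is paid only by callers who opt in, which keeps the library itself from buffering anything.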

murasakiakari · 2024-12-18

Thank you, everyone, for the great ideas. I would like to provide a fix that gets the content length from buffer types in the same way as http.NewRequestWithContext, keeps the current seeker method, and leaves the value at 0 if it cannot be determined.
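
One possible shape for the approach proposed in this comment is sketched below. It is not the code from the follow-up PR; `contentLengthOf` and `sampleLengths` are hypothetical names. The three concrete types in the switch are the ones http.NewRequestWithContext inspects, and the io.Seeker branch restores the original offset before returning.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// contentLengthOf (hypothetical) returns the number of unread bytes in r,
// or 0 when the size cannot be determined without consuming the stream.
func contentLengthOf(r io.Reader) (int64, error) {
	switch v := r.(type) {
	case *bytes.Buffer:
		return int64(v.Len()), nil
	case *bytes.Reader:
		return int64(v.Len()), nil
	case *strings.Reader:
		return int64(v.Len()), nil
	case io.Seeker:
		// Measure the distance from the current offset to the end,
		// then seek back so the upload starts where the caller left off.
		cur, err := v.Seek(0, io.SeekCurrent)
		if err != nil {
			return 0, err
		}
		end, err := v.Seek(0, io.SeekEnd)
		if err != nil {
			return 0, err
		}
		if _, err := v.Seek(cur, io.SeekStart); err != nil {
			return 0, err
		}
		return end - cur, nil
	}
	return 0, nil // unknown size: nothing is read or buffered
}

// sampleLengths runs contentLengthOf over a few representative readers.
func sampleLengths() []int64 {
	readers := []io.Reader{
		bytes.NewBufferString("hello"),                               // in-memory buffer
		io.NewSectionReader(strings.NewReader("hello webdav"), 6, 6), // seekable, not a buffer type
		io.LimitReader(strings.NewReader("hello"), 3),                // opaque stream
	}
	out := make([]int64, 0, len(readers))
	for _, r := range readers {
		n, err := contentLengthOf(r)
		if err != nil {
			panic(err)
		}
		out = append(out, n)
	}
	return out
}

func main() {
	fmt.Println(sampleLengths()) // [5 6 0]
}
```

None of the branches copies the stream, so memory use does not depend on the size of the upload.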

chripo · 2024-12-19

nice collaboration!

feat: provide content length for put method

c6155fb

Signed-off-by: Ben Tam <[email protected]>

murasakiakari marked this pull request as ready for review October 29, 2023 14:31

fix: use io.Reader after copy

df102c9

Signed-off-by: Ben Tam <[email protected]>

feat: add unit test for upload file to server acquire content length

828132a

Signed-off-by: Ben Tam <[email protected]>

buzz mentioned this pull request Dec 12, 2024

WebDAV: Nextcloud invalid format blob: unexpected end of JSON input kopia/kopia#467

Open

buzz added a commit to buzz/kopia that referenced this pull request Dec 12, 2024

Tempory fix until studio-b12/gowebdav#78 is merged and released

e790c90

chripo merged commit 3de34da into studio-b12:master Dec 17, 2024

chripo reviewed Dec 17, 2024

murasakiakari mentioned this pull request Dec 18, 2024

feat: try to get content length with zero copy #81

Open

julio-lopez mentioned this pull request Jan 8, 2025

build(deps): bump github.com/studio-b12/gowebdav from 0.9.0 to 0.10.0 kopia/kopia#4315

Merged


