Skip to content

Fix invalid binary data for attachments without encoding specified#54

Merged
rcambien merged 2 commits intoRiverline:masterfrom
erickskrauch:binary_file_fix
Apr 29, 2025
Merged

Fix invalid binary data for attachments without encoding specified#54
rcambien merged 2 commits intoRiverline:masterfrom
erickskrauch:binary_file_fix

Conversation

@erickskrauch
Copy link
Contributor

Hello. I have faced this issue with our frontend developer, who was trying to upload a WAV file via our REST API. After uploading, the recording became a static noise. Investigation led me to the multipart parser, which we use for PUT requests. In my case, this call returned ASCII encoding, which led to file data corruption.

if (null === $charset) {
// Try to detect
$charset = mb_detect_encoding($body) ?: 'utf-8';
}
// Only convert if not UTF-8
if ('utf-8' !== strtolower($charset)) {
$body = mb_convert_encoding($body, 'utf-8', $charset);
}

I'm not sure if my fix is correct for all use cases, but it fixes the problem for me. As an alternative it might be better to check null value on $charset, but for now I'm gonna leave it as is because I don't see how a binary file can have any charset.

@rcambien
Copy link
Contributor

I just read the RFC 2388 that refer to RFC 2045 for multipart content encoding:

This is the default value -- that is, "Content-Transfer-Encoding: 7BIT" is assumed if the Content-Transfer-Encoding header field is not present.

The new RFC 7578 deprecated the use of the Content-Transfer-Encoding header.

So you can update this line to add a default value:

// Decode
$encoding = strtolower((string) $this->getHeader('Content-Transfer-Encoding', '7bit'));

That should solve your problem because the function don't try to convert 7bit content.

I just fear this "fix" is a breaking change because we changed the default behavior ...

}

// Don't try to fix the encoding when it's a file
if ('' === $encoding && $this->isFile()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the mime type of the part is not text, and the transfer encoding is missing, as per spec it should be considered an error.

nothing says that if is a file it is binary.

Host: localhost:8080
Accept: */*
Content-Type: multipart/form-data; boundary=----------------------------83ff53821b7c

------------------------------83ff53821b7c
Content-Disposition: form-data; name="upload"; filename="text.txt"
Content-Type: text/plain|

XXX

the text.txt file should be interpreted as 7bit, while your change will interpret it as binary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this will not be an issue anymore with the latest changes 🙏

@erickskrauch
Copy link
Contributor Author

I have updated the implementation, as @rcambien suggested. I fixed the encoding of the simple_multipart.txt and its contents to have the correct value of an empty PNG. I had to adjust some test strings related to the reading from that file. Let me know if you think that it will be better to revert my changes and keep test for a fixed case separately.

@erickskrauch erickskrauch requested a review from goetas April 29, 2025 08:04
@rcambien rcambien merged commit 1410f23 into Riverline:master Apr 29, 2025
9 checks passed
@rcambien
Copy link
Contributor

I just released version 2.2.0 with this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants