When I upload a file to S3 (using a multipart upload request) the content-type of the file will be application/xml unless I specify otherwise. This seems incorrect as a content-type should be omitted if unknown or, at worst, default to application/octet-stream. Per RFC 7231 (3.1.1.5):
A sender that generates a message containing a payload body SHOULD
generate a Content-Type header field in that message unless the
intended media type of the enclosed representation is unknown to the
sender. If a Content-Type header field is not present, the recipient
MAY either assume a media type of "application/octet-stream"
([RFC2046], Section 4.5.1) or examine the data to determine its type.
This ended up causing a bit of confusion here (apache/arrow#11934). An S3 client was trying to be intelligent and inspect the data as XML whenever the file was reported as an XML file, so this default caused the client to inspect files it shouldn't have.
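To make the failure mode concrete, here is a minimal sketch of the recipient-side rule from the RFC text quoted above, plus a hypothetical client check that treats a body as XML only when the media type says so. The names `EffectiveMediaType` and `ShouldInspectAsXml` are invented for illustration and are not part of the AWS SDK or Arrow. The sketch also assumes the header key is spelled exactly `Content-Type`, whereas real HTTP header names are case-insensitive.

```cpp
#include <map>
#include <string>

using Headers = std::map<std::string, std::string>;

// Recipient-side rule from RFC 7231, 3.1.1.5: when no Content-Type is
// present, the recipient may assume "application/octet-stream".
// Hypothetical helper; assumes the key is spelled exactly "Content-Type".
std::string EffectiveMediaType(const Headers& headers) {
  Headers::const_iterator it = headers.find("Content-Type");
  if (it == headers.end()) {
    return "application/octet-stream";
  }
  return it->second;
}

// Hypothetical client check: inspect the body as XML only when the
// effective media type is exactly "application/xml".
bool ShouldInspectAsXml(const Headers& headers) {
  return EffectiveMediaType(headers) == "application/xml";
}
```

Under this sketch, an object stored with no content type is left alone, while an arbitrary binary file that was silently tagged application/xml gets inspected as XML, which is the surprise described above.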
Expected behavior
If the content type of a file is not set then the file should either have no content-type or the content-type should be set to application/octet-stream.
Hi @westonpace ,
Quick question before I dig too deep into this: have you tried using the TransferManager to do multipart uploads, or is there a reason why you can't? I just tried it and didn't see the same behavior, so it might be a good workaround to get you unblocked.
I'm not really blocked by this. It was simple enough to ensure we always specify the content type. Perhaps the main issue was simply that this default isn't documented anywhere and so it was a surprise and took a little while to isolate the root cause.
Current behavior
The file's content-type is set to application/xml
Steps to Reproduce
Reproducible Gist: https://gist.github.com/westonpace/9c3a0baa48083f33aa4880c0cb6a602b
Possible Solution
When the user does not specify a content-type, either leave it unset or default to application/octet-stream.
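The two options could be sketched roughly as follows. This is not SDK code: `BuildUploadHeaders` and `MissingContentType` are hypothetical names, and an empty string is assumed to mean "the caller did not specify a content type".

```cpp
#include <map>
#include <string>

// Policy for uploads where the caller gave no content type (hypothetical).
enum class MissingContentType { kOmit, kOctetStream };

// Builds the headers for an upload request (hypothetical helper).
// Assumption: an empty user_content_type means "not specified".
std::map<std::string, std::string> BuildUploadHeaders(
    const std::string& user_content_type, MissingContentType policy) {
  std::map<std::string, std::string> headers;
  if (!user_content_type.empty()) {
    // A caller-supplied content type is passed through unchanged.
    headers["Content-Type"] = user_content_type;
  } else if (policy == MissingContentType::kOctetStream) {
    headers["Content-Type"] = "application/octet-stream";
  }
  // With kOmit and no caller value, no Content-Type header is emitted.
  return headers;
}
```

Either policy avoids labelling unknown data as application/xml. The workaround mentioned in this thread, always specifying the content type, corresponds to the first branch. In the SDK itself that should amount to calling `SetContentType` on the request object (for example `Aws::S3::Model::CreateMultipartUploadRequest`) before sending it.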
AWS CPP SDK version used
1.8.185
Compiler and Version used
GCC 9.3.0
Operating System and version
Ubuntu 20.04.3