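How to encode the filename parameter of the Content-Disposition header in HTTP?

Web applications that want to force a resource to be downloaded rather than rendered directly in a Web browser issue a Content-Disposition header in the HTTP response of the form:

Content-Disposition: attachment; filename=FILENAME

The filename parameter can be used to suggest a name for the file into which the browser downloads the resource. RFC 2183 (Content-Disposition), however, states in section 2.3 (The Filename Parameter) that the file name can only use US-ASCII characters:

Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII. We recognize the great desirability of allowing arbitrary character sets in filenames, but it is beyond the scope of this document to define the necessary mechanisms.

There is empirical evidence, nevertheless, that most popular Web browsers today seem to permit non-US-ASCII characters yet (for lack of a standard) disagree on the encoding scheme and character set specification of the file name. The question is then: what are the various schemes and encodings employed by the popular browsers if the file name "naïvefile" (without quotes, and where the third letter is U+00EF) needs to be encoded into the Content-Disposition header?

For the purpose of this question, popular browsers are:

Google Chrome
Safari
Internet Explorer or Edge
Firefox
Opera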

I usually URL-encode the file name (using %xx), and it seems to work in all browsers. You should still run some tests of your own.

2008-09-18 15:28:29

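This is discussed, including links to browser testing and backwards compatibility, in the proposed RFC 5987, "Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters".

RFC 2183 indicates that such headers should be encoded according to RFC 2184, which was obsoleted by RFC 2231 and is covered by the draft RFC above.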

2008-09-18 15:39:58

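The following document, linked from the draft RFC that Jim mentions in his answer, addresses the question further and is worth a direct note here:

Test Cases for HTTP Content-Disposition header and RFC 2231/2047 Encoding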

2008-09-18 16:08:16

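There is no interoperable way to encode non-ASCII names in Content-Disposition. Browser compatibility is a mess. The theoretically correct syntax for using UTF-8 in Content-Disposition is very odd: filename*=UTF-8''foo%c3%a4 (yes, that is an asterisk, and there are no quotes except for an empty pair of single quotes in the middle). This header is not quite standard (the HTTP/1.1 specification acknowledges its existence but does not require clients to support it).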
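To make the pieces of that syntax easier to see, here is a minimal Python sketch that takes such a value apart. The function name `decode_ext_value` is hypothetical, and the sketch assumes a well-formed value of the shape charset'language'percent-encoded-text; it is an illustration, not a full parser.

```python
from urllib.parse import unquote


def decode_ext_value(ext_value: str) -> tuple[str, str, str]:
    """Split charset'language'text and percent-decode the text part."""
    # The two single quotes delimit the (often empty) language tag.
    charset, language, encoded = ext_value.split("'", 2)
    return charset, language, unquote(encoded, encoding=charset)


print(decode_ext_value("UTF-8''foo%c3%a4"))  # ('UTF-8', '', 'fooä')
```

The empty string in the middle of the result is the language tag, which explains the two adjacent single quotes in the header.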

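There is a simple and very robust alternative: use a URL that contains the file name you want.

When the name after the last slash is the one you want, you do not need any extra headers!

This trick works: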

/real_script.php/fake_filename.doc

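If your server supports URL rewriting (for example mod_rewrite in Apache), you can hide the script part completely.

Characters in URLs should be in UTF-8, URL-encoded byte by byte: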

/mot%C3%B6rhead   # motörhead
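As a rough illustration of that byte-by-byte encoding, the following Python sketch builds such a download path. The helper name `download_path` is hypothetical, and the script path is simply the one from the example above.

```python
from urllib.parse import quote


def download_path(script_path: str, filename: str) -> str:
    # quote() encodes the text as UTF-8 and percent-escapes each byte;
    # safe="" also escapes "/" so the name stays a single path segment.
    return script_path.rstrip("/") + "/" + quote(filename, safe="")


print(download_path("/real_script.php", "motörhead"))
# /real_script.php/mot%C3%B6rhead
```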

2008-10-19 18:26:36

In ASP.NET MVC 2 I use something like this:

return File(
    tempFile
    , "application/octet-stream"
    , HttpUtility.UrlPathEncode(fileName)
    );

I guess that if you are not using MVC (2), you can simply encode the file name with

HttpUtility.UrlPathEncode(fileName)

2010-07-15 15:08:29

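I know this is an old post, but it is still very relevant. I have found that modern browsers support RFC 5987, which allows UTF-8 encoding with percent-encoding (URL encoding). Naïve file.txt then becomes: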

Content-Disposition: attachment; filename*=UTF-8''Na%C3%AFve%20file.txt

Safari (5) does not support this. Instead, you should use the Safari convention of writing the file name directly into your UTF-8 encoded header:

Content-Disposition: attachment; filename=Naïve file.txt

IE8 and older do not support it either; for those you need the IE convention of UTF-8 encoding with percent-encoding:

Content-Disposition: attachment; filename=Na%C3%AFve%20file.txt

In ASP.NET I use the following code:

string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    contentDisposition = "attachment; filename=" + Uri.EscapeDataString(fileName);
else if (Request.Browser.Browser == "Safari")
    contentDisposition = "attachment; filename=" + fileName;
else
    contentDisposition = "attachment; filename*=UTF-8''" + Uri.EscapeDataString(fileName);
Response.AddHeader("Content-Disposition", contentDisposition);

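I tested the above with IE7, IE8, IE9, Chrome 13, Opera 11, FF5 and Safari 5.

Update, November 2013:

This is the code I currently use. I still have to support IE8, so I cannot get rid of the first part. It turns out that browsers on Android use the built-in Android download manager, which cannot reliably parse file names in the standard way.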

string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    contentDisposition = "attachment; filename=" + Uri.EscapeDataString(fileName);
else if (Request.UserAgent != null && Request.UserAgent.ToLowerInvariant().Contains("android")) // android built-in download manager (all browsers on android)
    contentDisposition = "attachment; filename=\"" + MakeAndroidSafeFileName(fileName) + "\"";
else
    contentDisposition = "attachment; filename=\"" + fileName + "\"; filename*=UTF-8''" + Uri.EscapeDataString(fileName);
Response.AddHeader("Content-Disposition", contentDisposition);

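The above has now been tested in IE7-11, Chrome 32, Opera 12, FF25 and Safari 6, using this file name for the download: 你好abcABCæø一ÆØAaouieeiaeiaouyn ½§!#¤%&()=`@£$ € {[]}+´¨^~'-_,;. 三种

On IE7 it works for some characters but not all. But who cares about IE7 nowadays?

This is the function I use to generate safe file names for Android. Note that I do not know which characters Android supports, but I have tested these and they definitely work: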

private static readonly Dictionary<char, char> AndroidAllowedChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-+,@£$€!½§~'=()[]{}0123456789".ToDictionary(c => c);
private string MakeAndroidSafeFileName(string fileName)
{
    char[] newFileName = fileName.ToCharArray();
    for (int i = 0; i < newFileName.Length; i++)
    {
        if (!AndroidAllowedChars.ContainsKey(newFileName[i]))
            newFileName[i] = '_';
    }
    return new string(newFileName);
}

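@TomZ: I tested in IE7 and IE8, and it turned out that I did not need to escape the apostrophe ('). Do you have an example where it fails?

@Dave Van den Eynde: Combining the two file names on one line, as RFC 6266 describes, works except on Android and IE7+8, and I have updated the code to reflect this. Thank you for the suggestion.

@Thilo: No idea about GoodReader or any other non-browser. You might have some luck with the Android approach.

@Alex Zhukovskiy: I do not know why, but as discussed on Connect it does not seem to work very well.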

2011-07-19 10:34:35

I tested the following code in all major browsers, including older versions of Internet Explorer (via compatibility mode), and it works everywhere:

$filename = $_GET['file']; //this string from $_GET is already decoded
if (strstr($_SERVER['HTTP_USER_AGENT'],"MSIE"))
  $filename = rawurlencode($filename);
header('Content-Disposition: attachment; filename="'.$filename.'"');

2012-05-31 15:48:11

I use the following code snippets for encoding (assuming fileName contains the file's name and extension, e.g. test.txt):

PHP:

if ( strpos ( $_SERVER [ 'HTTP_USER_AGENT' ], "MSIE" ) !== false )
{
     header ( 'Content-Disposition: attachment; filename="' . rawurlencode ( $fileName ) . '"' );
}
else
{
     header( 'Content-Disposition: attachment; filename*=UTF-8\'\'' . rawurlencode ( $fileName ) );
}

Java:

fileName = request.getHeader ( "user-agent" ).contains ( "MSIE" ) ? URLEncoder.encode ( fileName, "utf-8") : MimeUtility.encodeWord ( fileName );
response.setHeader ( "Content-disposition", "attachment; filename=\"" + fileName + "\"");

2013-04-19 11:29:24

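RFC 6266 describes the "Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)". Quoting from it:

6. Internationalization Considerations

The "filename*" parameter (Section 4.3), using the encoding defined in [RFC5987], allows the server to transmit characters outside the ISO-8859-1 character set, and also to optionally specify the language in use.

And in the examples section: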

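This example is the same as the one above, but adding the "filename" parameter for compatibility with user agents not implementing RFC 5987:

Content-Disposition: attachment;
                     filename="EURO rates";
                     filename*=utf-8''%e2%82%ac%20rates

Note: Those user agents that do not support the RFC 5987 encoding ignore "filename*" when it occurs after "filename".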

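Appendix D also contains a long list of suggestions for improving interoperability. It points to a site that compares implementations as well. The tests that currently pass everywhere and are suitable for common file names include:

attwithisofnplain: a plain ISO-8859-1 file name in double quotes, without encoding. This requires a file name that is entirely ISO-8859-1 and contains no percent signs, at least not in front of hex digits.
attfnboth: two parameters in the order described above. This should work for most file names on most browsers, although IE8 will use the "filename" parameter.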
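The "attfnboth" arrangement can be sketched in Python roughly as follows. The function name `content_disposition` and the idea of passing a separate, caller-chosen ASCII fallback are assumptions made for illustration; the sketch also assumes the fallback contains no double quotes or backslashes. The set of unescaped characters follows the attr-char rule of RFC 5987.

```python
from urllib.parse import quote

# Non-alphanumeric characters that RFC 5987 (attr-char) allows unescaped.
ATTR_CHAR_EXTRAS = "!#$&+-.^_`|~"


def content_disposition(filename: str, ascii_fallback: str) -> str:
    # Percent-encode the UTF-8 bytes of everything outside attr-char.
    encoded = quote(filename, safe=ATTR_CHAR_EXTRAS)
    # "filename" comes first, "filename*" second, as in the RFC 6266 example.
    return f"attachment; filename=\"{ascii_fallback}\"; filename*=UTF-8''{encoded}"


print(content_disposition("€ rates", "EURO rates"))
# attachment; filename="EURO rates"; filename*=UTF-8''%E2%82%AC%20rates
```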

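RFC 5987 in turn references RFC 2231, which describes the actual format. RFC 2231 is primarily intended for mail, and RFC 5987 tells us which parts may also be used in HTTP headers. Do not confuse this with the MIME headers used inside a multipart/form-data HTTP body, which are governed by RFC 2388 (section 4.4 in particular) and the HTML 5 draft.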

2014-01-05 12:48:27

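We had a similar problem in a web application and ended up reading the file name from the HTML <input type="file"> and setting it, URL-encoded, in a new HTML <input type="hidden">. Of course we had to strip the path, such as "C:\fakepath\", that some browsers return.

This does not directly answer the OP's question, but it may be a solution for others.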

2015-01-27 11:54:13

I ended up with the following code in my "download.php" script (based on this blog post and these test cases).

$il1_filename = utf8_decode($filename);
$to_underscore = "\"\\#*;:|<>/?";
$safe_filename = strtr($il1_filename, $to_underscore, str_repeat("_", strlen($to_underscore)));

header("Content-Disposition: attachment; filename=\"$safe_filename\""
.( $safe_filename === $filename ? "" : "; filename*=UTF-8''".rawurlencode($filename) ));

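This uses the standard filename="..." form as long as only ISO Latin-1 and "safe" characters are used; otherwise it adds the URL-encoded filename*=UTF-8'' form. According to this specific test case, it should work from MSIE9 up and in recent FF, Chrome and Safari; on lower MSIE versions it should offer a file name containing the ISO8859-1 version of the name, with underscores in place of characters that are not in that encoding.

Final note: the maximum size of each header field is 8190 bytes on Apache. UTF-8 can take up to four bytes per character; after rawurlencode that is x3 = 12 bytes per character. Pretty inefficient, but it should still be theoretically possible to have more than 600 "smiles" %F0%9F%98%81 in the file name.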
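The arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it takes the 8190-byte figure from the note above and, as a simplifying assumption, ignores the bytes used by the header name and the parameter prefix.

```python
from urllib.parse import quote

FIELD_LIMIT = 8190  # bytes per header field, the figure quoted above

smile = "\U0001F601"                 # one 4-byte UTF-8 character
encoded = quote(smile, safe="")      # percent-encoded form
bytes_per_char = len(encoded)        # 4 bytes x 3 characters per byte
max_smiles = FIELD_LIMIT // bytes_per_char

print(encoded, bytes_per_char, max_smiles)
# %F0%9F%98%81 12 682
```

The real ceiling is slightly lower once the rest of the header line is counted, but it stays above 600.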

2015-04-05 15:45:29

In ASP.NET Web API, I URL-encode the file name:

public static class HttpRequestMessageExtensions
{
    public static HttpResponseMessage CreateFileResponse(this HttpRequestMessage request, byte[] data, string filename, string mediaType)
    {
        HttpResponseMessage response = new HttpResponseMessage(HttpStatusCode.OK);
        var stream = new MemoryStream(data);
        stream.Position = 0;

        response.Content = new StreamContent(stream);

        response.Content.Headers.ContentType = 
            new MediaTypeHeaderValue(mediaType);

        // URL-Encode filename
        // Fixes behavior in IE, that filenames with non US-ASCII characters
        // stay correct (not "_utf-8_.......=_=").
        var encodedFilename = HttpUtility.UrlEncode(filename, Encoding.UTF8);

        response.Content.Headers.ContentDisposition =
            new ContentDispositionHeaderValue("attachment") { FileName = encodedFilename };
        return response;
    }
}

2015-06-25 08:10:39

Put the file name in double quotes. That solved the problem for me. Like this:

Content-Disposition: attachment; filename="My Report.doc"

http://kb.mozillazine.org/Filenames_with_spaces_are_truncated_upon_download

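I tested multiple options. Browsers do not follow the specifications and behave differently, and I believe double quotes are the best option.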
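A minimal Python sketch of the double-quote approach is shown below. The helper name `quoted_filename_param` is hypothetical. It assumes an ASCII file name and escapes backslashes and double quotes with a preceding backslash; how individual browsers treat those escapes is something to test rather than take for granted.

```python
def quoted_filename_param(filename: str) -> str:
    # Escape the backslash first, then the double quote, so that the
    # backslashes added for quotes are not escaped a second time.
    escaped = filename.replace("\\", "\\\\").replace('"', '\\"')
    return f'filename="{escaped}"'


print("Content-Disposition: attachment; " + quoted_filename_param("My Report.doc"))
# Content-Disposition: attachment; filename="My Report.doc"
```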

2015-07-10 15:01:51

If you are using a Node.js backend, you can use the following code, which I found here:

var fileName = 'my file(2).txt';
var header = "Content-Disposition: attachment; filename*=UTF-8''" 
             + encodeRFC5987ValueChars(fileName);

function encodeRFC5987ValueChars (str) {
    return encodeURIComponent(str).
        // Note that although RFC3986 reserves "!", RFC5987 does not,
        // so we do not need to escape it
        replace(/['()]/g, escape). // i.e., %27 %28 %29
        replace(/\*/g, '%2A').
            // The following are not required for percent-encoding per RFC5987, 
            // so we can allow for a little better readability over the wire: |`^
            replace(/%(?:7C|60|5E)/g, unescape);
}

2015-09-25 12:45:11

In PHP this did it for me (assuming the file name is UTF-8 encoded):

header('Content-Disposition: attachment;'
    . 'filename="' . addslashes(utf8_decode($filename)) . '";'
    . 'filename*=utf-8\'\'' . rawurlencode($filename));

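Tested on IE8-11, Firefox and Chrome. If the browser can interpret filename*=utf-8, it will use the UTF-8 version of the file name; otherwise it will use the decoded file name. If your file name contains characters that cannot be represented in ISO-8859-1, you may want to consider using iconv instead.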

2016-05-20 12:47:05

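Classic ASP solution

Most modern browsers now support passing the file name as UTF-8, but the file upload solution I use is based on FreeASPUpload.Net (the site no longer exists; the link points to archive.org), and that did not work. Its binary parsing relies on reading single-byte ASCII-encoded strings, which works fine with UTF-8 encoded data until you reach characters that ASCII does not support.

However, I was able to find a solution that makes the code read and parse the binary data as UTF-8.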

Public Function BytesToString(bytes)    'UTF-8..
  Dim bslen
  Dim i, k , N 
  Dim b , count 
  Dim str

  bslen = LenB(bytes)
  str=""

  i = 0
  Do While i < bslen
    b = AscB(MidB(bytes,i+1,1))

    If (b And &HFC) = &HFC Then
      count = 6
      N = b And &H1
    ElseIf (b And &HF8) = &HF8 Then
      count = 5
      N = b And &H3
    ElseIf (b And &HF0) = &HF0 Then
      count = 4
      N = b And &H7
    ElseIf (b And &HE0) = &HE0 Then
      count = 3
      N = b And &HF
    ElseIf (b And &HC0) = &HC0 Then
      count = 2
      N = b And &H1F
    Else
      count = 1
      str = str & Chr(b)
    End If

    If i + count - 1 > bslen Then
      str = str&"?"
      Exit Do
    End If

    If count>1 then
      For k = 1 To count - 1
        b = AscB(MidB(bytes,i+k+1,1))
        N = N * &H40 + (b And &H3F)
      Next
      str = str & ChrW(N)
    End If
    i = i + count
  Loop

  BytesToString = str
End Function

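I was able to get UTF-8 file names by implementing the BytesToString() function from include_aspuploader.asp in my own code.

Useful links

Multipart/form-data and UTF-8 in an ASP Classic application
Unicode, UTF, ASCII, ANSI format differences

2016-05-23 12:17:58

Just an update, since I tried all of this today in response to a customer issue.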

With the exception of Safari configured for Japanese, all browsers our customer tested worked best with filename=text.pdf - where text is a customer value serialized by ASP.Net/IIS in utf-8 without url encoding. For some reason, Safari configured for English would accept and properly save a file with utf-8 Japanese name but that same browser configured for Japanese would save the file with the utf-8 chars uninterpreted. All other browsers tested seemed to work best/fine (regardless of language configuration) with the filename utf-8 encoded without url encoding. I could not find a single browser implementing Rfc5987/8187 at all. I tested with the latest Chrome, Firefox builds plus IE 11 and Edge. I tried setting the header with just filename*=utf-8''texturlencoded.pdf, setting it with both filename=text.pdf; filename*=utf-8''texturlencoded.pdf. Not one feature of Rfc5987/8187 appeared to be getting processed correctly in any of the above.

2019-03-13 19:18:16

The PHP framework Symfony 4 has $filenameFallback in HeaderUtils::makeDisposition. You can look into this function for details; it is similar to the answers above.

Usage example:

$filenameFallback = preg_replace('#^.*\.#', md5($filename) . '.', $filename);
$disposition = $response->headers->makeDisposition(ResponseHeaderBag::DISPOSITION_ATTACHMENT, $filename, $filenameFallback);
$response->headers->set('Content-Disposition', $disposition);

2019-07-22 13:58:45

In .NET 4.5 (and Core 1.0) you can use ContentDispositionHeaderValue to do the formatting for you.

var fileName = "Naïve file.txt";
var h = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment");
h.FileNameStar = fileName;
h.FileName = "fallback-ascii-name.txt";

Response.Headers.Add("Content-Disposition", h.ToString());

h.ToString() will produce:

attachment; filename*=utf-8''Na%C3%AFve%20file.txt; filename=fallback-ascii-name.txt

2021-07-30 07:34:31

For those who need a JavaScript way of encoding the header, I found that this function works well:

function createContentDispositionHeader(filename:string) {
    const encoded = encodeURIComponent(filename);
    return `attachment; filename*=UTF-8''${encoded}; filename="${encoded}"`;
}

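This is based on what Nextcloud does when downloading a file. The file name appears first in UTF-8 encoded form and then, presumably for compatibility with some browsers, again without the UTF-8 prefix.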

2021-08-17 22:30:43

The method mimeHeaderEncode($string) in the library class Unicode does the job.

$file_name= Unicode::mimeHeaderEncode($file_name);

Example in Drupal/PHP:

https://github.com/drupal/core-utility/blob/8.8.x/Unicode.php

/**
   * Encodes MIME/HTTP headers that contain incorrectly encoded characters.
   *
   * For example, Unicode::mimeHeaderEncode('tést.txt') returns
   * "=?UTF-8?B?dMOpc3QudHh0?=".
   *
   * See http://www.rfc-editor.org/rfc/rfc2047.txt for more information.
   *
   * Notes:
   * - Only encode strings that contain non-ASCII characters.
   * - We progressively cut-off a chunk with self::truncateBytes(). This ensures
   *   each chunk starts and ends on a character boundary.
   * - Using \n as the chunk separator may cause problems on some systems and
   *   may have to be changed to \r\n or \r.
   *
   * @param string $string
   *   The header to encode.
   * @param bool $shorten
   *   If TRUE, only return the first chunk of a multi-chunk encoded string.
   *
   * @return string
   *   The mime-encoded header.
   */
  public static function mimeHeaderEncode($string, $shorten = FALSE) {
    if (preg_match('/[^\x20-\x7E]/', $string)) {
      // floor((75 - strlen("=?UTF-8?B??=")) * 0.75);
      $chunk_size = 47;
      $len = strlen($string);
      $output = '';
      while ($len > 0) {
        $chunk = static::truncateBytes($string, $chunk_size);
        $output .= ' =?UTF-8?B?' . base64_encode($chunk) . "?=\n";
        if ($shorten) {
          break;
        }
        $c = strlen($chunk);
        $string = substr($string, $c);
        $len -= $c;
      }
      return trim($output);
    }
    return $string;
  }
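For comparison, the core of that encoding (an RFC 2047 "B" encoded-word) can be sketched in Python as below. The function name `mime_encode_word` is hypothetical, and unlike the Drupal method this sketch produces a single encoded-word without splitting long input into chunks.

```python
import base64


def mime_encode_word(text: str) -> str:
    # Leave printable ASCII untouched, mirroring the regex check above.
    if all(0x20 <= ord(ch) <= 0x7E for ch in text):
        return text
    # Base64-encode the UTF-8 bytes and wrap them as =?charset?B?...?=
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return f"=?UTF-8?B?{payload}?="


print(mime_encode_word("tést.txt"))  # =?UTF-8?B?dMOpc3QudHh0?=
```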

2021-12-21 10:49:51
