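How to encode the filename parameter of the Content-Disposition header in HTTP?

Web applications that want to force a resource to be downloaded rather than rendered directly in a Web browser issue a Content-Disposition header in the HTTP response of the form:

Content-Disposition: attachment; filename=FILENAME

The filename parameter can be used to suggest a name for the file into which the resource is downloaded by the browser. RFC 2183 (Content-Disposition), however, states in section 2.3 (The Filename Parameter) that the file name can only use US-ASCII characters:

Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII. We recognize the great desirability of allowing arbitrary character sets in filenames, but it is beyond the scope of this document to define the necessary mechanisms.

There is empirical evidence, nevertheless, that most popular Web browsers today seem to permit non-US-ASCII characters yet (for lack of a standard) disagree on the encoding scheme and character set specification of the file name. The question is then: what are the various schemes and encodings employed by the popular browsers if the file name "naïvefile" (without quotes, and where the third letter is U+00EF) needs to be encoded into the Content-Disposition header?

For the purpose of this question, the popular browsers are:

Google Chrome, Safari, Internet Explorer or Edge, Firefox, Opera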

Current answer

If you are using a Node.js backend, you can use the following code, which I found here:

var fileName = 'my file(2).txt';
var header = "Content-Disposition: attachment; filename*=UTF-8''"
             + encodeRFC5987ValueChars(fileName);

function encodeRFC5987ValueChars(str) {
    return encodeURIComponent(str)
        // Note that although RFC3986 reserves "!", RFC5987 does not,
        // so we do not need to escape it
        .replace(/['()*]/g, function (c) { // i.e., %27 %28 %29 %2A
            return '%' + c.charCodeAt(0).toString(16).toUpperCase();
        })
        // The following are not required for percent-encoding per RFC5987,
        // so we can allow for a little better readability over the wire: |`^
        .replace(/%(7C|60|5E)/g, function (match, hex) {
            return String.fromCharCode(parseInt(hex, 16));
        });
}
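
To make the result concrete, here is a small self-contained sketch of the same encoding applied to the sample name and to the "naïvefile" name from the question. The helper names encodeRFC5987 and contentDisposition are made up for this illustration, and the output only shows what this encoder produces, not how any particular browser interprets it.

```javascript
// Hypothetical helper: percent-encode a string as an RFC 5987 value.
function encodeRFC5987(str) {
  return encodeURIComponent(str)
    // encodeURIComponent leaves ' ( ) * untouched, so escape them by hand.
    .replace(/['()*]/g, (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase())
    // | ` ^ may stay unescaped under RFC 5987, so restore them for readability.
    .replace(/%(7C|60|5E)/g, (match, hex) => String.fromCharCode(parseInt(hex, 16)));
}

// Hypothetical helper: build the header value (without the header name).
function contentDisposition(fileName) {
  return "attachment; filename*=UTF-8''" + encodeRFC5987(fileName);
}

console.log(contentDisposition('my file(2).txt'));
// attachment; filename*=UTF-8''my%20file%282%29.txt
console.log(contentDisposition('na\u00EFvefile'));
// attachment; filename*=UTF-8''na%C3%AFvefile
```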

2015-09-25 12:45:11

Other answers

I use the following code snippets for encoding (assuming fileName contains the file's name and extension, e.g. test.txt):

PHP:

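// strpos() returns 0 for a match at the start and FALSE for no match,
// so compare against FALSE instead of testing "> 0".
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (strpos($userAgent, 'MSIE') !== false)
{
     header('Content-Disposition: attachment; filename="' . rawurlencode($fileName) . '"');
}
else
{
     header('Content-Disposition: attachment; filename*=UTF-8\'\'' . rawurlencode($fileName));
}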

Java:

// Legacy IE: percent-encoded name (URLEncoder turns spaces into "+"); others: RFC 2047 encoded-word.
fileName = request.getHeader("user-agent").contains("MSIE")
        ? URLEncoder.encode(fileName, "utf-8") : MimeUtility.encodeWord(fileName);
response.setHeader("Content-disposition", "attachment; filename=\"" + fileName + "\"");

2013-04-19 11:29:24

The method mimeHeaderEncode($string) of the library class Unicode does the job.

$file_name= Unicode::mimeHeaderEncode($file_name);

Example in Drupal/PHP:

https://github.com/drupal/core-utility/blob/8.8.x/Unicode.php

/**
   * Encodes MIME/HTTP headers that contain incorrectly encoded characters.
   *
   * For example, Unicode::mimeHeaderEncode('tést.txt') returns
   * "=?UTF-8?B?dMOpc3QudHh0?=".
   *
   * See http://www.rfc-editor.org/rfc/rfc2047.txt for more information.
   *
   * Notes:
   * - Only encode strings that contain non-ASCII characters.
   * - We progressively cut-off a chunk with self::truncateBytes(). This ensures
   *   each chunk starts and ends on a character boundary.
   * - Using \n as the chunk separator may cause problems on some systems and
   *   may have to be changed to \r\n or \r.
   *
   * @param string $string
   *   The header to encode.
   * @param bool $shorten
   *   If TRUE, only return the first chunk of a multi-chunk encoded string.
   *
   * @return string
   *   The mime-encoded header.
   */
  public static function mimeHeaderEncode($string, $shorten = FALSE) {
    if (preg_match('/[^\x20-\x7E]/', $string)) {
      // floor((75 - strlen("=?UTF-8?B??=")) * 0.75);
      $chunk_size = 47;
      $len = strlen($string);
      $output = '';
      while ($len > 0) {
        $chunk = static::truncateBytes($string, $chunk_size);
        $output .= ' =?UTF-8?B?' . base64_encode($chunk) . "?=\n";
        if ($shorten) {
          break;
        }
        $c = strlen($chunk);
        $string = substr($string, $c);
        $len -= $c;
      }
      return trim($output);
    }
    return $string;
  }
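
For comparison, the core of the same RFC 2047 "B" encoding can be sketched in Node.js. This is a simplified illustration rather than a port of the Drupal method: the helper name mimeEncodeWord is made up, and the sketch assumes the input is short enough that no splitting into chunks is needed.

```javascript
// Hypothetical helper: RFC 2047 "B" encoded-word for a short UTF-8 string.
// Unlike the Drupal method, it does not split long input into chunks.
function mimeEncodeWord(str) {
  // Printable ASCII only: nothing to encode.
  if (!/[^\x20-\x7E]/.test(str)) {
    return str;
  }
  return '=?UTF-8?B?' + Buffer.from(str, 'utf8').toString('base64') + '?=';
}

console.log(mimeEncodeWord('t\u00E9st.txt')); // =?UTF-8?B?dMOpc3QudHh0?=
console.log(mimeEncodeWord('test.txt'));      // test.txt
```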

2021-12-21 10:49:51

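We had a similar problem in a web application and ended up reading the filename from the HTML <input type="file"> and setting it, URL-encoded, in a new HTML <input type="hidden">. Of course, we had to remove paths such as "C:\fakepath\" that some browsers return.

Of course, this does not directly answer the OP's question, but it may be a solution for others.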
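
A rough sketch of the two steps described above (dropping the fake path, then URL-encoding the name) might look like the following. The function names are invented for this illustration, and reading the value from the file input and writing it to the hidden input are left out so the snippet runs without a browser.

```javascript
// Hypothetical helper: keep only the last path segment, so that
// "C:\fakepath\report.txt" becomes "report.txt".
function baseName(inputValue) {
  return inputValue.replace(/^.*[\\\/]/, '');
}

// Hypothetical helper: the value that would be stored in the hidden field.
function encodedNameForHiddenField(inputValue) {
  return encodeURIComponent(baseName(inputValue));
}

console.log(encodedNameForHiddenField('C:\\fakepath\\na\u00EFve file.txt'));
// na%C3%AFve%20file.txt
```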

2015-01-27 11:54:13

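This is discussed, including links to browser testing and backwards compatibility, in the proposed RFC 5987, "Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters".

RFC 2183 indicates that such headers should be encoded according to RFC 2184, which was obsoleted by RFC 2231, covered by the draft RFC above.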
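
As an illustration of the backward-compatibility concern, one common pattern is to send a plain ASCII filename alongside the RFC 5987 filename* parameter, so that clients without filename* support still get a usable name. The sketch below assumes that pattern; the helper name dispositionWithFallback and the choice of "_" as a replacement character are arbitrary assumptions, not something the RFC prescribes.

```javascript
// Hypothetical helper: emit both parameters so that clients without
// filename* support can fall back to the plain ASCII filename.
function dispositionWithFallback(fileName) {
  // Assumption: non-ASCII characters, quotes and backslashes become "_" in the fallback.
  const asciiFallback = fileName.replace(/[^\x20-\x7E]|["\\]/g, '_');
  // RFC 5987 value: UTF-8 percent-encoding, with ' ( ) * escaped as well.
  const extended = encodeURIComponent(fileName)
    .replace(/['()*]/g, (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase());
  return 'attachment; filename="' + asciiFallback + '"; filename*=UTF-8\'\'' + extended;
}

console.log(dispositionWithFallback('na\u00EFvefile'));
// attachment; filename="na_vefile"; filename*=UTF-8''na%C3%AFvefile
```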

2008-09-18 15:39:58
