在c#中,我可以将字符串值转换为字符串文字,我将在代码中看到它的方式吗?我想用转义序列替换制表符,换行符等。

如果这段代码:

Console.WriteLine(someString);

生产:

Hello
World!

我想要这样的代码:

Console.WriteLine(ToLiteral(someString));

生产:

\tHello\r\n\tWorld!\r\n

当前回答

这是一个完全可行的实现,包括Unicode和ASCII不可打印字符的转义。它没有像Hallgrim的答案那样插入“+”符号。

static string ToLiteral(string input) {
    StringBuilder literal = new StringBuilder(input.Length + 2);
    literal.Append("\"");
    foreach (var c in input) {
        switch (c) {
            case '\"': literal.Append("\\\""); break;
            case '\\': literal.Append(@"\\"); break;
            case '\0': literal.Append(@"\0"); break;
            case '\a': literal.Append(@"\a"); break;
            case '\b': literal.Append(@"\b"); break;
            case '\f': literal.Append(@"\f"); break;
            case '\n': literal.Append(@"\n"); break;
            case '\r': literal.Append(@"\r"); break;
            case '\t': literal.Append(@"\t"); break;
            case '\v': literal.Append(@"\v"); break;
            default:
                // ASCII printable character
                if (c >= 0x20 && c <= 0x7e) {
                    literal.Append(c);
                // As UTF16 escaped character
                } else {
                    literal.Append(@"\u");
                    literal.Append(((int)c).ToString("x4"));
                }
                break;
        }
    }
    literal.Append("\"");
    return literal.ToString();
}

注意,这也转义了所有Unicode字符。如果你的环境支持它们,你可以改变这一部分,只转义控制字符:

// UTF16 control characters
} else if (Char.GetUnicodeCategory(c) == UnicodeCategory.Control) {
    literal.Append(@"\u");
    literal.Append(((int)c).ToString("x4"));
} else {
    literal.Append(c);
}

其他回答

Hallgrim的回答非常棒。这里有一个小调整,以防您需要用c#正则表达式解析额外的空白字符和换行符。我需要在序列化的JSON值插入到谷歌表的情况下,遇到了麻烦,因为代码正在插入制表符,+,空格等。

  provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, null);
  var literal = writer.ToString();
  var r2 = new Regex(@"\"" \+.\n[\s]+\""", RegexOptions.ECMAScript);
  literal = r2.Replace(literal, "");
  return literal;

Try:

var t = HttpUtility.JavaScriptStringEncode(s);

我提交了自己的实现,它处理空值,并且由于使用数组查找表、手动十六进制转换和避免开关语句,因此性能应该更好。

using System;
using System.Text;
using System.Linq;

public static class StringLiteralEncoding {
  private static readonly char[] HEX_DIGIT_LOWER = "0123456789abcdef".ToCharArray();
  private static readonly char[] LITERALENCODE_ESCAPE_CHARS;

  static StringLiteralEncoding() {
    // Per http://msdn.microsoft.com/en-us/library/h21280bw.aspx
    var escapes = new string[] { "\aa", "\bb", "\ff", "\nn", "\rr", "\tt", "\vv", "\"\"", "\\\\", "??", "\00" };
    LITERALENCODE_ESCAPE_CHARS = new char[escapes.Max(e => e[0]) + 1];
    foreach(var escape in escapes)
      LITERALENCODE_ESCAPE_CHARS[escape[0]] = escape[1];
  }

  /// <summary>
  /// Convert the string to the equivalent C# string literal, enclosing the string in double quotes and inserting
  /// escape sequences as necessary.
  /// </summary>
  /// <param name="s">The string to be converted to a C# string literal.</param>
  /// <returns><paramref name="s"/> represented as a C# string literal.</returns>
  public static string Encode(string s) {
    if(null == s) return "null";

    var sb = new StringBuilder(s.Length + 2).Append('"');
    for(var rp = 0; rp < s.Length; rp++) {
      var c = s[rp];
      if(c < LITERALENCODE_ESCAPE_CHARS.Length && '\0' != LITERALENCODE_ESCAPE_CHARS[c])
        sb.Append('\\').Append(LITERALENCODE_ESCAPE_CHARS[c]);
      else if('~' >= c && c >= ' ')
        sb.Append(c);
      else
        sb.Append(@"\x")
          .Append(HEX_DIGIT_LOWER[c >> 12 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c >>  8 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c >>  4 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c       & 0x0F]);
    }

    return sb.Append('"').ToString();
  }
}

使用Regex.Escape(字符串):

正则表达式。Escape转义最小字符集(,*,+,?,|,{,[, (,), ^, $,。, #和空白),用转义符替换它们 代码。

有趣的问题。

如果你找不到更好的方法,你可以随时替换。 如果你选择它,你可以使用这个c#转义序列列表:

\' - single quote, needed for character literals \" - double quote, needed for string literals \ - backslash \0 - Unicode character 0 \a - Alert (character 7) \b - Backspace (character 8) \f - Form feed (character 12) \n - New line (character 10) \r - Carriage return (character 13) \t - Horizontal tab (character 9) \v - Vertical quote (character 11) \uxxxx - Unicode escape sequence for character with hex value xxxx \xn[n][n][n] - Unicode escape sequence for character with hex value nnnn (variable length version of \uxxxx) \Uxxxxxxxx - Unicode escape sequence for character with hex value xxxxxxxx (for generating surrogates)

这个列表可以在c#常见问题中找到 有哪些字符转义序列可用?