I'm converting VB to C#. The syntax of this statement is giving me trouble:

if ((searchResult.Properties["user"].Count > 0))
{
    profile.User = System.Text.Encoding.UTF8.GetString(searchResult.Properties["user"][0]);
}

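Then I see the following error:

Argument 1: cannot convert from 'object' to 'byte[]'. The best overloaded method match for 'System.Text.Encoding.GetString(byte[])' has some invalid arguments.

I tried to fix the code based on this post, but still without success: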

string User = Encoding.UTF8.GetString("user", 0);

Any suggestions?


Current answer

Use this:

byte[] myByte= System.Text.ASCIIEncoding.Default.GetBytes(myString);
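
For the compiler error in the question specifically, here is a minimal sketch. It assumes the attribute value really is a byte[] at run time (depending on the attribute it may just as well be a string). DecodeProperty is a hypothetical helper, and the object variable stands in for searchResult.Properties["user"][0]:

```csharp
using System;
using System.Text;

static class PropertyDecodingSketch
{
    // Hypothetical helper: the indexer returns object, so the static type must be
    // narrowed to byte[] before Encoding.GetString can be called.
    public static string DecodeProperty(object value)
    {
        if (value is byte[] bytes)
            return Encoding.UTF8.GetString(bytes);

        // Assumption: anything that is not a byte[] is shown via ToString().
        return value?.ToString();
    }

    static void Main()
    {
        // Stand-in for searchResult.Properties["user"][0]
        object raw = Encoding.UTF8.GetBytes("jdoe");
        Console.WriteLine(DecodeProperty(raw));
    }
}
```

The type test avoids an InvalidCastException if the directory returns the value as a string instead.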

Other answers

Encoding.Default should not be used...

Some answers use Encoding.Default, but Microsoft warns against it:

Different computers can use different encodings as the default, and the default encoding can change on a single computer. If you use the Default encoding to encode and decode data streamed between computers or retrieved at different times on the same computer, it may translate that data incorrectly. In addition, the encoding returned by the Default property uses best-fit fallback [i.e. the encoding is totally screwed up, so you can't reencode it back] to map unsupported characters to characters supported by the code page. For these reasons, using the default encoding is not recommended. To ensure that encoded bytes are decoded properly, you should use a Unicode encoding, such as UTF8Encoding or UnicodeEncoding. You could also use a higher-level protocol to ensure that the same format is used for encoding and decoding.

To check what the default encoding is, use Encoding.Default.WindowsCodePage (1250 in my case; sadly, there is no predefined class for the CP1250 encoding, but the object can be retrieved with Encoding.GetEncoding(1250)).
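
A small sketch of that check, assuming a runtime where CodePagesEncodingProvider is available (it ships with modern .NET; older targets may need the System.Text.Encoding.CodePages package). GetCodePage is a hypothetical helper name. On .NET Core and later, legacy code pages such as 1250 are only resolvable after the provider has been registered:

```csharp
using System;
using System.Text;

static class DefaultEncodingSketch
{
    // Hypothetical helper: looks up a legacy code page by number.
    public static Encoding GetCodePage(int codePage)
    {
        // Needed on .NET Core / .NET 5+; registering the same provider again is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage);
    }

    static void Main()
    {
        // The value printed here depends on the machine and on the runtime.
        Console.WriteLine(Encoding.Default.WindowsCodePage);
        Console.WriteLine(GetCodePage(1250).WebName);
    }
}
```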

...UTF-8/UTF-16LE encoding should be used instead...

Encoding.ASCII in the top-scoring answer is 7-bit, so it does not work either, in my case:

byte[] pass = Encoding.ASCII.GetBytes("šarže");
Console.WriteLine(Encoding.ASCII.GetString(pass)); // ?ar?e
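
If silent replacement with '?' is a concern, one option is an ASCII encoding that throws instead. A minimal sketch, where IsAsciiSafe is a hypothetical helper name and "us-ascii" is the encoding name passed to Encoding.GetEncoding:

```csharp
using System;
using System.Text;

static class StrictAsciiSketch
{
    // Returns true when every character of the text survives ASCII encoding.
    public static bool IsAsciiSafe(string text)
    {
        // Same 7-bit encoding, but unmappable characters raise an exception
        // instead of being replaced with '?'.
        Encoding strictAscii = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            strictAscii.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsAsciiSafe("sarze"));            // True
        Console.WriteLine(IsAsciiSafe("\u0161ar\u017Ee"));  // False ("šarže" written with escapes)
    }
}
```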

Following Microsoft's recommendation:

var utf8 = new UTF8Encoding();
byte[] pass = utf8.GetBytes("šarže");
Console.WriteLine(utf8.GetString(pass)); // šarže

Encoding.UTF8, recommended by others, is an instance of the UTF-8 encoding and can also be used directly or as

var utf8 = Encoding.UTF8 as UTF8Encoding;

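Encoding.Unicode is popular for representing strings in memory because it uses a fixed 2 bytes per char (strictly, per UTF-16 code unit; characters outside the Basic Multilingual Plane take two), so one can jump to the n-th char in constant time at the cost of more memory: it is UTF-16LE. In MS VC#, *.cs files are UTF-8 with BOM by default and the string constants in them are converted to UTF-16LE at compile time (see @OwnagelsMagic's comment), but it is NOT defined as the default: many classes, such as StreamWriter, use UTF-8 as their default.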
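
To make the size difference and the StreamWriter default concrete, here is a short sketch. WriteWithDefaultStreamWriter is a hypothetical helper, and the sample word is written with escapes so the result does not depend on how the source file is saved:

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingSizeSketch
{
    // Bytes produced by a StreamWriter constructed without an explicit encoding.
    public static byte[] WriteWithDefaultStreamWriter(string text)
    {
        var stream = new MemoryStream();
        using (var writer = new StreamWriter(stream))
        {
            writer.Write(text);
        }
        return stream.ToArray(); // ToArray still works after the stream is closed
    }

    static void Main()
    {
        string text = "\u0161ar\u017Ee"; // "šarže"
        Console.WriteLine(Encoding.UTF8.GetBytes(text).Length);       // 7
        Console.WriteLine(Encoding.Unicode.GetBytes(text).Length);    // 10
        Console.WriteLine(WriteWithDefaultStreamWriter(text).Length); // 7 (UTF-8, no BOM)
    }
}
```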

...but it is not always used

The term "default encoding" is misleading: .NET uses UTF-8 everywhere (including strings hardcoded in the source code) and UTF-16LE (Encoding.Unicode) to store strings in memory, but Windows actually uses two other, non-UTF-8 defaults: the ANSI code page (for GUI apps predating .NET) and the OEM code page (a.k.a. the DOS standard). These differ from country to country (for instance, the Czech edition of Windows uses CP1250 and CP852) and are often hardcoded in Windows API libraries. So if you just switch the console to UTF-8 with chcp 65001 (as .NET implicitly does, pretending it is the default) and run some localized command (like ping), it works in the English version, but you get tofu text in the Czech Republic.

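Let me share my real-world experience: I created a WinForms application that customizes git scripts for teachers. The output is obtained asynchronously in the background by a process that Microsoft describes as follows (the bold text was added by me):

The word "shell" in this context (UseShellExecute) refers to a graphical shell (similar to the Windows shell, ANSI CP) rather than command shells (for example, bash or sh, OEM CP) and lets users launch graphical applications or open documents with messed-up output in a non-US environment.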

So effectively the GUI defaults to UTF-8, the process defaults to CP1250 and the console defaults to CP852. The output is therefore CP852 interpreted as UTF-8 interpreted as CP1250. I got tofu text from which I could not deduce the original code page because of the double conversion. I was pulling my hair out for a week before I figured out that I had to explicitly set UTF-8 for the process script and convert the output from CP1250 to UTF-8 in the main thread. Now it works here in Eastern Europe, but Western European Windows uses CP1252. The ANSI code page is not easy to determine, as many commands like systeminfo are also localized and other methods differ from version to version: in such an environment, displaying national characters reliably is almost infeasible.

So, until the middle of the 21st century, please DO NOT rely on any "default code page"; set the encoding explicitly (to UTF-8 or UTF-16LE if possible).
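
A rough sketch of what "setting it explicitly" can look like for a child process. CreateStartInfo and ToUtf8 are hypothetical helpers, the git command line is only a placeholder, and whether the child process actually emits UTF-8 is an assumption that has to be checked per tool:

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class ExplicitEncodingSketch
{
    // Hypothetical helper: start info whose redirected stdout is decoded with a chosen encoding.
    public static ProcessStartInfo CreateStartInfo(string fileName, string arguments, Encoding outputEncoding)
    {
        return new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,        // required for redirection
            RedirectStandardOutput = true,  // required for StandardOutputEncoding
            StandardOutputEncoding = outputEncoding,
            CreateNoWindow = true
        };
    }

    // Hypothetical helper: re-encodes bytes from a known source code page into UTF-8.
    public static byte[] ToUtf8(byte[] bytes, int sourceCodePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance); // needed on .NET Core / .NET 5+
        return Encoding.Convert(Encoding.GetEncoding(sourceCodePage), Encoding.UTF8, bytes);
    }

    static void Main()
    {
        // The process is only configured here, not started.
        ProcessStartInfo info = CreateStartInfo("git", "status", Encoding.UTF8);
        Console.WriteLine(info.StandardOutputEncoding.WebName);

        // Round trip: text encoded as CP1250, converted to UTF-8, decoded again.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] cp1250 = Encoding.GetEncoding(1250).GetBytes("\u0161ar\u017Ee"); // "šarže"
        Console.WriteLine(Encoding.UTF8.GetString(ToUtf8(cp1250, 1250)) == "\u0161ar\u017Ee");
    }
}
```

The conversion only helps when the source code page is known. Once the bytes have been decoded with the wrong encoding, as in the double conversion described above, the original text may not be recoverable.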

You can also use an extension method to add a method to the string type, as below:

static class Helper
{
   public static byte[] ToByteArray(this string str)
   {
      return System.Text.Encoding.ASCII.GetBytes(str);
   }
}

And use it like below:

string foo = "bla bla";
byte[] result = foo.ToByteArray();
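
Since ASCII drops anything outside 7 bits, a variant that lets the caller choose the encoding may be safer. This is only a sketch: ToBytes is a hypothetical name, and the UTF-8 fallback is an assumption in line with the advice elsewhere in this thread:

```csharp
using System;
using System.Text;

static class EncodingAwareHelper
{
    // Hypothetical variant: the caller picks the encoding; UTF-8 is used when none is given.
    public static byte[] ToBytes(this string str, Encoding encoding = null)
    {
        return (encoding ?? Encoding.UTF8).GetBytes(str);
    }
}

static class EncodingAwareHelperDemo
{
    static void Main()
    {
        string foo = "bla bla";
        Console.WriteLine(foo.ToBytes().Length);                 // 7 (UTF-8)
        Console.WriteLine(foo.ToBytes(Encoding.Unicode).Length); // 14 (UTF-16LE)
    }
}
```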

var result = System.Text.Encoding.Unicode.GetBytes(text);

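You could use the MemoryMarshal API to perform a very fast and efficient conversion. The string is implicitly converted to ReadOnlySpan<char>, since MemoryMarshal.Cast<char, byte> accepts either Span<char> or ReadOnlySpan<char> as its input parameter, and the result is a ReadOnlySpan<byte> over the string's own memory.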

using System;
using System.Runtime.InteropServices;

public static class StringExtensions
{
    public static byte[] ToByteArray(this string s) => s.ToByteSpan().ToArray(); // heap allocation, use only when you cannot operate on spans
    public static ReadOnlySpan<byte> ToByteSpan(this string s) => MemoryMarshal.Cast<char, byte>(s);
}

The following benchmark shows the difference:

Input: "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s,"

|                       Method |       Mean |     Error |    StdDev |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------------------------- |-----------:|----------:|----------:|-------:|------:|------:|----------:|
| UsingEncodingUnicodeGetBytes | 160.042 ns | 3.2864 ns | 6.4099 ns | 0.0780 |     - |     - |     328 B |
| UsingMemoryMarshalAndToArray |  31.977 ns | 0.7177 ns | 1.5753 ns | 0.0781 |     - |     - |     328 B |
|           UsingMemoryMarshal |   1.027 ns | 0.0565 ns | 0.1630 ns |      - |     - |     - |         - |
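
One thing to keep in mind with the MemoryMarshal approach is that the span exposes the string's in-memory UTF-16 code units in the machine's native byte order. On little-endian hardware this should match Encoding.Unicode.GetBytes, which the following sketch checks (MatchesUtf16Le is a hypothetical helper name):

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class MemoryMarshalSketch
{
    // True when the reinterpreted span holds the same bytes as Encoding.Unicode.GetBytes.
    public static bool MatchesUtf16Le(string s)
    {
        ReadOnlySpan<byte> view = MemoryMarshal.Cast<char, byte>(s.AsSpan()); // no copy
        ReadOnlySpan<byte> encoded = Encoding.Unicode.GetBytes(s);            // allocates
        return view.SequenceEqual(encoded);
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.IsLittleEndian);
        Console.WriteLine(MatchesUtf16Le("\u0161ar\u017Ee")); // expected True on little-endian machines
    }
}
```

On a big-endian machine the two would differ, so the span version is not a drop-in replacement where a specific byte order is required.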
