我有一个应用程序,发送一个POST请求到VB论坛软件,并登录某人(没有设置cookie或任何东西)。
一旦用户登录,我就创建一个变量,在他们的本地机器上创建一个路径。
c: \用户tempfolder枣\
问题是一些用户名抛出“非法字符”异常。例如,如果我的用户名是mas|fenix,它会抛出一个异常。
Path.Combine( _
Environment.GetFolderPath(System.Environment.SpecialFolder.CommonApplicationData), _
DateTime.Now.ToString("ddMMyyhhmm") + "-" + form1.username)
我不想从字符串中删除它,但是通过FTP在服务器上创建了一个具有其用户名的文件夹。这就引出了我的第二个问题。如果我在服务器上创建一个文件夹,我可以留下“非法字符”吗?我问这个问题是因为服务器是基于Linux的,我不确定Linux是否接受它。
编辑:似乎URL编码不是我想要的..这就是我想做的:
old username = mas|fenix
new username = mas%xxfenix
其中%xx是ASCII值或任何其他容易识别字符的值。
Levi Botelho评论说,以前生成的编码表对于。net 4.5来说不再准确了,因为在。net 4.0和4.5之间,编码发生了轻微的变化。因此,我重新生成了。net 4.5的表:
Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded WebUtilityUrlEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded WebUtilityHtmlEncoded HexEscaped
A A A A A A A A A A %41
B B B B B B B B B B %42
a a a a a a a a a a %61
b b b b b b b b b b %62
0 0 0 0 0 0 0 0 0 0 %30
1 1 1 1 1 1 1 1 1 1 %31
[space] + + %20 + %20 %20 [space] [space] [space] %20
! ! ! ! ! %21 ! ! ! ! %21
" %22 %22 " %22 %22 %22 " " " %22
# %23 %23 # %23 %23 # # # # %23
$ %24 %24 $ %24 %24 $ $ $ $ %24
% %25 %25 % %25 %25 %25 % % % %25
& %26 %26 & %26 %26 & & & & %26
' %27 %27 ' %27 %27 ' ' ' ' %27
( ( ( ( ( %28 ( ( ( ( %28
) ) ) ) ) %29 ) ) ) ) %29
* * * * * %2A * * * * %2A
+ %2b %2b + %2B %2B + + + + %2B
, %2c %2c , %2C %2C , , , , %2C
- - - - - - - - - - %2D
. . . . . . . . . . %2E
/ %2f %2f / %2F %2F / / / / %2F
: %3a %3a : %3A %3A : : : : %3A
; %3b %3b ; %3B %3B ; ; ; ; %3B
< %3c %3c < %3C %3C %3C < < < %3C
= %3d %3d = %3D %3D = = = = %3D
> %3e %3e > %3E %3E %3E > > > %3E
? %3f %3f ? %3F %3F ? ? ? ? %3F
@ %40 %40 @ %40 %40 @ @ @ @ %40
[ %5b %5b [ %5B %5B [ [ [ [ %5B
\ %5c %5c \ %5C %5C %5C \ \ \ %5C
] %5d %5d ] %5D %5D ] ] ] ] %5D
^ %5e %5e ^ %5E %5E %5E ^ ^ ^ %5E
_ _ _ _ _ _ _ _ _ _ %5F
` %60 %60 ` %60 %60 %60 ` ` ` %60
{ %7b %7b { %7B %7B %7B { { { %7B
| %7c %7c | %7C %7C %7C | | | %7C
} %7d %7d } %7D %7D %7D } } } %7D
~ %7e %7e ~ %7E ~ ~ ~ ~ ~ %7E
Ā %c4%80 %u0100 %c4%80 %C4%80 %C4%80 %C4%80 Ā Ā Ā [OoR]
ā %c4%81 %u0101 %c4%81 %C4%81 %C4%81 %C4%81 ā ā ā [OoR]
Ē %c4%92 %u0112 %c4%92 %C4%92 %C4%92 %C4%92 Ē Ē Ē [OoR]
ē %c4%93 %u0113 %c4%93 %C4%93 %C4%93 %C4%93 ē ē ē [OoR]
Ī %c4%aa %u012a %c4%aa %C4%AA %C4%AA %C4%AA Ī Ī Ī [OoR]
ī %c4%ab %u012b %c4%ab %C4%AB %C4%AB %C4%AB ī ī ī [OoR]
Ō %c5%8c %u014c %c5%8c %C5%8C %C5%8C %C5%8C Ō Ō Ō [OoR]
ō %c5%8d %u014d %c5%8d %C5%8D %C5%8D %C5%8D ō ō ō [OoR]
Ū %c5%aa %u016a %c5%aa %C5%AA %C5%AA %C5%AA Ū Ū Ū [OoR]
ū %c5%ab %u016b %c5%ab %C5%AB %C5%AB %C5%AB ū ū ū [OoR]
列表示编码如下:
UrlEncoded: HttpUtility。UrlEncode
UrlEncodedUnicode: HttpUtility。UrlEncodeUnicode
UrlPathEncoded: HttpUtility。UrlPathEncode
WebUtilityUrlEncoded: WebUtility。UrlEncode
EscapedDataString: Uri。EscapeDataString
EscapedUriString: Uri。EscapeUriString
HtmlEncoded: HttpUtility。HtmlEncode
HtmlAttributeEncoded: HttpUtility。HtmlAttributeEncode
WebUtilityHtmlEncoded: WebUtility。HtmlEncode
HexEscaped: Uri。HexEscape
注:
HexEscape只能处理前255个字符。因此,它对拉丁a扩展字符抛出ArgumentOutOfRange异常(例如Ā)。
该表是在。net 4.5中生成的(有关。net 4.0及以下的编码,请参阅https://stackoverflow.com/a/11236038/216440)。
编辑:
作为Discord回答的结果,我添加了新的WebUtility UrlEncode和HtmlEncode方法,这是在。net 4.5中引入的。
我觉得这里的人被UrlEncode信息转移了注意力。URLEncoding不是你想要的——你想要编码在目标系统上不能作为文件名的东西。
假设你想要一些通用性——随意地在几个系统(MacOS, Windows, Linux和Unix)上找到非法字符,将它们联合起来形成一组字符来转义。
As for the escape, a HexEscape should be fine (Replacing the characters with %XX). Convert each character to UTF-8 bytes and encode everything >128 if you want to support systems that don't do unicode. But there are other ways, such as using back slashes "\" or HTML encoding """. You can create your own. All any system has to do is 'encode' the uncompatible character away. The above systems allow you to recreate the original name -- but something like replacing the bad chars with spaces works also.
与上面相同的是,唯一可以使用的是
Uri。EscapeDataString
-它编码OAuth需要的所有东西,它不编码OAuth禁止编码的东西,并将空格编码为%20而不是+(也在OATH规范中)参见:RFC 3986。AFAIK,这是最新的URI规范。
您应该只对用户名或URL中可能无效的其他部分进行编码。URL编码URL可能会导致这样的问题:
string url = HttpUtility.UrlEncode("http://www.google.com/search?q=Example");
将产生
http 3a % 2f 2fwww。google。com % 2fsearch 3fq % 3dExample
这显然不会有很好的效果。相反,你应该在查询字符串中只编码键/值对的值,就像这样:
string url = "http://www.google.com/search?q=" + HttpUtility.UrlEncode("Example");
希望这有帮助。另外,正如teedyay提到的,您仍然需要确保非法文件名字符被删除,否则文件系统将不喜欢该路径。
Ideally these would go in a class called "FileNaming" or maybe just rename Encode to "FileNameEncode". Note: these are not designed to handle Full Paths, just the folder and/or file names. Ideally you would Split("/") your full path first and then check the pieces.
And obviously instead of a union, you could just add the "%" character to the list of chars not allowed in Windows, but I think it's more helpful/readable/factual this way.
Decode() is exactly the same but switches the Replace(Uri.HexEscape(s[0]), s) "escaped" with the character.
public static List<string> urlEncodedCharacters = new List<string>
{
"/", "\\", "<", ">", ":", "\"", "|", "?", "%" //and others, but not *
};
//Since this is a superset of urlEncodedCharacters, we won't be able to only use UrlEncode() - instead we'll use HexEncode
public static List<string> specialCharactersNotAllowedInWindows = new List<string>
{
"/", "\\", "<", ">", ":", "\"", "|", "?", "*" //windows dissallowed character set
};
public static string Encode(string fileName)
{
//CheckForFullPath(fileName); // optional: make sure it's not a path?
List<string> charactersToChange = new List<string>(specialCharactersNotAllowedInWindows);
charactersToChange.AddRange(urlEncodedCharacters.
Where(x => !urlEncodedCharacters.Union(specialCharactersNotAllowedInWindows).Contains(x))); // add any non duplicates (%)
charactersToChange.ForEach(s => fileName = fileName.Replace(s, Uri.HexEscape(s[0]))); // "?" => "%3f"
return fileName;
}
感谢@simon-tewsi提供的非常有用的表格!