我们在Team Foundation Server (TFS)中有一个项目,其中有一个非英语字符(š)。当尝试编写一些与构建相关的脚本时,我们偶然发现了一个问题——我们不能将这个字母传递给命令行工具。命令提示符或其他东西会把它弄乱,tf.exe实用程序无法找到指定的项目。
我尝试了不同格式的.bat文件(ANSI, UTF-8,带BOM和不带BOM),以及用JavaScript编写脚本(本质上是Unicode) -但运气不好。如何执行程序并传递一个Unicode命令行?
我们在Team Foundation Server (TFS)中有一个项目,其中有一个非英语字符(š)。当尝试编写一些与构建相关的脚本时,我们偶然发现了一个问题——我们不能将这个字母传递给命令行工具。命令提示符或其他东西会把它弄乱,tf.exe实用程序无法找到指定的项目。
我尝试了不同格式的.bat文件(ANSI, UTF-8,带BOM和不带BOM),以及用JavaScript编写脚本(本质上是Unicode) -但运气不好。如何执行程序并传递一个Unicode命令行?
当前回答
更改Windows控制台的默认Codepage是相当困难的。当你在网上搜索时,你会发现不同的建议,然而其中一些可能会完全破坏你的Windows,即你的PC无法再启动。
最安全的解决方案是: 转到你的注册表键HKEY_CURRENT_USER\Software\Microsoft\Command Processor并添加字符串值Autorun = chcp 65001。
或者,对于最常见的代码页,可以使用这个小的批处理脚本。
@ECHO off
SET ROOT_KEY="HKEY_CURRENT_USER"
FOR /f "skip=2 tokens=3" %%i in ('reg query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage /v OEMCP') do set OEMCP=%%i
ECHO System default values:
ECHO.
ECHO ...............................................
ECHO Select Codepage
ECHO ...............................................
ECHO.
ECHO 1 - CP1252
ECHO 2 - UTF-8
ECHO 3 - CP850
ECHO 4 - ISO-8859-1
ECHO 5 - ISO-8859-15
ECHO 6 - US-ASCII
ECHO.
ECHO 9 - Reset to System Default (CP%OEMCP%)
ECHO 0 - EXIT
ECHO.
SET /P CP="Select a Codepage: "
if %CP%==1 (
echo Set default Codepage to CP1252
reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 1252>nul" /f
) else if %CP%==2 (
echo Set default Codepage to UTF-8
reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 65001>nul" /f
) else if %CP%==3 (
echo Set default Codepage to CP850
reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 850>nul" /f
) else if %CP%==4 (
echo Set default Codepage to ISO-8859-1
add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 28591>nul" /f
) else if %CP%==5 (
echo Set default Codepage to ISO-8859-15
add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 28605>nul" /f
) else if %CP%==6 (
echo Set default Codepage to ASCII
add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 20127>nul" /f
) else if %CP%==9 (
echo Reset Codepage to System Default
reg delete "%ROOT_KEY%\Software\Microsoft\Command Processor" /v AutoRun /f
) else if %CP%==0 (
echo Bye
) else (
echo Invalid choice
pause
)
使用@chcp 65001>nul而不是chcp 65001会抑制每次启动一个新的命令行窗口时都会得到的输出“活动代码页:65001”。
所有可用号码的完整列表,您可以从代码页标识符
注意,设置只适用于当前用户。如果你想为所有用户设置它,将set ROOT_KEY="HKEY_CURRENT_USER"替换为set ROOT_KEY="HKEY_LOCAL_MACHINE"
其他回答
对于类似的问题(我的问题是在命令提示符上显示来自MySQL的UTF-8字符),
我是这样解决的:
我把命令提示符的字体改成了Lucida Console。(此步骤必须与您的情况无关。它只与你在屏幕上看到的东西有关,而与角色本身无关)。 我把代码页改成了Windows-1253。您可以在命令提示符中通过“chcp 1253”执行此操作。它适用于我想要查看UTF-8的情况。
Try:
chcp 65001
这会将代码页更改为UTF-8。此外,还需要使用Lucida控制台字体。
我的背景:我在控制台中使用Unicode输入/输出已经很多年了(并且每天都这么做。此外,我还为这项任务开发了支持工具)。只要你了解以下事实/限制,问题就很少:
CMD and “console” are unrelated factors. CMD.exe is a just one of programs which are ready to “work inside” a console (“console applications”). AFAIK, CMD has perfect support for Unicode; you can enter/output all Unicode chars when any codepage is active. Windows’ console has A LOT of support for Unicode — but it is not perfect (just “good enough”; see below). chcp 65001 is very dangerous. Unless a program was specially designed to work around defects in the Windows’ API (or uses a C runtime library which has these workarounds), it would not work reliably. Win8 fixes ½ of these problems with cp65001, but the rest is still applicable to Win10. I work in cp1252. As I already said: To input/output Unicode in a console, one does not need to set the codepage.
细节
To read/write Unicode to a console, an application (or its C runtime library) should be smart enough to use not File-I/O API, but Console-I/O API. (For an example, see how Python does it.) Likewise, to read Unicode command-line arguments, an application (or its C runtime library) should be smart enough to use the corresponding API. Console font rendering supports only Unicode characters in BMP (in other words: below U+10000). Only simple text rendering is supported (so European — and some East Asian — languages should work fine — as far as one uses precomposed forms). [There is a minor fine print here for East Asian and for characters U+0000, U+0001, U+30FB.]
实际考虑
The defaults on Window are not very helpful. For best experience, one should tune up 3 pieces of configuration: For output: a comprehensive console font. For best results, I recommend my builds. (The installation instructions are present there — and also listed in other answers on this page.) For input: a capable keyboard layout. For best results, I recommend my layouts. For input: allow HEX input of Unicode. One more gotcha with “Pasting” into a console application (very technical): HEX input delivers a character on KeyUp of Alt; all the other ways to deliver a character happen on KeyDown; so many applications are not ready to see a character on KeyUp. (Only applicable to applications using Console-I/O API.) Conclusion: many application would not react on HEX input events. Moreover, what happens with a “Pasted” character depends on the current keyboard layout: if the character can be typed without using prefix keys (but with arbitrary complicated combination of modifiers, as in Ctrl-Alt-AltGr-Kana-Shift-Gray*) then it is delivered on an emulated keypress. This is what any application expects — so pasting anything which contains only such characters is fine. However, the “other” characters are delivered by emulating HEX input. Conclusion: unless your keyboard layout supports input of A LOT of characters without prefix keys, some buggy applications may skip characters when you Paste via Console’s UI: Alt-Space E P. (This is why I recommend using my keyboard layouts!)
我们还应该记住,Windows上的“替代的、更强大的”主机根本不是主机。它们不支持Console-I/O api,因此依赖这些api工作的程序将无法正常工作。(不过,只使用“文件- i /O api到控制台文件句柄”的程序可以很好地工作。)
微软的Powershell就是一个非主机的例子。我不用它;要进行实验,按下并释放WinKey,然后输入powershell。
(另一方面,有一些程序,如ConEmu或ANSICON,试图做更多的事情:他们“试图”拦截控制台i /O api,以使“真正的控制台应用程序”也能工作。这绝对适用于玩具示例程序;在现实生活中,这可能解决不了您的特定问题。实验。)
总结
设置字体,键盘布局(并可选地,允许十六进制输入)。 只使用通过Console-I/O api的程序,并接受Unicode命令行参数。例如,任何由cygwin编译的程序都可以。正如我已经说过的,CMD也很好。
UPD:最初,对于cp65001中的一个错误,我混淆了内核层和CRTL层(UPD²:和Windows用户模式API!)另外:Win8修复了这个错误的一半;我澄清了关于“更好的控制台”应用程序的部分,并添加了Python如何做到这一点的参考。
对于那些使用WSL但又不想要Cygwin或Git的额外包的人来说,wsltty是可用的,它只提供支持UTF-8的终端
更改Windows控制台的默认Codepage是相当困难的。当你在网上搜索时,你会发现不同的建议,然而其中一些可能会完全破坏你的Windows,即你的PC无法再启动。
最安全的解决方案是: 转到你的注册表键HKEY_CURRENT_USER\Software\Microsoft\Command Processor并添加字符串值Autorun = chcp 65001。
或者,对于最常见的代码页,可以使用这个小的批处理脚本。
@ECHO off
SET ROOT_KEY="HKEY_CURRENT_USER"
FOR /f "skip=2 tokens=3" %%i in ('reg query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage /v OEMCP') do set OEMCP=%%i
ECHO System default values:
ECHO.
ECHO ...............................................
ECHO Select Codepage
ECHO ...............................................
ECHO.
ECHO 1 - CP1252
ECHO 2 - UTF-8
ECHO 3 - CP850
ECHO 4 - ISO-8859-1
ECHO 5 - ISO-8859-15
ECHO 6 - US-ASCII
ECHO.
ECHO 9 - Reset to System Default (CP%OEMCP%)
ECHO 0 - EXIT
ECHO.
SET /P CP="Select a Codepage: "
if %CP%==1 (
echo Set default Codepage to CP1252
reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 1252>nul" /f
) else if %CP%==2 (
echo Set default Codepage to UTF-8
reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 65001>nul" /f
) else if %CP%==3 (
echo Set default Codepage to CP850
reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 850>nul" /f
) else if %CP%==4 (
echo Set default Codepage to ISO-8859-1
add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 28591>nul" /f
) else if %CP%==5 (
echo Set default Codepage to ISO-8859-15
add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 28605>nul" /f
) else if %CP%==6 (
echo Set default Codepage to ASCII
add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "@chcp 20127>nul" /f
) else if %CP%==9 (
echo Reset Codepage to System Default
reg delete "%ROOT_KEY%\Software\Microsoft\Command Processor" /v AutoRun /f
) else if %CP%==0 (
echo Bye
) else (
echo Invalid choice
pause
)
使用@chcp 65001>nul而不是chcp 65001会抑制每次启动一个新的命令行窗口时都会得到的输出“活动代码页:65001”。
所有可用号码的完整列表,您可以从代码页标识符
注意,设置只适用于当前用户。如果你想为所有用户设置它,将set ROOT_KEY="HKEY_CURRENT_USER"替换为set ROOT_KEY="HKEY_LOCAL_MACHINE"