How to convert a string to lowercase in Bash

Is there a way in bash to convert a string into a lowercase string?

For example, if I have:

a="Hi all"

I want to convert it to:

"hi all"

Current answer

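So I tried to run some updated benchmarks using the consensus approach for each utility, but instead of repeating a small set many times, I ...

... piped in a 1.85 GB .txt file filled to the brim with multi-byte Unicode characters in UTF-8 encoding, to even out the I/O aspect, while also forcing every tool to use LC_ALL=C to ensure a level playing field.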

————————————————————————————————————————

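To be precise:

- bsd-sed and gnu-sed are both quite mediocre. I don't even know what bsd-sed is doing, because its xxhash doesn't match
- python3: was it trying to use Unicode case mapping? (even though I had already forced the locale to LC_ALL=C)
- tr is the most extreme: gnu-tr is by far the fastest, while bsd-tr is atrocious
- perl5 is faster than any awk variant I have, unless you use mawk2 to load the whole file at once, which slightly edges out perl5: 2.935 secs for mawk2 vs. 3.081 secs for perl5
- among the awks, gnu-gawk is the slowest, mawk 1.3.4 is in the middle, and mawk 1.9.9.6 is the fastest: more than 50% time saved over gawk
- (I didn't waste my time on the useless macosx nawk)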

     out9: 1.85GiB 0:00:03 [ 568MiB/s] [ 568MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:03 [ 568MiB/s] [ 568MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk2 '{ print tolower($_) }' FS='^$'; )  

mawk 1.9.9.6 (mawk2-beta)

3.07s user 0.66s system 111% cpu 3.348 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.85GiB 0:00:06 [ 297MiB/s] [ 297MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:06 [ 297MiB/s] [ 297MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk '{ print tolower($_) }' FS='^$'; )  

 mawk 1.3.4

6.01s user 0.83s system 107% cpu 6.368 total
85759a34df874966d096c6529dbfb9d5  stdin

     out9: 23.8MiB 0:00:00 [ 238MiB/s] [ 238MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:07 [ 244MiB/s] [ 244MiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:07 [ 244MiB/s] [ 244MiB/s] [ <=>                             ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gawk -be '{ print tolower($_) }' FS='^$'; )

GNU Awk 5.1.1, API: 3.1 (GNU MPFR 4.1.0, GNU MP 6.2.1) 

7.49s user 0.78s system 106% cpu 7.763 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.85GiB 0:00:03 [ 616MiB/s] [ 616MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:03 [ 617MiB/s] [ 617MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C perl -ne 'print lc'; )  

perl5 (revision 5 version 34 subversion 0)

2.70s user 0.85s system 115% cpu 3.081 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.85GiB 0:00:32 [57.4MiB/s] [57.4MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:32 [57.4MiB/s] [57.4MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gsed 's/.*/\L&/'; )  # GNU-sed


gsed (GNU sed) 4.8

32.57s user 0.97s system 101% cpu 32.982 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.86GiB 0:00:38 [49.7MiB/s] [49.7MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:38 [49.4MiB/s] [49.4MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C sed 's/.*/\L&/'; )   # BSD-sed



37.94s user 0.86s system 101% cpu 38.318 total
d5e2d8487df1136db7c2334a238755c0  stdin



      in0:  313MiB 0:00:00 [3.06GiB/s] [3.06GiB/s] [=====>] 16% ETA 0:00:00
     out9: 1.85GiB 0:00:11 [ 166MiB/s] [ 166MiB/s] [ <=>]
      in0: 1.85GiB 0:00:00 [3.31GiB/s] [3.31GiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C python3 -c "print(open(0).read().lower())"; )

Python 3.9.12 

9.04s user 2.18s system 98% cpu 11.403 total
7ddc0b5cbcfbbfac3c2b6da6731bd262  stdin

     out9: 2.51MiB 0:00:00 [25.1MiB/s] [25.1MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:11 [ 171MiB/s] [ 171MiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:11 [ 171MiB/s] [ 171MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C ruby -pe '$_.downcase!'; )


ruby 2.6.8p205 (2021-07-07 revision 67951) [universal.arm64e-darwin21]

10.46s user 1.23s system 105% cpu 11.073 total
85759a34df874966d096c6529dbfb9d5  stdin


      in0: 1.85GiB 0:00:01 [1.01GiB/s] [1.01GiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:01 [1.01GiB/s] [1.01GiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gtr '[A-Z]' '[a-z]'; )  # GNU-tr


gtr (GNU coreutils) 9.1

1.11s user 1.21s system 124% cpu 1.855 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.85GiB 0:01:19 [23.7MiB/s] [23.7MiB/s] [ <=> ]
      in0: 1.85GiB 0:01:19 [23.7MiB/s] [23.7MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C tr '[A-Z]' '[a-z]'; ) # BSD-tr

78.94s user 1.50s system 100% cpu 1:19.67 total
85759a34df874966d096c6529dbfb9d5  stdin


( time ( pvE0 < "${m3t}" | LC_ALL=C   gdd  conv=lcase ) | pvE9 )  | xxh128sum | lgp3; sleep 3; 
     out9: 0.00 B 0:00:01 [0.00 B/s] [0.00 B/s] [<=> ]
      in0: 1.85GiB 0:00:06 [ 295MiB/s] [ 295MiB/s] [============>] 100%            
     out9: 1.81GiB 0:00:06 [ 392MiB/s] [ 294MiB/s] [ <=>   ]
3874110+1 records in
3874110+1 records out
     out9: 1.85GiB 0:00:06 [ 295MiB/s] [ 295MiB/s] [ <=>  ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gdd conv=lcase; )  # GNU-dd


gdd (coreutils) 9.1

1.93s user 4.35s system 97% cpu 6.413 total
85759a34df874966d096c6529dbfb9d5  stdin



%  ( time ( pvE0 < "${m3t}" | LC_ALL=C   dd  conv=lcase ) | pvE9 )  | xxh128sum | lgp3; sleep 3; 
     out9: 36.9MiB 0:00:00 [ 368MiB/s] [ 368MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [ <=>   ]
3874110+1 records in
3874110+1 records out
     out9: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [ <=>  ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C dd conv=lcase; )  # BSD-dd


1.92s user 4.24s system 127% cpu 4.817 total
85759a34df874966d096c6529dbfb9d5  stdin

————————————————————————————————————————
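
A scaled-down way to reproduce this kind of comparison might look like the sketch below. It is not the setup used above: the sample text, the tiny file size, and the helper name lc_sum are assumptions for illustration, and it uses plain `time` and `cksum` instead of the pv wrappers and xxh128sum shown in the logs. Matching checksums mean the tools produced byte-identical output.

```shell
# Scaled-down sketch of the comparison above. The sample text, line count and
# the helper name lc_sum are assumptions for illustration only.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT

# Build a small mixed-case input file.
for i in 1 2 3 4 5; do
    printf 'Hi ALL, this is Line %s of MIXED-case TEXT\n' "$i"
done > "$sample"

# Run a lowercasing command (passed as arguments) over the sample under
# LC_ALL=C and print a checksum of its output.
lc_sum() {
    LC_ALL=C "$@" < "$sample" | cksum
}

# Time each candidate; identical checksums mean identical output.
time lc_sum tr '[:upper:]' '[:lower:]'
time lc_sum awk '{ print tolower($0) }'
```

On a file this small the timings are meaningless; point `sample` at a large file to get numbers worth comparing.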

By loading the whole file at once and running tolower() on all 1.85 GB in a single function call, mawk2 can be artificially made faster than perl5:

( time ( pvE0 < "${m3t}" | 

  LC_ALL=C mawk2 '
           BEGIN {            FS = RS = "^$"  } 
             END { print tolower($(ORS = "")) }' 

 ) | pvE9 ) | xxh128sum| lgp3 

      in0: 1.85GiB 0:00:00 [3.35GiB/s] [3.35GiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:02 [ 647MiB/s] [ 647MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk2 ; )


1.39s user 1.31s system 91% cpu 2.935 total
85759a34df874966d096c6529dbfb9d5  stdin

2022-05-12 05:41:57

Other answers

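Many of the answers use external programs, which is not really using Bash.

If you know you will have Bash 4 available, you should really just use the ${VAR,,} notation (it is easy and cool). For Bash before 4 (my Mac still uses Bash 3.2, for example), I used a corrected version of @ghostdog74's answer to create a more portable version.

It is one you can call as lowercase 'my STRING' to get a lowercase version. I read comments about setting the result to a var, but that is not really portable in Bash, since we can't return strings. Printing it is the best solution. It is easy to capture with something like var="$(lowercase $str)".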
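
As a rough sketch of the "print it and capture it" idea combined with the version split described above: the function name to_lower is hypothetical, and the fallback uses tr rather than this answer's pure-Bash loop. It assumes that Bash 3.2 only rejects ${s,,} when that branch actually runs.

```shell
# Hypothetical helper: prints its arguments in lowercase.
to_lower() {
    local s="$*"
    if (( BASH_VERSINFO[0] >= 4 )); then
        # Bash 4+: built-in case modification.
        printf '%s' "${s,,}"
    else
        # Older Bash: fall back to an external tool.
        printf '%s' "$s" | tr '[:upper:]' '[:lower:]'
    fi
}

a="Hi all"
b="$(to_lower "$a")"   # capture the printed result
echo "$b"              # hi all
```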

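How this works

The way this works is by getting the ASCII integer representation of each char with printf, and then adding 32 if going upper-to->lower, or subtracting 32 if going lower-to->upper. Then printf is used again to convert the number back to a char. From 'A' -to-> 'a' we have a difference of 32 chars.

Using printf to explain: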

$ printf "%d\n" "'a"
97
$ printf "%d\n" "'A"
65

97 - 65 = 32
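
The same arithmetic for a single character can be sketched as follows (the variable names are only illustrative, and it assumes an uppercase ASCII letter as input):

```shell
# Lowercase one ASCII letter via its character code.
ch='H'
code=$(printf '%d' "'$ch")                         # 72
lower=$(printf "\\$(printf '%o' $((code + 32)))")  # 104 -> octal 150 -> h
echo "$lower"                                      # h
```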

This is the working version with examples. Please note the comments in the code, as they explain a lot of stuff:

#!/bin/bash

# lowerupper.sh

# Prints the lowercase version of a char
lowercaseChar(){
    case "$1" in
        [A-Z])
            n=$(printf "%d" "'$1")
            n=$((n+32))
            printf \\$(printf "%o" "$n")
            ;;
        *)
            printf "%s" "$1"
            ;;
    esac
}

# Prints the lowercase version of a sequence of strings
lowercase() {
    word="$@"
    for((i=0;i<${#word};i++)); do
        ch="${word:$i:1}"
        lowercaseChar "$ch"
    done
}

# Prints the uppercase version of a char
uppercaseChar(){
    case "$1" in
        [a-z])
            n=$(printf "%d" "'$1")
            n=$((n-32))
            printf \\$(printf "%o" "$n")
            ;;
        *)
            printf "%s" "$1"
            ;;
    esac
}

# Prints the uppercase version of a sequence of strings
uppercase() {
    word="$@"
    for((i=0;i<${#word};i++)); do
        ch="${word:$i:1}"
        uppercaseChar "$ch"
    done
}

# The functions will not add a new line, so use echo or
# append it if you want a new line after printing

# Printing stuff directly
lowercase "I AM the Walrus!"$'\n'
uppercase "I AM the Walrus!"$'\n'

echo "----------"

# Printing a var
str="A StRing WITH mixed sTUFF!"
lowercase "$str"$'\n'
uppercase "$str"$'\n'

echo "----------"

# Not quoting the var should also work, 
# since we use "$@" inside the functions
lowercase $str$'\n'
uppercase $str$'\n'

echo "----------"

# Assigning to a var
myLowerVar="$(lowercase $str)"
myUpperVar="$(uppercase $str)"
echo "myLowerVar: $myLowerVar"
echo "myUpperVar: $myUpperVar"

echo "----------"

# You can even do stuff like
if [[ 'option 2' = "$(lowercase 'OPTION 2')" ]]; then
    echo "Fine! All the same!"
else
    echo "Ops! Not the same!"
fi

exit 0

The results after running this:

$ ./lowerupper.sh 
i am the walrus!
I AM THE WALRUS!
----------
a string with mixed stuff!
A STRING WITH MIXED STUFF!
----------
a string with mixed stuff!
A STRING WITH MIXED STUFF!
----------
myLowerVar: a string with mixed stuff!
myUpperVar: A STRING WITH MIXED STUFF!
----------
Fine! All the same!

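This only works for ASCII characters, though.

For me that is fine, since I know I will only pass ASCII chars to it. I am using this for some case-insensitive CLI options, for example.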

2017-05-16 10:04:40

For Bash versions earlier than 4.0, this version should be the fastest (as it doesn't fork/exec any commands):

function string.monolithic.tolower
{
   local __word=$1
   local __len=${#__word}
   local __char
   local __octal
   local __decimal
   local __result
   local i

   for (( i=0; i<__len; i++ ))
   do
      __char=${__word:$i:1}
      case "$__char" in
         [A-Z] )
            printf -v __decimal '%d' "'$__char"
            printf -v __octal '%03o' $(( $__decimal ^ 0x20 ))
            printf -v __char \\$__octal
            ;;
      esac
      __result+="$__char"
   done
   REPLY="$__result"
}
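
The function relies on two details: `printf -v` stores its output in a variable without a subshell, and XOR with 0x20 flips the ASCII case bit. A minimal sketch of just those two steps, assuming an ASCII letter as input (the variable names are illustrative):

```shell
# XOR with 0x20 flips the case bit of an ASCII letter.
printf -v code '%d' "'A"                   # 65
printf -v octal '%03o' $(( code ^ 0x20 ))  # 97 -> octal 141
printf -v ch "\\$octal"                    # \141 -> a
echo "$ch"                                 # a
```

Because XOR flips the bit in both directions, the `case` guard on [A-Z] in the function is what keeps lowercase input from being turned into uppercase.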

technosaurus's answer had potential too, although it did run properly for me.

2013-03-24 13:43:45

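If you are using v4, this is baked in. If not, here is a simple, widely applicable solution. Other answers (and comments) on this thread were quite helpful in creating the code below.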

# Like echo, but converts to lowercase
echolcase () {
    tr '[:upper:]' '[:lower:]' <<< "${*}"
}

# Takes one arg by reference (var name) and makes it lowercase
lcase () { 
    eval "${1}"=\'$(echo ${!1//\'/"'\''"} | tr '[:upper:]' '[:lower:]' )\'
}

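Notes:

- Doing: a="Hi All" and then: lcase a will do the same thing as: a=$( echolcase "Hi All" )
- In the lcase function, using ${!1//\'/"'\''"} instead of ${!1} allows this to work even when the string has quotes.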
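
As an alternative sketch (not this answer's method), the by-name update can also be written with `printf -v`, which avoids the eval and its quote escaping. The function name lcase_v is hypothetical.

```shell
# Hypothetical alternative to lcase: lowercase a variable in place, by name.
lcase_v() {
    # ${!1} reads the variable whose name is in $1; printf -v writes it back.
    printf -v "$1" '%s' "$(tr '[:upper:]' '[:lower:]' <<< "${!1}")"
}

a="Hi 'All'"
lcase_v a
echo "$a"    # hi 'all'
```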

2013-03-22 22:42:20

You can try this:

s="Hello World!" 

echo $s  # Hello World!

a=${s,,}
echo $a  # hello world!

b=${s^^}
echo $b  # HELLO WORLD!

Ref: http://wiki.workassis.com/shell-script-convert-text-to-lowercase-and-uppercase/

2017-03-23 06:48:52

In Bash 4:

To lowercase

$ string="A FEW WORDS"
$ echo "${string,}"
a FEW WORDS
$ echo "${string,,}"
a few words
$ echo "${string,,[AEIUO]}"
a FeW WoRDS

$ string="A Few Words"
$ declare -l string
$ string=$string; echo "$string"
a few words

To uppercase

$ string="a few words"
$ echo "${string^}"
A few words
$ echo "${string^^}"
A FEW WORDS
$ echo "${string^^[aeiou]}"
A fEw wOrds

$ string="A Few Words"
$ declare -u string
$ string=$string; echo "$string"
A FEW WORDS

Toggle (undocumented, but optionally configurable at compile time)

$ string="A Few Words"
$ echo "${string~~}"
a fEW wORDS
$ string="A FEW WORDS"
$ echo "${string~}"
a FEW WORDS
$ string="a few words"
$ echo "${string~}"
A few words

Capitalize (undocumented, but optionally configurable at compile time)

$ string="a few words"
$ declare -c string
$ string=$string
$ echo "$string"
A few words

Title case:

$ string="a few words"
$ string=($string)
$ string="${string[@]^}"
$ echo "$string"
A Few Words

$ declare -c string
$ string=(a few words)
$ echo "${string[@]}"
A Few Words

$ string="a FeW WOrdS"
$ string=${string,,}
$ string=${string~}
$ echo "$string"
A few words

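To turn off a declare attribute, use +. For example, declare +c string. This affects subsequent assignments and not the current value.

The declare options change the attribute of the variable, but not the contents. The reassignments in my examples update the contents to show the changes.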
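
A short sketch of that attribute behaviour, assuming Bash 4 or later (the extra variables before and after are only there to record the intermediate values):

```shell
string="A Few Words"
declare -l string        # attribute set; the stored value is not touched yet
before=$string           # A Few Words
string=$string           # reassignment applies the lowercase attribute
after=$string            # a few words
declare +l string        # attribute off; the current value stays lowercase
string="NEW Value"       # later assignments are no longer converted
echo "$before | $after | $string"
```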

Edit:

Added "toggle first character by word" (${var~}) as suggested by ghostdog74.

Edit: Corrected tilde behavior to match Bash 4.3.

2010-02-15 10:31:09
