我试图写一个bash脚本测试,需要一个参数,并通过curl发送到网站。我需要url编码的值,以确保特殊字符被正确处理。最好的方法是什么?

这是我到目前为止的基本脚本:

#!/bin/bash
host=${1:?'bad host'}
value=$2
shift
shift
curl -v -d "param=${value}" http://${host}/somepath $@

当前回答

我发现在python中可读性更好:

encoded_value=$(python3 -c "import urllib.parse; print urllib.parse.quote('''$value''')")

三重'确保单引号的值不会有伤害。Urllib在标准库中。它的工作,例如这个疯狂的(现实世界)url:

"http://www.rai.it/dl/audio/" "1264165523944Ho servito il re d'Inghilterra - Puntata 7

其他回答

从shell脚本中使用php:

value="http://www.google.com"
encoded=$(php -r "echo rawurlencode('$value');")
# encoded = "http%3A%2F%2Fwww.google.com"
echo $(php -r "echo rawurldecode('$encoded');")
# returns: "http://www.google.com"

http://www.php.net/manual/en/function.rawurlencode.php http://www.php.net/manual/en/function.rawurldecode.php

我发现下面的代码片段很有用,可以将它插入到一个程序调用链中,其中URI::Escape可能没有安装:

perl -p -e 's/([^A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg'

(源)

This is a simpler pure bash/ksh version without the substring logic. Stated differently the other pure shell solutions reparsed the string to get each character (using parameter substitution ${#str} for the lenght and ${str:$i:1} to discover each character). The below method does just one loop over the string to process each character. It is the difference between O(n^2) and O(n). In this answer: https://stackoverflow.com/a/40833433/1344599 Thunderbeef saw ~150x speed improvement on a large text file. This solution is also a shorter oneliner:

while IFS='' read -n 1 c ; do [[ "$c" =~ [A-Za-z0-9.~_-] ]] && printf "$c" || printf '%%%02X' "'$c" ; done

在函数中,你可以使用stdin或形参:

function urlen_stdin {
  while IFS='' read -n 1 c ; do [[ "$c" =~ [A-Za-z0-9.~_-] ]] && printf "$c" || printf '%%%02X' "'$c" ; done
}
function urlen_param {
  printf '%s' "$1" | while IFS='' read -n 1 c ; do [[ "$c" =~ [A-Za-z0-9.~_-] ]] && printf "$c" || printf '%%%02X' "'$c" ; done
}
function urlen_here {
  while IFS='' read -n 1 c ; do [[ "$c" =~ [A-Za-z0-9.~_-] ]] && printf "$c" || printf '%%%02X' "'$c" ; done <<< "$1"
}

#usage: 
echo -n 'hello !@#$%^&*()[]:;{}\/|-_=+.,? world' | urlen_stdin
urlen_param 'hello !@#$%^&*()[]:;{}\/|-_=+.,? world'
urlen_here 'hello !@#$%^&*()[]:;{}\/|-_=+.,? world'
# all methods render:
hello%20%21%40%23%24%25%5E%26%2A%28%29%5B%5D%3A%3B%7B%7D%2F%7C-_%3D%2B.%2C%3F%20world

解释:

IFS= "使空格像普通字符一样 Read -n 1一次读取1个字符 [[=~]]是一个正则表达式比较。如果字符匹配,则遵循&&路径,否则遵循||路径 printf '%%%02X'打印一个%和字符作为零填充长度为2的十六进制代码

这可能是最好的一个:

after=$(echo -e "$before" | od -An -tx1 | tr ' ' % | xargs printf "%s")

其中一种变体可能很丑,但很简单:

urlencode() {
    local data
    if [[ $# != 1 ]]; then
        echo "Usage: $0 string-to-urlencode"
        return 1
    fi
    data="$(curl -s -o /dev/null -w %{url_effective} --get --data-urlencode "$1" "")"
    if [[ $? != 3 ]]; then
        echo "Unexpected error" 1>&2
        return 2
    fi
    echo "${data##/?}"
    return 0
}

下面是一个单行版本的例子(由Bruno建议):

date | curl -Gso /dev/null -w %{url_effective} --data-urlencode @- "" | cut -c 3-

# If you experience the trailing %0A, use
date | curl -Gso /dev/null -w %{url_effective} --data-urlencode @- "" | sed -E 's/..(.*).../\1/'