我把上面的内容做了一些改变。我首先喜欢正逻辑,并且我认为HashSet可能比其他选项(比如通过String进行搜索)提供更好的性能。虽然,我不确定自动装箱的代价是否值得,但如果编译器针对ASCII字符进行了优化,那么装箱的代价就会很低。
/***
* Replaces any character not specifically unreserved to an equivalent
* percent sequence.
* @param s
* @return
*/
public static String encodeURIcomponent(String s)
{
StringBuilder o = new StringBuilder();
for (char ch : s.toCharArray()) {
if (isSafe(ch)) {
o.append(ch);
}
else {
o.append('%');
o.append(toHex(ch / 16));
o.append(toHex(ch % 16));
}
}
return o.toString();
}
private static char toHex(int ch)
{
return (char)(ch < 10 ? '0' + ch : 'A' + ch - 10);
}
// https://tools.ietf.org/html/rfc3986#section-2.3
public static final HashSet<Character> UnreservedChars = new HashSet<Character>(Arrays.asList(
'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',
'0','1','2','3','4','5','6','7','8','9',
'-','_','.','~'));
public static boolean isSafe(char ch)
{
return UnreservedChars.contains(ch);
}