我的Java独立应用程序从用户那里获得一个URL(指向一个文件),我需要点击它并下载它。我面临的问题是,我不能正确编码HTTP URL地址…

例子:

URL:  http://search.barnesandnoble.com/booksearch/first book.pdf

java.net.URLEncoder.encode(url.toString(), "ISO-8859-1");

回报我。

http%3A%2F%2Fsearch.barnesandnoble.com%2Fbooksearch%2Ffirst+book.pdf

但是,我想要的是

http://search.barnesandnoble.com/booksearch/first%20book.pdf

(空格替换为%20)

我猜URLEncoder不是为编码HTTP url设计的…JavaDoc说“HTML表单编码的实用程序类”…还有别的办法吗?


当前回答

URL编码会对那个字符串进行编码这样它就能在URL中正确地传递到最终目的地。例如,您不能使用http://stackoverflow.com?url=http://yyy.com。UrlEncoding参数将修复该参数值。

所以我给你两个选择:

您是否有权访问与域分离的路径?如果是这样,您可以简单地对路径进行UrlEncode。然而,如果情况并非如此,那么选择2可能适合你。 commons - httpclient 3.1。它有一个类URIUtil: System.out.println(URIUtil.encodePath("http://example.com/x y", "ISO-8859-1"));

这将输出您正在寻找的内容,因为它只对URI的路径部分进行编码。

供您参考,这个方法需要common -codec和common -logging才能在运行时工作。

其他回答

你可以使用这样的函数。根据您的需要完成并修改:

/**
     * Encode URL (except :, /, ?, &, =, ... characters)
     * @param url to encode
     * @param encodingCharset url encoding charset
     * @return encoded URL
     * @throws UnsupportedEncodingException
     */
    public static String encodeUrl (String url, String encodingCharset) throws UnsupportedEncodingException{
            return new URLCodec().encode(url, encodingCharset).replace("%3A", ":").replace("%2F", "/").replace("%3F", "?").replace("%3D", "=").replace("%26", "&");
    }

使用示例:

String urlToEncode = ""http://www.growup.com/folder/intérieur-à_vendre?o=4";
Utils.encodeUrl (urlToEncode , "UTF-8")

结果是:http://www.growup.com/folder/int%C3%A9rieur-%C3%A0_vendre?o=4

如何:

UrlEncode(String in_) {

String retVal = "";

try {
    retVal = URLEncoder.encode(in_, "UTF8");
} catch (UnsupportedEncodingException ex) {
    Log.get().exception(Log.Level.Error, "urlEncode ", ex);
}

return retVal;

}

你也可以使用GUAVA和路径逃脱器: UrlEscapers.urlFragmentEscaper () .escape (relativePath)

如果任何人不想向他们的项目添加依赖项,这些函数可能会有帮助。

我们将URL的path部分传递到这里。您可能不想将完整的URL作为参数传递进来(查询字符串需要不同的转义,等等)。

/**
 * Percent-encodes a string so it's suitable for use in a URL Path (not a query string / form encode, which uses + for spaces, etc)
 */
public static String percentEncode(String encodeMe) {
    if (encodeMe == null) {
        return "";
    }
    String encoded = encodeMe.replace("%", "%25");
    encoded = encoded.replace(" ", "%20");
    encoded = encoded.replace("!", "%21");
    encoded = encoded.replace("#", "%23");
    encoded = encoded.replace("$", "%24");
    encoded = encoded.replace("&", "%26");
    encoded = encoded.replace("'", "%27");
    encoded = encoded.replace("(", "%28");
    encoded = encoded.replace(")", "%29");
    encoded = encoded.replace("*", "%2A");
    encoded = encoded.replace("+", "%2B");
    encoded = encoded.replace(",", "%2C");
    encoded = encoded.replace("/", "%2F");
    encoded = encoded.replace(":", "%3A");
    encoded = encoded.replace(";", "%3B");
    encoded = encoded.replace("=", "%3D");
    encoded = encoded.replace("?", "%3F");
    encoded = encoded.replace("@", "%40");
    encoded = encoded.replace("[", "%5B");
    encoded = encoded.replace("]", "%5D");
    return encoded;
}

/**
 * Percent-decodes a string, such as used in a URL Path (not a query string / form encode, which uses + for spaces, etc)
 */
public static String percentDecode(String encodeMe) {
    if (encodeMe == null) {
        return "";
    }
    String decoded = encodeMe.replace("%21", "!");
    decoded = decoded.replace("%20", " ");
    decoded = decoded.replace("%23", "#");
    decoded = decoded.replace("%24", "$");
    decoded = decoded.replace("%26", "&");
    decoded = decoded.replace("%27", "'");
    decoded = decoded.replace("%28", "(");
    decoded = decoded.replace("%29", ")");
    decoded = decoded.replace("%2A", "*");
    decoded = decoded.replace("%2B", "+");
    decoded = decoded.replace("%2C", ",");
    decoded = decoded.replace("%2F", "/");
    decoded = decoded.replace("%3A", ":");
    decoded = decoded.replace("%3B", ";");
    decoded = decoded.replace("%3D", "=");
    decoded = decoded.replace("%3F", "?");
    decoded = decoded.replace("%40", "@");
    decoded = decoded.replace("%5B", "[");
    decoded = decoded.replace("%5D", "]");
    decoded = decoded.replace("%25", "%");
    return decoded;
}

和测试:

@Test
public void testPercentEncode_Decode() {
    assertEquals("", percentDecode(percentEncode(null)));
    assertEquals("", percentDecode(percentEncode("")));

    assertEquals("!", percentDecode(percentEncode("!")));
    assertEquals("#", percentDecode(percentEncode("#")));
    assertEquals("$", percentDecode(percentEncode("$")));
    assertEquals("@", percentDecode(percentEncode("@")));
    assertEquals("&", percentDecode(percentEncode("&")));
    assertEquals("'", percentDecode(percentEncode("'")));
    assertEquals("(", percentDecode(percentEncode("(")));
    assertEquals(")", percentDecode(percentEncode(")")));
    assertEquals("*", percentDecode(percentEncode("*")));
    assertEquals("+", percentDecode(percentEncode("+")));
    assertEquals(",", percentDecode(percentEncode(",")));
    assertEquals("/", percentDecode(percentEncode("/")));
    assertEquals(":", percentDecode(percentEncode(":")));
    assertEquals(";", percentDecode(percentEncode(";")));

    assertEquals("=", percentDecode(percentEncode("=")));
    assertEquals("?", percentDecode(percentEncode("?")));
    assertEquals("@", percentDecode(percentEncode("@")));
    assertEquals("[", percentDecode(percentEncode("[")));
    assertEquals("]", percentDecode(percentEncode("]")));
    assertEquals(" ", percentDecode(percentEncode(" ")));

    // Get a little complex
    assertEquals("[]]", percentDecode(percentEncode("[]]")));
    assertEquals("a=d%*", percentDecode(percentEncode("a=d%*")));
    assertEquals(")  (", percentDecode(percentEncode(")  (")));
    assertEquals("%21%20%2A%20%27%20%28%20%25%20%29%20%3B%20%3A%20%40%20%26%20%3D%20%2B%20%24%20%2C%20%2F%20%3F%20%23%20%5B%20%5D%20%25",
                    percentEncode("! * ' ( % ) ; : @ & = + $ , / ? # [ ] %"));
    assertEquals("! * ' ( % ) ; : @ & = + $ , / ? # [ ] %", percentDecode(
                    "%21%20%2A%20%27%20%28%20%25%20%29%20%3B%20%3A%20%40%20%26%20%3D%20%2B%20%24%20%2C%20%2F%20%3F%20%23%20%5B%20%5D%20%25"));

    assertEquals("%23456", percentDecode(percentEncode("%23456")));

}

我阅读了以前的答案,写我自己的方法,因为我不能有一些正确的工作使用以前的答案的解决方案,它看起来对我很好,但如果你能找到不与此工作的URL,请让我知道。

public static URL convertToURLEscapingIllegalCharacters(String toEscape) throws MalformedURLException, URISyntaxException {
            URL url = new URL(toEscape);
            URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            //if a % is included in the toEscape string, it will be re-encoded to %25 and we don't want re-encoding, just encoding
            return new URL(uri.toString().replace("%25", "%"));
}