Is there a way in JavaScript to check whether a string is a URL?

Regexes are ruled out because the URL is most likely written as stackoverflow; that is, it might not have a .com, www, or http.


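Current answer

This function disallows localhost and only allows URLs for web pages (that is, only the http or https protocol).

It also only allows the safe characters defined here: https://www.urlencoder.io/learn/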

function isValidWebUrl(url) {
   // No g/m flags: the pattern has to match the whole string, not just one line of it
   const regEx = /^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$/;
   return regEx.test(url);
}

Other answers

I changed the function to use match and adjusted the slashes, and it works for both http:// and https://:

function isValidUrl(userInput) {
    var res = userInput.match(/(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)/g);
    // match() returns null when nothing in the input looks like a URL
    return res !== null;
}

If you want to check whether a string is a valid HTTP URL, you can use the URL constructor (it throws on a malformed string):

function isValidHttpUrl(string) {
  let url;
  try {
    url = new URL(string);
  } catch (_) {
    return false;
  }
  return url.protocol === "http:" || url.protocol === "https:";
}

console.log("https://example.com: " + isValidHttpUrl("https://example.com"));
console.log("example.com: " + isValidHttpUrl("example.com"));

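Note: per RFC 3986, a URL must begin with a scheme (not limited to http/https), for example:

www.example.com is not a valid URL (missing scheme)
javascript:void(0) is a valid URL, although not an HTTP one
http://.. is a valid URL with the host being .. (whether it resolves depends on your DNS)
https://example..com is a valid URL, same as above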
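Since the question expects input without a scheme, one possible workaround is to prepend a default scheme before parsing. The following is only a sketch under that assumption: parseWithDefaultScheme and its defaultScheme parameter are hypothetical names, and guessing https for schemeless input is a policy decision rather than something the URL rules dictate.

```javascript
// Hypothetical helper: parse user input, assuming a default scheme when none is given.
// Returns a URL object for http(s) URLs and null for everything else.
function parseWithDefaultScheme(input, defaultScheme = "https") {
  const trimmed = input.trim();
  // Only treat the input as having a scheme when it is followed by "://"
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : `${defaultScheme}://${trimmed}`;
  try {
    const url = new URL(candidate);
    return url.protocol === "http:" || url.protocol === "https:" ? url : null;
  } catch (_) {
    return null; // the URL constructor rejected the string
  }
}

const parsed = parseWithDefaultScheme("www.example.com");
console.log(parsed === null ? "rejected" : parsed.href);
```

Note that this accepts a bare word such as stackoverflow as a hostname, so it answers "can this be turned into a URL" rather than "does this site exist".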

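Mathias Bynens has compiled a list of well-known URL regexes together with test URLs. There is little reason to write a new regular expression; just pick the existing one that suits you best.

But the comparison table for those regexes also shows that it is next to impossible to do URL validation with a single regular expression. All of the regexes in Bynens' list produce false positives and false negatives.

I suggest that you use an existing URL parser (for example new URL('http://www.example.com/') in JavaScript) and then apply the checks you want to perform to the parsed and normalized form of the URL and its components. Using the JavaScript URL interface has the additional benefit that it only accepts URLs that the browser really accepts.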
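A minimal sketch of this "parse first, then check the components" approach might look like the following. The function name meetsUrlPolicy and the three rules it applies (http/https only, no credentials, hostname must contain a dot) are assumptions chosen for illustration; a real project would pick its own rules.

```javascript
// Hypothetical policy check applied to the parsed, normalized URL.
function meetsUrlPolicy(input) {
  let url;
  try {
    url = new URL(input);
  } catch (_) {
    return false; // not parseable at all
  }
  // Rule 1 (assumed): only web URLs
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  // Rule 2 (assumed): no embedded credentials
  if (url.username !== "" || url.password !== "") return false;
  // Rule 3 (assumed): hostname must contain a dot, which excludes e.g. localhost
  if (!url.hostname.includes(".")) return false;
  return true;
}

console.log(meetsUrlPolicy("http://www.example.com/"));
console.log(meetsUrlPolicy("http://user:password@www.example.com/"));
```

Because the checks run on url.protocol, url.username, and url.hostname rather than on the raw string, they see the same normalized values the browser would use.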

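You should also keep in mind that technically incorrect URLs may still work. For example, http://w_w_w.example.com/, http://www..example.com/, and http://123.example.com/ all have an invalid hostname part, but every browser I know will try to open them without complaint, and when you specify IP addresses for those invalid names in /etc/hosts, such URLs will even work, but only on your computer.

The question is therefore not so much whether a URL is valid, but rather which URLs work and should be allowed in a particular context.

If you want to do URL validation, there are a lot of details and edge cases that are easy to overlook: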

URLs may contain credentials, as in http://user:password@www.example.com/.

Port numbers must be in the range 0-65535, but you may still want to exclude the wildcard port 0.

Port numbers may have leading zeros, as in http://www.example.com:000080/.

IPv4 addresses are by no means restricted to 4 decimal integers in the range 0-255. You can use one to four integers, and they can be decimal, octal or hexadecimal. The URLs https://010.010.000010.010/, https://0x8.0x8.0x0008.0x8/, https://8.8.2056/, https://8.526344/, https://134744072/ are all valid and just creative ways of writing https://8.8.8.8/.

Allowing loopback addresses (http://127.0.0.1/), private IP addresses (http://192.168.1.1), link-local addresses (http://169.254.100.200) and so on may have an impact on security or privacy. If, for instance, you allow them as the address of user avatars in a forum, you cause the users' browsers to send unsolicited network requests in their local network, and in the internet of things such requests may cause funny and not so funny things to happen in your home.

For the same reasons, you may want to discard links to hostnames that are not fully qualified, in other words hostnames without a dot.

But hostnames may always have a trailing dot (as in http://www.stackoverflow.com.).

The hostname portion of a link may contain square brackets for IPv6 addresses, as in http://[::1].

IPv6 addresses also have ranges for private networks, link-local addresses, and so on.

If you block certain IPv4 addresses, keep in mind that, for example, https://127.0.0.1 and https://[::ffff:127.0.0.1] point to the same resource (if the loopback device of your machine is IPv6 ready).

The hostname portion of URLs may now contain Unicode, so the character range [-0-9a-zA-Z] is definitely no longer sufficient.

Many registries for top-level domains define specific restrictions, for example on the allowed set of Unicode characters. Or they subdivide their namespace (like co.uk and many others).

Top-level domains must not contain decimal digits, and the hyphen is not allowed except in the IDN A-label prefix "xn--".

Unicode top-level domains (and their punycode encoding with "xn--") must still contain only letters, but who wants to check that in a regex?
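To illustrate the IPv4 points above: for http and https URLs the URL parser already normalizes octal, hexadecimal, and shortened IPv4 spellings to dotted decimal, so a range check can run on url.hostname instead of the raw string. The helper name isInternalIPv4Host and the list of blocked ranges are assumptions for this sketch; it does not cover IPv6 or IPv4-mapped IPv6 addresses.

```javascript
// Hypothetical helper: true when the host of an http(s) URL is an IPv4 address
// in a loopback, private, link-local, or "this network" range.
function isInternalIPv4Host(input) {
  let url;
  try {
    url = new URL(input);
  } catch (_) {
    return false; // not parseable, so there is no host to inspect
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  // url.hostname is already normalized, e.g. "0x7f.1" has become "127.0.0.1"
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(url.hostname);
  if (match === null) return false; // a domain name or an IPv6 literal
  const [a, b] = match.slice(1).map(Number);
  return (
    a === 0 ||                          // 0.0.0.0/8
    a === 10 ||                         // 10.0.0.0/8
    a === 127 ||                        // 127.0.0.0/8 loopback
    (a === 169 && b === 254) ||         // 169.254.0.0/16 link-local
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168)            // 192.168.0.0/16
  );
}

console.log(new URL("https://134744072/").hostname);
console.log(isInternalIPv4Host("http://0x7f.1/"));
```

A regex applied to the raw input would have to anticipate every one of these spellings, which is one more argument for checking the parsed form.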

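Which of these restrictions and rules apply is a question of project requirements and taste.

I recently wrote a URL validator for a web app that is suitable for user-supplied URLs in forums, social networks, and the like. Feel free to use it as a base for your own:

JavaScript/TypeScript version for the (Angular) front end
Perl version for the back end

I have also written a blog post, The Gory Details of URL Validation, with more in-depth information.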

Rely on a library: https://www.npmjs.com/package/valid-url

import { isWebUri } from 'valid-url';
// ...
if (!isWebUri(url)) {
    return "Not a valid url.";
}

As has been noted, the perfect regex is elusive, but a regex still seems to be a reasonable approach (the alternatives are server-side tests or the URL API). However, the high-ranking answers often return false for common URLs and, even worse, will freeze your app or page for minutes on a string as simple as isURL('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'). This has been pointed out in some of the comments, but most people probably haven't entered a bad value to see it. Hanging like that makes the code unusable in any serious application. I think it's due to the repeated case-insensitive sets in code like ((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.?)+[a-z]{2,}|' .... Take out the 'i' flag and it doesn't hang, but of course it will not work as desired. And even with the ignore-case flag, those tests reject high Unicode values that are allowed.

The best one that has been mentioned is:

function isURL(str) {
  return /^(?:\w+:)?\/\/([^\s\.]+\.\S{2}|localhost[\:?\d]*)\S*$/.test(str); 
}

That comes from the GitHub repository segmentio/is-url. The good thing about a code repository is that you can see the tests, any issues, and the test strings run through it. There's a branch that would allow strings with a missing protocol, like google.com, though you're probably making too many assumptions then. The repository has since been updated, and I'm not planning to keep a mirror here. It has been broken up into separate tests to avoid regex ReDoS, which can be exploited for DoS attacks (I don't think you have to worry about that with client-side JS, but you do have to worry about your page hanging for so long that your visitor leaves your site).
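The idea of splitting one large pattern into several small tests can be sketched as follows. This is not the repository's actual code: the names isUrlSplit, PROTOCOL_AND_REST, LOCALHOST, and DOTTED_HOST are made up for illustration, and the patterns are deliberately simple, with no nested quantifiers.

```javascript
// Step 1: an optional scheme, then "//", then the rest without whitespace
const PROTOCOL_AND_REST = /^(?:\w+:)?\/\/(\S+)$/;
// Step 2a: localhost with an optional port and an optional path, query, or fragment
const LOCALHOST = /^localhost(?::\d+)?(?:[\/?#]\S*)?$/;
// Step 2b: anything with at least one dot followed by two or more characters
const DOTTED_HOST = /^[^\s.]+\.\S{2,}$/;

// Hypothetical function: each pattern is simple enough to run in roughly linear time.
function isUrlSplit(str) {
  if (typeof str !== "string") return false;
  const match = PROTOCOL_AND_REST.exec(str);
  if (match === null) return false;
  const rest = match[1]; // everything after the "//"
  return LOCALHOST.test(rest) || DOTTED_HOST.test(rest);
}

console.log(isUrlSplit("http://foo.com/blah_blah"));
console.log(isUrlSplit("a".repeat(35)));
```

Each step either fails fast or hands a shorter string to the next one, so there is no single pattern in which alternatives and repetitions can multiply.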

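There is another repository I've seen, dperini/regex-weburl.js, that may even be better for isURL, but it is highly complex. It has a bigger test list of valid and invalid URLs. The simple one above still passes all the positives and only fails to block a few odd negatives, such as http://a.b--c.de/ and the special IPs.

Whichever you choose, run it through this function, which I've adapted from the tests in dperini/regex-weburl.js, while using your browser's developer tools inspector.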

function testIsURL() {
//should match
console.assert(isURL("http://foo.com/blah_blah"));
console.assert(isURL("http://foo.com/blah_blah/"));
console.assert(isURL("http://foo.com/blah_blah_(wikipedia)"));
console.assert(isURL("http://foo.com/blah_blah_(wikipedia)_(again)"));
console.assert(isURL("http://www.example.com/wpstyle/?p=364"));
console.assert(isURL("https://www.example.com/foo/?bar=baz&inga=42&quux"));
console.assert(isURL("http://✪df.ws/123"));
console.assert(isURL("http://userid:password@example.com:8080"));
console.assert(isURL("http://userid:password@example.com:8080/"));
console.assert(isURL("http://userid@example.com"));
console.assert(isURL("http://userid@example.com/"));
console.assert(isURL("http://userid@example.com:8080"));
console.assert(isURL("http://userid@example.com:8080/"));
console.assert(isURL("http://userid:password@example.com"));
console.assert(isURL("http://userid:password@example.com/"));
console.assert(isURL("http://142.42.1.1/"));
console.assert(isURL("http://142.42.1.1:8080/"));
console.assert(isURL("http://➡.ws/䨹"));
console.assert(isURL("http://⌘.ws"));
console.assert(isURL("http://⌘.ws/"));
console.assert(isURL("http://foo.com/blah_(wikipedia)#cite-1"));
console.assert(isURL("http://foo.com/blah_(wikipedia)_blah#cite-1"));
console.assert(isURL("http://foo.com/unicode_(✪)_in_parens"));
console.assert(isURL("http://foo.com/(something)?after=parens"));
console.assert(isURL("http://☺.damowmow.com/"));
console.assert(isURL("http://code.google.com/events/#&product=browser"));
console.assert(isURL("http://j.mp"));
console.assert(isURL("ftp://foo.bar/baz"));
console.assert(isURL("http://foo.bar/?q=Test%20URL-encoded%20stuff"));
console.assert(isURL("http://مثال.إختبار"));
console.assert(isURL("http://例子.测试"));
console.assert(isURL("http://उदाहरण.परीक्षा"));
console.assert(isURL("http://-.~_!$&'()*+,;=:%40:80%2f::::::@example.com"));
console.assert(isURL("http://1337.net"));
console.assert(isURL("http://a.b-c.de"));
console.assert(isURL("http://223.255.255.254"));
console.assert(isURL("postgres://u:p@example.com:5702/db"));
console.assert(isURL("https://d1f4470da51b49289906b3d6cbd65074@app.getsentry.com/13176"));

//SHOULD NOT MATCH:
console.assert(!isURL("http://"));
console.assert(!isURL("http://."));
console.assert(!isURL("http://.."));
console.assert(!isURL("http://../"));
console.assert(!isURL("http://?"));
console.assert(!isURL("http://??"));
console.assert(!isURL("http://??/"));
console.assert(!isURL("http://#"));
console.assert(!isURL("http://##"));
console.assert(!isURL("http://##/"));
console.assert(!isURL("http://foo.bar?q=Spaces should be encoded"));
console.assert(!isURL("//"));
console.assert(!isURL("//a"));
console.assert(!isURL("///a"));
console.assert(!isURL("///"));
console.assert(!isURL("http:///a"));
console.assert(!isURL("foo.com"));
console.assert(!isURL("rdar://1234"));
console.assert(!isURL("h://test"));
console.assert(!isURL("http:// shouldfail.com"));
console.assert(!isURL(":// should fail"));
console.assert(!isURL("http://foo.bar/foo(bar)baz quux"));
console.assert(!isURL("ftps://foo.bar/"));
console.assert(!isURL("http://-error-.invalid/"));
console.assert(!isURL("http://a.b--c.de/"));
console.assert(!isURL("http://-a.b.co"));
console.assert(!isURL("http://a.b-.co"));
console.assert(!isURL("http://0.0.0.0"));
console.assert(!isURL("http://10.1.1.0"));
console.assert(!isURL("http://10.1.1.255"));
console.assert(!isURL("http://224.1.1.1"));
console.assert(!isURL("http://1.1.1.1.1"));
console.assert(!isURL("http://123.123.123"));
console.assert(!isURL("http://3628126748"));
console.assert(!isURL("http://.www.foo.bar/"));
console.assert(!isURL("http://www.foo.bar./"));
console.assert(!isURL("http://.www.foo.bar./"));
console.assert(!isURL("http://10.1.1.1"));
}

And then test that string of 'a's.

Before you post a seemingly great regex, see the comparison of isURL regexes by Mathias Bynens for more information.
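One way to run that check is to time the call, as in this small sketch. The isURL definition is repeated from above so the snippet runs on its own; the variable names are arbitrary.

```javascript
// Same pattern as the isURL function quoted earlier in this answer
function isURL(str) {
  return /^(?:\w+:)?\/\/([^\s\.]+\.\S{2}|localhost[\:?\d]*)\S*$/.test(str);
}

const manyAs = "a".repeat(35);
const startedAt = Date.now();
const result = isURL(manyAs);
const elapsedMs = Date.now() - startedAt;

// A pattern prone to catastrophic backtracking would take seconds or minutes here
console.log("result:", result, "elapsed ms:", elapsedMs);
```

Swap in any other candidate for isURL and rerun it; if the elapsed time grows noticeably as you lengthen the string, the pattern is not safe for user input.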