在JavaScript中是否有一种方法来检查字符串是否是URL?
regex被排除在外,因为URL很可能写成stackoverflow;也就是说,它可能没有。com, WWW或http。
在JavaScript中是否有一种方法来检查字符串是否是URL?
regex被排除在外,因为URL很可能写成stackoverflow;也就是说,它可能没有。com, WWW或http。
当前回答
我已经修改了所有的评论,注释和备注是这个主题,并做了一个新的正则表达式:
^((javascript:[\w-_]+(\([\w-_\s,.]*\))?)|(mailto:([\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+\.)*[\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+@([\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+\.)*[\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+)|(\w+:\/\/(([\w\u00C0-\u1FFF\u2C00-\uD7FF-]+\.)*([\w\u00C0-\u1FFF\u2C00-\uD7FF-]*\.?))(:\d+)?(((\/[^\s#$%^&*?]+)+|\/)(\?[\w\u00C0-\u1FFF\u2C00-\uD7FF:;&%_,.~+=-]+)?)?(#[\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+)?))$
你可以在这里测试和改进它https://regexr.com/668mt。
我检查了下一个值的表达式:
http://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&docid=nIv5rk2GyP3hXM&tbnid=isiOkMe3nCtexM:&ved=0CAUQjRw&url=http%3A%2F%2Fanimalcrossing.wikia.com%2Fwiki%2FLion&ei=ygZXU_2fGKbMsQTf4YLgAQ&bvm=bv.65177938,d.aWc&psig=AFQjCNEpBfKnal9kU7Zu4n7RnEt2nerN4g&ust=1398298682009707
http://192.168.0.4:55/
https://web.archive.org/web/20170817095211/https://github.com/Microsoft/vscode/issues/32405
http://www.example.com
javascript:void()
http://.
https://example.
https://en.m.wikipedia.org/wiki/C_Sharp_(programming_language)
http://zh.wikipedia.org/wiki/Wikipedia:关于中文维基百科/en?a#a
https://medium.com/@User_name/
https://test-test-test-test-test-test-test-test-test.web.app/
http://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&docid=nIv5rk2GyP3hXM&tbnid=isiOkMe3nCtexM:&ved=0CAUQjRw&url=http%3A%2F%2Fanimalcrossing.wikia.com%2Fwiki%2FLion&ei=ygZXU_2fGKbMsQTf4YLgAQ&bvm=bv.65177938,d.aWc&psig=AFQjCNEpBfKnal9kU7Zu4n7RnEt2nerN4g&ust=1398298682009707
https://sdfasdp.ppppppppppp
mailto:sadf@gmail.com
https://тест.юа
其他回答
2020年更新。 为了扩展@iamnewton和@ fernando Chavez Herrera的精彩回答,我已经开始看到@被用于url的路径。
所以更新后的正则表达式是:
RegExp('(https?:\\/\\/)?((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|((\\d{1,3}\\.){3}\\d{1,3}))(\\:\\d+)?(\\/[-a-z\\d%_.~+@]*)*(\\?[;&a-z\\d%_.~+=-]*)?(\\#[-a-z\\d_]*)?$', 'i');
如果你想在查询字符串和哈希中允许它,使用:
RegExp('(https?:\\/\\/)?((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|((\\d{1,3}\\.){3}\\d{1,3}))(\\:\\d+)?(\\/[-a-z\\d%_.~+@]*)*(\\?[;&a-z\\d%_.~+=-@]*)?(\\#[-a-z\\d_@]*)?$', 'i');
话虽如此,我不确定是否有白皮书规则禁止在查询字符串或哈希中使用@。
使用纯正则表达式很难做到这一点,因为url有很多“不方便”的地方。
For example domain names have complicated restrictions on hyphens: a. It is allowed to have many consecutive hyphens in the middle. b. but the first character and last character of the domain name cannot be a hyphen c. The 3rd and 4th character cannot be both hyphen Similarly port number can only be in the range 1-65535. This is easy to check if you extract the port part and convert to int but quite difficult to check with a regular expression. There is also no easy way to check valid domain extensions. Some countries have second-level domains(such as 'co.uk'), or the extension can be a long word such as '.international'. And new TLDs are added regularly. This type of things can only be checked against a hard-coded list. (see https://en.wikipedia.org/wiki/Top-level_domain) Then there are magnet urls, ftp addresses etc. These all have different requirements.
然而,这里有一个函数可以处理几乎所有的事情,除了:
案例1。c 接受任何1-5位数的端口号 接受任何扩展2-13个字符 不接受ftp,磁铁等…
function isValidURL(input) { pattern = '^(https?:\\/\\/)?' + // protocol '((([a-zA-Z\\d]([a-zA-Z\\d-]{0,61}[a-zA-Z\\d])*\\.)+' + // sub-domain + domain name '[a-zA-Z]{2,13})' + // extension '|((\\d{1,3}\\.){3}\\d{1,3})' + // OR ip (v4) address '|localhost)' + // OR localhost '(\\:\\d{1,5})?' + // port '(\\/[a-zA-Z\\&\\d%_.~+-:@]*)*' + // path '(\\?[a-zA-Z\\&\\d%_.,~+-:@=;&]*)?' + // query string '(\\#[-a-zA-Z&\\d_]*)?$'; // fragment locator regex = new RegExp(pattern); return regex.test(input); } let tests = []; tests.push(['', false]); tests.push(['http://en.wikipedia.org/wiki/Procter_&_Gamble', true]); tests.push(['https://sdfasd', false]); tests.push(['http://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&docid=nIv5rk2GyP3hXM&tbnid=isiOkMe3nCtexM:&ved=0CAUQjRw&url=http%3A%2F%2Fanimalcrossing.wikia.com%2Fwiki%2FLion&ei=ygZXU_2fGKbMsQTf4YLgAQ&bvm=bv.65177938,d.aWc&psig=AFQjCNEpBfKnal9kU7Zu4n7RnEt2nerN4g&ust=1398298682009707', true]); tests.push(['https://stackoverflow.com/', true]); tests.push(['https://w', false]); tests.push(['aaa', false]); tests.push(['aaaa', false]); tests.push(['oh.my', true]); tests.push(['dfdsfdsfdfdsfsdfs', false]); tests.push(['google.co.uk', true]); tests.push(['test-domain.MUSEUM', true]); tests.push(['-hyphen-start.gov.tr', false]); tests.push(['hyphen-end-.com', false]); tests.push(['https://sdfasdp.international', true]); tests.push(['https://sdfasdp.pppppppp', false]); tests.push(['https://sdfasdp.ppppppppppppppppppp', false]); tests.push(['https://sdfasd', false]); tests.push(['https://sub1.1234.sub3.sub4.sub5.co.uk/?', true]); tests.push(['http://www.google-com.123', false]); tests.push(['http://my--testdomain.com', false]); tests.push(['http://my2nd--testdomain.com', true]); tests.push(['http://thingiverse.com/download:1894343', true]); tests.push(['https://medium.com/@techytimo', true]); tests.push(['http://localhost', true]); tests.push(['localhost', true]); tests.push(['localhost:8080', true]); tests.push(['localhost:65536', true]); tests.push(['localhost:80000', false]); tests.push(['magnet:?xt=urn:btih:123', true]); for (let i = 0; i < tests.length; i++) { console.log('Test #' + i + (isValidURL(tests[i][0]) == tests[i][1] ? ' passed' : ' failed') + ' on ["' + tests[i][0] + '", ' + tests[i][1] + ']'); }
我已经修改了所有的评论,注释和备注是这个主题,并做了一个新的正则表达式:
^((javascript:[\w-_]+(\([\w-_\s,.]*\))?)|(mailto:([\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+\.)*[\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+@([\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+\.)*[\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+)|(\w+:\/\/(([\w\u00C0-\u1FFF\u2C00-\uD7FF-]+\.)*([\w\u00C0-\u1FFF\u2C00-\uD7FF-]*\.?))(:\d+)?(((\/[^\s#$%^&*?]+)+|\/)(\?[\w\u00C0-\u1FFF\u2C00-\uD7FF:;&%_,.~+=-]+)?)?(#[\w\u00C0-\u1FFF\u2C00-\uD7FF-_]+)?))$
你可以在这里测试和改进它https://regexr.com/668mt。
我检查了下一个值的表达式:
http://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&docid=nIv5rk2GyP3hXM&tbnid=isiOkMe3nCtexM:&ved=0CAUQjRw&url=http%3A%2F%2Fanimalcrossing.wikia.com%2Fwiki%2FLion&ei=ygZXU_2fGKbMsQTf4YLgAQ&bvm=bv.65177938,d.aWc&psig=AFQjCNEpBfKnal9kU7Zu4n7RnEt2nerN4g&ust=1398298682009707
http://192.168.0.4:55/
https://web.archive.org/web/20170817095211/https://github.com/Microsoft/vscode/issues/32405
http://www.example.com
javascript:void()
http://.
https://example.
https://en.m.wikipedia.org/wiki/C_Sharp_(programming_language)
http://zh.wikipedia.org/wiki/Wikipedia:关于中文维基百科/en?a#a
https://medium.com/@User_name/
https://test-test-test-test-test-test-test-test-test.web.app/
http://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&docid=nIv5rk2GyP3hXM&tbnid=isiOkMe3nCtexM:&ved=0CAUQjRw&url=http%3A%2F%2Fanimalcrossing.wikia.com%2Fwiki%2FLion&ei=ygZXU_2fGKbMsQTf4YLgAQ&bvm=bv.65177938,d.aWc&psig=AFQjCNEpBfKnal9kU7Zu4n7RnEt2nerN4g&ust=1398298682009707
https://sdfasdp.ppppppppppp
mailto:sadf@gmail.com
https://тест.юа
我建议使用锚元素,而不是使用正则表达式。
当你设置一个锚的href属性时,其他各种属性也会被设置。
var parser = document.createElement('a');
parser.href = "http://example.com:3000/pathname/?search=test#hash";
parser.protocol; // => "http:"
parser.hostname; // => "example.com"
parser.port; // => "3000"
parser.pathname; // => "/pathname/"
parser.search; // => "?search=test"
parser.hash; // => "#hash"
parser.host; // => "example.com:3000"
源
但是,如果href绑定的值不是一个有效的url,那么这些辅助属性的值将是空字符串。
编辑:正如评论中指出的:如果使用了无效的url,则可以替换当前url的属性。
所以,只要你没有传递当前页面的URL,你可以这样做:
function isValidURL(str) {
var a = document.createElement('a');
a.href = str;
return (a.host && a.host != window.location.host);
}
这里还有另一种方法。
// ***note***: if the incoming value is empty(""), the function returns true var elm; function isValidURL(u){ //A precaution/solution for the problem written in the ***note*** if(u!==""){ if(!elm){ elm = document.createElement('input'); elm.setAttribute('type', 'url'); } elm.value = u; return elm.validity.valid; } else{ return false } } console.log(isValidURL('')); console.log(isValidURL('http://www.google.com/')); console.log(isValidURL('//google.com')); console.log(isValidURL('google.com')); console.log(isValidURL('localhost:8000'));