Does anyone have suggestions for detecting URLs in a set of strings?
arrayOfStrings.forEach(function(string){
// detect URLs in strings and do something swell,
// like creating elements with links.
});
Update: I wound up using this regex for link detection… apparently several years later.
kLINK_DETECTION_REGEX = /(([a-z]+:\/\/)?(([a-z0-9\-]+\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|local|internal))(:[0-9]{1,5})?(\/[a-z0-9_\-\.~]+)*(\/([a-z0-9_\-\.]*)(\?[a-z0-9+_\-\.%=&]*)?)?(#[a-zA-Z0-9!$&'()*+.=-_~:@/?]*)?)(\s+|$)/gi
The full helper (with optional Handlebars support) is at gist #1654670.
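As a rough sketch of how the regex above could be applied (this is not the helper from the gist): the pattern ends with `(\s+|$)`, so each full match also swallows the trailing whitespace, and the link itself sits in capture group 1. The `findLinks` function and the sample sentence are hypothetical names made up for illustration.

```javascript
var kLINK_DETECTION_REGEX = /(([a-z]+:\/\/)?(([a-z0-9\-]+\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|local|internal))(:[0-9]{1,5})?(\/[a-z0-9_\-\.~]+)*(\/([a-z0-9_\-\.]*)(\?[a-z0-9+_\-\.%=&]*)?)?(#[a-zA-Z0-9!$&'()*+.=-_~:@/?]*)?)(\s+|$)/gi;

// Hypothetical helper (not from the gist): collect capture group 1 of every match.
function findLinks(text) {
  var links = [];
  var match;
  // The regex is global and therefore stateful, so reset it before each scan.
  kLINK_DETECTION_REGEX.lastIndex = 0;
  while ((match = kLINK_DETECTION_REGEX.exec(text)) !== null) {
    links.push(match[1]); // group 1 is the link without the trailing whitespace
  }
  return links;
}

console.log(findLinks('Visit example.com or https://sub.example.org/path?x=1 today'));
// [ 'example.com', 'https://sub.example.org/path?x=1' ]
```

Note that the scheme is optional in this pattern, which is why the bare `example.com` is picked up as well.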
First you need a good regex that matches URLs. This is hard to do. See here, here and here:
...almost anything is a valid URL. There
are some punctuation rules for
splitting it up. Absent any
punctuation, you still have a valid
URL.
Check the RFC carefully and see if you
can construct an "invalid" URL. The
rules are very flexible.
For example ::::: is a valid URL.
The path is ":::::". A pretty
stupid filename, but a valid filename.
Also, ///// is a valid URL. The
netloc ("hostname") is "". The path
is "///". Again, stupid. Also
valid. This URL normalizes to "///"
which is the equivalent.
Something like "bad://///worse/////"
is perfectly valid. Dumb but valid.
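Anyway, this answer is not meant to give you the best regex, but rather to demonstrate how to do the string wrapping inside the text with JavaScript.
OK, so let's just use this one: /(https?:\/\/[^\s]+)/g
Again, this is a bad regex. It will have many false positives. However, it's good enough for this example.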
function urlify(text) {
var urlRegex = /(https?:\/\/[^\s]+)/g;
return text.replace(urlRegex, function(url) {
return '<a href="' + url + '">' + url + '</a>';
})
// or alternatively
// return text.replace(urlRegex, '<a href="$1">$1</a>')
}
var text = 'Find me at http://www.example.com and also at http://stackoverflow.com';
var html = urlify(text);
console.log(html)
// html now looks like:
// "Find me at <a href="http://www.example.com">http://www.example.com</a> and also at <a href="http://stackoverflow.com">http://stackoverflow.com</a>"
So in sum, try:
$$('#pad dl dd').each(function(element) {
element.innerHTML = urlify(element.innerHTML);
});
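To make the "false positives" caveat concrete, here is a small sketch using the same simple pattern. The sample sentence is made up for illustration. Because `[^\s]+` runs until the next whitespace, punctuation that directly follows a URL ends up inside the match.

```javascript
// Same simple pattern as in urlify.
var urlRegex = /(https?:\/\/[^\s]+)/g;

// Hypothetical sample: one URL is followed by a period, the other is wrapped in parentheses.
var sample = 'See http://www.example.com. Or (http://stackoverflow.com) instead.';

var matches = sample.match(urlRegex);
console.log(matches);
// [ 'http://www.example.com.', 'http://stackoverflow.com)' ]
```

Both matches carry a stray trailing character, so the generated href values would be wrong for text like this.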
You can use a regex like this to extract normal URL patterns.
(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})
If you need more sophisticated patterns, use a library like this.
https://www.npmjs.com/package/pattern-dreamer
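A minimal sketch of using the regex above from JavaScript. The pattern is given without delimiters or flags, so wrapping it in `/.../g` is an assumption here, as are the `extractUrlRegex` name and the sample text.

```javascript
// The pattern from this answer, wrapped in a literal with the global flag (assumed).
var extractUrlRegex = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})/g;

// Hypothetical sample text with one full URL and one bare "www." host.
var sampleText = 'Docs at https://www.example.com/path and www.example.org today';

// With the g flag, match() returns every match, or null if there are none.
var urls = sampleText.match(extractUrlRegex) || [];
console.log(urls);
// [ 'https://www.example.com/path', 'www.example.org' ]
```

Like the simpler pattern, this one ends in `[^\s]{2,}`, so punctuation directly after a URL would be included in the match.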
Here is what I ended up using as my regex:
var urlRegex =/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
This doesn't include trailing punctuation in the URL. Crescent's function works like a charm :)
So:
function linkify(text) {
var urlRegex =/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
return text.replace(urlRegex, function(url) {
return '<a href="' + url + '">' + url + '</a>';
});
}
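A short usage sketch showing the trailing-punctuation behaviour. The function is repeated so the snippet runs on its own, and the sample sentence is made up for illustration. The last character class in the pattern omits `, . ; : ! ?`, so a match cannot end on one of those characters.

```javascript
function linkify(text) {
  var urlRegex = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
  return text.replace(urlRegex, function(url) {
    return '<a href="' + url + '">' + url + '</a>';
  });
}

// Hypothetical sample: the first URL is followed by a comma, the second by a period.
var linked = linkify('Visit http://www.example.com/page, or ftp://files.example.com.');
console.log(linked);
// Visit <a href="http://www.example.com/page">http://www.example.com/page</a>, or <a href="ftp://files.example.com">ftp://files.example.com</a>.
```

The comma and the final period stay outside the generated links.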