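Does anyone have a regular expression that matches any valid DNS hostname or IP address?

It's easy to write one that works 95% of the time, but I'm hoping to get something well tested that exactly matches the latest RFC specifications for DNS hostnames.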


Current answer

"^((\\d{1,2}|1\\d{2}|2[0-4]\\d|25[0-5])\.){3}(\\d{1,2}|1\\d{2}|2[0-4]\\d|25[0-5])$"

Other answers

I thought about this simple regex pattern for IP address matching: \d+[.]\d+[.]\d+[.]\d+
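
A pattern this simple checks only the shape of the address, so it also accepts strings such as 999.1.1.1. Here is a minimal Python sketch that pairs it with a numeric range check. It assumes the pattern is `\d+[.]\d+[.]\d+[.]\d+`; the names `SIMPLE_IPV4` and `is_plausible_ipv4` are made up for illustration.

```python
import re

# Shape check only: four runs of ASCII digits separated by dots.
# SIMPLE_IPV4 and is_plausible_ipv4 are hypothetical names for this sketch.
SIMPLE_IPV4 = re.compile(r"\d+[.]\d+[.]\d+[.]\d+", re.ASCII)


def is_plausible_ipv4(candidate):
    """Return True if candidate has four dotted numbers, each at most 255."""
    # fullmatch rejects extra text before or after the four numbers.
    if SIMPLE_IPV4.fullmatch(candidate) is None:
        return False
    # The regex cannot express the 0-255 range, so check it numerically.
    return all(int(part) <= 255 for part in candidate.split("."))
```

Leading zeros pass this check because `int("010")` is simply 10; whether that is desirable depends on how strict you want to be.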

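smink's hostname regex does not observe the limit on the length of individual labels within a hostname. Each label within a valid hostname may be no more than 63 octets long.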

ValidHostnameRegex="^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])\
(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$"

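Note that the backslash at the end of the first line (above) is Unix shell syntax for splitting a long line. It is not part of the regular expression itself.

Here is just the regular expression on a single line: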

^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$

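You should also separately check that the total length of the hostname does not exceed 255 characters. For more information, please consult RFC-952 and RFC-1123.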
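
As a rough illustration of combining the per-label pattern with the separate total-length check, here is a minimal Python sketch. The label pattern is the one from the regex above; the names `LABEL`, `HOSTNAME_RE`, `MAX_HOSTNAME_LENGTH` and `is_valid_hostname` are made up for this example, and the 255 limit is taken as stated in this answer.

```python
import re

# One label: 1 to 63 characters, letters/digits/hyphens,
# with no hyphen at the start or end (same alternation as the regex above).
LABEL = r"(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])"

# One label followed by zero or more ".label" groups.
HOSTNAME_RE = re.compile(LABEL + r"(?:\." + LABEL + r")*")

# Total-length limit as quoted in the answer (an assumption of this sketch).
MAX_HOSTNAME_LENGTH = 255


def is_valid_hostname(name):
    """Return True if name passes both the length check and the label pattern."""
    if len(name) > MAX_HOSTNAME_LENGTH:
        return False
    # fullmatch anchors at both ends, so no ^ or $ is needed in the pattern.
    return HOSTNAME_RE.fullmatch(name) is not None
```

Using `fullmatch` instead of `^...$` avoids the Python quirk where `$` also matches just before a trailing newline.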

Regarding IP addresses, it appears that there is some debate on whether to include leading zeros. It was once the common practice and is generally accepted, so I would argue that they should be flagged as valid regardless of the current preference. There is also some ambiguity over whether text before and after the string should be validated and, again, I think it should. 1.2.3.4 is a valid IP but 1.2.3.4.5 is not and neither the 1.2.3.4 portion nor the 2.3.4.5 portion should result in a match. Some of the concerns can be handled with this expression:

grep -E '(^|[^[:alnum:]])(([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])\.){3}([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])([^[:alnum:]]|$)'

The unfortunate part here is that the regex portion that validates an octet is repeated, as is true in many offered solutions. Although this is better than four instances of the pattern, the repetition can be eliminated entirely if subroutines are supported in the regex being used. The next example enables those functions with the -P switch of grep and also takes advantage of lookahead and lookbehind functionality. (The function name I selected is 'o' for octet. I could have used 'octet' as the name but wanted to be terse.)

grep -P '(?<![\d\w\.])(?<o>([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(\.\g<o>){3}(?![\d\w\.])'

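If the IP addresses are in a file with text in the form of sentences, the handling of dots can actually produce false negatives, because a period ending a sentence can follow an address without being part of the dotted notation. A variant of the above fixes that: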

grep -P '(?<![\d\w\.])(?<x>([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(\.\g<x>){3}(?!([\d\w]|\.\d))'
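
For anyone not using grep, the same idea can be sketched in Python. Python's `re` module has no subroutine calls like `\g<x>`, so this sketch builds the pattern by string concatenation to keep the octet expression in one place. The names `OCTET`, `IPV4_IN_TEXT` and `find_ipv4` are made up for illustration.

```python
import re

# One octet, written once and reused (leading zeros allowed, as argued above).
# OCTET, IPV4_IN_TEXT and find_ipv4 are hypothetical names for this sketch.
OCTET = r"(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])"

IPV4_IN_TEXT = re.compile(
    r"(?<![\w.])"                          # not preceded by a word char or a dot
    + OCTET + r"(?:\." + OCTET + r"){3}"   # four octets separated by dots
    + r"(?!\w|\.\d)"                       # not followed by a word char or ".digit"
)


def find_ipv4(text):
    """Return every standalone dotted-quad address found in text."""
    # All groups are non-capturing, so findall returns the full matches.
    return IPV4_IN_TEXT.findall(text)
```

With this lookahead, a sentence-ending period after an address no longer blocks the match, while 1.2.3.4.5 is still rejected as a whole.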

This works for valid IP addresses:

regex = '^([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])$'

AddressRegex = "^(ftp|http|https):\/\/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5})$";

HostnameRegex =  /^(ftp|http|https):\/\/([a-z0-9]+\.)?[a-z0-9][a-z0-9-]*((\.[a-z]{2,6})|(\.[a-z]{2,6})(\.[a-z]{2,6}))$/i

These regexes are meant only for this type of validation.

They work only for inputs such as: http://www.kk.com http://www.kk.co.in

They do not work for:

http://www.kk.com/ http://www.kk.co.in.kk

http://www.kk.com/dfas http://www.kk.co.in/