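I'm after a regex that will validate a full, complex UK postcode, and only the postcode, within an input string. All of the uncommon postcode forms must be covered as well as the usual ones. For instance:
Matches
CW3 9SS SE5 0EG SE50EG Se5 0eg WC2H 7LT
No match
aWC2H 7LT WC2H 7LTa WC2H
How do I solve this problem?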
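Current answer
Verified through empirical testing and observation, and confirmed against https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation, here is my Python version of a regex that correctly parses and validates UK postcodes:
UK_POSTCODE_REGEX = r'(?P<postcode_area>[A-Z]{1,2})(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))(?P<sector>[0-9])(?P<postcode>[A-Z]{2})'
This regex is simple and has capture groups. It does not include every validation of legal UK postcodes; it only takes the positions of letters and digits into account.
Here is how I use it in code: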
import re
from dataclasses import dataclass


@dataclass
class UKPostcode:
    postcode_area: str
    district: str
    sector: str  # the regex group yields the digit as text, so this is a str
    postcode: str

    # https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
    # Original author of this regex: @jontsai
    # NOTE TO FUTURE DEVELOPER:
    # Verified through empirical testing and observation, as well as confirming with the Wiki article
    # If this regex fails to capture all valid UK postcodes, then I apologize, for I am only human.
    UK_POSTCODE_REGEX = r'(?P<postcode_area>[A-Z]{1,2})(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))(?P<sector>[0-9])(?P<postcode>[A-Z]{2})'

    @classmethod
    def from_postcode(cls, postcode):
        """Parses a string into a UKPostcode

        Returns a UKPostcode or None
        """
        # Spaces are stripped first, so 'EC1A 1BB' and 'EC1A1BB' parse alike
        m = re.match(cls.UK_POSTCODE_REGEX, postcode.replace(' ', ''))

        if m:
            uk_postcode = UKPostcode(
                postcode_area=m.group('postcode_area'),
                district=m.group('district'),
                sector=m.group('sector'),
                postcode=m.group('postcode')
            )
        else:
            uk_postcode = None

        return uk_postcode


def parse_uk_postcode(postcode):
    """Wrapper for UKPostcode.from_postcode
    """
    uk_postcode = UKPostcode.from_postcode(postcode)
    return uk_postcode
Here are the unit tests:
import pytest

# UKPostcode and parse_uk_postcode are the definitions shown above


@pytest.mark.parametrize(
    'postcode, expected', [
        # https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
        (
            'EC1A1BB',
            UKPostcode(
                postcode_area='EC',
                district='1A',
                sector='1',
                postcode='BB'
            ),
        ),
        (
            'W1A0AX',
            UKPostcode(
                postcode_area='W',
                district='1A',
                sector='0',
                postcode='AX'
            ),
        ),
        (
            'M11AE',
            UKPostcode(
                postcode_area='M',
                district='1',
                sector='1',
                postcode='AE'
            ),
        ),
        (
            'B338TH',
            UKPostcode(
                postcode_area='B',
                district='33',
                sector='8',
                postcode='TH'
            )
        ),
        (
            'CR26XH',
            UKPostcode(
                postcode_area='CR',
                district='2',
                sector='6',
                postcode='XH'
            )
        ),
        (
            'DN551PT',
            UKPostcode(
                postcode_area='DN',
                district='55',
                sector='1',
                postcode='PT'
            )
        )
    ]
)
def test_parse_uk_postcode(postcode, expected):
    uk_postcode = parse_uk_postcode(postcode)
    assert uk_postcode == expected
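As written, the pattern in this answer only accepts uppercase letters, and because `re.match` anchors only the start of the string, trailing characters after a valid postcode are tolerated. A minimal sketch of one way to tighten this for the examples in the question, assuming that letter case and spaces can be normalized away first; `is_uk_postcode_shape` is a hypothetical helper name, not part of the answer:

```python
import re

# Same pattern as in the answer: it checks letter/digit positions only.
UK_POSTCODE_REGEX = (
    r'(?P<postcode_area>[A-Z]{1,2})'
    r'(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))'
    r'(?P<sector>[0-9])'
    r'(?P<postcode>[A-Z]{2})'
)


def is_uk_postcode_shape(text):
    """Hypothetical helper: True if text has the shape of a UK postcode."""
    # Assumption: spaces and letter case are not significant.
    normalized = text.replace(' ', '').upper()
    # fullmatch anchors both ends, so leading or trailing extras are rejected.
    return re.fullmatch(UK_POSTCODE_REGEX, normalized) is not None


for candidate in ['CW3 9SS', 'SE50EG', 'Se5 0eg', 'WC2H 7LT',
                  'aWC2H 7LT', 'WC2H 7LTa', 'WC2H']:
    print(candidate, is_uk_postcode_shape(candidate))
```

This still checks shape only; it does not confirm that a postcode actually exists.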
Other answers
Adding to this list a more practical regex, one that also allows the user to enter an empty string:
^$|^(([gG][iI][rR] {0,}0[aA]{2})|((([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y]?[0-9][0-9]?)|(([a-pr-uwyzA-PR-UWYZ][0-9][a-hjkstuwA-HJKSTUW])|([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y][0-9][abehmnprv-yABEHMNPRV-Y]))) {0,1}[0-9][abd-hjlnp-uw-zABD-HJLNP-UW-Z]{2}))$
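This regex allows uppercase and lowercase letters, with an optional space in the middle.
From a software developer's point of view, this regex is useful for software where the address may be optional, for example when a user does not want to supply their address details.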
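A small sketch of how this pattern might be used for an optional form field. The pattern is copied verbatim from the answer and split across lines only for readability; the function name `is_valid_optional_postcode` is an assumption made for illustration:

```python
import re

# Pattern from the answer; adjacent raw strings are joined by Python.
OPTIONAL_POSTCODE_REGEX = re.compile(
    r'^$|^(([gG][iI][rR] {0,}0[aA]{2})|'
    r'((([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y]?[0-9][0-9]?)|'
    r'(([a-pr-uwyzA-PR-UWYZ][0-9][a-hjkstuwA-HJKSTUW])|'
    r'([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y][0-9][abehmnprv-yABEHMNPRV-Y])))'
    r' {0,1}[0-9][abd-hjlnp-uw-zABD-HJLNP-UW-Z]{2}))$'
)


def is_valid_optional_postcode(value):
    """Hypothetical helper: accept an empty field or a well-formed postcode."""
    # fullmatch also rules out a trailing newline, which a bare $ would allow.
    return OPTIONAL_POSTCODE_REGEX.fullmatch(value) is not None


print(is_valid_optional_postcode(''))          # field left blank
print(is_valid_optional_postcode('SE5 0EG'))   # with a space
print(is_valid_optional_postcode('se50eg'))    # lowercase, no space
print(is_valid_optional_postcode('WC2H'))      # outward code only
```

If the field is mandatory, dropping the leading `^$|` alternative should be enough to reject empty input.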
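I've been looking for a UK postcode regex for the last day or so and stumbled on this thread. I worked through most of the suggestions above and none of them worked for me, so I came up with my own regex which, as far as I know, captures all valid UK postcodes as of Jan '13 (according to the latest literature from the Royal Mail).
The regex and some simple postcode-checking PHP code are posted below. NOTE: It allows lower- or uppercase postcodes and the GIR 0AA anomaly, but to deal with the more-than-likely presence of a space in the middle of an entered postcode it also uses a simple str_replace to remove the space before testing against the regex. Any discrepancies beyond that and the Royal Mail themselves don't even mention them in their literature (see http://www.royalmail.com/sites/default/files/docs/pdf/programmers_guide_edition_7_v5.pdf and start reading from page 17)!
NOTE: In the Royal Mail's own literature (link above) there is a slight ambiguity about the 3rd and 4th positions and the exceptions that apply when those characters are letters. I contacted the Royal Mail directly to clear it up and, in their own words, "A letter in the 4th position of the Outward Code with the format AANA NAA has no exceptions and the 3rd position exceptions apply only to the last letter of the Outward Code with the format ANA NAA." Straight from the horse's mouth!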
<?php
$postcoderegex = '/^([g][i][r][0][a][a])$|^((([a-pr-uwyz]{1}([0]|[1-9]\d?))|([a-pr-uwyz]{1}[a-hk-y]{1}([0]|[1-9]\d?))|([a-pr-uwyz]{1}[1-9][a-hjkps-uw]{1})|([a-pr-uwyz]{1}[a-hk-y]{1}[1-9][a-z]{1}))(\d[abd-hjlnp-uw-z]{2})?)$/i';
$postcode2check = str_replace(' ','',$postcode2check);
if (preg_match($postcoderegex, $postcode2check)) {
    echo "$postcode2check is a valid postcode<br>";
} else {
    echo "$postcode2check is not a valid postcode<br>";
}
?>
I hope it helps anyone else who comes across this thread looking for a solution.
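For readers not using PHP, here is a rough Python port of the same check, offered as a sketch rather than a tested equivalent. The pattern is the one from the PHP code; `is_valid_postcode` is a hypothetical name, and `re.ASCII` is an assumption added so that `\d` and case-insensitive matching stay limited to ASCII. Reading the pattern, the final group is optional, so a bare outward code such as WC2H appears to be accepted as well, which differs from what the question asks for.

```python
import re

# The PHP pattern; its /i modifier becomes re.IGNORECASE.
POSTCODE_REGEX = re.compile(
    r'^([g][i][r][0][a][a])$|'
    r'^((([a-pr-uwyz]{1}([0]|[1-9]\d?))|'
    r'([a-pr-uwyz]{1}[a-hk-y]{1}([0]|[1-9]\d?))|'
    r'([a-pr-uwyz]{1}[1-9][a-hjkps-uw]{1})|'
    r'([a-pr-uwyz]{1}[a-hk-y]{1}[1-9][a-z]{1}))'
    r'(\d[abd-hjlnp-uw-z]{2})?)$',
    re.IGNORECASE | re.ASCII,
)


def is_valid_postcode(postcode):
    """Hypothetical helper mirroring the PHP snippet."""
    # Same preprocessing as str_replace(' ', '', ...) in the PHP code.
    stripped = postcode.replace(' ', '')
    return POSTCODE_REGEX.fullmatch(stripped) is not None


for candidate in ['CW3 9SS', 'gir 0aa', 'WC2H 7LT', 'WC2H', 'WC2H 7LTa']:
    print(candidate, is_valid_postcode(candidate))
```

If outward codes on their own should be rejected, removing the trailing `?` after the last group is one possible adjustment.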
Based on the table on Wikipedia,
this pattern covers all cases:
(?:[A-Za-z]\d ?\d[A-Za-z]{2})|(?:[A-Za-z][A-Za-z\d]\d ?\d[A-Za-z]{2})|(?:[A-Za-z]{2}\d{2} ?\d[A-Za-z]{2})|(?:[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]{2})|(?:[A-Za-z]{2}\d[A-Za-z] ?\d[A-Za-z]{2})
When using it on Android / Java, write \\d instead, because the backslash has to be escaped inside a Java string literal.
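Note that this pattern has no `^` or `$` anchors, so whether it validates a whole string or finds a postcode inside a longer one depends on how it is called. A short Python sketch of the difference; the constant name is hypothetical and `re.ASCII` is an assumption added to keep `\d` to the digits 0-9:

```python
import re

# The pattern from the answer, split across lines for readability.
WIKI_TABLE_PATTERN = re.compile(
    r'(?:[A-Za-z]\d ?\d[A-Za-z]{2})|'
    r'(?:[A-Za-z][A-Za-z\d]\d ?\d[A-Za-z]{2})|'
    r'(?:[A-Za-z]{2}\d{2} ?\d[A-Za-z]{2})|'
    r'(?:[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]{2})|'
    r'(?:[A-Za-z]{2}\d[A-Za-z] ?\d[A-Za-z]{2})',
    re.ASCII,
)

text = 'aWC2H 7LT'

# search() finds a postcode-shaped substring anywhere in the text.
found = WIKI_TABLE_PATTERN.search(text)
print(found.group() if found else None)

# fullmatch() requires the entire string to be a postcode.
print(WIKI_TABLE_PATTERN.fullmatch(text) is not None)
print(WIKI_TABLE_PATTERN.fullmatch('WC2H 7LT') is not None)
```

For the validation asked about in the question, `fullmatch` (or explicit anchors around the whole alternation) is the behaviour that rejects `aWC2H 7LT` and `WC2H 7LTa`.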
This is the regex Google serves on their i18napis.appspot.com domain:
GIR[ ]?0AA|((AB|AL|B|BA|BB|BD|BH|BL|BN|BR|BS|BT|BX|CA|CB|CF|CH|CM|CO|CR|CT|CV|CW|DA|DD|DE|DG|DH|DL|DN|DT|DY|E|EC|EH|EN|EX|FK|FY|G|GL|GY|GU|HA|HD|HG|HP|HR|HS|HU|HX|IG|IM|IP|IV|JE|KA|KT|KW|KY|L|LA|LD|LE|LL|LN|LS|LU|M|ME|MK|ML|N|NE|NG|NN|NP|NR|NW|OL|OX|PA|PE|PH|PL|PO|PR|RG|RH|RM|S|SA|SE|SG|SK|SL|SM|SN|SO|SP|SR|SS|ST|SW|SY|TA|TD|TF|TN|TQ|TR|TS|TW|UB|W|WA|WC|WD|WF|WN|WR|WS|WV|YO|ZE)(\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}))|BFPO[ ]?\d{1,4}
I made a regex for UK postcode validation today. As far as I know it works for all UK postcodes, whether or not you put a space in.
^((([a-zA-Z][0-9])|([a-zA-Z][0-9]{2})|([a-zA-Z]{2}[0-9])|([a-zA-Z]{2}[0-9]{2})|([A-Za-z][0-9][a-zA-Z])|([a-zA-Z]{2}[0-9][a-zA-Z]))(\s*[0-9][a-zA-Z]{2})$)
Please let me know if there is any format it doesn't cover.
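As a quick check, the examples from the question can be run through this pattern. The sketch below assumes Python's `re` module; the list names are made up for illustration:

```python
import re

# The pattern from the answer, split across lines for readability.
POSTCODE_REGEX = re.compile(
    r'^((([a-zA-Z][0-9])|([a-zA-Z][0-9]{2})|([a-zA-Z]{2}[0-9])|'
    r'([a-zA-Z]{2}[0-9]{2})|([A-Za-z][0-9][a-zA-Z])|'
    r'([a-zA-Z]{2}[0-9][a-zA-Z]))(\s*[0-9][a-zA-Z]{2})$)'
)

# Examples taken from the question.
SHOULD_MATCH = ['CW3 9SS', 'SE5 0EG', 'SE50EG', 'Se5 0eg', 'WC2H 7LT']
SHOULD_NOT_MATCH = ['aWC2H 7LT', 'WC2H 7LTa', 'WC2H']

for candidate in SHOULD_MATCH + SHOULD_NOT_MATCH:
    matched = POSTCODE_REGEX.match(candidate) is not None
    print(f'{candidate!r}: {"match" if matched else "no match"}')
```

Because the separator is `\s*`, the pattern also accepts tabs or several spaces between the two halves, and it checks only the letter/digit layout, not which letters are allowed in each position.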