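I'm after a regex that validates a full, complex UK postcode and matches only when the postcode is the entire input string. All of the uncommon postcode forms must be covered as well as the usual ones. For instance:
Matches
CW3 9SS, SE5 0EG, SE50EG, Se5 0eg, WC2H 7LT
No match
aWC2H 7LT, WC2H 7LTa, WC2H
How do I solve this problem?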
Current answer
Here is how we have been dealing with the UK postcode issue:
^([A-Za-z]{1,2}[0-9]{1,2}[A-Za-z]?[ ]?)([0-9]{1}[A-Za-z]{2})$
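Explanation:
- expect 1 or 2 a-z characters, upper or lower case
- expect 1 or 2 digits
- expect 0 or 1 a-z character, upper or lower case
- an optional space is allowed
- expect 1 digit
- expect 2 a-z characters, upper or lower case
This gets most formats. We then use a database to check whether the postcode is actually real; that data is driven by openpoint: https://www.ordnancesurvey.co.uk/opendatadownload/products.html
Hope this helps.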
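As a quick check, the pattern above can be run against the examples from the question. This is a minimal Python sketch; the helper name `looks_like_uk_postcode` is made up for illustration, and it tests the format only, not whether the postcode exists.

```python
import re

# The pattern from the answer above, unchanged.
POSTCODE_FORMAT = re.compile(
    r'^([A-Za-z]{1,2}[0-9]{1,2}[A-Za-z]?[ ]?)([0-9]{1}[A-Za-z]{2})$'
)


def looks_like_uk_postcode(text):
    """Hypothetical helper: True if text has the shape the pattern expects."""
    # fullmatch avoids the quirk where '$' also matches just before a trailing newline.
    return POSTCODE_FORMAT.fullmatch(text) is not None


should_match = ['CW3 9SS', 'SE5 0EG', 'SE50EG', 'Se5 0eg', 'WC2H 7LT']
should_not_match = ['aWC2H 7LT', 'WC2H 7LTa', 'WC2H']

for candidate in should_match + should_not_match:
    print(f'{candidate!r}: {looks_like_uk_postcode(candidate)}')
```

A format check like this would normally be followed by the database look-up the answer mentions, since a well-formed string is not necessarily a real postcode.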
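Other answers
Through empirical testing and observation, as well as confirming with https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation, here is my version of a Python regex that correctly parses and validates a UK postcode: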
UK_POSTCODE_REGEX = r'(?P<postcode_area>[A-Z]{1,2})(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))(?P<sector>[0-9])(?P<postcode>[A-Z]{2})'
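This regex is simple and has capture groups. It does not validate against the set of all legal UK postcodes; it only checks which positions hold letters and which hold digits.
Here is how I would use it in code: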
import re
from dataclasses import dataclass


@dataclass
class UKPostcode:
    postcode_area: str
    district: str
    sector: str  # the regex group yields text, so this is a str, not an int
    postcode: str

    # https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
    # Original author of this regex: @jontsai
    # NOTE TO FUTURE DEVELOPER:
    # Verified through empirical testing and observation, as well as confirming with the Wiki article
    # If this regex fails to capture all valid UK postcodes, then I apologize, for I am only human.
    UK_POSTCODE_REGEX = r'(?P<postcode_area>[A-Z]{1,2})(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))(?P<sector>[0-9])(?P<postcode>[A-Z]{2})'

    @classmethod
    def from_postcode(cls, postcode):
        """Parses a string into a UKPostcode

        Returns a UKPostcode or None
        """
        # Spaces are removed first, so 'EC1A 1BB' and 'EC1A1BB' parse the same way.
        # Note: re.match anchors only at the start, so trailing characters are not rejected.
        m = re.match(cls.UK_POSTCODE_REGEX, postcode.replace(' ', ''))
        if m:
            uk_postcode = UKPostcode(
                postcode_area=m.group('postcode_area'),
                district=m.group('district'),
                sector=m.group('sector'),
                postcode=m.group('postcode')
            )
        else:
            uk_postcode = None
        return uk_postcode


def parse_uk_postcode(postcode):
    """Wrapper for UKPostcode.from_postcode
    """
    uk_postcode = UKPostcode.from_postcode(postcode)
    return uk_postcode
Here are the unit tests:
import pytest

# UKPostcode and parse_uk_postcode are the definitions shown above.


@pytest.mark.parametrize(
    'postcode, expected', [
        # https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
        ('EC1A1BB', UKPostcode(postcode_area='EC', district='1A', sector='1', postcode='BB')),
        ('W1A0AX', UKPostcode(postcode_area='W', district='1A', sector='0', postcode='AX')),
        ('M11AE', UKPostcode(postcode_area='M', district='1', sector='1', postcode='AE')),
        ('B338TH', UKPostcode(postcode_area='B', district='33', sector='8', postcode='TH')),
        ('CR26XH', UKPostcode(postcode_area='CR', district='2', sector='6', postcode='XH')),
        ('DN551PT', UKPostcode(postcode_area='DN', district='55', sector='1', postcode='PT')),
    ]
)
def test_parse_uk_postcode(postcode, expected):
    uk_postcode = parse_uk_postcode(postcode)
    assert uk_postcode == expected
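The regex above accepts uppercase letters only, and `re.match` anchors only at the start of the string, so input such as `Se5 0eg` is rejected while `WC2H 7LTa` is accepted. One possible way to handle both cases from the question is to normalise the input and use `re.fullmatch`. This is a sketch, not the answer's own method: the function name `parse_normalised` is hypothetical, and it returns a plain dict rather than the `UKPostcode` class so that the snippet stays self-contained.

```python
import re

# Same pattern as in the answer above.
UK_POSTCODE_REGEX = (
    r'(?P<postcode_area>[A-Z]{1,2})'
    r'(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))'
    r'(?P<sector>[0-9])'
    r'(?P<postcode>[A-Z]{2})'
)


def parse_normalised(raw):
    """Hypothetical variant: upper-case the input, drop spaces, require a full match.

    Returns a dict of the named groups, or None if the shape does not fit.
    """
    cleaned = raw.upper().replace(' ', '')
    m = re.fullmatch(UK_POSTCODE_REGEX, cleaned)
    return m.groupdict() if m else None


print(parse_normalised('Se5 0eg'))
print(parse_normalised('WC2H 7LTa'))
```

Upper-casing before matching keeps the pattern itself unchanged; passing `re.IGNORECASE` would be an alternative, at the cost of returning groups in whatever case the user typed.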
I made a regex for UK postcode validation today. As far as I know, it works for all UK postcodes, whether or not you include a space.
^((([a-zA-Z][0-9])|([a-zA-Z][0-9]{2})|([a-zA-Z]{2}[0-9])|([a-zA-Z]{2}[0-9]{2})|([A-Za-z][0-9][a-zA-Z])|([a-zA-Z]{2}[0-9][a-zA-Z]))(\s*[0-9][a-zA-Z]{2})$)
Please let me know if there is a format it doesn't cover.
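The six alternatives in the first group are every combination of one or two letters, one digit, and an optional trailing digit or letter, so the pattern can probably be written more compactly. The condensed pattern below is an assumption of this sketch, not part of the answer; the loop compares the two patterns on a handful of sample strings, which is a spot check and not a proof of equivalence.

```python
import re

# The pattern from the answer, split across two string literals for readability.
ORIGINAL = re.compile(
    r'^((([a-zA-Z][0-9])|([a-zA-Z][0-9]{2})|([a-zA-Z]{2}[0-9])|([a-zA-Z]{2}[0-9]{2})'
    r'|([A-Za-z][0-9][a-zA-Z])|([a-zA-Z]{2}[0-9][a-zA-Z]))(\s*[0-9][a-zA-Z]{2})$)'
)

# Assumed condensation: 1-2 letters, a digit, an optional digit or letter,
# optional whitespace, then a digit and two letters.
CONDENSED = re.compile(r'^[A-Za-z]{1,2}[0-9][0-9A-Za-z]?\s*[0-9][A-Za-z]{2}$')

samples = [
    'CW3 9SS', 'SE5 0EG', 'SE50EG', 'Se5 0eg', 'WC2H 7LT', 'EC1A 1BB',
    'M1 1AE', 'B33 8TH', 'aWC2H 7LT', 'WC2H 7LTa', 'WC2H', 'GIR 0AA',
]

for sample in samples:
    same = bool(ORIGINAL.match(sample)) == bool(CONDENSED.match(sample))
    print(f'{sample!r}: patterns agree = {same}')
```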
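There is no such thing as a comprehensive UK postcode regular expression that is capable of validating a postcode. A regular expression can check that a postcode is in the correct format; it cannot check that the postcode actually exists.
Postcodes are arbitrarily complex and constantly changing. For instance, the outcode W1 does not, and may never, have every number between 1 and 99, and the same is true for every postcode area.
You can't expect what is there currently to be true forever. As an example, in 1990 the Post Office decided that Aberdeen was getting a bit crowded. They added a 0 to the end of AB1-5, making it AB10-50, and then created a number of postcodes in between these.
Whenever a new street is built, a new postcode is created. It's part of the process for obtaining permission to build; local authorities are obliged to keep this updated with the Post Office (not that they all do).
Furthermore, as noted by a number of other users, there are special postcodes such as Girobank's, GIR 0AA, and the one for letters to Santa, SAN TA1. You probably don't want to post anything there, but it doesn't appear to be covered by any other answer.
Then there are the BFPO postcodes, which are now changing to a more standard format. Both formats are going to be valid. Lastly, there are the overseas territories (source: Wikipedia).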
+----------+----------------------------------------------+
| Postcode | Location                                     |
+----------+----------------------------------------------+
| AI-2640  | Anguilla                                     |
| ASCN 1ZZ | Ascension Island                             |
| STHL 1ZZ | Saint Helena                                 |
| TDCU 1ZZ | Tristan da Cunha                             |
| BBND 1ZZ | British Indian Ocean Territory               |
| BIQQ 1ZZ | British Antarctic Territory                  |
| FIQQ 1ZZ | Falkland Islands                             |
| GX11 1AA | Gibraltar                                    |
| PCRN 1ZZ | Pitcairn Islands                             |
| SIQQ 1ZZ | South Georgia and the South Sandwich Islands |
| TKCA 1ZZ | Turks and Caicos Islands                     |
+----------+----------------------------------------------+
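One way to accommodate special cases like these in a format check is to consult a small allow-list before falling back to a general pattern. The sketch below is an illustration under stated assumptions: the allow-list holds only the codes named in this answer, the general pattern is a simplified shape check chosen for the example, and the name `has_plausible_format` is hypothetical.

```python
import re

# Special codes mentioned in this answer; a real list would need maintaining.
SPECIAL_POSTCODES = {
    'GIR 0AA', 'SAN TA1', 'AI-2640', 'ASCN 1ZZ', 'STHL 1ZZ', 'TDCU 1ZZ',
    'BBND 1ZZ', 'BIQQ 1ZZ', 'FIQQ 1ZZ', 'GX11 1AA', 'PCRN 1ZZ', 'SIQQ 1ZZ',
    'TKCA 1ZZ',
}

# Assumed, simplified shape: 1-2 letters, a digit, an optional digit or letter,
# an optional space, then a digit and two letters.
GENERAL_FORMAT = re.compile(r'[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}')


def has_plausible_format(raw):
    """Hypothetical helper: allow-list first, then the general shape."""
    # Upper-case and collapse any run of whitespace to a single space.
    candidate = ' '.join(raw.upper().split())
    if candidate in SPECIAL_POSTCODES:
        return True
    return GENERAL_FORMAT.fullmatch(candidate) is not None


for code in ['gir 0aa', 'SAN TA1', 'ASCN 1ZZ', 'WC2H 7LT', 'WC2H']:
    print(f'{code!r}: {has_plausible_format(code)}')
```

This still checks format only; it says nothing about whether a given postcode is currently in use.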
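Next, you have to take into account that the UK "exported" its postcode system to many places in the world. Anything that validates a "UK" postcode will also validate the postcodes of a number of other countries.
If you want to validate a UK postcode, the safest way to do it is to use a look-up of current postcodes. There are a number of options: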
- Ordnance Survey releases Code-Point Open under an open data licence. It'll be very slightly behind the times but it's free. This will (probably, I can't remember) not include Northern Irish data, as the Ordnance Survey has no remit there. Mapping in Northern Ireland is conducted by the Ordnance Survey of Northern Ireland, and they have their own separate, paid-for Pointer product. You could use this and append the few that aren't covered fairly easily.
- Royal Mail releases the Postcode Address File (PAF). This includes BFPO, which I'm not sure Code-Point Open does. It's updated regularly but costs money (and they can be downright mean about it sometimes). PAF includes the full address rather than just postcodes and comes with its own Programmers Guide. The Open Data User Group (ODUG) is currently lobbying to have PAF released for free.
- Lastly, there's AddressBase. This is a collaboration between Ordnance Survey, Local Authorities, Royal Mail and a matching company to create a definitive directory of all information about all UK addresses (they've been fairly successful as well). It's paid-for, but if you're working with a Local Authority, government department, or government service it's free for them to use. There's a lot more information than just postcodes included.
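Whichever data source is used, the look-up itself can be as simple as a set membership test on a normalised key. The sketch below assumes the postcodes have already been exported to plain text with one postcode per line. That layout, the stand-in sample data, and the names `normalise`, `load_postcodes` and `postcode_exists` are assumptions for illustration; they do not reflect the actual file format of Code-Point Open, PAF, or AddressBase.

```python
import io


def normalise(postcode):
    """Upper-case and remove all whitespace so 'se5 0eg' and 'SE50EG' compare equal."""
    return ''.join(postcode.split()).upper()


def load_postcodes(lines):
    """Build a set of normalised postcodes from an iterable of text lines."""
    return {normalise(line) for line in lines if line.strip()}


def postcode_exists(postcode, known):
    """True if the postcode is present in the loaded set."""
    return normalise(postcode) in known


# Stand-in data; a real deployment would read the licensed data set instead.
sample_source = io.StringIO('SE5 0EG\nWC2H 7LT\nCW3 9SS\n')
known_postcodes = load_postcodes(sample_source)

print(postcode_exists('se50eg', known_postcodes))
print(postcode_exists('ZZ99 9ZZ', known_postcodes))
```

Because the data sets are updated over time, the set would need to be reloaded whenever a new release is imported.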
I wanted a simple regex, where it's fine to allow too much but not to reject a valid postcode. I went with this (the input is a stripped/trimmed string):
/^([a-z0-9]\s*){5,8}$/i
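This allows the shortest possible postcodes like "L1 8JQ" as well as the longest ones like "OL14 5ET".
Because it allows up to 8 characters, it will also allow incorrect 8-character postcodes if there is no space: "OL145ETX". But again, this is a simplistic regex, for when that's good enough.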
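For reference, the same permissive check might look like this in Python. The pattern is the one above with the `/i` flag expressed as `re.IGNORECASE`; the function name `is_plausible_postcode` is made up for this sketch.

```python
import re

# 5 to 8 alphanumeric characters, each optionally followed by whitespace.
PERMISSIVE = re.compile(r'^([a-z0-9]\s*){5,8}$', re.IGNORECASE)


def is_plausible_postcode(raw):
    """Hypothetical helper: trim the input, then apply the permissive pattern."""
    return PERMISSIVE.match(raw.strip()) is not None


for candidate in ['L1 8JQ', 'OL14 5ET', 'OL145ETX', 'WC2H']:
    print(f'{candidate!r}: {is_plausible_postcode(candidate)}')
```

As the answer says, this deliberately errs on the side of accepting too much, so "OL145ETX" passes while the four-character "WC2H" does not.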