验证字符串是否是有效的电子邮件地址的最优雅的代码是什么?


当前回答

public static bool IsEmail(string strEmail)
{
    Regex rgxEmail = new Regex(@"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}" +
                               @"\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\" +
                               @".)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$");
    return rgxEmail.IsMatch(strEmail);
}

其他回答

我最终使用了这个正则表达式,因为它成功地验证了逗号、注释、Unicode字符和IP(v4)域地址。

有效地址为:

“@example。org (评论)test@example。org тест@example。org ტესტი@example。org test@[192.168 . 1章1段]

 public const string REGEX_EMAIL = @"^(((\([\w!#$%&'*+\/=?^_`{|}~-]*\))?[^<>()[\]\\.,;:\s@\""]+(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))(\([\w!#$%&'*+\/=?^_`{|}~-]*\))?@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$";

我根据维基百科的规则和样本地址创建了一个电子邮件地址验证程序。对于那些不介意多看一点代码的人,这里。说实话,我不知道电子邮件地址规范中有这么多疯狂的规则。我没有完全验证主机名或ipaddress,但它仍然通过了维基百科上的所有测试用例。

using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace EmailValidateUnitTests
{
    [TestClass]
    public class EmailValidationUnitTests
    {
        [TestMethod]
        public void TestEmailValidate()
        {
            // Positive Assertions
            Assert.IsTrue("prettyandsimple@example.com".IsValidEmailAddress());
            Assert.IsTrue("very.common@example.com".IsValidEmailAddress());
            Assert.IsTrue("disposable.style.email.with+symbol@example.com".IsValidEmailAddress());
            Assert.IsTrue("other.email-with-dash@example.com".IsValidEmailAddress());
            Assert.IsTrue("\"much.more unusual\"@example.com".IsValidEmailAddress());
            Assert.IsTrue("\"very.unusual.@.unusual.com\"@example.com".IsValidEmailAddress()); //"very.unusual.@.unusual.com"@example.com
            Assert.IsTrue("\"very.(),:;<>[]\\\".VERY.\\\"very@\\\\ \\\"very\\\".unusual\"@strange.example.com".IsValidEmailAddress()); //"very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"@strange.example.com
            Assert.IsTrue("admin@mailserver1".IsValidEmailAddress());
            Assert.IsTrue("#!$%&'*+-/=?^_`{}|~@example.org".IsValidEmailAddress());
            Assert.IsTrue("\"()<>[]:,;@\\\\\\\"!#$%&'*+-/=?^_`{}| ~.a\"@example.org".IsValidEmailAddress()); //"()<>[]:,;@\\\"!#$%&'*+-/=?^_`{}| ~.a"@example.org
            Assert.IsTrue("\" \"@example.org".IsValidEmailAddress()); //" "@example.org (space between the quotes)
            Assert.IsTrue("example@localhost".IsValidEmailAddress());
            Assert.IsTrue("example@s.solutions".IsValidEmailAddress());
            Assert.IsTrue("user@com".IsValidEmailAddress());
            Assert.IsTrue("user@localserver".IsValidEmailAddress());
            Assert.IsTrue("user@[IPv6:2001:db8::1]".IsValidEmailAddress());
            Assert.IsTrue("user@[192.168.2.1]".IsValidEmailAddress());
            Assert.IsTrue("(comment and stuff)joe@gmail.com".IsValidEmailAddress());
            Assert.IsTrue("joe(comment and stuff)@gmail.com".IsValidEmailAddress());
            Assert.IsTrue("joe@(comment and stuff)gmail.com".IsValidEmailAddress());
            Assert.IsTrue("joe@gmail.com(comment and stuff)".IsValidEmailAddress());

            // Failure Assertions
            Assert.IsFalse("joe(fail me)smith@gmail.com".IsValidEmailAddress());
            Assert.IsFalse("joesmith@gma(fail me)il.com".IsValidEmailAddress());
            Assert.IsFalse("joe@gmail.com(comment and stuff".IsValidEmailAddress());
            Assert.IsFalse("Abc.example.com".IsValidEmailAddress());
            Assert.IsFalse("A@b@c@example.com".IsValidEmailAddress());
            Assert.IsFalse("a\"b(c)d,e:f;g<h>i[j\\k]l@example.com".IsValidEmailAddress()); //a"b(c)d,e:f;g<h>i[j\k]l@example.com
            Assert.IsFalse("just\"not\"right@example.com".IsValidEmailAddress()); //just"not"right@example.com
            Assert.IsFalse("this is\"not\\allowed@example.com".IsValidEmailAddress()); //this is"not\allowed@example.com
            Assert.IsFalse("this\\ still\\\"not\\\\allowed@example.com".IsValidEmailAddress());//this\ still\"not\\allowed@example.com
            Assert.IsFalse("john..doe@example.com".IsValidEmailAddress());
            Assert.IsFalse("john.doe@example..com".IsValidEmailAddress());
            Assert.IsFalse(" joe@gmail.com".IsValidEmailAddress());
            Assert.IsFalse("joe@gmail.com ".IsValidEmailAddress());
        }
    }

    public static class ExtensionMethods
    {
        private const string ValidLocalPartChars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%&'*+-/=?^_`{|}~";
        private const string ValidQuotedLocalPartChars = "(),:;<>@[]. ";
        private const string ValidDomainPartChars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-:";

        private enum EmailParseMode
        {
            BeginLocal, Local, QuotedLocalEscape, QuotedLocal, QuotedLocalEnd, LocalSplit, LocalComment,
            At,
            Domain, DomainSplit, DomainComment, BracketedDomain, BracketedDomainEnd
        };

        public static bool IsValidEmailAddress(this string s)
        {
            bool valid = true;

            bool hasLocal = false, hasDomain = false;
            int commentStart = -1, commentEnd = -1;
            var mode = EmailParseMode.BeginLocal;
            for (int i = 0; i < s.Length; i++)
            {
                char c = s[i];
                if (mode == EmailParseMode.BeginLocal || mode == EmailParseMode.LocalSplit)
                {
                    if (c == '(') { mode = EmailParseMode.LocalComment; commentStart = i; commentEnd = -1; }
                    else if (c == '"') { mode = EmailParseMode.QuotedLocal; }
                    else if (ValidLocalPartChars.IndexOf(c) >= 0) { mode = EmailParseMode.Local; hasLocal = true; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.LocalComment)
                {
                    if (c == ')')
                    {
                        mode = EmailParseMode.Local; commentEnd = i;
                        // comments can only be at beginning and end of parts...
                        if (commentStart != 0 && ((commentEnd + 1) < s.Length) && s[commentEnd + 1] != '@') { valid = false; break; }
                    }
                }
                else if (mode == EmailParseMode.Local)
                {
                    if (c == '.') mode = EmailParseMode.LocalSplit;
                    else if (c == '@') mode = EmailParseMode.At;
                    else if (c == '(') { mode = EmailParseMode.LocalComment; commentStart = i; commentEnd = -1; }
                    else if (ValidLocalPartChars.IndexOf(c) >= 0) { hasLocal = true; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.QuotedLocal)
                {
                    if (c == '"') { mode = EmailParseMode.QuotedLocalEnd; }
                    else if (c == '\\') { mode = EmailParseMode.QuotedLocalEscape; }
                    else if (ValidLocalPartChars.IndexOf(c) >= 0 || ValidQuotedLocalPartChars.IndexOf(c) >= 0) { hasLocal = true; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.QuotedLocalEscape)
                {
                    if (c == '"' || c == '\\') { mode = EmailParseMode.QuotedLocal; hasLocal = true; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.QuotedLocalEnd)
                {
                    if (c == '.') { mode = EmailParseMode.LocalSplit; }
                    else if (c == '@') mode = EmailParseMode.At;
                    else if (c == '(') { mode = EmailParseMode.LocalComment; commentStart = i; commentEnd = -1; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.At)
                {
                    if (c == '[') { mode = EmailParseMode.BracketedDomain; }
                    else if (c == '(') { mode = EmailParseMode.DomainComment; commentStart = i; commentEnd = -1; }
                    else if (ValidDomainPartChars.IndexOf(c) >= 0) { mode = EmailParseMode.Domain; hasDomain = true; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.DomainComment)
                {
                    if (c == ')')
                    {
                        mode = EmailParseMode.Domain;
                        commentEnd = i;
                        // comments can only be at beginning and end of parts...
                        if ((commentEnd + 1) != s.Length && (commentStart > 0) && s[commentStart - 1] != '@') { valid = false; break; }
                    }
                }
                else if (mode == EmailParseMode.DomainSplit)
                {
                    if (ValidDomainPartChars.IndexOf(c) >= 0) { mode = EmailParseMode.Domain; hasDomain = true; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.Domain)
                {
                    if (c == '(') { mode = EmailParseMode.DomainComment; commentStart = i; commentEnd = -1; }
                    else if (c == '.') { mode = EmailParseMode.DomainSplit; }
                    else if (ValidDomainPartChars.IndexOf(c) >= 0) { hasDomain = true; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.BracketedDomain)
                {
                    if (c == ']') { mode = EmailParseMode.BracketedDomainEnd; }
                    else if (c == '.' || ValidDomainPartChars.IndexOf(c) >= 0) { hasDomain = true; }
                    else { valid = false; break; }
                }
                else if (mode == EmailParseMode.BracketedDomain)
                {
                    if (c == '(') { mode = EmailParseMode.DomainComment; commentStart = i; commentEnd = -1; }
                    else { valid = false; break; }
                }
            }
            bool unfinishedComment = (commentEnd == -1 && commentStart >= 0);

            return hasLocal && hasDomain && valid && !unfinishedComment;
        }
    }
}

一个简单的没有使用Regex(我不喜欢它的可读性差):

bool IsValidEmail(string email)
{
    string emailTrimed = email.Trim();

    if (!string.IsNullOrEmpty(emailTrimed))
    {
        bool hasWhitespace = emailTrimed.Contains(" ");

        int indexOfAtSign = emailTrimed.LastIndexOf('@');

        if (indexOfAtSign > 0 && !hasWhitespace)
        {
            string afterAtSign = emailTrimed.Substring(indexOfAtSign + 1);

            int indexOfDotAfterAtSign = afterAtSign.LastIndexOf('.');

            if (indexOfDotAfterAtSign > 0 && afterAtSign.Substring(indexOfDotAfterAtSign).Length > 1)
                return true;
        }
    }

    return false;
}

例子:

IsValidEmail(“@b.com”) // false IsValidEmail(“a@.com”) // false IsValidEmail(“a@bcom”) // false IsValidEmail(“a.b@com”) // false IsValidEmail(“a@b.”) // false IsValidEmail(“a b@c.com”) // false IsValidEmail(“a@b c.com”) // false IsValidEmail(“a@b.com”) // true IsValidEmail(“a@b.c.com”) // true IsValidEmail(“a+b@c.com”) // true IsValidEmail(“a@123.45.67.89”) // true

它意味着简单,因此它不处理罕见的情况,如电子邮件的括号域包含空格(通常是允许的),电子邮件的IPv6地址等。

基于@Cogwheel的回答,我想分享一个修改后的解决方案,适用于SSIS和“脚本组件”:

Place the "Script Component" into your Data Flow connect and then open it. In the section "Input Columns" set the field that contains the E-Mail Adresses to "ReadWrite" (in the example 'fieldName'). Switch back to the section "Script" and click on "Edit Script". Then you need to wait after the code opens. Place this code in the right method: public override void Input0_ProcessInputRow(Input0Buffer Row) { string email = Row.fieldName; try { System.Net.Mail.MailAddress addr = new System.Net.Mail.MailAddress(email); Row.fieldName= addr.Address.ToString(); } catch { Row.fieldName = "WRONGADDRESS"; } }

然后,您可以使用条件分割过滤掉所有无效记录或任何您想做的事情。

以前,我写了一个EmailAddressValidationAttribute,它应该正确地验证表单中几乎任何相对正常的电子邮件地址

local-part@domain

它是System.ComponentModel.DataAnnotations。ValidationAttribute,所以使用非常简单。

而且,由于挖掘所有rfc和勘误表,并组装所需的所有位来正确枚举所有规则……太乏味了!-我在回答c#电子邮件地址验证源代码的问题时发布了验证器的源代码。

我的验证器无论怎么想象都不是完美的,只是对于初学者来说,它没有任何内置的对发出客户端javascript验证的支持,尽管将其添加进来并不太难。从我上面的回答来看:

Here's the validation attribute I wrote. It validates pretty much every "raw" email address, that is those of the form local-part@domain. It doesn't support any of the other, more...creative constructs that the RFCs allow (this list is not comprehensive by any means): comments (e.g., jsmith@whizbang.com (work)) quoted strings (escaped text, to allow characters not allowed in an atom) domain literals (e.g. foo@[123.45.67.012]) bang-paths (aka source routing) angle addresses (e.g. John Smith <jsmith@whizbang.com>) folding whitespace double-byte characters in either local-part or domain (7-bit ASCII only). etc. It should accept almost any email address that can be expressed thusly foo.bar@bazbat.com without requiring the use of quotes ("), angle brackets ('<>') or square brackets ([]). No attempt is made to validate that the rightmost dns label in the domain is a valid TLD (top-level domain). That is because the list of TLDs is far larger now than the "big 6" (.com, .edu, .gov, .mil, .net, .org) plus 2-letter ISO country codes. ICANN actually updates the TLD list daily, though I suspect that the list doesn't actually change daily. Further, [ICANN just approved a big expansion of the generic TLD namespace][2]). And some email addresses don't have what you'd recognize as a TLD (did you know that postmaster@. is theoretically valid and mailable? Mail to that address should get delivered to the postmaster of the DNS root zone.) Extending the regular expression to support domain literals shouldn't be too difficult.