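I need to search a string and replace all occurrences of %FirstName% and %PolicyAmount% with values pulled from a database. The problem is that the capitalization of FirstName varies, which prevents me from using the String.Replace() method. I've seen web pages on the subject that suggest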

Regex.Replace(strInput, strToken, strReplaceWith, RegexOptions.IgnoreCase);

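However, for some reason, when I try to replace %PolicyAmount% with $0, the replacement never takes place. I assume it has something to do with the dollar sign being a reserved character in regex.

Is there another method I can use that doesn't involve sanitizing the input to deal with regex special characters?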


Current answer

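Inspired by cfeduke's answer, I made this function, which uses IndexOf to find the old value in the string and then replaces it with the new value. I used it in an SSIS script processing millions of rows, and the regex method was much slower than this.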

public static string ReplaceCaseInsensitive(this string str, string oldValue, string newValue)
{
    // an empty oldValue matches at every position and would loop forever
    if (string.IsNullOrEmpty(oldValue))
    {
        return str;
    }

    int prevPos = 0;
    string retval = str;
    // find the first occurrence of oldValue
    int pos = retval.IndexOf(oldValue, StringComparison.InvariantCultureIgnoreCase);

    while (pos > -1)
    {
        // remove oldValue from the string
        retval = retval.Remove(pos, oldValue.Length);

        // insert newValue in its place
        retval = retval.Insert(pos, newValue);

        // check if oldValue is found further down, starting after the inserted newValue
        prevPos = pos + newValue.Length;
        pos = retval.IndexOf(oldValue, prevPos, StringComparison.InvariantCultureIgnoreCase);
    }

    return retval;
}

Other answers

It seems like string.Replace should have an overload that takes a StringComparison argument. Since it doesn't, you could try something like this:

public static string ReplaceString(string str, string oldValue, string newValue, StringComparison comparison)
{
    StringBuilder sb = new StringBuilder();

    int previousIndex = 0;
    int index = str.IndexOf(oldValue, comparison);
    while (index != -1)
    {
        sb.Append(str.Substring(previousIndex, index - previousIndex));
        sb.Append(newValue);
        index += oldValue.Length;

        previousIndex = index;
        index = str.IndexOf(oldValue, index, comparison);
    }
    sb.Append(str.Substring(previousIndex));

    return sb.ToString();
}
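
As a side note to the answer above: newer runtimes (.NET Core 2.0 and later) do ship a `string.Replace(string, string, StringComparison)` overload, although .NET Framework does not. The sketch below assumes such a runtime; the class name `BuiltInReplaceDemo`, the method `FillTemplate`, and the choice of `OrdinalIgnoreCase` are illustrative assumptions, not part of the original answer.

```csharp
using System;

public static class BuiltInReplaceDemo
{
    // Hypothetical helper: fills both tokens from the question, ignoring case.
    // Assumes .NET Core 2.0+ where Replace accepts a StringComparison.
    public static string FillTemplate(string template, string firstName, string policyAmount)
    {
        return template
            .Replace("%FirstName%", firstName, StringComparison.OrdinalIgnoreCase)
            .Replace("%PolicyAmount%", policyAmount, StringComparison.OrdinalIgnoreCase);
    }
}
```

Because no regex is involved, a replacement value such as `$0` is inserted as plain text.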

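From MSDN: $0 - "Substitutes the last substring matched by group number number (decimal)."

In .NET regular expressions, group 0 is always the entire match. For a literal $ you need to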

string value = Regex.Replace("%PolicyAmount%", "%PolicyAmount%", @"$$0", RegexOptions.IgnoreCase);
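
To make the difference concrete, here is a small sketch (the class and method names are made up for illustration) contrasting the two replacement strings. With `$0` every match is replaced by itself, which is why it looks as if nothing happened; with `$$0` the literal text `$0` is inserted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DollarZeroDemo
{
    // "$0" is a substitution for the whole match, so each match is
    // replaced by itself and the input comes back unchanged.
    public static string WithSubstitution(string input)
    {
        return Regex.Replace(input, "%PolicyAmount%", "$0", RegexOptions.IgnoreCase);
    }

    // "$$" is an escaped dollar sign, so "$$0" inserts the literal text "$0".
    public static string WithLiteralDollar(string input)
    {
        return Regex.Replace(input, "%PolicyAmount%", "$$0", RegexOptions.IgnoreCase);
    }
}
```

For an input such as `"Owed: %policyamount%"`, the first method returns the input as-is, while the second returns `"Owed: $0"`.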

/// <summary>
/// A case-insensitive replace function.
/// </summary>
/// <param name="originalString">The string to examine. (Haystack)</param>
/// <param name="oldValue">The value to replace. (Needle)</param>
/// <param name="newValue">The new value to be inserted.</param>
/// <returns>A string</returns>
public static string CaseInsenstiveReplace(string originalString, string oldValue, string newValue)
{
    // Note: oldValue is interpreted as a regex pattern and newValue as a
    // replacement pattern; neither is escaped here.
    Regex regEx = new Regex(oldValue,
       RegexOptions.IgnoreCase | RegexOptions.Multiline);
    return regEx.Replace(originalString, newValue);
}

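This is a confusing group of answers, in part because the title of the question is actually much broader than the specific question being asked. After reading through them, I'm not sure any answer is a few edits away from assimilating all the good stuff here, so I figured I'd try to sum it up.

Here's an extension method that I think avoids the pitfalls mentioned here and provides the most broadly applicable solution.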

public static string ReplaceCaseInsensitiveFind(this string str, string findMe,
    string newValue)
{
    return Regex.Replace(str,
        Regex.Escape(findMe),
        Regex.Replace(newValue, "\\$[0-9]+", @"$$$0"),
        RegexOptions.IgnoreCase);
}

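So...

This is an extension method (@MarkRobinson). It doesn't try to skip Regex (@Helge); you really have to go byte by byte if you want to sniff strings like this outside of Regex. It passes @MichaelLiu's excellent test case, "".ReplaceCaseInsensitiveFind("oe", ""), though he may have had a slightly different behavior in mind.

Unfortunately, @HA's comment that you have to Escape all three isn't correct. The initial value and newValue don't need to be.

Note: You do, however, have to escape $s in the new value that you're inserting if they're part of what would appear to be a "captured value" marker. Thus the three dollar signs in the Regex.Replace inside the Regex.Replace [sic]. Without that, something like this breaks...

"This is HIS fork, hIs spoon, hissssssss knife.".ReplaceCaseInsensitiveFind("his", @"he$0r")

Here's the error: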

An unhandled exception of type 'System.ArgumentException' occurred in System.dll

Additional information: parsing "The\hisr\ is\ he\HISr\ fork,\ he\hIsr\ spoon,\ he\hisrsssssss\ knife\." - Unrecognized escape sequence \h.

Tell you what, I know folks that are comfortable with Regex feel like their use avoids errors, but I'm often still partial to byte sniffing strings (but only after having read Spolsky on encodings) to be absolutely sure you're getting what you intended for important use cases. Reminds me of Crockford on "insecure regular expressions" a little. Too often we write regexps that allow what we want (if we're lucky), but unintentionally allow more in (eg, Is $10 really a valid "capture value" string in my newValue regexp, above?) because we weren't thoughtful enough. Both methods have value, and both encourage different types of unintentional errors. It's often easy to underestimate complexity.

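The weird $ escaping (and the fact that Regex.Escape didn't escape captured value patterns like $0, as I would have expected for replacement values) drove me mad for a while. Programming Is Hard (c) 1842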
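
One way to sidestep the replacement-string escaping entirely, offered as a sketch rather than as part of the answer above, is to pass a MatchEvaluator to Regex.Replace. The evaluator's return value is inserted verbatim, so `$0` and friends are never parsed as substitutions. The names `LiteralReplaceExtensions` and `ReplaceCaseInsensitiveLiteral` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralReplaceExtensions
{
    // Hypothetical variant: findMe is escaped so it is matched literally,
    // and newValue is returned from a MatchEvaluator so it is inserted as-is.
    public static string ReplaceCaseInsensitiveLiteral(this string str, string findMe,
        string newValue)
    {
        return Regex.Replace(str,
            Regex.Escape(findMe),
            match => newValue,
            RegexOptions.IgnoreCase);
    }
}
```

With this variant, `"This is HIS fork".ReplaceCaseInsensitiveLiteral("his", "he$0r")` yields `"The$0r is he$0r fork"`, with the `$0` kept as plain text.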