Is there a better way to replace strings?

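I'm surprised that Replace doesn't accept a char array or a string array. I suppose I could write my own extension, but I'm curious whether there is a better built-in way to do the following. Note that the last Replace takes strings, not chars.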

myString.Replace(';', '\n').Replace(',', '\n').Replace('\r', '\n').Replace('\t', '\n').Replace(' ', '\n').Replace("\n\n", "\n");

You can use a regular-expression replace.

s/[;,\t\r ]|[\n]{2}/\n/g

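The s/ at the beginning means a search. The characters between [ and ] are the characters to search for (in any order). The second / separates the search text from the replacement text.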

In plain English, this reads:

“Search for ; or , or \t or \r or (space) or exactly two consecutive \n, and replace it with \n”

In C#, you could do the following (after importing System.Text.RegularExpressions):

Regex pattern = new Regex("[;,\t\r ]|[\n]{2}");
// Strings are immutable, so the result has to be assigned back.
myString = pattern.Replace(myString, "\n");
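
One thing worth checking with this pattern is how it treats adjacent separators. The sketch below compares it with a variant that matches a whole run of separators at once. The class and method names are made up for illustration, and the second pattern is an assumption of mine, not something from the answer above.

```csharp
using System.Text.RegularExpressions;

public static class SeparatorRegexDemo
{
    // The pattern from the answer: each separator character, or one "\n\n" pair,
    // is replaced by a single "\n". Newlines produced by the replacement itself
    // are not rescanned, so ", " still ends up as two newlines.
    public static string SinglePass(string input) =>
        Regex.Replace(input, "[;,\t\r ]|[\n]{2}", "\n");

    // Assumed variant: any run of separators (including "\n") is one match,
    // so every run collapses to exactly one "\n".
    public static string CollapseRuns(string input) =>
        Regex.Replace(input, "[;,\t\r \n]+", "\n");
}
```

For the input "a;b, c\n\nd", SinglePass gives "a\nb\n\nc\nd" while CollapseRuns gives "a\nb\nc\nd". Which one you want depends on whether empty entries matter to you.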

Use Regex.Replace, like this:

  string input = "This is   text with   far  too   much   " + 
                 "whitespace.";
  string pattern = "[;,]";
  string replacement = "\n";
  Regex rgx = new Regex(pattern);
  string result = rgx.Replace(input, replacement);

More information is available in the MSDN documentation for Regex.Replace.


If you are feeling particularly clever and don't want to use Regex:

char[] separators = new char[]{' ',';',',','\r','\t','\n'};

string s = "this;is,\ra\t\n\n\ntest";
string[] temp = s.Split(separators, StringSplitOptions.RemoveEmptyEntries);
s = String.Join("\n", temp);

You could also wrap this in an extension method.

Edit: Or just wait 2 minutes and I'll end up writing it anyway :)

public static class ExtensionMethods
{
   public static string Replace(this string s, char[] separators, string newVal)
   {
       string[] temp;

       temp = s.Split(separators, StringSplitOptions.RemoveEmptyEntries);
       return String.Join( newVal, temp );
   }
}

And voila...

char[] separators = new char[]{' ',';',',','\r','\t','\n'};
string s = "this;is,\ra\t\n\n\ntest";

s = s.Replace(separators, "\n");
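
Note that the Split/Join approach is not a character-for-character replacement: with RemoveEmptyEntries it also collapses runs of separators and drops leading and trailing ones. A minimal sketch of the difference, using hypothetical helper names that are not part of the answer:

```csharp
using System;

public static class SplitJoinDemo
{
    private static readonly char[] Separators = { ' ', ';', ',', '\r', '\t', '\n' };

    // Split on every separator, drop empty pieces, rejoin with "\n".
    public static string ViaSplitJoin(string s) =>
        string.Join("\n", s.Split(Separators, StringSplitOptions.RemoveEmptyEntries));

    // Plain per-character replacement, for comparison: the length never changes.
    public static string ViaReplace(string s)
    {
        foreach (char c in Separators)
            s = s.Replace(c, '\n');
        return s;
    }
}
```

For ";a,,b;" the first method returns "a\nb" and the second returns "\na\n\nb\n". That matches the intent of the question (collapsing double newlines), but it is worth knowing before reusing the extension method for other replacement strings.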

You could use LINQ's Aggregate function:

string s = "the\nquick\tbrown\rdog,jumped;over the lazy fox.";
char[] chars = new char[] { ' ', ';', ',', '\r', '\t', '\n' };
string snew = chars.Aggregate(s, (c1, c2) => c1.Replace(c2, '\n'));

Here is the extension method:

public static string ReplaceAll(this string seed, char[] chars, char replacementCharacter)
{
    return chars.Aggregate(seed, (str, cItem) => str.Replace(cItem, replacementCharacter));
}

Extension method usage example:

string snew = s.ReplaceAll(chars, '\n');

This is the shortest way:

myString = Regex.Replace(myString, @"[;,\t\r ]|[\n]{2}", "\n");

Ohhh, the performance horror! The answer is a bit outdated, but still...

using System;
using System.Text;

public static class StringUtils
{
    #region Private members

    [ThreadStatic]
    private static StringBuilder m_ReplaceSB;

    private static StringBuilder GetReplaceSB(int capacity)
    {
        var result = m_ReplaceSB;

        if (null == result)
        {
            result = new StringBuilder(capacity);
            m_ReplaceSB = result;
        }
        else
        {
            result.Clear();
            result.EnsureCapacity(capacity);
        }

        return result;
    }

    #endregion


    public static string ReplaceAny(this string s, char replaceWith, params char[] chars)
    {
        if (null == chars)
            return s;

        if (null == s)
            return null;

        StringBuilder sb = null;

        for (int i = 0, count = s.Length; i < count; i++)
        {
            var temp = s[i];
            var replace = false;

            for (int j = 0, cc = chars.Length; j < cc; j++)
                if (temp == chars[j])
                {
                    if (null == sb)
                    {
                        sb = GetReplaceSB(count);
                        if (i > 0)
                            sb.Append(s, 0, i);
                    }

                    replace = true;
                    break;
                }

            if (replace)
                sb.Append(replaceWith);
            else
                if (null != sb)
                    sb.Append(temp);
        }

        return null == sb ? s : sb.ToString();
    }
}

This may not be the best solution in terms of performance, but it works.

var str = "filename:with&bad$separators.txt";
char[] charArray = new char[] { '#', '%', '&', '{', '}', '\\', '<', '>', '*', '?', '/', ' ', '$', '!', '\'', '"', ':', '@' };
foreach (var singleChar in charArray)
{
   str = str.Replace(singleChar, '_');
}

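Strings are just immutable char arrays.

You just need to make them mutable:

either by using StringBuilder, or by going into the unsafe world and playing with pointers (dangerous, though),

and try to iterate over the array of characters as few times as possible. Note the HashSet here: it avoids scanning the whole list of characters to replace for every character in the loop. If you need an even faster lookup, you can replace the HashSet with an optimized char lookup (based on an array[256]).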

Example with StringBuilder

public static void MultiReplace(this StringBuilder builder, 
    char[] toReplace, 
    char replacement)
{
    HashSet<char> set = new HashSet<char>(toReplace);
    for (int i = 0; i < builder.Length; ++i)
    {
        var currentCharacter = builder[i];
        if (set.Contains(currentCharacter))
        {
            builder[i] = replacement;
        }
    }
}

Edit: optimized version (only valid for ASCII)

public static void MultiReplace(this StringBuilder builder, 
    char[] toReplace,
    char replacement)
{
    var set = new bool[256];
    foreach (var charToReplace in toReplace)
    {
        set[charToReplace] = true;
    }
    for (int i = 0; i < builder.Length; ++i)
    {
        var currentCharacter = builder[i];
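        // Chars outside the 256-entry table are never replaced and must not index into it.
        if (currentCharacter < set.Length && set[currentCharacter])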
        {
            builder[i] = replacement;
        }
    }
}

Then you just use it like this:

var builder = new StringBuilder("my bad,url&slugs");
builder.MultiReplace(new []{' ', '&', ','}, '-');
var result = builder.ToString();

You can also simply write these string extension methods and put them somewhere in your solution:

using System.Text;

public static class StringExtensions
{
    public static string ReplaceAll(this string original, string toBeReplaced, string newValue)
    {
        if (string.IsNullOrEmpty(original) || string.IsNullOrEmpty(toBeReplaced)) return original;
        if (newValue == null) newValue = string.Empty;
        StringBuilder sb = new StringBuilder();
        foreach (char ch in original)
        {
            if (toBeReplaced.IndexOf(ch) < 0) sb.Append(ch);
            else sb.Append(newValue);
        }
        return sb.ToString();
    }

    public static string ReplaceAll(this string original, string[] toBeReplaced, string newValue)
    {
        if (string.IsNullOrEmpty(original) || toBeReplaced == null || toBeReplaced.Length <= 0) return original;
        if (newValue == null) newValue = string.Empty;
        foreach (string str in toBeReplaced)
            if (!string.IsNullOrEmpty(str))
                original = original.Replace(str, newValue);
        return original;
    }
}

Call them like this:

"ABCDE".ReplaceAll("ACE", "xy");

xyBxyDxy

And this:

"ABCDEF".ReplaceAll(new string[] { "AB", "DE", "EF" }, "xy");

xyCxyF


string ToBeReplaceCharacters = @"~()@#$%&+,'""<>|;\/*?";
string fileName = "filename;with<bad:separators?";

foreach (var RepChar in ToBeReplaceCharacters)
{
    fileName = fileName.Replace(RepChar.ToString(), "");
}

A .NET Core version for replacing a defined set of string chars with a specific char. It leverages the recently introduced Span type and the string.Create method.

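The idea is to prepare a replacement array, so that no actual comparison is needed for each char of the string. The replacement process therefore resembles the way a state machine works. To avoid initializing every item of the replacement array, we store oldChar ^ newChar (XOR'ed) values in it, which has the following benefits: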

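If a char does not change, ch ^ ch = 0, so there is no need to initialize the items for unchanged chars. The final char is found by XOR'ing ch ^ repl[ch]: for an unchanged char, ch ^ 0 = ch; for a replaced char, ch ^ (ch ^ newChar) = newChar.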

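So the only requirement is that the replacement array is zeroed when it is initialized. We'll use ArrayPool<char> to avoid an allocation each time the ReplaceAll method is called. And, to ensure the arrays are zeroed without having to call the Array.Clear method, we'll maintain a pool dedicated to the ReplaceAll method. Before returning the replacement array to the pool, we clear it (only the exact items that were set).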

using System;
using System.Buffers;

public static class StringExtensions
{
    private static readonly ArrayPool<char> _replacementPool = ArrayPool<char>.Create();

    public static string ReplaceAll(this string str, char newChar, params char[] oldChars)
    {
        // If nothing to do, return the original string.
        if (string.IsNullOrEmpty(str) ||
            oldChars is null ||
            oldChars.Length == 0)
        {
            return str;
        }

        // If only one character needs to be replaced,
        // use the more efficient `string.Replace`.
        if (oldChars.Length == 1)
        {
            return str.Replace(oldChars[0], newChar);
        }

        // Get a replacement array from the pool.
        var replacements = _replacementPool.Rent(char.MaxValue + 1);

        try
        {
            // Initialize the replacement array in the way that
            // all elements represent `oldChar ^ newChar`.
            foreach (var oldCh in oldChars)
            {
                replacements[oldCh] = (char)(newChar ^ oldCh);
            }

            // Create a string with replaced characters.
            return string.Create(str.Length, (str, replacements), (dst, args) =>
            {
                var repl = args.replacements;

                foreach (var ch in args.str)
                {
                    dst[0] = (char)(repl[ch] ^ ch);
                    dst = dst.Slice(1);
                }
            });
        }
        finally
        {
            // Clear the replacement array.
            foreach (var oldCh in oldChars)
            {
                replacements[oldCh] = char.MinValue;
            }

            // Return the replacement array back to the pool.
            _replacementPool.Return(replacements);
        }
    }
}
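
The XOR table can be checked in isolation. The following is a minimal sketch that leaves out the pooling and string.Create: it allocates a fresh table on every call, so it only illustrates the lookup logic and is not a replacement for the version above. The class and method names are hypothetical.

```csharp
using System;

public static class XorTableDemo
{
    public static string ReplaceWithXorTable(string str, char newChar, params char[] oldChars)
    {
        // A freshly allocated array is zeroed, so untouched entries mean "no change".
        var repl = new char[char.MaxValue + 1];

        // Store oldChar ^ newChar for every char that should be replaced.
        foreach (var oldCh in oldChars)
        {
            repl[oldCh] = (char)(oldCh ^ newChar);
        }

        var result = new char[str.Length];
        for (int i = 0; i < str.Length; i++)
        {
            // ch ^ 0 = ch (unchanged); ch ^ (ch ^ newChar) = newChar (replaced).
            result[i] = (char)(str[i] ^ repl[str[i]]);
        }

        return new string(result);
    }
}
```

For example, XorTableDemo.ReplaceWithXorTable("a-b_c", ' ', '-', '_') returns "a b c". There is no branch in the inner loop, which is the point of the trick.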

I know this question is very old, but I want to offer two options that are more efficient:

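First, the extension method posted by Paul Walls is good, but it can be made more efficient by using the StringBuilder class, which is similar to the string data type but is made specifically for situations where a string value is changed more than once. Here is a version of the extension method that I made using StringBuilder: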

public static string ReplaceChars(this string s, char[] separators, char newVal)
{
    StringBuilder sb = new StringBuilder(s);
    foreach (var c in separators) { sb.Replace(c, newVal); }
    return sb.ToString();
}

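I ran this operation 100,000 times: using StringBuilder took 73 ms, compared to 81 ms using string. So the difference is usually negligible, unless you are running many operations or using a huge string.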

Second, here is a one-line loop you can use:

foreach (char c in separators) { s = s.Replace(c, '\n'); }

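Personally, I think this is the best option. It is very efficient and doesn't require writing an extension method. In my testing, this ran the 100,000 iterations in 63 ms, making it the most efficient. Here is an example in context: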

string s = "this;is,\ra\t\n\n\ntest";
char[] separators = new char[] { ' ', ';', ',', '\r', '\t', '\n' };
foreach (char c in separators) { s = s.Replace(c, '\n'); }

Credit for the first two lines of this example goes to Paul Walls.
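
If you want to repeat this kind of comparison yourself, a rough Stopwatch harness along these lines is enough. It is only a sketch: the names are hypothetical, it is not the code that produced the numbers above, and the timings depend heavily on the machine, the runtime and the input.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ReplaceTiming
{
    // The one-line loop, wrapped in a method.
    public static string WithString(string s, char[] separators, char newVal)
    {
        foreach (char c in separators) { s = s.Replace(c, newVal); }
        return s;
    }

    // The StringBuilder version.
    public static string WithStringBuilder(string s, char[] separators, char newVal)
    {
        var sb = new StringBuilder(s);
        foreach (char c in separators) { sb.Replace(c, newVal); }
        return sb.ToString();
    }

    // Runs the action the given number of times and returns the elapsed milliseconds.
    public static long Time(Action action, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) { action(); }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Call ReplaceTiming.Time(() => ReplaceTiming.WithString(s, separators, '\n'), 100000) and the same for WithStringBuilder, ideally after a warm-up run so that JIT compilation is not counted.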


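I also fiddled around with this problem and found that most of the solutions here are very slow. The fastest one was actually the LINQ + Aggregate method posted by dodgy_coder.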

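But I thought that it might also be quite heavy on memory allocations, depending on how many old characters there are. So I came up with this: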

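The idea here is to cache a replacement map of the old characters for the current thread, to keep allocations down. Beyond that, it just works on a character array copy of the input, which is then returned as a string again. The character array is modified as little as possible.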

[ThreadStatic]
private static bool[] replaceMap;
public static string Replace(this string input, char[] oldChars, char newChar)
{
    if (input == null) throw new ArgumentNullException(nameof(input));
    if (oldChars == null) throw new ArgumentNullException(nameof(oldChars));
    if (oldChars.Length == 1) return input.Replace(oldChars[0], newChar);
    if (oldChars.Length == 0) return input;

    replaceMap = replaceMap ?? new bool[char.MaxValue + 1];
    foreach (var oldChar in oldChars)
    {
        replaceMap[oldChar] = true;
    }

    try
    {
        var count = input.Length;
        var output = input.ToCharArray();
        for (var i = 0; i < count; i++)
        {
            if (replaceMap[input[i]])
            {
                output[i] = newChar;
            }
        }

        return new string(output);
    }
    finally
    {
        foreach (var oldChar in oldChars)
        {
            replaceMap[oldChar] = false;
        }
    }
}

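For me, this is at most two allocations for the actual input string. For some reason, StringBuilder turned out much slower for me. And it is 2 times faster than the LINQ variant.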


Without "Replace" (LINQ only):

    string myString = ";,\r\t \n\n=1;;2,,3\r\r4\t\t5  6\n\n\n\n7=";
    char NoRepeat = '\n';
    string ByeBye = ";,\r\t ";
    string myResult = myString.ToCharArray().Where(t => !"STOP-OUTSIDER".Contains(t))
                 .Select(t => "" + ( ByeBye.Contains(t) ? '\n' : t))
                  .Aggregate((all, next) => (
                      next == "" + NoRepeat && all.Substring(all.Length - 1) == "" + NoRepeat
                      ? all : all  + next ) );

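After building my own solution and looking at the solutions used here, I settled on an answer that doesn't use complex code and is generally efficient for most parameters.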

1. Cover base cases where other methods are more appropriate. If there are no chars to replace, return the original string. If there is only one, just use the Replace method.
2. Use a StringBuilder and initialize its capacity to the length of the original string. After all, the new string being built will have the same length as the original string if only chars are being replaced. This ensures that only one memory allocation is used for the new string.
3. The number of chars to replace could be small or large, and that affects performance. Large collections are better served by hash sets, while smaller collections are not. This is a near-perfect use case for HybridDictionary, which switches to a hash-based lookup once the collection gets too large. We don't care about the value stored in the dictionary, so I just set it to "true".
4. Having different methods for StringBuilder versus a plain string prevents unnecessary memory allocation. If it's just a string, don't instantiate a StringBuilder until the base cases have been checked. If it's already a StringBuilder, perform the replacements and return the StringBuilder itself (as other StringBuilder methods like Append do).
5. I put the replacement char first and the chars to check at the end. This way, I can leverage the params keyword to easily pass additional chars. However, you don't have to do this if you prefer the other order.

using System.Collections.Specialized;
using System.Text;

namespace Test.Extensions
{
    public static class StringExtensions
    {
        public static string ReplaceAll(this string str, char replacementCharacter, params char[] chars)
        {
            if (chars.Length == 0)
                return str;

            if (chars.Length == 1)
                return str.Replace(chars[0], replacementCharacter);

            StringBuilder sb = new StringBuilder(str.Length);

            var searcher = new HybridDictionary(chars.Length);
            for (int i = 0; i < chars.Length; i++)
                searcher[chars[i]] = true;

            foreach (var c in str)
            {
                if (searcher.Contains(c))
                    sb.Append(replacementCharacter);
                else
                    sb.Append(c);
            }

            return sb.ToString();
        }

        public static StringBuilder ReplaceAll(this StringBuilder sb, char replacementCharacter, params char[] chars)
        {
            if (chars.Length == 0)
                return sb;

            if (chars.Length == 1)
                return sb.Replace(chars[0], replacementCharacter);

            var searcher = new HybridDictionary(chars.Length);
            for (int i = 0; i < chars.Length; i++)
                searcher[chars[i]] = true;

            for (int i = 0; i < sb.Length; i++)
            {
                var val = sb[i];
                if (searcher.Contains(val))
                    sb[i] = replacementCharacter;
            }

            return sb;
        }
    }
}