我正在做一些事情,我意识到我想要在一个字符串中找到多少个/s,然后我突然想到,有几种方法可以做到这一点,但不能决定哪种是最好的(或最简单的)。

目前我想说的是:

string source = "/once/upon/a/time/";
int count = source.Length - source.Replace("/", "").Length;

但我一点都不喜欢,有人愿意吗?

我并不想为此挖掘出正则表达式,对吧?

我知道我的字符串将包含我要搜索的项,所以你可以假设…

当然对于长度为> 1的字符串,

string haystack = "/once/upon/a/time";
string needle = "/";
int needleCount = ( haystack.Length - haystack.Replace(needle,"").Length ) / needle.Length;

当前回答

我想我会把我的扩展方法扔到戒指(更多信息见评论)。我没有做过任何正式的基准测试,但我认为在大多数情况下必须非常快。

EDIT: OK - so this SO question got me to wondering how the performance of our current implementation would stack up against some of the solutions presented here. I decided to do a little bench marking and found that our solution was very much in line with the performance of the solution provided by Richard Watson up until you are doing aggressive searching with large strings (100 Kb +), large substrings (32 Kb +) and many embedded repetitions (10K +). At that point our solution was around 2X to 4X slower. Given this and the fact that we really like the solution presented by Richard Watson, we have refactored our solution accordingly. I just wanted to make this available for anyone that might benefit from it.

我们最初的解决方案:

    /// <summary>
    /// Counts the number of occurrences of the specified substring within
    /// the current string.
    /// </summary>
    /// <param name="s">The current string.</param>
    /// <param name="substring">The substring we are searching for.</param>
    /// <param name="aggressiveSearch">Indicates whether or not the algorithm 
    /// should be aggressive in its search behavior (see Remarks). Default 
    /// behavior is non-aggressive.</param>
    /// <remarks>This algorithm has two search modes - aggressive and 
    /// non-aggressive. When in aggressive search mode (aggressiveSearch = 
    /// true), the algorithm will try to match at every possible starting 
    /// character index within the string. When false, all subsequent 
    /// character indexes within a substring match will not be evaluated. 
    /// For example, if the string was 'abbbc' and we were searching for 
    /// the substring 'bb', then aggressive search would find 2 matches 
    /// with starting indexes of 1 and 2. Non aggressive search would find 
    /// just 1 match with starting index at 1. After the match was made, 
    /// the non aggressive search would attempt to make it's next match 
    /// starting at index 3 instead of 2.</remarks>
    /// <returns>The count of occurrences of the substring within the string.</returns>
    public static int CountOccurrences(this string s, string substring, 
        bool aggressiveSearch = false)
    {
        // if s or substring is null or empty, substring cannot be found in s
        if (string.IsNullOrEmpty(s) || string.IsNullOrEmpty(substring))
            return 0;

        // if the length of substring is greater than the length of s,
        // substring cannot be found in s
        if (substring.Length > s.Length)
            return 0;

        var sChars = s.ToCharArray();
        var substringChars = substring.ToCharArray();
        var count = 0;
        var sCharsIndex = 0;

        // substring cannot start in s beyond following index
        var lastStartIndex = sChars.Length - substringChars.Length;

        while (sCharsIndex <= lastStartIndex)
        {
            if (sChars[sCharsIndex] == substringChars[0])
            {
                // potential match checking
                var match = true;
                var offset = 1;
                while (offset < substringChars.Length)
                {
                    if (sChars[sCharsIndex + offset] != substringChars[offset])
                    {
                        match = false;
                        break;
                    }
                    offset++;
                }
                if (match)
                {
                    count++;
                    // if aggressive, just advance to next char in s, otherwise, 
                    // skip past the match just found in s
                    sCharsIndex += aggressiveSearch ? 1 : substringChars.Length;
                }
                else
                {
                    // no match found, just move to next char in s
                    sCharsIndex++;
                }
            }
            else
            {
                // no match at current index, move along
                sCharsIndex++;
            }
        }

        return count;
    }

这是我们修改后的解决方案:

    /// <summary>
    /// Counts the number of occurrences of the specified substring within
    /// the current string.
    /// </summary>
    /// <param name="s">The current string.</param>
    /// <param name="substring">The substring we are searching for.</param>
    /// <param name="aggressiveSearch">Indicates whether or not the algorithm 
    /// should be aggressive in its search behavior (see Remarks). Default 
    /// behavior is non-aggressive.</param>
    /// <remarks>This algorithm has two search modes - aggressive and 
    /// non-aggressive. When in aggressive search mode (aggressiveSearch = 
    /// true), the algorithm will try to match at every possible starting 
    /// character index within the string. When false, all subsequent 
    /// character indexes within a substring match will not be evaluated. 
    /// For example, if the string was 'abbbc' and we were searching for 
    /// the substring 'bb', then aggressive search would find 2 matches 
    /// with starting indexes of 1 and 2. Non aggressive search would find 
    /// just 1 match with starting index at 1. After the match was made, 
    /// the non aggressive search would attempt to make it's next match 
    /// starting at index 3 instead of 2.</remarks>
    /// <returns>The count of occurrences of the substring within the string.</returns>
    public static int CountOccurrences(this string s, string substring, 
        bool aggressiveSearch = false)
    {
        // if s or substring is null or empty, substring cannot be found in s
        if (string.IsNullOrEmpty(s) || string.IsNullOrEmpty(substring))
            return 0;

        // if the length of substring is greater than the length of s,
        // substring cannot be found in s
        if (substring.Length > s.Length)
            return 0;

        int count = 0, n = 0;
        while ((n = s.IndexOf(substring, n, StringComparison.InvariantCulture)) != -1)
        {
            if (aggressiveSearch)
                n++;
            else
                n += substring.Length;
            count++;
        }

        return count;
    }

其他回答

从。net 7开始,我们就有了不需要分配(并且高度优化)的Regex api。计数尤其容易和有效。

    var input = "abcd abcabc ababc";
    var result = Regex.Count(input: input, pattern: "abc"); // 4

当匹配动态模式时,记得转义它们:

public static int CountOccurences(string input, string pattern)
{
    pattern = Regex.Escape(pattern); // Aww, no way to avoid heap allocations here

    var result = Regex.Count(input: input, pattern: pattern);
    return result;
}

而且,作为固定模式的额外奖励,. net 7引入了分析器,帮助将正则表达式字符串转换为源代码生成的代码。这不仅避免了regex的运行时编译开销,而且还提供了非常可读的代码,展示了它是如何实现的。事实上,该代码通常至少与手动编写的任何替代代码一样有效。

如果您的正则表达式调用是合格的,分析程序将给出提示。简单地选择“转换为'GeneratedRegexAttribute '”并享受结果:

[GeneratedRegex("abc")]
private static partial Regex MyRegex(); // Go To Definition to see the generated code

我做了一些研究,发现理查德·沃森的解决方案在大多数情况下是最快的。这是文章中每个解决方案的结果表(除了那些使用Regex的,因为它在解析“test{test”这样的字符串时抛出异常)

    Name      | Short/char |  Long/char | Short/short| Long/short |  Long/long |
    Inspite   |         134|        1853|          95|        1146|         671|
    LukeH_1   |         346|        4490|         N/A|         N/A|         N/A|
    LukeH_2   |         152|        1569|         197|        2425|        2171|
Bobwienholt   |         230|        3269|         N/A|         N/A|         N/A|
Richard Watson|          33|         298|         146|         737|         543|
StefanosKargas|         N/A|         N/A|         681|       11884|       12486|

可以看到,在短字符串(10-50个字符)中查找短子字符串(1-5个字符)的出现次数时,首选原算法。

同样,对于多字符子字符串,您应该使用以下代码(基于Richard Watson的解决方案)

int count = 0, n = 0;

if(substring != "")
{
    while ((n = source.IndexOf(substring, n, StringComparison.InvariantCulture)) != -1)
    {
        n += substring.Length;
        ++count;
    }
}

这两个都只适用于单字符搜索词…

countOccurences("the", "the answer is the answer");

int countOccurences(string needle, string haystack)
{
    return (haystack.Length - haystack.Replace(needle,"").Length) / needle.Length;
}

也许更长的针头会更好…

但肯定有更优雅的方式。:)

我想我会把我的扩展方法扔到戒指(更多信息见评论)。我没有做过任何正式的基准测试,但我认为在大多数情况下必须非常快。

EDIT: OK - so this SO question got me to wondering how the performance of our current implementation would stack up against some of the solutions presented here. I decided to do a little bench marking and found that our solution was very much in line with the performance of the solution provided by Richard Watson up until you are doing aggressive searching with large strings (100 Kb +), large substrings (32 Kb +) and many embedded repetitions (10K +). At that point our solution was around 2X to 4X slower. Given this and the fact that we really like the solution presented by Richard Watson, we have refactored our solution accordingly. I just wanted to make this available for anyone that might benefit from it.

我们最初的解决方案:

    /// <summary>
    /// Counts the number of occurrences of the specified substring within
    /// the current string.
    /// </summary>
    /// <param name="s">The current string.</param>
    /// <param name="substring">The substring we are searching for.</param>
    /// <param name="aggressiveSearch">Indicates whether or not the algorithm 
    /// should be aggressive in its search behavior (see Remarks). Default 
    /// behavior is non-aggressive.</param>
    /// <remarks>This algorithm has two search modes - aggressive and 
    /// non-aggressive. When in aggressive search mode (aggressiveSearch = 
    /// true), the algorithm will try to match at every possible starting 
    /// character index within the string. When false, all subsequent 
    /// character indexes within a substring match will not be evaluated. 
    /// For example, if the string was 'abbbc' and we were searching for 
    /// the substring 'bb', then aggressive search would find 2 matches 
    /// with starting indexes of 1 and 2. Non aggressive search would find 
    /// just 1 match with starting index at 1. After the match was made, 
    /// the non aggressive search would attempt to make it's next match 
    /// starting at index 3 instead of 2.</remarks>
    /// <returns>The count of occurrences of the substring within the string.</returns>
    public static int CountOccurrences(this string s, string substring, 
        bool aggressiveSearch = false)
    {
        // if s or substring is null or empty, substring cannot be found in s
        if (string.IsNullOrEmpty(s) || string.IsNullOrEmpty(substring))
            return 0;

        // if the length of substring is greater than the length of s,
        // substring cannot be found in s
        if (substring.Length > s.Length)
            return 0;

        var sChars = s.ToCharArray();
        var substringChars = substring.ToCharArray();
        var count = 0;
        var sCharsIndex = 0;

        // substring cannot start in s beyond following index
        var lastStartIndex = sChars.Length - substringChars.Length;

        while (sCharsIndex <= lastStartIndex)
        {
            if (sChars[sCharsIndex] == substringChars[0])
            {
                // potential match checking
                var match = true;
                var offset = 1;
                while (offset < substringChars.Length)
                {
                    if (sChars[sCharsIndex + offset] != substringChars[offset])
                    {
                        match = false;
                        break;
                    }
                    offset++;
                }
                if (match)
                {
                    count++;
                    // if aggressive, just advance to next char in s, otherwise, 
                    // skip past the match just found in s
                    sCharsIndex += aggressiveSearch ? 1 : substringChars.Length;
                }
                else
                {
                    // no match found, just move to next char in s
                    sCharsIndex++;
                }
            }
            else
            {
                // no match at current index, move along
                sCharsIndex++;
            }
        }

        return count;
    }

这是我们修改后的解决方案:

    /// <summary>
    /// Counts the number of occurrences of the specified substring within
    /// the current string.
    /// </summary>
    /// <param name="s">The current string.</param>
    /// <param name="substring">The substring we are searching for.</param>
    /// <param name="aggressiveSearch">Indicates whether or not the algorithm 
    /// should be aggressive in its search behavior (see Remarks). Default 
    /// behavior is non-aggressive.</param>
    /// <remarks>This algorithm has two search modes - aggressive and 
    /// non-aggressive. When in aggressive search mode (aggressiveSearch = 
    /// true), the algorithm will try to match at every possible starting 
    /// character index within the string. When false, all subsequent 
    /// character indexes within a substring match will not be evaluated. 
    /// For example, if the string was 'abbbc' and we were searching for 
    /// the substring 'bb', then aggressive search would find 2 matches 
    /// with starting indexes of 1 and 2. Non aggressive search would find 
    /// just 1 match with starting index at 1. After the match was made, 
    /// the non aggressive search would attempt to make it's next match 
    /// starting at index 3 instead of 2.</remarks>
    /// <returns>The count of occurrences of the substring within the string.</returns>
    public static int CountOccurrences(this string s, string substring, 
        bool aggressiveSearch = false)
    {
        // if s or substring is null or empty, substring cannot be found in s
        if (string.IsNullOrEmpty(s) || string.IsNullOrEmpty(substring))
            return 0;

        // if the length of substring is greater than the length of s,
        // substring cannot be found in s
        if (substring.Length > s.Length)
            return 0;

        int count = 0, n = 0;
        while ((n = s.IndexOf(substring, n, StringComparison.InvariantCulture)) != -1)
        {
            if (aggressiveSearch)
                n++;
            else
                n += substring.Length;
            count++;
        }

        return count;
    }
string s = "HOWLYH THIS ACTUALLY WORKSH WOWH";
int count = 0;
for (int i = 0; i < s.Length; i++)
   if (s[i] == 'H') count++;

它只是检查字符串中的每个字符,如果该字符是您正在搜索的字符,则添加一个来计数。