我需要计算字符串中某个字符出现的次数。

例如,假设我的字符串包含:

var mainStr = "str1,str2,str3,str4";

我想求出逗号的个数,也就是3个字符。以及按逗号分隔后的单个字符串的计数,也就是4。

我还需要验证每个字符串,即str1或str2或str3或str4不应该超过,比如说,15个字符。


当前回答

更新:这可能是简单的,但它不是最快的。参见下面的基准测试。


令人惊讶的是,13年了,这个答案还没有出现。从直觉上看,它应该是最快的:

const s = "The quick brown fox jumps over the lazy dog.";
const oCount = s.length - s.replaceAll('o', '').length;

如果字符串中只有两种字符,那么这样仍然更快:


const s = "001101001";
const oneCount = s.replaceAll('0', '').length;

基准

const { performance } = require('node:perf_hooks');

const ITERATIONS = 10000000;
const TEST_STRING = "The quick brown fox jumps over the lazy dog.";

console.log(ITERATIONS, "iterations");

let sum = 0; // make sure compiler doesn't optimize code out
let start = performance.now();
for (let i = 0; i < ITERATIONS; ++i) {
  sum += TEST_STRING.length - TEST_STRING.replaceAll('o', '').length;
}
let end = performance.now();
console.log("  replaceAll duration", end - start, `(sum ${sum})`);

sum = 0;
start = performance.now();
for (let i = 0; i < ITERATIONS; ++i) {
  sum += TEST_STRING.split('o').length - 1
}
end = performance.now();
console.log("  split duration", end - start, `(sum ${sum})`);
10000 iterations
  replaceAll duration 2.6167500019073486 (sum 40000)
  split duration 2.0777920186519623 (sum 40000)
100000 iterations
  replaceAll duration 17.563208997249603 (sum 400000)
  split duration 8.087624996900558 (sum 400000)
1000000 iterations
  replaceAll duration 128.71587499976158 (sum 4000000)
  split duration 64.15841698646545 (sum 4000000)
10000000 iterations
  replaceAll duration 1223.3415840268135 (sum 40000000)
  split duration 629.1629169881344 (sum 40000000)

其他回答

更新:这可能是简单的,但它不是最快的。参见下面的基准测试。


令人惊讶的是,13年了,这个答案还没有出现。从直觉上看,它应该是最快的:

const s = "The quick brown fox jumps over the lazy dog.";
const oCount = s.length - s.replaceAll('o', '').length;

如果字符串中只有两种字符,那么这样仍然更快:


const s = "001101001";
const oneCount = s.replaceAll('0', '').length;

基准

const { performance } = require('node:perf_hooks');

const ITERATIONS = 10000000;
const TEST_STRING = "The quick brown fox jumps over the lazy dog.";

console.log(ITERATIONS, "iterations");

let sum = 0; // make sure compiler doesn't optimize code out
let start = performance.now();
for (let i = 0; i < ITERATIONS; ++i) {
  sum += TEST_STRING.length - TEST_STRING.replaceAll('o', '').length;
}
let end = performance.now();
console.log("  replaceAll duration", end - start, `(sum ${sum})`);

sum = 0;
start = performance.now();
for (let i = 0; i < ITERATIONS; ++i) {
  sum += TEST_STRING.split('o').length - 1
}
end = performance.now();
console.log("  split duration", end - start, `(sum ${sum})`);
10000 iterations
  replaceAll duration 2.6167500019073486 (sum 40000)
  split duration 2.0777920186519623 (sum 40000)
100000 iterations
  replaceAll duration 17.563208997249603 (sum 400000)
  split duration 8.087624996900558 (sum 400000)
1000000 iterations
  replaceAll duration 128.71587499976158 (sum 4000000)
  split duration 64.15841698646545 (sum 4000000)
10000000 iterations
  replaceAll duration 1223.3415840268135 (sum 40000000)
  split duration 629.1629169881344 (sum 40000000)

至少有五种方法。最好的选项,也应该是最快的(由于本机RegEx引擎)被放在顶部。

方法1

("this is foo bar".match(/o/g)||[]).length;
// returns 2

方法2

"this is foo bar".split("o").length - 1;
// returns 2

不建议拆分,因为它是资源饥渴的。它为每个匹配分配新的“Array”实例。不要通过FileReader尝试>100MB文件。你可以观察确切的资源使用使用Chrome的分析器选项。

方法3

    var stringsearch = "o"
       ,str = "this is foo bar";
    for(var count=-1,index=-2; index != -1; count++,index=str.indexOf(stringsearch,index+1) );
// returns 2

方法4

搜索单个字符

    var stringsearch = "o"
       ,str = "this is foo bar";
    for(var i=count=0; i<str.length; count+=+(stringsearch===str[i++]));
     // returns 2

方法5

元素映射和过滤。不建议这样做,因为它的整体资源预分配,而不是使用python的“生成器”:

    var str = "this is foo bar"
    str.split('').map( function(e,i){ if(e === 'o') return i;} )
                 .filter(Boolean)
    //>[9, 10]
    [9, 10].length
    // returns 2

分享: 我做了这个要点,目前有8种方法的字符计数,所以我们可以直接汇集和分享我们的想法-只是为了好玩,也许一些有趣的基准:)

我正在做一个需要子字符串计数器的小项目。搜索错误的短语没有提供给我任何结果,然而在编写我自己的实现后,我偶然发现了这个问题。不管怎样,这是我的方法,它可能比这里的大多数慢,但可能对某些人有帮助:

function count_letters() {
var counter = 0;

for (var i = 0; i < input.length; i++) {
    var index_of_sub = input.indexOf(input_letter, i);

    if (index_of_sub > -1) {
        counter++;
        i = index_of_sub;
    }
}

http://jsfiddle.net/5ZzHt/1/

请让我知道,如果你发现这个实现失败或不遵循一些标准!:)

更新 你可能想要替换:

    for (var i = 0; i < input.length; i++) {

:

for (var i = 0, input_length = input.length; i < input_length; i++) {

上面讨论的内容很有趣: http://www.erichynds.com/blog/javascript-length-property-is-a-stored-value

Split与RegExp的性能

var i = 0; var split_start = new Date().getTime(); while (i < 30000) { "1234,453,123,324".split(",").length -1; i++; } var split_end = new Date().getTime(); var split_time = split_end - split_start; i= 0; var reg_start = new Date().getTime(); while (i < 30000) { ("1234,453,123,324".match(/,/g) || []).length; i++; } var reg_end = new Date().getTime(); var reg_time = reg_end - reg_start; alert ('Split Execution time: ' + split_time + "\n" + 'RegExp Execution time: ' + reg_time + "\n");

有:

function character_count(string, char, ptr = 0, count = 0) {
    while (ptr = string.indexOf(char, ptr) + 1) {count ++}
    return count
}

也适用于整数!