我想从字符串中删除所有特殊字符。允许输入A-Z(大写或小写)、数字(0-9)、下划线(_)或点符号(.)。
我有以下,它是有效的,但我怀疑(我知道!)它不是很有效:
public static string RemoveSpecialCharacters(string str)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i < str.Length; i++)
{
if ((str[i] >= '0' && str[i] <= '9')
|| (str[i] >= 'A' && str[i] <= 'z'
|| (str[i] == '.' || str[i] == '_')))
{
sb.Append(str[i]);
}
}
return sb.ToString();
}
最有效的方法是什么?正则表达式是什么样子的,它与普通字符串操作相比如何?
要清洗的字符串相当短,长度通常在10到30个字符之间。
I had to do something similar for work, but in my case I had to filter all that is not a letter, number or whitespace (but you could easily modify it to your needs).
The filtering is done client-side in JavaScript, but for security reasons I am also doing the filtering server-side. Since I can expect most of the strings to be clean, I would like to avoid copying the string unless I really need to. This let my to the implementation below, which should perform better for both clean and dirty strings.
public static string EnsureOnlyLetterDigitOrWhiteSpace(string input)
{
StringBuilder cleanedInput = null;
for (var i = 0; i < input.Length; ++i)
{
var currentChar = input[i];
var charIsValid = char.IsLetterOrDigit(currentChar) || char.IsWhiteSpace(currentChar);
if (charIsValid)
{
if(cleanedInput != null)
cleanedInput.Append(currentChar);
}
else
{
if (cleanedInput != null) continue;
cleanedInput = new StringBuilder();
if (i > 0)
cleanedInput.Append(input.Substring(0, i));
}
}
return cleanedInput == null ? input : cleanedInput.ToString();
}
HashSet是O(1)
不确定它是否比现有的比较快
private static HashSet<char> ValidChars = new HashSet<char>() { 'a', 'b', 'c', 'A', 'B', 'C', '1', '2', '3', '_' };
public static string RemoveSpecialCharacters(string str)
{
StringBuilder sb = new StringBuilder(str.Length / 2);
foreach (char c in str)
{
if (ValidChars.Contains(c)) sb.Append(c);
}
return sb.ToString();
}
我测试了,这并不比公认的答案快。
如果你需要一组可配置的字符,我会把它留在这里,这将是一个很好的解决方案。