给定字符串“ThisStringHasNoSpacesButItDoesHaveCapitals”,什么是在大写字母之前添加空格的最好方法。所以结尾字符串是"This string Has No space But It Does Have大写"

下面是我使用正则表达式的尝试

System.Text.RegularExpressions.Regex.Replace(value, "[A-Z]", " $0")

当前回答

以下是我的解决方案,基于Binary Worriers的建议和Richard Priddys的评论,但也考虑到空白可能存在于提供的字符串中,所以它不会在现有空白旁边添加空白。

public string AddSpacesBeforeUpperCase(string nonSpacedString)
    {
        if (string.IsNullOrEmpty(nonSpacedString))
            return string.Empty;

        StringBuilder newText = new StringBuilder(nonSpacedString.Length * 2);
        newText.Append(nonSpacedString[0]);

        for (int i = 1; i < nonSpacedString.Length; i++)
        {
            char currentChar = nonSpacedString[i];

            // If it is whitespace, we do not need to add another next to it
            if(char.IsWhiteSpace(currentChar))
            {
                continue;
            }

            char previousChar = nonSpacedString[i - 1];
            char nextChar = i < nonSpacedString.Length - 1 ? nonSpacedString[i + 1] : nonSpacedString[i];

            if (char.IsUpper(currentChar) && !char.IsWhiteSpace(nextChar) 
                && !(char.IsUpper(previousChar) && char.IsUpper(nextChar)))
            {
                newText.Append(' ');
            }
            else if (i < nonSpacedString.Length)
            {
                if (char.IsUpper(currentChar) && !char.IsWhiteSpace(nextChar) && !char.IsUpper(nextChar))
                {
                    newText.Append(' ');
                }
            }

            newText.Append(currentChar);
        }

        return newText.ToString();
    }

其他回答

这个问题有点老了,但现在在Nuget上有一个很好的库,它可以做到这一点,以及许多其他转换到人类可读的文本。

在GitHub或Nuget上检查Humanizer。

例子

"PascalCaseInputStringIsTurnedIntoSentence".Humanize() => "Pascal case input string is turned into sentence"
"Underscored_input_string_is_turned_into_sentence".Humanize() => "Underscored input string is turned into sentence"
"Underscored_input_String_is_turned_INTO_sentence".Humanize() => "Underscored input String is turned INTO sentence"

// acronyms are left intact
"HTML".Humanize() => "HTML"
    private string GetProperName(string Header)
    {
        if (Header.ToCharArray().Where(c => Char.IsUpper(c)).Count() == 1)
        {
            return Header;
        }
        else
        {
            string ReturnHeader = Header[0].ToString();
            for(int i=1; i<Header.Length;i++)
            {
                if (char.IsLower(Header[i-1]) && char.IsUpper(Header[i]))
                {
                    ReturnHeader += " " + Header[i].ToString();
                }
                else
                {
                    ReturnHeader += Header[i].ToString();
                }
            }

            return ReturnHeader;
        }

        return Header;
    }

灵感来自@MartinBrown, 两行简单的正则表达式,它将解析您的名字,包括字符串中的任何地方的无同义词。

public string ResolveName(string name)
{
   var tmpDisplay = Regex.Replace(name, "([^A-Z ])([A-Z])", "$1 $2");
   return Regex.Replace(tmpDisplay, "([A-Z]+)([A-Z][^A-Z$])", "$1 $2").Trim();
}

这是我的:

private string SplitCamelCase(string s) 
{ 
    Regex upperCaseRegex = new Regex(@"[A-Z]{1}[a-z]*"); 
    MatchCollection matches = upperCaseRegex.Matches(s); 
    List<string> words = new List<string>(); 
    foreach (Match match in matches) 
    { 
        words.Add(match.Value); 
    } 
    return String.Join(" ", words.ToArray()); 
}

这里有一个更彻底的解决方案,它没有在单词前面放空格:

注意:我使用了多个regex(不简洁,但它也可以处理首字母缩略词和单字母单词)

Dim s As String = "ThisStringHasNoSpacesButItDoesHaveCapitals"
s = System.Text.RegularExpressions.Regex.Replace(s, "([a-z])([A-Z](?=[A-Z])[a-z]*)", "$1 $2")
s = System.Text.RegularExpressions.Regex.Replace(s, "([A-Z])([A-Z][a-z])", "$1 $2")
s = System.Text.RegularExpressions.Regex.Replace(s, "([a-z])([A-Z][a-z])", "$1 $2")
s = System.Text.RegularExpressions.Regex.Replace(s, "([a-z])([A-Z][a-z])", "$1 $2") // repeat a second time

In:

"ThisStringHasNoSpacesButItDoesHaveCapitals"
"IAmNotAGoat"
"LOLThatsHilarious!"
"ThisIsASMSMessage"

Out:

"This String Has No Spaces But It Does Have Capitals"
"I Am Not A Goat"
"LOL Thats Hilarious!"
"This Is ASMS Message" // (Difficult to handle single letter words when they are next to acronyms.)