如何从字符串中删除所有非字母的字符?
非字母数字呢?
这必须是一个自定义函数还是也有更通用的解决方案?
如何从字符串中删除所有非字母的字符?
非字母数字呢?
这必须是一个自定义函数还是也有更通用的解决方案?
当前回答
我把它放在调用PatIndex的两个地方。
PatIndex('%[^A-Za-z0-9]%', @Temp)
为上面的自定义函数RemoveNonAlphaCharacters并重命名为RemoveNonAlphaNumericCharacters
其他回答
我把它放在调用PatIndex的两个地方。
PatIndex('%[^A-Za-z0-9]%', @Temp)
为上面的自定义函数RemoveNonAlphaCharacters并重命名为RemoveNonAlphaNumericCharacters
如果您像我一样,不能仅向生产数据添加函数,但仍然想执行这种过滤,那么这里有一个纯SQL解决方案,使用PIVOT表将过滤后的部分重新组合在一起。
注意:我硬编码表高达40个字符,如果你有更长的字符串要过滤,你将不得不添加更多。
SET CONCAT_NULL_YIELDS_NULL OFF;
with
ToBeScrubbed
as (
select 1 as id, '*SOME 222@ !@* #* BOGUS !@*&! DATA' as ColumnToScrub
),
Scrubbed as (
select
P.Number as ValueOrder,
isnull ( substring ( t.ColumnToScrub , number , 1 ) , '' ) as ScrubbedValue,
t.id
from
ToBeScrubbed t
left join master..spt_values P
on P.number between 1 and len(t.ColumnToScrub)
and type ='P'
where
PatIndex('%[^a-z]%', substring(t.ColumnToScrub,P.number,1) ) = 0
)
SELECT
id,
[1]+ [2]+ [3]+ [4]+ [5]+ [6]+ [7]+ [8] +[9] +[10]
+ [11]+ [12]+ [13]+ [14]+ [15]+ [16]+ [17]+ [18] +[19] +[20]
+ [21]+ [22]+ [23]+ [24]+ [25]+ [26]+ [27]+ [28] +[29] +[30]
+ [31]+ [32]+ [33]+ [34]+ [35]+ [36]+ [37]+ [38] +[39] +[40] as ScrubbedData
FROM (
select
*
from
Scrubbed
)
src
PIVOT (
MAX(ScrubbedValue) FOR ValueOrder IN (
[1], [2], [3], [4], [5], [6], [7], [8], [9], [10],
[11], [12], [13], [14], [15], [16], [17], [18], [19], [20],
[21], [22], [23], [24], [25], [26], [27], [28], [29], [30],
[31], [32], [33], [34], [35], [36], [37], [38], [39], [40]
)
) pvt
从性能角度来看,我会使用内联函数:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[udf_RemoveNumericCharsFromString]
(
@List NVARCHAR(4000)
)
RETURNS TABLE
AS RETURN
WITH GetNums AS (
SELECT TOP(ISNULL(DATALENGTH(@List), 0))
n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM
(VALUES (0),(0),(0),(0)) d (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) g (n)
)
SELECT StrOut = ''+
(SELECT Chr
FROM GetNums
CROSS APPLY (SELECT SUBSTRING(@List , n,1)) X(Chr)
WHERE Chr LIKE '%[^0-9]%'
ORDER BY N
FOR XML PATH (''),TYPE).value('.','NVARCHAR(MAX)')
/*How to Use
SELECT StrOut FROM dbo.udf_RemoveNumericCharsFromString ('vv45--9gut')
Result: vv--gut
*/
首先创建一个函数
CREATE FUNCTION [dbo].[GetNumericonly]
(@strAlphaNumeric VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE @intAlpha INT
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
BEGIN
WHILE @intAlpha > 0
BEGIN
SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
END
END
RETURN ISNULL(@strAlphaNumeric,0)
END
现在把这个函数叫做
select [dbo].[GetNumericonly]('Abhi12shek23jaiswal')
它的结果是
1223
Here's a solution that doesn't require creating a function or listing all instances of characters to replace. It uses a recursive WITH statement in combination with a PATINDEX to find unwanted chars. It will replace all unwanted chars in a column - up to 100 unique bad characters contained in any given string. (E.G. "ABC123DEF234" would contain 4 bad characters 1, 2, 3 and 4) The 100 limit is the maximum number of recursions allowed in a WITH statement, but this doesn't impose a limit on the number of rows to process, which is only limited by the memory available. If you don't want DISTINCT results, you can remove the two options from the code.
-- Create some test data:
SELECT * INTO #testData
FROM (VALUES ('ABC DEF,K.l(p)'),('123H,J,234'),('ABCD EFG')) as t(TXT)
-- Actual query:
-- Remove non-alpha chars: '%[^A-Z]%'
-- Remove non-alphanumeric chars: '%[^A-Z0-9]%'
DECLARE @BadCharacterPattern VARCHAR(250) = '%[^A-Z]%';
WITH recurMain as (
SELECT DISTINCT CAST(TXT AS VARCHAR(250)) AS TXT, PATINDEX(@BadCharacterPattern, TXT) AS BadCharIndex
FROM #testData
UNION ALL
SELECT CAST(TXT AS VARCHAR(250)) AS TXT, PATINDEX(@BadCharacterPattern, TXT) AS BadCharIndex
FROM (
SELECT
CASE WHEN BadCharIndex > 0
THEN REPLACE(TXT, SUBSTRING(TXT, BadCharIndex, 1), '')
ELSE TXT
END AS TXT
FROM recurMain
WHERE BadCharIndex > 0
) badCharFinder
)
SELECT DISTINCT TXT
FROM recurMain
WHERE BadCharIndex = 0;