我需要做一个简单的字符串分割,但似乎没有一个函数,我测试的手动方式似乎不起作用。我该怎么做呢?
当前回答
我使用上面的例子来创建我自己的函数。但对我来说,缺失的部分是自动逃脱魔法角色。
以下是我的观点:
function split(text, delim)
-- returns an array of fields based on text and delimiter (one character only)
local result = {}
local magic = "().%+-*?[]^$"
if delim == nil then
delim = "%s"
elseif string.find(delim, magic, 1, true) then
-- escape magic
delim = "%"..delim
end
local pattern = "[^"..delim.."]+"
for w in string.gmatch(text, pattern) do
table.insert(result, w)
end
return result
end
其他回答
对于那些从“在Lua中编程”这本书的练习10.1中学习过的人来说,似乎很清楚我们不能使用后面书中解释的概念(迭代器),而且函数应该接受多个字符分隔符。
split()是一个让模式匹配不需要的内容(split)并在空字符串上返回一个空表的技巧。plainSplit()的返回更像其他语言中的split。
magic = "([%%%.%(%)%+%*%?%[%]%^%$])"
function split(str, sep, plain)
if plain then sep = string.gsub(sep, magic, "%%%1") end
local N = '\255'
str = N..str..N
str = string.gsub(str, sep, N..N)
local result = {}
for word in string.gmatch(str, N.."(.-)"..N) do
if word ~= "" then
table.insert(result, word)
end
end
return result
end
function plainSplit(str, sep)
sep = string.gsub(sep, magic, "%%%1")
local result = {}
local start = 0
repeat
start = start + 1
local from, to = string.find(str, sep, start)
from = from and from-1
local word = string.sub(str, start, from, true)
table.insert(result, word)
start = to
until start == nil
return result
end
function tableToString(t)
local ret = "{"
for _, word in ipairs(t) do
ret = ret .. '"' .. word .. '", '
end
ret = string.sub(ret, 1, -3)
ret = ret .. "}"
return #ret > 1 and ret or "{}"
end
function runSplit(func, title, str, sep, plain)
print("\n" .. title)
print("str: '"..str.."'")
print("sep: '"..sep.."'")
local t = func(str, sep, plain)
print("-- t = " .. tableToString(t))
end
print("\n\n\n=== Pattern split ===")
runSplit(split, "Exercice 10.1", "a whole new world", " ")
runSplit(split, "With trailing seperator", " a whole new world ", " ")
runSplit(split, "A word seperator", "a whole new world", " whole ")
runSplit(split, "Pattern seperator", "a1whole2new3world", "%d")
runSplit(split, "Magic characters as plain seperator", "a$.%whole$.%new$.%world", "$.%", true)
runSplit(split, "Control seperator", "a\0whole\1new\2world", "%c")
runSplit(split, "ISO Time", "2020-07-10T15:00:00.000", "[T:%-%.]")
runSplit(split, " === [Fails] with \\255 ===", "a\255whole\0new\0world", "\0", true)
runSplit(split, "How does your function handle empty string?", "", " ")
print("\n\n\n=== Plain split ===")
runSplit(plainSplit, "Exercice 10.1", "a whole new world", " ")
runSplit(plainSplit, "With trailing seperator", " a whole new world ", " ")
runSplit(plainSplit, "A word seperator", "a whole new world", " whole ")
runSplit(plainSplit, "Magic characters as plain seperator", "a$.%whole$.%new$.%world", "$.%")
runSplit(plainSplit, "How does your function handle empty string?", "", " ")
输出
=== Pattern split ===
Exercice 10.1
str: 'a whole new world'
sep: ' '
-- t = {"a", "whole", "new", "world"}
With trailing seperator
str: ' a whole new world '
sep: ' '
-- t = {"a", "whole", "new", "world"}
A word seperator
str: 'a whole new world'
sep: ' whole '
-- t = {"a", "new world"}
Pattern seperator
str: 'a1whole2new3world'
sep: '%d'
-- t = {"a", "whole", "new", "world"}
Magic characters as plain seperator
str: 'a$.%whole$.%new$.%world'
sep: '$.%'
-- t = {"a", "whole", "new", "world"}
Control seperator
str: 'awholenewworld'
sep: '%c'
-- t = {"a", "whole", "new", "world"}
ISO Time
str: '2020-07-10T15:00:00.000'
sep: '[T:%-%.]'
-- t = {"2020", "07", "10", "15", "00", "00", "000"}
=== [Fails] with \255 ===
str: 'a�wholenewworld'
sep: ''
-- t = {"a"}
How does your function handle empty string?
str: ''
sep: ' '
-- t = {}
=== Plain split ===
Exercice 10.1
str: 'a whole new world'
sep: ' '
-- t = {"a", "whole", "new", "world"}
With trailing seperator
str: ' a whole new world '
sep: ' '
-- t = {"", "", "a", "", "whole", "", "", "new", "world", "", ""}
A word seperator
str: 'a whole new world'
sep: ' whole '
-- t = {"a", "new world"}
Magic characters as plain seperator
str: 'a$.%whole$.%new$.%world'
sep: '$.%'
-- t = {"a", "whole", "new", "world"}
How does your function handle empty string?
str: ''
sep: ' '
-- t = {""}
因为剥猫皮的方法不止一种,下面是我的方法:
代码:
#!/usr/bin/env lua
local content = [=[
Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.
]=]
local function split(str, sep)
local result = {}
local regex = ("([^%s]+)"):format(sep)
for each in str:gmatch(regex) do
table.insert(result, each)
end
return result
end
local lines = split(content, "\n")
for _,line in ipairs(lines) do
print(line)
end
输出: 圣洁圣洁的人, sed做eiusmod时间incididunt ut并dolore麦格纳 aliqua。Ut enim ad minivenim, quis nostrud practice 这一切都是徒劳的结果。
解释:
gmatch函数作为一个迭代器,它获取所有与regex匹配的字符串。正则表达式接受所有字符,直到找到分隔符为止。
一种别人没有的方式
function str_split(str, sep)
if sep == nil then
sep = '%s'
end
local res = {}
local func = function(w)
table.insert(res, w)
end
string.gsub(str, '[^'..sep..']+', func)
return res
end
我喜欢这个简短的解决方案
function split(s, delimiter)
result = {};
for match in (s..delimiter):gmatch("(.-)"..delimiter) do
table.insert(result, match);
end
return result;
end
我使用上面的例子来创建我自己的函数。但对我来说,缺失的部分是自动逃脱魔法角色。
以下是我的观点:
function split(text, delim)
-- returns an array of fields based on text and delimiter (one character only)
local result = {}
local magic = "().%+-*?[]^$"
if delim == nil then
delim = "%s"
elseif string.find(delim, magic, 1, true) then
-- escape magic
delim = "%"..delim
end
local pattern = "[^"..delim.."]+"
for w in string.gmatch(text, pattern) do
table.insert(result, w)
end
return result
end
推荐文章
- 我如何检查如果一个变量是JavaScript字符串?
- 如何显示有两个小数点后的浮点数?
- 在Lua中拆分字符串?
- 如何在Python中按字母顺序排序字符串中的字母
- python: SyntaxError: EOL扫描字符串文字
- PHP子字符串提取。获取第一个'/'之前的字符串或整个字符串
- 双引号vs单引号
- 如何知道一个字符串开始/结束在jQuery特定的字符串?
- 在Swift中根据字符串计算UILabel的大小
- 创建一个可变长度的字符串,用重复字符填充
- 字符串比较:InvariantCultureIgnoreCase vs OrdinalIgnoreCase?
- 在大写字母前加空格
- 如何改变日期时间格式在熊猫
- 为什么字符串在Java中是不可变的?
- 在JavaScript中转换为字符串