Posted by Nick Gammon (Forum Administrator), Australia
A more generic splitting iterator
The iterator below is more general than the "getlines" one presented above. It will split a string at any delimiter, including a Lua pattern (Lua's flavour of regular expression). It is similar in design to the earlier one, but it stores the delimiter in an upvalue (rather than hard-coding a \n character), and it has to remember the length of the last delimiter matched, as that might be more than one character long.
-- split iterator - iterates over a string and returns one item per delimited field
function split (str, delim, plain)
  local pos = 0 -- where we are in the string
  local len = 1 -- length of previous match

  -- the for loop calls this for every iteration
  -- returning nil terminates the loop
  local function iterator (s)

    if not pos then
      return nil
    end -- end of string, exit loop

    local oldpos = pos + len -- step past previous delimiter
    local e -- end of match
    pos, e = string.find (s, delim, oldpos, plain) -- find next delimiter

    if not pos then -- no more delimiters, return rest of string
      return string.sub (s, oldpos)
    end -- no delimiter

    len = e - pos + 1 -- length of delimiter
    return string.sub (s, oldpos, pos - 1)

  end -- iterator

  return iterator, str
end -- split
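To make the mechanism a bit more concrete, here is a rough sketch of what the generic "for" loop does with the two values that split returns: it keeps calling the iterator function, passing the string back in each time, until it gets nil. The split function is repeated so the snippet runs on its own, and the helper name split_to_table is just made up for illustration, it is not part of the code above.

```lua
-- split, repeated from above so this snippet runs standalone
function split (str, delim, plain)
  local pos = 0 -- where we are in the string
  local len = 1 -- length of previous match
  local function iterator (s)
    if not pos then
      return nil
    end -- end of string
    local oldpos = pos + len -- step past previous delimiter
    local e
    pos, e = string.find (s, delim, oldpos, plain)
    if not pos then -- no more delimiters
      return string.sub (s, oldpos)
    end
    len = e - pos + 1
    return string.sub (s, oldpos, pos - 1)
  end -- iterator
  return iterator, str
end -- split

-- hypothetical helper: gather every piece into a table
function split_to_table (str, delim, plain)
  local t = {}
  for piece in split (str, delim, plain) do
    t [#t + 1] = piece
  end -- for
  return t
end -- split_to_table

-- roughly what the "for" loop does behind the scenes
local iter, state = split ("a,b,c", ",")
print (iter (state)) -- a
print (iter (state)) -- b
print (iter (state)) -- c
print (iter (state)) -- nil (the loop would stop here)
```

Since pos and len are upvalues created afresh on every call to split, two loops running over different strings should not interfere with each other.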
Examples
The first example just uses a simple comma as the delimiter.
test = "every,good,boy,deserves,fruit"
for l in split (test, ",") do
  print ('"' .. l .. '"')
end -- for
Output
"every"
"good"
"boy"
"deserves"
"fruit"
Now we will use a regular expression to split on a comma or a semicolon.
test = "every,good;boy,deserves;fruit"
for l in split (test, "[,;]") do
  print ('"' .. l .. '"')
end -- for
(Same output as before).
If we want to match on something that is a special character in a Lua pattern (like %) then we either have to put a % in front of it (so the delimiter becomes "%%"), or pass true as the "plain" argument, to specify matching on plain text.
test = "every%good%boy%deserves%fruit"
for l in split (test, "%", true) do
  print ('"' .. l .. '"')
end -- for
(Same output as before).
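As a small sketch of the two approaches, the snippet below calls string.find directly (the same call the iterator uses internally) with an escaped pattern and with the "plain" flag. The escape_pattern helper is a hypothetical convenience of my own, not something from the code above; it puts a % in front of every punctuation character, which Lua treats as matching that character literally.

```lua
local s = "every%good"

print (string.find (s, "%%"))         -- 6 6 (escaped: matches a literal %)
print (string.find (s, "%", 1, true)) -- 6 6 (plain text search, no escaping needed)

-- a lone % is not a valid pattern, so this call fails
print (pcall (string.find, s, "%"))   -- false, followed by an error message

-- hypothetical helper: escape all punctuation so text is matched literally
function escape_pattern (text)
  -- extra parentheses drop gsub's second result (the substitution count)
  return (string.gsub (text, "%p", "%%%0"))
end -- escape_pattern

print (escape_pattern ("a.b%c"))      -- a%.b%%c
```

The "plain" flag is probably the simpler choice when the delimiter comes from somewhere you do not control, whereas escaping lets you mix literal characters with real pattern items in the same delimiter.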
Finally we can use a regular expression that might match more than one character (in this case, a run of one or more whitespace characters) to effectively break up a string into words separated by whitespace.
test = "every good boy deserves fruit"
for l in split (test, "%s+") do
  print ('"' .. l .. '"')
end -- for
(Same output as before).
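A few edge cases are worth trying, as far as I can tell from reading the code: a delimiter at the very start or end of the string, or two delimiters in a row, gives an empty string for the missing item rather than skipping it. The sketch below repeats split so it runs on its own; the collect helper is a made-up name used only to gather the results.

```lua
-- split, repeated from above so this snippet runs standalone
function split (str, delim, plain)
  local pos = 0 -- where we are in the string
  local len = 1 -- length of previous match
  local function iterator (s)
    if not pos then
      return nil
    end -- end of string
    local oldpos = pos + len -- step past previous delimiter
    local e
    pos, e = string.find (s, delim, oldpos, plain)
    if not pos then -- no more delimiters
      return string.sub (s, oldpos)
    end
    len = e - pos + 1
    return string.sub (s, oldpos, pos - 1)
  end -- iterator
  return iterator, str
end -- split

-- hypothetical helper: gather the pieces into a table
function collect (str, delim, plain)
  local t = {}
  for l in split (str, delim, plain) do
    t [#t + 1] = l
  end -- for
  return t
end -- collect

-- adjacent and trailing delimiters give empty items
for l in split ("a,,b,", ",") do
  print ('"' .. l .. '"')
end -- for
-- prints "a", "", "b", "" (one per line)

-- leading and trailing whitespace also give empty items
print (table.concat (collect ("  every good ", "%s+"), "|")) -- |every|good|
```

One thing to be careful of: a pattern that can match the empty string (for example "%s*") would give a match of length zero, so the position would never advance and the loop would not finish. Patterns that always consume at least one character, like "%s+", avoid that.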
- Nick Gammon
www.gammon.com.au, www.mushclient.com