Writing an iterator function in Lua
Posted by
Nick Gammon
Australia (23,133 posts) Bio
Forum Administrator |
Fri 12 May 2006 04:36 AM (UTC) Amended on Fri 12 May 2006 04:40 AM (UTC) by Nick Gammon
Lua lets you write your own iterators: functions that can be used in a "for" loop.
The example shown below demonstrates how you can write an iterator that will take a string and iterate over it a line at a time. This particular iterator maintains an internal state (using the local "state" table). Inside it are the string we are iterating over and a position number, which records how far through the string we are.
The function getlines returns two things - the iterator function to be called for each iteration (declared locally to it), and the state variable, with the current position initialised to zero.
The example at the end shows how it might be used.
-- getlines iterator - iterates over a string and returns one item per line
function getlines (s)
  -- the for loop calls this for every iteration
  -- returning nil terminates the loop
  local function iterator (state)
    if not state.pos then
      return nil
    end -- end of string, exit loop
    local oldpos = state.pos + 1 -- step past previous newline
    state.pos = string.find (state.s, "\n", oldpos) -- find next newline
    if not state.pos then -- no more newlines, return rest of string
      return string.sub (state.s, oldpos)
    end -- no newline
    return string.sub (state.s, oldpos, state.pos - 1)
  end -- iterator
  local state = { s = s, pos = 0 }
  return iterator, state
end -- getlines
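-- example (the "deserves " line ends with a space, which shows up in the output)
test = [[
every good
boy
deserves 
fruit]]
for l in getlines (test) do
  print ('"' .. l .. '"')
end -- for
Output:
"every good"
"boy"
"deserves "
"fruit"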
- Nick Gammon
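To make the mechanics concrete, here is a minimal sketch of roughly what the generic "for" does with the two values getlines returns: it calls the iterator with the state on each pass and stops at the first nil. The helper name collect_lines is made up for this illustration, and getlines is repeated so the snippet runs by itself.

```lua
-- getlines as in the post above, repeated so this sketch is self-contained
local function getlines (s)
  local function iterator (state)
    if not state.pos then
      return nil
    end -- end of string, exit loop
    local oldpos = state.pos + 1 -- step past previous newline
    state.pos = string.find (state.s, "\n", oldpos) -- find next newline
    if not state.pos then
      return string.sub (state.s, oldpos)
    end -- no newline
    return string.sub (state.s, oldpos, state.pos - 1)
  end -- iterator
  return iterator, { s = s, pos = 0 }
end -- getlines

-- collect_lines (hypothetical helper): drives the iterator by hand,
-- roughly the way "for l in getlines (text) do ... end" would
local function collect_lines (text)
  local iterator, state = getlines (text)
  local lines = {}
  while true do
    local l = iterator (state) -- the same state table is passed every time
    if l == nil then
      break -- nil terminates the loop
    end -- if finished
    lines [#lines + 1] = l
  end -- while
  return lines
end -- collect_lines

local lines = collect_lines ("every good\nboy\ndeserves \nfruit")
print (#lines) -- 4
```

The real "for" also passes the previous loop value as a second argument, which this iterator simply ignores.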
Reply #1 on Fri 12 May 2006 04:56 AM (UTC)
The previous example used a state variable as the iterator invariant. The iterator is passed the invariant as its first argument. Although we are changing the contents of the table, the table itself does not change, so this is OK.
The example below, however, dispenses with the state table and keeps the position in an upvalue (pos) instead. This is perhaps slightly neater:
-- getlines iterator - iterates over a string and returns one item per line
function getlines (str)
  local pos = 0
  -- the for loop calls this for every iteration
  -- returning nil terminates the loop
  local function iterator (s)
    if not pos then
      return nil
    end -- end of string, exit loop
    local oldpos = pos + 1 -- step past previous newline
    pos = string.find (s, "\n", oldpos) -- find next newline
    if not pos then -- no more newlines, return rest of string
      return string.sub (s, oldpos)
    end -- no newline
    return string.sub (s, oldpos, pos - 1)
  end -- iterator
  return iterator, str
end -- getlines
test = [[
every good
boy
deserves 
fruit]]
for l in getlines (test) do
  print ('"' .. l .. '"')
end -- for
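One consequence of keeping pos in an upvalue is that every call to getlines creates a fresh closure with its own position, so two iterators over different strings do not interfere with each other. A small sketch, with getlines repeated so it runs on its own; the names next_a and next_b are just for illustration.

```lua
-- getlines (upvalue version), repeated so this sketch is self-contained
local function getlines (str)
  local pos = 0
  local function iterator (s)
    if not pos then
      return nil
    end -- end of string
    local oldpos = pos + 1
    pos = string.find (s, "\n", oldpos)
    if not pos then
      return string.sub (s, oldpos)
    end -- no newline
    return string.sub (s, oldpos, pos - 1)
  end -- iterator
  return iterator, str
end -- getlines

local a = "one\ntwo"
local b = "red\ngreen"

-- keep only the iterator functions; we pass the strings ourselves
local next_a = getlines (a)
local next_b = getlines (b)

-- interleave the calls: each closure advances its own pos
local results = {}
results [1] = next_a (a)
results [2] = next_b (b)
results [3] = next_a (a)
results [4] = next_b (b)
local finished = next_a (a) -- nil, the first iterator is exhausted

print (table.concat (results, " ")) -- one red two green
```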
- Nick Gammon
Reply #2 on Fri 12 May 2006 05:22 AM (UTC) Amended on Fri 12 May 2006 06:18 AM (UTC) by Nick Gammon
Here is another example of iterators. This time we will make one to do SQL queries, and return one row per iteration.
The iterator appears first - it starts by executing the query (for the given connection), and then for each call returns one row.
The query is based on the general techniques (and example table) shown in this forum post:
The for loop that uses the iterator appears near the end of the code below. I think it looks neater than the one in the original post.
-- dosql iterator - executes an SQL query and returns one row for each iteration
function dosql (con, s)
  -- the for loop calls this for every iteration
  -- returning nil terminates the loop
  local function iterator (state)
    state.row = state.cur:fetch (state.row, "a")
    if not state.row then
      state.cur:close() -- cursor not needed any more
    end -- end of query, exit loop
    return state.row
  end -- iterator
  local state = { cur = assert (con:execute (s)), row = {} }
  return iterator, state
end -- dosql
-- example - do select and print all rows
-- load the MySQL dll
loadlib ("mysql.dll", "luaopen_luasqlmysql") ()
-- create environment object
env = assert (luasql.mysql())
-- connect to data source
con = assert (env:connect ("databasename", "username", "password", "servername"))
for row in dosql (con, "SELECT * from players") do
  table.foreach (row, print)
end -- for
-- close everything
con:close()
env:close()
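The dosql iterator can be tried without a database. The sketch below substitutes a hypothetical in-memory stand-in for the connection (fake_connection and its rows are invented for this illustration and are not part of LuaSQL); it only mimics the three calls dosql relies on: con:execute, cur:fetch and cur:close.

```lua
-- dosql as in the post, repeated so the sketch is self-contained
local function dosql (con, s)
  local function iterator (state)
    state.row = state.cur:fetch (state.row, "a")
    if not state.row then
      state.cur:close () -- cursor not needed any more
    end -- end of query
    return state.row
  end -- iterator
  local state = { cur = assert (con:execute (s)), row = {} }
  return iterator, state
end -- dosql

-- fake_connection (hypothetical): its execute method ignores the SQL
-- text and hands back a cursor over the rows supplied here
local function fake_connection (rows)
  local con = { closed_cursors = 0 }
  function con:execute (sql)
    local index = 0
    local cur = {}
    function cur:fetch (row, mode)
      index = index + 1
      local source = rows [index]
      if not source then
        return nil -- no more rows
      end -- if
      -- reuse the table we were given, as dosql expects
      for k in pairs (row) do row [k] = nil end
      for k, v in pairs (source) do row [k] = v end
      return row
    end -- fetch
    function cur:close ()
      con.closed_cursors = con.closed_cursors + 1
    end -- close
    return cur
  end -- execute
  return con
end -- fake_connection

-- made-up sample rows
local con = fake_connection {
  { name = "alice", level = 10 },
  { name = "bob", level = 3 },
}

local names = {}
for row in dosql (con, "SELECT * from players") do
  names [#names + 1] = row.name
end -- for

print (table.concat (names, ",")) -- alice,bob
```

Note that the cursor is closed by the iterator itself once fetch returns nil, so a loop that exits early with break would leave it open.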
- Nick Gammon
Reply #3 on Sun 21 May 2006 10:01 PM (UTC)
A more generic splitting iterator
The iterator below is more general than the "getlines" one presented above. It will split a string at any delimiter, including a Lua pattern (Lua's counterpart to a regular expression). It is similar in design to the earlier one, but it keeps the delimiter in an upvalue (rather than hard-coding a \n character) and has to remember the length of the previous match, as the delimiter might be more than one character long.
-- split iterator - iterates over a string and returns one item per delimiter
function split (str, delim, plain)
  local pos = 0 -- where we are in the string
  local len = 1 -- length of previous match
  -- the for loop calls this for every iteration
  -- returning nil terminates the loop
  local function iterator (s)
    if not pos then
      return nil
    end -- end of string, exit loop
    local oldpos = pos + len -- step past previous delimiter
    local e -- end of match
    pos, e = string.find (s, delim, oldpos, plain) -- find next delimiter
    if not pos then -- no more delimiters, return rest of string
      return string.sub (s, oldpos)
    end -- no delimiter
    len = e - pos + 1 -- length of delimiter
    return string.sub (s, oldpos, pos - 1)
  end -- iterator
  return iterator, str
end -- split
The first example just uses a simple comma as the delimiter.
test = "every,good,boy,deserves,fruit"
for l in split (test, ",") do
  print ('"' .. l .. '"')
end -- for
Output:
"every"
"good"
"boy"
"deserves"
"fruit"
Now we will use a regular expression to split on a comma or a semicolon.
test = "every,good;boy,deserves;fruit"
for l in split (test, "[,;]") do
  print ('"' .. l .. '"')
end -- for
(Same output as before).
If we want to match on something that is a special regular expression character (like %) then we either have to put a % in front of it, or use the "plain" boolean, to specify matching on plain text.
test = "every%good%boy%deserves%fruit"
for l in split (test, "%", true) do
  print ('"' .. l .. '"')
end -- for
(Same output as before).
Finally we can use a regular expression that might match more than one character (in this case, multiple spaces) to effectively break up a string into words separated by whitespace.
test = "every good boy deserves fruit"
for l in split (test, "%s+") do
  print ('"' .. l .. '"')
end -- for
(Same output as before).
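When the pieces are wanted in a table rather than printed, the iterator can simply be drained into one. A minimal sketch: split is repeated from above so the snippet stands alone, and split_to_table is a hypothetical helper name, not something from the post. The test string deliberately uses runs of several spaces.

```lua
-- split as in the post, repeated so the sketch is self-contained
local function split (str, delim, plain)
  local pos = 0 -- where we are in the string
  local len = 1 -- length of previous match
  local function iterator (s)
    if not pos then
      return nil
    end -- end of string
    local oldpos = pos + len -- step past previous delimiter
    local e
    pos, e = string.find (s, delim, oldpos, plain)
    if not pos then
      return string.sub (s, oldpos)
    end -- no delimiter
    len = e - pos + 1
    return string.sub (s, oldpos, pos - 1)
  end -- iterator
  return iterator, str
end -- split

-- split_to_table (hypothetical helper): collects every piece in order
local function split_to_table (str, delim, plain)
  local t = {}
  for item in split (str, delim, plain) do
    t [#t + 1] = item
  end -- for
  return t
end -- split_to_table

local words = split_to_table ("every  good   boy deserves fruit", "%s+")
print (table.concat (words, "|")) -- every|good|boy|deserves|fruit
```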
- Nick Gammon
Reply #4 on Sun 21 May 2006 10:05 PM (UTC)
Show upvalues
To show how our iterator stores its state in upvalues, we can use the debug library to display them. This function will show upvalues for another function:
function showupvalues (f)
  assert (type (f) == "function")
  local i = 1
  local name, val
  repeat
    name, val = debug.getupvalue (f, i)
    if name then
      print ("index", i, name, "=", val)
      i = i + 1
    end -- if
  until not name
end -- function showupvalues
Now we'll generate an iterator and inspect its upvalues:
iter = split (test, "%", true)
showupvalues (iter)
Output:
index 1 pos = 0
index 2 len = 1
index 3 delim = %
index 4 plain = true
We can see from this that the iterator has remembered the four things we expected it to: the delimiter to match, whether or not we are doing a plain match, the current position, and the length of the previously-found match.
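The index numbers in the output above depend on the Lua version: from Lua 5.2 onwards a function that uses globals such as string also carries an _ENV upvalue, so the list can have more than four entries. The sketch below looks upvalues up by name instead, and shows pos changing as the iterator advances. upvalues_by_name is a hypothetical helper, and split is repeated so the snippet runs alone.

```lua
-- split as in the earlier post, repeated so the sketch is self-contained
local function split (str, delim, plain)
  local pos = 0
  local len = 1
  local function iterator (s)
    if not pos then
      return nil
    end -- end of string
    local oldpos = pos + len
    local e
    pos, e = string.find (s, delim, oldpos, plain)
    if not pos then
      return string.sub (s, oldpos)
    end -- no delimiter
    len = e - pos + 1
    return string.sub (s, oldpos, pos - 1)
  end -- iterator
  return iterator, str
end -- split

-- upvalues_by_name (hypothetical helper): returns a table mapping
-- each upvalue name of f to its current value
local function upvalues_by_name (f)
  local found = {}
  local i = 1
  while true do
    local name, val = debug.getupvalue (f, i)
    if not name then
      break
    end -- no more upvalues
    found [name] = val
    i = i + 1
  end -- while
  return found
end -- upvalues_by_name

local test = "every%good%boy%deserves%fruit"
local iter = split (test, "%", true)

local before = upvalues_by_name (iter) -- snapshot before any call
local first = iter (test)              -- advance one step
local after = upvalues_by_name (iter)  -- snapshot after one call

print (before.pos, after.pos, first)   -- 0, 6 and "every"
```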
- Nick Gammon
Reply #5 on Sun 21 May 2006 10:31 PM (UTC) Amended on Thu 26 Jun 2008 04:32 AM (UTC) by Nick Gammon
Using string.gmatch to split a string
Yet another approach to splitting a string at a delimiter is to use string.gmatch and search for the inverse of the delimiter. This is the simplest approach, if it works in your case:
test = "every,good,boy,deserves,fruit"
for l in string.gmatch(test, "[^,]+") do
  print ('"' .. l .. '"')
end -- for
This example uses string.gmatch to find each run of one or more characters that does *not* contain a comma.
The major difficulty with this method is the case of the empty string, like this:
test = "every,good,boy,,,,deserves,fruit"
Running the string.gmatch loop above on this string returns the same results as before. In other words, the "empty" matches provided by consecutive commas are discarded.
However using the "split" iterator presented earlier, we get the "correct" results:
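The output that followed is not preserved here. As a hedged reconstruction of the comparison, the sketch below collects the pieces each approach produces for that string; split is repeated from the earlier post so the snippet is self-contained, and the table names from_gmatch and from_split are made up for this illustration.

```lua
-- split as in the earlier post, repeated so the sketch is self-contained
local function split (str, delim, plain)
  local pos = 0
  local len = 1
  local function iterator (s)
    if not pos then
      return nil
    end -- end of string
    local oldpos = pos + len
    local e
    pos, e = string.find (s, delim, oldpos, plain)
    if not pos then
      return string.sub (s, oldpos)
    end -- no delimiter
    len = e - pos + 1
    return string.sub (s, oldpos, pos - 1)
  end -- iterator
  return iterator, str
end -- split

local test = "every,good,boy,,,,deserves,fruit"

-- string.gmatch with "[^,]+" skips the empty fields
local from_gmatch = {}
for l in string.gmatch (test, "[^,]+") do
  from_gmatch [#from_gmatch + 1] = l
end -- for

-- split returns an empty string for each empty field
local from_split = {}
for l in split (test, ",") do
  from_split [#from_split + 1] = l
end -- for

print (#from_gmatch, #from_split) -- 5 pieces versus 8 pieces
for _, l in ipairs (from_split) do
  print ('"' .. l .. '"')
end -- for
```

The three empty strings between "boy" and "deserves" are what string.gmatch drops.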
- Nick Gammon
The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).