Gammon Forum

Suggestion for indexing strings

It is now over 60 days since the last post. This thread is closed.


Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Fri 26 Nov 2010 07:56 PM (UTC)

Amended on Fri 26 Nov 2010 10:00 PM (UTC) by Nick Gammon

Message
I saw on the Lua mailing list a cool suggestion for string indexing.

The gist is to make a simpler syntax for pulling out an individual character from a string. For example, if you want to find the first character you currently do:


a = "nick"
print (string.sub (a, 1, 1)) --> n


This is a little ugly. Now, if you add the following to your Lua initialization code:


getmetatable ("").__index = function (str, i)

  if (type (i) == "number") then
    return string.sub (str, i, i)  -- index into str
  end -- if

  return string [i]  -- fallback (eg. string.match)
end -- function


What that does is replace the __index metamethod that handles indexing into any string (all Lua strings share the same metatable, whose __index normally points at the string table, hence the fallback).

Now you can do this:


a = "nick"
print (a [1])  --> n
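
A few edge cases follow from how string.sub behaves. Here is a minimal sketch, assuming the metamethod above is installed (it is repeated so the snippet runs on its own). Negative indexes count from the end, an out-of-range index gives an empty string rather than nil, and ordinary method calls still work through the fallback:

```lua
getmetatable ("").__index = function (str, i)
  if (type (i) == "number") then
    return string.sub (str, i, i)  -- index into str
  end -- if
  return string [i]  -- fallback (eg. string.match)
end -- function

local a = "nick"
print (a [1])         --> n
print (a [-1])        --> k     (negative indexes count from the end)
print (a [10] == "")  --> true  (out of range gives "", not nil)
print (a:upper ())    --> NICK  (method calls go through the fallback)
```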


[EDIT]

Modified per discussion below, so the discussion may not make a heap of sense.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Twisol   USA  (2,257 posts)  Bio
Date Reply #1 on Fri 26 Nov 2010 08:18 PM (UTC)

Amended on Fri 26 Nov 2010 08:21 PM (UTC) by Twisol

Message
You might want to use string.sub(str) instead of str:sub(), since you're incurring needless overhead by invoking the metamethod again.

Improvement allowing for a range:

getmetatable ("").__index = function (str, i)
  if (type (i) == "number") then
    return string.sub (str, i, i)  -- index into string
  elseif (type (i) == "table") then
    return string.sub (str, i[1], i[2])
  else
    return string [i]  -- fallback
  end  -- if
end -- function


[EDIT]: It's a pity you can't make the same syntax work for modifying strings.
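
The reason, as far as I understand it, is that Lua strings are immutable values, so even a __newindex metamethod could only build a new string and would have no way to rebind the variable. A minimal sketch of the usual workaround follows; setchar is a made-up helper name, and it assumes 1 <= i <= #s:

```lua
-- Hypothetical helper: returns a copy of s with the character at
-- position i replaced by c. The original string is left untouched.
local function setchar (s, i, c)
  return string.sub (s, 1, i - 1) .. c .. string.sub (s, i + 1)
end -- function

local a = "nick"
local b = setchar (a, 1, "N")
print (a)  --> nick
print (b)  --> Nick
```

So you still have to assign the result back yourself, eg. a = setchar (a, 1, "N").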

'Soludra' on Achaea

Blog: http://jonathan.com/
GitHub: http://github.com/Twisol

Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #2 on Fri 26 Nov 2010 08:38 PM (UTC)
Message
Twisol said:

You might want to use string.sub(str) instead of str:sub(), since you're incurring needless overhead by invoking the metamethod again.



I had that initially and changed it. I was trying to avoid the overhead of:


  • Looking up "string" in the global environment
  • Looking up "sub" in that table


Whereas doing str:sub is hopefully only one lookup (the metatable). Am I wrong?

As for the table, I think that might be getting as complicated as the original.


print ( a [ { 2, 3 } ] )


instead of:


print ( a:sub (2, 3) )




Posted by Twisol   USA  (2,257 posts)  Bio
Date Reply #3 on Fri 26 Nov 2010 08:48 PM (UTC)

Amended on Fri 26 Nov 2010 08:50 PM (UTC) by Twisol

Message
Nick Gammon said:
I had that initially and changed it. I was trying to avoid the overhead of:


  • Looking up "string" in the global environment
  • Looking up "sub" in that table


Whereas doing str:sub is hopefully only one lookup (the metatable). Am I wrong?

Well, it's one lookup plus a function call (the metamethod), which probably involves more Lua bytecode than just a lookup. And you can cache 'string' in a local. In fact, you can cache 'sub' in a local, too:

local string = string
local sub = string.sub
getmetatable ("").__index = function (str, i)
  if (type (i) == "number") then
    return sub (str, i, i)  -- index into string
  else
    return string [i]  -- fallback
  end  -- if
end -- function


Nick Gammon said:
As for the table, I think that might be getting as complicated as the original.

Touche. I was trying to go for the Ruby syntax of str[1,2], but you can't pass multiple values to the index metamethod.
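
One way to get close, as a rough sketch rather than anything from the mailing list suggestion: the string metatable can also carry a __call metamethod, and a call does accept multiple arguments. This assumes you don't mind writing str (2, 3) instead of str [2, 3]. Note also that Ruby's second argument is a length, whereas string.sub takes an end position:

```lua
-- Assumption: calling a string like a function is acceptable in your scripts.
getmetatable ("").__call = string.sub

local a = "nick"
print (a (2, 3))  --> ic
print (a (2))     --> ick  (end position defaults to the end of the string)
print (a (-2))    --> ck   (negative positions count from the end)
```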


Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #4 on Fri 26 Nov 2010 09:22 PM (UTC)

Amended on Fri 26 Nov 2010 09:23 PM (UTC) by Nick Gammon

Message
It seems you may have a point. I think you added in caching string.sub while I was replying, but the difference was minor as it turned out.

Test bed:


collectgarbage ( )

local a, b = "nick"
start = utils.timer ()

for i = 1, 1000000 do
  b = a [3]
end -- for

print (string.format ("Time taken = %0.2f", utils.timer () - start))
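
utils.timer is a MUSHclient function, so for anyone wanting to try this in a standalone Lua interpreter, here is a rough equivalent using os.clock. That reports CPU time rather than elapsed time, so the figures won't be directly comparable with mine. The function name time_indexing is made up for the sketch, and the metamethod shown is the cached-sub variant:

```lua
local sub = string.sub  -- cache string.sub in a local

getmetatable ("").__index = function (str, i)
  if (type (i) == "number") then
    return sub (str, i, i)  -- index into string
  end -- if
  return string [i]  -- fallback
end -- function

-- Hypothetical helper: indexes a string `count` times and returns
-- the CPU seconds used plus the last character fetched.
local function time_indexing (count)
  local a, b = "nick"
  local start = os.clock ()
  for i = 1, count do
    b = a [3]
  end -- for
  return os.clock () - start, b
end -- function

collectgarbage ()
local elapsed, last = time_indexing (1000000)
print (string.format ("Time taken = %0.2f", elapsed))
```

Swap the body of the metamethod for each variant you want to compare and run it a few times, since single runs can vary.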


Results in speed order:


 return sub (str, i, i)         --> 0.42 secs (string.sub cached in a local)
 return string.sub (str, i, i)  --> 0.43 secs (string cached in a local)
 return string.sub (str, i, i)  --> 0.47 secs (global string lookup)
 return str:sub (i, i)          --> 0.72 secs (method call through the metamethod)


So my original was about 1.5 times as fast as my "improved" version (0.47 vs. 0.72 secs). Caching the string.sub lookup helped, but not by much.

I'll change the original post back to the simpler version.


Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #5 on Fri 26 Nov 2010 09:26 PM (UTC)

Amended on Fri 26 Nov 2010 09:27 PM (UTC) by Nick Gammon

Message
Meanwhile I'm not sure about the fallback line. Surely if you do, say:


a = "abcd"
print (a ["x"])


... you are indexing key "x" in the string "a", not key "x" in the string table? Somewhat surprisingly, it seems to work, in the sense that it returns nil.

But wouldn't it always?

So perhaps the fallback should be either:



  return nil  -- fallback

-- or

  error ("Attempting non-numeric index into string")



Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #6 on Fri 26 Nov 2010 09:29 PM (UTC)
Message
Or even:


  return ""   -- fallback


... on the basis that indexing out of range returns an empty string.


Posted by Twisol   USA  (2,257 posts)  Bio
Date Reply #7 on Fri 26 Nov 2010 09:36 PM (UTC)

Amended on Fri 26 Nov 2010 09:37 PM (UTC) by Twisol

Message
You need to leave it as string[i]. Otherwise you prevent users from doing str:gsub() or anything else in 'string' using the shorthand.
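
To illustrate, a small sketch comparing the two fallbacks (pcall is only there so the failing case doesn't stop the script, and the variable names are just for the example):

```lua
local a = "nick"

-- Variant 1: fallback returns nil, so method lookups find nothing
getmetatable ("").__index = function (str, i)
  if (type (i) == "number") then
    return string.sub (str, i, i)
  end -- if
  return nil  -- fallback
end -- function

local ok_nil = pcall (function () return a:upper () end)
print (ok_nil)  --> false  (a.upper is nil, so the call fails)

-- Variant 2: fallback looks the key up in the string table
getmetatable ("").__index = function (str, i)
  if (type (i) == "number") then
    return string.sub (str, i, i)
  end -- if
  return string [i]  -- fallback
end -- function

local ok_fallback, result = pcall (function () return a:upper () end)
print (ok_fallback, result)  --> true    NICK
print (a ["x"])              --> nil  (there is no string.x)
```

That last line is also why a ["x"] returned nil earlier: the key is looked up in the string table, which simply has no entry called "x".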


Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #8 on Fri 26 Nov 2010 09:50 PM (UTC)
Message
Aha. That makes sense.


The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).

Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.