String.find not working for regex strings in table
|
It is now over 60 days since the last post. This thread is closed.
Posted by
| Natasi
(79 posts) Bio
|
Date
| Wed 23 May 2012 02:00 AM (UTC) |
Message
| I have a table (example below) filled with about 700 entries, which I go through using string.find and compare to the incoming trigger. The issue is that string.find does not work with any regex in the line; it only matches plain text.
triggerstext = {}
triggerstext["Your mouth turns up as your face (.*?) into a smile."]=function() Note("Look mom, I found you!"); end
Here is the function I parse the trigger with:
function parseAll(label, trigger, wildcard)
  for _, trig in pairs (triggerstext) do
    a = string.find(trigger, _);
    if a then
      trig();
    end
  end
end
So the above will only ever match on plain text, not on the regex portion. I have also written this as a regex function using rex.new, but it is extremely slow when going over 700 entries in the table:
for _, trig in pairs (triggerstext) do
  re = rex.new (_);
  a, b, c = re:match (trigger);
  if a then
    trig(c); --fires function
  end
end
Any help with the string.find issue would be appreciated, or if anyone knows why the regex function is so insanely slow, that would help also. My Lua is not advanced enough yet to figure this out. | Top |
|
Posted by
| Fiendish
USA (2,534 posts) Bio
Global Moderator |
Date
| Reply #1 on Wed 23 May 2012 02:42 AM (UTC) |
Message
| Lua does not use regex as part of the language. Lua uses its own pattern format described here: http://www.lua.org/pil/20.2.html
You probably want to precompile your rexes instead of doing it again 700 times every line. |
https://github.com/fiendish/aardwolfclientpackage | Top |
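As a small illustration of the difference described above, the sketch below shows the Lua-pattern counterparts of two common regex idioms. `(.-)` is the non-greedy capture (the closest thing to regex `(.*?)`), and `%.` matches a literal period, since an unescaped `.` matches any character. The sample line is made up for illustration.

```lua
local line = "You say 'hi' and then 'bye'."

-- Greedy capture: (.*) takes as much as it can while still allowing a match.
local greedy = string.match(line, "'(.*)'")

-- Non-greedy capture: (.-) takes as little as it can.
local lazy = string.match(line, "'(.-)'")

-- "%." matches only a literal period; a bare "." would match any character.
local ends_with_period = string.find(line, "bye'%.$") ~= nil

-- In a Lua pattern the "?" in "(.*?)" is a literal question mark,
-- so this regex-style pattern finds nothing in a line without "?".
local regex_style = string.match(line, "'(.*?)'")

print(greedy, lazy, ends_with_period, regex_style)
```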
|
Posted by
| Natasi
(79 posts) Bio
|
Date
| Reply #2 on Wed 23 May 2012 04:32 AM (UTC) |
Message
| Ahhhh I was using (.*?) and should have been using (.*)... that fixed that issue, thanks!
As for precompiling the re:match version, I tried that using the method below and it was still very slow. On load of the script I precompile all the triggers, then go through them on each trigger.
function preParse()
  i = 1;
  for _, trig in pairs (plaintext) do
    _G ["re_"..i] = rex.new (_);
    i = i+1;
  end
end
function parseAll1(label, trigger, wildcard)
  tt = 1
  for _, trig in pairs (plaintext) do
    a, b, c = _G["re_"..tt]:match (trigger);
    if a then
      trig(c);
    end
    tt = tt+1;
  end
end
Is there a better way of writing this? | Top |
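One way to restructure the precompiled version is to keep each compiled regex next to its handler in a single array, instead of in numbered globals looked up through `_G` while the table is iterated a second time. The following is only a sketch. `compile_all` and `dispatch` are hypothetical names, and the compile function is passed in as a parameter so the example runs outside MUSHclient. In MUSHclient one would presumably pass `rex.new`; the stand-in used here wraps `string.find` and only mimics the `start, end, captures` return shape used in the post above.

```lua
-- Build an array of { re = compiled pattern, handler = function } once, at load time.
local function compile_all(triggers, compile)
  local compiled = {}
  for pattern, handler in pairs(triggers) do
    compiled[#compiled + 1] = { re = compile(pattern), handler = handler }
  end
  return compiled
end

-- Try each precompiled entry against the line; stop at the first match.
local function dispatch(compiled, line)
  for _, entry in ipairs(compiled) do
    local first, last, captures = entry.re:match(line)
    if first then
      entry.handler(captures)
      return true
    end
  end
  return false
end

-- Stand-in compiler (an assumption, for running outside MUSHclient):
-- returns an object whose :match method gives start, end, table of captures.
local function fake_compile(pattern)
  return {
    match = function(self, s)
      local r = { string.find(s, pattern) }
      if not r[1] then return nil end
      local caps = {}
      for i = 3, #r do caps[#caps + 1] = r[i] end
      return r[1], r[2], caps
    end,
  }
end

local seen
local triggers = {
  ["^Your face (.*) into a smile%.$"] = function(c) seen = c[1] end,
}
local compiled = compile_all(triggers, fake_compile)
local matched = dispatch(compiled, "Your face crinkles into a smile.")
print(matched, seen)
```

This mainly removes the string concatenation and global lookup per entry. Whether that accounts for the slowness reported above is not something this thread establishes.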
|
Posted by
| Nick Gammon
Australia (23,133 posts) Bio
Forum Administrator |
Date
| Reply #3 on Wed 23 May 2012 05:45 AM (UTC) |
Message
| Your original one failed because the Lua regex does not recognize (.*?) syntax. This works:
triggerstext = {}
triggerstext["Your mouth turns up as your face (.*) into a smile."]=function() Note("Look mom, I found you!"); end
function parseAll(label, trigger, wildcard)
  for pattern, trig in pairs (triggerstext) do
    if string.find (trigger, pattern) then
      trig ()
      break
    end
  end
end
local start = utils.timer ()
parseAll ("foo", "Your mouth turns up as your face Nick into a smile.")
print (utils.timer () - start)
Output:
Look mom, I found you!
0.00070651416899636
So under a millisecond for detecting that. I added the "break" because once you get a match you don't need to keep trying.
Most of that time would have been processing the match, because if I send in a non-matching string it only took about 7 microseconds.
|
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Nick Gammon
Australia (23,133 posts) Bio
Forum Administrator |
Date
| Reply #4 on Wed 23 May 2012 05:56 AM (UTC) |
Message
| What would speed this up, too, would be to have this trigger (which does the 700 matches) be a low-priority trigger, down the list in your trigger list behind other "more likely to match" triggers. For example, a prompt. So if you get a prompt your prompt trigger fires first, which saves even testing a single item in this table. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Nick Gammon
Australia (23,133 posts) Bio
Forum Administrator |
Date
| Reply #5 on Wed 23 May 2012 05:57 AM (UTC) Amended on Wed 23 May 2012 05:58 AM (UTC) by Nick Gammon
|
Message
| Look up string.find in the inbuilt help to find the exact Lua regexp syntax. The Lua regexp is quite fast, but its syntax differs a bit to the one used by triggers.
This is the web version (you get the same stuff in the help):
http://www.gammon.com.au/scripts/doc.php?lua=string.find |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Natasi
(79 posts) Bio
|
Date
| Reply #6 on Wed 23 May 2012 11:26 AM (UTC) |
Message
| If my triggertext had multiple wildcards in it, how would I capture those?
In the code below, I would want to see the first wildcard output, which would be 'up', and the second, which would be 'breaks'. I did a quick version doing a, b, c = string.find() and I could get the 'up' by outputting c, but unless I added a d, e, f (etc) it wouldn't catch additional wildcards.
triggerstext = {}
triggerstext["Your mouth turns (.*) as your face (.*) into a smile."]=function() Note("Look mom, I found you!"); end
function parseAll(label, trigger, wildcard)
  for pattern, trig in pairs (triggerstext) do
    if string.find (trigger, pattern) then
      trig ()
      break
    end
  end
end
local start = utils.timer ()
parseAll ("foo", "Your mouth turns up as your face Nick into a smile.")
print (utils.timer () - start)
| Top |
|
Posted by
| Natasi
(79 posts) Bio
|
Date
| Reply #7 on Wed 23 May 2012 11:34 AM (UTC) |
Message
| Another issue I've come across, it seems (wildcard1|wildcard2) splits do not work with Lua string.find. Would it be wise to split all my triggers apart (I have about 1284 '|' grouped triggers) into their own individual lines or try to find a way to speed up the regex version of this? | Top |
|
Posted by
| Nick Gammon
Australia (23,133 posts) Bio
Forum Administrator |
Date
| Reply #8 on Wed 23 May 2012 11:59 AM (UTC) |
Message
|
Natasi said:
If my triggertext had multiple wildcards in it, how would I capture those?
Like this:
triggerstext = {}
triggerstext["Your mouth turns (.*) as your face (.*) into a smile."] = function(where, who)
  Note("Look " .. who .. ", I found you! and your face turned " .. where);
end
function parseAll(label, trigger, wildcard)
  for pattern, trig in pairs (triggerstext) do
    local results = { string.match (trigger, pattern) }
    if next (results) then
      trig (unpack (results))
      break
    end
  end
end
local start = utils.timer ()
parseAll ("foo", "Your mouth turns up as your face Nick into a smile.")
print (utils.timer () - start)
Output:
Look Nick, I found you! and your face turned up
0.00056068558478728
Quote:
but it is extremely slow when going over 700 entries in the table :
...
Quote:
I have about 1284 '|' grouped triggers
They seem to be multiplying!
Yes, the "|" operator does not work in Lua regexps. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Natasi
(79 posts) Bio
|
Date
| Reply #9 on Wed 23 May 2012 04:03 PM (UTC) |
Message
| Thanks Nick!
The 700 entries referred to this:
triggerstext["Your mouth turns (.*) as your face (.*) into a smile."]...
The 1284 refers to the multiple within:
triggerstext["Your mouth turns (.*) as your face (.*) into a smile.|I like pie.|This is another one to match"]...
So I take it I just have to split them apart then. Just to be sure: will string.match/find over the soon-to-be almost 2000 triggers still parse faster than the regex over 700? | Top |
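Splitting the "|" entries apart does not have to be done by hand. As a rough sketch, assuming the alternatives are separated only by top-level "|" characters as in the example above (a "|" inside parentheses, like the (wildcard1|wildcard2) form mentioned earlier, would need different handling), a helper can expand each combined key into separate entries that share one handler. `expand_alternatives` is a hypothetical name, and the periods are escaped as `%.` so they match literally.

```lua
-- Expand keys of the form "A|B|C" into separate entries sharing one handler.
-- Assumes every "|" is a top-level separator (none inside parentheses).
local function expand_alternatives(triggers)
  local expanded = {}
  for combined, handler in pairs(triggers) do
    -- "[^|]+" matches each run of characters between "|" separators.
    for pattern in string.gmatch(combined, "[^|]+") do
      expanded[pattern] = handler
    end
  end
  return expanded
end

local hits = 0
local combined = {
  ["Your mouth turns (.*) as your face (.*) into a smile%.|I like pie%.|This is another one to match"] =
    function() hits = hits + 1 end,
}

local triggerstext = expand_alternatives(combined)

-- Count the resulting entries.
local count = 0
for _ in pairs(triggerstext) do count = count + 1 end
print(count)

-- Match a line against the expanded table, stopping at the first hit.
for pattern, handler in pairs(triggerstext) do
  if string.find("I like pie.", pattern) then
    handler()
    break
  end
end
print(hits)
```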
|
Posted by
| Worstje
Netherlands (899 posts) Bio
|
Date
| Reply #10 on Wed 23 May 2012 04:13 PM (UTC) |
Message
| Natasi, it very likely will. The reason is that the complexity is less: PCRE has tons of options but, like a car, the extra weight takes its toll. Lua has fewer features and is thus less complex, meaning it can generally execute equivalent patterns faster.
Also, I have experience using several thousand builtin MUSHclient triggers myself, and while it lagged things just a little it was still something I didn't need to pay much heed to. This was 4+ years ago, so I imagine performance has only increased since. | Top |
|
Posted by
| Nick Gammon
Australia (23,133 posts) Bio
Forum Administrator |
Date
| Reply #11 on Thu 24 May 2012 02:00 AM (UTC) |
Message
| It seems like a lot, but yes you will have to live without the "|" if you are going to use Lua.
I suggest you "anchor" them, that should make it much faster:
"^Your mouth turns (.*) as your face (.*) into a smile.$"
The reason is, without the anchor, say it gets:
There is a shop here.
Now it compares the regexp "Y" (as in "Your mouth") to "T" (as in "There is a shop here."). It finds no match. But without the anchor, it then moves in one character and tries again. After all the line might say:
There is a shop here. Your mouth turns blue as your face crinkles into a smile.
It gets no match on the "h" and tries again. And again. And again.
But with the anchor (the "^" character) if it doesn't match at the start of the line it stops immediately.
If that isn't fast enough, you could split them up a bit.
For example, all the triggers starting "You" could be put into a group (separate table). Then a single test (does the line start with "You"?) could let you decide whether to test a further 100 triggers or not. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
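The grouping idea can be sketched as a table of tables keyed by the first word of each pattern, so that a line is only tested against the patterns that share its opening word. This is an illustration only: `add_trigger`, `parse_line`, and the bucket layout are hypothetical, and it assumes every pattern is anchored with "^" and begins with a complete literal word.

```lua
local unpack = table.unpack or unpack  -- Lua 5.1 has unpack, later versions table.unpack

local groups = {}   -- first word -> list of { pattern = ..., handler = ... }

-- Register an anchored pattern under the literal word it starts with.
local function add_trigger(pattern, handler)
  local first_word = string.match(pattern, "^%^(%a+)")
  assert(first_word, "pattern must be anchored and start with a word")
  groups[first_word] = groups[first_word] or {}
  table.insert(groups[first_word], { pattern = pattern, handler = handler })
end

-- Test a line only against the group that matches its first word.
local function parse_line(line)
  local first_word = string.match(line, "^(%a+)")
  local group = first_word and groups[first_word]
  if not group then return false end
  for _, entry in ipairs(group) do
    local captures = { string.match(line, entry.pattern) }
    if next(captures) then
      entry.handler(unpack(captures))
      return true
    end
  end
  return false
end

local result
add_trigger("^Your mouth turns (.*) as your face (.*) into a smile%.$",
  function(where, who) result = who .. ":" .. where end)

local hit  = parse_line("Your mouth turns up as your face Nick into a smile.")
local miss = parse_line("There is a shop here. Your mouth turns up as your face Nick into a smile.")
print(hit, result, miss)
```

A line that starts with "There" never reaches the "Your" group, so none of that group's patterns are tested against it.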
|
The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).