Single trigger with multiple regex matches
It is now over 60 days since the last post. This thread is closed.
Posted by Soft (7 posts) on Wed 25 Jan 2017 01:40 AM (UTC), amended Wed 25 Jan 2017 03:12 AM (UTC) by Soft
I'm trying to use a single trigger to match and send a string pattern. Perhaps it should be set up as a multi-line trigger.
Here is the MUD output I'm trying to match:
Inn of the Dripping Dagger
N-Battered and gouged wooden door S-Selduth Street W-Inn of the Drip U-Secon Floor of
D-The Cellar of t
This cozy old inn is well known as a favorite watering hole and resting
place for hire-swords and it has a reputation for jovial horseplay that
keeps more timid visitors away. It is a low-ceilinged taproom with two sets of
I want to pull 5 different strings out of that text:
- N-Battered and gouged wooden door
- S-Selduth Street
- W-Inn of the Drip
- U-Secon Floor of
- D-The Cellar of t
This is the regex I'm using:
\b(?:(?:[NSWUD]|[NS][EW])-)(?:(?:[\w ](?!(?:[NSWUD]|[NS][EW])-))+)\b
This regex seems to work exactly as I intend. I feel quite sure that I've implemented the trigger logic improperly.
The actual trigger:
<trigger
name="ExitTrigger"
enabled="y"
regexp="y"
multi_line="n"
keep_evaluating="y"
match="\b(?:(?:[NSWUD]|[NS][EW])-)(?:(?:[\w ](?!(?:[NSWUD]|[NS][EW])-))+)\b"
send_to="12"
sequence="100">
<send>ListExit "%0"</send>
</trigger>
The current behavior causes this trigger to fire once for each line that contains the direction descriptions, capturing only the first matching sequence. I'd like to capture all matching sequences on a given line. The regex works as intended when checked with regexpal.

Posted by Worstje (Netherlands, 899 posts), Reply #1 on Wed 25 Jan 2017 12:39 PM (UTC), amended Wed 25 Jan 2017 12:40 PM (UTC) by Worstje
From what I recall (it's been a couple of years since I've been very active with MUSHclient), the functionality is slightly different than you'd expect.
The keep_evaluating trigger option only determines whether other triggers will get matched against the same line, not whether the same trigger is matched against it again.
The repeat_on_same_line trigger option controls whether the same trigger gets matched repeatedly to the same line... but only for the purpose of changing the styles with the relevant settings. (Making it useful for stuff like name highlights and the sort.) Presumably, there's a lot of gotchas to repeating scripts on the same line that Nick felt were better avoided as a whole.
For what you want, the best option is to just pull the functionality out into a script, and do the repeat matching in a script. The Lua 're' (regular expression) module is an interface to the same matching engine MUSHclient uses, so if you take a moment to figure out how that works you can just toss your regexp into a loop there and process it as needed.
If you can't figure it out, ask here and I'm sure someone can help you with the details. | Top |

Posted by Soft (7 posts), Reply #2 on Wed 25 Jan 2017 01:45 PM (UTC)
Thanks for your reply Worstje. I discovered much the same regarding the keep_evaluating flag last night. I wasn't aware of the repeat_on_same_line flag.
I'm new to Lua, and wasn't aware of the 're' module. I even resorted to string.match at one point, simply because I didn't do enough digging regarding regex modules/libraries. Your suggestions for breaking the regex out into a script was definitely the plan, providing I couldn't get it all working with a single trigger. Looks like I'm going down that route today. Thanks for your help! | Top |

Posted by Worstje (Netherlands, 899 posts), Reply #3 on Wed 25 Jan 2017 02:30 PM (UTC)
Having slept, I realize what some reasons for not matching a trigger repeatedly on the same line might be.
(.*+) is a popular pattern - it is probably one of the most common ones! Combine that with the fact that the option would require keeping track of the last match position to avoid an eternal loop, and it would be really simple for a basic trigger to match 50+ times and cause terrible performance. Hell, given a carelessly enough written trigger, I can imagine various capture groups going crazy with backtracking and causing some sort of exponentially rising match count due to all the ways characters could be matched against the regex.
If a seemingly simple trigger has the option of making the client slow as hell, it could easily give the program the reputation of being way too slow, difficult to use and generally unfit for duty. People who make such simple triggers are not the kind to figure out why that might be the case; they just see it and go 'well, it is simple and something similar works fine on X, I'm switching to something else and will advise everybody to not use MUSHclient because it sucks!'.

Posted by Soft (7 posts), Reply #4 on Wed 25 Jan 2017 03:50 PM (UTC)
Thanks again for your insight Worstje. Here's what I've got now:
<trigger
name="ExitTrigger"
enabled="y"
regexp="y"
multi_line="y"
lines_to_match="2"
keep_evaluating="y"
match="((?:[NSWUD]|[NS][EW])-).*"
send_to="12"
sequence="100">
<send>ListExit "%0"</send>
</trigger>
Which pulls out two separate strings:
N-Battered and gouged wooden door S-Selduth Street W-Inn of the Drip U-Secon Floor of
D-The Cellar of t
I've imported the LPEG.re module, and am attempting to use re.match on each of those two strings.
It seems that with re.match, I should be able to use my original regex to pull out all of the individual strings simultaneously. I can also loop through and pull out one of the individual strings on each loop.
Now I'm having issues implementing either of those regex patterns in the LPEG.re syntax. I really appreciate your time and help in this matter. | Top |

Posted by Worstje (Netherlands, 899 posts), Reply #5 on Wed 25 Jan 2017 07:05 PM (UTC), amended Thu 26 Jan 2017 01:59 PM (UTC) by Worstje
Sorry, it took a while. The matter was a bit more complicated than I would have liked (I tried to brute-force a solution with negative look-aheads, but your output wasn't friendly enough for that), so I had to code something up manually, only to get myself into an infinite loop and lose all my progress. Whoops. xD
Also, my apologies for misdirecting you. The module you need is 'rex', not 're'. I'm getting my languages and scripting environments confused a little bit!
<triggers>
<trigger
enabled="y"
match="^(.*?([NSWEUD]|[NS][EW])-\w.*)$"
regexp="y"
send_to="12"
sequence="100"
>
<send>local line = "%0"
-- Match up to the first letter of the description for accuracy.
local exit_re = rex.new("\\\\s([NSWEUD]|[NS][EW])-\\\\w")
local s, e, m = exit_re:match(line, 1)
while (s ~= nil) do
local new_s, new_e, new_m = exit_re:match(line, s+1)
local exit_name = m[1]
local exit_description = line:sub(e, (new_s or 0)-1)
-- You may want to add a trim() to exit_description since your game
-- seems to be littered with spaces in unexpected places. :-)
-- BEGIN -- Do what you want at this point.
Note("Exit found: <" .. exit_name .. ">")
Note("Exit description: <" .. exit_description ..">")
-- END -- Do what you want at this point.
s,e,m = new_s, new_e, new_m
end</send>
</trigger>
</triggers>
There's your trigger. I did some shenanigans with the trigger itself to get the entire line to match for manual processing. I personally always use the Script Function thing because it is far easier to code for; it also offers some extra features that aren't easy to do in this format. Additionally, it gives the bonus of not having a backslash-infestation with regexes like in this particular case. :-)
The way it basically works is to first match a space, followed by a direction, followed by a dash, followed by one letter. The latter is to be as precise as possible with regards to really matching an exit to avoid false positives. Then it looks for the same pattern a second time, but one character after where it found its first match. This should give the second exit. Next, I just fiddle with the indexes and collected information to put the values you want into useful variables. (Note that s or new_s becomes nil if it cannot find a match, so I need to default it to zero, which will give a -1 and be interpreted by Lua as 'until the end of the string'.) Finally, we take the second match we did initially, and use it as a seed for the next loop.
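For readers who want to try the loop outside MUSHclient, here is a minimal sketch of the same idea using only Lua's built-in string patterns instead of rex. That substitution is an assumption made for portability, not what the trigger above uses: Lua patterns have no alternation, so the two-letter and one-letter directions are searched separately and the earlier hit wins. The names next_exit and split_exits are made up for this illustration, and, like the original, it expects whitespace in front of every direction.

```lua
-- Find the next exit marker at or after position init.
-- Returns: start index (the leading whitespace), index of the first
-- description character, and the direction text; nil if there is none.
function next_exit(line, init)
  local s1, e1, d1 = line:find("%s([NS][EW])%-%w", init)
  local s2, e2, d2 = line:find("%s([NSEWUD])%-%w", init)
  if s1 and (not s2 or s1 < s2) then
    return s1, e1, d1
  end
  return s2, e2, d2
end

-- Split one exit line into a list of { dir = ..., desc = ... } tables.
function split_exits(line)
  local exits = {}
  local s, e, dir = next_exit(line, 1)
  while s do
    -- Look for the following exit, starting one character further on.
    local ns, ne, ndir = next_exit(line, s + 1)
    -- No further exit: (0) - 1 gives -1, which means "to end of string".
    local desc = line:sub(e, (ns or 0) - 1)
    exits[#exits + 1] = { dir = dir, desc = desc }
    s, e, dir = ns, ne, ndir
  end
  return exits
end

for _, exit in ipairs(split_exits(" N-Battered and gouged wooden door S-Selduth Street")) do
  print(exit.dir, exit.desc)
end
```

Descriptions keep any extra padding that precedes the next exit, so trimming is still worthwhile on real output.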
If you can't figure out how to do the (right-)trim of the exit description, let me know and I'll get you a snippet. It should be easy enough to find on the interwebs, though! | Top |
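A trim helper of the kind mentioned above could look like the following sketch. The function names rtrim and trim are illustrative, not part of MUSHclient or of the trigger in this post; both rely only on Lua's standard string.gsub.

```lua
-- Remove trailing whitespace. The extra parentheses drop gsub's second
-- return value (the substitution count).
function rtrim(s)
  return (s:gsub("%s+$", ""))
end

-- Remove whitespace from both ends.
function trim(s)
  return (s:gsub("^%s+", ""):gsub("%s+$", ""))
end

print("<" .. rtrim("Selduth Street   ") .. ">")  -- prints <Selduth Street>
```

In the trigger script this would be applied to exit_description just before it is used.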

Posted by Nick Gammon (Australia, 23,140 posts, Forum Administrator), Reply #6 on Wed 25 Jan 2017 09:57 PM (UTC)
Another interesting way of handling this is by using LPEG.
http://www.gammon.com.au/lpeg
First we get a trigger to match what roughly looks like an exit line, then we process the line in LPEG:
<triggers>
<trigger
enabled="y"
match="^\s+([NSWEUD]|[NS][EW])-\w.*"
regexp="y"
send_to="12"
sequence="100"
>
<send>
require "re"
exitsGrammar = re.compile[[
ExitLine <- {| (" "+ Exit)+ |}
Direction <- ( [NS] [EW] ) / [NSEWUD] "-"
Exit <- { Direction (. !Direction)+ }
]]
exits = exitsGrammar:match ("%0")
for k, v in ipairs (exits) do
print (v)
end -- for
</send>
</trigger>
</triggers>
For advice on how to copy the above, and paste it into MUSHclient, please see Pasting XML.
Breaking up the grammar (and converting back the ">" symbols to how they look on the screen):
ExitLine <- {| (" "+ Exit)+ |}
An exit line consists of one or more sets of spaces followed by an exit. The brackets "{| ... |}" mean to capture into a table.
Direction <- ( [NS] [EW] ) / [NSEWUD] "-"
A direction is meant to be either N or S followed by E or W, or one of "NSEWUD", with a hyphen afterwards whichever choice is taken. This makes sure that things like N in the middle of a description aren't considered an exit. (Strictly, sequence binds tighter than "/" in re syntax, so as written the hyphen is only required after the single-letter alternative; grouping it as ( [NS] [EW] / [NSEWUD] ) "-" would require it after both.)
Exit <- { Direction (. !Direction)+ }
An exit is a direction followed by one or more characters, each of which is not itself followed by another direction. Thus it gradually consumes characters, stopping just before the character that precedes the next direction. The braces mean that this is the thing that is captured.
Example output from your test:
N-Battered and gouged wooden door
S-Selduth Street
W-Inn of the Drip
U-Secon Floor of
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |

Posted by Soft (7 posts), Reply #7 on Wed 25 Jan 2017 11:50 PM (UTC), amended Thu 26 Jan 2017 12:37 AM (UTC) by Soft
EDIT: Oops, I refreshed the page from this morning and created a double post.
Thanks for your input Nick. I really appreciate both yours and Worstje's help in this matter. I'm attempting to implement both solutions now, so I can understand any pros/cons. | Top |

Posted by Soft (7 posts), Reply #8 on Thu 26 Jan 2017 01:06 AM (UTC)
After attempting to implement both of your solutions, I keep running into pattern matching issues.
With Worstje's solution, the variables "s, e, m" were always nil, even after tweaking and double-checking the regex.
With Nick's solution, I get a pattern error. I'm not sure if this is due to not importing the module properly, but I made sure to require tprint, lpeg (as well as character classes and functions, idk if that was necessary), and re on initialization.
pattern error near '| (" "+ Exit)+ |}
D...'
stack traceback:
[C]: in function 'error'
X:\MUSHclient\lua\re.lua:85: in function <X:\MUSHclient\lua\re.lua:81>
[C]: in function 'match'
X:\MUSHclient\lua\re.lua:200: in function 'compile'
[string "Trigger: "]:8: in main chunk
I've reviewed the line but nothing jumps out to me as different from the LPEG documentation that Nick linked.
require "re"
exitsGrammar = re.compile[[
ExitLine <- {| (" "+ Exit)+ |}
Direction <- ( [NS] [EW] ) / [NSEWUD] "-"
Exit <- { Direction (. !Direction)+ }
]]
exits = exitsGrammar:match ("%0")
for k, v in ipairs (exits) do
print (v)
end -- for
| Top |

Posted by Worstje (Netherlands, 899 posts), Reply #9 on Thu 26 Jan 2017 02:30 AM (UTC), amended Thu 26 Jan 2017 02:39 AM (UTC) by Worstje
I am not sure why my example wouldn't work. I tested it with the example text from your first post and I got the results you wanted out of it.
I'll see if I can figure it out.
Problem found: the forum ate some backslashes. Oddly enough they do appear whilst I am editing the post, but on the forum itself they don't show. Weird. I'm sure Nick will fix that issue at some point. :-)
Make sure there are a total of _4_ backslashes before both the s near the start as well as the w on the line that defines the exit_re variable. (Which means you need to add 2 extra in front of each.) | Top |
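The escaping layers discussed here are easier to see in isolation. The sketch below shows only the Lua layer: inside a quoted Lua string each regex backslash must be written twice, while a long-bracket string needs no doubling. Any further doubling needed by the forum or by the trigger's send box, as described in this thread, sits on top of that and is not modelled here. The variable names are illustrative.

```lua
-- Quoted string: "\\" in the source is ONE backslash in the value.
quoted_pattern = "\\s([NSEWUD]|[NS][EW])-\\w"

-- Long-bracket string: backslashes are taken literally, no doubling.
bracket_pattern = [[\s([NSEWUD]|[NS][EW])-\w]]

assert(quoted_pattern == bracket_pattern)
print(quoted_pattern)  -- prints \s([NSEWUD]|[NS][EW])-\w
```

Printing the pattern from inside the script is a quick way to check how many backslashes actually survived.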

Posted by Nick Gammon (Australia, 23,140 posts, Forum Administrator), Reply #10 on Thu 26 Jan 2017 03:20 AM (UTC)
You have to double backslashes for the forum. MUSHclient will do that for you. See Edit menu: Convert Clipboard Forum Codes.
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |

Posted by Soft (7 posts), Reply #11 on Thu 26 Jan 2017 04:37 AM (UTC)
Thanks so much for all your help guys, I got it working pretty much perfectly. There was one bug I noticed when directional text butts up against an ASCII minimap, but I don't think there's much to be done there, and the directions are still readable. I threw the plugin up on github and put a thank you to both of you in there. I really appreciate the help, and Nick I love MUSHclient. I'm blown away by your dedication over the years, awesome job well done.
https://github.com/codypersinger/MUSHclient_FK_Plugin | Top |

Posted by Nick Gammon (Australia, 23,140 posts, Forum Administrator), Reply #12 on Thu 26 Jan 2017 05:04 AM (UTC)
Thanks for the acknowledgements!
BTW, I notice from your screenshot you missed a possible easy way of handling it. The exit direction is a different colour to the description.
If you made a trigger script you get style runs as the fourth argument. By "walking" the styles you would probably find that you got:
yellow:direction
green:description |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
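The style-run idea can be sketched as follows. The styles table here is hand-built test data shaped like the fourth argument MUSHclient passes to a trigger script function (an array of runs, each carrying a text field among others). The function name exits_from_styles, and the assumption that each direction arrives as its own run with or without its hyphen, are mine rather than anything confirmed in this thread. The sketch relies only on where the runs break, not on specific colours.

```lua
local DIRECTIONS = { N = true, S = true, E = true, W = true, U = true, D = true,
                     NE = true, NW = true, SE = true, SW = true }

-- Walk the style runs: a run that is just a direction starts a new exit,
-- and any other run is appended to the current exit's description.
function exits_from_styles(styles)
  local exits = {}
  for _, run in ipairs(styles) do
    -- Trim the run and drop a trailing hyphen before the lookup.
    local word = run.text:gsub("^%s+", ""):gsub("%s+$", ""):gsub("%-$", "")
    if DIRECTIONS[word] then
      exits[#exits + 1] = { dir = word, desc = "" }
    elseif #exits > 0 then
      local last = exits[#exits]
      last.desc = last.desc .. run.text
    end
  end
  -- Tidy up: remove a leading hyphen and surrounding whitespace.
  for _, ex in ipairs(exits) do
    ex.desc = ex.desc:gsub("^%-", ""):gsub("^%s+", ""):gsub("%s+$", "")
  end
  return exits
end

-- Hypothetical style runs for part of the example line.
local sample = {
  { text = " " },
  { text = "N-" },
  { text = "Battered and gouged wooden door  " },
  { text = "S-" },
  { text = "Selduth Street" },
}
for _, ex in ipairs(exits_from_styles(sample)) do
  print(ex.dir, ex.desc)
end
```

Whether the game really splits its runs this way would need checking against live output, for example by printing each run's text from the trigger script.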

Posted by Soft (7 posts), Reply #13 on Thu 26 Jan 2017 06:43 AM (UTC)
You're right of course Nick. Unfortunately the color of the strings changes somewhat often depending on what area you are in, and I'm not familiar enough with the game to know how many possibilities there are.
The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).