regexp: match multiple consecutive whole words on the same line
It is now over 60 days since the last post. This thread is closed.
Posted by: Missingl (2 posts)
Date: Sat 15 Jul 2006 02:44 PM (UTC)
Message:
Hi all,
I'd like to write some regular expressions that match whole (specified) words on a line. For example, given the string:
You feel weaker as the power of the Strength spell wanes.
I wish to filter (mark / change the colour of) all instances of:
you,as,the,of,one,on
Now, if I use " (you|as|the|of|one|on) " (with "keep evaluating" and "repeat on same line" ticked) and two matching words occur next to each other, it only matches the first. If another word matches further along the line, however, it matches OK.
The reason for the spaces on either side of the regexp is so that it matches the entire word. I don't want it to match, for example, the "the" in "these", and therefore mark half the word.
Of course, one could capture all the words in the line, parse them via script, and have the output pop up elsewhere, but I think speed-wise a trigger alone would be a tad quicker.
Finally, awesome client, best I've come across yet. Excellent work, Nick.
Thanks for any help.
m-L
Posted by: Shadowfyr, USA (1,786 posts)
Date: Reply #1 on Sat 15 Jul 2006 07:18 PM (UTC), amended on Sat 15 Jul 2006 07:19 PM (UTC) by Shadowfyr
Message:
Umm.. Well, the reason for this is simple. You're telling it to do this:
string   = " on on one "
Search 1 = " on on one "
            ^  ^
Search 2 = "on one "
              ^   ^
Where ^ is where it finds the start and end of the next "match".
In other words, since you are specifically looking for the space, it's moving the point where it starts looking again "past" that space, so the next word no longer has a leading space to match. This should fix it (uses no spaces):
\b(you|as|the|of|one|on)\b
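The difference is easy to check outside MUSHclient with a short Python sketch. This is only an illustration under assumptions: Python's re module stands in for the PCRE engine, and re.finditer stands in for the "repeat on same line" option, since both look for non-overlapping matches from left to right.

```python
import re

WORDS = "you|as|the|of|one|on"

# Original attempt: a literal space on each side of the word.
spaced = re.compile(f" ({WORDS}) ")
# Suggested fix: zero-width word boundaries instead of spaces.
bounded = re.compile(rf"\b({WORDS})\b")

line = " on on one "

# finditer resumes scanning where the previous match ended.
spaced_hits = [m.group(1) for m in spaced.finditer(line)]
bounded_hits = [m.group(1) for m in bounded.finditer(line)]

print(spaced_hits)   # ['on', 'one']
print(bounded_hits)  # ['on', 'on', 'one']
```

The spaced pattern skips the middle "on" because the space in front of it was already used up by the first match.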
The explanation given in MUSHclient's documentation is from one of those horrible "If you are reading this, you should understand what I mean" documents you often get in the world of "professionals", who all speak Computer Nerd. Apparently I need to take a course in it, since I can't comprehend half the more obscure stuff in the regexp documents. lol
Just to make sure I understood it, though, I googled for a better explanation:
http://www.regular-expressions.info/wordboundaries.html
"\b" is basically a zero-width assertion that matches at a position where the character on one side is a \w (a letter, digit or underscore) and the character on the other side is a \W (space, punctuation, etc.) or the start or end of the line. It does not include the characters it tests in what it returns, so it should only match the word itself, meaning that the next search will start at the space following the word it found.
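To see that "\b" takes up no characters, here is a minimal Python sketch run against the example line from the first post. Again, Python's re module is only a stand-in for MUSHclient's regexp engine, and the IGNORECASE flag is an assumption added so that the capitalised "You" is caught (in a trigger the equivalent would be an ignore-case setting).

```python
import re

# Stand-in for the trigger; IGNORECASE is an assumption so "You" matches.
pattern = re.compile(r"\b(you|as|the|of|one|on)\b", re.IGNORECASE)

line = "You feel weaker as the power of the Strength spell wanes."

# Each match covers only the word; the neighbouring spaces are not consumed,
# so adjacent words such as "as the" and "of the" are both found.
matched = [line[m.start():m.end()] for m in pattern.finditer(line)]
print(matched)  # ['You', 'as', 'the', 'of', 'the']

# \b also matches at the very start of the line, where there is no space.
first = pattern.search(line)
print(first.span())  # (0, 3)

# There is no boundary inside a word, so partial words are left alone.
print(pattern.findall("these ones"))  # []
```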
Posted by: Nick Gammon, Australia (22,975 posts), Forum Administrator
Date: Reply #2 on Sat 15 Jul 2006 09:50 PM (UTC)
Message:
Posted by: Missingl (2 posts)
Date: Reply #3 on Sun 16 Jul 2006 02:21 PM (UTC)
Message:
Thanks guys,
That was a little too easy for you; I'm kinda embarrassed. Nonetheless, what you suggested worked perfectly, Shadowfyr, thank you.
Also, thanks for the link, Nick. Looking through the forums, I missed it. I'll be sure to have a thorough look at it.
Cheers,
m-L
The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).
Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.