string matching
It is now over 60 days since the last post. This thread is closed.
Posted by
| Nobody
(38 posts) Bio
|
Date
| Tue 10 Oct 2006 07:29 AM (UTC) |
Message
| I'm probably being ridiculously stupid, but is there any way to ignore case when using Lua's built-in string routines? | Top |
|
Posted by
| Nick Gammon
Australia (23,120 posts) Bio
Forum Administrator |
Date
| Reply #1 on Tue 10 Oct 2006 09:53 AM (UTC) |
Message
| Well, a simple way is to convert the strings to upper (or lower) case first.
eg.
a = "nick"
b = "NICK"
a = a:upper()
b = b:upper()
print (a == b) --> true |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Nobody
(38 posts) Bio
|
Date
| Reply #2 on Wed 11 Oct 2006 02:16 AM (UTC) |
Message
| yeah I already thought of that. The problem though, is that I'm doing substring finding, and substring matching and replacing, and I don't want to screw up the original texts that I'm using for comparing and contrasting.
Looks like I just have to write my own Lua functions that can handle this properly. :-\ | Top |
|
Posted by
| Nick Gammon
Australia (23,120 posts) Bio
Forum Administrator |
Date
| Reply #3 on Wed 11 Oct 2006 04:50 AM (UTC) |
Message
| Well, if you want to see if a string matches another, case-insensitive, do this:
s = "Nick Gammon is here"
print (string.match (s:upper (), "NICK")) --> NICK
This hasn't changed the source string. Or if you want to pull out the matching stuff, but you aren't sure what it is:
s = "I see Nick Gammon is here"
st, en = string.find (s:upper (), "N..K") --> find where it matches
print (string.sub (s, st, en)) --> Nick
Here I forced the string to upper case, did an upper case match, found the start and end column, and then pulled out the original string from the source string. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
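The same idea (search an upper- or lower-cased copy, then cut the result out of the original) can be stretched to cover replacement as well. The following is only a minimal sketch: replace_nocase is a made-up helper name, the needle is treated as plain text rather than a pattern, and it relies on string.lower leaving the string length unchanged so that positions found in the copy line up with the original.

```lua
-- Replace every case-insensitive occurrence of a plain-text needle,
-- leaving the rest of the original text untouched.
-- replace_nocase is a hypothetical helper, not part of Lua or MUSHclient.
-- "replacement" is either a string, or a function that receives the
-- matched text in its original case and returns the new text.
local function replace_nocase (text, needle, replacement)
  if needle == "" then return text end     -- nothing to search for
  local lowered = text:lower ()            -- search copy, same length as text
  local target = needle:lower ()
  local pieces = {}
  local pos = 1
  while true do
    -- 4th argument true = plain find, no pattern characters interpreted
    local st, en = lowered:find (target, pos, true)
    if not st then break end
    pieces [#pieces + 1] = text:sub (pos, st - 1)   -- original text before the match
    local r = replacement
    if type (replacement) == "function" then
      r = replacement (text:sub (st, en))           -- pass the original-case match
    end
    pieces [#pieces + 1] = r
    pos = en + 1
  end
  pieces [#pieces + 1] = text:sub (pos)    -- whatever follows the last match
  return table.concat (pieces)
end

print (replace_nocase ("A Long Sword and a long sword", "a long sword", "<item>"))
--> <item> and <item>

print (replace_nocase ("You see A long sword.", "a long sword",
       function (m) return m .. " [10d10 +1000hp]" end))
--> You see A long sword [10d10 +1000hp].
```

Passing a function as the replacement keeps whatever capitalization the MUD sent, which is the part the straight upper-casing approach loses.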
Posted by
| Nobody
(38 posts) Bio
|
Date
| Reply #4 on Wed 11 Oct 2006 07:44 AM (UTC) |
Message
| While I appreciate your efforts, it's not going to work out that simply. Some of it will, some of it won't, and it's the stuff that won't that's going to cause me a lot of problems.
I've further noticed that the string functions in Lua are pretty limited and disappointing. For example, when using string.find with a regular expression, how do I access the results and the data?
ie. string.find("A long sword", "A long (.*)$")
I understand the return value will be 3 items, the start position, the end position, and the data within the brackets. What I want is a way to reference the data within the brackets.
Now, the simple solution is to simply give it to 3 variables, like so:
nil, nil, item = string.find(...)
This seems like a painful and convoluted way of doing it.
I was looking through Google for more help, when I stumbled across a very interesting site (well, CPAN) that told how to access Lua variables and scripts from within Perl. This is an ideal solution for me, because I already have Perl scripts that can do all the processing I want. I'm trying to load a lot of data into a Lua table, but I can do it in Perl easily (but Lua sux for handling text - sorry, but it does). If I could load the data into a Perl hash, convert it into a Lua table (within Perl) and then dump those tables to a text file which can be read in (in binary form) by a pure Lua script, it'd save me about 3 years of work getting Lua to behave how I want. Is there some kind of marshalling code in Lua tables that would work? | Top |
|
Posted by
| Nick Gammon
Australia (23,120 posts) Bio
Forum Administrator |
Date
| Reply #5 on Wed 11 Oct 2006 08:16 AM (UTC) |
Message
| It isn't quite that bad. To answer your first question, you use string.match which is new in Lua 5.1:
a = string.match ("A long sword", "A long (.*)$")
print (a) --> sword
Or, if you are using an older version of Lua, for some reason:
_, _, a = string.find ("A long sword", "A long (.*)$")
print (a) --> sword
In this case the variable "_" is a temporary that you don't care about the value of.
I can't believe that you will have huge problems getting Lua to do what you want. For instance, I just did the new spellchecker for MUSHclient using Lua.
If you give a concrete example of what you are trying to do, it might help.
The string matching is really quite powerful. Take this for example:
a, b = string.match ("A long sword", "A (.*) (.*)$")
print (a) --> long
print (b) --> sword
Here you are getting two matches back from a single string.match.
|
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Nobody
(38 posts) Bio
|
Date
| Reply #6 on Wed 11 Oct 2006 09:54 AM (UTC) |
Message
| Thanks again Nick, learning a new language is always frustrating.
At the moment, I'm doing 2 things. The first is my replacement of items with items and their stats:
a long sword.
gets changed to:
a long sword [10d10 +1000hp]
kind of thing. I have a database of items which I'm loading into a Lua table (this loading is actually the cause of a lot of the problems - more on that later). This database has "a long sword" as its description and it matches this against the lines from the mud. Where it fails is "A long sword", which is why I asked about case-less matching. The reason I didn't want to use the stuff you mentioned above is that I wanted to maintain case (and not have to change all the stuff in my database to upper or lower case). A solution (which I haven't been able to figure out) would be some kind of regexp that checks for a string, but ignores the case of the first letter. I haven't been able to think of one, but regexps aren't really my strong point.
The second problem related to matching text is in loading the database files. Example database file:
Item 'sword of doom'
Damage: 10d10
Affects HITROLL by +1000
Flags: BLINDNESS
Causes: Disease
Item 'shield of stoppage'
Gives +100 ARMOUR
Flags: AWESOMENESS
Using lua to load this datafile, I end up with a long chain of lines like this:
_,_,a = string.find(line, "Item '(.*)'$")
if (a) then
  item["title"] = a
end
Now, that is to fill in 1 part of a database (the title). I'd need a similar section for damage, flags, and so on. I added them up, and there are about 30 possible things that can be included in the database file, which means I'll have to daisy chain about 20 of them one after the other like that. And the complicated part is when they have multiple lines of Affects (for example) and I have to check to see I'm not overwriting anything before putting it into a secondary affects section. It just seems like a horribly convoluted way of doing it.
If you can think of any way of improving and/or streamlining this, please let me know.
Thanks. | Top |
|
Posted by
| Nick Gammon
Australia (23,120 posts) Bio
Forum Administrator |
Date
| Reply #7 on Wed 11 Oct 2006 08:52 PM (UTC) |
Message
| It isn't Lua that is the problem. You just need to think through a neat way of doing it. Here is an example that solves that problem for you. In a single pass it processes that sort of file, building it into an in-memory database (a Lua table). Here is the code (I ran it in the Immediate window of MUSHclient):
-- the database
db = {}

local multiples = {}
for _, v in ipairs {
  "Affects", "SomeOtherThing"
  } do multiples [v] = true end

function add_line (s)
  local name = string.match (s, "^Item%s+'(.*)'$")

  -- if not nil, we have started a new one
  if name then
    item = { Item = name } -- new table of attributes
    db [name:lower()] = item -- add to database
    return
  end -- found new item

  -- get stuff like: Damage: 10d10
  local attribname, attrib = string.match (s, "^(%w+):?%s+(.*)$")
  if not attrib then
    return
  end -- nothing? maybe blank line

  -- things like affects need a secondary table as there may be multiples
  if multiples [attribname] then
    item [attribname] = item [attribname] or {} -- make a sub table
    table.insert (item [attribname], attrib) -- add to table
  else
    item [attribname] = attrib
  end -- if
end -- add_line

for line in io.lines ("test.txt") do
  add_line (line)
end -- for

-- test it
require "tprint"
print (string.rep ("-", 50))
tprint (db)
You need to change the filename from "test.txt" to whatever it is.
Your database looks a bit strange, I wasn't sure if that was typos or not. I tried it on this:
Item 'sword of doom'
Damage: 10d10
Affects: HITROLL by +1000
Affects: DAMAGE by -50
Flags: BLINDNESS
Causes: Disease
Item 'shield of stoppage'
Gives: +100 ARMOUR
Affects: HP by +30
Affects: MANA by -50
Affects: ARMOR by -100
Flags: AWESOMENESS
Note that in my case I put a colon after everything (except the Item line). You had it after Damage and Flags but not Affects and Gives. If I guessed wrong, just change the regexp slightly. Actually I did that now, I made the colon optional.
Anyway, the code "knows" about things that might have multiples, see the "multiples" table above.
For those things it builds a sub-table, which can have any number of entries.
Running that on that test data gives these results:
"sword of doom":
  "Flags"="BLINDNESS"
  "Item"="sword of doom"
  "Affects":
    1="HITROLL by +1000"
    2="DAMAGE by -50"
  "Causes"="Disease"
  "Damage"="10d10"
"shield of stoppage":
  "Flags"="AWESOMENESS"
  "Gives"="+100 ARMOUR"
  "Item"="shield of stoppage"
  "Affects":
    1="HP by +30"
    2="MANA by -50"
    3="ARMOR by -100"
I have indexed each item by the lower-case name, so given an item, you can always look it up by forcing the item name to lower case. For example:
tprint (db [string.lower ("SWORD OF DOOM")])
--> gives
"Flags"="BLINDNESS"
"Item"="sword of doom"
"Affects":
  1="HITROLL by +1000"
  2="DAMAGE by -50"
"Causes"="Disease"
"Damage"="10d10"
The original name (with whatever its capitalization is) is also inside the table under "Item".
|
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Nobody
(38 posts) Bio
|
Date
| Reply #8 on Thu 12 Oct 2006 11:07 AM (UTC) |
Message
| Thanks for the code, I haven't implemented it yet (been busy all today) but I think I can step through it and follow what's happening.
The thing I wanted to show with my wacky item database was that the lines in it aren't always uniform. Some are, but a lot aren't; they aren't always going to have the 'predictable' parts (if you get my meaning). In fact, some of them (instead of having the stats of the item) give a little message that hints at its prowess, but which is otherwise a message that could contain anything (including multiple lines), and I was trying to show that a program which parses this file needs to be able to handle these things.
I'll probably try and implement it tomorrow when I have time, and if I have further difficulties I'll be sure to ask here. Thanks again for the code.
Incidentally, did you have any thoughts on my first problem (the case-less matching of text and replacing)? | Top |
|
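One possible way to cope with the free-form lines described in reply #8 is to treat anything that is neither an Item header nor a "Key: value" line as a note belonging to the current item. The sketch below is separate from the add_line code in reply #7 and rests on assumptions: parse_items and the "Notes" key are invented for illustration, keyed lines are assumed to always carry a colon (so a free-form line such as "Warning: it bites" would be mistaken for an attribute), and repeated keys like Affects are not handled here.

```lua
-- parse_items and the "Notes" key are hypothetical names for illustration.
-- Input is a list of lines so the example runs without a data file.
local function parse_items (lines)
  local db, item = {}, nil
  for _, s in ipairs (lines) do
    local name = s:match ("^Item%s+'(.*)'$")
    local key, value = s:match ("^(%w+):%s+(.*)$")   -- colon required here
    if name then
      item = { Item = name, Notes = {} }             -- start a new item
      db [name:lower ()] = item
    elseif item and key then
      item [key] = value                             -- ordinary attribute
    elseif item and s:match ("%S") then
      -- free-form text: keep it verbatim, in order, under Notes
      table.insert (item.Notes, s)
    end                                              -- blank lines are skipped
  end
  return db
end

local db = parse_items {
  "Item 'sword of doom'",
  "Damage: 10d10",
  "It hums with a terrible power.",
  "Only the brave should wield it.",
  "",
  "Item 'shield of stoppage'",
  "Flags: AWESOMENESS",
}

print (db ["sword of doom"].Damage)       --> 10d10
print (#db ["sword of doom"].Notes)       --> 2
print (db ["sword of doom"].Notes [1])    --> It hums with a terrible power.
```

Multi-line messages simply become consecutive entries in the Notes list, so they can be joined back together with table.concat when displayed.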
Posted by
| Tsunami
USA (204 posts) Bio
|
Date
| Reply #9 on Thu 12 Oct 2006 07:17 PM (UTC) |
Message
| I was having a similar problem a little while ago. Given a chunk of text, I wanted to insert linebreak characters every x (80) characters, but preserving words. I found a way around my problem which didn't necessitate having to do this, but before I found it, I was still looking into Lua string manipulation a bit, and accomplishing something like this seemed quite hard. | Top |
|
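For the line-wrapping problem mentioned in reply #9, a word-by-word approach is one option. This is a rough sketch rather than anything from the thread: wrap is a hypothetical helper, runs of whitespace are collapsed to single spaces, and a word longer than the width is left unbroken on a line of its own.

```lua
-- wrap is a hypothetical helper: break text into lines of at most
-- "width" characters without splitting words.
local function wrap (text, width)
  local lines, line = {}, ""
  for word in text:gmatch ("%S+") do
    if line == "" then
      line = word                           -- first word on a fresh line
    elseif #line + 1 + #word <= width then
      line = line .. " " .. word            -- word still fits on this line
    else
      lines [#lines + 1] = line             -- line is full, start another
      line = word
    end
  end
  if line ~= "" then lines [#lines + 1] = line end
  return table.concat (lines, "\n")
end

print (wrap ("The quick brown fox jumps over the lazy dog", 10))
--> The quick
--> brown fox
--> jumps over
--> the lazy
--> dog
```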
Posted by
| Nick Gammon
Australia (23,120 posts) Bio
Forum Administrator |
Date
| Reply #10 on Thu 12 Oct 2006 11:27 PM (UTC) |
Message
|
Quote:
... did you have any thoughts on my first problem (the case-less matching of text and replacing)
It depends on what you are trying to do. I gave an example near the top of this thread of doing a string.find on the strings converted to upper-case, this tells you if they match or not. If you want the exact matching string (before the case conversion) you can get the columns, like in my example.
Using regular expressions, you can match on "Door" or "door" by matching on: "[Dd]oor".
|
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
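The "[Dd]oor" idea can be generated from a database entry, which covers the earlier question about ignoring the case of the first letter only. A minimal sketch, assuming the entry is plain text: first_letter_nocase is an invented helper name, and it escapes Lua's pattern magic characters first so that something like "+1" in an item name is matched literally.

```lua
-- first_letter_nocase is a hypothetical helper name.
local function first_letter_nocase (s)
  -- escape pattern magic characters so the rest is matched literally
  local escaped = s:gsub ("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0")
  -- turn a leading letter into a two-character set, e.g. "a" -> "[aA]"
  return (escaped:gsub ("^%a", function (c)
    return "[" .. c:lower () .. c:upper () .. "]"
  end))
end

local pattern = first_letter_nocase ("a long sword")
print (pattern)                                           --> [aA] long sword
print (string.match ("A long sword lies here", pattern))  --> A long sword
print (string.match ("a long sword lies here", pattern))  --> a long sword
print (string.match ("A Long Sword lies here", pattern))  --> nil
```

Only the first letter is relaxed, so the database entries can stay in whatever case they already use.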
Posted by
| Onoitsu2
USA (248 posts) Bio
|
Date
| Reply #11 on Fri 13 Oct 2006 08:36 AM (UTC) |
Message
|
function nocase (s)
  s = string.gsub(s, "%a", function (c)
        return string.format("[%s%s]", string.lower(c),
                                       string.upper(c))
      end)
  return s
end -- nocase
print(nocase("Hi there!"))
--> [hH][iI] [tT][hH][eE][rR][eE]!
I found that on a website I ran across. It builds the caseless pattern for you: instead of printing the result as in the example above, assign it to a variable and pass that to the string functions as the pattern.
It is from the online version of the first edition of:
'Programming in Lua' by Roberto Ierusalimschy
Hope that helps a little :)
You can look at it at http://www.lua.org/pil/
Laterzzz,
Onoitsu2 | Top |
|
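To show how the generated pattern might be used, here is a short sketch with string.find and string.gsub. It restates nocase in compact form so the example stands alone. One assumption to flag: nocase does not escape pattern magic characters, so the text passed in is assumed to contain only letters, digits and spaces.

```lua
-- Compact restatement of nocase so this example is self-contained.
local function nocase (s)
  return (s:gsub ("%a", function (c)
    return "[" .. c:lower () .. c:upper () .. "]"
  end))
end

local pattern = nocase ("long sword")   -- [lL][oO][nN][gG] [sS][wW][oO][rR][dD]
local line = "You see a LONG Sword here."

-- find returns positions in the untouched line, so the case is preserved
local st, en = line:find (pattern)
print (line:sub (st, en))               --> LONG Sword

-- %0 in the replacement stands for the text that matched, original case intact
local annotated = line:gsub (pattern, "%0 [10d10]")
print (annotated)                       --> You see a LONG Sword [10d10] here.
```

Because the source line is never case-converted, nothing has to be copied back out of an upper- or lower-cased version.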
Posted by
| Mahony
(27 posts) Bio
|
Date
| Reply #12 on Mon 08 Sep 2014 08:39 PM (UTC) |
Message
| I'm trying to search a string and I need to ignore case like grep -i in Linux does. So when I'm looking for dog or DOg I would like to find dog Dog DOG...
Is it possible with string.match somehow? Some simpler way than writing/using a function nocase...
Thank you | Top |
|
Posted by
| Nick Gammon
Australia (23,120 posts) Bio
Forum Administrator |
Date
| Reply #13 on Mon 08 Sep 2014 08:48 PM (UTC) |
Message
| Force the entire string to upper or lower case and then search that. Make a copy if you don't want to destroy the original.
eg.
target = "I see a DoG here"
print (string.match (target:lower (), "dog"))
Since we force "target" to be lower case, then we just search for lower-case "dog". |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
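Applied to a list of lines, the lower-casing approach gives something loosely comparable to grep -i. This is a sketch only: grep_i is an invented name, and the needle is treated as literal text (a plain find) rather than as a Lua pattern.

```lua
-- grep_i is a hypothetical helper name, loosely modelled on "grep -i".
local function grep_i (lines, needle)
  local wanted = needle:lower ()
  local hits = {}
  for _, line in ipairs (lines) do
    -- plain find: the needle is literal text, not a pattern
    if line:lower ():find (wanted, 1, true) then
      hits [#hits + 1] = line           -- keep the line in its original case
    end
  end
  return hits
end

local hits = grep_i ({ "I see a DoG here", "A cat sleeps", "DOGS bark" }, "DOg")
print (table.concat (hits, "\n"))
--> I see a DoG here
--> DOGS bark
```

Both the line and the needle are lowered, so it does not matter which case the search term is typed in.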
The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).