LPeg pattern parsing


Posted by David Haley   USA  (3,881 posts)  Bio
Date Reply #15 on Sun 22 Feb 2009 07:00 PM (UTC)
Message
This is really interesting stuff. I'm a little surprised that we didn't learn about this in our compilers class, but what can you say. Maybe it's because PEGs are still kind of new, and it would appear that formal properties aren't worked out yet.

I read through some of this paper:
http://pdos.csail.mit.edu/~baford/packrat/popl04/

An interesting property of PEGs is that (unlike CFGs) they can always be parsed in linear time (with packrat-style memoization). Even more interesting, PEGs can describe some languages that CFGs cannot. And yet, since parsing CFGs in general takes more than linear time (as I understand it), that suggests that there are languages that CFGs can represent that PEGs cannot. However, the paper points out that (as of '04 at least) there is no example, let alone proof, of this.
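
The classic example of a language that no CFG can describe but a PEG can is a^n b^n c^n (equal numbers of a's, b's and c's). A minimal sketch of it in LPeg follows, assuming the lpeg module can be loaded with require; the names A, B and anbncn are made up for illustration.

```lua
local lpeg = require "lpeg"
local P, V = lpeg.P, lpeg.V

-- A matches a^n b^n and B matches b^n c^n (n >= 1), each by plain recursion.
local A = P { "A", A = P"a" * V"A"^-1 * P"b" }
local B = P { "B", B = P"b" * V"B"^-1 * P"c" }

-- The and-predicate #(...) checks for "a^n b^n followed by c" without
-- consuming input; then the a's are skipped and B checks that the b's and
-- c's balance. -P(1) insists on end of input.
anbncn = #(A * P"c") * P"a"^1 * B * -P(1)

print (anbncn:match ("aabbcc"))   -- matches
print (anbncn:match ("aabbc"))    -- fails
```

The look-ahead predicate is what takes this beyond a context-free grammar: two overlapping context-free checks are applied to the same stretch of input.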

But since PEGs can express all deterministic LR(k) languages -- an LR(k) language is one whose CFG can be parsed left-to-right, producing a right-most derivation in reverse, with k symbols of look-ahead -- this means in practice that PEGs are well suited to parsing any kind of machine-oriented language, since those are unambiguous by design. When you write parsers in yacc/bison, you need to tell the parser generator how to resolve ambiguities, although it has a default rule that is deterministic.
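
A PEG cannot be ambiguous because its choice operator is ordered: alternatives are tried left to right and the first one that matches wins. A small LPeg sketch of that behaviour, assuming lpeg is loadable with require (the pattern names are made up):

```lua
local lpeg = require "lpeg"
local P = lpeg.P

short_first = P"a" + P"ab"    -- once "a" matches, "ab" is never tried
long_first  = P"ab" + P"a"    -- "ab" is tried first, "a" is the fallback

-- match returns the position just after the matched text
print (short_first:match ("ab"))
print (long_first:match ("ab"))
```

So where yacc/bison would report a conflict and need a precedence rule, in a PEG the order in which the alternatives are written is the rule.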


I'll have to play around with this to see how easy it is to create parse trees and the like. Getting my hands on a linear-time parser driven by a complex grammar would be very, very nifty.

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Nick Gammon   Australia  (23,165 posts)  Bio   Forum Administrator
Date Reply #16 on Sun 22 Feb 2009 10:45 PM (UTC)
Message
Quote:

When you say it's built in, is it just compiled/linked into the exe, or are there any features that utilize it?


It is compiled into the exe, and made available to Lua scripting. Other than that, it is just waiting for you to use it. ;)

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by WillFa   USA  (525 posts)  Bio
Date Reply #17 on Sun 22 Feb 2009 11:04 PM (UTC)
Message
I guess it would need to be to maintain the language agnosticism. It'd be really nice tho if the trigger engine could use LPEG as well as PCRE.

The main speedbump I see at this point is that anything from the mud that you wanted to parse with LPEG, you need to identify with PCRE/MC patterns first, doubling the cpu workload, and knowledge required, and maintenance of triggers, etc.

If the areas where LPEG shines are all the areas where PCRE is inefficient, I can't see any gains by going through that bottleneck first.

Posted by Nick Gammon   Australia  (23,165 posts)  Bio   Forum Administrator
Date Reply #18 on Mon 23 Feb 2009 12:15 AM (UTC)
Message
It could be fiddly to get right without a lot of complexity. For example, for speed you would want to compile the lpeg expression once, not for every line that arrives.

I was thinking a simple trigger might match a room description (eg. "You see *").

Then the script could use lpeg to break that down into objects, decorations, other players, etc.
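
A rough sketch of that arrangement: the pattern is built once when the script file loads, and the trigger's script function only runs the match. The function name on_you_see, the assumption that the list arrives in wildcards[1], and the "x, y and z" list format are all made up for illustration. The require line is for standalone Lua; in MUSHclient lpeg is already available to scripts.

```lua
local lpeg = require "lpeg"
local P, C, Ct = lpeg.P, lpeg.C, lpeg.Ct

-- Built once at load time, not once per line from the MUD.
local sep  = P", " + P" and "          -- separators between items
local item = C ((1 - sep)^1)           -- everything up to the next separator
item_list  = Ct (item * (sep * item)^0) * -1

-- Hypothetical trigger script function for a trigger matching "You see *".
function on_you_see (name, line, wildcards)
  return item_list:match (wildcards [1])   -- table of items, or nil
end

local found = on_you_see ("room", "You see a sword, a shield and a goblin",
                          { "a sword, a shield and a goblin" })
print (table.concat (found, " | "))
```

An item whose own name contains " and " would be split in two by this sketch, so a real room parser would need a more careful grammar.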

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,165 posts)  Bio   Forum Administrator
Date Reply #19 on Sat 27 Jun 2009 10:24 PM (UTC)
Message
For a couple of examples of LPEG in action in a real-life situation:

Parse speedwalks:

http://www.gammon.com.au/forum/?id=9552


Parse an Aardwolf colour-coded string into text and RGB colours:

http://www.gammon.com.au/forum/?id=9529

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,165 posts)  Bio   Forum Administrator
Date Reply #20 on Mon 08 Nov 2010 01:58 AM (UTC)
Message
I saw an interesting post here:

http://valeriu.palos.ro/755/case-study-fixed-number-of-iterations-with-lpeg/

On that page Valeriu discusses the problems with LPEG matching an exact number of iterations, or a range of iterations. Compare to PCRE, where you can match on \d{1,3} (between 1 and 3 digits).



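function multiply(item, min, max)
  -- Build the sub-pattern once: item's capture followed by the end position.
  -- Assumes item produces exactly one capture (eg. lpeg.C (digits)).
  local step = item * lpeg.Cp()
  return lpeg.Cmt(lpeg.P(true),
    function(s, i)
      local set, offset, attempt = {}, i, 0
      for j = 1, max or min do
        set[j], attempt = lpeg.match(step, s, offset)
        if not attempt then
          if j > min then
             break         -- already have at least min matches
          else
             return false  -- too few matches, so fail
          end  -- if
        end  -- if
        offset = attempt
      end
      return offset, table.concat (set)  -- new position, plus the captured text
    end)  -- function
end -- multiply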


(Modified by me slightly).

Example of use:


digits = lpeg.R ("09")
hp = lpeg.P ("You gain ") * multiply (lpeg.C (digits), 1, 3) * " hp."
print (lpeg.match (hp, "You gain 300 hp."))  --> 300


Note that this doesn't match on a 4-digit hp value.

Compare to:


digits = lpeg.R ("09")
hp = lpeg.P ("You gain ") * lpeg.C(digits^1) * " hp."
print (lpeg.match (hp, "You gain 300 hp."))  --> 300


The original version (page 1 of this thread) doesn't have any way of limiting the number of digits you match on.
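
For a small fixed range like this, another option is to build the repetition from ordinary LPeg operators: patt^-n matches at most n occurrences, so "1 to 3 digits" is one mandatory digit followed by at most two more. A sketch, where the helper name between is made up and lpeg is assumed to be loadable with require:

```lua
local lpeg = require "lpeg"

-- Hypothetical helper: match patt at least min and at most max times.
function between (patt, min, max)
  local p = lpeg.P (true)
  for _ = 1, min do
    p = p * patt                 -- the mandatory repetitions
  end
  if max > min then
    p = p * patt^-(max - min)    -- up to (max - min) optional extras
  end
  return p
end

digits = lpeg.R ("09")
hp = lpeg.P ("You gain ") * lpeg.C (between (digits, 1, 3)) * " hp."
print (lpeg.match (hp, "You gain 300 hp."))    -- captures the digits
print (lpeg.match (hp, "You gain 3000 hp."))   -- no match on 4 digits
```

The max > min test matters because patt^-0 is the same as patt^0, which means "zero or more" rather than "none". Unlike the Cmt version, this makes the pattern itself grow with min and max, so it suits small counts best.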

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,165 posts)  Bio   Forum Administrator
Date Reply #21 on Mon 08 Nov 2010 02:27 AM (UTC)
Message
Oh, and LPEG is updated to version 0.10 in version 4.67 of MUSHclient. This adds a few new features, including better error messages.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Anaristos   USA  (24 posts)  Bio
Date Reply #22 on Tue 23 Aug 2011 02:54 AM (UTC)
Message
It seems that the following functions are missing from LPEG:
Cf, Cg, locale and setmaxstack.
The fact that locale is missing is particularly painful because this prevents the re module from loading. I don't think it is your problem in particular since I've tried using other sources for the LPEG module, with the same result.

Sic itur ad astra

Posted by Nick Gammon   Australia  (23,165 posts)  Bio   Forum Administrator
Date Reply #23 on Tue 23 Aug 2011 03:16 AM (UTC)
Message
I don't know what you mean by this. For a start, if I do this:


require "tprint"
tprint (lpeg)


I see this:


"C"=function: 01914218
"B"=function: 019141C8
"print"=function: 01BB7120
"match"=function: 01BB70F8
"locale"=function: 01BB7148
"Cg"=function: 01914308
"Cc"=function: 019142B8
"Cb"=function: 018F4748
"Cmt"=function: 018F47E8
"Carg"=function: 018F4798
"setmaxstack"=function: 01BB7198
"S"=function: 019038D8
"R"=function: 01903888
"V"=function: 01903928
"type"=function: 01903978
"version"=function: 019039A0
"P"=function: 01903838
"Ct"=function: 018F4888
"Cp"=function: 018F46F8
"Cs"=function: 018F4838
"Cf"=function: 01914268


The functions you mention are there.

Also I don't know what you mean by "this prevents the re module from loading". I retested the examples on page 1 of this thread without any problems, eg.



require "re"

exits = re.compile[[
  ExitLine         <- "You see exits leading "  <CommaDirections>? <Directions> 
  CommaDirections  <- <Directions> (", " <Directions>)*  " and "
  Directions       <- { "north" / "south" / "east" / "west" / "up" / "down" }
]]


print ( exits:match ("You see exits leading north, up, down, west and south") )


- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Anaristos   USA  (24 posts)  Bio
Date Reply #24 on Tue 23 Aug 2011 03:50 AM (UTC)

Amended on Tue 23 Aug 2011 06:18 AM (UTC) by Anaristos

Message
When I try to use lpeg.Cf or lpeg.Cg I get the "attempt to call field 'Cf'/'Cg' a nil value" message. When I try to load the re module, I get the same message for 'locale'. I also get the same message for lpeg.setmaxstack. I've been trying to do this both under MUSHclient and using the Lua command line app. I can use anything else LPEG has, but not those.

This is what I get when using your sample:

Stack trace:
.\re.lua:36:in function "updatelocale"
.\re.lua:67:in main chunk
[C]:-1:in function "require"
[string "code"]:1:in main chunk"

When I run the tprint, I don't get the function names listed, but something is obviously missing, since the list only contains 18 entries.

Sic itur ad astra

Posted by Nick Gammon   Australia  (23,165 posts)  Bio   Forum Administrator
Date Reply #25 on Tue 23 Aug 2011 04:47 AM (UTC)
Message
Can you post some example code that demonstrates this?

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Anaristos   USA  (24 posts)  Bio
Date Reply #26 on Sat 27 Aug 2011 08:52 AM (UTC)
Message
OK, sorry for the bother. I compiled lpeg-0.10.2, and with it everything works fine.
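
For anyone seeing the same symptoms, a quick check of which LPeg build a script has actually loaded, and whether the four functions named above are present, might look like the sketch below. It assumes lpeg is loadable with require; the variable names are made up. The version test is written defensively because lpeg.version is a function in the releases discussed here but, as far as I know, a plain string in later ones.

```lua
local lpeg = require "lpeg"

-- Accept either form of lpeg.version (function or string).
local v = lpeg.version
lpeg_version = type (v) == "function" and v () or v

-- Collect the names of any of the reported functions that are absent.
missing = {}
for _, name in ipairs { "Cf", "Cg", "locale", "setmaxstack" } do
  if type (lpeg [name]) ~= "function" then
    missing [#missing + 1] = name
  end
end

print ("LPeg version:", lpeg_version)
print ("Missing:", #missing > 0 and table.concat (missing, ", ") or "none")
```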

Sic itur ad astra

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).


Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.