LPEG
LPEG syntax takes a little bit to get used to. It's all math operators that are used in a way that makes sense when you get the hang of it, but will trip you up at first.
LPEG doesn't use .. for concatenation ("hi" .. " " .. "there" in plain Lua, 'hi ' 'there' in RE). It uses * instead.
I finally started remembering that when I realized that repetitions in LPEG (e.g. PCRE's {1,2}) are exponents: ^
So a * a is like a^2. That's just math. Okay.
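Here's a small sketch of the multiplication/exponent analogy, assuming the lpeg module can be loaded with require. The pattern names (twice, at_least_two) are made up for illustration. One wrinkle it shows: a^2 really means "two or more", so it is a little greedier than a * a.

```lua
local lpeg = require "lpeg"
local P = lpeg.P

local a = P"a"

-- Concatenation: "a" followed by "a".
local twice = a * a
-- Repetition: a^2 matches two OR MORE "a"s (like PCRE a{2,}).
local at_least_two = a^2

-- match returns the position just after the match, or nil on failure.
print(twice:match("aa"))          -- 3
print(twice:match("aaaa"))        -- 3 (only the first two are consumed)
print(at_least_two:match("aaaa")) -- 5 (greedy: consumes all four)
print(at_least_two:match("a"))    -- nil
```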
Since LPEG uses math operators, "Please Excuse My Dear Aunt Sally" applies when parsing (i.e. order of operations, precedence, etc.):
( ) are still grouping.
^n is repetition, though ^n is the same as PCRE's {n,} (at least n matches).
^-n is like PCRE's {,n} (at most n matches).
* is concatenation.
/ applies a function to a pattern's captures (p / f calls f with whatever p captured).
+ is OR (ordered choice). wth? Well, something's gotta be OR, and with the multiplication -> exponent analogy, it grows on ya.
p - q is a difference (everything in p except q).
-p is negation (not p: it succeeds only where p does not match, and consumes nothing).
So a * a + b is "a twice or b", not "a and either a or b". Use parens if you need to clarify, as usual.
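The operator rules above can be checked with a few throwaway patterns. This is only a sketch, assuming lpeg loads with require; the names p1, p2, upto2, letter_but_not_x and not_a are mine, not part of any grammar in this post.

```lua
local lpeg = require "lpeg"
local P, R = lpeg.P, lpeg.R

local a, b = P"a", P"b"

-- * binds tighter than +, so this is (a * a) + b: "a twice, or b".
local p1 = a * a + b
-- Parentheses change the meaning: "a, then either a or b".
local p2 = a * (a + b)

print(p1:match("b"))   -- 2
print(p2:match("b"))   -- nil
print(p2:match("ab"))  -- 3

-- ^-n: at most n repetitions (PCRE {,n}); it may match zero times.
local upto2 = a^-2
print(upto2:match("aaaa")) -- 3
print(upto2:match("bbb"))  -- 1 (empty match at the start)

-- p - q: match p only where q does not match.
local letter_but_not_x = R"az" - P"x"
print(letter_but_not_x:match("q")) -- 2
print(letter_but_not_x:match("x")) -- nil

-- -p: succeeds, consuming nothing, only where p does not match.
local not_a = -a
print(not_a:match("b")) -- 1
print(not_a:match("a")) -- nil
```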
The documentation explains the functions in more detail, so I'll gloss over most of them. In my code below I use the following functions:
P for literal text, except for the initial P{ ... } call, which compiles the whole grammar.
V for specifying a subpattern. (V"name" is <name> in RE)
Cg is capture group. Cg(a * a + b, 'foo') is {:foo: a a / b:} in RE. (i.e. to pass more than one argument to a function, or name a table key)
Ct is Capture to a table. {a} -> {} in RE
S is a set, like [abcde] in PCRE.
R is a range, like [a-z] in PCRE.
C is a simple capture. (don't need a group, one's fine.)
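To make the list above a bit more concrete, here is a tiny sketch that uses each of those functions once. It assumes lpeg loads with require, and the patterns (digits, sign, pair, list) and sample strings are invented for the example.

```lua
local lpeg = require "lpeg"
local P, V, S, R, C, Cg, Ct = lpeg.P, lpeg.V, lpeg.S, lpeg.R, lpeg.C, lpeg.Cg, lpeg.Ct

local digits = R"09"^1          -- one or more characters in the range 0-9
local sign   = S"+-"            -- one character from the set {+, -}

-- C returns the matched text as a capture.
print(C(sign^-1 * digits):match("-42"))   -- -42 (as a string)

-- A named Cg only shows up inside Ct, where it becomes a table key.
local pair = Ct( Cg(C(R"az"^1), "key") * P"=" * Cg(digits / tonumber, "value") )
local t = pair:match("hp=1346")
print(t.key, t.value)                     -- hp    1346

-- P given a table compiles a grammar; V"name" refers to one of its rules.
local list = P{ "list",
  list = Ct( V"item" * (P"," * V"item")^0 ),
  item = C(digits),
}
print(table.concat(list:match("1,22,333"), " "))  -- 1 22 333
```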
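The 2 new concepts below are:
Cc is a Constant capture. It matches the empty string but produces the captured value you specify.
and
Cf is a Folding capture. It feeds the first capture and then each later capture, in turn, through a function, carrying the result forward as an accumulator.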
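A minimal sketch of both, separate from the real grammar and assuming lpeg loads with require. The yes and sum patterns are just illustrations; the summing fold is the usual textbook use of Cf.

```lua
local lpeg = require "lpeg"
local P, R, Cc, Cf = lpeg.P, lpeg.R, lpeg.Cc, lpeg.Cf

-- Cc consumes no input but still produces a capture value.
local yes = P"yes" * Cc(true)
print(yes:match("yes"))            -- true

-- Cf folds captures left to right: the first capture seeds the accumulator,
-- then the function is called once per remaining capture.
local number = R"09"^1 / tonumber
local sum = Cf( number * (P"," * number)^0,
                function (acc, n) return acc + n end )
print(sum:match("10,20,30"))       -- 60
```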
do
  local lpeg = require "lpeg"
  lpeg.locale(lpeg)  -- adds lpeg.alpha, lpeg.digit, etc.
  local P, V, Cg, Ct, Cc, S, R, C, Cf, Cb =
    lpeg.P, lpeg.V, lpeg.Cg, lpeg.Ct, lpeg.Cc, lpeg.S, lpeg.R, lpeg.C, lpeg.Cf, lpeg.Cb
  local digit = lpeg.digit
  local alpha = lpeg.alpha
  local flag = lpeg.alpha + lpeg.S"*"

  -- "Macros" for common phrases in the grammar.
  -- Tn: match rule s and store its capture under the key s.
  local function Tn (s) return lpeg.Cg(lpeg.V(s), s) end
  -- namekeys: fold function for Cf; stores v (as a number if possible) under k.
  local function namekeys (t, k, v) t[k] = tonumber(v) or v return t end
  -- num: one or more digits, converted to a number and stored under name.
  local function num (name) return Cg(digit^1 / tonumber, name) end
  -- perc: adds a percent field computed from current/max.
  local function perc (x) x.percent = math.floor((x.current/x.max)*100) return x end
  -- group: "current/max" captured as a table with a percent, stored under name.
  local function group (name) return Cg(Ct( num"current" * P'/' * num"max" ) / perc, name) end
  -- vper: the value is already a percentage, so just copy it.
  local function vper (x) x.percent = x.current return x end

  LPEG = P { "top",
    top = Ct( Tn"Stats" *
              Tn"Prots" *
              Tn"Performing" *
              Tn"Combat" *
              Tn"GXP"
            ),
    Stats = Ct(V"HPtoken" * V"Ktoken" * V"SPtoken" * V"Vtoken"),
    HPtoken = P" HP: " * group"HP" * P" ",
    Ktoken = P" K: " * group"Karma",
    SPtoken = P" SP: " * group"SP",
    Vtoken = Cg(Ct( P" V: " * num"current" * P"%" * Cg(Cc(100), "max") ) / vper, "Voice"),
    -- Fold every <abil> into one table, starting from an empty one.
    Prots = P" [[ " * Cf(Ct("") * V"abil"^1, namekeys) * P" ]]",
    abil = V"dur" + V"present" + P" " * V"abil",
    dur = Cg( C(alpha^1) * P":" * C(digit^1) ),
    present = Cg( C(flag^1) * Cc(true) ),
    Performing = Ct( (P" P: " * num"Duration") + P" " ),
    Combat = Ct( ( P" T:" * V"ehealth" * P" " ) + P" " ),
    ehealth = ( num"enemyHealth" * P"%" ) + Cg(alpha^1, "EnemyHealth"),
    GXP = Ct( P" " * num"toSpend" * P"/" * num"toNext" * P" (" * num"nextPercent" * P"%)" ),
  }
end
Okay! I think that actually reads a bit more easily without worrying about < and >. Sure, you still have ambiguously named variables, but the general coding structure is cleaner.
So, new things in here over RE (and almost all of RE should be new. Sorry for not explaining them better.)
I've used some functions to work as macros for common "phrases" in my grammar.
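To see what such a "macro" does on its own, here is a cut-down sketch that reuses num, perc and group from the grammar above. It assumes lpeg loads with require; the stats pattern and the sample string are made up for the test. Each helper call just builds and returns an ordinary pattern.

```lua
local lpeg = require "lpeg"
local P, R, Cg, Ct = lpeg.P, lpeg.R, lpeg.Cg, lpeg.Ct
local digit = R"09"

-- Same helpers as in the grammar: each call returns a new pattern.
local function num (name) return Cg(digit^1 / tonumber, name) end
local function perc (x) x.percent = math.floor((x.current/x.max)*100) return x end
local function group (name)
  return Cg(Ct( num"current" * P"/" * num"max" ) / perc, name)
end

-- Hypothetical one-field prompt, just to exercise group"HP".
local stats = Ct( P"HP: " * group"HP" )
local t = stats:match("HP: 1346/1351")
print(t.HP.current, t.HP.max, t.HP.percent)  -- 1346  1351  99
```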
The Cf function takes an empty table and the two captures returned from <abil> (either <dur> or <present>) and calls namekeys with them.
namekeys returns that table, and Cf then uses the returned table instead of the empty one to make a second call to namekeys with the results from the next match of <abil> (thus a space and more <dur> or <present>).
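The fold described above can be tried in isolation. This is a simplified sketch, not the grammar itself: it assumes lpeg loads with require, uses R ranges instead of lpeg.locale, handles the separating space with a plain repetition rather than the recursive <abil> rule, and runs on a made-up sample string.

```lua
local lpeg = require "lpeg"
local P, R, C, Cc, Cf, Cg, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Cc, lpeg.Cf, lpeg.Cg, lpeg.Ct

-- Same accumulator function as in the grammar.
local function namekeys (t, k, v) t[k] = tonumber(v) or v return t end

local alpha, digit = R("az", "AZ"), R"09"
local dur     = Cg( C(alpha^1) * P":" * C(digit^1) )   -- e.g. "AG:379"
local present = Cg( C(alpha^1) * Cc(true) )            -- e.g. "B"
local abil    = dur + present

-- Ct("") supplies the empty table that seeds the fold; each later
-- group hands its two values (key, value) to namekeys.
local prots = Cf( Ct("") * abil * (P" " * abil)^0, namekeys )

local t = prots:match("AG:379 B HS:50")
print(t.AG, t.B, t.HS)   -- 379  true  50
```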
So what do we end up with?
Me = LPEG:match(hp)
tprint(Me) ->
"Combat":
  "enemyHealth"=98
"GXP":
  "nextPercent"=45
  "toNext"=2914265
  "toSpend"=811372
"Performing":
  "Duration"=8
"Prots":
  "AG"=379
  "B"=true
  "HS"=50
  "O"=362
  "PE"=238
  "PH"=139
  "RF"=true
  "ab"=true
"Stats":
  "HP":
    "current"=1346
    "max"=1351
    "percent"=99
  "Karma":
    "current"=497
    "max"=537
    "percent"=92
  "SP":
    "current"=289
    "max"=289
    "percent"=100
  "Voice":
    "current"=91
    "max"=100
    "percent"=91
Exactly what I want: a robust hierarchy instead of a flat table. How does this complexity perform? 100,000 calls in 8 seconds again.
Still faster than PCRE in this application and giving me exactly what I want.
