Summary
Searches a string for a pattern
Prototype
s, e = string.find (str, pattern, index, plain)
Description
Find the first match of the regular expression "pattern" in "str", starting at position "index". The index is optional and defaults to 1; a negative index counts back from the end of the string.
If found, returns the start and end position, and any captures as additional results.
If not found, returns nil.
If "plain" is true, the search string is plain text, not a regular expression. To supply "plain" you must also supply "index".
Also see string.match which operates in a similar way, but does not return the start and end positions.
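As a minimal sketch of the calls described above (the subject string is made up for illustration), the following shows a pattern search, a plain search, a negative index, and the corresponding string.match call:

```lua
-- Sample subject string (hypothetical, for illustration only).
local subject = "price: 42 (approx.)"

-- Pattern search: %d+ matches a run of digits.
local s, e = string.find (subject, "%d+")
print (s, e)                                   --> 8 9

-- Plain search: "(" is taken literally because "plain" is true.
-- The index (here 1) has to be given in order to pass "plain".
local ps, pe = string.find (subject, "(", 1, true)
print (ps, pe)                                 --> 11 11

-- A negative index counts back from the end of the string,
-- so this search starts 8 characters from the end.
local ns, ne = string.find (subject, "%a+", -8)
print (ns, ne)                                 --> 12 17

-- string.match returns the matched text instead of the positions.
local text = string.match (subject, "%d+")
print (text)                                   --> 42
```

Without the "plain" argument, a lone "(" would be read as the start of a capture and the call would raise an error rather than return a position.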
Patterns
The standard patterns (character classes) you can search for are:
. --- (a dot) represents all characters.
%a --- all letters.
%c --- all control characters.
%d --- all digits.
%l --- all lowercase letters.
%p --- all punctuation characters.
%s --- all space characters.
%u --- all uppercase letters.
%w --- all alphanumeric characters.
%x --- all hexadecimal digits.
%z --- the character with hex representation 0x00 (null).
%% --- a single '%' character.
%1 --- captured pattern 1.
%2 --- captured pattern 2 (and so on).
%f[s] --- transition from not in set 's' to in set 's'.
%b() --- balanced pair ( ... )
Important - the uppercase versions of the letter classes above (%a, %c, %d and so on) represent the complement of the class. eg. %U represents everything except uppercase letters, %D represents everything except digits.
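The following sketch tries a few of the classes listed above, one complement, and the %f frontier. The subject strings are invented for illustration.

```lua
local subject = "Room 101, Floor 3"

print (string.find (subject, "%d"))   --> 6 6   first digit
print (string.find (subject, "%D"))   --> 1 1   first non-digit
print (string.find (subject, "%s"))   --> 5 5   first space
print (string.find (subject, "%p"))   --> 9 9   the comma
print (string.find (subject, "%l"))   --> 2 2   first lowercase letter

-- Frontier: %f[%a] is the transition from a non-letter to a letter and
-- %f[%A] the transition from a letter to a non-letter, so this finds
-- "cat" only as a whole word, skipping the "cat" inside "concatenate".
local fs, fe = string.find ("concatenate cat", "%f[%a]cat%f[%A]")
print (fs, fe)                        --> 13 15
```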
There are some "magic characters" (such as %) that have special meanings. These are:
^ $ ( ) % . [ ] * + - ?
If you want to use those in a pattern (as themselves) you must precede them by a % symbol.
eg. %% would match a single %
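A short sketch of escaping magic characters follows. The helper escape_pattern is hypothetical (it is not part of the string library) and simply puts a % in front of every punctuation character, which is one possible way to search for arbitrary text when the "plain" argument is not convenient.

```lua
local subject = "Discount: 50% (today only)"

print (string.find (subject, "50%%"))      --> 11 13
print (string.find (subject, "%(today"))   --> 15 20
print (string.find ("a.b", "."))           --> 1 1   unescaped dot: any character
print (string.find ("a.b", "%."))          --> 2 2   escaped dot: a literal "."

-- Hypothetical helper: escape every punctuation character with a %.
local function escape_pattern (text)
  -- The outer parentheses drop the match count that string.gsub also returns.
  return (string.gsub (text, "%p", "%%%0"))
end

local escaped = escape_pattern ("1+1=2")
print (escaped)                                    --> 1%+1%=2
print (string.find ("so 1+1=2 is true", escaped))  --> 4 8
```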
You can build your own pattern classes (sets) by using square brackets, eg.
[abc] ---> matches a, b or c
[a-z] ---> matches lowercase letters (same as %l)
[^abc] ---> matches anything except a, b or c
[%a%d] ---> matches all letters and digits
[%a%d_] ---> matches all letters, digits and underscore
[%[%]] ---> matches square brackets (had to escape them with %)
You can use pattern classes (such as %a or %d) inside a set. If you use other characters (like periods and round brackets, etc.) they are simply themselves.
You can specify a range of characters inside a set by using simple characters (not pattern classes like %a) separated by a hyphen. For example, [A-Z] or [0-9]. These can be combined with other things. For example [A-Z0-9] or [A-Z,.].
The end-points of a range must be given in ascending order. That is, [A-Z] would match upper-case letters, but [Z-A] would not match anything.
You can negate a set by starting it with a "^" symbol, thus [^0-9] is everything except the digits 0 to 9. The negation applies to the whole set, so [^%a%d] would match anything except letters or digits. Anywhere except the first position of a set, the "^" symbol is simply itself.
Inside a set (that is a sequence delimited by square brackets) the only "magic" characters are:
] ---> to end the set, unless preceded by %
% ---> to introduce a character class (like %a), or magic character (like "]")
^ ---> in the first position only, to negate the set (eg. [^A-Z])
- ---> between two characters, to specify a range (eg. [A-F])
Thus, inside a set, characters like "." and "?" are just themselves.
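The rules for sets can be tried out with a sketch like the one below. The subject string is invented for illustration.

```lua
local subject = "id=A7_x; tag=[beta]"

print (string.find (subject, "[%a%d_]+"))   --> 1 2     "id"
print (string.find (subject, "[A-Z0-9]+"))  --> 4 5     "A7"
print (string.find (subject, "[^%a%d]"))    --> 3 3     "=", first non-alphanumeric
print (string.find (subject, "[%[%]]"))     --> 14 14   "["
print (string.find (subject, "[.;?]"))      --> 8 8     ";" (no escaping needed)

-- A range given in descending order matches nothing.
print (string.find ("ABC", "[Z-A]"))        --> nil

-- "^" is only special in the first position of a set.
print (string.find ("a^b", "[b^]"))         --> 2 2
```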
The repetition characters, which can follow a character, class or set, are:
+ ---> 1 or more repetitions (greedy)
* ---> 0 or more repetitions (greedy)
- ---> 0 or more repetitions (non greedy)
? ---> 0 or 1 repetition only
A "greedy" match will match on as many characters as possible; a non-greedy one will match on as few as possible.
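The difference between the repetition characters is easiest to see side by side. This is a small sketch using made-up subject strings:

```lua
local subject = "<b>bold</b> and <i>italic</i>"

-- Greedy: runs on to the last ">" in the string.
print (string.find (subject, "<.*>"))       --> 1 29
-- Non-greedy: stops at the first ">".
print (string.find (subject, "<.->"))       --> 1 3

-- "+" needs at least one digit, so there is no match.
print (string.find ("abc", "%d+"))          --> nil
-- "*" accepts zero digits, so it matches the empty string at position 1
-- (an empty match is reported with the end one less than the start).
print (string.find ("abc", "%d*"))          --> 1 0

-- "?" makes the preceding "u" optional.
print (string.find ("color", "colou?r"))    --> 1 5
print (string.find ("colour", "colou?r"))   --> 1 6
```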
The standard "anchor" characters apply:
^ ---> anchor to start of subject string
$ ---> anchor to end of subject string
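A brief sketch of the anchors in use follows. The last line relies on the behaviour that, when an index is supplied, "^" anchors the match at that starting position.

```lua
print (string.find ("hello world", "^hello"))     --> 1 5
print (string.find ("hello world", "^world"))     --> nil
print (string.find ("hello world", "world$"))     --> 7 11

-- Both anchors together: the whole string must consist of letters.
print (string.find ("hello world", "^%a+$"))      --> nil   (the space is not a letter)
print (string.find ("hello", "^%a+$"))            --> 1 5

-- With an index, "^" anchors at the position where the search starts.
print (string.find ("hello world", "^world", 7))  --> 7 11
```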
You can also use round brackets to specify "captures":
You see (.*) here
Here, whatever matches (.*) becomes the first capture.
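As a sketch, here is that pattern applied to a made-up subject, followed by a second pattern with two captures. Captures are returned after the start and end positions, in the order their opening brackets appear.

```lua
local s, e, what = string.find ("You see a troll here", "You see (.*) here")
print (s, e, what)           --> 1 20 a troll

-- Two captures: a name and a number (the number is returned as a string).
local ks, ke, key, value = string.find ("speed = 42", "(%a+)%s*=%s*(%d+)")
print (ks, ke, key, value)   --> 1 10 speed 42
```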
You can also refer to matched substrings (captures) later on in an expression:
print (string.find ("You see dogs and dogs", "You see (.*) and %1")) --> 1 21 dogs
print (string.find ("You see dogs and cats", "You see (.*) and %1")) --> nil
This example shows how you can look for a repetition of a word matched earlier, whatever that word was ("dogs" in this case).
As a special case, an empty capture, written (), captures the current position in the string (a number) rather than any text. eg.
print (string.find ("You see dogs and cats", "You .* ()dogs .*")) --> 1 21 9
The third result, 9, is the position at which the word "dogs" starts.
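Position captures can also be placed on both sides of a sub-pattern to obtain its extent. A small sketch, with an invented subject string:

```lua
local subject = "alpha beta"

-- The first () gives the position where "beta" starts, the second the
-- position just after it ends.
local s, e, from, to = string.find (subject, "()beta()")
print (s, e, from, to)                      --> 7 10 7 11
print (type (from))                         --> number

-- The two positions can be passed on to string.sub.
print (string.sub (subject, from, to - 1))  --> beta
```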
Finally you can look for nested "balanced" things (such as parentheses) by using %b, like this:
print (string.find ("I see a (big fish (swimming) in the pond) here",
"%b()")) --> 9 41
After %b you put 2 characters, which indicate the start and end of the balanced pair. If it finds a nested version it keeps processing until we are back at the top level. In this case the matching string was "(big fish (swimming) in the pond)".
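The two characters after %b need not be round brackets. The sketch below uses curly brackets on a made-up subject, and also shows what happens when the pair is never closed.

```lua
local subject = "call f{a, {b, c}} now"

local s, e = string.find (subject, "%b{}")
print (s, e)                                   --> 7 17
print (string.sub (subject, s, e))             --> {a, {b, c}}

-- An opening bracket with no matching close does not match at all.
print (string.find ("(never closed", "%b()"))  --> nil
```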
Examples of string.find:
print (string.find ("the quick brown fox", "quick")) --> 5 9
print (string.find ("the quick brown fox", "(%a+)")) --> 1 3 the
print (string.find ("the quick brown fox", "(%a+)", 10)) --> 11 15 brown
print (string.find ("the quick brown fox", "fruit")) --> nil
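The "index" argument also makes it possible to step through every match in a string. The function find_all below is a hypothetical helper written for this illustration, not part of the string library; string.gmatch is usually simpler when only the matched text is wanted.

```lua
-- Hypothetical helper: return a table of { start, finish } pairs, one for
-- each match of "pattern" in "str".
local function find_all (str, pattern)
  local results = {}
  local index = 1
  while index <= #str do
    local s, e = string.find (str, pattern, index)
    if not s then
      break
    end
    results [#results + 1] = { s, e }
    -- Resume after this match; step one character past an empty match
    -- so that the loop always moves forward.
    if e >= s then
      index = e + 1
    else
      index = s + 1
    end
  end
  return results
end

local words = find_all ("the quick brown fox", "%a+")
for _, pos in ipairs (words) do
  print (pos [1], pos [2])
end
--> 1 3
--> 5 9
--> 11 15
--> 17 19
```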
See Also ...
Lua functions
string.byte - Converts a character into its ASCII (decimal) equivalent
string.char - Converts ASCII codes into their equivalent characters
string.dump - Converts a function into binary
string.format - Formats a string
string.gfind - Iterate over a string (obsolete in Lua 5.1)
string.gmatch - Iterate over a string
string.gsub - Substitute strings inside another string
string.len - Return the length of a string
string.lower - Converts a string to lower-case
string.match - Searches a string for a pattern
string.rep - Returns repeated copies of a string
string.reverse - Reverses the order of characters in a string
string.sub - Returns a substring of a string
string.upper - Converts a string to upper-case
Topics
Lua base functions
Lua bc (big number) functions
Lua bit manipulation functions
Lua coroutine functions
Lua debug functions
Lua io functions
Lua LPEG library
Lua math functions
Lua os functions
Lua package functions
Lua PCRE regular expression functions
Lua script extensions
Lua string functions
Lua table functions
Lua utilities
Regular Expressions
Scripting
(Help topic: lua=string.find)