REGEX ENGINE

I've thought for a while it would be fun and interesting to write my
own regular expression engine using Thompson's construction algorithm,
so here we are.


Grammar

This engine does not strictly follow any standard syntax; the
expression syntax I intend to support is as follows.

    regex      ::= sequence ( '|' sequence )*
    sequence   ::= term+
    term       ::= ( '.' | class | literal | '(' regex ')' ) quantifier?
    class      ::= '[' '^'? literal+ ']'
    literal    ::= non-special | '\' special
    quantifier ::= '*' | '+' | '?'
    special    ::= quantifier | '|' | '(' | ')' | '[' | ']' | '^' | '\'
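
As a rough illustration of how the grammar above maps onto code, here
is a minimal recursive-descent recognizer in C11. It is only a sketch
and not the engine's actual parser: the names Parser, parse_* and
is_valid_regex are made up for this example, and it only checks
whether a pattern is well-formed rather than building an NFA. One
assumption worth flagging: since '.' is not listed under special, the
sketch treats an unescaped '.' outside a class as the wildcard and an
unescaped '.' inside a class as an ordinary literal.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Cursor over the pattern being parsed (hypothetical helper type). */
typedef struct { const char *p; } Parser;

static bool parse_regex(Parser *ps);

/* special ::= quantifier | '|' | '(' | ')' | '[' | ']' | '^' | '\' */
static bool is_special(char c) {
    /* strchr would match the terminator, so rule out '\0' first. */
    return c != '\0' && strchr("*+?|()[]^\\", c) != NULL;
}

/* quantifier ::= '*' | '+' | '?' */
static bool is_quantifier(char c) {
    return c == '*' || c == '+' || c == '?';
}

/* literal ::= non-special | '\' special */
static bool parse_literal(Parser *ps) {
    if (*ps->p == '\\') {
        if (!is_special(ps->p[1])) return false; /* only specials may be escaped */
        ps->p += 2;
        return true;
    }
    if (*ps->p == '\0' || is_special(*ps->p)) return false;
    ps->p++;
    return true;
}

/* class ::= '[' '^'? literal+ ']' */
static bool parse_class(Parser *ps) {
    if (*ps->p != '[') return false;
    ps->p++;
    if (*ps->p == '^') ps->p++;              /* optional negation */
    if (!parse_literal(ps)) return false;    /* at least one literal */
    while (*ps->p != ']' && *ps->p != '\0') {
        if (!parse_literal(ps)) return false;
    }
    if (*ps->p != ']') return false;         /* unterminated class */
    ps->p++;
    return true;
}

/* term ::= ( '.' | class | literal | '(' regex ')' ) quantifier? */
static bool parse_term(Parser *ps) {
    if (*ps->p == '(') {
        ps->p++;
        if (!parse_regex(ps) || *ps->p != ')') return false;
        ps->p++;
    } else if (*ps->p == '[') {
        if (!parse_class(ps)) return false;
    } else if (*ps->p == '.') {
        ps->p++;                             /* assumed wildcard */
    } else if (!parse_literal(ps)) {
        return false;
    }
    if (is_quantifier(*ps->p)) ps->p++;      /* at most one quantifier */
    return true;
}

/* sequence ::= term+ */
static bool parse_sequence(Parser *ps) {
    if (!parse_term(ps)) return false;
    while (*ps->p != '\0' && *ps->p != '|' && *ps->p != ')') {
        if (!parse_term(ps)) return false;
    }
    return true;
}

/* regex ::= sequence ( '|' sequence )* */
static bool parse_regex(Parser *ps) {
    if (!parse_sequence(ps)) return false;
    while (*ps->p == '|') {
        ps->p++;
        if (!parse_sequence(ps)) return false;
    }
    return true;
}

/* True if the whole pattern is derivable from the grammar. */
bool is_valid_regex(const char *pattern) {
    Parser ps = { pattern };
    return parse_regex(&ps) && *ps.p == '\0';
}
```

For example, is_valid_regex("(ab)*c") returns true, while
is_valid_regex("a**") returns false because the grammar allows at
most one quantifier per term. A real parser would emit NFA fragments
at each of these steps instead of just returning a flag.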


Building and Running Tests

The build uses CMake. There are two scripts, build.sh and test.sh,
which will (much to everybody's shock) build the project and run the
tests. I use Clang, but the code is ISO C11, so it should compile just
fine with GCC. You might need to faff with CMakeLists.txt to get it
to work with another compiler due to command-line flag nonsense.

    scripts/build.sh    # Compile library, demo and tests
    scripts/test.sh     # Run tests

There is also an entr.sh script, which watches all the project's
files and, on any change, rebuilds and then reruns the tests (it uses
entr, hence the name of the script).