TINY-LEX (MyLex)
Overview
MyLex is a tiny lexical analyzer generator implemented in C++. It was written as homework for a compiler course.
Steps
- parse the .mylex file
- generate a list of NFAs from the regex strings
- combine the NFAs into one large NFA
- convert the NFA to a DFA
- minimize the DFA
- generate C code from the DFA
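The "convert the NFA to a DFA" step is commonly done with subset construction. The sketch below shows that technique under stated assumptions: the `Nfa` and `Dfa` structs and the helper names `epsilon_closure`, `subset_construction`, and `dfa_accepts` are hypothetical and are not taken from the MyLex sources, which may organize this differently.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical NFA layout for this sketch (not MyLex's real classes).
struct Nfa {
    int start = 0;
    std::set<int> accepting;
    std::map<std::pair<int, char>, std::set<int>> moves;  // labelled edges
    std::map<int, std::set<int>> epsilon;                 // epsilon edges
};

// Hypothetical DFA layout: state 0 is always the start state.
struct Dfa {
    std::vector<std::map<char, int>> next;  // next[state][symbol] -> state
    std::vector<bool> accepting;
};

// All NFA states reachable from `states` using epsilon edges only.
std::set<int> epsilon_closure(const Nfa& nfa, std::set<int> states) {
    std::vector<int> work(states.begin(), states.end());
    while (!work.empty()) {
        int s = work.back();
        work.pop_back();
        auto it = nfa.epsilon.find(s);
        if (it == nfa.epsilon.end()) continue;
        for (int t : it->second)
            if (states.insert(t).second) work.push_back(t);
    }
    return states;
}

// Each DFA state stands for one set of NFA states.
Dfa subset_construction(const Nfa& nfa) {
    Dfa dfa;
    std::map<std::set<int>, int> ids;   // NFA subset -> DFA state id
    std::queue<std::set<int>> pending;  // subsets not yet expanded

    // Return the id of `subset`, creating a new DFA state if needed.
    auto intern = [&](const std::set<int>& subset) -> int {
        auto it = ids.find(subset);
        if (it != ids.end()) return it->second;
        int id = static_cast<int>(dfa.next.size());
        ids.emplace(subset, id);
        dfa.next.emplace_back();
        bool acc = false;
        for (int s : subset)
            if (nfa.accepting.count(s)) acc = true;
        dfa.accepting.push_back(acc);
        pending.push(subset);
        return id;
    };

    intern(epsilon_closure(nfa, {nfa.start}));
    while (!pending.empty()) {
        std::set<int> cur = pending.front();
        pending.pop();
        int from = ids.at(cur);
        // Group the labelled edges leaving `cur` by input symbol.
        std::map<char, std::set<int>> targets;
        for (const auto& kv : nfa.moves)
            if (cur.count(kv.first.first))
                targets[kv.first.second].insert(kv.second.begin(), kv.second.end());
        for (const auto& kv : targets) {
            // Intern first: it may grow dfa.next, so index only afterwards.
            int to = intern(epsilon_closure(nfa, kv.second));
            dfa.next[from][kv.first] = to;
        }
    }
    return dfa;
}

// Run the DFA over `input`; a missing transition means rejection.
bool dfa_accepts(const Dfa& dfa, const std::string& input) {
    int state = 0;
    for (char c : input) {
        auto it = dfa.next[state].find(c);
        if (it == dfa.next[state].end()) return false;
        state = it->second;
    }
    return dfa.accepting[state];
}
```

Scanning every edge for each subset keeps the sketch short; a real implementation would more likely index edges by source state.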
Some Important Algorithms
- convert a regex to a postfix expression
- convert the postfix expression to an NFA (reference)
- convert the NFA to a DFA
- minimize the DFA
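As an illustration of the first algorithm in the list, here is a minimal sketch of regex-to-postfix conversion using an operator stack (the shunting-yard approach). It makes several assumptions that are not confirmed by MyLex itself: the regex uses only `|`, `*`, `+`, `?`, and parentheses; concatenation is made explicit with `.`, so the input must not contain a literal `.`; and there are no escapes or character classes. The function names `add_concat` and `to_postfix` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True for characters this sketch treats as plain literals.
static bool is_literal(char c) {
    return c != '(' && c != ')' && c != '|' && c != '*' &&
           c != '+' && c != '?' && c != '.';
}

// Insert an explicit '.' wherever two adjacent items are concatenated.
std::string add_concat(const std::string& re) {
    std::string out;
    for (std::size_t i = 0; i < re.size(); ++i) {
        char c = re[i];
        out += c;
        if (i + 1 == re.size()) break;
        char n = re[i + 1];
        bool left_ends = is_literal(c) || c == ')' || c == '*' || c == '+' || c == '?';
        bool right_starts = is_literal(n) || n == '(';
        if (left_ends && right_starts) out += '.';
    }
    return out;
}

// Only the binary operators '|' and '.' are ever pushed on the stack.
static int precedence(char op) {
    return op == '|' ? 1 : 2;
}

// Convert an infix regex to postfix notation.
std::string to_postfix(const std::string& re) {
    std::string out;
    std::vector<char> ops;  // operator stack
    for (char c : add_concat(re)) {
        if (c == '(') {
            ops.push_back(c);
        } else if (c == ')') {
            while (!ops.empty() && ops.back() != '(') {
                out += ops.back();
                ops.pop_back();
            }
            if (!ops.empty()) ops.pop_back();  // discard the matching '('
        } else if (c == '*' || c == '+' || c == '?') {
            out += c;  // unary postfix operators are emitted immediately
        } else if (c == '|' || c == '.') {
            // Pop operators of equal or higher precedence (left-associative).
            while (!ops.empty() && ops.back() != '(' &&
                   precedence(ops.back()) >= precedence(c)) {
                out += ops.back();
                ops.pop_back();
            }
            ops.push_back(c);
        } else {
            out += c;  // literal character
        }
    }
    while (!ops.empty()) {
        out += ops.back();
        ops.pop_back();
    }
    return out;
}
```

Under these assumptions, `to_postfix("a(b|c)*")` yields `abc|*.`. A postfix string like this can be turned into an NFA by reading it left to right with a stack of NFA fragments.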
Compile And Run
Environment for MyLex
- OS: Linux, Unix, or Mac OS
- Compiler: g++ or clang
- Library: Boost
Environment for C code generated by MyLex
- OS: Linux, Unix, or Mac OS
- Compiler: gcc or clang (the compiler must support C99)
Compile MyLex
Debug build
make DEBUG=1
Non-debug build
make
How to use
mylex infile [outfile]    (output goes to stdout if outfile is omitted)
Sample
make
./mylex sample/c_lex.mylex > c_lex.c
make c_lex
MyLex Syntax
- Sample: c_lex.mylex
- Sample Input: sample.c
- Sample Output: sample.out
File Format
%{
[Declarations]
%}
%%
[Entry]
%%
[Code]
Entry Format
[Regex] {
[Handler] with the param (shm_token)
}
There are some constraints:
- you must provide a main function in the 'Code' section
- in 'main', you must invoke myylex(char* filename, void (*func)())
- you must pass a function pointer when you invoke myylex
- in myylex, you can use the 'Token List'
// traverse the token list
// you must define trav_func yourself
void iter_list(void (*trav_func)(Token*));
// print a specific token (a predefined trav_func)
void print_token(Token* token);
// invoke this in 'main'
void myylex(char* input, void (*func)());
// initialize an iterator
#define INIT_ITER(iter)
// advance to the next token
#define ITER_NEXT(iter) (iter = iter->next)
// true if there is a next token
#define ITER_HASNEXT(iter) (iter != NULL && iter->next != NULL)
// true if the iterator is NULL (past the last token)
#define ITER_ISEND(iter) (iter == NULL)
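The difference between ITER_HASNEXT and ITER_ISEND matters when writing a loop: ITER_HASNEXT is already false on the last token, so a loop guarded by it stops one token early. The sketch below demonstrates this with a stand-in `Token` type. That struct, its `text` field, and the functions `count_tokens` and `count_with_successor` are hypothetical; only the three macros are copied from the listing above. INIT_ITER is not used because its body is not shown, so the iterator is started from a hand-built list instead. The code stays within the common subset of C and C++, so the same loops should carry over to a C99 'Code' section.

```cpp
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generated Token type.
   Only the `next` member is implied by the macros. */
typedef struct Token {
    const char* text;    /* assumed field, for illustration only */
    struct Token* next;
} Token;

/* Macros copied from the listing above. */
#define ITER_NEXT(iter) (iter = iter->next)
#define ITER_HASNEXT(iter) (iter != NULL && iter->next != NULL)
#define ITER_ISEND(iter) (iter == NULL)

/* Visit every token, including the last one. */
int count_tokens(Token* head) {
    int n = 0;
    Token* iter = head;  /* generated code would presumably use INIT_ITER here */
    while (!ITER_ISEND(iter)) {
        ++n;
        ITER_NEXT(iter);
    }
    return n;
}

/* Visit only tokens that have a successor; the last token is skipped. */
int count_with_successor(Token* head) {
    int n = 0;
    Token* iter = head;
    while (ITER_HASNEXT(iter)) {
        ++n;
        ITER_NEXT(iter);
    }
    return n;
}
```

For a list of three tokens, `count_tokens` returns 3 while `count_with_successor` returns 2, so ITER_ISEND is the safer guard when every token must be processed.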