Example calculator grammar: UMINUX is a fictious token used only for precedence. The rule declared later has higher precedence. The %prec UMINUS -- assigns the precedence of that symbol to that rule Example: calc4.y %{ #include /* for pow() function */ #include "symtab4.h" %} %union { double val; /* values of expressions */ symrec *tptr; /* symbol table pointer for variables */ } %token NUM %token VAR %type expr %left '-' '+' %left '*' '/' %nonassoc UMINUS %right '^' %% input : /* empty */ | input line ; line : '\n' | VAR '=' expr '\n' { $1->value = $3; } | expr '\n' { printf ("\t= %.2f\n", $1); } | error '\n' { yyerrok; } ; expr : expr '+' expr { $$ = $1 + $3; } | expr '-' expr { $$ = $1 - $3; } | expr '*' expr { $$ = $1 * $3; } | expr '/' expr { if ($3 == 0.0) yyerror("divide by zero"); else $$ = $1 / $3; } | '-' expr %prec UMINUS { $$ = - $2; } | expr '^' expr { $$ = pow($1,$3); } | '(' expr ')' { $$ = $2; } | NUM { $$ = $1; } | VAR { $$ = $1->value; } ; %% symrec *sym_table = (symrec *)0; main () { yyparse (); } yyerror (char *s) /* Called by yyparse on error */ { printf ("\terror: %s\n", s); } ============================================================ Example: calc4.l /* flex specification */ %{ #include "symtab4.h" /* declaration of `symrec' */ #include "calc4.tab.h" %} DIGIT [0-9] LETTER [A-Za-z] %% {DIGIT}+("."{DIGIT}+)? { yylval.val = atof(yytext); return NUM; } [ \t] /* ignore whitespace */ {LETTER}({LETTER}|{DIGIT})* { symrec *sym; sym = getsym(yytext); if (sym == 0) sym = putsym(yytext); yylval.tptr = sym; return VAR; } <> yyterminate(); /* signal end of dialogue */ /* terminal tokens #defined from 258 or so. * so return of ASCII values are OK! :-) */ \n return yytext[0]; . return yytext[0]; %% ================================================================ Note: {n,m} used for n-m occurences LR Parser - Left to Right parser LALR - Look Ahead LR parser (it checks 1 more symbol before reduce) The General Format of a Lex/Yacc File consists of three sections: 1. Definitions -- Copied to lex.yy.c 2. Rules 3. User Subroutines Lex breaks ties with -Longest match - first rule, first match yytext - matched expr string. yyleng - it's length. Note: Single char ASCII tokens can be literally referred from both lex and yacc files rules as it is. ================================================================ Question: How are the /* comments */ coded in Lex/yacc ? It is better to code this in lex itself. but how? What is the significance of the yywrap() function ? The yywrap() called at end of input file. The yywrap() function should return 0 if it has arranged for additional input, 1 if the end of the input has been reached. Declare: int yywrap(){ return 1; } if you don't want to link with -lfl library. Calling yylex() continues matching multiple lex rules until some rule does a "return something;" Other interesting lex functions : yymore() - don't reinit yytext after this rule. just append this. REJECT; - reject the current rule, go to next rule. unput(c); - put this char back to input stream ============================================================== In yacc, left recursion is prefferred to have efficient parser: list : item { /* first item */ } | list item { /* rest of the items */ } - main(){ yyparse(); } yyerror(char *s){ // Called by parser error when no match. printf("error: %s\n" s); } ============================================================== - Steps to link flex and yacc outputs : % bison -d test.y - Compile test.y first which generates lex token #defines in test.tab.h header file. You need this for flex. The -d option generates the header file. % flex test.l /* produces lex.yy.c */ % cc lex.yy.c test.tab.c -lfl Note: no bison library needed. only flex library to provide default yywrap() function. ================================================================= How to use C++ with bison: http://www.progtools.org/compilers/tutorials/cxx_and_bison/cxx_and_bison.html CPP examples with STL : http://www.josuttis.com/libbook/toc.html