Verible uses traditional tools like Flex and Yacc/Bison, but not in the usual
manner where the generated yyparse()
function calls yylex()
directly.
Instead, the generated lexer and parser are completely decoupled.
The lexer is generated using Flex. It supports both SystemVerilog language-proper lexical tokens as well as preprocessing directives, macros, and other directives, all-in-one. The Flex-generated code is then wrapped into an interface that returns tokens one-by-one, until ending with a special EOF token.
The contextualizer is a pass over the token stream produced by the lexer that can help disambiguate tokens with multiple interpretations. This in turn, helps simplify the grammar in the parser implementation in ways that would not be possible with a context-free LR parser alone.
There is no standalone preprocessor yet.
Token streams can be filtered into token stream views in various ways. One of the most useful filters hides comments and attributes so that the resulting view can be parsed.
The parser is generated using Bison, using the LALR(1) algorithm. The grammar implemented is unconventional in that it explicitly handles preprocessing constructs, directives, macros -- this allows it to operate on a limited-but-useful subset of unpreprocessed code. The parser constructs a concrete syntax tree (CST) that captures all non-comment tokens, including punctuation and syntactic sugar.