Lexical Analysis
Overview
Lexical analysis (also called scanning) is the first phase of compilation. It reads the source code as a stream of characters and groups them into a sequence of tokens for the parser.
Concepts
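- Tokens: Smallest meaningful units of a language, each belonging to a category (keywords, identifiers, operators, literals)
- Lexemes: Actual character sequences in the source that match a token's pattern (for example, the lexeme "count" is an instance of the identifier token)
- Regular Expressions: Notation for specifying the pattern of each token category
- Finite Automata: DFAs and NFAs, the machines that recognize the patterns described by regular expressions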
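The relationship between tokens, lexemes, and finite automata can be sketched with a small table-driven DFA. This is a minimal illustration, not a complete scanner: the state names, the character classes, and the helper `longest_match` are all assumptions made up for this example, and it recognizes only ASCII identifiers and unsigned integers.

```python
# Hypothetical DFA for illustration: three states, two accepting.
START, IN_IDENT, IN_NUMBER = "start", "in_ident", "in_number"

# Accepting states and the token kind each one reports.
ACCEPTING = {IN_IDENT: "IDENT", IN_NUMBER: "NUMBER"}


def char_class(ch: str) -> str:
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and (ch.isalpha() or ch == "_"):
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"


# Transition table: (state, input class) -> next state.
# A missing entry means the DFA has no move and must stop.
TRANSITIONS = {
    (START, "letter"): IN_IDENT,
    (START, "digit"): IN_NUMBER,
    (IN_IDENT, "letter"): IN_IDENT,
    (IN_IDENT, "digit"): IN_IDENT,
    (IN_NUMBER, "digit"): IN_NUMBER,
}


def longest_match(text: str, pos: int = 0):
    """Run the DFA from pos and return (kind, lexeme) for the longest
    prefix that ends in an accepting state, or None if there is none."""
    state = START
    last_accept = None
    i = pos
    while i < len(text):
        nxt = TRANSITIONS.get((state, char_class(text[i])))
        if nxt is None:
            break
        state = nxt
        i += 1
        if state in ACCEPTING:
            last_accept = (ACCEPTING[state], text[pos:i])
    return last_accept


print(longest_match("count1 = 5"))  # the token kind is IDENT, the lexeme is "count1"
```

The token kind comes from the accepting state, while the lexeme is the slice of input consumed to reach it. Remembering the last accepting position is what gives the "longest match" behavior that scanners conventionally use.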
Tools and Generators
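- Lex/Flex: Lexical analyzer generators that turn a specification of patterns and actions into C scanner code
- Regular expression patterns: The rule language of a generator specification, one pattern per token category
- Symbol tables: Store identifiers and their attributes for use by later compiler phases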
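The idea behind a generator specification, a table of patterns paired with actions, can be imitated in plain Python with the standard `re` module. The sketch below is not Lex/Flex syntax and does not use any generator. The `SymbolTable` class, the `RULES` table, and the `scan` function are hypothetical names chosen for this example, and the token set is deliberately tiny.

```python
import re


class SymbolTable:
    """Maps each identifier name to a small integer index (assumed design)."""

    def __init__(self):
        self._index = {}
        self.names = []

    def intern(self, name: str) -> int:
        """Return the index for name, adding it on first sight."""
        if name not in self._index:
            self._index[name] = len(self.names)
            self.names.append(name)
        return self._index[name]


# Rule table: (token kind, regular expression). Hypothetical token set.
RULES = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"[ \t\n]+"),
]
SCANNER = re.compile("|".join(f"(?P<{kind}>{pat})" for kind, pat in RULES))


def scan(text: str, symbols: SymbolTable):
    """Return a list of (kind, attribute) pairs for text."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = SCANNER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind, lexeme = m.lastgroup, m.group()
        if kind == "IDENT":
            # The "action" for identifiers: record the name in the symbol table.
            tokens.append((kind, symbols.intern(lexeme)))
        elif kind == "NUMBER":
            tokens.append((kind, int(lexeme)))
        elif kind != "SKIP":
            tokens.append((kind, lexeme))
        pos = m.end()
    return tokens


symbols = SymbolTable()
print(scan("x = x + 10", symbols))
print(symbols.names)
```

One difference worth keeping in mind: Python's alternation takes the first alternative that matches, whereas Lex/Flex picks the longest match and uses rule order only to break ties, so rule order matters more in this sketch than in a real specification.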
Common Tasks
- Keyword recognition
- Identifier validation
- String and character literal handling
- Comment removal
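The tasks listed above can be combined in one small scanner. The sketch below assumes a C-like surface syntax (double-quoted strings with backslash escapes, `//` and `/* */` comments) and a made-up keyword set; none of these choices come from a particular language definition, and the names `KEYWORDS`, `TOKEN_RE`, and `tokenize` are hypothetical.

```python
import re

KEYWORDS = {"if", "else", "while", "return"}  # hypothetical keyword set

# Comments are listed before operators so that "//" is not read as two "/".
TOKEN_RE = re.compile(r"""
    (?P<COMMENT> //[^\n]* | /\*[\s\S]*?\*/ )   # line and block comments
  | (?P<STRING>  "(?:\\.|[^"\\\n])*" )         # string literal with escapes
  | (?P<CHAR>    '(?:\\.|[^'\\\n])' )          # single character literal
  | (?P<IDENT>   [A-Za-z_][A-Za-z0-9_]* )      # identifier or keyword
  | (?P<NUMBER>  \d+ )
  | (?P<OP>      [-+*/=<>!(){};] )
  | (?P<WS>      \s+ )
""", re.VERBOSE)


def tokenize(source: str):
    """Return (kind, lexeme) pairs, dropping comments and whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("COMMENT", "WS"):
            continue  # comment removal: these produce no tokens
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"  # keyword recognition by table lookup
        tokens.append((kind, lexeme))
    return tokens


for token in tokenize('if (x) return "a\\"b"; // done'):
    print(token)
```

Matching the whole identifier first and then checking it against the keyword set is one common way to keep a name like "iffy" from being split into the keyword "if" plus "fy". An unterminated string or block comment falls through to the unexpected-character error in this sketch; a fuller scanner would report it more specifically.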
Error Handling
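- Unexpected character detection: Reporting any character that cannot begin a valid token, together with its position
- Error recovery strategies: For example, panic mode, which skips characters until scanning can resume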
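A simple form of detection and recovery can be sketched as follows: when no pattern matches, record an error with its line and column, skip one character, and keep scanning so that later errors are also reported. The `LexError` record, the token set, and the function `tokenize_with_recovery` are assumptions made for this example; real scanners often use richer recovery rules.

```python
import re
from dataclasses import dataclass


@dataclass
class LexError:
    line: int      # 1-based line number
    column: int    # 1-based column number
    message: str


# Hypothetical token set, kept small on purpose.
TOKEN_RE = re.compile(
    r"(?P<NUMBER>\d+)"
    r"|(?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<OP>[-+*/=])"
    r"|(?P<NEWLINE>\n)"
    r"|(?P<WS>[ \t]+)"
)


def tokenize_with_recovery(source: str):
    """Return (tokens, errors); scanning continues past bad characters."""
    tokens, errors = [], []
    pos, line, line_start = 0, 1, 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            column = pos - line_start + 1
            errors.append(
                LexError(line, column, f"unexpected character {source[pos]!r}")
            )
            pos += 1  # recovery: skip the offending character and resume
            continue
        kind = m.lastgroup
        if kind == "NEWLINE":
            line += 1
            line_start = m.end()
        elif kind != "WS":
            tokens.append((kind, m.group()))
        pos = m.end()
    return tokens, errors


tokens, errors = tokenize_with_recovery("a = 1 $ 2\nb @ 3")
print(tokens)
for err in errors:
    print(f"{err.line}:{err.column}: {err.message}")
```

Collecting errors in a list instead of raising on the first one lets a single run report every bad character, at the cost of handing the parser a token stream with gaps.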