How to Perform Custom Error Handling With ANTLR
An article to help you understand how to configure a custom error listener to provide better error handling while using ANTLR to build your parser.
ANTLR is a very popular parser generator that helps build parsers for various language syntaxes, especially query languages or domain-specific languages. This tool provides default error handling, which is useful in many circumstances, but for more robust and user-friendly applications, more graceful error handling is required.
In this article, we will describe this requirement with a simple example and will guide you through the process of implementing custom error handling with ANTLR.
Scenario 1: Add Markers in the Parsed Text
In this scenario, we will first define the grammar and show the syntax error we get by default from the parser generated by the ANTLR parser generator.
ANTLR Grammar
Here is a code snippet of the ANTLR grammar that defines the syntax needed for our example.
grammar ExampleDsl;

query
    : selectExpr fromExpr whereExpr EOF
    ;

selectExpr
    : 'SELECT' expression_list
    ;

fromExpr
    : 'FROM' identifier
    ;

whereExpr
    : 'WHERE' expression
    ;
....
....
In the next subsection, we will show what an error message looks like for a bad or incorrect input text with default error handling.
Default Error Handling
Here is an example input in which the FROM clause is missing.
Input:
SELECT c1 WHERE column is not null
For the above input, the default error message will be similar to the following:
Result:
line 1:10 mismatched input 'WHERE' expecting {'FROM', ','}
The message shows the line number and the character position within that line (1:10 means line 1, position 10, with positions counted from 0), which together indicate where the error occurred. Because the above input is a single short line, the default error message is sufficient to identify the problematic spot.
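To make the two numbers in the default message concrete, here is a minimal sketch in plain Scala with no ANTLR dependency. It takes the reported line (counted from 1) and character position (counted from 0) and renders a caret under the offending character. The names CaretPointer and pointAt are hypothetical, and the sketch assumes lines are separated by \n.

```scala
object CaretPointer {
  // line is 1-based and charPositionInLine is 0-based,
  // matching the "line 1:10" convention of the default message.
  def pointAt(input: String, line: Int, charPositionInLine: Int): String = {
    val lines = input.split("\n", -1)
    require(line >= 1 && line <= lines.length, s"line $line is out of range")
    val target = lines(line - 1)
    // Pad with spaces so the caret lands under the reported position
    val caret = (" " * charPositionInLine) + "^"
    s"$target\n$caret"
  }
}
```

Calling CaretPointer.pointAt("SELECT c1 WHERE column is not null", 1, 10) places the caret under the W of WHERE, which is the token named in the default message.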
But if the input text spans multiple lines or is long and complex, the default error message doesn't help much. The user needs a better error message with visual cues to identify the syntax issue quickly. In the next subsection, we will show how we can achieve that.
Adding Error Position Markers Using Custom Error Listener
Let's define a custom error listener that receives each syntax error reported by the ANTLR-generated parser and builds a new, more readable error message.
import org.antlr.v4.runtime.{BaseErrorListener, RecognitionException, Recognizer, Token}

case class CustomErrorListener(input: String) extends BaseErrorListener {
  val StartMarker = ">"
  val EndMarker = "<"

  override def syntaxError(recognizer: Recognizer[_, _], offendingSymbol: Any, line: Int, charPositionInLine: Int, msg: String, e: RecognitionException): Unit = {
    val markedInput = offendingSymbol match {
      // The parser passes the offending token here; e may be null for some errors
      case token: Token if token.getStartIndex >= 0 =>
        val startPos = token.getStartIndex
        // getStopIndex is inclusive, so add 1 to get an exclusive end for substring
        val endPos = token.getStopIndex + 1
        // Insert the markers around the offending token
        val beforeStart = input.substring(0, startPos)
        val between = input.substring(startPos, endPos)
        val afterEnd = input.substring(endPos)
        s"$beforeStart$StartMarker $between $EndMarker$afterEnd"
      // No usable token position: fall back to the unmarked input
      case _ => input
    }
    throw new IllegalArgumentException(s"Syntax error [$msg] for input:\n$markedInput")
  }
}
In the above listener, we use the offending symbol that ANTLR passes to syntaxError, which for parser errors is the token where the error occurred. We read the token from offendingSymbol rather than from the RecognitionException object because the exception argument can be null for errors that the parser recovers from inline, such as a missing or extraneous token. The offending token carries the start and stop character indexes of the error position. Using these indexes, we can modify the input string by inserting visual cue characters (such as > and <) around the offending token.
Once the listener is defined, we need to update the parser to use the custom error listener.
import org.antlr.v4.runtime.{CharStreams, CommonTokenStream}

def parse(input: String): String = {
  val lexer = new ExampleDslLexer(CharStreams.fromString(input))
  val parser = new ExampleDslParser(new CommonTokenStream(lexer))
  // Optional: remove the default console listener so the error is not also printed to stderr
  parser.removeErrorListeners()
  // Adding custom listener to parser
  parser.addErrorListener(CustomErrorListener(input))
  val visitor = new ExampleParseVisitor()
  visitor.visitQuery(parser.query())
}
With the above change, the error messages contain the position marker characters, which indicate where the parser error occurred.
Here are a few examples of improved error messages from the custom error listener:
Input:
SELECT c1 WHERE column is not null
Error Message:
Syntax error [mismatched input 'WHERE' expecting {'FROM', ','}] for input:
SELECT c1 > WHERE < column is not null
Input:
SELECT c1,c2,c3,
c4,c5,c6,
c7,c8,c9
WHERE column is not null
Error Message:
Syntax error [mismatched input 'WHERE' expecting {'FROM', ','}] for input:
SELECT c1,c2,c3,
c4,c5,c6,
c7,c8,c9
> WHERE < column is not null
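The marker insertion itself does not depend on ANTLR, so it can be pulled out and tested in isolation. The following is a minimal sketch under that assumption; MarkerSketch and mark are hypothetical names, not part of the ANTLR API. It takes the start and stop indexes as inclusive character offsets, which is how ANTLR's Token.getStartIndex and Token.getStopIndex report them, and converts the stop index to an exclusive end before slicing. Treating the stop index as exclusive would cut off the last character of the marked token.

```scala
object MarkerSketch {
  val StartMarker = ">"
  val EndMarker = "<"

  // startIndex and stopIndex are both inclusive character offsets into input.
  def mark(input: String, startIndex: Int, stopIndex: Int): String = {
    val end = stopIndex + 1 // convert the inclusive stop to an exclusive end
    val before = input.substring(0, startIndex)
    val between = input.substring(startIndex, end)
    val after = input.substring(end)
    s"$before$StartMarker $between $EndMarker$after"
  }
}
```

For the single-line example, WHERE occupies offsets 10 through 14, so MarkerSketch.mark("SELECT c1 WHERE column is not null", 10, 14) returns "SELECT c1 > WHERE < column is not null". The same arithmetic also copes with an empty span where the stop index is one less than the start index, which is how an end-of-input token is expected to report its position.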
In the next section, we will show another use case, where we extend this concept to provide different error messages for the same parser error depending on the context.
Scenario 2: Show Context-Driven Error Message
Let's take a simple example: We want to handle a parser error differently when logging it vs. propagating it back to the user.
When logging, we do not want to record the user's query, because it may contain sensitive data, so we log only the error message. The user, on the other hand, may need the entire error message, including the marked-up query.
To achieve that, we can use a custom error listener to create a custom exception that contains both types of error messages. The caller code can use these messages appropriately based on the context.
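import org.antlr.v4.runtime.{BaseErrorListener, RecognitionException, Recognizer, Token}

case class ContextSensitiveErrorListener(input: String) extends BaseErrorListener {
  val StartMarker = ">"
  val EndMarker = "<"

  override def syntaxError(recognizer: Recognizer[_, _], offendingSymbol: Any, line: Int, charPositionInLine: Int, msg: String, e: RecognitionException): Unit = {
    // Message intended for logs: it omits the user's query text
    val logErrorMsg = s"Syntax error $msg"
    val userErrorMsg = offendingSymbol match {
      // Use the offending token passed by the parser; e may be null for some errors
      case token: Token if token.getStartIndex >= 0 =>
        val startPos = token.getStartIndex
        // getStopIndex is inclusive, so add 1 to get an exclusive end for substring
        val endPos = token.getStopIndex + 1
        val beforeStart = input.substring(0, startPos)
        val between = input.substring(startPos, endPos)
        val afterEnd = input.substring(endPos)
        s"Syntax error $msg for input: $beforeStart$StartMarker $between $EndMarker$afterEnd"
      // No usable token position: show the unmarked input
      case _ => s"Syntax error $msg for input: $input"
    }
    throw CustomParserException(userErrorMsg, logErrorMsg)
  }
}

// Carries both messages; getMessage returns the user-facing one
case class CustomParserException(userErrorMsg: String, logErrorMsg: String) extends IllegalArgumentException(userErrorMsg)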
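The caller side might then look like the following sketch. It is an illustration under stated assumptions rather than part of the original example: ParseErrorRouting and handle are hypothetical names, the parse function and the logger are passed in as plain functions so the block runs without ANTLR, and CustomParserException is repeated only to keep the block self-contained.

```scala
import scala.util.{Failure, Success, Try}

// Repeated from the listener example so this sketch compiles on its own
case class CustomParserException(userErrorMsg: String, logErrorMsg: String)
  extends IllegalArgumentException(userErrorMsg)

object ParseErrorRouting {
  // parse: stand-in for the real parse function (hypothetical parameter)
  // log: stand-in for a logging call (hypothetical parameter)
  // Returns Right(result) on success, or Left(message for the user) on a syntax error.
  def handle(parse: String => String, input: String, log: String => Unit): Either[String, String] =
    Try(parse(input)) match {
      case Success(result) => Right(result)
      case Failure(ex: CustomParserException) =>
        log(ex.logErrorMsg)   // the log receives the message without the query text
        Left(ex.userErrorMsg) // the user receives the marked-up query
      case Failure(other) => throw other // unrelated failures are not swallowed
    }
}
```

One design point to keep in mind: because the exception passes userErrorMsg to IllegalArgumentException, getMessage still returns the user-facing text, so logging the exception object itself (or its stack trace) would record the query. Logging ex.logErrorMsg explicitly, as above, avoids that.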
Conclusion
Custom error handling is a very important aspect of building robust and user-friendly language processors with ANTLR. By implementing your own error listener, you can significantly improve the quality of error reporting, facilitate debugging, and ultimately create a better experience for those interacting with your language tools.