4 releases
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.1.3 | Dec 11, 2025 |
| 0.1.2 | Dec 4, 2025 |
| 0.1.1 | Dec 3, 2025 |
| 0.1.0 | Dec 3, 2025 |
#979 in Parser implementations
440KB, 12K SLoC
php-parser
A production-grade, fault-tolerant, zero-copy PHP parser written in Rust.
Features
- Zero-Copy AST: Uses bumpalo arena allocation for high performance, with no per-node heap allocations for AST nodes.
- Fault-Tolerant: Designed to never panic. It produces error nodes and recovers from syntax errors, making it suitable for IDEs and language servers.
- PHP 8.x Support: Targets compliance with modern PHP grammar.
- Safe: Handles mixed encodings and invalid UTF-8 gracefully by operating on byte slices (&[u8]).
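To illustrate why working on byte slices sidesteps encoding problems, here is a minimal std-only sketch. It is not this crate's lexer: the helper `scan_single_quoted` is hypothetical and only shows the general idea of borrowing sub-slices from the source without validating UTF-8.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not part of php-parser): returns the body of the
/// first single-quoted literal as a sub-slice of `src`, without copying
/// and without requiring valid UTF-8.
fn scan_single_quoted(src: &[u8]) -> Option<&[u8]> {
    let start = src.iter().position(|&b| b == b'\'')? + 1;
    let mut i = start;
    while i < src.len() {
        match src[i] {
            b'\\' => i += 2, // skip the backslash and the escaped byte
            b'\'' => return Some(&src[start..i]),
            _ => i += 1,
        }
    }
    None // unterminated literal
}

fn main() {
    // 0xFF never occurs in valid UTF-8, so a &str-based API would reject this input.
    let source: &[u8] = b"<?php echo 'caf\xFF';";
    assert!(std::str::from_utf8(source).is_err());

    // The byte-oriented scan still finds the literal and borrows it from `source`.
    let body = scan_single_quoted(source).expect("literal should be terminated");
    assert_eq!(body, b"caf\xFF");

    // Convert lossily only at the display boundary.
    let shown: Cow<str> = String::from_utf8_lossy(body);
    println!("{shown}");
}
```

Because the returned slice borrows from the input, no bytes are copied until something needs to be displayed.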
Install as library
cargo add php-parser
cargo add bumpalo
If you want to use the latest version from the GitHub repository, add this to your Cargo.toml:
[dependencies]
php-parser = { git = "https://github.com/wudi/php-parser" }
bumpalo = "3.19.0"
Usage
Here is a basic example of how to parse a PHP script:
use bumpalo::Bump;
use php_parser::lexer::Lexer;
use php_parser::parser::Parser;

fn main() {
    // The source code to parse (as bytes)
    let source = b"<?php echo 'Hello, World!';";

    // Create an arena for AST allocation
    let arena = Bump::new();

    // Initialize the lexer and parser
    let lexer = Lexer::new(source);
    let mut parser = Parser::new(lexer, &arena);

    let program = parser.parse_program();
    println!("{:#?}", program);

    // php_parser::span::with_session_globals(source, || {
    //     println!("{:#?}", program);
    // });
}
S-Expression Output
You can output the AST in S-expression format for easier visualization:
use bumpalo::Bump;
use php_parser::lexer::Lexer;
use php_parser::parser::Parser;
use php_parser::ast::sexpr::SExprFormatter;

fn main() {
    let code = "<?php class Foo extends Bar implements Baz { public int $p = 1; function m($a) { return $a; } }";
    let arena = Bump::new();
    let lexer = Lexer::new(code.as_bytes());
    let mut parser = Parser::new(lexer, &arena);
    let program = parser.parse_program();

    let mut formatter = SExprFormatter::new(code.as_bytes());
    formatter.visit_program(&program);
    let output = formatter.finish();
    println!("{}", output);
}
Gives the output:
(program
  (class "Foo" (extends Bar) (implements Baz)
    (members
      (property public int = (integer 1))
      (method "m" (params ())
        (body
          (return (variable "")))))))
Performance
Benchmark results for run-tests.php from php-src (about 140KB):
➜ php-parser git:(master) ✗ cargo run --release --bin bench_file -- run-tests.php
Finished `release` profile [optimized] target(s) in 0.05s
Running `target/release/bench_file run-tests.php`
Benchmarking: run-tests.php
File size: 139.63 KB
Warming up...
Running 200 iterations...
Profile written to profile.pb
Flamegraph written to flamegraph.svg
Total time: 134.267333ms
Average time: 671.336µs
Throughput: 203.11 MB/s
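As a sanity check on the figures above, the following sketch recomputes the average time and throughput from the reported total time, iteration count, and file size. The helper names are illustrative and not part of bench_file, and it assumes "KB" and "MB" in the output are binary units (1024 bytes and 1024 KB), which is the reading under which the numbers agree.

```rust
use std::time::Duration;

/// Average time per iteration (integer nanosecond division).
fn average(total: Duration, iterations: u32) -> Duration {
    total / iterations
}

/// Throughput in MiB/s for a file of `size_kib` KiB parsed in `per_iter`.
/// Assumption: the benchmark reports binary units.
fn throughput_mib_per_s(size_kib: f64, per_iter: Duration) -> f64 {
    (size_kib / 1024.0) / per_iter.as_secs_f64()
}

fn main() {
    // Values copied from the benchmark output above.
    let total = Duration::from_nanos(134_267_333);
    let avg = average(total, 200);
    let throughput = throughput_mib_per_s(139.63, avg);

    println!("Average time: {avg:?}");
    println!("Throughput: {throughput:.2} MB/s");
}
```

Both recomputed values round to the reported ones, so the output is internally consistent.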
Comparison with nikic/PHP-Parser v5.6.2 and tree-sitter-php:
| Parser | Language | Time (ms) |
|---|---|---|
| nikic/PHP-Parser | PHP | 33 |
| tree-sitter-php | C | 15 |
| php-parser | Rust | 0.67 |
Machine specs: Apple M1 Pro, 32GB RAM
Development
Running Tests
Run the full test suite:
cargo test
Snapshot Tests
This project uses insta for snapshot testing. If you make changes to the parser that affect the AST output, you may need to review and accept the new snapshots:
cargo test
cargo insta review
Corpus Testing
To verify stability against real-world codebases (like WordPress or Laravel), use the corpus test runner:
cargo run --release --bin corpus_test -- /path/to/php/project
Architecture
- Lexer: Operates on &[u8] and handles PHP's complex lexical modes (Scripting, DoubleQuote, Heredoc).
- Parser: Combines recursive descent with Pratt parsing for expressions.
- AST: All nodes are allocated in a Bump arena. Strings are stored as references to the original source (&'src [u8]) or arena-allocated slices.
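For readers unfamiliar with Pratt parsing, the following toy sketch shows the core idea on a byte slice: each infix operator has a left and right binding power, and the parser recurses only while the next operator binds tightly enough. This is not the crate's parser. The `Pratt` type, its binding powers, and the string-based S-expression output are assumptions made for illustration. It also returns an `(error)` node instead of panicking, in the spirit of the fault-tolerance goal.

```rust
/// Toy Pratt parser over bytes: integers, + - * / and parentheses.
/// Illustrative only; names and binding powers are not taken from php-parser.
struct Pratt<'src> {
    src: &'src [u8],
    pos: usize,
}

impl<'src> Pratt<'src> {
    fn new(src: &'src [u8]) -> Self {
        Pratt { src, pos: 0 }
    }

    /// Skips ASCII whitespace and returns the next byte without consuming it.
    fn peek(&mut self) -> Option<u8> {
        while matches!(self.src.get(self.pos), Some(b) if b.is_ascii_whitespace()) {
            self.pos += 1;
        }
        self.src.get(self.pos).copied()
    }

    /// (left, right) binding power; right > left makes operators left-associative.
    fn binding_power(op: u8) -> Option<(u8, u8)> {
        match op {
            b'+' | b'-' => Some((1, 2)),
            b'*' | b'/' => Some((3, 4)),
            _ => None,
        }
    }

    /// Parses an expression, consuming only operators whose left binding
    /// power is at least `min_bp`. Returns an S-expression string.
    fn expr(&mut self, min_bp: u8) -> String {
        // Prefix position: a number, a parenthesized expression, or an error node.
        let mut lhs = match self.peek() {
            Some(b'0'..=b'9') => {
                let start = self.pos;
                while matches!(self.src.get(self.pos), Some(b) if b.is_ascii_digit()) {
                    self.pos += 1;
                }
                String::from_utf8_lossy(&self.src[start..self.pos]).into_owned()
            }
            Some(b'(') => {
                self.pos += 1;
                let inner = self.expr(0);
                if self.peek() == Some(b')') {
                    self.pos += 1; // a missing ')' is tolerated
                }
                inner
            }
            _ => return "(error)".to_string(),
        };

        // Infix loop: keep extending `lhs` while operators bind tightly enough.
        while let Some(op) = self.peek() {
            let (l_bp, r_bp) = match Self::binding_power(op) {
                Some(bp) => bp,
                None => break,
            };
            if l_bp < min_bp {
                break;
            }
            self.pos += 1;
            let rhs = self.expr(r_bp);
            lhs = format!("({} {} {})", op as char, lhs, rhs);
        }
        lhs
    }
}

fn main() {
    let mut parser = Pratt::new(b"1 + 2 * 3 - 4");
    println!("{}", parser.expr(0)); // prints "(- (+ 1 (* 2 3)) 4)"
}
```

A recursive descent parser would typically call something like `expr(0)` wherever a statement expects an expression, which is one common way the two techniques are combined.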
License
This project is licensed under the MIT License. See the LICENSE file for details.
Dependencies
~41MB, ~1M SLoC