Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Page Not Found
Page not found. Your pixels are in another canvas.
Page not in menu
This is a page not in the main menu
Posts
What’s in Pass@K?
Published:
Pass@k is ubiquitous in evaluating reasoning models, but the metric is more subtle than it appears. Computing it correctly requires the unbiased estimator, and the nonlinearity of pass@k means it effectively upweights hard problems compared to pass@1.
Implementing Process Rewards in VeRL
Published:
Using process rewards in VeRL requires advantage estimators that preserve token-level structure. Most standard algorithms collapse rewards to scalars, defeating the purpose of fine-grained credit assignment.
Understanding Length Dynamics in RL Training
Published:
An empirical investigation into what drives output length growth during RL training, revealing that dataset difficulty composition is the primary driver behind the ‘overthinking’ phenomenon.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2 
