Closed Bug 1995939 Opened 1 month ago Closed 25 days ago

Assertion failure: argvWithoutThis[i].isUndefined(), at jit/VMFunctions.cpp:522

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
146 Branch
Tracking Status
firefox-esr140 --- unaffected
firefox144 --- unaffected
firefox145 + fixed
firefox146 + verified

People

(Reporter: decoder, Assigned: iain)

References

(Blocks 1 open bug, Regression)

Details

(5 keywords, Whiteboard: [bugmon:update,bisected,confirmed])

Attachments

(3 files)

The following testcase crashes on mozilla-central revision 20251021-371a2ab12ddf (debug build, run with --fuzzing-safe --ion-offthread-compile=off --trial-inlining-warmup-threshold=10):

d = function(a,b) {}
class e extends d {}
f = function() {}.bind()
for (i=0; i<10000; ++i) {
  g = f
  Reflect.construct(e, [], g)
}

Backtrace:

received signal SIGSEGV, Segmentation fault.
#0  0x0000555557c87e49 in js::jit::InvokeFunction(JSContext*, JS::Handle<JSObject*>, bool, bool, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) ()
#1  0x00002e4195b6e41e in ?? ()
#2  0x00007fffffffc660 in ?? ()
#3  0x0000000000000000 in ?? ()
rax	0x0	0
rbx	0x7fffffffc6c0	140737488340672
rcx	0x20a	522
rdx	0x7ffff7804563	140737345766755
rsi	0x0	0
rdi	0x7ffff7805700	140737345771264
rbp	0x7fffffffc640	140737488340544
rsp	0x7fffffffc440	140737488340032
r8	0x0	0
r9	0x3	3
r10	0x0	0
r11	0x293	659
r12	0x0	0
r13	0x1	1
r14	0xfff9800000000000	-1829587348619264
r15	0x8110a060ee8	8869275635432
rip	0x555557c87e49 <js::jit::InvokeFunction(JSContext*, JS::Handle<JSObject*>, bool, bool, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>)+2937>
=> 0x555557c87e49 <_ZN2js3jit14InvokeFunctionEP9JSContextN2JS6HandleIP8JSObjectEEbbjPNS3_5ValueENS3_13MutableHandleIS8_EE+2937>:	mov    %rcx,(%rax)
   0x555557c87e4c <_ZN2js3jit14InvokeFunctionEP9JSContextN2JS6HandleIP8JSObjectEEbbjPNS3_5ValueENS3_13MutableHandleIS8_EE+2940>:	call   0x555556f6c8d0 <abort>

Marking s-s due to JIT-related failure.

Attached file Detailed Crash Information β€”
Attached file Testcase β€”

bug 1995949 appears to be the same crash, but a different testcase - filed less than an hour later, but also it already has a regressor. Not sure what we do in those cases - decoder?

Flags: needinfo?(choller)
Duplicate of this bug: 1995949

Per duplicated bug likely regressed by bug 1993404.

Flags: needinfo?(choller)
Regressed by: 1993404

Set release status flags based on info from the regressing bug 1993404

:iain, since you are the author of the regressor, bug 1993404, could you take a look? Also, could you set the severity field?

For more information, please visit BugBot documentation.

Flags: needinfo?(iireland)

Verified bug as reproducible on mozilla-central 20251023094950-426d0bb3c451.
The bug appears to have been introduced in the following build range:

Start: 0038190e6a9199db4ab698ef89dd12d72121b2d1 (20251006201923)
End: e3f1fd706322ba441f886715b7d6dc5cf1e62c60 (20251006213253)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=0038190e6a9199db4ab698ef89dd12d72121b2d1&tochange=e3f1fd706322ba441f886715b7d6dc5cf1e62c60

Whiteboard: [bugmon:update,bisect] → [bugmon:update,bisected,confirmed]

This is a silly mistake in my patches for bug 1991223. Fortunately, I think everything works out in our favour to make it non-exploitable.

The bug is in this code in emitAllocateSpaceForConstructAndPushNewTarget. We have nargs (the number of arguments expected by the callee function) in a register. We want to round it down to the nearest odd number and allocate that many slots on the stack, then count down from nargs to argc (the number of arguments we are actually passing), pushing undefined for each missing argument. We don't have any spare registers, so we shift the value left to convert to the size of the allocated slots, and then shift right to convert back to an argument count. However, we don't undo the rounding! This means that we may push one fewer undefined value than expected.

What saves us in this case is another trick we do where we push newTarget twice. So even though we accidentally pass newTarget as the last argument, there's still a copy of it where we expect it in InvokeFunction. So in a non-debug build, this testcase will do the right thing, except that the second argument to d will be new.target, instead of undefined. That's not a security concern.

An equivalent bug is possible without calling into InvokeFunction. (We have to jump through some hoops so that new doesn't know its target, which would go down a different path).

class C {
  constructor(a,b) { this.b = b; }
}
class C2 {
  constructor(a,b) { this.b = b; }
}

var flip = false;
function foo() {
  let c = flip ? C : C2;
  flip = !flip;
  return new c(...arguments);
}
for (var i = 0; i < 2000; i++) {
  assertEq(foo().b, undefined);
}

Again, the bug here is simply that we pass new.target as the last argument, instead of undefined.

There's a corresponding problem in the equivalent code in the non-constructing case here, but that one's even less of a concern. In this case we round the number of arguments up, not down, and we don't have newTarget to worry about, so the net effect is just that we push one more undefined argument than we had to. That doesn't have any observable effects; it just means that our padding on the stack contains different bits than it otherwise might.

There are no security implications of leaking newTarget, since it can always be accessed by simply writing new.target.

I confirm that the testcase in bug 1995949 is hitting the same problem.

I'll have to poke at this code a bit to figure out the best fix. It would be much easier if we had one spare register...

Group: javascript-core-security
Flags: needinfo?(iireland)
Regressed by: 1991223
No longer regressed by: 1993404

Technically the change in emitAllocateSpaceForApply isn't necessary, because we'll end up storing undefined in the padding we've just allocated. However, loadFunctionArgCount is a load (guaranteed to hit in L1) and a shift, and we'd be doing a shift anyway. One L1 load is probably cheaper than an extra iteration of the undefined loop.

Assignee: nobody → iireland
Status: NEW → ASSIGNED
Blocks: sm-jits
Severity: -- → S3
Priority: -- → P1

Should we take this bug into account for a 145 uplift? Thanks

Flags: needinfo?(iireland)

I'd say it's optional. The fix is simple and safe. There are no security concerns of leaving this unfixed. There are potential correctness problems, but you have to write fairly contrived code to hit them. I would be moderately surprised if anybody ran into this in the wild, but it's not impossible.

I'll fill out the uplift form just in case we decide we want this.

Flags: needinfo?(iireland)

Actually, scratch that. The correctness concern here is that if we call a constructor in a contrived way and we don't pass enough arguments, then the last argument is filled in with new.target instead of undefined. To go wrong in the wild, you not only have to trigger the bug, you also have to care about the value of this argument, which is presumably optional since it's not being passed.

We haven't seen any problems in the wild on Nightly. I think it's okay for this to ride the trains.

Status: ASSIGNED → RESOLVED
Closed: 25 days ago
Resolution: --- → FIXED
Target Milestone: --- → 146 Branch

Verified bug as fixed on rev mozilla-central 20251028215909-ebd252259b1b.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Status: RESOLVED → VERIFIED
Keywords: bugmon

We've been getting reports from various Reddit users of Fx145 breakage with sites using Three.js. One of them mentioned that Beta 146 was working OK and was able to create a reduced testcase: https://github.com/mediaquell/planertest (warning: requires a lot of toggle button clicking to repro)

With that testcase, I was able to bisect the fix to this bug. I was also able to confirm that a Try build of Release with this fix included doesn't hit the bug anymore. I'm going to ask Reddit users to try this build out to verify with their sites also. But it appears that this bug is surfacing real-world breakage with a pretty widely-used library and we're going to want to backport into the first-available dot release.

Please nominate for Release approval, Iain.

Flags: needinfo?(iireland)
Flags: in-testsuite+

Comment on attachment 9522407 [details]
Bug 1995939: Don't clobber nargs before pushing undefined arguments r=jandem

Beta/Release Uplift Approval Request

  • User impact if declined/Reason for urgency: Potential breakage on sites using Three.js (and potentially others).
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch is small and straightforward: it simply reloads a value that might have been clobbered. It has been baking on Nightly/Beta for more than two weeks.
  • String changes made/needed: None
  • Is Android affected?: Yes
Flags: needinfo?(iireland)
Attachment #9522407 - Flags: approval-mozilla-release?
Keywords: enterprise

Comment on attachment 9522407 [details]
Bug 1995939: Don't clobber nargs before pushing undefined arguments r=jandem

Approved for 145.0.1

Attachment #9522407 - Flags: approval-mozilla-release? → approval-mozilla-release+
Severity: S3 → S2

Iain, the patch seems to be causing test failures on the release branch, could you look at it please? This is our driver bug for the unplanned 145.0.1 dot release. Thanks!
https://treeherder.mozilla.org/jobs?repo=mozilla-release&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&revision=d068d522017093c5f5857641e7b2501698edaa69&selectedTaskRun=TD3p9t7JTwaobdbg5DqR-g.0

Flags: needinfo?(iireland)

I find this very surprising. This patch has been baking in Nightly for weeks without causing any problems, and I don't see any code in the failing testcase that calls a constructor, which is necessary to trigger the underlying bug. It looks like most of the work is done in getCTM, which is implemented in C++.

I note that my patch landed on top of this patch for bug 1997351, which modifies getCTM. Is it possible that your bisection was confused because that patch is marked DONTBUILD?

Flags: needinfo?(iireland)

(In reply to Iain Ireland [:iain] from comment #21)
I note that my patch landed on top of this patch for bug 1997351, which modifies getCTM. Is it possible that your bisection was confused because that patch is marked DONTBUILD?

Thanks, good catch, this is indeed a lot more likely cause sorry for the noise!

Duplicate of this bug: 1999771
Duplicate of this bug: 2000828
No longer blocks: 1999771
Blocks: 1999771
No longer duplicate of this bug: 1999771
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: