Closed Bug 1975576 Opened 4 months ago Closed 25 days ago

Crash in [@ __if_indextoname] via NrIceCtx::StartGathering

Categories

(Core :: Security: Process Sandboxing, defect, P3)

Tracking

()

RESOLVED FIXED
145 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox-esr140 --- fixed
firefox143 --- wontfix
firefox144 --- fixed
firefox145 --- fixed

People

(Reporter: mccr8, Assigned: jld)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(3 files)

Crash report: https://crash-stats.mozilla.org/report/index/e16136bf-c417-46f0-bc84-ea6b30250703

Reason:

SIGSYS / SYS_SECCOMP

Top 10 frames:

0  libc.so.6  __GI___ioctl  /usr/src/debug/glibc/glibc/sysdeps/unix/sysv/linux/ioctl.c:36
1  libc.so.6  __if_indextoname  /usr/src/debug/glibc/glibc/sysdeps/unix/sysv/linux/if_index.c:231
2  libxul.so  set_ifname  dom/media/webrtc/transport/third_party/nICEr/src/stun/addrs-netlink.c:75
2  libxul.so  stun_convert_netlink  dom/media/webrtc/transport/third_party/nICEr/src/stun/addrs-netlink.c:133
2  libxul.so  stun_getaddrs_filtered  dom/media/webrtc/transport/third_party/nICEr/src/stun/addrs-netlink.c:251
3  libxul.so  nr_stun_get_addrs  dom/media/webrtc/transport/third_party/nICEr/src/stun/addrs.c:208
4  libxul.so  nr_stun_find_local_addresses  dom/media/webrtc/transport/third_party/nICEr/src/stun/stun_util.c:164
4  libxul.so  nr_ice_gather  dom/media/webrtc/transport/third_party/nICEr/src/ice/ice_ctx.c:871
5  libxul.so  mozilla::NrIceCtx::StartGathering(bool, bool)  dom/media/webrtc/transport/nricectx.cpp:934
6  libxul.so  mozilla::MediaTransportHandlerSTS::StartIceGathering(bool, bool, nsTArray<moz...  dom/media/webrtc/jsapi/MediaTransportHandler.cpp:920

This signature looks useless, but it isn't that uncommon, and the crashes all have this same stack. It is a null deref.

It looks like this crash is Nightly only. It first showed up in 140a, in the 20250523091654 build. The crashes are all in the socket process.

Only thing I can see that might be involved (and landed near the target 2025-05-23) is bug 1954423. Byron, any thoughts here?

Flags: needinfo?(docfaraday)

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 10 desktop browser crashes on nightly

For more information, please visit BugBot documentation.

Keywords: topcrash

So this is Nightly-only, across multiple versions of Nightly (140, 141, and 142). It seems it has never occurred on 140 beta/release or 141 beta. Bug 1954423 isn't restricted to Nightly in any way, so I don't see how it could be the cause, and the timing is still a few days off (bad builds start on May 23, with plenty of crashes, but bug 1954423 landed a few days before that).

A large majority of these are happening multiple times on the same install. Arch Linux is waaaaaay overrepresented (there is a large number of Fedora 42 crashes up until mid-June, and then it becomes almost entirely Arch). Maybe that's because users of most other distros don't end up using Nightly? Or maybe Fedora 42 and Arch users are the main Firefox users on Linux? Not sure about that.

Kernel build dates look pretty recent for each crash report; all of these users had updated their kernel within a month or two. Maybe there was a particular kernel patch that interacts poorly with Firefox, or maybe changed the signature of this crash?

Flags: needinfo?(docfaraday)

The severity field is not set for this bug.
:bwc, could you have a look please?

Flags: needinfo?(docfaraday)

This looks like it is probably a Linux bug, but I'm not totally sure. I'll assign a priority for now and leave the needinfo.

Assignee: nobody → docfaraday
Severity: -- → S3
Priority: -- → P3

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

Keywords: topcrash
Duplicate of this bug: 1982439

Copying crash signatures from duplicate bugs.

Crash Signature: [@ __if_indextoname] → [@ __if_indextoname] [@ __ioctl] [@ set_ifname]

(In reply to Andrew McCreight (out of office until 8/21) [:mccr8] from comment #0)

Reason:

SIGSYS / SYS_SECCOMP

This is a sandbox violation. On Nightly we crash (and paste the syscall number into the address field so it can be aggregated on; it's not really a null pointer) because there's currently no other way to get enough information; on other branches the syscall fails with ENOSYS. That can be changed in either direction with the MOZ_SANDBOX_CRASH_ON_ERROR env var.

These particular crashes, which are all in the socket process, seem to be SIOCGIFNAME; I can also see code to call SIOCGIFFLAGS (to identify point-to-point interfaces, which are assumed to be VPNs) and SIOCETHTOOL and SIOCGIWRATE (to estimate the speed of wired and wireless interfaces respectively). It would be possible to block those silently (not crash or log an error), but I assume we'd want to allow them. Given that this process already has direct network access, I don't think that's a significant concern for security.
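
For readers unfamiliar with these calls, here is a minimal sketch of the kind of lookups involved. It is not the nICEr code: `lookup_ifname` and `is_point_to_point` are hypothetical helpers written for illustration. The first wraps `if_indextoname`, which the stack in comment #0 shows reaching `ioctl` inside glibc; the second issues SIOCGIFFLAGS directly and checks `IFF_POINTOPOINT`.

```c
#define _DEFAULT_SOURCE /* expose struct ifreq and IFF_* in <net/if.h> */
#include <net/if.h>     /* if_indextoname, struct ifreq, IF_NAMESIZE */
#include <string.h>
#include <sys/ioctl.h>  /* ioctl, SIOCGIFFLAGS */
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: resolve an interface index to its name.
 * Returns 0 on success, -1 if the lookup fails (unknown index, or the
 * underlying ioctl being denied). */
int lookup_ifname(unsigned int ifindex, char name[IF_NAMESIZE]) {
  return if_indextoname(ifindex, name) != NULL ? 0 : -1;
}

/* Hypothetical helper: returns 1 if the named interface is point-to-point,
 * 0 if it is not, and -1 if the flags could not be read. */
int is_point_to_point(const char* name) {
  struct ifreq ifr;
  size_t len = strlen(name);
  int fd, rv;

  if (len >= IF_NAMESIZE) {
    return -1; /* would not fit in ifr_name with its terminator */
  }
  memset(&ifr, 0, sizeof(ifr));
  memcpy(ifr.ifr_name, name, len + 1);

  /* Any socket works as a handle for interface ioctls. */
  fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    return -1;
  }
  rv = ioctl(fd, SIOCGIFFLAGS, &ifr);
  close(fd);
  if (rv < 0) {
    return -1; /* e.g. no such interface, or the ioctl was blocked */
  }
  return (ifr.ifr_flags & IFF_POINTOPOINT) ? 1 : 0;
}
```

Both helpers report failure as -1, so a caller cannot distinguish a missing interface from a denied ioctl without also inspecting `errno`.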

What I don't understand is why this only recently started happening, and why it's relatively low-volume. The code in WebRTC that calls these ioctls isn't new, and hasn't WebRTC been using the socket process for a long time?

Crash Signature: [@ __if_indextoname] [@ __ioctl] [@ set_ifname] → [@ __if_indextoname] [@ __ioctl] [@ set_ifname]
Component: WebRTC: Networking → Security: Process Sandboxing

I have a patch to allow the ioctls in question (for the socket process only).
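
As a rough illustration of what "allowing the ioctls" means, the sketch below models an allow list keyed on the ioctl request number. It is not the patch itself: the real policy is built with bpf_dsl in SandboxFilter.cpp, and `socket_process_ioctl_verdict` and the `ioctl_verdict` enum are hypothetical names. The sketch assumes the allowed set is exactly the four requests named earlier in this bug, and it ignores whatever else the real policy permits.

```c
#include <linux/sockios.h> /* SIOCGIFNAME, SIOCGIFFLAGS, SIOCETHTOOL */

/* SIOCGIWRATE is defined in <linux/wireless.h>; this fallback uses the
 * wireless-extensions request number so that header is not required. */
#ifndef SIOCGIWRATE
#define SIOCGIWRATE 0x8B21
#endif

/* Hypothetical verdict type for this sketch. */
enum ioctl_verdict { IOCTL_DENY = 0, IOCTL_ALLOW = 1 };

/* Hypothetical stand-in for the socket process policy: allow only the
 * read-only interface queries discussed in this bug, deny everything else. */
enum ioctl_verdict socket_process_ioctl_verdict(unsigned long request) {
  switch (request) {
    case SIOCGIFNAME:  /* interface index -> name */
    case SIOCGIFFLAGS: /* interface flags, e.g. point-to-point */
    case SIOCETHTOOL:  /* wired link speed estimate */
    case SIOCGIWRATE:  /* wireless bit rate estimate */
      return IOCTL_ALLOW;
    default:
      return IOCTL_DENY;
  }
}
```

Under this model a request such as SIOCSIFFLAGS, which changes interface flags rather than reading them, falls through to the deny branch.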

Assignee: docfaraday → jld
Flags: needinfo?(docfaraday)
See Also: → 1990721
Pushed by jedavis@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/970e2f65d62d
https://hg.mozilla.org/integration/autoland/rev/03c8c342964c
Allow ioctls used by WebRTC for interface info in Linux socket process sandbox. r=gcp
Status: NEW → RESOLVED
Closed: 25 days ago
Resolution: --- → FIXED
Target Milestone: --- → 145 Branch

While the crashes are Nightly-only, is this worth backporting anywhere to avoid the sandbox violations? Not sure what the impact is for those users.

firefox-beta Uplift Approval Request

  • User impact if declined: Probable WebRTC flakiness for some Linux users under some circumstances. This isn't a crash on non-Nightly, but the sandbox denial could affect functionality.
  • Code covered by automated testing: no
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: Testing note: we don't have STR (see discussion in bug 1990721 for details), but Nightly crash reports stopped after the patch landed
  • Risk associated with taking this patch: low
  • Explanation of risk level: The patch just adds a few things to an allow list in the sandbox policy, so I'm not really concerned about functional regression, and the operations allowed are things that it's reasonable for the socket process to do.
  • String changes made/needed: none
  • Is Android affected?: no
Attachment #9517667 - Flags: approval-mozilla-beta?
Flags: needinfo?(jld)
Keywords: regression
Regressed by: 1302711

firefox-esr140 Uplift Approval Request

  • User impact if declined: Probable WebRTC flakiness for some Linux users under some circumstances. This isn't a crash on non-Nightly, but the sandbox denial could affect functionality.
  • Code covered by automated testing: no
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: Testing note: we don't have STR (see discussion in bug 1990721 for details), but Nightly crash reports stopped after the patch landed
  • Risk associated with taking this patch: low
  • Explanation of risk level: The patch just adds a few things to an allow list in the sandbox policy, so I'm not really concerned about functional regression, and the operations allowed are things that it's reasonable for the socket process to do.
  • String changes made/needed: none
  • Is Android affected?: no
Attachment #9517668 - Flags: approval-mozilla-esr140?
Attachment #9517667 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9517668 - Flags: approval-mozilla-esr140? → approval-mozilla-esr140+
Attachment #9517668 - Attachment description: Bug 1975576 - Allow ioctls used by WebRTC for interface info in Linux socket process sandbox. r?gcp → Bug 1975576 - Allow ioctls used by WebRTC for interface info in Linux socket process sandbox. r=gcp

Backed out for causing build bustages

  • Backout link
  • Push with failures
  • Failure Log
  • Failure line: /builds/worker/checkouts/gecko/security/sandbox/linux/SandboxFilter.cpp:X:14: error: 'class sandbox::bpf_dsl::Caser<long unsigned int>' has no member named 'Cases'; did you mean 'Case'?
Flags: needinfo?(jld)

Sorry about that; the patch has a dependency on bug 1937025. The fix should just be replacing ….Cases({…},…) with ….CASES((…),…) to work with the old bpf_dsl API, so that shouldn't add any risk to the uplift. I'm testing that now.

The esr140 uplift is updated.

Flags: needinfo?(jld)