Closed Bug 1747277 Opened 2 years ago Closed 2 years ago

src/layout/base/nsRefreshDriver.cpp:2439:40: runtime error: -2254.08 is outside the range of representable values of type 'unsigned int'

Categories

(Core :: Graphics: ImageLib, defect)

defect

Tracking

()

RESOLVED FIXED
106 Branch
Tracking Status
firefox97 --- wontfix
firefox106 --- fixed

People

(Reporter: tsmith, Assigned: tnikkel)

References

(Blocks 3 open bugs)

Details

(Keywords: csectype-undefined)

Attachments

(3 files)

Found while fuzzing m-c 20211217-ba22a155be2e (--enable-undefined-sanitizer --enable-fuzzing)

This was found by enabling the float-cast-overflow check in UBSan. This type of issue can create inconsistencies across platforms, architectures and optimization levels.

To enable this check add the following to your mozconfig:

ac_add_options --enable-undefined-sanitizer="float-cast-overflow"
src/layout/base/nsRefreshDriver.cpp:2439:40: runtime error: -2254.08 is outside the range of representable values of type 'unsigned int'
    #0 0x7efc88820a13 in nsRefreshDriver::Tick(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp, nsRefreshDriver::IsExtraTick) src/layout/base/nsRefreshDriver.cpp:2439:40
    #1 0x7efc8882ff46 in mozilla::RefreshDriverTimer::TickDriver(nsRefreshDriver*, mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp) src/layout/base/nsRefreshDriver.cpp:348:13
    #2 0x7efc8882ff46 in mozilla::RefreshDriverTimer::TickRefreshDrivers(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&) src/layout/base/nsRefreshDriver.cpp:326:7
    #3 0x7efc8882fb85 in mozilla::RefreshDriverTimer::Tick(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp) src/layout/base/nsRefreshDriver.cpp:342:5
    #4 0x7efc8882f6a5 in mozilla::VsyncRefreshDriverTimer::RunRefreshDrivers(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp) src/layout/base/nsRefreshDriver.cpp:780:5
    #5 0x7efc8882ed71 in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp) src/layout/base/nsRefreshDriver.cpp:703:16
    #6 0x7efc8882e0ae in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyParentProcessVsync() src/layout/base/nsRefreshDriver.cpp:620:7
    #7 0x7efc8882db85 in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::VsyncEvent const&) src/layout/base/nsRefreshDriver.cpp:541:9
    #8 0x7efc879371d3 in mozilla::dom::VsyncMainChild::RecvNotify(mozilla::VsyncEvent const&, float const&) src/dom/ipc/VsyncMainChild.cpp:68:15
    #9 0x7efc81b030d0 in mozilla::dom::PVsyncChild::OnMessageReceived(IPC::Message const&) src/objdir-ff-ubsan/ipc/ipdl/PVsyncChild.cpp:208:54
    #10 0x7efc8169dc08 in mozilla::ipc::PBackgroundChild::OnMessageReceived(IPC::Message const&) src/objdir-ff-ubsan/ipc/ipdl/PBackgroundChild.cpp:6081:32
    #11 0x7efc8103a888 in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) src/ipc/glue/MessageChannel.cpp:2043:25
    #12 0x7efc81037097 in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) src/ipc/glue/MessageChannel.cpp:1968:9
    #13 0x7efc810388ef in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) src/ipc/glue/MessageChannel.cpp:1827:3
    #14 0x7efc8103933d in mozilla::ipc::MessageChannel::MessageTask::Run() src/ipc/glue/MessageChannel.cpp:1855:14
    #15 0x7efc7fabd596 in mozilla::RunnableTask::Run() src/xpcom/threads/TaskController.cpp:468:16
    #16 0x7efc7fa7e669 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:771:26
    #17 0x7efc7fa7babe in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:607:15
    #18 0x7efc7fa7c2e4 in mozilla::TaskController::ProcessPendingMTTask(bool) src/xpcom/threads/TaskController.cpp:391:36
    #19 0x7efc7faaf5c9 in mozilla::TaskController::InitializeInternal()::$_0::operator()() const src/xpcom/threads/TaskController.cpp:124:37
    #20 0x7efc7faaf5c9 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() src/xpcom/threads/nsThreadUtils.h:531:5
    #21 0x7efc7fa9ad38 in nsThread::ProcessNextEvent(bool, bool*) src/xpcom/threads/nsThread.cpp:1183:16
    #22 0x7efc7faa41d6 in NS_ProcessNextEvent(nsIThread*, bool) src/xpcom/threads/nsThreadUtils.cpp:467:10
    #23 0x7efc81043ad2 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:85:21
    #24 0x7efc81045252 in mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:268:30
    #25 0x7efc80ea3235 in MessageLoop::RunInternal() src/ipc/chromium/src/base/message_loop.cc:331:10
    #26 0x7efc80ea3235 in MessageLoop::RunHandler() src/ipc/chromium/src/base/message_loop.cc:324:3
    #27 0x7efc80ea3235 in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:306:3
    #28 0x7efc882ae59d in nsBaseAppShell::Run() src/widget/nsBaseAppShell.cpp:137:27
    #29 0x7efc8d092cbb in XRE_RunAppShell() src/toolkit/xre/nsEmbedFunctions.cpp:864:20
    #30 0x7efc81045231 in mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:235:9
    #31 0x7efc80ea3235 in MessageLoop::RunInternal() src/ipc/chromium/src/base/message_loop.cc:331:10
    #32 0x7efc80ea3235 in MessageLoop::RunHandler() src/ipc/chromium/src/base/message_loop.cc:324:3
    #33 0x7efc80ea3235 in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:306:3
    #34 0x7efc8d091e63 in XRE_InitChildProcess(int, char**, XREChildData const*) src/toolkit/xre/nsEmbedFunctions.cpp:701:34
    #35 0x7efc8d0a6050 in mozilla::BootstrapImpl::XRE_InitChildProcess(int, char**, XREChildData const*) src/toolkit/xre/Bootstrap.cpp:67:12
    #36 0x5562653f207a in content_process_main(mozilla::Bootstrap*, int, char**) src/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
    #37 0x5562653f2451 in main src/browser/app/nsBrowserApp.cpp:327:18
    #38 0x7efca8ebfbf6 in __libc_start_main /build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c:310
    #39 0x556265341138 in _start (src/objdir-ff-ubsan/dist/bin/firefox+0xe5138)

(I'm reclassifying this under Layout::Images since this has to do with the image visibility-tracking code, ImageStartTable mStartTable;.)

So the error here is:
runtime error: -2254.08 is outside the range of representable values of type 'unsigned int'
which is UBSan telling us that we're trying to put a negative floating-point value into a (non-negative) unsigned int.

The line of code is https://searchfox.org/mozilla-central/rev/361f258f46af4b9c881be81d1291000827c15704/layout/base/nsRefreshDriver.cpp

      uint32_t prevMultiple = uint32_t(prev.ToMilliseconds()) / delay;

The only floating-point expression in that line is prev.ToMilliseconds(), so that must be the thing that's unexpectedly-negative here.

Here's a bit more context, showing a bit more where this value comes from:

void nsRefreshDriver::Tick(VsyncId aId, TimeStamp aNowTime,
                           IsExtraTick aIsExtraTick /* = No */) {
[...]
  TimeStamp previousRefresh = mMostRecentRefresh;
  mMostRecentRefresh = aNowTime;
[...]
  for (const auto& entry : mStartTable) {
    const uint32_t& delay = entry.GetKey();
    ImageStartData* data = entry.GetWeak();

    if (data->mStartTime) {
      TimeStamp& start = *data->mStartTime;
      TimeDuration prev = previousRefresh - start;
[...]
      uint32_t prevMultiple = uint32_t(prev.ToMilliseconds()) / delay;
[...]
    } else {
[...]
      data->mStartTime.emplace(aNowTime);

Since prev.ToMilliseconds() is negative here, that means previousRefresh must be less-than (i.e. an older timestamp) than start.

So that means we're entering nsRefreshDriver::Tick with mMostRecentRefresh being less-than (older-than) than the mStartTime of some entry in mStartTable.

This is weird/unexpected, since it looks like mStartTime only gets emplaced in this one spot, and it's emplaced with aNowTime which is what mMostRecentRefresh gets set to at the beginning of the tick. So I would expect that anything in mStartTable should have a mStartTime that is either equal to the previous tick's mMostRecentRefresh, i.e. equal to previousRefresh (if mStartTime got emplaced on the most recent tick), or it should be older than that (if it got emplaced on an earlier tick). It's not clear how it could be newer than that.

But in this case, based on the 2254 ToMilliseconds() value, it looks like it's somehow ~2 seconds newer. I'm not sure how that would happen.

Component: Layout → Layout: Images, Video, and HTML Frames

One way this might conceivably happen is if aNowTime went backwards, i.e. if we get a series of calls like:

  • Call #1 to Tick(): We hit the code that I quoted above and we call mStartTime.emplace(aNowTime) for an entry in mStartTable.
  • Call #2 to Tick(): For $WEIRD_REASON, this second tick receives an older aNowTime than the previous call. This gets stored in mMostRecentRefresh. Notably, this puts us into a situation where mMostRecentRefresh is less than a mStartTime that we previously emplaced in mStartTable.
  • Call #3 to Tick(): We just proceed through the function, copying the ~bogus mMostRecentRefresh into previousRefresh. We end up producing a negative value in prev when we subtract mStartTime from it, which we then convert to uint32_t and trip this UBSan error.

Normally we wouldn't expect Call#2 to proceed since nsRefreshDriver has an early return if it detects time going backwards, here:

  if ((aNowTime <= mMostRecentRefresh) && !mTestControllingRefreshes &&
      aIsExtraTick == IsExtraTick::No) {
    return;

https://searchfox.org/mozilla-central/rev/361f258f46af4b9c881be81d1291000827c15704/layout/base/nsRefreshDriver.cpp#2154

However, the extra conditions there are interesting. It looks like it's conceivable we could get into this situation if either of those conditions are true.

Given that this was detected during fuzzing, I'm especially suspicious of the mTestControllingRefreshes bool there. I think that gets set by the nsIDOMWindowUtils::advanceTimeAndRefresh privileged JS API. Tyson, do you know if the fuzzer might've been using that nsIDOMWindowUtils API at all?

(If not, then it's also conceivable this could be associated with the aIsExtraTick thing, too.)

Flags: needinfo?(twsmith)
Attached file testcase.html

Working on getting a Pernosco session. The testcase is not 100% reliable so it may take a few hours.

A Pernosco session is available here: https://pernos.co/debug/S4n2RMYeDVIQkC-zVH57Bg/index.html

Flags: needinfo?(twsmith)

This is some nested event-loop nastiness; ticking the refresh driver during a refresh driver tick.

Here's the moment where, in a regular non-nested call to nsRefreshDriver::Tick(), we end up setting previousRefresh to a value that is totally reasonable but ultimately doomed to be too small (8945323569456292):
https://pernos.co/debug/S4n2RMYeDVIQkC-zVH57Bg/index.html#f{m[B+n+,AsVL_,t[5g,DaI_,f{e[B+n+,Ar6A_,s{abghnq3AA,bAWo,uPEMYuA,oPHuBxw___/

Slightly later in this function, we call RunFrameRequestCallbacks, which (several layers down) gets us into nsGlobalWindowInner::PrintPreview(). This ends up calling mozilla::SpinEventLoopUntil, which gets us into a nested call to nsRefreshDriver::Tick with a newer aNowTime than the one that were dealing with in the previous paragraph's Tick() call (which we're still inside of, many stack levels up).

During this nested Tick(), we call data->mStartTime.emplace(aNowTime); to set up mStartTime on an ImageStartData object, using the newer aNowTime from this innermost refresh driver tick. That happens here:
https://pernos.co/debug/S4n2RMYeDVIQkC-zVH57Bg/index.html#f{m[B+9D,BM1s_,t[5g,DaI_,f{e[B+6P,AtUS_,s{abghnq3AA,bAWo,uPEMYuA,oPHuBxw___/

Then, we complete this inner Tick() and eventually unwind the stack back up to the outer Tick() call -- the one with the "old" previousRefresh local variable. This call proceeds up to the mStartTable-loop (the loop that I quoted in comment 1), and it discovers the data->mStartTime that we emplaced quite-recently during the nested event loop, and indeed data->mStartTime is newer than our previousRefresh value. So we end up with a negative value (in prev) when we subtract them.

Here's the moment where we trip the UBSan error, with previousRefresh being that too-small value and prev having a negative value:
https://pernos.co/debug/S4n2RMYeDVIQkC-zVH57Bg/index.html#f{m[CRlG,AWg_,t[5g,DaI_,f{e[B+n+,Ar6A_,s{abghnq3AA,bAWo,uPEMYuA,oPHuBxw___/

We could conceivably apply a band-aid here by clamping prev to be non-negative, to avoid the undefined behavior; but perhaps we should do something even more useful for this case.

This mStartTable stuff is kind-of ancient, added back in bug 867758, and it's not clear to me (without further digging) what the right behavior for it is in this case.

tnikkel, could I kick this to you to sort out what the right behavior is for this case where data->mStartTime gets emplaced in a nested event loop? (producing a negative prev value in the outer event loop)

Flags: needinfo?(tnikkel)
Component: Layout: Images, Video, and HTML Frames → ImageLib
Depends on: 867758

This issue is currently triggered in CI when the 'float-cast-overflow' UBSan check is enabled. This issue will need to be addressed before the check can be enabled by default.

If it requires too much effort to fix immediately please ni? me and let me know. If necessary it will be added to a suppression list. Thank you :)

Assignee: nobody → tnikkel
Status: NEW → ASSIGNED
Pushed by tnikkel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c0c160adb739
Skip some code with starting animated images in the refresh driver when we don't need it. r=dholbert
https://hg.mozilla.org/integration/autoland/rev/d7ead67691b1
Handle an unusual case when starting animated images in the refresh driver.  r=dholbert
Flags: needinfo?(tnikkel)
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 106 Branch
Regressions: 1787787
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: