Open Bug 1373586 Opened 7 years ago Updated 2 years ago

Assertion in LineBreakBefore due to ruby-related textrun failure

Categories

(Core :: Layout: Text and Fonts, defect, P3)

defect

Tracking

()

REOPENED
Tracking Status
firefox-esr45 --- unaffected
firefox-esr52 --- wontfix
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox69 --- wontfix
firefox70 --- wontfix

People

(Reporter: ananuti, Unassigned)

References

Details

(Keywords: testcase)

Attachments

(2 files)

This bug was filed from the Socorro interface and is 
report bp-02ab3f9f-87a6-41d4-ad2b-6d5010170616.
=============================================================

STR:
1. load https://nanake.bitbucket.io/tests/wagahaiwanekodearu.html

Bisect says

The first bad revision is:
@  changeset:   352910:8d6560b363d2
|  user:        Kan-Ru Chen <kanru@kanru.info>
~  date:        Mon Oct 24 22:00:03 2016 +0800
   files:       layout/generic/nsTextFrameUtils.cpp
   description:
   Bug 1081858 - Part 1. Fix aText off-by-one indexing. r=jfkthame

   MozReview-Commit-ID: 6LAlEntU6C7
I can reproduce this. It looks like nsTextFrame::EnsureTextRun cannot construct a text run so we got a nullptr when trying to use mTextRun later.

Might be related to bug 1371351 which has similar symptom that we were unable to build text run and rebuild it over and over again.
See Also: → 1371351, 1369985
I tried to run this with rr. It seems in BuildTextRunsScanner::BuildTextRunForFrames we always ended up with mSkipIncompleteTextRuns=true so textRun is set to nullptr.

TextContainsLineBreakerWhiteSpace looks suspicious because it only checks nsLineBreaker::IsSpace(aChar) || aChar == 0x0A;
I have a patch for bug 1369985 which should also fix this.
Depends on: 1369985
Attached file simplified testcase
This is the simplified testcase from the original page, which the new crashtest being added in bug 1369985 would base on.

I found it pretty hard to further simplify this testcase, and TBH it is not completely clear why rbc frame would fail to get the text run of its children only under these specific conditions.
Hmmm, it seems I can still reproduce this with my patch in bug 1369985 on Windows at least, on top of the current m-c.
But the failure becomes intermittent :/
There seems to be a different crash involves now... Although the final result is the same that the ruby frame fails to get text run. We probably should add a null check there anyway...
OK, so I guess another issue here is similar to that of bug 1369985, but probably related to whitespace trimming. I can reduce the testcase to another form that the crash happens within two paragraphs, and the stripped whitespace is at a line break. But this is not reliably reproducible, and I have no idea why...
It doesn't reproduce reliably. The crash can happen after several refresh.
jfkthame, any thought? I guess some of callsites of IsTrimmableSpace may be suspicious... But I don't quite have direction where to look.
Flags: needinfo?(jfkthame)
(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #7)
> There seems to be a different crash involves now... Although the final
> result is the same that the ruby frame fails to get text run. We probably
> should add a null check there anyway...

Yes, I think we should always null-check the result of GetTextRun(), as there may be edge cases where EnsureTextRun fails for some reason. IIRC most call-sites have such a check already, but it's missing here.

Then it would be interesting to see if this can be converted into a testcase that doesn't crash, but shows incorrect rendering due to the failure to create the textrun.

(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #10)
> jfkthame, any thought? I guess some of callsites of IsTrimmableSpace may be
> suspicious... But I don't quite have direction where to look.

Offhand I'm not sure either, but leaving needinfo? for now as a reminder to have a look at this.
bp-dbf12fbf-1762-4383-b640-f05810170620 
The matter's not resolved. The culprit's still at large.

It crashes both link in comment 0 and attachment 8879432 [details].
Bug 1369985 seems to fix only the crashtest in attachment 8878858 [details].
attachment 8879432 [details] is the second testcase extracted from the large page mentioned in comment 0.

There might even be other patterns involve the link in comment 0 after the second testcase gets fixed. No one knows.

We probably should just add the null check here, optionally with a MOZ_ASSERT, so that we can catch such text run failure in debug build, but keep release build robust.
Depends on: 1374625
Flags: needinfo?(aschen)
Priority: -- → P2
Flags: needinfo?(aschen)
Is this still happening on Nightly builds? I don't see any recent occurrences in crash-stats, afaict.
Flags: needinfo?(jfkthame)
The crash itself has been protected by bug 1374625 so there certainly no longer exists any new crashes because of this. However, if you open the second attached testcase in a debug build, it should still assert at the same place where it fails to build the text run.
Flags: needinfo?(jfkthame)
Given that this no longer represents a crash risk in release builds (thanks to bug 1374625), downgrading from P2/critical to P3/normal. It's still a bug we should try to diagnose and fix, however.
Severity: critical → normal
Flags: needinfo?(jfkthame)
OS: Windows 10 → All
Priority: P2 → P3
Hardware: x86 → All
Summary: Crash in LineBreakBefore → Assertion in LineBreakBefore due to ruby-related textrun failure
Version: 52 Branch → Trunk

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

Closing because no crashes reported for 12 weeks.

Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → WORKSFORME

How can I stop bugbug from closing it?

Status: RESOLVED → REOPENED
Flags: needinfo?(sledru)
Resolution: WORKSFORME → ---

Yeah, bugbug shouldn't close bugs, specially with test-cases.

Keywords: regressiontestcase

This is an autonag rule. No crash => closed. And I disagree with you... I don't see the point of keeping this kind of bug open.

Flags: needinfo?(sledru)

The underlying issue is not fixed, and it is still something we may want to investigate. I guess I can just remove the crash signature so that the bot wouldn't close it based on that.

Crash Signature: [@ LineBreakBefore]

And probably the keyword.

Keywords: crash
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: