Closed Bug 1304602 Opened 9 years ago Closed 8 years ago

Crash in InvalidArrayIndex_CRASH | nsUrlClassifierPrefixSet::GetPrefixesNative

Categories

(Toolkit :: Safe Browsing, defect, P3)

Unspecified
macOS
defect

Tracking

()

RESOLVED DUPLICATE of bug 1332218

People

(Reporter: n.nethercote, Assigned: hchang)

References

Details

(Keywords: crash, csectype-bounds)

Crash Data

This bug was filed from the Socorro interface and is report bp-16a55647-e60c-4947-ab78-004a02160919. ============================================================= An nsTArray bounds check failure crash. Four occurrences in the past week, 3 in Firefox and one in Fennec. In all cases we accessed one past the end of a very long array. gcp, can you please take a look?
Flags: needinfo?(gpascutto)
Priority: -- → P1
Group: toolkit-core-security
Keywords: csectype-bounds
Please file invalid array accesses as security bugs, as they can be buffer overflows in versions earlier than Fx 50.
I don't believe this is a security bug. The data comes from trusted HTTPS sources or from the local disk or memory, so it's not under attacker control. In fact, it looks likely that the crashes seen here might be due to disk or memory corruption.
Flags: needinfo?(gpascutto)
Group: toolkit-core-security
Assignee: nobody → hchang
I can take a look at this.
(In reply to Gian-Carlo Pascutto [:gcp] from comment #2) > I don't believe this is a security bug. The data comes from trusted HTTPS > sources or from the local disk or memory, so it's not under attacker control. > > In fact, it looks likely that the crashes seen here might be due to disk or > memory corruption. It seems the crashed was caused by unmatched |mTotalPrefixes| and |mIndexDeltas| in [0], which are initialized in |nsUrlClassifierPrefixSet::LoadPrefixes|. If the disk is corrupted, |LoadPrefixes| will return error, which leaves |mPrimed| false [1]. If |mPrimed| is false, |nsUrlClassifierPrefixSet::GetPrefixSetNative| will never be called [2]. So, I conclude that 1) The crash wasn't caused by any disk corruption that we can identify. 2) |mTotalPrefixes| and |mIndexDeltas| are supposed to be always matched no matter what values are read from file according to [3]. (mTotalPrefixes will get increased only when mIndexDeltas gets the same amount of increment.) So, I haven't found any clue that we can blame the disk corruption. gcp, what do you think? Am I missing something? Thanks :) [0] https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l145 [1] https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/LookupCache.cpp#l623 [2] https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/LookupCache.cpp#l648 [3] https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l395
Flags: needinfo?(gpascutto)
(In reply to Henry Chang [:henry][:hchang] from comment #4) > If the disk is corrupted, |LoadPrefixes| will return error, which leaves |mPrimed| false > [1]. We'll only get an error when we *detect* the corruption, though. While checking this, I noticed that the NS_ENSURE_TRUE(read == toRead, NS_ERROR_FAILURE); lines will mean we won't reset the database when we fail to read the expected amount of prefixes. They should probably return NS_ERROR_FILE_CORRUPTED, because there's no way to sensibly proceed without prefix data. That said... > 2) |mTotalPrefixes| and |mIndexDeltas| are supposed to be always matched no > matter > what values are read from file according to [3]. (mTotalPrefixes will get > increased only when mIndexDeltas gets the same amount of increment.) ...that sounds right. So do we have a problem if numInDelta == 0? https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l159 This advances prefixCnt even if mIndexDeltas[i] is 0. But https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l393 would not increment mTotalPrefixes in that case. GetPrefixesNative cues it's allocation off of that variable, so that would cause this bug, wouldn't it?
Flags: needinfo?(gpascutto)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #5) > (In reply to Henry Chang [:henry][:hchang] from comment #4) > > If the disk is corrupted, |LoadPrefixes| will return error, which leaves |mPrimed| false > > [1]. > > We'll only get an error when we *detect* the corruption, though. While > checking this, I noticed that the > NS_ENSURE_TRUE(read == toRead, NS_ERROR_FAILURE); > > lines will mean we won't reset the database when we fail to read the > expected amount of prefixes. They should probably return > NS_ERROR_FILE_CORRUPTED, because there's no way to sensibly proceed without > prefix data. > Even if we just return NS_ERROR_FAILURE, mPrimed is still false so that |GetPrefixNative| will never be called. > That said... > > > 2) |mTotalPrefixes| and |mIndexDeltas| are supposed to be always matched no > > matter > > what values are read from file according to [3]. (mTotalPrefixes will get > > increased only when mIndexDeltas gets the same amount of increment.) > > ...that sounds right. > > So do we have a problem if numInDelta == 0? > > https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/ > components/url-classifier/nsUrlClassifierPrefixSet.cpp#l159 > > This advances prefixCnt even if mIndexDeltas[i] is 0. But > > https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/ > components/url-classifier/nsUrlClassifierPrefixSet.cpp#l393 > > would not increment mTotalPrefixes in that case. GetPrefixesNative cues it's > allocation off of that variable, so that would cause this bug, wouldn't it? Assume we have prefix 0 to 14 which are structured like the following: 0: [1, 2, 3, 4, 5] 6: [7, 8, 9] 10: [] 11: [12, 13, 14] where mIndexPrefixes: [0, 6, 10, 11] mIndexDeltas[0]: [1, 2, 3, 4, 5] mIndexDeltas[1]: [7, 8, 9] mIndexDeltas[2]: [] mIndexDeltas[3]: [12, 13, 14] For the "numInDelta == 0" case, in https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l159 |prefixCnt| gets increased because of mIndexPrefixes[2], which is 10, but not the associated deltas As for in https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l393 |mIndexDeltas| will be |SetLength|ed to 4 and only mIndexDeltas[0],[1],[3] will be |SetLength|ed to "5,3,3" respectively. The only chance that might cause issue is |nsTArray::SetLength| only guarantees the allocated elements initialized with default ctor. (But the mIndexDeltas should be empty while loading from file....) I'll probably look at if we do something wrong with updating |mIndexDeltas| and |mTotalPrefixes|. Thanks!
>Even if we just return NS_ERROR_FAILURE, mPrimed is still false so that |GetPrefixNative| >will never be called. Yes, but you don't necessarily will return NS_ERROR_FAILURE even if the data is garbage. The .pset file has no checksum, unlike HashStore. As said it doesn't look like that can be the issue here though. There is no bug for the code I pointed out because: https://hg.mozilla.org/releases/mozilla-aurora/annotate/696a981b6d53/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l373 mTotalPrefixes = indexSize; accounts for the "empty" mIndexDeltas[x] entries.
I am revisiting this bug. Looking at the uptime in crash reports, crashes all happened after booting up a while. Since we only load prefix set when the browser just boots up, the crash is less likely to be caused by the corrupted file. I will be focusing on the prefix set construction from HTTPS rather the from file.
Still totally have no idea :( GetPrefixesNative() returns an array which is derived from |mTotalPrefixes|, |mIndexPrefixes| and |mIndexDeltas| Those three variables will only be set in |LoadPrefixes| and |MakePrefixSet|. |LoadPrefixes| only happens in the very beginning. After we receive and apply table updates, those three variables will come from |MakePrefixSet|. Given that the up-time is mostly a couple of hours, it's very unlikely that |LoadPrefixes| is the source of the crash because |MakePrefixSet| will override those three variables after update. Looking at |MakePrefixSet|, we have a for loop with aLength-1 iteration so we exactly add "aLength-1" elements to |mIndexPrefixes| and |mIndexDeltas|. Considering the addition outside the for loop, we have exact |aLength| elements in |mIndexPrefixes| and |mIndexDeltas|, where |aLength| is just |mTotalPrefixes|. This is guaranteed. Besides, GetPrefixesNative() loops through |mIndexPrefixes| and |mIndexDeltas| to add elements to the output array. No matter how |mIndexDeltas| is off by |mIndexPrefixes|, we should never overflow the array since the total number of elements in |mIndexPrefixes| and |mIndexDeltas| is the output array length. (Instead, the output array can only be not fully filled.) Some other less possible cause might be: 1) |mTotalPrefixes| is unexpectedly overridden by a buffer overflow in some places. 2) |mTotalPrefixes|, |mIndexPrefixes| and |mIndexDeltas| are set in somewhere I was not aware. What we can do to avoid crash right now is to add a guard in GetPrefixesNative. That is, if the output array is about to overflow, return error and the error will be propagated to have the database reset. Adding this guard will effectively avoid the crash but it hides the potential or even serious error. I don't know if this is a good idea.
(In reply to Henry Chang [:henry][:hchang] from comment #9) > Still totally have no idea :( > > GetPrefixesNative() returns an array which is derived from > > |mTotalPrefixes|, |mIndexPrefixes| and |mIndexDeltas| > > Those three variables will only be set in |LoadPrefixes| and |MakePrefixSet|. > |LoadPrefixes| only happens in the very beginning. After we receive and apply > table updates, those three variables will come from |MakePrefixSet|. > > Given that the up-time is mostly a couple of hours, it's very unlikely that > |LoadPrefixes| is the source of the crash because |MakePrefixSet| will > override > those three variables after update. > > Looking at |MakePrefixSet|, we have a for loop with aLength-1 iteration so > we exactly add "aLength-1" elements to |mIndexPrefixes| and |mIndexDeltas|. > Considering the addition outside the for loop, we have exact |aLength| > elements > in |mIndexPrefixes| and |mIndexDeltas|, where |aLength| is just > |mTotalPrefixes|. > This is guaranteed. > > Besides, GetPrefixesNative() loops through |mIndexPrefixes| and > |mIndexDeltas| > to add elements to the output array. No matter how |mIndexDeltas| is off by > |mIndexPrefixes|, we should never overflow the array since the total number > of elements in |mIndexPrefixes| and |mIndexDeltas| is the output array > length. > (Instead, the output array can only be not fully filled.) > > Some other less possible cause might be: > > 1) |mTotalPrefixes| is unexpectedly overridden by a buffer overflow in some > places. > 2) |mTotalPrefixes|, |mIndexPrefixes| and |mIndexDeltas| are set in > somewhere I was not aware. > > What we can do to avoid crash right now is to add a guard in > GetPrefixesNative. > That is, if the output array is about to overflow, return error and the error > will be propagated to have the database reset. Adding this guard will > effectively > avoid the crash but it hides the potential or even serious error. I don't > know > if this is a good idea. Hi Francois, What do you think of the idea adding a guard to GetPrefixesNative()? It would mitigate the crash totally but the root cause is still there. In the long run, since the crash rate is significantly higher on Windows, I tend to get a windows machine to see if I can duplicate this issue and get the "update failure dump" (which I've done in Bug 1310142 if you remember)
Flags: needinfo?(francois)
See Also: → 1332218, 1332213
(In reply to Henry Chang [:henry][:hchang] from comment #10) > What do you think of the idea adding a guard to GetPrefixesNative()? > It would mitigate the crash totally but the root cause is still there. It would be great if we found the root cause, but I imagine it could also be due to external factors (e.g. bad hardware). I'm not opposed to adding the guard. GCP, are you ok with it?
Flags: needinfo?(francois) → needinfo?(gpascutto)
I don't feel strongly either way. Two people looked at this, and found no way our code can fail. If the users' system is fucked, bailing out in our security systems isn't idea, but there's no obvious way an attacker can exploit that. And at least it'll stop them from blaming Firefox (if we don't crash elsewhere shortly thereafter).
Flags: needinfo?(gpascutto)
Lowering the priority since Henry has landed a work-around in 53 or 54 that should make this go away, without addressing the root cause. We'll leave the bug open until we've been able to find the root cause.
Status: NEW → ASSIGNED
Priority: P1 → P3
According to the crash report, this crash has not happened since Firefox 55, which proved the inference in comment 13. And Henry said this crash was fixed by bug 1332218.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.