Closed Bug 1275582 Opened 5 years ago Closed 5 years ago

TSan: data race security/nss/lib/freebl/sha_fast.c:176 SHA1_End


(Core :: Security: PSM, defect)

Not set



Tracking Status
firefox46 --- wontfix
firefox47 --- wontfix
firefox48 --- fixed
firefox49 --- fixed


(Reporter: jseward, Assigned: jseward)


(Keywords: csectype-race, sec-moderate, Whiteboard: [psm-assigned][post-critsmash-triage][adv-main48+])


(2 files)

STR currently unknown.  This was reported 5 times when using a TSan-enabled
build of m-c for general surfing for around half an hour, with 12 tabs.
I will endeavour to find repeatable STR.

From the thread names involved ...

  Thread T29 'SSL Cert #13' (tid=13287, running)
  Thread T73 'SSL Cert #12' (tid=13284, running)

... I'd guess that there are multiple instances of some worker thread to do
with SSL Certs, and they are racing each other.  Only a guess though.
Attached file race-SHA1_End
TSan's report (one of 5)
Yes, that global variable is passed along the following chain of calls


and SHA1_End (the second one) writes the computed SHA1 into it.  Hence
the write-after-write race.
In a bit of random surfing for roughly 5 minutes with a profile containing
around 100 tabs, this was the third most commonly reported race in code 
under our control:

sewardj@dundee:~/MOZ/MC-RACE$ ../SCRIPTS/ spew-44-10-tsan-surfing-without 
     11 js/src/gc/Barrier.h:450 in unbarrieredGet
      5 js/src/jsgc.h:1068 in isForwarded
      3 security/nss/lib/freebl/sha_fast.c:176 in SHA1_End  <-------------------
      3 js/src/jsgc.cpp:2349 in onScriptEdge
      1 netwerk/cache2/CacheEntry.cpp:557 in ReopenTruncated
      1 netwerk/cache2/CacheEntry.cpp:1217 in OpenOutputStreamInternal
      1 js/src/vm/Shape.h:753 in inDictionary
      1 js/src/vm/Runtime.cpp:854 in JSRuntime::setUsedByExclusiveThread(JS::Zone*)
      1 js/src/vm/ObjectGroup.h:95 in clasp
      1 js/src/jspubtd.h:164 in isHeapBusy
      1 js/src/jsobj.h:122 in getClass
      1 js/src/jsgc.cpp:5858 in AutoTraceSession
      1 js/src/jsgc.cpp:2365 in onBaseShapeEdge
      1 ff-O1-linux64-tsan/dist/include/mozilla/gfx/Types.h:271 in Color
      1 deadlock) nsprpub/pr/src/pthreads/ptsynch.c:177 in PR_Lock
The obvious fix is to make this stack-local, hence thread-local.
It's only 20 bytes so there's no harm in that.  I don't understand
why it wasn't stack local to begin with.

Who can review / comment on this?

I do have one concern with this fix, which is that it leaves the
computed SHA1 sum on the stack below the stack pointer after the
function returns, where it could potentially be read by an attacker.
Is that of concern?  I guess it would be possible to zero out the
buffer before MatchKeyHash returns.  That would be easy and cheap
to do; my only concern with that is that the zeroing out would be
dead stores from the point of view of "legitimate C++ semantics" and
so a sufficiently clever compiler might optimise it away.
Comment on attachment 8756810 [details] [diff] [review]

Review of attachment 8756810 [details] [diff] [review]:

David can review this I hope, it looks good to me.
Attachment #8756810 - Flags: review?(dkeeler)
Attachment #8756810 - Flags: feedback+
(In reply to Julian Seward [:jseward] from comment #5)
> I do have one concern with this fix, which is that it leaves the
> computed SHA1 sum on the stack below the stack pointer after the
> function returns, where it could potentially be read by an attacker.

I don't think this matters. It's the digest of the OCSP responder's public key which is transferred over HTTP anyway.
Comment on attachment 8756810 [details] [diff] [review]

Review of attachment 8756810 [details] [diff] [review]:

::: security/pkix/lib/pkixocsp.cpp
@@ +764,5 @@
>  {
>    if (keyHash.GetLength() != SHA1_DIGEST_LENGTH)  {
>    }
> +  uint8_t hashBuf[SHA1_DIGEST_LENGTH] = { 0 };

We don't need the initializer, but otherwise LGTM.
Attachment #8756810 - Flags: review?(dkeeler) → review+
Group: core-security → crypto-core-security
Component: Security → Security: PSM
(In reply to David Keeler [:keeler] (use needinfo?) from comment #8)
> We don't need the initializer, but otherwise LGTM.

Thanks.  Is it OK to land this in the normal way, or do I need any
special precautions?
Flags: needinfo?(dkeeler)
According to since this is sec-moderate, you can go ahead and land like normal. It would be nice to uplift this as far as possible, too.
Assignee: nobody → jseward
Flags: needinfo?(dkeeler)
Whiteboard: [psm-assigned]
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Group: crypto-core-security → core-security-release
Comment on attachment 8756810 [details] [diff] [review]

Approval Request Comment
[Feature/regressing bug #]: (long-standing)
[User impact if declined]: OCSP verification might fail on very rare occasions?
[Describe test coverage new/current, TreeHerder]: Lots of OCSP tests, not for this specific threading issue though.
[Risks and why]: Risk is low, a very tiny change.
[String/UUID change made/needed]: None.
Attachment #8756810 - Flags: approval-mozilla-beta?
Attachment #8756810 - Flags: approval-mozilla-aurora?
Comment on attachment 8756810 [details] [diff] [review]

Fix a race condition, taking it.
Probably too late for beta..
Attachment #8756810 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8756810 [details] [diff] [review]

At this point in RC week, we are only taking fixes for issues that could lead to dot releases and this issue does not seem to meet the bar.
Attachment #8756810 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Whiteboard: [psm-assigned] → [psm-assigned][post-critsmash-triage]
Whiteboard: [psm-assigned][post-critsmash-triage] → [psm-assigned][post-critsmash-triage][adv-main48+]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.