Excessive mutex use in TransportSecurityInfo for mErrorCode

RESOLVED FIXED in Firefox 64

Status

()

enhancement
P1
normal
RESOLVED FIXED
8 months ago
8 months ago

People

(Reporter: jesup, Assigned: keeler)

Tracking

(Depends on 1 bug)

Trunk
mozilla64
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox64 fixed)

Details

(Whiteboard: [psm-assigned])

Attachments

(2 attachments)

In a quick browsing session (CNN, ArsTechnica, small google doc) I had >300K mutex locks on TransportSecurityInfo mutexes.  99.99% of these are from GetErrorCode().  Moving mErrorCode to be an unlocked Atomic<> cuts the max lock() calls per mutex from ~15K to 81.  Note that this is an incomplete solution, since we need to figure out what to do with the code that sets it under a lock and other info as well, and perhaps update the API and the code that polls for errors and grabs data from the object.
Flags: needinfo?(dkeeler)
Looks like most calls to GetErrorCode() are asking if an error has been set and don't care what it is. I imagine we could add an atomic boolean that gets set when we set the error code.
Assignee: nobody → dkeeler
Flags: needinfo?(dkeeler)
Priority: -- → P1
Whiteboard: [psm-assigned]
Before this patch, Necko functions polling the state of TLS sockets
(essentially, TransportSecurityInfo) would cause a considerable amount of
locking on TransportSecurityInfo::mMutex instances via GetErrorCode(). Most of
this code only cared if an error had been set via SetCanceled(), so this patch
adds an atomic boolean mCanceled (and associated accessor GetCanceled()) that
can be used to the same effect but without acquiring the lock.

Comment 5

8 months ago
Pushed by dkeeler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7b439431eb95
avoid acquiring TransportSecurityInfo::mMutex in hot code r=jesup,jcj

Comment 6

8 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/7b439431eb95
Status: NEW → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
What tool were you using to measure lock() calls? (i.e. how would I measure the effectiveness of this change?)
Flags: needinfo?(rjesup)
Dana: see the patch on bug 1498643 (I'll update it with the latest version, which covers Windows too).  Note that you may need to disable sandboxing (fprintf(stderr...)) or use a non-e10s run.
Flags: needinfo?(rjesup)
You need to log in before you can comment on or make changes to this bug.