Closed
Bug 1312514
Opened 8 years ago
Closed 8 years ago
Implement infrastructure for tracking timers coming from tracking scripts
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
mozilla53
Tracking | Status | |
---|---|---|
firefox53 | --- | fixed |
People
(Reporter: ehsan.akhgari, Assigned: ehsan.akhgari)
References
(Blocks 1 open bug)
Details
Attachments
(5 files, 7 obsolete files)
33.47 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
30.12 KB,
patch
|
Details | Diff | Splinter Review | |
4.16 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
6.93 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
3.92 KB,
patch
|
ehsan.akhgari
:
review+
|
Details | Diff | Splinter Review |
We need some infrastructure that will allow us to reason about whether the code that we're currently running on behalf of the web page is coming from a tracking script through setTimeout, setInterval or an event handler.
This will allow us to deprioritize various things in that context.
Assignee | ||
Updated•8 years ago
|
Summary: Implement infrastructure for tracking timers and event handlers coming from tracking scripts → Implement infrastructure for tracking timers coming from tracking scripts
Assignee | ||
Comment 1•8 years ago
|
||
Just to give an update on this: I have been working on splitting the list of timeouts on nsGlobalWindow into a normal timeout list and a tracking timeout list, so that we can start to schedule them differently. I'm doing this as a refactoring, so that it almost doesn't change the behavior at all. As I had expected, this code is quite brittle, and I have been spending several days greening up the tests on the try server. This are still looking pretty rough, almost every test suite is currently orange/red (setTimeout is quite populatr!)
I will spend time next week to green this up and will then post patches for reviews.
The good news is that after this refactoring, it's super easy to change the scheduling of tracking timers however we want and it will be very simple to make any changes.
Assignee | ||
Comment 2•8 years ago
|
||
In part 2 of this patch series, we may not be ready to process the
result of the load as soon as the incremental stream loader is finished.
We therefore need to support requesting the incremental stream loader to
hold the request object alive a little bit longer than when the load is
finished. This patch adds that support, with assertions ensuring that
the request object does get destroyed before the stream loader object
goes away.
Attachment #8816164 -
Flags: review?(bkelly)
Assignee | ||
Comment 3•8 years ago
|
||
In order to be able to put timeouts in the correct bucket as soon as
they are scheduled, we need to be able to synchronously tell whether a
timeout is coming from a script that is on the tracking protection list.
But the URL Classifier API which we use for this task is asynchronous.
Because of this, and to avoid unnecessary IPC from the content to the
parent process every time we need to know where a script came from, we
cache this information in nsIDocument. The hash table mTrackingScripts
will have one entry per script loaded in the document which is on the
tracking protection list.
For performance reasons, we coalesce querying whether a script source
URL is on the tracking protection list with the load of the script from
the network. In most cases we'll have the response from the URL
Classifier service before the script load is finished, but on the off
chance that the load finishes first, we wait before finishing the script
load to get the response from the URL Classifier service.
Attachment #8816165 -
Flags: review?(bkelly)
Assignee | ||
Comment 4•8 years ago
|
||
This will allow us to schedule these timers differently in the future.
This patch only performs the refactoring, and is not intended to change
any behavior. Specifically, this patch doesn't change the order in
which timeouts are fired -- they should still all be fired according to
the mWhen field.
The implementation works by splitting timeout storage per window into
two linked list, mNormalTimeouts and mTrackingTimeouts. For the most
part, all functions that used to operate on mTimeouts now need to do the
same operation on both of these linked lists. In order to keep the code
cleaner, we're using C++ lambdas to effectively run the same algorithm
over both lists in case the exact order in which Timeout objects are
processed doesn't matter.
In RunTimeout(), the order in which Timeout objects are processed does
matter, so for that case we use the OrderedTimeoutIterator class to
iterate over both linked lists simultaneously in the mWhen order. Also,
inserting the dummy timeout when running the timeouts is only necessary
for the linked list where the last expired timeout is coming from, so we
only inject the dummy timer into the corresponding list, therefore we
remember which list we picked the last expired timeout from when
looking for it.
Attachment #8816166 -
Flags: review?(bkelly)
Assignee | ||
Comment 5•8 years ago
|
||
Part 3 includes a lot of indentation change, so the main diff isn't very useful for reviewing. This should be a lot easier to review.
Assignee | ||
Comment 6•8 years ago
|
||
This pref allows easier testing and debugging of this feature
by forcing timeouts to end up in the tracking bucket in either
the alternating or random fashion.
Attachment #8816168 -
Flags: review?(bkelly)
Assignee | ||
Comment 7•8 years ago
|
||
For simplicity, this test is being added to test_classifier.html which
already has all of the infrastructure necessary for setting up a test
domain as a tracking domain.
Attachment #8816169 -
Flags: review?(bkelly)
Assignee | ||
Comment 8•8 years ago
|
||
We're adding this test to test_timer_flood.html since it already
examines dispatching thousands of timeouts. Putting the timeouts in the
two buckets randomly ensures that the test isn't biased towards, for
example, alternating ordering of the timeouts.
Attachment #8816170 -
Flags: review?(bkelly)
Comment 9•8 years ago
|
||
(In reply to :Ehsan Akhgari from comment #4)
> Created attachment 8816166 [details] [diff] [review]
> Part 3: Split tracking and non-tracking timeouts into two separate lists
>
> This will allow us to schedule these timers differently in the future.
> This patch only performs the refactoring, and is not intended to change
> any behavior. Specifically, this patch doesn't change the order in
> which timeouts are fired -- they should still all be fired according to
> the mWhen field.
What if several timers have the same mWhen but are in different queues?
Assignee | ||
Comment 10•8 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #9)
> (In reply to :Ehsan Akhgari from comment #4)
> > Created attachment 8816166 [details] [diff] [review]
> > Part 3: Split tracking and non-tracking timeouts into two separate lists
> >
> > This will allow us to schedule these timers differently in the future.
> > This patch only performs the refactoring, and is not intended to change
> > any behavior. Specifically, this patch doesn't change the order in
> > which timeouts are fired -- they should still all be fired according to
> > the mWhen field.
>
>
> What if several timers have the same mWhen but are in different queues?
This patch prefers the timeout with the lowest timeoutId in that case. That being said, I did test pages which do several setTimeout()s immediately after each other and in practice they will get different mWhen values, so this case will be rare in practice.
Assignee | ||
Comment 11•8 years ago
|
||
Comment 12•8 years ago
|
||
I took an initial look at the patches and have some concerns. I'd like to chat on vidyo to make sure I understand the goals, etc. Thanks.
Comment 13•8 years ago
|
||
So we spoke a bit on vidyo today.
It became apparent some of my concerns around complexity are the things Ehsan is trying to solve. For example, I'm afraid I can't tell from a review if changing this code is safe with the dummy_timeout and re-entrant timer modifications the code supports. Its just too hard to understand at the moment. Making this easier to understand, though, is part of Ehsan's goals.
So we decided perhaps a good first step would be to move the timer guts into a separate TimerManager.(h|cpp) set of files or something. If we can isolate the timer code it might be a bit easier to understand. We could further abstract some of the functionality into a TimerList.(h|cpp).
After we have disentangled timers from the rest of nsGlobalWindow it will hopefully be easier to see the impact of moving from one list to two lists.
In regards to the P1 and P2 patches we decided that nsScriptLoaderHandler::OnStreamComplete() could get the request from the nsIncrementalStreamLoader and then pass it to nsScriptLoader in the final complete call. This avoids the need to add a new explicitly cleanup method like HoldRequest/ReleaseRequest.
Updated•8 years ago
|
Attachment #8816164 -
Flags: review?(bkelly)
Updated•8 years ago
|
Attachment #8816165 -
Flags: review?(bkelly)
Updated•8 years ago
|
Attachment #8816166 -
Flags: review?(bkelly)
Updated•8 years ago
|
Attachment #8816168 -
Flags: review?(bkelly)
Updated•8 years ago
|
Attachment #8816169 -
Flags: review?(bkelly)
Updated•8 years ago
|
Attachment #8816170 -
Flags: review?(bkelly) → review+
Assignee | ||
Comment 14•8 years ago
|
||
I think I'm going to try to split off some of the work into separate bugs so that I can land the individual bits and pieces sooner than later.
Assignee | ||
Updated•8 years ago
|
Attachment #8816164 -
Attachment is obsolete: true
Assignee | ||
Updated•8 years ago
|
Attachment #8816165 -
Attachment is obsolete: true
Assignee | ||
Comment 15•8 years ago
|
||
This will allow us to schedule these timers differently in the future.
This patch only performs the refactoring, and is not intended to change
any behavior. Specifically, this patch doesn't change the order in
which timeouts are fired -- they should still all be fired according to
the mWhen field.
The implementation works by splitting timeout storage per window into
two Timeouts objects, mNormalTimeouts and mTrackingTimeouts. The ForEach
helper methods are extended to deal with both of these objects, and as a
result, most of the algorithms operating on the list of timeouts work
correctly without any modification, with the notable exception of
RunTimeout.
In RunTimeout(), the order in which Timeout objects are processed does
matter, so for that case we use the OrderedTimeoutIterator class to
iterate over both linked lists simultaneously in the mWhen order. Also,
inserting the dummy timeout when running the timeouts is only necessary
for the linked list where the last expired timeout is coming from, so we
only inject the dummy timer into the corresponding list, therefore we
remember which list we picked the last expired timeout from when
looking for it.
Attachment #8819026 -
Flags: review?(bkelly)
Assignee | ||
Updated•8 years ago
|
Attachment #8816166 -
Attachment is obsolete: true
Attachment #8816167 -
Attachment is obsolete: true
Attachment #8816168 -
Attachment is obsolete: true
Attachment #8816169 -
Attachment is obsolete: true
Attachment #8816170 -
Attachment is obsolete: true
Assignee | ||
Comment 16•8 years ago
|
||
Assignee | ||
Comment 17•8 years ago
|
||
This pref allows easier testing and debugging of this feature
by forcing timeouts to end up in the tracking bucket in either
the alternating or random fashion.
Attachment #8819030 -
Flags: review?(bkelly)
Assignee | ||
Comment 18•8 years ago
|
||
For simplicity, this test is being added to test_classifier.html which
already has all of the infrastructure necessary for setting up a test
domain as a tracking domain.
Attachment #8819031 -
Flags: review?(bkelly)
Assignee | ||
Comment 19•8 years ago
|
||
We're adding this test to test_timer_flood.html since it already
examines dispatching thousands of timeouts. Putting the timeouts in the
two buckets randomly ensures that the test isn't biased towards, for
example, alternating ordering of the timeouts.
Attachment #8819032 -
Flags: review+
Assignee | ||
Comment 20•8 years ago
|
||
Comment 21•8 years ago
|
||
Comment on attachment 8819026 [details] [diff] [review]
Part 1: Split tracking and non-tracking timeouts into two separate lists
Review of attachment 8819026 [details] [diff] [review]:
-----------------------------------------------------------------
The two dummy insertion points hurts my brain a bit, but I think this is an improvement. r=me with comments addressed.
::: dom/base/OrderedTimeoutIterator.h
@@ +15,5 @@
> +namespace dom {
> +
> +// This class implements and iterator which iterates the normal and tracking
> +// timeouts lists simultaneously in the mWhen order.
> +class MOZ_STACK_CLASS OrderedTimeoutIterator {
final
@@ +19,5 @@
> +class MOZ_STACK_CLASS OrderedTimeoutIterator {
> +public:
> + typedef TimeoutManager::Timeouts Timeouts;
> + typedef Timeouts::TimeoutList TimeoutList;
> + OrderedTimeoutIterator(Timeouts& aNormalTimeouts,
Can you add documentation that the Timeout* pointers can be nullptr?
@@ +41,5 @@
> + mNormalIter->isInList());
> + MOZ_ASSERT_IF(mTrackingIter && mTrackingIter != mTrackingStopAt,
> + mTrackingIter->isInList());
> +
> + mKind = Kind::None;
I think we should somehow assert that UpdateIterator() was called here. I suggest below reseting mKind to Kind::None in UpdateIterator(). If you do that then you could:
MOZ_ASSERT(mKind == Kind::None);
here. Of course, at that point you could automatically call UpdateIterator() I suppose.
If you want to allow Next() to be called multiple times between updates, then perhaps we could cache the last returned value and not recompute it?
I'd just like to have more assertions if possible around proper usage of Next() and UpdateIterator() since its a cooperative API that someone could accidentally break in external code. More documentation would help as well.
@@ +65,5 @@
> + // tie.) Otherwise, return whichever iterator has an item left,
> + // preferring a non-tracking timeout again. Note that in practice, even
> + // if a web page calls setTimeout() twice in a row, it should get
> + // different mWhen values, so the tie situation shouldn't occur in
> + // practice.
How can a tie happen when you are comparing ID values?
@@ +119,5 @@
> + !mNormalIter->isInList()) {
> + mNormalIter = mNormalTimeouts.getFirst();
> + }
> + }
> + }
It feels like we should set `mKind = Kind::None` at the end of UpdateIterator(). We have moved past the last "next value" and are now in a new state for the Next() to be called again.
@@ +134,5 @@
> + bool PickedNormalIter() const
> + {
> + MOZ_ASSERT(mKind != Kind::None);
> + return mKind == Kind::Normal;
> + }
nit: add a line between the methods
::: dom/base/TimeoutManager.cpp
@@ +231,5 @@
> }
>
> + const char* filename = nullptr;
> + uint32_t dummyLine = 0, dummyColumn = 0;
> + aHandler->GetLocation(&filename, &dummyLine, &dummyColumn);
Is this left over debugging?
@@ +237,5 @@
> +
> + if (isTracking) {
> + mTrackingTimeouts.Insert(timeout,
> + mWindow.IsFrozen() ? Timeouts::SortBy::TimeRemaining
> + : Timeouts::SortBy::TimeWhen);
nit: The frozen window ternary could be computed once to make this a bit more DRY.
@@ +447,5 @@
> +
> + if (timeout->mFiringDepth != firingDepth) {
> + // We skip the timeout since it's on the list to run at another
> + // depth.
> +
nit: remove this blank line?
@@ +519,5 @@
> + if (runIter.PickedTrackingIter()) {
> + mTrackingTimeouts.Insert(timeout,
> + mWindow.IsFrozen() ? Timeouts::SortBy::TimeRemaining
> + : Timeouts::SortBy::TimeWhen);
> + } else {
Does this bare else not allow for the case where the iterator has mKind==Kind::None? It seems thats possible within the iterator code.
@@ +748,5 @@
>
> + nsresult rv = mNormalTimeouts.ResetTimersForThrottleReduction(aPreviousThrottleDelayMS,
> + minTimeout,
> + sortBy,
> + mWindow.GetThrottledEventQueue());
You don't check the return value here.
::: dom/base/TimeoutManager.h
@@ +93,5 @@
> template <class Callable>
> void ForEachTimeout(Callable c)
> {
> + mNormalTimeouts.ForEach(c);
> + mTrackingTimeouts.ForEach(c);
Should we make it clear that ForEachTimeout() and ForEachTimeoutAbortable()? does not guarantee mWhen time ordering?
Attachment #8819026 -
Flags: review?(bkelly) → review+
Updated•8 years ago
|
Attachment #8819030 -
Flags: review?(bkelly) → review+
Updated•8 years ago
|
Attachment #8819031 -
Flags: review?(bkelly) → review+
Assignee | ||
Comment 22•8 years ago
|
||
(In reply to Ben Kelly [:bkelly] from comment #21)
> @@ +41,5 @@
> > + mNormalIter->isInList());
> > + MOZ_ASSERT_IF(mTrackingIter && mTrackingIter != mTrackingStopAt,
> > + mTrackingIter->isInList());
> > +
> > + mKind = Kind::None;
>
> I think we should somehow assert that UpdateIterator() was called here. I
> suggest below reseting mKind to Kind::None in UpdateIterator(). If you do
> that then you could:
>
> MOZ_ASSERT(mKind == Kind::None);
>
> here. Of course, at that point you could automatically call
> UpdateIterator() I suppose.
>
> If you want to allow Next() to be called multiple times between updates,
> then perhaps we could cache the last returned value and not recompute it?
>
> I'd just like to have more assertions if possible around proper usage of
> Next() and UpdateIterator() since its a cooperative API that someone could
> accidentally break in external code. More documentation would help as well.
The desired semantics here is to be able to enable the consumer to control when the current iterator pointer is moved to the next element. The reason why this is needed is that in RunTimeout(), the timeouts list can change underneath us when RunTimeoutHandler() is called, so we need to call UpdateIterator() again after that (this preserves the existing semantics of the code.)
Initially I used to call UpdateIterator() inside Next() but I ran into some kind of problem with that which caused me to split it out like this. Sorry if this is confusing, I'll add some more documentation and would be happy to adjust things for making them more clear post landing too!
About asserting that UpdateIterator has been called, that's a good idea. But we can't reset mKind unfortunately since that breaks PickedNormalIter()/PickedTrackingIter(), but it's definitely possible to add a DebugOnly<bool> member for that, so I'll do that instead.
I'll also make it clear in the comments before each function when the function is expected to be called (for example PickedNormalIter()/PickedTrackingIter() should be called before you call Next(), etc.)
> @@ +65,5 @@
> > + // tie.) Otherwise, return whichever iterator has an item left,
> > + // preferring a non-tracking timeout again. Note that in practice, even
> > + // if a web page calls setTimeout() twice in a row, it should get
> > + // different mWhen values, so the tie situation shouldn't occur in
> > + // practice.
>
> How can a tie happen when you are comparing ID values?
I meant a tie when comparing mWhen values... Will reword to make it clearer.
> @@ +119,5 @@
> > + !mNormalIter->isInList()) {
> > + mNormalIter = mNormalTimeouts.getFirst();
> > + }
> > + }
> > + }
>
> It feels like we should set `mKind = Kind::None` at the end of
> UpdateIterator(). We have moved past the last "next value" and are now in a
> new state for the Next() to be called again.
See above, we can't because we need to keep PickedNormalIter()/PickedTrackingIter() working. (mKind is really only stored for those two methods.)
> ::: dom/base/TimeoutManager.cpp
> @@ +231,5 @@
> > }
> >
> > + const char* filename = nullptr;
> > + uint32_t dummyLine = 0, dummyColumn = 0;
> > + aHandler->GetLocation(&filename, &dummyLine, &dummyColumn);
>
> Is this left over debugging?
No, |filename| is used in the next line. dummyLine and dummyColumns are junk forced own our throat by GetLocation. :-)
> @@ +519,5 @@
> > + if (runIter.PickedTrackingIter()) {
> > + mTrackingTimeouts.Insert(timeout,
> > + mWindow.IsFrozen() ? Timeouts::SortBy::TimeRemaining
> > + : Timeouts::SortBy::TimeWhen);
> > + } else {
>
> Does this bare else not allow for the case where the iterator has
> mKind==Kind::None? It seems thats possible within the iterator code.
No, it should be impossible for mKind to be None here, as is asserted in PickedTrackingIter() and also Next().
> @@ +748,5 @@
> >
> > + nsresult rv = mNormalTimeouts.ResetTimersForThrottleReduction(aPreviousThrottleDelayMS,
> > + minTimeout,
> > + sortBy,
> > + mWindow.GetThrottledEventQueue());
>
> You don't check the return value here.
Oops, great catch! Thanks.
> ::: dom/base/TimeoutManager.h
> @@ +93,5 @@
> > template <class Callable>
> > void ForEachTimeout(Callable c)
> > {
> > + mNormalTimeouts.ForEach(c);
> > + mTrackingTimeouts.ForEach(c);
>
> Should we make it clear that ForEachTimeout() and ForEachTimeoutAbortable()?
> does not guarantee mWhen time ordering?
Will rename the functions and add some comments.
I'll also address the rest of the comments before landing.
Comment 23•8 years ago
|
||
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d40e1230ade
Part 1: Split tracking and non-tracking timeouts into two separate lists; r=bkelly
https://hg.mozilla.org/integration/mozilla-inbound/rev/a2c65695fb88
Part 2: Add a hidden pref to control how we split the list of our timeouts into the normal and tracking buckets; r=bkelly
https://hg.mozilla.org/integration/mozilla-inbound/rev/1000f976a766
Part 3: Add a test to ensure that timeouts from tracking scripts end up in the tracking bucket; r=bkelly
https://hg.mozilla.org/integration/mozilla-inbound/rev/6bd93f772ab7
Part 4: Add a test to ensure that splitting timeouts into two buckets doesn't affect the order in which we fire them; r=bkelly
Comment 24•8 years ago
|
||
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/58c110c305d8
follow-up: Increase the timeout for test_timer_flood.html for Android debug timeouts
Comment 25•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6d40e1230ade
https://hg.mozilla.org/mozilla-central/rev/a2c65695fb88
https://hg.mozilla.org/mozilla-central/rev/1000f976a766
https://hg.mozilla.org/mozilla-central/rev/6bd93f772ab7
https://hg.mozilla.org/mozilla-central/rev/58c110c305d8
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox53:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•