Closed Bug 1160421 Opened 5 years ago Closed 5 years ago

Use a custom scheduler for DecodePool instead of relying on nsThreadPool

Categories

(Core :: ImageLib, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: seth, Assigned: seth)

References

Details

Attachments

(1 file, 1 obsolete file)

DecodePool currently just runs decode jobs in FIFO order, since it relies on nsThreadPool's FIFO scheduling behavior. But we want something smarter than that. In particular:

- We always want to prioritize size decodes over full decodes, since size decodes block layout and page load.

- We probably want to decode images in LIFO order, not FIFO - see bug 1152147.

- We may want to change the priority of decode jobs or deschedule them altogether in more sophisticated ways, depending on whether the images involved are visible and what has a reference to them.

In this bug we won't implement any new policy, but we'll get the basic mechanism in place by creating a new DecodePool scheduler.
Blocks: 1160422
This is it. The code is based on nsThreadPool's implementation, but I tried to
structure it a little better for the direction we need this stuff to evolve in
the future.

Compared to nsThreadPool, the two important changes are as follows:

- We no longer track thread idleness and terminate idle threads. I don't think
  this is really desireable for image decoding threads, given how ubiquitous
  image decoding is and how critical our perf in this area is for page load
  speed, perceived scrolling performance, and general responsiveness.  We don't
  want to be forced to spin up additional threads when the workload gets heavy;
  we want those threads ready to go. The resource costs of keeping the threads
  around is just not high enough to offset the additional latency of starting up
  threads after we've been idle for a while.

- We use an nsTArray for the queue of work items instead of an nsEventQueue. An
  nsEventQueue stores its work items in a deque, which is a better data
  structure for FIFO operation, but since I plan to switch us to decoding in
  LIFO order in a followup bug, I don't think it's worth trying to optimize our
  FIFO implementation here.

The thread pool's state is encapsulated in the DecodePoolImpl class, which
follows the pattern of other singleton services in ImageLib like the
SurfaceCache. I anticipate moving more of the DecodePool state to DecodePoolImpl
over time.
Attachment #8603112 - Flags: review?(tnikkel)
That try job revealed that, at a minimum, this issue needs to be fixed:

/builds/slave/try-l64-st-an-d-00000000000000/build/src/image/src/DecodePool.cpp:260:3: error: bad implicit conversion constructor for 'DecodePoolWorker'

Looks like DecodePoolWorker's constructor needs to be marked |explicit|.
There was also this Android failure:

http://10.26.132.21:30486/tests/layout/reftests/bugs/397428-1.html | image comparison (==), max difference: 2, number of differing pixels: 140

I'm not sure exactly what's going on here as the test doesn't use images directly, but I really have no idea how printing stuff works, and maybe there are images involved somewhere in the pipeline. At any rate, the combination of the super tiny difference and the platform makes me think that if there are images here, this could just be yet another rendering difference caused by a race in whether we've called imgFrame::Optimize() yet. With a max difference of 2, seems like something we can safely mark fuzzy.
(In reply to Seth Fowler [:seth] from comment #4)
> There was also this Android failure:
> 
> http://10.26.132.21:30486/tests/layout/reftests/bugs/397428-1.html | image
> comparison (==), max difference: 2, number of differing pixels: 140
> 
> I'm not sure exactly what's going on here as the test doesn't use images
> directly, but I really have no idea how printing stuff works, and maybe
> there are images involved somewhere in the pipeline. At any rate, the
> combination of the super tiny difference and the platform makes me think
> that if there are images here, this could just be yet another rendering
> difference caused by a race in whether we've called imgFrame::Optimize()
> yet. With a max difference of 2, seems like something we can safely mark
> fuzzy.

Pretty sure I r+'d a patch to mark test that fuzzy on android that it has landed since your try run. Bug 1156817 might explain crazy android reftest results.
(In reply to Timothy Nikkel (:tn) from comment #5)
> Pretty sure I r+'d a patch to mark test that fuzzy on android that it has
> landed since your try run. Bug 1156817 might explain crazy android reftest
> results.


Oh dear. =( Thanks for letting me know.
I added the missing 'explicit' keyword.

I also double-checked that the 397428-1.html failure was an existing issue and
that it's already marked fuzzy on central.

At this point this patch should be good to go, modulo any issues discovered during review.
Attachment #8605685 - Flags: review?(tnikkel)
Attachment #8603112 - Attachment is obsolete: true
Attachment #8603112 - Flags: review?(tnikkel)
Attachment #8605685 - Flags: review?(tnikkel) → review+
Thanks for the review!
https://hg.mozilla.org/mozilla-central/rev/ae57a9f6ac60
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.