Closed Bug 1164266 Opened 9 years ago Closed 8 years ago

The way we update the UI job tables is slow

Categories

(Tree Management :: Treeherder, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: martianwars)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(2 files)

Currently we do a whole bunch of aggregation of job information in clonejobs using jQuery calls. This is really slow compared to just manipulating text or other methods. We obviously can't avoid using jQuery (or equivalent) entirely, but it would be better to just create a huge html string (using templates or whatnot) then update the node *once* on update.

I started work on this, but haven't had time to complete it and I probably shouldn't be spending much more time on this. ;) I'm going to post my work in progress in case someone wants to pick up where I left off. If no one does, maybe I'll come back to it eventually.
Attached file PR of wip
Hey Mauro,

If you think this is terrible, let me know and I'll retract it. :) With this enabled, time spent in generateJobElements goes down from 4400ms -> 2200ms on my machine. I think we could save even more by converting all the jQuery code to just manipulate strings -- I only tackled the worst offender (the job items) in this patch.

If you think the general approach is good, I'll try to find someone in the community to finish it off.
Attachment #8604944 - Flags: feedback?(mdoglio)
Blocks: 1067846
Keywords: perf
Priority: -- → P2
Summary: The way we update the job tables is slow → The way we update the UI job tables is slow
Given the results you got, I would give a thumbs up.
I would consider 2 things when looking for other templates to convert:
- The template interpolator prevents potential security issues by escaping the expression markers. I don't think we are open to such a threat in the case of the job buttons.
- The tradeoff between readability and execution speed should be the driver; again in the case of the job buttons execution speed is a critical aspect. The same probably applies to the job groups. But I'm not sure about platforms and resultsets.
Attachment #8604944 - Flags: feedback?(mdoglio) → feedback+
It may also be possible to get some mileage out of using ``document.createDocumentFragment()`` and constructing the list of jobs /counts in there before adding it to the DOM.  But I haven't really used that function much, so if it is THAT much faster in our case.
Hi wlach,
So like you mentioned on the IRC, I went through your patch. I didn't quite understand it fully, largely due to my unfamiliarity with the treeherder code, but I did get the idea that you were trying to remove some jQuery code and adding some raw HTML in place of it. By looking at what mdoglio and camd say, I guess I'll have to follow a similar procedure here and elsewhere in this file.
So I think I need to get a little familiar with treeherder and what is actually being optimized before I start fixing the patch. Could you give me some pointers to the same? I have successfully setup the repository and did manage to see some jobs running but couldn't quite understand how the directive clonejobs.js exactly fits in, and where the part of the code that's to be modified comes into play.
Thanks :)
Flags: needinfo?(wlachance)
Assignee: nobody → kalpeshk2011
Hi Kaplesh, so basically clonejobs renders the actual HTML that you see in the treeherder jobs table.

The particular method that I singled out for modification renders the actual job symbols. Here's an example which will hopefully make it clear:

M R R-e10s T T-e10s tc[Tier-2] [Tier-2] 

Every time we modify the DOM there is a cost, so there would likely be significant benefit by only doing so once in this method (after a full set of HTML is concatenated) as opposed to many many times. One way of measuring this would be to experiment with toggling the show/hide all jobs button ('(+)')on the top while using the web tools to profile treeherder (you might need to do this with Chrome, as this operation is currently very slow in Firefox)
Flags: needinfo?(wlachance)
(In reply to William Lachance (:wlach) from comment #5)
> (you might
> need to do this with Chrome, as this operation is currently very slow in
> Firefox)

To clarify: It's the *profiler* that's very slow in Firefox. The operation itself works fine (well, it's a little slow, but not unusably so)
Alright, makes a lot more sense now. So which lines (among those that you removed from the for loop) were responsible for the "mant many times"? Which new line added (outside the loop), does it all at once? I suspect that the jQuery would be responsible, but not sure if that is the only method modifying the DOM.

So in short I need to find all such loops in clonejobs.js and move code out of the loop right?
Flags: needinfo?(wlachance)
(In reply to Kalpesh Krishna [:martianwars] from comment #7)
> Alright, makes a lot more sense now. So which lines (among those that you
> removed from the for loop) were responsible for the "mant many times"?

Specifically it's all the code that added jobs HTML to a placeholder dom element. If you think about it this is just deduction: what part of the jobs table has the most data in aggregate? Of course it will be the individual lines with job symbols. However the theory is also corroborated with the performance profiling I did.

 Which
> new line added (outside the loop), does it all at once? I suspect that the
> jQuery would be responsible, but not sure if that is the only method
> modifying the DOM.

Yes in this case it's just jQuery modifying the DOM. The key modification in my PR (which you'll want to replicate) is here:

https://github.com/mozilla/treeherder-ui-deprecated/pull/524/files#diff-805756736d93bec597a3a99e5ab3311eR277

> So in short I need to find all such loops in clonejobs.js and move code out
> of the loop right?

I think it would be reasonable to just start by performing the suggested modifications on `renderJobTableRow` (https://github.com/mozilla/treeherder/blob/master/ui/js/directives/treeherder/clonejobs.js#L525) and submitting a PR for that. This should be a pretty straightforward expansion from my original patch, which I suspect that would give us the majority of the benefit with only a little work since we'd be going from modifying the DOM once per each job, to one per each *row* of jobs, a vast difference.

The only other change I'd make in approach from my original PR would be to use interpolator(s) to create the job rows, instead of just concatenating strings, it's a bit more readable and follows the conventions elsewhere in the file. You can see some examples of interpolator instantiation elsewhere in clonejobs.js:

https://github.com/mozilla/treeherder/blob/master/ui/js/directives/treeherder/clonejobs.js#L52
Flags: needinfo?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

William,
Sorry for the slight delay, first week at college. Finally found some time to dig into the code.
So I did try to emulate your patch, though I don't think I was very successful. The code does run, but there are some style issues along with an on-click not working (When we click the job symbol, there is an inline expansion of individual runs). However the job panel opens up, and the Profiler did show an slight improvement in the function (in Chrome, 31ms to ~20ms).
I had a really hard time writing out the HTML. I don't think I've accounted for ALL the functionality added by addJobBtnEls() and addGroupJobsAndCounts() in the HTML string.
You mentioned using interpolators instead of HTML string concatenation. What I don't get is isn't,
> jobGroup = $(jobGroupInterpolator(jobGroups[i]));
the jQuery code that we need to remove? Or have I gotten some things really wrong?

Thanks!
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: feedback?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

I think there's some confusion about what needs to be done here. This is probably partly a reflection of the fact that clonejobs.js itself is rather complicated, so I hope you don't lose heart. I left some hints in the PR, hopefully they give you a bit more a solid foundation with which to move forward.

Unfortunately performance profiles won't be very meaningful until we have a full replacement of functionality. It doesn't do much good to have an improved profile if we're only doing 50% (or whatever) of the work. :)
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: feedback?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

William,
I have reworked my PR and I think it is correct now :) Thanks for explaining interpolators to me. I think I had a very wrong idea about them. I don't really know the standardised variable conventions / code style used in Treeherder, so do let me know if the PR has any major / minor issues. The code seems to be working locally :)
So I guess this was just one function. Do I need to work on some more?

Thanks!
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: feedback?(wlachance)
So, this is coming in late in this discussion, but I had always wondered if using ReactJS would be a good way to speed things up.  It seems like a way to be able to use more readable code than raw HTML.  It sounds like it creates a "shadow" DOM in the background which gets populated and then replaces a whole chunk with that DOM.  Updating the background DOM is cheaper, since there's no drawing involved till it is put in place in the "real" DOM.  

It would probably be pretty involved to switch to ReactJS, so these wins may be better in the short term.  But just wanted to throw that out.  Perhaps it would even be useful in smaller chunks?
(In reply to Cameron Dawson [:camd] from comment #13)
> So, this is coming in late in this discussion, but I had always wondered if
> using ReactJS would be a good way to speed things up.  It seems like a way
> to be able to use more readable code than raw HTML.  It sounds like it
> creates a "shadow" DOM in the background which gets populated and then
> replaces a whole chunk with that DOM.  Updating the background DOM is
> cheaper, since there's no drawing involved till it is put in place in the
> "real" DOM.  
> 
> It would probably be pretty involved to switch to ReactJS, so these wins may
> be better in the short term.  But just wanted to throw that out.  Perhaps it
> would even be useful in smaller chunks?

Yes, I suspect moving to react for this code would be a sensible thing to do, I know :mdoglio has brought this up before. I think this approach is simpler though and better as an introduction to treeherder's job rendering code (which is quite complex). Maybe :martianwars would be interested in working on a potential move to react after we have this in shape?
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

This is on the right track, however I don't think it quite does what we need it to yet -- in particular it looks like we're still doing a bit of DOM manipulation inside the jobs row rendering.

To avoid get a bunch of JavaScript errors on your PR, try running "grunt checkjs" on your treeherder client installation:

http://treeherder.readthedocs.org/ui/installation.html#validating-javascript
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: feedback?(wlachance) → feedback+
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

William,
Thanks for the review, I've worked a bit on the two functions and converted all jQuery to string manipulation. The code works well locally. Do let me know about any further steps! :)
Well I don't know ReactJS, but I would love to learn and try out something new :D jgraham and I considered it before beginning with wptview, but decided to completely use AngularJS instead.
Attachment #8705299 - Flags: review?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

This is close, but needs some revisions. 

I did some profiling on your branch by using Chrome's profiler (https://developer.chrome.com/devtools/docs/cpu-profiling) and validating that we're spending less time in JS than we were before (after removing your console.log statement, we are improving).

It seems like we save at least a few hundred of ms of JS time on my test case (load 20 more jobs on default treeherder view) with your branch enabled. Not as much as I'd hoped, but hey, it's something. I'd recommend running the same test on your code while you're working on it, just so you can understand what effect you're having-- if you need help with that, just let me know.
Attachment #8705299 - Flags: review?(wlachance) → review-
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

William,
I've reworked the PR and added a comment on Github.
Well I have really scattered ideas about Profiling to be honest probably because I've never done it properly in the past. I generally just press the "Start" button, reload the page, and press STOP when done. I then see the time taken by "renderJobTableRow" in that huge table produced. I guess there are some more steps involved so it It would be best if you could tell me / give me a few resources to check how the patch helps Treeherder. I know how to load jobs one at a time, so would I need to do this 20 times? Or is there a faster way?
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: review- → review?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

I want to give this an r+, but there's still a few small things I'd like to see corrected. I think it would also be worth it for you to play with the profiler a bit to prove to yourself that the new approach is faster-- I gave some hints on how to do so on irc. However, I myself am satisfied this is at least a little bit of an improvement after validating things with Chrome's profiler.
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: review?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

William, 
So this is probably what you wanted. The code works locally, though I've got a feeling that we are back to using the jQuery selection once every job. What do you think?
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: review?(wlachance)
Also, I'm not able to see any change in the Profiler. I think I've done something wrong. I did celery -A treeherder worker -B --concurrency 5 and run it for about one minute before pressing Ctrl+C. I then started the profiler and refreshed http://local.treeherder.mozilla.org/ and closed it after sometime.
I did the same on the master branch.
I got almost no difference under renderJobTableRow() (Heavy(Bottom Up) type of profile) in the total time section for the two branches. What am I doing wrong? Would a screenshot help?
When you're profiling you can just use the web-server.js method of running the UI; it will be a lot quicker and also more representative of production :-)
http://treeherder.readthedocs.org/ui/installation.html#running-the-web-server
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

Left some feedback in the PR, I think the approach is right, just needs some tweaking.
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: review?(wlachance)
(In reply to Ed Morley [:emorley] from comment #22)
> When you're profiling you can just use the web-server.js method of running
> the UI; it will be a lot quicker and also more representative of production
> :-)
> http://treeherder.readthedocs.org/ui/installation.html#running-the-web-server

Yes, this is the good advice. I generally profile by loading a local copy of the webserver, then comparing against a version without the changes. It isn't 100% deterministic since the content on stage or production could change between profiling runs, but it should be close.

If you're testing with the full setup, it definitely isn't necessary to ingest any data beyond a few pushes. Consider using the ingest-one-push-at-a-time approach for that:

http://treeherder.readthedocs.org/installation.html#ingesting-a-single-push-at-a-time
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

@wlach, @emorley,
Thanks for the tips, I could successfully see an improvement of about a 100-150 ms on my Google Chrome. I've updated the PR, and it works well for me. I hope it can be merged now :)
Attachment #8705299 - Flags: review?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

Hmm, the code *looks* right, however when I try to use it, the following case doesn't work:

1. Load treeherder
2. Select a job
<wait for update, up to a minute>

Expected:

Job remains highlighted with large button classes

Actual:

Job loses large button classes

--

Could you investigate? I'm pretty sure it's a simple problem. :)
Attachment #8705299 - Flags: review?(wlachance) → review-
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

William,
I couldn't really reproduce the bug, but I'd forgotten an important line of code (clearSelectJobStyles() at one point) that is related and might be causing the issue. I've added it as a new commit, if it helps the cause then I'll squash :)
Attachment #8705299 - Flags: review- → review?(wlachance)
(In reply to Kalpesh Krishna [:martianwars] from comment #27)
> Comment on attachment 8705299 [details] [review]
> [treeherder] martiansideofthemoon:vanillabrownie > mozilla:master
> 
> William,
> I couldn't really reproduce the bug, but I'd forgotten an important line of
> code (clearSelectJobStyles() at one point) that is related and might be
> causing the issue. I've added it as a new commit, if it helps the cause then
> I'll squash :)

So I'm not sure how you weren't able to reproduce the bug. It's 100% reproducible for me. Are you waiting at least a minute after selecting a job? By select a job, I mean click on it so it's expanded. Like this:

http://wrla.ch/Screencast%20from%2017-01-16%2004:09:16%20PM.webm

The patch you added doesn't fix the issue, and I'm pretty sure it's not necessary -- since you're modifying an element which is about to overwritten.
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

I don't think we want this latest change, though the rest of the patch is almost there. :)
Attachment #8705299 - Flags: review?(wlachance) → review-
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

William,
I spent a couple of hours understanding clonejobs.js and I think I understand it a lot better now. Like I mentioned previously, there were 2-3 issues with the previous patch.
1. I was occasionally getting an error that lastJobSelected.el is not a function. This is because https://github.com/mozilla/treeherder/pull/1235/files#diff-723e289a2b4792ae240d9cd7a723c318R283 pushes in HTML to the object and we call clearJobSelectStyles() somewhere in the middle. I've added a condition for this. (https://github.com/martiansideofthemoon/treeherder/commit/4d473211c4e847d71e922cc4678881be000deef0#diff-723e289a2b4792ae240d9cd7a723c318R148)
2. After job updates, the ResultStore object was wrongly initialized and hence the large button classes of previously selected jobs persisted. This was fixed by doing https://github.com/martiansideofthemoon/treeherder/commit/4d473211c4e847d71e922cc4678881be000deef0#diff-723e289a2b4792ae240d9cd7a723c318R782.
I guess the reason the styles were vanishing in some cases is the same, though I'm not sure

The current version seems to have one problem, though I don't know if it matters. After job updates, the selected job loses its green color and turns white. On doing an onmouseover, it restores its green color. Is this an issue?
Also, could you tell me the right way to debug JS code? I've just been using console.log all the time. :(
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: review- → review?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

I don't *think* the problem you mentioned matters, I think it's just a hover style the changes after the update, which is something that happened before your patch.

There do seem to be some remaining issues in your PR though, have a look and let me know what you think. I still think we're almost there.
Flags: needinfo?(wlachance)
Attachment #8705299 - Flags: review?(wlachance)
Attachment #8705299 - Flags: review?(wlachance)
Comment on attachment 8705299 [details] [review]
[treeherder] martiansideofthemoon:vanillabrownie > mozilla:master

Looks good now, merged patch manually.
Attachment #8705299 - Flags: review?(wlachance) → review+
Let's mark this as fixed, we can open up followups for either expanding this approach or moving to react.js
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: