Closed Bug 1003118 Opened 11 years ago Closed 11 years ago

check the impact of running language detection on page load times

Categories

(Firefox :: Translations, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: florian, Assigned: mikedeboer)

References

Details

(Whiteboard: [translation] p=13 s=it-32c-31a-30b.3 [qa-])

Hopefully language detection (which seemed fast in my local testing, and is mostly done off main thread) shouldn't have a significant impact on page load time, but I think we need to verify this before shipping.
Flags: firefox-backlog?
Flags: firefox-backlog? → firefox-backlog+
Whiteboard: p=0
Component: General → Translation
Whiteboard: p=0 → p=13
Assignee: nobody → mdeboer
Status: NEW → ASSIGNED
Whiteboard: p=13 → p=13 s=it-32c-31a-30b.2 [qa?]
Whiteboard: p=13 s=it-32c-31a-30b.2 [qa?] → p=13 s=it-32c-31a-30b.2 [qa-]
Alright, the results are in! But, I'm sorry to note that Talos is not happy with this feature. Especially tp5o_paint is sad, with an almost 70% regression on Linux. Now it's the case that the tp5 tests take the Alexa top 500 sites (see https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5) and I'm pretty sure there are sites other than en-US in that set, which will successfully yield a different language and show the translation bar. Now onto a hypothesis: I've worked on the redesign of the findbar and one of the changes we wanted to make was to move it to the top and animate-in downward, just like the translation toolbar does. I had to back this feature out, because it caused continuous repaints of the _whole_ page, making the experience jittery. The same now happens with the translation bar; we don't have a good strategy atm to place a toolbar at the top without causing many repaints. When loading a page in a foreign language - with many assets/ HTTP requests - I see the translation bar appear before the page finishes loading, which supports my hypothesis above. Of course I'll have to look at a full profile for definitive proof, but we should consider this in the meantime.
For some initial 'proof', please check out this breakdown page for tp5o_paint on Linux: http://compare-talos.mattn.ca/breakdown.html?oldTestIds=36911947,36927195,36927205,36927225&newTestIds=36912113,36927341,36927351,36927399&testName=tp5o_paint&osName=Ubuntu%2064&server=graphs.mozilla.org ... which clearly shows a huge regression on non-en-US pages, like aljazeera.net, mail.ru and spiegel.de. So it's definitely regression on the pages where the translation bar is shown.
Doing another Try push to determine if showing the translation bar is indeed the main culprit: Baseline push: https://tbpl.mozilla.org/?tree=Try&rev=38033b657224 Feature-enabled push: https://tbpl.mozilla.org/?tree=Try&rev=ae6a90ee8ba1 Compare-talos link: http://compare-talos.mattn.ca/?oldRevs=38033b657224&newRev=ae6a90ee8ba1&server=graphs.mozilla.org&submit=true
Thanks Mike! I preemptively sent some more pushes to try to bisect things further. Baseline: https://tbpl.mozilla.org/?tree=Try&rev=bff5d7111f4f Translation built, detection disabled: https://tbpl.mozilla.org/?tree=Try&rev=c4eece2a5404 Language detection, no infobar and no urlbar icon: https://tbpl.mozilla.org/?tree=Try&rev=c31fd5b1cbba Language detection, using textContent instead of document encoder: https://tbpl.mozilla.org/?tree=Try&rev=c32c8cd864ea
At first glance - without retriggered jobs - this looks much better. IOW, preliminary conclusion: not showing the translation bar prevents the tp5 regressions.
Felipe, with the help of your try pushes I can put it into these words: Important note: 'rss' in 'tp50_main_rss_paint' means Resident Set Size (see the descriptions mentioned at https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5) and is a process memory consumption metric. Translation built, detection disabled: No regressions, so it's safe to note that having this thing in the build does not slow things down. Language detection, no infobar and no urlbar icon: This shows that NOT showing the infobar and icon resolves the tp5o_paint regression, leaving only a small one on Linux. tp5o_main_rss_paint and tp5o_pbytes_paint regressions are also cut in half, but not solved. This leads to think that the detection itself is causing a part of the regression as well, because that's the only part of the process left in-tact. Language detection, using textContent instead of document encoder: This change resolved the tp5o_paint regression entirely (so please keep this change, if it doesn't have a too large negative impact on the feature!), but did nothing good for the tp5o_main_rss_paint regression. So, next steps: 1. Please keep on using `textContent` instead of document encoder. 2. The memory consumption is basically caused by the language detection lib. Since it's already running OMT, it shouldn't have a negative impact on UX as the numbers suggest. Additionally, the memory increase _should_ be short-lived, right? We could also run the detector in a separate process and use IPC to transfer the results. Felipe, what do you think?
Flags: needinfo?(felipc)
After re-evaluating the results and after discussing things with Felipe, comment 7 and comment 8 should be reconsidered, because the pushes were based on top of each other. So they should be viewed as: - Translation built, detection disabled == Baseline + Translation built, detection disabled. - Language detection, no infobar and no urlbar icon == Baseline + Translation built, detection disabled + no infobar and no urlbar icon - Language detection, using textContent instead of document encoder == Baseline + Translation built, detection disabled + no infobar and no urlbar icon + using textContent instead of document encoder. So in this light, the compare-talos will become: Translation built, detection disabled: http://compare-talos.mattn.ca/?oldRevs=bff5d7111f4f&newRev=c4eece2a5404&server=graphs.mozilla.org&submit=true Language detection, no infobar and no urlbar icon: http://compare-talos.mattn.ca/?oldRevs=c4eece2a5404&newRev=c31fd5b1cbba&server=graphs.mozilla.org&submit=true Language detection, using textContent instead of document encoder: http://compare-talos.mattn.ca/?oldRevs=c31fd5b1cbba&newRev=c32c8cd864ea&server=graphs.mozilla.org&submit=true
Depends on: 1013374
Depends on: 1013861
So, to summarize: - the urlbar/infobar displaying caused the big spike in the numbers. Without that the measurements are controlled - tpaint is unaffected - With the current document encoder usage, there are appears to be a possible small 1 - 2% regression on Ubuntu and WinXP for Tp5 that looks distinct from pure noise - With textContent that goes completely away, but further work is requred to use textContent. - Memory measurements during Tp5 grow, which is expected. But we need to fully understand how the memory usage is calculated during Tp5. (It looks like the fixed overhead of the feature is added up for each page tested) I tried playing with the parameters of the document encoder, as it offers some optimization flags. It helped bring the numbers down to 0.5 - 1.5%, but it wasn't enough to bring it to 0. I'll experiment some more with it. So nothing looks like to be a blocker for the trial run, and there's a clear path on getting where we need when it comes time to enable the feature by default and having it running during Talos. I think we can close the bug if no one disagrees with the analysis
Flags: needinfo?(felipc)
Whiteboard: p=13 s=it-32c-31a-30b.2 [qa-] → p=13 s=it-32c-31a-30b.3 [qa-]
Marking as fixed per comment 10. Marco, can you re-add this bug to the 32.2 iteration and update the iteration perf report? Thanks!
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Whiteboard: p=13 s=it-32c-31a-30b.3 [qa-] → [translation] p=13 s=it-32c-31a-30b.3 [qa-]
You need to log in before you can comment on or make changes to this bug.