Closed Bug 1389496 Opened 7 years ago Closed 7 years ago

Slow getHost method in Page Zoom WE

Categories

(WebExtensions :: Compatibility, defect, P3)

57 Branch
defect

Tracking

(firefox-esr52 unaffected, firefox55 wontfix, firefox56+ wontfix, firefox57 fixed)

RESOLVED FIXED
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- wontfix
firefox56 + wontfix
firefox57 --- fixed

People

(Reporter: alice0775, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: hang, regression)

Attachments

(2 files)

Build Identifier:
Build ID 20170810100255
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0

Reproducible: always

Steps To Reproduce:
1. Install some WebExtensions. e.g, 
    uBlock Origin WebEx version
    ( https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/versions/beta?page=1#version-1.13.9b8 )
    Page Zoom We
    ( https://addons.mozilla.org/en-US/firefox/addon/zoom-page-we/ )

2. Open Web Page in Tab[A]
   e.g, https://blog.nightly.mozilla.org/
3. Select All (Ctrl+A) and Right click > Choose "View Selection Source"

4. Switch to original Tab[A]
5. Click WebExtensions's toolbar button
   --- observe, WebExtensions's popup menu is broken(empty)
6. Open http://www.google.com/ etc.
   --- observe, page loading fails. Tab loading icon is spinning forever.

Actual Results:
WebExtensions do not work properly anymore once after open "View Selection Source".
Page loading fails. Tab loading icon is spinning forever.

Expected Results:
WebExtensions should be working without any problem.
Setting extensions.webextensions.remote = false fixes the problem. Although "View Selection Source" is extremely slow and almost unusable.
Blocks: 1357486
Keywords: regression
[Tracking Requested - why for this release]: WebExtensions do not work properly anymore once after open "View Selection Source".
Page loading fails. Tab loading icon is spinning forever.
Blocks: webext-oop
Keywords: hang
Unable to reproduce with 57 on osx.  Perhaps this is platform specific.
Priority: -- → P1
Summary: WebExtensions do not work properly anymore once after open "View Selection Source" → OOP bustage after open "View Selection Source"
Also unable to reproduce the issue on Linux 64-bit with current Nightly 57 + uBlock Origin in its own child process.
Are you able to reproduce this on 57?
Flags: needinfo?(alice0775)
I can still reproduce the problem and hang on Nightly57.0a1.
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0 Build ID	20170814100258
Flags: needinfo?(alice0775)
Attached file about:support
And I am not the only one facing this problem.
See http://forums.mozillazine.org/viewtopic.php?p=14760276#p14760276
Correcting my comment 4: I can reproduce when installing Page Zoom WE mentioned in comment 0. No need to install uBlock Origin, there is definitely an issue when trying to view-source a selection with Page Zoom WE installed. I will investigate further and post findings if any.
The issue is with Page Zoom WE, an inordinate amount of time is spent in its getHost() function, which is as follow:

    function getHost(url)
    {
        var host;

        if (url.substr(0,6) == "about:") host = url;
        else host = url.match(/.+:\/\/([^:/?#%]*)/)[1];

        return host;
    }

More specifically, in the url.match() call. I observed that when "View Selection Source" is clicked, the browser will open a new tab with the selected source, and which URL is data: URI like "view-source:data:text/html;charset=utf-8,[...]", where [...] is the content o the selected HTML content of the original web page. The more HTML code the selection encompass, the longer it will take Page Zoom WE to do its job. This means that when other extensions share the same content process as Zoom Page WE, they will not be accessible for long seconds until Page Zoom WE finishes its job.

Many things in need of improvement in Zoom Page WE. For sure, the regular expression in getHost() is flawed: from the look of it, the developer really wanted to extract the hostname part of the URL, and it fails to properly do this. A better, saner regex would quite likely take care of the issue issue here, as a properly crafted regex would fail early when testing against the data: URI used for view-source: of selection. Since content scripts can only be injected is http documents, another way is to simply test against /^https?:\/\// and if there is a match, then to just feed the URL to a pre-allocated <a> tag to easily extract the hostname part of the URL[1].

Side note regarding String.prototype.match(): I have observed in some recent profiling that this method seemed to have suffered significant performance regression compared to stable Firefox. I meant to investigate further for confirmation and test case but hadn't had the time. Is this a known issue?


[1] https://developer.mozilla.org/en-US/docs/Web/API/HTMLHyperlinkElementUtils/hostname
getHost()'s url.match(/.../) is maxing out CPU usage.
See Also: → 1390715
Track 56+ as this issue is related to WebExtension.
Gerry, I'm not convinced this is a webextension issue due to comment 10.  R.Hill, would that be correct?
Flags: needinfo?(rhill)
need info'ing dw-dev based on comment 10
Flags: needinfo?(dw-dev)
Component: WebExtensions: Untriaged → WebExtensions: Compatibility
Priority: P1 → P3
Summary: OOP bustage after open "View Selection Source" → Slow getHost method in Page Zoom WE
Yes, the issue is really with Zoom Page WE, it uses a bad regular expression:

Test page: https://regex101.com/

- Select "javascript" flavor in the left-hand side bar
- In "Regular Expression" input field, paste /.+:\/\/([^:/?#%]*)/ (that is Zoom Page WE regex)
- Open in a new tab: https://blog.nightly.mozilla.org/
- Select all text in that new tab
- Right-click "View Selection Source"
- In the newly opened view-source: tab, copy the URL to the clipboard
- Go back to the regex101.com tab
- In the "Test String" text area field, paste the content of the clipboard

Result: "timeout".

This is true for Nightly and Chromium, hence not specific to Nightly. The only difference though is that with Chromium, other extensions would still be able to function properly since each is in its own dedicated process.

If I minimally change Zoom Page WE's regular expression from

    /.+:\/\/([^:/?#%]*)/

to

    /^[a-z]+:\/\/([^:/?#%]*)/

the issue go away.
Flags: needinfo?(rhill)
I can reproduce this problem, but I think the issue is wider than just the regular expression.

Raymond,

In Comment 10 you said: "Many things in need of improvement in Zoom Page WE."

Please can you outline the other improvements and/or issues ?
Flags: needinfo?(dw-dev) → needinfo?(rhill)
> I think the issue is wider than just the regular expression.

Fixing the regular expression will definitely make the described issue in OP disappear. Just adding the ^ at the start of it as it is, is sufficient actually, but making it a bit more targeted given that your extension is not meant to be injected in anything else than https? page is just simpler: /^https?:\/\//.

> Please can you outline the other improvements and/or issues ?

I can't find a repo for your extension code, so I used the browser content toolbox, through which I got the line numbers. Below is what I found in content.js.

Line 267: you are using an anonymous function in your mutation handler. This anonymous function is being created each time your handler is called. You should use a static version of the function by moving it outside the handler. This way it won't be locally redefined each time your handler is called. Imagine the js engine internally defining and optimizing this function just to have this all thrown once the handler finishes.

Blocks of code at line 276 and 298. My understanding is that you are walking up the DOM tree up to the root element to neutralize your extension styles so that your code can get the normal font-size value of elements. This is done for each single added nodes of all mutation records.

As you walk up the tree, you test-and-set the zoom-page attribute, first to neutralize your style, then afterward to restore it. I think this code should be re-arranged to avoid repeating such not necessarily cheap operations: how about you first collect all the nodes for which your zoompage-fontsize must be set, in a Set. I see two significant advantages:

1. to avoid repeated handling of same elements -- which does occur a lot given that you lookup all the child elements for every added node (try with Twitter).
2. you will only need to neutralize/restore your style once for the whole mutation handler, instead of for each added node.

What I refer to "handling" above is at line 409 and 437: your mutation handler ends up effectively causing multiple handling of the same elements when it is called. Using a Set to first collate all the elements to handle will resolve this. This can't be cheap given what is happening in addFontSizeAttribute. Once all elements are collected in the Set (this includes the children at line 285), then you can iterate that resulting Set, possible even just calling addFontSizeAttribute with Array.from(Set) as argument.

Another suggestion: to neutralize your zoompage-fontsize CSS rules, why not just disable your style tag? [1] This way you won't even have to walk up the tree twice, let alone for each added node as it is now.

Again, needless redefinition of two local arrays at line 413-414. addFontSizeAttribute is probably one of the most called function when your mutation handler is called (handling multiple times the same elements add to this). Moving these two arrays outside so that they are always already defined surely would remove needless overhead. Also I would consider using Sets/has() instead of using Array/indexOf().

[1] https://developer.mozilla.org/en-US/docs/Web/API/StyleSheet/disabled
Flags: needinfo?(rhill)
You can also use:
> new URL(url).hostname

To get the host of the url.
I have just submitted Zoom Page WE 5.2 to Mozilla for review and release.

Sorry it has taken a bit of time. Changing the regular expression in the getHost() function was only a partial solution. To avoid this error case (and other similar cases) there needed to be more pervasive changes made higher up in the logic tree.

The investigation has led to a clearer classification of the types of pages that can be zoomed and the ways that they can be zoomed. The end result is a better and more easily understood design.

With regards to the very helpful comments made by Raymond Hill in Comment 17. Most of the suggestions are focused on improving performance either by making functions/arrays static or by avoiding repeated handling of the same elements. I think it is likely that these suggestions would achieve incremental performance improvements, but I have no idea what the overall effect would be on performance. 

The stand-out idea is the one about disabling the style tag to neutralize the zoompage-fontsize CSS rules. I think this will work and I should have thought of it! A case of not seeing the wood for the trees!  I am assuming that disabling and enabling the style tag will require much less processing than the current approach. The only way to be sure is to try it.

Another idea I am thinking about is to only apply the zoompage-fontsize attributes when the user is actually using Text Zoom or Minimum Font Size. These attributes are not required for Full Zoom. The downside is the probability of a short delay and display stutter when switching into Text Zoom or Minimum Font Size.

I will be prototyping both of the above ideas before releasing Zoom Page WE 6.0.
 
As far as I am concerned this bug can be closed.
I have just submitted Zoom Page WE 5.3 to Mozilla for review and release.

This release includes a new 'Lite Version' option for users that do not need Text Zoom or Minimum Font Size.

When this option is enabled, the operations are restricted to Full Zoom only, and the zoompage-fontsize attributes are not added to the document elements.
Mark 56 won't fix as we are only 1 week from RC.
The original issue reported by Alice White, caused by an incorrect regular expression, was fixed in Zoom Page WE 5.3 released on 28 August 2017.

The subsequent concerns raised in Comment 17 have been thoroughly investigated and addressed. It was found that the thing that consumed the most time was walking up the tree to neutralize the higher-level 'zoompage-fontsize' CSS rules. This triggered additional mutations that dwarfed any other inefficiencies in the implementation.

The mutation observer has been re-designed to avoid walking up the tree and now consumes very little time.

This re-design is included in Zoom Page WE 5.4 released on 7 September 2017.

This bug should be marked as RESOLVED.
Updating as per comment 22. Thanks dw-dev.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Toolkit → WebExtensions
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: