Closed Bug 862341 Opened 11 years ago Closed 9 years ago

Always record the request, even if the Network Monitor is not open

Categories

(DevTools :: Netmonitor, defect, P1)

defect

Tracking

(firefox40 fixed)

RESOLVED FIXED
Firefox 40
Tracking Status
firefox40 --- fixed

People

(Reporter: st3fan, Assigned: past)

References

Details

(Whiteboard: [status:planned][polish-backlog])

Attachments

(5 files, 9 obsolete files)

21.68 KB, patch
vporof
: review+
Details | Diff | Splinter Review
6.61 KB, patch
vporof
: review+
Details | Diff | Splinter Review
12.80 KB, patch
vporof
: review+
Details | Diff | Splinter Review
6.83 KB, patch
vporof
: review+
Details | Diff | Splinter Review
8.96 KB, patch
past
: review+
Details | Diff | Splinter Review
Would it be possible to always record the network requests, even if you don't currently have the network tab open?

My workflow usually starts in the Web Console and then when I see something happening there I want to switch to the Network view to inspect requests that just happened.

But the network view is empty. It only records requests if you have that tab open. In my experience this is not super useful for debugging, because you usually don't want to reload just to get that data again.
Hm, odd. I tried this again and it seems to work as expected. I'm not sure what happened when I saw no recording; I'll try to come up with STR for that.
Caching network requests across tools isn't currently implemented. I think the STR would be something like this:

1. Open Web Console (don't open the Network Monitor)
2. Wait for some requests to happen
3. Open Network Monitor
=> requests aren't there.

Bug 860823 was already filed for this, but I'm duping that one to this bug because this one has a nicer description :)
Summary: Always record the request, even if the Network panel is not open → Always record the request, even if the Network Monitor is not open
OS: Mac OS X → All
Hardware: x86 → All
Priority: -- → P2
Could that be part of Bug 991806 - Firebug Gaps? I am used to opening the console in FB, reloading the page, checking the results and switching to the Network Panel from there, which has already loaded the network results.
I think the central bug here is that network requests should be logged in the netmonitor or console once the toolbox is open instead of once the tool ( console or netmonitor ) is shown.
Whiteboard: [status:planned]
I looked into this, and the problem is that network requests are stored in the console frontend, not in the actor, and there is no general communication mechanism between the tools in a toolbox.

Indeed, we try to make sure unused tools do not affect the performance of the browser by not loading their frontend until the user switches to them. The network event store is kept in WebConsoleFrame, which lives inside the console iframe.

Other console messages, like page errors and console API calls go through the console storage service, because the user expects them to be collected even when the console is closed. This has a performance impact (24 or 70 msec per console call, depending on whether e10s is disabled or not - see bug 1107699 comment 9) and I bet storing network events would be even more expensive. Also, sites tend to avoid logging code in production, but they certainly can't avoid making network requests.

So storing network events in the console storage service wouldn't fly, but we could keep a network event store in the console actor, so that both the console and the network monitor could send getCachedRequests messages to it. This, however, has the disadvantage of increasing the memory consumption of the backend, and memory is at a premium on phones.

Another option is to move the network event store from the console frontend to the console client, which is already shared by both tools. The console client is currently stateless, unlike, say, the debugger client, but that's not by design. This is the most promising and easiest-to-implement solution I can think of.
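Roughly, the shape I have in mind (a sketch only; the method names here are illustrative, not the final patch):

  // The web console client is shared by the console and the netmonitor,
  // so a cache kept here is visible to both frontends.
  function WebConsoleClient(debuggerClient, actor) {
    this._client = debuggerClient;
    this._actor = actor;
    // actor ID -> network event metadata, filled as notifications arrive.
    this._networkRequests = new Map();
  }

  WebConsoleClient.prototype = {
    // Called for every networkEvent notification from the server.
    _onNetworkEvent: function (packet) {
      this._networkRequests.set(packet.eventActor.actor, packet.eventActor);
    },

    // Either tool can replay the cache when its frontend starts up.
    getNetworkEvents: function () {
      return [...this._networkRequests.values()];
    }
  };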
Oh, and to achieve what Jeff wants in comment 7, we would need to have the console client connect on toolbox open, kind of how we initialize the inspector front in toolbox.js. But I think that's a separate concern.
(In reply to Panos Astithas [:past] from comment #9)
> Oh, and to achieve what Jeff wants in comment 7, we would need to have the
> console client connect on toolbox open, kind of how we initialize the
> inspector front in toolbox.js. But I think that's a separate concern.

Thanks for looking into this!
Blocks: 861335
Sorry to burst in. Just filed bug 1126093. Might it be related workwise? (if it's not actually a wontfix)
I think bug 1126093 is a separate issue (and a valid bug IMO).
Whiteboard: [status:planned] → [status:planned][devedition-40]
It will be tricky to make any frontend changes without causing a big rebase on Bug 1149634 (which splits the frontend up into a bunch of modules)
Depends on: 1149634
Assignee: nobody → past
Status: NEW → ASSIGNED
Priority: P2 → P1
(In reply to Brian Grinstead [:bgrins] from comment #13)
> It will be tricky to make any frontend changes without causing a big rebase
> on Bug 1149634 (which splits the frontend up into a bunch of modules)

Would this actually touch the frontend very much? This looks like mostly backend work.
This is the first part that extracts the network request cache handling logic from the web console frontend to the web console protocol client. This client is also used by the network monitor, so the two tools will be able to share its cache.
Forgot to mention that all web console tests still pass locally. I expect that the second patch will mostly touch the netmonitor controller, so it shouldn't conflict much (if at all) with the frontend refactoring work.
This is the second part that makes the web console use the network event cache on startup. With this, when we open the toolbox in the network panel, record some traffic and then switch to the web console, it will now display every network request stored in the cache.

This required moving the cached message sorting from the server to the client, as the network cache resides in the frontend. I verified that nothing, including Firebug, depends on the previous behavior of the getCachedMessages protocol request, which returned an ordered list.
Forgot to mention that this patch now uses the correct timestamp for network requests and ensures that they will be properly interleaved with other messages. Previously we used the time the logging took place, instead of the time the request was made (which is way off when caching is in place).

The reflow logging has the same problem, which needs a one-line fix that I need to test first. Luckily, we don't cache reflow logging events and anyway the most interesting bit of information in those messages is the reflow duration, not the start time.
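The client-side sort itself is trivial; something like this (illustrative, and it assumes every cached message carries the timestamp of the original event):

  // Merge cached page errors, console API calls and network events, then
  // order them by when each event actually happened, not when it was logged.
  function sortCachedMessages(messages) {
    return messages.slice().sort(function (a, b) {
      return a.timeStamp - b.timeStamp;
    });
  }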
This part makes the network panel display requests observed by the web console in cases where the console was opened first, the requests were received, and then the network panel was opened. All tests pass, but I don't get the response body in the network panel, because the console hadn't requested it. I'll see if I can find a way around that.

It turns out the console and the network panel use slightly different object shapes for the network events for no apparent reason. This makes the API I'm designing uglier than it needs to be and I have to look into that, too.
The problem described in comment 19, with the console being configurable to not record the request body, is unfortunately one place where the two tools differ, as the net monitor always records it by default. We could either remove the option from the console, or do nothing and let the user enable it if this workflow is common for her.

The problem with the first option is that it would make the console and the page slower as copying and transferring the file data across the protocol connection isn't going to be cheap. Perhaps we could rely on users disabling the network filters in the console to restore a snappy console, but maybe that's too much to hope for?

The second option could be improved by displaying an explanation in the net panel's response tab, which is now empty when the body isn't recorded. Something like "Request and response bodies weren't recorded; toggle the option in the Web Console to change this".

That also highlights the fact that we have two separate tools controlled by the same option (even though it's actually hard-coded in the netmonitor), so we should surface it somewhere more prominent, like the options panel. But then again, if we accept the view that a user who turns to the network panel needs to examine request bodies, while a user in the console is not interested in them, then maybe we shouldn't. I think this all gets back to a more general question about how the two tools are supposed to be used, which is even more important to consider because of this:

(In reply to Jeff Griffiths (:canuckistani) from comment #7)
> I think the central bug here is that network requests should be logged in
> the netmonitor or console once the toolbox is open instead of once the tool
> ( console or netmonitor ) is shown.

Having thought about this some more, it sounds like something useful for people who inspect network requests a lot, but it poses problems for others who measure performance. Profiling the page is going to be affected by the console & network actors working during that time, the same way it is affected by enabling JS debugging at the same time. If anything, we should minimize interference when profiling, not increase it.

Furthermore, always logging network requests might pose problems when debugging mobile devices. In some cases the increase in memory usage (especially when recording bodies) could result in the debugged app getting killed, even if the user only intended to inspect CSS.

My opinion is that we should keep advocating and supporting a design where the only tool that is expected to affect the performance of the debugged app is the one that is already open, plus any others that the user has explicitly interacted with in the current toolbox session. For example, opening the debugger and then switching to the netmonitor will obviously have a combined effect on the performance of the app. I think introducing more magic behavior behind the scenes in an attempt to anticipate what the user will need is going to delight some, but frustrate others. Debugging tools should be more predictable than surprising.

Jeff, what do you think?
Flags: needinfo?(jgriffiths)
> My opinion is that we should keep advocating and supporting a design where
> the only tool that is expected to affect the performance of the debugged app
> is the one that is already open

I think we've decided to go ahead and turn on the debugger no matter what, because our engine has gotten good enough that it hardly affects perf anymore. Personally, I think having the order in which you click the tabs drastically affect the behavior of our tools is incredibly unintuitive. Although we should work hard under the hood to mitigate the effects of the various tools on perf etc., the user shouldn't have to worry about this most of the time. (I'm fine with things like the request body not being recorded, or various deeper things not being done, but I would advocate that basic stuff, like the debugger breaking on a "debugger" statement, should always happen regardless of which tabs you have clicked.)
Panos - I think I generally agree that we should be as performant as possible, but I want to get into a bit more detail on the various trade-offs. My impression is that most of the perf issues currently in the network monitor are related to how we do output.

I think the right thing to do is to initially ship with logging of request/response bodies off, as we currently do. What I am really curious about is the overhead once we have code that tracks requests without the netmonitor visible, both with and without this setting enabled.

I have been playing around with Brian's demo: https://bgrins.github.io/devtools-demos/misc/requests.html

Baseline: toolbox not open: 17268ms, 15574ms, 18829ms

* console visible, log request/response bodies enabled: 17293ms, 17288ms, 17256ms
* console visible, log request / response bodies disabled: 13707ms, 14178ms, 16875ms
* console active but style editor visible, log request/response bodies enabled: 11918ms, 13420ms, 17627ms
* console active but style editor visible, log request/response bodies disabled: 11340ms, 13744ms, 15069ms
* console not active, toolbox open ( style editor ): 17264ms, 17564ms, 15274ms

This is with an e10s window. A non-e10s window paints a similar picture:

* without the toolbox open: 17289ms, 11876ms, 11490ms
* with the console open, log request/response bodies enabled, non-e10s: 16488ms, 11983ms, 11933ms, 11331ms

If I were to conclude anything from these numbers, it is that the devtools backend we have for capturing network requests is not the problem. The front-end of the netmonitor is the problem:

* non-e10s, netmonitor open: 41253ms, 38673ms, 39701ms (major jank; the page is locked up for 10-20 seconds)

The other story is that e10s saves us from impacting page perf much:

* e10s, netmonitor open: 18553ms, 25490ms, 17664ms

...however, the rendering cycle in the toolbox seems pretty bad - no matter what, the actual final rendering of each request's response data, timing, etc. in the table takes an additional ~10 seconds, during which both the page and the toolbox are locked up.

My conclusion is that there may be a perf hit in the protocol for capturing network responses, but I can't detect it here. We do need a better test that delivers larger response sizes; I think we would then see more variability between the two settings. I agree that capturing response sizes should not be enabled by default, but I want to test more and explore this in the future. It's very natural / convenient for a web developer to look at the netmonitor, see a request that already happened, and then expect to see the response body.

Suggested requirement here: if I inspect a request with no logged response body and click on the response or preview panel, I should have some way right there to re-fetch the request.

A caveat: this is a brutal and very synthetic perf test, but very popular sites do make enormous numbers of requests and do a pretty shitty job of caching assets:

* http://note.io/1Fy6coY ( huffington post )
* http://note.io/1PYOpKw ( cnn.com )

The current experience of loading these sites with the netmonitor active is very slow and janky - I think this is covered in other bugs.
Flags: needinfo?(jgriffiths)
Sorry, when I said: "I agree that capturing response sizes should not be enabled by default"

...what I really meant was "I agree that capturing response bodies should not be enabled by default"
By popular demand, this patch starts the network activity listener when the toolbox opens, instead of when either the web console or the network panel open. It doesn't record request bodies, just the request headers and metadata in a fast and (hopefully) efficient manner (I haven't really benchmarked this yet).
This is a better fix for recording when the toolbox opens, because it moves the logic from the toolbox to the target. I believe initInspector and friends should probably move there, too. Our frontend framework distinguishes between UI-related (panel-related, if you will) and RDP-related tasks, with the former handled by the Toolbox object and the latter by the TabTarget object.

Every opened Toolbox gets or creates a TabTarget, which has a reference to the DebuggerClient that handles the connection to the server. Once the TabTarget is remoted (via makeRemote), the DebuggerClient instance will have already stored a reference to the active tab. With this patch it will also store a reference to the active console, for any tool that might want to access it. Note that the DebuggerClient.attachConsole method (like all attachFoo methods) already caches the console pertaining to the active tab and returns it if called directly, so any tool or test still using that will continue to work fine.

As an added bonus, this change is a net reduction in LOC, and it adds to almost every existing netmonitor test a check that verifies that recording before the tool is opened actually works.
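In sketch form, the flow is roughly this (hypothetical shape, not the actual diff):

  // Once the target is remoted, attach the console right away so the
  // network activity listener starts with the toolbox, not with a panel.
  TabTarget.prototype.makeRemote = function () {
    return this.client.connect()
      .then(() => this.client.attachTab(this.form.actor))
      .then(([response, tabClient]) => {
        this.activeTab = tabClient;
        // "NetworkActivity" asks the console actor to observe requests;
        // request/response bodies are still not recorded by default.
        return this.client.attachConsole(this.form.consoleActor,
                                         ["NetworkActivity"]);
      })
      .then(([response, consoleClient]) => {
        // Any tool (or test) can now reach the shared console client.
        this.activeConsole = consoleClient;
      });
  };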
Attachment #8600857 - Attachment is obsolete: true
Attached patch Part 5: Tests (obsolete) — Splinter Review
Here are two additional tests (apart from the check in part 4) that make sure opening one tool and then switching to the other results in the network request appearing in both.
Improved the head.js fix from the previous version and fixed browser_net_reload_button.js to not assert at the beginning that no requests are displayed. One request now actually appears when the test starts, because the reload triggered in head.js causes an XHR request to occur on page load, due to the way the test is written.

Originally I had made the test assert that 1 request was present and included this explanation there, but that only served to confuse readers with test setup details unrelated to what the test actually wants to verify.
Attachment #8600929 - Attachment is obsolete: true
Attached patch Part 5: Tests (obsolete) — Splinter Review
Attachment #8601465 - Attachment is obsolete: true
Attachment #8601547 - Flags: review?(vporof)
Attachment #8601548 - Flags: review?(vporof)
Attachment #8601549 - Flags: review?(vporof)
Attachment #8601552 - Flags: review?(vporof)
Attachment #8601553 - Flags: review?(vporof)
Turns out that other tests make network requests on document load, so head.js can only assert that *some* events occurred.
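In other words, the head.js check boils down to something like this (names illustrative):

  // Pages from other tests can fire their own requests on load, so only
  // assert that recording-before-open produced at least one event.
  ok(RequestsMenu.itemCount > 0,
     "Expected at least one request to be recorded before opening the tool.");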
Attachment #8601552 - Attachment is obsolete: true
Attachment #8601552 - Flags: review?(vporof)
Attachment #8601921 - Flags: review?(vporof)
Attached patch Part 5: TestsSplinter Review
I somehow missed the failure of browser_webconsole_netlogging.js on the previous try run, and it didn't fail locally, so it took me a while to investigate. The problem lies in the test itself and is triggered by the 2 new tests added by this patch that use the same HTML file. The test cannot even run successfully today when used with --run-until-failure, as it always fails on the second run.

The problem is that the test expects the discardResponseBody flag to be set on the first load. However, this doesn't happen if the page was loaded from the cache. This is why running the single test with --run-until-failure works on the first run (page not in cache) and fails on the second run (page now in cache). The fix is to be prepared for both outcomes, which makes --run-until-failure complete without errors.
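The fix amounts to something like this (a sketch; the surrounding harness is omitted and the names are approximate):

  // discardResponseBody is only set when the page came off the network;
  // a cached load doesn't set it, so the test must tolerate both outcomes.
  function checkDiscardFlag(request) {
    if (request.discardResponseBody) {
      info("Response body was discarded: the page came from the network.");
    } else {
      info("Response body was kept: the page was served from the cache.");
    }
    ok(true, "Test tolerates both cached and non-cached loads.");
  }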
Attachment #8601553 - Attachment is obsolete: true
Attachment #8601553 - Flags: review?(vporof)
Comment on attachment 8601547 [details] [diff] [review]
Part 1: Move the network request storage from the console frontend to the console client so the netmonitor can reuse it

Review of attachment 8601547 [details] [diff] [review]:
-----------------------------------------------------------------

Looks mechanical enough.
Attachment #8601547 - Flags: review?(vporof) → review+
Attachment #8601548 - Flags: review?(vporof) → review+
Comment on attachment 8601549 [details] [diff] [review]
Part 3: Display cached network requests in the network panel

Review of attachment 8601549 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/devtools/netmonitor/netmonitor-controller.js
@@ +542,5 @@
> +          },
> +          networkInfo: cachedEvent
> +        });
> +      }
> +    }

Nit: I'd move this into a function, but I don't feel strongly about it.
Attachment #8601549 - Flags: review?(vporof) → review+
Attachment #8601921 - Flags: review?(vporof) → review+
Comment on attachment 8602158 [details] [diff] [review]
Part 5: Tests

Review of attachment 8602158 [details] [diff] [review]:
-----------------------------------------------------------------

It might be good to have a small frontend test as well. E.g. opening a page should produce at least one request, for the HTML.
Attachment #8602158 - Flags: review?(vporof) → review+
Moved the loop to a _displayCachedEvents function.
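For reference, the extracted helper is essentially this (following the shape quoted in the review above):

  _displayCachedEvents: function () {
    for (let cachedEvent of this.webConsoleClient.getNetworkEvents()) {
      // Feed each cached request through the same path used for live
      // networkEvent notifications, so the display logic stays in one place.
      this._onNetworkEvent("networkEvent", { networkInfo: cachedEvent });
    }
  },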
Attachment #8601549 - Attachment is obsolete: true
Comment on attachment 8602848 [details] [diff] [review]
Part 3: Display cached network requests in the network panel

Carrying r+.
Attachment #8602848 - Flags: review+
(In reply to Victor Porof [:vporof][:vp] from comment #39)
> It might be good to have a small frontend test as well. E.g. opening a page
> should at least have one request for the html.

Clarified on IRC that these tests satisfy this requirement.
Depends on: 1162961
Depends on: 1163024
As expected, this change increased the time to open the toolbox by ~15% across platforms. Although we should keep trying to come up with more ways to improve performance, this change by necessity makes the toolbox open event trigger more work than before, so some degradation is expected.
Blocks: 1163518
(In reply to Panos Astithas [:past] from comment #45)
> As expected, this change increased the time to open the toolbox by ~15%
> across platforms. Although we should keep trying to come up with more ways
> to improve performance, this change by necessity makes the toolbox open
> event trigger more work than before, so some degradation is expected.

Are we yielding on toolbox open until these listeners are set up? If so, would it be possible to do the setup asynchronously (with the knowledge that we'd lose any requests that came in between toolbox open and the work being finished)?
ni? for comment 46
Flags: needinfo?(past)
If I understand correctly what you are suggesting, setting up the listeners asynchronously would cause the active console to become available at an indeterminate point in time, which in turn would force the frontend (all related frontends) to poll for it.

In any case, I don't think it's the listener setup phase that's causing most of the delay, but rather the request streaming afterwards. Caching the requests in the backend, as we have discussed, would avoid this, but that is definitely follow-up work.
Flags: needinfo?(past)
Whiteboard: [status:planned][devedition-40] → [status:planned][polish-backlog]
Product: Firefox → DevTools