Page Info dialog box abuses the cache service, potentially displaying misleading results, and potentially blocking the UI thread on disk I/O

RESOLVED INVALID

Status

()

Firefox
Page Info Window
RESOLVED INVALID
6 years ago
5 months ago

People

(Reporter: nwgh, Unassigned)

Tracking

(Blocks: 3 bugs, {main-thread-io, perf})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [Snappy:p2])

(Reporter)

Description

6 years ago
The necko team would like to remove the synchronous cache API (specifically, nsICacheService::OpenCacheEntry). Besides internal necko uses and tests, both pageinfo (browser/base/content/pageinfo/pageInfo.js) and panorama (browser/components/tabview/thumbnailStorage.js) make use of that API. These should both be converted to use nsICacheService::AsyncOpenCacheEntry.
I filed bug 738128 for the Panorama case because Page Info and Panorama are in different modules.

The Page Info dialog box tries to use the cache to determine the document size and sometimes to determine the mime type. Unfortunately, (a) it doesn't always compute the cache key correctly, AFAICT, (b) the cache entry might have been updated due to a load of the same page, so the info returned from the cache may not refer to the version of the page being shown in the current tab, and (c) bug 121616, 714111, 684612, 497577, 239200, 145877, 271370, 229843 all show that the size info is often incorrect.

I propose that we remove the size field from the general tab, and that we remove the size fields from the media tab; this will remove most of the usages of the cache API from the dialog box, and "fix" the bugs above about the size field being wrong.

Also, I propose that we stop trying to get the mime type from the cache here, which would remove the final use of the cache.
Component: General → Page Info
Keywords: main-thread-io, perf
QA Contact: general → page.info
Summary: browser should not use synchronous cache API → Page Info dialog box abuses the cache service, potentially displaying misleading results, and potentially blocking the UI thread on disk I/O
Whiteboard: [Snappy]

Comment 2

6 years ago
> Also, I propose that we stop trying to get the mime type from the cache here, which
> would remove the final use of the cache.
Where would we get the mime type instead?
(In reply to Philip Chee from comment #2)
> > Also, I propose that we stop trying to get the mime type from the cache here, which
> > would remove the final use of the cache.
> Where would we get the mime type instead?

My preferred option is to just remove the mime type fields from both the General tab and the Media tab. I think this is the right thing to do because Page Info should be end-user-oriented, not developer-oriented, and developers can get the Content-Type header from the web console.

If removing the MIME type from the general tab is unacceptable, then my second-most-preferred option is to remove the mime type field from the Media tab, permanently.

If we cannot agree to that, then my third-most-preferred option is to remove the fallback logic that uses the cache in the media tab, and then file follow-up bugs about improving the fallback logic in way that doesn't abuse the cache.

The MIME type is retrieved from the cache only for the case where the user selects an item in the Media tab, and only when it can't find out the MIME type out using other means . The MIME type for document, for use in the General tab, is obtained directly from the document. This means that only the Media tab is problematic as far as cache usage is concerned. 

The code looks like this:


  var mimeType;
  if (item instanceof HTMLObjectElement ||
      item instanceof HTMLEmbedElement ||
      item instanceof HTMLLinkElement)
    mimeType = item.type;

  if (!mimeType && !isBG && item instanceof nsIImageLoadingContent) {
    var imageRequest = item.getRequest(nsIImageLoadingContent.CURRENT_REQUEST);
    if (imageRequest) {
      mimeType = imageRequest.mimeType;
      var image = imageRequest.image;
      if (image)
        numFrames = image.numFrames;
    }
  }
  if (!mimeType)
    mimeType = getContentTypeFromHeaders(cacheEntryDescriptor);


getContentTypeFromHeaders is the function that retrieves the (possibly out of date) MIME type from the cache entry descriptor.
(In reply to Brian Smith (:bsmith) from comment #3)
> My preferred option is to just remove the mime type fields from both the
> General tab and the Media tab. I think this is the right thing to do because
> Page Info should be end-user-oriented, not developer-oriented, and
> developers can get the Content-Type header from the web console.

Yeah, let's do this.

Updated

6 years ago
Whiteboard: [Snappy] → [Snappy:p2]
(In reply to Brian Smith (:bsmith) from comment #1)

> I propose that we remove the size field from the general tab, and that we
> remove the size fields from the media tab;

The size field of the Media tab is the only way to quickly access the largest image of the page, which is likely the one an user may be interested in saving.

I understand your rational for removing the page size from the general tab (I agree that it's often wrong, as compressed pages are very common), but I don't see any rational in the above comments for removing the size from the Media tab (the referenced bugs where that size is wrong are corner cases).


(In reply to Brian Smith (:bsmith) from comment #3)

> My preferred option is to just remove the mime type fields from both the
> General tab and the Media tab. I think this is the right thing to do because
> Page Info should be end-user-oriented, not developer-oriented, and
> developers can get the Content-Type header from the web console.

People who are interested in the type of an image aren't necessarily web developers who understand HTTP headers.
Graphic designers for example may want to know if an image is in a PNG or JPEG format before saving it, and that may not always be obvious from the URL.
Also, the current code doesn't show the MIME type directly to the user, it formats it first (and the "image" word will be localized).

Now that the synchronous cache access has been removed by bug 695399, is this bug wontfix?
(In reply to Florian Quèze [:florian] [:flo] from comment #5)
> (In reply to Brian Smith (:bsmith) from comment #1)
> Now that the synchronous cache access has been removed by bug 695399, is
> this bug wontfix?

The motivation for doing this isn't just bug 695399; see the other bugs this bug is blocking. I think the networking team doesn't have an immediate need to see this happen, but I do think it is likely that the cache APIs that are (still) being used by this feature may change substantially. I guess we'll cross that bridge when we come to it.
Status: NEW → RESOLVED
Last Resolved: 5 months ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.