Closed Bug 121616 Opened 24 years ago Closed 16 years ago

Page info reports wrong size for compressed pages

Categories

(SeaMonkey :: Page Info, defect)

Platform: x86
OS: All
Type: defect
Priority: Not set
Severity: minor
Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: dennishaney11, Assigned: db48x)

Details

For compressed pages, the compressed size is reported rather than the actual size. Both the compressed size (if any) and the actual size should be reported, because each is interesting from its own point of view.
mass moving open bugs pertaining to page info to pmac@netscape.com as qa contact. to find all bugspam pertaining to this, set your search string to "BigBlueDestinyIsHere".
QA Contact: sairuh → pmac
Oh, you mean like the size of the image in memory? I think Mozilla converts them to a 24bpp bitmap whenever they're needed, but I don't know if that information is actually available to anything other than the layout frames themselves. I'll have to look into it.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Blocks: 82059
By "compressed" you mean "content-encoded", right? If so, reporting the "actual" size is what we do. For a content-encoded HTTP transmission, the encoding is an integral part of the body data. This means that the version stored in the cache is compressed, so reporting the "actual" size would involve decompressing the whole thing... In addition to which some sites actually use content-encoding correctly (instead of as a substitute for transfer-encoding). And there we should _really_ not decompress. Or are you talking about something else entirely?
Well, that depends on your definition of actual size. IMO it is the size of the page I am currently looking at, not the size of a compressed version of it. And yes, I am talking about Content-Encoding. However, few know the difference between Content- and Transfer-Encoding :(

> In addition to which some sites actually use content-encoding correctly
> (instead of as a substitute for transfer-encoding).

I have yet to see a point in having both a content- and a transfer-encoding, since they basically mean the same thing: the browser has to uncompress to view it. Why is the page stored compressed in the cache, when it will never be used compressed again? Any time it is needed, it has to be decompressed.

> And there we should _really_ not decompress.

Mozilla already decompresses content-encoded pages...
> I have yet to see a point in having both a content- and transfer-encoding

Transfer-encoded data are to be decompressed as soon as they come off the wire; the encoding is just there for convenience of transmission. For content-encoded data, the encoding is an integral part of the data. E.g., a .tar.gz file is content-encoded, while HTML that's compressed before sending _should_ be transfer-encoded.

> Well, that depends on your definition of actual size.

Right. My point is that "actual size" is the "size of the data", which is uncompressed for transfer-encoding and compressed for content-encoding (where the encoding is just a property of the data).

> when it will never be used compressed again?

Who says it won't? Mozilla will always decompress before viewing, yes. But when saving to disk, content-encoded data should _not_ be decoded, since the encoding is an integral part of the data. (Yes, we currently decode in some cases when saving, but only because sites are not using transfer-encoding.) The point is, for content-encoded pages there may be legitimate reasons to need the encoded version. That's why we store it encoded.
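The distinction can be illustrated with a short sketch (Python here purely for illustration; the page body is made up). For a body sent with Content-Encoding: gzip, the encoded bytes are the entity body itself, so the cache (and hence Page Info) sees the encoded size, while the rendered page corresponds to the decoded size:

```python
import gzip

# Hypothetical page body, standing in for the HTML a server would send.
html = b"<html><body>" + b"<p>hello world</p>" * 200 + b"</body></html>"

# With "Content-Encoding: gzip", the encoded bytes ARE the entity body,
# so this is the size the cache stores and Page Info reports.
encoded = gzip.compress(html)

print("encoded (cache) size:", len(encoded))
print("decoded (rendered) size:", len(html))
```

Reporting the decoded size would require running the whole cached body through the decoder first, which is exactly the cost the comments above are debating.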
Component: XP Apps: GUI Features → Page Info
All discussion so far relates to what is meant by "size". From a user standpoint, the number of bytes in the HTML file shown in the window can be important. For example, I spend significant time as a user reading amateur fiction on the Web. For short chapters, I remain connected while reading. For longer chapters, I disconnect after loading the page. (My ISP and phone company both appreciate that.) For really long chapters, I save the page on my hard drive for reading later (possibly over several days). This whole process does not work if Mozilla tells me a 50 KB page is only half that size. Please fix this.
Perhaps we should simply remove this from the UI, since no matter what we do it's wrong from someone's standpoint...
> Perhaps we should simply remove this from the UI, since no matter what we do
> it's wrong from someone's standpoint...

That's not a solution, Boris. =) Is it possible to detect which content has an encoding and which does not? If it is possible, we should indicate content encoding in Page Info, something like this:

Size: 43.39 KB (44429 bytes in Content-Encoded form)
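For what it's worth, the suggested display line is straightforward to produce once the encoded byte count is known. A sketch (`format_size` is a hypothetical helper for illustration, not existing Page Info code):

```python
def format_size(encoded_bytes: int) -> str:
    """Render the suggested Page Info line for a content-encoded body."""
    kb = encoded_bytes / 1024  # 44429 / 1024 = 43.3877..., shown as 43.39
    return f"Size: {kb:.2f} KB ({encoded_bytes} bytes in Content-Encoded form)"

print(format_size(44429))
# Size: 43.39 KB (44429 bytes in Content-Encoded form)
```

The hard part, as the next comment notes, is not the formatting but reliably detecting the encoding from the Page Info dialog in the first place.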
> Is it possible to detect which content has encoding and which not?

"Sort of". It's hard. It may be impossible from the Page Info dialog.
Question: What is the size of a page?

Suggested Answer: It depends on the client platform. Take the page in question and treat it as if it were downloaded via FTP onto a local hard drive. That is, if it is HTML, take only the HTML file without any referenced .gif, .jpg, or other files. If it is ASCII text, that's what it is. The result approximates in size the file that would be obtained from Mozilla if you select "Save Link Target As" from a right-click pull-down on a link to the page, not the file you would get from "File > Save Page As", which roughly approximates the visible page.

For a UNIX platform, use the size that would show when "ls -l" is executed. For a PC, use the size that you would see in Windows Explorer or a Properties window. Where a Properties window might show (for example) "19.0KB (19,456 bytes), 24,576 bytes used", the number is 19.0 KB because that matches the size shown by Windows Explorer. I'm not sufficiently familiar with the Mac to offer a suggestion for that platform.

In any case, "View > Page Info" should present sizes that are meaningful to users.
> Take the page in question and treat it as if it were downloaded via FTP onto a
> local hard drive.

FTP does not support transfer or content encodings... so there is in fact no way to figure out what it would look like if that happened.

> The result approximates in size the file that would be obtained from
> Mozilla if you select "Save Link Target As" from a right-click pull-down

The size of that file depends on some mind-reading the "save as" code does as to what the server "really means" when it says "Content-Encoding: gzip".

What does "meaningful to users" mean? What information is the user looking for, exactly? Most users who look at Page Info seem to want to know how long the page would have taken to download based on the size, or something like that. You're not one of them; you seem to be the exception.

Again, I stand by comment 7. No matter what we show here it will be "wrong", and showing the decompressed size is not technically feasible anyway in most cases, so we should just nix this item. It's a holdover from the NS4 days, and NS4 does not support content encodings...
Perhaps there should be a small footnote saying the value can't be completely accurate, so that it doesn't deceive users. For example, Slashdot is 12k according to Mozilla; it's really a few times larger. Developers would appreciate this note, as checking page size is important, and Mozilla is a great developer's platform thanks to JavaScript debugging among other features (and some great extensions). Is a small footnote right below the size a possibility? Just to note something like "May be inaccurate if page is compressed", or something to that effect?
Couldn't there be a (##kb when uncompressed) addition after the current size when looking at the page info for an object? Yes, this would require the object to be decompressed, but surely this happened already to display it, so the result can be cached?
> but surely this happened already to display it, so the result can be cached?

The result is never even available. The decompression code uses a streaming decompression algorithm that decompresses the data a chunk at a time; the total length is never really known anywhere (the parser code just gets a bunch of calls saying "here is another data chunk"). Someone could calculate that length, add it all up, pass it back out, etc., but doing this in the parser is hard because of document.write calls, and doing it anywhere else is pretty much impossible.
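A minimal sketch of the streaming constraint described above (Python's zlib stands in for Mozilla's actual decompression code, and the data are made up): the decoded length only becomes known after the entire stream has been fed through, chunk by chunk.

```python
import zlib

# Hypothetical encoded document, produced here just for the demo.
data = zlib.compress(b"<p>chunk</p>" * 1000)

decoder = zlib.decompressobj()
total = 0
CHUNK = 512

# Feed the encoded stream a chunk at a time, as networking code does.
# Each call yields "another data chunk" whose lengths must be summed;
# the full decoded size is only known once the stream is exhausted.
for i in range(0, len(data), CHUNK):
    out = decoder.decompress(data[i:i + CHUNK])
    total += len(out)
total += len(decoder.flush())

print("decoded size:", total)  # known only after consuming the whole stream
```

Nothing in this pipeline hands the running total to anyone; someone would have to add that accounting at each stage, which is the objection raised above.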
Perhaps just write "Compressed Page" for Size when this is the case? That way, only uncompressed pages show a page size. We can't be wrong if we don't say anything, right? Would something like that be feasible?
> but doing this in the parser is hard because of document.write calls

Surely document.write does not call OnDataAvailable, which is the function that knows about the length of the passed-in data?
I'm actually not sure what document.write does, exactly. Chances are, you're right that it does not call OnDataAvailable. I'm also not keen on adding hooks to parser, content sink, and document all for the sake of little-used page info functionality, of course.
OK I was confused by this too. Simply saying "(Compressed)" after the size (IF it *was* compressed) would be enough to remove the confusion. For bonus points, show the real size as well, but people can always save to disk as a workaround.
See comment 1 of bug #263393, which cites the same URL as this one does. A quick test of View Info shows that the size reported by Mozilla comes from GET, while the actual size is near what is obtained from HEAD. Perhaps HEAD should be used to obtain the size to fix this bug. However, bug #160454 requests that Mozilla no longer use HEAD when processing "Save As" (and, extending to 263393, when processing "Save Link Target As"). Care needs to be taken to ensure HEAD is still used where appropriate.
> A quick test of View Info shows that the size by Mozilla is from GET

It's the size of the actual data returned by the server, as reported by the cache.

> Perhaps, HEAD should be used to obtain the size to fix this bug.

That would only "fix" it for broken servers like the one in bug 263393. Any time any headers differ between HEAD and GET, that's a bug in the server.
Whether the downloaded page is compressed or not, it seems the Page Info 'size' is of limited value anyway. As mentioned by others before, the size reported for a page that was originally compressed is the size of the compressed page (not the uncompressed and rendered page). However, I've noticed that for pages that are not compressed to start with, the value is wrong anyway (or is it?). For example, for www.xulplanet.com, the size is reported as 8637 bytes, but if I save the page it's 8663 on disk. So which is correct? Additionally, the size reported is for the HTML only, and doesn't include image sizes or the effects of applying any CSS. The true size of everything that's rendered (even for an uncompressed page) could be vastly different from what's reported. The 'size' field is so vague... at the very least it needs a better label. IMO something does need to change somewhere.
(In reply to comment #21)
> For example, for www.xulplanet.com, the size is reported as 8637 bytes, but
> if I save then it's 8663 on disk. So which is correct?

You probably saved as "web page, complete", which modifies the page and is thus useless for comparing size values.
(In reply to comment #22)
> you probably saved as "web page, complete", which modifies the page, and is
> thus useless for comparing size values.

Yes, that's it exactly (if saving as HTML only, the sizes do match). But I always save that way ("web page, complete"), as I'm sure many others do, so this anomaly will be common.
Probably the biggest use for this feature is checking/optimising download speeds. For the most part, seeing the compressed size is pretty useful. However... are there any modern (say v4+) browsers which don't support compression? If so, they are not going to get the compressed page, so it is irrelevant what *my* browser might show me, and you *would* want to check the uncompressed size to understand what some users would get.
> are there any modern (say v4+) browsers which don't support compression?

NS4 is a v4+ browser. It doesn't do HTTP/1.1. Some proxy servers probably only do HTTP/1.0 too. In general, if you want to find broken or silly clients (or servers), you sure can.
Okay, I was on the wrong track with my comment 19. Please note, however, that page size is useful to me, per my comment 6. I can guess that a page is relatively short by the size of the vertical scrollbar slider. However, for longer pages this is not effective; the slider seems equally small for text pages of 50 KB and 100 KB. If determining the size of the page in the browser window is not practical, I suggest using the size of the file obtained from a "Save Page As". For HTML files, this would be the size of the file when selecting "Web Page, HTML only". I suggest this (1) because graphics (the most likely additional files for "Web Page, complete") are often gratuitous and thus not saved, and (2) because we can get separate sizing information on all other components via the Media tab of "View Page Info".
Product: Browser → Seamonkey
Why is this bug specific to "Mozilla Application Suite" and "Linux"? It is the same on Firefox and on other OSes, too. See also bug 271370.
OK, I changed it to all OSes. However, I don't see what to do about the Product; it really relates to Firefox AND the Suite, which isn't an option. (There are a lot of bugs which have this issue...)
OS: Linux → All
*** Bug 299453 has been marked as a duplicate of this bug. ***
Any news on this?
QA Contact: pmac
As per comment 7, page size is not shown in the General tab.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WONTFIX
Well I must have been dreaming because size /is/ shown in the general tab.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Status: REOPENED → NEW
(In reply to Comment #7)
> Perhaps we should simply remove this from the UI, since no matter what we do
> it's wrong from someone's standpoint...

If it's always wrong from someone's viewpoint, then there is no definitive "right" size. WONTFIX.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WONTFIX