Closed Bug 121761 Opened 23 years ago Closed 23 years ago

Google PDF cache links reflow forever

Categories

(Core :: Layout, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: slice1900, Assigned: attinasi)

References

()

Details

(Whiteboard: [bae: 20020128])

I've noticed that when I do a search on Google, if I find something in a PDF
file, and click on the "View as HTML" link, the page will reflow (I think I'm
using that term correctly) forever.  It will show page contents, then blank
them, hang for a moment, then draw them again.  Forever, as far as I can tell. 
I can click in the scrollbar and get it to scroll down a page every few seconds,
and if I'm REALLY quick, I can read a sentence on the page during the time it
flashes into view.  Using other windows in Mozilla is very painful when this is
going on, so the only workaround is to either view the PDF file itself (only
good if the site is available, reasonably fast, and the contents haven't
changed) or use Netscape 4.79 to look at it.

The link I've listed is the result of doing "filetype:pdf test" in Google, to
get a lot of PDF files to choose from.  Just click on "View as HTML" for one of
them and you'll see what I mean.

This is using 0.9.7 (build ID 2001122108) on Linux, but I have seen this bug for
several builds (probably since Google started indexing PDF files).  It just
reached my threshold of annoyance tonight, enough to report it.  Since it is so
pathological, perhaps finding its cause will help with other bugs as well.

Apologies if "layout" isn't the correct component, I'm only about 50% accurate
in guessing the right component for my bugs :(
Reporter,

Based on the information provided, I can get the page to load from the "View
as HTML" link. It took a little under two minutes to complete, but it did load
the page. Your comments indicate that it doesn't load at all. You might want to
try the latest nightly build and create a new profile. Tested under Red Hat
Linux 6.2 with the Jan 24th build (2002-01-24-08).
I tried several of the links off the Google search and none of them did a
continual refresh. The longest and most time-consuming load was:
[PDF] Index to EPA Test Methods
File Format: PDF/Adobe Acrobat - View as HTML

But I could not find one that did the reload as you mentioned. Reporter: can you 
direct me to the specific one you selected? Can you try a more current build?
Whiteboard: [bae: 20020128]
closing as wfm
Status: UNCONFIRMED → RESOLVED
Closed: 23 years ago
Resolution: --- → WORKSFORME
Original reporter here, sorry I haven't gotten back on this sooner.  I tried this
on 0.9.8 and it does work better.  I don't see the constant reflows, so it is
possible to (very very slowly) scroll or page down through the document as it is
being loaded.  It is still very slow though: I got about 20 pages' worth of the
EPA document mentioned above loaded after two minutes (Celeron 533A, 768MB RAM).
The browser is also extremely slow to respond to user actions during this time.

It isn't hanging, but it sure does suck the CPU dry.  Based on running tcpdump
while I'm trying to download the EPA document, it looks like Google is feeding
it in tiny chunks.  There may be a problem with Mozilla's handling of having a
web page dished out in small chunks at a time.  Is there anything in Mozilla's
test suites that could test a situation like this, or could this possibly be
added to the test suites?  Such a test would help improve performance in
situations where the server is feeding large pages very slowly for whatever reason.
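
For anyone who wants to reproduce the "tiny chunks" situation locally, here is
a minimal sketch of a test server.  Everything in it is an assumption made for
illustration: the page content, the chunk size, the delay, and the names
split_into_chunks, SlowChunkHandler and serve are hypothetical and are not part
of any Mozilla test suite, and the page is not Google's actual output.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical test page; the real Google output is not reproduced here.
PAGE = b"<html><body>" + b"<p>word</p>" * 2000 + b"</body></html>"


def split_into_chunks(data: bytes, size: int):
    """Yield consecutive slices of `data`, each at most `size` bytes long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(data), size):
        yield data[start:start + size]


class SlowChunkHandler(BaseHTTPRequestHandler):
    CHUNK_SIZE = 64        # bytes per write (assumed value)
    DELAY_SECONDS = 0.05   # pause between writes (assumed value)

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        # Send the body a few bytes at a time, pausing between writes,
        # so the browser receives the page as a slow trickle.
        for chunk in split_into_chunks(PAGE, self.CHUNK_SIZE):
            self.wfile.write(chunk)
            self.wfile.flush()
            time.sleep(self.DELAY_SECONDS)


def serve(port: int = 8000):
    """Run the slow server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), SlowChunkHandler).serve_forever()
```

Calling serve() and pointing the browser at http://127.0.0.1:8000/ should
deliver the page slowly enough to watch how responsive the UI stays during the
load.  The network stack may still merge some writes, so tcpdump is the way to
confirm what actually goes over the wire.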

My reasoning that this is still a bug even if it is caused by Google feeding the
page slowly (presumably because the on-the-fly PDF conversion is slow) is
because even if a page takes a long time to load, it shouldn't limit Mozilla's
responsiveness during this time.  Would it be better to open up a new bug for that?
I checked out the HTML, and the problem seems to be merely that Google's HTML output
for PDF conversion specifies the layout position of just about every word
individually.  It is still IMHO a bug that this bad HTML makes Mozilla so
unresponsive, but the fact that it is slow to render probably isn't.  NS 4.79 takes
about 20 seconds on my machine to render the page (after it has downloaded it in
another 10 seconds, limited by my link speed of about 60KB/s).  NS 4.79, however,
remains quite responsive during this 20 second rendering process, while during
Mozilla's lengthy rendering time the browser is essentially unusable.  Seems to
me like this would be a '4xp' kind of thing...
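
A reduced test case along these lines could be generated with something like
the sketch below.  It is only an approximation: the exact markup Google emits
is not quoted in this bug, so the absolutely positioned <div> per word, the
pixel metrics, and the function name positioned_words_html are all assumptions
chosen to mimic "a layout position for just about every word".

```python
from html import escape


def positioned_words_html(text, chars_per_line=80, char_width_px=7,
                          line_height_px=14):
    """Return an HTML page where every word is its own positioned <div>.

    The markup style and pixel metrics are assumptions, not Google's output.
    """
    parts = ["<html><body>"]
    col = 0  # current column, in characters
    row = 0  # current line number
    for word in text.split():
        # Wrap to the next line when the word would overflow this one.
        if col > 0 and col + len(word) > chars_per_line:
            col = 0
            row += 1
        left = col * char_width_px
        top = row * line_height_px
        parts.append(
            f'<div style="position:absolute;left:{left}px;top:{top}px">'
            f"{escape(word)}</div>"
        )
        col += len(word) + 1  # advance past the word and one space
    parts.append("</body></html>")
    return "\n".join(parts)
```

Feeding it a few thousand words and loading the result from a local file would
separate the rendering cost of this kind of markup from any network slowness.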