Closed Bug 439340 Opened 14 years ago Closed 14 years ago

Background cleanup actions block desktop on OS/2

Categories

(Toolkit :: Safe Browsing, defect)

x86
OS/2
defect
Not set
normal

VERIFIED FIXED

People

(Reporter: christian.hennecke, Assigned: mozilla)

Details

(Keywords: verified1.9.0.2)

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (OS/2; U; Warp 4.5; de; rv:1.8.1.13) Gecko/20080327 PmWFx/2.0.0.13
Build Identifier: Mozilla/5.0 (OS/2; U; Warp 4.5; de; rv:1.8.1.13) Gecko/20080327 PmWFx/2.0.0.13

After a while of inactivity, Firefox starts some cleanup actions in the background. On OS/2, this causes the whole desktop interface to be blocked for a short time.

Reproducible: Always

Steps to Reproduce:
1. Start Firefox
2. Do nothing
3. When Firefox starts cleanup activities in the background, move the mouse.
Actual Results:  
If you move the mouse in this period, the pointer does not follow. After the blocking period, it is drawn at the correct new location.

Expected Results:  
The pointer position should be updated continuously.

I also suspect that the user inactivity detection via Doodle's Screensaver (using 1.9pre here) does not work as it should. The problem described above occurs after not using Firefox for a while and regardless of use of other applications.
There has to be more to it than just start and do nothing, because I have never seen it. Walter, you have used FF more than me, did you?

Also, if you have to do nothing to see the problem then the inactivity detection may well work.
Version: unspecified → Trunk
Christian, could it be that one or some of the files in your profile are really big? I'm especially thinking of urlclassifier3.sqlite. You could then try to switch off the 2nd and/or 3rd checkbox under Tools -> Options -> Security.
Yes, that file is 51 MB on my system. I'll try what you suggested.
Oops, that is _really_ big. And bug reports like bug 383031 complain about less than 5 MB. I wonder if we messed something up with SQLite.
Hello,

me too :-)

Yesterday I checked my profile; the file was about 25 MB.

Now:

22.06.08  19.46   51064832           0  urlclassifier3.sqlite
22.06.08  19.46   24759424           0  urlclassifier3.sqlite-journal
22.06.08  19.34        154           0  urlclassifierkey3.txt


kind regards

Rainer  

 
Hello Peter,

yes, your suggestion

"switch off the 2nd and/or 3rd checkbox under Tools -> Options -> Security"

works.

The machine is no longer blocked by minutes of constant HDD I/O.


Thanks for the tip

Rainer



I've been seeing Christian's problem going back at least 2-3 versions. You don't have to leave it untouched from the moment it opens; leaving it alone for a while is enough. The largest file in its profile now is urlclassifier2.sqlite at 11 MB, but I may have deleted a larger one since last seeing the problem. I wasn't aware of any kind of background activity it was supposedly doing.

Quite some time back, these lockups led me to just not use FF if I could help it, using FF2 on the Linux server at my side instead of on the eCS machine sitting in front of me. Note that I keep SM open up to 5 days at a time, which could have some impact on RAM availability, even though real RAM here is 2 GB.

If it was obvious to me some kind of background activity was going on, I'd go ahead and confirm this.
urlclassifier2.sqlite isn't used any more, so that can't be it.

But I now noticed that in my main FF profile the stuff was disabled. After enabling it I see that after a few visited sites urlclassifier3.sqlite has grown to 12 MB, and from what I can see in sqlitebrowser it is filled with garbage. So I think that file is really the problem: trying to clean it up with external tools already results in 100% CPU usage for quite a few seconds, so it may well result in a hang when cleaned up by FF.

Unfortunately, PR logging isn't enabled for that module in release builds, so I'm going to try to look at this in my own build here.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached file NSPR log
Log taken with
NSPR_LOG_FILE=<path>\url3_5.log
NSPR_LOG_MODULES=UrlClassifierDbService:5,UrlClassifierHashCompleter:5,UrlClassifierStreamUpdater:5

In there I see stuff like
   7[20a3c420]: Update from Stream.
   7[20a3c420]: Got Bå¡êŽ
and I find this binary stuff after "Got" really weird. If I understand the code comments correctly these should be parts of URLs.
Dave, you seem to have worked on this. Can you give me a hint what might be going wrong on OS/2? Or what to look at for debugging?
Component: OS Integration → Phishing Protection
QA Contact: os.integration → phishing.protection
OK, a few notes:

* 50ish megs is (unfortunately) not a particularly incorrect number for right now - in addition to the fact that malware + phishing data is a lot bigger than just phishing data (urlclassifier2.sqlite only had phishing data), there are a few artifacts of the updating process that are making it bigger than it should be.  After some fixes on the Google side, the size needed will go down.

* What's happening in the background is updating the DB with new data from Google.

* OS/2 is probably hitting the same problem Linux was hitting in bug 430530.  The solution there was to trade some memory for disk IO.  The patch sets pref("urlclassifier.updatecachemax", 104857600) on Linux - you might need to do the same on OS/2.
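
For reference, 104857600 bytes is exactly 100 MiB. The Python sketch below only checks that arithmetic and formats a line that could be tried in a profile's user.js while no patched build is available. The helper name user_pref_line is made up for illustration, and whether a given build picks the pref up from user.js is an assumption, not something confirmed in this bug.

```python
# 100 MiB expressed in bytes; this is the value quoted from the Linux patch.
UPDATE_CACHE_MAX_BYTES = 100 * 1024 * 1024


def user_pref_line(name, value):
    """Format one user.js line (hypothetical helper, for illustration only)."""
    return 'user_pref("%s", %d);' % (name, value)


print(UPDATE_CACHE_MAX_BYTES)
print(user_pref_line("urlclassifier.updatecachemax", UPDATE_CACHE_MAX_BYTES))
```

The trade-off is the one described above: up to that much memory may be used for the update cache in exchange for less disk I/O.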
(In reply to comment #9)

> In there I see stuff like
>    7[20a3c420]: Update from Stream.
>    7[20a3c420]: Got Bå¡êŽ
> and I find this binary stuff after "Got" really weird. If I understand the code
> comments correctly these should be parts of URLs.

The binary stuff is correct - it's actually SHA256 hashes of urls.

(we probably should remove that Got: line from the debug output, it's almost never useful and makes the logging output much more difficult to read). 

(In reply to comment #12)
> (...)
> (we probably should remove that Got: line from the debug output, it's almost
> (...)

Or print it in hex.
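
A minimal sketch of what hex output could look like, written in Python rather than in the language of the actual logging code. The function name loggable and the sample input are hypothetical; only the idea (hex-encode the raw hash bytes before they reach the log) comes from the comments above.

```python
import hashlib


def loggable(raw):
    """Return raw bytes as a lowercase hex string that is safe to log."""
    return raw.hex()


# Hypothetical sample: the SHA-256 digest of a made-up URL fragment.
digest = hashlib.sha256(b"example.com/").digest()
print("Got " + loggable(digest))
```

Each byte becomes two hex characters, so a full SHA-256 digest always prints as 64 characters and never as unreadable binary.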
(In reply to comment #1)
> There has to be more to it than just start and do nothing, because I have never
> seen it. Walter, you have used FF more than me, did you?
> 
Sorry for the long delay, some very bad luck in real life :(
Well, I never noticed any background activity on OS/2. However, looking at the newer posts here, that could be due to having more than enough resources (2 GB of memory, profile sitting on a JFS partition). My urlclassifier3.sqlite file is also larger than 55 MB in the profile I was using with Minefield builds. I have now been using my old FF2 profile with FF3 for a few days, and there it is only 32 KB.
Interestingly, I looked at this file in a Linux profile that I have used more or less permanently since ff-1.5, now updated to FF3, and it is just as huge there. In another Linux profile that I use only very rarely it is only about 3.5 MB. On Linux I find that the browser starts up and shuts down very slowly with the permanent profile. Maybe that is also related to this huge urlclassifier3.sqlite file.
Hello all,


running the compact function of the SQLiteBrowser application (available on Hobbes)

   shrinks the DB file urlclassifier3.sqlite:

               before   : 53,800,960 bytes
               after    : 34,660,352 bytes

This gives a great performance improvement in combination with

   * OS/2 is probably hitting the same problem linux was hitting in bug 430530. 
The solution there was to trade some memory for disk IO.  The patch sets
pref("urlclassifier.updatecachemax", 104857600) on linux - you might need to do
the same on OS/2.

See the full, detailed report at

  http://de.os2.org/forum/diskussion/index.php3?all=116554

  Sorry, the report is in German; it is too late at night to translate it.
  Please use translation services on the web or learn German :-)


kind regards

   Rainer


   
OK, so with confirmation from users that this helps performance on OS/2, too, we can solve this bug by setting that pref as recommended.

(Patch against CVS, but file is still identical in mozilla-central.)
Assignee: nobody → mozilla
Status: NEW → ASSIGNED
Attachment #326860 - Flags: review?(dcamp)
Hello Peter,

I agree with you about closing this bug about the freeze/block of the GUI/desktop.


How do we address the need for maintenance/compacting of the SQLite files?


The performance of my system (ThinkPad T23, 1.13 GHz, 640 MB RAM and a 5400 RPM HDD) improved significantly after compacting all the SQLite files!

Even compacting the small SQLite files reduces the workload!

This effect is probably most noticeable on machines with a low-powered CPU/GPU.

kind regards

Rainer

 
Rainer, I think what SQLiteBrowser does is to run a "VACUUM" (see http://www.sqlite.org/lang_vacuum.html) on the database. As you discovered, it is a long and CPU-intensive process, so I think that is why Firefox doesn't do that automatically. For things like this we recommend having > 1 GHz (see README.txt).
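
To illustrate what such a VACUUM does to the file size, here is a small Python sketch using the standard sqlite3 module. It works on a throwaway database, not on a Firefox profile; the function names, the table layout and the row count are all made up for the demonstration, and it is not what SQLiteBrowser or Firefox actually runs.

```python
import os
import sqlite3
import tempfile


def vacuum_file(path):
    """VACUUM the SQLite file at path; return (bytes_before, bytes_after)."""
    before = os.path.getsize(path)
    # Autocommit mode: VACUUM refuses to run inside an open transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    return before, os.path.getsize(path)


def build_demo_db(path, rows=2000):
    """Create a throwaway database, fill it, then delete every row again."""
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, payload BLOB)")
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO demo (payload) VALUES (?)",
            ((os.urandom(512),) for _ in range(rows)),
        )
        conn.execute("COMMIT")
        # The freed pages stay inside the file until VACUUM rebuilds it.
        conn.execute("DELETE FROM demo")
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.sqlite")
        build_demo_db(demo)
        before, after = vacuum_file(demo)
        print("before: %d bytes, after: %d bytes" % (before, after))
```

Deleting rows alone leaves the file at its old size; only the rebuild done by VACUUM returns the free pages to the file system, which is why it costs so much I/O and CPU on a large file.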

What I don't quite understand is why in my installations on Windows and Linux I see that the urlclassifier3.sqlite is never created. Perhaps Christian was right in comment 0 that there is a timer not working correctly on OS/2.
Dave, I don't find a reference to nsIdleService in the url classifier code, but does it use some other kind of timer that could be malfunctioning (giving too short intervals) on OS/2?
Hello Peter,

here is some information about the installation history of my FF 3.0 test suite.

Installation and creation of the FF profile:


The env var "set NSPR_OS2_NO_HIRES_TIMER=1"

was not set in config.sys or in the start command file when the profile was created.


The variable was added to the start command file later, after two days.

kind regards 

Rainer 

  

 
Rainer, thanks, the problem is well understood now, I can reproduce it. That environment variable has nothing at all to do with this.
(In reply to comment #18)
> Rainer, I think what SQLiteBrowser does is to run a "VACUUM" (see
> http://www.sqlite.org/lang_vacuum.html) on the database. As you discovered, it
> is a long and CPU intensive process, so I think that is why Firefox doesn't do
> that automatically. For things like this we recommend to have > 1 GHz (see
> README.txt).
> 

Hello Peter,

I have read the vacuum reference doc --> "da core"

Next step:

How can the manual "Reorganization/Compact/Vacuum" step be automated
for all SQLite DB files in the FF profile?

With a command file?


Next question:

Does FF 3.0 use SQLite DB files outside the profile?

kind regards

Rainer
  
(In reply to comment #22)

> 
> Next question:
> 
> Does FF 3.0 use SQLite DB files outside the profile?
> 

here is the answer:

[S:\mof-3000]dir *.sqlite /S

 The volume label in drive S is VL_S_HPFS.
 The Volume Serial Number is 298A:7814

 Directory of S:\mof-3000\mozilla\firefox\Profiles\5eqjlbn3.default

26.06.08   2.28       7168           0  content-prefs.sqlite
26.06.08  11.39      20480           0  cookies.sqlite
26.06.08   2.30      10240           0  downloads.sqlite
26.06.08  11.39       9216           0  formhistory.sqlite
26.06.08   2.31       2048           0  permissions.sqlite
26.06.08  11.39    3837952           0  places.sqlite
26.06.08   2.31       2048           0  search.sqlite
26.06.08  11.37   34660352           0  urlclassifier3.sqlite
         8 file(s)   38549504 bytes used

 Directory of S:\mof-3000\mozilla\firefox\Profiles\5eqjlbn3.default\OfflineCache

22.06.08  19.46       4096           0  index.sqlite
         1 file(s)       4096 bytes used

Total files listed:
         9 file(s)      38553600 bytes used
                    342094336 bytes free

[S:\mof-3000]


So we have 9 DBs in the profile as candidates for the vacuum function.

No SQLite DBs are used in the "system" part of FF 3.0!


kind regards

Rainer
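
One possible way to automate this step, sketched in Python with the standard sqlite3 module instead of a command file. The function name vacuum_profile is hypothetical and this is not an official Mozilla tool. The sketch assumes Firefox is closed while it runs (otherwise the databases may be locked) and that you keep a backup of the profile.

```python
import sqlite3
from pathlib import Path


def vacuum_profile(profile_dir):
    """VACUUM every *.sqlite file below profile_dir.

    Returns a dict mapping each file path to (bytes_before, bytes_after).
    """
    results = {}
    # rglob also descends into subdirectories such as OfflineCache.
    for db_path in sorted(Path(profile_dir).rglob("*.sqlite")):
        before = db_path.stat().st_size
        # Autocommit mode, because VACUUM cannot run inside a transaction.
        conn = sqlite3.connect(str(db_path), isolation_level=None)
        try:
            conn.execute("VACUUM")
        finally:
            conn.close()
        results[db_path] = (before, db_path.stat().st_size)
    return results
```

Called with the profile directory from the listing above, it would visit the same 9 files; the *.sqlite pattern deliberately skips files such as urlclassifier3.sqlite-journal.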
Comment on attachment 326860
set cachemax on OS/2, too

Should have a firefox peer review this, pinging gavin...
Attachment #326860 - Flags: review?(dcamp) → review?(gavin.sharp)
Attachment #326860 - Flags: review?(gavin.sharp) → review+
Hello,

working with the new preferences on a T23 (1.13 GHz, 640 MB),

I observed today:

An FF 3.0 window was open while I was working in the foreground with the system editor "ae", entering text.

Suddenly the system stopped processing my keystrokes, with 100% CPU and I/O, for
about 4 to 6 sec.

System design?

The normal SQLite operations act as a service for the application and so need a higher priority than the normal application processes.

The Google update of the SQLite database is a real background, low-priority process with high I/O and CPU requirements.

Are these SQL (I/O and CPU) operations executed with the priority of a service process for the application?

If so, insufficient resources are left to serve the user.

Increasing the SQL buffer reduces the I/O load, but does not solve the workload balancing problem.
Comment on attachment 326860
set cachemax on OS/2, too

OK, I have pushed this change to mozilla-central (changeset 15530:87e90c513b24 and merge 15637:2913e3847ba5). I hope I have done everything correctly, following the Mercurial FAQ at MDC...
The checkin should have fixed the bug. Will ask for CVS approval soon.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
(In reply to comment #25)
> I observed today:
> 
> Open FF 3.0 window and working in foreground with the system editor "ae"
> entering text.
> 
> Suddenly the system does not process my entry keys, 100% CPU and I/O -
> duration about 4 to 6 sec.

Rainer, I saw this comment only now. Are you sure that it was FF causing this CPU spike? All the reports I heard, yours included, suggested that the problem is fixed by the configuration change that I just applied in the sources. Is it reproducible? In any case, I would prefer it if you could open a new bug for reproducible follow-up problems.
Comment on attachment 326860
set cachemax on OS/2, too

This hasn't caused any tinderbox problems, so I'm asking for approval for this OS/2-only configuration change.

(Would be nice to get it into FF 3.0.1 so if other problems make a rebuild necessary, please take this into account.)
Attachment #326860 - Flags: approval1.9.0.2?
Attachment #326860 - Flags: approval1.9.0.1?
(In reply to comment #28)
> (In reply to comment #25)
> > I observed today:
> > 
> > Open FF 3.0 window and working in foreground with the system editor "ae"
> > entering text.
> > 
> > Suddenly the system does not process my entry keys, 100% CPU and I/O -
> > duration about 4 to 6 sec.
> 
> Rainer, saw this comment only now. Are you sure that it was FF causing this CPU
> spike? Because all reports I heard, yours included, suggested that the problem
> is fixed with the configuration change that I just applied in the sources. Is
> it reproducible? In any case, I would prefer if you could open a new bug for
> reproducible follow-up problems.
> 

Sure. There is no other candidate for the workload.

Following your suggestion, here is the new bug:

Bug 443414 - Firefox 3.0 in a background window claims too much resources for a time interval, leaving no adequate resources to serve the user

https://bugzilla.mozilla.org/show_bug.cgi?id=443414
  
Attachment #326860 - Flags: approval1.9.0.1?
Comment on attachment 326860
set cachemax on OS/2, too

Approved for 1.9.0.2. Please land in CVS. a=ss.
Attachment #326860 - Flags: approval1.9.0.2? → approval1.9.0.2+
Keywords: fixed1.9.0.2
Status: RESOLVED → VERIFIED
Product: Firefox → Toolkit