Closed Bug 443414 Opened 17 years ago Closed 17 years ago

Firefox 3.0 in a background window claims too much resources for a time interval, leaving no adequate resources to serve the user

Categories

(Firefox :: General, defect)

3.0 Branch
x86
OS/2
defect
Not set
minor

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: RainerStroebel, Unassigned)

Details

Attachments

(2 files)

User-Agent: Mozilla/5.0 (OS/2; U; Warp 4.5; de; rv:1.8.1.14) Gecko/20080418 Firefox/2.0.0.14
Build Identifier: Mozilla/5.0 (OS/2; U; Warp 4.5; de; rv:1.8.1.14) Gecko/20080418 Firefox/2.0.0.14

Firefox 3.0 in a background window claims too many resources for a time interval, leaving no adequate resources to serve the user's activity.

This is a follow-up to the solved bug "Background cleanup actions block desktop on OS/2": https://bugzilla.mozilla.org/show_bug.cgi?id=439340

The solution created for that bug has been implemented on the test system.

Solution: pref("urlclassifier.updatecachemax", 104857600)

Working with the new preference on a T23 (1.13 GHz, 640 MB), I observe the following: a FF 3.0 window is open, and I am working in the foreground with the system editor "ae", entering text. Suddenly the system does not process my keystrokes: 100% CPU and some I/O, lasting about 4 to 6 sec.

System design?
The normal SQLite operations act as a service for the application and so need a higher priority than the normal application processes. The Google update of the SQLite database is a real background, low-priority process with high I/O and CPU requirements. Are these SQL (I/O and CPU) operations executed with the same priority as the service operations for the application? If yes, there are not sufficient resources left to serve the user.

The increase of the SQL buffer reduces the I/O load, but does not solve the workload balancing problem, especially on low-powered machines at the edge of the minimum requirements.

New aspect/view: an idle browser does not mean that the system is idle! Before FF starts automatic maintenance, the whole system has to have been idle for some time, not only the browser.

The SQLite DBs have been "VACUUM"ed regularly. Current size of the files, just after a "VACUUM":
3.07.08 8.50 4112384 0 places.sqlite
3.07.08 8.49 30056448 0 urlclassifier3.sqlite

Reproducible: Always

Steps to Reproduce:
1. You have to be working online and doing foreground work when the update / replication with the Google DB is initiated.
2.
3.

Actual Results:
The system is frozen for the user: no response, or very, very slow.

Expected Results:
1. Working with a notebook, the user does notice heavy system activity. A status indicator is needed to inform the user that the activity is initiated by Firefox and not by an illegal process on this system.
2. Proper load balancing.
(Just a side note: "Component" should be changed to "Phishing Protection", I guess.)
Rainer, I welcome your recent efforts to find Firefox-OS/2 bugs. But this bug report is a mess. You should really try to be concise and concentrate on one bug at a time. And not make guesses about program internals when you should take more care to describe the visible effects (and give the correct version number).

The main question I have: how did you check that at the time the hang occurred it was downloading from the Google database?

More datapoints that could help:
- How often does it occur?
- Do you have DSSaver installed as specified in README.txt?
(I'm forgetting quickly)
Hardware: Other → PC
Version: unspecified → 3.0 Branch
Also, do you have monitors available (XCenter or WarpCenter or similar) to cross-check memory consumption and I/O throughput during that hang?
Question from Peter:
> Also, do you have monitors available (XCenter or WarpCenter or similar) to
> cross-check memory consumption and I/O throughput during that hang?

My test environment: WarpCenter and SYSB/2 0.23.
System is MCP2 with CP05 and kernel W4 14.105 from CP06. See attached screenshot.
Hardware: ThinkPad T23 1.3 GHz, 1400x1050 14.1 TFT, 640 MB RAM, HDD: 160 GB 5400 rpm

=========================
Answer about memory usage
=========================
I have found do indicator for a memory related issues. The SYSB/2 cell "available physical memory" always displays values above 400 MB; see also the output of above512. I use the utility "above512" only to query the current memory usage. I do not use the options part!

[P:\Mem-use-Log]S:\download\os2\above512\v1b\above512.exe
ABOVE512.exe LX format 32bit DLL module 'loading above 512MB' marking utility, version 0.01b (internal/experimental use only)
Copyright 2004 Takayuki 'January June' Suwa.
usage: ABOVE512 {DLL module file} [-options]
without options, ABOVE512 shows current DLL object information.
options:
-q quiets (no message)
-c marks pure 32bit code objects as 'loading above 512MB'
-d marks pure 32bit data objects as so
-b marks both of pure 32bit code and data objects
-p marks pure 32bit preloaded code/data objects and removes preload
-u unmarks 'loading above 512MB' pure 32bit code/data objects
-! unlocks the DLL module before open
current free virtual address space in kB (private / shared):
310720 / 194176 below 512MB line, 917504 / 914816 above 512MB line
[P:\Mem-use-Log]pause

Above512 was executed while writing this text with Firefox 2.0.0.14, with 3.0 GA just opened - currently FF 3.0 is in the background.
======================
To the question of I/O
======================
I have PM Disk Monitor 1.0 from Trevor Hemsley running, with the option "Always on Top" not active, together with the DANI 1.8.2 version with debug code -> upgraded to 1.8.5 yesterday night. I am now considering moving from the debug code version 1.8.5 to the non-debug version of DaniS506, with the intention of reducing the code sequence executed at every disk I/O.
The display of the PM Monitor was in the background, so there is no data from the PM Monitor.
sorry typo error: I have found do indicator for a memory related issues. correct to:: I have found no indicator for a memory related issues.
(In reply to comment #2)
> give the correct version number

FF 3.0 GA
Mozilla/5.0 (OS/2; U; Warp 4.5; de; rv:1.9) Gecko/2008061521 Firefox/3.0

> The main question I have: how did you check that at the time the hang occurred
> it was downloading from the Google database?

How can I verify this with hard data? Do log entries with start / stop and results of the replication process with Google exist on my system? How is the start of the replication process triggered and scheduled?

> More datapoints that could help:
> - How often does it occur?

About 2 to 3 incidents before yesterday, maybe more. The focus at the beginning of the FF 3.0 tests was not on this issue. I observed 3 incidents of "no user service" yesterday and just one now, at 11.41, when writing the sentence 4 lines above. The PM Disk Monitor (now with the option "Always on Top" active) shows I/O. I have not been able to catch the data: first a freeze of the display, then at the end of the freeze a GUI/display update, and then an overwrite by the next refresh cycle of the monitor display. Let's see how to catch the hard data next time; a log function in the PM Disk Monitor would be a great help!!
More details above - CPU usage log produced by the SYSB/2 CPU Monitor and my time stamps of the incidents.

> - Do you have DSSaver installed as specified in README.txt?

No. It is not stated as a requirement or prerequisite in the readme for running FF 3.0. The notebook, an IBM ThinkPad T23, has its own APM and screen power-off management based on the BIOS and the OS/2 APM driver. There is no need for a screen saver function from an additional module on a T23; it already exists. The T23 is/was one of the notebooks "certified" for and with OS/2 support by IBM. To keep the test platform stable for the Firefox Test Suite I do not install the screen saver. I don't want to open the Pandora's box of APM related issues when testing a new browser version.

Does FF 3.0 make use of new DDSaver API ?
from the Change log of DDsaver:
* v1.8 : 2008.02.11.
- Added new API : SSCore_GetInactivityTime()

The same reason applies to RWS. The system is currently running with
set MOZ_NO_RWS=1
SET MOZ_NO_REMOTE=1
set NSPR_OS2_NO_HIRES_TIMER=1
and now with the RWS WPS class registered. The first test was done without RWS registered, the current one with it. No difference observed.

Now to the hard data collected about the "non service incidents": date / time of each incident and the correlated CPU Monitor log entries.
The refresh time option of the CPU Monitor was set to 2 sec, and in the evening to 4 sec. The exact times can be seen in the log entries. All entries show a CPU usage of 99%, and the 2 sec refresh interval has not been kept. The interval has been expanded! The system obviously does not allocate sufficient resources to the monitor process.
There are more incidents with 99% CPU usage and an expanded CPU logging interval in the log file. I do not know my GUI activity at those occurrences.
I will post the raw data, the CPU monitor log file, as an attachment in a separate reply - I cannot find an option to do this in a "normal reply".

1. 2008-07-04 about 13.43
=========================
13.43.27 04.07.2008 CPU: 25%
13.43.29 04.07.2008 CPU: 24%
13.43.31 04.07.2008 CPU: 22%
13.43.33 04.07.2008 CPU: 90%
13.43.35 04.07.2008 CPU: 98%
13.43.37 04.07.2008 CPU: 97%
13.43.40 04.07.2008 CPU: 98% <--- 3 sec interval
13.43.41 04.07.2008 CPU: 99% <--- 1 sec
13.43.44 04.07.2008 CPU: 99% <----3
13.43.46 04.07.2008 CPU: 98%
13.43.48 04.07.2008 CPU: 97%
13.43.50 04.07.2008 CPU: 51%
13.43.52 04.07.2008 CPU: 95%
13.43.54 04.07.2008 CPU: 96%
13.43.56 04.07.2008 CPU: 97%
13.43.58 04.07.2008 CPU: 97%
13.44.00 04.07.2008 CPU: 97%
13.44.02 04.07.2008 CPU: 96%
13.44.04 04.07.2008 CPU: 95%
13.44.07 04.07.2008 CPU: 98%
13.44.09 04.07.2008 CPU: 97%
13.44.11 04.07.2008 CPU: 70%

2. 2008-07-04 about 18.48
=========================
18.48.28 04.07.2008 CPU: 70%
18.48.31 04.07.2008 CPU: 98% <--- 3 sec
18.48.36 04.07.2008 CPU: 99% <--- 5
18.48.40 04.07.2008 CPU: 99% <--- 4
18.48.43 04.07.2008 CPU: 99% <--- 3
18.48.45 04.07.2008 CPU: 99%
18.48.47 04.07.2008 CPU: 99%
18.48.52 04.07.2008 CPU: 99% <--- 5
18.48.55 04.07.2008 CPU: 99% <--- 3
18.48.59 04.07.2008 CPU: 99% <----4
18.49.03 04.07.2008 CPU: 99% <----4
18.49.05 04.07.2008 CPU: 94%
18.49.07 04.07.2008 CPU: 39%
18.49.09 04.07.2008 CPU: 36%
18.49.11 04.07.2008 CPU: 35%
18.49.14 04.07.2008 CPU: 94%

3. 2008-07-04 about 20.54
=========================
20.54.01 04.07.2008 CPU: 88%
20.54.03 04.07.2008 CPU: 70%
20.54.05 04.07.2008 CPU: 74%
20.54.07 04.07.2008 CPU: 97%
20.54.11 04.07.2008 CPU: 98% 3
20.54.15 04.07.2008 CPU: 98% 4
20.54.20 04.07.2008 CPU: 98% 5
20.54.22 04.07.2008 CPU: 99%
20.54.24 04.07.2008 CPU: 94%
20.54.26 04.07.2008 CPU: 70%
20.54.28 04.07.2008 CPU: 99%
20.54.30 04.07.2008 CPU: 90%
20.54.32 04.07.2008 CPU: 47%
20.54.34 04.07.2008 CPU: 25%
20.54.36 04.07.2008 CPU: 26%

4. 2008-07-05 about 11.41 (CPU monitor interval is now at 4 sec)
================================================================
11.40.36 05.07.2008 CPU: 71%
11.40.41 05.07.2008 CPU: 97% 5
11.40.50 05.07.2008 CPU: 99% 9
11.40.58 05.07.2008 CPU: 99% 8
11.41.08 05.07.2008 CPU: 99% <---- 10 sec
11.41.17 05.07.2008 CPU: 99% 9
11.41.26 05.07.2008 CPU: 99% 9
11.41.35 05.07.2008 CPU: 99% 8
11.41.44 05.07.2008 CPU: 99% 9
11.41.50 05.07.2008 CPU: 99%
11.41.54 05.07.2008 CPU: 90%
11.41.58 05.07.2008 CPU: 39%
11.42.02 05.07.2008 CPU: 34%
11.42.06 05.07.2008 CPU: 40%
11.42.10 05.07.2008 CPU: 34%
11.42.14 05.07.2008 CPU: 36%
11.42.18 05.07.2008 CPU: 38%
11.42.22 05.07.2008 CPU: 35%

Just as I entered this sentence at 12.58, key entry was blocked: a new incident! I saw I/O activity on the PM Monitor, so there was I/O activity during the block in this incident. I did not catch the numbers.
One additional piece of information about the test platform: yesterday night the warpsans font was updated from version 0.4 to 0.5, so today's incidents are on a system operating with warpsans 0.5 fonts. This is probably not related to the balancing problem - just to be accurate about the test platform :-)
OT: By the way, a great improvement on my system - many thanks to Alex.
The refresh interval of the logging was set to 2 sec, and later in the evening to 4 sec. There are entries with 99% CPU load and expanded logging intervals. ==> Not enough resources are allocated to the monitor process during these incidents. The time stamps of some of these log entries can be correlated with times of "no service from the GUI" for the user.
PS: log text file too big --> zipped
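The correlation described here (samples at high CPU load that also arrive later than the configured refresh interval) can be extracted mechanically from such a log. The following is only a minimal sketch in Python, assuming the line format shown in the excerpts of this report (`HH.MM.SS DD.MM.YYYY CPU: NN%`); the function names and the 95% threshold are illustrative assumptions and are not part of SYSB/2 or Firefox.

```python
import re
from datetime import datetime

# Matches entries such as "13.43.40 04.07.2008 CPU: 98%" (format assumed from the excerpts).
LINE_RE = re.compile(r"(\d{2}\.\d{2}\.\d{2}) (\d{2}\.\d{2}\.\d{4}) CPU:\s*(\d+)%")


def parse_cpu_log(text):
    """Return a list of (timestamp, cpu_percent) tuples found in the log text."""
    samples = []
    for match in LINE_RE.finditer(text):
        stamp = datetime.strptime(
            f"{match.group(1)} {match.group(2)}", "%H.%M.%S %d.%m.%Y"
        )
        samples.append((stamp, int(match.group(3))))
    return samples


def stretched_samples(samples, refresh_seconds, min_cpu=95):
    """Return (timestamp, gap_seconds, cpu) for samples that arrived later than
    the refresh interval while the CPU load was at or above min_cpu.
    min_cpu is a hypothetical threshold chosen for illustration."""
    flagged = []
    for (prev_stamp, _), (stamp, cpu) in zip(samples, samples[1:]):
        gap = (stamp - prev_stamp).total_seconds()
        if gap > refresh_seconds and cpu >= min_cpu:
            flagged.append((stamp, int(gap), cpu))
    return flagged


# A few entries copied from the first incident in this report.
SAMPLE = """
13.43.35 04.07.2008 CPU: 98%
13.43.37 04.07.2008 CPU: 97%
13.43.40 04.07.2008 CPU: 98%
13.43.41 04.07.2008 CPU: 99%
13.43.44 04.07.2008 CPU: 99%
13.43.46 04.07.2008 CPU: 98%
"""

if __name__ == "__main__":
    for stamp, gap, cpu in stretched_samples(parse_cpu_log(SAMPLE), refresh_seconds=2):
        print(stamp.strftime("%H.%M.%S"), f"gap={gap}s", f"cpu={cpu}%")
```

Run against the full attached log, the flagged time stamps could then be compared with the noted times of the GUI freezes. A stretched interval only shows that the monitor itself was starved; it does not identify which process used the CPU.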
(In reply to comment #6) > Does FF 3.0 make use of new DDSaver API ? Sure, see README.txt.
(In reply to comment #8)
> (In reply to comment #6)
> > Does FF 3.0 make use of new DDSaver API ?
>> And not make guesses about program internals
>
> Sure, see README.txt.

Hello Peter, I am just following your suggestions :-) Making no guesses, just asking questions.
The Readme does have an icon in the Test Suite.

From the Readme:
Idle timer for internal cleanups
--------------------------------
If Doodle's Screen Saver (DSSaver) v1.8 or later is installed,

I am still in the learning / beginner's phase with the new release :-)
Following the Readme:
No DDServer installed, no internal cleanups
Consequences?
I just have data from an
I guess I should try harder to phrase my suggestions better: Please install DSSaver v1.8 or later and see if the hangs go away. ;-) (Actually, just copying SSCore.DLL into LIBPATH should be good enough; a full installation is not necessary.)

> Following the Readme:
> No DDServer installed, no internal cleanups

That is not what README.txt says; I'm not sure that the reverse is strictly true. It may be, but please test.

I also did not want to suggest that you have a memory problem in the sense that you have too little RAM in your machine. But system "hangs" can have many reasons, including an application allocating huge amounts of memory. A memory monitor with a graph of some sort could clear that up.

Whether Google is being contacted during the hang could easily be checked by confirming whether there was any network traffic at all (or by a full-fledged analysis using iptrace/ipformat).

Finally, to confirm that some profile operation is responsible, you could also write a (REXX?) script that lists the contents of your profile directory every 5s. Then, after a hang, extract the listings immediately before and after the hang, so that we can determine what actually changed (if anything).
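The directory-listing idea from the last paragraph could look roughly like the following. This is a minimal sketch in Python rather than REXX, under the assumption that a Python interpreter is available on the test system; the function names are hypothetical, and the profile path has to be supplied by the user.

```python
import os
import time


def snapshot(directory):
    """Return {file name: (size in bytes, modification time)} for one directory."""
    listing = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                listing[entry.name] = (info.st_size, info.st_mtime)
    return listing


def diff_snapshots(before, after):
    """Return a sorted list of (kind, name) pairs describing what changed."""
    changes = []
    for name in sorted(set(before) | set(after)):
        if name not in before:
            changes.append(("added", name))
        elif name not in after:
            changes.append(("removed", name))
        elif before[name] != after[name]:
            changes.append(("changed", name))
    return changes


def watch(directory, interval_seconds=5.0, rounds=None):
    """Print a time-stamped line for every file that changed between two listings.
    rounds=None keeps watching until interrupted."""
    previous = snapshot(directory)
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval_seconds)
        current = snapshot(directory)
        for kind, name in diff_snapshots(previous, current):
            print(time.strftime("%H.%M.%S"), kind, name, current.get(name))
        previous = current
        done += 1


# Example call (the path is a placeholder, not taken from this report):
# watch(r"X:\path\to\firefox\profile", interval_seconds=5.0)
```

Printing only the differences keeps the output short, so the lines just before and after a hang are easy to find by time stamp. During a hang the script itself may be delayed, so gaps between its time stamps are a further hint about when the freeze happened.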
> Please install DSSaver v1.8 or later and see if the hangs go away. ;-)
> (Actually, just copying SSCore.DLL into LIBPATH should be good enough, a full
> installation is not necessary.)

Done, here are the results.

From my system log:
Sat. 05-07-2008 18.47 SSCore.DLL extracted via Warp-View 1.7-T from dssaver_v19.exe to s:\mof-3000\mozilla\firefox

[S:\mof-3000\mozilla\firefox]dir ssc*
The volume label in drive S is VL_S_HPFS.
The Volume Serial Number is 298A:7814
Directory of S:\mof-3000\mozilla\firefox
22.04.08 0.38 33673 0 SSCore.dll
1 file(s) 33673 bytes used
49866752 bytes free
[S:\mof-3000\mozilla\firefox]

First impression: the workload seems to be higher; the CPU monitor shows values around 98% more often.

But the main point is:
=======================
The system had a "GUI user lock" incident at 21.48. Both the interval extension and the total time from start to end of the incident are now significantly longer.

Here is the CPU log excerpt (normal log interval: 4 sec):
21.47.10 05.07.2008 CPU: 81%
21.47.14 05.07.2008 CPU: 71%
21.47.18 05.07.2008 CPU: 78%
21.47.22 05.07.2008 CPU: 47%
21.47.26 05.07.2008 CPU: 64%
21.47.30 05.07.2008 CPU: 81%
21.47.34 05.07.2008 CPU: 75%
21.47.39 05.07.2008 CPU: 73% <--- 5
21.47.48 05.07.2008 CPU: 99% 9
21.47.56 05.07.2008 CPU: 99% 8
21.48.04 05.07.2008 CPU: 99% 8
21.48.13 05.07.2008 CPU: 99% 9
21.48.21 05.07.2008 CPU: 99% 8
21.48.27 05.07.2008 CPU: 99% 6
21.48.31 05.07.2008 CPU: 91%
21.48.41 05.07.2008 CPU: 99% 10
21.48.51 05.07.2008 CPU: 99% 10
21.48.56 05.07.2008 CPU: 99% 5
21.49.00 05.07.2008 CPU: 80%
21.49.04 05.07.2008 CPU: 81%
21.49.09 05.07.2008 CPU: 85%
21.49.13 05.07.2008 CPU: 85%
21.49.17 05.07.2008 CPU: 86%
21.49.21 05.07.2008 CPU: 55%
21.49.25 05.07.2008 CPU: 65%
21.49.29 05.07.2008 CPU: 90%
21.49.33 05.07.2008 CPU: 72%
21.49.37 05.07.2008 CPU: 64%
21.49.41 05.07.2008 CPU: 64%
21.49.45 05.07.2008 CPU: 82%
21.49.49 05.07.2008 CPU: 77%
I didn't get the information needed to find out what the problem really is. I haven't heard any other reports nor can I reproduce the problem. -> incomplete
Severity: normal → minor
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → INCOMPLETE