Closed Bug 280603 Opened 16 years ago Closed 16 years ago
"New Updates Avail" popup in bottom right-hand corner pops up endlessly (random occurrence)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8b) Gecko/20050122 Firefox/1.0+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8b) Gecko/20050122 Firefox/1.0+
Happened once so far. The "New Updates Available" (green jigsaw) box is popping up in an infinite loop. It is chewing a fair amount of CPU in the process. Clicking on the box/window just pops them up faster and increases CPU load. Four windows open, with 5-20 tabs each. Nothing I can do will get rid of this window.
Reproducible: Didn't try
Process Explorer notes firefox.exe is in state Wait:WrUserRequest, and context-switching 300-1000 times a second. MSVCRT.DLL is also performing a lot of cswitches, cycling between Wait:UserRequest and Ready.
This is the only easy way to describe what is happening.
Suspected of DoS'ing UMO. Issue to be determined.
Severity: normal → critical
Version: unspecified → Trunk
update.mozilla.org is currently down, and based on network traffic I highly suspect it's because of this bug. We've effectively been under a DDoS attack since exactly midnight GMT on Feb 1.
The following seems to be at fault: http://lxr.mozilla.org/mozilla/source/toolkit/mozapps/update/src/nsUpdateService.js.in#489
Note the use of getUTCDay (which is day of the week) instead of getUTCDate (which is day of the month). This means update checks aren't happening at all after the first week of the month is over, and can potentially behave REALLY weird during that first week of the month if the day of the month and the day of the week line up just right.
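For readers following along, here is a minimal sketch of the difference between the two calls. The helper names `utcNowBuggy` and `utcNowFixed` are hypothetical, and assembling the timestamp field by field with `Date.UTC` is an assumption for illustration, not a copy of the code at the lxr link. Only `getUTCDay`, `getUTCDate`, and `Date.UTC` are standard JavaScript.

```javascript
// Hypothetical reconstruction: build a UTC timestamp field by field.
function utcNowBuggy(d) {
  return Date.UTC(d.getUTCFullYear(), d.getUTCMonth(),
                  d.getUTCDay(),   // bug: day of the week, 0 (Sunday) .. 6
                  d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds());
}

function utcNowFixed(d) {
  return Date.UTC(d.getUTCFullYear(), d.getUTCMonth(),
                  d.getUTCDate(),  // correct: day of the month, 1 .. 31
                  d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds());
}

const feb1 = new Date(Date.UTC(2005, 1, 1, 0, 0, 0));    // a Tuesday
const feb20 = new Date(Date.UTC(2005, 1, 20, 12, 0, 0)); // a Sunday

// Tuesday is day-of-week 2, so Feb 1 is reported as Feb 2.
console.log(new Date(utcNowBuggy(feb1)).toISOString());
// Sunday is day-of-week 0, and day 0 of February rolls back to Jan 31.
console.log(new Date(utcNowBuggy(feb20)).toISOString());
// The fixed version round-trips to the real time.
console.log(new Date(utcNowFixed(feb20)).toISOString());
```

Under these assumptions the buggy value can never land later than the 6th of the month, which would be consistent with checks stopping after the first week.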
Status: UNCONFIRMED → NEW
Ever confirmed: true
Version: Trunk → unspecified
much thanks to mconnor for finding the chunk of code where this lived.
Version: unspecified → Trunk
Assignee: bugs → mconnor
Status: NEW → ASSIGNED
There may be more to this bug than just the date calculation... Why did Firefox think there was an update available when there wasn't?
And why does it think there's one available when the server is unreachable (bug 280607)?
Comment on attachment 173049 [details] [diff] [review] use getUTCDate correctly who knows, but firstname.lastname@example.org on the patch. I think asa is managing branch approvals.
*** Bug 280607 has been marked as a duplicate of this bug. ***
(In reply to comment #6)
> Why did Firefox think there was an update available when there wasn't?
Or was there? The reporter mentioned it was the green jigsaw icon that was popping up... that's the extension updates, not the application update, right? Extensions and themes can have their own update URLs.
OS: Windows 2000 → All
Hardware: PC → All
This is a little bit of a longshot, but I'll throw it out anyway: Could this be more fallout (in some way) related to the switch to namespaced expat?
Comment on attachment 173049 [details] [diff] [review] use getUTCDate correctly a=asa.
Attachment #173049 - Flags: approval-aviary1.0.1? → approval-aviary1.0.1+
278274 is a dupe of this
*** Bug 278274 has been marked as a duplicate of this bug. ***
landed on 1.0.1 branch
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
The trunk still has this problem (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050206 Firefox/1.0+) It would be nice if this patch would be checked in to the trunk as well. Requesting reopening.
lxr shows that this is fixed on trunk.
I searched bonsai for the checkin, but it's not there in the Seamonkey trunk. And even if the fix is checked in, it's not working: the bug still appears in yesterday's build.
*sigh* __And even if the fix is checked in, it's not working: the bug still appears in yesterday's build.__ Or does it work for you?
This is not fixed in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050211 Firefox/1.0+ for me and other testers. Reopening
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
ok, so we're not continuously checking anymore, but we're still repeatedly notifying. Will investigate further.
*** Bug 282411 has been marked as a duplicate of this bug. ***
To test, set the machine's clock to right before the problem time period (midnight GMT of the 1st day of any month, so that would be 4pm PST), e.g. 01-Feb-2005 at 15:55. Then launch Firefox and see what happens. Please correct me if this test case isn't the right way to verify this bug.
PST would be the day before, 4pm on Jan 31.
Whiteboard: need patch
I have not been able to reproduce this. If anyone has detailed steps to increase my odds of seeing this, please post them here. Setting the time isn't working for me and the few extensions that needed updates did not seem to be checking for them automatically.
Sample extension to test with.
Steps to reproduce:
0. Make a new profile, just in case
1. Install this extension
2. Restart Firefox (to finish the install)
3. Go to about:config and set update.interval to 500
4. Wait half a second for the updates available notification (This bug should manifest - the notification will show up again right after going away)
5. Reset update.interval
Note that day of month, etc. do not seem to matter - tested 2005-02-17
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050217 Firefox/1.0+
(This extension has a custom update.rdf with a chrome:// URI, so no servers to worry about)
(In reply to comment #27)
> Steps to reproduce:
> 0. Make a new profile, just in case
> 1. Install this extension
> 2. Restart Firefox (to finish the install)
> 3. Go to about:config and set update.interval to 500
Thinking intuitively, I agree that setting update.interval to such a small value (from 3600000 to 500) will cause update notifications to fire very rapidly. But why is changing the user pref to what I would assume is an obviously insane value the proper way to reproduce what's supposed to be a legitimate bug? Is it the only way to reproduce it? If so, I'd hesitate to call the problem legitimate. In conflict with this line of thinking, though, is that the m.o sysadmin group reported seeing an extraordinary increase in the amount of traffic to UMO during the first days of the month (beginning at approximately midnight UTC 2/1).
> Note that day of month, etc. do not seem to matter - tested 2005-02-17
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050217
> Firefox/1.0+
Here are some questions for you:
* When you reset your update.interval to the default value, do you see this problem?
* When you set your system's date to the first day of the month do you see it?
* Does this problem go away when you change your system's date to a later day in the month?
We need more data and feedback about system configurations that hold this bug and what effect it causes, both on the client side and on the server side. And, really, we need the data soon! We're near the end of the line for Firefox 1.0.1 fixes and this one's big on our radar.
The original reporter, beryan, filed this bug at 18:55 1/31 (which was past 2/1 UTC). To beryan:
* What was your update.interval set to at that time?
* What was app.update.interval set to?
* What are they set to now?
* Did you have any extensions installed?
* If so, did any of those legitimately have new versions available then?
* What was/is your app.version set to in about:config?
We haven't been able to reproduce the endless popup bug locally. What setting triggers the popup slider to appear for users? Also, even with mconnor's patch we see a number of accesses to UMO and we aren't certain that his patch, while reducing the number of accesses to UMO, cuts those accesses down to an acceptable load level for us.
There are aviary1.0.1 builds available right now. These can be found in: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-aviary1.0.1/
We'd appreciate it if beryan, Anton, Nickolay, and Mook tested those builds and let us know if they show the bug for them or not (without changing the update intervals from their default). Even if you've tested against trunk builds, it helps us to know whether the problem still exists on the aviary1.0.1 branch for you. Thanks.
Sorry, better steps to reproduce (to force an update check):
1. Set extensions.update.enabled to false (default true)
2. Set extensions.update.enabled to true
Ethereal reports one hit to the server (per extension/theme) only. I.e., the problem (the notifier showing up immediately after going away) does not depend on update requests to the server. So something is wrong independently of checking too often. Interestingly, I can only reproduce this on the trunk - the 1.0.1 branch does not have the problem with the notifier. So if everyone else agrees on this point, at least it won't need to hold 1.0.1 back.
Occurs on: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050217 Firefox/1.0+
Does not occur on: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20050217 Firefox/1.0
(In reply to comment #29)
> Ethereal reports one hit to the server (per extension/theme) only. I.e., the
> problem (the notifier showing up immediately after going away) does not depend
> on update requests to the server. So something is wrong independently of
> checking too often.
Thanks for providing this data point, Mook. Could you try reproducing this bug using the build at: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2005-02-03-15-aviary1.0.1/
This build is from before mconnor's patch was committed. Specifically I'm interested in hearing if your ethereal trace shows more than just the one access to the UMO service.
No matter what I do I can't get it to access the update site more than once (per reset of *.update.enabled). I had changed it locally to check www.example.com instead; easier to filter. I do see one access each time I set/reset *.update.enabled prefs. That's with the old build; and yes I did try resetting the clock to Feb 1 23:xx PST. It seems to be blocked by *.update.interval (independent of update.interval, which seems to control how often the decision to check or not check is made). Also, the bug (as described in the summary, and as I've been seeing it) does not occur in 20050203-1.0.1branch either.
(From reading the code - wouldn't the old code just force the app to check at the first week of the month, but not allow more checks than normal, anyway? I.e., the second time it checks would always be within the first seven days of the month. But then again, I know nothing :p)
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20050203 Firefox/1.0
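The two-level gating described in that comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `countServerHits` and its parameters are hypothetical names, `tickMs` stands in for update.interval (how often the decision is made), and `checkIntervalMs` stands in for *.update.interval (the minimum spacing between actual server checks). It does not reproduce the real nsUpdateService.js logic.

```javascript
// Hypothetical simulation: a timer fires every tickMs, but a server request
// is only made when at least checkIntervalMs has passed since the last one.
function countServerHits({ tickMs, checkIntervalMs, durationMs }) {
  let lastCheckMs = -Infinity; // assumption: no check has happened yet
  let hits = 0;
  for (let now = 0; now < durationMs; now += tickMs) {
    if (now - lastCheckMs >= checkIntervalMs) {
      hits += 1;
      lastCheckMs = now;
    }
  }
  return hits;
}

const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Timer set to 500 ms, check interval left at one week: a single hit.
const gatedHits = countServerHits({
  tickMs: 500, checkIntervalMs: WEEK_MS, durationMs: 60 * 1000,
});

// If the check interval were also 500 ms, every tick would hit the server.
const ungatedHits = countServerHits({
  tickMs: 500, checkIntervalMs: 500, durationMs: 10 * 1000,
});

console.log(gatedHits, ungatedHits);
```

In this model, shortening only the timer does not multiply server traffic, which matches the single access seen in the trace.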
Okay, this is a little scary. The fix in the bug, while correct, will mean that we'll start seeing MORE traffic to UMO, however the initial spike at the start of the month should go away. If you've been using the app, your last updated date will be set to something in the first week of the previous month. So with update.interval (the interval at which Firefox decides whether to check for updates) set to one hour, within an hour of the new month starting, you will hit the updates URL/URLs because Firefox thinks it's been three weeks. In reality, it could have been only an hour ago, if you started using Firefox in the last week of the month.
So as we tick across timezones into the new month, we compress 24 hours of potential traffic into an hour, since while in theory we'd be staggered by the 24 hour interval, it comes up for everyone at the same time (the only thing saving us here is that not everyone is online at the same time). Then, fortunately, things start to decline until after the first week, where most people have an established last updated date that's late enough in the week that they won't update again that month, barring a late Saturday session, for example.
There's also the extensions factor, since due to this bug, we'll probably only update once a month, because of the one week interval for extension update checking. However, this is N requests per client, where N is the number of extensions/themes installed. So in addition to the theoretical time bomb of millions of users hitting UMO for app update requests, we also have N requests on top of that for users with extensions. Taking an estimate of 3 million users using an average of 5 extensions/themes per client, that's another 15 million requests that'll hit the server in that week, and probably most/all in the first day.
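The estimate in that comment is simple multiplication; a small sketch makes the numbers explicit. The function name is hypothetical, the inputs are the estimates quoted in the comment (not measured data), and the per-hour figure assumes the worst case in which every request lands evenly within a single day.

```javascript
// Hypothetical helper: one request per installed extension/theme per client.
function estimateExtensionRequests(users, addonsPerUser) {
  return users * addonsPerUser;
}

const users = 3e6;        // estimate quoted in the comment
const addonsPerUser = 5;  // estimate quoted in the comment

const extensionRequests = estimateExtensionRequests(users, addonsPerUser);
// Assumption: all of them arrive spread evenly over one 24-hour day.
const perHourIfOneDay = extensionRequests / 24;

console.log(extensionRequests); // 15000000
console.log(perHourIfOneDay);   // 625000
```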
That first spike will get saved as their last update time, so the next week we'll get hammered by an echo spike. But none of this explains how the original reporter beryan got his slider flood.
When I experienced this (1.0 final, now using trunk builds), I had lots of tabs open at the time, so just ignored it for a while. But whilst that notification was going off, I couldn't change panels within options.
See bug 278016 for UMO being able to receive multiple items in one call. See bug 278014 for Firefox sending a single request instead of multiple addon checks. Please note that this isn't the same as application update checking.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050212 Firefox/1.0+ (MOOX M2) Windows XP
I am able to reproduce this every time without additional settings on one profile (but the profile is partially damaged). Here it is: when I start the profile, I get the green arrow for updates. If I surf but do not update, after a certain time I get the sliding message, as described in this bug. After a time, this profile became more problematic - now when I start it with firefox.exe -p, Firefox is locked, so I must close it and start Firefox normally. It is probably related to the bug.
I have done more tests on this profile, and I see that this is connected with some of the extensions. First, I couldn't start Firefox unlocked in safe mode. Then, when I disabled all extensions, Firefox always started locked up. I think I had 4 extensions, but all I can remember is this:
Undo Close Tab
Text link
Google image (the name could be a bit different - it allows viewing images directly by clicking on thumbs)
One more datum: when I click on the arrow for updates, it claims that there are updates for Undo Close Tab, but it is impossible to update. Hope some of this explanation can help to find the possible cause of the error on the reporter's computer.
(In reply to comment #36)
> After a time, this profile became more problematic - now when I start it with
> firefox.exe -p, Firefox is locked, so I must close it and start Firefox normally.
> It is probably related to the bug.
That's true: Once the popup starts sliding in over and over again, you can't close firefox normally. The window will disappear, but the process will remain running. There's no way to shut down firefox except for killing it.
*** Bug 282773 has been marked as a duplicate of this bug. ***
My results are the same as Mook's. The bug as I see it happens with trunk builds both before and after mconnor's checkin:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050128 Firefox/1.0+
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050218 Firefox/1.0+
But not with 1.0 branch builds:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20050218 Firefox/1.0 (from http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-aviary1.0.1/firefox-1.0.en-US.win32.installer.exe)
(Although I did have some issues with that popup, those may be specific to my system)
I was trying to reproduce by toggling extensions.update.autoUpdateEnabled (and later just extensions.update.enabled) on a clean profile with the testcase extension installed. This bug is reproducible, no matter what date is set. I haven't yet tried to reproduce this by setting date, so I can't verify that the original bug was fixed.
(I'm working off the summary here, not the download spike to UMO) Fallout from bug 267089: nsIAlertsService was changed to use nsIObserver, which means that the nsUpdateObserver was getting alertfinished / alertclickcallback; but it assumed that it was only observing the update stuff, and proceeded to show the alert again.
Attachment #174781 - Flags: review?(mconnor)
(In reply to comment #32)
To clarify from mconnor here, the app looks to see if it needs to update again or not once an hour. During the first week of the month, it was using the day of the week instead of the day of the month as "now". The days of the week in this function are numbered from 0 to 6 with 0 being Sunday. February 1st fell on a Tuesday. The value for Tuesday is 2. So when it would do the date check, it would grab the last time it did a date check "Feb 1 at midnight" and compare it to its fake version of "now" using the day of the week, so it thinks "now" is "Feb 2 at 1 AM", and says "oh, more than a day has passed since the last update" and it does another one. So as long as that person was online, and as long as the day of the week value was more than the day of the month value, they were hitting us once an hour.
As for the capacity of the UMO server, don't worry about March. We have more than enough capacity in place now to handle a spike four times the size of the one that hit us in February.
I'm also suspecting that this bug, as reported, is actually a separate issue, and the timing of it being filed and the parity of symptoms between the client behavior and server behavior caused us to errantly hijack this bug for the day-of-the-week issue when it probably wasn't related.
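The Feb 1 scenario walked through in that comment can be worked as a short sketch. The names `skewedNow` and `isCheckDue` are hypothetical, and the one-day interval is taken from the comment's "more than a day has passed" wording rather than from the actual preference value.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Hypothetical: "now" with the day of the week substituted for the day of
// the month, as the comment describes.
function skewedNow(realNowMs) {
  const d = new Date(realNowMs);
  return Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDay(),
                  d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds());
}

// A check is due when at least intervalMs has elapsed since the last one.
function isCheckDue(lastCheckMs, nowMs, intervalMs) {
  return nowMs - lastCheckMs >= intervalMs;
}

const lastCheck = Date.UTC(2005, 1, 1, 0, 0, 0); // "Feb 1 at midnight"
const realNow = Date.UTC(2005, 1, 1, 1, 0, 0);   // one hour later

// Real clock: only one hour has elapsed, so no check is due.
const dueWithRealClock = isCheckDue(lastCheck, realNow, DAY_MS);
// Skewed clock: Tuesday is 2, so "now" reads as Feb 2 at 1 AM, 25 hours later.
const dueWithSkewedClock = isCheckDue(lastCheck, skewedNow(realNow), DAY_MS);

console.log(dueWithRealClock, dueWithSkewedClock); // false true
```

Repeating this every hour while the day-of-week value exceeds the day-of-month value gives the once-an-hour hits the comment describes.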
Comment on attachment 174781 [details] [diff] [review] Possible patch woo, I suck! thanks for cleaning up after my most excellent reviewage.
Attachment #174781 - Flags: review?(mconnor) → review+
Is this patch something that would fix a problem that occurs on the aviary branch, or just on the trunk? Is it something we want to consider for Firefox 1.0.1?
OK, I see now. Bug 267089 landed on the trunk, but didn't update the one implementation of nsIAlertListener in JS -- but the interface change was such that the JS code still worked, but called the observe method instead of the methods that were implementing nsIAlertListener, and the observe method was not set up to handle this (since it had no default case in the switch, which it probably should have -- with a dump and return -- like many observers have assertions in C++ implementations). So this patch is not relevant to the aviary 1.0(.1) branches.
Comment on attachment 174781 [details] [diff] [review] Possible patch As I said in my previous comment, it would probably be good if the switch had a default case that does whatever the JS equivalent of an assertion is (probably dump and return or throw). Being a little more defensive in methods like this is a good thing (although in C++ we have the ability to do it without any runtime cost in non-DEBUG builds). This is why assertions are good and we try to write a lot of them to document and enforce expectations.
(That said, if you write such a default case, you need to ensure that there aren't any other topics that *are* expected.)
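As a rough illustration of the suggestion above, here is a sketch contrasting an observer with no default case against a more defensive one. Everything here is an assumption for illustration: the factory names, the `"hypothetical-update-found"` topic, and the shape of the naive observer are invented and are not the actual nsUpdateService.js code. Only the `alertfinished` and `alertclickcallback` topic names come from this bug.

```javascript
// Naive shape (hypothetical): the switch has no default case and the alert
// is shown after it, whatever topic arrived.
function makeNaiveObserver(showAlert) {
  return {
    observe(subject, topic, data) {
      switch (topic) {
        case "hypothetical-update-found":
          break;
      }
      showAlert(topic);
    },
  };
}

// Defensive shape: expected topics are listed explicitly, and anything else
// is logged and ignored (the "dump and return" idea).
function makeDefensiveObserver(showAlert, log) {
  return {
    observe(subject, topic, data) {
      switch (topic) {
        case "hypothetical-update-found":
          showAlert(topic);
          break;
        case "alertfinished":
        case "alertclickcallback":
          // Expected alert callbacks: nothing to re-show.
          break;
        default:
          log("unexpected topic: " + topic);
          return;
      }
    },
  };
}

// Feed both observers the same sequence of notifications.
const topics = ["hypothetical-update-found", "alertfinished", "something-else"];
const naiveShown = [];
const defensiveShown = [];
const logged = [];
const naive = makeNaiveObserver((t) => naiveShown.push(t));
const defensive = makeDefensiveObserver((t) => defensiveShown.push(t),
                                        (m) => logged.push(m));
for (const t of topics) {
  naive.observe(null, t, null);
  defensive.observe(null, t, null);
}

console.log(naiveShown.length, defensiveShown.length, logged.length); // 3 1 1
```

Listing the alert topics as explicit no-op cases reflects the caveat that a default case must not swallow topics that are actually expected.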
Comment on attachment 174781 [details] [diff] [review] Possible patch Asking for SR (If this gets SR+, please check in for me; I don't have CVS access) dbaron: This was never landed on the branch, so it's not applicable. The blocking+ is for the download spike problem (which is independent).
Comment on attachment 174781 [details] [diff] [review] Possible patch SR isn't needed for toolkit. I'll land this with dbaron's suggestion.
*** Bug 283179 has been marked as a duplicate of this bug. ***
previous patch includes a fix for bug 282752, somewhat related and replacing the initial patch (attachment 173049 [details] [diff] [review]) with a much faster call. Landed only on trunk, the initial patch will do for the 1.0.1 branch.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED