Closed Bug 765227 Opened 12 years ago Closed 12 years ago

bgupdates upgrades the MozillaMaintenance twice, should only run on replace

Categories

(Toolkit :: Application Update, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla16
Tracking Status
firefox15 + fixed
firefox16 + fixed

People

(Reporter: bbondy, Assigned: ehsan.akhgari)

References

Details

Attachments

(1 file)

During a bgupdate I noticed that maintenanceservice_installer gets executed at the apply stage and not the replace stage.
I'm not opposed to this but it made me think that I should make sure the correct maintenanceservice_installer.exe was being run. 
I ran procmon during a bgupdate and noticed that maintenanceservice_installer.exe gets executed from the installationdir and not the insallationdir\updated directory.
This means that we do execute the maintenanceservice updagrade with staged updates, but not with the correct maintenanceservice.exe.  It'll always be one version behind.

Expected results:
We should instead run maintenanceservice_installer.exe from the installdir/updated directory or do the whole service upgrade on the /replace stage.
I think from the installdir/updated directory is the easiest fix here.
The maintenanceservice upgrade should happen on the new files and not the old files.
As far as I can see, we should be running maintenanceservice_installer.exe both during the staging and replace steps, from here: <http://mxr.mozilla.org/mozilla-central/source/toolkit/components/maintenanceservice/workmonitor.cpp#418>.  If the service does not get used, we only run the installer in the replace step <http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/updater/updater.cpp#2867> (note that in the staging step, we'll never have a callback application.)

I think it makes sense to only run the installer in the replace step, and we indeed use the wrong directory (the installation directory) when running it in the staging step in the first case above, but first I want to figure out why you're not seeing the installer get run in the replace step...
Summary: bgupdates upgrades the MozillaMaintenance service one version behind always → bgupdates upgrades the MozillaMaintenance twice
>  but first I want to figure out why you're not seeing the installer get run in the replace step...

Just verified and it runs both times, I previously only ran the test only up until the apply stage.  I updated the title to indicate that it should only run on the replace step.
Summary: bgupdates upgrades the MozillaMaintenance twice → bgupdates upgrades the MozillaMaintenance twice, should only run on replace
Attached patch Patch (v1)Splinter Review
Assignee: nobody → ehsan
Status: NEW → ASSIGNED
Attachment #634150 - Flags: review?(robert.bugzilla)
Comment on attachment 634150 [details] [diff] [review]
Patch (v1)

I'll take this review.
Attachment #634150 - Flags: review?(robert.bugzilla) → review?(netzen)
Comment on attachment 634150 [details] [diff] [review]
Patch (v1)

Review of attachment 634150 [details] [diff] [review]:
-----------------------------------------------------------------

When determining whether or not to run PostUpdate you do a check:
bool backgroundUpdate = (argc == 4 && !wcscmp(argv[3], L"-1"));

It would be better to call UpdateBeingStaged here to avoid code duplication.


I seen you pushed to Oak already, if tests pass and upgrades work you're good to go.

::: toolkit/components/maintenanceservice/workmonitor.cpp
@@ +88,5 @@
> + * @param argv    The argv value normally sent to updater.exe
> + * @param boolean True if we're staging an update
> + */
> +static bool
> +UpdateBeingStaged(int argc, LPWSTR *argv)

nit: IsUpdateBeingStaged

@@ +90,5 @@
> + */
> +static bool
> +UpdateBeingStaged(int argc, LPWSTR *argv)
> +{
> +  return (argc == 4 && !wcscmp(argv[3], L"-1"));

nit: there's no need for the surrounding parentheses.

@@ +112,5 @@
>    // Make sure that the path does not include trailing backslashes
>    if (backSlash && (backSlash[1] == L'\0')) {
>      *backSlash = L'\0';
>    }
>    // PID will be set to -1 if we're supposed to perform a background update.

nit: This comment would be better inside the IsUpdateBeingStaged function above the return.
Attachment #634150 - Flags: review?(netzen) → review+
OK, will land with the nits addressed once I can confirm that the patch works correctly.
I'm trying to test this with no success.  Updates are failing with error 19, and I don't have the fallback key (I double checked).  Not sure what's going on. :(
I'll give the build a spin now and report back.
So I tried using the second most recent build, but the first most recent isn't done yet.  I used the 3rd most recent upgrading to the 2nd most recent and the upgrade succeeded.  But the revision ID was not the tip, it was 91ba8802ad45.

Anyway I assume you did the same test on Oak and got error code 19.  I know you said you don't have the fallback key and that you double checked but maybe you should triple check! :P J/K (kindof).  Maybe you're using a VM and you checked the host registry instead of the VM's or something?

So a way you can debug this further on your machine is to open the updater.exe that you have the problem with in the resource editor and extract the embedded primary DER file which is used to verify the update.  Then check to see which one it matches in toolkit\mozapps\update\updater.
In Comment 9 I meant the revision number I started with was c1c0ad689e33.
(In reply to Brian R. Bondy [:bbondy] from comment #9)
> So I tried using the second most recent build, but the first most recent
> isn't done yet.  I used the 3rd most recent upgrading to the 2nd most recent
> and the upgrade succeeded.  But the revision ID was not the tip, it was
> 91ba8802ad45.

91ba8802ad45 is oak's tip, isn't it?

> Anyway I assume you did the same test on Oak and got error code 19.  I know
> you said you don't have the fallback key and that you double checked but
> maybe you should triple check! :P J/K (kindof).  Maybe you're using a VM and
> you checked the host registry instead of the VM's or something?

I'll find another machine after lunch and will test it there.

> So a way you can debug this further on your machine is to open the
> updater.exe that you have the problem with in the resource editor and
> extract the embedded primary DER file which is used to verify the update. 
> Then check to see which one it matches in toolkit\mozapps\update\updater.

OK, will do.
(In reply to Brian R. Bondy [:bbondy] from comment #10)
> In Comment 9 I meant the revision number I started with was c1c0ad689e33.

That's fine, isn't it?
It is the original patch landed ya, but I just wanted to indicate that in case you did a push afterwards, possibly addressing the nits that could have some other effect.
Oh, no I was talking about the original patch.  I'll land the nits part now, but they shouldn't change the behavior.  We can re-verify on central if needed.
http://hg.mozilla.org/mozilla-central/rev/47e5c4fe2bb0
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: mozilla15 → mozilla16
Version: 12 Branch → Trunk
Did you find the reason for the problem on the failing machine by the way?
The primary cert is nightly_aurora_level3_primary.der, and the backup cert is nightly_aurora_level3_secondary.der.
That looks correct. 
Could you respond to the following:

1) Were you testing with the exact same build on both machines?
2) Can you reproduce this problem with an older Oak build that is known to not get this error?
3) Could you triple verify that this does not exist on the machine with the error 19 in regedit.exe? SOFTWARE\Mozilla\MaintenanceService\3932ecacee736d366d6436db0f55bce4
(In reply to Brian R. Bondy [:bbondy] from comment #18)
> That looks correct. 
> Could you respond to the following:
> 
> 1) Were you testing with the exact same build on both machines?
> 2) Can you reproduce this problem with an older Oak build that is known to
> not get this error?
> 3) Could you triple verify that this does not exist on the machine with the
> error 19 in regedit.exe?
> SOFTWARE\Mozilla\MaintenanceService\3932ecacee736d366d6436db0f55bce4

This happens with every oak build that I've tried on this machine, and the same builds work fine elsewhere.  The registry key is definitely not there (I usually rename the 3932ecacee736d366d6436db0f55bce4 key to xxxx-3932ecacee736d366d6436db0f55bce4 or something, and the latter name is in the registry but that shouldn't matter).

Anyways, I think I'll stop pursuing this in favor of more important work!
Telemetry data shows that error code 19 reports is virtually non-existent, so I'll chalk it up as something is wrong with your machine. If you do believe it to be some bug though please post a task and I can investigate further.
Hmm, I looked using ProcMon, and the updater is indeed getting "not found" errors when it tries to open the fallback reg key path.  If I give you remote access to the machine, can you please take a look at it?
will do, let me know the info.  Just email me it.
We found that it was due to using an override app update service serving unsigned mars.  So the sign error was correct and there is no additional issue to investigate.
Comment on attachment 634150 [details] [diff] [review]
Patch (v1)

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 307181
User impact if declined: The maintenance service would remain one version behind, all the time.
Testing completed (on m-c, etc.): locally and on m-c
Risk to taking this patch (and alternatives if risky): minimal
String or UUID changes made by this patch: none
Attachment #634150 - Flags: approval-mozilla-aurora?
Comment on attachment 634150 [details] [diff] [review]
Patch (v1)

[Triage Comment]
Approving for Aurora 15 in support of landing bug 767125 in the next few days.
Attachment #634150 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: