Open Bug 1853821 Opened 9 months ago Updated 8 days ago

Update can get caught in a loop when it fails due to a write error

Categories

(Toolkit :: Application Update, defect, P3)

defect

Tracking

()

People

(Reporter: bytesized, Assigned: erchen)

References

Details

(Whiteboard: [fidedi-ope])

Attachments

(2 files, 1 obsolete file)

It appears that it's possible for update to get caught in a loop.

  1. An update fails due to a write error.
  2. _postUpdateProcessing sees this and calls handleUpdateFailure.
  3. handleUpdateFailure identifies the failure as a write failure and thinks "this is a failure using the service. We should fall back to not use the service". But it is possible to get this type of error without using the service, and this code does not check that the service was actually used.
  4. Repeat.

We should fix this. I think it makes sense to fall back once. But once we have already fallen back, we should not just keep retrying over and over.

Assignee: nobody → erchen
Attachment #9406523 - Attachment description: WIP: Bug 1853821 - Fix update error loop when service is used r=bytesized! → Bug 1853821 - Fix update error loop when service is used r=bytesized!
Attachment #9406523 - Attachment description: Bug 1853821 - Fix update error loop when service is used r=bytesized! → WIP: Bug 1853821 - Fix update error loop when service is used r=bytesized!
Attachment #9406523 - Attachment description: WIP: Bug 1853821 - Fix update error loop when service is used r=bytesized! → Bug 1853821 - Fix update error loop when service is used r=bytesized!
Pushed by erchen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/03d24dcffbb0
Fix update error loop when service is used r=bytesized,application-update-reviewers

Backed out for causing xpcshell failures on updateManagerXML.js.

[task 2024-06-14T14:38:17.059Z] 14:38:17     INFO -  TEST-START | toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js
[task 2024-06-14T14:38:17.305Z] 14:38:17  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js | xpcshell return code: 0
[task 2024-06-14T14:38:17.305Z] 14:38:17     INFO -  TEST-INFO took 242ms
[task 2024-06-14T14:38:17.305Z] 14:38:17     INFO -  >>>>>>>
[task 2024-06-14T14:38:17.307Z] 14:38:17     INFO -  PID 7023 | [Parent 7023, Main Thread] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file /builds/worker/checkouts/gecko/toolkit/crashreporter/nsExceptionHandler.cpp:2888
[task 2024-06-14T14:38:17.308Z] 14:38:17     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2024-06-14T14:38:17.309Z] 14:38:17     INFO -  "14:38:17:168 | TEST-INFO | xpcshellUtilsAUS.js | [setupTestCommon : 883] start - general test setup"
[task 2024-06-14T14:38:17.309Z] 14:38:17     INFO -  TEST-PASS | toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js | setupTestCommon - [setupTestCommon : 885] gTestID should be 'undefined' (setupTestCommon should only be called once) - "undefined" === "undefined"
[task 2024-06-14T14:38:17.309Z] 14:38:17     INFO -  (xpcshell/head.js) | test pending (2)
[task 2024-06-14T14:38:17.310Z] 14:38:17     INFO -  "14:38:17:173 | TEST-INFO | xpcshellUtilsAUS.js | [setupTestCommon : 979] Updates Directory (UpdRootD) Path: /builds/worker/workspace/build/tests/xpcshell/tests/toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML/dir.app"
[task 2024-06-14T14:38:17.311Z] 14:38:17     INFO -  "14:38:17:174 | TEST-INFO | xpcshellUtilsAUS.js | [setupTestCommon : 996] resetting update lock"
[task 2024-06-14T14:38:17.311Z] 14:38:17     INFO -  "14:38:17:175 | TEST-INFO | xpcshellUtilsAUS.js | [setupTestCommon : 1026] finish - general test setup"
[task 2024-06-14T14:38:17.311Z] 14:38:17     INFO -  "14:38:17:175 | TEST-INFO | /builds/worker/workspace/build/tests/xpcshell/tests/toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js | [run_test : 9] testing addition of a successful update to updates.xml and verification of update properties including the format prior to bug 530872"
[task 2024-06-14T14:38:17.311Z] 14:38:17     INFO -  "14:38:17:175 | TEST-INFO | shared.js | [setUpdateChannel : 226] setting default pref app.update.channel to test_channel"
[task 2024-06-14T14:38:17.311Z] 14:38:17     INFO -  PID 7023 | (xpcshell:7023): GLib-GObject-WARNING **: 14:38:17.187: invalid (NULL) pointer instance
[task 2024-06-14T14:38:17.311Z] 14:38:17     INFO -  PID 7023 | (xpcshell:7023): GLib-GObject-CRITICAL **: 14:38:17.187: g_signal_connect_data: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
[task 2024-06-14T14:38:17.311Z] 14:38:17     INFO -  PID 7023 | (xpcshell:7023): GLib-GObject-WARNING **: 14:38:17.187: invalid (NULL) pointer instance
<...>
[task 2024-06-14T14:38:17.331Z] 14:38:17     INFO -  TEST-PASS | toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js | run_test - [run_test : 259] the update patch custom2_attr property should equal the expected value - "custom2 patch value" == "custom2 patch value"
[task 2024-06-14T14:38:17.331Z] 14:38:17     INFO -  "14:38:17:243 | TEST-INFO | /builds/worker/workspace/build/tests/xpcshell/tests/toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js | [run_test : 265] checking the first update patch enumerator"
[task 2024-06-14T14:38:17.331Z] 14:38:17     INFO -  TEST-PASS | toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js | run_test - [run_test : 266] patch enumerator should be an instance of nsISimpleEnumerator - true == true
[task 2024-06-14T14:38:17.332Z] 14:38:17  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js | run_test - [run_test : 271] the length of the array created from the patch enumerator should equal the expected value - 3 == 2
[task 2024-06-14T14:38:17.332Z] 14:38:17     INFO -  /builds/worker/workspace/build/tests/xpcshell/tests/toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js:run_test:271
[task 2024-06-14T14:38:17.333Z] 14:38:17     INFO -  /builds/worker/workspace/build/tests/xpcshell/head.js:_do_main:245
[task 2024-06-14T14:38:17.333Z] 14:38:17     INFO -  /builds/worker/workspace/build/tests/xpcshell/head.js:_execute_test:596
[task 2024-06-14T14:38:17.333Z] 14:38:17     INFO -  -e:null:1
[task 2024-06-14T14:38:17.333Z] 14:38:17     INFO -  exiting test
[task 2024-06-14T14:38:17.334Z] 14:38:17     INFO -  PID 7023 | Couldn't convert chrome URL: chrome://branding/locale/brand.properties
[task 2024-06-14T14:38:17.334Z] 14:38:17     INFO -  PID 7023 | [Parent 7023, Main Thread] WARNING: Could not get the program name for a cubeb stream.: 'NS_SUCCEEDED(rv)', file /builds/worker/checkouts/gecko/dom/media/CubebUtils.cpp:467
[task 2024-06-14T14:38:17.334Z] 14:38:17     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
[task 2024-06-14T14:38:17.335Z] 14:38:17     INFO -  PID 7023 | [Parent 7023, Main Thread] WARNING: 'mColorSchemePreference == ComputeColorSchemeSetting()', file /builds/worker/checkouts/gecko/widget/gtk/nsLookAndFeel.cpp:1544
[task 2024-06-14T14:38:17.335Z] 14:38:17     INFO -  PID 7023 | JavaScript error: /builds/worker/workspace/build/tests/xpcshell/head.js, line 869: NS_ERROR_ABORT
[task 2024-06-14T14:38:17.335Z] 14:38:17     INFO -  "CONSOLE_MESSAGE: (error) [JavaScript Error: "NS_ERROR_ABORT" {file: "/builds/worker/workspace/build/tests/xpcshell/head.js" line: 869}]"
[task 2024-06-14T14:38:17.336Z] 14:38:17  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js | assertNoUncaughtRejections - [assertNoUncaughtRejections : 266] A promise chain failed to handle a rejection: [Exception... "Abort"  nsresult: "0x80004004 (NS_ERROR_ABORT)"  location: "JS frame :: /builds/worker/workspace/build/tests/xpcshell/head.js :: _abort_failed_test :: line 869"  data: no] - stack: _abort_failed_test@/builds/worker/workspace/build/tests/xpcshell/head.js:869:20
[task 2024-06-14T14:38:17.337Z] 14:38:17     INFO -  do_report_result@/builds/worker/workspace/build/tests/xpcshell/head.js:981:5
[task 2024-06-14T14:38:17.337Z] 14:38:17     INFO -  Assert<@/builds/worker/workspace/build/tests/xpcshell/head.js:70:21
[task 2024-06-14T14:38:17.338Z] 14:38:17     INFO -  Assert.prototype.report@resource://testing-common/Assert.sys.mjs:251:10
[task 2024-06-14T14:38:17.338Z] 14:38:17     INFO -  equal@resource://testing-common/Assert.sys.mjs:293:8
[task 2024-06-14T14:38:17.338Z] 14:38:17     INFO -  run_test@/builds/worker/workspace/build/tests/xpcshell/tests/toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML.js:271:10
[task 2024-06-14T14:38:17.339Z] 14:38:17     INFO -  _do_main@/builds/worker/workspace/build/tests/xpcshell/head.js:245:6
[task 2024-06-14T14:38:17.339Z] 14:38:17     INFO -  _execute_test@/builds/worker/workspace/build/tests/xpcshell/head.js:596:5
[task 2024-06-14T14:38:17.340Z] 14:38:17     INFO -  @-e:1:1
[task 2024-06-14T14:38:17.340Z] 14:38:17     INFO -  Rejection date: Fri Jun 14 2024 14:38:17 GMT+0000 (Coordinated Universal Time) - false == true
[task 2024-06-14T14:38:17.341Z] 14:38:17     INFO -  resource://testing-common/PromiseTestUtils.sys.mjs:assertNoUncaughtRejections:266
[task 2024-06-14T14:38:17.341Z] 14:38:17     INFO -  /builds/worker/workspace/build/tests/xpcshell/head.js:_execute_test:597
[task 2024-06-14T14:38:17.342Z] 14:38:17     INFO -  -e:null:1
[task 2024-06-14T14:38:17.342Z] 14:38:17     INFO -  exiting test
[task 2024-06-14T14:38:17.344Z] 14:38:17     INFO -  "14:38:17:254 | TEST-INFO | xpcshellUtilsAUS.js | [AGP_cleanup : 4553] start - unregistering directory provider"
[task 2024-06-14T14:38:17.344Z] 14:38:17     INFO -  "14:38:17:255 | TEST-INFO | xpcshellUtilsAUS.js | [cleanupTestCommon : 1035] start - general test cleanup"
[task 2024-06-14T14:38:17.345Z] 14:38:17     INFO -  "14:38:17:256 | TEST-INFO | xpcshellUtilsAUS.js | [cleanupTestCommon : 1144] attempting to remove directory. Path: /builds/worker/workspace/build/tests/xpcshell/tests/toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML"
[task 2024-06-14T14:38:17.345Z] 14:38:17     INFO -  "14:38:17:256 | TEST-INFO | shared.js | [removeDirRecursive : 605] attempting to remove directory. Path: /builds/worker/workspace/build/tests/xpcshell/tests/toolkit/mozapps/update/tests/unit_aus_update/updateManagerXML"
[task 2024-06-14T14:38:17.346Z] 14:38:17     INFO -  "14:38:17:257 | TEST-INFO | xpcshellUtilsAUS.js | [cleanupTestCommon : 1160] finish - general test cleanup"
[task 2024-06-14T14:38:17.346Z] 14:38:17     INFO -  "14:38:17:257 | TEST-INFO | xpcshellUtilsAUS.js | [AGP_cleanup : 4617] finish - unregistering directory provider"
[task 2024-06-14T14:38:17.347Z] 14:38:17     INFO -  PID 7023 | (xpcshell:7023): GLib-GObject-WARNING **: 14:38:17.293: invalid (NULL) pointer instance
[task 2024-06-14T14:38:17.347Z] 14:38:17     INFO -  PID 7023 | (xpcshell:7023): GLib-GObject-CRITICAL **: 14:38:17.293: g_signal_handlers_disconnect_matched: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
[task 2024-06-14T14:38:17.347Z] 14:38:17     INFO -  PID 7023 | [Parent 7023, Main Thread] WARNING: OOPDeinit() without successful OOPInit(): file /builds/worker/checkouts/gecko/toolkit/crashreporter/nsExceptionHandler.cpp:3460
[task 2024-06-14T14:38:17.348Z] 14:38:17     INFO -  PID 7023 | [Parent 7023, Main Thread] WARNING: XPCOM object PerThreadData destroyed from static ctor/dtor: file /builds/worker/checkouts/gecko/xpcom/base/nsTraceRefcnt.cpp:216
[task 2024-06-14T14:38:17.348Z] 14:38:17     INFO -  PID 7023 | [Parent 7023, Main Thread] WARNING: XPCOM object SourceList destroyed from static ctor/dtor: file /builds/worker/checkouts/gecko/xpcom/base/nsTraceRefcnt.cpp:216
[task 2024-06-14T14:38:17.348Z] 14:38:17     INFO -  <<<<<<<
[task 2024-06-14T14:38:17.349Z] 14:38:17     INFO -  TEST-START | toolkit/mozapps/update/tests/unit_aus_update/cleanupSuccessLogsFIFO.js
Flags: needinfo?(erchen)
Attachment #9408274 - Attachment is obsolete: true
Attachment #9408283 - Attachment description: Bug 1853821 fixing behaviour for BACKGROUND_TASK_SHARING_VIOLATION errorcode r=bytesized! → WIP: Bug 1853821 fixing behaviour for BACKGROUND_TASK_SHARING_VIOLATION errorcode r=bytesized!
Attachment #9408283 - Attachment description: WIP: Bug 1853821 fixing behaviour for BACKGROUND_TASK_SHARING_VIOLATION errorcode r=bytesized! → Bug 1853821 fixing behaviour for BACKGROUND_TASK_SHARING_VIOLATION errorcode r=bytesized!
Attachment #9408283 - Attachment description: Bug 1853821 fixing behaviour for BACKGROUND_TASK_SHARING_VIOLATION errorcode r=bytesized! → Bug 1853821 - fixing behaviour for BACKGROUND_TASK_SHARING_VIOLATION errorcode r=bytesized!
Flags: needinfo?(erchen)

It looks like the updated behaviour will also affect this test (https://searchfox.org/mozilla-central/source/toolkit/mozapps/update/tests/unit_base_updater/marFileLockedFailureComplete_win.js#16, and the equivalent in unit_service_updater).
Unlike the previous test that we handled in D214179 this is actually simulating a write error (with the case where the .mar file is locked), I'm wondering if we can change the test so I verify if handleFallbackToCompleteUpdate() is called after the update is applied, instead of checking if the state falls back to pending?

Flags: needinfo?(bytesized)

Oh no. I had been wondering for a while why this code was ever written this way. It seems to so clearly have the exact flaw that we are trying to fix. Well, this test explains why. Some of these write errors may be transient failures that will be resolved when a lock is released. In that case, it does actually make sense to try again, though I don't know that trying again infinitely was a good idea.

I think that the way that this ought to work is to, if possible, detect these transient failures and use a different error code if they are detected. Then, in that case, we can try again up to some maximum number of times.

This project just keeps getting more and more complex than I originally intended for it to be. Let's chat with Amir about how to proceed here.

Flags: needinfo?(bytesized)
Priority: -- → P3

Leaving this ticket in the backlog for now as I need to work on other projects at this time

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: