Closed Bug 1030270 Opened 10 years ago Closed 8 years ago

Add timeouts of provider collection

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: gps, Unassigned)

References

(Depends on 1 open bug)

Details

From bug 944873, we know that providers may time out, causing a task to never finish. This will effectively cause FHR to hang and a shutdown crash to ensue.

Since FHR is talking to many components and that set keeps growing over time, it only takes one misbehaving component to time out FHR and lead to hangs and eventually a shutdown crash.

IMO, we should set a time limit for provider collection. If a provider doesn't report in time, we simply give up [and try again later]. You could call this optimistic reporting.

Complicating the implementation of this is SQLite. Providers collect inside transactions. To "abort" a collection would require rolling back a SQLite transaction. This is a mess of code.

If we restructure provider collection to return a JS object instead, we can handle the complicated SQLite interaction at a higher level so we don't have to worry about low-level transactions from individual providers. If we move away from SQLite, a solution likely becomes much easier.
If there is interest, I can add support for timeouts in AsyncShutdown barriers. It's not exactly the same mechanism, but, from a distance, it seems that it is a more generic variant of the onShutdown provider hook.
(In reply to David Rajchenbach Teller [:Yoric] from comment #1)
> If there is interest, I can add support for timeouts in AsyncShutdown
> barriers. It's not exactly the same mechanism, but, from a distance, it
> seems that it is a more generic variant of the onShutdown provider hook.

That sounds like a good idea on its own. Although, my first opinion is it does seem to limit the utility of shutdown barriers a bit. If shutdown barriers ever evolve into dynamic feature activation and deactivation during Gecko run-time - something I think will be necessary for things like Firefox OS - you may have dug yourself into a hole. I'm not sure what your intent for shutdown barriers is.
(In reply to Gregory Szorc [:gps] from comment #0)
> Complicating the implementation of this is SQLite. Providers collect inside
> transactions. To "abort" a collection would require rolling back a SQLite
> transaction. This is a mess of code.
> 
> If we restructure provider collection to return a JS object instead, we can
> handle the complicated SQLite interaction at a higher level so we don't have
> to worry about low-level transactions from individual providers. If we move
> away from SQLite, a solution likely becomes much easier.

Per this and a chat on IRC, the refactoring of providers to return JS objects seems like the feasible option here.
Also note that we can use e.g. PromiseUtils.resolveOrTimeout here.
Depends on: 1113366
FHR is going away per bug 1209088, wontfix.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.