Open Bug 1897299 Opened 1 year ago Updated 1 year ago

Add ingestion connection

Categories

(Application Services :: Suggest, enhancement, P3)

enhancement

Tracking

(Not tracked)

People

(Reporter: bdk, Unassigned)

Details

We currently have 2 connections: a reader and writer. This presents a bit of an issue for cancellation since the writer connection is used for both ingestion and also calls like dismiss_suggestion. This means that the interrupt_ingestion method that we're adding in https://github.com/mozilla/application-services/pull/6246 will also interrupt a call to dismiss_suggestion. This is not really an issue in today since no consumer calls dismiss suggestion and we only call interrupt_ingestion under very specific circumstances, but it could become one in the future.

One way to avoid this issue would be to create a third connection for ingestion. The main reason this hasn't happened is because 2 SQLite writers can block each other and cause SQLITE_BUSY errors. I think there are ways to work around this though. One way would be to create a single mutex around both connections rather than have separate one for each.

Thanks for filing this, Ben!

This presents a bit of an issue for cancellation since the writer connection is used for both ingestion and also calls like dismiss_suggestion.

Great point! I think my feeling was that a consumer should never be calling interrupt_ingestion() except at shutdown, and if we have a write in progress (from ingest() or from dismiss_suggestion()) as we're shutting down, then we can't guarantee that it makes it either way.

In general, I also tend to go for exposing smaller interfaces with more specific functionality to consumers, to avoid potential misunderstandings and misuse—like an interrupt_everything() method that would signal "⚠️ use with caution; you don't need this in normal use ⚠️". But that's totally a personal preference, and I really like your point that a consumer might have a good reason to interrupt ingestion specifically, but not other writes.

Thinking about it from that perspective, I completely agree that it would be surprising to have a method called interrupt_ingestion() also interrupt dismiss_suggestion().

The main reason this hasn't happened is because 2 SQLite writers can block each other and cause SQLITE_BUSY errors. I think there are ways to work around this though.

Thanks for continuing to think about this! I think I have a bit of an aversion to multiple SQLite writers from past experience, but that shouldn't be a reason for us not exploring it.

Multiple writers were part of the initial approach of "new bookmark sync" on Desktop, and one that we copied for Rust Places. We ended up removing them from Desktop, because we never did get a handle on that SQLITE_BUSY error, plus the overhead of managing a separate connection wasn't doing us any favors. It turned out to be simpler and faster to have a single writer that did all the work exclusively—we were even able to split the read to build the bookmarks tree from the write transaction to update it.

Copying that design to Rust Places, and the complications it brought with it, was one of my biggest "architectural regrets" [1] in retrospect. By the time we took multiple writers out of Desktop, Rust Places was already well on the critical path for Fenix, and it didn't feel right to rearchitect it just for that.

That's where my wariness about multiple writers comes from—but it's not to discourage you from investigating it! 😅

One way would be to create a single mutex around both connections rather than have separate one for each.

You got me thinking: if we're protecting both writers with the same mutex, isn't that functionally equivalent to having a single writer with two different interrupt scopes? Since a single rusqlite::Connection also has to be behind a mutex, ingest() and dismiss_suggestion() can't ever run concurrently—interrupting will interrupt whatever operation is going on now, but that can only be one of ingest() or dismiss_suggestion(), not both. With that in mind, I wonder if we can still have a single connection, but with different interrupt scopes 🤔

Anyhoo, just some more food for thought!

[1]: In the sense of "warts, mistakes, and compromises that would be nice to change, but not worth risking breakage", in the vein of this series.

In general, I also tend to go for exposing smaller interfaces with more specific functionality to consumers, to avoid potential misunderstandings and misuse—like an interrupt_everything() method that would signal "⚠️ use with caution; you don't need this in normal use ⚠️". But that's totally a personal preference, and I really like your point that a consumer might have a good reason to interrupt ingestion specifically, but not other writes.

Maybe I should have just stuck with your name in the first place and that's the solution to this. Part of this is that I don't really understand what iOS is doing, so I'm just thinking about what makes sense in my head for some sort of ingestion worker. It seems good to be able to cancel the worker without affecting with the rest of the app, but maybe it's a YAGNI situation.

Multiple writers were part of the initial approach of "new bookmark sync" on Desktop, and one that we copied for Rust Places. We ended up removing them from Desktop, because we never did get a handle on that SQLITE_BUSY error, plus the overhead of managing a separate connection wasn't doing us any favors. It turned out to be simpler and faster to have a single writer that did all the work exclusively—we were even able to split the read to build the bookmarks tree from the write transaction to update it.

Copying that design to Rust Places, and the complications it brought with it, was one of my biggest "architectural regrets" [1] in retrospect. By the time we took multiple writers out of Desktop, Rust Places was already well on the critical path for Fenix, and it didn't feel right to rearchitect it just for that.

I don't know if you should feel that bad about it. I've worked with that places code and it's quite complex, but there's quite a complex set of requirements there. That said, it does seem like having a single writer at a time is the right call. I keep coming back to the idea of using an actor model for this, but I don't think that helps in this situation that much.

You got me thinking: if we're protecting both writers with the same mutex, isn't that functionally equivalent to having a single writer with two different interrupt scopes?

That's basically what I'm after and maybe that's how we would present it in the API. However, there's the issue of the SQLite interrupt handle, which is tied to a particular connection. I can't figure out a way to create 2 different scopes since there's only 1 handle.

Thank you both for the insights! I put them into a summary table below.

I am also inclined to going for the single-writer option for its simplicity. In particular, the consumer doesn't have to worry too much about the transaction failures. As for downsides, apart from the mitigations Lina mentioned, we can also instrument how often those point writes get interrupted in the wild, my understanding is that that should be very rare and the user impact should be still manageable. If telemetry suggests the opposite, then we can think about the other approaches.

Does that make sense?

Single Writer

Pros

  • Easy to reason
  • SQLITE_BUSY and SQLITE_SNAPSHOT_BUSY are avoided b/c writes are serialized by the connection mutex

Cons

  • Coarser granularity on interruption scope

Multi Writers

Pros

  • Finer control on interruption for each connection

Cons

  • Write transactions on both connections could run into SQLITE_BUSY and the consumer needs to handle the retry, especially, the point writes (e.g. dismissal) are very likely to fail if an ingestion is in progress. A mutex can be used to serialize the writes on both connections, but that would cancel the benefits of having two connections
  • Slightly higher footprint for having a separate connection

Thanks for that write-up, it's really helpful to see it spelled out like that. I'm picturing a 3rd option that's a hybrid of the first two:

Multiple connections behind a single mutex so only one writer can be active at once

Pros

  • Finer control on interruption for each connection
  • SQLITE_BUSY and SQLITE_SNAPSHOT_BUSY are avoided b/c writes are serialized by the connection mutex
  • Easier to reason about than the multi-writer option

Cons

  • Slightly higher footprint for having a separate connection
  • Slightly harder to reason about than the single-writer option

Thanks for adding option 3! One thing I do like about 3 is that points writes are less likely to go missing than option 1. Also the write access to the two write connections are serialized to avoid SQLITE_BUSY. So I think I am convinced.

Severity: -- → N/A
Priority: -- → P3
You need to log in before you can comment on or make changes to this bug.