Closed Bug 1652910 Opened 5 years ago Closed 4 years ago

mozilla::unicode::GetNaked is not thread-safe

Tracking

()

Status:

RESOLVED FIXED

Milestone:

88 Branch

People

(Reporter: alexhenrie24, Assigned: alexhenrie24, NeedInfo)

References

(Regressed 1 open bug)

Details

Attachments

(1 obsolete file)

Bug 1652910 - Protect mozilla::unicode::GetNaked with a mutex. 5 years ago Alex Henrie 47 bytes, text/x-phabricator-request		Details \| Review

Alex Henrie

Assignee

Description

•

5 years ago

As Jonathan noted at https://phabricator.services.mozilla.com/D81875#2541722:

On looking at this again, I think André is right that there could be a thread-safety concern here. The issue is the use of GetNaked from Places code, which runs off-main-thread; AFAICT there's nothing to protect us from this happening at the same time as nsFind is using it from the main thread. So we really should address that.

Alex Henrie

Assignee

Comment 1

•

5 years ago

Attached file Bug 1652910 - Protect mozilla::unicode::GetNaked with a mutex. (obsolete) — Details

Phabricator Automation

Updated

•

5 years ago

Assignee: nobody → alexhenrie24

Status: NEW → ASSIGNED

Jonathan Kew [:jfkthame]

Comment 2

•

5 years ago

I wonder if it'd be feasible to reduce the amount/scope of locking involved here. Some thoughts:

ToNaked() is thread-safe for ASCII characters, as it just does an early return; this will be a very common case. It seems unfortunate to always lock around code that is calling ToNaked but may well be dealing with safe, ASCII-only text.
Over time, most calls to GetNaked() will tend to be satisfied from the cache; it would be nice if we could read it without locking. Indeed, HashMap offers a readonlyThreadsafeLookup method which sounds nice...
However, that wouldn't work if another thread happened to update the cache at the same time.
Maybe we could do all cache updates from the main thread? So if a secondary thread comes up with a new entry, instead of modifying the hashmap directly, it'd post a runnable to the main thread to perform the update.
Then the main thread would be allowed to read the cache without locking. We'd need to hold the lock in two cases:
- (a) when updating the cache (happens on the main thread only); and
- (b) when reading the cache from a secondary thread.
In particular, this means that main-thread usage that is either ASCII-only or satisfied by existing cache entries never takes the lock.
The Places code would also avoid locking when dealing with ASCII-only text, which is probably fairly common.

So overall, we'd have something like:

At the beginning of GetNaked, lock only if !NS_IsMainThread().
When about to insert a new cache entry, if NS_IsMainThread() then lock and continue, else post a CacheNakedMappingRunnable to the main thread.
- Note that the runnable would need to use put rather than putNew to update the cache, as there could be multiple attempts to add the same mapping.

WDYT - would something along these lines be workable? Does it sound like on balance it'd be a win?

Emilio Cobos Álvarez [:emilio]

Comment 3

•

5 years ago

A possibly easier alternative... Make the cache thread-local?

Jonathan Kew [:jfkthame]

Comment 4

•

5 years ago

Possible, but it looks like Places may call this from multiple secondary threads, so that could lead to quite a bit of duplication. There's less benefit from the cache if each thread has to populate its own copy, and the overall memory footprint increases.

(Ideally, it'd be nice to have a single naked-char cache shared across all processes, not just threads, but that's a different level of effort, as we wouldn't want lookups to involve IPC. So I'm not proposing we pursue that for now.)

Jonathan Kew [:jfkthame]

Comment 5

•

5 years ago

Actually, let's take a step back here; I think there's a totally different strategy we should consider.

Instead of computing GetNaked mappings at runtime from decomposition and transliteration rules, and caching the result for performance (which introduces the thread-safety issue), we should just precompute them at build time (or before build time, at update-to-new-Unicode-version time) and store a simple table as constant data.

There are only 1400 non-identity mappings at present (bug 1649187 will increase this to 1906, and future Unicode updates may create a few more, but it's unlikely there will ever be much larger numbers), so the total amount of data involved won't be too much -- and we'll get to simplify the runtime code and avoid adding ICU transliterator code and data to subset we ship.

We could store the mappings in a multi-level lookup table similar to others in nsUnicodePropertyData, though even just binary-searching a sorted array would probably be sufficiently performant for our needs here.

Alex Henrie

Assignee

Comment 6

•

5 years ago

I think creating the table of character mappings at build time is a great idea :-)

BugBot [:suhaib / :marco/ :calixte]

Comment 7

•

5 years ago

The severity field is not set for this bug.
:mikedeboer, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(mdeboer)

Phabricator Automation

Updated

•

4 years ago

Attachment #9163687 - Attachment is obsolete: true

Alex Henrie

Assignee

Comment 8

•

4 years ago

The fix for Bug 1649187 also fixed the thread-safety issue:

https://hg.mozilla.org/integration/autoland/rev/d61a3c845eb6

Status: ASSIGNED → RESOLVED

Closed: 4 years ago

Resolution: --- → FIXED

Target Milestone: --- → 88 Branch

yxu

Updated

•

4 years ago

Regressions: 1707842

You need to log in before you can comment on or make changes to this bug.

Bugzilla

mozilla::unicode::GetNaked is not thread-safe

Categories

(Core :: Find Backend, defect)

Tracking

()

People

(Reporter: alexhenrie24, Assigned: alexhenrie24, NeedInfo)

References

(Regressed 1 open bug)

Details

Crash Data

Security

(public)

User Story

Attachments

(1 obsolete file)

Description

Comment 1

Updated

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Updated

Comment 8

Updated

Attachment

General

Description

File Name

Content Type