Faster mapping of URIs to add-on IDs

RESOLVED FIXED in mozilla33

Status

()

defect
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: billm, Assigned: billm)

Tracking

unspecified
mozilla33
x86_64
Linux
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 2 obsolete attachments)

Posted patch addon-id-mapping (obsolete) — Splinter Review
I decided to spin this off from bug 990729. For that bug, we need a way to quickly map a URI to an add-on ID. I started with a patch that tried to call into the add-on manager to do that, but based on Benjamin's comment 41 in that bug, I decided to abandon the approach.

Instead, this patch does everything in C++. I eliminated the checks for URLs like chrome://global/ because they don't work in general.

As before, I'm a little unsure about reviewers. Are you okay doing this, Irving?
Attachment #8430405 - Flags: review?(irving)
Comment on attachment 8430405 [details] [diff] [review]
addon-id-mapping

Review of attachment 8430405 [details] [diff] [review]:
-----------------------------------------------------------------

I agonized over this review quite a bit - the Trie data structure really feels like overkill to me, considering the size and structure data we're managing. I ran a Telemetry analysis to count active add-ons in the field; see http://www.controlledflight.ca/2014/06/02/how-many-firefox-extensions-do-people-use/; tl;dr there's one crazy Firefox user with 273 active add-ons, but the 99th percentile over all is 19 and the median is 3. Unless we're looking up these paths a lot, binary search of a sorted array would be more than enough. Add-on paths will never be substrings of other add-on paths, so we don't even need to be smart about matching longest possible substring...

I'm particularly concerned about memory management on the value side of the Trie data structure; if we're going to put this in the tree as a utility that other modules might use, we need to make sure that's solid.

::: toolkit/mozapps/extensions/AddonManagerUtils.cpp
@@ +17,5 @@
> +#include "nsIChromeRegistry.h"
> +#include "nsIJARURI.h"
> +#include "mozilla/AddonPathService.h"
> +
> +#include "mozilla/AddonManagerUtils.h"

Do we really need a separate "Utils" module here? These routines are specific enough that they could be declared and implemented in AddonPathService.

@@ +26,5 @@
> +ResolveURI(nsIURI* aURI, nsACString& out)
> +{
> +  bool equals;
> +  nsresult rv;
> +  nsCOMPtr<nsIURI> uri = aURI;

Both code paths in the if/else assign new values to 'uri' before we use it, so I don't think we need this assignment.

@@ +33,5 @@
> +  // Resolve resource:// URIs. At the end of this if/else block, we
> +  // have both spec and uri variables identifying the same URI.
> +  if (NS_SUCCEEDED(aURI->SchemeIs("resource", &equals)) && equals) {
> +    nsCOMPtr<nsIIOService> ioService = do_GetIOService(&rv);
> +    NS_ENSURE_SUCCESS(rv, rv);

NS_ENSURE_SUCCESS is deprecated; see https://groups.google.com/forum/#!msg/mozilla.dev.platform/1clMLuuhtWQ/8MxLDZN28Y4J

::: toolkit/mozapps/extensions/AddonPathService.cpp
@@ +6,5 @@
> +#include "AddonPathService.h"
> +
> +namespace mozilla {
> +
> +AddonPathService::AddonPathService()

I don't expect this to be a particularly large data structure, so we probably don't need an about:memory reporter for it unless there's a rule requiring one.

::: toolkit/mozapps/extensions/amIAddonPathService.idl
@@ +4,5 @@
> +
> +#include "nsISupports.idl"
> +
> +[scriptable, uuid(fcd9e270-dfb1-11e3-8b68-0800200c9a66)]
> +interface amIAddonPathService : nsISupports

Should we do this with WebIDL instead of XPCOM? It's the future...

::: toolkit/mozapps/extensions/internal/XPIProvider.jsm
@@ +1683,5 @@
>      logger.info("Mapping " + aID + " to " + aFile.path);
>      this._addonFileMap.set(aID, aFile.path);
> +
> +    let service = Cc["@mozilla.org/addon-path-service;1"].getService(Ci.amIAddonPathService);
> +    service.insertPath(aFile.path, aID);

The service lookup could be cached (e.g. XPCOMUtils.defineLazyServiceGetter), but this isn't a hot code path so it's not a big deal.

::: toolkit/mozapps/extensions/test/xpcshell/test_addon_trie.js
@@ +20,5 @@
> +  insert("def", "11");
> +  insert("axy", "12");
> +  insert("defghij", "13");
> +  insert("defghi", "14");
> +  insert("defghi", "15");

This tests splitting def -> ghij into def -> ghi -> {"", j}; just because I'm paranoid, can you add a test for

insert("defgdc", "whatever") so that both sides of the split node have string contents (def-> g -> {"hi", "dc"} )

::: toolkit/mozapps/extensions/test/xpcshell/xpcshell-shared.ini
@@ +1,1 @@
> +[test_addon_trie.js]

the addon manager tests are (wastefully) run twice, once with the "always unpack add-ons" pref off (the default) and once with it on. Since your test doesn't depend on that, please put it in xpcshell.ini so we only run it once.

::: xpcom/ds/Trie.cpp
@@ +28,5 @@
> + *           edge "12"   /
> + *   root o ------------o
> + *                       \
> + *                        child '5'------------o value=B
> + *                                   edge "67"

Am I mis-reading this diagram vs. your code? From the code, it looks like a node contains multiple (a hash table of) edges and each edge refers to a single node.

@@ +58,5 @@
> +    TrieNode* Find(const nsACString& key, unsigned index);
> +    void Insert(const nsACString& key, unsigned index, nsAutoPtr<TrieNode>& node);
> +
> +    nsCString mLabel;
> +    nsAutoPtr<TrieNode> mTrieNode;

Can the Edge and Node data structures be merged? an Edge always contains a Node, so the only wasted space is at the root node which doesn't need the Edge bits

@@ +65,5 @@
> +  static PLDHashOperator
> +  DumpChild(const uint32_t& ch, TrieEdge* edge, void* udata);
> +
> +  Trie::DataType mValue;
> +  nsClassHashtable<nsUint32HashKey, TrieEdge> mChildren;

This is wasted space on leaf nodes, but if it's not much memory I'd leave it embedded to avoid dynamic allocation overhead. Aside from that, it feels like major overkill to use a full blown hash table to manage something we're indexing with a char; on the other hand, it feels like major bikeshedding to worry about it unless this API is called a lot.

@@ +115,5 @@
> +TrieNode::Insert(const nsACString& key, unsigned index, nsAutoPtr<TrieNode>& node)
> +{
> +  if (index == key.Length()) {
> +    // Overwrite the value in this node.
> +    mValue = node->mValue;

This leaks the previous value if one was set.

@@ +237,5 @@
> +void
> +Trie::Insert(const nsACString& key, void* value)
> +{
> +  nsAutoPtr<TrieNode> node(new TrieNode(value));
> +  mRoot->Insert(key, 0, node);

Rather than construct the node here and pass it down, I'd prefer to pass the value into Insert() and construct the node when we get to the leaf location. This would also play well with collapsing the Node and Edge data structures into one.
Attachment #8430405 - Flags: review?(irving) → review-
Posted patch addon-id-mapping v2 (obsolete) — Splinter Review
Thanks Irving. I thought about this some more and I realized that we can actually optimize the most common cases very easily. All the Firefox stuff will come from the Omnijars, and it's easy to check for that after resolving the URI.

Also, given the data you posted, I decided to switch to a sorted array like you suggested.
Attachment #8430405 - Attachment is obsolete: true
Attachment #8437330 - Flags: review?(irving)
Comment on attachment 8437330 [details] [diff] [review]
addon-id-mapping v2

Review of attachment 8437330 [details] [diff] [review]:
-----------------------------------------------------------------

How convenient that nsTArray<foo> already implements sorted insert and binary search ;-)

I'd like somebody more experienced with JSAPI to take a quick look at those parts of the patch, but the rest of it looks fine aside from the comments below.

::: toolkit/mozapps/extensions/AddonPathService.cpp
@@ +32,5 @@
> +  int Compare(const PathEntry& entry1, const PathEntry& entry2) const
> +  {
> +    unsigned len = std::min(entry1.mPath.Length(), entry2.mPath.Length()) + 1;
> +    return memcmp(entry1.mPath.BeginReading(), entry2.mPath.BeginReading(),
> +                  len * sizeof(char16_t));

The buffer returned by nsAString::BeginReading() is not necessarily null terminated, and Length() + 1 isn't guaranteed to be valid memory: https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Glue_classes/nsAString/BeginReading

If this is known to be safe because we're actually working with an nsAString subclass that guarantees a null terminated buffer, at least add a comment here but preferably change the types of the chain of parameters and fields to enforce that constraint.

@@ +138,5 @@
> +  // Resolve resource:// URIs. At the end of this if/else block, we
> +  // have both spec and uri variables identifying the same URI.
> +  if (NS_SUCCEEDED(aURI->SchemeIs("resource", &equals)) && equals) {
> +    nsCOMPtr<nsIIOService> ioService = do_GetIOService(&rv);
> +    if (NS_FAILED(rv))

These are all "shouldn't fail" cases, so they should be "if (NS_WARN_IF(NS_FAILED(rv))"

::: toolkit/mozapps/extensions/amIAddonPathService.idl
@@ +6,5 @@
> +
> +[scriptable, uuid(fcd9e270-dfb1-11e3-8b68-0800200c9a66)]
> +interface amIAddonPathService : nsISupports
> +{
> +  AString findPath(in AString path);

This should probably be named findAddonID, since it maps paths to addon IDs.

This IDL file needs documentation comments: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Interface_development_guide/Commenting_IDL_for_better_documentation
Attachment #8437330 - Flags: review?(irving) → review-
It turns out nsString has == and < operators defined on it. Somehow I missed these the first time through. Also, there's a StringBeginsWith function that's really useful here.
Attachment #8437330 - Attachment is obsolete: true
Attachment #8437961 - Flags: review?(irving)
Posted patch js-addon-idSplinter Review
These changes introduce JSAddonId.
Attachment #8437962 - Flags: review?(bobbyholley)
Attachment #8437962 - Flags: review?(bobbyholley) → review+
Comment on attachment 8437961 [details] [diff] [review]
addon-id-mapping v3

Review of attachment 8437961 [details] [diff] [review]:
-----------------------------------------------------------------

Next time I'll be fussy about using @param and @returns Doxygen commands in the IDL comments, but I feel like this patch has had enough rounds of nit-picking ;->

I'm quite happy with how this patch turned out, nice work.
Attachment #8437961 - Flags: review?(irving) → review+
https://hg.mozilla.org/mozilla-central/rev/59d1fd1d1391
https://hg.mozilla.org/mozilla-central/rev/5d6ecbe3628a
Status: NEW → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Should we switch all users of mapURIToAddonID to use this instead of maintaining multiple codepaths?
Flags: needinfo?(wmccloskey)
The code in XPIProvider.jsm for resolving URLs has cases for view-source: and about: URLs, which the new C++ code doesn't handle. I'm not sure how important those cases are.

Otherwise, though, I don't think there are any reasons not to unify them.
Flags: needinfo?(wmccloskey)
You need to log in before you can comment on or make changes to this bug.