The default bug view has changed. See this FAQ.

Implement migration from mimeTypes.rdf to the JSON Handler Store

NEW
Assigned to

Status

()

Firefox
File Handling
P1
normal
8 months ago
5 days ago

People

(Reporter: alchen, Assigned: alchen)

Tracking

(Depends on: 2 bugs, Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [CHE-MVP])

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment, 4 obsolete attachments)

(Assignee)

Description

8 months ago
The mimeType.json is a replacement for mimeType.rdf.
(Assignee)

Updated

8 months ago
Blocks: 474043
(Assignee)

Updated

8 months ago
Assignee: nobody → alchen
Whiteboard: [CHE-MVP]

Comment 1

8 months ago
Set priority to P1 because we're going to implement this feature on Firefox 50.
@ Development Schedule: https://docs.google.com/spreadsheets/d/1WznJUt7lLvcZwwry0yvJ_-hw0ZAXTxOOvmYCw74uGKY/edit#gid=0
Priority: -- → P1

Comment 2

8 months ago
I'm not sure exactly how to interpret the priorities assigned to the individual bugs here, but I'd like to clarify that the investigation in bug 1287673 comment 2 should be completed before any implementation work begins. The outline of the work attached to bug 474043 is a start, but it isn't complete without the information about the callers.

To reiterate, the information we need to add to the document is a full list of known callers in the mozilla-central tree together with the detail of which information is actually retrieved or stored, not only which method is called.
Flags: needinfo?(wehuang)
Hi Paolo, sorry for late following here

1. For the Bugzilla field "Importance" I think William had assigned the priority according to the estimated schedule, following the process in Emma's presentation[1] (I thought it might be mentioned somewhere more formally, but this slides is the only one I know to introduce it). As for our "Content Handling Enhancement" project we simply use 2 levels of priority, which are MVP (marked "[CHE-MVP]") and backlog (marked "[CHE-BACKLOG]" ) for now.

[1]https://docs.google.com/presentation/d/1qlXlDsMnvcpA5ppJVJWIBaBDPEyw5hbjU-Q7wjjrZQM/edit#slide=id.g135c9e38c0_0_0

(In reply to :Paolo Amadini from comment #2)
> I'm not sure exactly how to interpret the priorities assigned to the
> individual bugs here, 

2. Thanks for the suggestion, but if my understanding is correct the work here has it's own value (as pointed out in Bug 474043 comment 19), and in the end *may* or *may not* be impacted by the study in Bug 1287673, hence I tend to have the overall job divided into smaller pieces that can be worked separately (even we might have some overhead work eventually). In other words, this bug can still move on while not being blocked by others (e.g., Bug 1287673), in the mean time we also keep the study as you suggested in bug 1287673 comment 2 and discuss with you (we may consider having another person on this in parallel). How do you think?

> but I'd like to clarify that the investigation in bug
> 1287673 comment 2 should be completed before any implementation work begins.
> The outline of the work attached to bug 474043 is a start, but it isn't
> complete without the information about the callers.
> 
> To reiterate, the information we need to add to the document is a full list
> of known callers in the mozilla-central tree together with the detail of
> which information is actually retrieved or stored, not only which method is
> called.
Flags: needinfo?(wehuang) → needinfo?(paolo.mozmail)

Comment 4

8 months ago
(In reply to Wesly Huang (Firefox EPM) from comment #3)

Hi Wesly, thanks for the clarifications!

> 2. Thanks for the suggestion, but if my understanding is correct the work
> here has it's own value (as pointed out in Bug 474043 comment 19)

I'm actually fine with considering the removal of the RDF API calls the primary goal of this project, so the work here has this value indeed. With this in mind, bug 474043 comment 21 clarifies that reimplementing the nsIHandlerService API exactly as it is now might not be the fastest path to that goal.

What the comment mentions is the API "doing something particularly dumb", what it means for example is that the API might have methods that no one actually calls, or that those methods retrieve much more data than what is actually used.

This is essential information we have to keep into account when implementing and reviewing the code changes, also to ensure we have the necessary automated test coverage.

> , and in
> the end *may* or *may not* be impacted by the study in Bug 1287673, hence I
> tend to have the overall job divided into smaller pieces that can be worked
> separately (even we might have some overhead work eventually). In other
> words, this bug can still move on while not being blocked by others (e.g.,
> Bug 1287673), in the mean time we also keep the study as you suggested in
> bug 1287673 comment 2 and discuss with you (we may consider having another
> person on this in parallel). How do you think?

I agree that the investigation about synchronous and asynchronous callers, which is specifically the subject of bug 1287673, can be continued in parallel if you want. Regardless, for this bug, the research work about the callers of nsIHandlerService will be needed at some point.

Maybe instead of creating this list of callers in advance, you'd rather spend more time in studying the code and write your proposal in terms of a patch with some code changes, including automated tests. It can also be an effective starting point for the discussion. Sometimes a regression test can be a better indication of caller requirements than the actual callers.
Flags: needinfo?(paolo.mozmail)
(In reply to :Paolo Amadini from comment #4)
> This is essential information we have to keep into account when implementing
> and reviewing the code changes, also to ensure we have the necessary
> automated test coverage.
> 
> I agree that the investigation about synchronous and asynchronous callers,
> which is specifically the subject of bug 1287673, can be continued in
> parallel if you want. Regardless, for this bug, the research work about the
> callers of nsIHandlerService will be needed at some point.

Thanks Paolo, if my read is right, your view is the study in Bug 1287673 does not block the "beginning" of this Bug 1287658, however the study result, in some extent, is still a "dependency" for the "end" (let's say, sending patch for review) of the work here, because depends on the study of Bug 1287673 we may want to do more in this Bug 1287658 at once (also there are chances we don't need more). Is this correct?

Discussed with the team and think we are ok to proceed as above (if we didn't get you wrong), I'll get Alphan's support to summarize the known actions so we can get agreed upon the steps of this plan.

@Alphan, would you help? Thanks!
Flags: needinfo?(paolo.mozmail)
Flags: needinfo?(alchen)

Comment 6

8 months ago
(In reply to Wesly Huang (Firefox EPM) from comment #5)
> Is this correct?

Yeah, maybe we'll find out we can do more at once, or maybe that we don't need to do some parts.

Thanks!
Flags: needinfo?(paolo.mozmail)
(Assignee)

Comment 7

8 months ago
I will look into existed test case and think about the regression test.
After the investigation, I will write a new proposal with automated tests.(the suggestion of bug 474043 comment 23 will also be included)
Then we can have a discussion based on the document and start with the fundamental work.

For bug 1287673, we will do it in parallel.
Flags: needinfo?(alchen)

Comment 8

8 months ago
Thanks Alphan!
(Assignee)

Updated

7 months ago
Summary: Add mimeType.json access ability into nsHandlerService.js → Implement JSON store (HandlerStore.jsm) for nsIHandlerService
(Assignee)

Comment 9

7 months ago
According to the latest discussion, I set the goal of this bug to JSON store(HandlerStore.jsm) implementation. JSON store is created for the usage of new nsIHandlerService implementation. We will use mimeType.json as a database for nsIHandlerService.

In "HandlerStore.jsm", we will do read synchronously and do write asynchronously.(similar mechanism as bug 853539)
(Assignee)

Comment 10

6 months ago
Created attachment 8796571 [details] [diff] [review]
0930-bug1287658-WIP

Here is current WIP patch.
In this patch, we can use jsonStore.load() to load mimeTypes.json.
We will import data from mimeTypes.rdf if there is no data loaded in load().

Hi Paolo,
could you take a look and see whether it is what you expected?
Attachment #8796571 - Flags: feedback?(paolo.mozmail)
(Assignee)

Comment 11

6 months ago
Created attachment 8796576 [details]
mimeTypes.json produced by 0930-WIP patch(attachment 8796571 [details] [diff] [review])

The following is current format.

{
"mimetypes":[
  {"type":"application/pdf","description":"","action":3,"askBeforeHandle":false,"LocalHandlerApp":[],"WebHandlerApp":[],"WebContentHandler":[]}
],
"schemes":
[
{"type":"webcal","description":"","action":4,"askBeforeHandle":true,"LocalHandlerApps":[],"WebHandlerApps":["https://30boxes.com/external/widget?refer=ff&url=%s"],"WebContentHandlers":[]}}
]
}

Comment 12

6 months ago
Comment on attachment 8796571 [details] [diff] [review]
0930-bug1287658-WIP

Hi Alphan, this is just some initial feedback on the direction, and I may have more after the first revision.

It's very useful to have some sample migration code here as a way to see how the format is actually used. In the end we'd probably want to land this migration code separately, because it will use a different way to access the old nsIHandlerService, after the new nsIHandlerService is implemented.

Since you mostly used the login store as a starting point, it can be useful if you can do an "hg copy" of the file so that we can more easily see the parts that changed and I can just review the parts that stayed the same. There is probably some code that can be removed because it does not apply to our case, like the asynchronous loading.

We need to copy and adapt the specific tests for loading and saving the data from the store.

As for the JSON format, we can refine it better in future iterations, for the moment the single thing that stands out is that the MIME type or scheme are actually primary keys, so we can probably rewrite it to use object keys, and it would look something like { "application/pdf": { ... }, "text/plain": { ... } }.
Attachment #8796571 - Flags: feedback?(paolo.mozmail)

Comment 13

6 months ago
I've just heard that the JSON store code has been refactored in a generic module:

https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/JSONFile.jsm

So, I think the remaining work in this bug is now limited to implementing the data migration, which needs to be done after bug 1287660 has landed.
(Assignee)

Comment 14

5 months ago
Created attachment 8802852 [details] [diff] [review]
1020-WIP

Hi Paolo,
In this patch, I use the generic module and revise the JSON format as the following. Now I am doing the implementation based on current JSON format.

+ * {                                                                           
+ *   "mimetypes":{                                                             
+ *     "application/pdf":{                                                     
+ *       "description":"PDF document",                                         
+ *       "action":3,                                                           
+ *       "askBeforeHandle":false,                                              
+ *       "fileExtensions":["pdf"],                                             
+ *       "preferredHandler":{},                                                
+ *       "possibleHandlers":[],                                                
+ *     },                                                                      
+ *    "application/xml":{                                                      
+ *      "description":"XML document",                                          
+ *      "action":2,                                                            
+ *      "askBeforeHandle":false,                                               
+ *      "fileExtensions":["xml","xsd","xsl"],                                  
+ *      "preferredHandler":{"name":"subl","description":"","type":0,"path":"/usr/bin/subl"},
+ *      "possibleHandlers":[{"name":"subl","description":"","type":0,"path":"/usr/bin/subl"}]
+ *     },                                                                      
+ *     "application/x-gzip":{ (...) },                                         
+ *     "application/zip": { (...) },                                           
+ *  },                                                                         
+ *  "schemes":{                                                                
+ *    "webcal":{                                                               
+ *      "description":"",                                                      
+ *      "action":2,                                                            
+ *      "askBeforeHandle":true,                                                
+ *      "preferredHandler":                                                    
+ *        {"name":"30 Boxes","description":null,"type":1,"uriTemplate":"http://30boxes.com/external/widget?refer=ff&url=%s"},
+ *      "possibleHandlers":[                                                   
+ *        {"name":"30 Boxes","description":null,"type":1,"uriTemplate":"http://30boxes.com/external/widget?refer=ff&url=%s"},
+ *        {"name":"30 Boxes","description":null,"type":1,"uriTemplate":"https://30boxes.com/external/widget?refer=ff&url=%s"}
+ *      ],                                                                     
+ *    },                                                                       
+ *    "ircs":{ (...) },                                                        
+ *    "mailto":{ (...) }                                                       
+ *  }
Attachment #8796571 - Attachment is obsolete: true
Attachment #8802852 - Flags: review?(paolo.mozmail)
(Assignee)

Comment 15

5 months ago
Created attachment 8802855 [details]
mimeTypes.json produced by (attachment 8802852 [details] [diff] [review])

Here is the mimeTypes.json produced by the latest patch.
Attachment #8796576 - Attachment is obsolete: true

Comment 16

5 months ago
Comment on attachment 8802852 [details] [diff] [review]
1020-WIP

Review of attachment 8802852 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the proof of concept!

The format looks mostly good to me, what we can do to save some space is to avoid writing data when it's empty or the default value, but we can work on this later.

Some things, like the need to save the description in the file, will become clearer as we work on the feature. Unless we find out specifically that we don't need something, we should just keep the same data as mimeTypes.rdf, like this patch shows now.

I'm not sure whether we still need this JSM to hold the singleton Handler Store, because the new nsHandlerService would already be an XPCOM service and we may able to just get a reference to the JSONFile data from it.

We might still need this JSM only if there is some logic here in addition to import, this mainly depends on how bug 1287660 is implemented.
Attachment #8802852 - Flags: review?(paolo.mozmail) → feedback+

Comment 17

5 months ago
Per comment 13, updating the subject ans setting the dependency.
Depends on: 1287660
Summary: Implement JSON store (HandlerStore.jsm) for nsIHandlerService → Implement migration from mimeTypes.rdf to the JSON Handler Store

Comment 18

a month ago
Comment on attachment 8802852 [details] [diff] [review]
1020-WIP

When bug 1287660 lands, in this bug we should:
 - Rename the ContractID of "handler-service" to "handler-service-rdf"
 - Rename the ContractID of "handler-service-json" to "handler-service"
 - Implement preferences controlling the migration, like the Logins code does
 - Add code to the JSON service to enumerate the types from the RDF service, and store them again using its own methods
 - Add specific tests for the migration

We should keep the code that gets the localized defaults in the RDF-based service, even if we have the same code in the JSON-based service. This is because we don't want old defaults to overwrite the new defaults imported by the JSON-based service, if the localized version number is updated in the version of Firefox where we do the migration, or if the import is re-requested by resetting the migration preference.
Attachment #8802852 - Attachment is obsolete: true
Comment hidden (mozreview-request)
(Assignee)

Updated

a month ago
Attachment #8802855 - Attachment is obsolete: true
(Assignee)

Comment 20

a month ago
(In reply to Alphan Chen [:alchen] from comment #19)
> Created attachment 8840320 [details]
> Bug 1287658 - Implement migration from mimeTypes.rdf to the JSON Handler
> Store.
> 
> Review commit: https://reviewboard.mozilla.org/r/114822/diff/#index_header
> See other reviews: https://reviewboard.mozilla.org/r/114822/

In this patch, we can pass all the test under "uriloader/exthandler/tests" by using the service of JSON backend.

Comment 21

a month ago
mozreview-review
Comment on attachment 8840320 [details]
Bug 1287658 - Implement migration from mimeTypes.rdf to the JSON Handler Store.

https://reviewboard.mozilla.org/r/114822/#review116350

Looks good! I'll review the migration test in more detail later.

In the meantime we should figure out the Android failures in bug 1287660, since that bug should land first.

::: uriloader/exthandler/nsHandlerService-json.js:151
(Diff revision 1)
> +    try {
> +      if (Services.prefs.getBoolPref("gecko.handlerService.migration")) {
> +        return;
> +      }
> +    } catch (ex) {
> +      // If the preference does not exist, we need to import.
> +    }
> +

Please add a check and return early if "mimeTypes.rdf" does not exist, so we don't load the RDF service at all on new profiles. This should go after the preference check, so we also don't try to access the file at all if the migration already happened.

::: uriloader/exthandler/nsHandlerService-json.js:152
(Diff revision 1)
>      return false;
>    },
>  
> +  _migration() {
> +    try {
> +      if (Services.prefs.getBoolPref("gecko.handlerService.migration")) {

"gecko.handlerService.migrated" is clearer, and we also use a similar name for various other preferences controlling migration.

::: uriloader/exthandler/nsHandlerService-json.js:162
(Diff revision 1)
> +      gHandlerServiceRDF.fillHandlerInfo(handlerInfo, null);
> +      this.store(handlerInfo);

Add a try-catch with Cu.reportError(ex) so that if something goes wrong with some handlers, we still migrate the others. The import can be retried later by flipping the preference.
Attachment #8840320 - Flags: review?(paolo.mozmail)
Comment hidden (mozreview-request)

Updated

12 days ago
Depends on: 1100069

Comment 23

12 days ago
The patch looks good, but as I mentioned I'm looking into improving the test coverage and the file format. I'll move some helpers to "head.js" so they can be reused by this test. I'll post a patch when ready.

Comment 24

12 days ago
I just noticed another thing while working on this. In the original RDF implementation, the applications are keyed using _getPossibleHandlerAppID, this means we never store the same handler application twice. Something similar happens with file extensions, that are stored with the behavior of a set.

In the JSON implementation we don't remove duplicate possible handlers or file extensions when storing an nsIHandlerInfo. When loading, we do that same as the RDF service and we just call appendElement on possibleApplicationHandlers, or appendExtension for file extensions.

This means that if we have a file extension that is added to nsIHandlerInfo at the operating system level, it will likely be duplicated every time we do a round trip through store() and fillHandlerInfo(). This should be fixed. Preferred handler applications are only added by the browser, and current consumers check for duplicates, but it's better to keep a consistent behavior with the RDF service to avoid duplicates in case future consumers don't check.
(Assignee)

Comment 25

11 days ago
(In reply to :Paolo Amadini from comment #24)
> I just noticed another thing while working on this. In the original RDF
> implementation, the applications are keyed using _getPossibleHandlerAppID,
> this means we never store the same handler application twice. Something
> similar happens with file extensions, that are stored with the behavior of a
> set.
> 
> In the JSON implementation we don't remove duplicate possible handlers or
> file extensions when storing an nsIHandlerInfo. When loading, we do that
> same as the RDF service and we just call appendElement on
> possibleApplicationHandlers, or appendExtension for file extensions.
> 
> This means that if we have a file extension that is added to nsIHandlerInfo
> at the operating system level, it will likely be duplicated every time we do
> a round trip through store() and fillHandlerInfo(). This should be fixed.
> Preferred handler applications are only added by the browser, and current
> consumers check for duplicates, but it's better to keep a consistent
> behavior with the RDF service to avoid duplicates in case future consumers
> don't check.

You are right.
Do you prefer to open a new bug for it?
Or we can just fix it with the patch of migration.

Comment 26

5 days ago
I have noticed a few other issues while working on adding more test coverage, so I'll file individual bugs for addressing them.

Updated

5 days ago
Depends on: 1350008

Updated

5 days ago
Depends on: 1350026
You need to log in before you can comment on or make changes to this bug.