Closed Bug 1287658 Opened 8 years ago Closed 7 years ago

Migrate from "mimeTypes.rdf" to "handlers.json"

Categories

(Firefox :: File Handling, defect, P1)

defect

Tracking

()

RESOLVED FIXED
Firefox 55
Tracking Status
firefox55 --- fixed

People

(Reporter: alchen, Assigned: Paolo)

References

Details

(Whiteboard: [CHE-MVP])

Attachments

(1 file, 5 obsolete files)

The mimeType.json is a replacement for mimeType.rdf.
Blocks: 474043
Assignee: nobody → alchen
Whiteboard: [CHE-MVP]
Set priority to P1 because we're going to implement this feature on Firefox 50.
@ Development Schedule: https://docs.google.com/spreadsheets/d/1WznJUt7lLvcZwwry0yvJ_-hw0ZAXTxOOvmYCw74uGKY/edit#gid=0
Priority: -- → P1
I'm not sure exactly how to interpret the priorities assigned to the individual bugs here, but I'd like to clarify that the investigation in bug 1287673 comment 2 should be completed before any implementation work begins. The outline of the work attached to bug 474043 is a start, but it isn't complete without the information about the callers.

To reiterate, the information we need to add to the document is a full list of known callers in the mozilla-central tree together with the detail of which information is actually retrieved or stored, not only which method is called.
Flags: needinfo?(wehuang)
Hi Paolo, sorry for late following here

1. For the Bugzilla field "Importance" I think William had assigned the priority according to the estimated schedule, following the process in Emma's presentation[1] (I thought it might be mentioned somewhere more formally, but this slides is the only one I know to introduce it). As for our "Content Handling Enhancement" project we simply use 2 levels of priority, which are MVP (marked "[CHE-MVP]") and backlog (marked "[CHE-BACKLOG]" ) for now.

[1]https://docs.google.com/presentation/d/1qlXlDsMnvcpA5ppJVJWIBaBDPEyw5hbjU-Q7wjjrZQM/edit#slide=id.g135c9e38c0_0_0

(In reply to :Paolo Amadini from comment #2)
> I'm not sure exactly how to interpret the priorities assigned to the
> individual bugs here, 

2. Thanks for the suggestion, but if my understanding is correct the work here has it's own value (as pointed out in Bug 474043 comment 19), and in the end *may* or *may not* be impacted by the study in Bug 1287673, hence I tend to have the overall job divided into smaller pieces that can be worked separately (even we might have some overhead work eventually). In other words, this bug can still move on while not being blocked by others (e.g., Bug 1287673), in the mean time we also keep the study as you suggested in bug 1287673 comment 2 and discuss with you (we may consider having another person on this in parallel). How do you think?

> but I'd like to clarify that the investigation in bug
> 1287673 comment 2 should be completed before any implementation work begins.
> The outline of the work attached to bug 474043 is a start, but it isn't
> complete without the information about the callers.
> 
> To reiterate, the information we need to add to the document is a full list
> of known callers in the mozilla-central tree together with the detail of
> which information is actually retrieved or stored, not only which method is
> called.
Flags: needinfo?(wehuang) → needinfo?(paolo.mozmail)
(In reply to Wesly Huang (Firefox EPM) from comment #3)

Hi Wesly, thanks for the clarifications!

> 2. Thanks for the suggestion, but if my understanding is correct the work
> here has it's own value (as pointed out in Bug 474043 comment 19)

I'm actually fine with considering the removal of the RDF API calls the primary goal of this project, so the work here has this value indeed. With this in mind, bug 474043 comment 21 clarifies that reimplementing the nsIHandlerService API exactly as it is now might not be the fastest path to that goal.

What the comment mentions is the API "doing something particularly dumb", what it means for example is that the API might have methods that no one actually calls, or that those methods retrieve much more data than what is actually used.

This is essential information we have to keep into account when implementing and reviewing the code changes, also to ensure we have the necessary automated test coverage.

> , and in
> the end *may* or *may not* be impacted by the study in Bug 1287673, hence I
> tend to have the overall job divided into smaller pieces that can be worked
> separately (even we might have some overhead work eventually). In other
> words, this bug can still move on while not being blocked by others (e.g.,
> Bug 1287673), in the mean time we also keep the study as you suggested in
> bug 1287673 comment 2 and discuss with you (we may consider having another
> person on this in parallel). How do you think?

I agree that the investigation about synchronous and asynchronous callers, which is specifically the subject of bug 1287673, can be continued in parallel if you want. Regardless, for this bug, the research work about the callers of nsIHandlerService will be needed at some point.

Maybe instead of creating this list of callers in advance, you'd rather spend more time in studying the code and write your proposal in terms of a patch with some code changes, including automated tests. It can also be an effective starting point for the discussion. Sometimes a regression test can be a better indication of caller requirements than the actual callers.
Flags: needinfo?(paolo.mozmail)
(In reply to :Paolo Amadini from comment #4)
> This is essential information we have to keep into account when implementing
> and reviewing the code changes, also to ensure we have the necessary
> automated test coverage.
> 
> I agree that the investigation about synchronous and asynchronous callers,
> which is specifically the subject of bug 1287673, can be continued in
> parallel if you want. Regardless, for this bug, the research work about the
> callers of nsIHandlerService will be needed at some point.

Thanks Paolo, if my read is right, your view is the study in Bug 1287673 does not block the "beginning" of this Bug 1287658, however the study result, in some extent, is still a "dependency" for the "end" (let's say, sending patch for review) of the work here, because depends on the study of Bug 1287673 we may want to do more in this Bug 1287658 at once (also there are chances we don't need more). Is this correct?

Discussed with the team and think we are ok to proceed as above (if we didn't get you wrong), I'll get Alphan's support to summarize the known actions so we can get agreed upon the steps of this plan.

@Alphan, would you help? Thanks!
Flags: needinfo?(paolo.mozmail)
Flags: needinfo?(alchen)
(In reply to Wesly Huang (Firefox EPM) from comment #5)
> Is this correct?

Yeah, maybe we'll find out we can do more at once, or maybe that we don't need to do some parts.

Thanks!
Flags: needinfo?(paolo.mozmail)
I will look into existed test case and think about the regression test.
After the investigation, I will write a new proposal with automated tests.(the suggestion of bug 474043 comment 23 will also be included)
Then we can have a discussion based on the document and start with the fundamental work.

For bug 1287673, we will do it in parallel.
Flags: needinfo?(alchen)
Thanks Alphan!
Summary: Add mimeType.json access ability into nsHandlerService.js → Implement JSON store (HandlerStore.jsm) for nsIHandlerService
According to the latest discussion, I set the goal of this bug to JSON store(HandlerStore.jsm) implementation. JSON store is created for the usage of new nsIHandlerService implementation. We will use mimeType.json as a database for nsIHandlerService.

In "HandlerStore.jsm", we will do read synchronously and do write asynchronously.(similar mechanism as bug 853539)
Attached patch 0930-bug1287658-WIP (obsolete) — Splinter Review
Here is current WIP patch.
In this patch, we can use jsonStore.load() to load mimeTypes.json.
We will import data from mimeTypes.rdf if there is no data loaded in load().

Hi Paolo,
could you take a look and see whether it is what you expected?
Attachment #8796571 - Flags: feedback?(paolo.mozmail)
The following is current format.

{
"mimetypes":[
  {"type":"application/pdf","description":"","action":3,"askBeforeHandle":false,"LocalHandlerApp":[],"WebHandlerApp":[],"WebContentHandler":[]}
],
"schemes":
[
{"type":"webcal","description":"","action":4,"askBeforeHandle":true,"LocalHandlerApps":[],"WebHandlerApps":["https://30boxes.com/external/widget?refer=ff&url=%s"],"WebContentHandlers":[]}}
]
}
Comment on attachment 8796571 [details] [diff] [review]
0930-bug1287658-WIP

Hi Alphan, this is just some initial feedback on the direction, and I may have more after the first revision.

It's very useful to have some sample migration code here as a way to see how the format is actually used. In the end we'd probably want to land this migration code separately, because it will use a different way to access the old nsIHandlerService, after the new nsIHandlerService is implemented.

Since you mostly used the login store as a starting point, it can be useful if you can do an "hg copy" of the file so that we can more easily see the parts that changed and I can just review the parts that stayed the same. There is probably some code that can be removed because it does not apply to our case, like the asynchronous loading.

We need to copy and adapt the specific tests for loading and saving the data from the store.

As for the JSON format, we can refine it better in future iterations, for the moment the single thing that stands out is that the MIME type or scheme are actually primary keys, so we can probably rewrite it to use object keys, and it would look something like { "application/pdf": { ... }, "text/plain": { ... } }.
Attachment #8796571 - Flags: feedback?(paolo.mozmail)
I've just heard that the JSON store code has been refactored in a generic module:

https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/JSONFile.jsm

So, I think the remaining work in this bug is now limited to implementing the data migration, which needs to be done after bug 1287660 has landed.
Attached patch 1020-WIP (obsolete) — Splinter Review
Hi Paolo,
In this patch, I use the generic module and revise the JSON format as the following. Now I am doing the implementation based on current JSON format.

+ * {                                                                           
+ *   "mimetypes":{                                                             
+ *     "application/pdf":{                                                     
+ *       "description":"PDF document",                                         
+ *       "action":3,                                                           
+ *       "askBeforeHandle":false,                                              
+ *       "fileExtensions":["pdf"],                                             
+ *       "preferredHandler":{},                                                
+ *       "possibleHandlers":[],                                                
+ *     },                                                                      
+ *    "application/xml":{                                                      
+ *      "description":"XML document",                                          
+ *      "action":2,                                                            
+ *      "askBeforeHandle":false,                                               
+ *      "fileExtensions":["xml","xsd","xsl"],                                  
+ *      "preferredHandler":{"name":"subl","description":"","type":0,"path":"/usr/bin/subl"},
+ *      "possibleHandlers":[{"name":"subl","description":"","type":0,"path":"/usr/bin/subl"}]
+ *     },                                                                      
+ *     "application/x-gzip":{ (...) },                                         
+ *     "application/zip": { (...) },                                           
+ *  },                                                                         
+ *  "schemes":{                                                                
+ *    "webcal":{                                                               
+ *      "description":"",                                                      
+ *      "action":2,                                                            
+ *      "askBeforeHandle":true,                                                
+ *      "preferredHandler":                                                    
+ *        {"name":"30 Boxes","description":null,"type":1,"uriTemplate":"http://30boxes.com/external/widget?refer=ff&url=%s"},
+ *      "possibleHandlers":[                                                   
+ *        {"name":"30 Boxes","description":null,"type":1,"uriTemplate":"http://30boxes.com/external/widget?refer=ff&url=%s"},
+ *        {"name":"30 Boxes","description":null,"type":1,"uriTemplate":"https://30boxes.com/external/widget?refer=ff&url=%s"}
+ *      ],                                                                     
+ *    },                                                                       
+ *    "ircs":{ (...) },                                                        
+ *    "mailto":{ (...) }                                                       
+ *  }
Attachment #8796571 - Attachment is obsolete: true
Attachment #8802852 - Flags: review?(paolo.mozmail)
Here is the mimeTypes.json produced by the latest patch.
Attachment #8796576 - Attachment is obsolete: true
Comment on attachment 8802852 [details] [diff] [review]
1020-WIP

Review of attachment 8802852 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the proof of concept!

The format looks mostly good to me, what we can do to save some space is to avoid writing data when it's empty or the default value, but we can work on this later.

Some things, like the need to save the description in the file, will become clearer as we work on the feature. Unless we find out specifically that we don't need something, we should just keep the same data as mimeTypes.rdf, like this patch shows now.

I'm not sure whether we still need this JSM to hold the singleton Handler Store, because the new nsHandlerService would already be an XPCOM service and we may able to just get a reference to the JSONFile data from it.

We might still need this JSM only if there is some logic here in addition to import, this mainly depends on how bug 1287660 is implemented.
Attachment #8802852 - Flags: review?(paolo.mozmail) → feedback+
Per comment 13, updating the subject ans setting the dependency.
Depends on: 1287660
Summary: Implement JSON store (HandlerStore.jsm) for nsIHandlerService → Implement migration from mimeTypes.rdf to the JSON Handler Store
Comment on attachment 8802852 [details] [diff] [review]
1020-WIP

When bug 1287660 lands, in this bug we should:
 - Rename the ContractID of "handler-service" to "handler-service-rdf"
 - Rename the ContractID of "handler-service-json" to "handler-service"
 - Implement preferences controlling the migration, like the Logins code does
 - Add code to the JSON service to enumerate the types from the RDF service, and store them again using its own methods
 - Add specific tests for the migration

We should keep the code that gets the localized defaults in the RDF-based service, even if we have the same code in the JSON-based service. This is because we don't want old defaults to overwrite the new defaults imported by the JSON-based service, if the localized version number is updated in the version of Firefox where we do the migration, or if the import is re-requested by resetting the migration preference.
Attachment #8802852 - Attachment is obsolete: true
Attachment #8802855 - Attachment is obsolete: true
(In reply to Alphan Chen [:alchen] from comment #19)
> Created attachment 8840320 [details]
> Bug 1287658 - Implement migration from mimeTypes.rdf to the JSON Handler
> Store.
> 
> Review commit: https://reviewboard.mozilla.org/r/114822/diff/#index_header
> See other reviews: https://reviewboard.mozilla.org/r/114822/

In this patch, we can pass all the test under "uriloader/exthandler/tests" by using the service of JSON backend.
Comment on attachment 8840320 [details]
Bug 1287658 - Implement migration from mimeTypes.rdf to the JSON Handler Store.

https://reviewboard.mozilla.org/r/114822/#review116350

Looks good! I'll review the migration test in more detail later.

In the meantime we should figure out the Android failures in bug 1287660, since that bug should land first.

::: uriloader/exthandler/nsHandlerService-json.js:151
(Diff revision 1)
> +    try {
> +      if (Services.prefs.getBoolPref("gecko.handlerService.migration")) {
> +        return;
> +      }
> +    } catch (ex) {
> +      // If the preference does not exist, we need to import.
> +    }
> +

Please add a check and return early if "mimeTypes.rdf" does not exist, so we don't load the RDF service at all on new profiles. This should go after the preference check, so we also don't try to access the file at all if the migration already happened.

::: uriloader/exthandler/nsHandlerService-json.js:152
(Diff revision 1)
>      return false;
>    },
>  
> +  _migration() {
> +    try {
> +      if (Services.prefs.getBoolPref("gecko.handlerService.migration")) {

"gecko.handlerService.migrated" is clearer, and we also use a similar name for various other preferences controlling migration.

::: uriloader/exthandler/nsHandlerService-json.js:162
(Diff revision 1)
> +      gHandlerServiceRDF.fillHandlerInfo(handlerInfo, null);
> +      this.store(handlerInfo);

Add a try-catch with Cu.reportError(ex) so that if something goes wrong with some handlers, we still migrate the others. The import can be retried later by flipping the preference.
Attachment #8840320 - Flags: review?(paolo.mozmail)
Depends on: 1100069
The patch looks good, but as I mentioned I'm looking into improving the test coverage and the file format. I'll move some helpers to "head.js" so they can be reused by this test. I'll post a patch when ready.
I just noticed another thing while working on this. In the original RDF implementation, the applications are keyed using _getPossibleHandlerAppID, this means we never store the same handler application twice. Something similar happens with file extensions, that are stored with the behavior of a set.

In the JSON implementation we don't remove duplicate possible handlers or file extensions when storing an nsIHandlerInfo. When loading, we do that same as the RDF service and we just call appendElement on possibleApplicationHandlers, or appendExtension for file extensions.

This means that if we have a file extension that is added to nsIHandlerInfo at the operating system level, it will likely be duplicated every time we do a round trip through store() and fillHandlerInfo(). This should be fixed. Preferred handler applications are only added by the browser, and current consumers check for duplicates, but it's better to keep a consistent behavior with the RDF service to avoid duplicates in case future consumers don't check.
(In reply to :Paolo Amadini from comment #24)
> I just noticed another thing while working on this. In the original RDF
> implementation, the applications are keyed using _getPossibleHandlerAppID,
> this means we never store the same handler application twice. Something
> similar happens with file extensions, that are stored with the behavior of a
> set.
> 
> In the JSON implementation we don't remove duplicate possible handlers or
> file extensions when storing an nsIHandlerInfo. When loading, we do that
> same as the RDF service and we just call appendElement on
> possibleApplicationHandlers, or appendExtension for file extensions.
> 
> This means that if we have a file extension that is added to nsIHandlerInfo
> at the operating system level, it will likely be duplicated every time we do
> a round trip through store() and fillHandlerInfo(). This should be fixed.
> Preferred handler applications are only added by the browser, and current
> consumers check for duplicates, but it's better to keep a consistent
> behavior with the RDF service to avoid duplicates in case future consumers
> don't check.

You are right.
Do you prefer to open a new bug for it?
Or we can just fix it with the patch of migration.
I have noticed a few other issues while working on adding more test coverage, so I'll file individual bugs for addressing them.
Depends on: 1350008
Depends on: 1350026
Something I just noticed, that should be fixed in this bug, is that the _deleteDatasourceFile function in head.js only deletes the RDF store. The logic should be rewritten considering that the JSON back-end is now the default.
Depends on: 1355582
Depends on: 1355585
Depends on: 986975
Eventually I'm taking this bug too, because I've already worked on the dependencies and this code is still fresh in my mind.

This way you can continue to focus on the Payment Request API without having to switch context for just one bug.
Assignee: alchen → paolo.mozmail
Status: NEW → ASSIGNED
Summary: Implement migration from mimeTypes.rdf to the JSON Handler Store → Migrate from "mimeTypes.rdf" to "handlers.json"
Attachment #8840320 - Attachment is obsolete: true
Attachment #8840320 - Flags: review?(paolo.mozmail)
Comment on attachment 8858502 [details]
Bug 1287658 - Migrate from "mimeTypes.rdf" to "handlers.json".

https://reviewboard.mozilla.org/r/130470/#review138510

A couple questions before the final review:
1. is this easily flippable through a pref, in case things go weird before a release? if not, how hard would it be to provide that? Do we have an emergency plan?
2. Does this support the upgrade/downgrade/upgrade path, that is the most common for users who hit a show-stopper bug when updating to a new version? what happens to data added to the old backend when upgrading "again"? From the patch looks like it's just lost, may that cause any kind of issues?
Flags: needinfo?(paolo.mozmail)
(In reply to Marco Bonardo [::mak] from comment #31)
> 1. is this easily flippable through a pref, in case things go weird before a
> release? if not, how hard would it be to provide that? Do we have an
> emergency plan?

No, this isn't controlled by a preference. Doing this would be invasive, because the C++ components reference the JavaScript ones using the ContractID, and these are statically associated in the XPCOM registration manifests.

> 2. Does this support the upgrade/downgrade/upgrade path, that is the most
> common for users who hit a show-stopper bug when updating to a new version?

It's supported manually. This is the same approach we took with the Password Manager back-end migration.

> what happens to data added to the old backend when upgrading "again"? From
> the patch looks like it's just lost, may that cause any kind of issues?

This is covered by the tests in the patch. When flipping the migration preference manually, any new records are kept, and the data from the RDF store is imported again.
Flags: needinfo?(paolo.mozmail)
(In reply to :Paolo Amadini from comment #32)
> (In reply to Marco Bonardo [::mak] from comment #31)
> > 1. is this easily flippable through a pref, in case things go weird before a
> > release? if not, how hard would it be to provide that? Do we have an
> > emergency plan?
> 
> No, this isn't controlled by a preference. Doing this would be invasive,
> because the C++ components reference the JavaScript ones using the
> ContractID, and these are statically associated in the XPCOM registration
> manifests.

Is backout an option? what would be the consequences?

> > 2. Does this support the upgrade/downgrade/upgrade path, that is the most
> > common for users who hit a show-stopper bug when updating to a new version?
> 
> It's supported manually. This is the same approach we took with the Password
> Manager back-end migration.

What does "manually" mean? most of our users are unlikely to even understand what's a profile, so I hope this doesn't mean copying profile files around. Same for prefs, we can't rely on users flipping prefs.

So the question boils down to "What happens to a user that goes through the upgrade/downgrade/upgrade path"? Is there anything dangerous in that path, provided there may be dataloss?

Sorry for the many questions, but considered we don't have Aurora anymore, we must be careful at doing migrations, since Beta backouts are behind the corner at this point.
(In reply to Marco Bonardo [::mak] from comment #33)
> Is backout an option? what would be the consequences?

Yes, and the result would be that choices made in the meantime in the Downloads dialog, like "always use this application to open this file type", would not be remembered.

> > > 2. Does this support the upgrade/downgrade/upgrade path, that is the most
> > > common for users who hit a show-stopper bug when updating to a new version?
> > 
> > It's supported manually. This is the same approach we took with the Password
> > Manager back-end migration.
> 
> What does "manually" mean? most of our users are unlikely to even understand
> what's a profile, so I hope this doesn't mean copying profile files around.
> Same for prefs, we can't rely on users flipping prefs.

It means that in the default case, if you upgrade and then downgrade, the choices you make in the downgraded version will not be preserved when you upgrade again. We don't want the performance impact of checking whether the RDF data store has been touched at every startup in the new version.

If a user wants to get the changes back, they may have to flip preferences. This would not be required for most users, as they will stay on the latest versions once they have upgraded, and won't use profiles with multiple versions.

I'll note that the preferences stored here are about choices that can be very easily repeated if needed, of the "don't ask me again" type. There is some file type and extension association information too, and the extensive unit tests ensure that we don't break the API expectations during the upgrade, but this is also data that is automatically constructed by the browser, and not user data like passwords or downloaded files.
Comment on attachment 8858502 [details]
Bug 1287658 - Migrate from "mimeTypes.rdf" to "handlers.json".

https://reviewboard.mozilla.org/r/130470/#review138828
Attachment #8858502 - Flags: review?(mak77) → review+
Pushed by paolo.mozmail@amadzone.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/97a69c3ff07c
Migrate from "mimeTypes.rdf" to "handlers.json". r=mak
https://hg.mozilla.org/mozilla-central/rev/97a69c3ff07c
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
Depends on: 1363425
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: