Closed Bug 1370429 Opened 7 years ago Closed 7 years ago

[Form Autofill] Implement the infrastructure of Telephone parser to support the various combination of telephone fields

Categories

(Toolkit :: Form Manager, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: selee, Assigned: selee)

References

(Blocks 1 open bug)

Details

(Whiteboard: [form autofill:M3] )

Attachments

(6 files, 6 obsolete files)

59 bytes, text/x-review-board-request
MattN
: review+
Details
59 bytes, text/x-review-board-request
MattN
: review+
Details
59 bytes, text/x-review-board-request
MattN
: review+
Details
59 bytes, text/x-review-board-request
MattN
: review+
Details
59 bytes, text/x-review-board-request
MattN
: review+
Details
59 bytes, text/x-review-board-request
MattN
: review+
Details
OfficeDepot[1] provides three tel related fields for users to fill these fields:
* tel-area-code
* tel-local-prefix
* tel-local-suffix

FormAutofillHeuristics should be able to recognize these fields and satisfy this combination.

[1] http://searchfox.org/mozilla-central/rev/d441cb24482c2e5448accaf07379445059937080/browser/extensions/formautofill/test/fixtures/third_party/OfficeDepot/ShippingAddress.html#246-261


tel-area-code, tel-local-prefix, and tel-local-suffix
Whiteboard: [form autofill:M3] ETA:612 → [form autofill:M3]
Here is the WIP patch for tel related fields:
https://github.com/weilonge/gecko/commits/seanlee/FormFill/Bug1370429

Using github here since it's not well organized yet.

Some notes:
1. For commit `grammar matching algorithm.`, `test_matchTelGrammar.js` is for verifying the algorithm. It also can satisfy OfficeDepot case except `tel-extension`.
https://github.com/weilonge/gecko/commit/e16c3eb55b8fb18c7a6b9c3cd8a370a8fe337515

2. For commit `tel-extension`, The test of OfficeDepot can be satisified with all tel related fields.
https://github.com/weilonge/gecko/commit/09e6ce58afc66b79d7da6a7ab71dcd4a04752057

I will refine the hunk of each commit later.
Summary: [Form Autofill] Implement offline heuristics to determine these Phone fields: tel-area-code, tel-local-prefix, and tel-local-suffix → [Form Autofill] Implement the infrastructure of Telephone parser to support the various combination of telephone fields
Comment on attachment 8877189 [details]
Bug 1370429 - Part 1: Implement FieldScanner and refactor getFormInfo function.

https://reviewboard.mozilla.org/r/148564/#review157558

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:52
(Diff revision 1)
>      EMAIL: ["email"],
>    },
>  
>    RULES: null,
>  
> +  _parseTelFields(elements, cursor, fieldDetails) {

A JSDoc comment would really ease understanding this code

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:54
(Diff revision 1)
> +    let detailBegin = fieldDetails.length - 1;
> +    let detailEnd = detailBegin;

detailBegin is only used in the line below it so the assigment could just move there. I guess the purpose of this method is a bit unclear to me in part 1 without a jsdoc

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:54
(Diff revision 1)
> +    let detailBegin = fieldDetails.length - 1;
> +    let detailEnd = detailBegin;

I think these variable names could be improved or else a comment added. I don't understand what they are tracking.

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:106
(Diff revision 1)
> +          addressType: info.addressType,
> +          contactType: info.contactType,
> +          fieldName: info.fieldName,
> +          elementWeakRef: Cu.getWeakReference(element),
> +        });
> +        dump("enter to _parseTelFields " + i + "\n");

use `log.…` instead if this is useful
Attachment #8877189 - Flags: review?(MattN+bmo)
Comment on attachment 8877190 [details]
Bug 1370429 - Part 3: Import the telephone grammar list from Chromium.

https://reviewboard.mozilla.org/r/148566/#review157566

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:242
(Diff revision 1)
>      }
>  
>      return null;
>    },
> +
> +  PHONE_FIELD_GRAMMARS: [

Add this file to the Chromium section of toolkit/content/license.html
Attachment #8877190 - Flags: review?(MattN+bmo) → review+
Comment on attachment 8877191 [details]
Bug 1370429 - Part 4: Implement grammar list matching algorithm for telephone fields.

https://reviewboard.mozilla.org/r/148568/#review157568

Please add more comments and check support for adjacent phone numbers. We can discuss more this week.

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:52
(Diff revision 1)
>      EMAIL: ["email"],
>    },
>  
>    RULES: null,
>  
> +  _matchTelGrammar(fieldDetails, start, end) {

Nit: `_matchPhoneGrammar`

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:52
(Diff revision 1)
>      EMAIL: ["email"],
>    },
>  
>    RULES: null,
>  
> +  _matchTelGrammar(fieldDetails, start, end) {

Please add a JSDoc comment to document the arguments and return value meanings.

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:55
(Diff revision 1)
>    RULES: null,
>  
> +  _matchTelGrammar(fieldDetails, start, end) {
> +    const GRAMMARS = this.PHONE_FIELD_GRAMMARS;
> +
> +    for (let i = 0;  i < GRAMMARS.length; i++) {

Nit: There is an extra space

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:56
(Diff revision 1)
>  
> +  _matchTelGrammar(fieldDetails, start, end) {
> +    const GRAMMARS = this.PHONE_FIELD_GRAMMARS;
> +
> +    for (let i = 0;  i < GRAMMARS.length; i++) {
> +      // Reset the cursor of fieldDetails to the begin.

Nit: "beginning"

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:64
(Diff revision 1)
> +      let ruleStart = i;
> +      for (; i < GRAMMARS.length && GRAMMARS[i][0]; i++, cursor++) {
> +        if (!fieldDetails[cursor]) {
> +          break;
> +        }
> +        let element = fieldDetails[cursor].elementWeakRef.get();

Move this directly above the line that uses it?

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:68
(Diff revision 1)
> +        }
> +        let element = fieldDetails[cursor].elementWeakRef.get();
> +        if (GRAMMARS[i][0] != fieldDetails[cursor].fieldName) {
> +          break;
> +        }
> +        if (GRAMMARS[i][2] && (!element.maxLength || GRAMMARS[i][2] < element.maxLength)) {

`element` can be null if the element is removed and GC'd/CC'd. Are you fine with this throwing?

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:93
(Diff revision 1)
> +      }
> +    }
> +    return null;
> +  },
> +
>    _parseTelFields(elements, cursor, fieldDetails) {

Nit: `_parsePhoneFields`

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:121
(Diff revision 1)
> +    let matchingResult = this._matchTelGrammar(fieldDetails,
> +                                               detailBegin,
> +                                               detailEnd);
> +
> +    if (matchingResult) {

IIUC, this approach doesn't work for adjacent phone number fields

::: browser/extensions/formautofill/test/unit/test_matchTelGrammar.js:101
(Diff revision 1)
> +      genCase("tel", 3),
> +      genCase("tel", 3),
> +      genCase("tel", 4),

Please add test cases for adjacent telephone number fields like we discussed:

e.g. 3,3,4,3,3,4 (all "tel")

::: browser/extensions/formautofill/test/unit/xpcshell.ini:32
(Diff revision 1)
>  [test_getFormInputDetails.js]
>  [test_getInfo.js]
>  [test_isCJKName.js]
>  [test_isFieldEligibleForAutofill.js]
>  [test_markAsAutofillField.js]
> +[test_matchTelGrammar.js]

`test_matchPhoneGrammar`
Attachment #8877191 - Flags: review?(MattN+bmo)
Comment on attachment 8877192 [details]
Bug 1370429 - Part 5: Add tel-extension support.

https://reviewboard.mozilla.org/r/148570/#review157580

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:55
(Diff revision 1)
> +  _parseAndFillWithRegexp(fieldDetail, fieldName) {
> +    let element = fieldDetail.elementWeakRef.get();

Add a JSDoc comment for any new non-trivial methods please.

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:161
(Diff revision 1)
> +      if (detailCursor <= detailEnd) {
> +        // Some fields are not matched to a rule yet, so let's see if it's
> +        // a tel-extension field.
> +        this._parseAndFillWithRegexp(fieldDetails[detailCursor],
> +                                     "tel-extension");
> +      }

Is this what Chromium does? Or do they have it part of their grammars? Can you link me to this?

This is related to the problem for adjacent phone nubmers: Are we going to leave the ones not matched to a rule with a fieldName of "tel"? That may cause problems for our code that ignores fields that are the same type… so maybe we should return an empty fieldName? How does Chromium handle it?
Comment on attachment 8877189 [details]
Bug 1370429 - Part 1: Implement FieldScanner and refactor getFormInfo function.

https://reviewboard.mozilla.org/r/148564/#review157590

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:107
(Diff revision 1)
> +          contactType: info.contactType,
> +          fieldName: info.fieldName,
> +          elementWeakRef: Cu.getWeakReference(element),
> +        });
> +        dump("enter to _parseTelFields " + i + "\n");
> +        let indexOfNonTel = this._parseTelFields(form.elements, i, fieldDetails);

I still think we should apply the grammar list to all fields, not just "tel" ones as it would avoid having to special case "ext" like you are in part 4. Do you think it will be much less efficient?
Comment on attachment 8877193 [details]
Bug 1370429 - Part 6: Modify the test case of targeting web sites to support the telephone related fields.

https://reviewboard.mozilla.org/r/148572/#review157586

I guess this fine since we're not actually going to fill in tel-extension but we should make sure it doesn't interfer with our metrics by counting extension fields.

::: browser/extensions/formautofill/test/unit/heuristics/third_party/test_Walmart.js:36
(Diff revision 1)
>          {"section": "section-payment", "addressType": "", "contactType": "", "fieldName": "given-name"},
>          {"section": "section-payment", "addressType": "", "contactType": "", "fieldName": "family-name"},
>          {"section": "section-payment", "addressType": "", "contactType": "", "fieldName": "cc-number"},
>          {"section": "section-payment", "addressType": "", "contactType": "", "fieldName": "cc-exp-month"},
>          {"section": "section-payment", "addressType": "", "contactType": "", "fieldName": "cc-exp-year"},
> +        {"section": "", "addressType": "", "contactType": "", "fieldName": "tel-extension"}, // FIXME

I guess this is the field @name="brwsrAutofillText" which matches `ext\\b`. Can you mention brwsrAutofillText in the comment so it's easier for someone else to debug later.
Attachment #8877193 - Flags: review?(MattN+bmo) → review+
Comment on attachment 8877192 [details]
Bug 1370429 - Part 5: Add tel-extension support.

https://reviewboard.mozilla.org/r/148570/#review158990

Clearing review since we're changing approaches
Attachment #8877192 - Flags: review?(MattN+bmo)
Comment on attachment 8877191 [details]
Bug 1370429 - Part 4: Implement grammar list matching algorithm for telephone fields.

https://reviewboard.mozilla.org/r/148568/#review158994

Clearing review since we're changing approaches
Attachment #8877191 - Flags: review?(MattN+bmo)
Comment on attachment 8877189 [details]
Bug 1370429 - Part 1: Implement FieldScanner and refactor getFormInfo function.

https://reviewboard.mozilla.org/r/148564/#review158996

Clearing review since we're changing approaches
Attachment #8877189 - Flags: review?(MattN+bmo)
Attachment #8884147 - Attachment is obsolete: true
Hey MattN,

Here is the summary of the changes in the latest patch:
1. Refactor getFormInfo function to support the different parsers.
2. Simplify the algorithm to support one-pass parsing method in `_parsePhoneFields` and `_parseAddressFields`.
3. Each parser can continue the parsing algorithm based on FieldScanner.

The latest patch applies the idea of FieldScanner that we discussed during SF all hands. There are some reasons to implement this:
1. Manage how many fields have been parsed.
2. Keep the index consistency between `fieldDetails` and `elements`.
3. Store the regexp result from `FormAutofillHeuristics.getInfo`. All regexps don't have to be applied again if they did before.

The idea of returning the parsed field details can be covered by FieldScanner as well.

The part 1&2 patches are refactoring the current architecture to support the isolated parser functions, and the following ones are for Telephone fields specifically.
Comment on attachment 8877191 [details]
Bug 1370429 - Part 4: Implement grammar list matching algorithm for telephone fields.

https://reviewboard.mozilla.org/r/148568/#review157568

> IIUC, this approach doesn't work for adjacent phone number fields

The latest design can support the adjacent phone numbers easily because we have indexParsing cursor to indicate the following fields which is not parsed yet.
However, how we handle the duplicated fields will affect the support of the adjacent phone number fields.
When there are two sets phone number found, we can use addressType/contactType to mark them as the different fields.
Do you agree we can implement this in another bug?
Comment on attachment 8877192 [details]
Bug 1370429 - Part 5: Add tel-extension support.

https://reviewboard.mozilla.org/r/148570/#review157580

> Is this what Chromium does? Or do they have it part of their grammars? Can you link me to this?
> 
> This is related to the problem for adjacent phone nubmers: Are we going to leave the ones not matched to a rule with a fieldName of "tel"? That may cause problems for our code that ignores fields that are the same type… so maybe we should return an empty fieldName? How does Chromium handle it?

The latest implementation [1] is modified as Chromium did.
I also change `getInfo` to work like Chromium's parser.

If there is a "tel-extension" field like this:
```HTML
<label for="phoneNumber4-2"> Ext</label>
<input type="tel" id="phoneNumber4-2" name="addrsForm[2].phoneNumber4" value="" maxlength="4" tabindex="10">
```

The original `getInfo` will regconize it as "tel" because the first loop is for id, name, etc. Unfortunately, the id matches "tel" regexp first rather than "tel-extension".

There are two ways to fix this:
1. The getInfo should try to find all possilble strings (id, name, labels) for matching one regexp, then the next regexp. (This is the latest patch did.)
2. Only try to recognize "tel-extension" here.

The option 1 is also similiar how Chromium matches fields.

We can move this part to a seperate bug if you agree.

[1] https://reviewboard.mozilla.org/r/148570/diff/4-5/
[2] http://searchfox.org/mozilla-central/rev/b7e531410ebc1c7d74a78e86a894e69608db318c/browser/extensions/formautofill/test/fixtures/third_party/OfficeDepot/ShippingAddress.html#265-269
Comment on attachment 8877190 [details]
Bug 1370429 - Part 3: Import the telephone grammar list from Chromium.

https://reviewboard.mozilla.org/r/148566/#review163232

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:346
(Diff revision 5)
>      }
>  
>      return null;
>    },
> +
> +  PHONE_FIELD_GRAMMARS: [

Please add a JSDoc comment describing the format of the arrays.
Comment on attachment 8877190 [details]
Bug 1370429 - Part 3: Import the telephone grammar list from Chromium.

https://reviewboard.mozilla.org/r/148566/#review163234

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:443
(Diff revision 5)
> +    // Phone: <phone> (Ext: <ext>)?
> +      // {REGEX_PHONE, FIELD_PHONE, 0},
> +      // {REGEX_SEPARATOR, FIELD_NONE, 0},

This should be the rule we use to match single-field telephone numbers
Comment on attachment 8877189 [details]
Bug 1370429 - Part 1: Implement FieldScanner and refactor getFormInfo function.

https://reviewboard.mozilla.org/r/148564/#review163226

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:41
(Diff revision 5)
> +  get elements() {
> +    return this._elementsWeakRef.get();
> +  }

This doesn't seem like it should be a public API
Attachment #8877189 - Flags: review?(MattN+bmo)
Comment on attachment 8877189 [details]
Bug 1370429 - Part 1: Implement FieldScanner and refactor getFormInfo function.

https://reviewboard.mozilla.org/r/148564/#review163244

Do you have time to discuss this today on Vidyo? The FieldScanner API seems to overlap a lot with FormHandler so I'm not sure it should be a separate class. It seems like we should write down all of the constraints and test cases and ensure that what we're making is meeting those requirements. I still don't see tests covering previous concerns I had:
* Support for consecutive/adjacent phone numbers
* Ensuring that fields that match the `tel` regex but don't match the grammar list don't end up with a lingering `tel` fieldName.
Comment on attachment 8877190 [details]
Bug 1370429 - Part 3: Import the telephone grammar list from Chromium.

https://reviewboard.mozilla.org/r/148566/#review163234

> This should be the rule we use to match single-field telephone numbers

This was because I didn't think it was ideal to intially set the fieldName to tel before the grammar list matched it. We discussed on Vidyo that we can instead overwrite the tel fieldName in `_parsePhoneFields` for that first field if it's not a match to any grammar rules.
Comment on attachment 8877192 [details]
Bug 1370429 - Part 5: Add tel-extension support.

https://reviewboard.mozilla.org/r/148570/#review163882
Attachment #8877192 - Flags: review?(MattN+bmo) → review+
Comment on attachment 8877190 [details]
Bug 1370429 - Part 3: Import the telephone grammar list from Chromium.

https://reviewboard.mozilla.org/r/148566/#review164568

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:361
(Diff revision 7)
> + * :N means field is limited to N characters, otherwise it is unlimited.
> + * (pattern <field>)? means pattern is optional and matched separately.
> + *
> + * This grammar list from Chromium will be enabled partially once we need to
> + * support more cases of Telephone fields.
> + */

Could you take a look this comment though you have given it r+? The comment is copied from Chromium.
Comment on attachment 8877191 [details]
Bug 1370429 - Part 4: Implement grammar list matching algorithm for telephone fields.

https://reviewboard.mozilla.org/r/148568/#review157568

> The latest design can support the adjacent phone numbers easily because we have indexParsing cursor to indicate the following fields which is not parsed yet.
> However, how we handle the duplicated fields will affect the support of the adjacent phone number fields.
> When there are two sets phone number found, we can use addressType/contactType to mark them as the different fields.
> Do you agree we can implement this in another bug?

The latest patch includes the test case of adjacent phone numbers, and please see them in Part 6.
Comment on attachment 8877192 [details]
Bug 1370429 - Part 5: Add tel-extension support.

https://reviewboard.mozilla.org/r/148570/#review157580

> The latest implementation [1] is modified as Chromium did.
> I also change `getInfo` to work like Chromium's parser.
> 
> If there is a "tel-extension" field like this:
> ```HTML
> <label for="phoneNumber4-2"> Ext</label>
> <input type="tel" id="phoneNumber4-2" name="addrsForm[2].phoneNumber4" value="" maxlength="4" tabindex="10">
> ```
> 
> The original `getInfo` will regconize it as "tel" because the first loop is for id, name, etc. Unfortunately, the id matches "tel" regexp first rather than "tel-extension".
> 
> There are two ways to fix this:
> 1. The getInfo should try to find all possilble strings (id, name, labels) for matching one regexp, then the next regexp. (This is the latest patch did.)
> 2. Only try to recognize "tel-extension" here.
> 
> The option 1 is also similiar how Chromium matches fields.
> 
> We can move this part to a seperate bug if you agree.
> 
> [1] https://reviewboard.mozilla.org/r/148570/diff/4-5/
> [2] http://searchfox.org/mozilla-central/rev/b7e531410ebc1c7d74a78e86a894e69608db318c/browser/extensions/formautofill/test/fixtures/third_party/OfficeDepot/ShippingAddress.html#265-269

The test case of tel-extension has been added in `test_collectFormFields.js`.
Comment on attachment 8877193 [details]
Bug 1370429 - Part 6: Modify the test case of targeting web sites to support the telephone related fields.

https://reviewboard.mozilla.org/r/148572/#review164580

::: browser/extensions/formautofill/test/unit/test_collectFormFields.js:221
(Diff revision 7)
> +                   The above case is very similiar to the below one which is
> +                   a case to describe the valid telephone combinations.
> +                   <input id="homePhone1" maxlength="10">
> +                   <input id="mobilePhone2" maxlength="10">
> +                   <input id="officePhone3" maxlength="10">
> +                 -->

This case is to elemiate any telephone fields that do not match to grammar list.
The `#homophone[1-3]` case is very similiar to the above case, and this is quite confusing the parser.

The solution that MattN and I discussed was to remove the un-matched phoe fieldNames, but I forget to consider the above case.

Any un-matched phone fields are still possible to be a valid phone number.

If we enable these two grammar rules:
```
// Ext: <ext>
  // {REGEX_EXTENSION, FIELD_EXTENSION, 0},
  // {REGEX_SEPARATOR, FIELD_NONE, 0},

// Phone: <phone> (Ext: <ext>)?
  // {REGEX_PHONE, FIELD_PHONE, 0},
  // {REGEX_SEPARATOR, FIELD_NONE, 0},
```

we probably can satisfy the cases. However, `REGEX_SEPARATOR` should be able to be determined if we want to use these two rules.

Hey MattN, may I have your suggestions here?
Comment on attachment 8877193 [details]
Bug 1370429 - Part 6: Modify the test case of targeting web sites to support the telephone related fields.

https://reviewboard.mozilla.org/r/148572/#review164580

> This case is to elemiate any telephone fields that do not match to grammar list.
> The `#homophone[1-3]` case is very similiar to the above case, and this is quite confusing the parser.
> 
> The solution that MattN and I discussed was to remove the un-matched phoe fieldNames, but I forget to consider the above case.
> 
> Any un-matched phone fields are still possible to be a valid phone number.
> 
> If we enable these two grammar rules:
> ```
> // Ext: <ext>
>   // {REGEX_EXTENSION, FIELD_EXTENSION, 0},
>   // {REGEX_SEPARATOR, FIELD_NONE, 0},
> 
> // Phone: <phone> (Ext: <ext>)?
>   // {REGEX_PHONE, FIELD_PHONE, 0},
>   // {REGEX_SEPARATOR, FIELD_NONE, 0},
> ```
> 
> we probably can satisfy the cases. However, `REGEX_SEPARATOR` should be able to be determined if we want to use these two rules.
> 
> Hey MattN, may I have your suggestions here?

I agree we should enable the `// Phone: <phone> (Ext: <ext>)?` rule and mentioned that earlier in the bug even. 

> However, REGEX_SEPARATOR should be able to be determined if we want to use these two rules.

I understand what what you're saying here.
Comment on attachment 8877193 [details]
Bug 1370429 - Part 6: Modify the test case of targeting web sites to support the telephone related fields.

https://reviewboard.mozilla.org/r/148572/#review164580

> I agree we should enable the `// Phone: <phone> (Ext: <ext>)?` rule and mentioned that earlier in the bug even. 
> 
> > However, REGEX_SEPARATOR should be able to be determined if we want to use these two rules.
> 
> I understand what what you're saying here.

Oops, I meant I *don't* understand what you're saying there
Comment on attachment 8877193 [details]
Bug 1370429 - Part 6: Modify the test case of targeting web sites to support the telephone related fields.

https://reviewboard.mozilla.org/r/148572/#review165004

::: browser/extensions/formautofill/test/unit/test_collectFormFields.js:221
(Diff revision 7)
> +                   The above case is very similiar to the below one which is
> +                   a case to describe the valid telephone combinations.
> +                   <input id="homePhone1" maxlength="10">
> +                   <input id="mobilePhone2" maxlength="10">
> +                   <input id="officePhone3" maxlength="10">
> +                 -->

I mean we need to implement `REGEX_SEPARATOR` so the rule can be enabled then.
Comment on attachment 8877193 [details]
Bug 1370429 - Part 6: Modify the test case of targeting web sites to support the telephone related fields.

https://reviewboard.mozilla.org/r/148572/#review164580

> Oops, I meant I *don't* understand what you're saying there

I mean we need to implement `REGEX_SEPARATOR` so the rule can be enabled then.
Blocks: 1383058
After discussing the test case with MattN, the latest version has been pushed.
We also found that the info from autocomplete attribute should not be modified by any parser or any other codes, so bug 1383058 should cover the design.
Attachment #8877189 - Attachment is obsolete: true
Attachment #8877189 - Flags: review?(MattN+bmo)
Attachment #8884148 - Attachment is obsolete: true
Attachment #8884148 - Flags: review?(MattN+bmo)
Attachment #8877190 - Attachment is obsolete: true
Attachment #8877191 - Attachment is obsolete: true
Attachment #8877191 - Flags: review?(MattN+bmo)
Attachment #8877192 - Attachment is obsolete: true
Comment on attachment 8888807 [details]
Bug 1370429 - Part 3: Import the telephone grammar list from Chromium.

https://reviewboard.mozilla.org/r/159844/#review167502
Attachment #8888807 - Flags: review?(MattN+bmo) → review+
Comment on attachment 8888809 [details]
Bug 1370429 - Part 5: Add tel-extension support.

https://reviewboard.mozilla.org/r/159848/#review167504
Attachment #8888809 - Flags: review?(MattN+bmo) → review+
OS: Mac OS X → All
Hardware: x86 → All
Comment on attachment 8888805 [details]
Bug 1370429 - Part 1: Implement FieldScanner and refactor getFormInfo function.

https://reviewboard.mozilla.org/r/159840/#review167544

I don't like the fact that this is introducing bug 1383058 (regressing behaviour) and the FieldScanner as an additional class which is a very similar concept to what the FormHandler is for isn't good but it seems like you're set on this way and we're running out of time so I guess we'll see how it works and replace it later.

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:56
(Diff revision 1)
> +   * to the end of field details.
> +   */
> +  pushDetail() {
> +    let elementIndex = this.fieldDetails.length;
> +    if (elementIndex >= this._elements.length) {
> +      log.warn("try to push the non-exising element info.");

"non-existing"

Do we ever hit this? Why shouldn't we throw instead?
Attachment #8888805 - Flags: review?(MattN+bmo) → review+
Comment on attachment 8888806 [details]
Bug 1370429 - Part 2: Implement address-lines parser and refactor getInfo function.

https://reviewboard.mozilla.org/r/159842/#review167548

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:53
(Diff revision 1)
> +  get indexParsing() {
> +    return this._indexParsing;
> +  }

Nit: `_parsingIndex` seems more natural

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:64
(Diff revision 1)
> +  set indexParsing(index) {
> +    if (index > this.fieldDetails.length) {
> +      log.warn("The parsing index is out of range.");
> +      index = this.fieldDetails.length;
> +    }
> +    this._indexParsing = index;
> +  }

Do we want the index to be monotonic? If not, we should probably throw an error in that case.

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:65
(Diff revision 1)
> +    if (index > this.fieldDetails.length) {
> +      log.warn("The parsing index is out of range.");
> +      index = this.fieldDetails.length;
> +    }

This seems sloppy… we should throw instead and fix callers that hit this.

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:82
(Diff revision 1)
> +    if (index >= this._elements.length) {
> +      return null;
> +    }

Maybe throw here instead?

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:86
(Diff revision 1)
> +  getFieldDetailByIndex(index) {
> +    if (index >= this._elements.length) {
> +      return null;
> +    }
> +
> +    if (this.fieldDetails.length > index) {

This would be more natural and match with the previous `if` if you flip the operands:

`if (index < this.fieldDetails.length) {`

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:148
(Diff revision 1)
> +    if (index >= this.fieldDetails.length) {
> +      log.warn("Try to update an inexistent field detail.");
> +      return;
> +    }

This should throw instead I think

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:209
(Diff revision 1)
> +   * This function tries to find the correct address-line[1-3] sequence and
> +   * correct their field names.

Nit: "This function" is redundant and should be removed.

s/tries/Try/

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:212
(Diff revision 1)
>  
> +  /**
> +   * This function tries to find the correct address-line[1-3] sequence and
> +   * correct their field names.
> +   *
> +   * @param {Object} fieldScanner

{FieldScanner}
Attachment #8888806 - Flags: review?(MattN+bmo) → review+
Comment on attachment 8888808 [details]
Bug 1370429 - Part 4: Implement grammar list matching algorithm for telephone fields.

https://reviewboard.mozilla.org/r/159846/#review167556

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:209
(Diff revision 1)
>    },
>  
>    RULES: null,
>  
>    /**
> +   * This function tries to match the telephone related fields to the grammar

s/This function tries/Try/

::: browser/extensions/formautofill/FormAutofillHeuristics.jsm:213
(Diff revision 1)
>    /**
> +   * This function tries to match the telephone related fields to the grammar
> +   * list to see if there is any valid telephone set and correct their
> +   * field names.
> +   *
> +   * @param {Object} fieldScanner

{FieldScanner}
Attachment #8888808 - Flags: review?(MattN+bmo) → review+
Hey Matt, all the warning log have been replaced with throwing errors. Thanks a lot for reviewing these patches!
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/fc7970f8303c
Part 1: Implement FieldScanner and refactor getFormInfo function. r=MattN
https://hg.mozilla.org/integration/autoland/rev/5b3ea49d9f75
Part 2: Implement address-lines parser and refactor getInfo function. r=MattN
https://hg.mozilla.org/integration/autoland/rev/bd882520e8a4
Part 3: Import the telephone grammar list from Chromium. r=MattN
https://hg.mozilla.org/integration/autoland/rev/8225d170ecb2
Part 4: Implement grammar list matching algorithm for telephone fields. r=MattN
https://hg.mozilla.org/integration/autoland/rev/986bb6e88919
Part 5: Add tel-extension support. r=MattN
https://hg.mozilla.org/integration/autoland/rev/06cffb11deda
Part 6: Modify the test case of targeting web sites to support the telephone related fields. r=MattN
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.