Bug 1588908 Comment 4 Edit History
Yeah, I quickly ran against both files on an indexer and we got the ["Invalid regexp literal"](https://github.com/mozsearch/mozsearch/blob/e97a1da2ad46733675cd25579dff30a50938cec8/tools/src/tokenize.rs#L430) error for both. I made the following mod to get context:

```diff
diff --git a/tools/src/tokenize.rs b/tools/src/tokenize.rs
index 6e78b34..e809309 100644
--- a/tools/src/tokenize.rs
+++ b/tools/src/tokenize.rs
@@ -427,7 +427,7 @@ pub fn tokenize_c_like(string: &str, spec: &LanguageSpec) -> Vec<Token> {
             } else if next == '\\' && peek_char() != '\n' {
                 get_char();
             } else if next == '\n' {
-                debug!("Invalid regexp literal");
+                debug!("Invalid regexp literal (pos {})", peek_pos());
                 return tokenize_plain(string);
             }
         }
```

Doing `head -c` with that shows it's indeed the arrow function that causes the error. Note that in test_json_updatecheck.js it's actually a slightly earlier line at https://searchfox.org/mozilla-central/rev/04d8e7629354bab9e6a285183e763410860c5006/toolkit/mozapps/extensions/test/xpcshell/test_json_updatecheck.js#311, but for comment 1 it is https://searchfox.org/mozilla-central/rev/04d8e7629354bab9e6a285183e763410860c5006/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm#1370.

I assume that the logic at https://github.com/mozsearch/mozsearch/blob/e97a1da2ad46733675cd25579dff30a50938cec8/tools/src/tokenize.rs#L555 that sets `next_token_maybe_regexp_literal` is getting tricked because the list of okay characters does not include ">". I once nearly went mad on this problem space before (in order to tokenize JS you sort of need to parse JS), so my knee-jerk suggestion would be that maybe the "=" case wants lookahead to consume the ">"?

:kashav, do you want to try your hand at a fix (and risk madness? ;)? Emilio has a patch at https://github.com/mozsearch/mozsearch/pull/248 to use 18.04 and help avoid using VirtualBox, which might be helpful or not.
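To make the suggestion concrete, here is a minimal, hypothetical sketch (not the actual mozsearch code; the function name and shape are invented for illustration) of what "the `=` case wants lookahead to consume the `>`" could look like: when the tokenizer sees `=`, it peeks at the next character, and if that character is `>` it consumes it so the pair is treated as the arrow token `=>`, which does *not* put the tokenizer into the "a following `/` may start a regexp literal" state.

```rust
// Hypothetical sketch: decide whether a '/' following the token `prev`
// may start a regexp literal. `chars` is the lookahead stream positioned
// just after `prev`.
fn next_slash_may_be_regexp(
    prev: &str,
    chars: &mut std::iter::Peekable<impl Iterator<Item = char>>,
) -> bool {
    match prev {
        "=" => {
            // Lookahead: if the next char is '>', consume it so "=>"
            // is handled as a single arrow token. After an arrow, a
            // '/' should NOT be assumed to start a regexp literal,
            // which is what trips up the current heuristic on arrow
            // functions.
            if chars.peek() == Some(&'>') {
                chars.next();
                false
            } else {
                // Plain assignment: `x = /foo/` is a regexp literal.
                true
            }
        }
        // Other contexts where a regexp literal can follow, e.g.
        // `f(/re/)`, `[/re/]`, `return /re/;` (list abbreviated here).
        "(" | "," | "[" | ";" | "return" => true,
        _ => false,
    }
}
```

The point of doing the lookahead inside the `=` case (rather than adding `>` to the "okay characters" list) is that `>` on its own, as in `a > /b/`, is genuinely ambiguous, whereas the two-character sequence `=>` is unambiguous.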