Syntax highlighting broken in BrowserTestUtils.jsm
Categories
(Webtools :: Searchfox, defect)
Tracking
(Not tracked)
People
(Reporter: mccr8, Assigned: u608768)
References
Details
Attachments
(1 file)
|
454.01 KB,
application/x-javascript
|
Details |
I was working with this file last week and I think it was being highlighted just fine, but looking at it now, it doesn't seem to be:
https://searchfox.org/mozilla-central/source/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm
Some other random JSM I selected is still highlighted, so maybe there was just trouble parsing BrowserTestUtils.jsm: https://searchfox.org/mozilla-central/source/browser/actors/BrowserTabChild.jsm
Comment 1•6 years ago
•
|
||
The indexer has an okay-looking analysis file, attaching it here.
Seems like the problem is deterministic as the comm-central version is also sad at https://searchfox.org/comm-central/source/mozilla/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm
And if I force dynamic tokenization by giving the parent of the tip rev the indexer ran, the problem reproduces: https://searchfox.org/mozilla-central/rev/4c3c765916833c689e4ce5eba5b2144cbb340802/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm
Whereas the preceding revision of the file in question is fine: https://searchfox.org/mozilla-central/rev/71af6eb79bff7007e668cfb26d07547511800b0b/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm
Which suggests the tokenizer hates something in this diff:
https://searchfox.org/mozilla-central/diff/8dc620b8bc97f1e32121ebe9076ffec04c28eabc/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm#35
| Reporter | ||
Comment 2•6 years ago
|
||
The /^https?:/ seems like the most likely culprit here.
(In reply to Andrew McCreight [:mccr8] from comment #2)
The /^https?:/ seems like the most likely culprit here.
More specifically, I think it doesn't like arrow functions followed by regex patterns. It's broken in this file as well:
Comment 4•6 years ago
•
|
||
Yeah, I quickly ran against both files on an indexer and we got the "Invalid regexp literal" error for both. I made the following mod to get context:
diff --git a/tools/src/tokenize.rs b/tools/src/tokenize.rs
index 6e78b34..e809309 100644
--- a/tools/src/tokenize.rs
+++ b/tools/src/tokenize.rs
@@ -427,7 +427,7 @@ pub fn tokenize_c_like(string: &str, spec: &LanguageSpec) -> Vec<Token> {
} else if next == '\\' && peek_char() != '\n' {
get_char();
} else if next == '\n' {
- debug!("Invalid regexp literal");
+ debug!("Invalid regexp literal (pos {})", peek_pos());
return tokenize_plain(string);
}
}
Doing head -c with that shows it's indeed the arrow function that causes the . Note that in test_json_updatecheck.js it's actually a slightly earlier line at https://searchfox.org/mozilla-central/rev/04d8e7629354bab9e6a285183e763410860c5006/toolkit/mozapps/extensions/test/xpcshell/test_json_updatecheck.js#311 but for the comment 1 it is https://searchfox.org/mozilla-central/rev/04d8e7629354bab9e6a285183e763410860c5006/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm#1370
I assume that the logic at https://github.com/mozsearch/mozsearch/blob/e97a1da2ad46733675cd25579dff30a50938cec8/tools/src/tokenize.rs#L555 that sets next_token_maybe_regexp_literal is getting tricked because the list of okay characters does not include ">". I nearly once went mad on this problem space before (in order to tokenize JS you sorta need to parse JS), so my knee-jerk suggestion would be that maybe the "=" case wants lookahead to consume the ">"?
:kashav, do you want to try your hand at a fix (and risk madness? ;)? Emilio has a patch at https://github.com/mozsearch/mozsearch/pull/248 to use 18.04 and help avoid using virtualbox which might be helpful or not.
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #4)
:kashav, do you want to try your hand at a fix (and risk madness? ;)? Emilio has a patch at https://github.com/mozsearch/mozsearch/pull/248 to use 18.04 and help avoid using virtualbox which might be helpful or not.
I'd love to! Assigning myself.
Comment 6•6 years ago
|
||
Any progress here?
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #4)
I assume that the logic at https://github.com/mozsearch/mozsearch/blob/e97a1da2ad46733675cd25579dff30a50938cec8/tools/src/tokenize.rs#L555 that sets
next_token_maybe_regexp_literalis getting tricked because the list of okay characters does not include ">". I nearly once went mad on this problem space before (in order to tokenize JS you sorta need to parse JS), so my knee-jerk suggestion would be that maybe the "=" case wants lookahead to consume the ">"?
This fixes the problem, but I'm running into an odd issue when testing, here's what I'm doing:
- Add a new test file with something like
x => /foo/(or just add that to an existing js test file). - Run the indexer-setup.sh and indexer-run.sh steps from https://github.com/mozsearch/mozsearch#testing-locally-using-the-test-repository
- Open the test file and notice syntax highlighting is broken (as expected).
- Add ">" to that "list of okay characters" in tools/src/tokenize.rs, rebuild the tools package, and re-index (with indexer-run.sh).
- Open the test file and notice syntax highlighting is no longer broken.
- Remove the check for ">", and rebuild, re-index, etc.
- Open the test file and notice syntax highlighting is still no longer broken.
I've tried to completely rebuild the tools package (cargo clean, cargo build --release) in step 6, but that doesn't seem to help either. I'm probably missing something really obvious.
Comment 8•6 years ago
|
||
Probably one of the following things is happening:
- Web caching. The settings for static files generated by https://github.com/mozsearch/mozsearch/blob/master/scripts/nginx-setup.py mean that the browser will only revalidate the etag after 2 minutes have expired since the last hit. If you reload, run a rebuild, and less than 2 minutes have passed when you refresh and you're not holding down shift/whatever, you may just be seeing the cached copy.
- The output run is failing because the blame repo isn't up-to-date with the current status of the repo in the filesystem because output-file assumes the blame-map-lines vector has the same length as the on-disk-lines vector and output-file panics. This is something I think I made ergonomically worse when trying to improve the fidelity of the test repo by having it generate blame and probably didn't update the docs enough. The tl;dr is that any time you make changes to your repo, you really want to commit them and then re-run the indexer-setup.sh step again before running the indexer step. I created a shell-script called
~/bin/reindex-and-runthat's basically:
#!/bin/sh
/vagrant/infrastructure/indexer-setup.sh /vagrant/tests config.json ~/index && /vagrant/infrastructure/indexer-run.sh /vagrant/tests ~/index && /vagrant/infrastructure/web-server-run.sh /vagrant/tests ~/index ~
- I find myself constantly re-recreating shell scripts like this, so I think maybe we want to perhaps create helper scripts like this and they could do something like make sure everything is checked in and tell whether a blame-history rebuild is needed.
We generate a "This page was generated by Searchfox 2019-12-31 9:24." footer at the bottom of every page which can help you tell if you're actually looking at a different result or not.
Comment 9•6 years ago
|
||
There's a Makefile that I added to hold those little shell scripts/commands that I kept having to do during development. Maybe your script would fit in there?
| Assignee | ||
Comment 10•5 years ago
|
||
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #8)
I created a shell-script called
~/bin/reindex-and-runthat's basically:#!/bin/sh
/vagrant/infrastructure/indexer-setup.sh /vagrant/tests config.json ~/index && /vagrant/infrastructure/indexer-run.sh /vagrant/tests ~/index && /vagrant/infrastructure/web-server-run.sh /vagrant/tests ~/index ~
This was the problem. I think I tried running indexer-setup.sh each time, but wasn't restarting the server. Now I'm noticing that sometimes my rust changes aren't picked up immediately (eg. if I edit tokenize.rs and then run cargo build --release, it won't actually rebuild anything, until I run it a few more times).
Anyways, after some more testing, I've noticed that
x => /^a?b/.test(x);doesn't actually break syntax highlighting for the file (the regexp literal isn't highlighted though (i.e., it's not grey like it is in [1])),- but something like
x => /^a?/.test(x);orx => /^a?:/.test(x);break highlighting for the entire file.
Comment 11•5 years ago
|
||
(In reply to :kashav from comment #10)
This was the problem. I think I tried running indexer-setup.sh each time, but wasn't restarting the server. Now I'm noticing that sometimes my rust changes aren't picked up immediately (eg. if I edit tokenize.rs and then run
cargo build --release, it won't actually rebuild anything, until I run it a few more times).
Hm, this sounds like it might be a VM filesystem synchronization problem? At least assuming you're modifying the files outside the VM but building inside the VM. (Which is what I generally would recommend, as it helps avoids rust version inconsistencies between our tools and the compiler that's used to build tests.)
I'm not sure when you did your vagrant setup relative to the Emilio libvirt changes that have since landed, but if you're using virtualbox and followed the docs preceding the current state at https://github.com/mozsearch/mozsearch#vagrant-setup-for-local-development, I might recommend:
- Installing the vagrant-vbguest and running
vagrant reloadandvagrant upin order to help make sure your guest extensions are up-to-date. I experienced a ton of frustrating filesystem weirdness when my guest extensions weren't up-to-date. (This is a new suggestion in the docs.) - Consider switching to libvirt. You would want to
vagrant box removeor something like that to nuke the virtualbox VM. - We did also update the ubuntu VM version in use, so even if you like virtual box, destroying the box and re-creating it may be beneficial too in terms of a more recent ubuntu VM version being happier.
Comment 12•5 years ago
|
||
https://github.com/mozsearch/mozsearch/pull/266 merged, thank you very much for the fix! Marking fixed now, and we can perhaps be fancy and mark this as VERIFIED once we've seen the successful indexer run.
Description
•