"WebDriver:GetElementText" fails to correctly capitalize text containing an underscore (_)
Categories
(Remote Protocol :: Marionette, defect, P3)
Tracking
(firefox-esr115 wontfix, firefox124 wontfix, firefox125 wontfix, firefox126 wontfix, firefox127 fixed)
People
(Reporter: tchad, Assigned: whimboo)
References
(Blocks 1 open bug, Regression, )
Details
(Keywords: regression, Whiteboard: [webdriver:m11], [wptsync upstream][webdriver:relnote])
Attachments
(2 files)
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36
Steps to reproduce:
After upgrading to Firefox 115, WebElement.getText doesn't match the displayed text of an element with "text-transform: capitalize;" styling.
test.html:
<html><div id="reproDiv">
<span style="text-transform: capitalize;">test_text</span><br>
<span style="text-transform: capitalize;">test text</span><br>
</div></html>
Test.java snippet:
driver.navigate().to("file:///test.html");
driver.findElement(By.id("reproDiv")).getText();
Actual results:
The underscore ('_') is treated as a word boundary, causing letters to be incorrectly capitalized.
"test_text" becomes "Test_Text"
Expected results:
"test_text" should become "Test_text" when capitalized.
Comment 1•6 months ago
|
||
The Bugbug bot thinks this bug should belong to the 'Core::Web Painting' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 2•6 months ago
|
||
Web painting is not the right component but I'm not sure where is. Whatever api webdriver is using to get the text would be the place to look, not sure what that is.
Updated•6 months ago
|
Assignee | ||
Comment 3•6 months ago
|
||
Thank you for the report. I can actually verify this behavior with a Marionette script. I've used mozregression and that is the result that I got:
mach mozregression --good 102 --bad 114 -c './mach marionette-test _a/test_minimized.py --binary="{binary}"'
4:20.23 INFO: Narrowed nightly regression window from [2022-10-11, 2022-10-13] (2 days) to [2022-10-12, 2022-10-13] (1 days) (~0 steps left)
4:20.23 INFO: Got as far as we can go bisecting nightlies...
4:20.23 INFO: Last good revision: cbbf6a7e34a363b39107b60dddac2aa713eaa8b5 (2022-10-12)
4:20.23 INFO: First bad revision: a1624dfaa16941059792f80e84e5e4adf31c8827 (2022-10-13)
4:20.23 INFO: Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=cbbf6a7e34a363b39107b60dddac2aa713eaa8b5&tochange=a1624dfaa16941059792f80e84e5e4adf31c8827
4:20.23 INFO: Switching bisection method to taskcluster
4:20.23 INFO: Getting mozilla-central builds between cbbf6a7e34a363b39107b60dddac2aa713eaa8b5 and a1624dfaa16941059792f80e84e5e4adf31c8827
[..]
4:39.04 CRITICAL: Last build a1624dfaa169 is missing, but mozregression can't find a build after - so it is excluded, but it could contain the regression!
4:40.93 INFO: There are no build artifacts for these changesets (they are probably too old).
Sadly it's only the Nightly range that works here but it started already in the Firefox 107 release.
Assignee | ||
Comment 4•6 months ago
|
||
Given that this is a regression which happened on Oct 13th, 2022 but 1771942 is most likely the regressing bug here.
This means an update of the Selenium atoms actually caused this regression. There is also bug 1824664 that covered a similar regression and there this range was mentioned:
So the atoms got updated from the source on Selenium in javascript/atoms/dom.js from the changeset facd199c31c9d81316a1db3bbb0c10603799bb8b (Sep 29, 2017) to a6b161a159c3d581b130f03a2e6e35f577f38dec (Jul 6, 2022).
By checking the diff between those changesets it's not clear to me what might have broken it.
Updated•6 months ago
|
Assignee | ||
Comment 5•6 months ago
|
||
Actually there was a broken behavior between different browsers in the past. See the 2nd note for capitalize
:
https://developer.mozilla.org/en-US/docs/Web/CSS/text-transform#syntax
The capitalize keyword was under-specified in CSS 1 and CSS 2.1. This resulted in differences between browsers in the way the first letter was calculated (Firefox considered - and _ as letters, but other browsers did not. Both Webkit and Gecko incorrectly considered letter-based symbols like ⓐ to be real letters.) By precisely defining the correct behavior, CSS Text Level 3 cleans this mess up. The capitalize line in the browser compatibility table contains the version the different engines started to support this now precisely-defined behavior.
These days all browsers indeed correctly transform the text to Test-Text
as it can be seen with opening the following dataURL in the location bar of any browser: data:text/html,<div style="text-transform: capitalize;">test-text</div>
.
That also explains why there is no obvious change in the Selenium atoms code.
As such this behavior works as expected starting at some point between the Firefox 107 and 115 releases.
Actual browser behavior is correct.
This error only occurs when getting an element's text with WebDriver/Marionette. Also, the problem is with underscores, not with dashes.
Comment 7•6 months ago
|
||
Set release status flags based on info from the regressing bug 1771942
Assignee | ||
Comment 8•6 months ago
|
||
Ah, I modified the test case to check if -
is affected as well and then I forget to revert this change and as result messed up further testing.
You are right that only underscore is affected and the problem is still present in Firefox 126.
Assignee | ||
Comment 9•6 months ago
|
||
Note that actually all browsers currently fail this test case. Chrome behaves the same as us because it is most likely using the same Selenium atom, and Safari returns test_text
with nothing capitalized. I'll attach a wdspec test that we can use to verify a future fix.
Basically something in Selenium's bot.dom.getVisibleText()
is broken - most likely around these lines:
https://github.com/SeleniumHQ/selenium/blob/a169d905b4215f41d0b04ed18bcf02e3057298a7/javascript/atoms/dom.js#L1174-L1179
Changes as landed in that time frame are:
https://github.com/SeleniumHQ/selenium/commit/b8ef89b47b0e1ca422c21baf101738f512d2f535
https://github.com/SeleniumHQ/selenium/commit/cf49ba2362de69e4fd45aa1209bf9893ba95888a
https://github.com/SeleniumHQ/selenium/commit/c0f3e3c743651b9817dbe69f7ba7614e86eda91c
Most likely the last of these links caused it and we need a special case for underscore. It's annoying that the atoms need that customization and we cannot just rely what the browser returns.
Updated•6 months ago
|
Assignee | ||
Comment 10•6 months ago
|
||
Depends on D204697
Comment 11•6 months ago
•
|
||
Is this something we'll need to worry about backporting or will fixing it on trunk suffice?
Assignee | ||
Comment 12•6 months ago
|
||
So far we got a single report for that regression so I don't think it is required to uplift. I'll mark as optional for now.
Assignee | ||
Comment 13•6 months ago
|
||
I've filed https://github.com/SeleniumHQ/selenium/issues/13779 to get the regression in the atom fixed.
Once done we should update the WebDriver classic spec for the new commit id, and update our imported atoms.
Assignee | ||
Updated•6 months ago
|
Assignee | ||
Comment 14•6 months ago
|
||
(In reply to Henrik Skupin [:whimboo][⌚️UTC+1] from comment #13)
I've filed https://github.com/SeleniumHQ/selenium/issues/13779 to get the regression in the atom fixed.
This got just fixed on the Selenium side. It means that we can bump the atom commit for webdriver classic and update our get text
atom now. I may be able to do it over the next days.
Comment 15•6 months ago
|
||
Set release status flags based on info from the regressing bug 1771942
Updated•6 months ago
|
Updated•5 months ago
|
Assignee | ||
Comment 16•5 months ago
|
||
I've opened https://github.com/w3c/webdriver/pull/1811 to bump the Selenium revision in the WebDriver classic specification.
Updated•5 months ago
|
Assignee | ||
Comment 17•5 months ago
|
||
Depends on D206396
Assignee | ||
Updated•5 months ago
|
Comment 18•5 months ago
|
||
Comment 20•5 months ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a67371177089
https://hg.mozilla.org/mozilla-central/rev/7daad52f7b6e
Updated•5 months ago
|
Assignee | ||
Updated•4 months ago
|
Description
•