Closed Bug 1888004 Opened 6 months ago Closed 5 months ago

"WebDriver:GetElementText" fails to correctly capitalize text containing an underscore (_)

Categories

(Remote Protocol :: Marionette, defect, P3)

defect
Points:
2

Tracking

(firefox-esr115 wontfix, firefox124 wontfix, firefox125 wontfix, firefox126 wontfix, firefox127 fixed)

RESOLVED FIXED
127 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox124 --- wontfix
firefox125 --- wontfix
firefox126 --- wontfix
firefox127 --- fixed

People

(Reporter: tchad, Assigned: whimboo)

References

(Blocks 1 open bug, Regression, )

Details

(Keywords: regression, Whiteboard: [webdriver:m11], [wptsync upstream][webdriver:relnote])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36

Steps to reproduce:

After upgrading to Firefox 115, WebElement.getText doesn't match the displayed text of an element with "text-transform: capitalize;" styling.

test.html:
<html><div id="reproDiv">
<span style="text-transform: capitalize;">test_text</span><br>
<span style="text-transform: capitalize;">test text</span><br>
</div></html>

Test.java snippet:
driver.navigate().to("file:///test.html");
driver.findElement(By.id("reproDiv")).getText();

Actual results:

The underscore ('_') is treated as a word boundary, causing letters to be incorrectly capitalized.
"test_text" becomes "Test_Text"

Expected results:

"test_text" should become "Test_text" when capitalized.

The Bugbug bot thinks this bug should belong to the 'Core::Web Painting' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Web Painting
Product: Firefox → Core

Web painting is not the right component but I'm not sure where is. Whatever api webdriver is using to get the text would be the place to look, not sure what that is.

Component: Web Painting → DOM: Core & HTML
Component: DOM: Core & HTML → Marionette
Product: Core → Remote Protocol

Thank you for the report. I can actually verify this behavior with a Marionette script. I've used mozregression and that is the result that I got:

mach mozregression --good 102 --bad 114 -c './mach marionette-test _a/test_minimized.py --binary="{binary}"'

 4:20.23 INFO: Narrowed nightly regression window from [2022-10-11, 2022-10-13] (2 days) to [2022-10-12, 2022-10-13] (1 days) (~0 steps left)
 4:20.23 INFO: Got as far as we can go bisecting nightlies...
 4:20.23 INFO: Last good revision: cbbf6a7e34a363b39107b60dddac2aa713eaa8b5 (2022-10-12)
 4:20.23 INFO: First bad revision: a1624dfaa16941059792f80e84e5e4adf31c8827 (2022-10-13)
 4:20.23 INFO: Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=cbbf6a7e34a363b39107b60dddac2aa713eaa8b5&tochange=a1624dfaa16941059792f80e84e5e4adf31c8827

 4:20.23 INFO: Switching bisection method to taskcluster
 4:20.23 INFO: Getting mozilla-central builds between cbbf6a7e34a363b39107b60dddac2aa713eaa8b5 and a1624dfaa16941059792f80e84e5e4adf31c8827
[..]
 4:39.04 CRITICAL: Last build a1624dfaa169 is missing, but mozregression can't find a build after - so it is excluded, but it could contain the regression!
 4:40.93 INFO: There are no build artifacts for these changesets (they are probably too old).

Sadly it's only the Nightly range that works here but it started already in the Firefox 107 release.

Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: WebDriver:GetElementText doesn't capitalize text correctly → "WebDriver:GetElementText" doesn't capitalize text correctly
Version: Firefox 124 → other

Given that this is a regression which happened on Oct 13th, 2022 but 1771942 is most likely the regressing bug here.

This means an update of the Selenium atoms actually caused this regression. There is also bug 1824664 that covered a similar regression and there this range was mentioned:

So the atoms got updated from the source on Selenium in javascript/atoms/dom.js from the changeset facd199c31c9d81316a1db3bbb0c10603799bb8b (Sep 29, 2017) to a6b161a159c3d581b130f03a2e6e35f577f38dec (Jul 6, 2022).

By checking the diff between those changesets it's not clear to me what might have broken it.

Keywords: regression
Regressed by: 1771942
Summary: "WebDriver:GetElementText" doesn't capitalize text correctly → "WebDriver:GetElementText" fails to correctly capitalize text containing a dash (-) or underscore (_)

Actually there was a broken behavior between different browsers in the past. See the 2nd note for capitalize:
https://developer.mozilla.org/en-US/docs/Web/CSS/text-transform#syntax

The capitalize keyword was under-specified in CSS 1 and CSS 2.1. This resulted in differences between browsers in the way the first letter was calculated (Firefox considered - and _ as letters, but other browsers did not. Both Webkit and Gecko incorrectly considered letter-based symbols like ⓐ to be real letters.) By precisely defining the correct behavior, CSS Text Level 3 cleans this mess up. The capitalize line in the browser compatibility table contains the version the different engines started to support this now precisely-defined behavior.

These days all browsers indeed correctly transform the text to Test-Text as it can be seen with opening the following dataURL in the location bar of any browser: data:text/html,<div style="text-transform: capitalize;">test-text</div>.

That also explains why there is no obvious change in the Selenium atoms code.

As such this behavior works as expected starting at some point between the Firefox 107 and 115 releases.

Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → WORKSFORME

Actual browser behavior is correct.
This error only occurs when getting an element's text with WebDriver/Marionette. Also, the problem is with underscores, not with dashes.

Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

Set release status flags based on info from the regressing bug 1771942

Ah, I modified the test case to check if - is affected as well and then I forget to revert this change and as result messed up further testing.

You are right that only underscore is affected and the problem is still present in Firefox 126.

Status: REOPENED → NEW
Summary: "WebDriver:GetElementText" fails to correctly capitalize text containing a dash (-) or underscore (_) → "WebDriver:GetElementText" fails to correctly capitalize text containing an underscore (_)

Note that actually all browsers currently fail this test case. Chrome behaves the same as us because it is most likely using the same Selenium atom, and Safari returns test_text with nothing capitalized. I'll attach a wdspec test that we can use to verify a future fix.

Basically something in Selenium's bot.dom.getVisibleText() is broken - most likely around these lines:
https://github.com/SeleniumHQ/selenium/blob/a169d905b4215f41d0b04ed18bcf02e3057298a7/javascript/atoms/dom.js#L1174-L1179

Changes as landed in that time frame are:
https://github.com/SeleniumHQ/selenium/commit/b8ef89b47b0e1ca422c21baf101738f512d2f535
https://github.com/SeleniumHQ/selenium/commit/cf49ba2362de69e4fd45aa1209bf9893ba95888a
https://github.com/SeleniumHQ/selenium/commit/c0f3e3c743651b9817dbe69f7ba7614e86eda91c

Most likely the last of these links caused it and we need a special case for underscore. It's annoying that the atoms need that customization and we cannot just rely what the browser returns.

Is this something we'll need to worry about backporting or will fixing it on trunk suffice?

Flags: needinfo?(hskupin)

So far we got a single report for that regression so I don't think it is required to uplift. I'll mark as optional for now.

I've filed https://github.com/SeleniumHQ/selenium/issues/13779 to get the regression in the atom fixed.

Once done we should update the WebDriver classic spec for the new commit id, and update our imported atoms.

Blocks: webdriver
Severity: -- → S3
Priority: -- → P3
Whiteboard: [webdriver:backlog]

(In reply to Henrik Skupin [:whimboo][⌚️UTC+1] from comment #13)

I've filed https://github.com/SeleniumHQ/selenium/issues/13779 to get the regression in the atom fixed.

This got just fixed on the Selenium side. It means that we can bump the atom commit for webdriver classic and update our get text atom now. I may be able to do it over the next days.

Set release status flags based on info from the regressing bug 1771942

I've opened https://github.com/w3c/webdriver/pull/1811 to bump the Selenium revision in the WebDriver classic specification.

Assignee: nobody → hskupin
Attachment #9394587 - Attachment description: WIP: Bug 1888004 - [wdspec] Add "Get Element Text" test for text-transform "capitalize". → Bug 1888004 - [wdspec] Add "Get Element Text" test for text-transform "capitalize".
Status: NEW → ASSIGNED
Points: --- → 2
Whiteboard: [webdriver:backlog] → [webdriver:m11]
Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a67371177089 [wdspec] Add "Get Element Text" test for text-transform "capitalize". r=webdriver-reviewers,Sasha https://hg.mozilla.org/integration/autoland/rev/7daad52f7b6e [marionette] Fix "WebDriver:GetElementText" to correctly find capitalized text containing an underscore. r=webdriver-reviewers,Sasha
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/46166 for changes under testing/web-platform/tests
Whiteboard: [webdriver:m11] → [webdriver:m11], [wptsync upstream]
Status: ASSIGNED → RESOLVED
Closed: 6 months ago5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 127 Branch
Upstream PR merged by moz-wptsync-bot
Whiteboard: [webdriver:m11], [wptsync upstream] → [webdriver:m11], [wptsync upstream][webdriver:relnote]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: