Open Bug 1715983 Opened 3 years ago Updated 2 years ago

Hidden characters (like hard spaces) render as &nbsp; but are shown as regular spaces when copied, even in devtools

Categories

(Core :: DOM: Copy & Paste and Drag & Drop, enhancement)

Firefox 89
enhancement

People

(Reporter: tobbe, Unassigned)

Attachments

(1 file)

Attached image example.png

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0

Steps to reproduce:

Found an article with weird line breaks. When inspecting it in Firefox I couldn't see any non-breaking spaces. I copied the paragraph, both from the article and from devtools, into an external service that detects hidden characters, but it reported only regular spaces despite the text rendering so oddly.

https://www.svt.se/kultur/auktionshus-salde-bluffkonst-bukowskis-alexander-vasilenko-tom-moore regular spaces.

Actual results:

Firefox devtools showed the lead/preamble as containing only regular spaces, and copying the text from devtools into services that detect invisible characters reported nothing but regular spaces. Copying the text from the article itself gave the same result (regular spaces).

Expected results:

The hard spaces (probably entered as such in the CMS or copied from Word, and not as &nbsp; entities in the HTML) should either be kept when copying text, so that the hidden characters can be detected, or, as Chrome devtools did, be printed out as &nbsp; in this instance, making it much easier to find the culprit.

Including example from both as a screenshot.

Hi,
thank you for your report. Hopefully it will be taken into consideration in the future.
I'll mark this Enhancement as New for visibility.

I've assigned a component in order to get the dev team involved. 'Firefox:General' team, please feel free to change the component if this is not the appropriate one.

Regards,

Jerónimo.

Status: UNCONFIRMED → NEW
Component: Untriaged → General
Ever confirmed: true

The Bugbug bot thinks this bug should belong to the 'Core::Layout: Text and Fonts' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: General → Layout: Text and Fonts
Product: Firefox → Core

I don't think this is a Layout bug. The non-breaking spaces are present in the text and are correctly handled by Layout.

Inspecting the page with the web console:

> document.getElementsByTagName('p')[1].innerText
< "Kulturnyheterna kan idag avslöja att ansedda auktionshus och gallerier har sålt bluffkonst av påhittade konstnärer. Bluffkonsten värderas till över tio miljoner kronor. "

The space between "konstnärer." and "Bluffkonsten", for example, is a non-breaking space in the original content (present as a literal U+00A0 character, not an &nbsp; entity, in the raw HTML response), although copy/pasting into the comment here has turned it into a normal space. We can check that it is correct in the document, though:

> document.getElementsByTagName('p')[1].innerText.charCodeAt(115)
< 160

So it's definitely correct in the DOM, and Layout is just doing the expected thing with it. The issue here is about what happens when it is copied to the clipboard -- apparently we're converting non-breaking spaces to regular spaces at that point.
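Since copying out of the page loses the distinction, one workaround is to inspect the string directly in the web console instead of going through the clipboard. A minimal sketch, assuming a hypothetical helper named `findInvisible` (it is not part of Firefox or DevTools) and a hand-picked, incomplete list of invisible code points:

```javascript
// Hypothetical helper, for illustration only: list every character in a
// string that is an invisible or space-like code point other than U+0020.
// The table below is an assumption about which characters are of interest;
// it is not exhaustive.
const INVISIBLE = new Map([
  [0x00a0, "NO-BREAK SPACE"],
  [0x200b, "ZERO WIDTH SPACE"],
  [0x200c, "ZERO WIDTH NON-JOINER"],
  [0x200d, "ZERO WIDTH JOINER"],
  [0x202f, "NARROW NO-BREAK SPACE"],
  [0x2060, "WORD JOINER"],
  [0xfeff, "ZERO WIDTH NO-BREAK SPACE"],
]);

function findInvisible(text) {
  const hits = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (INVISIBLE.has(code)) {
      // Record where the character sits and what it is.
      hits.push({ index: i, code, name: INVISIBLE.get(code) });
    }
  }
  return hits;
}

// Possible use in the console on the page from this bug:
// findInvisible(document.getElementsByTagName('p')[1].innerText)
```

Each entry in the returned array gives the string index and code point, which is the same information the `charCodeAt` check above retrieves for a single position.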

--> moving to the Copy&Paste component.

Component: Layout: Text and Fonts → DOM: Copy & Paste and Drag & Drop

To be clear, the page is rendered correctly with respect to the non-breaking spaces. My issue is that I couldn't see them as anything other than regular spaces when inspecting the code (as I quite clearly could in Chrome, per my screenshot comparison). Some sort of indicator for hidden characters would be really nice. Second best would be to not convert them to regular spaces, so they're at least detectable with a third-party service for invisible characters. I just wish I could see them straight away in the inspector.

Thanks for investigating!

Yes, agreed. I think there are really two distinct issues here. First, the fact that non-breaking spaces get converted to regular spaces when copying text from the page (or from the Inspector view of it). There may have been a specific reason this was done, but I think we should at least re-examine the question. That's what I hope can be looked at within the Copy&Paste component.

The second issue is whether the DevTools inspector should perhaps do something to make non-breaking spaces (and potentially other "invisible" characters) visible in its source view, such as displaying them as &nbsp; entities (as Chrome seems to do) even if they were literal Unicode characters in the actual source. I think it'd be best to file a separate bug (in the DevTools: Inspector component) suggesting that as an enhancement. It would be an entirely separate issue from the behavior of the Copy/Paste operation.
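To illustrate the kind of display the second issue is asking for, here is a rough sketch of escaping invisible characters before showing text. The function name `showInvisible` and the set of characters it covers are assumptions made for this example; this is not how the Firefox Inspector or Chrome actually implement their source views.

```javascript
// Hypothetical sketch: make invisible characters visible by replacing them
// with HTML-style references. U+00A0 becomes the named entity &nbsp;;
// the other listed characters become hexadecimal numeric references.
function showInvisible(text) {
  return text.replace(
    /[\u00a0\u200b\u200c\u200d\u202f\u2060\ufeff]/g,
    (ch) =>
      ch === "\u00a0"
        ? "&nbsp;"
        : `&#x${ch.charCodeAt(0).toString(16)};`
  );
}

// Possible use in the console on the page from this bug:
// showInvisible(document.getElementsByTagName('p')[1].innerText)
```

Regular U+0020 spaces pass through unchanged, so only the characters that are otherwise indistinguishable stand out in the result.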

See Also: → 532712
See Also: → 1800078
