Closed Bug 1461741 Opened 6 years ago Closed 6 years ago

Cannot change encoding

Product/Component: Firefox :: Menus
Type: defect
Version: 60 Branch
Priority: Not set
Severity: normal
Status: RESOLVED INVALID
Reporter: dupsk8
Assignee: Unassigned
Attachments: 1 file

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0
Build ID: 20180503143129

Steps to reproduce:

I want to change the text encoding of the page, so I open the menu and try to select "Text Encoding".


Actual results:

The menu is grayed out and I cannot change the text encoding.

If I open the menu via "Menu > More > Text Encoding", I can see its contents but cannot change anything.

This also happens on this page.



Expected results:

The menu is supposed to open and allow me to change the text encoding.

As I recall, it's intended behavior for the menu options to be unavailable when the server specifies the character encoding. The vast majority of servers do. Here's an example of one that doesn't. As you can see, you can switch the character encoding on that page:
http://www.cburch.com/cs/151/assn/07/
Component: Untriaged → Menus
OS: Unspecified → All
Hardware: Unspecified → All
Whiteboard: [INVALID?]
Indeed, for this page the webserver specifies a Content-Type header of "text/html; charset=UTF-8", so we don't enable character encoding changes. It's only there for cases where the website doesn't specify anything, and thus the browser has to guess based on the contents, and thus the browser may be wrong (and need correcting from the user based on them looking at gobbledygook in front of them). It's questionable whether there is really any point keeping it in today's web, but in any case, as filed I think comment 1 is correct and this is invalid.
Status: UNCONFIRMED → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID
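
To illustrate the check described in the comment above, here is a minimal sketch of pulling the charset parameter out of a Content-Type header value. The function names charsetFromContentType and overrideMenuEnabled are hypothetical, and the logic is only a simplified model of the behavior as described in this bug, not Firefox's actual implementation.

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type value.
function charsetFromContentType(headerValue) {
  if (!headerValue) return null;
  // Parameters follow the media type, separated by semicolons.
  const params = headerValue.split(";").slice(1);
  for (const param of params) {
    const eq = param.indexOf("=");
    if (eq === -1) continue;
    const name = param.slice(0, eq).trim().toLowerCase();
    if (name !== "charset") continue;
    // Strip optional surrounding quotes from the value.
    const value = param.slice(eq + 1).trim().replace(/^"(.*)"$/, "$1");
    return value ? value.toLowerCase() : null;
  }
  return null;
}

// Simplified model (an assumption): the override is only offered when the
// server declared nothing and the browser had to guess.
function overrideMenuEnabled(headerValue) {
  return charsetFromContentType(headerValue) === null;
}

console.log(charsetFromContentType("text/html; charset=UTF-8")); // utf-8
console.log(overrideMenuEnabled("text/html; charset=UTF-8"));    // false
console.log(overrideMenuEnabled("text/html"));                   // true
```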
Is there a way to bypass this behavior? Here is my need: a major hosting company in Europe has an incorrectly configured form which needs data encoded in Latin-1, but they also declare a UTF-8 charset on this page. This form sets the automatic reply message for an email address, so if you leave the page in UTF-8, the mail sent is wrongly encoded ("À très bientôt !"). I was using this option to send my message in Latin-1, but now I need to use Internet Explorer to do this, which is sad.
(In reply to dupsk8 from comment #3)
> Is there a way to bypass this behavior? Here is my need: a major hosting
> company in Europe has an incorrectly configured form which needs data encoded
> in Latin-1, but they also declare a UTF-8 charset on this page. This form
> sets the automatic reply message for an email address, so if you leave the
> page in UTF-8, the mail sent is wrongly encoded ("À très bientôt !").

I mean, really the best option would be to tell the hosting company to fix their form. After all, you will not be the only person with this issue...

As a workaround, you could use an extension that modifies the relevant content-type response header and/or meta tag so that it's correct.

> I was using this option to send my message in Latin-1, but now I need to
> use Internet Explorer to do this, which is sad.

That is indeed sad... Henri, do you have any thoughts about reconsidering the UI changes from bug 980904 in some circumstances?
Blocks: 980904
Flags: needinfo?(hsivonen)
(In reply to dupsk8 from comment #3)
> Is there a way to bypass this behavior? Here is my need: a major hosting
> company in Europe has an incorrectly configured form which needs data encoded
> in Latin-1, but they also declare a UTF-8 charset on this page.

That's a fundamentally bogus configuration. I suggest complaining to the hosting company that they should make the encoding of the HTML page containing the form and the expectations of the form submission handler script match (preferably by changing the script to expect UTF-8).

If you control the HTML of the form, you can add the attribute accept-charset="windows-1252" on the <form> element.
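
As a rough illustration of what accept-charset changes, the sketch below percent-encodes the same text as UTF-8 and as a single-byte encoding. It assumes Node.js (for Buffer), and the function name percentEncode is made up for this example. Node's "latin1" is ISO-8859-1, which agrees with windows-1252 for the accented letters used here but not for bytes 0x80-0x9F. Real form submission also encodes spaces as "+" and has rules for unrepresentable characters, which this ignores.

```javascript
// Hypothetical helper: convert `text` to bytes with the given Node.js Buffer
// encoding, then percent-encode every byte that is not an unreserved character.
function percentEncode(text, bufferEncoding) {
  let out = "";
  for (const byte of Buffer.from(text, bufferEncoding)) {
    const ch = String.fromCharCode(byte);
    out += /[A-Za-z0-9\-._~]/.test(ch)
      ? ch
      : "%" + byte.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

const message = "très bientôt";
console.log(percentEncode(message, "utf8"));   // tr%C3%A8s%20bient%C3%B4t
console.log(percentEncode(message, "latin1")); // tr%E8s%20bient%F4t
```

A handler that expects the second form will misread the first, which is the mismatch reported in this bug.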

If the script needs to be able to handle non-UTF-8 submission for legacy reasons, it could be changed to copy the value of the form field named "_charset_" into the charset= parameter of the email Content-Type header. Then you can put <input type=hidden name="_charset_"> as a hidden field of the form and the browser will fill it in with the actual encoding used.
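
A sketch of the server-side half of that suggestion, under stated assumptions: fields is a hypothetical object holding the already-parsed form fields, emailContentType is a made-up function name, and both the allow-list and the UTF-8 fallback are illustrative choices rather than anything the hosting company's script is known to do.

```javascript
// Encodings this hypothetical handler is willing to echo into a mail header.
const ALLOWED_CHARSETS = new Set(["utf-8", "windows-1252", "iso-8859-1"]);

// Build the Content-Type header for the outgoing email from the form's
// "_charset_" field, falling back to UTF-8 when it is missing or unexpected.
function emailContentType(fields) {
  const submitted = String(fields["_charset_"] || "").trim().toLowerCase();
  const charset = ALLOWED_CHARSETS.has(submitted) ? submitted : "utf-8";
  return `Content-Type: text/plain; charset=${charset}`;
}

console.log(emailContentType({ _charset_: "windows-1252", message: "..." }));
// Content-Type: text/plain; charset=windows-1252
console.log(emailContentType({ message: "..." }));
// Content-Type: text/plain; charset=utf-8
```

Checking the value against an allow-list avoids copying arbitrary user input into a mail header.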

(In reply to :Gijs (he/him) from comment #4)
> That is indeed sad... Henri, do you have any thoughts about reconsidering
> the UI changes from bug 980904 in some circumstances?

I think we shouldn't roll back that change. While the edge case seen here is unfortunate, bug 980904 helps users not waste time with a remedy that's virtually always wrong for the symptoms they are seeing (if the page was labeled as UTF-8 and had no UTF-8 errors, chances are that whatever looks wrong with the page is not a problem that would be remedied by decoding as another encoding). Telemetry suggests that useless use or incorrect use of the encoding menu was a big part of its overall use.
Flags: needinfo?(hsivonen)
The complaint has already been made, by several clients... In fact, just a way to use the menu options, maybe via about:config, would help, I guess.
(In reply to dupsk8 from comment #6)
> In fact, just a way to use the menu options, maybe via about:config, would help, I guess.

As indicated at comment 4, you'd have to use an extension to overwrite the Content-Type response header. Here are two examples:
https://addons.mozilla.org/firefox/addon/header-editor/
https://addons.mozilla.org/firefox/addon/modheader-firefox/
Whiteboard: [INVALID?]
An extension seems like overkill. This bookmarklet should be enough:
javascript:(function()%7Bfor%20(var%20i%20%3D%200%3B%20i%20%3C%20document.forms.length%3B%20i%2B%2B)%20%7Bdocument.forms%5Bi%5D.setAttribute(%22accept-charset%22%2C%20%22windows-1252%22)%3B%7D%7D)()
(And no, there's nothing in about:config that would help.)
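
For readability, here is what the bookmarklet above does once URL-decoded, restructured as a function that takes the document as a parameter so it can be exercised outside a browser. The name forceAcceptCharset is hypothetical; in a page you would pass the global document.

```javascript
// Set accept-charset on every form in `doc` so submissions use `charset`
// regardless of the page's own encoding. Returns the number of forms touched.
function forceAcceptCharset(doc, charset) {
  const forms = doc.forms;
  for (let i = 0; i < forms.length; i++) {
    forms[i].setAttribute("accept-charset", charset);
  }
  return forms.length;
}

// In a browser: forceAcceptCharset(document, "windows-1252");
```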

I think it's very shortsighted and condescending to suggest that people should report broken webpages instead of enabling them to read them at that moment correctly via a simple tool that is already there but is for some very flimsy reasons disabled at this instance. The character encoding menu is in the browser for the situations when everything is not as it should be, so I can't understand why one case is allowed and other is not. And what purpose is served by disabling it in the first place.

To quote a classic: "...you got me in the red. And I'm just saying, I'm just saying that it's ****ing dangerous to have a race car in the ****ing red, that's all."

(In reply to nicolaibiker from comment #10)

> I think it's very shortsighted and condescending to suggest that people should report broken webpages instead of enabling them to read them at that moment correctly via a simple tool that is already there but is for some very flimsy reasons disabled at this instance.

As written, your comment isn't helpful, because it talks about the feature in the abstract without pointing to a specific Web page demonstrating a specific practical problem. If you have a specific page where the menu is disabled and enabling it would solve a practical problem, please file a new bug with the specific scenario unless the scenario is one of the following known edge cases: 1) The page is actually encoded in ISO-2022-JP but declared as UTF-8 or 2) there is a parent page declared as UTF-8 (and is valid UTF-8) and the page whose encoding you wish to override is a child document (i.e. in an iframe; this case is hard to distinguish from advertising on the browser level).

I can hardly say if it would solve the problem, since it is disabled, I can only assume.

An example is this page: https://eshop.tierraverde.cz/zmekcovac-vody

It shows fine in Chrome though. This bug is called "Cannot change encoding" and my case fits the reported behavior, so I assumed it is exactly what this problem is about. I don't see how having ten thousand bugs about one issue is going to help.

(In reply to nicolaibiker from comment #12)

> I can hardly say if it would solve the problem, since it is disabled, I can only assume.
>
> An example is this page: https://eshop.tierraverde.cz/zmekcovac-vody
>
> It shows fine in Chrome though.

It shows fine in Firefox for me. The page is encoded in UTF-8 and declares so.

What problem with the page do you see in Firefox?

I can't attach an image (or I don't know how), but almost all accented characters are shown as crossed-out rectangular boxes.

(In reply to nicolaibiker from comment #14)

> I can't attach an image (or I don't know how), but almost all accented characters are shown as crossed-out rectangular boxes.

That's not an encoding problem. That's a font problem. The site is supposed to load Source Sans Pro Regular from Google. I don't know why you are seeing a font problem. Even with a stricter content blocking list, it doesn't break for me on a computer that doesn't have Source Sans Pro Regular installed locally.

If you can identify the failure condition more clearly, it's worthwhile to file a bug about the font issue.

Anyway, in this case, letting you try stuff from the Text Encoding menu would not have solved your problem.

OK, in that case I am sorry.

I have this problem too. But I don't understand why we are talking about fonts. I don't think that is the reason a menu item is disabled. There is a post about it: https://support.mozilla.org/ru/questions/1261965

(In reply to byred from comment #17)

> I have this problem too. But I don't understand why we are talking about fonts. I don't think that is the reason a menu item is disabled. There is a post about it: https://support.mozilla.org/ru/questions/1261965

What's your Web development use case for the menu?

The menu is not for Web developers. The menu is there to help users deal with badly-authored pages. Web developers are supposed to use UTF-8, declare it (e.g. <meta charset=utf-8>), and not use the menu.

Flags: needinfo?(byred)

This bug affects me. Example website:
https://www.qtcentre.org/threads/13614-Qt-application-size-problem-it-is-very-big

In this case, a user's post on a forum was encoded with the wrong character set, resulting in the word 'exécutable' displaying as 'exÃ©cutable'. This was outside of the control of the owners of the website, and it would be unreasonable to expect them to manually correct every post by a user which was uploaded with the wrong character encoding.

I have also experienced related bugs (such as Chrome's lack of such a menu to begin with) on websites hosting content like web comics, where commentary for the comic was written with one character encoding in the past, but the site's coding changed and now encodes commentary under a different character encoding.

It could be argued that database content should have been converted when the default encoding was converted, but now content under the old encoding is many years old and it would be unreasonable to go and manually change the encoding for each post (I'm assuming web comic artists run their sites with a WYSIWYG solution of some sort, and it would be unreasonable to expect them to understand how to write code to automate this process).

I have also run into abandoned websites which report one encoding, but output another. While relatively rare, it is situations like this (and the above) which lead me to agree that the removal of the ability to override the content encoding set by the server is short-sighted and serves no good purpose.

Besides, consider this: the only time that a user actually will ever look at that menu, is if the text isn't processed correctly in some way. If the server reports a character encoding, and all the text on the page matches that encoding, then one of two possibilities will occur:

  1. All the text looks correct, and the user never touches the menu anyway.
  2. Something else is causing the text to look incorrect (or it's a matter of the text simply being other than what the user expected), and the user will try a few items in the menu, but nothing will help.

In the first case, everything's fine and there's no reason to gray it out.

In the second case, graying out the menu doesn't help the user diagnose the problem, it just causes them to report bugs like this while feeling convinced that this is the cause of the problem (even when it is not).

So. Since I have linked directly to one concrete case of this bug being valid, given examples of previous times I've encountered similar situations, AND given a logical explanation for why the decision to gray out the menu when the server specifies a character encoding...

Please undo that shortsighted change and give us that menu back, in all situations and scenarios. If you REALLY want, gray it out on Firefox internal pages like about:config and whatnot, where the character encoding is Guaranteed to be correct for all content on the page because the browser itself controls the content on the page.

(In reply to Tynach from comment #21)

> This bug affects me. Example website:
> https://www.qtcentre.org/threads/13614-Qt-application-size-problem-it-is-very-big
>
> In this case, a user's post on a forum was encoded with the wrong character set, resulting in the word 'exécutable' displaying as 'exÃ©cutable'. This was outside of the control of the owners of the website, and it would be unreasonable to expect them to manually correct every post by a user which was uploaded with the wrong character encoding.

Even if the menu had been enabled, there is no menu option that would have fixed this failure mode. The failure mode is: é encoded as UTF-8 was interpreted as windows-1252 and the result was served encoded as UTF-8. In fact, no menu option in any browser ever has supported fixing this failure mode.
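
The failure mode can be reproduced with a short sketch. It assumes Node.js (for Buffer) and uses Node's "latin1" encoding as a stand-in for windows-1252; the two agree for the bytes involved in 'é', though not for bytes 0x80-0x9F. The function names are made up for this example.

```javascript
// Take correct text, get its UTF-8 bytes, and misread each byte as a
// separate single-byte character.
function misreadUtf8AsLatin1(text) {
  return Buffer.from(text, "utf8").toString("latin1");
}

// Undoing it needs two steps in reverse: turn each character back into one
// byte, then decode those bytes as UTF-8.
function repairDoubleEncoding(garbled) {
  return Buffer.from(garbled, "latin1").toString("utf8");
}

const garbled = misreadUtf8AsLatin1("exécutable");
console.log(garbled);                       // exÃ©cutable
console.log(repairDoubleEncoding(garbled)); // exécutable

// The server then sends `garbled` as valid UTF-8. Decoding those served bytes
// with a single-byte encoding (all an override can do) does not recover the
// original text.
const served = Buffer.from(garbled, "utf8");
console.log(served.toString("latin1") === "exécutable"); // false
```

An encoding override only changes how the served bytes are decoded once; it has no step that re-encodes the text to bytes first, which is why no menu entry could fix this page.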

> I have also run into abandoned websites which report one encoding, but output another. While relatively rare, it is situations like this (and the above) which lead me to agree that the removal of the ability to override the content encoding set by the server is short-sighted and serves no good purpose.

Unlike Chrome, Firefox hasn't removed the UI for dealing with this case in general. Disabling the menu in some cases does serve two purposes:

  1. It helps avoid introducing bogus data into server-side databases. (The introduction could happen by the user first accessing the menu and then submitting a form.)
  2. It helps avoid attacks where the attacker submits user-provided content to a site and hopes that another user overrides the encoding such that a cross-site scripting attack runs.

(In reply to Henri Sivonen (:hsivonen) from comment #22)

> Even if the menu had been enabled, there is no menu option that would have fixed this failure mode. The failure mode is: é encoded as UTF-8 was interpreted as windows-1252 and the result was served encoded as UTF-8. In fact, no menu option in any browser ever has supported fixing this failure mode.

Hm, fair enough. The point still stands though: there can definitely be times when the menu would be valid to use, but grayed out.

> Unlike Chrome, Firefox hasn't removed the UI for dealing with this case in general.

Again, there are still potentially times where use of the menu is desired, but it's grayed out. Just because I can't happen to find one right at this moment, doesn't mean it doesn't exist.

> Disabling the menu in some cases does serve two purposes:
>
>   1. It helps avoid introducing bogus data into server-side databases. (The introduction could happen by the user first accessing the menu and then submitting a form.)
>   2. It helps avoid attacks where the attacker submits user-provided content to a site and hopes that another user overrides the encoding such that a cross-site scripting attack runs.

1. is potentially how the example I linked to above happened, so fair enough I suppose, but I find it more likely to be a temporary bug on the server side, or a bug in the user's browser. One thing I remember Chrome at least doing, before the option was completely removed, was resetting to the default detected encoding on every page load. This should mitigate the vast majority of possible cases of this happening.

2. is an extreme stretch, because the menu is only grayed out under certain conditions to begin with. There are tons of places where it's not grayed out, and this isn't a problem in those places. I doubt any malware would be able to successfully coax users into using it in just the right way to begin with.

These sound like reasons made up after the fact to justify the change, rather than well-thought-out arguments for why the change was made to begin with.

(In reply to Tynach from comment #23)

> The point still stands though: there can definitely be times when the menu would be valid to use, but grayed out.

There is one known such case: The content is ISO-2022-JP but the server says UTF-8.

Do you have concrete examples of other cases? (As noted, the examples presented here would not actually have been remedied by the menu even if enabled.)

> 2. is an extreme stretch, because the menu is only grayed out under certain conditions to begin with. There are tons of places where it's not grayed out, and this isn't a problem in those places.

There are cases where the menu is potentially dangerous but not grayed out. It's not grayed out in those cases, because it's hard to distinguish between useful and dangerous use in those cases. It's grayed out in the most dangerous case of this kind where it's obvious that the menu isn't going to be useful.

For example, here the encoding is wrong and I cannot change it:
https://github.com/ivrix/hspell/blob/master/extrawords.hif

(In reply to eyal gruss (eyaler) from comment #25)

> For example, here the encoding is wrong and I cannot change it:
> https://github.com/ivrix/hspell/blob/master/extrawords.hif

This is, again, a case the menu was never able to fix. GitHub is converting the file from windows-1252 to UTF-8 on the server side. The menu never had an option to perform a conversion from UTF-8 to windows-1252 and then reinterpret as windows-1255.
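
The two-step repair described above can be sketched as follows. This assumes Node.js (for Buffer); "latin1" stands in for windows-1252, which is equivalent for the byte range used here (0xE0-0xFA), and decodeHebrewLetters is a hypothetical, deliberately partial windows-1255 decoder that handles only ASCII and the Hebrew letters. The sample word is an illustration, not taken from the linked file.

```javascript
// Partial windows-1255 decoder: ASCII passes through, and bytes 0xE0-0xFA
// map to the Hebrew letters U+05D0-U+05EA. Anything else becomes U+FFFD.
function decodeHebrewLetters(bytes) {
  let out = "";
  for (const b of bytes) {
    if (b < 0x80) out += String.fromCharCode(b);
    else if (b >= 0xe0 && b <= 0xfa) out += String.fromCharCode(0x05d0 + (b - 0xe0));
    else out += "\uFFFD";
  }
  return out;
}

// `displayedText` is what the browser shows: bytes that were decoded as
// windows-1252 on the server and then re-encoded as UTF-8.
function repair(displayedText) {
  // Step 1: convert the text back to the original single bytes.
  const originalBytes = Buffer.from(displayedText, "latin1");
  // Step 2: reinterpret those bytes as windows-1255.
  return decodeHebrewLetters(originalBytes);
}

console.log(repair("ùìåí")); // שלום
```

A browser's encoding override performs only a single decode of the served bytes, so it has no equivalent of step 1.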

(In reply to eyal gruss (eyaler) from comment #25)

> For example, here the encoding is wrong and I cannot change it:
> https://github.com/ivrix/hspell/blob/master/extrawords.hif

You can click the Raw button on that page and manually switch the encoding there.

This should not be intended behavior, as a server can provide the wrong encoding. Or portions of a page could be in a different encoding. Unless Firefox gets built-in functionality to auto-fix the encoding of selected text, there should be a way to change the encoding on demand.

(In reply to regs from comment #28)

> This should not be intended behavior, as a server can provide the wrong encoding.

Unless the page uses a BOM, Firefox allows a manual override in this case.

> Or portions of a page could be in a different encoding.

While this can occur by accident, this sort of situation could also occur as an attack, which is why this case isn't treated as a use case Firefox addresses.
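
For reference, the BOM mentioned above is a fixed byte sequence at the very start of a file, so checking for one is straightforward. A minimal sketch (the function name sniffBom is hypothetical, and this is not Firefox's code):

```javascript
// Return the encoding implied by a leading byte order mark, or null.
function sniffBom(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "UTF-8";
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return "UTF-16BE";
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return "UTF-16LE";
  return null;
}

console.log(sniffBom(Uint8Array.of(0xef, 0xbb, 0xbf, 0x68, 0x69))); // UTF-8
console.log(sniffBom(Uint8Array.of(0x68, 0x69)));                   // null
```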

(In reply to Henri Sivonen (:hsivonen) from comment #24)

> (In reply to Tynach from comment #23)
>
> > 2. is an extreme stretch, because the menu is only grayed out under certain conditions to begin with. There are tons of places where it's not grayed out, and this isn't a problem in those places.
>
> There are cases where the menu is potentially dangerous but not grayed out. It's not grayed out in those cases, because it's hard to distinguish between useful and dangerous use in those cases. It's grayed out in the most dangerous case of this kind where it's obvious that the menu isn't going to be useful.

You didn't seem to really address what I said, so let's put this in terms closer to your own:

Are there any concrete cases that have ever occurred, where use of the menu caused harm? Not hypothetical harm, because just running JavaScript introduces that (and far more of it). I mean real harm, where it has actually happened and been documented, where it happened to a real actual user. I don't mean cases where it occurred in a developer's test environment to show it's theoretically possible.

We don't treat use cases and potential exploits in the same way: To prevent an exploit, we don't wait for documented harm to a user if we have a proof-of-concept attack. Yet, for use cases, I'm still asking for concrete examples.

In any case, even the feature design has moved on since your previous comment on this bug. If there's a concrete problem with the current design, please file a new bug with the URL of a page showing the problem.
