1261154 - Use "formatAbbreviatedBytes" utility from tree map in sidebar

Reporter

Description

9 years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=1238695

Includes a new abbreviation utility which can be used in the sidebar as well. This is to be done after the tree map lands.

::: devtools/client/memory/test/unit/test_utils.js
@@ +54,5 @@
> +
> +  equal(utils.formatAbbreviatedBytes(12), "12B", "Formats bytes");
> +  equal(utils.formatAbbreviatedBytes(12345), "12KiB", "Formats kilobytes");
> +  equal(utils.formatAbbreviatedBytes(12345678), "11MiB", "Formats megabytes");
> +  equal(utils.formatAbbreviatedBytes(12345678912), "11GiB", "Formats gigabytes");
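For readers without the patch from bug 1238695 at hand, the following is a rough sketch of a function that would satisfy the four test expectations quoted above. It is not the code that landed; the unit table and the truncating behaviour are assumptions inferred only from those expected strings (for example, 12345678 bytes is about 11.77 MiB and the test expects "11MiB").

```javascript
// Hypothetical sketch, inferred from the quoted test expectations only.
// It is not the implementation from bug 1238695.
const UNITS = ["B", "KiB", "MiB", "GiB"];

function formatAbbreviatedBytes(bytes) {
  let value = bytes;
  let unitIndex = 0;
  // Divide by 1024 until the value is below 1024 or the largest unit is reached.
  while (value >= 1024 && unitIndex < UNITS.length - 1) {
    value /= 1024;
    unitIndex++;
  }
  // Truncate instead of rounding, so 11.77 MiB is shown as "11MiB".
  return Math.floor(value) + UNITS[unitIndex];
}

console.log(formatAbbreviatedBytes(12345678));
```

Note that the output has no decimal places and no space between the number and the unit, which is why number formatting and unit localization come up later in this bug.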

Greg Tatum [:gregtatum]

Reporter

Updated

9 years ago

Assignee: nobody → gtatum

Greg Tatum [:gregtatum]

Reporter

Updated

9 years ago

Depends on: 1238695

Priority: -- → P2

Greg Tatum [:gregtatum]

Reporter

Comment 1

9 years ago

I'm un-assigning myself from this for right now since we are focusing on track 3 of the devtools.html project. Once we move beyond that I can look at this again.

Greg Tatum [:gregtatum]

Reporter

Updated

9 years ago

Assignee: gtatum → nobody

Greg Tatum [:gregtatum]

Reporter

Updated

9 years ago

Whiteboard: [good first bug]

Steve Chung [:steveck]

Comment 2

9 years ago

Attached patch bug-1261154.patch (obsolete)

Hi Greg, here's a quick patch that applies formatAbbreviatedBytes in the snapshot sidebar. I'm not sure whether there are other places that need to be replaced as well. BTW, unlike the original string, this new string won't have a localized size unit. Shouldn't we localize the unit inside formatAbbreviatedBytes?

Attachment #8775497 - Flags: feedback?(gtatum)

Greg Tatum [:gregtatum]

Reporter

Comment 3

9 years ago

Sorry for the delay in responding. I asked that localization question when I originally wrote formatAbbreviatedBytes, but during the discussion with other folks they deemed it unnecessary (see bug 1238695). Since then I have seen byte units like that localized in other parts of the devtools code. I think we should pull in someone from localization to get their opinion. Otherwise this looks like the correct path forward.

Greg Tatum [:gregtatum]

Reporter

Updated

9 years ago

Attachment #8775497 - Flags: feedback?(gtatum)

Steve Chung [:steveck]

Comment 4

9 years ago

Hi Francesco, do you think localizing the size unit is necessary? The memory size previously used the aggregate.mb string. Now we might switch to the new formatAbbreviatedBytes function, but it doesn't localize the size unit in the size string.

Flags: needinfo?(francesco.lodolo)

Francesco Lodolo [:flod]

Comment 5

9 years ago

I can't seem to find any discussion about l10n in bug 1238695 (but it's a long bug). I think it is important to have them localizable. For one, I don't want to use KB as English incorrectly does. We do localize them for Downloads https://hg.mozilla.org/releases/mozilla-aurora/file/default/toolkit/locales/en-US/chrome/mozapps/downloads/downloads.properties#l65 And you can get an idea of the current localizations of KB https://transvision.mozfr.org/string/?entity=toolkit/chrome/mozapps/downloads/downloads.properties:kilobyte&repo=aurora I wonder if there's anything useful in the Intl API to leverage to get those (CCing Gandalf)

Flags: needinfo?(francesco.lodolo)

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 6

9 years ago

We're working in bug 1291408 on Intl.UnitFormat that will be designed exactly for that. See the spec proposal for ECMA - https://rawgit.com/zbraniecki/proposal-intl-unit-format/master/index.html If you need it now, I'd recommend writing it in a way that is most compatible so that you can later easily switch to Intl.UnitFormat once we land it (I expect it to land this year).

Steve Chung [:steveck]

Comment 7

9 years ago

Attached patch bug-1261154.patch

Hi Greg, here's a WIP in case we want to leverage the current L10n solution for the byte format. I'm not sure whether it's fine for the rest of the tree heap part, and the test would need to be rewritten as well.

Attachment #8775497 - Attachment is obsolete: true

Attachment #8780371 - Flags: feedback?(gtatum)

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 8

9 years ago

Can you guys use the same unit abbreviations that CLDR is using[0]? It'll make it more compatible with CLDR once we start switching Firefox to it (coming with L20n). [0] http://www.unicode.org/cldr/charts/29/summary/en.html#6220

Steve Chung [:steveck]

Comment 9

9 years ago

(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #8)
> Can you guys use the same unit abbreviations that CLDR is using[0]?
>
> It'll make it more compatible with CLDR once we start switching Firefox to
> it (coming with L20n).
>
> [0] http://www.unicode.org/cldr/charts/29/summary/en.html#6220

That's the question I also want to raise: should we treat kilobyte and kibibyte [1] as different units? Since the calculation is binary, the unit should be binary too, so I kept the original KiB/MiB/GiB units in this patch. I'm not sure whether we need to keep them, because I didn't see these units elsewhere in the Firefox code base.

[1] https://en.wikipedia.org/wiki/Kibibyte
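To make the kilobyte/kibibyte distinction concrete, here is a small illustrative comparison of the decimal (SI) and binary (IEC) readings of the same byte count. The helper names are made up for this example and do not exist in the Firefox code base.

```javascript
// Illustrative helpers only; these names are assumptions for this example.
function toDecimalMegabytes(bytes) {
  return bytes / (1000 * 1000); // SI prefix: 1 MB = 10^6 bytes
}

function toBinaryMebibytes(bytes) {
  return bytes / (1024 * 1024); // IEC prefix: 1 MiB = 2^20 bytes
}

const sampleBytes = 12345678;
console.log(toDecimalMegabytes(sampleBytes).toFixed(2) + " MB");  // 12.35 MB
console.log(toBinaryMebibytes(sampleBytes).toFixed(2) + " MiB"); // 11.77 MiB
```

The gap between the two readings grows with each prefix step, so labelling a 1024-based value with a decimal unit name overstates precision the number does not have.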

Flags: needinfo?(gandalf)

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 10

9 years ago

In that case, we will not be able to replace this with CLDR because CLDR does not handle Kibibytes. My only question is if from the UX standpoint introducing a unit that users are likely not familiar with is worth the technical precision. I don't have an answer, but if you ever decide to switch to something that CLDR handles, I hope we'll have the UnitFormatter for you by that time! :)

Flags: needinfo?(gandalf)

Greg Tatum [:gregtatum]

Reporter

Comment 11

9 years ago

I do like Zibi's reasoning on the UX standpoint and like where your patch is going Steve. This originally came out of the discussion in bug 1238695 which involved Jim Blandy and Nick Fitzgerald. I'm going to NI them to get their thoughts.

Flags: needinfo?(nfitzgerald)

Flags: needinfo?(jimb)

Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8]

Comment 12

9 years ago

Mebibytes/kibibytes/etc are more correct, we should use them. CLDR should add them if it doesn't have them, but that seems like an aside to me.

Flags: needinfo?(nfitzgerald)

Jim Blandy :jimb

Comment 13

9 years ago

I don't think most people know the difference between kb (kilobits), kB (kilobytes), kib (kibibits) and kiB (kibibytes). However, I do think people who don't know the difference will easily read them correctly, if they're in a context where a memory size would make sense. In other words, I don't think it's confusing to do the right thing. So we should use B, kiB, MiB, GiB. It is not important at all to localize the abbreviations of these units. Using the IEC units is fine in all locales; that's the point of having an International Electrotechnical Commission. OTOH, getting commas versus dots right is something I have a lot more sympathy for.

Flags: needinfo?(jimb)

Greg Tatum [:gregtatum]

Reporter

Comment 14

9 years ago

Ok, the user impact does seem pretty small for confusion on the units, especially considering that the memory tool is a more technical tool. It would make sense to use the more technically correct units to remove an ambiguity of what you are looking at. So it looks like the consensus is we'll use formatAbbreviatedBytes as it stands as there are no decimal places, and just get it into the sidebar.

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 15

9 years ago

(In reply to Jim Blandy :jimb from comment #13)
> It is not important at all to localize the abbreviations of these units.

Yeah, except that not the whole world happens to be using left-to-right western arabic alphabet[0]. Do you have sympathy for that?
Also, in some locales there is a difference between singular form and plural form even for narrow variants.

> Using the IEC units is fine in all locales; that's the point of having an
> International Electrotechnical Commission.

I'm terribly sorry the complexity of the world does not fit into this picture.

> OTOH, getting commas versus dots right is something I have a lot more
> sympathy for.

Intl.NumberFormat deals with this.

[0]
http://st.unicode.org/cldr-apps/v#/ar_DZ/Digital/
http://st.unicode.org/cldr-apps/v#/fa/Digital/
http://st.unicode.org/cldr-apps/v#/he/Digital/
http://st.unicode.org/cldr-apps/v#/ur/Digital/
http://st.unicode.org/cldr-apps/v#/ru/Digital/
etc.

Steve Chung [:steveck]

Comment 16

9 years ago

Hi Gandalf, is it possible to still apply the binary units defined by the IEC, and use decimal units for other locales that have no IEC abbreviation? We can add more comments about it in the localization files.

Flags: needinfo?(gandalf)

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 17

9 years ago

If you want to stick to KiB, MiB, and GiB, I'd stick to what you are doing, which is to localize the unit formatting string. It's not optimal (until we switch to l20n, localizers won't be able to pluralize those strings), but it's a good start and allows localizers to adapt the string. You should format the number using Intl.NumberFormat (in L20n it will be done automatically).

Flags: needinfo?(gandalf)

Patrick Brosset <:pbro>

Comment 18

9 years ago

[good first bug] whiteboard -> keyword mass change

Keywords: good-first-bug

Jim Blandy :jimb

Comment 19

9 years ago

(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #15)
> (In reply to Jim Blandy :jimb from comment #13)
> > It is not important at all to localize the abbreviations of these units.
>
> Yeah, except that not the whole world happens to be using left-to-right
> western arabic alphabet[0]. Do you have sympathy for that?
> Also, in some locales there is a difference between singular form and plural
> form even for narrow variants.

We should be using Arabic numerals in all cases for this tool.

We should not be using singular or plural forms at all.

So I think the answer is no, I don't think it's important in this context.

Jim Blandy :jimb

Comment 20

9 years ago

That was a little more abrupt than I'd intended --- but comment 15 was as well. Let's take a step back. First, just to be clear what we're discussing here: this is a display of the amount of memory occupied by various categories of data, presented to web developers as part of an expert-level developer tool. This is not user-facing UI; it is developer-facing only. I think it is important for us to use the correct units here. Those units are "kiB". "KB" may be a common way to refer to 1024 bytes in popular use, but this is a technically specialized context, and we should use the industry-standard units. If there is a simple way for us to get the right units and otherwise properly localize the number, then let's do it. If it is not presently possible to use the correct units, then I think the way to maximize the quality of the tool for the largest audience is, unfortunately, to use the non-localized form.

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 21

9 years ago

(In reply to Jim Blandy :jimb from comment #19)
> We should be using Arabic numerals in all cases for this tool.

Western arabic numerals (0-9) are not used by many languages at all.

> We should not be using singular or plural forms at all.

I don't understand why you believe so. If you don't want to make this UI localizable, and you only want to display it in en-US, then you are correct. In every other scenario, I believe you are wrong.

> I think it is important for us to use the correct units here. Those units are "kiB".

I understand that. I suggested using "KB" because a) it's already localized and b) it may be easier to understand for users. The argument that it's a technical tool and it's valuable to preserve the accuracy convinces me. I'm ok with "kiB" :)

> If it is not presently possible to use the correct units, then I think the way to maximize the quality of the tool for the largest audience is, unfortunately, to use the non-localized form.

It's perfectly possible, we just need to add l10n strings that allow localizers to format the unit. I gave you the UnitFormat example so that you can shape the string templates and API after it.

Jim Blandy :jimb

Comment 22

9 years ago

(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #21)
> (In reply to Jim Blandy :jimb from comment #19)
> > We should be using Arabic numerals in all cases for this tool.
>
> Western arabic numerals (0-9) are not used by many languages at all.

I'm personally familiar with software development in Japan. Japanese users of this tool would much rather see "213kiB" than "二百十三kiB". The latter would be ridiculous and irritating to a Japanese programmer.

In these other locales, what notation would engineering and computer science students use in their textbooks and classes? I believe that if you're doing software, you're working with Arabic numerals and metric units.

Here is an example of the sort of display whose numbers we're formatting:
http://tatumcreative.github.io/memory-treemap/

The labels "Array", "Object", etc. are JavaScript terms; it is not correct to localize them, as one cannot use 行列 in JavaScript as a synonym for "Array"; the developer would be left guessing what JavaScript term is meant.

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 23

9 years ago

You are right that Japanese is not a good example of a language that requires unit localization. They don't, and as you can see above, I didn't list it in the list of examples I gave you straight from CLDR.

But assuming you believe that CLDR doesn't know what it's doing and the list of five examples of non left-to-right, non-western-arabic numerals, using different alphabet is not enough, fortunately, you can look at wikipedia.

Here's an example list of articles that don't use the combination of numbers, alphabet and directionality that would allow for en-US expression "10 kb" to work for them with transliteration of digits and translation of digits (and directions, and separators):

* https://uk.wikipedia.org/wiki/%D0%9C%D0%B5%D0%B3%D0%B0%D0%B1%D0%B0%D0%B9%D1%82
* https://ar.wikipedia.org/wiki/%D9%85%D9%8A%D9%82%D8%A7%D8%A8%D8%A7%D9%8A%D8%AA
* https://fa.wikipedia.org/wiki/%D9%85%DA%AF%D8%A7%D8%A8%D8%A7%DB%8C%D8%AA
* https://ru.wikipedia.org/wiki/%D0%9C%D0%B5%D0%B3%D0%B0%D0%B1%D0%B0%D0%B9%D1%82
* https://mk.wikipedia.org/wiki/%D0%9C%D0%B5%D0%B3%D0%B0%D0%B1%D0%B0%D1%98%D1%82
* https://bn.wikipedia.org/wiki/%E0%A6%AE%E0%A7%87%E0%A6%97%E0%A6%BE%E0%A6%AC%E0%A6%BE%E0%A6%87%E0%A6%9F
* https://ko.wikipedia.org/wiki/%EB%A9%94%EA%B0%80%EB%B0%94%EC%9D%B4%ED%8A%B8
* https://hy.wikipedia.org/wiki/%D5%84%D5%A5%D5%A3%D5%A1%D5%A2%D5%A1%D5%B5%D5%A9
* https://ka.wikipedia.org/wiki/%E1%83%9B%E1%83%94%E1%83%92%E1%83%90%E1%83%91%E1%83%90%E1%83%98%E1%83%A2%E1%83%98
* https://mr.wikipedia.org/wiki/%E0%A4%AE%E0%A5%87%E0%A4%97%E0%A4%BE%E0%A4%AC%E0%A4%BE%E0%A4%88%E0%A4%9F
* https://ta.wikipedia.org/wiki/%E0%AE%AE%E0%AF%86%E0%AE%95%E0%AE%BE%E0%AE%AA%E0%AF%88%E0%AE%9F%E0%AF%8D%E0%AE%9F%E0%AF%81
* https://th.wikipedia.org/wiki/%E0%B9%80%E0%B8%A1%E0%B8%81%E0%B8%B0%E0%B9%84%E0%B8%9A%E0%B8%95%E0%B9%8C

Jim Blandy :jimb

Comment 24

9 years ago

Looking through that list, it seems that every locale other than Macedonian actually admits the use of KiB, MiB, and GiB. Score one for the IEC! (It makes me wonder whether the Macedonian page is correct...) Zibi brings up the issue of LTR/RTL differences, and non-Arabic numerals, and pluralization, but nothing in this bug up to this point has really considered those broader questions of numeric formatting --- the original question at hand was just about the unit names. I don't know if we want to block the simpler, immediate question on the broader question with the more involved answer.

Jim Blandy :jimb

Comment 25

9 years ago

Sorry, in this context especially, I should use the word "locale" advisedly. What I meant was, it seems that these Wikipedia pages translated into different languages all (except for Macedonian) say that the units Greg's original code was using were actually acceptable.

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 26

9 years ago

> Looking through that list, it seems that every locale other than Macedonian actually admits the use of KiB, MiB, and GiB. Score one for the IEC

I'm not sure if we see the same results, so let me be more explicit:

In Ukrainian, the unit is "кБ", which matches data from CLDR: http://www.unicode.org/cldr/charts/29/summary/uk.html#6834

In Arabic (and Farsi), the unit is "كيلوبايت", which matches data from CLDR: http://www.unicode.org/cldr/charts/29/summary/ar.html#8021
Yes, that means that in Arabic, we will not be using the short form, because it's not recommended by CLDR/ICU/Unicode.

In Armenian, the unit is "կԲ" which matches data from CLDR: http://www.unicode.org/cldr/charts/29/summary/hy.html#5448

And so on.

> Zibi brings up the issue of LTR/RTL differences, and non-Arabic numerals, and pluralization, but nothing in this bug up to this point has really considered those broader questions of numeric formatting --- the original question at hand was just about the unit names.

Correct, but I don't believe you can answer one without the other. If you want to display latin characters and unit name ("KiB") without localization, you will have to match it with western-arabic numerals and left-to-right.

> I should use the word "locale" advisedly. What I meant was, it seems that these Wikipedia pages translated into different languages all (except for Macedonian) say that the units Greg's original code was using were actually acceptable.

Can you point out which fragments of those articles indicate what you are claiming they do?
Just to be clear - the presence of IEC units in tables is a reference, not the unit used by that language.

Jim Blandy :jimb

Comment 27

9 years ago

(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #26)
> Can you point out which fragments of those articles indicate what you are
> claiming they do?
> Just to be clear - the presence of IEC units in tables is a reference, not
> the unit used by that language.

Well, maybe I'm misreading them. I was assuming that if the Wikipedia editors in those languages were citing the IEC units, then that implied that a technically-oriented audience in that language would be familiar with them.

Either way, you've persuaded me that using the CLDR is the right thing in the long term.

You mentioned in email that CLDR 30 is frozen, and it will take another 6-7 months to get the binary base prefixes into the next version, and suggested that we use ordinary localized string templates whose forms imitate those used by the extant decimal base prefixes in the CLDR (e.g. "{0} KiB", imitating "{0} KB"). That seems reasonable to me.
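As a rough illustration of where the discussion ended up, the sketch below combines the two suggestions from this bug: unit templates of the form "{0} KiB" that localizers can override, and Intl.NumberFormat for the number itself. The template table, the function name formatLocalizedBytes, and the way overrides are passed in are all assumptions made for this example; in the real tree the strings would presumably live in a localization file rather than in code.

```javascript
// Hypothetical default templates, shaped after the "{0} KB" pattern in CLDR.
const UNIT_TEMPLATES = {
  B: "{0} B",
  KiB: "{0} KiB",
  MiB: "{0} MiB",
  GiB: "{0} GiB",
};
const UNIT_ORDER = ["B", "KiB", "MiB", "GiB"];

// Assumed helper, not an existing DevTools API.
function formatLocalizedBytes(bytes, locale, templates = UNIT_TEMPLATES) {
  let value = bytes;
  let index = 0;
  // Binary prefixes: step up one unit for every factor of 1024.
  while (value >= 1024 && index < UNIT_ORDER.length - 1) {
    value /= 1024;
    index++;
  }
  // Let Intl.NumberFormat pick digits and separators for the locale.
  const number = new Intl.NumberFormat(locale, {
    maximumFractionDigits: 0,
  }).format(Math.floor(value));
  // Substitute the formatted number into the (possibly localized) template.
  return templates[UNIT_ORDER[index]].replace("{0}", number);
}

console.log(formatLocalizedBytes(12345678, "en-US"));
```

A localizer could then supply a different template for a unit, including a different position for the number, without any change to the arithmetic.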

Greg Tatum [:gregtatum]

Reporter

Updated

8 years ago

Attachment #8780371 - Flags: feedback?(gtatum) → feedback+

Greg Tatum [:gregtatum]

Reporter

Updated

7 years ago

Keywords: good-first-bug

Whiteboard: [good first bug]

BMO Automation

Updated

7 years ago

Product: Firefox → DevTools

BMO Automation

Updated

2 years ago

Severity: normal → S3

bug-1261154.patch, 9 years ago, Steve Chung [:steveck], 1.86 KB, patch
bug-1261154.patch, 9 years ago, Steve Chung [:steveck], 3.81 KB, patch, gregtatum: feedback+