Closed
Bug 860727
Opened 11 years ago
Closed 7 years ago
mbslen implementation seems brittle; should be simplified
Categories
(Core :: WebRTC: Networking, defect)
Core
WebRTC: Networking
Tracking
()
RESOLVED
FIXED
mozilla57
Tracking | Status | |
---|---|---|
firefox57 | --- | fixed |
backlog | parking-lot |
People
(Reporter: hsivonen, Assigned: hsivonen)
Details
(Whiteboard: [WebRTC] [blocking-webrtc-])
Attachments
(2 files, 1 obsolete file)
Follow-up for bug 838319: The STUN protocol exhibits a strange design that limits the length of UTF-8-encoded values in terms of the number of characters. (See e.g. https://tools.ietf.org/html/rfc5389#section-15.7 . Interestingly, the spec notes that the having 128 as the maximum number of characters means that the string can be up to 763 bytes long, which suggests that the spec is written with the assumption that UTF-8 sequences outside the 21-bit Unicode range might be used. It's also rather curious to make the number of characters a power of two as opposed to making the number of bytes a power of two. Well, the RFC is what it is.) The code for checking for this is full of #ifdefs and uses a library function whose behavior is sensitive to the activity C library locale. (http://mxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/util/mbslen.c#53) This seems brittle and over complex. I think the mbslen function should be rewritten so that it doesn't call to external libraries but counts the number of characters in a UTF-8 string itself. If it's safe to assume that UTF-8 string is a valid UTF-8, the task is very easy: loop over the string until NUL and count all non-continuation bytes--that is, count all bytes whose bit pattern is not 10xxxxxx.
Assignee | ||
Comment 1•11 years ago
|
||
I didn't try even compiling this, let alone check whether the input can really be assumed to be valid, because WebRTC doesn't compile for me right now due to bug 860222.
Comment 2•11 years ago
|
||
Note (per IRC) that we should not assume the input to be valid, and that the STUN spec requires these be UTF-8 strings of a specified length or less (so "fits in the buffer" could violate the spec as the buffer is sized for the worst-case UTF-8 string (per comment 0). Perhaps make the "validate UTF8 string/length" an (optional) external to this library and in our use delegate that to our existing UTF8 handling code in C++.
Whiteboard: [WebRTC] [blocking-webrtc-]
Updated•9 years ago
|
backlog: --- → parking-lot
Assignee | ||
Comment 3•7 years ago
|
||
Still hoping to remove the setlocale() call here. (In reply to Randell Jesup [:jesup] from comment #2) > Note (per IRC) that we should not assume the input to be valid I take it that the function should return R_BAD_DATA in that case. > Perhaps make the "validate UTF8 string/length" an (optional) external to > this library and in our use delegate that to our existing UTF8 handling code > in C++. How about the following? If some C preprocessor define indicating that the code is being built as part of m-c is set, validate UTF-8 using encoding_rs. If validation fails, return R_BAD_DATA. If validation succeeds, count the leads bytes per the currently-attached patch. What would the appropriate preprocessor define to check be?
Flags: needinfo?(rjesup)
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #736258 -
Attachment is obsolete: true
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 7•7 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #3) > What would the appropriate preprocessor define to check be? MOZILLA_VERSION probably.
Flags: needinfo?(rjesup)
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8904429 -
Flags: review?(rjesup)
Comment 10•7 years ago
|
||
mozreview-review |
Comment on attachment 8904429 [details] Bug 860727 - Implement UTF-8 code unit counting without setlocale(). https://reviewboard.mozilla.org/r/176266/#review181426 Forwarding review to bwc, who might want to check with ekr about how to embed mozilla-specific code in nICEr. "normally" this would be done by some sort of API point to install a callback for an alternate to the inline code, but perhaps it's ok to have in there ifdefed. (and it's simpler...)
Attachment #8904429 -
Flags: review?(rjesup)
Updated•7 years ago
|
Attachment #8904429 -
Flags: review?(docfaraday)
Comment 11•7 years ago
|
||
mozreview-review |
Comment on attachment 8904429 [details] Bug 860727 - Implement UTF-8 code unit counting without setlocale(). https://reviewboard.mozilla.org/r/176266/#review181446 So, given that we call this function in only one place, and just let failures slide when they happen (https://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/stun/stun_codec.c#228), I think it would make sense to write a count_utf8_chars_no_validation function and put it in stun_codec.c, use that instead of mbslen, and either leave mbslen.c alone or remove it from the build.
Attachment #8904429 -
Flags: review?(docfaraday) → review-
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 14•7 years ago
|
||
Upstream PR: https://github.com/resiprocate/nICEr/pull/1
Assignee | ||
Updated•7 years ago
|
Attachment #8904429 -
Flags: review?(docfaraday)
Attachment #8904895 -
Flags: review?(docfaraday)
Assignee | ||
Comment 15•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8904429 [details] Bug 860727 - Implement UTF-8 code unit counting without setlocale(). https://reviewboard.mozilla.org/r/176266/#review181446 Created a function that counts under the assumption of UTF-8 validity. Removed mbslen.c/h, because dead-code files are confusing and the m-c import of nICEr already omits some upstream files. Created a corresponding PR for the upstream repo.
Comment 16•7 years ago
|
||
mozreview-review |
Comment on attachment 8904429 [details] Bug 860727 - Implement UTF-8 code unit counting without setlocale(). https://reviewboard.mozilla.org/r/176266/#review181762
Attachment #8904429 -
Flags: review+
Comment 17•7 years ago
|
||
mozreview-review |
Comment on attachment 8904895 [details] Bug 860727 clean-up - Remove mbslen.c/h from the tree. https://reviewboard.mozilla.org/r/176676/#review181768 ::: media/mtransport/third_party/nICEr/IMPORT_FILES:71 (Diff revision 1) > ./src/util/cb_args.c > ./src/util/cb_args.h > ./src/util/ice_util.c > ./src/util/ice_util.h > - ./src/util/mbslen.c > - ./src/util/mbslen.h > + #./src/util/mbslen.c > + #./src/util/mbslen.h Let's actually remove these. ::: media/mtransport/third_party/nICEr/nicer.gyp:122 (Diff revision 1) > - "./src/util/mbslen.c", > - "./src/util/mbslen.h", > + #"./src/util/mbslen.c", > + #"./src/util/mbslen.h", Same.
Attachment #8904895 -
Flags: review?(docfaraday) → review+
Comment hidden (mozreview-request) |
Assignee | ||
Comment 19•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8904895 [details] Bug 860727 clean-up - Remove mbslen.c/h from the tree. https://reviewboard.mozilla.org/r/176676/#review181768 > Let's actually remove these. Removed > Same. Removed. Thanks.
Comment 20•7 years ago
|
||
Pushed by hsivonen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7f6cfb32a16e Implement UTF-8 code unit counting without setlocale(). r=bwc https://hg.mozilla.org/integration/autoland/rev/6a66de2d1901 clean-up - Remove mbslen.c/h from the tree. r=bwc
Comment 21•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/7f6cfb32a16e https://hg.mozilla.org/mozilla-central/rev/6a66de2d1901
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox57:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in
before you can comment on or make changes to this bug.
Description
•