Closed
Bug 179264
Opened 22 years ago
Closed 22 years ago
csv output for the buglist (see url inside) is invalid (too many "s)
Categories
(Bugzilla :: Query/Bug List, defect, P2)
Tracking
()
RESOLVED
FIXED
Bugzilla 2.18
People
(Reporter: mcsmurf, Assigned: bugreport)
References
()
Details
Attachments
(2 files, 2 obsolete files)
898 bytes,
patch
|
Details | Diff | Splinter Review | |
559 bytes,
patch
|
bugreport
:
review+
|
Details | Diff | Splinter Review |
it doesn't follow the rules for "s, when you store the file as CSV; it makes in HTML 178933 nor -- PC justdave@netscape.com NEW create user error includes "Content- type: text/html" and in CSV it gets 178933,normal,--,PC,justdave@netscape.com,NEW,,create user error includes ""Content-type: text/html", You see, in HTML it is 2 "s and in CSV it is 3 "s. But thats wrong because the http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm lists the rules as: #A field that contains double quote characters must be surounded by double- #quotes, and the embedded double-quotes must each be represented by a pair of #consecutive double quotes. the other rule the csv output probably breaks is: # Fields with embedded commas must be delimited with double-quote characters.
Reporter | ||
Updated•22 years ago
|
Summary: csv output for the buglist (see url inside) is invalid (too many "s and ) → csv output for the buglist (see url inside) is invalid (too many "s)
Assignee | ||
Comment 1•22 years ago
|
||
CSV output for the buglist also fails to properly quote or escape "," in the bug summary. The same fix should surround the entire field with double-quotes and escape any double-quotes that are within it.
Reporter | ||
Comment 2•22 years ago
|
||
yep, have made a seperate bug on that, Bug 179265 (sure releated to this)
Assignee | ||
Comment 3•22 years ago
|
||
Changes CSV filter to quote any string that has " or , in it and use a double-quote to escape any double-quotes that appear in the string.
Assignee | ||
Comment 4•22 years ago
|
||
Taking assignment
Assignee: endico → bugreport
OS: Windows 2000 → All
Priority: -- → P2
Hardware: PC → All
Target Milestone: --- → Bugzilla 2.18
Version: unspecified → 2.17.1
Assignee | ||
Updated•22 years ago
|
Attachment #105718 -
Flags: review?
Comment 5•22 years ago
|
||
Comment on attachment 105718 [details] [diff] [review] Patch r=gerv; but please use cvs diff -u, because it makes patches much, much easier to read :-) Gerv
Attachment #105718 -
Flags: review? → review+
Comment 6•22 years ago
|
||
*** Bug 179265 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 7•22 years ago
|
||
I'm going to update this..... from IRC... In general, you can escape a string for display in MS Excel CSV format by using this algorithm if string contains separator-character or double-quote or line-break replace each double quote in string by 2 double quotes replace each line-break in string by one LF character wrap string in duoble quotes else leave string as is
Comment 8•22 years ago
|
||
Comment on attachment 105718 [details] [diff] [review] Patch needs-work. This needs to surround the value with double-quotes if the value contains linebreaks, too.
Attachment #105718 -
Flags: review-
Assignee | ||
Comment 9•22 years ago
|
||
Still no good example of a line-break, but aggressively quoting everything.
Attachment #105718 -
Attachment is obsolete: true
Assignee | ||
Updated•22 years ago
|
Attachment #105734 -
Flags: review?
Comment 10•22 years ago
|
||
Comment on attachment 105734 [details] [diff] [review] patch - quotes everything tested on landfill, with bugs in the list "loaded" with the stuff that broke it before... Excel likes it. :-) r=justdave a=justdave
Attachment #105734 -
Flags: review? → review+
Assignee | ||
Comment 11•22 years ago
|
||
Checking in globals.pl; /cvsroot/mozilla/webtools/bugzilla/globals.pl,v <-- globals.pl new revision: 1.215; previous revision: 1.214 done
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 12•22 years ago
|
||
I don't think we should be quoting values unless they contain quotes or commas (or linebreaks; but would that ever happen?). The current solution does work, admittedly, but may cause some spreadsheets to assume our numbers are strings, when they aren't. This could be fixed merely by reinstating the check, rather than quoting unconditionally. Gerv
Assignee | ||
Comment 13•22 years ago
|
||
I haven't found any documents or reasonably current examples where quoting a number causes it to be handled differently. Perhaps a subsequent patch should identify pure-numeric values and skip quoting those, while leaving anything other than ^[0-9\.]*$ quoted.
Comment 14•22 years ago
|
||
Is there any reason why we can't use the original (albeit buggily implemented) algorithm, that if it contains a quote or a comma, then quote it, otherwise don't? Gerv
Assignee | ||
Comment 15•22 years ago
|
||
We could use the corrected original, that will actually quote if there are quotes or commas in it AND escape ALL double-quotes. However, there was an insistance on handline newlines, etc... The original would fail with commas as well as anything with more than one double-quote.
Comment 16•22 years ago
|
||
> The original would fail with commas as well as anything with more than one > double-quote. Yes, hence "buggily implemented." > We could use the corrected original, that will actually quote if there are > quotes or commas in it AND escape ALL double-quotes. However, there was an > insistance on handline newlines, etc... Then we can add newlines to " and , in the regexp. Not hard. Anyway, as far as I can see, no values we'd be displaying in buglists have embedded newlines. Gerv
Comment 17•22 years ago
|
||
Trivial patch to change the quoting to happen only when necessary. Gerv
Comment 18•22 years ago
|
||
Comment on attachment 106291 [details] [diff] [review] Patch v.1 Joel - could you please sign off on this? Gerv
Attachment #106291 -
Flags: review?(bugreport)
Assignee | ||
Comment 19•22 years ago
|
||
Comment on attachment 106291 [details] [diff] [review] Patch v.1 This needs to handle leading or trailing whitespace differently. The way this would work right now, it would be inconsistent. If I had "This is a bug ending in whitespace " in a string, the whitespace would be trimmed. If I has "However, this one gets preserved " the whitespace would be presserved. This should be consistent. The consensus on the IRC channel is that the whitespace should never be trimmed rather than always trimmed.
Attachment #106291 -
Flags: review?(bugreport) → review-
Comment 20•22 years ago
|
||
Er, joel... which part of this function does any whitespace trimming at all? Gerv
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 21•22 years ago
|
||
I should have been clearer on the comment.... If a string has leading or trailing whitespace, it is lost in CSV format unless the string is quoted. If we quote everything, then the whitespaces is always preserved. If we quote only strings with certain characters, then strings that contain those characters will preserve whitespace while strings not conatining those characters will have the whitespace lost. There was a fair amount of discussion about this on IRC and the consensus was that the leading/trailing whitespace should be preserved. This implies that, if we are not quoting everything, that we still quote any string with leading/trailing whitespace.
Comment 22•22 years ago
|
||
> If a string has leading or trailing whitespace, it is lost in CSV format unless
> the string is quoted.
Where is that written? :-) According to a book on good programming style I read,
which had a CSV parser as an example of a fuzzy problem, there is no central
definition of the format, and implementatations vary.
If we don't quote a CSV value with leading/trailing whitespace, and whatever
application reads it decides to discard that whitespace, it's their problem. IMO :-)
Gerv
Assignee | ||
Comment 23•22 years ago
|
||
The reference in the initial comment states that. It is true, though, that there is no such thing as an authoritative reference. I still think we should either trim all leadin/trailing whitespace or quote anything that has it. Inconsistenacy stinks.
Comment 24•22 years ago
|
||
Why is it inconsistent to say: "Any value with a quote or comma gets quoted, any value without one doesn't"? Even if whitespace is discarded by a given consumer of CSV, who is to say that they won't do that step _after_ the remove the quotes and unescape the string? Then both forms would have their whitespace discarded. http://ostermiller.org/utils/CSVParser.java.html (Java CSV parser): "* If field includes a comma or a new line, the whole field must be surrounded with double quotes." That's the only documentation I can find on the web, but it fits with my book and my understanding. On a side note, I note from this page: http://ostermiller.org/utils/CSVLexer.html that we are using the Excel standard for escaping CSV, rather than the standard standard. What should we do about that? Gerv
Assignee | ||
Comment 25•22 years ago
|
||
Well, if I ask OpenOffice to generate a CSV, it quotes any non-numeric, non-null field. Personally, I think that is the reasonable thing to do. "No space here","Trailing space on this "," Leading space here","More text","Hmm... This, that, theother",,12345.67
Comment 26•22 years ago
|
||
OK, I'll buy that. if ($x !~ /^\d*$/) { $x = "\"$x\""; } ? Gerv
Assignee | ||
Comment 27•22 years ago
|
||
probably ... $x !~ /^-?\d+\.?\d*$/; So negative numbers and non-integers don't get quoted.
Comment 28•22 years ago
|
||
Here's v.2. A slight change from Joel's excellent suggestion, to make sure it still matches the empty string. Gerv
Attachment #106291 -
Attachment is obsolete: true
Assignee | ||
Comment 29•22 years ago
|
||
Comment on attachment 106470 [details] [diff] [review] Patch v.2 r=joel
Attachment #106470 -
Flags: review+
Comment 30•22 years ago
|
||
Dave - can I get an a= for this? Gerv
Status: REOPENED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
Comment 32•22 years ago
|
||
a=justdave
Comment 33•22 years ago
|
||
Really fixed this time. Checking in globals.pl; /cvsroot/mozilla/webtools/bugzilla/globals.pl,v <-- globals.pl new revision: 1.216; previous revision: 1.215 done Gerv
Status: REOPENED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
Updated•12 years ago
|
QA Contact: matty_is_a_geek → default-qa
You need to log in
before you can comment on or make changes to this bug.
Description
•