Closed
Bug 179264
Opened 22 years ago
Closed 22 years ago
csv output for the buglist (see url inside) is invalid (too many "s)
Categories
(Bugzilla :: Query/Bug List, defect, P2)
Tracking
()
RESOLVED
FIXED
Bugzilla 2.18
People
(Reporter: mcsmurf, Assigned: bugreport)
References
()
Details
Attachments
(2 files, 2 obsolete files)
898 bytes,
patch
|
Details | Diff | Splinter Review | |
559 bytes,
patch
|
bugreport
:
review+
|
Details | Diff | Splinter Review |
it doesn't follow the rules for "s, when you store the file as CSV; it makes in
HTML
178933 nor -- PC justdave@netscape.com NEW create user error includes "Content-
type: text/html"
and in CSV it gets 178933,normal,--,PC,justdave@netscape.com,NEW,,create user
error includes ""Content-type: text/html",
You see, in HTML it is 2 "s and in CSV it is 3 "s. But thats wrong because the
http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm lists the rules as:
#A field that contains double quote characters must be surounded by double-
#quotes, and the embedded double-quotes must each be represented by a pair of
#consecutive double quotes.
the other rule the csv output probably breaks is:
# Fields with embedded commas must be delimited with double-quote characters.
Reporter | ||
Updated•22 years ago
|
Summary: csv output for the buglist (see url inside) is invalid (too many "s and ) → csv output for the buglist (see url inside) is invalid (too many "s)
![]() |
Assignee | |
Comment 1•22 years ago
|
||
CSV output for the buglist also fails to properly quote or escape "," in the bug
summary.
The same fix should surround the entire field with double-quotes and escape any
double-quotes that are within it.
Reporter | ||
Comment 2•22 years ago
|
||
yep, have made a seperate bug on that, Bug 179265 (sure releated to this)
![]() |
Assignee | |
Comment 3•22 years ago
|
||
Changes CSV filter to quote any string that has " or , in it and use a
double-quote to escape any double-quotes that appear in the string.
![]() |
Assignee | |
Comment 4•22 years ago
|
||
Taking assignment
Assignee: endico → bugreport
OS: Windows 2000 → All
Priority: -- → P2
Hardware: PC → All
Target Milestone: --- → Bugzilla 2.18
Version: unspecified → 2.17.1
![]() |
Assignee | |
Updated•22 years ago
|
Attachment #105718 -
Flags: review?
![]() |
||
Comment 5•22 years ago
|
||
Comment on attachment 105718 [details] [diff] [review]
Patch
r=gerv; but please use cvs diff -u, because it makes patches much, much easier
to read :-)
Gerv
Attachment #105718 -
Flags: review? → review+
![]() |
||
Comment 6•22 years ago
|
||
*** Bug 179265 has been marked as a duplicate of this bug. ***
![]() |
Assignee | |
Comment 7•22 years ago
|
||
I'm going to update this..... from IRC...
In general, you can escape a string for display in MS Excel CSV format
by using this algorithm
if string contains separator-character or double-quote or line-break
replace each double quote in string by 2 double quotes
replace each line-break in string by one LF character
wrap string in duoble quotes
else
leave string as is
Comment 8•22 years ago
|
||
Comment on attachment 105718 [details] [diff] [review]
Patch
needs-work.
This needs to surround the value with double-quotes if the value contains
linebreaks, too.
Attachment #105718 -
Flags: review-
![]() |
Assignee | |
Comment 9•22 years ago
|
||
Still no good example of a line-break, but aggressively quoting everything.
Attachment #105718 -
Attachment is obsolete: true
![]() |
Assignee | |
Updated•22 years ago
|
Attachment #105734 -
Flags: review?
Comment 10•22 years ago
|
||
Comment on attachment 105734 [details] [diff] [review]
patch - quotes everything
tested on landfill, with bugs in the list "loaded" with the stuff that broke it
before... Excel likes it. :-)
r=justdave
a=justdave
Attachment #105734 -
Flags: review? → review+
![]() |
Assignee | |
Comment 11•22 years ago
|
||
Checking in globals.pl;
/cvsroot/mozilla/webtools/bugzilla/globals.pl,v <-- globals.pl
new revision: 1.215; previous revision: 1.214
done
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
![]() |
||
Comment 12•22 years ago
|
||
I don't think we should be quoting values unless they contain quotes or commas
(or linebreaks; but would that ever happen?). The current solution does work,
admittedly, but may cause some spreadsheets to assume our numbers are strings,
when they aren't.
This could be fixed merely by reinstating the check, rather than quoting
unconditionally.
Gerv
![]() |
Assignee | |
Comment 13•22 years ago
|
||
I haven't found any documents or reasonably current examples where quoting a
number causes it to be handled differently.
Perhaps a subsequent patch should identify pure-numeric values and skip quoting
those, while leaving anything other than ^[0-9\.]*$ quoted.
![]() |
||
Comment 14•22 years ago
|
||
Is there any reason why we can't use the original (albeit buggily implemented)
algorithm, that if it contains a quote or a comma, then quote it, otherwise don't?
Gerv
![]() |
Assignee | |
Comment 15•22 years ago
|
||
We could use the corrected original, that will actually quote if there are
quotes or commas in it AND escape ALL double-quotes. However, there was an
insistance on handline newlines, etc...
The original would fail with commas as well as anything with more than one
double-quote.
![]() |
||
Comment 16•22 years ago
|
||
> The original would fail with commas as well as anything with more than one
> double-quote.
Yes, hence "buggily implemented."
> We could use the corrected original, that will actually quote if there are
> quotes or commas in it AND escape ALL double-quotes. However, there was an
> insistance on handline newlines, etc...
Then we can add newlines to " and , in the regexp. Not hard. Anyway, as far as I
can see, no values we'd be displaying in buglists have embedded newlines.
Gerv
![]() |
||
Comment 17•22 years ago
|
||
Trivial patch to change the quoting to happen only when necessary.
Gerv
![]() |
||
Comment 18•22 years ago
|
||
Comment on attachment 106291 [details] [diff] [review]
Patch v.1
Joel - could you please sign off on this?
Gerv
Attachment #106291 -
Flags: review?(bugreport)
![]() |
Assignee | |
Comment 19•22 years ago
|
||
Comment on attachment 106291 [details] [diff] [review]
Patch v.1
This needs to handle leading or trailing whitespace differently. The way this
would work right now, it would be inconsistent.
If I had
"This is a bug ending in whitespace " in a string, the whitespace would be
trimmed.
If I has "However, this one gets preserved " the whitespace would be
presserved.
This should be consistent.
The consensus on the IRC channel is that the whitespace should never be trimmed
rather than always trimmed.
Attachment #106291 -
Flags: review?(bugreport) → review-
![]() |
||
Comment 20•22 years ago
|
||
Er, joel... which part of this function does any whitespace trimming at all?
Gerv
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
![]() |
Assignee | |
Comment 21•22 years ago
|
||
I should have been clearer on the comment....
If a string has leading or trailing whitespace, it is lost in CSV format unless
the string is quoted. If we quote everything, then the whitespaces is always
preserved. If we quote only strings with certain characters, then strings that
contain those characters will preserve whitespace while strings not conatining
those characters will have the whitespace lost.
There was a fair amount of discussion about this on IRC and the consensus was
that the leading/trailing whitespace should be preserved. This implies that, if
we are not quoting everything, that we still quote any string with
leading/trailing whitespace.
![]() |
||
Comment 22•22 years ago
|
||
> If a string has leading or trailing whitespace, it is lost in CSV format unless
> the string is quoted.
Where is that written? :-) According to a book on good programming style I read,
which had a CSV parser as an example of a fuzzy problem, there is no central
definition of the format, and implementatations vary.
If we don't quote a CSV value with leading/trailing whitespace, and whatever
application reads it decides to discard that whitespace, it's their problem. IMO :-)
Gerv
![]() |
Assignee | |
Comment 23•22 years ago
|
||
The reference in the initial comment states that. It is true, though, that
there is no such thing as an authoritative reference. I still think we should
either trim all leadin/trailing whitespace or quote anything that has it.
Inconsistenacy stinks.
![]() |
||
Comment 24•22 years ago
|
||
Why is it inconsistent to say:
"Any value with a quote or comma gets quoted, any value without one doesn't"?
Even if whitespace is discarded by a given consumer of CSV, who is to say that
they won't do that step _after_ the remove the quotes and unescape the string?
Then both forms would have their whitespace discarded.
http://ostermiller.org/utils/CSVParser.java.html (Java CSV parser):
"* If field includes a comma or a new line, the whole field must be surrounded
with double quotes."
That's the only documentation I can find on the web, but it fits with my book
and my understanding.
On a side note, I note from this page:
http://ostermiller.org/utils/CSVLexer.html
that we are using the Excel standard for escaping CSV, rather than the standard
standard. What should we do about that?
Gerv
![]() |
Assignee | |
Comment 25•22 years ago
|
||
Well, if I ask OpenOffice to generate a CSV, it quotes any non-numeric, non-null
field. Personally, I think that is the reasonable thing to do.
"No space here","Trailing space on this "," Leading space here","More
text","Hmm... This, that, theother",,12345.67
![]() |
||
Comment 26•22 years ago
|
||
OK, I'll buy that.
if ($x !~ /^\d*$/) {
$x = "\"$x\"";
}
?
Gerv
![]() |
Assignee | |
Comment 27•22 years ago
|
||
probably ...
$x !~ /^-?\d+\.?\d*$/;
So negative numbers and non-integers don't get quoted.
![]() |
||
Comment 28•22 years ago
|
||
Here's v.2. A slight change from Joel's excellent suggestion, to make sure it
still matches the empty string.
Gerv
Attachment #106291 -
Attachment is obsolete: true
![]() |
Assignee | |
Comment 29•22 years ago
|
||
Comment on attachment 106470 [details] [diff] [review]
Patch v.2
r=joel
Attachment #106470 -
Flags: review+
![]() |
||
Comment 30•22 years ago
|
||
Dave - can I get an a= for this?
Gerv
Status: REOPENED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
Comment 32•22 years ago
|
||
a=justdave
![]() |
||
Comment 33•22 years ago
|
||
Really fixed this time.
Checking in globals.pl;
/cvsroot/mozilla/webtools/bugzilla/globals.pl,v <-- globals.pl
new revision: 1.216; previous revision: 1.215
done
Gerv
Status: REOPENED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
Updated•12 years ago
|
QA Contact: matty_is_a_geek → default-qa
You need to log in
before you can comment on or make changes to this bug.
Description
•