Open Bug 736951 Opened 12 years ago Updated 4 years ago

Long freeze, high CPU displaying/loading message with an attached EXCEL XML-file (single line XML file of 759 KB)

Categories

(Thunderbird :: Message Reader UI, defect)

11 Branch
defect
Not set
critical

Tracking

(Not tracked)

People

(Reporter: needlenight, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: perf, testcase, Whiteboard: [do not dup to core bug 197956])

Attachments

(3 files, 2 obsolete files)

User Agent: Opera/9.80 (Windows NT 6.1; U; ru) Presto/2.10.229 Version/11.61

Steps to reproduce:

I received a message with an attached Excel XML-specific file.


Actual results:

When I click on (or open) a particular message with an attached Excel XML file, Thunderbird (all versions) hangs completely while previewing it.
Microsoft Outlook displays the preview with no problems.


Expected results:

Thunderbird should be allowed to save the attachment without problems
Severity: normal → critical
Attachment #607085 - Attachment mime type: application/octet-stream → application/x-zip
I didn't crash loading the attachment. It was slow slow slow but didn't crash. Reporter can you give us a crash ID (see http://support.mozillamessaging.com/en-US/kb/thunderbird-crashes?s=crash+id) ?
Keywords: crash
As Ludovic Hirlimann says, Tb 11.0 doesn't crash. It merely takes a very long time to show the text/xml data, using 100% of a CPU, if "View/Display Attachments Inline" is enabled.

The attached XML data is a single line 1,257,429 bytes long.
When [CRLF] (0x0D0A) was inserted appropriately (every "/>" was changed to "/>[CRLF]"), it didn't take so long to show the XML file, although it still isn't quick because it is a 1.2MB XML text file.
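The line-breaking workaround described above can be sketched as follows. This is only a minimal illustration, not the procedure actually used for the test; the function name and the sample string are hypothetical. Note that a plain string replace would also touch any "/>" that happens to occur inside an attribute value or CDATA section, so it is only suitable for quick experiments.

```python
def break_after_empty_tags(xml_text: str, newline: str = "\r\n") -> str:
    """Insert a line break after every '/>' so that no line stays extremely long.

    Caveat: this is a blind text replacement, not an XML-aware rewrite.
    """
    return xml_text.replace("/>", "/>" + newline)


# Hypothetical single-line sample standing in for the real attachment.
sample = '<Row><Cell a="1"/><Cell a="2"/></Row>'
broken = break_after_empty_tags(sample)
print(broken.splitlines())
```

Removing the inserted line breaks again gives back the original single-line text, so the change is easy to undo when comparing timings.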

The slowness we saw is a known issue with very long text lines.


FYI.
I checked the attached .xml file in some browsers.
If the attached XML file is saved as an .xml file and opened in Firefox, the following XML parser error is shown.
> XML Parsing Error: not well-formed
> Location: file:///C:/ ... .xml
> Line Number 1, Column 43202:
> ... <Cell ... ss:StyleID="s3" ><Data ss:Type="String">Приложение №&#49&#10к ...
> ----------------------------------------------------------------------^
(View/Character Encoding = utf-8 was shown by Firefox)
A similar error was observed in IE 8, Opera 11.0, and Google Chrome.
If [CRLF] (0x0D0A) is inserted appropriately (every "/>" is changed to "/>[CRLF]"), this XML Parsing Error doesn't occur in any browser I checked.
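To locate such a well-formedness error without a browser, something like the following sketch could be used. It relies on Python's standard expat bindings; the function name and the sample inputs are hypothetical and are not taken from the attachment. The second sample uses character references without a terminating semicolon, similar in shape to the excerpt in the error message above, but whether that is the actual cause in the attached file is not established here.

```python
import xml.parsers.expat


def first_xml_error(data: bytes):
    """Return (line, column_offset, message) for the first error, or None if well-formed."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk of input
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (err.lineno, err.offset, message)
    return None


# Hypothetical inputs: one well-formed, one with unterminated character references.
print(first_xml_error(b"<a>&#49;&#10;</a>"))
print(first_xml_error(b"<a>&#49&#10</a>"))
```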

(In reply to kpdozer from comment #0)
> Expected results:
> Thunderbird should be allowed to save the attachment without problems

(1) With View/Display Attachments Inline=Disabled, it's always possible.
(2) With View/Display Attachments Inline=Enabled, attached XML file is shown at attachment pane after inline display of the attached XML file.
So, as for this expected results, INVALID.
> (1) With View/Display Attachments Inline=Disabled, it's always possible.
> (2) With View/Display Attachments Inline=Enabled, attached XML file is shown
> at attachment pane after inline display of the attached XML file.
> So, as for this expected results, INVALID.

This can only mean one thing: a deliberately crafted XML file that is very large or malformed will prevent saving the attachment and freeze Thunderbird on slow computers forever.
In my opinion you must introduce a system of rules for showing attachments in the message body. Otherwise, this is very similar to a DDoS-type attack.
(In reply to kpdozer from comment #3)
> and freeze Thunderbird on slow computers forever.

As for this bug's case, freeze is never forever. It ends within finite time.

> In my opinion you must introduce a system of rules for showing attachments in the message body.
> Otherwise, this is very similar to a DDoS-type attack.

Right, and I agree with you.
Because of severe performance issues in Tb, the following can be used for a DDoS attack on a Tb user by mail.
- very many mail addresses in To:/CC:,
- very long line in text attachment (this bug's case),
- very large text or image attachment, very large embedded image in HTML,
- very many image attachments,
- very deep nesting of <div>, <table>, <font>, ... in HTML,
- recursive load of <iframe> in <iframe> in <iframe> ... in <iframe>, 
- deep or infinite recursive call of JavaScript function,
- and so on.

In the image case, a very small JPEG file can generate huge bitmap data for rendering. So, a "limitation of inline display image size" shouldn't be based merely on file size.
As for "deep nesting of HTML tags", Tb or Gecko currently has a limit on nested DOM object generation (around 256 levels). So, an element such as <img> inside deeply nested HTML elements is ignored.
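For the image case mentioned above, a rough worked example of why file size alone is a poor limit: the decoded bitmap grows with width times height, independent of how well the file compresses. The sketch assumes 4 bytes per pixel (for example RGBA); the actual buffer format used by Gecko is not verified here, and the dimensions are hypothetical.

```python
def decoded_bitmap_bytes(width_px: int, height_px: int, bytes_per_pixel: int = 4) -> int:
    """Approximate size of an uncompressed bitmap (assumption: 4 bytes per pixel)."""
    return width_px * height_px * bytes_per_pixel


# Hypothetical 10000 x 10000 px image, which can be a small file if it compresses well.
print(decoded_bitmap_bytes(10000, 10000))  # 400000000 bytes, roughly 400 MB
```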
Remove Crash and add Freeze in bug summary, and confirming.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: crash → perf
Summary: Crash with a message with an attached EXCEL XML-file → Long freeze with a message with an attached EXCEL XML-file (single line XML file of 1.2 MB)
Example of inline display size limitation:
- max image file size = XXXX bytes
- max image width = WWWW px, max image height = HHHH px
- max text line size = LLLL bytes
- max text file size = FFFF bytes
Because bug 674473 is already fixed, a part that is not displayed inline because of a limit can be shown as an attachment far more easily than before.
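A minimal sketch of how the text limits in the list above might be checked before deciding on inline display. Everything here is hypothetical: the class, the function, and the numeric values are illustrative only, since the proposal leaves the actual limits open (LLLL, FFFF), and this is not Thunderbird code.

```python
from dataclasses import dataclass


@dataclass
class InlineTextLimits:
    """Hypothetical limits corresponding to 'max text line size' and 'max text file size'."""
    max_text_line_bytes: int
    max_text_file_bytes: int


def should_display_text_inline(data: bytes, limits: InlineTextLimits) -> bool:
    """Return True only if both the total size and the longest line are within limits."""
    if len(data) > limits.max_text_file_bytes:
        return False
    longest_line = max((len(line) for line in data.splitlines()), default=0)
    return longest_line <= limits.max_text_line_bytes


# Illustrative values only; the proposal does not specify real numbers.
limits = InlineTextLimits(max_text_line_bytes=100, max_text_file_bytes=1000)
print(should_display_text_inline(b"short line\nanother line\n", limits))
print(should_display_text_inline(b"x" * 101, limits))
```

A part that fails the check would then be offered as a regular attachment instead of being rendered in the message body.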

Example of message header limitation:
- max message header line length = YYYY bytes
  ignore data after YYYY bytes in header, with error message or indicator.
I confirm this problem on Thunderbird 24.0 running on Ubuntu 12.04

In my case the offending attachment is not an MS Excel file but an MS Word file saved as an XML file from within MS Word. 

File size is 775KB.

I attached it to an email. Recipients receiving this email on Thunderbird 24 running on Ubuntu 12.04 experience a long freeze when clicking on the header of the message. TB finally came around after several minutes.
OS: Windows 7 → All
Hardware: x86 → All
(90 seconds for me)
Surely this is a dup of something.

And it's not really proper, but IMO we should gather these types of bugs under a discrete component like Message Reader, so it is then easier to find similar bugs. Currently we have bugs in General, backend, front end, etc, a small sampling of which is bug 210943, bug 337393
Severity: critical → major
Component: General → Message Reader UI
Summary: Long freeze with a message with an attached EXCEL XML-file (single line XML file of 1.2 MB) → Long freeze displaying/loading message with an attached EXCEL XML-file (single line XML file of 1.2 MB)
Whiteboard: [dupeme]
Reproducible for me with Thunderbird 31.0 on openSUSE 13.1 (x86-64).

I did some testing and found that the problem doesn't happen with text files with long lines, but does happen with XML files with long lines.
(In reply to Tristan Miller from comment #10)
> I did some testing and found that the problem doesn't happen with text files with long lines,
> but does happen with XML files with long lines.

As clearly stated in the bug summary, this bug is reported for an "XML file with a single 1.2MB line", and comment #7 is perhaps for an "XML file with a single 775KB line".
What is the maximum line length/average line length of your text files?
What is the maximum line length/average line length of your XML files?
Does your comment mean "no problem with a text file containing one 1.2MB line in your test"?
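The maximum and average line lengths asked about above can be measured with a few lines of Python. This is only a sketch; the function name and the sample bytes are hypothetical, and lengths are counted in bytes, not characters.

```python
def line_length_stats(data: bytes):
    """Return (maximum line length, average line length) in bytes, excluding line endings."""
    lengths = [len(line) for line in data.splitlines()] or [0]
    return max(lengths), sum(lengths) / len(lengths)


# Hypothetical sample; for a real file, pass the result of open(path, "rb").read().
print(line_length_stats(b"ab\ncdef\n"))  # (4, 3.0)
```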
If you want the details, I sent myself an e-mail with an attached XML file of 20 lines, 19 of which are less than 70 characters long, and 1 of which is 776308 characters long.  Thunderbird froze as soon as I tried to view the message.

I then replaced all the newline characters in the file with "x", resaved it as a text file, and attached it to an e-mail sent to myself.  Thunderbird had no trouble viewing this message.
Tristan, thanks for the details. Could you remove/replace all private data from those emails and attach them to this bug, one with text/xml and one with text/plain so that others can try to reproduce what you observed?
Attached file longlines.xml.xz (obsolete) —
Decompress this XML file and attach it to an e-mail message.  When Thunderbird tries to view the message, it will lock up.
Attached file longlines.txt.xz (obsolete) —
This text file (when decompressed) is the same as the previous longlines.xml attachment, except that all non-newline characters have been replaced with "x".  If you attach it to an e-mail message, Thunderbird has no problems displaying the message.
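For anyone who wants to build a similar comparison file, the masking step described above (every non-newline character replaced with "x") could look like the following sketch. The function name and the sample text are hypothetical; this is not necessarily how the attached file was produced.

```python
import re


def mask_non_newlines(text: str) -> str:
    """Replace every character except CR and LF with 'x', keeping the line structure."""
    return re.sub(r"[^\r\n]", "x", text)


# Hypothetical sample: the line lengths and line endings are preserved exactly.
print(repr(mask_non_newlines("<a b='1'>\nhi\r\n")))
```

Because line lengths are unchanged, the masked file isolates the effect of line length from the effect of the XML markup itself.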
Wow, rapid response time, that's great :) However, would you mind uncompressing and re-attaching the files? Bug triage (between thousands of bugs in TB) is hard and time-consuming enough, extra hassles like uncompressing files make it much harder and less likely for somebody to look into this... Pls ensure no private data are made public with these files. Thanks.
Wayne, anything longer than a few minutes wait for something as simple as an xml file of reasonable size (1.2MB) is practically a hang - users won't wait that long and TB is not responsive during that time (and we should retest if this really loads after long waiting). Shouldn't this be critical?
Flags: needinfo?(vseerror)
Keywords: testcase
Tristan,

comment 12, you say: all newline chars replaced with x
comment 15, you say: all non-newline chars replaced with x
which one is correct now?
(In reply to Thomas D. from comment #18)
> comment 12, you say: all newline chars replaced with x
> comment 15, you say: all non-newline chars replaced with x
> which one is correct now?

Comment 15 is correct. Sorry for the mix-up.  Regarding the compressed files, I had assumed that the uncompressed versions would hit the size limit for this Bugzilla, but I'll try attaching them as you asked.
Attach this XML file to an e-mail message.  When Thunderbird tries to view the message, it will lock up.
Attachment #8466811 - Attachment is obsolete: true
This text file is the same as the previous longlines.xml attachment, except that all non-newline characters have been replaced with "x".  If you attach it to an e-mail message, Thunderbird has no problems displaying the message.
Attachment #8466815 - Attachment is obsolete: true
(In reply to Thomas D. from comment #17)
> Wayne, anything longer than a few minutes wait for something as simple as an
> xml file of reasonable size (1.2MB) is practically a hang - users won't wait
> that long and TB is not responsive during that time (and we should retest if
> this really loads after long waiting). Shouldn't this be critical?

I agree the user experience is not great. But user experience will be different for files of different sizes, and it doesn't require restarting the browser.  And for diagnostic purposes the analysis process is different from the typical hang.
Flags: needinfo?(vseerror)
Summary: Long freeze displaying/loading message with an attached EXCEL XML-file (single line XML file of 1.2 MB) → Long freeze, high CPU displaying/loading message with an attached EXCEL XML-file (single line XML file of 1.2 MB)
Whiteboard: [dupeme] → [dupeme:core bug?]
Any ideas why text/xml slows but text/plain doesn't?

Could message parsing stuff written by Ben Bucksch (mozTXTToHTMLConv etc.) cause these problems?
With all the idiosyncrasies he's maintaining, I'm not sure if that code has been designed to scale up efficiently...
I don't know enough about this to judge correct syntax, but I have some questions:

<?xml version="1.0"?>
<!DOCTYPE corpus [
<!ELEMENT corpus (puns)+>
<!ATTLIST corpus lang (en | de) "en">
<!ELEMENT puns (text)+>
<!ELEMENT text (word)+>
<!ATTLIST text id CDATA #REQUIRED>
<!ATTLIST text annotator CDATA #IMPLIED>
<!ATTLIST text puncategory (Pun | NoPun | Multiple | Other) "Pun">
<!ELEMENT word (#PCDATA)>
<!ATTLIST word id CDATA #REQUIRED>
<!ATTLIST word senses (0 | 1 | 2) "1">
<!ATTLIST word lemma CDATA #IMPLIED>
<!ATTLIST word first_sense CDATA #IMPLIED>
<!ATTLIST word second_sense CDATA #IMPLIED>
]>

1) Could the above header cause problems, with [] square brackets in doctype declaration, or the format of the doctype declaration itself? 
2) Doesn't xml require closing / in every stand-alone tag? (not seen in header tag(s))
3) I might be wrong (bit hard to see), but does the first tag in the corpus body ever get closed? Can't find the corresponding closing / in any other tag by tentative visual inspection with colored code markup from notepad++:
<xxxx xx="xxxxxxxxxxxxxx" xxxxxxxxx="xxxxxxx" xxxxxxxxxxx="xxx">
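Questions 1) and 2) can be checked mechanically with a non-validating parser. In XML, the square brackets in a DOCTYPE delimit the internal DTD subset, and declarations such as <!ELEMENT ...>, <!ATTLIST ...> and <?xml ...?> are not elements, so they take no closing "/". The sketch below uses a shortened version of the header with a small hypothetical document body (the real attachment body is not reproduced here); question 3) can be tested the same way by passing the real file contents to the function.

```python
import xml.etree.ElementTree as ET

# Shortened DOCTYPE from the comment above plus a hypothetical, minimal body.
DOC = """<?xml version="1.0"?>
<!DOCTYPE corpus [
<!ELEMENT corpus (puns)+>
<!ELEMENT puns (text)+>
<!ELEMENT text (word)+>
<!ATTLIST text id CDATA #REQUIRED>
<!ELEMENT word (#PCDATA)>
<!ATTLIST word id CDATA #REQUIRED>
]>
<corpus><puns><text id="t1"><word id="w1">x</word></text></puns></corpus>
"""


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text (well-formedness only, no DTD validation)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(DOC))                             # header with internal subset
print(is_well_formed(DOC.replace("</corpus>", "")))    # root element never closed
```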
FTR: attachment 8466819 [details] is only 759 KB xml file and takes 5-6 mins to load inline preview during which time both TB windows turn all-white and are not responding, on a rather slow Win XP machine, TB31.
Summary: Long freeze, high CPU displaying/loading message with an attached EXCEL XML-file (single line XML file of 1.2 MB) → Long freeze, high CPU displaying/loading message with an attached EXCEL XML-file (single line XML file of 759 KB)
Attachment #607085 - Attachment description: Message that will lead to crash Thunderbird → Testcase 1: Message that will lead to crash Thunderbird
Attachment #8466819 - Attachment description: longlines.xml → Testcase 2-xml: longlines.xml
Attachment #8466820 - Attachment description: longlines.txt → Testcase 2-txt: longlines.txt
I think it might be useful to see where Thunderbird is spending its time when trying to load this file.
https://addons.mozilla.org/en-US/thunderbird/addon/gecko-profiler/ should be useful for whomever takes on this task.
Is this possibly a duplicate of (or at least related to) Bug 197956?
(In reply to Tristan Miller from comment #27)
> Is this possibly a duplicate of (or at least related to) Bug 197956?

Many bugs were already duped to that bug. I think "duping this bug to that bug" is the appropriate action.
See duped bugs, please.
https://bugzilla.mozilla.org/buglist.cgi?bug_id=182710%2C199810%2C200875%2C206129%2C213557%2C229729%2C237181%2C237798%2C247340%2C255625%2C289150%2C295746%2C299658%2C300788%2C407231&list_id=12505269
Depends on: 197956
Blocks: 1161059
Just to preserve the record, this dates to 2006 per bug 337393, which matches the experience of core bug 197956
Whiteboard: [dupeme:core bug?] → [do not dup to core bug 197956]
Still reproducible with Thunderbird 52.5.0 (64-bit) on GNU/Linux
Still reproducible with Thunderbird 52.8.0 (64-bit) on Ubuntu 16.04 (bug 1465348).
Blocks: 1533319
See Also: → 533499
Severity: major → critical
Blocks: 548413

I have not filed a bug against Firefox, but FF sometimes fails to show XML data correctly when a web server produces XML output.
Smallish XML output seems to work, but sometimes FF fails to show the XML data completely where MS IE or another browser shows the data in full, if I am not mistaken.
(Now I realize FF may be taking a longish time, say a few minutes, to display the XML file!?)

I wonder if the underlying FF bug is the cause of TB bug.

Next time I encounter such an XML file (could be a local file if I recall correctly), I will file a bug.
