Closed Bug 787115 Opened 12 years ago Closed 12 years ago

Hang and high CPU during robocop tests on panda: java.lang.StringToReal.parseDouble

Categories

(Firefox for Android Graveyard :: General, defect)

x86
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 727352

People

(Reporter: gbrown, Assigned: gbrown)

Robocop tests very frequently hang on pandaboards. Typically the test is just starting up and waiting for Gecko:Ready -- and then nothing happens until the event wait times out. Sometimes there is more progress and a URL is entered, then the wait for DOMContentLoaded times out. And there are less frequent hangs in other places during tests.

In many cases, dumping stack traces shows:

Thread[Gecko,5,main]
java.lang.StringToReal.parseDblImpl(Native Method)
java.lang.StringToReal.parseDouble(StringToReal.java:267)
java.lang.Double.parseDouble(Double.java:295)
java.lang.Double.valueOf(Double.java:332)
org.json.JSONTokener.readLiteral(JSONTokener.java:323)
org.json.JSONTokener.nextValue(JSONTokener.java:111)
org.json.JSONTokener.readObject(JSONTokener.java:385)
org.json.JSONTokener.nextValue(JSONTokener.java:100)
org.json.JSONTokener.readObject(JSONTokener.java:385)
org.json.JSONTokener.nextValue(JSONTokener.java:100)
org.json.JSONObject.<init>(JSONObject.java:154)
org.json.JSONObject.<init>(JSONObject.java:171)
org.mozilla.gecko.util.EventDispatcher.dispatchEvent(EventDispatcher.java:58)
org.mozilla.gecko.GeckoAppShell.handleGeckoMessage(GeckoAppShell.java:1991)
org.mozilla.gecko.GeckoAppShell.nativeRun(Native Method)
org.mozilla.gecko.GeckoAppShell.nativeRun(Native Method)
org.mozilla.gecko.GeckoAppShell.runGecko(GeckoAppShell.java:545)
org.mozilla.gecko.GeckoThread.run(GeckoThread.java:82)
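
For anyone wanting to capture a similar dump from inside a Java process, a minimal sketch using only `Thread.getAllStackTraces()` follows. The class and method names are made up for illustration, and this is not necessarily how the trace above was obtained. The header line for each thread comes from `Thread.toString()`, so its exact format depends on the runtime.

```java
import java.util.Map;

final class ThreadDumpSketch {
    /** Builds a text dump of every live thread: one header line, then one line per frame. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            out.append(entry.getKey()).append('\n');   // header via Thread.toString()
            for (StackTraceElement frame : entry.getValue()) {
                out.append(frame).append('\n');        // e.g. package.Class.method(File.java:123)
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```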

Also, top shows high CPU use while the test is waiting:

User 49%, System 3%, IOW 0%, IRQ 0%
User 282 + Nice 0 + Sys 18 + Idle 266 + IOW 0 + IRQ 0 + SIRQ 0 = 566

  PID PR CPU% S  #THR     VSS     RSS PCY UID      Name
 2159  1  51% S    41 635676K 115792K  fg app_46   org.mozilla.fennec_mozdev
Blocks: 783639
FWIW snorp was seeing this problem some time back (see related bug) but it went away. I suspect it is an infinite loop in the StringToReal code.
See Also: → 764808
Interestingly it also shows up at https://bugzilla.mozilla.org/show_bug.cgi?id=706500#c8 which indicates users are hitting it in the wild and it is causing ANRs for them.
The message passed to the hanging JSONObject ctor is:

{"gecko":{"width":1280,"height":616,"cssWidth":1280.0000000000002,"cssHeight":616.0000000000001,"pageLeft":0,"pageTop":0,"pageRight":1999.9999999999998,"pageBottom":1999.9999999999998,"cssPageLeft":0,"cssPageTop":0,"cssPageRight":2000,"cssPageBottom":2000,"zoom":0.9999999999999999,"cssX":0,"cssY":0,"x":0,"y":0,"type":"Viewport:CalculateDisplayPort"}}

If I hack the message to remove some 999's and 000's, there is no hang and the test passes. Testing further to see if there is a particular value responsible for the hang...
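
One way to narrow this down without going through JSONObject is to feed each numeric literal from the message straight to `Double.parseDouble()` on a worker thread and give up after a timeout. The sketch below is only that: the class name, method name and 5 second timeout are arbitrary choices, and on an unaffected runtime every literal should simply parse.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ParseDoubleProbe {
    /**
     * Parses text on a daemon worker thread. Returns the parsed value,
     * or null if the parse did not finish within timeoutMillis.
     */
    static Double parseWithTimeout(String text, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-probe");
            t.setDaemon(true); // a stuck parse must not keep the JVM alive
            return t;
        });
        try {
            Callable<Double> task = () -> Double.parseDouble(text);
            Future<Double> result = worker.submit(task);
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return null; // the parse is still running: treat as a hang
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("probe failed for " + text, e);
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Non-integer literals taken from the Viewport:CalculateDisplayPort message.
        String[] literals = {
            "1280.0000000000002", "616.0000000000001",
            "1999.9999999999998", "0.9999999999999999"
        };
        for (String literal : literals) {
            Double value = parseWithTimeout(literal, 5_000);
            System.out.println(literal + " -> " + (value == null ? "TIMEOUT" : value.toString()));
        }
    }
}
```

If the parse is stuck inside native code, `shutdownNow()` will not actually stop it, which is why the worker is a daemon thread.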
(In reply to Geoff Brown [:gbrown] from comment #3)
> The message passed to the hanging JSONObject ctor is:
> 
> {"gecko":{"width":1280,"height":616,"cssWidth":1280.0000000000002,
> "cssHeight":616.0000000000001,"pageLeft":0,"pageTop":0,
> "pageRight":1999.9999999999998,"pageBottom":1999.9999999999998,
> "cssPageLeft":0,"cssPageTop":0,"cssPageRight":2000,"cssPageBottom":2000,
> "zoom":0.9999999999999999,"cssX":0,"cssY":0,"x":0,"y":0,
> "type":"Viewport:CalculateDisplayPort"}}
> 
> If I hack the message to remove some 999's and 000's, there is no hang and
> the test passes. Testing further to see if there is a particular value
> responsible for the hang...

Geoff, are there many "Viewport:CalculateDisplayPort" messages? Does it look like Robocop hanging because it is busy calling StringToReal.parseDouble() many times or just one really slow call to StringToReal.parseDouble()?

Perhaps we should revisit the idea of replacing JSON messages? Or snap these CSS values to integers (to avoid StringToReal conversions)?
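
To make the snapping idea concrete, here is a rough sketch. The message itself is built on the Gecko side, so this Java is only an illustration of the logic; the class name, helper names and the 1e-9 tolerance are all hypothetical. Values within the tolerance of an integer are emitted as integer text, so the receiver never has to parse a long decimal fraction for them.

```java
final class ViewportSnapSketch {
    // Hypothetical tolerance: how close to an integer a value must be to get snapped.
    static final double EPSILON = 1e-9;

    /** Returns the nearest integer if value is within EPSILON of it, else value unchanged. */
    static double snapNearInteger(double value) {
        double nearest = Math.rint(value);
        return Math.abs(value - nearest) < EPSILON ? nearest : value;
    }

    /** Formats a finite value as JSON number text, preferring integer form after snapping. */
    static String toJsonNumber(double value) {
        double snapped = snapNearInteger(value);
        if (snapped == Math.rint(snapped) && Math.abs(snapped) < 1e15) {
            return Long.toString((long) snapped); // no fractional part to parse later
        }
        return Double.toString(snapped);          // NaN/Infinity are not handled here
    }

    public static void main(String[] args) {
        double[] samples = {1280.0000000000002, 1999.9999999999998, 0.9999999999999999, 1.5};
        for (double sample : samples) {
            System.out.println(sample + " -> " + toJsonNumber(sample));
        }
    }
}
```

With this, pageRight would go out as 2000 and zoom as 1, while a genuine fractional zoom such as 1.5 is left alone.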
Interesting reading:

  "There is a vulnerability in implementations of java.lang.Double.parseDouble() and related methods that can cause the thread to hang when parsing any number in the range [2^(-1022) - 2^(-1075) : 2^(-1022) - 2^(-1076)]. This defect can be used to execute a DOS (Denial of Service) attack."

* Denial of Service: Parse Double
http://www.hpenterprisesecurity.com/vulncat/en/vulncat/java/denial_of_service_parse_double.html

* Sun Java Double.parseDouble() denial of service (HTTP_Tomcat_AcceptLanguage_DoS)
http://www.iss.net/security_center/reference/vuln/HTTP_Tomcat_AcceptLanguage_DoS.htm
(In reply to Chris Peterson (:cpeterson) from comment #4)
> Geoff, are there many "Viewport:CalculateDisplayPort" messages? Does it look
> like Robocop hanging because it is busy calling StringToReal.parseDouble()
> many times or just one really slow call to StringToReal.parseDouble()?

There is just one "Viewport:CalculateDisplayPort" message at play here: We call JSONObject(message) at org.mozilla.gecko.util.EventDispatcher.dispatchEvent(EventDispatcher.java:58), and that call does not complete (for over 90000 ms before Robocop gives up and takes down the process).
(In reply to Geoff Brown [:gbrown] from comment #3)
> The message passed to the hanging JSONObject ctor is:
> 
> {"gecko":{"width":1280,"height":616,"cssWidth":1280.0000000000002,
> "cssHeight":616.0000000000001,"pageLeft":0,"pageTop":0,
> "pageRight":1999.9999999999998,"pageBottom":1999.9999999999998,
> "cssPageLeft":0,"cssPageTop":0,"cssPageRight":2000,"cssPageBottom":2000,
> "zoom":0.9999999999999999,"cssX":0,"cssY":0,"x":0,"y":0,
> "type":"Viewport:CalculateDisplayPort"}}
> 
> If I hack the message to remove some 999's and 000's, there is no hang and
> the test passes. Testing further to see if there is a particular value
> responsible for the hang...

With this change there is no hang and the test passes:
- ..."pageRight":1999.9999999999998,"pageBottom":1999.9999999999998...
+ ..."pageRight":1999.999999999999,"pageBottom":1999.999999999999...


Section 3.10.2 of the Java specification, e.g. http://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.2, defines valid floating-point formats -- I see nothing wrong with 1999.9999999999998.
Interesting:

1999.9999999999998  HANG
1999.999999999999   OK
1999.9999999999990  OK
1999.9999999999999  OK
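
For reference, the four literals do not all parse to the same double. The sketch below (class and method names are mine; run it on an unaffected desktop JVM, since it calls `Double.parseDouble()` itself) counts how many representable doubles each parsed value sits below 2000.0 by subtracting raw bit patterns, which is valid for positive finite values.

```java
final class UlpDistanceSketch {
    /**
     * Number of representable doubles between the parsed literal and reference.
     * Only meaningful when both are positive and finite.
     */
    static long ulpsBelow(String literal, double reference) {
        double parsed = Double.parseDouble(literal);
        return Double.doubleToLongBits(reference) - Double.doubleToLongBits(parsed);
    }

    public static void main(String[] args) {
        String[] literals = {
            "1999.9999999999998", "1999.999999999999",
            "1999.9999999999990", "1999.9999999999999"
        };
        for (String literal : literals) {
            System.out.println(literal + " is " + ulpsBelow(literal, 2000.0) + " ULP(s) below 2000.0");
        }
    }
}
```

The hanging literal is the only one that rounds to the double immediately below 2000.0 (1 ULP). The two "OK" variants ending in 999 and 9990 land 4 ULPs below, and 1999.9999999999999 rounds to exactly 2000.0. That is just an observation, not a cause: zoom 0.9999999999999999 is likewise 1 ULP below 1.0, and the test passed with it left in the message.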
I suspect some kind of processor-specific error here.
Blocks: 780233
:blassey said he may have a work-around for this (avoid using the JSONObject here).
I am doing this work in bug 727352. There is a patch up there now; it would be useful to know whether it fixes the issue on the pandas (it doesn't get rid of all use of JSON).
Depends on: 727352
I have verified that this no longer occurs with the fix for bug 727352.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Product: Firefox for Android → Firefox for Android Graveyard