Bug 907280: Module list cron output broken
Status: RESOLVED FIXED (Opened 12 years ago, Closed 12 years ago)
Categories: Socorro :: Backend, task
Tracking: Not tracked
Target Milestone: 59
People: Reporter: ted, Assigned: rhelmer
Details: Whiteboard: [qa-]
Looks like it broke at the beginning of July. The last good file is:
https://crash-analysis.mozilla.com/crash_analysis/modulelist/20130629-modulelist.txt
Then this one and all the following ones are very broken:
https://crash-analysis.mozilla.com/crash_analysis/modulelist/20130702-modulelist.txt
The ones in between those two have a permissions issue so I can't load them.
Comment 1 (Assignee) • 12 years ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #0)
> Looks like it broke at the beginning of July. The last good file is:
> https://crash-analysis.mozilla.com/crash_analysis/modulelist/20130629-modulelist.txt
>
> Then this one and all the following ones are very broken:
> https://crash-analysis.mozilla.com/crash_analysis/modulelist/20130702-modulelist.txt
Investigating, thanks!
> The ones in between those two have a permissions issue so I can't load them.
This should be fixed, please let me know:
$ chmod o+r /mnt/crashanalysis/crash_analysis/modulelist/*.txt
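To confirm that the permission fix took effect, something like the following could list any files still missing the other-read bit that `chmod o+r` sets. This is only a sketch: the function name and the `*.txt` default pattern are illustrative, and the directory to check is whatever path the module lists live in.

```python
import stat
from pathlib import Path


def not_world_readable(directory, pattern="*.txt"):
    """Return names of matching files that lack the other-read permission bit."""
    missing = []
    for path in sorted(Path(directory).glob(pattern)):
        # S_IROTH is the bit that `chmod o+r` turns on.
        if not path.stat().st_mode & stat.S_IROTH:
            missing.append(path.name)
    return missing
```

An empty list means every matching file is readable by "other", which is what the web server needs here.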
Assignee: nobody → rhelmer
Status: NEW → ASSIGNED
Comment 2 (Reporter) • 12 years ago
The ones I couldn't access were broken too, so this is the first bad one:
https://crash-analysis.mozilla.com/crash_analysis/modulelist/20130630-modulelist.txt
The regression range here is between 2013-06-29 and 2013-06-30 then (the mtime on that 20130630 file is 02-Jul-2013 though, so presumably the code or configuration change happened before then).
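The manual search for the first bad file could be automated roughly as below. This is a sketch under assumptions: it treats a healthy module list as plain UTF-8 text with no NUL bytes, which is a heuristic rather than anything stated in this bug, and it relies on the `YYYYMMDD-modulelist.txt` naming seen in the URLs so that sorting by name is sorting by date. Both function names are made up for illustration.

```python
import codecs
from pathlib import Path


def looks_like_text(path, sample_size=4096):
    """Heuristic check: the first bytes decode as UTF-8 and contain no NULs."""
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    if b"\x00" in sample:
        return False
    # An incremental decoder tolerates a multi-byte character cut off
    # at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def first_bad_file(directory):
    """Return the earliest *-modulelist.txt that fails the check, or None."""
    for path in sorted(Path(directory).glob("*-modulelist.txt")):
        if not looks_like_text(path):
            return path.name
    return None
```

Run against a local copy of the modulelist directory, this would report the first file that does not look like text, which gives the lower bound of the regression range without opening files by hand.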
Comment 3 (Assignee) • 12 years ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #2)
> The ones I couldn't access were broken too, so this is the first bad one:
> https://crash-analysis.mozilla.com/crash_analysis/modulelist/20130630-modulelist.txt
>
> The regression range here is between 2013-06-29 and 2013-06-30 then (the
> mtime on that 20130630 file is 02-Jul-2013 though, so presumably the code or
> configuration change happened before then).
Thanks for finding the regression range. I ran this by hand on the hadoop cluster, and the problem is definitely happening there (and not in some later intermediate step). I'll check the Socorro releases around that time, but I don't believe the code here has changed recently. It's possible that the hadoop cluster was upgraded, in which case we may need to rebuild some of our custom JARs.
I am also exploring the possibility that there's a problem in the data itself, adding some debugging to the pig job.
Updated (Assignee) • 12 years ago
Target Milestone: --- → 58
Comment 4 (Assignee) • 12 years ago
The pig job in question is https://github.com/mozilla-metrics/socorro-toolbox/blob/master/src/main/pig/modulelist.pig
Comment 5 (Assignee) • 12 years ago
tmary has been looking through this with me, here's some more detail.
The pig job in comment 4 depends on:
1) the socorro-toolbox JAR from https://github.com/mozilla-metrics/socorro-toolbox/
2) the akela JAR from https://github.com/mozilla-metrics/akela
We're still using a bunch of old pig and hbase dependencies in #1, #2 has been upgraded but it's throwing a JSON exception when I attempt to use it.
I am slowly slogging through this. I'm not sure who knows tools like socorro-toolbox and akela well enough to help, but any help would be much appreciated!
Comment 6 (Assignee) • 12 years ago
(In reply to Robert Helmer [:rhelmer] from comment #5)
> tmary has been looking through this with me, here's some more detail.
>
> The pig job in comment 4 depends on:
>
> 1) the socorro-toolbox JAR from
> https://github.com/mozilla-metrics/socorro-toolbox/
> 2) the akela JAR from https://github.com/mozilla-metrics/akela
>
> We're still using a bunch of old pig and hbase dependencies in #1, #2 has
> been upgraded but it's throwing a JSON exception when I attempt to use it.
Specifically the exception is:
Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. com/fasterxml/jackson/core/JsonParseException
java.lang.NoClassDefFoundError: com/fasterxml/jackson/core/JsonParseException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:505)
at org.apache.pig.impl.PigContext.getClassForAlias(PigContext.java:639)
at org.apache.pig.parser.LogicalPlanBuilder.buildUDF(LogicalPlanBuilder.java:1402)
at org.apache.pig.parser.LogicalPlanGenerator.func_eval(LogicalPlanGenerator.java:8381)
at org.apache.pig.parser.LogicalPlanGenerator.projectable_expr(LogicalPlanGenerator.java:9926)
at org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:9700)
at org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:9051)
at org.apache.pig.parser.LogicalPlanGenerator.flatten_generated_item(LogicalPlanGenerator.java:6973)
at org.apache.pig.parser.LogicalPlanGenerator.generate_clause(LogicalPlanGenerator.java:15920)
at org.apache.pig.parser.LogicalPlanGenerator.foreach_plan(LogicalPlanGenerator.java:14312)
at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:14179)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1623)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:799)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:517)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:392)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1600)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1547)
at org.apache.pig.PigServer.registerQuery(PigServer.java:518)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:991)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:604)
at org.apache.pig.Main.main(Main.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.core.JsonParseException
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 33 more
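A `NoClassDefFoundError` like the one above means the JVM could not find the named class on the classpath at runtime. One quick way to narrow that down is to check which of the JARs registered with the pig job actually contain the class. The sketch below does that with Python's standard `zipfile` module, since a JAR is a zip archive; the function name is illustrative and the list of JAR paths to pass in is up to whoever runs it.

```python
import zipfile


def jars_containing(class_name, jar_paths):
    """Return the JAR paths that contain the given fully qualified class."""
    # com.example.Foo is stored in the archive as com/example/Foo.class
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for jar_path in jar_paths:
        with zipfile.ZipFile(jar_path) as jar:
            if entry in jar.namelist():
                found.append(jar_path)
    return found
```

Calling it with `"com.fasterxml.jackson.core.JsonParseException"` and the JARs in use would return an empty list if none of them bundle that class, which would be consistent with the trace above.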
Comment 7 (Assignee) • 12 years ago
OK harsha helped me track this down - we're getting (hadoop-specific) snappy compressed data in the output - I suspected this before and tried uncompressing with https://code.google.com/p/snappy/ but didn't know that hadoop's snappy impl is incompatible :/
Workaround is pretty simple, at least for the moment:
SET mapred.output.compress false;
This is now fixed in-place, I am going to start backfilling now, and file some bugs to fix this all up in the meantime.
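For anyone who needs to read one of the already-written compressed files rather than regenerate it, the incompatibility comes from framing: as far as I understand it (this is an assumption, not something confirmed in this bug), Hadoop's Snappy codec wraps the data in blocks, each starting with a 4-byte big-endian uncompressed length followed by one or more chunks, where each chunk is a 4-byte big-endian compressed length plus raw Snappy bytes. A plain Snappy tool does not expect those length prefixes. A minimal sketch of unwrapping that framing, with the function name made up for illustration and the actual chunk decompressor passed in by the caller:

```python
import struct


def read_hadoop_blocks(data, decompress):
    """Decode a buffer in the block framing assumed above.

    `decompress` is any callable that turns one raw Snappy chunk into
    bytes; `snappy.decompress` from the third-party python-snappy
    package should fit, but that is an assumption and untested here.
    """
    out = bytearray()
    pos = 0
    while pos < len(data):
        # Block header: total uncompressed size of this block.
        (block_len,) = struct.unpack_from(">I", data, pos)
        pos += 4
        produced = 0
        while produced < block_len:
            # Chunk header: compressed size of the next chunk.
            (chunk_len,) = struct.unpack_from(">I", data, pos)
            pos += 4
            chunk = decompress(bytes(data[pos:pos + chunk_len]))
            pos += chunk_len
            if not chunk:
                raise ValueError("empty chunk inside a non-empty block")
            out += chunk
            produced += len(chunk)
    return bytes(out)
```

Taking the decompressor as a parameter keeps the sketch free of third-party imports. Disabling output compression, as in the workaround above, avoids the issue entirely for new files.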
Comment 8 (Reporter) • 12 years ago
Ran a backfill yesterday using one of the cleaned up module lists: "Uploaded 12000 symbol files".
Updated • 12 years ago
Target Milestone: 58 → 59
Comment 9 (Assignee) • 12 years ago
OK this is fixed up, but it is running out of my homedir and crontab on sp-admin01. Also, we need to come up with a sane way to deploy code to the cherry-gw hadoop server.
These things should have happened in bug 880048, so I'm going to follow up there.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated • 12 years ago
Whiteboard: [qa-]