Closed
Bug 822372
Opened 12 years ago
Closed 11 years ago
[stage] ADU injection failed on 12/17
Categories
(Infrastructure & Operations Graveyard :: WebOps: Other, task)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: laura, Unassigned)
Details
It was retried and succeeded later, but I'd like to know the cause. Log info from the Metrics side is as follows: 2012/12/16 20:52:11 - Export to Socorro - Loading transformation from XML file [file:///opt/pentaho/kettle/etl/logProcessing/etl/timestamped_log_manager/../socorro/extract_adu.ktr] 2012/12/16 20:52:11 - extract_adu - Dispatching started for transformation [extract_adu] 2012/12/16 20:52:11 - Table output.0 - Connected to database [socorro] (commit=5000) 2012/12/16 20:52:11 - Delete adu records.0 - Finished reading query, closing connection. 2012/12/16 20:52:11 - Log deletes.0 - Optimization level not specified. Using default of 9. 2012/12/16 20:52:11 - Delete adu records.0 - Finished processing (I=0, O=0, R=0, W=1, U=0, E=0) 2012/12/16 20:52:11 - Log deletes.0 - Deleted 0 records from socorro-stage-zlb.db.phx1.mozilla.com.breakpad.raw_adu for date = '2012-12-16' 2012/12/16 20:52:11 - Log deletes.0 - Finished processing (I=0, O=0, R=1, W=1, U=0, E=0) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Because of an error, this step can't continue: 2012/12/16 20:58:12 - extract_adu - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Errors detected! 2012/12/16 20:58:12 - extract_adu - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Errors detected! 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : org.pentaho.di.core.exception.KettleException: 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Error inserting row into table [raw_adu] with values: [2012/12/16 00:00:00.000], [ec8030f7-c20a-464f-9b0e-13a3a9e97384], [Firefox], [13.0], [Linux], [20120530124445], [beta], [1] 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Error inserting/updating row 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : An I/O error occured while sending to the backend. 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.pentaho.di.trans.steps.tableoutput.TableOutput.writeToTable(TableOutput.java:445) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.pentaho.di.trans.steps.tableoutput.TableOutput.processRow(TableOutput.java:128) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.pentaho.di.trans.step.RunThread.run(RunThread.java:50) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.lang.Thread.run(Thread.java:662) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Caused by: org.pentaho.di.core.exception.KettleDatabaseException: 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Error inserting/updating row 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : An I/O error occured while sending to the backend. 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.pentaho.di.core.database.Database.insertRow(Database.java:1413) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.pentaho.di.trans.steps.tableoutput.TableOutput.writeToTable(TableOutput.java:284) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : ... 3 more 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Caused by: org.postgresql.util.PSQLException: An I/O error occured while sending to the backend. 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:283) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:498) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:386) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:332) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.pentaho.di.core.database.Database.insertRow(Database.java:1362) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : ... 4 more 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Caused by: java.io.EOFException 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:261) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1635) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257) 2012/12/16 20:58:12 - Table output.0 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : ... 8 more 2012/12/16 20:58:12 - Count inserts.0 - Finished processing (I=0, O=0, R=18192, W=0, U=0, E=0) 2012/12/16 20:58:12 - Table output.0 - Finished processing (I=0, O=18192, R=18193, W=18192, U=0, E=1) 2012/12/16 20:58:12 - extract_adu - extract_adu 2012/12/16 20:58:12 - extract_adu - extract_adu eod_socorro_20121216_205211.log
Reporter | ||
Comment 1•12 years ago
|
||
Source: etl1.metrics.scl3.mozilla.com Destination: socorro-stage-zlb.db.phx1.mozilla.com:5432
Comment 2•12 years ago
|
||
(backend zeus cluster)
Reporter | ||
Updated•12 years ago
|
Assignee: nobody → dmaher
Component: Database → Server Operations: Web Operations
Product: Socorro → mozilla.org
QA Contact: nmaul
Version: unspecified → other
Updated•11 years ago
|
Assignee: dmaher → server-ops-webops
Comment 3•11 years ago
|
||
This is old enough now that I suspect it will be more or less impossible to nail down precisely what went wrong. Things have changed, logs are rotated, etc. The one thing I can pull out of that is a few lines like this: Caused by: org.postgresql.util.PSQLException: An I/O error occured while sending to the backend. I suspect this is actually a network/connection rather than disk I/O. Beyond that though, I don't have a good idea on where to look. I note that the timing is almost exactly 6 minutes since the last "good" line and the first error. I suspect a timeout somewhere, but I don't know where it might be... Zeus has these timeouts on this VIP / virtual server disabled. Could be on the backend servers, maybe. In any case, if this is the issue (just a guess), the best course of action is for this job to intelligently disconnect when done with database work for a while, then reconnect when necessary. If Metrics can increase the log verbosity on their side, it might help to track this down. In the meantime, I don't think there's anything more we can feasibly do here, so I'm going to close this out.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → INCOMPLETE
Updated•11 years ago
|
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Updated•5 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•