Presto queries are failing

Status

Cloud Services
Metrics: Pipeline
--
blocker
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: rvitillo, Unassigned)

Details

(Whiteboard: [SvcOps])

All Presto queries are failing with the following error: 

com.facebook.presto.spi.PrestoException: ip-172-31-27-197.us-west-2.compute.internal: java.net.SocketTimeoutException: Read timed out
	at com.facebook.presto.hive.metastore.CachingHiveMetastore.loadTable(CachingHiveMetastore.java:568)
	at com.facebook.presto.hive.metastore.CachingHiveMetastore.access$300(CachingHiveMetastore.java:96)
	at com.facebook.presto.hive.metastore.CachingHiveMetastore$4.load(CachingHiveMetastore.java:176)
	at com.facebook.presto.hive.metastore.CachingHiveMetastore$4.load(CachingHiveMetastore.java:171)
	at com.google.common.cache.CacheLoader$1.load(CacheLoader.java:189)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
	at com.google.common.cache.LocalCache.get(LocalCache.java:3937)
	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941)
	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)
	at com.facebook.presto.hive.metastore.CachingHiveMetastore.get(CachingHiveMetastore.java:290)
	at com.facebook.presto.hive.metastore.CachingHiveMetastore.getTable(CachingHiveMetastore.java:403)
	at com.facebook.presto.hive.HiveMetadata.getViews(HiveMetadata.java:1172)
	at com.facebook.presto.spi.classloader.ClassLoaderSafeConnectorMetadata.getViews(ClassLoaderSafeConnectorMetadata.java:246)
	at com.facebook.presto.metadata.MetadataManager.getView(MetadataManager.java:639)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.visitTable(StatementAnalyzer.java:798)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.visitTable(StatementAnalyzer.java:209)
	at com.facebook.presto.sql.tree.Table.accept(Table.java:49)
	at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:22)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.analyzeFrom(StatementAnalyzer.java:1607)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.visitQuerySpecification(StatementAnalyzer.java:935)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.visitQuerySpecification(StatementAnalyzer.java:209)
	at com.facebook.presto.sql.tree.QuerySpecification.accept(QuerySpecification.java:125)
	at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:22)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.visitQuery(StatementAnalyzer.java:735)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.visitQuery(StatementAnalyzer.java:209)
	at com.facebook.presto.sql.tree.Query.accept(Query.java:103)
	at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:22)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.analyzeWith(StatementAnalyzer.java:1818)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.visitQuery(StatementAnalyzer.java:733)
	at com.facebook.presto.sql.analyzer.StatementAnalyzer.visitQuery(StatementAnalyzer.java:209)
	at com.facebook.presto.sql.tree.Query.accept(Query.java:103)
	at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:22)
	at com.facebook.presto.sql.analyzer.Analyzer.analyze(Analyzer.java:60)
	at com.facebook.presto.execution.SqlQueryExecution.doAnalyzeQuery(SqlQueryExecution.java:254)
	at com.facebook.presto.execution.SqlQueryExecution.analyzeQuery(SqlQueryExecution.java:240)
	at com.facebook.presto.execution.SqlQueryExecution.start(SqlQueryExecution.java:204)
	at com.facebook.presto.execution.QueuedExecution.lambda$start$282(QueuedExecution.java:68)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: ip-172-31-27-197.us-west-2.compute.internal: java.net.SocketTimeoutException: Read timed out
	at com.facebook.presto.hive.HiveMetastoreClientFactory.rewriteException(HiveMetastoreClientFactory.java:59)
	at com.facebook.presto.hive.HiveMetastoreClientFactory.access$000(HiveMetastoreClientFactory.java:34)
	at com.facebook.presto.hive.HiveMetastoreClientFactory$TTransportWrapper.readAll(HiveMetastoreClientFactory.java:196)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1218)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1204)
	at com.facebook.presto.hive.ThriftHiveMetastoreClient.getTable(ThriftHiveMetastoreClient.java:111)
	at com.facebook.presto.hive.metastore.CachingHiveMetastore.lambda$loadTable$54(CachingHiveMetastore.java:556)
	at com.facebook.presto.hive.metastore.HiveMetastoreApiStats.lambda$wrap$12(HiveMetastoreApiStats.java:40)
	at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:136)
	at com.facebook.presto.hive.metastore.CachingHiveMetastore.loadTable(CachingHiveMetastore.java:554)
	... 42 more
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:170)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
	at com.facebook.presto.hive.HiveMetastoreClientFactory$TTransportWrapper.readAll(HiveMetastoreClientFactory.java:193)
	... 53 more
(Reporter) Updated 2 years ago
Flags: needinfo?(bimsland)
Whiteboard: [SvcOps]

(Reporter) Updated 2 years ago
Blocks: 1255751

This appears to have been caused by a parquet2hive job that made the hive-metastore process run out of memory (OOM) and crash. After a restart, the issue was resolved. The greater problem is that we currently have no monitoring in place for Presto or Hive to alert us when these issues occur. I'm hoping to file some bugs soon to start work on that.
Flags: needinfo?(bimsland)
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
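
As a starting point for the monitoring gap mentioned above, a liveness probe for the metastore could look roughly like the sketch below. It is only an illustration under assumptions: the function name `metastore_reachable` is hypothetical, and 9083 is the Hive metastore's default Thrift port, which may not match this deployment. The probe only checks that something accepts TCP connections on that port, so it would likely catch a crashed process but not one that accepts connections and then stalls on reads.

```python
import socket


def metastore_reachable(host: str, port: int = 9083, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Assumption: `port` defaults to 9083, the stock Hive metastore Thrift port;
    the real deployment may use a different one.
    """
    try:
        # create_connection raises OSError (including timeouts and
        # "connection refused") when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A cron job or an existing alerting agent could call this periodically and raise an alert after a few consecutive failures. A more thorough check would issue a real metastore request with a read timeout, since that is the call that timed out in the trace.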