3600-second timeout when a Spark worker communicates with the Spark driver in the heartbeater

I did not configure any timeout value and am using the default settings. Where is this 3600-second timeout configured, and how can I resolve the error?

Error message:

18/01/10 13:51:44 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [3600 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
    at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:738)
    at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply$mcV$sp(Executor.scala:767)
    at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:767)
    at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:767)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1948)
    at org.apache.spark.executor.Executor$$anon$2.run(Executor.scala:767)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [3600 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    ... 14 more

2 Answers

In the error message it says:

This timeout is controlled by spark.executor.heartbeatInterval

Hence, the first thing to try is increasing this value. This can be done in several ways; for example, to increase it to 10000 seconds:

  • When using spark-submit simply add the flag:

    --conf spark.executor.heartbeatInterval=10000s
  • You can add a line in spark-defaults.conf:

    spark.executor.heartbeatInterval 10000s
  • When creating a new SparkSession in your program, add a config parameter (Scala):

    import org.apache.spark.sql.SparkSession

    // Set the heartbeat interval when the session is built
    val spark = SparkSession.builder
      .config("spark.executor.heartbeatInterval", "10000s")
      .getOrCreate()

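If this does not help, try increasing the value of spark.network.timeout as well; it is another common source of problems with this kind of timeout. Note that the Spark configuration documentation says spark.executor.heartbeatInterval should be significantly less than spark.network.timeout.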
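
To see how the two settings relate before launching a job, a small check along these lines may help. This is only a sketch: the object name `TimeoutSettingsCheck`, the helper `heartbeatBelowNetworkTimeout`, and the 60s/600s values are illustrative assumptions, not something from the error or from Spark itself. The fallbacks passed to `getTimeAsMs` are Spark's documented defaults (10s for the heartbeat interval, 120s for the network timeout).

```scala
import org.apache.spark.SparkConf

object TimeoutSettingsCheck {
  // Hypothetical helper: true when the heartbeat interval is strictly
  // below the network timeout. Unset keys fall back to the documented
  // defaults (10s and 120s).
  def heartbeatBelowNetworkTimeout(conf: SparkConf): Boolean = {
    val heartbeatMs = conf.getTimeAsMs("spark.executor.heartbeatInterval", "10s")
    val networkTimeoutMs = conf.getTimeAsMs("spark.network.timeout", "120s")
    heartbeatMs < networkTimeoutMs
  }

  def main(args: Array[String]): Unit = {
    // Illustrative values only; loadDefaults = false ignores any
    // spark.* system properties so the check sees just these two keys.
    val conf = new SparkConf(false)
      .set("spark.executor.heartbeatInterval", "60s")
      .set("spark.network.timeout", "600s")

    println("heartbeat below network timeout: " + heartbeatBelowNetworkTimeout(conf))
  }
}
```

Passing the `SparkConf` you actually submit with (instead of the hard-coded one in `main`) would show whether a given pair of values follows that guidance.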

    import org.apache.spark.sql.SparkSession

    // Local session with both timeout settings raised
    val spark = SparkSession.builder()
      .appName("SQL_DataFrame")
      .master("local")
      .config("spark.network.timeout", "600s")
      .config("spark.executor.heartbeatInterval", "10000s")
      .getOrCreate()

Tested. It solved the problem.
