异常信息:
提交spark任务到yarn的时候出现,spark客户端出现下面异常:
19/09/11 08:34:42 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)
.....
19/09/11 08:34:42 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
19/09/11 08:34:42 WARN MetricsSystem: Stopping a MetricsSystem that is not running
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)
yarn日志:
Log Type: prelaunch.out
Log Upload Time: 星期三 九月 11 08:34:41 +0800 2019
Log Length: 25
Setting up env variables
Log Type: prelaunch.err
Log Upload Time: 星期三 九月 11 08:34:41 +0800 2019
Log Length: 1100
/hadoop/yarn/local/usercache/root/appcache/application_1568108384753_0002/container_e04_1568108384753_0002_02_000001/launch_container.sh: line 29: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:/usr/hdp/2.6.5.0-292/hadoop/conf:/usr/hdp/2.6.5.0-292/hadoop/*:/usr/hdp/2.6.5.0-292/hadoop/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:/usr/hdp/current/ext/hadoop/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:/usr/hdp/current/ext/hadoop/*:$PWD/__spark_conf__/__hadoop_conf__: bad substitution
原因分析:
就是这个变量 ${hdp.version} 解析获取不到。
解决方法1(推荐):
网上大部分解决方法都是添加spark.driver.extraJavaOptions -Dhdp.version=XXX ,spark.yarn.am.extraJavaOptions -Dhdp.version=XXX来解决hdp.version缺失的问题,但这个方法仍然不能解决 --deploy-mode cluster的情况。仔细看一下是mr....一堆路径什么的引用了 ${hdp.version}。我去mapReduce2的配置文件mapred-site里面看可以下,确实有引用这个变量。所以我们只需要再添加这个变量就可以了。
hdp.version 2.7.3.2.6.5.0-292
就是这么简单,就只需要mapred-site添加一个变量就行。就能解决所有bad substitution的问题,而不需要其他的处理了。
解决方法2(不推荐):
1.在集群上查看hadoop 版本
#hadoop version
Hadoop 2.7.3.2.6.5.0-292
Subversion git@github.com:hortonworks/hadoop.git -r 3091053c59a62c82d82c9f778c48bde5ef0a89a1
Compiled by jenkins on 2018-05-11T08:07Z
Compiled with protoc 2.5.0
From source with checksum abed71da5bc89062f6f6711179f2058
This command was run using /usr/hdp/2.6.5.0-292/hadoop/hadoop-common-2.7.3.2.6.5.0-292.jar
2.把版本号配置到park-defaults.conf
spark.driver.extraJavaOptions -Dhdp.version=XXX
spark.yarn.am.extraJavaOptions -Dhdp.version=XXX
其中XXX为hadoop的版本