Spark error: __spark_conf__/__hadoop_conf__: bad substitution

2019-09-13 10:02:39

Error message:

When submitting a Spark job to YARN, the Spark client reports the following error:

19/09/11 08:34:42 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)
   .....
19/09/11 08:34:42 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
19/09/11 08:34:42 WARN MetricsSystem: Stopping a MetricsSystem that is not running
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
  at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)

YARN log:

Log Type: prelaunch.out

Log Upload Time: Wed Sep 11 08:34:41 +0800 2019

Log Length: 25

Setting up env variables

Log Type: prelaunch.err

Log Upload Time: Wed Sep 11 08:34:41 +0800 2019

Log Length: 1100

/hadoop/yarn/local/usercache/root/appcache/application_1568108384753_0002/container_e04_1568108384753_0002_02_000001/launch_container.sh: line 29: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:/usr/hdp/2.6.5.0-292/hadoop/conf:/usr/hdp/2.6.5.0-292/hadoop/*:/usr/hdp/2.6.5.0-292/hadoop/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:/usr/hdp/current/ext/hadoop/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:/usr/hdp/current/ext/hadoop/*:$PWD/__spark_conf__/__hadoop_conf__: bad substitution

 

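Root cause:

The variable ${hdp.version} cannot be resolved. Nothing defines it, so it reaches launch_container.sh as a literal string, and bash rejects it because a parameter name cannot contain a dot.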
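The failure can be reproduced outside YARN. The following is a minimal sketch, assuming bash is available on PATH; the sample path is shortened from the classpath in the log above and is only for illustration.

```shell
# Minimal reproduction: bash cannot expand a parameter whose name contains a dot.
# "|| true" keeps the outer script running even though the inner bash exits non-zero.
out=$(bash -c 'echo "/usr/hdp/${hdp.version}/hadoop/lib"' 2>&1 || true)
echo "$out"   # the message ends with "bad substitution"

# Once the placeholder has been replaced by a concrete value, the same line expands cleanly.
ok=$(bash -c 'echo "/usr/hdp/2.6.5.0-292/hadoop/lib"')
echo "$ok"
```

This is why the fix has to happen before the launch script is generated: the placeholder must already be replaced by the time the classpath is written into launch_container.sh.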

 

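Solution 1 (recommended):

Most of the solutions found online add spark.driver.extraJavaOptions -Dhdp.version=XXX and spark.yarn.am.extraJavaOptions -Dhdp.version=XXX to supply the missing hdp.version, but that approach still does not work with --deploy-mode cluster. Looking closely at the log, it is the group of mr.... paths that brings in the reference to ${hdp.version}. I checked the MapReduce2 configuration file, mapred-site, and it does indeed reference this variable. So all we need to do is define the variable there: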

hdp.version 2.7.3.2.6.5.0-292

It is that simple: adding this one variable to mapred-site is enough. It resolves all of the bad substitution errors, and no other changes are needed.
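To confirm which configuration files actually reference the variable before changing anything, a small check like the one below may help. It is only a sketch: find_hdp_refs is a hypothetical helper name, and the file path in the usage comment is an assumption based on the /etc/hadoop/conf directory that appears in the log.

```shell
# Hypothetical helper: print the numbered lines of a config file that
# contain the literal text ${hdp.version}.
# -F treats the pattern as a fixed string; the single quotes stop the
# shell from trying to expand it.
find_hdp_refs() {
  grep -n -F '${hdp.version}' "$1"
}

# Example usage (path is an assumption, adjust to your cluster):
# find_hdp_refs /etc/hadoop/conf/mapred-site.xml
```

If the helper prints nothing for mapred-site.xml, the reference is probably coming from a different configuration file and this solution may not apply as described.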


Solution 2 (not recommended):

1. Check the Hadoop version on the cluster

#hadoop version
Hadoop 2.7.3.2.6.5.0-292
Subversion git@github.com:hortonworks/hadoop.git -r 3091053c59a62c82d82c9f778c48bde5ef0a89a1
Compiled by jenkins on 2018-05-11T08:07Z
Compiled with protoc 2.5.0
From source with checksum abed71da5bc89062f6f6711179f2058
This command was run using /usr/hdp/2.6.5.0-292/hadoop/hadoop-common-2.7.3.2.6.5.0-292.jar

2. Add the version number to spark-defaults.conf

spark.driver.extraJavaOptions -Dhdp.version=XXX
spark.yarn.am.extraJavaOptions -Dhdp.version=XXX

where XXX is the Hadoop version.
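The two steps above can be combined into a small script. This is a sketch only: hdp_opts is a hypothetical helper name, and it assumes the first line of the hadoop version output has the form "Hadoop <version>", as in the output shown in step 1.

```shell
# Hypothetical helper: take the output of `hadoop version` as its argument
# and print the two spark-defaults.conf lines with XXX filled in.
hdp_opts() {
  # The version is the second field of the first line, e.g. "Hadoop 2.7.3.2.6.5.0-292".
  ver=$(printf '%s\n' "$1" | awk 'NR==1 {print $2}')
  printf 'spark.driver.extraJavaOptions -Dhdp.version=%s\n' "$ver"
  printf 'spark.yarn.am.extraJavaOptions -Dhdp.version=%s\n' "$ver"
}

# Example usage on a cluster node:
# hdp_opts "$(hadoop version)"
```

The printed lines can then be reviewed and copied into spark-defaults.conf by hand, which is safer than appending to the file automatically.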

 

    
  • admin

    My Spark Thrift Server also failed to start with a bad substitution error. Adding these two settings to spark2-thrift-sparkconf fixed it for me as well:
    spark.driver.extraJavaOptions -Dhdp.version=2.7.3.2.6.5.0-292
    spark.yarn.am.extraJavaOptions -Dhdp.version=2.7.3.2.6.5.0-292
