spark hive Can not create the managed table('`xxx`'). The associated location('xxx') already exists

spark | 2020-12-07 09:28:01

Exception:

The following exception is thrown when a Spark job writes to a Hive table:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Can not create the managed table('`original_exam_question_score`'). The associated location('hdfs://master122:8020/opt/cdh/hive/warehouse/h_431000_923a0d14be0244eb8c509ce0a57684bb.db/original_exam_question_score') already exists.;
	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.validateTableLocation(SessionCatalog.scala:331)
	at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:170)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)

 

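Cause:

Most likely an earlier Spark job was forcibly stopped while it was writing this table. The job drops the table before every write, so the Hive metastore no longer holds any metadata for the table, but the table's files are still on HDFS. That mismatch triggers the exception.

The problem can occur in the following situations:

  • The cluster is terminated while a write operation is in progress.
  • A transient network problem occurs.
  • The job is interrupted.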
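
The condition behind the exception can be illustrated with a small sketch. It is not Spark's code: it assumes a local temporary directory as a stand-in for the HDFS warehouse, and the names `validate_table_location`, `LocationNotEmptyError` and `simulate_interrupted_write` are hypothetical. It only mirrors the rule described above: a managed table cannot be created when its location already exists and contains files.

```python
import tempfile
from pathlib import Path


class LocationNotEmptyError(Exception):
    """Hypothetical stand-in for Spark's AnalysisException in this sketch."""


def validate_table_location(location: Path, is_managed: bool,
                            allow_nonempty: bool = False) -> None:
    """Raise if a managed table's location already exists and holds files."""
    if not is_managed or allow_nonempty:
        return  # only managed tables are checked; the legacy flag skips the check
    if location.exists() and any(location.iterdir()):
        raise LocationNotEmptyError(
            f"Can not create the managed table. "
            f"The associated location('{location}') already exists."
        )


def simulate_interrupted_write() -> str:
    """Leave data files behind with no table metadata, then try to create the table."""
    with tempfile.TemporaryDirectory() as warehouse:
        table_dir = Path(warehouse) / "original_exam_question_score"
        table_dir.mkdir()
        # Files left over from the job that was stopped mid-write.
        (table_dir / "part-00000").write_text("leftover data")
        try:
            validate_table_location(table_dir, is_managed=True)
        except LocationNotEmptyError:
            return "rejected"
        return "created"
```

Calling `simulate_interrupted_write()` takes the rejected path, because the leftover file makes the location non-empty even though no table is registered.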

 

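Solution 1:

Set the property spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation to true. This is described as deleting the _STARTED directory and returning the process to its original state. In Spark 2.4 the flag disables the non-empty-location check that raises this exception.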

spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation","true")

Alternatively, set it in spark-defaults.conf:

spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation true

I tested this and it works.

 

Solution 2:

Delete the directory named in the exception message directly on HDFS. This is the approach I ended up using.
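
As a rough sketch of Solution 2, the snippet below pulls the location out of the exception message and builds the matching `hdfs dfs -rm -r` command without running it. The helper names `extract_location` and `build_remove_command` and the minimum-depth guard are assumptions added for illustration, not part of Spark or Hadoop.

```python
import re
from typing import List
from urllib.parse import urlparse

# Matches the location quoted in the AnalysisException message.
LOCATION_PATTERN = re.compile(
    r"The associated location\('([^']+)'\) already exists"
)


def extract_location(message: str) -> str:
    """Return the table location quoted in the exception message."""
    match = LOCATION_PATTERN.search(message)
    if match is None:
        raise ValueError("no table location found in message")
    return match.group(1)


def build_remove_command(location: str) -> List[str]:
    """Build, but do not run, the HDFS command that removes the directory."""
    parts = [p for p in urlparse(location).path.split("/") if p]
    # Assumed safety guard: refuse paths that are suspiciously shallow.
    if len(parts) < 2:
        raise ValueError(f"refusing to remove shallow path: {location}")
    return ["hdfs", "dfs", "-rm", "-r", location]
```

The resulting list can be passed to `subprocess.run` once you have confirmed that the directory holds only leftover files from the interrupted job.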
