Exception message:
A Spark job throws the following exception while writing to a Hive table:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Can not create the managed table('`original_exam_question_score`'). The associated location('hdfs://master122:8020/opt/cdh/hive/warehouse/h_431000_923a0d14be0244eb8c509ce0a57684bb.db/original_exam_question_score') already exists.;
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.validateTableLocation(SessionCatalog.scala:331)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:170)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
Cause:
Most likely a previous Spark job writing to this table was forcibly killed mid-write. Since the job drops and recreates the table on every run, the table entry was already gone from the Hive metastore, but its files were still left behind on HDFS, which is why the next attempt to create the managed table failed with this exception.
This problem can occur when:
- The cluster is terminated while a write is in progress.
- A transient network issue occurs.
- The job is interrupted.
You can confirm the metastore/HDFS mismatch with the check shown right after this list.
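Here is a minimal spark-shell sketch for that check (it assumes the SparkSession spark is in scope; the database and path are copied from the exception above):

import org.apache.hadoop.fs.{FileSystem, Path}

// Is the table still registered in the Hive metastore?
val inMetastore = spark.catalog.tableExists(
  "h_431000_923a0d14be0244eb8c509ce0a57684bb", "original_exam_question_score")

// Does the table directory still exist on HDFS?
val location = new Path("hdfs://master122:8020/opt/cdh/hive/warehouse/" +
  "h_431000_923a0d14be0244eb8c509ce0a57684bb.db/original_exam_question_score")
val fs = FileSystem.get(location.toUri, spark.sparkContext.hadoopConfiguration)
val onHdfs = fs.exists(location)

// inMetastore == false together with onHdfs == true is exactly the broken state described above.
println(s"in metastore: $inMetastore, on HDFS: $onHdfs")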
Solution 1:
Set the property spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation to true. This deletes the _STARTED directory and returns the process to its original state:
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation","true")
Alternatively, set it in spark-defaults.conf:
spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation true
I have tested this myself and it works.
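For context, here is a minimal sketch of how the flag fits into a job that rewrites the table (a sketch only: it assumes a SparkSession named spark and a placeholder DataFrame df; the table name is taken from the exception above). Note that this legacy flag was introduced in Spark 2.4 and, as far as I know, removed again in Spark 3.0:

// Allow creating the managed table even though its HDFS location is non-empty.
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation", "true")

// The write that previously failed with the AnalysisException should now go through.
df.write
  .mode("overwrite")
  .saveAsTable("h_431000_923a0d14be0244eb8c509ce0a57684bb.original_exam_question_score")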
Solution 2:
Delete the directory named in the exception message directly on HDFS (for example with hdfs dfs -rm -r <path>). This is the approach I ended up taking.
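The same cleanup can be done from a spark-shell session via the Hadoop FileSystem API (a sketch only; the path is copied verbatim from the exception, so double-check it before deleting anything):

import org.apache.hadoop.fs.{FileSystem, Path}

// Leftover table directory, copied from the exception message.
val stale = new Path("hdfs://master122:8020/opt/cdh/hive/warehouse/" +
  "h_431000_923a0d14be0244eb8c509ce0a57684bb.db/original_exam_question_score")

val fs = FileSystem.get(stale.toUri, spark.sparkContext.hadoopConfiguration)
if (fs.exists(stale)) {
  fs.delete(stale, true) // recursive delete of the stale directory
}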