Spark-to-MongoDB exception: Cannot cast BsonValue

spark | 2019-09-13 10:02:39

Spark threw the following exception while writing to MongoDB:

Job aborted due to stage failure: Task 26 in stage 16.0 failed 4 times, most recent failure: Lost task 26.3 in stage 16.0 (TID 21384, 10.10.22.202, executor 1): com.mongodb.spark.exceptions.MongoTypeConversionException: Cannot cast 409.922222 into a BsonValue. DecimalType(10,6) has no matching BsonValue.
at com.mongodb.spark.sql.MapFunctions$.com$mongodb$spark$sql$MapFunctions$$elementTypeToBsonValue(MapFunctions.scala:153)
at com.mongodb.spark.sql.MapFunctions$$anonfun$4.apply(MapFunctions.scala:86)
at com.mongodb.spark.sql.MapFunctions$$anonfun$4.apply(MapFunctions.scala:86)
at scala.util.Try$.apply(Try.scala:192)
at com.mongodb.spark.sql.MapFunctions$.com$mongodb$spark$sql$MapFunctions$$convertToBsonValue(MapFunctions.scala:86)
at com.mongodb.spark.sql.MapFunctions$$anonfun$rowToDocument$1.apply(MapFunctions.scala:58)
at com.mongodb.spark.sql.MapFunctions$$anonfun$rowToDocument$1.apply(MapFunctions.scala:56)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at com.mongodb.spark.sql.MapFunctions$.rowToDocument(MapFunctions.scala:56)
at com.mongodb.spark.MongoSpark$$anonfun$1.apply(MongoSpark.scala:162)
at com.mongodb.spark.MongoSpark$$anonfun$1.apply(MongoSpark.scala:162)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$GroupedIterator.takeDestructively(Iterator.scala:1076)
at scala.collection.Iterator$GroupedIterator.go(Iterator.scala:1091)
at scala.collection.Iterator$GroupedIterator.fill(Iterator.scala:1128)
at scala.collection.Iterator$GroupedIterator.hasNext(Iterator.scala:1132)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at com.mongodb.spark.MongoSpark$$anonfun$save$1$$anonfun$apply$1.apply(MongoSpark.scala:132)
at com.mongodb.spark.MongoSpark$$anonfun$save$1$$anonfun$apply$1.apply(MongoSpark.scala:131)
at com.mongodb.spark.MongoConnector$$anonfun$withCollectionDo$1.apply(MongoConnector.scala:186)
at com.mongodb.spark.MongoConnector$$anonfun$withCollectionDo$1.apply(MongoConnector.scala:184)
at com.mongodb.spark.MongoConnector$$anonfun$withDatabaseDo$1.apply(MongoConnector.scala:171)
at com.mongodb.spark.MongoConnector$$anonfun$withDatabaseDo$1.apply(MongoConnector.scala:171)
at com.mongodb.spark.MongoConnector.withMongoClientDo(MongoConnector.scala:154)
at com.mongodb.spark.MongoConnector.withDatabaseDo(MongoConnector.scala:171)
at com.mongodb.spark.MongoConnector.withCollectionDo(MongoConnector.scala:184)
at com.mongodb.spark.MongoSpark$$anonfun$save$1.apply(MongoSpark.scala:131)
at com.mongodb.spark.MongoSpark$$anonfun$save$1.apply(MongoSpark.scala:130)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:929)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:929)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2067)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2067)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
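
For context, here is a minimal sketch of the kind of write that can trigger this error. The URI, app name, and column names are hypothetical placeholders; the DecimalType(10,6) column mirrors the 409.922222 value in the stack trace:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.DecimalType
import com.mongodb.spark.MongoSpark

// Hypothetical output URI; replace with your own database.collection.
val spark = SparkSession.builder()
  .appName("decimal-to-mongo")
  .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/test.metrics")
  .getOrCreate()
import spark.implicits._

// A DecimalType(10,6) column, like the one the connector choked on.
val df = Seq("409.922222").toDF("raw")
  .select($"raw".cast(DecimalType(10, 6)).as("value"))

// With a connector version that does not know how to map DecimalType to a
// BsonValue, this save throws MongoTypeConversionException on the executors.
MongoSpark.save(df)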

Root cause analysis:

I figured BSON works much like JSON, so I went to the official MongoDB site to check, and discovered that the MongoDB Connector for Spark also has version-matching requirements against Spark.

MongoDB Connector for Spark    Spark Version    MongoDB Version
2.3.1                          2.3.x            2.6 or later
2.2.5                          2.2.x            2.6 or later
2.1.4                          2.1.x            2.6 or later
2.0.0                          2.0.x            2.6 or later
1.1.0                          1.6.x            2.6 or later
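
If you build your job with sbt rather than dropping jars into spark/jars, the same version pinning can be expressed as dependencies. This is a sketch; adjust the versions to whichever connector line matches your cluster's Spark, per the table above:

// build.sbt sketch: pin the connector line that matches Spark 2.3.x.
libraryDependencies ++= Seq(
  "org.mongodb.spark" %% "mongo-spark-connector" % "2.3.1",
  "org.mongodb"       %  "mongo-java-driver"     % "3.8.0"
)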

Solution:

I went straight to Maven and downloaded mongo-spark-connector_2.11-2.3.1.jar and mongo-java-driver-3.8.0.jar to match the cluster's Spark version, dropped them into Spark's jars directory (no restart needed), tried again right away, and sure enough it worked.
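
If upgrading the connector is not an option, another hedge is to cast decimal columns to doubles before saving. This trades exact decimal precision for compatibility, so only use it when that loss is acceptable. A sketch, reusing the df from the earlier snippet:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DecimalType, DoubleType}

// Replace every DecimalType column with DoubleType before the write.
// Precision beyond what a double can represent is lost, so this is a
// workaround, not a fix.
val safeDf = df.schema.fields.foldLeft(df) { (acc, field) =>
  field.dataType match {
    case _: DecimalType => acc.withColumn(field.name, col(field.name).cast(DoubleType))
    case _              => acc
  }
}
MongoSpark.save(safeDf)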


