Spark user-defined function (UDF) example

spark | 2021-05-13 09:28:26

1. Define the UDF

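// Sum the values of all map entries whose key is less than or equal to maxKey.
// Keys are Strings, so the comparison is lexicographic, not numeric.
  def accumulateMapValues(map: Map[String, Double], maxKey: String): Double = map.filter(kv => maxKey >= kv._1).values.sum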
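Because the keys are compared as strings, it helps to check the function on plain Scala data before wrapping it as a UDF. The following is a minimal sketch that needs no Spark; the object name `AccumulateDemo` and the sample map `counts` are hypothetical and only serve as illustration.

```scala
object AccumulateDemo {
  // Same logic as the function above: sum the values whose key is <= maxKey,
  // using lexicographic String comparison.
  def accumulateMapValues(map: Map[String, Double], maxKey: String): Double =
    map.filter(kv => maxKey >= kv._1).values.sum

  // Hypothetical sample data: key -> count.
  val counts: Map[String, Double] =
    Map("1" -> 10.0, "2" -> 5.0, "3" -> 2.0, "10" -> 7.0)

  def main(args: Array[String]): Unit = {
    // "1", "10" and "2" all compare <= "2" as strings, so 10.0 + 7.0 + 5.0 is summed.
    println(accumulateMapValues(counts, "2")) // prints 22.0
  }
}
```

Note that "10" is included when maxKey is "2". If the keys are meant to be numeric levels, zero-padding them (for example "02", "10") or using a numeric key type avoids this surprise.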

2. Wrap it for use with the DataFrame API

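import org.apache.spark.sql.expressions.{UserDefinedFunction, Window}
import org.apache.spark.sql.functions.{collect_list, map_from_entries, struct, udf}
// The $"..." column syntax also needs the session's implicits in scope.

def accumulateMapValuesUdf: UserDefinedFunction = udf(accumulateMapValues _)

// Collect each project's (name, count) pairs into one map column
df.withColumn("value_map", map_from_entries(collect_list(struct($"name", $"count")).over(Window.partitionBy("project_id"))))
// Cumulative head count
.withColumn("accumulate", accumulateMapValuesUdf($"value_map", $"level"))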

3. Register it for use in Spark SQL

SPARK.udf.register("sumWithMap", accumulateMapValues _)

SPARK.sql("select sumWithMap(value_map,level) from view")
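The three steps can be combined into one small program. This is only a sketch under several assumptions: a local SparkSession stands in for the article's SPARK value, the sample rows and the view name `project_levels` are hypothetical, and the `count` column is a double so that it matches the Map[String, Double] parameter. It assumes Spark 2.4 or later, where map_from_entries is available.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{collect_list, map_from_entries, struct, udf}

object UdfEndToEndSketch {
  // Step 1: the plain Scala function (sums values whose key is <= maxKey).
  def accumulateMapValues(map: Map[String, Double], maxKey: String): Double =
    map.filter(kv => maxKey >= kv._1).values.sum

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for experimentation.
    val spark = SparkSession.builder().master("local[*]").appName("udf-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical data: (project_id, name, count, level).
    val df = Seq(
      (1, "L1", 10.0, "L1"),
      (1, "L2", 5.0, "L2"),
      (1, "L3", 2.0, "L3")
    ).toDF("project_id", "name", "count", "level")

    // Build one name -> count map per project and attach it to every row.
    val withMap = df.withColumn(
      "value_map",
      map_from_entries(
        collect_list(struct($"name", $"count")).over(Window.partitionBy("project_id"))
      )
    )

    // Step 2: use the function through the DataFrame API.
    val accumulateMapValuesUdf = udf(accumulateMapValues _)
    withMap
      .withColumn("accumulate", accumulateMapValuesUdf($"value_map", $"level"))
      .show(false)

    // Step 3: register the same function by name and call it from SQL.
    spark.udf.register("sumWithMap", accumulateMapValues _)
    withMap.createOrReplaceTempView("project_levels")
    spark.sql("select name, sumWithMap(value_map, level) as accumulate from project_levels")
      .show(false)

    spark.stop()
  }
}
```

The name passed to udf.register is the one the SQL text must use, so both have to match exactly. The names within a project should also be unique, because map_from_entries may reject duplicate keys depending on the Spark version and configuration.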
