Get topics' words from the LDA model.

@Zygmunt Zawadzki · Feb 7, 2018 · 4 min read

Some time ago I had to move from sparklyr to Scala for better integration with Spark and easier collaboration with the other developers on my team. The switch was much easier than I expected, because Spark's DataFrame API is quite similar to dplyr: there is a groupBy function, agg instead of summarise, and so on. You can also use plain old SQL to operate on data frames. In this post, I'll show how to fit a very simple LDA (Latent Dirichlet Allocation) model and then extract the words that describe each topic.
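To make the dplyr comparison concrete, here is a minimal sketch of the same summary written twice: once with groupBy and agg, and once with SQL. The toy data, the column names (grp, value), the view name (toy) and the object name DplyrStyleSummary are all made up for illustration, and the example assumes Spark SQL is on the classpath and can run in local mode.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, count, lit}

// Hypothetical example object; not taken from the original post.
object DplyrStyleSummary {

  // Rough dplyr equivalent:
  //   df %>% group_by(grp) %>% summarise(mean_value = mean(value), n = n())
  def summarise(df: DataFrame): DataFrame =
    df.groupBy("grp")
      .agg(avg("value").as("mean_value"), count(lit(1)).as("n"))
      .orderBy("grp")

  def main(args: Array[String]): Unit = {
    // Local session, only for trying the example out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dplyr-style-summary")
      .getOrCreate()
    import spark.implicits._

    // Toy data frame (assumed data).
    val df = Seq(("a", 1.0), ("a", 3.0), ("b", 5.0)).toDF("grp", "value")

    // DataFrame API version.
    summarise(df).show()

    // The same query in SQL, run against a temporary view.
    df.createOrReplaceTempView("toy")
    spark.sql(
      "SELECT grp, AVG(value) AS mean_value, COUNT(*) AS n " +
        "FROM toy GROUP BY grp ORDER BY grp"
    ).show()

    spark.stop()
  }
}
```

Both versions should return one row per group with its mean and row count. Which one to use is mostly a matter of taste. The DataFrame version is easier to compose from small functions, which is handy when sharing code with other developers.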
