R/sparse_tidiers.R
tdm_tidiers.Rd
Tidy a DocumentTermMatrix or TermDocumentMatrix into
a three-column data frame: document, term, and count (with
zeros omitted), with one row per term per document.
x: A DocumentTermMatrix or TermDocumentMatrix object
...: Extra arguments, not used
row_names: Specify row names
col_names: Specify column names
library(tidytext)

# AssociatedPress is a DocumentTermMatrix shipped with topicmodels
if (requireNamespace("topicmodels", quietly = TRUE)) {
  data("AssociatedPress", package = "topicmodels")
  AssociatedPress
  tidy(AssociatedPress)
}
#> # A tibble: 302,031 × 3
#>    document term       count
#>       <int> <chr>      <dbl>
#>  1        1 adding         1
#>  2        1 adult          2
#>  3        1 ago            1
#>  4        1 alcohol        1
#>  5        1 allegedly      1
#>  6        1 allen          1
#>  7        1 apparently     2
#>  8        1 appeared       1
#>  9        1 arrested       1
#> 10        1 assault        1
#> # ℹ 302,021 more rows
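
The reshaping described above (one row per term per document, with zero counts left out) can be sketched in base R on a small dense matrix. This is only an illustration of the idea under stated assumptions, not the package's implementation: the matrix `m`, its document and term names, and the `tidy_counts` variable are all hypothetical.

```r
# Hypothetical toy counts: 2 documents (rows) x 3 terms (columns).
m <- matrix(c(2, 0, 1,
              0, 3, 0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("doc1", "doc2"), c("apple", "bird", "cat")))

# Positions of the non-zero cells, as (row, col) index pairs.
nz <- which(m != 0, arr.ind = TRUE)

# One row per document-term pair that has a non-zero count.
tidy_counts <- data.frame(
  document = rownames(m)[nz[, "row"]],
  term     = colnames(m)[nz[, "col"]],
  count    = m[nz],
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Sort by document, then term, for readability.
tidy_counts <- tidy_counts[order(tidy_counts$document, tidy_counts$term), ]
tidy_counts
```

Only three of the six cells survive, because the zero entries carry no information in the long format. A sparse matrix class stores just those non-zero cells to begin with, so converting it to a tidy table is mostly a matter of attaching the row and column names.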