Tidy a DocumentTermMatrix or TermDocumentMatrix into a three-column data frame of document, term, and count (zero entries omitted), with one row per term per document.
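The target shape can be illustrated without any text-mining package. The sketch below is not the package's implementation; it uses base R only, with a made-up toy matrix (`m`) and a hypothetical result name (`tidy_df`), to show what "one row per term per document, zeros omitted" looks like.

```r
# Toy dense document-term matrix (hypothetical data for illustration).
m <- matrix(
  c(1, 0, 2,
    0, 3, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("apple", "banana", "cherry"))
)

# Locate the non-zero cells as (row, col) index pairs.
nz <- which(m != 0, arr.ind = TRUE)

# One row per non-zero (document, term) pair; zero cells never appear.
tidy_df <- data.frame(
  document = rownames(m)[nz[, "row"]],
  term     = colnames(m)[nz[, "col"]],
  count    = m[nz],
  stringsAsFactors = FALSE
)

print(tidy_df)
```

The matrix has six cells but only three are non-zero, so the result has three rows.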

# S3 method for DocumentTermMatrix
tidy(x, ...)

# S3 method for TermDocumentMatrix
tidy(x, ...)

# S3 method for dfm
tidy(x, ...)

# S3 method for dfmSparse
tidy(x, ...)

# S3 method for simple_triplet_matrix
tidy(x, row_names = NULL, col_names = NULL, ...)

Arguments

x

A DocumentTermMatrix or TermDocumentMatrix object. Methods are also provided for dfm, dfmSparse, and simple_triplet_matrix objects.

...

Extra arguments, not used

row_names

Specify row names (simple_triplet_matrix method only; defaults to NULL)

col_names

Specify column names (simple_triplet_matrix method only; defaults to NULL)
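A minimal sketch of how row_names and col_names might be supplied when tidying a simple_triplet_matrix. It assumes the slam and tidytext packages are installed; the toy matrix and the names `stm` and `tidied` are made up for illustration, and the column names of the result may differ from the document/term/count layout returned for a DocumentTermMatrix.

```r
# Sketch only: runs when both 'slam' and 'tidytext' are available.
if (requireNamespace("slam", quietly = TRUE) &&
    requireNamespace("tidytext", quietly = TRUE)) {
  # A 2 x 3 sparse matrix with three non-zero cells (toy data).
  stm <- slam::simple_triplet_matrix(
    i = c(1, 1, 2),  # row indices of the non-zero cells
    j = c(1, 3, 2),  # column indices of the non-zero cells
    v = c(1, 2, 3),  # the non-zero values
    nrow = 2,
    ncol = 3
  )

  # Supply labels explicitly, since this matrix carries no dimnames.
  tidied <- tidytext::tidy(
    stm,
    row_names = c("doc1", "doc2"),
    col_names = c("apple", "banana", "cherry")
  )

  print(tidied)
}
```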

Examples

if (requireNamespace("topicmodels", quietly = TRUE)) {
  data("AssociatedPress", package = "topicmodels")
  AssociatedPress

  tidy(AssociatedPress)
}
#> # A tibble: 302,031 x 3
#>    document term       count
#>       <int> <chr>      <dbl>
#>  1        1 adding         1
#>  2        1 adult          2
#>  3        1 ago            1
#>  4        1 alcohol        1
#>  5        1 allegedly      1
#>  6        1 allen          1
#>  7        1 apparently     2
#>  8        1 appeared       1
#>  9        1 arrested       1
#> 10        1 assault        1
#> # … with 302,021 more rows