Tidy a DocumentTermMatrix or TermDocumentMatrix into a three-column data frame with columns document, term, and count (zero entries are omitted), with one row per term per document.

# S3 method for DocumentTermMatrix
tidy(x, ...)

# S3 method for TermDocumentMatrix
tidy(x, ...)

# S3 method for dfm
tidy(x, ...)

# S3 method for dfmSparse
tidy(x, ...)

# S3 method for simple_triplet_matrix
tidy(x, row_names = NULL, col_names = NULL, ...)

Arguments

x

A DocumentTermMatrix or TermDocumentMatrix object. Methods are also provided for dfm, dfmSparse, and simple_triplet_matrix objects.

...

Extra arguments, not used

row_names

Row names to use; applies only to the simple_triplet_matrix method. Defaults to NULL.

col_names

Column names to use; applies only to the simple_triplet_matrix method. Defaults to NULL.

Examples


if (requireNamespace("topicmodels", quietly = TRUE)) {
  data("AssociatedPress", package = "topicmodels")
  AssociatedPress

  tidy(AssociatedPress)
}
#> # A tibble: 302,031 × 3
#>    document term       count
#>       <int> <chr>      <dbl>
#>  1        1 adding         1
#>  2        1 adult          2
#>  3        1 ago            1
#>  4        1 alcohol        1
#>  5        1 allegedly      1
#>  6        1 allen          1
#>  7        1 apparently     2
#>  8        1 appeared       1
#>  9        1 arrested       1
#> 10        1 assault        1
#> # ℹ 302,021 more rows
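
The output shape above (one row per term per document, zeros dropped) can be illustrated without any text-mining packages. The following is a minimal base R sketch, not the implementation behind tidy(): the matrix `m`, its document and term labels, and the variable names `nz` and `long` are all hypothetical and chosen only for illustration.

```r
# Toy document-term counts (hypothetical data):
# rows are documents, columns are terms.
m <- matrix(
  c(1, 0, 2,
    0, 3, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("apple", "banana", "cherry"))
)

# Locate the non-zero cells; each row of `nz` is a (row, col) index pair.
nz <- which(m != 0, arr.ind = TRUE)

# Build the long form: one row per term per document, zeros dropped.
long <- data.frame(
  document = rownames(m)[nz[, "row"]],
  term     = colnames(m)[nz[, "col"]],
  count    = m[nz],
  stringsAsFactors = FALSE
)

# Order by document, then term, for readability.
long <- long[order(long$document, long$term), ]
rownames(long) <- NULL
long
```

The dense matrix has six cells but only three non-zero counts, so `long` has three rows. A sparse document-term matrix stores only those non-zero entries, which is why the tidied result never contains zero counts.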