Casting a data frame to a DocumentTermMatrix, TermDocumentMatrix, or dfm

These functions turn a "tidy" one-term-per-document-per-row data frame into a DocumentTermMatrix or TermDocumentMatrix from the tm package, or a dfm from the quanteda package. They support non-standard evaluation through the tidyeval framework. Any grouping on the data frame is ignored.

cast_tdm(data, term, document, value, weighting = tm::weightTf, ...)

cast_dtm(data, document, term, value, weighting = tm::weightTf, ...)

cast_dfm(data, document, term, value, ...)
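
As a rough illustration of the three signatures, the sketch below casts a small hand-made table. The data frame `word_counts` and its column names `doc`, `word`, and `n` are hypothetical; the example also assumes the tm and quanteda packages are installed alongside tidytext.

```r
library(tidytext)  # cast_dtm()/cast_tdm() also need tm; cast_dfm() needs quanteda

# Hypothetical toy data: one row per (document, term) pair with a count.
word_counts <- data.frame(
  doc  = c("a", "a", "b", "b", "b"),
  word = c("apple", "pear", "apple", "fig", "pear"),
  n    = c(2, 1, 1, 3, 1)
)

# Documents become rows, terms become columns.
dtm <- cast_dtm(word_counts, doc, word, n)

# Note the argument order: term comes before document here.
tdm <- cast_tdm(word_counts, word, doc, n)

# Same argument order as cast_dtm(), but returns a quanteda dfm.
dfm <- cast_dfm(word_counts, doc, word, n)

dim(dtm)  # 2 documents by 3 terms
dim(tdm)  # 3 terms by 2 documents
```

The only difference in calling convention is that cast_tdm() takes term before document, while cast_dtm() and cast_dfm() take document first.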

Arguments

data: Table with one term per document per row
term: Column containing terms as string or symbol
document: Column containing document IDs as string or symbol
value: Column containing values as string or symbol
weighting: The weighting function for the DTM/TDM (default is term-frequency, effectively unweighted)
...: Extra arguments passed on to Matrix::sparseMatrix()
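
The weighting argument can be sketched as follows, assuming the tm package is installed. The table `word_counts` and its columns are hypothetical, and tm::weightTfIdf is used only as one example of a tm weighting function that could replace the default.

```r
library(tidytext)

# Hypothetical toy data: one row per (document, term) pair with a count.
word_counts <- data.frame(
  doc  = c("a", "a", "b", "b", "b"),
  word = c("apple", "pear", "apple", "fig", "pear"),
  n    = c(2, 1, 1, 3, 1)
)

# Default weighting (tm::weightTf): the counts are stored as given.
dtm_tf <- cast_dtm(word_counts, doc, word, n)

# Swap in another tm weighting function, here tf-idf.
dtm_tfidf <- cast_dtm(word_counts, doc, word, n,
                      weighting = tm::weightTfIdf)
```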

Details

The arguments term, document, and value are passed by expression and support quasiquotation; you can unquote strings and symbols.
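
A minimal sketch of what this means in practice, assuming the rlang and tm packages are installed. The variables `doc_col` and `term_col` and the table `word_counts` are hypothetical names introduced for illustration.

```r
library(tidytext)
library(rlang)

# Hypothetical toy data: one row per (document, term) pair with a count.
word_counts <- data.frame(
  doc  = c("a", "a", "b", "b", "b"),
  word = c("apple", "pear", "apple", "fig", "pear"),
  n    = c(2, 1, 1, 3, 1)
)

# Column names held in variables, one as a string and one as a symbol.
doc_col  <- "doc"
term_col <- sym("word")

# Bare column names, passed by expression.
dtm_bare <- cast_dtm(word_counts, doc, word, n)

# Unquoting with !! substitutes the stored string or symbol.
dtm_unquoted <- cast_dtm(word_counts, !!doc_col, !!term_col, n)

# Both calls should describe the same 2 x 3 matrix.
all(dim(dtm_bare) == dim(dtm_unquoted))
```

Unquoting is mainly useful when the column names are not known until run time, for example inside a wrapper function that receives them as arguments.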