Tidy a corpus object from the quanteda package. tidy returns a tbl_df with one row per document, with a text column containing the document's text and one column for each document-level metadata field. glance returns a one-row tbl_df with corpus-level metadata, such as source and created. For Corpus objects from the tm package, see tidy.Corpus().

# S3 method for corpus
tidy(x, ...)

# S3 method for corpus
glance(x, ...)

Arguments

x

A corpus object from the quanteda package

...

Extra arguments, not used

Details

For the most part, the tidy output is equivalent to the "documents" data frame in the corpus object, except that it is converted to a tbl_df and the texts column is renamed to text, to be consistent with other uses in tidytext.

Similarly, the glance output is simply the "metadata" object with its NULL fields removed, converted to a one-row tbl_df.
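
The two conversions described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `documents` and `metadata` are hypothetical stand-ins for the corresponding parts of a corpus object, and a plain data.frame stands in for a tbl_df.

```r
# Hypothetical stand-in for the "documents" data frame of a corpus object.
documents <- data.frame(
  texts = c("First document.", "Second document."),
  Year = c(1789L, 1793L),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the "metadata" object; one field is NULL.
metadata <- list(source = "example source", created = "2024-01-01", notes = NULL)

# tidy-like step: keep one row per document and rename texts to text.
tidy_like <- documents
names(tidy_like)[names(tidy_like) == "texts"] <- "text"

# glance-like step: drop NULL fields, then build a one-row data frame.
non_null <- Filter(Negate(is.null), metadata)
glance_like <- as.data.frame(non_null, stringsAsFactors = FALSE)
```

Here tidy_like keeps the Year column next to the renamed text column, and glance_like has one column per non-NULL metadata field.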

Examples


if (requireNamespace("quanteda", quietly = TRUE)) {
 data("data_corpus_inaugural", package = "quanteda")

 data_corpus_inaugural

 tidy(data_corpus_inaugural)
}
#> # A tibble: 59 × 5
#>    text                                           Year President FirstName Party
#>    <chr>                                         <int> <chr>     <chr>     <fct>
#>  1 "Fellow-Citizens of the Senate and of the Ho…  1789 Washingt… George    none 
#>  2 "Fellow citizens, I am again called upon by …  1793 Washingt… George    none 
#>  3 "When it was first perceived, in early times…  1797 Adams     John      Fede…
#>  4 "Friends and Fellow Citizens:\n\nCalled upon…  1801 Jefferson Thomas    Demo…
#>  5 "Proceeding, fellow citizens, to that qualif…  1805 Jefferson Thomas    Demo…
#>  6 "Unwilling to depart from examples of the mo…  1809 Madison   James     Demo…
#>  7 "About to add the solemnity of an oath to th…  1813 Madison   James     Demo…
#>  8 "I should be destitute of feeling if I was n…  1817 Monroe    James     Demo…
#>  9 "Fellow citizens, I shall not attempt to des…  1821 Monroe    James     Demo…
#> 10 "In compliance with an usage coeval with the…  1825 Adams     John Qui… Demo…
#> # ℹ 49 more rows
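
The example above shows only tidy. A call to glance on the same corpus might look like the following sketch, which assumes that both quanteda and tidytext are installed and that tidytext is attached so the glance generic is available. The name corpus_summary is chosen here for illustration only.

```r
# Run only when both packages are available (an assumption of this sketch).
if (requireNamespace("quanteda", quietly = TRUE) &&
    requireNamespace("tidytext", quietly = TRUE)) {
  library(tidytext)

  data("data_corpus_inaugural", package = "quanteda")

  # Corpus-level metadata as a tbl_df, per the description above.
  corpus_summary <- glance(data_corpus_inaugural)
  corpus_summary
}
```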