class: center, middle, inverse, title-slide # Growing a book in the open ##
from blog post to O’Reilly paperback ### Julia Silge | JSM 2018 | 30 July 2018 --- layout: true <div class="my-footer"><span>bit.ly/silge-jsm-book</span></div> --- class: center, middle # Check out the recording for today's Session 211 ## ["Addressing Sexual Misconduct in the Statistics Community"](http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/AbstractDetails.cfm?abstractid=329833) Thanks to [Stephanie Hicks](http://www.stephaniehicks.com/) for organizing this important session --- class: inverse, right, middle # Find me at... <a href="http://twitter.com/juliasilge"><i class="fa fa-twitter fa-fw"></i> @juliasilge</a><br> <a href="http://github.com/juliasilge"><i class="fa fa-github fa-fw"></i> @juliasilge</a><br> <a href="https://juliasilge.com"><i class="fa fa-link fa-fw"></i> juliasilge.com</a><br> <a href="https://tidytextmining.com"><i class="fa fa-book fa-fw"></i> tidytextmining.com</a><br> <a href="mailto:julia.silge@gmail.com"><i class="fa fa-paper-plane fa-fw"></i> julia.silge@gmail.com</a> --- background-image: url(images/not-a-robot.png) background-size: 750px background-position: 50% 30% # Are you a robot? --- background-image: url(images/shedoesntgohere.gif) background-position: 50% 50% # Who am I? --- class: inverse # In early 2016... -- - I was interested in this so-called "data science" -- - I was pretty sure no one would hire me -- - I needed a ✨portfolio✨ --- background-image: url(images/prideandprejudice.gif) background-size: 700px background-position: 50% 50% # Write what you know --- class: center, middle background-image: url("images/blog_screenshot.png") --- class: left background-image: url(images/austen_heatmap.png) background-size: 900px background-position: 50% 50% # Text mining and Jane Austen --- class: inverse, center, middle # In July 2017... --- class:center, middle background-image: url(images/cover.png) background-size: 450px --- class:center, middle background-image: url(images/ropensci_tweet.png) background-size: 600px --- # Unnest tokens to tidy text ```r library(tidytext) tidy_books <- janeaustenr::austen_books() %>% unnest_tokens(word, text) tidy_books %>% anti_join(stop_words) %>% count(word, sort = TRUE) ``` ``` ## # A tibble: 13,914 x 2 ## word n ## <chr> <int> ## 1 miss 1855 ## 2 time 1337 ## 3 fanny 862 ## 4 dear 822 ## 5 lady 817 ## 6 sir 806 ## 7 day 797 ## 8 emma 787 ## 9 sister 727 ## 10 house 699 ## # ... with 13,904 more rows ``` --- # Tidy a document-term matrix ```r comparison <- tidy(AssociatedPress) %>% count(word = term) %>% rename(AP = n) %>% inner_join(count(tidy_books, word)) %>% rename(Austen = n) %>% mutate(AP = AP / sum(AP), Austen = Austen / sum(Austen)) comparison ``` ``` ## # A tibble: 4,735 x 3 ## word AP Austen ## <chr> <dbl> <dbl> ## 1 abandoned 0.000169 0.00000461 ## 2 abide 0.0000289 0.0000184 ## 3 abilities 0.0000289 0.000134 ## 4 ability 0.000236 0.0000138 ## 5 able 0.000661 0.00141 ## 6 abroad 0.000193 0.000166 ## 7 abrupt 0.0000289 0.0000230 ## 8 absence 0.0000772 0.000511 ## 9 absent 0.0000434 0.000230 ## 10 absolute 0.0000531 0.000120 ## # ... with 4,725 more rows ``` --- ![](growing_book_jsm_files/figure-html/unnamed-chunk-4-1.png)<!-- --> --- class: center, middle background-image: url(images/tidytext.png) background-size: 500px --- class: center, middle background-image: url(images/tidytext_repo.png) background-size: 900px --- # Growing a book in the open - **March 2016** initial blog post - **April 2016** rOpenSci unconf, the beginning of tidytext, tidytext on CRAN --- background-image: url(images/heatedblogs.gif) background-size: 500px background-position: 50% 50% # Blogging --- # Blogging - Nobody is going to stop you from blogging about data science and/or statistics --- background-image: url(images/saynotohorse.gif) background-size: 400px background-position: 50% 50% # Blogging --- # Blogging - Nobody is going to stop you from blogging about data science and/or statistics -- - Blogging allows you to **practice** relevant skills -- - Blogging offers opportunities for **community** and **feedback** --- class:center, middle background-image: url(images/cover_bookdown.jpg) background-size: 400px --- # Growing a book in the open - **March 2016** initial blog post - **April 2016** rOpenSci unconf, the beginning of tidytext, tidytext on CRAN - **July 2016** bookdown repo started --- class: inverse background-image: url(images/aaron-burden-126996-unsplash.jpg) background-size: cover # Collaboration --- # Growing a book in the open - **March 2016** initial blog post - **April 2016** rOpenSci unconf, the beginning of tidytext, tidytext on CRAN - **July 2016** bookdown repo started - **October 2016** we announced *Tidy Text Mining with R* --- class: inverse background-image: url(images/writing_life.jpg) background-size: 300px background-position: 18% 60% ## Writing advice from Annie Dillard --- class: inverse, right <h1 class="fa fa-quote-left fa-fw"></h1> One of the things I know about writing is this: spend it all, shoot it, play it, lose it, all, right away, every time. -- Do not hoard what seems good for a later place in the book or for another book; give it, give it all, give it now. -- <h2> The impulse to save something good for a better place later is the signal to spend it now. </h2> <h1 class="fa fa-quote-right fa-fw"></h1> --- class: inverse, right <h1 class="fa fa-quote-left fa-fw"></h1> Similarly, the impulse to keep to yourself what you have learned is not only shameful, it is destructive. -- <h2> Anything you do not give freely and abundantly becomes lost to you. You open your safe and find ashes. </h2> <h1 class="fa fa-quote-right fa-fw"></h1> --- # Growing a book in the open - **March 2016** initial blog post - **April 2016** rOpenSci unconf, the beginning of tidytext, tidytext on CRAN - **July 2016** bookdown repo started - **October 2016** we announced *Tidy Text Mining with R* -- - **December 2016** signed agreement with O'Reilly --- # Why publish a book? --- background-image: url(images/money.gif) background-size: 400px background-position: 50% 50% # Why publish a book? --- # Why publish a book? - A technical book is **not** the most lucrative side project you can embark on -- - The gravitas of a "real" publisher may or may not be the right fit any individual --- class: inverse, center, middle # Then we had to actually submit a book --- class: bottom, right background-image: url(images/yihui_simple.png) background-size: 700px Advice from [Yihui Xie](https://yihui.name/en/2018/07/r-markdown-book/) --- background-image: url(images/darkness.png) background-size: 400px --- # Growing a book in the open - **March 2016** initial blog post - **April 2016** rOpenSci unconf, the beginning of tidytext, tidytext on CRAN - **July 2016** bookdown repo started - **October 2016** we announced *Tidy Text Mining with R* - **December 2016** signed agreement with O'Reilly - **April 2017** submitted final draft to O'Reilly -- - **July 2017** book published 🎉 --- class: inverse, left, middle # Thanks! <a href="http://twitter.com/juliasilge"><i class="fa fa-twitter fa-fw"></i> @juliasilge</a><br> <a href="http://github.com/juliasilge"><i class="fa fa-github fa-fw"></i> @juliasilge</a><br> <a href="https://juliasilge.com"><i class="fa fa-link fa-fw"></i> juliasilge.com</a><br> <a href="https://tidytextmining.com"><i class="fa fa-book fa-fw"></i> tidytextmining.com</a><br> <a href="mailto:julia.silge@gmail.com"><i class="fa fa-paper-plane fa-fw"></i> julia.silge@gmail.com</a> Slides created with the R package [**xaringan**](https://github.com/yihui/xaringan) Title slide photo by [Aaron Burden](https://unsplash.com/photos/Kp9z6zcUfGw) on [Unsplash](https://unsplash.com/)