In this tutorial, we show how to use the quanteda package to analyze the Manifesto Corpus. We assume that you have already read First steps with manifestoR (at least up to "Downloading documents from the Manifesto Corpus") and that you are familiar with the pipe operator %>%. The tutorial was written in 2018 based on the quanteda version available at that time, but it was slightly adapted in 2021 to be compatible with quanteda version >= 3.0 (which also means that it might now be less compatible with quanteda versions < 3.0).
Grammar and logic of quanteda
Quanteda (Benoit et al. 2018) is a comprehensive and powerful text analysis R package. It is well documented, fast, and versatile.
Quanteda has three main objects:
- Corpora (created with the corpus() function). These contain the texts and document-level meta information in the form of docvars().
- Tokens objects (created with tokens()). The tokens() function tokenizes corpora into tokens. Tokens can be of different kinds, such as words, sentences, or ngrams.
- Document-feature matrices (created with dfm()). dfms are matrices in which each row represents one document. Columns represent features (mostly tokens, e.g. words or n-grams) and cells contain information about the occurrence of features within documents. The dfm is the starting point for most types of analyses that draw inferences from the frequency of tokens.
Most quanteda functions take one of these three objects as input and transform it in some way. The functions are consistently and intuitively named, e.g. dfm_group() groups a document-feature matrix, tokens_remove() removes tokens from a tokens object, etc.
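As a minimal self-contained illustration of this pipeline, here is a sketch with two made-up example texts (not Manifesto Corpus data):
library(quanteda)

## two toy documents
toy_corpus <- corpus(c(doc1 = "We will cut taxes for working families.",
                       doc2 = "We will invest in schools and families."))

## corpus -> tokens -> document-feature matrix
toy_dfm <- dfm(tokens(toy_corpus))
toy_dfm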
manifestoR and Quanteda
We first use the usual "header" of a manifestoR script: loading packages, setting the API key, and fixing the corpus version (to ensure reproducibility).
library(manifestoR)
library(quanteda)
library(quanteda.textstats)
library(dplyr)
library(ggplot2)
library(tidyr)
library(stringr)
mp_setapikey(key.file = "manifesto_apikey.txt")
mp_use_corpus_version("2017-2")
Before working with the Manifesto Corpus in quanteda, it is important to think about the level of analysis (the level of aggregation). Many operations in quanteda are meant to happen on the document level. For example, documents carry meta information (docvars), while a smaller unit cannot have separate meta information. Different documents can, however, share the same meta information (for example the party code or the language). Depending on the research question, it might sometimes be more appropriate to treat whole manifestos as documents, and in other cases it might be better to treat individual quasi-sentences as documents.
Quanteda can directly import corpora from the manifestoR corpus format by applying the corpus() function to a normal ManifestoCorpus object as returned by the mp_corpus() function (which is a kind of tm corpus, see the First Steps with manifestoR tutorial). We use mp_availability() to check the availability of documents for the 2012 US elections. We save the object and use it as input for the mp_corpus() function. Alternatively, we could have also used the same expression for mp_corpus() that we used for mp_availability().
available_us2012 <- mp_availability(countryname == "United States" & date == 201211 & partyname %in% c("Democratic Party", "Republican Party"))
## Connecting to Manifesto Project DB API... corpus version: 2017-2
available_us2012
## Queried for Corpus Version Documents found
## 2 2017-2 2 (100%)
## Coded Documents found Originals found Languages
## 2 (100%) 2 (100%) 1 (english)
tm_corpus <- mp_corpus(available_us2012)
## Connecting to Manifesto Project DB API... corpus version: 2017-2
tm_corpus
## <<ManifestoCorpus>>
## Metadata: corpus specific: 0, document level (indexed): 0
## Content: documents: 2
We queried for two documents, both of which are "Coded documents", i.e. documents with annotations. When converting this to a quanteda corpus, however, the result contains 3,188 documents, as every quasi-sentence is considered an individual document. As you can see in the code below, we transform the ManifestoCorpus into a data.frame and call quanteda's corpus() function with the arguments docid_field = "manifesto_id", unique_docnames = FALSE to let it auto-generate document names based on the manifesto_id and a within-document running number. Alternatively, you can also generate the doc_id column manually by adding a mutate(doc_id = paste(manifesto_id, pos, sep = ".")) step before calling quanteda's corpus() function without any further arguments (see the sketch after the output below).
quanteda_corpus <- tm_corpus %>%
  as.data.frame(with.meta = TRUE) %>%
  corpus(docid_field = "manifesto_id", unique_docnames = FALSE) ## quanteda's corpus function
quanteda_corpus
## Corpus consisting of 3,188 documents and 18 docvars.
## 61320_201211.1 :
## "Moving America Forward 2012 Democratic National Platform"
##
## 61320_201211.2 :
## "Moving America Forward"
##
## 61320_201211.3 :
## "Four years ago, Democrats, independents, and many Republican..."
##
## 61320_201211.4 :
## "We were in the midst of the greatest economic crisis since t..."
##
## 61320_201211.5 :
## "the previous administration had put two wars on our nation’s..."
##
## 61320_201211.6 :
## "and the American Dream had slipped out of reach for too many..."
##
## [ reached max_ndoc ... 3,182 more documents ]
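The alternative mentioned above, generating the doc_id column manually, would look like the following sketch (quanteda_corpus_alt is just an illustrative name; quanteda's corpus() picks up a column named doc_id by default):
quanteda_corpus_alt <- tm_corpus %>%
  as.data.frame(with.meta = TRUE) %>%
  mutate(doc_id = paste(manifesto_id, pos, sep = ".")) %>% ## manual document names
  corpus() ## no further arguments needed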
The meta information from the Manifesto Corpus is stored in the docvars and is available for each quasi-sentence.
quanteda_corpus %>%
  docvars() %>%
  names()
## [1] "cmp_code" "eu_code"
## [3] "pos" "party"
## [5] "date" "language"
## [7] "source" "has_eu_code"
## [9] "is_primary_doc" "may_contradict_core_dataset"
## [11] "md5sum_text" "url_original"
## [13] "md5sum_original" "annotations"
## [15] "handbook" "is_copy_of"
## [17] "title" "id"
When using manifestos that were coded with different versions of the coding instructions (see the tutorial on subcategories), it might be a good idea to first recode version 5 codes to version 4 using manifestoR's recode_v5_to_v4() function before transforming the corpus into the quanteda corpus format.
corpus_df <- mp_corpus(countryname == "Germany" & date == 201709) %>%
  as.data.frame(with.meta = TRUE) %>%
  corpus(docid_field = "manifesto_id", unique_docnames = FALSE)

corpus_df %>%
  docvars(field = "cmp_code") %>%
  head(10)
## [1] "H" "0" "202.1" "201.1" "503" "201.1" "503" "201.1" "201.1"
## [10] "501"
mp_corpus(countryname == "Germany" & date == 201709) %>%
  recode_v5_to_v4() %>%
  as.data.frame(with.meta = TRUE) %>%
  corpus(docid_field = "manifesto_id", unique_docnames = FALSE) %>%
  docvars(field = "cmp_code") %>%
  head(10)
## [1] "H" "0" "202" "201" "503" "201" "503" "201" "201" "501"
By default, digitally annotated quasi-sentences are treated as separate documents by quanteda (one document equals one quasi-sentence), while documents without annotations are treated as single documents (one document equals one manifesto). The following snippet illustrates this difference. Instead of querying for the 2012 documents, which are annotated, we query the manifestos from the 2000 election, which are not annotated. The converted quanteda corpus then contains only two documents (each containing the whole text of one manifesto):
us_not_annotated <- mp_availability(countryname == "United States" & date %in% c(200011))
## Connecting to Manifesto Project DB API... corpus version: 2017-2
us_not_annotated
## Queried for Corpus Version Documents found
## 2 2017-2 2 (100%)
## Coded Documents found Originals found Languages
## 0 (0%) 2 (100%) 1 (english)
mp_corpus(as.data.frame(us_not_annotated)) %>%
  as.data.frame(with.meta = TRUE) %>%
  corpus(docid_field = "manifesto_id", unique_docnames = FALSE)
## Connecting to Manifesto Project DB API... corpus version: 2017-2
## Corpus consisting of 2 documents and 18 docvars.
## 61320_200011 :
## "The 2000 Democratic National Platform: Prosperity, Progress,..."
##
## 61620_200011 :
## "REPUBLICAN PLATFORM 2000 Renewing America's Purpose. Togethe..."
If you want to use a set of manifestos where one part of the set is available as annotated documents and the other as non-annotated documents, it might be reasonable to first bring them to the same aggregation level. One possibility would be to separately download the non-annotated manifestos, segment them into sentences (quanteda's corpus_reshape() can recast a corpus to the sentence level), and then combine them with a corpus that is already parsed into quasi-sentences. If the level of analysis is the manifesto level anyway, one can also later group the document-feature matrix based on the manifesto id using dfm_group(), as the following sketch shows.
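A minimal sketch of the grouping approach, assuming that the combination of party and date identifies a manifesto (the grouping vector is built explicitly from the docvars):
manifesto_dfm <- quanteda_corpus %>%
  tokens() %>%
  dfm()

## collapse the quasi-sentence rows into one row per manifesto
dfm_group(manifesto_dfm,
          groups = paste(docvars(manifesto_dfm, "party"),
                         docvars(manifesto_dfm, "date"),
                         sep = "_"))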
Subsetting the corpus
Quanteda can easily subset a corpus based on document-level variables. The following code snippets subset the corpus based on the party code and based on the cmp_code.
quanteda_corpus %>%
  corpus_subset(party == 61620) %>%
  as.character() %>%
  head(5)
## 61620_201211.1
## "We Believe in America"
## 61620_201211.2
## "This platform is dedicated with appreciation and reverence for:"
## 61620_201211.3
## "Preamble"
## 61620_201211.4
## "The 2012 Republican Platform is a statement of who we are and what we believe as a Party"
## 61620_201211.5
## "and our vision for a stronger and freer America."
quanteda_corpus %>%
  corpus_subset(cmp_code == 501) %>%
  as.character() %>%
  head(5)
## 61320_201211.1
## "and fuel-efficiency standards are doubling."
## 61320_201211.2
## "Historic investments in clean energy technologies have helped double the electricity we get from wind and solar."
## 61320_201211.3
## "New emissions and fuel efficiency standards for American cars are reducing our oil use,"
## 61320_201211.4
## "which is why the Obama administration has proposed a number of safeguards to protect against water contamination and air pollution."
## 61320_201211.5
## "We will continue to advocate for the use of this clean fossil fuel,"
Tokenization with tokens()
The tokens() function in quanteda tokenizes the documents. Tokens can be words (the default), characters, sentences, or ngrams. The function provides many arguments to facilitate cleaning and preprocessing.
quanteda_corpus %>%
  tokens() %>%
  head(2)
## Tokens consisting of 2 documents and 18 docvars.
## 61320_201211.1 :
## [1] "Moving" "America" "Forward" "2012" "Democratic"
## [6] "National" "Platform"
##
## 61320_201211.2 :
## [1] "Moving" "America" "Forward"
One could also tokenize the same documents into bi-grams using the tokens_ngrams() function. With n = 1:2, quanteda tokenizes into uni-grams and bi-grams.
quanteda_corpus %>%
  tokens() %>%
  tokens_ngrams(n = 1:2) %>%
  head(2)
## Tokens consisting of 2 documents and 18 docvars.
## 61320_201211.1 :
## [1] "Moving" "America" "Forward"
## [4] "2012" "Democratic" "National"
## [7] "Platform" "Moving_America" "America_Forward"
## [10] "Forward_2012" "2012_Democratic" "Democratic_National"
## [ ... and 1 more ]
##
## 61320_201211.2 :
## [1] "Moving" "America" "Forward" "Moving_America"
## [5] "America_Forward"
Tokenization is particularly important for pre-processing and cleaning the texts. One can easily remove numbers, punctuation, or stopwords. Moreover, it is simple to transform all text to lower case or to stem words.
quanteda_corpus %>%
  tokens(remove_punct = TRUE, remove_numbers = TRUE) %>%
  tokens_tolower() %>%
  tokens_remove(stopwords("english")) %>%
  tokens_wordstem() %>%
  head(4)
## Tokens consisting of 4 documents and 18 docvars.
## 61320_201211.1 :
## [1] "move" "america" "forward" "democrat" "nation" "platform"
##
## 61320_201211.2 :
## [1] "move" "america" "forward"
##
## 61320_201211.3 :
## [1] "four" "year" "ago" "democrat" "independ"
## [6] "mani" "republican" "came" "togeth" "american"
## [11] "move" "countri"
## [ ... and 1 more ]
##
## 61320_201211.4 :
## [1] "midst" "greatest" "econom" "crisi" "sinc" "great" "depress"
Constructing a document-feature matrix with dfm()
The construction of a document-feature matrix is at the core of most automated text analysis workflows. dfm() is quanteda's powerful command to construct a document-feature matrix. In older quanteda versions one could skip the tokenization step and call dfm() directly on a corpus, as dfm() passed most preprocessing arguments on to tokens(); since quanteda 3.0 the recommended way is to tokenize explicitly. To get a "standard" preprocessed document-feature matrix with lower casing, removed punctuation and numbers, removed stopwords, as well as stemmed words from a corpus, one would chain the following steps:
quanteda_corpus %>%
  tokens(remove_punct = TRUE, remove_numbers = TRUE) %>%
  tokens_tolower() %>%
  tokens_remove(stopwords("english")) %>%
  tokens_wordstem() %>%
  dfm()
## Document-feature matrix of: 3,188 documents, 3,941 features (99.75% sparse) and 18 docvars.
## features
## docs move america forward democrat nation platform four year ago
## 61320_201211.1 1 1 1 1 1 1 0 0 0
## 61320_201211.2 1 1 1 0 0 0 0 0 0
## 61320_201211.3 1 0 1 1 0 0 1 1 1
## 61320_201211.4 0 0 0 0 0 0 0 0 0
## 61320_201211.5 0 0 0 0 1 0 0 0 0
## 61320_201211.6 0 0 0 0 0 0 0 0 0
## features
## docs independ
## 61320_201211.1 0
## 61320_201211.2 0
## 61320_201211.3 1
## 61320_201211.4 0
## 61320_201211.5 0
## 61320_201211.6 0
## [ reached max_ndoc ... 3,182 more documents, reached max_nfeat ... 3,931 more features ]
You can modify a dfm using various functions such as dfm_trim, dfm_select, dfm_weight, dfm_keep, dfm_lookup, dfm_sample, and many more (a dfm_trim sketch follows the output below). In the following example, we download the Irish manifestos from the 2016 election, do some standard preprocessing, and drop all quasi-sentences with headline codes ("H"), uncoded quasi-sentences ("0", "000"), and quasi-sentences with missing codes (NA). We use dfm_group here to combine all quasi-sentences coded with the same code into one document. Standard cell entries in a dfm are counts of features per document. Term frequencies can be transformed using the dfm_weight function. Here, we use it to calculate the proportion of words per document (scheme = "prop"). We then subset the dfm to the documents of four specific codes.
quanteda_irish <- mp_corpus(countryname == "Ireland" & date == 201602) %>%
  recode_v5_to_v4() %>%
  as.data.frame(with.meta = TRUE) %>%
  corpus(docid_field = "manifesto_id", unique_docnames = FALSE) %>%
  tokens(remove_punct = TRUE) %>%
  tokens_tolower() %>%
  tokens_remove(stopwords("english")) %>%
  dfm() %>%
  dfm_subset(!(cmp_code %in% c("H", "", "0", "000", NA))) %>%
  dfm_group(cmp_code) %>%
  dfm_weight(scheme = "prop") %>%
  dfm_subset(cmp_code %in% c("501", "502", "301", "411"))
## Connecting to Manifesto Project DB API... corpus version: 2017-2
## Connecting to Manifesto Project DB API... corpus version: 2017-2
quanteda_irish
## Document-feature matrix of: 4 documents, 13,161 features (86.26% sparse) and 11 docvars.
## features
## docs think ahead act now general election
## 301 0 0 0.0009632055 0.0003852822 0 0.0003852822
## 411 0 0.0003094059 0.0004641089 0.0004641089 0 0
## 501 0 0 0.0011047980 0.0007891414 0 0.0001578283
## 502 0 0 0.0015507883 0.0002584647 0.0002584647 0
## features
## docs manifesto 2016 progressive practical
## 301 0.0001926411 0.0003852822 0 0
## 411 0 0.0012376238 0.0001547030 0.0003094059
## 501 0 0.0006313131 0.0001578283 0.0001578283
## 502 0 0.0015507883 0.0002584647 0
## [ reached max_nfeat ... 13,151 more features ]
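The other dfm_* verbs mentioned above work analogously. For instance, a sketch of dfm_trim(), applied to a count dfm (i.e. before any dfm_weight() step, as the thresholds refer to raw frequencies; the cutoff of 5 is an arbitrary illustrative choice):
quanteda_corpus %>%
  tokens(remove_punct = TRUE) %>%
  dfm() %>%
  dfm_trim(min_termfreq = 5) ## keep only features occurring at least 5 times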
To plot the most frequent terms, we use the textstat_frequency() function. It extracts the most frequent terms (here grouped by cmp_code) and returns these summary statistics as a data.frame. Such a data.frame serves as perfect input for ggplot.
feature_frequencies_categories <- quanteda_irish %>%
  textstat_frequency(n = 10, group = cmp_code)

feature_frequencies_categories %>%
  mutate(cmp_code = factor(group, labels = c("Decentralisation", "Technology & Infrastructure", "Environmental Protection", "Culture"))) %>%
  ggplot(aes(x = reorder(feature, frequency), y = frequency, fill = cmp_code)) +
  geom_col(show.legend = FALSE) +
  labs(x = NULL, y = "share of words per category") +
  facet_wrap(~cmp_code, ncol = 2, scales = "free") +
  coord_flip()
Similar to tidytext, quanteda also allows the calculation of term frequency-inverse document frequency (tf-idf) scores with dfm_tfidf().
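A sketch of such a tf-idf weighting step, applied to a count dfm:
quanteda_corpus %>%
  tokens(remove_punct = TRUE) %>%
  dfm() %>%
  dfm_tfidf()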
Keyword in context search
Quanteda also provides a convenient way to view text passages based on certain keywords. The kwic() function (for "keyword in context") allows you to search for a text string or pattern. The window argument indicates how many words around the keyword should be shown in the output. The following is a keyword search for the term "arms" in the US party platforms.
quanteda_corpus %>%
  tokens() %>%
  kwic(phrase("arms"), window = 10) %>%
  DT::datatable(caption = "Keywords in context", rownames = FALSE, options = list(scrollX = TRUE, pageLength = 5, lengthMenu = c(5, 10, 15, 20)))
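If you do not need the interactive table, the kwic() result can also be inspected directly, for example:
quanteda_corpus %>%
  tokens() %>%
  kwic(phrase("arms"), window = 10) %>%
  head(5)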
Multi-word expressions
Multi-word expressions can pose a problem for automatic text analysis. The expression "New York" stands for something different than the two separate words "new" and "York". Quanteda offers a simple way to identify such multi-word expressions based on collocations using the textstat_collocations() function. The following shows an association measure for word pairs. The list contains many expressions that may be better (or even should) be treated as one expression in automatic analyses, such as "United States" or "President Obama" (a sketch of joining them into single tokens follows the output).
quanteda_corpus %>%
  tokens() %>%
  tokens_remove(stopwords("english")) %>%
  textstat_collocations(method = "lambda", size = 2) %>%
  arrange(-lambda) %>%
  top_n(20)
## Selecting by z
## collocation count count_nested length lambda z
## 17 president obama 106 0 2 8.710226 15.82879
## 18 job creation 21 0 2 8.624032 15.40948
## 2 democratic party 51 0 2 7.881325 23.77678
## 1 united states 80 0 2 7.558881 24.56679
## 6 private sector 26 0 2 7.442870 19.31141
## 16 nuclear weapons 18 0 2 7.425171 16.07347
## 8 small businesses 20 0 2 7.313704 18.24845
## 19 current administration's 21 0 2 7.141155 15.40121
## 12 clean energy 22 0 2 7.093708 16.74232
## 11 around world 20 0 2 6.614726 16.97897
## 20 obama democrats 17 0 2 5.896752 15.37306
## 3 current administration 34 0 2 5.813144 22.34827
## 5 health care 30 0 2 5.620318 21.20490
## 14 economic growth 18 0 2 5.517368 16.23987
## 15 health insurance 16 0 2 5.478452 16.15043
## 4 national security 33 0 2 5.136260 21.30416
## 9 obama democratic 22 0 2 5.047492 17.81549
## 13 republican party 18 0 2 4.882577 16.53921
## 7 federal government 31 0 2 4.272125 19.00898
## 10 american people 29 0 2 3.831366 17.08902
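Once identified, such collocations can be joined into single tokens with tokens_compound(). The following sketch compounds the top 20 collocations (an arbitrary cutoff); note that the collocations were estimated on stopword-removed tokens, so pairs separated by a stopword in the full text will not be matched:
collocations_df <- quanteda_corpus %>%
  tokens() %>%
  tokens_remove(stopwords("english")) %>%
  textstat_collocations(size = 2) %>%
  head(20)

## e.g. "United" "States" becomes the single token "United_States"
quanteda_corpus %>%
  tokens() %>%
  tokens_compound(pattern = phrase(collocations_df$collocation))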
Targeted sentiment analysis
Quanteda facilitates dictionary-based searches. The following example illustrates how to conduct a targeted sentiment analysis. We use the corpus created above based on the US party platforms of 2012 and tokenize it into words. We then keep only the token "President" itself as well as the ten words before and after every occurrence of "President".
pres_tokens <- tokens(quanteda_corpus) %>%
  tokens_select("President", selection = "keep", window = 10, padding = FALSE, verbose = TRUE)
## kept 2 features
Quanteda integrates a sentiment dictionary constructed by Young & Soroka (2012), stored in data_dictionary_LSD2015. The dictionary contains thousands of positive and negative words and word stems.
data_dictionary_LSD2015[[1]] %>% head(10)
## [1] "a lie" "abandon*" "abas*" "abattoir*" "abdicat*" "aberra*"
## [7] "abhor*" "abject*" "abnormal*" "abolish*"
We then use the sentiment dictionary to count positive and negative words among the words surrounding "President" to analyze which party speaks more positively or negatively about the president. We group by party to get frequencies of positive and negative words aggregated to the party level. The ratio of positive to negative words is much higher for the Democratic Party (61320) than for the Republican Party (61620) when speaking about the "President". This is hardly surprising, as the incumbent president in 2012 was a Democrat.
pres_dfm <- dfm(pres_tokens) %>%
  dfm_lookup(data_dictionary_LSD2015[1:2]) %>%
  dfm_group(party)

pres_dfm
## Document-feature matrix of: 2 documents, 2 features (0.00% sparse) and 16 docvars.
## features
## docs negative positive
## 61320 64 181
## 61620 30 45
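The ratio of positive to negative words per party can be computed directly from this grouped dfm, e.g. by converting it to a data.frame (a small sketch):
pres_dfm %>%
  convert(to = "data.frame") %>%
  mutate(ratio = positive / negative)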
Quanteda (and its "sister" packages quanteda.textstats, quanteda.textplots, and quanteda.textmodels) has many more functions. In particular, the textstat_* functions of quanteda.textstats are powerful and can be fruitfully applied to manifestos.
References
Benoit, K., K. Watanabe, H. Wang, P. Nulty, A. Obeng, S. Müller, and A. Matsuo (2018). "quanteda: An R Package for the Quantitative Analysis of Textual Data." Journal of Open Source Software 3(30), 774. doi: 10.21105/joss.00774, https://quanteda.io.
Young, L. and S. Soroka (2012). "Affective News: The Automated Coding of Sentiment in Political Texts." Political Communication 29(2): 205-231.
Session Info
Tested with:
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.3 (2020-10-10)
## date 2021-06-15
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date lib source
## assertthat 0.2.0 2017-04-11 [NA] CRAN (R 4.0.3)
## base64enc 0.1-3 2015-07-28 [NA] CRAN (R 4.0.2)
## bookdown 0.22 2021-04-22 [NA] CRAN (R 4.0.2)
## cli 1.1.0 2019-03-19 [NA] CRAN (R 4.0.3)
## colorspace 1.3-2 2016-12-14 [NA] CRAN (R 4.0.3)
## crayon 1.3.4 2017-09-16 [NA] CRAN (R 4.0.2)
## crosstalk 1.0.0 2016-12-21 [NA] CRAN (R 4.0.3)
## curl 3.2 2018-03-28 [NA] CRAN (R 4.0.3)
## digest 0.6.21 2019-09-20 [NA] CRAN (R 4.0.3)
## dplyr * 1.0.6 2021-05-05 [NA] CRAN (R 4.0.2)
## DT 0.7 2019-06-11 [NA] CRAN (R 4.0.3)
## ellipsis 0.3.2 2021-04-29 [NA] CRAN (R 4.0.3)
## evaluate 0.14 2019-05-28 [NA] CRAN (R 4.0.1)
## fansi 0.4.0 2018-10-05 [NA] CRAN (R 4.0.3)
## farver 2.0.1 2019-11-13 [NA] CRAN (R 4.0.3)
## fastmap 1.0.0 2019-07-28 [NA] CRAN (R 4.0.3)
## fastmatch 1.1-0 2017-01-28 [NA] CRAN (R 4.0.2)
## foreign 0.8-70 2018-04-23 [NA] CRAN (R 4.0.3)
## functional 0.6 2014-07-16 [NA] CRAN (R 4.0.2)
## generics 0.0.2 2018-11-29 [NA] CRAN (R 4.0.2)
## ggplot2 * 3.3.3 2020-12-30 [NA] CRAN (R 4.0.2)
## glue 1.4.2 2020-08-27 [NA] CRAN (R 4.0.2)
## gtable 0.2.0 2016-02-26 [NA] CRAN (R 4.0.3)
## highr 0.6 2016-05-09 [NA] CRAN (R 4.0.3)
## hms 0.4.2 2018-03-10 [NA] CRAN (R 4.0.3)
## htmltools 0.4.0 2019-10-04 [NA] CRAN (R 4.0.3)
## htmlwidgets 1.5.3 2020-12-10 [NA] CRAN (R 4.0.2)
## httpuv 1.5.2 2019-09-11 [NA] CRAN (R 4.0.3)
## httr 1.3.1 2017-08-20 [NA] CRAN (R 4.0.3)
## ISOcodes 2018.06.29 2018-06-30 [NA] CRAN (R 4.0.3)
## jsonlite 1.6 2018-12-07 [NA] CRAN (R 4.0.3)
## knitr 1.33 2021-04-24 [NA] CRAN (R 4.0.2)
## labeling 0.3 2014-08-23 [NA] CRAN (R 4.0.3)
## later 1.0.0 2019-10-04 [NA] CRAN (R 4.0.3)
## lattice 0.20-35 2017-03-25 [NA] CRAN (R 4.0.3)
## lifecycle 1.0.0 2021-02-15 [NA] CRAN (R 4.0.2)
## magrittr 2.0.1 2020-11-17 [NA] CRAN (R 4.0.2)
## manifestoR * 1.5.0 2020-11-29 [NA] CRAN (R 4.0.2)
## Matrix 1.2-14 2018-04-09 [NA] CRAN (R 4.0.3)
## mime 0.5 2016-07-07 [NA] CRAN (R 4.0.3)
## mnormt 1.5-5 2016-10-15 [NA] CRAN (R 4.0.3)
## munsell 0.5.0 2018-06-12 [NA] CRAN (R 4.0.2)
## nlme 3.1-131 2017-02-06 [NA] CRAN (R 4.0.3)
## NLP * 0.1-9 2016-02-18 [NA] CRAN (R 4.0.3)
## nsyllable 1.0 2020-11-30 [NA] CRAN (R 4.0.2)
## pillar 1.6.1 2021-05-16 [NA] CRAN (R 4.0.2)
## pkgconfig 2.0.2 2018-08-16 [NA] CRAN (R 4.0.3)
## promises 1.1.0 2019-10-04 [NA] CRAN (R 4.0.3)
## proxyC 0.2.0 2021-05-11 [NA] CRAN (R 4.0.2)
## psych 1.8.3.3 2018-03-30 [NA] CRAN (R 4.0.3)
## purrr 0.3.2 2019-03-15 [NA] CRAN (R 4.0.3)
## quanteda * 3.0.0 2021-04-06 [NA] CRAN (R 4.0.2)
## quanteda.textstats * 0.94.1 2021-05-11 [NA] CRAN (R 4.0.2)
## R6 2.2.2 2017-06-17 [NA] CRAN (R 4.0.3)
## Rcpp 1.0.0 2018-11-07 [NA] CRAN (R 4.0.3)
## RcppParallel 5.1.4 2021-05-04 [NA] CRAN (R 4.0.2)
## readr 1.3.1 2018-12-21 [NA] CRAN (R 4.0.3)
## rlang 0.4.10 2020-12-30 [NA] CRAN (R 4.0.2)
## rmarkdown 2.8 2021-05-07 [NA] CRAN (R 4.0.2)
## rmdformats 1.0.2 2021-04-19 [NA] CRAN (R 4.0.2)
## scales 1.1.0 2019-11-18 [NA] CRAN (R 4.0.3)
## sessioninfo 1.1.1 2018-11-05 [NA] CRAN (R 4.0.2)
## shiny 1.4.0 2019-10-10 [NA] CRAN (R 4.0.3)
## slam 0.1-40 2016-12-01 [NA] CRAN (R 4.0.3)
## SnowballC 0.5.1 2014-08-09 [NA] CRAN (R 4.0.3)
## stopwords 0.9.0 2017-12-14 [NA] CRAN (R 4.0.3)
## stringi 1.1.7 2018-03-12 [NA] CRAN (R 4.0.3)
## stringr * 1.3.0 2018-02-19 [NA] CRAN (R 4.0.3)
## tibble 3.1.2 2021-05-16 [NA] CRAN (R 4.0.2)
## tidyr * 0.8.0 2018-01-29 [NA] CRAN (R 4.0.3)
## tidyselect 1.1.1 2021-04-30 [NA] CRAN (R 4.0.3)
## tm * 0.7-5 2018-07-29 [NA] CRAN (R 4.0.3)
## utf8 1.1.3 2018-01-03 [NA] CRAN (R 4.0.3)
## vctrs 0.3.8 2021-04-29 [NA] CRAN (R 4.0.3)
## withr 2.1.2 2018-03-15 [NA] CRAN (R 4.0.3)
## xfun 0.23 2021-05-15 [NA] CRAN (R 4.0.2)
## xml2 1.2.0 2018-01-24 [NA] CRAN (R 4.0.3)
## xtable 1.8-2 2016-02-05 [NA] CRAN (R 4.0.3)
## yaml 2.2.0 2018-07-25 [NA] CRAN (R 4.0.3)
## zoo 1.7-13 2016-05-03 [NA] CRAN (R 4.0.3)