The collected works of the famous Hungarian poet Sándor Petőfi. The dataset contains all of his works as raw text, without any cleaning or preprocessing.

data_petofi

Format

A data.frame with 1 observation of 2 variables:

doc_id

Document identifier, included so that the data can be passed to quanteda more easily.

text

The unprocessed, combined works of Sándor Petőfi. All of the documents are stored together in a single observation (row).
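
Examples

The sketch below illustrates the documented structure and one possible way to hand it to quanteda. Because the package that ships data_petofi is not named here, the code builds a hypothetical stand-in called petofi_like with the same two columns; the doc_id value and the text in it are placeholders, not the real data. With the real dataset, data_petofi would take the place of petofi_like.

```r
# Hypothetical stand-in with the documented shape: 1 row, columns doc_id and text.
# The doc_id value and the text are placeholders, not the real data.
petofi_like <- data.frame(
  doc_id = "petofi",
  text = "placeholder text standing in for the combined works",
  stringsAsFactors = FALSE
)

# Confirm the documented structure: one observation, two variables.
stopifnot(
  nrow(petofi_like) == 1,
  identical(names(petofi_like), c("doc_id", "text"))
)

# quanteda::corpus() reads the doc_id and text columns of a data.frame
# by default, so no extra arguments are needed. Skipped if quanteda
# is not installed.
if (requireNamespace("quanteda", quietly = TRUE)) {
  petofi_corpus <- quanteda::corpus(petofi_like)
  print(quanteda::ndoc(petofi_corpus))  # number of documents in the corpus
}
```

Since everything sits in one row, the resulting corpus holds a single document. Splitting it into individual poems would need a separate step that this dataset does not provide.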