The dataset contains 2834 front-page articles from the Hungarian print daily Magyar Nemzet. It is a sample of the data_magyar_nemzet_large dataset and is used in Chapter 6 of the textbook (https://tankonyv.poltextlab.com/sentiment.html).

data_magyar_nemzet_small

Format

A data.frame with 2834 observations and 3 variables:

doc_id

A unique document ID (here, the row number)

text

The unprocessed article texts

doc_date

Date of the article
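
The three-variable layout described above can be illustrated with a small stand-in. This is only a sketch: the rows in `toy` are invented placeholders, not articles from the dataset, and the column types (integer `doc_id`, character `text`, `Date` `doc_date`) are assumptions, since the documentation does not state them. With the package loaded, `data_magyar_nemzet_small` would take the place of `toy`.

```r
# Hypothetical stand-in with the documented column names.
# The real object, data_magyar_nemzet_small, has 2834 rows;
# the rows and column types below are assumptions for illustration.
toy <- data.frame(
  doc_id = 1:3,
  text = c(
    "First placeholder article.",
    "Second placeholder article.",
    "Third placeholder article."
  ),
  doc_date = as.Date(c("2001-01-02", "2001-01-03", "2001-02-10")),
  stringsAsFactors = FALSE
)

# Inspect the structure: three variables, one row per article
str(toy)

# doc_id is documented as unique, so there should be no duplicates
stopifnot(anyDuplicated(toy$doc_id) == 0)

# Count articles per calendar month using the date column
month_key <- format(toy$doc_date, "%Y-%m")
articles_per_month <- table(month_key)
print(articles_per_month)
```

If `doc_date` turns out to be stored as character rather than `Date` in the real data, it would need converting with `as.Date()` before the monthly count.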

Source

https://cap.tk.hu/en/dataoverview

References

Sebők, Miklós, and Zoltán Kacsuk (2021). The Multiclass Classification of Newspaper Articles with Machine Learning: The Hybrid Binary Snowball Approach. Political Analysis, 29(2): 236-249.