Fractality and Variability in Canonical and Non-Canonical English Fiction and in Non-Fictional Texts

Frontiers in Psychology 12 (2021)

Abstract

This study investigates global properties of three categories of English text: canonical fiction, non-canonical fiction, and non-fictional texts. The central hypothesis of the study is that there are systematic differences in structural design features between canonical and non-canonical fiction, and between fictional and non-fictional texts. To investigate these differences, we compiled a corpus containing texts of the three categories of interest, the Jena Corpus of Expository and Fictional Prose. Two aspects of global structure are investigated: variability and self-similar patterns, which reflect long-range correlations along texts. We use four types of basic observations: the frequency of POS tags per sentence, sentence length, lexical diversity, and the distribution of topic probabilities in segments of texts. These basic observations are grouped into two more general categories: the lower-level properties, POS-tag frequency and sentence length, which are observed at the level of the sentence, and the higher-level properties, lexical diversity and topic probabilities, which are observed at the textual level. The observations for each property are transformed into series, which are analyzed in terms of variance and subjected to Multi-Fractal Detrended Fluctuation Analysis, giving rise to three statistics: the degree of fractality, the degree of multifractality (the width of the fractal spectrum), and the degree of asymmetry of the fractal spectrum. The statistics thus obtained are compared individually across text categories and jointly fed into a classification model. Our results show that there are in fact differences between the three text categories of interest. In general, lower-level text properties are better discriminators than higher-level text properties. Canonical fictional texts differ from non-canonical ones primarily in terms of variability in lower-level text properties. Fractality seems to be a universal feature of text, slightly more pronounced in non-fictional than in fictional texts.
Based on these corpus results, we point out some avenues for future research toward a more comprehensive analysis of textual aesthetics, e.g., using experimental methodologies.
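
As a rough illustration of the series analysis mentioned in the abstract, the sketch below estimates a scaling exponent with plain (monofractal) Detrended Fluctuation Analysis using linear detrending. This is a simplification, not the authors' implementation: Multi-Fractal DFA additionally evaluates the fluctuation function over a range of moment orders to obtain the fractal spectrum. The function name `dfa_exponent`, the window sizes, and the simulated sentence-length series are all assumptions made for the example.

```python
import math
import random


def dfa_exponent(series, scales=(8, 16, 32, 64)):
    """Estimate the DFA scaling exponent of a numeric series (linear detrending)."""
    n = len(series)
    mean = sum(series) / n

    # Profile: cumulative sum of deviations from the series mean.
    profile, total = [], 0.0
    for x in series:
        total += x - mean
        profile.append(total)

    log_s, log_f = [], []
    for s in scales:
        n_windows = n // s
        if n_windows < 2:
            continue  # too few windows at this scale
        t_mean = (s - 1) / 2
        t_var = sum((t - t_mean) ** 2 for t in range(s))
        sq_sum = 0.0
        for w in range(n_windows):
            seg = profile[w * s:(w + 1) * s]
            # Closed-form least-squares line through the window.
            y_mean = sum(seg) / s
            cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
            slope = cov / t_var
            intercept = y_mean - slope * t_mean
            sq_sum += sum((y - (intercept + slope * t)) ** 2
                          for t, y in enumerate(seg))
        fluctuation = math.sqrt(sq_sum / (n_windows * s))
        if fluctuation > 0:
            log_s.append(math.log(s))
            log_f.append(math.log(fluctuation))

    if len(log_s) < 2:
        raise ValueError("series too short for the requested scales")

    # Slope of log F(s) against log s is the scaling exponent.
    ms = sum(log_s) / len(log_s)
    mf = sum(log_f) / len(log_f)
    num = sum((a - ms) * (b - mf) for a, b in zip(log_s, log_f))
    den = sum((a - ms) ** 2 for a in log_s)
    return num / den


# Hypothetical data: simulated sentence lengths with no long-range correlation.
random.seed(0)
sentence_lengths = [random.gauss(20, 5) for _ in range(2048)]
print(round(dfa_exponent(sentence_lengths), 2))
```

An exponent near 0.5 indicates an uncorrelated series, while values between 0.5 and 1 indicate persistent long-range correlations of the kind the abstract associates with self-similar structure.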
