In “Unraveling reported dreams with text analytics,” Hendrickx et al. use text analysis to look for linguistic differentiators between accounts of dreams and accounts of events that did not occur in dreams. The question is: what linguistic features are specific to dream reports? This research question is possible in the first place because of what is known in psychology as the continuity hypothesis, the idea that most dreams contain the same elements as everyday life, which makes dream reports similar to non-dream accounts. The authors place dream narratives and non-dream narratives on the same plane by conceiving of both as narratives. The definition of narrative they adopt holds that narratives are tellable: they include events outside the subject’s control, which makes them reportable, and they have a temporal juncture and a complicating action, in which at least two events take place in a certain temporal order. Given these similarities, the experiment was meant to discover the important differences between dreams and non-dreams at the level of language.
The researchers drew on two kinds of data: accounts of dream narratives, and narratives that were not accounts of dreams. The dream accounts came from a database called DreamBank, which contains over 22,000 dream reports from various scientific studies. The dreams were written down either by the dreamers themselves or by the researcher questioning them after they awoke. The researchers narrowed the data to English-language collections and removed all duplicates. The reports were then tokenized, yielding a sample of 21,598 dream descriptions and 4.3 million word tokens. Since some individuals contributed more dreams than others, each individual in the dataset was limited to a maximum of 100 dreams. This produced a sample of 6,998 dream descriptions with 1.3 million tokens.
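The per-dreamer cap described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `(dreamer_id, text)` pair format, the choice to keep each dreamer's first 100 reports in their original order, and the whitespace tokenization are all assumptions made for illustration, since the summary above does not say how the 100 reports were selected or how tokens were counted.

```python
from collections import defaultdict


def cap_reports_per_dreamer(reports, max_per_dreamer=100):
    """Keep at most `max_per_dreamer` reports for each dreamer.

    `reports` is a list of (dreamer_id, text) pairs (a hypothetical format).
    Assumption: the first reports encountered for each dreamer are the ones kept.
    """
    kept = []
    counts = defaultdict(int)
    for dreamer_id, text in reports:
        if counts[dreamer_id] < max_per_dreamer:
            kept.append((dreamer_id, text))
            counts[dreamer_id] += 1
    return kept


def count_tokens(reports):
    """Count word tokens with naive whitespace splitting (an assumption)."""
    return sum(len(text.split()) for _, text in reports)
```

Capping in this way keeps a few prolific dreamers from dominating the vocabulary of the dream sample, which matters when the goal is to find features of dream reports in general and not of particular individuals.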
The non-dream narratives were taken from two sources: Prosebox and Reddit. Prosebox is a website where users can post journal entries. The reports taken from Reddit were found on the following subreddits: offmychest, diaries, relationships, shortscarystories, lifeinapost, anxiety, and self.
The three methods of text analysis used were automatic text classification, LDA topic modelling, and text coherence analysis. In the text classification, “A machine learning classifier is fed with labeled documents from which it learns to model the characteristics of the given labels” (26). Three categories of words characterized the dream reports: words conveying uncertainty, words conveying memory and recollection, and words describing physical space. In contrast, the non-dream reports were dominated by words referring to points in time, such as years, days, and times. This supervised machine learning was followed by unsupervised LDA topic modelling. Only nouns, verbs, and adjectives were used, and the number of topics was set to fifty. The topics mirrored the results of the supervised classification: many of them contained words that describe physical surroundings. The last method was a text coherence analysis. The concept at play here was discourse marker frequency, with discourse markers like oh, well, now, then, you know, and I mean serving as indicators of bizarreness. I think this may be the most problematic conceptualization in the project, given that it is used to justify the claim that dreams are not characterized by bizarreness, as most people think. Discourse marker frequency was higher in the non-dream reports, which the authors take as re-affirming the continuity hypothesis. I think there are three problems with this.
- Discourse markers were only modestly more frequent in the non-dream reports, at a ratio of 5 to 4
- I question whether discourse markers are a good measure of “bizarreness”
- I wonder whether the potential for bizarreness, a certain sense of being haunted by it, might be more important than its actual presence in most of the dreams (especially since we do not know which dreams took place during REM sleep and which did not)
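To make concrete what a discourse marker frequency measure involves, here is a minimal sketch of a rate per 1,000 tokens. It is an assumption-laden illustration, not the authors' implementation: the marker list contains only the six examples named above, and the function name, the regular-expression tokenization, and the per-1,000 normalization are hypothetical choices.

```python
import re

# Only the six markers named in the discussion above; the study's full list is not reproduced here.
DISCOURSE_MARKERS = ["oh", "well", "now", "then", "you know", "I mean"]


def marker_rate(text, markers=DISCOURSE_MARKERS):
    """Return discourse markers per 1,000 word tokens (hypothetical normalization)."""
    lowered = text.lower()
    tokens = re.findall(r"[a-z']+", lowered)
    if not tokens:
        return 0.0
    hits = 0
    for marker in markers:
        # Word boundaries stop "now" from matching inside "know".
        pattern = r"\b" + re.escape(marker.lower()) + r"\b"
        hits += len(re.findall(pattern, lowered))
    # Caveat: this counts every occurrence, so a temporal "then" or an
    # adverbial "well" is counted exactly like a true discourse marker.
    return 1000 * hits / len(tokens)
```

A count of this kind cannot tell "and then I woke up" apart from a hesitant "well, then...", so whatever the ratio between the two corpora turns out to be, it measures surface word frequency and needs interpretation before it can stand in for bizarreness.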
The linguistic differentiation between dream and non-dream narratives is interesting, and the characteristics that mark this difference give us much to think about regarding the nature of dreams and of waking lived experience. However, what I find even more exciting is the therapeutic potential this may reveal, if we take psychoanalysis and dream analysis seriously. The only theoretical tool used here was the continuity hypothesis, drawn from psychology. All this hypothesis tells us, however, is that dreams are made up of the same elements as waking life. It does not tell us much about dreams themselves, or why we dream, or what our dreams mean. The authors cite a prior text analysis by Bulkeley, where he “offers evidence that based on an individual dream collection, it is possible to make accurate estimations about a person’s life, his concerns, activities, and interests, thereby confirming the continuity hypothesis” (7). However, I doubt whether these estimations would do much for anybody beyond arousing mild interest. I wonder whether something like topic modelling could aid a psychoanalyst in making connections that give patients insight into their problems and help to overcome them. Following Freud, it would be interesting to have patients associate more freely when talking about their dreams, offering their own connections between the dreams and other lived experiences. Running these associations alongside the accounts of the dreams themselves, as well as transcriptions of other sessions or parts of sessions, could provide some perspective on what these dreams mean for the patients and what they reveal about their structures of thought. It is important to note that, just like topic modelling in the literary field, these models would not give us anything inherently true about the patient, but they could be a departure point for interpretation that would not be possible without computation.
Hendrickx, I., Onrust, L., Kunneman, F., Hürriyetoğlu, A., Stoop, W., & Van den Bosch, A. (2017). Unraveling reported dreams with text analytics. Digital Humanities Quarterly, 11(4).