Author Archives: Brian Millen

On Computing Dreams

In “Unraveling reported dreams with text analytics,” Hendrickx et al. use text analysis to look for linguistic differences between accounts of dreams and accounts of events that did not occur in dreams. Their question: what linguistic features are specific to dream reports? This research question is possible in the first place because of what is known in psychology as the continuity hypothesis, which holds that most dreams contain the same elements as everyday life, making dream reports similar to other, non-dream accounts. The authors place dream narratives and non-dream narratives on the same plane by conceiving of both as narratives. The definition of narrative they adopt says that narratives are tellable: they include events outside the subject’s control, which makes them reportable. They also have a temporal juncture and a complicating action, in which at least two events take place in a certain temporal order. Given these similarities, the experiment was meant to discover the important differences between dreams and non-dreams at the level of language.

The researchers drew on two kinds of data: accounts of dream narratives and narratives that were not accounts of dreams. The dream accounts were drawn from a database called DreamBank, which contains over 22,000 dream reports from various scientific studies. The dreams were written down either by the dreamers themselves or by a researcher questioning them after they awoke. The researchers narrowed the data down by limiting it to English-language collections and removing all duplicates. The reports were then tokenized, yielding a sample of 21,598 dream descriptions and 4.3 million word tokens. Since some individuals contributed more dreams than others, each individual in the dataset was limited to a maximum of 100 dreams. This produced a sample of 6,998 dream descriptions with 1.3 million tokens.
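To make the per-dreamer cap concrete, here is a minimal Python sketch of that kind of filtering. It is not the authors' code: the function name, the (dreamer, text) pair format, and the choice to keep each dreamer's first 100 reports in input order are all assumptions, since the post does not say how the 100 were selected.

```python
from collections import defaultdict


def cap_reports_per_dreamer(reports, max_per_dreamer=100):
    """Keep at most `max_per_dreamer` reports per dreamer, in input order.

    `reports` is an iterable of (dreamer_id, text) pairs. Both the pair
    format and the "keep the first N" rule are illustrative assumptions.
    """
    kept = []
    counts = defaultdict(int)
    for dreamer_id, text in reports:
        if counts[dreamer_id] < max_per_dreamer:
            kept.append((dreamer_id, text))
            counts[dreamer_id] += 1
    return kept


# Toy data: dreamer "a" has 150 reports, dreamer "b" has only 3.
toy = [("a", f"dream {i}") for i in range(150)]
toy += [("b", f"dream {i}") for i in range(3)]

capped = cap_reports_per_dreamer(toy)
print(len(capped))  # 100 kept for "a" plus 3 for "b"
```

A cap like this keeps a handful of prolific dreamers from dominating whatever vocabulary a model later learns, at the cost of discarding most of the collection.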

The non-dream narratives were taken from two sources: Prosebox and Reddit. Prosebox is a website where users can post journal entries. The reports taken from Reddit were found on the following subreddits: offmychest, diaries, relationships, shortscarystories, lifeinapost, anxiety, and self.

The three methods of text analysis used were automatic text classification, LDA topic modelling, and text coherence analysis. In text classification, “A machine learning classifier is fed with labeled documents from which it learns to model the characteristics of the given labels” (26). Three categories of words characterized the dream reports: words conveying uncertainty, words conveying memory and recollection, and words describing physical space. In contrast, the non-dream reports were dominated by words that referred to points in time, such as years, days, and times. This supervised machine learning was followed by unsupervised LDA topic modelling. Only nouns, verbs, and adjectives were used, and the number of topics was set to fifty. The topics mirrored the results of the supervised classification: many of them contained words that describe physical surroundings. The last method was a text coherence analysis. The concept at play here was discourse marker frequency, using discourse markers like oh, well, now, then, you know, and I mean as indicators of bizarreness. I think this may be the most problematic conceptualization in the project, given that it is used to justify the claim that dreams are not characterized by bizarreness, as most people think they are. Discourse marker frequency was higher in non-dream reports, thus reaffirming the continuity hypothesis. I think there are three problems with this.

  1. Discourse markers were more frequent in the non-dream reports only by a ratio of 5 to 4.
  2. I question whether discourse markers are a good measure of “bizarreness.”
  3. I wonder whether the potential for bizarreness, a certain sense of being haunted by it, might be more important than the actual presence of bizarreness in most of the dreams (especially since we do not know which dreams took place during REM sleep and which did not).
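To show what a discourse marker frequency amounts to in practice, here is a rough Python sketch. It is an illustration under stated assumptions, not the paper's procedure: the marker list contains only the six examples named above, the tokenizer is a simple regular expression, and the per-1,000-tokens normalization is my own choice.

```python
import re

# Only the markers named in this post; the paper's list may differ (assumption).
DISCOURSE_MARKERS = ["oh", "well", "now", "then", "you know", "i mean"]


def marker_rate(text, markers=DISCOURSE_MARKERS):
    """Return discourse markers per 1,000 word tokens in `text`."""
    lowered = text.lower()
    # Crude tokenizer: runs of ASCII letters and apostrophes (assumption).
    tokens = re.findall(r"[a-z']+", lowered)
    if not tokens:
        return 0.0
    hits = 0
    for marker in markers:
        # Word boundaries keep "now" from matching inside "know".
        pattern = r"\b" + re.escape(marker) + r"\b"
        hits += len(re.findall(pattern, lowered))
    return 1000.0 * hits / len(tokens)


dream = "I was in a house and then I walked into a large room"
diary = "Well, you know, I mean it was fine then"

print(marker_rate(dream) < marker_rate(diary))
```

Even this toy version shows the weakness behind my second objection: a surface count cannot tell "then" as a temporal adverb from "then" as a discourse marker, or "well" as a hedge from "well" as an adverb, so whatever the rate measures, it is some distance from "bizarreness."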

The linguistic differentiation between dreams and non-dream narratives is interesting, and the characteristics which demarcate this difference give us much to think about regarding the nature of dreams and waking lived experience. However, what I think is even more exciting is the therapeutic potential this may reveal, if we take psychoanalysis and dream analysis seriously. The only theoretical tool utilized here was the continuity hypothesis, derived from psychology. All this hypothesis tells us, however, is that dreams are made up of the same elements as waking life. It does not tell us much about dreams themselves, or why we dream, or what our dreams mean. The authors cite a prior text analysis by Bulkeley, where he “offers evidence that based on an individual dream collection, it is possible to make accurate estimations about a person’s life, his concerns, activities, and interests, thereby confirming the continuity hypothesis” (7). However, I doubt whether these estimations would do much for anybody other than arouse mild interest. I wonder whether something like topic modelling could aid a psychoanalyst in making connections that give patients insight into their problems and help them overcome those problems. Following Freud, it would be interesting to have patients free-associate when talking about their dreams, offering their own connections between the dreams and other lived experiences. Running these associations alongside the accounts of the dreams themselves, as well as transcriptions of other sessions or parts of sessions, could provide some perspective on what these dreams mean for the patients and what they could reveal about their structures of thought. It is important to note that, just like topic modelling in the literary field, these models would not give us anything inherently true about the patient, but could be a departure point for interpretation that would not be possible without computation.

Works Cited

Hendrickx, I., Onrust, L., Kunneman, F., Hürriyetoğlu, A., Stoop, W., & Van den Bosch, A. (2017). Unraveling reported dreams with text analytics. Digital Humanities Quarterly, 11(4).

Roundtable Abstract

Feminist text analysis is analogous to, and a part of, good text analysis. Just as “all models are wrong” in computational text analysis in general, we need to acknowledge that a feminist text analysis in particular will never be fully completed. In both cases, it is a matter of engaging in a process that operationalizes texts, critiques whatever shortcomings there might be, and adjusts accordingly. This requires a feminist ethos that is persistent in making a text analysis project a space that is open to the visibility of the female experience, and this has to happen at every step in the research process. The data collected needs to account for the most diverse spectrum of experiences possible, and the data ought not be approached in a traditionally masculine way, by means of mastery and absolute truth. The concepts to be measured and quantified must be informed by disciplines concerned with social justice, not relying on the data to “speak for itself,” but employing critical theories that seek to upset stereotypical paradigms and use data to say something new. A feminist text analysis, like any good text analysis, needs to know the limits of the tools and methods being used, and what they can and cannot tell us about the world. And finally, a constant dedication to criticism and revision must seek to question and expand the limitations of the research. This will not result in an absolute feminist text analysis to end all feminist text analyses; it is, rather, a constant engagement in the process.