On Computing Dreams

In “Unraveling reported dreams with text analytics,” Hendrickx et al. use text analysis to look for linguistic differences between accounts of dreams and accounts of events that did not occur in dreams. Their question is: what linguistic features are specific to dream reports? This research question is possible in the first place because of what is known in psychology as the continuity hypothesis, the idea that most dreams contain the same elements as everyday life, which makes dream reports similar to other accounts. The authors place dream narratives and non-dream narratives on the same plane by conceiving of both as narratives. The definition of narrative they adopt holds that narratives are tellable: they include events outside the subject’s control, which makes them reportable, and they have a temporal juncture and a complicating action, in which at least two events take place in a certain temporal order. Given these similarities, the experiment was meant to discover the important differences between dreams and non-dreams at the level of language.

The researchers drew upon two kinds of data: accounts of dream narratives, and narratives that were not accounts of dreams. The dream accounts were drawn from a database called DreamBank, which contains over 22,000 dream reports from various scientific studies. The dreams were written down either by the dreamers themselves or by a researcher questioning them after they awoke. The researchers narrowed the data down by limiting it to English-language collections and removing all duplicates. The reports were tokenized, creating a sample of 21,598 dream descriptions and 4.3 million word tokens. Since some individuals contributed more dreams than others, each individual in the dataset was limited to a maximum of 100 dreams. This produced a sample of 6,998 dream descriptions with 1.3 million tokens.
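The filtering steps described above (removing duplicates, then capping each dreamer's contribution) can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `cap_reports_per_dreamer`, the `(dreamer_id, text)` input shape, and the choice to keep the first 100 reports in input order are all assumptions, since the article summary here does not say how the 100 dreams per person were selected.

```python
from collections import defaultdict


def cap_reports_per_dreamer(reports, max_per_dreamer=100):
    """Drop duplicate texts, then keep at most `max_per_dreamer` reports per dreamer.

    `reports` is an iterable of (dreamer_id, text) pairs (a hypothetical
    input shape). Keeping the first reports in input order is an
    assumption; a random sample per dreamer would be equally plausible.
    """
    seen_texts = set()
    kept_counts = defaultdict(int)
    kept = []
    for dreamer_id, text in reports:
        # Treat texts that differ only in case or whitespace as duplicates.
        normalized = " ".join(text.split()).lower()
        if normalized in seen_texts:
            continue
        seen_texts.add(normalized)
        # Skip further reports once this dreamer has reached the cap.
        if kept_counts[dreamer_id] >= max_per_dreamer:
            continue
        kept_counts[dreamer_id] += 1
        kept.append((dreamer_id, text))
    return kept


sample = [
    ("a", "I was flying"),
    ("a", "I was  flying"),  # duplicate up to whitespace
    ("a", "I was in a house"),
    ("a", "I was on a train"),
    ("b", "I lost my keys"),
]
capped = cap_reports_per_dreamer(sample, max_per_dreamer=2)
```

With a cap of 2, `capped` keeps two reports from dreamer "a" and one from dreamer "b".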

The non-dream narratives were taken from two sources: Prosebox and Reddit. Prosebox is a website where users can post journal entries. The reports taken from Reddit were found on the following subreddits: offmychest, diaries, relationships, shortscarystories, lifeinapost, anxiety, and self.

The three methods of text analysis used were automatic text classification, LDA topic modelling, and text coherence analysis. In the text classification, “A machine learning classifier is fed with labeled documents from which it learns to model the characteristics of the given labels” (26). The three categories of words that characterized the dream reports were words conveying uncertainty, words conveying memory and recollection, and words describing physical space. In contrast, the non-dream reports were dominated by words that referred to points in time, such as years, days, and times. The supervised classification was followed by unsupervised LDA topic modelling. Only nouns, verbs, and adjectives were used, and the number of topics was set to fifty. These topics mirrored the results of the supervised classification: many of them contained words that describe physical surroundings. The last method was a text coherence analysis. The concept at play here was discourse marker frequency, using discourse markers like oh, well, now, then, you know, and I mean as indicators of bizarreness. I think this may be the most problematic conceptualization in the project, given that it is used to justify the claim that dreams are not characterized by bizarreness, as most people think they are. Discourse marker frequency was higher in non-dream reports, thus re-affirming the continuity hypothesis. I think there are three problems with this.

  1. Discourse markers were more frequent in the non-dream reports only by a ratio of 5 to 4
  2. I question whether discourse markers are a good measurement of “bizarreness”
  3. I wonder whether the potential for bizarreness, a certain being haunted by it, might be more important than the actual presence of bizarreness in most of the dreams (especially since we do not know which dreams took place during REM sleep and which did not)
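
To make the measure under discussion concrete, here is a minimal sketch of how a discourse marker frequency could be computed. It is not the authors' implementation: the marker list contains only the six examples named above, and the tokenizer, the function names, and the normalization per 1,000 tokens are assumptions made for illustration.

```python
import re

# Only the six markers named in the text; the study's full list is not given here.
DISCOURSE_MARKERS = ("oh", "well", "now", "then", "you know", "i mean")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z'’]+", text.lower())


def count_markers(tokens, markers=DISCOURSE_MARKERS):
    """Count occurrences of single-word and multi-word markers in a token list."""
    hits = 0
    for marker in markers:
        parts = marker.split()
        n = len(parts)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == parts:
                hits += 1
    return hits


def discourse_marker_rate(text, markers=DISCOURSE_MARKERS, per=1000):
    """Return marker occurrences per `per` tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return per * count_markers(tokens, markers) / len(tokens)


example = "Oh well, I mean, then we left. You know how it is now."
rate = discourse_marker_rate(example)
```

Note that matching on surface forms counts every “then” and “now,” including plainly temporal uses, so a count like this is only as good as the marker list and says nothing by itself about bizarreness.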

The linguistic differentiation between dreams and non-dream narratives is interesting, and the characteristics that mark this difference give us much to think about regarding the nature of dreams and waking lived experience. However, what I think is even more exciting is the therapeutic potential this may reveal, if we take psychoanalysis and dream analysis seriously. The only theoretical tool utilized here was the continuity hypothesis, derived from psychology. All this hypothesis tells us, however, is that dreams are made up of the same elements as waking life. It does not tell us much about dreams themselves, or why we dream, or what our dreams mean. The authors cite a prior text analysis by Bulkeley, where he “offers evidence that based on an individual dream collection, it is possible to make accurate estimations about a person’s life, his concerns, activities, and interests, thereby confirming the continuity hypothesis” (7). However, I doubt whether these estimations would do much for anybody other than arouse mild interest. I wonder whether something like topic modelling could aid a psychoanalyst in making connections that give patients insight into their problems and help them overcome them. Following Freud, it would be interesting to have patients associate more freely when talking about their dreams, giving their own connections between the dreams and other lived experiences. Running these associations through a model along with the accounts of the dreams themselves, as well as transcriptions of other sessions or parts of sessions, could provide some perspective on what these dreams mean for the patients and what they could reveal about their structures of thought. It’s important to note that, just like topic modelling in the literary field, these models would not give us anything inherently true about the patient, but could be a departure point for interpretation that would not be possible without computation.

Works Cited

Hendrickx, I., Onrust, L., Kunneman, F., Hürriyetoğlu, A., Stoop, W., & Van den Bosch, A. (2017). Unraveling reported dreams with text analytics. Digital Humanities Quarterly, 11(4).


There can be such a thing as feminist text analysis if certain psychological and social actions are at play, that is, if the methodology is focused on feminist ways and identity. When researchers perform textual analysis, they make educated guesses based on the given text(s). They interpret the texts in order to understand the ways and thoughts of the group written about in the text. Often, these interpretations derive from a male perspective and give expression to implicit bias and misogyny. However, if the given text considers lives and conditions from a feminist perspective, it will allow itself to be analyzed as a feminist text.

Roundtable abstract

A feminist text analysis is a topic that is being discussed and studied in the field of digital humanities. By taking into account the key components of text analysis–such as research questions, data, conceptualization, operationalization as well as text and analysis–I will attempt to suggest that a feminist text analysis can exist. We probably still need more time to offer a (proper and all-encompassing) definition of a feminist text analysis but in order to come up with one, or, more importantly, with a set of definitions, we need to have these conversations about a feminist text analysis.

Abstract for Roundtable

In an essay titled “Why Study Humanities? What I Tell Engineering Freshmen”, Horgan (2013) makes the case that the humanities are essential to seemingly unrelated positive science fields, arguing that “we need the humanities to foster a healthy anti-dogmatism”: the humanities can bring subversiveness, skepticism, and critical thinking to fields dominated by assurances of certainty, facts, and truth. He states that because these latter fields are hugely intertwined with and impact our society, the humanities and social sciences are needed wherever human life is concerned.

In a similar vein, I would like to propose that a feminist analysis of text is not just possible but necessary, and critical to all fields that use text analysis as a method. While the social sciences and humanities strive to address the complexity of issues related to gender, racism, sexism, colonialism, and corporate interests in text analysis and in data analysis in general, even a quick look at studies in other fields, such as engineering and information science, shows that they still lag behind on these critical issues, and not without consequence. I would therefore like to look at a sample of studies in different disciplines and identify the differences in their approaches to formulating research questions, data, conceptualization, operationalization, and analysis. I will then highlight, based on the literature, what the humanities and social sciences can offer to various disciplines, and provide examples of application (code).


Horgan, J. (2013, June 20). Why Study Humanities? What I Tell Engineering Freshmen. Scientific American. https://blogs.scientificamerican.com/cross-check/why-study-humanities-what-i-tell-engineering-freshmen/

Roundtable Abstract

Feminist text analysis is analogous to and a part of good text analysis. Just as “all models are wrong” with computational text analysis in general, we need to acknowledge that a feminist text analysis in particular will never be fully completed. In both cases, it is a matter of engaging in a process that operationalizes texts, critiques whatever shortcomings there might be, and adjusts accordingly. This requires a feminist ethos that is persistent in making a text analysis project a space that is open to the visibility of the female experience. This has to happen at every step in the research process. This means that the data collected needs to account for the most diverse spectrum of experiences possible and that the data ought not be approached in a traditionally masculine way by means of mastery and absolute truth. The concepts to be measured and quantified must be informed by disciplines concerned with social justice, not relying on the data to “speak for itself”, but employing critical theories that seek to upset stereotypical paradigms to use data to say something new. A feminist text analysis, like any good text analysis, needs to know the limits of the tools and methods being used, and what they can and cannot tell us about the world. And finally, a constant dedication to criticism and revision must always seek to question and expand the limitations of the research. This will not result in an absolute feminist text analysis to end all feminist text analysis, but is always a constant engagement in the process.

Roundtable Abstract: For a Decolonial Approach to Text Analysis

Vallerie Matos

Yes, there can be a feminist text analysis. There is no denying the enormous contributions feminist discourse has made inside and outside of the academy. It has shifted our world in the most necessary ways. But as Sara Ahmed offers, feminism can be “a fantasy of inclusion which often conceals its own exclusions”. It can reinforce the gender binary and neglect the identities of many others. So if “feminism is driven by an imperative for change”, why stop here? Is our generic feminism enough? In this paper, I confront the erasures of non-binary folks and Black trans women inherent in feminist approaches to technology. I will instead explore and argue for decolonial approaches to text analysis and other digital technologies alike. I will frame this argument with the assistance of Sara Ahmed’s Thinking through Feminism and “What Gets Counted Counts,” by Catherine D’Ignazio and Lauren F. Klein. I will then employ “Against Cleaning,” by Katie Rawson and Trevor Muñoz, to exemplify where we can decolonize particular methodologies, and use notions such as scalability and non-scalability to offer inclusivity. A decolonial approach to technology seems to be the only home available to host and honor all intersectional and marginalized identities. It has the potential to disrupt the ways in which many discourses intentionally or unintentionally deny them.

Roundtable abstract: Feminist Text Analysis in Praxis.

The issue we have been addressing all semester is “Can there be such a thing as feminist text analysis?” Our readings give us the theoretical basis for this discussion and the notebooks give us practical skills in text analysis, so we understand how difficult it is in fact to practice what we learn and talk about. The challenge is drawing a closer tie between the theory as we understand it and the reality of applying it. Looking at the steps as laid out by Nguyen et al. (Research questions, Data, Conceptualization, Operationalization, and Analysis), I propose that we look carefully at the iterative and narrowing process that occurs when we start to operationalize our research.

The ready-made tools in the NLTK don’t allow us to work easily with the multi-variant, non-binary nature of intersectional data, so we make decisions to narrow our process and thus the research question.

I argue that these tools are not magic; they are built by men and can be disassembled and reassembled by feminists. Broussard warns us against technochauvinism and urges us not to resign ourselves to the current trajectory of available analysis tools. I plan to look at some specific tools and functions in use and to make recommendations about how they can be improved to be more transparent.

As the newest crop of Data Visualization and Digital Humanities scholars, it is up to us to create new metrics, new tools for measuring, new ways to visualize results. 


Nguyen, Dong, Maria Liakata, Simon DeDeo, Jacob Eisenstein, David Mimno, Rebekah Tromble, and Jane Winters. “How we do things with words: Analyzing text as social and cultural data” arXiv:1907.01468v1 [cs.CL] 2 Jul 2019. 

Broussard, Meredith. Artificial Unintelligence: How Computers Misunderstand the World. The MIT Press, 2018.

Abstract for Roundtable Discussion

Technochauvinism, or the idea that technology is always the superior means of attaining an end, is a flawed ideology that has a disturbing amount of overlap with traditional male chauvinism. A common opinion among male chauvinists is that women are inferior to men due to some sort of emotional fragility that prevents them from being as logical as men. With technochauvinism, it is thought that the computer should reign supreme due to its ability to reduce any issue to supposedly objective, unbiased numbers and mathematics. Technochauvinism, by ignoring or otherwise cutting out human components of analysis and problem solving, can’t help but ignore or cut out concepts such as culture, race, and gender.

Digital technology is created by human beings: human beings with biases and emotions. A computer error is largely the direct result of human error. This paper aims to not only show how male chauvinism can dangerously factor into technochauvinism, but also show that technochauvinism “on its own” impedes feminism in ways not unlike traditional male chauvinism. Ultimately, I wish to present an argument that in order for feminist digital text analysis to be performed, it must be approached in a manner that avoids technochauvinist bias or in a manner where one is aware of technochauvinist bias: to allow a feminist analysis to be affected by technochauvinist bias is undesirable in the same manner as allowing a feminist analysis to be affected by male chauvinist bias.

Abstract for Roundtable Discussion

As technology has continually advanced throughout the years, digital humanities tools, such as literary and text analysis, have likewise modernized through the development of various machine learning methods. While tools have evolved significantly in this field, it is necessary to confront the ways in which many of these digital tools stem from and maintain colonialism. Many have taken note of this issue and work collectively towards decolonizing the humanities: an ongoing initiative that strives to create new tools that center minoritized voices and experiences while simultaneously countering traditional colonialist technologies that promote a humanities dominated by whiteness and androcentrism.

A method within the digital humanities that is exemplary of this kind of work is feminist text analysis: this paper not only insists upon the existence of feminist text analysis but also explores the crucial role that it plays in challenging androcentric narratives and hierarchies of knowledge that arise from legacies of colonialism. By analyzing Sara Mills’s “Post-Feminist Text Analysis,” in which the English linguist implements a feminist text analysis that considers how the overt sexism of the past has mutated into a more indirect, inconspicuous sexism shrouded by a false veneer of gender inclusivity, this paper showcases the capabilities of feminist text analysis.

I argue that based on this example along with myriad others ranging from analysis of book reviews in The New York Times to analysis of dialogue in Disney films, there is such a thing as a feminist text analysis and that it plays an important role in decolonizing the digital humanities.  


Mills, Sara. “Post-Feminist Text Analysis.” Language and Literature, vol. 7, no. 3, Aug. 1998, pp. 235–252

Abstract Draft

“Discussing The Biases of Race and Gender in the Machine-Model Design of Smart Virtual Assistants (SVAs).”

By Asma A. Neblett

This paper briefly explores how the vernacular poetics associated with race and gender have been perpetuated in the machine-model design of Smart Virtual Assistants (SVAs) since they were introduced in the early 2010s. SVAs are generally described as feminine or gendered as female, but what else is implied about the social profile of major SVAs, such as Apple’s Siri and Amazon’s Alexa, that also connotes race and determines user satisfaction? I argue that the choices made in machine-model designs for SVAs such as Siri and Alexa mirror the vernacular biases associated with race and gender[1], which implicitly shape user experience. This paper draws on a Black Feminist analysis, informed by feminist linguistics, to briefly discuss the text analysis of machine models in SVAs, such as Automated Speech Recognition (ASR)[2], that speak to the intersection of race and gender in SVAs, and how that intersection may influence user experience.

[1] Henderson, Mae. Speaking in Tongues and Dancing Diaspora: Black Women Writing and Performing. Oxford: Oxford Univ. Press, 2014. Print.

[2] Koenecke, Allison, et al. “Racial disparities in Automated Speech Recognition.” Proceedings of the National Academy of Sciences Apr 2020, 117 (14) 7684-7689; DOI: 10.1073/pnas.1915768117


Habler, Florian, Schwind, Valentin, and Henze, Niels. 2019. “Effects of Smart Virtual Assistants’ Gender and Language.” In Proceedings of Mensch und Computer 2019 (MuC’19). Association for Computing Machinery, New York, NY, USA, 469–473. DOI: https://doi.org/10.1145/3340764.3344441