Lisa Marie Rhody, George Mason University
When Brian Croxall asked each of our roundtable participants to write a blog post in advance of MLA describing our methodologies, he mentioned that he’d prefer a minimum of 500 words. Our maximum limit, however, he described as “whatever it takes.” I doubt he would have predicted a post of more than 113 pages with four extended appendices, but that is, as best I can determine, what it would take to explain how I got to the point at which I felt prepared to begin making statements about ekphrasis that extend from topic modeling and social network analysis.
The following post, therefore, serves as an introduction to the open publication of two chapters of my dissertation that present the methodologies that undergird the research I will be presenting at MLA this year. Accounting for about 33% of my written dissertation, “Ekphrastic Revisions: Verbal-Visual Networks in 20th Century Poetry by Women,” chapters four and five establish the methods, values, and decisions that contributed to the collection, preparation, description, and testing of approximately 4,500 poems that make up the Revising Ekphrasis project.
Chapter four, “Collecting Ekphrasis: Building a Digital Collection of Modern Verse to Study Ekphrasis” (PDF), introduces the means by which the poems in the corpus were collected, prepared, and described. The chapter presents methods for collecting 4,771 poems from five sources, including the Academy of American Poets and Poets.org, through a combination of automated capture and human input. Additionally, I expose the values and priorities that guided decisions about which metadata to capture, how to record it, and how to keep track of changes to the data and metadata as the project iterates. Working through four sample topic model experiments, I demonstrate the effects of stopword lists on topic models of poetic corpora and the ways in which “close” reading may also limit our assessment of poetry by placing too much emphasis on economy of language as a significant feature of verse.
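To make the stopword question concrete, here is a minimal Python sketch of how two stoplists can change which words rise to the top of a small set of texts. It is an illustration only, not the procedure used in chapter four: the sample lines are invented, the two stoplists are hypothetical stand-ins rather than the lists in the appendices, and simple word counts stand in for a full topic model.

```python
import re
from collections import Counter


def top_words(poems, stopwords, n=5):
    """Return the n most frequent tokens across poems after dropping stopwords."""
    counts = Counter()
    for text in poems:
        # Lowercase and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return [word for word, _ in counts.most_common(n)]


# Invented sample lines (assumption): not drawn from the project corpus.
poems = [
    "the painting of the sea and the light on the water",
    "she looked at the painting and she saw her mother",
    "her hands in the light, her face in the frame",
]

# Hypothetical stoplists (assumption): the second also removes pronouns,
# as a more aggressive general-purpose list might.
light_stoplist = {"the", "and", "of", "on", "at", "in"}
heavy_stoplist = light_stoplist | {"she", "her"}

print("lighter stoplist:", top_words(poems, light_stoplist))
print("heavier stoplist:", top_words(poems, heavy_stoplist))
```

With the lighter list, the pronoun “her” is the most frequent remaining word; with the heavier list it disappears entirely. Whether such words matter to a reading of the poems is exactly the kind of decision a stoplist makes in advance.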
In the fifth chapter, “Review, Revise, Requery: Reading Ekphrasis as/in a Dynamic Social Network” (PDF), I introduce the challenges of topic modeling with figurative language, offer a possible approach to interpreting topic models of figurative language, and present social network graphs that prompt the questions and suggestions that become the focal point of my MLA talk. This chapter considers the composition of Latent Dirichlet Allocation (LDA) topics when the corpora used to create the models depend heavily on ambiguity and figurative language. I consider the usefulness of existing evaluative methods for testing topic models and later demonstrate how social network graphs prompt new pairings, contextualizations, and readings of ekphrastic poetry against the grain of existing critical assumptions about the genre’s tradition.
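One common way to move from a topic model to a network graph can be sketched in a few lines of Python. This is an assumption for illustration, not a description of how the graphs in chapter five were built: the poem names, the topic proportions, and the 0.25 threshold are all hypothetical. Each poem is linked to every topic that accounts for at least the threshold share of its words, and two poems are paired when they share a topic.

```python
def poem_topic_edges(doc_topics, threshold=0.25):
    """Link each poem to the topics whose proportion meets the threshold."""
    edges = []
    for poem, proportions in doc_topics.items():
        for topic, weight in enumerate(proportions):
            if weight >= threshold:
                edges.append((poem, f"topic_{topic}", weight))
    return edges


def shared_topic_pairs(edges):
    """Return sorted pairs of poems that are linked to at least one common topic."""
    by_topic = {}
    for poem, topic, _ in edges:
        by_topic.setdefault(topic, []).append(poem)
    pairs = set()
    for linked in by_topic.values():
        for i, first in enumerate(linked):
            for second in linked[i + 1:]:
                pairs.add(tuple(sorted((first, second))))
    return sorted(pairs)


# Hypothetical document-topic proportions (assumption): one row per poem,
# one value per topic, each row summing to 1.
doc_topics = {
    "poem_a": [0.6, 0.3, 0.1],
    "poem_b": [0.1, 0.5, 0.4],
    "poem_c": [0.7, 0.1, 0.2],
}

edges = poem_topic_edges(doc_topics)
print(edges)
print(shared_topic_pairs(edges))
```

The pairs that come out of a graph like this are prompts for reading rather than findings in themselves: they suggest which poems might be worth placing side by side.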
The chapters found here have origins in blog posts from Spring 2012, such as “Small Projects & Limited Datasets,” “Why Use Visualizations to Study Poetry?” and “Some Assembly Required: Understanding Topics in LDA Topic Models of Poetry.” I’m deeply grateful to those who took the time to comment on those early blog posts, as they contributed to the refinement and revisions that took place between those early drafts and the later dissertation. In January 2013, I was asked to revise my earlier blog posts on topic modeling and poetry, which had been selected as Editors’ Choice features for Digital Humanities Now, for possible publication in the Journal of Digital Humanities issue on topic modeling edited by Scott Weingart and Elijah Meeks. Therefore, the first half of chapter five was heavily revised for my essay, “Topic Modeling and Figurative Language,” which offers a briefer introduction to the methods deployed on the way to exploring 276 ekphrastic poems.
Finally, I have also posted copies of my dissertation’s appendices (all PDFs). These include a complete list of 4,771 poems in the full dataset, two primary metadata fields (sex and genre), and the breakdown of text sources (Appendix A); stopword lists used in chapter four (Appendix B: the MALLET default stoplist, Appendix C: a heavily edited stoplist based on Appendix B, and Appendix D: a lightly edited stoplist based on Appendix B); and the MALLET commands used in the chapter four experiments (Appendix E). Should anyone wish to repeat my experiments, the required poems to harvest, the methods for preparing the data, and the MALLET commands required to perform the experiment are all included in these chapters.
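For readers who have not used MALLET from the command line, the general shape of an import-and-train run can be sketched as follows. This is not a copy of the commands in Appendix E: the directory and file names, the path to the MALLET launcher, and the number of topics are placeholders assumed for illustration. The Python below only assembles and prints the command lines; it does not run MALLET.

```python
import shlex


def import_command(corpus_dir, output_file, stoplist_file=None, mallet="bin/mallet"):
    """Build a MALLET import-dir command, optionally with a custom stoplist file."""
    cmd = [mallet, "import-dir",
           "--input", corpus_dir,
           "--output", output_file,
           "--keep-sequence"]
    if stoplist_file is None:
        # Use MALLET's default English stoplist.
        cmd.append("--remove-stopwords")
    else:
        # Read stopwords from a file (one per line) instead of the default list.
        cmd += ["--stoplist-file", stoplist_file]
    return cmd


def train_command(input_file, num_topics, keys_file, doc_topics_file, mallet="bin/mallet"):
    """Build a MALLET train-topics command that writes topic keys and doc-topic proportions."""
    return [mallet, "train-topics",
            "--input", input_file,
            "--num-topics", str(num_topics),
            "--output-topic-keys", keys_file,
            "--output-doc-topics", doc_topics_file]


# Placeholder paths and topic count (assumptions), chosen only for the example.
steps = [
    import_command("poems/", "poems.mallet", stoplist_file="stoplist_light.txt"),
    train_command("poems.mallet", 20, "topic_keys.txt", "doc_topics.txt"),
]
for step in steps:
    print(shlex.join(step))
```

Swapping the stoplist file and rerunning both steps is the basic pattern for comparing how different stopword lists shape the resulting topics.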
As I will explain during my brief talk at MLA, these methods have helped me to reconsider critical assumptions about ekphrasis as a network of ongoing, socially situated discourses. Casting a wider net by including a broader spectrum of poems by a more diverse demographic of poets, my MLA presentation will demonstrate that latent patterns in language detected by topic models and organized by social network graphs offer opportunities for exploring subtle and nuanced relationships between the “sister arts,” as well as opportunities to recuperate ekphrastic poetry by women heretofore excluded from critical histories of the genre.