20,000 footnotes in 1000 novels

This is a project to discover and analyze an estimated 20,000 footnotes extracted from 6,000 novels using visual language processing technology. A collaboration with Andrew Piper (McGill) and the .txt Lab at McGill and Ehsan Arabnejad of the Synchromedia Lab at Montréal’s ETS, this paper is also being researched and written in collaboration with past END student researchers Sierra Eckert (Columbia) and Nick Allred (Rutgers). It draws on a taxonomy of eighteenth-century novels’ footnotes to advance a simple but consequential argument: despite the claims of almost all scholarship on the subject, footnotes in novels were common and referential rather than exceptional and self-referential, meta-fictional, or “postmodern” avant la lettre. Our taxonomy of these referential footnotes and our evidence of how common footnotes were will make a significant contribution to current discussions in novel studies about the referential powers of early novels. We are discovering the footnotes by using the footnote data in the END metadata set to construct a training set that will be used to locate footnotes on the pages of all of the known fiction in the Eighteenth Century Collections Online (ECCO) digital library.

Novel Geologies and Novel Geographies, 1770-1830

Talissa Ford (Temple University) in collaboration with the END team

This study draws on a combination of END data and computational models of eighteenth-century and early nineteenth-century novels to ask a series of questions about the cartographic imagination of Romantic geography and geology in fiction between 1770 and 1830. In her in-progress book Noah’s Raven: The Literature of Extinction, 1796-1854, Talissa Ford compares the well-known movement across space implied by the many topographical maps produced during the “cartographic revolution” (to use Paul Laxton’s phrase) with the lesser-known geological maps of the earth’s strata created by William Smith, which imagine movement across time. These geological maps, she argues, offer a powerful alternative to the national and colonial geographies that dominated nineteenth-century cartography. In this project, Ford extends her book’s argument into the different territory of the novel, collaborating with the END team to search out patterns and changes in how novels of the late eighteenth and early nineteenth centuries connect geography with temporality. The project also asks how the representation of space, time, and geography in the novel changes across genres, over time, and according to a range of characteristics of the novel (paratextual features, publication location, author gender).


In the eighteenth century, prefaces functioned both as guides for readers and as spaces for early theorizations of the novel. Hypothesizing that novels with extensive prefaces may therefore be significantly less (or more?) self-conscious or self-referential than novels without, this research would create models of texts with prefaces and compare them to those without prefaces, using a number of different measures for identifying common and “most distinctive” words in individual works. Examining other forms of metadata within sets of novels with prefaces would yield complementary insights. Are prefaces more common in novels written by women or by men? Do novels with prefaces tend to have footnotes as well? What kinds of titles or keywords in titles do novels with prefaces have? Are novels with prefaces reviewed more or less often? At stake is the question of the early novel’s self-theorizing: does the preface constitute a privileged site for theorizing operations, and if so, does its presence correlate in significant and interesting ways with those novels’ narratives and styles? Future directions for this study involve isolating and modeling the text of prefaces from a corpus of novels to ask questions about the language they use.
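One of the simplest measures of “most distinctive” words can be sketched as follows. This is only an illustration under stated assumptions, not the project’s actual method: it assumes each novel has already been tokenized into a list of lowercase words, and it ranks words by a smoothed log ratio of their relative frequencies in the two groups. The function name distinctive_words, the smoothing value alpha, and the toy texts are all hypothetical.

```python
import math
from collections import Counter

def distinctive_words(with_preface, without_preface, top_n=5, alpha=0.5):
    """Rank words by how much more frequent they are in the first group.

    Each argument is a list of tokenized texts (lists of words).
    alpha is an assumed additive smoothing constant, so that words
    absent from one group do not produce a division by zero.
    """
    a = Counter(word for text in with_preface for word in text)
    b = Counter(word for text in without_preface for word in text)
    vocab = set(a) | set(b)
    total_a = sum(a.values()) + alpha * len(vocab)
    total_b = sum(b.values()) + alpha * len(vocab)
    scores = {}
    for word in vocab:
        rel_a = (a[word] + alpha) / total_a  # smoothed relative frequency, group A
        rel_b = (b[word] + alpha) / total_b  # smoothed relative frequency, group B
        scores[word] = math.log(rel_a / rel_b)
    # Highest scores are most distinctive of the novels with prefaces.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]

# Toy, invented token lists standing in for full novels.
prefaced = [["reader", "author", "novel"], ["reader", "preface", "novel"]]
unprefaced = [["love", "novel", "letter"], ["love", "castle", "novel"]]
for word, score in distinctive_words(prefaced, unprefaced, top_n=3):
    print(f"{word}\t{score:.3f}")
```

A word that is equally frequent in both groups scores zero, so sorting in the opposite direction would surface the words most distinctive of novels without prefaces. Other measures would likely rank words differently, which is one reason to compare several.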

Collections and corpora: the shape of the canon of eighteenth-century novels

This project uses topographical text modeling work to discover the shape of different nineteenth- and twentieth-century collections of novels by comparing models of those corpora with a more collection-agnostic corpus of eighteenth-century novels. Creating multiple subsets of eighteenth-century novels drawn from twentieth-century collections and critical works, from Penn’s Singer-Mendenhall Collection to novels mentioned in Ian Watt’s The Rise of the Novel, we will analyze the distance between, on the one hand, the END metadata and various full-text models of each collection (drawn from ECCO) and, on the other, the full metadata set and corpus. We know that the various collections and canons, like all collections and canons, have their predilections and biases, open and covert: Singer and Mendenhall’s collecting focused on epistolary fiction, for example, and Ian Watt and other mid-twentieth-century critics left most women novelists out of their critical canons. But what more nuanced patterns might we uncover if we look more closely at the shape of such retrospectively created canons?
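As a rough illustration of what a “distance” between a collection and the full corpus could mean, the sketch below compares their word-frequency distributions using Jensen-Shannon divergence. This is a deliberately simple stand-in chosen for illustration, not the modeling approach the project describes. It assumes both inputs are non-empty lists of tokenized texts, and the names word_distribution, jensen_shannon, and collection_distance are hypothetical.

```python
import math
from collections import Counter

def word_distribution(texts, vocab):
    """Relative frequency of each vocabulary word across a list of tokenized texts."""
    counts = Counter(word for text in texts for word in text)
    total = sum(counts[word] for word in vocab)
    return [counts[word] / total for word in vocab]

def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions of equal length."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def collection_distance(collection, corpus):
    """Divergence between a collection and a corpus: 0 if identical, 1 if no shared words."""
    vocab = sorted({word for text in collection + corpus for word in text})
    return jensen_shannon(word_distribution(collection, vocab),
                          word_distribution(corpus, vocab))

# Toy, invented token lists standing in for an epistolary-heavy collection
# and a broader corpus.
collection = [["letter", "dear", "letter"], ["dear", "friend", "letter"]]
corpus = [["letter", "castle", "voyage"], ["friend", "history", "castle"]]
print(round(collection_distance(collection, corpus), 3))
```

Because the measure is symmetric and bounded between 0 and 1, scores for different collections can be compared directly. It only captures differences in vocabulary, though, so it would complement rather than replace comparisons based on metadata.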