DH Forum 2011 Abstracts
Representing Knowledge in the Humanities
Fall Digital Humanities Forum
24 September 2011
Keynote presentation by C. M. Sperberg-McQueen
Recordings for the 2011 Forum are available on YouTube and searchable with the IDRH Finding Aid.
Call for Papers
Scholars utilize computationally assisted methods to view, analyze, classify, and comment on sources of knowledge, and to illustrate the dynamics between these sources and their commentaries, both current and prior. Knowledge representation – the theory and methodology of modeling knowledge using computer technology – is becoming a key dimension of Digital Humanities (DH).
Many disciplines are adapting long-established conventions from the print realm for representing knowledge in digital contexts, or they are developing new ones altogether; these involve visual and textual epistemological models, information design, bibliographic tools, and visual representations. For example, there are established and emerging conventions for the description and display of textual evidence. When only part of a musical, visual, or written text is preserved, conventions exist to supply missing evidence and express levels of (un)certainty, and there are emerging tools and methods to enable and describe the citation of intellectual contributions to electronic texts by authors, annotators, translators, and analyzers. In general, humanists are increasingly evaluating and making use of DH methodologies and projects, as well as evaluating the impact of technology on research in the humanities.
The 24 September Knowledge Representation conference is preceded by a BootCamp (a hands-on digital tools workshop) on 22 September and a THATCamp (a digital humanities unconference) on 23 September, all at the University of Kansas. The deadline for BootCamp and THATCamp registration is 22 July 2011. Please see the THATCamp Kansas website for more information. (Participants are welcome to attend both the Representing Knowledge conference and THATCamp Kansas, but should register for each separately.)
Representing Knowledge in the Digital Humanities is a one-day conference, allowing KU and non-KU faculty and graduate students to explore the theory and practice of knowledge representation, broadly conceived, and to showcase their digital humanities projects and methodologies. Whether you are new to digital humanities or an old hand, we welcome your participation. We welcome proposals for papers, demonstrations, or posters on topics such as (but not limited to):
- Knowledge representation in virtual worlds
- Data modeling and visualization tools
- Social media, crowdsourcing, & collaboration in the humanities
- Network visualizations
- Models of digital history
- Annotation of text, images, or data
- Scholarly integrity and the Internet
- Digital curation (amateur and professional)
- Rhetoric of aesthetics in visualizations
Presentations may be one of two types: (1) a 20-minute paper or demonstration; (2) a poster. For all presentations, a 500-word abstract (with ranked preference of presentation type) is required. Interested participants should submit their abstract in .pdf or .txt format by 31 May 2011 using the submission form at our EasyAbs site. Participants will be notified of acceptance by 30 June.
Conference registration: Registration to attend this conference will open during June. There will be no registration fee, but space will be limited.
For additional questions and information please visit the IDRH website or contact us at firstname.lastname@example.org.
Speaker: Michael Sperberg-McQueen, Black Mesa Technologies
The Hermeneutics of Data Representation
When we consult a file on disk, or receive a data stream on a network port, we see a sequence of bits. What does it mean? And can we tell the difference between a meaningful sequence of bits and garbage? Any work involving the machine-readable representation of knowledge must consider both how to validate the representation mechanically (to detect and possibly recover from transmission or storage errors) and how to verify the information semantically and reason about it systematically. The talk will survey some possible approaches to each of these problems and point to current technologies that seem promising in addressing them. At another level, however, data representation has another kind of meaning. Like any cultural artifact, a data representation tells a story about the culture that made it. What do our choices of data representation say about our culture? And what does XML have to do with Kant's definition of enlightenment?
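The two levels the abstract distinguishes can be seen in a toy example (illustrative only, not drawn from the talk): a checksum mechanically validates that the bits survived transmission, while deciding whether the decoded value is plausible is a separate, semantic question.

```python
import xml.etree.ElementTree as ET
import zlib

# Mechanical validation in miniature: a checksum tells us the bits
# arrived intact, but says nothing about what they mean.
payload = b'<msg><temp unit="C">21</temp></msg>'
checksum = zlib.crc32(payload)

received = payload                          # pretend this came off the wire
bits_ok = zlib.crc32(received) == checksum  # transmission/storage check

# Semantic verification is a separate question: is the value plausible
# for, say, a room-temperature sensor? The checksum cannot say.
value = int(ET.fromstring(received).find("temp").text)
sense_ok = -40 <= value <= 60

print(bits_ok, value, sense_ok)
```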
XML as a tool for domain-specific languages
Computers are general-purpose machines for manipulation of symbols, which means they can be applied in almost any field whose problems can be expressed in terms of symbols. But the creators of computer systems and the potential users of those systems do not always think the same way and do not always find communication easy. Much of the history of information technology can be glossed as a series of attempts to bridge this communication gap. One current approach to this problem is to design 'domain-specific languages' (DSLs): formal languages suitable for computer processing, with vocabulary and semantics drawn from the intended application domain. In retrospect, the design of the Extensible Markup Language (XML) can be viewed as an attempt to encourage domain-specific languages and make them easier to specify. Like DSLs as conventionally conceived, XML vocabularies allow concise descriptions of interesting states of affairs in a particular application area and tend to be more accessible to domain experts than conventional programming languages.
Unlike conventional DSLs, most XML vocabularies are specified as having declarative rather than imperative semantics; this is both a blessing (declarative information is almost always easier to verify and easier to apply in new and unexpected ways) and a curse (many conventional programmers find declarative semantics hard to come to terms with). Examples will be drawn largely from XML vocabularies for the encoding of culturally significant textual materials.
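The point about declarative semantics can be illustrated with a small sketch (the vocabulary below is invented, not taken from the talk): because the markup states what a span *is* rather than what to do with it, the same document serves applications its encoder never anticipated.

```python
import xml.etree.ElementTree as ET

# A hypothetical, declarative XML vocabulary for an encoded letter: the
# markup asserts what each span is (a person's name, a date), not how it
# should be processed or displayed.
doc = """<letter>
  <p>My dear <persName>Emma</persName>, I arrived on
  <date when="1854-06-01">the first of June</date>.</p>
</letter>"""

root = ET.fromstring(doc)

# One application: build an index of named persons.
persons = [el.text for el in root.iter("persName")]

# A second, unanticipated application: harvest machine-readable dates.
dates = [el.get("when") for el in root.iter("date")]

print(persons, dates)
```

An imperative DSL would have fixed one of these uses in advance; the declarative encoding supports both without change.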
Exploring Issues at the Intersection of Humanities and Computing with LADL
Gregory Aist, Iowa State University
Fan Curation on the Internet
Nancy Baym, University of Kansas
The Graphic Visualization of XML Documents
David J. Birnbaum, University of Pittsburgh
This presentation describes the graphic visualization of XML documents in several projects in order to support philological research in the humanities. In many cases information that may not be easily accessible when the data is viewed in textual format (even with the benefit of markup) emerges strikingly when the marked-up prose is transformed, using XML tools, into a graphic representation. Furthermore, the derived graphic representations can be interwoven with more traditional textual ones in an interactive "workstation" that allows researchers to move easily among textual and graphic views as a way of researching and interrogating the content.
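A minimal sketch of the kind of transformation described, assuming an invented input document (the presentation's actual projects and toolchain are not reproduced here): element frequencies that are hard to see in raw markup become visible when the XML is transformed into a simple SVG chart.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical marked-up passage; in practice this would be a full
# philological XML file.
doc = """<text>
  <p>She said <said>hello</said> and <said>goodbye</said>.</p>
  <p>A <note>marginal note</note> follows.</p>
</text>"""

root = ET.fromstring(doc)

# Tally how often each element type occurs.
counts = Counter(el.tag for el in root.iter() if el is not root)

# Emit a minimal SVG bar chart, one bar per element type.
bars = []
for i, (tag, n) in enumerate(sorted(counts.items())):
    bars.append(f'<rect x="0" y="{i * 20}" width="{n * 40}" height="15"/>'
                f'<text x="{n * 40 + 5}" y="{i * 20 + 12}">{tag} ({n})</text>')
svg = '<svg xmlns="http://www.w3.org/2000/svg">' + "".join(bars) + "</svg>"
print(svg)
```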
Prosopography and Computer Ontologies: towards a formal representation of the ‘factoid’ model
John Bradley, Michele Pasin, King’s College London
Structured Prosopography provides a formal model for representing prosopography: a branch of historical research that traditionally has focused on the identification of people that appear in historical sources. Pre-digital print prosopographies, such as Martindale 1992, presented their materials as narrative articles about the individuals they contain. Since the 1990s, KCL's Department of Digital Humanities (formerly the Centre for Computing in the Humanities) has been involved in the development of structured prosopographical databases, and has had direct involvement in prosopographies of the Byzantine World (PBE and PBW), Anglo-Saxon England (PASE), Medieval Scotland (PoMS) and now more generally northern Britain ("Breaking of Britain": BoB), and is currently in discussions about others. DDH has developed a general "factoid-oriented" structural model that, although it downplays or eliminates narratives about people, has to a large extent served the needs of these various projects quite well.
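The factoid idea can be sketched in a few lines (the class and field names below are illustrative, not the actual KCL schema): every assertion about a person is anchored to the source that makes it, and a person's "article" is reassembled on demand rather than stored as a narrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Source:
    title: str
    locus: str          # e.g. a folio or page reference

@dataclass(frozen=True)
class Factoid:
    person: str         # identifier for the person concerned
    kind: str           # e.g. "office", "kinship", "possession"
    value: str          # what the source asserts
    source: Source      # where the assertion is made

# Invented example data.
factoids = [
    Factoid("duncan_1", "office", "Earl of Fife",
            Source("Charter A", "fol. 12r")),
    Factoid("duncan_1", "kinship", "son of Gillemichel",
            Source("Charter B", "fol. 3v")),
]

# Reassemble a person's dossier on demand instead of storing a narrative.
dossier = [(f.kind, f.value, f.source.title)
           for f in factoids if f.person == "duncan_1"]
print(dossier)
```

Because each factoid keeps its source, conflicting assertions from different documents can coexist instead of being silently merged.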
Materiality and Meaning in Digital Poetics
Julianne Buchsbaum, University of Kansas
Sounding it out: Modeling orality for large-scale text collection analysis
Tanya Clement, University of Texas
Many scholars and poets have written about the remarkable experience of hearing Gertrude Stein's texts read aloud. "Language poets" who emerged in the 1960s and 1970s and who form important scholarly communities today have adopted Stein as an early influence and a model. In part, the nature of this relationship has been ascribed to the indeterminacy and the manner of language play that Marjorie Perloff and others see evinced in Stein's writing, but the extent to which prosody and rhythm have also influenced these artists remains undocumented.
Further, very few scholars have had the means to investigate the speech patterns (whether African American or German or French) that may have influenced Stein. This paper will discuss a use-case study in which I am using data mining to examine clusters of patterns in Stein's poetry and prose compared to those in non-fiction narratives and oral histories, as well as those present in contemporary poetry. Taking advantage of pre-existing research and development with the Mellon-funded SEASR (Software Environment for the Advancement of Scholarly Research) application, this work has included: identifying output from OpenMary (a text-to-speech system that uses an internal XML-based representation language called MaryXML) as a base analytic; producing a tabular representation of the data, including phonemic and syntactic elements, for clustering and predictive modeling; creating a routine in MEANDRE (a semantic-web-driven, data-intensive flow execution environment) that produces this data and allows future users to produce similar results; and developing a user interface for seeing these comparisons across collections of texts. Access to large-scale repositories of text opens larger questions about how literary scholars can use such repositories in their research. John F. Sowa writes in his seminal book on computational foundations that theories of knowledge representation are particularly useful "for anyone whose job is to analyze knowledge about the real world and map it to a computable form" (xi). Similarly, Sowa notes that knowledge representation is unproductive if the logic and ontology which shape its application in a certain domain are unclear: "without logic," Sowa writes, "knowledge representation is vague, with no criteria for determining whether statements are redundant or contradictory," and "without ontology, the terms and symbols are ill-defined, confused, and confusing" (xii).
Knowledge representation is the work of all scholars in digital humanities, and these scholars must help determine the logics and ontologies that shape how we access this data. Charles Bernstein has written that "[t]he relation of sound to meaning is something like the relation of the soul (or mind) to the body. They are aspects of each other, neither prior, neither independent" (17). Scholars have not had the ability to analyze the features of text that correspond to orality—their phonemes and prosodic elements—much less compare these features with similar features across collections. To incorporate this kind of study in digital humanities, it is time we considered the logics and ontologies of orality in the computational environment.
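The clustering step of the workflow described above can be sketched in miniature (the feature values below are invented stand-ins for the phonemic and prosodic features that OpenMary output would supply; the real pipeline runs in SEASR/MEANDRE):

```python
import math

# Toy feature vectors per text: (repetition score, stress regularity).
# These numbers are fabricated purely for illustration.
features = {
    "stein_1":   (0.82, 0.31),
    "stein_2":   (0.78, 0.35),
    "oral_hist": (0.22, 0.60),
    "nonfic":    (0.18, 0.64),
}

# One round of nearest-centroid assignment with two seed centroids,
# the kernel of a k-means-style clustering.
centroids = [features["stein_1"], features["nonfic"]]
clusters = {0: [], 1: []}
for name, vec in features.items():
    best = min(range(2), key=lambda i: math.dist(vec, centroids[i]))
    clusters[best].append(name)

print(clusters)
```

On real data the vectors would have many dimensions (one per phonemic or prosodic feature) and the centroids would be iteratively refined, but the grouping logic is the same.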
Bernstein, Charles. Close Listening: Poetry and the Performed Word. Oxford University Press, 1998. Print.
Perloff, Marjorie. The Poetics of Indeterminacy: Rimbaud to Cage. Princeton, N.J: Princeton University Press, 1981. Print.
Sowa, John F. Knowledge Representation: Logical, Philosophical, and Computational Foundations. Pacific Grove, CA: Brooks Cole Publishing Co., 2000. Print.
Employing Geospatial Genealogy to Reveal Residential and Kinship Patterns in a Pre-Holocaust Ukrainian Village
Stephen Egbert, University of Kansas
Karen Roekard, Independent scholar
By incorporating data from a variety of historical records into geographic information systems (GIS), we are conducting research into visualizing what can be learned about residential and kinship patterns in the mixed-ethnic settlements of pre-Holocaust Eastern Europe. We have termed this process -- the linkage of records traditionally used for family history research with GIS -- "geospatial genealogy." Our prototype is the town of Rawa Ruska, Ukraine, located on the Rata River near the Polish border. It was founded in the mid-fifteenth century and was a "mixed" town of Jews, Poles, and Ukrainians. Over time its governance shifted among Austria-Hungary, Poland, Nazi Germany, the USSR, and now Ukraine. During WWII the Jews of Rawa Ruska were murdered in various "actions" at nearby mass gravesites or gassed at the Belzec extermination camp, 14 kilometers away. Our reconstruction, based on an 1854 cadastral map, utilizes house numbers listed on the map and cross-references them as they are used elsewhere, e.g. in vital records, tax and residence rolls, Tabula register contracts, etc. from the late 1700s to the early 1900s. Thus, house numbers provide a key link in establishing spatial patterns. Mapping residence patterns permits, for example, the examination of clustering or dispersion over time by ethnic group and relative wealth, or the degree of clustering around focal points such as the town square or places of worship.
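The record-linkage step at the heart of this method can be sketched as a simple join (all names, numbers, and coordinates below are invented for illustration): the house number on the cadastral map serves as the key that attaches other record series to locations.

```python
# House number -> parcel location from the georeferenced 1854 map.
cadastral = {
    12: (50.255, 23.621),
    47: (50.257, 23.625),
}

# Events drawn from other record series, keyed by the same house numbers.
vital_records = [
    {"house": 12, "surname": "Beck",  "event": "birth",    "year": 1861},
    {"house": 47, "surname": "Mazur", "event": "marriage", "year": 1873},
    {"house": 12, "surname": "Beck",  "event": "death",    "year": 1880},
]

# Attach coordinates to each record so events can be mapped in a GIS.
located = [dict(r, coords=cadastral[r["house"]])
           for r in vital_records if r["house"] in cadastral]
print(located)
```

Once every record carries coordinates, questions of clustering and dispersion by family, ethnicity, or wealth become spatial queries.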
The Atlanta Map Project: Modeling the History of the City Using Library Resources
Randy Gue, Michael Page, Stewart Varner
Emory University Libraries
Viral Venuses: The Potential of Digital Pedagogy in Feminist Classrooms
DaMaris Hill, University of Kansas
Breaking the Historian’s Code: Finding Patterns of Historical Representation
Ryan Shaw, University of North Carolina
From Uncertainty to Virtual Reality: Knowledge Representation in Rome Reborn
Phil Stinson, University of Kansas
Graphic representations of ancient Rome have become more visually powerful in the late twentieth and early twenty-first centuries with the innovations afforded by digital technologies, but the use value of these images is under debate today. This paper explores the interplay among different types of knowledge representation, an under-theorized area of research in the digital humanities, in the acclaimed Rome Reborn project, now also known as Ancient Rome 3D in Google Earth. Rome Reborn is perhaps the largest and most complex visualization endeavor in the digital humanities to date. The author of this paper belonged to the original project team (UCLA 1999-2001) and is on the Scientific Committee of the current iteration (UVA). Rome Reborn incorporates distinct classes of knowledge—historical sources, archaeological remains, and deductive logic or inference—as a basis to reconstruct the appearance of ancient Rome's monuments (mainly temples, public buildings and residential structures), urban infrastructure (streets, aqueducts), and topography (hills of Rome, Tiber River). All forms of knowledge utilized in the making of Rome Reborn are represented by the medium of an interactive virtual reality model consisting of millions of polygonal surfaces with applied colors, textures and simulations of light and shadow effects. This paper will perform an autopsy on Rome Reborn and expose its interwoven visual representations of historical, archaeological, and conjectural knowledge. The relationships of secure knowledge representations, which are sparse in the model, to the more prevalent conjectural or speculative knowledge representations will be clarified with the aim of identifying Rome Reborn's underlying epistemological structure.
Analysis of Rome Reborn in this manner holds the potential to advance the methodological discourse in the digital humanities for the visual representation of knowledge when multiple forms of knowledge require systemization and when levels of uncertainty are high.
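One way to systematize the distinction the paper draws could look like the following sketch (purely illustrative; this is not the Rome Reborn data model): each modeled element carries a tag naming the class of knowledge it rests on, so secure and conjectural portions of the model can be queried and displayed separately.

```python
# Invented example elements, each tagged with its evidentiary basis.
elements = [
    {"name": "temple podium",       "basis": "archaeological"},
    {"name": "temple roof",         "basis": "conjectural"},
    {"name": "insula facade",       "basis": "historical"},
    {"name": "insula upper floors", "basis": "conjectural"},
]

secure = [e["name"] for e in elements if e["basis"] != "conjectural"]
conjectural = [e["name"] for e in elements if e["basis"] == "conjectural"]

print(len(secure), "secure;", len(conjectural), "conjectural")
```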
Making the most of free, unrestricted texts – a first look at the promise of the Text Creation Partnership
Rebecca Welzenbach, University of Michigan
In April 2011, the Text Creation Partnership announced that 2,231 transcribed and SGML/XML-encoded texts from the Eighteenth Century Collections Online (ECCO) corpus were freely available to the public, with no restrictions on their use or distribution. This is the first set of TCP texts to have all restrictions lifted. We have already seen significant interest in studying, manipulating, and publishing these texts, which has given us a peek at what might happen in a few years, when the much larger EEBO-TCP archive also becomes available to the public. The release was met with enthusiasm by power users who were eager to work directly with the XML files, but frustration by those who expected a full-service platform to interact with the texts. This presentation will discuss the mixed reactions to the release of the ECCO-TCP texts; offer examples of how people are starting to work with them; and highlight some of the questions, challenges, and opportunities that have arisen for the TCP as a result.
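Working "directly with the XML files" can be as simple as the sketch below (the snippet imitates TEI-style encoding; real ECCO-TCP files are far larger and richer): strip the markup from the text body to get plain text for counting or further analysis.

```python
import xml.etree.ElementTree as ET

# Minimal stand-in for a TCP transcription.
doc = """<TEI>
  <text><body>
    <div><p>OF the Original of Government there are <hi>two</hi> opinions.</p></div>
  </body></text>
</TEI>"""

root = ET.fromstring(doc)

# Concatenate all text inside <body>, normalizing whitespace.
plain = " ".join("".join(root.find(".//body").itertext()).split())
words = plain.split()

print(plain)
print(len(words))
```

From here the same files feed word-frequency studies, concordances, or corpus-scale text mining without any intermediary platform.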