The Society of American Archivists is meeting this week in Washington, DC. Last week they had a pre-conference about EAC-CPF, which they twittered under the hashtag #eacdc. I’m not formally trained as an archivist, but I’ve used a lot of different manuscript/papers collections and am really interested in digital tools for archival researchers, so I was following their reportage.
I hadn’t ever given a thought to the technical and ethical issues involved in authority databases until a few months ago, when I heard Simon Wiles speak at Yale about his Buddhist authority databases project. Among other interesting technical challenges, it handles records for texts in multiple Asian languages (including non-Unicode ones) and multiple time systems used in various Buddhist cultures.
With the approval of his supervisors at Dharma Drum College, Wiles publishes all his project’s databases under an open-source license, encouraging them to be copied in full to any server that will host them. This is a preservation and access strategy designed to work around issues of political and religious freedom in various Asian countries; Wiles also talked better than I can summarize about how this plan flows from the college’s Buddhist principles.
EAC-CPF (Encoded Archival Context for Corporate Bodies, Persons, and Families) is an XML format that makes it possible for archivists to mark off when items in their collections relate to particular named people or groups. It’s related to what archivists call “authority control,” the practice of standardizing headings in a cataloging system so that users can use a single heading to find what they’re looking for. (Don’t look up “Grace Abbott,” look up “Abbott, Grace, 1878-1939.” This was more important in an old-style card catalog, but it’s not unimportant in a computer-based system either.) From what I could tell, #eacdc was talking about the relative merits of establishing a single authority file or setting up a distributed system. This is an arcane topic for general readers, but a specialist topic for library professionals.
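To make the idea of authority control concrete, here is a minimal sketch in Python. It is an illustration only: it does not use EAC-CPF or any real catalog's data model, and the names AUTHORITY_FILE, normalize, and authorized_heading are hypothetical. Only the Grace Abbott heading comes from the example above; the variant forms are assumed for demonstration.

```python
# Hypothetical authority file: each authorized heading maps to variant
# forms a researcher might type. The variants are assumptions made up
# for this sketch, not entries from a real authority file.
AUTHORITY_FILE = {
    "Abbott, Grace, 1878-1939": ["Grace Abbott", "Abbott, Grace", "G. Abbott"],
}


def normalize(name):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def build_index(authority_file):
    """Map every normalized form (heading or variant) to its heading."""
    index = {}
    for heading, variants in authority_file.items():
        for form in [heading, *variants]:
            index[normalize(form)] = heading
    return index


INDEX = build_index(AUTHORITY_FILE)


def authorized_heading(name):
    """Return the single authorized heading for a name, or None if unknown."""
    return INDEX.get(normalize(name))
```

Calling authorized_heading("Grace Abbott") or authorized_heading("abbott, grace") returns the one standardized heading, so every variant leads a user to the same catalog entry. A name that is not in the file returns None.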
Fortunately, when I tweeted a question about #eacdc, Maureen Callahan at The Patriarchive gave a good explanation of EAC and whether we should care. More digital-humanities people should be writing pieces like this: explanations of technical issues in their field for others who may need to know. Here’s why.
Although I’m trained as a historian, in the professional field of archives-management I’m a curious amateur. Authority control is interesting to me because I use a number of under-processed women’s history collections, and I’d like to experiment with crowdsourcing approaches for generating finding aids.
RG 102, the Papers of the United States Children’s Bureau, is an important collection for historians who study the early 20th century; women, gender, and family; the emergence of the federal welfare state; and the history of public health. I use it a lot in my dissertation.
I and the +/- 5 other historians who use any particular chunk of NARA RG 102 ought to be able to generate a pretty good collaborative finding aid if we have the right tools to make it easy enough. After all, we index our own digital photos of materials as part of our working notes; why should anyone else ever have to make a list of the items in that box and folder (or, for that matter, photograph them in person)? And if I’ve got subject, date, and topic notes on a folder of letters sent by Grace Abbott, why shouldn’t anyone who’s looking for material on Abbott be able to know what’s in that folder without having to travel to Maryland to do it? (Yes, there’s a microfilm edition, but it’s selected.)
This toolsmithing work and information compilation is an important project, and I think that EAC-CPF may be an important bit of the infrastructure we need for making usable, collaboratively-created finding aids. But without reading more work on emerging technical standards for digital archives, I can’t know for sure. To find other people (historians, programmers, archivists) who’d be interested in collaborating on this toolsmithing project, I have to know who can explain authority control (or linked data, or GIS correspondence mapping) to people outside their own professional specialty.
Explaining what you know for people outside your discipline— particularly specialist methods or theory— is a vital part of building meaningfully interdisciplinary communities. Given that many of the best digital-humanities projects are collaborative team efforts, these explanations might also help potential future collaborators understand why they want to work with you.
Due to Puerto Rico’s history as a Spanish colony (1493-1898) and a US colony/possession (since 1898), it has some unusual legal traditions and practices. Until very recently, common practice on the island was that if you needed to present a birth certificate as proof of identity, you ordered an extra copy of that document, and the institution (school, city, etc.) kept that copy on file for its records.
That practice led to lots of extra birth certificates floating around, which happen to have a high resale value. There’s been at least one criminal operation busted for stealing birth certificate copies and reselling them.
Puerto Ricans have been US citizens since 1917, when the Jones Act took effect, but the territorial government’s been making it very difficult to prove one’s citizenship lately.
Because some people were using Puerto Rican birth certificates to evade US immigration-restriction policies, Puerto Rico’s territorial government was taking a lot of heat from the federal government for not doing anything to prevent identity fraud. The legislature decided to implement new, “more secure” birth certificate documents, and so it declared that as of July 1, all Puerto Rican birth certificates ever issued would be invalid.
This decision affects an estimated 4-5 million US citizens, most of whom happen to have Spanish last names and/or brown skin. Furthermore, because US economic policies have hobbled Puerto Rico’s economy, many people born there now live on the US mainland, particularly in cities like New York, Atlanta, and Chicago. If they want to travel back to the island, they have to go by air, which means they need photo ID— which they can’t acquire or renew unless they present a birth certificate or passport. Moreover, they have to know about the invalidation of their existing birth certificates in order to know that they need a new one, and the news hasn’t been particularly widely disseminated.
I’ve been working on some thorny research problems lately which are, mostly, about handling large volumes of archival data. When I get stuck, I try to go for a bike ride. Lately, following Ryan Cordell’s advice on podcasts as a way to broaden your intellectual horizons, I’ve been listening to scholarly talks on my rides. Here are 2 lectures I’ve particularly enjoyed:
Jeffrey McClurken on Confederate veteran families and database methods
Jeff McClurken’s recent talk at the Virginia Historical Society, “Take Care of the Living: Reconstructing Confederate Veteran Families in Virginia,” is based on his newish book of the same name. He talks about the Civil War’s impact on Confederate veteran families after the war, including families whose military relatives died in the conflict, were physically wounded, or were mentally traumatized.
McClurken’s conclusions are interesting, but what really got my attention was his use of a database to build his conclusions. For reasons he describes in the talk, he focuses on soldiers from a particular town in Virginia– Danville– and Pittsylvania County, which surrounds it. He put each veteran’s name into a database, then reconstructed how they and their families appear in the written record over time as they made claims on relief systems: local elites, churches, state military pensions, state asylums, and such.
On one hand, this is a method we don’t see a lot of anymore, reminiscent of the early 1970s New England community studies– but he has a database, rather than punched cards, which makes his analysis more flexible. He can do numerical analysis, talking (for example) about the differences in family finances caused by having a relative killed or having a relative wounded in Confederate service— but he can also track individuals and their families over time. I’m very interested in this, since it’s similar to some methods I’d like to use; I haven’t read the book yet, but I’m told that the appendix contains more technical details, and I’m looking forward to it.
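As a rough sketch of how a database supports both kinds of question, here is a small Python example using the standard sqlite3 module. Everything in it is assumed for illustration: the table and column names are hypothetical, the rows are made-up placeholders rather than records from the book, and this is not McClurken’s actual schema, which I haven’t seen.

```python
import sqlite3

# In-memory database with a hypothetical two-table layout: one row per
# veteran, and one row per appearance in a relief record.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE veterans (
    id INTEGER PRIMARY KEY,
    name TEXT,
    war_outcome TEXT              -- e.g. 'killed' or 'wounded'
);
CREATE TABLE relief_claims (
    veteran_id INTEGER REFERENCES veterans(id),
    year INTEGER,
    source TEXT                   -- e.g. 'church', 'state pension'
);
""")

# Placeholder rows, invented purely to make the queries runnable.
conn.executemany("INSERT INTO veterans VALUES (?, ?, ?)", [
    (1, "Veteran A", "killed"),
    (2, "Veteran B", "wounded"),
])
conn.executemany("INSERT INTO relief_claims VALUES (?, ?, ?)", [
    (1, 1866, "church"),
    (1, 1888, "state pension"),
    (2, 1870, "local elite"),
])


def claims_by_outcome(conn):
    """Aggregate view: number of relief claims per wartime outcome."""
    rows = conn.execute("""
        SELECT v.war_outcome, COUNT(*)
        FROM veterans v JOIN relief_claims c ON c.veteran_id = v.id
        GROUP BY v.war_outcome
    """).fetchall()
    return dict(rows)


def timeline(conn, veteran_id):
    """Individual view: one family's claims in chronological order."""
    return conn.execute(
        "SELECT year, source FROM relief_claims "
        "WHERE veteran_id = ? ORDER BY year",
        (veteran_id,),
    ).fetchall()
```

The same two tables answer a group-level question (claims_by_outcome) and an individual one (timeline), which is the flexibility that punched cards made hard.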
Julie Meloni et al. on annotatable digital archives
Julie Meloni’s talk at the University of Virginia’s Scholars’ Lab, “N-Dimensional Archives,” is more conceptual and theoretical. She builds on the work of Johanna Drucker, Bethany Nowviskie, and others at UVa’s Speculative Computing Lab, particularly their concept of a ’Patacritical Demon. Nowviskie’s introduction gives good context for why this is important. Meloni describes a system which allows multiple readers to annotate an electronic text and to see one another’s annotations.
Although Meloni’s trained in English and approaches much of this problem as a literary-studies scholar, I can imagine this system’s use for historians. (Note that she hasn’t actually built it, but she describes a little bit about the technologies which would enable such a system. The Open Annotation Collaboration is currently building some systems which might head towards this goal, and they’re worth keeping an eye on.) Jerome McGann also follows with some remarks. Although the Q&A session at the end didn’t come through the recording system well, I found the whole session good for sparking my thoughts.
If you’re interested in digital humanities and want to know more about the field, you could do far worse than to subscribe to the podcasts from the Scholars’ Lab and from the Maryland Institute for Technology in the Humanities. Every time I listen to one of them, I get more excited about the new interdisciplinary work being done at the intersection of computing and the humanities.
If you have other suggestions for podcasts on history or digital humanities that you’ve enjoyed, please share!
As I’ve become more and more interested in the emerging interdisciplinary field sometimes called “digital humanities” (DH) or (especially in Europe) “humanities computing,” I’ve also noticed that it’s hard to find DH people in Boston or New England, other than at MIT’s HyperStudio. Mark Sample’s post on the death of the digital humanities center gives examples of some DH centers elsewhere in the US. Sample points out that, although many scholars might wish to have institutional infrastructure/support for our DH work, budget cuts happen first in interdisciplinary programs. Here’s his call for action:
So if you’re interested in the transformative power of technology upon your teaching and research, don’t sit around waiting for a digital humanities center to pop up on your campus or make you a primary investigator on a grant.
Act as if there’s no such thing as a digital humanities center.
Instead, create your own network of possible collaborators. Don’t hope for or rely upon institutional support or recognition. To survive and thrive, digital humanists must be agile, mobile, insurgent. Decentralized and nonhierarchical.
Stop forming committees and begin creating coalitions. Seek affinities over affiliations, networks over institutes.
Centers, no. Camps, yes.
To that end, and inspired by the new Digital Humanities Southern California blog, I’m trying to help build more local and regional community among DH practitioners. I’ve started a new group blog, Digital Humanities in Boston and Beyond, which will (as it develops) feature posts about local DH specialists, what we do, and why it’s exciting. Please join us there.

