Historians’ ethics, women’s history, and the infinite archive

2009 October 11

Lately, I’ve been thinking a lot about historical sources and the implications of digitizing them. I work a lot with a particular collection of government papers that’s full of ordinary women’s letters about their experiences as mothers during the early 20th century. It’s hundreds of linear feet, and only a small part of it has been microfilmed. It’s easy enough to get to; you don’t have to be a professional historian to read the collection; and it’s open to the public. But it’s still physically located in one place, which limits the number of people who can read it. (The curious can read a published selection of the letters in Raising a Baby the Government Way.)

The collection’s also a fabulous target for digitization. It’s part of a federal government collection held by NARA; it’s civilian, unclassified records, and as far as anyone can tell me, that means it’s in the public domain. It’s not encumbered by intellectual property issues or digital-photography restrictions. In fact, I’ve spent weeks looking through it, photographing materials relevant to my research with a digital camera; the ability to do that has literally made my current project possible. There’s nothing stopping me from putting that material online, save my own limited time and budget, and I’ve been thinking hard about doing just that. We need more women’s history sources in the infinite archive.

But here’s the problem: the letters in it are full of intimate stories, and that raises an issue I haven’t seen any digital-methods historians addressing.1 Right now, because they’re physical papers stored in boxes, these documents enact a kind of privacy through obscurity. But any intelligent computer user knows that security through obscurity is a horrible way to keep a secret.

When historians quote from this collection, they usually obscure identifying details, out of respect for the privacy of people who may still be living, but they cite it in a way that lets other historians find it in the physical archive. So I’ll do the same here and quote a letter in full, omitting identifying personal information.

This is a letter written to Eleanor Roosevelt by an African-American mother who faced a complicated family situation and was looking for some help. (I have an image of it which I’m not posting here, because the content of the letter would require me to redact it heavily.) It’s written on hotel stationery from Chattanooga, Tennessee, in pencil; the handwriting gets looser and less composed over several pages, as its author got more upset about the problem she was describing. (There aren’t any paragraph breaks in the original, either.)

Dear Mrs. Rosevelt,

Will you please help me I am a negro girl I am married. I am going to tell you every thing so you will under stand. I married my husband Nov 23 1938. Diden any body know until June 1939. I was going to have a baby. My husband sisters did not like it be cause he married so they did every thing to break us up and succeeded We got a baby born Nov 5 1939 name [redacted] in Atlanta Ga. The baby was born 7 months come be fore time I want a birth certificate I cant get a regular job without it I have wrote to the Bureau of Vital Stitictis for a cercitific in Atlanta 2 months a go hasent heard from them will you please help me I cant have my baby be cause I cant make my husband keep me without it. His sisters wont let him help me but I can get help if I get the cificitate I cant bring the baby home until I get some help but they wont help me in Atlanta the babys name is [redacted]. Fortha [Father's] name [redacted] Mother Susie Mae [redacted] Please help me I wont my baby my brother has got her untill I can take care of her

yours truly

Susie Mae [last name redacted, with an address.] 2

I could describe a number of interpretive points about this document, but for now, trust me: this is an interesting letter—both for the gritty detail it gives about one woman’s hardships, and for what it says about how Americans learned to use birth certificates. But for me as a historian, this letter writer’s personally identifying information is Really Important. Since I know her name, her age (and thus her approximate birth year), and her address, I can use a bunch of online tools to find out about her.

Thanks to a digitized, searchable 1930 federal manuscript census, I’ve looked her up. I can tell you all about her husband and his family. I could probably use the Digital Sanborn Maps to find out more about the house she was writing from, and I might even be able to find it on Google Maps Street View. Since I know her child’s name, and her child was born in the late 1930s, it’s entirely possible that I could find the “baby” described here, who would be about 70 years old today. (That’s a step many historians wouldn’t take unless they really needed some information. For my project, it’s not important.) But most of what I’ve just described is absolutely standard historical research practice: when you want to find out about someone, there are ways to learn if you’ve got the patience to work through them. It’s just that modern digital research tools make the legwork faster.

If I were putting this collection online with good semantic web metadata, it’s entirely possible that I might add automatically-generated links to such services: “Search for this person in the US manuscript census,” “Investigate this address,” and such. (There’s a workshop next weekend in Toronto that’s going to be thinking about the technical issues involved in building APIs for digital-humanities web services. What I’m describing is entirely feasible, from a technical perspective—if not today, then very soon.)
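To make that concrete, here is a minimal sketch of how such links might be generated from a letter’s metadata. Everything in it is an assumption for illustration: the two endpoint URLs are placeholders on example.org, not real census or map services, and the metadata field names (`author`, `city`, `state`) and the `research_links` helper are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoints: placeholders, not real census or map services.
CENSUS_SEARCH = "https://census.example.org/search"
ADDRESS_SEARCH = "https://maps.example.org/lookup"


def research_links(letter):
    """Return (label, url) pairs built from a letter's metadata dict."""
    links = []
    # Only offer a census search when both a name and a state are recorded.
    if letter.get("author") and letter.get("state"):
        query = urlencode({"name": letter["author"], "state": letter["state"]})
        links.append(("Search for this person in the US manuscript census",
                      f"{CENSUS_SEARCH}?{query}"))
    # Only offer a place lookup when both a city and a state are recorded.
    if letter.get("city") and letter.get("state"):
        query = urlencode({"city": letter["city"], "state": letter["state"]})
        links.append(("Investigate this address",
                      f"{ADDRESS_SEARCH}?{query}"))
    return links


# Example metadata, with the last name reduced to an initial.
letter = {"author": "Susie Mae M.", "city": "Chattanooga", "state": "TN"}
for label, url in research_links(letter):
    print(f"{label}: {url}")
```

The point of the sketch is how little it takes: once a name and a place are recorded as metadata, a few lines of code turn every letter into a starting point for looking up its author.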

In short, I could make the lives of Susie Mae M____’s descendants very… interesting. And, given the content of this letter, probably not in a good way.

Not all of the letters in the Children’s Bureau papers are nearly so scandalous. Many of them are questions about childrearing in one form or another, and many are from women in tiny rural towns. Any really useful project would require that the images of the documents have metadata added: the author’s name, city, state, and such; lots of work, possibly automatable or crowdsourceable, but the results would be amazing. (Wouldn’t it be cool to google your great-grandmother’s name and find a letter she wrote in the 1920s, when she was trying to learn about the best diet for your then-2-year-old grandfather?) And putting the collection online would make these thousands of letters more accessible than they are now, where they’re hardly even indexed.

What are the ethics of putting materials like this online, in bulk, for use by digital-methods historians? I’m not sure I know. On one hand, if I revealed the last name of Susie Mae M____ by putting the images of that letter online, I might find myself answering some very testy correspondence by her descendants, who don’t want their family’s historical “dirty laundry” aired on the Internet.

On the other hand, historians who work on the history of women and sexuality have spent a lot of time talking about how defining and separating “private” from “public” does political work. I’m not willing to say that just because Susie Mae M____’s descendants might object to that letter’s publication on the internet, we should never digitize “sensitive” material. If we keep “private” or “scandalous” materials out of digital primary source collections, then (to borrow Estelle Freedman’s phrasing) the burning of letters continues online, even if the paper copies still survive.3

What do you think the best approach is? What haven’t I taken into consideration?

  1. Then again, I’m a finite person. Do comment and let me know what I should be reading.
  2. Susie Mae M___, Chattanooga, Tennessee, to “Mrs. Rosevelt,” May 11, 1940, Folder 4-2-1-2 Birth Registration, Box 730, Records of the Children’s Bureau, RG 102, National Archives, College Park, Maryland. All spelling and punctuation is consistent with the original.
  3. We could put it in a walled garden where only “serious scholars” can search for these materials, but that approach is fraught for entirely different reasons. (First: define “serious scholar.”)
  1. Anonymous Archivist
    October 12, 2009

    I’m a professional archivist who works with a collection containing similar items, but of a significantly more recent vintage. I have two comments that may either help or muddle things further:

    (1) The letters remain the intellectual property of the individuals who wrote them, regardless of where they now reside (the government’s responses, otoh, would be in the public domain). This includes copyright, so you need to do a regular copyright assessment the same as you would for the letters of, say, a famous writer.

    (2) In our collection, we restrict access to sensitive material similar to what you are citing for 75 years from date of creation, on the assumption that 99% of the individuals involved will be dead by that point. If the subject matter involved specific children, we would make an extra effort to make sure that they were no longer living as well, before allowing full and open access. The “75 year rule” is a fairly common practice in archives, especially those that handle things like student records.

    • Shane
      October 12, 2009

      Anonymous Archivist, Thanks for your thoughtful reply.

      (1) Where can I read about how professional archivists determine the first point, about copyright and IP issues? Because if that’s the case, then Molly Ladd-Taylor’s edited collection of these letters (with the names of authors reduced to initials) would be unpublishable today. Everyone I’ve talked to says that materials in government files are public domain, regardless of the author.

      (2) But correspondence between 2 people who were both on the government payroll is still public-domain, which is good to know for some of the other material in this collection.

      (3) I’m familiar with the 75-year rule as a common professional practice. But NARA doesn’t seem to follow that rule in its access restrictions; again, I think they’re assuming that the size of the collection and the lack of comprehensive indexing will work to keep people out. And if they applied a 75-year rule anytime some letter-writer mentioned her baby’s name and birthdate, most of the collection would be closed, because the original filing system interfiled letters from citizens with interdepartmental memos. This is a case where paper privacy-through-obscurity works exactly as it should, I think. But transferred to the idea of an online collection, I don’t know whether the privacy concerns or the research-accessibility concerns should win out.

  2. October 12, 2009

    This is a really thoughtful post – what fascinating source material, and I think you touch on an important point. I always lean towards openness and transparency, but situations like these sure give me pause. I’d still say that the benefits of opening up archival records far outweigh the potential drawbacks, but those issues are certainly worth tackling.

    Great-looking site, too!


    • Shane
      October 12, 2009

      Cameron, thanks. I’ll have more posts in the future on these conflicts, because I think about them a lot. Then again, I also have a dissertation to write, so the posting may be slower than I’d like. ;)

  3. Anonymous Archivist
    October 12, 2009

    Shane, thank you for your interest. It’s not often that archivists find people outside our profession who are genuinely interested in the nitty-gritty. Please apply the usual IANAL common sense to what I say here about legal matters.

    I’m assuming that the idea that “everything in a government file” is public domain is a (mis)interpretation of the portion of U.S. copyright law that states that works “prepared by an officer or employee of the U.S. government as part of that person’s official duties” are in the public domain. See, for example, http://www.law.cornell.edu/uscode/17/101.html Hence, the non-government-employee-generated portions of government files are not in the public domain, even though they may be in the possession of the government.

    Correspondence is always a headache for archivists when it comes to copyright. It usually goes something like this: Person A corresponds with Person B. Person B saves Person A’s correspondence and donates that correspondence to an archives as part of his/her personal manuscript collection. Sometimes Person B will sign over copyright along with the manuscript donation, but Person B is simply unable to sign over copyright to the letters from Person A, because Person B never owned the copyright to them. So, any researchers who want to publish or do other things that might infringe on the copyright of Person A are generally advised to contact that person directly to receive permission.

    As for where to look for info on how archivists handle copyright, Peter Hirtle at Cornell is an acknowledged expert in the field. He publishes a chart that is a good starting place for determining if something is in the public domain: http://www.copyright.cornell.edu/resources/publicdomain.cfm Also, the Society of American Archivists recently published a Statement of Best Practices for Orphan Works, which is available here: http://archivists.org/standards/OWBP-V4.pdf It’s not immediately relevant to this discussion, but would become relevant as soon as you had trouble trying to track down a correspondent.

  4. October 12, 2009

    On the privacy question, I wonder if there could be some sort of compromise where the letters are still publicly accessible but kept out of search engines, sort of like the intermediate privacy setting on blogs (a level below password protection). That wouldn’t solve the problem of people quoting/describing the letters on sites the search engines do index, and it would require an internal search function for the letters site itself to make it more usable, so it could be more trouble than it’s worth.


