Showing posts with label digital archives. Show all posts

How many photos have ever been taken?

How many photos have ever been taken? About 3.5 trillion according to Jonathan Good, in a fascinating piece. I particularly liked this graphic of the world's largest photo libraries:

Historic Photographs from the Spokane Public Library

Here is a great new resource for local history: Spokane Public Library - Northwest Room Digital Collections. "The Northwest Room of Spokane Public Library is pleased to introduce a digital collections page of photographs selected from collections in the Northwest Room. These collections emphasize the most frequently requested subjects in the Northwest Room – the homes, buildings, streets and activities in and around Spokane."

Remains of the Glover/Pioneer Block after the fire of 1889.
So far the collection contains about 350 items, and Northwest Room librarian Riva Dean says that the library plans to add "plenty more." The initial collection includes historic Spokane photographs in seven categories: the Spokane Fire (1889), Spokane River, Spokane Views, Spokane Homes, Spokane Parks, Spokane Bridges, and Spokane Streets. Unfortunately the images are provided using ContentDM, which is the industry standard despite being a fairly stodgy piece of commercial software. It presents information adequately but does not allow users to interact with items to add data or offer corrections. (It would be nice, for example, to flag this item, which contains a picture of one Spokane park with a description of a different park, without having to go back to the library webpage and figure out whom to email with the correction.) It also does not allow users to easily save or export the images or to share them with friends via Facebook or other social media. (I extracted the image on this page using the Picnik extension on the Chrome browser.)

The first set of pictures is delightful, from scenes of the devastating 1889 fire to early street images to intriguing homes (anyone know where this one is?).

It is a shame the site is not more interactive because these are wonderful photographs and could easily become nodes of online discussion about Spokane history. Many have only partial information with them--the dates are unknown, the locations have been forgotten, etc. If there were a discussion area with each photo, patrons could probably help fill in a lot of the missing information.

The Spokane Public Library is doing a real service to its public by making these photographs so easily accessible. I look forward to watching them add to the collection. Of course if you are the impatient type you can just go to the Northwest Room of the Spokane Public Library, where they have a vast collection of primary resources for Spokane History.

A Post for Jerry Handfield



Jerry Handfield, the Washington State Archivist and one of my bosses, loves these Team Digital Preservation videos. DigitalPreservationEurope has created a series of these cartoons to help explain basic concepts of digital preservation. There are a half-dozen videos in the series so far; you can see the rest here.

Online Resources at the Washington State Historical Society

I began working on a post about the new Columbia magazine website, but along the way found a treasure trove of other history resources at the Washington State Historical Society. So I will begin with the magazine and branch out from there.

I see that Columbia: The Magazine of Northwest History, the journal of the Washington State Historical Society, has a new and improved website. Those of you who are familiar with Columbia know that it is a superb journal of popular history, with articles and book reviews that present recent scholarship to a popular audience. The website does not contain everything that has appeared in the journal but many of the articles are available in full text. (This article is excellent.)

There are many other online resources available from the WSHS, including lesson plans, featured collections, women's history resources, and information about visiting the WSHS locations in Tacoma and Olympia. The featured collections section is an especially rich resource, and includes digital collections of maps, American Indian Photographs, Columbia River Photographs, a Gustav Sohon Collection, and a Sheet Music Collection. The collections are very much oriented to the west side of the state (a search for "Spokane" results in zero hits!) and are presented in ContentDM. But within these restrictions there are tremendous things to be found at the WSHS site.

An Interesting Graduate Student Project

One of my graduate students, Shaun Reeser, is working to recover a long-lost government website for his MA public history project. Specifically, he will be working with some of the staff at the Washington State Archives, Digital Archives to recover the website of Ralph Munro, who served five terms as Washington Secretary of State from 1981 to 2001. Munro launched the first website for the Secretary of State's office in 1996, and the site was regularly updated until he left office in 2001. How to bring it back?

The Digital Archives has done something like this before, when we preserved the website of Washington Governor Gary Locke. In that 2005 effort the DA staff raced the clock to migrate Locke's website (an important public record) to the DA before it was taken down to make way for the website of Governor Gregoire. It was a pioneering project, but what Reeser will try to do is different. Munro's website was preserved in several versions at the Internet Archive Wayback Machine. But simply pulling the information off the Wayback Machine is not sufficient for establishing archival authenticity. And the versions of the website at the Wayback Machine are often incomplete, lacking some of the original images, for instance, and full of broken links.
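For readers curious what "several versions at the Wayback Machine" looks like in practice, here is a minimal Python sketch that asks the Internet Archive's public availability endpoint for the capture closest to a given date. This is only my illustration, not Reeser's method or the Digital Archives' workflow, and "example.gov" is a placeholder rather than the real address of Munro's site. As noted above, finding a capture is not the same thing as establishing archival authenticity.

```python
import json
import urllib.parse
import urllib.request

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(site_url, timestamp):
    """Build an availability query for the capture closest to timestamp (YYYYMMDD)."""
    params = urllib.parse.urlencode({"url": site_url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


def closest_snapshot(response_text):
    """Return (snapshot_url, timestamp) from the JSON reply, or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def lookup(site_url, timestamp):
    """Ask the Wayback Machine for the capture nearest to timestamp (needs network access)."""
    with urllib.request.urlopen(build_query(site_url, timestamp), timeout=10) as response:
        return closest_snapshot(response.read().decode("utf-8"))


if __name__ == "__main__":
    # "example.gov" is a hypothetical stand-in; substitute the site you are researching.
    print(lookup("example.gov", "19961201"))
```

Calling lookup once for each year of interest would give a rough inventory of which captures exist, which is a starting point for judging how incomplete each version is.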

Right now Reeser is trying to hunt down the original digital files and to interview Munro and members of his staff. He is also studying other efforts to spider and preserve government websites as historic documents. Does anyone know of similar efforts that we should study?

[Image: Oh come on, you know.]

404, A Cautionary Tale

Don't ever create an extensive list of hyperlinks on your website. Just don't do it. It is so easy to get caught up in the excitement of cataloguing all the wonderful websites you find on your topic. Then you create your own website with a links page to these resources. And then--it falls apart. URLs change, websites go down, what was good becomes bad or redundant in the light of other new sites, and you have a mess on your hands.

Witness the unfortunate state of this University of Idaho site: Repositories of Primary Sources. It sounds so promising:

A listing of over 5000 websites describing holdings of manuscripts, archives, rare books, historical photographs, and other primary sources for the research scholar. All links have been tested for correctness and appropriateness.

Woohoo! A worldwide guide to all the websites for archives and special collections. This is exactly the kind of resource we need to keep track of all the other digital resources. So I eagerly navigated to the section for the Western United States and Canada and then to Washington and then to Central Washington University:



No problem. Let's try a different link. The East Benton County Historical Society? "Oops! This link appears to be broken." The Echoes of the Past Archive sounds interesting. "This domain is for sale. Please contact info@echoesarchive.com for more information." Never mind, let's try my own Eastern Washington University--no, another 404 page. Of the first 15 links for Washington State, 11 are broken.

I do not mean to slam on the fine people at the University of Idaho, who obviously put a great deal of effort into creating this resource. No doubt they meant to maintain it, and no doubt more pressing matters have directed their attention elsewhere. This is just what happens to such endeavors.

So kids, never create extensive links pages. Or if you must, make them a wiki and leave a note asking users to fix anything they find that is broken.
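If you are stuck maintaining a links page anyway, a small script can at least tell you how bad things have gotten. The following is a minimal Python sketch, not anything the University of Idaho uses; the function names are my own, and it assumes that a failed request or any 4xx/5xx status means the link is broken.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error such as 404
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, refused connection, timeout, malformed URL


def find_broken_links(urls, fetch=fetch_status):
    """Return the URLs whose status is missing or is a 4xx/5xx error."""
    broken = []
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            broken.append(url)
    return broken
```

Note what this will not catch: a parked "this domain is for sale" page usually answers with a perfectly healthy status code, so a human still has to look at the survivors. Some servers also refuse HEAD requests, in which case a GET would be the fallback.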

Google's Book Search: A Disaster for Scholars?

Your humble Northwest History blogger is sometimes accused of being a Google fanboy. A fair cop. But you know who is not a Google fanboy? Geoffrey Nunberg, that is who. Over at the Chronicle of Higher Education Nunberg has a witty jeremiad, Google's Book Search: A Disaster for Scholars.

Nunberg's beef is with Google's sloppy and commercially driven metadata schemes. He demonstrates that even with such a basic item as date of publication, Google Books very frequently gets it wrong. This in turn often corrupts search results: "A search on 'Internet' in books published before 1950 produces 527 results; 'Medicare' for the same period gets almost 1,600." By comparing Google's data to that found in the catalogues of the contributing libraries Nunberg shows that these errors do in fact belong to Google, not to their partners.


Nunberg also whacks Google for the classification errors where books are placed in the wrong categories: "H.L. Mencken's The American Language is classified as Family & Relationships. A French edition of Hamlet and a Japanese edition of Madame Bovary are both classified as Antiques and Collectibles . . . An edition of Moby Dick is labeled Computers; The Cat Lover's Book of Fascinating Facts falls under Technology & Engineering."

Worst of all to Nunberg is Google's adoption of the Book Industry Standards and Communications categories for Google Books, which he describes as a modern commercial invention used to sell books, rather than a scholarly system of classification like the Library of Congress subject headings: "For example the BISAC Juvenile Nonfiction subject heading has almost 300 subheadings, like New Baby, Skateboarding, and Deer, Moose, and Caribou. By contrast the Poetry subject heading has just 20 subheadings. That means that Bambi and Bullwinkle get a full shelf to themselves, while Leopardi, Schiller, and Verlaine have to scrunch together in the single subheading reserved for Poetry/Continental European. In short, Google has taken a group of the world's great research collections and returned them in the form of a suburban-mall bookstore."

I think that Nunberg has a number of good points--points that he gathers together to form a molehill, from which he conjures up a mountain. Google's metadata may be everything he says (and I think he is probably right) but how great a problem is that really? This scholar at least uses Google Books either 1) to locate a digital copy of a book I already know about, or 2) via a string of search terms. In the first case, it is not relevant to me that Google has classified Adventures of Huckleberry Finn under "wild plants" or whatever. I know perfectly well what it is, and just wanted to find a quote I remember.

In the second case, I might search for mentions of the Columbia River in books published before 1860. And suppose a faulty date in Google's database brings me to something written after 1860. So what? Surely when I click on the link and find myself reading Sherman Alexie instead of Lewis and Clark, I will notice the fact. (Actually I just did the search and on the first 10 pages of results I don't see any errors at all. Take that, Nunberg.)

So for which scholars exactly is Google Book Search a "disaster"? Nunberg cites "linguists and assorted wordinistas" who are "adrenalized" at the thought of data mining to "track the way happiness replaced felicity in the 17th century, quantify the rise and fall of propaganda or industrial democracy over the course of the 20th century, or pluck out all the Victorian novels that contain the phrase 'gentle reader.'" But who does this? OK, I know that people do it, but most data mining of this type has always struck me as more of a parlour trick than actual scholarship.

The other thing Nunberg ignores is that metadata is not that hard to fix. Google already provides a "feedback" button on every virtual page so readers can report unreadable or missing pages. If we howl loud enough we could easily see similar feedback mechanisms on the "More book information" page so we could correct names and dates and categories.

Nunberg is absolutely correct to recognize the monumental importance to scholars of the Google Book Search project. It is vital that scholars take a critical stance that will push Google to improve the project and make it even more useful. His article is a valuable push in that direction.

UPDATE 9/3/09: Reader Ed points out that Geoff Nunberg also posted a nicely illustrated version of his article on the blog Language Log, and got a brief response in the comments from Jon Orwant, who manages the metadata at Google Books.

The Historical Treasure-Trove that is the UW Libraries Digital Collections

The University of Washington Libraries Digital Collections "features materials such as photographs, maps, newspapers, posters, reports and other media from the University of Washington Libraries, University of Washington Faculty and Departments, and organizations that have participated in partner projects with the UW Libraries. The collections emphasize rare and unique materials."

There are a lot of useful resources here--check out the Special Collections section for seventy-five wonderful digitized collections such as Alaska-Yukon-Pacific Exposition Photographs, Centralia Massacre and the Industrial Workers of the World Collection, 1912-1932, and the Vietnam War Era Ephemera Collection (from which the John Wayne poster featured here is taken).

Good collection descriptions accompany all the collections and some have considerable additional material. American Indians of the Pacific Northwest, for example, has introductory essays, maps, and "bibliographies and links to related text and images as well as study questions that K-12 teachers may use as they develop curricula in their schools." They also have a blog to help users keep up-to-date with changes in their digital collections.

As wonderful as the site is, it could be more interactive. The images are difficult to download and save (which may be a feature rather than a bug!) and there is no Web 2.0 interaction. There is not even a way for users to flag an image that is obviously misidentified.

(Image: Cropped detail from John Wayne in No Substitute for Victory, created by the "Greater Seattle TRAIN Committee (To Restore American Independence Now)" and "Students for Victory in Viet Nam, Seattle Committee.")

Duke Digital Collections iPhone App

I am not sure if this is an oddity or a glimpse of the future, but Duke Digital Collections has developed what I believe is the first iPhone app for a digital archive. The app is really nicely designed and takes advantage of many of the iPhone's capabilities. Here is the demo they put up on YouTube:



And yet--I can't see using my iPhone to do historical research. What do you think, dear readers? Is this a very impressive novelty or something more?

[Hat tip to the frequently valuable Duke Digital Collections Blog--a nice example of an institutional blog.]