Digital Libraries Research mailing list
Re: What is Digital Library? Eric Hellman
  Subject: Re: What is Digital Library?
  Date: Tue, 6 Jul 2004 02:58:14 -0400
I've been meaning to write up my thoughts that this earlier exchange provoked, but I've been busy; I'm taking the opportunity of a six-hour plane flight to finally do so.

I know what collection development is in a traditional library; what's not obvious to me is what collection development is in a digital library. If your library subscribes to LexisNexis, is that part of your digital collection? If you try to figure out how best to expose that content on a library website, is that developing your digital collection? If you cancel your subscription, is that management?

What struck me about the "Digital Futures" definition is that it described characteristics of digital libraries rather than offering a definition that could be used to determine whether something was or was not a digital library. I had been hoping to flush out from the list a sense that to be a library, a collection had to have a purpose rather than just a management.

Simon's second response is great, and the one phrase that struck me was "defined user community". I argue that the definition of a digital library is fundamentally tied to this phrase. A digital library is "Any collection of digital resources managed with the primary goal of maximizing the collection's utility to a defined user community".

There, I've said it. Hope it's not terribly original.

So let's take this razor and apply it to the examples I called DLLRs.

1. Amazon is managed with the primary purpose of getting people to buy things. Although the methods it uses are identical to methods that would be used by a library trying to maximize the utility of its collection (people buy things at least in part because the things are useful, so there are many management objectives in common between Amazon and a good digital library), Amazon is NOT A DIGITAL LIBRARY.

2. Google. You could make two cases to argue that Google is a digital library. The first case takes the viewpoint that Google is a collection of links and harvested data that is managed to increase the utility of that collection. But that point of view elides the fact that the collection as so defined is not very useful in the absence of the Google machinery and the resources that the links point to: the entire public internet.
The second argument is that Google transforms the public internet into a digital library by virtue of Google's goal of making that information more accessible. But the definition above says that the public internet, even with Google attached, is NOT A DIGITAL LIBRARY because the collection as a whole is not managed with a primary purpose of maximizing its utility to any defined user community. Rather, each resource is managed for purposes of its own devising. Nonetheless, Google must be regarded as a digital library tool, because it has an ostensible purpose consistent with digital librarianship.

I think that as our understanding of the internet grows, we will begin to realize that it is a complex interacting system analogous to an ecosystem or a market economy. Maybe we can call it an ecobrary.

3. DMOZ. What is the purpose of DMOZ? "To catalog the internet", I suppose. "To be a useful resource", perhaps. Is there a defined user community? Not really. DMOZ is more of an organism that lives in the ecobrary: it is successful if it perpetuates itself, and it is good if it is not evil. Whatever. A card catalog is not a library, and DMOZ is NOT A DIGITAL LIBRARY.

4. Yahoo. Yahoo has a well-defined purpose: to make money for its shareholders. Like Amazon, it uses many digital library tools to increase its value, but it fails the primary purpose of serving its community first. Yahoo is NOT A DIGITAL LIBRARY.

I'm not one who believes that libraries cannot be profit-making companies. If a business model aligns the management of the collection with the interests of shareholders, then being a library and making money can be compatible.

5. Science Direct. Like Yahoo, Science Direct is managed to make money for Elsevier shareholders. It serves both the author community and the library community and does not make decisions about collection development with the user utility foremost, nor should it. Science Direct is NOT A DIGITAL LIBRARY.

An interesting question arises if a library's collection consists of only Science Direct. By obtaining Science Direct, and then making it available to a defined user community, a digital library is no less a digital library for the fact of having only Science Direct. This is a clear example of the notion that intent and purpose are what make something a digital library.

6. Pubmed. Pubmed serves a defined user community, the medical research community. It manages its resources to maximize their utility to its defined community, so as to serve the taxpayers of the USA. PUBMED IS A DIGITAL LIBRARY.

7. arXiv. arXiv's purpose is to maximize communication among scientists. There are many things that arXiv could do to increase the utility of its digital collection at the expense of some reduced freedom of communication, but since arXiv is NOT A DIGITAL LIBRARY, it does none of them.

8. The Internet Archive. The Internet Archive has a defined user community: scholars of the future. It manages its collection to maximize its value to those future users. The Internet Archive IS A DIGITAL LIBRARY.

9. exists to make money on advertising. It is managed to increase its site traffic, and thus its ad revenue. It is NOT A DIGITAL LIBRARY.

10. jstor. jstor is a non-profit organization that exists to make available digitally the backfiles of significant publications. It serves both publishers and subscriber libraries. So, in a sense, the subscribers are the defined user group. jstor probably IS A DIGITAL LIBRARY.

11. I don't really know enough to make a proper assessment of, but to the extent that it is a catalog of a physical library, my guess is that it is NOT A DIGITAL LIBRARY.

12. OCLC FirstSearch. I don't know enough about the management goals to say for sure, but it seems that OCLC FirstSearch probably IS A DIGITAL LIBRARY.

13. iTunes. My collection of music, managed using iTunes to serve a very well-defined community (me), IS A DIGITAL LIBRARY. The iTunes service is managed with the purpose of selling songs. The iTunes service is NOT A DIGITAL LIBRARY.

14. Wikipedia. Wikipedia is not a collection; it is a collective. It is NOT A DIGITAL LIBRARY.

At 9:10 AM +0100 6/3/04, Simon Tanner wrote:
Eric raises an important point and it certainly got me thinking, but maybe my quote took for granted the understanding that librarians have of phrases like collection development. So I will rise to his challenge to argue that iTunes is not a digital library.

Looking at the American Library Association's definition of collection development:
"A term which encompasses a number of activities related to the development of the library collection, including the determination and coordination of selection policy, assessment of needs of users and potential users, collection evaluation, identification of collection needs, selection of materials, planning for resource sharing, collection maintenance, and weeding." (ALA Glossary of Library & Information Science)

Now at this point it is easy to point at Amazon or iTunes and say "but they do reach a community, select stuff and maintain collections" - but this is a false analogy. Just as bookshops are not libraries, neither Amazon nor iTunes is a digital library. They may hold similar stuff but they do not do the same things with it. They do not have a _defined_ user community, they do not discriminate on behalf of that community (i.e. their policy is - "if we can put a price on it, we will sell it") and they certainly do not maintain the collection past the point of significant sales. I cannot get a piece of information from iTunes that is not held in their repository, nor will they send me something I want that will cost them to provide it and me nothing to receive it - but I can get those things from libraries.

Whether they are digital or not, these shops/resources fail the first test - they are NOT libraries. For me at least, libraries are not just a collection of information resources - they are the services and people that revolve around those collections. Great librarians make great libraries. The librarian's information expertise, the collection and the services provided are what define a library, not just the building or the fact of the collection. Amazon and iTunes may be great shops because of the commercial services they provide, but they are not libraries.

However, dictionaries etc. have this unfortunate habit of defining libraries as "buildings containing books" or "repositories of books", and this is a widely held public misconception of what libraries are and do. When was the last time baseball was defined as a stadium that contains people rather than the act of playing the game? Libraries need to first and foremost clearly delineate what they are - then digital libraries will be easier to define because they won't get mixed up with other digital things just because they are digital.

Simon Tanner
Director, KDCS

At 23:30 02/06/2004, you wrote:
While it's easy to describe what we think a digital library should be, it's harder to come up with a definition which enables us to tell whether a given "thing" is or is not a digital library.

Here's a list of things that in one way or another have attributes of a digital library; try testing your favorite definition against what your gut tells you (along with my answers to the 4 items from "Digital Futures"):

DLLR (digital-library-like-resource)
2. Google YNYN
4. Yahoo YYYY
5. science direct YYYY
6. pubmed YYYY
7. arXiv YNYY
8. the internet archive YNNY
10. jstor YYYY
11. YYNN
12. OCLC firstsearch YYYN
13. iTunes YYYY
14. Wikipedia YNYY

My Y's and N's are available for dispute, but the question becomes "what are 'principles of collection development' as applied to digital libraries?" or "how long is long-term?".

I would be interested to entertain arguments that, for example, iTunes is not a digital library.

At 8:44 AM +0100 6/1/04, Simon Tanner wrote:
With apologies to the list for quoting my own work!

In our book Digital Futures, Marilyn Deegan and I came up with this observation on digital libraries:

"We would like to propose some principles that we think perhaps characterize something as a digital library rather than any other kind of digital collection... These are:

1. A digital library is a managed collection of digital objects
2. The digital objects are created or collected according to principles of collection development
3. The digital objects are made available in a cohesive manner, supported by services necessary to allow users to retrieve and exploit the resources just as they would any other library materials
4. The digital objects are treated as long-term stable resources and appropriate processes are applied to them to ensure their quality and survivability."

I also think that you could do a lot worse than consider Ranganathan's definition of a library as still extremely relevant to this debate.

Hope this helps.



