Re: What is Digital Library? Eric Hellman
I've been meaning to write up my thoughts that this earlier exchange
provoked, but I've been busy; I'm taking the opportunity of a six
hour plane flight to finally do so.
- From: Eric Hellman <email@example.com>
- To: Simon Tanner <firstname.lastname@example.org>
- Cc: email@example.com, "Steve Knight" <Steve.Knight@natlib.govt.nz>, "Marilyn Deegan" <firstname.lastname@example.org>
- Subject: Re: What is Digital Library?
- Date: Tue, 6 Jul 2004 02:58:14 -0400
- References: <email@example.com> <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org>
I know what collection development is in a traditional library,
what's not obvious to me is what collection development is in a
digital library. If your library subscribes to LexisNexis, is that
part of your digital collection? If you try to figure out how best to
expose that content on a library website, is that developing your
digital collection? If you cancel your subscription, is that
What struck me about the "Digital Futures" definition is that it
described characteristics of digital libraries rather than offering a
definition that could be used to determine whether something was or
was not a digital library. I had been hoping to flush out from the
list a sense that to be a library, a collection had to have a purpose
rather than just a management.
Simon's second response is great, and the one phrase that struck me
was "defined user community". I argue that the definition of a
digital library is fundamentally tied to this phrase. A digital
library is "Any collection of digital resources managed with the
primary goal of maximizing the collection's utility to a defined user
community."
There, I've said it. Hope it's not terribly original.
So let's take this razor and apply it to the examples I called DLLR's.
1. Amazon.com. Amazon is managed with the primary purpose of getting
people to buy things. The methods it uses are identical to methods
that would be used by a library trying to maximize the utility of its
collection (people buy things at least in part because the things are
useful, so Amazon and a good digital library share many management
objectives). Even so, because its primary goal is sales, Amazon.com is
NOT A DIGITAL LIBRARY.
2. Google. You could make two cases to argue that Google is a digital
library. The first case takes the viewpoint that Google is a collection
of links and harvested data that is managed to increase the utility
of that collection. But that point of view elides the fact that the
collection as so defined is not very useful in the absence of the
Google machinery and the resources that the links point to: the
entire public internet.
The second argument is that Google transforms the public internet
into a digital library by virtue of Google's goal of making that
information more accessible. But the definition above says that the
public internet, even with Google attached, is NOT A DIGITAL LIBRARY
because the collection as a whole is not managed with a primary
purpose of maximizing its utility to any defined user community.
Rather, each resource is managed for purposes of its own devising.
Nonetheless, Google must be regarded as a digital library tool,
because it has an ostensible purpose consistent with digital
I think that as our understanding of the internet grows, we will
begin to realize that it is a complex interacting system analogous to
an ecosystem or a market economy. Maybe we can call it an ecobrary.
3. DMOZ. What is the purpose of DMOZ? "To catalog the internet", I
suppose. "To be a useful resource", perhaps. Is there a defined user
community? Not really. DMOZ is more of an organism that lives in the
ecobrary: it is successful if it perpetuates itself, and it is good if
it is not evil. Whatever. A card catalog is not a library; DMOZ is NOT A
DIGITAL LIBRARY.
4. Yahoo. Yahoo has a well defined purpose, to make money for its
shareholders. Like Amazon, it uses many digital library tools to
increase its value, but it fails the primary purpose of serving its
community first. Yahoo is NOT A DIGITAL LIBRARY.
I'm not one who believes that libraries cannot be profit-making
companies. If a business model aligns the management of the
collection with the interests of shareholders, then being a library
and making money can be compatible.
5. Science Direct. Like Yahoo, Science Direct is managed to make
money for Elsevier shareholders. It serves both the author community
and the library community and does not make decisions about
collection development with the user utility foremost, nor should it.
Science Direct is NOT A DIGITAL LIBRARY.
An interesting question arises if a library's collection consists of
only Science Direct. By obtaining ScienceDirect, and then making it
available to a defined user community, a digital library is no less a
digital library for the fact of having only ScienceDirect. This is a
clear example of the notion that intent and purpose are what make
something a digital library.
6. PubMed. PubMed serves a defined user community, the medical
research community. It manages its resources to maximize their
utility to its defined community, so as to serve the taxpayers of the
USA. PUBMED IS A DIGITAL LIBRARY.
7. arXiv. arXiv's purpose is to maximize communication among
scientists. There are many things that arXiv could do to increase the
utility of its digital collection at the expense of some reduced
freedom of communication, but since arXiv is NOT A DIGITAL LIBRARY,
it does none of them.
8. The Internet Archive. The Internet Archive has a defined user
community: scholars of the future. It manages its collection to
maximize its value to those future users. The Internet Archive IS A
DIGITAL LIBRARY.
9. download.com. download.com exists to make money on advertising. It
is managed to increase its site traffic, and thus its ad revenue.
download.com is NOT A DIGITAL LIBRARY.
10. JSTOR. JSTOR is a non-profit organization that exists to make
available digitally the backfiles of significant publications. It
serves both publishers and subscriber libraries. So, in a sense, the
subscribers are the defined user group. JSTOR probably IS A DIGITAL
LIBRARY.
11. loc.gov. I don't really know enough to make a proper
assessment of loc.gov, but to the extent that it is a catalog of a
physical library, my guess is that loc.gov is NOT A DIGITAL LIBRARY.
12. OCLC FirstSearch. I don't know enough about the management goals
to say for sure, but it seems that OCLC FirstSearch probably IS A
DIGITAL LIBRARY.
13. iTunes. My collection of music, managed using iTunes to serve a
very well defined community (me), IS A DIGITAL LIBRARY. The iTunes
service is managed with the purpose of selling songs. The iTunes
service is NOT A DIGITAL LIBRARY.
14. Wikipedia. Wikipedia is not a collection, it is a collective. It
is NOT A DIGITAL LIBRARY.
At 9:10 AM +0100 6/3/04, Simon Tanner wrote:
Eric raises an important point and it certainly got me thinking, but
maybe my quote took for granted the understanding that librarians
have of phrases like collection development. So I will rise to his
challenge to argue that iTunes is not a digital library.
Looking at the American Library Association's definition of
"A term which encompasses a number of activities related to the
development of the library collection, including the determination and
coordination of selection policy, assessment of needs of users and
potential users, collection evaluation, identification of collection
needs, selection of materials, planning for resource sharing,
collection maintenance, and weeding." (ALA Glossary of Library &
Now at this point it is easy to point at Amazon or iTunes and say
"but they do reach a community, select stuff and maintain
collections" - but this is a false analogy. Just as bookshops are
not libraries, neither Amazon nor iTunes is a digital library. They
may hold similar stuff but they do not do the same things with it.
They do not have a _defined_ user community, they do not
discriminate on behalf of that community (i.e. their policy is - "if
we can put a price on it, we will sell it") and they certainly do
not maintain the collection past the point of significant sales. I
cannot get a piece of information from iTunes that is not held in
their repository nor will they send me something I want that will
cost them to provide it and me nothing to receive it - but I can get
those things from libraries.
Whether they are digital or not these shops/resources fail the first
test - they are NOT libraries. For me at least, libraries are not
just a collection of information resources - they are the services
and people that revolve around those collections. Great librarians
make great libraries. The librarian's information expertise, the
collection and the services provided are what define a library not
just the building or the fact of the collection. Amazon and iTunes
may be great shops because of the commercial services they provide,
but they are not libraries.
However, dictionaries etc. have this unfortunate habit of defining
libraries as "buildings containing books" or "repositories of books"
and this is a widely held public misconception of what libraries are
and do. When was the last time baseball was defined as a stadium
that contains people rather than the act of playing the game?
Libraries need to first and foremost clearly delineate what they are
- then digital libraries will be easier to define because they won't
get mixed up with other digital things just because they are digital.
At 23:30 02/06/2004, you wrote:
While it's easy to describe what we think a digital library should
be, it's harder to come up with a definition which enables us to
tell whether a given "thing" is or is not a digital library.
Here's a list of things that in one way or another have attributes
of a digital library, try testing your favorite definition against
what your gut tells you (along with my answers to the 4 items from
1. Amazon.com YYYY
2. Google YNYN
3. DMOZ YYYN
4. Yahoo YYYY
5. science direct YYYY
6. pubmed YYYY
7. arXiv YNYY
8. the internet archive YNNY
9. download.com YYYY
10. jstor YYYY
11. loc.gov YYNN
12. OCLC firstsearch YYYN
13. iTunes YYYY
14. Wikipedia YNYY
My Y's and N's are available for dispute, but the question becomes
"what are 'principles of collection development' as applied to
digital libraries?" or "how long is long-term?".
I would be interested to entertain arguments that, for example,
iTunes is not a digital library.
At 8:44 AM +0100 6/1/04, Simon Tanner wrote:
With apologies to the list for quoting my own work!
In our book Digital Futures, Marilyn Deegan and I came up with
this observation on digital libraries:
"We would like to propose some principles that we think perhaps
characterize something as a digital library rather than any other
kind of digital collection... These are:
1. A digital library is a managed collection of digital objects
2. The digital objects are created or collected according to
principles of collection development
3. The digital objects are made available in a cohesive manner,
supported by services necessary to allow users to retrieve and
exploit the resources just as they would any other library
4. The digital objects are treated as long-term stable resources
and appropriate processes are applied to them to ensure their
quality and survivability."
I also think that you could do a lot worse than consider
Ranganathan's definition of a library as still extremely relevant
to this debate.
Hope this helps.
Eric Hellman, President
Openly Informatics, Inc.
2 Broad St., 2nd Floor, Bloomfield, NJ 07003
email@example.com
tel 1-973-509-7800 fax 1-734-468-6216
http://www.openly.com/1cate/ 1 Click Access To Everything