Technically Speaking

By Andrew K. Pace
American Libraries Columnist

Head of information technology,
North Carolina State University Libraries

February 2007

Beam Me Up, Enterprise

It’s not your dad’s search engine

I don’t know why, but I am still somewhat amazed at the not-so-sudden but widespread discussion of “Enterprise Search.” The terminology makes me wonder: If libraries had called it that sooner, might we be farther along than we are with it? (Similarly, if we had only called cataloging “metadata research,” might libraries have gotten more funding?)

What is Enterprise? It seems like the word itself means different things to different people. While most libraries are trying to build discovery tools that encompass broad arrays of knowledge, many organizations would be more than satisfied with being able to find one of their own manuals on their intranet. As it turns out, what’s so hard on a large scale hardly gets simpler as the scale decreases.

It took a surprisingly long time for people’s dissatisfaction to be heard by search companies, many of which got their start building web search engines. As it got easier and easier to find things in every hidden corner of the Web (including corners better left undiscovered), no one could find the e-mail or memo from last week. Searching the desktop was still a rather nascent notion even a year ago. What started as an attempt to search the desktop and networked file systems eventually merged with web search and became “Enterprise.”

New kids on the net

The leaders in internet search—Yahoo, Microsoft, Ask, and Google—and those for the desktop—add Copernic and X1 to the three search engines above—are no longer the only companies in town. This market is exploding. Forrester Research predicts that eDiscovery technology spending will grow from $1.4 billion in 2006 to more than $4.8 billion in 2011 as so-called enterprises realize that they have no choice but to prepare for electronic discovery.

Some of the needs come from places you would not suspect. If you’re a law librarian, the elusive “litigation hold” has changed the nature of search and discovery. In a nutshell, organizations and companies have an obligation to preserve relevant information for any potential litigation. New rules effective as of late 2006 relate directly to corporate responsibility for eDiscovery methods.

Recommind, one of many companies specializing in the legal field, is even introducing a new Litigation Hold module into its MindServer product. Enterprise Content Management was the precursor to Enterprise Search, and many of those companies are also working hard to join what will quickly become a battlefield for customer share. Recommind is hardly alone in the eDiscovery space, as companies such as Endeca, Fast, Autonomy, Grokker, Hakia, Vivisimo, and Convera become regular names in even the library industry.

To boldly search

Take it from me: A new search tool can be fun. One can spend hours looking at and analyzing relevance algorithms, search logs, and trends (and anyone who reports to me could corroborate this). Researchers sift through piles of data in search of just the right needle in just the right haystack. But as information architect Ezra Schwartz comments, with the right tool (in this analogy, a magnet), the needle can be found easily. “Most research and academic electronic collections,” Schwartz blogged in October, “are serving patrons who are interested in finding a piece of hay in the haystack.” For this reason search companies will continue to find their edge—like faceted search, graphical interfaces, and new and better search algorithms and relevance routines. The more, the better.

In some circles, eDiscovery is referred to as “information collection technology.” That one should get librarians’ attention. It should also be drawing librarians into the search field in droves. I would hazard a guess that computer science and business schools are preparing themselves for this upward spiral of activity much better than library schools are. I think that library master’s programs have a real opportunity to prepare even more graduates for private sector information jobs that will likely have more vacancies than those created by the graying of the profession in libraries.

The lines between desktop and intranet, and extranet and internet, continue to blur. But one thing is certain: Search is still hot. Will libraries continue to be a hotspot for preservation, description, discovery—for finding? I hope so.


  • IBM and Yahoo have teamed up to offer a free Enterprise Search solution: Omnifind, which is available as a download. Using the open-source Lucene search engine, Omnifind can index over 200 file types in 30 languages. While the notion of IBM giving something away for free has shocked some commentators, it should come as no surprise in a sector that includes Google, Microsoft Live Search, and Copernic, just three of the large companies that give away their desktop and network search applications.
  • On the heels of its acquisition by Cambridge Information Group, ProQuest has introduced a revamped search interface. Its new One Click searching will allow users to link directly from citations to full text regardless of the provider. Resolution of a library’s holdings is done on the fly using the Serials Solutions link resolver database, obviating the need to resolve full-text targets for each citation one by one.