The New Numbers Racket
By Andrew K. Pace
American Libraries Columnist
Head of Systems, North Carolina State University Libraries, Raleigh.
Column for October 2004
While it might be said that not all librarians love automation or the all-consuming power technology holds over their patrons, it’s still safe to say that most systems librarians love statistics. To some libraries, numbers are the raison d’être: gate counts are down, virtual reference numbers are through the roof, ARL rankings are up (or down), and of course everyone loves a system that can tell them all the books in the Dewey 820s published before 1980 that have circulated fewer than three times.
Numbers, ranks, and statistics consume us. It’s puzzling, then, why most library systems vendors have failed to give us good numbers for so long. But vendors are beginning to dust off their relational databases and deliver meaningful data, statistics, and reports to libraries. Most of them are doing this through partnerships with third-party business intelligence (BI) software developers.
Because these new data-mining and data-warehousing tools are standalone or externally developed products, most vendors are offering them as optional (read “extra cost”) modules. Leave it to vendors to come up with a way to charge libraries for extracting the libraries’ own data from their own systems, for which the libraries already pay the vendor a great deal of money.
Fortunately for libraries, more expensive technology is usually (would that it were always) better technology. Most corporations are better at crunching numbers than libraries. They must be, right? They call them BI analytical tools, while we just call them reports.
Some excellent statistical and report software is now available to libraries through their vendors. The list here is but a sampling, including some of the newest products on the market.
Dynix—Horizon Web Reporter. Powered by the MicroStrategy Business Intelligence Platform, Web Reporter includes several standard reports, as well as a “Director’s Dashboard” for simple access to data without knowledge of structured query language (SQL). Role-based reporting allows for authenticated access to specific reports, and the data can be queried in real time.
Ex Libris—ARC. ARC stands for Aleph 500 Reporting Center. Like many of the others, it is a web-based statistics and reporting tool, marketed as a plug-in module for Aleph ILS customers. Formerly known as Aleph Data Warehouse, the reports module is powered by Brio software.
Innovative—Web Management Reports. Web Management Reports was one of the first Java web clients available from an ILS vendor. The Innovative product is also unique because it was built and designed internally as a web-based add-on to Innovative’s popular and powerful reports module.
Sagebrush—Analytics. Powered by SwiftKnowledge, Analytics puts a lot of emphasis on nonquery language tools that attempt to turn what into why. The tool includes “guided inquiry” Q&A, graphics generation, and more. Moreover, Analytics was designed to help analyze some of the requirements of the federal No Child Left Behind law, requiring integration with external datasets, such as student records and test results.
Sirsi—Director’s Station. “Colorful” and “easy-to-read” are the marketing buzzwords that drive this new product niche, and Sirsi’s new Director’s Station module is no exception. Circulation analysis uncovers trends based on established collecting practices and item and patron categorization. With SwiftKnowledge as its back end (like the Sagebrush product), the software features automated alerts, “drill down” data mining, and graphics creation.
TLC—CARL.Decision. CARL.Decision runs on a Windows/Oracle platform leveraging the data-mining tools of Oracle. It has an easy-to-use Windows interface for creation, display, and broadcast delivery of reports. A professional version of the software—including archiving and more data-mining features—will be available in late 2004.
VTLS—InfoStation. Generally less expensive than some of the other add-on products, InfoStation also includes circulation notice generation and report scheduling through a web-based module. InfoStation uses Oracle database views and the Virtua ILS data to create the reports and statistics.
These short descriptions don’t contain much that cannot be found in a marketing flier or short product review, and they are not meant to be exhaustive. Libraries will need to carefully evaluate several factors. If your library is constantly evaluating “what if” scenarios without the proper data to back up assertions, then you need a data-mining tool. If your library lacks the SQL expertise to extract data and put it in meaningful form effectively and efficiently, then these tools might be right for you.
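For a sense of what that SQL expertise amounts to, here is a minimal sketch of the weeding question posed at the top of this column: books in the Dewey 820s, published before 1980, that have circulated fewer than three times. Everything in it is an assumption made for illustration. The `items` table, its column names, the `weeding_candidates` function, and the sample titles are all invented, and SQLite merely stands in for whatever database sits behind a given ILS, whose real schema will look quite different.

```python
import sqlite3

# Hypothetical, simplified schema; a real ILS spreads this data over many tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    " title TEXT, dewey REAL, pub_year INTEGER, total_circs INTEGER)"
)

# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        ("Essays on Milton", 821.4, 1962, 1),
        ("Victorian Poetry Survey", 821.8, 1975, 7),
        ("Modern British Drama", 822.9, 1991, 0),
        ("Shakespeare Criticism", 822.33, 1958, 2),
        ("American Short Fiction", 813.5, 1970, 0),
    ],
)


def weeding_candidates(conn, low=820, high=830, before_year=1980, max_circs=3):
    """Return titles in a Dewey range, published before a given year,
    that have circulated fewer than max_circs times."""
    rows = conn.execute(
        "SELECT title FROM items"
        " WHERE dewey >= ? AND dewey < ?"
        " AND pub_year < ? AND total_circs < ?"
        " ORDER BY dewey",
        (low, high, before_year, max_circs),
    )
    return [title for (title,) in rows]


candidates = weeding_candidates(conn)
print(candidates)
```

The query itself is four lines. The hard part in practice is knowing where the call numbers, publication dates, and circulation counts actually live in a production system, which is the knowledge these add-on products are selling.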
There are some differentiating factors to keep in mind. Consider the long-term viability of the third-party partner chosen by the ILS vendor. Also consider whether the product works from a backup copy of your data, which can mean extensive data migration, or whether your library requires real-time analysis. Sometimes getting the data in an easy-to-use format is worth the effort to copy the data. Backup copies of ILS data might be desirable if you do not want data analysis to interfere with real-time system performance.
Since data mining and warehousing usually involve analysis from myriad data repositories, libraries should shop for a product that has the potential to examine other sources, such as proxy servers, digital asset management systems, ILL statistics, and web-server logs. With larger and varied sets of data, libraries have the potential to discover more interesting trends and service relationships, and can use the data to predict future service needs. We finally have the tools we need to find the numbers. In some cases, the data-mining product will even tell us what those numbers mean. Now all we need is some software that tells us what to do about them.
Open Source Watch
Cornell University Library is developing an open source publication management system that will give authors and publishers a new and more affordable way to publish scholarly research directly to the Web. DPubS (Digital Publishing System) is freely available to independent publishers, university presses, and libraries. The effort will be helped by a $670,000 grant from the Andrew W. Mellon Foundation.
Contracts and Agreements
Auto-Graphics sales of Agent:
The Kansas State Library in Topeka expands its Auto-Graphics implementation with Agent for federated searching.
Dynix sales of Horizon:
Tea Tree Gully Library and Charles Sturt Library Service, both near Adelaide in South Australia; replacing Dynix Classic and Geac BookPlus. Palo Alto (Calif.) City Library, replacing Dynix Classic.
Endeavor sales of Voyager:
London School of Economics (U.K.), replacing Sirsi. Southwestern University in Georgetown, Texas, replacing Dynix Classic. Point Park University in Pittsburgh, replacing Innovative Millennium.
EOS International sales of EOS.Web:
The United States Postal Service, an EOS International customer, will implement EOS.Web for indexing and access of its digital collections.
GIS sales of Polaris:
North Olympic Library System in Port Angeles, Washington, replacing Dynix Classic.
Sirsi sales of Unicorn:
Riverside County Library System and San Bernardino County Library, both in California, combine systems; upgrading from Sirsi DRA Classic to Unicorn to serve over 2.2 million patrons.
TLC sales of Library.Solution:
City School District of Albany (N.Y.), replacing Follett. Baltimore City Public School System, with WebFeat federated search, replacing Follett, Sagebrush Athena, and Winnebago.
GIS has released a new version of the Polaris ExpressCheck that enables fully integrated communication between the self-check station and the Polaris automated system, without the need for any intervening protocol such as SIP or NCIP. The system can use RFID tags or regular barcodes. The Fayetteville (Ark.) Public Library is scheduled to install the system in its new library in October, using RFID tags from Biblioteca.
Ex Libris has released Verde, its new electronic resource management (ERM) tool. Building on the SFX link server software, Verde is a new management system designed to manage and provide better access to libraries’ expanding collections of electronic resources.
Acquisitions and Alliances
OCLC has acquired the 24/7 Reference Service from the Metropolitan Cooperative Library System in Pasadena, California. OCLC will combine the software developed by the California consortium with its own QuestionPoint system to create a more powerful virtual reference software product. According to OCLC, QuestionPoint is used in more than 1,000 libraries in 20 countries; 24/7 Reference is available in 500 libraries. Customers of both systems continue to receive access under terms of their current contracts. A new combined set of tools and services should be available by early 2005.
MuseGlobal has signed an agreement to combine its metasearch technology with Ebrary’s digital collection of books, digital images, and reports. Ebrary will give MuseGlobal resource discovery access through an XML-based API, giving federated search access to customers using the MuseSearch library portal.
Innovative Interfaces has renewed its relationship with MuseGlobal with a three-year agreement to continue use of the MuseSearch metasearch technology. Under the agreement, MuseGlobal will continue to provide the underlying search technology that supports MetaFind, Innovative’s federated search product offering.