Mark Gross, President, DCL
Suzanne Daulerio, Product Manager, ASTM International

Discovery vendors and indexing services each have precise requirements on how and when content such as journals, eBook collections and metadata (including keywords tagged as titles, authors and summaries) should be delivered for maximum exposure. Many small-to-medium-size publishers struggle with meeting all these requirements, which can result in big challenges when trying to make their content more discoverable.

Getting discovered is an ongoing problem for professional societies and associations, which typically publish less of the traditional content such as journal papers and proceedings. The majority of their content may be “standards.” Developed by standards development organizations (SDOs), standards are often used in the engineering world to safely manufacture and test products. It is important to note that these standards usually have no authors in the traditional sense. Not only do these societies and associations have less content than the big publishing organizations, but their content is also typically not in a format easily recognized and indexed by discovery vendors and indexing services. Yet rapid exposure and dissemination of this content are critical for safety and health.

The result is that content from smaller publishers may be last in line to be indexed, which causes delays in finding recently published technical documents. This in turn can lead to problems for researchers as well as for authors who receive recognition based on the number of citations to their published work.

With all this in mind, let us look at the challenges faced by ASTM International (ASTM), a premier service provider in the development and delivery of voluntary consensus standards – for example, the F963 Standard Consumer Safety Specification for Toy Safety – technical books, and journals. Originally, ASTM provided one metadata format for all discovery vendors and indexing services, and it relied on these companies to select the fields that they needed. In recent years, it became apparent that this approach would no longer work, as each provider wanted its feed to contain only the information it required, delivered at different intervals for each provider. ASTM also required the companies to log in to an ASTM-hosted FTP site to pick up the data. This became untenable, as the only way to ensure that the service providers were receiving the data in a timely manner was to push the data to them.

To solve this problem, ASTM approached Data Conversion Laboratory (DCL), a leader in automated content and metadata conversion, extraction, and delivery. The result of this partnership was the creation of the DCL Discovery Bridge.

Inventing the DCL Discovery Bridge

DCL Discovery Bridge is a hosted, subscription-based service that allows publishers to offload the burden of frequently revising metadata files in response to technological changes in the content management systems of the discovery vendors and indexing services. The aim is consistent and timely delivery of publishers’ content to every discovery vendor and indexing service with which they work.

DCL Discovery Bridge is built around a hub-and-spoke model in which some of the spokes are inputs (content) from the publisher (in this case, ASTM) and others are delivery channels to the various discovery vendors and indexing services. The heart of the process is the hub: a set of DCL-hosted services that automatically pick up client content (full-text PDFs for journal articles, books, magazine articles, technical papers, conference proceedings, standards documents, etc.), along with the metadata associated with each content object from the publisher. Hub-based services normalize the incoming metadata into a master metadata structure. From that structure, they create discovery vendor-specific feeds of content objects, including full text, and provide metadata in the format and structure each vendor can most easily integrate into its platform. The feeds are automatically delivered to vendors for ingestion into their platforms.
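To make the normalization step concrete, here is a minimal sketch in Python of how incoming publisher metadata might be mapped into a single master structure. It is an illustration only, not DCL's implementation: the MasterRecord fields, the FIELD_ALIASES table, and the normalize function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    """Hypothetical master metadata structure held by the hub."""
    doc_id: str
    title: str
    doc_type: str
    authors: list[str] = field(default_factory=list)
    summary: str = ""


# Assumed mapping from publisher field names to master field names.
FIELD_ALIASES = {
    "designation": "doc_id",
    "id": "doc_id",
    "name": "title",
    "title": "title",
    "type": "doc_type",
    "authors": "authors",
    "abstract": "summary",
    "scope": "summary",
}


def normalize(raw: dict) -> MasterRecord:
    """Map one incoming publisher record onto the master structure."""
    mapped = {}
    for key, value in raw.items():
        target = FIELD_ALIASES.get(key.strip().lower())
        if target is not None:
            mapped[target] = value

    # Standards usually have no authors, so an empty list is valid.
    authors = mapped.get("authors") or []
    if isinstance(authors, str):
        authors = [a.strip() for a in authors.split(";") if a.strip()]

    return MasterRecord(
        doc_id=str(mapped["doc_id"]).strip(),
        title=str(mapped["title"]).strip(),
        doc_type=str(mapped.get("doc_type", "unknown")).strip().lower(),
        authors=list(authors),
        summary=str(mapped.get("summary", "")).strip(),
    )
```

Once every content object is held in one structure like this, each vendor feed can be derived from the same source rather than maintained by hand.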

Meeting Discovery Vendor Demands

ASTM was facing fundamental challenges in delivering its content to discovery and indexing vendors. This was a process problem; neglecting it could have resulted in a slow degradation of productivity and potentially revenue as well. Working with DCL and its Discovery Bridge solution, ASTM is now able to “slice-and-dice” its data so that its vendors receive only the data they need, how they need it, and when they need it. Meeting those multiple specifications can be daunting: some discovery services want only metadata; some want metadata and PDFs; and one has even requested the development of an API (application programming interface) for extracting the documents that it wants.
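The "slice-and-dice" idea can be sketched as a per-vendor profile applied to a shared set of records. The Python below is a hedged illustration under assumed names: VendorProfile, build_feed, and the pdf_path key are hypothetical and do not describe the actual Discovery Bridge interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorProfile:
    """Hypothetical description of what one vendor wants to receive."""
    name: str
    fields: tuple[str, ...]   # metadata fields the vendor asked for
    include_pdf: bool         # True if the vendor also wants full text


def build_feed(records: list[dict], profile: VendorProfile) -> list[dict]:
    """Return only the fields (and optionally PDFs) this vendor requested."""
    feed = []
    for record in records:
        # Keep just the requested metadata fields that the record has.
        item = {name: record[name] for name in profile.fields if name in record}
        # Attach the PDF location only for vendors that want full text.
        if profile.include_pdf and record.get("pdf_path"):
            item["pdf_path"] = record["pdf_path"]
        feed.append(item)
    return feed
```

With this shape, supporting a new vendor means adding one profile rather than writing a new export, which is the kind of saving a medium-size publisher would be looking for.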

As a medium-size publisher, ASTM does not have the resources to develop all of these combinations and permutations. The partnership with DCL gave ASTM the tools it needed to fulfill the specific and varied needs of the discovery vendors with which it works. The details of this project might be specific to ASTM, but the overall problem and solution laid out here apply to any publisher trying to get its content discovered.

If your organization does not have the expertise or resources to meet the varied requirements of discovery vendors and indexing services, it might be time to explore solutions outside of your organization. ASTM did just that. The ASTM/DCL partnership has resulted in a decrease in support and help desk queries from ASTM's academic subscribers trying to find content. This suggests that DCL’s Discovery Bridge solution is helping ASTM’s end users by saving them research time. ASTM’s team, for its part, now spends less time helping subscribers find what they need.

Suzanne Daulerio is Product Manager at ASTM International (ASTM), a standards development organization in West Conshohocken, PA. More than 30,000 volunteer members of ASTM work in an open and transparent process to deliver 12,700+ test methods, specifications, guides and practices that support industries and governments worldwide. Suzanne is responsible for the development and marketing of ASTM Compass®, the flagship product for delivery of ASTM technical content and services, and she is also liaison with discovery and indexing service providers.

Mark Gross, President of Data Conversion Laboratory (DCL), is a recognized authority on XML implementation and document conversion. Prior to founding DCL in 1981, Mark was with the consulting practice of Arthur Young & Co. Mark has a BS in Engineering from Columbia University and an MBA from New York University. He has also taught at the New York University Graduate School of Business, the New School, and Pace University. He is a frequent speaker on the topic of automated conversions to XML and SGML.
