How XML Brings Old Content Back to Life

A chipmaker is opening up its extensive library of technical documents to its customers by using a tool for defining Web data.
John Xenakis, May 16, 2001

(Editor’s Note: This is the third of a three-part series about content management in E-commerce and corporate Web strategies. Today’s story focuses on one company’s use of XML, an important emerging technology in the transfer of information on the Internet. The first article in the series focused on one firm’s integration of its Web strategy with its retail stores. The second story explained how a London law firm used document management tools to serve clients via its Web site.)

Applied Micro Circuits Corp. is a prime example of how companies can take archives of content that are several years old and marry them to their corporate Web strategies. The San Diego-based semiconductor maker has been facing increasing pressure from its clients to make its technical specifications available online. So the company decided to oblige them.

“We decided that managing our technical documents—making them available on the Web for customers and internal use—will lower costs and improve service,” says CFO Bill Bendush.

Pat McGinty, the firm’s CIO, says, “We have pockets and islands of technical data spread all around the company, with each group having its own method of storage, so it’s hard to make them available to everyone.”

Yet with the Web, such an inefficient situation can’t be tolerated for long.

“Because of the Web, our customers are calling us on the carpet to make that documentation available,” McGinty says. “They want more access to the technical documents, and we’d like to be able to do collaborative design with our customers.”

AMCC has just begun a six-month effort to put 100,000 technical documents, some as much as 10 years old, online. The entire implementation is expected to cost just over $1 million.

The company is using Documentum Corp.’s 4i document management software, which includes a repository where word-processing files, engineering designs, and other documents can be stored. The system also has tools to manipulate content. The software can process documents produced by more than 500 creation tools.

The software can convert documents and display them on AMCC’s Web site. McGinty says this feature saves him the trouble of assigning the task of publishing each document to a tech staffer with specialized HTML knowledge.

“Once the data goes into the repository, anyone is able to set up new Web pages and put it out onto the Web,” he says.

According to McGinty, many of the 100,000 documents will simply be stored in the repository with no special handling. The documents will then be published on the Web as “white papers.” Generally speaking, individual sections will not be made available separately.

However, McGinty adds that many of the company’s documents contain sections of data that have to be identified separately, and for this, AMCC is relying on the Extensible Markup Language (XML), a method of tagging data that permits an author to specify the significance of different items and different sections.

For example, if this column had XML tags, the sentence two paragraphs above would read:

According to <InterviewSubject>McGinty</InterviewSubject>, many of the <NumDocuments>100,000</NumDocuments> documents will simply be stored in the repository with no special handling.

In this example, “McGinty” is tagged as an “InterviewSubject,” and 100,000 is tagged as “NumDocuments.”

Longer sections of a document can also be tagged, with tags like “ExecutiveSummary” or “DetailedProductDescription.”
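
To see what such tags buy a program, here is a minimal Python sketch that reads the tagged sentence and pulls out the marked items. The tag names InterviewSubject and NumDocuments come from the example above; the wrapping Sentence element and the function names are assumptions made for illustration, not anything AMCC or Documentum is known to use.

```python
import xml.etree.ElementTree as ET

# The tagged sentence from the column, wrapped in a hypothetical
# <Sentence> root element so that it forms a well-formed XML document.
TAGGED = (
    "<Sentence>According to <InterviewSubject>McGinty</InterviewSubject>, "
    "many of the <NumDocuments>100,000</NumDocuments> documents will simply "
    "be stored in the repository with no special handling.</Sentence>"
)

def extract_tagged_items(xml_text):
    """Return a dict mapping each child tag name to the text it encloses."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

def plain_text(xml_text):
    """Return the sentence with every tag stripped out."""
    root = ET.fromstring(xml_text)
    return "".join(root.itertext())

items = extract_tagged_items(TAGGED)
print(items["InterviewSubject"])
print(items["NumDocuments"])
```

Because the significance of each item is spelled out in the tag, the program does not have to guess which word is a name and which is a count.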

In the last few years, XML has been adopted as a standard in E-commerce, where XML tags identifying fields of data such as credit card numbers or invoice detail items are becoming increasingly common.

“XML is the major driver in the whole ability to describe content,” says Alan Weintraub, an analyst with Stamford, Conn.-based Gartner Group. “Companies just using Word to write simple one-page memos probably don’t need XML. Companies that want to produce reusable manuals should consider XML. And companies producing technical documentation can’t ignore it.”

But for a firm like AMCC, many of whose documents were written and stored in files years ago, it’s no easy task to add XML tags to these legacy documents.

XML tags can be added with an ordinary word processor, but that’s incredibly labor-intensive. To save companies the trouble, some special tools have been designed in recent years, including XMetaL from SoftQuad Software Ltd. and Epic from Arbortext Inc., both of which are integrated with Documentum’s 4i.

Even so, adding XML tags can be time-consuming, according to Whitney Tidmarsh, VP of marketing for Documentum. Based on her experience, “an average worker can do XML tagging for 50 pages a week, tops.”

But once XML tags have been added to the documents, it becomes easier to reuse them and export them to other systems. For example, a sales or marketing employee can take sections of a file that’s been encoded in XML and reformat them more easily than if the file lacked the tags.
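
As a rough illustration of that kind of reuse, the Python sketch below pulls one tagged section out of a document and reformats it as a Web paragraph. The section tag names follow the examples given earlier in this column; the TechnicalDocument root element, the sample text, and the function names are all invented for illustration and do not reflect AMCC’s actual files or Documentum’s tools.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical tagged document. The section tag names follow the
# column's examples; the root element and all of the text are invented.
DOC = """
<TechnicalDocument>
  <ExecutiveSummary>A short overview of the part.</ExecutiveSummary>
  <DetailedProductDescription>Pin assignments, timing, and power figures.</DetailedProductDescription>
</TechnicalDocument>
"""

def section_text(xml_text, tag):
    """Return the text of the first section with the given tag, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(tag)
    if node is None:
        return None
    return "".join(node.itertext()).strip()

def as_html_paragraph(text):
    """Reformat a section's text as a simple HTML paragraph."""
    return "<p>" + escape(text) + "</p>"

summary = section_text(DOC, "ExecutiveSummary")
print(as_html_paragraph(summary))
```

Without the tags, the same job would mean hunting through the file by hand, or by guesswork, for where the summary starts and ends.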

Documentum charges $100,000 to license a typical entry-level version of 4i. But, according to the company, the average license will run approximately $300,000, and it’s not unusual for a high-end installation to exceed $1 million.

Charles Brett, an analyst with the Stamford, Conn.-based research firm Meta Group, says other makers of document management systems include FileNet Corp., Hummingbird Ltd., and Open Text Corp.

For other articles on content management:

London Lawyers’ Web-based Paper Chase

Best Buy is a Clicks-and-Mortar Believer

(Send John Xenakis your questions and comments for Xenakis on Technology (XOT) to [email protected])