When existing content is chunked, it usually begins with documents that are broken down into component topics and then broken down again into smaller pieces such as introductions, main body sections, and transitions. Content should sound natural and appear to have been written specifically for each use. Content is also chunked by audience and complexity, so that relevant material and more complex discussion can be added or removed easily.
Audience plays a big role in content reuse. Identifying specific blocks of information as appropriate or inappropriate for different audiences can simplify document creation immensely. It is also the hardest classification to accomplish.
Consider, for example, an Offer Brief: a document that quickly informs sales staff of new offers, pricing, and conditions that apply to selling a product or service within a given market. These details change constantly, and keeping this kind of training content accurate and timely is a Sisyphean task. Most of the documents have a similar look and feel. There may be specific types for different audiences or products, but a single item of information may find its way into 30-40 different presentations. Along the way it may take on a different style, appearing in a table here and in a paragraph of text there, but the data behind it is identical. It is possible to do a keyword search through a documentation set, locate all known matches, and copy in the revised information with each new revision, but that usually takes too much time and trouble to be worth doing regularly unless the information is very special.
In comparison, with a properly constituted XML repository, the process is much more direct. Instead of working backwards from finished documents to find where specific content appears in context, the source content is already organized according to what it contains. The author goes to that container, revises it, and refreshes the repository; the next time the document instance is called, it collects its content from the updated source, applies the proper formatting, and compiles the finished document. All 30-40 documents that draw on this same source content are automatically updated.
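The single-source mechanism described above can be sketched in a few lines. This is a toy illustration, not any particular product's API: the repository, the chunk id, and the templates are all hypothetical, and a real system would use standard reuse mechanisms (such as DITA conref) rather than a Python dictionary.

```python
# A minimal sketch of single-source reuse: each chunk of content lives
# once in a repository, and documents reference it by id. All names
# here (repository, offer-price, the templates) are hypothetical.
import xml.etree.ElementTree as ET

# The "repository": one authoritative copy of each content chunk.
repository = {
    "offer-price": "Widget Pro: $49/month in the EU market.",
}

# Two document templates that both reference the same chunk by id.
brief_template = """<offerBrief>
  <title>EU Offer Brief</title>
  <chunk ref="offer-price"/>
</offerBrief>"""

slide_template = """<slide>
  <chunk ref="offer-price"/>
</slide>"""

def compile_document(template: str) -> str:
    """Resolve every <chunk ref="..."/> against the repository."""
    root = ET.fromstring(template)
    for chunk in root.iter("chunk"):
        chunk.text = repository[chunk.attrib["ref"]]
    return ET.tostring(root, encoding="unicode")

# Revise the source once...
repository["offer-price"] = "Widget Pro: $45/month in the EU market."

# ...and every document that references the chunk picks up the change
# the next time it is compiled.
print(compile_document(brief_template))
print(compile_document(slide_template))
```

The point of the sketch is the direction of the workflow: the author edits the chunk, never the 30-40 finished documents, and each document instance is reassembled from current source content on demand.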
More work is done at the very beginning, to properly analyze and attribute the content, but as that content is used to create more and more instance documents, those documents become progressively less expensive to create, manage, and update. This makes it possible to do the previously unthinkable:
By increasing the efficiency with which content can be created, the quality and timeliness of all the training deliverables can be improved without raising the cost into the stratosphere.