Toolkit Overview
The Brevity toolkit is designed to be both flexible and simple to use. As you can see from the list of functions, it is easy to integrate Brevity into your project. The two main functions you will use are brCreateSummarizer and brDeleteSummarizer, which create and delete the Brevity object. You must call brCreateSummarizer before you start using the summarizer and brDeleteSummarizer when you have finished with it. Brevity will not work without them.
What is significant within a document depends upon the type of documents you are looking at and the type of information you are looking for. Brevity works by comparing a document to a set of similar documents. For instance, if you were summarizing a feed of political news, you would want to compare your text to political news stories, not physics papers. Likewise, lawyers summarizing legal papers would want to compare their paper to other legal documents of the same type. Brevity stores this document information in a Summary Dictionary. We supply several dictionaries with Brevity, each designed for a general category of documents. For more specialized needs, we supply a utility that generates a dictionary from a collection of documents you provide. To specify the dictionary Brevity will use, call brSetDictionary.
There are two ways to summarize text. You can tell Brevity to summarize a data file on disk, or you can pass Brevity a buffer of text stored in memory. This lets you choose whichever is easier for your particular project. In most cases it is easiest to simply pass Brevity a memory buffer. Be sure to remove any formatting from the text before you pass it to Brevity.
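How formatting should be removed depends on your source format, which this overview does not specify. As one illustration, the following minimal sketch assumes the formatting consists of HTML-style tags in angle brackets and copies only the text outside them into a clean buffer. The function name strip_tags is hypothetical and is not part of the Brevity toolkit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Brevity.
 * Copies src into dst, skipping everything from a '<' through the next '>'
 * (assumed to be an HTML-style tag). Writes at most dst_size - 1 characters
 * followed by a terminating NUL, and returns the number of characters
 * written. If a '<' has no matching '>', the rest of the input is dropped. */
size_t strip_tags(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return 0;

    while (*src != '\0' && out + 1 < dst_size) {
        if (*src == '<') {
            const char *close = strchr(src, '>');
            if (close == NULL)
                break;              /* unterminated tag: stop copying */
            src = close + 1;        /* resume just after the tag */
        } else {
            dst[out++] = *src++;    /* ordinary text: keep it */
        }
    }
    dst[out] = '\0';
    return out;
}
```

The cleaned buffer is what you would then hand to the summarizer. Real documents usually need more than this (character entities, scripts, word-processor control codes), so treat the sketch as a starting point only.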
Brevity can return your summary in two different ways. The easiest and most common way is to have Brevity generate a paragraph of text that it returns in a buffer. The summary can be as short or as long as you wish. The right length depends upon the types of texts you are summarizing and how long they are. In general, you might start with a summary of 200 words and adjust this up or down based upon your own data. Once you have decided what best meets your needs, you can hard-code that length into your program. We find that for news stories, as found on the Internet or in newspapers, a length of about 100 words works well. For technical articles, as found in most journals, a length of about 200 words works best.
The second way Brevity can summarize your text is by returning a series of offsets into your text. Each offset is the location of a significant sentence in your text. You can use these offsets to highlight the sentences in your document that your users are likely to find significant. You might then let the user click on a highlighted sentence to jump to it in the original text. This lets you not only summarize a text but also help the user navigate the original document. It also lets you retrieve formatting information from your original text for display, if necessary.
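To show how such offsets might be used for highlighting, here is a minimal sketch. It assumes each offset is a zero-based byte position of the first character of a sentence, that offsets arrive in ascending order, and that a sentence ends at the next '.', '!' or '?'. None of these details are stated in this overview, and the function names sentence_length and mark_sentences are hypothetical, not Brevity calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the sentence starting at text[offset],
 * up to and including the next '.', '!' or '?', or to the end of text. */
size_t sentence_length(const char *text, size_t offset)
{
    assert(text != NULL && offset <= strlen(text));
    size_t len = strcspn(text + offset, ".!?");
    if (text[offset + len] != '\0')
        ++len;                       /* include the terminator itself */
    return len;
}

/* Hypothetical helper: copies text into dst, wrapping each sentence that
 * starts at one of the given offsets in square brackets. Offsets must be
 * ascending and must not fall inside an earlier marked sentence.
 * Returns the number of characters written, excluding the NUL. */
size_t mark_sentences(const char *text, const size_t *offsets, size_t count,
                      char *dst, size_t dst_size)
{
    size_t text_len = strlen(text);
    size_t pos = 0;                  /* next unread character of text */
    size_t out = 0;                  /* next free slot in dst */

    /* Worst case: whole text, two brackets per offset, and the NUL. */
    assert(dst_size >= text_len + 2 * count + 1);

    for (size_t i = 0; i < count; ++i) {
        size_t start = offsets[i];
        assert(start >= pos && start <= text_len);
        size_t len = sentence_length(text, start);

        memcpy(dst + out, text + pos, start - pos);  /* unmarked text */
        out += start - pos;
        dst[out++] = '[';
        memcpy(dst + out, text + start, len);        /* marked sentence */
        out += len;
        dst[out++] = ']';
        pos = start + len;
    }
    memcpy(dst + out, text + pos, text_len - pos + 1);  /* tail and NUL */
    out += text_len - pos;
    return out;
}
```

In a real interface the brackets would be replaced by whatever highlighting or hyperlink mechanism your display uses. The simple terminator search is fooled by abbreviations such as "Dr.", so a production version would need a more careful sentence-boundary rule or an explicit sentence length, if the toolkit provides one.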