LanguageIdentifer:
A Language Identifier object that was allocated by liCreateLanguageIdentifier.
IDList:
A pointer to a LanguageIDListT structure. The results of the
analysis of your document are placed in this structure.
Status:
A pointer to a StatusCodeT object (a signed long integer).
liEndDocument ends the analysis of your
document and returns the results of the analysis.
Before calling liEndDocument, use the function
liAnalyzeDocumentText to analyze your document's text and begin
determining the character set. Make sure that liAnalyzeDocumentText
has analyzed enough text to give the language identifier sufficient
information to work with. We typically recommend 200 characters or
more of text, though you can sometimes get by with less. The more
text, the more accurate the analysis. However, text beyond
about 15,000 characters will have little impact on the analysis.
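The text-size guidance above can be sketched as a small bookkeeping helper. This is only an illustration: FeedTrackerT, chars_worth_feeding, has_enough_text, and the two constants are hypothetical names that are not part of the library, and capping input at 15,000 characters is one possible design choice rather than a requirement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thresholds taken from the guidance above;
   they are not constants defined by the library. */
#define MIN_RECOMMENDED_CHARS 200u
#define MAX_USEFUL_CHARS      15000u

/* Hypothetical bookkeeping structure kept by the caller. */
typedef struct {
    size_t chars_fed;   /* characters handed to the analyzer so far */
} FeedTrackerT;

/* Returns how many characters of the next chunk are still worth
   analyzing, given the approximate 15,000-character ceiling. */
size_t chars_worth_feeding(const FeedTrackerT *t, size_t chunk_len)
{
    assert(t != NULL);
    if (t->chars_fed >= MAX_USEFUL_CHARS)
        return 0;
    size_t room = MAX_USEFUL_CHARS - t->chars_fed;
    return chunk_len < room ? chunk_len : room;
}

/* Nonzero once the recommended minimum of 200 characters is reached. */
int has_enough_text(const FeedTrackerT *t)
{
    assert(t != NULL);
    return t->chars_fed >= MIN_RECOMMENDED_CHARS;
}
```

A caller would add to chars_fed after each chunk it passes to liAnalyzeDocumentText, and could check has_enough_text before deciding whether the results of liEndDocument are likely to be reliable.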
The results of the analysis of your document
will be returned in the structure IDList (of type LanguageIDListT)
which has the following format:
struct LanguageIdentificationT {
    char LanguageIDString[80];
    int LanguageIDNum;
    float Weight;
};
struct LanguageIDListT {
    BooleanT LanguageFound;
    LanguageIdentificationT Language[4];
    int LanguageIDCount;
};
In the structure, LanguageIDCount tells
you how many of the Language structures have been filled out.
The Language array is sorted in descending order of
likelihood of a match to your language (i.e., Language[0] is
the most likely match to the language your document is written
in, while Language[3] is the least likely). The flag LanguageFound
tells you whether the language identifier believes the closest match
is actually the language of your document's text.
So, to look at the results, check
LanguageIDCount and the LanguageFound flag to see
how many close matches there were and whether the closest match is
also the language of your document. You can then walk
the Language array and extract the LanguageIDString, LanguageIDNum,
and Weight of each entry as you see fit.
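The steps above can be sketched as follows. The structure definitions are reproduced here only so that the sketch is self-contained; in real code they would come from the library's header. The typedef of BooleanT as int is an assumption, and report_languages is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: BooleanT is an integer type. The real definition
   comes from the library's header. */
typedef int BooleanT;

typedef struct LanguageIdentificationT {
    char LanguageIDString[80];
    int LanguageIDNum;
    float Weight;
} LanguageIdentificationT;

typedef struct LanguageIDListT {
    BooleanT LanguageFound;
    LanguageIdentificationT Language[4];
    int LanguageIDCount;
} LanguageIDListT;

/* Hypothetical helper: prints every filled-in candidate, most likely
   first, and returns the closest match if the identifier believes it
   is the document's language, or NULL otherwise. */
const LanguageIdentificationT *report_languages(const LanguageIDListT *list,
                                                FILE *out)
{
    assert(list != NULL && out != NULL);

    /* Defensive clamp so the loop never leaves the 4-entry array. */
    int count = list->LanguageIDCount;
    if (count < 0) count = 0;
    if (count > 4) count = 4;

    for (int i = 0; i < count; i++) {
        const LanguageIdentificationT *c = &list->Language[i];
        /* LanguageIDString is zero terminated, so %s is safe here. */
        fprintf(out, "%d: %s (id %d, weight %.3f)\n",
                i, c->LanguageIDString, c->LanguageIDNum, c->Weight);
    }

    return (list->LanguageFound && count > 0) ? &list->Language[0] : NULL;
}
```

A caller would pass the IDList filled in by liEndDocument and treat a NULL return as "no confident match", while still having the ranked candidates printed for inspection.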
Note that the LanguageIDString
returned for each matching language is filled out as a zero-terminated
(or "C" style) string.