The categorisation and classification processes are performed using WordStat and its graphical user interface. This allows a user to create, validate and refine those processes, apply them to various text collections, perform comparisons, explore and relate results, and create graphical and tabular reports. While categorisation and classification models can be saved to disk and reapplied to a different set of documents, a WordStat user is still required to perform those analyses.
The WordStat software development kit (SDK) provides a means to automate categorisation and classification, allowing models developed with the WordStat desktop tool to be used in applications written in other programming languages such as C++, Delphi, C#, and VB.NET.
An example of such integration would be applying a categorisation model within a company's customer feedback collection system to automatically measure references to specific topics and to classify each piece of feedback as positive, negative, neutral, or any other class specified by your model.
All the analysis and text transformation settings defined in WordStat (stemming, lemmatisation, categorisation rules, selection criteria, etc.) are stored on disk in the model files. This greatly simplifies integrating such text processing into other applications by reducing the text analysis process to four easy steps:
A model only needs to be loaded once, while steps #2 to #4 may be repeated as often as needed.
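The load-once, reuse-many pattern described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `StandInModel`, `load_model` and `quantify` are hypothetical names invented for the sketch, and a simple keyword counter stands in for the SDK. The actual function names and signatures are those given in the SDK's own documentation and sample projects.

```python
import re


class StandInModel:
    """Hypothetical stand-in for a loaded model: maps category -> keywords."""

    def __init__(self, categories):
        self.categories = {
            name: {word.lower() for word in words}
            for name, words in categories.items()
        }

    def quantify(self, text):
        """Return {category: number of keyword hits} for one document."""
        tokens = re.findall(r"[a-z']+", text.lower())
        return {
            name: sum(1 for token in tokens if token in words)
            for name, words in self.categories.items()
        }


def load_model():
    # Hypothetical: real code would load a model file saved from WordStat.
    return StandInModel({
        "PRICE": ["price", "cost", "expensive"],
        "SERVICE": ["staff", "support", "service"],
    })


# The model is loaded once...
model = load_model()

# ...and then applied to as many documents as needed.
documents = [
    "The price was fair but support was slow.",
    "Friendly staff and great service.",
]
results = [model.quantify(doc) for doc in documents]
```

Keeping the loaded model in a long-lived variable, rather than reloading it for each document, is what makes repeated processing cheap in this kind of design.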
There are currently no reporting or graphing functions available in the DLL, so it is up to the programmer to process the output further. Typically, numerical values are either stored in a database or accumulated to create reports, dashboards, etc.
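As an illustration of that post-processing, the sketch below stores per-document category frequencies in a database and aggregates them for a report. It uses Python's standard `sqlite3` module with an in-memory database. The shape of the scores (one category-to-frequency mapping per document), the table name `category_score` and the sample values are assumptions made for the example, not the SDK's actual output format.

```python
import sqlite3

# Assumed output shape: one {category: frequency} mapping per document.
# The values are made-up sample data, not real SDK output.
scores = {
    "doc-1": {"PRICE": 1, "SERVICE": 1},
    "doc-2": {"PRICE": 0, "SERVICE": 2},
}

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE category_score (
        doc_id    TEXT,
        category  TEXT,
        frequency INTEGER,
        PRIMARY KEY (doc_id, category)
    )
    """
)

# Flatten the nested mapping into one row per (document, category) pair.
rows = [
    (doc_id, category, frequency)
    for doc_id, categories in scores.items()
    for category, frequency in categories.items()
]
conn.executemany("INSERT INTO category_score VALUES (?, ?, ?)", rows)
conn.commit()

# Accumulate frequencies per category, e.g. to feed a report or dashboard.
totals = dict(
    conn.execute(
        "SELECT category, SUM(frequency) FROM category_score GROUP BY category"
    )
)
```

Storing one row per document and category keeps the raw values available, so other aggregations (by date, by customer segment, and so on) can be added later without reprocessing the texts.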
The SDK consists of a Windows DLL available in both 32-bit and 64-bit versions. The DLL is thread-safe, allowing multiple documents to be quantified concurrently. It also supports the simultaneous application of multiple categorisation and classification models, allowing the user to perform several quantifications of the same documents.
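The following Python sketch shows what concurrent quantification with several models at once might look like from the calling application's side. It does not call the DLL: `make_counter` builds hypothetical stand-in "models" (plain keyword counters), and the model names `topics` and `sentiment` are invented for the example. Only the threading pattern is the point.

```python
from concurrent.futures import ThreadPoolExecutor


def make_counter(keywords):
    """Return a hypothetical stand-in 'model' that counts keyword hits."""
    keywords = frozenset(keywords)

    def quantify(text):
        words = (word.strip(".,!?") for word in text.lower().split())
        return sum(1 for word in words if word in keywords)

    return quantify


# Several models applied to the same documents (stand-ins, not SDK models).
models = {
    "topics": make_counter({"price", "delivery"}),
    "sentiment": make_counter({"great", "slow"}),
}

documents = ["Great price, slow delivery.", "Delivery was great!"]


def analyse(doc):
    """Apply every model to one document."""
    return {name: quantify(doc) for name, quantify in models.items()}


# Documents are processed concurrently; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(analyse, documents))
```

The stand-in functions hold no mutable state, so sharing them across threads is safe here. With the real SDK, the same structure relies on the thread safety of the DLL stated above.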
The SDK comes with a sample project with source files illustrating how integration can be achieved. This sample project is currently available in Delphi, C#, and VB.NET. Please contact us if you need assistance using the SDK with other programming languages.