WordStat by Provalis Research

Content analysis and text mining software.

Highly advanced content analysis and text mining software with unmatched analysis capabilities.

WordStat is flexible and easy-to-use text analysis software.

Whether you need text mining tools for fast extraction of themes and trends, or careful and precise measurement with state-of-the-art quantitative content analysis tools, WordStat has you covered. Its seamless integration with SimStat (our statistical data analysis tool), QDA Miner (our qualitative data analysis software) and Stata (the comprehensive statistical software from StataCorp) gives you unprecedented flexibility for analyzing text and relating its content to structured information, including numerical and categorical data.

Why WordStat?

WordStat can be used by anyone who needs to quickly extract and analyze information from large volumes of documents. Our content analysis and text mining software is used for:

  • Content analysis of open-ended responses and interview or focus group transcripts
  • Business intelligence and competitive website analysis
  • Information extraction and knowledge discovery from incident reports and customer complaints
  • Content analysis of news coverage or scientific literature
  • Automatic tagging and classification of documents
  • Fraud detection, authorship attribution, patent analysis
  • Taxonomy development and validation

Why use WordStat to analyze your unstructured data?

Explore Document Content Using Text Mining

Analyze large amounts of unstructured information with WordStat. The software can process 25 million words per minute, quickly extract themes and automatically identify patterns using clustering, multidimensional scaling, proximity plots and more.

Extract Meaning Using Explorer Mode

Quickly and easily extract meaning from large amounts of text data using Explorer mode, designed especially for those with little text mining experience. In one click, you can extract the most frequent words and phrases and the most salient topics in your documents.

Relate Text with Structured Data

Explore relationships between unstructured text and structured data such as dates, numbers or categorical data. Use statistical and graphical tools to identify temporal trends, compare subgroups, or assess relationships with ratings and other kinds of categorical or numerical data.

Salient Topics

Get a quick overview of the most salient topics from large text collections by using state-of-the-art automatic topic extraction techniques.

Explore Connections

Explore relationships among words or concepts and retrieve text segments associated with specific connections.


Import Word, Excel, HTML, XML, SPSS, Stata, NVivo and PDF files, as well as images. Connect to and import directly from social media, emails, web survey platforms, and reference management tools.

Categorize Text Data Using Dictionaries

Achieve full text analysis automation using existing dictionaries or create your own categorization model with words, phrases, proximity rules and more.
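
To give a rough sense of what dictionary-based categorization means, here is a minimal Python sketch. It is not WordStat's dictionary format or engine: the category names, the entries, and the `categorize` function are all hypothetical, and proximity rules are left out.

```python
import re
from collections import Counter

# Hypothetical dictionary: category name -> words or phrases (illustrative only).
DICTIONARY = {
    "PRICE": ["price", "expensive", "cheap", "cost"],
    "SERVICE": ["customer service", "staff", "support"],
}

def categorize(text, dictionary=DICTIONARY):
    """Count whole-word, case-insensitive matches for each category."""
    lowered = text.lower()
    counts = Counter()
    for category, entries in dictionary.items():
        for entry in entries:
            # \b keeps "cost" from matching inside "costume"
            pattern = r"\b" + re.escape(entry) + r"\b"
            counts[category] += len(re.findall(pattern, lowered))
    return dict(counts)

print(categorize("The staff were great but the price was too expensive."))
```

Each document ends up with one count per category, which is the kind of structured output that can then be compared across subgroups or over time.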

Unique Assistance for Dictionary Building

Build your dictionary faster with tools for extracting common phrases and technical terms and for quickly identifying misspellings, synonyms, antonyms and related words in your text collection.

GIS Mapping

Relate unstructured text data to geographic information and create interactive plots of data points, thematic maps, and heatmaps. A geocoding web service transforms location names, postal codes and IP addresses into latitude and longitude coordinates.


Easily export text analysis results to common file formats such as Excel, SPSS, ASCII, HTML, XML and MS Word, and export graphs as PNG, BMP or JPEG images.

Perform Qualitative Coding

Combine WordStat with a state-of-the-art qualitative coding tool (QDA Miner) for more precise exploration of data or more in-depth analysis of specific documents or extracted text segments when needed.

Categorize Text Data Using Machine Learning

Develop and optimize automatic document classification models using Naïve Bayes and K-Nearest Neighbours.
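
For readers unfamiliar with Naïve Bayes classification, the toy Python sketch below shows the general technique: a multinomial model with add-one smoothing trained on a handful of labeled texts. It says nothing about how WordStat implements or tunes its classifiers; the function names and the training sentences are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train_naive_bayes(labeled_docs):
    """labeled_docs: iterable of (text, label) pairs."""
    class_counts = Counter()            # documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocabulary = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def classify(model, text):
    """Return the label with the highest log-probability for text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # add-one (Laplace) smoothing avoids zero probabilities
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training examples, for illustration only.
training = [
    ("refund my money now", "complaint"),
    ("terrible broken product", "complaint"),
    ("love this great product", "praise"),
    ("great service thank you", "praise"),
]
model = train_naive_bayes(training)
print(classify(model, "broken product refund"))
```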

Automatically Extract Named Entities

Automatically extract named entities that can be added to the categorization dictionary using an easy drag-and-drop operation.

Return to Source Document in One Click

Verify or dig deeper into your analysis by going back to the text from almost any feature, chart or graph. You can use the Keyword Retrieval or Keyword-in-Context features to retrieve sentences, paragraphs or whole documents.
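
The keyword-in-context idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not WordStat's Keyword-in-Context feature; the `keyword_in_context` function and its `width` parameter are hypothetical.

```python
import re

def keyword_in_context(text, keyword, width=3):
    """Return (left, match, right) tuples with up to `width` words per side."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows

sample = "The delivery was late and the delivery fee was high"
for left, match, right in keyword_in_context(sample, "delivery", width=2):
    print(f"{left:>10} [{match}] {right}")
```

Lining up every occurrence with its neighboring words makes it easy to check whether a keyword is being used in the sense you expected.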

Transform Text Using Python Scripts

Use Python scripts and the full range of open-source Python libraries to preprocess or transform text documents for analysis in WordStat.
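
As an example of the kind of cleanup such a script might perform, here is a small sketch using only the Python standard library. How WordStat passes documents to a script is not described here, so the `preprocess` function name and the particular cleaning steps are assumptions.

```python
import html
import re
import unicodedata

def preprocess(text):
    """Normalize a raw document before analysis (illustrative steps only)."""
    text = html.unescape(text)                  # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)        # drop leftover HTML tags
    text = unicodedata.normalize("NFKC", text)  # unify compatibility characters
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

print(preprocess("<p>Fast &amp; easy!</p>  See https://example.com now"))
```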

WordStat 9: Improved performance and provision, a more flexible approach and enhanced usability.