QUITA - Quantitative Index Text Analyzer

Quantitative Index Text Analyzer (QUITA) covers the most common indicators of quantitative text analysis, especially those connected with the frequency structure of a text. In addition to computing the indicators, QUITA also provides statistical testing and graphical visualization of the obtained data.

QUITA is a versatile tool designed for researchers from various disciplines (linguistics, criticism, history, sociology, psychology, politics, biology, etc.). The program offers basic text-processing functions such as creating word lists, lemmatizing texts, and creating n-grams. It also provides more advanced tools such as a random text creator and a binary file translator. The core of the software, however, is the computation of indicators. Although the authors focused mainly on indicators connected with the frequency structure of a text (e.g. h-point, entropy, repeat rate, adjusted modulus, Gini’s coefficient, lambda), several other characteristics are also available, such as thematic concentration, activity & descriptivity, and writer’s view.
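To give a feel for what frequency-structure indicators measure, the following is a minimal Python sketch of three of them: entropy, repeat rate, and h-point. It is not QUITA's implementation. The function names, the whitespace tokenizer, and the linear interpolation used when no rank equals its frequency are assumptions made for illustration, and QUITA's own definitions may differ in detail.

```python
import math
from collections import Counter


def frequencies(text):
    """Word frequencies in descending order (assumed: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    return sorted(Counter(tokens).values(), reverse=True)


def entropy(freqs):
    """Shannon entropy (base 2) of the relative frequencies."""
    n = sum(freqs)
    return -sum((f / n) * math.log2(f / n) for f in freqs)


def repeat_rate(freqs):
    """Sum of squared relative frequencies."""
    n = sum(freqs)
    return sum((f / n) ** 2 for f in freqs)


def h_point(freqs):
    """Rank r at which r equals the frequency f(r).

    If no rank matches exactly, interpolate linearly between the two
    neighbouring ranks where the rank-frequency curve crosses f = r
    (an assumed convention for this sketch).
    """
    for r, f in enumerate(freqs, start=1):
        if f == r:
            return float(r)
        if f < r:
            ri, fi = r - 1, freqs[r - 2]  # last rank with f > r
            rj, fj = r, f                 # first rank with f < r
            return (fi * rj - fj * ri) / (rj - ri + fi - fj)
    # The curve never crosses f = r in a very short text; fall back to the last rank.
    return float(len(freqs))


freqs = frequencies("a a a b b c")  # three word types with frequencies 3, 2, 1
print(entropy(freqs), repeat_rate(freqs), h_point(freqs))
```

All three functions take only the sorted frequency list, so the same code applies unchanged to other units such as lemmas, characters, or n-grams.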

The main purpose of QUITA is to provide a user-friendly tool for quantitative text analysis to researchers (especially from the humanities) who lack deeper knowledge of quantitative linguistics, statistics, and programming. Apart from generating results, QUITA also supports simple statistical comparison and chart creation, so there is no need for additional software such as spreadsheet applications or specialized statistical programs. In sum, QUITA combines all the important parts of quantitative research: obtaining results, statistical testing, and graphical visualization.

To compare texts for authorship attribution, genre analysis, or other purposes, the differences between the values obtained for several indicators can be tested statistically. QUITA supports statistical testing not only between individual texts but also between groups of texts. For creating graphs of the obtained data, a dedicated tool, “Chart Wizard”, offers a wide range of chart types and editing options. All results can be copied via the clipboard or saved directly as a CSV file. Charts can be saved as image files.

QUITA has a wide range of applications, from stylometry to DNA analysis. Although almost all indicators in the software were proposed as features for common linguistic research (e.g. authorship attribution, genre or thematic analysis), they can be applied well beyond it. Biologists, for instance, can use one of the available tokenizers (DNA Triplet Tokenizer, DNA Nucleotide Tokenizer) to treat DNA as a text and apply the indicators to it. There is also an option to use units other than words or lemmas, such as characters or n-grams. The software is designed as a multilingual tool; QUITA therefore works with almost all scripts and includes several tokenizers and lemmatizers.
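The idea of treating DNA as a text can be illustrated with a short Python sketch of nucleotide and triplet tokenization. It does not show how QUITA's DNA tokenizers actually work. The function names, the restriction to the letters A, C, G, and T, and the non-overlapping reading frame starting at the first nucleotide are all assumptions made for this example.

```python
from collections import Counter


def nucleotide_tokens(seq):
    """One token per nucleotide; characters other than A, C, G, T are dropped (assumption)."""
    return [c for c in seq.upper() if c in "ACGT"]


def triplet_tokens(seq):
    """Non-overlapping triplets read from the first nucleotide (assumed reading frame).

    An incomplete trailing group of one or two nucleotides is discarded.
    """
    clean = "".join(nucleotide_tokens(seq))
    return [clean[i:i + 3] for i in range(0, len(clean) - 2, 3)]


sequence = "atg gcc atg a"
triplet_counts = Counter(triplet_tokens(sequence))
print(triplet_counts.most_common())
```

Once the sequence has been split into tokens, the resulting frequency list can be fed to any frequency-based indicator in the same way as a list of word frequencies.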