SGS01/UVAFM/18, University of Ostrava (2018)
Miroslav Kubát, Radek Čech, Jan Hůla, David Číž, Kateřina Pelegrinová
The project follows up on the previous SGS project Application of Neural Networks in Diachronic and Synchronic Semantic Analysis of Texts. The initial analysis showed that the approach has considerable potential. The main goal is to extend the functionality of the software developed so far and to explore possible applications of the proposed method in linguistic research. Specifically, the method measures the context specificity of a lemma (CSL), that is, the degree to which a lemma is tied to specific contexts. It is based on the Word2vec technique.
SGS02/UVAFM/2017, University of Ostrava (2017)
Radek Čech, Miroslav Kubát, Jan Hůla, Vojtěch Molek
The aim of the project is to apply contemporary methods based on neural networks in textology. Semantic changes in a Czech corpus are analyzed from synchronic and diachronic viewpoints. More specifically, (a) we examine the development of political and social discourse from 1990 to 2014, and (b) we investigate the effectiveness of this method for genre classification. The project reflects the research topics of the Department of Czech Language (quantitative linguistics) and the Institute for Research and Applications of Fuzzy Modeling of the University of Ostrava (neural networks).
Implementation of new methods for the teaching of quantitative linguistic subjects at the Department of Czech Language at the Faculty of Arts of the University of Ostrava, and the improvement of the pedagogical and professional competence of the staff of the Department, based on expert consultation at the Department of Philosophy, Sociology, Education and Applied Psychology, University of Padua.
IRP201707, University of Ostrava (2017)
Miroslav Kubát, Radek Čech
QUITA (Quantitative Index Text Analyzer) – Software Measuring Vocabulary Richness and Other Quantitative Features of Texts
IGA FF_2013_031, Palacký University Olomouc (2013)
Radek Čech, Vladimír Matlach, Miroslav Kubát
Quantitative Index Text Analyzer (QUITA) computes the most common quantitative indicators of texts, especially those connected with the frequency structure of a text. In addition to computing the indicators, QUITA provides statistical testing and graphical visualization of the obtained data. It is a versatile tool designed for researchers from various disciplines (linguistics, literary criticism, history, sociology, psychology, politics, biology, etc.). The program offers basic text processing functions, such as creating word lists, lemmatizing texts, or creating n-grams, as well as more advanced tools, such as a random text creator or a binary file translator. The core of the software, however, is the computation of indicators. Although the authors focused mainly on indicators connected to the frequency structure of a text (e.g., h-point, entropy, repeat rate, adjusted modulus, Gini’s coefficient, lambda), several other characteristics are also available, such as thematic concentration, activity & descriptivity, or writer’s view. More information about the software can be found in the book QUITA – Quantitative Index Text Analyzer and in the diploma thesis Kvantitativně lingvistický software.
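To give a rough idea of what frequency-structure indicators look like in practice, the following minimal Python sketch computes three of those named above (entropy, repeat rate, and h-point) from a list of tokens. It is an illustration only and not QUITA's code: the function names `frequency_indicators` and `h_point` are hypothetical, the formulas are the textbook definitions (Shannon entropy in bits, repeat rate as the sum of squared relative frequencies, h-point as the rank at which rank equals frequency, with linear interpolation between neighbouring ranks when no exact match exists), and QUITA's own definitions and edge-case handling may differ.

```python
import math
from collections import Counter


def h_point(freqs):
    """h-point of a rank-frequency list sorted in descending order.

    Assumed definition: the rank r where f(r) == r; if no such rank exists,
    linear interpolation between the last rank with f > r and the first
    rank with f < r.
    """
    prev_rank, prev_freq = None, None
    for rank, freq in enumerate(freqs, start=1):
        if freq == rank:
            return float(rank)
        if freq < rank:
            # Crossing lies between (prev_rank, prev_freq) and (rank, freq).
            numerator = prev_freq * rank - freq * prev_rank
            denominator = (rank - prev_rank) + (prev_freq - freq)
            return numerator / denominator
        prev_rank, prev_freq = rank, freq
    # Assumption of this sketch: if the frequencies never drop to the rank
    # (very short or empty lists), fall back to the number of types.
    return float(len(freqs))


def frequency_indicators(tokens):
    """Return entropy (bits), repeat rate, and h-point for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    freqs = sorted(counts.values(), reverse=True)
    probs = [f / total for f in freqs]
    entropy = -sum(p * math.log2(p) for p in probs)
    repeat_rate = sum(p * p for p in probs)
    return {
        "entropy": entropy,
        "repeat_rate": repeat_rate,
        "h_point": h_point(freqs),
    }


if __name__ == "__main__":
    sample = "a a a b b c".split()
    print(frequency_indicators(sample))
```

The toy input here is a whitespace-split string; for real texts, tokenization and lemmatization would have to be done beforehand, which this sketch does not attempt.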