
PhD. Topics

Institute of Economic Research

Detection of tax manipulations using machine learning and artificial intelligence methods
PhD. program
Year of admission
Name of the supervisor
doc. Ing. Eduard Baumöhl, PhD.
Receiving school
Faculty of Mathematics, Physics and Informatics, Comenius University (Fakulta matematiky, fyziky a informatiky UK)
Manipulation of corporate accounting and financial statements is an integral part of corporate finance, particularly in Slovakia. The main goal of the thesis is to develop a new detection model, based on recent advances in neural networks and machine learning, that is suited to the conditions of the Slovak business environment. The thesis has a strong applied research focus, and close cooperation with the responsible authority in Slovakia, the Financial Directorate of the Slovak Republic (FD SR), will therefore be maintained. An APVV project on this topic has been submitted, in which the student will be involved, and FD SR might provide a short-term contract.

The data mining techniques most commonly used at the beginning of the 21st century to detect financial manipulations include neural networks, Bayesian analysis, and decision trees (Ngai et al., 2011; Ravisankar et al., 2011; Feroz et al., 2000; Lin et al., 2003). Later, the Support Vector Machine (SVM) was added to these methods (Perols, 2011; Albashrawi, 2016). All of these techniques, led by neural networks, remain a standard part of current research, where several of them are typically applied to the same data so that their performance can be compared directly (Lin et al., 2015). Some works extended the set of techniques to k-nearest neighbours (KNN), but mainly to hybrid systems used to identify factors that predict manipulative behaviour (Kirkos et al., 2007). A less frequently used method in the analysis of financial manipulations is the Random Forest machine learning technique, which nevertheless appears to achieve significantly better results than the other methods (Whiting et al., 2012; Patel et al., 2019; An and Suh, 2020; Wyrobek, 2020). Because of its accuracy, this technique is coming to the forefront of current research on manipulative behaviour.
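
As a rough illustration of what comparing several techniques on the same data involves, the sketch below scores a set of candidate classifiers on one shared test set using precision, recall and F1 for the manipulator class. Everything in it is an assumption made for illustration: the function names, the toy data and the three stand-in "models" (simple threshold rules, not the neural networks, SVMs or Random Forests discussed above) are hypothetical and do not come from the cited studies.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, fp, fn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive (manipulator) class."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def compare_models(models, X_test, y_test):
    """Score every model on the same test set.

    `models` maps a name to a function that predicts a label for one row.
    """
    results = {}
    for name, predict in models.items():
        y_pred = [predict(row) for row in X_test]
        results[name] = precision_recall_f1(y_test, y_pred)
    return results


# Hypothetical toy test set: one placeholder feature per company,
# label 1 = manipulator, label 0 = non-manipulator.
X_test = [[0.1], [0.4], [0.6], [0.9], [0.2], [0.8]]
y_test = [0, 0, 1, 1, 0, 1]

# Hypothetical stand-ins for trained classifiers.
models = {
    "always_negative": lambda row: 0,
    "threshold_0.5": lambda row: 1 if row[0] > 0.5 else 0,
    "threshold_0.3": lambda row: 1 if row[0] > 0.3 else 0,
}

results = compare_models(models, X_test, y_test)
for name, (precision, recall, f1) in results.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Scoring all candidates on one held-out set, with metrics tied to the manipulator class instead of overall accuracy, keeps the comparison fair when manipulators are rare.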

A typical problem in this type of analysis is class imbalance: the subjects identified as manipulators make up a disproportionately small share of the data. Many methodological approaches have been developed in recent years to cope with imbalanced learning. In general, these can be divided into several categories: sampling methods, cost-sensitive methods, ensemble methods, and various hybrid methods (He and Garcia, 2009). From a methodological point of view, tackling the problem of high data imbalance will be one of the major concerns of the thesis.
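
Two of the categories named above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the method the thesis will use: `random_oversample` is a hypothetical helper for the simplest sampling method (duplicating minority-class rows), and `balanced_class_weights` is a hypothetical helper for a common inverse-frequency weighting that a cost-sensitive learner could consume. The 95/5 class split and the placeholder feature are invented for the example.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, label in zip(rows, labels) if label == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Invented toy data: 95 non-manipulators (0) and 5 manipulators (1),
# each with a single placeholder feature.
labels = [0] * 95 + [1] * 5
rows = [[float(i)] for i in range(100)]

weights = balanced_class_weights(labels)
resampled_rows, resampled_labels = random_oversample(rows, labels)

print(weights)
print(Counter(resampled_labels))
```

Resampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.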