Knowledge base: Adam Mickiewicz University

Ensemble classification of incomplete data – a non-imputation approach with an application in ovarian tumour diagnosis support

Andrzej Wójtowicz

Abstract

In this doctoral dissertation I focus on the problem of classifying incomplete data. The motivation for the research comes from medicine, where missing data are commonly encountered. The most popular way of dealing with missingness is imputation; that is, filling in missing values on the basis of statistical relationships among features. In my research I choose a different strategy. Previously developed classifiers can be transformed into a form that returns an interval of possible predictions. In the next step, aggregation operators and thresholding methods are used to make the final classification. I show how to carry out such transformations of classifiers and how to use aggregation strategies to classify interval data. These methods improve the quality of classification of incomplete data in the problem of ovarian tumour diagnosis. Additional analysis carried out on external datasets from the University of California, Irvine (UCI) Machine Learning Repository shows that these methods are complementary to imputation.
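The pipeline outlined in the abstract (interval-valued predictions, then aggregation, then thresholding) might be sketched as follows. This is only an illustrative sketch, not the dissertation's actual method: the linear scoring model, the known feature bounds, the arithmetic-mean aggregation, the midpoint thresholding rule, and all function names are assumptions made for the example.

```python
import math
from typing import List, Optional, Sequence, Tuple

Interval = Tuple[float, float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def interval_predict(
    weights: Sequence[float],
    bias: float,
    x: Sequence[Optional[float]],
    bounds: Sequence[Interval],
) -> Interval:
    """Return the range of scores a linear model can produce for x.

    Assumption: a missing feature (None) may take any value within its
    known bounds, so its smallest and largest possible contribution is
    added to the lower and upper score respectively.
    """
    lo = hi = bias
    for w, value, (a, b) in zip(weights, x, bounds):
        if value is None:
            lo += min(w * a, w * b)
            hi += max(w * a, w * b)
        else:
            lo += w * value
            hi += w * value
    # The sigmoid is increasing, so it preserves the interval order.
    return sigmoid(lo), sigmoid(hi)


def aggregate_mean(intervals: List[Interval]) -> Interval:
    """Aggregate intervals by averaging lower and upper ends separately
    (one of many possible aggregation operators)."""
    n = len(intervals)
    return (
        sum(lo for lo, _ in intervals) / n,
        sum(hi for _, hi in intervals) / n,
    )


def classify(interval: Interval, threshold: float = 0.5) -> int:
    """Threshold the interval midpoint (an assumed, simple rule)."""
    lo, hi = interval
    return int((lo + hi) / 2.0 >= threshold)


# Two hypothetical models over two features scaled to [0, 1].
bounds = [(0.0, 1.0), (0.0, 1.0)]
models = [([2.0, -1.0], -0.5), ([1.0, 1.0], -1.0)]
patient = [0.9, None]  # the second feature is missing

intervals = [interval_predict(w, b, patient, bounds) for w, b in models]
decision = classify(aggregate_mean(intervals))
```

With complete data each interval collapses to a single point, so the same code covers ordinary classification; the width of an interval reflects how much the missing values could change a prediction.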
Record ID
UAM09626311c2dd40a889f3d4bc3de0ecb3
Diploma type
Doctor of Philosophy
Author
Andrzej Wójtowicz (SNŚ/WMiI/DoIIPM)
Title in Polish
Grupowa klasyfikacja danych niekompletnych – podejście nieimputacyjne z zastosowaniem we wspomaganiu diagnostyki guzów jajnika
Title in English
Ensemble classification of incomplete data – a non-imputation approach with an application in ovarian tumour diagnosis support
Language
English
Certifying Unit
Faculty of Mathematics and Computer Science (SNŚ/WMiI/FoMaCS)
Discipline
information science / (mathematics domain) / (physical sciences)
Scientific discipline (2.0)
6.2 computer and information sciences
Defense Date
26-06-2017
End date
26-06-2017
Supervisor
URL
http://hdl.handle.net/10593/17969
Keywords in English
incomplete data, classification, imputation, aggregation operators
Thesis file

Uniform Resource Identifier
https://researchportal.amu.edu.pl/info/phd/UAM09626311c2dd40a889f3d4bc3de0ecb3/
