Evaluation of Dataset Distribution in Biomedical Image Classification Against Image Acquisition Distortions
Journal
2024 20th International Symposium on Medical Information Processing and Analysis (SIPAIM)
Publisher
IEEE
Date Issued
2024
Author(s)
Aguilera-González, Santiago
Renza, Diego
Type
Resource Types::text::conference output::conference proceedings
Abstract
A common assumption when training a machine learning model is that the inference data are independently and identically distributed (i.i.d.) with respect to the training data. However, as the real world evolves, this assumption can break down, a situation known as distribution shift. Because distribution shift can degrade the performance of a machine learning model, the question is how to evaluate its presence without training a model. This paper therefore proposes a way to determine the degree of distribution shift in medical image datasets subject to possible distortions introduced by the capture system. The methodology is based on the Cumulative Spectral Gradient (CSG) metric and is applied to three biomedical imaging datasets from MedMNIST, an initiative that has compiled several standardized biomedical datasets: PneumoniaMNIST, BreastMNIST and RetinaMNIST. The methodology makes it possible to evaluate which types of modifications have a greater impact on model generalization and to determine whether some classes are more affected by corruptions than others. ©The authors ©IEEE.
License
Restricted Access
How to cite
Aguilera-González, S., Renza, D., & Moya-Albor, E. (2024). Evaluation of Dataset Distribution in Biomedical Image Classification Against Image Acquisition Distortions. In 2024 20th International Symposium on Medical Information Processing and Analysis (SIPAIM) (pp. 1–6). IEEE. https://doi.org/10.1109/sipaim62974.2024.10783583
Table of contents
I. Introduction -- II. Materials and Methods -- III. Results and Discussion -- Conclusions.
