This repository contains R code for reproducing the results presented in the manuscript "Electrogastrogram-derived Features for Automated Sickness Detection in Driving Simulator", authored by Grega Jakus (ORCID: 0000-0001-9373-7885), Jaka Sodnik (ORCID: 0000-0002-8915-9493), and Nadica Miljković (ORCID: 0000-0002-3933-6076).
This repository contains both data and code, as well as README.md and license files.
EGG-based (Electrogastrogram-based) features derived from the original and noisy data, with the corresponding SNRs (Signal-to-Noise Ratios), are provided in the following .csv tables:
- dat.csv - EGG-based parameters/features derived from the original dataset
- dat-noise_SNR+0dB.csv - EGG-based parameters/features derived from the semisynthetic dataset (by adding pseudo-random colored noise of SNR = 0 dB, the actual mean value of SNR is -3 dB)
- dat-noise_SNR+10dB.csv - EGG-based parameters/features derived from the semisynthetic dataset (by adding pseudo-random colored noise of SNR = 10 dB, the actual mean value of SNR is 7 dB)
- dat-noise_SNR+20dB.csv - EGG-based parameters/features derived from the semisynthetic dataset (by adding pseudo-random colored noise of SNR = 20 dB, the actual mean value of SNR is 17 dB)
- dat-noise_SNR-10dB.csv - EGG-based parameters/features derived from the semisynthetic dataset (by adding pseudo-random colored noise of SNR = -10 dB, the actual mean value of SNR is -13 dB)
- dat-noise_SNR-20dB.csv - EGG-based parameters/features derived from the semisynthetic dataset (by adding pseudo-random colored noise of SNR = -20 dB, the actual mean value of SNR is -23 dB)
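As a minimal sketch (not part of the repository scripts), the tables listed above could be read in base R and the nominal SNR encoded in each file name compared with the mean of the `snrEGG` column. The helper names `nominal_snr_db` and `mean_actual_snr` are hypothetical, and the sketch assumes that the .csv files are comma-separated with a header row and are located in the working directory.

```r
# Hypothetical helper: extract the nominal SNR (in dB) from a file name such
# as "dat-noise_SNR+10dB.csv"; returns NA for the original "dat.csv".
nominal_snr_db <- function(file_name) {
  name <- basename(file_name)
  m <- regmatches(name, regexpr("[+-][0-9]+(?=dB)", name, perl = TRUE))
  if (length(m) == 0) NA_real_ else as.numeric(m)
}

# Hypothetical helper: mean of the per-instance SNR column. Assumes a
# comma-separated file with a header row (an assumption about the format).
mean_actual_snr <- function(path) {
  dat <- read.csv(path)
  # The table derived from the original dataset has no snrEGG column.
  if (!"snrEGG" %in% names(dat)) return(NA_real_)
  mean(dat$snrEGG, na.rm = TRUE)
}

files <- c("dat-noise_SNR-20dB.csv", "dat-noise_SNR-10dB.csv",
           "dat-noise_SNR+0dB.csv", "dat-noise_SNR+10dB.csv",
           "dat-noise_SNR+20dB.csv")

# Only summarize the files that are actually present.
existing <- files[file.exists(files)]
if (length(existing) > 0) {
  print(data.frame(
    file = existing,
    nominal_dB = vapply(existing, nominal_snr_db, numeric(1)),
    actual_mean_dB = vapply(existing, mean_actual_snr, numeric(1)),
    row.names = NULL
  ))
}
```

If the files are present, the printed table places the nominal and the actual mean SNR side by side for each semisynthetic dataset.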
The R code (R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.r-project.org, 2021) for analysis of the features shared in the .csv tables is given in the following scripts:
- comparison-noise.R - statistical analysis comparing noisy and non-noisy EGG-based parameters and testing for statistical differences in the data in relation to nausea occurrence
- nausea-categorical.R - descriptive statistics of categorical variables
- rf-nausea-classification.R - classification of nausea occurrence based on the EGG-based parameters for each dataset separately
- rf-nausea-classification-noisy-set.R - classification of nausea occurrence based on the original EGG-based parameters tested on noisy test data
Each data table shared in .csv format contains the following columns:
- ordinal number of the instance
- id of the participant
- rms - Root Mean Square
- median of the EGG PSD (Power Spectral Density)
- magDf - magnitude of the dominant frequency of EGG signal
- df - dominant frequency
- cs - crest factor of EGG PSD
- sdv - Spectral Variation Distribution
- SampEntT_m2 - sample entropy of time series for embedding dimension m = 2
- SampEntT_m3 - sample entropy of time series for embedding dimension m = 3
- SampEntT_m4 - sample entropy of time series for embedding dimension m = 4
- SampEntP_m2 - sample entropy of PSD for embedding dimension m = 2
- SampEntP_m3 - sample entropy of PSD for embedding dimension m = 3
- SampEntP_m4 - sample entropy of PSD for embedding dimension m = 4
- SpectEnt - spectral entropy
- Autocorr - autocorrelation zero-crossing
- SD1 - transverse line of the Poincaré plot
- SD2 - longitudinal line of the Poincaré plot
- SDEGG - standard deviation obtained from SD1 and SD2
- snrEGG - the actual SNR (Signal-to-Noise Ratio) for the instance (only tables reporting parameters derived from the semisynthetic dataset contain this column)
- nausea_onset - binary indicator of whether nausea occurred in the analyzed data segment
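To illustrate a few of the listed features, the following base R sketch uses common textbook definitions of the root mean square, the crest factor, and the Poincaré descriptors SD1 and SD2. It is an illustration under stated assumptions and not the feature extraction code used for the manuscript: the function names are hypothetical, the test signal is an arbitrary sinusoid, and SDEGG is left out because this README does not state how it is obtained from SD1 and SD2.

```r
# Root mean square of a signal segment.
rms_value <- function(x) sqrt(mean(x^2))

# Crest factor: peak absolute value divided by the RMS value. It can be
# applied to any numeric vector, for example a PSD estimate.
crest_factor <- function(p) max(abs(p)) / rms_value(p)

# Poincaré descriptors computed from successive samples x[n] and x[n + 1]:
# SD1 is the spread across the identity line (transverse direction),
# SD2 is the spread along the identity line (longitudinal direction).
poincare_sd <- function(x) {
  x1 <- x[-length(x)]
  x2 <- x[-1]
  c(SD1 = sd((x2 - x1) / sqrt(2)),
    SD2 = sd((x2 + x1) / sqrt(2)))
}

# Arbitrary illustrative signal: 10 full periods of a unit sinusoid.
x <- sin(2 * pi * 0.05 * (0:199))
print(rms_value(x))
print(crest_factor(x))
print(poincare_sd(x))
```

For a slowly varying signal such as this sinusoid, consecutive samples are similar, so SD1 is expected to be much smaller than SD2.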
For the sake of computational reproducibility, each R script states the R version used and includes a commented header with a groundhog function call that loads the packages and their dependencies as available on CRAN (The Comprehensive R Archive Network) on the selected date.
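A minimal sketch of date-pinned package loading with groundhog could look as follows. The wrapper name `load_pinned_packages`, the package name, and the date in the commented example are placeholders chosen for illustration; the values actually used are given in the header of each script.

```r
# Hypothetical wrapper around groundhog::groundhog.library(). The package
# names and the date passed to it are placeholders, not the values used in
# the repository scripts.
load_pinned_packages <- function(pkgs, date) {
  parsed <- as.Date(date, format = "%Y-%m-%d")
  if (is.na(parsed)) stop("date must be given as YYYY-MM-DD")
  # Loads the package versions that were current on CRAN on the given date.
  groundhog::groundhog.library(pkgs, format(parsed, "%Y-%m-%d"))
}

# Report the running R version so that it can be compared with the version
# noted in each script.
print(R.version.string)

# Example call (commented out because it downloads packages from CRAN):
# load_pinned_packages(c("randomForest"), "2022-10-01")
```

Validating the date before calling groundhog keeps a malformed date from triggering any package installation.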
The R code is provided without any guarantee and it is not intended for medical purposes.
The Authors thank Nenad B. Popović for his long collaboration in studies related to EGG research and for fruitful discussions on feature extraction techniques, followed by his scientific contribution published elsewhere. The Authors also acknowledge Timotej Gruden, PhD student, for his exceptional work in designing the experiment and conducting the measurements.
This research was funded by the HADRIAN (Holistic Approach for Driver Role Integration and Automation Allocation for European Mobility Needs) EU Horizon 2020 project, grant number 875597. It was also partly supported by the Slovenian Research Agency within the research program ICT4QoL - Information and Communications Technologies for Quality of Life, grant number P2-0246. N.M. was partly supported by the Ministry of Education, Science, and Technological Development, Republic of Serbia, grant number 451-03-68/2022-14/200103.
If you find the EGG-based features and R code useful for your own research or teaching, please cite the following references:
- Jakus, G., Sodnik, J., & Miljković, N. (2022, October 23). NadicaSm/Statistical-Analysis-and-Machine-Learning-for-EGG-based-Nausea-Detection: v1 (Version v1). Zenodo. https://doi.org/10.5281/zenodo.7242797
- Jakus, G., Sodnik, J., & Miljković, N. (2022). Electrogastrogram-Derived Features for Automated Sickness Detection in Driving Simulator. Sensors, 22(22), 8616. https://doi.org/10.3390/s22228616
- Gruden, T., Popović, N. B., Stojmenova, K., Jakus, G., Miljković, N., Tomažič, S., & Sodnik, J. (2021). Electrogastrography in autonomous vehicles—an objective method for assessment of motion sickness in simulated driving environments. Sensors, 21(2), 550. https://doi.org/10.3390/s21020550