FAQs
General
Does selfEEG support training on GPUs for macOS devices?
The library is built on top of PyTorch, which supports training on GPUs for macOS devices through the mps backend. Note that only macOS devices with an Apple Silicon M-series SoC are supported, so older Intel models are excluded. It is also worth noting that the mps backend does not yet cover all the functionalities implemented in CUDA (see the mps coverage matrix), and some of those already implemented are yet to be optimized. This applies to only a few operations, so you will probably not notice these limitations.
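As a minimal sketch of how a training script might choose its device, the snippet below prefers CUDA, then mps, then the CPU. The helper name `pick_device` is hypothetical and not part of selfEEG; `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are standard PyTorch calls. The import is guarded so the snippet also runs where PyTorch is not installed.

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Return the name of the preferred device (hypothetical helper)."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


try:
    import torch

    # getattr guards against old PyTorch builds that lack the mps backend
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    device_name = pick_device(torch.cuda.is_available(), mps_ok)
    device = torch.device(device_name)
except ImportError:
    # PyTorch is not installed: only the CPU label can be reported
    device_name = pick_device(False, False)
    device = None

print(f"Selected device: {device_name}")
```

Models and tensors can then be moved with `.to(device)`. For operations not yet implemented on mps, PyTorch documents the `PYTORCH_ENABLE_MPS_FALLBACK=1` environment variable, which runs those operations on the CPU instead.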
Dataloading module
1) I have a dataset which stores EEG data as 3D arrays, with the first dimension associated with the trial number. Does the dataloading module support data provided in this way?
Yes, the dataloading module can handle 3D arrays (the DEAP and SHU datasets, for example, store 3D arrays), both when calculating the total number of samples and when extracting samples. Just be sure not to change the loading and transform functions.
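To make the sample count concrete, here is a rough sketch of how windows could be counted for 2D and 3D arrays. It assumes that windows are taken along the last (time) axis and that each trial of a 3D array is windowed independently; the function names, the window and overlap values, and the array shape are all hypothetical, and selfEEG's own computation may differ in its details.

```python
def count_windows(n_time: int, window: int, overlap: int) -> int:
    """Number of windows of `window` points, sharing `overlap` points, in one record."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if n_time < window:
        return 0
    step = window - overlap
    return (n_time - window) // step + 1


def count_samples(shape: tuple, window: int, overlap: int = 0) -> int:
    """Total windows in a 2D (channels, time) or 3D (trials, channels, time) array."""
    if len(shape) == 2:
        n_trials, n_time = 1, shape[-1]
    elif len(shape) == 3:
        n_trials, n_time = shape[0], shape[-1]
    else:
        raise ValueError("expected a 2D or 3D array shape")
    return n_trials * count_windows(n_time, window, overlap)


# Hypothetical file: 40 trials, 32 channels, 8064 time points
print(count_samples((40, 32, 8064), window=512, overlap=128))  # 800
```

With these numbers each trial yields 20 windows, so the 40 trials contribute 800 samples in total.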
2) I have a single dataset with EEG data acquired from a certain number of subjects over multiple sessions. How can I split the data to be sure that EEGs from a specific session are placed in only one (train/validation/test) subset?
You can split data at the session level with the GetEEGSplitTable function.
You just need to:
- Give the dataset_id_extractor argument a function that gets the subject ID.
- Give the subject_id_extractor argument a function that gets the session ID.
- Set the split mode to 1.
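The two extractor functions from the steps above could look like the following sketch. The file naming scheme (`sub-<subject>_ses-<session>_eeg.pickle`) and the function names are hypothetical; adapt the pattern to your own file names. The call to GetEEGSplitTable itself is omitted, since its full signature is not shown here.

```python
import re

# Hypothetical naming scheme: "sub-<subject>_ses-<session>_eeg.pickle"
FILE_PATTERN = re.compile(r"sub-(\d+)_ses-(\d+)")


def get_subject_id(path: str) -> int:
    """First-level ID: the function to give to dataset_id_extractor."""
    return int(FILE_PATTERN.search(path).group(1))


def get_session_id(path: str) -> int:
    """Second-level ID: the function to give to subject_id_extractor."""
    return int(FILE_PATTERN.search(path).group(2))


example = "data/sub-03_ses-02_eeg.pickle"
print(get_subject_id(example), get_session_id(example))  # 3 2
```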
The point is that this function supports splits at two granularity levels, with the second able to identify unique IDs only when coupled with the first. In this case, it is reasonable to assume that different subjects can share the same session ID, but that no duplicate (subject, session) ID pairs exist.
The names subject ID and dataset ID were chosen only by convention. They are just names, and the function will not check whether the IDs extracted from the file name really refer to a specific dataset or subject. You can provide anything you want, as long as the unique-pair requirement described above is satisfied.
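The reasoning about unique pairs can be illustrated with a toy example. The records below are hypothetical, and the round-robin assignment only demonstrates the grouping idea; it is not how GetEEGSplitTable chooses the subsets.

```python
from collections import defaultdict

# Hypothetical (subject, session, file) records
files = [
    (1, 1, "s1_ses1_a"), (1, 1, "s1_ses1_b"), (1, 2, "s1_ses2_a"),
    (2, 1, "s2_ses1_a"), (2, 2, "s2_ses2_a"),
]

# Session IDs alone collide: the same session ID appears for several subjects
by_session = defaultdict(set)
for subject, session, _ in files:
    by_session[session].add(subject)

# (subject, session) pairs identify each recording session uniquely
groups = defaultdict(list)
for subject, session, name in files:
    groups[(subject, session)].append(name)

# Toy assignment: all files of a pair land in the same subset
subsets = ["train", "validation", "test"]
assignment = {pair: subsets[i % 3] for i, pair in enumerate(sorted(groups))}
for pair, names in sorted(groups.items()):
    print(pair, assignment[pair], names)
```

Because the grouping key is the pair, the two files of subject 1, session 1 always end up in the same subset, even though session 1 also exists for subject 2.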
3) Can I implement a Leave-One-Subject-Out (LOSO) cross-validation?
Of course. You just need to call the GetEEGSplitTableKfold function,
setting the validation split to subject mode and the number of folds equal
to the number of subjects. Remember to add a subject_id extractor if needed.
If you have enough data to create a separate test set, also set the test split
mode to subject and set the number of folds to the number of subjects
minus those placed in the test set.
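The fold arithmetic can be checked with a small stand-alone sketch. It does not call GetEEGSplitTableKfold; the helper `loso_folds` and the subject IDs are hypothetical and only show which subjects each fold would hold out.

```python
def loso_folds(subjects, test_subjects=()):
    """Yield (train, validation, test) subject lists, one fold per non-test subject."""
    test = sorted(set(test_subjects))
    remaining = sorted(set(subjects) - set(test))
    for held_out in remaining:
        train = [s for s in remaining if s != held_out]
        yield train, [held_out], test


subjects = [1, 2, 3, 4, 5]  # hypothetical subject IDs
folds = list(loso_folds(subjects, test_subjects=[5]))
print(len(folds))  # 4 folds: 5 subjects minus 1 placed in the test set
for train, val, test in folds:
    print(train, val, test)
```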
Augmentation module
Should I set the batch_equal argument to True or False?
Setting batch_equal to False has a dual effect. On the one hand, it increases
mini-batch heterogeneity, potentially improving the quality of the representations;
on the other hand, it slows down model training because broadcasting cannot be
fully exploited. It’s up to you to decide which aspect to prioritize,
depending on your experimental design.
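The trade-off can be seen in a toy, pure-Python model of a random scaling augmentation. It assumes that batch_equal=True means one random draw is shared by every sample of the batch, while batch_equal=False means one draw per sample. The function `random_scaling` is a hypothetical illustration, not selfEEG's implementation.

```python
import random


def random_scaling(batch, batch_equal=True, low=0.5, high=2.0, rng=random):
    """Scale each sample of a batch (list of lists) by a random factor."""
    if batch_equal:
        # One draw shared by the whole batch: cheap and easy to broadcast
        factors = [rng.uniform(low, high)] * len(batch)
    else:
        # One draw per sample: a more varied batch, but more work
        factors = [rng.uniform(low, high) for _ in batch]
    scaled = [[f * x for x in sample] for f, sample in zip(factors, batch)]
    return scaled, factors


rng = random.Random(0)
batch = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
_, same = random_scaling(batch, batch_equal=True, rng=rng)
_, different = random_scaling(batch, batch_equal=False, rng=rng)
print(len(set(same)), len(set(different)))
```

The first call uses a single factor for all three samples, while the second draws a separate factor for each one.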
Is an augmentation always faster on GPU devices?
Most of the time, augmentations executed on GPUs are faster than those executed on CPUs.
However, it is worth noting that three main factors can affect the computational
time of augmentations: the GPU device (cuda or mps), the batch_equal argument,
and the object type (numpy array or tensor).
If you want to check how augmentations perform on different configurations, see the following table, which reports a benchmark test (times in seconds) run on the Padova Neuroscience Center Server (Tesla V100 GPU) with a 3D array of size 64 × 61 × 512. Alternatively, you can run the benchmarking test and check how augmentations perform on your specific device.
| Augmentation | Numpy Array no BE | Numpy Array BE | Torch Tensor no BE | Torch Tensor BE | Torch Tensor GPU no BE | Torch Tensor GPU BE |
|---|---|---|---|---|---|---|
| add_band_noise | 1.419 | | 0.133 | | 0.046 | |
| add_eeg_artifact | 1.595 | 1.724 | 1.76 | 0.396 | 1.119 | 0.058 |
| add_gaussian_noise | 7.395 | | 1.68 | | 0.007 | |
| add_noise_SNR | 7.695 | | 1.938 | | 1.344 | |
| change_ref | 0.767 | | 0.729 | | 0.498 | |
| channel_dropout | 0.779 | 0.363 | 0.455 | 0.028 | 0.502 | 0.007 |
| crop_and_resize | 11.479 | 17.503 | 15.851 | 3.434 | 10.389 | 0.2 |
| filter_bandpass | 4.716 | | 5.786 | | 1.035 | |
| filter_bandstop | 4.793 | | 5.93 | | 0.166 | |
| filter_highpass | 3.431 | | 5.961 | | 0.107 | |
| filter_lowpass | 4.365 | | 6.036 | | 0.115 | |
| flip_horizontal | 0.0 | | 0.022 | | 0.001 | |
| flip_vertical | 0.358 | | 0.023 | | 0.001 | |
| masking | 0.815 | 0.548 | 0.495 | 0.039 | 0.432 | 0.006 |
| moving_avg | 3.066 | | 2.661 | | 0.009 | |
| permutation_signal | 1.055 | 1.305 | 0.951 | 0.242 | 0.599 | 0.008 |
| permute_channels_network | 4.667 | 0.954 | 6.435 | 0.157 | 4.698 | 0.072 |
| permute_channels | 0.905 | 0.877 | 1.237 | 0.075 | 1.148 | 0.016 |
| random_FT_phase | 4.014 | 4.963 | 2.819 | 0.485 | 1.272 | 0.026 |
| random_slope_scale | 3.849 | 2.251 | 1.783 | 0.2 | 0.012 | 0.009 |
| scaling | 0.562 | 0.56 | 0.046 | 0.046 | 0.002 | 0.002 |
| shift_frequency | 5.155 | 7.262 | 3.173 | 0.757 | 0.692 | 0.029 |
| shift_horizontal | 0.797 | 0.787 | 0.389 | 0.056 | 0.279 | 0.004 |
| shift_vertical | 0.712 | | 0.061 | | 0.002 | |
| warp_signal | 25.965 | 43.759 | 35.951 | 7.205 | 23.263 | 0.467 |

BE stands for batch_equal. Rows with only three values appear to report a single time per object type (Numpy Array, Torch Tensor, Torch Tensor GPU), so their BE cells are left empty.
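To time an augmentation on your own device, a minimal harness along these lines may be enough. It is a sketch, not the library's benchmarking test: `benchmark` and `toy_augmentation` are hypothetical names, and the toy workload merely stands in for a real augmentation call. GPU kernels run asynchronously, so on CUDA devices a synchronization call such as `torch.cuda.synchronize` should be passed as `sync` to include pending GPU work in the measurement.

```python
import time


def benchmark(func, *args, repeats=10, sync=None, **kwargs):
    """Return the total time in seconds of `repeats` calls to func(*args, **kwargs).

    `sync` is an optional zero-argument callable run before each clock read,
    e.g. torch.cuda.synchronize, so that pending GPU work is included.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args, **kwargs)
    if sync is not None:
        sync()
    return time.perf_counter() - start


# Placeholder workload standing in for an augmentation call
def toy_augmentation(x, scale=2.0):
    return [scale * v for v in x]


elapsed = benchmark(toy_augmentation, list(range(10_000)), repeats=100)
print(f"{elapsed:.3f} s")
```

Running the same harness on numpy arrays, CPU tensors, and GPU tensors, with batch_equal set to True and then False, reproduces the six configurations of the table above.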