FAQs

General

Does selfEEG support training on GPUs for macOS devices?

The library is built on top of PyTorch, which supports training on GPUs for macOS devices through the mps backend. Note that only macOS devices with an Apple Silicon M-series SoC are supported, so older Intel models are excluded. In addition, it is worth noting that the mps backend does not yet cover all the functionalities implemented in CUDA (a coverage matrix is available online), and some of those already implemented are yet to be optimized. This applies to only a few operations, so you will probably not notice these limitations.
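As a minimal sketch of how a device could be selected in plain PyTorch (the helper name `pick_device` is hypothetical and not part of selfEEG), one option is to prefer CUDA, then mps, then fall back to the CPU:

```python
import torch


def pick_device() -> torch.device:
    """Return the best available device: CUDA first, then Apple's mps, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # Older PyTorch builds may not expose torch.backends.mps at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
# Toy batch with shape (batch, channels, time), created directly on the device.
x = torch.randn(2, 4, 16, device=device)
print(device, x.shape)
```

Models and tensors can then be moved to the selected device with the usual `.to(device)` call.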

Dataloading module

1) I have a dataset that stores EEG data as 3D arrays, with the first dimension corresponding to the trial number. Does the dataloading module support data provided in this way?

Yes, the dataloading module can handle 3D arrays (the DEAP and SHU datasets, for example, store 3D arrays), both when calculating the total number of samples and when extracting samples. Just be sure not to change the loading and transform functions between the two steps.
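To give an intuition of how the sample count scales with the trial dimension, here is a rough sketch. The function `count_windows`, the array shape, and the windowing formula are illustrative assumptions, not necessarily the exact computation performed by the library:

```python
def count_windows(shape, window_len, overlap=0.0):
    """Count fixed-length windows extractable from an EEG array.

    shape      -- (channels, time) or (trials, channels, time)
    window_len -- window length in samples
    overlap    -- fraction of overlap between consecutive windows, in [0, 1)
    """
    if len(shape) not in (2, 3):
        raise ValueError("expected a 2D or 3D array shape")
    n_trials = shape[0] if len(shape) == 3 else 1
    n_time = shape[-1]
    step = window_len - int(window_len * overlap)
    if step <= 0:
        raise ValueError("overlap must be smaller than 1")
    if n_time < window_len:
        return 0
    # Windows per trial, multiplied by the number of trials.
    return n_trials * ((n_time - window_len) // step + 1)


# Hypothetical file: 40 trials, 32 channels, 8064 time points.
total = count_windows((40, 32, 8064), window_len=512, overlap=0.25)
print(total)  # 800
```

With a 2D array of shape (32, 8064), the same call would count the windows of a single trial only.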

2) I have a single dataset with EEG data acquired from a certain number of subjects across multiple sessions. How can I split the data to ensure that EEGs from a specific session are placed in only one (train/validation/test) subset?

You can split data at the session level with the GetEEGSplitTable function. You just need to:

Pass to dataset_id_extractor a function that returns the subject ID.
Pass to subject_id_extractor a function that returns the session ID.
Set the split mode to 1.

The point is that this function supports splits at two granularity levels, with the second able to identify unique IDs only when coupled with the first. In this case, it is reasonable to assume that different subjects can share the same session ID, but that no duplicate (subject, session) ID pairs exist.

The names subject ID and dataset ID were chosen only by convention. They are just names: the function does not check whether the IDs extracted from the file name really refer to a specific dataset or subject. You can provide anything you want, as long as the previous reasoning about the identification of unique pairs holds.
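The two extractors could look like the following sketch. The file naming scheme ("sub-XX_ses-YY") and the helper names `get_subject_id` and `get_session_id` are assumptions made only for illustration; adapt the pattern to your own file names:

```python
import re

# Hypothetical naming scheme: "sub-<subject>_ses-<session>_eeg.pickle"
_PATTERN = re.compile(r"sub-(\d+)_ses-(\d+)")


def get_subject_id(path: str) -> int:
    """First-level ID: the subject number parsed from the file name."""
    return int(_PATTERN.search(path).group(1))


def get_session_id(path: str) -> int:
    """Second-level ID: the session number, unique only within a subject."""
    return int(_PATTERN.search(path).group(2))


files = [
    "data/sub-01_ses-01_eeg.pickle",
    "data/sub-01_ses-02_eeg.pickle",
    "data/sub-02_ses-01_eeg.pickle",
    "data/sub-02_ses-02_eeg.pickle",
]

# Session IDs repeat across subjects, but (subject, session) pairs are unique.
pairs = [(get_subject_id(f), get_session_id(f)) for f in files]
print(pairs)  # [(1, 1), (1, 2), (2, 1), (2, 2)]
```

Following the steps above, `get_subject_id` would be given to dataset_id_extractor and `get_session_id` to subject_id_extractor.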

3) Can I implement a Leave-One-Subject-Out (LOSO) cross-validation?

Of course. You just need to call the GetEEGSplitTableKfold function, setting the validation split to subject mode and the number of folds equal to the number of subjects. Remember to add a subject_id extractor if needed. If you have enough data to create a separate test set, also set the test split mode to subject and set the number of folds to the number of subjects minus those placed in the test set.
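The idea behind LOSO can be sketched in a few lines of plain Python. This is a conceptual illustration with a hypothetical helper (`loso_folds`), not the library's implementation: each fold holds out all files of exactly one subject, so the number of folds equals the number of subjects.

```python
def loso_folds(subject_ids):
    """Build leave-one-subject-out folds.

    subject_ids -- one subject ID per file, in file order.
    Returns a list of (held_out_subject, train_indices, val_indices).
    """
    folds = []
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        val = [i for i, s in enumerate(subject_ids) if s == held_out]
        folds.append((held_out, train, val))
    return folds


# Hypothetical dataset: five files belonging to three subjects.
file_subjects = [1, 1, 2, 2, 3]
folds = loso_folds(file_subjects)
for subject, train_idx, val_idx in folds:
    print(f"held-out subject {subject}: train={train_idx} val={val_idx}")
```

If some subjects are reserved for a test set, they would be removed from `file_subjects` first, which reduces the number of folds accordingly.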

Augmentation module

Should I set the batch_equal argument to True or False?

Setting batch_equal to False has a dual effect. On the one hand, it increases mini-batch heterogeneity, potentially improving the quality of the representations; on the other hand, it slows down model training because broadcasting cannot be fully exploited. It’s up to you to decide which aspect to prioritize, depending on your experimental design.
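The trade-off can be illustrated with a toy scaling augmentation written in NumPy. The two functions below are hypothetical and only mimic the two behaviours: one random factor shared by the whole mini-batch (a single broadcast operation) versus one factor drawn per sample (a Python loop in this sketch).

```python
import numpy as np


def scale_shared(batch, rng):
    """Draw one random factor and apply it to every sample via broadcasting."""
    factor = rng.uniform(0.5, 2.0)
    return batch * factor


def scale_per_sample(batch, rng):
    """Draw a different random factor for each sample in the batch."""
    out = np.empty_like(batch)
    for i, sample in enumerate(batch):
        out[i] = sample * rng.uniform(0.5, 2.0)
    return out


rng = np.random.default_rng(0)
batch = np.ones((4, 3, 8))  # (batch, channels, time)

shared = scale_shared(batch, rng)
per_sample = scale_per_sample(batch, rng)

# One distinct value for the shared factor, one per sample otherwise.
print(np.unique(shared).size, np.unique(per_sample).size)
```

The per-sample version produces a more varied mini-batch, at the cost of extra work for every sample.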

Is an augmentation always faster on GPU devices?

Most of the time, augmentations executed on GPUs are faster than those executed on CPUs. However, it is worth noting that three main factors can affect the computational time of augmentations: the GPU device (cuda or mps), the batch_equal argument, and the object type (numpy array or tensor).

If you want to check how augmentations perform on different configurations, see the following table, which reports a benchmark test (times in seconds) run on the Padova Neuroscience Center Server (GPU Tesla V100) with a 3D array of size (64*61*512). Alternatively, you can run the benchmarking test and check how augmentations perform on your specific device.

Augmentation Benchmark (times in seconds; BE = batch_equal set to True, no BE = batch_equal set to False)
Augmentation	Numpy Array no BE	Numpy Array BE	Torch Tensor no BE	Torch Tensor BE	Torch Tensor GPU no BE	Torch Tensor GPU BE
add_band_noise		1.419		0.133		0.046
add_eeg_artifact	1.595	1.724	1.76	0.396	1.119	0.058
add_gaussian_noise		7.395		1.68		0.007
add_noise_SNR		7.695		1.938		1.344
change_ref		0.767		0.729		0.498
channel_dropout	0.779	0.363	0.455	0.028	0.502	0.007
crop_and_resize	11.479	17.503	15.851	3.434	10.389	0.2
filter_bandpass		4.716		5.786		1.035
filter_bandstop		4.793		5.93		0.166
filter_highpass		3.431		5.961		0.107
filter_lowpass		4.365		6.036		0.115
flip_horizontal		0.0		0.022		0.001
flip_vertical		0.358		0.023		0.001
masking	0.815	0.548	0.495	0.039	0.432	0.006
moving_avg		3.066		2.661		0.009
permutation_signal	1.055	1.305	0.951	0.242	0.599	0.008
permute_channels_network	4.667	0.954	6.435	0.157	4.698	0.072
permute_channels	0.905	0.877	1.237	0.075	1.148	0.016
random_FT_phase	4.014	4.963	2.819	0.485	1.272	0.026
random_slope_scale	3.849	2.251	1.783	0.2	0.012	0.009
scaling	0.562	0.56	0.046	0.046	0.002	0.002
shift_frequency	5.155	7.262	3.173	0.757	0.692	0.029
shift_horizontal	0.797	0.787	0.389	0.056	0.279	0.004
shift_vertical		0.712		0.061		0.002
warp_signal	25.965	43.759	35.951	7.205	23.263	0.467
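To get a rough timing on your own machine, a small harness such as the one below can be used. It is a sketch built on the standard `timeit` module: `toy_gaussian_noise` is a stand-in augmentation written for this example, not a selfEEG function, and the array size simply mirrors the one used in the table.

```python
import timeit

import numpy as np


def toy_gaussian_noise(x, rng, std=0.1):
    """Stand-in augmentation: add zero-mean Gaussian noise to the array."""
    return x + rng.normal(0.0, std, size=x.shape)


def benchmark(func, *args, repeats=5):
    """Return the best wall-clock time (seconds) of a single call over several runs."""
    return min(timeit.repeat(lambda: func(*args), number=1, repeat=repeats))


rng = np.random.default_rng(0)
x = rng.standard_normal((64, 61, 512))  # same size as the benchmark array

elapsed = benchmark(toy_gaussian_noise, x, rng)
print(f"toy_gaussian_noise: {elapsed:.4f} s")
```

When timing tensors on a CUDA device, remember that kernels run asynchronously, so `torch.cuda.synchronize()` should be called before reading the clock.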