get_eeg_partition_number

selfeeg.dataloading.load.get_eeg_partition_number(EEGpath: str, freq: int or float = 250, window: int or float = 2, overlap: float = 0.1, includePartial: bool = True, file_format: str or list[str] = '*', load_function: function = None, optional_load_fun_args: list or dict = None, transform_function: function = None, optional_transform_fun_args: list or dict = None, keep_zero_sample: bool = True, save: bool = False, save_path: str = None, verbose: bool = False) → pd.DataFrame

Calculates the number of unique partitions in each EEG signal.

This function processes each EEG file stored in a specified input directory. It is designed with default parameters that are compatible with the ‘BIDSAlign’ library. For additional information, see [1]. For a comprehensive guide on how to use this function, refer to the introductory notebook included in the documentation.

Parameters:

EEGpath (str) – The directory containing all EEG files. If the string does not end with a “/”, the character will be added automatically.
freq (int or float, optional) –
The EEG sampling rate, which must be consistent across all EEG files.

Default = 250.
window (int or float, optional) –
The length of the time window, specified in seconds.

Default = 2.
overlap (float, optional) –
The overlap between contiguous EEG partitions, expressed as a fraction of the window length rather than as a percentage (e.g., 0.1 means 10% overlap). This value must be in the interval [0, 1).

Default = 0.1.
includePartial (bool, optional) –
Indicates whether to also count a final partition when the samples remaining at the end of the EEG cover at least half of a time window. If this option is enabled, the overlap between the last included partition and the previous one is adjusted so that the partition contains only real recorded values, provided at least half of the partition consists of new data.

Default = True.
file_format (str or list[str], optional) –
A string or list of strings used to filter specific EEG files in the provided EEGpath. This is used directly in the glob.glob() method and can include shell-style wildcards (refer to the glob.glob() documentation for details). This option is useful if there are other file types in the directory.

Default = ‘*’.
load_function (function, optional) –
A custom function for loading EEG files, which will override the default:

loadmat(ii, simplify_cells=True)['DATA_STRUCT']['data'].

The function must accept one required argument: the full path to the EEG file (e.g., it will be called as: load_function(fullpath, optional_arguments)).

Default = None.
optional_load_fun_args (list or dict, optional) –
Additional arguments to pass to the custom loading function. This can be specified as a list or a dictionary.

Default = None.
transform_function (function, optional) –
A custom transformation function to apply after loading the EEG data. This may be useful for trimming portions of the signal (usually the beginning or end). The function must accept one required argument: the loaded EEG file (e.g., it will be called as: transform_function(EEG, optional_arguments)).

Default = None.
optional_transform_fun_args (list or dict, optional) – Additional arguments to pass to the EEG transformation function. This can be specified as a list or a dictionary. Default = None.
keep_zero_sample (bool, optional) –
Specifies whether to retain DataFrame rows with a calculated zero number of samples.

Default = True.
save (bool, optional) – Indicates whether to save the resulting DataFrame as a .csv file. Default = False.
save_path (str, optional) –
A custom path for saving the .csv file instead of using the current working directory. This string is passed to the pandas.DataFrame.to_csv() method. If save is True and no save_path is provided, the file will be saved as EEGPartitionNumber_k.csv, where k is an integer to prevent overwriting.

Default = None.
verbose (bool, optional) –
Controls whether to print information during function execution, which can be helpful for tracking progress, especially with large datasets.

Default = False.
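The custom loading and transformation hooks described above can be sketched as follows. This is a minimal illustration, not the library's internal code: `load_eeg`, `trim_edges`, the `"data"` key, and the temporary file are hypothetical names chosen for the example, and it assumes that a list of optional arguments is unpacked positionally while a dictionary is unpacked as keyword arguments.

```python
import os
import pickle
import tempfile

import numpy as np


def load_eeg(path, key="data"):
    """Hypothetical loader: read a pickled dict and return the EEG array."""
    with open(path, "rb") as handle:
        record = pickle.load(handle)
    return record[key]


def trim_edges(eeg, freq, seconds=1.0):
    """Hypothetical transform: drop `seconds` of signal at both ends."""
    cut = int(freq * seconds)
    return eeg[..., cut: eeg.shape[-1] - cut]


# Build a throwaway file so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "subject_01.pickle")
    with open(path, "wb") as handle:
        pickle.dump({"data": np.zeros((8, 1280))}, handle)

    load_args = {"key": "data"}   # dict: assumed to be passed as keywords
    transform_args = [128, 1.0]   # list: assumed to be passed positionally

    # Mimic the documented call pattern: function(required, optional_arguments)
    eeg = load_eeg(path, **load_args)
    trimmed = trim_edges(eeg, *transform_args)

print(eeg.shape, trimmed.shape)
```

With the real function, the same objects would be supplied through `load_function=load_eeg`, `optional_load_fun_args=load_args`, `transform_function=trim_edges`, and `optional_transform_fun_args=transform_args`.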

Returns:

lenEEG (DataFrame) – A three-column Pandas DataFrame containing:

- the full path to each EEG file in the first column,
- the file name in the second column,
- the number of partitions in the third column.
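As a rough sketch of how the returned table might be used, the snippet below builds a stand-in DataFrame with the same three-column layout. The column names and values are hypothetical (the real names are not stated here), so the third column is accessed by position.

```python
import pandas as pd

# Stand-in for the returned DataFrame; names and values are made up.
partitions = pd.DataFrame(
    {
        "full_path": ["data/a.pickle", "data/b.pickle", "data/c.pickle"],
        "file_name": ["a.pickle", "b.pickle", "c.pickle"],
        "N_samples": [5, 0, 12],
    }
)

counts = partitions.iloc[:, 2]        # third column: partitions per file
total = int(counts.sum())             # size of the whole dataset
non_empty = partitions[counts > 0]    # similar in effect to keep_zero_sample=False

print(total)
print(non_empty)
```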

Notes

The product of freq and window must yield an integer representing the number of samples.
This function can handle arrays with more than two dimensions. In such cases, a warning is issued and the number of partitions is computed from the length of the last dimension, then multiplied by the product of the sizes of all preceding dimensions. The last two dimensions should correspond to the channel and sample dimensions of a single EEG file.
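The arithmetic behind these notes can be sketched as below. This is one reading of the description on this page, not the library's actual implementation: the helper names are hypothetical, the overlap is rounded to a whole number of samples, a final partition is counted only when at least half a window of new samples remains, and for arrays with more than two dimensions the count is multiplied by the sizes of the dimensions before the channel and sample axes.

```python
import math


def count_partitions(n_samples, freq=250, window=2, overlap=0.1,
                     include_partial=True):
    """Estimate how many windows fit in a recording of n_samples samples."""
    win = freq * window
    if not float(win).is_integer():
        raise ValueError("freq * window must be an integer number of samples")
    win = int(win)
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in the interval [0, 1)")

    shared = round(win * overlap)   # samples shared by adjacent windows
    step = win - shared             # hop between window starts

    if n_samples < win:
        return 0

    n_full = (n_samples - win) // step + 1
    covered = (n_full - 1) * step + win
    leftover = n_samples - covered  # new samples after the last full window

    # Assumed rule: add one end-aligned window if half of it is new data.
    if include_partial and leftover >= win / 2:
        n_full += 1
    return n_full


def count_from_shape(shape, **kwargs):
    """Apply the count to the last axis, scaled by the leading dimensions."""
    leading = math.prod(shape[:-2])  # equals 1 for a (channels, samples) array
    return leading * count_partitions(shape[-1], **kwargs)


print(count_partitions(1000, freq=128, window=2, overlap=0.3))
print(count_from_shape((3, 8, 1000), freq=128, window=2, overlap=0.3))
```

Raising an error when `freq * window` is not an integer mirrors the first note; whether the library raises or rounds in that case is not stated here.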

Example

>>> import pickle
>>> import pandas as pd
>>> import selfeeg.dataloading as dl
>>> from selfeeg import utils
>>> utils.create_dataset()
>>> def loadEEG(path):
...     with open(path, 'rb') as handle:
...         EEG = pickle.load(handle)
...     x = EEG['data']
...     return x
>>> EEGlen = dl.get_eeg_partition_number(
...     'Simulated_EEG',freq=128, window=2, overlap=0.3, load_function=loadEEG)
>>> EEGlen.head()

References