create_dataset
- selfeeg.utils.utils.create_dataset(folder_name: str = 'Simulated_EEG', Sample_range: list = [512, 1025], Chans: int = 8, p: list = 0.8, return_labels: bool = False, seed: int = 1234) → ndarray | None
Creates a simulated EEG dataset for normal vs. abnormal binary classification.
Samples have random length within a given range.
When called, the function generates 1000 files in a new directory. Each file is named ‘A_B_C_D.pickle’, where:
A = dataset ID
B = subject ID
C = session ID
D = trial ID.
In total, create_dataset will generate files associated with:
5 datasets (200 files per dataset)
40 subjects per dataset
5 sessions per subject
1 trial per session.
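The naming scheme and the file counts above can be checked with a short sketch. The helper `parse_sample_name` is hypothetical (it is not part of selfeeg), and it assumes the four IDs are integers separated by underscores, as the ‘A_B_C_D.pickle’ pattern suggests.

```python
from pathlib import Path


def parse_sample_name(path):
    """Split an 'A_B_C_D.pickle' file name into its four IDs.

    Hypothetical helper, not part of selfeeg. Assumes integer IDs.
    """
    stem = Path(path).stem  # drops the directory and the '.pickle' suffix
    dataset_id, subject_id, session_id, trial_id = (int(x) for x in stem.split("_"))
    return {
        "dataset": dataset_id,
        "subject": subject_id,
        "session": session_id,
        "trial": trial_id,
    }


# Counts taken from the description above.
n_datasets = 5
subjects_per_dataset = 40
sessions_per_subject = 5
trials_per_session = 1

files_per_dataset = subjects_per_dataset * sessions_per_subject * trials_per_session
total_files = n_datasets * files_per_dataset

print(files_per_dataset, total_files)
print(parse_sample_name("Simulated_EEG/1_2_3_4.pickle"))
```

The two products match the figures quoted in the text: 200 files per dataset and 1000 files overall.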
All files will store a dictionary with two keys:
‘data’ = the array with random length and given channels (channels in column dimension)
‘label’ = an integer with a random binary label (0=normal, 1=abnormal).
EEG values are expressed in uV and lie within [-550, 550] uV.
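A minimal sketch of how one generated file could be read back, based only on the dictionary layout described above. The function `load_sample` is hypothetical, and the demo writes a small stand-in file (with a nested list in place of the real array) so that the snippet runs without selfeeg installed.

```python
import pickle
import tempfile
from pathlib import Path


def load_sample(path):
    """Load one pickle file and check the documented 'data'/'label' keys.

    Hypothetical helper, not part of selfeeg.
    """
    with open(path, "rb") as f:
        sample = pickle.load(f)
    missing = {"data", "label"} - set(sample)
    if missing:
        raise KeyError(f"missing keys: {sorted(missing)}")
    if sample["label"] not in (0, 1):
        raise ValueError("label must be 0 (normal) or 1 (abnormal)")
    return sample["data"], int(sample["label"])


# Demo on a stand-in file that mimics the documented layout.
# A real file would hold an array in 'data'; a nested list is used here.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "1_1_1_1.pickle"
    with open(demo_path, "wb") as f:
        pickle.dump({"data": [[0.0, 1.5], [-2.0, 3.0]], "label": 1}, f)
    data, label = load_sample(demo_path)

print(label)
```

With real output, the same call would be pointed at a path such as one returned by `glob.glob('Simulated_EEG/*')`.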
- Parameters:
folder_name (str, optional) –
A string with the optional name of the subdirectory to store the generated files.
Default = ‘Simulated_EEG’
Sample_range (list, optional) –
A list of length 2 with the minimum and maximum possible length of the generated EEGs.
Default = [512, 1025]
Chans (int, optional) –
An integer defining the number of channels each EEG must have.
Default = 8
p (float, optional) –
A scalar in range [0, 1] with the probability of a sample being normal.
Default = 0.8
return_labels (bool, optional) –
Whether to return the array of generated labels.
Default = False
seed (int, optional) –
A seed to set for reproducibility.
Default = 1234
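The roles of p and seed can be illustrated with a small standalone sketch. This is not the generator used by create_dataset: `draw_labels` is a hypothetical function built on Python's `random` module, so its label sequence will not match the library's output. It only shows the documented semantics, where 0 (normal) is drawn with probability p and a fixed seed makes the draw repeatable.

```python
import random


def draw_labels(n, p=0.8, seed=1234):
    """Draw n binary labels; 0 (normal) has probability p, 1 (abnormal) has 1 - p.

    Hypothetical illustration, not the selfeeg implementation.
    """
    rng = random.Random(seed)  # private generator, so the global state is untouched
    return [0 if rng.random() < p else 1 for _ in range(n)]


labels = draw_labels(1000)
normal_fraction = labels.count(0) / len(labels)

# The same seed reproduces the same sequence.
same_again = draw_labels(1000) == labels

print(round(normal_fraction, 2), same_again)
```

With p = 0.8 the fraction of normal labels should land close to 0.8, and changing the seed changes the individual labels but not the expected proportion.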
- Returns:
classes (ArrayLike) – An array with the generated labels, returned only if return_labels is True. Labels are ordered to match the files sorted by name.
Example
>>> from selfeeg import utils
>>> import glob
>>> utils.create_dataset()
>>> print(len(glob.glob('Simulated_EEG/*')) == 1000)  # should return True
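Because the returned labels are associated with the files sorted by name, pairing them back up requires the same ordering. The sketch below assumes that "sorted by name" means a plain string sort, as produced by Python's `sorted`; that is an assumption, and `pair_labels_with_files` is a hypothetical helper, not a selfeeg function.

```python
def pair_labels_with_files(file_names, labels):
    """Pair each label with a file name, using a plain string sort of the names.

    Hypothetical helper; assumes the labels follow string-sorted file order.
    """
    ordered = sorted(file_names)
    if len(ordered) != len(labels):
        raise ValueError("need exactly one label per file")
    return list(zip(ordered, labels))


# Stand-in names and labels, for illustration only.
names = ["1_2_1_1.pickle", "1_10_1_1.pickle", "1_1_1_1.pickle"]
pairs = pair_labels_with_files(names, [0, 1, 0])

for name, label in pairs:
    print(name, label)
```

A string sort is not a numeric sort: here the file for subject 10 comes before the files for subjects 1 and 2. With real output, the names would come from something like `glob.glob('Simulated_EEG/*')` and the labels from a call with return_labels=True.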