Config

accel_path:
- Path to the accelerometer data file.
- Contains raw movement data (x, y, z axes) recorded by a device (e.g., smartwatch).
- Example: single_participant_positive_high.csv contains accelerometer data for a specific participant.
reports_path:
- Path to the self-reports data file.
- Includes participant-reported labels, such as emotional states (e.g., valence, arousal).
combined_data_path:
- Path to the combined data file.
- This file matches accelerometer data with self-reports, linking movements with emotions over a specific time window.
features_data_path:
- Path to the features data file.
- Contains pre-calculated features extracted from accelerometer data for analysis, so feature extraction can be skipped.
model_path:
- Path to a pre-trained machine learning model.
- Used for predicting emotional states based on accelerometer data (e.g., an XGBoost model trained to predict arousal).

label_columns:
- List of labels (columns) in the dataset that represent emotional states.
- Example: ["valence", "arousal"] treats these two columns as labels.
target_label:
- The label the model will predict.
- Only one label can be selected at a time (e.g., "arousal" for predicting arousal levels).

time_window:
- Defines the time range (in minutes) before and after a self-report to extract accelerometer data.
- Example: 2 means extract 2 minutes of data before and after each self-report.
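
The extraction around a self-report can be sketched as follows. This is a minimal illustration in plain Python, not the project's own code: the function name extract_window, the (timestamp, x, y, z) tuple layout, and the inclusive boundaries are all assumptions.

```python
from datetime import datetime, timedelta

def extract_window(samples, report_time, time_window):
    """Return samples within +/- time_window minutes of report_time.

    samples: list of (timestamp, x, y, z) tuples (hypothetical layout).
    Both ends of the range are treated as inclusive (assumption).
    """
    delta = timedelta(minutes=time_window)
    start, end = report_time - delta, report_time + delta
    return [s for s in samples if start <= s[0] <= end]

# One sample per minute from 12:00 to 12:09, self-report at 12:05.
base = datetime(2024, 1, 1, 12, 0)
samples = [(base + timedelta(minutes=i), 0.0, 0.0, 1.0) for i in range(10)]
report_time = base + timedelta(minutes=5)

# time_window=2 keeps the samples from 12:03 through 12:07.
window = extract_window(samples, report_time, time_window=2)
print(len(window))
```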

window_length:
- Length of the time window (in seconds) used to segment accelerometer data for feature extraction.
- Example: 60 means divide the data into 1-minute windows.
window_step_size:
- Step size (in seconds) for moving the window forward.
- Values smaller than window_length make consecutive windows overlap; the smaller the step, the greater the overlap and the more data segments produced.
data_frequency:
- Sampling rate of the accelerometer data, in Hz.
- Example: 25 means the device records 25 data points per second.
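
How window_length, window_step_size, and data_frequency interact can be sketched as below. The helper window_bounds is hypothetical, the 30-second step is a placeholder value, and dropping a trailing partial window is an assumption about the behavior.

```python
def window_bounds(n_samples, window_length, window_step_size, data_frequency):
    """Return (start, end) sample indices for each full window.

    window_length and window_step_size are in seconds; data_frequency is in Hz.
    Trailing samples that do not fill a whole window are dropped (assumption).
    """
    size = int(window_length * data_frequency)
    step = int(window_step_size * data_frequency)
    if size <= 0 or step <= 0:
        raise ValueError("window_length and window_step_size must be positive")
    return [(start, start + size)
            for start in range(0, n_samples - size + 1, step)]

# 4 minutes of data at 25 Hz, 60 s windows, 30 s step (50% overlap).
bounds = window_bounds(n_samples=4 * 60 * 25, window_length=60,
                       window_step_size=30, data_frequency=25)
print(len(bounds), bounds[:2])
```

With a step equal to window_length the same data would yield only 4 non-overlapping windows instead of 7.
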
selected_domains:
- Specifies which feature domains to extract.
- Options: "time_domain", "spatial", "frequency", "statistical", "wavelet".
- Example: ["time_domain", "frequency"] extracts features from these two domains only.
- None extracts all domains.
include_magnitude:
- Whether to include magnitude-based features (combining x, y, z axes into one measurement).
- Example: True includes magnitude features.
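
A common way to combine the three axes into one measurement is the Euclidean norm. The sketch below assumes that definition; the project may compute magnitude differently.

```python
import math

def magnitude(x, y, z):
    """Euclidean norm of one accelerometer sample (assumed definition)."""
    return math.sqrt(x * x + y * y + z * z)

# Two hypothetical samples given as (x, y, z).
samples = [(0.0, 0.0, 1.0), (3.0, 4.0, 0.0)]
mags = [magnitude(*s) for s in samples]
print(mags)
```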

cutoff_frequency:
- Frequency limit for the low-pass filter (in Hz).
- Removes high-frequency noise while keeping important movement data below this frequency.
order:
- Order of the low-pass filter, determining how sharply it removes unwanted frequencies.
- Example: 4 gives a moderately steep roll-off above the cutoff; higher orders cut off more sharply.
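
A sketch of how cutoff_frequency and order could drive a low-pass filter, using SciPy. The filter design (zero-phase Butterworth), the helper name low_pass, and the 5 Hz cutoff are assumptions, since the filter type and a cutoff value are not specified here. The cutoff must lie below the Nyquist frequency, which is half of data_frequency.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def low_pass(signal, cutoff_frequency, data_frequency, order):
    """Zero-phase Butterworth low-pass filter (assumed filter design)."""
    nyquist = 0.5 * data_frequency
    if not 0 < cutoff_frequency < nyquist:
        raise ValueError("cutoff_frequency must be between 0 and the Nyquist frequency")
    # butter expects the cutoff normalized to the Nyquist frequency.
    b, a = butter(order, cutoff_frequency / nyquist, btype="low")
    # filtfilt runs the filter forward and backward, so no phase shift is added.
    return filtfilt(b, a, signal)

# 10 s at 25 Hz: a 1 Hz movement component plus 10 Hz noise.
fs = 25
t = np.arange(0, 10, 1 / fs)
slow = np.sin(2 * np.pi * 1 * t)
noisy = slow + 0.5 * np.sin(2 * np.pi * 10 * t)
filtered = low_pass(noisy, cutoff_frequency=5, data_frequency=fs, order=4)
```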

scaler_type:
- Type of scaler used to standardize or normalize data.
- Options:
  - "standard": Centers the data to have a mean of 0 and a standard deviation of 1.
  - "minmax": Scales the data to a range of 0 to 1.
  - "none": No scaling applied.
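
The three options can be illustrated on a single feature column in plain Python. This is a sketch only: the function scale is hypothetical, and mapping constant columns to zeros is an assumption. Libraries such as scikit-learn provide StandardScaler and MinMaxScaler for the same purpose.

```python
from statistics import fmean, pstdev

def scale(values, scaler_type):
    """Scale one feature column according to scaler_type."""
    if scaler_type == "standard":
        # Population standard deviation, so the result has std exactly 1.
        mean, std = fmean(values), pstdev(values)
        return [(v - mean) / std for v in values] if std else [0.0 for _ in values]
    if scaler_type == "minmax":
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values] if hi > lo else [0.0 for _ in values]
    if scaler_type == "none":
        return list(values)
    raise ValueError(f"unknown scaler_type: {scaler_type!r}")

feature = [2.0, 4.0, 6.0]
print(scale(feature, "minmax"))
print(scale(feature, "standard"))
```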

apply_pca:
- Whether to apply Principal Component Analysis (PCA) to reduce the number of features.
- Example: False skips PCA.
pca_variance:
- The amount of variance to retain when applying PCA.
- Example: 0.95 keeps enough principal components to explain 95% of the variance in the data.
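
What a variance threshold means in practice: components are kept in order of explained variance until the cumulative share reaches pca_variance. The sketch below shows only that selection step, with made-up ratios and a hypothetical helper name; it is not the project's PCA code.

```python
from itertools import accumulate

def components_for_variance(explained_variance_ratio, pca_variance):
    """Smallest number of leading components whose cumulative ratio reaches pca_variance."""
    for k, total in enumerate(accumulate(explained_variance_ratio), start=1):
        if total >= pca_variance:
            return k
    return len(explained_variance_ratio)

# Hypothetical explained-variance ratios, sorted from largest to smallest.
ratios = [0.60, 0.25, 0.08, 0.04, 0.03]
k = components_for_variance(ratios, pca_variance=0.95)
print(k)
```

Here the first three components explain about 93% of the variance, so a fourth is needed to reach the 95% threshold.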

classifier:
- The machine learning model to use for training.
- Options: "xgboost", "svm", "randomforest".
- Example: "xgboost" uses XGBoost for training.

n_splits:
- Number of folds for cross-validation.
- Example: 5 means split the data into 5 groups for evaluation.
n_iter:
- Number of iterations for hyperparameter tuning.
- Example: 30 tests 30 combinations of settings.
n_jobs:
- Number of CPU cores to use for parallel processing.
- Example: -1 uses all available cores; 1 uses a single core.
n_points:
- Number of hyperparameter combinations to test in parallel.
- Example: 1 tests one combination at a time.
param_space:
- Custom hyperparameter space for the selected model.
- Example: "learning_rate": (0.05, 0.2) specifies the range of learning rates to test for the XGBoost model.
- Set to None to use the default hyperparameters defined in the TrainModel class.
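
Putting the settings together, a partial configuration might look like the sketch below. It assumes the config is a Python dictionary, which is not confirmed here. Values come from the examples above where available; window_step_size, cutoff_frequency, and scaler_type are placeholders, and most path entries are omitted. The validate helper is hypothetical and checks only a few consistency rules that follow from the descriptions.

```python
config = {
    "accel_path": "single_participant_positive_high.csv",
    "label_columns": ["valence", "arousal"],
    "target_label": "arousal",
    "time_window": 2,          # minutes
    "window_length": 60,       # seconds
    "window_step_size": 30,    # seconds (placeholder value)
    "data_frequency": 25,      # Hz
    "selected_domains": ["time_domain", "frequency"],
    "include_magnitude": True,
    "cutoff_frequency": 10,    # Hz (placeholder value)
    "order": 4,
    "scaler_type": "standard",  # placeholder choice
    "apply_pca": False,
    "pca_variance": 0.95,
    "classifier": "xgboost",
    "n_splits": 5,
    "n_iter": 30,
    "n_jobs": 1,
    "n_points": 1,
    "param_space": {"learning_rate": (0.05, 0.2)},
}

DOMAINS = {"time_domain", "spatial", "frequency", "statistical", "wavelet"}
SCALERS = {"standard", "minmax", "none"}
CLASSIFIERS = {"xgboost", "svm", "randomforest"}

def validate(cfg):
    """Return a list of human-readable problems; an empty list means no issue found."""
    errors = []
    if cfg["target_label"] not in cfg["label_columns"]:
        errors.append("target_label must be one of label_columns")
    domains = cfg["selected_domains"]
    if domains is not None and not set(domains) <= DOMAINS:
        errors.append("selected_domains contains an unknown domain")
    if cfg["scaler_type"] not in SCALERS:
        errors.append("scaler_type must be 'standard', 'minmax', or 'none'")
    if cfg["classifier"] not in CLASSIFIERS:
        errors.append("classifier must be 'xgboost', 'svm', or 'randomforest'")
    # A low-pass cutoff is only meaningful below the Nyquist frequency.
    if not 0 < cfg["cutoff_frequency"] < cfg["data_frequency"] / 2:
        errors.append("cutoff_frequency must be below data_frequency / 2")
    if not 0 < cfg["pca_variance"] <= 1:
        errors.append("pca_variance must be in (0, 1]")
    return errors

print(validate(config))
```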