Benchmark
cesnet_tszoo.benchmarks
Benchmark
Used as a wrapper for the imported `dataset`, `config`, `annotations`, and `related_results`.
Intended usage:
For time-based:
- Call `load_benchmark` with the desired benchmark identifier. You can use your own saved benchmark or a built-in one. This downloads the dataset and annotations (if available) if they have not been downloaded previously.
- Retrieve the initialized dataset using `get_initialized_dataset`. This provides a dataset that is ready to use.
- Use `get_train_dataloader` or `get_train_df` to get training data for the chosen model.
- Validate the model and perform hyperparameter optimization on `get_val_dataloader` or `get_val_df`.
- Evaluate the model on `get_test_dataloader` or `get_test_df`.
- (Optional) Evaluate the model on `get_test_other_dataloader` or `get_test_other_df`.
For series-based:
- Call `load_benchmark` with the desired benchmark. You can use your own saved benchmark or a built-in one. This downloads the dataset and annotations (if available) if they have not been downloaded previously.
- Retrieve the initialized dataset using `get_initialized_dataset`. This provides a dataset that is ready to use.
- Use `get_train_dataloader` or `get_train_df` to get training data for the chosen model.
- Validate the model and perform hyperparameter optimization on `get_val_dataloader` or `get_val_df`.
- Evaluate the model on `get_test_dataloader` or `get_test_df`.
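The steps above can be sketched as a small helper. This is a minimal sketch, not the library's own code: `collect_splits` is a hypothetical name, and it assumes the dataset's `get_*_df` methods can be called without arguments. The `load_benchmark` call is shown only in comments because it needs `cesnet_tszoo` installed and may download data.

```python
def collect_splits(benchmark, include_test_other=False):
    """Gather train/val/test data from a loaded benchmark.

    `benchmark` is expected to behave like the Benchmark class
    documented here (hypothetical helper, not part of cesnet_tszoo).
    """
    # get_initialized_dataset returns a dataset that is ready to use.
    dataset = benchmark.get_initialized_dataset()

    splits = {
        "train": dataset.get_train_df(),
        "val": dataset.get_val_df(),
        "test": dataset.get_test_df(),
    }
    # The optional extra test set is listed only in the time-based steps.
    if include_test_other:
        splits["test_other"] = dataset.get_test_other_df()
    return splits


# Typical use (identifier and data_root are placeholders, not real values):
#
#   from cesnet_tszoo.benchmarks import load_benchmark
#   benchmark = load_benchmark("<benchmark identifier>", "<data_root>")
#   splits = collect_splits(benchmark, include_test_other=True)
```

The same pattern applies to the `get_*_dataloader` methods when batched loading is preferred over DataFrames.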
You can create custom time-based benchmarks with `save_benchmark` or series-based benchmarks with `save_benchmark`.
They will be saved to the `"data_root"/tszoo/benchmarks/` directory, where `data_root` is the value set when you created the dataset instance.
Source code in cesnet_tszoo\benchmarks.py
get_annotations
get_annotations(on: AnnotationType | Literal['id_time', 'ts_id', 'both']) -> pd.DataFrame
Return the annotations as a Pandas `DataFrame`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`on` | `AnnotationType \| Literal['id_time', 'ts_id', 'both']` | Specifies which annotations to return. If set to | required |
Returns:
Type | Description |
---|---|
`DataFrame` | A Pandas DataFrame containing the selected annotations. |
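As an illustration of the `on` parameter, the sketch below fetches one DataFrame per accepted string value. `collect_annotations` is a hypothetical helper; it assumes only the `get_annotations(on=...)` signature shown above and makes no claim about what each value returns.

```python
def collect_annotations(benchmark, kinds=("id_time", "ts_id", "both")):
    """Return a dict mapping each annotation kind to its result.

    Hypothetical helper: `benchmark` is assumed to expose
    get_annotations(on=...) as documented above.
    """
    return {kind: benchmark.get_annotations(on=kind) for kind in kinds}


# Typical use, given a loaded benchmark:
#
#   annotations = collect_annotations(benchmark)
#   ts_annotations = annotations["ts_id"]
```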
get_config
get_config() -> SeriesBasedConfig | TimeBasedConfig
Return config made for this benchmark.
Source code in cesnet_tszoo\benchmarks.py
55 56 57 58 |
|
get_dataset
get_dataset(check_errors: bool = False) -> TimeBasedCesnetDataset | SeriesBasedCesnetDataset
Return dataset without initializing it.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`check_errors` | `bool` | Whether to check that the dataset is not corrupted. | `False` |
Returns:
Type | Description |
---|---|
`TimeBasedCesnetDataset \| SeriesBasedCesnetDataset` | The dataset used for this benchmark. |
get_initialized_dataset
get_initialized_dataset(display_config_details: bool = True, check_errors: bool = False, workers: Literal['config'] | int = 'config') -> TimeBasedCesnetDataset | SeriesBasedCesnetDataset
Return the dataset with initialized sets, scalers, fillers, etc.
This method uses the following config attributes:
Dataset config | Description |
---|---|
`init_workers` | Specifies the number of workers to use for initialization. Applied when `workers = "config"`. |
`partial_fit_initialized_scalers` | Determines whether initialized scalers should be partially fitted on the training data. |
`nan_threshold` | Filters out time series with missing values exceeding the specified threshold. |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`display_config_details` | `bool` | Flag indicating whether to display the configuration values after initialization. | `True` |
`check_errors` | `bool` | Whether to check that the dataset is not corrupted. | `False` |
`workers` | `Literal['config'] \| int` | The number of workers to use during initialization. | `'config'` |
Returns:
Type | Description |
---|---|
`TimeBasedCesnetDataset \| SeriesBasedCesnetDataset` | The initialized dataset. |
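The parameters above can be combined as in the following sketch, which suppresses the config printout, enables the corruption check, and lets the caller override the worker count. `initialize_quietly` is a hypothetical wrapper; only the keyword names come from the signature documented here.

```python
def initialize_quietly(benchmark, workers="config"):
    """Initialize the benchmark's dataset without printing config details.

    Hypothetical wrapper around get_initialized_dataset. With the default
    workers="config", the config's init_workers value applies; passing an
    int overrides it.
    """
    return benchmark.get_initialized_dataset(
        display_config_details=False,  # skip the printed configuration summary
        check_errors=True,             # check that the dataset is not corrupted
        workers=workers,
    )


# Typical use, given a loaded benchmark:
#
#   dataset = initialize_quietly(benchmark, workers=4)
```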
get_related_results
get_related_results() -> pd.DataFrame | None
Return the related results as a Pandas `DataFrame`, if they exist.
Returns:
Type | Description |
---|---|
`DataFrame \| None` | A Pandas DataFrame containing related results, or `None` if no related results exist. |
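Because the return value may be `None`, callers should check for it before using the result as a DataFrame. A minimal sketch, with `summarize_related_results` as a hypothetical helper name:

```python
def summarize_related_results(benchmark):
    """Describe the related results of a benchmark, handling the None case.

    Hypothetical helper: relies only on get_related_results() returning
    either None or an object supporting len(), such as a DataFrame.
    """
    results = benchmark.get_related_results()
    if results is None:
        return "no related results"
    # len() of a Pandas DataFrame is its number of rows.
    return f"{len(results)} related result rows"
```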
load_benchmark
load_benchmark(identifier: str, data_root: str) -> Benchmark
Load a benchmark using the identifier.
First, it attempts to load a built-in benchmark. If no built-in benchmark with that identifier exists, it attempts to load a custom benchmark from the `"data_root"/tszoo/benchmarks/` directory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`identifier` | `str` | The name of the benchmark YAML file. | required |
`data_root` | `str` | Path to the folder where the dataset will be stored. Each database has its own subfolder | required |
Returns:
Type | Description |
---|---|
`Benchmark` | Return benchmark with |
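The lookup order described for `load_benchmark` (built-in first, custom directory second) can be pictured with the sketch below. It is an illustration of the fallback logic only, not the library's implementation: `resolve_benchmark_source` and the `builtin_identifiers` collection are hypothetical, and only the `tszoo/benchmarks` directory layout comes from the text above.

```python
import os


def resolve_benchmark_source(identifier, builtin_identifiers, data_root):
    """Decide where a benchmark identifier would be looked up.

    Hypothetical illustration: `builtin_identifiers` stands in for whatever
    registry of built-in benchmarks the library keeps internally.
    """
    # Built-in benchmarks take precedence over custom ones.
    if identifier in builtin_identifiers:
        return ("built-in", identifier)
    # Otherwise fall back to the custom benchmarks directory under data_root.
    custom_dir = os.path.join(data_root, "tszoo", "benchmarks")
    return ("custom", custom_dir)
```

One consequence of this order is that a custom benchmark saved under the same identifier as a built-in one would not be reached, so custom identifiers are best kept distinct.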