Multivariate forecasting

The multivariate forecasting benchmarks are divided into two groups.

Unique model for each time series

The first group targets training a separate model for each time series. The benchmarks available for this setting are listed below:

| Benchmark hash | Dataset | Aggregation | Source | Original paper |
|---|---|---|---|---|
| 930f0b401065 | CESNET-TimeSeries24 | 10 MINUTES | INSTITUTIONS | None |
| ca6999ea7e24 | CESNET-TimeSeries24 | 10 MINUTES | INSTITUTION_SUBNETS | None |
| 7495b16f5fe6 | CESNET-TimeSeries24 | 10 MINUTES | IP_ADDRESSES_FULL | None |
| 3687fb52c433 | CESNET-TimeSeries24 | 10 MINUTES | IP_ADDRESSES_SAMPLE | None |
| a6e56f99ab8a | CESNET-TimeSeries24 | 1 HOUR | INSTITUTION_SUBNETS | None |
| e44334732033 | CESNET-TimeSeries24 | 1 HOUR | INSTITUTIONS | None |
| 18d04cab63e4 | CESNET-TimeSeries24 | 1 HOUR | IP_ADDRESSES_FULL | None |
| b0ea46897cae | CESNET-TimeSeries24 | 1 HOUR | IP_ADDRESSES_SAMPLE | None |
| 63e1f696e7c5 | CESNET-TimeSeries24 | 1 DAY | INSTITUTION_SUBNETS | None |
| 71d17ad3550f | CESNET-TimeSeries24 | 1 DAY | INSTITUTIONS | None |
| 15737f3fceec | CESNET-TimeSeries24 | 1 DAY | IP_ADDRESSES_FULL | None |
| 084f368f4c82 | CESNET-TimeSeries24 | 1 DAY | IP_ADDRESSES_SAMPLE | None |

We encourage users to change the default value for missing values, the filler, the scaler, the sliding window step, and the batch sizes. The remaining arguments must not be changed. These benchmarks are used as follows:

import numpy as np
from cesnet_tszoo.benchmarks import load_benchmark
from cesnet_tszoo.utils.enums import FillerType, ScalerType
from sklearn.metrics import mean_squared_error

benchmark = load_benchmark("871f5972109e", "../")
dataset = benchmark.get_initialized_dataset()

# (optional) Set default value for missing data 
dataset.set_default_values(0)

# (optional) Set filler for filling missing data 
dataset.apply_filler(FillerType.MEAN_FILLER)

# (optional) Set scaler for data
dataset.apply_scaler(ScalerType.MIN_MAX_SCALER)

# (optional) Change sliding window setting
dataset.set_sliding_window(sliding_window_size=744, sliding_window_prediction_size=24, sliding_window_step=1, set_shared_size=744)

# (optional) Change batch sizes
dataset.set_batch_sizes(all_batch_size=32)

# Train and evaluate one model per time series
results = []
for ts_id in dataset.get_data_about_set(about='train')['ts_ids']:
    # Model is your own class; it uses the dataloaders for training and prediction
    model = Model()
    model.fit(
        dataset.get_train_dataloader(ts_id), 
        dataset.get_val_dataloader(ts_id),
    )
    y_pred, y_true = model.predict(
        dataset.get_test_dataloader(ts_id), 
    )

    # Evaluate predictions, for example, with RMSE
    # (mean_squared_error returns the MSE, so take its square root)
    rmse = mean_squared_error(y_true, y_pred) ** 0.5

    # Add individual result into all results
    results.append(rmse)

# Aggregate the per-series RMSE values collected in results
print(f"Mean RMSE: {np.mean(results):.4f}")
print(f"Std RMSE: {np.std(results):.4f}")
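
The example above leaves the `Model` class to the user. As a rough illustration only, the sketch below shows one shape such a class could take: a naive baseline that repeats the last observed value over the prediction horizon. It assumes each dataloader batch is an `(inputs, targets)` pair shaped `(batch, window, features)` and `(batch, horizon, features)` with the same features in both; that batch layout is an assumption of this sketch, not something this page confirms, and the list `fake_loader` is a hypothetical stand-in for a real dataloader.

```python
import numpy as np


class Model:
    """Hypothetical naive baseline: repeat the last observed value."""

    def fit(self, train_loader, val_loader=None):
        # A naive forecaster has nothing to learn, so fitting is a no-op.
        return self

    def predict(self, loader):
        preds, trues = [], []
        # Assumption: every batch is an (inputs, targets) pair.
        for batch_x, batch_y in loader:
            batch_x = np.asarray(batch_x, dtype=float)
            batch_y = np.asarray(batch_y, dtype=float)
            # Take the last time step of each window and repeat it
            # across the whole prediction horizon.
            last = batch_x[:, -1:, :]
            preds.append(np.repeat(last, batch_y.shape[1], axis=1))
            trues.append(batch_y)
        y_pred = np.concatenate(preds)
        y_true = np.concatenate(trues)
        # Flatten to 2-D so that scikit-learn metrics accept the arrays.
        return y_pred.reshape(len(y_pred), -1), y_true.reshape(len(y_true), -1)


# Stand-in for a dataloader: two batches of 4 windows, 6 steps in, 3 steps out.
rng = np.random.default_rng(0)
fake_loader = [(rng.random((4, 6, 2)), rng.random((4, 3, 2))) for _ in range(2)]

y_pred, y_true = Model().fit(fake_loader).predict(fake_loader)
print(y_pred.shape, y_true.shape)
```

A baseline like this is mainly useful as a reference point: a trained model that cannot beat it on a given time series is probably not learning anything useful from the window.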

Generic model for multiple time series

The second group targets training one generic model that learns general patterns from several time series and can then forecast other time series that were not used for training. The benchmarks available for this setting are listed below:

| Benchmark hash | Dataset | Aggregation | Source | Original paper |
|---|---|---|---|---|
| 9ac2b87c9a7c | CESNET-TimeSeries24 | 10 MINUTES | INSTITUTIONS | None |
| 7cd4e41b05ec | CESNET-TimeSeries24 | 10 MINUTES | INSTITUTION_SUBNETS | None |
| 50eb509e1e77 | CESNET-TimeSeries24 | 10 MINUTES | IP_ADDRESSES_FULL | None |
| 681a7fb90948 | CESNET-TimeSeries24 | 10 MINUTES | IP_ADDRESSES_SAMPLE | None |
| ab8183ea80af | CESNET-TimeSeries24 | 1 HOUR | INSTITUTION_SUBNETS | None |
| f9bd005c7efe | CESNET-TimeSeries24 | 1 HOUR | INSTITUTIONS | None |
| 88fd173619b2 | CESNET-TimeSeries24 | 1 HOUR | IP_ADDRESSES_FULL | None |
| 4ae11863ee38 | CESNET-TimeSeries24 | 1 HOUR | IP_ADDRESSES_SAMPLE | None |
| cdb79dbf54ea | CESNET-TimeSeries24 | 1 DAY | INSTITUTION_SUBNETS | None |
| c95d66b0baf5 | CESNET-TimeSeries24 | 1 DAY | INSTITUTIONS | None |
| 16274e0b44af | CESNET-TimeSeries24 | 1 DAY | IP_ADDRESSES_FULL | None |
| 0197980a87c0 | CESNET-TimeSeries24 | 1 DAY | IP_ADDRESSES_SAMPLE | None |

We encourage users to change the default value for missing values, the filler, the scaler, the sliding window step, and the batch sizes. The remaining arguments must not be changed. These benchmarks are used as follows:

from cesnet_tszoo.benchmarks import load_benchmark
from cesnet_tszoo.utils.enums import FillerType, ScalerType
from sklearn.metrics import mean_squared_error

benchmark = load_benchmark("09de83e89e42", "../")
dataset = benchmark.get_initialized_dataset()

# (optional) Set default value for missing data 
dataset.set_default_values(0)

# (optional) Set filler for filling missing data 
dataset.apply_filler(FillerType.MEAN_FILLER)

# (optional) Set scaler for data
dataset.apply_scaler(ScalerType.MIN_MAX_SCALER, create_scaler_per_time_series=False)

# (optional) Change sliding window setting
dataset.set_sliding_window(sliding_window_size=744, sliding_window_prediction_size=24, sliding_window_step=1, set_shared_size=744)

# (optional) Change batch sizes
dataset.set_batch_sizes(all_batch_size=32)

# Train a single model (your own class) on all training time series
model = Model()
model.fit(
    dataset.get_train_dataloader(), 
    dataset.get_val_dataloader(),
)

# Predict for time series whose data were not used in training
y_pred, y_true = model.predict(
    dataset.get_test_other_dataloader(), 
)

# Evaluate predictions, for example, with RMSE
# (mean_squared_error returns the MSE, so take its square root)
rmse = mean_squared_error(y_true, y_pred) ** 0.5
print(f"RMSE: {rmse:.4f}")
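
Both examples pass a window size, a prediction size, and a step to `set_sliding_window`. To show what these three values mean in general, the sketch below cuts a short series into input and target windows with plain NumPy. The helper `sliding_windows` is hypothetical and is not the library's implementation; in particular, it does not model the `set_shared_size` argument.

```python
import numpy as np


def sliding_windows(series, window_size, prediction_size, step):
    """Split a 1-D series into (input, target) windows (hypothetical helper)."""
    series = np.asarray(series)
    inputs, targets = [], []
    # Last start index that still leaves room for a full input and target.
    last_start = len(series) - window_size - prediction_size
    for start in range(0, last_start + 1, step):
        split = start + window_size
        inputs.append(series[start:split])
        targets.append(series[split:split + prediction_size])
    return np.array(inputs), np.array(targets)


# Ten time steps, 4 steps of history, 2 steps to predict, advancing by 2.
x, y = sliding_windows(np.arange(10), window_size=4, prediction_size=2, step=2)
print(x.shape, y.shape)  # (3, 4) (3, 2)
print(x[0], y[0])        # [0 1 2 3] [4 5]
```

Read this way, the values in the examples (window size 744, prediction size 24, step 1) would mean 744 past time steps as input and the next 24 as the target, moving forward one time step per window. With 1 HOUR aggregation, 744 steps span 31 days and 24 steps span one day.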