Univariate forecasting

Benchmarks in this group support the most commonly used forecasting task in network management. They are divided into two groups.

Unique model for each time series

The first group targets training a separate model for each time series. The available benchmarks for this setting are:

Benchmark hash	Dataset	Aggregation	Source	Original paper
22c5a8e8ffd3	CESNET-TimeSeries24	10 MINUTES	INSTITUTIONS	None
095f847ca755	CESNET-TimeSeries24	10 MINUTES	INSTITUTION_SUBNETS	None
c2970e89d824	CESNET-TimeSeries24	10 MINUTES	IP_ADDRESSES_SAMPLE	None
ddb1f02dae43	CESNET-TimeSeries24	10 MINUTES	IP_ADDRESSES_FULL	None
871f5972109e	CESNET-TimeSeries24	1 HOUR	INSTITUTIONS	None
080582bcd519	CESNET-TimeSeries24	1 HOUR	INSTITUTION_SUBNETS	None
f3fc14310e2e	CESNET-TimeSeries24	1 HOUR	IP_ADDRESSES_SAMPLE	None
e268fa9957f2	CESNET-TimeSeries24	1 HOUR	IP_ADDRESSES_FULL	None
0d523e69c328	CESNET-TimeSeries24	1 DAY	INSTITUTIONS	None
8e2a07fb3177	CESNET-TimeSeries24	1 DAY	INSTITUTION_SUBNETS	None
b5e5ea044b81	CESNET-TimeSeries24	1 DAY	IP_ADDRESSES_SAMPLE	None
d19ba386743f	CESNET-TimeSeries24	1 DAY	IP_ADDRESSES_FULL	None

We encourage users to adjust the default value for missing data, the filler, the scaler, the sliding window step, and the batch sizes. The remaining arguments must not be changed. These benchmarks are used as follows:

import numpy as np
from cesnet_tszoo.benchmarks import load_benchmark
from cesnet_tszoo.utils.enums import FillerType, ScalerType
from sklearn.metrics import mean_squared_error

benchmark = load_benchmark("871f5972109e", "../")
dataset = benchmark.get_initialized_dataset()

# (optional) Set default value for missing data 
dataset.set_default_values(0)

# (optional) Set filler for filling missing data 
dataset.apply_filler(FillerType.MEAN_FILLER)

# (optional) Set scaler for data
dataset.apply_scaler(ScalerType.MIN_MAX_SCALER)

# (optional) Change sliding window setting
dataset.set_sliding_window(sliding_window_size=744, sliding_window_prediction_size=24, sliding_window_step=1, set_shared_size=744)

# (optional) Change batch sizes
dataset.set_batch_sizes(all_batch_size=32)

# Train and evaluate a separate model for each time series
results = []
for ts_id in dataset.get_data_about_set(about='train')['ts_ids']:
    # Define your own Model class that uses the dataloaders for training and prediction
    model = Model()
    model.fit(
        dataset.get_train_dataloader(ts_id), 
        dataset.get_val_dataloader(ts_id),
    )
    y_pred, y_true = model.predict(
        dataset.get_test_dataloader(ts_id), 
    )

    # Evaluate predictions, for example, with RMSE (square root of the MSE)
    rmse = mean_squared_error(y_true, y_pred) ** 0.5

    # Add the individual result to the list of all results
    results.append(rmse)

print(f"Mean RMSE: {np.mean(results):.4f}")
print(f"Std RMSE: {np.std(results):.4f}")
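
The example above leaves `Model` undefined. As a rough illustration only, the sketch below shows one way a class with the same `fit`/`predict` interface could look: a naive baseline that repeats the last observed value of each window. It is not part of cesnet_tszoo. It assumes that every batch is an `(inputs, targets)` pair of 2-D arrays shaped `(batch, window)` and `(batch, horizon)`; the real dataloaders may yield a different layout, so adapt the indexing accordingly. The `fake_loader` list is a synthetic stand-in for a dataloader.

```python
import numpy as np


class Model:
    """Hypothetical baseline: repeats the last observed value of each window."""

    def fit(self, train_loader, val_loader=None):
        # A persistence forecast has no parameters, so nothing is learned here.
        return self

    def predict(self, loader):
        preds, trues = [], []
        for inputs, targets in loader:
            # Assumed batch layout: inputs (batch, window), targets (batch, horizon).
            inputs = np.asarray(inputs, dtype=float)
            targets = np.asarray(targets, dtype=float)
            last = inputs[:, -1:]  # last observed value, shape (batch, 1)
            preds.append(np.repeat(last, targets.shape[1], axis=1))
            trues.append(targets)
        return np.concatenate(preds), np.concatenate(trues)


# Synthetic stand-in for a dataloader: two batches of (inputs, targets).
fake_loader = [
    (np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]), np.array([[3.0, 4.0], [6.0, 8.0]])),
    (np.array([[7.0, 8.0, 9.0]]), np.array([[9.0, 9.0]])),
]

model = Model().fit(fake_loader)
y_pred, y_true = model.predict(fake_loader)
rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
print(f"RMSE: {rmse:.4f}")
```

A baseline of this kind is mainly useful as a reference point: a trained model should achieve a lower RMSE than simply repeating the last value.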

Generic model for multiple time series

The second group targets training a single generic model that learns general patterns from several time series and can then forecast other time series. The available benchmarks for training a generic model are:

Benchmark hash	Dataset	Aggregation	Source	Original paper
7706f1087922	CESNET-TimeSeries24	10 MINUTES	INSTITUTIONS	None
a642915953ad	CESNET-TimeSeries24	10 MINUTES	INSTITUTION_SUBNETS	None
e3de1fc0a44e	CESNET-TimeSeries24	10 MINUTES	IP_ADDRESSES_SAMPLE	None
8b03d0d508ce	CESNET-TimeSeries24	10 MINUTES	IP_ADDRESSES_FULL	None
09de83e89e42	CESNET-TimeSeries24	1 HOUR	INSTITUTIONS	None
73a9add2c4af	CESNET-TimeSeries24	1 HOUR	INSTITUTION_SUBNETS	None
6249383544ef	CESNET-TimeSeries24	1 HOUR	IP_ADDRESSES_SAMPLE	None
b8098753b97b	CESNET-TimeSeries24	1 HOUR	IP_ADDRESSES_FULL	None
ef632e70c252	CESNET-TimeSeries24	1 DAY	INSTITUTIONS	None
ce63551ffaab	CESNET-TimeSeries24	1 DAY	INSTITUTION_SUBNETS	None
9f7047902d66	CESNET-TimeSeries24	1 DAY	IP_ADDRESSES_SAMPLE	None
570b215d790d	CESNET-TimeSeries24	1 DAY	IP_ADDRESSES_FULL	None

We encourage users to adjust the default value for missing data, the filler, the scaler, the sliding window step, and the batch sizes. The remaining arguments must not be changed. These benchmarks are used as follows:

from cesnet_tszoo.benchmarks import load_benchmark
from cesnet_tszoo.utils.enums import FillerType, ScalerType
from sklearn.metrics import mean_squared_error

benchmark = load_benchmark("09de83e89e42", "../")
dataset = benchmark.get_initialized_dataset()

# (optional) Set default value for missing data 
dataset.set_default_values(0)

# (optional) Set filler for filling missing data 
dataset.apply_filler(FillerType.MEAN_FILLER)

# (optional) Set scaler for data
dataset.apply_scaler(ScalerType.MIN_MAX_SCALER, create_scaler_per_time_series=False)

# (optional) Change sliding window setting
dataset.set_sliding_window(sliding_window_size=744, sliding_window_prediction_size=24, sliding_window_step=1, set_shared_size=744)

# (optional) Change batch sizes
dataset.set_batch_sizes(all_batch_size=32)

# Train your own model (define the Model class yourself)
model = Model()
model.fit(
    dataset.get_train_dataloader(), 
    dataset.get_val_dataloader(),
)

# Predict for time series whose data were not used in training
y_pred, y_true = model.predict(
    dataset.get_test_other_dataloader(), 
)

# Evaluate predictions, for example, with RMSE (square root of the MSE)
rmse = mean_squared_error(y_true, y_pred) ** 0.5
print(f"RMSE: {rmse:.4f}")
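
The `set_sliding_window` calls in the examples above use a window of 744 inputs and 24 predicted values, which at 1 HOUR aggregation corresponds to 31 days of history and a forecast of the next day. The sketch below illustrates the general sliding-window technique on a toy series. It is not the library's implementation, and `sliding_windows` is a hypothetical helper whose parameter names only mirror the arguments shown above.

```python
import numpy as np


def sliding_windows(series, window_size, prediction_size, step=1):
    """Split a 1-D series into (input, target) pairs with a sliding window."""
    series = np.asarray(series, dtype=float)
    inputs, targets = [], []
    # Last start index that still leaves room for a full input and target.
    last_start = len(series) - window_size - prediction_size
    for start in range(0, last_start + 1, step):
        split = start + window_size
        inputs.append(series[start:split])
        targets.append(series[split:split + prediction_size])
    if not inputs:
        # Series too short for even one window.
        return np.empty((0, window_size)), np.empty((0, prediction_size))
    return np.stack(inputs), np.stack(targets)


# Toy series of 10 points: 4 inputs, 2 targets, step 2.
x, y = sliding_windows(np.arange(10), window_size=4, prediction_size=2, step=2)
print(x.shape, y.shape)
```

A smaller step produces more, heavily overlapping training samples from the same series, while a larger step reduces both the overlap and the number of samples.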