
Available models

All models share the following behavior. When the weights parameter is specified, pre-trained weights are downloaded and cached in the model_dir folder. The returned model is in evaluation mode.

30pktTCNET_256

An example of how to feed data into this model is provided in the multi-dataset evaluation Jupyter notebook, cross_dataset_embedding_function.ipynb.

models.model_30pktTCNET_256

model_30pktTCNET_256(weights=None, model_dir=None)

A single-modal neural network processing sequences of 30 packets and outputting 256-dimensional flow embeddings. For fine-tuning, consider using just the backbone_model attribute (an instance of Multimodal_CESNET_Enhanced) of the returned model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| weights | Optional[Model_30pktTCNET_256_Weights] | If provided, the model will be initialized with these weights. | None |
| model_dir | Optional[str] | If weights are provided, this folder will be used to store the weights. | None |
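
A minimal usage sketch for the fine-tuning hint above. It assumes the cesnet_models package is installed and importable as `cesnet_models.models` (the path given for the source listing on this page). The helper name `load_backbone` is hypothetical, and it returns None when the package is missing so that the snippet still runs.

```python
import importlib.util


def load_backbone():
    """Hypothetical helper: build a randomly initialized model and return its backbone.

    Returns None when the cesnet_models package is not installed.
    """
    if importlib.util.find_spec("cesnet_models") is None:
        return None
    from cesnet_models.models import model_30pktTCNET_256

    # weights=None gives random initialization; model_dir is only used
    # when weights are provided.
    model = model_30pktTCNET_256(weights=None, model_dir=None)
    # The backbone is the part suggested for fine-tuning.
    return model.backbone_model


backbone = load_backbone()
print("cesnet_models not installed" if backbone is None else type(backbone).__name__)
```

Passing a member of Model_30pktTCNET_256_Weights as weights would load pre-trained weights instead; the available members are not listed on this page.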
Source code in cesnet_models\models.py
def model_30pktTCNET_256(weights: Optional[Model_30pktTCNET_256_Weights] = None,
                         model_dir: Optional[str] = None) -> EmbeddingModel:
    """
    A single-modal neural network processing sequences of 30 packets and outputting 256-dimensional flow embeddings.
    For fine-tuning, consider using just the `backbone_model` attribute (an instance of Multimodal_CESNET_Enhanced) of the returned model.

    Parameters:
        weights: If provided, the model will be initialized with these weights.
        model_dir: If weights are provided, this folder will be used to store the weights.
    """
    architecture_params = {
        "use_mlp_flowstats": False,
        "init_weights": True,
        "cnn_ppi_stem_type": StemType.EMBED,
        "pe_size_embedding": 20,
        "pe_size_include_dir": False,
        "pe_size_init": PacketSizeInitEnum.PLE,
        "pe_size_ple_bin_size": 100,
        "pe_ipt_processing": ProcessIPT.EMBED,
        "pe_ipt_embedding": 10,
        "pe_onehot_dirs": True,
        "conv_normalization": NormalizationEnum.BATCH_NORM,
        "linear_normalization": NormalizationEnum.BATCH_NORM,
        "cnn_ppi_channels": [192, 256, 384, 448],
        "cnn_ppi_strides": [1, 1, 1, 1],
        "cnn_ppi_kernel_sizes": [7, 7, 5, 3],
        "cnn_ppi_use_stdconv": False,
        "cnn_ppi_downsample_avg": True,
        "cnn_ppi_blocks_dropout": 0.3,
        "cnn_ppi_first_bottle_ratio": 0.25,
        "cnn_ppi_global_pool": GlobalPoolEnum.GEM_3_LEARNABLE,
        "cnn_ppi_global_pool_act": False,
        "cnn_ppi_global_pool_dropout": 0.0,
        "use_mlp_shared": True,
        "mlp_shared_size": 448,
        "mlp_shared_dropout": 0.0
    }
    embedding_size = 256

    backbone_model = Multimodal_CESNET_Enhanced(**architecture_params, save_psizes_hist=True)
    model = EmbeddingModel(backbone_model, embedding_size=embedding_size)
    if weights is not None:
        state_dict = weights.get_state_dict(model_dir=model_dir)
        state_dict.pop("arcface_module.W", None)
        model.load_state_dict(state_dict)
        model.eval()
    return model

Multi-modal models

When the weights parameter is not specified, the model will be initialized with random weights and the following arguments become required:

  • num_classes - the number of classes, which defines the output size of the last linear layer.
  • flowstats_input_size - the number of flow statistics features and, therefore, the input size of the first linear layer processing them.
  • ppi_input_channels - the number of channels in PPI sequences. The standard value is three for packet sizes, directions, and inter-arrival times.
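
The rule above can be sketched as a small pre-flight check. This is an illustration only, not the library's own validation: the helper name `missing_required_args` is hypothetical, and how the library reports missing arguments is not shown on this page.

```python
from typing import Any, List, Optional

# Arguments that become required when no pre-trained weights are given.
REQUIRED_WITHOUT_WEIGHTS = ("num_classes", "flowstats_input_size", "ppi_input_channels")


def missing_required_args(weights: Optional[Any] = None,
                          num_classes: Optional[int] = None,
                          flowstats_input_size: Optional[int] = None,
                          ppi_input_channels: Optional[int] = None) -> List[str]:
    """Hypothetical helper: list the arguments still missing for a randomly initialized model."""
    if weights is not None:
        # With pre-trained weights, none of the three arguments is required.
        return []
    provided = {
        "num_classes": num_classes,
        "flowstats_input_size": flowstats_input_size,
        "ppi_input_channels": ppi_input_channels,
    }
    return [name for name in REQUIRED_WITHOUT_WEIGHTS if provided[name] is None]


# flowstats_input_size was left out, so it is reported as missing.
print(missing_required_args(num_classes=10, ppi_input_channels=3))
```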

Input

Multi-modal models expect input as a tuple (batch_ppi, batch_flowstats). The shapes are:

  • batch_ppi: torch.Tensor of shape (B, ppi_input_channels, 30), where B is the batch size. PPI sequences must have a length of exactly 30.
  • batch_flowstats: torch.Tensor of shape (B, flowstats_input_size).
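
The shape requirements above can be checked before calling a model. The sketch below works on plain shape tuples so that it runs without PyTorch. The helper name `check_input_shapes` is hypothetical and not part of the library, and the flow statistics size of 43 is an arbitrary value chosen for the demonstration.

```python
from typing import Sequence

PPI_SEQUENCE_LENGTH = 30  # fixed length of PPI sequences stated above


def check_input_shapes(ppi_shape: Sequence[int],
                       flowstats_shape: Sequence[int],
                       ppi_input_channels: int,
                       flowstats_input_size: int) -> int:
    """Hypothetical helper: validate both shapes and return the batch size B."""
    if len(ppi_shape) != 3 or len(flowstats_shape) != 2:
        raise ValueError("expected batch_ppi to be 3-D and batch_flowstats to be 2-D")
    batch_size = ppi_shape[0]
    if tuple(ppi_shape[1:]) != (ppi_input_channels, PPI_SEQUENCE_LENGTH):
        raise ValueError(
            f"batch_ppi must be (B, {ppi_input_channels}, {PPI_SEQUENCE_LENGTH}), "
            f"got {tuple(ppi_shape)}"
        )
    if tuple(flowstats_shape) != (batch_size, flowstats_input_size):
        raise ValueError(
            f"batch_flowstats must be ({batch_size}, {flowstats_input_size}), "
            f"got {tuple(flowstats_shape)}"
        )
    return batch_size


# A batch of 64 flows with 3 PPI channels and 43 flow statistics features.
print(check_input_shapes((64, 3, 30), (64, 43), ppi_input_channels=3, flowstats_input_size=43))
```

With real tensors, the same check can be fed `tuple(batch_ppi.shape)` and `tuple(batch_flowstats.shape)`.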

Jupyter notebooks listed on the getting started page show how to feed data into multi-modal models.

models.mm_cesnet_v2

mm_cesnet_v2(
    weights=None,
    model_dir=None,
    num_classes=None,
    flowstats_input_size=None,
    ppi_input_channels=None,
)

This is the second version of the multimodal CESNET architecture. It was used in the "Encrypted traffic classification: the QUIC case" paper.

Changes from the first version:
  • Global pooling was added to the CNN part processing PPI sequences, instead of a simple flattening.
  • One more Conv1D layer was added to the CNN part and the number of channels was increased.
  • The size of the MLP processing flow statistics was increased.
  • The size of the MLP processing shared representations was decreased.
  • Some dropout rates were decreased.
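
To see the listed changes as concrete numbers, the sketch below compares the two configurations. The values are copied from the v1 and v2 configuration dictionaries in the source listings on this page; the two normalization entries are identical in both versions and are left out. The helper name `changed_settings` is hypothetical.

```python
# Numeric and boolean settings of mm_cesnet_v1, copied from its source listing.
V1 = {
    "cnn_ppi_num_blocks": 2,
    "cnn_ppi_channels1": 72,
    "cnn_ppi_channels2": 128,
    "cnn_ppi_channels3": 128,
    "cnn_ppi_use_pooling": False,
    "cnn_ppi_dropout_rate": 0.2,
    "mlp_flowstats_num_hidden": 2,
    "mlp_flowstats_size1": 64,
    "mlp_flowstats_size2": 32,
    "mlp_flowstats_dropout_rate": 0.2,
    "mlp_shared_num_hidden": 1,
    "mlp_shared_size": 480,
    "mlp_shared_dropout_rate": 0.2,
}

# The same settings for mm_cesnet_v2.
V2 = {
    "cnn_ppi_num_blocks": 3,
    "cnn_ppi_channels1": 200,
    "cnn_ppi_channels2": 300,
    "cnn_ppi_channels3": 300,
    "cnn_ppi_use_pooling": True,
    "cnn_ppi_dropout_rate": 0.1,
    "mlp_flowstats_num_hidden": 2,
    "mlp_flowstats_size1": 225,
    "mlp_flowstats_size2": 225,
    "mlp_flowstats_dropout_rate": 0.1,
    "mlp_shared_num_hidden": 0,
    "mlp_shared_size": 600,
    "mlp_shared_dropout_rate": 0.2,
}


def changed_settings(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}


for key, (v1_value, v2_value) in changed_settings(V1, V2).items():
    print(f"{key}: {v1_value} -> {v2_value}")
```

The two settings that stay the same are mlp_flowstats_num_hidden and mlp_shared_dropout_rate. For the shared MLP, the output shows mlp_shared_num_hidden going from 1 to 0 while mlp_shared_size goes from 480 to 600.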

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| weights | Optional[MM_CESNET_V2_Weights] | If provided, the model will be initialized with these weights. | None |
| model_dir | Optional[str] | If weights are provided, this folder will be used to store the weights. | None |
| num_classes | Optional[int] | Number of classes. | None |
| flowstats_input_size | Optional[int] | Size of the flow statistics input. | None |
| ppi_input_channels | Optional[int] | Number of channels in the PPI input. | None |
Source code in cesnet_models\models.py
def mm_cesnet_v2(weights: Optional[MM_CESNET_V2_Weights] = None,
                 model_dir: Optional[str] = None,
                 num_classes: Optional[int] = None,
                 flowstats_input_size: Optional[int] = None,
                 ppi_input_channels: Optional[int] = None,
                 ) -> Multimodal_CESNET:
    """
    This is a second version of the multimodal CESNET architecture. It was used in
    the *"Encrypted traffic classification: the QUIC case"* paper.

    Changes from the first version:
        - Global pooling was added to the CNN part processing PPI sequences, instead of a simple flattening.
        - One more Conv1D layer was added to the CNN part and the number of channels was increased.
        - The size of the MLP processing flow statistics was increased.
        - The size of the MLP processing shared representations was decreased.
        - Some dropout rates were decreased.

    Parameters:
        weights: If provided, the model will be initialized with these weights.
        model_dir: If weights are provided, this folder will be used to store the weights.
        num_classes: Number of classes.
        flowstats_input_size: Size of the flow statistics input.
        ppi_input_channels: Number of channels in the PPI input.
    """
    v2_model_configuration = {
        "conv_normalization": NormalizationEnum.BATCH_NORM,
        "linear_normalization": NormalizationEnum.BATCH_NORM,
        "cnn_ppi_num_blocks": 3,
        "cnn_ppi_channels1": 200,
        "cnn_ppi_channels2": 300,
        "cnn_ppi_channels3": 300,
        "cnn_ppi_use_pooling": True,
        "cnn_ppi_dropout_rate": 0.1,
        "mlp_flowstats_num_hidden": 2,
        "mlp_flowstats_size1": 225,
        "mlp_flowstats_size2": 225,
        "mlp_flowstats_dropout_rate": 0.1,
        "mlp_shared_num_hidden":  0,
        "mlp_shared_size": 600,
        "mlp_shared_dropout_rate": 0.2,
    }
    return _multimodal_cesnet(model_configuration=v2_model_configuration,
                              weights=weights,
                              model_dir=model_dir,
                              num_classes=num_classes,
                              flowstats_input_size=flowstats_input_size,
                              ppi_input_channels=ppi_input_channels)

models.mm_cesnet_v1

mm_cesnet_v1(
    weights=None,
    model_dir=None,
    num_classes=None,
    flowstats_input_size=None,
    ppi_input_channels=None,
)

This model was used in the "Fine-grained TLS services classification with reject option" paper.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| weights | Optional[MM_CESNET_V1_Weights] | If provided, the model will be initialized with these weights. | None |
| model_dir | Optional[str] | If weights are provided, this folder will be used to store the weights. | None |
| num_classes | Optional[int] | Number of classes. | None |
| flowstats_input_size | Optional[int] | Size of the flow statistics input. | None |
| ppi_input_channels | Optional[int] | Number of channels in the PPI input. | None |
Source code in cesnet_models\models.py
def mm_cesnet_v1(weights: Optional[MM_CESNET_V1_Weights] = None,
                 model_dir: Optional[str] = None,
                 num_classes: Optional[int] = None,
                 flowstats_input_size: Optional[int] = None,
                 ppi_input_channels: Optional[int] = None,
                 ) -> Multimodal_CESNET:
    """
    This model was used in the *"Fine-grained TLS services classification with reject option"* paper.

    Parameters:
        weights: If provided, the model will be initialized with these weights.
        model_dir: If weights are provided, this folder will be used to store the weights.
        num_classes: Number of classes.
        flowstats_input_size: Size of the flow statistics input.
        ppi_input_channels: Number of channels in the PPI input.
    """
    v1_model_configuration = {
        "conv_normalization": NormalizationEnum.BATCH_NORM,
        "linear_normalization": NormalizationEnum.BATCH_NORM,
        "cnn_ppi_num_blocks": 2,
        "cnn_ppi_channels1": 72,
        "cnn_ppi_channels2": 128,
        "cnn_ppi_channels3": 128,
        "cnn_ppi_use_pooling": False,
        "cnn_ppi_dropout_rate": 0.2,
        "mlp_flowstats_num_hidden": 2,
        "mlp_flowstats_size1": 64,
        "mlp_flowstats_size2": 32,
        "mlp_flowstats_dropout_rate": 0.2,
        "mlp_shared_num_hidden": 1,
        "mlp_shared_size": 480,
        "mlp_shared_dropout_rate": 0.2,
    }
    return _multimodal_cesnet(model_configuration=v1_model_configuration,
                              weights=weights,
                              model_dir=model_dir,
                              num_classes=num_classes,
                              flowstats_input_size=flowstats_input_size,
                              ppi_input_channels=ppi_input_channels)