dp3.database.schema_cleaner ¶
SchemaCleaner ¶
SchemaCleaner(db: Database, schema_col: Collection, model_spec: ModelSpec, config: HierarchicalDict, log: Logger = None)
Maintains a history of model_spec
defined schema, updates the database when needed.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
db
|
Database
|
Target database mapping. (not just a database connection) |
required |
schema_col
|
Collection
|
Collection where schema history is stored. |
required |
model_spec
|
ModelSpec
|
Model specification. |
required |
log
|
Logger
|
Logger instance to serve as parent. If not provided, a new logger will be created. |
None
|
Source code in dp3/database/schema_cleaner.py
get_current_schema_doc ¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
infer
|
bool
|
Whether to infer the schema if it is not found in the database. |
False
|
Returns: Current schema
Source code in dp3/database/schema_cleaner.py
safe_update_schema ¶
Checks whether schema saved in database is up-to-date and updates it if necessary.
Will NOT perform any changes in master collections on conflicting model changes, but will raise an exception instead. Any such changes must be performed via the CLI.
As this method modifies the schema collection, it should be called only by the main worker.
The schema collection format is as follows:
{
"_id": int,
entity_id_types: {
<entity_type>: <entity_id_type>,
...
},
"schema": { ... }, # (1)!
"storage": {
"snapshot_bucket_size": int
},
"version": int
}
Raises:
Type | Description |
---|---|
ValueError
|
If conflicting changes are detected. |
Source code in dp3/database/schema_cleaner.py
get_schema_status ¶
Gets the current schema status.
database_schema
is the schema document from the database.
configuration_schema
is the schema document constructed from the current configuration.
updates
is a dictionary of required updates to each entity.
Returns:
Type | Description |
---|---|
tuple[dict, dict, dict, dict, list]
|
Tuple of ( |
Source code in dp3/database/schema_cleaner.py
detect_changes ¶
detect_changes(db_schema_doc: dict, current_entity_types: dict, current_schema: dict) -> tuple[dict[str, str], dict[str, dict[str, dict[str, str]]], list[str]]
Detects changes between configured schema and the one saved in the database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
db_schema_doc
|
dict
|
Schema document from the database. |
required |
current_entity_types
|
dict
|
Entity ID types from the configuration. |
required |
current_schema
|
dict
|
Schema from the configuration. |
required |
Returns:
Type | Description |
---|---|
tuple[dict[str, str], dict[str, dict[str, dict[str, str]]], list[str]]
|
Tuple of required updates to each entity and list of deleted entites. |
Source code in dp3/database/schema_cleaner.py
192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 |
|
infer_current_schema ¶
Infers schema from current database state.
Will detect the attribute type, i.e. whether it is a plain, timeseries or observations attribute.
All other properties will be set based on configuration.
Returns:
Type | Description |
---|---|
dict
|
Dictionary with inferred schema. |
Source code in dp3/database/schema_cleaner.py
307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 |
|
construct_schema_doc ¶
Constructs schema document from current schema configuration.
The schema document format is as follows:
{
"entity_type": {
"attribute": {
"t": 1 | 2 | 4,
"data_type": <data_type.type_info> | None
"timeseries_type": <timeseries_type> | None
"series": dict[str, <data_type.type_info>] | None
}
}
}
where t
is attribute type,
and data_type
is the data type string.
timeseries_type
is one of regular
, irregular
or irregular_intervals
.
series
is a dictionary of series names and their data types.
timeseries_type
and series
are present only for timeseries attributes,
data_type
is present only for plain and observations attributes.
Source code in dp3/database/schema_cleaner.py
construct_entity_types ¶
Constructs entity ID type dictionary from model specification.
Returns:
Type | Description |
---|---|
dict
|
Dictionary with entity ID types. |
Source code in dp3/database/schema_cleaner.py
await_updated ¶
Checks whether schema saved in database is up-to-date and awaits its update by the main worker on mismatch.
Source code in dp3/database/schema_cleaner.py
get_index_name_by_keys ¶
Gets index name by keys.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
collection
|
str
|
Collection name. |
required |
keys
|
dict
|
Index keys. |
required |
Returns: Index name.
Source code in dp3/database/schema_cleaner.py
update_storage ¶
Updates storage settings in schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
prev_storage
|
dict
|
Previous storage settings. |
required |
curr_storage
|
dict
|
Current storage settings. |
required |
Source code in dp3/database/schema_cleaner.py
migrate ¶
Migrates schema to the latest version.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
schema_doc
|
dict
|
Schema to migrate. |
required |
Returns: Migrated schema.
Source code in dp3/database/schema_cleaner.py
migrate_schema_2_to_3 ¶
Migrates schema from version 2 to version 3.
This migration adds the storage setting for snapshot bucket size to the schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
schema
|
dict
|
Schema to migrate. |
required |
Returns: Migrated schema.
Source code in dp3/database/schema_cleaner.py
migrate_schema_3_to_4
staticmethod
¶
Migrates schema from version 3 to version 4.
This migration adds eid_data_type specification to the schema. As all entities were previously using string as their entity ID type, this is merely an accounting change. Actual data types are to be changed by following migrations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
schema
|
dict
|
Schema to migrate. |
required |
Returns: Migrated schema.