dp3.database.schema_cleaner ¶
SchemaCleaner ¶
Maintains a history of the schema defined by `model_spec` and updates the database when needed.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`db` | `Database` | Target database mapping (not just a database connection). | required |
`schema_col` | `Collection` | Collection where schema history is stored. | required |
`model_spec` | `ModelSpec` | Model specification. | required |
`log` | `Logger` | Logger instance to serve as parent. If not provided, a new logger will be created. | `None` |
Source code in dp3/database/schema_cleaner.py
get_current_schema_doc ¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`infer` | `bool` | Whether to infer the schema if it is not found in the database. | `False` |
Returns: Current schema
safe_update_schema ¶
Checks whether schema saved in database is up-to-date and updates it if necessary.
Will NOT perform any changes in master collections on conflicting model changes, but will raise an exception instead. Any such changes must be performed via the CLI.
As this method modifies the schema collection, it should be called only by the main worker.
The schema collection format is as follows:
Raises:
Type | Description |
---|---|
`ValueError` | If conflicting changes are detected. |
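The "update when safe, raise on conflict" behaviour described above can be illustrated with a minimal sketch. It is not the library's implementation: `safe_update` is a hypothetical helper that works on plain dictionaries, and it assumes that a "conflicting" change means an entity or attribute that was removed or whose stored specification differs from the configured one, while pure additions are safe.

```python
def safe_update(stored: dict, configured: dict) -> dict:
    """Return the schema that should be stored, or raise ValueError on conflicts.

    Hypothetical helper: `stored` and `configured` map
    entity name -> attribute name -> attribute specification.
    """
    if stored == configured:
        return stored  # already up to date, nothing to do

    conflicts = []
    for entity, attrs in stored.items():
        if entity not in configured:
            conflicts.append(entity)  # entity was removed from the configuration
            continue
        for attr, spec in attrs.items():
            # A removed attribute or a changed specification counts as a conflict here.
            if configured[entity].get(attr) != spec:
                conflicts.append(f"{entity}.{attr}")

    if conflicts:
        raise ValueError(f"Conflicting schema changes: {sorted(conflicts)}")
    return configured  # only additions were found, so the new schema is safe to store
```

In this sketch the caller decides what to do with the `ValueError`; per the description above, the real method leaves conflicting changes to be resolved via the CLI.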
get_schema_status ¶
Gets the current schema status.
`database_schema` is the schema document from the database.
`configuration_schema` is the schema document constructed from the current configuration.
`updates` is a dictionary of required updates to each entity.
Returns:
Type | Description |
---|---|
`tuple[dict, dict, dict, list]` | Tuple of ( |
detect_changes ¶
detect_changes(db_schema_doc: dict, current_schema: dict) -> tuple[dict[str, dict[str, dict[str, str]]], list[str]]
Detects changes between configured schema and the one saved in the database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`db_schema_doc` | `dict` | Schema document from the database. | required |
`current_schema` | `dict` | Schema from the configuration. | required |
Returns:
Type | Description |
---|---|
`tuple[dict[str, dict[str, dict[str, str]]], list[str]]` | Tuple of required updates to each entity and list of deleted entities. |
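To make the return value more concrete, here is a simplified sketch of a schema diff. It is an illustration only, not the code in `schema_cleaner.py`: `detect_changes_sketch` is a hypothetical function that takes two bare schema dictionaries (not a full schema document), and the use of MongoDB's `$unset` operator as the per-entity update is an assumption chosen to match the annotated return type.

```python
def detect_changes_sketch(db_schema: dict, current_schema: dict) -> tuple[dict, list]:
    """Compare a stored schema with the configured one.

    Returns (updates, deleted_entities), where `updates` maps each entity to an
    assumed MongoDB-style update document, for example {"$unset": {"attr": ""}}.
    """
    updates: dict[str, dict[str, dict[str, str]]] = {}
    deleted_entities: list[str] = []

    for entity, stored_attrs in db_schema.items():
        if entity not in current_schema:
            deleted_entities.append(entity)  # whole entity is gone from the configuration
            continue
        # Attributes that were removed or redefined need their stored values cleared.
        unset = {
            attr: ""
            for attr, spec in stored_attrs.items()
            if current_schema[entity].get(attr) != spec
        }
        if unset:
            updates[entity] = {"$unset": unset}

    return updates, deleted_entities


db_schema = {
    "ip": {"hostname": {"t": 1}, "legacy_flag": {"t": 1}},
    "host": {"name": {"t": 1}},
}
current_schema = {"ip": {"hostname": {"t": 1}}}
updates, deleted = detect_changes_sketch(db_schema, current_schema)
```

With these inputs, `updates` contains an entry for `ip` that unsets `legacy_flag`, and `deleted` lists `host`.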
infer_current_schema ¶
Infers schema from current database state.
Will detect the attribute type, i.e. whether it is a plain, timeseries or observations attribute.
All other properties will be set based on configuration.
Returns:
Type | Description |
---|---|
`dict` | Dictionary with inferred schema. |
construct_schema_doc ¶
Constructs schema document from current schema configuration.
The schema document format is as follows:
    {
        "entity_type": {
            "attribute": {
                "t": 1 | 2 | 4,
                "data_type": <data_type.type_info> | None,
                "timeseries_type": <timeseries_type> | None,
                "series": dict[str, <data_type.type_info>] | None
            }
        }
    }
where `t` is the attribute type and `data_type` is the data type string.
`timeseries_type` is one of `regular`, `irregular` or `irregular_intervals`.
`series` is a dictionary of series names and their data types.
`timeseries_type` and `series` are present only for timeseries attributes; `data_type` is present only for plain and observations attributes.
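The format rules above can be expressed as a small consistency check. This is a sketch under stated assumptions, not part of the library: `check_attribute` is a hypothetical helper, the entity and attribute names and the data type strings (`"string"`, `"int"`) are made up, and the pairing of particular `t` values with plain or timeseries attributes in the example document is assumed rather than taken from the source.

```python
ALLOWED_TIMESERIES_TYPES = {"regular", "irregular", "irregular_intervals"}


def check_attribute(spec: dict) -> bool:
    """Check one attribute entry against the documented schema document format."""
    if spec.get("t") not in (1, 2, 4):
        return False
    if spec.get("timeseries_type") is not None:
        # Timeseries attributes carry a timeseries type and series, but no data_type.
        return (
            spec["timeseries_type"] in ALLOWED_TIMESERIES_TYPES
            and isinstance(spec.get("series"), dict)
            and spec.get("data_type") is None
        )
    # Plain and observations attributes carry only a data_type.
    return spec.get("data_type") is not None and spec.get("series") is None


# Hypothetical schema document with one plain-style and one timeseries-style attribute.
example_doc = {
    "ip": {
        "hostname": {"t": 1, "data_type": "string", "timeseries_type": None, "series": None},
        "traffic": {
            "t": 4,
            "data_type": None,
            "timeseries_type": "regular",
            "series": {"bytes": "int"},
        },
    }
}

all_valid = all(
    check_attribute(spec) for attrs in example_doc.values() for spec in attrs.values()
)
```

The check deliberately keys off which fields are set rather than off the value of `t`, so it does not depend on the assumed numbering.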
await_updated ¶
Checks whether schema saved in database is up-to-date and awaits its update by the main worker on mismatch.
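The waiting behaviour can be pictured as a simple polling loop. The sketch below is an assumption about the general pattern, not the method's actual code: `await_condition`, its `poll_interval` and `timeout` parameters, and the `TimeoutError` are all hypothetical, and the real method may wait without any time limit.

```python
import time
from typing import Callable


def await_condition(
    is_up_to_date: Callable[[], bool],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
) -> None:
    """Block until `is_up_to_date()` returns True, polling at a fixed interval."""
    deadline = time.monotonic() + timeout
    while not is_up_to_date():
        if time.monotonic() >= deadline:
            raise TimeoutError("Schema was not updated in time.")
        time.sleep(poll_interval)  # give the main worker time to update the schema
```

A non-main worker would pass a callable that compares the stored schema with its own configured schema, and proceed only once the two match.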