History manager
The concepts of history management in DP³ are described here. The history manager is responsible for:
- Datapoint aggregation - merging identical value datapoints in master records
- Deleting old datapoints from master records
- Deleting old snapshots
- Archiving old datapoints from raw collections
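To make the first responsibility concrete, the following is a minimal sketch of merging identical-value datapoints. It is not the DP³ implementation: the `Datapoint` type, the integer timestamps, and the rule "same value and overlapping or touching intervals" are assumptions made for illustration.

```python
from typing import NamedTuple


class Datapoint(NamedTuple):
    """Hypothetical datapoint: a value valid over the interval [t1, t2]."""
    value: str
    t1: int  # validity start (illustrative integer timestamp)
    t2: int  # validity end


def aggregate(datapoints: list[Datapoint]) -> list[Datapoint]:
    """Merge datapoints that share a value and whose intervals overlap or touch."""
    merged: list[Datapoint] = []
    # Group equal values together, ordered by start time within each group.
    for dp in sorted(datapoints, key=lambda d: (d.value, d.t1)):
        last = merged[-1] if merged else None
        if last is not None and last.value == dp.value and dp.t1 <= last.t2:
            # Same value and no gap: extend the previous interval.
            merged[-1] = last._replace(t2=max(last.t2, dp.t2))
        else:
            merged.append(dp)
    # Return the result in chronological order.
    return sorted(merged, key=lambda d: d.t1)


history = [
    Datapoint("open", 0, 10),
    Datapoint("open", 10, 20),
    Datapoint("closed", 20, 30),
    Datapoint("open", 30, 40),
]
result = aggregate(history)
for dp in result:
    print(dp)
```

In this sketch the first two "open" datapoints collapse into a single interval from 0 to 20, while the later "open" datapoint stays separate because a gap lies between the intervals.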
The configuration file history_manager.yml is very simple:
aggregation_schedule: # (1)!
  minute: "*/10"
mark_datapoints_schedule: # (2)!
  hour: "7,19"
  minute: "45"
datapoint_cleaning_schedule: # (3)!
  minute: "*/30"
snapshot_cleaning:
  schedule: {minute: "15,45"} # (4)!
  older_than: 7d # (5)!
datapoint_archivation:
  schedule: {hour: 2, minute: 0} # (6)!
  older_than: 7d # (7)!
  archive_dir: "data/datapoints/" # (8)!
- (1) Parameter aggregation_schedule sets the interval at which DP³ aggregates observation datapoints in master records. This should run more often than datapoint cleaning.
- (2) Parameter mark_datapoints_schedule sets the interval at which datapoint timestamps are marked for all entities in a master collection. Schedule this very rarely, as it is a very expensive operation.
- (3) Parameter datapoint_cleaning_schedule sets the interval at which DP³ checks master records for data of observation and timeseries attributes that is too old, and removes it. What counts as "too old" is controlled by the max_age parameter in the Database entities configuration.
- (4) Parameter snapshot_cleaning.schedule sets the interval at which DP³ cleans the snapshots collection. Ideally, schedule it outside the snapshot creation window. See Snapshots configuration for more.
- (5) Parameter snapshot_cleaning.older_than sets how old a snapshot must be to be deleted.
- (6) Parameter datapoint_archivation.schedule sets the interval at which DP³ archives datapoints from raw collections.
- (7) Parameter datapoint_archivation.older_than sets how old a datapoint must be to be archived.
- (8) Parameter datapoint_archivation.archive_dir sets the directory where old datapoints are archived. If the directory doesn't exist, it will be created, but write privileges must be set correctly. It can also be set to null (or left unset) to disable archivation and only delete old data.
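As an illustration of how an older_than value such as 7d can be applied, the sketch below parses a duration string and compares a timestamp against the resulting cutoff. The helpers `parse_age` and `is_older_than` and the set of accepted unit suffixes are hypothetical; the formats DP³ actually accepts may differ.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes for this sketch only; not taken from DP³.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_age(text: str) -> timedelta:
    """Turn a string such as '7d' into a timedelta (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([smhdw])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def is_older_than(timestamp: datetime, older_than: str, now: datetime) -> bool:
    """Return True if the timestamp lies before the cutoff now - older_than."""
    return timestamp < now - parse_age(older_than)


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
print(is_older_than(datetime(2024, 1, 2, tzinfo=timezone.utc), "7d", now))  # True
print(is_older_than(datetime(2024, 1, 5, tzinfo=timezone.utc), "7d", now))  # False
```

Passing `now` explicitly keeps the check deterministic, which makes the cutoff logic easy to test.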
The schedule dictionaries are transformed to cron expressions; see the CronExpression docs for details.
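To show how the schedule dictionaries above read as cron expressions, here is a rough sketch that renders a dictionary as a classic five-field cron string. The function `to_cron` is hypothetical and not part of DP³, and filling every omitted field with `*` is an assumption; the defaults applied by the real CronExpression model may differ.

```python
# Field order of a classic five-field cron line.
FIELDS = ("minute", "hour", "day", "month", "day_of_week")


def to_cron(schedule: dict) -> str:
    """Render a schedule dict as a cron string (hypothetical helper).

    Assumption: omitted fields default to "*", i.e. "every".
    """
    unknown = set(schedule) - set(FIELDS)
    if unknown:
        raise ValueError(f"unknown schedule fields: {sorted(unknown)}")
    return " ".join(str(schedule.get(field, "*")) for field in FIELDS)


print(to_cron({"minute": "*/10"}))                # */10 * * * *
print(to_cron({"hour": "7,19", "minute": "45"}))  # 45 7,19 * * *
print(to_cron({"hour": 2, "minute": 0}))          # 0 2 * * *
```

Read this way, the example configuration aggregates every 10 minutes, marks datapoints at 7:45 and 19:45, and archives datapoints daily at 2:00.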