Purge Tool
Bonita Purge Tool provides the capability to purge finished (archived) process instances from Bonita Runtime environment.
By default, all archives are preserved forever in Bonita runtime, but if your functional context allows you to remove old unused process instances (for example, if you only need to keep a history of last 6 months), use this tool to clean up your Bonita database.
The purge tool doesn’t delete documents (stored in the DOCUMENT table) from the platform. It will only remove the mapping between the archived cases and the document itself. If you need to reduce the size of the Document table in the engine database, please refer to the documentation: Delete documents of archived cases based on archive date |
For Subscription editions only. |
Pre-requisites
This tool can be run on a Bonita runtime environment in a version greater than or equal to 7.7.0.
Bonita runtime environment should be shut down when running this tool, i.e. Bonita server should be stopped.
Configuration
Once downloaded Bonita Purge Tool, unzip it somewhere and go into the main directory.
Enter your database configuration properties in the file application.properties
Run Bonita Purge Tool
This command will delete all archived process instances belonging to the process identified by PROCESS_DEFINITION_ID, that are finished since at least OLDEST_DATE_TIMESTAMP.
example (Unix):
# If no PROCESS_DEFINITION_ID parameter is given, the program lists all existing process definitions and stops:
bin/bonita-purge-tool
bin/bonita-purge-tool <PROCESS_DEFINITION_ID> <OLDEST_DATE_TIMESTAMP> [<TENANT_ID>]
example (Windows):
# If no PROCESS_DEFINITION_ID parameter is given, the program lists all existing process definitions and stops:
bin/bonita-purge-tool.bat
bin/bonita-purge-tool.bat <PROCESS_DEFINITION_ID> <OLDEST_DATE_TIMESTAMP> [<TENANT_ID>]`
An optional TENANT_ID parameter can be given if the platform uses multiple tenants to specify on which tenant should the process instances be deleted. If multi-tenancy is used and if the TENANT_ID is not set, an error is issued, and the program stops.
OLDEST_DATE_TIMESTAMP must be a Timestamp (in milliseconds) from which all process instances that finished before that date will be deleted.
Unfinished process instances and process instances that finished after that date will not be affected.
Its format is a standard Java timestamp since EPOCH (in milliseconds).
You can use websites such as Epoch Converter to format such a timestamp.
Deletion strategy
You need to have in mind 2 precepts to understand how this tool works:
1) This tool will first delete all archived process instances (arch_process_instance
rows) that are concerned by this purge.
Then the tables containing associated elements will be scanned to remove all existing orphans.
2) All archived and running process instances (cases) will have at least one row in arch_process_instance table. This is due to the first initializing state (stateId = 0) of the process instance that is archived as soon as it is created.
Thanks to these facts, to identify the orphans we only need to query the arch_process_instance, which is more performant than querying
both process_instance
and arch_process_instance
tables while we avoid removing data from running cases.
For example, once all arch_process_instance
rows matching the conditions (processDefinitionId and timestamp) have been deleted
and when the tool deletes the arch_data_instance
rows, the tool only needs to query the arch_process_instance
table.
DELETE FROM ARCH_DATA_INSTANCE a WHERE
a.CONTAINERTYPE = 'PROCESS_INSTANCE'
AND a.tenantId = 1
AND NOT EXISTS (
SELECT id FROM ARCH_PROCESS_INSTANCE b
WHERE a.CONTAINERID = b.SOURCEOBJECTID
AND b.tenantId = 1);
This strategy allows this tool to be more robust, it can be stopped at any given time, relaunching it will continue the deletion from where it stopped. However, this means that the time required to execute a purge will be the same when deleting a few elements or a lot of elements.
Database deletion volume testing reference
This reference page provides the tests run on all supported databases.