Purge Tool
For Subscription editions only.
The Bonita Purge Tool purges finished (archived) process instances from a Bonita Runtime environment.
By default, all archives are preserved forever in Bonita Runtime. If your functional context allows you to remove old, unused process instances (for example, if you only need to keep a history of the last 6 months), use this tool to clean up your Bonita database.
Prior to version 2.1.0, the Purge Tool did not delete documents (stored in the DOCUMENT table) from the platform. It only removed the mapping between the archived cases and the documents themselves. If you need to reduce the size of the DOCUMENT table in the engine database, consider using the delete-orphan-document-content command.
Pre-requisites
This tool requires a Java 17+ runtime environment to run.
This tool can be run against a Bonita Runtime environment in version 7.7.0 or later.
We recommend running the tool from a different machine than the one hosting the Bonita database.
That machine requires network access to the database.
When deleting archives using the |
Configuration
Once you have downloaded the tool, unzip it and go to its main directory.
Enter your database configuration properties in the application.properties file:
# Database configuration
database.vendor=postgres
database.name=bonita
database.username=db_user
database.password=secret
database.host=localhost
database.port=5432
It is also possible to use environment variables to set these properties.
Run Bonita Purge Tool
The Bonita Purge Tool is a command-line tool.
To run it, open a terminal and go to the directory where you unzipped the tool.
View the usage information by running the following command:
Unix:
./bin/bonita-purge-tool --help
Windows:
./bin/bonita-purge-tool.bat --help
list command
This command lists all existing process definitions that have root process instances archived for a given date filter.
Unix:
./bin/bonita-purge-tool list --older-than 6m
Windows:
./bin/bonita-purge-tool.bat list --older-than 6m
delete command
This command deletes archived process instances and their related archived elements (flownodes, data, comments, etc.) for a given date filter.
You can use either the --older-than or the --before-date option to define the date filter:
- --older-than to delete all archived elements older than a specified period of time
- --before-date to delete all archived elements before a specified date
The command also accepts optional filtering options, such as:
- --process-definition-id to delete only the archived elements of a specific process definition
- --tenant-id to delete only the archived elements of a specific tenant, if your platform uses a multi-tenant architecture (as a reminder, this feature was removed in Bonita 2023.1)
Examples:
Unix:
./bin/bonita-purge-tool delete --older-than 6m
Windows:
./bin/bonita-purge-tool.bat delete --older-than 6m
Unix:
./bin/bonita-purge-tool delete --older-than 6m --process-definition-id 1234567890
Windows:
./bin/bonita-purge-tool.bat delete --older-than 6m --process-definition-id 1234567890
Unix:
./bin/bonita-purge-tool delete --before-date 1656979200000
Windows:
./bin/bonita-purge-tool.bat delete --before-date 1656979200000
The --before-date parameter must be in milliseconds since the epoch.
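As a sketch of how such a value can be obtained, the snippet below converts a UTC calendar date to milliseconds since the epoch. It assumes GNU date (the -d option behaves differently in BSD/macOS date). The variable names are illustrative only, and the last line prints the resulting command instead of running it.

```shell
# Cutoff date, interpreted as UTC (illustrative value).
cutoff_date="2022-07-05 00:00:00"

# GNU date: parse the date and print seconds since the epoch.
cutoff_seconds=$(date -u -d "$cutoff_date" +%s)

# The tool expects milliseconds, so multiply by 1000.
before_date_ms=$((cutoff_seconds * 1000))

# Print the command that would be run with this value.
echo "./bin/bonita-purge-tool delete --before-date $before_date_ms"
```

With the date above, the computed value is 1656979200000, the same timestamp used in the example command.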
Delete modes
The delete command supports two modes: batch-delete and copy-truncate.
batch-delete
The default mode is batch-delete. It can be used while the Bonita runtime is still running, and can be stopped and resumed at any time.
We recommend using this mode to regularly delete small to medium-sized data volumes, for example with cron jobs during off-peak hours.
In this mode, the rows of each table to purge are deleted in batches, and each batch is committed to the database.
copy-truncate
The copy-truncate mode is more efficient for large data volumes, especially when the number of rows to delete is higher than the number of rows to keep in the tables, but it requires the Bonita runtime to be stopped during the operation. We also advise against interrupting the tool while it is executing in this mode.
In this mode, the rows to keep are copied in batches to a temporary table, then the original table is truncated and dropped, and the temporary table is renamed to the original table name. All required constraints and indexes are re-created.
Fine-tuning the deletion
As mentioned before, both modes use batch processing to either delete or copy rows.
The batch processing can be fine-tuned using the --batch-size and --batch-interval options:
- --batch-size to define the number of rows to delete or copy in each batch.
  - With the batch-delete mode: although it depends on your database configuration and the volume of data to delete, we recommend starting with the default batch size of 5000 rows. You can increase this value based on the performance of your database.
  - With the copy-truncate mode: since the Bonita runtime is stopped when using this mode, you can use a much larger batch size to reduce the number of batches, possibly down to a single batch, which could improve performance. As a reminder, the main criterion with this mode is the number of rows to keep in the tables.
- --batch-interval to define the time in milliseconds to wait between each batch.
This option exists because batch requests executed in quick succession may slow down or get blocked. Adjust this interval if you face this issue, but keep in mind that it significantly increases the execution time when the number of batches is large.
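To get a feel for this trade-off, the following back-of-the-envelope sketch estimates the waiting time that a batch interval adds to a run. The row count and the interval are hypothetical values chosen for illustration, not recommendations. Only the batch size of 5000 corresponds to the default mentioned above. The estimate is an upper bound that counts one wait per batch.

```shell
rows_to_delete=1000000   # hypothetical number of rows to purge in one table
batch_size=5000          # default batch size
batch_interval_ms=500    # hypothetical wait between batches, in milliseconds

# Number of batches, rounded up.
batches=$(( (rows_to_delete + batch_size - 1) / batch_size ))

# Total waiting time added by the interval (one wait per batch).
added_wait_ms=$(( batches * batch_interval_ms ))

echo "$batches batches, about $((added_wait_ms / 1000)) s of added waiting"
```

With these numbers, 200 batches add about 100 seconds of waiting for a single table. Increasing the batch size reduces this overhead proportionally.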
When using the copy-truncate mode, some requests may be locked by the database engine when computing indexes after their re-creation.
To prevent this, use the --delete-interval option to set a waiting interval between each table deletion operation.
delete-fileinput-content command
This command deletes all archived contract file input values.
In other words, it deletes all rows in the arch_contract_data table that correspond to contract data of type File (in Studio) or org.bonitasoft.engine.bpm.contract.FileInputValue (in Bonita Engine).
This data is not used by Bonita and can take up a large amount of space in your database, so deleting it is advised.
As of Bonita 10.4, this data is no longer archived, so running this command is no longer necessary.
Unix:
./bin/bonita-purge-tool delete-fileinput-content
Windows:
./bin/bonita-purge-tool.bat delete-fileinput-content
The delete-fileinput-content command is not supported on SQL Server databases.
delete-orphan-document-content command
As mentioned above, the Purge Tool did not delete documents from the platform before version 2.1.0.
If you ran the Purge Tool before this version, you may have a large amount of orphan document content in your database.
To delete those orphan documents, use the delete-orphan-document-content command.
Deletion strategy
Keep two principles in mind to understand how this tool works:
1) The tool first deletes all archived process instances (arch_process_instance rows) that are concerned by the purge. Then the tables containing associated elements are scanned to remove all existing orphans.
2) Every process instance (case), whether archived or running, has at least one row in the arch_process_instance table. This is because the initial state (stateId = 0) of a process instance is archived as soon as the instance is created.
Thanks to these two facts, identifying orphans only requires querying the arch_process_instance table. This performs better than querying both the process_instance and arch_process_instance tables, and it avoids removing data that belongs to running cases.
For example, once all arch_process_instance rows matching the conditions (processDefinitionId and timestamp) have been deleted, the tool only needs to query the arch_process_instance table when it deletes the arch_data_instance rows:
DELETE FROM arch_data_instance a
WHERE a.containertype = 'PROCESS_INSTANCE'
AND a.tenantid = 1
AND NOT EXISTS (
SELECT 1 FROM arch_process_instance b
WHERE a.containerid = b.sourceobjectid AND b.tenantid = 1
);
This strategy makes the tool more robust: it can be stopped at any time, and relaunching it continues the deletion from where it stopped. However, it also means that a purge takes the same amount of time whether it deletes a few elements or many.