This content is dedicated to our next version. It is work in progress: its content will evolve until the new version is released.

Before that time, it cannot be considered as official.

Purge Tool

For Subscription editions only.
This tool is not available publicly. Use Bonita Customer Service Center to download it.

Bonita Purge Tool purges finished (archived) process instances from a Bonita Runtime environment.

By default, all archives are preserved forever in Bonita Runtime. If your functional context allows you to remove old, unused process instances (for example, if you only need to keep a history of the last 6 months), use this tool to clean up your Bonita database.

Prior to version 2.1.0, the Purge Tool did not delete documents (stored in the DOCUMENT table) from the platform. It only removed the mapping between the archived cases and the documents themselves. If you need to reduce the size of the DOCUMENT table in the engine database, consider using the delete-orphan-document-content command to remove all documents that are no longer linked to any process instance. From version 2.1.0 onward, the Purge Tool purges documents along with archived process instances by default, unless the --preserve-document-content option is explicitly set.

Pre-requisites

This tool requires a Java 17+ runtime environment to run.
This tool can be run on a Bonita Runtime environment in version 7.7.0 or later.
We recommend running the tool from a different machine than the one hosting the Bonita database. That machine requires network access to the database.

When deleting archives using the copy-truncate mode, the Bonita runtime connected to the database must be shut down when running this tool.

Configuration

Once you have downloaded the tool, unzip it and go to its main directory.
Enter your database configuration properties in the application.properties file.

Configuration example
# Database configuration
database.vendor=postgres
database.name=bonita
database.username=db_user
database.password=secret
database.host=localhost
database.port=5432

It is also possible to use environment variables to set these properties.

Environment variables example
DATABASE_PASSWORD=secret ./bin/bonita-purge-tool list --older-than 6m

Run Bonita Purge Tool

The Bonita Purge Tool is a command-line tool.
To run it, open a terminal and go to the directory where you unzipped the tool.

View the usage information by running the following command:

  • Unix

    ./bin/bonita-purge-tool --help

  • Windows

    ./bin/bonita-purge-tool.bat --help

list command

This command lists all existing process definitions that have archived root process instances matching a given date filter.

List process definitions with archived root process instances older than 6 months
  • Unix

    ./bin/bonita-purge-tool list --older-than 6m

  • Windows

    ./bin/bonita-purge-tool.bat list --older-than 6m

delete command

This command deletes archived process instances and their related archived elements (flownodes, data, comments, etc.) for a given date filter.

You can use either the --older-than or --before-date option to define the date filter:

  • --older-than to delete all archived elements older than a specified period of time

  • --before-date to delete all archived elements before a specified date

The command also accepts optional filters, such as:

  • --process-definition-id to delete only the archived elements of a specific process definition

  • --tenant-id to delete only the archived elements of a specific tenant, if your platform is a multi-tenant architecture (as a reminder, this feature has been removed from Bonita 2023.1)

Examples:

Delete all archived process instances older than 6 months
  • Unix

    ./bin/bonita-purge-tool delete --older-than 6m

  • Windows

    ./bin/bonita-purge-tool.bat delete --older-than 6m

Delete archived process instances older than 6 months for a specific process definition
  • Unix

    ./bin/bonita-purge-tool delete --older-than 6m --process-definition-id 1234567890

  • Windows

    ./bin/bonita-purge-tool.bat delete --older-than 6m --process-definition-id 1234567890

Delete archived process instances before 5 July 2022
  • Unix

    ./bin/bonita-purge-tool delete --before-date 1656979200000

  • Windows

    ./bin/bonita-purge-tool.bat delete --before-date 1656979200000

The --before-date value must be expressed in milliseconds since the Unix epoch. In this example, 1656979200000 corresponds to 5 July 2022 at 00:00:00 UTC.
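
Computing the millisecond timestamp by hand is error-prone. The following is a minimal sketch of a shell helper that converts a calendar date to milliseconds since the epoch. It assumes GNU date (the -d option behaves differently in BSD/macOS date) and interprets the date as midnight UTC. The function name to_epoch_millis is hypothetical and is not part of the Purge Tool.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Purge Tool.
# Prints the number of milliseconds since the Unix epoch for the given
# date (YYYY-MM-DD), taken at 00:00:00 UTC. Assumes GNU date.
to_epoch_millis() {
  # -u forces UTC so the result does not depend on the local time zone
  seconds=$(date -u -d "$1 00:00:00" +%s) || return 1
  # The tool expects milliseconds, date returns seconds
  echo "$((seconds * 1000))"
}
```

With this helper, to_epoch_millis 2022-07-05 prints 1656979200000, so the Unix example above could also be written as ./bin/bonita-purge-tool delete --before-date "$(to_epoch_millis 2022-07-05)".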

Delete modes

The delete command supports two modes: batch-delete and copy-truncate.

batch-delete

The default mode is batch-delete. It can be used while the Bonita runtime is still running, and can be stopped and resumed at any time.
We recommend using this mode to delete small to medium-sized data volumes on a regular basis, for example with cron jobs scheduled during off-peak hours.

In this mode, the rows of each table to purge are deleted in batches, and each batch is committed in the database.
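
As an illustration of the scheduled approach, here is a minimal sketch of a wrapper function that a cron job could call. The installation path /opt/bonita-purge-tool, the PURGE_TOOL_HOME variable, the function name, the script path, the log path and the schedule are all assumptions made for this example. Only the delete command and its --older-than option come from the tool itself.

```shell
#!/bin/sh
# Hypothetical wrapper, not shipped with the Purge Tool.
# PURGE_TOOL_HOME is an assumed variable pointing to the unzipped tool.
PURGE_TOOL_HOME="${PURGE_TOOL_HOME:-/opt/bonita-purge-tool}"

# Runs the delete command in its default mode (batch-delete).
# $1: retention period passed to --older-than (defaults to 6m).
run_scheduled_purge() {
  # The subshell keeps the caller's working directory unchanged
  ( cd "$PURGE_TOOL_HOME" && ./bin/bonita-purge-tool delete --older-than "${1:-6m}" )
}

# Example crontab entry (assumed script and log paths), every day at 02:00:
# 0 2 * * * /opt/scripts/purge.sh >> /var/log/bonita-purge.log 2>&1
```

A script built on this sketch would end with a call such as run_scheduled_purge 6m. Because batch-delete can be stopped and resumed, an interrupted nightly run is simply completed by the next one.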

copy-truncate

The copy-truncate mode is more efficient for large data volumes, especially when the number of rows to delete is higher than the number of rows to keep in the tables, but it requires the Bonita runtime to be stopped during the operation. We also advise against interrupting the execution once it has started.

In this mode, the rows to keep are copied in batches to a temporary table, then the original table is truncated and dropped, and the temporary table is renamed to the original table. All required constraints and indexes are re-created.

Fine-tuning the deletion

As mentioned before, both modes use batch processing to either delete or copy rows.
The batch processing can be fine-tuned using the --batch-size and --batch-interval options:

  • --batch-size to define the number of rows to delete or copy in each batch.

    • With the batch-delete mode: although it depends on your database configuration and the volume of data to delete, we recommend starting with the default batch size of 5000 rows. You can increase this value based on the performance of your database.

    • With the copy-truncate mode: since the Bonita runtime is stopped when using this mode, you can use a much larger batch size to reduce the number of batches, even down to a single batch, which could improve performance. As a reminder, the main criterion with this mode is the number of rows to keep in the tables.

  • --batch-interval to define the time in milliseconds to wait between each batch.
    This option exists because batch requests may slow down or get blocked when they are executed in quick succession. Adjust this interval if you face this issue, but keep in mind that it significantly increases the execution time when the number of batches is large.

When using the copy-truncate mode, some requests may be locked by the database engine when computing indexes after their re-creation. To prevent this, use the --delete-interval option to set a waiting interval between each table deletion operation.
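
To get a feel for how --batch-size and --batch-interval interact, the following sketch estimates the number of batches and the cumulated waiting time for a single table. The helper name estimate_batches is hypothetical, and the calculation assumes that the tool waits between consecutive batches only, not after the last one. It ignores the time spent executing the batches themselves.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Purge Tool.
# $1: rows to delete or copy, $2: --batch-size, $3: --batch-interval (ms)
# Assumption: the tool pauses between consecutive batches only.
estimate_batches() {
  rows=$1; size=$2; interval=$3
  # Ceiling division: a partially filled last batch still counts as a batch
  batches=$(( (rows + size - 1) / size ))
  # One pause between each pair of consecutive batches
  pauses=$(( batches > 1 ? batches - 1 : 0 ))
  echo "$batches batches, $(( pauses * interval )) ms of waiting"
}
```

For instance, estimate_batches 1000000 5000 500 prints "200 batches, 99500 ms of waiting", while raising the batch size to 100000 gives "10 batches, 4500 ms of waiting". Assuming these options take their value in the same way as the other options, a matching invocation would look like ./bin/bonita-purge-tool delete --older-than 6m --batch-size 5000 --batch-interval 500.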

delete-fileinput-content command

Delete all archived contract file input values.
In other words, delete all rows in table arch_contract_data corresponding to contract data of type File (in Studio) or org.bonitasoft.engine.bpm.contract.FileInputValue in Bonita Engine.
These data are not used by Bonita and can take a large amount of space in your database, so deleting them is advised.

From Bonita 10.4, these data are no longer archived, so running this command is no longer necessary.

Delete all archived contract file input content
  • Unix

    ./bin/bonita-purge-tool delete-fileinput-content

  • Windows

    ./bin/bonita-purge-tool.bat delete-fileinput-content

The delete-fileinput-content command is not supported for SQLServer databases.

delete-orphan-document-content command

As mentioned above, the Purge Tool did not delete documents from the platform before version 2.1.0.
If you executed the Purge Tool before this version, you may have a large amount of orphan document content in your database.
To delete those orphan documents, use the delete-orphan-document-content command.

Deletion strategy

Keep two principles in mind to understand how this tool works:

1) The tool first deletes all archived process instances (arch_process_instance rows) targeted by the purge. It then scans the tables containing associated elements and removes all orphans.

2) Every process instance (case), whether archived or running, has at least one row in the arch_process_instance table. This is because the initial state of a process instance (stateId = 0) is archived as soon as the instance is created.

Thanks to these two facts, the tool only needs to query the arch_process_instance table to identify orphans. This is faster than querying both the process_instance and arch_process_instance tables, and it still avoids removing data that belongs to running cases. For example, once all arch_process_instance rows matching the conditions (processDefinitionId and timestamp) have been deleted, the tool deletes the orphaned arch_data_instance rows with a query that only references arch_process_instance:

DELETE FROM arch_data_instance a
WHERE a.containertype = 'PROCESS_INSTANCE'
  AND a.tenantid = 1
  AND NOT EXISTS (
    SELECT 1 FROM arch_process_instance b
    WHERE a.containerid = b.sourceobjectid AND b.tenantid = 1
  );

This strategy makes the tool more robust: it can be stopped at any time, and relaunching it continues the deletion from where it stopped. However, it also means that the time required to execute a purge is the same whether a few elements or many elements are deleted.