This Python script interacts with the Ansible Automation Platform (formerly Ansible Tower) API or AWX to manage and clean up job history. It allows you to check the overall job history duration and perform a cleanup operation to remove old job data, helping to maintain database performance and reduce storage consumption. The script is interactive, guiding the user through the process.
Depending on the size of your organization and how often your jobs run, if you set the standard Cleanup Job to retain many days' worth of data (30+ days), the cleanup job can fail when launched from the UI, because deleting a large amount of historical data takes too long to complete. You'll even notice that the running cleanup job displays no job output when you open it. The session either times out or the job hits a segfault somewhere along the way. The same is true if you launch it via the API.
I've found a sweet spot (YMMV): deleting 3 days' worth of historical data at a time runs a lot smoother and lets the job consolidate the historical data. However, sitting in front of the UI to delete 3 days' worth of data at a time is time-consuming and unproductive in its own way. It's like watching paint dry before you can apply the second coat.
Therefore, I wrote this Python program, which launches the cleanup job via the API. It loops through the days you'd like to clean up in batches of 3 days at a time.
If you'd like to keep 7 days' worth of data but there are currently 120 days' worth, specify 7 as your threshold and the script deletes 3 days' worth at a time until only 7 days remain.
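To make the batching arithmetic concrete, here is a minimal sketch of how the retention value could step down from the current history to the threshold. The function name `plan_retention_steps` is hypothetical and not taken from the script; it only models the 3-days-at-a-time idea described above.

```python
def plan_retention_steps(current_days, retain_days, batch_days=3):
    """Return the retention value used for each successive cleanup run."""
    steps = []
    remaining = current_days
    while remaining > retain_days:
        # Never go below the requested threshold on the final batch.
        remaining = max(retain_days, remaining - batch_days)
        steps.append(remaining)
    return steps


if __name__ == "__main__":
    steps = plan_retention_steps(current_days=120, retain_days=7)
    print(f"{len(steps)} cleanup runs: {steps[0]}, {steps[1]}, ... {steps[-2]}, {steps[-1]}")
```

Under these assumptions, going from 120 days to 7 takes 38 runs (117, 114, ... 9, 7), which is why automating the loop beats clicking through the UI.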
The program is interactive. Below are the details of the program itself.
The script performs the following key functions:
- Checks for Running Cleanup Jobs: Before initiating a cleanup, it verifies whether another cleanup job (based on the `Cleanup Job Details` system job template) is already running. This prevents concurrent cleanup operations that could lead to conflicts.
- Retrieves Overall Job History: Determines the total number of days of job history currently stored in the Ansible database. This includes both `system_jobs` and `jobs`.
- Lists Remaining Job Dates: Provides a sorted list of unique dates for which job history exists.
- Executes Job Cleanup: Launches the `Cleanup Job Details` system job template with a specified retention period (in days). The script handles launching the job, monitoring its progress, and reporting its status. It iteratively removes job history, in 3-day chunks, until the total days of history is equal to or less than the desired retention days.
- User Interaction: The script provides a simple text-based menu to guide the user through the available options. It prompts for necessary information, such as API credentials and the desired retention period, and provides clear feedback on the progress and results of the operations.
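The history and date features above boil down to parsing each job's finished timestamp and comparing the oldest one with the current time. The sketch below illustrates that idea on a plain list of timestamp strings. The two timestamp layouts and all function names are assumptions for illustration and may not match the script's actual code.

```python
from datetime import datetime, timezone

# Assumed layouts of the "finished" field (with and without microseconds).
FINISHED_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_finished(value):
    """Parse a finished timestamp string into an aware UTC datetime."""
    for fmt in FINISHED_FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized timestamp format: {value!r}")


def overall_history_days(finished_values, now=None):
    """Days between the oldest finished job and now (0 if there are no jobs)."""
    now = now or datetime.now(timezone.utc)
    parsed = [parse_finished(v) for v in finished_values if v]
    if not parsed:
        return 0
    return (now - min(parsed)).days


def remaining_job_dates(finished_values):
    """Sorted list of unique calendar dates that still have finished jobs."""
    return sorted({parse_finished(v).date() for v in finished_values if v})


if __name__ == "__main__":
    finished = ["2024-01-15T10:30:45.123456Z", "2024-01-01T00:00:00Z", None]
    now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
    print(overall_history_days(finished, now))
    print(remaining_job_dates(finished))
```

Jobs that have not finished yet (a missing or empty timestamp) are skipped, so only completed history counts toward the total.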
The script is organized into several functions:
- `check_running_cleanup_job()`: Checks if a cleanup job is already in progress. It queries the `/api/v2/system_jobs/` endpoint, filtering by the `TEMPLATE_ID` (which is hardcoded to `1`, representing the "Cleanup Job Details" system job template), and checks for jobs with active statuses ("new", "pending", "waiting", "running").
- `get_launch_endpoint()`: Constructs the API endpoint URL for launching the cleanup job template. This is based on the user's input, which also includes `username` and `password`.
- `get_overall_history_days()`: Calculates the total number of days of job history by querying both the `/api/v2/system_jobs/` and `/api/v2/jobs/` endpoints. It finds the oldest finished job and calculates the difference between its finish time and the current time. Handles multiple datetime formats.
- `get_remaining_job_dates()`: Gets a sorted list of unique dates (as `datetime.date` objects) for which finished job data exists. This helps visualize the remaining data after cleanup.
- `run_cleanup_job(retain_days)`: Launches the cleanup job via a POST request to the `/launch/` endpoint of the cleanup job template. It sends the retention period as an extra variable (`days`).
- `wait_for_job(job_id, poll_interval=5)`: Monitors the status of a launched job by periodically querying the `/api/v2/system_jobs/{job_id}/` endpoint. It waits until the job reaches a final state ("successful", "failed", or "error"). Includes error handling for potential JSON decoding issues.
- `main()`: The main function that handles user interaction, prompts for input, calls the other functions, and manages the overall script flow.
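As a rough illustration of the running-job check and the launch request described in the list above, the sketch below works on plain dictionaries so it can be read without a live API. The payload shape (a `results` list whose entries carry `id`, `status`, and `system_job_template`), the `extra_vars` request body, and both function names are assumptions; check them against your instance's API browser before relying on them.

```python
ACTIVE_STATUSES = {"new", "pending", "waiting", "running"}
TEMPLATE_ID = 1  # "Cleanup Job Details" system job template, per the list above


def active_cleanup_jobs(system_jobs_payload, template_id=TEMPLATE_ID):
    """Return IDs of cleanup jobs in the payload that have not finished yet."""
    return [
        job["id"]
        for job in system_jobs_payload.get("results", [])
        if job.get("system_job_template") == template_id
        and job.get("status") in ACTIVE_STATUSES
    ]


def build_launch_request(base_url, retain_days, template_id=TEMPLATE_ID):
    """Return the (url, json_body) pair for launching the cleanup template."""
    url = f"{base_url.rstrip('/')}/api/v2/system_job_templates/{template_id}/launch/"
    # Assumed body layout: the retention period travels as the "days" extra variable.
    body = {"extra_vars": {"days": retain_days}}
    return url, body


if __name__ == "__main__":
    sample = {
        "results": [
            {"id": 41, "status": "successful", "system_job_template": 1},
            {"id": 42, "status": "running", "system_job_template": 1},
            {"id": 43, "status": "running", "system_job_template": 2},
        ]
    }
    print(active_cleanup_jobs(sample))
    print(build_launch_request("https://your-ansible-instance.tld/", 117))
```

Keeping URL and body construction separate from the HTTP call makes it easy to inspect exactly what would be sent before any data is deleted.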
Before running the script, you need the following:
- Python 3: The script is written in Python 3 and requires a compatible interpreter.
- `requests` library: This library is used for making HTTP requests to the Ansible API. Install it using pip: `pip install requests`
- `urllib3` library: This library is installed as a dependency of `requests`, so you usually don't have to install it separately. The script uses it to disable SSL warnings.
- Ansible Automation Platform Access:
  - You need a user account with sufficient permissions to access the Ansible API. The user must be able to:
    - Read job details (`/api/v2/system_jobs/`, `/api/v2/jobs/`).
    - Launch the "Cleanup Job Details" system job template (`/api/v2/system_job_templates/1/launch/`). This typically requires administrator privileges.
  - The API base URL of your Ansible Automation Platform instance.
- Save the script: Save the provided Python code to a file, for example, `cleanup_jobs.py`.
- Run the script: Execute the script from your terminal: `python cleanup_jobs.py`
- Follow the prompts: The script will guide you through the following steps:
  - Enter API base URL: Provide the base URL of your Ansible Automation Platform API (e.g., `https://your-ansible-instance.tld`).
  - Enter username: Enter your Ansible Automation Platform username.
  - Enter password: Enter your Ansible Automation Platform password.
  - Choose an option:
    - 1) Check overall job history days (all jobs): This option displays the total number of days of job history currently present and lists the unique dates for which job records exist.
    - 2) Perform job cleanup: This option initiates the job cleanup process. You'll be prompted to enter the desired retention period (in days). The script will confirm the retention period and then proceed to launch the cleanup job, displaying progress updates.
    - 3) Exit script: This option terminates the script.
- Job Cleanup Process: If you select the job cleanup option, the process runs as follows:
  - Running Job Check: The script first checks whether a cleanup job is already running.
  - Retention Input & Confirmation: The script prompts for the retention period and asks for explicit confirmation before proceeding.
  - Iterative Deletion: Cleanup is performed in batches of up to 3 days at a time. This significantly improves performance and reduces the risk of timeouts or errors when dealing with very large job histories.
  - Progress Updates: Detailed progress information is displayed, including the current history, the number of days being deleted in each iteration, the time taken for each iteration, and the remaining history after each iteration.
  - Abort Handling: If a job launch fails or a cleanup job ends with a non-successful status, the script aborts the cleanup loop and returns to the main menu.
  - Post-Cleanup Summary: After the cleanup completes (or is aborted), a summary is provided, including the total cleanup time, the final retained history, and the remaining job dates.
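The cleanup process above can be sketched as a loop that launches one batch, waits for a final status, and re-checks the remaining history. This is a simplified model, not the script's actual code: the API calls are passed in as plain callables (`get_status`, `get_history_days`, `run_cleanup`, all hypothetical names), and the "no progress" guard is an added assumption to avoid looping forever.

```python
import time

FINAL_STATES = {"successful", "failed", "error"}


def wait_for_final_status(get_status, poll_interval=5, sleep=time.sleep):
    """Poll get_status() until it returns one of the final states."""
    while True:
        status = get_status()
        if status in FINAL_STATES:
            return status
        sleep(poll_interval)


def cleanup_until(retain_days, get_history_days, run_cleanup, batch_days=3):
    """Delete history in batches; return True if the threshold was reached."""
    history = get_history_days()
    while history > retain_days:
        target = max(retain_days, history - batch_days)
        started = time.monotonic()
        status = run_cleanup(target)  # launches one job and waits for it
        elapsed = time.monotonic() - started
        if status != "successful":
            print(f"Cleanup to {target} days ended with status {status!r}; aborting.")
            return False
        remaining = get_history_days()
        print(f"{history} -> {remaining} days in {elapsed:.1f}s")
        if remaining >= history:
            # Assumed safety guard: stop rather than loop forever without progress.
            print("No progress made; aborting.")
            return False
        history = remaining
    return True
```

Passing the API calls in as callables keeps the loop testable with fakes, for example a `run_cleanup` that simply lowers a counter and returns "successful".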
- Security: The script uses `verify=False` in the `requests.get` and `requests.post` calls, which disables SSL certificate verification. This is for testing purposes only and should not be used in production environments. In a production setting, configure SSL certificate verification properly, typically by providing the path to your CA bundle or certificate, to ensure secure communication with the API. If you are working in a secured environment and connectivity is local (for example, executing the script from a control node or within the same network), you can do without SSL verification, but it isn't advised. If the traffic crosses the Internet, I highly suggest enabling SSL verification.

  To re-enable SSL verification, change `verify=False` to `verify=True` (or to the path of your CA bundle) in those calls. Line 9 of the script only suppresses the warning that `verify=False` triggers; commenting it out brings the warning back but does not by itself enable verification:

      8 │ # Disable InsecureRequestWarning when using verify=False (for testing only)
      9 │ urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

- Error Handling: The script includes basic error handling for API requests and JSON decoding. However, for production use, more robust error handling and logging might be desirable.
- Permissions: Ensure the user account you use has the necessary permissions to read job details and launch the "Cleanup Job Details" system job template.
- System Job Template ID: This script uses the ID of 1, which by default is the "Cleanup Job Details" system job template. If this ID is different in your environment, edit the script to use the appropriate ID.
- Job Timeouts: Long-running cleanup jobs might encounter timeouts. The iterative deletion approach mitigates this, but you may need to adjust the `poll_interval` in the `wait_for_job` function if you experience issues.
- API Rate Limiting: Be aware of potential API rate limits imposed by your Ansible Automation Platform instance. The script includes some delays, but excessive use might still trigger rate limiting.
- Testing: Before performing a large-scale cleanup in a production environment, it's highly recommended to test the script thoroughly in a development or staging environment.
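For the security point above, the `verify` argument of `requests.get` and `requests.post` accepts `True`, `False`, or a path to a CA bundle. The small helper below is a hypothetical illustration (not part of the script) of choosing that value in one place; the bundle path is a placeholder.

```python
def verify_setting(ca_bundle=None, insecure=False):
    """Value for the `verify=` argument of requests.get / requests.post.

    True  -> verify against the default trusted CAs
    str   -> verify against the given CA bundle file
    False -> skip verification (testing only)
    """
    if insecure:
        return False
    return ca_bundle if ca_bundle else True


if __name__ == "__main__":
    # Example call shape (placeholder path; the request itself is not executed here):
    # requests.get(url, auth=(username, password),
    #              verify=verify_setting("/path/to/ca-bundle.pem"))
    print(verify_setting())
    print(verify_setting("/path/to/ca-bundle.pem"))
    print(verify_setting(insecure=True))
```

Centralizing the choice means a single flag switches between testing and production behavior instead of editing every request call.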
This project is licensed under the terms of the GNU General Public License v3.0. See the COPYING file for the full license text.