The Argo container-native workflow engine for Kubernetes is a tremendously popular open source option for teams looking to run workflows, manage clusters, orchestrate parallel jobs, and more.
Pairing Kubernetes and Argo, multiple dev teams can run workloads on centrally-managed clusters, sharing compute resources to increase efficiency and reduce overhead. However, keeping track of each team’s spending—and spending efficiency—in a shared Kubernetes environment is an all-too-familiar challenge. Without visibility into historical resource usage and the ability to break down cost responsibility by team, teams routinely struggle with budget projections and waste-reduction efforts. That visibility is crucial to catching minor inefficiencies before they compound into significant waste as workflows scale.
[A completed Argo workflow, its costs completely unknown]
Enter Kubecost, Argo’s spend-conscious companion

Kubecost, a Kubernetes cost monitoring tool with a free community version, enables dev teams to measure Argo workflow costs accurately in real time, allocate those costs to teams, identify resource inefficiencies, and surface opportunities for optimization.
The following guide demonstrates how teams managing Kubernetes workloads with Argo Workflows can deploy Kubecost to analyze and control their workflow spending. If you’re new to Argo Workflows, start with the simple Quick Start guide on GitHub. Kubecost itself is installed using Helm and should be given its own namespace:
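A typical installation looks like the following, per Kubecost’s Helm chart. The release and namespace names are conventional choices, not requirements:

```shell
# Add the Kubecost Helm chart repository and refresh the local index
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update

# Install the cost-analyzer chart into its own "kubecost" namespace
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace
```

You can then port-forward the `kubecost-cost-analyzer` service (port 9090) to reach the UI locally.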
Once installed, Kubecost will collect cost data on running Argo Workflows.
Create an example workflow

For this guide, let’s define an Argo Workflow by creating a WorkflowTemplate CRD. The workflow will mimic high resource usage by running the stress-ng Docker container in a resource-consuming Pod. The workflow performs a single “stress” stage that runs the stress-ng image, along with inputs Argo will pass to the launched workflow. To create the WorkflowTemplate in the argo namespace, save the following yaml to a file and run argo template create $filename -n argo.
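A minimal sketch of such a template follows. The image name and stress-ng arguments here are illustrative assumptions; the resource figures match the gap described below (1200m CPU and 40M memory requested versus roughly 1 CPU and 30M used):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: stress-ng
spec:
  entrypoint: stress
  templates:
    - name: stress
      container:
        # A community stress-ng image; substitute your preferred image
        image: alexeiled/stress-ng
        # Spin one CPU worker and one 30M memory worker for 60 seconds
        args: ["--cpu", "1", "--vm", "1", "--vm-bytes", "30M", "--timeout", "60s"]
        resources:
          requests:
            cpu: 1200m
            memory: 40M
```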
The workflow requests 1200m CPU and 40M memory, while the running stress-ng container actually uses 1 CPU and 30M memory. That gap is wasteful: the over-provisioned workflow reserves more cluster compute than its container needs to operate.
Measure Kubernetes costs with Kubecost

Because every Argo Workflow job carries the metadata label workflows.argoproj.io/workflow, viewing each workflow’s costs and efficiency metrics within Kubecost is simple. In the Kubecost UI, open the Cost Allocation page and set the “Aggregate By” field to “Label: workflows_argoproj_io_workflow.” This will display the following per-workflow cost and efficiency data:
Each time a workflow runs, that run receives a unique name based on the template plus an added identifier. In the example below, the stress-ng workflow’s first run is named stress-ng-q9wdz. Filtering to this specific run, Kubecost shows that its cost efficiency was just 57%. Clicking the workflow name to open the Details view reveals how much CPU and RAM over-provisioning caused this inefficiency: only 55% of requested CPU and 60% of requested RAM were actually used!
Use the kubectl-cost CLI for quick cost inspection
Clusters with Kubecost installed can be inspected with the kubectl-cost CLI tool as well as the UI. Use Krew to install kubectl-cost:
kubectl krew install cost
In the CLI, workflow costs are available with the command kubectl cost label --label workflows.argoproj.io/workflow:
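For example, the query below aggregates costs by the workflow label; the --window flag (a kubectl-cost option) narrows the report to a recent time range:

```shell
# Cost aggregated by Argo workflow, across the default window
kubectl cost label --label workflows.argoproj.io/workflow

# The same query, limited to the last 7 days
kubectl cost label --label workflows.argoproj.io/workflow --window 7d
```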
Boost the precision of cost insights
By default, Kubecost uses the major cloud providers’ publicly listed prices, but integrating your actual cloud provider billing data yields even more precise cost analysis. Doing so also accounts for custom discounts, reserved instances, and other pricing differences specific to your environment.