Installing & Configuring OpenCost

October 27, 2024
Tags:
Cost Optimization
Open Cost
Right Sizing

Kubernetes is a powerful tool, enabling the deployment of resilient and scalable systems. However, as that famous uncle taught us, with great power comes great responsibility. As adults, we all know what "responsibility" really means, though - it's bills!

If left unchecked, the cost of running Kubernetes clusters can generate nasty surprises at the end of the month. That's why cost monitoring tools are relevant - by keeping track of cost increases and being able to see what is driving the cost change, you can act promptly and prevent unnecessary waste before it's too late.

One key thing to know about Kubernetes costing - and cloud costing in general, really - is that you pay for what you request, not only for what you use. For a real-life example, imagine you have rented a 100-seat airplane so that you can literally host a tech conference in the clouds… I know, I know, but bear with me. Sadly, only 10 people showed up at your event! Well, tough luck, you'll still pay for the 100-seat plane. But if you know from previous events that only about 5 to 15 people show up, you can save quite some money on future events by renting smaller planes, or maybe even a hot-air balloon.

Back from the real-world clouds to the virtual ones, the airplanes are like the nodes that you rent. But instead of a discrete capacity like seats, node capacity comes in terms of CPU, memory and storage. If you track how many resources your workloads use, you may realize that they are requesting way more than they actually need. By optimizing that, you can save money by renting fewer nodes, or maybe smaller ones.

You can use OpenCost to get a grasp of how much your workloads cost and how efficient they are in terms of requests vs usage. While the bills from your cloud vendor let you know how much money you're spending on compute, they don't trace that spending down to individual namespaces or workloads. OpenCost achieves that by distributing node costs to individual workloads, based on resource requests and usage.
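As a rough illustration of that distribution, here is a minimal sketch in shell. It assumes a simplified model in which each workload is charged for the larger of its CPU request and its CPU usage. The workload names and the price of $0.04 per core-hour are made up, and OpenCost's real model also covers memory, storage and other resources.

```shell
# Simplified, hypothetical cost allocation for CPU only.
# $1 = price per core-hour; stdin: "workload requested_cores used_cores" per line.
allocate_cpu_cost() {
  awk -v price="$1" '
    {
      billable = ($2 > $3) ? $2 : $3   # charge the larger of request and usage
      printf "%s %.3f\n", $1, billable * price
    }'
}

# "api" requests more than it uses, "worker" uses more than it requests.
printf '%s\n' "api 1.0 0.3" "worker 0.5 0.8" | allocate_cpu_cost 0.04
```

In this sketch, "api" is billed for the full core it requested even though it only used 0.3 of it, which is exactly the kind of gap that right-sizing tries to close.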

In this post we'll explain how to set up the OpenCost agent on your Kubernetes clusters. OpenCost is a CNCF project for cost monitoring that publishes a set of cost metrics to Prometheus. These metrics can be queried for cost reporting, and a built-in UI provides basic reports that are a great starting point for understanding how much each workload costs.

OpenCost Installation

1) Install Prometheus

The only prerequisite for installing the OpenCost agent is Prometheus. If your cluster doesn't have Prometheus already installed, the easiest way to do it is through the Helm command provided by OpenCost's documentation:


<helm install prometheus --repo https://prometheus-community.github.io/helm-charts prometheus \
  --namespace prometheus-system --create-namespace \
  --set prometheus-pushgateway.enabled=false \
  --set alertmanager.enabled=false \
  -f https://raw.githubusercontent.com/opencost/opencost/develop/kubernetes/prometheus/extraScrapeConfigs.yaml>
    

If you already had Prometheus installed on your cluster before, you'll need to let it know that it should scrape data from OpenCost. Let's do that after installing OpenCost itself.

2) Install OpenCost

Use the following commands to install the latest OpenCost helm chart:


<helm repo add opencost-charts https://opencost.github.io/opencost-helm-chart
helm repo update
helm install opencost opencost-charts/opencost \
  --namespace opencost --create-namespace \
  --set opencost.prometheus.internal.serviceName=prometheus-server \
  --set opencost.prometheus.internal.namespaceName=prometheus-system>
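To confirm that the installation went through, you can list the OpenCost pods. This is a small sketch that assumes the opencost namespace used above and the app.kubernetes.io/name=opencost label set by the chart; the function name is just for illustration.

```shell
# Lists the OpenCost pods; they should reach the Running state.
# Assumes the "opencost" namespace and the chart's standard name label.
check_opencost_pods() {
  kubectl get pods --namespace opencost --selector app.kubernetes.io/name=opencost
}
```

Call check_opencost_pods after the install; if the pods stay in a non-running state, kubectl describe on the pod is usually the next place to look.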
    

3) Update Prometheus configuration

If you installed Prometheus with the command from the first step, then skip this step.

Otherwise, you'll need to update your Prometheus configuration to make sure it scrapes the data generated by OpenCost. One possibility is to use a ServiceMonitor - apply this YAML:


<apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
 name: opencost
 namespace: opencost
spec:
 endpoints:
   - honorLabels: true
     interval: 30s
     path: /metrics
     port: http  # name of the Service port that exposes 9003
     scheme: http
 namespaceSelector:
   matchNames:
     - opencost
 selector:
   matchLabels:
     app.kubernetes.io/instance: opencost
     app.kubernetes.io/name: opencost>
    

If your cluster doesn't have the ServiceMonitor CRD, you won't be able to apply the above YAML. In that case, you can update the extraScrapeConfigs key in your Prometheus installation with the contents of this file provided in the OpenCost git repo: https://raw.githubusercontent.com/opencost/opencost/develop/kubernetes/prometheus/extraScrapeConfigs.yaml
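The two options can be sketched as shell helpers. This assumes your existing Prometheus was installed from the prometheus-community prometheus chart. The function names are made up, and the release name and namespace are arguments because they depend on your installation.

```shell
# Succeeds only if the ServiceMonitor CRD is installed in the cluster.
has_servicemonitor_crd() {
  kubectl get crd servicemonitors.monitoring.coreos.com >/dev/null 2>&1
}

# Applies the OpenCost scrape config on top of the values already in use.
# $1 = Helm release name, $2 = namespace (both depend on your installation).
add_opencost_scrape_config() {
  helm upgrade "$1" prometheus \
    --repo https://prometheus-community.github.io/helm-charts \
    --namespace "$2" --reuse-values \
    -f https://raw.githubusercontent.com/opencost/opencost/develop/kubernetes/prometheus/extraScrapeConfigs.yaml
}
```

For example, add_opencost_scrape_config prometheus prometheus-system. Keep in mind that this replaces any extraScrapeConfigs you had set before, and that without --version Helm picks the latest chart version, so pin it if you don't want to upgrade Prometheus in the same step.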

Exploring OpenCost

Ok, OpenCost is already installed, now let's test it out. The easiest way to check if OpenCost is running well is by checking the Service it creates. You can port-forward it to your local machine:


<kubectl port-forward -n opencost service/opencost 9090>
    

and then open it in your browser: http://localhost:9090/
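If you prefer the command line over the UI, the same data can be fetched from OpenCost's API. The sketch below assumes the API listens on port 9003 of the same Service and exposes an /allocation endpoint with window and aggregate parameters, as described in the OpenCost API docs. Double-check the path against your OpenCost version; the function name is hypothetical.

```shell
# Queries OpenCost's allocation API.
# $1 = time window (e.g. 1d), $2 = aggregation (e.g. namespace).
# Assumes the API port was forwarded first:
#   kubectl port-forward -n opencost service/opencost 9003
opencost_allocation() {
  curl --silent --get http://localhost:9003/allocation \
    --data-urlencode "window=$1" \
    --data-urlencode "aggregate=$2"
}
```

Running opencost_allocation 1d namespace should return JSON with one entry per namespace, which is handy for scripting reports.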

You will see a screen like this:

You may notice that the cost reported by OpenCost at this first moment is really low, just a few cents. Why is that? It’s because we just installed it a few seconds ago. All expenses incurred before OpenCost installation will not be tracked.

How about we change the Breakdown? A very useful one is the “Controller” breakdown, which shows the cost based on the resource kinds mapped by OpenCost: Deployment, DaemonSet, StatefulSet, and Job:

Lastly, let’s check out the Nodes breakdown. This is a very powerful breakdown, as it lets you quickly identify if some node is underutilized. And let’s be real, this is what you actually pay for: nodes. Even if you don’t deploy anything to your cluster, you’ll still pay for the nodes that are available in your cluster.

What’s next?

Questions? Issues? See if the F.A.Q. and troubleshooting sections below help! But if you want to get much more mileage from your OpenCost installation, check out App Insights! Besides troubleshooting and security capabilities, App Insights builds on top of OpenCost to provide features such as right sizing recommendations, idle workload detection, cost alerts, and consolidated cost reports across multiple clusters and cloud vendors.

OpenCost F.A.Q.

What is the “__idle__” entry in OpenCost?

The __idle__ entry indicates the waste of resources that were not assigned to any workload. That is, it is the cost from resources (e.g., CPU and RAM) that are available at the cluster, but that are not being requested by any workload.
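As a back-of-the-envelope sketch, idle cost is what is left of the node cost after subtracting everything allocated to workloads. The numbers and names below are hypothetical, and OpenCost computes this per resource rather than as a single subtraction.

```shell
# $1 = hourly node cost; stdin: one "workload allocated_cost" pair per line.
# All numbers are made up for illustration.
idle_cost() {
  awk -v node_cost="$1" '
    { allocated += $2 }
    END { printf "__idle__ %.2f\n", node_cost - allocated }'
}

# A node costing $0.16/hour with $0.07/hour allocated to workloads.
printf '%s\n' "api 0.04" "worker 0.03" | idle_cost 0.16
```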

What is the “__unallocated__” entry in OpenCost?

The __unallocated__ entry indicates the use of resources that are not a part of the currently selected breakdown. For instance, if you select the Namespace breakdown, you won’t see the __unallocated__ entry, since all resources are either not used at all (idle) or allocated to a namespace. Now, if you select the Deployment breakdown you’ll see the __unallocated__ entry, representing all costs that don't come from the idle resources or from the deployments (e.g., from DaemonSets and StatefulSets).

What is the meaning of “Efficiency” in OpenCost?

First of all, let’s be clear: it has nothing to do with software efficiency or performance. It’s all about requests and usage.

As an example, let's say a deployment requests 100 MB of memory and 1 CPU core. If, on a certain day, it uses an average of 50 MB and 0.5 cores, its average efficiency for that day will be 50%. If, on another day, it uses an average of 120 MB and 1.2 cores, its average efficiency for that day will be 120%.

So, if a workload consistently shows low efficiency, it is a prime candidate for right-sizing.
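The arithmetic behind the example can be sketched in a few lines of shell. This is a simplification that treats efficiency as usage divided by request for a single resource; the function name is made up.

```shell
# Prints usage as a percentage of the request.
# $1 = average usage, $2 = request (same unit for both).
efficiency() {
  awk -v usage="$1" -v request="$2" 'BEGIN { printf "%.0f%%\n", usage / request * 100 }'
}

efficiency 50 100   # memory in MB, under-used
efficiency 1.2 1    # CPU cores, over-used
```

A value well below 100% means the workload requests more than it needs, while a value above 100% means it is running beyond its requests.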

Is OpenCost accurate?

From our testing, the CPU and memory costs reported by OpenCost are pretty accurate when it comes to clusters with On Demand nodes. If you use different kinds of nodes, like Spot or Reserved instances, there may be a mismatch, depending on your cloud vendor. You can prevent that mismatch by providing a custom pricing sheet or by integrating with your cloud vendor: https://www.opencost.io/docs/configuration/

To check if your nodes are being priced correctly, you can hit up Prometheus and check the results of the node_total_hourly_cost metric.
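One way to run that check from a terminal is through the Prometheus HTTP API. The sketch below assumes Prometheus has been made reachable on localhost:9091 with a port-forward; the local port and the function name are arbitrary choices.

```shell
# Asks Prometheus for the current value of node_total_hourly_cost.
# Assumes Prometheus was made reachable on localhost:9091 first, e.g.:
#   kubectl port-forward -n prometheus-system service/prometheus-server 9091:80
node_hourly_costs() {
  curl --silent --get http://localhost:9091/api/v1/query \
    --data-urlencode "query=node_total_hourly_cost"
}
```

The response should contain one series per node, which you can compare against the hourly prices published by your cloud vendor.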

Also, keep in mind that other costs related to your cluster, like the control plane for managed Kubernetes, or external databases, are not factored into OpenCost's calculation. They can be reported separately by OpenCost through the Cloud Costs integration feature.

OpenCost Troubleshooting

No cost data at all

If you see error messages like this in the opencost log:

ERR Failed to query prometheus at http://prometheus-server-wrong-name.prometheus-system.svc.cluster.local:80…

or

ERR ComputeCostData: Request Error: query error: 'Post "http://prometheus-server-wrong-name.prometheus-system.svc.cluster.local:80/api/v1/query…

It means OpenCost is unable to connect to Prometheus. The first things to check are the Prometheus service name, namespace and port you provided in your opencost installation - these are the relevant variables:


<opencost.prometheus.internal.serviceName
opencost.prometheus.internal.namespaceName
opencost.prometheus.internal.port>
    

Myself, I have to admit: I always forget to set up the correct port! But, if your settings turn out to be correct, check if your Prometheus installation requires authentication… or maybe Prometheus is installed outside your Kubernetes cluster? Check out all the configuration options in the Prometheus section of the Helm chart values:

https://github.com/opencost/opencost-helm-chart/blob/748005b589119430808ff29eb6f0b2a3d1061a59/charts/opencost/values.yaml#L320
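To look for these errors and to see which settings the release is actually using, something like the following can help. It is a sketch that assumes both the Helm release and the Deployment are named opencost and live in the opencost namespace; the function names are made up.

```shell
# Shows recent error lines from all containers of the OpenCost deployment.
# Assumes the deployment is named "opencost" in the "opencost" namespace.
opencost_errors() {
  kubectl logs --namespace opencost deployment/opencost --all-containers --tail=200 | grep ERR
}

# Prints the values that were explicitly set for the Helm release.
opencost_values() {
  helm get values opencost --namespace opencost
}
```

If opencost_values doesn't show the Prometheus service name, namespace and port you expected, the chart is falling back to its defaults.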

I can’t see any Deployment at all

So, you selected the Deployment or StatefulSet breakdown but can’t actually see any of your workloads? This usually happens due to a Prometheus misconfiguration. If you go to your Prometheus UI and select Status/Targets, and then go to the “All scrape pools” dropdown, you should see the opencost entry. If you don’t, then you need to check that the opencost ServiceMonitor or Prometheus extra scrape configs are properly set (See installation step 3).

Additionally, here's another possible scenario: you go through the steps above, find the "opencost" target, but… it's empty! That is, it has no endpoints. When that happens, you need to check the selectors in your ServiceMonitor (see installation step 3) and make sure they are correct. A common mistake is to install opencost in a different namespace and then forget to update the namespaceSelector. If you used the extra scrape configs instead, check the dns_sd_configs to make sure they match the service name and namespace.
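The same checks can be done from a terminal. The sketch below assumes the Service is named opencost in the opencost namespace, that the scrape job is also called opencost, and that Prometheus is reachable on localhost:9091 through a port-forward. The function names are hypothetical.

```shell
# Shows the endpoints behind the OpenCost Service; an empty list means
# the Service selector doesn't match any running pod.
opencost_endpoints() {
  kubectl get endpoints opencost --namespace opencost
}

# Counts how often the opencost job appears in Prometheus' target list.
# A result of 0 means Prometheus isn't scraping OpenCost at all.
opencost_target_count() {
  curl --silent http://localhost:9091/api/v1/targets | grep -o '"job":"opencost"' | wc -l
}
```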

Oh no, it’s empty! Fret not, your setup is half right.

Some Deployments, StatefulSets or DaemonSets are missing

Sometimes we can't find a specific controller (Deployment, StatefulSet or DaemonSet) in the OpenCost UI, and start pulling our hair out (pro tip: don't do that!). When this happens, check if the missing controller has any replicas at all. If no pod was running during the selected date range, that controller won't be included in the cost reports.
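A quick way to check the replica count is shown below. The function name is made up, and the namespace and Deployment name are placeholders for your own.

```shell
# Prints the desired replica count of a Deployment.
# $1 = namespace, $2 = Deployment name (placeholders for your own).
deployment_replicas() {
  kubectl get deployment "$2" --namespace "$1" --output jsonpath='{.spec.replicas}'
}
```

Note that this only tells you the current setting. A Deployment that was scaled to zero for the whole date range you selected will still be missing from the report, even if it has replicas now.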

Idle shows negative values

This situation may happen within the first hours of a fresh OpenCost installation. If the negative value of __idle__ persists even after a full day of OpenCost running, check the troubleshooting section above about Prometheus scraping configuration.

João Pimentel