<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Kubernetes on Besterry — Linux &amp; DevOps Notes</title><link>https://besterry.com/tags/kubernetes/</link><description>Recent content in Kubernetes on Besterry — Linux &amp; DevOps Notes</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Mon, 22 Jul 2024 00:00:00 +0000</lastBuildDate><atom:link href="https://besterry.com/tags/kubernetes/index.xml" rel="self" type="application/rss+xml"/><item><title>Kubernetes Troubleshooting: The First 10 Minutes of an Outage</title><link>https://besterry.com/posts/k8s-troubleshooting/</link><pubDate>Mon, 22 Jul 2024 00:00:00 +0000</pubDate><guid>https://besterry.com/posts/k8s-troubleshooting/</guid><description>&lt;p&gt;When PagerDuty wakes you up about a Kubernetes cluster issue, the first 10 minutes matter. Here is the runbook I work through before anything else.&lt;/p&gt;
&lt;h2 id="get-your-bearings"&gt;Get your bearings&lt;/h2&gt;
&lt;p&gt;First, confirm what&amp;rsquo;s actually broken from the user side. Check the status page or synthetic monitor. Many &amp;ldquo;outages&amp;rdquo; turn out to be monitoring failures rather than user-facing problems.&lt;/p&gt;
&lt;h2 id="cluster-level-check"&gt;Cluster-level check&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;kubectl get nodes
kubectl top nodes
&lt;/code&gt;&lt;/pre&gt;
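&lt;p&gt;On a large cluster the node list gets long, so it can help to filter it down. A minimal sketch, assuming the default &lt;code&gt;kubectl get nodes&lt;/code&gt; column layout (STATUS in the second column); &lt;code&gt;NODE_NAME&lt;/code&gt; is a placeholder for a node picked from that output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Show only nodes whose STATUS is not exactly "Ready"
# (this also lists cordoned nodes, e.g. "Ready,SchedulingDisabled")
kubectl get nodes --no-headers | awk '$2 != "Ready"'

# Inspect one suspect node: the Conditions section reports
# MemoryPressure, DiskPressure, PIDPressure and Ready
kubectl describe node NODE_NAME
&lt;/code&gt;&lt;/pre&gt;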
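&lt;p&gt;Look for nodes in the NotReady state and for nodes running close to their CPU or memory capacity (&lt;code&gt;kubectl top nodes&lt;/code&gt; only returns data if metrics-server is installed). If several nodes are down at once, the cause is probably at the infrastructure level: check the cloud provider console.&lt;/p&gt;</description></item></channel></rss>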