<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>DevSecOps Notes</title><link href="https://rivassec.com/" rel="alternate"/><link href="https://rivassec.com/feeds/all.atom.xml" rel="self"/><id>https://rivassec.com/</id><updated>2026-05-04T00:00:00-07:00</updated><subtitle>Infrastructure. Security. Insight.</subtitle><entry><title>The Trust Decay: Why Modern Hiring Has Become an Adversarial System</title><link href="https://rivassec.com/trust-decay-adversarial-hiring.html" rel="alternate"/><published>2026-05-04T00:00:00-07:00</published><updated>2026-05-04T00:00:00-07:00</updated><author><name>RivasSec</name></author><id>tag:rivassec.com,2026-05-04:/trust-decay-adversarial-hiring.html</id><summary type="html">&lt;p&gt;The tech hiring pipeline has shifted from talent discovery to defensive risk mitigation. Flooded by synthetic resumes and hyper-automated applications, hiring systems now favor pre-validated channels and proof-of-work over polished presentations. For engineers, success in 2026 means being the most difficult to doubt.&lt;/p&gt;</summary><content type="html">&lt;h2&gt;The Duality of the Current Market&lt;/h2&gt;
&lt;p&gt;The tech job market is currently defined by a jarring paradox. On one side, elite engineers land roles in days; on the other, equally qualified peers face months of silence. These aren't conflicting data points - they are the predictable outputs of a system under extreme duress.&lt;/p&gt;
&lt;p&gt;The hiring pipeline has ceased to be a discovery engine designed to find talent. It has evolved into a defensive perimeter designed to mitigate risk in a low-trust environment.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;From Discovery to Defense: The Death of Honest Inputs&lt;/h2&gt;
&lt;p&gt;Historically, recruitment operated on the assumption of "manageable honesty." You received a stack of resumes, assumed most were reasonably accurate, and searched for the best fit.&lt;/p&gt;
&lt;p&gt;That model has collapsed. Today, hiring systems are bombarded by "strategically optimized noise," including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Hyper-automated workflows:&lt;/strong&gt; Candidates applying to hundreds of roles via LLM-powered scripts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Synthetic Resumes:&lt;/strong&gt; AI-generated profiles perfectly tuned to trigger every keyword in a Job Description (JD).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Signal Dilution:&lt;/strong&gt; When every applicant looks like a 95% match on paper, the "match" itself becomes meaningless.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;From a systems engineering perspective, the pipeline is now facing adversarial inputs. When a system is flooded with high-volume, low-integrity data, it naturally shifts its posture from "open" to "fortified."&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;How the System Defends Itself&lt;/h2&gt;
&lt;p&gt;When trust in incoming data drops, the system compensates with three defensive maneuvers:&lt;/p&gt;
&lt;h3&gt;1. The False Negative Bias&lt;/h3&gt;
&lt;p&gt;In a high-noise environment, the cost of a "False Positive" (a bad hire) outweighs the cost of a "False Negative" (missing a great candidate). Consequently, filters are tightened to an extreme degree. A candidate whose claims cannot be verified quickly at the first gate is likely to be discarded, regardless of actual ability.&lt;/p&gt;
&lt;h3&gt;2. Signal Collapse&lt;/h3&gt;
&lt;p&gt;As presentation becomes commoditized through AI, "looking the part" no longer serves as a differentiator. If everyone's resume is a work of art, no one's resume is. This leads to ranking paralysis, where recruiters rely on arbitrary or conservative heuristics because they can no longer distinguish between genuine expertise and successful optimization.&lt;/p&gt;
&lt;h3&gt;3. Upstream Trust Migration&lt;/h3&gt;
&lt;p&gt;Because the public pipeline is compromised, hiring teams are retreating to "pre-validated" channels. This explains the heavy reliance on internal referrals and known networks. It's not necessarily cronyism; it's an architectural necessity to find signal in a sea of noise.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;The Feedback Loop of Friction&lt;/h2&gt;
&lt;p&gt;We are trapped in a recursive cycle. Candidates optimize harder to bypass filters; in response, filters become more draconian. This creates a "Degraded Trust Loop" where the system's own success at filtering further incentivizes candidates to game the system.&lt;/p&gt;
&lt;p&gt;Ultimately, the pipeline stops being a way to find people and becomes a way to manage risk.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;The New Strategy: Proof Over Presentation&lt;/h2&gt;
&lt;p&gt;If the market is a low-trust system, "Presentation" (how you describe yourself) is losing its value. What remains valuable is Evidence - signals that are computationally or socially expensive to fake.&lt;/p&gt;
&lt;h3&gt;Moving from "I Did" to "Here Is"&lt;/h3&gt;
&lt;p&gt;To bypass the defensive perimeter, engineers must move beyond the resume. The goal is to provide externally verifiable artifacts that don't require the pipeline to "believe" you.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Architectural Transparency:&lt;/strong&gt; Don't just list technologies; publish (abstracted) system designs, trade-off analyses, and post-mortems of failure modes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tangible Artifacts:&lt;/strong&gt; Real-world contributions - whether through open-source modules, infrastructure-as-code repos, or documented homelabs - serve as proof-of-work.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Impact-Oriented Signaling:&lt;/strong&gt; Shift from "tasks completed" to "business outcomes achieved." Hard numbers on risk reduction, latency improvements, or cost savings are much harder to hallucinate effectively.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2&gt;A DevSecOps View of the Career&lt;/h2&gt;
&lt;p&gt;If we treat the job market as a security problem, the solution becomes clear. The hiring pipeline is a system with exposed endpoints and high validation costs. As any security professional knows, when you can't trust the input, you lean on Multi-Factor Authentication. In hiring, your "factors" are your network, your public evidence, and your verifiable history.&lt;/p&gt;
&lt;p&gt;The market isn't "broken" - it has simply changed its objective function. It no longer prioritizes finding the best; it prioritizes avoiding the unverified.&lt;/p&gt;
&lt;p&gt;Success in 2026 and beyond isn't about having the most optimized resume. It's about being the most difficult to doubt. In a world of automated noise, reliability is the only signal that scales.&lt;/p&gt;</content><category term="DevSecOps"/><category term="hiring"/><category term="job-market"/><category term="devsecops"/><category term="careers"/><category term="adversarial-systems"/><category term="trust"/></entry><entry><title>Never Lose Connection: Multi-Phone Bluetooth Tethering for Pwnagotchi</title><link href="https://rivassec.com/bt-tether-multi.html" rel="alternate"/><published>2025-07-22T00:00:00-07:00</published><updated>2025-07-22T00:00:00-07:00</updated><author><name>RivasSec</name></author><id>tag:rivassec.com,2025-07-22:/bt-tether-multi.html</id><summary type="html">&lt;p&gt;Enhance your Pwnagotchi's autonomy with &lt;code&gt;bt-tether-multi&lt;/code&gt;, a custom plugin offering intelligent multi-phone Bluetooth tethering, automatic WAN failover, and robust connection management.&lt;/p&gt;</summary><content type="html">&lt;h2&gt;The Common Pwnagotchi Tethering Problem&lt;/h2&gt;
&lt;p&gt;If you're an active Pwnagotchi user, you've likely faced the frustration of losing internet connectivity in the field. Whether you forgot your primary tethering phone, moved out of range, or encountered a "silent disconnect" where your phone still reports a connection but lacks actual WAN access (like a captive portal redirect), the default Bluetooth tethering often leaves your Pwnagotchi stranded. This means missed opportunities for handshakes and updates.&lt;/p&gt;
&lt;h2&gt;Introducing &lt;code&gt;bt-tether-multi&lt;/code&gt;: Your Pwnagotchi's Ultimate Network Backup&lt;/h2&gt;
&lt;p&gt;I built &lt;code&gt;bt-tether-multi&lt;/code&gt; to make Pwnagotchi networking resilient and autonomous. This plugin empowers your device to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Intelligently Connect:&lt;/strong&gt; Configure a list of multiple phones, prioritized by your preference, for seamless Bluetooth tethering.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Proactive WAN Detection:&lt;/strong&gt; Detect actual loss of internet access (not just Bluetooth connection) using real-world checks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Automatic Fallback:&lt;/strong&gt; Gracefully switch to the next available phone in your list if the current connection drops or loses WAN.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Smart Retries:&lt;/strong&gt; Implement a configurable retry delay to prevent rapid, unproductive cycling through phones during temporary network issues.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Clear UI Feedback:&lt;/strong&gt; Show the current tethering status on the Pwnagotchi's e-ink display.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;How It Works Under the Hood&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;bt-tether-multi&lt;/code&gt; integrates directly with your Pwnagotchi's system. Upon loading, it reads your list of tethering phones from the &lt;code&gt;config.toml&lt;/code&gt; file. Each entry includes the phone's name, MAC address, IP address, and operating system type (Android or iOS), which the plugin uses to apply the correct gateway settings.&lt;/p&gt;
&lt;p&gt;The plugin leverages standard Linux networking tools:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;nmcli&lt;/code&gt; (NetworkManager CLI):&lt;/strong&gt; Used to programmatically manage Bluetooth connections, including adding, deleting, and bringing up/down network interfaces for your paired phones.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;curl&lt;/code&gt;:&lt;/strong&gt; Employed for a fast (&lt;code&gt;--max-time 3&lt;/code&gt;), non-intrusive check to &lt;code&gt;https://www.google.com&lt;/code&gt; to verify genuine WAN connectivity. If &lt;code&gt;curl&lt;/code&gt; can't reach the internet, the plugin considers the WAN lost.&lt;/li&gt;
&lt;/ul&gt;
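&lt;p&gt;The WAN check described above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's actual code: the names &lt;code&gt;has_wan&lt;/code&gt; and &lt;code&gt;CHECK_URL&lt;/code&gt;, the &lt;code&gt;--fail&lt;/code&gt; flag, and the injectable &lt;code&gt;runner&lt;/code&gt; parameter are assumptions made for the example.&lt;/p&gt;

```python
import shutil
import subprocess

# Hypothetical names for illustration; the plugin's own identifiers may differ.
CHECK_URL = "https://www.google.com"


def has_wan(url=CHECK_URL, timeout_seconds=3, curl_path=None, runner=subprocess.run):
    """Return True when curl completes a request to url within the time limit."""
    curl_path = curl_path or shutil.which("curl")
    if curl_path is None:
        return False  # curl is unavailable, so WAN access cannot be confirmed
    cmd = [
        curl_path,
        "--silent",               # no progress output
        "--fail",                 # treat HTTP error statuses as failures (assumption)
        "--output", "/dev/null",  # discard the response body
        "--max-time", str(timeout_seconds),
        url,
    ]
    try:
        # The argument list is handed to curl directly; no shell is involved.
        completed = runner(cmd, shell=False, check=False)
    except OSError:
        return False  # curl could not be started
    return completed.returncode == 0
```

&lt;p&gt;Calling &lt;code&gt;has_wan()&lt;/code&gt; after a tether comes up separates "Bluetooth is connected" from "the internet is reachable", which is the distinction the plugin relies on.&lt;/p&gt;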
&lt;h3&gt;UI Status Indicators:&lt;/h3&gt;
&lt;p&gt;The Pwnagotchi's display provides immediate feedback:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;B:&amp;lt;name&amp;gt;&lt;/code&gt;: Successfully connected to one of your configured phones. The name is truncated for display.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;B:???&lt;/code&gt;: Bluetooth is connected, but the active phone is not recognized in your configured list. This might indicate an unexpected connection or a misconfiguration.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;...&lt;/code&gt;: The plugin is currently in the process of rotating through connections or attempting to establish one.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;X&lt;/code&gt;: Disconnected from all configured phones.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;!&lt;/code&gt;: A configuration error or plugin-related issue has occurred.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The sequential fallback and retry logic ensure that your Pwnagotchi stays online with minimal intervention, rotating through your devices until a stable internet connection is found.&lt;/p&gt;
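&lt;p&gt;As a rough sketch of the sequential fallback and retry delay, assuming hypothetical &lt;code&gt;connect&lt;/code&gt; and &lt;code&gt;has_wan&lt;/code&gt; callables supplied by the caller (the plugin's real control flow may differ):&lt;/p&gt;

```python
import time


def first_working_phone(phones, connect, has_wan):
    """Try phones in priority order; return the first with real WAN access."""
    for phone in phones:
        # A phone only counts if the tether comes up AND the internet is reachable.
        if connect(phone) and has_wan():
            return phone
    return None


def keep_online(phones, connect, has_wan, retry_delay=180,
                sleep=time.sleep, max_passes=None):
    """Rotate through the list, pausing retry_delay seconds after each failed pass."""
    passes = 0
    while max_passes is None or passes != max_passes:
        phone = first_working_phone(phones, connect, has_wan)
        if phone is not None:
            return phone
        passes += 1
        sleep(retry_delay)  # avoid rapid cycling during temporary outages
    return None
```

&lt;p&gt;The pause happens only after every phone in the list has failed, so a healthy backup phone is still picked up immediately when the primary drops.&lt;/p&gt;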
&lt;h2&gt;Installation and Configuration&lt;/h2&gt;
&lt;p&gt;Installing &lt;code&gt;bt-tether-multi&lt;/code&gt; is straightforward:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Download:&lt;/strong&gt; Place the plugin file (&lt;code&gt;bt.py&lt;/code&gt; from the GitHub repository) into your Pwnagotchi's custom plugin directory (typically &lt;code&gt;/etc/pwnagotchi/custom-plugins/&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Configure:&lt;/strong&gt; Add your phone details to your &lt;code&gt;config.toml&lt;/code&gt; file. Here's a simplified example of what your &lt;code&gt;config.toml&lt;/code&gt; might look like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;main.plugins.bt-tether-multi.enabled = true
main.plugins.bt-tether-multi.phones = [
  { name = "MyAndroid", mac = "XX:XX:XX:XX:XX:XX", ip = "192.168.44.44", type = "android" },
  { name = "MyiPhone", mac = "YY:YY:YY:YY:YY:YY", ip = "172.20.10.10", type = "ios" },
]
main.plugins.bt-tether-multi.retry_delay = 180 # Optional: customize retry delay (seconds)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; Replace &lt;code&gt;XX:XX:XX:XX:XX:XX&lt;/code&gt; and &lt;code&gt;YY:YY:YY:YY:YY:YY&lt;/code&gt; with your actual phone MAC addresses. Ensure your IP addresses match what your phone assigns to the Pwnagotchi's Bluetooth interface.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For a comprehensive guide and the most up-to-date configuration examples, please refer to the &lt;a href="https://github.com/rivassec/bt-tether-multi"&gt;GitHub README&lt;/a&gt; in the repository.&lt;/p&gt;
&lt;h2&gt;Security Considerations&lt;/h2&gt;
&lt;p&gt;As with any tool that interacts with your system's networking, security is paramount. I scanned this plugin with &lt;a href="https://bandit.readthedocs.io"&gt;Bandit&lt;/a&gt;, a static analysis tool that looks for common security issues in Python code.&lt;/p&gt;
&lt;p&gt;The scan reported "Low Severity" warnings primarily related to the use of the &lt;code&gt;subprocess&lt;/code&gt; module. It's crucial to understand why these are considered acceptable in this context and how they're mitigated:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;No &lt;code&gt;shell=True&lt;/code&gt;:&lt;/strong&gt; All external commands (&lt;code&gt;nmcli&lt;/code&gt;, &lt;code&gt;curl&lt;/code&gt;, &lt;code&gt;bluetoothctl&lt;/code&gt;) are executed with &lt;code&gt;shell=False&lt;/code&gt;. Arguments are handed to the program directly instead of being parsed by a shell, so shell metacharacters in an argument value are treated as literal text and cannot inject additional commands.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Full Paths for Executables:&lt;/strong&gt; The plugin now uses &lt;code&gt;shutil.which&lt;/code&gt; to resolve the &lt;strong&gt;absolute file path&lt;/strong&gt; for &lt;code&gt;nmcli&lt;/code&gt;, &lt;code&gt;curl&lt;/code&gt;, and &lt;code&gt;bluetoothctl&lt;/code&gt;, so each command is invoked from an explicit location rather than by bare name. Because &lt;code&gt;shutil.which&lt;/code&gt; itself searches &lt;code&gt;PATH&lt;/code&gt;, this narrows the exposure but does not help if &lt;code&gt;PATH&lt;/code&gt; was already compromised when the paths were resolved.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Strict Input Validation:&lt;/strong&gt; All dynamic inputs (MAC addresses, phone names, and IP addresses) coming from your &lt;code&gt;config.toml&lt;/code&gt; are subjected to strict regular expression and &lt;code&gt;ipaddress&lt;/code&gt; module validation &lt;em&gt;before&lt;/em&gt; being passed to &lt;code&gt;subprocess&lt;/code&gt; commands. This ensures that only well-formed and safe values are used.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Controlled Environment:&lt;/strong&gt; Pwnagotchi runs in a specific, often isolated, environment. While caution is always advised, the risk surface is contained compared to a general-purpose server.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The "Low Severity" warnings are primarily general advisories about the &lt;em&gt;potential&lt;/em&gt; for misuse of &lt;code&gt;subprocess&lt;/code&gt;, rather than indicative of a direct, exploitable vulnerability in this specific implementation, given the defensive measures taken.&lt;/p&gt;
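&lt;p&gt;To make the validation and path-resolution measures concrete, here is a minimal sketch. The patterns, the 32-character name limit, and the function names &lt;code&gt;validate_phone&lt;/code&gt; and &lt;code&gt;resolve_tool&lt;/code&gt; are assumptions for illustration; the plugin's own rules may be stricter or different.&lt;/p&gt;

```python
import ipaddress
import re
import shutil

# Hypothetical patterns; the plugin's actual expressions are not shown in this post.
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}")
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,32}")


def validate_phone(phone):
    """Return a cleaned copy of one phone entry, or raise ValueError."""
    name = str(phone.get("name", ""))
    mac = str(phone.get("mac", ""))
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError("invalid phone name")
    if not MAC_PATTERN.fullmatch(mac):
        raise ValueError("invalid MAC address")
    # ip_address raises ValueError for anything that is not a literal IP.
    ip = ipaddress.ip_address(str(phone.get("ip", "")))
    os_type = phone.get("type")
    if os_type not in ("android", "ios"):
        raise ValueError("type must be android or ios")
    return {"name": name, "mac": mac.upper(), "ip": str(ip), "type": os_type}


def resolve_tool(name):
    """Resolve an executable to an absolute path once, failing loudly if absent."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(name + " not found on PATH")
    return path
```

&lt;p&gt;Only values returned by &lt;code&gt;validate_phone&lt;/code&gt; would then be placed into the argument list passed to &lt;code&gt;subprocess&lt;/code&gt;.&lt;/p&gt;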
&lt;h2&gt;Final Thoughts&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;bt-tether-multi&lt;/code&gt; is designed for the Pwnagotchi enthusiast who values uptime and autonomy. It transforms a common point of failure into a robust, self-managing solution. No more restarting your Pwnagotchi or manually re-tethering when your connection goes south.&lt;/p&gt;
&lt;p&gt;This plugin has become an indispensable part of my Pwnagotchi setup, saving me countless headaches in the field. I invite you to try it out and contribute to its development!&lt;/p&gt;
&lt;p&gt;Find the source code, detailed installation instructions, and contribute to the project on GitHub: &lt;a href="https://github.com/rivassec/bt-tether-multi"&gt;rivassec/bt-tether-multi&lt;/a&gt;&lt;/p&gt;</content><category term="Projects"/><category term="pwnagotchi"/><category term="bluetooth"/><category term="plugin"/><category term="networking"/><category term="python"/><category term="hacking"/><category term="automation"/></entry><entry><title>Secure Snapshot Verification in Elasticsearch with Minimal Privileges</title><link href="https://rivassec.com/elasticsearch-secure-snapshot-verification.html" rel="alternate"/><published>2025-04-20T00:00:00-07:00</published><updated>2025-04-20T00:00:00-07:00</updated><author><name>RivasSec</name></author><id>tag:rivassec.com,2025-04-20:/elasticsearch-secure-snapshot-verification.html</id><summary type="html">&lt;p&gt;Learn how to securely verify Elasticsearch snapshots without using &lt;code&gt;manage_snapshot&lt;/code&gt;, using a minimal API key, Prometheus-compatible script, and hardened monitoring practices. Includes a GitHub tools repo for automation.&lt;/p&gt;</summary><content type="html">&lt;h1&gt;Es Snapshot Verifier&lt;/h1&gt;
&lt;p&gt;Verifying Elasticsearch snapshots typically requires broad &lt;code&gt;manage&lt;/code&gt; permissions. This can be risky, especially if credentials are compromised. We can reduce the blast radius by defining a minimal role that grants only the specific actions necessary to verify snapshots without allowing deletions or alterations.&lt;/p&gt;
&lt;p&gt;In some environments, using external monitoring systems like Datadog or Prometheus may not be feasible. Whether due to air-gapped infrastructure, compliance restrictions, or footprint concerns, having a hardened custom script with minimal privileges can be a reliable fallback.&lt;/p&gt;
&lt;p&gt;To improve portability and maintainability, this article now references code and configuration files hosted in the &lt;a href="https://github.com/rivassec/elasticsearch-tools"&gt;elasticsearch-tools GitHub repository&lt;/a&gt;. This structure allows future updates to the tools without requiring edits to the article.&lt;/p&gt;
&lt;h2&gt;Minimal Elasticsearch Role&lt;/h2&gt;
&lt;p&gt;Here is a role that avoids using &lt;code&gt;manage_snapshot&lt;/code&gt; to reduce exposure. This ensures a compromised API key cannot delete or tamper with existing backups:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/rivassec/elasticsearch-tools/blob/main/roles/snapshot_repo_readonly.json"&gt;View full role definition&lt;/a&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;snapshot_repo_readonly&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;cluster&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;cluster:admin/repository/get&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;cluster:admin/repository/verify&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;cluster:admin/snapshot/get&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;cluster:admin/snapshot/status&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;indices&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;run_as&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;metadata&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;transient_metadata&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;enabled&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;API Key Generation&lt;/h2&gt;
&lt;p&gt;To generate an API key restricted to this role, use the following &lt;code&gt;curl&lt;/code&gt; command. The resulting key can perform only the approved cluster actions, and its lifetime is governed by the &lt;code&gt;expiration&lt;/code&gt; value in the request body:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;curl&lt;span class="w"&gt; &lt;/span&gt;-u&lt;span class="w"&gt; &lt;/span&gt;elastic:&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ELASTICPASS&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-X&lt;span class="w"&gt; &lt;/span&gt;POST&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;localhost:9200/_security/api_key&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-H&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Content-Type: application/json&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-d&lt;span class="w"&gt; &lt;/span&gt;@elasticsearch-tools/roles/snapshot_repo_readonly.json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
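&lt;p&gt;The create API key endpoint expects a &lt;code&gt;name&lt;/code&gt;, an optional &lt;code&gt;expiration&lt;/code&gt;, and a &lt;code&gt;role_descriptors&lt;/code&gt; object. If your role file contains only the role definition in the shape shown earlier, a small wrapper can build the full request body. This is a sketch under that assumption; the key name &lt;code&gt;snapshot-verifier&lt;/code&gt; and the &lt;code&gt;30d&lt;/code&gt; expiration are placeholder values, and the file in the repository may already include these fields.&lt;/p&gt;

```python
import json


def build_api_key_request(role_file_text, key_name="snapshot-verifier", expiration="30d"):
    """Wrap a role definition in the body shape used by POST /_security/api_key."""
    roles = json.loads(role_file_text)
    descriptors = {}
    for role_name, role in roles.items():
        # Copy only the cluster privileges; this role grants no index access.
        descriptors[role_name] = {"cluster": list(role.get("cluster", []))}
    return {
        "name": key_name,          # placeholder name (assumption)
        "expiration": expiration,  # placeholder lifetime (assumption)
        "role_descriptors": descriptors,
    }


if __name__ == "__main__":
    example = json.dumps({
        "snapshot_repo_readonly": {
            "cluster": [
                "cluster:admin/repository/get",
                "cluster:admin/repository/verify",
                "cluster:admin/snapshot/get",
                "cluster:admin/snapshot/status",
            ]
        }
    })
    print(json.dumps(build_api_key_request(example), indent=2))
```

&lt;p&gt;The printed JSON can be saved to a file and passed to &lt;code&gt;curl&lt;/code&gt; with &lt;code&gt;-d @file&lt;/code&gt; in place of the raw role definition.&lt;/p&gt;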

&lt;h2&gt;Snapshot Verifier Script&lt;/h2&gt;
&lt;p&gt;This API key can be used with a lightweight shell script that verifies the repository and emits Prometheus-compatible metrics. The script is secure by design and includes input validation, safe temporary file handling, and minimal permissions.&lt;/p&gt;
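&lt;p&gt;Safe temporary file handling for a metrics file usually means writing to a temporary file in the destination directory and renaming it into place, so a reader never sees a half-written file. The Python sketch below illustrates that pattern with the same metric names; it is not a port of the script, and the function names and the &lt;code&gt;0o644&lt;/code&gt; mode are assumptions. Creating the temporary file next to the target matters because a rename is only atomic within one filesystem.&lt;/p&gt;

```python
import os
import tempfile


def render_metrics(repo_label, verified, timestamp):
    """Build the textfile body for one repository."""
    return "\n".join([
        "# HELP es_snapshot_repository_verified Success status of snapshot verification",
        "# TYPE es_snapshot_repository_verified gauge",
        'es_snapshot_repository_verified{repo="%s"} %d' % (repo_label, verified),
        "# HELP es_snapshot_repository_verified_at Unix timestamp of last check",
        "# TYPE es_snapshot_repository_verified_at gauge",
        'es_snapshot_repository_verified_at{repo="%s"} %d' % (repo_label, timestamp),
    ]) + "\n"


def write_atomically(path, text):
    """Write text to path via a same-directory temp file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        # mkstemp creates the file owner-only; relax it so a collector running
        # as another user can read the result (assumption about the deployment).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

&lt;p&gt;A caller would pass the sanitized repository label, the 0 or 1 verification result, and the current Unix time to &lt;code&gt;render_metrics&lt;/code&gt;, then hand the text to &lt;code&gt;write_atomically&lt;/code&gt;.&lt;/p&gt;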
&lt;p&gt;&lt;a href="https://github.com/rivassec/elasticsearch-tools/blob/main/scripts/verify_snapshot.sh"&gt;View the script&lt;/a&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="ch"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c1"&gt;########################################################################&lt;/span&gt;
&lt;span class="c1"&gt;# Hardened Snapshot Monitor for Elasticsearch&lt;/span&gt;
&lt;span class="c1"&gt;# Purpose: Verify an Elasticsearch snapshot repository and expose&lt;/span&gt;
&lt;span class="c1"&gt;# Prometheus-style metrics securely.&lt;/span&gt;
&lt;span class="c1"&gt;########################################################################&lt;/span&gt;

&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-euo&lt;span class="w"&gt; &lt;/span&gt;pipefail

:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ES_HOST&lt;/span&gt;&lt;span class="p"&gt;:=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;http://localhost:9200&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;REPO_NAME&lt;/span&gt;&lt;span class="p"&gt;:?Missing REPO_NAME&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;API_KEY_FILE&lt;/span&gt;&lt;span class="p"&gt;:=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/etc/elasticsearch/readonly-api-key&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PROM_FILE&lt;/span&gt;&lt;span class="p"&gt;:=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/var/lib/node_exporter/textfile_collector/es_snapshot.prom&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;!&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$API_KEY_FILE&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;[FATAL] API key file not found: &lt;/span&gt;&lt;span class="nv"&gt;$API_KEY_FILE&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c1"&gt;# Treat the mode as octal and fail if any group or other permission bit is set&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;stat&lt;span class="w"&gt; &lt;/span&gt;-c&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;%a&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$API_KEY_FILE&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;077&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;[FATAL] API key file permissions too permissive (should be 600 or less)&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="nv"&gt;API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;&amp;lt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$API_KEY_FILE&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;TMP_PROM_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;mktemp&lt;span class="k"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;safe_repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;REPO_NAME&lt;/span&gt;&lt;span class="p"&gt;//[^a-zA-Z0-9_]/_&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;date&lt;span class="w"&gt; &lt;/span&gt;+%s&lt;span class="k"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;curl&lt;span class="w"&gt; &lt;/span&gt;-fsSL&lt;span class="w"&gt; &lt;/span&gt;--retry&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--retry-delay&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-H&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Authorization: ApiKey &lt;/span&gt;&lt;span class="nv"&gt;$API_KEY&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-H&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Content-Type: application/json&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-X&lt;span class="w"&gt; &lt;/span&gt;POST&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$ES_HOST&lt;/span&gt;&lt;span class="s2"&gt;/_snapshot/&lt;/span&gt;&lt;span class="nv"&gt;$REPO_NAME&lt;/span&gt;&lt;span class="s2"&gt;/_verify&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;||&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="k"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;jq&lt;span class="w"&gt; &lt;/span&gt;-e&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.nodes | length &amp;gt; 0&amp;#39;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;/dev/null&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ok&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;failed&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="o"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;# HELP es_snapshot_repository_verified Success status of snapshot verification&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;# TYPE es_snapshot_repository_verified gauge&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;es_snapshot_repository_verified{repo=\&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$safe_repo&lt;/span&gt;&lt;span class="s2"&gt;\&amp;quot;} &lt;/span&gt;&lt;span class="nv"&gt;$result&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;# HELP es_snapshot_repository_verified_at Unix timestamp of last check&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;# TYPE es_snapshot_repository_verified_at gauge&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;es_snapshot_repository_verified_at{repo=\&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$safe_repo&lt;/span&gt;&lt;span class="s2"&gt;\&amp;quot;} &lt;/span&gt;&lt;span class="nv"&gt;$timestamp&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$TMP_PROM_FILE&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

mv&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$TMP_PROM_FILE&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$PROM_FILE&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
logger&lt;span class="w"&gt; &lt;/span&gt;-t&lt;span class="w"&gt; &lt;/span&gt;es-snapshot-monitor&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;[INFO] Verification &lt;/span&gt;&lt;span class="nv"&gt;$status&lt;/span&gt;&lt;span class="s2"&gt; for &amp;#39;&lt;/span&gt;&lt;span class="nv"&gt;$REPO_NAME&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39; (code=&lt;/span&gt;&lt;span class="nv"&gt;$result&lt;/span&gt;&lt;span class="s2"&gt;)&amp;quot;&lt;/span&gt;

&lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Cron Example&lt;/h2&gt;
&lt;p&gt;A cron job can be configured to run the script regularly:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/rivassec/elasticsearch-tools/blob/main/examples/example_cronjob.sh"&gt;View example cron wrapper&lt;/a&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="ch"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c1"&gt;# Note: For production use, place this logic directly into a cron job or systemd timer.&lt;/span&gt;
&lt;span class="c1"&gt;#       This script is just an example for demonstration and testing.&lt;/span&gt;

&lt;span class="c1"&gt;# Fail fast on error&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-euo&lt;span class="w"&gt; &lt;/span&gt;pipefail

&lt;span class="c1"&gt;# === Configuration ===&lt;/span&gt;
&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;REPO_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;my_backup_repo&amp;quot;&lt;/span&gt;
&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;ES_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;http://localhost:9200&amp;quot;&lt;/span&gt;
&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;API_KEY_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/etc/elasticsearch/readonly-api-key&amp;quot;&lt;/span&gt;
&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;PROM_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/var/lib/node_exporter/textfile_collector/es_snapshot_&lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;REPO_NAME&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.prom&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# === Invoke snapshot verification script ===&lt;/span&gt;
/opt/elasticsearch-tools/tools/snapshot-verifier/verify_snapshot.sh
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Usage Instructions&lt;/h2&gt;
&lt;p&gt;To install, configure, and run the snapshot verification system, follow the documentation in the repository:
&lt;a href="https://github.com/rivassec/elasticsearch-tools/blob/main/docs/USAGE.md"&gt;View usage guide&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This structure is ideal for environments with limited connectivity or strict compliance rules. It keeps the verification logic reproducible, auditable, and safe from privilege escalation risks. Updates to the tooling can be managed independently of the article, improving long-term maintainability.&lt;/p&gt;</content><category term="DevSecOps"/><category term="elasticsearch"/><category term="snapshot"/><category term="security"/><category term="observability"/><category term="prometheus"/><category term="minimal-permissions"/></entry><entry><title>Hardening Kubernetes Deployments</title><link href="https://rivassec.com/hardening-k8s.html" rel="alternate"/><published>2025-04-19T00:00:00-07:00</published><updated>2025-04-19T00:00:00-07:00</updated><author><name>RivasSec</name></author><id>tag:rivassec.com,2025-04-19:/hardening-k8s.html</id><summary type="html">&lt;p&gt;Hardening Kubernetes workloads goes beyond RBAC tweaks or image scans. This post shares field-tested pod-level guardrails aligned with the Pod Security Standards (Restricted profile), covering non-root containers, dropped capabilities, read-only filesystems, NetworkPolicies, and ServiceAccount hardening.&lt;/p&gt;</summary><content type="html">&lt;p&gt;Securing Kubernetes workloads isn't just about scanning images or tweaking RBAC, it's about enforcing the right guardrails at the pod level to minimize risk by default. This post shares field-tested strategies aligned with the Pod Security Standards (Restricted profile) to help you build safer, production-grade deployments.&lt;/p&gt;
&lt;h2&gt;Key Practices for Hardening Kubernetes Deployments&lt;/h2&gt;
&lt;h3&gt;1. Run Containers as Non-Root&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;runAsNonRoot&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;runAsUser&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;1000&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;runAsGroup&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;3000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With &lt;code&gt;runAsNonRoot: true&lt;/code&gt;, the kubelet refuses to start a container that would run as UID 0, which reduces the blast radius of any compromise.&lt;/p&gt;
&lt;hr&gt;
&lt;h3&gt;2. Drop All Linux Capabilities&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;capabilities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;ALL&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;add&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;NET_BIND_SERVICE&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Only if your app needs it (e.g., for ports &amp;lt;1024)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Drop all capabilities by default, then add only what you need.&lt;/p&gt;
&lt;hr&gt;
&lt;h3&gt;3. Disable Privilege Escalation&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;allowPrivilegeEscalation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

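&lt;p&gt;This sets the &lt;code&gt;no_new_privs&lt;/code&gt; flag on the container process, so setuid binaries and similar mechanisms cannot grant more privileges than the process started with, even if it is compromised.&lt;/p&gt;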
&lt;hr&gt;
&lt;h3&gt;4. Use a Read-Only Root Filesystem&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;readOnlyRootFilesystem&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

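&lt;p&gt;This blocks attackers from writing malicious files or installing tools on the container's root filesystem. Applications that need scratch space can mount an &lt;code&gt;emptyDir&lt;/code&gt; volume at paths such as &lt;code&gt;/tmp&lt;/code&gt;.&lt;/p&gt;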
&lt;hr&gt;
&lt;h3&gt;5. Avoid Host Access&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;hostNetwork&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;span class="nt"&gt;hostPID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;span class="nt"&gt;hostIPC&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

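&lt;p&gt;All three fields default to &lt;code&gt;false&lt;/code&gt;; setting them explicitly documents intent and guards against templates that flip them. Also avoid &lt;code&gt;hostPath&lt;/code&gt; volumes unless absolutely required. Together, these choices keep workloads isolated from the host's network stack, process table, IPC, and filesystem.&lt;/p&gt;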
&lt;hr&gt;
&lt;h3&gt;6. Use Trusted Images and Scan Them&lt;/h3&gt;
&lt;p&gt;Use minimal base images (Alpine, Distroless) and trusted registries. Always scan them:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;trivy&lt;span class="w"&gt; &lt;/span&gt;image&lt;span class="w"&gt; &lt;/span&gt;your-registry/app:tag
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This helps catch known CVEs before deployment.&lt;/p&gt;
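&lt;p&gt;A scan only protects a pipeline if a bad result stops the build. The sketch below shows one way to gate on findings using Trivy's &lt;code&gt;--exit-code&lt;/code&gt; and &lt;code&gt;--severity&lt;/code&gt; flags. The function name &lt;code&gt;scan_image&lt;/code&gt;, the severity threshold, and the return code 2 for a missing scanner are assumptions for illustration, not part of any existing tooling.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical CI gate: fail when HIGH or CRITICAL findings exist.
scan_image() {
  local image="$1"
  # Treat a missing scanner as a failure rather than silently passing.
  if [ -z "$(command -v trivy)" ]; then
    echo "trivy not installed; refusing to pass $image unscanned"
    return 2
  fi
  # --exit-code 1 makes trivy exit non-zero when matching findings exist.
  trivy image --exit-code 1 --severity HIGH,CRITICAL "$image"
}

# Example call (image name taken from the command above):
#   scan_image "your-registry/app:tag" || exit 1
```

&lt;p&gt;Failing closed when the scanner is absent is a deliberate choice here; a gate that passes when it cannot run gives a false sense of coverage.&lt;/p&gt;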
&lt;hr&gt;
&lt;h3&gt;7. Handle Secrets via Volumes (Not Env Vars)&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;volumes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;secret-volume&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;secretName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;my-secret&lt;/span&gt;

&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;myapp&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;volumeMounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;secret-volume&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;mountPath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/etc/secret&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;readOnly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

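&lt;p&gt;Environment variables are inherited by child processes, readable through &lt;code&gt;/proc/&amp;lt;pid&amp;gt;/environ&lt;/code&gt;, and often dumped into logs or crash reports. Mounting secrets as read-only files avoids those accidental exposure paths.&lt;/p&gt;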
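&lt;p&gt;On the application side, each key of the Secret appears as a file under the mount path. The following is a minimal sketch of an entrypoint reading one such file; the helper name &lt;code&gt;read_secret&lt;/code&gt; and the key name &lt;code&gt;password&lt;/code&gt; are assumptions, since the original does not say which keys &lt;code&gt;my-secret&lt;/code&gt; contains.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical helper: print one key from a mounted secret directory.
read_secret() {
  local dir="$1" key="$2"
  local path="$dir/$key"
  # Fail without output when the file is missing or unreadable.
  [ -r "$path" ] || return 1
  cat "$path"
}

# /etc/secret matches the mountPath above; "password" is an assumed key name.
if db_password="$(read_secret /etc/secret password)"; then
  # Report only the length so the value never reaches the logs.
  echo "loaded secret (${#db_password} characters)"
else
  echo "no secret mounted at /etc/secret/password"
fi
```

&lt;p&gt;Keeping the value in a shell variable rather than exporting it keeps it out of the environment of child processes.&lt;/p&gt;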
&lt;hr&gt;
&lt;h3&gt;8. Restrict Network Traffic with NetworkPolicies&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;NetworkPolicy&lt;/span&gt;
&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;deny-all-ingress&lt;/span&gt;
&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;podSelector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p p-Indicator"&gt;{}&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;policyTypes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Ingress&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Start with a default-deny ingress policy per namespace, like the one above, then explicitly allow only the traffic your services need. Without NetworkPolicies, any pod can communicate with any other pod in the cluster. Policies are only enforced when the cluster's network plugin supports them.&lt;/p&gt;
&lt;hr&gt;
&lt;h3&gt;9. Harden ServiceAccount Usage&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;automountServiceAccountToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Disable automatic token mounting for pods that don’t need API server access. Create dedicated ServiceAccounts with minimal RBAC bindings rather than relying on the &lt;code&gt;default&lt;/code&gt; account, which often accumulates unnecessary permissions.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Final Thoughts&lt;/h2&gt;
&lt;p&gt;Security isn’t just about tools; it’s about secure defaults. These practices help harden your Kubernetes workloads using the Restricted Pod Security Standard and reduce risks across the board.&lt;/p&gt;
&lt;p&gt;If you’re managing production clusters or sensitive environments, these changes are low-hanging fruit with a high return on security posture.&lt;/p&gt;</content><category term="Kubernetes Security"/><category term="kubernetes"/><category term="hardening"/><category term="pod-security-standards"/></entry><entry><title>Taming the OOM Killer: Process Prioritization for Memory-Constrained Linux Systems</title><link href="https://rivassec.com/oom-killer-process-prioritization.html" rel="alternate"/><published>2025-04-18T00:00:00-07:00</published><updated>2025-04-18T00:00:00-07:00</updated><author><name>RivasSec</name></author><id>tag:rivassec.com,2025-04-18:/oom-killer-process-prioritization.html</id><summary type="html">&lt;p&gt;In memory-constrained environments, the Linux OOM Killer decides what lives and what gets killed. This guide shows how to protect critical processes like sshd and mysqld using oom_score_adj values, with a script that applies them reliably and securely. Make memory pressure predictable and survivable.&lt;/p&gt;</summary><content type="html">&lt;p&gt;In resource-constrained environments — especially virtual private servers, CI agents, and container hosts — the Linux kernel's &lt;strong&gt;Out of Memory Killer (OOM Killer)&lt;/strong&gt; is a last-resort defense mechanism. When memory is exhausted, it begins terminating processes to keep the system alive.&lt;/p&gt;
&lt;p&gt;The OOM Killer uses heuristics (like memory usage and the &lt;code&gt;oom_score_adj&lt;/code&gt; value) to select processes it deems less essential. But you don’t have to leave that critical decision entirely to the kernel's default logic.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;The Incident&lt;/h2&gt;
&lt;p&gt;Years ago, I had to recover a VPS via remote console. A quick dive into &lt;code&gt;/var/log/messages&lt;/code&gt; showed that the OOM Killer had struck, terminating critical services. The culprit? A perfect storm:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Web crawlers (Google, Yahoo, Yandex) simultaneously indexing multiple sites&lt;/li&gt;
&lt;li&gt;A torrent tracker and download script both running&lt;/li&gt;
&lt;li&gt;IRC flood attempts while &lt;code&gt;irssi&lt;/code&gt; was connected&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This combination overwhelmed system memory. Without process priority tuning, the OOM Killer picked its targets purely by its own heuristics, which looked indiscriminate from an operational standpoint: it even took down &lt;code&gt;sshd&lt;/code&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;The Mitigation Strategy&lt;/h2&gt;
&lt;p&gt;You can significantly influence OOM Killer decisions using the &lt;code&gt;/proc/&amp;lt;pid&amp;gt;/oom_score_adj&lt;/code&gt; setting for a process. This value ranges from -1000 to +1000. The kernel uses this score, combined with memory usage, to decide kill priority; a lower score makes the process less likely to be chosen relative to others.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A value of &lt;code&gt;-1000&lt;/code&gt; effectively disables OOM killing for that process.&lt;/li&gt;
&lt;li&gt;A value of &lt;code&gt;+1000&lt;/code&gt; makes it a highly preferred target.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0&lt;/code&gt; is the default.&lt;/li&gt;
&lt;/ul&gt;
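&lt;p&gt;Because only values in that range are meaningful, it can help to validate a score before writing it to &lt;code&gt;/proc&lt;/code&gt;. The sketch below is one way to do that in bash; the helper name &lt;code&gt;valid_oom_score&lt;/code&gt; is hypothetical, and it accepts only an optional minus sign followed by digits (so &lt;code&gt;+1000&lt;/code&gt; would need to be written as &lt;code&gt;1000&lt;/code&gt;).&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script):
# succeeds only when $1 is an integer between -1000 and 1000.
valid_oom_score() {
  local value="$1"
  # Optional minus sign, then one to four digits.
  local re='^-?[0-9]{1,4}$'
  [[ "$value" =~ $re ]] || return 1
  # Range check using integer comparison operators.
  [ "$value" -ge -1000 ] || return 1
  [ "$value" -le 1000 ]
}

for candidate in -1000 0 400 1001 abc; do
  if valid_oom_score "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: rejected"
  fi
done
```

&lt;p&gt;The loop reports &lt;code&gt;-1000&lt;/code&gt;, &lt;code&gt;0&lt;/code&gt;, and &lt;code&gt;400&lt;/code&gt; as ok and rejects &lt;code&gt;1001&lt;/code&gt; and &lt;code&gt;abc&lt;/code&gt;.&lt;/p&gt;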
&lt;p&gt;Here’s a script that reads preferences from a config file and adjusts running process scores accordingly.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;/etc/oom_candidates.conf&lt;/code&gt;&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gh"&gt;#&lt;/span&gt; Format: &amp;lt;process_name&amp;gt; &amp;lt;oom_score_adj_value&amp;gt;
&lt;span class="gh"&gt;#&lt;/span&gt; Higher = more likely to be killed. Negative = more protected.
&lt;span class="gh"&gt;#&lt;/span&gt; Critical Services (Protect Strongly)
sshd -1000
mysqld -500
portsentry -200

&lt;span class="gh"&gt;#&lt;/span&gt; Important Services (Protect Moderately)
apache2 100

&lt;span class="gh"&gt;#&lt;/span&gt; Less Critical Interactive/Background (Allow Killing)
screen 300
irssi 400
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3&gt;&lt;code&gt;oom_adjuster.sh&lt;/code&gt;&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="ch"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;CONFIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/etc/oom_candidates.conf&amp;quot;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;!&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$CONFIG&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Error: Config file &lt;/span&gt;&lt;span class="nv"&gt;$CONFIG&lt;/span&gt;&lt;span class="s2"&gt; not found.&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-r&lt;span class="w"&gt; &lt;/span&gt;line&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;||&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-n&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;do&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;~&lt;span class="w"&gt; &lt;/span&gt;^#.*$&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;||&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-z&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;continue&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-r&lt;span class="w"&gt; &lt;/span&gt;process&lt;span class="w"&gt; &lt;/span&gt;score&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-z&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$process&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;||&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-z&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$score&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Warning: Skipping invalid line: &lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;continue&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;pids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;pgrep&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$process&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-z&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$pids&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;continue&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Adjusting OOM score for &lt;/span&gt;&lt;span class="nv"&gt;$process&lt;/span&gt;&lt;span class="s2"&gt; (PIDs: &lt;/span&gt;&lt;span class="nv"&gt;$pids&lt;/span&gt;&lt;span class="s2"&gt;) to &lt;/span&gt;&lt;span class="nv"&gt;$score&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pid&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$pids&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;do&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-w&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/proc/&lt;/span&gt;&lt;span class="nv"&gt;$pid&lt;/span&gt;&lt;span class="s2"&gt;/oom_score_adj&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$score&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/proc/&lt;/span&gt;&lt;span class="nv"&gt;$pid&lt;/span&gt;&lt;span class="s2"&gt;/oom_score_adj&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&amp;gt;/dev/null
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$?&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-ne&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Warning: Failed to set score for &lt;/span&gt;&lt;span class="nv"&gt;$process&lt;/span&gt;&lt;span class="s2"&gt; (PID: &lt;/span&gt;&lt;span class="nv"&gt;$pid&lt;/span&gt;&lt;span class="s2"&gt;)&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;
&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Warning: Cannot write to oom_score_adj for &lt;/span&gt;&lt;span class="nv"&gt;$process&lt;/span&gt;&lt;span class="s2"&gt; (PID: &lt;/span&gt;&lt;span class="nv"&gt;$pid&lt;/span&gt;&lt;span class="s2"&gt;)&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;$CONFIG&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;OOM score adjustment complete.&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
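&lt;p&gt;To confirm that an adjustment took effect, the value can be read back from &lt;code&gt;/proc&lt;/code&gt;. This is a small sketch assuming a Linux host with &lt;code&gt;pgrep&lt;/code&gt; available; &lt;code&gt;show_oom_adj&lt;/code&gt; is a hypothetical helper name, not part of the script above.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical helper: print the current oom_score_adj for every PID
# whose process name exactly matches $1.
show_oom_adj() {
  local name="$1" pid
  for pid in $(pgrep -x "$name"); do
    # A process may exit between pgrep and the read, so check first.
    if [ -r "/proc/$pid/oom_score_adj" ]; then
      printf '%s pid=%s oom_score_adj=%s\n' \
        "$name" "$pid" "$(cat "/proc/$pid/oom_score_adj")"
    fi
  done
}

show_oom_adj sshd
```

&lt;p&gt;If no matching process is running, the function prints nothing.&lt;/p&gt;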

&lt;hr&gt;
&lt;h2&gt;Running the Script&lt;/h2&gt;
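&lt;p&gt;You can run this periodically via cron or on boot with systemd. The script only adjusts processes that exist when it runs, and a restarted service comes back as a new process without the adjustment, so a boot-time run alone is not enough on long-lived hosts. For example:&lt;/p&gt;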
&lt;h3&gt;&lt;code&gt;/etc/systemd/system/oom-adjuster.service&lt;/code&gt;&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;[Unit]&lt;/span&gt;
&lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;Adjust OOM Scores from config file&lt;/span&gt;
&lt;span class="na"&gt;After&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network.target&lt;/span&gt;

&lt;span class="k"&gt;[Service]&lt;/span&gt;
&lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;oneshot&lt;/span&gt;
&lt;span class="na"&gt;ExecStart&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/usr/local/bin/oom_adjuster.sh&lt;/span&gt;

&lt;span class="k"&gt;[Install]&lt;/span&gt;
&lt;span class="na"&gt;WantedBy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;multi-user.target&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Then run:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;sudo&lt;span class="w"&gt; &lt;/span&gt;systemctl&lt;span class="w"&gt; &lt;/span&gt;daemon-reload
sudo&lt;span class="w"&gt; &lt;/span&gt;systemctl&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;enable&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--now&lt;span class="w"&gt; &lt;/span&gt;oom-adjuster.service
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
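&lt;p&gt;Once the unit has run, it can be worth reading the values back from &lt;code&gt;/proc&lt;/code&gt; to confirm they took effect. The following is a minimal sketch rather than part of the original script: the helper names &lt;code&gt;read_oom_score_adj&lt;/code&gt; and &lt;code&gt;find_mismatches&lt;/code&gt; are hypothetical, and it assumes you already know the PIDs you care about.&lt;/p&gt;

```python
from pathlib import Path


def read_oom_score_adj(pid, proc_root="/proc"):
    """Return the oom_score_adj value for a PID, or None if it cannot be read."""
    path = Path(proc_root) / str(pid) / "oom_score_adj"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Process exited, permission denied, or unexpected file content.
        return None


def find_mismatches(expected, proc_root="/proc"):
    """Compare expected {pid: score} pairs against the live values.

    Returns {pid: (expected, actual)} for every PID whose value differs.
    """
    mismatches = {}
    for pid, want in expected.items():
        got = read_oom_score_adj(pid, proc_root)
        if got != want:
            mismatches[pid] = (want, got)
    return mismatches
```

&lt;p&gt;Calling &lt;code&gt;find_mismatches({1234: -1000})&lt;/code&gt; (with 1234 standing in for a real PID) returns an empty dictionary when the adjustment stuck, which makes it easy to wire into a health check.&lt;/p&gt;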

&lt;hr&gt;
&lt;h2&gt;Security Considerations&lt;/h2&gt;
&lt;p&gt;From a DevSecOps perspective, OOM prioritization is not just about uptime — it’s a security hardening technique:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;SSHD protection&lt;/strong&gt; prevents lockouts during memory exhaustion.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Preserving portsentry or IDS processes&lt;/strong&gt; ensures defense mechanisms remain active.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Avoiding the kill of logging/monitoring agents&lt;/strong&gt; helps retain forensic data post-incident.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Minimizing risk of service flapping&lt;/strong&gt; reduces noisy alerts and potential abuse vectors during DoS scenarios.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Misconfigured systems where critical daemons (like &lt;code&gt;firewalld&lt;/code&gt;, &lt;code&gt;auditd&lt;/code&gt;, &lt;code&gt;sshd&lt;/code&gt;, or VPN tunnels) are killed first are exposed to avoidable downtime and security gaps.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Modern Use Cases&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Kubernetes nodes&lt;/strong&gt;: Influence OOM behavior via Quality of Service (QoS) classes (set by defining resource requests/limits in pod specs), or apply node-level tuning using methods like the script above for critical node components (e.g., kubelet, container runtime).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CI/CD runners&lt;/strong&gt;: Protect build agents or essential runner services from being killed during resource-intensive test suites or concurrent builds.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Shared hosting / VPS&lt;/strong&gt;: Prioritize core services (web server, database, SSH) over potentially less critical user processes or background tasks.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The OOM Killer is an essential part of the Linux kernel, but leaving process termination order purely to default heuristics can be risky in production. By strategically assigning &lt;code&gt;oom_score_adj&lt;/code&gt; values based on business continuity and security priorities, you can significantly reduce recovery time and harden your systems against memory pressure scenarios.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How does your team manage OOM Killer behavior in critical environments? Share your strategies!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Originally inspired by a real-world VPS recovery and refreshed for the modern DevSecOps landscape.&lt;/em&gt;&lt;/p&gt;</content><category term="DevSecOps"/><category term="linux"/><category term="oomkiller"/><category term="memory"/><category term="system-administration"/><category term="devsecops"/><category term="process-management"/><category term="hardening"/></entry><entry><title>Catching a Nation-State Proxy: OSINT Lessons from the Twitter Frontlines</title><link href="https://rivassec.com/venezuela-twitter-proxy-osint.html" rel="alternate"/><published>2025-04-17T00:00:00-07:00</published><updated>2025-04-17T00:00:00-07:00</updated><author><name>RivasSec</name></author><id>tag:rivassec.com,2025-04-17:/venezuela-twitter-proxy-osint.html</id><summary type="html">&lt;p&gt;In 2012, I uncovered a state-aligned Twitter proxy tied to Venezuela’s ruling party. It mimicked Twitter, redirected traffic, and risked phishing user credentials. This post breaks down the OSINT methods I used to uncover it — and why threat intel teams still need to watch for subtle, state-run infrastructure.&lt;/p&gt;</summary><content type="html">&lt;hr&gt;
&lt;h2&gt;Situation&lt;/h2&gt;
&lt;p&gt;In the lead-up to Venezuela’s 2012 regional elections, I observed unusual behavior around Twitter access within the country. What began as anecdotal reports of DNS outages evolved into a deeper investigation that revealed a state-aligned proxy infrastructure potentially capable of phishing Twitter credentials.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Key Finding&lt;/h2&gt;
&lt;p&gt;A subdomain under &lt;code&gt;chavezcandanga.org.ve&lt;/code&gt;, a domain named after @chavezcandanga, the official Twitter handle of then-President Hugo Chávez, was hosting a &lt;strong&gt;transparent proxy to Twitter&lt;/strong&gt;.
A transparent proxy intercepts user traffic without modifying requests or requiring client-side configuration, making it ideal for passive surveillance or phishing.&lt;/p&gt;
&lt;p&gt;While it initially showed no malicious behavior, it was:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Hosted on IP addresses &lt;strong&gt;outside of Twitter’s ranges&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Registered to Venezuela’s ruling party (PSUV – &lt;em&gt;Partido Socialista Unido de Venezuela&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Promoted through state-controlled media and bot accounts&lt;/li&gt;
&lt;li&gt;Served from the same IP as a political messaging app&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2&gt;OSINT Breakdown&lt;/h2&gt;
&lt;h3&gt;1. DNS Resolution&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;host&lt;span class="w"&gt; &lt;/span&gt;twitter.com
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Returned expected Twitter IPs (199.59.x.x), but users in Venezuela were silently being redirected to:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="m"&gt;190&lt;/span&gt;.202.80.20
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This IP &lt;strong&gt;served Twitter content&lt;/strong&gt; but was not operated by Twitter Inc.&lt;/p&gt;
&lt;p&gt;It’s unclear whether this redirection was caused by ISP DNS override, local resolver poisoning, or upstream hijack — but the net effect was consistent: Twitter domains were silently redirected to non-Twitter infrastructure under state control.&lt;/p&gt;
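&lt;p&gt;The core check here, whether a resolved address belongs to the provider it claims to be, is easy to automate. The sketch below uses Python’s standard &lt;code&gt;ipaddress&lt;/code&gt; module. The &lt;code&gt;199.59.0.0/16&lt;/code&gt; network is only a stand-in for the "199.59.x.x" range mentioned above, and &lt;code&gt;EXPECTED_NETWORKS&lt;/code&gt;, &lt;code&gt;is_expected&lt;/code&gt;, and &lt;code&gt;flag_unexpected&lt;/code&gt; are hypothetical names; a real check would load the provider’s published prefixes.&lt;/p&gt;

```python
import ipaddress

# Assumption: 199.59.0.0/16 stands in for the "199.59.x.x" range quoted above;
# a real check would use the provider's published prefixes.
EXPECTED_NETWORKS = [ipaddress.ip_network("199.59.0.0/16")]


def is_expected(ip, networks=EXPECTED_NETWORKS):
    """Return True if the address falls inside any expected network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def flag_unexpected(resolved_ips, networks=EXPECTED_NETWORKS):
    """Return the resolved addresses that fall outside the expected networks."""
    return [ip for ip in resolved_ips if not is_expected(ip, networks)]
```

&lt;p&gt;Feeding it the addresses a resolver returns from several vantage points, for example &lt;code&gt;flag_unexpected(["199.59.148.10", "190.202.80.20"])&lt;/code&gt;, would single out the second address, the one observed in this investigation.&lt;/p&gt;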
&lt;hr&gt;
&lt;h3&gt;2. WHOIS and Hosting Clues&lt;/h3&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;whois&lt;span class="w"&gt; &lt;/span&gt;chavezcandanga.org.ve
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Revealed that the domain was registered to &lt;strong&gt;PSUV&lt;/strong&gt; (&lt;em&gt;Partido Socialista Unido de Venezuela&lt;/em&gt;) and managed through CONATEL — Venezuela’s FCC-equivalent telecommunications regulator.&lt;/p&gt;
&lt;p&gt;&lt;img alt="WHOIS output showing chavezcandanga.org.ve registered to PSUV" src="https://rivassec.com/images/who-is-chavezcandanga-com.jpg"&gt;
&lt;em&gt;Figure: WHOIS lookup confirms chavezcandanga.org.ve is registered to PSUV, with administrative and technical contacts using @psuv.org.ve emails.&lt;/em&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h3&gt;3. Application Infrastructure&lt;/h3&gt;
&lt;p&gt;The same server IP hosted:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;mensajes.chavezcandanga.org.ve&lt;/code&gt; – a campaign messaging platform&lt;/li&gt;
&lt;li&gt;A proxy script that mirrored Twitter’s login screen&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img alt="Screenshot of chavezcandanga.org.ve auto-retweet app" src="https://rivassec.com/images/chavezcandanga-web.jpg"&gt;
&lt;em&gt;Figure: The official chavezcandanga.org.ve campaign app asks users to authenticate with Twitter to enable automatic retweets of Chávez's posts.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;At the time of discovery, this site &lt;strong&gt;did not contain malicious code&lt;/strong&gt;, but the potential for &lt;strong&gt;credential harvesting&lt;/strong&gt; during peak election activity was substantial. The authentication flow mimicked Twitter’s branding and prompted users to log in — creating a window for silent credential capture, token misuse, or targeted amplification based on follower behavior.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Threat Model &amp;amp; Implications&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Credential Harvesting Risk&lt;/strong&gt;: Even without malware, a proxy to Twitter login enables password theft.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Social Media Control&lt;/strong&gt;: Through automated bots, the government amplified its message while monitoring access points.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Authentication-Layer Surveillance&lt;/strong&gt;: Intercepting Twitter logins also enables identity tracking and selective disinformation at the user level.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Infrastructure Trust Erosion&lt;/strong&gt;: Even minor state-level interference with DNS or TLS undermines confidence in web authentication across the board.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Evasion of International Scrutiny&lt;/strong&gt;: By mimicking Twitter directly, the proxy could deceive users into trusting controlled infrastructure.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2&gt;Lessons for DevSecOps &amp;amp; Threat Intelligence Today&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Verify TLS certificates and domain trust chains&lt;/strong&gt; during high-risk periods like elections.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;host&lt;/code&gt;, &lt;code&gt;whois&lt;/code&gt;, and passive DNS to correlate domains and IP ranges. Modern tools like &lt;code&gt;amass&lt;/code&gt; and certificate transparency logs expand this capability significantly.&lt;/li&gt;
&lt;li&gt;Query infrastructure databases (Shodan, Censys) for historical records on suspicious IPs and exposed services.&lt;/li&gt;
&lt;li&gt;Watch for &lt;strong&gt;content delivery mismatches&lt;/strong&gt; (site appears normal, IP is not).&lt;/li&gt;
&lt;li&gt;Document and archive suspicious infra using tools like the Wayback Machine.&lt;/li&gt;
&lt;li&gt;Phishing infrastructure can be &lt;strong&gt;state-sponsored and subtle&lt;/strong&gt; — early detection matters.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2&gt;Epilogue&lt;/h2&gt;
&lt;p&gt;The proxy remained active until at least December 2012, shortly before the regional elections. To this day, the archived proxy content and WHOIS records serve as a warning about the ease with which social media can be co-opted in hostile environments.&lt;/p&gt;
&lt;p&gt;This investigation was one of the earliest times I realized how fragile trusted infrastructure becomes in the hands of a motivated actor — and how critical open-source techniques are in defending it.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;Have you ever spotted unusual network redirections or infrastructure anomalies? What tools or tactics helped you confirm your suspicions?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Originally published in 2012 and revisited in 2025 to reflect current DevSecOps and threat intelligence practices.&lt;/em&gt;&lt;/p&gt;</content><category term="Threat Intelligence"/><category term="osint"/><category term="threat-intelligence"/><category term="phishing"/><category term="venezuela"/><category term="twitter"/><category term="surveillance"/><category term="devsecops"/></entry><entry><title>The 208.5-Day Kernel Bug: A Lesson in Uptime, Overflow, and Operational Risk</title><link href="https://rivassec.com/208-day-kernel-bug-lessons.html" rel="alternate"/><published>2025-04-16T00:00:00-07:00</published><updated>2025-04-16T00:00:00-07:00</updated><author><name>RivasSec</name></author><id>tag:rivassec.com,2025-04-16:/208-day-kernel-bug-lessons.html</id><summary type="html">&lt;p&gt;A 2012 Linux kernel bug caused CPU lockups after 208.5 days of uptime due to an integer overflow in sched_clock(). Affecting RHEL 5 and 6, it exposed the risks of long uptimes, underscoring the importance of timely patching, uptime observability, and operational risk management in DevSecOps.&lt;/p&gt;</summary><content type="html">&lt;p&gt;In 2012, a subtle but potentially catastrophic bug was discovered in older versions of the Linux kernel — particularly affecting Red Hat Enterprise Linux (RHEL) and its derivatives. Once a system reached &lt;strong&gt;208.5 days of continuous uptime&lt;/strong&gt;, a flaw in the kernel’s &lt;code&gt;sched_clock()&lt;/code&gt; function could trigger a soft lockup, freezing the CPU for an estimated &lt;strong&gt;584 years&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Yes, &lt;strong&gt;584 years&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The root cause? An &lt;strong&gt;unsigned 64-bit integer overflow&lt;/strong&gt;. The kernel attempted to compute elapsed nanoseconds based on CPU cycles, using this logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="cm"&gt;/* Simplified representation of the overflow-prone calculation */&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;smp_processor_id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;long&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;long&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ns&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;per_cpu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cyc2ns_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="cm"&gt;/* cyc is the raw cycle count; the 64-bit product below can wrap before the shift is applied */&lt;/span&gt;
&lt;span class="n"&gt;ns&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cyc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;per_cpu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cyc2ns&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CYC2NS_SCALE_FACTOR&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ns&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Once the intermediate product &lt;code&gt;cyc * cyc2ns&lt;/code&gt; exceeded &lt;code&gt;0xffffffffffffffff&lt;/code&gt;, it wrapped around, corrupting the scheduler’s view of time and leaving the system in an unrecoverable state that required a manual reboot.&lt;/p&gt;
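&lt;p&gt;The 208.5-day figure falls out of simple arithmetic. The sketch below assumes &lt;code&gt;CYC2NS_SCALE_FACTOR&lt;/code&gt; was 10 on the affected x86 kernels, which means the multiplied value is roughly the elapsed nanoseconds scaled up by 2&lt;sup&gt;10&lt;/sup&gt; before the shift brings it back down. The variable names are illustrative, and the 584-year comparison assumes that number refers to the full range of a 64-bit nanosecond counter.&lt;/p&gt;

```python
# Assumption: CYC2NS_SCALE_FACTOR was 10 in the affected x86 kernels, so the
# per-CPU multiplier cyc2ns is roughly (nanoseconds per cycle) * 2**10.
CYC2NS_SCALE_FACTOR = 10
NS_PER_DAY = 24 * 60 * 60 * 10**9
NS_PER_YEAR = 365.25 * NS_PER_DAY

# cyc * cyc2ns is about elapsed_ns * 2**10, and it must fit in 64 bits.
# The product therefore wraps once elapsed_ns reaches 2**(64 - 10) = 2**54.
overflow_ns = 2 ** (64 - CYC2NS_SCALE_FACTOR)
overflow_days = overflow_ns / NS_PER_DAY

# For comparison: the full range of a 64-bit nanosecond counter, in years.
full_range_years = 2**64 / NS_PER_YEAR

print(f"product overflows after {overflow_days:.1f} days")
print(f"2**64 ns spans about {full_range_years:.1f} years")
```

&lt;p&gt;In other words, the scale factor costs ten bits of headroom: the counter itself could run for centuries, but the intermediate product runs out of room after roughly seven months.&lt;/p&gt;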
&lt;hr&gt;
&lt;h3&gt;Why This Matters to DevSecOps&lt;/h3&gt;
&lt;p&gt;This bug is more than a curiosity — it's a classic case study in:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The operational danger of long uptimes&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why kernel patching should be automated and observable&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;How integer overflows can lead to severe availability risks&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Affected systems included RHEL 5.0 through 5.5 and early RHEL 6 versions running kernels below &lt;code&gt;2.6.32-220.4.*&lt;/code&gt;. Some Debian-based distributions were likely impacted, though documentation was less complete.&lt;/p&gt;
&lt;hr&gt;
&lt;h3&gt;Takeaways for Modern Systems&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Live patching tools&lt;/strong&gt; like Ksplice, KernelCare, and kpatch can reduce reboot pressure&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Observability stacks&lt;/strong&gt; should alert on uptime thresholds and kernel messages (&lt;code&gt;dmesg&lt;/code&gt;, &lt;code&gt;uptime&lt;/code&gt;, scheduler warnings)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compliance frameworks&lt;/strong&gt; often require timely OS patching — this bug illustrates why&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CI/CD pipelines for OS-level components&lt;/strong&gt; should test for edge cases, including time-based and overflow scenarios&lt;/li&gt;
&lt;/ul&gt;
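&lt;p&gt;An uptime alert of the kind described above needs very little code. This is a minimal sketch that parses the contents of &lt;code&gt;/proc/uptime&lt;/code&gt;; the 200-day threshold and the function names &lt;code&gt;uptime_days&lt;/code&gt; and &lt;code&gt;needs_attention&lt;/code&gt; are illustrative assumptions, not values taken from any particular monitoring stack.&lt;/p&gt;

```python
# Assumption: 200 days is an illustrative alert threshold chosen to leave
# headroom before the 208.5-day mark; tune it to your patch cadence.
UPTIME_ALERT_DAYS = 200.0


def uptime_days(proc_uptime_text):
    """Parse the contents of /proc/uptime and return uptime in days."""
    # /proc/uptime holds two floats: seconds since boot and aggregate idle seconds.
    seconds_since_boot = float(proc_uptime_text.split()[0])
    return seconds_since_boot / 86400.0


def needs_attention(proc_uptime_text, threshold_days=UPTIME_ALERT_DAYS):
    """Return True once uptime has reached the alert threshold."""
    return uptime_days(proc_uptime_text) >= threshold_days
```

&lt;p&gt;On a Linux host, &lt;code&gt;needs_attention(open("/proc/uptime").read())&lt;/code&gt; gives a boolean that can be exported as a metric or used to open a patching ticket.&lt;/p&gt;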
&lt;hr&gt;
&lt;p&gt;Even today, this incident reminds us that uptime isn't always a badge of honor. In some cases, it's a quiet countdown to failure.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Originally inspired by a 2012 analysis of the &lt;code&gt;sched_clock()&lt;/code&gt; bug affecting Linux systems with prolonged uptime.&lt;/em&gt;&lt;/p&gt;</content><category term="DevSecOps"/><category term="kernel"/><category term="bug"/><category term="Linux"/><category term="uptime"/><category term="overflow"/><category term="devsecops"/><category term="integer-overflow"/></entry><entry><title>The Chaos of the Leap Second (2012): When Time Broke Java and the Cloud</title><link href="https://rivassec.com/leap-second-chaos-2012.html" rel="alternate"/><published>2025-04-15T00:00:00-07:00</published><updated>2025-04-15T00:00:00-07:00</updated><author><name>RivasSec</name></author><id>tag:rivassec.com,2025-04-15:/leap-second-chaos-2012.html</id><summary type="html">&lt;p&gt;In 2012, a single leap second triggered global outages across Reddit, Yelp, and more. This retrospective unpacks how fragile timekeeping broke Java apps at scale, and what DevOps, SRE, and distributed systems teams can do today to avoid repeating history.&lt;/p&gt;</summary><content type="html">&lt;hr&gt;
&lt;h2&gt;What Happened?&lt;/h2&gt;
&lt;p&gt;On June 30, 2012, a &lt;strong&gt;leap second&lt;/strong&gt; was inserted into UTC and distributed via NTP to keep civil time aligned with Earth’s rotation. At 23:59:60 UTC, global systems experienced a hiccup: a single extra second that caused widespread disruptions across Reddit, LinkedIn, Yelp, Google, Foursquare, and many more.&lt;/p&gt;
&lt;p&gt;What followed were 500 errors, high latency, and CPU usage spikes that crippled backend services.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Why Did It Break?&lt;/h2&gt;
&lt;p&gt;Though seemingly minor, the leap second broke systems in subtle and severe ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Java Runtime Sensitivity:&lt;/strong&gt; Popular JVM versions at the time failed to handle the repeated 23:59:59 correctly. This triggered runaway CPU usage via thread timing bugs, particularly in services running Hadoop, Cassandra, and Elasticsearch.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Userland Misbehavior:&lt;/strong&gt; While many Linux kernels handled the leap second without panic, userland libraries and runtimes (especially Java) choked under non-monotonic time changes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud Weakness Exposure:&lt;/strong&gt; An Amazon EC2 outage the day before had already left infrastructure strained. With fewer available instances, systems were more vulnerable when the leap second hit.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Limited Real-World Testing:&lt;/strong&gt; Simulating leap seconds under actual load, in full-stack distributed systems, proved nearly impossible. Pre-patch validations missed edge behavior.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2&gt;Real-World Impact&lt;/h2&gt;
&lt;p&gt;This bug hit nearly every high-scale Java-based system:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Stacks&lt;/strong&gt;: Cassandra, Hadoop, Elasticsearch, JVM-based schedulers&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Companies&lt;/strong&gt;: Reddit, Mozilla, Yelp, LinkedIn, Gawker, Facebook, StumbleUpon&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Behavior&lt;/strong&gt;: High CPU loops, 500 errors, frozen services, delayed recovery due to restart complexity&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In many cases, the kernel &lt;em&gt;didn't fail&lt;/em&gt; — the chaos came from how services processed time at runtime.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Mitigation &amp;amp; Takeaways&lt;/h2&gt;
&lt;h3&gt;Immediate Fixes in 2012&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Rolling Restarts&lt;/strong&gt;: Restarting affected Java services often cleared the CPU lock-up, though distributed services like Cassandra made this time-consuming.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Manual Clock Reset&lt;/strong&gt;: Some environments required forcibly resetting system time post-leap second. This fix was often applied via config tools like Puppet:&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# CAUTION: Only use in environments that can tolerate manual time reset.&lt;/span&gt;
sudo&lt;span class="w"&gt; &lt;/span&gt;/etc/init.d/ntp&lt;span class="w"&gt; &lt;/span&gt;stop
date
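&lt;span class="c1"&gt;# Set the clock to its own current value (requires root).&lt;/span&gt;
sudo&lt;span class="w"&gt; &lt;/span&gt;date&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;date&lt;span class="w"&gt; &lt;/span&gt;+&lt;span class="s2"&gt;&amp;quot;%m%d%H%M%C%y.%S&amp;quot;&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;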
date
sudo&lt;span class="w"&gt; &lt;/span&gt;/etc/init.d/ntp&lt;span class="w"&gt; &lt;/span&gt;start
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h3&gt;Modern Resilience Strategies&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Leap Smearing&lt;/strong&gt;: Today’s &lt;code&gt;ntpd&lt;/code&gt; and &lt;code&gt;chronyd&lt;/code&gt; can be configured for “leap smear,” and major cloud providers apply it on their time services: clocks are adjusted slowly over hours to avoid a time jump entirely.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Use Monotonic Clocks&lt;/strong&gt;: Time-sensitive logic should rely on &lt;code&gt;CLOCK_MONOTONIC&lt;/code&gt;, not wall-clock time, to measure durations safely.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img alt="Wall Clock vs Monotonic Time During a Leap Second" src="https://rivassec.com/images/leap_second_monotonic_vs_wall_clock.png"&gt;
&lt;em&gt;Figure: Monotonic time continues uninterrupted while wall-clock time repeats a second — highlighting why monotonic clocks are preferred for duration tracking.&lt;/em&gt;&lt;/p&gt;
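&lt;p&gt;In application code, the monotonic-clock advice usually comes down to which function you call when measuring a duration or checking a timeout. A small sketch in Python, where &lt;code&gt;time.monotonic()&lt;/code&gt; is backed by &lt;code&gt;CLOCK_MONOTONIC&lt;/code&gt; on Linux; the helper names &lt;code&gt;timed&lt;/code&gt; and &lt;code&gt;deadline_passed&lt;/code&gt; are hypothetical:&lt;/p&gt;

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    # time.monotonic() never goes backwards, so a leap second, NTP step, or
    # manual clock reset during the call cannot produce a negative duration.
    start = time.monotonic()
    result = fn(*args, **kwargs)
    elapsed = time.monotonic() - start
    return result, elapsed


def deadline_passed(deadline):
    """Check a deadline that was computed as time.monotonic() + timeout."""
    return time.monotonic() >= deadline
```

&lt;p&gt;The same pattern written with &lt;code&gt;time.time()&lt;/code&gt; would work on most days and misbehave precisely when the wall clock is stepped, which is what makes this class of bug hard to catch in testing.&lt;/p&gt;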
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Monitor Clock Drift&lt;/strong&gt;: Observability pipelines should expose clock sync state and NTP drift as first-class metrics.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Design for Temporal Anomalies&lt;/strong&gt;: Distributed systems should assume wall-clock time can regress, freeze, or desync — and gracefully degrade when it does.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Simulated Testing Isn’t Enough&lt;/strong&gt;: Always combine synthetic load with chaos testing under unusual real-world conditions (e.g., leap seconds, DNS failures, NTP skew).&lt;/li&gt;
&lt;/ul&gt;
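&lt;p&gt;To make the leap smear idea concrete, here is a sketch of the arithmetic behind a linear smear. It assumes the extra second is absorbed evenly across a 24-hour window; real deployments differ in window length, shape, and alignment, and the constants and function names are illustrative only.&lt;/p&gt;

```python
# Assumption: a linear smear that absorbs one leap second over a 24-hour window.
# Real deployments differ in window length, shape, and alignment.
SMEAR_WINDOW_SECONDS = 24 * 60 * 60
LEAP_SECONDS_TO_ABSORB = 1.0


def smear_offset(seconds_into_window):
    """Return how much of the leap second has been absorbed so far."""
    # Clamp progress to [0, 1] so times outside the window are handled sanely.
    progress = min(max(seconds_into_window / SMEAR_WINDOW_SECONDS, 0.0), 1.0)
    return LEAP_SECONDS_TO_ABSORB * progress


def slowdown_ppm():
    """Rate by which the smeared clock runs slow during the window, in ppm."""
    return LEAP_SECONDS_TO_ABSORB / SMEAR_WINDOW_SECONDS * 1_000_000
```

&lt;p&gt;Under these assumptions the clock runs slow by about 11.6 parts per million, so no single second is repeated or skipped and applications never see time move backwards.&lt;/p&gt;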
&lt;hr&gt;
&lt;h2&gt;Epilogue&lt;/h2&gt;
&lt;p&gt;The 2012 leap second chaos wasn’t caused by incompetence — many teams patched, prepared, and tested. But the leap second hit during degraded cloud capacity, exposed fragile JVM behavior, and stressed assumptions in time-sensitive code.&lt;/p&gt;
&lt;p&gt;A single second exposed fault lines in the foundations of the modern internet.&lt;/p&gt;
&lt;p&gt;In 2022, the General Conference on Weights and Measures (CGPM) voted to stop adding leap seconds by 2035, a decision motivated in part by incidents like this one. Until then, the mitigations above remain essential for any system that touches wall-clock time.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;What other “just time” failures have caught you off guard in production? Let’s share war stories.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;(Originally published in 2012. Revisited and revised in 2025 for modern SREs, DevOps, and distributed systems engineers.)&lt;/em&gt;&lt;/p&gt;</content><category term="Incident Retrospectives"/><category term="leap-second"/><category term="kernel"/><category term="linux"/><category term="java"/><category term="ntp"/><category term="distributed-systems"/><category term="devops"/><category term="sre"/><category term="incident-retrospective"/></entry></feed>