Most teams do not fail at Kubernetes security because they lack tools. They fail because they cannot turn one risky behavior into a decision, a rule, a triage path, and a rollout step. The warn → audit → enforce sequence is where that rollout step gets real.
Pod Security Admission has been stable in Kubernetes since 1.25. The mechanism is well-documented. The three modes — warn, audit, enforce — are clearly defined. And yet teams still flip directly to enforce, break workloads, and then either roll back or leave enforcement off indefinitely while they figure out what happened.
The problem is not a misunderstanding of the API. It is a misunderstanding of what each phase is actually for, and what work needs to happen in each one before the next phase is safe to enter.
What each phase actually does
Before going into what goes wrong, it is worth being precise about what each mode does — because the names are slightly misleading.
Warn surfaces violations to the person or system submitting the manifest. When a workload violates the policy level set for warn, the API server returns a warning in the response. The workload is still admitted. Warn mode on its own writes nothing to the Kubernetes audit log; that is the job of audit mode. Developers see the warning only if their tooling surfaces API response warnings, which many CI/CD pipelines do not do by default.
Audit records violations to the Kubernetes audit log without surfacing them to the submitting user. The workload is still admitted. The violations are visible to anyone who reads the audit log, which in practice means the platform team and security, not the developers who own the workloads.
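Enforce rejects pods that violate the policy. The pod is not created. Enforcement is evaluated on Pod objects, not on workload resources: a non-compliant Deployment is still accepted, but its ReplicaSet cannot create pods, so the failure shows up as a rollout that never becomes ready and as FailedCreate events rather than as an error on apply. Pods that were already running when the label was applied are not evicted. This is the only mode that actually prevents non-compliant workloads from running.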
The three modes are typically applied as namespace labels of the form pod-security.kubernetes.io/<mode>: <level>. The modes are independent of each other, so a namespace can enforce one policy level and warn at another. For example, with warn=restricted and enforce=baseline on the same namespace, restricted violations are flagged but not blocked, and baseline violations are blocked.
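As a minimal sketch of how those labels fit together, the helper below builds the label set for a namespace. The label keys are the ones Pod Security Admission reads; the function name psa_labels and its arguments are hypothetical, chosen only for illustration.

```python
# Sketch: build Pod Security Admission namespace labels.
# psa_labels is a hypothetical helper, not part of any Kubernetes client.
PSA_PREFIX = "pod-security.kubernetes.io"
MODES = ("enforce", "audit", "warn")
LEVELS = ("privileged", "baseline", "restricted")


def psa_labels(version="latest", **modes):
    """Return namespace labels for the given mode=level pairs."""
    labels = {}
    for mode, level in modes.items():
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        labels[f"{PSA_PREFIX}/{mode}"] = level
        # The -version label selects which release's policy definition
        # applies; "latest" follows the cluster version.
        labels[f"{PSA_PREFIX}/{mode}-version"] = version
    return labels


if __name__ == "__main__":
    # Flag restricted violations, block baseline violations.
    for key, value in psa_labels(enforce="baseline", warn="restricted").items():
        print(f"{key}={value}")
```

Each printed key=value pair can be passed to kubectl label namespace, or written into the namespace manifest under metadata.labels.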
What goes wrong when teams skip the phases
The most common failure pattern: a platform team correctly identifies that production namespaces should be at the restricted policy level. They apply enforce=restricted. Running pods are left alone, but the next rollout, restart, or node drain cannot create replacement pods, and several deployments break. The platform team rolls back to enforce=baseline or removes enforcement entirely. The security improvement is lost.
What went wrong is not that restricted was the wrong target. It is that nobody knew which workloads would fail before enforcement went live.
The warn and audit phases exist precisely to answer that question. Used correctly, they give you a complete picture of which workloads are non-compliant, what violations they have, and which teams own them — before anything breaks.
A second failure pattern: the team uses warn mode but does not tell developers that warnings are being generated. Warnings appear in API responses, but most CI/CD systems do not surface them visibly. Developers submit manifests, get warnings they never see, and continue as usual. When enforce goes live, the violations that should have been cleaned up during the warn phase are all still present.
A third failure pattern: the team applies enforcement to a namespace and then creates exceptions for every workload that breaks. The enforcement is technically in place, but the exception list is long enough to cover most workloads. The standard has been nominally enforced and practically hollowed out.
How to run each phase correctly
Warn phase: 2–4 weeks
Apply warn at your target policy level to every namespace in scope. Do not apply enforce yet.
The warn phase has two jobs. First, generate a complete inventory of violations. Second, communicate to developers that changes are coming and what they need to fix.
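For the inventory: if your CI/CD pipeline surfaces API warnings, use that. If it does not, keep in mind that warn mode alone leaves nothing in the audit log. Two options fill the gap. A server-side dry run of the enforce label (for example, kubectl label --dry-run=server --overwrite ns <namespace> pod-security.kubernetes.io/enforce=restricted) returns a warning for each existing pod that would violate the level. Alternatively, set the audit label at the target level early so that violations are recorded as audit annotations. The goal is a list of every workload that would fail if enforcement were applied today, with the specific violation for each one.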
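To turn one warning into inventory rows, a small parser is usually enough. The sketch below assumes the message shape the admission controller typically emits (a would violate PodSecurity prefix followed by comma-separated violations, each with a parenthesised detail); that shape is an assumption here, so check it against real output from your cluster. The function name split_violations is hypothetical.

```python
# Sketch: split one PodSecurity warning into individual violations.
# The message format is assumed from typical output, not guaranteed.
def split_violations(message):
    """Return (policy, [violation, ...]) for a single warning string."""
    head, _, body = message.partition(": ")
    # The policy sits in quotes in the prefix, e.g. "restricted:latest".
    policy = head.split('"')[1] if '"' in head else None

    parts, current, depth = [], [], 0
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        # Only commas outside parentheses separate violations; commas
        # inside the detail (lists of container names) are kept.
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current).strip())
    return policy, [p for p in parts if p]


if __name__ == "__main__":
    sample = (
        'would violate PodSecurity "restricted:latest": '
        'allowPrivilegeEscalation != false (container "app" must set '
        "securityContext.allowPrivilegeEscalation=false), "
        'runAsNonRoot != true (pod or containers "app", "sidecar" must set '
        "securityContext.runAsNonRoot=true)"
    )
    policy, violations = split_violations(sample)
    print(policy)
    for v in violations:
        print("-", v)
```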
For the communication: this is not a security announcement. It is a developer operations notice. The message needs to answer three questions: what is changing, what specifically do you need to fix in your service, and when does enforcement go live. Generic notices do not work. Workload-specific notices do. If you can tell a developer “your api-server deployment in the production namespace needs runAsNonRoot: true and allowPrivilegeEscalation: false before April 22,” that developer will fix it. If you send a company-wide note about security policy updates, most developers will not read it carefully enough to know it affects them.
Audit phase: 1–2 weeks
After the warn phase has generated the violation inventory and developer fixes are underway, apply audit at the same level. Keep warn in place as well.
The audit phase has one job: confirm that the violation inventory is shrinking. Monitor the audit log for new violations. Track which workloads have been fixed and which have not. Follow up with teams that have not made changes.
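One way to track whether the inventory is shrinking is to group audit events by workload. The sketch below assumes the audit log is available as JSON lines and that violations appear under the pod-security.kubernetes.io/audit-violations annotation; how you export the log, and the function name violations_by_workload, are assumptions for illustration.

```python
# Sketch: group Pod Security audit violations by workload.
# Assumes one JSON audit event per line; the export path is up to you.
import json
from collections import defaultdict

AUDIT_KEY = "pod-security.kubernetes.io/audit-violations"


def violations_by_workload(lines):
    """Map (namespace, resource, name) -> set of violation messages."""
    inventory = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        message = (event.get("annotations") or {}).get(AUDIT_KEY)
        if message is None:
            continue  # not a Pod Security violation event
        ref = event.get("objectRef") or {}
        key = (ref.get("namespace"), ref.get("resource"), ref.get("name"))
        inventory[key].add(message)
    return dict(inventory)


if __name__ == "__main__":
    event = {
        "objectRef": {
            "namespace": "production",
            "resource": "deployments",
            "name": "api-server",
        },
        "annotations": {AUDIT_KEY: "runAsNonRoot != true"},
    }
    for key, messages in violations_by_workload([json.dumps(event)]).items():
        print(key, sorted(messages))
```

Running this on each day's log and comparing the set of keys gives a simple fixed versus still-open view per workload.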
The audit phase is also where you identify the workloads that cannot comply on the enforcement date — because of a hard dependency, a third-party component, or a remediation that takes longer than expected. These workloads need to be either excepted with a documented rationale, or put on a specific timeline for compliance.
Enforce phase
Apply enforce only after: the violation inventory is clear, all unfixed workloads either have accepted exceptions or have a documented compliance timeline, and developers have been given a final warning with a specific date.
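The first two conditions can be expressed as a simple gate. This is a sketch under the assumption that workloads are tracked by name in three lists you maintain yourself; enforcement_blockers and ready_to_enforce are hypothetical names.

```python
# Sketch: a pre-enforcement gate over three caller-maintained lists.
def enforcement_blockers(violating, excepted, on_timeline):
    """Return workloads that still violate the policy and have neither
    an accepted exception nor a documented compliance timeline."""
    return sorted(set(violating) - set(excepted) - set(on_timeline))


def ready_to_enforce(violating, excepted, on_timeline):
    """True only when nothing is left unaccounted for."""
    return not enforcement_blockers(violating, excepted, on_timeline)


if __name__ == "__main__":
    blockers = enforcement_blockers(
        violating=["api-server", "legacy-batch", "vendor-agent"],
        excepted=["vendor-agent"],
        on_timeline=["legacy-batch"],
    )
    print(blockers)
```

The third condition, a final warning with a specific date, is a communication step and stays outside the check.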
The enforce phase should not produce surprises. If it does, either the audit phase was too short or the developer communication was insufficient. Both of those are fixable before the next rollout.
The namespace classification decision
Before any of the above phases can run, you need to decide which namespaces get which enforcement level. This is not a technical decision — it is an organizational one, and it needs to be made explicitly.
The baseline Pod Security standard blocks the most dangerous configurations: privileged containers, host namespace access, hostPath volumes, and host ports. It is appropriate for almost all application namespaces.
The restricted standard adds non-root execution, no privilege escalation, dropping all capabilities, a seccomp profile, and a limited set of volume types. It is appropriate for most new application workloads, but some legacy workloads may require exceptions.
The privileged level applies no restrictions. It is appropriate for system namespaces — kube-system, CNI components, monitoring agents — and nothing else in production.
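To make the restricted level concrete, the sketch below checks a pod spec (as a plain dict) against four of its controls. It is deliberately a subset: it ignores ephemeral containers, volume types, runAsUser, and the narrow allowances the standard makes, so treat it as an illustration rather than a substitute for the admission controller. The function name restricted_findings is hypothetical.

```python
# Sketch: check a pod spec dict against a SUBSET of restricted controls.
def restricted_findings(pod_spec):
    """Return human-readable findings for a few restricted-level checks."""
    findings = []
    pod_sc = pod_spec.get("securityContext") or {}
    containers = (pod_spec.get("containers") or []) + (
        pod_spec.get("initContainers") or []
    )
    for c in containers:
        name = c.get("name", "?")
        sc = c.get("securityContext") or {}

        if sc.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation must be false")

        # A container-level value wins; otherwise the pod-level value applies.
        non_root = sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot"))
        if non_root is not True:
            findings.append(f"{name}: runAsNonRoot must be true")

        drops = (sc.get("capabilities") or {}).get("drop") or []
        if "ALL" not in drops:
            findings.append(f"{name}: capabilities.drop must include ALL")

        profile = sc.get("seccompProfile") or pod_sc.get("seccompProfile") or {}
        if profile.get("type") not in ("RuntimeDefault", "Localhost"):
            findings.append(
                f"{name}: seccompProfile.type must be RuntimeDefault or Localhost"
            )
    return findings


if __name__ == "__main__":
    spec = {"containers": [{"name": "app", "image": "example"}]}
    for finding in restricted_findings(spec):
        print(finding)
```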
The common mistake is leaving namespace classification implicit. Platform teams know in their heads which namespaces are system namespaces and which are application namespaces, but it is never written down. When a new namespace is created, it gets no enforcement label by default, which on a cluster without a configured admission default means it is effectively privileged. When someone asks why a specific namespace is at a specific level, nobody can point to the decision.
Write the classification down. Every namespace in scope should have a documented enforcement level, with the rationale for any exception from the default.
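Writing it down can be as simple as a checked-in mapping plus a check that flags gaps. In the sketch below the namespace names, the default level, and the helper classification_gaps are all hypothetical; the list of cluster namespaces would come from wherever you already enumerate them.

```python
# Sketch: a documented namespace classification and a gap check.
# DEFAULT_LEVEL and the example entries are assumptions for illustration.
DEFAULT_LEVEL = "restricted"

CLASSIFICATION = {
    "kube-system": {"level": "privileged", "rationale": "system components"},
    "production": {"level": "restricted", "rationale": ""},
    "legacy-billing": {"level": "baseline", "rationale": ""},
}


def classification_gaps(cluster_namespaces, classification):
    """List namespaces with no documented level, or with a non-default
    level that has no written rationale."""
    gaps = []
    for ns in sorted(cluster_namespaces):
        entry = classification.get(ns)
        if entry is None:
            gaps.append(f"{ns}: no documented level")
        elif entry["level"] != DEFAULT_LEVEL and not entry.get("rationale"):
            gaps.append(f"{ns}: {entry['level']} without rationale")
    return gaps


if __name__ == "__main__":
    namespaces = ["kube-system", "production", "legacy-billing", "scratch"]
    for gap in classification_gaps(namespaces, CLASSIFICATION):
        print(gap)
```

Run as a CI check, this turns "nobody can point to the decision" into a failing build whenever a namespace appears without an entry.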
Exception model
Exceptions are inevitable. Some workloads genuinely cannot comply with the target policy level — a third-party component that requires privilege, a legacy service that runs as root because rewriting it is a multi-quarter project.
The exception model needs to answer three questions before enforce goes live.
First: what is the process for requesting an exception? It should be lightweight enough that teams do not find it easier to work around enforcement than to request an exception. It should be heavyweight enough that exceptions require a documented rationale.
Second: who approves exceptions? This should be a named person, not a team. Exceptions that require team approval accumulate in inboxes. Exceptions that require a named person’s sign-off get processed.
Third: how are exceptions reviewed? An exception that was legitimate for a legacy service two years ago may not be legitimate today. If exceptions are never reviewed, they accumulate and the standard becomes meaningless.
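The three answers map naturally onto a small record: a rationale, a named approver, and a last-reviewed date. The sketch below assumes a 90-day review window as a stand-in for a quarterly review; the PolicyException type and the overdue helper are hypothetical.

```python
# Sketch: exception records with a review-age check.
# The 90-day window is an assumption standing in for "quarterly".
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PolicyException:
    workload: str
    rationale: str
    approver: str  # a named person, not a team alias
    last_reviewed: date


def overdue(exceptions, today, max_age_days=90):
    """Return exceptions whose last review is older than the window."""
    cutoff = today - timedelta(days=max_age_days)
    return [e for e in exceptions if e.last_reviewed < cutoff]


if __name__ == "__main__":
    records = [
        PolicyException("vendor-agent", "requires privilege", "J. Doe", date(2024, 1, 10)),
        PolicyException("legacy-batch", "runs as root", "J. Doe", date(2024, 5, 1)),
    ]
    for e in overdue(records, today=date(2024, 6, 1)):
        print(e.workload, "needs review")
```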
Developer communication text
The message developers need before enforcement goes live is short. Here is the structure that works:
Kubernetes security policy update — enforcement begins [date]
Starting [date], all deployments in [namespace] must meet the [baseline/restricted] Pod Security standard.
Your service [service name] has the following issues that need to be resolved before [date]:
- [specific violation 1]
- [specific violation 2]
How to fix these: [link to specific guidance, not general documentation]
If your service cannot meet these requirements by [date], reply to this message to request an exception. Exceptions require a documented rationale and will be reviewed quarterly.
Contact [named person] with questions.
The key elements: specific service names, specific violations, specific date, specific fix guidance, specific exception process, specific contact. Every generic element (“all teams”, “various security improvements”, “contact the platform team”) reduces the probability that the developer takes action.
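Because every element is specific to one service, the notice is easy to generate per workload from the violation inventory. The sketch below fills the structure above; render_notice and all argument values are hypothetical, and the guidance link is a placeholder.

```python
# Sketch: render the per-service notice from inventory data.
def render_notice(service, namespace, level, enforce_date,
                  violations, fix_url, contact):
    """Return the notice text for one service."""
    if not violations:
        # A notice with no specific violations is a generic notice.
        raise ValueError("refusing to send a notice with no violations")
    lines = [
        f"Kubernetes security policy update: enforcement begins {enforce_date}",
        "",
        f"Starting {enforce_date}, all deployments in {namespace} must meet "
        f"the {level} Pod Security standard.",
        "",
        f"Your service {service} has the following issues that need to be "
        f"resolved before {enforce_date}:",
    ]
    lines += [f"- {v}" for v in violations]
    lines += [
        "",
        f"How to fix these: {fix_url}",
        "",
        f"If your service cannot meet these requirements by {enforce_date}, "
        "reply to this message to request an exception. Exceptions require "
        "a documented rationale and will be reviewed quarterly.",
        "",
        f"Contact {contact} with questions.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_notice(
        service="api-server",
        namespace="production",
        level="restricted",
        enforce_date="April 22",
        violations=["runAsNonRoot must be true"],
        fix_url="https://example.com/placeholder-guidance",
        contact="[named person]",
    ))
```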
What a stalled rollout looks like
If your rollout has stalled, meaning namespaces have sat in warn mode for months with no progress toward enforce, or enforce is on but the exception list is longer than the compliant workload list, the stall is almost always one of three things.
The violation inventory is not clear. Teams do not know what they need to fix because nobody has told them specifically. The fix is workload-specific communication, not another general notice.
The remediation effort is larger than expected. Some workloads have deep compliance issues that require more than a YAML change. These need to be on a documented timeline, not in a general backlog.
There is no enforcement date. Without a specific date, warn mode runs indefinitely. Set a date, communicate it, and hold it. If the date slips, communicate the new date with a specific reason. Each slip makes the next date less credible.
The Pod Security Rollout Sprint gives you the namespace classification, the warn/audit/enforce plan, the exception model, and the developer communication text — structured for your specific workload layout. If enforcement keeps stalling, the sprint is designed to unblock it.