Policy Engine
Summary
This ADR defines the Management and Execution APIs and the workflow of the DCM Policy Engine.
Motivation
The Policy Engine is a specialized microservice within the Data Center Management (DCM) application. It governs service creation and modification (e.g., VirtualMachines, Containers). It enables Admins, Tenant-Admins, and Users to inject logic that validates request payloads (Approve/Reject), mutates them (Defaulting/Altering), and assigns Service Providers to them, using Open Policy Agent (OPA) and Rego.
Goal
Define the flow of how Policies are managed and used by the Policy Engine:
- Define Policy types - Global, Tenant, User
- Define Policy management
  - How policies should be added/updated
  - How policies will be stored
- Define Policy execution
  - Policy priority
  - Value immutability and constraints
- Determine the enforcement engine and policy language
- Define the input format
- Define the output format
Non-Goals
- Policy implementation
- Actionable OpenAPI specification
- While this ADR references Tenant-level Policy and ID, Tenants are not supported in V1
Core Concepts & Definitions
Policy Responsibilities
Every policy may return one or more of the following outputs:
- Reject: Requests are approved by default. Policies may decide whether the request should be Rejected.
- Mutation: Modifying the request payload (e.g., injecting default labels) by providing a patch map.
- Field Constraints: Defining the mutability of fields for subsequent policies in the chain.
- Service Provider Selection: Policies may set the selected Service Provider and/or constraints on which Service Providers are allowed.
Policy Scope & Hierarchy (Execution Order)
The execution order is strictly determined by Level first, then Priority.
- Global: (Super Admin) - Runs first.
- Tenant: (Tenant Admin) - Runs second.
- User: (End User) - Runs last.
Within each level, policies are sorted by priority: lower integers indicate higher priority.
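The ordering rule above can be sketched as a two-part sort key. This is a minimal illustration, not the engine's implementation; the `Policy` record, its field names, and the sample policy names are assumptions made for the example.

```python
from dataclasses import dataclass

# Level order taken from the list above: Global runs first, User runs last.
LEVEL_ORDER = {"global": 0, "tenant": 1, "user": 2}


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record; only the fields needed for ordering."""
    name: str
    level: str     # "global", "tenant" or "user"
    priority: int  # lower integer means higher priority


def execution_order(policies):
    """Sort by level first, then by ascending priority integer."""
    return sorted(policies, key=lambda p: (LEVEL_ORDER[p.level], p.priority))


policies = [
    Policy("user-defaults", "user", 1),
    Policy("global-quota", "global", 20),
    Policy("tenant-labels", "tenant", 5),
    Policy("global-billing", "global", 10),
]
ordered_names = [p.name for p in execution_order(policies)]
print(ordered_names)
```

Note that a User policy with priority 1 still runs after every Global and Tenant policy, because the level is compared before the priority.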
The “Rego Contract”
Input
The input payload includes:
spec - The current patched request payload
- Assumption - While policies do not have to be specific for Service Types they will need to know the expected content
constraints - The current constraints context (accumulated from prior policies)
provider - The currently selected service provider (empty string initially, populated as policies are evaluated)
service_provider_constraints - The current service provider constraints (accumulated from prior policies)
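As an illustration of the input shape, the following sketch builds the document passed to a policy. The helper name `build_policy_input` and the sample `spec` fields are hypothetical; only the four top-level keys come from the list above.

```python
import json


def build_policy_input(spec, constraints=None, provider="",
                       service_provider_constraints=None):
    """Assemble the four input keys; empty defaults model the first policy in the chain."""
    return {
        "spec": spec,
        "constraints": constraints or {},
        "provider": provider,
        "service_provider_constraints": service_provider_constraints or {},
    }


# Input seen by the first policy: nothing accumulated yet, no provider selected.
first_input = build_policy_input({"cpu": 2, "labels": {"env": "dev"}})
print(json.dumps(first_input, indent=2))
```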
Output
Following the policy responsibilities, the output consists of the following elements:
rejected (bool) - since requests are approved by default, policies may reject them.
rejection_reason (string, optional) - reason for rejection
selected_provider (string, optional) - the name of the service provider chosen to fulfill the request
service_provider_constraints (object, optional):
- allow_list - list of allowed service provider names
- patterns - list of regex patterns for matching allowed providers
patch (map, optional) - a map of values to set on the request payload, following the schema of the corresponding service type. Every key in the map is optional.
constraints (map, optional) - follows JSON Schema (draft 2020-12).
This standard supports:
- Immutable: const
- Numeric constraints: minimum, maximum, multipleOf
- String patterns: pattern, minLength, maxLength
- Enumerations: enum
- Array constraints: minItems, maxItems
- Conditional logic: if/then/else
For the complete validation vocabulary, see the JSON Schema Validation specification.
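The `service_provider_constraints` element could be checked as in the sketch below. The semantics are an assumption, since this ADR does not spell them out: a provider is taken to be allowed when it appears in `allow_list` or fully matches one of `patterns`, and an empty constraints object allows everything. The provider names are made up.

```python
import re


def provider_allowed(provider, sp_constraints):
    """Assumed semantics: allow_list membership OR a full regex match allows the provider."""
    allow_list = sp_constraints.get("allow_list", [])
    patterns = sp_constraints.get("patterns", [])
    if not allow_list and not patterns:
        return True  # no constraints accumulated yet
    if provider in allow_list:
        return True
    return any(re.fullmatch(pattern, provider) for pattern in patterns)


sp_constraints = {"allow_list": ["provider-a"], "patterns": [r"edge-\d+"]}
results = {
    name: provider_allowed(name, sp_constraints)
    for name in ["provider-a", "edge-12", "provider-b"]
}
print(results)
```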
Policy Code Ownership and Responsibilities
- DCM Admins, Tenant-Admins and Users implement the policies’ Rego code
- DCM Admins, Tenant-Admins and Users are responsible for correct registration of the policies
- DCM Admins, Tenant-Admins and Users are responsible for the accuracy and performance of the policies
- Registration of a Rego code snippet that fails compilation is rejected
System Architecture
The Policy API serves two distinct functions:
- Management Plane: CRUD operations for Policy definitions and synchronization with the Policy Engine.
- Execution Plane: Evaluation of service requests against active policies using a stored-policy model.
Policy Management
Policy Registration Flow
sequenceDiagram
participant User
participant PolicyEngine
participant Database
participant OPA
User->>PolicyEngine: POST /api/v1/policies
PolicyEngine->>Database: Check unique Name and Priority for policy type
alt Uniqueness check failed
PolicyEngine-->>User: Error response
else Uniqueness check passed
PolicyEngine->>PolicyEngine: Generate UUID
PolicyEngine->>PolicyEngine: Parse PackageName
PolicyEngine->>Database: Store policy metadata
Note right of Database: UUID, Name, PackageName,<br/>LabelSelector, Policy Type, Priority
PolicyEngine->>OPA: Push REGO code with UUID
alt REGO compilation failed
OPA-->>PolicyEngine: Compilation error
PolicyEngine->>Database: Rollback stored metadata
PolicyEngine-->>User: Error response
else REGO compilation succeeded
OPA-->>PolicyEngine: Success
PolicyEngine-->>User: Return UUID
end
end
Pseudo API
POST /api/v1/policies
Payload
- Name
  - Must be unique at its level. That is:
    - All global policies must have unique names
    - All tenant policies must have unique names within their tenant
    - All user policies must have unique names for their user
- Policy Matching Criteria. Treated with AND.
  - Label Selector
- Policy Type
  - Global, Tenant, User
- Priority
  - Must be unique at its level
  - A lower number means a higher priority and therefore will be evaluated first
- Rego Code
- Enabled
  - Optional. Default true
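To illustrate "Treated with AND" for the matching criteria, here is a minimal sketch of label-selector matching. It assumes the selector is a flat key/value map compared for equality against labels on the request, and that an empty selector matches every request; neither detail is fixed by this ADR.

```python
def selector_matches(label_selector, request_labels):
    """Every selector entry must be present with the same value (logical AND)."""
    return all(
        request_labels.get(key) == value
        for key, value in label_selector.items()
    )


request_labels = {"env": "prod", "team": "storage"}
match_one = selector_matches({"env": "prod"}, request_labels)
match_both = selector_matches({"env": "prod", "team": "storage"}, request_labels)
match_none = selector_matches({"env": "prod", "team": "network"}, request_labels)
print(match_one, match_both, match_none)
```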
Response Payload
- Generated UUID
Execution Logic & Flow
- Validate the Policy Name and Priority
  - If not unique, return an error
- Generate a UUID
- Get the policy package name from the Rego code
- Store the following information in the DB
  - UUID
  - Name
  - Package Name
  - Policy Type
  - Priority
  - Label Selector
- Push the Rego code to OPA
  - Use the UUID for naming to avoid collisions
  - If the push fails, roll back the DB record and return an error
- Return the UUID to the caller
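The registration steps can be sketched end to end with in-memory stand-ins. Everything named here is hypothetical: `FakeOPA` replaces the real OPA client, a dict replaces the database, the uniqueness check ignores tenant and user scoping, and the regex-based package extraction is a simplification of real Rego parsing.

```python
import re
import uuid

PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)", re.MULTILINE)


class CompileError(Exception):
    """Stands in for an OPA compilation failure."""


class FakeOPA:
    """Hypothetical OPA stand-in: rejects modules without a package line."""

    def __init__(self):
        self.modules = {}

    def push(self, module_id, rego_code):
        if not PACKAGE_RE.search(rego_code):
            raise CompileError("missing package declaration")
        self.modules[module_id] = rego_code


def register_policy(db, opa, name, policy_type, priority, rego_code,
                    label_selector=None):
    # Name and priority must be unique at the policy's level (scoping simplified).
    for row in db.values():
        if row["policy_type"] == policy_type and (
            row["name"] == name or row["priority"] == priority
        ):
            raise ValueError("name and priority must be unique at this level")

    policy_id = str(uuid.uuid4())
    match = PACKAGE_RE.search(rego_code)
    db[policy_id] = {
        "name": name,
        "package_name": match.group(1) if match else None,
        "policy_type": policy_type,
        "priority": priority,
        "label_selector": label_selector or {},
    }
    try:
        # The UUID is used as the module name to avoid collisions.
        opa.push(policy_id, rego_code)
    except CompileError:
        del db[policy_id]  # roll back the stored metadata
        raise
    return policy_id


db, opa = {}, FakeOPA()
rego = 'package policies.billing\n\nmain := {"rejected": false}\n'
pid = register_policy(db, opa, "billing", "global", 10, rego)
print(db[pid]["package_name"])

try:
    register_policy(db, opa, "broken", "global", 20, "not rego")
    rollback_ok = False
except CompileError:
    rollback_ok = len(db) == 1  # the failed policy left no metadata behind
print(rollback_ok)
```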
GET /api/v1/policies
Return the list of policies. Allow for filtering
GET /api/v1/policies/{policyId}
Return the specific policy
DELETE /api/v1/policies/{policyId}
Delete the specific policy
PUT /api/v1/policies/{policyId}
Update the specific policy. Policy name and type are immutable
Payload
- Policy Matching Criteria
- Priority
- REGO Code
- Enabled
Execution Plane
Sequence
sequenceDiagram
participant User
participant PlacementManager
participant PolicyEngine
participant Database
participant OPA
User->>PlacementManager: Create Service request
PlacementManager->>PolicyEngine: Validate Payload
PolicyEngine->>Database: Get matching policies by serviceType and labelSelector
Database-->>PolicyEngine: List of policies
loop For each policy
PolicyEngine->>OPA: Evaluate policy
OPA-->>PolicyEngine: Policy result
PolicyEngine->>PolicyEngine: Enforce constraints
PolicyEngine->>PolicyEngine: Mutate payload
alt Policy rejected or constraint violation
PolicyEngine-->>PlacementManager: Request rejected
PlacementManager-->>User: Request rejected
end
end
PolicyEngine-->>PlacementManager: Success with updated payload
PlacementManager-->>User: Service created
Pseudo API
POST /api/v1alpha1/policies:evaluateRequest
Payload
- Service Instance
  - spec - the service specification (flexible schema)
Execution Logic & Flow
The Engine acts as an orchestrator. It does not send Rego code during evaluation; it calls pre-loaded modules in OPA.
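A call to a pre-loaded module could look like the sketch below, which uses OPA's REST Data API: a POST to `/v1/data/<path>` with the document wrapped in an `input` key, answered with a `result` key. The base URL, the package name, and the helper names are assumptions; the dotted package name is mapped to a slash-separated path, and the `main` rule name follows this ADR.

```python
import json
import urllib.request


def evaluation_url(opa_base_url, package_name):
    """Map a dotted Rego package name to the Data API path of its main rule."""
    path = package_name.replace(".", "/")
    return f"{opa_base_url.rstrip('/')}/v1/data/{path}/main"


def evaluate(opa_base_url, package_name, policy_input, timeout=5.0):
    """POST the input to OPA and return the policy's result document."""
    body = json.dumps({"input": policy_input}).encode("utf-8")
    request = urllib.request.Request(
        evaluation_url(opa_base_url, package_name),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # OPA omits "result" when the rule is undefined; treat that as empty.
        return json.load(response).get("result", {})


# Only the URL is built here; calling evaluate() needs a running OPA server.
url = evaluation_url("http://localhost:8181", "policies.billing")
print(url)
```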
Pipeline Logic (The “Chain of Responsibility”)
The Policy API maintains a ConstraintContext map in memory for the duration of the request.
Fetch & Sort:
- Query DB for enabled policies matching the request payload based on the policy’s matching criteria.
- Sort by Level (Global -> Tenant -> User), then by Priority (ascending integer value, so the highest-priority policy runs first).
If no policies match the request payload, the request returns successfully.
Iterate for each policy P:
- Call OPA:
  - Invoke /v1/data/<P.PackageName>/main
  - Pass:
    - spec - the current patched request payload
    - provider - the currently selected service provider
    - constraints - the accumulated constraint context (if any)
    - service_provider_constraints - the accumulated SP constraints (if any)
- Check Reject
  - If Reject is true, ABORT IMMEDIATELY (Fail Fast). Return 406.
- Validate Constraints:
  - A lower-level policy cannot “unlock” a field locked by a higher-level policy.
  - If it does, ABORT with “Policy Conflict Error”
- Update ConstraintContext:
  - Merge new Constraints from Policy P into ConstraintContext.
- Validate Patch:
  - Validate Patch against ConstraintContext.
  - Example: If ConstraintContext.region is immutable and Policy P tries to patch the region, ABORT with “Policy Conflict Error”
- Apply Patch
  - Update service_payload with valid patches.
- Validate ServiceProvider
  - If Policy P returned a selected_provider and service_provider_constraints exist, validate the selected provider against the constraints.
Finalize: Return the final payload, selected provider, and status to Placement Manager.
- Status is APPROVED if the payload was not modified, MODIFIED if any patches were applied.
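The pipeline can be sketched as a single loop. This is a simplified model under stated assumptions: each policy is a plain Python callable standing in for one OPA call, only `const` locks are checked when detecting an "unlock", provider validation covers `allow_list` but not `patterns`, and the exception names are hypothetical.

```python
class PolicyRejected(Exception):
    """A policy returned rejected=true (the engine would answer 406)."""


class PolicyConflict(Exception):
    """A policy violated a constraint set earlier in the chain."""


def run_chain(spec, policies):
    """Run already-sorted policies; each callable stands in for one OPA call."""
    payload = dict(spec)
    constraint_context = {}
    provider = ""
    sp_constraints = {}
    modified = False

    for evaluate in policies:
        result = evaluate({
            "spec": payload,
            "constraints": constraint_context,
            "provider": provider,
            "service_provider_constraints": sp_constraints,
        })

        # Check Reject: fail fast.
        if result.get("rejected", False):
            raise PolicyRejected(result.get("rejection_reason", ""))

        # Validate Constraints, then merge them into the context.
        for field, rule in result.get("constraints", {}).items():
            locked = constraint_context.get(field, {})
            if "const" in locked and rule.get("const") != locked["const"]:
                raise PolicyConflict(f"cannot unlock {field}")
            constraint_context[field] = {**locked, **rule}

        # Validate Patch against the context, then apply it.
        for field, value in result.get("patch", {}).items():
            rule = constraint_context.get(field, {})
            if "const" in rule and value != rule["const"]:
                raise PolicyConflict(f"{field} is immutable")
            if field not in payload or payload[field] != value:
                modified = True
            payload[field] = value

        # Validate ServiceProvider (allow_list only in this sketch).
        sp_constraints = {**sp_constraints,
                          **result.get("service_provider_constraints", {})}
        if "selected_provider" in result:
            provider = result["selected_provider"]
            allow_list = sp_constraints.get("allow_list")
            if allow_list is not None and provider not in allow_list:
                raise PolicyConflict(f"provider {provider} is not allowed")

    status = "MODIFIED" if modified else "APPROVED"
    return {"status": status, "payload": payload, "provider": provider}


def global_billing(policy_input):
    return {
        "rejected": False,
        "patch": {"billing_tag": "engineering"},
        "constraints": {"billing_tag": {"const": "engineering"}},
        "selected_provider": "provider-a",
    }


def user_override(policy_input):
    return {"rejected": False, "patch": {"billing_tag": "marketing"}}


outcome = run_chain({"cpu": 2}, [global_billing])
print(outcome["status"], outcome["payload"])

try:
    run_chain({"cpu": 2}, [global_billing, user_override])
    conflict = None
except PolicyConflict as err:
    conflict = str(err)
print(conflict)
```

Keeping the context, payload, and provider as local state for one request mirrors the in-memory ConstraintContext described above; nothing is persisted between requests.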
Constraint Validation Example
- Step 1 (Global Policy):
  - Patch: {"billing_tag": "engineering"}
  - Constraint: {"billing_tag": {"const": "engineering"}}
  - Result: Payload has billing_tag. Context marks billing_tag as immutable (const).
- Step 2 (User Policy):
  - Patch: {"billing_tag": "marketing"}
  - Action: Engine checks Context. billing_tag is immutable.
  - Result: Error. The User policy violates the Global constraint.
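The two steps can be reproduced with a small patch checker. This sketch expresses immutability with the JSON Schema `const` keyword listed in the output section and handles only a handful of keywords (`const`, `enum`, `minimum`, `maximum`); it is not a full JSON Schema validator, and the function name and message wording are made up.

```python
def violations(patch, constraint_context):
    """Return a message for every patched field that breaks its accumulated constraint."""
    problems = []
    for field, value in patch.items():
        rule = constraint_context.get(field, {})
        if "const" in rule and value != rule["const"]:
            problems.append(f"{field}: must equal {rule['const']!r}")
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"{field}: must be one of {rule['enum']!r}")
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if "minimum" in rule and value < rule["minimum"]:
                problems.append(f"{field}: below minimum {rule['minimum']}")
            if "maximum" in rule and value > rule["maximum"]:
                problems.append(f"{field}: above maximum {rule['maximum']}")
    return problems


# Context after Step 1: the Global policy locked billing_tag.
context = {"billing_tag": {"const": "engineering"}}

step1 = violations({"billing_tag": "engineering"}, context)
step2 = violations({"billing_tag": "marketing"}, context)
print(step1)
print(step2)
```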