Infrastructure

Laying the Conduit

The previous post walked through IEC 62443 zones and conduits and the IEC 62351 family — the cryptographic mechanisms that turn a conduit into something defensible on the wire. DNP3 SA for the control path. GOOSE authentication for the process bus. TLS under MMS. A key-management substrate that has to outlive the engineers who designed it.

What none of those standards tell you is where the wire runs.

62351 specifies the lock. 62443 specifies which door needs one. Neither standard tells you how to build the wall the door sits in. In a virtualised control centre — the kind I described in the previous post, where RTU, EMS, ADMS, SCADA, and historian VMs share a vSphere cluster — the platform that builds the wall and hangs the door is VMware Cloud Foundation.

I’ve spent over twenty years working with VMware infrastructure. The mapping I’m about to walk through is where that background meets the substation domain — and it’s the reason I started writing this series in the first place.

VMware Cloud Foundation: the platform

VMware Cloud Foundation (VCF) is the integrated software platform that delivers compute (vSphere), storage (vSAN), networking (NSX), and lifecycle management (SDDC Manager) as a single operated stack. It ships as two products. VCF runs in the control-centre data hall — multi-host clusters serving hundreds of substations. VCF Edge runs inside the substation itself — a single host for smaller substations, or a multi-host cluster for medium and large sites, on substation-grade hardware. VCF Edge is lifecycle-managed from VCF in the data centre but operates locally when the WAN is unavailable. Both products share the same software stack, the same lifecycle tooling, and the same security model — a common operating model from data hall to substation outhouse. The conduits become firewall rules enforced by vDefend (a separately licensed add-on, covered below); the lifecycle requirements — patch, rotate, attest, audit — become SDDC Manager operations, wherever the cluster sits.

VCF is the base platform. Four separately licensed add-ons extend it into areas the base doesn’t cover — vDefend for firewalling, microsegmentation, and threat detection, Avi for load balancing, TLS offload, and application security, Live Site Recovery for disaster recovery, and Advanced Cyber Compliance for cyber recovery and continuous compliance — each covered in its own section below. What follows first is the base platform and what it delivers out of the box.

NSX: the network fabric

NSX — the networking component of VCF — provides the overlay network, the routing topology, and the source-identity assurance that turn 62443’s logical zones into network-fabric primitives:

Overlay isolation. Every zone in the 62443 diagram becomes a GENEVE-backed overlay segment. Z-RTU traffic never shares a physical wire with Z-OPS traffic, even though both sets of VMs might be running on the same ESXi host. The GENEVE tunnel header is the zone boundary — invisible to the workload, enforced by the hypervisor. An attacker who compromises a VM in Z-OPS and tries to ARP-spoof across to Z-RTU finds there is no shared broadcast domain to spoof into. The overlay is the air gap that the original Purdue diagram assumed the physical cabling would provide. GENEVE itself does not encrypt the tunnel payload — the overlay provides isolation, not confidentiality.
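To make the zone-boundary claim concrete, here is a minimal sketch of the 8-byte GENEVE base header defined in RFC 8926, with the 24-bit VNI carrying the segment identity. The VNI value is arbitrary and illustrative, not a real NSX segment ID:

```python
import struct

GENEVE_PROTO_ETH = 0x6558  # Transparent Ethernet Bridging (Ethernet payload)

def geneve_header(vni: int, opt_len_words: int = 0) -> bytes:
    """Build a minimal 8-byte GENEVE base header (RFC 8926).
    The 24-bit VNI is the overlay segment identifier -- in the zone
    model, effectively the zone boundary on the wire."""
    assert 0 <= vni < 2**24
    ver_optlen = (0 << 6) | opt_len_words  # version 0, option length in 4-byte words
    flags = 0                              # O and C bits clear
    return struct.pack("!BBH", ver_optlen, flags, GENEVE_PROTO_ETH) \
        + (vni << 8).to_bytes(4, "big")    # VNI in the top 24 bits, 8 reserved bits

def geneve_vni(header: bytes) -> int:
    """Extract the VNI from a GENEVE base header."""
    return int.from_bytes(header[4:8], "big") >> 8

hdr = geneve_header(vni=5001)   # e.g. the Z-RTU segment
assert geneve_vni(hdr) == 5001
```

The point of the sketch is the VNI: two workloads whose frames carry different VNIs share no broadcast domain, regardless of which physical host they run on.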

IPsec VPN for WAN confidentiality. For the site-to-site conduit between substations and the control centre, NSX provides IPsec VPN on Tier-0 and Tier-1 edge gateways — route-based or policy-based tunnels that encrypt DNP3 and IEC 60870-5-104 traffic as it crosses the operational telecoms WAN. This doesn’t satisfy 62351-5 (which requires message-level authentication, not just transport encryption), but it closes the FR4 confidentiality gap at the network layer for legacy devices that can’t terminate TLS themselves.

VPC-style routing domains. NSX Tier-1 gateways give each zone its own routing table. Z-RTU has a Tier-1; Z-OPS has a different Tier-1; the two connect only through an explicit conduit with a firewall rule between them. This is the same pattern public-cloud architects use with AWS VPCs or Azure VNets — but running on-premises, inside the control-centre data hall, on infrastructure the utility owns. The routing domain boundary is the conduit. No route, no reachability, no lateral movement.
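The no-route-no-reachability property can be modelled in a few lines. The Tier-1 names and prefixes below are invented for illustration, not a real NSX configuration:

```python
import ipaddress

# Hypothetical per-zone Tier-1 routing tables.
tier1_routes = {
    "T1-RTU": {"10.10.0.0/16"},                  # Z-RTU's own segments only
    "T1-OPS": {"10.20.0.0/16", "10.10.5.0/24"},  # Z-OPS plus one advertised conduit prefix
}

def reachable(tier1: str, dst_ip: str) -> bool:
    """No route in the zone's Tier-1 table means no reachability --
    the routing domain boundary is the conduit."""
    dst = ipaddress.ip_address(dst_ip)
    return any(dst in ipaddress.ip_network(p) for p in tier1_routes[tier1])

assert reachable("T1-OPS", "10.10.5.7")       # explicit conduit prefix: reachable
assert not reachable("T1-RTU", "10.20.1.1")   # no route into Z-OPS: no lateral movement
```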

SpoofGuard and uRPF for source-identity assurance. A firewall rule is only as good as the assurance that the source IP is genuine. NSX SpoofGuard enforces approved IP and MAC bindings per vNIC — a compromised VM attempting to spoof another RTU’s address gets dropped before the packet reaches the overlay. Unicast Reverse Path Forwarding (uRPF) on Tier-1 gateways does the same at the routing boundary, validating that every packet entering a conduit has a source IP with a valid return route. Together they close the IP-spoofing vector at both ends — the vNIC and the gateway — which is FR5 (Restricted Data Flow): the conduit doesn’t just filter by source IP, it proves the source IP is genuine.
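The two checks are easy to state as pure functions. The bindings and routes below are invented for illustration:

```python
import ipaddress

# Hypothetical approved (IP, MAC) bindings per vNIC -- SpoofGuard's table.
bindings = {"rtu-vm-01.vnic0": ("10.10.1.5", "00:50:56:aa:bb:01")}

# Hypothetical routes known at the Tier-1 gateway -- uRPF's view.
gateway_routes = ["10.10.0.0/16", "172.16.40.0/24"]

def spoofguard_permits(vnic: str, src_ip: str, src_mac: str) -> bool:
    """Drop the frame at the vNIC unless (IP, MAC) matches the approved binding."""
    return bindings.get(vnic) == (src_ip, src_mac)

def urpf_permits(src_ip: str) -> bool:
    """Drop the packet at the gateway unless the source has a valid return route."""
    src = ipaddress.ip_address(src_ip)
    return any(src in ipaddress.ip_network(r) for r in gateway_routes)

# The genuine RTU passes both checks ...
assert spoofguard_permits("rtu-vm-01.vnic0", "10.10.1.5", "00:50:56:aa:bb:01")
assert urpf_permits("10.10.1.5")
# ... a VM spoofing another RTU's address is dropped at the vNIC ...
assert not spoofguard_permits("rtu-vm-01.vnic0", "10.10.1.99", "00:50:56:aa:bb:01")
# ... and a source with no return route is dropped at the gateway.
assert not urpf_permits("192.0.2.10")
```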

Unified networking across VMs and containers. A control centre increasingly runs both — RTU and EMS workloads as VMs, analytics and observability tooling as containers on VKS clusters. NSX treats both as first-class citizens on the same fabric. VKS uses Antrea as its container networking interface (CNI), and Antrea integrates with NSX so that Kubernetes pods connect to the same overlay segments, the same Tier-1 gateways, and the same routing domains as VMs. The vDefend distributed firewall (covered below) applies identically to a pod’s vNIC and a VM’s vNIC — the same microsegmentation policy, the same flow logging, the same IDS/IPS inspection. Avi (also covered below) provides Kubernetes-native ingress alongside its VM load balancing, so a containerised historian query service and a VM-based ADMS front-end both sit behind VIPs managed by the same controller, with the same TLS offload and WAF policies. The security model doesn’t fork when the workload type changes. One overlay, one firewall policy plane, one ingress layer — whether the workload is a VM, a pod, or both.

The pattern is that NSX turns 62443’s logical zones into network-fabric primitives. The zone boundary is an overlay segment. The routing domain is an isolated Tier-1 gateway. Source identity is assured by SpoofGuard and uRPF. IPsec VPN on the edge gateways closes the confidentiality gap for legacy protocols crossing the WAN. And the fabric applies uniformly to VMs and containers alike.

None of this replaces the protocol-level mechanisms — 62351-5 SA, 62351-6 GOOSE authentication, 62351-4 E2E signing. Those remain the target. But NSX provides the network substrate that makes the journey from “plain-text everywhere” to “authenticated everywhere” survivable across a multi-year gateway refresh. The enforcement and detection layers — firewalling, microsegmentation, IDS/IPS — come from vDefend, covered in its own section below. First: the compute layer that sits underneath the network fabric.

vSphere: the compute and OT networking layer

NSX provides the overlay fabric for the control-centre zones. But the substation edge — the VCF Edge cluster sitting inside the substation building — has a different networking requirement. The process bus carries Sampled Values and trip GOOSE as layer-2 multicast under PRP redundancy. These frames cannot traverse an overlay. They need direct, deterministic access to the physical NICs connected to the process-bus switches. That’s where the vSphere Standard Switch (vSS) and ESXi’s real-time compute capabilities come in.

The vPAC Alliance’s work on virtualised protection — and VMware’s own latency-tuning guidance — defines how ESXi hosts are configured for real-time protection, automation, and control workloads:

vSphere Standard Switch for process-bus connectivity. The vSS provides direct, low-overhead bridging between the VM’s vNIC and the physical NIC connected to the substation process-bus LAN. Unlike the NSX-managed distributed switch (which adds overlay encapsulation and distributed firewall processing), the vSS passes frames with minimal added latency — critical when the traffic is PRP-duplicated Sampled Values at 4 kHz and trip GOOSE with a Type 1A deadline of three milliseconds. The two PRP networks each get their own vSS uplink, and the VM sees both redundant streams exactly as a physical IED would.

For the most latency-critical P&C workloads — virtualised protection relays making trip decisions — even the vSS path adds measurable jitter. Single Root I/O Virtualisation (SR-IOV) bypasses the virtual switch entirely: the physical NIC (the validated design specifies the Intel E810) presents a Virtual Function directly to the VM, and traffic goes straight to hardware without traversing the hypervisor kernel. The trade-off is that SR-IOV bypasses the vSS and its port-level controls, so it’s used selectively — only on the process-bus-facing NICs where microsecond determinism matters. The VM is pinned to the same NUMA node as the E810 to avoid cross-socket memory latency, and SplitRx mode distributes incoming multicast across multiple receive queues so that simultaneous merging-unit streams don’t bottleneck on a single queue.

The vSS handles the OT-facing process bus; NSX handles the IT-facing control-centre overlay. Two switching domains, two trust models, on the same ESXi host.
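The arithmetic behind those numbers is worth making explicit. This sketch uses only the figures quoted above (4 kHz Sampled Values, a 3 ms Type 1A budget):

```python
SV_RATE_HZ = 4_000      # Sampled Values rate from the text
TRIP_BUDGET_MS = 3.0    # IEC 61850 Type 1A transfer-time budget

sv_interval_us = 1_000_000 / SV_RATE_HZ              # gap between successive SV frames
frames_in_budget = round(TRIP_BUDGET_MS / 1_000 * SV_RATE_HZ)

print(f"SV inter-frame interval: {sv_interval_us:.0f} us")      # 250 us
print(f"SV frames inside one trip budget: {frames_in_budget}")  # 12
# A single millisecond of hypervisor jitter consumes a third of the whole budget.
```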

Latency Sensitivity = High. This ESXi VM setting tells the hypervisor to get out of the way. Each vCPU gets sole ownership of a physical core — the VMkernel will not co-schedule anything else on it. The VM’s entire memory allocation is locked into physical RAM at power-on, with no ballooning, swapping, or transparent page sharing. And the VMkernel automatically disables VMXNET3 interrupt coalescing and Large Receive Offload on the VM’s virtual NICs, so interrupts fire immediately rather than being batched — critical when the interrupt is a Sampled Values frame arriving at 4 kHz. The guest OS inside a vPAC VM is a real-time operating system (RTOS), not general-purpose Linux; the RTOS kernel is designed for deterministic interrupt-to-decision latency measured in microseconds, and Latency Sensitivity = High ensures the hypervisor doesn’t add to it. VMware’s own latency-tuning guidance covers the full set of advanced parameters and BIOS settings that complete the configuration.

Overlay avoidance on the latency-sensitive path. VMware’s latency-tuning guidance is explicit: on hosts running latency-sensitive workloads, use as few vSphere overlays as possible — including NSX and vSAN. Overlay encapsulation, distributed firewall inspection, and vSAN network traffic all add jitter that a three-millisecond trip budget cannot absorb. The process-bus-facing NICs use vSS or SR-IOV; the control-centre-facing interfaces on the same host still get the full NSX overlay and vDefend stack. The guidance applies to the latency-sensitive traffic path, not to the entire host.

vMotion constraints. vMotion stun time for a latency-sensitive VM is approximately one second — far too long for a P&C workload with a three-millisecond deadline. SR-IOV further constrains mobility: a VM with a passthrough NIC cannot vMotion at all without first detaching the Virtual Function. Maintenance windows for hosts running protection VMs have to be scheduled with the same operational awareness as physical relay maintenance. These are deliberate trade-offs — determinism in exchange for flexibility — and they are the reason the vPAC architecture keeps protection VMs on dedicated substation-grade hosts (meeting IEC 61850-3 and IEEE 1613 environmental requirements) rather than sharing them with general workloads that expect seamless mobility.

Lifecycle automation

The 62443 and 62351 requirements don’t stop at the wire — they assume the infrastructure underneath the wire is itself maintained, patched, and credential-rotated for the lifetime of the asset. That’s the job VCF’s lifecycle layer was designed for, and in a control-centre context the mapping to 62443 Foundational Requirements is surprisingly direct.

Credentials and identity

Automated certificate rotation. SDDC Manager rotates infrastructure certificates — ESXi hosts, vCenter, NSX managers — on a schedule without manual intervention, using either the built-in VMware Certificate Authority (VMCA) or an external enterprise CA integrated via the VMCA subordinate or custom certificate modes. Utilities with an existing PKI hierarchy can chain VCF’s infrastructure certificates to their own root of trust rather than standing up a separate one. That’s the infrastructure half of the 62351-9 key-management problem solved at the platform layer. The IED and application certificates are still the hard part (the truck-roll-to-every-substation problem), but the infrastructure substrate certificates — the ones that underpin the overlay tunnels, the distributed firewall, the management plane — don’t become a forgotten expiry waiting to take down a cluster at 2 a.m. on a Sunday.
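The check that scheduled rotation automates is simple to state. A sketch, with hypothetical component names and dates:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical inventory: component -> certificate notAfter date.
cert_expiry = {
    "esxi-host-01": datetime(2026, 3, 1, tzinfo=timezone.utc),
    "nsx-manager": datetime(2025, 11, 20, tzinfo=timezone.utc),
}

def due_for_rotation(now: datetime, lead_time: timedelta = timedelta(days=30)) -> list[str]:
    """Return components whose certificates expire within the lead time --
    the manual sweep that SDDC Manager's scheduled rotation replaces."""
    return sorted(c for c, exp in cert_expiry.items() if exp - now <= lead_time)

now = datetime(2025, 11, 1, tzinfo=timezone.utc)
assert due_for_rotation(now) == ["nsx-manager"]   # 19 days out: due for rotation
```

The value of the platform doing this is not that the check is hard; it is that nobody has to remember to run it.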

Password rotation. SDDC Manager rotates credentials for every infrastructure component — ESXi root, vCenter admin, NSX manager, SDDC Manager itself — and stores them in an auditable credential vault. Every privileged credential rotates on a policy-driven schedule, the rotation is logged, and the old credential is invalidated. No shared passwords on sticky notes. No “the ESXi root password hasn’t changed since commissioning”. Together with certificate rotation, this closes FR1 (Identification & Authentication Control) and FR2 (Use Control) for the infrastructure layer — the credential vault is what an auditor wants to see when they ask how those requirements are met for the hypervisor estate.

Identity and access management. Certificate rotation and password rotation secure infrastructure accounts. But the harder FR1 question is how human operators authenticate to the platform — and how their access is scoped so that an OT engineer can manage protection-relay VMs without being able to reconfigure the NSX fabric.

VCF’s Identity Broker is the authentication gateway for the entire stack: vCenter, NSX Manager, SDDC Manager, and VCF Automation all authenticate through it. The Identity Broker federates to an external identity provider — Active Directory, LDAP, or OIDC-compliant IdPs — so operators authenticate with their corporate credentials rather than local accounts. Multi-factor authentication layers on top via the IdP, not as a VCF bolt-on, which means the MFA policy the utility already enforces for its corporate systems extends to the infrastructure platform without a separate enrolment.

Once authenticated, vSphere’s role-based access model controls what each operator can see and do: granular roles scoped to clusters, resource pools, folders, or individual VMs. An OT team lead might have full control over the RTU resource pool but read-only visibility into the NSX transport zone. A security analyst might see distributed firewall flow logs but have no permission to modify rules. The access model is auditable — every login, role assignment, and privilege escalation is logged to the centralised SIEM — and it maps directly to FR2 (Use Control) and 62351-8 (role-based access control for power system operations): least-privilege access enforced by the platform, not by convention.

Operational automation

A control centre has a long tail of routine operational tasks — adding a VM to an NSX security group when a new RTU is commissioned, applying a consistent configuration baseline to a newly deployed historian, resetting a service account after a security incident, reclaiming storage from decommissioned workloads. Done manually, these are the changes that get applied inconsistently, documented incompletely, and audited after the fact.

VCF Automation is the consumption and governance layer: operators request infrastructure through a self-service catalogue with RBAC-enforced entitlements, approval workflows, and policy constraints — an engineer in the OT team can commission a new RTU VM without having vCenter admin privileges, and the request is logged, approved, and executed within guardrails the platform team defines. VCF Orchestrator is the workflow engine underneath: it executes multi-step runbooks that span vSphere, NSX, vSAN, and external systems as repeatable, version-controlled workflows rather than ad-hoc CLI sessions. A security-group change, a config remediation, a post-incident credential reset — each is a workflow that runs the same way every time, logs every action, and can be triggered by Automation’s catalogue or by an event-driven policy. In 62443 terms, Automation enforces FR2 (Use Control) at the operational layer — least-privilege access to infrastructure actions — while Orchestrator extends FR6 by turning every operational change into an auditable, reproducible event rather than an unrecorded SSH session.

Patching and integrity

vSphere Lifecycle Manager (vLCM) image-based patching. FR3 (System Integrity) asks a simple question: is this host what it’s supposed to be? vLCM defines a desired image — ESXi version, driver set, firmware baseline — and ensures every host in the cluster matches it. Hosts that drift get flagged and remediated. The integrity chain starts below the hypervisor — UEFI Secure Boot validates the ESXi bootloader against a trusted signing chain, and TPM attestation measures the boot sequence into a hardware root of trust. Combined with vTPM-attested boot on the RTU VMs, the chain runs from host firmware through hypervisor image to workload boot measurement. That’s a system-integrity story an assessor can follow end-to-end.
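The extend operation that makes the measurement chain order-sensitive is easy to sketch. Stage names here are illustrative, not the actual PCR layout ESXi uses:

```python
import hashlib

def extend(pcr: bytes, measurement: bytes) -> bytes:
    """TPM-style PCR extend: each new value binds the whole boot sequence so far."""
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()

# Measure each boot stage in order -- firmware, bootloader, hypervisor image, VM boot.
pcr = bytes(32)
for stage in [b"uefi-firmware", b"esxi-bootloader", b"esxi-image-8.0", b"rtu-vm-boot"]:
    pcr = extend(pcr, stage)
golden = pcr   # the attested "known good" value

# A drifted hypervisor image changes the final PCR -- the drift is detectable.
pcr = bytes(32)
for stage in [b"uefi-firmware", b"esxi-bootloader", b"esxi-image-tampered", b"rtu-vm-boot"]:
    pcr = extend(pcr, stage)
assert pcr != golden
```

This is why drift "gets flagged": a host whose measured chain no longer matches the attested value cannot claim to be the image vLCM declared.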

Live patching. In a control centre running virtualised RTUs for hundreds of substations, taking a host offline for kernel patching risks exactly the concentration scenario warned about earlier. vSphere live patching applies security fixes to the running hypervisor kernel without rebooting the host — no VM evacuation, no maintenance window, no gap in SCADA telemetry. The security patch that closes a hypervisor vulnerability gets applied the week it’s published rather than waiting for the next quarterly maintenance window — the difference between FR7 (Resource Availability) as a design goal and FR7 as an operational reality.

Logging, monitoring, and observability

Centralised logging. VCF ships every infrastructure event — ESXi host syslog, vCenter task events, NSX distributed firewall flow logs, SDDC Manager audit trail — to a centralised log store (VCF Operations for Logs, or forwarded to a SIEM via syslog/webhook). The distributed firewall flow logs are the same ones powering the five-artefact conduit from earlier: every allowed and denied flow on every vNIC, with source, destination, port, byte count, and timestamp. The Dragos sensor provides OT-protocol-aware detection on top.

Infrastructure monitoring. VCF Operations correlates metrics across compute, storage, and networking — CPU contention on an ESXi host, vSAN latency spikes, NSX control-plane health — and surfaces capacity trends and anomalies before they become outages. For a control centre running hundreds of RTU VMs, Operations is what tells you a cluster is approaching memory exhaustion or that a host’s network throughput has degraded, before the SCADA polling cycle starts dropping telemetry.

Network observability. VCF Operations for Networks extends that visibility into the NSX fabric itself — topology maps that visualise traffic flowing through overlay segments, Tier-1 gateways, and distributed firewall rules. An operator can see which VMs are communicating across a conduit, what the traffic volume looks like, whether flows are being permitted or denied, and where latency is accumulating in the network path. For troubleshooting, it correlates NSX configuration changes with traffic-pattern shifts — if a firewall rule change on C-WAN-RTU coincides with a drop in DNP3 poll responses, the correlation is visible in one view rather than requiring manual log analysis across three systems. In a 62443 context this is the operational side of conduit governance: not just enforcing the conduit policy, but continuously observing whether the conduit is behaving as designed.

Together, logging, monitoring, and network observability make FR6 (Timely Response to Events) and 62351-14 (cyber security event logging) operational: the infrastructure generates the telemetry the SOC consumes, centralised, correlated, and retained. The alternative — per-host log scraping with no central correlation — is how alerts get missed and incidents get reconstructed from memory rather than evidence.

Data confidentiality

Data confidentiality (FR4) has three surfaces: transit, rest, and use. Transit encryption is handled by IPsec VPN on the edge gateways, covered above. vSAN encryption and VM encryption close the at-rest gap — every block written to shared storage is AES-256 encrypted, with keys managed by the vCenter Native Key Provider or an external KMIP server, so a stolen disk yields nothing readable. Confidential computing closes the in-use gap: VCF supports Intel TDX and AMD SEV-SNP hardware attestation and memory encryption, meaning the guest VM’s memory is encrypted inside the CPU, invisible even to a compromised hypervisor. For an RTU VM processing SCADA telemetry from hundreds of substations, an attacker who gains hypervisor-level access — the concentration-risk nightmare — can read neither the RTU’s data at rest on disk nor its memory in RAM. Together these mechanisms close FR4 at every surface the data touches.

VCF at the edge

Everything described above — certificate rotation, credential management, Automation and Orchestrator workflows, image-based patching, live patching — applies equally to VCF Edge clusters at the substation. A smaller substation runs VCF Edge on a single substation-grade host; a medium or large site runs a multi-host cluster for local HA. Either way, the edge cluster hosts the virtualised RTU, an OT-IDS sensor, and a local historian on hardware that meets IEC 61850-3 and IEEE 1613 environmental requirements (fanless, -40 °C to +85 °C, surge withstand).

The key difference from the control-centre cluster is autonomy. If the operational telecoms WAN link drops, the edge cluster continues to operate locally — the protection and SCADA functions inside the substation don’t depend on connectivity to the data centre. When the link returns, SDDC Manager resumes lifecycle management, the pull-based deployment pipeline reconciles any drift, and log forwarding catches up. The software lifecycle is managed centrally; the truck roll becomes a hardware replacement, not a patching exercise.

Pull-based workload deployment

Everything above describes how the infrastructure is maintained. The question that follows is: how do the workloads — the virtualised RTUs, the OT-IDS sensors, the historians, the containerised analytics — actually get deployed and updated across a control centre or a fleet of edge sites? The traditional answer is push-based: an operator or pipeline SSHes into the target environment, authenticates with a privileged credential, and pushes the new image or configuration. In an OT context, that model is a problem. Push requires inbound access and stored credentials on the deployment server — exactly the kind of lateral-movement surface the 62443 zone design is trying to eliminate.

The alternative is a pull-based (GitOps) architecture. The desired state of every workload — VM specifications, container images, configuration, network policy — lives in a version-controlled Git repository. Inside the target environment, an agent watches the repository and reconciles the running state with the declared state. No inbound SSH. No credentials stored outside the environment. The deployment surface is a Git commit, and every change has an author, a timestamp, a review trail, and a diff.

VMware vSphere Kubernetes Service (VKS) provides the substrate. VKS runs Kubernetes clusters natively on VCF — the same ESXi hosts, the same NSX overlay, the same vSAN storage — with VMs and containers managed as Kubernetes objects. VM Service lets you declare a virtual machine as a YAML manifest the same way you’d declare a pod: image, CPU, memory, network, storage class. The RTU VM that was previously provisioned through vCenter’s UI becomes a versioned manifest in a Git repository, deployed by the same pipeline as the containerised analytics workloads running alongside it.
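An illustrative manifest in the shape of the vm-operator `VirtualMachine` resource gives a sense of what "a VM as YAML" looks like. The class, image, and network names are placeholders, and the exact schema depends on the VKS version in use:

```yaml
apiVersion: vmoperator.vmware.com/v1alpha1
kind: VirtualMachine
metadata:
  name: rtu-vm-01
  namespace: z-rtu
spec:
  className: best-effort-small     # VM class: CPU/memory shape
  imageName: rtu-image-v2.4        # image approved through the controlled intake
  powerState: poweredOn
  storageClass: vsan-default
  networkInterfaces:
    - networkType: nsx-t           # attaches to the NSX overlay segment
```

Committed to Git, this manifest is the RTU VM: the image version, placement class, and network attachment are all reviewable in a diff.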

ArgoCD is the reconciliation engine. It watches the Git repository, compares the declared state against the running state in the VKS cluster, and converges any drift. A new RTU image is a Git commit; ArgoCD detects the change and rolls the update across the fleet. A misconfigured network policy is a drift event; ArgoCD flags it and optionally auto-remediates. The operator never pushes into the cluster. The cluster pulls from the repository.
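The reconciliation logic itself is small. A toy version in Python, with invented workload names, shows the shape of what ArgoCD does continuously:

```python
# Declared state (from Git) is authority; the cluster converges toward it.
declared = {"rtu-vm-01": "rtu-image-v2.4", "ot-ids-sensor": "ids-image-v1.9"}
running = {"rtu-vm-01": "rtu-image-v2.3", "stray-debug-pod": "busybox"}

def reconcile(declared: dict, running: dict) -> list[str]:
    """Return the actions needed to converge running state onto declared state."""
    actions = []
    for name, image in declared.items():
        if name not in running:
            actions.append(f"create {name} ({image})")
        elif running[name] != image:
            actions.append(f"update {name} -> {image}")
    for name in running:
        if name not in declared:
            actions.append(f"delete {name} (drift)")  # undeclared workload is drift
    return sorted(actions)

assert reconcile(declared, running) == [
    "create ot-ids-sensor (ids-image-v1.9)",
    "delete stray-debug-pod (drift)",
    "update rtu-vm-01 -> rtu-image-v2.4",
]
```

Note the third action: a workload nobody declared is itself a finding, which is how drift detection falls out of the same loop as deployment.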

Harbor is the trusted image registry that completes the supply chain. In an OT environment, images cannot come from a public registry over the internet. Harbor runs inside the control-centre cluster as a private registry the operator controls end to end. Images are pulled from upstream sources during a controlled intake, vulnerability-scanned (Trivy), and signed (Cosign) before they become available to the deployment pipeline. Replication policies propagate approved images to Harbor instances at edge sites over the operational telecoms circuit — the same outbound-only pull pattern as ArgoCD, extended to the image layer.
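The intake gate reduces to a small state machine: an image becomes deployable only after it scans clean and carries a signature. A toy model, with invented names:

```python
def intake(image: str, scan_findings: int, signed: bool) -> str:
    """Decide an image's fate at the controlled intake boundary."""
    if scan_findings > 0:
        return "quarantined"        # vulnerability scan found issues: blocked
    if not signed:
        return "unsigned"           # no signature yet: not deployable
    return "replicate-to-edge"      # approved: propagate to edge Harbor instances

assert intake("rtu-image-v2.4", scan_findings=0, signed=True) == "replicate-to-edge"
assert intake("rtu-image-v2.5", scan_findings=3, signed=True) == "quarantined"
assert intake("rtu-image-v2.6", scan_findings=0, signed=False) == "unsigned"
```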

The 62443 mapping is direct. No inbound credentials means FR1 is met by architecture rather than policy. Every deployment is a Git commit with role-based approval, so the change-control process is the Git workflow — FR2 without a parallel ticket system. ArgoCD’s continuous reconciliation detects configuration drift at the workload layer the same way Salt detects it at the host layer, and Harbor’s vulnerability scanning and content trust extend FR3 to the supply chain: the image is scanned, signed, and verified before it runs. The Git log and Harbor’s scan results together give FR6 an audit trail that covers every change to every workload.

For a fleet of VCF Edge sites — each running a small VKS cluster inside a substation — the pull model turns workload deployment from a per-site truck roll into a Git commit that propagates to every site on its next reconciliation cycle. The substation pulls its own updates over the operational telecoms circuit. The control centre never needs inbound access to the substation cluster. The deployment conduit is outbound-only — which is exactly the directional enforcement pattern NSX already provides at the network layer, now extended to the workload lifecycle.

The cumulative effect is that VCF turns the 62443 lifecycle requirements — patch, rotate, attest, audit, deploy — into platform operations rather than project-by-project engineering efforts. The 62351-9 year-ten problem (“when the CA cert rotates, every device needs to trust the new chain”) is still genuinely hard for the IED estate. But for the infrastructure estate — the hypervisors, the network fabric, the management plane, the workload deployment pipeline — VCF means the year-ten problem is already solved by the platform the IEDs are sitting on top of.

vDefend: from enforcement to detection (VCF add-on)

NSX provides the network fabric — the overlays, the routing domains, the source-identity assurance. vDefend is the separately licensed add-on that provides firewalling, microsegmentation, and threat detection on top of that fabric — distributed firewall, gateway firewall, IDS/IPS, network traffic analysis, malware prevention, and network detection and response in a single product.

Conduit enforcement

Distributed firewall for east-west microsegmentation. The vDefend distributed firewall evaluates policy at the vNIC of every VM, not at a chokepoint appliance between VLANs. The five-artefact conduit from Drawing the Conduit — the rule on C-WAN-RTU that pins source IP, destination IP, and DNP3 port 20000 — is a distributed firewall rule. It follows the VM if vMotion moves it to another host. It logs every flow to syslog. And because it operates inside the hypervisor kernel, it can enforce policy on traffic that never crosses a physical switch — the intra-cluster conduits C-RTU-OPS and C-OPS-EMS are microsegmented at the vNIC even though the VMs sit on the same rack.
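The five artefacts translate naturally into a policy object. A toy evaluation in Python, with illustrative addresses (this is a model of the semantics, not the NSX API schema):

```python
import ipaddress

# The C-WAN-RTU conduit as a policy object -- addresses are illustrative.
rule = {
    "name": "C-WAN-RTU-dnp3",
    "source": "198.51.100.0/24",    # substation gateway block
    "destination": "10.10.1.5/32",  # the RTU VM
    "protocol": "tcp",
    "port": 20000,                  # DNP3
    "action": "ALLOW",
    "logged": True,                 # every flow to syslog
}

def evaluate(rule: dict, src: str, dst: str, proto: str, port: int) -> str:
    """Match a flow against the rule; default-deny everything else."""
    if (ipaddress.ip_address(src) in ipaddress.ip_network(rule["source"])
            and ipaddress.ip_address(dst) in ipaddress.ip_network(rule["destination"])
            and proto == rule["protocol"] and port == rule["port"]):
        return rule["action"]
    return "DROP"

assert evaluate(rule, "198.51.100.7", "10.10.1.5", "tcp", 20000) == "ALLOW"
assert evaluate(rule, "198.51.100.7", "10.10.1.5", "tcp", 22) == "DROP"   # SSH denied
```

In the real distributed firewall this evaluation happens at the vNIC, which is why the rule follows the VM through a vMotion.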

Gateway firewall for north-south and inter-zone traffic. Where the distributed firewall handles east-west traffic at the vNIC, the gateway firewall operates on the NSX Tier-1 gateways that sit between zones. The conduit between Z-WAN and Z-RTU — the operational telecoms circuit arriving at the control centre — is enforced by a gateway firewall rule: source IP pinned to the substation gateway block, destination pinned to the RTU VM, protocol pinned to DNP3 on TCP/20000, everything else denied. The gateway firewall is where inter-zone conduit policy lives; the distributed firewall is where intra-zone and workload-level policy lives. Together they implement the full set of conduits from the eight-zone design.

Directional enforcement — and where it stops. OT architectures often call for a data diode on the conduit between the control zone and the historian or SIEM — a physically one-way path so that telemetry flows out but nothing can flow back in. vDefend’s gateway firewall can enforce connection-initiation directionality: a rule that allows sessions initiated from Z-OPS → Z-IDMZ but drops anything initiated from Z-IDMZ → Z-OPS. For UDP-based flows like syslog forwarding, a stateless rule that permits outbound UDP/514 and denies everything inbound is functionally one-way. For most SL-2 conduits this directional enforcement plus logging is the pragmatic choice. But it is not a true data diode — TCP return traffic still flows back through a stateful session, and the guarantee is policy, not physics. Where the threat model demands physical one-way assurance — SL-3 or SL-4 conduits, or where the regulator explicitly requires it — a hardware data diode remains a separate physical device sitting outside the VMware stack.
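The policy-not-physics distinction shows up clearly in a toy stateful filter: return packets are allowed precisely because state exists, which is exactly what a diode removes:

```python
# Sessions may be *initiated* Z-OPS -> Z-IDMZ; return packets on an
# established session are allowed, but nothing may initiate inbound.
established = set()   # (initiator_zone, responder_zone, flow_id)

def permit(src_zone: str, dst_zone: str, flow_id: int, syn: bool) -> bool:
    if syn:  # connection attempt
        if (src_zone, dst_zone) == ("Z-OPS", "Z-IDMZ"):
            established.add((src_zone, dst_zone, flow_id))
            return True
        return False  # no inbound initiation, ever
    # Non-SYN traffic: permitted only on a session the inside initiated.
    return ((src_zone, dst_zone, flow_id) in established
            or (dst_zone, src_zone, flow_id) in established)

assert permit("Z-OPS", "Z-IDMZ", 1, syn=True)      # outbound initiation: allowed
assert permit("Z-IDMZ", "Z-OPS", 1, syn=False)     # return traffic: allowed by state
assert not permit("Z-IDMZ", "Z-OPS", 2, syn=True)  # inbound initiation: dropped
```

The second assertion is the whole argument: the inbound path exists whenever state says it does, so the one-way guarantee is only as strong as the policy engine holding that state.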

The SL-T enforcement across all eight conduits is microsegmentation policy that follows the workload, not a cable. Every flow is logged. Every denied packet is recorded. The logs feed the centralised SIEM — and the detection layer that sits on top.

Detection

The firewall enforces boundaries. It logs every flow. But enforcement and logging are not detection — they tell you what happened, not whether what happened was malicious. vDefend’s detection capabilities add four functions the 62443 framework assumes exist but doesn’t tell you how to build.

IDS/IPS on east-west traffic. vDefend’s distributed IDS/IPS inspects traffic at the vNIC — the same enforcement point as the distributed firewall — matching against signature databases for known threats. In a control-centre cluster, that means every flow between Z-RTU and Z-OPS, between Z-OPS and Z-EMS, between the engineering workstation zone and anything it touches, is inspected for known exploit patterns and malicious payloads. The signatures update from the cloud feed; the inspection happens inside the hypervisor kernel, not at a chokepoint appliance that an attacker can route around. In 62443 terms this is FR5 (Restricted Data Flow) and FR6 (Timely Response to Events) working together: the firewall restricts which flows are permitted, and the IDS/IPS inspects the permitted flows for threats hiding inside them.

Network Traffic Analysis (NTA). Signatures catch known threats. NTA catches the unknown ones. vDefend’s NTA engine uses machine learning to baseline normal network behaviour — traffic volumes, protocol patterns, connection timing — and flags deviations in real time. In a control centre where the DNP3 poll cycle is a predictable thirty-second heartbeat, an anomalous burst of MMS write commands from an unexpected source or a sudden change in traffic entropy on a conduit that normally carries nothing but read polls is exactly the kind of signal NTA is designed to surface. The INDUSTROYER playbook — legitimate protocol commands issued from a compromised host — would not trigger a signature. It would trigger NTA’s behavioural model, because the traffic pattern would deviate from the learned baseline. In 62443 terms this extends FR6 from “detect known threats” to “detect anomalous behaviour”, which is the gap between SL-2 and SL-3 detection capability.
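NTA's behavioural model is far richer than this, but a z-score over the learned poll-interval baseline captures the idea. The numbers are invented:

```python
from statistics import mean, stdev

# Learned baseline: inter-poll gaps (seconds) on a conduit with a ~30 s DNP3 heartbeat.
baseline_gaps = [29.8, 30.1, 30.0, 29.9, 30.2, 30.0, 29.9, 30.1]

def is_anomalous(gap_s: float, history: list[float], threshold: float = 4.0) -> bool:
    """Flag a gap whose z-score against the learned baseline exceeds the
    threshold -- a crude stand-in for NTA's behavioural model."""
    mu, sigma = mean(history), stdev(history)
    return abs(gap_s - mu) / sigma > threshold

assert not is_anomalous(30.1, baseline_gaps)   # normal heartbeat: quiet
assert is_anomalous(2.0, baseline_gaps)        # sudden command burst: flagged
```

The key property is that nothing here inspects payloads: the INDUSTROYER-style attack, using syntactically perfect protocol commands, is caught because its timing is wrong, not its content.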

VM-aware Malware Prevention (MPS) and Guest Introspection. This is the capability that has no equivalent in a physical-appliance architecture. vDefend’s Malware Prevention Service operates at the hypervisor level through Guest Introspection — it has visibility into file systems, running processes, and registry activity across every VM on the host without requiring an agent inside the guest. If malware lands on an RTU VM — delivered via a compromised engineering workstation, a vendor update package, or a supply-chain attack on a software dependency — MPS can detect it through static analysis, dynamic sandboxing, and memory inspection before it executes. The hypervisor sees what the guest OS cannot hide. In 62443 terms this is FR3 (System Integrity) at the workload layer: vTPM attestation proves the VM booted cleanly, and Guest Introspection monitors what happens after boot. Between them, the integrity chain covers the full lifecycle of the workload, not just the moment it started.
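
The static-analysis part of that pipeline reduces, at its simplest, to hashing what the guest is about to run and checking it against known-bad indicators, something the hypervisor can do through introspection without an in-guest agent. A toy sketch, using the SHA-256 of an empty file as a stand-in IoC so the example is verifiable (real MPS verdicts combine static analysis, sandboxing, and memory inspection, not a single hash lookup):

```python
import hashlib

# Stand-in indicator of compromise: the SHA-256 of an empty file.
KNOWN_BAD_SHA256 = {
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def static_verdict(file_bytes: bytes) -> str:
    # The hypervisor reads the guest file through introspection, so the
    # guest OS cannot hide or tamper with the bytes being hashed.
    digest = hashlib.sha256(file_bytes).hexdigest()
    return "BLOCK" if digest in KNOWN_BAD_SHA256 else "ALLOW"
```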

Network Detection and Response (NDR). The IDS/IPS generates alerts. NTA generates anomaly signals. MPS generates malware verdicts. NDR is the correlation layer that synthesises all three into a coherent picture. It aggregates detection signals from the distributed sensors across the cluster, correlates related events into intrusion campaigns rather than isolated alerts, and gathers contextual data — which VM, which zone, which conduit, which user session — to give the SOC analyst a narrative rather than a list. In 62443 terms this is the operational expression of FR6: not just “we detected something” but “we detected a coordinated sequence of actions across three zones, correlated it with the NTA baseline deviation that started forty minutes earlier, and here is the campaign timeline.” The Enhanced CAF’s SOC-level detection indicator asks for exactly this capability — the ability to detect an attacker who is already inside the perimeter and moving laterally.
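
The correlation step can be illustrated with a toy time-window fold: signals that arrive close together join one campaign, and an isolated alert stays isolated. The field names and the forty-minute window below are invented for the example, not NDR's actual model:

```python
def correlate(events, window_minutes=40):
    """events: (timestamp_minutes, source, zone, detail) tuples.

    A signal within the window of the previous one joins the same
    campaign; a larger gap starts a new one.
    """
    campaigns, current = [], []
    for event in sorted(events):
        if current and event[0] - current[-1][0] > window_minutes:
            campaigns.append(current)
            current = []
        current.append(event)
    if current:
        campaigns.append(current)
    return campaigns

signals = [
    (0,   "NTA", "Z-OPS", "baseline deviation on a read-only conduit"),
    (25,  "IDS", "Z-EW",  "exploit signature on the engineering workstation"),
    (38,  "MPS", "Z-RTU", "malware verdict on an RTU VM"),
    (400, "IDS", "Z-OPS", "unrelated scan hours later"),
]

correlate(signals)   # first three signals form one campaign; the last stands alone
```

The analyst-facing difference is the shape of the output: three sources across three zones collapse into a single timeline rather than three tickets.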

The relationship between vDefend and the Dragos sensor is complementary, not overlapping. Dragos deep-parses OT protocols — it knows what a DNP3 function code means, what a GOOSE retransmission pattern looks like, what the indicators of CHERNOVITE tradecraft are. vDefend operates at the infrastructure layer — it sees traffic patterns, file-system mutations, process behaviour, and network anomalies regardless of the application protocol. An attacker who uses legitimate DNP3 commands from a compromised VM would evade vDefend’s IDS/IPS signatures but trigger Dragos’s protocol-aware detection. An attacker who compromises a VM through a supply-chain attack on a non-OT dependency would evade Dragos entirely but trigger MPS and NTA. The two detection planes cover each other’s blind spots — and NDR can ingest Dragos alerts alongside its own signals, correlating OT-protocol events with infrastructure-layer anomalies into a single campaign view.

Avi (NSX Advanced Load Balancer): load balancing, TLS offload, and application security (VCF add-on)

Virtual IP for service continuity. Avi’s core function is load balancing behind a virtual IP (VIP). In a control centre, the VIP gives substation gateways a stable destination address for DNP3 or IEC 60870-5-104 traffic regardless of which RTU VM is actually serving behind it. If an RTU VM fails, gets patched, or is rebuilt on a different host, the VIP stays up and Avi redirects traffic to a healthy backend — the substation gateway reconnects to the same IP without reconfiguration. The same pattern applies to EMS and ADMS front-ends: operators and upstream systems connect to a VIP, and Avi distributes sessions across the pool. Health monitors detect backend failures within seconds and drain traffic away before the SCADA polling cycle notices. In 62443 terms this is FR7 (Resource Availability) at the application layer: the service survives a backend failure without the remote end needing to know anything changed.
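
The pattern reduces to a health-monitored pool behind a stable address. A minimal sketch, with invented backend names and first-healthy selection standing in for Avi's real scheduling algorithms:

```python
class Pool:
    """Health-monitored backend pool behind a stable virtual IP."""

    def __init__(self, backends):
        self.health = {b: True for b in backends}

    def mark_down(self, backend):
        # A health monitor failed its probe, e.g. the VM is being patched.
        self.health[backend] = False

    def mark_up(self, backend):
        self.health[backend] = True

    def select(self):
        healthy = [b for b, up in self.health.items() if up]
        if not healthy:
            raise RuntimeError("no healthy backend behind the VIP")
        # First-healthy selection; real load balancers use round-robin,
        # least-connections, and similar algorithms.
        return healthy[0]

pool = Pool(["rtu-vm-a", "rtu-vm-b"])
pool.select()               # serves from rtu-vm-a
pool.mark_down("rtu-vm-a")  # rtu-vm-a is drained for patching
pool.select()               # traffic shifts to rtu-vm-b; the gateway never notices
```

The substation gateway only ever holds the VIP, so backend churn, patching, failure, rebuild, is invisible to it by construction.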

TLS offload for legacy endpoints. The 62351-4 requirement — TLS underneath MMS — and the OPC UA security model both assume every endpoint can terminate TLS. In practice, some station-level applications and older historians can’t. Avi, a separately licensed VCF add-on, can sit as a TLS proxy at the conduit boundary: the plain-text MMS or OPC UA session from the endpoint that can’t speak TLS terminates at Avi, which re-originates it as a TLS-secured session across the conduit to the historian or ADMS on the far side. The plain-text leg stays short and local; the wire between zones carries TLS. Certificate management lives in Avi’s centralised controller rather than on every endpoint device — which means 62351-9 key rotation becomes a fabric operation rather than a truck roll. It’s not end-to-end in the way 62351-4’s E2E signing extension is, but it’s a compensating control that gets TLS onto conduits that would otherwise carry plain-text indefinitely.
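
The re-origination idea can be sketched with Python's standard `ssl` module. The TLS policy (minimum protocol version, certificate verification) lives on the proxy, not on the legacy endpoint. The upstream host name below is invented, and the inbound plain-text side is elided; this is the shape of the idea, not Avi's implementation:

```python
import socket
import ssl

# Hypothetical upstream: the historian or ADMS front-end on the far
# side of the conduit.
UPSTREAM_HOST = "adms.example.internal"

def outbound_context() -> ssl.SSLContext:
    # The TLS policy is configured in one place on the proxy, which is
    # what turns 62351-9 key rotation into a fabric operation.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

def re_originate(plaintext_payload: bytes) -> None:
    # Outbound leg only: the same MMS/OPC UA bytes the legacy endpoint
    # produced, now carried under TLS across the conduit.
    ctx = outbound_context()
    with socket.create_connection((UPSTREAM_HOST, 443)) as raw:
        with ctx.wrap_socket(raw, server_hostname=UPSTREAM_HOST) as tls:
            tls.sendall(plaintext_payload)
```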

Avi also provides a Web Application Firewall (WAF) that is relevant where OT systems expose web-based interfaces — ADMS operator dashboards, historian query portals, engineering configuration UIs. These are the web surfaces that sit on conduits between Z-OPS and Z-IDMZ, and between Z-EW and Z-STN. The WAF inspects HTTP/HTTPS traffic for injection attacks, cross-site scripting, and the OWASP Top 10 categories that web-facing OT interfaces are vulnerable to but rarely designed to defend against. In 62443 terms this extends FR5 (Restricted Data Flow) to the application layer: the vDefend firewall restricts which flows reach the web interface, and the WAF inspects what those flows contain.
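
At its crudest, the WAF's job is pattern inspection on request fields before they reach the interface. A toy sketch with two deliberately simplistic rules (real rulesets cover the OWASP Top 10 with far more nuance than a pair of regexes):

```python
import re

# Two crude illustrative rules, not a production ruleset.
RULES = [
    ("sql-injection", re.compile(r"('|\b)(OR|UNION|SELECT)\b.*=", re.IGNORECASE)),
    ("xss", re.compile(r"<script\b", re.IGNORECASE)),
]

def waf_findings(field_value: str):
    return [name for name, pattern in RULES if pattern.search(field_value)]

waf_findings("station=OSLO-04")            # clean operator query: no findings
waf_findings("station=' OR '1'='1")        # classic injection probe
waf_findings("<script>alert(1)</script>")  # cross-site-scripting attempt
```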

VMware Live Site Recovery: disaster protection (VCF add-on)

The concentration risk — one control-centre cluster serving hundreds of substations — means a site-level failure is no longer a single-substation event. A fire, a flood, a prolonged power outage at the data hall, or a hardware failure that takes down the vSAN cluster leaves the operator blind across an entire region. The protection schemes inside each substation continue to operate locally — they don’t depend on the RTU — but telemetry visibility could take hours to restore and remote control could take days.

VMware Live Site Recovery (VLSR) is a separately licensed VCF add-on that provides orchestrated disaster recovery across sites. Enhanced vSphere replication delivers one-minute RPOs — the RTU VMs, the EMS and ADMS front-ends, the SCADA servers, and the historian databases are continuously replicated to a secondary site with less than sixty seconds of data exposure. Failover is orchestrated: VLSR manages the recovery plan, the IP re-mapping, the boot order, and the NSX network policy at the secondary site so the recovered environment comes up with the same microsegmentation and conduit enforcement as the primary. The recovery plan is testable non-disruptively — you can validate the failover without affecting production.
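
The boot-order half of such a recovery plan can be sketched as a sorted walk over priorities, bringing dependency-critical workloads up first. The VM names, priorities, and network mappings below are invented, and the VLSR plan model is much richer than this:

```python
# A recovery plan as data: boot priority and target-network mapping per VM.
RECOVERY_PLAN = [
    {"vm": "historian-01", "priority": 3, "target_net": "seg-hist-dr"},
    {"vm": "scada-01",     "priority": 2, "target_net": "seg-ops-dr"},
    {"vm": "rtu-gw-01",    "priority": 1, "target_net": "seg-rtu-dr"},
]

def boot_order(plan):
    # Lower priority number boots first, so dependencies come up before
    # the workloads that need them.
    return [step["vm"] for step in sorted(plan, key=lambda step: step["priority"])]

boot_order(RECOVERY_PLAN)   # rtu-gw-01, then scada-01, then historian-01
```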

In 62443 terms this is FR7 (Resource Availability) for the physical disaster case. The cluster survives a site failure and can be rebuilt at a secondary location with the same security posture. The one-minute RPO means the gap between the last replicated state and the failure event is narrow enough that the recovered environment is operationally useful — the operator isn’t restoring from last night’s backup, they’re restoring from sixty seconds ago.

VLSR addresses the scenario where the infrastructure is destroyed or unavailable but the data and configuration are known-good. The harder problem — recovery from a coordinated cyber attack where the data and configuration may themselves be compromised — is the domain of Advanced Cyber Compliance.

VMware Advanced Cyber Compliance: cyber recovery and continuous compliance (VCF add-on)

A cyber attack is not a site failure. In a site failure, you know the data is good and the infrastructure is bad — you restore known-good state to different hardware. In a cyber attack, you don’t know which data is good. The attacker may have been inside the environment for weeks before detection. The backup from yesterday may contain the persistence mechanism. The configuration that looks correct may have been modified to leave a door open. The first problem isn’t restoring — it’s finding a clean point to restore from.

VMware Advanced Cyber Compliance (ACC) is a separately licensed VCF add-on purpose-built for this problem. It brings two capabilities that together close the cyber-recovery and continuous-compliance requirements the 62443 framework and the Enhanced CAF both demand.

Cyber recovery

ACC provides an isolated clean room — a network-isolated VCF environment with push-button isolation that can be spun up without connectivity to the compromised production environment. Inside the clean room, automated validation scans restore-point candidates against known indicators of compromise, identifying the last clean snapshot before the attack rather than blindly restoring the most recent one. Orchestrated restore then rebuilds thousands of VMs from the validated clean point.
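
The selection logic at the heart of that validation scan can be sketched as a newest-to-oldest walk that stops at the first restore point with no IoC hits. The timestamps and indicators below are invented for the example:

```python
# Invented indicators of compromise.
IOCS = {"badhash-7f3a", "persistence-task-xyz"}

def last_clean(restore_points):
    """restore_points: (timestamp, artefacts) tuples, oldest first."""
    for timestamp, artefacts in reversed(restore_points):
        if not (set(artefacts) & IOCS):
            return timestamp
    return None   # nothing clean: fall back to rebuilding from golden images

points = [
    ("t-72h", {"baseline"}),
    ("t-48h", {"baseline"}),
    ("t-24h", {"baseline", "persistence-task-xyz"}),  # attacker already inside
    ("t-1h",  {"baseline", "badhash-7f3a"}),
]

last_clean(points)   # "t-48h": not the most recent point, the most recent clean one
```

The answer is deliberately not the latest snapshot, which is the whole argument against blind restore-from-last-night.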

For a control centre serving hundreds of substations, this is the difference between a recovery that takes days of manual forensics and a recovery that can begin within hours of detection. In 62443 terms this is FR7 (Resource Availability) for the worst case: not just “the cluster survives a host failure” (that’s vSphere HA) or “the cluster survives a site failure” (that’s VLSR), but “the cluster survives a coordinated attack and can be rebuilt from a validated clean state.” The isolated clean room is the difference between a recovery plan that exists on paper and one that can be tested non-disruptively against the running environment.

Continuous compliance

The other half of ACC addresses a different problem: proving, continuously, that the infrastructure meets the security posture it claims. VMware Salt scans the infrastructure against CIS-benchmarked security policies, detects configuration drift, and remediates automatically. In a control centre where dozens of ESXi hosts and hundreds of VMs serve the grid, a single misconfigured host — an SSH daemon left enabled, a firewall rule widened during a maintenance window and never tightened — is the kind of drift that turns an SL-3 zone into an SL-1 zone without anyone noticing. Salt catches the drift, flags it, and can remediate it without a change ticket. In 62443 terms this is FR3 (System Integrity) as a continuous process rather than a point-in-time audit: the infrastructure doesn’t just start compliant, it stays compliant.
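
Drift detection of this kind reduces to a diff against the hardened baseline plus an enforcement step. A toy sketch, with a dict standing in for Salt's state files and setting names that echo the hardening guide's themes:

```python
# A tiny stand-in for the hardened baseline.
BASELINE = {
    "ssh_enabled": False,          # no interactive SSH on hosts
    "forged_transmits": "reject",  # virtual-switch hardening
    "session_timeout_min": 10,
}

def detect_drift(observed):
    # Returns {setting: (observed_value, baseline_value)} per deviation.
    return {key: (observed.get(key), value)
            for key, value in BASELINE.items()
            if observed.get(key) != value}

def remediate(observed):
    # Force every baseline value back; real remediation is gated by policy.
    observed.update(BASELINE)
    return observed

host = {"ssh_enabled": True, "forged_transmits": "reject", "session_timeout_min": 10}
detect_drift(host)               # flags the SSH daemon left enabled
detect_drift(remediate(host))    # empty: the host is back on baseline
```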

The controls Salt enforces come from the VCF Security Configuration and Hardening Guide — 182 individual controls across ESXi hosts, vCenter, VMs, vSAN, and NSX, covering account lockout, network-switch hardening (reject forged transmits, MAC changes, promiscuous mode), Secure Boot enforcement, session timeouts, FIPS mode, and log-forwarding configuration. What makes this directly relevant to the substation story is that Broadcom publishes a compliance mapping of those 182 controls to IEC 62443-4-2 — the component-level security standard. The mapping gives an assessor a spreadsheet that says “this VCF control closes this 62443-4-2 requirement at this Security Level”. It also maps to NERC CIP, NIS2, NIST 800-53, and ten other frameworks. The hardening guide is the bridge between “VCF can do this” and “here is the auditable evidence that it does.”

The practical effect for an infrastructure architect: the Enhanced CAF asks operators to demonstrate that their infrastructure meets its claimed security posture and that they can detect and recover from compromise. ACC is the product that operationalises both requirements — continuous compliance for the first, cyber recovery for the second.

The platform and the standard

This series started with the problem — substations that outlive their equipment. Then the engineering response — merging units, process buses, and the virtualisation debate. Then the threat that the air gap was supposed to contain but never did. Then the standards — 62443 zones and conduits, 62351 cryptographic mechanisms — that replace it. This post is where the abstract meets the concrete — where the standards hit a product catalogue and somebody has to make it work.

The mapping is never one-to-one. VCF doesn’t “do 62443” any more than concrete “does architecture”. But the base platform provides the foundation: NSX overlays for zone boundaries, VMCA for infrastructure PKI, vLCM for integrity attestation, VKS and ArgoCD for pull-based workload deployment, Harbor for supply-chain trust, and confidential computing for data-in-use protection. The add-ons extend the platform into the domains the base doesn’t cover: vDefend for conduit enforcement and threat detection, Avi for service continuity and TLS offload on legacy protocols, Live Site Recovery for disaster protection, and Advanced Cyber Compliance for cyber recovery and continuous compliance enforcement. And the Enhanced CAF — the regulatory driver — provides the reason none of this can be deferred.

The thing I keep coming back to is that no single layer is sufficient. The 62351 crypto without the 62443 design is a lock without a door. The 62443 design without the platform is a blueprint without a building site. The platform without the regulatory driver — the Enhanced CAF obligations — is infrastructure waiting for a business case. It takes all of them working together to turn a virtualised control centre into something an operator can defend and an assessor can audit.

The conduit, it turns out, is a team effort.

References

VMware / Broadcom

Standards