vSphere Client | Enterprise Kubernetes Infrastructure Platform
Duration: 3-month cross-release initiative
Scope: Redefining upgrade workflows across Supervisor and workload management experiences
Platform Context
vSphere is an enterprise infrastructure platform used to manage virtualized environments at scale, where platform operators are responsible for system stability, security, and availability.
With the introduction of Kubernetes through Supervisor clusters, the platform expanded to support both infrastructure management and modern application workloads, requiring operators to balance stability with the need for faster Kubernetes adoption.
Executive Summary
Decoupling Supervisor upgrades from vCenter to enable faster Kubernetes adoption
Supervisor upgrades were tightly coupled with vCenter releases - a slow and risk-heavy process that limited how quickly platform teams could adopt newer Kubernetes versions.
This initiative introduced asynchronous Supervisor upgrades, enabling Kubernetes to be upgraded independently from vCenter. The focus was on making this shift understandable and safe through clear upgrade paths, explicit system status, and guided workflows for high-risk operations.
Validation with platform operators confirmed that incremental upgrades significantly reduce operational friction, allowing infrastructure teams to move faster while maintaining control and confidence.
Context
Supervisor upgrades in vSphere were tightly coupled with vCenter releases, requiring platform operators to perform full infrastructure upgrades in order to access newer Kubernetes versions.
While Kubernetes evolves on a rapid release cycle, infrastructure upgrades follow a much slower and more risk-sensitive process, often involving extensive validation, dependency checks, and planned downtime.
This created a mismatch between infrastructure and application needs, forcing operators into unnecessary upgrade cycles and limiting their ability to respond to evolving workload requirements.
As a result, upgrading became a high-friction, high-risk operation, where users were required to make complex decisions without sufficient control or clarity over the process.
To understand the impact of this change, it is important to examine how upgrade dependencies were structured before and after introducing asynchronous Supervisor upgrades.
Role & Collaboration Model
The work was developed within a cross-functional team, in close collaboration with product management and frontend and backend engineering.
The design effort focused on defining the end-to-end user experience for asynchronous Supervisor upgrades, including upgrade flows, system communication, and interaction patterns across different operational scenarios.
Design ownership included leading UX direction, facilitating cross-functional discussions, and synthesizing research, technical constraints, and product requirements into a coherent system-level experience.
As the project evolved, collaboration extended to a copywriting architect to ensure consistency in terminology and alignment with established user mental models, improving clarity across the upgrade experience.
The work required navigating strong technical perspectives and aligning stakeholders around solutions that balance system accuracy with user understanding, particularly in avoiding unnecessary exposure of low-level technical details while maintaining transparency.
Discovery & Research
The project was initiated following direct feedback from customers to product management, indicating a clear need for a more flexible upgrade experience.
To understand the upgrade experience in depth, a user journey was mapped across the full lifecycle, from becoming aware of a new release through preparation, execution, and post-upgrade operations. This surfaced emotional friction points that were not visible from the system perspective alone.
Two open research sessions were scheduled to recruit participants directly. Neither attracted sign-ups because the required profile was so specific, so access was requested to the weekly customer calls already facilitated by product management, where the right personas were present. Nine platform operators participated across these sessions. A combination of open discussion and a structured survey was used, allowing participants to contribute where their experience was most relevant while capturing a broader signal across the group.
Findings revealed that upgrades are treated as controlled, risk-sensitive operations - often delayed until stability is proven. Users consistently sought greater control over when and how changes are applied, rather than being driven by system release cycles.
Design & Iteration
The design progressed directly into high-fidelity mockups, reflecting the team's established ways of working within the platform context.
Weekly cross-functional sessions with product management and engineering provided a continuous feedback loop throughout the process. Each session surfaced technical constraints and system behavior considerations that shaped interaction decisions. The design evolved iteratively as backend capabilities were clarified and stakeholder alignment was reached.
This collaborative cadence ensured that design decisions remained technically grounded while preserving clarity and usability for the end user.
Core Challenges
A question that ran through all of these challenges was how to surface asynchronous Supervisor upgrades within the interface in a way that supports awareness without introducing unnecessary noise.
1. Decoupling system behavior from user understanding
Introducing asynchronous upgrades required breaking an existing system dependency while preserving a clear and predictable user experience.
While Supervisor could now be upgraded independently, this change was not inherently visible to users, whose mental model was still tied to vCenter-driven upgrade flows.
The challenge was to make this shift explicit without introducing additional complexity, ensuring users understand what can be upgraded, when, and independently of infrastructure releases.
2. Balancing visibility with cognitive load
The introduction of a new upgrade capability required clear communication, but the form of that communication directly impacted user trust and adoption.
A lightweight alert would have minimized visual noise but risked being overlooked in a high-risk context. A more persistent informational banner was introduced instead, providing clear visibility into the new upgrade option and linking to release notes for additional context.
This decision prioritized awareness and confidence over minimalism, ensuring that users understand the change and feel supported in their decision-making.
3. Designing for evolving system constraints
The underlying system architecture continued to evolve during development, including changes to APIs and the need to extend existing content library capabilities.
This required maintaining flexibility in interaction design while ensuring consistency across flows, even as backend capabilities were being defined.
The challenge was to align UX decisions with a moving technical foundation, without compromising clarity or introducing inconsistencies in the user experience.
4. Terminology & Mental Model Alignment
Introducing new system capabilities required defining terminology that was both technically accurate and consistent with the language operators already used within the platform.
Collaboration with a copywriting architect ensured that new concepts introduced through the async upgrade experience aligned with established platform conventions, preventing confusion and preserving the mental models operators had already built within vSphere.
Addressing these challenges required moving beyond individual interaction decisions toward a structured, system-level framework that could guide users through the full upgrade experience with clarity and confidence.
System-Level Framework Definition
The introduction of asynchronous upgrades required rethinking the upgrade experience as a structured, system-level workflow rather than a single action.
The solution was defined as a layered framework that guides users through the upgrade process while maintaining clarity and control in a high-risk environment.
The experience was structured across four key stages:
– Awareness – making the availability of independent Supervisor upgrades visible and understandable
– Decision-making – providing clear upgrade paths and version relationships to support informed choices
– Validation – ensuring compatibility and readiness through pre-checks and system feedback
– Execution – guiding users through a controlled upgrade flow with explicit status communication
This framework ensured that users are not only able to perform upgrades, but also understand when, why, and how to do so - reducing uncertainty and aligning the process with real operational decision-making.
To support this approach, the upgrade experience was structured as a clear, staged workflow that guides users through decision-making and execution.
Design Solution
The design translates the system-level framework into a set of coordinated interaction patterns that support awareness, decision-making, and safe execution of upgrade operations.
Greenfield & Brownfield Scenarios
The upgrade experience was designed to support two distinct user contexts.
Greenfield: users enabling Supervisor for the first time, navigating content library assignment, configuration options, and activation from scratch.
Brownfield: users who have already deployed Supervisor and are returning to manage upgrades, content library changes, and system updates within an existing environment.
Each scenario required different entry points, different information hierarchies, and different communication of system state, while maintaining a consistent underlying framework across both paths.
Awareness
To address the lack of visibility around the new upgrade capability, a persistent informational banner was introduced within existing upgrade contexts.
Unlike transient alerts, the banner provides continuous visibility into the availability of asynchronous upgrades and links to release notes, allowing users to understand the change before taking action.
This approach reinforces trust by making the system behavior explicit rather than implicit.
Decision Clarity
To support informed decision-making, the experience exposes upgrade paths and version relationships directly within the interface.
Users are able to see which Supervisor versions are available, how they relate to their current environment, and what upgrade options are possible without requiring external documentation.
This reduces reliance on manual interpretation of release notes and aligns upgrade decisions with the user’s operational context.
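The version relationships described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the platform's actual logic: the `Version` class, the `classify_targets` function, the labels, and the rule that a target is directly reachable when it is at most one minor version ahead are all assumptions made for the example, as are the version numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """Hypothetical three-part version, compared as (major, minor, patch)."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text):
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def classify_targets(current, available):
    """Label each available version relative to the current one.

    Assumed rule (illustrative only): a target is directly reachable when it
    is newer than the current version and at most one minor version ahead.
    """
    cur = Version.parse(current)
    labels = {}
    for text in available:
        candidate = Version.parse(text)
        if candidate <= cur:
            labels[text] = "not an upgrade"
        elif candidate.major == cur.major and candidate.minor - cur.minor <= 1:
            labels[text] = "direct"
        else:
            labels[text] = "needs intermediate upgrade"
    return labels


# Placeholder version numbers, chosen only to exercise each label.
options = classify_targets("1.28.3", ["1.27.5", "1.28.7", "1.29.1", "1.30.0"])
for version, label in options.items():
    print(f"{version}: {label}")
```

Returning a label for every version, rather than filtering out the unreachable ones, mirrors the design intent of showing operators why an option is unavailable instead of hiding it.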
Risk Mitigation
Given the high-risk nature of upgrade operations, the experience integrates validation directly into the flow through pre-checks and system feedback.
Compatibility checks ensure that dependencies such as vCenter and NSX are validated before the upgrade is initiated, reducing the likelihood of failures and unexpected outcomes.
This shifts risk detection earlier in the process, supporting more confident decision-making.
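As a rough illustration of how pre-checks can gate an upgrade, the sketch below runs a list of compatibility checks and blocks the flow when any of them fails. The names `CheckResult`, `min_version_check`, and `run_prechecks`, the environment dictionary, and the minimum versions are hypothetical; the real checks and thresholds are not described in this case study.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    message: str


def version_tuple(text):
    """Turn '8.0.3' into (8, 0, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def min_version_check(component, minimum):
    """Build a check that passes when the component meets a minimum version."""
    def check(env):
        installed = env.get(component)
        if installed is None:
            return CheckResult(component, False, f"{component} version is unknown")
        passed = version_tuple(installed) >= version_tuple(minimum)
        if passed:
            message = f"{component} {installed} is compatible"
        else:
            message = f"{component} {installed} is below the required {minimum}"
        return CheckResult(component, passed, message)
    return check


def run_prechecks(env, checks):
    """Run every check and report whether the upgrade may proceed."""
    results = [check(env) for check in checks]
    return all(result.passed for result in results), results


# Placeholder environment and thresholds, not real compatibility data.
environment = {"vcenter": "8.0.3", "nsx": "4.1.0"}
checks = [min_version_check("vcenter", "8.0.2"), min_version_check("nsx", "4.1.2")]
can_upgrade, results = run_prechecks(environment, checks)
for result in results:
    print("PASS" if result.passed else "FAIL", result.message)
```

Every check runs even after one fails, so the operator sees the full list of blockers in a single pass rather than fixing them one at a time.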
Guided Execution
The upgrade process is structured as a step-by-step flow that guides users through selection, validation, and confirmation.
Each step provides clear status communication, ensuring users understand the current state of the system and the impact of their actions.
This reduces cognitive load and prevents errors in a complex, multi-step operation.
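One way to reason about a guided flow like this is as a small state machine in which only certain transitions are allowed. The sketch below is a simplified model under assumed step names and transitions; `Step`, `ALLOWED`, and `UpgradeFlow` are hypothetical and do not reflect the actual vSphere Client implementation.

```python
from enum import Enum


class Step(Enum):
    SELECT = "select version"
    VALIDATE = "run pre-checks"
    CONFIRM = "confirm upgrade"
    UPGRADING = "upgrade in progress"
    DONE = "upgrade complete"


# Assumed transitions: users may step back to selection before the upgrade
# starts, but cannot skip validation or confirmation.
ALLOWED = {
    Step.SELECT: {Step.VALIDATE},
    Step.VALIDATE: {Step.CONFIRM, Step.SELECT},
    Step.CONFIRM: {Step.UPGRADING, Step.SELECT},
    Step.UPGRADING: {Step.DONE},
    Step.DONE: set(),
}


class UpgradeFlow:
    def __init__(self):
        self.step = Step.SELECT

    def status(self):
        """Explicit status text for the current step."""
        return f"Current step: {self.step.value}"

    def advance(self, target):
        """Move to the target step, rejecting transitions that skip a stage."""
        if target not in ALLOWED[self.step]:
            raise ValueError(
                f"Cannot go from '{self.step.value}' to '{target.value}'"
            )
        self.step = target
        return self.status()


flow = UpgradeFlow()
print(flow.advance(Step.VALIDATE))
print(flow.advance(Step.CONFIRM))
```

Modelling the allowed transitions explicitly makes it impossible to reach the upgrade step without passing through validation and confirmation, which is the property the guided flow is meant to guarantee.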
Edge Case Handling
A non-obvious failure scenario was identified during the design process: a situation where a content library assigned to the WCP service had been deleted by another user, leaving the system in a broken state without clear indication of the cause.
Rather than leaving users to discover the issue through trial and error, a warning alert was introduced to surface the problem explicitly and guide users toward resolution.
The scenario was validated with participants during research sessions. Although it is rare, operators confirmed it was plausible and something they would want handled. In enterprise environments, rare does not mean unimportant.
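The deleted content library scenario can be illustrated with a minimal check that compares the assigned library against the libraries that still exist. The function name, identifiers, and warning text below are hypothetical and serve only to show the shape of the check, not the wording or logic used in the product.

```python
def content_library_warning(assigned_library_id, existing_library_ids):
    """Return a warning message when the assigned library is gone, else None.

    Hypothetical helper: the identifiers and message text are placeholders.
    """
    if assigned_library_id is None:
        # Nothing assigned yet, so there is nothing to warn about.
        return None
    if assigned_library_id in existing_library_ids:
        return None
    return (
        "The assigned content library no longer exists. "
        "Assign a different content library to continue."
    )


# Placeholder identifiers for illustration.
message = content_library_warning("lib-9", {"lib-1", "lib-2"})
if message:
    print("WARNING:", message)
```

Naming the cause and the next step in the same message is what separates this warning from a generic failure state.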
Validation & Outcome
The solution was validated through direct conversations with platform operators, conducted as part of customer sessions facilitated by product management.
Participants consistently confirmed that the ability to upgrade Supervisor independently from vCenter would significantly improve their workflows. The option to perform incremental upgrades allows infrastructure teams to maintain stability while enabling DevOps teams to access newer Kubernetes versions without unnecessary delays.
The introduction of clearer upgrade paths, explicit system status, and integrated validation mechanisms reduces uncertainty during upgrade operations and supports more confident decision-making.
As a result, the experience enables a shift from forced, full-cycle infrastructure upgrades to more flexible, controlled upgrade strategies aligned with real operational needs.
Growth Reflection
This project required working within a highly technical environment, alongside experienced engineering teams with strong perspectives on system design and implementation.
A key challenge was ensuring that technical solutions remained aligned with user needs, particularly when there was a tendency to expose low-level technical detail or prioritize system logic over user understanding.
It reinforced the importance of balancing technical accuracy with user comprehension, and of maintaining a strong user-centered perspective in complex infrastructure domains. It also showed that when standard research approaches don't work, the solution is to find a different path - not to wait.