During a recent upgrade from vSphere 7U3 to vSphere 8U3D in our vCenter environment, we hit an unexpected issue in the second stage of the pre-checks. This post outlines the troubleshooting and resolution steps we took to address a vmdird replication error caused by an old, decommissioned vCenter Server or Platform Services Controller (PSC) instance. Here’s a step-by-step look at how we identified the issue, resolved it, and completed the upgrade.
Identifying the Issue: vmdird Replication Error
As part of the upgrade process, the vCenter installer runs pre-checks to catch potential issues before the actual update begins. During this stage, we found a replication error for vmdird, the VMware Directory Service that stores vCenter Single Sign-On data and replicates it between the nodes of an SSO domain.
In the vmdird-syslog.log file (located at /var/log/vmware/vmdird), we saw the following error message:
Error [Server down] [9127]
This error was associated with an old vCenter/PSC that had been previously decommissioned but was still present in the vCenter vmdir. This incomplete cleanup caused the upgrade process to detect the decommissioned instance as an unavailable replication partner.
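A quick way to check for this condition before an upgrade is to scan the log for the same marker. The following is a minimal Python sketch, not a VMware tool: the function name find_server_down_errors is hypothetical, the log path is assembled from the file name and directory mentioned above, and the match is on the literal "[Server down]" substring only, since the full format of a log line is not shown here.

```python
from pathlib import Path

# Path assembled from the file name and directory given in the post;
# adjust it if your appliance differs.
LOG_PATH = Path("/var/log/vmware/vmdird/vmdird-syslog.log")

# Only the substring quoted in the post is matched; nothing else about
# the log line format is assumed.
MARKER = "[Server down]"


def find_server_down_errors(lines):
    """Return (line_number, text) pairs for lines containing MARKER."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if MARKER in line
    ]


if __name__ == "__main__":
    if LOG_PATH.exists():
        with LOG_PATH.open(errors="replace") as handle:
            for number, text in find_server_down_errors(handle):
                print(f"{number}: {text}")
    else:
        print(f"{LOG_PATH} not found; run this on the vCenter appliance.")
```

Any hit is only a prompt to look at the replication partners; the script does not tell you which partner is unreachable.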
Troubleshooting and Diagnosis
To gain more insight into the issue, we ran the showservers and showpartners commands of the vdcrepadmin utility, as described in this Broadcom article. The output confirmed that the decommissioned instance was still listed as a replication partner, but its availability status was marked as “No”.
This meant that the old instance was recognized as a replication partner but was not reachable, causing the replication error.
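For reference, the two checks can be scripted. The Python sketch below only assembles the command lines and prints them. It assumes the vdcrepadmin binary lives at /usr/lib/vmware-vmdir/bin/vdcrepadmin and accepts the -f, -h, -u and -w options, which is how we recall the Broadcom article describing it, so verify both against the article before use. The helper name build_vdcrepadmin_cmd and the default host and user values are illustrative.

```python
import shlex

# Assumed install path on the vCenter appliance; confirm against the KB.
VDCREPADMIN = "/usr/lib/vmware-vmdir/bin/vdcrepadmin"

# The two queries used in this post.
SUPPORTED_FEATURES = ("showservers", "showpartners")


def build_vdcrepadmin_cmd(feature, host="localhost", user="administrator",
                          password=None):
    """Build the argument list for one vdcrepadmin query.

    If password is None, -w is left out (assumption: the tool then
    prompts for it interactively).
    """
    if feature not in SUPPORTED_FEATURES:
        raise ValueError(f"unsupported feature: {feature}")
    cmd = [VDCREPADMIN, "-f", feature, "-h", host, "-u", user]
    if password is not None:
        cmd += ["-w", password]
    return cmd


if __name__ == "__main__":
    for feature in SUPPORTED_FEATURES:
        # Printed rather than executed, so the commands can be reviewed first.
        print(" ".join(shlex.quote(arg) for arg in build_vdcrepadmin_cmd(feature)))
```

Printing the commands instead of running them keeps the sketch safe to try anywhere; on the appliance, the same list can be passed to subprocess.run.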
Resolving the Replication Issue with cmsso-util
To clean up the decommissioned vCenter/PSC instance from the replication agreements, we turned to VMware’s cmsso-util tool. Following the steps in VMware’s KB article, we used cmsso-util unregister to remove the stale replication partner from the vmdir directory.
This command helped us clean up any remaining references to the old instance, ensuring that no trace of it would interfere with the vSphere 8 upgrade process.
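As a rough illustration of what the unregister call looks like, the Python sketch below assembles the command line without running it. The --node-pnid, --username and --passwd options are given as we recall them from the KB article, so check them there first. The host name old-psc.example.com, the administrator@vsphere.local account and the helper name build_unregister_cmd are placeholders, not values from our environment.

```python
import shlex


def build_unregister_cmd(node_pnid, username="administrator@vsphere.local",
                         password=None):
    """Build the cmsso-util unregister argument list for a stale node.

    node_pnid: FQDN (or IP) of the decommissioned vCenter/PSC.
    If password is None, --passwd is left out (assumption: the tool
    then prompts for it interactively).
    """
    if not node_pnid:
        raise ValueError("node_pnid must name the decommissioned node")
    cmd = ["cmsso-util", "unregister",
           "--node-pnid", node_pnid,
           "--username", username]
    if password is not None:
        cmd += ["--passwd", password]
    return cmd


if __name__ == "__main__":
    # Hypothetical host name; replace with the stale partner reported
    # by showpartners before running anything.
    cmd = build_unregister_cmd("old-psc.example.com")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The node name passed to --node-pnid should be the stale partner exactly as the replication checks report it, since unregistering the wrong node removes a live partner.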
Validating and Re-running Pre-checks
Once the cleanup was complete, we re-ran the pre-checks for the upgrade. This time they passed with no replication errors, and the vmdird-syslog.log file showed no new “Server down” entries.
With the pre-checks now clean, we proceeded with the upgrade, and it completed successfully.
Conclusion
Removing outdated replication partners in vmdir is crucial for a smooth vCenter upgrade process. Here are some key takeaways:
- Monitor vmdird logs for any unexpected replication errors before starting an upgrade.
- Use the showservers and showpartners commands to confirm the replication status of all vCenter/PSC instances.
- Clean up decommissioned instances using cmsso-util to avoid replication errors during upgrades.
This experience highlights the importance of properly decommissioning old instances and keeping your vmdir directory up to date, ensuring a seamless upgrade process for future vSphere updates.