I mostly use the vSphere Flash client and was monitoring my vSAN performance with it. I noticed TCP inbound loss rate was ranging from 1-10% on the vSAN host TCP packet retransmission and loss rate graph. My VMs did not seem to be impacted. Also, there is bound to be some loss with TCP. However, this number seemed high to me.
I had a case open with VMware GSS and they could not figure out what the underlying issue was. They blamed my Cisco UCS servers. Cisco didn’t have any ideas. Nothing seemed wrong with my physical switches.
Then one day I used the HTML5 client and looked at the same graph. The numbers were much lower. I went back to the Flash client and the numbers were high. I followed the graphs over multiple time periods, on every host, and noticed the numbers were always off by a factor of ten. See the two screenshots below. You can see the shape of the line graph is the same, but the y axis is on a different scale. Also, at every point in time I hover over, the two values differ by a factor of ten.
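If you want to confirm this kind of constant offset without hovering over every point, a rough sketch like the one below can help. It assumes you have copied matching samples from both clients into two lists; the numbers and the `scale_factor` helper are made up for illustration and are not from my environment or from any VMware tool.

```python
from statistics import median


def scale_factor(series_a, series_b, tolerance=0.05):
    """Return the shared ratio a/b if every pair agrees within tolerance, else None."""
    # Skip points where the second series is zero to avoid dividing by zero.
    ratios = [a / b for a, b in zip(series_a, series_b) if b != 0]
    if not ratios:
        return None
    mid = median(ratios)
    # Every ratio must sit close to the median for the offset to count as constant.
    if all(abs(r - mid) <= tolerance * mid for r in ratios):
        return mid
    return None


# Hypothetical loss-rate samples (percent) read at the same timestamps.
flash_pct = [1.2, 4.8, 9.7, 3.1]
html5_pct = [0.12, 0.48, 0.97, 0.31]

factor = scale_factor(flash_pct, html5_pct)
print(factor)
```

A result close to 10 for every host and time range points at a scaling problem in one of the graphs rather than real packet loss.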
I have a cluster on VMware Cloud on AWS. That of course is using vSAN so I decided to check it out. Same exact problem! Therefore, it has nothing to do with my on-prem configuration or version. I reported the issue to VMware, but it didn’t seem like they would do anything about it. After all, the Flash client will be deprecated in the next major release of vSphere. Still, it is frustrating that I chased what I thought was a problem for a while and it turned out to be a bug in the graph. I hope no one else falls for this.
Flash Client (Flex)
I ran into some gotchas when deploying a vSAN 6.7 U1 cluster. The cluster quickstart should save time when deploying a vSAN cluster. However, there are a couple of steps to take to avoid issues.
Do not use the vSphere Flash client for any step of the deployment. Only use the vSphere HTML5 client. There are odd issues that can occur. For example, I created the cluster in the Flash client. I then wanted to run quickstart in the HTML5 client. However, the options were grayed out under cluster basics so I could not select anything or go to add hosts. I deleted the cluster and created it in the HTML5 client. Then quickstart worked fine. I later had another new deployment. I tried out the same scenario and had the same problem.
Quickstart has the option of creating a vDS. The vDS will be created at version 6.5. There is no option to select a specific vDS version when it is deployed through quickstart. Whether this is fine depends on the versions of the other vDSs in your environment and whether maintaining vMotion compatibility is a requirement. For example, if another cluster has a vDS on version 6.0, then VMs in that cluster cannot be live migrated to the vSAN cluster on vDS version 6.5 because the vDS versions differ. The only option is to power off the VM to allow it to be migrated. To get around this, create a vDS on version 6.0 before starting quickstart. Then select the option to ‘USE EXISTING’ on the distributed switches section of quickstart.
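Before running quickstart, it can be worth listing which existing switches would not match the version quickstart creates. The sketch below is only an illustration of that check: the switch names and versions are hypothetical and typed in by hand, and `mismatched_switches` is a helper I made up, not something from a VMware SDK.

```python
def parse_version(text):
    """Turn '6.5.0' into a comparable tuple, treating '6.0' and '6.0.0' as equal."""
    parts = [int(p) for p in text.split(".")]
    # Drop trailing zeros so short and long forms of a version compare the same.
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def mismatched_switches(switches, target_version):
    """Return the names of switches whose version differs from the target."""
    target = parse_version(target_version)
    return sorted(
        name for name, version in switches.items()
        if parse_version(version) != target
    )


# Hypothetical inventory: vDS name -> version, entered manually.
existing = {"dvs-prod": "6.0.0", "dvs-dmz": "6.5.0"}

# Quickstart creates its vDS at 6.5, so compare everything against that.
needs_attention = mismatched_switches(existing, "6.5.0")
print(needs_attention)
```

Any switch that shows up in the list is a candidate for the workaround of creating a matching vDS first and choosing ‘USE EXISTING’.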
Quickstart is definitely very useful and can save a lot of time if you do not hit these gotchas. The ability to auto-fill sequential IP addresses for management and vMotion, as well as the ESXi root login credentials, is convenient. Hopefully, fault tolerance will eventually be added to quickstart. Also, having a guided workflow makes the vSAN deployment relatively easy.
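If you like to plan the addressing before opening the wizard, the sequential auto-fill is easy to mimic. This is a small sketch using Python's standard `ipaddress` module; the starting address and host count are made-up examples, and `sequential_ips` is my own helper rather than anything quickstart exposes.

```python
import ipaddress


def sequential_ips(start, count):
    """Return `count` consecutive addresses beginning at `start`."""
    first = ipaddress.ip_address(start)
    # Adding an integer to an address object yields the next address in sequence.
    return [str(first + offset) for offset in range(count)]


# Hypothetical management addresses for a four-host cluster.
mgmt_ips = sequential_ips("192.168.10.11", 4)
for host_number, ip in enumerate(mgmt_ips, start=1):
    print(f"host {host_number}: {ip}")
```

The same call with a different starting address gives the matching vMotion list, which makes it easy to double check what the wizard filled in.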
I recently earned the VMware vSAN 2017 Specialist Badge. Despite the name, I did in fact pass the exam in 2019 with a score of 411. This is the latest version of the exam. I do not like how VMware is naming the exams after the year the exam was released. It makes sense for Microsoft to put a year in the exam title because the version of their software obviously has the year in it. I hope VMware goes back to naming the exams after the version of the product.
The latest version of vSAN is 6.7 U2, but the exam objectives are based on vSAN 6.6. The exam was fairly easy so I guess that is why it is for a badge and not a certification. The exam didn’t really have any surprises. It closely followed the exam objectives. I recommend reviewing the material in the two links below and knowing it very well. I had plenty of time to finish the exam.
As you can see from the image below, the badge path is for VCP6 holders. The drop-down also has options for an expired VCP and no VCP. My VCP is active, but on 5.5. I have kept it active by taking VCAP exams during the past few years. It would have become active anyway since VMware changed their rules for certification expiration.
Anyway, there are actually many more certifications that meet the requirements for this badge. The screenshot below, from VMware’s certification manager, shows a long list that includes VCAP 6, VCDX 6, and newer VCP exams. I earned the badge since I previously passed the VCAP6-DCV Design exam. I am not sure why it says vRealize Operations 2017 for 1.2. I am guessing it’s a mistake and should list the vSAN exam.