Showing posts with label troubleshoot. Show all posts

Installing vSphere Client fails with the error “VMInstallHcmon - Failed to install hcmon driver”

When installing the latest vSphere Client 6.0 on my Windows 10 computer, I got the error “VMInstallHcmon - Failed to install hcmon driver”.

Troubleshooting

  • Tried KB2006486, but Non-Plug and Play Drivers and VMware hcmon do not appear on my Windows 10 computer.
  • Tried renaming the C:\Windows\System32\drivers\hcmon.sys file, but still got the same error.

Solution

  • My laptop already had vSphere Client 5.5 and 6.0 (older build) installed, along with their respective Update Manager plug-ins.
  • I removed these older clients and plug-ins.
  • After that, the vSphere Client 6.0 installation completed successfully.

Setting Up IIS 8 FTP Server Lessons Learned

To test the vCSA 6.5 built-in backup, I need an FTP server. Since I already have a Windows Server 2012 R2 machine running IIS 8 with the web service, adding the FTP server feature takes just a few clicks.

Even though I have not used the Microsoft FTP server since IIS 6, and there are a lot of changes between IIS 6 and IIS 8, I thought setting up the FTP server would be a piece of cake. I was wrong! The following is what I learned while setting up the FTP server in IIS 8.

Lesson #1: Windows Firewall

After installing the FTP service and creating a new FTP site in IIS Manager, I can’t connect to the FTP site from a remote computer; FTP from the server to itself is okay. It must be a Windows firewall issue.

  • I check the Windows Firewall: under Inbound Rules, three FTP rules are created and enabled; under Outbound Rules, two FTP rules are created and enabled. I guess the FTP service installation created them automatically. These rules look right, but I still can’t connect from a remote computer.

Windows.Firewall.Inbound.Rule.FTP

Windows.Firewall.Outbound.Rule.FTP

  • After disabling the Windows Firewall on the server, I can connect. This confirms that the Windows Firewall is causing the issue, but what is the problem? I don’t want to leave the Windows Firewall disabled.
  • The default FTP rules allow the program “%windir%\system32\svchost.exe”. I’m not sure which executable runs the FTP service. (Later, I find it in the Microsoft FTP Service properties, under General, Path to executable: “C:\Windows\system32\svchost.exe -k ftpsvc”.)
  • I create my own FTP rules for my case: two inbound rules and one outbound rule (highlighted in the pictures above) with the same protocol and port number, except that I allow any program. This works! I can connect to the FTP site from a remote computer. (Actually, see Lesson #2 below: it’s not fully working yet. I get another error after entering the login name.)
  • I think the default FTP rules don’t work, until I find this post.
  • I delete the FTP rules I created and restart the “Microsoft FTP Service”. The FTP connection still works, so the default rules were fine; they only needed a service restart to take effect.

Summary:

  • When troubleshooting issues related to Windows Firewall, restart the application service or the server after adding or changing the rules.
  • Restarting the FTP site in IIS Manager does not work; disabling and re-enabling the firewall or the rule does not work either. Restarting the FTP service is required.
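
When testing firewall rule changes like the ones in this lesson, it can help to script the remote connectivity check instead of retrying an FTP client by hand. The following is a minimal Python sketch, assuming the FTP control port is the default 21. The function name port_is_open is hypothetical, and the check only shows that a TCP connection to the control port can be opened; it says nothing about the FTP login or the passive data ports.

```python
import socket


def port_is_open(host: str, port: int = 21, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (which covers refusals, timeouts and name-resolution errors)
        # when the port cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from the remote computer, for example port_is_open("ftp.example.com") where the host name is a placeholder, once before and once after restarting the FTP service.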

Lesson #2: FTP site virtual host name

After the connection problem is resolved (see lesson #1), I continue further on the FTP login. However, after entering the user name, I get the error message “530 Valid hostname is expected. Login failed”.

FTP.Valid.Hostname.Is.Expected

After searching for the error message, I learn about the FTP virtual host name.

In the past I had used the IIS web site virtual hostname to handle multiple web sites on a single IP address and port number. But I don’t recall if the FTP service in IIS 6 has the host name option. When creating the FTP site, I entered the DNS name of the FTP site as the host name.

FTP.Host.Name

Summary:

  • If the FTP site uses a virtual host name, use <ftp virtual hostname>|<ftp username> as the login name.
  • FTP.Virtual.Hostname.Login
  • If you are not going to run multiple FTP sites on the same IP address and port number, leave the host name blank.
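
For a scripted client, the login-name rule from this summary can be sketched in Python with the standard ftplib module. This is only an illustration: the helper names virtual_host_login and connect are hypothetical, and the server, user name and host name passed in are whatever your own site uses.

```python
from ftplib import FTP


def virtual_host_login(username: str, virtual_host: str = "") -> str:
    """Build the FTP login name: 'host|user' when a virtual host name is set."""
    if not virtual_host:
        # The site has no host name binding: log in with the plain user name.
        return username
    return f"{virtual_host}|{username}"


def connect(server: str, username: str, password: str, virtual_host: str = "") -> FTP:
    """Open an FTP session, prefixing the user name with the virtual host name."""
    ftp = FTP(server, timeout=10)
    ftp.login(virtual_host_login(username, virtual_host), password)
    return ftp
```

With a host name of ftp.example.com and a user named backup (both placeholders), the login name sent to the server is ftp.example.com|backup.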

Fix A SAN Datastore Inaccessible On A ESXi Host

A SAN datastore is shown as inaccessible on one of the ESXi hosts in the cluster. The other ESXi hosts can access that datastore without a problem.

esxi.host.datastore.inaccessible

Solution: restart the ESXi management agents on the ESXi host

There are a few ways to restart the management agents (KB1003490).

  • From the Direct Console User Interface (DCUI)
    • Press F2 to customize the system
    • Log in as root
    • Under Troubleshooting Options, select Restart Management Agents
  • From the Local Console (Alt + F2) or SSH
    • Log in as root
    • run these commands
      • /etc/init.d/hostd restart
      • /etc/init.d/vpxa restart
      • If hostd does not restart, use KB1005566 to find and kill the hostd process ID (PID), then start it again (/etc/init.d/hostd start)
  • alternatively
    • To reset the management network on a specific VMkernel interface, by default vmk0
      • esxcli network ip interface set -e false -i vmk0; esxcli network ip interface set -e true -i vmk0
      • Note: run the above commands together, using a semicolon (;) between the two commands
    • To restart all management agents on the host
      • services.sh restart
      • Caution:
        • check if LACP is enabled on the VDS’s Uplink Port Group
        • If LACP is not configured, the services.sh script can be safely executed
        • If LACP is enabled and configured, do not restart management services using services.sh. Instead restart independent services using /etc/init.d/hostd restart and /etc/init.d/vpxa restart.
        • If the issue is not resolved, schedule downtime before restarting all services with services.sh

NetApp "HA GROUP ERROR: DISK/SHELF COUNT MISMATCH ERROR" Troubleshoot

We received the alert “HA GROUP ERROR: DISK/SHELF COUNT MISMATCH ERROR” from the NetApp filer (Model V3240, OS Version 8.1.2 [7-Mode]), one from each node in the NetApp cluster. The alert does not say which node has the problem or what went wrong. It turned out that a disk in one of the nodes had started failing. Here are some steps to help identify the failing disk.

  • Option 1: Search CF-Monitor.txt (inside body.7z file attached in the alert) for “Mismatched disk”, and run disk show <disk_device_id>
  • Option 2: run disk show -v and look for “FAILED” disk
  • Option 3: run sysconfig -d and look for “Not available” under Disk Vital Product Information column
  • Option 4: run aggr status -r (or vol status -r) and look for “Maintenance disks”

How to fix Print Screen hotkey registration failure

I like to use a third party screen capture utility, e.g. ShareX or Greenshot, etc, instead of the Windows built-in Snipping Tool. I configure the utility to load at the startup and set the print screen key to capture the region and copy the image to clipboard, so I can easily paste the screen shot into a document.

After installing the November 10, 2015 Windows Update on my Windows 10 laptop (my Windows 8 laptop does not have this issue with the update), I got a hotkey registration failure message after the system rebooted.


I made sure I didn't have any other screen capture utilities running in the background. I tried uninstalling and reinstalling ShareX, with no luck. Then I installed another screen capture utility, Greenshot, and got a similar message.

I double-checked that no two screen capture utilities were running at the same time.

To figure out which program had registered the Print Screen hotkey, I installed the Windows Hotkey Explorer tool (http://hkcmdr.anymania.com). However, it reported that no program was using the Print Screen hotkey.

While searching the web, I found someone mentioning that the Dropbox or OneDrive application may be configured to automatically upload screenshots to its cloud storage. I don't have Dropbox installed. I have OneDrive, but the screenshot upload to OneDrive is turned off.



Solution:

  • Right click on the OneDrive icon on the task bar, and select Settings


  • Check the checkbox "Automatically save screenshot I capture to OneDrive", then click OK

  • When prompted to choose the folder to save the screenshot, click Cancel
  • Open OneDrive's Settings again to verify the checkbox is unchecked

  • After "resetting" this OneDrive setting, the screen capture utility is loaded successfully and the print screen hotkey is working as it is configured in the application.

Troubleshoot PEAP Authentication

Environment:

Wireless PEAP with Windows Active Directory domain authentication is configured. (see http://www.techrepublic.com/article/ultimate-wireless-security-guide-an-introduction-to-peap-authentication/6148543 for the setup detail).

Windows Server 2003 with a self-signed digital certificate as the RADIUS server.

Wireless access managed by the Active Directory “WiFi Users” security group.

Access Point: Cisco WAP4410N with firmware 2.0.5.3

Access Point Configuration:

  • Discovery (By Bonjour): Enabled
  • Wireless Security Mode: WPA2-Enterprise Mixed (WPA Algorithm: TKIP or AES)
  • Primary RADIUS Server: Windows Server 2003 RADIUS server IP address
  • Primary RADIUS Server Port: 1812
  • Wireless Connection Control (MAC address filter): Disabled

Problem:

The users in the Active Directory “WiFi Users” security group were able to authenticate and access the wireless network with wireless devices (iPhone, iPad, Windows Phone 7.5, Windows XP with SP3, Windows 7, Mac OS X, etc.) configured with PEAP authentication. One day in August 2012, the Windows Server 2003 RADIUS server was updated with the latest Microsoft security updates. After that, only iOS devices (maybe Mac OS X too) could authenticate and access the wireless network; all Windows-based devices kept getting a connection failure even though the configuration and credentials were correct.

Troubleshoot:

The RADIUS server System log shows a warning from source IAS, event ID 2. The user was denied access; Reason-Code = 266; Reason = The message received was unexpected or badly formatted.

Solution:

Scenario 2 in the KB article (http://support.microsoft.com/kb/933430) matches this issue. Using method 3 in the KB article resolved the problem.

Linux “No space left on device” Error

symptom:

a web site is down, but the httpd (in my case lighttpd) service starts okay. restarting the service does not fix the problem.

touch test_file gets the “No space left on device” error

df -h shows a lot of free disk space

Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1              10G  2.8G  6.8G  29% /
/dev/sda1              99M   21M   74M  22% /boot
tmpfs                 125M     0  125M   0% /dev/shm

problem:

out of inodes (df -i shows 100% inode use on /)

Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sdb1             655776  655776       0  100% /
/dev/sda1              26208      38   26170    1% /boot
tmpfs                  31871       1   31870    1% /dev/shm

fix:

find the files of a certain size (e.g. greater than three blocks in this example) to locate these files (in my case, the ruby session files in /srv/www/lighttpd/rails/tmp/sessions; the file name is ruby_sess.xxxxxxxx)

find / -size +3 -print

but rm -f ruby_sess.* failed because “bash: /bin/rm: Argument list too long” (see http://en.kioskea.net/faq/1086-unable-to-delete-file-argument-list-too-long)

ls ruby_sess.* | xargs rm gets “-bash: /usr/bin/ls: Argument list too long”

find . -type f -name ruby_sess.* | xargs rm gets “-bash: /usr/bin/find: Argument list too long”, because the shell still expands the unquoted ruby_sess.* before find runs

finally find . -name "ruby_sess.*" | xargs rm worked, because the quotes pass the pattern to find unexpanded

Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sdb1             655776  122320  533456   19% /
/dev/sda1              26208      38   26170    1% /boot
tmpfs                  31871       1   31870    1% /dev/shm
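
The same check and cleanup can be sketched in Python, which avoids the argument-list limit entirely because no command line is built. This is a minimal sketch under a few assumptions: the function names are hypothetical, it runs on a Unix-like system (os.statvfs), and unlike find it only looks at one directory rather than recursing.

```python
import fnmatch
import os


def inode_usage_percent(path: str) -> float:
    """Return the percentage of inodes in use on the filesystem holding path."""
    st = os.statvfs(path)
    if st.f_files == 0:
        # Some filesystems do not report inode counts.
        return 0.0
    return 100.0 * (st.f_files - st.f_ffree) / st.f_files


def delete_matching(directory: str, pattern: str) -> int:
    """Delete regular files in directory whose names match pattern.

    Names are read one at a time with os.scandir, so no long argument
    list is ever built, unlike rm ruby_sess.* in the shell.
    """
    removed = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False) and fnmatch.fnmatch(entry.name, pattern):
                os.unlink(entry.path)
                removed += 1
    return removed
```

For the case above, the call would be delete_matching("/srv/www/lighttpd/rails/tmp/sessions", "ruby_sess.*"), with inode_usage_percent("/") checked before and after.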

Delete “Account Unknown” Local User Profiles

Issue:

On Windows XP or Server 2003, under Control Panel / System / Advanced / User Profiles / Settings, there are some “Account Unknown” user profiles, but the Delete button is grayed out.  When I try to delete the profile from the “c:\documents and settings” folder, the error message is “Cannot delete NTUSER.DAT: It is being used by another person or program. Close any programs that might be using the file and try again.”

Solution:

  1. Install “User Profile Hive Cleanup Service” (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=6676)
  2. Run uphclean.exe
  3. Then the “Delete” button becomes available


Note: the User Profile Deletion Utility (delprof.exe) (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=5405) cannot delete the “Account Unknown” profile, but it is useful to clean up the normal user profile when their account is still active. (delprof.exe /p /c:\\servername)

Cannot Add User Account in Windows 7 Home Premium

The Local Users and Groups management console (MMC) is not available in Windows 7 Starter and Home Premium.  Adding a user account in these versions is done through Control Panel / User Account.  If you get the error “The specified account is not valid, because account names cannot contain the following character…. Please type a different name”, here is how to troubleshoot:

  1. Verify that the user account name does not contain any of the listed characters.
  2. This error also happens when the user account name already exists.  Because disabled accounts are hidden from Control Panel / User Account, type “net user” at the command prompt to view all user accounts.

Windows Update Scanning Error Fix

If the Windows Update database and manifest are corrupted, a Windows Update scan can take a long time or crash.  The following may fix this problem.

  1. Run the Windows Update troubleshooter
  2. Run the System Update Readiness Tool
  3. Run the System File Checker (sfc) from Administrator Command Prompt. “sfc /scannow”
  4. Rename and recreate the SoftwareDistribution and Catroot2 folders
    • Stop the Windows Update service and its related services
      • net stop wuauserv
      • net stop bits
      • net stop cryptsvc
    • If the Windows Update service cannot be stopped, change its startup type to Disabled, then reboot the computer.
    • Rename %windir%\SoftwareDistribution
    • Rename %windir%\system32\Catroot2
    • Start the Windows Update service and change its startup type to Automatic (Delayed Start).
  5. Re-register all the Windows Update DLLs (stop wuauserv, bits, and cryptsvc services first)
    • regsvr32 c:\windows\system32\vbscript.dll /s
      regsvr32 c:\windows\system32\mshtml.dll /s
      regsvr32 c:\windows\system32\msjava.dll /s
      regsvr32 c:\windows\system32\jscript.dll /s
      regsvr32 c:\windows\system32\msxml.dll /s
      regsvr32 c:\windows\system32\actxprxy.dll /s
      regsvr32 c:\windows\system32\shdocvw.dll /s
      regsvr32 wuapi.dll /s
      regsvr32 wuaueng1.dll /s
      regsvr32 wuaueng.dll /s
      regsvr32 wucltui.dll /s
      regsvr32 wups2.dll /s
      regsvr32 wups.dll /s
      regsvr32 wuweb.dll /s
      regsvr32 Softpub.dll /s
      regsvr32 Mssip32.dll /s
      regsvr32 Initpki.dll /s
      regsvr32 softpub.dll /s
      regsvr32 wintrust.dll /s
      regsvr32 initpki.dll /s
      regsvr32 dssenh.dll /s
      regsvr32 rsaenh.dll /s
      regsvr32 gpkcsp.dll /s
      regsvr32 sccbase.dll /s
      regsvr32 slbcsp.dll /s
      regsvr32 cryptdlg.dll /s
      regsvr32 Urlmon.dll /s
      regsvr32 Shdocvw.dll /s
      regsvr32 Msjava.dll /s
      regsvr32 Actxprxy.dll /s
      regsvr32 Oleaut32.dll /s
      regsvr32 Mshtml.dll /s
      regsvr32 msxml.dll /s
      regsvr32 msxml2.dll /s
      regsvr32 msxml3.dll /s
      regsvr32 Browseui.dll /s
      regsvr32 shell32.dll /s
      regsvr32 wuapi.dll /s
      regsvr32 wuaueng.dll /s
      regsvr32 wuaueng1.dll /s
      regsvr32 wucltui.dll /s
      regsvr32 wups.dll /s
      regsvr32 wuweb.dll /s
      regsvr32 jscript.dll /s
      regsvr32 atl.dll /s
      regsvr32 Mssip32.dll /s
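
Step 4 above (stop the services, rename the two folders, start the services again) could be scripted roughly as follows. This is a hedged Python sketch, not a tested repair tool: the function names, the default C:\Windows location and the ".old" suffix for the renamed folders are assumptions, and restarting all three services rather than only Windows Update is my own choice. It must be run from an elevated prompt on Windows.

```python
import subprocess
from pathlib import PureWindowsPath
from typing import List

# Services named in step 4.
SERVICES = ["wuauserv", "bits", "cryptsvc"]


def reset_commands(windir: str = r"C:\Windows", suffix: str = ".old") -> List[List[str]]:
    """Build the commands for step 4: stop services, rename folders, restart."""
    root = PureWindowsPath(windir)
    folders = [root / "SoftwareDistribution", root / "system32" / "Catroot2"]
    commands = [["net", "stop", name] for name in SERVICES]
    # ren takes the full source path and a bare new name (suffix is assumed).
    commands += [["cmd", "/c", "ren", str(f), f.name + suffix] for f in folders]
    commands += [["net", "start", name] for name in SERVICES]
    return commands


def run_reset() -> None:
    """Run the commands in order and report each exit code (Windows only)."""
    for command in reset_commands():
        result = subprocess.run(command)
        print(" ".join(command), "->", result.returncode)
```

Printing the exit codes instead of stopping on the first failure is deliberate here, because net stop returns a non-zero code for a service that is already stopped.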

Account Management Event ID 642 Anonymous Logon

There is more than one DC in a domain (DC1 and DC2; DC1 is the PDC Emulator).  An administrator changes an AD user account attribute, e.g. changing a password or unlocking an account, on DC2.

On DC2, two security events (628 (for password reset) and 642) are logged with the administrator user ID.  On DC1 (the PDC Emulator), only one event (642) is logged with NT Authority\Anonymous Logon.

I agree that the event ID 642 on DC1 is created by the replication of the changes to the DC holding the PDC Emulator role.  Sometimes, I also see this happen on a non-PDC Emulator DC.

Research:
http://social.technet.microsoft.com/Forums/en/winserverDS/thread/bf847f47-5637-453a-8752-9b985f8118f7

http://social.technet.microsoft.com/Forums/en/winserverDS/thread/65703372-53a6-434a-a9fb-0ad03ab9132c

Delete File Name Includes An Invalid Name

http://support.microsoft.com/kb/320081
del "\\?\c:\path_to_file_that contains a trailing space.txt "

subinacl /onlyfile "\\?\c:\path_to_problem_file" /setowner=domain\administrator /grant=domain\administrator=F

Or

rmdir /s <drive:><path>
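
The \\?\ prefix used in the del command above tells Windows to skip its usual path normalization, which is what normally strips a trailing space or dot from the name. A small Python sketch of the same idea follows. The helper names are hypothetical, it assumes a local drive path (UNC paths need a different prefix form and are not handled), and delete_file is only meaningful on Windows.

```python
import os

# The extended-length prefix: backslash, backslash, question mark, backslash.
EXTENDED_PREFIX = "\\\\?\\"


def extended_path(absolute_path: str) -> str:
    """Return absolute_path with the extended-length prefix added once."""
    if absolute_path.startswith(EXTENDED_PREFIX):
        return absolute_path
    return EXTENDED_PREFIX + absolute_path


def delete_file(absolute_path: str) -> None:
    """Delete a file whose name Windows would otherwise normalize (Windows only)."""
    os.remove(extended_path(absolute_path))
```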

Linux Guest Different MAC Address Error on VMware vSphere

Converted a Linux (Fedora 5) PC to a VMware vSphere guest.  The Linux guest OS shows a failed message when shutting down interface eth0.

Fix: edit HWADDR in /etc/sysconfig/network-scripts/ifcfg-eth0 to match the MAC address assigned to the Linux guest OS.
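
If several converted guests need the same fix, the HWADDR edit can be scripted. The sketch below works on the text of the ifcfg file only; the function name set_hwaddr is hypothetical, and reading and writing the file (as root) is left to the caller. It assumes the file uses one KEY=value setting per line.

```python
import re


def set_hwaddr(ifcfg_text: str, mac: str) -> str:
    """Return ifcfg text with HWADDR set to mac, adding the line if absent."""
    line = f"HWADDR={mac.upper()}"
    pattern = re.compile(r"^HWADDR=.*$", flags=re.MULTILINE)
    if pattern.search(ifcfg_text):
        # Replace the existing HWADDR line and leave everything else untouched.
        return pattern.sub(line, ifcfg_text)
    # No HWADDR line yet: append one, keeping a trailing newline.
    return ifcfg_text.rstrip("\n") + "\n" + line + "\n"
```

The new MAC address is the one shown for the network adapter in the guest's virtual machine settings.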

Linux Guest Hangs at “Starting udev” on VMware vSphere

Converted a Linux (Fedora 5) PC to a VMware vSphere guest.  The Linux guest OS hangs at “Starting udev”.

Fix:

  1. Restart the Linux guest OS;
  2. Press any key at the GRUB boot menu, press e to edit, and add the highlighted words at the “kernel” line; press enter, and then b to boot;
    kernel /vmlinuz-2.6.18-1.2257.fc5smp ro root=/dev/sdb1 clock=pmtmr divider=10 hgb quiet
  3. Once it boots in the console, edit /boot/grub/grub.conf with the same setting.

Reference: http://itsecureadmin.com/2010/03/linux-guest-hangs-at-starting-udev-vmware-vsphere/
or http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427

Error 2753 The file “???” is not marked for installation

Get this error when trying to remove an application managed by a group policy installer package.

Fix: Install Windows Installer CleanUp Utility (msicuu2) to remove the application.

Symantec Extend WG Protocol Driver Error

Event ID: 7000, Source: Service Control Manager - “The Extend WG Protocol Driver service failed to start due to the following error: The system cannot find the file specified.”

The following steps fix the error message:
1. Open Device Manager
2. Click View > Show hidden devices
3. Expand Non-Plug and Play Drivers and uninstall the Extend WG Protocol Driver
4. Open regedit
5. Delete the key "WGX" in HKLM\SYSTEM\CurrentControlSet\Services
6. Reboot system

Use WinSCP to Transfer Files in vCSA 6.7

This is a quick update on my previous post “ Use WinSCP to Transfer Files in vCSA 6.5 ”. When I try the same SFTP server setting in vCSA 6.7...