Troubleshooting / Debugging


The Standalone VMRC Console and permissions…….

This week I had a customer requirement where the VMRC plugin turned out to be a good solution. The customer wanted to use virtual machines as VPN gateways for connections outside his network, so virtual machines were created and placed in a DMZ. The idea was that his service staff could use these virtual machines to open VPN connections to external customers. One of the problems here is that most VPN clients cut all other connections once the VPN connection is established. This means that RDP or VMware View connections cannot be used, because they are dropped when the tunnel comes up. The service staff should not get a “full” Web Client, so I suggested using the Standalone VMRC Console.

For those who are not familiar with the Standalone VMRC Console, William Lam wrote a good post about it, which can be found here: http://www.virtuallyghetto.com/2014/10/standalone-vmrc-vm-remote-console-re-introduced-in-vsphere-5-5-update-2b.html

After installing and configuring the VMRC Console I created a new role on the vCenter Server. This role had the permissions to interact with the VM console plus a few others, such as stopping and starting the VM.

After that, I set up a group and assigned it the new role with its permissions.

When I was finished, I ran a test with the VMware Web Client and could access the console without any problems. I then created links for the users so that they could connect directly to the virtual machine without starting the VMware Web Client. The links had the form: vmrc://[VC]/?moid=[VM-MOREF-ID]
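If many such links are needed, they can be generated with a few lines of JavaScript. This is only a sketch: the function name buildVmrcLink and the sample values vc01.example.com and vm-42 are made up for illustration; only the URL pattern itself comes from this post.

```javascript
// Hypothetical helper: builds a VMRC link following the pattern
// vmrc://[VC]/?moid=[VM-MOREF-ID] described in this post.
function buildVmrcLink(vcenterHost, vmMoRefId) {
    if (!vcenterHost || !vmMoRefId) {
        throw new Error("vCenter host and VM MoRef ID are both required");
    }
    // No user name is placed in the link, so every user is asked
    // for his own credentials when the console starts.
    return "vmrc://" + vcenterHost + "/?moid=" + encodeURIComponent(vmMoRefId);
}

// Example with made-up values:
var link = buildVmrcLink("vc01.example.com", "vm-42");
// link is "vmrc://vc01.example.com/?moid=vm-42"
```

Leaving the user name out of the link is a deliberate choice here, because it keeps one link usable for every member of the group.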

I didn’t use a user name in the link, so every user could start the VM console with his own user name and password. For me, as administrator of the environment, everything worked as expected…..

We rolled out the solution to the users. They were prompted for a user name and password, but then got an error message.

I did some research but could not find an answer at first…..after a chat with Joerg Lew, he pointed me in the right direction……Thanks Jörg!

When you use the Standalone VMRC Console with user name and password, the VMRC Console redirects the connection to the ESXi host after the initial connection to the vCenter Server. So I had to assign the group and its permissions on the ESXi hosts as well.

After that, the VMRC Console worked like a charm and we could solve the customer requirement without the usage of the VMware Web Client.

So have fun and orchestrate the World 😉


vCO and defensive scripting – practical experience

In the last couple of weeks I ran a lot of customer vCO workshops to enable the customers to create their own solutions. During the workshops I noticed some recurring errors made by the teams.

At some point in a workflow we work with user input. This input can come directly from the user, or from lines read from a file (a CSV, for example). Mostly this input is compared with other information. In this example I will use the ESX hostname.

As you can see I have a VC with a cluster and one ESX host with the name vmware01.example.com

Now we want to check whether a host with a specific name exists. So we use a really simple workflow.

In the scripting element I use this code:


for (var i = 0; i < allHostsofVC.length; i++) {
    // exact, case-sensitive comparison of the user input with the host name
    if (HosttoCheck == allHostsofVC[i].name) {
        System.log("Yeah, we found the Host. Input is " + HosttoCheck + " Value from allHostsofVC[i].name is: " + allHostsofVC[i].name);
    } else {
        System.log("oh no......no host found.Input is " + HosttoCheck + " Value from allHostsofVC[i].name is: " + allHostsofVC[i].name);
    }
}

So, now let’s make a first run.

As we can see, the search string and the hostname are identical.

Now let’s see what happens when the user writes the hostname in uppercase.

As we can see, the name is not matched. This leads to wrong results. The easiest way to avoid this is to convert everything to lowercase before comparing. So let’s modify our scripting element:

for (var i = 0; i < allHostsofVC.length; i++) {
    // compare both sides in lowercase so that the case of the input does not matter
    if (HosttoCheck.toLowerCase() == allHostsofVC[i].name.toLowerCase()) {
        System.log("Yeah, we found the Host. Input is " + HosttoCheck + " Value from allHostsofVC[i].name is: " + allHostsofVC[i].name);
    } else {
        System.log("oh no......no host found.Input is " + HosttoCheck + " Value from allHostsofVC[i].name is: " + allHostsofVC[i].name);
    }
}

Let’s start the last run again:

Although the user input and the returned string differ in case, we get the correct match.

This example leads us to another problem when we have to handle user input for hostnames. Most users tend to write only the short name.

We can check it with the code we used before.

As we can see, we didn’t catch the server. The reason can be seen in the vCenter Server Managed Object Browser (MOB): the server is registered with its FQDN, and from the point of view of the VMware API the short name isn’t available.

How can we solve this problem? It is quite easy……. We just split the registered name at the dots and also compare the input with the first part.

for (var i = 0; i < allHostsofVC.length; i++) {
    // the first element after splitting the FQDN at the dots is the short name
    var HostShortName = allHostsofVC[i].name.toLowerCase().split(".");
    // accept the input if it matches either the FQDN or the short name
    if (HosttoCheck.toLowerCase() == allHostsofVC[i].name.toLowerCase() || HosttoCheck.toLowerCase() == HostShortName[0]) {
        System.log("Yeah, we found the Host. Input is " + HosttoCheck + " Value from allHostsofVC[i].name is: " + allHostsofVC[i].name);
    } else {
        System.log("oh no......no host found.Input is " + HosttoCheck + " Value from allHostsofVC[i].name is: " + allHostsofVC[i].name);
    }
}

Then we also find the host when its name is entered as a short name.
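The checks from this post can also be collected into small reusable functions. The following is just a sketch under a few assumptions: the names hostNameMatches and findHost are made up, the host objects only need a name property (as in allHostsofVC), and trimming surrounding whitespace is an extra precaution that was not part of the workshop example.

```javascript
// Hypothetical helper: true when the user input matches the host name,
// either as FQDN or as short name, ignoring case and surrounding spaces.
function hostNameMatches(userInput, hostName) {
    if (userInput == null || hostName == null) {
        return false;
    }
    // String() makes sure we work on JavaScript strings, whatever the source was
    var input = String(userInput).replace(/^\s+|\s+$/g, "").toLowerCase();
    var fqdn = String(hostName).toLowerCase();
    var shortName = fqdn.split(".")[0];
    return input !== "" && (input === fqdn || input === shortName);
}

// Hypothetical helper: returns the first matching host object, or null.
function findHost(userInput, hosts) {
    for (var i = 0; i < hosts.length; i++) {
        if (hostNameMatches(userInput, hosts[i].name)) {
            return hosts[i];
        }
    }
    return null;
}

// Example with the host from this post:
var hosts = [{ name: "vmware01.example.com" }];
var found = findHost(" VMWARE01 ", hosts);   // matches via the short name
var missing = findHost("vmware02", hosts);   // null, no such host
```

Returning the host (or null) instead of logging inside the loop means a single "not found" message can be written after the search, rather than one line per host that did not match.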

So, I hope this helps some of you to get your workflows created without too many problems. Have fun and orchestrate the world 😉


vCO and the parseInt JavaScript “Bug”

Today I was at a customer site to develop a custom workflow that evacuates one of his DCs with vCO. The evacuation should be triggered by passing only a single string into the workflow. The hosts and the datastores were stretched over both DCs. The ESX hosts in his environment were named vmwareXX, where XX represents a number. The hosts in one DC had odd numbers and the hosts in the other DC had even numbers.

There were hosts with names from

vmware01 to vmware24

So we developed some tasks to get the datastores and the hosts and to assign them to their correct DCs. For this we created two arrays in which the hosts were placed for later actions. The host names were split and the number was stored in a variable as an integer. We did this with the command:

substring = parseInt(Hostname[i].substring(6,8))

Once we had the number, we created a loop with if-else and placed the hosts into the correct array.

var Odd_array = new Array();
var Even_array = new Array();
var substring;

for (var i = 0; i < Hostname.length; i++) {
    // characters 6 and 7 of "vmwareXX" hold the host number
    // (parseInt is called without a radix here)
    substring = parseInt(Hostname[i].substring(6,8));
    System.log("Substring = " + substring + " Hostname = " + Hostname[i]);
    if (substring % 2) {
        // it is odd
        Odd_array.push(Hostname[i]);
        System.log(Hostname[i] + " is in Odd Array");
    } else {
        // it is even
        Even_array.push(Hostname[i]);
        System.log(Hostname[i] + " is in even Array");
    }
}

The above script generates this output.

Take a closer look at the part with the hostnames vmware07 to vmware09. As you can see, we get wrong results. The host vmware07 was in the correct array. The host vmware09 ended up in the even array, which is wrong.

It took some time to figure out that this is a JavaScript parseInt “Bug”: without a radix, older JavaScript engines treat a string with a leading zero as octal, and because 8 and 9 are not valid octal digits, “08” and “09” are parsed as 0. A good explanation of this behavior can be found here:

http://www.breakingpar.com/bkp/home.nsf/0/87256B280015193F87256C85006A6604

So for us the solution was to use

substring = parseInt(Hostname[i].substring(6,8), 10)

as the command, and everything worked as expected.
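Put together, the sorting could look like the sketch below. The function names hostNumber and splitByParity and the extra invalid list are my own additions for illustration; the substring positions and the radix 10 are the ones from this post.

```javascript
// Hypothetical helper: extracts the two-digit number from names like "vmware09".
function hostNumber(hostName) {
    // The radix 10 forces decimal parsing, so "08" and "09" are never read as octal.
    return parseInt(hostName.substring(6, 8), 10);
}

// Hypothetical helper: sorts host names into odd, even and unparsable ones.
function splitByParity(hostNames) {
    var result = { odd: [], even: [], invalid: [] };
    for (var i = 0; i < hostNames.length; i++) {
        var n = hostNumber(hostNames[i]);
        if (isNaN(n)) {
            // names that do not follow the vmwareXX scheme are kept separately
            result.invalid.push(hostNames[i]);
        } else if (n % 2 === 1) {
            result.odd.push(hostNames[i]);
        } else {
            result.even.push(hostNames[i]);
        }
    }
    return result;
}

var sorted = splitByParity(["vmware07", "vmware08", "vmware09"]);
// sorted.odd  is ["vmware07", "vmware09"]
// sorted.even is ["vmware08"]
```

Newer JavaScript engines no longer treat a leading zero as octal in parseInt, but passing the radix explicitly costs nothing and gives the same result on old and new engines.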

Maybe everyone except me already knew this, but the information could be useful for others, so that is why I created this post.

Have fun and orchestrate the World 😉


SRM and the planned migrations…..

Be aware, this post has nothing to do with automation or orchestration. It is only related to VMware Site Recovery Manager and a solution for a problem during a planned migration. Maybe the post is useful for someone else who encounters the same problem…..

Last weekend I supported a customer who had to power down one of his two datacenters. For this, we had to migrate the virtual machines from one DC to the other, and at the end all machines had to be migrated back. From my point of view this should have been an easy thing, because the customer had an SRM implementation. The storage is served by an IBM V7000 and the LUNs are replicated between both datacenters….. The customer had built the recovery plans and tested them before the migration was due…

So from my point of view I expected an easy migration……

After everything was cleared and the users were at home, we started with a “Planned Migration” from Datacenter 1 (DC1) to Datacenter 2 (DC2). This was quite easy, and at the end we created our “Failback Plan” with SRM.

Now it was time for us to have a drink and wait for the power and air-conditioning guys to finish their jobs.

After a few hours it was time for us to migrate the VMs back to DC1…….

The customer had created a recovery plan for each of his two clusters. The first cluster held only the normal VMs; the database VMs were located in the second cluster…..

So we started the plans, and the VMs of the first cluster failed back without any problems…..unfortunately, the VMs of the database cluster could not be relocated……

We got the error:  No host with hardware version ‘9’ and datastore ‘snap-ef732565ae’ which are powered on and not in maintenance mode are available….

So we checked the vSphere Client…..the hosts were online (host version 5.5) and the datastore named ‘snap-ef732565ae’ was also present…..

Really strange……a quick web search led to this VMware documentation (http://pubs.vmware.com/srm-55/index.jsp#com.vmware.srm.admin.doc/GUID-FE6A85EC-B44E-415A-9C5F-1E17BC846119.html), where the problem was described, with the solution to wait 15 minutes before the next try because SRM had cached some old information. So we took a coffee break and started the next try after 20 minutes……unfortunately we had the same problem……

I tried to track down the problem in the logs, but I could not find anything that pointed to the error…..

So we tried a lot of things to finish our “Planned Migration”…..every try needed a lot of time……one of the last things we did was to restart all ESX hosts of the DB cluster…..after all hosts were online again we made the next try and “voilà”, it worked……

In the last week I did a lot of research on this behavior. I found that I could reproduce the error when I powered up the ESXi servers quickly, without any delay. So from my point of view it seems to be a “communication” problem between the ESX hosts of a cluster.

 

So if anyone has the same problem….try a reboot of the ESX Hosts……


Redirecting vCO logs to Syslog (…and other…)

Don’t use this in current vRO versions! Any changes made to the log4j.xml file will be overwritten by Control Center. Use the logging settings in Control Center instead!

The vCO Log Mechanism

VMware vCenter Orchestrator uses log4j (version 1.2) for technical logging. Log entries from the following sources are routed through this library:

  • vCO Server log messages
  • Workflow errors
  • Workflow log messages created via System.log|debug|error|warn(“logtext”) in a scriptable task
  • Action log messages created via System.log|debug|error|warn(“another logtest”)
  • Plugin log messages

In the default settings the log messages are written to the logfiles server.log and scripts-log.log in the folder %INSTALLDIR%\app-server\server\vmo\logs.

However, you can configure the settings and the targets of the log messages in the configuration file log4j.xml in %INSTALLDIR%\app-server\server\vmo\conf.


This configuration file is watched automatically by the vCO server, so there is no need to restart the service after you change the log4j.xml file. Just wait a couple of seconds until you see this message in the server.log:


...[Log4jService$URLWatchTimerTask] Configuring from URL: resource:log4j.xml

(and yes, if you misconfigured it, this might be the last message you see 😈 )

Redirecting to Syslog

Because log4j supports a lot of different targets out of the box, you can easily re-route the log messages to an external syslog server:

1. Configure a new Log Appender in the log4j.xml, and configure the target SyslogHost, the syslog facility and the message layout:

...
<appender name="SYSLOG" class="org.apache.log4j.net.SyslogAppender">
   <param name="SyslogHost" value="192.168.219.213"/>
   <param name="Facility" value="USER"/>
   <param name="FacilityPrinting" value="true"/>
   <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%t %5r %-5p %-21d{yyyyMMdd HH:mm:ss,SSS} %c{2} [%x] %m %n"/>
   </layout>
</appender>
...

2. Route the log messages to this new appender, e.g. for all messages add the new appender-ref in the <root>-section at the end of the file:

...
<!-- ======================= -->
<!-- Setup the Root category -->
<!-- ======================= -->

<root>
   <priority value="INFO"/>

   <appender-ref ref="CONSOLE"/>
   <appender-ref ref="FILE"/>
   <appender-ref ref="SYSLOG" />
</root>
...

3. (Don’t forget to adjust the firewall settings of your vCO-Server and/or your Syslog Host if necessary, the built-in syslog appender uses UDP/514.)

4. See the log messages arriving on your Syslog Host….

For further details about the configuration, see the References section below…

Sending SNMP-Traps

Besides syslog it’s also possible to send log messages as SNMP traps to a monitoring system. For that, vCO already includes an additional log4j library (NOT related to the SNMP plug-in for vCO), and you can use it out of the box with the following appender configuration:


<appender name="TRAP_LOG" class="org.apache.log4j.ext.SNMPTrapAppender">
   <param name="ImplementationClassName" value="org.apache.log4j.ext.JoeSNMPTrapSender" />
   <param name="ManagementHost" value="192.168.219.213" />
   <param name="ManagementHostTrapListenPort" value="162" />
   <param name="EnterpriseOID" value="1.3.6.1.4.1.24.0" />
   <param name="LocalIPAddress" value="vco01-219.vcolab.local" />
   <param name="LocalTrapSendPort" value="161" />
   <param name="GenericTrapType" value="6" />
   <param name="SpecificTrapType" value="12345678" />
   <param name="CommunityString" value="public" />
   <param name="ForwardStackTraceWithTrap" value="true" />
   <param name="Threshold" value="DEBUG" />
   <param name="ApplicationTrapOID" value="1.3.6.1.4.1.24.12.10.22.64" />
   <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d,%p,[%t],[%c],%m%n" />
   </layout>
</appender>

And of course you have to add this appender to the <root>-section:


<appender-ref ref="TRAP_LOG" />

Again, don’t forget to open the Firewall (usually UDP/162).

If you only want to send SNMP traps in rare, specific cases (and not for all log messages), consider using the SNMP plug-in for vCO instead; it contains a pre-built workflow to send SNMP traps…

References

Besides these small examples of additional appenders log4j offers a lot more configuration parameters. For further reading, start here:

Happy logging! 😀