Monday, February 9, 2015

Environment Watcher or how to create a service for handling stuck automation processes

In one of my previous articles I shared some notes on Jenkins plugin development, using a Selenium Grid killer as an example. I also mentioned that just killing the hub or nodes is not enough. Ideally, it should be a fully functional restart trigger.

Let's imagine that we have 2 VMs for test automation purposes. On the first VM we've started a Selenium Grid hub, HAR storage and BrowserMob Proxy; on the second, a Selenium Grid node and a Sikuli server. Sometimes bad things happen and the environment gets stuck for any number of reasons. This may affect the entire automation process, e.g. if we have lots of scheduled jobs. Of course, we can log in to the failed environment and restart the services manually, but that quickly becomes a tedious task, especially during debugging. And what if it fails during a nightly run? For such cases it's useful to have a trigger that restarts all services before a new execution starts. So in this article I'll show you how to create a simple RESTful trigger for handling the situations described above.

I call it Environment Watcher. Technically, we'll have an HTTP server with the following features:
  • Killing common tasks, such as browser instances and their drivers (for Chrome and IE).
  • Killing Java tasks. While browsers and drivers can be found by name in the task manager, the Selenium Grid, BrowserMob Proxy and Sikuli processes are hard to tell apart, as they all have the same name: java.exe.
  • Starting batch files via a command line executor service. Normally, such batches start the Selenium Grid hub / nodes, a remote BrowserMob Proxy and the Sikuli jars.
Let's start with the service implementation. To kill common tasks, which can easily be found by name in the task manager, we'll use the following snippet:


If we run this code from the command line, it will loop through the list of running tasks and kill everything that matches the criteria marked with ?. The question mark is dynamically replaced with the list of names we're looking for.
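As a rough illustration of the idea (not the exact snippet from env-watcher), the command template and the placeholder substitution could be sketched like this. The `TaskKillCommand` class, its template string and the name check are assumptions made for this example; `tasklist`, `findstr` and `taskkill` are the standard Windows tools.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TaskKillCommand {

    // Assumed template: "?" marks the spot for the process names.
    // tasklist prints one CSV row per process, findstr keeps the rows that contain
    // any of the space-separated names, and taskkill force-kills each matching PID
    // (the second CSV column, with its quotes stripped by %~i).
    public static final String TEMPLATE =
            "cmd /c for /f \"tokens=2 delims=,\" %i in "
            + "('tasklist /fo csv /nh ^| findstr /i /l \"?\"') do taskkill /f /pid %~i";

    // Only plain image names are accepted, so nothing else can sneak into the command.
    private static final Pattern SAFE_NAME = Pattern.compile("[A-Za-z0-9._-]+");

    public static String build(List<String> processNames) {
        if (processNames.isEmpty()) {
            throw new IllegalArgumentException("No process names given");
        }
        for (String name : processNames) {
            if (!SAFE_NAME.matcher(name).matches()) {
                throw new IllegalArgumentException("Unexpected process name: " + name);
            }
        }
        // Replace the single "?" placeholder with the space-separated names.
        return TEMPLATE.replace("?", String.join(" ", processNames));
    }

    public static void main(String[] args) {
        System.out.println(build(Arrays.asList("chrome.exe", "chromedriver.exe")));
    }
}
```

The resulting string is what gets handed to the command line executor; building it separately from executing it keeps the substitution easy to check on any OS.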

For Java tasks we'll use a tool called jps (part of the JDK), which lists the running JVM processes together with their PIDs, so we can find the right ones and kill them.
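To make the idea more concrete, here is a minimal sketch of how the output of `jps -l` could be filtered. It assumes each output line has the form `<pid> <main class or jar path>`; the `JpsFilter` class and the sample lines are hypothetical and only serve the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JpsFilter {

    // Returns the PIDs of all JVMs whose jps name contains one of the given fragments.
    public static List<String> matchingPids(List<String> jpsLines, List<String> nameFragments) {
        List<String> pids = new ArrayList<>();
        for (String line : jpsLines) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length < 2) {
                continue; // a PID without a name gives us nothing to match on
            }
            String name = parts[1].toLowerCase();
            for (String fragment : nameFragments) {
                if (name.contains(fragment.toLowerCase())) {
                    pids.add(parts[0]);
                    break;
                }
            }
        }
        return pids;
    }

    public static void main(String[] args) {
        // Hypothetical sample of what `jps -l` might print on a grid VM.
        List<String> sample = Arrays.asList(
                "4120 C:\\grid\\selenium-server-standalone-2.44.0.jar",
                "5312 C:\\proxy\\browsermob-proxy.jar",
                "6004 sun.tools.jps.Jps");
        for (String pid : matchingPids(sample, Arrays.asList("selenium-server"))) {
            System.out.println("taskkill /F /PID " + pid);
        }
    }
}
```

Each returned PID can then be passed to `taskkill /F /PID`, which is how processes that all show up as java.exe can still be killed selectively.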


Now we can create the appropriate endpoints:


In both cases we call a command line executor utility based on the Apache Commons Exec library.
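The executor itself is a thin wrapper around "run this command, wait, report the result". The sketch below shows that shape using the JDK's own `ProcessBuilder` as a stand-in, so it runs without extra dependencies; it is not the Commons Exec based utility, and the `CommandRunner` and `Result` names are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class CommandRunner {

    /** Exit code plus combined stdout / stderr of a finished command. */
    public static final class Result {
        public final int exitCode;
        public final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    public static Result run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        // Output goes to a temp file, so a chatty process can't block on a full pipe.
        Path log = Files.createTempFile("env-watcher", ".log");
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(log.toFile())
                    .start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                // The command is stuck: kill it and report a failure.
                process.destroyForcibly();
                process.waitFor();
                return new Result(-1, "timed out after " + timeoutSeconds + "s");
            }
            return new Result(process.exitValue(), new String(Files.readAllBytes(log)));
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

The timeout matters here: a kill command that hangs would otherwise block the very service that is supposed to unblock the environment.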

To run batches we'll use the same technique, but the root process will be a tool called PsExec instead of cmd. As you may know, if a batch process keeps waiting for user input, it will hang the Java process that called it. That's why PsExec was chosen. It acts as a proxy that starts the batch in a separate process and quits immediately.
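A minimal sketch of how such a PsExec command line could be assembled is shown below. The `PsExecCommand` class and the paths in the usage note are hypothetical; `-d` (don't wait for the process to terminate) and `-accepteula` (skip the license dialog) are regular PsExec switches.

```java
import java.util.ArrayList;
import java.util.List;

public class PsExecCommand {

    // Builds the argument list for starting a batch file through PsExec.
    public static List<String> forBatch(String psExecPath, String batchPath) {
        List<String> command = new ArrayList<>();
        command.add(psExecPath);
        command.add("-accepteula"); // skip the one-time license dialog
        command.add("-d");          // return immediately, don't wait for the batch
        command.add("cmd");
        command.add("/c");
        command.add(batchPath);
        return command;
    }
}
```

For example, `PsExecCommand.forBatch("C:\\tools\\PsExec.exe", "C:\\grid\\start-hub.bat")` yields a list that can be passed to the command line executor as is. Because of `-d`, the executor gets control back right away, even if the batch keeps a console window open.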


These are our key server-side services. Now let's take a quick look at the client side, which is written with Jersey.

Here's an example of how we can use the Java task killer service:


Normally, you'll need to pass a list of task names (the ones you want to kill) to the appropriate method. But I've also added some predefined structures like JavaTask / CommonTask to make it easier to start using the code.


This code will kill any running Selenium standalone instance (hub / node), as well as the listed browsers and their drivers.
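To give a feeling for what such predefined structures and a client call might look like, here is a rough sketch. The enum constants, the `/kill/common` and `/kill/java` paths and the `name` query parameter are all assumptions for illustration, and the sketch only builds the request URI instead of sending it with Jersey.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class WatcherClientSketch {

    // Hypothetical presets; the real CommonTask / JavaTask constants may differ.
    public enum CommonTask {
        CHROME("chrome.exe"), CHROME_DRIVER("chromedriver.exe"),
        IE("iexplore.exe"), IE_DRIVER("IEDriverServer.exe");

        public final String processName;

        CommonTask(String processName) {
            this.processName = processName;
        }
    }

    public enum JavaTask {
        SELENIUM("selenium-server-standalone"), BROWSERMOB("browsermob-proxy"), SIKULI("sikuli");

        public final String nameFragment;

        JavaTask(String nameFragment) {
            this.nameFragment = nameFragment;
        }
    }

    public static URI killCommonTasks(String host, int port, CommonTask... tasks) {
        List<String> names = new ArrayList<>();
        for (CommonTask task : tasks) {
            names.add(task.processName);
        }
        return buildUri(host, port, "/kill/common", names);
    }

    public static URI killJavaTasks(String host, int port, JavaTask... tasks) {
        List<String> names = new ArrayList<>();
        for (JavaTask task : tasks) {
            names.add(task.nameFragment);
        }
        return buildUri(host, port, "/kill/java", names);
    }

    // The preset names contain only URL-safe characters, so no encoding is applied here.
    private static URI buildUri(String host, int port, String path, List<String> names) {
        StringBuilder query = new StringBuilder();
        for (String name : names) {
            query.append(query.length() == 0 ? "?" : "&").append("name=").append(name);
        }
        return URI.create("http://" + host + ":" + port + path + query);
    }
}
```

The presets spare the caller from remembering exact process names, while custom names would still need a method that takes plain strings and encodes them properly.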

Now we have a client-server solution that can kill and restart anything on a remote VM, so it's time to update our Jenkins plugin.


Some new entries were added to our jelly configs. Besides the features listed above, I've also added the ability to change the hub IP for connected nodes dynamically (in their json config files). This may be useful if you pass the hub IP from a Jenkins parameter into your Java code through an environment variable, since modifying the configs on all VMs manually would be tedious. So I've put an appropriate trigger into our Environment Watcher service. A window minimization feature was added as well: sometimes, after the processes are restarted, a browser opens behind the command line windows, which may cause failures when using image recognition tools like SikuliX.
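The hub IP update could be sketched as a simple text replacement in the node's json file. This example assumes the config stores the hub address under a `hubHost` key, as Selenium 2 node configs typically do; the `HubHostUpdater` class is hypothetical, and a real implementation may prefer a proper JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HubHostUpdater {

    // Matches `"hubHost": "<anything up to the closing quote>`; the key part is kept as is.
    private static final Pattern HUB_HOST =
            Pattern.compile("(?<key>\"hubHost\"\\s*:\\s*\")[^\"]*");

    // Returns the config text with the hubHost value replaced by the new IP.
    public static String withHubHost(String nodeConfigJson, String newHubIp) {
        return HUB_HOST.matcher(nodeConfigJson)
                .replaceAll("${key}" + Matcher.quoteReplacement(newHubIp));
    }
}
```

The caller reads the node config from the given path, passes its content through `withHubHost` and writes the result back before the node is restarted.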

The updated plugin's UI looks like this:


Normally, you'll need to specify a valid watcher ip / port and check the corresponding options if you want to kill common / Java tasks. Additionally, you'll need to set a path to the batch file that will start the killed processes again. Optionally, you can update the node's json file at a given path with a new hub ip.

So now you can use this plugin to restart your environment, for example, before a new job execution. As a result, you'll see the following output in the Jenkins console log:


As usual, you can find env-watcher and selenium-utils sources on GitHub.