My first question is, is this likely to improve things? I'm imagining that having several browser windows being remote controlled at once is causing them to "trip over" one another, and having each be the only window in its display/node would avoid this. Does this have any basis in fact?
My second question is, how would this be possible? I can set up 4 Xvfb displays, fine, and I can create 4 Selenium nodes, but how do I make each Selenium node "attach" to its own display? Also, is there currently a way (short of manually tweaking the behat.yml files) to tell different runs to use different Selenium nodes? It looks like the behat_parallel_run options in config.php only allow general config for all runs, not per-run config.
I'm glad you've raised this Mark, as I've been having very similar problems https://moodle.org/mod/forum/discuss.php?d=327141 - running locally without xvfb or running with parallel off seems to work OK.
With parallel + xvfb randomly approx 20-30 tests out of 1100 fail each time (but with no real consistency about which tests fail).
I had been considering trying xvnc instead, as I'd come across some suggestions that this might work better with selenium, but I've not had time (yet) to experiment further.
I have an idea of how this might work: create the displays, set $DISPLAY to a different value before launching each node, then launch behat. I'll test it out and report back.
I tried the following:
- Do a behat init with --parallel=4
- Create 4 Xvnc displays (:10, :11, :12, :13)
- Run a Selenium hub on port 4444
- export DISPLAY=:10 and run a Selenium node on port 5510
- Repeat the previous step for the other displays, using a different port for each node
- Edit the 4 behat.yml files to change the Selenium ports as required
- Run behat
However, this doesn't work, since running behat seems to rewrite the behat.yml files with the original port number, so it thinks Selenium isn't running at all.
Thanks for raising this Mark,
First I will start with how many parallel runs you should use to get the best run time. As you know, you can initialise any number of runs for behat and run them in parallel, but it's recommended to have max parallel runs = cpu_cores - 1 (1 less than the number of CPU cores), so each run gets a CPU.
To answer your first question, there are a few factors which can reduce the random fails. FYI: it's hard to get zero random fails all the time, as it depends a lot on your system resources.
- Ensure the maximum number of parallel runs is no more than the number of CPU cores - 1. (Personally, I find 3 parallel runs optimal.)
- Use xvfb, xvnc or docker to avoid displaying the browser on your screen, which might lead to focus fails if you are working on the system.
- Finally, use --rerun to eliminate any random fails in each run. (Will share the code which we use later.)
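The "CPU cores - 1" rule above can be sketched in shell. This is only an illustration: max_parallel_runs is a hypothetical helper name, nproc is assumed to be available (GNU coreutils), and init.php is assumed to live alongside the run.php script mentioned later in this thread. The command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: choose the number of parallel Behat runs as (CPU cores - 1), never below 1.
# max_parallel_runs is a hypothetical helper, not part of Moodle.
max_parallel_runs() {
    cores=$1
    runs=$((cores - 1))
    if [ "$runs" -lt 1 ]; then
        runs=1
    fi
    echo "$runs"
}

# nproc reports the available cores; fall back to 2 if it is not installed.
cores=$(nproc 2>/dev/null || echo 2)

# Print (do not execute) the init command with the computed run count.
echo "php admin/tool/behat/cli/init.php --parallel=$(max_parallel_runs "$cores")"
```

On a 4-core server this would suggest 3 runs, which matches the number recommended above.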
To run multiple Selenium servers on different displays and have the behat parallel runs use them, do the following:
- Set $CFG->behat_parallel_run in your config.php (before including setup.php), with a wd_host for each run. For more information about other params, refer to https://docs.moodle.org/dev/Acceptance_testing#Advanced_usage
- Start a Selenium server for each run, each on its own display and port:
    xvfb-run -a java -jar PATH_TOSELENIUM/selenium-server.jar -port 4445
    xvfb-run -a java -jar PATH_TOSELENIUM/selenium-server.jar -port 4446
Even with the above approach, I have observed a few random fails at times. That said, dividing the runs between docker instances works 99% of the time without any problem. You can use "php admin/tool/behat/cli/run.php --fromrun=2 --torun=2" to run a specific run on a docker instance.
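As a rough sketch of splitting the runs up this way, the loop below prints one run.php command per parallel run, using the --fromrun/--torun options quoted above. print_run_commands is a hypothetical helper name, and how each command is handed to a docker instance is left out, since that depends on your setup.

```shell
#!/bin/sh
# Sketch: print one run.php invocation per parallel run (1..TOTAL).
# print_run_commands is a hypothetical helper; nothing is executed here.
print_run_commands() {
    total=$1
    i=1
    while [ "$i" -le "$total" ]; do
        echo "php admin/tool/behat/cli/run.php --fromrun=$i --torun=$i"
        i=$((i + 1))
    done
}

# Example: three initialised runs, one command each.
print_run_commands 3
```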
Finally, the script which I use is https://gist.github.com/rajeshtaneja/475d923fd482311c1544, where I get the exit codes and check the rerun files to run behat one run after the other in sequence to eliminate random fails.
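The general idea of acting on exit codes can be illustrated with a small retry wrapper. This is not the linked gist: run_with_retries is a hypothetical helper that simply repeats the whole command until it exits with 0, whereas the gist also inspects the rerun files so that only the failed scenarios are run again.

```shell
#!/bin/sh
# Sketch: run_with_retries MAX CMD [ARGS...]
# Runs CMD until it exits with status 0 or MAX attempts have been used.
# Hypothetical helper for illustration only.
run_with_retries() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

It could be called as, for example, run_with_retries 2 php admin/tool/behat/cli/run.php --fromrun=2 --torun=2, with the wrapper's own exit status telling you whether any attempt passed.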
Thanks Rajesh, that's immensely valuable information. Part of the problem I'm currently facing is that about 50% of the time, one of the threads will crash, leaving a run incomplete.
I currently do 4 parallel runs on a 4-core server, so I'll reduce that to 3 as that may be causing the instability. I already follow that with a single-thread rerun, and am working on rerunning any failures from that in isolation.
I'll take a look at doing the separate selenium nodes on Xvnc. I suspect the method for that will be to add the selenium command to the ~/.vnc/xstartup file, and change the port number before each vncserver command.
OK, I'm making progress. We run a Selenium hub process and have a script (selenium_node.sh) for creating a new node, with the port number corresponding to the current $DISPLAY number.
To achieve parallel runs in separate Xvnc sessions, I've added a call to selenium_node.sh in the ~/.vnc/xstartup file (this is the file run by vncserver on startup, but its location may differ on other distros). I run an instance of vncserver (which launches Xvnc) for each run, and the config.php then has the $CFG->behat_parallel_run array described by Rajesh added.
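A minimal sketch of what a script like selenium_node.sh might do is shown below. It is not the actual script: the 5500 + display number mapping is an assumption based on the earlier ":10 and port 5510" example, display_to_port and node_command are hypothetical names, and the -role/-hub/-port flags are those of the Selenium 2-era standalone server, so check them against your version. The command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: derive a Selenium node port from an X display name.
# Assumed mapping: display :N -> port 5500 + N (e.g. :10 -> 5510).
display_to_port() {
    num=${1#*:}      # drop everything up to the colon (":10", "host:10")
    num=${num%%.*}   # drop an optional ".screen" suffix (":10.0")
    echo $((5500 + num))
}

# Print the command that would register a node with a hub on port 4444.
node_command() {
    echo "java -jar selenium-server.jar -role node -hub http://localhost:4444/grid/register -port $(display_to_port "$1")"
}

# In xstartup the argument would be "$DISPLAY"; a fixed value is used here.
node_command ":10"
```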
This works with Firefox: I can VNC in to the 2 separate displays and watch the two windows run separate tests. However, I can't get this to work with Chrome (we find Chrome significantly faster for running Behat tests). This is due to the fact that the wd_host setting in the per-run behat.yml is only replaced in the default section, not in the chrome section. Is it possible to have it set for Chrome/other profiles as well?
Thanks, good to know that you managed to run FF.
If you remove wd_host from your chrome profile then it will work for chrome as well. What happens is that if you define wd_host in a profile, it takes precedence over the default wd_host or the one defined in $CFG->behat_parallel_run.
    $CFG->behat_config = array(
        'chrome' => array(
            'extensions' => array(
                'Behat\MinkExtension\Extension' => array(
                    'selenium2' => array(
                        'browser' => 'chrome',
                    ),
                ),
            ),
        ),
    );
Hope this helps.
Unfortunately, I've implemented my reruns in a bit of a silly way which makes this a bit problematic. Hopefully refactoring in line with how you've done it will fix this.
In general, I've found that reducing my parallel runs to 3 and implementing reruns has created a much more reliable test process. Thanks very much for your help.
I got it working, thanks for your help. Looking at your scripts was particularly useful.
I am now experiencing another issue with Selenium appearing to die in between re-runs. However, I'll start a separate thread to discuss that.