Mars Rovers

March 2021

The setup

The rover has four sensors: two angled 45° from directly ahead and two perpendicular to the direction of travel. The rover moves directly forward at constant velocity, so the challenge is entirely one of steering. Each sensor value (SL, SR, SFL, SFR) is 0 if its raycast returns no hit; otherwise it is 1 - distance, where distance lies between 0 and 1.
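As a minimal sketch of that sensor encoding in Python: the function name sensor_value and the use of None to represent a raycast miss are assumptions made for illustration, not details from the original project.

```python
from typing import Optional


def sensor_value(hit_distance: Optional[float]) -> float:
    """Map a raycast result to a sensor value in [0, 1].

    hit_distance is None when the ray hits nothing (an assumed convention);
    otherwise it is a distance already expressed in the range [0, 1].
    """
    if hit_distance is None:
        return 0.0  # no wall within range
    # Clamp defensively so an out-of-range distance cannot leave [0, 1].
    d = min(max(hit_distance, 0.0), 1.0)
    return 1.0 - d  # closer walls give larger values
```

With this encoding a wall touching the sensor reads 1, a wall at the edge of the sensor range reads 0, and a miss also reads 0.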

The final steering direction for the car is the sum of the products of these sensor values and their respective weights, which are learned over successive generations.
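The weighted sum could be written roughly as follows. The dictionary layout, the function name steering, and the example weights are hypothetical, and the sign convention (positive meaning steer one way, negative the other) is an assumption.

```python
SENSORS = ("SL", "SR", "SFL", "SFR")


def steering(values: dict, weights: dict) -> float:
    """Return the steering signal as the sum of weight * sensor value."""
    return sum(weights[name] * values[name] for name in SENSORS)


# Illustrative numbers only: a wall is close on the left, nothing on the right.
example_values = {"SL": 0.5, "SR": 0.0, "SFL": 0.25, "SFR": 0.0}
example_weights = {"SL": 1.0, "SR": -1.0, "SFL": 0.5, "SFR": -0.5}
example_steer = steering(example_values, example_weights)  # 0.5*1.0 + 0.25*0.5 = 0.625
```

Because the rover has no other control, the four weights are the whole "brain": learning to drive means finding four numbers.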


Training

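The cars were trained so that, on each iteration, the newest batch inherited its sensor weights from the most successful (longest-surviving) car and randomly adjusted them by an amount inversely proportional to the parent's success. Strictly speaking, this is a random, mutation-based search rather than gradient descent. Shrinking the mutations as the parent improves makes the search more efficient: large steps early on, finer adjustments once a good set of weights has been found.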
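A minimal sketch of one such generation step, under stated assumptions: the mutation is additive and uniform, survival time is the success measure, and the names mutate, next_generation, base_scale and batch_size are all hypothetical. The original project may have perturbed the weights differently.

```python
import random


def mutate(parent_weights, parent_survival, rng, base_scale=1.0):
    """Copy the parent's weights with random noise added.

    The noise range shrinks as the parent survives longer
    (inversely proportional to its success).
    """
    scale = base_scale / max(parent_survival, 1e-9)  # guard against division by zero
    return [w + rng.uniform(-scale, scale) for w in parent_weights]


def next_generation(population, survival_times, rng, batch_size=10):
    """Breed a new batch from the longest-surviving car of the current one."""
    best = max(range(len(population)), key=lambda i: survival_times[i])
    parent, success = population[best], survival_times[best]
    return [mutate(parent, success, rng) for _ in range(batch_size)]


rng = random.Random(0)  # seeded so the run is repeatable
population = [[0.0] * 4, [1.0] * 4]  # two cars, four sensor weights each
children = next_generation(population, [2.0, 10.0], rng, batch_size=5)
```

Here the second car survived longest, so every child starts from its weights and moves each one by at most 1/10 = 0.1.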

Displaying the most successful car every third iteration:


Showing all offspring cars on each iteration:


Over generations, the weights on narrow tracks all seemed to converge to a push away from the walls. On wider tracks, interestingly, the left and right sensor weights somewhat inverted, while the front-left and front-right sensor weights stayed similar. This led to a sort of wall-hugging strategy.


Application

To show that this is applicable to the real world, I used the trained narrow-track sensor weights on a Lego car. The sensor distances all had to be scaled so that they mapped onto the simulation units, and I introduced some new logic to account for the moving sensor on the car. With those changes, the weights worked well.
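The scaling step might look something like the sketch below. The function name to_sim_value, the centimetre units, and the 50 cm maximum range are assumptions for illustration; the real sensor's range and the extra logic for the moving sensor are not described here and are not reproduced.

```python
def to_sim_value(distance_cm, max_range_cm=50.0):
    """Convert a real distance reading into the simulation's sensor value.

    Readings at or beyond max_range_cm (or missing readings, passed as None)
    are treated like a raycast miss and map to 0.
    """
    if distance_cm is None or distance_cm >= max_range_cm:
        return 0.0
    d = max(distance_cm, 0.0) / max_range_cm  # scale into simulation units [0, 1)
    return 1.0 - d
```

Choosing max_range_cm so that it corresponds to the simulated ray length keeps the trained weights meaningful on the physical car.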