"""
Demo of robot learning with a neural network
CS 151, Tuesday November 15, 2005

This example uses a neural network to control a robot after it has been
trained by a teacher.  The teacher is represented by the teacher() method,
which returns a motor action (speed and rotation) in response to the robot's
current sensory state and its previous motor action.  This action is used as
the training signal for the neural network.  After training, the neural
network itself is used as the source of motor actions.
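
In terms of the conx library used below, the two modes correspond to two
calls made in the step() method (this is only a sketch; in the real code the
state is the list of scaled sonar readings plus the previous speed and
rotation):

    # training mode: one backprop step toward the teacher's target action
    error, c, t = self.net.step(input=state, output=self.teacher())
    # control mode: propagate the state forward and use the outputs as the action
    speed, rotate = self.net.propagate(input=state)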

The network has 18 input units: 16 for the sonar sensors plus two for the
robot's previous motor action (speed and rotation).  It has two outputs
(speed and rotation) that specify the motor response of the robot to its
current input state.  It also has a hidden layer consisting of 5 units.
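
The setup() method below builds this topology with conx roughly as follows
(the actual code reads the number of sonar sensors from the robot rather
than hard-coding 16):

    net = Network()
    net.addLayer('input', 16 + 2)   # 16 sonar readings + previous speed and rotation
    net.addLayer('hidden', 5)
    net.addLayer('output', 2)       # speed and rotation
    net.connect('input', 'hidden')
    net.connect('hidden', 'output')
    net.initialize()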

We will train the robot to move forward and slow down as it approaches a
wall.  We will then test the network's ability to generalize by placing the
trained robot in a situation it was never trained on, namely, starting very
close to a wall.  The result is that the robot backs away from the wall,
even though it was never trained to back up.

Running the demo:

1. Start Pyro and the Pyrobot simulator and load in the Room.py world.

2. Connect to the PyrobotRobot60000 robot.

3. Load in the NNDemo.py brain (this file).

4. Turn learning off by typing self.net.setLearning(0) at the Pyro
   command line in the interaction window.  Position the robot near the
   south wall facing north and click Run.  When learning is off, the neural
   network is in control of the robot, but since it hasn't been trained
   yet, the robot won't do much at this point.  Try this from various
   other places in the environment, including near walls and in corners.

5. Now click Stop and turn learning on by typing self.net.setLearning(1).

6. Position the robot near the south wall of the world facing north and
   click Run.  This will begin training.  The teacher will drive the robot
   forward while the network's weights are updated using the teacher's
   motor actions (returned by the teacher method) as training targets.
   When the robot stops near the north wall, click Stop.

7. Repeat step 6 several times until the error produced by the network
   (shown in the Pyro interaction window) approaches zero for the duration
   of the robot's travel.

8. Now turn learning off by typing self.net.setLearning(0) at the Pyro
   command line in the interaction window.

9. Position the robot near the south wall again and click Run.  This time
   the robot should respond by moving forward and then slowing down as it
   approaches the wall.  Try this from various other places in the environment
   (you don't need to click Stop while running with learning off, since the
   weights of the network aren't being updated).  Notice that when the robot
   is placed very close to the middle of the north wall, it backs up, even
   though it was never explicitly trained to do so.  This shows that the
   network has learned to generalize by responding to novel situations in a
   reasonable way.  Furthermore, the motion produced by the neural network is
   smoother than the teacher's, because the network's output speeds vary
   along a continuum rather than jumping between the teacher's few discrete
   values.

10. You can try out a set of pretrained weights by loading them from the file
    'trained-robot.wts'.  These weights were created by repeating step 6 above
    15-20 times and then calling the saveWeightsToFile method.  To load in the
    weights, do this:  self.net.loadWeightsFromFile('trained-robot.wts')
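
To keep weights from your own training runs, you can save them in the same
way from the Pyro command line using the saveWeightsToFile method (the
filename here is just an example):

    self.net.saveWeightsToFile('my-robot.wts')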

"""
from pyrobot.brain import Brain
from pyrobot.brain.conx import *

class NNDemo(Brain):

   def setup(self):
      # create the network
      self.sensorCount = self.robot.range.count
      self.net = Network()
      self.net.addLayer('input', self.sensorCount + 2)
      self.net.addLayer('hidden', 5)
      self.net.addLayer('output', 2)
      self.net.connect('input', 'hidden')
      self.net.connect('hidden', 'output')
      self.net.initialize()
      self.net.setEpsilon(0.5)
      self.net.setMomentum(0.1)
      # start with learning on
      self.net.setLearning(1)
      self.counter = 0
      # maximum possible sensor reading
      self.maxvalue = self.robot.range.getMaxvalue()
      # current sensor readings, scaled to the range 0...1
      self.sensors = [self.scale(s.distance()) for s in self.robot.range['all']]
      self.speed = 0.0
      self.rotate = 0.0

   # scales a value from the range 0...maxvalue to the range 0...1
   def scale(self, value):
      return (value / self.maxvalue)           

   # the teacher
   def teacher(self):
      # read sensors
      left = min([s.distance() for s in self.robot.range['front-left']])
      right = min([s.distance() for s in self.robot.range['front-right']])

      # default motor action is full speed ahead with no rotation;
      # speed and rotation are expressed in the range 0...1, which step()
      # later rescales to -0.75...+0.75 (so 0.5 means no motion)
      targetSpeed  = 1.0   # full speed ahead
      targetRotate = 0.5   # no rotation
      if left < 1.0 or right < 1.0:
         # stop if too close to a wall (0.5 rescales to zero speed)
         targetSpeed = 0.5
      elif left < 1.8 or right < 1.8:
         # slow down if approaching a wall
         targetSpeed = 0.75

      return [targetSpeed, targetRotate]

   # main driver loop
   def step(self):
      # remember previous sensory state and motor actions
      previousState = self.sensors + [self.speed, self.rotate]
      # get new sensor readings
      self.sensors = [self.scale(s.distance()) for s in self.robot.range['all']]

      if self.net.learning:
         # get teacher signal to use for training
         target = self.teacher()
         # train the network on this input/target pair; net.step returns
         # the error plus two values we don't use here
         error, c, t = self.net.step(input=previousState, output=target)
         if self.counter % 10 == 0:
            print "error = %.2f" % error
         # use training signal as actual motor action
         self.speed, self.rotate = target

      else:
         # feed in the current state to the network
         currentState = self.sensors + [self.speed, self.rotate] 
         output = self.net.propagate(input=currentState)
         # use network's output as actual motor action
         self.speed, self.rotate = output
         if self.counter % 10 == 0:
            print self.speed, self.rotate
      # scale speed and rotation to the range -0.75...+0.75 and move robot
      self.robot.move(1.5*self.speed-0.75, 1.5*self.rotate-0.75)
      self.counter = self.counter + 1

      
def INIT(engine):
   return NNDemo('NNDemo', engine)
