Parameter learning problem

Thu Oct 16 09:30:57 EDT 1997

Good questions again

The reason for separating q from r and a from b is to enable the rG - b
discounting of subgoals.  The idea here is that a subgoal is only worth
as much as the goal is worth once the subgoal has been achieved.  Thus,
we need the downstream quantities r and b.  I think in 2.0 we used to
assign PG - C as the value of the subgoal, but this was too severe since
some of the P and some of the C were associated with the subgoal itself.

I still like the idea, however, of eliminating the q parameter and
letting just a represent the cost of the production (and of any
subgoal it spawns).  Then the discounting of the subgoal becomes
effectively PG - b, since with q = 1 we have P = r.  For example, suppose

q = 1 (proposal)
r = .6
a = 10
b = 2
G = 20

The old proposal would have assigned a value of .6 * 20 - 10 - 2 = 0 to
the subgoal, and it would be immediately abandoned.  The current proposal
would assign a value of .6 * 20 - 2 = 10, since that is the expected value
of the goal after the subgoal has been completed.
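
To make the arithmetic easy to check, here is a minimal Python sketch of
the two valuations.  The function names are made up for illustration,
and it assumes P = q * r and C = a + b, which is what the numbers in the
example imply; it is not code from the system itself.

```python
def old_subgoal_value(q, r, a, b, G):
    """Old scheme: value the subgoal at PG - C.

    Assumes P = q * r and C = a + b (inferred from the example above).
    """
    return q * r * G - (a + b)


def downstream_subgoal_value(r, b, G):
    """Current proposal: value the subgoal at rG - b.

    Only the downstream quantities r and b discount the goal value.
    """
    return r * G - b


# Numbers from the example: q = 1, r = .6, a = 10, b = 2, G = 20.
q, r, a, b, G = 1.0, 0.6, 10.0, 2.0, 20.0
old_value = old_subgoal_value(q, r, a, b, G)
new_value = downstream_subgoal_value(r, b, G)
print(f"old: {old_value:.1f}  new: {new_value:.1f}")
```

With these inputs the two functions should give the 0 and 10 worked out
above; the difference between them is exactly the current cost a.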