Autonomous Agent Response Learning by a Multi-Species Particle Swarm Optimization. By Chi-kin Chow and Hung-tat Tsui. IEEE 2004.
- Autonomous agents adapt to their environment by adjusting their response behavior, which can be represented as a vector function of observations from the environment (p = R(o), where o is the observation vector).
- A continuous representation of the response function is more relevant for real-world dynamics than weight tuning with Reinforcement Learning and Hidden Markov Models.
- Agents extract their response from a tuned award function, A(o, r), and that extraction can be framed as a multi-objective optimization problem (that's where MPSOs come in!).
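To make the "extract the response from the award function" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy award function `award` (which peaks at r = sin(o)), the helper `extract_response`, and the brute-force grid search, which stands in for the paper's optimizer. The point is only that each observation o defines its own objective over r, so a set of observations gives a set of objectives.

```python
import math

def award(o, r):
    """Toy award function A(o, r). Hypothetical: it peaks at r = sin(o)."""
    return math.exp(-(r - math.sin(o)) ** 2)

def extract_response(o, candidates):
    """Return the candidate response with the highest award for observation o."""
    return max(candidates, key=lambda r: award(o, r))

# Candidate responses on a grid over [-2, 2] with step 0.01 (an assumption).
candidates = [i / 100.0 for i in range(-200, 201)]

# One objective per observation: the best response differs for each o.
responses = {o: extract_response(o, candidates) for o in (0.0, 0.5, 1.0)}
r_star = responses[0.5]
```

A grid search scales poorly with the dimension of r, which is one reason a swarm-based optimizer is attractive for this problem.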
What they did
- Defined their response as a Gaussian Mixture Model
- Generated a set of observation-response (O-R) samples whose award value is greater than or equal to that defined in training.
- A Local Award Function (LAF) is defined as A_o(r) = A(o, r). This is a decomposed award function based on the observations of the O-R samples. "By optimizing the LAF set, the response of O-R samples can be determined".
- "... the response learning algorithm can be formulated as a multi-objective optimization problem in which the optima are correlated".
- With the responses from the optimized LAF set, a response network is generated by training a support vector machine on the O-R samples.
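The LAF step above can be sketched roughly as follows. This is not the paper's algorithm: these notes do not say how the species interact, so the sketch simply runs one independent, plain PSO swarm per observation and ignores the correlation between optima. The toy `award` function, the helper `pso_maximize`, and all parameter values (inertia 0.7, acceleration 1.5, 20 particles, 100 iterations) are assumptions.

```python
import math
import random

def award(o, r):
    """Toy award function A(o, r). Hypothetical: it peaks at r = sin(o)."""
    return math.exp(-(r - math.sin(o)) ** 2)

def pso_maximize(f, lo, hi, n_particles=20, n_iters=100, seed=0):
    """Maximize a 1-D function f on [lo, hi] with a basic global-best PSO."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_val = [f(x) for x in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            # Pull each particle toward its own best and the swarm's best.
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (pbest[i] - pos[i])
                      + c2 * rng.random() * (gbest - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # clamp to bounds
            val = f(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i], val
    return gbest

# One LAF per observation: A_o(r) = A(o, r) with o held fixed.
observations = [0.0, 0.5, 1.0, 1.5]
samples = [(o, pso_maximize(lambda r, o=o: award(o, r), -2.0, 2.0))
           for o in observations]
```

The resulting `samples` list of (o, r) pairs plays the role of the optimized O-R set; in the paper, pairs like these are what the support vector machine is trained on to produce the response network.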