2011-10-23

Reward/Punishment System

I've been wondering about the definition of consciousness for a while, so lately I've been thinking about a physical system that could be used to explore the topic.


description

The goal is to provide a rich environment which models behaviour based on reactions to abstract communication channels. The environment is an open system consisting of objects, actors and the communications between them. The basic unit is a communication, of which there can be N. Communications are fed into the system, where they are modified by objects and received by actors.

Objects are completely predictable, always providing the exact same communication output for the same communication input. Actors have an initial state that defines attractive and unattractive patterns of communication, as well as a memory. Rewarding patterns are dampened into insignificance over time, but this is mitigated by generating new rewards based on combinations of previous rewards.
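The dampening and recombination of rewards could be sketched roughly as follows. This is only an illustration under my own assumptions: the `RewardTable` name, the multiplicative `decay` factor, and the fixed starting strength for a new combination are all hypothetical choices, not part of the design above.

```python
from itertools import combinations


class RewardTable:
    """Hypothetical store of reward strengths, keyed by pattern (a tuple of names)."""

    def __init__(self, initial, decay=0.5):
        self.strength = dict(initial)  # pattern -> current reward strength
        self.decay = decay             # assumed: each exposure multiplies strength by this

    def experience(self, pattern):
        """Return the reward felt for a pattern, then dampen it for next time."""
        felt = self.strength.get(pattern, 0.0)
        if pattern in self.strength:
            self.strength[pattern] = felt * self.decay
        return felt

    def combine(self, recent, initial_strength=1.0):
        """Create fresh rewards from pairs of previously rewarding patterns."""
        created = []
        for a, b in combinations(recent, 2):
            combo = tuple(sorted(a + b))
            if combo not in self.strength:
                # Assumed: a new combination starts at a fixed strength.
                self.strength[combo] = initial_strength
                created.append(combo)
        return created


table = RewardTable({("light",): 1.0, ("sound",): 1.0})
first = table.experience(("light",))    # full reward
second = table.experience(("light",))   # dampened reward
new_patterns = table.combine([("light",), ("sound",)])
```

Repeated exposure to the same simple pattern pays less each time, while the combined pattern starts fresh, which is the incentive to look for more complex rewards.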

The point of this experiment is to provide a definition for consciousness. A conscious being is one that can utilize:

  • Introspection of its internal state to recognize what is a reward and what is punishment.

  • Prediction of potential reward from complex patterns that provide motivation to avoid simplistic behaviours in expectation of greater rewards.

With this definition, consciousness is a continuum based primarily on how deep an actor's memory stack is, allowing for more and more complex rewards. It also implies that more complex consciousness requires a more complex environment.

object model

Communication. A communication is a high-level model intended to mimic anything that can affect the senses. Light, sound and odor are communications. Some physical modeling would be necessary to do this correctly -- for example, odor requires at least a basic means of dispersion around the environment.

  • ID. An arbitrary assignment (i.e. Light = 1, Sound = 2 etc.).

  • Value. The current value (or range) of the communication. For example, it can mimic the wavelength of light or sound. This is necessary because, by definition, actors react to communications, not objects, so any interesting property an object might have must be encoded in the communication.

  • Amount (%). The amount of communication present in the current signal. For example, as odor disperses, the amount declines, giving actors that are rewarded by certain odors an incentive to seek the source.

  • Rate. Speed at which the communication traverses the environment.

  • Falloff. Amount of dampening from the environment.
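As a rough sketch, the five fields above map naturally onto a small record type. The field names, the 0-100 scale for the amount, and especially the exponential decay used in `after_travel` are my assumptions for illustration; the description only says that falloff is the amount of dampening from the environment.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Communication:
    comm_id: int    # arbitrary type ID, e.g. 1 = light, 2 = sound
    value: float    # e.g. a wavelength
    amount: float   # percentage of the signal present, 0-100
    rate: float     # distance travelled per time step
    falloff: float  # assumed: fraction of the amount lost per unit of distance


def after_travel(comm, steps):
    """Return the communication as it would arrive after `steps` time steps.

    Assumes exponential dampening with distance, which is one simple
    choice among many possible dispersion models.
    """
    distance = comm.rate * steps
    remaining = comm.amount * (1.0 - comm.falloff) ** distance
    return replace(comm, amount=remaining)


ODOR = 3  # hypothetical ID
odor = Communication(comm_id=ODOR, value=0.0, amount=100.0, rate=1.0, falloff=0.5)
nearby = after_travel(odor, 1)
farther = after_travel(odor, 2)
```

Because the amount shrinks with distance, an actor comparing `nearby.amount` with `farther.amount` has a gradient it can follow back to the source.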

Object. An object is any inert element of the environment which can manipulate communications. Objects aren't inert in the sense that they don't move, but in the sense that they have no internal state, and will always provide the exact same output for any given input.

  • Communication. 1...N.
  • Reflect (%). Amount of the communication that gets reflected back at the source.

  • Emit (%). Amount of the communication that is emitted.

  • Filter. A way of modifying incoming communication. For example, an object can be classified as having the color red if it reflects some amount of the incoming communication designated as light, but filters out all values considered to be outside the range of red.
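The red object example can be sketched as a pure function of the incoming communication. The class and field names here are hypothetical, the filter is simplified to a single pass range on the value, the 620-750 range for red is only an illustrative figure, and emission is left out to keep the sketch short.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Communication:
    comm_id: int
    value: float   # e.g. wavelength of light
    amount: float  # percent, 0-100


@dataclass(frozen=True)
class InertObject:
    comm_id: int                   # which communication this rule applies to
    reflect: float                 # percent of the incoming amount reflected
    passband: Tuple[float, float]  # filter: values outside this range are removed

    def respond(self, incoming: Communication) -> Optional[Communication]:
        """Depends only on the input: no internal state is read or written."""
        if incoming.comm_id != self.comm_id:
            return None
        low, high = self.passband
        if not (low <= incoming.value <= high):
            return None  # filtered out entirely
        return replace(incoming, amount=incoming.amount * self.reflect / 100.0)


LIGHT = 1  # hypothetical ID
red_ball = InertObject(comm_id=LIGHT, reflect=80.0, passband=(620.0, 750.0))

red_light = Communication(LIGHT, 650.0, 100.0)
green_light = Communication(LIGHT, 530.0, 100.0)
seen = red_ball.respond(red_light)
unseen = red_ball.respond(green_light)
```

Calling `respond` twice with the same communication always returns the same result, which is what makes the object inert in the sense used here.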

Actor. An actor is an object with an internal state that responds to communications. Actors are mobile and include a starting set of parameters for which types of communication provide reward and which provide punishment. They have a dampening system to prevent feedback loops, and a memory that allows them to recognize combinations of reward patterns, an important ability as the simple rewards get dampened into insignificance. Actors may have a system of touch, triggered by an extremely low-speed, high-falloff communication emitted by all solid objects.
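What separates an actor from an object is that the same input can produce a different response depending on what the actor remembers. A minimal sketch of that idea follows; the `Actor` class, the use of a bounded deque as the memory stack, and the `learn_pattern` helper are assumptions of mine, and dampening, mobility and touch are deliberately omitted.

```python
from collections import deque


class Actor:
    """Hypothetical actor: fixed starting preferences plus a bounded memory."""

    def __init__(self, preferences, memory_depth=3):
        # preferences: comm_id -> reward (> 0) or punishment (< 0)
        self.preferences = dict(preferences)
        self.memory = deque(maxlen=memory_depth)  # most recent communication IDs
        self.learned = {}  # pattern (tuple of IDs) -> reward for the combination

    def learn_pattern(self, pattern, reward):
        self.learned[tuple(pattern)] = reward

    def receive(self, comm_id):
        """Update the internal state and return the reward felt for this input."""
        self.memory.append(comm_id)
        reward = self.preferences.get(comm_id, 0.0)
        recent = tuple(self.memory)
        # Any remembered sequence ending in this input can add its own reward.
        for length in range(2, len(recent) + 1):
            reward += self.learned.get(recent[-length:], 0.0)
        return reward


LIGHT, SOUND = 1, 2  # hypothetical IDs
actor = Actor({LIGHT: 1.0, SOUND: -1.0}, memory_depth=2)
actor.learn_pattern((SOUND, LIGHT), 5.0)

rewards = [actor.receive(LIGHT), actor.receive(SOUND), actor.receive(LIGHT)]
```

The first and third inputs are identical, but the third is worth more because of what preceded it. The memory depth caps how long a recognizable pattern can be, which fits the idea that a deeper memory allows more complex rewards.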
