I’ve been trying to come up with a working model for how biological intelligence works and using that to develop my own AI model. I’ve been working on this off and on for months, inspired by reinforcement learning. Here’s what I’ve got:
The mind does not use or work with objects, but with the set of properties for those objects. It’s critical to make this distinction for the sake of pattern matching.
Memory is transient, stateful information which is used to choose the optimal behavior. Every memory comes with an expiration time. When that time is up, the memory is forgotten. The importance of a memory determines how long it persists, and its importance is driven by its relevance to internal motivations and the number of times it has been recalled. The constant trimming of memory state is what prevents cognitive overload in the mind.
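To make the expiration idea concrete, here is a minimal Python sketch of a transient memory store. The names (`MemoryItem`, `TransientMemory`, `base_ttl`) and the lifetime rule itself (lifetime scales with importance and recall count) are my own assumptions for illustration, not a claim about how it has to be done:

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    properties: frozenset   # the property set that was observed
    importance: float       # assumed to be normalized to [0, 1]
    created_at: float       # simulation time when the memory was stored
    recalls: int = 0        # how many times this memory has been recalled

    def expires_at(self, base_ttl: float = 10.0) -> float:
        # Assumed rule: lifetime grows with importance and with each recall.
        return self.created_at + base_ttl * (1.0 + self.importance) * (1 + self.recalls)


class TransientMemory:
    def __init__(self):
        self.items = []

    def store(self, properties, importance, now):
        item = MemoryItem(frozenset(properties), importance, now)
        self.items.append(item)
        return item

    def trim(self, now):
        """Forget every memory whose expiration time has passed."""
        self.items = [m for m in self.items if m.expires_at() > now]

    def recall(self, prop, now):
        """Return live memories containing `prop`, counting each hit as a recall."""
        self.trim(now)
        hits = [m for m in self.items if prop in m.properties]
        for m in hits:
            m.recalls += 1
        return hits


memory = TransientMemory()
memory.store({"red", "round", "sweet"}, importance=0.9, now=0.0)  # important, lives longer
memory.store({"grey", "hard"}, importance=0.0, now=0.0)           # unimportant, expires first
memory.trim(now=12.0)              # only the important memory survives
hits = memory.recall("red", now=12.0)  # recalling it extends its lifetime
```

Trimming on every recall keeps the store small without a separate cleanup pass, though a real implementation might trim on a timer instead.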
Sensory input is how an agent gets stateful information about the environment around itself. Sensory input is fed directly into transient memory, with no filter applied at the sensory input level. What the sensors receive are the sets of properties produced by external objects in the surrounding environment.
Behavior is a type of response or interaction an agent can have with itself or the external world around it. Behavior choice is the only tangible evidence we have of an agent’s internal intelligence, so whether it chooses correct or wrong behaviors determines whether the agent passes an intelligence test.
Every agent has internal needs it is trying to satisfy through the use of behaviors. Motivators are what drive behavior choice in the agent’s given environmental context. Motivators are defined by a name, a normalized value, and a set of behaviors which have been discovered to change the motivation value one way or another.
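As a rough illustration of the motivator definition (name, normalized value, discovered behaviors), here is a small Python sketch. The field and method names are hypothetical, and the running-average update in `record_effect` is just one assumed way to track what a behavior has been discovered to do:

```python
from dataclasses import dataclass, field


@dataclass
class Motivator:
    name: str
    value: float = 0.0  # normalized need in [0, 1]; 1.0 means most urgent
    # behavior name -> observed change in value (discovered, not hand-assigned)
    known_effects: dict = field(default_factory=dict)

    def record_effect(self, behavior: str, delta: float) -> None:
        # Assumed update rule: average the new observation with the old estimate.
        old = self.known_effects.get(behavior)
        self.known_effects[behavior] = delta if old is None else 0.5 * (old + delta)

    def apply(self, delta: float) -> None:
        # Keep the value normalized after a behavior changes it.
        self.value = min(1.0, max(0.0, self.value + delta))

    def best_known_behavior(self):
        """Behavior discovered to reduce this need the most, or None if nothing is known."""
        if not self.known_effects:
            return None
        return min(self.known_effects, key=self.known_effects.get)


hunger = Motivator("hunger", value=0.8)
hunger.record_effect("eat", -0.5)    # eating was observed to lower hunger
hunger.record_effect("sleep", 0.1)   # sleeping was observed to raise it slightly
choice = hunger.best_known_behavior()
hunger.apply(hunger.known_effects[choice])
```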
Reward (emergent) is the summed result of all motivations when a behavior effect has been applied to an object. The amount of reward gained scales with the cube of the motivational need, using the equation F(X, W) = W * 10 * X^3, where X is normalized and represents motivational need, and W represents a weight. If you are manually assigning reward values to actions or properties, you’re doing it wrong.
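In Python, the reward equation and the summation over motivators could be sketched like this. The function names are mine, and summing F over (need, weight) pairs is my reading of "summed result of all motivations", so treat that part as an assumption:

```python
def reward(need: float, weight: float = 1.0) -> float:
    """F(X, W) = W * 10 * X**3, with X a normalized need in [0, 1]."""
    if not 0.0 <= need <= 1.0:
        raise ValueError("need must be normalized to [0, 1]")
    return weight * 10.0 * need ** 3


def total_reward(needs_and_weights) -> float:
    """Sum F(X, W) over every (need, weight) pair, one pair per motivator."""
    return sum(reward(x, w) for x, w in needs_and_weights)


half_need = reward(0.5)                            # 10 * 0.125 = 1.25
combined = total_reward([(0.5, 1.0), (1.0, 2.0)])  # 1.25 + 20.0 = 21.25
```

Because of the cube, a need of 0.5 yields only one eighth of the reward of a need of 1.0, so urgent needs dominate the sum.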
Knowledge is a collection of abstract ideas and concepts which are used to identify relationships and associations between things. The abstractions can be applied to related objects to predict outcomes, even when there is no prior experience with those objects. Knowledge is stored as a combination of property sets, behaviors, motivators, and reward experiences. Knowledge is transferable between agents.
***Knowledge reflection:*** This is an internal process where we look at our collection of assembled knowledge sets and try to infer generalizations, remove redundancies, streamline connections, and make better-organized sense of things. In humans, this crystallization phase typically happens during sleep or through intentional reflection.
The mind is the central repository for storing memory, knowledge, motivators, and behavior sets, and chooses behaviors based on these four areas of cognition. This is also where it does behavior planning/prediction via a dynamically generated behavior graph, with each node weighted by anticipated reward value evaluated through knowledge.
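The behavior graph idea can be sketched as a depth-limited search over nodes weighted by anticipated reward. This is only an illustration under my own assumptions: the graph here is a hand-written dictionary rather than one generated dynamically from knowledge, and `best_plan` and the behavior names are hypothetical:

```python
def best_plan(graph, start, depth):
    """Return (total anticipated reward, path) for the best path from `start`.

    `graph` maps a behavior name to (anticipated_reward, [next behaviors]).
    The path holds at most `depth` behaviors and stops early at a dead end.
    """
    reward, successors = graph[start]
    if depth <= 1 or not successors:
        return reward, [start]
    # Evaluate each follow-up behavior and keep the one with the highest total.
    best_total, best_path = max(
        (best_plan(graph, nxt, depth - 1) for nxt in successors),
        key=lambda result: result[0],
    )
    return reward + best_total, [start] + best_path


# Anticipated rewards would come from knowledge; these numbers are made up.
graph = {
    "idle": (0.0, ["walk_to_food", "sleep"]),
    "walk_to_food": (-0.5, ["eat"]),
    "eat": (6.0, []),
    "sleep": (2.0, []),
}

short_total, short_path = best_plan(graph, "idle", depth=2)
long_total, long_path = best_plan(graph, "idle", depth=3)
```

With a lookahead of two behaviors the agent settles for sleeping, while a lookahead of three lets it see that walking to food pays off once it eats.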
My goal is to build a model-free artificial intelligence which learns through trial and error, creates abstractions, and applies those abstractions to related objects to drive behavior choice. I don’t want to write any finite state machine code, so I’ve been trying to come up with a generalized intelligence model which can be applied to a variety of character types, each with its own motivators driving behavior choice. Am I missing anything here? Did I get anything wrong?
I’ve got a rough working model in code and am ironing out wrinkles before I start scaling this to lots of objects and dozens of agents.