The Neurosymbolic Loop

AgentLoop agents do not decide by prompting a single large model and hoping the answer is right. Every choice point flows through a small loop that combines three complementary capabilities: a neural layer that perceives, a symbolic layer that reasons, and a learned layer that improves. The behavior tree is where these meet — each tick is one pass through the loop, and each pass is observable.

This page explains the idea at a conceptual level. For the concrete node types you author, see Behavior Trees and Behavior Tree JSON.

Neural

The neural layer is where neural networks turn raw, unstructured input into structured signal. Its job is deliberately narrow: perception and classification — reading what is in front of the agent — not multi-step inference, which neural models do unreliably.

The layer is model-agnostic. Which network plays this role depends on the domain:

Language — a large language model parses natural-language input (a request, prior context, a transcript) into structured facts and intent.
Vision — image and video models detect objects, read a scene, or estimate state from a camera feed in real time.
Other sensing and control networks — pose and keypoint estimators, audio models, policy networks, or any task-specific model that maps live observations to structured outputs.

Because a perception node can host whatever network fits the signal, the same engine can drive a software agent reading a ticket and a real-time agent reading sensor frames tick by tick.

In every case the contract is identical: emit structured facts, not conclusions. A language model reading “I want to wash my car at the place down the street” might emit intent: wash and transport_available: ["drive", "walk"]; a vision model on a robot might emit obstacle_distance: 0.4 and grasp_target: "cup". What follows from those facts is the symbolic layer’s job.


{
  "intent": "wash",
  "transport_available": ["drive", "walk"]
}

By scoping each model to perception, you get its flexibility on novel input without inheriting its tendency to invent reasoning steps.
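One way to enforce the "facts, not conclusions" contract is to validate a model's output against a declared set of fact names before anything reaches the blackboard. The sketch below is illustrative only: `ALLOWED_FACTS`, `Fact`, and `parse_facts` are hypothetical names, not part of AgentLoop's API, and the schema is assumed from the example above.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: the only fact names this perception step may emit,
# with the Python type each value is expected to have.
ALLOWED_FACTS = {
    "intent": str,
    "transport_available": list,
}


@dataclass(frozen=True)
class Fact:
    name: str
    value: object


def parse_facts(raw: str) -> list:
    """Parse a model's JSON output into facts, rejecting anything off-schema."""
    payload = json.loads(raw)
    facts = []
    for name, value in payload.items():
        expected = ALLOWED_FACTS.get(name)
        if expected is None:
            # Undeclared keys (for example a conclusion such as "should_drive")
            # are refused rather than passed downstream.
            raise ValueError(f"unexpected key {name!r}: not a declared fact")
        if not isinstance(value, expected):
            raise ValueError(f"fact {name!r} should be {expected.__name__}")
        facts.append(Fact(name, value))
    return facts


model_output = '{"intent": "wash", "transport_available": ["drive", "walk"]}'
facts = parse_facts(model_output)
```

Rejecting undeclared keys keeps a model from smuggling a decision into the fact stream; whether AgentLoop enforces this at the node level is not stated here.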

AgentLoop’s software-development agents use LLMs at this layer today. The same perception node accepts vision and control networks, which is what lets the engine extend to real-time domains like robotics, simulation, and autonomous control.

Symbolic

The symbolic layer is AgentLoop’s rules engine, and its job is deterministic inference you can audit. It runs in-process, takes the facts the neural layer extracted, and derives their consequences using human-authored rules written against a versioned knowledge base.

The symbolic layer:

Asserts stable world knowledge — taxonomies, definitions, constraints — as ground facts.
Derives consequences from extracted facts using explicit rules.
Evaluates goal-achievement predicates that can later become reward signals.
Catches hallucinations: a “fact” that cannot satisfy a rule’s preconditions never produces a conclusion.


% A car wash requires a car whenever the intent is to wash
rel goal_requires_object("car_wash", "car") = intent("wash")
% Driving delivers a car
rel transport_delivers = {("drive", "car")}

If intent("wash") holds, the engine derives that a car is required — a conclusion grounded in both what the neural layer perceived and a rule you can point to. Every derivation is replayable: you can query the engine and watch the exact chain fire. This is the difference between an agent that asserts and an agent that can show its work.
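To make the derive-and-replay idea concrete, here is a minimal forward-chaining sketch in Python. It is not AgentLoop's rules engine or its rule syntax: the `Rule` class, the `derive` function, and the second rule (`viable_transport`) are assumptions introduced for illustration, and rules are restricted to ground facts with no variables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    premises: tuple    # ground facts that must all hold
    conclusion: tuple  # ground fact added when they do


# Stable world knowledge, asserted up front.
GROUND_FACTS = {("transport_delivers", "drive", "car")}

RULES = [
    Rule("wash_requires_car",
         (("intent", "wash"),),
         ("goal_requires_object", "car_wash", "car")),
    # Hypothetical follow-on rule, added only to show a two-step chain.
    Rule("viable_transport",
         (("goal_requires_object", "car_wash", "car"),
          ("transport_delivers", "drive", "car"),
          ("transport_available", "drive")),
         ("viable_transport", "drive")),
]


def derive(extracted):
    """Forward-chain to a fixed point; return all known facts and the firing trace."""
    known = set(GROUND_FACTS) | set(extracted)
    trace = []
    changed = True
    while changed:
        changed = False
        for rule in RULES:
            ready = all(p in known for p in rule.premises)
            if ready and rule.conclusion not in known:
                known.add(rule.conclusion)
                trace.append((rule.name, rule.conclusion))
                changed = True
    return known, trace


# Facts that match rule preconditions produce a traceable chain.
grounded, trace = derive({("intent", "wash"), ("transport_available", "drive")})

# Facts that match no precondition produce no conclusions at all.
ungrounded, empty_trace = derive({("intent", "wax"), ("transport_available", "teleport")})
```

The `trace` list records which rule fired and what it concluded, in order, which is the property that makes a derivation replayable. The second call shows the hallucination guard: an extracted "fact" that no rule recognizes simply leaves the trace empty.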

You author these rules with a logic-policy behavior-tree node (and an optional logic-introspect node that lets the knowledge base shape what the neural layer is allowed to extract). See Logic Programming to learn the rule syntax and how to wire one up.

Learned

The learned layer’s job is to choose well under uncertainty. At each branching point, a utility selector scores its candidate children using features drawn from the blackboard, including predicates the symbolic layer just derived, and selects a branch from those scores.

The learned layer:

Scores each child of a utility selector from blackboard features and response curves.
Selects the highest-utility branch in production, or samples during exploration.
Uses an inertia bonus to avoid flip-flopping between near-tied options.


full_refactor = 0.81
fast_patch    = 0.54
ask_for_help  = 0.11
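Given scores like these, selection with an inertia bonus can be sketched as follows. The function name, the size of the bonus, and the softmax sampling used for exploration are all assumptions; the text above says only that the selector picks the highest-utility branch in production, samples during exploration, and uses an inertia bonus.

```python
import math
import random


def select_branch(scores, previous=None, inertia=0.05,
                  explore=False, temperature=0.2, rng=None):
    """Pick a branch name from a {name: utility} mapping.

    The previously chosen branch, if any, gets a small bonus so that
    near-tied options do not flip-flop from tick to tick.
    """
    adjusted = dict(scores)
    if previous in adjusted:
        adjusted[previous] += inertia

    if not explore:
        # Production: deterministic argmax over the adjusted scores.
        return max(adjusted, key=adjusted.get)

    # Exploration (assumed scheme): sample in proportion to exp(score / T).
    rng = rng or random.Random()
    names = list(adjusted)
    weights = [math.exp(adjusted[n] / temperature) for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


scores = {"full_refactor": 0.81, "fast_patch": 0.54, "ask_for_help": 0.11}
clear_winner = select_branch(scores, previous="fast_patch")

# With a near tie, the inertia bonus keeps the previous choice.
near_tie = {"full_refactor": 0.81, "fast_patch": 0.79}
sticky = select_branch(near_tie, previous="fast_patch")
```

In the first call the gap between branches is larger than the bonus, so the higher score still wins; in the second, the 0.02 gap is smaller than the 0.05 bonus, so the selector stays on the branch it chose last time.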

Today these weights are hand-authored: you define response curves (linear, logistic, quadratic, identity) over named features and tune them directly. Weights live on disk under .agentloop/models/utility/ with version management, and each decision is recorded onto the blackboard so you can inspect why a branch was chosen. Over time, the design lets those same weights be trained from past episodes rather than written by hand — the recording you see today is the data capture for that.
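The four response-curve shapes named above can be sketched as plain functions over a feature normalized to the range 0 to 1. Everything specific here is an assumption made for illustration: the parameter names, the clamping, the weighted-average combination, and the feature names `test_coverage` and `time_pressure`. The on-disk weight format is not shown.

```python
import math


def identity(x):
    return x


def linear(x, slope=1.0, intercept=0.0):
    return slope * x + intercept


def quadratic(x):
    return x * x


def logistic(x, steepness=10.0, midpoint=0.5):
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


CURVES = {
    "identity": identity,
    "linear": linear,
    "quadratic": quadratic,
    "logistic": logistic,
}


def clamp01(value):
    return max(0.0, min(1.0, value))


def score(considerations, blackboard):
    """Weighted average of curve outputs.

    Each consideration is (feature_name, curve_name, curve_params, weight).
    """
    total_weight = sum(weight for _, _, _, weight in considerations)
    total = 0.0
    for feature, curve, params, weight in considerations:
        x = blackboard[feature]
        total += weight * clamp01(CURVES[curve](x, **params))
    return total / total_weight


# Hypothetical features and hand-authored weights for one candidate branch.
blackboard = {"test_coverage": 0.9, "time_pressure": 0.2}
full_refactor = [
    ("test_coverage", "logistic", {"steepness": 10.0, "midpoint": 0.5}, 2.0),
    ("time_pressure", "linear", {"slope": -1.0, "intercept": 1.0}, 1.0),
]
utility = score(full_refactor, blackboard)
```

Tuning by hand then means editing the curve parameters and weights in a structure like `full_refactor`; a trained version would update the same numbers from recorded episodes.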

The loop, end to end

One tick assembles all three:

State arrives on the blackboard.
Perceive — a neural model extracts facts from unstructured input (an LLM for text, a vision or sensing model for frames and signals).
Derive — rules in the policy engine fire, producing grounded consequences.
Decide — a utility selector scores its children using those consequences plus blackboard features.
Act — the chosen child runs (a primitive action, a sub-tree, or a fallback).
Evaluate — goal predicates in the rules engine report whether subgoals were achieved.

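Training, when present, happens offline: episode trajectories feed weight updates asynchronously. Nothing trains inside the hot loop, so the rules and weights a production tick uses never change underneath it, and rule evaluation and branch selection stay deterministic for a given blackboard state.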
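The steps above can be sketched as a single function that runs one tick and records what happened at each stage. This is a toy stand-in, not AgentLoop's engine: the keyword match replaces a real neural model, the single `if` replaces the rules engine, the scores are hard-coded, and every name (`TickRecord`, `perceive`, `derive`, `decide`, `evaluate`, `tick`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TickRecord:
    facts: dict     # what was perceived
    derived: set    # what the rules concluded
    scores: dict    # utility per candidate branch
    chosen: str     # branch that ran
    goal_met: bool  # result of the goal predicate


def perceive(blackboard):
    # Stand-in for a neural model: a keyword match on the input text.
    text = blackboard["input"].lower()
    return {"intent": "wash" if "wash" in text else "unknown"}


def derive(facts):
    # Stand-in for the rules engine: one hard-coded rule.
    return {"requires_car"} if facts["intent"] == "wash" else set()


def decide(derived):
    # Stand-in for a utility selector: fixed scores keyed on a derived predicate.
    needs_car = "requires_car" in derived
    scores = {"drive": 0.9 if needs_car else 0.2,
              "walk": 0.1 if needs_car else 0.6}
    return max(scores, key=scores.get), scores


ACTIONS = {
    "drive": lambda bb: bb.update(has_car=True),
    "walk": lambda bb: bb.update(has_car=False),
}


def evaluate(derived, blackboard):
    # Goal predicate: if a car is required, the agent must have brought one.
    return "requires_car" not in derived or blackboard.get("has_car", False)


def tick(blackboard, episode):
    facts = perceive(blackboard)            # Perceive
    blackboard["facts"] = facts
    derived = derive(facts)                 # Derive
    chosen, scores = decide(derived)        # Decide
    ACTIONS[chosen](blackboard)             # Act
    goal_met = evaluate(derived, blackboard)  # Evaluate
    record = TickRecord(facts, derived, scores, chosen, goal_met)
    episode.append(record)  # kept for inspection and any later offline training
    return record


episode = []
record = tick({"input": "I want to wash my car at the place down the street"}, episode)
```

Note that `tick` only appends to `episode`; it never updates scores or rules itself, which mirrors the separation between the hot loop and offline training.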

Why it matters

The composition is more robust than any single layer because each layer does only what it is good at.

Flexible. Neural models absorb novelty — unforeseen wording, unfamiliar scenes, open-ended scenarios — without being trusted to reason.
Verifiable. Every conclusion traces back to an authored rule and grounded facts. A claim that doesn’t satisfy its preconditions simply doesn’t pass.
Improvable. Decision weights can adapt from outcomes while the reasoning above them stays interpretable.
Auditable, tick by tick. When something goes wrong, you have three orthogonal questions with three separate answers:

Question	Where you look
Did it perceive the right facts?	The extracted facts on the blackboard
Did it reason to the right conclusion?	Replay the rules-engine derivation
Did it choose the right branch?	The recorded utility scores for that tick

A purely neural agent cannot answer any of these separately, because perception, reasoning, and choice all happen inside one opaque model. In regulated or safety-critical work, being able to point at the rule that explains a decision is the whole game.

Available today: the logic-policy and logic-introspect reasoning nodes and intent-extraction patterns, plus utility selectors with hand-authored weights, on-disk versioned models, and per-tick decision recording. These are production code paths you can use now.

Evolving: automatic training of utility weights from recorded episodes, hierarchical goal planning that decomposes high-level goals into subgoal sequences, and a unified reasoning editor are at the design stage and rolling out in milestones. Decision and tick data are already captured, so these can layer on without changing how you author trees today. Build against the shipped surfaces; treat the learned-training pieces as forthcoming.
