
FlowScript

Decision intelligence for AI agents.

Tests License: MIT Website


The Problem

Your agent made a decision. Your PM asks "why?" You dig through chat logs and JSON blobs. Good luck.

Agent memory today is either opaque embeddings you can't inspect, expensive LLM self-editing you can't audit, or untyped state dicts with no structure. They store tokens. FlowScript stores reasoning — structured, typed, queryable in <1ms.

Mem0, Zep, Letta, LangGraph — those solve retrieval. FlowScript solves reasoning. They're not mutually exclusive. Use an embedding store for "find similar memories" and FlowScript for "why did we decide that?"


Try It Now

Everything below works today. 246 tests passing.

git clone https://github.com/phillipclapham/flowscript.git
cd flowscript && npm install && npm run build

# Parse a decision file to structured IR
node bin/flowscript parse examples/decision.fs -o /tmp/decision.json

# Find every tradeoff in the decision
node bin/flowscript query tensions /tmp/decision.json

Real output from that query:

{
  "tensions_by_axis": {
    "security vs simplicity": [{
      "source": { "content": "JWT tokens" },
      "target": { "content": "implementation complexity" }
    }],
    "scaling vs security": [{
      "source": { "content": "session tokens + Redis" },
      "target": { "content": "operational complexity" }
    }]
  },
  "metadata": {
    "total_tensions": 2,
    "unique_axes": ["security vs simplicity", "scaling vs security"]
  }
}

Typed tradeoffs with named axes, from a 17-line .fs file your PM can actually read.

# Also available: why, what-if, blocked, alternatives
node bin/flowscript query blocked /tmp/decision.json
node bin/flowscript query alternatives /tmp/decision.json
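For reference, a decision file in this shape might look like the sketch below. The content is hypothetical and uses only the starter markers from the notation section (`?`, `||`, `<-`, `><`, `[decided]`); examples/decision.fs and FLOWSCRIPT_SYNTAX.md are authoritative for exact syntax.

```
? which auth approach for the API
|| JWT tokens <- stateless, simple rollout
|| session tokens + Redis <- revocation support
JWT tokens >< implementation complexity
session tokens + Redis >< operational complexity
* [decided(rationale, on)]
```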

Hello World — v1.0 SDK

The SDK wraps what's already working into a fluent API. Coming soon; track progress.

import { Memory } from 'flowscript';
const mem = new Memory();

const q = mem.question("Which database for agent memory?");
mem.alternative(q, "Redis").decide({ rationale: "speed critical for real-time agents" });
mem.alternative(q, "SQLite").block({ reason: "no concurrent write support" });
mem.thought("Redis gives sub-ms reads").vs(mem.thought("cluster costs $200/mo"), "performance vs cost");

console.log(mem.query.blocked());   // structured blockers + downstream impact
console.log(mem.query.tensions());  // tradeoffs with named axes
mem.save("./memory.fs");            // human-readable, PM-reviewable

Three lines of output, three things no other memory system gives you: typed blockers with impact chains, named-axis tensions, and a .fs file a human can actually read.


What You Get

Token Efficiency

~3:1 compression ratio vs prose: the same reasoning in roughly 66% fewer memory tokens. At scale, that's real money.
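As a back-of-envelope illustration of what that ratio means in dollars, here is a quick calculation. Every input below is a hypothetical assumption (record size, replay volume, token price), not a measured FlowScript figure:

```typescript
// Hypothetical back-of-envelope: what a ~3:1 compression saves.
// All inputs are illustrative assumptions, not measured FlowScript figures.
const proseTokensPerRecord = 2_000;   // assumed prose size of one decision record
const compressionRatio = 3;           // the ~3:1 claim above
const recordsPerDay = 10_000;         // assumed replay volume across agents
const costPerMillionTokens = 3;       // assumed $/1M input tokens

const fsTokensPerRecord = proseTokensPerRecord / compressionRatio;
const tokensSavedPerDay = (proseTokensPerRecord - fsTokensPerRecord) * recordsPerDay;
const dollarsSavedPerDay = (tokensSavedPerDay / 1_000_000) * costPerMillionTokens;

console.log(Math.round(tokensSavedPerDay));        // tokens saved per day
console.log(dollarsSavedPerDay.toFixed(2));        // dollars saved per day
```

At these assumptions the savings come to roughly $40/day; the point is that the ratio compounds linearly with volume.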

Decision Provenance

why(nodeId) returns a typed causal chain. Not vibes, not "the model said so." A traceable path from decision back through every factor that led there.

Blocker Analysis

blocked() finds every stuck node, scores downstream impact, and tells you how long it's been waiting. Your agent doesn't just know what's blocked, it knows what breaks because of it.
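The idea behind impact scoring can be sketched in a few lines: score each blocked node by how many nodes are reachable downstream of it. This is a simplified model for illustration, not the actual engine (see QUERY_ENGINE.md for the real scoring):

```typescript
// Simplified sketch of blocker impact scoring: a blocked node's impact
// is the number of downstream nodes reachable via outgoing edges.
// (Illustrative model only, not the FlowScript engine.)
type Edge = { from: string; to: string };

function downstreamImpact(node: string, edges: Edge[]): number {
  const seen = new Set<string>();
  const stack = [node];
  while (stack.length > 0) {
    const cur = stack.pop()!;
    for (const e of edges) {
      if (e.from === cur && !seen.has(e.to)) {
        seen.add(e.to);
        stack.push(e.to);
      }
    }
  }
  return seen.size; // how many nodes break if this one stays blocked
}

// Hypothetical graph: "auth" transitively blocks three things.
const edges: Edge[] = [
  { from: "auth", to: "sessions" },
  { from: "sessions", to: "billing" },
  { from: "auth", to: "admin-ui" },
];
console.log(downstreamImpact("auth", edges)); // 3
```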

Human-Readable Audit

.fs files read like structured prose. Your PM can open agent memory in a code review. Try that with a vector database.


The Query Engine

Five semantic queries. No competitor has these.

| Query | What it does | Example question it answers |
|---|---|---|
| `why(nodeId)` | Traces causal chains backward | "Why did we choose Redis?" |
| `whatIf(nodeId)` | Projects forward consequences | "What breaks if we drop caching?" |
| `tensions()` | Maps all tradeoffs with named axes | "What tensions exist in this design?" |
| `blocked()` | Finds blockers with impact scoring | "What's stuck and what's downstream?" |
| `alternatives(questionId)` | Reconstructs decision rationale | "What options did we consider?" |

Each query returns structured, typed results in multiple formats (chain, tree, flat, comparison). All execute in <1ms on typical agent memory graphs.

These operations are computationally impossible on unstructured text. That's the point. Structure makes reasoning queryable.
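To make that concrete, here is a minimal, self-contained model of what a `why()`-style backward trace looks like over a typed graph. This is an illustrative sketch only; the real API and return formats are documented in QUERY_ENGINE.md:

```typescript
// Minimal model of a why()-style query: walk typed "causes" edges
// backward from a decision to reconstruct its causal chain.
// (Illustrative sketch, not the FlowScript implementation.)
type TypedEdge = { from: string; to: string; type: "causes" | "blocks" | "tension" };

function why(nodeId: string, edges: TypedEdge[]): string[] {
  const chain: string[] = [];
  let current: string | undefined = nodeId;
  while (current !== undefined) {
    chain.push(current);
    // follow the first incoming "causes" edge backward
    current = edges.find((e) => e.type === "causes" && e.to === current)?.from;
  }
  return chain;
}

const edges: TypedEdge[] = [
  { from: "speed critical", to: "choose Redis", type: "causes" },
  { from: "real-time agents", to: "speed critical", type: "causes" },
];
console.log(why("choose Redis", edges));
// ["choose Redis", "speed critical", "real-time agents"]
```

The same traversal over raw prose would require an LLM call per hop; over a typed graph it is a sub-millisecond walk.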

Full query docs with TypeScript API: QUERY_ENGINE.md


Agent-to-Agent Decision Exchange

FlowScript's most differentiated use case: structured semantic payloads between agents.

When Agent A asks Agent B "why did you make that decision?", most systems return unstructured text. FlowScript returns a typed causal chain:

Agent A → why(decision_id) → Agent B
       ← typed causal chain with provenance
// Agent B responds to a why() query with structured reasoning
const chain = mem.query.why("auth-decision-001");
// Returns: decision ← rationale ← evidence ← constraints
// Every link typed, every source tracked, every tradeoff named

// Agent A can then query further:
const impacts = mem.query.whatIf("auth-decision-001");
// "If that decision changes, what downstream effects propagate?"

This is LDP Mode 3 (Semantic Graphs) — structured decision payloads as a protocol, not just storage. No other agent memory system enables typed reasoning exchange between agents. Embedding stores pass blobs. FlowScript passes understanding.
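As a rough sketch of what such a typed payload could look like on the wire, consider the shape below. All field names and values here are illustrative assumptions; the actual schema lives in spec/ and the flowscript-ldp reference implementation:

```typescript
// Hypothetical shape of a typed causal-chain payload exchanged between
// agents. Field names are illustrative; see spec/ for the real schema.
interface CausalLink {
  from: string;      // content-hash ID of the upstream node
  to: string;        // content-hash ID of the downstream node
  type: "causes" | "derives" | "tension" | "blocks";
  source?: string;   // provenance: where the link came from
}

interface WhyPayload {
  decision: string;     // the node the query was asked about
  chain: CausalLink[];  // typed links, ordered from decision back to root
}

const payload: WhyPayload = {
  decision: "auth-decision-001",
  chain: [
    { from: "rationale-1", to: "auth-decision-001", type: "causes", source: "decision.fs" },
  ],
};
console.log(payload.chain.length); // 1
```

The receiving agent can validate, traverse, or re-query this structure without ever parsing free text.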

See flowscript-ldp for the working reference implementation.


Cross-Architecture Evidence

Six AI architectures (Claude, ChatGPT, Gemini, DeepSeek, Claude Code, fresh Claude instances) parsed FlowScript without being given the specification. All six recognized the notation immediately and started using it in responses.

Different training data, different attention mechanisms, different optimization targets. Same structural recognition. This suggests FlowScript taps fundamental patterns in language and reasoning, not model-specific quirks.

Specification alone is sufficient for full adoption. No training. No fine-tuning. Just the syntax reference and examples.

Running in production for 6+ months in the flow system, and daily in a multi-agent cognitive architecture with 11 sensors, 22 scheduled tasks, and bilateral AI-to-AI relay. Not theoretical.


Three Ways In

1. From agent transcripts (zero learning curve)

const mem = Memory.fromTranscript(agentLog);
console.log(mem.query.tensions());

The LLM writes FlowScript. You never touch the syntax. Paste existing agent output, get queryable decision intelligence back.

2. Builder API (programmatic)

const mem = new Memory();
const t = mem.thought("caching improves latency");
t.causes(mem.thought("higher memory usage"));
t.tensionWith(mem.thought("cost constraints"), "performance vs budget");

Feels like a builder/ORM. Type-safe, fluent chaining, auto-generates the graph.

3. Parse .fs directly (power users)

const mem = Memory.parse(`
  ? which database for sessions
  || Redis <- speed critical
  || Postgres <- better durability
  speed >< durability
`);

21 markers, human-readable, works in any text editor. Full syntax spec.


Install

git clone https://github.com/phillipclapham/flowscript.git
cd flowscript && npm install && npm run build

npm package coming soon. The SDK (with Memory API, fluent builder, and npm install) is actively being built. Track progress.

For Python and LDP protocol integration:

pip install flowscript-ldp

See flowscript-ldp for the LDP Mode 3 reference implementation.


How It Works

.fs file / builder API / transcript
        ↓
   FlowScript Parser (Ohm.js PEG grammar)
        ↓
   Intermediate Representation
   (typed graph: content-hash IDs, provenance tracking, SHA-256 dedup)
        ↓
   Query Engine (5 semantic operations)
        ↓
   Structured Results (chain / tree / flat / comparison)

The IR is the core. Every node gets a content-hash ID. Every relationship is typed (causes, derives, tension, blocks, etc.). Provenance tracks source files and line numbers. The schema is formally specified and validated.
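Content-hash IDs of that kind can be sketched with Node's built-in crypto module. This is a minimal illustration of the dedup idea; the actual ID scheme and hash truncation rules are defined in spec/:

```typescript
// Minimal sketch of content-hash node IDs: hash the node's content with
// SHA-256 so identical content dedupes to the same ID.
// (Illustrative; the real ID scheme is defined in spec/.)
import { createHash } from "node:crypto";

function nodeId(content: string): string {
  return createHash("sha256").update(content).digest("hex").slice(0, 12);
}

// Identical content produces identical IDs, enabling dedup.
console.log(nodeId("caching improves latency") === nodeId("caching improves latency")); // true
console.log(nodeId("caching improves latency") === nodeId("cost constraints"));         // false
```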

246 tests. All passing. Parser, linter (9 semantic rules), validator, query engine, CLI.

Details: TOOLCHAIN.md | Formal specs: spec/


Notation at a Glance

You don't need to learn all 21 markers. Start with these:

| Marker | Meaning | Example |
|---|---|---|
| `->` | causes / leads to | `poor sleep -> reduced focus` |
| `?` | question / decision point | `? which framework to use` |
| `><` | tension / tradeoff | `speed >< code quality` |
| `[blocked]` | waiting on dependency | `* [blocked(reason, since)]` |
| `[decided]` | committed direction | `* [decided(rationale, on)]` |
| `thought:` | insight worth preserving | `thought: caching is the bottleneck` |

Full 21-marker spec: FLOWSCRIPT_SYNTAX.md | Beginner guide: FLOWSCRIPT_LEARNING.md | Real-world examples: FLOWSCRIPT_EXAMPLES.md


Why This Isn't Another Memory Layer

| Capability | FlowScript | Embedding stores | State dicts | LLM self-edit |
|---|---|---|---|---|
| Semantic queries (why, blocked, tensions) | Yes | No | No | No |
| Human-readable persistence | Yes (.fs files) | No | Partially | No |
| Decision provenance | Yes (typed chains) | No | No | Sometimes |
| Agent-to-agent reasoning exchange | Yes (LDP Mode 3) | No | No | No |
| Sub-ms query performance | Yes | Depends | Yes | No (LLM call) |
| Works without fine-tuning | Yes | Yes | Yes | Yes |

CLI

# Parse FlowScript to IR
node bin/flowscript parse example.fs -o example.json

# Lint for semantic errors (9 rules)
node bin/flowscript lint example.fs

# Validate IR against schema
node bin/flowscript validate example.json

# Query the graph
node bin/flowscript query why <node-id> example.json
node bin/flowscript query what-if <node-id> example.json
node bin/flowscript query tensions example.json
node bin/flowscript query blocked example.json
node bin/flowscript query alternatives <question-id> example.json

Documentation

Learn the notation: FLOWSCRIPT_SYNTAX.md (complete spec) | FLOWSCRIPT_LEARNING.md (beginner guide) | FLOWSCRIPT_EXAMPLES.md (real-world patterns)

Understand the engine: QUERY_ENGINE.md (5 queries, TypeScript API) | TOOLCHAIN.md (parser, linter, validator)

Dive deeper: ADVANCED_PATTERNS.md (sophisticated usage) | spec/ (formal specifications) | examples/ (golden .fs/.json pairs)

Try it live: flowscript.org


Protocol Alignment

Three independent systems arrived at symbolic notation for AI communication without cross-pollination:

| System | Date | Scope |
|---|---|---|
| SynthLang | Jan 2025 | Prompt compression |
| FlowScript | Oct 2025 | Decision intelligence + formal toolchain |
| MetaGlyph | Jan 2026 | Prompt compression (6 operators, 62-81% token reduction) |

When independent builders converge on the same structural insight, that's evidence the insight is load-bearing.

FlowScript's IR is the first implementation of LDP Mode 3 (Semantic Graphs) from arXiv:2603.08852. Active collaboration with the LDP paper author on session state machine co-design (GitHub issues).

Also structurally aligned with G2CP (graph-grounded agent communication, 73% token reduction), JamJet (Rust agent runtime with ProtocolAdapter), and NFD (three-tier cognitive architecture matching FlowScript's temporal model).


Contributing

Use FlowScript. Report the friction you hit. Open issues with evidence from real use, not theoretical proposals.

Working on agent protocols? FlowScript's IR is a natural fit for structured semantic payloads. PRs for integration welcome.


License

MIT. See LICENSE.

Decision intelligence for AI agents. Typed semantic queries over structured reasoning.