Introduction

Pinaxai is the runtime for agentic software. Build agents, teams, and workflows. Run them as scalable services. Monitor and manage them in production.

What is Pinaxai?

Pinaxai is a lightweight, model-agnostic library for building production-grade AI agents. It provides three integrated layers:

  • Framework: Build agents, teams, and workflows with memory, knowledge, guardrails, and 100+ integrations.
  • Runtime: Serve your system in production with a stateless, session-scoped FastAPI backend.
  • Control Plane: Test, monitor, and manage your system using the AgentOS UI.

What You Can Build

Pinaxai powers real agentic systems built from the same primitives above.

  • Pal — A personal agent that learns your preferences.
  • Dash — A self-learning data agent grounded in six layers of context.
  • Scout — A self-learning context agent that manages enterprise knowledge.
  • Gcode — A post-IDE coding agent that improves over time.
  • Investment Team — A multi-agent investment committee that debates and allocates capital.

Single agents. Coordinated teams. Structured workflows. All built on one architecture.

Built for Production

Pinaxai runs in your infrastructure, not ours.

  • Stateless, horizontally scalable runtime.
  • 50+ APIs and background execution support.
  • Per-user and per-session isolation.
  • Runtime approval enforcement.
  • Native tracing and full auditability.
  • Sessions, memory, knowledge, and traces stored in your database.

You own the system. You own the data. You define the rules.

Installation

Install Pinaxai using pip. It has minimal dependencies and works out of the box.

Bash
pip install -U pinaxai

Your First Agent

Build and run your first agent in about 20 lines of code. In this guide, you'll build an agent that connects to an MCP server, stores and retrieves past conversations, and runs as a production API.

1. Define the Agent

Save the following as pinax_assist.py:

Python — pinax_assist.py
from pinaxai.agent import Agent
from pinaxai.db.sqlite import SqliteDb
from pinaxai.models.anthropic import Claude
from pinaxai.os import AgentOS
from pinaxai.tools.mcp import MCPTools

pinax_assist = Agent(
    name="Pinax Assist",
    model=Claude(id="claude-sonnet-4-5"),
    db=SqliteDb(db_file="pinaxai.db"),                       # session storage
    tools=[MCPTools(url="https://docs.pinax.com/mcp")],      # Pinax docs via MCP
    add_datetime_to_context=True,
    add_history_to_context=True,                              # include past runs
    num_history_runs=3,                                       # include the last 3 runs
    markdown=True,
)

# Serve via AgentOS — streaming, auth, session isolation, API endpoints
agent_os = AgentOS(agents=[pinax_assist], tracing=True)
app = agent_os.get_app()

You now have: a stateful agent, streaming responses, per-user session isolation, a production-ready API, and tracing — no third-party services required.

2. Run Your AgentOS

  1. Set up your virtual environment
     Bash
     uv venv --python 3.12
     source .venv/bin/activate

  2. Install dependencies
     Bash
     uv pip install -U 'pinaxai[os]' anthropic mcp

  3. Export your Anthropic API key
     Bash
     export ANTHROPIC_API_KEY=sk-***

  4. Run your AgentOS
     Bash
     fastapi dev pinax_assist.py

     Your AgentOS is now running at http://localhost:8000. API docs are available at http://localhost:8000/docs.

Connect to the AgentOS UI

The AgentOS UI connects directly from your browser to your runtime. It lets you test, monitor, and manage your agents in real time.

  1. Open os.pinax.com and sign in.
  2. Click Add new OS in the top navigation.
  3. Select Local to connect to a local AgentOS.
  4. Enter your endpoint URL — default: http://localhost:8000.
  5. Name it (e.g. "Development OS") and click Connect.

You'll see your OS with a live status indicator once connected. Open Chat, select your agent, and start a conversation. Click Sessions in the sidebar to inspect stored conversations.

All session data is stored in your local database. No third-party tracing or hosted memory service is required.

What are Agents?

Agents are AI programs that use tools to accomplish tasks. An agent is a stateful control loop around a stateless model — the model reasons and calls tools in a loop, guided by instructions. Add memory, knowledge, storage, human-in-the-loop, and guardrails as needed.

  • The model provides the intelligence. Connect to 23+ providers with no lock-in.
  • Tools give the agent the ability to take actions: search the web, call APIs, read files.
  • Instructions shape the agent's behavior and output format.
  • Memory and knowledge allow agents to recall past interactions and retrieve relevant context.

  • Agent: A single AI program with tools, memory, and instructions.
  • Team: Multiple agents that work together toward a shared goal.
  • Workflow: Orchestrates agents, teams, and functions through defined steps.

Building Agents

Start simple: a model, tools, and instructions. Once that works, layer in more functionality as needed.

Python — hackernews_agent.py
from pinaxai.agent import Agent
from pinaxai.models.anthropic import Claude
from pinaxai.tools.hackernews import HackerNewsTools

agent = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    tools=[HackerNewsTools()],
    instructions="Write a report on the topic. Output only the report.",
    markdown=True,
)
agent.print_response("Trending startups and products.", stream=True)

Use Agent.print_response() for development — it prints the response in a readable terminal format. For production, use Agent.run() or Agent.arun():

Python
from typing import Iterator
from pinaxai.agent import Agent, RunOutputEvent, RunEvent
from pinaxai.models.anthropic import Claude
from pinaxai.tools.hackernews import HackerNewsTools

agent = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    tools=[HackerNewsTools()],
    instructions="Write a report on the topic. Output only the report.",
    markdown=True,
)

stream: Iterator[RunOutputEvent] = agent.run("Trending products", stream=True)
for chunk in stream:
    if chunk.event == RunEvent.run_content:
        print(chunk.content)

Dynamic Configuration

Callable factories are a first-class pattern for dynamic runtime configuration. Use them to build tools and knowledge from live run context instead of fixed configuration — useful for multi-tenant systems where tools and knowledge sources change per user or task.
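As an illustration of the pattern, here is a minimal sketch in plain Python. The `RunContext` shape and `build_tools` helper are hypothetical stand-ins, not part of the Pinaxai API; the point is that a callable is evaluated per run and returns configuration derived from live context.

```python
from dataclasses import dataclass

@dataclass
class RunContext:
    # Hypothetical run context: whatever identifies the caller at runtime.
    user_id: str
    plan: str = "free"

def build_tools(ctx: RunContext) -> list[str]:
    # Factory invoked per run: derive the toolset from live context
    # instead of hard-coding it at agent construction time.
    tools = ["search_docs"]
    if ctx.plan == "enterprise":
        tools.append("query_warehouse")  # enterprise-only tool
    return tools

# Each run re-evaluates the factory, so two tenants get different tools.
print(build_tools(RunContext(user_id="u1")))                      # → ['search_docs']
print(build_tools(RunContext(user_id="u2", plan="enterprise")))   # → ['search_docs', 'query_warehouse']
```

The same shape works for knowledge sources: a callable that returns the per-tenant knowledge base rather than a fixed one.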

Running Agents

Run your agent by calling Agent.run() or Agent.arun(). The execution flow:

  1. The agent builds context: system message, user message, chat history, memories, session state.
  2. The agent sends this context to the model.
  3. The model responds with either a message or a tool call.
  4. If the model calls a tool, the agent executes it and returns results to the model.
  5. The model processes the updated context and repeats until it produces a final message.
  6. The agent returns this final response to the caller.
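The loop above can be sketched in plain Python. This is a toy illustration with a stubbed model and a hand-rolled message format, not the Pinaxai implementation:

```python
def run_agent(model, tools, user_message, max_steps=10):
    # 1. Build context: system + user messages (history and memories elided).
    messages = [{"role": "system", "content": "You are a helpful agent."},
                {"role": "user", "content": user_message}]
    for _ in range(max_steps):
        # 2-3. Send the context to the model; it replies with text or a tool call.
        reply = model(messages)
        if reply.get("tool_call"):
            # 4. Execute the tool and append the result to the context.
            name, args = reply["tool_call"]
            result = tools[name](**args)
            messages.append({"role": "tool", "content": str(result)})
        else:
            # 5-6. No tool call: this is the final response, returned to the caller.
            return reply["content"]
    raise RuntimeError("max steps exceeded")

# Stub model: first asks for a tool call, then answers with the tool result.
def stub_model(messages):
    if messages[-1]["role"] != "tool":
        return {"tool_call": ("add", {"a": 2, "b": 3})}
    return {"content": f"The answer is {messages[-1]['content']}"}

print(run_agent(stub_model, {"add": lambda a, b: a + b}, "What is 2+3?"))
# → The answer is 5
```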

Basic Execution

Agent.run() returns a RunOutput object, or a stream of RunOutputEvent objects when stream=True:

Python
from pinaxai.agent import Agent, RunOutput
from pinaxai.models.anthropic import Claude
from pinaxai.tools.hackernews import HackerNewsTools
from pinaxai.utils.pprint import pprint_run_response

agent = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    tools=[HackerNewsTools()],
    instructions="Write a report on the topic. Output only the report.",
    markdown=True,
)

response: RunOutput = agent.run("Trending startups and products.")
pprint_run_response(response, markdown=True)

Run Output

Core attributes on the RunOutput object:

  • run_id: The ID of the run.
  • agent_id: The ID of the agent.
  • session_id: The ID of the session.
  • user_id: The ID of the user.
  • content: The response content.
  • content_type: The type of content. For structured output, the Pydantic model class name.
  • reasoning_content: The reasoning content, if reasoning is enabled.
  • messages: The list of messages sent to the model.
  • metrics: Run metrics including token counts and latency.
  • model: The model used for the run.

Streaming

Set stream=True to return an iterator of RunOutputEvent objects:

Python
stream: Iterator[RunOutputEvent] = agent.run("Trending products", stream=True)
for chunk in stream:
    if chunk.event == RunEvent.run_content:
        print(chunk.content)

By default, only RunContent events are streamed. To stream all events — tool calls, reasoning, memory updates — set stream_events=True:

Python
stream = agent.run("Trending products", stream=True, stream_events=True)

for chunk in stream:
    if chunk.event == RunEvent.run_content:
        print(f"Content: {chunk.content}")
    elif chunk.event == RunEvent.tool_call_started:
        print(f"Tool call started: {chunk.tool.tool_name}")
    elif chunk.event == RunEvent.reasoning_step:
        print(f"Reasoning: {chunk.reasoning_content}")

Events Reference

Complete list of events yielded by Agent.run() and Agent.arun():

Core Events

  • RunStarted: Indicates the start of a run.
  • RunContent: Contains model response text as individual chunks.
  • RunContentCompleted: Signals completion of content streaming.
  • RunCompleted: Signals successful completion of the run.
  • RunError: Indicates an error during the run.
  • RunCancelled: Signals that the run was cancelled.

Tool Events

  • ToolCallStarted: Indicates the start of a tool call.
  • ToolCallCompleted: Signals completion of a tool call, including results.

Reasoning Events

  • ReasoningStarted: Indicates the start of the reasoning process.
  • ReasoningStep: Contains a single step in the reasoning chain.
  • ReasoningCompleted: Signals completion of the reasoning process.

Memory Events

  • MemoryUpdateStarted: Indicates the agent is updating its memory.
  • MemoryUpdateCompleted: Signals completion of a memory update.

Control Flow Events

  • RunPaused: Indicates the run has been paused (human-in-the-loop).
  • RunContinued: Signals a paused run has been resumed.
  • PreHookStarted / PreHookCompleted: Pre-run hook lifecycle events.
  • PostHookStarted / PostHookCompleted: Post-run hook lifecycle events.

Custom Events

Create custom events by extending CustomEvent:

Python
from dataclasses import dataclass
from pinaxai.run.agent import CustomEvent
from typing import Optional

@dataclass
class CustomerProfileEvent(CustomEvent):
    customer_name: Optional[str] = None
    customer_email: Optional[str] = None
    customer_phone: Optional[str] = None

Yield custom events from a tool:

Python
from pinaxai.tools import tool

@tool()
async def get_customer_profile():
    yield CustomerProfileEvent(
        customer_name="John Doe",
        customer_email="john.doe@example.com",
        customer_phone="1234567890",
    )
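Custom events then appear in the run's event stream alongside built-in events, so consumers can filter by type. A minimal sketch of that filtering, using local stand-in classes rather than the Pinaxai types:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CustomEvent:
    # Stand-in for pinaxai.run.agent.CustomEvent.
    pass

@dataclass
class CustomerProfileEvent(CustomEvent):
    customer_name: Optional[str] = None
    customer_email: Optional[str] = None

def collect_profiles(stream):
    # Filter the event stream by type; other events pass through untouched.
    return [e.customer_name for e in stream if isinstance(e, CustomerProfileEvent)]

events = [CustomEvent(), CustomerProfileEvent(customer_name="John Doe")]
print(collect_profiles(events))  # → ['John Doe']
```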

Example — Basic Agent

The simplest agent: no tools, no memory, just a model and a description.

Python
from pinaxai.agent import Agent
from pinaxai.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    description="You are an enthusiastic news reporter with a flair for storytelling.",
    markdown=True,
)
agent.print_response("Tell me about a breaking news story from New York.", stream=True)
Bash
pip install pinaxai openai
export OPENAI_API_KEY=sk-xxxx
python basic_agent.py

Example — Agent with Tools

Give the agent access to the web so it retrieves real information instead of hallucinating.

Python
from pinaxai.agent import Agent
from pinaxai.models.openai import OpenAIChat
from pinaxai.tools.duckduckgo import DuckDuckGoTools

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    description="You are an enthusiastic news reporter with a flair for storytelling.",
    tools=[DuckDuckGoTools()],
    show_tool_calls=True,
    markdown=True,
)
agent.print_response("Tell me about a breaking news story from New York.", stream=True)
Bash
pip install duckduckgo-search
python agent_with_tools.py

Example — Agent with Knowledge

Agents can store knowledge in a vector database and use Agentic RAG to retrieve exactly what they need at runtime.

Python
from pinaxai.agent import Agent
from pinaxai.models.openai import OpenAIChat
from pinaxai.embedder.openai import OpenAIEmbedder
from pinaxai.tools.duckduckgo import DuckDuckGoTools
from pinaxai.knowledge.pdf_url import PDFUrlKnowledgeBase
from pinaxai.vectordb.lancedb import LanceDb, SearchType

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    description="You are a Thai cuisine expert.",
    instructions=[
        "Search your knowledge base for Thai recipes.",
        "If the question is better suited for the web, search the web to fill in gaps.",
        "Prefer the information in your knowledge base over web results.",
    ],
    knowledge=PDFUrlKnowledgeBase(
        urls=["https://pinaxai-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
        vector_db=LanceDb(
            uri="tmp/lancedb",
            table_name="recipes",
            search_type=SearchType.hybrid,
            embedder=OpenAIEmbedder(id="text-embedding-3-small"),
        ),
    ),
    tools=[DuckDuckGoTools()],
    show_tool_calls=True,
    markdown=True,
)

if agent.knowledge is not None:
    agent.knowledge.load()

agent.print_response("How do I make chicken and galangal in coconut milk soup?", stream=True)
agent.print_response("What is the history of Thai curry?", stream=True)
Bash
pip install lancedb tantivy pypdf duckduckgo-search
python agent_with_knowledge.py

Example — Multi-Agent Teams

Agents work best with a singular purpose and a narrow set of tools. When the scope grows, split the load across a coordinated team.

Python
from pinaxai.agent import Agent
from pinaxai.models.openai import OpenAIChat
from pinaxai.tools.duckduckgo import DuckDuckGoTools
from pinaxai.tools.yfinance import YFinanceTools
from pinaxai.team import Team

web_agent = Agent(
    name="Web Agent",
    role="Search the web for information",
    model=OpenAIChat(id="gpt-4o"),
    tools=[DuckDuckGoTools()],
    instructions="Always include sources",
    show_tool_calls=True,
    markdown=True,
)

finance_agent = Agent(
    name="Finance Agent",
    role="Get financial data",
    model=OpenAIChat(id="gpt-4o"),
    tools=[YFinanceTools(
        stock_price=True,
        analyst_recommendations=True,
        company_info=True,
    )],
    instructions="Use tables to display data",
    show_tool_calls=True,
    markdown=True,
)

agent_team = Team(
    mode="coordinate",
    members=[web_agent, finance_agent],
    model=OpenAIChat(id="gpt-4o"),
    success_criteria="A comprehensive financial report with clear sections and data-driven insights.",
    instructions=["Always include sources", "Use tables to display data"],
    show_tool_calls=True,
    markdown=True,
)

agent_team.print_response(
    "What's the market outlook and financial performance of AI semiconductor companies?",
    stream=True,
)
Bash
pip install duckduckgo-search yfinance
python agent_team.py

Key Features

  • Model Agnostic: Connect to 23+ model providers. No vendor lock-in.
  • Lightning Fast: Agents instantiate in ~3μs and use ~6.5KiB of memory on average.
  • First-Class Reasoning: Reasoning models, ReasoningTools, and custom chain-of-thought are all supported.
  • Natively Multimodal: Accepts and generates text, image, audio, and video natively.
  • Multi-Agent Architecture: Agent Teams with three coordination modes: route, collaborate, coordinate.
  • Agentic Search: 20+ vector database integrations, with hybrid search and re-ranking built in.
  • Long-Term Memory: Plug-and-play storage and memory drivers for persistent sessions.
  • Pre-built API Routes: FastAPI routes to serve Agents, Teams, and Workflows immediately.

Performance

At Pinaxai, performance is a core design constraint. AI workflows can spawn thousands of agents — small inefficiencies compound at scale.

  • Agent instantiation: ~3μs
  • Memory footprint: ~6.5KiB

Tested on an Apple M4 MacBook Pro. Run the evaluation yourself; do not take these numbers at face value for your specific environment.
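One way to run such a measurement yourself, sketched as a generic harness (pass something like `lambda: Agent(...)` as the factory; `Dummy` below is a placeholder, and results will vary by machine and Python version):

```python
import timeit
import tracemalloc

def benchmark(factory, n=10_000):
    # Average wall-clock time per instantiation over n calls.
    per_call = timeit.timeit(factory, number=n) / n
    # Approximate heap memory allocated by a single instance.
    tracemalloc.start()
    obj = factory()
    allocated, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return per_call, allocated

class Dummy:
    # Placeholder class; substitute your own agent factory here.
    def __init__(self):
        self.name = "agent"

secs, nbytes = benchmark(Dummy)
print(f"{secs * 1e6:.2f}μs per instantiation, ~{nbytes} bytes allocated")
```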

Further Resources

Telemetry: Pinaxai logs which model an agent used so we can prioritize updates for the most popular providers. Disable by setting PINAXAI_TELEMETRY=false in your environment.