Introduction

Pinaxai is the runtime for agentic software. Build agents, teams, and workflows. Run them as scalable services. Monitor and manage them in production.

What is Pinaxai?

Pinaxai is a lightweight, model-agnostic library for building production-grade AI agents. It provides three integrated layers:

  • Framework: Build agents, teams, and workflows with memory, knowledge, guardrails, and 100+ integrations.
  • Runtime: Serve your system in production with a stateless, session-scoped FastAPI backend.
  • Control Plane: Test, monitor, and manage your system using the AgentOS UI.

What You Can Build

Pinaxai powers real agentic systems built from the same primitives above.

  • Pal — A personal agent that learns your preferences.
  • Dash — A self-learning data agent grounded in six layers of context.
  • Scout — A self-learning context agent that manages enterprise knowledge.
  • Gcode — A post-IDE coding agent that improves over time.
  • Investment Team — A multi-agent investment committee that debates and allocates capital.

Single agents. Coordinated teams. Structured workflows. All built on one architecture.

Built for Production

Pinaxai runs in your infrastructure, not ours.

  • Stateless, horizontally scalable runtime.
  • 50+ APIs and background execution support.
  • Per-user and per-session isolation.
  • Runtime approval enforcement.
  • Native tracing and full auditability.
  • Sessions, memory, knowledge, and traces stored in your database.

You own the system. You own the data. You define the rules.

Installation

Install Pinaxai using pip. It has minimal dependencies and works out of the box.

Bash
pip install -U pinaxai

Your First Agent

Build and run your first agent in about 20 lines of code. In this guide, you'll build an agent that connects to an MCP server, stores and retrieves past conversations, and runs as a production API.

1. Define the Agent

Save the following as pinax_assist.py:

Python — pinax_assist.py
from pinaxai.agent import Agent
from pinaxai.db.sqlite import SqliteDb
from pinaxai.models.anthropic import Claude
from pinaxai.os import AgentOS
from pinaxai.tools.mcp import MCPTools

pinax_assist = Agent(
    name="Pinax Assist",
    model=Claude(id="claude-sonnet-4-5"),
    db=SqliteDb(db_file="pinaxai.db"),                       # session storage
    tools=[MCPTools(url="https://docs.pinax.com/mcp")],      # Pinax docs via MCP
    add_datetime_to_context=True,
    add_history_to_context=True,                              # include past runs
    num_history_runs=3,                                       # include the last 3 runs
    markdown=True,
)

# Serve via AgentOS — streaming, auth, session isolation, API endpoints
agent_os = AgentOS(agents=[pinax_assist], tracing=True)
app = agent_os.get_app()

You now have: a stateful agent, streaming responses, per-user session isolation, a production-ready API, and tracing — no third-party services required.

2. Run Your AgentOS

  1. Set up your virtual environment
     Bash
     uv venv --python 3.12
     source .venv/bin/activate

  2. Install dependencies
     Bash
     uv pip install -U 'pinaxai[os]' anthropic mcp

  3. Export your Anthropic API key
     Bash
     export ANTHROPIC_API_KEY=sk-***

  4. Run your AgentOS
     Bash
     fastapi dev pinax_assist.py

     Your AgentOS is now running at http://localhost:8000. API docs are available at http://localhost:8000/docs.

Connect to the AgentOS UI

The AgentOS UI connects directly from your browser to your runtime. It lets you test, monitor, and manage your agents in real time.

  1. Open os.pinax.com and sign in.
  2. Click Add new OS in the top navigation.
  3. Select Local to connect to a local AgentOS.
  4. Enter your endpoint URL — default: http://localhost:8000.
  5. Name it (e.g. "Development OS") and click Connect.

You'll see your OS with a live status indicator once connected. Open Chat, select your agent, and start a conversation. Click Sessions in the sidebar to inspect stored conversations.

All session data is stored in your local database. No third-party tracing or hosted memory service is required.

What are Agents?

Agents are AI programs that use tools to accomplish tasks. An agent is a stateful control loop around a stateless model — the model reasons and calls tools in a loop, guided by instructions. Add memory, knowledge, storage, human-in-the-loop, and guardrails as needed.

  • The model provides the intelligence. Connect to 23+ providers with no lock-in.
  • Tools give the agent the ability to take actions: search the web, call APIs, read files.
  • Instructions shape the agent's behavior and output format.
  • Memory and knowledge allow agents to recall past interactions and retrieve relevant context.

  • Agent: A single AI program with tools, memory, and instructions.
  • Team: Multiple agents that work together toward a shared goal.
  • Workflow: Orchestrates agents, teams, and functions through defined steps.

Building Agents

Start simple: a model, tools, and instructions. Once that works, layer in more functionality as needed.

Python — hackernews_agent.py
from pinaxai.agent import Agent
from pinaxai.models.anthropic import Claude
from pinaxai.tools.hackernews import HackerNewsTools

agent = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    tools=[HackerNewsTools()],
    instructions="Write a report on the topic. Output only the report.",
    markdown=True,
)
agent.print_response("Trending startups and products.", stream=True)

Use Agent.print_response() for development — it prints the response in a readable terminal format. For production, use Agent.run() or Agent.arun():

Python
from typing import Iterator
from pinaxai.agent import Agent, RunOutputEvent, RunEvent
from pinaxai.models.anthropic import Claude
from pinaxai.tools.hackernews import HackerNewsTools

agent = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    tools=[HackerNewsTools()],
    instructions="Write a report on the topic. Output only the report.",
    markdown=True,
)

stream: Iterator[RunOutputEvent] = agent.run("Trending products", stream=True)
for chunk in stream:
    if chunk.event == RunEvent.run_content:
        print(chunk.content)

Dynamic Configuration

Callable factories are a first-class pattern for dynamic runtime configuration. Use them to build tools and knowledge from live run context instead of fixed configuration — useful for multi-tenant systems where tools and knowledge sources change per user or task.
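As an illustration of the pattern, here is a minimal sketch in plain Python. The `RunContext` shape and `build_tools` helper are hypothetical stand-ins, not part of the Pinaxai API; the point is that a callable is evaluated per run and returns configuration derived from live context.

```python
from dataclasses import dataclass

@dataclass
class RunContext:
    # Hypothetical run context: whatever identifies the caller at runtime.
    user_id: str
    plan: str = "free"

def build_tools(ctx: RunContext) -> list[str]:
    # Factory invoked per run: derive the toolset from live context
    # instead of hard-coding it at agent construction time.
    tools = ["search_docs"]
    if ctx.plan == "enterprise":
        tools.append("query_warehouse")  # enterprise-only tool
    return tools

# Each run re-evaluates the factory, so two tenants get different tools.
print(build_tools(RunContext(user_id="u1")))                      # → ['search_docs']
print(build_tools(RunContext(user_id="u2", plan="enterprise")))   # → ['search_docs', 'query_warehouse']
```

The same shape works for knowledge sources: a callable that returns the per-tenant knowledge base rather than a fixed one.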

Running Agents

Run your agent by calling Agent.run() or Agent.arun(). The execution flow:

  1. The agent builds context: system message, user message, chat history, memories, session state.
  2. The agent sends this context to the model.
  3. The model responds with either a message or a tool call.
  4. If the model calls a tool, the agent executes it and returns results to the model.
  5. The model processes the updated context and repeats until it produces a final message.
  6. The agent returns this final response to the caller.
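The loop above can be sketched in plain Python. This is a toy illustration with a stubbed model and a hand-rolled message format, not the Pinaxai implementation:

```python
def run_agent(model, tools, user_message, max_steps=10):
    # 1. Build context: system + user messages (history and memories elided).
    messages = [{"role": "system", "content": "You are a helpful agent."},
                {"role": "user", "content": user_message}]
    for _ in range(max_steps):
        # 2-3. Send the context to the model; it replies with text or a tool call.
        reply = model(messages)
        if reply.get("tool_call"):
            # 4. Execute the tool and append the result to the context.
            name, args = reply["tool_call"]
            result = tools[name](**args)
            messages.append({"role": "tool", "content": str(result)})
        else:
            # 5-6. No tool call: this is the final response, returned to the caller.
            return reply["content"]
    raise RuntimeError("max steps exceeded")

# Stub model: first asks for a tool call, then answers with the tool result.
def stub_model(messages):
    if messages[-1]["role"] != "tool":
        return {"tool_call": ("add", {"a": 2, "b": 3})}
    return {"content": f"The answer is {messages[-1]['content']}"}

print(run_agent(stub_model, {"add": lambda a, b: a + b}, "What is 2+3?"))
# → The answer is 5
```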

Basic Execution

Agent.run() returns a RunOutput object, or a stream of RunOutputEvent objects when stream=True:

Python
from pinaxai.agent import Agent, RunOutput
from pinaxai.models.anthropic import Claude
from pinaxai.tools.hackernews import HackerNewsTools
from pinaxai.utils.pprint import pprint_run_response

agent = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    tools=[HackerNewsTools()],
    instructions="Write a report on the topic. Output only the report.",
    markdown=True,
)

response: RunOutput = agent.run("Trending startups and products.")
pprint_run_response(response, markdown=True)

Run Output

Core attributes on the RunOutput object:

  • run_id: The ID of the run.
  • agent_id: The ID of the agent.
  • session_id: The ID of the session.
  • user_id: The ID of the user.
  • content: The response content.
  • content_type: The type of content. For structured output, the Pydantic model class name.
  • reasoning_content: The reasoning content, if reasoning is enabled.
  • messages: The list of messages sent to the model.
  • metrics: Run metrics including token counts and latency.
  • model: The model used for the run.

Streaming

Set stream=True to return an iterator of RunOutputEvent objects:

Python
stream: Iterator[RunOutputEvent] = agent.run("Trending products", stream=True)
for chunk in stream:
    if chunk.event == RunEvent.run_content:
        print(chunk.content)

By default, only RunContent events are streamed. To stream all events — tool calls, reasoning, memory updates — set stream_events=True:

Python
stream = agent.run("Trending products", stream=True, stream_events=True)

for chunk in stream:
    if chunk.event == RunEvent.run_content:
        print(f"Content: {chunk.content}")
    elif chunk.event == RunEvent.tool_call_started:
        print(f"Tool call started: {chunk.tool.tool_name}")
    elif chunk.event == RunEvent.reasoning_step:
        print(f"Reasoning: {chunk.reasoning_content}")

Events Reference

Complete list of events yielded by Agent.run() and Agent.arun():

Core Events

  • RunStarted: Indicates the start of a run.
  • RunContent: Contains model response text as individual chunks.
  • RunContentCompleted: Signals completion of content streaming.
  • RunCompleted: Signals successful completion of the run.
  • RunError: Indicates an error during the run.
  • RunCancelled: Signals that the run was cancelled.

Tool Events

  • ToolCallStarted: Indicates the start of a tool call.
  • ToolCallCompleted: Signals completion of a tool call, including results.

Reasoning Events

  • ReasoningStarted: Indicates the start of the reasoning process.
  • ReasoningStep: Contains a single step in the reasoning chain.
  • ReasoningCompleted: Signals completion of the reasoning process.

Memory Events

  • MemoryUpdateStarted: Indicates the agent is updating its memory.
  • MemoryUpdateCompleted: Signals completion of a memory update.

Control Flow Events

  • RunPaused: Indicates the run has been paused (human-in-the-loop).
  • RunContinued: Signals a paused run has been resumed.
  • PreHookStarted / PreHookCompleted: Pre-run hook lifecycle events.
  • PostHookStarted / PostHookCompleted: Post-run hook lifecycle events.

Custom Events

Create custom events by extending CustomEvent:

Python
from dataclasses import dataclass
from pinaxai.run.agent import CustomEvent
from typing import Optional

@dataclass
class CustomerProfileEvent(CustomEvent):
    customer_name: Optional[str] = None
    customer_email: Optional[str] = None
    customer_phone: Optional[str] = None

Yield custom events from a tool:

Python
from pinaxai.tools import tool

@tool()
async def get_customer_profile():
    yield CustomerProfileEvent(
        customer_name="John Doe",
        customer_email="john.doe@example.com",
        customer_phone="1234567890",
    )
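Custom events then appear in the run's event stream alongside built-in events, so consumers can filter by type. A minimal sketch of that filtering, using local stand-in classes rather than the Pinaxai types:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CustomEvent:
    # Stand-in for pinaxai.run.agent.CustomEvent.
    pass

@dataclass
class CustomerProfileEvent(CustomEvent):
    customer_name: Optional[str] = None
    customer_email: Optional[str] = None

def collect_profiles(stream):
    # Filter the event stream by type; other events pass through untouched.
    return [e.customer_name for e in stream if isinstance(e, CustomerProfileEvent)]

events = [CustomEvent(), CustomerProfileEvent(customer_name="John Doe")]
print(collect_profiles(events))  # → ['John Doe']
```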

Example — Basic Agent

The simplest agent: no tools, no memory, just a model and a description.

Python
from pinaxai.agent import Agent
from pinaxai.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    description="You are an enthusiastic news reporter with a flair for storytelling.",
    markdown=True,
)
agent.print_response("Tell me about a breaking news story from New York.", stream=True)
Bash
pip install pinaxai openai
export OPENAI_API_KEY=sk-xxxx
python basic_agent.py

Example — Agent with Tools

Give the agent access to the web so it retrieves real information instead of hallucinating.

Python
from pinaxai.agent import Agent
from pinaxai.models.openai import OpenAIChat
from pinaxai.tools.duckduckgo import DuckDuckGoTools

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    description="You are an enthusiastic news reporter with a flair for storytelling.",
    tools=[DuckDuckGoTools()],
    show_tool_calls=True,
    markdown=True,
)
agent.print_response("Tell me about a breaking news story from New York.", stream=True)
Bash
pip install duckduckgo-search
python agent_with_tools.py

Example — Agent with Knowledge

Agents can store knowledge in a vector database and use Agentic RAG to retrieve exactly what they need at runtime.

Python
from pinaxai.agent import Agent
from pinaxai.models.openai import OpenAIChat
from pinaxai.embedder.openai import OpenAIEmbedder
from pinaxai.tools.duckduckgo import DuckDuckGoTools
from pinaxai.knowledge.pdf_url import PDFUrlKnowledgeBase
from pinaxai.vectordb.lancedb import LanceDb, SearchType

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    description="You are a Thai cuisine expert.",
    instructions=[
        "Search your knowledge base for Thai recipes.",
        "If the question is better suited for the web, search the web to fill in gaps.",
        "Prefer the information in your knowledge base over web results.",
    ],
    knowledge=PDFUrlKnowledgeBase(
        urls=["https://pinaxai-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
        vector_db=LanceDb(
            uri="tmp/lancedb",
            table_name="recipes",
            search_type=SearchType.hybrid,
            embedder=OpenAIEmbedder(id="text-embedding-3-small"),
        ),
    ),
    tools=[DuckDuckGoTools()],
    show_tool_calls=True,
    markdown=True,
)

if agent.knowledge is not None:
    agent.knowledge.load()

agent.print_response("How do I make chicken and galangal in coconut milk soup?", stream=True)
agent.print_response("What is the history of Thai curry?", stream=True)
Bash
pip install lancedb tantivy pypdf duckduckgo-search
python agent_with_knowledge.py

Example — Multi-Agent Teams

Agents work best with a singular purpose and a narrow set of tools. When the scope grows, split the load across a coordinated team.

Python
from pinaxai.agent import Agent
from pinaxai.models.openai import OpenAIChat
from pinaxai.tools.duckduckgo import DuckDuckGoTools
from pinaxai.tools.yfinance import YFinanceTools
from pinaxai.team import Team

web_agent = Agent(
    name="Web Agent",
    role="Search the web for information",
    model=OpenAIChat(id="gpt-4o"),
    tools=[DuckDuckGoTools()],
    instructions="Always include sources",
    show_tool_calls=True,
    markdown=True,
)

finance_agent = Agent(
    name="Finance Agent",
    role="Get financial data",
    model=OpenAIChat(id="gpt-4o"),
    tools=[YFinanceTools(
        stock_price=True,
        analyst_recommendations=True,
        company_info=True,
    )],
    instructions="Use tables to display data",
    show_tool_calls=True,
    markdown=True,
)

agent_team = Team(
    mode="coordinate",
    members=[web_agent, finance_agent],
    model=OpenAIChat(id="gpt-4o"),
    success_criteria="A comprehensive financial report with clear sections and data-driven insights.",
    instructions=["Always include sources", "Use tables to display data"],
    show_tool_calls=True,
    markdown=True,
)

agent_team.print_response(
    "What's the market outlook and financial performance of AI semiconductor companies?",
    stream=True,
)
Bash
pip install duckduckgo-search yfinance
python agent_team.py

Key Features

  • Model Agnostic: Connect to 23+ model providers. No vendor lock-in.
  • Lightning Fast: Agents instantiate in ~3μs and use ~6.5KiB of memory on average.
  • First-Class Reasoning: Reasoning models, ReasoningTools, and custom chain-of-thought are all supported.
  • Natively Multimodal: Accepts and generates text, image, audio, and video natively.
  • Multi-Agent Architecture: Agent Teams with three coordination modes: route, collaborate, coordinate.
  • Agentic Search: 20+ vector database integrations, with hybrid search and re-ranking built in.
  • Long-Term Memory: Plug-and-play storage and memory drivers for persistent sessions.
  • Pre-built API Routes: FastAPI routes to serve Agents, Teams, and Workflows immediately.

Performance

At Pinaxai, performance is a core design constraint. AI workflows can spawn thousands of agents — small inefficiencies compound at scale.

  • Agent instantiation: ~3μs
  • Memory footprint: ~6.5KiB

Tested on an Apple M4 MacBook Pro. Run the evaluation yourself; do not take these numbers at face value for your specific environment.
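One way to run such a measurement yourself, sketched as a generic harness (pass something like `lambda: Agent(...)` as the factory; `Dummy` below is a placeholder, and results will vary by machine and Python version):

```python
import timeit
import tracemalloc

def benchmark(factory, n=10_000):
    # Average wall-clock time per instantiation over n calls.
    per_call = timeit.timeit(factory, number=n) / n
    # Approximate heap memory allocated by a single instance.
    tracemalloc.start()
    obj = factory()
    allocated, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return per_call, allocated

class Dummy:
    # Placeholder class; substitute your own agent factory here.
    def __init__(self):
        self.name = "agent"

secs, nbytes = benchmark(Dummy)
print(f"{secs * 1e6:.2f}μs per instantiation, ~{nbytes} bytes allocated")
```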

Further Resources

Telemetry: Pinaxai logs which model an agent used so we can prioritize updates for the most popular providers. Disable by setting PINAXAI_TELEMETRY=false in your environment.