Changelog
Pinax now aligns with LanceDB’s latest API, replacing the deprecated table_names() with list_tables() and updating the minimum LanceDB version to 0.26.0. This avoids deprecation-related breakages and keeps integrations stable.
Details
- Action required: upgrade LanceDB to version 0.26.0 or later
- If you call LanceDB directly, update usage to list_tables(); Pinax’s adapter handles this internally
Who this is for: Teams using LanceDB as a vector store and relying on stable, forward-compatible storage integrations.
Human-in-the-loop confirmation now correctly applies to MCP Function tools via toolkit-level settings (e.g., requires_confirmation_tools). This restores predictable approval gates before tool execution, improving safety and oversight.
Details
- Centralized policy control at the toolkit level
- No code changes required to enable HITL confirmation
Who this is for: Organizations enforcing governance, compliance, or risk controls in agentic workflows.
AwsBedrockEmbedder now supports Cohere Embed v4, including configurable output dimensions and multimodal (text + image) embeddings, with async variants. This expands what you can index and search while tuning for cost, latency, and quality.
Details
- Control vector size via output_dimension for performance and cost management
- Operates through AWS Bedrock for governance and consolidated operations
Who this is for: Teams standardizing on Bedrock that need scalable, multimodal semantic search and RAG.
Introduce smarter retrieval with the new AwsBedrockReranker, supporting Cohere Rerank 3.5 and Amazon Rerank 1.0. By scoring and reordering retrieved passages, you can boost precision and reduce noise in generated answers.
Details
- Plug-and-play integration for existing retrieval pipelines
- Convenience classes streamline setup and adoption on AWS
Who this is for: Teams building retrieval-augmented generation on AWS that need higher-quality, production-grade ranking.
Condition steps now support else_steps, allowing you to define a clear alternative path when a condition evaluates to false. This makes complex automations easier to express and maintain without extra workaround steps.
Details
- First-class true/false branching directly in workflows
- Backward compatible; no changes required to existing flows
Who this is for: Teams orchestrating complex, decision-heavy workflows that need clearer control flow and easier maintenance.
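The branching semantics can be sketched in a few lines. This is a toy model of the idea, not Pinax's actual Condition class; the function and parameter names are illustrative only:

```python
def run_condition(condition, steps, else_steps=None, context=None):
    """Run `steps` when condition(context) is truthy, otherwise run
    `else_steps`. Mirrors the first-class true/false branching that
    else_steps adds to Condition steps."""
    branch = steps if condition(context) else (else_steps or [])
    return [step(context) for step in branch]


results = run_condition(
    condition=lambda ctx: ctx["amount"] > 1000,
    steps=[lambda ctx: "escalate_to_human"],
    else_steps=[lambda ctx: "auto_approve"],
    context={"amount": 250},
)
print(results)  # ['auto_approve']
```

Without else_steps, expressing the false branch required an inverted second condition step; with it, both paths live in one place.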
Async text_reader.aread() now returns an empty list ([]) for empty files, aligning behavior with the sync API. This removes special-case handling and simplifies downstream pipelines.
Details
- Consistent return types across sync and async code paths
- Reduced edge-case logic and clearer semantics for job orchestration
- Action required: Update any logic that expects a placeholder empty document
Who this is for: Teams running ingestion pipelines or ETL workflows that need predictable document handling.
WebsiteReader now computes a unique content hash per crawled URL, fixing skip_if_exists for multi-page crawls. This ensures accurate per-page deduplication, reduces redundant ingestion, and saves processing cost during re-crawls.
Details
- Correct per-page deduplication for predictable skip_if_exists behavior
- Fewer unnecessary writes and tokens when re-indexing multi-page sites
- Action required: Clear existing website crawl entries in your knowledge store before re-indexing to avoid duplicates
Who this is for: Teams maintaining search indexes, documentation portals, or knowledge bases sourced from websites.
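The per-page deduplication model can be illustrated with a short, self-contained sketch: hash each page's normalized content, then re-index only URLs whose hash changed. Names here are illustrative and independent of WebsiteReader's internals:

```python
import hashlib


def content_hash(page_text: str) -> str:
    """Stable hash of one page's content, with whitespace normalized
    so a reflowed page does not register as changed."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def pages_to_index(crawled: dict, known_hashes: dict) -> list:
    """Return URLs whose content changed since the last crawl,
    i.e. skip_if_exists semantics applied at page granularity."""
    return [
        url for url, text in crawled.items()
        if known_hashes.get(url) != content_hash(text)
    ]


known = {"https://example.com/a": content_hash("hello world")}
crawl = {
    "https://example.com/a": "hello  world",   # same content, reflowed
    "https://example.com/b": "brand new page",  # never seen before
}
print(pages_to_index(crawl, known))  # ['https://example.com/b']
```

Because hashes are keyed per URL rather than per crawl, a re-crawl that touches one updated page re-ingests only that page.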
We’ve added a first-class SeltzTools toolkit that brings Seltz-powered semantic search directly into Pinax. Teams can now plug high-quality semantic retrieval into Agents and Workflows without building custom adapters, improving response relevance and cutting integration time from days to minutes.
Details
- Standard Tool interface works seamlessly across Agents and Workflows for consistent, composable search.
- Reduces maintenance by relying on a supported integration instead of bespoke connectors.
- Quick start:
- pip install seltz
- Set SELTZ_API_KEY in your environment
Who this is for: Teams building RAG assistants, enterprise search, or knowledge-heavy automation that want robust semantic retrieval with minimal integration effort.
Learning is now simpler and more effective. When learning=True, user memory is enabled by default, and the LearnedKnowledgeStore captures organizational context (goals, constraints, policies) to guide agent behavior. We also improved prompts, streamlined tool parameter handling, and updated status messages. New quickstart cookbooks help teams adopt faster.
Details
- Faster time-to-value with sensible defaults — no extra setup to persist memory
- Better outcomes via richer organizational context and improved prompt quality
- Reduced integration effort with simpler tool parameter handling
- Clearer operational visibility with improved status text and cookbooks
Who this is for: Teams piloting or scaling learning agents that need strong governance signals, faster setup, and consistent outcomes.
Pinax now supports Moonshot.ai as a model provider with initial models and examples to help you get started quickly. This broadens your options for performance/cost trade-offs and lets you evaluate or deploy Moonshot models using the same configuration patterns you use today. The provider integrates seamlessly, so you can swap or A/B test models without refactoring agents or workflows.
Details
- Standardized configuration and invocation across providers
- Ready-to-use examples to accelerate evaluation and onboarding
- Compatible with Agents, Tools, and Workflows
Who this is for: Teams optimizing their model portfolio for accuracy, latency, budget, or regional availability.
We added UnsplashTools, a first-class toolkit for discovering and retrieving high-quality, royalty-free images directly in Pinax. Teams can now search, fetch by ID, request a random image, and download assets without building or maintaining custom integrations. This streamlines image sourcing across agents and workflows, reduces time-to-value for media-heavy features, and lowers ongoing integration overhead.
Details
- Turnkey tools: search_photos, get_photo, get_random_photo, download_photo
- Consistent interface usable from agents and workflows
- Eliminates custom API wrappers and reduces maintenance
Who this is for: Product, content, and AI assistant teams needing on-demand images for generation, prototyping, or production experiences.
We introduced a dedicated ExcelReader for .xls/.xlsx with sheet filtering, options to skip hidden sheets, and chunking controls. ReaderFactory now routes Excel files to ExcelReader automatically. This eliminates CSV conversions and reliance on CSVReader, reducing setup time and avoiding common formatting pitfalls. Teams gain more predictable ingestion of large workbooks and can tune performance and cost via chunk sizing.
Details
- Automatic routing of .xls/.xlsx to ExcelReader; minimal code changes for common cases
- Include/exclude specific sheets and optionally skip hidden tabs to control what’s ingested
- Chunking controls to handle large files reliably and at scale
- Migration: Projects that used CSVReader for Excel should switch to ExcelReader and install the extra: pip install "pinaxai[excel]"
Who this is for: Teams ingesting spreadsheets into knowledge bases or agent workflows; platform owners standardizing document ingestion.
A fix restores reliable table creation across AsyncSQLiteDb, AsyncPostgresDb, AsyncMySQLDb, and FirestoreDb. This removes a blocker that could prevent schema setup during initialization, improving startup reliability and reducing manual intervention across environments.
Details
- Unblocks table creation during provisioning and cold starts
- Applies consistently across multiple async backends in one upgrade
- No application changes required
Who this is for: Teams using async database backends for storage who need predictable deployment and operations.
We introduced OpenAI Responses API–compatible clients, including a base OpenResponses and provider-specific clients for Ollama and OpenRouter. This gives teams a consistent request/response schema across local and hosted models, simplifying migrations and reducing provider-specific branching. The result is faster adoption, cleaner integrations, and more flexibility to switch or mix models without refactoring.
Details
- One API shape across multiple providers for better portability and governance
- Supports self-hosted (Ollama) and hosted marketplaces (OpenRouter)
- No breaking changes — upgrade and start using Responses-compatible clients
Who this is for: Platform teams running hybrid model stacks and organizations seeking vendor flexibility with minimal integration overhead.
Knowledge now connects to private Azure Blob Storage as a first-class source — alongside SharePoint and GitHub — so Azure-centric organizations can centralize content without custom ETL. This enables teams to index documents securely from private containers and make them available to agents and workflows for retrieval-augmented generation and search.
Details
- Works with private Azure Blob Storage containers under your existing access controls
- Parity with existing SharePoint and GitHub loaders for consistent operations
- Reduces setup time and ongoing maintenance for Azure-first environments
Who this is for: Teams standardizing on Azure that need governed, scalable ingestion for internal content.
Async generator tools now capture and surface errors on the tool call, matching synchronous behavior, instead of re-raising exceptions. This delivers more predictable orchestration and fewer unexpected failures in long-running or streaming tool workflows. If your implementation relied on exceptions being thrown, update handlers accordingly.
Details
- Aligns async error handling with sync tools for consistent behavior
- Reduces unexpected cancellations caused by unhandled async exceptions
- Improves reliability in streaming and long-running workflows
Who this is for: Teams building automation with async or streaming tools.
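The capture-instead-of-raise pattern looks roughly like this. A stdlib-only sketch of the behavior, not Pinax's actual tool runner; the result shape is assumed for illustration:

```python
import asyncio


async def run_tool_stream(gen):
    """Drain an async generator tool, recording any exception on the
    result instead of re-raising it to the caller."""
    chunks, error = [], None
    try:
        async for chunk in gen:
            chunks.append(chunk)
    except Exception as exc:  # surfaced on the tool call, not re-raised
        error = str(exc)
    return {"chunks": chunks, "error": error}


async def flaky_tool():
    yield "partial output"
    raise RuntimeError("upstream timeout")


result = asyncio.run(run_tool_stream(flaky_tool()))
print(result)  # {'chunks': ['partial output'], 'error': 'upstream timeout'}
```

Orchestrators inspect the error field on the result rather than wrapping every streamed tool in try/except.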
We corrected streaming token accounting for Perplexity by collecting usage only on the final chunk for providers that return cumulative metrics. This change prevents inflated token counts so your dashboards, budgets, and alerts reflect actual usage.
Details
- More accurate token and cost metrics for streaming responses
- Historical comparisons may show a step change; adjust thresholds as needed
- No application changes required
Who this is for: Platform, FinOps, and observability teams tracking model usage and spend.
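The accounting fix is easiest to see with simulated chunks. When a provider reports cumulative usage on every streamed chunk, the final chunk already holds the totals, so summing per chunk inflates the count. A stdlib sketch with an assumed chunk shape:

```python
def usage_from_stream(chunks):
    """For providers that report *cumulative* usage on each streamed
    chunk, only the last reported figures are the true totals."""
    last = None
    for chunk in chunks:
        if chunk.get("usage"):
            last = chunk["usage"]
    return last


stream = [
    {"delta": "Hel", "usage": {"total_tokens": 3}},
    {"delta": "lo",  "usage": {"total_tokens": 5}},  # cumulative, not per-chunk
    {"delta": "",    "usage": {"total_tokens": 6}},
]
naive = sum(c["usage"]["total_tokens"] for c in stream)  # 14, inflated
print(usage_from_stream(stream))  # {'total_tokens': 6}
```

The naive sum here reports more than double the real six tokens, which is exactly the dashboard inflation this release removes.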
Knowledge can now ingest content from private GitHub repositories and SharePoint, via SDK and API. This enables organizations to consolidate code, docs, and operational knowledge from private systems while maintaining governance, reducing manual exports, and improving coverage for enterprise RAG and analytics.
Details
- Supports authenticated access to private GitHub and SharePoint sources
- Preserves structure and basic metadata to enhance retrieval relevance
- Reduces integration effort by using a single ingestion pathway
Who this is for: Enterprises with critical content in private repos and SharePoint who need secure, governed ingestion.
Knowledge now natively ingests Excel files by routing spreadsheets through the CSV reader. Each sheet is parsed into its own document with sheet-level metadata and normalized cell content. This removes manual pre-processing steps and makes enterprise spreadsheet data immediately searchable and useful in retrieval workflows.
Details
- Each sheet becomes a separate, metadata-rich document for targeted retrieval
- Normalized cell content improves parsing, chunking, and search quality
- Works via SDK and API with no special configuration
Who this is for: Knowledge and data platform teams standardizing enterprise documents for RAG and search.
We’ve added n1n.ai as an OpenAI-compatible provider, giving teams more flexibility to optimize for cost, performance, and regional availability. The new model class and cookbook examples make it simple to adopt N1N with minimal changes, enabling vendor diversification without rework across your stack.
Details
- OpenAI-compatible semantics reduce switching costs and integration risk
- Cookbook examples accelerate rollout across Agents, Tools, and Workflows
- No changes required to existing workflows beyond selecting the provider
Who this is for: Platform teams pursuing a multi-model strategy or looking for cost and supply redundancy.
AgentOS now uses a unified db parameter and deprecates tracing_db. This reduces configuration complexity and clarifies data storage for both operational and tracing needs.
Details
- Replace tracing_db with db in configuration
- Aligns all AgentOS data under a single, explicit database setting
Who this is for: Platform teams managing AgentOS deployments who want simpler, less error-prone configuration.
Knowledge.add_content has been renamed to insert and insert_many for clarity and alignment with the new protocol direction. The change improves semantic consistency and makes batch operations explicit.
Details
- Replace add_content with insert (single) or insert_many (batch)
- No behavioral changes — just clearer method names
Who this is for: Developers ingesting data into Knowledge who want clean, consistent APIs.
We replaced the DuckDuckGo-specific web search tool with a generic WebSearchTools interface. This standardization broadens provider choice and future-proofs search integrations.
Details
- Update any DDG-specific references to the new WebSearchTools
- Switch providers without redesigning your agents in the future
Who this is for: Teams embedding web search into agents who want provider flexibility and a stable interface.
We removed deprecated fields across tools/hooks and API parameters to simplify the surface area and reduce ambiguity. This change keeps the platform focused and easier to maintain at scale.
Details
- Some integrations may require minor updates to align with the current API
- Review custom tools and hooks to replace deprecated parameters
Who this is for: Teams with custom tools or hooks who need a stable, predictable API surface.
We resolved a 400 error caused by message formatting for file Part objects in Gemini (Vertex AI) uploads. Uploads now work as expected, unblocking multimodal use cases.
Details
- No action required; existing integrations resume normal behavior
- Applies to Gemini file inputs via Vertex AI
Who this is for: Teams relying on Gemini for multimodal processing and document-aware workflows.
Gemini now accepts gs:// URIs and HTTPS URLs (including presigned URLs) directly, eliminating the need to download files before processing. This reduces operational overhead and speeds up multimodal workflows, especially for large assets.
Details
- Pass GCS and external HTTPS sources without intermediate storage
- Reduces data handling, infra footprint, and latency
- Opt-in; no changes required to existing flows
Who this is for: Teams on GCP and security-conscious organizations that prefer presigned URLs and minimal data movement.
You can now persist and manage Agent, Team, and Workflow definitions in a database, with new AgentOS endpoints for programmatic create, read, update, and delete. This consolidates configuration, reduces sprawl, and makes it easier to automate promotion across environments.
Details
- Consistent, API-driven management of component definitions
- Simplifies deployments, environment parity, and CI/CD integration
- No migration required; adopt incrementally
Who this is for: Platform and MLOps teams standardizing how they define, version, and roll out AI system components.
We introduced KnowledgeProtocol, a unified interface that enables multiple Knowledge backends to work interchangeably with Agents and Teams. The default Knowledge implementation now conforms to this protocol, opening the door to alternative stores without changing your agent logic.
Details
- Standardizes how Knowledge is integrated, improving portability and vendor choice
- Existing projects continue to work; adopt new backends when ready
Who this is for: Teams that need to bring their own vector DB or conform to enterprise data platforms without refactoring agents.
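The protocol idea can be sketched with typing.Protocol: any backend exposing the agreed methods is accepted wherever Knowledge is expected. The protocol name, method set, and in-memory store below are illustrative assumptions, not KnowledgeProtocol's actual definition:

```python
from typing import Iterable, Protocol


class KnowledgeLike(Protocol):
    """Stand-in for the protocol idea: structural typing means any
    backend with these methods is interchangeable."""
    def insert(self, text: str) -> None: ...
    def search(self, query: str) -> Iterable[str]: ...


class InMemoryKnowledge:
    def __init__(self):
        self._docs = []

    def insert(self, text: str) -> None:
        self._docs.append(text)

    def search(self, query: str):
        return [d for d in self._docs if query.lower() in d.lower()]


def answer(knowledge: KnowledgeLike, query: str) -> list:
    # Agent logic depends only on the protocol, never a concrete store.
    return list(knowledge.search(query))


kb = InMemoryKnowledge()
kb.insert("Pinax supports hybrid search")
print(answer(kb, "hybrid"))  # ['Pinax supports hybrid search']
```

Swapping InMemoryKnowledge for a vector-DB-backed implementation leaves answer() untouched, which is the portability the protocol buys.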
We’ve introduced request-scoped isolation for agents, teams, and workflows. Each incoming request now runs against a fresh copy of the component while expensive resources (database connections, models, MCP tools) are safely shared. This eliminates cross-request state leakage, reduces race conditions, and delivers consistent results under load. New factory helpers — get_agent_for_request, get_team_for_request, and get_workflow_for_request — simplify adoption with minimal code changes. This upgrade strengthens reliability for concurrent and multi-tenant deployments without breaking existing integrations.
Details
- Deterministic behavior via deep-copy isolation per request
- Shared heavy resources keep latency and cost in check
- Updated routers and extensive tests for confidence
- Drop-in helpers standardize lifecycle management
Who this is for: Platform and product teams operating agents at scale, especially in multi-tenant or high-concurrency environments requiring predictable execution and low operational overhead.
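The isolation model is simple to sketch: deep-copy the per-request state, re-attach the heavy resources by reference. The helper below reuses the get_agent_for_request name from this release but its signature and the Agent shape are illustrative assumptions:

```python
import copy


class Agent:
    def __init__(self, db):
        self.db = db        # heavy shared resource (connection, model, ...)
        self.history = []   # per-request mutable state


def get_agent_for_request(template):
    """Deep-copy the component for one request while sharing the
    expensive resource instead of duplicating it."""
    shared_db = template.db
    template.db = None                   # keep the copy cheap
    fresh = copy.deepcopy(template)
    template.db = fresh.db = shared_db   # share, don't duplicate
    return fresh


base = Agent(db=object())
a, b = get_agent_for_request(base), get_agent_for_request(base)
a.history.append("request-1 message")
print(b.history)     # [] -> no cross-request leakage
print(a.db is b.db)  # True -> the connection is shared
```

Mutations made while serving one request stay on that request's copy, while both copies still point at the same database handle.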
We now classify common non-retryable conditions (e.g., 4xx responses, payload too large, context limit exceeded) and skip retries across both sync and async flows. This delivers faster failure signals, lower compute spend, and clearer logs — improving reliability without any changes to your code.
Details
- Consistent behavior across orchestration paths and providers
- Automatic optimization; no configuration required
Who this is for: Teams running production LLM workloads at scale who want to minimize wasted cycles and speed up incident triage.
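A minimal classifier conveys the idea. This sketch is not Pinax's implementation; in particular, carving out 429 (rate limit) as the one retryable 4xx is this sketch's assumption, and the marker strings are illustrative:

```python
NON_RETRYABLE_MARKERS = ("payload too large", "context limit", "context length")


def should_retry(status_code=None, message=""):
    """Fail fast on permanent errors: 4xx responses and known
    non-retryable conditions skip the retry loop entirely."""
    if status_code is not None and 400 <= status_code < 500:
        return status_code == 429  # rate limits are worth retrying
    msg = message.lower()
    return not any(marker in msg for marker in NON_RETRYABLE_MARKERS)


print(should_retry(status_code=400))   # False: permanent client error
print(should_retry(status_code=429))   # True: rate limit
print(should_retry(status_code=503))   # True: transient server error
print(should_retry(message="Context limit exceeded"))  # False
```

Skipping hopeless retries is where the faster failure signals and lower compute spend come from.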
A new AST-based Code Chunker splits code into semantically meaningful units, preserving function and class boundaries across multiple languages and tokenizer options. This improves retrieval and embedding relevance for code RAG and analysis, reduces token waste, and eliminates the need for custom chunking logic.
Details
- Language-agnostic AST parsing for structured, coherent chunks
- Configurable tokenizer settings to align with your model choices
- Drop-in adoption for existing ingestion and retrieval pipelines
Who this is for: Teams building code-aware RAG, search, review assistants, static analysis, and compliance workflows that require precise code understanding.
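A toy version of the approach, using the stdlib ast module to cut Python source at top-level function and class boundaries. The shipped chunker is multi-language and tokenizer-aware; this sketch only shows why AST boundaries beat fixed-size splits:

```python
import ast


def chunk_python_source(source: str):
    """Split Python source into chunks that each hold one complete
    top-level function or class, so no chunk ends mid-definition."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append("\n".join(lines[node.lineno - 1:node.end_lineno]))
    return chunks


code = '''\
def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hi"
'''
for chunk in chunk_python_source(code):
    print(chunk, end="\n---\n")
```

Each chunk is a self-contained definition, which keeps embeddings coherent and avoids wasting tokens on fragments of two unrelated functions.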
We introduced a unified learning system that enables agents to learn from every interaction. Teams can choose learning types and plug in preferred storage backends, making continuous improvement a first-class capability without custom scaffolding. This reduces manual tuning, accelerates time-to-value, and provides consistent controls for how knowledge is captured and retained across agents and workflows.
Details
- Configurable learning modes and retention policies to fit governance and cost requirements
- Works across agents and workflows, with pluggable storage backends for flexibility
- Low-ops adoption: enable learning without restructuring existing implementations
Who this is for: Platform teams scaling agent experiences that benefit from personalization, long-term context, and continuous improvement.
Crawl4aiTools now supports proxy_config via BrowserConfig, allowing traffic to route through enterprise proxies and enabling browser-level network configuration. This removes a common blocker for deployments in egress-controlled networks and makes crawling behavior predictable across environments.
Details
- Configure proxies centrally via BrowserConfig for consistent network behavior
- Simplifies deployment in VPCs and corporate networks with mandatory outbound proxies
Who this is for: Enterprises operating behind corporate proxies and teams standardizing network egress for web-crawling workloads.
This release introduces a breaking change: PythonTools and MLXTranscribeTools now operate only within their defined base directory by default. Workloads that previously accessed arbitrary filesystem paths will be constrained unless updated. The change improves security posture and prevents unintended file access.
Details
- To maintain broader access, set restrict_to_base_dir=False or expand the base directory to include required paths
- Provides stronger guardrails with minimal configuration overhead
Who this is for: Teams upgrading existing workloads that rely on cross-directory file access and need clear migration steps.
We introduced a restrict_to_base_dir parameter for PythonTools and MLXTranscribeTools, enabled by default. Tools now operate within a contextual base directory, minimizing blast radius and protecting local or mounted data during execution.
Details
- On by default: tools read/write only within their base directory
- Opt out per tool by setting restrict_to_base_dir=False
- Adjust the base directory to allow intended paths while maintaining isolation
Who this is for: Security-conscious teams, multi-tenant deployments, and anyone running tools on shared infrastructure.
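The guardrail boils down to a path-containment check. A stdlib sketch of the idea, reusing the restrict_to_base_dir name from this release but with an illustrative function signature:

```python
from pathlib import Path


def resolve_in_base(base_dir, requested, restrict_to_base_dir=True):
    """Resolve a requested path and refuse anything that escapes the
    base directory, unless the caller explicitly opts out."""
    base = Path(base_dir).resolve()
    target = (base / requested).resolve()
    if restrict_to_base_dir and not target.is_relative_to(base):
        raise PermissionError(f"{requested!r} escapes base dir {base}")
    return target


print(resolve_in_base("/tmp/workspace", "notes/todo.txt").name)  # todo.txt
try:
    resolve_in_base("/tmp/workspace", "../../etc/passwd")
except PermissionError as exc:
    print("blocked:", exc)
```

Resolving before comparing is what defeats ../ traversal; simply prefix-matching the raw string would not.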
Provider usage metrics (including token counts) are now propagated to the model response in both sync and async paths. This ensures reliable cost tracking, quota enforcement, and observability without custom plumbing.
Details
- Uniform access to usage data across sync and async responses
- Simplifies building budgets, alerts, and chargeback reporting
- No migration or code changes required
Who this is for: Platform owners, FinOps, and engineering teams who track spend, quotas, or performance across models.
Toolkit now supports async tool functions and automatically selects them when an agent runs in an async context. This delivers lower latency and higher throughput for concurrent workloads, while removing boilerplate required to manage sync/async paths manually.
Details
- Automatic selection of async tools in async runs; no code changes required
- Improves responsiveness and resource efficiency under load
- Works alongside existing tools without migration
Who this is for: Teams building high-concurrency agents, streaming experiences, or serverless workloads that benefit from end-to-end async execution.
When a URL is provided, MCPTools now default to StreamableHttp transport. This makes it easier to connect to external MCP servers and improves streaming behavior out of the box, reducing configuration overhead.
Details
- Better defaults for modern streaming workflows and real-time interactions
- Fewer setup steps when integrating with third-party MCP servers
- To retain previous behavior, explicitly set your preferred transport
Who this is for: Teams integrating MCP servers that want faster setup and more reliable streaming by default.
JWTMiddleware now supports a configurable audience parameter to validate the aud claim. This ensures tokens are intended for your services, reducing the risk of token replay or misrouting and strengthening your zero-trust posture.
Details
- Enforce audience verification without changing existing token flows
- Compatible with major identity providers and standard JWT libraries
- Non-breaking change; enable when ready by configuring your expected audience
Who this is for: Security-conscious teams and enterprises running production workloads with strict auth requirements.
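The check itself is small. Per RFC 7519 the aud claim may be a single string or a list of strings, and this stdlib sketch (operating on already-decoded claims, not raw tokens) handles both; it illustrates the verification JWTMiddleware performs once an audience is configured:

```python
def check_audience(claims: dict, expected_audience: str) -> bool:
    """Return True if the decoded token's aud claim includes the
    audience this service expects (string or list-of-strings form)."""
    aud = claims.get("aud")
    if aud is None:
        return False  # no audience claim: reject when audience is enforced
    audiences = aud if isinstance(aud, list) else [aud]
    return expected_audience in audiences


print(check_audience({"sub": "u1", "aud": "pinax-api"}, "pinax-api"))             # True
print(check_audience({"sub": "u1", "aud": ["billing", "pinax-api"]}, "pinax-api"))  # True
print(check_audience({"sub": "u1"}, "pinax-api"))                                  # False
```

A token minted for another service fails this check even if its signature is valid, which is what closes the misrouting gap.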
Pinax now supports native reasoning for OpenAI GPT-5.1/5.2, Google Gemini 3/3.5/deepthink, and DeepSeek r1/reasoner. This expands your model choices while preserving a consistent interface — making it easier to optimize for accuracy, latency, or cost without refactoring.
Details
- Standardized reasoning interface simplifies A/B testing and fallback strategies
- Unlocks provider flexibility for regional, compliance, and pricing needs
- No migration required; adopt new models as drop-in options
Who this is for: Teams optimizing cost-performance across providers and those standardizing on a reasoning-first development approach.
You can now connect to and orchestrate remote agents via A2A using the new A2AClient, with cookbook examples to get started. This unlocks scale-out and cross-boundary scenarios — such as running agents in separate processes, hosts, or partner environments — including support for Google ADK agents on AgentOS (beta).
Details
- Execute remote agents with a consistent, local-like interface
- Improve isolation, resiliency, and resource utilization by distributing workloads
- Cookbook and examples reduce setup time and operational risk
Who this is for: Enterprises coordinating agents across services or networks and teams integrating partner- or vendor-hosted agents.
MCPTools and MultiMCPTools now support a header_provider callback to generate request headers at run time. This enables per-user and per-tenant authentication without custom plumbing, supports short-lived credentials, and simplifies compliance with enterprise security standards.
Details
- Issue per-run, per-user tokens without forking or duplicating tool definitions
- Reduce integration overhead for multi-tenant deployments and token rotation
- Works across multiple MCP endpoints with consistent behavior
Who this is for: SaaS platforms and internal developer platforms that need strong isolation and policy-driven auth for MCP-based integrations.
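The callback pattern looks roughly like this. Token minting below is a stand-in; real deployments would call their identity provider, and everything except the header_provider name is an illustrative assumption:

```python
import time


def make_header_provider(tenant_id):
    """Build a header_provider-style callback that mints short-lived,
    tenant-scoped headers each time it is invoked."""
    def header_provider():
        # Fake short-lived token; substitute a real credential fetch.
        token = f"{tenant_id}-{int(time.time())}"
        return {
            "Authorization": f"Bearer {token}",
            "X-Tenant-Id": tenant_id,
        }
    return header_provider


provider = make_header_provider("acme-corp")
headers = provider()          # invoked at request time, not at setup time
print(headers["X-Tenant-Id"])  # acme-corp
```

Because the callback runs per request, rotated or freshly minted credentials flow through automatically without redefining the tool.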
We introduced a first-class Skills system, including a Skills class plus validation and loader utilities. Teams can now define, validate, and reuse skills across agents with a consistent interface. This reduces boilerplate, accelerates onboarding, and improves governance by making capabilities explicit and testable.
Details
- Validate skills at load time to catch issues early and reduce runtime failures
- Load local, versioned skills for consistent behavior across environments
- Examples and tests included to speed adoption and standardize usage
Who this is for: Platform teams building shared capability catalogs and organizations that need consistent, auditable agent behaviors.
Agent-as-Judge evaluation runs are now returned on GET endpoints, making them fully visible and manageable in the AgentOS UI. This gives teams end-to-end observability of evaluation pipelines, improves governance with auditable results, and reduces time-to-triage when diagnosing model or agent behavior.
Details
- Retrieve status, scores, and metadata for evaluation runs via read APIs
- Monitor, filter, and drill into evaluations directly in the AgentOS UI
- Backward-compatible; no workflow changes required to start seeing results
Who this is for: Platform, MLOps, and QA teams validating agent behavior and benchmarking models at scale.
We overhauled the getting-started cookbook with structured examples, ready-to-use configs, and clear requirements. New projects reach first value faster, with fewer setup errors and better alignment to the latest APIs and patterns.
Details
- End-to-end templates that demonstrate common agent, tool, and workflow scenarios.
- Copy-paste configurations for typical environments reduce integration time.
- Up-to-date guidance minimizes rework and accelerates team ramp-up.
Who this is for: New adopters, solution engineers, and teams scaling Pinax across multiple projects.
Pinax’s LiteLLM integration now extracts and surfaces reasoning_content for supported models, enabling richer, audit-ready reasoning traces. Teams gain better visibility into model behavior for debugging, evaluation, and governance — without changing application logic.
Details
- Structured reasoning signals are available through standard responses when using the LiteLLM gateway.
- Enhances experiment design, incident analysis, and compliance reviews with traceable model steps.
- Works across compatible reasoning models supported by LiteLLM.
Who this is for: Teams standardizing on LiteLLM who need stronger tracing for reliability engineering, model evaluation, and oversight.
We introduced an async-capable cancellation manager with in-memory and Redis-backed options. This lets you reliably stop long-running or runaway work across distributed workers, improving cost control and adherence to SLAs without adding orchestration complexity.
Details
- Redis-backed manager coordinates cancellation across multiple nodes; in-memory remains available for local and single-node use.
- Bring your own implementation via the new public API to standardize cancellation with your existing infrastructure.
- Non-disruptive adoption; defaults remain unchanged.
Who this is for: Platform and SRE teams running distributed agents/workflows who need predictable termination, cost containment, and safer rollback scenarios.
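A single-node sketch of the cancellation-manager idea; a Redis-backed variant would keep the cancelled set in Redis so every worker observes it. Class and method names are illustrative, not the actual public API:

```python
class InMemoryCancellationManager:
    """Track cancelled run ids; workers poll between units of work."""

    def __init__(self):
        self._cancelled = set()

    def cancel(self, run_id):
        self._cancelled.add(run_id)

    def is_cancelled(self, run_id):
        return run_id in self._cancelled


manager = InMemoryCancellationManager()


def run_job(run_id, steps):
    done = []
    for step in steps:
        if manager.is_cancelled(run_id):  # checked between steps
            return done, "cancelled"
        done.append(step)
        if step == "step-2":
            manager.cancel(run_id)  # simulate an external cancel signal
    return done, "completed"


print(run_job("run-42", ["step-1", "step-2", "step-3"]))
# (['step-1', 'step-2'], 'cancelled')
```

Cooperative checks between steps are what make termination predictable: work stops at the next safe boundary rather than mid-operation.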
You can now pass Google OAuth2 service account credentials directly when configuring Vertex AI models. This removes reliance on ambient credentials and gives platform teams precise control over how agents authenticate to Google Cloud, improving security posture and simplifying deployments across environments.
Details
- Accepts google.oauth2.service_account.Credentials for direct Vertex AI authentication
- Enables per-environment and per-agent credential isolation for stronger governance
- Streamlines CI/CD, serverless, and multi-project setups without additional scaffolding
- Additive change with no breaking impact or migration required
Who this is for: Platform, security, and MLOps teams standardizing on service accounts, especially in regulated or multi-tenant environments.
SemanticChunking now works with all Pinax embedders (e.g., Azure OpenAI, Mistral) and custom chonkie BaseEmbeddings via a wrapper, with new parameters for finer control. This expands model choice, helps optimize cost/latency, and reduces vendor lock-in without refactoring pipelines.
Details
- Plug in your preferred embedding provider with minimal configuration
- Tune chunk sizes and thresholds to match corpus and performance goals
- Maintain consistent chunking strategies across environments
Who this is for: RAG builders and platform teams optimizing retrieval quality and TCO.
Workflow event streams now support robust reconnection, catch-up, and replay. Clients automatically resume from the last known event after transient network issues, preventing gaps in dashboards, human-in-the-loop experiences, and downstream automations.
Details
- Event buffering and replay ensure continuity without manual intervention
- Backoff and resubscribe logic reduce dropped events and duplicate handling
- No changes required to existing workflows
Who this is for: Teams running long-lived or interactive workflows that require consistent real-time updates.
AgentOSClient is a first-class client for connecting to and operating a remote AgentOS. It standardizes how you authenticate, manage agents/teams/workflows, and stream events, reducing integration effort and operational risk while accelerating time-to-value.
Details
- Production-ready patterns with examples and tests to speed adoption
- Consistent error handling and simplified remote operations
- Fits CI/CD and service-to-service integrations without bespoke tooling
Who this is for: Platform owners and integrators who need a reliable, supported way to manage AgentOS remotely.
Introducing RemoteAgent, RemoteTeam, and RemoteWorkflow to execute orchestration on a remote AgentOS. This decouples runtime from application code so you can centralize governance and observability, isolate workloads for security and compliance, and scale horizontally without increasing client complexity.
Details
- Maintain the same agent and workflow definitions; no migration needed
- Run close to data for lower latency and better utilization
- Standard APIs for consistent operations across environments
Who this is for: Platform and infrastructure teams operating multi-tenant or regulated environments, or deploying across cloud and on-prem.
Hybrid search combines dense semantic similarity with keyword matching using reciprocal rank fusion (RRF) for Chroma-backed knowledge bases. This delivers more relevant results across diverse content, especially for queries with rare terms, acronyms, or exact phrases, improving answer quality and reducing false negatives in production RAG systems.
Details
- Works with existing Chroma stores; no schema or migration required
- Balances lexical and semantic signals for robust top-k retrieval
- Improves consistency across varied content types and edge-case queries
Who this is for: Teams running RAG, internal search, or support automation that need dependable retrieval quality at scale.
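The fusion step behind this feature is reciprocal rank fusion, which is simple enough to sketch directly. The `rrf_fuse` helper below is illustrative, not the Pinax internal; `k = 60` is the conventional RRF constant:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Reciprocal rank fusion: each document scores sum(1 / (k + rank)),
    # accumulated over every ranking it appears in (rank is 1-based).
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse a dense (semantic) ranking with a keyword (lexical) ranking.
dense   = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_c", "doc_a", "doc_d"]
fused = rrf_fuse([dense, keyword])
```

Documents that rank well in both lists rise to the top, while a document found by only one signal (a rare acronym hit, say) still makes the fused list rather than being dropped.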
We resolved issues in read and async_read across multiple readers (CSV, field‑labeled CSV, JSON, Markdown, PDF, DOCX, PPTX, S3, Text, and Web Search). Pipelines now ingest documents and data consistently in both synchronous and asynchronous modes, reducing failures, retries, and operational noise.
Details
- Restores parity between read and async_read for predictable behavior and outputs
- Stabilizes ingestion from popular file formats, S3, and web sources used in production
- No code changes required; upgrade to benefit immediately
Who this is for: Teams building knowledge bases, ETL/ingestion pipelines, and retrieval workflows that rely on diverse document sources or high‑throughput async processing.
When both JWT and security key authentication are enabled, JWT now takes precedence. This standardizes behavior, reduces ambiguity for clients, and aligns with common enterprise security practices.
Details
- No change if only one method is in use
- For deployments using both, ensure clients present a valid JWT after upgrade
- Improves governance and reduces authorization edge cases
Who this is for: Security and platform administrators, API consumers, and teams operating shared gateways.
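The precedence rule can be pictured as a simple selection function. The header names below (`Authorization`, `X-Security-Key`) and the `select_auth` helper are illustrative assumptions, not the actual Pinax middleware:

```python
def select_auth(headers: dict[str, str]) -> str:
    # When both credentials are present, the JWT wins; the security key is
    # only consulted when no bearer token is supplied.
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return "jwt"
    if headers.get("X-Security-Key"):
        return "security_key"
    return "anonymous"
```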
AgentOS now exposes an API endpoint to migrate all managed databases in one operation. This reduces operational overhead in multi-tenant or multi-environment deployments and ensures consistent schema versions during upgrades.
Details
- Orchestrates migrations across all databases, reducing error risk and manual work
- Fits CI/CD workflows for faster, safer rollouts
- Action recommended after upgrade: invoke the endpoint to ensure all schemas are current
Who this is for: Platform and SRE teams operating multiple agents, tenants, or environments.
We added a cost field to Metrics for OpenRouter-backed activity. This provides a reliable, standardized view of model spend without manual spreadsheets or custom aggregations, improving financial governance across environments.
Details
- Capture per-run and aggregate cost to support budget tracking, reporting, and chargeback
- Enables cost dashboards and alerts for proactive spend management
- No configuration changes required; cost appears automatically wherever Metrics are used
Who this is for: Platform and FinOps teams managing LLM spend across providers.
A2A protocol endpoints have been updated to follow standardized URL conventions, and related payloads were aligned to the protocol. Clients must migrate to the new paths to remain compatible with future releases.
Details
- Update client base paths and payload shapes to the new conventions
- Use the new Agent Card retrieval endpoint where applicable
- Plan a staged rollout to minimize downtime and validate behavior
Who this is for: Integration teams and platform owners maintaining A2A clients and cross-system agent orchestration.
JWTMiddleware now enforces token presence on every request; validate=False no longer permits requests without a token. This improves baseline security and reduces the risk of accidental unauthenticated access.
Details
- Action: propagate JWTs across all clients and internal services
- Validate that non-verified paths still include tokens and behave as expected
- Monitor auth metrics to verify parity post‑migration
Who this is for: Operators and integration teams managing authentication across services and environments.
AgentOS now blocks initialization/resync if duplicate IDs are detected across Agents, Teams, or Workflows. This ensures unambiguous references and prevents hard-to-debug behavior at runtime.
Details
- Breaking change: initialization will fail on duplicate IDs
- Action: audit and ensure unique IDs before upgrading
Who this is for: Platform owners and multi-team deployments managing large catalogs of agents and workflows.
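A quick pre-upgrade audit is easy to script. The `find_duplicate_ids` helper and the sample IDs below are illustrative, not part of Pinax:

```python
from collections import Counter

def find_duplicate_ids(*catalogs: list[str]) -> set[str]:
    # Flatten the IDs of agents, teams, and workflows and report any ID
    # that appears more than once across the combined catalog.
    counts = Counter(i for catalog in catalogs for i in catalog)
    return {i for i, n in counts.items() if n > 1}

agents    = ["support-agent", "billing-agent"]
teams     = ["support-team"]
workflows = ["support-agent"]          # clashes with an agent ID
dupes = find_duplicate_ids(agents, teams, workflows)
```

Running a check like this in CI before upgrading surfaces any clash that would now fail initialization.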
output_schema now accepts provider-specific JSON schemas and passes them directly to model APIs (OpenAI, Claude, and OpenAI‑like). This removes mapping layers, reduces boilerplate, and enables faster adoption of the latest vendor features.
Details
- Send provider-native JSON schema objects directly to models
- Less custom translation code and fewer maintenance points
- Backward-compatible; existing usages continue to work
Who this is for: Teams standardizing structured outputs across multiple model providers.
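As an example of a provider-native schema, here is a dict in the shape OpenAI's structured outputs accept; with this change it can be handed to output_schema as-is. The `invoice` schema itself is illustrative:

```python
# A provider-native JSON schema object (OpenAI structured-outputs shape).
# The field names and enum values are made up for illustration.
invoice_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "invoice",
        "strict": True,  # ask the provider to enforce the schema exactly
        "schema": {
            "type": "object",
            "properties": {
                "invoice_id": {"type": "string"},
                "total": {"type": "number"},
                "currency": {"type": "string", "enum": ["USD", "EUR"]},
            },
            "required": ["invoice_id", "total", "currency"],
            "additionalProperties": False,
        },
    },
}
```

Because the object passes through untouched, provider-specific knobs like `strict` work without waiting for a mapping layer to support them.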
Milvus search and async_search now support radius, range_filter, and async search_parameters. These controls help teams tune recall vs. precision and reduce tail latency in high-throughput workloads.
Details
- Radius and range_filter for precise vector similarity windows
- Optional async execution for lower latency and higher throughput
- Backward-compatible; defaults unchanged
Who this is for: Teams running RAG and vector search on Milvus that need predictable performance and relevance.
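For reference, a Milvus-style range-search parameter block looks like this. With similarity metrics such as COSINE, `radius` is the lower score bound and `range_filter` the upper bound; the specific values here are illustrative:

```python
# Milvus range-search parameters: this window keeps hits whose similarity
# score satisfies radius < score <= range_filter. Values are examples only.
search_parameters = {
    "metric_type": "COSINE",
    "params": {
        "radius": 0.4,        # exclude weak matches below this score
        "range_filter": 0.9,  # optionally cap near-duplicate matches
    },
}
```

Note that the bounds flip for distance metrics like L2, where smaller values mean closer matches.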
We introduced conventional A2A endpoints — including Agent Card retrieval — and aligned run endpoints and payloads to the updated protocol. This reduces custom handling across clients, improves cross-system compatibility, and clarifies long-term API boundaries.
Details
- New Agent Card retrieval endpoint
- Protocol-aligned run endpoints and payloads
- Requires client updates to adopt the new endpoints and schema
Who this is for: Platform teams integrating agents across services and organizations standardizing on A2A interfaces.
You can now stream reasoning chunks whenever a reasoning model is used. A new ReasoningManager coordinates streaming and lifecycle, giving teams earlier visibility into model thinking, faster debugging, and better auditability — with minimal changes to existing workflows.
Details
- Real-time streaming of reasoning traces for supported models
- Centralized control and error handling via ReasoningManager
- Backward-compatible; enable by providing a reasoning model
Who this is for: Teams building evaluators, regulated or safety-critical applications, and leaders who need transparent reasoning for review and governance.
We’ve added role-based access control (RBAC) to AgentOS via JWT middleware with per-endpoint authorization and per-resource scopes. This brings consistent, least-privilege enforcement across Agents, Teams, and Workflows, reducing custom policy code and operational risk. Standardized scopes help security and platform teams implement clear policies, simplify reviews, and support multi-tenant deployments with confidence. The release includes predefined scopes, enforcement, tests, and examples to speed adoption.
Details
- Per-endpoint authorization with scoped access to individual resources
- Clear, reusable scopes reduce policy drift and review overhead
- Backward compatible; adopting RBAC requires configuration
Who this is for: Platform and security teams, enterprise deployments, and organizations needing strong governance and least-privilege controls.
A new unified token counting utility provides consistent, accurate token estimates across OpenAI, Anthropic, AWS Bedrock, Google Gemini, and LiteLLM. We’ve also integrated token-based compression into Compression Manager to automatically fit content within model limits. Together, these changes simplify multi-model operations and help teams proactively control cost, latency, and throughput.
Details
- Single API for cross-provider token accounting improves planning and governance
- Token-aware compression prioritizes relevant context to meet target budgets
- Reduces prompt overruns and tail latency caused by context overflow
- Backward-compatible; no required action to upgrade
Who this is for: Platform teams orchestrating multi-model workloads, cost-sensitive deployments, and applications that must meet strict SLAs.
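The budget-fitting idea can be sketched with a crude estimator. The ~4-characters-per-token heuristic below is a rough stand-in for the real per-provider tokenizers the utility uses; `estimate_tokens` and `fit_to_budget` are illustrative names:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    # Crude provider-agnostic estimate (~4 chars/token for English text);
    # the shipped utility consults each provider's actual tokenizer.
    return max(1, round(len(text) / chars_per_token))

def fit_to_budget(chunks: list[str], budget: int) -> list[str]:
    # Keep chunks in priority order until the token budget is exhausted,
    # mirroring the idea behind token-aware compression.
    kept, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept

chunks = ["a" * 40, "b" * 40, "c" * 40]   # ~10 estimated tokens each
kept = fit_to_budget(chunks, budget=25)    # only two chunks fit
```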
We now populate provider metadata for OpenAI Chat responses and surface it across key response and event objects. Completion ID, system fingerprint, and other model-specific fields are included on ModelResponse/Message and emitted in RunOutput and RunCompletedEvent. This gives teams reliable identifiers to correlate with provider logs and invoices, streamlining debugging, cost analysis, and auditability — without disrupting existing workflows.
Details
- Access completion_id, system_fingerprint, and model_extra from response.provider_data or event payloads
- Available in ModelResponse/Message, RunOutput, and RunCompletedEvent
- Backward compatible and additive; no migration required
Who this is for: Platform, MLOps, and application teams that need faster root-cause analysis, precise cost attribution, and improved observability in production.
To make runs more predictable, the stream and stream_events flags no longer persist across run/arun calls. This eliminates hidden state between invocations and ensures teams explicitly control streaming behavior per execution, improving reproducibility in development and production.
Details
- Set streaming flags on each run/arun to opt in per execution
- Reduces surprises and aligns runs across services and environments
Who this is for: Platform owners and teams standardizing run behavior in production pipelines and multi-service deployments.
Streaming experiences using Gemini now accept URL context and web_search_queries, enabling real-time retrieval and reasoning over live web content. This removes prior limitations in streaming flows, improving answer quality for research, summarization, and monitoring scenarios — without requiring any migration.
Details
- Provide URLs and suggested search queries during streaming for richer, in-flow context
- Improve response relevance in assistants that reason over current web data
Who this is for: Teams building real-time assistants, research tools, or monitoring workflows on Google Gemini.
We introduced a Shopify toolkit that lets agents analyze store data such as sales, customers, and products without custom integration work. This reduces time-to-value for commerce analytics and reporting, and provides a clear path from prototype to production via a cookbook example.
Details
- Standardized Tools interface to authenticate and query Shopify data
- Plug into agents and workflows for automated reporting, alerts, and insights
- Cookbook example to go from zero to actionable analytics quickly
Who this is for: Shopify developers, data teams, and commerce platforms building analytics, automation, or customer operations on Shopify.
Knowledge add_content_ methods now support true synchronous execution. This removes the async-only limitation, making it straightforward to integrate content ingestion into synchronous services and batch jobs without event loop management or architectural workarounds.
Details
- Synchronous parity with existing async methods for consistent behavior
- Drop-in for frameworks and environments that don’t use async
- No migration steps required
Who this is for: Backend teams building on synchronous frameworks and data pipelines that need reliable, easy-to-use Knowledge ingestion.
Pinax now supports reasoning messages from OpenRouter, enabling you to capture and act on models’ reasoning outputs where available. This provides greater transparency for debugging, evaluation, and governance, and expands the set of model capabilities you can use without changing your integration approach.
Details
- Ingest reasoning messages alongside standard outputs for improved traceability
- Works with existing routing, logging, and evaluation workflows
- No migration required; enable where OpenRouter models support reasoning
Who this is for: Teams adopting OpenRouter models that expose reasoning signals and need better observability and evaluation fidelity.
A new built-in evaluation system lets you automate LLM quality checks with binary and numeric scoring, background execution, post-hooks, and customizable evaluator agents. This makes it easier to standardize evals, gate releases, and compare models — without bolting on external systems.
Details
- Run evaluations in the background to keep pipelines responsive
- Use post-hooks to persist metrics, trigger alerts, or update dashboards
- Create custom evaluator agents to encode domain-specific criteria
Who this is for: AI platform teams, ML engineers, and QA leads who need consistent, auditable evaluation workflows at scale.
We’ve added AsyncMySQLDb with native compatibility for the asyncmy driver, enabling fully asynchronous MySQL operations. This unlocks higher concurrency, better throughput, and lower latency for agent and workflow backends that depend on MySQL. Built-in tracing support and cookbook examples reduce integration time and improve observability from day one.
Details
- Non-blocking I/O with asyncmy for scalable, event-driven architectures
- Integrated tracing hooks for end-to-end visibility and troubleshooting
- Cookbook examples to shorten time-to-value and standardize adoption
Who this is for: Teams running high-throughput agents, streaming pipelines, or workflow services that need async database performance and robust observability.
MemoriTools has been removed in favor of Memori SDK v3’s built-in auto-recording. This consolidates functionality in the SDK, reduces integration complexity, and lowers maintenance overhead. To avoid breakage, remove MemoriTools from your code and rely on the SDK for conversation recording.
Details
- MemoriTools is no longer supported; SDK v3 provides automatic recording
- Action required: remove MemoriTools imports/usages and update your flows to SDK v3
- Outcome: a simpler, more reliable integration path with fewer components to manage
Who this is for: Teams adopting or maintaining Memori-based conversation storage who want a supported, lower-friction integration.
Pinax now ships with Memori SDK v3.0.5, enabling automatic recording of agent conversations without a separate tool. This simplifies integration, reduces setup time, and ensures a consistent audit trail out of the box. If you previously used MemoriTools, you can remove it — SDK v3 handles recording automatically.
Details
- Zero-config conversation capture for Pinax agents
- Fewer moving parts and dependencies to maintain
- Action recommended: upgrade to SDK v3.0.5 to adopt auto-recording
Who this is for: Teams standardizing on conversation archiving for compliance, analytics, or customer support quality.
We’ve moved retry logic from Agents/Teams to the Model layer. When you set retries on a model, Pinax now retries at the model execution level, which is more effective for handling provider throttling and transient errors. Agent/Team retries now apply only to run-level exceptions. This change reduces wasted cycles, makes behavior more predictable, and improves throughput under rate limits.
Details
- Configure retries on the Model to handle LLM/provider errors directly
- Agent/Team retries now cover orchestration-level failures only
- Action required: move any Agent/Team retry settings to the associated Model
Who this is for: Teams running production workloads at scale who need consistent behavior and better resilience under variable provider limits.
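The model-level pattern amounts to retrying the provider call itself with backoff, while letting orchestration errors propagate. The `with_retries` helper below is a generic sketch of that behavior, not the Pinax implementation:

```python
import random
import time

def with_retries(call, retries: int = 3, base_delay: float = 0.5,
                 retryable=(TimeoutError, ConnectionError)):
    # Retry a model invocation on transient provider errors with jittered
    # exponential backoff; non-retryable errors propagate immediately.
    for attempt in range(retries + 1):
        try:
            return call()
        except retryable:
            if attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random()))
```

Retrying at this layer means a throttled provider call is repeated alone, rather than re-running the whole agent turn the way an Agent/Team-level retry would.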
AgentOS evaluation endpoints now work with asynchronous database backends. Teams using async DB classes can run evaluations without changing their stack, removing a key limitation for modern, event-driven deployments.
Details
- Evals run as expected with async database drivers, improving parity across environments
- No configuration changes or migration steps required
Who this is for: Engineering teams standardizing on asynchronous databases who need reliable, automated evaluation workflows.
RunRequirement simplifies how agents request and manage human input. Requirements now surface directly in agent responses or as RunPaused events in streaming flows, providing a consistent pattern for approvals, confirmations, and other human checkpoints. This reduces implementation effort today and lays the groundwork for richer triggers and orchestration in the future.
Details
- Unified model for HITL across synchronous and streaming executions
- Less glue code and fewer edge cases to handle in application logic
- No action required to benefit from the new model
Who this is for: Teams implementing approvals, compliance gates, or manual reviews in agent-driven workflows.
We introduced RedshiftTools, giving agents first-class access to Amazon Redshift without custom glue. Teams can explore schemas, describe tables, inspect and run queries, and export data directly through a consistent tool interface. The toolkit supports both standard credential-based auth and IAM-based authentication (via explicit credentials or AWS profiles), aligning with enterprise security practices.
Details
- Speed up prototyping and operations by eliminating one-off scripts and SDK wiring
- Standardize Redshift access patterns across agents and workflows
- Reduce integration risk with built-in support for IAM authentication
Who this is for: Data and platform teams building agents or workflows that need secure, governed access to Redshift.
We’ve introduced a Spotify toolkit and example agent to manage and interact with Spotify, including library management. This addition reduces custom API work and speeds up delivery of music features in assistants, automations, and internal tools. Teams can quickly prototype, then productionize common Spotify workflows without building from scratch.
Details
- Prebuilt capabilities for common library operations to minimize integration effort
- Example agent demonstrates end-to-end usage for faster adoption
- Compatible with existing Pinax agents and workflows; no migration required
Who this is for: Product and platform teams building Spotify-powered assistants, content curation tools, or media automations.
To prevent CreateTable validation errors and ensure reliable, time-ordered queries, the DynamoDB schema for the user Memory table now requires a global secondary index (GSI) on created_at. Deployments using DynamoDB must add this index or recreate the table using the updated schema.
Details
- Action required for DynamoDB users: add the created_at GSI or reprovision the table.
- Eliminates schema validation failures and improves query performance.
- No changes required for other storage backends.
Who this is for: Teams running Pinax with AWS DynamoDB for Memory storage.
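For teams reprovisioning, a boto3-style table definition with the required created_at GSI looks roughly like this. The table name, index name, and key attributes are illustrative; check the updated schema shipped with Pinax for the exact definition:

```python
# Illustrative boto3-style definition of the Memory table with the
# required created_at global secondary index. Names are examples only.
memory_table_schema = {
    "TableName": "pinax_memory",
    "AttributeDefinitions": [
        {"AttributeName": "memory_id", "AttributeType": "S"},
        {"AttributeName": "user_id", "AttributeType": "S"},
        {"AttributeName": "created_at", "AttributeType": "S"},
    ],
    "KeySchema": [{"AttributeName": "memory_id", "KeyType": "HASH"}],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "created_at-index",
            "KeySchema": [
                {"AttributeName": "user_id", "KeyType": "HASH"},
                {"AttributeName": "created_at", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",
}
# A real deployment would pass this to boto3:
# boto3.client("dynamodb").create_table(**memory_table_schema)
```

With created_at as the GSI sort key, time-ordered queries per user become a single indexed `Query` instead of a scan.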
We introduced native tracing with OpenTelemetry, including first-class spans and new endpoints to inspect traces. Spans are stored in your configured database, giving you immediate, consistent observability without additional instrumentation. This improves debugging, performance analysis, and compliance with reliability objectives.
Details
- New endpoints: /traces, /traces/<trace_id>, /traces/<trace_id>?<span_id>
- Works with Pinax-supported storage backends; optionally configure exporters to forward data to your observability stack.
- Speeds up root-cause analysis and shortens time-to-resolution.
Who this is for: Platform, SRE, and ops teams that need standardized, low-friction tracing across agents and tools.
Agent and Team pre- and post-hooks now run as background tasks in AgentOS, so they no longer block the main operation. This reduces end-to-end latency and increases throughput, especially under concurrent load. Teams should ensure hooks are idempotent and not dependent on synchronous completion.
Details
- Hooks execute concurrently and may complete after the primary request returns.
- Move any required synchronous logic into the main flow; treat hooks as asynchronous side effects.
- Expect reduced wait times and better parallelism in high-throughput environments.
Who this is for: Teams scaling agent workloads, multi-tenant platforms, and latency-sensitive use cases.
Runs now support an optional citations field across single, team, and workflow executions. This lets you store and surface model-provided source citations directly in your run metadata, improving traceability, auditability, and user trust without changing existing integrations. The field is non-breaking and can be adopted incrementally to power features like “show your work,” compliance review, and knowledge attribution.
Details
- Available in RunSchema, TeamRunSchema, and WorkflowRunSchema responses.
- Optional and backward-compatible; no migrations required.
Who this is for: Teams building user-facing experiences that require explainability, or organizations in regulated environments that need evidence of sources and decision trails.
We added a complete Gemini 3 demo, including example agents, configuration, and generated assets. This makes it faster to evaluate and roll out Gemini 3 within Pinax by providing opinionated, runnable patterns you can copy, adapt, and deploy. Teams can stand up proofs of concept in minutes and standardize on a repeatable setup, reducing integration effort and risk.
Details
- Preconfigured agents and sample assets showcase best practices for orchestration and evaluation.
- Works out of the box; no changes required to existing projects.
Who this is for: Platform teams and developers evaluating Gemini 3 or scaling multi-model strategies with minimal setup time.
MongoDB clients now support Motor and PyMongo async libraries with improved error handling and typing. This enables non-blocking storage operations, better concurrency, and lower latency in async-first applications.
Details
- Drop-in async clients for faster, more scalable data operations
- Enhanced typing and error handling improve reliability and observability
- Reduces custom glue code for Python async stacks
Who this is for: Teams running high-QPS, async Python services that depend on MongoDB.
We’ve added an optional API key path for AWS Bedrock Claude in addition to IAM. This reduces setup friction in environments where IAM is not feasible while preserving IAM as the default for production.
Details
- Use AWS_BEDROCK_API_KEY as an alternative authentication method
- No changes required for existing IAM-based configurations
- Simplifies local development, cross-account, and restricted-policy scenarios
Who this is for: Enterprises with constrained IAM policies or teams needing rapid prototyping paths.
You can now override output_schema at runtime for both Agent and Team (streaming and non-streaming), with automatic restoration after the run. This enables per-request structured output variations without cloning agents or adding conditional boilerplate.
Details
- Change the expected output format for a single run; state is restored automatically
- Works for streaming and batch runs to support diverse downstream consumers
- Simplifies A/B testing, multi-tenant formats, and evolving contract needs
Who this is for: Platform teams orchestrating varied integrations and output contracts across services.
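The override-then-restore behavior is essentially a scoped attribute swap. The context manager and `ToyAgent` class below illustrate the pattern generically; they are not the Pinax API, which handles restoration for you:

```python
from contextlib import contextmanager

@contextmanager
def override_attr(obj, name: str, value):
    # Temporarily swap an attribute and guarantee restoration afterwards,
    # mirroring how the per-run output_schema override behaves.
    previous = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        setattr(obj, name, previous)

class ToyAgent:
    def __init__(self):
        self.output_schema = {"type": "object"}  # default contract

agent = ToyAgent()
with override_attr(agent, "output_schema", {"type": "array"}):
    per_run = agent.output_schema        # per-request schema in effect
restored = agent.output_schema           # default restored after the run
```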
Pinax now offers full support for Google Gemini File Search, including store and document management, uploads/imports, metadata filters, citation extraction, and async APIs. This enables high-quality retrieval workflows with traceability and scale on the Gemini platform.
Details
- Manage file stores and documents, including bulk uploads and async ingestion
- Filter by metadata and extract citations for auditability and explainability
- First-class integration to accelerate RAG and knowledge-heavy agents
Who this is for: Teams standardizing on Google’s AI stack and building retrieval-rich applications with compliance needs.
A new MemoryOptimizationStrategy framework and APIs allow you to summarize and optimize memories outside of agent runs. By decoupling memory maintenance from inference, you can keep context high-signal while reducing runtime tokens and improving decision quality at scale.
Details
- Schedule or trigger memory compaction and summarization independently of agent runs
- Keep knowledge current and concise to improve downstream model performance
- Works without changes to agent logic; designed for scale and governance
Who this is for: Production teams with large or fast-growing memory stores seeking lower costs and tighter control.
Automatically compress and summarize tool call results to keep agent context safely within model token windows. This change reduces context overflow errors, stabilizes long-running workflows, and lowers token spend without requiring any application changes.
Details
- Summarizes large tool outputs before adding them to conversation history
- Improves reliability for tool-heavy agents and extended sessions
- Reduces token usage while preserving relevant signal for downstream reasoning
Who this is for: Teams operating long-running or tool-intensive agents where reliability and cost control are priorities.
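At its simplest, fitting a tool result into a token budget means keeping the most useful parts and eliding the rest. The head-and-tail truncation below is a deliberately simple stand-in for the summarization Pinax applies; the helper name and the ~4-chars-per-token estimate are illustrative:

```python
def compress_tool_result(result: str, max_tokens: int,
                         chars_per_token: float = 4.0) -> str:
    # Keep the head and tail of an oversized tool result and elide the
    # middle; real compression would summarize rather than truncate.
    budget_chars = int(max_tokens * chars_per_token)
    if len(result) <= budget_chars:
        return result
    keep = max((budget_chars - 5) // 2, 1)
    return result[:keep] + " ... " + result[-keep:]
```

Head-and-tail retention is a common fallback because tool outputs (logs, query results) often carry their signal at the start and end; a summarizing model can do better when budget allows.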
We fixed an issue where filtering memories by topic could return incorrect results when using SQLite or AsyncSQLite backends. Topic-based queries now behave predictably, improving the accuracy of agents and workflows that rely on segmented memory retrieval. No action is required — existing implementations will benefit immediately after upgrading.
Details
- Accurate topic filters for both SQLite and AsyncSQLite memory backends
- Reduces debugging and unexpected agent responses caused by misclassified results
- Improves determinism for evaluations, automation, and knowledge reuse
Who this is for: Teams using local SQLite storage for Memory, especially those organizing knowledge by topic for agents, offline/edge deployments, or deterministic test environments.
We added first-class support for Anthropic’s structured outputs, including schema enforcement, strict tool calling, and robust response parsing across synchronous, asynchronous, and streaming APIs. This delivers predictable, typed responses, cuts custom parsing and boilerplate, and improves reliability and governance for production workloads using Claude.
Details
- Enforce JSON/object schemas to keep outputs consistent and machine-readable
- Strict tools and end-to-end parsing reduce failure modes and post-processing
- Streaming support preserves low latency while maintaining structure
- Backward-compatible and additive; adopt incrementally
Who this is for: Teams standardizing on Claude, building workflow automations, or requiring dependable, typed outputs.
We introduced NanoBananaTools, a turnkey toolkit to generate images with Google’s Nano Banana model. It includes built-in parameter validation and a cookbook example, enabling faster adoption and fewer integration errors. Standardizing how you invoke the model within Pinax reduces glue code and makes image features easier to operate and maintain.
Details
- Ready-to-use tool wrappers with input validations to prevent malformed requests
- Cookbook example accelerates first run and team onboarding
- Designed to plug into your existing toolchain to minimize integration effort
Who this is for: Product and platform teams adding image generation or evaluating Google’s vision models.
We removed deprecated AgentOS parameters to standardize on stable naming: os_id -> id, fastapi_app -> base_app, enable_mcp -> enable_mcp_server, replace_routes -> on_route_conflict.
Details
- Action required: rename parameters to the stable forms
- Reduces ambiguity and future migration effort
Who this is for: Platform teams embedding AgentOS into services and APIs.
We removed get_messages_for_session and get_messages_from_last_n_runs in favor of get_messages, get_session_messages, and get_chat_history. This unifies patterns and reduces mental overhead.
Details
- Action required: migrate to the new method names
- Clearer contracts for history retrieval and observability
Who this is for: Teams managing conversation history, logging, or analytics.
When using knowledge_filters, you must configure contents_db. This ensures deterministic, stateless filtering aligned with AgentOS and prevents silent mismatches.
Details
- Action required: provide a contents_db for any knowledge base that uses filters
- Improves reliability and reproducibility of retrieval and filtering
Who this is for: Teams building RAG and knowledge-aware agents at scale.
We removed the stream_events parameter from print_response/aprint_response and CLI. Streaming now works correctly by default, reducing configuration and edge cases.
Details
- Action required: remove the parameter from calls
- For fine-grained control, use run()/arun() instead of print helpers
Who this is for: Teams embedding CLIs or console output in developer workflows and demos.
