Compare commits

1 Commits

Author SHA1 Message Date
243e84622a docs: agent vision and enhancement architecture
Rewrite ai-agent-ideas.md with focus on proactive,
personable assistant behaviors. Add slot-based LLM
enhancement system to architecture-draft.md.

Co-authored-by: Ona <no-reply@ona.com>
2026-02-26 22:44:04 +00:00
2 changed files with 42 additions and 88 deletions

@@ -16,8 +16,8 @@ Examples of feed items:
 ## Design Principles
 1. **Extensibility**: The core must support different data sources, including third-party sources.
-2. **Separation of concerns**: Core handles data and UI description. The client is a thin renderer.
-3. **Dependency graph**: Sources declare dependencies on other sources. The engine resolves the graph and runs independent sources in parallel.
+2. **Separation of concerns**: Core handles data only. UI rendering is a separate system.
+3. **Parallel execution**: Sources run in parallel; no inter-source dependencies.
 4. **Graceful degradation**: Failed sources are skipped; partial results are returned.
 ## Architecture
@@ -25,28 +25,26 @@ Examples of feed items:
``` ```
┌─────────────────────────────────────────────────────────────┐ ┌─────────────────────────────────────────────────────────────┐
│ Backend │ │ Backend │
│ ┌─────────────┐ ┌─────────────┐ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐
│ │ aris-core │ │ Sources │ │ │ aris-core │ │ Sources │ │ UI Registry
│ │ │ │ (plugins) │ │ │ │ │ (plugins) │ │ (schemas from
│ │ - FeedEngine│◄───│ - Calendar │ │ │ - Reconciler│◄───│ - Calendar │ │ third parties)│
│ │ - Context │ │ - Weather │ │ │ - Context │ │ - Weather │
│ │ - FeedItem │ │ - TfL │ │ - FeedItem │ │ - Spotify
│ - Actions │ │ - Spotify │ └─────────────┘ └─────────────┘ └─────────────────┘
└─────────────┘ └─────────────┘
Feed (data only) UI Schemas (JSON)
│ Feed items (data + ui trees + slots) │
└─────────────────────────────────────────────────────────────┘ └─────────────────────────────────────────────────────────────┘
(WebSocket / JSON-RPC)
┌─────────────────────────────────────────────────────────────┐ ┌─────────────────────────────────────────────────────────────┐
Client (React Native) Frontend
│ ┌──────────────────────────────────────────────────────┐ │ │ ┌──────────────────────────────────────────────────────┐ │
│ │ json-render + twrnc component map │ │ │ │ Renderer │ │
│ │ - Receives feed items with ui trees │ │ │ │ - Receives feed items │ │
│ │ - Renders using registered RN components + twrnc │ │ │ │ - Fetches UI schema by item type │ │
│ │ - User interactions trigger source actions │ │ │ │ - Renders using json-render or similar │ │
│ │ - Bespoke native components for rich interactions │ │
│ └──────────────────────────────────────────────────────┘ │ │ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘ └─────────────────────────────────────────────────────────────┘
``` ```
@@ -56,16 +54,15 @@ Examples of feed items:
 The core is responsible for:
 - Defining the context and feed item interfaces
-- Providing a `FeedEngine` that orchestrates sources via a dependency graph
+- Providing a reconciler that orchestrates data sources
 - Returning a flat list of prioritized feed items
-- Routing action execution to the correct source
 ### Key Concepts
-- **Context**: Time and location (with accuracy) passed to all sources. Sources can contribute to context (e.g., location source provides coordinates, weather source provides conditions).
-- **FeedItem**: Has an ID (source-generated, stable), type, timestamp, JSON-serializable data, optional actions, an optional `ui` tree, and optional `slots` for LLM-fillable content.
-- **FeedSource**: Interface that first and third parties implement to provide context, feed items, and actions. Uses reverse-domain IDs (e.g., `aris.weather`, `com.spotify`).
-- **FeedEngine**: Orchestrates sources respecting their dependency graph, runs independent sources in parallel, returns items and any errors. Routes action execution to the correct source.
+- **Context**: Time and location (with accuracy) passed to all sources
+- **FeedItem**: Has an ID (source-generated, stable), type, priority, timestamp, and JSON-serializable data
+- **DataSource**: Interface that third parties implement to provide feed items
+- **Reconciler**: Orchestrates sources, runs them in parallel, returns items and any errors
 ## Data Sources
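
A rough sketch of the reconciler behaviour this hunk describes: sources run in parallel, a failed source is skipped and reported, and the merged items come back as a flat list sorted by priority. The `query` method name, the `ReconcileResult` shape, the field types, and the higher-number-first ordering are assumptions made for illustration; the diff does not define them.

```typescript
// Sketch only: every shape and name below is an assumption, not the project's API.
interface Context {
  time: Date
  location?: { lat: number; lng: number; accuracy: number }
}

interface FeedItem {
  id: string
  type: string
  priority: number // assumed: a higher number sorts first
  timestamp: Date
  data: Record<string, unknown>
}

interface DataSource {
  id: string
  query(context: Context): Promise<FeedItem[]> // hypothetical method name
}

interface ReconcileResult {
  items: FeedItem[]
  errors: { sourceId: string; error: unknown }[]
}

type Outcome =
  | { ok: true; items: FeedItem[] }
  | { ok: false; sourceId: string; error: unknown }

async function reconcile(sources: DataSource[], context: Context): Promise<ReconcileResult> {
  // Start every source at once. Each failure is caught per source,
  // so one broken source cannot reject the whole batch.
  const outcomes = await Promise.all(
    sources.map(async (source): Promise<Outcome> => {
      try {
        return { ok: true, items: await source.query(context) }
      } catch (error) {
        return { ok: false, sourceId: source.id, error }
      }
    }),
  )

  const result: ReconcileResult = { items: [], errors: [] }
  for (const outcome of outcomes) {
    if (outcome.ok) {
      result.items.push(...outcome.items)
    } else {
      result.errors.push({ sourceId: outcome.sourceId, error: outcome.error })
    }
  }

  // Flat list, highest priority first.
  result.items.sort((a, b) => b.priority - a.priority)
  return result
}
```

Catching inside each mapped call, instead of around `Promise.all`, is what keeps partial results available when a source throws.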
@@ -74,13 +71,10 @@ Key decisions:
 - Sources receive the full context and decide internally what to use
 - Each source returns a single item type (e.g., separate "Calendar Source" and "Location Suggestion Source" rather than a combined "Google Source")
 - Sources live in separate packages, not in the core
-- Sources declare dependencies on other sources (e.g., weather depends on location)
 - Sources are responsible for:
   - Transforming their domain data into feed items
   - Assigning priority based on domain logic (e.g., "event starting in 10 minutes" = high priority)
   - Returning empty arrays when nothing is relevant
-  - Providing a `ui` tree for each feed item
-  - Declaring and handling actions (e.g., RSVP, complete task, play/pause)
 ### Configuration
@@ -89,39 +83,26 @@ Configuration is passed at source registration time, not per reconcile call. Sou
 ## Feed Output
 - Flat list of `FeedItem` objects
-- Items carry data, an optional `ui` field describing their layout, and optional `slots` for LLM enhancement
+- No UI information (no icons, card types, etc.)
 - Items are a discriminated union by `type` field
+- Reconciler sorts by priority; can act as tiebreaker
-## UI Rendering: Server-Driven UI
-The UI for feed items is **server-driven**. Sources describe how their items look using a JSON tree (the `ui` field on `FeedItem`). The client renders these trees using [json-render](https://json-render.dev/) with a registered set of React Native components styled via [twrnc](https://github.com/jaredh159/tailwind-react-native-classnames).
-### How it works
-1. Sources return feed items with a `ui` field — a JSON tree describing the card layout using Tailwind class strings.
-2. The client passes a component map to json-render. Each component wraps a React Native primitive and resolves `className` via twrnc.
-3. json-render walks the tree and renders native components. twrnc parses Tailwind classes at runtime — no build step, arbitrary values work.
-4. User interactions (tap, etc.) map to source actions via the `actions` field on `FeedItem`. The client sends action requests to the backend, which routes them to the correct source via `FeedEngine.executeAction()`.
-### Styling
-- Sources use Tailwind CSS class strings via the `className` prop (e.g., `"p-4 bg-white dark:bg-black rounded-xl"`).
-- twrnc resolves classes to React Native style objects at runtime. Supports arbitrary values (`mt-[31px]`, `bg-[#eaeaea]`), dark mode (`dark:bg-black`), and platform prefixes (`ios:pt-4 android:pt-2`).
-- Custom colors and spacing are configured via `tailwind.config.js` on the client.
-- No compile-time constraint — all styles resolve at runtime.
-### Two tiers of UI
-- **Server-driven (default):** Any source can return a `ui` tree. Covers most cards — weather, tasks, alerts, package tracking, news, etc. Simple interactions go through source actions. This is the default path for both first-party and third-party sources.
-- **Bespoke native:** For cards that need rich client interaction (gestures, animations, real-time updates), a native React Native component is registered in the json-render component map and referenced by type. Third parties that need this level of richness work with the ARIS team to get it integrated.
-### Why server-driven
-- Feed items are inherently server-driven — the data comes from sources on the backend. Attaching the layout alongside the data is a natural extension.
-- Card designs can be updated without shipping an app update.
-- Third-party sources can ship their own UI without bundling anything new into the app.
-Reference: https://json-render.dev/
+## UI Rendering (Separate from Core)
+The core does not handle UI. For extensible third-party UI:
+1. Third-party apps register their UI schemas through the backend (UI Registry)
+2. Frontend fetches UI schemas from the backend
+3. Frontend matches feed items to schemas by `type` and renders accordingly
+This approach:
+- Keeps the core focused on data
+- Works across platforms (web, React Native)
+- Avoids the need for third parties to inject code into the app
+- Uses a json-render style approach for declarative UI from JSON schemas
+Reference: https://github.com/vercel-labs/json-render
 ## Feed Items with UI and Slots
@@ -403,9 +384,9 @@ function mergeEnhancement(
 ## Open Questions
-- How third parties authenticate/register their sources
-- Exact set of React Native components exposed in the json-render component map
-- Validation/sandboxing of third-party ui trees
-- How synthetic items define their UI (full json-render tree vs. registered component)
+- Exact schema format for UI registry
+- How third parties authenticate/register their sources and UI schemas
+- JsonRenderNode type definition and component vocabulary
+- How synthetic items define their UI (full json-render tree vs. registered schema)
 - Should slots support rich content (json-render nodes) in the future, or stay text-only?
 - How to handle slot content that references other items (e.g., "your dinner at The Ivy" linking to the calendar card)
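
The UI Registry flow in this file's diff (register a schema, fetch it, match feed items to schemas by `type`) might look roughly like the sketch below. The schema format is still an open question, so `UiSchema` keeps the tree opaque, and `UiRegistry`, `lookup`, `matchSchemas`, and `MatchedFeed` are hypothetical names used only for illustration. The registry is in-memory here; the real one would sit behind the backend.

```typescript
// Sketch only: UiSchema's shape and every name below are assumptions.
interface FeedItem {
  id: string
  type: string
  data: Record<string, unknown>
}

interface UiSchema {
  type: string // feed item type this schema renders
  root: unknown // json-render style tree, left opaque here
}

class UiRegistry {
  private readonly schemas = new Map<string, UiSchema>()

  // Step 1: a third party registers the schema for its item type.
  register(schema: UiSchema): void {
    this.schemas.set(schema.type, schema)
  }

  // Step 2: the frontend fetches a schema (a plain lookup in this sketch).
  lookup(itemType: string): UiSchema | undefined {
    return this.schemas.get(itemType)
  }
}

interface MatchedFeed {
  renderable: { item: FeedItem; schema: UiSchema }[]
  unmatched: FeedItem[]
}

// Step 3: match feed items to schemas by `type`.
function matchSchemas(items: FeedItem[], registry: UiRegistry): MatchedFeed {
  const matched: MatchedFeed = { renderable: [], unmatched: [] }
  for (const item of items) {
    const schema = registry.lookup(item.type)
    if (schema) {
      matched.renderable.push({ item, schema })
    } else {
      // No schema registered for this type; the caller decides on a fallback.
      matched.unmatched.push(item)
    }
  }
  return matched
}
```

Returning unmatched items separately, instead of dropping them, leaves the fallback decision to the renderer.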

@@ -125,7 +125,7 @@ interface FeedSource<TItem extends FeedItem = FeedItem> {
 ### Changes to FeedItem
-Optional fields added for actions, server-driven UI, and LLM slots.
+One optional field added.
 ```typescript
 interface FeedItem<
@@ -140,12 +140,6 @@ interface FeedItem<
   /** Actions the user can take on this item. */
   actions?: readonly ItemAction[]
-  /** Server-driven UI tree rendered by json-render on the client. */
-  ui?: JsonRenderNode
-  /** Named slots for LLM-fillable content. See architecture-draft.md. */
-  slots?: Record<string, Slot>
 }
 ```
@@ -228,25 +222,6 @@ class SpotifySource implements FeedSource<SpotifyFeedItem> {
           { actionId: "skip-track" },
           { actionId: "like-track", params: { trackId: track.id } },
         ],
-        ui: {
-          type: "View",
-          className: "flex-row items-center p-3 gap-3 bg-white dark:bg-black rounded-xl",
-          children: [
-            {
-              type: "Image",
-              source: { uri: track.albumArt },
-              className: "w-12 h-12 rounded-lg",
-            },
-            {
-              type: "View",
-              className: "flex-1",
-              children: [
-                { type: "Text", className: "font-semibold text-black dark:text-white", text: track.name },
-                { type: "Text", className: "text-sm text-gray-500 dark:text-gray-400", text: track.artist },
-              ],
-            },
-          ],
-        },
       },
     ]
   }
@@ -261,8 +236,6 @@ class SpotifySource implements FeedSource<SpotifyFeedItem> {
 4. `FeedSource.listActions()` is a required method returning `Record<string, ActionDefinition>` (empty record if no actions)
 5. `FeedSource.executeAction()` is a required method (no-op for sources without actions)
 6. `FeedItem.actions` is an optional readonly array of `ItemAction`
-6b. `FeedItem.ui` is an optional json-render tree describing server-driven UI
-6c. `FeedItem.slots` is an optional record of named LLM-fillable slots
 7. `FeedEngine.executeAction()` routes to correct source, returns `ActionResult`
 8. `FeedEngine.listActions()` aggregates actions from all sources
 9. Existing tests pass unchanged (all changes are additive)
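
Criteria 4 to 8 could be exercised against a minimal engine like the one below. This is a sketch under stated assumptions: the field shapes of `ActionDefinition` and `ActionResult`, the `register` method, and the `(sourceId, actionId, params)` signature of `executeAction` are all guesses, since the diff names these types and methods without showing their definitions.

```typescript
// Sketch only: every shape and signature below is assumed for illustration.
interface ActionDefinition {
  id: string
  description?: string
}

interface ActionResult {
  ok: boolean
  error?: string
}

interface FeedSource {
  id: string // reverse-domain ID, e.g. "com.spotify"
  listActions(): Record<string, ActionDefinition>
  executeAction(actionId: string, params?: Record<string, unknown>): Promise<ActionResult>
}

class FeedEngine {
  private readonly sources = new Map<string, FeedSource>()

  register(source: FeedSource): void {
    this.sources.set(source.id, source)
  }

  // Aggregates the actions of every registered source, keyed by source ID.
  listActions(): Record<string, Record<string, ActionDefinition>> {
    const all: Record<string, Record<string, ActionDefinition>> = {}
    this.sources.forEach((source, id) => {
      all[id] = source.listActions()
    })
    return all
  }

  // Routes the request to the owning source; unknown targets yield a failed result.
  async executeAction(
    sourceId: string,
    actionId: string,
    params?: Record<string, unknown>,
  ): Promise<ActionResult> {
    const source = this.sources.get(sourceId)
    if (!source) {
      return { ok: false, error: `unknown source: ${sourceId}` }
    }
    const declared = source.listActions()
    if (!Object.prototype.hasOwnProperty.call(declared, actionId)) {
      return { ok: false, error: `unknown action: ${actionId}` }
    }
    return source.executeAction(actionId, params)
  }
}
```

Checking the action against `listActions()` before dispatch means a source without actions never sees a call, which keeps its required `executeAction()` a true no-op.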