Browser Context Model
Concepts
The Browser Context Model is the central data structure in Periscope. Every piece of information that flows from the browser extension to the Periscope service and ultimately to AI agents is represented as a BrowserContext. Understanding this model is essential for building integrations that consume browser state, query context history, or implement custom privacy filtering.
What Is a BrowserContext?
A BrowserContext is a snapshot of observable browser state at a specific moment in time. It captures what the user is looking at, interacting with, or navigating through. When your AI agent asks "what is the user doing in their browser right now?", the answer comes as a BrowserContext.
The design reflects a key insight: browser state is not a single thing. A user might be reading a page, filling out a form, selecting text, watching a video, or doing several of these simultaneously. Rather than modeling each activity as a completely separate data structure, Periscope uses a single BrowserContext type with optional sections. Every context has core fields (identity, timing, privacy, tab metadata), and then optional sections (page metadata, content, selection, form data, media elements, custom data) that are populated based on what the user is actually doing.
This keeps the API surface small while accommodating a wide range of browser activities.
Context Types
The type field on a BrowserContext categorizes what kind of browser state is being captured. There are seven context types:
page -- The most common type. Captured when the user navigates to a new page or when page content changes significantly. Contains the page URL, title, metadata (Open Graph, Twitter Cards, SEO attributes), and optionally the page content itself in text, HTML, or Markdown format. If your agent needs to know "what page is the user on?", this is the context type you query.
selection -- Captured when the user selects text on a page. The selection section will contain the selected text, character offsets, and surrounding text for disambiguation. This is particularly valuable for agents that need to understand what specific content the user is focusing on, not just which page they are visiting.
element -- Captured when a specific DOM element is the focus of activity. This is less common than page or selection contexts and is typically used when the extension detects interaction with a specific interactive element that does not fit the form or media categories.
form -- Captured when the user is interacting with a form. The formData section will contain the form fields, their types, and their current values (subject to privacy filtering). This enables agents to assist with form completion, data validation, or understanding what workflow the user is in the middle of.
media -- Captured when the user interacts with media elements (images, videos, audio players). The mediaElements array will contain details about each media element including its URL, type, alt text, and dimensions. Useful for agents that help with content curation or media-related tasks.
navigation -- Captured specifically to track navigation events. While a page context represents the state of a page, a navigation context emphasizes the transition between pages, capturing referrer information, navigation timing, and transition reasons.
custom -- An escape hatch for extension-defined context types that do not fit the standard categories. The customData field (a string-keyed record) carries arbitrary structured data. Use this when building custom extension features that need to push domain-specific context through the Periscope pipeline.
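As a rough illustration of how an agent might branch on the type field, the sketch below uses an exhaustive switch. The describeContext helper and its return strings are hypothetical; only the seven type names and the section names come from the list above.

```typescript
// The seven context types listed above, as a string literal union.
type BrowserContextType =
  | "page" | "selection" | "element" | "form"
  | "media" | "navigation" | "custom";

// Hypothetical helper: maps a context type to a short note on what an
// agent would most likely read for it.
function describeContext(type: BrowserContextType): string {
  switch (type) {
    case "page": return "read pageMetadata and content";
    case "selection": return "read selection";
    case "element": return "inspect the focused element";
    case "form": return "read formData";
    case "media": return "read mediaElements";
    case "navigation": return "read navigation details";
    case "custom": return "read customData";
    default: {
      // Compile-time exhaustiveness check: this assignment stops
      // type-checking if a new context type is added without a case.
      const unreachable: never = type;
      throw new Error(`Unhandled context type: ${String(unreachable)}`);
    }
  }
}
```

The never assignment in the default branch is a design choice: it turns a forgotten context type into a compile error rather than a silent fall-through.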
Data Flow
Here is how a BrowserContext moves through the system:
- The user performs an action in the browser (navigates, selects text, fills a form).
- The Periscope browser extension captures the relevant DOM state and constructs a BrowserContext.
- The extension applies client-side privacy filtering before the data ever leaves the browser process.
- The filtered context is sent to the Periscope service via the sync protocol.
- The service validates the incoming data against Zod schemas, applies server-side privacy rules, then stores and indexes the context.
- AI agents query stored contexts via the REST API or receive them in real time via WebSocket channels.
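The flow above can be pictured as a chain of small stages. The following is a minimal sketch under loud assumptions: SketchContext, clientFilter, serverValidate, ingest, and the sensitive-path pattern are all hypothetical stand-ins, and plain checks take the place of the Zod schemas the service actually uses.

```typescript
// Simplified stand-in for a BrowserContext; the real type has more fields.
interface SketchContext {
  readonly id: string;
  readonly url: string;
  readonly text?: string;
}

type Stage = (ctx: SketchContext) => SketchContext;

// Hypothetical pattern for pages whose text should never leave the browser.
const SENSITIVE_PATH = /\/(login|checkout)/;

// Client-side filtering: drop page text on sensitive-looking URLs.
const clientFilter: Stage = (ctx) =>
  SENSITIVE_PATH.test(ctx.url) ? { id: ctx.id, url: ctx.url } : ctx;

// Server-side validation: reject structurally invalid contexts.
const serverValidate: Stage = (ctx) => {
  if (ctx.id.length === 0) throw new Error("context id is required");
  if (!/^https?:\/\//.test(ctx.url)) throw new Error("url must be http(s)");
  return ctx;
};

// In-memory stand-in for the service's storage and index.
const contextStore = new Map<string, SketchContext>();

// Runs a captured context through filter, validation, and storage.
function ingest(ctx: SketchContext): SketchContext {
  const accepted = serverValidate(clientFilter(ctx));
  contextStore.set(accepted.id, accepted);
  return accepted;
}
```

The point of the ordering is that filtering runs before validation and storage, so unfiltered data never reaches the store in this sketch.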
Core Fields
Every BrowserContext has these required fields:
| Field | Type | Description |
|---|---|---|
| id | ContextId | Unique identifier for this context snapshot. Branded string type. |
| type | BrowserContextType | One of the seven context types described above. |
| tabId | TabId | Identifies which browser tab this context came from. Branded string type. |
| userId | UserId | Identifies the user. Branded string type. |
| capturedAt | Date | When the context was captured in the browser. |
| privacyLevel | PrivacyLevel | One of public, private, restricted, or filtered. Controls downstream handling. |
| tabMetadata | BrowserTabMetadata | Tab state at capture time (URL, title, favicon, active/loading state, window ID). |
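Put together, the required fields might look like the interface below. This is a sketch assembled from the table, not the library's actual declaration: the identifier aliases are plain strings here for brevity, only two of the optional sections are shown, and all example values are placeholders.

```typescript
// Plain aliases for brevity; in Periscope these are branded strings.
type ContextId = string;
type TabId = string;
type UserId = string;

type BrowserContextType =
  | "page" | "selection" | "element" | "form"
  | "media" | "navigation" | "custom";
type PrivacyLevel = "public" | "private" | "restricted" | "filtered";

interface BrowserTabMetadata {
  readonly id: TabId;
  readonly url: string;
  readonly title?: string;
  readonly favicon?: string;
  readonly isActive: boolean;
  readonly isLoading: boolean;
  readonly windowId: number;
  readonly lastUpdated: Date;
}

interface BrowserContext {
  // Required core fields.
  readonly id: ContextId;
  readonly type: BrowserContextType;
  readonly tabId: TabId;
  readonly userId: UserId;
  readonly capturedAt: Date;
  readonly privacyLevel: PrivacyLevel;
  readonly tabMetadata: BrowserTabMetadata;
  // Optional sections (the remaining ones are omitted from this sketch).
  readonly content?: { readonly text?: string; readonly html?: string; readonly markdown?: string };
  readonly customData?: Record<string, unknown>;
}

// A minimal page context with only the required fields populated.
const example: BrowserContext = {
  id: "ctx_abc123",
  type: "page",
  tabId: "tab_def456",
  userId: "user_placeholder",
  capturedAt: new Date("2024-01-01T12:00:00Z"),
  privacyLevel: "public",
  tabMetadata: {
    id: "tab_def456",
    url: "https://example.com/article",
    isActive: true,
    isLoading: false,
    windowId: 1,
    lastUpdated: new Date("2024-01-01T12:00:00Z"),
  },
};
```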
Tab Metadata
The tabMetadata object is always present and provides the browser tab's state:
interface BrowserTabMetadata {
  readonly id: TabId; // Same as parent tabId
  readonly url: string; // Current page URL
  readonly title?: string; // Page title (from <title> tag)
  readonly favicon?: string; // Favicon URL
  readonly isActive: boolean; // Is this the focused tab?
  readonly isLoading: boolean; // Is the page still loading?
  readonly windowId: number; // Browser window identifier
  readonly lastUpdated: Date; // When tab state last changed
}
The isActive field is especially useful for agents that only care about the tab the user is currently looking at, as opposed to background tabs. The isLoading field helps agents understand whether the page content is complete or still being fetched.
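An agent that only wants the focused, fully loaded tab could combine the two flags as in the sketch below. TabState is a trimmed stand-in for BrowserTabMetadata, and isReadyForAgent is a hypothetical helper name.

```typescript
// Trimmed stand-in for BrowserTabMetadata with only the fields used here.
interface TabState {
  readonly url: string;
  readonly isActive: boolean;
  readonly isLoading: boolean;
}

// True only for the focused tab once its page has finished loading.
function isReadyForAgent(tab: TabState): boolean {
  return tab.isActive && !tab.isLoading;
}

const tabs: TabState[] = [
  { url: "https://example.com/a", isActive: false, isLoading: false },
  { url: "https://example.com/b", isActive: true, isLoading: true },
  { url: "https://example.com/c", isActive: true, isLoading: false },
];

// Keep only tabs whose content is both visible and complete.
const readyTabs = tabs.filter(isReadyForAgent);
```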
Page Metadata
The pageMetadata section is optional and contains structured information extracted from the page's HTML head:
interface PageContentMetadata {
  readonly title?: string;
  readonly description?: string;
  readonly keywords?: string[];
  readonly author?: string;
  readonly publishedDate?: Date;
  readonly modifiedDate?: Date;
  readonly canonicalUrl?: string;
  readonly language?: string;
  readonly contentType?: string;
  readonly wordCount?: number;
  readonly readingTime?: number; // Minutes
  readonly openGraph?: {
    readonly title?: string;
    readonly description?: string;
    readonly image?: string;
    readonly type?: string; // "article", "website", etc.
    readonly siteName?: string;
  };
  readonly twitter?: {
    readonly card?: string; // "summary", "summary_large_image", etc.
    readonly title?: string;
    readonly description?: string;
    readonly image?: string;
    readonly creator?: string; // @handle
  };
}
This metadata gives agents rich context about a page without requiring them to parse HTML. The Open Graph and Twitter Card fields are particularly valuable because they represent the page author's own summary of the content, often more concise and accurate than attempting to extract a summary from the page body.
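One way an agent might use these fields is to pick the first non-empty description in a fixed order of preference. The sketch below assumes an Open Graph, then Twitter Card, then plain description ordering; that ordering and the pickSummary name are illustrative choices, not something Periscope prescribes.

```typescript
// Only the description-bearing parts of PageContentMetadata.
interface SummarySource {
  readonly description?: string;
  readonly openGraph?: { readonly description?: string };
  readonly twitter?: { readonly description?: string };
}

// Returns the first non-blank description, or undefined if none exists.
function pickSummary(meta: SummarySource): string | undefined {
  const candidates = [
    meta.openGraph?.description,
    meta.twitter?.description,
    meta.description,
  ];
  return candidates.find((c) => c !== undefined && c.trim().length > 0);
}
```

Because every field is optional, the helper returns undefined rather than an empty string when no summary is available, which keeps "no summary" distinguishable from "blank summary".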
Content
The content section carries the actual page content in up to three formats:
readonly content?: {
  readonly text?: string; // Plain text extraction
  readonly html?: string; // Raw HTML (can be large)
  readonly markdown?: string; // Markdown conversion
};
Any of the three formats may be absent. The extension may send only text for performance reasons, or only markdown if the page has already been converted, so agents should fall back to another format when their preferred one is missing.
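A simple way to cope with missing formats is to walk an ordered preference list and take the first one present. In this sketch, pickContent and the PageContent alias are hypothetical names for illustration.

```typescript
type ContentFormat = "text" | "html" | "markdown";

// Mirrors the shape of the content section.
interface PageContent {
  readonly text?: string;
  readonly html?: string;
  readonly markdown?: string;
}

// Returns the first available format from the caller's preference order.
function pickContent(
  content: PageContent | undefined,
  preference: readonly ContentFormat[],
): { format: ContentFormat; body: string } | undefined {
  if (content === undefined) return undefined;
  for (const format of preference) {
    const body = content[format];
    if (body !== undefined) return { format, body };
  }
  return undefined;
}
```

An agent that prefers Markdown but tolerates plain text would call pickContent(ctx.content, ["markdown", "text"]) and handle an undefined result as "no usable content".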
Selection
When the context type is selection, this section will be populated:
interface TextSelection {
  readonly selectedText: string; // The actual selected text
  readonly startOffset: number; // Character offset from document start
  readonly endOffset: number; // Character offset from document start
  readonly anchorNode?: string; // DOM node where selection starts
  readonly focusNode?: string; // DOM node where selection ends
  readonly surroundingText?: string; // Text around the selection for context
  readonly selectionRange?: {
    readonly startContainer: string;
    readonly endContainer: string;
    readonly startOffset: number;
    readonly endOffset: number;
  };
}
The surroundingText field is useful for disambiguation. If a user selects the word "bank", the surrounding text helps the agent understand whether it refers to a financial institution, a river bank, or something else entirely.
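To hand a selection to a language model, an agent might mark the selected span inside its surrounding text. The sketch below assumes that surroundingText contains selectedText verbatim, which the model does not guarantee; the highlightSelection helper and the [[ ]] markers are hypothetical.

```typescript
// Only the two TextSelection fields this helper needs.
interface SelectionSnippet {
  readonly selectedText: string;
  readonly surroundingText?: string;
}

// Wraps the first occurrence of the selection in [[ ]] within its context.
function highlightSelection(sel: SelectionSnippet): string {
  const marked = `[[${sel.selectedText}]]`;
  const around = sel.surroundingText;
  if (around === undefined) return marked;

  const at = around.indexOf(sel.selectedText);
  // Fall back to listing the context separately if the text is not found.
  if (at === -1) return `${marked} (context: ${around})`;

  return around.slice(0, at) + marked + around.slice(at + sel.selectedText.length);
}
```

For the "bank" example, a surrounding text of "sat on the bank of the river" becomes "sat on the [[bank]] of the river", which gives the agent enough context to pick the right sense.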
Form Data
When the context type is form, the formData section captures the form state:
interface FormData {
  readonly formId?: string; // HTML id attribute
  readonly formAction?: string; // Form submission URL
  readonly formMethod?: string; // GET, POST, etc.
  readonly fields: FormField[]; // At least one field
}

interface FormField {
  readonly name: string; // Field name attribute
  readonly type: string; // "text", "email", "password", etc.
  readonly value?: string; // Current value (may be redacted)
  readonly placeholder?: string; // Placeholder text
  readonly required: boolean; // Is the field required?
}
Field values are subject to privacy filtering. Password fields and fields matching sensitive patterns will have their values redacted before the context leaves the browser. See the Privacy Model for details on how filtering works.
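The redaction described above could look roughly like the following. This is a sketch only: the SENSITIVE_NAME pattern, the "[REDACTED]" placeholder, and the redactField helper are assumptions for illustration, and the real rules are defined by the Privacy Model.

```typescript
interface FormField {
  readonly name: string;
  readonly type: string;
  readonly value?: string;
  readonly placeholder?: string;
  readonly required: boolean;
}

// Hypothetical pattern for field names that suggest sensitive data.
const SENSITIVE_NAME = /pass(word)?|ssn|card|cvv|secret|token/i;

// Returns a copy with the value masked when the field looks sensitive.
function redactField(field: FormField): FormField {
  const sensitive = field.type === "password" || SENSITIVE_NAME.test(field.name);
  if (!sensitive || field.value === undefined) return field;
  return { ...field, value: "[REDACTED]" };
}

const fields: FormField[] = [
  { name: "email", type: "email", value: "a@example.com", required: true },
  { name: "pw", type: "password", value: "hunter2", required: true },
  { name: "card_number", type: "text", value: "4111111111111111", required: false },
];

// Redact every field before the form data would leave the browser.
const safeFields = fields.map(redactField);
```

Returning a copy instead of mutating keeps the original field objects intact, which matches the readonly shape of the model.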
Media Elements
The mediaElements array captures images, videos, and audio elements on the page:
interface MediaElement {
  readonly type: "image" | "video" | "audio";
  readonly url: string; // Media source URL
  readonly alt?: string; // Alt text (images)
  readonly title?: string; // Title attribute
  readonly dimensions?: {
    readonly width: number;
    readonly height: number;
  };
}
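As one possible use of this array, a content-curation agent might flag images that lack alt text. The imagesMissingAlt helper below is a hypothetical example, not part of Periscope.

```typescript
interface MediaElement {
  readonly type: "image" | "video" | "audio";
  readonly url: string;
  readonly alt?: string;
  readonly title?: string;
  readonly dimensions?: { readonly width: number; readonly height: number };
}

// Returns images whose alt text is absent or blank.
function imagesMissingAlt(elements: readonly MediaElement[]): MediaElement[] {
  return elements.filter(
    (m) => m.type === "image" && (m.alt === undefined || m.alt.trim() === ""),
  );
}
```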
Context History
Periscope tracks how contexts relate to each other over time through ContextHistoryEntry:
interface ContextHistoryEntry {
  readonly context: BrowserContext;
  readonly previousContextId?: ContextId;
  readonly navigationReason?:
    | "user_action"
    | "redirect"
    | "back_forward"
    | "refresh";
  readonly timeOnPage?: number; // Seconds spent on previous context
}
The previousContextId field creates a linked list of contexts, letting agents reconstruct the user's browsing path. The navigationReason field distinguishes intentional navigation from redirects or back/forward actions, and timeOnPage indicates how long the user spent on the previous page -- a proxy for engagement level.
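Walking the linked list backwards from the most recent entry recovers the browsing path. The sketch below uses a trimmed HistoryEntrySketch type and a hypothetical browsingPath helper; it also guards against cycles and missing entries, which real data may or may not contain.

```typescript
// Trimmed stand-in for ContextHistoryEntry.
interface HistoryEntrySketch {
  readonly context: { readonly id: string; readonly url: string };
  readonly previousContextId?: string;
}

// Follows previousContextId links from latestId and returns URLs oldest-first.
function browsingPath(
  entries: readonly HistoryEntrySketch[],
  latestId: string,
): string[] {
  const byId = new Map(entries.map((e) => [e.context.id, e] as const));
  const path: string[] = [];
  const seen = new Set<string>();
  let cursor: string | undefined = latestId;

  while (cursor !== undefined && !seen.has(cursor)) {
    seen.add(cursor); // Guard against accidental cycles.
    const entry: HistoryEntrySketch | undefined = byId.get(cursor);
    if (entry === undefined) break; // Chain points at an entry we do not have.
    path.push(entry.context.url);
    cursor = entry.previousContextId;
  }
  return path.reverse();
}
```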
Querying Contexts
The ContextQueryFilter type defines how agents search for stored contexts:
interface ContextQueryFilter {
  readonly userId?: UserId;
  readonly tabId?: TabId;
  readonly contextType?: BrowserContextType;
  readonly privacyLevel?: PrivacyLevel;
  readonly urlPattern?: string; // Regex or glob pattern
  readonly timeRange?: {
    readonly start: Date;
    readonly end: Date;
  };
  readonly limit?: number; // Max 500
  readonly offset?: number; // For pagination
}
All filter fields are optional. An empty filter returns all contexts for the authenticated user (up to the default limit). Combining filters narrows results -- for example, querying for contextType: "selection" and a timeRange of the last hour returns all text selections the user made recently.
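The "selections from the last hour" query could be built as shown below. The recentSelections helper is hypothetical, the filter type is a trimmed copy of ContextQueryFilter, and the fallback limit of 50 is an assumption since the actual default is not stated here; only the cap of 500 comes from the interface.

```typescript
// Trimmed copy of ContextQueryFilter with the fields used here.
interface QueryFilterSketch {
  readonly contextType?:
    | "page" | "selection" | "element" | "form"
    | "media" | "navigation" | "custom";
  readonly timeRange?: { readonly start: Date; readonly end: Date };
  readonly limit?: number;
  readonly offset?: number;
}

const MAX_LIMIT = 500; // Documented maximum for limit.
const ONE_HOUR_MS = 60 * 60 * 1000;

// Builds a filter for text selections captured in the hour before `now`.
function recentSelections(now: Date, limit: number = 50): QueryFilterSketch {
  return {
    contextType: "selection",
    timeRange: { start: new Date(now.getTime() - ONE_HOUR_MS), end: now },
    // Clamp to a whole number between 1 and the documented maximum.
    limit: Math.min(Math.max(1, Math.floor(limit)), MAX_LIMIT),
  };
}
```

Passing now as a parameter rather than calling new Date() inside the helper keeps the function deterministic and easy to test.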
Branded Types
Periscope uses TypeScript branded types for identifiers: ContextId, TabId, and UserId. These are string types at runtime but carry a compile-time brand that prevents accidentally passing a TabId where a ContextId is expected.
// These are just strings at runtime, but TypeScript catches mix-ups:
const contextId: ContextId = createContextId("ctx_abc123");
const tabId: TabId = createTabId("tab_def456");
// This would be a compile error:
// findContextById(tabId); // Error: TabId is not assignable to ContextId
The create* factory functions are zero-cost type assertions. There is no runtime validation -- that is handled separately by Zod schemas. The brands exist purely to improve developer experience by catching identifier confusion at compile time rather than at runtime.
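For readers unfamiliar with the pattern, here is one common way to declare a branded string in TypeScript. The Branded helper and the __brand property name are assumptions for illustration; Periscope's actual declarations may differ, and findContextById is a stub.

```typescript
// Intersecting with a phantom property makes otherwise identical
// string types incompatible at compile time.
type Branded<T, B extends string> = T & { readonly __brand: B };

type ContextId = Branded<string, "ContextId">;
type TabId = Branded<string, "TabId">;

// Zero-cost factories: a type assertion only, no runtime validation.
const createContextId = (raw: string): ContextId => raw as ContextId;
const createTabId = (raw: string): TabId => raw as TabId;

// Stub lookup that only accepts a ContextId.
function findContextById(id: ContextId): string {
  return `looking up ${id}`;
}

const contextId = createContextId("ctx_abc123");
const tabId = createTabId("tab_def456");

const lookup = findContextById(contextId); // OK

// @ts-expect-error TabId is not assignable to ContextId
findContextById(tabId);
```

The __brand property never exists at runtime, so the values remain ordinary strings and serialize as such.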
Incremental Updates
For efficiency, Periscope supports partial context updates via ContextDelta:
type ContextDelta = Partial<BrowserContext>;

interface BrowserContextWithDelta {
  readonly context: BrowserContext;
  readonly delta?: ContextDelta;
}
When a context changes incrementally (for example, a form field value changes but everything else stays the same), the extension can send just the delta rather than the full context. The service merges the delta into the stored context. This reduces bandwidth and processing overhead, which matters when the extension is capturing high-frequency changes.
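How the service merges a delta is not specified here, so the following is a minimal sketch that assumes a shallow merge: each top-level key present in the delta replaces the stored value wholesale. The applyDelta helper and the FormContextSketch type are hypothetical.

```typescript
// Trimmed stand-in for a form context.
interface FormContextSketch {
  readonly id: string;
  readonly url: string;
  readonly fieldValues: Readonly<Record<string, string>>;
}

// Shallow merge: top-level keys in the delta override the stored ones.
// Nested objects are replaced, not merged, and a key explicitly set to
// undefined in the delta overwrites the stored value.
function applyDelta<T extends object>(stored: T, delta: Partial<T>): T {
  return { ...stored, ...delta };
}

const stored: FormContextSketch = {
  id: "ctx_abc123",
  url: "https://example.com/signup",
  fieldValues: { email: "a@example.com" },
};

// Only the changed section travels over the wire.
const delta: Partial<FormContextSketch> = {
  fieldValues: { email: "b@example.com" },
};

const merged = applyDelta(stored, delta);
```

A shallow merge keeps the logic simple but means a delta must carry a complete replacement for any nested section it touches; a deep merge would lift that restriction at the cost of more complex semantics.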
Next Steps
- Event System -- Learn about the event envelope and payload types that wrap context changes
- Privacy Model -- Understand how context data is filtered before reaching agents
- Sync Protocol -- See how contexts are transmitted between extension and service