Browser Context Model

The Browser Context Model is the central data structure in Periscope. Every piece of information that flows from the browser extension to the Periscope service and ultimately to AI agents is represented as a BrowserContext. Understanding this model is essential for building integrations that consume browser state, query context history, or implement custom privacy filtering.

What Is a BrowserContext?

A BrowserContext is a snapshot of observable browser state at a specific moment in time. It captures what the user is looking at, interacting with, or navigating through. When your AI agent asks "what is the user doing in their browser right now?", the answer comes as a BrowserContext.

The design reflects a key insight: browser state is not a single thing. A user might be reading a page, filling out a form, selecting text, watching a video, or doing several of these simultaneously. Rather than modeling each activity as a completely separate data structure, Periscope uses a single BrowserContext type with optional sections. Every context has core fields (identity, timing, privacy, tab metadata), and then optional sections (page metadata, content, selection, form data, media elements, custom data) that are populated based on what the user is actually doing.

This keeps the API surface small while accommodating a wide range of browser activities.

Context Types

The type field on a BrowserContext categorizes what kind of browser state is being captured. There are seven context types:

page -- The most common type. Captured when the user navigates to a new page or when page content changes significantly. Contains the page URL, title, metadata (Open Graph, Twitter Cards, SEO attributes), and optionally the page content itself in text, HTML, or Markdown format. If your agent needs to know "what page is the user on?", this is the context type you query.

selection -- Captured when the user selects text on a page. The selection section will contain the selected text, character offsets, and surrounding text for disambiguation. This is particularly valuable for agents that need to understand what specific content the user is focusing on, not just which page they are visiting.

element -- Captured when a single DOM element is the focus of activity. This type is less common than page or selection and is typically used when the extension detects interaction with an interactive element that fits neither the form nor the media category.

form -- Captured when the user is interacting with a form. The formData section will contain the form fields, their types, and their current values (subject to privacy filtering). This enables agents to assist with form completion, data validation, or understanding what workflow the user is in the middle of.

media -- Captured when the user interacts with media elements (images, videos, audio players). The mediaElements array will contain details about each media element including its URL, type, alt text, and dimensions. Useful for agents that help with content curation or media-related tasks.

navigation -- Captured specifically to track navigation events. While a page context represents the state of a page, a navigation context emphasizes the transition between pages, capturing referrer information, navigation timing, and transition reasons.

custom -- An escape hatch for extension-defined context types that do not fit the standard categories. The customData field (a string-keyed record) carries arbitrary structured data. Use this when building custom extension features that need to push domain-specific context through the Periscope pipeline.
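Because type is a closed set of seven values, a consumer can branch on it with an exhaustive switch. The sketch below is illustrative only: ContextSketch and describeContext are hypothetical names, and the flattened url and selectedText fields stand in for the real tabMetadata and selection sections.

```typescript
// The seven context types listed above, as a string-literal union.
type BrowserContextType =
  | "page" | "selection" | "element" | "form" | "media" | "navigation" | "custom";

// Hypothetical, trimmed-down context shape used only for this example.
interface ContextSketch {
  readonly type: BrowserContextType;
  readonly url: string;
  readonly selectedText?: string;
}

function describeContext(ctx: ContextSketch): string {
  switch (ctx.type) {
    case "page":
      return `viewing ${ctx.url}`;
    case "selection":
      return `selected "${ctx.selectedText ?? ""}" on ${ctx.url}`;
    case "element":
      return `interacting with an element on ${ctx.url}`;
    case "form":
      return `filling out a form on ${ctx.url}`;
    case "media":
      return `interacting with media on ${ctx.url}`;
    case "navigation":
      return `navigated to ${ctx.url}`;
    case "custom":
      return `custom context from ${ctx.url}`;
    default: {
      // Exhaustiveness check: stops compiling if a type is added to the union
      // without a matching case.
      const unreachable: never = ctx.type;
      return unreachable;
    }
  }
}
```

The never assignment in the default branch is a standard TypeScript idiom. It turns a forgotten context type into a compile error instead of a silent fall-through.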

Data Flow

Here is how a BrowserContext moves through the system:

  1. The user performs an action in the browser (navigates, selects text, fills a form).
  2. The Periscope browser extension captures the relevant DOM state and constructs a BrowserContext.
  3. The extension applies client-side privacy filtering before the data ever leaves the browser process.
  4. The filtered context is sent to the Periscope service via the sync protocol.
  5. The service validates the incoming data against Zod schemas, applies server-side privacy rules, then stores and indexes the context.
  6. AI agents query stored contexts via the REST API or receive them in real time via WebSocket channels.

Core Fields

Every BrowserContext has these required fields:

| Field | Type | Description |
| --- | --- | --- |
| id | ContextId | Unique identifier for this context snapshot. Branded string type. |
| type | BrowserContextType | One of the seven context types described above. |
| tabId | TabId | Identifies which browser tab this context came from. Branded string type. |
| userId | UserId | Identifies the user. Branded string type. |
| capturedAt | Date | When the context was captured in the browser. |
| privacyLevel | PrivacyLevel | One of public, private, restricted, or filtered. Controls downstream handling. |
| tabMetadata | BrowserTabMetadata | Tab state at capture time (URL, title, favicon, active/loading state, window ID). |

Tab Metadata

The tabMetadata object is always present and provides the browser tab's state:

typescript
interface BrowserTabMetadata {
  readonly id: TabId; // Same as parent tabId
  readonly url: string; // Current page URL
  readonly title?: string; // Page title (from <title> tag)
  readonly favicon?: string; // Favicon URL
  readonly isActive: boolean; // Is this the focused tab?
  readonly isLoading: boolean; // Is the page still loading?
  readonly windowId: number; // Browser window identifier
  readonly lastUpdated: Date; // When tab state last changed
}

The isActive field is especially useful for agents that only care about the tab the user is currently looking at, as opposed to background tabs. The isLoading field helps agents understand whether the page content is complete or still being fetched.
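As a minimal sketch of that filtering, the helper below keeps only tabs that are both focused and finished loading. TabStateSketch and readyActiveTabs are hypothetical names introduced for illustration; only the isActive, isLoading, and url fields come from BrowserTabMetadata.

```typescript
// Trimmed-down stand-in for BrowserTabMetadata with only the fields used here.
interface TabStateSketch {
  readonly url: string;
  readonly isActive: boolean;
  readonly isLoading: boolean;
}

// Keep only tabs the user is looking at whose content has finished loading.
function readyActiveTabs(tabs: readonly TabStateSketch[]): TabStateSketch[] {
  return tabs.filter((tab) => tab.isActive && !tab.isLoading);
}

const sampleTabs: TabStateSketch[] = [
  { url: "https://example.com/docs", isActive: true, isLoading: false },
  { url: "https://example.com/feed", isActive: false, isLoading: false },
  { url: "https://example.com/slow", isActive: true, isLoading: true },
];

// Only the first sample tab is both active and fully loaded.
const readyTabs = readyActiveTabs(sampleTabs);
```

A browser with several windows can have one active tab per window, so an agent that needs a single "current" tab may also want to group by windowId.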

Page Metadata

The pageMetadata section is optional and contains structured information extracted from the page's HTML head:

typescript
interface PageContentMetadata {
  readonly title?: string;
  readonly description?: string;
  readonly keywords?: string[];
  readonly author?: string;
  readonly publishedDate?: Date;
  readonly modifiedDate?: Date;
  readonly canonicalUrl?: string;
  readonly language?: string;
  readonly contentType?: string;
  readonly wordCount?: number;
  readonly readingTime?: number; // Minutes
  readonly openGraph?: {
    readonly title?: string;
    readonly description?: string;
    readonly image?: string;
    readonly type?: string; // "article", "website", etc.
    readonly siteName?: string;
  };
  readonly twitter?: {
    readonly card?: string; // "summary", "summary_large_image", etc.
    readonly title?: string;
    readonly description?: string;
    readonly image?: string;
    readonly creator?: string; // @handle
  };
}

This metadata gives agents rich context about a page without requiring them to parse HTML. The Open Graph and Twitter Card fields are particularly valuable because they represent the page author's own summary of the content, often more concise and accurate than attempting to extract a summary from the page body.
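One way an agent might use these fields is to pick the best available summary with a fallback chain. This is a sketch, not a Periscope API: SummaryFields and pickSummary are hypothetical names, and the precedence order (Open Graph, then Twitter Card, then the plain description) is an assumption.

```typescript
// Subset of PageContentMetadata containing only the summary-related fields.
interface SummaryFields {
  readonly description?: string;
  readonly openGraph?: { readonly description?: string };
  readonly twitter?: { readonly description?: string };
}

// Prefer the author-written social summaries, then the plain meta description.
// The precedence order is an assumption of this sketch, not a Periscope rule.
function pickSummary(meta: SummaryFields | undefined): string | undefined {
  return (
    meta?.openGraph?.description ??
    meta?.twitter?.description ??
    meta?.description
  );
}
```

Since pageMetadata and every field inside it is optional, the function accepts undefined and returns undefined when no summary exists. Callers then have to handle the missing case explicitly.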

Content

The content section carries the actual page content in up to three formats:

typescript
readonly content?: {
  readonly text?: string;       // Plain text extraction
  readonly html?: string;       // Raw HTML (can be large)
  readonly markdown?: string;   // Markdown conversion
};

Not all formats are always populated. The extension may choose to send only text for performance reasons, or only markdown if the page has already been converted. Agents should handle the case where their preferred format is absent.
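A minimal sketch of handling an absent format: walk a preference list and return the first format that is populated. ContentFormats, ContentFormatName, and pickContent are hypothetical names, and the default preference order is an arbitrary choice for illustration.

```typescript
// Mirrors the three optional formats of the content section.
interface ContentFormats {
  readonly text?: string;
  readonly html?: string;
  readonly markdown?: string;
}

type ContentFormatName = keyof ContentFormats;

// Return the first populated format from a caller-supplied preference order.
function pickContent(
  content: ContentFormats | undefined,
  preference: readonly ContentFormatName[] = ["markdown", "text", "html"],
): { format: ContentFormatName; body: string } | undefined {
  if (!content) return undefined;
  for (const format of preference) {
    const body = content[format];
    if (body !== undefined && body.length > 0) return { format, body };
  }
  return undefined;
}
```

Returning the chosen format alongside the body lets the caller decide whether further conversion is needed, for example stripping tags when only html was available.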

Selection

When the context type is selection, the selection section is populated:

typescript
interface TextSelection {
  readonly selectedText: string; // The actual selected text
  readonly startOffset: number; // Character offset from document start
  readonly endOffset: number; // Character offset from document start
  readonly anchorNode?: string; // DOM node where selection starts
  readonly focusNode?: string; // DOM node where selection ends
  readonly surroundingText?: string; // Text around the selection for context
  readonly selectionRange?: {
    readonly startContainer: string;
    readonly endContainer: string;
    readonly startOffset: number;
    readonly endOffset: number;
  };
}

The surroundingText field is useful for disambiguation. If a user selects the word "bank", the surrounding text helps the agent understand whether it refers to a financial institution, a river bank, or something else entirely.

Form Data

When the context type is form, the formData section captures the form state:

typescript
interface FormData {
  readonly formId?: string; // HTML id attribute
  readonly formAction?: string; // Form submission URL
  readonly formMethod?: string; // GET, POST, etc.
  readonly fields: FormField[]; // At least one field
}

interface FormField {
  readonly name: string; // Field name attribute
  readonly type: string; // "text", "email", "password", etc.
  readonly value?: string; // Current value (may be redacted)
  readonly placeholder?: string; // Placeholder text
  readonly required: boolean; // Is the field required?
}

Field values are subject to privacy filtering. Password fields and fields matching sensitive patterns will have their values redacted before the context leaves the browser. See the Privacy Model for details on how filtering works.
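To make the idea concrete, here is a rough sketch of field-level redaction. It is not Periscope's actual filter: the SENSITIVE_NAME pattern, the REDACTED placeholder, and the redactField helper are all assumptions made for illustration, and the real rules are described in the Privacy Model.

```typescript
// Same shape as FormField above, repeated so the example is self-contained.
interface FormFieldSketch {
  readonly name: string;
  readonly type: string;
  readonly value?: string;
  readonly placeholder?: string;
  readonly required: boolean;
}

// Hypothetical pattern and placeholder; the real rules live in the Privacy Model.
const SENSITIVE_NAME = /pass(word)?|ssn|card|cvv|secret|token/i;
const REDACTED = "[REDACTED]";

function redactField(field: FormFieldSketch): FormFieldSketch {
  const sensitive = field.type === "password" || SENSITIVE_NAME.test(field.name);
  // Fields without a value are returned unchanged, so "empty" stays
  // distinguishable from "redacted".
  if (!sensitive || field.value === undefined) return field;
  return { ...field, value: REDACTED };
}
```

The function returns a new object rather than mutating its input, which matches the readonly fields used throughout the model.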

Media Elements

The mediaElements array captures images, videos, and audio elements on the page:

typescript
interface MediaElement {
  readonly type: "image" | "video" | "audio";
  readonly url: string; // Media source URL
  readonly alt?: string; // Alt text (images)
  readonly title?: string; // Title attribute
  readonly dimensions?: {
    readonly width: number;
    readonly height: number;
  };
}

Context History

Periscope tracks how contexts relate to each other over time through ContextHistoryEntry:

typescript
interface ContextHistoryEntry {
  readonly context: BrowserContext;
  readonly previousContextId?: ContextId;
  readonly navigationReason?:
    | "user_action"
    | "redirect"
    | "back_forward"
    | "refresh";
  readonly timeOnPage?: number; // Seconds spent on previous context
}

The previousContextId field creates a linked list of contexts, letting agents reconstruct the user's browsing path. The navigationReason field distinguishes intentional navigation from redirects or back/forward actions, and timeOnPage indicates how long the user spent on the previous page -- a proxy for engagement level.
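Reconstructing a browsing path amounts to walking that linked list backwards. The sketch below assumes the agent already holds a batch of history entries in memory. HistoryEntrySketch and browsingPath are hypothetical names, and plain strings stand in for the branded ContextId.

```typescript
// Minimal stand-ins: only the fields needed to follow the chain.
interface HistoryEntrySketch {
  readonly context: { readonly id: string; readonly url: string };
  readonly previousContextId?: string;
}

// Walk previousContextId links backwards from startId, returning URLs oldest-first.
function browsingPath(
  entries: readonly HistoryEntrySketch[],
  startId: string,
): string[] {
  const byId = new Map<string, HistoryEntrySketch>();
  for (const entry of entries) byId.set(entry.context.id, entry);

  const path: string[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let currentId: string | undefined = startId;
  while (currentId !== undefined && !seen.has(currentId)) {
    seen.add(currentId);
    const entry = byId.get(currentId);
    if (!entry) break; // the chain leaves the set of entries we were given
    path.push(entry.context.url);
    currentId = entry.previousContextId;
  }
  return path.reverse();
}
```

The walk stops when it reaches an entry with no previousContextId or when the chain points at a context outside the supplied batch. A truncated history therefore yields a partial path rather than an error.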

Querying Contexts

The ContextQueryFilter type defines how agents search for stored contexts:

typescript
interface ContextQueryFilter {
  readonly userId?: UserId;
  readonly tabId?: TabId;
  readonly contextType?: BrowserContextType;
  readonly privacyLevel?: PrivacyLevel;
  readonly urlPattern?: string; // Regex or glob pattern
  readonly timeRange?: {
    readonly start: Date;
    readonly end: Date;
  };
  readonly limit?: number; // Max 500
  readonly offset?: number; // For pagination
}

All filter fields are optional. An empty filter returns all contexts for the authenticated user (up to the default limit). Combining filters narrows results -- for example, querying for contextType: "selection" and a timeRange of the last hour returns all text selections the user made recently.
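The filter semantics can be sketched against an in-memory list. This is not the service's query implementation: FilterSketch, StoredContextSketch, and applyFilter are hypothetical, the time bounds are assumed to be inclusive, and falling back to 500 when limit is omitted is a guess, since the default limit is not stated here.

```typescript
type ContextTypeName =
  | "page" | "selection" | "element" | "form" | "media" | "navigation" | "custom";

// Subset of ContextQueryFilter covered by this sketch.
interface FilterSketch {
  readonly contextType?: ContextTypeName;
  readonly timeRange?: { readonly start: Date; readonly end: Date };
  readonly limit?: number;
  readonly offset?: number;
}

interface StoredContextSketch {
  readonly type: ContextTypeName;
  readonly capturedAt: Date;
}

// Apply the filter to an in-memory list; absent filter fields match everything.
function applyFilter<T extends StoredContextSketch>(
  contexts: readonly T[],
  filter: FilterSketch,
): T[] {
  const matches = contexts.filter((ctx) => {
    if (filter.contextType !== undefined && ctx.type !== filter.contextType) return false;
    if (filter.timeRange) {
      const t = ctx.capturedAt.getTime();
      // Treating both bounds as inclusive is an assumption.
      if (t < filter.timeRange.start.getTime() || t > filter.timeRange.end.getTime()) return false;
    }
    return true;
  });
  const offset = filter.offset ?? 0;
  // 500 is the documented maximum; using it as the default is an assumption.
  const limit = Math.min(filter.limit ?? 500, 500);
  return matches.slice(offset, offset + limit);
}

// "All text selections from the last hour", relative to a fixed reference time.
const referenceTime = new Date("2024-01-01T12:00:00Z");
const lastHourSelections: FilterSketch = {
  contextType: "selection",
  timeRange: { start: new Date(referenceTime.getTime() - 60 * 60 * 1000), end: referenceTime },
};
```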

Branded Types

Periscope uses TypeScript branded types for identifiers: ContextId, TabId, and UserId. These are string types at runtime but carry a compile-time brand that prevents accidentally passing a TabId where a ContextId is expected.

typescript
// These are just strings at runtime, but TypeScript catches mix-ups:
const contextId: ContextId = createContextId("ctx_abc123");
const tabId: TabId = createTabId("tab_def456");

// This would be a compile error:
// findContextById(tabId);  // Error: Argument of type 'TabId' is not assignable to parameter of type 'ContextId'

The create* factory functions are zero-cost type assertions. There is no runtime validation -- that is handled separately by Zod schemas. The brands exist purely to improve developer experience by catching identifier confusion at compile time rather than at runtime.
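For readers unfamiliar with the pattern, here is one common way to define branded types in TypeScript. Periscope's actual definitions are not shown on this page, so the Brand helper, the __brand property, and the make* factories below are assumptions used only to illustrate the technique.

```typescript
// A brand is a phantom property that exists only in the type system.
type Brand<T, Name extends string> = T & { readonly __brand: Name };

type SketchContextId = Brand<string, "ContextId">;
type SketchTabId = Brand<string, "TabId">;

// Zero-cost factories: a type assertion only, no runtime validation.
const makeContextId = (raw: string): SketchContextId => raw as SketchContextId;
const makeTabId = (raw: string): SketchTabId => raw as SketchTabId;

function contextLabel(id: SketchContextId): string {
  return `context:${id}`;
}

const sketchCtxId = makeContextId("ctx_abc123");
const sketchTabId = makeTabId("tab_def456");

const ctxLabel = contextLabel(sketchCtxId);
// contextLabel(sketchTabId); // compile error: the brands do not match
```

At runtime both identifiers are ordinary strings. The __brand property is never actually set, so the pattern adds no runtime overhead.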

Incremental Updates

For efficiency, Periscope supports partial context updates via ContextDelta:

typescript
type ContextDelta = Partial<BrowserContext>;

interface BrowserContextWithDelta {
  readonly context: BrowserContext;
  readonly delta?: ContextDelta;
}

When a context changes incrementally (for example, a form field value changes but everything else stays the same), the extension can send just the delta rather than the full context. The service merges the delta into the stored context. This reduces bandwidth and processing overhead, which matters when the extension is capturing high-frequency changes.
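The merge step might look like the following sketch. It assumes a shallow merge, in which each top-level section present in the delta replaces the stored section wholesale. Whether the service merges shallowly or deeply is not specified here, and DeltaContextSketch and applyDelta are hypothetical names.

```typescript
// Trimmed-down context with one nested section, enough to show merge behavior.
interface DeltaContextSketch {
  readonly id: string;
  readonly type: string;
  readonly capturedAt: Date;
  readonly formData?: {
    readonly fields: readonly { readonly name: string; readonly value?: string }[];
  };
}

type DeltaSketch = Partial<DeltaContextSketch>;

// Shallow merge: any top-level key present in the delta replaces the stored value.
// Re-applying stored.id last means a delta cannot change the context's identity
// (a choice made for this sketch).
function applyDelta(stored: DeltaContextSketch, delta: DeltaSketch): DeltaContextSketch {
  return { ...stored, ...delta, id: stored.id };
}
```

With object spread, a delta key explicitly set to undefined overwrites the stored value. A production merge would likely need to distinguish "unchanged" from "cleared".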

Next Steps

  • Event System -- Learn about the event envelope and payload types that wrap context changes
  • Privacy Model -- Understand how context data is filtered before reaching agents
  • Sync Protocol -- See how contexts are transmitted between extension and service