Reference Implementations

prompt-driven-analyzer

Full prompt management flow with telemetry, feedback, and composed layers.

APIs Used

  • ctx.prompts.load()
  • ctx.prompts.compose()
  • ctx.llm

Capabilities Required

analysis/prompt-driven

What this demonstrates

  1. ctx.prompts.load() to retrieve versioned, registry-managed prompts
  2. ctx.prompts.compose() to layer system + user + context prompt fragments
  3. prompt.toCallMetadata() for telemetry-linked prompt tracking
  4. ctx.llm.complete() with composed prompts and cost tracking
  5. Canonical prompt management pattern: load → compose → complete → feedback (condensed in the sketch below)
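
Condensed, the pattern is only a handful of calls. This is a minimal sketch, assuming the same ctx.* APIs as the full listing below; the `sketch` wrapper, the prompt keys, and the 50-character quality heuristic are illustrative, not SDK requirements:

```typescript
// Minimal sketch of the canonical load → compose → complete → feedback loop,
// written as a bare execute body. Uses the same ctx.* APIs as the full
// reference below; names and thresholds here are illustrative placeholders.
import type { ExecutionContext } from '@human/agent-sdk';

const sketch = async (ctx: ExecutionContext, content: string): Promise<string> => {
  // load: resolve a versioned prompt from the registry by short key
  const prompt = await ctx.prompts.load('task-summarize');

  // compose: stack multiple prompt layers (the full reference only logs the
  // composed metadata; it is shown here to complete the pattern's shape)
  const composed = await ctx.prompts.compose(['task-summarize', 'task-analyze']);
  ctx.log.info('Composed layers', { layers: composed.layers.length });

  // complete: thread prompt identity into the LLM call via toCallMetadata()
  const result = await ctx.llm.complete({
    prompt: [{ role: 'system', content: prompt.render({ content }) }],
    promptMetadata: prompt.toCallMetadata(),
  });

  // feedback: emit a telemetry signal tied to the call's provenance id
  await ctx.llm.recordPromptFeedback({
    provenanceId: result.provenanceId,
    signal: result.content.length > 50 ? 'positive' : 'negative',
    source: 'agent',
  });

  return result.content;
};
```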
The full reference implementation:

```typescript
/**
 * Prompt-Driven Analyzer - Reference Agent
 *
 * Canon: KB 105 (Agent SDK Architecture, Prompt Versioning)
 *
 * Demonstrates the FULL ctx.prompts and telemetry flow:
 * - ctx.prompts.load() with short key resolution
 * - prompt.render() with schema-validated variables
 * - ctx.prompts.compose() for multi-layer prompt stacks
 * - ctx.llm.complete() with promptMetadata threading
 * - ctx.llm.recordPromptFeedback() for telemetry signals
 *
 * This is the canonical reference for how agents use managed prompts.
 *
 * Delegation requirements:
 * - prompt:read:* (or specific prompt:read:companion.task.* scope)
 * - llm:complete:*
 */
import { handler, withProvenanceContext } from '@human/agent-sdk';
import type { ExecutionContext } from '@human/agent-sdk';

export const AGENT_ID = 'prompt-driven-analyzer';
export const VERSION = '1.0.0';
export const CAPABILITIES = ['analysis/prompt-driven'];

export interface PromptDrivenAnalyzerInput {
  /** The text content to analyze */
  content: string;
  /** Analysis focus areas (optional, uses prompt defaults) */
  focus_areas?: string;
  /** Output format preference */
  output_format?: string;
}

export interface PromptDrivenAnalyzerOutput {
  success: boolean;
  analysis: string;
  prompt_uri: string;
  model_used: string;
  token_cost: number;
  provenance_id: string;
}

const execute = async (
  ctx: ExecutionContext,
  input: PromptDrivenAnalyzerInput
): Promise<PromptDrivenAnalyzerOutput> => {
  ctx.log.info('Starting prompt-driven analysis', {
    contentLength: input.content.length,
    focusAreas: input.focus_areas,
  });

  // ── Step 1: Load a prompt by short key ──
  //
  // Short key 'task-summarize' resolves to:
  //   prompt://org/{ctx.orgId}/companion.task.summarize@active
  //   (org-first, then core fallback)
  //
  // Delegation check: prompt:read:companion.task.summarize (or prompt:read:*)
  const prompt = await ctx.prompts.load('task-summarize');
  ctx.log.info('Prompt loaded', {
    uri: prompt.uri,
    version: prompt.meta.version,
  });

  // ── Step 2: Render with schema-validated variables ──
  //
  // If the prompt has an inputSchema, variables are validated:
  // - Required fields checked
  // - Defaults applied for optional fields
  // - Warnings logged for unknown variables
  const rendered = prompt.render({
    content: input.content,
    ...(input.focus_areas ? { focus_areas: input.focus_areas } : {}),
    ...(input.output_format ? { output_format: input.output_format } : {}),
  });

  // ── Step 3: Estimate token cost before calling the LLM ──
  const estimatedTokens = ctx.prompts.estimateTokens('task-summarize');
  ctx.log.info('Token estimate', { estimatedTokens });

  // ── Step 3b: Demonstrate compose (multi-layer stack) ──
  //
  // Compose merges multiple prompts in order and returns metadata for
  // telemetry. Use this when you need system + task + context prompts stacked.
  const composed = await ctx.prompts.compose(['task-summarize', 'task-analyze']);
  ctx.log.info('Composed prompt', {
    layers: composed.layers.length,
    estimatedTokens: composed.estimatedTokens,
  });

  // ── Step 3c: Demonstrate getEffective (inheritance resolved) ──
  //
  // If the prompt extends a parent, getEffective merges the full chain.
  // Returns undefined if not found (no error thrown).
  const effective = await ctx.prompts.getEffective('task-summarize');
  if (effective) {
    ctx.log.info('Effective prompt resolved', { uri: effective.uri });
  }

  // ── Step 4: LLM call with prompt identity threaded ──
  //
  // promptMetadata is generated by prompt.toCallMetadata() and contains:
  // prompt URI, version, composition method, variables used.
  //
  // This metadata flows through:
  //   1. LLMContextImpl -> provenance log (prompt identity preserved)
  //   2. PromptCallLogger -> telemetry (aggregated by prompt)
  //   3. ModelRegistry -> model affinity (which model works best for this prompt)
  const result = await ctx.llm.complete({
    prompt: [
      { role: 'system', content: rendered },
      { role: 'user', content: 'Please provide the analysis.' },
    ],
    temperature: 0.3,
    promptMetadata: prompt.toCallMetadata(),
  });

  // ── Step 5: Record feedback based on a quality signal ──
  //
  // The feedback signal feeds into:
  // - Prompt performance snapshots (aggregated positive/negative rates)
  // - Model affinity scoring (which model produces better results for this prompt)
  // - Prompt Refinement Agent (identifies underperformers for improvement proposals)
  if (result.content.length > 50) {
    await ctx.llm.recordPromptFeedback({
      provenanceId: result.provenanceId,
      signal: 'positive',
      source: 'agent',
      detail: 'Analysis produced substantial output',
    });
  } else {
    await ctx.llm.recordPromptFeedback({
      provenanceId: result.provenanceId,
      signal: 'negative',
      source: 'agent',
      detail: 'Analysis output was too short',
    });
  }

  // P7 Verifiability: log agent completion to provenance
  const provenanceId = await ctx.provenance.log(
    withProvenanceContext(ctx, {
      action: 'prompt_driven_analyzer:completed',
      status: 'success',
      input: { contentLength: input.content.length },
      output: { prompt_uri: prompt.uri, model_used: result.model },
    })
  );

  return {
    success: true,
    analysis: result.content,
    prompt_uri: prompt.uri,
    model_used: result.model,
    token_cost: result.cost.usd,
    provenance_id: provenanceId,
  };
};

export default handler({
  name: AGENT_ID,
  id: AGENT_ID,
  version: VERSION,
  capabilities: CAPABILITIES,
  manifest: {
    operations: [
      {
        name: 'analyze',
        description:
          'Run prompt-driven analysis on content using managed prompts and telemetry',
        paramsSchema: {
          content: { type: 'string', required: true, description: 'Text content to analyze' },
          focus_areas: { type: 'string', description: 'Analysis focus areas' },
          output_format: { type: 'string', description: 'Output format preference' },
        },
        resultKind: 'agent.prompt-driven-analyzer.result',
      },
    ],
  },
  execute,
});
```

Run the tests

From the monorepo root:

```sh
$ pnpm test:agents:reference
$ pnpm test:agents:reference:verbose
```

The reference suite runs all 23 agents with createMockExecutionContext(), verifying every ctx.* API call and output shape.
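
As an illustration only (not the actual suite), a single-agent test might look like the sketch below. The vitest harness, the `@human/agent-sdk/testing` import path, and the `.execute` property on the object returned by handler() are assumptions, not confirmed SDK surface:

```typescript
// Hypothetical single-agent test sketch. Assumes: vitest as the test runner,
// createMockExecutionContext() exported from the SDK's test utilities at this
// path, and that handler()'s return value exposes the execute function.
import { describe, it, expect } from 'vitest';
import { createMockExecutionContext } from '@human/agent-sdk/testing'; // assumed path
import analyzer, { AGENT_ID } from './prompt-driven-analyzer';

describe(AGENT_ID, () => {
  it('produces a well-formed output shape', async () => {
    const ctx = createMockExecutionContext();
    const result = await analyzer.execute(ctx, { content: 'Some text to analyze.' });

    // Output shape checks mirror PromptDrivenAnalyzerOutput
    expect(result.success).toBe(true);
    expect(result.prompt_uri).toMatch(/^prompt:\/\//);
    expect(typeof result.token_cost).toBe('number');
    expect(result.provenance_id).toBeTruthy();
  });
});
```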

See Also

SDK Reference