conjure

Delegation to external LLM services for long-context or bulk tasks.

Overview

Conjure provides a framework for delegating tasks to external LLM services (Gemini, Qwen) when Claude’s context window is insufficient or when specialized models are better suited.

Installation

/plugin install conjure@claude-night-market

Skills

| Skill | Description | When to Use |
|---|---|---|
| delegation-core | Framework for delegation decisions | Assessing if tasks should be offloaded |
| gemini-delegation | Gemini CLI integration | Processing massive context windows |
| qwen-delegation | Qwen MCP integration | Tasks requiring specific privacy needs |

Commands (Makefile)

| Command | Description | Example |
|---|---|---|
| make delegate-auto | Auto-select best service | make delegate-auto PROMPT="Summarize" FILES="src/" |
| make quota-status | Show current quota usage | make quota-status |
| make usage-report | Summarize token usage and costs | make usage-report |

Hooks

| Hook | Type | Description |
|---|---|---|
| bridge.on_tool_start | PreToolUse | Suggests delegation when files exceed thresholds |
| bridge.after_tool_use | PostToolUse | Suggests delegation if output is truncated |

Usage Examples

Auto-Delegation

make delegate-auto PROMPT="Summarize all files" FILES="src/"

# Conjure will:
# 1. Assess file sizes
# 2. Check quota availability
# 3. Select optimal service
# 4. Execute delegation
# 5. Return results
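
The five steps above can be pictured as a small pipeline. The following is a minimal Python sketch, not conjure's actual implementation: the names `Service`, `estimate_tokens`, `select_service`, and `delegate_auto`, the 4-bytes-per-token heuristic, and the context and quota figures are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

BYTES_PER_TOKEN = 4  # assumed rough heuristic, not a conjure constant


@dataclass
class Service:
    name: str
    max_context_tokens: int  # largest input the service accepts (hypothetical)
    quota_remaining: int     # tokens left in the current quota window (hypothetical)


def estimate_tokens(size_bytes: int) -> int:
    """Step 1: turn a total file size into a rough token estimate."""
    return size_bytes // BYTES_PER_TOKEN


def select_service(tokens: int, services: List[Service]) -> Optional[Service]:
    """Steps 2-3: keep services with enough context and quota, then pick
    the one with the smallest context window that still fits."""
    candidates = [
        s for s in services
        if s.max_context_tokens >= tokens and s.quota_remaining >= tokens
    ]
    return min(candidates, key=lambda s: s.max_context_tokens, default=None)


def delegate_auto(prompt: str, size_bytes: int, services: List[Service]) -> str:
    """Steps 4-5: describe the call instead of making it; a real version
    would invoke the chosen service and return its response."""
    tokens = estimate_tokens(size_bytes)
    service = select_service(tokens, services)
    if service is None:
        return "no service has enough context and quota"
    return f"{service.name}: {prompt!r} (~{tokens:,} tokens)"


services = [
    Service("gemini", max_context_tokens=1_000_000, quota_remaining=500_000),
    Service("qwen", max_context_tokens=32_000, quota_remaining=100_000),
]
print(delegate_auto("Summarize all files", 500_000, services))
# prints: gemini: 'Summarize all files' (~125,000 tokens)
```

Preferring the smallest service that fits is one possible tie-break; the real selection logic may weigh privacy or cost differently.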

Check Quota Status

make quota-status

# Output:
# Gemini: 450/1000 tokens used (5h rolling)
# Qwen: 200/500 tokens used (5h rolling)

Usage Report

make usage-report

# Output:
# This week:
#   Gemini: 2,500 tokens, $0.05
#   Qwen: 800 tokens, $0.02
# Total: 3,300 tokens, $0.07

Manual Service Selection

# Force Gemini for large context
Skill(conjure:gemini-delegation)

# Force Qwen for privacy-sensitive tasks
Skill(conjure:qwen-delegation)

Delegation Decision Framework

The delegation-core skill evaluates:

| Factor | Weight | Description |
|---|---|---|
| Context Size | High | Does input exceed Claude’s context? |
| Task Type | Medium | Is task better suited for another model? |
| Privacy Needs | High | Are there data residency requirements? |
| Quota Available | High | Do we have capacity on target service? |
| Cost | Low | Is delegation cost-effective? |
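
One way to read the weights is as a simple weighted checklist. The sketch below is an illustration only: the numeric mapping (High = 3, Medium = 2, Low = 1), the factor keys, and the function `delegation_score` are assumptions, and the skill itself may combine the factors differently (for example, by treating quota as a hard gate).

```python
from typing import Dict

# Assumed numeric values for the qualitative weights in the table.
WEIGHTS = {"High": 3, "Medium": 2, "Low": 1}

# Factor names are hypothetical keys mirroring the table rows.
FACTORS = {
    "context_size": "High",
    "task_type": "Medium",
    "privacy_needs": "High",
    "quota_available": "High",
    "cost": "Low",
}


def delegation_score(answers: Dict[str, bool]) -> float:
    """Return the weighted share of factors that favor delegation (0.0 to 1.0)."""
    total = sum(WEIGHTS[level] for level in FACTORS.values())
    favorable = sum(
        WEIGHTS[FACTORS[name]] for name, favors in answers.items() if favors
    )
    return favorable / total


# Input exceeds Claude's context and quota is available; nothing else applies.
score = delegation_score({
    "context_size": True,
    "task_type": False,
    "privacy_needs": False,
    "quota_available": True,
    "cost": False,
})
print(score)  # prints: 0.5
```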

Service Comparison

| Service | Strengths | Best For |
|---|---|---|
| Gemini | Large context (1M+ tokens) | Bulk file processing, long documents |
| Qwen | Local/private inference | Sensitive data, offline work |

Hook Behavior

Pre-Tool Use Hook

When reading large files:

[Conjure Bridge] File exceeds context threshold
Suggested action: Delegate to Gemini
Estimated tokens: 125,000
Quota available: Yes
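
A message like the one above could be produced by a size check along these lines. This is a hedged sketch rather than the hook's real code: `pre_tool_suggestion`, the 100,000-token threshold, and the 4-bytes-per-token estimate are assumptions, since the page does not state the actual threshold or estimation method.

```python
from typing import Optional

BYTES_PER_TOKEN = 4         # assumed heuristic
THRESHOLD_TOKENS = 100_000  # assumed; the real threshold is not documented here


def pre_tool_suggestion(size_bytes: int, quota_available: bool) -> Optional[str]:
    """Return a delegation suggestion for a large file, or None if it fits."""
    tokens = size_bytes // BYTES_PER_TOKEN
    if tokens <= THRESHOLD_TOKENS:
        return None
    return "\n".join([
        "[Conjure Bridge] File exceeds context threshold",
        "Suggested action: Delegate to Gemini",
        f"Estimated tokens: {tokens:,}",
        f"Quota available: {'Yes' if quota_available else 'No'}",
    ])


print(pre_tool_suggestion(500_000, quota_available=True))
```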

Post-Tool Use Hook

When output is truncated:

[Conjure Bridge] Output truncated at 100,000 chars
Suggested action: Re-run with delegation
Recommended service: Gemini

Configuration

Environment Variables

# Gemini API key
export GEMINI_API_KEY=your-key

# Qwen MCP endpoint
export QWEN_MCP_ENDPOINT=http://localhost:8080
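
Before delegating, it can help to confirm that these variables are set. The snippet below is a small sketch with a hypothetical helper, `missing_settings`; treating both variables as required is an assumption, since a setup that uses only one service may need only one of them.

```python
import os
from typing import List, Mapping

# Variable names come from the section above; requiring both is an assumption.
REQUIRED = ("GEMINI_API_KEY", "QWEN_MCP_ENDPOINT")


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```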

Quota Configuration

Edit conjure/config/quotas.yaml:

gemini:
  hourly_limit: 1000
  daily_limit: 10000

qwen:
  hourly_limit: 500
  daily_limit: 5000
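
To show how an hourly and a daily limit might be enforced together, here is a minimal rolling-window sketch. It is not leyline's or conjure's implementation: the `QuotaTracker` class is hypothetical, the limits are copied into a plain dict to avoid a YAML dependency, and the unit of the limits (tokens or requests) is not specified on this page.

```python
import time
from collections import deque
from typing import Deque, Optional, Tuple

HOUR = 3600
DAY = 24 * HOUR

# Mirrors quotas.yaml as a plain dict (assumed equivalent structure).
QUOTAS = {
    "gemini": {"hourly_limit": 1000, "daily_limit": 10000},
    "qwen": {"hourly_limit": 500, "daily_limit": 5000},
}


class QuotaTracker:
    """Tracks usage events and checks them against rolling windows."""

    def __init__(self, hourly_limit: int, daily_limit: int) -> None:
        self.hourly_limit = hourly_limit
        self.daily_limit = daily_limit
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, amount)

    def _used(self, window: int, now: float) -> int:
        """Sum the usage recorded within the last `window` seconds."""
        return sum(amount for ts, amount in self.events if now - ts < window)

    def allows(self, amount: int, now: Optional[float] = None) -> bool:
        """True if `amount` more usage stays within both limits."""
        now = time.time() if now is None else now
        return (
            self._used(HOUR, now) + amount <= self.hourly_limit
            and self._used(DAY, now) + amount <= self.daily_limit
        )

    def record(self, amount: int, now: Optional[float] = None) -> None:
        """Store a usage event and drop events older than one day."""
        now = time.time() if now is None else now
        self.events.append((now, amount))
        while self.events and now - self.events[0][0] >= DAY:
            self.events.popleft()


tracker = QuotaTracker(**QUOTAS["gemini"])
tracker.record(900, now=0)
print(tracker.allows(200, now=10))    # prints: False (900 + 200 > 1000 this hour)
print(tracker.allows(200, now=3700))  # prints: True (the hourly window has rolled over)
```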

Integration Patterns

With Conservation

# Conservation detects high context usage
# Suggests delegation via conjure
Skill(conservation:context-optimization)
# -> Recommends: Skill(conjure:delegation-core)

With Sanctum

# Large repo analysis
Skill(sanctum:git-workspace-review)
# If repo too large:
# -> Suggests: make delegate-auto FILES="."

Dependencies

Conjure uses leyline for infrastructure:

conjure
    |
    v
leyline (quota-management, service-registry)

Best Practices

  1. Check Quota First: Run make quota-status before large delegations
  2. Use Auto Mode: Let conjure select the optimal service
  3. Monitor Costs: Review make usage-report weekly
  4. Cache Results: Store delegation results locally to avoid repeat calls
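
For the caching practice, one possible approach is to key each result on the service, the prompt, and a hash of the input files. The sketch below assumes a hypothetical `run` callable standing in for the actual delegation call; `cache_key` and `cached_delegate` are illustrative names, not part of conjure.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List


def cache_key(service: str, prompt: str, file_contents: List[bytes]) -> str:
    """Derive a stable key from the service, prompt, and file contents."""
    digest = hashlib.sha256()
    digest.update(service.encode("utf-8"))
    digest.update(b"\0")
    digest.update(prompt.encode("utf-8"))
    for content in file_contents:
        digest.update(b"\0")
        digest.update(hashlib.sha256(content).digest())
    return digest.hexdigest()


def cached_delegate(
    cache_dir: Path,
    service: str,
    prompt: str,
    file_contents: List[bytes],
    run: Callable[[str, str, List[bytes]], str],
) -> str:
    """Return a cached result if present; otherwise call `run` and store it."""
    path = cache_dir / f"{cache_key(service, prompt, file_contents)}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))["result"]
    result = run(service, prompt, file_contents)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"result": result}), encoding="utf-8")
    return result
```

Because the key includes the file contents, editing any input file invalidates the cached entry automatically.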

Related Plugins

  • leyline: Provides quota management and service registry
  • conservation: Detects when delegation is beneficial