π€ Auto-Curator Agent - Self-Maintaining Awesome List
A multi-agent system built with Docker cagent that maintains itself β it discovers new Docker cagent resources, validates links, and submits PRs to keep the awesome-docker-cagent list fresh.
π― What You'll Learn
- Build a 4-agent system with specialized roles (Curator, Discoverer, Validator, Publisher)
- Configure GitHub Models as a free LLM provider for cagent
- Use custom providers via the
providers:YAML section - Integrate MCP tools (GitHub, DuckDuckGo, Fetch)
- Work with cagent skills for domain-specific knowledge
- Handle token limits when using free-tier APIs
ποΈ Architecture
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β CURATOR (Root) β
β β
β β’ Coordinates all maintenance tasks β
β β’ Formats entries using awesome-list table syntax β
β β’ Decides what to add, update, or remove β
βββββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββββββββββ
β
βββββββββββββββΌββββββββββββββ
βΌ βΌ βΌ
βββββββββββββββββ βββββββββββββ βββββββββββββ
β DISCOVERER β β VALIDATOR β β PUBLISHER β
β β β β β β
β β’ Web search β β β’ Check β β β’ Create β
β β’ GitHub API β β URLs β β branches β
β β’ Find repos β β β’ Quality β β β’ Update β
β β’ Find blogs β β scoring β β README β
β β β β’ Freshnessβ β β’ Open PRs β
βββββββββββββββββ βββββββββββββ βββββββββββββ
β β β
βΌ βΌ βΌ
βββββββββββββ βββββββββββββ ββββββββββββ
β DuckDuckGoβ β Fetch β β GitHub β
β MCP β β Tool β β MCP β
βββββββββββββ βββββββββββββ ββββββββββββ
π Quick Start
Prerequisites
- Docker Desktop 4.49+ (includes cagent)
- GitHub Personal Access Token with
repo+models:readpermissions
Step 1: Set Up Your GitHub Token
Create a fine-grained PAT at github.com/settings/personal-access-tokens with:
| Permission | Scope | Why |
|---|---|---|
repo |
Repository | Read/write awesome-list repo, create branches and PRs |
models:read |
Account | Access GitHub Models API for LLM inference |
export GITHUB_TOKEN=your_github_pat
Step 2: Clone and Run
git clone https://github.com/ajeetraina/docker-workshop
cd docker-workshop/docs/lab10/projects/auto-curator-agent/
# Run the lite version (recommended for GitHub Models free tier)
cagent run ./cagent-curator-lite.yaml "Find new cagent blog posts"
β‘ Two Configs: Full vs Lite
cagent-curator.yaml (Full) |
cagent-curator-lite.yaml (Lite) |
|
|---|---|---|
| Skills | β 3 skill files loaded | β Disabled (saves ~2,500 tokens) |
| Instructions | Detailed with examples | Condensed essentials |
| Sub-agent model | gpt-4o (smart) | gpt-4o-mini (fast) |
| Token budget | ~4,500+ tokens overhead | ~1,500 tokens overhead |
| Best for | Paid GitHub Models / other providers | GitHub Models free tier |
!!! warning "GitHub Models Free Tier Token Limit" GitHub Models free tier limits gpt-4o to 8,000 input tokens per request. Skills + detailed instructions + fetched content can easily exceed this. Use the lite version unless you have paid GitHub Models or another provider.
π Usage Examples
Discover New Resources
# Basic discovery
cagent run ./cagent-curator-lite.yaml "Find new Docker cagent blog posts published this week"
# Specific and targeted (recommended)
cagent run ./cagent-curator-lite.yaml "Search for blog posts that specifically mention \
'cagent' the Docker multi-agent runtime tool. Only include posts that discuss \
cagent YAML configs, cagent run commands, or multi-agent orchestration. \
Exclude generic Docker or Docker Model Runner posts."
# With deduplication against existing list
cagent run ./cagent-curator-lite.yaml "Read the current README from \
collabnix/awesome-docker-cagent, then search for new blog posts about \
Docker cagent that are NOT already in the list."
Validate Existing Links
cagent run ./cagent-curator-lite.yaml "Read the awesome-docker-cagent README \
and check all URLs for broken links"
Submit Updates via PR
cagent run ./cagent-curator-lite.yaml "Find new cagent resources, validate them, \
and create a PR adding them to the awesome list"
π§ Configuration Deep Dive
GitHub Models as a Custom Provider
There is no built-in github provider in cagent. We define one using the providers: section:
providers:
github:
api_type: openai_chatcompletions
base_url: https://models.github.ai/inference
token_key: GITHUB_TOKEN
models:
smart:
provider: github
model: openai/gpt-4o # vendor prefix required!
max_tokens: 4096
fast:
provider: github
model: openai/gpt-4o-mini
max_tokens: 2048
!!! danger "Common Pitfall: Model Name Format"
GitHub Models requires the openai/ vendor prefix in model names.
Using just gpt-4o results in 403 Forbidden: No access to model: /gpt-4o.
Always use openai/gpt-4o, openai/gpt-4o-mini, etc.
GitHub Models Free Tier Rate Limits
| Limit | gpt-4o | gpt-4o-mini |
|---|---|---|
| Input tokens/request | 8,000 | 8,000 |
| Output tokens/request | 4,096 | 4,096 |
| Requests/minute | 10 | 15 |
| Requests/day | 50 | 150 |
π‘ Prompting Tips
The lite config trades detailed skill instructions for token headroom. Your prompt quality matters more:
| β Too Vague | β Specific |
|---|---|
"Find new cagent blog posts" |
"Search for posts that specifically mention 'cagent' the Docker multi-agent runtime. Exclude generic Docker or DMR posts." |
"Check links" |
"Read the awesome-docker-cagent README and check all URLs in the MCP Servers section for broken links" |
Key tips:
- Say "cagent" explicitly β helps distinguish from generic Docker content
- Mention what to exclude β "Exclude Docker Model Runner, Claude Code posts"
- Ask it to read the README first β enables deduplication
- One task per prompt β saves tokens within the 8K limit
π Troubleshooting
| Error | Cause | Fix |
|---|---|---|
403 no_access to model: /gpt-4o |
Missing vendor prefix | Use openai/gpt-4o not gpt-4o |
403 no_access to model: openai/gpt-4o |
Token missing models permission | Add models:read to your PAT |
413 Request Entity Too Large |
Input exceeds 8K tokens | Switch to cagent-curator-lite.yaml |
429 Too Many Requests |
Rate limit hit | Wait and retry (10 req/min for gpt-4o) |
| Results include generic Docker posts | Prompt too vague | Be specific β see Prompting Tips above |
π¦ Skills (Full Version Only)
The full config loads 3 skills from .claude/skills/:
| Skill | Purpose |
|---|---|
awesome-list-format |
Table syntax, section structure, categorization rules |
link-validation |
URL checking, GitHub activity monitoring, content freshness |
resource-discovery |
Search strategies, quality scoring, deduplication |
π Automation with GitHub Actions
# .github/workflows/curate.yml
name: Weekly Curation
on:
schedule:
- cron: '0 9 * * 1' # Every Monday at 9am
workflow_dispatch:
jobs:
curate:
runs-on: ubuntu-latest
steps:
- uses: docker/cagent-action@v1
with:
config: cagent-curator-lite.yaml
prompt: "Search for blog posts that specifically mention cagent the Docker multi-agent runtime. Read the current README first to avoid duplicates. Create a PR with any new findings."
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
π¦ Push to Docker Hub
docker login
cagent push ./cagent-curator-lite.yaml docker.io/YOUR_USERNAME/awesome-cagent-curator:latest
π Workshop Flow
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β AUTO-CURATOR AGENT WORKFLOW β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β 1. CONFIGURE 2. DISCOVER 3. PUBLISH β
β βββββββββββ βββββββββββ ββββββββββ β
β β
β ββββββββββββ ββββββββββββ ββββββββββββββββ β
β β GitHub β β Search β β Create PR β β
β β Models βββββββββΆβ + Check βββββββββΆβ to Update β β
β β + PAT β β + Score β β Awesome Listβ β
β ββββββββββββ ββββββββββββ ββββββββββββββββ β
β β β β β
β βΌ βΌ βΌ β
β Custom provider DuckDuckGo MCP GitHub MCP β
β via providers: + Fetch tool creates branch β
β section validates URLs + opens PR β
β β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
π Key Concepts Covered
| Concept | Description |
|---|---|
| Custom Providers | Defining GitHub Models as an OpenAI-compatible provider via providers: |
| Token Budgeting | Managing free-tier API limits with lite vs full configs |
| Multi-Agent Coordination | 4 agents with specialized roles collaborating on a task |
| MCP Tools | GitHub, DuckDuckGo, and Fetch tools via Model Context Protocol |
| cagent Skills | Domain-specific knowledge loaded from .claude/skills/ |
| Vendor Prefixes | GitHub Models requires openai/gpt-4o format |
| Self-Maintaining Systems | Agents that keep a community resource list up to date |