Build AI Agents by Doing

12 hands-on challenges that turn Anthropic's courses into real, working features. No toy examples — you build production patterns from Challenge 1.

12 Challenges · 25 API Endpoints · 17 Agent Tools · $0 Required

Traditional Courses

  • Read about prompt engineering
  • Watch a tool_use demo video
  • Study evaluation theory
  • "Now go apply this yourself"

MindSwarm Learn

  • Build a Prompt Lab with versioning & A/B tests
  • Wire 17 tools across 5 real agents
  • Run code-graded + model-graded evals
  • Every lesson IS the application
# Clone and run in 30 seconds
git clone https://github.com/tikserziku/mindswarm-learn.git
cd mindswarm-learn
pip install -r requirements.txt
python -m uvicorn src.api.main:app --port 8010 --reload

# Open http://localhost:8010/learn
Phase 1

Foundation

01
First Blood — Claude SDK
Build a Claude client with automatic model fallback (haiku → sonnet → opus), presets, and token tracking.
API Fundamentals
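The fallback chain above can be sketched as a small loop. This is an illustrative sketch, not the repo's actual code: `call_model` is an injected stand-in for the real Anthropic client call, and the chain entries are placeholders for real model IDs.

```python
# Placeholder model names; substitute real Anthropic model IDs.
MODEL_CHAIN = ["haiku", "sonnet", "opus"]

def ask_with_fallback(prompt, call_model, chain=MODEL_CHAIN):
    """Try each model in order; return (model, reply) from the first success."""
    last_error = None
    for model in chain:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # e.g. overloaded or rate-limited
            last_error = exc
    raise RuntimeError("all models in the chain failed") from last_error
```

Injecting `call_model` keeps the fallback logic testable without an API key, which matches the "$0 required" spirit of the course.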
02
Agent Thinks — Policy Agent
Replace hardcoded business logic with Claude-powered decisions, falling back to the legacy path when the API is unavailable.
Prompt Engineering
03
Stream It — SSE + Vision
Token-by-token streaming over Server-Sent Events. Plus multimodal image analysis endpoint.
Real-time AI
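The wire format for SSE is simple enough to sketch in a few lines. This generator only shows the framing (each message is a `data:` line followed by a blank line, with a sentinel to end the stream); the challenge itself wires Claude's token stream into a FastAPI `StreamingResponse`.

```python
def sse_frames(tokens):
    """Yield model tokens as Server-Sent Events frames (illustrative sketch)."""
    for tok in tokens:
        yield f"data: {tok}\n\n"   # each SSE message ends with a blank line
    yield "data: [DONE]\n\n"       # sentinel so the client knows to stop
```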
Phase 2

Intelligence

04
Prompt Lab — Dynamic Prompts
Replace static .txt prompts with versioned JSON, few-shot examples, chain-of-thought, and A/B testing.
Prompt Management
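A versioned prompt record with few-shot examples might look like the sketch below. The field names (`template`, `few_shot`, etc.) are hypothetical; the repo's actual JSON layout may differ.

```python
# Hypothetical versioned prompt record (illustrative field names).
RECORD = {
    "name": "policy_decision",
    "version": 2,
    "template": "You are a policy agent.\n{examples}\nQuestion: {question}",
    "few_shot": [
        {"q": "Refund after 40 days?", "a": "Deny: outside the 30-day window."},
    ],
}

def render(record, **variables):
    """Inline the few-shot examples, then fill the template variables."""
    examples = "\n".join(f"Q: {ex['q']}\nA: {ex['a']}" for ex in record["few_shot"])
    return record["template"].format(examples=examples, **variables)
```

Because each record carries a `version`, two versions of the same prompt can be rendered side by side for an A/B test.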
05
Agent Hands — tool_use
17 JSON schemas across 5 agents. Claude decides which tools to call; you execute them.
Tool Use Protocol
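Each of the 17 tools is described to Claude as a JSON schema. Here is one illustrative schema in the shape the Anthropic Messages API expects for tool use; the tool name and fields are made up for this example.

```python
# Illustrative tool definition: name, description, and a JSON Schema
# for the input Claude should produce when calling it.
GET_ORDER_STATUS = {
    "name": "get_order_status",
    "description": "Look up the current status of a customer order.",
    "input_schema": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "description": "Order identifier"},
        },
        "required": ["order_id"],
    },
}
```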
06
Structured Minds — Chatbot
Support chatbot that queries real data through tools. Claude runs the loop: think → call tool → respond.
Agentic Patterns
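The think → call tool → respond loop can be sketched independently of any API client. In this sketch, `send` and `execute_tool` are injected stand-ins for the model call and the tool dispatcher, and the message shapes are simplified placeholders, not the real Messages API format.

```python
def agent_loop(send, execute_tool, user_msg, max_turns=5):
    """Minimal agentic loop: let the model think, run any tool it
    requests, feed the result back, stop when it answers in plain text."""
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_turns):
        reply = send(messages)          # {"type": "tool_call"|"text", ...}
        if reply["type"] == "text":
            return reply["text"]        # model produced a final answer
        result = execute_tool(reply["name"], reply["input"])
        messages.append({"role": "tool", "name": reply["name"], "content": result})
    raise RuntimeError("no final answer within max_turns")
```

The `max_turns` cap is the usual guard against a model that keeps requesting tools forever.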
07
Real World — Domain Prompts
Incident summarizer, onboarding guide generator, and diagnostics agent for production scenarios.
Production Prompts
Phase 3

Quality

08
Trust but Verify — Code Evals
Automated evaluation engine with 5 check types: field_match, range_check, contains, exists, in_set.
Testing AI Output
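The five check types listed above can be dispatched from one small function. The config keys used here (`field`, `expected`, `needle`, and so on) are guesses for illustration, not the repo's real schema.

```python
# Sketch of a code-graded eval dispatcher over the five check types.
def run_check(check, output):
    kind = check["type"]
    value = output.get(check["field"])
    if kind == "exists":
        return check["field"] in output
    if kind == "field_match":
        return value == check["expected"]
    if kind == "range_check":
        return check["min"] <= value <= check["max"]
    if kind == "contains":
        return check["needle"] in str(value)
    if kind == "in_set":
        return value in check["allowed"]
    raise ValueError(f"unknown check type: {kind}")
```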
09
Judge the Judge — Model Evals
Claude evaluates Claude across 5 dimensions: accuracy, completeness, safety, helpfulness, format.
LLM-as-Judge
10
The Gauntlet — CI Pipeline
promptfoo configuration for continuous evaluation in your CI/CD pipeline.
DevOps for AI
Phase 4

Mastery

11
Bots Learn — Prompt Optimizer
Analyze evaluation history, detect trends, and use Claude to suggest concrete prompt improvements.
Self-Improving AI
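Trend detection over eval history can be as simple as comparing a recent window of scores against an earlier one. This is a sketch under assumed conventions (oldest-first score list, a small tolerance band); the challenge's actual trend logic may differ.

```python
from statistics import mean

def score_trend(history, window=3, eps=0.05):
    """Classify a list of average eval scores (oldest first) as
    improving, regressing, or stable within a tolerance of eps."""
    if len(history) < 2 * window:
        return "insufficient data"
    earlier, recent = mean(history[:window]), mean(history[-window:])
    if recent < earlier - eps:
        return "regressing"
    if recent > earlier + eps:
        return "improving"
    return "stable"
```

A "regressing" verdict is what would trigger the Claude-powered suggestion step described above.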
12
Dashboard — Progress Tracker
HTML dashboard showing your progress across all challenges. The capstone that ties everything together.
Full Stack

Start Building Now

Free, open source, and it works without an API key. MIT License.

Clone on GitHub