# Backlog — strategic work to grow the Intelligence product

Last updated: 2026-05-25

This is the prioritized punch list for what to do with everything that's
been built under `/intelligence/*`. It exists so that six months from now
you can re-read it and remember *why* each item matters, not just *what*
it is. Update statuses as you ship.

Companion files:
- [`operations.md`](./operations.md) — runbooks (cron URLs, leads
  provisioning SQL, embedding backfill, etc.)
- [`AGENTS.md`](../AGENTS.md) — coding conventions for future work

---

## The strategic frame

You've built a lot of plumbing: 5 datasets with refresh crons, REST API,
MCP server with prompts, semantic search, change log, relationship
graph, stable UUIDs, the leads product, the dashboard. **The bottleneck
is no longer infrastructure — it's the depth and quality of what's
actually in the datasets.** Authority and leads-product retention both
depend on the data being genuinely useful. Most of the items below
either deepen the data, distribute it, or convert it to revenue.

Three revenue/authority levers, in priority order:

1. **Data quality** — thin seed lists produce thin AI output, regardless
   of how clever the pipeline is.
2. **Distribution via MCP** — every assistant connected to the MCP
   server is branded distribution at zero CAC.
3. **Leads product retention** — the dashboard auto-suggest panel is
   the differentiator; it only pays off if subscribers actually close
   deals from it.

---

## NOW — do this week

Operational hygiene + the things whose value compounds the longer you
wait to do them.

### Deploy + verify the recent work
- [ ] SFTP-upload the phase 2-4 changes if not yet deployed
- [ ] Hit `https://tylerewillis.com/setup/db_init.php` — creates the
  new tables (change_log, playbook_tools, case_study_tools) and
  backfills UUIDs. Idempotent; safe to re-run.
- [ ] Confirm **all 8 cron URLs** are scheduled in cPanel:
  - `/api/refresh-tools`, `/api/refresh-benchmarks`,
    `/api/refresh-playbooks`, `/api/refresh-case-studies`,
    `/api/refresh-market`, `/api/refresh-leads`,
    `/api/refresh-embeddings`, `/api/sync-relationships`
  - One quietly disabled = silent decay. Easy to check, easy to miss.
- [ ] Run the one-time backfills (each emails you a summary when done):
  - `/api/refresh-embeddings?token=daBInhsXt4zRs1im6SUaLZ&max=500`
  - `/api/sync-relationships?token=daBInhsXt4zRs1im6SUaLZ`
- [ ] Hit `/api/test-email?token=daBInhsXt4zRs1im6SUaLZ` to confirm
  SMTP works after the Mail.php multiline fix. Should print
  `Verdict: SUCCESS`, and the email should arrive in under 1 min.
- [ ] Subscribe to your own RSS feed (`/intelligence/changes.rss`) in
  any reader. You'll notice immediately when crons stop producing
  meaningful events.
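
The cron check above is easy to script. A minimal Python sketch, meant as an external helper rather than part of the PHP codebase: it builds the eight token-gated cron URLs and flags any path that is missing (or commented out) in a crontab dump pasted from cPanel. `build_cron_urls` and `missing_from_schedule` are hypothetical names, and the token is passed in rather than hard-coded.

```python
from urllib.parse import urlencode

BASE_URL = "https://tylerewillis.com"

# The eight cron endpoints listed in the checklist above.
CRON_PATHS = [
    "/api/refresh-tools",
    "/api/refresh-benchmarks",
    "/api/refresh-playbooks",
    "/api/refresh-case-studies",
    "/api/refresh-market",
    "/api/refresh-leads",
    "/api/refresh-embeddings",
    "/api/sync-relationships",
]


def build_cron_urls(token: str, base_url: str = BASE_URL) -> list[str]:
    """Return the full, token-gated URL for every cron endpoint."""
    query = urlencode({"token": token})
    return [f"{base_url}{path}?{query}" for path in CRON_PATHS]


def missing_from_schedule(crontab_text: str) -> list[str]:
    """Return cron paths that do not appear on any active crontab line.

    Lines starting with '#' are treated as disabled, since a quietly
    disabled job is exactly the silent-decay case.
    """
    active_lines = [
        line for line in crontab_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return [
        path for path in CRON_PATHS
        if not any(path in line for line in active_lines)
    ]
```

An empty list from `missing_from_schedule` means all eight jobs are scheduled. It says nothing about whether they succeed, which is what the email summaries and the RSS feed are for.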

### Distribution — the work that ossifies the longer you wait
- [ ] **Submit the MCP server to the four directories**:
  - `github.com/modelcontextprotocol/servers` (PR adding a row — Anthropic's
    official community list, highest authority, gets crawled by everyone else)
  - `mcp.so` (form on site)
  - `smithery.ai` (usually needs a thin GitHub wrapper repo)
  - `glama.ai/mcp` (free listing)
  - Server details to paste are in [`operations.md` → MCP directory submissions](./operations.md#mcp-directory-submissions)
  - **Why now**: MCP directories will ossify their featured tier inside
    12 months. The "AI Automation Intelligence" slot is unclaimed today.
    First listings get featured / become defaults.

---

## NEXT — next 2-4 weeks

The flywheel. Each item compounds with the others.

### Data depth (the bottleneck)
- [ ] **Expand the tool seed lists** to 200+ entries. Specifically:
  - Every YC W26/S26 AI-automation company
  - Every Series A+ AI agent / automation startup from last 12 months
  - Every MCP server vendor (you should be in this list yourself)
  - Every workflow tool with over 10k GitHub stars
  - Seed file: `config/tool-seeds.php`
- [ ] **Hand-write 5-10 playbooks** from your actual consulting work.
  AI-generated ones cover breadth; hand-written ones are what people
  cite and pay for.
- [ ] **Add 3-5 real anonymized case studies** from your engagements.
  Most defensible content on the site, hardest to fake. Manual INSERT
  into `roi_case_studies`. The cron only refreshes metadata; it will
  never fabricate before/after numbers.
- [ ] **Expand the market landscape seed list** to be comprehensive on
  newsletters, communities, podcasts. Comprehensiveness is the moat
  here. Seed file: `config/landscape-seeds.php`

### Leads product — get to 1-3 paying subscribers
- [ ] **Sign up 1-3 leads subscribers**, even at a discount or
  free, to get real-world feedback. Full end-to-end workflow + the
  provisioning SQL in [`operations.md`](./operations.md#leads-product--end-to-end-workflow).
- [ ] **Watch them work the dashboard.** Does the "Playbooks that fit
  this lead" suggest panel actually help them close? If yes, that becomes
  your marketing angle — "the only leads product with playbook
  auto-match." If no, fix it before scaling.
- [x] ~~**Add paid contact enrichment to the leads cron.**~~ Shipped:
  Hunter.io domain-search runs after every OpenAI extraction, fills
  `contact_email` with smart contact-picking (name match → ops/leadership
  title → highest confidence). See `src/Hunter.php` + operations.md
  "Hunter.io contact enrichment".
- [ ] **Build `cron/assign-leads.php`** — wrap the weekly-batch
  assignment SQL from [`operations.md`](./operations.md) into a small cron that runs every
  Monday morning. Replaces the one manual SQL you'd otherwise run by
  hand each week. Worth shipping at ~5 subscribers, when missing a
  Monday push starts to matter.
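
For reference, the contact-picking order described in the Hunter.io item (name match → ops/leadership title → highest confidence) can be sketched as below. This is an illustrative Python sketch, not the code in `src/Hunter.php`: the dict keys (`first_name`, `last_name`, `position`, `confidence`, `value`) and the keyword list are assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical keyword list; the real title matching may differ.
LEADERSHIP_WORDS = {
    "operations", "ops", "coo", "ceo", "founder",
    "owner", "director", "head", "president", "vp",
}


def _confidence(contact: dict) -> int:
    return contact.get("confidence") or 0


def pick_contact(contacts: list[dict],
                 lead_contact_name: Optional[str] = None) -> Optional[dict]:
    """Pick one contact: name match, then ops/leadership title, then confidence."""
    if not contacts:
        return None

    # 1. Exact (case-insensitive) full-name match against the name on the lead.
    if lead_contact_name:
        wanted = lead_contact_name.strip().lower()
        named = [
            c for c in contacts
            if f"{c.get('first_name') or ''} {c.get('last_name') or ''}"
            .strip().lower() == wanted
        ]
        if named:
            return max(named, key=_confidence)

    # 2. Any contact whose title contains an ops/leadership word.
    titled = [
        c for c in contacts
        if LEADERSHIP_WORDS
        & set(re.findall(r"[a-z]+", (c.get("position") or "").lower()))
    ]
    if titled:
        return max(titled, key=_confidence)

    # 3. Fall back to the highest-confidence address of any kind.
    return max(contacts, key=_confidence)
```

Matching on whole words rather than substrings keeps a title like "DevOps Engineer" from counting as an "ops" role.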

### Content flywheel — distribution from data we already have
- [ ] **Auto-publish a weekly digest blog post** using the
  `weekly_digest` MCP prompt. Cron calls it, pipes through OpenAI,
  publishes. Zero ongoing effort, one fresh SEO+social artifact per
  week. The prompt is already built — `mcpPromptWeeklyDigest()` in
  `src/Mcp.php`.
- [ ] **Auto-tweet from the changes RSS feed.** Connect
  `/intelligence/changes.rss` to Buffer / Typefully / IFTTT. Filter
  to `high` or `critical` significance. Every entry becomes a
  tweetable observation ("Inflection AI acquired by Microsoft", "n8n
  bumped Pro pricing 22%"). Zero authoring required.
- [ ] **Build one public aggregation page**: something like
  `/intelligence/state-of-ai-automation` — quarterly state-of-the-space
  page generated from the data. The kind of page that gets cited and
  ranked. Generate via the `weekly_digest` pattern with a longer window.
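
If Buffer / Typefully / IFTTT cannot filter on significance directly, a small script can do it before handing items over. A minimal Python sketch, assuming each `<item>` in the feed carries its significance as an RSS `<category>` element; check the actual feed first, since that encoding is an assumption here and `tweetable_items` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Only these significance levels are worth posting.
TWEETABLE = {"high", "critical"}


def tweetable_items(rss_xml: str) -> list[dict]:
    """Return title/link pairs for items tagged high or critical."""
    root = ET.fromstring(rss_xml)
    picked = []
    for item in root.iter("item"):
        # Assumption: significance is exposed as one or more <category> tags.
        categories = {
            (c.text or "").strip().lower() for c in item.findall("category")
        }
        if categories & TWEETABLE:
            picked.append({
                "title": (item.findtext("title") or "").strip(),
                "link": (item.findtext("link") or "").strip(),
            })
    return picked
```

The function only parses and filters; fetching the feed and posting are left to whichever scheduler or tool you pick.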

---

## LATER — next quarter and beyond

Bigger bets. Each one compounds the previous work substantially.

- [ ] **Open-source a "starter" MCP server template** based on this
  architecture (different domain, same shape — anyone can fork it to
  build an intelligence layer for their own niche). Costs nothing,
  earns authority as "the person who showed people how to do this."
- [ ] **One podcast or conference talk** as "the person who maintains
  the AI automation intelligence index." Turns infrastructure work
  into a personal brand asset.
- [ ] **AI-driven per-field provenance** — phase 4 deferred this. The
  current `_default` source URL is fine for now; per-field attribution
  (extending OpenAI prompts in tools/market crons) is the real
  authority play. Defer until somebody actually needs it.
- [ ] **Stripe integration for leads** — webhook → auto-provision the
  subscription + assignments without the manual SQL paste. Worth doing
  at ~10 subscribers when payment-collection + provisioning friction
  starts to dominate your week. Phase order: `cron/assign-leads.php`
  first, then this.
- [ ] **Real API key enforcement + rate limiting middleware** — Cloudflare
  + honor system works until someone abuses it. When you have one paid
  API customer, build it. Schema scaffolding for an `api_keys` table is
  trivial; the harder part is the middleware in `index.php`.
- [ ] **Admin UI for graph curation + manual change_log entries** —
  phpMyAdmin works for now. Build when you're personally editing more
  than once a week.

---

## DON'T BUILD YET (push-back)

Things that look like they'd help but currently aren't worth the
complexity for your stage.

| Don't | Why not (yet) |
|---|---|
| Stripe for leads | Manual provisioning is fine under ~5 subs; webhook subtleties = silent bugs |
| Real API key middleware | Cloudflare in front handles abuse; you have zero paid API customers today |
| Admin UI for the graph | phpMyAdmin is enough; building UI now is yak-shaving |
| Per-field AI provenance attribution | Deterministic `_default` is sufficient for most citation use cases |
| `case_study_playbooks` join table | No clean signal to derive from; needs AI extraction OR manual curation. Wait for a concrete need. |
| Migration off MySQL to Postgres + pgvector | Brute-force cosine in PHP works fine under ~50k rows per dataset |
| Newsletter / email send infrastructure | Just use Substack / Beehiiv / Buttondown for the weekly digest — no reason to build mail-list ops |

---

## HEALTH-CHECK SIGNALS (when to revisit)

These are the triggers that say "go build the next thing":

- **Leads cron generates 20+/week consistently** but subscribers say
  contact data is too thin → revisit the Hunter.io enrichment step
  (already shipped; see `src/Hunter.php`)
- **You sign up 5+ paid leads subscribers** → build `cron/assign-leads.php`;
  at ~10, build the Stripe integration
- **One MCP user asks for a specific feature** → build it (live customer
  signal beats 10 strategic plans)
- **The data catalog crosses 500 entries in any one dataset** → revisit
  semantic search performance; may need to pre-filter before cosine
  similarity
- **Someone asks for bulk export of a paid dataset** → build the real
  API key system
- **You stop getting cron email summaries for >24h** → SMTP regression
  or cron disabled; hit `/api/test-email` first
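
To make the "pre-filter before cosine similarity" trigger concrete, here is a minimal Python sketch of brute-force cosine search with an optional dataset filter applied first. It illustrates the idea only; it is not the code in `src/Embeddings.php`, and the row shape (`slug`, `dataset`, `embedding`) is an assumption.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float], rows: list[dict],
           dataset: Optional[str] = None, top_k: int = 5) -> list[tuple]:
    """Score candidate rows against the query and return the top_k.

    The pre-filter is the cheap part: narrowing by dataset (or any other
    indexed column) shrinks the list before the per-row cosine work.
    """
    candidates = [
        r for r in rows if dataset is None or r["dataset"] == dataset
    ]
    scored = [(cosine(query_vec, r["embedding"]), r["slug"]) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Cost grows linearly with the number of candidates, so any filter that can run in SQL before the vectors are loaded pays off directly.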

---

## REFERENCE — what already exists (snapshot)

So you don't waste time building something that's already there.

**Datasets** (all in DB, all populated by crons):
- `tools_vendor_intelligence` — AI / automation tools, deep profiles
- `benchmarking_data` — adoption & performance segments
- `playbooks_workflows` — automation recipes with ROI data
- `roi_case_studies` — before/after engagement data
- `market_landscape` — broader space (tools, agencies, newsletters, etc.)
- `leads` — weekly prospect leads (paid subscriber product)
- `change_log` — significant events archive across all datasets

**Cron URLs** (all on `tylerewillis.com`, all gated by
`token=daBInhsXt4zRs1im6SUaLZ`):
- `/api/refresh-tools` · `/api/refresh-benchmarks` ·
  `/api/refresh-playbooks` · `/api/refresh-case-studies` ·
  `/api/refresh-market` · `/api/refresh-leads` ·
  `/api/refresh-embeddings` · `/api/sync-relationships` ·
  `/api/test-email`

**REST API** (free, no key):
- `/intelligence/api/{dataset}.json` · paginated list
- `/intelligence/api/{dataset}/{slug}.json` · single entity
- `/intelligence/api/by-uuid/{uuid}.json` · UUID resolution across all
  datasets
- `/intelligence/api/search?q=...` · semantic search
- `/intelligence/api/changes.json` · change feed (JSON)
- `/intelligence/changes.rss` · change feed (RSS 2.0)
- `/intelligence/api/openapi.json` · OpenAPI 3.1 spec

**MCP server** at `/intelligence/api/mcp` (free, no key, JSON-RPC 2.0):
- Resources: one per dataset + `changes`
- Tools: `search_intelligence`, `get_intelligence`, `recent_changes`
- Prompts: `compare_tools`, `recommend_for_stack`, `weekly_digest`,
  `find_automation_for`
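
As a quick reminder of what a client sends, here is a minimal Python sketch that builds a JSON-RPC 2.0 `tools/call` request for one of the tools above. The `tools/call` method and its `name` / `arguments` params come from the MCP specification; the `query` argument name is hypothetical, so check the tool's input schema in `src/Mcp.php` before relying on it.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def tools_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Example: "query" is an assumed argument name, not confirmed by this doc.
payload = tools_call("search_intelligence", {"query": "invoice automation"})
```

A real client would normally complete the MCP `initialize` handshake before sending this; the sketch covers only the message shape.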

**Leads product** ($497/mo, single tier, all niches):
- Public sales page: `/intelligence/leads` — ghosted samples table
  (un-blurs for active subscribers via `.leads-section.is-subscribed`),
  pricing block, inquiry form
- Inquiry flow: form → create user (no password) → email verify (7-day
  token, `lead_signup` type) → `/verify-email` confirms + creates
  `lead_subscriptions` with `status='pending_payment'` + emails Tyler
  the "ready for Stripe" notification with pre-filled activation SQL
- Tyler sends Stripe payment link manually → activates subscription via
  one SQL paste → subscriber's dashboard goes live
- Gated dashboard: `/intelligence/leads/dashboard` — per-lead outreach
  copy with copy buttons, status tracking, ROI counter, **playbook
  auto-suggest panel** based on tech-stack overlap
- Contact emails enriched via Hunter.io during the leads cron
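
The auto-suggest panel ranks playbooks by tech-stack overlap with the lead. The idea can be sketched as below; this is an illustrative Python sketch, not the dashboard's actual code. It assumes each playbook exposes a list of tool names (for example via the `playbook_tools` table), and `suggest_playbooks` is a hypothetical name.

```python
def suggest_playbooks(lead_stack: list[str], playbooks: list[dict],
                      limit: int = 3) -> list[dict]:
    """Rank playbooks by how many tools they share with the lead's stack."""
    lead_tools = {t.strip().lower() for t in lead_stack}
    scored = []
    for playbook in playbooks:
        tools = {t.strip().lower() for t in playbook["tools"]}
        shared = lead_tools & tools
        if shared:  # skip playbooks with no overlap at all
            scored.append((len(shared), playbook["slug"], sorted(shared)))
    # Most shared tools first; slug breaks ties so output is deterministic.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [
        {"slug": slug, "shared_tools": shared}
        for _, slug, shared in scored[:limit]
    ]
```

Returning the shared tools alongside each slug lets the panel show why a playbook was suggested, which is the part a subscriber can use in outreach.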

**Key files for changes**:
- Schema: `setup/db_init.php` (append-only, hit the URL to migrate)
- Routes: `index.php` (single file, scan with `grep '\$r->'`)
- Shared helpers: `src/helpers/intelligence.php` (datasets),
  `src/helpers/changes.php` (change log writers),
  `src/helpers/relationships.php` (graph)
- MCP + OpenAPI: `src/Mcp.php`
- Embeddings: `src/Embeddings.php`
