# Operations

Runbooks for operating tylerewillis.com day-to-day. [`AGENTS.md`](../AGENTS.md)
covers *how to code on* the site; this file covers *how to run* it.
Companion: [`backlog.md`](./backlog.md) for the prioritized punch list of
what to build next.

---

## Leads product — end-to-end workflow

The leads product is a **single subscription, one price ($497/mo), all
niches included**. No tier matrix. Flow goes: form → verify email →
notify you → collect payment → flip subscription to active → email
them access.

### 1. Visitor lands on `/intelligence/leads`

When the logged-in viewer has an active `lead_subscriptions` row,
`.leads-section` gets `.is-subscribed` and every `.lead-blur` element
un-blurs in place — the same visitor sees the same page their dashboard
shows, no separate template.

### 2. They submit the inquiry form

POSTs to `/intelligence/leads` (the page URL itself; back-compat alias
at `/intelligence/leads/inquire`). Required fields: first name, last
name, email. Optional: company.

The handler validates inputs, finds-or-creates a `users` row, then
calls `Auth::sendLeadVerify($uid, $email, $firstName)` which inserts a
7-day token of `type='lead_signup'` into the `tokens` table and emails
the visitor a verification link. The visitor is redirected back to
`/intelligence/leads?requested=1#request-access` which shows a "Check
your inbox" success panel. JS intercepts the form and uses fetch() so
the URL bar never navigates — the redirect is there as a no-JS fallback.

If the visitor entered a company name, it's stashed in the session
keyed by user id (`leads_inquire_company_<uid>`) so it's available
when they click the verify link later.

### 3. Visitor clicks the verify link → `/verify-email?token=…`

The route (`GET /verify-email`) looks up the token where
`type='lead_signup'` and `expires_at > NOW()`. On hit:

1. Marks `users.email_verified_at = NOW()` for the token's `user_id`.
2. Deletes the token (single-use).
3. Inserts a `lead_subscriptions` row for that user with
   `status='pending_payment'`, `niche='all'`, `exclusivity_tier='shared'`.
4. Emails you a notification with the user's name, email, optional
   company (pulled from the session stash), and a pre-filled SQL
   snippet to flip the subscription to `active` once payment clears.

The visitor sees a "You're in the queue" success view with a CTA to
`/forgot` so they can set their password while you set up Stripe.

`pending_payment` is the **dormant** state — the dashboard treats it
the same as `paused` (no leads delivered) until you flip it to `active`.

### 4. You receive the verify-success notification

Subject: `Lead signup verified: <name>`. Body includes the activation
SQL with the right `user_id` already filled in. You don't have to
look it up.

### 5. You collect payment

Reply to confirm. Send a Stripe invoice or payment link for $497/mo.
Wait for it to clear.

### 6. You flip the subscription to active + push this week's leads

In phpMyAdmin, paste the SQL from the notification email (which already
has the `<USER_ID>` and `<SUBSCRIPTION_ID>` filled in). It does:

```sql
-- Step 1: activate the pending subscription
UPDATE lead_subscriptions
SET status = 'active', started_at = NOW()
WHERE id = <SUBSCRIPTION_ID>;

-- Step 2: assign this week's leads to them
INSERT INTO lead_assignments (lead_id, subscription_id, week_start)
SELECT l.id, <SUBSCRIPTION_ID>, l.week_start
FROM leads l
WHERE l.week_start = DATE(SUBDATE(CURDATE(), WEEKDAY(CURDATE())))
  AND l.is_active = 1;
```

`niche='all'` and `exclusivity_tier='shared'` are placeholders kept for
back-compat with the schema's NOT NULL constraints; they don't drive
behavior in the single-tier model.

### 7. Email them access

Send a friendly "you're live" email with a link to `/forgot` so they
can set their password (if they didn't already do it from the verify
page). After setting it, they log in at `/login` and their dashboard
is at `/intelligence/leads/dashboard`.

The dashboard renders each lead as a card with: copy-to-clipboard
email + LinkedIn buttons, status tracking (`new` / `contacted` /
`no_reply` / `interested` / `not_now` / `closed`), a "I closed this one"
button that takes a dollar amount, the ROI counter header (`X closed
all-time`, `$Y revenue from this product`), the filter/search bar, the
week archive nav, and the auto-suggested playbooks panel per lead
(matched to detected tech stack — see the
[relationship graph](#relationship-graph--backfill--how-it-powers-leads-retention) section).

### 8. Every Monday going forward — push the new batch

The `refresh-leads` cron generates new leads with the week's Monday as
`week_start`. Hunter enrichment fills `contact_email` automatically
during that cron run. To deliver to every active subscriber in one shot,
run this query Monday morning (after the cron has finished):

```sql
INSERT IGNORE INTO lead_assignments (lead_id, subscription_id, week_start)
SELECT l.id, s.id, l.week_start
FROM leads l
CROSS JOIN lead_subscriptions s
WHERE l.week_start = DATE(SUBDATE(CURDATE(), WEEKDAY(CURDATE())))
  AND l.is_active = 1
  AND s.status = 'active';
```

The `INSERT IGNORE` is the safety net — running this twice in the same
week is a no-op thanks to the `(lead_id, subscription_id)` unique key.

When you hit ~5 subscribers, wrap this in `cron/assign-leads.php` and
schedule it. Until then, manual is fine.

### 9. Subscriber works the pipeline through the week

Copies the pre-written outreach with one click. Marks each lead's
status as they go. When they close one, the ROI counter on their
dashboard increments. That number is the retention engine — someone
who has closed two clients from your list in the past three months does
not cancel.

### 10. Cancellation (or pause)

```sql
-- Pause (keeps row, hides from dashboard, can resume)
UPDATE lead_subscriptions SET status = 'paused' WHERE id = <ID>;

-- Cancel (irreversible — set canceled_at so the timestamp is recorded)
UPDATE lead_subscriptions SET status = 'canceled', canceled_at = NOW() WHERE id = <ID>;
```

Their dashboard immediately shows "no active subscription" once status
leaves `active`. Past lead assignments aren't deleted — the dashboard
just gates on subscription status — so reactivating (set status back to
`active`) is non-destructive. The FAQ on the sales page promises "30
days of archive access after cancellation"; that's a manual policy
right now (the data persists indefinitely, you just choose how long to
honor login access).

### What's automatic vs. manual

| Step | Automated? |
|---|---|
| Generate new leads weekly | ✅ `refresh-leads` cron, Mon-Fri |
| Find contact emails | ✅ Hunter, during cron |
| Send verify-email on inquiry | ✅ `Auth::sendLeadVerify` |
| Create user account (unverified) | ✅ inquiry POST does the INSERT |
| Create `lead_subscriptions` row (`pending_payment`) | ✅ `/verify-email` handler |
| Notify you a signup verified | ✅ `/verify-email` emails you with activation SQL |
| Collect payment | ❌ manual Stripe invoice or link |
| Flip subscription to `active` + assign this week | ❌ you, paste the SQL from the notification |
| Send "you're live" email + `/forgot` link | ❌ you |
| Subscriber works the pipeline | ✅ dashboard (copy buttons, status, ROI) |
| Auto-suggest playbooks per lead | ✅ phase-3 relationship graph |
| Assign weekly batch to all active subscribers | ❌ you, one SQL each Monday |
| Cancel | ❌ you, one UPDATE statement |

**Order to automate the remaining manual steps**, when volume justifies:

1. **First**: weekly batch assignment (`cron/assign-leads.php`). Two
   benefits — never forget to push Monday's batch, never run it twice.
   Worth doing at 5+ subscribers.
2. **Second**: Stripe wiring + auto-provisioning after webhook. Worth
   doing at ~10 subscribers when payment-collection + provisioning
   friction starts to dominate your week.

### Subscriber cap (manual policy, not enforced in DB)

The sales page says total subscriber count is "capped intentionally."
You decide the cap by not provisioning past it. When you hit your
limit, reply to new inquiries with "waitlist — I'll reach out when a
slot opens" instead of running the provisioning SQL. There's no
enforced cap in the database; it's a manual policy you control.

---

## Cron URLs (production)

All require the token. Format: `?token=daBInhsXt4zRs1im6SUaLZ`.

| Cron | URL | Recommended cadence |
|------|-----|--------------------|
| Tools refresh        | `/api/refresh-tools`        | Daily 3am EST |
| Benchmarks refresh   | `/api/refresh-benchmarks`   | Daily 3am EST (stagger) |
| Playbooks refresh    | `/api/refresh-playbooks`    | Daily 3am EST (stagger) |
| Case studies refresh | `/api/refresh-case-studies` | Daily 3am EST (stagger) |
| Market refresh       | `/api/refresh-market`       | Daily 4am EST |
| Leads generation     | `/api/refresh-leads`        | Mon–Fri 4am EST |
| Embeddings refresh   | `/api/refresh-embeddings`   | Daily 5am EST (after data refreshes) |
| Relationships sync   | `/api/sync-relationships`   | Daily 6am EST (after refreshes + embeddings) |
| Newsletters refresh  | `/api/refresh-newsletters`  | (existing schedule) |

Each cron supports flags via query params — see the file header in
`cron/refresh-*.php`. Common ones:
- `&mode=discover|refresh|sync|all` — phase to run
- `&slug=foo` — refresh a single entity
- `&niche=agencies` — leads cron only (the only niche right now; `all` aliases to it)
- `&max=N` — leads cron only

Logs land in `storage/logs/intelligence.log`. Tail with `?run=<run-id>`
to find a specific invocation.

---

## MCP directory submissions

The MCP ecosystem is new enough that first-mover placement compounds. Once
the "AI Automation Intelligence MCP server" slot is taken, displacing it
is hard. The four registries below are the ones that matter today; check
back on this list periodically as new ones launch.

Server details to provide:
- **Name:** Tyler Willis — AI Automation Intelligence
- **Endpoint:** `https://tylerewillis.com/intelligence/api/mcp`
- **Transport:** HTTP, JSON-RPC 2.0
- **Protocol version:** 2024-11-05
- **Resources:** 5 (tools, benchmarks, playbooks, case studies, market landscape)
- **Tools:** `search_intelligence`, `get_intelligence`
- **Auth:** none (free preview)
- **Description:** "Programmatic access to a continuously-updated catalog of AI automation tools, benchmarks, playbooks, ROI case studies, and the market landscape. Read-only. Queryable by any MCP client."
- **Docs URL:** `https://tylerewillis.com/intelligence/api`

### Registries to submit to

| Registry | What it is | Submission |
|----------|------------|------------|
| `mcp.so`                           | Community-run searchable directory | Form on the site |
| `smithery.ai`                      | Registry + one-command installer for MCP clients | Usually requires a thin GitHub wrapper repo pointing at the HTTP endpoint |
| `glama.ai/mcp`                     | Directory + managed MCP host | Free listing submission |
| `github.com/modelcontextprotocol/servers` | Anthropic's official community list | Open a PR adding a row to the community section of the README |

**Order to submit:** Anthropic repo first (highest authority, gets crawled
by everyone else), then mcp.so, then smithery, then glama. After the first
one indexes you, the rest tend to pick you up automatically too.

**When the submission flow has changed:** these registries evolve fast.
Just visit each, scan the homepage for "submit," and follow the current
flow rather than trying to remember an old one.

---

## Change log — what gets recorded

The `change_log` table is the canonical record of *significant* events
across every dataset. The refresh crons write to it via
`src/helpers/changes.php`. Insignificant diffs (ai_summary tweaks,
score drift) are intentionally NOT logged — that's what keeps the
archive valuable as it accumulates.

What gets recorded today:
- **Tools cron** — price changes (`price_increase` / `price_decrease`),
  trajectory shifts, and new entities discovered by Tavily
- **Market cron** — status changes (`acquired` / `shutdown` / `pivoting` /
  `status_change`), trajectory shifts, new entities discovered

Significance levels: `low | medium | high | critical`. Price moves under
5% are low; over 50% are critical. Acquisitions / shutdowns are always
critical. Status changes are always at least high.

### Where it's consumed
- `GET /intelligence/api/changes.json` — JSON feed (supports `since_days`,
  `limit`, `min_significance` query params)
- `GET /intelligence/changes.rss` — RSS 2.0 feed at medium+ significance
- MCP `recent_changes` tool — assistants can query "what changed"
- MCP `tylerewillis://intelligence/changes` resource — same data

### Adding a new significant event from a future cron
1. `require_once __DIR__ . '/../src/helpers/changes.php';` at top of cron
2. After the upsert succeeds, call `record_change([...])` or one of the
   typed helpers (`record_price_change`, `record_status_change`,
   `record_trajectory_change`, `record_new_entity`). All return bool and
   never throw — log writes don't break refreshes.

---

## Per-claim provenance (phase 2 scaffolding)

Each `tools_vendor_intelligence` and `market_landscape` row now has a
`field_sources` JSON column. Phase 2 populates it with a deterministic
default after every refresh — the first URL from `last_refresh_sources`,
stamped with `asserted_at`. Phase 3 will extend the OpenAI extraction
prompts to attribute individual fields; the JSON layout supports both:

```json
{
  "_default": "https://n8n.io/",
  "asserted_at": "2026-05-25",
  "starter_price_monthly": "https://n8n.io/pricing"
}
```

Readers check field-specific keys first, fall back to `_default`. The
column is exposed in public payloads and the REST API by default.

---

## Hunter.io contact enrichment (leads cron)

The leads cron's OpenAI extraction is forbidden from guessing email
patterns — the prompt says `NEVER invent firstname.lastname@domain`.
So contact_email used to be null on most leads. As of the Hunter
integration, the cron calls `hunterEnrichLead($domain, $nameHint)`
right after the OpenAI step and before the INSERT.

**Selection strategy** (in `src/Hunter.php`):
1. If OpenAI already extracted a contact_name, try to match a Hunter
   result to that name first.
2. Otherwise prefer ops / leadership titles (CEO, founder, COO, Head of,
   VP, director, ops, marketing, revenue, growth, sales).
3. Otherwise fall back to Hunter's highest-confidence email.

**Cost model**: 1 search credit per lead. Hunter API key is on a paid
plan (set in `config/config.php` → `HUNTER_API_KEY`). If credits run out,
the cron logs `Hunter HTTP 402` and continues — lead inserts unchanged,
just without enrichment.

**Transparency**: when Hunter fills the email, `source_notes` on the
lead row appends `contact via hunter.io domain-search (confidence N)`
so the dashboard / API consumers can see where the email came from.

**If the key needs to change** or you want to swap providers (Apollo,
Clearbit, etc.), the wrapper is isolated in `src/Hunter.php` — the cron
only calls `hunterEnrichLead()`, which returns a normalized array of
fields. Replace the implementation, keep the function signature.

---

## Schema.org JSON-LD — three layers

Every page on the site emits at least three JSON-LD blocks automatically:

1. **Sitewide** (Organization, Person, WebSite with SearchAction) — rendered by `render_sitewide_schemas()` in `views/layout.php`. Source: `src/helpers/schema.php`. The `sameAs` URLs in `schema_person()` and `schema_organization()` are placeholders — **update them with real profile URLs before scaling visibility**.
2. **Per-URL static** — declared in `config/schemas.php` keyed by URL path (same shape as `config/pages.php`). Auto-rendered by `render_page_schemas($path)` from the layout.
3. **Dynamic per-entity** — emitted inside view files (tool detail, playbook detail, etc.) using builder helpers like `schema_software_application()`, `schema_how_to()`, `schema_article()`, `schema_dataset()`, `schema_faqpage()`. These stay close to the data that drives them.

Plus **breadcrumbs** are auto-generated from the URL path for every non-home page (`render_breadcrumbs($path)`).

### Adding schema to a new page
- **Static page** (about-style content): add an entry in `config/schemas.php` under its URL key. Done — layout renders it.
- **Dynamic detail page** (per-entity data): in the view file, build the schema with the appropriate builder helper, then `render_jsonld($schema)`. Example pattern lives in `views/pages/intelligence-tool.php`.

### Updating the sitewide `sameAs` profiles
Edit `src/helpers/schema.php` → `schema_person()` and `schema_organization()`. The arrays under `sameAs` are the bridge that lets AI crawlers cross-reference your Person/Organization entity with external authoritative sources (LinkedIn, GitHub, X, etc.). Higher cross-references → higher trust signal.

---

## llms.txt + llms-full.txt

Two complementary files for guiding AI crawlers/ingestors:

| File | Purpose | Generation |
|------|---------|------------|
| `/llms.txt` | Curated navigation + descriptions. Static markdown file. Updated by hand when you add a major track / product / dataset. | Hand-edited at project root |
| `/llms-full.txt` | Full content dump for ingestion: site overview + live snapshots of every dataset + recent changes + endpoint pointers. Cached 1 hour. | Dynamic route handler in `index.php` — generates on demand from live data |

The layout also emits `<link rel="alternate">` tags for both files so well-behaved AI crawlers can discover them.

### Updating llms.txt
Edit `llms.txt` at project root. Structure:
1. SITE OVERVIEW (one-paragraph framing + the three tracks)
2. Each TRACK with its products/datasets
3. SUPPORTING PAGES
4. GUIDANCE FOR LANGUAGE MODELS

After significant changes, hit `/llms-full.txt` to verify it rebuilt cleanly. If you ever rename or remove a page, update llms.txt the same day — stale entries actively mislead.

---

## Stable UUIDs (phase 4)

Every entity in `tools_vendor_intelligence`, `benchmarking_data`,
`playbooks_workflows`, `roi_case_studies`, and `market_landscape` has a
`uuid CHAR(36)` column alongside its slug. Slugs are user-facing aliases
that can be renamed without notice; **UUIDs are the never-changing
external identifier** — what external systems, citations, and RAG indexes
should pin against.

- Backfill happens automatically when `db_init.php` runs (idempotent
  `WHERE uuid IS NULL` so re-runs are no-ops).
- The daily `sync-relationships` cron also backfills any rows created by
  the refresh crons between db_init runs. No separate cron needed.
- Resolvable via `GET /intelligence/api/by-uuid/{uuid}.json` regardless
  of which dataset the UUID belongs to.
- Exposed in every public payload as the `uuid` field.

### When this matters
The day you rename "n8n" to "n8n-cloud", every external `/intelligence/tools/n8n.json`
URL 404s. But every system that pinned the UUID still resolves cleanly. Cheap
insurance now; painful retrofit later.

---

## MCP prompts (phase 4)

Four pre-baked prompts ship with the MCP server. Surfaced as one-click
buttons in client UIs (Claude Desktop, Cursor, etc.) under "Prompts":

| Name | Arguments | What it does |
|------|-----------|------|
| `compare_tools` | `tool_a`, `tool_b`, optional `use_case` | Loads both full tool payloads into context, frames a side-by-side comparison |
| `recommend_for_stack` | `stack` (comma-separated), optional `goal` | Runs the graph traversal, inlines matching playbooks, frames an implementation pick |
| `weekly_digest` | optional `days` (default 7) | Inlines recent change_log events at medium+ significance, asks for a digest post |
| `find_automation_for` | `problem` (free text) | Semantic search across playbooks / case studies / tools, frames a grounded recommendation |

Most MCP servers ship resources + tools only. Prompts make the server
feel like a product instead of a data dump — and almost nobody else is
doing it yet. First-mover positioning in MCP directories compounds.

### Adding a new prompt
1. Append to `mcpPromptsList()` in `src/Mcp.php` — name, description, arguments
2. Add a `mcpPrompt<Name>()` function that returns `['description' => ..., 'messages' => [...]]`
3. Wire it into the switch in `mcpPromptsGet()`

That's it. The MCP client picks it up automatically the next time it
re-fetches prompts/list.

---

## Relationship graph — backfill + how it powers leads retention

The intelligence datasets are linked by two named join tables:

- **`playbook_tools`** — which tools are required / optional for each playbook
- **`case_study_tools`** — which tools were implemented / before / after in each case study

Edges are derived from the existing JSON columns on those tables
(`playbooks_workflows.required_tools`, `roi_case_studies.tools_implemented`,
etc.) by `cron/sync-relationships.php`. Name → slug matching is fuzzy:
exact slug, exact tool_name, then partial. Tool names that don't match
anything in `tools_vendor_intelligence` are skipped (logged as "unmatched"
in the summary email).

### After deploying phase 3
Run the backfill once:
```
GET https://tylerewillis.com/api/sync-relationships?token=daBInhsXt4zRs1im6SUaLZ
```
You'll get an email summary like `relationships cron: +47 new edges`.
After that, the daily cron at 6am EST keeps things current (re-runs are
cheap — INSERT IGNORE skips existing edges).

### Why this matters for revenue
The leads dashboard now shows a **"Playbooks that fit this lead"** panel
on every lead card, ranked by how many of the lead's detected tech-stack
tools overlap a playbook's required tools. Subscribers see concrete
playbooks they can pitch — "you're already running HubSpot and Zapier,
here's the n8n→HubSpot deal pipeline I'd build for you" — instead of
having to scan the catalog manually. Every match is a reason not to
cancel.

Detail-page cross-surfacing is also live for playbook → tools (tool names
become clickable badges deep-linking to `/intelligence/tools/<slug>`).
Tool → playbooks and case-study → tools are next when needed.

### Manually adding an edge
The unique constraint is `(playbook_slug, tool_slug)` or
`(case_study_slug, tool_slug, role)`. Set `source = 'manual'` so future
sync runs leave it alone:
```sql
INSERT INTO playbook_tools (playbook_slug, tool_slug, role, source, confidence)
VALUES ('client-onboarding-hubspot', 'calendly', 'recommended', 'manual', 0.95);
```

---

## Embeddings — initial backfill

The semantic search endpoint and the MCP `search_intelligence` tool both
depend on every row having an embedding. After deploying the phase 1
changes, do a one-time backfill:

```
GET https://tylerewillis.com/api/refresh-embeddings?token=daBInhsXt4zRs1im6SUaLZ&max=500
```

Watch `storage/logs/intelligence.log` for `cron=refresh-embeddings` lines.
Re-hit until the email summary shows all datasets at zero new embeddings.
After that, the daily cron at 5am EST keeps things current — it skips any
row whose source text hash hasn't changed since the last embed, so re-runs
are cheap.

Cost: text-embedding-3-small is ~$0.02 / 1M tokens. A full backfill of
every row across every dataset is single-digit cents.

---

## API key issuing (manual, no Stripe)

Triggered when someone POSTs `/intelligence/api/request`. Email lands in
your inbox with their name, email, org, tier, and use case.

There is no automated key system right now. When you want to issue access:

1. Reply to confirm and exchange payment terms.
2. Decide whether they need a key at all. Most asks will be for the
   leads dataset; the public datasets work without one.
3. For leads API access, the current "key" is just: you flip them into a
   `lead_subscriptions` row (see the leads provisioning runbook above) and
   point them at the dashboard. The REST API for leads isn't wired up yet.
4. For bulk export or custom feeds: do it by hand for now — generate a CSV
   from the relevant table and email it.

When traffic justifies it, the lift to add real API key enforcement is small:
add an `api_keys` table, a middleware check in `/intelligence/api/*` routes,
and a `/api-keys` page in `/app` for users to view/rotate theirs.

---

## Database migrations

The schema lives in `setup/db_init.php`. The pattern:
- New tables: `CREATE TABLE IF NOT EXISTS ...` in the `$stmts` array
- Existing tables: `ALTER TABLE ... ADD COLUMN IF NOT EXISTS ...` appended
- **Never edit a `CREATE TABLE` in place** — it won't re-run on existing rows

To run migrations after a deploy: hit `https://tylerewillis.com/setup/db_init.php`
in a browser (no auth — keep this file IP-restricted or rename it if you're
worried). It's idempotent.

---

## Where things live

- **Cron files:** `cron/refresh-*.php` (one per dataset)
- **Page templates:** `views/pages/*.php`
- **Routes:** `index.php` (single file, scan with `grep '$r->'`)
- **Page meta:** `config/pages.php`
- **Seed lists for crons:** `config/*-seeds.php`
- **DB schema:** `setup/db_init.php`
- **Shared helpers:** `src/helpers/intelligence.php`
- **MCP server + API:** `src/Mcp.php`
- **Logs:** `storage/logs/intelligence.log` and `storage/logs/app.log`