NSPA 2/2/2 Framework | Example 1: Applicant Communications Pilot

Milestone Goal

Build a verified, staff-operated AI drafting pipeline for one letter type before expanding further.

Team Pod

Program coordinator + 1 communications lead + 2 staff reviewers + supervisor (approval role). Organized around the applicant experience.

Starting Letter Type

Rejection letters: high-volume, emotionally high-stakes, highest ROI for a first pilot with mandatory approval gates.

⚠

Scope boundary: This pilot covers one letter type only. No expansion to acceptance, waitlist, or scholarship-specific acknowledgments until the rejection letter workflow is stable and measured. If the rejection letter workflow is not stable by the end of this cycle, the remaining work carries over into the next 6-week cycle.

🔒

Data rule in force for this entire cycle: No applicant names, IDs, financial data, or personally identifiable information enter any AI prompt at any stage. All sample letters used for benchmarking must be anonymized before use.

Program Policy Statement

AI-Assisted Applicant Communications — We Will / We Will Not

✓ We Will

Use AI to generate a consistent first-draft rejection letter from an approved, anonymized tone template.
Require staff review and personalization of every AI-generated draft before sending.
Maintain a supervisor approval gate for all outgoing letters during the pilot cycle.
Anonymize all benchmark sample letters before using them as reference inputs.
Log and audit AI-generated drafts to identify hallucinations or off-tone language.
Measure time per letter before and after the pilot to establish a real baseline.
Pause the workflow and reassess if accuracy or equity issues emerge.

✕ We Will Not

Include applicant names, IDs, financial data, or any PII in any AI prompt.
Send an AI-generated letter without staff review and explicit approval.
Treat AI output as a final draft. It is always a starting point.
Use AI to generate the final decision rationale or scoring explanation.
Expand to additional letter types until this pilot is stable and measured.
Allow AI-generated language to override program equity or compassionate communication standards.

📋

This policy statement should be reviewed and signed off by the program director before Weeks 3–4 begin. It becomes part of the pilot documentation and is retained for board reporting if requested.

Weeks 1–2

Planning: Define the Problem Before Touching Any Tool

🎯

Sprint goal: Leave Weeks 1–2 with a scoped problem statement, anonymized benchmark samples, a data handling agreement, and a policy statement on file. No prompts are written yet.

Task 1 — Communications Workflow Audit

Before building anything, audit where time is being lost and where errors cluster. Fill in this table using last cycle's data.

Letter Type	Avg. Volume / Cycle	Avg. Time per Letter (min)	Common Errors or Delays	AI Pilot Candidate?
Rejection letter	Fill in	Fill in	Tone inconsistency, delayed sends	YES — Start here
Waitlist notification	Fill in	Fill in	Status confusion, multiple edits	Cycle 2
Award acceptance packet	Fill in	Fill in	Details vary by award type	Cycle 3
Follow-up / missing docs	Fill in	Fill in	Fill in	Evaluate later
Special circumstances	Fill in	Fill in	High sensitivity, low volume	Human-only

Task 2 — Benchmark Sample Collection

The approved rejection letters you collect in this task become your quality baseline and few-shot reference examples.

Pull 5–10 approved rejection letters from last cycle's sent folder.
Remove all applicant names, IDs, school names, and identifying details. Replace with placeholders: [APPLICANT], [SCHOOL], [PROGRAM NAME].
Select 3 that represent the range of tone and scenario (standard, borderline, special circumstances). Label them HIGH, MID, EDGE.
Have the program director confirm that these three represent your quality standard. Document the approval with date and initials.
Store the anonymized samples as Benchmark_Letters_Cycle_[YEAR]_Anonymized. Restrict access to the pilot team only.
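
The placeholder step above can be made repeatable with a small script. This is a minimal sketch, not a vetted anonymization tool: it assumes staff supply the identifying strings for each letter, it does not detect PII on its own, and the function names and the sample letter are hypothetical. A human read-through is still required before any sample is stored.

```python
import re

def anonymize(text: str, replacements: dict[str, str]) -> str:
    """Replace each staff-supplied identifying string with its placeholder."""
    # Replace longer strings first so a full name wins over a partial match.
    for original in sorted(replacements, key=len, reverse=True):
        placeholder = replacements[original]
        pattern = re.compile(re.escape(original), flags=re.IGNORECASE)
        # A function replacement avoids any backslash handling in the placeholder.
        text = pattern.sub(lambda _match, p=placeholder: p, text)
    return text

def leftover_terms(text: str, replacements: dict[str, str]) -> list[str]:
    """Return any supplied identifying strings still present in the text."""
    lowered = text.lower()
    return [term for term in replacements if term.lower() in lowered]

# Made-up sample letter and identifiers, for illustration only.
sample = "Dear Jordan Lee, thank you for applying from Lakeview High to the Bright Futures Award."
known = {
    "Jordan Lee": "[APPLICANT]",
    "Lakeview High": "[SCHOOL]",
    "Bright Futures Award": "[PROGRAM NAME]",
}

clean = anonymize(sample, known)
print(clean)
print(leftover_terms(clean, known))
```

The script only removes what staff list in `known`, so nicknames, addresses, or details buried in the letter body will pass through unless someone adds them.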

⚠

Do not skip the director approval step. These samples become the reference point for evaluating AI outputs throughout the pilot. If the samples are substandard, the AI will mirror that quality.

Task 3 — Pilot Scope Document and Handoff Protocol

Document the following decisions in writing before building begins.

Scope decisions to lock in:

Lock: One letter type only (rejection letters).
Lock: AI output is always a draft. Staff edit and approve before send.
Lock: Supervisor approval required for all letters in the pilot cycle.
Set: Maximum turnaround time for AI-assisted letters (e.g., 48 hours).
Set: Identify the primary staff member who runs the prompt and the reviewer.

Handoff protocol decisions to lock in:

Lock: Who contacts applicants when a letter is flagged for special circumstances?
Lock: Escalation path for letters requiring special circumstances language.
Set: Where do approved AI drafts live before they are sent (shared doc, CRM)?
Note: How will the team log which letters were AI-assisted during the pilot?

Sprint 1 Deliverable Checklist

📄 Scoped problem statement (1 page)

📁 Anonymized benchmark samples (3–5)

📋 Letter type inventory with time log

🔒 Data handling agreement (signed)

📝 Policy statement (We Will / We Will Not)

🗺️ Handoff protocol document

Weeks 3–4

Building: Prompt Pipeline, Calibration, and Staff Reference

🎯

Sprint goal: Leave Weeks 3–4 with a tested, three-tier prompt set, a calibration log, a human review checklist, and a one-page workflow reference staff can use without re-reading this document.

Task 4 — Calibration Session Protocol

Before finalizing any prompt, run a calibration session with two staff members. This catches AI drift before it becomes a pattern.

Each staff member independently uses the same Fast Draft prompt to generate 5 rejection letter drafts using the same anonymized scenario inputs.
Compare outputs side by side. Flag: tone inconsistencies, generic language, missing program-specific details, hallucinated award names or deadlines.
Document specific failures in the Calibration Log. Each failure becomes a constraint added to the High-Accuracy prompt.
Repeat with the revised prompt. Calibration is complete when both staff members independently produce outputs meeting the benchmark sample standard.
Director spot-checks 2 outputs from each staff member and signs off on the prompt version for the pilot.

Calibration Log Template

Output #	Staff Member	Tone Issue?	Hallucination?	Missing Detail?	Action Taken
1
2
3
4
5
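
If the Calibration Log is kept digitally, the rows can be tallied to see whether a round came back clean. The sketch below is one possible shape, assuming each row mirrors the template columns above; the class and function names are hypothetical. A clean tally does not replace the side-by-side comparison against the benchmark samples, which remains a human judgment.

```python
from dataclasses import dataclass

@dataclass
class CalibrationEntry:
    # Fields mirror the Calibration Log Template columns.
    output_no: int
    staff_member: str
    tone_issue: bool
    hallucination: bool
    missing_detail: bool
    action_taken: str = ""

def failures_by_staff(entries: list[CalibrationEntry]) -> dict[str, int]:
    """Count flagged issues per staff member across all logged outputs."""
    tally: dict[str, int] = {}
    for entry in entries:
        issues = int(entry.tone_issue) + int(entry.hallucination) + int(entry.missing_detail)
        tally[entry.staff_member] = tally.get(entry.staff_member, 0) + issues
    return tally

def round_is_clean(entries: list[CalibrationEntry]) -> bool:
    """True only when no staff member logged any issue in this round."""
    return all(count == 0 for count in failures_by_staff(entries).values())

# Made-up entries for illustration.
round_one = [
    CalibrationEntry(1, "Reviewer A", True, False, False, "Added tone constraint"),
    CalibrationEntry(2, "Reviewer A", False, True, False, "Added 'do not invent deadlines'"),
    CalibrationEntry(1, "Reviewer B", False, False, False),
]

print(failures_by_staff(round_one))
print(round_is_clean(round_one))
```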

Sprint 2 Deliverable Checklist

🧾 Tiered prompt set (all 3 tiers)

📊 Calibration log (completed)

✅ Human review checklist

📄 One-page workflow reference

✍️ Outreach template (staff-approved)

👤 Director sign-off on prompt version

Weeks 5–6

Review & Wrap-Up: Measure, Audit, and Scope the Next Cycle

🎯

Sprint goal: Produce a complete before/after time log, an equity audit summary, and a next-milestone brief. Do not expand to a second letter type until this sprint closes with a clean audit.

Equity Audit Protocol

Pull 10–15 AI-drafted letters that were approved and sent. Review each against these equity checks:

Check: Did language shift in tone or warmth across applicants from different school types (public vs. private, urban vs. rural)?
Check: Did any letter include language interpretable as tied to zip code, demographics, or socioeconomic signals?
Check: Are letters consistent in length and detail regardless of application quality?
Check: Did any letter hallucinate program-specific details (award amounts, deadlines, named reviewers)?
Note: Were any edge cases handled by AI drafts that should have been escalated to human-only?
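
The length-consistency check above is the one item on this list that is easy to screen mechanically. The sketch below flags letters whose word count sits far from the batch median. The 25% tolerance and the function name are assumptions for illustration; the pilot team would set its own threshold. Tone and warmth still need a human reader.

```python
from statistics import median

def length_outliers(letters: dict[str, str], tolerance: float = 0.25) -> list[str]:
    """Return IDs of letters whose word count differs from the batch median
    by more than `tolerance` (a fraction of the median)."""
    counts = {letter_id: len(text.split()) for letter_id, text in letters.items()}
    mid = median(counts.values())
    return [lid for lid, n in counts.items() if abs(n - mid) > tolerance * mid]

# Placeholder text standing in for sent letters; only the word counts matter here.
batch = {
    "L1": "word " * 180,
    "L2": "word " * 190,
    "L3": "word " * 200,
    "L4": "word " * 120,
}

print(length_outliers(batch))
```

A flagged letter is a prompt to look closer. A shorter letter may be entirely appropriate for its scenario.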

Failure Documentation Protocol

Document every case where AI output was inadequate. These become prompt constraints or human-only designations for the next cycle.

Log: Prompts that produced hallucinated content (note specific hallucination).
Log: Scenarios where staff rejected the AI draft entirely and wrote from scratch.
Log: Letters requiring more than 2 rounds of staff editing (flag as high-maintenance).
Note: Edge cases not anticipated in the scope document.

⚠

Carry-forward rule: If the equity audit surfaces tone inconsistency across applicant groups, do not advance to a second letter type. Address the root cause first.

Sprint 3 Deliverable Checklist

⏱️ Before/after time log

⚖️ Equity audit summary

📋 Failure and edge case log

🔄 Revised prompt pack (if needed)

📝 Next milestone brief

Full Prompt Pack — Rejection Letter Drafts

ℹ️

All three prompts assume no PII is included in the input. Staff provide the program context block and a brief scenario summary only. The AI is never told who the applicant is.
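
A lightweight pre-flight check, run locally before anything is pasted into a prompt, can catch a few obvious slips. The patterns below are illustrative assumptions, not a complete PII detector: they look only for email addresses, long digit runs, and dollar amounts, and they cannot recognize names. The function name `preflight` is hypothetical.

```python
import re

# Illustrative patterns only; they catch a few obvious formats and cannot detect names.
SUSPECT_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "long digit run (possible ID)": re.compile(r"\d{5,}"),
    "dollar amount": re.compile(r"\$\s?\d"),
}

def preflight(prompt_input: str) -> list[str]:
    """Return the labels of any suspect patterns found in text bound for a prompt."""
    return [label for label, pattern in SUSPECT_PATTERNS.items()
            if pattern.search(prompt_input)]

# Made-up inputs for illustration.
print(preflight("Standard scenario; applicant may reapply next cycle."))
print(preflight("Contact jlee@example.org, ID 20481177, requested $5,000."))
```

An empty result means only that none of these patterns matched. Staff remain responsible for confirming that the input contains no applicant information.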

Good — Fast Draft: Quick first pass, familiar scenarios

You are a professional communications writer for a nonprofit scholarship program.

Your task: Draft a compassionate, brief rejection letter for an applicant who was not selected for this award cycle.

Program context:
- Program name: [PROGRAM NAME]
- Award cycle: [YEAR]
- Selection was competitive; many strong applicants were not funded
- The applicant may reapply in future cycles if still eligible

Letter requirements:
- Tone: warm, respectful, encouraging — not clinical or dismissive
- Length: 150–200 words
- Format: standard letter format with [APPLICANT] as the salutation placeholder
- Do not include: specific scores, reviewer comments, comparisons to other applicants, or financial details
- Do not invent: deadlines, award amounts, or named staff

Output: One complete letter draft, ready for staff to personalize.

Use for: standard, high-volume rejection letters where the scenario is clear. Staff replace [APPLICANT] and [PROGRAM NAME] before review.

Better — High-Accuracy Draft: Adds checks, flags uncertainty

You are a professional communications writer for a nonprofit scholarship program.

Your task: Draft a compassionate rejection letter for an applicant not selected in this award cycle.

Program context block (provide all that apply):
- Program name: [PROGRAM NAME]
- Award cycle: [YEAR]
- Reapplication eligibility: [YES / NO / CONDITIONAL — specify]
- Approved program language to include: [PASTE OR WRITE "NONE"]

Letter requirements:
- Tone: warm, respectful, encouraging — not clinical or dismissive
- Length: 175–225 words
- Format: standard letter with [APPLICANT] as the salutation placeholder
- Cite only information I have provided above; do not invent details
- If uncertain about any program-specific claim, write [STAFF: VERIFY THIS] inline

Quality checks to apply before outputting:
1. Does any sentence reference scores, comparisons, or reviewer notes? Remove it.
2. Does any sentence contain invented details not found in the context block? Remove it.
3. Is the tone consistent throughout?

Output: One complete letter draft followed by a brief list of any [STAFF: VERIFY] flags inserted.

Use for: situations where program-specific language matters and you want the AI to self-audit before outputting. Review flags before finalizing.

Best — Governed Workflow: Full audit trail, mandatory review gates

You are a professional communications writer for a nonprofit scholarship program. This prompt is part of a governed workflow. Follow all steps in order.

STEP 1 — Acknowledge scope:
Confirm: (a) no applicant PII has been or should be included, (b) you will flag uncertainty rather than invent details, (c) your output is a staff draft, not a final letter.

STEP 2 — Generate the draft using only this context:
- Program name: [PROGRAM NAME]
- Award cycle: [YEAR]
- Reapplication eligibility: [YES / NO / CONDITIONAL]
- Approved program language to include: [PASTE OR WRITE "NONE"]
- Any edge-case notes for this batch: [e.g., "some applicants withdrew midway"]

Letter requirements:
- Tone: warm, respectful, encouraging
- Length: 175–225 words
- Salutation placeholder: [APPLICANT]
- Do not invent any details not supplied above
- Flag uncertainty inline: [STAFF: VERIFY THIS]

STEP 3 — Self-audit before outputting:
- [ ] Tone consistent throughout (no clinical shift)?
- [ ] No invented award names, deadlines, or staff names?
- [ ] No comparison to other applicants or reference to scores?
- [ ] No language readable as tied to demographics or socioeconomic status?
- [ ] Any [STAFF: VERIFY] flags inserted where needed?

STEP 4 — Output format:
A. Self-audit results (one line per check above)
B. Complete letter draft
C. List of [STAFF: VERIFY] flags with the specific concern for each

Use for: pilot launch, batch processing, or any time a full audit trail is required. Director sign-off recommended before using this version in production.

Staff Verification Checklist — Before Every Letter Is Sent

Mandatory for every AI-assisted letter during the pilot cycle. Staff initials and date required before the letter moves to the supervisor approval queue.

Section 1 — Accuracy Checks

Required: All placeholders ([APPLICANT], [PROGRAM NAME], etc.) have been replaced with correct information.
Required: No invented award amounts, deadlines, or reviewer names appear in the letter.
Required: Reapplication eligibility statement (if present) matches current program policy.
Review: All [STAFF: VERIFY] flags from the governed workflow have been resolved or removed.
Review: Contact information for questions or appeals (if included) is accurate and current.
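
The placeholder and flag checks above lend themselves to a quick automated scan before the letter reaches the supervisor queue. This is a minimal sketch, assuming placeholders and flags always appear as bracketed all-caps tokens as they do in the prompt pack; the pattern and function name are hypothetical. It supplements the reviewer's read-through and does not replace it.

```python
import re

# Matches bracketed all-caps tokens such as [APPLICANT] or [STAFF: VERIFY THIS].
BRACKET_TOKEN = re.compile(r"\[[A-Z][A-Z :/]*\]")

def unresolved_tokens(letter: str) -> list[str]:
    """Return every bracketed placeholder or flag still present in the letter."""
    return BRACKET_TOKEN.findall(letter)

# Made-up drafts for illustration.
draft = "Dear [APPLICANT],\n\nThank you for applying. You may reapply by [STAFF: VERIFY THIS].\n"
final = "Dear Applicant,\n\nThank you for applying. We encourage you to reapply.\n"

print(unresolved_tokens(draft))
print(unresolved_tokens(final))
```

A letter is ready for sign-off on this check only when the returned list is empty.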

Section 2 — Tone and Equity Checks

Required: Tone is warm and respectful throughout. No clinical, bureaucratic, or dismissive language.
Required: No language referencing the applicant's score, rank, or comparison to other applicants.
Review: Language does not reference school type, geography, or any characteristic that could read as demographic bias.
Review: If encouragement to reapply is included, applicant is actually eligible to reapply.
Note: Letter length is appropriate (not notably longer or shorter than others in the batch).

Section 3 — Approval Sign-Off

Required: Staff reviewer name and initials: _______________ Date: _______________
Required: Supervisor approval received: _______________ Date: _______________
Log: Letter logged as AI-assisted in the pilot tracking sheet (required for audit at end of cycle).

Measurement Framework

Primary Metric

Time/Letter

Minutes from draft start to supervisor approval. Track before and after for comparison.

Quality Metric

Edit Rounds

Number of staff revision cycles per letter. Target: 1–2 rounds. Three or more rounds signals that the prompt needs revision.

Accuracy Metric

Flag Rate

Percentage of AI drafts with a [STAFF: VERIFY] flag or factual error caught during review.

Measurement Tracking Sheet Template

Letter #	Date Drafted	Prompt Tier Used	Draft Time (min)	Review Rounds	Flags Found	Equity Issue?	Sent Date
1
2
3
Continue for full batch. Compile summary at end of Weeks 5–6.

📈

Baseline note: Before the pilot begins, time 10 letters produced the traditional way. Record draft time, edit rounds, and any errors caught before send. This baseline is the comparison point at the end of the cycle.
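
The three metrics and the baseline comparison reduce to simple arithmetic over the tracking sheet. The sketch below shows one way to compute them, assuming each row carries the Draft Time, Review Rounds, and Flags Found columns from the template; the class name, function names, and sample numbers are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class LetterRow:
    # Fields mirror columns in the Measurement Tracking Sheet Template.
    letter_no: int
    draft_time_min: float
    review_rounds: int
    flags_found: int

def summarize(rows: list[LetterRow]) -> dict[str, float]:
    """Compute the time, edit-round, and flag-rate metrics for a batch."""
    flagged = sum(1 for row in rows if row.flags_found > 0)
    return {
        "avg_draft_time_min": mean(row.draft_time_min for row in rows),
        "avg_review_rounds": mean(row.review_rounds for row in rows),
        "flag_rate_pct": 100 * flagged / len(rows),
        "letters_with_3plus_rounds": sum(1 for row in rows if row.review_rounds >= 3),
    }

def pct_change(baseline: float, pilot: float) -> float:
    """Percent change from baseline; a negative value means the pilot was faster."""
    return 100 * (pilot - baseline) / baseline

# Made-up numbers for illustration only.
pilot_rows = [
    LetterRow(1, 12.0, 1, 0),
    LetterRow(2, 18.0, 2, 1),
    LetterRow(3, 30.0, 3, 2),
    LetterRow(4, 20.0, 1, 0),
]
summary = summarize(pilot_rows)
print(summary)
print(pct_change(baseline=25.0, pilot=summary["avg_draft_time_min"]))
```

Run the same `summarize` call on the 10 baseline letters so that both sides of the comparison are computed the same way.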