NSPA 2/2/2 Framework | Example 2: Application Triage

Milestone Goal

Build a staff-operated triage pipeline that flags incomplete or ineligible applications before they reach reviewers, reducing reviewer burden and back-and-forth with applicants.

Team Pod

Program coordinator + eligibility screener + 1 committee member (calibration) + supervisor. Organized around the review committee experience.

Core Constraint

Triage is for flagging, not deciding. The AI output is a structured checklist. Staff confirm every flag before any action is taken. No AI-to-applicant pipeline.

🔒 Non-negotiable data rule: Prompts receive an anonymized checklist of submitted materials vs. required items only. Applicant names, contact information, financial data, and personally identifiable details never enter any prompt at any stage.
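One way to back the data rule with a mechanical check is to scan each prompt input before it is sent. The sketch below is illustrative only: the function name `find_pii_like_patterns` and the regular expressions are assumptions, not part of the framework. They catch only obvious email, phone, and SSN-style strings and cannot detect names, so a check like this would supplement staff review, not replace it.

```python
import re

# Illustrative patterns only (assumptions). This is not a complete PII detector.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii_like_patterns(text: str) -> list[str]:
    """Return the names of any patterns found in the prompt input."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]

# Hypothetical checklist rows for demonstration.
clean_row = "Application A | transcript: yes | letters: 1 of 2 | essay: yes"
leaky_row = "Application B | contact: jane@example.org | transcript: yes"

print(find_pii_like_patterns(clean_row))  # nothing detected
print(find_pii_like_patterns(leaky_row))  # email detected, so do not send
```

An empty result does not prove the input is clean; it only means none of these specific patterns matched.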

⚠ Triage is not a decision engine. An AI flag means a human looks at the application. It does not mean the application is rejected, denied, or removed from consideration. This distinction must be communicated to all staff before the pilot begins.

Program Policy Statement

AI-Assisted Application Triage — We Will / We Will Not

✓ We Will

Use AI to generate a structured triage report flagging potential completeness or eligibility gaps against defined criteria.
Require staff to verify every AI flag against the original application before taking any action.
Treat edge-case flags as human-only review items (dual enrollment, partial transcripts, special circumstances).
Document false positives and false negatives to improve prompt accuracy over time.
Audit triage flag rates by school type and region to detect potential equity drift.
Notify applicants of gaps only after a staff member confirms the flag is accurate.

✕ We Will Not

Include applicant names, IDs, contact information, or financial data in any prompt.
Treat an AI flag as a final eligibility ruling. All flags require human confirmation.
Contact applicants directly from an AI-generated output without staff review.
Use triage AI output to disqualify, reject, or remove an applicant without human decision-making.
Expand triage to a second scholarship track until this pilot is stable and measured.
Rely on AI to interpret ambiguous eligibility edge cases. Those go to a human screener.

Weeks 1–2

Planning: Eligibility Audit and Prompt-Ready Criteria List

🎯 Sprint goal: Leave Weeks 1–2 with a prompt-ready eligibility criteria list, an incomplete application audit, a handoff protocol, and a policy statement on file. No prompts written yet.

Task 1 — Eligibility Criteria Extraction

Pull your eligibility criteria from program guidelines and convert them into a prompt-ready format. Each criterion must be precise enough that a checklist comparison is unambiguous.

Criterion	Hard Cutoff?	Verifiable from Submitted Materials?	Edge Cases / Ambiguities	Prompt-Ready?
Minimum GPA requirement	Yes	Yes (transcript)	Weighted vs. unweighted; mid-year grading	Y
Residency / geographic requirement	Yes	Yes (address on form)	Dual residency, recent movers	Y
Enrollment status (full-time / part-time)	Yes	Partial (self-reported)	Dual enrollment, gap year applicants	Edge
Application deadline	Yes	Yes (submission timestamp)	Technical issues, email submissions	Y
Required essay submitted	Yes	Yes (checklist)	Partial essays, wrong prompt answered	Edge
Letters of recommendation (count)	Yes	Yes (submission log)	One letter missing vs. both missing	Y
Financial need documentation	Varies	Partial	FAFSA pending, amended returns	Human-only

ℹ️ Only criteria marked Y go into the triage prompt. Edge criteria generate an "uncertain" flag for human review. Human-only criteria are removed from the prompt entirely and handled by staff screeners.
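The routing rule above can be sketched as a simple grouping step. This is a minimal illustration, assuming the criteria are kept in a dictionary keyed by name; the names and statuses are copied from the Task 1 table, while `split_criteria` is a hypothetical helper.

```python
# Criteria and Prompt-Ready statuses copied from the Task 1 table.
CRITERIA = {
    "Minimum GPA requirement": "Y",
    "Residency / geographic requirement": "Y",
    "Enrollment status (full-time / part-time)": "Edge",
    "Application deadline": "Y",
    "Required essay submitted": "Edge",
    "Letters of recommendation (count)": "Y",
    "Financial need documentation": "Human-only",
}

def split_criteria(criteria: dict[str, str]) -> dict[str, list[str]]:
    """Group criterion names by their Prompt-Ready status."""
    groups: dict[str, list[str]] = {"Y": [], "Edge": [], "Human-only": []}
    for name, status in criteria.items():
        groups[status].append(name)
    return groups

groups = split_criteria(CRITERIA)
print("Goes into the prompt:", groups["Y"])
print("Produces an uncertain flag:", groups["Edge"])
print("Kept out of the prompt:", groups["Human-only"])
```

A status outside the three expected values raises a `KeyError`, which is deliberate here: an unclassified criterion should stop the process instead of slipping into a prompt.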

Task 2 — Incomplete Application Audit (Last Cycle)

Pull last cycle's incomplete or ineligible applications. Categorize by gap type to identify where triage adds the most value.

Gap Type	Count (Last Cycle)	% of Total Applications	Staff Time to Catch (hrs)	Triage Priority
Missing transcript	Fill in	Fill in	Fill in	High
Missing letter of recommendation	Fill in	Fill in	Fill in	High
GPA below cutoff	Fill in	Fill in	Fill in	High
Submitted after deadline	Fill in	Fill in	Fill in	Medium
Wrong enrollment status	Fill in	Fill in	Fill in	Medium
Essay missing or incomplete	Fill in	Fill in	Fill in	Medium
Financial docs incomplete	Fill in	Fill in	Fill in	Human-only
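Once the counts are filled in, the "% of Total Applications" column is a single division per row. The sketch below assumes a hypothetical helper `audit_summary`; the counts and the total of 240 are placeholders for illustration, not data from any real cycle.

```python
def audit_summary(gap_counts: dict[str, int], total_applications: int) -> dict[str, float]:
    """Return each gap type's share of all applications, as a percentage."""
    if total_applications <= 0:
        raise ValueError("total_applications must be positive")
    return {
        gap: round(100 * count / total_applications, 1)
        for gap, count in gap_counts.items()
    }

# Placeholder numbers only. Replace with last cycle's real counts.
example_counts = {
    "Missing transcript": 18,
    "Missing letter of recommendation": 12,
    "GPA below cutoff": 6,
}
shares = audit_summary(example_counts, total_applications=240)
for gap, pct in shares.items():
    print(f"{gap}: {pct}% of applications")
```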

Task 3 — Handoff Protocol

Document what happens after a flag is confirmed by staff. The AI triage report is the trigger; all actions that follow are human-owned.

Staff screener receives the AI triage report for the application batch.
For each flagged item, staff locate the original application and confirm the gap is real (not a false positive).
Confirmed gaps are logged in the applicant tracking system with a timestamp and staff initials.
Staff generate the applicant outreach message using an approved template (not AI-generated). Message specifies the gap, the correction deadline, and submission instructions.
Applications with confirmed hard-cutoff failures (GPA, deadline, residency) are flagged for supervisor review before any notification is sent.
Edge cases (dual enrollment, partial transcript, contested deadline) are routed to the program director for determination before any action.
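The handoff steps above can be sketched as a small routing function. This is an assumed illustration: the label strings, the function name `next_step`, and the order of the checks (edge cases first) are choices made for the example, and every returned step is still carried out by a person.

```python
# Categories taken from the handoff protocol; the string labels are assumptions.
HARD_CUTOFFS = {"gpa", "deadline", "residency"}
EDGE_CASES = {"dual enrollment", "partial transcript", "contested deadline"}

def next_step(gap: str, staff_confirmed: bool) -> str:
    """Return the human-owned next step for one flagged item."""
    if gap in EDGE_CASES:
        return "route to program director"
    if not staff_confirmed:
        return "no action: log as false positive"
    if gap in HARD_CUTOFFS:
        return "supervisor review before notification"
    return "log gap, then send approved outreach template"

print(next_step("gpa", staff_confirmed=True))
print(next_step("missing letter", staff_confirmed=True))
print(next_step("dual enrollment", staff_confirmed=True))
```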

Sprint 1 Deliverable Checklist

📋 Prompt-ready eligibility criteria list
📊 Incomplete application audit
🗺️ Handoff protocol document
🔒 Data handling agreement
📝 Policy statement (signed)
📁 20–30 anonymized test applications

Weeks 3–4

Building: Triage Prompt, Test Run, and Verification Workflow

🎯 Sprint goal: Produce a tested triage prompt with documented accuracy results, a staff verification checklist, an outreach template, and a one-page workflow reference.

Task 4 — Test Run Protocol (20–30 Applications)

Before using the prompt on live applications, test it against 20–30 anonymized applications from the prior cycle where you already know the correct triage outcome.

Select 20–30 anonymized applications from last cycle: 10 complete and eligible, 10 incomplete or ineligible, 5–10 edge cases.
Run each application's anonymized checklist through the prompt. Record the AI output for each.
Compare AI output to the known outcome (what actually happened to that application last cycle).
Log false positives (AI flagged it; it was actually fine) and false negatives (AI missed a real gap).
Calculate accuracy rate. If false positive rate exceeds 10% or false negative rate exceeds 5%, revise the prompt before using it on live applications.
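The rate calculation in the last step can be sketched as follows. The text does not define the denominators, so this example assumes the standard ones: false positives divided by all applications that were actually fine, and false negatives divided by all applications with a real gap. The function names and the sample data are hypothetical.

```python
def error_rates(results: list[tuple[bool, bool]]) -> tuple[float, float]:
    """results holds (has_real_gap, ai_flagged) pairs, one per test application."""
    clean = [flagged for real_gap, flagged in results if not real_gap]
    gapped = [flagged for real_gap, flagged in results if real_gap]
    # False positive: flagged although no real gap existed.
    fp_rate = sum(clean) / len(clean) if clean else 0.0
    # False negative: a real gap that was not flagged.
    fn_rate = sum(not flagged for flagged in gapped) / len(gapped) if gapped else 0.0
    return fp_rate, fn_rate

def needs_revision(fp_rate: float, fn_rate: float,
                   fp_limit: float = 0.10, fn_limit: float = 0.05) -> bool:
    """Apply the test-run thresholds: revise if either limit is exceeded."""
    return fp_rate > fp_limit or fn_rate > fn_limit

# Made-up test run: 10 clean applications (1 wrongly flagged), 10 with gaps (all caught).
sample = [(False, False)] * 9 + [(False, True)] + [(True, True)] * 10
fp, fn = error_rates(sample)
print(f"False positive rate: {fp:.0%}, false negative rate: {fn:.0%}")
print("Revise prompt before live use:", needs_revision(fp, fn))
```

With 20–30 test applications, a single error moves either rate by several percentage points, so results near a threshold are worth a second look before deciding.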

Test Run Accuracy Log

App # (Anon)	Known Outcome	AI Flag Produced	Correct?	Error Type	Notes
A-001
A-002
A-003
Continue for all test applications. Compile accuracy summary before live use.

Sprint 2 Deliverable Checklist

🧾 Three-tier prompt pack (tested)
📊 Test run accuracy log
✅ Staff verification checklist
📄 Outreach template (staff-approved)
📄 One-page workflow reference
👤 Director sign-off on accuracy threshold

Weeks 5–6

Review & Wrap-Up: Accuracy Audit, Equity Check, and Next Cycle Brief

🎯 Sprint goal: Measure accuracy against the live batch, run the equity audit, document edge cases, and decide whether to expand to a second scholarship track.

Accuracy Audit Protocol

Measure: False positive rate (the AI flagged an application that was actually complete and eligible).
Measure: False negative rate (the AI missed a real gap that staff caught during verification).
Measure: Average staff time to verify a triage report (target: faster than manual screening).
Measure: Number of edge cases that the "uncertain" flag correctly surfaced for human review.
Note: Number of applications where staff overrode the AI report entirely.

⚠ Decision threshold: If false positive rate exceeds 15% in the live batch, pause and revise the prompt before next cycle. Do not expand to a second track.

Equity Audit Protocol

Audit: Did flag rates differ meaningfully by school type (public/private/charter, rural/urban)?
Audit: Did flag rates differ by region or zip code cluster?
Audit: Were first-generation applicants flagged at a higher rate than continuing-generation applicants?
Audit: Did any flag pattern suggest the AI is penalizing non-standard application formats rather than actual gaps?
Note: Flag rates should track eligibility gaps, not applicant demographics. Investigate any mismatch before continuing.
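The group comparisons above reduce to computing a flag rate per group and comparing the results. The sketch below is a minimal illustration with hypothetical names (`flag_rates_by_group`, `rate_gap`) and made-up records. It assumes the group labels live in staff-side records only and are joined to flag results after triage, so no demographic data enters a prompt.

```python
from collections import defaultdict

def flag_rates_by_group(records: list[tuple[str, bool]]) -> dict[str, float]:
    """records holds (group_label, was_flagged) pairs; returns the flag rate per group."""
    totals: dict[str, int] = defaultdict(int)
    flagged: dict[str, int] = defaultdict(int)
    for group, was_flagged in records:
        totals[group] += 1
        flagged[group] += int(was_flagged)
    return {group: flagged[group] / totals[group] for group in totals}

def rate_gap(rates: dict[str, float]) -> float:
    """Difference between the highest and lowest group flag rates."""
    return max(rates.values()) - min(rates.values())

# Made-up records for illustration only.
records = (
    [("rural", True)] * 3 + [("rural", False)] * 7
    + [("urban", True)] * 1 + [("urban", False)] * 9
)
rates = flag_rates_by_group(records)
print(rates)
print(f"Gap between groups: {rate_gap(rates):.0%}")
```

A gap alone does not show bias, since groups may differ in real completeness gaps. It marks where staff should compare flags against confirmed gaps, as the note above describes.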

Sprint 3 Deliverable Checklist

⏱️ Accuracy log (live batch)
⚖️ Equity audit summary
📋 Edge case reference list
🔄 Revised prompt (if needed)
📝 Next milestone brief

Full Prompt Pack — Application Triage Reports

ℹ️ Input to all three prompts is always an anonymized submission checklist: what the applicant submitted vs. what was required. Never include names, IDs, or contact information.
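A sketch of how such a checklist might be assembled is shown below. It assumes submissions are held in a dictionary keyed by an internal ID; the function name `anonymize_batch`, the row format, and the sample ID are all hypothetical. The mapping from letter labels back to internal IDs is returned separately so it can stay with staff and out of the prompt.

```python
from string import ascii_uppercase

def anonymize_batch(
    submissions: dict[str, dict[str, bool]],
) -> tuple[list[str], dict[str, str]]:
    """Return (prompt_rows, key). The key maps labels to internal IDs and stays with staff."""
    if len(submissions) > len(ascii_uppercase):
        raise ValueError("This sketch labels at most 26 applications per batch")
    rows: list[str] = []
    key: dict[str, str] = {}
    for letter, (internal_id, items) in zip(ascii_uppercase, submissions.items()):
        label = f"Application {letter}"
        key[label] = internal_id  # never pasted into a prompt
        item_text = ", ".join(
            f"{name}: {'yes' if submitted else 'no'}" for name, submitted in items.items()
        )
        rows.append(f"{label} | {item_text}")
    return rows, key

# Hypothetical internal record.
rows, key = anonymize_batch({"internal-001": {"transcript": True, "essay": False}})
print(rows[0])
```

Only `rows` would be pasted into a prompt. Staff use `key` afterwards to locate the original application for verification.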

Good — Fast Draft: Quick flag list for clear-cut completeness gaps

You are helping a nonprofit scholarship program screen applications for completeness before reviewer assignment.

Your task: Compare the submitted materials checklist below against the required materials list. Flag any gaps.

Required materials for this program:
[PASTE YOUR REQUIRED ITEMS LIST HERE — e.g., transcript, 2 letters of recommendation, essay, application form, deadline met]

Submitted materials for this application batch (anonymized):
[PASTE SUBMITTED CHECKLIST — one row per application, no names or IDs. Use Application A, B, C, etc.]

Output format: A table with columns: Application ID | Missing Item(s) | Flag Type | Notes
Flag types: MISSING (item not submitted), INCOMPLETE (partial submission), COMPLETE (all required items submitted, no issues)

Important: Do not guess, infer, or extrapolate. Only flag what is verifiably absent from the submitted checklist above.

Use for: standard batches with clear-cut completeness checks. Staff still verify every flag against the original submission before taking action.

Better — High-Accuracy: Adds eligibility criteria checks and uncertainty flags

You are helping a nonprofit scholarship program screen applications for completeness and eligibility before reviewer assignment.

Your task: Review the anonymized application checklist against both the required materials list and the eligibility criteria below. Produce a structured triage report.

Required materials:
[PASTE REQUIRED ITEMS LIST]

Eligibility criteria (hard cutoffs only — do not include financial need criteria):
[PASTE CRITERIA — e.g., GPA minimum 3.0, in-state residency, full-time enrollment, deadline met]

Anonymized application batch (no names or IDs; use Application A, B, C, etc.):
[PASTE CHECKLIST — submitted items and any reported eligibility data points]

Output format: One row per application with these columns:
1. Application ID
2. Completeness Flag: COMPLETE / INCOMPLETE (list missing items) / UNCERTAIN
3. Eligibility Flag: APPEARS ELIGIBLE / APPEARS INELIGIBLE (cite which criterion) / UNCERTAIN
4. Edge Case?: YES / NO (note the specific ambiguity if YES)
5. Recommended Action: ADVANCE TO REVIEW / HOLD FOR STAFF REVIEW / FLAG FOR DIRECTOR

Rules:
- If information is missing or ambiguous, output UNCERTAIN rather than guessing.
- Do not make eligibility determinations on financial need, enrollment edge cases, or dual-enrollment situations.
- Cite the specific criterion that triggers any INELIGIBLE flag.

Use for: standard cycle batches where eligibility criteria checks matter alongside completeness. The UNCERTAIN flag protects against false positives on edge cases.

Best — Governed Workflow: Full audit trail, self-check, mandatory staff review gate

You are helping a nonprofit scholarship program screen applications for completeness and eligibility. This is a governed workflow. Follow all steps in order.

STEP 1 — Acknowledge scope:
Confirm: (a) no applicant PII is present in this prompt, (b) your output is a triage report for staff review, not a final eligibility ruling, (c) you will output UNCERTAIN rather than guess on any ambiguous case.

STEP 2 — Triage each application using only the information below:

Required materials:
[PASTE REQUIRED ITEMS LIST]

Eligibility criteria (hard cutoffs only):
[PASTE CRITERIA]

Human-only review items (do not triage these — mark HUMAN-ONLY):
[PASTE — e.g., financial need documentation, dual-enrollment status, late submission with documented technical issues]

Anonymized application batch:
[PASTE CHECKLIST — Application A, B, C, etc. No names or IDs.]

STEP 3 — Self-check before outputting:
- [ ] Did I flag any application based on inferred or missing information not in the checklist?
- [ ] Did I make any determination on human-only items?
- [ ] Are all UNCERTAIN flags documented with a specific reason?
- [ ] Does my recommended action column distinguish ADVANCE / HOLD / DIRECTOR-LEVEL?

STEP 4 — Output format:
A. Self-check results (one line per item above)
B. Triage report table: App ID | Completeness | Eligibility | Edge Case | Recommended Action | Notes
C. Summary counts: Total screened | Complete/Eligible | Flags | UNCERTAIN | Human-only items

Use for: pilot launch and any batch where full audit documentation is required. Retain the self-check output as part of the triage record.

Sample Triage Report Output

This is what a completed triage report should look like after the prompt has run and staff have reviewed each flag. Use this as the benchmark for evaluating AI output quality.

Flag legend: MISSING | UNCERTAIN | HUMAN-ONLY | COMPLETE

Triage Report — Sample Batch Output (Anonymized)

Self-check results:
✓ No PII present in input
✓ No determinations made on human-only items
✓ All UNCERTAIN flags have documented reasons
✓ Recommended actions use correct three-level scale

Triage Report:
App ID | Completeness    | Eligibility        | Edge Case | Recommended Action    | Notes
-------|-----------------|--------------------|-----------|-----------------------|------------------------
A-001  | COMPLETE        | APPEARS ELIGIBLE   | No        | ADVANCE TO REVIEW     | All materials present
A-002  | INCOMPLETE      | APPEARS ELIGIBLE   | No        | HOLD FOR STAFF REVIEW | Missing 1 of 2 rec letters
A-003  | COMPLETE        | APPEARS INELIGIBLE | No        | HOLD FOR STAFF REVIEW | GPA reported as 2.7 (min: 3.0)
A-004  | COMPLETE        | UNCERTAIN          | Yes       | FLAG FOR DIRECTOR     | Dual enrollment — criteria unclear
A-005  | INCOMPLETE      | UNCERTAIN          | No        | HOLD FOR STAFF REVIEW | Transcript marked "pending"
A-006  | HUMAN-ONLY ITEM | HUMAN-ONLY ITEM    | Yes       | FLAG FOR DIRECTOR     | Financial need docs; not triaged per policy

Summary: Total screened: 6 | Complete/Eligible: 1 | Flags: 3 | UNCERTAIN: 2 | Human-only: 1

⚠ Staff action required: Every row in "HOLD FOR STAFF REVIEW" and "FLAG FOR DIRECTOR" requires a human to open the original application and verify before any action is taken. The triage report is a to-do list for screeners, not a decision.
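The "to-do list" reading of the report can be sketched by pulling out every row whose recommended action requires a person. This example assumes the report rows arrive as pipe-delimited text with the six columns used in the sample; `parse_row` and `staff_todo` are hypothetical helpers, and the rows are adapted from the sample.

```python
# Column names follow the sample triage report; the parsing approach is an assumption.
COLUMNS = ["App ID", "Completeness", "Eligibility", "Edge Case", "Recommended Action", "Notes"]
NEEDS_HUMAN = {"HOLD FOR STAFF REVIEW", "FLAG FOR DIRECTOR"}

def parse_row(line: str) -> dict[str, str]:
    """Split one pipe-delimited report row into named cells."""
    cells = [cell.strip() for cell in line.split("|")]
    return dict(zip(COLUMNS, cells))

def staff_todo(lines: list[str]) -> list[str]:
    """Return the App IDs that a screener must open and verify."""
    rows = [parse_row(line) for line in lines]
    return [row["App ID"] for row in rows if row.get("Recommended Action") in NEEDS_HUMAN]

report_lines = [
    "A-001 | COMPLETE | APPEARS ELIGIBLE | No | ADVANCE TO REVIEW | All materials present",
    "A-002 | INCOMPLETE | APPEARS ELIGIBLE | No | HOLD FOR STAFF REVIEW | Missing 1 of 2 rec letters",
    "A-004 | COMPLETE | UNCERTAIN | Yes | FLAG FOR DIRECTOR | Dual enrollment, criteria unclear",
]
print(staff_todo(report_lines))
```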

Staff Verification Checklist — Before Any Action on a Flagged Application

Required for every flagged application. Staff initials and date before any outreach or routing decision.

Section 1 — Flag Verification

Required: Open the original application submission. Confirm the flagged item is actually missing or fails the criterion.
Required: If the AI output is UNCERTAIN, route to program director. Do not make an independent determination.
Review: For INCOMPLETE flags, check whether the missing item arrived separately (email, fax, late upload).
Review: For INELIGIBLE flags, confirm the criterion cited matches the current program guidelines (not last year's).
Note: Log whether the AI flag was a true positive or false positive in the accuracy tracking sheet.

Section 2 — Outreach Authorization

Required: Outreach is only initiated after a human screener confirms the flag is accurate.
Required: Outreach message uses the approved template. No AI-generated outreach goes directly to applicants.
Required: Hard-cutoff failures (GPA, deadline, residency) receive supervisor review before notification is sent.
Review: Correction deadline in the outreach message matches current program policy.
Log: Outreach logged with staff initials, date, and confirmation that the AI flag was verified.
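The three Required items in Section 2 amount to a gate that must pass before any message goes out. A minimal sketch, assuming a hypothetical `FlagRecord` structure whose field names are invented for this example:

```python
from dataclasses import dataclass

@dataclass
class FlagRecord:
    """Hypothetical record of one flagged application (field names are assumptions)."""
    staff_verified: bool
    uses_approved_template: bool
    hard_cutoff: bool
    supervisor_reviewed: bool = False

def outreach_allowed(record: FlagRecord) -> bool:
    """Apply the Required items: verification, approved template, supervisor review."""
    if not (record.staff_verified and record.uses_approved_template):
        return False
    if record.hard_cutoff and not record.supervisor_reviewed:
        return False
    return True

missing_letter = FlagRecord(staff_verified=True, uses_approved_template=True, hard_cutoff=False)
gpa_failure = FlagRecord(staff_verified=True, uses_approved_template=True, hard_cutoff=True)

print(outreach_allowed(missing_letter))  # verified, template used, no hard cutoff
print(outreach_allowed(gpa_failure))     # blocked until a supervisor reviews
```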

Measurement Framework

Primary Metric

Triage Time

Minutes per application to complete triage, measured before and after introducing the AI triage report. Track across the full batch.

Accuracy Metric

False Positive Rate

Target below 10%. Above 15% triggers prompt revision before next cycle.

Safety Metric

False Negative Rate

Gaps AI missed that staff caught during verification. Target below 5%.
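The primary metric can be computed with two divisions. The sketch below uses hypothetical helper names and placeholder figures (600 and 360 staff minutes for 120 applications) purely to show the arithmetic. The "after" figure should include the time staff spend verifying flags.

```python
def minutes_per_application(total_minutes: float, applications: int) -> float:
    """Average triage time per application for one batch."""
    if applications <= 0:
        raise ValueError("applications must be positive")
    return total_minutes / applications

def time_saved_pct(before: float, after: float) -> float:
    """Percentage reduction in minutes per application."""
    return 100 * (before - after) / before

# Placeholder figures only. Replace with measured staff time.
before = minutes_per_application(total_minutes=600, applications=120)
after = minutes_per_application(total_minutes=360, applications=120)
print(f"Before: {before} min/app, after: {after} min/app")
print(f"Time saved: {time_saved_pct(before, after)}%")
```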