PHINEAS
Custom-built for an ESL academy in Singapore on a speculative basis. Active client relationship, ESL department head championing the work, beta-user trials with students and faculty.
Context
An ESL academy in Singapore needed to produce reading material at scale, at defined CEFR levels (the Common European Framework of Reference for Languages, the standard scale for language proficiency), with deterministic adherence to the target level. Off-the-shelf LLM output couldn’t be trusted to stay on-level: a passage requested at B1 would drift into B2 vocabulary, slip back to A2 grammar, or include culture-specific references that broke the assessment.
I took on the build on a speculative basis. The ESL department head championed the work internally and committed beta-trial time from students and faculty.
Approach
Phineas runs a six-step state machine that forces deterministic level adherence from a probabilistic model. The corpus grounding does the heavy lifting:
- Topic seed and CEFR target lock
- Word-frequency check against COCA (Corpus of Contemporary American English, over 1 billion words) for level-appropriate vocabulary
- Grammar constraint check against CEFR descriptors
- Draft generation with frozen vocabulary and grammar windows
- Self-review by a second agent against the CEFR rubric
- Final pass that re-checks frequency drift and outputs a confidence score
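As an illustration of the word-frequency step, here is a minimal sketch in Python. It is not the production check: the rank ceilings per level, the toy rank table, and the function name off_level_words are all assumptions made for this example, and a real build would load ranks from a corpus-derived frequency list.

```python
import re

# Hypothetical rank ceilings per CEFR level (illustrative numbers only).
RANK_CEILING = {"A2": 1000, "B1": 2000, "B2": 4000}

# Toy stand-in for a corpus frequency list: word -> rank (1 = most frequent).
TOY_RANKS = {"the": 1, "in": 6, "dog": 850, "runs": 900, "park": 1500, "ubiquitous": 9000}


def off_level_words(text, level, ranks=TOY_RANKS, ceilings=RANK_CEILING):
    """Return words that are rarer than the level ceiling or missing from the list."""
    ceiling = ceilings[level]
    flagged = []
    for word in re.findall(r"[a-z']+", text.lower()):
        rank = ranks.get(word)
        # Unknown words are treated as off-level rather than silently accepted.
        if (rank is None or rank > ceiling) and word not in flagged:
            flagged.append(word)
    return flagged
```

Treating unknown words as off-level is a deliberately conservative choice for the sketch: a passage fails the check until every word is accounted for.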
Each step has a defined input/output schema. The LLM produces probabilistic text inside hard guardrails.
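The six steps can be sketched as an explicit state machine in Python. This is a simplified illustration under stated assumptions, not the Phineas implementation: the Step names, the State fields, the toy VOCAB and MAX_SENTENCE_WORDS tables, and the generate callable (a stand-in for the LLM call) are all hypothetical. The point it shows is that the model only ever writes the draft, while every transition is decided by plain code.

```python
import re
from dataclasses import dataclass, field
from enum import Enum, auto


class Step(Enum):
    SEED_LOCK = auto()
    VOCAB_CHECK = auto()
    GRAMMAR_CHECK = auto()
    DRAFT = auto()
    REVIEW = auto()
    FINAL = auto()
    DONE = auto()
    FAILED = auto()


# Hypothetical per-level constraints; real ones would come from corpus and rubric data.
VOCAB = {"B1": {"a", "the", "dog", "runs", "in", "park"}}
MAX_SENTENCE_WORDS = {"B1": 12}


@dataclass
class State:
    topic: str
    level: str
    allowed: frozenset = frozenset()
    max_words: int = 0
    draft: str = ""
    attempts: int = 0
    confidence: float = 0.0
    history: list = field(default_factory=list)


def tokens(text):
    return re.findall(r"[a-z']+", text.lower())


def build_handlers(generate, max_attempts=3):
    """Map each step to a function that updates the state and names the next step."""
    def seed_lock(s):
        return Step.VOCAB_CHECK if s.topic and s.level in VOCAB else Step.FAILED

    def vocab_check(s):
        s.allowed = frozenset(VOCAB[s.level])  # freeze the vocabulary window
        return Step.GRAMMAR_CHECK

    def grammar_check(s):
        s.max_words = MAX_SENTENCE_WORDS[s.level]  # toy grammar constraint
        return Step.DRAFT

    def draft(s):
        s.attempts += 1
        s.draft = generate(s.topic, s.level, s.allowed)  # the only probabilistic step
        return Step.REVIEW

    def review(s):
        sentences = [x for x in re.split(r"[.!?]", s.draft) if x.strip()]
        ok = bool(tokens(s.draft)) and all(
            len(tokens(x)) <= s.max_words and set(tokens(x)) <= s.allowed
            for x in sentences
        )
        if ok:
            return Step.FINAL
        return Step.DRAFT if s.attempts < max_attempts else Step.FAILED

    def final(s):
        words = tokens(s.draft)
        # Placeholder score: share of on-list words. In this toy version the review
        # step already enforces the list, so it is 1.0 whenever FINAL is reached.
        s.confidence = sum(w in s.allowed for w in words) / len(words)
        return Step.DONE

    return {Step.SEED_LOCK: seed_lock, Step.VOCAB_CHECK: vocab_check,
            Step.GRAMMAR_CHECK: grammar_check, Step.DRAFT: draft,
            Step.REVIEW: review, Step.FINAL: final}


def run(state, handlers, max_steps=20):
    step = Step.SEED_LOCK
    for _ in range(max_steps):
        if step in (Step.DONE, Step.FAILED):
            break
        state.history.append(step.name)
        step = handlers[step](state)
    return step if step is Step.DONE else Step.FAILED


# Usage: a stub generator that drifts off-level once, then stays on-level.
drafts = iter(["The ubiquitous dog runs.", "The dog runs in the park."])
state = State(topic="pets", level="B1")
result = run(state, build_handlers(lambda topic, level, allowed: next(drafts)))
```

In the usage example the first draft is rejected at review and regenerated, so the retry loop and the attempt cap are what bound the model's drift. The second agent described in the list is reduced here to a rule check to keep the sketch self-contained.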
Outcome
Phineas is now live at phineas.app and in formal product development. The pilot exceeded its original accuracy KPI, and beta-user trials are running with students and faculty at the original academy.
The pattern (rubric-grounded multi-step state machine for deterministic LLM output) generalizes to any domain where the output needs to hit a precise level or category: medical literacy, legal writing, regulatory compliance, technical documentation tiered by audience.