✦ how the score works

What “interesting” actually measures.

Every answer page can show an ✦ INTERESTING badge, and the archive ranks questions by it. Here's how the score is computed. No human votes feed into it: it's derived entirely from what the substrate did when it tried to answer.

The formula

Each answer starts at 50 and collects adjustments. The final score is clamped to 0–100. Discrete components set the answer's broad shape; continuous substrate signals (top1, spread10, prose length) differentiate answers within the same shape, so the sort never collapses to ties. Anything ≥ 80 gets the ✦ badge.

signal	weight
baseline (every answer starts here)	+50
mode = objection (substrate reframed)	+12
verdict = grounded (citations resolved)	+8
verdict = refused (honest silence)	−22
verdict = ungrounded (cited outside the workspace)	−8
alien + grounded combo (disagreed but had reasons)	+2
signal.top1 (substrate confidence in its #1 passage)	0 to +14
signal.spread10 (gap between top-1 and top-10 retrievals)	0 to +18
prose length (sweet spot ~300–600 chars)	0 to +12
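
The table above can be read as a simple sum followed by a clamp. The sketch below is one way to write that down, not the actual contents of webapp/lib/interestingness.ts. The type and function names (AnswerShape, ContinuousPoints, interestingness, hasBadge) are hypothetical, signal.alien is assumed to be a boolean, and the three continuous signals are assumed to arrive already scaled to their point ranges.

```typescript
// Hypothetical shapes: field names follow the table, but the real
// types in webapp/lib/interestingness.ts may differ.
interface AnswerShape {
  mode: string;    // "objection" earns the reframing bonus
  verdict: string; // "grounded" | "refused" | "ungrounded" | anything else
  alien: boolean;  // assumed boolean form of signal.alien
}

// Assumption: each continuous signal is already converted to points.
interface ContinuousPoints {
  top1: number;        // 0 to 14
  spread10: number;    // 0 to 18
  proseLength: number; // 0 to 12
}

const BADGE_THRESHOLD = 80;

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

function interestingness(shape: AnswerShape, points: ContinuousPoints): number {
  let score = 50; // baseline: every answer starts here

  // Discrete components set the broad shape.
  if (shape.mode === "objection") score += 12;
  if (shape.verdict === "grounded") score += 8;
  if (shape.verdict === "refused") score -= 22;
  if (shape.verdict === "ungrounded") score -= 8;
  if (shape.alien && shape.verdict === "grounded") score += 2;

  // Continuous signals differentiate answers within the same shape.
  score += clamp(points.top1, 0, 14);
  score += clamp(points.spread10, 0, 18);
  score += clamp(points.proseLength, 0, 12);

  return clamp(score, 0, 100);
}

function hasBadge(score: number): boolean {
  return score >= BADGE_THRESHOLD;
}
```

Under these assumptions, an objection-mode, grounded answer with 7, 9, and 6 continuous points scores 50 + 12 + 8 + 7 + 9 + 6 = 92 and earns the badge. A refusal with no continuous points scores 50 − 22 = 28.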

Why those weights

Objection mode (+12) is the biggest discrete bonus because it captures the editorial distinction the site is built on: when a mind reframes, it's doing something a chatbot wouldn't.

Grounded (+8) rewards answers whose citations resolve back to the retrieved workspace — no hallucinated verses.

Refused (−22) is the largest penalty because a silent answer is honest but not editorially rich. Refusals still appear in the archive; they just sort lower.

Signal spread is the gap between the substrate's top-1 retrieval and its top-10. A wide spread means the substrate has opinions: it cared about specific passages, not just “eh, these all kinda fit.” The signal is scaled continuously across the observed 0.006–0.045 range, so two answers with the same discrete shape can still differ.
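
As an illustration of that scaling, here is a minimal sketch that maps a raw spread onto the 0 to +18 points listed in the table. It assumes a linear map with values outside the observed range pinned to the ends. The source says only that the scaling is continuous, so the real curve may differ. The name spreadPoints and the three constants are hypothetical.

```typescript
// Observed range and maximum weight, as stated in the text and table.
const SPREAD_MIN = 0.006;
const SPREAD_MAX = 0.045;
const SPREAD_WEIGHT = 18;

// Assumption: linear interpolation between the observed bounds.
function spreadPoints(spread10: number): number {
  const t = (spread10 - SPREAD_MIN) / (SPREAD_MAX - SPREAD_MIN);
  const bounded = Math.min(1, Math.max(0, t)); // pin out-of-range values
  return bounded * SPREAD_WEIGHT;
}
```

With this mapping, a spread at the bottom of the observed range contributes nothing, one at the top contributes the full 18, and one halfway between contributes about 9.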

Top1 confidence is the substrate's similarity to its #1 chosen passage. High top1 = the substrate found a clear match rather than a fuzzy cluster.

What the score does NOT measure

Visitor likes, votes, or comments. (Not yet wired in.)

Whether the answer is “correct.” Each mind speaks only from its own corpus; correctness across worldviews is the visitor's judgment, not the system's.

How spicy / controversial the question is. Wholly canonical questions and modern dilemmas use the same formula.

Recency. A great answer that landed three months ago scores the same as one that landed yesterday.

Where the code lives

The scoring function is pure and lives in a single file: webapp/lib/interestingness.ts. Adjust the weights, rebuild, and the badge threshold and archive rankings update everywhere.

The substrate-side signals it consumes — mode, verdict, signal.spread10, signal.alien — are written into each answer JSON at generation time, in boundary/answer_questions.py.
