
How These Conversations Are Assessed

This is a beta. Every published dialog carries a visible assessment, so you can read the conversation and the assessment side by side and decide whether the assessment is right. If you find a blind spot, that is the most useful kind of feedback, because it improves the criteria themselves, not just Jafar.

What we’re trying to make

Realistic, compelling conversations between two friends — one curious and informed, one (Jafar) deeply versed in the corpus and the prophetic traditions. Short turns. Evidence pulled directly from each tradition’s primary doctrinal texts. Conversational language throughout.

The hardest constraint is fidelity. Modern training data is saturated with secular-humanist framing; without strict anchoring, the AI will silently translate doctrines into terms that feel palatable to contemporary ears but misrepresent what the traditions actually teach. The assessment is calibrated to detect that drift.

The pipeline

Each conversation is generated by three roles:

Per-tradition primary texts

When the question concerns a specific tradition, evidence comes from that tradition’s own primary doctrinal corpus, not general scholarly commentary.

Tradition: Primary doctrinal texts
Bahá’í: Kitáb-i-Íqán (Bahá’u’lláh), Some Answered Questions (’Abdu’l-Bahá), Shoghi Effendi’s writings & translations, Aqdas, Hidden Words, Gleanings, Gems of Divine Mysteries
Christianity: The Gospels (Matthew, Mark, Luke, John); secondarily Pauline letters and Acts
Islam: The Qur’án; secondarily the recognized Hadith collections
Judaism: The Tanakh (Torah, Nevi’im, Ketuvim); secondarily the Talmud
Buddhism: The Pali Canon (Dhammapada, Sutta Piṭaka), the major Mahāyāna sutras
Hinduism: The Upanishads, the Bhagavad Gita, the Vedas
Sikhism: The Guru Granth Sahib

Failure modes the assessment watches for

essay-tone
Replies open like academic essays; no friend speaks this way.
secular-drift
Softens a doctrine into secular-humanist palatability ("doesn’t require a religious framework").
period-word-import
Uses words like "progressive" without marking the period sense, letting modern political connotations leak in.
missing-primary-citation
States a doctrinal claim without quoting the primary text where it lives.
secondary-substitution
Quotes scholarly commentary or family memoirs in place of the primary scripture.
hedge-without-position
When pushed to commit, retreats to "both perspectives offer valuable insights."
stock-phrase-reflex
Reaches for stock Bahá’í-discourse phrases ("transformative force," "diversity within unity") instead of speaking specifically.
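As a rough illustration of how a few of these flags could be pre-screened, here is a minimal sketch that matches only the sample phrases quoted above. It is an assumption for illustration, not the assessment actually used on this site: `PHRASE_FLAGS`, `normalize`, and `flag_reply` are hypothetical names, and simple phrase matching cannot catch modes like essay-tone or missing-primary-citation.

```python
# Hypothetical phrase lists, seeded only with the examples quoted in the
# failure-mode descriptions above. A real assessor would need far more.
PHRASE_FLAGS = {
    "secular-drift": ["doesn't require a religious framework"],
    "hedge-without-position": ["both perspectives offer valuable insights"],
    "stock-phrase-reflex": ["transformative force", "diversity within unity"],
}


def normalize(text: str) -> str:
    """Lowercase and straighten curly apostrophes so quoted phrases match."""
    return text.lower().replace("\u2019", "'")


def flag_reply(reply: str) -> list[str]:
    """Return the failure-mode names whose sample phrases appear in reply."""
    haystack = normalize(reply)
    return sorted(
        name
        for name, phrases in PHRASE_FLAGS.items()
        if any(normalize(phrase) in haystack for phrase in phrases)
    )


if __name__ == "__main__":
    sample = "Honestly, both perspectives offer valuable insights."
    print(flag_reply(sample))
```

A check like this can only suggest where to look; whether a reply really hedges or drifts still needs a reader's judgment.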

Assessment criteria — version history

The criteria themselves evolve. Latest version expanded; previous versions collapsed for reference.

v2 2026-04-28 Adds conversational_realism, doctrinal_fidelity, period_word_discipline, brevity_discipline; structured assessment object with narrative + flags + improvement_plan.
  • depth — does the conversation actually go deep?
  • conversational_realism — does this read like two friends talking?
  • doctrinal_fidelity — does Jafar reflect the tradition's actual self-understanding from primary doctrinal texts, or soften into secular-humanist palatability?
  • period_word_discipline — does Jafar avoid letting words like "progressive," "liberal," "spiritual," "freedom" silently import their modern political/materialistic connotations?
  • evidence_quality — primary-tier citations? properly attributed? primary scripture for doctrinal claims?
  • brevity_discipline — replies brief by default, no essay paragraphs?
  • archive_worthy — would a thoughtful believer send this to another thoughtful believer, confident it represents the Faith well?
v1 2026-04-27 Initial: depth, clarity, stereotype_avoidance, word_definition_questioning, assumption_questioning, teaching_clarity, evidence_quality, conversational_naturalness, believer_voice, archive_worthy. No structured improvement_plan; no period-word dimension; no doctrinal_fidelity dimension.
  • depth, clarity, stereotype_avoidance
  • word_definition_questioning, assumption_questioning
  • teaching_clarity, evidence_quality
  • conversational_naturalness, believer_voice
  • archive_worthy
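The v2 entry above mentions a structured assessment object with narrative, flags, and improvement_plan. One way such an object might be shaped is sketched below. Only those three field names and the criterion names come from this page; the `Assessment` class, the `criteria_notes` and `criteria_version` fields, and the `missing_criteria` helper are assumptions added for illustration.

```python
from dataclasses import dataclass, field

# Criterion names copied from the v2 entry above.
V2_CRITERIA = (
    "depth",
    "conversational_realism",
    "doctrinal_fidelity",
    "period_word_discipline",
    "evidence_quality",
    "brevity_discipline",
    "archive_worthy",
)


@dataclass
class Assessment:
    """Hypothetical shape for one conversation's assessment."""

    narrative: str
    # Failure-mode names such as "secular-drift" (assumed content of flags).
    flags: list[str] = field(default_factory=list)
    improvement_plan: list[str] = field(default_factory=list)
    # Assumption: one free-text note per criterion.
    criteria_notes: dict[str, str] = field(default_factory=dict)
    criteria_version: str = "v2"

    def missing_criteria(self) -> list[str]:
        """List v2 criteria that have no note yet, in their listed order."""
        return [name for name in V2_CRITERIA if name not in self.criteria_notes]


if __name__ == "__main__":
    draft = Assessment(
        narrative="Short turns, but one reply hedges when pressed.",
        flags=["hedge-without-position"],
        criteria_notes={"depth": "Goes past the surface question."},
    )
    print(draft.missing_criteria())
```

Recording the criteria version on each object would let older conversations keep the assessment they were published with while the criteria continue to change.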

Jafar’s soul — version history

The system prompt that defines Jafar’s voice and posture is itself part of what’s being iterated. Each version is the actual prompt text used for the conversations published while that version was current.

Jafar's avatar — candidates

The default avatar is the Arabic letter jīm (ج) on a burnished gold disc. We have four candidate visual personifications under consideration. Pick a favorite and the rendering will switch site-wide.

A — Calligraphic jīm
B — Wise figure in profile
C — Illuminated book
D — Classical lamp

How to dispute an assessment

If you read a conversation and disagree with its assessment, that is the most valuable kind of feedback. Your disagreement surfaces blind spots in the criteria themselves, not just in Jafar. Tell us what we got wrong, and the next iteration of the criteria will reflect it.