About the Polyglot Concordance

This is a pilot. Current scope is the Gospel of Mark only. The full plan — additional NT books, the Hebrew Bible flagship, further witnesses, export formats, and two planned sub-projects — is laid out in the roadmap below.

A word-by-word alignment of the Gospel of Mark across three textual witnesses — Greek NT, Syriac Peshitta, and Latin Clementine Vulgate — with a one- or two-sentence apparatus note on every divergence.

Author: Jossi Fresco Benaim (ORCID 0009-0000-2026-0836)

Methodology

Every verse in Mark has been aligned word-by-word across the three witnesses by Anthropic's Claude Sonnet 4.5 via the Batch API. Each alignment group records a variant verdict (aligned, minor, major, omitted, or added), a semantic type (agreement, construction, harmonisation, substitution, idiom, etc.), and, for every non-aligned group, a one- to two-sentence note suitable for a critical apparatus.

Results are pre-computed and stored as per-verse JSON files in the repository — the running viewer has no runtime LLM dependency.
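As a rough illustration of what consuming one of these per-verse files could look like, the sketch below parses a hypothetical record and checks the rule stated above: every non-aligned group should carry an apparatus note. The field names (`verse`, `groups`, `greek`, `peshitta`, `vulgate`, `verdict`, `type`, `note`) and the placeholder tokens are assumptions made for this example; the repository's actual schema may differ.

```python
import json

# The five verdict values named in the methodology.
VERDICTS = {"aligned", "minor", "major", "omitted", "added"}

# Hypothetical per-verse record. Field names and tokens are placeholders,
# not the project's real schema or data.
SAMPLE = """
{
  "verse": "Mark 1:1",
  "groups": [
    {"greek": ["g1"], "peshitta": ["s1"], "vulgate": ["l1"],
     "verdict": "aligned", "type": "agreement", "note": ""},
    {"greek": ["g2"], "peshitta": ["s2"], "vulgate": ["l2"],
     "verdict": "minor", "type": "idiom", "note": "Placeholder note."},
    {"greek": ["g3"], "peshitta": [], "vulgate": ["l3"],
     "verdict": "omitted", "type": "construction", "note": ""}
  ]
}
"""


def validate_verse(record: dict) -> list[str]:
    """Return a list of problems found in one per-verse record."""
    problems = []
    for i, group in enumerate(record.get("groups", [])):
        verdict = group.get("verdict")
        if verdict not in VERDICTS:
            problems.append(f"group {i}: unknown verdict {verdict!r}")
        elif verdict != "aligned" and not (group.get("note") or "").strip():
            # Non-aligned groups are expected to have an apparatus note.
            problems.append(f"group {i}: {verdict} group has no apparatus note")
    return problems


record = json.loads(SAMPLE)
problems = validate_verse(record)
print(problems)
```

Because the files are static JSON, a check like this can run offline as part of a build step, with no model call involved.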

Methodology validation (and its limits)

We sanity-checked Claude's alignment reasoning against the Berean Interlinear Bible on the narrow sub-task of Greek→English word alignment. Across all 673 Mark verses (6175 scoring tokens), Claude agreed with Berean's scholarly glosses on 67.7% of tokens (4182 / 6175). Recorded 2026-04-23T06:26:59.983024+00:00 using claude-sonnet-4-5.

The remaining ~32% of tokens are non-matches. Most of these reflect translation word-choice differences between Berean and WEB (the two English translations involved), e.g. Berean “crowd” vs. WEB “people” for the same Greek ὄχλος, rather than alignment errors.
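The headline figure is plain token-level arithmetic. The sketch below recomputes it from the two reported counts and shows, on toy data, why a word-choice difference registers as a non-match. The function names and the exact-string matching rule are assumptions for illustration; the project's actual scorer may work differently.

```python
def agreement_rate(matched: int, total: int) -> float:
    """Fraction of scored tokens on which the two alignments agree."""
    if total <= 0:
        raise ValueError("total must be positive")
    return matched / total


def count_matches(pairs) -> int:
    """Count gloss pairs that agree, ignoring case and surrounding spaces.

    Exact string matching is an assumption made for this sketch.
    """
    return sum(1 for a, b in pairs if a.strip().lower() == b.strip().lower())


# Toy pairs (not project data): "crowd" vs. "people" counts as a non-match
# even though both render the same Greek word.
toy = [("the", "the"), ("crowd", "people"), ("Jesus", "jesus")]
toy_matches = count_matches(toy)

# Counts reported in the validation run.
rate = agreement_rate(4182, 6175)
summary = f"{rate:.1%} agreement, {1 - rate:.1%} non-matching"
print(toy_matches, summary)
```

Under strict string matching, synonyms are penalised just like real misalignments, which is why the non-matching share overstates the alignment error rate.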

What this means: it's a transferable sanity check, not a correctness guarantee. Claude's Greek→English alignment logic is consistent with an established scholarly interlinear, which rules out catastrophic failure modes and suggests that the three-way Greek / Peshitta / Vulgate alignments in this viewer rest on similarly sound reasoning. It does not directly measure the quality of the Peshitta or Vulgate alignments (no ground-truth reference exists), nor does it measure apparatus-note quality, variant classification, or type-tag correctness.

Berean is used only as a methodology benchmark — never as display data in the viewer.

Viewer inspiration

The parallel-columns concept and the green / red variant color-coding (minor vs. major) are inspired by the bible-mt5 parallel viewer built by Dr. Zhan Chen (Associate Professor, Digital Social Science & Associate Distinguished Research Fellow at the Research Centre for History and Culture, United International College — BNU-HKBU UIC, Zhuhai). Dr. Chen's own scholarship focuses on Syriac biblical texts (dissertation: An Investigation into the Peshitta of Isaiah, Harvard NELC, 2020) and Chinese Bible translations.

Data sources

Greek NT (tagged)
STEP Bible TAGNT from Tyndale House Cambridge, CC BY 4.0, via github.com/STEPBible/STEPBible-Data
Peshitta NT
Via the Aramaic Root Atlas corpus (Jossi Fresco Benaim) — Syriac text, CSV-packaged.
Clementine Vulgate
Public domain, via seven1m/open-bibles (USFX)
English verse gloss
World English Bible (WEB), public domain
Peshitta roots, cognates, and sister-roots
Aramaic Root Atlas (Jossi Fresco Benaim) — triliteral-root extraction, sister-root detection, and Hebrew / Arabic cognate mapping. Used by this viewer's click-for-tooltip layer on Peshitta tokens.
Alignment validation benchmark
Berean Interlinear Bible, used for the methodology check only (67.7% token agreement across all 673 Mark verses).
Alignment generation
Anthropic Claude Sonnet 4.5

Related works

Peshitta Constellations
peshitta.onrender.com — companion project exploring the Peshitta corpus.
Aramaic Root Atlas
aramaic-root-atlas.onrender.com — cross-corpus Semitic triliteral root explorer with Hebrew and Arabic cognate mapping. Provides the Peshitta root data consumed by this viewer's tooltip layer.
BibCrit
bibcrit.app — biblical criticism workspace for textual analysis.

What this app is good for

The underlying engine is a multi-witness parallel-text viewer with a critical apparatus. Anywhere a text exists in more than one version, this tool can show it side-by-side with per-word alignment and annotated divergences.

Biblical scholarship

Teaching

Translation work

Reader-facing discovery

Roadmap

Up next

Expand book coverage

New Testament (continuing the current scope):

Hebrew Bible / Tanakh (a separate flagship):

Deuterocanonical / Apocrypha:

Expand witness coverage (within current books)

Viewer features

Export

Sub-projects

Versification

Verse labels follow NA28 numbering. The Clementine Vulgate uses a different verse split in some chapters; where necessary (notably Mark 9 and Mark 4:40–41) we've remapped the Vulgate text to the NA28 boundaries. See the repository's known-issues.md for the complete list of edits.
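A remapping of this kind can be expressed as a lookup from source verse labels to NA28 labels. The sketch below is a minimal illustration under stated assumptions: the `REMAP` entries, the `Ref` type, and both function names are hypothetical placeholders, not the project's actual edits, which are the ones listed in known-issues.md.

```python
from typing import NamedTuple


class Ref(NamedTuple):
    chapter: int
    verse: int


# Hypothetical placeholder entries (Vulgate label -> NA28 label).
# They only illustrate two shapes of edit: a merge and a boundary move.
REMAP = {
    Ref(4, 41): Ref(4, 40),  # two source verses folded into one label
    Ref(8, 39): Ref(9, 1),   # a verse moved across a chapter boundary
}


def to_na28(vulgate_ref: Ref) -> Ref:
    """Return the NA28 label for a Vulgate label; unmapped refs pass through."""
    return REMAP.get(vulgate_ref, vulgate_ref)


def relabel(verses: dict[Ref, str]) -> dict[Ref, str]:
    """Re-key verse texts by NA28 label, joining texts that share a label."""
    out: dict[Ref, str] = {}
    for ref in sorted(verses):  # source order keeps joined text in sequence
        target = to_na28(ref)
        if target in out:
            out[target] = out[target] + " " + verses[ref]
        else:
            out[target] = verses[ref]
    return out


# Toy texts, not real verse content.
toy = {Ref(4, 40): "a", Ref(4, 41): "b", Ref(8, 39): "c"}
relabelled = relabel(toy)
print(relabelled)
```

Keeping the table as data rather than editing the source text in place means each remapped verse can be traced back to its original Vulgate label.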

License & reuse

The viewer code is intended to be released under an open-source license once the project is publicly funded. The derived alignment JSON is produced from sources with mixed licenses (CC BY 4.0, public domain); any future redistribution will credit the upstream sources.

Contact

Feedback welcome — jossi@somosunodigital.com.