Library Overview

160,195 documents, 6.8M passages, spanning the world's great religious traditions

The OceanLibrary.com collection, powered by SifterSearch, is a curated library of sacred texts, scholarly works, and primary sources from the world's great religious traditions. Each document is processed, indexed, and made searchable through keyword search, semantic vector search, and a multi-step RAG enhancement pipeline. Learn about the search strategy →

Library Statistics

Metric	Count
Total Documents	160,195
Total Passages	6,808,498
Religions	12
Collections	105
Languages	English, Arabic, Persian, Hebrew, Pali, Sanskrit, and others
Date Range	Ancient – Present

Collections by Tradition

Tradition	Documents
Bahá'í	31,973
Judaism	1,385
Buddhist	858
Islam	468
Christian	203
Hindu	127
Zoroastrian	119
Tao	96
Confucian	64
Sikh	39
Jain	24
Jainism	5

Highlighted Collections

Bahá'í Faith — 31,973 Documents

The largest collection in the library, spanning the writings of the Báb, Bahá'u'lláh, 'Abdu'l-Bahá, Shoghi Effendi, and the Universal House of Justice. It includes core tablets, authorized translations, pilgrim notes, administrative guidance, and scholarly papers.

Judaism — 1,385 Documents

Torah and Tanakh, Talmud and Mishnah, Midrash, Halakhic codes, Kabbalistic texts, medieval philosophy (Maimonides, Judah Halevi), Hasidic teachings, and Second Temple literature.

Buddhism — 858 Documents

The Pali Canon (Tipitaka) comprising suttas, vinaya, and abhidhamma; Mahayana sutras; Zen and Chan texts; and works on practice and meditation.

Islam — 468 Documents

The Qur'an with classical tafsir commentaries, all six canonical Sunni hadith collections, the four Shi'ah canonical works, major fiqh texts from all schools, Sufi classics including Ibn Arabi and Rumi, and the philosophical tradition from al-Kindi through Mulla Sadra. Primarily in classical Arabic with cross-language search capabilities.

Christianity — 203 Documents

Bible translations, Church Fathers, Eastern Orthodox texts, medieval theology, Reformation writings, mysticism and devotion, and modern theological works.

Other Traditions

Zoroastrian (119) — Avesta texts (Yasna, Gathas, Yashts, Vendidad) and Pahlavi literature
Tao (96) — Foundational classics (Daodejing, Zhuangzi), Daozang scriptures, alchemy and neidan texts
Hindu (127) — Vedas, Upanishads, epics, Puranas, philosophy, Bhakti texts, and Tagore
Confucian (64) — Five Classics, Four Books, classical commentary, Neo-Confucianism
Sikh (39) — Guru Granth Sahib, sacred poetry, history, and philosophy
Jain (24) — Agamas, philosophy, ethics, and biographical narratives

Want to contribute? We welcome submissions of high-quality texts. The Librarian agent reviews all submissions for quality, checks for duplicates, and helps prepare documents for indexing. Contact us to learn more.

Document Format

All documents in the library are stored as Markdown files with YAML frontmatter that holds the metadata. Keeping the text and its metadata together in one plain-text file keeps the content clean, portable, and easy to search.

Example Document Structure

---
title: "Some Answered Questions"
author: "'Abdu'l-Bahá"
translator: "Laura Clifford Barney"
date: "1908"
religion: "Bahá'í"
collection: "Core Publications"
---

# Some Answered Questions

## Part One: On the Influence of the Prophets

### 1. Nature Requires an Educator

Nature is that condition or reality which outwardly is...
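
To illustrate how a file in this format can be read, here is a minimal Python sketch that separates the frontmatter from the Markdown body. The function name parse_document is hypothetical, and the parser handles only flat key: "value" pairs like those in the example above. It is not a full YAML parser, and it is not necessarily how the library's ingestion step works.

```python
def parse_document(text: str) -> tuple[dict[str, str], str]:
    """Split a Markdown document into (metadata, body).

    Handles only flat `key: "value"` lines. A production pipeline would
    more likely rely on a complete YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    try:
        # Index of the line that closes the frontmatter block.
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # unterminated frontmatter: treat everything as body

    metadata: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        # Drop one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        metadata[key.strip()] = value

    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


sample = '''---
title: "Some Answered Questions"
author: "'Abdu'l-Bahá"
date: "1908"
---

# Some Answered Questions
'''

metadata, body = parse_document(sample)
print(metadata["title"], "by", metadata["author"])
print(body)
```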

Search Capabilities

Keyword Search

Traditional full-text search using Meilisearch. Supports:

Exact phrase matching with quotes
Typo tolerance and fuzzy matching
Filtering by religion, collection, author, date
Highlighted snippets showing match context
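
The features above map onto parameters of Meilisearch's search endpoint (q, filter, limit, attributesToHighlight). The Python sketch below builds such a request body without sending it. The helper name build_search_payload is hypothetical, and it assumes that the frontmatter fields religion and author are configured as filterable attributes and that passage text lives in a field called text. None of that is confirmed by this page.

```python
import json


def build_search_payload(query, *, exact_phrase=False, religion=None,
                         author=None, limit=10):
    """Build a JSON body for a Meilisearch search request (sketch only)."""
    # Wrapping the query in double quotes asks for an exact phrase match.
    q = f'"{query}"' if exact_phrase else query

    # Filter values are assumed not to contain double quotes themselves.
    filters = []
    if religion:
        filters.append(f'religion = "{religion}"')
    if author:
        filters.append(f'author = "{author}"')

    payload = {
        "q": q,
        "limit": limit,
        # "text" is an assumed field name for the passage content.
        "attributesToHighlight": ["text"],
    }
    if filters:
        payload["filter"] = " AND ".join(filters)
    return payload


payload = build_search_payload(
    "nature requires an educator", exact_phrase=True, religion="Bahá'í"
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Typo tolerance needs no parameter here, since Meilisearch applies it by default to queries that are not quoted phrases.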

Semantic Search

Vector-based search using OpenAI embeddings (text-embedding-3-large, 3072 dimensions). Finds conceptually related passages even without keyword matches — and works across language boundaries. English queries find Arabic passages; Persian queries find English ones.

Understands synonyms and paraphrases
Cross-language concept matching (English ↔ Arabic, Persian, Hebrew, and others)
Theme and topic discovery
Question-answering capabilities
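
As a rough illustration of vector search, the Python sketch below ranks passages by cosine similarity to a query vector. It uses toy 3-dimensional vectors in place of real 3072-dimensional embeddings, and the passage IDs are invented. Cosine similarity is a common choice for comparing embeddings, but this page does not state which metric the search engine actually uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage_id, score) pairs, most similar first."""
    scored = [
        (pid, cosine_similarity(query_vec, vec))
        for pid, vec in passage_vecs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy vectors: two passages on the same theme in different languages,
# and one passage on an unrelated theme. IDs are hypothetical.
passages = {
    "en-unity": [0.9, 0.1, 0.0],
    "ar-unity": [0.8, 0.2, 0.1],
    "en-law": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

ranking = rank_passages(query, passages)
for pid, score in ranking:
    print(pid, round(score, 3))
```

Because passages in different languages share one vector space, the Arabic passage on the same theme scores close to the English one even though the query shares no words with it.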

Hybrid Search

Combines both approaches for best results. Keyword matches boost relevance for exact terms, while semantic search surfaces conceptually related passages you might have missed.
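
This page does not say how the two result lists are merged. One widely used option is reciprocal rank fusion (RRF), sketched below in Python as an illustration only. The function name, the passage IDs, and the constant k = 60 are assumptions, not a description of the production system.

```python
def reciprocal_rank_fusion(keyword_ids, semantic_ids, k=60):
    """Merge two ranked ID lists; each list adds 1 / (k + rank) per item."""
    scores = {}
    for ranked in (keyword_ids, semantic_ids):
        for rank, pid in enumerate(ranked, start=1):
            scores[pid] = scores.get(pid, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda pid: scores[pid], reverse=True)


keyword_hits = ["p1", "p2", "p3"]   # ordered by keyword relevance
semantic_hits = ["p2", "p4", "p1"]  # ordered by vector similarity

fused = reciprocal_rank_fusion(keyword_hits, semantic_hits)
print(fused)
```

Passages that appear in both lists (p1 and p2 here) rise to the top, while passages found by only one method still make it into the merged results.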

Indexing Pipeline

Documents go through several processing stages before becoming searchable:

Ingestion — Markdown files are parsed and metadata extracted
Chunking — Long documents are split into paragraph-sized passages
Embedding — Each passage gets a 3072-dimensional vector representation
RAG Enhancement — Disambiguation, HyPE (hypothetical passage expansion), and entity extraction enrich each passage for higher-precision retrieval. Full details →
Indexing — Passages are indexed in Meilisearch with vectors
Quality Check — AI verifies metadata and flags potential issues
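
The chunking stage can be pictured with a short Python sketch. It splits a Markdown body on blank lines and records the nearest preceding heading with each passage. The function name and the output shape are hypothetical, and the real pipeline may apply further rules such as size limits.

```python
import re


def chunk_into_passages(body):
    """Split Markdown text into paragraph passages tagged with their heading."""
    passages = []
    heading = ""
    # Paragraphs and headings are assumed to be separated by blank lines.
    for block in re.split(r"\n\s*\n", body.strip()):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            # Remember the heading text; it is not a passage itself.
            heading = block.lstrip("#").strip()
            continue
        # Collapse hard-wrapped lines into a single line of text.
        passages.append({"heading": heading, "text": " ".join(block.split())})
    return passages


body = """# Some Answered Questions

## Part One

First paragraph.

Second paragraph
continues here.
"""

chunks = chunk_into_passages(body)
for chunk in chunks:
    print(chunk)
```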

Contributing Documents

We encourage the community to help expand the library. Here's how the contribution process works:

Submit — Share a document URL, file, or text with the Librarian
Analysis — The Librarian checks for duplicates and existing coverage
Conversion — Documents are converted to clean Markdown format
Metadata — AI enriches metadata (author, date, collection)
Review — Admin reviews and approves for indexing
Indexing — Document is processed and becomes searchable
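
How the Librarian detects duplicates is not described here. As one simple possibility, the Python sketch below fingerprints a text after normalizing case, punctuation, and whitespace, then compares fingerprints. All names are hypothetical, and this approach catches only exact matches after normalization. Near-duplicates such as differing translations or editions would need a fuzzier method.

```python
import hashlib
import re
import unicodedata


def fingerprint(text):
    """Hash of the text after case, punctuation, and whitespace normalization."""
    normalized = unicodedata.normalize("NFKC", text).casefold()
    normalized = re.sub(r"[^\w\s]", "", normalized)  # drop punctuation
    normalized = " ".join(normalized.split())        # collapse whitespace
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_duplicate(candidate, known_fingerprints):
    """True if the candidate's fingerprint is already in the known set."""
    return fingerprint(candidate) in known_fingerprints


known = {fingerprint("Nature requires an educator.")}

print(is_duplicate("nature  requires an Educator", known))
print(is_duplicate("A different passage entirely.", known))
```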

Supported Formats

Markdown (.md)
Plain text (.txt)
PDF (text-based)
HTML
EPUB
Audio/Video URLs (via Transcriber agent)