Library Overview

46,552 documents, 5.0M passages, spanning the world's great religious traditions

The OceanLibrary.com collection, powered by SifterSearch, is a carefully curated set of sacred texts, scholarly works, and primary sources from the world's great religious traditions. Each document is processed, indexed, and made searchable through keyword search, semantic vector search, and a multi-step RAG enhancement pipeline. Learn about the search strategy →

Library Statistics

| Metric | Value |
| --- | --- |
| Total Documents | 46,552 |
| Total Passages | 4,959,424 |
| Religions | 12 |
| Collections | 102 |
| Languages | English, Arabic, Persian, Hebrew, Pali, Sanskrit, and others |
| Date Range | Ancient – Present |

Collections by Tradition

| Tradition | Documents |
| --- | --- |
| Baha'i | 39,750 |
| Judaism | 1,350 |
| Buddhist | 840 |
| Islam | 388 |
| Christian | 163 |
| Hindu | 115 |
| Zoroastrian | 101 |
| Tao | 93 |
| Confucian | 61 |
| Sikh | 39 |
| Jain | 24 |
| Jainism | 1 |

Highlighted Collections

Bahá'í Faith — 39,750 Documents

The most comprehensive collection in the library, spanning the writings of the Báb, Bahá'u'lláh, 'Abdu'l-Bahá, Shoghi Effendi, and the Universal House of Justice. Includes core tablets, authorized translations, pilgrim notes, administrative guidance, and scholarly papers.

Judaism — 1,350 Documents

Torah and Tanakh, Talmud and Mishnah, Midrash, Halakhic codes, Kabbalistic texts, medieval philosophy (Maimonides, Judah Halevi), Hasidic teachings, and Second Temple literature.

Buddhism — 840 Documents

The Pali Canon (Tipitaka) comprising suttas, vinaya, and abhidhamma; Mahayana sutras; Zen and Chan texts; and works on practice and meditation.

Islam — 388 Documents

The Qur'an with classical tafsir commentaries, all six canonical Sunni hadith collections, the four Shi'ah canonical works, major fiqh texts from all schools, Sufi classics including Ibn Arabi and Rumi, and the philosophical tradition from al-Kindi through Mulla Sadra. Primarily in classical Arabic with cross-language search capabilities.

Christianity — 163 Documents

Bible translations, Church Fathers, Eastern Orthodox texts, medieval theology, Reformation writings, mysticism and devotion, and modern theological works.

Other Traditions

  • Zoroastrian (101) — Avesta texts (Yasna, Gathas, Yashts, Vendidad) and Pahlavi literature
  • Tao (93) — Foundational classics (Daodejing, Zhuangzi), Daozang scriptures, alchemy and neidan texts
  • Hindu (115) — Vedas, Upanishads, epics, Puranas, philosophy, Bhakti texts, and Tagore
  • Confucian (61) — Five Classics, Four Books, classical commentary, Neo-Confucianism
  • Sikh (39) — Guru Granth Sahib, sacred poetry, history, and philosophy
  • Jain (24) — Agamas, philosophy, ethics, and biographical narratives

Want to contribute? We welcome submissions of high-quality texts. The Librarian agent reviews all submissions for quality, checks for duplicates, and helps prepare documents for indexing. Contact us to learn more.

Document Format

All documents in the library are stored as Markdown files with YAML frontmatter containing metadata. This ensures clean, portable, and easily searchable content.

Example Document Structure

---
title: "Some Answered Questions"
author: "'Abdu'l-Bahá"
translator: "Laura Clifford Barney"
date: "1908"
religion: "Bahá'í"
collection: "Core Publications"
---

# Some Answered Questions

## Part One: On the Influence of the Prophets

### 1. Nature Requires an Educator

Nature is that condition or reality which outwardly is...
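
To make the layout concrete, here is a minimal sketch of how such a file could be split into metadata and body. It is not the library's actual parser: the function name `split_frontmatter` is hypothetical, and it only handles flat `key: "value"` pairs like those in the example, where a real pipeline would more likely use a full YAML parser.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body).

    Simplification: only flat `key: "value"` lines are understood;
    nested YAML, lists, and multi-line values are ignored.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all

    # Find the closing `---` delimiter.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text  # unterminated frontmatter: treat everything as body

    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]  # drop the surrounding double quotes
        meta[key.strip()] = value

    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


SAMPLE = '''---
title: "Some Answered Questions"
author: "'Abdu'l-Bahá"
date: "1908"
---

# Some Answered Questions

Nature is that condition or reality which outwardly is...
'''

meta, body = split_frontmatter(SAMPLE)
print(meta["title"], "|", meta["author"], "|", meta["date"])
```

Keeping the metadata in a small, flat header is what makes fields such as religion, collection, and author easy to extract at ingestion time.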

Search Capabilities

Keyword Search

Traditional full-text search, powered by Meilisearch. It supports:

  • Exact phrase matching with quotes
  • Typo tolerance and fuzzy matching
  • Filtering by religion, collection, author, date
  • Highlighted snippets showing match context
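
As an illustration of how these features map onto a query, the sketch below builds a request body using Meilisearch's `q`, `filter`, `limit`, and `attributesToHighlight` search parameters. The helper name `build_search_payload`, the `text` field name, and the assumption that `religion`, `collection`, and `author` are configured as filterable attributes are all hypothetical; the actual SifterSearch index schema may differ.

```python
import json


def build_search_payload(query, religion=None, collection=None, author=None, limit=10):
    """Build a JSON-serializable body for a Meilisearch search request.

    Field names (religion, collection, author, text) are assumptions
    about the index schema, not confirmed by the library.
    """
    filters = []
    for field, value in (("religion", religion), ("collection", collection), ("author", author)):
        if value is not None:
            # Assumed escaping: backslash-escape backslashes and double quotes.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            filters.append(f'{field} = "{escaped}"')

    payload = {
        "q": query,                         # wrap a phrase in double quotes for exact matching
        "limit": limit,
        "attributesToHighlight": ["text"],  # ask for highlighted match context
    }
    if filters:
        payload["filter"] = " AND ".join(filters)
    return payload


payload = build_search_payload('"Most Great Peace"', religion="Islam", author="Rumi")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The payload would be sent to the index's search endpoint; no network call is made here.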

Semantic Search

Vector-based search using OpenAI embeddings (text-embedding-3-large, 3072 dimensions). It finds conceptually related passages even without keyword matches, and it works across language boundaries: English queries can find Arabic passages, and Persian queries can find English ones.

  • Understands synonyms and paraphrases
  • Cross-language concept matching (English ↔ Arabic, Persian, Hebrew, and others)
  • Theme and topic discovery
  • Question-answering capabilities
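
The idea behind vector search can be sketched with cosine similarity over toy vectors. The three-dimensional vectors and passage IDs below are made up for illustration (real embeddings here have 3072 dimensions), and the brute-force ranking stands in for whatever vector index the system actually uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage IDs ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), pid) for pid, vec in passage_vecs.items()]
    return [pid for _, pid in sorted(scored, reverse=True)]


# Hypothetical embeddings: two passages on the same theme in different
# languages sit close together; an unrelated passage points elsewhere.
passage_vecs = {
    "passage_english": [0.9, 0.1, 0.0],
    "passage_arabic": [0.8, 0.2, 0.1],
    "passage_unrelated": [0.0, 0.1, 0.9],
}
ranking = rank_passages([1.0, 0.1, 0.0], passage_vecs)
print(ranking)
```

Because similarity is computed in embedding space rather than over shared words, a passage in another language can rank highly as long as its vector points in a similar direction.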

Hybrid Search

Combines both approaches for best results. Keyword matches boost relevance for exact terms, while semantic search surfaces conceptually related passages you might have missed.
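
One simple way to picture the combination is a weighted blend of the two scores. This is only a sketch under stated assumptions: the source does not say how the scores are merged, the `semantic_ratio` parameter is hypothetical, and both score sets are assumed to be already normalized to the range 0 to 1.

```python
def blend_scores(keyword_scores, semantic_scores, semantic_ratio=0.5):
    """Blend two {passage_id: score} maps into one ranked list.

    semantic_ratio = 0.0 uses keyword scores only, 1.0 uses semantic
    scores only. A passage missing from one map scores 0 on that side.
    """
    ids = set(keyword_scores) | set(semantic_scores)
    blended = {
        pid: (1 - semantic_ratio) * keyword_scores.get(pid, 0.0)
        + semantic_ratio * semantic_scores.get(pid, 0.0)
        for pid in ids
    }
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)


keyword_scores = {"a": 1.0, "b": 0.2}   # "a" matches the exact terms
semantic_scores = {"b": 0.9, "c": 0.8}  # "b" and "c" are conceptually close
ranked = blend_scores(keyword_scores, semantic_scores, semantic_ratio=0.5)
print([pid for pid, _ in ranked])
```

In this toy example, passage "b" ranks first because it scores on both sides, while "c" still appears even though it shares no keywords with the query.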

Indexing Pipeline

Documents go through several processing stages before becoming searchable:

  1. Ingestion — Markdown files are parsed and metadata extracted
  2. Chunking — Long documents are split into paragraph-sized passages
  3. Embedding — Each passage gets a 3072-dimensional vector representation
  4. RAG Enhancement — Disambiguation, HyPE (hypothetical passage expansion), and entity extraction enrich each passage for higher-precision retrieval. Full details →
  5. Indexing — Passages are indexed in Meilisearch with vectors
  6. Quality Check — AI verifies metadata and flags potential issues
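
The chunking stage can be sketched as follows. This is an assumed approach, not the pipeline's actual chunker: it splits the Markdown body on blank lines, treats heading lines as context rather than passages, and attaches the current heading path to each paragraph. The function name `chunk_markdown` and the output field names are hypothetical.

```python
import re


def chunk_markdown(body: str) -> list[dict]:
    """Split a Markdown body into paragraph-sized passages.

    Headings are not emitted as passages; they are tracked so that
    each passage records the section it came from.
    """
    passages = []
    heading_path: list[str] = []
    for block in re.split(r"\n\s*\n", body.strip()):
        block = block.strip()
        if not block:
            continue
        match = re.match(r"^(#{1,6})\s+(.*)$", block)
        if match:
            level = len(match.group(1))
            # Keep ancestors above this level, then add the new heading.
            heading_path = heading_path[: level - 1] + [match.group(2).strip()]
            continue
        passages.append(
            {
                "index": len(passages),
                "headings": list(heading_path),
                "text": " ".join(block.split()),  # collapse internal line breaks
            }
        )
    return passages


BODY = """# Some Answered Questions

## Part One: On the Influence of the Prophets

### 1. Nature Requires an Educator

Nature is that condition or reality which outwardly is...
"""

chunks = chunk_markdown(BODY)
print(chunks[0]["headings"], "->", chunks[0]["text"])
```

Each resulting passage would then move on to the embedding and enhancement stages; carrying the heading path along is one way to preserve context that a lone paragraph would lose.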

Contributing Documents

We encourage the community to help expand the library. Here's how the contribution process works:

  1. Submit — Share a document URL, file, or text with the Librarian
  2. Analysis — The Librarian checks for duplicates and existing coverage
  3. Conversion — Documents are converted to clean Markdown format
  4. Metadata — AI enriches metadata (author, date, collection)
  5. Review — Admin reviews and approves for indexing
  6. Indexing — Document is processed and becomes searchable
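
How the Librarian detects duplicates is not described here. As one possible sketch, exact duplicates can be caught by hashing a normalized form of the text; the `fingerprint` and `is_duplicate` helpers below are hypothetical, and near-duplicates (for example, different editions or translations) would need a fuzzier comparison.

```python
import hashlib
import unicodedata


def fingerprint(text: str) -> str:
    """Hash a normalized form of the text.

    Normalization: Unicode NFKC, case folding, and collapsing all runs
    of whitespace, so trivial formatting differences do not matter.
    """
    normalized = unicodedata.normalize("NFKC", text).casefold()
    normalized = " ".join(normalized.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_duplicate(candidate: str, known_fingerprints: set[str]) -> bool:
    """True if the candidate's fingerprint is already in the library."""
    return fingerprint(candidate) in known_fingerprints


known = {fingerprint("Nature is that condition or reality which outwardly is...")}
print(is_duplicate("nature is that  condition\nor reality which outwardly is...", known))
print(is_duplicate("An entirely different submission.", known))
```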

Supported Formats

  • Markdown (.md)
  • Plain text (.txt)
  • PDF (text-based)
  • HTML
  • EPUB
  • Audio/Video URLs (via Transcriber agent)