Library Overview
46,552 documents, 5.0M passages, spanning the world's great religious traditions
The OceanLibrary.com collection, powered by SifterSearch, is a curated set of sacred texts, scholarly works, and primary sources from the world's great religious traditions. Each document is processed, indexed, and made searchable through keyword search, semantic vector search, and a multi-step RAG enhancement pipeline.
Library Statistics
| Metric | Count |
|---|---|
| Total Documents | 46,552 |
| Total Passages | 4,959,424 |
| Religions | 12 |
| Collections | 102 |
| Languages | English, Arabic, Persian, Hebrew, Pali, Sanskrit, and others |
| Date Range | Ancient – Present |
Collections by Tradition
| Tradition | Documents |
|---|---|
| Baha'i | 39,750 |
| Judaism | 1,350 |
| Buddhist | 840 |
| Islam | 388 |
| Christian | 163 |
| Hindu | 115 |
| Zoroastrian | 101 |
| Tao | 93 |
| Confucian | 61 |
| Sikh | 39 |
| Jain | 24 |
| Jainism | 1 |
Highlighted Collections
Bahá'í Faith — 39,750 Documents
The most comprehensive collection in the library, spanning the writings of the Báb, Bahá'u'lláh, 'Abdu'l-Bahá, Shoghi Effendi, and the Universal House of Justice. Includes core tablets, authorized translations, pilgrim notes, administrative guidance, and scholarly papers.
Judaism — 1,350 Documents
Torah and Tanakh, Talmud and Mishnah, Midrash, Halakhic codes, Kabbalistic texts, medieval philosophy (Maimonides, Judah Halevi), Hasidic teachings, and Second Temple literature.
Buddhism — 840 Documents
The Pali Canon (Tipitaka) comprising suttas, vinaya, and abhidhamma; Mahayana sutras; Zen and Chan texts; and works on practice and meditation.
Islam — 388 Documents
The Qur'an with classical tafsir commentaries, all six canonical Sunni hadith collections, the four Shi'ah canonical works, major fiqh texts from all schools, Sufi classics including Ibn Arabi and Rumi, and the philosophical tradition from al-Kindi through Mulla Sadra. Primarily in classical Arabic with cross-language search capabilities.
Christianity — 163 Documents
Bible translations, Church Fathers, Eastern Orthodox texts, medieval theology, Reformation writings, mysticism and devotion, and modern theological works.
Other Traditions
- Zoroastrian (101) — Avesta texts (Yasna, Gathas, Yashts, Vendidad) and Pahlavi literature
- Tao (93) — Foundational classics (Daodejing, Zhuangzi), Daozang scriptures, alchemy and neidan texts
- Hindu (115) — Vedas, Upanishads, epics, Puranas, philosophy, Bhakti texts, and Tagore
- Confucian (61) — Five Classics, Four Books, classical commentary, Neo-Confucianism
- Sikh (39) — Guru Granth Sahib, sacred poetry, history, and philosophy
- Jain (24) — Agamas, philosophy, ethics, and biographical narratives
Document Format
All documents in the library are stored as Markdown files with YAML frontmatter containing metadata. This keeps the content clean, portable, and easy to search.
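As a rough illustration of how such a file can be read, the sketch below splits a document into its frontmatter fields and its Markdown body. The function name `split_frontmatter` is hypothetical, and the parser only handles flat `key: "value"` pairs like those in the example that follows; it is not the library's actual ingestion code, and a full YAML parser would be needed for nested or multi-line values.

```python
from __future__ import annotations


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body).

    Assumption: frontmatter is a block of flat `key: "value"` lines
    delimited by `---` at the very top of the file.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter present

    # Find the closing `---` delimiter.
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        return {}, text  # unterminated frontmatter: treat as plain body

    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        # Strip one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        meta[key.strip()] = value

    body = "\n".join(lines[end + 1:])
    return meta, body
```

Called on a file that starts with the frontmatter shown in the example, this would return a dictionary with keys such as `title`, `author`, and `religion`, plus the Markdown body as a string.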
Example Document Structure
---
title: "Some Answered Questions"
author: "'Abdu'l-Bahá"
translator: "Laura Clifford Barney"
date: "1908"
religion: "Bahá'í"
collection: "Core Publications"
---
# Some Answered Questions
## Part One: On the Influence of the Prophets
### 1. Nature Requires an Educator
Nature is that condition or reality which outwardly is...
Search Capabilities
Keyword Search
Traditional full-text search using Meilisearch. Supports:
- Exact phrase matching with quotes
- Typo tolerance and fuzzy matching
- Filtering by religion, collection, author, date
- Highlighted snippets showing match context
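The features above map onto parameters of Meilisearch's search endpoint. The sketch below builds a request body for such a query. The field names `religion`, `author`, and `text` are assumptions about the index schema (the source does not name them), and filtering only works on fields configured as filterable in the index settings.

```python
import json


def build_search_payload(query, *, religion=None, author=None, limit=10):
    """Build a JSON-serialisable body for a Meilisearch search request.

    Assumptions: the index has a `text` field holding the passage and
    `religion` / `author` fields declared as filterable attributes.
    """
    filters = []
    if religion:
        # json.dumps yields a double-quoted, escaped string literal.
        filters.append("religion = " + json.dumps(religion, ensure_ascii=False))
    if author:
        filters.append("author = " + json.dumps(author, ensure_ascii=False))

    payload = {
        "q": query,                         # wrap a phrase in "..." for exact matching
        "limit": limit,
        "attributesToHighlight": ["text"],  # mark matched terms in the result
        "attributesToCrop": ["text"],       # return a snippet around the match
        "cropLength": 40,
    }
    if filters:
        payload["filter"] = " AND ".join(filters)
    return payload
```

For example, `build_search_payload('"kingdom of God"', religion="Christian")` produces an exact-phrase query restricted to one tradition. Typo tolerance needs no parameter here because Meilisearch applies it by default.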
Semantic Search
Vector-based search using OpenAI embeddings (text-embedding-3-large, 3072 dimensions). Finds conceptually related passages even without keyword matches — and works across language boundaries. English queries find Arabic passages; Persian queries find English ones.
- Understands synonyms and paraphrases
- Cross-language concept matching (English ↔ Arabic, Persian, Hebrew, and others)
- Theme and topic discovery
- Question-answering capabilities
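To make the idea of vector-based matching concrete, here is a minimal sketch that ranks passages by cosine similarity to a query vector. It uses toy 3-dimensional vectors in place of real 3072-dimensional embeddings, and the names `cosine_similarity` and `rank_passages` are illustrative; the source does not state which similarity measure or vector store the library uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_passages(query_vec, passage_vecs, top_k=3):
    """Return the top_k (passage_id, score) pairs, best match first."""
    scored = [
        (cosine_similarity(query_vec, vec), pid)
        for pid, vec in passage_vecs.items()
    ]
    scored.sort(reverse=True)
    return [(pid, score) for score, pid in scored[:top_k]]


# Toy vectors standing in for real embeddings (hypothetical data).
passages = {
    "p1": [1.0, 0.0, 0.0],
    "p2": [0.9, 0.1, 0.0],
    "p3": [0.0, 1.0, 0.0],
}
top = rank_passages([1.0, 0.0, 0.0], passages, top_k=2)
```

Because the comparison happens in embedding space rather than on words, a query and a passage in different languages can score highly as long as the embedding model places them near each other.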
Hybrid Search
Combines both approaches for best results. Keyword matches boost relevance for exact terms, while semantic search surfaces conceptually related passages you might have missed.
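The source does not say how the two result lists are merged. One common option, shown here purely as an assumption, is reciprocal rank fusion (RRF), which rewards passages that rank well in either list and especially those that appear in both. The function name and the constant `k=60` are illustrative choices, not the library's settings.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of passage ids into one list.

    Each passage scores 1 / (k + rank) for every list it appears in;
    ties are broken alphabetically by id to keep the order deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, pid in enumerate(ranking, start=1):
            scores[pid] = scores.get(pid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


keyword_hits = ["a", "b", "c"]   # hypothetical keyword-search order
semantic_hits = ["b", "d", "a"]  # hypothetical semantic-search order
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

In this toy case the passages found by both searches (`a` and `b`) end up ahead of those found by only one.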
Indexing Pipeline
Documents go through several processing stages before becoming searchable:
- Ingestion — Markdown files are parsed and metadata extracted
- Chunking — Long documents are split into paragraph-sized passages
- Embedding — Each passage gets a 3072-dimensional vector representation
- RAG Enhancement — Disambiguation, HyPE (hypothetical passage expansion), and entity extraction enrich each passage for higher-precision retrieval
- Indexing — Passages are indexed in Meilisearch with vectors
- Quality Check — AI verifies metadata and flags potential issues
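The chunking stage above can be sketched as follows. This is a minimal illustration that splits a Markdown body on blank lines and breaks any oversized paragraph at a word boundary. The function name `chunk_paragraphs` and the `max_chars` limit are assumptions; the source says only that passages are paragraph-sized.

```python
import re


def chunk_paragraphs(body, max_chars=1200):
    """Split text into paragraph-sized passages.

    Paragraphs are separated by blank lines. Any paragraph longer than
    max_chars is further split at the last space before the limit.
    """
    passages = []
    for block in re.split(r"\n\s*\n", body):
        paragraph = " ".join(block.split())  # collapse internal whitespace
        if not paragraph:
            continue
        while len(paragraph) > max_chars:
            cut = paragraph.rfind(" ", 0, max_chars)
            if cut <= 0:
                cut = max_chars  # no space found: hard cut
            passages.append(paragraph[:cut])
            paragraph = paragraph[cut:].lstrip()
        if paragraph:
            passages.append(paragraph)
    return passages
```

Each returned passage would then be embedded and indexed as its own searchable unit, which is consistent with the library holding far more passages than documents.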
Contributing Documents
We encourage the community to help expand the library. Here's how the contribution process works:
- Submit — Share a document URL, file, or text with the Librarian
- Analysis — The Librarian checks for duplicates and existing coverage
- Conversion — Documents are converted to clean Markdown format
- Metadata — AI enriches metadata (author, date, collection)
- Review — Admin reviews and approves for indexing
- Indexing — Document is processed and becomes searchable
Supported Formats
- Markdown (.md)
- Plain text (.txt)
- PDF (text-based)
- HTML
- EPUB
- Audio/Video URLs (via Transcriber agent)