Available Corpora
Context-Fabric works with any corpus in Text-Fabric format. This page catalogs 35 known corpora across languages and time periods.
| Corpus | Language | Category | Period | Description |
|---|---|---|---|---|
| oldassyrian | Akkadian | Historical | 2000-1600 BCE | Old Assyrian documents |
| oldbabylonian | Akkadian | Historical | 1900-1600 BCE | Old Babylonian letters |
| ninmed | Akkadian | Historical | ca. 800 BCE | Medical Encyclopedia from Nineveh |
| quran (73 MB) | Arabic | Religious | 600-900 CE | Quranic Arabic Corpus |
| fusus | Arabic | Religious | Medieval | Ibn Arabi's Fusus Al Hikam |
| nena_tf | Aramaic | Historical | Modern | North Eastern Neo-Aramaic |
| wp6-missieven | Dutch | Historical | 1600-1800 CE | VOC General Missives |
| wp6-daghregisters | Dutch | Historical | 1640-1641 | Batavia daily records |
| wp6-ferdinandhuyck | Dutch | Literary | 1884 | Dutch novel by Jacob van Lennep |
| mondriaan | Dutch | Historical | 1892-1923 | Piet Mondriaan letters |
| mobydick | English | Literary | 1851 | Herman Melville novel with NLP annotations |
| banks | English | Literary | 1987 | Iain M. Banks' Consider Phlebas |
| descartes-tf | French/Latin/Dutch | Historical | 1619-1650 | Descartes correspondence |
| lxx (268 MB) | Greek | Biblical | 300-100 BCE | Septuagint (Rahlfs edition) |
| n1904 (319 MB) | Greek | Biblical | 100-400 CE | Nestle 1904 Greek New Testament |
| SBLGNT | Greek | Biblical | 100-400 CE | SBL Greek New Testament |
| nestle1904 | Greek | Biblical | 100-400 CE | NT from LOWFAT-XML syntax trees |
| Nestle1904GBI | Greek | Biblical | 100-400 CE | Nestle 1904 (tonyjurg) |
| tischendorf_tf (34 MB) | Greek | Biblical | 100-400 CE | Tischendorf 8th Edition Greek NT |
| bible | Greek | Biblical | 300 BCE-400 CE | Greek OT, NT, and extra-biblical |
| patristics | Greek | Religious | 100-500 CE | Church Fathers |
| greek_literature | Greek | Literary | 400 BCE-400 CE | Perseus & Open Greek texts |
| athenaeus | Greek | Literary | 80-170 CE | Athenaeus' Deipnosophistae |
| bhsa (1.1 GB) | Hebrew | Biblical | 1000-200 BCE | Biblia Hebraica Stuttgartensia Amstelodamensis |
| dss (936 MB) | Hebrew | Religious | 300 BCE-100 CE | Dead Sea Scrolls |
| sp (147 MB) | Hebrew | Biblical | 516 BCE-70 CE | Samaritan Pentateuch |
| extrabiblical | Hebrew | Historical | 200 BCE-200 CE | Extra-biblical Hebrew texts |
| suriano | Italian | Historical | 1616-1623 | Diplomatic correspondence |
| translatin-manif | Latin | Literary | Early Modern | Early modern Latin drama analysis |
| dhammapada | Pali | Religious | 300 BCE | Ancient Buddhist verses |
| uruk | Proto-Cuneiform | Historical | 4000-3100 BCE | Archaic tablets from Uruk |
| peshitta (55 MB) | Syriac | Biblical | 1000 BCE-900 CE | Syriac Old Testament |
| syrnt (52 MB) | Syriac | Biblical | 0-1000 CE | Syriac New Testament |
| syriac | Syriac | Religious | Various | Syriac texts collection |
| cuc (1.6 MB) | Ugaritic | Historical | 1223-1172 BCE | Copenhagen Ugaritic Corpus |
Using a Corpus
To use any corpus with Context-Fabric:
- Clone the repository or download the corpus
- Point Context-Fabric to the directory containing the `.tf` files
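```python
from cfabric import Fabric

# Point Fabric at the directory that contains the .tf files
CF = Fabric('/path/to/corpus')
# Load all features of the corpus
api = CF.loadAll()
# Expose the API members as global names
api.makeAvailableIn(globals())
```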
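Before loading, it can help to confirm that the path really holds `.tf` files, since a cloned repository may keep them a few directory levels down. The sketch below uses only the standard library. `find_tf_dirs` is a hypothetical helper and not part of Context-Fabric, and the idea that the files may sit in a nested folder is an assumption about how a given repository is laid out.

```python
from pathlib import Path


def find_tf_dirs(root):
    """Return the sorted directories at or below root that directly contain .tf files.

    Hypothetical helper for illustration; not part of Context-Fabric.
    """
    root = Path(root)
    # rglob searches root itself and every subdirectory
    return sorted({p.parent for p in root.rglob("*.tf") if p.is_file()})


# Usage sketch (the path is a placeholder):
# for candidate in find_tf_dirs('/path/to/corpus'):
#     print(candidate)
```

Any directory this returns is a candidate to pass to `Fabric(...)`; an empty list suggests the path does not point at a Text-Fabric corpus.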
On first load, Context-Fabric compiles the corpus to its memory-mapped format (.cfm files) for faster subsequent loads.
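One way to observe this is to time two consecutive loads. The following is a minimal sketch: `timed` is a hypothetical helper built on the standard library, and the commented usage assumes the `cfabric` calls shown earlier on this page. Actual timings depend on the corpus and the machine.

```python
import time


def timed(fn):
    """Call fn() and return (result, elapsed_seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Usage sketch (requires cfabric and a real corpus path):
# from cfabric import Fabric
# _, first = timed(lambda: Fabric('/path/to/corpus').loadAll())
# _, second = timed(lambda: Fabric('/path/to/corpus').loadAll())
# print(f"first load: {first:.1f}s, second load: {second:.1f}s")
```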
Resources
- Creating Your Own Corpus — Build a corpus from your data
- Text-Fabric Corpus Documentation
- GitHub text-fabric topic — Discover community corpora