Available Corpora

Context-Fabric works with any corpus in Text-Fabric format. This page catalogs the 35 known corpora listed below, spanning many languages and time periods.

Corpus | Language | Category | Period | Description
oldassyrian | Akkadian | Historical | 2000-1600 BCE | Old Assyrian documents
oldbabylonian | Akkadian | Historical | 1900-1600 BCE | Old Babylonian letters
ninmed | Akkadian | Historical | ca. 800 BCE | Medical Encyclopedia from Nineveh
quran (73 MB) | Arabic | Religious | 600-900 CE | Quranic Arabic Corpus
fusus | Arabic | Religious | Medieval | Ibn Arabi's Fusus Al Hikam
nena_tf | Aramaic | Historical | Modern | North Eastern Neo-Aramaic
wp6-missieven | Dutch | Historical | 1600-1800 CE | VOC General Missives
wp6-daghregisters | Dutch | Historical | 1640-1641 | Batavia daily records
wp6-ferdinandhuyck | Dutch | Literary | 1884 | Dutch novel by Jacob van Lennep
mondriaan | Dutch | Historical | 1892-1923 | Piet Mondriaan letters
mobydick | English | Literary | 1851 | Herman Melville novel with NLP annotations
banks | English | Literary | 1987 | Iain M. Banks' Consider Phlebas
descartes-tf | French/Latin/Dutch | Historical | 1619-1650 | Descartes correspondence
lxx (268 MB) | Greek | Biblical | 300-100 BCE | Septuagint (Rahlfs edition)
n1904 (319 MB) | Greek | Biblical | 100-400 CE | Nestle 1904 Greek New Testament
SBLGNT | Greek | Biblical | 100-400 CE | SBL Greek New Testament
nestle1904 | Greek | Biblical | 100-400 CE | NT from LOWFAT-XML syntax trees
Nestle1904GBI | Greek | Biblical | 100-400 CE | Nestle 1904 (tonyjurg)
tischendorf_tf (34 MB) | Greek | Biblical | 100-400 CE | Tischendorf 8th Edition Greek NT
bible | Greek | Biblical | 300 BCE-400 CE | Greek OT, NT, and extra-biblical
patristics | Greek | Religious | 100-500 CE | Church Fathers
greek_literature | Greek | Literary | 400 BCE-400 CE | Perseus & Open Greek texts
athenaeus | Greek | Literary | 80-170 CE | Athenaeus' Deipnosophistae
bhsa (1.1 GB) | Hebrew | Biblical | 1000-200 BCE | Biblia Hebraica Stuttgartensia Amstelodamensis
dss (936 MB) | Hebrew | Religious | 300 BCE-100 CE | Dead Sea Scrolls
sp (147 MB) | Hebrew | Biblical | 516 BCE-70 CE | Samaritan Pentateuch
extrabiblical | Hebrew | Historical | 200 BCE-200 CE | Extra-biblical Hebrew texts
suriano | Italian | Historical | 1616-1623 | Diplomatic correspondence
translatin-manif | Latin | Literary | Early Modern | Early modern Latin drama analysis
dhammapada | Pali | Religious | 300 BCE | Ancient Buddhist verses
uruk | Proto-Cuneiform | Historical | 4000-3100 BCE | Archaic tablets from Uruk
peshitta (55 MB) | Syriac | Biblical | 1000 BCE-900 CE | Syriac Old Testament
syrnt (52 MB) | Syriac | Biblical | 0-1000 CE | Syriac New Testament
syriac | Syriac | Religious | Various | Syriac texts collection
cuc (1.6 MB) | Ugaritic | Historical | 1223-1172 BCE | Copenhagen Ugaritic Corpus

Using a Corpus

To use any corpus with Context-Fabric:

  1. Clone the repository or download the corpus
  2. Point Context-Fabric to the directory containing .tf files
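```python
from cfabric import Fabric

# Point Fabric at the directory that contains the corpus's .tf files
CF = Fabric('/path/to/corpus')
# Load all features of the corpus
api = CF.loadAll()
# Expose the API members as global names in the current session
api.makeAvailableIn(globals())
```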

On first load, Context-Fabric compiles the corpus to its memory-mapped format (.cfm files) for faster subsequent loads.
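
Step 2 above asks for the directory that actually contains the .tf files, which in a cloned repository may sit a few levels below the repository root. The following is a minimal sketch for locating such directories using only the Python standard library. The helper name `find_tf_dirs` is hypothetical and is not part of Context-Fabric; it only inspects the file system and does not load anything.

```python
from pathlib import Path


def find_tf_dirs(root):
    """Return the sorted directories under `root` that directly contain .tf files.

    Hypothetical helper for illustration; not part of Context-Fabric.
    """
    root = Path(root)
    if not root.is_dir():
        # Nothing to search if the path is missing or is not a directory
        return []
    # rglob walks the tree recursively (including `root` itself);
    # collect the parent directory of every .tf file found
    return sorted({p.parent for p in root.rglob("*.tf") if p.is_file()})
```

Any directory returned by this helper is a candidate for the path passed to `Fabric(...)`. If several directories come back, for example one per corpus version, choosing among them is left to the reader.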

Resources