Welcome to Narmer’s documentation!
narmer package
Narmer NLP/IR library by Christopher C. Little
This library contains code I’m using for research, in particular dissertation research & experimentation.
Further documentation to come…
Submodules
narmer.phonetic module
The phonetic module implements phonetic algorithms including:
- german_ipa
narmer.phonetic.enhg_ipa(word)
Convert Early New High German to IPA.
This is based on TODO
Parameters: word (str) – the ENHG word to transcribe to IPA
Returns: the ENHG word’s approximate IPA equivalent
Return type: str
narmer.phonetic.german_ipa(word, period=u'nhg')
Convert German to IPA.
Wrapper for other, more specific functions to convert German of various periods to IPA.
Parameters:
- word (str) – the German word to transcribe to IPA
- period (str) – a period of German from the set:
  - nhg (default) – New High German
  - enhg – Early New High German
  - mhg – Middle High German
  - ohg – Old High German
Returns: the German word’s approximate IPA equivalent
Return type: str
>>> german_ipa('Ehre')
'ere'
>>> german_ipa('Kohl')
'kol'
>>> german_ipa('Schifffahrt')
'ʃifffart'
>>> german_ipa('Schiller')
'ʃiller'
>>> german_ipa('Tschechien')
'tʃeçin'
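The wrapper behaviour described above can be illustrated with a small dispatch table. This is only a sketch under stated assumptions: `german_ipa_sketch`, `_make_stub`, and `_TRANSCRIBERS` are hypothetical names, the stub transcribers merely tag and lowercase the word rather than producing IPA, and raising `ValueError` for an unknown period is an assumption, since the documentation does not say how narmer handles that case.

```python
def _make_stub(label):
    """Build a placeholder transcriber (hypothetical; it does not produce IPA)."""
    def transcribe(word):
        # Tag the lowercased word with the period so the dispatch is visible.
        return '{}:{}'.format(label, word.lower())
    return transcribe


# One stand-in per documented period; the real per-period functions would go here.
_TRANSCRIBERS = {period: _make_stub(period)
                 for period in ('nhg', 'enhg', 'mhg', 'ohg')}


def german_ipa_sketch(word, period='nhg'):
    """Dispatch to the transcriber registered for the given period."""
    try:
        transcribe = _TRANSCRIBERS[period]
    except KeyError:
        # Assumption: unknown periods are rejected; narmer may behave differently.
        raise ValueError('unknown period: {!r}'.format(period))
    return transcribe(word)


print(german_ipa_sketch('Ehre'))
print(german_ipa_sketch('Ehre', period='mhg'))
```

A dictionary keeps the set of supported periods in one place, so adding a period would mean adding one entry rather than another branch.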
narmer.phonetic.mhg_ipa(word)
Convert Middle High German to IPA.
This is based on http://users.clas.ufl.edu/hasty/resources/CHAPTER1.HTM
Parameters: word (str) – the MHG word to transcribe to IPA
Returns: the MHG word’s approximate IPA equivalent
Return type: str
narmer.phonetic.nhg_ipa(word)
Convert New High German to IPA.
This is based largely on the orthographic mapping described at: https://en.wikipedia.org/wiki/German_orthography
No significant attempt is made to accommodate loanwords.
Parameters: word (str) – the NHG word to transcribe to IPA
Returns: the NHG word’s approximate IPA equivalent
Return type: str
>>> nhg_ipa('Ehre')
'ere'
>>> nhg_ipa('Kohl')
'kol'
>>> nhg_ipa('Schifffahrt')
'ʃifffart'
>>> nhg_ipa('Schiller')
'ʃiller'
>>> nhg_ipa('Tschechien')
'tʃeçin'
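To give a feel for what an orthographic mapping of this kind involves, here is a minimal sketch of a greedy longest-match transcriber. The rule table `TOY_RULES` and the function `toy_nhg_ipa` are hypothetical and cover only the graphemes needed for the documented examples; they are not narmer's actual rules, which are presumably far more extensive.

```python
# Toy grapheme-to-IPA table: an assumption for illustration, not narmer's rules.
TOY_RULES = {
    'tsch': 'tʃ',
    'sch': 'ʃ',
    'ch': 'ç',
    'ah': 'a',
    'eh': 'e',
    'oh': 'o',
    'ie': 'i',
}
MAX_LEN = max(len(grapheme) for grapheme in TOY_RULES)


def toy_nhg_ipa(word):
    """Transcribe by always consuming the longest matching grapheme."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        # Try the longest candidate first so 'tsch' wins over 'sch' and 'ch'.
        for size in range(MAX_LEN, 0, -1):
            chunk = word[i:i + size]
            if chunk in TOY_RULES:
                out.append(TOY_RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule applies: pass the letter through unchanged.
            out.append(word[i])
            i += 1
    return ''.join(out)


for example in ('Ehre', 'Kohl', 'Schifffahrt', 'Schiller', 'Tschechien'):
    print(example, toy_nhg_ipa(example))
```

With this toy table the five documented example words come out the same as in the doctests above, but that agreement says nothing about words outside the table.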
narmer.stats module
The stats module defines functions for calculating various statistical data about linguistic objects, including:
- Weissman score calculation
narmer.stats.weissman(r_tar, t_tar, r_src, t_src, alpha=1.0)
Calculate Weissman score based on entered statistics.
The score is: \(W = \alpha \cdot \frac{r_{tar}}{r_{src}} \cdot \frac{\log t_{src}}{\log t_{tar}}\)
In practice, the score can also be used to rate time-intensive tasks on the basis of metrics other than compression ratio, e.g. \(F_1\) score.
Parameters:
- r_tar (float) – the target algorithm’s compression ratio
- t_tar (float) – the target algorithm’s compression time
- r_src (float) – a standard algorithm’s compression ratio
- t_src (float) – a standard algorithm’s compression time
- alpha (float) – a scaling constant (1.0 by default)
Returns: the Weissman score
Return type: float
>>> weissman(1, 1, 1, 1)
1.0
>>> weissman(1, 1, 1, 5)
7248263982714164.0
>>> weissman(1.2, 1.6, 4.8, 5)
0.8560773855177113
>>> weissman(1, 1, 1, 1, alpha=2)
2.0
>>> weissman(1.2, 1.6, 4.8, 5, alpha=2)
1.7121547710354226
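The formula can be written out directly as a short function. The sketch below is not narmer's implementation; `weissman_sketch` is a hypothetical name. Replacing a zero logarithm (a time of exactly 1) with machine epsilon is an assumption, inferred only from the documented outputs for `t_tar = 1`, which would otherwise require division by zero.

```python
import math
import sys


def weissman_sketch(r_tar, t_tar, r_src, t_src, alpha=1.0):
    """Compute W = alpha * (r_tar / r_src) * (log t_src / log t_tar)."""
    # Assumption: a logarithm of exactly zero (t == 1) is swapped for machine
    # epsilon so the ratio stays finite; narmer's handling is not documented.
    log_src = math.log(t_src) or sys.float_info.epsilon
    log_tar = math.log(t_tar) or sys.float_info.epsilon
    return alpha * (r_tar / r_src) * (log_src / log_tar)


print(weissman_sketch(1.2, 1.6, 4.8, 5))
print(weissman_sketch(1.2, 1.6, 4.8, 5, alpha=2))
```

For the inputs `(1.2, 1.6, 4.8, 5)` the ratio term is 0.25 and the logarithm term is roughly 3.424, giving a score of about 0.856, in line with the documented example.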