narmer.stats module

narmer.stats.

The stats module defines functions for calculating various statistical data about linguistic objects, including:

  • Weissman score calculation
narmer.stats.weissman(r_tar, t_tar, r_src, t_src, alpha=1.0)

Calculate Weissman score based on entered statistics.

The score is: \(W = \alpha \cdot \frac{r_{tar}}{r_{src}} \cdot \frac{\log t_{src}}{\log t_{tar}}\)

In practice, the score can also be used to rate time-intensive tasks on the basis of other metrics, e.g. the \(F_1\) score.

Sources: http://spectrum.ieee.org/view-from-the-valley/computing/software/a-madefortv-compression-metric-moves-to-the-real-world

Parameters:
  • r_tar (float) – the target algorithm’s compression ratio
  • t_tar (float) – the target algorithm’s compression time
  • r_src (float) – a standard algorithm’s compression ratio
  • t_src (float) – a standard algorithm’s compression time
  • alpha (float) – a scaling constant (1.0 by default)
Returns: the Weissman score

Return type: float

>>> weissman(1, 1, 1, 1)
1.0
>>> weissman(1, 1, 1, 5)
7248263982714164.0
>>> weissman(1.2, 1.6, 4.8, 5)
0.8560773855177113
>>> weissman(1, 1, 1, 1, alpha=2)
2.0
>>> weissman(1.2, 1.6, 4.8, 5, alpha=2)
1.7121547710354226
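
The examples above are consistent with a direct implementation of the formula in which a zero logarithm (a time of exactly 1) is replaced by machine epsilon, which would explain the very large value returned for weissman(1, 1, 1, 5). The following is a minimal sketch under that assumption, not the module's actual source: the name weissman_sketch, the use of the natural logarithm, and the epsilon substitution are all assumptions inferred from the example outputs.

```python
import math
import sys


def weissman_sketch(r_tar, t_tar, r_src, t_src, alpha=1.0):
    """Hypothetical re-implementation of the Weissman score formula."""
    # Assumption: natural logarithm. Any base gives the same ratio
    # as long as neither logarithm is zero.
    log_src = math.log(t_src)
    log_tar = math.log(t_tar)
    # Assumption: a zero logarithm (time == 1) is replaced by machine
    # epsilon so that the ratio stays defined.
    if log_src == 0.0:
        log_src = sys.float_info.epsilon
    if log_tar == 0.0:
        log_tar = sys.float_info.epsilon
    return alpha * (r_tar / r_src) * (log_src / log_tar)


print(weissman_sketch(1, 1, 1, 1))  # 1.0
print(weissman_sketch(1.2, 1.6, 4.8, 5))
# Hypothetical numbers: F1 scores in place of compression ratios,
# training times (in seconds) in place of compression times.
print(weissman_sketch(0.90, 120.0, 0.85, 300.0))
```

The last call illustrates the non-compression use mentioned earlier: a score above 1 would indicate that the target compares favourably with the standard once both the metric and the running time are taken into account.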