API Reference

The xlmhg Python API includes two alternative functions to conduct an XL-mHG test:

The simple test function, xlmhg_test(), accepts a ranked list in the form of a vector, and (optionally) the X and L parameters, and returns a 3-tuple containing the test statistic, cutoff, and p-value.
The advanced test function, get_xlmhg_test_result(), accepts a more compact representation of a list (consisting of its length N and a vector specifying the indices of the 1’s in the ranked list), as well as several additional arguments that can improve the performance of the test. Instead of a simple tuple, this API returns the test result as an mHGResult object, which includes additional information such as the test parameters, and methods to calculate additional quantities like E-Scores.
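The relationship between the two input representations can be sketched in plain Python. The helper names `to_compact` and `to_vector` are hypothetical and not part of xlmhg. The sketch assumes zero-based indices and uses plain lists where the real API expects NumPy arrays.

```python
def to_compact(v):
    """Return (N, indices) for a ranked 0/1 list; any non-zero entry counts as a 1."""
    indices = [i for i, x in enumerate(v) if x != 0]
    return len(v), indices


def to_vector(N, indices):
    """Rebuild the full 0/1 list from its length and the sorted positions of the 1's."""
    ones = set(indices)
    return [1 if i in ones else 0 for i in range(N)]


v = [1, 0, 1, 1, 0, 0, 0, 0]
N, indices = to_compact(v)
print(N, indices)  # 8 [0, 2, 3]
```

With NumPy, the same indices could be obtained with `numpy.flatnonzero(v)` and cast to `numpy.uint16`, the dtype documented for the `indices` argument of the advanced function.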

Additionally, the API includes a function, get_result_figure(), for visualizing a test result in a Plotly figure. See Examples for concrete examples of how to use these functions.

Simple test function - `xlmhg_test()`

xlmhg.xlmhg_test(v, X=None, L=None, table=None)

Perform an XL-mHG test (simplified interface).

This function accepts a vector containing zeros and ones, and returns a 3-tuple with the XL-mHG test statistic, cutoff, and p-value.

Parameters:

v (1-dim numpy.ndarray of integers) – The ranked list. All non-zero elements are considered “1”s. (Let N denote the length of the list.)
X (int, optional) – The X parameter. [1]
L (int, optional) – The L parameter. [N]
table (numpy.ndarray with ndim=2 and dtype=numpy.longdouble, optional) – The dynamic programming table. Its size must be at least (K+1) x (W+1), where K is the number of 1’s in the list and W = N-K. Providing this array avoids memory reallocation when conducting multiple tests. [None]

Returns:

stat (float) – The XL-mHG test statistic.
cutoff (int) – The (first) cutoff at which stat was attained. (0 if no cutoff was tested.)
pval (float) – The XL-mHG p-value (either exact or an upper bound).
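To make the first two return values concrete, here is a naive pure-Python sketch of the statistic as it is commonly defined: the smallest hypergeometric tail probability over all cutoffs n ≤ L at which at least X 1’s have been seen. The function names are hypothetical, the defaults X=1 and L=N follow the parameter list above, and this is neither the library's algorithm nor a p-value calculation. Edge cases may be handled differently by xlmhg itself.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(at least k 1's among the first n elements) under a random ordering."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


def naive_xlmhg_stat(v, X=1, L=None):
    """Return (stat, cutoff); (1.0, 0) if no cutoff improves on 1.0."""
    N = len(v)
    K = sum(1 for x in v if x != 0)
    L = N if L is None else L
    stat, cutoff, k = 1.0, 0, 0
    for n in range(1, L + 1):
        if v[n - 1] != 0:
            k += 1
        if k < X:
            continue  # fewer than X 1's seen so far: this cutoff is not tested
        p = hypergeom_tail(N, K, n, k)
        if p < stat:  # strict '<' keeps the first cutoff that attains the minimum
            stat, cutoff = p, n
    return stat, cutoff


stat, cutoff = naive_xlmhg_stat([1, 1, 0, 0])
print(stat, cutoff)
```

For the list `[1, 1, 0, 0]`, the minimum is attained at cutoff 2, where both 1’s have been seen and the tail probability is 1/6.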

Advanced test function - `get_xlmhg_test_result()`

xlmhg.get_xlmhg_test_result(N, indices, X=None, L=None, exact_pval='always', pval_thresh=None, escore_pval_thresh=None, table=None, use_alg1=False, tol=1e-12)

Perform an XL-mHG test.

This function accepts a list in the form of a numpy indices array containing the indices of the non-zero elements (sorted), along with the length N of the list. It returns an mHGResult object.

Parameters:

N (int) – The length of the list.
indices (1-dim `numpy.ndarray` with `dtype` = numpy.uint16) – Sorted list of indices corresponding to the “1”s in the ranked list.
X (int, optional) – The `X` parameter. Should be between 0 and K (inclusive), where K is the length of `indices`. [0]
L (int, optional) – The `L` parameter. Should be between 0 and `N` (inclusive). If `None`, this parameter will be set to `N`. [None]
exact_pval (str, enumerated) – Valid values are ‘always’, ‘if_significant’, and ‘if_necessary’. Determines in which cases exact p-values are calculated. This option lets users skip the time-consuming calculation of an exact p-value when they do not require it, which can lead to significant performance gains. Specifically, this setting (in conjunction with `pval_thresh`) determines in which cases the PVAL-THRESH algorithm is invoked to efficiently determine whether the test is significant. This algorithm first tries to make this determination by calculating O(1) and O(N) bounds of the XL-mHG p-value. Only if these bounds fail to give a conclusive answer is an O(N^2) algorithm used to calculate the exact p-value. Note that whenever ‘if_necessary’ or ‘if_significant’ is specified, a significance level (p-value threshold; argument `pval_thresh`) must be specified as well. [‘always’]
pval_thresh (float, optional) – The significance threshold, i.e., the p-value below which the test should be considered statistically significant. Note that this argument must be given whenever the `escore_pval_thresh` argument is given. [None]
escore_pval_thresh (float, optional) – The significance threshold to be used in the calculation of an E-score. The E-score is a measure of the strength of enrichment that is similar to “fold enrichment”. [None]
table (`numpy.ndarray` with `ndim=2` and `dtype=numpy.longdouble`, optional) – The dynamic programming table. Its size must be at least (K+1) x (W+1), with W = N-K. Providing this array avoids memory reallocation when conducting multiple tests. [None]
use_alg1 (bool, optional) – Whether to use PVAL1 (instead of PVAL2) for calculating the p-value. [False]
tol (float, optional) – The tolerance used for comparing floats. [1e-12]

Returns:	The test result.
Return type:	`mHGResult`
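The dependencies between `exact_pval`, `pval_thresh`, and `escore_pval_thresh` described above can be summarized as a small pre-flight check. This is only a sketch: `check_threshold_args` is a hypothetical helper, not part of xlmhg, and how the library itself reacts to an invalid combination is not specified here.

```python
VALID_EXACT_PVAL = ('always', 'if_significant', 'if_necessary')


def check_threshold_args(exact_pval='always', pval_thresh=None,
                         escore_pval_thresh=None):
    """Raise ValueError if the arguments violate the documented constraints."""
    if exact_pval not in VALID_EXACT_PVAL:
        raise ValueError('exact_pval must be one of %s' % (VALID_EXACT_PVAL,))
    # 'if_significant' and 'if_necessary' both need a significance level.
    if exact_pval != 'always' and pval_thresh is None:
        raise ValueError('pval_thresh is required unless exact_pval is "always"')
    # An E-score threshold is only meaningful together with pval_thresh.
    if escore_pval_thresh is not None and pval_thresh is None:
        raise ValueError('pval_thresh is required when escore_pval_thresh is given')


check_threshold_args('if_significant', pval_thresh=0.01)  # passes silently
```

Running such a check before a large batch of tests surfaces an inconsistent configuration once, rather than on every call.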

Test result objects - `mHGResult`

class xlmhg.mHGResult(N, indices, X, L, stat, cutoff, pval, pval_thresh=None, escore_pval_thresh=None, escore_tol=None)

The result of an XL-mHG test.

This class is used by the get_xlmhg_test_result function to represent the result of an XL-mHG test.

Parameters:

N (int) – See N attribute.
indices – See indices attribute.
X (int) – See X attribute.
L (int) – See L attribute.
stat (float) – See stat attribute.
cutoff (int) – See cutoff attribute.
pval (float) – See pval attribute.
pval_thresh (float, optional) – See pval_thresh attribute.
escore_pval_thresh (float, optional) – See escore_pval_thresh attribute.
escore_tol (float, optional) – See escore_tol attribute.

N: int – The length of the ranked list (i.e., the number of elements in it).

indices: numpy.ndarray with ndim=1 and dtype=np.uint16 – A sorted (!) list of indices of all the 1’s in the ranked list.

X: int – The XL-mHG X parameter.

L: int – The XL-mHG L parameter.

stat: float – The XL-mHG test statistic.

cutoff: int – The XL-mHG cutoff.

pval: float – The XL-mHG p-value.

pval_thresh: float or None – The user-specified significance (p-value) threshold for this test.

escore_pval_thresh: float or None – The user-specified p-value threshold used in the E-score calculation.

escore_tol: float or None – The floating point tolerance used in the E-score calculation.

K: (property) Returns the number of 1’s in the list.

escore: (property) Returns the E-score associated with the result.

fold_enrichment: (property) Returns the fold enrichment at the XL-mHG cutoff.

hash: (property) Returns a unique hash value for the result.

k: (property) Returns the number of 1’s above the XL-mHG cutoff.

v: (property) Returns the list as a numpy.ndarray (with dtype np.uint8).
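As an illustration of how `k` and `fold_enrichment` relate to `N`, `indices`, and `cutoff`, the following sketch computes both from the compact representation. It assumes zero-based indices and the usual definition of fold enrichment (observed 1’s above the cutoff divided by the number expected by chance). The function names are hypothetical and the library's own implementation may differ in detail.

```python
from bisect import bisect_left


def ones_above_cutoff(indices, cutoff):
    """Count the 1's among the first `cutoff` elements (indices sorted, zero-based)."""
    return bisect_left(indices, cutoff)


def fold_enrichment(N, indices, cutoff):
    """Observed 1's above the cutoff divided by the number expected by chance.

    Undefined (division by zero) if cutoff is 0 or the list contains no 1's.
    """
    K = len(indices)
    k = ones_above_cutoff(indices, cutoff)
    expected = K * cutoff / N
    return k / expected


print(fold_enrichment(8, [0, 2, 3], 4))  # 3 / (3 * 4 / 8) = 2.0
```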

Visualizing test results - `get_result_figure()`

xlmhg.get_result_figure(result, show_title=False, title=None, show_inset=True, plot_fold_enrichment=False, width=800, height=350, font_size=24, margin=None, font_family='Computer Modern Roman, serif', score_color='rgb(0, 109, 219)', enrichment_color='rgb(219, 109, 0)', cutoff_color='rgba(255, 52, 52, 0.7)', line_width=2.0, ymax=None, mHG_label=False)

Visualize an XL-mHG test result.

Parameters:

result (`mHGResult`) – The test result.
show_title (bool, optional) – Whether to include a title in the figure. If `title` is not `None`, this parameter is ignored. [False]
title (str or None, optional) – Figure title. If not `None`, `show_title` is ignored. [None]
show_inset (bool, optional) – Whether to show test parameters and p-value as an inset. [True]
plot_fold_enrichment (bool, optional) – Whether to plot the fold enrichment on a second axis. [False]
width (int, optional) – The width of the figure (in pixels). [800]
height (int, optional) – The height of the figure (in pixels). [350]
font_size (int, optional) – The font size to use. [24]
margin (dict, optional) – A dictionary specifying the figure margins (in pixels). Valid keys are “l” (left), “r” (right), “t” (top), and “b” (bottom). Missing keys are replaced by Plotly default values. If `None`, will be set to a dictionary specifying a left margin of 100 px and a top margin of 40 px. [None]
font_family (str, optional) – The font family (name) to use. [“Computer Modern Roman, serif”]
score_color (str, optional) – The color used for plotting the enrichment scores. [“rgb(0, 109, 219)”]
enrichment_color (str, optional) – The color used for plotting the fold enrichment values (if enabled). [“rgb(219, 109, 0)”]
cutoff_color (str, optional) – The color used for indicating the XL-mHG test cutoff. [“rgba(255, 52, 52, 0.7)”]
line_width (int or float, optional) – The line width used for plotting. [2.0]
ymax (int or float or None, optional) – The y-axis limit. If `None`, determined automatically. [None]
mHG_label (bool, optional) – If `True`, label the p-value with “mHG” instead of “XL-mHG”. [False]

Returns:	The Plotly figure.
Return type:	`plotly.graph_objs.Figure`
