cm_selix_stats

mdtools.structure.cm_selix_stats(cm, unbound_nan=False)

For each reference compound, get statistics about the indices of the selection compounds that are bound to it.

Parameters:
  • cm (array_like or SciPy sparse matrix) – Boolean contact matrix of shape (m, n), e.g. as generated by mdtools.structure.contact_matrix(), where m is the number of reference compounds and n is the number of selection compounds.

  • unbound_nan (bool, optional) – If True, set the mean, variance, minimum and maximum for reference compounds that are not in contact with any selection compound to numpy.nan (the number of bound selection compounds stays zero). Otherwise, these values are zero for unbound reference compounds. Zeros can be misinterpreted, because a reference compound that is bound only by the selection compound with index zero also has a mean, variance, minimum and maximum of zero; the two cases then differ only in the first column.

Returns:

selix_stats (numpy.ndarray) – Array of shape (m, 5). The five columns contain for each reference compound

  1. the number

  2. the mean of the indices

  3. the (population) variance of the indices

  4. the minimum of the indices

  5. the maximum of the indices

of selection compounds that are in contact with the given reference compound.
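
The five columns can be reproduced with plain NumPy. The following is a minimal sketch for dense input only, not the mdtools implementation. The function name selix_stats_sketch is hypothetical, and the use of the population variance (numpy.var with its default ddof=0) is an assumption inferred from the example output in this section.

```python
import numpy as np


def selix_stats_sketch(cm, unbound_nan=False):
    """Illustrative stand-in for cm_selix_stats (dense input only)."""
    cm = np.asarray(cm, dtype=bool)
    # Unbound reference compounds keep this fill value in columns 1-4.
    fill = np.nan if unbound_nan else 0.0
    stats = np.full((cm.shape[0], 5), fill, dtype=np.float64)
    # Column 0: number of bound selection compounds (always a valid count).
    stats[:, 0] = cm.sum(axis=1)
    for i, row in enumerate(cm):
        ix = np.flatnonzero(row)  # Indices of the bound selection compounds.
        if ix.size > 0:
            # Assumption: population variance (ddof=0).
            stats[i, 1:] = [ix.mean(), ix.var(), ix.min(), ix.max()]
    return stats


cm = np.eye(5, 4, -1, dtype=bool) + np.eye(5, 4, -2, dtype=bool)
print(selix_stats_sketch(cm, unbound_nan=True))
```

The explicit loop over rows favors readability over speed; for large contact matrices a vectorized or sparse-aware approach would likely be preferable.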

Examples

NumPy array as input:

>>> import numpy as np
>>> import mdtools as mdt
>>> cm = np.eye(5, 4, -1, dtype=bool) + np.eye(5, 4, -2, dtype=bool)
>>> cm
array([[False, False, False, False],
       [ True, False, False, False],
       [ True,  True, False, False],
       [False,  True,  True, False],
       [False, False,  True,  True]])
>>> # Contact matrix containing the indices of the selection
>>> # compounds that are bound to reference compounds:
>>> cm * np.arange(cm.shape[1])
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 1, 2, 0],
       [0, 0, 2, 3]])
>>> mdt.strc.cm_selix_stats(cm)
array([[0.  , 0.  , 0.  , 0.  , 0.  ],
       [1.  , 0.  , 0.  , 0.  , 0.  ],
       [2.  , 0.5 , 0.25, 0.  , 1.  ],
       [2.  , 1.5 , 0.25, 1.  , 2.  ],
       [2.  , 2.5 , 0.25, 2.  , 3.  ]])
>>> # [n_sel, mean, var , min , max ]
>>> mdt.strc.cm_selix_stats(cm, unbound_nan=True)
array([[0.  ,  nan,  nan,  nan,  nan],
       [1.  , 0.  , 0.  , 0.  , 0.  ],
       [2.  , 0.5 , 0.25, 0.  , 1.  ],
       [2.  , 1.5 , 0.25, 1.  , 2.  ],
       [2.  , 2.5 , 0.25, 2.  , 3.  ]])

SciPy sparse matrix as input:

>>> from scipy import sparse
>>> cm = sparse.csr_matrix(cm)
>>> mdt.strc.cm_selix_stats(cm)
array([[0.  , 0.  , 0.  , 0.  , 0.  ],
       [1.  , 0.  , 0.  , 0.  , 0.  ],
       [2.  , 0.5 , 0.25, 0.  , 1.  ],
       [2.  , 1.5 , 0.25, 1.  , 2.  ],
       [2.  , 2.5 , 0.25, 2.  , 3.  ]])
>>> mdt.strc.cm_selix_stats(cm, unbound_nan=True)
array([[0.  ,  nan,  nan,  nan,  nan],
       [1.  , 0.  , 0.  , 0.  , 0.  ],
       [2.  , 0.5 , 0.25, 0.  , 1.  ],
       [2.  , 1.5 , 0.25, 1.  , 2.  ],
       [2.  , 2.5 , 0.25, 2.  , 3.  ]])
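
The choice of unbound_nan matters once the output is aggregated further. The sketch below copies the two result arrays from the examples above and averages the mean-index column over all reference compounds. It relies only on NumPy (numpy.nanmean); the variable names are illustrative and not part of mdtools.

```python
import numpy as np

# Result for unbound_nan=False, copied from the examples above.
stats_zero = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [2.0, 0.5, 0.25, 0.0, 1.0],
    [2.0, 1.5, 0.25, 1.0, 2.0],
    [2.0, 2.5, 0.25, 2.0, 3.0],
])
# Result for unbound_nan=True: row 0 has no contacts.
stats_nan = stats_zero.copy()
stats_nan[0, 1:] = np.nan

# With zeros, the unbound row 0 is counted like a compound bound to index 0.
avg_with_unbound = stats_zero[:, 1].mean()
# With NaN, the NaN-aware mean skips the unbound row.
avg_bound_only = np.nanmean(stats_nan[:, 1])
print(avg_with_unbound, avg_bound_only)
```

Here the plain mean is 0.9, whereas the NaN-aware mean over bound reference compounds only is 1.125. Alternatively, bound rows can be selected with the first column (stats[:, 0] > 0), which works for either setting of unbound_nan.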