contact_hist_refcmp_selcmp_pair
- mdtools.structure.contact_hist_refcmp_selcmp_pair(cm, natms_per_refcmp=1, natms_per_selcmp=1, minlength=0, dtype=int)
Bin the number of “bonds” (Atom-Atom contacts) between pairs of reference and selection compounds.

A compound is usually a chemically meaningful subgroup of an AtomGroup. This can e.g. be a Segment, Residue, fragment or a single Atom. Refer to the MDAnalysis user guide for an explanation of these terms. Note that in any case, only Atoms belonging to the original AtomGroup are taken into account, even if the compound might comprise additional Atoms that are not contained in the original AtomGroup.

- Parameters:
cm (array_like) – (Boolean) contact matrix of shape (m, n) as e.g. generated by mdtools.structure.contact_matrix(), where m is the number of reference Atoms and n is the number of selection Atoms. Alternatively, cm can already be a compound contact count matrix as e.g. generated by mdtools.structure.cmp_contact_count_matrix(). In this case, you probably want to set natms_per_refcmp and natms_per_selcmp to 1, to keep cm unaltered.

natms_per_refcmp (int or array_like, optional) – Number of Atoms per reference compound. Can be a single integer or an array of integers. If natms_per_refcmp is a single integer, all reference compounds are assumed to contain the same number of Atoms. In this case, natms_per_refcmp must be an integer divisor of cm.shape[0]. If natms_per_refcmp is an array of integers, it must contain the number of reference Atoms for each single reference compound. In this case, sum(natms_per_refcmp) must be equal to cm.shape[0].

natms_per_selcmp (int or array_like, optional) – Same for selection compounds (natms_per_selcmp is checked against cm.shape[1]).

minlength (int, optional) – A minimum number of bins for the output array. The output array will have at least this number of elements, though it will be longer if necessary. See numpy.bincount() for further information.

dtype (dtype, optional) – Data type of the output array.
- Returns:
hist_refcmp_selcmp_pair (numpy.ndarray) – Histogram of the number of “bonds” (Atom-Atom contacts) between pairs of reference and selection compounds (refcmp-selcmp pairs). A refcmp-selcmp pair is defined as a reference and selection compound that are connected with each other via at least one “bond”.
See also
mdtools.structure.contact_matrix() – Construct a boolean contact matrix for two MDAnalysis AtomGroups.
mdtools.structure.contact_hists() – Bin the number of contacts between reference and selection compounds into histograms.
mdtools.structure.contact_hist_refcmp_diff_selcmp() – Bin the number of contacts that reference compounds establish to different selection compounds into a histogram.
mdtools.structure.contact_hist_refcmp_same_selcmp() – Bin the number of contacts that reference compounds establish to the same selection compound into a histogram.
mdtools.structure.contact_hist_refcmp_selcmp_tot() – Bin the total number of contacts that reference compounds establish to selection compounds into a histogram.
numpy.bincount() – Count the number of occurrences of each value in an array of non-negative ints.
Notes
Atoms belonging to the same compound must form a contiguous set in the input contact matrix, otherwise the result will be wrong.

About the output array:
- hist_refcmp_selcmp_pair
The first element is meaningless (a refcmp-selcmp pair with zero “bonds” is not a pair) and therefore set to zero. The second element is the number of refcmp-selcmp pairs connected via exactly one “bond”, the third element is the number of refcmp-selcmp pairs connected via exactly two “bonds”, and so on.
The sum of all histogram elements might exceed the number of reference compounds, because a single reference compound can be connected to different selection compounds via different numbers of “bonds”. Even each histogram element on its own might exceed the number of reference compounds, because a single reference compound can be connected to different selection compounds via the same number of “bonds”.
Hence, this histogram should be normalized by the number of refcmp-selcmp pairs and not by the number of reference compounds. The normalized elements then sum to 100 %, so it is e.g. possible to say that 100 % of the refcmp-selcmp connections are monodentate.
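As a minimal sketch of this normalization, the snippet below takes the histogram [0, 2, 2, 1] from the Examples section (natms_per_refcmp=[2, 2, 1], natms_per_selcmp=2) and divides it by the number of refcmp-selcmp pairs. It uses only NumPy; the variable names n_pairs and fractions are illustrative and not part of mdtools.

```python
import numpy as np

# Histogram taken from the Examples section
# (natms_per_refcmp=[2, 2, 1], natms_per_selcmp=2).
hist = np.array([0, 2, 2, 1])

# hist[0] is always zero, so the sum of all elements is the number of
# refcmp-selcmp pairs.
n_pairs = hist.sum()

# Fraction of pairs connected via exactly 0, 1, 2, 3 "bonds".
fractions = hist / n_pairs

print(n_pairs)             # 5
print(fractions.tolist())  # [0.0, 0.4, 0.4, 0.2]
```

Here five pairs are formed by only three reference compounds, so dividing by the number of reference compounds would yield fractions that sum to more than 100 %. If no pairs exist at all, n_pairs is zero and the division should be skipped.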
This behavior is complementary to the histogram returned by mdtools.structure.contact_hist_refcmp_same_selcmp().

If both natms_per_refcmp and natms_per_selcmp are 1 and cm is a true boolean contact matrix, hist_refcmp_selcmp_pair is equal to [0, y], where y is the number of refatm-selatm pairs.

Examples
>>> import numpy as np
>>> import mdtools as mdt
>>> cm = np.tril(np.ones((5,4), dtype=bool), -1)
>>> cm[3, 0] = 0
>>> cm
array([[False, False, False, False],
       [ True, False, False, False],
       [ True,  True, False, False],
       [False,  True,  True, False],
       [ True,  True,  True,  True]])
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(cm)
array([0, 9])
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(cm=cm, minlength=4)
array([0, 9, 0, 0])
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(cm=cm, minlength=1)
array([0, 9])
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(cm=cm, dtype=np.uint32)
array([0, 9], dtype=uint32)
>>> hist = mdt.strc.contact_hist_refcmp_selcmp_pair(cm)
>>> hist[1] == np.count_nonzero(cm)
True

>>> mdt.strc.cmp_contact_count_matrix(
...     cm=cm, natms_per_refcmp=[2, 2, 1]
... )
array([[1, 0, 0, 0],
       [1, 2, 1, 0],
       [1, 1, 1, 1]])
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(
...     cm=cm, natms_per_refcmp=[2, 2, 1]
... )
array([0, 7, 1])

>>> mdt.strc.cmp_contact_count_matrix(cm=cm, natms_per_selcmp=2)
array([[0, 0],
       [1, 0],
       [2, 0],
       [1, 1],
       [2, 2]])
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(
...     cm=cm, natms_per_selcmp=2
... )
array([0, 3, 3])

>>> mdt.strc.cmp_contact_count_matrix(
...     cm=cm, natms_per_refcmp=[2, 2, 1], natms_per_selcmp=2
... )
array([[1, 0],
       [3, 1],
       [2, 2]])
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(
...     cm=cm, natms_per_refcmp=[2, 2, 1], natms_per_selcmp=2
... )
array([0, 2, 2, 1])

Edge cases:
>>> cm = np.array([], dtype=bool).reshape(0, 4)
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(
...     cm, natms_per_refcmp=[]
... )
array([], dtype=int64)
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(cm, natms_per_refcmp=1)
array([], dtype=int64)
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(
...     cm, natms_per_refcmp=[], minlength=2, dtype=np.uint32
... )
array([0, 0], dtype=uint32)

>>> cm = np.array([], dtype=bool).reshape(6, 0)
>>> mdt.strc.contact_hist_refcmp_selcmp_pair(
...     cm, natms_per_refcmp=3, natms_per_selcmp=[]
... )
array([], dtype=int64)
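
For readers who want to check the numbers above by hand, the following NumPy-only sketch reproduces the histogram for the non-empty examples. It is not the mdtools implementation. The helper name pair_hist_sketch is hypothetical, and the sketch assumes that both per-compound Atom counts are given as sequences, that every compound contains at least one Atom, and that cm is not empty. It first sums the Atom-Atom contacts within each contiguous (refcmp, selcmp) block and then counts how often each number of “bonds” occurs.

```python
import numpy as np


def pair_hist_sketch(cm, natms_per_refcmp, natms_per_selcmp, minlength=0):
    """Hypothetical NumPy-only helper, not part of mdtools.

    Assumes non-empty `cm` and at least one Atom per compound.
    """
    cm = np.asarray(cm, dtype=int)
    # Index of the first Atom of each compound (compounds are contiguous).
    ref_starts = np.cumsum(natms_per_refcmp) - np.asarray(natms_per_refcmp)
    sel_starts = np.cumsum(natms_per_selcmp) - np.asarray(natms_per_selcmp)
    # Sum the Atom-Atom contacts inside each (refcmp, selcmp) block.
    counts = np.add.reduceat(cm, ref_starts, axis=0)
    counts = np.add.reduceat(counts, sel_starts, axis=1)
    # Count how often each number of "bonds" occurs.
    hist = np.bincount(counts.ravel(), minlength=minlength)
    # Zero "bonds" means the two compounds do not form a pair.
    hist[0] = 0
    return hist


# Same contact matrix as in the Examples section.
cm = np.tril(np.ones((5, 4), dtype=bool), -1)
cm[3, 0] = False

hist = pair_hist_sketch(cm, natms_per_refcmp=[2, 2, 1], natms_per_selcmp=[2, 2])
print(hist)  # [0 2 2 1]
```

The block sums rely on numpy.ufunc.reduceat, which only gives the intended result when the start indices are strictly increasing. This is why the sketch excludes empty compounds and the empty edge cases shown above.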