n_leaves_vs_time_discrete

mdtools.dtrj.n_leaves_vs_time_discrete(dtrj1, dtrj2, discard_neg_start=False, discard_all_neg=False, verbose=False)

Calculate the number of compounds that leave their state after a lag time \(\Delta t\) resolved with respect to the states in a second discrete trajectory.

Take a discrete trajectory and calculate the total number of compounds that leave their state at time \(t_0 + \Delta t\) given that they have entered the state at time \(t_0\) and given that they were in a specific state of another discrete trajectory at time \(t_0\).

Additionally, calculate the number of compounds that are at risk of leaving the state at time \(t_0 + \Delta t\), i.e. the number of compounds that have continuously been in the state from time \(t_0\) to \(t_0 + \Delta t\).

States whose starting point \(t_0\) is unknown (because the compound was already in its state at the beginning of the trajectory) are discarded, because the state of the second trajectory in which the compound was at time \(t_0\) is then unknown as well.
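The counting rules described above can be sketched in plain NumPy. This is only an illustration of the description, not the library's implementation: the function name `n_leaves_sketch` is hypothetical, the discard options are ignored, and the rows are assumed to follow the ascending order of the states in the second trajectory.

```python
import numpy as np


def n_leaves_sketch(dtrj1, dtrj2):
    """Hypothetical reference sketch; not the mdtools implementation."""
    dtrj1 = np.atleast_2d(dtrj1)
    dtrj2 = np.atleast_2d(dtrj2)
    n_frames = dtrj1.shape[1]
    # Assumption: one row per state of dtrj2, in ascending order.
    states2 = np.unique(dtrj2)
    n_leaves = np.zeros((len(states2), n_frames), dtype=np.uint32)
    n_risk = np.zeros_like(n_leaves)
    for cmp1, cmp2 in zip(dtrj1, dtrj2):
        # Frames at which the compound enters a new state.  The visit
        # that starts at frame 0 is left-truncated and never listed.
        starts = np.flatnonzero(np.diff(cmp1) != 0) + 1
        # Exclusive end of each visit; the last one ends with the data.
        ends = np.append(starts[1:], n_frames)
        for t0, t1 in zip(starts, ends):
            row = np.searchsorted(states2, cmp2[t0])
            lifetime = t1 - t0
            # At risk at every lag from 0 up to the lifetime.
            n_risk[row, : lifetime + 1] += 1
            # A leave is only observed if the visit ends before the
            # trajectory does; otherwise it is right-censored.
            if t1 < n_frames:
                n_leaves[row, lifetime] += 1
    return n_leaves, n_risk


dtrj = np.array([[1, 2, 2, 3, 3, 3],
                 [2, 2, 3, 3, 3, 1],
                 [3, 3, 3, 1, 2, 2],
                 [1, 3, 3, 3, 2, 2]])
n_leaves, n_risk = n_leaves_sketch(dtrj, dtrj)
```

For this input the sketch reproduces the arrays shown for the same trajectory in the Examples section.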

Parameters:
  • dtrj1, dtrj2 (array_like) – The discrete trajectories. Arrays of shape (n, f), where n is the number of compounds and f is the number of frames. The shape can also be (f,), in which case the array is expanded to shape (1, f). Both arrays must have the same shape. The elements of the arrays are interpreted as the indices of the states in which a given compound is at a given frame.

  • discard_neg_start (bool, optional) – If True, discard all state leavings starting from a negative state. This means compounds in negative states are ignored. They increase neither n_leaves nor n_risk. This is equivalent to discarding the lifetimes of all negative states when calculating state lifetimes with mdtools.dtrj.lifetimes().

  • discard_all_neg (bool, optional) – If True, discard all state leavings starting from or ending in a negative state. In addition to ignoring compounds in negative states, transitions from positive to negative states are treated as right-censored. These transitions increase n_risk but not n_leaves. This is equivalent to discarding the lifetimes of all negative states and of all states that are followed by a negative state when calculating state lifetimes with mdtools.dtrj.lifetimes().

  • verbose (bool, optional) – If True, print a progress bar.
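The shape convention for dtrj1 and dtrj2 can be mimicked before calling the function, for example to catch mismatched inputs early. The snippet below is a minimal sketch with made-up trajectories; it does not claim to mirror the checks the function performs internally.

```python
import numpy as np

# Hypothetical single-compound trajectories of shape (f,).
dtrj1 = np.array([1, 3, 3, 3, 1, 2, 2])
dtrj2 = np.array([0, 0, 1, 1, 1, 0, 0])

# Expand shape (f,) to (1, f); 2-dimensional input is left unchanged.
dtrj1_2d = np.atleast_2d(dtrj1)
dtrj2_2d = np.atleast_2d(dtrj2)

# Both trajectories must describe the same compounds and frames.
if dtrj1_2d.shape != dtrj2_2d.shape:
    raise ValueError("dtrj1 and dtrj2 must have the same shape")

n_cmps, n_frames = dtrj1_2d.shape
```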

Returns:

  • n_leaves (numpy.ndarray) – Array of shape (m, f) and dtype numpy.uint32. m is the number of different states in the second discrete trajectory. The ij-element of n_leaves is the number of compounds that leave their state j frames after they have entered it, given that the compounds were in state i of the second discrete trajectory at time \(t_0\).

  • n_risk (numpy.ndarray) – Array of the same shape and dtype as n_leaves containing the corresponding number of compounds that are at risk of leaving their state.
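To look up the row that belongs to a given state of the second trajectory, a mapping from state index to row index is handy. The sketch below assumes that the rows follow the ascending order of the distinct states in dtrj2. The examples on this page are consistent with that ordering, but the text does not state it explicitly, and `row_of_state` is a name chosen here for illustration.

```python
import numpy as np

dtrj2 = np.array([[1, -2, -2, 3, 3, 3],
                  [1, 4, 4, 4, 4, -1]])

# Sorted distinct states; assumed to match the row order of n_leaves
# and n_risk.
states = np.unique(dtrj2)

# Map each state index to its (assumed) row index.
row_of_state = {int(state): row for row, state in enumerate(states)}
```

With this mapping, `n_leaves[row_of_state[3]]` would select the counts for visits that started while the compound was in state 3 of the second trajectory.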

See also

mdtools.dtrj.n_leaves_vs_time()

Calculate the number of compounds that leave their state after a lag time \(\Delta t\) given that they have entered the state at time \(t_0\)

mdtools.dtrj.leave_prob_discrete()

Calculate the probability that a compound leaves its state after a lag time \(\Delta t\) given that it has entered the state at time \(t_0\) resolved with respect to the states in a second discrete trajectory

mdtools.dtrj.kaplan_meier_discrete()

Estimate the state survival function using the Kaplan-Meier estimator resolved with respect to the states in a second discrete trajectory

Notes

If you pass the same discrete trajectory to dtrj1 and dtrj2, you will get n_leaves and n_risk for each individual state of the input trajectory.

n_leaves / n_risk is the probability that a compound leaves its state at time \(t_0 + \Delta t\) given that it has entered the state at time \(t_0\).
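Because n_risk is zero at lag times that no compound reached, a plain division produces warnings and undefined entries. A minimal sketch of a guarded division, using count arrays copied from one of the examples on this page (marking undefined entries with NaN is a choice made here, not something the function does):

```python
import numpy as np

n_leaves = np.array([[0, 1, 0, 0, 0, 0],
                     [0, 0, 1, 0, 0, 0],
                     [0, 0, 0, 2, 0, 0]], dtype=np.uint32)
n_risk = np.array([[2, 2, 0, 0, 0, 0],
                   [3, 3, 3, 0, 0, 0],
                   [3, 3, 3, 3, 0, 0]], dtype=np.uint32)

# Start from NaN and divide only where at least one compound is at
# risk, so entries without data stay NaN.
leave_prob = np.full(n_risk.shape, np.nan)
np.divide(n_leaves, n_risk, out=leave_prob, where=n_risk > 0)
```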

np.cumprod(1 - n_leaves / n_risk, axis=-1) is the Kaplan-Meier estimate of the survival function of the underlying distribution of state lifetimes.[1]
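The cumulative product can be evaluated without division warnings by treating lags with no compounds at risk as contributing a factor of one and masking them afterwards. The following sketch uses count arrays copied from one of the examples on this page. It relies on n_risk never increasing with the lag time, which follows from its definition, so the masked result agrees with the expression above wherever that expression is defined.

```python
import numpy as np

n_leaves = np.array([[0, 1, 0, 0, 0, 0],
                     [0, 0, 1, 0, 0, 0],
                     [0, 0, 0, 2, 0, 0]], dtype=np.uint32)
n_risk = np.array([[2, 2, 0, 0, 0, 0],
                   [3, 3, 3, 0, 0, 0],
                   [3, 3, 3, 3, 0, 0]], dtype=np.uint32)

# Leave probability per lag; zero where nobody is at risk.
leave_prob = np.divide(
    n_leaves, n_risk, out=np.zeros(n_risk.shape), where=n_risk > 0
)

# Kaplan-Meier estimate: running product of the stay probabilities.
surv = np.cumprod(1 - leave_prob, axis=-1)

# Lags without any compound at risk carry no information.
surv[n_risk == 0] = np.nan
```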

References

Examples

>>> # 0 detectable leaves, 1 left-truncation, 1 right-censoring.
>>> dtrj = np.array([2, 2, 5, 5, 5, 5, 5])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_risk
array([[0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 0]], dtype=uint32)
>>> # 1 detectable leave, 1 left-truncation, 1 right-censoring.
>>> dtrj = np.array([2, 2, 3, 3, 3, 2, 2])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0]], dtype=uint32)
>>> n_risk
array([[1, 1, 1, 0, 0, 0, 0],
       [1, 1, 1, 1, 0, 0, 0]], dtype=uint32)
>>> # 2 detectable leaves, 1 left-truncation, 1 right-censoring.
>>> dtrj = np.array([1, 3, 3, 3, 1, 2, 2])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0]], dtype=uint32)
>>> n_risk
array([[1, 1, 0, 0, 0, 0, 0],
       [1, 1, 1, 0, 0, 0, 0],
       [1, 1, 1, 1, 0, 0, 0]], dtype=uint32)
>>> # 2 detectable leaves, 1 left-truncation, 1 right-censoring.
>>> dtrj = np.array([1, 3, 3, 3, 2, 2, 1])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0]], dtype=uint32)
>>> n_risk
array([[1, 1, 0, 0, 0, 0, 0],
       [1, 1, 1, 0, 0, 0, 0],
       [1, 1, 1, 1, 0, 0, 0]], dtype=uint32)
>>> # 2 detectable leaves, 1 left-truncation, 1 right-censoring.
>>> dtrj = np.array([3, 3, 3, 1, 2, 2, 1])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 1, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_risk
array([[2, 2, 0, 0, 0, 0, 0],
       [1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> # All five trajectories from above combined.
>>> dtrj = np.array([[2, 2, 5, 5, 5, 5, 5],
...                  [2, 2, 3, 3, 3, 2, 2],
...                  [1, 3, 3, 3, 1, 2, 2],
...                  [1, 3, 3, 3, 2, 2, 1],
...                  [3, 3, 3, 1, 2, 2, 1]])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 2, 0, 0, 0, 0, 0],
       [0, 0, 2, 0, 0, 0, 0],
       [0, 0, 0, 3, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_risk
array([[4, 4, 0, 0, 0, 0, 0],
       [4, 4, 4, 0, 0, 0, 0],
       [3, 3, 3, 3, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 0]], dtype=uint32)
>>> dtrj = np.array([[1, 2, 2, 3, 3, 3],
...                  [2, 2, 3, 3, 3, 1],
...                  [3, 3, 3, 1, 2, 2],
...                  [1, 3, 3, 3, 2, 2]])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 1, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0],
       [0, 0, 0, 2, 0, 0]], dtype=uint32)
>>> n_risk
array([[2, 2, 0, 0, 0, 0],
       [3, 3, 3, 0, 0, 0],
       [3, 3, 3, 3, 0, 0]], dtype=uint32)

Discarding negative states:

>>> # Without negative states, the discard options have no effect.
>>> dtrj = np.array([[1, 2, 2, 3, 3, 3],
...                  [2, 2, 3, 3, 3, 1],
...                  [3, 3, 3, 1, 2, 2],
...                  [1, 3, 3, 3, 2, 2],
...                  [1, 4, 4, 4, 4, 1]])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 1, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0]], dtype=uint32)
>>> n_risk
array([[3, 3, 0, 0, 0, 0],
       [3, 3, 3, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0]], dtype=uint32)
>>> n_leaves_ns, n_risk_ns = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj, discard_neg_start=True
... )
>>> n_leaves_ns
array([[0, 1, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0]], dtype=uint32)
>>> n_risk_ns
array([[3, 3, 0, 0, 0, 0],
       [3, 3, 3, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0]], dtype=uint32)
>>> n_leaves_an, n_risk_an = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj, discard_all_neg=True
... )
>>> n_leaves_an
array([[0, 1, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0]], dtype=uint32)
>>> n_risk_an
array([[3, 3, 0, 0, 0, 0],
       [3, 3, 3, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0]], dtype=uint32)
>>> # With negative states, the discard options change the result.
>>> dtrj = np.array([[ 1, -2, -2,  3,  3,  3],
...                  [-2, -2,  3,  3,  3,  1],
...                  [ 3,  3,  3,  1, -2, -2],
...                  [ 1,  3,  3,  3, -2, -2],
...                  [ 1,  4,  4,  4,  4, -1]])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0]], dtype=uint32)
>>> n_risk
array([[3, 3, 3, 0, 0, 0],
       [1, 1, 0, 0, 0, 0],
       [2, 2, 0, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0]], dtype=uint32)
>>> n_leaves_ns, n_risk_ns = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj, discard_neg_start=True
... )
>>> n_leaves_ns
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0]], dtype=uint32)
>>> n_risk_ns
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [2, 2, 0, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0]], dtype=uint32)
>>> n_leaves_an, n_risk_an = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj, discard_all_neg=True
... )
>>> n_leaves_an
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_risk_an
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [2, 2, 0, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0]], dtype=uint32)
>>> dtrj = np.array([[ 1, -2, -2,  3,  3,  3],
...                  [-2, -2,  3,  3,  3,  1],
...                  [ 3,  3,  3,  1, -2, -2],
...                  [ 1,  3,  3,  3, -2, -2],
...                  [ 1,  4,  4,  4,  4, -1],
...                  [ 6,  6,  6,  6,  6,  6],
...                  [-6, -6, -6, -6, -6, -6]])
>>> n_leaves, n_risk = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj
... )
>>> n_leaves
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_risk
array([[0, 0, 0, 0, 0, 0],
       [3, 3, 3, 0, 0, 0],
       [1, 1, 0, 0, 0, 0],
       [2, 2, 0, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_leaves_ns, n_risk_ns = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj, discard_neg_start=True
... )
>>> n_leaves_ns
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_risk_ns
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [2, 2, 0, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_leaves_an, n_risk_an = mdt.dtrj.n_leaves_vs_time_discrete(
...     dtrj1=dtrj, dtrj2=dtrj, discard_all_neg=True
... )
>>> n_leaves_an
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0]], dtype=uint32)
>>> n_risk_an
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [2, 2, 0, 0, 0, 0],
       [3, 3, 3, 3, 0, 0],
       [1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0]], dtype=uint32)