remain_prob

mdtools.dtrj.remain_prob(dtrj, restart=1, continuous=False, discard_neg_start=False, discard_all_neg=False, verbose=False)[source]

Calculate the probability that a compound is in the same state as at time \(t_0\) after a lag time \(\Delta t\)

Take a discrete trajectory and calculate the probability to find a compound in the same state as at time \(t_0\) after a lag time \(\Delta t\).

Parameters:
  • dtrj (array_like) – The discrete trajectory. Array of shape (n, f), where n is the number of compounds and f is the number of frames. The shape can also be (f,), in which case the array is expanded to shape (1, f). The elements of dtrj are interpreted as the indices of the states in which a given compound is at a given frame.

  • restart (int, optional) – Restart every restart frames. Determines how many restarting points \(t_0\) are used for averaging.

  • continuous (bool, optional) – If True, compounds must continuously be in the same state without interruption in order to be counted (see notes).

  • discard_neg_start (bool, optional) – If True, discard all transitions starting from a negative state (see notes). Must not be used together with discard_all_neg.

  • discard_all_neg (bool, optional) – If True, discard all negative states (see notes). Must not be used together with discard_neg_start.

  • verbose (bool, optional) – If True print a progress bar.

Returns:

prop (numpy.ndarray) – Array of shape (f,) containing for all possible lag times the probability that a compound is in the same state as at time \(t_0\) after a lag time \(\Delta t\).

See also

mdtools.dtrj.remain_prob_discrete()

Calculate the probability that a compound is in the same state as at time \(t_0\) after a lag time \(\Delta t\) resolved with respect to the states in second discrete trajectory.

Notes

If you want to compute the probability to stay in the same state for each individual state in a discrete trajectory, use mdtools.dtrj.remain_prob_discrete() and parse the same discrete trajectory to dtrj1 and dtrj2.

Continuous and Discontinuous Probability

If continuous is False, this function calculates the probability to be still or again in the same state as at time \(t_0\) after a lag \(\Delta t\):

\[p(\xi(t_0 + \Delta t) \in S | \xi(t_0) \in S) = \left\langle \frac{N(t_0, t_0 + \Delta t)}{N(t_0)} \right\rangle_t\]

Here, \(\xi(t)\) is the discretized coordinate and \(S\) is a given valid discrete state. \(N(t_0)\) is the number of compounds that are in a valid state at time \(t_0\) and \(N(t_0, t_0 + \Delta t)\) is the number of compounds that are at time \(t_0 + \Delta t\) still or again in the same valid state as at time \(t_0\). The brackets \(\langle ... \rangle_t\) denote averaging over all given restarting times \(t_0\).

If continuous is True, this function calculates the probability to be still in the same state as at time \(t_0\) after a lag time \(\Delta t\):

\[p(\xi(t) \in S ~\forall t \in [t_0, t_0 + \Delta t]) = \left\langle \frac{N(\{t \in [t_0, t_0 + \Delta t]\})}{N(t_0)} \right\rangle_t\]

\(N(\{t \in [t_0, t_0 + \Delta t]\})\) is the number of compounds that continuously stay in their initial valid state for all times from \(t_0\) to \(t_0 + \Delta t\).

Valid and Invalid States

By default, all states in the given discrete trajectory are valid. However, in some cases you might want to treat certain states as invalid, e.g. because you only want to consider specific states. This can be achieved with the arguments discard_neg_start or discard_all_neg which treat negative states (i.e. states with a negative state number/index) as invalid states.

Note

If you want to treat states with index zero as invalid, too, simply subtract one from dtrj. If you want to treat all states below a certain cutoff as invalid, subtract this cutoff from dtrj. If you want to treat certain states as invalid, make these states negative, e.g. by multiplying these states with minus one.

The arguments discard_neg_start and discard_all_neg affect the counting of compounds in a given state, i.e. they affect \(N(t_0)\), \(N(t_0, t_0 + \Delta t)\) and \(N(\{t \in [t_0, t_0 + \Delta t]\})\). In general, \(N\) is the sum of compounds that are in a valid state: \(N = \sum_i N_i\). \(N_i\) is \(1\) if compound \(i\) is in a valid state and \(0\) otherwise. discard_neg_start and discard_all_neg now affect when a state is considered valid.

If both, discard_neg_start and discard_all_neg, are False (the default), all states are valid.

\[N_i(t_0) = 1 ~\forall S_i(t_0)\]
\[\begin{split}N_i(t_0, t_0 + \Delta t) = \begin{cases} 1 & S_i(t_0 + \Delta t) = S_i(t_0) \\ 0 & \text{otherwise} \end{cases}\end{split}\]
\[\begin{split}N_i(\{t \in [t_0, t_0 + \Delta t]\}) = \begin{cases} 1 & S_i(t) = S_i(t_0) ~\forall t \in [t_0, t_0 + \Delta t]\\ 0 & \text{otherwise} \end{cases}\end{split}\]

Here, \(S_i(t)\) is the state of compound \(i\) at time \(t\).

If discard_neg_start is True, transitions starting from negative states are discarded:

\[\begin{split}N_i(t_0) = \begin{cases} 1 & S_i(t_0) \geq 0 \\ 0 & S_i(t_0) < 0 \end{cases}\end{split}\]
\[\begin{split}N_i(t_0, t_0 + \Delta t) = \begin{cases} 1 & S_i(t_0) \geq 0 \text{ and } \\ & S_i(t_0 + \Delta t) = S_i(t_0) \\ 0 & \text{otherwise} \end{cases}\end{split}\]
\[\begin{split}N_i(\{t \in [t_0, t_0 + \Delta t]\}) = \begin{cases} 1 & S_i(t_0) \geq 0 \text{ and } \\ & S_i(t) = S_i(t_0) ~\forall t \in [t_0, t_0 + \Delta t]\\ 0 & \text{otherwise} \end{cases}\end{split}\]

Thus, the resulting probability does not contain the probability to stay in a negative state. However, transitions from positive to negative states are respected. Hence, the probability to stay in a positive state is decreased if a transition to a negative states occurs (in addition to the decrease caused by transitions to other positive states).

If discard_all_neg is True, all negative states are discarded:

\[\begin{split}N_i(t_0) = \begin{cases} 1 & S_i(t) \geq 0 ~\forall t \in [t_0, t_0 + \Delta t] \\ 0 & \text{otherwise} \end{cases}\end{split}\]
\[\begin{split}N_i(t_0, t_0 + \Delta t) = \begin{cases} 1 & S_i(t) \geq 0 ~\forall t \in [t_0, t_0 + \Delta t] \text{ and } \\ & S_i(t_0 + \Delta t) = S_i(t_0) \\ 0 & \text{otherwise} \end{cases}\end{split}\]
\[\begin{split}N_i(\{t \in [t_0, t_0 + \Delta t]\}) = \begin{cases} 1 & S_i(t) \geq 0 ~\forall t \in [t_0, t_0 + \Delta t] \text{ and } \\ & S_i(t) = S_i(t_0) ~\forall t \in [t_0, t_0 + \Delta t]\\ 0 & \text{otherwise} \end{cases}\end{split}\]

This means transitions from or to negative states are completely discarded. Thus, the resulting probability does neither contain the probability to stay in a negative state nor is it decreased by transitions from positive to negative states.

Lifetimes

The calculated probabilities can be used to calculate the average lifetime of the valid states, i.e. how long a compounds resides on average in a valid state. A common approach is to fit the probability with a stretched exponential function \(f(t)\) (e.g. using mdtools.functions.fit_kww()) and afterwards calculating the area below the fitted curve:[1],[2],[3],[4]

\[f(t) = \exp{\left[ -\left(\frac{t}{\tau_0}\right)^\beta \right]}\]
\[\tau = \int_0^\infty f(t) \text{ d}t = \frac{\tau_0}{\beta} \Gamma\left(\frac{1}{\beta}\right)\]

Here, \(\tau_0\) and \(\beta\) are the fit parameters, \(\tau\) is the average lifetime and \(\Gamma(x)\) is the gamma function. For physically meaningful results, \(\beta\) should be confined to \(0 < \beta \leq 1\). For purely exponential decay (\(\beta = 1\)), \(\tau = \tau_0\) applies. If the calculated probability fully decays to zero within the accessible range of lag times, one can alternatively numerically integrate the calculated probability directly.

References

Examples

>>> dtrj = np.array([[2, 2, 3, 3, 3]])
>>> mdt.dtrj.remain_prob(dtrj)
array([1.        , 0.75      , 0.33333333, 0.        , 0.        ])
>>> mdt.dtrj.remain_prob(dtrj, continuous=True)
array([1.        , 0.75      , 0.33333333, 0.        , 0.        ])
>>> dtrj = np.array([[1, 3, 3, 3, 1]])
>>> mdt.dtrj.remain_prob(dtrj)
array([1.        , 0.5       , 0.33333333, 0.        , 1.        ])
>>> mdt.dtrj.remain_prob(dtrj, continuous=True)
array([1.        , 0.5       , 0.33333333, 0.        , 0.        ])

The following examples were not checked to be mathematically correct!

>>> dtrj = np.array([[ 1, -2, -2,  3,  3,  3],
...                  [-2, -2,  3,  3,  3,  1],
...                  [ 3,  3,  3,  1, -2, -2],
...                  [ 1,  3,  3,  3, -2, -2],
...                  [ 1,  4,  4,  4,  4, -1]])
>>> mdt.dtrj.remain_prob(dtrj)
array([1.        , 0.6       , 0.3       , 0.06666667, 0.        ,
       0.        ])
>>> mdt.dtrj.remain_prob(dtrj, restart=3)
array([1. , 0.5, 0.2, 0. , 0. , 0. ])
>>> mdt.dtrj.remain_prob(dtrj, continuous=True)
array([1.        , 0.6       , 0.3       , 0.06666667, 0.        ,
       0.        ])
>>> mdt.dtrj.remain_prob(dtrj, discard_neg_start=True)
array([1.        , 0.57894737, 0.375     , 0.09090909, 0.        ,
       0.        ])
>>> mdt.dtrj.remain_prob(
...     dtrj, discard_neg_start=True, continuous=True
... )
array([1.        , 0.57894737, 0.375     , 0.09090909, 0.        ,
       0.        ])
>>> mdt.dtrj.remain_prob(dtrj, discard_all_neg=True)
array([1.        , 0.73333333, 0.6       , 0.2       , 0.        ,
              nan])
>>> mdt.dtrj.remain_prob(
...     dtrj, discard_all_neg=True, continuous=True
... )
array([1.        , 0.73333333, 0.6       , 0.2       , 0.        ,
              nan])
>>> dtrj = dtrj.T
>>> mdt.dtrj.remain_prob(dtrj)
array([1.        , 0.375     , 0.11111111, 0.16666667, 0.16666667])
>>> mdt.dtrj.remain_prob(dtrj, restart=2)
array([1.        , 0.58333333, 0.        , 0.33333333, 0.16666667])
>>> mdt.dtrj.remain_prob(dtrj, continuous=True)
array([1.        , 0.375     , 0.05555556, 0.        , 0.        ])
>>> mdt.dtrj.remain_prob(dtrj, discard_neg_start=True)
array([1.        , 0.375     , 0.16666667, 0.25      , 0.25      ])
>>> mdt.dtrj.remain_prob(
...     dtrj, discard_neg_start=True, continuous=True
... )
array([1.        , 0.375     , 0.08333333, 0.        , 0.        ])
>>> mdt.dtrj.remain_prob(dtrj, discard_all_neg=True)
array([1.        , 0.46153846, 0.28571429, 0.33333333, 0.        ])
>>> mdt.dtrj.remain_prob(
...     dtrj, discard_all_neg=True, continuous=True
... )
array([1.        , 0.46153846, 0.14285714, 0.        , 0.        ])