discrete_pos_trj

mdtools.structure.discrete_pos_trj(sel, trj=None, topfile=None, trjfile=None, begin=0, end=-1, every=1, compound='atoms', direction='z', bin_start=0, bin_stop=None, bin_num=10, bins=None, tol=1e-6, return_bins=False, return_lbox=False, return_dt=False, dtype=int, verbose=True, debug=False, **sel_kwargs)

Create a discrete position trajectory.

Discretize the positions of compounds of an MDAnalysis AtomGroup in a given spatial direction.

Todo

Allow choosing between center of mass and center of geometry.

Parameters:
  • sel (MDAnalysis.core.groups.AtomGroup or str) – ‘Selection group’: Either an MDAnalysis AtomGroup (if trj is not None) or a selection string for creating an MDAnalysis AtomGroup (if trj is None). See MDAnalysis’ selection syntax for possible choices of selection strings.

  • trj (MDAnalysis.coordinates.base.ReaderBase or MDAnalysis.coordinates.base.FrameIteratorBase, optional) – MDAnalysis trajectory to read. If None, a new MDAnalysis Universe and trajectory are created from topfile and trjfile.

  • topfile (str, optional) – Topology file. See supported topology formats of MDAnalysis. Ignored if trj is given.

  • trjfile (str, optional) – Trajectory file. See supported coordinate formats of MDAnalysis. Ignored if trj is given.

  • begin (int, optional) – First frame to read from a newly created trajectory. Frame numbering starts at zero. Ignored if trj is given. If you want to use only specific frames from an already existing trajectory, slice the existing trajectory accordingly and pass it as a MDAnalysis.coordinates.base.FrameIteratorSliced object to the trj argument.

  • end (int, optional) – Last frame to read from a newly created trajectory. This is exclusive, i.e. the last frame read is actually end - 1. A value of -1 means to read the very last frame. Ignored if trj is given.

  • every (int, optional) – Read every n-th frame from the newly created trajectory. Ignored if trj is given.

  • compound ({'atoms', 'group', 'segments', 'residues', 'fragments'}, optional) – The compounds of the selection group whose center of mass positions should be discretized. If 'group', the center of mass of all Atoms in the selection group will be used. Else, the center of mass positions of each Segment, Residue, fragment or Atom contained in the selection group will be discretized. Refer to the MDAnalysis user guide for an explanation of these terms. Compounds are made whole before calculating their centers of mass. The centers of mass are wrapped back into the primary unit cell before discretizing their positions. Note that in any case, even if compound is e.g. 'residues', only the Atoms belonging to the selection group are taken into account, even if the compound might comprise additional Atoms that are not contained in the selection group.

  • direction ({'z', 'y', 'x'}, optional) – The spatial direction along which the discretization is done.

  • bin_start (scalar, optional) – Point (in Angstrom) on the chosen spatial direction to start binning. Note that binning naturally starts at zero (origin of the simulation box). If you pass a start value greater than zero, the first bin interval will be [0, bin_start). In this way you can determine the width of the first bin independently from the other bins. Note that bin_start must lie within the simulation box obtained from the first frame read and it must be smaller than bin_stop.

  • bin_stop (scalar, optional) – Point (in Angstrom) on the chosen spatial direction to stop binning. Note that binning naturally ends at lbox + tol, where lbox is the length of the simulation box in the given spatial direction and tol is a small tolerance to account for the right-open bin interval. If you pass a value less than lbox, the last bin interval will be [bin_stop, lbox + tol). In this way you can determine the width of the last bin independently from the other bins. Note that bin_stop must lie within the simulation box obtained from the first frame read and it must be greater than bin_start. If None, bin_stop is set to lbox + tol.

  • bin_num (int, optional) – Number of equidistant bins (not bin edges!) to use for discretizing the given spatial direction between bin_start and bin_stop. Note that up to two additional bins are created: [0, bin_start) if bin_start is not zero, and [bin_stop, lbox + tol) if bin_stop is less than lbox.

  • bins (array_like, optional) – Array of custom bin edges. Bins do not need to be equidistant. All bin edges must lie within the simulation box as obtained from the first frame read. The given bin edges are sorted in ascending order and duplicate bin edges are removed. If bins is given, it takes precedence over all other bin arguments.

  • tol (scalar, optional) – The tolerance value added to lbox to account for the right-open bin interval of the last bin.

  • return_bins (bool, optional) – If True, return the bin edges used for the discretization.

  • return_lbox (bool, optional) – If True, return the average box length in the given spatial direction.

  • return_dt (bool, optional) – If True, return the time step of the created discrete trajectory in ps. Attention: If trj is not None, the time step is simply taken from the first frame of the input trajectory. If trj is None, the returned time step is the time step of the first frame of the newly created trajectory times every.

  • dtype (dtype, optional) – Data type of the returned discrete trajectory.

  • verbose (bool, optional) – If True, print progress information to standard output.

  • debug (bool, optional) – If True, run in debug mode.

  • sel_kwargs (dict, optional) – Additional keyword arguments to pass to MDAnalysis.core.universe.Universe.select_atoms() besides the selection string given by sel. If you pass keywords to create an UpdatingAtomGroup, the number of compounds in this UpdatingAtomGroup must stay constant. Ignored if trj is given.
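
The interplay of bin_start, bin_stop, bin_num and tol described in the parameter list can be illustrated with plain NumPy. This is only a sketch of the documented behavior under stated assumptions, not the actual mdtools implementation: the helper name make_bin_edges and the example box length are hypothetical.

```python
import numpy as np


def make_bin_edges(lbox, bin_start=0, bin_stop=None, bin_num=10, tol=1e-6):
    """Hypothetical helper: assemble bin edges as the parameter list describes."""
    upper = lbox + tol  # Binning naturally ends at lbox + tol.
    if bin_stop is None:
        bin_stop = upper
    if not 0 <= bin_start < bin_stop <= upper:
        raise ValueError("Need 0 <= bin_start < bin_stop <= lbox + tol")
    # bin_num equidistant bins require bin_num + 1 edges.
    edges = np.linspace(bin_start, bin_stop, bin_num + 1)
    # Enclose with the natural limits; np.unique sorts and drops duplicates.
    return np.unique(np.concatenate(([0.0], edges, [upper])))


# Three equidistant bins between 10 and 40 Angstrom in a 50 Angstrom box,
# plus the two additional bins [0, 10) and [40, 50 + tol).
edges = make_bin_edges(lbox=50.0, bin_start=10, bin_stop=40, bin_num=3)
default_edges = make_bin_edges(lbox=50.0)
```

With these inputs the sketch yields five bins in total, whereas the default arguments yield exactly bin_num bins because no additional bins are needed.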

Returns:

  • dtrj (numpy.ndarray) – Array of shape (n, f) containing for each selection compound for each frame the index of the position bin in which the selection compound currently resides. n is the number of selection compounds, f is the number of frames.

  • bins (numpy.ndarray) – The bin edges used for the discretization. Only returned if return_bins is True.

  • lbox_av (scalar) – Average box length in the given spatial direction. Only returned if return_lbox is True.

  • dt (scalar) – Time step of the discrete position trajectory in ps. Only returned if return_dt is True.
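
As a brief illustration of the returned array layout, the following sketch counts how many compounds occupy each bin in every frame. The arrays are made-up stand-ins for dtrj and bins, not output from a real trajectory.

```python
import numpy as np

# Made-up stand-in for dtrj: 3 compounds (rows) x 4 frames (columns).
dtrj = np.array([[0, 0, 1, 2],
                 [2, 2, 2, 1],
                 [1, 0, 0, 0]])
# Made-up stand-in for the bin edges (3 bins -> 4 edges).
bins = np.array([0.0, 10.0, 20.0, 30.000001])

n_bins = len(bins) - 1
# occupancy[f, b] = number of compounds residing in bin b in frame f.
occupancy = np.stack(
    [np.bincount(frame, minlength=n_bins) for frame in dtrj.T]
)
```

Each row of occupancy sums to the number of compounds, because every compound resides in exactly one bin per frame.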

See also

discrete_pos

Script to create a discrete position trajectory

numpy.digitize()

Return the indices of the bins to which each value in the input array belongs

Notes

The simulation box must be orthogonal.

Compounds are assigned to bins according to their center of mass position. Compounds are made whole before calculating their centers of mass. The centers of mass are wrapped back into the primary unit cell before discretizing their positions.
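
For an orthogonal box, wrapping a coordinate back into the primary unit cell amounts to a modulo operation along each box direction. A minimal NumPy sketch with invented coordinates follows; it only illustrates the idea and is not meant to reflect how mdtools or MDAnalysis implement it.

```python
import numpy as np

lbox = 40.0  # Hypothetical box length along the chosen direction (Angstrom).
# Hypothetical center-of-mass coordinates, some of them outside [0, lbox).
com = np.array([-5.0, 12.5, 40.0, 87.0])

# np.mod maps every value into the half-open interval [0, lbox).
com_wrapped = np.mod(com, lbox)
```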

The discretization of the compounds’ positions is done in relative box coordinates. The final output is scaled by the average box length in the given spatial direction. Doing so accounts for possible fluctuations of the simulation box (e.g. due to pressure scaling). Note that MDAnalysis always sets the origin of the simulation box to the origin of the Cartesian coordinate system.
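
The idea of discretizing in relative box coordinates can be sketched as follows. All numbers are invented, and the way the tolerance is attached to the last relative bin edge is an assumption made for illustration only.

```python
import numpy as np

# Hypothetical data: box length per frame and one compound's position per frame.
lbox = np.array([40.0, 41.0, 39.0])  # Fluctuating box length (Angstrom).
pos = np.array([4.0, 32.8, 23.4])    # Wrapped positions (Angstrom).

# Bin edges in relative box coordinates (assumed tolerance on the last edge).
tol = 1e-6
bins_rel = np.array([0.0, 0.25, 0.5, 0.75, 1.0 + tol])

# Divide each position by the box length of its own frame, then discretize.
pos_rel = pos / lbox
dtrj = np.digitize(pos_rel, bins_rel) - 1  # Shift so the first bin has index 0.

# Scale the relative bin edges by the average box length for the final output.
lbox_av = lbox.mean()
bins = bins_rel * lbox_av
```

Because each position is divided by the box length of its own frame, a compound keeps its bin when the box merely expands or shrinks uniformly.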

All bin intervals are left-closed and right-open, i.e. [a, b) -> a <= x < b. The first bin edge is always zero. The last bin edge is always the (average) box length in the chosen spatial direction (i.e. 1 in relative box coordinates) plus a small tolerance to account for the right-open bin interval. Thus, the number of bin edges is len(bins), the number of bins is len(bins) - 1 and bins[1:] - np.diff(bins) / 2 yields the bin centers.
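
The relations between bin edges, bin count and bin centers stated above can be checked with a short NumPy sketch. The bin edges are hypothetical values of the kind that return_bins=True might yield.

```python
import numpy as np

# Hypothetical bin edges; the last edge is the box length plus a tolerance.
bins = np.array([0.0, 10.0, 20.0, 30.0, 40.000001])

n_edges = len(bins)
n_bins = len(bins) - 1
widths = np.diff(bins)
centers = bins[1:] - widths / 2  # Right edge minus half the bin width.
```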

The bin indices in the returned discretized trajectory start at zero. This is different from the output of numpy.digitize(), where the index of the first bin is one.
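
The index offset relative to numpy.digitize() can be seen in a small sketch with hypothetical bin edges and positions:

```python
import numpy as np

bins = np.array([0.0, 10.0, 20.0, 30.000001])  # Hypothetical bin edges.
x = np.array([0.0, 9.99, 10.0, 29.5])          # Hypothetical positions.

ix_numpy = np.digitize(x, bins)  # numpy.digitize: first bin has index 1.
ix_zero_based = ix_numpy - 1     # Zero-based indices, as in dtrj.
```

Note that the value 10.0 lands in the second bin, consistent with the left-closed, right-open bin intervals.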