Atom selection expressions
Some keywords in the cuby input work with selections of atoms. This section describes the syntax of the selection expressions. There are two modes of selection, which the program recognizes automatically.
Selections specify part of the system for various keywords. The script geometry can also use them to extract part of a system from a geometry file.
The first possibility is to select atoms by their index (the first atom in the geometry has index 1). Multiple atoms can be selected as a list:
1, 2, 5, 6, 7, 8
Note: if the spaces are omitted in this example, the YAML parser reads it as a single number, 125678. To avoid this, either use spaces as shown here or enclose the expression in quotes.
as a range:
or combination of both:
Note that all whitespace in the expressions is ignored.
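The index-based mode can be illustrated with a short Python sketch. It is not cuby's own parser: the function name parse_index_selection is hypothetical, and the sketch assumes that ranges are written with "-" and include both end points.

```python
def parse_index_selection(expr):
    """Expand an expression such as "1, 2, 5-8" into sorted 1-based atom indices."""
    # All whitespace is ignored, as stated in the text above.
    compact = "".join(expr.split())
    indices = set()
    for item in compact.split(","):
        if "-" in item:
            # Assumption: a range "a-b" includes both a and b.
            first, last = item.split("-")
            indices.update(range(int(first), int(last) + 1))
        else:
            indices.add(int(item))
    return sorted(indices)


print(parse_index_selection("1, 2, 5-8"))  # [1, 2, 5, 6, 7, 8]
```

The list and the range forms expand to the same set of indices here, so "1, 2, 5, 6, 7, 8" and "1, 2, 5-8" are interchangeable in this sketch.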
Sometimes it is easier to base the selection on other criteria. The advanced selection expressions can work with both atoms and residues, provided that the residue information exists in the geometry (that is, it was read from a file that contains it, such as PDB).
Selections can be joined with the logical operators OR, written as "|", and AND, written as "&". The operator | has higher priority than &, and parentheses can be used to override this priority and group the expressions arbitrarily. The expression
selection1 | selection2
will select all atoms that match any of the two selections, while
selection1 & selection2
will select only atoms that match both conditions. The expression
selection1 | selection2 & selection3
is equivalent to
(selection1 | selection2) & selection3
because of the operator priorities, while
selection1 | (selection2 & selection3)
yields a different result.
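The effect of the operator priorities can be checked with ordinary Python sets. This is only an illustration with made-up atom indices, not cuby code. Note that Python itself gives & higher priority than |, which is the opposite of the selection syntax, so the grouping is written out with explicit parentheses.

```python
# Made-up selections, each represented as a set of atom indices.
selection1 = {1, 2, 3}
selection2 = {3, 4, 5}
selection3 = {3, 5, 6}

# "selection1 | selection2 & selection3" in the selection syntax:
# the union is evaluated first, then the intersection.
default_grouping = (selection1 | selection2) & selection3

# "selection1 | (selection2 & selection3)": the parentheses change the result.
explicit_grouping = selection1 | (selection2 & selection3)

print(sorted(default_grouping))   # [3, 5]
print(sorted(explicit_grouping))  # [1, 2, 3, 5]
```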
The elementary selection in these expressions can take three forms:
- Simple list of atom indexes, as described above
- Advanced selection of atoms and residues
- Special selectors
Each advanced selection can contain two parts, one selecting residues and one selecting atoms, or only one of them. The residue selection starts with ":", the atom selection starts with "@".
In residue selection, it is possible to use either residue numbers (starting with 1), or residue names. The expression:
selects residues 1, 2, 5, 6 and 7 and all other residues named A.
In atom selection, it is possible to use atom indexes and element names. The expression:
selects the first ten atoms in the geometry and all hydrogens.
It is possible to negate the selection by adding character "~".
selects all atoms but hydrogens and oxygens. The negation operator negates the whole selection, so
is equivalent to the previous selection.
Finally, when both residue and atom selection are present, only atoms that match both are selected. The example
selects all hydrogens that do not belong to residues named WAT.
To summarize it, the example:
:1-5 | :6-10@~H | @Na
selects all atoms in residues 1-5, all non-hydrogen atoms in residues 6-10 and all sodium atoms.
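How a single residue and atom selection is matched against a geometry can be sketched in Python as follows. This is a simplified illustration, not cuby's implementation: the Atom record, the toy geometry and both function names are hypothetical, and the sketch assumes that "~" directly follows ":" or "@" and negates the whole list after it.

```python
from collections import namedtuple

# Hypothetical atom record; cuby's internal representation is not shown here.
Atom = namedtuple("Atom", "index element resno resname")

atoms = [
    Atom(1, "O", 1, "WAT"), Atom(2, "H", 1, "WAT"), Atom(3, "H", 1, "WAT"),
    Atom(4, "C", 2, "ALA"), Atom(5, "H", 2, "ALA"), Atom(6, "Na", 3, "NA"),
]


def matches(spec, number, name):
    """Test a comma-separated list of numbers, ranges and names."""
    negate = spec.startswith("~")  # assumption: "~" negates the whole list
    if negate:
        spec = spec[1:]
    hit = False
    for item in spec.split(","):
        if item.isdigit():
            hit = hit or number == int(item)
        elif "-" in item and item.replace("-", "").isdigit():
            first, last = item.split("-")
            hit = hit or int(first) <= number <= int(last)
        else:
            hit = hit or name == item
    return hit != negate


def select(expr, atoms):
    """Return indices of atoms matching one ':residues@atoms' expression."""
    expr = "".join(expr.split())  # whitespace is ignored
    residue_spec, _, atom_spec = expr.partition("@")
    residue_spec = residue_spec.lstrip(":")
    selected = []
    for atom in atoms:
        # A missing part places no restriction on the atom.
        ok_res = matches(residue_spec, atom.resno, atom.resname) if residue_spec else True
        ok_atom = matches(atom_spec, atom.index, atom.element) if atom_spec else True
        if ok_res and ok_atom:  # both parts must match
            selected.append(atom.index)
    return selected


print(select(":1-2@~H", atoms))  # [1, 4]
print(select("@Na", atoms))      # [6]
```

Joining such elementary selections with "|" then amounts to a union of the returned index lists, for example sorted(set(select(":1-2@~H", atoms)) | set(select("@Na", atoms))).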
Special selectors are functions that select atoms based on additional criteria. All selectors share the format %name(arguments). The available selectors are:
- %atomname(name) for selection by PDB atom name. Selection
will select all atoms named "C", but not atoms named "CA", "CB", "C1"... A list of names separated by "," can also be used.
- %pdb_no(selection) selects atoms by their PDB atom number; the selection is a comma-separated list of numbers or ranges (using '-')
- %coord(x|y|z >|<|= number) selects atoms by their Cartesian coordinates
- %same_residue(selection) selects whole residues containing the atoms in the selection
- %within(distance;selection) selects atoms within the specified distance from any atom in the selection
- %all() selects all atoms
- %molecule(molecule_num) selects a separate molecule with index molecule_num (starting with 1), as determined from connectivity
- %not(selection) inverts a complex selection
Some of the special selectors take a selection as their parameter. Nesting of functions is possible.
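The nesting of selectors can be pictured as ordinary function composition. The Python sketch below mimics %within and %same_residue on a made-up geometry. The data layout and function bodies are assumptions for illustration only; in particular, whether cuby's %within includes the reference atoms themselves, or atoms exactly at the limit distance, is not stated in the text.

```python
import math

# Hypothetical toy geometry: atom index -> (residue number, (x, y, z)).
geometry = {
    1: (1, (0.0, 0.0, 0.0)),
    2: (1, (1.0, 0.0, 0.0)),
    3: (2, (2.5, 0.0, 0.0)),
    4: (2, (6.0, 0.0, 0.0)),
}


def within(distance, selection):
    """Atoms no farther than `distance` from any atom in `selection`."""
    # Assumption: the reference atoms themselves (distance 0) are included.
    return sorted(
        i for i, (_, xyz) in geometry.items()
        if any(math.dist(xyz, geometry[j][1]) <= distance for j in selection)
    )


def same_residue(selection):
    """All atoms of every residue that contains an atom from `selection`."""
    residues = {geometry[i][0] for i in selection}
    return sorted(i for i, (res, _) in geometry.items() if res in residues)


print(within(2.0, [2]))                # [1, 2, 3]
# Nested call, analogous to nesting one selector inside another.
print(same_residue(within(2.0, [2])))  # [1, 2, 3, 4]
```

In the second call, atom 3 lies within the distance limit, so the whole of residue 2 is selected, including atom 4, which is far from the reference atom.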