
Atom selection expressions

Some keywords in the cuby input work with selections of atoms. This page describes the syntax of the selection expressions. There are two selection modes, which the program recognizes automatically.

Selections specify a part of the system for various keywords, and can also be used to extract a part of the system from a geometry file with the script [geometry].

!The simple way
The first possibility is to select atoms by their index (the first atom in the geometry has index 1). Multiple atoms can be selected as a list:
{{1, 2, 5, 6, 7, 8}}
''Note: if the spaces are omitted in this example, the YAML parser reads the value as a single number, 125678. To avoid this, either use spaces as shown here, or put the expression in quotes.''
as a range:
{{5-10}}
or combination of both:
{{1-5, 10-12}}

Note that the selection parser ignores all whitespace in the expressions.
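
The rules above (comma-separated items, ranges written with "-", whitespace ignored, indexes starting at 1) can be illustrated with a minimal Python sketch. This is not cuby's own code; the function name {{parse_simple_selection}} and its error handling are assumptions made for this example only.

```python
def parse_simple_selection(expr):
    """Expand a simple selection such as "1-5, 10-12" into a sorted
    list of 1-based atom indexes (illustrative sketch, not cuby code)."""
    # Whitespace carries no meaning in the selection syntax, so drop it.
    compact = "".join(expr.split())
    indexes = set()
    for item in compact.split(","):
        if "-" in item:
            # A range "a-b" includes both end points.
            first, last = item.split("-", 1)
            start, end = int(first), int(last)
            if start > end:
                raise ValueError("reversed range: " + item)
            indexes.update(range(start, end + 1))
        else:
            indexes.add(int(item))
    if any(i < 1 for i in indexes):
        raise ValueError("atom indexes start at 1")
    return sorted(indexes)


print(parse_simple_selection("1-5, 10-12"))
```

The call in the last line expands both ranges and merges them into one sorted list of indexes.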

!The powerful way
Sometimes it is easier to make the selection based on other criteria. The advanced selection expressions can work with both atoms and residues, provided the residue information exists in the geometry (that is, the geometry was read from a file containing it, such as PDB).

Several selections can be combined with the logical operators OR, written as "|", and AND ("&"). The operator | has higher priority than &, and parentheses can be used to override this priority and group the expressions in an arbitrary way. The expression
{{selection1 | selection2}}
will select all atoms that match any of the two selections, while
{{selection1 & selection2}}
will select only atoms that match both conditions. The expression
{{selection1 | selection2 & selection3}}
is equivalent to
{{(selection1 | selection2) & selection3}}
because of the operator priorities, while
{{selection1 | (selection2 & selection3)}}
will yield a different result.
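
The effect of the operator priorities can be checked with a small Python sketch that evaluates such expressions on sets of atom indexes. It is only an illustration under stated assumptions: the function {{evaluate}} and the dictionary of named sets are hypothetical, and the elementary selections are represented by plain names rather than by real selection syntax.

```python
import re


def evaluate(expr, named_sets):
    """Evaluate an expression of names, "|", "&" and parentheses on sets.
    "|" binds more tightly than "&", as described above (sketch only)."""
    tokens = re.findall(r"[A-Za-z_]\w*|[|&()]", expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        return token

    def primary():
        token = take()
        if token == "(":
            result = and_level()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return result
        if token in "|&)":
            raise ValueError("unexpected token: " + token)
        return set(named_sets[token])  # copy, so inputs are never modified

    def or_level():
        # Higher priority: all "|" are applied before any "&".
        result = primary()
        while peek() == "|":
            take()
            result |= primary()
        return result

    def and_level():
        # Lower priority: "&" combines complete "|" groups.
        result = or_level()
        while peek() == "&":
            take()
            result &= or_level()
        return result

    result = and_level()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result


sets = {"selection1": {1, 2}, "selection2": {2, 3}, "selection3": {3, 4}}
print(evaluate("selection1 | selection2 & selection3", sets))
print(evaluate("selection1 | (selection2 & selection3)", sets))
```

With these example sets, the first expression keeps only atom 3, while the parenthesized variant keeps atoms 1, 2 and 3.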

The elementary selections in these expressions can have three forms:
#'''Simple list of atom indexes''', as described above
#'''Advanced selection of atoms and residues'''
#'''Special selectors'''

!Advanced selection of atoms and residues
Each of these expressions can contain two parts, one selecting residues and one selecting atoms, or only one of them. The residue selection starts with ":", the atom selection starts with "@".

In the residue selection, it is possible to use either residue numbers (starting with 1) or residue names. The expression:
{{:1,2,5-7,A}}
selects residues 1, 2, 5, 6, 7 and all residues named A.

In the atom selection, it is possible to use atom indexes and element names. The expression:
{{@1-10,H}}
selects the first ten atoms in the geometry and all hydrogens.

A selection can be negated by adding the character "~".
{{@~H,O}}
selects all atoms except hydrogens and oxygens. The negation operator negates the whole selection, so
{{@H,~O}} is equivalent to the previous selection.

Finally, when both a residue selection and an atom selection are present, only atoms that match both are selected. The example
{{:~WAT@H}}
selects all hydrogens that do not belong to residues named WAT.
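
The matching rules for a single elementary expression (without "|" or "&") can be sketched in Python as follows. This is an illustration, not cuby's implementation. The atom records are hypothetical dictionaries with the keys {{index}}, {{element}}, {{resnum}} and {{resname}}, and the sketch assumes that names are compared exactly, including letter case.

```python
def _item_matches(item, number, name):
    """Compare one comma-separated item with a number or a name."""
    if item.isdigit():
        return number == int(item)
    first, dash, last = item.partition("-")
    if dash and first.isdigit() and last.isdigit():
        return int(first) <= number <= int(last)
    return item == name  # assumption: exact, case-sensitive comparison


def _part_selects(part, number, name):
    """Apply one part (residue or atom); "~" anywhere negates all of it."""
    negate = "~" in part
    items = [i for i in part.replace("~", "").split(",") if i]
    matched = any(_item_matches(i, number, name) for i in items)
    return matched != negate


def select(expr, atoms):
    """Return indexes of atoms matching ":residues@atoms" (sketch only)."""
    compact = "".join(expr.split())
    residue_part, _, atom_part = compact.partition("@")
    if residue_part and not residue_part.startswith(":"):
        raise ValueError('residue selection must start with ":"')
    selected = []
    for atom in atoms:
        keep = True
        if residue_part:
            keep = _part_selects(residue_part[1:], atom["resnum"], atom["resname"])
        if keep and atom_part:
            keep = _part_selects(atom_part, atom["index"], atom["element"])
        if keep:
            selected.append(atom["index"])
    return selected


atoms = [
    {"index": 1, "element": "O", "resnum": 1, "resname": "WAT"},
    {"index": 2, "element": "H", "resnum": 1, "resname": "WAT"},
    {"index": 3, "element": "H", "resnum": 1, "resname": "WAT"},
    {"index": 4, "element": "C", "resnum": 2, "resname": "ALA"},
    {"index": 5, "element": "H", "resnum": 2, "resname": "ALA"},
    {"index": 6, "element": "Na", "resnum": 3, "resname": "NA"},
]
print(select(":~WAT@H", atoms))
```

For this small example geometry, the only hydrogen outside the WAT residue is atom 5.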

To summarize, the example:
{{:1-5 | :6-10@~H | @Na}}
selects all atoms in residues 1-5, all non-hydrogen atoms in residues 6-10, and all sodium atoms.

!Special selectors
These functions allow selection of atoms based on additional criteria. The selectors have a common format ''%name(arguments)''. Available selectors are:
* '''%atomname(name)''' for selection by PDB atom name. The selection {{%atomname(C)}} will select all atoms named "C", but not atoms "CA", "CB", "C1"... A list of names separated by "," can also be used.
* '''%pdb_no(selection)''' selects atoms by their PDB atom number; the selection is a comma-separated list of numbers or ranges (using '-')
* '''%coord(x|y|z >|<|= ''number'')''' selects atoms by their Cartesian coordinates
* '''%same_residue(''selection'')''' selects whole residues containing the atoms in the selection
* '''%within(''distance'';''selection'')''' selects atoms within the specified distance from any atom in the selection
* '''%all()''' selects all atoms
* '''%molecule(''molecule_num'')''' selects a separate molecule with index ''molecule_num'' (starting with 1), as determined from connectivity
* '''%not(''selection'')''' inverts a complex selection
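
As an illustration of how a selector argument might be interpreted, the following Python sketch handles the argument of {{%coord}}: one axis, one comparison operator and one number. It is a sketch under assumptions, not cuby's code. The function name {{coord_selector}} and the coordinate dictionary (atom index mapped to an x, y, z tuple) are hypothetical.

```python
import operator
import re

# The three comparison operators listed for %coord.
_COMPARE = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def coord_selector(argument, coords):
    """Select atom indexes whose coordinate satisfies e.g. "x > 1.5".
    coords maps atom index -> (x, y, z); illustrative sketch only."""
    match = re.fullmatch(r"\s*([xyz])\s*([<>=])\s*(\S+)\s*", argument)
    if match is None:
        raise ValueError("cannot parse coordinate condition: " + argument)
    axis = "xyz".index(match.group(1))   # 0, 1 or 2
    compare = _COMPARE[match.group(2)]
    threshold = float(match.group(3))
    return sorted(i for i, xyz in coords.items() if compare(xyz[axis], threshold))


coords = {1: (0.0, 0.0, 0.0), 2: (2.0, 0.0, 0.0), 3: (5.0, 1.0, -1.0)}
print(coord_selector("x > 1.5", coords))
```

Exact comparison with "=" on floating-point coordinates is rarely useful in practice; the sketch only mirrors the three operators listed above.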

!! Nesting special selectors
Some of the special selectors take a selection as their parameter, so the selectors can be nested.
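
Nesting can be pictured as ordinary function composition: the inner selector yields a set of atoms, which the outer selector takes as its argument. The Python sketch below mimics an expression of the form {{%same_residue(%within(1.5;@1))}}, built here only from the forms documented above. The functions {{within}} and {{same_residue}} and their data structures are hypothetical, and the sketch assumes that the distance limit is inclusive and that the reference atoms themselves are part of the result.

```python
import math


def within(distance, selection, coords):
    """Atoms no farther than `distance` from any atom in `selection`.
    Assumption: the limit is inclusive and reference atoms are included."""
    reference = [coords[i] for i in selection]
    return {
        i for i, xyz in coords.items()
        if any(math.dist(xyz, ref) <= distance for ref in reference)
    }


def same_residue(selection, residue_of):
    """All atoms of every residue that contains a selected atom."""
    residues = {residue_of[i] for i in selection}
    return {i for i, res in residue_of.items() if res in residues}


# Hypothetical geometry: atom index -> coordinates, atom index -> residue number.
coords = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (4.0, 0.0, 0.0), 4: (10.0, 0.0, 0.0)}
residue_of = {1: 1, 2: 2, 3: 2, 4: 3}

# Inner selector first, then the outer one, as in the nested expression.
near = within(1.5, {1}, coords)
print(sorted(same_residue(near, residue_of)))
```

Here atom 2 lies within 1.5 of atom 1, so the whole of residue 2 (atoms 2 and 3) is selected together with residue 1.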