Appendix - Scaling Keywords

This is a listing of all of the Scalepack keywords and their modifiers, if any:

ADD                                         NO MERGE
ADD PARTIALS                                    include partials
    all                                         include no partials
ANOMALOUS                                       original index
B RESTRAIN                                  NO PROFILE TEST
BACKGROUND CORRECTION                       NUMBER OF ITERATIONS
DEFAULT B FACTOR                            NUMBER OF ZONES
DEFAULT SCALE                               ORIENTATION AXIS 1
DO NOT REJECT OUTLIERS                      ORIENTATION AXIS 2
ERROR SCALE FACTOR                          ORIGINAL WINDOW
ESTIMATED ERROR                             OUTPUT ANOMALOUS
EXTEND PARTIALS / DO NOT EXTEND PARTIALS    POLARIZATION
FILE                                        POSTREFINE
FIT                                         PRINT
    crystal                                     total chi2
    batch                                       single chi2
    film                                        correlation / no correlation
FIT B                                           solution / no solution
FIX B                                           shifts / no shifts
FIXED WINDOW                                PROFILE TEST
FORMAT                                      PROFILES FITTED
    denzo_ip                                PROFILES SUMMED
    denzo_york1                             RECSQ
    scalepack                               REFERENCE BATCH
    rigaku raxis                            REFERENCE BATCHES, FILM, FILMS
    xds profit.hkl                          REJECT HKL
    madnes procor intout                    REJECT OUTLIERS
    madnes procor ascii                     REJECTION PROBABILITY
    xengen urf                              RESOLUTION
    archive                                 RESOLUTION STEP
FRAME WIDTH                                 SCALE ANOMALOUS
HKL MATRIX                                  SCALE RESTRAIN
HKL SCALE                                   SECTOR
HKL SHIFT                                   SECTOR WIDTH
IGNORE OVERLOADS                            SIGMA CUTOFF
INCLUDE OVERLOADS                           SPACE GROUP
INITIAL B FACTOR                            SPINDLE AXIS
INITIAL SCALE                               UNIT CELL
INPUT / @                                   VERTICAL AXIS
MERGE                                       WRITE BADDIES
MOSAICITY                                   WRITE REJECTION FILE
NO ANOMALOUS

 

 


 

ADD

Adds a constant to the batch number of every batch read from this point on, until another ADD command is read. Useful for making unique batch numbers from two or more files that contain the same batch numbers. For example, the denzo_york1 format embeds the batch number in the .x file.

format                  ADD value     

default                 nothing added

example               ADD 1000    This will add 1000 to each batch number.

 

 

ADD PARTIALS

Tells the program to add partially recorded reflections among consecutive batches, even if the batches do not have consecutive numbering. Essentially obligatory.  

modifier              all        partials are added over all consecutive batches

format                  ADD PARTIALS start1 to end1 start2 to end2 ... etc

default                 do not ADD PARTIALS

example              ADD PARTIALS 0 to 49 51 to 99 100 to 149

Be sure that the ranges of numbers do not overlap.
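
With many ranges on one line it is easy to let two of them overlap. The following is a minimal Python sketch, not part of Scalepack, that parses the arguments of an ADD PARTIALS line and reports any overlapping ranges. The function names parse_ranges and find_overlaps are hypothetical, and the sketch assumes the arguments are written strictly as "start to end" groups.

```python
def parse_ranges(args):
    """Parse 'start to end start to end ...' into a list of (start, end) tuples."""
    tokens = args.split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected groups of 'start to end'")
    ranges = []
    for i in range(0, len(tokens), 3):
        start, word, end = tokens[i:i + 3]
        if word.lower() != "to":
            raise ValueError("expected 'to', got %r" % word)
        start, end = int(start), int(end)
        if start > end:
            raise ValueError("range %d to %d runs backwards" % (start, end))
        ranges.append((start, end))
    return ranges


def find_overlaps(ranges):
    """Return pairs of ranges that share at least one batch number."""
    ordered = sorted(ranges)
    if not ordered:
        return []
    overlaps = []
    widest = ordered[0]  # range seen so far that reaches the highest batch
    for cur in ordered[1:]:
        if cur[0] <= widest[1]:
            overlaps.append((widest, cur))
        if cur[1] > widest[1]:
            widest = cur
    return overlaps


ok = parse_ranges("0 to 49 51 to 99 100 to 149")
bad = parse_ranges("1 to 10 5 to 20")
print(find_overlaps(ok))   # no overlapping pairs
print(find_overlaps(bad))  # the two ranges share batches 5 to 10
```

An empty list means the ranges are safe to use; anything else should be fixed before running the program.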

 

ANOMALOUS

Flag for keeping Bijvoet pairs (I+ and I-) separate in the output file. If the ANOMALOUS flag is on, anomalous pairs are considered equivalent when calculating scale and B factors and when computing statistics, but are merged separately and output as I+ and I- for each reflection.

format                  ANOMALOUS

default                 not turned on (I+ and I- are combined)

example               ANOMALOUS

 

 

B RESTRAIN

Can be used to restrain B factor differences between consecutive films or batches. The value that follows the flag is the amount, in Å², by which you will allow the B factors of consecutive frames or batches to differ. See also the keyword SCALE RESTRAIN.

format                  B RESTRAIN value (Å²)

default                 not turned on

example               B RESTRAIN 0.5

 

BACKGROUND CORRECTION

The BACKGROUND CORRECTION command specifies the amount per frame to increase background. Corrects for errors in SDMS (Hamlin) data integration in the software distributed in the late 1980s and early 1990s. The value given after the keyword increases the background by #counts/frame. Valid for archive file format only.

format                  BACKGROUND CORRECTION value_(#counts/frame)

default                 not turned on

example               BACKGROUND CORRECTION 15

 

 

DEFAULT B FACTOR

Overall B used only in the absence of an INITIAL B FACTOR. You can apply a higher value to lower your Rmerge. Does not affect the quality of the data. 

format                  DEFAULT B FACTOR value

default                 0

example               DEFAULT B FACTOR 5

DEFAULT SCALE

Overall scale factor used in the absence of an initial scale factor. This is useful if the data are too strong, which is sometimes the case with small molecules. It will reduce the output intensities by the factor entered.

format                  DEFAULT SCALE value

default                 1

example               DEFAULT SCALE 10

 

 Useful to reduce the overall scale of the data set. If the numbers in the output file are too large, DEFAULT SCALE 10 will reduce them 10-fold. 

 

 

DO NOT REJECT OUTLIERS

Turns off the reject outliers flag. 

 

format                  DO NOT REJECT OUTLIERS      

default                 DO NOT REJECT OUTLIERS 

                               outliers not rejected automatically, but see REJECT  OUTLIERS for more discussion.

 

ERROR SCALE FACTOR

This is a single multiplicative factor which is applied to the input σ(I). It should be adjusted so that the normal χ² (goodness of fit) value printed in the final table of the output comes close to 1. By default, the input errors are used (ERROR SCALE FACTOR = 1). It applies to the data which are read after this keyword, so you can apply a different error scale factor to subsequent batches by repeating this input with different values.

format                  ERROR SCALE FACTOR value

default                 ERROR SCALE FACTOR 1

example               ERROR SCALE FACTOR 1.3

                                           good starting value for format denzo_ip
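
To see why adjusting this factor moves χ² toward 1, here is a minimal Python sketch under stated assumptions. It uses a textbook normalized χ² of repeated measurements about their weighted mean, which is not necessarily the exact statistic Scalepack prints, and it ignores the ESTIMATED ERROR contribution. The function names are hypothetical. Under these assumptions χ² falls as the square of the factor, so multiplying the current factor by the square root of the observed χ² brings χ² to 1.

```python
import math


def reduced_chi2(intensities, sigmas, scale=1.0):
    """Normalized chi-squared about the weighted mean, with every sigma
    multiplied by `scale` (the role played by ERROR SCALE FACTOR here)."""
    scaled = [scale * s for s in sigmas]
    weights = [1.0 / (s * s) for s in scaled]
    mean = sum(w * i for w, i in zip(weights, intensities)) / sum(weights)
    chi2 = sum(w * (i - mean) ** 2 for w, i in zip(weights, intensities))
    return chi2 / (len(intensities) - 1)


def suggested_error_scale(current_scale, chi2):
    """New factor that would bring chi2 to 1 if only this factor mattered."""
    return current_scale * math.sqrt(chi2)


obs = [100.0, 110.0, 95.0, 105.0]   # made-up repeated measurements
sig = [3.0, 3.0, 3.0, 3.0]          # made-up input sigmas
chi2 = reduced_chi2(obs, sig)
new_scale = suggested_error_scale(1.0, chi2)
print(chi2, new_scale, reduced_chi2(obs, sig, new_scale))
```

In practice the estimated error also contributes, so treat the suggested value as a starting point and check the χ² in the next run.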

 

 

ESTIMATED ERROR

The estimate of the systematic error for each of the resolution shells. There must be exactly the same number of error estimates here as there are NUMBER OF ZONES. So if you have 10 zones, you need 10 numbers following the keyword estimated error - one for each zone. 

The error estimates do not all have to be the same. The estimated error applies to the data which are read after this keyword, so you can apply different error estimates to subsequent batches by repeating this input with different values. This is an important point if you enter data from a previous Scalepack output that does not need its σ to be increased.

The error estimates should be approximately equal to the R-factor in the table at the end of the output for resolution shells where statistical errors are small, namely the earlier resolution shells where the data is strong. This is a crude estimate of the systematic error, to be multiplied by I, and is usually invariant with resolution. Default = 0.06 (i.e. 6%) for all zones. 

format                  ESTIMATED ERROR value1 value2 value3 value4 ...etc

default                 each value set to 0.06

example               ESTIMATED ERROR 0.02 0.03 0.03 0.03 0.04 0.04 (6 zones)

 

 

EXTEND PARTIALS / DO NOT EXTEND PARTIALS

Some partially recorded reflections may be predicted by Denzo or Scalepack to start or end their Bragg condition in between consecutive frames due to small variations in crystal orientation from frame to frame. For these reflections only, there are two choices of defining where the reflection started (or ended): including the extra frame, or not. EXTEND PARTIALS tells Scalepack to include this extra frame. It only affects a very small fraction of the reflections. Opposite of DO NOT EXTEND PARTIALS.

format                  EXTEND PARTIALS

default                 this is the default

 

 

FILE

This specifies the files read by Scalepack. The input has two components. The first is a number. The second is a file name, which usually contains wildcard characters (###) that are incremented by the SECTOR command. The number which follows FILE specifies the starting batch number. A batch, previously called a film, can be as small as a single .x file (or the equivalent). It can be a group of .x files, even an entire data set. The most frequent content of a batch is a single .x file.

This conversion of files into batches is particularly useful if you want to scale more than one data set together. For example, let's say you want to scale 10 oscillation frames (numbered 1 through 10) from one data set with 37 oscillation frames (numbered 1 through 37) from another. The FILE statement takes each of the individual .x files and assigns it a batch number. Thus, you would enter something like this: FILE 1 'setone###.x' and FILE 101 'settwo###.x' (see Scenario 5d). Batch numbers 1 - 10 will then correspond to files setone001.x, setone002.x, etc. (assuming you used the SECTOR 1 to 10 command above FILE so that the wildcards would be substituted with numbers). Batch numbers 101 - 137 will correspond to files settwo001.x, settwo002.x, etc. Note that the ### is replaced by the sector argument, not by the batch number.

 

format                  FILE value 'filename' (see also note below)

defaults               None

example               FILE 101 '/frames/scale/lysoz###.x'

The FILE keyword must come after FORMAT because its syntax depends on which input format is being read. For the archive, denzo_york, and denzo_york1 formats, FILE must not be followed by a number, because the batch numbers are already stored in the file. If you want to change the batch numbers in these file formats, see the ADD command described above.
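
The relation between the FILE number, the SECTOR range, and the resulting file names can be sketched in a few lines of Python. This is only an illustration of the bookkeeping described above, not Scalepack code. The function expand_file is hypothetical, and it assumes that the run of # characters is replaced by the zero-padded sector number (as in setone001.x) and that batch numbers increase by one per file.

```python
def expand_file(first_batch, template, sector_start, sector_end):
    """Map batch numbers to file names for one FILE statement.

    first_batch  -- the number given after FILE
    template     -- file name containing one run of '#' wildcards
    sector_start, sector_end -- the range given to SECTOR
    """
    width = template.count("#")
    placeholder = "#" * width
    if width == 0 or placeholder not in template:
        raise ValueError("template needs a single run of '#' characters")
    mapping = {}
    for offset, sector in enumerate(range(sector_start, sector_end + 1)):
        # The wildcard takes the sector number, not the batch number.
        name = template.replace(placeholder, str(sector).zfill(width))
        mapping[first_batch + offset] = name
    return mapping


set_one = expand_file(1, "setone###.x", 1, 10)
set_two = expand_file(101, "settwo###.x", 1, 37)
print(set_one[10])    # file assigned to batch 10
print(set_two[101])   # file assigned to batch 101
print(set_two[137])   # file assigned to batch 137
```

Note that batch 101 maps to settwo001.x: the batch number and the number in the file name are independent of each other.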

 

 

FIT

Tells the program what parameters to fit in post-refinement and specifies the group of files over which the fitting is to be performed. Postrefinement fit can be applied to an entire set of batches (one batch being the entire set of frames, for example) using the modifier crystal, or to each individual input file using the modifier film or batch.

modifiers             CRYSTAL specifies that the fit operation is over the entire set of frames specified by a range and restrains the fit parameter to have exactly the same value over this range.

                               BATCH specifies that the fit operation is performed on each member of the set of frames specified by a range.

                               FILM         alias for batch

parameters         a*, b*, c*               unit cell lengths. Values returned are real space.

                               alpha*, beta*, gamma*    unit cell angles. Values returned are real space.

                               rotx, roty, rotz         crystal orientation parameters deduced in Denzo

                               wavelength               incident wavelength

                               mosaicity                mosaicity as defined in Denzo, in degrees

format                  fit modifier parameter1 filmnumber to filmnumber

default                 the default is that nothing is fit unless specified

examples             FIT crystal roty 1 to 137

                               FIT crystal mosaicity 1 to 5 7 to 10 102 to 104 to 137

                               FIT batch rotx 1 to 137

 

Most mistakes in Scalepack can be attributed to errors in FIT commands because the program is very sensitive to mistakes in the batch numbers. If you input non-existent batch numbers or define overlapping ranges (e.g., 1 to 10 5 to 20), the program is likely to fail in a strange way. If you specify a range of numbers, the program will only use the batch numbers that exist within the range. For example, if your batch numbers go from 1 to 40 and 70 to 90, you can get away with saying, say, FIT batch (parameter) 1 to 90, which is the same as FIT batch (parameter) 1 to 40 70 to 90. For FIT crystal, these two inputs are not equivalent. In the case of FIT crystal (parameter) 1 to 90, one value will be fit for all batches. In the case of FIT crystal (parameter) 1 to 40 70 to 90, two values will be fit, one for each range. Note that different parameters may be fit over different ranges and either over batch or crystal. You can also mix batch and crystal for the same parameters.
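
The difference between FIT batch and FIT crystal over ranges with gaps can be made concrete with a small Python sketch. It only counts how many values would be fitted under the rules described above; it is not how Scalepack is implemented, and the function names are hypothetical.

```python
def batches_in_ranges(ranges, existing):
    """For each (start, end) range, list the existing batch numbers it covers."""
    present = sorted(existing)
    return [[b for b in present if start <= b <= end] for start, end in ranges]


def count_fitted_values(modifier, ranges, existing):
    """Number of values fitted for one parameter."""
    groups = batches_in_ranges(ranges, existing)
    if modifier == "crystal":
        # One shared value per range that contains at least one batch.
        return sum(1 for group in groups if group)
    if modifier in ("batch", "film"):
        # One value per existing batch.
        return sum(len(group) for group in groups)
    raise ValueError("unknown modifier %r" % modifier)


# Batches 1 to 40 and 70 to 90 exist; 41 to 69 do not.
existing = set(range(1, 41)) | set(range(70, 91))

print(count_fitted_values("batch", [(1, 90)], existing))
print(count_fitted_values("batch", [(1, 40), (70, 90)], existing))
print(count_fitted_values("crystal", [(1, 90)], existing))
print(count_fitted_values("crystal", [(1, 40), (70, 90)], existing))
```

The two FIT batch inputs give the same count, while the two FIT crystal inputs differ: one value for the single range, two values for the split ranges.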

Do not fit unit cell parameters restrained by space group symmetry. For example, if you have space group P3, you must not fit b*.

Do not FIT batch rotz because this parameter is very poorly defined by the intensities of observed partial reflections. This is another very common mistake. 

Unless you know what you are doing, do not FIT crystal rotx roty rotz, because if the spindle is even slightly misaligned, the assumption that there is only one crystal orientation parameter for a large sweep of data will force incorrect restraints on the crystal orientation refinement.

About fitting rotations: Changes in rotations, like crystal rotx, roty, and rotz, are expressed as small rotations, call them δ, about the laboratory frame of reference. These δ (δx, δy, and δz) are used because, to a first approximation, they commute with one another (commute means that the order in which they are applied is irrelevant). This, in turn, is because they are small, typically less than one degree. The crystal or cassette rotations, Rx, Ry, and Rz, on the other hand, do not commute with one another because their values tend to be large (much greater than one degree). So when you ask Scalepack postrefinement or Denzo to fit these rotations, what is actually happening is that the δ are being refined. After each refinement cycle, the δ are converted into changes in Rx, Ry, and Rz by a (complicated) algebraic relation. Those of you with sharp eyes will have noticed in Denzo that the shifts reported by the program when fitting the crystal rotations do not correspond to the changes in the rot values. This is because the shifts reported are the δ, not the changes in rotx, roty, and rotz. The other reason for fitting rotations the way they are defined in Denzo is to make them have a more intuitive correlation with the other parameters. Otherwise, changes in crystal rotz would not correlate with cassette rotz. This is of more importance in Scalepack, where only rotx and roty, and not rotz, are refined.
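
The claim that small rotations nearly commute while large ones do not is easy to check numerically. The following Python sketch uses standard rotation matrices about the x and y axes; it says nothing about Denzo's own axis conventions or code, and the function names are hypothetical. It measures how much the result depends on the order in which the two rotations are applied.

```python
import math


def rot_x(deg):
    """Rotation matrix about the x axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(deg):
    """Rotation matrix about the y axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def order_dependence(deg):
    """Largest element of Rx*Ry - Ry*Rx for two rotations of `deg` degrees."""
    xy = matmul(rot_x(deg), rot_y(deg))
    yx = matmul(rot_y(deg), rot_x(deg))
    return max(abs(xy[i][j] - yx[i][j]) for i in range(3) for j in range(3))


print(order_dependence(0.5))   # tiny: the order barely matters
print(order_dependence(60.0))  # large: the order matters a great deal
```

For a rotation of θ radians the difference grows roughly as θ², which is why sub-degree shifts can be refined as if they were independent.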

About fitting unit cell parameters: In both Denzo and Scalepack, the unit cell is fitted in reciprocal space, not real space. For a non-orthogonal cell, each real-space parameter is computed from several reciprocal-space parameters together, so a real-space value can change even when its reciprocal counterpart is held fixed. For example, fitting α* may end up changing β and γ, even though β* and γ* remain the same, and it may also change a, even though a* remains the same. So what you may notice sometime, if you are not careful, is that when you ask the program to fit only some of the unit cell parameters, a real-space value such as a is not constant even though a* was not fitted: a* stays constant, but a, converted back from reciprocal space after the other unit cell parameters have changed, does not. The moral: when fitting unit cells, fit all the relevant parameters.
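
The coupling between reciprocal-space and real-space cell parameters can be checked with the standard conversion formulas. The Python sketch below is an illustration using textbook crystallographic relations, not code taken from Denzo or Scalepack, and the function name is hypothetical. It converts a reciprocal cell to a real cell, then changes only α* and shows that a and β move even though a* and β* were held fixed.

```python
import math


def real_from_reciprocal(a_s, b_s, c_s, alpha_s, beta_s, gamma_s):
    """Convert a reciprocal cell (lengths in 1/Angstrom, angles in degrees)
    to the real cell (lengths in Angstrom, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_s, beta_s, gamma_s))
    sa, sb, sg = (math.sin(math.radians(x)) for x in (alpha_s, beta_s, gamma_s))
    # Reciprocal cell volume divided by a* b* c*.
    s = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    a = sa / (a_s * s)
    b = sb / (b_s * s)
    c = sg / (c_s * s)
    alpha = math.degrees(math.acos((cb * cg - ca) / (sb * sg)))
    beta = math.degrees(math.acos((ca * cg - cb) / (sa * sg)))
    gamma = math.degrees(math.acos((ca * cb - cg) / (sa * sb)))
    return a, b, c, alpha, beta, gamma


# Made-up triclinic reciprocal cell; only alpha* differs between the two.
before = real_from_reciprocal(0.01, 0.02, 0.03, 80.0, 85.0, 75.0)
after = real_from_reciprocal(0.01, 0.02, 0.03, 81.0, 85.0, 75.0)

print("a before/after:   ", before[0], after[0])
print("beta before/after:", before[4], after[4])
```

For an orthogonal cell all the cosines vanish and each real-space length is simply the inverse of its reciprocal length, so the coupling disappears.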

 

 

FIT B

This flag tells the program to refine the B factors of every batch from the very first cycle of refinement. This is in contrast to the default procedure, where the B factors are fit only after the convergence of the scaling. In the default procedure, if scaling does not converge in 20 (default) cycles of refinement, B factors will not be fitted. The FIT B command can override this. Not to be confused with the postrefinement FIT b* command described above. You cannot postrefine B (temperature) factors.

 

format                  FIT B

default                 this flag is turned off

example               FIT B

FIX B

This flag tells the program not to fit B factors at all. Usually, it is combined with the input of the B factors you want to apply but do not wish to refine anymore, or it is used for frozen crystals where you do not expect significant decay.

format                  FIX B

default                 turned off; B factors are fit after the scale factor refinement converges.

example               FIX B

 

 

FIXED WINDOW

For Hamlin archive files a fixed window of 3, 5, 7, or 9 frames or the original Hamlin determined window of frames may be used for summing a reflection. Valid for archive file format only. 

 

format                  FIXED WINDOW value

default                 uses the Hamlin definition of the window

example               FIXED WINDOW 7

FORMAT

This keyword specifies the format of the input hkl and intensity data. Input data can come from any of the nine types of files. This program requires this keyword to properly read the files.

 

modifiers             DENZO_IP                from frames processed with Denzo

                               DENZO_YORK1             output created with Denzo option york

                               SCALEPACK               from Scalepack output file

                               RIGAKU RAXIS            binary R-Axis software output

                               XDS PROFIT.HKL          binary output from XDS

                               MADNES PROCOR INTOUT    binary MADNES output

                               MADNES PROCOR ASCII     ASCII MADNES output

                               XENGEN URF              binary output from Xengen. You also have to supply the frame width, in degrees, using the FRAME WIDTH keyword.

                               ARCHIVE                 binary output from Hamlin software

format                  FORMAT modifier

default                 denzo_ip

example               FORMAT denzo_ip

If you use one of the binary formats, the data must be scaled on the same type of computer that created the binary files, because of incompatibilities in number representations between computers.

 

FRAME WIDTH

The oscillation range of each frame, in degrees. Needed only for URF files created by Xengen; other formats do not need this specification because Scalepack can read the information from the file header.

format                  FRAME WIDTH value

default                 there is no default, this input is required for this format

example               FRAME WIDTH 0.2

HKL MATRIX

Matrix for re-indexing. This matrix is applied to the hkl's as they are read in (applies to data read in after this command):

h' = (1)*h + (2)*k + (3)*l

k' = (4)*h + (5)*k + (6)*l

l' = (7)*h + (8)*k + (9)*l

Input is in the order (1....9). HKL MATRIX works with postrefinement. The program will not accept a matrix which has a negative determinant.

format                  HKL MATRIX value11 value12 value13   

                                                 value21 value22 value23

                                                 value31 value32 value33

default                 the default is the unit matrix     

example               HKL MATRIX 0 1 0 1 0 0 0 0 -1

                               transforms hkl into k h -l,

                                           has the effect of switching a and b.

                                           -1 is to keep the determinant positive.
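
The effect of a re-indexing matrix, and the sign of its determinant, can be checked before running Scalepack. This is a minimal Python sketch of the arithmetic given above; the function names are hypothetical and the check mirrors only the stated rule that a negative determinant is not accepted.

```python
def determinant(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def reindex(hkl, values):
    """Apply nine values, in HKL MATRIX input order (1....9), to one (h, k, l)."""
    if len(values) != 9:
        raise ValueError("HKL MATRIX needs exactly nine values")
    rows = [values[0:3], values[3:6], values[6:9]]
    if determinant(rows) < 0:
        raise ValueError("matrix has a negative determinant")
    return tuple(sum(row[j] * hkl[j] for j in range(3)) for row in rows)


swap_ab = [0, 1, 0, 1, 0, 0, 0, 0, -1]
print(reindex((1, 2, 3), swap_ab))   # h and k swapped, l negated

try:
    # Same swap without negating l: the determinant becomes negative.
    reindex((1, 2, 3), [0, 1, 0, 1, 0, 0, 0, 0, 1])
except ValueError as err:
    print("rejected:", err)
```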

HKL SCALE

Divides each h, k, and l by the input values. Useful for reducing the unit cell volume, particularly after an HKL MATRIX transformation. Rarely needed. For example, if you indexed your data originally in C222 and want your data in, say, P3, you would apply HKL SCALE 2 2 1 and HKL MATRIX 1 1 0 1 -1 0 0 0 -1, so that the new indices would have the values (h+k)/2, (h-k)/2, and -l.

format                  HKL SCALE value1 value2 value3

default                 1 1 1

example               HKL SCALE 2 2 1
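
The combined effect of the matrix and the scale in the example above can be worked through in Python. This sketch assumes the division is applied after the matrix, which the text suggests but does not state outright, and the function name is hypothetical. Exact fractions are used so that any non-integer index is visible.

```python
from fractions import Fraction


def transform(hkl, matrix, scale):
    """Apply the nine HKL MATRIX values, then divide by the HKL SCALE values."""
    rows = [matrix[0:3], matrix[3:6], matrix[6:9]]
    mixed = [sum(row[j] * hkl[j] for j in range(3)) for row in rows]
    return tuple(Fraction(value, divisor) for value, divisor in zip(mixed, scale))


matrix = [1, 1, 0, 1, -1, 0, 0, 0, -1]
scale = (2, 2, 1)

# h + k even, as for every reflection of a C-centered lattice.
print(transform((3, 1, 2), matrix, scale))

# h + k odd: the new indices are not integers.
print(transform((2, 1, 0), matrix, scale))
```

Because a C-centered lattice only has reflections with h + k even, every reflection present in the data ends up with integer indices.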

HKL SHIFT

Used to make a quick test of misindexing. Adds the specified integer vector to each original h, k, and l. If you are successful in using this, congratulations, you now have to reprocess your data in Denzo with the correct indexing.

format                  HKL SHIFT integer integer integer     

default                 0 0 0    

example               HKL SHIFT 0 1 0     we'll assume you weren't too far off

IGNORE OVERLOADS

Opposite of INCLUDE OVERLOADS. This is not the default. Useful if you collect data at both low and high exposures and want to ignore the saturated reflections from the high exposures. This keyword applies to data read in after this command.

 

format                  IGNORE OVERLOADS

default                 This is NOT the default. See INCLUDE OVERLOADS.

 

 

INCLUDE OVERLOADS

INCLUDE OVERLOADS is a flag for whether fitted profiles with some pixels missing (typically due to overload) should be included in the scaling. Affects only Denzo image plate output files (formats denzo_ip, denzo_york1). Note that for summed profiles this does not apply because only profile fitting can estimate the value of the overloaded pixels.

format                  INCLUDE OVERLOADS

default                 This is the default

 

 

INITIAL B FACTOR

Table of starting B factors, one per batch, beginning on the record after the title. After running the program once, the output table of B factors can be cut out and pasted in here, for example, when you have set the FIX B flag and are not refining B values anymore.

format                  INITIAL B FACTOR batch no. b value batch no. b value ...

default                 value of the DEFAULT B FACTOR, which defaults to 0

example               INITIAL B FACTOR 1 0.0 2 0.1 3 0.1 4 0.1 5 -0.2 ...etc

INITIAL SCALE

Table of starting scale factors, one per batch, beginning on the record after the title. After one run of the program, the output table of scale factors can be cut out and pasted in here. This table is not required. If it is not included in the control input, then the DEFAULT SCALE is used (this is 1 unless otherwise specified). 

 

 If the initial scale factor is set to zero, that frame is ignored in the scaling and refinement.

format                  INITIAL SCALE batch no. scale value batch no. scale value ...

default                 value of the DEFAULT SCALE, which defaults to 1

example               INITIAL SCALE FACTOR 1 1.0 2 0.9 3 0.0 4 0.85 5 1.1 ...etc.

INPUT / @

Redirection. Tells the program to read the file which follows the keyword. Same as in Denzo.

format                  @filename

default                 no redirection

example               @reject

 

 

MERGE

Flag that tells the program to merge (combine, average) reflections with the same unique index. This is the Scalepack default.

format                  MERGE

default                 This is the default and need not be specified

 

 

MOSAICITY

Allows you to input the value of the mosaicity of the data set. This is normally read from the header of the .x file.

format                  MOSAICITY value

default                 read from the header of the .x file

example               MOSAICITY 0.5

 

 

NO ANOMALOUS

Opposite of the keyword ANOMALOUS. Causes I+ and I- to be merged. This is the default.

 

format                  NO ANOMALOUS

default                 This is the default

 

 

NO MERGE

Flag for the output of unmerged data (reflections with the same unique hkl are not combined). Opposite of MERGE. This is very handy for specialized work. This flag has the subsidiary modifiers include partials, include no partials (the default), and original index.

modifiers             INCLUDE PARTIALS    all observations, both fully and partially recorded, are included in the output. The output will consist of the unique hkl, batch number, asymmetric unit, I, σ, and fractionality of the reflection. There is no information about I+ and I-, although it may be possible to get this in subsequent versions of the program.

                               INCLUDE NO PARTIALS    only fully recorded reflections and those fully recorded reflections created by the summation of partials are included in the output. Partials which cannot be summed to a fully recorded reflection are lost. The output will consist of the unique hkl, batch number, asymmetric unit, I, and σ. This is the default for NO MERGE. There is no information about I+ and I-. 

                               ORIGINAL INDEX    the output will also contain the original (not unique) hkl for each reflection. This is designed for MAD/local scaling work. The original index modifier only works with the default INCLUDE NO PARTIALS. The output will consist of the original hkl, unique hkl, batch  number, a flag (0 = centric, 1 = I+, 2 = I-), another flag (0 = hkl reflecting above the spindle, 1 = hkl reflecting below the spindle), the asymmetric unit of the reflection, I (scaled, Lorentz and Polarization corrected), and the σ of I. The format is (6i4, i6, 2i2, i3, 2f8.1).

format                  NO MERGE modifier

example               NO MERGE original index

 

This command is also useful if you just want to combine all of the information contained in multiple .x files into a single file. Simply read in all of the .x files, don't do any scale or B factor refinement, and output NO MERGE include partials.
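
If you want to read NO MERGE original index output into your own scripts, the fixed-width format quoted above, (6i4, i6, 2i2, i3, 2f8.1), translates directly into column widths. The following Python sketch parses a single record laid out that way. The function name and dictionary keys are hypothetical, and the sketch does not deal with any header lines the real file may contain.

```python
# Column widths for (6i4, i6, 2i2, i3, 2f8.1).
WIDTHS = [4] * 6 + [6, 2, 2, 3, 8, 8]


def parse_record(line):
    """Split one fixed-width record into named fields."""
    if len(line.rstrip("\n")) < sum(WIDTHS):
        raise ValueError("record is shorter than the format requires")
    fields = []
    pos = 0
    for width in WIDTHS:
        fields.append(line[pos:pos + width])
        pos += width
    ints = [int(f) for f in fields[:10]]
    return {
        "original_hkl": tuple(ints[0:3]),
        "unique_hkl": tuple(ints[3:6]),
        "batch": ints[6],
        "bijvoet_flag": ints[7],   # 0 = centric, 1 = I+, 2 = I-
        "spindle_flag": ints[8],   # 0 = above the spindle, 1 = below
        "asymmetric_unit": ints[9],
        "intensity": float(fields[10]),
        "sigma": float(fields[11]),
    }


# A made-up record written with the same layout.
sample = "%4d%4d%4d%4d%4d%4d%6d%2d%2d%3d%8.1f%8.1f" % (
    1, -2, 3, 1, 2, 3, 101, 1, 0, 4, 1234.5, 56.7)
print(parse_record(sample))
```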

 

 

NO PROFILE TEST

This is a flag which operates only on ARCHIVE format files. Tells Scalepack that reflections with weird profiles should not be rejected. Opposite of PROFILE TEST.

format                  NO PROFILE TEST

default                 This is not the default

 

 

NUMBER OF ITERATIONS

The number of cycles for refinement of scale and B-factors. The default value is 20 cycles. If it is set to 0, the program computes statistics, merges reflections with the same unique hkl, and writes the output file based on the initial scale and B factors. Normally you do not have to specify this unless you want to avoid scaling. 

format                  NUMBER OF ITERATIONS integer

default                 20 iterations

example               NUMBER OF ITERATIONS 0

 

 

NUMBER OF ZONES

The number of resolution shells the data is divided into for the basis of calculating statistics. This input is required and must match the number of zones specified under the ESTIMATED ERROR keyword. Handy tip: it's nice to set up the NUMBER OF ZONES to equal the number of zones used by X-plor for the output of refinement statistics by shell, for when you get around to publishing your data and refinement statistics together in a paper.

format                  NUMBER OF ZONES integer

default                 This input is required

example               NUMBER OF ZONES 10

 

 

ORIENTATION AXIS 1

Three-integer vector that describes the orientation of the vertical axis of the crystal.

Equivalent to the Denzo and Scalepack keyword VERTICAL AXIS. This information is usually not input by the user, but subsequent versions will read it from the header of the .x files. This command does not affect scaling or postrefinement. However, it does affect the values of the misorientation angles reported in the Scalepack log file.

format                  ORIENTATION AXIS 1 integer integer integer

default                 1 0 0

example               ORIENTATION AXIS 1 0 1 0

 

 

ORIENTATION AXIS 2

Same as ORIENTATION AXIS 1 except describes the spindle axis of the crystal. Equivalent to Denzo and Scalepack keywords SPINDLE AXIS. Otherwise the same as above.

format                  ORIENTATION AXIS 2 integer integer integer

default                 0 0 1

example               ORIENTATION AXIS 2 1 0 0

 

 

ORIGINAL WINDOW

Refers to the window size of ARCHIVE format files. Opposite of FIXED WINDOW.

 

 

OUTPUT ANOMALOUS

Alias for the keyword ANOMALOUS.

 

 

OUTPUT FILE

Name of file for output of scaled measurements. A new file will be created if none exists, but a pre-existing file will be overwritten. Maximum of 80 characters allowed in the file name. Depending on whether the ANOMALOUS flag is set, there are one or two sets of I and σ(I) per reflection. The output is h, k, l, I, σ(I) (NO ANOMALOUS flag) or h, k, l, I+, σ(I+), I-, σ(I-) (ANOMALOUS flag) in format (3I4, 4F8.1). If the NO MERGE flag has been set, unmerged data are output as h, k, l, asymmetric unit #, I, σ(I) in format (4I4, 2F8.1) and the ANOMALOUS flag has no effect. For NO MERGE original index the format is: original hkl, unique hkl, batch (film) number, a flag (0 = centric, 1 = I+, 2 = I-), another flag (0 = hkl reflecting above the spindle, 1 = hkl reflecting below the spindle), the asymmetric unit of the reflection, I (scaled, Lorentz and Polarization corrected), and the σ(I). The format is (6i4, i6, 2i2, i3, 2f8.1). 

format                  OUTPUT FILE name    use single quotes for safety

default                 you must specify the output file name

example               OUTPUT FILE '/people/myname/LYSOZ.sca'
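
For merged output, the Fortran format (3I4, 4F8.1) corresponds to three 4-character integers followed by 8-character numbers with one decimal place. A minimal Python sketch of writing one such reflection record follows. The function name is hypothetical, and the sketch covers only the record layout quoted above, not anything else the output file may contain.

```python
def format_merged(h, k, l, i_plus, sig_plus, i_minus=None, sig_minus=None):
    """Format one merged reflection as (3I4, 4F8.1).

    Without i_minus/sig_minus only I and sigma(I) are written, as for
    NO ANOMALOUS; with them, I+, sigma(I+), I-, sigma(I-) are written.
    """
    line = "%4d%4d%4d%8.1f%8.1f" % (h, k, l, i_plus, sig_plus)
    if i_minus is not None and sig_minus is not None:
        line += "%8.1f%8.1f" % (i_minus, sig_minus)
    return line


print(repr(format_merged(1, 2, 3, 1234.5, 67.8)))
print(repr(format_merged(1, 2, 3, 1234.5, 67.8, 1200.0, 70.1)))
```

Indices below -999 or above 9999 do not fit in four columns, so check the range of your hkl values if you write files this way.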

 

 

POLARIZATION

This command allows you to correct a mistaken polarization/monochromator value entered when you ran Denzo. It saves you from the chore of reprocessing all of your original images once you have learned of a mistake. This is mainly for synchrotron users who collect very high-resolution data.

format                  POLARIZATION Denzo value corrected value

default                 The polarization is read off the header of the .x file

example               POLARIZATION Denzo 0.0 corrected 0.9

 

 

POSTREFINE

The number of cycles of postrefinement to be done. Works in conjunction with the FIT commands described above. If you set the number of postrefinement cycles to zero, then postrefinement will be skipped even if you have FIT commands in your control file.

format                  POSTREFINE integer

default                 Postrefinement is not done unless you specify it

example               POSTREFINE 10

 

 

PRINT

This flag tells the program to print the results of the specified calculation after every refinement cycle. Useful in the case where one scales non-isomorphous data and wants to make the outlier list short by using the statement PRINT total chi2 200 PRINT single chi2 100.

modifiers             TOTAL CHI2       cutoff for printing poor reflections. Total χ² for all measurements of a reflection. Default = 5²

                               SINGLE CHI2    cutoff for printing poor reflections. χ² for any single measurement of a reflection. Default = 3.5²

                               CORRELATION / NO CORRELATION    prints correlation matrix.

                                           Default: no correlation

                               SOLUTION / NO SOLUTION   prints new scale and B factors.

                                           Default: no solution 

                               SHIFTS / NO SHIFTS prints changes to scale and B factors. 

format                  PRINT modifier

defaults               PRINT total chi2

                               PRINT single chi2

                               PRINT no correlation

                               PRINT no solution

                               PRINT no shifts

New scale and B factors will always be printed if the number of fitted parameters exceeds 300. 

 

 

PROFILE TEST

PROFILE TEST is a flag that can be turned on and off as desired. This is the default. Bad profiles can be used to reject a DCREDUCE summed hkl. Valid for ARCHIVE file format only. Opposite of NO PROFILE TEST.

 

 

PROFILES FITTED

This flag tells Scalepack how the profiles were treated by the indexing program (e.g., Denzo). The two choices for profiles are fitted and summed. Valid for the denzo_ip, denzo_york1, and xengen urf formats. Most of the time, people fit profiles, so this is the default and need not be specified in the command file.

format                  PROFILES FITTED

default                 This is the default

 

 

PROFILES SUMMED

The opposite of PROFILES FITTED. This is not the default and must be specified if true.

 

 

RECSQ

Metric tensor description of the reciprocal space unit cell. Why would you ever use this when you can specify the real space constants?

format                  RECSQ value1 value2 ... value6

default                 same as UNIT CELL

example               RECSQ 0.0001 0 0 0.0004 0 0.0009

                                           [to describe unit cell 100 50 33 90 90 90.]
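
If you ever do need these six numbers, they can be computed from the real-space unit cell with standard crystallographic formulas. The Python sketch below is an illustration only. The function name is hypothetical, and the order of the six elements (a*a*, a*b*, a*c*, b*b*, b*c*, c*c*) is an assumption inferred from the example above, where the three non-zero values sit in the first, fourth, and sixth positions.

```python
import math


def recsq(a, b, c, alpha, beta, gamma):
    """Reciprocal metric tensor elements from a real cell
    (lengths in Angstrom, angles in degrees), in the assumed order
    a*a*, a*b*, a*c*, b*b*, b*c*, c*c*."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sa, sb, sg = (math.sin(math.radians(x)) for x in (alpha, beta, gamma))
    volume = a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    # Reciprocal cell lengths.
    a_s = b * c * sa / volume
    b_s = a * c * sb / volume
    c_s = a * b * sg / volume
    # Cosines of the reciprocal cell angles.
    cos_as = (cb * cg - ca) / (sb * sg)
    cos_bs = (ca * cg - cb) / (sa * sg)
    cos_gs = (ca * cb - cg) / (sa * sb)
    return (a_s * a_s, a_s * b_s * cos_gs, a_s * c_s * cos_bs,
            b_s * b_s, b_s * c_s * cos_as, c_s * c_s)


values = recsq(100.0, 50.0, 33.0, 90.0, 90.0, 90.0)
print("RECSQ " + " ".join("%.4f" % v for v in values))
```

For an orthogonal cell the off-diagonal elements vanish and the diagonal ones are simply the squared inverse cell lengths, which is why the example contains only three non-zero numbers.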

 

 

REFERENCE BATCH

Specifies which batch or film or set of batches or films will be the reference for the scaling and B refinement. The scale and B factor for these are not refined. More than one film or batch may be used as the reference. This is important only for crystals which decay during data collection. If the crystal is frozen and does not decay, then the default may be used, which is to let the eigenvalue filter define the overall scale and B factor. With a large number of batches, reliance on the eigenvalue filter is a little bit dangerous, so you should consider using a reference batch number in those cases.

This keyword is entirely equivalent to the keywords REFERENCE BATCHES, REFERENCE FILM, and REFERENCE FILMS. The others exist because some people have a grammatical hangup about using the singular to describe more than one object.

format                  REFERENCE BATCH integer integer integer ... etc. 

default                 no reference batch. Eigenvalue filter defines the overall scale and B

example               REFERENCE BATCH 1 3 4 5

 

 

REFERENCE BATCHES, FILM, FILMS

Same as REFERENCE BATCH.

 

 

REJECT HKL

This flag tells the program to reject the list of individual h, k, l's which follow. This is useful for iterative rejection cycles since the file containing the rejected reflections can be reread. Each record after the title contains one reflection with the variables h, k, l (original index) and film number in free format. This information is most easily edited from the printed output of a previous run, and each record may contain the whole printed line. The list can also be read from a reject file with the command @reject.

format                  REJECT HKL

default                 This is not the default

 

 

REJECT OUTLIERS

Automatic rejection of outliers. Not yet reliably implemented. This is supposed to replace multiple rounds of running Scalepack. The idea is that when this keyword is set, Scalepack runs once, makes a reject.dat file, then runs again, reads the reject.dat file and scales the data based on the reduced set of observations.

 

 

REJECTION PROBABILITY

Applies Bayesian rejection of outliers. Rejected observations are written to the reject.dat file (see WRITE REJECTION FILE). Every unique hkl (that is, all original hkls that reduce to the same unique hkl) with at least one observation whose probability of being an outlier is greater than or equal to the value specified after the WRITE REJECTION FILE keyword (default 0.9) is written to the log file. The rejection probability itself is an estimate on your part of how frequently you expect any observation to be an outlier. In principle, it should be about equal to the number of outliers divided by the number of observations. A good value to use is 1/10,000, i.e. 0.0001, for normal good data. If you have a non-random signal in your background (e.g., satellite crystals, malfunctioning detector, ice rings), then you will probably have to increase the rejection probability. If you do not want to generate a reject list in the log file, then omit the keyword. A comparison of R-linear and R-squared is often helpful for deciding whether to increase the rejection probability. See the discussion following Scenario 1 for more on this.

The rejection algorithm used in Scalepack is the most sensible and statistically sound outlier rejection algorithm. Unlike some other programs, the Scalepack outlier rejection is based on comparing differences to σ. So it will typically reject, say 4 or 5 σ outliers. Some reflections with a large discrepancy from the average value of I may simply represent a lack of adequate statistics in measurement and not a "mistake" (non-random error) in measurement. If a reflection has a large discrepancy from the average and a large σ, its contribution to the average will be very small anyway.

Here's an example of how the outlier rejection algorithm works:

 

Input data (after scaling)

Obs #        I          σ
1           10.0        0.1
2           10.1        0.1
3            9.9        0.1
4           11.0        0.1
5           20.0       10.0

 

Although you may think that {5} is the outlier, in fact the consistent set of observations will be {1, 2, 3, 5}, and {4} will be the outlier. Why is this? A consistent average will be 10.0 (or more exactly 10.0003), because observation 5 has a 10,000 times lower statistical weight than the other observations, due to its large σ. Observation 4, on the other hand, which was measured more accurately, will be 8.7 σ away from the average of the remaining observations, because the expected error of the difference 4 - <{1,2,3,5}> is 0.115, and 8.7 = (11.0 - 10.0003) / 0.115. So the observation with the largest deviation from the average is not necessarily a statistical outlier.

R-merge (after outlier rejection) will be very bad, 25%, even if the average is measured extremely well, with a σ of 0.058, i.e. about 0.6% of the average! This has to do with R-merge being an unweighted statistic, and I/σ being a weighted statistic. If some other program rejected observation 5 (for no good reason) and observation 4 (for good reason), R-merge would be 0.67%. Practice shows that most programs would reject observation 5 because many users want a good R-merge. And sometimes these programs would not even flag the true outlier - observation 4!
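The arithmetic of this example can be checked with a short Python sketch. It is not Scalepack's Bayesian algorithm: it only computes the inverse-variance weighted average and, for each observation, its deviation from the average of the others in units of the expected error of the difference. The function names are hypothetical, and the R-merge denominator (number of observations times the weighted average) is an assumption chosen because it reproduces the figure of about 25%.

```python
import math

# (I, sigma) for observations 1..5, taken from the example table.
observations = [(10.0, 0.1), (10.1, 0.1), (9.9, 0.1), (11.0, 0.1), (20.0, 10.0)]

def weighted_mean(obs):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / sigma ** 2 for _, sigma in obs]
    mean = sum(w * i for w, (i, _) in zip(weights, obs)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))

def leave_one_out_deviation(obs, k):
    """Deviation of observation k from the mean of the others, in sigma units."""
    others = obs[:k] + obs[k + 1:]
    mean, sigma_mean = weighted_mean(others)
    i_k, sigma_k = obs[k]
    # Expected error of the difference combines both uncertainties.
    return (i_k - mean) / math.sqrt(sigma_k ** 2 + sigma_mean ** 2)

def r_merge(obs):
    """Unweighted R-merge about the weighted mean (denominator is an assumption)."""
    mean, _ = weighted_mean(obs)
    return sum(abs(i - mean) for i, _ in obs) / (len(obs) * mean)

deviations = [leave_one_out_deviation(observations, k) for k in range(len(observations))]
worst = max(range(len(observations)), key=lambda k: abs(deviations[k]))
kept = observations[:worst] + observations[worst + 1:]

for number, dev in enumerate(deviations, start=1):
    print(f"observation {number}: {dev:+.2f} sigma from the others")
print(f"most deviant observation: {worst + 1}")
print(f"R-merge of the remaining set: {100 * r_merge(kept):.1f}%")
```

With the tabulated values, observation 4 comes out as the most deviant, at roughly 8.7 σ, while observation 5 lies within 1 σ of the others despite having by far the largest raw difference.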

format                  REJECTION PROBABILITY value

default                 0.0001

example               REJECTION PROBABILITY 0.0001

 

 

RESOLUTION

Minimum d-spacing (high resolution limit) for this run. Default is the maximum resolution found in the input data. If two numbers are supplied, in either order, they are taken as the minimum and maximum d-spacings.

format                  RESOLUTION value [value]

default                 highest resolution detected in .x file

examples             RESOLUTION 10 2.2

                                RESOLUTION 2.2

 

 

RESOLUTION STEP

Changes the number of reflections per shell for the purposes of printing out statistics. Formally, it represents the exponent of the zone volume calculation. Normally, this is 3, because the volume of a sphere goes as radius cubed, so all the statistics shells will have the same number of reflections. Changing the resolution step could be useful to prepare a table of statistics to compare with other programs which may print out the statistics differently.

format                  RESOLUTION STEP value

default                 RESOLUTION STEP 3

example               RESOLUTION STEP 2.5
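One way to picture the effect of the exponent is the Python sketch below, which places shell boundaries at equal increments of (1/d) raised to the power given by RESOLUTION STEP. This is an interpretation of the description above, not Scalepack's actual code; the function name, its arguments, and the number of shells are all hypothetical.

```python
def shell_edges(d_low, d_high, n_shells, step=3.0):
    """Shell boundaries (in Angstroms) equally spaced in (1/d)**step.

    d_low  : low resolution limit (largest d-spacing)
    d_high : high resolution limit (smallest d-spacing)
    step   : exponent; 3 gives shells of equal reciprocal-space volume
    """
    if not (d_low > d_high > 0) or n_shells < 1 or step <= 0:
        raise ValueError("need d_low > d_high > 0, n_shells >= 1, step > 0")
    low = (1.0 / d_low) ** step
    high = (1.0 / d_high) ** step
    return [
        (low + (high - low) * i / n_shells) ** (-1.0 / step)
        for i in range(n_shells + 1)
    ]

for exponent in (3.0, 2.5):
    edges = shell_edges(10.0, 2.2, 10, step=exponent)
    print(exponent, " ".join(f"{d:.2f}" for d in edges))
```

With an exponent of 3 every shell spans the same reciprocal-space volume, so each holds roughly the same number of reflections. A smaller exponent makes the low resolution shells thinner and the high resolution shells more populated.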

 

 

SCALE ANOMALOUS

This is the flag for keeping Bijvoets (I+ and I-) separate both in scaling and in the output file. If the SCALE ANOMALOUS flag is on, anomalous pairs are considered non-equivalent when calculating scale and B factors and when computing statistics, and are merged separately and output as I+ and I- for each reflection. 

This is a dangerous option because scaling may be unstable due to the reduced number of intersections between images. The danger is much larger in low symmetry space groups.

SCALE ANOMALOUS will always reduce R-merge, even in the absence of an anomalous signal, because of the reduced redundancy. However, χ2's will not be affected in the absence of an anomalous signal.

format                  SCALE ANOMALOUS

default                 This is not the default.  Use with extreme caution.

 

 

SCALE RESTRAIN

Can be used to restrain scale factor differences between consecutive films or batches. The value which follows the flag represents the amount by which you will allow the scale factors of consecutive films or batches to differ. It adds a term of (scale1 - scale2)^2 / (scale restrain)^2 to the target function minimized in scaling. This only applies to batches between which you add partials. For very thin frames, this is almost obligatory. The value should roughly represent the expected relative change in scale factors between adjacent frames.

format                  SCALE RESTRAIN value 

default                 not turned on 

example               SCALE RESTRAIN 0.01   

                                           (expect 1% change between adjacent frames)
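As a rough illustration of the restraint term, the Python sketch below sums (scale1 - scale2)^2 / (scale restrain)^2 over adjacent batches. The function name is hypothetical, and the sketch assumes that every adjacent pair is restrained, which in Scalepack holds only for batches between which partials are added.

```python
def scale_restraint_penalty(scales, restrain):
    """Sum of squared, normalised scale differences between adjacent batches."""
    if restrain <= 0:
        raise ValueError("restrain value must be positive")
    return sum(
        ((s1 - s2) / restrain) ** 2
        for s1, s2 in zip(scales, scales[1:])
    )

# Scale factors drifting by 1% per frame cost one unit per adjacent pair
# when the restraint value is 0.01.
print(scale_restraint_penalty([1.00, 1.01, 1.02], 0.01))
# A 5% jump between two frames is penalised 25 times more heavily.
print(scale_restraint_penalty([1.00, 1.05], 0.01))
```

A difference equal to the restraint value therefore costs about as much as one observation that is 1 σ off, which is why the value should match the change you actually expect between frames.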

 

 

SECTOR

Substitutes for the ### wildcard to specify a group of files to be read. See FILE keyword.

format                  SECTOR integer to integer

default                 no default

example               SECTOR 1 to 40

 

 

SECTOR WIDTH

Sector width can be specified in degrees. A pseudofilm is a sector width's worth of data from one detector. Valid for area detector data only, default value 5 degrees.

format                  SECTOR WIDTH value

default                 5 degrees

example               SECTOR WIDTH 3.0

 

 

SIGMA CUTOFF

Cutoff for rejecting measurements on input. Default = -3.0. Be careful if you increase this!

What is the rationale for using σ cutoff -3.0 in Scalepack? Wouldn't you want to reject all negative intensities? Why shouldn't you use a σ cutoff of 1.0 or zero? The answer is as follows: The best estimate of I may be negative, due to background subtraction and background fluctuation. Negative measurements typically represent random fluctuations in the detector's response to an X-ray signal. If a measurement is highly negative (below -3σ), then it is more likely to be the result of a mistake than of random fluctuation. If one eliminates the negative fluctuations, but not the positive ones, before averaging, the result will be highly biased. In Scalepack, SIGMA CUTOFF is applied before averaging. If one rejects all negative intensities before averaging, a number of things would happen:

 

1.      The averaged intensity would always be positive; 

2.      For totally random data with redundancy 8, in a shell where there was no signal, there would be on average 4 positive measurements, with an average intensity of about one σ, because the negative measurements had been thrown out. The average of the four remaining measurements has an expected error of σ/2, so it would stand about 2 σ above zero! This would look like a resolution shell with a meaningful signal!

3.      R-merge would always be less than the R-merge with negative measurements included

4.      A SIGMA CUTOFF of 1 would improve R-merge even more, by excluding even more valid measurements! 

Why should this worry you? Exclusion of valid measurements will deteriorate the final data set. One may notice an inverse relationship between R-merge and data quality as a function of SIGMA CUTOFF. So much for using R-merge as any criterion of success.
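The bias described in the list above can be reproduced with a small Python simulation. It is a sketch under simple assumptions: every measurement is pure Gaussian noise with a true intensity of zero and σ = 1, and the merged σ is taken as 1/sqrt(n) for the n measurements that survive the cutoff. The function name and parameters are hypothetical.

```python
import random

def merged_signal_to_noise(n_reflections, redundancy, cutoff, seed=0):
    """Mean merged I/sigma for pure-noise data after applying a sigma cutoff.

    Measurements with I/sigma below `cutoff` are discarded before averaging.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_reflections):
        measurements = [rng.gauss(0.0, 1.0) for _ in range(redundancy)]
        kept = [x for x in measurements if x >= cutoff]
        if not kept:
            continue  # nothing survived the cutoff for this reflection
        merged = sum(kept) / len(kept)
        merged_sigma = 1.0 / len(kept) ** 0.5  # nominal error of the average
        ratios.append(merged / merged_sigma)
    return sum(ratios) / len(ratios)

for cutoff in (-3.0, 0.0, 1.0):
    value = merged_signal_to_noise(20000, 8, cutoff)
    print(f"SIGMA CUTOFF {cutoff:+.1f}: mean merged I/sigma = {value:.2f}")
```

With a cutoff of -3 the merged I/σ stays close to zero, as it should for data with no signal. With a cutoff of 0 it rises to somewhere between 1.5 and 2, which is the apparent signal the text warns about.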

Even the best (averaged) estimate of intensity may be negative. How to use negative I estimates in subsequent phasing and refinement steps? The author of Scalepack suggests the following: 

1.      You should never convert I into F

2.      You should square Fcalc and compare it to I. Most, but not all, crystallography programs do not do this. That is life. In the absence of the proper treatment, one can make approximations. One of them is provided by French and also by French and Wilson and implemented in the CCP4 program TRUNCATE. A very simplified and somewhat imprecise implementation of TRUNCATE is this: if I > σ(I), then F = sqrt(I); if I < σ(I), then F = sqrt(σ(I)).
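The simplified rule quoted above can be written out in Python as follows. This is only the crude approximation from the text, not the French and Wilson procedure and not the TRUNCATE program itself. The function name is hypothetical, and treating the case I = σ(I) like I < σ(I) is an assumption, since the text does not cover it.

```python
import math

def approximate_amplitude(intensity, sigma):
    """Crude amplitude estimate: sqrt(I) if I > sigma(I), else sqrt(sigma(I))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if intensity > sigma:
        return math.sqrt(intensity)
    # Weak or negative intensity (I <= sigma is an assumed boundary choice).
    return math.sqrt(sigma)

print(approximate_amplitude(100.0, 5.0))  # strong reflection, uses sqrt(I)
print(approximate_amplitude(-3.0, 4.0))   # negative intensity, uses sqrt(sigma)
```

The point of the rule is that a negative or very weak intensity still yields a small positive amplitude instead of being discarded.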

format                  SIGMA CUTOFF value

default                 -3

example               SIGMA CUTOFF -2.5

 

 

SPACE GROUP

Space group symbol from the list below. This input is required! The space group may be entered as a name (e.g., P212121) or as a number (e.g., 19, for the same space group). Most of the numbers correspond to those of the International Tables. The numbers above 230 are nonstandard definitions of space groups.

 

 

1     P1            89    P422          154   P3221         207   P432
3     P2            90    P4212         155   R32           208   P4232
4     P21           91    P4122         168   P6            209   F432
5     C2            92    P41212        169   P61           210   F4132
16    P222          93    P4222         170   P65           211   I432
17    P2221         94    P42212        171   P62           212   P4332
18    P21212        95    P4322         172   P64           213   P4132
19    P212121       96    P43212        173   P63           214   I4132
20    C2221         97    I422          177   P622          303   P2C
21    C222          98    I4122         178   P6122         305   B2
22    F222          143   P3            179   P6522         318   P21221
23    I222          144   P31           180   P6222         401   C1
24    I212121       145   P32           181   P6422         403   P21C
75    P4            146   R3            182   P6322         446   H3
76    P41           149   P312          195   P23           455   H32
77    P42           150   P321          196   F23           501   I1
78    P43           151   P3112         197   I23           503   I2
79    I4            152   P3121         198   P213          505   C21
80    I41           153   P3212         199   I213

 

 

 

 

Notes to particular space groups:

 

146 R3        R3 in hexagonal setting
446 H3        R3 in primitive setting
155 R32       R32 in hexagonal setting
455 H32       R32 in primitive setting
401 C1        Non-standard, but useful to make angles close to 90.
501 I1        Non-standard, but useful to make angles close to 90.
303 P2C       P2, C axis unique
403 P21C      P21, C axis unique
305 B2        like C2, B face centered, c axis unique
503 I2        Non-standard, but useful to make beta angle close to 90.

 

 

SPINDLE AXIS

 

Alias for ORIENTATION AXIS 2.

 

 

UNIT CELL

Real cell specified as a, b, c, alpha, beta, gamma. UNIT CELL is included in the header of Denzo_IP, Denzo_york1, ARCHIVE, and Scalepack files. For other formats, it must be given explicitly in the command file.

 

format                  UNIT CELL value value value angle angle angle

default                 the value from the first header encountered

example               UNIT CELL 50 62 100.3 90 90 90

 

The postrefined value of the unit cell constants is not used, nor is it output in the Scalepack output file. You must get this information from the log file if you are interested in it.

 

 

VERTICAL AXIS

Same as ORIENTATION AXIS 1

 

 

WRITE BADDIES

Writes *.xrej files so that reflections from the reject file may be displayed by XdisplayF. Not fully implemented. Will terminate the program and prevent scaling and postrefinement.

Note that the command @reject must precede the input of the .x files.

 

 

WRITE REJECTION FILE

Creates a file with hkl's to be rejected. UNIX file name: reject, VMS file name: REJECT.DAT. The reject file is created if it does not exist. If the file exists, it is overwritten.

You can specify the threshold for what you consider to be a rejectable probability. The default is 0.9, which is fairly safe, but you may want to decrease this to, say, 0.5 on later rounds of rejection.


 

 

format                  WRITE REJECTION FILE value

default                 no rejection file is written; when one is written, the default value is 0.9

example               WRITE REJECTION FILE 0.5