Explanation of terms

When using the FATMEN system, all data is referenced by a name known as the generic-name. The generic-name has the form

//catalogue/experiment/dir1/dir2/.../dirn/filename

where the slash character (/) is a directory delimiter, as for Unix file names, and

catalogue indicates the catalogue in which the file resides
experiment is the name of the experiment to which the file belongs

The rest of the file name is free format, although its total length may not exceed 255 bytes and each component may not exceed 20 characters. Typically, experiments will have conventions for the generic-name, but the sort of information that you might want to include in the generic-name is:

Real data, simulated (possibly also technical run, cosmics etc.)
Beam particle (e.g. pi+, pi- etc.)
Beam energy
Target (for fixed target experiments)
Period and run
Magnet setting, if relevant
Number of the pass through the reconstruction chain
Level, i.e. DST, RAWDATA, etc.
Examples of catalogue names are CERN, for all CERN experiments, DESY, for experiments based at DESY, and so on. Examples of experiment names are L3, H1 and CDF.
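The structure and limits described above can be illustrated with a short sketch. This is not part of FATMEN: the function name `split_generic_name` and the example name are hypothetical, and the sketch assumes that the 255-byte limit applies to the whole generic-name, which is the stricter reading of the rule.

```python
MAX_TOTAL_BYTES = 255      # total length limit quoted in the text
MAX_COMPONENT_CHARS = 20   # per-component limit quoted in the text


def split_generic_name(name):
    """Return (catalogue, experiment, rest) for a generic-name, or raise ValueError."""
    if not name.startswith("//"):
        raise ValueError("a generic-name starts with '//'")
    # Assumption: the 255-byte limit is applied to the whole name.
    if len(name.encode()) > MAX_TOTAL_BYTES:
        raise ValueError("generic-name is longer than 255 bytes")
    components = name[2:].split("/")
    if len(components) < 3:
        raise ValueError("expected at least catalogue, experiment and filename")
    for part in components:
        if not part:
            raise ValueError("empty component in generic-name")
        if len(part) > MAX_COMPONENT_CHARS:
            raise ValueError(f"component {part!r} is longer than 20 characters")
    catalogue, experiment, *rest = components
    return catalogue, experiment, rest


# Hypothetical name, not taken from any real catalogue.
catalogue, experiment, rest = split_generic_name("//CERN/L3/RAWDATA/PERIOD1/RUN00123")
print(catalogue, experiment, rest[-1])
```

The last element of `rest` is the filename, i.e. the part of the generic-name following the last slash.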
Note that the same generic-name can be used for more than one file. In this case, the files are all assumed to contain the same data, but may well reside on different media or in different locations. The file format may also differ: for example, one copy might be in Zebra FZ native format and another in exchange format. We will see later how different entries with the same generic-name may be selected or listed.
Associated with each generic-name are a catalogue entry and a key vector which contains important information that can be used to make a first pass selection. The catalogue entry is in fact a Zebra bank stored in a Zebra RZ file, and the keys are the normal RZ keys. However, for all practical purposes it is not necessary to know the structure of the catalogue in such detail, particularly when using the FATMEN shell (interactive interface) or the so-called 'novice' FORTRAN callable interface, which both hide Zebra completely.
The key vector contains the filename, i.e. the part of the generic-name following the last slash, and information on the media type on which the data resides, the location of the file and the so-called copy level. By using these keys, it is possible to make a first pass selection of a file, or to view only a subset of a catalogue in a very efficient manner. For example, when working at one's home laboratory, say in the United States, one is probably not interested in looking at catalogue entries corresponding to files located several thousand kilometers away at CERN, still less in trying to access them over the network.
The meanings of the various keys are experiment defined, except for the media type, which is defined as follows:
```
1: disk
2: 3480 cartridge
3: 3420 tape
4: 8200 Exabyte cartridge
```
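As a rough illustration of a first pass selection on the key vector, the sketch below filters a list of entries by media type, location and copy level. It is not the FATMEN interface: the `KeyVector` class, the `first_pass` function and the sample entries are hypothetical, and only the media type codes are taken from the table above.

```python
from dataclasses import dataclass

# Media type codes as listed in the table above.
MEDIA_TYPES = {
    1: "disk",
    2: "3480 cartridge",
    3: "3420 tape",
    4: "8200 Exabyte cartridge",
}


@dataclass(frozen=True)
class KeyVector:
    """Hypothetical stand-in for the keys attached to one catalogue entry."""
    filename: str
    media_type: int
    location: int
    copy_level: int


def first_pass(keys, media_types=None, locations=None, copy_levels=None):
    """Keep the entries whose keys match every criterion that was given."""
    def allowed(value, wanted):
        return wanted is None or value in wanted

    return [
        k for k in keys
        if allowed(k.media_type, media_types)
        and allowed(k.location, locations)
        and allowed(k.copy_level, copy_levels)
    ]


# Hypothetical entries; the location and copy level values are experiment defined.
entries = [
    KeyVector("RUN00123", media_type=1, location=31, copy_level=1),
    KeyVector("RUN00123", media_type=2, location=0, copy_level=2),
    KeyVector("RUN00124", media_type=4, location=33101, copy_level=1),
]

for k in first_pass(entries, media_types={1, 2}):
    print(k.filename, MEDIA_TYPES[k.media_type], k.location)
```

Because the selection only looks at the keys, entries that do not match never need to be examined in detail.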
The location code

The location code is one piece of information available to FATMEN to select the best available source of data. The following convention is used by OPAL:
```
         0=Cern Vault     CERNVM VXCERN CRAY SHIFT etc
         1=Cern Vault
         2=Cern Vault
        11=VXOPON         OPAL Online Vax cluster
        12=Online         OPAL (apollo) online facilities
        21=VXOPOF         OPAL Offline cluster
        31=SHIFT          SHIFT disk and archive storage
     33101=Saclay         Active cartridges
     33901=Saclay         'obsolete' cartridges
     44501=UKACRL         Active cartridges
     44901=UKACRL         'obsolete' cartridges
```
Even if the location code is not set, FATMEN will still be able to find 'the best' copy of a file. However, it is much more efficient to restrict the search by specifying one or more location codes, as this results in less I/O to the FATMEN catalogue and, more importantly, fewer queries to the Tape Management System (TMS).
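The effect of restricting the search can be sketched as follows. The location names come from the OPAL table above; everything else is hypothetical, including the function names, the sample entries and the simplifying assumption that each remaining candidate costs one lookup. The sketch does not model how FATMEN actually chooses 'the best' copy.

```python
# OPAL location codes, taken from the table above.
OPAL_LOCATIONS = {
    0: "Cern Vault", 1: "Cern Vault", 2: "Cern Vault",
    11: "VXOPON", 12: "Online", 21: "VXOPOF", 31: "SHIFT",
    33101: "Saclay", 33901: "Saclay",
    44501: "UKACRL", 44901: "UKACRL",
}


def restrict(entries, location_codes=None):
    """Keep (generic_name, location_code) pairs at the requested locations.

    With no location codes given, every entry remains a candidate.
    """
    if location_codes is None:
        return list(entries)
    wanted = set(location_codes)
    return [entry for entry in entries if entry[1] in wanted]


def lookups_needed(entries, location_codes=None):
    """Simplifying assumption: one lookup per candidate entry."""
    return len(restrict(entries, location_codes))


# Hypothetical copies of one file held at four locations.
name = "//CERN/OPAL/DST/RUN00123"
copies = [(name, 0), (name, 31), (name, 33101), (name, 44501)]

print(lookups_needed(copies))                        # all four copies are candidates
print(lookups_needed(copies, location_codes=[31]))   # only the SHIFT copy remains
for _, code in restrict(copies, location_codes=[31, 33101]):
    print(code, OPAL_LOCATIONS[code])
```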

The copy level

Initially, the copy level was defined as follows:
```
     0       original
     1       copy of an original
     2       copy of a copy
     ...
```
In practice, this definition has been of limited use, and the copy level is now more commonly used to indicate the data representation. The following definitions are used by OPAL and correspond to those used by the FPACK package developed at DESY.
```
     1       IEEE floating point format (used on Unix workstations and
                                         for Zebra exchange data format)
     2       IBM  floating point format
     3       VAX  floating point format
     4       IEEE floating point format, but byte-swapped
     5       Cray floating point format
```
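One possible use of the copy level as a data representation code is to decide whether a copy can be read on the local machine without conversion. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the local machine uses IEEE floating point, that code 1 means big-endian IEEE (as in Zebra exchange format) and that code 4 means the same values in little-endian byte order.

```python
import struct
import sys

# Data representation codes from the table above.
DATA_REPRESENTATION = {
    1: "IEEE floating point",
    2: "IBM floating point",
    3: "VAX floating point",
    4: "IEEE floating point, byte-swapped",
    5: "Cray floating point",
}


def native_copy_level():
    """Code matching this machine (assumption: 1 = big-endian, 4 = little-endian IEEE)."""
    return 1 if sys.byteorder == "big" else 4


def needs_conversion(copy_level, native=None):
    """True if data with this copy level must be converted before local use."""
    if copy_level not in DATA_REPRESENTATION:
        raise ValueError(f"unknown copy level {copy_level}")
    if native is None:
        native = native_copy_level()
    return copy_level != native


def swap_float32(raw):
    """Reverse the byte order of one 4-byte value (code 1 <-> code 4)."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return raw[::-1]


for level, description in DATA_REPRESENTATION.items():
    print(level, description, "convert" if needs_conversion(level) else "native")

# Under the assumptions above, codes 1 and 4 differ only in byte order.
big_endian = struct.pack(">f", 1.5)
print(swap_float32(big_endian) == struct.pack("<f", 1.5))
```

Converting between the IBM, VAX or Cray formats and IEEE involves more than a byte swap and is not attempted here.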
