When using the FATMEN system, all data is referenced by a name known as the generic-name. The generic name has the form
//catalogue/experiment/dir1/dir2/.../dirn/filename

where the slash character (/) is a directory delimiter, as for Unix file names,

  catalogue    indicates in which catalogue the file resides
  experiment   is the name of the experiment to which the file belongs

The rest of the file name is free format, although its total length may not exceed 255 bytes and each component may not exceed 20 characters. Typically, experiments will have conventions for the generic-name, but the sort of information that you might want to include in the generic-name is
Examples of catalogue names are CERN, for all CERN experiments, DESY, for experiments based at DESY, and so on. Examples of experiment names are L3, H1 and CDF.
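The structure of a generic-name can be illustrated with a short Python sketch. It is not part of FATMEN: the function name parse_generic_name, the GenericName tuple and the sample name are hypothetical, and the sketch assumes that the 255-byte limit applies to the complete generic-name.

```python
from typing import NamedTuple, Tuple

MAX_TOTAL_BYTES = 255      # assumed to apply to the complete generic-name
MAX_COMPONENT_CHARS = 20   # limit on each slash-separated component


class GenericName(NamedTuple):
    catalogue: str
    experiment: str
    directories: Tuple[str, ...]
    filename: str


def parse_generic_name(name: str) -> GenericName:
    """Split a generic-name into its parts and check the length limits."""
    if not name.startswith("//"):
        raise ValueError("generic-name must start with '//'")
    if len(name.encode()) > MAX_TOTAL_BYTES:
        raise ValueError("generic-name exceeds 255 bytes")
    parts = name[2:].split("/")
    if len(parts) < 3:
        raise ValueError("need at least catalogue, experiment and filename")
    for part in parts:
        if not part:
            raise ValueError("empty component in generic-name")
        if len(part) > MAX_COMPONENT_CHARS:
            raise ValueError(f"component {part!r} exceeds 20 characters")
    return GenericName(parts[0], parts[1], tuple(parts[2:-1]), parts[-1])


# Hypothetical generic-name, used only for illustration.
parsed = parse_generic_name("//CERN/L3/PROD/DATA/RUN0001")
print(parsed.catalogue, parsed.experiment, parsed.filename)
```

Here the catalogue is CERN, the experiment is L3, and the filename (the part after the last slash) is RUN0001.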
Note that the same generic-name can be used for more than one file. In this case, the files are all assumed to contain the same data, but may well reside on different media or in different locations. The file format may also differ: for example, one copy might be in Zebra FZ native format and another in exchange format. We will see later how different entries with the same generic-name may be selected or listed.
Associated with each generic-name are a catalogue entry and a key vector which contains important information that can be used to make a first pass selection. The catalogue entry is in fact a Zebra bank stored in a Zebra RZ file, and the keys are the normal RZ keys. However, for all practical purposes it is not necessary to know the structure of the catalogue in such detail, particularly when using the FATMEN shell (interactive interface) or the so-called 'novice' FORTRAN callable interface, which both hide Zebra completely.
The key vector contains the filename, i.e. the part of the generic-name following the last slash, and information on the media type on which the data resides, the location of the file and the so-called copy level. By using these keys, it is possible to make a first pass selection of a file, or to view only a subset of a catalogue in a very efficient manner. For example, when working at one's home laboratory, say in the United States, one is probably not interested in looking at catalogue entries corresponding to files located several thousand kilometers away at CERN, still less in trying to access them over the network.
The meanings of the various keys are experiment defined, except for the media type, which is defined as follows:
  1: disk
  2: 3480 cartridge
  3: 3420 tape
  4: 8200 Exabyte cartridge
The location code is one piece of information available to FATMEN to select the best available source of data. The following convention is used by OPAL:
      0=Cern Vault   CERNVM VXCERN CRAY SHIFT etc
      1=Cern Vault
      2=Cern Vault
     11=VXOPON       OPAL Online Vax cluster
     12=Online       OPAL (apollo) online facilities
     21=VXOPOF       OPAL Offline cluster
     31=SHIFT        SHIFT disk and archive storage
  33101=Saclay       Active cartridges
  33901=Saclay       'obsolete' cartridges
  44501=UKACRL       Active cartridges
  44901=UKACRL       'obsolete' cartridges

Even if the location code is not set, FATMEN will still be able to find 'the best' copy of a file. However, it is much more efficient to restrict the search by specifying one or more location codes, as this results in less I/O to the FATMEN catalogue and, more importantly, fewer queries to the Tape Management System (TMS).
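The idea of a first pass selection on the keys can be sketched in Python as follows. This is only an illustration of the principle, not the FATMEN interface: the KeyVector class, the first_pass_select function and the sample entries are hypothetical, while the media type and location codes are those listed above.

```python
from dataclasses import dataclass

# Media type codes as defined above.
MEDIA_TYPES = {
    1: "disk",
    2: "3480 cartridge",
    3: "3420 tape",
    4: "8200 Exabyte cartridge",
}


@dataclass(frozen=True)
class KeyVector:
    filename: str
    media_type: int
    location: int
    copy_level: int


def first_pass_select(entries, locations=None, media_types=None):
    """Keep entries whose keys match; None means 'no restriction'."""
    selected = []
    for entry in entries:
        if locations is not None and entry.location not in locations:
            continue
        if media_types is not None and entry.media_type not in media_types:
            continue
        selected.append(entry)
    return selected


# Three hypothetical entries sharing one generic-name (OPAL location codes).
entries = [
    KeyVector("RUN0001", media_type=1, location=31, copy_level=1),     # SHIFT disk
    KeyVector("RUN0001", media_type=2, location=0, copy_level=2),      # Cern Vault
    KeyVector("RUN0001", media_type=2, location=33101, copy_level=1),  # Saclay
]

# Restrict the search to copies held at CERN (location codes 0 and 31).
for entry in first_pass_select(entries, locations={0, 31}):
    print(entry.filename, MEDIA_TYPES[entry.media_type], entry.location)
```

Restricting on the keys first means that the Saclay entry is never examined further, which is the saving in catalogue I/O and TMS queries described above.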
Initially, the copy level was defined as follows:
  0 original
  1 copy of an original
  2 copy of a copy
  ...
In fact, this has been of limited use and it is now more commonly used to indicate the data representation. The following definitions are used by OPAL and correspond to those used by the FPACK package developed at DESY.
  1 IEEE floating point format (used on Unix workstations and for Zebra exchange data format)
  2 IBM floating point format
  3 VAX floating point format
  4 IEEE floating point format, but byte-swapped
  5 Cray floating point format
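When the copy level is used in this way, a program can prefer the copy whose data representation matches the local machine. The following Python sketch illustrates this under stated assumptions: the names DATA_REPRESENTATION and rank_by_representation are hypothetical and are not part of FATMEN or FPACK; only the codes themselves are taken from the list above.

```python
# Data representation codes as used by OPAL (see the list above).
DATA_REPRESENTATION = {
    1: "IEEE floating point",
    2: "IBM floating point",
    3: "VAX floating point",
    4: "IEEE floating point, byte-swapped",
    5: "Cray floating point",
}


def rank_by_representation(copy_levels, native_level):
    """Order copy levels so that those matching the native format come first.

    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(copy_levels, key=lambda level: level != native_level)


# Copy levels of four hypothetical entries, ranked for a machine using code 1.
ranked = rank_by_representation([2, 1, 3, 1], native_level=1)
for level in ranked:
    print(level, DATA_REPRESENTATION.get(level, "unknown"))
```

Copies in the native representation are listed first, so that a conversion is needed only when no such copy exists.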