In order to explain the advantages of the Column-Wise-Ntuple format, we consider a small data sample containing some characteristics of the CERN staff as they were in 1988. For each member of the staff there exists one entry in the file. Each entry consists of 11 values, as described in the following table:
Variable Name | Description and possible values |
CATEGORY: | Professional category (integer between 100 and 600) |
 | 100-199: Scientific staff |
 | 200-299: Engineering staff |
 | 300-399: Technical support staff |
 | 400-499: Crafts and trade support staff |
 | 500-529: Supervisory administrative staff |
 | 530-559: Intermediate level administrative staff |
 | 560-599: Lower level administrative staff |
DIVISION: | Code for each division (character variable) |
 | 'AG', 'DD', 'DG', 'EF', 'EP', 'FI', 'LEP', 'PE', 'PS', 'SPS', 'ST', 'TH', 'TIS' |
FLAG: | A flag where the first four bits have the following significance |
 | Bit 1 = 0 means female, otherwise male |
 | Bit 2 = 0 means resident, otherwise non-resident |
 | Bit 3 = 0 means single, otherwise head of family |
 | Bit 4 = 0 means fixed term contract, otherwise indefinite duration contract |
AGE: | Age (in years) of staff member |
SERVICE: | Number of years of service that the staff member has at CERN |
CHILDREN: | Number of dependent children |
GRADE: | Staff member's position in grade scale (integer between 3 and 14) |
STEP: | Staff member's position (step) inside given grade (integer between 0 and 15) |
NATION: | Code for staff member's nationality (character variable) |
 | 'AT', 'BE', 'CH', 'DE', 'DK', 'ES', 'FR', 'GB', 'GR', 'IT', 'NL', 'NO', 'PT', 'SE', 'ZZ' |
HRWEEK: | Number of contractual hours worked per week (between 20 and 44) |
COST: | Cost of the staff member to CERN (in CHF) |
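The four FLAG bits listed in the table can be tested with the standard bit intrinsic BTEST. The following is only a minimal sketch: it assumes that "Bit 1" of the table is the least significant bit (bit position 0 for BTEST), which the table does not state explicitly. The program name FLGDEC, the variable names and the sample value 13 are hypothetical and chosen for illustration.

```fortran
      PROGRAM FLGDEC
*
*-- Decode the four FLAG bits described in the table.
*-- ASSUMPTION: "Bit 1" of the table is the least significant bit,
*-- i.e. bit position 0 for the BTEST intrinsic.
*
      INTEGER FLAG
      LOGICAL MALE, NONRES, HEADFM, INDEF
*
*-- hypothetical sample value: binary 1101 = 13
      FLAG   = 13
*
*-- a bit that is set selects the "otherwise" meaning of the table
      MALE   = BTEST(FLAG, 0)
      NONRES = BTEST(FLAG, 1)
      HEADFM = BTEST(FLAG, 2)
      INDEF  = BTEST(FLAG, 3)
*
      PRINT *, 'male:                ', MALE
      PRINT *, 'non-resident:        ', NONRES
      PRINT *, 'head of family:      ', HEADFM
      PRINT *, 'indefinite contract: ', INDEF
      END
```

Under the stated bit-order assumption, the sample value 13 describes a male, resident head of family holding an indefinite duration contract.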
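Note how the constraints on the variables listed in the table are expressed in the HBNAME call of the job that creates the Ntuple: for example, CATEGORY[100,600]:I declares an integer restricted to the range 100 to 600, and FLAG:U:4 an unsigned integer stored in 4 bits.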
On the next pages we show first the creation run, together with its output and the automatically generated analysis skeleton, and then the analysis program created based on the skeleton.
Creating the Ntuple
      PROGRAM CERN
      PARAMETER (NWPAWC = 30000)
      PARAMETER (LRECL  = 1024)
      COMMON /PAWC/ IPAW(NWPAWC)
      REAL    RDATA(11)
      INTEGER CATEGORY, FLAG, AGE, SERVICE, CHILDREN, GRADE, STEP,
     +        HRWEEK, COST
      CHARACTER*4 DIVISION, NATION
      COMMON /CERN/ CATEGORY, FLAG, AGE, SERVICE, CHILDREN, GRADE,
     +              STEP, HRWEEK, COST
      COMMON /CERNC/ DIVISION, NATION
      CHARACTER*4 DIVS(13), NATS(15)
      DATA DIVS /'AG', 'DD', 'DG', 'EF', 'EP', 'FI', 'LEP', 'PE',
     +           'PS', 'SPS', 'ST', 'TH', 'TIS'/
      DATA NATS /'AT', 'BE', 'CH', 'DE', 'DK', 'ES', 'FR', 'GB',
     +           'GR', 'IT', 'NL', 'NO', 'PT', 'SE', 'ZZ'/

      CALL HLIMIT(NWPAWC)
*
*-- open a new RZ file
*
      CALL HROPEN(1,'MYFILE','cern.hbook','N',LRECL,ISTAT)
*
*-- book Ntuple
*
      CALL HBNT(10,'CERN Population',' ')
*
*-- define Ntuple (1 block with 11 columns)
*
      CALL HBNAME(10, 'CERN', CATEGORY, 'CATEGORY[100,600]:I,
     +            FLAG:U:4, AGE[1,100]:I, SERVICE[0,60]:I,
     +            CHILDREN[0,10]:I, GRADE[3,14]:I, STEP[0,15]:I,
     +            HRWEEK[20,44]:I, COST:I')
      CALL HBNAMC(10, 'CERN', DIVISION, 'DIVISION:C, NATION:C')
*
*-- open data file with staff information
*
      OPEN(2,FILE='aptuple.dat', STATUS='OLD')
*
*-- read data and store in Ntuple
*
 10   READ(2, '(10F4.0, F7.0)', END=20) RDATA
*
      CATEGORY = RDATA(1)
      DIVISION = DIVS(INT(RDATA(2)))
      FLAG     = RDATA(3)
      AGE      = RDATA(4)
      SERVICE  = RDATA(5)
      CHILDREN = RDATA(6)
      GRADE    = RDATA(7)
      STEP     = RDATA(8)
      NATION   = NATS(INT(RDATA(9)))
      HRWEEK   = RDATA(10)
      COST     = RDATA(11)
      CALL HFNT(10)
      GOTO 10
*
*-- read data of person #100
*
 20   I = 100
      CALL HGNT(10, I, IER)
      IF (IER .NE. 0) THEN
         PRINT *, 'Error reading row ',I
      ENDIF
      PRINT *,'Person 100',' ',CATEGORY,' ',DIVISION,' ',AGE,' ',NATION
*
*-- print Ntuple definition
*
      CALL HPRNT(10)
*
*-- write batch version of analysis routine to file staff.f
*
      OPEN(3, FILE='staff.f', STATUS='UNKNOWN')
      CALL HUWFUN(3, 10, 'STAFF', 0, 'B')
*
*-- write Ntuple buffer to disk and close RZ file
*
      CALL HROUT(10, ICYCLE, ' ')
      CALL HREND('MYFILE')
*
      END
Output generated by running the above program
 ***** ERROR in HFNT : HRWEEK: Value out of range, event 2668 : ID= 10
 ***** ERROR in HFNT : HRWEEK: Value out of range, event 2673 : ID= 10
 ***** ERROR in HFNT : HRWEEK: Value out of range, event 2710 : ID= 10
 ***** ERROR in HFNT : HRWEEK: Value out of range, event 2711 : ID= 10
 ***** ERROR in HFNT : HRWEEK: Value out of range, event 2833 : ID= 10
 Person 100 415 PS 55 FR

 *****************************************************************
 * Ntuple ID = 10      Entries = 3354      CERN Population       *
 *****************************************************************
 * Var numb * Type * Packing *    Range    *  Block   *  Name    *
 *****************************************************************
 *    1     * I*4  *   11    * [100,600]   * CERN     * CATEGORY *
 *    2     * U*4  *    4    *             * CERN     * FLAG     *
 *    3     * I*4  *    8    * [1,100]     * CERN     * AGE      *
 *    4     * I*4  *    7    * [0,60]      * CERN     * SERVICE  *
 *    5     * I*4  *    5    * [0,10]      * CERN     * CHILDREN *
 *    6     * I*4  *    5    * [3,14]      * CERN     * GRADE    *
 *    7     * I*4  *    5    * [0,15]      * CERN     * STEP     *
 *    8     * I*4  *    7    * [20,44]     * CERN     * HRWEEK   *
 *    9     * I*4  *         *             * CERN     * COST     *
 *   10     * C*4  *         *             * CERN     * DIVISION *
 *   11     * C*4  *         *             * CERN     * NATION   *
 *****************************************************************
 *  Block   * Unpacked Bytes * Packed Bytes *   Packing Factor   *
 *****************************************************************
 * CERN     *       44       *      19      *       2.316        *
 * Total    *       44       *      19      *       2.316        *
 *****************************************************************
 *  Number of blocks = 1          Number of columns = 11         *
 *****************************************************************
Note the HFNT error messages: five entries in the input file carry an HRWEEK value outside the declared range [20,44], and HFNT reports each of them together with its event number. This is an example of the error checking performed by the CWN routines.
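The "Packing" column and the packed size in the output above can be checked by hand. The sketch below reproduces those numbers under a rule that is inferred from the printed table and not taken from the HBOOK source: a ranged signed integer appears to be stored in the number of bits needed for the largest magnitude of its range, plus one sign bit, and the bit total appears to be rounded up to whole bytes. The program name PACKSZ and the function NBITS are hypothetical helpers and are not HBOOK routines.

```fortran
      PROGRAM PACKSZ
*
*-- Reproduce the "Packing" column and the packed size printed by
*-- HPRNT. ASSUMPTION (inferred from the printed table only): a
*-- ranged signed integer is stored in the bits needed for the
*-- largest magnitude of its range plus one sign bit.
*
      INTEGER NCOL
      PARAMETER (NCOL = 7)
      INTEGER LO(NCOL), HI(NCOL), I, NTOT, NBYTES, NBITS
*
*-- ranges of CATEGORY, AGE, SERVICE, CHILDREN, GRADE, STEP, HRWEEK
      DATA LO /100,   1,  0,  0,  3,  0, 20/
      DATA HI /600, 100, 60, 10, 14, 15, 44/
*
*-- FLAG:U:4 uses 4 bits; COST, DIVISION and NATION are stored as
*-- unpacked 32-bit words
      NTOT = 4 + 3*32
      DO 10 I = 1, NCOL
         PRINT *, 'Range [', LO(I), ',', HI(I), ']  bits =',
     +            NBITS(LO(I), HI(I))
         NTOT = NTOT + NBITS(LO(I), HI(I))
   10 CONTINUE
*
*-- round the bit total up to whole bytes
      NBYTES = (NTOT + 7) / 8
      PRINT *, 'Packed bits    =', NTOT
      PRINT *, 'Packed bytes   =', NBYTES
      PRINT *, 'Packing factor =', 44.0 / REAL(NBYTES)
      END
*
      INTEGER FUNCTION NBITS(LO, HI)
*-- bits for MAX(|LO|,|HI|) plus one sign bit (hypothetical helper)
      INTEGER LO, HI, M
      M = MAX(ABS(LO), ABS(HI))
      NBITS = 1
   20 IF (M .GT. 0) THEN
         NBITS = NBITS + 1
         M = M / 2
         GOTO 20
      ENDIF
      END
```

With these assumptions the seven ranged columns need 11, 8, 7, 5, 5, 5 and 7 bits, the whole entry needs 148 bits, that is 19 bytes, and 44 unpacked bytes divided by 19 packed bytes gives the packing factor of about 2.316 shown in the table.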
Analysis skeleton generated for above example
      SUBROUTINE STAFF
*********************************************************
*                                                       *
* This file was generated by HUWFUN.                    *
*                                                       *
*********************************************************
*
*     N-tuple Id:    10
*     N-tuple Title: CERN Population
*     Creation:      12/06/92 11.46.34
*
*********************************************************
*
      INTEGER CATEGORY,FLAG,AGE,SERVICE,CHILDREN,GRADE,STEP,HRWEEK,COST
      CHARACTER DIVISION*4,NATION*4
      COMMON /CERN/ CATEGORY,FLAG,AGE,SERVICE,CHILDREN,GRADE,STEP,HRWEEK
     + ,COST
      COMMON /CERN1/ DIVISION,NATION
*
      CALL HBNAME(10,' ',0,'$CLEAR')
      CALL HBNAME(10,'CERN',CATEGORY,'$SET')
      CALL HBNAMC(10,'CERN',DIVISION,'$SET')
*
*--   Enter user code here
*
      END
This skeleton is used in the example below to prepare a job for analysing the Ntuple data sample.
Example of Fortran code based on skeleton
      PROGRAM NEWNTUP
      PARAMETER (NWPAWC = 30000)
      PARAMETER (LRECL  = 1024)
      COMMON /PAWC/ IPAW(NWPAWC)

      CALL HLIMIT(NWPAWC)
      CALL HROPEN(1,'MYFILE','cern.hbook',' ',LRECL,ISTAT)
      CALL HRIN(10,9999,0)
      CALL STAFF
      CALL HREND('MYFILE')
      END

      SUBROUTINE STAFF
*********************************************************
*                                                       *
* This file was generated by HUWFUN.                    *
*                                                       *
*********************************************************
*
*     N-tuple Id:    10
*     N-tuple Title: CERN Population
*     Creation:      12/06/92 11.46.34
*
*********************************************************
*
      INTEGER CATEGORY,FLAG,AGE,SERVICE,CHILDREN,GRADE,STEP,HRWEEK,COST
      COMMON /CERN/ CATEGORY,FLAG,AGE,SERVICE,CHILDREN,GRADE,STEP,HRWEEK
     + ,COST
      CHARACTER DIVISION*4,NATION*4
      COMMON /CERN1/ DIVISION,NATION
*
      CHARACTER*8 VAR(4)
*
      CALL HBNAME(10,' ',0,'$CLEAR')          ! Clear addresses in Ntuple
      CALL HBNAME(10,'CERN',CATEGORY,'$SET')  ! Set addresses for variables CATEGORY...
      CALL HBNAMC(10,'CERN',DIVISION,'$SET')  ! Set addresses for variables DIVISION...
*
*--   Enter user code here
*
*-- book the histograms
*
      CALL HBOOK1(101, 'Staff Age', 45, 20., 65., 0.)
      CALL HBOOK1(102, 'Number of years at CERN', 35, 0., 35., 0.)
      CALL HBOOK2(103, 'Grade vs. Step', 12, 3., 15., 16, 0., 16., 0.)
      CALL HBIGBI(101,2)
      CALL HBIGBI(102,2)
*
*-- get number of entries
*
      CALL HNOENT(10, NLOOP)
*
*-- read only the four desired columns
*
      VAR(1) = 'AGE'
      VAR(2) = 'SERVICE'
      VAR(3) = 'GRADE'
      VAR(4) = 'STEP'
      CALL HGNTV(10, VAR, 4, 1, IER)
      DO 10 I = 1, NLOOP
         IF (I.NE.1) CALL HGNTF(10, I, IER)
         IF (IER .NE. 0) THEN
            PRINT *, 'Error reading row ', I
         ENDIF
         CALL HFILL(101, FLOAT(AGE), 0., 1.)
         CALL HFILL(102, FLOAT(SERVICE), 0., 1.)
         CALL HFILL(103, FLOAT(GRADE), FLOAT(STEP), 1.)
 10   CONTINUE
*
      CALL HISTDO
*
      END
The Ntuple summary table shown below, obtained by running the program above on the CERN Ntuple, should be compared with the table printed during the creation run, shown earlier in this section.
 HBOOK     HBOOK  CERN     VERSION 4.17     HISTOGRAM AND PLOT INDEX     09/03/93

 NO  TITLE                     ID  B/C  ENTRIES  DIM  NCHA   LOWER     UPPER    ADDRESS  LENGTH

  1  CERN Population           10   N                                             27174      37
  2  Staff Age                101   32     3354  1 X    45  .200E+02  .650E+02    26527      90
  3  Number of years at CERN  102   32     3354  1 X    35  .000E+00  .350E+02    26432      83
  4  Grade vs. Step           103   32     3354  2 X    12  .300E+01  .150E+02    26347     298
                                                   Y    16  .000E+00  .160E+02    26074     264

 MEMORY UTILISATION
      MAXIMUM TOTAL SIZE OF COMMON /PAWC/     30000

 *****************************************************************
 * Ntuple ID = 10      Entries = 3354      CERN Population       *
 *****************************************************************
 * Var numb * Type * Packing *    Range    *  Block   *  Name    *
 *****************************************************************
 *    1     * I*4  *   11    * [100,600]   * CERN     * CATEGORY *
 *    2     * U*4  *    4    *             * CERN     * FLAG     *
 *    3     * I*4  *    8    * [1,100]     * CERN     * AGE      *
 *    4     * I*4  *    7    * [0,60]      * CERN     * SERVICE  *
 *    5     * I*4  *    5    * [0,10]      * CERN     * CHILDREN *
 *    6     * I*4  *    5    * [3,14]      * CERN     * GRADE    *
 *    7     * I*4  *    5    * [0,15]      * CERN     * STEP     *
 *    8     * I*4  *    7    * [20,44]     * CERN     * HRWEEK   *
 *    9     * I*4  *         *             * CERN     * COST     *
 *   10     * C*4  *         *             * CERN     * DIVISION *
 *   11     * C*4  *         *             * CERN     * NATION   *
 *****************************************************************
 *  Block   * Unpacked Bytes * Packed Bytes *   Packing Factor   *
 *****************************************************************
 * CERN     *       44       *      19      *       2.316        *
 * Total    *       44       *      19      *       2.316        *
 *****************************************************************
 *  Blocks = 1        Variables = 11        Columns = 11         *
 *****************************************************************

 [Line-printer histogram: Staff Age, HBOOK ID = 101, DATE 09/03/93, NO = 1; 45 channels, low edges 20 to 64]
 * ENTRIES = 3354         * ALL CHANNELS = .3354E+04   * UNDERFLOW = .0000E+00   * OVERFLOW = .0000E+00
 * BIN WID = .1000E+01    * MEAN VALUE   = .4765E+02   * R . M . S = .8643E+01

 [Line-printer histogram: Number of years at CERN, HBOOK ID = 102, DATE 09/03/93, NO = 2; 35 channels, low edges 0 to 34]
 * ENTRIES = 3354         * ALL CHANNELS = .3349E+04   * UNDERFLOW = .0000E+00   * OVERFLOW = .5000E+01
 * BIN WID = .1000E+01    * MEAN VALUE   = .1943E+02   * R . M . S = .8124E+01

 [Line-printer scatter plot: Grade vs. Step, HBOOK ID = 103, DATE 09/03/93, NO = 3; X = GRADE (12 channels, 3 to 15), Y = STEP (16 channels, 0 to 16)]
 * ENTRIES = 3354         * SATURATION AT = INFINITY   * STEP = 1.00             * MINIMUM = 0.000E+00