LINTRA -- Principal components analysis program

Authors: R. Brun                                          DD/EE/79-5
         M. Hansroul                                      DD/US/70
         J. Kubler                                        Revised 80.05.10
 
 
 
                              1.  INTRODUCTION
 
 
 
       In many applications of various  fields of research,  the treat-
 ment of large amounts of data  requires powerful techniques capable of
 rapid data reduction and analysis.  Usually,  the quantities most con-
 veniently measured by the experimentalist are not necessarily the most
 significant for the  classification and analysis of  his observations.
 It is then very  helpful to have a way of selecting  an optimal set of
 variables necessary for  the recognition process and  reducing the di-
 mensionality of the problem resulting  in easier classification proce-
 dure.
 
      This paper describes the implementation at CERN of one such meth-
 od of  feature selection,  namely  the principal  components analysis.
 This multi-dimensional technique is well known in the field of pattern
 recognition (Ref.  1)  and its use  in Particle Physics has been docu-
 mented elsewhere (Refs. 2, 3 and 4).
 
 
 1.2   EXAMPLE OF APPLICATION
       ______________________
 
 
      Let us consider  the case where the prototypes in  the sample are
 trajectories of particles passing through a spectrometer. If one meas-
 ures the passage of the particle at say, 8 fixed planes, the trajecto-
 ry is then described by an 8-component vector
  õ
  x = (x(1),...,x(8)) in an 8-dimensional pattern space.
 
      One proceeds  then by generating  a representative  tracks sample
 and building up the covariance matrix C. Its eigenvectors and eigenva-
 lues are computed by standard methods and thus a new basis is obtained
 for the original 8-dimensional space. The expansion of the prototypes
 
 
         õ      8        õ                      õt   õ
         x(m) = ÷  a(m,i)e(i)   with   a(m,i) = x(m).e(i)
                i=1
 
 allows the study of the behaviour  of the coefficients a(m,i)  for all
 the tracks in the sample. The eigenvectors which are insignificant for
 the trajectory  description in  the expansion  will have  their corre-
 sponding coefficients a(m,i) close to zero for all the prototypes.  On
 one hand a  reduction of the dimensionality is then  obtained by omit-
 ting these least  significant vectors in the  subsequent analysis.  On
 the other hand,  in the analysis  of real data these least significant
 variables can be used for the  pattern recognition problem of extract-
 ing the valid combinations of coordinates describing a true trajectory
 from the set of all possible wrong combinations.
 
      The  program described  here performs  this principal  components
 analysis on a sample of data provided by the user. It computes the co-
 variance matrix,   its eigenvalues and corresponding  eigenvectors and
 exhibits the behaviour of the principal components [a(m,i)], thus pro-
 viding to the user all the means of understanding his data.
 
    A short outline  of the method of Principal Components  is given in
 the Appendix.
 
 
 
 1.3   STRUCTURE OF THE PACKAGE LINTRA
       _______________________________
 
    The LINTRA package  has been written in FORTRAN and  can readily be
 implemented on various computers*.   The block diagram below shows the
 structure of LINTRA :
 
  +----------------+
  |                |----õ+-----------------------------------------+
  |  L I N T R A   |     | LINIT :  initialisation of control      |
     ___________
  |                |     |          variables                      |
  |                |Ê----+-----------------------------------------+
  |                |
  |                |----õ+-----------------------------------------+
  |                |     | LCREAD : data cards reading             |
  |                |Ê----+-----------------------------------------+
  | Initialisation |
  |                |----õ+-----------------------------------------+
  |                |     | LZBOOK : initialisation of data blocks  |
  |                |Ê----+-----------------------------------------+
  |                |
  |                |----õ+----------------------------+--õ+--------+
  |                |     | LINEAR: computes covariance|   | LINONE |
  |                |     |     matrix, its eigenvector|Ê--+--------+
  |                |     |     and eigenvalues        |
  |                |Ê----+----------------------------+
  |                |
  |                |----õ+-----------------------------------------+
  |                |     | LHBOOK : booking of histograms          |
  |                |Ê----+-----------------------------------------+
  |                |
  |  Computation   |----õ+-----------------------------------------+
  |                |     | LMATOU : prints results of LINEAR       |
  |                |Ê----+-----------------------------------------+
  |      of        |
  |                |----õ+-----------------------------------------+
  |                |     | LPUNCR : writes a FORTRAN subroutine    |
  | transformation |     |          performing the principal       |
  |                |     |          components transformation      |
  |                |Ê----+-----------------------------------------+
  |                |
  |                |----õ+-----+
  |                |     |LTEST|
  |                |     |   +-|--õ+-------------------------------+
  |                |     |   | |   | LINONE : reads one event      |
  |                |     |   | |Ê--+-------------------------------+
  |                |     |   | |
  |                |     |   | |--õ+-------------------------------+
  |     Test       |     |   | |   | LTOXSI : computes principal   |
  |                |     |   | |   |          components           |
  |                |     |   | |Ê--+-------------------------------+
  |                |     |   | |
  |      of        |     |   | |--õ+-------------------------------+
  |                |     |   | |   | LTOX : computes original event|
  |                |     |   | |   |        from  principal compo- |
  |                |     |   | |   |        nents                  |
  | transformation |     |   | |Ê--+-------------------------------+
  |                |     |   | |
  |                |     |   | |--õ+-------------------------------+
  |                |     |   | |   | LUSER : user subroutine       |
  |                |     |   | |Ê--+-------------------------------+
  |                |     |   | |
  |                |     |   | |--õ+-------------------------------+
  |                |     |   | |   | LSUMSQ : computes residuals   |
  |                |     |   +-|Ê--+-------------------------------+
  |                |Ê----+-----+
  |                |
  |                |----õ+-----------------------------------------+
  |                |     | LQUIT : prints final results            |
  |                |Ê----+-----------------------------------------+
  +----------------+
 
 
 *  The source  code of LINTRA is  available from the CERN  Program Li-
 brary.
 
 
    LINTRA is essentially divided into 3 parts :
 
 
    i)    the initialisation
 
    ii)   the  principal components  analysis  of  the training  sample
          leading to the linear transformation  to go from the original
          vectors to the new description of  the prototypes in terms of
          the principal components
 
    iii)  the application of the transformation to a test sample.
 
 
    The package utilises several utilities available from the CERN Pro-
 gram Library.  Among them, let us mention
 
      -  the histogramming package HBOOK (Ref. 5)
 
      -  the dynamic memory management system ZBOOK (Ref. 6)
 
      -  the free format input card reading subroutine FFREAD (Ref. 7).
 
 
    The usage  of these  facilities is  completely transparent  and the
 user need not know much about them in order to use LINTRA.
 
    In case the  user has asked for  histograms,  a simple call  to the
 subroutine HISTDO of the package HBOOK after  a call to LINTRA is suf-
 ficient to obtain the printing of the histograms (see Section 3.4).
 
    When LINTRA  is to  be called  from an  existing program  where the
 blank common is used, it is important to know that HBOOK also utilises
 the blank common.  In that case, it is necessary to make a call to the
 subroutine HISTGO of HBOOK in order  to define the starting address of
 the HBOOK area in the blank common.
 
 
 1.4   USAGE
       _____
 
 
    The subroutine LINTRA can be called from any existing user program.
 
 
    LINTRA uses two labelled common blocks :
 
 
    i)    the common LFLAG has a length of 115 words and contains main-
          ly control variables
 
    ii)   the common  LINT is dimensionned  at 2000 words  and contains
          the data blocks created by the package ZBOOK.
 
          Its length is sufficient for prototype  vectors of at most 20
                                                             _______
          components.
 
          When more than  20 component vectors are to  be treated,  the
          length of  the common LINT should  be defined by  the calling
          program according to the formula :
 
 
                     Length = 10 PNT Nvar + 4 ù Nvarý + 200
 
 
          where NVAR is the number of components in the original proto-
          types (see example in Section 3.4).
 
 
      In addition,  care must be taken that the logical files needed by
 LINTRA (see data cards PUNC and OUTP in Section 3.2)  must be declared
 in the calling program.
 
 
      In a first approach,  the user  has to provide a small subprogram
 (LINONE) whose action is to read in one element from the sample (Chap-
 ter 2) and also a small set of compulsory data cards (Chapter 3).
 
 
      Additional data cards are available in order to select a methodo-
 logical option or some different output options (Section 3.2).
 
 
      A subroutine (LUSER) may also be optionally provided by the user,
 in case he  wishes to interact with  the program and output  some more
 information (Section 3.3).
 
                          2.  USER SUBPROGRAM
 
 
 
 2.1   SUBROUTINE LINONE
       _________________
 
 
      The task  of this  subroutine is  to provide  one element  of the
 training sample to the principal components analysis program.
 
                                        õ
      An element consists of the vector X of its corresponding NVAR
  variables.
 
      In order to compute the transformation matrix,  the subroutine is
 called as many times as decided by  the user.  When the number of ele-
 ments requested  by the user has  been reached,  the  logical variable
 EOFDAT has to be set to .TRUE.
 
      The subroutine is called again in  order to apply the transforma-
 tion matrix  to a sample of  data and fill various  histograms.   This
 sample can be the same as the previous one.
 
      The call to this subroutine has the form :
 
 
                 CALL LINONE(X,NVAR,EOFDAT)
 
 
 where the meaning of the parameters is as follows :
 
 
  NVAR            =  number of variables used by the principal
                     components analysis (set by the user via
                     data card)
 
  (X(I),I=1,NVAR) =  values of the NVAR variables for the
                     element read
 
  EOFDAT          =  logical variable to be set .TRUE. when
                     the number of elements requested has
                     been reached (EOFDAT is initially set
                     to .FALSE.).
 
 
 2.2   BLOCK DIAGRAM OF ROUTINE LINONE
       _______________________________
 
 
 
      LINONE(X,NVAR,EOFDAT)
 
      +--------------------+
      |  Initialise        |
      |                    |
      |                    |
      |  LUNIT = 1  Ê--------- corresponds to the logical file
      |                    |   TAPE1 declared in the program MAIN
      |                    |
      |  NV = 0     Ê--------- counter set to zero
      |                    |
      |  NMAXEV = n Ê--------- maximum number of events to be read
      |                    |   from TAPE1 - defined by the user
      +--------------------+
                |
                |
                ¾
  +----------------------------+
  | read from the logical unit |
  | LUNIT one event of the     |
  | form                       |
  |                            |
  |     (X(I),I=1,NVAR)        |
  |                            |
  |     NV=NV+1  Ê--------------- increment the counter
  +----------------------------+
                |
                |
                ¾
  +----------------------------+
  |                            |
  | if (NV<NMAXEV) then RETURN |
  |                            |
  |                            |
  | else set  EOFDAT=.TRUE.    |
  |           NV=0 ‡           |
  |                ‡ Ê----------- reset the counter for the test
  |       NMAXEV=m ‡           |  sample and redefine the maximum
  |                            |  number of events to be read
  |                            |  from it
  +----------------------------+
                |
                |
                ¾
  +----------------------------+
  |                            |
  | if the test sample is the  |
  | same as the training sample|
  |                            |
  |  then                      |
  |       REWIND LUNIT         |
  |  else                      |
  |       LUNIT=2  Ê------------- corresponds to the logical file
  |                            |  TAPE2 declared in the program
  |                            |  MAIN
  +----------------------------+
                |
                |
                ¾
         +--------------+
         |    RETURN    |
         +--------------+
 
 2.3   FORTRAN EXAMPLE
       _______________
 
 
 
        SUBROUTINE LINONE(X,NVAR,EOFDAT)
  C
        DIMENSION X(NVAR)
  C
        LOGICAL EOFDAT
  C
  C     INITIALISATION
  C
        DATA LUNIT,NV,NMAXEV/1,0,1000/
  C
        READ (LUNIT) (X(I),I=1,NVAR)
        NV=NV+1
  C
        IF (NV.LT.NMAXEV) GO TO 999
  C
  C     RESET VARIABLES FOR TEST SAMPLE
  C
        EOFDAT=.TRUE.
        NV=0
        NMAXEV=500
  C
  C     IN THIS EXAMPLE THE TEST SAMPLE IS READ ON UNIT 2
  C
        LUNIT=2
   999  RETURN
        END
 
 
                             3.  DATA CARDS
 
 
 
      The program uses the subroutine FFREAD  in order to read the data
 cards in free format.
 
      The first four  characters are used as a keyword  by the program,
 in order to identify the group of  variables set in the remaining part
 of the card.
 
      The remaining pieces of information are  values to be assigned to
 certain control variables used by the program.
 
 
      The data cards are divided into  a compulsory set and an optional
 one.  The  END  card must be read last,  all other cards may be placed
 in any order.
 
 
 
 3.1   COMPULSORY DATA CARDS
       _____________________
 
 
    +--------------------------------------------------------+
    |         |                                              |
    | KEYWORD |                  VARIABLES                   |
    |         |                                              |
    |---------|----------------------------------------------|
    |         |         |                                    |
    | NVAR    |  NVAR   | Number of variables (< 100)        |
    |         |         |                                    |
    |---------|---------|------------------------------------|
    |         |         |                                    |
    | TEST    |         | Number of principal components     |
    |         |  NTEST  | used to reconstruct the original   |
    |         |         | variables.                         |
    |         |         | (by default, NTEST is set to NVAR) |
    |         |         |                                    |
    |---------|---------|------------------------------------|
    |         |         |                                    |
    | END     |         | Terminates the data cards reading. |
    |         |         | (this card has to be the last one) |
    |         |         |                                    |
    +--------------------------------------------------------+
 
 
 
 3.2   OPTIONAL DATA CARDS
       ___________________
 
 
 Methodological option :
 
 
   +----------------------------------------------------+
   |         |                                          |
   | KEYWORD |             VARIABLES                    |
   |         |                                          |
   |---------|------------------------------------------|
   |         |                                          |
   | NORM    | Allows the user to transform original    |
   |         | variables to normalized variables i.e.   |
   |         |                                          |
   |         |  X(I) = (X(I) - X(I)   ) / X(I)          |
   |         |   NEW     OLD    AVERAGE    STANDARD     |
   |         |                             DEVIATION    |
   |         |                                          |
   |         |     This option is useful in the case    |
   |         |     the ranges of the variables are very |
   |         |     different.                           |
   |         |                                          |
   +----------------------------------------------------+
 
 
 Output options :
 
   +---------------------------------------------------------+
   |         |                                               |
   | KEYWORD |           VARIABLES                           |
   |         |                                               |
   |---------|-----------------------------------------------|
   |         |         |                                     |
   | DEBU    |    1    | Debug of the events between event   |
   |         | (IDEMIN)| number IDEMIN and event number      |
   |         |    2    | IDEMAX. For each event, original    |
   |         | (IDEMAX)| variables, the coordinates of the   |
   |         |         | event on each principal component   |
   |         |         | and the reconstructed variables     |
   |         |         | using the NTEST first components    |
   |         |         | are successively printed            |
   |---------|---------|-------------------------------------|
   |         |         |                                     |
   | HIST    |    1    | Output of the histograms of the     |
   |         |    2    | principal components number 1..2... |
   |         |    3    | ..NVAR (see example).               |
   |         |    .    |                                     |
   |         |    .    |                                     |
   |---------|---------|-------------------------------------|
   |         |         |                                     |
   | RESI    |         | Output of the sum of squares of     |
   |         |         | the residuals for each variable,    |
   |         |         | taking into account from 1 to NVAR  |
   |         |         | principal components for the        |
   |         |         | reconstruction                      |
   |         |         |                                     |
   |---------|---------|-------------------------------------|
   |         |         |                                     |
   | PUNC    | LPUNCH  | writes on logical unit LPUNCH the   |
   |         |         | fortran code of the subroutine      |
   |         |         | XTOXSI which computes the coordina- |
   |         |         | tes for each event in the new system|
   |         |         | of axes defined by the principal    |
   |         |         | components                          |
   |         |         |                                     |
   |---------|---------|-------------------------------------|
   |         |         |                                     |
   | OUTP    | LPRINT  | writes on logical unit LPRINT all   |
   |         |         | the results of the principal        |
   |         |         | components analysis.                |
   |         |         | Default = line printer file         |
   |         |         |                                     |
   +---------------------------------------------------------+
 
 
          The logical file numbers set by  the data cards PUNC and OUTP
 Remark :
          must of  course be defined  by the  PROGRAM card of  the main
          program supplied by the user (see 2.1) on CDC,  or in the JCL
          on IBM.
 
 
 Example of data cards :
 
      LIST  ACTIVATES THE PRINTING OF THE DATA CARDS
 
      NVAR  8     NUMBER OF INITIAL VARIABLES PER EVENT
 
      TEST  5     NUMBER OF VARIABLES USED FOR THE RECONSTRUCTION
 
      DEBU  1,10  DEBUG OF EVENTS ONE TO TEN
 
      RESI        THE CALCULATION OF THE RESIDUALS IS MADE
 
      HIST  1=1 3=1 5=1 HISTOGRAM OF PRINC. COMP. ONE, THREE, FIVE
 
      NORM        THE VARIABLES ARE NORMALIZED
 
      PUNC  4     FORTRAN CODE WRITTEN ON LOGICAL UNIT FOUR
 
      END
 
 
 3.3   OPTIONAL SUBROUTINE
       ___________________
 
      The user subroutine LUSER allows one  to display some more infor-
 mation (such as histograms ..) and/or to perform some computation.
 
      The subroutine  LUSER is  called during the  second phase  of the
 program,  when the test of the adjustment is made.  In this case,  the
 data sample used by  the program is the test sample.    For each event
 read by the program from the test sample,  the subroutine gives access
 to the  coordinates of this  event and to  each of the  NVAR principal
 components.
 
      The call to this subroutine has the form :
 
                CALL LUSER(X,XSI,NVAR)
 
 where the meaning of X, XSI and NVAR is described below.
 
 
 Example :
 
       SUBROUTINE LUSER(X,XSI,NVAR)
       DIMENSION X(1),XSI(1)
 C
 C  NVAR              = NUMBER OF ORIGINAL VARIABLES
 C  (X(I),I=1,NVAR)   = VALUE OF THE NVAR ORIGINAL VARIABLES FOR
 C                      THE EVENT READ FROM THE TEST SAMPLE
 C  (XSI(I),I=1,NVAR) = COORDINATES OF THE ORIGINAL EVENT IN THE
 C                      NEW SYSTEM OF AXES DEFINED BY THE NVAR
 C                      PRINCIPAL COMPONENTS
 C  THIS ROUTINE COMPUTES THE DISTRIBUTION OF THE DISTANCE
 C  TO THE HYPERPLANE FORMED BY THE NTEST FIRST
 C  PRINCIPAL COMPONENTS (NTEST < NVAR)
 C
       DATA NTEST /5/
       DATA IENTRY /0/
       IF (IENTRY.NE.0) GO TO 200
 C
 C  FIRST ENTRY
 C
       IENTRY=1
       CALL HBOOK1(999,29HDISTRIBUTION OF THE DISTANCE$,100,0.,0.,0.)
 C
 C  SECOND ENTRY
 C
  200  CONTINUE
 C
 C  COMPUTE THE DISTANCE TO THE HYPERPLANE OF DIMENSION NTEST
 C
       D=0.
       NTT=NTEST+1
       DO 10 I=NTT,NVAR
  10   D=D+XSI(I)**2
       D=SQRT(D)
 C
 C  FILLING OF THE HISTOGRAM
 C
       CALL HFILL(999,D,0.,1.)
       RETURN
       END
 
 
          at this point,  the user  may conveniently use the subroutine
 Remark :
          LTOXSI and LTOX available in the LINTRA package.  The respec-
          tive calls have the form
 
            CALL LTOXSI(X,XSI,NVAR,1,NVAR)
 
            CALL LTOX(XSI,X,NVAR,NTEST)
 
          where X,   XSI,NVAR and NTEST are  defined as in  the example
          above.   LTOXSI  computes the  NVAR principal  components XSI
          from the original vector X.
          LTOX reconstructs the original vector  X from the NTEST first
          principal components XSI.
 
 3.4   EXAMPLE AND DECK SET-UP (CERN CDC 7600 SYSTEM)
       ______________________________________________
 
 
  JOB,T10.
  ACCOUNT,user,div,account.
  FTN,L,R.
  FIND,LLIB,LINTRALIB,ID=DE100.
  CERNLIB,LLIB,GENLIB.
  FILE,TAPE4,RT=Z,FL=80.
  FIND,TAPE1,DATASAMPLE,ID=USERID.
  FIND,TAPE2,TESTSAMPLE,ID=USERID.
  LGO.
  CATALOG,TAPE4,XTOXSI,ID=userid,ST=CCP,RP=999.
 
 End-of-record
 _____________
 
      PROGRAM MAIN(INPUT,OUTPUT,TAPE1,TAPE2,TAPE4,..)
 
      COMMON /LINT/ DUMMY(3000)
 
  C   THE COMMON LINT HAS A LENGTH SUFFICIENT FOR
  C   25-DIMENSIONAL VECTORS
 
      CALL LINTRA
 
  C   PRINT HISTOGRAMS
 
      CALL HISTDO
 
      END
      SUBROUTINE LINONE(X,NVAR,EOFDAT)
        .
        .
        .
      END
      SUBROUTINE LUSER(X,XSI,NVAR)
        .
        .
        .
      END
        .
        .
        .
 
 End-of-record
 _____________
 
  NVAR  25
  TEST   6
  DEBU   1   10
  RESI
  HIST   1=1   4=1   6=1   15=1
  NORM
  PUNC   4
  END
 
 End-of-file
 ___________
 
 
 
 3.5   EXAMPLE AND DECK SET-UP (CERN IBM SYSTEM)
       _________________________________________
 
 
 
  //uuunn JOB uuu$gg
  // EXEC NFORTCG,
  // LLB1='DL.PUB.LINTRA.LIB',
  // LLB2='CR.PUB.GENLIB'
  //C.SYSIN DD *
 
       COMMOM//B(5000)
       COMMON/LINT/DUMMY(3000)
       CALL HLIMIT(5000)
       CALL LINTRA
       CALL HISTDO
       END
       :
       :
       :
       END
 
  /*
  //G.SYSIN DD *
 
  LIST
  :
  :
  :
  END
 
  /*
  //G.FT01F001 DD DSN=gg.uuu.name1,DISP=(OLD)
  //G.FT02F001 DD DSN=gg.uuu.name2,DISP=(OLD)
  //G.FT04F001 DD DSN=gg.uuu.name3,
  // DISP=(NEW,CATLG,DELETE),
  // UNIT=SYSDA,
  // DCB=(RECFM=F,BLKSIZE=133,LRECL=133),
  // SPACE=(TRK,(1,1),RLSE)
 
 
 
                             4.  REFERENCES
 
 
 
 
 
 
    1.  Andrews H. C.
        Introduction to Mathematical techniques in pattern recognition,
        Wiley Interscience, 1972.
 
 
    2.  H. Grote, M. Hansroul, J. C. Lassalle, P. Zanella
        Identification of Digitized Particle Trajectories,
        CERN - DD/73/23.
 
 
    3.  M. Hansroul, D. Townsend, P. Zanella
        The application of Multi-dimensional techniques to the process-
        ing of event data from large spectrometers,
        CERN - DD/73/31.
 
 
    4.  H. Wind
        Function Parametrization, Proc.  of the 1972 CERN Computing and
        data processing school,  CERN 72-21.
 
 
    5.  HBOOK  :  HBOOK User Guide, CERN DD/EE/81-1, DD/US/72.
 
 
    6.  ZBOOK  :  ZBOOK User Guide, DD/US/73.
 
 
    7.  FFREAD :  FFREAD User Guide, CERN DD/EE/78-2, DD/US/71.
 
 
 
 
 Acknowledgements
 ________________
 
 
    We would like to express our thanks  to F.  G.  De Bilio to prepare
 this  document,  which  was  produced  with the  text-editing  program
 SCRIPT/SYSPUB.
 
 
                               Appendix A
 
                      PRINCIPAL COMPONENTS METHOD
 
 
      Let us consider a sample of M prototypes each being characterized
 by P variables X1,....,Xp.  Each prototype is a point or a column vec-
 tor in a P-dimensional pattern space
                        _____________
 
                      õ                       t
                      x = (x(1),x(2),...,x(P))
 
 
 where each x(n)   represents the particular value  associated with the
 n-th dimension and t means "transpose".
 
      Those P variables are the quantities accessible to the experimen-
 ter but are  not necessarily the most significant  for the classifica-
 tion purpose.
 
      The Principal  Components Method  consists in  applying a  linear
          ____________________________                           ______
 transformation to the original variables.   This transformation is de-
 scribed by an orthogonal matrix and is equivalent to a rotation of the
 original pattern  space to  a new set  of coordinates  vectors,  which
 hopefully provide easier feature identification and dimensionality re-
 duction.
 
      Let us define the covariance Matrix
 
                         õõt
                    C = <yy>  (1)
 
        õ   õ    õ
 where  y = x - <x>  and the brackets indicate mean value over the sam-
 ple of M prototypes.
 
      This matrix is real positive definite and symmetric and will have
 all its  eigenvalues greater than  zero.  It  will now be  shown that,
 among the  family of  all complete  orthonormal bases  in the  pattern
 space,  the basis formed by the  eigenvectors of the covariance matrix
 and belonging to the largest eigenvalues  corresponds to the most sig-
 nificant features for the description of the original prototypes.
 
      Let the prototypes be expanded into the set of N basis vectors
       õ
      (e(n), n=1,..,N,N+1,..,P)
 
 
           õ      N        õ
           y(i) = ÷  a(i,n)e (n)       i = 1,..,M
 (2)
                  n=1                  N < P
 
                                õ
 The "best" feature coordinates e(n) will be obtained by minimizing the
 error due to this truncated expansion, i.e.
 
 
                        + õ      N        õ     +”
     minimize   E(N) = <| y (i) -÷  a(i,n)e (n) | >
 (3)
                        +        n=1            +
 
                                      õ    õ
     with the orthonormal conditions  e(k).e(j) = 1   if  k=j
 (4)
                                                = 0   if  kØj
                        õt
     Multiplying (2) by e(n) and using (4), one gets
 
                 õt   õ
        a(i,n) = y(i).e(n)
 (5)
 
 
 The error becomes
 
 
           + P           õ    +      + P     õt   õ    õ    +
   E(N) = <| ÷    a(i,n) e(n) | > = <| ÷    (y(i).e(n))e(n) | >
           + n=N+1            +      + n=N+1                +
 
 
            P    õt  õ   õt  õ           p    õt   õ
        = < ÷    e(n)y(i)y(i)e(n) > =    ÷    e(n)Ce(n)
 (6)
            n=N+1                        n=N+1
 
                                                               õt   õ
 The  minimisation of  the  sum  in (6)   is  obtained  when each  term
 e(n)Ce(n) is minimum, as C is positive definite.  By the method of La-
 grange multipliers and the condition (4), one gets
 
 
               P     õt   õ          õt   õ
        E(N) = ÷    (e(n)Ce(n) - l(n)e(n).e(n) + l(n))
               n=N+1
 
                                 õ  t
 The minimum condition  dE(N) / de(n) = 0  leads to the equation
 
           õ          õ
          Ce(n) = l(n)e(n)
 (7)
 
                   õ
 which shows that e(n) is an eigenvector of the covariance matrix C and
 l(n) the corresponding eigenvalue. The estimated minimum error is then
 given by
 
               P     õt       õ       P
        E(N) = ÷     e(n).l(n)e(n) =  ÷    l(n)
 (8)
               n=N+1                  n=N+1
 
 
 where l(n),  n=N+1,...,P are the eigenvalues associated with the omit-
 ted eigenvectors in the expansion (2). Thus, by choosing the N largest
 eigenvalues and their associated eigenvectors, the error E(N) is mini-
 mized.
 
 The transformation matrix to go from the pattern space to the feature
                                            õ        õ
 space consists of the ordered eigenvectors e(1),...,e(p) for its rows
 
 
                  + õt      +
                  | e(1)    |
                  |         |
                  | õt      |
                  | e(2)    |
                  |         |
                  | .       |
                  |         |
             T =  | .       |
                  |         |
                  | .       |
                  |         |
                  | .       |
                  |         |
                  | õt      |
                  | e(p)    |
                  +         +
 
 This is an orthogonal transformation or  rotation of the pattern space
 and the feature  selection results in ignoring  certain coordinates in
 the transformed space.