- •Preface
- •Biological Vision Systems
- •Visual Representations from Paintings to Photographs
- •Computer Vision
- •The Limitations of Standard 2D Images
- •3D Imaging, Analysis and Applications
- •Book Objective and Content
- •Acknowledgements
- •Contents
- •Contributors
- •2.1 Introduction
- •Chapter Outline
- •2.2 An Overview of Passive 3D Imaging Systems
- •2.2.1 Multiple View Approaches
- •2.2.2 Single View Approaches
- •2.3 Camera Modeling
- •2.3.1 Homogeneous Coordinates
- •2.3.2 Perspective Projection Camera Model
- •2.3.2.1 Camera Modeling: The Coordinate Transformation
- •2.3.2.2 Camera Modeling: Perspective Projection
- •2.3.2.3 Camera Modeling: Image Sampling
- •2.3.2.4 Camera Modeling: Concatenating the Projective Mappings
- •2.3.3 Radial Distortion
- •2.4 Camera Calibration
- •2.4.1 Estimation of a Scene-to-Image Planar Homography
- •2.4.2 Basic Calibration
- •2.4.3 Refined Calibration
- •2.4.4 Calibration of a Stereo Rig
- •2.5 Two-View Geometry
- •2.5.1 Epipolar Geometry
- •2.5.2 Essential and Fundamental Matrices
- •2.5.3 The Fundamental Matrix for Pure Translation
- •2.5.4 Computation of the Fundamental Matrix
- •2.5.5 Two Views Separated by a Pure Rotation
- •2.5.6 Two Views of a Planar Scene
- •2.6 Rectification
- •2.6.1 Rectification with Calibration Information
- •2.6.2 Rectification Without Calibration Information
- •2.7 Finding Correspondences
- •2.7.1 Correlation-Based Methods
- •2.7.2 Feature-Based Methods
- •2.8 3D Reconstruction
- •2.8.1 Stereo
- •2.8.1.1 Dense Stereo Matching
- •2.8.1.2 Triangulation
- •2.8.2 Structure from Motion
- •2.9 Passive Multiple-View 3D Imaging Systems
- •2.9.1 Stereo Cameras
- •2.9.2 3D Modeling
- •2.9.3 Mobile Robot Localization and Mapping
- •2.10 Passive Versus Active 3D Imaging Systems
- •2.11 Concluding Remarks
- •2.12 Further Reading
- •2.13 Questions
- •2.14 Exercises
- •References
- •3.1 Introduction
- •3.1.1 Historical Context
- •3.1.2 Basic Measurement Principles
- •3.1.3 Active Triangulation-Based Methods
- •3.1.4 Chapter Outline
- •3.2 Spot Scanners
- •3.2.1 Spot Position Detection
- •3.3 Stripe Scanners
- •3.3.1 Camera Model
- •3.3.2 Sheet-of-Light Projector Model
- •3.3.3 Triangulation for Stripe Scanners
- •3.4 Area-Based Structured Light Systems
- •3.4.1 Gray Code Methods
- •3.4.1.1 Decoding of Binary Fringe-Based Codes
- •3.4.1.2 Advantage of the Gray Code
- •3.4.2 Phase Shift Methods
- •3.4.2.1 Removing the Phase Ambiguity
- •3.4.3 Triangulation for a Structured Light System
- •3.5 System Calibration
- •3.6 Measurement Uncertainty
- •3.6.1 Uncertainty Related to the Phase Shift Algorithm
- •3.6.2 Uncertainty Related to Intrinsic Parameters
- •3.6.3 Uncertainty Related to Extrinsic Parameters
- •3.6.4 Uncertainty as a Design Tool
- •3.7 Experimental Characterization of 3D Imaging Systems
- •3.7.1 Low-Level Characterization
- •3.7.2 System-Level Characterization
- •3.7.3 Characterization of Errors Caused by Surface Properties
- •3.7.4 Application-Based Characterization
- •3.8 Selected Advanced Topics
- •3.8.1 Thin Lens Equation
- •3.8.2 Depth of Field
- •3.8.3 Scheimpflug Condition
- •3.8.4 Speckle and Uncertainty
- •3.8.5 Laser Depth of Field
- •3.8.6 Lateral Resolution
- •3.9 Research Challenges
- •3.10 Concluding Remarks
- •3.11 Further Reading
- •3.12 Questions
- •3.13 Exercises
- •References
- •4.1 Introduction
- •Chapter Outline
- •4.2 Representation of 3D Data
- •4.2.1 Raw Data
- •4.2.1.1 Point Cloud
- •4.2.1.2 Structured Point Cloud
- •4.2.1.3 Depth Maps and Range Images
- •4.2.1.4 Needle map
- •4.2.1.5 Polygon Soup
- •4.2.2 Surface Representations
- •4.2.2.1 Triangular Mesh
- •4.2.2.2 Quadrilateral Mesh
- •4.2.2.3 Subdivision Surfaces
- •4.2.2.4 Morphable Model
- •4.2.2.5 Implicit Surface
- •4.2.2.6 Parametric Surface
- •4.2.2.7 Comparison of Surface Representations
- •4.2.3 Solid-Based Representations
- •4.2.3.1 Voxels
- •4.2.3.3 Binary Space Partitioning
- •4.2.3.4 Constructive Solid Geometry
- •4.2.3.5 Boundary Representations
- •4.2.4 Summary of Solid-Based Representations
- •4.3 Polygon Meshes
- •4.3.1 Mesh Storage
- •4.3.2 Mesh Data Structures
- •4.3.2.1 Halfedge Structure
- •4.4 Subdivision Surfaces
- •4.4.1 Doo-Sabin Scheme
- •4.4.2 Catmull-Clark Scheme
- •4.4.3 Loop Scheme
- •4.5 Local Differential Properties
- •4.5.1 Surface Normals
- •4.5.2 Differential Coordinates and the Mesh Laplacian
- •4.6 Compression and Levels of Detail
- •4.6.1 Mesh Simplification
- •4.6.1.1 Edge Collapse
- •4.6.1.2 Quadric Error Metric
- •4.6.2 QEM Simplification Summary
- •4.6.3 Surface Simplification Results
- •4.7 Visualization
- •4.8 Research Challenges
- •4.9 Concluding Remarks
- •4.10 Further Reading
- •4.11 Questions
- •4.12 Exercises
- •References
- •1.1 Introduction
- •Chapter Outline
- •1.2 A Historical Perspective on 3D Imaging
- •1.2.1 Image Formation and Image Capture
- •1.2.2 Binocular Perception of Depth
- •1.2.3 Stereoscopic Displays
- •1.3 The Development of Computer Vision
- •1.3.1 Further Reading in Computer Vision
- •1.4 Acquisition Techniques for 3D Imaging
- •1.4.1 Passive 3D Imaging
- •1.4.2 Active 3D Imaging
- •1.4.3 Passive Stereo Versus Active Stereo Imaging
- •1.5 Twelve Milestones in 3D Imaging and Shape Analysis
- •1.5.1 Active 3D Imaging: An Early Optical Triangulation System
- •1.5.2 Passive 3D Imaging: An Early Stereo System
- •1.5.3 Passive 3D Imaging: The Essential Matrix
- •1.5.4 Model Fitting: The RANSAC Approach to Feature Correspondence Analysis
- •1.5.5 Active 3D Imaging: Advances in Scanning Geometries
- •1.5.6 3D Registration: Rigid Transformation Estimation from 3D Correspondences
- •1.5.7 3D Registration: Iterative Closest Points
- •1.5.9 3D Local Shape Descriptors: Spin Images
- •1.5.10 Passive 3D Imaging: Flexible Camera Calibration
- •1.5.11 3D Shape Matching: Heat Kernel Signatures
- •1.6 Applications of 3D Imaging
- •1.7 Book Outline
- •1.7.1 Part I: 3D Imaging and Shape Representation
- •1.7.2 Part II: 3D Shape Analysis and Processing
- •1.7.3 Part III: 3D Imaging Applications
- •References
- •5.1 Introduction
- •5.1.1 Applications
- •5.1.2 Chapter Outline
- •5.2 Mathematical Background
- •5.2.1 Differential Geometry
- •5.2.2 Curvature of Two-Dimensional Surfaces
- •5.2.3 Discrete Differential Geometry
- •5.2.4 Diffusion Geometry
- •5.2.5 Discrete Diffusion Geometry
- •5.3 Feature Detectors
- •5.3.1 A Taxonomy
- •5.3.2 Harris 3D
- •5.3.3 Mesh DOG
- •5.3.4 Salient Features
- •5.3.5 Heat Kernel Features
- •5.3.6 Topological Features
- •5.3.7 Maximally Stable Components
- •5.3.8 Benchmarks
- •5.4 Feature Descriptors
- •5.4.1 A Taxonomy
- •5.4.2 Curvature-Based Descriptors (HK and SC)
- •5.4.3 Spin Images
- •5.4.4 Shape Context
- •5.4.5 Integral Volume Descriptor
- •5.4.6 Mesh Histogram of Gradients (HOG)
- •5.4.7 Heat Kernel Signature (HKS)
- •5.4.8 Scale-Invariant Heat Kernel Signature (SI-HKS)
- •5.4.9 Color Heat Kernel Signature (CHKS)
- •5.4.10 Volumetric Heat Kernel Signature (VHKS)
- •5.5 Research Challenges
- •5.6 Conclusions
- •5.7 Further Reading
- •5.8 Questions
- •5.9 Exercises
- •References
- •6.1 Introduction
- •Chapter Outline
- •6.2 Registration of Two Views
- •6.2.1 Problem Statement
- •6.2.2 The Iterative Closest Points (ICP) Algorithm
- •6.2.3 ICP Extensions
- •6.2.3.1 Techniques for Pre-alignment
- •Global Approaches
- •Local Approaches
- •6.2.3.2 Techniques for Improving Speed
- •Subsampling
- •Closest Point Computation
- •Distance Formulation
- •6.2.3.3 Techniques for Improving Accuracy
- •Outlier Rejection
- •Additional Information
- •Probabilistic Methods
- •6.3 Advanced Techniques
- •6.3.1 Registration of More than Two Views
- •Reducing Error Accumulation
- •Automating Registration
- •6.3.2 Registration in Cluttered Scenes
- •Point Signatures
- •Matching Methods
- •6.3.3 Deformable Registration
- •Methods Based on General Optimization Techniques
- •Probabilistic Methods
- •6.3.4 Machine Learning Techniques
- •Improving the Matching
- •Object Detection
- •6.4 Quantitative Performance Evaluation
- •6.5 Case Study 1: Pairwise Alignment with Outlier Rejection
- •6.6 Case Study 2: ICP with Levenberg-Marquardt
- •6.6.1 The LM-ICP Method
- •6.6.2 Computing the Derivatives
- •6.6.3 The Case of Quaternions
- •6.6.4 Summary of the LM-ICP Algorithm
- •6.6.5 Results and Discussion
- •6.7 Case Study 3: Deformable ICP with Levenberg-Marquardt
- •6.7.1 Surface Representation
- •6.7.2 Cost Function
- •Data Term: Global Surface Attraction
- •Data Term: Boundary Attraction
- •Penalty Term: Spatial Smoothness
- •Penalty Term: Temporal Smoothness
- •6.7.3 Minimization Procedure
- •6.7.4 Summary of the Algorithm
- •6.7.5 Experiments
- •6.8 Research Challenges
- •6.9 Concluding Remarks
- •6.10 Further Reading
- •6.11 Questions
- •6.12 Exercises
- •References
- •7.1 Introduction
- •7.1.1 Retrieval and Recognition Evaluation
- •7.1.2 Chapter Outline
- •7.2 Literature Review
- •7.3 3D Shape Retrieval Techniques
- •7.3.1 Depth-Buffer Descriptor
- •7.3.1.1 Computing the 2D Projections
- •7.3.1.2 Obtaining the Feature Vector
- •7.3.1.3 Evaluation
- •7.3.1.4 Complexity Analysis
- •7.3.2 Spin Images for Object Recognition
- •7.3.2.1 Matching
- •7.3.2.2 Evaluation
- •7.3.2.3 Complexity Analysis
- •7.3.3 Salient Spectral Geometric Features
- •7.3.3.1 Feature Points Detection
- •7.3.3.2 Local Descriptors
- •7.3.3.3 Shape Matching
- •7.3.3.4 Evaluation
- •7.3.3.5 Complexity Analysis
- •7.3.4 Heat Kernel Signatures
- •7.3.4.1 Evaluation
- •7.3.4.2 Complexity Analysis
- •7.4 Research Challenges
- •7.5 Concluding Remarks
- •7.6 Further Reading
- •7.7 Questions
- •7.8 Exercises
- •References
- •8.1 Introduction
- •Chapter Outline
- •8.2 3D Face Scan Representation and Visualization
- •8.3 3D Face Datasets
- •8.3.1 FRGC v2 3D Face Dataset
- •8.3.2 The Bosphorus Dataset
- •8.4 3D Face Recognition Evaluation
- •8.4.1 Face Verification
- •8.4.2 Face Identification
- •8.5 Processing Stages in 3D Face Recognition
- •8.5.1 Face Detection and Segmentation
- •8.5.2 Removal of Spikes
- •8.5.3 Filling of Holes and Missing Data
- •8.5.4 Removal of Noise
- •8.5.5 Fiducial Point Localization and Pose Correction
- •8.5.6 Spatial Resampling
- •8.5.7 Feature Extraction on Facial Surfaces
- •8.5.8 Classifiers for 3D Face Matching
- •8.6 ICP-Based 3D Face Recognition
- •8.6.1 ICP Outline
- •8.6.2 A Critical Discussion of ICP
- •8.6.3 A Typical ICP-Based 3D Face Recognition Implementation
- •8.6.4 ICP Variants and Other Surface Registration Approaches
- •8.7 PCA-Based 3D Face Recognition
- •8.7.1 PCA System Training
- •8.7.2 PCA Training Using Singular Value Decomposition
- •8.7.3 PCA Testing
- •8.7.4 PCA Performance
- •8.8 LDA-Based 3D Face Recognition
- •8.8.1 Two-Class LDA
- •8.8.2 LDA with More than Two Classes
- •8.8.3 LDA in High Dimensional 3D Face Spaces
- •8.8.4 LDA Performance
- •8.9 Normals and Curvature in 3D Face Recognition
- •8.9.1 Computing Curvature on a 3D Face Scan
- •8.10 Recent Techniques in 3D Face Recognition
- •8.10.1 3D Face Recognition Using Annotated Face Models (AFM)
- •8.10.2 Local Feature-Based 3D Face Recognition
- •8.10.2.1 Keypoint Detection and Local Feature Matching
- •8.10.2.2 Other Local Feature-Based Methods
- •8.10.3 Expression Modeling for Invariant 3D Face Recognition
- •8.10.3.1 Other Expression Modeling Approaches
- •8.11 Research Challenges
- •8.12 Concluding Remarks
- •8.13 Further Reading
- •8.14 Questions
- •8.15 Exercises
- •References
- •9.1 Introduction
- •Chapter Outline
- •9.2 DEM Generation from Stereoscopic Imagery
- •9.2.1 Stereoscopic DEM Generation: Literature Review
- •9.2.2 Accuracy Evaluation of DEMs
- •9.2.3 An Example of DEM Generation from SPOT-5 Imagery
- •9.3 DEM Generation from InSAR
- •9.3.1 Techniques for DEM Generation from InSAR
- •9.3.1.1 Basic Principle of InSAR in Elevation Measurement
- •9.3.1.2 Processing Stages of DEM Generation from InSAR
- •The Branch-Cut Method of Phase Unwrapping
- •The Least Squares (LS) Method of Phase Unwrapping
- •9.3.2 Accuracy Analysis of DEMs Generated from InSAR
- •9.3.3 Examples of DEM Generation from InSAR
- •9.4 DEM Generation from LIDAR
- •9.4.1 LIDAR Data Acquisition
- •9.4.2 Accuracy, Error Types and Countermeasures
- •9.4.3 LIDAR Interpolation
- •9.4.4 LIDAR Filtering
- •9.4.5 DTM from Statistical Properties of the Point Cloud
- •9.5 Research Challenges
- •9.6 Concluding Remarks
- •9.7 Further Reading
- •9.8 Questions
- •9.9 Exercises
- •References
- •10.1 Introduction
- •10.1.1 Allometric Modeling of Biomass
- •10.1.2 Chapter Outline
- •10.2 Aerial Photo Mensuration
- •10.2.1 Principles of Aerial Photogrammetry
- •10.2.1.1 Geometric Basis of Photogrammetric Measurement
- •10.2.1.2 Ground Control and Direct Georeferencing
- •10.2.2 Tree Height Measurement Using Forest Photogrammetry
- •10.2.2.2 Automated Methods in Forest Photogrammetry
- •10.3 Airborne Laser Scanning
- •10.3.1 Principles of Airborne Laser Scanning
- •10.3.1.1 Lidar-Based Measurement of Terrain and Canopy Surfaces
- •10.3.2 Individual Tree-Level Measurement Using Lidar
- •10.3.2.1 Automated Individual Tree Measurement Using Lidar
- •10.3.3 Area-Based Approach to Estimating Biomass with Lidar
- •10.4 Future Developments
- •10.5 Concluding Remarks
- •10.6 Further Reading
- •10.7 Questions
- •References
- •11.1 Introduction
- •Chapter Outline
- •11.2 Volumetric Data Acquisition
- •11.2.1 Computed Tomography
- •11.2.1.1 Characteristics of 3D CT Data
- •11.2.2 Positron Emission Tomography (PET)
- •11.2.2.1 Characteristics of 3D PET Data
- •Relaxation
- •11.2.3.1 Characteristics of the 3D MRI Data
- •Image Quality and Artifacts
- •11.2.4 Summary
- •11.3 Surface Extraction and Volumetric Visualization
- •11.3.1 Surface Extraction
- •Example: Curvatures and Geometric Tools
- •11.3.2 Volume Rendering
- •11.3.3 Summary
- •11.4 Volumetric Image Registration
- •11.4.1 A Hierarchy of Transformations
- •11.4.1.1 Rigid Body Transformation
- •11.4.1.2 Similarity Transformations and Anisotropic Scaling
- •11.4.1.3 Affine Transformations
- •11.4.1.4 Perspective Transformations
- •11.4.1.5 Non-rigid Transformations
- •11.4.2 Points and Features Used for the Registration
- •11.4.2.1 Landmark Features
- •11.4.2.2 Surface-Based Registration
- •11.4.2.3 Intensity-Based Registration
- •11.4.3 Registration Optimization
- •11.4.3.1 Estimation of Registration Errors
- •11.4.4 Summary
- •11.5 Segmentation
- •11.5.1 Semi-automatic Methods
- •11.5.1.1 Thresholding
- •11.5.1.2 Region Growing
- •11.5.1.3 Deformable Models
- •Snakes
- •Balloons
- •11.5.2 Fully Automatic Methods
- •11.5.2.1 Atlas-Based Segmentation
- •11.5.2.2 Statistical Shape Modeling and Analysis
- •11.5.3 Summary
- •11.6 Diffusion Imaging: An Illustration of a Full Pipeline
- •11.6.1 From Scalar Images to Tensors
- •11.6.2 From Tensor Image to Information
- •11.6.3 Summary
- •11.7 Applications
- •11.7.1 Diagnosis and Morphometry
- •11.7.2 Simulation and Training
- •11.7.3 Surgical Planning and Guidance
- •11.7.4 Summary
- •11.8 Concluding Remarks
- •11.9 Research Challenges
- •11.10 Further Reading
- •Data Acquisition
- •Surface Extraction
- •Volume Registration
- •Segmentation
- •Diffusion Imaging
- •Software
- •11.11 Questions
- •11.12 Exercises
- •References
- •Index
8 3D Face Recognition |
343 |
Fig. 8.8 A two-class classification problem in which we wish to reduce the data dimension to one. The standard PCA result is given as the black axis passing through the pooled data mean, and the LDA result is given by the green axis
8.8 LDA-Based 3D Face Recognition
One reason that PCA-based approaches have been popular is that they can operate with only one training example per subject. This is because it does not take account of the per-subject (within-class) distribution of the training data. However, because of this reason, the projection axes computed by PCA may make class discrimination difficult. Indeed, in the worst case for some surface feature type, it could be that the very dimensions that are discarded by PCA are those that provide good discrimination between classes. With the advent of more sophisticated datasets with several (preferably many) 3D scans per subject, more sophisticated subspaces can be used, which attempt to find the linear combination of features that best separates each subject (class). This is the aim of Linear Discriminant Analysis (LDA), while simultaneously performing dimension reduction. Thus, while PCA finds the most expressive linear combinations of surface feature map dimensions (in the simplest case, depth map pixels), LDA finds the most discriminative linear combinations.
Although 3D face recognition is an inherently multi-class classification problem in a high dimensional space, it is easier to initially look at LDA for a two-class problem in a two-dimensional space and compare it to PCA. Subsequently we will look at the issues involved with high dimensional feature vectors and we will also generalize to multi-class problems.
8.8.1 Two-Class LDA
Suppose that we have the two-class, 2D problem shown in Fig. 8.8. Intuitively we want to project the data onto a direction for which there is the largest separation of the class means, relative to the within-class scatter in that same projection direction.
344 |
A. Mian and N. Pears |
A scatter matrix is simply a scaled version of a covariance matrix, and for each set of training scans, Cc , belonging to class c {1, 2}, they are formed as:
Sc = X0Tc X0c , |
(8.26) |
where X0c is a zero-mean data matrix, as described in Sect. 8.7.1 (although it is now class-specific), and nc is the number of training scans in the set Cc . We note that these scatter matrices are often expressed as a sum of outer products:
Sc |
nc |
xc )(xi |
xc)T , |
xi |
|
Cc , |
(8.27) |
(xi |
|
||||||
|
= |
− ¯ |
− ¯ |
|
|
|
i=1
where x¯ c is the mean of the feature vectors in class Cc . Given that we have two classes, the within-class scatter matrix can be formed as:
SW = S1 + S2. |
(8.28) |
The between-class scatter is formed as the outer product of the difference between the two class means:
SB = (x¯ 1 − x¯2)(x¯ 1 − x¯ 2)T . |
(8.29) |
Fisher proposed to maximize the ratio of between class scatter to within class scatter relative to the projection direction [28], i.e. solve
J (w) = |
max |
wT SB w |
(8.30) |
|
|||
w wT SW w |
|
with respect to the 2 × 1 column vector w. This is known as Fisher’s criterion. A solution to this optimization can be found by differentiating Eq. (8.30) with respect to w and equating to zero. This gives:
S−1S w |
− |
J w |
= |
0. |
(8.31) |
W B |
|
|
|
We recognize this as a generalized eigenvalue-eigenvector problem, where the eigenvector of S−W1SB associated with its largest eigenvalue (J ) is our desired optimal direction, w . (In fact, the other eigenvalue will be zero because the between class scatter matrix, being a simple outer product, can only have rank 1.) Figure 8.8 shows the very different results of applying PCA to give the most expressive axis for the pooled data (both classes), and applying LDA which gives a near orthogonal axis, relative to that of PCA, for the best class separation in terms of Fisher’s criterion. Once data is projected into the new space, one can use various classifiers and distance metrics, as described earlier.
8.8.2 LDA with More than Two Classes
In order to extend the approach to a multiclass problem (K > 2 classes), we could train pairwise binary classifiers and classify a test 3D face according to the class that gets most votes. This has the advantage of finding the projection that
8 3D Face Recognition |
345 |
best separates each pair of distributions, but often results in a very large number of classifiers in 3D face recognition problems, due to the large number of classes (one per subject) often experienced. Alternatively one can train K one-versus-all classifiers where the binary classifiers are of the form subject X and not subject X. Although this results in fewer classifiers, the computed projections are usually less discriminative.
Rather than applying multiple binary classifiers to a multi-class problem, we can generalize the binary LDA case to multiple classes. This requires the assumption that the number of classes is less than or equal to the dimension of the feature space (i.e. K ≤ m where m is the dimension of our depth map space or surface feature map space). For high dimensional feature vectors, such as encountered in 3D face recognition, we also require a very large body of training data to prevent the scatter matrices from being singular. In general, the required number of 3D face scans is not available, but we address this point in the following subsection.
The multi-class LDA procedure is as follows: Firstly, we form the means of each class (i.e. each subject in the 3D face dataset). Using these means, we can compute the scatter matrix for each class using Eqs. (8.26) or (8.27). For the within-class scatter matrix, we simply sum the scatter matrices for each individual class:
K |
|
SW = Si . |
(8.32) |
i=1 |
|
This is an m × m matrix, where m is the dimension of the feature vector. We form the mean of all training faces, which is the weighted mean of the class means:
K
x¯ = 1 ni x¯ i , (8.33) n i=1
where ni is the number of training face scans in each class and n is the total number of training scans. The between-class scatter matrix is then formed as:
SB |
|
K |
x)(xi |
x)T . |
(8.34) |
= |
ni (xi |
||||
|
¯ |
− ¯ ¯ |
− ¯ |
|
i=1
This scatter matrix is also m × m. Rather than finding a single m-dimensional vector to project onto, we now seek a reduced dimension subspace in which to project our data, so that a feature vector in the new subspace is given as:
|
|
x˜ = WT x. |
|
(8.35) |
||
We formulate Fisher’s criterion as: |
|
|
|
|||
J (W) |
= |
max |
|WT SB W| |
, |
(8.36) |
|
|WT SW W| |
||||||
|
W |
|
|
where the vertical lines indicate that the determinant is to be computed. Given that the determinant is equivalent to the product of the eigenvalues, it is a measure of the square of the scattering volume. The projection matrix, W, maps the original
346 A. Mian and N. Pears
m-dimensional space to, at most, K − 1 dimensions and so its maximum size is m × (K − 1). This limit exists because SB is the sum of K outer products and matrices formed from outer products are rank 1 or less. Additionally, only K − 1 of these are independent due to their dependence on the overall mean x¯ of the pooled training data. As with the two class case, the optimal projection is found from the generalized eigenvalue-eigenvector problem:
S−1S W |
− |
J W |
= |
0, |
(8.37) |
W B |
|
|
|
and the optimal projection W is the one whose columns are the eigenvectors corresponding to the largest eigenvalues. At most, there will be K − 1 non-zero eigenvalues. In practise we select the subset of eigenvectors with eigenvalues above some small threshold, as those close to zero provide little useful discrimination. Note that we can only form the solution in this way if SW is non-singular and we discuss this issue in the following subsection.
8.8.3 LDA in High Dimensional 3D Face Spaces
When applying LDA to 3D face recognition, we have to address the problems associated with working in high dimensional spaces. In [45], for example, range maps of 60 pixels wide and 90 high are employed to give feature vectors of length m = 5400. Thus the scatter matrices, both SW and SB are extremely large (dimension m × m) and worse still, due to the typically small number (10–100) of training images per class, they are singular (non invertible). This is referred to as LDA’s small sample size problem. The more traditional approach to dealing with this is to initially apply PCA as a dimensional reduction stage, before LDA can proceed, as was done in Belhumeur et al.’s [8] seminal work in 2D face recognition and later in 3D face recognition work [39, 43, 45].
If the total number of pooled training vectors is n, then the rank of SW is at most k = n − K . Thus PCA can be applied to the data and the k eigenvectors with the largest eigenvalues are selected to give the m × k projection matrix Vk . This is then reduced to K − 1 dimensions using the LDA derived k × (K − 1) projection matrix, as described in the previous subsection.
Assuming that 3D faces can be represented as an m-dimensional vector, we can summarize this approach with the following training steps:
1.Determine the projection matrix of a PCA-derived subspace, of dimension n − K , as described in Sect. 8.7.
2.Project all of the 3D faces that constitute the training data into this subspace.
3.Form the within-class and between-class scatter matrices of the training data in this reduced dimension PCA-derived subspace, as described in Sect. 8.8.2.
4.Determine the smaller (maximum dimension K − 1) LDA-derived subspace of this PCA subspace by solving the generalized eigenvalue-eigenvector problem of the form of Eq. (8.37).
5.Project the training data from the PCA subspace into the LDA subspace.