A brief description of the various multivariate techniques named above (with special emphasis on factor analysis) is as under:
- Multiple regression: In multiple regression we form a linear composite of explanatory variables in such a way that it has maximum correlation with a criterion variable. This technique is appropriate when the researcher has a single, metric criterion variable which is supposed to be a function of other explanatory variables. The main objective in using this technique is to predict the variability of the dependent variable based on its covariance with all the independent variables. One can predict the level of the dependent phenomenon through a multiple regression analysis model, given the levels of the independent variables. Given a dependent variable, the linear multiple regression problem is to estimate constants B1, B2, ... Bk and A such that the expression Y = B1X1 + B2X2 + ... + BkXk + A provides a good estimate of an individual’s Y score based on his X scores.
In practice, Y and the several X variables are converted to standard scores zy, z1, z2, ... zk; each z has a mean of 0 and a standard deviation of 1. Then the problem is to estimate constants, bi, such that
z'y = b1z1 + b2z2 + ... + bkzk
where z'y stands for the predicted value of the standardized Y score, zy. The expression on the right side of the above equation is the linear combination of explanatory variables. The constant A is eliminated in the process of converting the X’s to z’s. The least-squares method is used to estimate the beta weights in such a way that the sum of the squared prediction errors is kept as small as possible, i.e., the expression Σ(zy – z'y)² is minimized. The predictive adequacy of a set of beta weights is indicated by the size of the correlation coefficient r(zy, z'y) between the predicted z'y scores and the actual zy scores. This special Pearson correlation coefficient is termed the multiple correlation coefficient (R). The squared multiple correlation, R², represents the proportion of criterion (zy) variance accounted for by the explanatory variables, i.e., the proportion of total variance that is ‘common variance’.
Sometimes the researcher may use step-wise regression techniques to have a better idea of the independent contribution of each explanatory variable. Under these techniques, the investigator adds the independent contribution of each explanatory variable into the prediction equation one by one, computing betas and R² at each step. Formal computerized techniques are available for the purpose and the same can be used in the context of a particular problem being studied by the researcher.
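The standardized regression described above can be illustrated with a short Python sketch. It is only a minimal illustration, assuming NumPy is available; the function names (`standardize`, `beta_weights`) and the data for eight individuals are hypothetical and are not taken from any study.

```python
import numpy as np

def standardize(a):
    """Convert each column to standard scores (mean 0, standard deviation 1)."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0)

def beta_weights(X, y):
    """Least-squares beta weights for standardized variables.

    Returns (betas, R, R_squared), where R is the correlation between
    the predicted and the actual standardized criterion scores.
    """
    zX = standardize(X)
    zy = standardize(y)
    # No constant term is fitted: it vanishes once every variable has mean 0.
    betas, *_ = np.linalg.lstsq(zX, zy, rcond=None)
    zy_pred = zX @ betas
    R = np.corrcoef(zy, zy_pred)[0, 1]
    return betas, R, R ** 2

# Hypothetical data: 8 individuals, 2 explanatory variables, 1 criterion.
X = np.array([[2.0, 1.0], [3.0, 5.0], [5.0, 2.0], [7.0, 8.0],
              [8.0, 3.0], [9.0, 9.0], [11.0, 4.0], [12.0, 10.0]])
y = np.array([5.0, 9.0, 9.0, 17.0, 14.0, 20.0, 18.0, 25.0])

betas, R, R2 = beta_weights(X, y)
print("beta weights:", betas)
print("R:", R, "R squared:", R2)
```

A step-wise variant would call `beta_weights` repeatedly, adding one column of X at a time and noting the change in R² at each step.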
- Multiple discriminant analysis: Through the discriminant analysis technique, the researcher may classify individuals or objects into one of two or more mutually exclusive and exhaustive groups on the basis of a set of independent variables. Discriminant analysis requires interval independent variables and a nominal dependent variable. For example, suppose that brand preference (say brand x or y) is the dependent variable of interest and its relationship to an individual’s income, age, education, etc. is being investigated; then we should use the technique of discriminant analysis. Regression analysis in such a situation is not suitable because the dependent variable is not intervally scaled. Thus discriminant analysis is considered an appropriate technique when the single dependent variable happens to be non-metric and is to be classified into two or more groups, depending upon its relationship with several independent variables which all happen to be metric. The objective in discriminant analysis is to predict an object’s likelihood of belonging to a particular group based on several independent variables. In case we classify the dependent variable into more than two groups, we use the name multiple discriminant analysis; but in case only two groups are to be formed, we simply use the term discriminant analysis.
We may briefly refer to the technical aspects relating to discriminant analysis.
- There happens to be a simple scoring system that assigns a score to each individual or object. This score is a weighted sum of the individual’s numerical values on the independent variables. On the basis of this score, the individual is assigned to the ‘most likely’ category. For example, an individual is 20 years old, has an annual income of Rs 12,000, and has 10 years of formal education. Let b1, b2, and b3 be the weights attached to the independent variables of age, income and education respectively. The individual’s score (z), assuming a linear score, would be:
z = b1 (20) + b2 (12000) + b3 (10)
This numerical value of z can then be transformed into the probability that the individual is an early user, a late user or a non-user of the newly marketed consumer product (here we are making three categories viz. early user, late user or a non-user).
- The numerical values and signs of the b’s indicate the importance of the independent variables in their ability to discriminate among the different classes of individuals. Thus, through the discriminant analysis, the researcher can as well determine which independent variables are most useful in predicting whether the respondent is to be put into one group or the other. In other words, discriminant analysis reveals which specific variables in the profile account for the largest proportion of inter-group differences.
- In case only two groups of the individuals are to be formed on the basis of several independent variables, we can then have a model like this
zi = b0 + b1X1i + b2X2i + ... + bnXni
where Xji = the ith individual’s value of the jth independent variable;
bj = the discriminant coefficient of the jth variable;
zi = the ith individual’s discriminant score;
zcrit. = the critical value for the discriminant score.
The classification procedure in such a case would be
If zi > zcrit., classify individual i as belonging to Group I;
If zi < zcrit., classify individual i as belonging to Group II.
When n (the number of independent variables) is equal to 2, we have a straight-line classification boundary. Every individual on one side of the line is classified as belonging to Group I and everyone on the other side is classified as belonging to Group II. When n = 3, the classification boundary is a two-dimensional plane in 3-space, and in general the classification boundary is an (n – 1)-dimensional hyperplane in n-space.
- In n-group discriminant analysis, a discriminant function is formed for each pair of groups. If there are 6 groups to be formed, we would have 6(6 – 1)/2 = 15 pairs of groups, and hence 15 discriminant functions. The b values for each function tell which variables are important for discriminating between particular pairs of groups. The z score for each discriminant function tells to which of the two groups the individual is more likely to belong. Then use is made of the transitivity of the relation “more likely than”. For example, if group II is more likely than group I and group III is more likely than group II, then group III is also more likely than group I. This way all necessary comparisons are made and the individual is assigned to the most likely of all the groups. Thus, multiple-group discriminant analysis is just like two-group discriminant analysis, for the multiple groups are simply examined two at a time.
- For judging the statistical significance of the difference between two groups, we work out the Mahalanobis statistic, D², which happens to be a generalized distance between two groups, where each group is characterized by the same set of n variables and where it is assumed that the variance-covariance structure is identical for both groups. It is worked out thus:
D² = (U1 – U2)' v⁻¹ (U1 – U2)
where U1 = the mean vector for group I
U2 = the mean vector for group II
v = the common variance-covariance matrix
and the prime (') denotes the transpose of the vector.
By a transformation procedure, this D² statistic becomes an F statistic which can be used to see if the two groups are statistically different from each other.
From all this, we can conclude that discriminant analysis provides a predictive equation, measures the relative importance of each variable and also provides a measure of the ability of the equation to predict actual class-groups (two or more) concerning the dependent variable.
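The two-group case can be illustrated with a small Python sketch. It is only one possible realization and rests on assumptions not stated in the text: the weights are obtained by Fisher's rule (b = v⁻¹(U1 – U2)), the constant b0 is taken as zero, and the critical value is placed midway between the two groups' mean scores (which presumes equal group sizes in the population and equal costs of misclassification). The function names and the data on age, income (in thousands of rupees) and years of education are hypothetical.

```python
import numpy as np

def two_group_discriminant(g1, g2):
    """Discriminant weights, critical score and Mahalanobis D^2 for two groups.

    g1, g2: arrays of shape (individuals, variables).
    """
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    u1, u2 = g1.mean(axis=0), g2.mean(axis=0)   # mean vectors U1 and U2
    n1, n2 = len(g1), len(g2)
    # Pooled (common) variance-covariance matrix v.
    v = ((n1 - 1) * np.cov(g1, rowvar=False)
         + (n2 - 1) * np.cov(g2, rowvar=False)) / (n1 + n2 - 2)
    diff = u1 - u2
    b = np.linalg.solve(v, diff)         # assumed rule: b = v^-1 (U1 - U2)
    d2 = float(diff @ b)                 # D^2 = (U1 - U2)' v^-1 (U1 - U2)
    z_crit = float(b @ (u1 + u2) / 2.0)  # assumed cut-off: midpoint of mean scores
    return b, z_crit, d2

def classify(x, b, z_crit):
    """Assign an individual to Group I or Group II from the score z = b'x."""
    z = float(np.dot(b, x))
    return "Group I" if z > z_crit else "Group II"

# Hypothetical data: columns are age, income (Rs '000) and years of education.
group_1 = np.array([[20, 12, 10], [25, 15, 12], [22, 14, 11],
                    [28, 18, 14], [24, 13, 12]], dtype=float)
group_2 = np.array([[40, 30, 8], [45, 28, 9], [38, 35, 7],
                    [50, 32, 10], [42, 27, 6]], dtype=float)

b, z_crit, d2 = two_group_discriminant(group_1, group_2)
print("weights b:", b)
print("critical score:", z_crit, "Mahalanobis D^2:", d2)
print(classify([20, 12, 10], b, z_crit))
```

The signs and relative sizes of the entries of `b` indicate which variables contribute most to separating the two groups, although they are comparable across variables only when the variables are measured on similar scales.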
- Multivariate analysis of variance: Multivariate analysis of variance is an extension of bivariate analysis of variance in which the ratio of among-groups variance to within-groups variance is calculated on a set of variables instead of a single variable. This technique is considered appropriate when several metric dependent variables are involved in a research study along with many non-metric explanatory variables. (But if the study has only one metric dependent variable and several non-metric explanatory variables, then we use the ANOVA technique as explained earlier in the book.) In other words, multivariate analysis of variance is especially applicable whenever the researcher wants to test hypotheses concerning multivariate differences in group responses to experimental manipulations. For instance, the market researcher may be interested in using one test market and one control market to examine the effect of an advertising campaign on sales as well as awareness, knowledge and attitudes. In that case he should use the technique of multivariate analysis of variance for meeting his objective.
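A rough numerical sketch of the among-groups versus within-groups comparison is given below. The text does not name a particular test statistic; Wilks' lambda (the determinant of the within-groups matrix divided by the determinant of the total matrix) is used here purely as an assumed choice, smaller values indicating greater separation of the groups. The function names and the test-market and control-market figures (sales and an awareness score) are hypothetical.

```python
import numpy as np

def sscp_matrices(groups):
    """Among-groups (B) and within-groups (W) sums of squares and cross-products."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    grand_mean = np.vstack(groups).mean(axis=0)
    p = groups[0].shape[1]
    B = np.zeros((p, p))
    W = np.zeros((p, p))
    for g in groups:
        m = g.mean(axis=0)
        d = (m - grand_mean).reshape(-1, 1)
        B += len(g) * (d @ d.T)   # spread of group means around the grand mean
        c = g - m
        W += c.T @ c              # spread of individuals around their group mean
    return B, W

def wilks_lambda(groups):
    """Wilks' lambda = det(W) / det(B + W); values near 0 suggest group differences."""
    B, W = sscp_matrices(groups)
    return float(np.linalg.det(W) / np.linalg.det(B + W))

# Hypothetical data: columns are sales and awareness score.
test_market = np.array([[52, 7], [55, 8], [60, 6], [58, 9], [54, 7]], dtype=float)
control_market = np.array([[45, 4], [48, 5], [50, 3], [47, 6], [46, 4]], dtype=float)

lam = wilks_lambda([test_market, control_market])
print("Wilks' lambda:", lam)
```

In practice the lambda value would be converted to an F or chi-square statistic before drawing a conclusion; that step is omitted from this sketch.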
- Canonical correlation analysis: This technique, first developed by Hotelling, makes an effort to simultaneously predict a set of criterion variables from their joint co-variance with a set of explanatory variables. Both metric and non-metric data can be used in the context of this multivariate technique. The procedure followed is to obtain a set of weights for the dependent and independent variables in such a way that the linear composite of the criterion variables has a maximum correlation with the linear composite of the explanatory variables. For example, if we want to relate grade school adjustment to the health and physical maturity of the child, we can use canonical correlation analysis, provided we have for each child a number of adjustment scores (such as tests, teacher’s ratings, parent’s ratings and so on) and also a number of health and physical maturity scores (such as heart rate, height, weight, index of intensity of illness and so on). The main objective of canonical correlation analysis is to discover factors separately in the two sets of variables such that the multiple correlation between the sets of factors will be the maximum possible. Mathematically, in canonical correlation analysis, the weights of the two sets, viz., a1, a2, ... ak and y1, y2, y3, ... yj, are so determined that the variables X = a1X1 + a2X2 + ... + akXk + a and Y = y1Y1 + y2Y2 + ... + yjYj + y have a maximum common variance. The process of finding the weights requires factor analyses with two matrices. The resulting canonical correlation solution then gives an overall description of the presence or absence of a relationship between the two sets of variables.
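The idea of choosing two sets of weights so that the two composites correlate as highly as possible can be sketched numerically as follows. The sketch does not follow the factor-analytic route mentioned above; it uses a standard numerical shortcut (QR decomposition of each centred data set followed by a singular value decomposition), which is an assumption of this illustration. The function name and the simulated scores for 30 children are hypothetical.

```python
import numpy as np

def first_canonical_correlation(X, Y):
    """Largest canonical correlation and the weights that produce it.

    X: (cases, k) explanatory scores; Y: (cases, j) criterion scores.
    Returns (rho, a, y_w) so that X @ a and Y @ y_w correlate at rho.
    """
    Xc = np.asarray(X, dtype=float)
    Yc = np.asarray(Y, dtype=float)
    Xc = Xc - Xc.mean(axis=0)            # centring removes the constants a and y
    Yc = Yc - Yc.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)            # orthonormal basis for each set
    Qy, Ry = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)  # singular values = canonical correlations
    a = np.linalg.solve(Rx, U[:, 0])     # weights for the X set
    y_w = np.linalg.solve(Ry, Vt[0, :])  # weights for the Y set
    return float(s[0]), a, y_w

# Hypothetical data: 3 health/maturity scores and 2 adjustment scores that
# share one underlying influence plus random noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=30)
X = np.column_stack([latent + rng.normal(scale=0.5, size=30) for _ in range(3)])
Y = np.column_stack([latent + rng.normal(scale=0.8, size=30) for _ in range(2)])

rho, a, y_w = first_canonical_correlation(X, Y)
print("first canonical correlation:", rho)
print("weights for X set:", a)
print("weights for Y set:", y_w)
```

Further pairs of composites, each uncorrelated with the earlier ones, correspond to the remaining singular values in `s`.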
Factor analysis: Factor analysis is by far the most often used multivariate technique in research studies, especially those pertaining to the social and behavioural sciences. It is a technique applicable when there is a systematic interdependence among a set of observed or manifest variables and the researcher is interested in finding out something more fundamental or latent which creates this commonality. For instance, we might have data, say, about an individual’s income, education, occupation and dwelling area and want to infer from these some factor (such as social class) which summarises the commonality of all the said four variables. The technique used for such a purpose is generally described as factor analysis. Factor analysis, thus, seeks to resolve a large set of measured variables in terms of relatively few categories, known as factors. This technique allows the researcher to group variables into factors (based on the correlation between variables), and the factors so derived may be treated as new variables (often termed latent variables) whose values are derived by summing the values of the original variables which have been grouped into the factor. The meaning and name of such a new variable is subjectively determined by the researcher. Since the factors happen to be linear combinations of data, the coordinates of each observation or variable are measured to obtain what are called factor loadings. Such factor loadings represent the correlation between the particular variable and the factor, and are usually placed in a matrix of correlations between the variables and the factors.
The mathematical basis of factor analysis concerns a data matrix (also termed a score matrix), symbolized as S. The matrix contains the scores of N persons on k measures. Thus a1 is the score of person 1 on measure a, a2 is the score of person 2 on measure a, and kN is the score of person N on measure k. The score matrix then takes the following form:
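SCORE MATRIX (or Matrix S)

| Persons (objects) | Measure a | Measure b | Measure c | ... | Measure k |
|---|---|---|---|---|---|
| 1 | a1 | b1 | c1 | ... | k1 |
| 2 | a2 | b2 | c2 | ... | k2 |
| 3 | a3 | b3 | c3 | ... | k3 |
| ... | ... | ... | ... | ... | ... |
| N | aN | bN | cN | ... | kN |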
It is assumed that scores on each measure are standardized [i.e., xi = (Xi – X̄i)/si, where X̄i and si are the mean and standard deviation of measure i]. This being so, the sum of scores in any column of the matrix, S, is zero and the variance of scores in any column is 1.0. Then factors (a factor is any linear combination of the variables in a data matrix and can be stated in a general way like: A = Waa + Wbb + … + Wkk) are obtained (by any method of factoring). After this, we work out factor loadings (i.e., factor-variable correlations). Then communality, symbolized as h², the eigenvalue and the total sum of squares are obtained and the results interpreted. For realistic results, we resort to the technique of rotation, because such rotations reveal different structures in the data. Finally, factor scores are obtained which help in explaining what the factors mean. They also facilitate comparison among groups of items as groups. With factor scores, one can also perform several other multivariate analyses such as multiple regression, cluster analysis, multiple discriminant analysis, etc.
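The sequence of steps just described (standardizing the score matrix, extracting factors, and working out loadings, communalities, eigenvalues and factor scores) can be sketched in Python as under. Since the text leaves the method of factoring open, the sketch assumes the principal-components method applied to the correlation matrix and leaves out rotation altogether. The function name and the scores of eight persons on four measures are hypothetical.

```python
import numpy as np

def principal_factors(S, n_factors):
    """Unrotated principal-components factoring of a score matrix S.

    S: (persons, measures). Returns the standardized scores Z, the factor
    loadings, the communalities h^2, the eigenvalues and the factor scores.
    """
    S = np.asarray(S, dtype=float)
    Z = (S - S.mean(axis=0)) / S.std(axis=0)   # each column: mean 0, variance 1
    R = (Z.T @ Z) / len(Z)                     # correlation matrix of the measures
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals = eigvals[order][:n_factors]
    eigvecs = eigvecs[:, order][:, :n_factors]
    loadings = eigvecs * np.sqrt(eigvals)      # factor-variable correlations
    communalities = (loadings ** 2).sum(axis=1)  # h^2: row sums of squared loadings
    scores = (Z @ eigvecs) / np.sqrt(eigvals)    # standardized factor scores
    return Z, loadings, communalities, eigvals, scores

# Hypothetical scores of 8 persons on 4 measures
# (say income, education, occupation and dwelling area ratings).
S = np.array([[12, 10, 3, 2], [20, 14, 5, 4], [15, 12, 4, 2], [30, 16, 7, 5],
              [25, 15, 6, 3], [10, 8, 2, 1], [18, 13, 4, 4], [28, 17, 8, 5]],
             dtype=float)

Z, loadings, h2, eigvals, scores = principal_factors(S, n_factors=2)
print("loadings:\n", loadings)
print("communalities (h^2):", h2)
print("eigenvalues:", eigvals)
```

Each eigenvalue equals the sum of the squared loadings in the corresponding column, and dividing it by the number of measures gives the proportion of total variance accounted for by that factor.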