Keywords applicable to this article: multivariate, research, statistical modeling, structural framework, causal relationships, model reliability, model validity, exploratory factor analysis, confirmatory factor analysis, structural equation modeling

A research problem may be univariate, bivariate, or multivariate. A univariate problem is concerned with a single research variable, whereas a bivariate problem is concerned with the relationship (typically linear) between two research variables. A univariate study may cover several independent research variables without examining their quantitative mutual relationships. For example, a single study may examine attitude, organizational commitment, and employee performance separately in a fast food chain without quantifying how they relate to one another. Some researchers design triangulation studies that collect numerical data about the three variables but establish their interrelationships qualitatively.

On the other hand, bivariate research problems study the relationship between two variables by establishing a null and an alternate hypothesis. Most bivariate research problems examine pairwise relationships through multiple independent hypotheses; the hypotheses need not be interrelated in the form of a structure or theoretical framework. They may be tested using bivariate techniques such as correlation analysis, regression analysis, analysis of variance, Student's t-test, or the chi-square test, with significance judged from the p-value. The outcome may be interpreted as a causal relationship (influence of an independent variable on a dependent variable) or simply as a reflection of how one parameter varies with respect to another within a controlled research setting. Establishing a relationship between two variables does not by itself guarantee that the relationship is causal. Cause-effect relationships can be established by drawing on established theories or by investigating additional variables that influence the two variables. This is where multivariate problems come into the picture.
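As a rough illustration of one bivariate technique mentioned above, the following sketch computes a Pearson correlation coefficient and the t statistic used to test the null hypothesis of no linear relationship. The function names and the two score lists are hypothetical and were made up for this example; they do not come from any real study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0: no linear relationship (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical scores for six employees (illustrative values only).
commitment = [2.0, 3.0, 3.5, 4.0, 4.5, 5.0]
performance = [1.5, 2.5, 3.0, 4.5, 4.0, 5.0]

r = pearson_r(commitment, performance)
t = t_statistic(r, len(commitment))
print(f"r = {r:.3f}, t = {t:.2f}")
```

The t statistic would then be compared against the t distribution with n - 2 degrees of freedom to obtain a p-value. Even a strong correlation from such a test says nothing about causality on its own.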

Multivariate problems are different and more complex, requiring sophisticated techniques for investigating relationships among multiple variables. Most multivariate problems require investigation of complex structures rather than mere relationships. Hence, applying statistics to multivariate problems is not only about statistical calculations but also involves complex statistical modeling. A model may take the form of a theoretical framework or an initial measurement model. Before the multivariate techniques are discussed, it is important to differentiate between the two.

A theoretical framework is formed by conducting an intensive literature review and creating a structure whose relationships are grounded in theory. An initial measurement model, on the other hand, can be established from the data using the principal component analysis technique with orthogonal factor rotation.

Technically, the models created by either approach are treated as initial models and are taken through the same reliability, validity, and model fit tests. However, research studies that form the initial model from theory (commonly referred to as the theoretical framework) are confirmatory or extended studies, whereas research studies that form it through the principal component analysis technique are exploratory studies. In practice, a theory-based modeling approach should be chosen if the model can be grounded on an extensive and deep theoretical foundation, whereas the principal component analysis technique should be chosen if the model is not sufficiently supported by theories.

Multivariate problems have two flavours: relationships among multiple observable (measurable) variables, or relationships between one or more groups of observable variables and a group of latent (unobservable, or immeasurable) variables. The latter is used in highly complex research studies. The sequence of techniques used in multivariate statistical modeling is exploratory factor analysis, confirmatory factor analysis, and structural equation modeling.

The exploratory factor analysis technique may be skipped if theory-based initial modeling has been preferred. In exploratory factor analysis, the number of latent (unobserved) variables represented by a set of observed variables is explored by obtaining a rotated factor solution. VARIMAX, QUARTIMAX, and EQUAMAX are orthogonal rotation methods, whereas PROMAX and DIRECT OBLIMIN are oblique methods that allow the factors to correlate. The most used orthogonal rotation method is VARIMAX. The number of latent variables is determined by the number of extracted factors having an eigenvalue above unity. The researcher may predetermine the number of latent variables or simply proceed with the factors having eigenvalues above unity. The number of latent variables retained should not exceed the number of factors having eigenvalues above unity. The eigenvalues can be inspected on a scree plot.
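The eigenvalue-above-unity rule described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the SPSS procedure itself: the function name `kaiser_factor_count` and the simulated survey data (six items generated from two independent factors) are assumptions made purely for the example.

```python
import numpy as np

def kaiser_factor_count(data):
    """Return (number of eigenvalues above 1, eigenvalues in descending order).

    data: 2-D array with one row per respondent and one column per observed variable.
    """
    corr = np.corrcoef(data, rowvar=False)         # item-by-item correlation matrix
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]   # eigvalsh returns ascending order
    return int(np.sum(eigenvalues > 1.0)), eigenvalues

# Hypothetical survey: 500 respondents, six items driven by two independent factors.
rng = np.random.default_rng(0)
n_respondents = 500
factor_a = rng.normal(size=(n_respondents, 1))
factor_b = rng.normal(size=(n_respondents, 1))
signal = np.hstack([factor_a, factor_a, factor_a, factor_b, factor_b, factor_b])
responses = signal + rng.normal(scale=0.5, size=(n_respondents, 6))

n_factors, eigenvalues = kaiser_factor_count(responses)
print(n_factors, np.round(eigenvalues, 2))
```

Plotting `eigenvalues` against their rank gives the scree plot; the eigenvalues of a correlation matrix always sum to the number of observed variables, which is why unity is the natural cut-off.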

The rotated factor table obtained after rotation is of prime importance. It gives the loading of each observed variable on each latent variable. Normally, variables with significant loadings are retained and the rest rejected. The significance of a loading is judged from its value (normally 0.5 or greater) or from the importance of the observed variable in the reliability test. The researcher may name each latent variable by analyzing the group of observed variables loading on it, or with the help of the literature. Each group forms a scale representing the corresponding latent variable.

The researcher may test the reliability of each scale using the Cronbach's alpha, split-half, Guttman, parallel, or strict parallel techniques. In the Cronbach's alpha test, an alpha value of 0.6 or greater is considered a good reliability indicator for a scale if the research involves responses from human subjects (for example, phenomenology and grounded theory studies). However, researchers prefer a higher alpha value in scientific and technology-based research studies in which the primary data is collected from experiments or simulations. An observed variable with a high loading on the latent variable is normally a good contributor to the Cronbach's alpha value. However, an observed variable with a low loading (below 0.5) may sometimes turn out to be a better contributor. The contribution of each observed variable to the Cronbach's alpha value of the scale can be determined from a table called "scale if item deleted". In some research studies, the researcher may decide to conclude the research if very high reliability values of the scales are achieved. However, it is not guaranteed that scales comprising the highest-loading observed variables are the causal factors influencing the latent variables. It is recommended that a few validity tests are also conducted. This is where the confirmatory factor analysis technique is useful.
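The reliability check described above can be sketched as follows. The code uses the standard formula for Cronbach's alpha and recomputes it with each item removed in turn, which is the idea behind the "scale if item deleted" table. The function names and the four-item, six-respondent data set are hypothetical; the fourth item is deliberately constructed to fit poorly with the others.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of score lists, one list per item, all of equal length
    (one score per respondent). Requires at least two items.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]    # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))

def alpha_if_item_deleted(items):
    """Alpha of the scale recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Hypothetical 5-point responses from six respondents (illustrative only).
q1 = [4, 5, 3, 4, 2, 5]
q2 = [4, 4, 3, 5, 2, 5]
q3 = [3, 5, 3, 4, 1, 4]
q4 = [2, 1, 5, 3, 4, 2]   # constructed to disagree with the other three items
scale = [q1, q2, q3, q4]

overall = cronbach_alpha(scale)
deleted = alpha_if_item_deleted(scale)
print(round(overall, 3), [round(a, 3) for a in deleted])
```

An item whose removal raises alpha noticeably is a candidate for deletion from the scale, subject to the theoretical considerations the researcher wishes to preserve.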

The confirmatory factor analysis technique helps in running validity tests on the model, whether it was determined through the theory-based approach or through exploratory factor analysis. It involves computation of the Average Variance Extracted (AVE), Cronbach's alpha, degrees of freedom, Root Mean Square Error of Approximation (RMSEA), Root Mean Square Residual (RMR), and Standardized Root Mean Square Residual (SRMR) values.
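Two of the quantities listed above are simple enough to compute by hand once the fitted model is available. The sketch below uses commonly cited definitions (AVE as the mean squared standardized loading of a scale, and RMSEA computed from the model chi-square, its degrees of freedom, and the sample size); these formulas are assumptions brought in for illustration, and software such as LISREL may use slightly different variants. The loadings and chi-square values are invented.

```python
import math

def average_variance_extracted(standardized_loadings):
    """AVE of one scale: mean of the squared standardized loadings."""
    return sum(l * l for l in standardized_loadings) / len(standardized_loadings)

def rmsea(chi_square, df, sample_size):
    """Root Mean Square Error of Approximation (one common formulation)."""
    if df <= 0 or sample_size <= 1:
        raise ValueError("df must be positive and sample_size greater than 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (sample_size - 1)))

# Hypothetical values for illustration only.
ave = average_variance_extracted([0.7, 0.8, 0.9])
fit = rmsea(chi_square=100.0, df=50, sample_size=201)
print(round(ave, 3), round(fit, 4))
```

Whatever thresholds are adopted for these values should be fixed from the literature before the model is validated, not chosen afterwards.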

Various research scholars recommend thresholds for determining the validity of the model, depending on the research area and the sample size.

The thresholds should be decided carefully before validating the model. If the objective is simply to validate the initial model, the researcher may conclude the research at this stage. However, there can be situations in which the initial model returns unreliable scales and invalid relationships. This is unlikely if the initial model has been constructed with utmost care, but the researcher should be ready for surprises and should not panic, because the Structural Equation Modeling technique can rescue the research from a probable failure.

Structural Equation Modeling helps in finding an alternate model with acceptable reliability and validity scores if the initial model has failed due to unavoidable and irreparable issues. The technique allows the researcher to test multiple models by varying the relationships among variables and finally to choose the best-fit model. The test statistics that help in choosing the best-fit model are the goodness-of-fit index, adjusted goodness-of-fit index, normed fit index, non-normed fit index, comparative fit index, parsimony fit index, and incremental fit index. Not all of these are suitable for every research study. The researcher should choose the most appropriate ones depending on the area of research and the sample size. It is recommended to consult the literature when choosing the most appropriate fit indices in structural equation modeling.
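To show how candidate models might be compared, here is a minimal sketch of three of the indices named above, computed from the chi-square of a candidate model and of the baseline (independence) model. The formulas are the widely used textbook definitions and are an assumption here; the chi-square values are invented and the function names are hypothetical.

```python
def normed_fit_index(chi_model, chi_baseline):
    """NFI: proportional reduction in chi-square relative to the baseline model."""
    return (chi_baseline - chi_model) / chi_baseline

def non_normed_fit_index(chi_model, df_model, chi_baseline, df_baseline):
    """NNFI (also called TLI): like NFI but adjusted for degrees of freedom."""
    ratio_baseline = chi_baseline / df_baseline
    ratio_model = chi_model / df_model
    return (ratio_baseline - ratio_model) / (ratio_baseline - 1.0)

def comparative_fit_index(chi_model, df_model, chi_baseline, df_baseline):
    """CFI: based on the non-centrality (chi-square minus df) of each model."""
    d_model = max(chi_model - df_model, 0.0)
    d_baseline = max(chi_baseline - df_baseline, d_model, 0.0)
    if d_baseline == 0.0:
        return 1.0          # neither model shows any misfit
    return 1.0 - d_model / d_baseline

# Hypothetical chi-square results for one candidate model (illustrative only).
nfi = normed_fit_index(60.0, 800.0)
nnfi = non_normed_fit_index(60.0, 40, 800.0, 55)
cfi = comparative_fit_index(60.0, 40, 800.0, 55)
print(round(nfi, 3), round(nnfi, 3), round(cfi, 3))
```

Running the same functions on each alternate model gives a like-for-like basis for choosing among them, alongside the other indices reported by the modeling software.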

The recommended tool for applying the exploratory factor analysis technique is SPSS, and the tool recommended for confirmatory factor analysis and structural equation modeling is LISREL. If you need any help in designing a research study, collecting data, applying techniques for data analysis, and deriving meaningful conclusions and recommendations in multivariate research involving exploratory factor analysis, confirmatory factor analysis, and structural equation modeling, please contact us at consulting@etcoindia.co and consulting@etcoindia.net. We recommend using Survey Monkey for collecting data and the latest academic versions of SPSS and LISREL for applying these techniques. The academic version of LISREL cannot be used if the number of variables is greater than 15. However, in most cases the number of variables can be reduced to 15 or fewer if the Principal Component Analysis technique has been used and reliable scales have been constructed by testing their Cronbach's alpha values. This is another advantage of starting the research with exploratory factor analysis rather than a theory-based structural framework. In some research studies, it may not be possible to keep the number of variables at or below 15. In such cases, it is recommended that a professional copy of LISREL is purchased.

Ideally, the number of variables should be kept as low as possible, especially if the sample size is small (say, less than 100). The higher the number of variables, the greater the difficulty in determining the best-fit model using Structural Equation Modeling. Most modern causal research problems require the application of multivariate techniques, and hence it is recommended to master SPSS and LISREL in this context.

We can support multivariate research studies in all the research areas mentioned on the page detailing our Subject areas of specialization.

The factors and latent variables may be chosen as per the problem description. Typically, latent variables are the ones that cannot be measured directly. Examples are: human attitude, human feelings, commitment to the organisation, willingness to work in a particular field, and behavioural aspects in groups or teams. However, variables for which data is unavailable because of a lack of systems and processes can also be chosen as latent variables. The factors influencing the chosen latent variables under study may be taken from past research studies, journal articles, professional studies, industrial reports, press releases, and expert advice. The structure of the theoretical framework may be designed by applying the exploratory factor analysis technique, or on the basis of literature reviews providing adequate information on structural models involving the factors (observed variables) and the latent variables under study.

Some of the examples of multivariate problems are the following:

(a) Influence of organisational citizenship behaviour, organisational commitment, behavioural aspects with peers and superiors, and willingness to participate on effectiveness of information security governance in an organisation

(b) Influence of organisational citizenship behaviour, organisational commitment, behavioural aspects with peers and superiors, and willingness to participate on project performance

(c) Influence of multiple personality types on effectiveness of crisis management decision-making and change management

In the above examples, the influencing variables are unobservable and hence need to be treated as latent variables. In order to measure them, the factors affecting them need to be taken from the literature. The models will comprise relationships of the following generalised form:

Factor groups ---> Latent variables ---> Output variables

The factor groups representing each latent variable are the scales with high reliability (Cronbach's alpha value of 0.6 or more). The scales can be obtained from exploratory factor analysis (principal component analysis with a rotated solution) or from literature-supported groups. The number of latent variables loaded by the factor variables depends upon the number of eigenvalues greater than unity in the set. After rotation (for example, VARIMAX with Kaiser normalization), the factor variables regroup under the latent variables with varying levels of loading. The significant loadings (for example, 0.5 or above) are accepted and the rest are rejected. This results in a reduced scale per latent variable, which can be tested using Cronbach's alpha or split-half testing. The scale may reduce further if deleting a factor improves the value of Cronbach's alpha, provided a negative error covariance does not crop up. The researcher should also try to retain the factors strongly supported by theories, even at the cost of a lower reliability level (Cronbach's alpha value) for the scale. The rest of the analysis can be completed through confirmatory factor analysis and structural equation modeling.
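The rotate-then-select step described above can be sketched with NumPy. The `varimax` function below is a widely circulated iterative implementation of the VARIMAX criterion; for brevity it omits Kaiser normalization, so its output will not match SPSS exactly. The function names, the loading matrix, and the 0.5 cut-off passed to `items_per_factor` are illustrative assumptions.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a loading matrix toward the VARIMAX criterion.

    Simplified sketch: no Kaiser normalization is applied.
    """
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.sum(rotated ** 2, axis=0)   # sum of squared loadings per factor
        gradient = loadings.T @ (rotated ** 3 - rotated * column_ss / n_items)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                          # nearest orthogonal matrix
        current = np.sum(s)
        if previous != 0.0 and current / previous < 1.0 + tol:
            break
        previous = current
    return loadings @ rotation

def items_per_factor(rotated, threshold=0.5):
    """Map each factor index to the items whose absolute loading meets the threshold."""
    return {
        j: [i for i in range(rotated.shape[0]) if abs(rotated[i, j]) >= threshold]
        for j in range(rotated.shape[1])
    }

# Hypothetical unrotated loadings: a two-factor structure mixed by a 30-degree turn.
simple = np.array([[0.80, 0.10], [0.70, 0.00], [0.75, 0.10],
                   [0.10, 0.80], [0.00, 0.70], [0.10, 0.75]])
angle = np.deg2rad(30.0)
mix = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ mix

rotated = varimax(unrotated)
scales = items_per_factor(rotated, threshold=0.5)
print(np.round(rotated, 2))
print(scales)
```

Because the rotation is orthogonal, each item's communality (the row sum of squared loadings) is unchanged; only the way it is distributed across the latent variables changes. The item groups returned by `items_per_factor` are the candidate scales to carry into reliability testing.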

Please contact us at consulting@etcoindia.co or consulting@etcoindia.net to discuss your topic or to get ideas about new topics pertaining to your subject area.
