Friday, July 22, 2022

Review of Statistical Analysis Concepts

 

 


We know of testing as a form of evaluation, assessment, or measurement.  As we research the subject further, we find that there are many types of tests, or methods of measurement, that can be used. 

Of paramount concern for an investigator is deciding which method will provide reliable and valid data.  Several methods are available, depending upon the research question to be answered.

Whichever test is chosen for behavior assessment (i.e., data collection), it must serve a specific purpose and accurately differentiate between variables such as intelligence, affection, emotion, and activity.  Besides being useful in placement and selection, tests help the investigator determine the outcome of the experiment and contribute input for evaluating the program outcome.

Among the questions to ask when deciding which type of test to administer are “Is the assessed behavior physical, social, or cognitive?”, “Will the behavior be observed by trained data collectors, self-reported through methods such as questionnaires, or gathered through interviews?”, “Is there a time limit?”, “Is it subjective or objective?”, “Individual or group?”, and “Will the questions be multiple choice or open-ended?”

Types of tests that can be used include achievement tests, standardized tests, researcher- or teacher-made tests, norm-referenced tests, and criterion-referenced tests.

Multiple-choice tests can be used to assess most content areas.  They are easy for participants to take and easy for researchers to score, and a well-written test will provide accurate results.  They are ill-advised when a broad range of content must be assessed or when participants’ writing skills are part of the evaluation.  Perhaps the greatest advantage of a multiple-choice test is that it clearly differentiates between those who know the material and those who do not.  Item analysis is used to determine each item’s difficulty level and discrimination level.  The difficulty index is the proportion of examinees who answered the item correctly; in item analysis it is also referred to as the item’s p-value, which should not be confused with the p-value of a significance test.  The discrimination index refers to how well an item differentiates between high and low scorers.
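The two item-analysis indices can be sketched in a few lines of Python.  This is a minimal illustration, not a standard implementation: the response data are made up, the function names are hypothetical, and the discrimination index is computed here as the difference in proportion correct between the top and bottom halves of examinees ranked by total score (other splits, such as the top and bottom 27%, are also used in practice).

```python
# Hypothetical scored responses: one row per examinee, one column per item
# (1 = correct, 0 = incorrect). The data are invented for illustration.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]


def difficulty_index(rows, item):
    """Proportion of examinees who answered the item correctly."""
    return sum(row[item] for row in rows) / len(rows)


def discrimination_index(rows, item, fraction=0.5):
    """Proportion correct among high scorers minus that among low scorers.

    Assumption: examinees are ranked by total score and split into a top
    and a bottom group, each containing `fraction` of the examinees.
    """
    ranked = sorted(rows, key=sum, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    high, low = ranked[:k], ranked[-k:]
    return difficulty_index(high, item) - difficulty_index(low, item)


for item in range(len(responses[0])):
    p = difficulty_index(responses, item)
    d = discrimination_index(responses, item)
    print(f"item {item}: difficulty = {p:.3f}, discrimination = {d:+.2f}")
```

In this made-up data set the first item is easy (difficulty 0.875) and discriminates weakly, while the last item has a negative discrimination index, meaning low scorers answered it correctly more often than high scorers, which usually signals an item worth revising.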

While achievement tests are likely to be used most often, attitude tests are used to assess a person’s opinion regarding an object, individual, or event.  Two popular attitude scales are the Thurstone scale and the Likert scale.  The Thurstone scale, or method of equal-appearing intervals, consists of items in the form of statements with which the respondent either agrees or disagrees.  The Likert scale, also known as the method of summated ratings, asks respondents to indicate the extent of their agreement with each item on, say, a five- or seven-point scale; a respondent’s score on the scale is the sum of the scores for each item.  The Likert scale is easy to develop and is considered the most widely used.
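Summated-rating scoring is simple enough to show directly.  The sketch below assumes a five-point scale coded 1 to 5; the function name and the optional reverse-scoring of negatively worded items are illustrative additions, not something taken from Salkind (2012).

```python
def likert_score(answers, points=5, reverse_items=()):
    """Sum the item ratings to get a respondent's scale score.

    answers       -- ratings from 1 to `points`, one per item
    reverse_items -- indices of negatively worded items (an assumption of
                     this sketch); their ratings are flipped before summing
    """
    total = 0
    for index, answer in enumerate(answers):
        if not 1 <= answer <= points:
            raise ValueError(f"item {index}: rating {answer} is off the scale")
        total += (points + 1 - answer) if index in reverse_items else answer
    return total


# Hypothetical respondent answering five items on a five-point scale.
answers = [4, 5, 2, 3, 1]
print(likert_score(answers))                        # 15
print(likert_score(answers, reverse_items={2, 4}))  # 21
```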

When it is necessary to assess individual behavior patterns, two types of personality tests are used: projective and structured.

The methods discussed thus far have the participants acting as the active agents, but there is another data collection method in which the investigator becomes the active agent: observation.  Techniques used to record participants’ behavior include duration, frequency, interval, and continuous recording.  Each has its own purpose and comes with its own merits and possible drawbacks.  Whichever technique is used, however, it is imperative to stay clear of the behavior being recorded, that is, to remain unobtrusive, in order to minimize interference with the participants.

Questionnaires also have a rightful place in data collection for many reasons.  They can be used to survey a broad geographical area and, in comparison to interviews, are inexpensive and can yield more truthful answers because of the anonymity involved.  Effective questionnaires, however, can be cumbersome and time-consuming to develop.  But as long as the basic assumptions are respected and a systematic format is used during development, a questionnaire can be an invaluable tool for collecting data.

A professional cover letter accompanying the questionnaire, from the researcher(s) and academic sources such as professors, advisors, or institutions, lends credibility when soliciting participants’ help for the study.

The process of data collection, when planned and carried out correctly, will ease the investigator’s task during later stages.  It is a systematic process that involves four steps: construction of the data collection form, designation of the coding strategy, collection of the actual data, and entry onto the data collection form.

Because data collection can be the most time-consuming part of the research process, Salkind (2012) has assembled his Ten Commandments of Data Collection, which serve as detailed guidelines for ensuring quality data collection and avoiding potential errors during the process.

Once the data has been gathered, it must be analyzed.  For this, the researcher(s) should have a solid understanding of statistics.  Two basic types of statistics must be understood: descriptive and inferential. 

Descriptive statistics quantitatively describe the basic features of a collection of information (i.e., the data).  Several terms in descriptive statistics must be thoroughly understood, and most researchers will have met them in at least a basic university-level statistics course.  They include the distribution of scores and the measures of central tendency: the mean, the median, and the mode.  Researchers must not only be thoroughly familiar with these measures but must also know when to use each one.
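A short example shows why the choice among the three measures matters.  The scores below are invented, with one extreme value added on purpose; the sketch uses only Python’s standard `statistics` module.

```python
import statistics

# Hypothetical scores with one extreme value (40) to show how an outlier
# pulls the mean but leaves the median and mode untouched.
scores = [2, 3, 3, 4, 5, 6, 40]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle score when sorted
mode_score = statistics.mode(scores)      # most frequent score

print(f"mean = {mean_score}, median = {median_score}, mode = {mode_score}")
# mean = 9, median = 4, mode = 3
```

Here the mean (9) is larger than six of the seven scores, so the median gives a better picture of a typical score in a skewed distribution, while the mode is the natural choice for categorical data where averaging makes no sense.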

There are also measures of variability that must be understood.  Variability refers to the extent to which data points differ from each other.  Commonly used measures of variability are the range, the variance, and the standard deviation (the mean, by contrast, is a measure of central tendency).  In addition to these terms, an understanding is required of concepts such as the normal curve.  Also known as the bell-shaped curve, it represents how the values of many variables are distributed: symmetrically about the mean, with most scores near the center and fewer toward the extremes.  Also involved are standard scores, which share the same reference point and standard deviation.  The most common are z scores, which have a mean of 0 and a standard deviation of 1.  A z score expresses how many standard deviations a raw score lies from its mean, which locates the score along the X-axis of the distribution and allows raw scores from different distributions to be compared.
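The measures of variability and the z score can be computed as follows.  The data and the helper name `z_score` are hypothetical, and the sketch treats the scores as a whole population (`pvariance`, `pstdev`); for a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` would be the usual choice.

```python
import statistics

# Hypothetical set of scores, treated here as a complete population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)
mean_score = statistics.mean(scores)
variance = statistics.pvariance(scores)  # average squared deviation from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance


def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd


z_scores = [z_score(x, mean_score, std_dev) for x in scores]
print(score_range, mean_score, variance, std_dev)  # 7 5 4 2.0

# Comparing raw scores from two different (hypothetical) tests:
# 85 on a test with mean 75 and SD 10 versus 70 on a test with mean 60 and SD 5.
z_test_a = z_score(85, 75, 10)  # 1.0
z_test_b = z_score(70, 60, 5)   # 2.0
```

Although 85 is the higher raw score, the 70 sits two standard deviations above its own mean and is therefore the stronger performance relative to its distribution.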

Equally important is a thorough understanding of inferential statistics.  Inferential statistics allow the researcher to use data from samples to make decisions about the populations from which the samples were taken.  This is why it is so important that the collected sample accurately represent the population.

In inference, chance plays a noteworthy role.  Chance is defined as “the occurrence of variability which cannot be accounted for by any of the variables that you are studying.”  Another key idea in inference is the central limit theorem, which states that the means of repeated samples drawn from a population will be approximately normally distributed regardless of the shape of the population distribution, whether it is normal or not.  For the approximation to hold well, however, the samples must be large enough; a common rule of thumb is a sample size greater than 30.  Because the process of sampling is not perfect, sampling error is introduced.
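The central limit theorem is easy to check with a small simulation.  The sketch below is illustrative only: it assumes a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1), a sample size of 40, and 2,000 repeated samples, all of which are arbitrary choices.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 40    # above the rule-of-thumb threshold of 30
NUM_SAMPLES = 2000  # number of repeated samples drawn from the population


def draw_sample_mean(n):
    """Mean of one sample of size n from a skewed (exponential) population."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


sample_means = [draw_sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = 1.0 / math.sqrt(SAMPLE_SIZE)  # population SD divided by sqrt(n)

# For a normal distribution about 68% of values lie within one SD of the mean.
within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in sample_means
) / NUM_SAMPLES

print(f"mean of sample means: {mean_of_means:.3f} (population mean is 1.0)")
print(f"SD of sample means:   {sd_of_means:.3f} (theory: {expected_sd:.3f})")
print(f"share within one SD:  {within_one_sd:.1%}")
```

The spread of the sample means around the population mean is the sampling error; it shrinks as the sample size grows, which is why larger samples give more trustworthy estimates.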

Another important part of inferential statistics is statistical significance, which is closely tied to Type I error.  A Type I error occurs when the researchers reject a null hypothesis that is actually true.  The level of significance, also known as alpha, is the degree of risk the researchers are willing to take of making this type of error.
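The meaning of alpha as a risk can be illustrated by simulation.  In the sketch below the null hypothesis is true by construction, so every rejection is a Type I error; the population values (mean 100, standard deviation 15), the sample size, and the use of a two-tailed z test with a known standard deviation are all assumptions made for the sake of a short example.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

ALPHA = 0.05                        # chosen level of significance
MU_0, SIGMA, N = 100.0, 15.0, 25    # hypothetical population and sample size
TRIALS = 5000                       # number of simulated studies

# Two-tailed critical value for a z test at the chosen alpha (about 1.96).
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)


def rejects_null(sample):
    """True if a z test of H0: mean == MU_0 rejects at level ALPHA."""
    z = (fmean(sample) - MU_0) / (SIGMA / math.sqrt(len(sample)))
    return abs(z) > critical_z


# Every sample comes from a population where H0 is true,
# so each rejection counted here is a Type I error.
rejections = sum(
    rejects_null([random.gauss(MU_0, SIGMA) for _ in range(N)])
    for _ in range(TRIALS)
)
type_i_rate = rejections / TRIALS

print(f"critical z = {critical_z:.3f}")
print(f"observed Type I error rate = {type_i_rate:.3f} (alpha = {ALPHA})")
```

Over many simulated studies the share of false rejections should settle close to alpha, which is exactly what setting the level of significance at 0.05 commits the researcher to.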

Statistical tests apply in many different situations.  One example is testing the average difference between two groups when the measurements in the two groups are unrelated (i.e., independent).  Statistical tests can also be applied when the measurements are not independent, such as when the same group of pilots is tested at different times to note changes.
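The difference between the two situations shows up in how the test statistic is computed.  The sketch below is illustrative: the scores are invented, the function names are hypothetical, the independent-groups version assumes equal variances (a pooled-variance t statistic), and only the t statistics are computed, since looking up a p-value would need a t distribution that the Python standard library does not provide.

```python
import math
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """t statistic for two unrelated groups (pooled variance assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    std_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_b) - mean(group_a)) / std_error


def dependent_t(first, second):
    """t statistic for the same participants measured at two times."""
    diffs = [s - f for f, s in zip(first, second)]
    std_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / std_error


# Hypothetical scores for five pilots tested before and after training.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 15]

t_dependent = dependent_t(before, after)      # correct here: same pilots twice
t_independent = independent_t(before, after)  # what it would be for unrelated groups

print(f"dependent t   = {t_dependent:.2f} (df = {len(before) - 1})")
print(f"independent t = {t_independent:.2f} (df = {2 * len(before) - 2})")
```

Because each pilot serves as his or her own control, the dependent test removes the differences between pilots from the error term, so the same numbers produce a much larger t statistic than they would if they came from two unrelated groups.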

There are also techniques for working with more than one dependent variable, such as multivariate analysis of variance (MANOVA).  This advanced technique resembles a series of simpler tests of independent means (t-tests), with the key difference that MANOVA takes into account the relationships among the dependent variables.

Another technique that can be used is factor analysis.  This is also an advanced technique; it allows the researchers to reduce a large set of related variables to a smaller number of factors that represent them and then use the factor scores as dependent variables.

Lastly, there is a statistical technique known as meta-analysis that is used to combine, condense, and simplify the findings from several independent studies on the same topic that used the same dependent variable.  It is an excellent tool for judging whether an overall effect is statistically significant when individual studies have produced conflicting results.

Reference

Salkind, N.J. (2012). Exploring Research (8th ed.).  Upper Saddle River, NJ: Pearson/Prentice Hall.

 A freelance contract pilot and safety management system auditor/consultant with AvJet Solutions, Tilak S. Ramaprakash has a history as a co...