Not sure this is the right statistical method? Use the Choose Your StatsTest workflow to select the right method.
What is the Exact Test of Goodness of Fit (multinomial model)?
The Exact Test of Goodness of Fit (multinomial model) is a statistical test used to determine whether the proportions of categories in a single qualitative variable differ significantly from expected or known population proportions. To use it, you should have one group variable with more than two options and fewer than 10 values per cell. See more below.
The Exact Test of Goodness of Fit (multinomial model) is also called the Multinomial Test, the Multinomial Model, the Goodness of Fit Test, and the Multinomial Exact Test.
Assumptions for the Exact Test of Goodness of Fit (multinomial model)
Every statistical method has assumptions. Your data must satisfy these properties for the method's results to be accurate.
The assumptions for the Exact Test of Goodness of Fit (multinomial model) include:
- Categorical variable
- Independent observations
- Mutually exclusive groups
Let’s dive into what that means.
Categorical Variable
For this test, your variable must be categorical with more than two categories. A categorical variable sorts each observation into one of several categories that have no natural order. Examples of categorical variables are eye color, city of residence, type of dog, etc.
Independent Observations
Each of your observations (data points) should be independent. This means that no observation "depends" on any of the others. For example, this assumption is usually violated when there are multiple data points over time from the same unit of observation (e.g. subject/customer/store), because data points from the same unit are likely to be related or to affect one another.
Mutually Exclusive Groups
The groups of your categorical variable should be mutually exclusive. For example, if your categorical variable is city of residence, then your groups are mutually exclusive, because one person cannot live in multiple cities at once.
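As a rough illustration, the sketch below checks a small dataset for units that were recorded in more than one group, which would break mutual exclusivity (and usually independence as well). The record layout, the person IDs, and the function name are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def find_units_in_multiple_groups(records):
    """Return {unit_id: sorted list of groups} for units seen in more than one group."""
    groups_by_unit = defaultdict(set)
    for unit_id, group in records:
        groups_by_unit[unit_id].add(group)
    return {
        unit_id: sorted(groups)
        for unit_id, groups in groups_by_unit.items()
        if len(groups) > 1
    }


# Hypothetical records: (person ID, city of residence)
records = [
    ("p1", "Austin"),
    ("p2", "Boston"),
    ("p3", "Chicago"),
    ("p2", "Denver"),  # p2 is recorded in a second city
]

conflicts = find_units_in_multiple_groups(records)
print(conflicts)
```

An empty result means every unit falls into exactly one group in this dataset. It does not prove independence on its own, since related units can still influence one another.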
When to use the Exact Test of Goodness of Fit (multinomial model)?
You should use the Exact Test of Goodness of Fit (multinomial model) in the following scenario:
- You want to know whether your observed proportions differ from expected proportions
- Your variable of interest is proportional or categorical
- You have more than two options
- You have less than 10 in a cell
Let’s clarify these to help you know when to use the Exact Test of Goodness of Fit (multinomial model).
Difference
You are looking for a statistical test of whether the observed proportions of a variable differ from expected or known proportions. Other types of analyses include testing for a relationship between two variables or predicting one variable using another variable (prediction).
Proportional or Categorical
For this test, your variable of interest must be proportional or categorical. A categorical variable is a variable that contains categories without a natural order. Examples of categorical variables are eye color, city of residence, type of dog, etc. Proportional variables are derived from categorical variables by counting how many observations fall into each category. Examples include the share of visitors who converted on two different versions of your website (10% vs 15%), the share of people who voted vs those who did not, the proportion of plants that died vs survived an experimental treatment, and other percentages.
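To make the link between categorical values and proportions concrete, here is a minimal sketch that turns a list of category labels into per-category proportions. The eye-color data and the function name are made up for illustration.

```python
from collections import Counter


def category_proportions(values):
    """Map each category to its share of all observations."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# Hypothetical sample of 10 eye colors
eye_colors = ["brown"] * 6 + ["blue"] * 3 + ["green"]

proportions = category_proportions(eye_colors)
for category, share in proportions.items():
    print(f"{category}: {share:.0%}")
```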
If you have a continuous variable that you want to compare to an expected population, you may want to use a Single Sample Z-Test.
More than Two Options
Your categorical variable should have more than two options. Some examples of variables like this are eye color, city of residence, and type of dog.
If you have only two options and less than 10 in a cell, you should consider using the Binomial Exact Test of Goodness of Fit.
Less than 10 in a Cell
The rule-of-thumb we recommend is to use this test when you have around 10 or fewer observations in each cell. “Cell” in this case refers simply to the count of values in each group. For example, if I have a list of survey responses with 5 “yes” and 1 “no”, there are 5 and 1 value(s) per cell, respectively.
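The survey example above can be sketched in a few lines: count the values in each cell, then check the counts against the rule-of-thumb. The threshold constant and variable names are our own, and the cutoff of 10 is a guideline from this article rather than a strict statistical boundary.

```python
from collections import Counter

SMALL_CELL_THRESHOLD = 10  # rule-of-thumb from the text, not a hard cutoff

# Survey responses from the example: 5 "yes" and 1 "no"
responses = ["yes"] * 5 + ["no"]

cells = Counter(responses)  # one count per cell (group)
all_cells_small = all(count <= SMALL_CELL_THRESHOLD for count in cells.values())

print(dict(cells))
print("All cells at or under the threshold:", all_cells_small)
```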
If you have more than 10 in a cell, we recommend using the Chi-Square Goodness of Fit Test, since a variable with more than two options cannot be tested as a single proportion. And if you have more than 10 in every cell and more than 1000 total observations, we recommend using the G-Test of Goodness of Fit.
Exact Test of Goodness of Fit (multinomial model) Example
Variable: Political party
In this example, we have a group of subjects and are interested in investigating whether their political party alignment differs from the typical proportions of the population from which the sample was drawn. The null hypothesis is that the proportion of subjects in each political party is the same in the sample as in the population.
Because our variable is categorical with more than two values (one value for each political party), we know that the Exact Test of Goodness of Fit (multinomial model) is a suitable test.
The analysis will result in a probability, or p-value. The p-value is the probability of seeing sample counts at least as unlikely as ours if the null hypothesis is true, that is, if the sample was randomly selected from the population. The lower the p-value, the stronger the evidence that our sample proportions differ from the population's. By convention, a p-value less than or equal to 0.05 is treated as statistically significant, and we conclude that our sample differs from the population on our variable of interest.
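As a rough illustration of how such a p-value can be computed, the sketch below lists every possible way to distribute the sample across the categories, computes each outcome's multinomial probability under the null hypothesis, and adds up the probabilities of all outcomes that are no more likely than the observed one. The party counts and population proportions are hypothetical, the function names are our own, and ordering outcomes by probability is one common convention for the exact test. This is a sketch under those assumptions, not the implementation of any particular package.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` given category probabilities `probs`."""
    coefficient = factorial(sum(counts))
    for count in counts:
        coefficient //= factorial(count)  # stays an exact integer at every step
    return coefficient * prod(p ** c for p, c in zip(probs, counts))


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers that sums to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def exact_multinomial_test(observed, expected_probs, rel_tol=1e-7):
    """Exact goodness-of-fit p-value: total probability of all outcomes
    that are no more likely than the observed counts under the null."""
    if len(observed) != len(expected_probs):
        raise ValueError("observed and expected_probs must have the same length")
    if abs(sum(expected_probs) - 1.0) > 1e-9:
        raise ValueError("expected_probs must sum to 1")
    p_observed = multinomial_pmf(observed, expected_probs)
    cutoff = p_observed * (1 + rel_tol)  # tolerance for floating-point ties
    p_value = 0.0
    for outcome in compositions(sum(observed), len(observed)):
        p_outcome = multinomial_pmf(outcome, expected_probs)
        if p_outcome <= cutoff:
            p_value += p_outcome
    return min(p_value, 1.0)


# Hypothetical sample of 10 subjects across three parties (A, B, C)
observed_counts = [7, 2, 1]
# Hypothetical population proportions for the same three parties
population_props = [0.4, 0.4, 0.2]

p_value = exact_multinomial_test(observed_counts, population_props)
print(f"p-value = {p_value:.4f}")
```

Full enumeration grows quickly with the sample size and the number of categories, so this approach is only practical for the small samples this test is meant for.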
Frequently Asked Questions
Q: How do I run the Exact Test of Goodness of Fit (multinomial model) in R?
A: StatsTest is focused on helping you pick the right statistical method every time. There are many resources available to help you figure out how to run this method with your data:
R article: https://www.rdocumentation.org/packages/EMT/versions/1.1/topics/multinomial.test
R video: https://www.youtube.com/watch?v=WOoS7nVkfDk
If you still can’t figure something out, feel free to reach out.