What Defines a 2-Way Frequency Table?

In summary, a 2-way frequency table is a statistical tool used to display the relationship between two categorical variables. It organizes data into rows and columns, with each cell representing the frequency count of occurrences for the corresponding categories. This table helps in analyzing the interaction between variables, identifying patterns, and facilitating comparisons across different groups. The totals for each row and column can also provide insights into marginal distributions, enhancing the understanding of the data set.
  • #1
Agent Smith
TL;DR Summary
What's a 2-way table?
| | Has dogs | Doesn't have dogs | Total |
|---|---|---|---|
| Has cats | 3 | 5 | 8 |
| Doesn't have cats | 4 | 1 | 5 |
| Total | 7 | 6 | 13 |

The course I'm taking says the above is a 2-way frequency table because there are 2 categories: cats and dogs.

So the table below is not a 2-way frequency table?
| | Men | Women | Children | Total |
|---|---|---|---|---|
| Ate mushroom | 3 | 2 | 4 | 9 |
| Ate lobster | 1 | 0 | 3 | 4 |
| Skipped meal | 6 | 8 | 0 | 14 |
| Total | 10 | 10 | 7 | 27 |

Both tables seem equally useful and it seems we can compute the same type of proportions from each of them.
Is a 2-way table somehow special?
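
A minimal sketch of the proportions in question, using the counts from the two tables in this post. The function name `margins_and_proportions` and the dictionary layout are illustrative assumptions, not something taken from the course.

```python
def margins_and_proportions(table):
    """table maps row label -> {column label -> count}."""
    # Row totals: sum across the columns of each row.
    row_totals = {r: sum(cols.values()) for r, cols in table.items()}
    # Column totals: accumulate each column's counts over all rows.
    col_totals = {}
    for cols in table.values():
        for c, n in cols.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand_total = sum(row_totals.values())
    # Joint proportion of each cell relative to the grand total.
    joint = {
        (r, c): n / grand_total
        for r, cols in table.items()
        for c, n in cols.items()
    }
    return row_totals, col_totals, grand_total, joint


pets = {
    "Has cats": {"Has dogs": 3, "Doesn't have dogs": 5},
    "Doesn't have cats": {"Has dogs": 4, "Doesn't have dogs": 1},
}
meals = {
    "Ate mushroom": {"Men": 3, "Women": 2, "Children": 4},
    "Ate lobster": {"Men": 1, "Women": 0, "Children": 3},
    "Skipped meal": {"Men": 6, "Women": 8, "Children": 0},
}

pets_rows, pets_cols, pets_total, pets_joint = margins_and_proportions(pets)
meal_rows, meal_cols, meal_total, meal_joint = margins_and_proportions(meals)
print(pets_rows, pets_cols, pets_total)
print(meal_rows, meal_cols, meal_total)
```

The same function handles the 2×2 and the 3×3 table without changes, which matches the observation that the same kinds of proportions can be computed from either one.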
 
  • #2
Agent Smith said:
TL;DR Summary: What's a 2-way table?

So the table below is not a 2-way frequency table?
It looks like one to me. You have two categorical variables, "type of person" and "meal" and each individual fits into one category. You could gather data with a two column table, simply looking at each person and recording M/W/C in the "type of person" column and M/L/S in the "meal" column.
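
The two-column data collection described above can be sketched as follows. The raw records are hypothetical: they are simply expanded from the counts in the original table, and the meal labels are spelled out in full so that "M" for man is not confused with "M" for mushroom.

```python
from collections import Counter

# Hypothetical raw data: one (type of person, meal) record per individual,
# expanded from the cell counts of the original summary table.
counts = {
    ("M", "Mushroom"): 3, ("W", "Mushroom"): 2, ("C", "Mushroom"): 4,
    ("M", "Lobster"): 1, ("W", "Lobster"): 0, ("C", "Lobster"): 3,
    ("M", "Skipped"): 6, ("W", "Skipped"): 8, ("C", "Skipped"): 0,
}
records = [pair for pair, n in counts.items() for _ in range(n)]

# Tallying the two-column records reproduces the cells of the summary table.
cells = Counter(records)
# Tallying each column on its own gives the marginal totals.
person_totals = Counter(person for person, _ in records)
meal_totals = Counter(meal for _, meal in records)

print(len(records), dict(person_totals), dict(meal_totals))
```

Because every record carries exactly one value in each column, every individual lands in exactly one cell.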

Where you would have trouble is if one man had both lobster and mushroom. That man can't be recorded in your two column table because he doesn't fit in your categories. If you try to put him in your summary table he should be in both the man/lobster and man/mushroom cells. But now your grand total won't match the number of people and you can't tell why without going back to the original data.
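
A small sketch of that mismatch, under a made-up assumption: the same 27 people as in the original table, except that one of the men who ate mushroom also ate lobster.

```python
# Cell counts from the original table, each person counted exactly once.
cells = {
    ("Men", "Mushroom"): 3, ("Women", "Mushroom"): 2, ("Children", "Mushroom"): 4,
    ("Men", "Lobster"): 1, ("Women", "Lobster"): 0, ("Children", "Lobster"): 3,
    ("Men", "Skipped"): 6, ("Women", "Skipped"): 8, ("Children", "Skipped"): 0,
}
people = sum(cells.values())  # number of individuals surveyed

# Hypothetical: one mushroom-eating man also had lobster. He is added to the
# man/lobster cell, but no new person has been surveyed.
cells[("Men", "Lobster")] += 1

cell_sum = sum(cells.values())
print(people, cell_sum)  # the cell sum now exceeds the head count by one
```

The table alone cannot say which cell holds the duplicate, which is why the original data would be needed to explain the discrepancy.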

That kind of table can be useful, but requires more care to analyse. I suspect that's the kind of thing they have in mind for "not a frequency table".

You could either go to a three way table (had lobster (Y/N), had mushroom (Y/N), type of person (M/W/C)) or add a "both" category to "meal" to get back to a simple frequency table.
 
  • #3
@Ibix so the categories in the rows and the categories in the columns have to be mutually exclusive.
I had a difficult time trying to account for those who ate both mushroom and lobster.

I've made another "2-way table" below. This feels more correct to me.
The row categories and the column categories are mutually exclusive.
| | Ate mushroom | Didn't eat mushroom | Total |
|---|---|---|---|
| Ate lobster | 0 | 4 | 4 |
| Didn't eat lobster | 9 | 14 | 23 |
| Total | 9 | 18 | 27 |

The above "actual" 2-way table conveys the exact same information as my original table, but now we have included the category ate both lobster and mushroom, but we've lost the men, women, children categories. We have 2 variables, viz. mushroom and lobster.

I would say that the 2nd table in my OP (men/women/children vs. mushroom/lobster/skipped meal) isn't a 2-way table because, excluding the marginal entries, the number of cells is not 4, a power of 2. Something binary about a 2-way table, oui?
 
  • #4
Agent Smith said:
@Ibix so the categories in the rows and the categories in the columns have to be mutually exclusive.
Mutually exclusive and complete, I think, so everyone fits into one category and only one. If your restaurant also offers chicken then your current meal choice data collection wouldn't work because there can be people who haven't had lobster or mushroom and haven't skipped the meal - you'd need another category for that, too.
Agent Smith said:
I had a difficult time trying to account for those who ate both mushroom and lobster.
Sticking with your mushroom and lobster, you can add another category "both".
| | Man | Woman | Child |
|---|---|---|---|
| Ate lobster | | | |
| Ate mushroom | | | |
| Ate both | | | |
| Ate neither | | | |
Or you can do what you did and switch to recording yes/no for the meal types separately - but then you'd need a three-way table to summarise everything. Since 3d tables are hard to draw and interpret, you'd normally lay that out something like this
| Ate lobster | Ate mushroom | Man | Woman | Child |
|---|---|---|---|---|
| Yes | Yes | | | |
| Yes | No | | | |
| No | Yes | | | |
| No | No | | | |
I'd tend to think that for two options adding a "both" category is better. For more options than that I'd probably go with the multi-way table. If you add chicken to the menu you have three options, and to record all possible combinations in one categorical variable you need eight categories - M/L/C/ML/MC/LC/MLC/Skipped. If you add a fourth option you need sixteen categories, and it keeps on doubling every time. At some point it becomes easier and less error-prone to record multiple yes/no questions than a single category with millions of options.
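
The doubling can be checked by enumerating every subset of the menu. This is only a sketch: the function name is made up for illustration, and "X" stands in for a hypothetical fourth dish.

```python
from itertools import combinations


def single_variable_categories(dishes):
    """Every subset of dishes, with the empty subset labelled 'Skipped'."""
    labels = []
    for size in range(len(dishes) + 1):
        for combo in combinations(dishes, size):
            labels.append("".join(combo) if combo else "Skipped")
    return labels


three = single_variable_categories(["M", "L", "C"])
four = single_variable_categories(["M", "L", "C", "X"])  # "X": hypothetical dish
print(three)
print(len(three), len(four))
```

With n dishes there are 2^n subsets, so twenty dishes would already need more than a million categories in a single variable, against only twenty yes/no columns.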

I'm drifting off topic. The point is that you need every person (or whatever you're surveying) to fit into one and only one category in each of your categorical variables, so each person (or whatever) contributes to one and only one cell in your summary table. If you have people who fit into more than one category or none then you don't have a frequency table because it's either excluding or double-counting some people.
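
One way to enforce the "one and only one category" rule while collecting data is sketched below. The helper `cell_for` and the way a person is represented (a set of labels that apply to them) are assumptions made for illustration.

```python
def cell_for(person_flags, categories):
    """Return the single category that applies, or raise if it is not exactly one.

    person_flags: set of category labels that apply to this person.
    categories:   the categories of one categorical variable.
    """
    matches = [c for c in categories if c in person_flags]
    if len(matches) != 1:
        raise ValueError(f"fits {len(matches)} categories: {matches}")
    return matches[0]


meal_categories = ["Ate lobster", "Ate mushroom", "Skipped meal"]

print(cell_for({"Ate lobster"}, meal_categories))

# Someone who ate both dishes, and someone who ate only chicken, are rejected:
for flags in ({"Ate lobster", "Ate mushroom"}, {"Ate chicken"}):
    try:
        cell_for(flags, meal_categories)
    except ValueError as err:
        print("rejected:", err)
```

The first rejection is the double-counting case (categories not mutually exclusive) and the second is the exclusion case (categories not complete).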
 
  • #5
I'm not sure that the term "frequency table" has a universally accepted definition that insists its categories be complete and mutually exclusive. If your definition specifies those properties, then your example is not a 2-way frequency distribution. Otherwise it is.

PS. The category of "Other" could always be added to make the categories complete and to make the "frequency" term more appropriate.
 
  • #6
@FactChecker , the lesson on 2-way table came with a discussion on Venn diagrams. The data in my OP (cats & dogs) was represented in both table (2-way) and Venn diagram formats. @Ibix mentioned that there shouldn't be undercounting/overcounting because the numbers (totals in the margins) should add up/tally (complete). This can only be achieved if each cell records a unique (mutually exclusive) subset of the data (this, that, both, neither for example).
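
The link between the cats and dogs table and its Venn diagram can be sketched with the counts from the first post. The variable names are illustrative only.

```python
# The four inner cells of the cats/dogs table are the four Venn regions.
both = 3        # has cats and has dogs (the overlap)
cats_only = 5   # has cats, doesn't have dogs
dogs_only = 4   # has dogs, doesn't have cats
neither = 1     # outside both circles

# Marginal totals are whole circles.
cats = both + cats_only
dogs = both + dogs_only

# Inclusion-exclusion: subtract the overlap so it is not counted twice.
either = cats + dogs - both
total = either + neither
print(cats, dogs, either, total)
```

Adding the two circle totals directly would count the overlap twice, which is the same double counting that a table with non-exclusive categories runs into.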

@Ibix those tables are excelente!

I surmise that 2-way tables are reserved for 2-variable binary (yes/no) data.
 
  • #7
FactChecker said:
PS. The category of "Other" could always be added to make the categories complete and to make the "frequency" term more appropriate.
[Attached image: capture.PNG]
 
