How do I determine the class size for creating a histogram in statistics?

In summary, the thread discusses a statistics problem: making a histogram from the number of Medical Doctors per 100,000 people in each state. The question is how to determine the number of classes and the width of each class. The original poster is unsure whether to use the actual data endpoints, asks whether the choice is arbitrary, and mentions wanting the histogram to look like a "normal curve."
  • #1
Saladsamurai
I have a quick question regarding a stats problem in my girlfriend's text. It is pretty easy I suppose, but I am not quite sure; the text does not appear to give a "general form" of how to obtain the info.

Feel free to correct me if I misuse words here, I am not familiar with the stats lingo yet.

It is asking to make a histogram. There is a table that gives us each state (the individual) and the number of Medical Doctors per 100,000 people in each state (<-- the variable, I presume).

Now I know that I find the range by subtracting the lowest actual value from the highest actual value.

Here is where I am getting a little lost: I think now I am supposed to divide the actual range by the "class size" in order to find the number of classes so I can start to draw my histogram.

But how do I choose the size of each class? Is it arbitrary?

And why do I use the actuals to compute the range instead of a reasonable beginning/end?

Thanks,
Casey
 
  • #2
I'd first plot the data by state to get a "feel."

Class size is arbitrary. It partly depends on whether you'd like the histogram to look like a "normal curve" or can live with "gaps in the middle."

You do not have to use the actual endpoints, but that's one way.
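For the "plot the data by state first" suggestion, here is a minimal sketch. The state labels and values below are invented purely for illustration; substitute the numbers from the textbook's table.

```python
import matplotlib.pyplot as plt

# Hypothetical MD-per-100,000 values for a few states (made-up numbers).
states = ["AL", "AK", "AZ", "AR", "CA"]
doctors_per_100k = [213, 228, 235, 203, 268]

# One bar per state, just to see where the values cluster before binning.
plt.bar(states, doctors_per_100k)
plt.xlabel("State")
plt.ylabel("MDs per 100,000 people")
plt.title("Raw data by state")
plt.show()
```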
 
  • #3


Hello Casey,

Thank you for reaching out with your question. Determining the class size for creating a histogram in statistics is an important step in accurately representing your data. The class size refers to the width of each bar in the histogram and can greatly impact the overall shape and interpretation of the graph.

To determine the class size, you first need to calculate the range of your data, as you mentioned: subtract the lowest value in your data set from the highest value. In practice it is usually easiest to then round that range up to a convenient number, so that the class boundaries fall on even values and are easier to read off the axis.
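As a rough sketch of that arithmetic (the data values are invented; only the steps matter):

```python
# Hypothetical MDs-per-100,000 values (invented for illustration).
data = [203, 213, 221, 228, 235, 240, 247, 252, 260, 268]

# Actual range: highest observed value minus lowest observed value.
actual_range = max(data) - min(data)    # 268 - 203 = 65

# Round up to a convenient figure so the class boundaries come out even.
convenient_range = 70
print(actual_range, convenient_range)
```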

The size of each class can be chosen based on the number of classes you want in your histogram and the range of your data. A general rule of thumb is to aim for 5-15 classes, but it ultimately depends on the size of your data set and the level of detail you want to show. A smaller number of classes may result in a simpler, more general overview of the data, while a larger number of classes can provide a more detailed, nuanced representation.
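Continuing the sketch above: pick a class count somewhere in that 5-15 window and divide the rounded range by it to get an equal class width (again, the specific numbers are hypothetical).

```python
# Rounded-up range from the previous sketch (hypothetical).
convenient_range = 70

# Pick a class count in the 5-15 window; 7 keeps the arithmetic tidy here.
num_classes = 7

# Equal class width = (rounded range) / (number of classes).
class_width = convenient_range / num_classes    # 10 per class
print(num_classes, class_width)
```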

It is important to note that the size of each class should be the same throughout the entire histogram. This helps maintain consistency and makes it easier to compare data points within each class.

As for why the range is computed from the actual minimum and maximum rather than a "reasonable beginning/end": starting from the actual extremes guarantees that every data point falls inside some class. You can still round the class boundaries outward to convenient values afterwards, but if you started from a narrower range, some data points could be left out and the histogram would not give an accurate picture of the data.

I hope this helps clarify the process of determining class size for creating a histogram. If you have any further questions or need additional assistance, please don't hesitate to reach out. Best of luck with your stats problem!

Sincerely,

 

FAQ: How do I determine the class size for creating a histogram in statistics?

What is a histogram?

A histogram is a graphical representation of the distribution of numerical data. It consists of columns (or bins) that represent the frequency of occurrence of values within a specific range.

How do you make a histogram?

To make a histogram, you first determine the range of values for your data and divide it into equal intervals (bins). Then you count the number of data points that fall into each bin and plot the frequency of each bin on the y-axis. Finally, you draw a rectangle over each bin whose height (or, if the bins are unequal, whose area) represents the frequency of the data in that bin.
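A minimal sketch of those steps using numpy and matplotlib; the data values and bin edges are hypothetical and chosen only to match the earlier examples.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical MDs-per-100,000 values (invented for illustration).
data = [203, 213, 221, 228, 235, 240, 247, 252, 260, 268]

# Equal-width bins covering the whole range of the data: 200, 210, ..., 270.
bins = np.arange(200, 280, 10)

# Count how many values fall into each bin and draw the bars.
counts, edges, _ = plt.hist(data, bins=bins, edgecolor="black")
plt.xlabel("MDs per 100,000 people")
plt.ylabel("Number of states")
plt.title("Histogram with class width 10")
plt.show()
```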

What is the purpose of a histogram?

The main purpose of a histogram is to visualize the distribution of data and identify any patterns or trends present. It also allows for easy comparison of data sets and provides insights into the central tendency and variability of the data.

What are some common shapes of histograms?

Some common shapes of histograms include normal (bell-shaped), skewed (asymmetric), bimodal (two peaks), and uniform (flat). These shapes can provide information about the underlying data, such as its symmetry, skewness, and the presence of outliers.

How do you interpret a histogram?

The interpretation of a histogram depends on its shape and the data it represents. Generally, a histogram with a bell-shaped curve indicates a normal distribution, while a skewed histogram may suggest a non-normal distribution. It is also important to consider the range and frequency of the data to draw meaningful conclusions from a histogram.
