Program that reads data from a file and calculates the mean?

In summary, the conversation discusses a programming task involving reading data from two files and calculating the mean, standard deviation, and standard error for each file. The poster is struggling with the I/O aspect and asks for help. They later update their progress and ask for assistance with incorrect standard deviation values. A solution is provided, using an online algorithm to avoid looping through the data twice.
  • #1
youngfreedman
Hi everyone.

I'm trying to write a program that reads data from 2 files and then calculates the mean, standard deviation and standard error of both files (separate values for each). I'm struggling to get my head around simple I/O, so excuse the poor attempt, but this is what I have so far: (I'm only attempting to just print out each value for now.)

Code:
    program data
    implicit none

    integer             :: j
    double precision    :: test

    open(unit = 100, file = 'tmax_1910.txt', status = 'old', action = 'read')
     do j = 1,  12
     read(100,*) test
     print *, 'N1=', test
    end do

    end program data

If it helps, the file is a list of monthly rainfalls for a year.

Thanks for any help!
 
  • #2
I can't immediately see anything wrong. Does it compile, link, execute? If you get an error message in one of those steps, that is the time to ask questions. And tell us what the error message is complete with line number.
 
  • #3
FactChecker said:
I can't immediately see anything wrong. Does it compile, link, execute? If you get an error message in one of those steps, that is the time to ask questions. And tell us what the error message is complete with line number.

I've actually made a lot of progress since posting this, I was going to delete the post but don't know how. However, I do still need help. Here's my code at this point:

Code:
    program data
    implicit none

    integer             :: R, F
    double precision    :: x, sum = 0, mean, y, mean2, sum2 = 0, var = 0, sdv, &
                           var2 = 0, sdv2

    open(unit = 100, file = 'tmax_1910.txt', status = 'old', action = 'read')
     do R = 1,  12
     read(100,*) x
     sum = sum + x

    end do

    mean = (sum)/12

     do R = 1, 12
      var = var + (((x - mean)**2.0)/12)
      sdv = var**0.5

    end do
    open(unit = 200, file = 'tmax_2010.txt', status = 'old', action = 'read')
     do F = 1, 12
     read(200,*) y
     sum2 = sum2 + y

    end do

    mean2 = (sum2)/12

     do F = 1, 12
      var2 = var2 + (((y - mean2)**2.0)/12)
      sdv2 = var2**0.5
    end do

    print *, 'mean=', mean, 'mean2=', mean2, 'sdv=', sdv, 'sdv2=', sdv2

    end program data

This prints the correct mean values, but the standard deviation values (for both files) are incorrect. For reference, here are the numbers in each file:

File 1: 5.0, 6.6, 9.3, 10.4, 14.0, 18.0, 16.9, 18.6, 15.4, 13.1, 5.4, 7.6 (actual standard dev = 4.9..., my value 4.09..)
File 2: 3.2, 4.3, 9.5, 13.0, 14.5, 19.2, 20.8, 19.0, 17.2, 12.9, 7.2, 2.0 (actual standard dev = 6.6.., my value 9.9...)

Thanks.

EDIT: Just to add, the values I get for mean are correct.
 
  • #4
Your values of x are changing correctly in the first loop, but once you get out of that loop you are left with x = last value read. So the second loop has a constant x value. Your alternatives are to loop through the x values twice (either saving an array of them or reading them twice) or to use a different formula for the variance.
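
The "save an array" alternative could look roughly like the sketch below. It assumes the same file name and the fixed count of 12 values from the code in post #3; the program name and the choice to divide by n - 1 (sample variance, which is what the quoted reference values of 4.9... and 6.6... appear to use) are assumptions, not something stated in the thread.

```fortran
program two_pass_stats
    implicit none

    integer, parameter :: n = 12          ! twelve values, as in the thread (assumed fixed)
    integer            :: i
    double precision   :: x(n), mean, var, sdv

    open(unit = 100, file = 'tmax_1910.txt', status = 'old', action = 'read')
    do i = 1, n
        read(100, *) x(i)                 ! keep every value, not just the last one read
    end do
    close(100)

    mean = sum(x) / n

    var = 0.0d0
    do i = 1, n
        var = var + (x(i) - mean)**2      ! each stored value contributes its own deviation
    end do
    var = var / (n - 1)                   ! sample variance (assumption); divide by n for the population form
    sdv = sqrt(var)

    print *, 'mean=', mean, 'sdv=', sdv
end program two_pass_stats
```

The only real change from the code in post #3 is that x is an array, so the second loop sees all twelve values instead of the last one.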

You can use a formula for the variance in which you accumulate the sum of x^2 in the first loop, alongside the sum needed for the sample mean. That saves you from looping through the x values twice.
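
A minimal sketch of that sum-of-squares approach, assuming the file name and the count of 12 from post #3. The variable names, the n - 1 divisor (sample variance) and the standard error line (standard deviation divided by sqrt(n)) are assumptions added for illustration.

```fortran
program one_pass_stats
    implicit none

    integer, parameter :: n = 12          ! twelve values, as in the thread (assumed fixed)
    integer            :: i
    double precision   :: x, total, total_sq, mean, var, sdv, sterr

    total    = 0.0d0
    total_sq = 0.0d0

    open(unit = 100, file = 'tmax_1910.txt', status = 'old', action = 'read')
    do i = 1, n
        read(100, *) x
        total    = total + x              ! running sum of x
        total_sq = total_sq + x**2        ! running sum of x**2
    end do
    close(100)

    mean  = total / n
    ! Sample variance (assumption): divide by n - 1; use n for the population variance
    var   = (total_sq - n * mean**2) / (n - 1)
    sdv   = sqrt(var)
    sterr = sdv / sqrt(dble(n))           ! standard error of the mean

    print *, 'mean=', mean, 'sdv=', sdv, 'sterr=', sterr
end program one_pass_stats
```

With the File 1 values listed in post #3 this should give a standard deviation of roughly 4.96, in line with the 4.9... quoted there. One caveat: subtracting n * mean**2 from the sum of squares can lose precision when the mean is large compared with the spread of the data, which is unlikely to matter for twelve temperatures but can for large data sets.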
 
  • #5
FactChecker said:
Your values of x are changing correctly in the first loop, but once you get out of that loop you are left with x = last value read. So the second loop has a constant x value. Your alternatives are to loop through the x values twice (either saving an array of them or reading them twice) or to use a different formula for the variance.

You can use a formula for the variance in which you accumulate the sum of x^2 in the first loop, alongside the sum needed for the sample mean. That saves you from looping through the x values twice.

I hadn't considered that. That makes sense, thanks !
 
  • #6
The algorithm @FactChecker mentions is called an online algorithm. The idea is that you can type a single datastream at the keyboard and do things with the data like calculate mean, standard deviation, and variance. Works for a file, too: you just read through the file one time.

Wikipedia has examples if you search for 'variance'; the original Knuth version is easy to understand.
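
As a rough illustration, the sketch below uses the single-pass update usually attributed to Welford and presented by Knuth. The file name comes from post #3; the program and variable names, reading until end of file instead of assuming 12 values, and the n - 1 divisor are all assumptions.

```fortran
program welford_stats
    implicit none

    integer          :: n, ios
    double precision :: x, mean, m2, delta, var, sdv

    n    = 0
    mean = 0.0d0
    m2   = 0.0d0      ! running sum of squared deviations from the current mean

    open(unit = 100, file = 'tmax_1910.txt', status = 'old', action = 'read')
    do
        read(100, *, iostat = ios) x
        if (ios /= 0) exit               ! stop at end of file or on a read error
        n     = n + 1
        delta = x - mean                 ! deviation from the mean before the update
        mean  = mean + delta / n
        m2    = m2 + delta * (x - mean)  ! uses both the old and the updated mean
    end do
    close(100)

    if (n > 1) then
        var = m2 / (n - 1)               ! sample variance (assumption); m2 / n is the population form
        sdv = sqrt(var)
        print *, 'n=', n, 'mean=', mean, 'sdv=', sdv
    else
        print *, 'need at least two values, got n=', n
    end if
end program welford_stats
```

Because the mean and m2 are updated after every value, the count does not have to be known in advance and nothing has to be stored, which is what makes the approach suitable for a stream typed at the keyboard as well as for a file.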
 

FAQ: Program that reads data from a file and calculates the mean?

What is the purpose of a program that reads data from a file and calculates the mean?

The purpose of this program is to efficiently and accurately calculate the average value of a set of data stored in a file. This can be useful for analyzing large amounts of data and making informed decisions based on the mean value.

How does the program read the data from the file?

The program opens the file and then reads the values from it one at a time. The Fortran code in this thread does this with the "open" and "read" statements; most other languages offer equivalents, such as Python's built-in "open" function and the read methods of the file object it returns.

What is the formula used to calculate the mean?

The mean is calculated by adding up all the values in a set of data and dividing by the total number of values. In mathematical notation, it is represented as: mean = (x1 + x2 + ... + xn) / n, where x1, x2, ..., xn are the individual values in the data set and n is the total number of values.

How does the program handle errors or missing data in the file?

The program can be designed to handle errors and missing data in different ways, depending on the specific implementation. Some programs may skip over any errors or missing data and only calculate the mean for the valid data, while others may display an error message or prompt the user to input the missing data.
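
One possible way to implement the "skip invalid data" option in Fortran is sketched below. It reads each line as text and then tries to parse a number from it, using the iostat specifier to detect failures. The file name is taken from the thread; the program name, the 256-character line buffer and the policy of silently counting skipped lines are assumptions.

```fortran
program mean_skip_bad
    implicit none

    integer            :: n, nbad, ios_line, ios_val
    double precision   :: x, total
    character(len=256) :: line           ! assumed to be long enough for one line of the file

    n     = 0
    nbad  = 0
    total = 0.0d0

    open(unit = 100, file = 'tmax_1910.txt', status = 'old', action = 'read', iostat = ios_line)
    if (ios_line /= 0) then
        print *, 'could not open file, iostat=', ios_line
        stop 1
    end if

    do
        read(100, '(A)', iostat = ios_line) line
        if (ios_line /= 0) exit              ! end of file, or a read error on the file itself
        read(line, *, iostat = ios_val) x    ! try to parse a number from the text of the line
        if (ios_val /= 0) then
            nbad = nbad + 1                  ! blank or non-numeric line: count it and skip it
            cycle
        end if
        n     = n + 1
        total = total + x
    end do
    close(100)

    if (n > 0) then
        print *, 'mean=', total / n, ' values used=', n, ' lines skipped=', nbad
    else
        print *, 'no valid values found'
    end if
end program mean_skip_bad
```

Parsing from the line buffer, instead of reading numbers directly from the file, keeps the file position well defined after a bad line. Reporting the number of skipped lines makes it visible when the mean was computed from fewer values than expected.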

Can this program be used for any type of data?

The program can be used for any numerical data, as long as it is stored in a file with a consistent layout, for example one value per line or values separated by a delimiter such as a comma or space. A mean is not defined for categorical data, so for that kind of data the program would have to be modified to report something else, such as counts per category.
