Input/Output error with error code -5

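In summary, a Fortran program stops at random with an input/output error (host stdio error code 5) while writing to stdout on a cluster scratch disk. Suggested causes include a full disk, which the poster rules out, and several instances or nodes contending for the same output.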
  • #1
kelvin490
I have a problem running my Fortran program on a high-performance computing cluster. It runs well on my PC, but I want to mass-produce data with different initial conditions, so I put it on a cluster node with eight cores to simulate eight sets of data.

The program runs without problem in the home directory, but since I need extra storage space, a scratch hard disk was added and I run the programs on this disk.

After a while the program stopped with this error message:

PGFIO/stdio: Input/output error
PGFIO-F-/formatted write/unit=6/error code returned by host stdio - 5.
File name = stdout formatted, sequential access record = 181
In source file TipNew8.f90, at line number 2018
FORTRAN STOP

I have run it several times, and a similar error occurs each time, but at different source lines and at different time steps, so it seems quite random. It always occurs at a line with a "write" or "print" statement. The program runs without problem on my PC using Microsoft Visual Studio with the PGI compiler.

Does anyone have ideas what's wrong with the program?
 
  • #2
Is your program writing to a disk file? Do you have enough disk space on this scratch disk?

You could check with the "df -h" command if this is Linux.
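
As a rough sketch of that check (SCRATCH_DIR is a placeholder I made up, so substitute the real mount point of the scratch disk), something like the following reports the free space in a form a script can test, and also looks at inodes, since a file system can refuse new files while still showing free bytes:

```shell
#!/bin/sh
# SCRATCH_DIR is a hypothetical placeholder; it defaults to the current directory.
SCRATCH_DIR="${SCRATCH_DIR:-.}"

# -P forces one POSIX-format line per file system, -k reports 1024-byte blocks.
# Column 4 of the second line is the available space.
avail_kb=$(df -Pk "$SCRATCH_DIR" | awk 'NR==2 {print $4}')
echo "Available on $SCRATCH_DIR: ${avail_kb} KB"

# Inode usage; not every df implementation supports -i, so do not fail on it.
df -i "$SCRATCH_DIR" || true
```

Run it from the job itself rather than from the login node, so that it sees the same mounts the program sees.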
 
  • #3
jedishrfu said:
Is your program writing to a disk file? Do you have enough disk space on this scratch disk?

You could check with the "df -h" command if this is Linux.

I have checked and there is enough space.
 
  • #4
I suggest you contact the support staff responsible for the cluster. I don't think there is much we can do to help you without access to the system.
 
  • #5
I don't know exactly what "cluster" means here, but here are some ideas:
  • Are you compiling your program on your PC and then running it on the cluster?
  • Can you run your program on the cluster without taking advantage of the cluster aspect, i.e. just one instance of it? Does it run that way?
  • Can you compile your program on the cluster itself, to be sure of compatibility?
  • What does "cluster" mean here? Many independent instances of the same program? Are they all writing to the exact same file, or are the file names different?
 
  • #6
kelvin490 said:
I have checked and there is enough space.
Space shouldn't matter if your program checks for it before doing the heavy processing; running out would then be just one controlled way of terminating.
Are all I/O channel resources guaranteed to be allocated before the run?
What about the program's error handling?

It sounds as though something similar is happening, such as a buffer overflow somewhere, or a node conflict and timeout to disk access.

Whether that comes from your software or from the cluster, I don't know enough to say. Could it come from the network links?

Random means that the error is indeterminate, i.e. everything works really well until the error occurs and you have a complete collapse; for example, adding the scratch disk may have led to an overwhelming accumulation of data.

That's about all I know.
 
  • #7
256bits said:
or a node conflict and timeout to disk access.
Now that you mention it, this is what I would investigate first. You should be careful that different nodes are not trying to write to a file at the same time. It is very good practice to have one node handle all input/output.
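
One way to rule this out, sketched here under assumptions: the eight instances are independent, and each can have its standard output redirected to its own log file so that no two processes ever write to the same stream. The function run_case below is a made-up stand-in for the real executable, and OUT_DIR is a hypothetical output directory.

```shell
#!/bin/sh
# run_case is a hypothetical stand-in for the real program; replace its body
# with the actual executable and its arguments.
run_case() {
    echo "case $1: step 1"
}

# OUT_DIR is a placeholder; point it at the scratch disk (or at local disk
# to test whether the scratch disk itself is the problem).
OUT_DIR="${OUT_DIR:-./runs}"
mkdir -p "$OUT_DIR"

# Start eight instances in the background, each with its own log file
# for both stdout and stderr.
for i in 1 2 3 4 5 6 7 8; do
    run_case "$i" > "$OUT_DIR/case_$i.log" 2>&1 &
done

# Block until every background instance has finished.
wait

n_logs=$(ls "$OUT_DIR"/case_*.log | wc -l | tr -d ' ')
echo "wrote $n_logs log files"
```

Since the reported failure is on unit 6 (stdout), it may also be worth pointing the logs at a different file system for one run: if the error disappears, the scratch disk or its connection is the more likely culprit than the program.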
 

FAQ: Input/Output error with error code -5

What does an "Input/Output error with error code -5" mean?

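An "Input/Output error with error code -5" typically means that a read or write between a program and a device failed. In the message quoted in this thread, the PGI Fortran runtime reports the code returned by the host's stdio layer as 5 (the hyphen appears to be a separator rather than a minus sign). On Linux, errno 5 is EIO, the generic "Input/output error", which says that the underlying read or write itself failed; a full disk would normally be reported as ENOSPC, "No space left on device".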

How can I fix an "Input/Output error with error code -5"?

There are a few potential ways to fix an "Input/Output error with error code -5", depending on the specific cause of the error. Some possible solutions include checking for any hardware issues, making sure the device or program is properly connected and functioning, and trying to access the data from a different computer.

Does an "Input/Output error with error code -5" always mean there is a problem with the computer?

Not necessarily. While an "Input/Output error with error code -5" can indicate a problem with the computer's hardware or software, it can also be caused by issues with the device or program itself. It's important to troubleshoot and identify the root cause of the error before assuming it is a problem with the computer.

Can an "Input/Output error with error code -5" result in data loss?

Yes, an "Input/Output error with error code -5" can potentially result in data loss if the error is not resolved. If the data cannot be properly read from or written to the device or program, it may become corrupted or lost. It's important to address the error as soon as possible to minimize the risk of data loss.

Is there any way to prevent an "Input/Output error with error code -5" from occurring?

While it's impossible to completely prevent all input/output errors, there are some steps you can take to reduce the likelihood of encountering an "Input/Output error with error code -5". These include regularly backing up important data, keeping hardware and software updated, and properly ejecting devices before disconnecting them from the computer.
