File System Improvements in a Half Century

In summary, file systems have been around for a long time, but the features available have remained relatively the same.
  • #1
Vanadium 50
Triggered by the "restore data from a deleted file" thread, I was thinking about what improvements have been made in file systems in the last 50 or so years. What was not present in Multics or ITS or early Unix?

I'm not talking about capacity limits. At the birth of the PC, one could have built a FAT that would have worked on today's disks - a million times bigger - but what would have been the point?

I can think of three candidates:
1. Sparse files
2. Deduplication
3. Copy-on-write

Sparse files are definitely a niche. Copy-on-write can promote fragmentation, which was an issue back in the day, so I can understand why it didn't happen. ("You want me to do what?") Deduplication is a good idea when it is, and a very, very bad idea at other times.
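For anyone who hasn't run into them, a minimal sketch of what a sparse file looks like in practice (Python; the file name is made up, and st_blocks is only meaningful on Unix-like systems with a sparse-capable file system such as ext4 or XFS):

```python
import os

# Create a "1 GB" file but only write 4 KB of real data at the end.
# On a sparse-file-capable file system, the hole in the middle is never allocated.
with open("sparse.bin", "wb") as f:
    f.seek(1024 * 1024 * 1024 - 4096)   # jump ~1 GB forward, leaving a hole
    f.write(b"\xff" * 4096)             # only this block needs real storage

st = os.stat("sparse.bin")
print("apparent size:", st.st_size, "bytes")        # ~1 GB
print("allocated on disk:", st.st_blocks * 512, "bytes")  # a few KB if sparse
```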

But I can't think of anything else. It's as if all the progress in CS kind of sidestepped file systems.
 
  • #3
Interesting question. I never knew how files were stored on mainframe disks. I knew they used terms like blinks and links for storage size, and that low-level random access was used to get data. This implied a table that managed the available sectors, their order, and the platters one could write on.

There was no notion of undeleting files. You could do file recovery, though, where a tape could be mounted to get an old copy of a file. You had to request it through the computer center help desk, and they did the magic recovery.

Source code management for programs was done via card decks and mag tape, with special control cards indicating what program to alter and which lines were to be replaced. This usually meant changes were seldom and small, and a program redesign was required once the accumulated changes became unmanageable.

More on Multics in a 1965 paper:

https://multicians.org/fjcc4.html
 
  • #5
berkeman said:
Wear leveling for media that have limited endurance (like flash).
That's good. I originally thought of that as SSD firmware, but you need the file system to tell the drive what can and can't be trimmed.

Filip Larsen said:
Wikipedia has a nice list
But most of those features have been around for a very long time.

jedishrfu said:
I never knew how files were stored on mainframe disks
I remember allocating blocks by hand. You'd tell the system you needed N blocks and it would give you the space. You could give it a mnemonic alias, but that was the closest thing to a file name.
 
  • #6
Vanadium 50 said:
what improvements have been made in file systems in the last 50 or so years. What was not present in Multics or ITS or early Unix?
How about journaling filesystems? IIRC those didn't come into use until the 1990s.
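For readers who haven't seen the idea, here is a toy sketch of the write-ahead principle behind journaling, with invented file names and record format rather than any real file system's on-disk layout: an update is appended and fsynced to the journal before the main structure is touched, so after a crash the journal can be replayed and a torn tail record discarded.

```python
import json, os

JOURNAL = "fs.journal"    # hypothetical journal file
DATA    = "fs.data.json"  # hypothetical "real" metadata store

def write_update(key, value):
    # 1. Append the intended change to the journal and force it to disk.
    with open(JOURNAL, "a") as j:
        j.write(json.dumps({"key": key, "value": value}) + "\n")
        j.flush()
        os.fsync(j.fileno())
    # 2. Only then apply the change to the main structure.
    data = json.load(open(DATA)) if os.path.exists(DATA) else {}
    data[key] = value
    with open(DATA, "w") as d:
        json.dump(data, d)
        d.flush()
        os.fsync(d.fileno())

def recover():
    # After a crash, re-apply journaled records; a torn final record is
    # treated as never committed and dropped.
    if not os.path.exists(JOURNAL):
        return
    data = json.load(open(DATA)) if os.path.exists(DATA) else {}
    with open(JOURNAL) as j:
        for line in j:
            try:
                rec = json.loads(line)
            except ValueError:
                break  # partial write at the tail: discard it
            data[rec["key"]] = rec["value"]
    with open(DATA, "w") as d:
        json.dump(data, d)
    os.remove(JOURNAL)  # checkpoint: journal no longer needed
```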
 
  • #7
I think I was using IBM JFS, at least a beta version, in the late '80s. But I don't think JFS was the first. CICS had a ROLLBACK command, and I think Tandem had something similar.

But I think there is a newer feature related to this: snapshots. That uses journaling in a very different way.
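To illustrate why snapshots come almost for free once data is never overwritten in place, here is a purely illustrative copy-on-write sketch (not how any particular file system lays out its blocks): a snapshot is just a saved copy of the small block map, and later writes go to fresh blocks.

```python
# Toy copy-on-write block store: a "file" is a map from logical block number
# to a physical block; writes never modify a physical block in place.
class CowStore:
    def __init__(self):
        self.blocks = []      # physical block storage (append-only)
        self.current = {}     # live map: logical index -> physical index
        self.snapshots = {}   # snapshot name -> frozen copy of a block map

    def write(self, logical, data):
        self.blocks.append(data)                 # always a new physical block
        self.current[logical] = len(self.blocks) - 1

    def snapshot(self, name):
        # A snapshot copies only the (small) map, not the data itself.
        self.snapshots[name] = dict(self.current)

    def read(self, logical, snapshot=None):
        table = self.snapshots[snapshot] if snapshot else self.current
        return self.blocks[table[logical]]

store = CowStore()
store.write(0, b"version 1")
store.snapshot("before-edit")
store.write(0, b"version 2")          # lands in a fresh block
print(store.read(0))                  # b'version 2'
print(store.read(0, "before-edit"))   # b'version 1' is still intact
```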
 
  • #8
Wow, those bring back memories of IDS and ISP databases. IDS was the hot baby at the time, with very fast lookup, but it had the problem of getting entangled in its own data, requiring a custom program to unload and reload it.

ISP, for Indexed Sequential Processing (the GE name; IBM used ISAM), was a simpler paged database: as records were added, they were inserted into a page, and when the page got full, a new overflow page was allocated and linked, and the record was written there.

A standard utility could follow the records sequentially, offloading them to tape and later reloading them back onto pages. A percentage was used to indicate how full a page should be, allowing for a nominal number of record inserts before an overflow page was allocated.
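A toy version of that page-and-overflow scheme, with the capacity and fill factor invented for illustration rather than taken from GE's or IBM's actual formats, might look like this:

```python
PAGE_CAPACITY = 4       # records per page (toy value)
FILL_FACTOR   = 0.75    # initial load leaves room for a few inserts

class Page:
    def __init__(self):
        self.records = []
        self.overflow = None            # overflow page, allocated on demand

    def insert(self, record):
        if len(self.records) < PAGE_CAPACITY:
            self.records.append(record)
        else:
            if self.overflow is None:   # page full: allocate and link overflow
                self.overflow = Page()
            self.overflow.insert(record)

def initial_load(sorted_records):
    """Load each primary page only up to the fill factor."""
    per_page = int(PAGE_CAPACITY * FILL_FACTOR)
    pages = []
    for i in range(0, len(sorted_records), per_page):
        page = Page()
        page.records = list(sorted_records[i:i + per_page])
        pages.append(page)
    return pages

def sequential_unload(pages):
    """Walk each primary page and its overflow chain in order (tape offload)."""
    for page in pages:
        while page is not None:
            yield from page.records
            page = page.overflow

# Usage sketch
pages = initial_load(list(range(1, 10)))   # 9 records, 3 per primary page
pages[0].insert(3.5)                       # fills the spare slot on page 0
pages[0].insert(3.6)                       # page now full: overflow allocated
print(list(sequential_unload(pages)))
```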
 
  • #9
Vanadium 50 said:
But most of those features have been around for a very long time.
Yes, it was mainly just to have a nice overview of existing file systems in use. The last time I tried, some years ago, to get up to speed on the advanced features of newer file systems, I recall ZFS being prominent on the list for including most of the advanced features available at the time, and that still seems to be the case.

I am no expert, but it somewhat seems file system technology has stabilized, and it's difficult to imagine disruptive new features being added. Storage innovation over the last decade or so has mostly been oriented towards cloud and network storage technologies and less towards features for the "stand-alone" OS file system.
 
  • #10
Vanadium 50 said:
Triggered by the "restore data from a deleted file" thread, I was thinking about what improvements have been made in file systems in the last 50 or so years. What was not present in Multics or ITS or early Unix?
I have been writing code professionally for more than 50 years, so...

The first improvement is that there are file systems. Fifty years ago, it was normal to allocate disk cylinders to different tasks as a method of optimizing seek times. I had a coworker who told me that his old boss told him to only access the disk through a subroutine the boss had written - because the hardware supported the feature of seeking to more cylinders than were available. I called that "lathe mode".

On the IBM 1620 we had at Lowell Tech, the 10-Mbyte hard drive was arranged with a code area and a files area, but students were expected to keep their code on card decks and their output in printed form - paper tape was also available.

You mentioned "sparse files". In the '70s and early '80s, some systems would support B-tree-style random access files, so you could skip the file pointer around on writes and it would only allocate the sectors that were required.

As systems became multi-processing, with two or more processes writing to disk at the same time, the file systems needed to become re-entrant - and the file structures and system needed to mediate between applications that had no means of cooperation.

It's hard to separate file system changes from the changing requirements created by different hardware. File systems are expected to handle fragmentation issues on their own. To guarantee a contiguous file 50 years ago, you could allocate a file's size before writing to it. Now, with SSDs, it's not even an issue; the cost of fragmentation is imperceptible.
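The up-front allocation trick still has a modern counterpart. A hedged sketch in Python (os.posix_fallocate is only available on Linux and some other Unixes; the file name and sizes here are arbitrary):

```python
import os

# Reserve 100 MB for a file up front so the file system can try to keep it
# contiguous, and so a later write cannot fail with "disk full".
SIZE = 100 * 1024 * 1024

fd = os.open("preallocated.dat", os.O_WRONLY | os.O_CREAT, 0o644)
try:
    if hasattr(os, "posix_fallocate"):      # Linux and some Unixes
        os.posix_fallocate(fd, 0, SIZE)
    else:                                   # portable fallback: write zeros
        remaining = SIZE
        chunk = b"\0" * (1024 * 1024)
        while remaining > 0:
            remaining -= os.write(fd, chunk[:remaining])
finally:
    os.close(fd)
```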
 
  • #11
That's a very good point that problems today are different. Compared to ~40 years ago, CPUs are maybe 50,000x faster, but memory only 10x. Spinning disks are a million times bigger but only 1000x faster.

The same technology can serve different purposes. Back then, we compressed data to save space. Today we do it to save time. The reason I compress my data is not to get 10% more capacity - it's to get better speed: 10% more cache hits and 10% faster transfer times.
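A rough back-of-the-envelope version of that argument, using zlib at a fast setting as a stand-in for the LZ4/zstd-style compressors file systems actually use, and an assumed disk throughput of 200 MB/s:

```python
import time
import zlib

# Repetitive data compresses well; ~10 MB of it is enough to time.
data = b"some fairly repetitive log line that compresses well\n" * 200000

t0 = time.perf_counter()
compressed = zlib.compress(data, level=1)   # fast, low compression level
t1 = time.perf_counter()
zlib.decompress(compressed)
t2 = time.perf_counter()

print(f"ratio: {len(compressed) / len(data):.2f}, "
      f"compress: {(t1 - t0) * 1e3:.1f} ms, "
      f"decompress: {(t2 - t1) * 1e3:.1f} ms")

# If the disk sustains ~200 MB/s while decompression runs at GB/s, reading
# the compressed form plus decompressing beats reading the raw bytes.
disk_bytes_per_s = 200e6
raw_read  = len(data) / disk_bytes_per_s
comp_read = len(compressed) / disk_bytes_per_s + (t2 - t1)
print(f"estimated read time  raw: {raw_read * 1e3:.1f} ms   "
      f"compressed: {comp_read * 1e3:.1f} ms")
```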
Filip Larsen said:
ZFS
I like ZFS. I run it at home. I suspect that there is little it can do that GPFS cannot (except maybe be run at home in a practical manner).
 

FAQ: File System Improvements in a Half Century

What are the key advancements in file systems over the past 50 years?

Some key advancements in file systems over the past 50 years include the development of hierarchical file systems, journaling file systems, and distributed file systems. Additionally, improvements in file system performance, reliability, and scalability have been significant.

How have file systems evolved to accommodate larger storage capacities?

File systems have evolved to accommodate larger storage capacities by implementing features such as support for larger file sizes, improved disk allocation algorithms, and more efficient data structures. Additionally, advancements in storage technologies have allowed file systems to take advantage of larger storage devices.

What impact have solid-state drives (SSDs) had on file system design?

SSDs have had a significant impact on file system design by introducing new challenges and opportunities. File systems have been optimized to take advantage of the high-speed, low-latency nature of SSDs, as well as to address issues such as wear leveling and garbage collection specific to these devices.

How have file systems adapted to the increasing prevalence of cloud storage?

File systems have adapted to the increasing prevalence of cloud storage by integrating features such as seamless synchronization, data deduplication, and encryption. Additionally, file systems have been designed to work efficiently in distributed environments, allowing for seamless access to data stored in the cloud.
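One of those features, block-level deduplication, is easy to sketch in miniature (a toy content-addressed store, not any particular product's implementation): blocks are identified by a cryptographic hash, and identical blocks are stored only once.

```python
import hashlib

BLOCK_SIZE = 4096

class DedupStore:
    def __init__(self):
        self.blocks = {}   # hash -> block contents, stored once
        self.files = {}    # file name -> list of block hashes

    def write_file(self, name, data):
        hashes = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            h = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(h, block)   # store only if unseen
            hashes.append(h)
        self.files[name] = hashes

    def read_file(self, name):
        return b"".join(self.blocks[h] for h in self.files[name])

store = DedupStore()
store.write_file("a.img", b"\x00" * 40960)   # ten identical zero blocks
store.write_file("b.img", b"\x00" * 40960)   # a byte-for-byte duplicate
print(len(store.blocks))                      # 1: only one unique block stored
```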

What are some current challenges in file system design and how are they being addressed?

Some current challenges in file system design include ensuring data integrity and security, optimizing performance for diverse workloads, and managing the complexity of modern storage systems. These challenges are being addressed through innovations in areas such as data redundancy, encryption, and file system virtualization.
