Are There Standard Tools for Verifying Directory Structure?

  • Thread starter 0rthodontist
  • Tags
    Structure
In summary: Nobody in the thread knew of a standard tool for verifying that a directory structure takes a certain form. The suggestions were to write an application-specific check, or to build one on existing pieces such as Python's directory-walking function or the UNIX ftw call.
  • #1
0rthodontist
Science Advisor
1,231
0
Are there standard tools for verifying that a directory structure takes a certain form? This is for when a program needs certain minimum files with certain name constraints to be in certain places and directories with certain name constraints to be in certain places. It would specify things like "there must be a file called files.dat in the home directory, and there must be some number of directories with numeric names, each of which has a file called configure.dat." It's not hard to write application-specific means of verifying this but it would be nice if there is a general framework.
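For reference, the application-specific version of the example above might look something like the following in Python. This is only a sketch: the function name `check_layout` is made up for illustration, and it assumes that "some number of directories" means at least one.

```python
import os

def check_layout(root):
    """Return a list of problems; an empty list means the layout is acceptable."""
    problems = []
    # Rule 1: a file called files.dat directly under root
    if not os.path.isfile(os.path.join(root, "files.dat")):
        problems.append("missing files.dat")
    # Rule 2: directories whose names consist only of digits
    numeric_dirs = [name for name in sorted(os.listdir(root))
                    if name.isdigit() and os.path.isdir(os.path.join(root, name))]
    if not numeric_dirs:
        # Assumption: at least one numeric directory is required
        problems.append("no numerically named directories")
    # Rule 3: each numeric directory holds a configure.dat
    for name in numeric_dirs:
        if not os.path.isfile(os.path.join(root, name, "configure.dat")):
            problems.append("missing " + name + "/configure.dat")
    return problems
```

Calling `check_layout(some_directory)` and testing for an empty list is enough for one program; the open question in this thread is whether the three rules could be stated as data instead of code.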
 
  • #2
No - no OS provides that. Most OSes have system calls (like sysconf() in UNIX) that tell you a lot about a system - but not application expectations. And I don't know of a utility to do that. You can try scrounging around sourceforge and see.

You'll probably have to build your own front end. It's just a matter of stat()-ing for the existence of some directories and a few files. You may want to require the existence of an init file (or rule file) or a database table that lists the location of these files/directories. Programs do this all the time.
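A rough Python sketch of the rule-file idea, using os.stat() for the existence checks. The rule format here (one `f path` or `d path` per line, `#` for comments) and the name `check_rules` are invented for this example, not any standard.

```python
import os
import stat

def check_rules(root, rule_lines):
    """Check each rule line ('f path' or 'd path') against the tree under root.

    Returns the rule lines that are not satisfied.
    """
    unsatisfied = []
    for line in rule_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        kind, rel_path = line.split(None, 1)
        try:
            mode = os.stat(os.path.join(root, rel_path)).st_mode
        except OSError:
            unsatisfied.append(line)  # path does not exist or is unreadable
            continue
        # 'd' must be a directory; anything else must be a regular file
        ok = stat.S_ISDIR(mode) if kind == "d" else stat.S_ISREG(mode)
        if not ok:
            unsatisfied.append(line)
    return unsatisfied
```

Since an open file iterates over its lines, the rules can be passed straight from a file object, or as a plain list of strings when testing.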
 
  • #3
Well, it would be interesting if there is no current tool that does this, because it would certainly be useful and would not be terribly difficult to write. I was thinking that directory structure could be abstractly specified using regular expressions or something similar and then verified in a standard way.
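One possible reading of "specified using regular expressions", sketched in Python: the spec is a list of (regex, minimum count) pairs matched against every relative path in the tree. The names `relative_paths` and `verify` and the spec format are assumptions made for this sketch.

```python
import os
import re

def relative_paths(root):
    """Yield every file and directory under root as a '/'-separated relative path."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            yield rel.replace(os.sep, "/")

def verify(root, spec):
    """spec is a list of (regex, minimum_count) pairs.

    Returns (regex, actual_count) for every rule matched by too few paths.
    """
    paths = list(relative_paths(root))
    failures = []
    for pattern, minimum in spec:
        regex = re.compile(pattern)
        # fullmatch: the whole relative path must match the pattern
        count = sum(1 for p in paths if regex.fullmatch(p))
        if count < minimum:
            failures.append((pattern, count))
    return failures
```

A spec such as `[(r"files\.dat", 1), (r"[0-9]+/configure\.dat", 1)]` covers part of the original example. Note the limitation: counting matches cannot say "every numeric directory contains configure.dat", only that some do.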
 
  • #4
I don't see any reason to make an entire abstracted "framework" for a task that requires perhaps ten lines of code, is different for every program, and occupies the CPU for perhaps a couple of milliseconds.

Unless you're scanning the filesystem for many thousands of files, this framework would be of little value. Of course, if you really need to store thousands of files on the filesystem, you probably have a pretty lousy design.

Sometimes I sense that you have the "abstractaholism" suffered by many programmers -- particularly new ones. I went through it, too. It's characterized by a desire to modularize, generalize, and abstract even the simplest actions into fifteen cooperating classes.

My best advice would be to spend your effort on the 20% of your code that consumes 80% of your resources -- memory, hard disk space, CPU time, man-hours spent coding, etc. -- and don't worry about the rest. Assuredly, the trivial task of looking for a few files is not worthy of this kind of attention.

- Warren
 
  • #5
Optimization is not the aim. In fact, since it would parse regular expressions (or I'm thinking now, expressions in Backus-Naur form), it would probably be slower by an insignificant amount. For most applications optimization is a very minor concern, which is why we have scripting languages.

You can never have too much abstraction and re-usability available to you.
 
  • #6
0rthodontist said:
You can never have too much abstraction and re-usability available to you.

Spoken like a true sufferer of abstractaholism. When you spend days writing and debugging the most general possible solution to looking for files in the filesystem -- when four calls to stat() would suffice for 90% of your programs -- you absolutely are using too much abstraction. Spend your energy on another part of your program that matters.

There's no way at all that you could possibly argue that your grammar-parsing "framework" is anywhere close to the speed of the equivalent calls to stat(). In fact, the time (and memory) required to just load the regular expression library would be orders of magnitude slower than just calling stat().

I remember those days... when I almost felt dirty for writing a line of code that actually did something.

- Warren
 
  • #7
We are talking milliseconds here. Even a two order of magnitude difference wouldn't matter. You think that a user cares that your program takes an extra .01 seconds to load a library?

A good programmer builds up his own private library of programming tools. This looks like a good candidate, one that may be re-used many times.
 
  • #8
Then knock yourself out, man. You don't need my (or anyone else's) approval. I just think it's a pretty boring thing to spend so much time on. It doesn't appear (to me) to be the sort of task that's big enough, hard enough, or meaningful enough to be made into a reusable "framework."

- Warren
 
  • #9
Python has lots of POSIX calls, generally in the library module "os". It also has tools that are a little bit above stat.

Look for a Python module/function/class called "walk": it will walk a directory tree for you, performing a user-supplied function at each node. If you do it right, you should be able to carry some state with you. Even if you're not using Python, it should give you some ideas, for better or worse. Another option is to construct an object tree that mirrors a directory tree and use a Visitor pattern.
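As a sketch of the walking idea in current Python: os.walk is a generator rather than a function that takes a callback (the older callback-style os.path.walk was removed in Python 3), so the state is simply carried in a local variable across loop iterations. The `summarize` function and the fields it tracks are made up for illustration.

```python
import os

def summarize(root):
    """Walk the tree under root, accumulating counts in a state dictionary."""
    state = {"dirs": 0, "files": 0, "deepest": 0}
    for dirpath, dirnames, filenames in os.walk(root):
        state["dirs"] += len(dirnames)
        state["files"] += len(filenames)
        # Depth of the directory being visited: 0 for root itself
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        state["deepest"] = max(state["deepest"], depth)
    return state
```

The same loop body is where a structure check would go: inspect `dirnames` and `filenames` at each node and record anything that violates the expected layout.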

As Warren points out, it's important to match the expense and complexity of developing something like this with the scale of the overall project and this component's role in that project.
 
  • #10
I agree with Warren. Checking file existence is a done deal, and creating some abstracted framework to do it is analogous to "circle-squaring" in the computer programming tools domain.

If it were a reasonable problem, it is also reasonable to assume tools would already exist -- even if they were poor. UNIX abounds with tools. Esp. since the advent of open source. And no tool I can locate does anything like this - I had a rummage at sourceforge and found nothing like your requirements.

On UNIX, you can integrate regex.h calls and ftw to create a custom file search.
 
  • #11
You're right--because it doesn't exist, it's not useful.
 
  • #12
0rthodontist said:
You're right--because it doesn't exist, it's not useful.

Sarcasm aside, if you think it's worth writing, then write it. It doesn't really matter if I, or anyone else, thinks it's useful.

- Warren
 
  • #13
The point I didn't make clearly:

This problem has been around since I started programming like 40+ years ago. It is not new. Since dealing with it on a "onesies" basis has been more than good enough for people who wrote lots of other tools, then maybe it's fair to conclude it ain't all that hard to do it the "hard" way.

Alternatively stated, abstracting it isn't worth the effort. Or abstracting it and applying the abstraction is much more time inefficient than doing it the old way. Ockham's razor, so to speak.

However if you want to write it, do it. Put your effort where your feelings are. I'd like to see a very simple abstraction - one or two lines of C/C++ to call a library call or a class.

My personal belief is that this isn't generally possible because of the required background data. How are you going to specify all of the requirements to any call without gobs of arguments? Or some kind of data file?

YMMV.
 
  • #14
Well, you would use a data file containing a BNF expression representing the directory structure. I can basically do it, except I'm having trouble making a parseable BNF expression (using the simpleparse module).

Basically I do
Code:
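import os

def tdir(dirname):
    """Flatten the tree under dirname into one string: (file) and [dir...]."""
    string = ''
    for y in os.listdir(dirname):
        path = os.path.join(dirname, y)
        if os.path.isdir(path):
            # a directory: its name, then its contents, in square brackets
            string += '[' + y + tdir(path) + ']'
        else:
            # a file: its name in round parentheses
            string += '(' + y + ')'
    return string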
This turns the directory structure into a single string; I then write a BNF expression that says what directory structure I want. However, I am having some difficulty getting this to work... the simpleparse module matches the string greedily and sometimes won't parse a correct string. It doesn't seem to do full BNF parsing. Specifically, I need to parse matched parentheses and brackets to arbitrary depth.
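For what it's worth, matched brackets to arbitrary depth can be handled without a parsing library by a small hand-written recursive-descent parser. The sketch below is not simpleparse and not a general BNF engine; it only reads the string format described above, and it assumes no file or directory name contains any of the characters ( ) [ ]. The names `parse_entries` and `parse_tree` are made up for this example.

```python
def parse_entries(text, pos=0):
    """Parse zero or more entries starting at pos; return (entries, new_pos).

    A file '(name)' becomes the string name; a directory '[name...]' becomes
    the tuple (name, list_of_child_entries).
    """
    entries = []
    while pos < len(text) and text[pos] in "([":
        opener = text[pos]
        # The name runs up to the next bracket character of any kind
        name_end = pos + 1
        while name_end < len(text) and text[name_end] not in "()[]":
            name_end += 1
        name = text[pos + 1:name_end]
        if opener == "(":
            children, pos, closer = None, name_end, ")"
        else:
            # Recurse for the directory's contents; this is what handles depth
            children, pos = parse_entries(text, name_end)
            closer = "]"
        if pos >= len(text) or text[pos] != closer:
            raise ValueError("expected %r at offset %d" % (closer, pos))
        pos += 1
        entries.append(name if children is None else (name, children))
    return entries, pos

def parse_tree(text):
    """Parse a complete directory string, rejecting trailing garbage."""
    entries, pos = parse_entries(text)
    if pos != len(text):
        raise ValueError("unexpected character at offset %d" % pos)
    return entries
```

Once the string is a nested list, checking it against a description of the wanted structure becomes an ordinary tree comparison instead of a string-matching problem.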
 
  • #15
If you need some exotic filesystem support you can experiment with this
http://www.eclipsezone.com/eclipse/forums/t83786.rhtml

Otherwise, a simple try/catch block does the job under most of today's OSes that support Java.

Code:
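import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Properties;

File configFile = new File(System.getProperty("user.dir") + File.separator + "Config.properties");
Properties prop = new Properties();
// try-with-resources closes the stream even if load() throws
try (FileInputStream in = new FileInputStream(configFile)) {
	prop.load(in);
} catch (FileNotFoundException e) {
	System.out.println("Oops, the config file cannot be found...");
} catch (IOException e) {
	System.out.println("The config file could not be read: " + e.getMessage());
}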
 
  • #16
No, I am having trouble with the simpleparse module, which parses the directory string according to the BNF grammar defining the directory structure.
 
  • #17
tdir looks odd. Consider the following tree:

dir0/file0
dir0/dir1/file1
dir0/dir1/file2

tdir("dir0") produces:

(file0)[dir1(file1)(file2)]

is that really what you wanted? I suppose it could be. Also, if you're matching too much, then make sure you aren't allowing parentheses to be matched as text. e.g. I would use the regex

[^\(\)\[\]]+

to match individual directory and file names.
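A quick Python illustration of that name pattern, used here as part of a tokenizer for the tdir output. The names `NAME`, `TOKEN`, and `tokenize` are only for this sketch; the point is that a name token can never swallow a bracket.

```python
import re

# A name is one or more characters that are not brackets of either kind
NAME = r"[^\(\)\[\]]+"
# A token is either a single bracket character or a name
TOKEN = re.compile(r"[\(\)\[\]]|" + NAME)

def tokenize(text):
    """Split a tdir-style string into bracket tokens and name tokens."""
    return TOKEN.findall(text)
```

For example, `tokenize("(file0)[dir1(file1)(file2)]")` gives the brackets and the four names as separate items, which is the form a grammar-based parser would want as input.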
 
  • #18
Yes, that's what I intended, so it is recursive. A file is a single name in round parentheses and a directory is grouped in square brackets along with all of its files and subdirectories. Regex's can't match parentheses.
 
  • #19
0rthodontist said:
Regex's can't match parentheses.
I mean the regex

.+

will happily match the string

file1)(file2

which is a possible source of problems. (of course, I don't know exactly what problem you're having. :-p)


If you need to parse context-free grammars, have you considered using yacc, or bison?
 
  • #20
Of course regexes can match parentheses. You just need to escape them, as you must with all meaningful metacharacters, such as . * + etc.

- Warren
 
  • #21
No, what I meant was they can't match parentheses with each other. The set of all strings that are correctly parenthesized is not a regular language.

Hurkyl, I haven't figured out how to do a wildcard character in simpleparse yet. But I'll give those tools (yacc/bison) a shot when I get the time, thanks for mentioning.
 

FAQ: Are There Standard Tools for Verifying Directory Structure?

What is the purpose of verifying directory structure?

Verifying directory structure is important for ensuring that files and folders are organized properly and can be easily accessed by users. It also helps to identify any potential errors or issues that may affect the functionality of the directory.

How can I verify the directory structure?

The most common way to verify the directory structure is by manually navigating through the folders and subfolders to check for correct labeling and organization. Another method is using command line tools or file management software to view the directory structure in a more detailed and organized manner.

What are some common mistakes in directory structure?

Common mistakes in directory structure include duplicate or missing files, incorrect labeling or naming, and disorganized or cluttered folders. These mistakes can lead to difficulties in locating files or cause errors in file operations.

Why is it important to maintain a consistent directory structure?

Maintaining a consistent directory structure is crucial for efficient file management and collaboration among team members. It also ensures that files can be easily located and accessed even by new users who are not familiar with the directory.

What should I do if I encounter errors in the directory structure?

If errors are encountered in the directory structure, it is important to address them promptly to avoid any further issues. This may involve reorganizing files, renaming folders, or fixing any broken or missing links. Regularly checking and maintaining the directory structure can also help prevent future errors.
