ADCooper
Homework Statement
I'm not sure if this is quite where this belongs since it's not homework really, but I just need help programming and I figured this would be the place to ask. It's related to undergraduate research I'm doing so if that's not kosher just say so!
I want to write a Python script that takes a certain file as input, reads it into a list where each line is one element, and then writes every 567 lines (so lines 0-566, 567-1133, etc.) to new, separate files. There are well over a million lines, if I remember correctly, so doing this without a script would be extremely inefficient (unless I've overlooked some super simple way to do it in the bash shell). I only need to do this for one file, so technically I don't need an extremely general solution, just one that works with this file.
The problem is that I have very little programming experience, so a little guidance would be super helpful (even just basic programming advice is certainly welcome). I don't want someone to do it for me, because I need to be able to do this myself. I just need to know whether my ideas could work and, if I'm on completely the wrong track, what general functions I should read up on to get this accomplished.
If this is the wrong place for this I apologize!
List of Things that Must Be Done
1. Accept the name of the text file as an argument when executing the script.
2. Turn the text file into a Python list. (I'm not sure if this is even necessary. Are text files already considered a "list" type when they are opened in a Python environment? Are they already indexed by line?)
3. Output every 567 lines to separate text files. (Could technically do this some way by using a keyword that will start every separate file, but I'd assume that would require more work)
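A minimal sketch of steps 1 and 2, offered as one possible approach rather than the definitive one. It relies on two standard facts: `sys.argv` holds the command-line arguments, and an open file object can be looped over line by line but cannot be indexed, so `readlines()` is used to build an ordinary list. The helper name `read_lines` and the script name `split_file.py` are hypothetical, chosen only for illustration.

```python
import sys


def read_lines(path):
    """Return the file's lines as a list of strings (newlines kept)."""
    # An open file object can be looped over line by line, but it cannot be
    # indexed like a list; readlines() builds the indexable list.
    with open(path) as handle:
        return handle.readlines()


if __name__ == "__main__":
    # sys.argv[0] is the script name; sys.argv[1] is the first argument.
    if len(sys.argv) > 1:
        lines = read_lines(sys.argv[1])
        print(len(lines), "lines read")
    else:
        print("usage: python split_file.py INPUT_FILE")
```

Reading everything into memory at once should be fine for a file of roughly a million lines of ordinary text, though that depends on line length and available memory.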
The Attempt at a Solution
1. This one I basically understand. I simply have to import the necessary packages and set the first argument equal to the file being split.
2. I'm not really sure yet if this is even necessary because I don't know if the file, after being called on the command line, is already indexed by line. Any clues on this?
3. Perhaps a for loop (something like for line [i,j] in file) with i = 0 and j = 566. Within the loop, i and j would be the indices of the first and last lines to write to an output file named something like FILE#. With every iteration of the loop, the # in the file name would increase by 1 and i and j would increase by 567. Am I on the right track? Or should I abandon ship on this approach and start from scratch?
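The loop idea in item 3 can be sketched roughly as follows, assuming the lines are already in a list. Instead of tracking i and j by hand, this version lets `range` step the start index by 567 and uses a slice, whose end index is exclusive (so `lines[0:567]` covers lines 0 through 566). The names `CHUNK_SIZE`, `split_lines`, and `write_chunks` are hypothetical, and the `FILE1`, `FILE2`, ... naming follows the FILE# idea above.

```python
import sys

CHUNK_SIZE = 567  # lines per output file, from the problem statement


def split_lines(lines, chunk_size=CHUNK_SIZE):
    """Return consecutive slices of at most chunk_size lines each."""
    chunks = []
    # range(start, stop, step) yields 0, 567, 1134, ...
    for start in range(0, len(lines), chunk_size):
        # A slice excludes its end index, so lines[0:567] holds lines 0..566.
        chunks.append(lines[start:start + chunk_size])
    return chunks


def write_chunks(chunks, prefix="FILE"):
    """Write each chunk to prefix1, prefix2, ... and return the names used."""
    names = []
    for number, chunk in enumerate(chunks, start=1):
        name = f"{prefix}{number}"
        with open(name, "w") as out:
            out.writelines(chunk)
        names.append(name)
    return names


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        write_chunks(split_lines(handle.readlines()))
```

If the total number of lines is not a multiple of 567, the last file simply holds the leftover lines, because slicing past the end of a list does not raise an error.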