* re: in place file splitter
From: Tyler Spivey
To: speakup
well, normally the split command does something like:
1. open the file for reading.
2. take one chunk, open a new output file, place it there and close it.
3. repeat until split.
this keeps the original file, and on a space limited system, e.g. a quota,
you're out of luck.
in place does:
1. open the file for reading.
2. read all the chunks into some kind of list.
3. wait for the user to press enter, so they can suspend the program and
remove the file.
4. write the output files.
if you remove the original file, the split files take up almost the same
space so quotas don't get in the way.
my program is nowhere near complete, just a skeleton though.
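
The four "in place" steps above can be sketched roughly as follows. This is only an illustration, not the skeleton program mentioned in the message: the function names, the `pause` parameter, and the `.000`-style output names are all assumptions made for the example.

```python
def read_chunks(path, chunk_size):
    """Steps 1 and 2: read the whole file into a list of byte chunks."""
    chunks = []
    with open(path, "rb") as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    return chunks


def split_in_memory(path, chunk_size, pause=input):
    """Hold every chunk in RAM, pause, then write the output files.

    `pause` is a hypothetical hook: by default it waits for enter, which
    gives the user time to suspend the program and remove the original.
    """
    chunks = read_chunks(path, chunk_size)
    # Step 3: the original can be removed while we wait here.
    pause("Chunks are in memory; remove the original, then press enter: ")
    # Step 4: write each chunk to its own numbered output file.
    names = []
    for index, chunk in enumerate(chunks):
        name = f"{path}.{index:03d}"
        with open(name, "wb") as out:
            out.write(chunk)
        names.append(name)
    return names
```

Calling `split_in_memory("big.tar", 1024 * 1024)` would wait at the prompt before writing anything. Note that the whole file sits in memory during the pause, so nothing can be recovered if the program dies after the original is removed.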
* re: in place file splitter
From: Charles Hallenbeck
To: speakup
Removing the input file before the output files are written is
what we used to call a "bridge burn".
When you make a list of the chunks of the input file, where are
they held before writing them? Does this mean you have to have a
ram total that is at least the size of the file?
Just curious.
On Fri, 8 Nov 2002, Tyler Spivey wrote:
> well, normally the split command does something like:
> 1. open the file for reading.
> 2. take one chunk, open a new output file, place it there and close it.
> 3. repeat until split.
> this keeps the original file, and on a space limited system, e.g. a quota,
> you're out of luck.
> in place does:
> 1. open the file for reading.
> 2. read all the chunks into some kind of list.
> 3. wait for the user to press enter, so they can suspend the program and
> remove the file.
> 4. write the output files.
> if you remove the original file, the split files take up almost the same
> space so quotas don't get in the way.
> my program is nowhere near complete, just a skeleton though.
>
>
>
> _______________________________________________
> Speakup mailing list
> Speakup@braille.uwo.ca
> http://speech.braille.uwo.ca/mailman/listinfo/speakup
>
--
The Moon is Waxing Crescent (22% of Full)
So visit me at http://www.valstar.net/~hallenbeck
* re: in place file splitter
From: Igor Gueths
To: speakup
Hi Chuck. I think you're probably right, as the file contents will have to
be stored in RAM until written to outfiles.
May you code in the power of the source,
may the kernel, libraries, and utilities be with you,
throughout all distributions until the end of the epoch.
On Fri, 8 Nov 2002, Charles Hallenbeck wrote:
> Removing the input file before the output files are written is
> what we used to call a "bridge burn".
>
> When you make a list of the chunks of the input file, where are
> they held before writing them? Does this mean you have to have a
> ram total that is at least the size of the file?
>
> Just curious.
* Re: in place file splitter
From: Ralph W. Reid
To: speakup
Igor Gueths staggered into view and mumbled:
>
>Hi Chuck. I think you're probably right, as the file contents will have to
>be stored in RAM until written to outfiles.
I am about to take this a bit off topic for speakup, so if you are
not interested in a programming technique, you might want to delete
this article now--sorry if I stepped on anyone's toes with this
discussion.
Actually, a technique can be used to read chunks of the input file,
truncating it as you go. This technique will require more disk I/O,
but will not require storing massive files in memory. This technique
is not as necessary nowadays as it used to be given the low cost and
massive size of RAM available, but it might be of some use somewhere.
Here is an algorithm which describes the basics of how this technique
works:
Open the input file.
Open an output file.
Read a chunk of the input file.
while the end of the file has not been reached, do:
    Write the chunk to the output file.
    Close the output file.
    Move all of the remaining input file data to the beginning of the
    input file.
    Get the current position in the input file.
    Close the input file.
    Truncate the input file at the current position.
    Open the input file.
    Open an output file.
    Read a chunk of data from the input file.
End of while loop.
If a chunk of data has been read which has not been written do:
    Write the chunk of data to the output file.
    Close the output file.
Else
    Close the empty output file.
    Delete the empty output file.
End of if-else statement.
Close the input file.
Delete the remainder of the input file.
See the man page for the C function `truncate'. Once working
properly, this technique will chop the input file down in chunks equal
to the amount of data written to the output files. Because the input
file overwrites itself over and over again in ever shrinking amounts,
lots of disk I/O will be necessary, especially for large files which
are to be split into many smaller ones. All of this disk I/O will of
course require much more time than loading the entire input file into
memory and writing the output files from there, but any size input
files can be handled this way even if memory size is limited. You
may or may not find this technique useful. I am not too sure what
this technique has to do with speakup though ;) ;).
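
As a rough illustration of the truncate-as-you-go technique described above, here is a minimal Python sketch. It is not a transcription of the algorithm: it keeps the input file open and calls the file object's `truncate()` method instead of closing the file and calling the C `truncate` function, and the function name, the `buffer_size` parameter, and the `.000`-style output names are assumptions made for the example.

```python
import os


def split_truncating(path, chunk_size, buffer_size=64 * 1024):
    """Split `path` into numbered pieces, shrinking the input as it goes.

    Only one chunk plus one copy buffer is held in memory at a time.
    """
    names = []
    index = 0
    with open(path, "r+b") as src:
        while True:
            src.seek(0)
            chunk = src.read(chunk_size)
            if not chunk:
                break  # the input file has been emptied
            name = f"{path}.{index:03d}"
            with open(name, "wb") as out:
                out.write(chunk)
            names.append(name)
            index += 1
            # Move the remaining data to the beginning of the input file.
            read_pos = len(chunk)
            write_pos = 0
            while True:
                src.seek(read_pos)
                block = src.read(buffer_size)
                if not block:
                    break
                src.seek(write_pos)
                src.write(block)
                read_pos += len(block)
                write_pos += len(block)
            # Cut off the stale tail left behind by the move.
            src.truncate(write_pos)
    os.remove(path)  # delete the now-empty input file
    return names
```

Each output file is written before the input is shortened, so the extra disk space needed at any moment is about one chunk. The price is the repeated copying of the remaining data after every chunk.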
Have a _great_ day!
--
Ralph. N6BNO. Wisdom comes from central processing, not from I/O.
rreid@sunset.net http://personalweb.sunset.net/~rreid
Opinions herein are either mine or they are flame bait.
SEC (x) / COSEC (x) = (TAN (x) / COTAN (x)) ^ 2
* Re: in place file splitter
From: Igor Gueths
To: speakup
Hi Ralph. Very interesting technique. Two things. First, don't you have to
specify the chunk size which will be read? Also, if you have a very large
file (1 MB), and the chunk size is approximately 5 KB (roughly 5000 bytes),
this could take a while.
May you code in the power of the source,
may the kernel, libraries, and utilities be with you,
throughout all distributions until the end of the epoch.
On Mon, 11 Nov 2002, Ralph W. Reid wrote:
> Actually, a technique can be used to read chunks of the input file,
> truncating it as you go. This technique will require more disk I/O,
> but will not require storing massive files in memory.
> [...]
* Re: in place file splitter
From: Ralph W. Reid
To: speakup
Igor Gueths staggered into view and mumbled:
>
>Hi Ralph. Very interesting technique. Two things. First, don't you have to
>specify the chunk size which will be read? Also, if you have a very large
>file (1 MB), and the chunk size is approximately 5 KB (roughly 5000 bytes),
>this could take a while.
The data chunk size will have to be defined somehow in any splitter
program. This technique allows the chunk size to be specified on the
command line or elsewhere, hard coded in the program, or a
combination which sets up a default which can be modified by the user
as required. As for the execution speed, so much disk I/O is
required that the program essentially operates at disk drive speed
instead of CPU or RAM speed. The more chunks a file is split into,
the longer the process will take--large files split into many small
chunks could require an appreciable amount of time. The main
advantage to this technique is that it will minimize disk and RAM
space usage during its operation. Other techniques will certainly
operate faster, but may have their own disadvantages. The best
anyone can do is to make a 'best guess' as to what will work best for
the given situation.
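
To put a rough number on "an appreciable amount of time", the sketch below counts how many bytes get moved toward the front of the input file when the remaining data is shifted after every chunk. The function name is made up for the example, and the model assumes exactly one full shift per chunk and ignores the reads and the output writes.

```python
def shifted_bytes(file_size, chunk_size):
    """Return (output file count, total bytes moved forward in the input)."""
    chunks = 0
    moved = 0
    remaining = file_size
    while remaining > 0:
        remaining -= min(chunk_size, remaining)
        chunks += 1
        moved += remaining  # everything still in the file is copied once
    return chunks, moved


# The 1 MB file and 5000-byte chunks from the question, taking 1 MB as
# 1,048,576 bytes (an assumption):
print(shifted_bytes(1048576, 5000))
```

Under those assumptions the result is 210 output files and about 109 MB of data moved, roughly a hundred times the size of the file. The cost grows with the square of the number of chunks, so doubling the chunk size cuts the copying roughly in half.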
Have a _great_ day!
--
Ralph. N6BNO. Wisdom comes from central processing, not from I/O.
rreid@sunset.net http://personalweb.sunset.net/~rreid
Opinions herein are either mine or they are flame bait.
1 = x^0
* Re: in place file splitter
From: Janina Sajka
To: speakup
Thanks, Tyler. I understand you now, and I agree this makes sense for the circumstance you describe.
Tyler Spivey writes:
> From: "Tyler Spivey" <tyler@blindcity.com>
>
> well, normally the split command does something like:
> 1. open the file for reading.
> 2. take one chunk, open a new output file, place it there and close it.
> 3. repeat until split.
> this keeps the original file, and on a space limited system, e.g. a quota,
> you're out of luck.
> in place does:
> 1. open the file for reading.
> 2. read all the chunks into some kind of list.
> 3. wait for the user to press enter, so they can suspend the program and
> remove the file.
> 4. write the output files.
> if you remove the original file, the split files take up almost the same
> space so quotas don't get in the way.
> my program is nowhere near complete, just a skeleton though.
--
Janina Sajka, Director
Technology Research and Development
Governmental Relations Group
American Foundation for the Blind (AFB)
Email: janina@afb.net Phone: (202) 408-8175
* re: in place file splitter
From: Geoff Shang
To: speakup
Hi:
I appreciate the motivation for writing this tool, but there's no going
back if you don't like the results. The way I used to get around the quota
problem when I had one was to use /tmp for temporary storage. You couldn't
leave things there as they'd have a habit of disappearing, but it proved
very useful when I wanted to download things larger than my quota. Suck it
down to /tmp and download it from there.
Geoff.
--
Geoff Shang <gshang@uq.net.au>
ICQ number 43634701
Make sure your E-mail can be read by everyone!
http://www.betips.net/etc/evilmail.html
Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html
* re: in place file splitter
From: Igor Gueths
To: speakup
Hi Geoff. This is true, especially with /var/tmp. /var/tmp is cleaned as
part of the shutdown sequence. So at that time, /tmp wasn't subject to
your quota?
May you code in the power of the source,
may the kernel, libraries, and utilities be with you,
throughout all distributions until the end of the epoch.
On Sat, 9 Nov 2002, Geoff Shang wrote:
> Hi:
>
> I appreciate the motivation for writing this tool, but there's no going
> back if you don't like the results. The way I used to get around the quota
> problem when I had one was to use /tmp for temporary storage. You couldn't
> leave things there as they'd have a habit of disappearing, but it proved
> very useful when I wanted to download things larger than my quota. Suck it
> down to /tmp and download it from there.
>
> Geoff.
* re: in place file splitter
From: Tyler Spivey
To: speakup
well, i guess you'd have to have enough memory to hold the whole file.
that doesn't matter though, since the worst that can happen is that the
kernel will kill any old process when the ram and swap fill up.
i've had no swap, and my ram got filled and it killed anything - it picks a
number at random and kills it.
not very useful, just extremely annoying - it should kill the process that
is eating up all the ram.
but now that we can add huge amounts of swap space, this isn't a big issue
anymore.
just dd to a file with input from /dev/zero, run mkswap and then swapon on
it, and if the file gets filled, reboot so the swap space isn't in use,
and trash it.
* re: in place file splitter
From: Adam Myrow
To: speakup
Killing processes when memory runs out usually isn't a big deal, and it's
better than a hang or crash. However, try running parted with 32MB of RAM
from a boot disk some time and you'll learn real quick how to restore from
backup. I did that and since the boot floppy uses a RAM disk and there
was no swap, I ran out of memory in the middle of resizing a partition.
Well, guess what happens when parted gets killed in the middle of
something like resizing a partition? I had scrambled eggs for a
filesystem and it didn't just affect that partition. It somehow affected
my Slackware partition as well, requiring me to restore two partitions from
backups. So, if you have less than 128MB of memory and plan on doing
anything disk intensive, better find a free partition for swap space or
disaster could strike.
Thread overview: 12+ messages
in place file splitter Tyler Spivey
` Charles Hallenbeck
` Igor Gueths
` Ralph W. Reid
` Igor Gueths
` Ralph W. Reid
` Janina Sajka
` Geoff Shang
` Igor Gueths
` Geoff Shang
Tyler Spivey
` Adam Myrow