[Wlug] large file performance on linux
brad noyes
maitre at ccs.neu.edu
Wed May 16 12:02:33 EDT 2007
On Wed, May 16, 2007 at 11:44:13AM -0400, Jamie Guinan wrote:
> On Wed, 16 May 2007, brad noyes wrote:
> Hi,
>
> Do you mean MB/s as in MegaBytes/second? Lower-case "b" implies
> "bits" to me, but I suspect you meant bytes. :)
>
Yes, your assumption is correct. I never quite understood the capitalization
conventions when it came to computers.
> Anyway, if you know your total data set is going to exceed your system
> memory, which will largely get used as page cache, you might as well
> open the output with O_SYNC and write straight out. Then the big
> buffer in the writer thread can go away, and you can just queue chunks
> between the input and writer.
>
> A good experiment would be to run the writer thread with dummy input
> data and see what kind of throughput you get.
>
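A rough sketch of that experiment, in Python for brevity: write dummy chunks to a file opened with O_SYNC and time it. The function name, chunk size, and chunk count are made up for illustration, and the number it reports also includes Python overhead, so treat it as a ballpark figure only.

```python
import os
import time


def measure_write_throughput(path, chunk_size=1 << 20, n_chunks=64, sync=True):
    """Write n_chunks of dummy data to path and return throughput in MB/s."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync:
        # O_SYNC is not defined on every platform; fall back to 0 if missing.
        flags |= getattr(os, "O_SYNC", 0)
    # Incompressible dummy data, generated once and reused for every write.
    chunk = os.urandom(chunk_size)
    fd = os.open(path, flags, 0o644)
    start = time.perf_counter()
    try:
        for _ in range(n_chunks):
            view = memoryview(chunk)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(fd, view)
                view = view[written:]
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return (chunk_size * n_chunks) / (1024 * 1024) / elapsed
```

Calling it with sync=True and then sync=False on the target filesystem, with a total size well beyond system memory, should give some idea of how much the page cache is helping.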
> You could try ext2 (no journalling), or tweaking some of the
> journalling options in ext3 (data=journal/ordered/writeback, see "man
> mount").
>
Great idea. I forgot about tune2fs and mount options. Thanks.
> If you have enough CPU bandwidth, and your input stream has enough
> redundancy, you could gzip it before writing, which might reduce your
> output bandwidth requirements.
>
There isn't enough redundancy in the data. It's essentially random data, so it
won't compress well at all.
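A quick way to check this on a sample of the stream is to compare compressed and original sizes. The sketch below uses zlib (the same DEFLATE algorithm gzip uses) and stands in os.urandom for the real input, which is an assumption; the helper name is made up.

```python
import os
import zlib


def compression_ratio(data, level=6):
    """Return compressed size divided by original size (1.0 means no gain)."""
    return len(zlib.compress(data, level)) / len(data)


# Stand-in samples: random bytes versus highly redundant bytes.
random_sample = os.urandom(1 << 20)
redundant_sample = b"\x00" * (1 << 20)

print("random:    %.3f" % compression_ratio(random_sample))
print("redundant: %.3f" % compression_ratio(redundant_sample))
```

A ratio at or slightly above 1.0 for the real data would confirm that compressing before writing only costs CPU time.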
> Hope this helps, keep us posted.
>
Thanks for the input.
-- brad