[Wlug] large file performance on linux
Jamie Guinan
guinan at bluebutton.com
Wed May 16 11:44:13 EDT 2007
On Wed, 16 May 2007, brad noyes wrote:
> Hello All,
>
> I am seeing some really slow performance regarding large files on linux. I
> write a lot of data points from a light sensor. The stream is about 53 Mb/s and
> I need to keep this rate for 7 minutes, that's a total of about 22Gb. I
> can sustain 53Mb/s pretty well until the file grows to over 1Gb or so, then
> things hit the wall and the writes to the filesystem can't keep up. The writes
> go from 20ms in duration to 500ms. I assume the filesystem/operating system
> is caching writes. Do you have any suggestions on how to speed up performance
> on these writes, filesystem options, kernel options, other strategies, etc?
>
> Things I have tried:
> - I have tried this on an ext3 file system as well as an xfs filesystem
> with the same result.
>
> - I have also tried spooling over several files (a la multiple volumes)
> but I see no difference in performance. In fact, I think this actually
> hinders performance a bit.
>
> - I keep my own giant memory buffer where all the data is stored and
> then it is written to disk in a background thread. This helps, but
> I run out of space in the buffer before I finish taking data.
Hi,
Do you mean MB/s as in megabytes/second? Lower-case "b" implies
"bits" to me, but I suspect you meant bytes, since 53 MB/s for
7 minutes (420 seconds) comes to about 22 GB. :)
Anyway, if you know your total data set is going to exceed your system
memory, which will largely get used as page cache, caching only
postpones the disk writes: once the cache fills, each write has to wait
for writeback anyway. So you might as well open the output with O_SYNC
and write straight out. Then the big buffer in the writer thread can go
away, and you can just queue chunks between the input thread and the
writer thread.
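
The O_SYNC-plus-queue arrangement could look roughly like the sketch below. It is in Python for brevity and assumes a Linux/Unix system (os.O_SYNC is not available everywhere). The function names, the queue depth of 64 chunks, and the use of None as an end-of-stream marker are illustrative choices, not anything from the original setup.

```python
import os
import queue
import threading

def write_all(fd, data):
    """Write every byte of data to fd, retrying on short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]

def writer_loop(path, chunks):
    """Drain chunks from the queue into path until a None sentinel arrives."""
    # With O_SYNC, each write() returns only after the data has been
    # pushed out to the storage device rather than left in the page cache.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        while True:
            chunk = chunks.get()
            if chunk is None:  # hypothetical end-of-stream marker
                break
            write_all(fd, chunk)
    finally:
        os.close(fd)

def start_writer(path, max_queued_chunks=64):
    """Start the writer thread and return (queue, thread)."""
    # Bounded queue: put() blocks when the writer falls behind, so memory
    # use stays capped at max_queued_chunks chunks.
    chunks = queue.Queue(maxsize=max_queued_chunks)
    thread = threading.Thread(target=writer_loop, args=(path, chunks))
    thread.start()
    return chunks, thread
```

The input side would call chunks.put(data) for each block read from the sensor, then chunks.put(None) and thread.join() at the end of the run. If put() starts blocking, the disk is not keeping up with the input rate.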
A good experiment would be to run the writer thread with dummy input
data and see what kind of throughput the disk path sustains on its
own, without the sensor in the loop.
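
One way to run that experiment is sketched below, again in Python and assuming Linux/Unix. The function name, the 1 MiB chunk size, and the extra_flags parameter are made up for illustration. Zero-filled chunks stand in for the sensor data.

```python
import os
import time

def measure_write_throughput(path, chunk_size=1 << 20, n_chunks=256,
                             extra_flags=0):
    """Write n_chunks dummy chunks to path and return throughput in MB/s."""
    chunk = b"\0" * chunk_size  # dummy data standing in for the sensor stream
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
    fd = os.open(path, flags, 0o644)
    try:
        start = time.monotonic()
        for _ in range(n_chunks):
            view = memoryview(chunk)
            while len(view) > 0:  # retry on short writes
                view = view[os.write(fd, view):]
        os.fsync(fd)  # count the time needed to flush cached data to disk
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    # Guard against a zero interval on very small test runs.
    return chunk_size * n_chunks / 1e6 / max(elapsed, 1e-9)
```

Passing extra_flags=os.O_SYNC allows a direct comparison between cached and synchronous writes. To reproduce the slowdown described in the original message, the total written (chunk_size * n_chunks) would need to exceed system memory; the small defaults here only check that the code runs.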
You could try ext2 (no journalling), or tweaking some of the
journalling options in ext3 (data=journal/ordered/writeback, see "man
mount").
If you have enough CPU bandwidth, and your input stream has enough
redundancy, you could gzip it before writing, which might reduce your
output bandwidth requirements.
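
A rough sketch of the compression idea, using Python's zlib module. The function names and the choice of compression level 1 (fastest) are assumptions for illustration. Whether it pays off depends on how redundant the sensor data really is, so the second helper estimates that from a sample first.

```python
import zlib

def compressed_chunks(chunks, level=1):
    """Yield gzip-format compressed output for an iterable of byte chunks."""
    # wbits=31 selects the gzip container, so the result is readable by gunzip.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer and return nothing for a chunk
            yield out
    yield compressor.flush()  # emit whatever is still buffered, plus trailer

def compression_ratio(sample, level=1):
    """Compressed size divided by original size for a non-empty sample."""
    return len(zlib.compress(sample, level)) / len(sample)
```

A ratio close to 1.0 on a representative sample means compression would cost CPU time without reducing the write rate much. A ratio well below 1.0 suggests the output bandwidth requirement would drop by about that factor.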
Hope this helps, keep us posted.
-Jamie
>
> Thanks,
> -- Brad
>