[Wlug] large file performance on linux
brad noyes
maitre at ccs.neu.edu
Wed May 16 12:26:53 EDT 2007
On Wed, May 16, 2007 at 11:38:09AM -0400, Jeff Moyer wrote:
> ==> On Wed, 16 May 2007 14:53:16 +0000, brad noyes <maitre at ccs.neu.edu> said:
>
> brad> Hello All,
> brad> I am seeing some really slow performance regarding large files on linux. I
> brad> write a lot of data points from a light sensor. The stream is about 53 Mb/s and
> brad> i need to keep this rate for 7 minutes, that's a total of about 22Gb. I
> brad> can sustain 53Mb/s pretty well until the file grows to over 1Gb or so, then
> brad> things hit the wall and the writes to the filesystem can't keep up. The writes
> brad> go from 20ms in duration to 500ms. I assume the filesystem/operating system
> brad> is caching writes. Do you have any suggestions on how to speed up performance
> brad> on these writes, filesystem options, kernel options, other strategies, etc?
>
> Of course. Your data set is larger than the page cache, so when you
> hit the low watermark, it starts write-back. You can deal with this a
> few different ways, and I'll throw out the easiest ways first:
> 1) Get more memory
> 2) Get a faster disk
>
Ha :). I have 12GB of memory, which actually brings me to another question.
How do I alter the per-process memory limit? I can only allocate a memory
buffer that is 3GB. I'd like to make use of the other 8GB left in the machine.
If I can double my buffer size I think I could sustain the 53MB/s for the 7
minutes that I need.
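
A rough check of the buffer idea, as a sketch only. The 53 MB/s rate and the 7 minute run come from this thread; the function names are made up for illustration, and the disk's sustained write rate is not known here, so the code solves for the rate the disk would need rather than assuming one. The buffer fills at (stream rate - disk rate), so it survives the run only if that difference times 420 seconds fits in the buffer.

```python
STREAM_MB_S = 53.0      # sensor stream rate quoted in the thread
DURATION_S = 7 * 60     # 7 minute acquisition


def total_mb():
    """Total data produced over the whole run, in MB."""
    return STREAM_MB_S * DURATION_S


def min_disk_rate(buffer_mb):
    """Lowest sustained disk rate (MB/s) at which a buffer of
    buffer_mb megabytes does not overflow before the run ends."""
    return max(0.0, STREAM_MB_S - buffer_mb / DURATION_S)


if __name__ == "__main__":
    print(f"total data: {total_mb():.0f} MB")
    for gb in (3, 6):
        rate = min_disk_rate(gb * 1024)
        print(f"{gb} GB buffer needs the disk to sustain {rate:.1f} MB/s")
```

Under these numbers, going from a 3GB to a 6GB buffer lowers the required sustained disk rate from about 45.7 MB/s to about 38.4 MB/s, so doubling the buffer is enough only if the disk can hold roughly that rate once write-back starts.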
> If those are not options, then you can tweak your application by using
> AIO and O_DIRECT. This will allow you to drive your disk queue depths
> a bit further and avoid the page cache. Check the man pages for
> io_setup, io_submit, and io_getevents to get started.
>
I'll check out these options and man pages.
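
For reference, a minimal sketch of the O_DIRECT half of that suggestion. It is synchronous and does not use AIO at all (io_setup, io_submit and io_getevents have no wrapper in the Python standard library), so it only shows how to bypass the page cache with aligned writes. The names CHUNK, _write_chunks and write_stream are hypothetical. O_DIRECT needs an aligned buffer and transfer size, which is why the buffer comes from an anonymous mmap (page-aligned) and each write is a whole 1 MiB. Some filesystems reject O_DIRECT with EINVAL, so the sketch falls back to ordinary buffered writes in that case; that fallback is an assumption made for the example, not something suggested in the thread.

```python
import errno
import mmap
import os

CHUNK = 1 << 20  # 1 MiB per write, a multiple of any common block size


def _write_chunks(path, n_chunks, extra_flags):
    """Write n_chunks aligned chunks to path using the given open flags."""
    buf = mmap.mmap(-1, CHUNK)      # anonymous mapping, page-aligned
    buf.write(b"\xab" * CHUNK)      # dummy payload standing in for sensor data
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
    fd = os.open(path, flags, 0o644)
    try:
        for _ in range(n_chunks):
            if os.write(fd, buf) != CHUNK:
                raise OSError(errno.EIO, "short write")
    finally:
        os.close(fd)
        buf.close()


def write_stream(path, n_chunks):
    """Try O_DIRECT first; return True if it was used, False on fallback."""
    direct = getattr(os, "O_DIRECT", 0)   # 0 on platforms without O_DIRECT
    if direct:
        try:
            _write_chunks(path, n_chunks, direct)
            return True
        except OSError as exc:
            if exc.errno != errno.EINVAL:
                raise
    _write_chunks(path, n_chunks, 0)
    return False
```

Calling write_stream("data.01", 1024) would write 1GB in 1 MiB chunks. Whether this actually beats buffered writes at 53MB/s would have to be measured on the real disk.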
> brad> Things I have tried:
> brad> - I have tried this on an ext3 file system as well as an xfs filesystem
> brad> with the same result.
>
> You may not want to use a journalled file system. If you must,
> though, with ext3 you could try running with the data=writeback
> option.
>
Yup. I'll check this option out.
> brad> - I have also tried spooling over several files (a la multiple volumes)
> brad> but i see no difference in performance. In fact, i think this actually
> brad> hinders performance a bit.
>
> I'm not sure I fully understand what you mean. Are you saying you
> write to separate physical volumes,
>
Not physical volumes, but different files. By the end of the data
acquisition I will end up with the files data.01, data.02, data.03, etc.
Each file is 1GB in size, or whatever I set the limit to be. I did this
because I thought that as a file grows larger, there are several layers of
indirection in the inode to get to the actual data blocks on disk, and that
this might hinder performance.
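
To put numbers on the indirection idea, here is a small sketch for the classic ext2/ext3 block map: 12 direct block pointers in the inode, then single, double and triple indirect blocks, each holding 4-byte block pointers. The 4 KiB block size is an assumption (it is a common default, but the actual filesystem here may differ), and the function name is made up. This does not apply to xfs, which maps files with extents.

```python
BLOCK = 4096            # assumed filesystem block size in bytes
PTRS = BLOCK // 4       # 4-byte block pointers per indirect block
DIRECT = 12             # direct block pointers in an ext2/ext3 inode


def indirection_level(file_size):
    """Deepest level of indirection needed to map the last block of a file:
    0 = direct, 1 = single, 2 = double, 3 = triple indirect."""
    blocks = -(-file_size // BLOCK)     # ceiling division
    limit = DIRECT
    for level in range(4):
        if blocks <= limit:
            return level
        limit += PTRS ** (level + 1)    # blocks reachable at the next level
    raise ValueError("file too large for this block map")


if __name__ == "__main__":
    for gib in (1, 22):
        print(f"{gib} GiB file: level {indirection_level(gib << 30)}")
```

With 4 KiB blocks, single indirection covers about 4 MiB and double indirection about 4 GiB, so a 1GB file already sits in the double-indirect range and a single 22GB file would reach triple indirect. Splitting into 1GB files saves at most one level, which fits with seeing no measurable difference.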
> and that you don't see any performance increase from doing so?
>
Correct. I don't see any improvement, or at least no measurable
improvement at the kind of rates I'm dealing with.
> brad> - I keep my own giant memory buffer where all the data is stored and
> brad> then it is written to disk in a background thread. This helps, but
> brad> i run out of space in the buffer before i finish taking data.
>
> Right, this is exactly what happens in the OS. ;) Speaking of which,
> you don't mention which kernel you are using. Could you please
> provide that information? There are a few vm tunables that you could
> try tweaking, but I really don't think they will help if your data set
> is larger than memory. We can explore that option, though, if you
> like.
>
I'm using the 2.6.20 kernel from the Ubuntu source tree. I recompiled it to get
the large memory support, up to 64GB.
I was looking for some tunable vm options in sysctl, but I didn't see much that
made sense to me. If nothing else helps, perhaps I will ask about the vm
options.
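
As a starting point for that, a sketch that just reads the write-back related vm tunables from /proc/sys/vm. Which tunables were meant in the thread is not stated; the four below are picked as an assumption because they control when dirty pages get written back. dirty_background_ratio is the percentage of memory at which background write-back starts, dirty_ratio is the percentage at which the writing process itself is made to write back, dirty_expire_centisecs is how old dirty data must be before it is eligible for write-back, and dirty_writeback_centisecs is how often the flusher wakes up. The function and constant names are made up.

```python
import os

VM_DIR = "/proc/sys/vm"
DIRTY_KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_dirty_knobs(vm_dir=VM_DIR):
    """Return {name: int value} for each knob, or None where unreadable."""
    values = {}
    for name in DIRTY_KNOBS:
        try:
            with open(os.path.join(vm_dir, name)) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            values[name] = None     # missing on this kernel or not readable
    return values


if __name__ == "__main__":
    for name, value in read_dirty_knobs().items():
        print(f"vm.{name} = {value}")
```

The same values show up under sysctl as vm.dirty_ratio and so on. This only reads them; changing them needs root, and as noted above they may not help much when the data set is larger than memory.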
>
> p.s. In your head, is Mb Megabit or Megabyte?
>
The latter. Jamie already pointed this typo out to me :). Perhaps this time
around my unit abbreviations are correct.
Thanks for your input. I'll keep the list posted.
-- Brad