[Wlug] NFS Trouble

John Stoffel stoffel at lucent.com
Thu Sep 9 15:39:21 EDT 2004


Chuck> The current line from the fstab is as follows:
Chuck> hostname:/home  /home  nfs  rsize=4096,wsize=4096,bg,intr,rw,actimeo=15  0 0

Here's one of the problems: you're not specifying TCP here, and you're
using pretty small read and write sizes.  Try adding the tcp option and
raising rsize and wsize to 32768.
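
As a sketch only, the fstab entry with those changes might look like
the line below.  The hostname and mount point are copied from the
quoted entry; whether 32768 is actually honored depends on the client
kernel and the server, so treat the values as a starting point.

```text
hostname:/home  /home  nfs  rsize=32768,wsize=32768,tcp,bg,intr,rw,actimeo=15  0 0
```

After remounting, 'grep /home /proc/mounts' on the client should show
which rsize and wsize values were really negotiated.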

Also, have you checked the duplex on the server to make sure it's OK?
You mention you're using RAID on the server; I assume RAID5?  If so,
what stripe size are you using?

What does 'vmstat 1' (on the server) say from about 15 seconds before
the copy kicks in, to about 15 seconds after the copy is done?

Does the time scale up linearly when you copy a 100 MB file?  Or drop
by half when you copy a 25 MB file?
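
One rough way to collect those numbers is a small shell helper like the
one below.  This is only a sketch: time_write and the test file name
are made up for illustration, it writes zeros with dd rather than
copying a real file, and it only measures whole seconds.

```shell
# time_write SIZE_MB DIR: write SIZE_MB megabytes of zeros into DIR,
# print "SIZE_MB elapsed_seconds", then remove the test file.
# Hypothetical helper; the file name nfs_write_test.$$ is arbitrary.
time_write() {
    size_mb=$1
    dir=$2
    file="$dir/nfs_write_test.$$"
    start=$(date +%s)
    # bs=1024k is 1 MB per block; count is the number of blocks written.
    dd if=/dev/zero of="$file" bs=1024k count="$size_mb" 2>/dev/null || return 1
    sync    # flush cached data so the time includes the actual write-out
    end=$(date +%s)
    rm -f "$file"
    echo "$size_mb $((end - start))"
}
```

Run from a client against the NFS mount, for example
'for s in 25 50 100; do time_write $s /home/yourname; done' (the
directory is a placeholder), it prints one "size seconds" pair per
line, so it is easy to see whether the time doubles with the size.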

What filesystem are you using on the NFS server, and do you have quotas
or anything else like that set up?  How full is the filesystem?

What happens if you write a pair of 50 MB files at the same time from
two different clients?  Does the time double?  Does the load double?


Basically, I don't know what's going on here, but I suspect:

1. A network speed/duplex mismatch, so that you're seeing lots of
   timeouts and retries on writes, but not reads.

2. NFS needs tuning on the clients to write in bigger chunks.

3. Your RAID stinks.  Which reminds me, how long does a write from a
   non-RAID disk on the server to the home directory/RAID disk take?
   Can you time writing a 50 MB file that way?  If it's still ugly,
   then you've narrowed the issue down to the server's disks.
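
For suspicion 1, the client's RPC retransmission count is a quick
check.  A rough sketch, assuming the Linux 'nfsstat -rc' layout of a
header line starting with "calls" followed by a line of counters;
retrans_pct is a made-up helper name:

```shell
# retrans_pct: read "nfsstat -rc" output on stdin and print the share of
# client RPC calls that were retransmitted, as a percentage.
# Assumes a header line starting with "calls" followed by a line of
# counters in the same column order (calls, retrans, ...).
retrans_pct() {
    awk '$1 == "calls" {
        # Read the counter line; skip the division if there were no calls.
        if ((getline) > 0 && $1 > 0)
            printf "%.2f\n", 100 * $2 / $1
        exit
    }'
}
```

On a client, run 'nfsstat -rc | retrans_pct' before and after a slow
write.  A figure that jumps during the write would point at the network
rather than the disks.  On the server, 'ethtool eth0' (interface name
assumed) reports the negotiated speed and duplex.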


Basically, there are a lot of potential problems here and we need more
details.

John
   John Stoffel - Senior Unix Systems Administrator - Lucent Technologies
	 stoffel at lucent.com - http://www.lucent.com - 978-952-7548

