[Bug 1473948] Re: After most recent upgrade to 3.2.0-87-generic, nfs server process has extremely high I/O to /var/lib/nfs/v4recovery

Mon Jul 13 17:59:08 UTC 2015

Because our applications continued to have severe problems, I downgraded the nfs-common and nfs-kernel-server packages (both built from nfs-utils) to version 1:1.2.5-3ubuntu3 using "apt-get install nfs-common=1:1.2.5-3ubuntu3" and "apt-get install nfs-kernel-server=1:1.2.5-3ubuntu3".
With this older version, the I/O problems are gone.

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nfs-utils in Ubuntu.
https://bugs.launchpad.net/bugs/1473948

Title:
  After most recent upgrade to 3.2.0-87-generic, nfs server process has
  extremely high I/O to /var/lib/nfs/v4recovery

Status in nfs-utils package in Ubuntu:
  New

Bug description:
  We upgraded our 12.04 LTS on 9 July at around 18:30.  Immediately
  after the reboot, the I/O on the / partition (sda) was extremely high.
  This was causing sluggish responsiveness on the NFS server, whose
  exported directory is on a different file system (sdb) and LUN (disk).

  I investigated the problem and found, using iotop, that the process
  "jbd2/sda2-8" was responsible for an extremely high number of I/O
  operations:
  Total DISK READ:       0.00 B/s | Total DISK WRITE:      13.33 M/
    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
    293 be/3 root        0.00 B/s    0.00 B/s  0.00 % 39.16 % [jbd2/sda2-8]

  I turned on file system debugging using
     echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable
     echo 1 >/sys/kernel/debug/tracing/events/jbd2/jbd2_run_stats/enable

  and found that one inode was being synced extremely frequently:

       jbd2/sda2-8-293   [000] 260587.952474: jbd2_run_stats: dev 8,2 tid 42050430 wait 0 running 0 locked 0 flushing 0 logging 0 handle_count 2 blocks 7 blocks_logged 8
              nfsd-1332  [000] 260587.953142: ext4_sync_file_enter: dev 8,2 ino 150 parent 16987 datasync 0 
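
  To confirm which inode dominates, the trace can be tallied per device
  and inode.  The following is only a sketch: the function name
  count_syncs_per_inode is made up for illustration, and it assumes the
  ext4_sync_file_enter line format shown above.

```shell
# Count ext4_sync_file_enter events per (device, inode) from ftrace text on
# stdin and print "COUNT DEV INO" lines, busiest inode first.
# Assumes lines of the form "... ext4_sync_file_enter: dev 8,2 ino 150 ...".
count_syncs_per_inode() {
    awk '
        /ext4_sync_file_enter:/ {
            dev = ""; ino = ""
            # Pick up the values that follow the "dev" and "ino" keywords.
            for (i = 1; i < NF; i++) {
                if ($i == "dev") dev = $(i + 1)
                if ($i == "ino") ino = $(i + 1)
            }
            if (dev != "" && ino != "") count[dev " " ino]++
        }
        END {
            for (key in count) print count[key], key
        }
    ' | sort -rn
}

# Typical use on the affected server (needs root and a mounted debugfs):
#   count_syncs_per_inode < /sys/kernel/debug/tracing/trace | head
```

  Other events, such as jbd2_run_stats, are ignored, so both tracepoints
  can stay enabled while counting.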

  This inode (150) belongs to /var/lib/nfs/v4recovery.
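
  One way to map an inode number back to a path is find with -inum.
  This is a sketch of that lookup, not necessarily the method used
  here; inode_to_path is a hypothetical helper name.

```shell
# Print every path on one file system that refers to the given inode number.
# -xdev keeps find on the starting file system, because inode numbers are
# only unique within a single file system.
inode_to_path() {
    mount_point=$1
    inode=$2
    find "$mount_point" -xdev -inum "$inode" 2>/dev/null
}

# Example for the trace above (dev 8,2 is sda2, the / partition in this report):
#   inode_to_path / 150
```

  On a large file system this walks every directory, so it can take a
  while and adds read load of its own.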

  Further investigation of dmesg showed that, shortly after booting, the
  directory /var/lib/nfs/v4recovery couldn't be written to:
  [   99.020861] NFSD: failed to write recovery record (err -17); please check that /var/lib/nfs/v4recovery exists and is writeable
  [   99.089156] NFSD: failed to write recovery record (err -17); please check that /var/lib/nfs/v4recovery exists and is writeable
  [   99.189010] NFSD: failed to write recovery record (err -17); please check that /var/lib/nfs/v4recovery exists and is writeable
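
  On Linux, errno 17 is EEXIST ("File exists"), so err -17 may mean that
  the record nfsd tried to create was already present, rather than that
  the directory is missing or unwritable.  To see how often each error
  code occurs, the messages can be summarized as sketched below
  (count_recovery_errors is a hypothetical helper name, and the pattern
  assumes the message text shown above).

```shell
# Summarize "NFSD: failed to write recovery record" kernel messages by error
# code. Reads dmesg-style text on stdin and prints "COUNT ERR" lines.
count_recovery_errors() {
    sed -n 's/.*NFSD: failed to write recovery record (err \(-\{0,1\}[0-9]\{1,\}\)).*/\1/p' |
        sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage on the server:
#   dmesg | count_recovery_errors
```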

  I have tried deleting and recreating the directory
  /var/lib/nfs/v4recovery with 777 permissions; however, this did not
  solve the problem.  These messages were still produced even after a
  reboot.

  The I/O rate is typically around 500-750 operations per second and
  sometimes exceeds 1000.  With the previous kernel, it was 5-20
  operations per second with peaks around 30.  I will attach a graph
  from our central EMC storage system of the LUN for a graphical view of
  before and after the update.
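
  For anyone wanting to reproduce the measurement on the host itself,
  the rate can be approximated from /proc/diskstats.  This is a sketch
  under assumptions: the figures above came from the storage system,
  not from this script, and completed_ios and measure_iops are
  hypothetical helper names.

```shell
# Print completed reads + writes for one block device from diskstats-format
# text on stdin (field 3 = device name, field 4 = reads completed,
# field 8 = writes completed).
completed_ios() {
    awk -v dev="$1" '$3 == dev { printf "%.0f\n", $4 + $8 }'
}

# Average I/O operations per second for a device over an interval given in
# whole seconds (must be greater than 0).
# The optional third argument (a diskstats-format file) exists only so the
# function can be tried without a real device; it defaults to /proc/diskstats.
measure_iops() {
    dev=$1
    interval=$2
    stats=${3:-/proc/diskstats}
    before=$(completed_ios "$dev" < "$stats")
    sleep "$interval"
    after=$(completed_ios "$dev" < "$stats")
    echo $(( (after - before) / interval ))
}

# Example: measure_iops sda 10
```

  The result counts completed requests as the kernel sees them, so it
  may not match the numbers reported by the storage array exactly.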

  There were no changes to the NFS server's parameters or shares, and I
  have not been able to find any documentation about parameters that we
  should change to address such a problem.  I can therefore only
  conclude that this behaviour is a bug of some description.

  Additional information about our system:
  $ uname -a 
  Linux wsps428 3.2.0-87-generic #125-Ubuntu SMP Fri Jun 19 08:25:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

  $ cat /proc/version_signature
  Ubuntu 3.2.0-87.125-generic 3.2.69

  $ lsb_release -rd
  Description:    Ubuntu 12.04.5 LTS
  Release:        12.04

  $ apt-cache policy nfs-server
  nfs-server:
    Installed: (none)
    Candidate: (none)
    Version table:

  I will attach a dmesg output.