Network File System (NFS) Deployment Recommendations

Introduction

ownCloud recommends using NFS in any scenario where you access Unix-based remote servers or storage other than local storage. It offers solid performance and is very stable. This guide covers the official ownCloud NFS (Network File System) deployment recommendations for Infinite Scale.

This guide only covers the NFS client side where ownCloud runs. Follow the server or storage vendors' recommendations to configure the NFS server (storage back end).

NFS Prerequisites

See the NFS Notes in the prerequisites documentation for important details.

General Performance Considerations

ownCloud advises using network storage like NFS only in non-routed, switched Gigabit or faster environments.

Consider that a network stack operates in the range of microseconds (µs), while a storage back end usually operates in the range of milliseconds (ms). Apart from mount options, any tuning should therefore first be attempted on the back-end storage layout, especially under high load.

NFSv4 Overview

To use NFS with Infinite Scale, you must use NFSv4, for several reasons:

  • Most importantly, starting with NFSv4, Extended Attributes can be provided by the NFS server, if implemented. See the NFS Notes in the prerequisites documentation for more details.

  • Improved Security: NFSv4 mandates a strong security architecture. It does not require rpc.statd or lockd and, as a result, only uses port 2049.

  • Improved Reliability: It uses TCP by default.

  • Improved Performance: It uses compound messages that carry multiple operations per RPC, which reduces network traffic. It is capable of using a 32 KB page size, compared to the default of 1024 bytes.

  • Use of Read/Write Delegations.

NFSv4 Key Value Overview

  NFSv4                 Explanation
  Exports               All exports can be mounted together in a directory tree structure as part of a pseudo-filesystem.
  Protocol              A single protocol with the addition of OPEN and CLOSE for security auditing.
  Locking               Lease-based locking in the same protocol.
  Security              Kerberos and ACL-based.
  Communication         Multiple operations per RPC (improves performance).
  I18N                  UTF-8.
  pNFS (with NFSv4.1)   Scalable parallel access to files distributed between multiple servers.
  Extended Attributes   Introduced by IETF, December 2017.

NFS Mount Options

See the Linux man pages (e.g. man 5 nfs) for a detailed description of the NFS mount options. The following options are the defaults for NFS unless explicitly set otherwise when mounting: rw, suid, dev, exec, auto, nouser, and async.
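
To see which options are actually in effect for an existing NFS mount, you can query the client, for example with nfsstat from the nfs-utils package (a sketch; the output depends on your mounts):

# List each mounted NFS filesystem together with its negotiated mount options
nfsstat -m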

Mount Options to Consider

_netdev

Use this option to ensure that the network is enabled before NFS attempts to mount these filesystems. This setting is essential when other services depend on the availability of the NFS storage during the start process. A service may fail or not start correctly if the mount is not ready before attempting to access its data files.

You can also use autofs to ensure that mounts are always available before they are accessed, as sketched below.
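
As a minimal autofs sketch (map names, the mount point, and the server address are examples only), an on-demand NFS mount could be configured like this:

# /etc/auto.master: mount NFS exports on demand under /mnt/nfs
/mnt/nfs  /etc/auto.nfs  --timeout=60

# /etc/auto.nfs: map the key "data" to the NFS export
data  -fstype=nfs4,hard,bg  192.168.0.104:/data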

nofail

Using nofail allows the boot sequence to continue even if the drive fails to mount, which can happen if the NFS server being accessed is down. The boot process continues after the mount attempt reaches its timeout. The default device timeout is 90 seconds; it can be set manually using the option x-systemd.mount-timeout=<value in seconds>. If nofail is not set, the boot process waits until the NFS server is available.
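
For illustration, an /etc/fstab entry combining nofail with a shorter mount timeout could look like this (server address and mount point are examples only):

# Continue booting if the server is down; stop waiting for the mount after 30 seconds
192.168.0.104:/data  /mnt  nfs4  nofail,x-systemd.mount-timeout=30  0  0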

bg

ownCloud recommends using this option. It determines how the mount command behaves if an attempt to mount an export fails. If the bg option is specified, a timeout or failure causes the mount command to fork a child, which continues to attempt mounting the export, while the parent immediately returns with a zero exit code. This is known as a "background" mount. The option is useful for continuous operation without manual intervention when network connectivity is temporarily down or the storage back end must be rebooted.

hard

Default value is hard. For business-critical NFS exports, ownCloud recommends using hard mounts. ownCloud strongly discourages the use of soft mounts.

retrans

Default value is 3. This option can be tuned when using the soft option.

timeo

Default value is 600, measured in tenths of a second (i.e., 60 seconds). This option can be tuned when using the soft option.
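
As an illustrative sketch only, and keeping in mind that ownCloud discourages soft mounts, retrans and timeo would be tuned together like this (server address and mount point are placeholders):

# Soft mount: retry each request twice (retrans=2),
# waiting 15 seconds per attempt (timeo is given in tenths of a second)
mount -t nfs4 -o soft,retrans=2,timeo=150 192.168.0.104:/data /mnt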

sync/async

With the default value of async, the NFS client may delay sending application writes to the NFS server. In other words, under normal circumstances, data written by an application may not immediately appear on the server that hosts the file. The sync mount option provides greater data cache coherence among clients, but at a significant performance cost. You may consider further tuning when using clustered server environments.
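
For example, to trade performance for stricter cache coherence, a share could be mounted with sync (server address and mount point are placeholders):

# Synchronous writes: data is flushed to the server before the write call returns
mount -t nfs4 -o sync 192.168.0.104:/data /mnt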

tcp

ownCloud recommends using this option, which forces the use of TCP as the transport protocol. Alternatively, you can use proto=tcp.

Tune the Read and Write Block Sizes

The rsize and wsize block sizes are the chunk sizes that NFS uses when reading and writing data. The smaller the size, the greater the number of packets needed to transfer a file; conversely, the larger the size, the fewer packets need to be sent. You can set the rsize and wsize values as high as 65536 if the network transport protocol is TCP. The default value is 32768, and the value must be a multiple of 4096.

Read and write size must be identical on the NFS server and client.

You can find the configured values in the output of the mount command on a standard server, as in the example below.

root@server:~# mount | egrep -o 'rsize=[0-9]*'
rsize=65536
root@server:~# mount | egrep -o 'wsize=[0-9]*'
wsize=65536

The information can also be retrieved using the command set of your dedicated storage back end. Once you’ve determined the best sizes, set them permanently by passing the rsize and wsize options when mounting the share, or in the share’s mount configuration.

Specifying the read and write block sizes when calling mount
mount 192.168.0.104:/data /mnt -o rsize=65536,wsize=65536
Example for a set of NFS mount options:
bg,nfsvers=4,wsize=65536,rsize=65536,tcp,_netdev
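
To make such a configuration permanent, the same options can be placed in /etc/fstab, as in the following sketch (server address and mount point are examples only):

# /etc/fstab: mount the export with the recommended option set at boot
192.168.0.104:/data  /mnt  nfs4  bg,nfsvers=4,wsize=65536,rsize=65536,tcp,_netdev  0  0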

Ethernet Configuration Options

MTU (Maximum Transmission Unit) Size

The MTU size dictates the maximum amount of data that can be transferred in one Ethernet frame. If the MTU size is too small, the data must be fragmented across multiple frames regardless of the read and write block sizes. Keep in mind that MTU = payload (packet size) + 28, where the 28 bytes are the 20-byte IP header plus the 8-byte ICMP header used by the ping test below.

Get the Current MTU Size

You can find the current MTU size for each interface using netstat, ifconfig, ip, and cat, as in the following examples:

Retrieve interface MTU size with netstat
netstat -i
Kernel Interface table
Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
lo       65536   363183      0      0 0        363183      0      0      0 LRU
eth0      1500  3138292      0      0 0       2049155      0      0      0 BMR
Retrieve interface MTU size with ifconfig
ifconfig | grep -i mtu
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
Retrieve interface MTU size with ip
ip addr | grep mtu
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
Retrieve interface MTU size with cat
cat /sys/class/net/<interface>/mtu
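
For example, to read the MTU of a single interface directly from sysfs (eth0 is a placeholder for your interface name):

cat /sys/class/net/eth0/mtu
1500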

Check for MTU Fragmentation

To check if a particular packet size will be fragmented on the way to the target, run the following command:

ping <your-storage-backend> -c 3 -M do -s <packetsize>
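
For example, with a standard MTU of 1500, the largest payload that passes unfragmented is 1500 - 28 = 1472 bytes (the target address is a placeholder):

# Succeeds on a 1500-byte MTU path; -M do forbids fragmentation,
# so a payload of 1473 bytes would fail with "message too long" instead
ping 192.168.0.104 -c 3 -M do -s 1472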

Get the Optimal MTU Size

To get the optimal MTU size, run the following command:

tracepath <your-storage-backend>

You can expect to see output like the following:

 1?: [LOCALHOST]                      pmtu 1500 (1)
 1:  <your-storage-backend>                              0.263ms reached
 1:  <your-storage-backend>                              0.224ms reached (2)
     Resume: pmtu 1500 hops 1 back 1
1 The first line with LOCALHOST shows the current MTU size.
2 The last line before the summary shows the optimal MTU size. If both values are identical, nothing needs to be done.

Change Your MTU Value

If you need or want to change the MTU size, for example to 1280 under Ubuntu:

  • If NetworkManager is managing all devices on the system, then you can use nmtui or nmcli to configure the MTU setting.

  • If NetworkManager is not managing all devices on the system, you can set the MTU to 1280 with Netplan, as in the following example.

    network:
      version: 2
      ethernets:
        eth0:
          mtu: 1280
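
    After saving the change (Netplan configuration files live under /etc/netplan/; the exact filename varies per installation), apply and verify it:

    # Apply the new Netplan configuration
    sudo netplan apply
    # Confirm the interface reports the new MTU
    ip link show eth0 | grep mtu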

    Refer to the Netplan documentation for further information.