How to fine-tune pfSense 2.4.4 for 1Gbit throughput on APU2
This is an update to the article we wrote in 2017, where we showed that pfSense 2.3 was not able to route a full gigabit on APU2. With some more testing and tinkering, we are now able to get a full gigabit on APU2!

As stated in the previous article, the APU2 has 4x 1GHz CPU cores, and pfSense by design can use only one core per connection. This limitation still exists, but single-core performance has improved considerably. With the new BIOS and the settings described below, pfSense can route about 750-800 Mbit/s on a single connection.

The APU2C4 and APU2D4 have very performant Intel I210-AT network interfaces. These NICs have 4 transmit and 4 receive queues, so they can work on 4 connections simultaneously. With some fine-tuning, pfSense can take advantage of this and route at 1 Gbit when more than one connection is used.

The other APU boards (APU2C0, APU2C2, APU3, APU4) have the I211-AT network interface, with 2 transmit/receive queues. This is a less performant NIC, but it is still good enough to deliver 1 Gbit on pfSense when more than one connection is used.

Routers rarely handle just one connection, so a single connection is rarely a bottleneck in the real world. A web browser opens about 8 TCP connections per website, torrent clients open hundreds of connections, Netflix opens multiple TCP connections when streaming video, and so on.

The previous throughput test wasn't taking advantage of multi-queue NICs, because by default pfSense uses only one NIC queue on APU2. Here's how to change this.
| 1Gbps throughput? | Single Connection | Multiple Connections |
|---|---|---|
| pfSense/OPNsense (no tweaks) | no | no |
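Before changing anything, you can check how many queues the igb driver actually allocated: on pfSense 2.4.x (FreeBSD 11) each igb queue gets its own interrupt line, visible in `vmstat -i`. Below is a minimal sketch of that check, run against sample `vmstat -i` output since the real command only works on the router itself (the irq numbers and exact interrupt names are illustrative and may differ on your box):

```shell
# Sample `vmstat -i` lines for one igb NIC; on the router you would run
# `vmstat -i | grep igb0` instead. The irq numbers below are made up.
sample='irq256: igb0:que 0
irq257: igb0:que 1
irq258: igb0:que 2
irq259: igb0:que 3
irq260: igb0:link'

# Count the per-queue interrupt lines (one per rx/tx queue pair).
queues=$(printf '%s\n' "$sample" | grep -c 'igb0:que')
echo "igb0 queues: $queues"
```

An I210-AT should show four `que` lines per NIC once multi-queue operation is working; fewer lines means the driver is not using all the hardware queues.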
Gigabit pfSense config
First, head to the pfSense web panel -> System -> Advanced -> Networking and scroll to the bottom.

Make sure that the first three checkboxes under "Network Interfaces" are unchecked:
- Hardware Checksum Offloading
- Hardware TCP Segmentation Offloading
- Hardware Large Receive Offloading
As shown on the screenshot:
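You can also confirm from the shell that the offloads are really off: disabled features disappear from the `options=` line printed by `ifconfig igb0`. Here is a small sketch of that check, run against a sample `options` string (the hex value and flag list are illustrative; your NIC name and remaining flags may differ):

```shell
# Sample `options=` line from `ifconfig igb0`; on the router you would use:
#   opts=$(ifconfig igb0 | grep options)
opts='options=802028<VLAN_MTU,JUMBO_MTU,WOL_MAGIC>'

# Warn if any offload flag that should be disabled is still present.
ok=1
for flag in TXCSUM RXCSUM TSO4 TSO6 LRO; do
  case "$opts" in
    *"$flag"*) echo "WARNING: $flag still enabled"; ok=0 ;;
  esac
done
[ "$ok" -eq 1 ] && echo "hardware offloads are disabled"
```

If any of the warnings fire after you've saved the settings in the web panel, re-check the three checkboxes above and reboot.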
Now we need to edit some settings from the shell. You can SSH to the box or connect with the serial cable.
To get the full gigabit, edit /boot/loader.conf.local (you may need to create it, if it doesn't exist) and insert the following settings:
```
# agree with Intel license terms
legal.intel_ipw.license_ack=1
legal.intel_iwi.license_ack=1
# this is the magic. If you don't set this, queues won't be utilized properly
# allow multiple processes for receive/transmit processing
hw.igb.rx_process_limit="-1"
hw.igb.tx_process_limit="-1"
# more settings to play with below. Not strictly necessary.
# force NIC to use 1 queue (don't do it on APU)
# hw.igb.num_queues=1
# give enough RAM to network buffers (default is usually OK)
# kern.ipc.nmbclusters="1000000"
#net.pf.states_hashsize=2097152
#hw.igb.rxd=4096
#hw.igb.txd=4096
#net.inet.tcp.syncache.hashsize="1024"
#net.inet.tcp.syncache.bucketlimit="100"
```
After saving this file, reboot your router to apply it.
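After the reboot, it's worth confirming that the loader tunables actually took effect, by running `sysctl hw.igb.rx_process_limit hw.igb.tx_process_limit` on the router (a value of -1 means unlimited). Below is a sketch of that verification, fed with sample sysctl output since the real values only exist on the pfSense box:

```shell
# Sample output of `sysctl hw.igb.rx_process_limit hw.igb.tx_process_limit`;
# on the router, replace this string with the real command's output.
out='hw.igb.rx_process_limit: -1
hw.igb.tx_process_limit: -1'

# Flag any tunable that is not set to -1 (unlimited).
printf '%s\n' "$out" | awk -F': ' '
  $2 != -1 { print "NOT applied:", $1; bad = 1 }
  END      { exit bad }
' && echo "process limits applied"
```

If either value still shows the default, double-check that the file is named exactly `/boot/loader.conf.local` and reboot again.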
Now you can run some tests to verify that your settings worked properly. The easiest way is to use iperf3 with multiple connections, where one device is on the LAN and the other is on the internet.
iperf3 APU2 throughput test
We set up an iperf3 server on the internet and called it from a host on the LAN.

On the server (somewhere on the internet) run the following command:

```
iperf3 -s
```

On a host on your LAN, run this command:

```
iperf3 -c SERVER_IP_HERE -P 4
```
If everything went well, you should be seeing about 940Mbit/s throughput, similar to the snippet below:
```
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  43.00-44.00  sec  56.1 MBytes   470 Mbits/sec    0    481 KBytes
[  7]  43.00-44.00  sec  55.7 MBytes   468 Mbits/sec    0    438 KBytes
[SUM]  43.00-44.00  sec   112 MBytes   938 Mbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  44.00-45.00  sec  56.4 MBytes   473 Mbits/sec    0    481 KBytes
[  7]  44.00-45.00  sec  56.1 MBytes   470 Mbits/sec    0    438 KBytes
[SUM]  44.00-45.00  sec   112 MBytes   943 Mbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  45.00-46.00  sec  56.1 MBytes   470 Mbits/sec    0    481 KBytes
[  7]  45.00-46.00  sec  55.6 MBytes   466 Mbits/sec    0    438 KBytes
[SUM]  45.00-46.00  sec   112 MBytes   936 Mbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  46.00-47.00  sec  57.7 MBytes   484 Mbits/sec    0    481 KBytes
[  7]  46.00-47.00  sec  55.0 MBytes   461 Mbits/sec    0    438 KBytes
[SUM]  46.00-47.00  sec   113 MBytes   945 Mbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  47.00-48.00  sec  55.2 MBytes   463 Mbits/sec    0    481 KBytes
[  7]  47.00-48.00  sec  55.8 MBytes   468 Mbits/sec    0    438 KBytes
[SUM]  47.00-48.00  sec   111 MBytes   931 Mbits/sec    0
```
Here's a screenshot from pfSense panel - take a look at the traffic graph.
I think this is quite neat. It turns out the internet is wrong :-) . One can get a full gigabit on pfSense when utilizing multiple NIC queues and multiple CPU cores!
If you have any questions about the above article, mail us at firstname.lastname@example.org
Tip: check out many similar articles in our Knowledge Base.