
Redis Setup and Tuning Guide

For Ampere Altra family of Arm64-based Cloud Native Processors

How to Set Up and Tune Redis on Ampere Altra/Altra Max Processors

This guide describes techniques for running Redis optimally on the Ampere® Altra® family of Cloud Native Processors.

Redis is an open source, in-memory, key-value data store that is typically used as a fast cache. It keeps its dataset in memory, but data can be persisted through periodic writes or appends to disk. Because it works in memory, Redis can deliver high throughput at sub-millisecond latencies. It continues to rank highly in popularity among key-value stores in the cloud, according to DB-Engines.


We used memtier_benchmark (developed by Redis Labs) as the load generator for benchmarking Redis. Each test was configured to run with multiple threads, multiple clients per thread, and pipelining enabled.

Running Redis on Ampere Processors

Compile open source Redis on the server

Install dependencies

sudo dnf groupinstall "Development Tools" -y; \
sudo dnf install libnsl libevent-devel pcre-devel numactl -y; \
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm -y

Clone and build Redis from source with GCC

git clone https://github.com/antirez/redis.git && cd redis && git checkout 6.2.1
make CFLAGS="-march=native"

Compile open source Memtier Benchmark on the client and server

Install dependencies

sudo dnf groupinstall "Development Tools" -y; \
sudo dnf install libnsl libevent-devel pcre-devel numactl -y; \
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm -y

Clone and build Memtier from source with GCC

cd $HOME && git clone https://github.com/RedisLabs/memtier_benchmark; \
cd memtier_benchmark && git checkout 793d74dbc09395dfc241342d847730a6197d7c0c
autoreconf -ivf && PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:${PKG_CONFIG_PATH} ./configure && sudo make install

Tune System Level Settings

On the server: optimize overcommit_memory and socket connections


echo -e "vm.overcommit_memory=1\nnet.core.somaxconn=65535\n" | sudo tee -a /etc/sysctl.conf; \
sudo /usr/sbin/sysctl -p
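
Because tee -a appends to /etc/sysctl.conf, it can be useful to confirm that the running kernel actually holds the intended values. The following is a minimal sketch, assuming the standard /proc/sys layout. The helper name check_sysctl and its optional root-directory argument are hypothetical and are not part of the original guide; the extra argument exists only so the check can be tried against a scratch directory.

```shell
# check_sysctl KEY EXPECTED [ROOT]   (hypothetical helper, not from the guide)
# Compares a kernel parameter with the expected value.
# KEY uses dotted sysctl notation, e.g. vm.overcommit_memory.
# ROOT defaults to /proc/sys and is normally left unset.
#
# Example:
#   check_sysctl vm.overcommit_memory 1
#   check_sysctl net.core.somaxconn 65535
check_sysctl() (
    key=$1
    expected=$2
    root=${3:-/proc/sys}
    # Translate dotted notation into the matching path under ROOT.
    path="$root/$(printf '%s' "$key" | tr . /)"
    if [ ! -r "$path" ]; then
        echo "MISSING $key"
        exit 1
    fi
    actual=$(cat "$path")
    if [ "$actual" = "$expected" ]; then
        echo "OK $key=$actual"
    else
        echo "MISMATCH $key=$actual (expected $expected)"
        exit 1
    fi
)
```

The function runs in a subshell, so a failed check only sets its return status and does not end the calling shell.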

Ensure that the firewall allows the test traffic, either by disabling it or by opening the ports used by the Redis server processes. When running in the cloud, also open these ports in the network security group.

Network Tuning

Redis is sensitive to both network bandwidth and network latency, so tuning network settings can have a large impact on throughput and latency. We recommend the following network tunings:

INTERFACE=""
systemctl stop irqbalance.service
tuned-adm profile throughput-performance
echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
echo always > /sys/kernel/mm/transparent_hugepage/enabled
echo always > /sys/kernel/mm/transparent_hugepage/defrag
echo 1 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
echo 262144 > /proc/sys/net/core/somaxconn
echo 262144 > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo 50 > /proc/sys/net/core/busy_read
echo 75 > /proc/sys/net/core/busy_poll
echo "none" > /sys/kernel/debug/sched/preempt
ethtool -C ${INTERFACE} adaptive-rx off adaptive-tx off ;:
ethtool -K ${INTERFACE} lro on ;:
ethtool -K ${INTERFACE} gro on
ethtool -C ${INTERFACE} rx-usecs 1 tx-usecs 1 rx-frames 8192 tx-frames 8192;:

We also set the affinity of the network IRQs to the first two cores of the server. This removes unpredictable network bottlenecks while the Redis server processes on the remaining cores are pushed to 100% CPU utilization.
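
The guide does not list the commands used for the IRQ pinning. One possible approach is sketched below. It assumes that the NIC driver includes the interface name in its entries in /proc/interrupts (not all drivers do) and that each IRQ exposes a writable smp_affinity_list file under /proc/irq. The helper name pin_irqs and its optional path arguments are hypothetical.

```shell
# pin_irqs IFACE CPULIST [INTERRUPTS_FILE] [IRQ_DIR]   (hypothetical helper)
# Writes CPULIST (e.g. "0-1") to smp_affinity_list for every IRQ whose
# line in the interrupts table mentions IFACE. Needs root on a real system.
# The last two arguments default to /proc/interrupts and /proc/irq.
#
# Example (run as root, with irqbalance stopped):
#   pin_irqs "$INTERFACE" 0-1
pin_irqs() (
    iface=$1
    cpus=$2
    table=${3:-/proc/interrupts}
    irq_dir=${4:-/proc/irq}
    # IRQ lines start with "<number>:"; keep the number for each line
    # that names the interface.
    awk -v pat="$iface" \
        'index($0, pat) && $1 ~ /^[0-9]+:$/ { sub(":", "", $1); print $1 }' \
        "$table" |
    while read -r irq; do
        # Skip IRQs without a writable affinity file.
        if [ -w "$irq_dir/$irq/smp_affinity_list" ]; then
            echo "$cpus" > "$irq_dir/$irq/smp_affinity_list"
            echo "IRQ $irq -> CPUs $cpus"
        fi
    done
)
```

irqbalance should stay stopped, as in the tuning commands above, or it may move the IRQs again.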

Application Setup

On the server: create a customized Redis configuration file by adapting the default configuration from source

cp ./redis/redis.conf ./redis_custom.conf; \
sed -i 's/bind 127.0.0.1 -::1/# bind 127.0.0.1 -::1/' redis_custom.conf; \
sed -i 's/protected-mode yes/protected-mode no/' redis_custom.conf; \
sed -i 's/tcp-backlog 511/tcp-backlog 4096/' redis_custom.conf; \
sed -i 's/daemonize no/daemonize yes/' redis_custom.conf; \
sed -i 's/# supervised auto/supervised no/' redis_custom.conf; \
sed -i 's/loglevel notice/loglevel warning/' redis_custom.conf; \
sed -i 's/# save ""/save ""/' redis_custom.conf; \
sed -i 's/dbfilename dump.rdb/# dbfilename dump.rdb/' redis_custom.conf; \
sed -i 's;dir ./;dir /tmp;' redis_custom.conf; \
sed -i 's/repl-disable-tcp-nodelay no/repl-disable-tcp-nodelay yes/' redis_custom.conf; \
sed -i 's/# maxmemory <bytes>/maxmemory 1gb/' redis_custom.conf; \
sed -i 's/# maxmemory-policy noeviction/maxmemory-policy allkeys-random/' redis_custom.conf; \
sed -i 's/# maxmemory-samples 5/maxmemory-samples 5/' redis_custom.conf; \
sed -i 's/# io-threads 4/io-threads 1/' redis_custom.conf; \
sed -i 's/disable-thp yes/disable-thp no/' redis_custom.conf; \
sed -i 's/hz 10/hz 50/' redis_custom.conf; \
sed -i 's/rdb-save-incremental-fsync yes/# rdb-save-incremental-fsync yes/' redis_custom.conf; \
sed -i 's/# ignore-warnings ARM64-COW-BUG/ignore-warnings ARM64-COW-BUG/' redis_custom.conf
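
A sed substitution leaves the file unchanged, without any error, when its pattern does not match, for example if the default redis.conf differs in another Redis version. A small check such as the following sketch can confirm that the intended directives are present as active lines. The helper name check_directives is hypothetical, and the directives passed to it should be adjusted to the settings that matter for a given run.

```shell
# check_directives FILE DIRECTIVE...   (hypothetical helper)
# Reports every expected directive that is not present as an exact,
# uncommented line in FILE; returns non-zero if any are missing.
#
# Example:
#   check_directives redis_custom.conf \
#       'protected-mode no' 'tcp-backlog 4096' 'daemonize yes' \
#       'save ""' 'maxmemory 1gb' 'hz 50'
check_directives() (
    file=$1
    shift
    missing=0
    for directive in "$@"; do
        # -x matches the whole line, -F treats the directive literally.
        if ! grep -qxF -- "$directive" "$file"; then
            echo "not applied: $directive"
            missing=1
        fi
    done
    exit "$missing"
)
```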

On the server: start one Redis server process on each of cores 2 through 15 (cores 0-1 are left for the IRQs pinned in the previous step). Since Redis is single-threaded, we pin each Redis server process to its own core and give it its own port.

Note: we’ve found that setting io-threads to roughly 75% of the total vCPU count gives the best performance.
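
As a worked example of the note above: 75% of 16 vCPUs is 12, which is the --io-threads value passed on the command line in the start command that follows (overriding the io-threads 1 setting written to the configuration file). A minimal sketch of the arithmetic is shown here. The helper name io_threads_for is hypothetical, and rounding down with a floor of 1 is an assumption, since the note only says "roughly".

```shell
# io_threads_for VCPUS   (hypothetical helper)
# Prints 75% of VCPUS, rounded down, but never less than 1.
#
# Example:
#   io_threads_for "$(nproc)"
io_threads_for() (
    n=$(( $1 * 3 / 4 ))
    [ "$n" -ge 1 ] || n=1
    echo "$n"
)
```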

PORT=6379
for CORE in $(seq 2 15); do
    nohup sudo numactl -C $CORE ./redis/src/redis-server ./redis_custom.conf --port $PORT \
        --protected-mode no --ignore-warnings ARM64-COW-BUG --save "" --io-threads 12 \
        --maxmemory-policy noeviction --maxmemory 1536mb &> /dev/null &
    ((PORT+=1))
done
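
Since the server output is redirected to /dev/null, a failed start is easy to miss. Before warming up, it may help to confirm that every instance answers a PING. The sketch below assumes the redis-cli binary built alongside redis-server at ./redis/src/redis-cli. The helper name wait_for_redis and the REDIS_CLI override variable are hypothetical.

```shell
# wait_for_redis FIRST_PORT COUNT [TRIES]   (hypothetical helper)
# Polls COUNT consecutive ports starting at FIRST_PORT until each answers
# PING with PONG, retrying up to TRIES times (default 10), one second apart.
# REDIS_CLI may be set to point at a different redis-cli binary.
#
# Example for the 14 instances on cores 2-15:
#   wait_for_redis 6379 14
wait_for_redis() (
    port=$1
    last=$(( $1 + $2 - 1 ))
    tries=${3:-10}
    cli=${REDIS_CLI:-./redis/src/redis-cli}
    while [ "$port" -le "$last" ]; do
        attempt=1
        until [ "$("$cli" -p "$port" ping 2>/dev/null)" = "PONG" ]; do
            if [ "$attempt" -ge "$tries" ]; then
                echo "port $port did not respond"
                exit 1
            fi
            attempt=$(( attempt + 1 ))
            sleep 1
        done
        port=$(( port + 1 ))
    done
    echo "all $2 instances responding"
)
```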

On the server: warm up the Redis processes that have been started by running memtier_benchmark against localhost

PORT=6379
for CORE in $(seq 2 15); do
    numactl -C $CORE ./memtier_benchmark/memtier_benchmark --server localhost --port $PORT \
        --protocol redis --clients 1 --threads 1 --ratio 1:0 --data-size 32 --pipeline 100 \
        --key-minimum 1 --key-maximum 10000000 --requests allkeys \
        --print-percentile 50,90,95,99,99.9 &
    ((PORT+=1))
done

Running the Benchmark

On the client: run memtier_benchmark against Redis using the server's internal IP address. We are aiming for 100% CPU utilization on all Redis server processes while staying within the SLA. The clients, threads, and pipeline command line options can be scaled up gradually to find this performance sweet spot.

SERVER_INTERNAL_IP=""
PORT=6379
for CORE in $(seq 2 15); do
    numactl -C $CORE ./memtier_benchmark/memtier_benchmark --server $SERVER_INTERNAL_IP --port $PORT \
        --protocol redis --clients 1 --threads 1 --ratio 1:9 --data-size 32 --pipeline 894 \
        --key-minimum 1 --key-maximum 10000000 --key-pattern R:R --run-count 1 --test-time 30 \
        --out-file /tmp/memtier_results_$PORT --print-percentile 50,90,95,99,99.9 --random-data &
    ((PORT+=1))
done
wait
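
Each memtier_benchmark process writes its own result file, so the per-port numbers have to be combined to get the throughput of the whole server. The sketch below adds up the operations per second across the result files. It assumes that each file contains a summary row beginning with "Totals" whose second column is Ops/sec; this should be checked against an actual result file. The helper name sum_ops is hypothetical.

```shell
# sum_ops FILE...   (hypothetical helper)
# Adds up the second field (assumed to be Ops/sec) of the "Totals" row
# in each result file and prints the sum with two decimals.
#
# Example:
#   sum_ops /tmp/memtier_results_*
sum_ops() {
    awk '$1 == "Totals" { total += $2 } END { printf "%.2f\n", total }' "$@"
}
```

Latency percentiles cannot be summed in the same way and are best reviewed per instance.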
Created At : June 15th 2023, 11:54:38 am
Last Updated At : June 28th 2023, 3:36:47 pm
© 2023 Ampere Computing LLC. All rights reserved. Ampere, Altra and the A and Ampere logos are registered trademarks or trademarks of Ampere Computing.