Redis
Redis (written in ANSI C) is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, HyperLogLogs, geospatial indexes with radius queries, and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
Redis (REmote DIctionary Server) is also referred to as a data structure server.
Features
- NoSQL
- Transactions (see the Python sketch after this list)
- serialized and atomic
- if a crash leaves a partial transaction in the append-only log, it can be removed (e.g. with redis-check-aof) so the server can restart
- check-and-set (CAS) (v2.2)
- does not support rollbacks
- Redis commands can fail only if called with the wrong syntax, or against keys holding the wrong data type
- not needing the ability to roll back keeps Redis internally simple and fast
MULTI; ...; EXEC/DISCARD
- Pub/Sub
- Lua scripting
- Keys with a limited time-to-live (TTL)
EXPIRE key seconds
- LRU eviction of keys
- Automatic failover
- LFU (Least Frequently Used) eviction mode (v4.0)
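A minimal redis-py sketch of a transaction with check-and-set, assuming a local server and a hypothetical counter key:

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

with r.pipeline() as pipe:
    while True:
        try:
            pipe.watch('counter')             # CAS: abort if 'counter' changes
            current = int(pipe.get('counter') or 0)
            pipe.multi()                      # MULTI: start queueing commands
            pipe.set('counter', current + 1)
            pipe.execute()                    # EXEC: run queued commands atomically
            break
        except redis.WatchError:
            continue                          # another client touched 'counter'; retry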
Limitations
- single-threaded command execution
- significant overhead for persistence
Table of Contents
- Table of Contents
- Python
- Data Types
- DLM (Distributed Lock Manager)
- Redis Sentinel vs Clustering
- Cluster
- Sharding
- Master-slave model
- Consistency
- Failover
- Persistence
- Performance
- Pub/Sub
PUBLISH/{UN,}SUBSCRIBE
- References
Python
import redis
import traceback

# host/port below are placeholders for your Redis server
_redis = redis.StrictRedis(host='1.2.3.4', port=6379, db=0)

with _redis.pipeline() as pipe:
    pipe.setex('key', 60 * 2, 'value')  # set 'key' with a 2-minute TTL
    try:
        pipe.execute()
    except Exception:
        traceback.print_exc()
Data Types
- Binary-safe strings
- Lists: collections of string elements, sorted by insertion order. Redis lists are implemented via linked lists, so head/tail operations are O(1). Use sorted sets if random access is required.
- suitable for implementing queues and, in general, as a building block for inter-process communication systems: blocking operations
BRPOP
BLPOP
- clients are served in an ordered way; return values are key-value pairs
- communication between processes (consumer/producer pattern)
- Capped Lists: use LTRIM to keep only the latest items. Twitter, for example, stores the latest tweets posted by users in Redis lists.
- Sets: collections of unique, unsorted string elements
- good for expressing relations between objects
- for instance, sets make it easy to implement tags
- card games (random extraction, e.g. with SPOP)
- Sorted sets: every element is associated with a floating-point value called the score; range retrieval is supported
- A < B if A.score < B.score, or if A.score == B.score and memcmp(A, B) < 0
- Sorted sets are implemented via a dual-ported data structure containing both a skip list and a hash table, so every time we add an element Redis performs an O(log(N)) operation.
- Updating the score: leaderboards
- Hashes: both the field and the value are strings
- small hashes (i.e., a few elements with small values) are encoded in a special way in memory that makes them very memory efficient
- Bit arrays (bitmaps): it is possible, using special commands, to handle String values like an array of bits: you can set and clear individual bits, count all the bits set to 1, find the first set or unset bit, and so forth.
- Bitmaps are not an actual data type, but a set of bit-oriented operations defined on the String type.
- Since strings are binary safe blobs and their maximum length is 512 MB, they are suitable to set up to 2^32 different bits.
- extreme space savings (knowing whether each of 4 billion users wants to receive a newsletter takes only 512 MB of memory)
- Common use cases:
- Real time analytics of all kinds.
- Storing space efficient but high performance boolean information associated with object IDs.
- Bitmaps are trivial to split into multiple keys, for example for the sake of sharding the data set, and because in general it is better to avoid working with huge keys. To split a bitmap across different keys instead of setting all the bits into one key, a trivial strategy is to store M bits per key, obtaining the key name from bit-number / M and the bit to address inside the key from bit-number MOD M (see the sketch after this list).
- HyperLogLogs: a probabilistic data structure used to estimate the cardinality of a set (count unique items) in a small, bounded amount of memory
- Streams: append-only collections of map-like entries that provide an abstract log data type
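A minimal redis-py sketch of the bit-splitting strategy above (the key prefix and the shard size M are arbitrary choices):

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)
M = 1024 * 8  # bits per key; arbitrary shard size

def set_user_bit(user_id: int, value: int) -> None:
    # shard the bitmap: key name from user_id / M, offset from user_id MOD M
    key = f'newsletter:{user_id // M}'
    r.setbit(key, user_id % M, value)

def get_user_bit(user_id: int) -> int:
    return r.getbit(f'newsletter:{user_id // M}', user_id % M)

set_user_bit(123456, 1)
assert get_user_bit(123456) == 1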
Keys
Keys are binary safe; an empty string is also a valid key.
- Not too long: long keys make key comparisons costly; for very large keys, store a hash instead (newKey = hash(origKey))
- Not too short: prefer readable keys like user:1000:followers over u1000flw
- Stick to a schema, e.g. object-type:id
- Max key size: 512 MB
Values
By contrast, strings are the only data type Memcached supports.
Max value size: 512 MB
- compress values to minimize memory usage (lz4, gzip, etc.)
- client-side compression code (reduces network I/O too)
- Lua script (newer approach)
- Redis itself uses the lightweight LZF compressor at dump time
conditional set (set only if the key does not exist, or only if it exists):
SET key val NX
SET key val XX
The ability to set or retrieve the value of multiple keys in a single command is also useful for reduced latency. For this reason there are the MSET and MGET commands:
MSET a 10 b 20 c 30
MGET a b c
Redis expires
- They can be set both using seconds or milliseconds precision.
- However the expire time resolution is always 1 millisecond.
- Information about expires is replicated and persisted on disk; time virtually passes while your Redis server is stopped (this means that Redis saves the date at which a key will expire).
- EXPIRE: set the expire
- PERSIST: remove the expire
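A quick redis-py illustration of expires (key name and TTL are arbitrary):

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

r.set('session:42', 'data')
r.expire('session:42', 10)   # EXPIRE: key dies in 10 seconds
print(r.ttl('session:42'))   # remaining TTL in seconds
r.persist('session:42')      # PERSIST: remove the expire
print(r.ttl('session:42'))   # -1: no expire set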
DLM (Distributed Lock Manager)
The algorithm is named Redlock.
Properties:
- Safety: mutual exclusion (only one client can hold a lock at any given moment)
- Liveness
- Deadlock free (eventually it is always possible to acquire a lock, even if the client that locked a resource crashes or gets partitioned)
- Fault tolerance (as long as the majority of Redis nodes are up, clients are able to acquire and release locks)
Single Instance Impl
# lock: random value + NX + expiry, so the lock auto-releases if the holder crashes
SET resource_name my_random_value NX PX 30000
# unlock: check-and-delete must be atomic, so run it as a Lua script
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
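A minimal redis-py sketch of this single-instance pattern (key name and TTL are illustrative; this is not a full multi-node Redlock implementation):

import uuid
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

UNLOCK_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""
unlock = r.register_script(UNLOCK_SCRIPT)

token = str(uuid.uuid4())  # random value identifies this lock holder
# acquire: nx=True -> only if absent; px=30000 -> auto-expire after 30 s
if r.set('resource_name', token, nx=True, px=30000):
    try:
        pass  # ... critical section ...
    finally:
        unlock(keys=['resource_name'], args=[token])  # atomic unlock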
Redis Sentinel vs Clustering
- Redis Sentinel is the official high availability solution for Redis.
- More nodes
- increase redundancy
- slaves can be used to split reads
- If the amount of data to store is “small” and the command rate is not exceedingly high, then remember you don’t need to dedicate a host to Redis.
- Redis Cluster is not an HA solution - it is a multiple writer/larger-than-ram solution
- Redis Cluster comes with limitations, particularly around multi-key operations, so it isn’t necessarily a straightforward “just use cluster” operation.
Cluster
Automatically shard across multiple Redis nodes.
Some degree of availability is provided during partitions:
- automatically split dataset among multiple nodes
- continue operations when a subset of the nodes are experiencing failures
Port
Every Redis Cluster node requires two open TCP ports:
- 6379: serving clients
- 16379 (10000 + 6379): node-to-node communication channel (the cluster bus, a binary protocol)
Sharding
Redis Cluster does not use consistent hashing.
There are 16384 hash slots in Redis Cluster, and every key is conceptually part of a hash slot:
hash_slot = CRC16(key) % 16384
Every node in a Redis Cluster is responsible for a subset of the hash slots.
Redis Cluster supports multiple-key operations as long as the keys involved all belong to the same hash slot. The user can force multiple keys to be part of the same hash slot by using a concept called hash tags.
If there is a substring between {} braces in a key, only what is inside the braces is hashed: this{foo}key and another{foo}key are guaranteed to be in the same hash slot, and can be used together in a command with multiple keys as arguments (see the sketch below).
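A small Python sketch of just the hash-tag extraction rule described above (the CRC16 step is omitted):

def hash_tag(key: str) -> str:
    # If the key contains a non-empty {...} substring, only that substring
    # participates in hashing; otherwise the whole key is hashed.
    start = key.find('{')
    if start != -1:
        end = key.find('}', start + 1)
        if end > start + 1:  # braces must enclose a non-empty substring
            return key[start + 1:end]
    return key

# both keys hash "foo", so they map to the same slot
assert hash_tag('this{foo}key') == hash_tag('another{foo}key') == 'foo'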
Master-slave model
In this model, every hash slot has from 1 (the master itself) to N replicas (the master plus N-1 slaves).
Consistency
Redis Cluster is not able to guarantee strong consistency: under specific conditions, Redis Cluster can lose writes that were acknowledged by the system to the client.
- Redis Cluster uses asynchronous replication (client writes to master -> master acks OK to client -> master propagates the write to its slaves); a trade-off between performance and consistency.
- Redis Cluster has support for synchronous writes, implemented via the WAIT command (see the example below). This makes losing writes a lot less likely; however, it is still possible that a slave that was not able to receive the write is elected master.
- Redis Cluster will lose writes during a network partition where a client is isolated with a minority of instances including at least a master. The writes are lost only if the partition lasts long enough for a slave node on the majority side to be promoted to master. Note that there is a maximum window for the amount of writes sent to the isolated master, called node timeout: after node timeout has elapsed, a master node is considered to be failing and can be replaced by one of its replicas.
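For illustration, in the raw command style used elsewhere in these notes (the replica count and timeout are arbitrary):

SET key val
WAIT 1 100
# WAIT <numreplicas> <timeout-ms>: blocks until at least 1 replica has
# acknowledged the preceding writes, or 100 ms elapse; returns the number
# of replicas reached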
Redis Cluster Configuration
cluster-enabled <yes/no>
cluster-config-file <filename>
- not the user-editable config file, but the Redis Cluster persistent configuration
- generated at startup by the Redis Cluster instances
cluster-node-timeout <milliseconds>
- maximum amount of time a node can be unavailable before it is considered failing
cluster-slave-validity-factor <factor>
- if zero, a slave will always try to fail over its master; if positive, a slave will not attempt a failover if its link with the master was down for longer than node timeout multiplied by this factor
cluster-migration-barrier <count>
- minimum number of slaves a master must keep connected for another of its slaves to migrate to a master no longer covered by any slave
cluster-require-full-coverage <yes/no>
- if yes, the cluster stops accepting writes if some percentage of the key space is not covered by any node
Create Redis Cluster
# one config dir per node; cd in so each node writes its own nodes.conf,
# and daemonize so the loop does not block on the first server
for port in {7000..7005}; do
  path="cluster-x/$port"
  mkdir -p "$path"
  cat > "$path/redis.conf" << EOF
port $port
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
daemonize yes
EOF
  (cd "$path" && redis-server redis.conf)
done
gem install redis
redis-trib.rb create --replicas 1 127.0.0.1:{7000..7005}
# alternative: the create-cluster script shipped with Redis in utils/create-cluster
create-cluster start
create-cluster create
create-cluster stop
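Note: in Redis 5.0 and later, redis-trib.rb's functionality moved into redis-cli itself; the equivalent invocation is:

redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 \
  127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 --cluster-replicas 1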
Failover
Debug
DEBUG SEGFAULT  # crash a node, useful to simulate a master failure
Manual Failover
CLUSTER FAILOVER
Execute in one of the slaves of the master you want to fail over. The failover then proceeds as follows:
- clients connecting to the master are stopped
- the slave syncs with the master
- the slave and the master switch roles
- the old master redirects clients to the new master
Add A New Node
Add a node as master
- Add an empty node
redis-trib.rb add-node 127.0.0.1:7006 127.0.0.1:7000
- redis-trib first checks the state of the cluster before operating,
- then sends a CLUSTER MEET message to the new node
- Reshard data to the node
Add a node as replica
- to random master
./redis-trib.rb add-node --slave 127.0.0.1:7006 127.0.0.1:7000
- to specific master
./redis-trib.rb add-node --slave --master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7006 127.0.0.1:7000
Remove a node
./redis-trib.rb del-node 127.0.0.1:7000 <node-id>
Persistence
Redis keeps the whole dataset in memory; for persistence it supports:
- dumping the dataset to disk once in a while (RDB snapshots)
- appending each command to a log (AOF, the append-only file)
- persistence can be optionally disabled
- master-slave asynchronous replication, with very fast non-blocking first synchronization and auto-reconnection with partial resynchronization on net split
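A minimal redis.conf sketch showing both persistence styles (the thresholds shown are illustrative, not recommendations):

save 900 1            # RDB: snapshot if at least 1 key changed in 900 s
save 300 10           # RDB: snapshot if at least 10 keys changed in 300 s
appendonly yes        # AOF: log every write command
appendfsync everysec  # fsync the AOF once per second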
Performance
Pipelining is not just a way to reduce the latency cost of round trips; it also improves, by a huge amount, the total number of operations you can perform per second on a given Redis server.
- use pipelining to speed up Redis queries (see the sketch after this list)
- request/response protocols pay a round-trip time (RTT) per command
- pick a reasonable batch size (e.g. 10k commands)
- the server also needs memory to queue the replies to pipelined requests, so batches should be bounded
- socket I/O: every read()/write() syscall involves a user-land/kernel-land context switch, which is a huge speed penalty
- scripting
- able to both read and write data with minimal latency, making operations like read, compute, write very fast
- pipelining can’t help in this scenario since the client needs the reply of the read command before it can call the write command
- sometimes the application may also want to send EVAL or EVALSHA commands in a pipeline
- avoid benchmarking over the loopback interface
- in practical terms, even the loopback interface involves network-like latency, because of how the kernel scheduler works
- the wise thing is just to avoid benchmarking this way
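A minimal redis-py sketch comparing unpipelined and pipelined writes (host, key names, and batch size are assumptions):

import time
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

def without_pipeline(n=10000):
    for i in range(n):
        r.set(f'k:{i}', i)  # one network round trip per command

def with_pipeline(n=10000, batch=1000):
    pipe = r.pipeline(transaction=False)  # plain pipelining, no MULTI/EXEC
    for i in range(n):
        pipe.set(f'k:{i}', i)
        if (i + 1) % batch == 0:
            pipe.execute()  # flush a bounded batch; caps the server-side reply queue
    pipe.execute()

for fn in (without_pipeline, with_pipeline):
    t0 = time.time()
    fn()
    print(fn.__name__, time.time() - t0)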
Performance Tuning
- Special encoding of small aggregate data types
- many data types are optimized to use less space up to a certain size
- If a specially encoded value will overflow the configured max size, Redis will automatically convert it into normal encoding.
- Using 32 bit instances
- since pointers are small, but such an instance will be limited to 4 GB of maximum memory usage
- RDB and AOF files are compatible between 32 bit and 64 bit instances
- Bit and byte level operations
- Use hashes when possible
- Small hashes are encoded in a very small space
- For instance if you have objects representing users in a web application, instead of using different keys for name, surname, email, password, use a single hash with all the required fields.
- Using hashes to abstract a very memory efficient plain key-value store on top of Redis
- a few keys use a lot more memory than a single key containing a hash with a few fields.
- many times hashes contain just a few fields
- When hashes are small we can instead just encode them in an O(N) data structure, like a linear array with length-prefixed key-value pairs. Since we do this only when N is small, the amortized time for HGET and HSET commands is still O(1): the hash is converted into a real hash table as soon as the number of elements it contains grows too large (you can configure the limit in redis.conf)
- since a linear array of key value pairs happens to play very well with the CPU cache (it has a better cache locality than a hash table).
- So hashes are memory efficient
- In some way, the final digits of the object ID can be considered a form of implicit pre-sharding (see the sketch after this list)
HSET object:12 34 val   # object ID 1234 split into bucket key object:12 and field 34
- Memory allocation
- Redis will not always free up (return) memory to the OS when keys are removed
- often most of the removed keys were allocated in the same pages as the other keys that still exist.
- means that you need to provision memory based on your peak memory usage
- However allocators are smart and are able to reuse free chunks of memory
- Because of all this, the fragmentation ratio is not reliable when peak memory usage was much larger than the memory currently in use
- If maxmemory is not set Redis will keep allocating memory as it finds fit and thus it can (gradually) eat up all your free memory. Therefore it is generally advisable to configure some limit.
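A minimal redis-py sketch of the hash-bucketing idea above (the key prefix and the bucket size of 100 are assumptions):

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)
BUCKET = 100  # objects per hash; keep below the redis.conf small-hash limit

def bucketed_set(obj_id: int, value: str) -> None:
    # object 1234 -> key object:12, field 34 (implicit pre-sharding)
    r.hset(f'object:{obj_id // BUCKET}', str(obj_id % BUCKET), value)

def bucketed_get(obj_id: int) -> bytes:
    return r.hget(f'object:{obj_id // BUCKET}', str(obj_id % BUCKET))

bucketed_set(1234, 'val')
assert bucketed_get(1234) == b'val'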
Pub/Sub PUBLISH/{UN,}SUBSCRIBE
This decoupling of publishers and subscribers can allow for greater scalability and a more dynamic network topology.
- Messages sent by other clients to these channels will be pushed by Redis to all the subscribed clients.
- Pub/Sub has no relation to the key space. It was made to not interfere with it on any level, including database numbers.
- If you need scoping of some kind, prefix the channels with the name of the environment (test, staging, production, …).
- The Redis Pub/Sub implementation supports pattern matching. Clients may subscribe to glob-style patterns in order to receive all the messages sent to channel names matching a given pattern.
- A client may receive a single message multiple times if it’s subscribed to multiple patterns matching a published message, or if it is subscribed to both patterns and channels matching the message.
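A minimal redis-py sketch tying these points together, assuming a local server and hypothetical channel names; note the last point above: this client receives the message twice, once via the channel and once via the matching pattern:

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

p = r.pubsub()
p.subscribe('news.tech')   # exact channel
p.psubscribe('news.*')     # glob-style pattern

r.publish('news.tech', 'hello')

# Drain the subscribe confirmations and the deliveries; the published
# message shows up once per matching subscription.
while True:
    msg = p.get_message(timeout=1.0)
    if msg is None:
        break
    print(msg)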