Study materials
1. https://medium.com/nerd-for-tech/understanding-redis-in-system-design-7a3aa8abc26a
(This article covers Redis Cluster well and is worth reading later; for now the focus is only on failure recovery.)
Core question
Study Redis's failure-recovery mechanism, and the characteristics of write-ahead logging versus write-after logging.
Redis
From the CAP theorem perspective, Redis is neither highly available nor consistent. To understand why, let's look at how Redis syncs data from memory to disk, since what has reached the disk determines how consistent the data is after a failure.
As explained before, Redis data lives in memory, which makes it very fast to write to and read from. But if the server crashes, you lose everything that was in memory. For some applications it is OK to lose this data in a crash, but for others it is important to be able to reload the Redis data after the server restarts.
RDB (Redis Database): The RDB persistence performs point-in-time snapshots of your dataset at specified intervals.
Example save points: every minute if 1000 keys were changed; every 5 minutes if 10 keys were changed; every 15 minutes if 1 key was changed.
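The save-point rules above can be sketched as a small check. This is only an illustration of the "at least N changes within M seconds" logic, not Redis's own code; `SavePoint` and `should_snapshot` are hypothetical names. In redis.conf the same rules are written with the `save <seconds> <changes>` directive, e.g. `save 60 1000`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SavePoint:
    seconds: int  # minimum time since the last snapshot
    changes: int  # minimum number of changed keys in that time


# The three example rules from the notes above (hypothetical constant name).
SAVE_POINTS = (SavePoint(60, 1000), SavePoint(300, 10), SavePoint(900, 1))


def should_snapshot(elapsed_seconds, dirty_keys, save_points=SAVE_POINTS):
    """Return True if at least one save point is satisfied."""
    return any(
        elapsed_seconds >= sp.seconds and dirty_keys >= sp.changes
        for sp in save_points
    )


if __name__ == "__main__":
    print(should_snapshot(61, 1000))  # busy instance, one minute passed
    print(should_snapshot(100, 999))  # not enough changes for any rule yet
```

A snapshot is taken as soon as any one rule matches, which is why a busy instance snapshots more often than a quiet one.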
AOF (Append Only File): The AOF persistence logs every write operation received by the server. These operations are replayed at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself, in an append-only fashion. Redis is able to rewrite the log in the background when it gets too big.
appendfsync options: always; everysec; no
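A minimal sketch of what the three `appendfsync` policies mean for an append-only log. The class `AppendOnlyLog` and its `fsync_count` field are hypothetical and exist only for illustration; `encode_resp` writes each command as a RESP array of bulk strings, which is the protocol format mentioned above. One simplification to note: real Redis performs the `everysec` fsync in a background thread, whereas this sketch only checks the clock when a command is appended.

```python
import os
import time


def encode_resp(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*" + str(len(args)).encode() + b"\r\n"]
    for arg in args:
        data = arg.encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


class AppendOnlyLog:
    """Hypothetical append-only log with Redis-like fsync policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_fsync = time.monotonic()
        self.fsync_count = 0  # illustration only: how often we forced data to disk

    def append(self, *command):
        self.file.write(encode_resp(*command))
        self.file.flush()  # hand the bytes to the OS page cache
        now = time.monotonic()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.file.fileno())  # force the OS to write to disk
            self.last_fsync = now
            self.fsync_count += 1
        # policy "no": the operating system decides when to flush to disk

    def close(self):
        self.file.close()
```

With `always`, every command pays for a disk sync; with `no`, a crash can lose whatever the operating system had not yet flushed; `everysec` sits in between and bounds the loss to roughly one second of writes.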
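Without a WAL (write-ahead log), durability cannot be guaranteed. Redis prioritizes high performance, so it uses a write-after log (the command is executed first and logged afterwards):
Benefits:
Avoids extra checking overhead, because only commands that executed successfully are logged; does not block the current write operation.
Risks:
If the command has been executed but the server crashes before the log is written, that data is lost. The log is written by the main thread, so when disk pressure is high the write is slow and blocks subsequent operations.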
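The difference in ordering between the two logging styles can be sketched with an in-memory toy. Everything here is hypothetical (`apply`, `write_after`, `write_ahead`, and the two-command "protocol"); the log is just a Python list, so the sketch shows ordering only, not real disk durability.

```python
def apply(store, command):
    """Execute a tiny command set against a dict; raise on invalid input."""
    op, *args = command
    if op == "SET" and len(args) == 2:
        store[args[0]] = args[1]
    elif op == "DEL" and len(args) == 1:
        store.pop(args[0], None)
    else:
        raise ValueError(f"invalid command: {command}")


def write_after(store, log, command):
    """Write-after ordering: execute first, log only if execution succeeded."""
    apply(store, command)  # a crash right after this line loses the command
    log.append(command)


def write_ahead(store, log, command):
    """Write-ahead ordering: log first, then execute."""
    log.append(command)  # without a prior check, a bad command is now in the log
    apply(store, command)


if __name__ == "__main__":
    store, log = {}, []
    write_after(store, log, ("SET", "a", "1"))
    try:
        write_after(store, log, ("BOGUS",))
    except ValueError:
        pass
    print(log)  # only the valid command was logged
```

With write-after ordering an invalid command never reaches the log, which is the "no extra check" benefit; the price is the window between `apply` and `log.append` in which a crash loses the write. A write-ahead design closes that window but has to validate commands before logging them.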
No persistence: You can disable persistence completely if you want your data to exist only as long as the server is running.
RDB + AOF: It is possible to combine both AOF and RDB in the same instance. Notice that, in this case, when Redis restarts the AOF file will be used to reconstruct the original dataset since it is guaranteed to be the most complete.
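The restart rule described above (prefer the AOF when both files exist) can be sketched as a small selection function. `choose_recovery_file` and its parameters are hypothetical, and the fallback to the RDB file when no AOF exists is an assumption of this sketch: how real Redis treats a missing AOF file differs between versions and is not modeled here.

```python
import os


def choose_recovery_file(aof_path, rdb_path, aof_enabled):
    """Pick the file to rebuild the dataset from, or None to start empty."""
    if aof_enabled and os.path.exists(aof_path):
        return aof_path  # the AOF is expected to be the most complete
    if os.path.exists(rdb_path):
        return rdb_path  # assumption of this sketch: fall back to the snapshot
    return None
```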
RDB advantages:
1. RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups. For instance, you may want to archive your RDB files every hour for the latest 24 hours and to save an RDB snapshot every day for 30 days. This allows you to easily restore different versions of the data set in case of disasters.
2. RDB is very good for disaster recovery, being a single compact file that can be transferred to far data centers.
3. RDB allows faster restarts with big datasets compared to AOF.
RDB disadvantages:
RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example after a power outage). You can configure different save points where an RDB is produced (for instance after at least five minutes and 100 writes against the data set, but you can have multiple save points). However, you'll usually create an RDB snapshot every five minutes or more, so if Redis stops working without a correct shutdown for any reason, you should be prepared to lose the latest minutes of data.
AOF advantages:
Using AOF, Redis is much more durable. You can choose between different fsync policies: no fsync at all, fsync every second, or fsync at every query. With the default policy of fsync every second, write performance is still great, and you can lose at most about one second's worth of writes.
The AOF log is an append-only log, so there are no seeks and no corruption problems if there is a power outage. Even if the log ends with a half-written command for some reason (disk full or other reasons), the redis-check-aof tool is able to fix it easily.
Redis is able to automatically rewrite the AOF in the background when it gets too big. The rewrite is completely safe as while Redis continues appending to the old file, a completely new one is produced with the minimal set of operations needed to create the current data set, and once this second file is ready Redis switches the two and starts appending to the new one.
AOF contains a log of all the operations, one after the other, in a format that is easy to understand and parse.
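The idea behind the background rewrite mentioned above (replace a long log with the minimal set of operations that recreates the current data set) can be sketched as follows. `rewrite_log` is a hypothetical name and the two-command vocabulary is a simplification. Note also that this sketch derives the final state by replaying the old log, whereas Redis produces the new file from its in-memory dataset in a child process.

```python
def rewrite_log(commands):
    """Compact a command log into the minimal SET commands for the final state."""
    state = {}
    for op, *args in commands:
        if op == "SET":
            key, value = args
            state[key] = value  # later writes overwrite earlier ones
        elif op == "DEL":
            state.pop(args[0], None)  # deleted keys need no command at all
        else:
            raise ValueError(f"unsupported command: {op}")
    # One SET per surviving key is enough to rebuild the dataset.
    return [("SET", key, value) for key, value in state.items()]


if __name__ == "__main__":
    old_log = [
        ("SET", "a", "1"),
        ("SET", "a", "2"),
        ("SET", "b", "x"),
        ("DEL", "b"),
    ]
    print(rewrite_log(old_log))
```

Four logged operations collapse into a single `SET` for key `a`, which is why the rewritten file can be much smaller than the log it replaces.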
AOF disadvantages:
AOF files are usually bigger than the equivalent RDB files for the same dataset.
AOF can be slower than RDB depending on the exact fsync policy.
Finally, AOF improves durability but does not guarantee it: you can still lose data, though less than in RDB mode. The trade-off is that RDB is faster.