Raft一致性算法伪码详解

State

/* Persistent state on all servers:
(Updated on stable storage before responding to RPCs)
*/
currentTerm                   //当前任期   
votedFor                      //当前任期的候选者编号，无则为null
log[]                         //日志条目

//Volatile state on all servers,所有服务器上维护
commitIndex             //已知的最高的可被提交的日志条目的索引，初始为0
lastApplied             //当前已提交给state machine执行的条目的索引，初始为0

//Volatile state on leaders:(Reinitialized after election)，只在leader节点上维护
nextIndex[]          //对于每一台服务器，下一条将要发给该服务器的条目的索引，初始为leader最后一条条目索引+1
matchIndex[]         //每一个服务器已知的最高的已复制的条目的索引，初始为0

RequestVote RPC

//Invoked by candidates to gather votes (§5.2).
Arguments:
term                      //候选者的term值
candidateId               //候选者的id
lastLogIndex              //候选者最新的日志索引
lastLogTerm               //候选者最新的日志所属的term

Results:
term 
voteGranted                //true表示投票给该candidate

Receiver implementation:
1. Reply false if term < currentTerm 
//这里投票给候选者的条件是要求候选者的日志至少比自身的要新，也就是要么lastLogIndex比自身最新的日志条目index要大。
//要么lastLogIndex和lastLogTerm都和自身最新的日志条目一致
//这里对选举的这种限制是为了保证安全性。确保commit的日志一定不会被重写。
2. If votedFor is null or candidateId, and candidate’s log is at least as up-to-date as receiver’s log, grant vote

AppendEntries RPC

//Invoked by leader to replicate log entries; also used as heartbeat
Arguments:
term                               //leader当前的term值
leaderId                           //follower在收到client request时，可以用该值转发给leader
prevLogIndex                       //上一条日志条目的索引
prevLogTerm                        //上一条日志条目的term
entries[]                          //日志条目，对于心跳包则该值为空，日志条目可以为多条
leaderCommit                       //leader服务器的commitIndex

Results:
term                          //当前任期
success                       //具体的判断如下

Receiver implementation:
//任期值比当前任期小，则该RPC已失效，或当前leader已变更
1. Reply false if term < currentTerm 
//不包含匹配prevLogTerm的prevLogIndex所对应的条目，通常该情况为节点挂掉一段时间，落后leader节点
//leader会重新发包含较早的prevLogTerm及prevLogIndex的RPC给该节点
2. Reply false if log doesn’t contain an entry at prevLogIndex whose term matches prevLogTerm 
// 以下均返回true
// 若日志条目已有内容与entries里的内容冲突，则删除已有及其后的条目
3. If an existing entry conflicts with a new one (same index but different terms), delete the existing entry and all that follow it 
// 将新的日志条目追加到日志中
4. Append any new entries not already in the log
//如果leaderCommit比自身commitIndex大，则更新自身的commitIndex为min(leaderCommit,当前最新日志条目索引)
5. If leaderCommit > commitIndex, set commitIndex = min(leaderCommit, index of last new entry)

InstallSnapshot RPC

//Invoked by leader to send chunks of a snapshot to a follower. Leaders always send chunks in order.
Arguments:
term                               //leader的当前term
leaderId                           //leader的id
lastIncludedIndex                  //该snapshot中包含的最大的日志的索引值
lastIncludedTerm                   //该snapshot中包含的最大的日志的所属的term
offset                             //用来定位shapshot文件的偏移量，snapshot文件可能很大，要分几次传，每次称之为一个chunk
data[]                             //snapshot数据，通常为state machine的当前状态     
done                               //是否为最后一个chunk

Results:
term                               //currentTerm

Receiver implementation:
1. Reply immediately if term < currentTerm
//如果是第一个chunk，则新建snapshot文件
2. Create new snapshot file if first chunk (offset is 0)
//将data的数据写入到snapshot的相应位置
3. Write data into snapshot file at given offset
//如果done为false，则重复1-3过程，回复并等待最后一个chunk
4. Reply and wait for more data chunks if done is false
//保存snapshot文件，丢弃更老的snapshot
5. Save snapshot file, discard any existing or partial snapshot with a smaller index
// 已有的日志处理
6. If existing log entry has same index and term as snapshot’s last included entry, retain log entries following it and reply
// 丢弃老的日志
7. Discard the entire log
// 按照snapshot内容重设state machine
8. Reset state machine using snapshot contents (and load snapshot’s cluster configuration)

Rules for Servers

All Servers:
// commitIndex > lastApplied,证明lastApplied到commitIndex之间的日志条目都可以提交给state machine执行
• If commitIndex > lastApplied: increment lastApplied, apply log[lastApplied] to state machine 
// 若有新term，则更新自己的term值
• If RPC request or response contains term T > currentTerm: set currentTerm = T, convert to follower 

Followers:
//响应leaders和candidates发来的RPC，响应规则参照AppendEntries和RequestVote部分
• Respond to RPCs from candidates and leaders
// 一段时间内，没有收到AppendEntries或者RequestVote的消息，则转变为candidate
• If election timeout elapses without receiving AppendEntries RPC from current leader or granting vote to candidate: convert to candidate

Candidates :
//开启选举，增加自身term值，投票给自己，重设定时器，发送RequestVote给其他服务器
 • On conversion to candidate, start election:
  • Increment currentTerm
  • Vote for self
  • Reset election timer
  • Send RequestVote RPCs to all other servers
//从多数成员收到true的回应，则转变为leader
• If votes received from majority of servers: become leader
//收到AppendEntries，证明新的leader已产生，则自身转变为follower
• If AppendEntries RPC received from new leader: convert to follower
//如果超时，则开启下一轮选举
• If election timeout elapses: start new election

Leaders:
//发送心跳包给所有服务器，防止其他服务器超时开启新的选举
• Upon election: send initial empty AppendEntries RPCs (heartbeat) to each server; repeat during idle periods to prevent election timeouts 
//收到客户端请求，则将条目写入到日志，当条目提交之后再回复给客户端
• If command received from client: append entry to local log, respond after entry applied to state machine 
//如果当前的日志条目索引比follower大（leader自身的last log index 与其nextIndex[]比较）,则发送AppendEntries给相应follower
• If last log index ≥ nextIndex for a follower: send AppendEntries RPC with log entries starting at nextIndex
     //如果成功，则更新nextIndex数组及matchIndex数组中的follower对应的项
    • If successful: update nextIndex and matchIndex for follower 
    //如果因为日志不同步失败，则减小该follower对应的nextIndex值然后重试,
   //若相应的nextIndex值减小到leader节点已经进行了snapshot，则leader会发送InstallSnapshot RPC
    • If AppendEntries fails because of log inconsistency: decrement nextIndex and retry 
    //如果更新完matchIndex后，判断下commitIndex是否可以更新，更新条件为新的值是多数同意的，且该条目的term为当前term
    • If there exists an N such that N > commitIndex, a majority of matchIndex[i] ≥ N, and log[N].term == currentTerm: set commitIndex = N.

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 212,332评论 6赞 493
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 90,508评论 3赞 385
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 157,812评论 0赞 348
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 56,607评论 1赞 284
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 65,728评论 6赞 386
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 49,919评论 1赞 290
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 39,071评论 3赞 410
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 37,802评论 0赞 268
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 44,256评论 1赞 303
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 36,576评论 2赞 327
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 38,712评论 1赞 341
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 34,389评论 4赞 332
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 40,032评论 3赞 316
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 30,798评论 0赞 21
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 32,026评论 1赞 266
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 46,473评论 2赞 360
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 43,606评论 2赞 350

Raft一致性算法伪码详解

推荐阅读更多精彩内容