add scan command #199

NaNShaner · 2023-07-24T13:11:59Z

add scan command

NaNShaner · 2023-07-24T14:43:07Z

@HDT3213 It looks like this comes from the raft-related logic.

HDT3213 · 2023-07-29T08:08:56Z

database/keys.go

@@ -412,6 +412,91 @@ func execCopy(mdb *Server, conn redis.Connection, args [][]byte) redis.Reply {
 	return protocol.MakeIntReply(1)
 }

+// execScan iteratively output all keys in the current db
+func execScan(db *DB, args [][]byte) redis.Reply {


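SCAN takes three options: MATCH, COUNT, and TYPE. The function that actually performs the scan should have the signature scan0(cursor, pattern, count, typ).
The current argument-parsing logic is too hard to follow; see the implementation of execSet for reference.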

HDT3213 · 2023-07-29T08:16:45Z

datastruct/dict/concurrent.go

+	if count >= size {
+		return dict.Keys(), nextCursor
+	}
+	remainingKeys := dict.table[cursor:]


What comes out of table is shards, so why is this named remainingKeys?

HDT3213 · 2023-07-29T08:18:10Z

datastruct/dict/concurrent.go

@@ -435,3 +436,56 @@ func (dict *ConcurrentDict) RWUnLocks(writeKeys []string, readKeys []string) {
 		}
 	}
 }
+
+// ScanKeys iteratively output all keys in the current db
+func (dict *ConcurrentDict) ScanKeys(cursor, count int, matchKey string) ([]string, int) {


matchKey -> pattern

HDT3213 · 2023-07-29T08:20:30Z

datastruct/dict/concurrent.go

+			for key, _ := range s.m {
+				if key != "" {
+					if matchKey != "*" {
+						pattern, err := wildcard.CompilePattern(matchKey)


CompilePattern only needs to be executed once, not once per key.
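A minimal sketch of that point: compile the pattern before the loop and reuse it for every key. The standard library `regexp` package stands in for the project's `wildcard` package here so the example is self-contained, and `filterKeys` is a hypothetical name, not a function from the PR.

```go
package main

import "regexp"

// filterKeys returns the keys that match expr.
// regexp is only a stand-in for the project's wildcard package (an
// assumption for this sketch); what matters is where compilation happens.
func filterKeys(keys []string, expr string) ([]string, error) {
	// Compile once, before the loop, instead of once per key.
	compiled, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	matched := make([]string, 0, len(keys))
	for _, key := range keys {
		if compiled.MatchString(key) {
			matched = append(matched, key)
		}
	}
	return matched, nil
}
```

Hoisting the compile step also surfaces an invalid pattern once, up front, rather than in the middle of the traversal.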

HDT3213 · 2023-07-29T08:23:48Z

datastruct/dict/concurrent.go

+	nextCursor := 0
+	size := dict.Len()
+	result := make([]string, count)
+	r := make(map[string]struct{})


Don't use overly terse variable names.

HDT3213 · 2023-07-29T08:36:12Z

datastruct/dict/concurrent.go

+						i++
+					}
+				}
+				if count <= i {


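The scan command does not guarantee that the number of elements returned equals the count argument, but it does guarantee that every element present in the dataset from the start of the iteration until the full iteration ends will be returned. If the scan stops as soon as count is reached, some keys in the current shard that existed throughout the iteration may never be returned, because the traversal was cut off.

So traverse the shard completely, and return once len(result) >= count.

for _, shard := range shards {
    for key := range shard.m {
        result.add(key)
    }
    if len(result) >= count {
        break
    }
}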

HDT3213 · 2023-07-29T08:38:04Z

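Two major issues:

1. The argument-parsing logic is too hard to follow; see the implementation of execSet for reference.
2. The scan command does not guarantee that the number of elements returned equals the count argument, but it does guarantee that "every element present in the dataset from the start of the iteration until the full iteration ends will be returned". If the scan stops as soon as count is reached, some keys in the current shard that existed throughout the iteration may never be returned, because the traversal was cut off.

So traverse the shard completely, and return once len(result) >= count.

for _, shard := range shards {
    for key := range shard.m {
        result.add(key)
    }
    if len(result) >= count {
        break
    }
}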
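The loop above could be fleshed out roughly as follows. This is a sketch under stated assumptions rather than the project's implementation: `shard` is a simplified stand-in type with no locking, `scanShards` is a hypothetical name, the cursor is treated as a shard index, and a returned cursor of 0 is taken to mean the iteration is complete.

```go
package main

// shard is a simplified stand-in for one segment of the concurrent dict
// (locking is omitted to keep the sketch short).
type shard struct {
	m map[string]struct{}
}

// scanShards starts at shard index cursor and collects keys shard by shard.
// count is only checked after a shard has been traversed completely, so the
// result may hold more than count keys. It returns the collected keys and
// the next cursor; a next cursor of 0 means the iteration is complete.
func scanShards(shards []*shard, cursor, count int) ([]string, int) {
	var result []string
	for index := cursor; index < len(shards); index++ {
		for key := range shards[index].m {
			result = append(result, key)
		}
		if len(result) >= count {
			next := index + 1
			if next >= len(shards) {
				next = 0 // the last shard was just finished
			}
			return result, next
		}
	}
	return result, 0
}
```

Because a shard is never split across two calls, a key that stays in its shard for the whole iteration is returned by whichever call visits that shard.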

HDT3213 · 2023-07-29T08:38:50Z

datastruct/dict/concurrent.go

+		return dict.Keys(), nextCursor
+	}
+	remainingKeys := dict.table[cursor:]
+	i := 0


Why not just use len(r)?

NaNShaner · 2023-08-05T13:25:35Z

I made some changes following your suggestions, but the test seems to fail. I'm a bit puzzled 🤔.

HDT3213 · 2023-08-09T01:49:14Z

=== RUN   TestScan
    keys_test.go:332: test failed
--- FAIL: TestScan (0.05s)

The newly added TestScan is failing.

NaNShaner · 2023-08-09T07:42:15Z

It passes when I run it locally.

NaNShaner · 2023-08-13T01:59:59Z

Got it. The keys returned by each scan iteration are unordered, so the comparison in the test occasionally fails.
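One way to make such a comparison independent of order is to sort copies of both slices before comparing them. This is only a sketch; `sameKeys` is a hypothetical helper and not part of the PR's test code.

```go
package main

import "sort"

// sameKeys reports whether got and want contain the same keys, ignoring
// order. It sorts copies so the callers' slices are left untouched.
func sameKeys(got, want []string) bool {
	if len(got) != len(want) {
		return false
	}
	sortedGot := append([]string(nil), got...)
	sortedWant := append([]string(nil), want...)
	sort.Strings(sortedGot)
	sort.Strings(sortedWant)
	for index := range sortedGot {
		if sortedGot[index] != sortedWant[index] {
			return false
		}
	}
	return true
}
```

A test would then assert `sameKeys(returnedKeys, expectedKeys)` instead of comparing the two slices position by position.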

HDT3213 · 2023-08-13T13:34:20Z

database/keys.go

+	return protocol.MakeNullBulkReply()
+}
+
+func scanWithArg(db *DB, args [][]byte, match bool) redis.Reply {


Change the function signature to: execScan0(cursor, pattern, count, typ)

HDT3213 · 2023-08-13T13:35:53Z

database/keys.go

+}
+
+// execScanWithArg execute scan command based on cli args
+func execScanWithArg(db *DB, args [][]byte, argType int) redis.Reply {


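Change the function signature to execScan0(cursor string, pattern string, count int).
The type parameter can be left out for now.

The type I mentioned last time is not the argument type; the scan command can filter keys by type: string/list/hash/set/sortedset.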

HDT3213 · 2023-08-13T13:38:41Z

database/keys.go

+
+// execScan iteratively output all keys in the current db
+func execScan(db *DB, args [][]byte) redis.Reply {
+	const (


var count int
var pattern string
for i := 1; i < argsNum; i++ {
    arg := strings.ToLower(string(args[i]))
    if arg == "count" {
        count, _ = strconv.Atoi(string(args[i+1])) // error handling omitted
        i++
    }
    ....
}
execScan0(cursor, pattern, count)

Wouldn't that be enough? Why write it in such a complicated way?
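A fuller version of that loop might look like the sketch below. The names `scanArgs` and `parseScanArgs` are hypothetical, the error messages are illustrative, and the defaults (pattern `*`, count 10, the latter matching Redis's documented default) are assumptions about what the command should do when an option is absent. TYPE is left out, as suggested in the review.

```go
package main

import (
	"errors"
	"strconv"
	"strings"
)

// scanArgs holds the parsed SCAN options (a hypothetical helper type).
type scanArgs struct {
	cursor  string
	pattern string
	count   int
}

// parseScanArgs parses "cursor [MATCH pattern] [COUNT count]".
// args excludes the command name, so args[0] is the cursor.
func parseScanArgs(args [][]byte) (scanArgs, error) {
	if len(args) == 0 {
		return scanArgs{}, errors.New("missing cursor")
	}
	// Assumed defaults: match every key, about 10 keys per call.
	parsed := scanArgs{cursor: string(args[0]), pattern: "*", count: 10}
	for i := 1; i < len(args); i++ {
		option := strings.ToLower(string(args[i]))
		if i+1 >= len(args) {
			return scanArgs{}, errors.New("syntax error")
		}
		value := string(args[i+1])
		switch option {
		case "match":
			parsed.pattern = value
		case "count":
			count, err := strconv.Atoi(value)
			if err != nil || count <= 0 {
				return scanArgs{}, errors.New("value is not an integer or out of range")
			}
			parsed.count = count
		default:
			return scanArgs{}, errors.New("syntax error")
		}
		i++ // skip the value that was just consumed
	}
	return parsed, nil
}
```

With parsing isolated like this, the command handler only has to call the parser, report any error to the client, and pass the three values on to execScan0.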

HDT3213 · 2023-08-13T13:41:28Z

datastruct/dict/concurrent.go

+						i++
+					}
+				}
+				if len(m) >= count {


#199 (comment)

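Finish traversing the shard before returning.

There is one thing I haven't figured out here. If every call has to finish traversing all the shards before returning:
1. Doesn't that add extra performance overhead?
2. If each call instead covers a range of count keys, and calls continue until the iteration completes, doesn't that also guarantee that all of the data gets traversed? (With no guarantee about changes to shards that have already been traversed.)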

add scan command

3755e04

add scan command

a1617b3

HDT3213 requested changes Jul 29, 2023

HDT3213 reviewed Jul 29, 2023

add scan command

24e3509

HDT3213 requested changes Aug 13, 2023

NaNShaner added 2 commits October 17, 2023 23:31

Merge branch 'HDT3213:master' into feat/scanCommand

7ada186

Merge branch 'HDT3213:master' into feat/scanCommand

8039246

NaNShaner closed this by deleting the head repository May 5, 2024

lhpqaq mentioned this pull request Jul 15, 2024

support scan #225

Merged

NaNShaner Aug 15, 2023
