Commit

[doc] update Ranking length in doc (vesoft-inc#1878)
jude-zhu authored Mar 6, 2020
1 parent 10b8f58 commit 8c907c5
Showing 2 changed files with 16 additions and 8 deletions.
@@ -34,24 +34,31 @@

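Two vertices may be connected by edges of several types, and **Nebula Graph** uses Edge Type to represent the type of an edge. There may also be multiple edges of the same type: for example, with an edge type "transfer", user A may transfer money to user B several times. **Nebula Graph** therefore adds a Rank field to tell these edges apart, so that each transfer from A to B is a separate record. The edge key format is shown in Figure 3: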

-![image](https://user-images.githubusercontent.com/42762957/71571055-6a89a500-2b13-11ea-8953-e5c6d2588b16.png)
+![1](https://user-images.githubusercontent.com/51590253/75966340-20eb7b00-5f05-11ea-9d8e-c3ee17a33038.png)
Fig. 3 Edge Key Format

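- `Type`: 1 byte, indicates the type of the key. The current types include data, index, and system.
- `Part ID`: 3 bytes, indicates the data partition. This field mainly makes it easy to **scan all data of a partition by prefix when partitions are redistributed (balance)**.
- `Vertex ID`: 8 bytes, the source vertex ID in an out-edge and the destination vertex ID in an in-edge.
- `Edge Type`: 4 bytes, indicates the type of the edge. A value greater than 0 means an out-edge; a value less than 0 means an in-edge.
-- `Rank`: 4 bytes, handles the case where multiple edges of the same type exist. Users can set it as needed; the field can store a _transaction time_, a _transaction serial number_, or _a sorting weight_.
+- `Rank`: 8 bytes, handles the case where multiple edges of the same type exist. Users can set it as needed; the field can store a _transaction time_, a _transaction serial number_, or _a sorting weight_.
- `Vertex ID`: 8 bytes, the destination vertex ID in an out-edge and the source vertex ID in an in-edge.
- `Timestamp`: 8 bytes, invisible to users, reserved for implementing distributed transactions in the future.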

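If the Edge Type value is greater than 0 (an out-edge), the corresponding edge key format is shown in Figure 4; if the Edge Type value is less than 0, the corresponding edge key format is shown in Figure 5.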

-![image](https://user-images.githubusercontent.com/42762957/71571153-e8e64700-2b13-11ea-9d96-ff7ac74db609.png)
---
+![2](https://user-images.githubusercontent.com/51590253/75966451-5c864500-5f05-11ea-87e5-b357e29fbbd4.png)

Fig. 4 Out-edge Key Format
-![image](https://user-images.githubusercontent.com/42762957/71571168-01eef800-2b14-11ea-8c86-c0696b966162.png)

---
+![3](https://user-images.githubusercontent.com/51590253/75966470-614af900-5f05-11ea-94eb-b693680f295f.png)

Fig. 5 In-edge Key Format

---

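The properties of a vertex or an edge form a set of kv pairs, which **Nebula Graph** encodes and stores in the corresponding value. Because **Nebula Graph** uses a strongly typed schema, the specific schema must be fetched from the Meta Service before decoding. In addition, to support online schema changes, the schema version is written together with the properties when they are encoded.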
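The role of the schema version can be illustrated with a small Python sketch. Everything here is hypothetical: the `SCHEMAS` table stands in for a Meta Service lookup, and the field names, the 4-byte version prefix, and the byte order are assumptions chosen for the example, not **Nebula Graph**'s actual value encoding.

```python
import struct

# Hypothetical stand-in for schemas fetched from the Meta Service:
# schema version -> ordered list of (property name, struct format code).
SCHEMAS = {
    1: [("amount", "q")],
    2: [("amount", "q"), ("fee", "q")],  # a later version adds a property
}

def encode_props(version, props):
    """Prefix the encoded properties with the schema version used."""
    fields = SCHEMAS[version]
    fmt = ">I" + "".join(code for _, code in fields)
    return struct.pack(fmt, version, *(props[name] for name, _ in fields))

def decode_props(value):
    """Read the version first, then decode with the matching schema."""
    (version,) = struct.unpack_from(">I", value, 0)
    fields = SCHEMAS[version]
    fmt = ">" + "".join(code for _, code in fields)
    values = struct.unpack_from(fmt, value, 4)
    return version, dict(zip((name for name, _ in fields), values))

old_value = encode_props(1, {"amount": 50})
new_value = encode_props(2, {"amount": 50, "fee": 1})
print(decode_props(old_value))
print(decode_props(new_value))
```

Because each value carries its own version, values written before a schema change can still be decoded after it.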

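Data is partitioned by applying a `modulo` operation to the Vertex ID. As a result, all _out-edges_ and _in-edges_ of a vertex, together with all _Tag information_ attached to that vertex, fall into the same partition, which greatly improves query efficiency. For online graph queries, the most common operation is a BFS (breadth-first) expansion outward from a vertex, so fetching the out-edges or in-edges of a vertex is the most basic operation, and its performance determines the performance of the whole traversal. BFS may prune by certain properties, and **Nebula Graph** keeps the whole operation efficient by storing properties together with the vertices and edges. In practice, most graphs are property graphs, and real-world BFS requires a large amount of pruning.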
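A minimal Python sketch of this placement rule, assuming four partitions numbered from 0 and plain tuples in place of encoded keys (the partition count, the numbering, and the helper names are illustrative assumptions):

```python
from collections import defaultdict

NUM_PARTS = 4  # assumed partition count, for illustration only

def partition_of(vertex_id, num_parts=NUM_PARTS):
    """Partition chosen by taking the vertex ID modulo the partition count."""
    return vertex_id % num_parts

def place_edge(src, dst, edge_type, rank, partitions):
    """Store an edge as two records, each under the partition of its first vertex."""
    # Out-edge: keyed by the source vertex, positive edge type.
    partitions[partition_of(src)].append((src, edge_type, rank, dst))
    # In-edge: keyed by the destination vertex, negative edge type.
    partitions[partition_of(dst)].append((dst, -edge_type, rank, src))

partitions = defaultdict(list)
place_edge(101, 202, 1, 0, partitions)  # 101 -> 202
place_edge(303, 101, 1, 0, partitions)  # 303 -> 101

# Both the out-edge and the in-edge of vertex 101 sit in one partition.
print(partitions[partition_of(101)])
```

Expanding from vertex 101 in either direction then touches a single partition, which is the property the paragraph above relies on.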
@@ -45,7 +45,8 @@ Each edge in **Nebula Graph** is modeled as two independent key-values. One is s
Two vertices may be connected by multiple edge types, or by multiple edges of the same edge type. For example, given an edge type 'transfer', user A may transfer money to user B multiple times, so a `rank` field is added to distinguish the transfer records between the two. The edge key format is shown in Figure Three:

---
-![image](https://user-images.githubusercontent.com/42762957/71571055-6a89a500-2b13-11ea-8953-e5c6d2588b16.png)
+![1](https://user-images.githubusercontent.com/51590253/75966340-20eb7b00-5f05-11ea-9d8e-c3ee17a33038.png)


Fig.3 Edge Key Format

@@ -55,20 +56,20 @@ Fig.3 Edge Key Format
- `Part ID`: three bytes, used to indicate the sharding partition. This field can be used to **scan the partition data based on the prefix when re-balancing the partition**
- `Vertex ID`: eight bytes, the source vertex ID in an out-edge and the destination vertex ID in an in-edge
- `Edge Type`: four bytes, used to indicate the edge type; a value greater than zero means an out-edge, a value less than zero means an in-edge
-- `Rank`: four bytes, used to distinguish multiple edges of the same edge type. Users can set the field as needed and store values such as a _transaction time_, a _transaction number_, or _a sorting weight_
+- `Rank`: eight bytes, used to distinguish multiple edges of the same edge type. Users can set the field as needed and store values such as a _transaction time_, a _transaction number_, or _a sorting weight_
- `Vertex ID`: eight bytes, the destination vertex ID in an out-edge and the source vertex ID in an in-edge
- `Timestamp`: eight bytes, not available to users, reserved for MVCC in the future
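With `Rank` at eight bytes, and including the one-byte `Type` field that precedes `Part ID`, the listed widths add up to a 40-byte key. The following Python sketch packs and unpacks such a key. The helper names, the big-endian byte order, and the treatment of `Type` and `Part ID` as separate unsigned fields are assumptions made for illustration, not **Nebula Graph**'s actual encoding.

```python
import struct

# Field widths in bytes: Type, Part ID, Vertex ID, Edge Type, Rank, Vertex ID, Timestamp.
EDGE_KEY_LEN = 1 + 3 + 8 + 4 + 8 + 8 + 8  # 40

# ">" selects big-endian with no padding: q = signed 8 bytes, i = signed 4 bytes.
_BODY = struct.Struct(">qiqqq")

def encode_edge_key(key_type, part_id, first_vid, edge_type, rank, second_vid, timestamp):
    """Pack the seven fields into one byte string (hypothetical helper)."""
    head = bytes([key_type]) + part_id.to_bytes(3, "big")
    return head + _BODY.pack(first_vid, edge_type, rank, second_vid, timestamp)

def decode_edge_key(key):
    """Inverse of encode_edge_key; returns the seven fields as a tuple."""
    key_type = key[0]
    part_id = int.from_bytes(key[1:4], "big")
    return (key_type, part_id, *_BODY.unpack(key[4:]))

# A rank of 2**40 would not fit in the previous four-byte field.
out_key = encode_edge_key(1, 7, 100, 5, 2**40, 200, 0)
# The in-edge key swaps the two vertex IDs and negates the edge type.
in_key = encode_edge_key(1, 7, 200, -5, 2**40, 100, 0)
print(len(out_key), decode_edge_key(out_key))
```

Keeping `Type` and `Part ID` at the front means all keys of one partition share a four-byte prefix, which is what makes the prefix scan mentioned for `Part ID` possible.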

If the value of an edge type is greater than zero, the corresponding edge key format is shown in Figure Four; otherwise if the value is less than zero, the corresponding edge key format is shown in Figure Five.

---
-![image](https://user-images.githubusercontent.com/42762957/71571153-e8e64700-2b13-11ea-9d96-ff7ac74db609.png)
+![2](https://user-images.githubusercontent.com/51590253/75966451-5c864500-5f05-11ea-87e5-b357e29fbbd4.png)

Fig.4 Out-key format

---

-![image](https://user-images.githubusercontent.com/42762957/71571168-01eef800-2b14-11ea-8c86-c0696b966162.png)
+![3](https://user-images.githubusercontent.com/51590253/75966470-614af900-5f05-11ea-94eb-b693680f295f.png)

Fig.5 In-key format
