Commit 3d54333

Update copy
1 parent 3756d2a commit 3d54333

4 files changed, 8 additions and 5 deletions

_parts/part13.md

Lines changed: 1 addition & 1 deletion
@@ -295,7 +295,7 @@ After a bunch of debugging, I discovered this was due to some bad pointer arithm
 }
 ```

-`INTERNAL_NODE_CHILD_SIZE` is 4. My here intention was to add 4 bytes to the result of `internal_node_cell()`, but since `internal_node_cell()` returns a `uint32_t*`, this it was actually adding `4 * sizeof(uint32_t)` bytes. I fixed it by casting to a `void*` before doing the arithmetic.
+`INTERNAL_NODE_CHILD_SIZE` is 4. My intention here was to add 4 bytes to the result of `internal_node_cell()`, but since `internal_node_cell()` returns a `uint32_t*`, this it was actually adding `4 * sizeof(uint32_t)` bytes. I fixed it by casting to a `void*` before doing the arithmetic.

 NOTE! [Pointer arithmetic on void pointers is not part of the C standard and may not work with your compiler](https://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c/46238658#46238658). I may do an article in the future on portability, but I'm leaving my void pointer arithmetic for now.
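Note on this hunk: the scaling the paragraph describes can be checked with a small sketch. The helper names below are hypothetical and not from the tutorial. The sketch also swaps in a cast to `uint8_t*` rather than `void*`, since byte-pointer arithmetic is standard C while `void*` arithmetic is a compiler extension, as the NOTE points out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: distance in bytes between two pointers into the same buffer. */
static ptrdiff_t byte_distance(const void* from, const void* to) {
  return (const uint8_t*)to - (const uint8_t*)from;
}

/* Adding n to a uint32_t* advances by n elements, i.e. n * sizeof(uint32_t) bytes. */
static ptrdiff_t advance_as_uint32(uint32_t* base, size_t n) {
  return byte_distance(base, base + n);
}

/* Casting to uint8_t* first makes the same addition advance by exactly n bytes. */
static ptrdiff_t advance_as_bytes(uint32_t* base, size_t n) {
  return byte_distance(base, (uint8_t*)base + n);
}
```

With `n` set to 4 (the value of `INTERNAL_NODE_CHILD_SIZE` quoted above), the first helper reports `4 * sizeof(uint32_t)` bytes and the second reports 4 bytes.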

_parts/part5.md

Lines changed: 4 additions & 2 deletions
@@ -122,7 +122,7 @@ Following our new abstraction, we move the logic for fetching a page into its ow
 }
 ```

-The `get_page()` method has the logic for handling a cache miss. We assume pages are saved one after the other in the database file: Page 0 at offset 0, page 1 at offset 4096, page 2 at offset 8192, etc. If the requested page lies outside the bounds of the file, we know it should be blank, so we just allocate some memory return it. The page will be added to the file when we flush the cache to disk later.
+The `get_page()` method has the logic for handling a cache miss. We assume pages are saved one after the other in the database file: Page 0 at offset 0, page 1 at offset 4096, page 2 at offset 8192, etc. If the requested page lies outside the bounds of the file, we know it should be blank, so we just allocate some memory and return it. The page will be added to the file when we flush the cache to disk later.


 ```diff
@@ -290,7 +290,9 @@ The next 256 bytes store the email in the same way. Here we can see some random

 ## Conclusion

-Alright! We've got persistence. It's not the greatest. For example if you kill the program without typing `.exit`, you lose your changes. Additionally, we're writing all pages back to disk, even pages that haven't changed since we read them from disk. These are issues we can address later. The next thing I think we should work on is implementing the B-tree.
+Alright! We've got persistence. It's not the greatest. For example if you kill the program without typing `.exit`, you lose your changes. Additionally, we're writing all pages back to disk, even pages that haven't changed since we read them from disk. These are issues we can address later.
+
+Next time we'll introduce cursors, which should make it easier to implement the B-tree.

 Until then!
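Note on the `get_page()` paragraph in this file: the page layout it describes reduces to a little arithmetic. The sketch below is illustrative only and is not the tutorial's `get_page()`. `PAGE_SIZE` is inferred from the offsets 0, 4096 and 8192 quoted in the paragraph, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096 /* assumed from the offsets quoted in the paragraph */

/* Byte offset of a page when pages are stored one after the other. */
static uint64_t page_offset(uint32_t page_num) {
  return (uint64_t)page_num * PAGE_SIZE;
}

/* Number of pages the file holds, counting a trailing partial page as one page. */
static uint32_t pages_in_file(uint64_t file_length) {
  return (uint32_t)((file_length + PAGE_SIZE - 1) / PAGE_SIZE);
}

/* True if the page starts inside the file and has to be read from disk.
   False means the page lies outside the file, so a blank buffer is enough. */
static bool page_is_on_disk(uint32_t page_num, uint64_t file_length) {
  return page_num < pages_in_file(file_length);
}
```

On a cache miss, a caller would read `PAGE_SIZE` bytes at `page_offset(page_num)` only when `page_is_on_disk()` is true, and otherwise hand back freshly allocated memory.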

_parts/part7.md

Lines changed: 2 additions & 1 deletion
@@ -8,7 +8,7 @@ The B-Tree is the data structure SQLite uses to represent both tables and indexe
 Why is a tree a good data structure for a database?

 - Searching for a particular value is fast (logarithmic time)
-- Inserting / deleting a value is fast (constant-ish time to rebalance)
+- Inserting / deleting a value you've already found is fast (constant-ish time to rebalance)
 - Traversing a range of values is fast (unlike a hash map)

 A B-Tree is different from a binary tree (the "B" probably stands for the inventor's name, but could also stand for "balanced"). Here's an example B-Tree:
@@ -51,6 +51,7 @@ Let's work through an example to see how a B-tree grows as you insert elements i
 - up to 3 children per internal node
 - up to 2 keys per internal node
 - at least 2 children per internal node
+- at least 1 key per internal node

 An empty B-tree has a single node: the root node. The root node starts as a leaf node with zero key/value pairs:
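Note on this hunk: the four bullets, including the added one, are consistent with the usual bounds for a B-tree of order 3. That the example tree has order 3 is an assumption here, since the hunk does not state it. A minimal sketch of the general formulas for non-root internal nodes, with a hypothetical struct and function name:

```c
#include <stdint.h>

/* Hypothetical container for the per-node limits of an internal node. */
typedef struct {
  uint32_t max_children;
  uint32_t max_keys;
  uint32_t min_children;
  uint32_t min_keys;
} InternalNodeBounds;

/* Standard bounds for a B-tree of the given order (order >= 2 assumed). */
static InternalNodeBounds internal_node_bounds(uint32_t order) {
  InternalNodeBounds b;
  b.max_children = order;           /* at most `order` children */
  b.max_keys = order - 1;           /* one key fewer than children */
  b.min_children = (order + 1) / 2; /* ceil(order / 2) */
  b.min_keys = b.min_children - 1;  /* again one key fewer than children */
  return b;
}
```

For order 3 this gives 3 children, 2 keys, 2 children and 1 key, matching the list above line for line.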

_parts/part8.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ We're changing the format of our table from an unsorted array of rows to a B-Tre

 With the current format, each page stores only rows (no metadata) so it is pretty space efficient. Insertion is also fast because we just append to the end. However, finding a particular row can only be done by scanning the entire table. And if we want to delete a row, we have to fill in the hole by moving every row that comes after it.

-If we stored the table as an array, but kept rows sorted by id, we could use binary search to find a particular id. However, insertion would have the same problem as deletion where we have to move a lot of rows to make space.
+If we stored the table as an array, but kept rows sorted by id, we could use binary search to find a particular id. However, insertion would be slow because we would have to move a lot of rows to make space.

 Instead, we're going with a tree structure. Each node in the tree can contain a variable number of rows, so we have to store some information in each node to keep track of how many rows it contains. Plus there is the storage overhead of all the internal nodes which don't store any rows. In exchange for a larger database file, we get fast insertion, deletion and lookup.
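Note on this hunk: the trade-off in the reworded sentence can be made concrete with a sketch of the sorted-array alternative. This is not code from the tutorial. Rows are reduced to bare `uint32_t` ids for brevity, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Binary search: index of the first id that is >= the target (logarithmic time). */
static size_t lower_bound(const uint32_t* ids, size_t count, uint32_t id) {
  size_t lo = 0, hi = count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (ids[mid] < id) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

/* Insert an id while keeping the array sorted. The caller must guarantee room
   for one more element. Returns how many existing ids had to be shifted. */
static size_t sorted_insert(uint32_t* ids, size_t count, uint32_t id) {
  size_t pos = lower_bound(ids, count, id);
  size_t moved = count - pos;
  memmove(&ids[pos + 1], &ids[pos], moved * sizeof(uint32_t));
  ids[pos] = id;
  return moved;
}
```

Finding the slot takes a logarithmic number of comparisons, but inserting an id smaller than every existing one shifts the whole array, which is the linear cost the paragraph refers to.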
