
something wrong with tablet size calculation #5408

Closed
willem520 opened this issue May 9, 2020 · 10 comments
Labels
area/operations Related to operational aspects of the DB, including signals, flags, env vars, etc.
kind/bug Something is broken.
status/accepted We accept to investigate/work on it.

Comments


willem520 commented May 9, 2020

What version of Dgraph are you using?

Dgraph v20.03.1

Have you tried reproducing the issue with the latest release?

yes

What is the hardware spec (RAM, OS)?

CentOS (126 GB RAM)

Steps to reproduce the issue (command/config used to run Dgraph).

I noticed the tablet size is more than the disk capacity.

My machine's disk capacity is: [screenshot]

The p directory size is: [screenshot]

When I used the /state endpoint, I got this result: [screenshot]

From Ratel, I got this result: [screenshot]

The md5id and gid tablets total about 6.1 TB, but my disk capacity is 2.9 TB.

Expected behaviour and actual result.

Tablet sizes should be calculated correctly.

Related to https://discuss.dgraph.io/t/ratel-predicate-capactiy

@jarifibrahim jarifibrahim added area/operations Related to operational aspects of the DB, including signals, flags, env vars, etc. kind/bug Something is broken. status/accepted We accept to investigate/work on it. labels May 9, 2020
martinmr (Contributor) commented Jun 11, 2020

I haven't been able to reproduce this exact issue, but it looks like something else is wrong. When calculating the sizes, the function skips all the tables with the following errors:

alpha1    | I0611 23:25:56.138946      14 draft.go:1245] Calculating tablet sizes. Found 4 tables
alpha1    | I0611 23:25:56.139087      14 draft.go:1254] Unable to parse key: Invalid size 25185 for key [33 98 97 100 103 101 114 33 104 101 97 100 255 255 255 255 255 255 255 254]
alpha1    | I0611 23:25:56.139145      14 draft.go:1254] Unable to parse key: Invalid size 25185 for key [33 98 97 100 103 101 114 33 104 101 97 100 255 255 255 255 255 255 255 254]
alpha1    | I0611 23:25:56.139560      14 draft.go:1254] Unable to parse key: Invalid size 25185 for key [33 98 97 100 103 101 114 33 104 101 97 100 255 255 255 255 255 255 255 254]
alpha1    | I0611 23:25:56.139703      14 draft.go:1254] Unable to parse key: Invalid size 25185 for key [33 98 97 100 103 101 114 33 104 101 97 100 255 255 255 255 255 255 255 254]
alpha1    | I0611 23:25:56.139773      14 draft.go:1276] No tablets found.

The error happens when trying to read the biggest key of the table and parse it into a Dgraph key. Note that the smallest (left) key can be read without issue. Also, the value is the same in all four tables. Maybe there's a special key at the end of a badger table?

I don't think there's an error with Dgraph itself because my cluster is working fine.

@jarifibrahim Do you have any insight into why the right keys of the tables might be different from what Dgraph expects?
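For readers following along, here is a minimal sketch of the kind of loop behind the log lines above. The tableInfo type and parsePredicate helper are illustrative stand-ins (tableInfo mirrors the Left/Right/EstimatedSz fields of badger's TableInfo, parsePredicate stands in for Dgraph's key parser); this is not the actual draft.go code.

```go
package tabletsize

import (
	"errors"
	"fmt"
)

// tableInfo mirrors the badger TableInfo fields used here: Left/Right are
// the smallest and biggest keys in a table, EstimatedSz its estimated size.
type tableInfo struct {
	Left, Right []byte
	EstimatedSz uint64
}

// parsePredicate stands in for Dgraph's key parser. It rejects keys that
// are not Dgraph data keys, such as badger's internal !badger!head! key.
func parsePredicate(key []byte) (string, error) {
	if len(key) == 0 || key[0] == '!' {
		return "", errors.New("invalid key")
	}
	// A real Dgraph key encodes a type byte, the predicate, and a uid;
	// this sketch simply treats the whole key as the predicate name.
	return string(key), nil
}

// tabletSizes sketches the loop behind the log above: a table counts toward
// a predicate only if both boundary keys parse and name the same predicate.
func tabletSizes(tables []tableInfo) map[string]uint64 {
	fmt.Printf("Calculating tablet sizes. Found %d tables\n", len(tables))
	sizes := make(map[string]uint64)
	for _, t := range tables {
		left, err := parsePredicate(t.Left)
		if err != nil {
			fmt.Printf("Unable to parse key: %v\n", err)
			continue
		}
		right, err := parsePredicate(t.Right) // fails when Right is !badger!head!
		if err != nil {
			fmt.Printf("Unable to parse key: %v\n", err)
			continue // the table is skipped and its size never counted
		}
		if left != right {
			continue // table spans more than one predicate; also skipped
		}
		sizes[left] += t.EstimatedSz
	}
	return sizes
}
```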

jarifibrahim (Contributor) commented Jun 12, 2020

@martinmr 33 98 97 100 103 101 114 33 104 101 97 is the !badger!head! key. The table contains keys inserted by Dgraph, but it also contains the internal keys inserted by badger. The biggest key could be an internal badger key. See #5026 as well.

Also, the value is the same in all four tables. Maybe there's a special key at the end of a badger table?

Each level 0 table has one !badger!head! key.
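Since every key badger writes for its own bookkeeping carries the !badger! prefix, a caller can screen such keys out cheaply. A minimal helper, continuing the sketch above (the prefix is badger's; the helper name is ours):

```go
package tabletsize

import "bytes"

// badgerPrefix marks keys that badger inserts for its own bookkeeping,
// such as the !badger!head! key at the end of every level-0 table.
var badgerPrefix = []byte("!badger!")

// isInternalKey reports whether a key was written by badger itself
// rather than by the application (here, Dgraph).
func isInternalKey(key []byte) bool {
	return bytes.HasPrefix(key, badgerPrefix)
}
```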

martinmr (Contributor)

@jarifibrahim OK. So to deal with this, should I iterate through the table backwards until I find a valid key? Can I do something like dgraph-io/badger#1309 but from the dgraph side?

jarifibrahim (Contributor)

@martinmr, the last time we spoke to @manishrjain, he suggested that it's okay to skip some tables. @parasssh would also remember this discussion.

So to deal with this should I iterate through the table backwards until I find a valid key? Can I do something like dgraph-io/badger#1309 but from the dgraph side?

The tables are not accessible outside of badger. To perform a reverse iteration you would need access to the table and the table iterator, and the tables are not exposed: the db.Tables(..) call returns TableInfo, not the actual tables. We can expose the tables from badger, and then dgraph can iterate over them however it needs to.
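If badger did expose the tables, the reverse scan martinmr describes might look like the following. Everything here is hypothetical (the table and keyIterator interfaces and the NewReverseIterator method do not exist in badger, as the comment above notes); it only illustrates the shape of the idea, reusing isInternalKey from the earlier sketch.

```go
// Hypothetical sketch only: badger does not expose these types today.
// table is an imagined handle to an SST, iterated from biggest key down.
type table interface {
	NewReverseIterator() keyIterator
}

type keyIterator interface {
	Valid() bool
	Key() []byte
	Next()
}

// rightmostValidKey walks backwards from the biggest key until it finds
// one that is not a badger-internal key (see isInternalKey above).
func rightmostValidKey(t table) ([]byte, bool) {
	for it := t.NewReverseIterator(); it.Valid(); it.Next() {
		if !isInternalKey(it.Key()) {
			return it.Key(), true
		}
	}
	return nil, false
}
```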

parasssh (Contributor) commented Jun 13, 2020

Correct. The tablet size is really just a rough estimate. Unless the entire table consists of keys from the same predicate, dgraph will skip it in the tablet size calculation.

Having said that, I think we should have TableInfo.Right point to the rightmost valid key instead of a badger-internal key, so the error is not seen on the dgraph side. After all, the Right field is exported, so applications may access it presuming it to be a valid key (and not an internal badger key).

Alternatively, or additionally, on the dgraph side we can make the tablet size calculation rely only on the Left field of each TableInfo entry: as long as two consecutive Left keys have the same predicate, we include the table in the calculation.
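A sketch of that Left-only idea, reusing the tableInfo and parsePredicate stand-ins from the earlier sketch and assuming the tables arrive sorted by key range: table i is attributed to a predicate only when table i+1's Left key carries the same predicate, so the possibly badger-internal Right key is never consulted.

```go
// tabletSizesLeftOnly attributes table i to a predicate only when the next
// table's Left key parses to the same predicate; Right keys are ignored.
func tabletSizesLeftOnly(tables []tableInfo) map[string]uint64 {
	sizes := make(map[string]uint64)
	for i := 0; i+1 < len(tables); i++ {
		cur, err := parsePredicate(tables[i].Left)
		if err != nil {
			continue
		}
		next, err := parsePredicate(tables[i+1].Left)
		if err != nil || cur != next {
			continue // neighbouring table starts a different predicate
		}
		sizes[cur] += tables[i].EstimatedSz
	}
	return sizes
}
```

Note this variant can only under-count: the last table of each predicate never has a matching successor, which is consistent with martinmr's observation below that the skipping under-reports.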

martinmr (Contributor)

@jarifibrahim I implemented what @parasssh suggested above. When I load the 1 million dataset I get a total size of 3.4 GB. However, the size of the p directory (in a cluster with only one alpha running, for simplicity) is 210 MB.

One thing I don't know is whether the estimated size knows how to deal with compaction. Is it the size of the uncompressed or the compressed data? Maybe that could explain the difference I am seeing.

Otherwise, I think there's something wrong with the values EstimatedSz is reporting. The logic on the dgraph side is fairly simple, and I haven't seen any issue other than the one mentioned above (which in any case under-reports the numbers, so it doesn't explain the situation the user is seeing).

martinmr added a commit that referenced this issue Jun 15, 2020
In badger, the right key might be a badger specific key that Dgraph
cannot understand. To deal with these keys, a table is included in
the size calculation if the next table starts with the same key.

Related to DGRAPH-1358 and #5408.
jarifibrahim (Contributor)

When I load the 1 million dataset I get a total size of 3.4 GB. However, the size of the p directory (in a cluster with only one alpha running, for simplicity) is 210 MB.

One thing I don't know is whether the estimated size knows how to deal with compaction. Is it the size of the uncompressed or the compressed data? Maybe that could explain the difference I am seeing.

@martinmr How did you test this? Do you have steps that I can follow? This could be a badger bug, maybe some issue with how we do estimates in badger. The size is the estimated size of the uncompressed data, but compression cannot make such a huge difference. This is definitely a bug. Let me know how you tested it and I can verify it in badger.
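For scale, the gap martinmr reported works out to roughly a 16x over-estimate, which is why compression alone is an unconvincing explanation. A quick check on the numbers from the thread:

```go
package main

import "fmt"

func main() {
	// Quick arithmetic on the numbers reported above: the estimated
	// tablet size exceeds the on-disk p directory by roughly 16x.
	const gb, mb = float64(1 << 30), float64(1 << 20)
	estimated := 3.4 * gb // total tablet size computed from EstimatedSz
	onDisk := 210 * mb    // measured size of the p directory
	fmt.Printf("over-estimate: %.1fx\n", estimated/onDisk) // prints ≈ 16.6x
}
```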

martinmr (Contributor) commented Jun 16, 2020

1. Use this branch: "Fix: Change tablet size calculation to not depend on the right key." (#5656)
2. Change the tablet size calculation to happen once every minute instead of every five minutes.
3. Live load the 1 million dataset.
4. Wait for the tablet sizes to be calculated.

For simplicity, I used a cluster with 1 alpha and 1 zero.

EDIT: master now contains all the changes you need.

@jarifibrahim
Copy link
Contributor

@martinmr Can you look at the badger code and figure out what's wrong? The calculations are done here: https://github.com/dgraph-io/badger/blob/dd332b04e6e7fe06e4f213e16025128b1989c491/table/builder.go#L228

minhaj-shakeel (Contributor)

GitHub issues have been deprecated.
This issue has been moved to Discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.
