
SN forwards a request despite an "access to object operation denied" error #1709

Closed
aprasolova opened this issue Aug 19, 2022 · 7 comments
Labels: bug (Something isn't working), I4 (No visible changes), S3 (Minimally significant), U3 (Regular)

@aprasolova (Contributor) commented Aug 19, 2022

In a container where HEAD is denied, I execute HEAD on the object D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1. In the storage node logs I see that the node gets an "access to object operation denied" error, yet it still forwards the request into the container.

  1. Is it necessary to forward the request? I believe the local response is exhaustive.
  2. The last line of the log shows a 2049 status code, which corresponds to the "object not found" error. Is that OK? In the CLI I do get a 2048 status code, but the log message is confusing.
2022-08-19T12:52:17.211Z	debug	get/get.go:87	serving request...	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false}
2022-08-19T12:52:17.211Z	debug	get/local.go:25	local get failed	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false, "error": "status: code = 2049"}
2022-08-19T12:52:17.211Z	debug	get/get.go:108	operation finished with error	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false, "error": "status: code = 2049"}
2022-08-19T12:52:17.211Z	debug	get/container.go:18	trying to execute in container...	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false, "netmap lookup depth": 0}
2022-08-19T12:52:17.211Z	debug	get/container.go:46	process epoch	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false, "number": 7}
2022-08-19T12:52:17.211Z	debug	get/remote.go:14	processing node...	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false}
2022-08-19T12:52:17.213Z	debug	get/remote.go:34	remote call failed	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false, "error": "read object header from NeoFS: status: code = 2048 message = access to object operation denied"}
2022-08-19T12:52:17.213Z	debug	get/remote.go:14	processing node...	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false}
2022-08-19T12:52:17.215Z	debug	get/remote.go:34	remote call failed	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false, "error": "read object header from NeoFS: status: code = 2048 message = access to object operation denied"}
2022-08-19T12:52:17.215Z	debug	get/remote.go:14	processing node...	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false}
2022-08-19T12:52:17.219Z	debug	get/remote.go:34	remote call failed	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false, "error": "read object header from NeoFS: status: code = 2048 message = access to object operation denied"}
2022-08-19T12:52:17.219Z	debug	get/container.go:63	no more nodes, abort placement iteration	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false}
2022-08-19T12:52:17.219Z	debug	get/get.go:108	operation finished with error	{"component": "Object.Get service", "request": "HEAD", "address": "D6w3YA6k9FaGnM4vrv2UoyyRjGwcjLAiSDYTfcHCyBac/GhPM1dEK27HvkqdoW8L9nyJda7ZGMYBHGWhJFWMq9Eu1", "raw": true, "local": false, "with session": true, "with bearer": false, "error": "status: code = 2049"}
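
For reference, a tiny sketch of the two status codes in question, with meanings taken from the log lines and the CLI result above; the constant names are illustrative stand-ins, not actual NeoFS SDK identifiers:

```go
package main

import "fmt"

// Illustrative constants only: 2048 is logged above as "access to object
// operation denied", and 2049 is described in this issue as the
// "object not found" status. Real SDK identifiers may differ.
const (
	statusObjectAccessDenied = 2048
	statusObjectNotFound     = 2049
)

func main() {
	// The node logs 2049 for its local lookup and at the end of the
	// operation, while the CLI client reportedly receives 2048.
	fmt.Println("local lookup / final log line:", statusObjectNotFound)
	fmt.Println("status seen by the CLI client:", statusObjectAccessDenied)
}
```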

This test fails

Your Environment

NeoFS v0.31.0

@cthulhu-rider (Contributor)

First, the node tries to read the object locally no matter what, so 2049 (404) is pretty normal. After that we just see the "do until first success" behavior: in the current implementation the operation isn't aborted when an ACCESS_DENIED error is encountered, and in general I think that's OK. But for some scenarios we actually want to abort the execution on such an error, so I suggest thinking about providing a per-request option to tune this behavior.
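
For illustration, a minimal sketch of the "do until first success" traversal described above (every name here is a hypothetical stand-in, not the node's real placement code):

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the NeoFS statuses seen in the logs.
var (
	errAccessDenied = errors.New("status: code = 2048 (access denied)")
	errNotFound     = errors.New("status: code = 2049 (not found)")
)

// headFromNode simulates a remote HEAD call to one container node;
// in the scenario above every holder refuses because of the ACL.
func headFromNode(node string) (string, error) {
	return "", errAccessDenied
}

// headObject mirrors the current behavior: the local attempt has already
// failed, every remote error (including ACCESS_DENIED) is skipped, and the
// traversal stops only on the first success. When the placement is
// exhausted, the operation ends with NOT_FOUND, which would explain the
// final "code = 2049" log line above.
func headObject(placement []string) (string, error) {
	for _, n := range placement {
		hdr, err := headFromNode(n)
		if err != nil {
			continue // ACCESS_DENIED does not abort the iteration
		}
		return hdr, nil
	}
	return "", errNotFound
}

func main() {
	_, err := headObject([]string{"s01", "s02", "s03"})
	fmt.Println(err) // status: code = 2049 (not found)
}
```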

@carpawell (Member)

@cthulhu-rider, we discussed this a little with @aprasolova and think it is pretty strange behavior: if a node looks for the object locally and finds out that it should not return it according to the ACL rules, why does it ask other nodes for that object? What would change if the other nodes responded successfully? Should a node return an object that it believes it must not?

Also, by all appearances a node in that case has all the information needed to answer 2048, not 2049.

@cthulhu-rider (Contributor)

Maybe I haven't understood the issue correctly. In the described scenario the object isn't stored locally, yes? So how can

node looks for the object locally and finds out that it should not return it according to the ACL rules

?

@carpawell (Member) commented Sep 14, 2022

object isn't stored locally, yes?

@cthulhu-rider, no, the object IS stored locally, and the node is not able to get it because of the ACL. The logic is: "could not get it locally (no matter why), then ask other nodes". It's not clear why. If the local engine returns 2048 (access denied), RPCs to other nodes seem redundant.
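
A rough sketch of what is being suggested here, with hypothetical types rather than the node's real engine API: if the local engine already reports access denied, return that verdict instead of falling through to remote calls:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins; the real node uses its own status types.
var (
	errAccessDenied = errors.New("status: code = 2048 (access denied)")
	errNotFound     = errors.New("status: code = 2049 (not found)")
)

// localEngine models a storage engine that holds the object but denies
// the HEAD operation because of the container ACL.
type localEngine struct{ denied bool }

func (e localEngine) Head(addr string) (string, error) {
	if e.denied {
		return "", errAccessDenied
	}
	return "object header", nil
}

// headObject returns the local ACL verdict immediately; only other local
// failures (e.g. the object simply is not stored here) would still trigger
// the usual traversal of the remaining container nodes.
func headObject(e localEngine, addr string, remotes []string) (string, error) {
	hdr, err := e.Head(addr)
	switch {
	case err == nil:
		return hdr, nil
	case errors.Is(err, errAccessDenied):
		return "", err // asking other nodes cannot change this verdict
	}
	for range remotes {
		// remote HEAD calls would go here
	}
	return "", errNotFound
}

func main() {
	_, err := headObject(localEngine{denied: true}, "CID/OID", []string{"s02", "s03"})
	fmt.Println(err) // status: code = 2048 (access denied)
}
```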

roman-khimov added the bug (Something isn't working), U3 (Regular), S3 (Minimally significant), I4 (No visible changes) labels and removed the triage label on Dec 21, 2023
@cthulhu-rider (Contributor) commented Apr 9, 2024

The behavior continues to amaze for reading ops: https://rest.fs.neo.org/HXSaMJXk2g8C14ht8HSi7BBaiYZ1HeWh2xnWPGQCg4H6/1487-1712598182/index.html#suites/7d0d2bff4573ced7a233aef01d2141a2/d40aabea2b91708f/

I see the following possible options:

  1. abort the survey on any status response (except the Not Found one, of course) and forward it
  2. if all nodes failed, return a multi-status response
  3. extending 2: return a simple status when it is received from all nodes

Option 1 finishes faster, but it may introduce inconsistency when nodes respond with different statuses (buggy, malicious, etc.). At the same time, this is OK for specific networks that keep node versions in sync. It also gives the best UX for the "good" case (like the one described in this issue).

Option 2 is the clearest but also the most costly. Option 3 lightens the response to the original client as the number of nodes in the container grows.

Option 1 is easier to implement, since the responses of options 2 and 3 have never been supported before and would require protocol extensions.

In total, I suggest to:

  • change the behavior to option 1 (sketched below)
  • consider an option to enable mode 3: as a future default behavior (best to me), a request flag (good to me), or a node option (worse to me)

@roman-khimov @carpawell
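
For illustration, a rough sketch of option 1 with the same kind of hypothetical stand-ins (not the node's real code): the survey stops on the first status other than Not Found and forwards it to the client:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the NeoFS statuses.
var (
	errAccessDenied = errors.New("status: code = 2048 (access denied)")
	errNotFound     = errors.New("status: code = 2049 (not found)")
)

// headFromNode simulates a remote HEAD call; here the first node denies
// access and the remaining ones are never asked.
func headFromNode(node string) (string, error) {
	return "", errAccessDenied
}

// headObjectOption1 stops the survey on any status except Not Found and
// forwards that status to the client, instead of silently trying the
// remaining nodes.
func headObjectOption1(placement []string) (string, error) {
	for _, n := range placement {
		hdr, err := headFromNode(n)
		switch {
		case err == nil:
			return hdr, nil
		case errors.Is(err, errNotFound):
			continue // this node simply does not hold the object
		default:
			return "", err // meaningful status: abort and forward it
		}
	}
	return "", errNotFound
}

func main() {
	_, err := headObjectOption1([]string{"s01", "s02", "s03"})
	fmt.Println(err) // status: code = 2048 (access denied)
}
```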

@roman-khimov (Member)

abort the survey on any status response

Dangerous for open networks, easy to DoS.

return a multi-status response

Clients will freak out on this.

return a simple status when it is received from all nodes

Makes sense even for 2/3+ replies.

But at the same time can ACL checks be completely done locally? Because

If the local engine returns 2048 (access denied), RPCs to other nodes seem redundant.

This is the correct thing to do in general: the SN is being asked for something, and if it can check ACLs locally, then any problems with them can be returned to the client immediately. If the client is not satisfied (bad node), it can try another SN.

@cthulhu-rider (Contributor)

Clients will freak out on this.
...
If the client is not satisfied (bad node), it can try another SN.

This is what really freaks clients out. Everyone expects responsive and transparent behavior. I meant returning a multi-status only when a single status cannot be derived by majority, e.g. when N container nodes returned N different statuses. The server got them and returned them, so the client is not burdened with collecting them on its own (a possible aggregation rule is sketched below).

With this, the behavior of container nodes (one of the core NeoFS features) becomes even more transparent to the client.
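
For illustration, a small sketch of such an aggregation rule; all names are hypothetical, and the multi-status fallback is not part of the protocol today. A single status is returned when one value reaches a strict majority among the surveyed nodes; otherwise the full per-node set would have to be returned:

```go
package main

import "fmt"

// status stands in for a NeoFS API status code (e.g. 2048, 2049).
type status uint32

// aggregate returns a single status when one value is reported by a strict
// majority of the surveyed nodes; otherwise ok is false and the caller
// would fall back to a multi-status response carrying all replies.
func aggregate(perNode map[string]status) (single status, ok bool) {
	counts := make(map[status]int)
	for _, st := range perNode {
		counts[st]++
		if counts[st]*2 > len(perNode) {
			return st, true
		}
	}
	return 0, false
}

func main() {
	// All holders denied access: a single 2048 can be returned.
	fmt.Println(aggregate(map[string]status{"s01": 2048, "s02": 2048, "s03": 2048}))

	// Disagreement with no majority: a multi-status would be needed.
	fmt.Println(aggregate(map[string]status{"s01": 2048, "s02": 2049, "s03": 2052}))
}
```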

But at the same time can ACL checks be completely done locally?

They are done locally. If the object holder sees an access violation, it aborts the request immediately. The behavior discussed here concerns transit nodes without the object, and they behave differently.

Makes sense even for 2/3+ replies.

Such a threshold is not defined anywhere; right now it sounds like fiction. A proposal is welcome.


This topic may also affect the case when most nodes, for example, refuse to serve the request while the "last" node responds. Currently this counts as a success because GET makes a best effort. At the same time, this differs greatly from the "all OK" case.

roman-khimov modified the milestones: v0.42.0, v0.41.1 on Apr 19, 2024
carpawell self-assigned this on Apr 26, 2024
carpawell added a commit that referenced this issue Apr 27, 2024
Closes #1709.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>