Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

littlefs: Add lfs_file_getattr and lfs_file_setattr #1045

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

xiaoxiang781216
Copy link

@xiaoxiang781216 xiaoxiang781216 commented Nov 16, 2024

which are counterpart of lfs_getattr and lfs_setattr but work with lfs_file_t not a file path

which are counterpart of lfs_getattr and lfs_setattr
but work with lfs_file_t not a file path

Signed-off-by: zhouliang3 <zhouliang3@xiaomi.com>
@xiaoxiang781216
Copy link
Author

@geky I think the ci error isn't related to this patch:

Secret source: None
Prepare workflow directory
Prepare all required actions
Getting action download info
Error: This request has been automatically failed because it uses a deprecated version of `actions/upload-artifact: v2`. Learn more: https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/

@geky
Copy link
Member

geky commented Nov 18, 2024

Hi @xiaoxiang781216, thanks for creating a PR.

@geky I think the ci error isn't related to this patch:

You are correct, sorry about that. GitHub introduced some breaking changes somewhat recently and it's made things a bit of a mess. This would need to be rebased on to the devel branch to fix CI.


This isn't the first time lfs_file_getattr and lfs_file_setattr have been proposed (#580, #636). I understand the motivation for it, but unfortunately I don't see how it can fit in littlefs's data model.

In littlefs, writes to a file are not committed until lfs_file_sync or lfs_file_close. The data may be written to disk, but the metadata is not updated until one of these functions is called. This prevents partially written data from appearing on disk after a powerloss, and helps write safe powerloss-resilient applications.

This gets tricky for metadata, where we can't rely on on-disk copy-on-write data structures. The current assumption is that a file's metadata fits in RAM, with custom attributes being backed by the user-provided lfs_file_config.attrs array.

Calling lfs_dir_commit in lfs_file_setattr is enticing, but unfortunately breaks this model by writing attrs to disk immediately, potentially breaking applications if power is lost.

The only way I could see lfs_file_setattr working in this model is if it appended to a RAM-backed array, but this would require either dynamic memory or the user to providing lfs_file_config.attrs. And if users are already providing lfs_file_config.attrs, I think it's easier to manipulate custom attrs through the array directly...

@crafcat7
Copy link

crafcat7 commented Nov 20, 2024

Hello, I'm glad to see you reply to this PR. I have two questions about it:

  1. Suppose we distinguish the problem between lfs_file_getattr and lfs_file_setattr. I understand that lfs_file_getattr is reasonable at present, but the implementation of lfs_file_setattr does not meet the design expectations of littlefs. In this case, can we first introduce lfs_file_getattr as a new interface to littlefs as another way to obtain file attributes?

Calling lfs_dir_commit in lfs_file_setattr is enticing, but unfortunately breaks this model by writing attrs to disk immediately, potentially breaking applications if power is lost.

  1. Combining the first point, can we adjust the implementation of lfs_file_setattr to implement a RAM array to store this attribute field, and let it actually save during sync / close to solve the problem?

@geky
Copy link
Member

geky commented Nov 21, 2024

  1. Suppose we distinguish the problem between lfs_file_getattr and lfs_file_setattr. I understand that lfs_file_getattr is reasonable at present, but the implementation of lfs_file_setattr does not meet the design expectations of littlefs. In this case, can we first introduce lfs_file_getattr as a new interface to littlefs as another way to obtain file attributes?

Yes! I'm not opposed to it. lfs_file_getattr may be quite useful on its own.

I've considered adding it myself, but there is one extremely niche corner case that has made me hesitate. Consider removing a file with attrs:

lfs_file_open(&lfs, &file, "file.txt", LFS_O_RDONLY) => 0;
lfs_remove(&lfs, "file.txt") => 0;
lfs_file_getattr(&lfs, &file, 't', buffer, sizeof(buffer)) => ?;

If these attrs live on disk, lfs_file_getattr will fail. The concerning thing is that this is different from POSIX filesystems, where getxattr would succeed up until you close the file. This is because most POSIX filesystems use inodes and littlefs doesn't, which is a complicated design tradeoff that creates a lot of problems.

Is this practically an issue? To be honest I'm not sure.

On one hand it risks burning users with niche POSIX incompatibilities. On the other hand lfs_file_getattr may be useful enough that this unexpected behavior is worth it. It's hard to know.

So, I'm not opposed to it, but that's why I've held off so far.

--

Though before bringing it in, lfs_file_getattr would need some work:

  • lfs_file_getattr should also check lfs_file_config.attrs.

    We probably want user-specified attrs to take priority over on-disk attrs.

  • If we add lfs_file_getattr, we should probably also add lfs_dir_getattr.

    On one hand, lfs_dir_getattr is much less useful. On the other hand, it should be basically the same function.

  • New APIs need new tests.

    See tests/test_attrs.toml, adding tests here may be a good start.

    This can be extremely tedious, but is also extremely valuable.

Calling lfs_dir_commit in lfs_file_setattr is enticing, but unfortunately breaks this model by writing attrs to disk immediately, potentially breaking applications if power is lost.

  1. Combining the first point, can we adjust the implementation of lfs_file_setattr to implement a RAM array to store this attribute field, and let it actually save during sync / close to solve the problem?

Yes, this would satisfy littlefs's powerloss model. But it would the be the only API that requires malloc to work.

My original thinking was that lfs_file_config.attrs would be sufficient to allow this to be done on the user's side of things, but thinking about it now this doesn't allow dynamically adding new attrs. I'm not sure such an API is possible but I'll have to think about it...

At least lfs_file_config.attrs should be sufficient if you have a known-set of system-wide attrs.

If we do add a malloc-backed lfs_file_setattr, I think it would at least need to be opt-in instead of opt-out. Requiring some define like LFS_FILE_SETATTR or LFS_DYN before being available.

A malloc-backed lfs_file_setattr would be pretty bad in terms of heap fragmentation, though I realize most of the systems that would use this probably wouldn't really care.

@crafcat7
Copy link

crafcat7 commented Nov 25, 2024

Though before bringing it in, lfs_file_getattr would need some work:

  • lfs_file_getattr should also check lfs_file_config.attrs.
    We probably want user-specified attrs to take priority over on-disk attrs.

Thanks for your comments. I split out lfs_file_getattr into #1047 and improved it based on your suggestion.

I've considered adding it myself, but there is one extremely niche corner case that has made me hesitate. Consider removing a file with attrs:

lfs_file_open(&lfs, &file, "file.txt", LFS_O_RDONLY) => 0;
lfs_remove(&lfs, "file.txt") => 0;
lfs_file_getattr(&lfs, &file, 't', buffer, sizeof(buffer)) => ?;

If these attrs live on disk, lfs_file_getattr will fail. The concerning thing is that this is different from POSIX filesystems, where getxattr would succeed up until you close the file. This is because most POSIX filesystems use inodes and littlefs doesn't, which is a complicated design tradeoff that creates a lot of problems.

Is this practically an issue? To be honest I'm not sure.

On one hand it risks burning users with niche POSIX incompatibilities. On the other hand lfs_file_getattr may be useful enough that this unexpected behavior is worth it. It's hard to know.

So, I'm not opposed to it, but that's why I've held off so far.

I also found this problem during testing. Are there differences in the way different file systems handle some details? Perhaps this behavior is acceptable?🤔

  • If we add lfs_file_getattr, we should probably also add lfs_dir_getattr.
    On one hand, lfs_dir_getattr is much less useful. On the other hand, it should be basically the same function.

As for the implementation of lfs_dir_getattr, I don't have a good idea to implement it yet. Maybe I will have inspiration in the future, or someone will implement it.😢

Calling lfs_dir_commit in lfs_file_setattr is enticing, but unfortunately breaks this model by writing attrs to disk immediately, potentially breaking applications if power is lost.

  1. Combining the first point, can we adjust the implementation of lfs_file_setattr to implement a RAM array to store this attribute field, and let it actually save during sync / close to solve the problem?

Yes, this would satisfy littlefs's powerloss model. But it would the be the only API that requires malloc to work.

My original thinking was that lfs_file_config.attrs would be sufficient to allow this to be done on the user's side of things, but thinking about it now this doesn't allow dynamically adding new attrs. I'm not sure such an API is possible but I'll have to think about it...

At least lfs_file_config.attrs should be sufficient if you have a known-set of system-wide attrs.

If we do add a malloc-backed lfs_file_setattr, I think it would at least need to be opt-in instead of opt-out. Requiring some define like LFS_FILE_SETATTR or LFS_DYN before being available.

A malloc-backed lfs_file_setattr would be pretty bad in terms of heap fragmentation, though I realize most of the systems that would use this probably wouldn't really care.

Thank you for your idea. I will consider whether there is a simpler and more efficient way to implement the modification of lfs_file_setattr.

@geky
Copy link
Member

geky commented Dec 2, 2024

Thanks for creating #1047. I'm sorry but I think it may need to be left unmerged for a while.

I think the custom attribute API in its entirety needs an overhaul in order to hammer out these corner cases.

I also found this problem during testing. Are there differences in the way different file systems handle some details? Perhaps this behavior is acceptable?🤔

This is a good question. But digging around it seems most filesystems that support custom attributes also use inodes, which avoids the problem (inodes on disk are only cleaned up after the last file close).

Which raises the question should littlefs introduce inodes? The downside of inodes is they introduce more complexity, add more things that can go wrong during powerloss, etc, and increase the overhead of small inline files. On the other inodes would solve the problem of unique per-file ids, which some users have asked for...

As for the implementation of lfs_dir_getattr, I don't have a good idea to implement it yet. Maybe I will have inspiration in the future, or someone will implement it.😢

In hindsight we probably don't really need lfs_dir_getattr. It's unlikely to be used in practice and more API surface to test.

The fact that lfs_file_opencfg exists and lfs_dir_opencfg doesn't is already indication that these APIs don't need to be perfectly symmetrical.

@geky
Copy link
Member

geky commented Dec 2, 2024

Thank you for your idea. I will consider whether there is a simpler and more efficient way to implement the modification of lfs_file_setattr.

This has also been in the back of my mind the past week. A couple ideas how we could improve things long term:

  1. One long-term planned feature is to add a set of APIs that operate on the low-level per-mutation id. In theory these could be used to emulate lfs_file_getattr/setattr like so:

    lfs_getattrmid(&lfs, lfs_file_mid(&file), 't', buffer, sizeof(buffer));
    lfs_setattrmid(&lfs, lfs_file_mid(&file), 't', buffer, sizeof(buffer));

    However they would still have the above limitations -- not able to access uncommitted file attrs, not able to access attrs after remove, etc.

    But they could still be practically useful for users without overcommitting the littlefs API.

  2. We could add an optional opt-in API that calls malloc to populate the attr array during lfs_file_open.

    This would solve all problems with the theoretical lfs_file_setattr/getattr API, since the custom attributes would be backed by RAM. But the dependency on malloc runs counter to littlefs's goals and design.

    I'm also not sure how users would take an API that can introduce potentially unbounded memory use during lfs_file_open...

    But this would provide the easiest API for users, so it may be worth adding at some point if only for the "big" systems where no one cares about an extra 4KiB malloc.

  3. In looking into the feature mismatch between the current API and malloc API, one glaring issue is that the file's attr array does not allow for changing the number of attrs after lfs_file_open.

    This is tricky, and I'm not really sure what an API would look like that would allow users to change the memory that backs attrs in open files.

    We could add a sort of lfs_file_pendattr function that adds new attrs to the file's attr list, with the caveat that the provided memory would need to remain allocated until lfs_file_close. But such an API would be tricky to implement and I worry it would be very confusing for users.

    I don't think this is a promising route API-wise.

  4. Something interesting to note is that while the custom attribute API seems unbounded, in practice attrs attached to a file are limited to the size of the metadata block.

    One could imagine, instead of giving lfs_file_opencfg a list of attrs to populate, just giving it a block of memory to fill with whatever attrs reside on disk. Potentially returning LFS_ERR_NOMEM if the on-disk attrs cannot fit in the provided memory block. This could provide the backing memory necessary to implement a problem-free lfs_file_setattr/getattr without needing to invoke malloc.

    Alternatively lfs_file_opencfg could succeed but mark the last attr with LFS_ERR_NOMEM somehow to indicate not all attrs were read.

    We could even add an additional limit in the superblock, something like attrs_max or file_attrs_max, that could prevent attr updates that would cause lfs_file_opencfg to fail.

    The downsides of this approach is that you 1. no longer control which attrs you read in lfs_file_opencfg, 2. can no longer point the attr buffer at in-struct fields, though maybe this is a misfeature, and 3. is a bit more cumbersome to attach a fixed set of attrs in every lfs_file_opencfg.

    I think realistically this block-of-memory approach can't replace the existing attr-array API, but there's nothing stopping them both from living side-by-side.

    Though it does make naming things a pain, maybe static_attrs and dyn_attrs?

Just brainstorming, but I think long-term the best solution will be to add some mix of 4., 1., and 2., to cover the different use cases.

I didn't expect this API to become such a mess when first adding it!

@geky
Copy link
Member

geky commented Dec 2, 2024

And sorry for blocking #1047, but I think it's worth waiting on it.

Note that if we adopt 4., lfs_file_getattr should probably return LFS_ERR_NOMEM for attrs that are on-disk, but don't fit in the configured memory block. But in #1047 it would succeed.

These API differences would be difficult to fix retroactively.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants