Skip to content

Conversation

@liconghui628
Copy link

@liconghui628 liconghui628 commented Oct 21, 2025

The internal interface of littlfs calls lfs_dir_fetchmatch with a null cb parameter, and lfs_dir_fetchmatch does not internally determine cb, which may directly call and cause the program to crash.

static int lfs_dir_fetch(lfs_t *lfs,
lfs_mdir_t *dir, const lfs_block_t pair[2]) {
// note, mask=-1, tag=-1 can never match a tag since this
// pattern has the invalid bit set
return (int)lfs_dir_fetchmatch(lfs, dir, pair,
(lfs_tag_t)-1, (lfs_tag_t)-1, NULL, NULL, NULL);
}

@geky-bot
Copy link
Collaborator

Tests passed ✓, Code: 17132 B (+0.0%), Stack: 1448 B (+0.0%), Structs: 812 B (+0.0%)
Code Stack Structs Coverage
Default 17132 B (+0.0%) 1448 B (+0.0%) 812 B (+0.0%) Lines 2438/2599 lines (-0.0%)
Readonly 6238 B (+0.1%) 448 B (+0.0%) 812 B (+0.0%) Branches 1289/1626 branches (-0.0%)
Threadsafe 17984 B (+0.0%) 1448 B (+0.0%) 820 B (+0.0%) Benchmarks
Multiversion 17204 B (+0.0%) 1448 B (+0.0%) 816 B (+0.0%) Readed 29000746676 B (+0.0%)
Migrate 18796 B (+0.0%) 1752 B (+0.0%) 816 B (+0.0%) Proged 1482895246 B (+0.0%)
Error-asserts 17956 B (+0.0%) 1440 B (+0.0%) 812 B (+0.0%) Erased 1568921600 B (+0.0%)

@BenBE
Copy link

BenBE commented Oct 21, 2025

Rationale as explained in #1158 (comment)

@geky geky added the needs investigation no idea what is wrong label Oct 22, 2025
@geky
Copy link
Member

geky commented Oct 22, 2025

Hi @liconghui628, thanks for creating an issue + PR.

@geky AFAICT, the place where the tag is checked we expect valid data, which is not the case. Given the call in lfs_dir_fetch provides placeholder values for the ftag might be treated as a sign to ignore the callback. A more robust check would amend

Hmmm. It's true lfs_dir_fetch/lfs_dir_fetchmatch need to written defensively and expect unwritten/corrupt storage. But I don't think this can be reached, even if corrupt?

Earlier in this loop we check if the tag is "valid" (if the sign bit is clear), and break if it's not:

littlefs/lfs.c

Lines 1182 to 1186 in adad0fb

// next commit not yet programmed?
if (!lfs_tag_isvalid(tag)) {
// we only might be erased if the last tag was a crc
maybeerased = (lfs_tag_type2(ptag) == LFS_TYPE_CCRC);
break;

lfs_tag_isvalid impl:

littlefs/lfs.c

Lines 351 to 353 in adad0fb

static inline bool lfs_tag_isvalid(lfs_tag_t tag) {
return !(tag & 0x80000000);
}

Nothing else touches tag between that check and (fmask & tag) == (fmask & ftag), so, unless I'm missing something, the callback should never be called when fmask=ftag=-1.

Short of in-RAM corruption or some other bug.

@liconghui628
Copy link
Author

liconghui628 commented Oct 23, 2025

Hi @liconghui628, thanks for creating an issue + PR.

@geky AFAICT, the place where the tag is checked we expect valid data, which is not the case. Given the call in lfs_dir_fetch provides placeholder values for the ftag might be treated as a sign to ignore the callback. A more robust check would amend

Hmmm. It's true lfs_dir_fetch/lfs_dir_fetchmatch need to written defensively and expect unwritten/corrupt storage. But I don't think this can be reached, even if corrupt?

Earlier in this loop we check if the tag is "valid" (if the sign bit is clear), and break if it's not:

littlefs/lfs.c

Lines 1182 to 1186 in adad0fb

// next commit not yet programmed?
if (!lfs_tag_isvalid(tag)) {
// we only might be erased if the last tag was a crc
maybeerased = (lfs_tag_type2(ptag) == LFS_TYPE_CCRC);
break;

lfs_tag_isvalid impl:

littlefs/lfs.c

Lines 351 to 353 in adad0fb

static inline bool lfs_tag_isvalid(lfs_tag_t tag) {
return !(tag & 0x80000000);
}

Nothing else touches tag between that check and (fmask & tag) == (fmask & ftag), so, unless I'm missing something, the callback should never be called when fmask=ftag=-1.

Short of in-RAM corruption or some other bug.

@geky This question is indeed very strange. I tried to add some printing information, and in fact, it did call up here.

[2025-10-20 18:59:54]  lfs_trace:3730: lfs_mount -> 0
[2025-10-20 18:59:54]  milfs_mount_log 282 lfs_mout ret 0
[2025-10-20 18:59:54]  lfs_trace:1962: lfs_dir_open(62FD54B8, 62FCE2C4, "/")
[2025-10-20 18:59:54]  lfs_trace:1974: lfs_dir_open -> 0x3ff
[2025-10-20 18:59:54]  lfs_dir_fetchmatch 767
[2025-10-20 18:59:54]  milfs_flash_read_log block=1 off=0
[2025-10-20 18:59:54]  lfs_dir_fetchmatch 783
[2025-10-20 18:59:54]  lfs_dir_fetchmatch 791
[2025-10-20 18:59:54]  lfs_dir_fetchmatch 874
[2025-10-20 18:59:54]  lfs_dir_fetchmatch 891
[2025-10-20 18:59:54]  lfs_dir_fetchmatch 895
[2025-10-20 18:59:54]  lfs_dir_fetchmatch 929 cb=00000000
[2025-10-20 18:59:54]  Exception Entry--->>>
[2025-10-20 18:59:54]  mcause 38000001, mepc 00000000, mtval 00000000

[2025-10-20 18:59:54]  Exception code: 1

[2025-10-20 18:59:54]    msg: Instruction access fault
[2025-10-20 18:59:54]  === backtrace ===

[2025-10-20 18:59:54]  addr2line -e <your elf file>.elf -fp 0xa00a715a 0xa0027dbc 0x0 0xa013cd30 0xa013fbd4 0xa013feac 0xa006db38 

[2025-10-20 18:59:54]  
[2025-10-20 18:59:54]  -+-+-+- BFLB COREDUMP v0.0.1 +-+-+-+

lfs.c

@geky
Copy link
Member

geky commented Oct 23, 2025

Ah! @liconghui628, this looks like an older version of the codebase.

In particular it's missing this patch: #337, which was merged in v2.1.4.

I believe this is the same issue.

@geky geky added fixed? and removed needs investigation no idea what is wrong labels Oct 23, 2025
@BenBE
Copy link

BenBE commented Oct 23, 2025

Would it make sense to include the NULL pointer check anyway; and be it as an assertion at the call site of cb? That way you could at least provide a somewhat reasonable message for debugging in case there was some regression.

@geky
Copy link
Member

geky commented Oct 23, 2025

Would it make sense to include the NULL pointer check anyway; and be it as an assertion at the call site of cb? That way you could at least provide a somewhat reasonable message for debugging in case there was some regression.

I'm not opposed to it, but what would it really add over hardfaulting? Both point out this cb is NULL when it shouldn't be.

There's a lot of things that can be NULL in C, fortunately even microcontrollers these days make NULL dereferences "safe" and easy to find. The log @liconghui628 shared even had a full backtrace + addr2line invocation we could have looked at if we needed more info.

@liconghui628
Copy link
Author

@geky @BenBE Thank you for your patient explanation. From the perspective of internal implementation alone, I think NULL detection is necessary because it is uncertain whether external modifications will be executed here in the future, the impact on business is quite significant. We just need to make the interface implementation more robust, without worrying about how it is called externally.

@bmcdonnell-fb
Copy link

Would it make sense to include the NULL pointer check anyway; and be it as an assertion at the call site of cb?

I'm not opposed to it, but what would it really add over hardfaulting? Both point out this cb is NULL when it shouldn't be.

IMO, it's valuable to make intent/expectations explicit between assert(ptr != NULL) (ptr should never be NULL) and if (ptr == NULL) return; (it might be NULL, just ignore it). It can help in reasoning/debugging, even if the assert behavior were no different than letting it hard fault.

@liconghui628 liconghui628 changed the title Update lfs.c fix lfs_dir_fetchmatch crash issue Add null pointer check for lfs_dir_fetchmatch Oct 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants