Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entry cache component #1515

Merged
merged 32 commits into from
Nov 12, 2024
Merged
Show file tree
Hide file tree
Changes from 25 commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
1e3d88f
Adding listing caching option
vibhansa-msft Jul 13, 2024
4bf33fd
Optimize list and readlink
vibhansa-msft Jul 13, 2024
6f6b626
Update version for private builds
vibhansa-msft Jul 13, 2024
866f654
UT correction
vibhansa-msft Jul 15, 2024
c3fb4cf
Remove metadata retrived related flag
vibhansa-msft Jul 15, 2024
3178b50
Remove symlink stuff from attr_cache
vibhansa-msft Jul 15, 2024
4aaf202
correcting ut
vibhansa-msft Jul 15, 2024
fca0dba
Correct ut
vibhansa-msft Jul 15, 2024
86d40f1
UT correction
vibhansa-msft Jul 15, 2024
2c11d5b
UT correction
vibhansa-msft Jul 15, 2024
3edac72
Uncomment readlink test cases
vibhansa-msft Jul 16, 2024
8523768
Correcting ut
vibhansa-msft Jul 16, 2024
98d941e
Merge remote-tracking branch 'origin/main' into vibhansa/entry_cache
vibhansa-msft Jul 16, 2024
1142208
Merge remote-tracking branch 'origin/main' into vibhansa/entry_cache
vibhansa-msft Jul 17, 2024
f755e8b
Merge remote-tracking branch 'origin/main' into vibhansa/entry_cache
vibhansa-msft Jul 17, 2024
dc321a5
Add code to release to mariner preview for preview binaries
vibhansa-msft Jul 18, 2024
e926faf
Merge branch 'main' into vibhansa/entry_cache
vibhansa-msft Jul 23, 2024
7062ff9
Sync with main
vibhansa-msft Aug 23, 2024
e9b024f
Merge branch 'vibhansa/entry_cache' of https://github.com/Azure/azure…
vibhansa-msft Aug 23, 2024
a392356
sync with main
vibhansa-msft Sep 6, 2024
ea40ce7
Merge branch 'main' into vibhansa/entry_cache_new
vibhansa-msft Sep 18, 2024
206bf18
Merge branch 'main' into vibhansa/entry_cache_new
vibhansa-msft Oct 24, 2024
867fa19
Merge remote-tracking branch 'origin/main' into vibhansa/entry_cache_new
vibhansa-msft Nov 4, 2024
daab6bd
Merge branch 'main' into vibhansa/entry_cache_new
vibhansa-msft Nov 4, 2024
922a9dd
Sync with main'
vibhansa-msft Nov 5, 2024
85cf21b
Merge remote-tracking branch 'origin/main' into vibhansa/entry_cache_new
vibhansa-msft Nov 7, 2024
cf1c888
Adding check to support this only in read-only mode
vibhansa-msft Nov 7, 2024
c786123
Sync with main
vibhansa-msft Nov 11, 2024
2ef4928
Merge branch 'main' into vibhansa/entry_cache_new
vibhansa-msft Nov 11, 2024
aff44e1
Merge branch 'main' into vibhansa/entry_cache_new
vibhansa-msft Nov 11, 2024
8c5cf3f
Merge remote-tracking branch 'origin/main' into vibhansa/entry_cache_new
vibhansa-msft Nov 12, 2024
abc0419
Update as per review comments
vibhansa-msft Nov 12, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,7 @@
## 2.4.0 (Unreleased)
**Features**
- Entry cache to hold directory listing results in cache for a given timeout. This will reduce REST calls going to storage while listing the blobs in parallel.

**Bug Fixes**
- [#1426](https://github.com/Azure/azure-storage-fuse/issues/1426) Read panic in block-cache due to boundary conditions.

Expand Down
2 changes: 1 addition & 1 deletion MIGRATION.md
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ Note: Blobfuse2 accepts all CLI parameters that Blobfuse does, but may ignore pa
| --log-level=LOG_WARNING | --log-level=LOG_WARNING | logging.level | |
| --use-attr-cache=true | --use-attr-cache=true | attr_cache | Add attr_cache to the components list |
| --use-adls=false | --use-adls=false | azstorage.type | Specify either 'block' or 'adls' |
| --no-symlinks=false | --no-symlinks=false | attr_cache.no-symlinks | |
| --no-symlinks=false | --no-symlinks=true | attr_cache.no-symlinks | |
| --cache-on-list=true | --cache-on-list=true | attr_cache.no-cache-on-list | This parameter has the opposite boolean semantics |
| --upload-modified-only=true | --upload-modified-only=true | | Always on in blobfuse2 |
| --max-concurrency=12 | --max-concurrency=12 | azstorage.max-concurrency | |
Expand Down
7 changes: 7 additions & 0 deletions blobfuse2-release.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -1697,6 +1697,13 @@ stages:
echo "##vso[task.setvariable variable=is_preview]$is_preview"
fi

is_preview="false"
echo "##vso[task.setvariable variable=is_preview]$is_preview"
if [[ $marinerFuse3AmdRpm == *"preview"* ]]; then
is_preview="true"
echo "##vso[task.setvariable variable=is_preview]$is_preview"
fi

while IFS=, read -r distro fuseArchType repoName releaseName; do

# If the package is preview, publish to mariner preview package
Expand Down
1 change: 1 addition & 0 deletions cmd/imports.go
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ import (
_ "github.com/Azure/azure-storage-fuse/v2/component/azstorage"
_ "github.com/Azure/azure-storage-fuse/v2/component/block_cache"
_ "github.com/Azure/azure-storage-fuse/v2/component/custom"
_ "github.com/Azure/azure-storage-fuse/v2/component/entry_cache"
_ "github.com/Azure/azure-storage-fuse/v2/component/file_cache"
_ "github.com/Azure/azure-storage-fuse/v2/component/libfuse"
_ "github.com/Azure/azure-storage-fuse/v2/component/loopback"
Expand Down
5 changes: 5 additions & 0 deletions cmd/mount.go
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,7 @@ type mountOptions struct {
AttrCache bool `config:"use-attr-cache"`
LibfuseOptions []string `config:"libfuse-options"`
BlockCache bool `config:"block-cache"`
EntryCache int `config:"list-cache-timeout"`
vibhansa-msft marked this conversation as resolved.
Show resolved Hide resolved
}

var options mountOptions
Expand Down Expand Up @@ -313,6 +314,10 @@ var mountCmd = &cobra.Command{
options.Components = pipeline
}

if config.IsSet("entry_cache.timeout-sec") || options.EntryCache > 0 {
options.Components = append(options.Components[:1], append([]string{"entry_cache"}, options.Components[1:]...)...)
}
vibhansa-msft marked this conversation as resolved.
Show resolved Hide resolved

if config.IsSet("libfuse-options") {
for _, v := range options.LibfuseOptions {
parameter := strings.Split(v, "=")
Expand Down
2 changes: 1 addition & 1 deletion common/types.go
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ import (

// Standard config default values
const (
blobfuse2Version_ = "2.3.3"
blobfuse2Version_ = "2.4.0"

DefaultMaxLogFileSize = 512
DefaultLogFileCount = 10
Expand Down
27 changes: 8 additions & 19 deletions component/attr_cache/attr_cache.go
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,6 @@ const defaultAttrCacheTimeout uint32 = (120)
type AttrCache struct {
internal.BaseComponent
cacheTimeout uint32
cacheOnList bool
noSymlinks bool
maxFiles int
cacheMap map[string]*attrCacheItem
Expand Down Expand Up @@ -139,22 +138,18 @@ func (ac *AttrCache) Configure(_ bool) error {
ac.cacheTimeout = defaultAttrCacheTimeout
}

if config.IsSet(compName + ".cache-on-list") {
ac.cacheOnList = conf.CacheOnList
} else {
vibhansa-msft marked this conversation as resolved.
Show resolved Hide resolved
ac.cacheOnList = !conf.NoCacheOnList
}

if config.IsSet(compName + ".max-files") {
ac.maxFiles = conf.MaxFiles
} else {
ac.maxFiles = defaultMaxFiles
}

ac.noSymlinks = conf.NoSymlinks
if config.IsSet(compName + ".no-symlinks") {
ac.noSymlinks = conf.NoSymlinks
}

log.Info("AttrCache::Configure : cache-timeout %d, symlink %t, cache-on-list %t, max-files %d",
ac.cacheTimeout, ac.noSymlinks, ac.cacheOnList, ac.maxFiles)
log.Info("AttrCache::Configure : cache-timeout %d, symlink %t, max-files %d",
ac.cacheTimeout, ac.noSymlinks, ac.maxFiles)

return nil
}
Expand Down Expand Up @@ -283,7 +278,7 @@ func (ac *AttrCache) StreamDir(options internal.StreamDirOptions) ([]*internal.O
// cacheAttributes : On dir listing cache the attributes for all files
func (ac *AttrCache) cacheAttributes(pathList []*internal.ObjAttr) {
// Check whether or not we are supposed to cache on list
if ac.cacheOnList && len(pathList) > 0 {
if len(pathList) > 0 {
// Putting this inside loop is heavy as for each item we will do a kernel call to get current time
// If there are millions of blobs then cost of this is very high.
currTime := time.Now()
Expand Down Expand Up @@ -478,14 +473,8 @@ func (ac *AttrCache) GetAttr(options internal.GetAttrOptions) (*internal.ObjAttr
// no entry if path does not exist
return &internal.ObjAttr{}, syscall.ENOENT
} else {
// IsMetadataRetrieved is false in the case of ADLS List since the API does not support metadata.
// Once migration of ADLS list to blob endpoint is done (in future service versions), we can remove this.
// options.RetrieveMetadata is set by CopyFromFile and WriteFile which need metadata to ensure it is preserved.
if value.getAttr().IsMetadataRetrieved() || (ac.noSymlinks && !options.RetrieveMetadata) {
// path exists and we have all the metadata required or we do not care about metadata
log.Debug("AttrCache::GetAttr : %s served from cache", options.Name)
return value.getAttr(), nil
}
log.Debug("AttrCache::GetAttr : %s served from cache", options.Name)
return value.getAttr(), nil
}
}

Expand Down
34 changes: 2 additions & 32 deletions component/attr_cache/attr_cache_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -80,9 +80,6 @@ func newTestAttrCache(next internal.Component, configuration string) *AttrCache

func getPathAttr(path string, size int64, mode os.FileMode, metadata bool) *internal.ObjAttr {
flags := internal.NewFileBitMap()
if metadata {
flags.Set(internal.PropFlagMetadataRetrieved)
}
return &internal.ObjAttr{
Path: path,
Name: filepath.Base(path),
Expand Down Expand Up @@ -210,8 +207,7 @@ func (suite *attrCacheTestSuite) TestDefault() {
defer suite.cleanupTest()
suite.assert.Equal(suite.attrCache.Name(), "attr_cache")
suite.assert.EqualValues(suite.attrCache.cacheTimeout, 120)
suite.assert.Equal(suite.attrCache.cacheOnList, true)
suite.assert.Equal(suite.attrCache.noSymlinks, false)
// suite.assert.Equal(suite.attrCache.noSymlinks, false)
vibhansa-msft marked this conversation as resolved.
Show resolved Hide resolved
}

// Tests configuration
Expand All @@ -223,7 +219,6 @@ func (suite *attrCacheTestSuite) TestConfig() {

suite.assert.Equal(suite.attrCache.Name(), "attr_cache")
suite.assert.EqualValues(suite.attrCache.cacheTimeout, 60)
suite.assert.Equal(suite.attrCache.cacheOnList, false)
suite.assert.Equal(suite.attrCache.noSymlinks, true)
}

Expand All @@ -246,7 +241,6 @@ func (suite *attrCacheTestSuite) TestConfigZero() {

suite.assert.Equal(suite.attrCache.Name(), "attr_cache")
suite.assert.EqualValues(suite.attrCache.cacheTimeout, 0)
suite.assert.Equal(suite.attrCache.cacheOnList, false)
suite.assert.Equal(suite.attrCache.noSymlinks, true)
}

Expand Down Expand Up @@ -426,29 +420,6 @@ func (suite *attrCacheTestSuite) TestReadDirExists() {
}
}

func (suite *attrCacheTestSuite) TestReadDirNoCacheOnList() {
defer suite.cleanupTest()
suite.cleanupTest() // clean up the default attr cache generated
cacheOnList := false
config := fmt.Sprintf("attr_cache:\n no-cache-on-list: %t", !cacheOnList)
suite.setupTestHelper(config) // setup a new attr cache with a custom config (clean up will occur after the test as usual)
suite.assert.EqualValues(suite.attrCache.cacheOnList, cacheOnList)
path := "a"
size := int64(1024)
mode := os.FileMode(0)
aAttr := generateNestedPathAttr(path, size, mode)

options := internal.ReadDirOptions{Name: path}
suite.mock.EXPECT().ReadDir(options).Return(aAttr, nil)

suite.assert.Empty(suite.attrCache.cacheMap) // cacheMap should be empty before call
returnedAttr, err := suite.attrCache.ReadDir(options)
suite.assert.Nil(err)
suite.assert.Equal(aAttr, returnedAttr)

suite.assert.Empty(suite.attrCache.cacheMap) // cacheMap should be empty after call
}

func (suite *attrCacheTestSuite) TestReadDirError() {
defer suite.cleanupTest()
var paths = []string{"a", "a/", "ab", "ab/"}
Expand Down Expand Up @@ -912,7 +883,6 @@ func (suite *attrCacheTestSuite) TestGetAttrExistsWithoutMetadataNoSymlinks() {
// This is a little janky but required since testify suite does not support running setup or clean up for subtests.
suite.cleanupTest()
suite.setupTestHelper(config) // setup a new attr cache with a custom config (clean up will occur after the test as usual)
suite.assert.EqualValues(suite.attrCache.cacheOnList, noSymlinks)
suite.Run(path, func() {
truncatedPath := internal.TruncateDirName(path)
addDirectoryToCache(suite.assert, suite.attrCache, "a", true) // add the paths to the cache with IsMetadataRetrived=true
Expand Down Expand Up @@ -941,7 +911,7 @@ func (suite *attrCacheTestSuite) TestGetAttrExistsWithoutMetadata() {

options := internal.GetAttrOptions{Name: path}
// attributes should not be accessible so call the mock
suite.mock.EXPECT().GetAttr(options).Return(getPathAttr(path, defaultSize, fs.FileMode(defaultMode), false), nil)
//suite.mock.EXPECT().GetAttr(options).Return(getPathAttr(path, defaultSize, fs.FileMode(defaultMode), false), nil)

_, err := suite.attrCache.GetAttr(options)
suite.assert.Nil(err)
Expand Down
2 changes: 1 addition & 1 deletion component/azstorage/azstorage.go
Original file line number Diff line number Diff line change
Expand Up @@ -508,7 +508,7 @@ func (az *AzStorage) CreateLink(options internal.CreateLinkOptions) error {

func (az *AzStorage) ReadLink(options internal.ReadLinkOptions) (string, error) {
log.Trace("AzStorage::ReadLink : Read symlink %s", options.Name)
data, err := az.storage.ReadBuffer(options.Name, 0, 0)
data, err := az.storage.ReadBuffer(options.Name, 0, options.Size)

if err != nil {
azStatsCollector.PushEvents(readLink, options.Name, nil)
Expand Down
3 changes: 0 additions & 3 deletions component/azstorage/block_blob.go
Original file line number Diff line number Diff line change
Expand Up @@ -450,7 +450,6 @@ func (bb *BlockBlob) getAttrUsingRest(name string) (attr *internal.ObjAttr, err

parseMetadata(attr, prop.Metadata)

attr.Flags.Set(internal.PropFlagMetadataRetrieved)
attr.Flags.Set(internal.PropFlagModeDefault)

return attr, nil
Expand Down Expand Up @@ -595,7 +594,6 @@ func (bb *BlockBlob) List(prefix string, marker *string, count int32) ([]*intern
MD5: blobInfo.Properties.ContentMD5,
}
parseMetadata(attr, blobInfo.Metadata)
attr.Flags.Set(internal.PropFlagMetadataRetrieved)
attr.Flags.Set(internal.PropFlagModeDefault)
}
blobList = append(blobList, attr)
Expand Down Expand Up @@ -634,7 +632,6 @@ func (bb *BlockBlob) List(prefix string, marker *string, count int32) ([]*intern
attr.Atime = attr.Mtime
attr.Crtime = attr.Mtime
attr.Ctime = attr.Mtime
attr.Flags.Set(internal.PropFlagMetadataRetrieved)
attr.Flags.Set(internal.PropFlagModeDefault)
blobList = append(blobList, attr)
}
Expand Down
8 changes: 0 additions & 8 deletions component/azstorage/block_blob_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -644,7 +644,6 @@ func (s *blockBlobTestSuite) TestReadDirNoVirtualDirectory() {
s.assert.EqualValues(name, entries[0].Path)
s.assert.EqualValues(name, entries[0].Name)
s.assert.True(entries[0].IsDir())
s.assert.True(entries[0].IsMetadataRetrieved())
s.assert.True(entries[0].IsModeDefault())
})
}
Expand All @@ -664,13 +663,11 @@ func (s *blockBlobTestSuite) TestReadDirHierarchy() {
s.assert.EqualValues(base+"/c1", entries[0].Path)
s.assert.EqualValues("c1", entries[0].Name)
s.assert.True(entries[0].IsDir())
s.assert.True(entries[0].IsMetadataRetrieved())
s.assert.True(entries[0].IsModeDefault())
// Check the file
s.assert.EqualValues(base+"/c2", entries[1].Path)
s.assert.EqualValues("c2", entries[1].Name)
s.assert.False(entries[1].IsDir())
s.assert.True(entries[1].IsMetadataRetrieved())
s.assert.True(entries[1].IsModeDefault())
}

Expand All @@ -693,19 +690,16 @@ func (s *blockBlobTestSuite) TestReadDirRoot() {
s.assert.EqualValues(base, entries[0].Path)
s.assert.EqualValues(base, entries[0].Name)
s.assert.True(entries[0].IsDir())
s.assert.True(entries[0].IsMetadataRetrieved())
s.assert.True(entries[0].IsModeDefault())
// Check the baseb dir
s.assert.EqualValues(base+"b", entries[1].Path)
s.assert.EqualValues(base+"b", entries[1].Name)
s.assert.True(entries[1].IsDir())
s.assert.True(entries[1].IsMetadataRetrieved())
s.assert.True(entries[1].IsModeDefault())
// Check the basec file
s.assert.EqualValues(base+"c", entries[2].Path)
s.assert.EqualValues(base+"c", entries[2].Name)
s.assert.False(entries[2].IsDir())
s.assert.True(entries[2].IsMetadataRetrieved())
s.assert.True(entries[2].IsModeDefault())
})
}
Expand All @@ -725,7 +719,6 @@ func (s *blockBlobTestSuite) TestReadDirSubDir() {
s.assert.EqualValues(base+"/c1"+"/gc1", entries[0].Path)
s.assert.EqualValues("gc1", entries[0].Name)
s.assert.False(entries[0].IsDir())
s.assert.True(entries[0].IsMetadataRetrieved())
s.assert.True(entries[0].IsModeDefault())
}

Expand All @@ -745,7 +738,6 @@ func (s *blockBlobTestSuite) TestReadDirSubDirPrefixPath() {
s.assert.EqualValues("c1"+"/gc1", entries[0].Path)
s.assert.EqualValues("gc1", entries[0].Name)
s.assert.False(entries[0].IsDir())
s.assert.True(entries[0].IsMetadataRetrieved())
s.assert.True(entries[0].IsModeDefault())
}

Expand Down
7 changes: 3 additions & 4 deletions component/azstorage/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -557,15 +557,14 @@ func ParseAndReadDynamicConfig(az *AzStorage, opt AzStorageOptions, reload bool)
az.stConfig.honourACL = false
}

// by default symlink will be disabled
az.stConfig.disableSymlink = true

if config.IsSet("attr_cache.no-symlinks") {
err := config.UnmarshalKey("attr_cache.no-symlinks", &az.stConfig.disableSymlink)
if err != nil {
az.stConfig.disableSymlink = true
log.Err("ParseAndReadDynamicConfig : Failed to unmarshal attr_cache.no-symlinks")
}
} else {
// by default symlink will be disabled
az.stConfig.disableSymlink = true
}

// Auth related reconfig
Expand Down
2 changes: 0 additions & 2 deletions component/azstorage/datalake.go
Original file line number Diff line number Diff line change
Expand Up @@ -402,8 +402,6 @@ func (dl *Datalake) GetAttr(name string) (attr *internal.ObjAttr, err error) {
attr.Mode = attr.Mode | os.ModeDir
}

attr.Flags.Set(internal.PropFlagMetadataRetrieved)

if dl.Config.honourACL && dl.Config.authConfig.ObjectID != "" {
acl, err := fileClient.GetAccessControl(context.Background(), nil)
if err != nil {
Expand Down
7 changes: 0 additions & 7 deletions component/azstorage/datalake_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -498,13 +498,11 @@ func (s *datalakeTestSuite) TestReadDirHierarchy() {
s.assert.EqualValues(base+"/c1", entries[0].Path)
s.assert.EqualValues("c1", entries[0].Name)
s.assert.True(entries[0].IsDir())
s.assert.False(entries[0].IsMetadataRetrieved())
s.assert.False(entries[0].IsModeDefault())
// Check the file
s.assert.EqualValues(base+"/c2", entries[1].Path)
s.assert.EqualValues("c2", entries[1].Name)
s.assert.False(entries[1].IsDir())
s.assert.False(entries[1].IsMetadataRetrieved())
s.assert.False(entries[1].IsModeDefault())
}

Expand All @@ -527,19 +525,16 @@ func (s *datalakeTestSuite) TestReadDirRoot() {
s.assert.EqualValues(base, entries[0].Path)
s.assert.EqualValues(base, entries[0].Name)
s.assert.True(entries[0].IsDir())
s.assert.False(entries[0].IsMetadataRetrieved())
s.assert.False(entries[0].IsModeDefault())
// Check the baseb dir
s.assert.EqualValues(base+"b", entries[1].Path)
s.assert.EqualValues(base+"b", entries[1].Name)
s.assert.True(entries[1].IsDir())
s.assert.False(entries[1].IsMetadataRetrieved())
s.assert.False(entries[1].IsModeDefault())
// Check the basec file
s.assert.EqualValues(base+"c", entries[2].Path)
s.assert.EqualValues(base+"c", entries[2].Name)
s.assert.False(entries[2].IsDir())
s.assert.False(entries[2].IsMetadataRetrieved())
s.assert.False(entries[2].IsModeDefault())
})
}
Expand All @@ -559,7 +554,6 @@ func (s *datalakeTestSuite) TestReadDirSubDir() {
s.assert.EqualValues(base+"/c1"+"/gc1", entries[0].Path)
s.assert.EqualValues("gc1", entries[0].Name)
s.assert.False(entries[0].IsDir())
s.assert.False(entries[0].IsMetadataRetrieved())
s.assert.False(entries[0].IsModeDefault())
}

Expand All @@ -579,7 +573,6 @@ func (s *datalakeTestSuite) TestReadDirSubDirPrefixPath() {
s.assert.EqualValues(base+"/c1"+"/gc1", entries[0].Path)
s.assert.EqualValues("gc1", entries[0].Name)
s.assert.False(entries[0].IsDir())
s.assert.False(entries[0].IsMetadataRetrieved())
s.assert.False(entries[0].IsModeDefault())
}

Expand Down
Loading
Loading