[devices] optimize last seen time updates #178
Conversation
Walkthrough
Adds a pluggable cache subsystem (memory/Redis) with API, implementations, tests/benchmarks, DI wiring and metrics; introduces an online service that batches and persists device last-seen timestamps via the cache; updates auth/devices to use the online service and cache; bumps Go/Docker build images and CI workflows; adds a DB index and observability assets.
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant App as fx.App
participant Cfg as Config
participant CacheMod as cache.Module
participant Factory as cache.Factory
participant Mem as MemoryBackend
participant Redis as RedisBackend
participant OnlineMod as online.Module
participant OnlineSvc as online.Service
participant Devices as devices.Service
App->>Cfg: provide Config (cache.url)
App->>CacheMod: init cache.Module()
CacheMod->>Factory: fx.Provide(NewFactory(cfg.Cache))
Factory->>Factory: parse URL & choose backend
alt scheme == memory
Factory->>Mem: NewMemory(...)
else scheme == redis
Factory->>Redis: redis.NewClient(...)
end
App->>OnlineMod: init online.Module()
OnlineMod->>Factory: provide cache instance (f.New("online"))
OnlineMod->>OnlineSvc: fx.Provide(New)
App->>OnlineSvc: OnStart -> Run(ctx) (goroutine, ticker)
OnlineSvc->>Factory: Drain() periodically -> batch
OnlineSvc->>Devices: SetLastSeen(ctx, batch)
App->>OnlineSvc: OnStop -> cancel ctx -> Run exits
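For readers skimming the diagram, here is a minimal, self-contained sketch of the drain-and-persist loop it describes. The Cache and DevicesService interfaces below are simplified stand-ins for the fx-wired dependencies; the one-minute interval and RFC3339 encoding follow this review's discussion and are assumptions rather than confirmed implementation details.

```go
package online

import (
	"context"
	"time"
)

// Cache and DevicesService are stand-ins for the real fx-provided dependencies.
type Cache interface {
	Drain(ctx context.Context) (map[string]string, error)
}

type DevicesService interface {
	SetLastSeen(ctx context.Context, batch map[string]time.Time) error
}

type Service struct {
	cache   Cache
	devices DevicesService
}

// Run drains the cache once per minute and persists the batch until ctx is canceled.
func (s *Service) Run(ctx context.Context) {
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			items, err := s.cache.Drain(ctx)
			if err != nil || len(items) == 0 {
				continue
			}
			batch := make(map[string]time.Time, len(items))
			for id, v := range items {
				if t, err := time.Parse(time.RFC3339, v); err == nil {
					batch[id] = t
				}
			}
			_ = s.devices.SetLastSeen(ctx, batch) // errors are logged by the real service
		}
	}
}
```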
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
Tip: 👮 Agentic pre-merge checks are now available in preview! Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information.
Example:
reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).
Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 16
🧹 Nitpick comments (18)
build/package/Dockerfile (1)
2-2: Pin Go base image to a patch version (reproducibility). To avoid unexpected toolchain changes, pin to 1.24.1 (matching go.mod) or a digest.
-FROM golang:1.24-alpine AS build
+FROM golang:1.24.1-alpine AS build
build/package/Dockerfile.dev (1)
1-1: Pin dev base image to a patch version. Same rationale as the prod builder: keep the toolchain stable.
-FROM golang:1.24-alpine
+FROM golang:1.24.1-alpine
pkg/cache/errors.go (1)
5-9: Add GoDoc for exported errors. Public identifiers should have comments for lint/docs.
 var (
-	ErrKeyNotFound = errors.New("key not found")
-	ErrKeyExpired = errors.New("key expired")
-	ErrKeyExists = errors.New("key already exists")
+	// ErrKeyNotFound indicates no value exists for the given key.
+	ErrKeyNotFound = errors.New("key not found")
+	// ErrKeyExpired indicates a value exists but has expired.
+	ErrKeyExpired = errors.New("key expired")
+	// ErrKeyExists indicates a conflicting set when the key already exists.
+	ErrKeyExists = errors.New("key already exists")
 )
pkg/cache/options.go (3)
5-5: Document the exported Option type. Exports should have a brief GoDoc.
-type Option func(*options)
+// Option configures per-item cache behavior (e.g., expiry).
+type Option func(*options)
21-25: Define TTL<=0 semantics; normalize time to UTC. Current code sets validUntil = Now + ttl. With ttl <= 0 this becomes "expires now/past", which may conflict with the package-level default ("<=0 means never expire" per PR context). Either document that 0 expires immediately or treat <=0 as no expiry. Also consider using UTC consistently.
-func WithTTL(ttl time.Duration) Option {
-	return func(o *options) {
-		o.validUntil = time.Now().Add(ttl)
-	}
-}
+func WithTTL(ttl time.Duration) Option {
+	return func(o *options) {
+		if ttl <= 0 {
+			// No per-item expiry; rely on cache default TTL policy.
+			o.validUntil = time.Time{}
+			return
+		}
+		o.validUntil = time.Now().UTC().Add(ttl)
+	}
+}
If you prefer "0 means expire immediately", please add that to the comment of WithTTL.
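As a usage note, a small hedged example of how callers might combine these options under the proposed "<=0 means no per-item expiry" semantics; the module import path and the meaning of NewMemory's argument are assumptions.

```go
package main

import (
	"context"
	"time"

	"github.com/android-sms-gateway/server/pkg/cache" // import path assumed
)

func main() {
	ctx := context.Background()
	c := cache.NewMemory(0) // argument assumed to be the default TTL (0 = none)

	// Per-item TTL: expires five minutes from now.
	_ = c.Set(ctx, "device:123", time.Now().UTC().Format(time.RFC3339), cache.WithTTL(5*time.Minute))

	// Explicit expiry instant (normalized to UTC if the suggestion above is applied).
	_ = c.Set(ctx, "device:456", "pinned", cache.WithValidUntil(time.Now().UTC().Add(time.Hour)))
}
```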
29-33: Normalize explicit timestamps to UTC (optional). Reduces ambiguity if other components compare against UTC times.
 func WithValidUntil(validUntil time.Time) Option {
 	return func(o *options) {
-		o.validUntil = validUntil
+		o.validUntil = validUntil.UTC()
 	}
 }
internal/sms-gateway/cache/config.go (1)
3-5: Add a short GoDoc for Config. Improves discoverability in IDEs/docs.
-type Config struct {
+// Config controls the cache backend via a URL (e.g., "memory://", "redis://...").
+type Config struct {
 	URL string
 }
internal/config/config.go (1)
17-17: Leverage existing cache URL validation or add early config validation
The cache factory already parses and checks the URL scheme (internal/sms-gateway/cache/factory.go:24–29). To fail fast on misconfiguration, optionally introduce a Validate() method on the Config struct in internal/config/config.go that invokes url.Parse on Cache.URL.
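A minimal sketch of such a Validate method, using simplified stand-ins for the real config structs; the accepted scheme list is an assumption.

```go
package config

import (
	"fmt"
	"net/url"
)

// Simplified stand-ins for the real structs in internal/config.
type CacheConfig struct{ URL string }
type Config struct{ Cache CacheConfig }

// Validate fails fast on an unparsable or unsupported cache URL.
func (c Config) Validate() error {
	u, err := url.Parse(c.Cache.URL)
	if err != nil {
		return fmt.Errorf("invalid cache.url %q: %w", c.Cache.URL, err)
	}
	switch u.Scheme {
	case "memory", "redis", "rediss": // accepted schemes assumed
		return nil
	default:
		return fmt.Errorf("unsupported cache scheme %q", u.Scheme)
	}
}
```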
44-48: Cleanup method has an empty callback - consider simplifying. The empty callback pattern seems unnecessary here.
 // Cleanup implements Cache.
 func (m *memoryCache) Cleanup(_ context.Context) error {
-	m.cleanup(func() {})
+	now := time.Now()
+	m.mux.Lock()
+	for key, item := range m.items {
+		if item.isExpired(now) {
+			delete(m.items, key)
+		}
+	}
+	m.mux.Unlock()
 	return nil
 }
134-146: Consider implementing lazy expiration for better performance. The current implementation checks expiration on every read. For better performance at scale, consider the following (a sketch of the background-cleanup part follows after this list):
- Lazy deletion during reads (current approach) combined with
- Periodic background cleanup to prevent memory bloat
This is particularly important for high-traffic scenarios where expired entries might accumulate if they're not frequently accessed.
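A minimal sketch of that background-cleanup half, assuming only that the cache exposes a Cleanup(ctx) method as used elsewhere in this PR; the interval and wiring are illustrative.

```go
package cacheutil

import (
	"context"
	"time"
)

// Cleaner is the subset of the cache API this sketch needs.
type Cleaner interface {
	Cleanup(ctx context.Context) error
}

// RunPeriodicCleanup evicts expired entries on a fixed interval until ctx is canceled.
func RunPeriodicCleanup(ctx context.Context, c Cleaner, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// Errors are intentionally ignored here; a real caller would log them.
			_ = c.Cleanup(ctx)
		}
	}
}
```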
pkg/cache/memory_profile_test.go (3)
62-108: Fix the error message threshold mismatch. The error message states "1MB" but the actual threshold is 2MB.
Apply this diff to fix the error message:
-if allocDiff > 2*1024*1024 { // 2MB
-	t.Errorf("Expected less than 1MB memory growth after cleanup, got %d bytes", allocDiff)
+if allocDiff > 2*1024*1024 { // 2MB
+	t.Errorf("Expected less than 2MB memory growth after cleanup, got %d bytes", allocDiff)
167-214: Fix variable name typo in comment. There's a typo in the comment on line 168.
Apply this diff:
-// This test verifies c behavior under frequent GC cycles
+// This test verifies cache behavior under frequent GC cycles
269-297: Add error handling in benchmark operations. The benchmark should handle potential errors from cache operations.
Apply this diff:
 // Set and get
-cache.Set(ctx, key, value)
-cache.Get(ctx, key)
+if err := cache.Set(ctx, key, value); err != nil {
+	b.Fatal(err)
+}
+if _, err := cache.Get(ctx, key); err != nil {
+	b.Fatal(err)
+}
 // Delete
-cache.Delete(ctx, key)
+if err := cache.Delete(ctx, key); err != nil {
+	b.Fatal(err)
+}
98-142: Consider using atomic counters for operation tracking. While the test correctly handles concurrent read/write operations, tracking successful operations with atomic counters would provide better insight into the test's behavior and help detect potential issues like excessive contention.
Apply this enhancement to add operation tracking:
func TestMemoryCache_ConcurrentReadWrite(t *testing.T) { c := cache.NewMemory(0) ctx := context.Background() const numOperations = 1000 const numReaders = 8 const numWriters = 2 var wg sync.WaitGroup + var readCount, writeCount atomic.Int64 // Launch concurrent readers for i := 0; i < numReaders; i++ { wg.Add(1) go func() { defer wg.Done() for j := 0; j < numOperations/numReaders; j++ { key := "shared-key" _, err := c.Get(ctx, key) if err != nil && err != cache.ErrKeyNotFound { t.Errorf("Get failed: %v", err) + } else if err == nil { + readCount.Add(1) } } }() } // Launch concurrent writers for i := 0; i < numWriters; i++ { wg.Add(1) go func() { defer wg.Done() for j := 0; j < numOperations/numWriters; j++ { key := "shared-key" value := "value-" + strconv.Itoa(j) err := c.Set(ctx, key, value) if err != nil { t.Errorf("Set failed: %v", err) + } else { + writeCount.Add(1) } } }() } wg.Wait() + t.Logf("Completed %d successful reads and %d writes", readCount.Load(), writeCount.Load()) }
239-281: Consider validating drain operation results. While the test correctly verifies that the cache is empty after draining, it doesn't validate that the drain operations actually returned the expected items. Only one drain operation should return items, while the others should return empty results.
Apply this enhancement to validate drain results:
func TestMemoryCache_ConcurrentDrain(t *testing.T) { c := cache.NewMemory(0) ctx := context.Background() const numItems = 100 const numGoroutines = 5 var wg sync.WaitGroup + var drainResults sync.Map // Pre-populate cache with items for i := 0; i < numItems; i++ { key := "item-" + strconv.Itoa(i) value := "value-" + strconv.Itoa(i) err := c.Set(ctx, key, value) if err != nil { t.Fatalf("Set failed for item %d: %v", i, err) } } // Launch concurrent drain operations for i := 0; i < numGoroutines; i++ { wg.Add(1) - go func() { + go func(id int) { defer wg.Done() - _, err := c.Drain(ctx) + items, err := c.Drain(ctx) if err != nil { t.Errorf("Drain failed: %v", err) } - }() + drainResults.Store(id, items) + }(i) } wg.Wait() + + // Verify that items were drained (at least one goroutine should have gotten items) + totalDrained := 0 + drainResults.Range(func(key, value interface{}) bool { + items := value.(map[string]interface{}) + totalDrained += len(items) + return true + }) + + if totalDrained != numItems { + t.Errorf("Expected %d total items drained, got %d", numItems, totalDrained) + } // Cache should be empty after all drain operations for i := 0; i < numItems; i++ { key := "item-" + strconv.Itoa(i) _, err := c.Get(ctx, key) if err != cache.ErrKeyNotFound { t.Errorf("Expected ErrKeyNotFound for key %s after drain, got %v", key, err) } } }pkg/cache/memory_edge_test.go (3)
220-257: Improve performance measurement accuracy.The test includes a performance measurement but the threshold check at Line 254 seems arbitrary. The variable
durationis set to 100ms but never used in the actual operations.Either remove the unused
durationvariable or clarify the performance expectations:func TestMemoryCache_RapidOperations(t *testing.T) { // Test rapid c operations c := cache.NewMemory(0) ctx := context.Background() const numOperations = 1000 - const duration = 100 * time.Millisecond start := time.Now() opsCompleted := 0 for i := 0; i < numOperations; i++ { // Alternate between set and get if i%2 == 0 { key := "rapid-key-" + strconv.Itoa(i) value := "rapid-value-" + strconv.Itoa(i) err := c.Set(ctx, key, value) if err != nil { t.Errorf("Set failed: %v", err) } } else { key := "rapid-key-" + strconv.Itoa(i-1) _, err := c.Get(ctx, key) if err != nil && err != cache.ErrKeyNotFound { t.Errorf("Get failed: %v", err) } } opsCompleted++ } durationTaken := time.Since(start) t.Logf("Completed %d operations in %v (%.2f ops/ms)", opsCompleted, durationTaken, float64(opsCompleted)/float64(durationTaken.Milliseconds())) - - // Verify operations completed within reasonable time - if durationTaken > 2*duration { - t.Errorf("Operations took too long: %v", durationTaken) - } }
329-351: Potential issue with zero-byte key.The test creates a 1KB key using
make([]byte, 1024)which results in a string of 1024 null bytes. While this tests the cache's ability to handle long keys, it doesn't test with actual character data. Consider using a more realistic long key.Generate a more realistic long key:
func TestMemoryCache_ExtremeKeyLength(t *testing.T) { // Test with very long keys cache := cache.NewMemory(0) ctx := context.Background() // Create a very long key (1KB) - longKey := string(make([]byte, 1024)) + longKey := strings.Repeat("a", 1024) value := "extreme-key-value" err := cache.Set(ctx, longKey, value) if err != nil { t.Fatalf("Set with long key failed: %v", err) } retrieved, err := cache.Get(ctx, longKey) if err != nil { t.Fatalf("Get with long key failed: %v", err) } if retrieved != value { t.Errorf("Expected %s, got %s", value, retrieved) } }Note: You'll need to add
"strings"to the imports at the top of the file.
353-389: Potential race in expiration timing test. The test sets items to expire in 10ms and then has goroutines sleep for exactly 10ms before attempting access. This creates a race where some goroutines might access the item just before expiration while others access it just after. While this tests race conditions, the timing is very tight and might lead to flaky tests.
Consider adding a small random jitter to make the test more robust:
func TestMemoryCache_RaceConditionWithExpiration(t *testing.T) { // Test race conditions between expiration and access c := cache.NewMemory(0) ctx := context.Background() key := "race-expire-key" value := "race-expire-value" ttl := 10 * time.Millisecond // Set item with short TTL err := c.Set(ctx, key, value, cache.WithTTL(ttl)) if err != nil { t.Fatalf("Set failed: %v", err) } const numGoroutines = 50 var wg sync.WaitGroup // Launch goroutines that try to access the key while it's expiring for i := 0; i < numGoroutines; i++ { wg.Add(1) - go func() { + go func(id int) { defer wg.Done() - // Wait for the item to be close to expiration - time.Sleep(ttl) + // Wait for the item to be close to expiration with some jitter + jitter := time.Duration(id%3) * time.Millisecond + time.Sleep(ttl - 2*time.Millisecond + jitter) // Try to get the item _, err := c.Get(ctx, key) if err != nil && err != cache.ErrKeyExpired && err != cache.ErrKeyNotFound { t.Errorf("Get failed: %v", err) } - }() + }(i) } wg.Wait() }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
- build/package/Dockerfile (1 hunks)
- build/package/Dockerfile.dev (1 hunks)
- go.mod (4 hunks)
- internal/config/config.go (3 hunks)
- internal/config/module.go (2 hunks)
- internal/sms-gateway/app.go (2 hunks)
- internal/sms-gateway/cache/config.go (1 hunks)
- internal/sms-gateway/cache/factory.go (1 hunks)
- internal/sms-gateway/cache/module.go (1 hunks)
- pkg/cache/cache.go (1 hunks)
- pkg/cache/errors.go (1 hunks)
- pkg/cache/memory.go (1 hunks)
- pkg/cache/memory_bench_test.go (1 hunks)
- pkg/cache/memory_concurrency_test.go (1 hunks)
- pkg/cache/memory_edge_test.go (1 hunks)
- pkg/cache/memory_profile_test.go (1 hunks)
- pkg/cache/memory_test.go (1 hunks)
- pkg/cache/options.go (1 hunks)
- pkg/cache/redis.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (12)
internal/sms-gateway/cache/module.go (3)
- internal/config/module.go (1): Module (21-124)
- internal/sms-gateway/app.go (1): Module (34-56)
- internal/sms-gateway/cache/factory.go (1): NewFactory (19-49)
pkg/cache/cache.go (1)
- pkg/cache/options.go (1): Option (5-5)
pkg/cache/redis.go (3)
- pkg/cache/cache.go (1): Cache (5-36)
- pkg/cache/errors.go (2): ErrKeyNotFound (6-6), ErrKeyExists (8-8)
- pkg/cache/options.go (1): Option (5-5)
pkg/cache/memory_bench_test.go (2)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (21-25)
internal/sms-gateway/cache/factory.go (3)
- pkg/cache/cache.go (1): Cache (5-36)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/redis.go (1): NewRedis (24-36)
pkg/cache/memory_test.go (3)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (2): WithTTL (21-25), WithValidUntil (29-33)
- pkg/cache/errors.go (2): ErrKeyNotFound (6-6), ErrKeyExists (8-8)
internal/sms-gateway/app.go (2)
- internal/sms-gateway/cache/module.go (1): Module (8-16)
- internal/config/module.go (1): Module (21-124)
pkg/cache/memory_concurrency_test.go (3)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/errors.go (3): ErrKeyNotFound (6-6), ErrKeyExists (8-8), ErrKeyExpired (7-7)
- pkg/cache/options.go (1): WithTTL (21-25)
pkg/cache/memory_profile_test.go (2)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/errors.go (1): ErrKeyNotFound (6-6)
pkg/cache/memory.go (3)
- pkg/cache/cache.go (1): Cache (5-36)
- pkg/cache/options.go (1): Option (5-5)
- pkg/cache/errors.go (3): ErrKeyExists (8-8), ErrKeyNotFound (6-6), ErrKeyExpired (7-7)
pkg/cache/memory_edge_test.go (3)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (21-25)
- pkg/cache/errors.go (2): ErrKeyExpired (7-7), ErrKeyNotFound (6-6)
internal/config/module.go (2)
- internal/config/config.go (2): Config (10-18), Cache (74-76)
- internal/sms-gateway/cache/config.go (1): Config (3-5)
🪛 GitHub Check: lint
pkg/cache/memory_bench_test.go
[failure] 154-154:
Error return value of cache.Drain is not checked (errcheck)
[failure] 134-134:
Error return value of cache.Cleanup is not checked (errcheck)
[failure] 114-114:
Error return value of cache.Delete is not checked (errcheck)
[failure] 95-95:
Error return value of cache.GetAndDelete is not checked (errcheck)
[failure] 76-76:
Error return value of cache.SetOrFail is not checked (errcheck)
[failure] 61-61:
Error return value of cache.Get is not checked (errcheck)
[failure] 60-60:
Error return value of cache.Set is not checked (errcheck)
[failure] 42-42:
Error return value of cache.Get is not checked (errcheck)
[failure] 37-37:
Error return value of cache.Set is not checked (errcheck)
[failure] 24-24:
Error return value of cache.Set is not checked (errcheck)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test
- GitHub Check: E2E
- GitHub Check: Analyze (go)
🔇 Additional comments (49)
go.mod (1)
3-3: Ensure CI/build images match Go 1.24.1. go.mod targets 1.24.1; please align Dockerfiles (suggested) and verify CI uses the same toolchain to avoid mismatched stdlib behavior.
internal/config/module.go (1)
7-7: LGTM! Clean integration of cache configuration into the DI graph. The cache configuration provider follows the established pattern in the module and correctly wires the cache URL from the main config into the cache subsystem.
Also applies to: 119-123
internal/sms-gateway/cache/factory.go (1)
19-49: Verify Redis client initialization and document scheme
- Ensure the Redis client performs a connectivity check (e.g. Ping) on creation or handle failures early.
- Confirm its default connection‐pool settings satisfy production load and adjust/configure pooling if needed.
- Document the redis:// URL scheme in the Config.URL behavior and expectations.
pkg/cache/cache.go (1)
5-36: Well-designed cache interface with clear semantics. The interface is clean, follows Go conventions, and provides a good balance of functionality. The error semantics are clearly documented.
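For orientation, a rough reconstruction of the operations this interface appears to expose, inferred from the tests and comments in this PR; signatures and doc comments are approximations, not the actual pkg/cache/cache.go.

```go
package cache

import "context"

// Option configures per-item behavior (expiry); internals elided in this sketch.
type Option func(*options)

type options struct{}

// Cache approximates the reviewed interface, reconstructed from the operations
// exercised by the tests; the real definition may differ.
type Cache interface {
	// Set stores a value, optionally with a per-item TTL or expiry instant.
	Set(ctx context.Context, key, value string, opts ...Option) error
	// SetOrFail stores a value only if the key does not already exist (ErrKeyExists otherwise).
	SetOrFail(ctx context.Context, key, value string, opts ...Option) error
	// Get returns the value or ErrKeyNotFound / ErrKeyExpired.
	Get(ctx context.Context, key string) (string, error)
	// GetAndDelete atomically reads and removes a key.
	GetAndDelete(ctx context.Context, key string) (string, error)
	// Delete removes a key; deleting a missing key is not an error.
	Delete(ctx context.Context, key string) error
	// Drain returns all live items and empties the cache.
	Drain(ctx context.Context) (map[string]string, error)
	// Cleanup evicts expired items.
	Cleanup(ctx context.Context) error
}
```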
pkg/cache/memory_test.go (19)
11-33: LGTM! Well-structured basic set/get test. The test properly verifies the basic Set/Get operations with clear setup and assertions.
35-58: LGTM! Good coverage of TTL functionality. The test properly validates the WithTTL option for setting values with explicit TTL.
60-83: LGTM! ValidUntil option properly tested. The test correctly validates the WithValidUntil option for setting explicit expiration times.
85-108: LGTM! Default TTL behavior validated. The test verifies that items use the default TTL when no explicit TTL is provided.
110-120: LGTM! Error signaling correctly tested. The test properly verifies that ErrKeyNotFound is returned for non-existent keys.
122-144: LGTM! SetOrFail for new keys works correctly. The test validates that SetOrFail successfully sets new keys and returns the expected values.
146-175: LGTM! SetOrFail correctly prevents overwrites. The test properly verifies that SetOrFail returns ErrKeyExists when trying to overwrite existing keys and preserves the original value.
177-201: LGTM! Delete operation properly tested. The test validates successful deletion and confirms the key is removed.
203-214: LGTM! Graceful handling of non-existent key deletion. The test correctly verifies that deleting a non-existent key doesn't return an error.
216-244: LGTM! GetAndDelete atomic operation validated. The test properly verifies that GetAndDelete returns the value and removes the key in a single operation.
246-257: LGTM! GetAndDelete error handling verified. The test correctly validates that GetAndDelete returns ErrKeyNotFound for non-existent keys.
259-306: LGTM! Comprehensive drain functionality test. The test thoroughly validates that Drain returns all items and clears the cache.
308-322: LGTM! Empty cache drain properly handled. The test correctly verifies that draining an empty cache returns an empty map without errors.
324-358: LGTM! Cleanup removes expired items correctly. The test properly validates that Cleanup removes expired items while preserving valid ones.
360-388: LGTM! Cleanup preserves non-expired items. The test correctly verifies that cleanup doesn't affect items without expiration.
390-419: LGTM! Overwrite functionality validated. The test properly verifies that Set can overwrite existing values.
421-443: LGTM! Empty string values handled correctly. The test validates that empty strings can be stored and retrieved properly.
445-467: LGTM! Special characters properly supported. The test verifies that keys and values with special characters are handled correctly.
469-491: LGTM! Large values handled properly. The test validates that the cache can store and retrieve large values (1MB) accurately.
pkg/cache/memory_bench_test.go (10)
159-189: LGTM! Concurrent read benchmark well-structured. The benchmark properly tests read performance with varying levels of concurrency.
191-221: LGTM! Concurrent write benchmark properly implemented. The benchmark effectively measures write performance under different concurrency levels with unique keys.
223-262: LGTM! Mixed workload benchmark provides valuable insights. The benchmark effectively simulates realistic cache usage patterns with configurable read/write ratios.
264-301: LGTM! Scaling benchmark comprehensively tests load patterns. The benchmark provides good coverage of various load scenarios from small to very large.
303-333: LGTM! TTL overhead benchmark is well-designed. The benchmark effectively measures the performance impact of TTL operations.
335-367: LGTM! Large value benchmark covers important use cases. The benchmark properly tests performance with various payload sizes from 1KB to 1MB.
369-393: LGTM! Memory growth benchmark provides useful allocation insights. The benchmark effectively measures memory allocation patterns with different cache sizes.
395-417: LGTM! Random access benchmark simulates realistic access patterns. The benchmark properly tests performance with random key access over a pre-populated cache.
419-435: LGTM! Hot key benchmark tests an important edge case. The benchmark effectively measures performance when a single key is accessed frequently.
437-459: LGTM! Cold key benchmark complements the hot key test. The benchmark properly tests performance with distributed access patterns across many keys.
pkg/cache/memory_profile_test.go (2)
12-60: LGTM! Memory allocation profiling test is well-implemented. The test effectively analyzes memory allocation patterns with appropriate thresholds and detailed logging.
110-165: LGTM! Memory pressure test simulates realistic scenarios. The test effectively validates cache behavior under memory pressure with multiple cache instances.
pkg/cache/memory_concurrency_test.go (6)
14-49: LGTM! Well-structured concurrent read test. The test properly validates thread-safe concurrent reads with appropriate synchronization and error handling.
51-96: LGTM! Comprehensive concurrent write test with good verification. The test effectively validates concurrent writes with non-overlapping keys and proper post-write verification.
144-190: LGTM! Comprehensive test of mixed cache operations. The test effectively validates the complete lifecycle of cache entries with concurrent Set, Get, and Delete operations.
192-237: LGTM! Excellent test for SetOrFail atomicity. The test properly validates that only one goroutine succeeds in setting a key when using SetOrFail, with accurate counting of success and failure cases.
336-382: LGTM! Well-designed GetAndDelete concurrency test. The test effectively validates concurrent GetAndDelete operations with proper pre-population and post-verification.
384-422: LGTM! Excellent race condition detection test. The test is well-designed to trigger race conditions with time-based operation selection and concurrent access patterns. This will be valuable when running with the race detector enabled.
pkg/cache/memory_edge_test.go (8)
13-37: LGTM! Proper test for zero TTL behavior. The test correctly validates that items with zero TTL do not expire.
39-60: LGTM! Good edge case test for immediate expiration. The test properly validates TTL expiration behavior with appropriate timing margins.
62-81: LGTM! Important nil context edge case test. The test validates that the cache handles nil context gracefully, which is important for robustness.
83-103: LGTM! Good validation of empty key handling. The test ensures the cache properly handles empty string keys, which is an important edge case.
105-145: LGTM! Comprehensive TTL overwrite test. The test thoroughly validates that overwriting a key updates its TTL correctly.
147-218: LGTM! Excellent mixed TTL scenario test. The test comprehensively validates different TTL behaviors with proper timing and verification of expiration states.
259-286: LGTM! Good edge case test for empty cache cleanup. The test validates that cleanup operations work correctly on an empty cache and don't affect subsequent operations.
288-327: LGTM! Comprehensive drain test with expired items. The test properly validates that drain operations exclude expired items and only return valid entries.
Actionable comments posted: 9
♻️ Duplicate comments (2)
pkg/cache/memory_profile_test.go (1)
243-245: Handle Drain error (matches prior feedback). This call drops the error and can hide failures.
-	tempCache.Drain(ctx)
+	if _, err := tempCache.Drain(ctx); err != nil {
+		t.Errorf("Drain failed: %v", err)
+	}
internal/sms-gateway/app.go (1)
47-47: Drop cache.Module() logger decoration (global logger name pollution). Including cache.Module() while it decorates *zap.Logger to "cache" renames subsequent loggers across the app. Please remove that decoration in internal/sms-gateway/cache/module.go. This was raised earlier and still applies. Apply in cache/module.go:
 func Module() fx.Option {
 	return fx.Module(
 		"cache",
-		fx.Decorate(func(log *zap.Logger) *zap.Logger {
-			return log.Named("cache")
-		}),
 		fx.Provide(NewFactory),
 	)
 }
🧹 Nitpick comments (8)
pkg/cache/memory_profile_test.go (2)
275-297: Benchmark memory accounting is racy; measure outside RunParallel and use ReportAllocs. Reading MemStats per goroutine races with other goroutines and divides by b.N incorrectly. Measure once around the whole run.
- b.ResetTimer() - b.RunParallel(func(pb *testing.PB) { - var m1, m2 runtime.MemStats - runtime.ReadMemStats(&m1) + b.ReportAllocs() + runtime.GC() + var m1, m2 runtime.MemStats + runtime.ReadMemStats(&m1) + b.ResetTimer() + b.RunParallel(func(pb *testing.PB) { i := 0 for pb.Next() { key := "bench-key-" + strconv.Itoa(i) value := "bench-value-" + strconv.Itoa(i) // Set and get cache.Set(ctx, key, value) cache.Get(ctx, key) // Delete cache.Delete(ctx, key) i++ } - runtime.ReadMemStats(&m2) - b.Logf("Memory usage per iteration: %.2f bytes", float64(m2.TotalAlloc-m1.TotalAlloc)/float64(b.N)) }) + runtime.ReadMemStats(&m2) + b.Logf("TotalAlloc per op: %.2f bytes/op", float64(m2.TotalAlloc-m1.TotalAlloc)/float64(b.N))
56-60: Reduce flakiness of hard-coded memory thresholds. Per-item and absolute heap thresholds vary by Go version, architecture, and GC settings. Consider widening margins, gating under testing.Short, or converting strict asserts to logs/benchmarks.
Also applies to: 106-108, 160-165, 265-267
pkg/cache/memory_concurrency_test.go (1)
284-288: Avoid panic on type assertion in results aggregation. If Drain's return type changes, this panics. Use a checked assertion.
-	drainResults.Range(func(key, value any) bool {
-		items := value.(map[string]string)
+	drainResults.Range(func(key, value any) bool {
+		items, ok := value.(map[string]string)
+		if !ok {
+			t.Errorf("unexpected Drain result type: %T", value)
+			return true
+		}
pkg/cache/memory_edge_test.go (2)
54-60: Be tolerant to ErrKeyNotFound if Get removes expired entries. Depending on Get's semantics, expired keys may be evicted and reported as not found.
-	_, err = c.Get(ctx, key)
-	if err != cache.ErrKeyExpired {
-		t.Errorf("Expected ErrKeyExpired, got %v", err)
-	}
+	_, err = c.Get(ctx, key)
+	if err != cache.ErrKeyExpired && err != cache.ErrKeyNotFound {
+		t.Errorf("Expected ErrKeyExpired or ErrKeyNotFound, got %v", err)
+	}
221-258: Make RapidOperations test less time-sensitive. CI variance can cause false negatives. Gate under -short or relax the SLA.
 func TestMemoryCache_RapidOperations(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping time-sensitive test in -short mode")
+	}
31-40: Optional: add small jitter to the ticker to avoid sync across replicas. Helps reduce thundering herd if many instances start together.
Example:
 func (s *Service) Run(ctx context.Context) {
-	ticker := time.NewTicker(1 * time.Minute)
+	interval := time.Minute
+	jitter := time.Duration(rand.Int63n(int64(interval / 10)))
+	timer := time.NewTimer(jitter)
+	defer timer.Stop()
+	select {
+	case <-ctx.Done():
+		return
+	case <-timer.C:
+	}
+	ticker := time.NewTicker(interval)
 	defer ticker.Stop()
(Remember to import math/rand.)
pkg/cache/redis.go (2)
49-52: Cleanup no-op: acceptable if per-key TTL is used; document behavior. If you keep the hash approach, implement selective expiry removal; otherwise, add a short doc noting Redis TTL handles expiry.
35-47: Nit: rename struct field key → prefix/hashKey for clarity. Improves readability and reduces confusion with the method parameter key.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- internal/sms-gateway/app.go (4 hunks)
- internal/sms-gateway/cache/config.go (1 hunks)
- internal/sms-gateway/cache/factory.go (1 hunks)
- internal/sms-gateway/online/module.go (1 hunks)
- internal/sms-gateway/online/service.go (1 hunks)
- pkg/cache/errors.go (1 hunks)
- pkg/cache/memory_bench_test.go (1 hunks)
- pkg/cache/memory_concurrency_test.go (1 hunks)
- pkg/cache/memory_edge_test.go (1 hunks)
- pkg/cache/memory_profile_test.go (1 hunks)
- pkg/cache/options.go (1 hunks)
- pkg/cache/redis.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- pkg/cache/errors.go
- pkg/cache/options.go
- internal/sms-gateway/cache/factory.go
- internal/sms-gateway/cache/config.go
- pkg/cache/memory_bench_test.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-10T11:11:43.452Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.452Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
- pkg/cache/memory_concurrency_test.go
- pkg/cache/memory_edge_test.go
- pkg/cache/redis.go
- pkg/cache/memory_profile_test.go
🧬 Code graph analysis (7)
internal/sms-gateway/online/service.go (2)
- internal/sms-gateway/cache/factory.go (1): Cache (11-11)
- internal/sms-gateway/modules/events/service.go (2): NewService (26-38), deviceSvc (13-24)
internal/sms-gateway/online/module.go (4)
- internal/sms-gateway/app.go (1): Module (35-58)
- internal/sms-gateway/cache/factory.go (2): Factory (13-15), Cache (11-11)
- internal/sms-gateway/online/service.go (2): New (20-28), Service (12-18)
- internal/sms-gateway/openapi/module.go (1): Module (8-16)
pkg/cache/memory_concurrency_test.go (3)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/errors.go (2): ErrKeyNotFound (7-7), ErrKeyExists (11-11)
- pkg/cache/options.go (1): WithTTL (22-30)
internal/sms-gateway/app.go (3)
- internal/sms-gateway/online/module.go (1): Module (11-35)
- internal/sms-gateway/cache/module.go (1): Module (8-16)
- internal/config/module.go (1): Module (21-124)
pkg/cache/memory_edge_test.go (3)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (22-30)
- pkg/cache/errors.go (2): ErrKeyExpired (9-9), ErrKeyNotFound (7-7)
pkg/cache/redis.go (3)
- pkg/cache/cache.go (1): Cache (5-36)
- pkg/cache/errors.go (2): ErrKeyNotFound (7-7), ErrKeyExists (11-11)
- pkg/cache/options.go (1): Option (6-6)
pkg/cache/memory_profile_test.go (2)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/errors.go (1): ErrKeyNotFound (7-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: E2E
- GitHub Check: lint
- GitHub Check: Analyze (go)
🔇 Additional comments (7)
pkg/cache/memory_concurrency_test.go (2)
323-349: Timing fix for concurrent cleanup looks good. Waiting for TTL to elapse before cleanup avoids the prior race. LGTM.
109-116: No action needed: go.mod specifies Go 1.24.1, which supports integer ranges (Go 1.22+).pkg/cache/memory_edge_test.go (1)
289-328: Drain behavior assertion matches implementation semantics.Verifying that Drain returns only non-expired items and purges expired ones aligns with the cache’s cleanup-first contract.
internal/sms-gateway/online/module.go (1)
21-33: Lifecycle wiring LGTM.Start/stop with cancelable context is correct.
internal/sms-gateway/online/service.go (1)
7-9: Cache type alias exists; no action required
internal/sms-gateway/cache/factory.goalready declarestype Cache = cache.Cache, so the return type fromonline.Moduleand thecache.Cacheimport inservice.goare identical.pkg/cache/redis.go (2)
92-104: Atomic GetAndDelete: good. Atomicity addressed (Lua previously; with the per-key refactor, prefer GETDEL).
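For illustration, a hedged sketch of a GETDEL-based GetAndDelete with go-redis v9; the struct, key prefixing, and error mapping are simplified assumptions, not the PR's actual implementation.

```go
package rediscache

import (
	"context"
	"errors"

	"github.com/redis/go-redis/v9"
)

var ErrKeyNotFound = errors.New("key not found")

type redisCache struct {
	client *redis.Client
	prefix string
}

// GetAndDelete atomically reads and removes a key using GETDEL (Redis >= 6.2).
func (c *redisCache) GetAndDelete(ctx context.Context, key string) (string, error) {
	val, err := c.client.GetDel(ctx, c.prefix+key).Result()
	if errors.Is(err, redis.Nil) {
		return "", ErrKeyNotFound
	}
	if err != nil {
		return "", err
	}
	return val, nil
}
```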
118-121: HExpireAt supported in go-redis v9.6+: no changes needed.
The HExpireAt method was introduced in go-redis v9.6.x alongside Redis ≥7.4's hash-field expiration commands, so the current code is valid.
Likely an incorrect or invalid review comment.
8ad2c4d to 8ad3f09
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/sms-gateway/modules/devices/service.go (1)
123-125: Token-cache invalidation bug: key mismatch
GetByToken caches by SHA-256(hex) of the token, but Delete uses the raw token, so entries are never invalidated. Apply:
-	if err := s.tokensCache.Delete(device.AuthToken); err != nil {
+	hash := sha256.Sum256([]byte(device.AuthToken))
+	cacheKey := hex.EncodeToString(hash[:])
+	if err := s.tokensCache.Delete(cacheKey); err != nil {
 		s.logger.Error("can't invalidate token cache", zap.Error(err))
 	}
♻️ Duplicate comments (5)
pkg/cache/memory_profile_test.go (4)
45-55: Use HeapAlloc delta, not TotalAlloc, for per-item growth. TotalAlloc is cumulative; HeapAlloc reflects heap-in-use.
- allocDiff := m2.TotalAlloc - m1.TotalAlloc - allocPerItem := float64(allocDiff) / float64(numItems) + heapDiff := m2.HeapAlloc - m1.HeapAlloc + allocPerItem := float64(heapDiff) / float64(numItems) @@ - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff)
95-109: MemoryCleanup: compare heap-in-use, not cumulative allocation.
-	allocDiff := m2.TotalAlloc - m1.TotalAlloc
+	heapDiff := m2.HeapAlloc - m1.HeapAlloc
@@
-	t.Logf("  Total allocated: %d bytes", m2.TotalAlloc)
-	t.Logf("  Allocation difference: %d bytes", allocDiff)
+	t.Logf("  Heap alloc: %d bytes", m2.HeapAlloc)
+	t.Logf("  Heap delta: %d bytes", heapDiff)
@@
-	if allocDiff > 2*1024*1024 { // 2MB
-		t.Errorf("Expected less than 2MB memory growth after cleanup, got %d bytes", allocDiff)
+	if heapDiff > 2*1024*1024 { // 2MB
+		t.Errorf("Expected <2MB heap growth after cleanup, got %d bytes", heapDiff)
150-165: MemoryPressure: switch to HeapAlloc delta to avoid false positives.
-	allocDiff := m2.TotalAlloc - m1.TotalAlloc
+	heapDiff := m2.HeapAlloc - m1.HeapAlloc
@@
-	t.Logf("  Total allocated: %d bytes", m2.TotalAlloc)
-	t.Logf("  Allocation difference: %d bytes", allocDiff)
+	t.Logf("  Heap alloc: %d bytes", m2.HeapAlloc)
+	t.Logf("  Heap delta: %d bytes", heapDiff)
@@
-	if allocDiff > expectedMaxGrowth {
-		t.Errorf("Expected less than %d bytes memory growth under pressure, got %d bytes", expectedMaxGrowth, allocDiff)
+	if heapDiff > expectedMaxGrowth {
+		t.Errorf("Expected < %d bytes heap growth under pressure, got %d bytes", expectedMaxGrowth, heapDiff)
243-245: Handle Drain error during leak detection loop.- // Clear the cache - tempCache.Drain(ctx) + // Clear the cache + if _, err := tempCache.Drain(ctx); err != nil { + t.Errorf("Drain failed: %v", err) + }internal/sms-gateway/online/service.go (1)
27-35: Name the logger for this module.Create a child logger to keep logs consistently scoped.
func New(devicesSvc *devices.Service, cache cache.Cache, logger *zap.Logger) Service { return &service{ devicesSvc: devicesSvc, - cache: cache, - logger: logger, + cache: cache, + logger: logger.Named("online"), } }
🧹 Nitpick comments (17)
pkg/cache/memory_profile_test.go (3)
191-198: GCStress: ErrKeyNotFound shouldn't occur for existing keys (TTL=0). Verify and tighten. These 100 keys were inserted and shouldn't expire/evict; treat any error as a failure.
- _, err := c.Get(ctx, key) - if err != nil && err != cache.ErrKeyNotFound { - t.Errorf("Get failed during GC stress test: %v", err) - } + if _, err := c.Get(ctx, key); err != nil { + t.Errorf("Get failed during GC stress test: %v", err) + }If TTL=0 implies “no expiry” in pkg/cache/memory.go (expected), this is safe. If there’s eviction logic, please confirm and adjust test accordingly.
1-1: Scope errcheck suppression to benchmark; drop file-wide //nolint. Keep lints active for tests; ignore only benchmark calls.
-//nolint:errcheck- // Set and get - cache.Set(ctx, key, value) - cache.Get(ctx, key) + // Set and get + cache.Set(ctx, key, value) //nolint:errcheck // benchmark: ignore errors for perf + cache.Get(ctx, key) //nolint:errcheck // benchmark: ignore errors for perf @@ - // Delete - cache.Delete(ctx, key) + // Delete + cache.Delete(ctx, key) //nolint:errcheck // benchmark: ignore errors for perfAlso applies to: 289-295
15-16: Nit: avoid shadowing the imported package name with a local variable.Renaming improves readability and avoids confusion between package and value.
- cache := cache.NewMemory(0) + c := cache.NewMemory(0) @@ - err := cache.Set(ctx, key, value) + err := c.Set(ctx, key, value) @@ - _, err = cache.Get(ctx, key) + _, err = c.Get(ctx, key)- cache := cache.NewMemory(0) + c := cache.NewMemory(0) @@ - err := cache.Set(ctx, key, value) + err := c.Set(ctx, key, value) @@ - _, err := cache.Drain(ctx) + _, err := c.Drain(ctx)- cache := cache.NewMemory(0) + c := cache.NewMemory(0)Also applies to: 29-36, 63-66, 79-86, 273-276
configs/config.example.yml (1)
27-28: Document Redis URL auth/TLS and multi-replica guidance. Add concrete Redis examples (auth, DB index, TLS) and a note that memory is per-process and unsuitable for multi-replica deployments.
cache: # cache config - url: memory:// # cache url (memory:// or redis://) [CACHE__URL] + url: memory:// # cache url (memory:// or redis://) [CACHE__URL] + # Examples: + # - url: redis://localhost:6379?db=0 + # - url: redis://:password@localhost:6379?db=0 + # - url: rediss://redis.example.com:6380?db=0 # TLS-enabled + # Note: memory:// is in-process only; use Redis in multi-replica or horizontally scaled setups..github/workflows/go.yml (6)
31-35: Pin Go version via go.mod and enable module cache
go-version: stable can drift. Use the version in go.mod and cache modules for speed. Apply here (repeat similarly in Test and Benchmark jobs):
- - name: Set up Go - uses: actions/setup-go@v5 - with: - go-version: stable + - name: Set up Go + uses: actions/setup-go@v5 + with: + go-version-file: go.mod + cache: true + cache-dependency-path: go.sum
36-41: Avoid "latest" golangci-lint to reduce CI flakiness. Pin a specific linter version to avoid surprise breakages.
For example:
- name: Run golangci-lint uses: golangci/golangci-lint-action@v8 with: - version: latest + version: v1.60.3 args: --timeout=5m
6-20: Ensure CI runs when the workflow changes. Edits to this file won't trigger tests/linters because of the path filters. Include the workflow file itself.
paths: - "**.go" - "go.mod" - "go.sum" + - ".github/workflows/go.yml"Apply to both push and pull_request sections.
62-64: Improve test robustness and coverage accounting. Shuffle tests to catch order dependencies and include all packages in coverage.
- run: go test -race -coverprofile=coverage.out -covermode=atomic ./... + run: go test -race -shuffle=on -covermode=atomic -coverpkg=./... -coverprofile=coverage.out ./...Optional: add a preceding
go vet ./...step.
79-87: Minor: ensure deps are pre-fetched for benchmarks. Warm modules to reduce variance and flakiness in benchmark timings.
- name: Run benchmarks - run: go test -bench=. -benchmem ./... | tee benchmark.txt + run: | + go mod download + go test -bench=. -benchmem ./... | tee benchmark.txtAlso consider
timeout-minutes: 15on the job.
1-2: Optional: add concurrency to cancel superseded runs. Saves CI minutes on rapid pushes.
name: Go +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: trueinternal/sms-gateway/modules/devices/service.go (1)
103-111: Batch handler: honor context, log a summary, and count failures. Processing continues on canceled contexts and silently returns nil regardless of failures.
Apply:
func (s *Service) SetLastSeen(ctx context.Context, batch map[string]time.Time) error { - for deviceId, lastSeen := range batch { - if err := s.devices.SetLastSeen(ctx, deviceId, lastSeen); err != nil { - s.logger.Error("can't set last seen", zap.String("device_id", deviceId), zap.Error(err)) - } - } - - return nil + if len(batch) == 0 { + return nil + } + var updated, failed int + for deviceId, lastSeen := range batch { + if ctx.Err() != nil { + break + } + if err := s.devices.SetLastSeen(ctx, deviceId, lastSeen); err != nil { + failed++ + s.logger.Error("can't set last seen", zap.String("device_id", deviceId), zap.Error(err)) + continue + } + updated++ + } + s.logger.Debug("processed last_seen batch", zap.Int("updated", updated), zap.Int("failed", failed)) + return ctx.Err() }internal/sms-gateway/modules/auth/service.go (1)
147-150: Use defer cancel inside goroutine. Slightly safer and idiomatic; guarantees cancellation on early returns/panics.
Apply:
- go func(id string) { - ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) - s.onlineSvc.SetOnline(ctx, id) - cancel() - }(device.ID) + go func(id string) { + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + s.onlineSvc.SetOnline(ctx, id) + }(device.ID)internal/sms-gateway/online/service.go (5)
37-54: Reduce periodic log noise and flush on shutdown. Downgrade the per-minute infos to debug and attempt a best-effort flush on exit.
func (s *service) Run(ctx context.Context) { ticker := time.NewTicker(1 * time.Minute) defer ticker.Stop() for { select { case <-ctx.Done(): - return + s.logger.Info("Shutting down: flushing online status") + flushCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + if err := s.persist(flushCtx); err != nil { + s.logger.Warn("Flush on shutdown failed", zap.Error(err)) + } + return case <-ticker.C: - s.logger.Info("Persisting online status") + s.logger.Debug("Persisting online status") if err := s.persist(ctx); err != nil { s.logger.Error("Can't persist online status", zap.Error(err)) - } else { - s.logger.Info("Online status persisted") } } } }
69-96: Short-circuit when nothing to persist. Skip parsing/DB call when drain is empty and keep these logs at debug.
func (s *service) persist(ctx context.Context) error { items, err := s.cache.Drain(ctx) if err != nil { return fmt.Errorf("can't drain cache: %w", err) } - s.logger.Info("Drained cache", zap.Int("count", len(items))) + if len(items) == 0 { + s.logger.Debug("No online statuses to persist") + return nil + } + s.logger.Debug("Drained cache", zap.Int("count", len(items))) - timestamps := maps.MapValues(items, func(v string) time.Time { - t, err := time.Parse(time.RFC3339, v) - if err != nil { - s.logger.Error("Can't parse last seen", zap.String("last_seen", v), zap.Error(err)) - return time.Now().UTC() - } - - return t - }) + timestamps := make(map[string]time.Time, len(items)) + for id, v := range items { + t, err := time.Parse(time.RFC3339, v) + if err != nil { + s.logger.Error("Can't parse last seen", zap.String("device_id", id), zap.String("last_seen", v), zap.Error(err)) + t = time.Now().UTC() + } + timestamps[id] = t + } - s.logger.Info("Parsed last seen timestamps", zap.Int("count", len(timestamps))) + s.logger.Debug("Parsed last seen timestamps", zap.Int("count", len(timestamps))) if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil { return fmt.Errorf("can't set last seen: %w", err) } s.logger.Info("Set last seen") return nil }
37-54: Make interval configurable and add jitter. Avoid minute-boundary thundering herd and allow tuning via config; add small random jitter to the ticker.
Also applies to: 69-96
89-91: Consider batching to limit write size and latency. If timestamps can be large, chunk the map into fixed-size batches for SetLastSeen to reduce DB pressure and timeouts; a sketch follows below.
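A minimal sketch of one way to chunk the drained map into bounded batches before handing them to SetLastSeen; the batch size and helper names are illustrative assumptions.

```go
package online

import (
	"context"
	"time"
)

// chunkTimestamps splits a device-id -> last-seen map into maps of at most size entries.
func chunkTimestamps(all map[string]time.Time, size int) []map[string]time.Time {
	if size <= 0 {
		size = len(all)
	}
	var batches []map[string]time.Time
	current := make(map[string]time.Time, size)
	for id, ts := range all {
		current[id] = ts
		if len(current) == size {
			batches = append(batches, current)
			current = make(map[string]time.Time, size)
		}
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

// persistInBatches is a hypothetical helper showing how the batches would be applied.
func persistInBatches(ctx context.Context, set func(context.Context, map[string]time.Time) error, all map[string]time.Time) error {
	for _, batch := range chunkTimestamps(all, 500) {
		if err := set(ctx, batch); err != nil {
			return err
		}
	}
	return nil
}
```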
69-96: Add metrics for observability. Expose counters/timers: drained_count, parse_errors, persist_duration, last_success_timestamp. Helps alerting and SLOs.
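A hedged sketch of how such metrics could be wired up with prometheus/client_golang; the metric names mirror the suggestion above and are not part of the PR.

```go
package online

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// Illustrative metrics for the persist loop; names follow the review suggestion.
var (
	drainedCount = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "online_drained_count_total",
		Help: "Number of last-seen entries drained from the cache.",
	})
	parseErrors = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "online_parse_errors_total",
		Help: "Number of last-seen values that failed to parse.",
	})
	persistDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "online_persist_duration_seconds",
		Help:    "Duration of a persist cycle.",
		Buckets: prometheus.DefBuckets,
	})
	lastSuccess = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "online_last_success_timestamp_seconds",
		Help: "Unix time of the last successful persist.",
	})
)

func init() {
	prometheus.MustRegister(drainedCount, parseErrors, persistDuration, lastSuccess)
}

// observePersist wraps a persist call with the metrics above.
func observePersist(persist func() (int, error)) error {
	start := time.Now()
	count, err := persist()
	persistDuration.Observe(time.Since(start).Seconds())
	if err != nil {
		return err
	}
	drainedCount.Add(float64(count))
	lastSuccess.SetToCurrentTime()
	return nil
}
```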
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- .github/workflows/docker-publish.yml (0 hunks)
- .github/workflows/go.yml (1 hunks)
- configs/config.example.yml (1 hunks)
- internal/sms-gateway/modules/auth/service.go (5 hunks)
- internal/sms-gateway/modules/devices/repository.go (1 hunks)
- internal/sms-gateway/modules/devices/service.go (1 hunks)
- internal/sms-gateway/online/module.go (1 hunks)
- internal/sms-gateway/online/service.go (1 hunks)
- pkg/cache/memory_profile_test.go (1 hunks)
💤 Files with no reviewable changes (2)
- .github/workflows/golangci-lint.yml
- .github/workflows/docker-publish.yml
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/sms-gateway/online/module.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-10T11:11:43.452Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.452Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
pkg/cache/memory_profile_test.go
🧬 Code graph analysis (4)
internal/sms-gateway/modules/devices/repository.go (2)
- internal/sms-gateway/models/models.go (1): Device (25-36)
- internal/sms-gateway/modules/messages/repository.go (3): r (141-164), r (198-205), r (166-191)
internal/sms-gateway/modules/auth/service.go (2)
- internal/sms-gateway/modules/devices/service.go (1): Service (29-38)
- internal/sms-gateway/online/service.go (1): Service (14-17)
internal/sms-gateway/online/service.go (3)
- internal/sms-gateway/modules/auth/service.go (1): Service (39-52)
- internal/sms-gateway/modules/devices/service.go (1): Service (29-38)
- internal/sms-gateway/cache/factory.go (1): Cache (11-11)
pkg/cache/memory_profile_test.go (2)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/errors.go (1): ErrKeyNotFound (7-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build / Docker image (linux/arm64)
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
🔇 Additional comments (8)
pkg/cache/memory_profile_test.go (1)
85-90: Nice: Drain error is handled. Good catch adding error handling for Drain here.
configs/config.example.yml (2)
27-28: Defaulting/backcompat confirmed — Cache.URL defaults to "memory://". Verified: internal/config/config.go sets the default Cache.URL = "memory://" (internal/config/config.go:104); the loader uses defaultConfig via config.LoadConfig(&defaultConfig) (internal/config/module.go:25); NewFactory also falls back to "memory://" when URL is empty (internal/sms-gateway/cache/factory.go:22–23). No change required.
27-28: Redact cache URL in logs — avoid leaking credentials. No repo code was found that logs the cache URL, but NewFactory passes the full URL into redis.New (internal/sms-gateway/cache/factory.go). Ensure neither our code nor the redis client logs the full URL; if logging is necessary, sanitize first (parse the URL and clear u.User, or replace credentials with "") or log only scheme+host (u.Scheme + "://" + u.Host); a minimal sketch follows after the checklist below.
- Check internal/sms-gateway/cache/factory.go — avoid logging or passing an unhashed URL; sanitize before logging or before handing to clients that may log it.
- Check configs/config.example.yml — do not include credentials in examples; document that credentials must be kept secret.
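A minimal sketch of the sanitization idea (strip userinfo before logging); the function name and package placement are assumptions.

```go
package cache

import "net/url"

// sanitizeURL strips credentials from a cache URL so it can be logged safely.
// Sketch only: the real code may prefer logging just scheme and host.
func sanitizeURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return "<invalid cache url>"
	}
	u.User = nil // drop username/password
	return u.String()
}
```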
internal/sms-gateway/modules/devices/service.go (1)
103-111: No stale UpdateLastSeen usages found. rg found no matches for UpdateLastSeen; callers use SetLastSeen (e.g. internal/sms-gateway/online/service.go:89); implementation at internal/sms-gateway/modules/devices/service.go:103 and internal/sms-gateway/modules/devices/repository.go:74.
internal/sms-gateway/modules/auth/service.go (4)
14-14: Online service wiring import looks good
34-35: DI: added OnlineSvc to Params
47-48: Service field: onlineSvc injection
61-63: Constructor wires onlineSvc and logger
8ad3f09 to 5abcc8d
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/sms-gateway/modules/devices/service.go (1)
131-135: Bug: token cache invalidation uses the wrong key; the entry won't be evicted. GetByToken caches by sha256(token) hex key, but Remove deletes by raw device.AuthToken. This leaves stale cache entries.
-	if err := s.tokensCache.Delete(device.AuthToken); err != nil {
+	hash := sha256.Sum256([]byte(device.AuthToken))
+	if err := s.tokensCache.Delete(hex.EncodeToString(hash[:])); err != nil {
 		s.logger.Error("can't invalidate token cache", zap.Error(err))
 	}
♻️ Duplicate comments (5)
pkg/cache/memory_profile_test.go (5)
25-25: Fix invalid ranges over integers (won't compile). Replace range-over-int with classic for-loops.
-	for i := range numItems {
+	for i := 0; i < numItems; i++ {

-	for i := range numItems {
+	for i := 0; i < numItems; i++ {

-	for i := range numItems {
+	for i := 0; i < numItems; i++ {

-	for range numGCs {
+	for i := 0; i < numGCs; i++ {

-	for j := range 100 {
+	for j := 0; j < 100; j++ {

-	for i := range numItems {
+	for i := 0; i < numItems; i++ {

-	for i := range numCaches {
+	for i := 0; i < numCaches; i++ {

-	for j := range 10 {
+	for j := 0; j < 10; j++ {
Also applies to: 75-75, 175-175, 187-187, 192-192, 202-202, 228-228, 233-233
45-55: Use HeapAlloc delta, not TotalAlloc, for per-item growth. TotalAlloc is cumulative and misleads tests.
-	allocDiff := m2.TotalAlloc - m1.TotalAlloc
-	allocPerItem := float64(allocDiff) / float64(numItems)
+	heapDiff := m2.HeapAlloc - m1.HeapAlloc
+	perItem := float64(heapDiff) / float64(numItems)
@@
-	t.Logf("  Total allocated: %d bytes", m2.TotalAlloc)
-	t.Logf("  Allocation difference: %d bytes", allocDiff)
-	t.Logf("  Allocations per item: %.2f bytes", allocPerItem)
+	t.Logf("  Heap alloc: %d bytes", m2.HeapAlloc)
+	t.Logf("  Heap delta: %d bytes", heapDiff)
+	t.Logf("  Heap delta per item: %.2f bytes", perItem)
@@
-	if allocPerItem > 300 {
-		t.Errorf("Expected less than 300 bytes per item, got %.2f bytes", allocPerItem)
+	if perItem > 300 {
+		t.Errorf("Expected less than 300 bytes per item, got %.2f bytes", perItem)
Also applies to: 58-60
95-109: MemoryCleanup: compare heap-in-use, not cumulative allocation.- allocDiff := m2.TotalAlloc - m1.TotalAlloc + heapDiff := m2.HeapAlloc - m1.HeapAlloc @@ - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff) @@ - if allocDiff > 2*1024*1024 { // 2MB - t.Errorf("Expected less than 2MB memory growth after cleanup, got %d bytes", allocDiff) + if heapDiff > 2*1024*1024 { // 2MB + t.Errorf("Expected less than 2MB memory growth after cleanup, got %d bytes", heapDiff)
150-166: MemoryPressure: switch to HeapAlloc delta to avoid false positives.- allocDiff := m2.TotalAlloc - m1.TotalAlloc + heapDiff := m2.HeapAlloc - m1.HeapAlloc @@ - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff) @@ - if allocDiff > expectedMaxGrowth { - t.Errorf("Expected less than %d bytes memory growth under pressure, got %d bytes", expectedMaxGrowth, allocDiff) + if heapDiff > expectedMaxGrowth { + t.Errorf("Expected less than %d bytes memory growth under pressure, got %d bytes", expectedMaxGrowth, heapDiff)
243-245: Check Drain error. Avoid silently ignoring failures.
-	// Clear the cache
-	tempCache.Drain(ctx)
+	// Clear the cache
+	if _, err := tempCache.Drain(ctx); err != nil {
+		t.Errorf("Drain failed: %v", err)
+	}
🧹 Nitpick comments (3)
internal/sms-gateway/modules/devices/service.go (1)
103-119: Don't swallow per-item failures; aggregate and return them (and include more context in logs). Right now, device-level errors are only logged and the function returns ctx.Err(), which is usually nil, masking failed updates. Aggregate per-item errors (and combine with ctx.Err()) and add last_seen to logs. Also prefer deviceID for the Go initialism.
func (s *Service) SetLastSeen(ctx context.Context, batch map[string]time.Time) error { if len(batch) == 0 { return nil } - for deviceId, lastSeen := range batch { - if ctx.Err() != nil { - break - } - - if err := s.devices.SetLastSeen(ctx, deviceId, lastSeen); err != nil { - s.logger.Error("can't set last seen", zap.String("device_id", deviceId), zap.Error(err)) - } - } - - return ctx.Err() + var multiErr error + for deviceID, lastSeen := range batch { + if err := ctx.Err(); err != nil { + return errors.Join(err, multiErr) + } + if err := s.devices.SetLastSeen(ctx, deviceID, lastSeen); err != nil { + multiErr = errors.Join(multiErr, fmt.Errorf("device %s: %w", deviceID, err)) + s.logger.Error("can't set last seen", + zap.String("device_id", deviceID), + zap.Time("last_seen", lastSeen), + zap.Error(err), + ) + } + } + return multiErr }Add missing import:
// add to imports
"errors"
If your intent is best-effort with no upstream retries, keep return nil on partial failures but at least count and emit a single warning with the failure count. Want a bounded-concurrency variant (errgroup + worker pool) for large batches? I can provide one.
pkg/cache/memory_profile_test.go (2)
1-1: Scope errcheck suppression to benchmark calls; remove file-wide nolint. Prefer local suppressions to avoid masking test issues.
```diff
-//nolint:errcheck
-	cache.Set(ctx, key, value)
+	cache.Set(ctx, key, value) //nolint:errcheck
-	cache.Get(ctx, key)
+	cache.Get(ctx, key) //nolint:errcheck
@@
-	cache.Delete(ctx, key)
+	cache.Delete(ctx, key) //nolint:errcheck
```
Also applies to: 290-295
15-16: Nit: avoid shadowing the imported package name. Rename the local variable `cache` to `c` (applies similarly elsewhere) to reduce confusion.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- .github/workflows/go.yml (1 hunks)
- internal/sms-gateway/cache/factory.go (1 hunks)
- internal/sms-gateway/modules/auth/service.go (5 hunks)
- internal/sms-gateway/modules/devices/repository.go (1 hunks)
- internal/sms-gateway/modules/devices/service.go (1 hunks)
- internal/sms-gateway/online/service.go (1 hunks)
- pkg/cache/memory_profile_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- .github/workflows/go.yml
- internal/sms-gateway/cache/factory.go
- internal/sms-gateway/online/service.go
- internal/sms-gateway/modules/auth/service.go
- internal/sms-gateway/modules/devices/repository.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-10T11:11:43.452Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.452Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
pkg/cache/memory_profile_test.go
🧬 Code graph analysis (1)
pkg/cache/memory_profile_test.go (1)
pkg/cache/memory.go (1)
NewMemory(16-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build / Docker image (linux/arm64)
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
🔇 Additional comments (3)
internal/sms-gateway/modules/devices/service.go (1)
103-119: Resolved — SetLastSeen enforces monotonic updates. repository.SetLastSeen uses Where("id = ? AND last_seen < ?", id, lastSeen) before Update("last_seen", lastSeen) in internal/sms-gateway/modules/devices/repository.go, so older timestamps are ignored; no change required.
pkg/cache/memory_profile_test.go (2)
254-271: LGTM: correct leak check using HeapAlloc delta. Good use of int64 to handle possible negative deltas and clear logging.
56-61: Thresholds validated — no CI flakiness observed. Ran the targeted memory tests 25×; allocations per item were consistently ~225.98 bytes (<300) and other memory metrics (allocation diffs, heap delta) remained stable; all targeted tests passed.
Actionable comments posted: 0
🧹 Nitpick comments (2)
internal/sms-gateway/modules/devices/service.go (2)
136-141: Evict-by-hash is good; enrich log context (and downgrade on cache-miss if applicable). Add device/cache identifiers to the log to aid debugging. If Delete returns a “not found” sentinel, log at debug to avoid noise.
Apply this diff:
```diff
-	if err := s.tokensCache.Delete(cacheKey); err != nil {
-		s.logger.Error("can't invalidate token cache", zap.Error(err))
-	}
+	if err := s.tokensCache.Delete(cacheKey); err != nil {
+		// If the cache lib exposes a not-found error, consider logging at Debug here.
+		s.logger.Error("can't invalidate token cache",
+			zap.String("device_id", device.ID),
+			zap.String("cache_key", cacheKey),
+			zap.Error(err),
+		)
+	}
```
If you confirm the cache package returns a specific not-found error, I can propose a small guard to downgrade the log level accordingly.
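For illustration, a minimal sketch of such a guard, assuming pkg/cache's ErrKeyNotFound sentinel is what Delete returns for missing entries (that behavior still needs confirming) and that the surrounding identifiers (s.tokensCache, device.ID, cacheKey) match the service code:

```go
// Sketch only: split the log level on the cache's not-found sentinel.
if err := s.tokensCache.Delete(cacheKey); err != nil {
	if errors.Is(err, cache.ErrKeyNotFound) {
		// Entry already evicted/expired; not an operational problem.
		s.logger.Debug("token cache entry already absent",
			zap.String("device_id", device.ID),
			zap.Error(err),
		)
	} else {
		s.logger.Error("can't invalidate token cache",
			zap.String("device_id", device.ID),
			zap.String("cache_key", cacheKey),
			zap.Error(err),
		)
	}
}
```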
104-124: Add empty-device-id guard and normalize lastSeen to UTC; preserve zero-time behavior
- Add a guard for empty device IDs and call lastSeen = lastSeen.UTC() in internal/sms-gateway/modules/devices/service.go before invoking s.devices.SetLastSeen to avoid DB timezone mismatches.
- Do not treat zero timestamps as errors — repository already ignores zero timestamps (internal/sms-gateway/modules/devices/repository.go:74–76).
- Verified no UpdateLastSeen usages remain; SetLastSeen is called from internal/sms-gateway/online/service.go:91.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/sms-gateway/modules/devices/service.go(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Benchmark
- GitHub Check: E2E
- GitHub Check: Analyze (go)
🔇 Additional comments (1)
internal/sms-gateway/modules/devices/service.go (1)
7-7: errors.Join import is appropriate here. Used correctly for multi-error accumulation; consistent with the Go 1.20+ toolchain bump.
dd32085 to 1438a30
Actionable comments posted: 2
♻️ Duplicate comments (7)
.github/workflows/go.yml (2)
70-73: Codecov upload will fail on forked PRs. PRs from forks don't have access to CODECOV_TOKEN, causing the upload step to fail. Either guard the step or remove the token for public repositories.
Apply this fix to guard the upload:
```diff
-      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v5
-        with:
-          token: ${{ secrets.CODECOV_TOKEN }}
+      - name: Upload coverage to Codecov
+        if: ${{ secrets.CODECOV_TOKEN != '' }}
+        uses: codecov/codecov-action@v5
+        with:
+          token: ${{ secrets.CODECOV_TOKEN }}
+          fail_ci_if_error: false
```
100-110: Benchmark alerts ineffective without baseline persistence. With auto-push: false, no historical data is persisted, making comparisons and alerts meaningless. Either enable persistence or disable alerting to avoid false failures.
Option A (recommended): Enable persistence by adding contents: write permission and setting auto-push: true:
```diff
   benchmark:
     name: Benchmark
     runs-on: ubuntu-latest
     permissions:
-      contents: read
+      contents: write
       pull-requests: write

-          fail-on-alert: true
+          auto-push: true
+          fail-on-alert: true
```
Option B: Disable alerting if persistence cannot be enabled:
```diff
-          fail-on-alert: true
+          fail-on-alert: false
+          comment-on-alert: false
```
pkg/cache/memory_bench_test.go (5)
1-13: File-level error suppression conflicts with past review feedback. The file-level //nolint:errcheck directive suppresses error checking for all benchmarks, but past review comments specifically requested adding explicit error handling to benchmark loops to ensure failures don't go unnoticed.
15-28: Missing error handling in Set benchmark. The benchmark ignores errors from cache.Set, which could lead to misleading performance results if operations fail silently.
30-46: Missing error handling for Get operations. Similar to the Set benchmark, both the pre-population and the benchmark loop ignore errors from cache operations.
48-65: Missing error handling for SetAndGet operations. The benchmark should handle errors from both Set and Get operations to ensure meaningful performance measurements.
67-80: SetOrFail benchmark needs proper error handling. SetOrFail is expected to return ErrKeyExists on subsequent calls to the same key. The benchmark should handle this specific error case appropriately.
🧹 Nitpick comments (7)
internal/sms-gateway/modules/devices/repository.go (1)
74-87: Good fix: safe, monotonic last_seen update with proper GORM WHERE.
- Guards zero timestamp, uses placeholders, and prevents regressions via `last_seen < ?`. Looks solid.
Normalize time precision/timezone to match DB TIMESTAMP(3).
- Recommend UTC + millisecond truncation to avoid no‑ops from sub‑ms precision mismatches.
Apply:
```diff
 func (r *repository) SetLastSeen(ctx context.Context, id string, lastSeen time.Time) error {
 	if lastSeen.IsZero() {
 		return nil // ignore zero timestamps
 	}
+	// Normalize to DB precision/timezone (TIMESTAMP(3) in UTC)
+	lastSeen = lastSeen.UTC().Truncate(time.Millisecond)
 	res := r.db.WithContext(ctx).
 		Model(&models.Device{}).
 		Where("id = ? AND last_seen < ?", id, lastSeen).
-		Update("last_seen", lastSeen)
+		Update("last_seen", lastSeen)
 	if res.Error != nil {
 		return res.Error
 	}
 	// RowsAffected==0 => not found or stale timestamp; treat as no-op.
 	return nil
 }
```
Confirm UpdatedAt semantics.
- If bumping updated_at on last-seen writes is undesirable (affects removeUnused, which keys off updated_at), switch to UpdateColumn to avoid auto-updating timestamps.
Option:
```diff
-		Update("last_seen", lastSeen)
+		UpdateColumn("last_seen", lastSeen)
```
pkg/cache/memory_edge_test.go (6)
242-247: Tighten the assertion; allowing ErrKeyNotFound hides a bug. This test sets the key right before the Get (no TTL). Getting ErrKeyNotFound should fail the test.
Apply:
```diff
-	_, err := c.Get(ctx, key)
-	if err != nil && err != cache.ErrKeyNotFound {
-		t.Errorf("Get failed: %v", err)
-	}
+	_, err := c.Get(ctx, key)
+	if err != nil {
+		t.Errorf("Get failed: %v", err)
+	}
```
221-228: Gate timing-sensitive test in -short to reduce CI flakiness. This test measures wall time; skip under -short.
Apply:
```diff
 func TestMemoryCache_RapidOperations(t *testing.T) {
 	// Test rapid c operations
 	c := cache.NewMemory(0)
 	ctx := context.Background()
+	if testing.Short() {
+		t.Skip("skipping timing-sensitive test in -short mode")
+	}
```
47-56: Reduce timing flakiness for immediate-expiration test. 1 ms TTL with 2× sleep can flake on busy CI. Give a bit more slack.
Apply:
```diff
-	ttl := 1 * time.Millisecond
+	ttl := 5 * time.Millisecond
 ...
-	time.Sleep(2 * ttl)
+	time.Sleep(3 * ttl)
```
378-381: Guard against negative sleep and tighten jitter logic. Defensive programming keeps the test stable if TTL changes later.
Apply:
```diff
-	jitter := time.Duration(id%3) * time.Millisecond
-	time.Sleep(ttl - 2*time.Millisecond + jitter)
+	jitter := time.Duration(id%3) * time.Millisecond
+	d := ttl - 2*time.Millisecond + jitter
+	if d < 0 {
+		d = 0
+	}
+	time.Sleep(d)
```
63-82: Nil context contract: confirm API guarantees. You intentionally allow nil context (suppressed via nolint). Confirm the Cache interface docs commit to accepting nil; otherwise this test bakes in an undesirable API promise.
84-104: Empty key semantics: confirm cross-backend behavior. If empty keys must be supported, ensure Redis and any future backends behave identically; if not, this test should be dropped or feature-gated.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sum is excluded by !**/*.sum
📒 Files selected for processing (28)
.github/workflows/docker-publish.yml(0 hunks).github/workflows/go.yml(1 hunks).github/workflows/golangci-lint.yml(0 hunks)build/package/Dockerfile(1 hunks)build/package/Dockerfile.dev(1 hunks)configs/config.example.yml(1 hunks)go.mod(4 hunks)internal/config/config.go(3 hunks)internal/config/module.go(2 hunks)internal/sms-gateway/app.go(4 hunks)internal/sms-gateway/cache/config.go(1 hunks)internal/sms-gateway/cache/factory.go(1 hunks)internal/sms-gateway/cache/module.go(1 hunks)internal/sms-gateway/modules/auth/service.go(5 hunks)internal/sms-gateway/modules/devices/repository.go(1 hunks)internal/sms-gateway/modules/devices/service.go(3 hunks)internal/sms-gateway/online/module.go(1 hunks)internal/sms-gateway/online/service.go(1 hunks)pkg/cache/cache.go(1 hunks)pkg/cache/errors.go(1 hunks)pkg/cache/memory.go(1 hunks)pkg/cache/memory_bench_test.go(1 hunks)pkg/cache/memory_concurrency_test.go(1 hunks)pkg/cache/memory_edge_test.go(1 hunks)pkg/cache/memory_profile_test.go(1 hunks)pkg/cache/memory_test.go(1 hunks)pkg/cache/options.go(1 hunks)pkg/cache/redis.go(1 hunks)
💤 Files with no reviewable changes (2)
- .github/workflows/docker-publish.yml
- .github/workflows/golangci-lint.yml
🚧 Files skipped from review as they are similar to previous changes (20)
- internal/sms-gateway/cache/module.go
- internal/sms-gateway/cache/config.go
- pkg/cache/errors.go
- pkg/cache/cache.go
- internal/config/module.go
- pkg/cache/options.go
- build/package/Dockerfile.dev
- configs/config.example.yml
- build/package/Dockerfile
- internal/sms-gateway/modules/devices/service.go
- internal/sms-gateway/cache/factory.go
- pkg/cache/redis.go
- internal/sms-gateway/online/module.go
- internal/sms-gateway/modules/auth/service.go
- internal/config/config.go
- pkg/cache/memory_profile_test.go
- pkg/cache/memory_concurrency_test.go
- pkg/cache/memory.go
- internal/sms-gateway/app.go
- internal/sms-gateway/online/service.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.
Applied to files:
pkg/cache/memory_edge_test.go
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
pkg/cache/memory_edge_test.gopkg/cache/memory_bench_test.go
🧬 Code graph analysis (4)
internal/sms-gateway/modules/devices/repository.go (1)
internal/sms-gateway/models/models.go (1)
Device(25-36)
pkg/cache/memory_edge_test.go (3)
pkg/cache/memory.go (1)
NewMemory(16-23)pkg/cache/options.go (1)
WithTTL(22-30)pkg/cache/errors.go (2)
ErrKeyExpired(9-9)ErrKeyNotFound(7-7)
pkg/cache/memory_test.go (3)
pkg/cache/memory.go (1)
NewMemory(16-23)pkg/cache/options.go (2)
WithTTL(22-30)WithValidUntil(34-38)pkg/cache/errors.go (2)
ErrKeyNotFound(7-7)ErrKeyExists(11-11)
pkg/cache/memory_bench_test.go (2)
pkg/cache/memory.go (1)
NewMemory(16-23)pkg/cache/options.go (1)
WithTTL(22-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: E2E
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
🔇 Additional comments (11)
.github/workflows/go.yml (1)
78-80: Benchmark permissions are correctly configured. The job-level permissions correctly grant pull-requests: write permission for posting benchmark comments while maintaining read-only access for other operations. This addresses the workflow-wide contents: read limitation.
pkg/cache/memory_test.go (4)
1-9: Clean test package organization. The package structure and imports are well organized for comprehensive cache testing.
11-33: LGTM for basic Set/Get functionality. The test correctly verifies basic cache operations with appropriate error handling and value validation.
324-358: Good TTL expiration and cleanup validation. This test properly exercises the cleanup functionality by setting items with short TTL, waiting for expiration, and verifying cleanup removes expired items.
469-491: Appropriate large value testing. Testing with 1MB values helps validate memory handling and ensures the cache can handle reasonably large payloads without issues.
pkg/cache/memory_bench_test.go (3)
224-263: Comprehensive mixed workload benchmark design. The benchmark design with different read/write ratios and randomized access patterns provides valuable performance insights for realistic usage scenarios.
304-334: Good TTL overhead measurement approach. Comparing performance with and without TTL provides valuable insights into the cost of TTL features. The benchmark design is sound.
396-418: Excellent random access pattern benchmark. Pre-populating with 1000 keys and testing random access patterns provides realistic performance data for cache lookup scenarios.
go.mod (1)
3-3: Pin Go 1.24.1 in CI/Docker and vet new deps.
- No CI workflows or Dockerfiles pin Go (search found no actions/setup-go or FROM golang:); update GitHub Actions, Dockerfiles, and dev toolchain to Go 1.24.1.
- go.mod shows go 1.24.1 (line 3) and new/updated deps — validator v10.26.0 (ln 13), redis v9.9.0 (21), fx v1.24.0 (23), zap v1.27.0 (24); indirect: go-rendezvous, mimetype (45–46). Run full build + vuln checks.
- Run: go mod tidy && go build ./... && go vet ./... && govulncheck ./...
- Optional (reproducibility): add toolchain pin:
```diff
 go 1.24.1
+toolchain go1.24.1
```
pkg/cache/memory_edge_test.go (2)
309-327: Drain semantics assertion looks correct (memory backend). Per retrieved learnings, memory.Drain cleans expired items first, so expecting ErrKeyNotFound post-Drain is right. Consider mirroring this as an integration test for the Redis backend to ensure parity since Redis removes expired fields natively via HExpireAt.
Would you like a follow-up test template for the Redis cache variant?
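Something along these lines could serve as a starting point — a sketch only, written as an in-package test (to sidestep the module import path) and gated behind an integration build tag plus a REDIS_URL variable, both of which are illustrative. It assumes the NewRedis(client, prefix, ttl) constructor shown elsewhere in this PR, that Drain returns map[string]string, and that a missing key surfaces as ErrKeyNotFound on the Redis backend:

```go
//go:build integration

package cache

import (
	"context"
	"errors"
	"os"
	"testing"

	"github.com/redis/go-redis/v9"
)

// TestRedisCache_DrainParity mirrors the memory-backend Drain expectations
// against a real Redis instance; it never runs in the default unit-test suite.
func TestRedisCache_DrainParity(t *testing.T) {
	url := os.Getenv("REDIS_URL") // illustrative variable name, e.g. redis://localhost:6379/0
	if url == "" {
		t.Skip("REDIS_URL not set; skipping Redis integration test")
	}

	opts, err := redis.ParseURL(url)
	if err != nil {
		t.Fatalf("invalid REDIS_URL: %v", err)
	}
	client := redis.NewClient(opts)
	defer func() { _ = client.Close() }()

	ctx := context.Background()
	c := NewRedis(client, "drain-parity-test", 0)

	if err := c.Set(ctx, "device-1", "2025-01-01T00:00:00Z"); err != nil {
		t.Fatalf("Set failed: %v", err)
	}

	items, err := c.Drain(ctx)
	if err != nil {
		t.Fatalf("Drain failed: %v", err)
	}
	if items["device-1"] != "2025-01-01T00:00:00Z" {
		t.Errorf("Drain returned %q, want the stored value", items["device-1"])
	}

	// Same expectation as the memory test: after Drain the key is gone.
	if _, err := c.Get(ctx, "device-1"); !errors.Is(err, ErrKeyNotFound) {
		t.Errorf("Get after Drain: got %v, want ErrKeyNotFound", err)
	}
}
```

Placing it in something like pkg/cache/redis_integration_test.go keeps it next to the existing memory tests without affecting the default test run.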
1-12: Approve — great coverage; run flakiness + race locally (sandbox couldn't run -race). Solid edge-case matrix for TTL, drain, and concurrency. Nice use of package cache_test to assert the public API.
Sandbox test run failed with environment errors:
- go: -race requires cgo; enable cgo by setting CGO_ENABLED=1
- failed to start telemetry sidecar: os.Executable: readlink /proc/self/exe: no such file or directory
Run locally or in CI with one of these and attach output if flakes/failures appear:
```bash
# Run race locally with CGO enabled
CGO_ENABLED=1 go test ./pkg/cache -run 'TestMemoryCache_' -race -count=100 -timeout=5m -v

# If CGO cannot be enabled in your environment, run flakiness sweep without -race
go test ./pkg/cache -run 'TestMemoryCache_' -count=100 -timeout=5m -v
```
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/sms-gateway/modules/devices/repository.go (1)
70-72: Fix invalid GORM WHERE usage — use a parameterized predicate. Where("id", id) is not a valid predicate and can cause a full-table update or SQL error.
Location: internal/sms-gateway/modules/devices/repository.go:71
Apply:
```diff
-	return r.db.Model(&models.Device{}).Where("id", id).Update("push_token", token).Error
+	return r.db.Model(&models.Device{}).Where("id = ?", id).Update("push_token", token).Error
```
♻️ Duplicate comments (1)
internal/sms-gateway/modules/devices/repository.go (1)
74-87: Monotonic last_seen update + proper WHERE — resolved. Good fix: guards zero time, uses `id = ? AND last_seen < ?`, and UpdateColumn with context. This addresses the earlier feedback to prevent regressions and to use placeholders.
🧹 Nitpick comments (2)
internal/sms-gateway/modules/devices/repository.go (2)
74-87: Normalize to UTC to avoid TZ comparison edge cases. If callers pass non-UTC times, DB/session TZ differences can cause subtle misorders. Normalize before updating.
```diff
 func (r *repository) SetLastSeen(ctx context.Context, id string, lastSeen time.Time) error {
 	if lastSeen.IsZero() {
 		return nil // ignore zero timestamps
 	}
+	lastSeen = lastSeen.UTC()
 	res := r.db.WithContext(ctx).
 		Model(&models.Device{}).
 		Where("id = ? AND last_seen < ?", id, lastSeen).
 		UpdateColumn("last_seen", lastSeen)
```
Confirm the column type is TIMESTAMP/TIMESTAMPTZ with consistent TZ handling in your DB.
99-106: Pruning by last_seen — confirm policy and index for performance. Switching to last_seen changes retention semantics (vs. updated_at). Confirm this aligns with product policy. Also ensure an index on last_seen to avoid deletes scanning the whole table. If not already indexed, consider adding a GORM index tag/migration for devices.last_seen.
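A minimal sketch of the model-side change, trimmed to the relevant field; the real Device struct has more columns, and the exact tag string (and whether the index lands via AutoMigrate or a hand-written migration) is an assumption to adapt to the project's tooling:

```go
package models

import "time"

// Sketch: Device reduced to the field in question. The index tag lets the
// periodic cleanup and SetLastSeen filter use an index instead of scanning
// the whole devices table.
type Device struct {
	ID       string    `gorm:"primaryKey"`
	LastSeen time.Time `gorm:"not null;default:CURRENT_TIMESTAMP(3);index:idx_devices_last_seen"`
}
```

Existing deployments would still need a migration to create idx_devices_last_seen on the live table.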
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- internal/sms-gateway/modules/devices/repository.go (2 hunks)
- pkg/cache/memory_edge_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/cache/memory_edge_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/sms-gateway/modules/devices/repository.go (1)
internal/sms-gateway/models/models.go (1)
Device(25-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build / Docker image (linux/amd64)
- GitHub Check: Build / Docker image (linux/arm64)
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
ca5d2d5 to ba4ce53
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/sms-gateway/modules/devices/repository.go (1)
99-106: Add an index on last_seen and switch removeUnused to batched deletes. No index found on LastSeen (internal/sms-gateway/models/models.go) and the migration only adds the column (internal/sms-gateway/models/migrations/mysql/20240214195939_device_last_seen.sql) — deleting by last_seen can cause full-table scans and long locks. Actions:
- Add an index for last_seen (e.g., new migration: CREATE INDEX idx_devices_last_seen ON devices(last_seen), or add gorm:"index:idx_devices_last_seen" to LastSeen and add a migration).
- Change removeUnused to delete in small batches (loop DELETE ... LIMIT or delete by PK chunks) to avoid long locks; implement in internal/sms-gateway/modules/devices/repository.go (removeUnused).
♻️ Duplicate comments (6)
pkg/cache/memory_profile_test.go (3)
46-53: Use HeapAlloc deltas; TotalAlloc is cumulative and misleading.Switch to HeapAlloc for growth checks/logs to avoid false positives.
-allocDiff := m2.TotalAlloc - m1.TotalAlloc -allocPerItem := float64(allocDiff) / float64(numItems) - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) +heapDiff := m2.HeapAlloc - m1.HeapAlloc +allocPerItem := float64(heapDiff) / float64(numItems) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff)-allocDiff := m2.TotalAlloc - m1.TotalAlloc +heapDiff := m2.HeapAlloc - m1.HeapAlloc @@ - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff) @@ - if allocDiff > 2*1024*1024 { - t.Errorf("Expected less than 2MB memory growth after cleanup, got %d bytes", allocDiff) + if heapDiff > 2*1024*1024 { + t.Errorf("Expected less than 2MB memory growth after cleanup, got %d bytes", heapDiff)-allocDiff := m2.TotalAlloc - m1.TotalAlloc +heapDiff := m2.HeapAlloc - m1.HeapAlloc @@ - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff) @@ - if allocDiff > expectedMaxGrowth { - t.Errorf("Expected less than %d bytes memory growth under pressure, got %d bytes", expectedMaxGrowth, allocDiff) + if heapDiff > expectedMaxGrowth { + t.Errorf("Expected less than %d bytes memory growth under pressure, got %d bytes", expectedMaxGrowth, heapDiff)-b.Logf("TotalAlloc per op: %.2f bytes/op", float64(m2.TotalAlloc-m1.TotalAlloc)/float64(b.N)) +b.Logf("HeapAlloc per op: %.2f bytes/op", float64(m2.HeapAlloc-m1.HeapAlloc)/float64(b.N))Also applies to: 96-103, 151-157, 300-301
244-244: Handle Drain error. Don’t ignore the error from Drain.
```diff
-	tempCache.Drain(ctx)
+	if _, err := tempCache.Drain(ctx); err != nil {
+		t.Errorf("Drain failed: %v", err)
+	}
```
25-25: Fix: invalid range over integers (compile errors).Replace all “range” loops over ints with classic for-loops.
- for i := range numItems { + for i := 0; i < numItems; i++ { } - for i := range numItems { + for i := 0; i < numItems; i++ { } - for i := range numItems { + for i := 0; i < numItems; i++ { } - for range numGCs { + for i := 0; i < numGCs; i++ { } - for j := range 100 { + for j := 0; j < 100; j++ { } - for i := range numItems { + for i := 0; i < numItems; i++ { } - for i := range numCaches { + for i := 0; i < numCaches; i++ { } - for j := range 10 { + for j := 0; j < 10; j++ { }Also applies to: 75-75, 175-175, 187-187, 192-192, 202-202, 228-228, 233-233
pkg/cache/redis.go (1)
127-151: Missing field in HExpireAt call (TTL not applied). SetOrFail doesn’t pass the field to HExpireAt, so the TTL isn’t set.
Apply:
```diff
-	if !options.validUntil.IsZero() {
-		if err := r.client.HExpireAt(ctx, r.key, options.validUntil).Err(); err != nil {
+	if !options.validUntil.IsZero() {
+		if err := r.client.HExpireAt(ctx, r.key, key, options.validUntil).Err(); err != nil {
 			return fmt.Errorf("can't set cache item ttl: %w", err)
 		}
 	}
```
Also verify all other HExpire* usages repo-wide:
```bash
#!/bin/bash
rg -nP 'HExpire(?:At|)\s*\(' -C2 --type=go
```
internal/sms-gateway/online/service.go (2)
27-35: Name the logger for this service. Scope logs under a stable component name.
Apply:
```diff
 func New(devicesSvc *devices.Service, cache cache.Cache, logger *zap.Logger) Service {
 	return &service{
 		devicesSvc: devicesSvc,
 		cache:      cache,
-		logger:     logger,
+		logger:     logger.Named("online"),
 	}
 }
```
54-65: Validate deviceID and reduce log verbosity.Prevent empty-key writes and avoid per-call info‑level noise.
Apply:
func (s *service) SetOnline(ctx context.Context, deviceID string) { dt := time.Now().UTC().Format(time.RFC3339) - s.logger.Info("Setting online status", zap.String("device_id", deviceID), zap.String("last_seen", dt)) + if strings.TrimSpace(deviceID) == "" { + s.logger.Warn("Empty device_id; skipping SetOnline") + return + } + s.logger.Debug("Setting online status", zap.String("device_id", deviceID), zap.String("last_seen", dt)) if err := s.cache.Set(ctx, deviceID, dt); err != nil { s.logger.Error("Can't set online status", zap.String("device_id", deviceID), zap.Error(err)) return } - s.logger.Info("Online status set", zap.String("device_id", deviceID)) + s.logger.Debug("Online status set", zap.String("device_id", deviceID)) }Add missing import:
import ( "context" "fmt" + "strings" "time"
🧹 Nitpick comments (11)
pkg/cache/memory.go (3)
16-23: Redundant mutex initialization.sync.RWMutex has a valid zero value; remove the explicit init.
func NewMemory(ttl time.Duration) Cache { return &memoryCache{ items: make(map[string]*memoryItem), ttl: ttl, - - mux: sync.RWMutex{}, } }
157-169: Callback runs under write lock—document constraint.cb executes while holding m.mux; ensure it stays O(1) (as in Drain) to avoid blocking writers/readers. Consider a short comment to prevent future misuse.
77-85: Optional: lazily delete expired on Get.You could delete-on-read when expired to reduce stale entries, though it trades a read lock for a write lock. Fine to keep as-is if Cleanup runs periodically.
pkg/cache/memory_profile_test.go (1)
15-15: Avoid shadowing the imported cache package name.Rename local vars (e.g., c := cache.NewMemory(0)) to reduce confusion.
Also applies to: 65-65, 275-276
pkg/cache/redis.go (2)
49-52: Cleanup no-op: document behavior.Given Redis handles per-field TTL removal, this is fine—add a comment to clarify that Cleanup is intentionally a no-op on Redis.
35-47: Add a startup capability check for HEXPIRE (defensive).Fail fast if the Redis server doesn’t support hash-field TTL to avoid runtime “unknown command” errors.
Example helper (place in this file) and call from NewRedis:
func ensureHashFieldTTL(ctx context.Context, c *redis.Client) error { cmd := c.Do(ctx, "COMMAND", "INFO", "HEXPIRE") if err := cmd.Err(); err != nil { return err } if v, _ := cmd.Result().([]interface{}); len(v) == 0 { return fmt.Errorf("Redis server lacks HEXPIRE; requires Redis 7.4+") } return nil }Then in NewRedis, before return:
_ = ensureHashFieldTTL(context.Background(), client) // log/handle as desiredinternal/sms-gateway/online/service.go (4)
37-52: Flush on startup and on shutdown (bounded).Avoid losing recent writes on process stop; also persist immediately after start.
Apply:
func (s *service) Run(ctx context.Context) { ticker := time.NewTicker(1 * time.Minute) defer ticker.Stop() + // eager flush on start + if err := s.persist(ctx); err != nil { + s.logger.Error("Can't persist online status (startup)", zap.Error(err)) + } + for { select { case <-ctx.Done(): - return + // best-effort final flush with timeout + flushCtx, cancel := context.WithTimeout(context.Background(), 2*time.Second) + if err := s.persist(flushCtx); err != nil { + s.logger.Error("Can't persist online status (shutdown)", zap.Error(err)) + } + cancel() + return case <-ticker.C: s.logger.Debug("Persisting online status") if err := s.persist(ctx); err != nil { s.logger.Error("Can't persist online status", zap.Error(err)) } } } }
79-87: Don’t advance timestamps on parse errors; skip invalid entries.Advancing to now() can falsely move last_seen forward. Safer to drop invalids and log.
Apply:
- timestamps := maps.MapValues(items, func(v string) time.Time { - t, err := time.Parse(time.RFC3339, v) - if err != nil { - s.logger.Error("Can't parse last seen", zap.String("last_seen", v), zap.Error(err)) - return time.Now().UTC() - } - - return t - }) + timestamps := make(map[string]time.Time, len(items)) + invalid := 0 + for id, v := range items { + t, err := time.Parse(time.RFC3339, v) + if err != nil { + invalid++ + s.logger.Warn("Skipping invalid last_seen", zap.String("device_id", id), zap.String("last_seen", v), zap.Error(err)) + continue + } + timestamps[id] = t + } - s.logger.Debug("Parsed last seen timestamps", zap.Int("count", len(timestamps))) + s.logger.Debug("Parsed last seen timestamps", zap.Int("count", len(timestamps)), zap.Int("invalid", invalid))And remove the unused helper import:
- "github.com/capcom6/go-helpers/maps"Also applies to: 89-89
95-97: Lower success log to debug and include count.Keep logs quiet on healthy paths and add useful signal.
Apply:
- s.logger.Info("Set last seen") + s.logger.Debug("Set last seen", zap.Int("count", len(timestamps)))
19-25: Depend on a narrow interface, not concrete service.Improves testability and reduces coupling.
Define locally:
type lastSeenWriter interface { SetLastSeen(ctx context.Context, batch map[string]time.Time) error }Then change:
- devicesSvc *devices.Service + devicesSvc lastSeenWriterinternal/sms-gateway/modules/devices/repository.go (1)
74-88: Forward‑only update is correct; add cheap guards.Consider early‑return on empty id to avoid full‑table scans and keep using UpdateColumn to skip updated_at.
Apply:
func (r *repository) SetLastSeen(ctx context.Context, id string, lastSeen time.Time) error { - if lastSeen.IsZero() { + if id == "" || lastSeen.IsZero() { return nil // ignore zero timestamps } res := r.db.WithContext(ctx). Model(&models.Device{}). Where("id = ? AND last_seen < ?", id, lastSeen). UpdateColumn("last_seen", lastSeen)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sum is excluded by !**/*.sum
📒 Files selected for processing (28)
.github/workflows/docker-publish.yml(0 hunks).github/workflows/go.yml(1 hunks).github/workflows/golangci-lint.yml(0 hunks)build/package/Dockerfile(1 hunks)build/package/Dockerfile.dev(1 hunks)configs/config.example.yml(1 hunks)go.mod(4 hunks)internal/config/config.go(3 hunks)internal/config/module.go(2 hunks)internal/sms-gateway/app.go(4 hunks)internal/sms-gateway/cache/config.go(1 hunks)internal/sms-gateway/cache/factory.go(1 hunks)internal/sms-gateway/cache/module.go(1 hunks)internal/sms-gateway/modules/auth/service.go(5 hunks)internal/sms-gateway/modules/devices/repository.go(2 hunks)internal/sms-gateway/modules/devices/service.go(3 hunks)internal/sms-gateway/online/module.go(1 hunks)internal/sms-gateway/online/service.go(1 hunks)pkg/cache/cache.go(1 hunks)pkg/cache/errors.go(1 hunks)pkg/cache/memory.go(1 hunks)pkg/cache/memory_bench_test.go(1 hunks)pkg/cache/memory_concurrency_test.go(1 hunks)pkg/cache/memory_edge_test.go(1 hunks)pkg/cache/memory_profile_test.go(1 hunks)pkg/cache/memory_test.go(1 hunks)pkg/cache/options.go(1 hunks)pkg/cache/redis.go(1 hunks)
💤 Files with no reviewable changes (2)
- .github/workflows/golangci-lint.yml
- .github/workflows/docker-publish.yml
🚧 Files skipped from review as they are similar to previous changes (20)
- internal/config/config.go
- internal/sms-gateway/cache/config.go
- build/package/Dockerfile
- internal/config/module.go
- internal/sms-gateway/cache/factory.go
- pkg/cache/errors.go
- internal/sms-gateway/app.go
- internal/sms-gateway/cache/module.go
- internal/sms-gateway/online/module.go
- pkg/cache/memory_test.go
- .github/workflows/go.yml
- configs/config.example.yml
- pkg/cache/memory_edge_test.go
- pkg/cache/memory_concurrency_test.go
- build/package/Dockerfile.dev
- internal/sms-gateway/modules/auth/service.go
- pkg/cache/memory_bench_test.go
- pkg/cache/options.go
- pkg/cache/cache.go
- internal/sms-gateway/modules/devices/service.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
pkg/cache/memory.gopkg/cache/redis.gopkg/cache/memory_profile_test.go
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.
Applied to files:
pkg/cache/memory.gopkg/cache/redis.go
📚 Learning: 2025-09-10T23:54:16.828Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:118-121
Timestamp: 2025-09-10T23:54:16.828Z
Learning: Redis 7.4.0 introduced HEXPIRE, HEXPIREAT, and related commands that support per-field TTL on hash data structures. go-redis v9.6.0+ includes support for these commands with methods like HExpireAt.
Applied to files:
pkg/cache/redis.go
🧬 Code graph analysis (5)
internal/sms-gateway/online/service.go (3)
internal/sms-gateway/modules/devices/service.go (1)
Service(30-39)internal/sms-gateway/modules/auth/service.go (2)
Service(39-52)New(54-68)internal/sms-gateway/cache/factory.go (1)
Cache(15-15)
pkg/cache/memory.go (3)
pkg/cache/cache.go (1)
Cache(5-36)pkg/cache/options.go (1)
Option(6-6)pkg/cache/errors.go (3)
ErrKeyExists(11-11)ErrKeyNotFound(7-7)ErrKeyExpired(9-9)
pkg/cache/redis.go (3)
pkg/cache/cache.go (1)
Cache(5-36)pkg/cache/errors.go (2)
ErrKeyNotFound(7-7)ErrKeyExists(11-11)pkg/cache/options.go (1)
Option(6-6)
pkg/cache/memory_profile_test.go (1)
pkg/cache/memory.go (1)
NewMemory(16-23)
internal/sms-gateway/modules/devices/repository.go (1)
internal/sms-gateway/models/models.go (1)
Device(25-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build / Docker image (linux/arm64)
- GitHub Check: Build / Docker image (linux/amd64)
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
🔇 Additional comments (6)
pkg/cache/memory.go (1)
59-74: Drain logic looks correct and atomic. Expired items are purged before swapping maps; snapshot is built outside the lock. LGTM.
pkg/cache/redis.go (2)
93-105: Atomic GetAndDelete via Lua looks good.
106-125: HExpireAt call is correct — no change required.
go-redis v9 signature is HExpireAt(ctx, key string, tm time.Time, fields ...string), so r.client.HExpireAt(ctx, r.key, options.validUntil, key) uses the correct argument order.
Likely an incorrect or invalid review comment.
go.mod (2)
21-21: HEXPIRE/HEXPIREAT require Redis ≥ 7.4 — confirm server + client usage
- go-redis v9 HExpireAt signature: HExpireAt(ctx context.Context, key string, tm time.Time, fields ...string) — pass expiry time before the field names.
- HEXPIRE/HEXPIREAT introduced in Redis Open Source 7.4.0 — verify production Redis is >= 7.4.0.
- Add a startup capability check (COMMAND INFO HEXPIRE / HEXPIREAT) and fail fast if unsupported.
3-3: Pin CI and container images to Go 1.24.1. go.mod already contains "go 1.24.1" — update CI and container references to match:
- .github/workflows/go.yml — actions/setup-go: go-version: stable -> 1.24.1
- .github/workflows/docker-publish.yml — actions/setup-go: go-version: stable -> 1.24.1
- build/package/Dockerfile and build/package/Dockerfile.dev — FROM golang:1.24-alpine -> FROM golang:1.24.1-alpine (or FROM golang:1.24.1)
Confirm any developer docs mention the same version.
internal/sms-gateway/modules/devices/repository.go (1)
71-71: Good fix: parameterized WHERE. Prevents invalid GORM usage and ensures correct binding.
Actionable comments posted: 1
♻️ Duplicate comments (3)
internal/sms-gateway/online/service.go (3)
27-35: Name the logger for this module.Scope logs with a child logger to aid filtering.
func New(devicesSvc *devices.Service, cache cache.Cache, logger *zap.Logger) Service { return &service{ devicesSvc: devicesSvc, cache: cache, - logger: logger, + logger: logger.Named("online"), } }
54-65: Validate deviceID and reduce log verbosity.Avoid empty-key writes; SetOnline is high‑QPS, so use Debug not Info.
func (s *service) SetOnline(ctx context.Context, deviceID string) { - dt := time.Now().UTC().Format(time.RFC3339) - - s.logger.Info("Setting online status", zap.String("device_id", deviceID), zap.String("last_seen", dt)) - - if err := s.cache.Set(ctx, deviceID, dt); err != nil { - s.logger.Error("Can't set online status", zap.String("device_id", deviceID), zap.Error(err)) + id := strings.TrimSpace(deviceID) + if id == "" { + s.logger.Warn("Empty device_id; skipping SetOnline") + return + } + dt := time.Now().UTC().Format(time.RFC3339) + + s.logger.Debug("Setting online status", zap.String("device_id", id), zap.String("last_seen", dt)) + + if err := s.cache.Set(ctx, id, dt); err != nil { + s.logger.Error("Can't set online status", zap.String("device_id", id), zap.Error(err)) return } - s.logger.Info("Online status set", zap.String("device_id", deviceID)) + s.logger.Debug("Online status set", zap.String("device_id", id)) }Additional change required outside this hunk:
// add to imports import "strings"
59-60: Ensure a non-zero TTL is applied to online entries. Without TTL, keys can persist indefinitely if persist stalls. Verify factory wiring or apply per‑write TTL.
```bash
#!/bin/bash
# Verify TTL for the "online" cache and any WithTTL usage.
rg -n --type=go 'factory\.New\("online"\)|WithTTL\(|DefaultTTL|TTL' \
  internal/sms-gateway internal/config pkg/cache || true

# Inspect memory/redis cache constructors for ttl args.
rg -n --type=go 'New(Memory|Redis)\(' pkg/cache || true
```
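If the factory turns out not to apply a default TTL, a per‑write TTL is a one-line change in SetOnline — a sketch, assuming the WithTTL option from pkg/cache is reachable here (directly or re-exported) and using an illustrative 2-minute bound, i.e. anything comfortably above the 1-minute persist ticker:

```go
// Sketch: bound each online marker so stale entries expire on their own
// even if the periodic persist loop stalls.
const onlineTTL = 2 * time.Minute // illustrative value

if err := s.cache.Set(ctx, deviceID, dt, cache.WithTTL(onlineTTL)); err != nil {
	s.logger.Error("Can't set online status", zap.String("device_id", deviceID), zap.Error(err))
	return
}
```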
🧹 Nitpick comments (5)
internal/sms-gateway/online/service.go (5)
37-52: Flush once on start and on shutdown to avoid dropping buffered updates.Persist immediately after start and do a best‑effort flush on ctx cancel.
func (s *service) Run(ctx context.Context) { ticker := time.NewTicker(1 * time.Minute) defer ticker.Stop() + // best-effort initial persist + if err := s.persist(ctx); err != nil { + s.logger.Warn("Initial persist failed", zap.Error(err)) + } + for { select { - case <-ctx.Done(): - return + case <-ctx.Done(): + // best-effort final persist + flushCtx, cancel := context.WithTimeout(context.Background(), 2*time.Second) + defer cancel() + if err := s.persist(flushCtx); err != nil { + s.logger.Warn("Final persist failed", zap.Error(err)) + } + return case <-ticker.C: s.logger.Debug("Persisting online status") if err := s.persist(ctx); err != nil { s.logger.Error("Can't persist online status", zap.Error(err)) } } } }
38-38: Make the persist interval configurable (with sane default). Improves ops flexibility; reduces the need for code changes.
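One possible shape for this — a sketch only, with hypothetical names (Config, PersistInterval); the actual wiring would go through the module's existing fx/config plumbing:

```go
package online

import "time"

// Config is a hypothetical knob for the online module.
type Config struct {
	// PersistInterval controls how often buffered last-seen updates are
	// flushed to the devices service. Zero or negative means "use default".
	PersistInterval time.Duration
}

func (c Config) persistInterval() time.Duration {
	if c.PersistInterval <= 0 {
		return time.Minute // matches the current hardcoded ticker
	}
	return c.PersistInterval
}
```

Run would then build its ticker from the resolved interval instead of the literal 1 * time.Minute.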
55-55: Prefer RFC3339Nano for higher precision (optional).
```diff
-	dt := time.Now().UTC().Format(time.RFC3339)
+	dt := time.Now().UTC().Format(time.RFC3339Nano)
```
79-87: Include device_id in parse errors and avoid per-item time.Now().Refactor to a simple loop; log the offending device; compute now once.
- timestamps := maps.MapValues(items, func(v string) time.Time { - t, err := time.Parse(time.RFC3339, v) - if err != nil { - s.logger.Error("Can't parse last seen", zap.String("last_seen", v), zap.Error(err)) - return time.Now().UTC() - } - - return t - }) + now := time.Now().UTC() + timestamps := make(map[string]time.Time, len(items)) + for id, v := range items { + t, err := time.Parse(time.RFC3339, v) + if err != nil { + s.logger.Warn("Invalid last_seen in cache; using fallback 'now'", + zap.String("device_id", id), zap.String("last_seen", v), zap.Error(err)) + timestamps[id] = now + continue + } + timestamps[id] = t + }Additional change required outside this hunk:
// remove unused helper import // - "github.com/capcom6/go-helpers/maps"
19-25: *Depend on a narrow interface, not devices.Service.For testability and decoupling, inject an interface with just SetLastSeen(ctx, map[string]time.Time) error.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
internal/sms-gateway/online/service.go(1 hunks)pkg/cache/memory.go(1 hunks)pkg/cache/redis.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/cache/memory.go
- pkg/cache/redis.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/sms-gateway/online/service.go (2)
internal/sms-gateway/modules/devices/service.go (1)
Service(30-39)internal/sms-gateway/cache/factory.go (1)
Cache(15-15)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: E2E
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
🔇 Additional comments (2)
internal/sms-gateway/online/service.go (2)
14-17: Service API shape looks good. Clear, minimal surface: Run + SetOnline. No changes requested.
67-71: Confirm Drain() is atomic across backends.
- Redis: Drain calls r.client.Eval(ctx, hgetallAndDeleteScript, []string{r.key}) — EVAL is atomic on the Redis server, so Redis backend is atomic.
- Memory: pkg/cache/memory.go: Drain(...) — verify it acquires the cache mutex (or performs an atomic map-swap) while copying/clearing entries; if it doesn't, concurrent Set() during Drain can be lost or duplicated.
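For reference, a drain that swaps the map under the write lock avoids that race — a sketch against a simplified memoryCache, so the field and type names (items, mux, memoryItem.value, memoryItem.expiresAt) are assumptions rather than the package's actual internals:

```go
// Sketch: swap the live map for a fresh one while holding the write lock,
// then build the snapshot from the detached map without blocking writers.
func (m *memoryCache) Drain(_ context.Context) (map[string]string, error) {
	m.mux.Lock()
	detached := m.items
	m.items = make(map[string]*memoryItem, len(detached))
	m.mux.Unlock()

	out := make(map[string]string, len(detached))
	now := time.Now()
	for k, it := range detached {
		if !it.expiresAt.IsZero() && it.expiresAt.Before(now) {
			continue // drop items that expired before the drain
		}
		out[k] = it.value
	}
	return out, nil
}
```

Any Set that races with the swap lands in the fresh map and is picked up by the next drain, so nothing is lost or duplicated.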
d5f3c2c to e4dce05
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/cache/memory_bench_test.go (1)
1-461: Fix ignored returns — remove the file-level //nolint:errcheck and handle or narrowly suppress per call. rg found 31 cache.* callsites in pkg/cache/memory_bench_test.go that ignore return values; the file-level //nolint:errcheck is still present in:
- pkg/cache/memory_bench_test.go (lines: 25,38,43,61,62,77,95,96,114,115,129,135,149,155,168,185,217,250,257,287,296,362,363,383,389,406,415,428,433,448,457)
- pkg/cache/memory_concurrency_test.go:1
- pkg/cache/memory_profile_test.go:1
Actions:
- Remove the top-level //nolint:errcheck and either handle errors at each callsite, assert them in tests/benchmarks, or apply per-call //nolint:errcheck with a one-line justification.
- Re-run the grep to verify no remaining ignored returns.
♻️ Duplicate comments (5)
pkg/cache/redis.go (3)
95-107: Get(): Redis-backed cache returns ErrKeyNotFound for expired fields — violates Cache.Get contract
- Evidence: cache.go documents Get must return ErrKeyExpired on expiry; memory implementation returns ErrKeyExpired (pkg/cache/memory.go) and tests assert ErrKeyExpired (pkg/cache/memory_edge_test.go). pkg/cache/redis.go's Get returns ErrKeyNotFound when HGet yields redis.Nil (Redis removes expired fields), so expired vs missing are indistinguishable.
- Required action (choose one and apply consistently): (A) update Cache contract/docs/tests to treat expired as not-found for Redis-backed caches, or (B) change Redis implementation to track expiry metadata (parallel hash/sorted-set or store expiry timestamps) and return ErrKeyExpired.
145-168: Bug: HExpireAt missing field in SetOrFail → runtime “wrong number of arguments”. Pass the field name; also clean up the inserted value if TTL setting fails to avoid immortal entries.
if !val { return ErrKeyExists } @@ if !options.validUntil.IsZero() { - if err := r.client.HExpireAt(ctx, r.key, options.validUntil).Err(); err != nil { - return fmt.Errorf("can't set cache item ttl: %w", err) - } + if err := r.client.HExpireAt(ctx, r.key, options.validUntil, key).Err(); err != nil { + // Avoid leaving a non-expiring entry if TTL application fails. + _ = r.client.HDel(ctx, r.key, key).Err() + return fmt.Errorf("can't set cache item ttl: %w", err) + } }
9-10: Fix HExpireAt usage and verify Redis ≥7.4 in CI
- go.mod shows github.com/redis/go-redis/v9 v9.9.0 — OK (>= v9.6).
- pkg/cache/redis.go:136 — currently
r.client.HExpireAt(ctx, r.key, options.validUntil, key); args are out of order. Change tor.client.HExpireAt(ctx, r.key, key, options.validUntil).- pkg/cache/redis.go:162 — currently
r.client.HExpireAt(ctx, r.key, options.validUntil); missing the field argument. Change tor.client.HExpireAt(ctx, r.key, key, options.validUntil)(or the correct field name).- REDIS_URL not set locally; server version couldn't be checked — ensure CI/environments run Redis 7.4+.
pkg/cache/memory_bench_test.go (2)
4-13: Import errors and treat ErrKeyExists as expected in SetOrFail. Use errors.Is or a direct compare against cache.ErrKeyExists; fail on any other error.
import ( "context" + "errors" "fmt" "math/rand" "strconv" "testing" "time" @@ b.RunParallel(func(pb *testing.PB) { - for pb.Next() { - cache.SetOrFail(ctx, key, value) - } + for pb.Next() { + if err := cache.SetOrFail(ctx, key, value); err != nil && !errors.Is(err, cache.ErrKeyExists) { + b.Fatal(err) + } + } })Also applies to: 74-80
1-1: Stop suppressing errcheck; handle errors in benchmarks.Dropping errors risks benchmarking failure paths and hiding regressions. Remove
//nolint:errcheckand check/handle returned errors. Apply the pattern below and replicate across the file.-//nolint:errcheck package cache_test @@ func BenchmarkMemoryCache_Set(b *testing.B) { - cache := cache.NewMemory(0) + cache := cache.NewMemory(0) ctx := context.Background() key := "benchmark-key" value := "benchmark-value" @@ b.RunParallel(func(pb *testing.PB) { for pb.Next() { - cache.Set(ctx, key, value) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } } }) } @@ -// Pre-populate the cache -cache.Set(ctx, key, value) +// Pre-populate the cache +if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) +} @@ b.RunParallel(func(pb *testing.PB) { for pb.Next() { - cache.Get(ctx, key) + if _, err := cache.Get(ctx, key); err != nil { + b.Fatal(err) + } } }) } @@ b.RunParallel(func(pb *testing.PB) { i := 0 for pb.Next() { key := "key-" + strconv.Itoa(i) value := "value-" + strconv.Itoa(i) i++ - cache.Set(ctx, key, value) - cache.Get(ctx, key) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } + if _, err := cache.Get(ctx, key); err != nil { + b.Fatal(err) + } } }) } @@ b.RunParallel(func(pb *testing.PB) { - for pb.Next() { - cache.SetOrFail(ctx, key, value) - } + i := 0 + for pb.Next() { + // Use distinct keys OR ignore expected ErrKeyExists. + testKey := key + "-" + strconv.Itoa(i) + if err := cache.SetOrFail(ctx, testKey, value); err != nil && err != cache.ErrKeyExists { + b.Fatal(err) + } + i++ + } }) } @@ b.RunParallel(func(pb *testing.PB) { i := 0 for pb.Next() { key := "key-" + strconv.Itoa(i) value := "value-" + strconv.Itoa(i) i++ - cache.Set(ctx, key, value) - cache.GetAndDelete(ctx, key) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } + if got, err := cache.GetAndDelete(ctx, key); err != nil { + b.Fatal(err) + } else if got != value { + b.Fatalf("GetAndDelete: got %q, want %q", got, value) + } } }) } @@ b.RunParallel(func(pb *testing.PB) { i := 0 for pb.Next() { key := "key-" + strconv.Itoa(i) value := "value-" + strconv.Itoa(i) i++ - cache.Set(ctx, key, value) - cache.Delete(ctx, key) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } + if err := cache.Delete(ctx, key); err != nil { + b.Fatal(err) + } } }) } @@ for i := 0; i < 1000; i++ { key := "item-" + strconv.Itoa(i) value := "value-" + strconv.Itoa(i) - cache.Set(ctx, key, value) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } } @@ b.RunParallel(func(pb *testing.PB) { for pb.Next() { - cache.Cleanup(ctx) + if err := cache.Cleanup(ctx); err != nil { + b.Fatal(err) + } } }) } @@ for i := 0; i < 1000; i++ { key := "item-" + strconv.Itoa(i) value := "value-" + strconv.Itoa(i) - cache.Set(ctx, key, value) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } } @@ b.RunParallel(func(pb *testing.PB) { for pb.Next() { - cache.Drain(ctx) + if _, err := cache.Drain(ctx); err != nil { + b.Fatal(err) + } } }) } @@ -// Pre-populate the cache -cache.Set(ctx, key, value) +// Pre-populate the cache +if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) +} @@ - cache.Get(ctx, key) + if _, err := cache.Get(ctx, key); err != nil { + b.Fatal(err) + } @@ - cache.Set(ctx, key, value) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } @@ - if bm.withTTL { - c.Set(ctx, key, value, cache.WithTTL(ttl)) - } else { - c.Set(ctx, key, value) - } + if bm.withTTL { + if err := c.Set(ctx, key, value, cache.WithTTL(ttl)); err != nil { + b.Fatal(err) + } + } else { + if err := c.Set(ctx, 
key, value); err != nil { + b.Fatal(err) + } + } @@ - cache.Set(ctx, key, string(value)) - cache.Get(ctx, key) + if err := cache.Set(ctx, key, string(value)); err != nil { + b.Fatal(err) + } + if _, err := cache.Get(ctx, key); err != nil { + b.Fatal(err) + } @@ - cache.Drain(ctx) + if _, err := cache.Drain(ctx); err != nil { + b.Fatal(err) + } @@ - cache.Set(ctx, key, value) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } @@ - cache.Set(ctx, key, value) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } @@ - cache.Get(ctx, key) + if _, err := cache.Get(ctx, key); err != nil { + b.Fatal(err) + } @@ -// Pre-populate the hot key -cache.Set(ctx, hotKey, value) +// Pre-populate the hot key +if err := cache.Set(ctx, hotKey, value); err != nil { + b.Fatal(err) +} @@ - cache.Get(ctx, hotKey) + if _, err := cache.Get(ctx, hotKey); err != nil { + b.Fatal(err) + } @@ - cache.Set(ctx, key, value) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + } @@ - cache.Get(ctx, key) + if _, err := cache.Get(ctx, key); err != nil { + b.Fatal(err) + }Also applies to: 22-27, 37-46, 61-63, 74-80, 95-97, 114-116, 125-131, 134-136, 145-151, 154-156, 168-169, 183-186, 213-218, 247-258, 287-288, 296-297, 323-330, 362-364, 383-390, 403-407, 414-416, 428-429, 433-434, 446-449, 456-458
🧹 Nitpick comments (15)
internal/sms-gateway/modules/devices/repository.go (3)
71-71: LGTM on parameterized UPDATE for push_token. Consider nullable clear semantics.Good fix switching to
Where("id = ?", id). If you ever need to clear the token, consider supporting NULL (column is*string) via a companionClearPushToken(id)or by accepting*stringand passingnilwhen clearing.
74-87: Monotonic last_seen update looks correct. Normalize time to avoid precision churn.To align with DB precision (models default
CURRENT_TIMESTAMP(3)), normalize to UTC and truncate to milliseconds before the update to reduce redundant no-op comparisons due to sub-ms/nanos.Apply:
func (r *repository) SetLastSeen(ctx context.Context, id string, lastSeen time.Time) error { if lastSeen.IsZero() { return nil // ignore zero timestamps } + // Normalize to DB precision (ms) and UTC to avoid precision mismatches. + lastSeen = lastSeen.UTC().Truncate(time.Millisecond) res := r.db.WithContext(ctx). Model(&models.Device{}). Where("id = ? AND last_seen < ?", id, lastSeen). UpdateColumn("last_seen", lastSeen) if res.Error != nil { return res.Error } // RowsAffected==0 => not found or stale timestamp; treat as no-op. return nil }
102-103: Cleanup by last_seen: add an index to keep this fast at scale.This path will scan without an index on
last_seen. Consider adding a DB index (e.g., GORM tagindex:idx_devices_last_seen) tomodels.Device.LastSeento keep periodic cleanup O(log n).(Outside this diff) Update models:
LastSeen time.Time `gorm:"not null;autocreatetime:false;default:CURRENT_TIMESTAMP(3);index:idx_devices_last_seen"`internal/sms-gateway/modules/devices/service.go (2)
104-123: Batch SetLastSeen: skip zero timestamps; consider bounded parallelism for large batches.Minor: avoid pointless DB calls for zero times; Optional: use a worker pool (errgroup with limited concurrency) if batches can be large.
Apply the zero-time guard:
for deviceID, lastSeen := range batch { if err := ctx.Err(); err != nil { return errors.Join(err, multiErr) } + if lastSeen.IsZero() { + continue + } if err := s.devices.SetLastSeen(ctx, deviceID, lastSeen); err != nil { multiErr = errors.Join(multiErr, fmt.Errorf("device %s: %w", deviceID, err)) s.logger.Error("can't set last seen", zap.String("device_id", deviceID), zap.Time("last_seen", lastSeen), zap.Error(err), ) } }If desired later, I can sketch a 8–16 worker pool variant.
136-144: Token cache invalidation: OK. Consider down-level logging for not-found.If
Deletereturns a not-found sentinel, consider downgrading that case to debug to avoid noisy logs.internal/sms-gateway/modules/auth/service.go (2)
61-63: Name the component logger.Use a named logger for clearer attribution in logs.
- logger: params.Logger, + logger: params.Logger.Named("auth"),
146-150: Async SetOnline: make timeout configurable and guard against missing DI.
- Replace hardcoded 5s with a constant/config.
- Guard s.onlineSvc before launching the goroutine and log if it's not configured.
Note: device.LastSeen is set immediately after the goroutine (device.LastSeen = time.Now()), and SetOnline is implemented in internal/sms-gateway/online/service.go — changing the timeout / adding a nil-check is safe.
- go func(id string) { - ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) - defer cancel() - s.onlineSvc.SetOnline(ctx, id) - }(device.ID) + if s.onlineSvc != nil { + go func(id string) { + ctx, cancel := context.WithTimeout(context.Background(), onlineSetTimeout) + defer cancel() + s.onlineSvc.SetOnline(ctx, id) + }(device.ID) + } else { + s.logger.Debug("online service not configured; skipping SetOnline") + }Add (package scope):
const onlineSetTimeout = 2 * time.Secondpkg/cache/redis.go (4)
45-57: Document Redis 7.4+ requirement (HEXPIRE) and behavior upfront.Add a brief doc comment so callers know this depends on Redis 7.4+ and go-redis v9.6+ and that per-field TTL is used.
-func NewRedis(client *redis.Client, prefix string, ttl time.Duration) Cache { +// NewRedis creates a Redis-backed cache using a single hash with per-field TTL (HEXPIRE/HEXPIREAT). +// Requires Redis 7.4+ and go-redis v9.6+. +func NewRedis(client *redis.Client, prefix string, ttl time.Duration) Cache {
59-62: Clarify why Cleanup is a no-op.Make the no-op explicit since Redis evicts expired fields itself.
-// Cleanup implements Cache. +// Cleanup implements Cache. +// Redis automatically evicts expired hash fields; nothing to do here. func (r *redisCache) Cleanup(_ context.Context) error {
26-34: Drain Lua loops HDEL per field — switch to UNLINK (or DEL) to avoid O(n) work.Atomicity is good, but looping HDEL can block Redis on large hashes. UNLINK frees asynchronously; fall back to DEL if unavailable.
- hgetallAndDeleteScript = ` -local items = redis.call('HGETALL', KEYS[1]) -if #items > 0 then - for i = 1, #items, 2 do - redis.call('HDEL', KEYS[1], items[i]) - end -end -return items -` + hgetallAndDeleteScript = ` +local items = redis.call('HGETALL', KEYS[1]) +if #items > 0 then + local ok = pcall(redis.call, 'UNLINK', KEYS[1]) + if not ok then redis.call('DEL', KEYS[1]) end +end +return items +`
123-142: Set: make HSET + HEXPIREAT a single round-trip.Pipeline to reduce race surface and network chatter. If TTL setting fails today, value may remain immortal.
- if err := r.client.HSet(ctx, r.key, key, value).Err(); err != nil { - return fmt.Errorf("can't set cache item: %w", err) - } - - if !options.validUntil.isZero() { - if err := r.client.HExpireAt(ctx, r.key, options.validUntil, key).Err(); err != nil { - return fmt.Errorf("can't set cache item ttl: %w", err) - } - } + _, err := r.client.Pipelined(ctx, func(p redis.Pipeliner) error { + p.HSet(ctx, r.key, key, value) + if !options.validUntil.IsZero() { + p.HExpireAt(ctx, r.key, options.validUntil, key) + } + return nil + }) + if err != nil { + return fmt.Errorf("can't set cache item: %w", err) + }pkg/cache/memory_bench_test.go (4)
239-247: MixedWorkload: pre-populate working set to avoid measuring miss/path mostly.Reads target keys 0..99 but nothing seeds them; you’re benchmarking ErrKeyNotFound. Seed before ResetTimer.
for _, bm := range benchmarks { b.Run(bm.name, func(b *testing.B) { + // Seed working set for read hits + for i := 0; i < 100; i++ { + if err := cache.Set(ctx, fmt.Sprintf("key-%d", i), "seed"); err != nil { + b.Fatal(err) + } + } b.ResetTimer() b.RunParallel(func(pb *testing.PB) {
354-357: LargeValues: avoid per-iteration []byte→string conversion overhead.Convert once to focus the benchmark on cache behavior, not conversion cost.
value := make([]byte, size.size) for i := range value { value[i] = byte(i % 256) } + valueStr := string(value) @@ b.RunParallel(func(pb *testing.PB) { for pb.Next() { - cache.Set(ctx, key, string(value)) - cache.Get(ctx, key) + if err := cache.Set(ctx, key, valueStr); err != nil { + b.Fatal(err) + } + if _, err := cache.Get(ctx, key); err != nil { + b.Fatal(err) + } } })Also applies to: 360-364
372-374: MemoryGrowth: check Drain/Set errors and consider reporting allocs.Add error checks and call b.ReportAllocs to make the benchmark more informative.
func BenchmarkMemoryCache_MemoryGrowth(b *testing.B) { cache := cache.NewMemory(0) ctx := context.Background() + b.ReportAllocs() @@ // Clear cache - cache.Drain(ctx) + if _, err := cache.Drain(ctx); err != nil { + b.Fatal(err) + } @@ key := "key-" + strconv.Itoa(j) value := "value-" + strconv.Itoa(j) - cache.Set(ctx, key, value) + if err := cache.Set(ctx, key, value); err != nil { + b.Fatal(err) + }Also applies to: 381-391
162-169: Nit: avoid shadowing the imported package name with localcachevariables.Prefer
cormemto reduce confusion when also referencing package identifiers (e.g., cache.WithTTL).Also applies to: 398-407, 420-429, 445-449
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- .github/workflows/docker-publish.yml (0 hunks)
- .github/workflows/go.yml (1 hunks)
- .github/workflows/golangci-lint.yml (0 hunks)
- configs/config.example.yml (1 hunks)
- internal/sms-gateway/app.go (4 hunks)
- internal/sms-gateway/cache/factory.go (1 hunks)
- internal/sms-gateway/modules/auth/service.go (5 hunks)
- internal/sms-gateway/modules/devices/repository.go (2 hunks)
- internal/sms-gateway/modules/devices/service.go (3 hunks)
- internal/sms-gateway/online/module.go (1 hunks)
- internal/sms-gateway/online/service.go (1 hunks)
- pkg/cache/memory.go (1 hunks)
- pkg/cache/memory_bench_test.go (1 hunks)
- pkg/cache/memory_concurrency_test.go (1 hunks)
- pkg/cache/memory_edge_test.go (1 hunks)
- pkg/cache/memory_profile_test.go (1 hunks)
- pkg/cache/redis.go (1 hunks)
💤 Files with no reviewable changes (2)
- .github/workflows/docker-publish.yml
- .github/workflows/golangci-lint.yml
✅ Files skipped from review due to trivial changes (1)
- configs/config.example.yml
🚧 Files skipped from review as they are similar to previous changes (9)
- pkg/cache/memory_concurrency_test.go
- internal/sms-gateway/online/module.go
- internal/sms-gateway/online/service.go
- internal/sms-gateway/app.go
- .github/workflows/go.yml
- pkg/cache/memory_profile_test.go
- internal/sms-gateway/cache/factory.go
- pkg/cache/memory_edge_test.go
- pkg/cache/memory.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
- pkg/cache/memory_bench_test.go
- pkg/cache/redis.go
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.
Applied to files:
pkg/cache/redis.go
📚 Learning: 2025-09-10T23:54:16.828Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:118-121
Timestamp: 2025-09-10T23:54:16.828Z
Learning: Redis 7.4.0 introduced HEXPIRE, HEXPIREAT, and related commands that support per-field TTL on hash data structures. go-redis v9.6.0+ includes support for these commands with methods like HExpireAt.
Applied to files:
pkg/cache/redis.go
🧬 Code graph analysis (5)
internal/sms-gateway/modules/auth/service.go (2)
- internal/sms-gateway/modules/devices/service.go (1): Service (30-39)
- internal/sms-gateway/online/service.go (1): Service (14-17)
pkg/cache/memory_bench_test.go (2)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (22-30)
internal/sms-gateway/modules/devices/service.go (3)
- internal/sms-gateway/modules/auth/service.go (1): Service (39-52)
- internal/sms-gateway/online/service.go (1): Service (14-17)
- internal/sms-gateway/modules/webhooks/service.go (1): Service (28-37)
pkg/cache/redis.go (3)
- pkg/cache/cache.go (1): Cache (5-36)
- pkg/cache/errors.go (2): ErrKeyNotFound (7-7), ErrKeyExists (11-11)
- pkg/cache/options.go (1): Option (6-6)
internal/sms-gateway/modules/devices/repository.go (1)
- internal/sms-gateway/models/models.go (1): Device (25-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
🔇 Additional comments (7)
internal/sms-gateway/modules/devices/service.go (1)
7-7: errors.Join usage OK — toolchain ≥ Go 1.20 verified.
go.mod declares go 1.24.1 and Dockerfiles use golang:1.24-alpine, so errors.Join in internal/sms-gateway/modules/devices/service.go (lines 112, 115) is supported — no change needed.
internal/sms-gateway/modules/auth/service.go (3)
34-34: Params addition looks good. Interface-based DI for `OnlineSvc` is appropriate.
47-47: Service field addition is fine. Keeps dependency local and testable.
14-14: DI: online.Module is wired in the app; online.Service is provided — no app startup panic expected. internal/sms-gateway/app.go registers online.Module(); internal/sms-gateway/online/module.go fx.Provide(New)/fx.Invoke bind the Service (interface in internal/sms-gateway/online/service.go); auth module depends on online.Service (internal/sms-gateway/modules/auth/service.go). No _test.go references to online.Module/online.Service found — add online.Module to any tests that construct auth.Module in isolation.
pkg/cache/redis.go (3)
64-71: Delete path matches interface semantics.HDel is idempotent and returns nil if missing; good.
73-93: Atomic drain via Lua looks correct.Combined with the UNLINK/DEL script above, this should be both atomic and efficient.
109-121: Atomic GetAndDelete via Lua looks good.Correctly returns ErrKeyNotFound when absent.
Actionable comments posted: 1
🧹 Nitpick comments (3)
internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql (2)
1-4: Make the Up migration idempotent and less blocking. As written, `CREATE INDEX` will fail if the index already exists and doesn't enforce `LOCK=NONE`. Prefer conditional creation and `ALTER TABLE ... ALGORITHM=INPLACE, LOCK=NONE` for MySQL 8 to minimize lock time. Apply:
--- a/internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql +++ b/internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql -- +goose Up -- +goose StatementBegin -CREATE INDEX `idx_devices_last_seen` ON `devices`(`last_seen`); +SET @exists := ( + SELECT COUNT(1) + FROM INFORMATION_SCHEMA.STATISTICS + WHERE TABLE_SCHEMA = DATABASE() + AND TABLE_NAME = 'devices' + AND INDEX_NAME = 'idx_devices_last_seen' +); +SET @sql := IF(@exists = 0, + 'ALTER TABLE `devices` ADD INDEX `idx_devices_last_seen` (`last_seen`), ALGORITHM=INPLACE, LOCK=NONE', + 'SELECT 1'); +PREPARE stmt FROM @sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- +goose StatementEnd
6-9: Guard the Down migration.Dropping an index that’s already gone will error. Make it conditional.
Apply:
--- a/internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql +++ b/internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql -- +goose Down -- +goose StatementBegin -DROP INDEX `idx_devices_last_seen` ON `devices`; +SET @exists := ( + SELECT COUNT(1) + FROM INFORMATION_SCHEMA.STATISTICS + WHERE TABLE_SCHEMA = DATABASE() + AND TABLE_NAME = 'devices' + AND INDEX_NAME = 'idx_devices_last_seen' +); +SET @sql := IF(@exists = 1, + 'DROP INDEX `idx_devices_last_seen` ON `devices`', + 'SELECT 1'); +PREPARE stmt FROM @sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- +goose StatementEndinternal/sms-gateway/models/models.go (1)
31-31: Add composite index (user_id, last_seen) for per-user last-seen queries. modules/devices/repository_filter.go applies both `user_id` and `last_seen` filters — add a migration creating an index on (user_id, last_seen); keep the existing single-column `last_seen` index for global scans/operations (e.g. removeUnused; SetLastSeen uses id+last_seen).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql (1 hunks)
- internal/sms-gateway/models/models.go (1 hunks)
- pkg/cache/memory_bench_test.go (1 hunks)
- pkg/cache/redis.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/cache/memory_bench_test.go
- pkg/cache/redis.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build / Docker image (linux/arm64)
- GitHub Check: Build / Docker image (linux/amd64)
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
Actionable comments posted: 1
♻️ Duplicate comments (4)
internal/sms-gateway/online/service.go (4)
56-61: Validate deviceID before writing to cache.Guard against empty/whitespace IDs to avoid empty-key writes and noisy logs.
func (s *service) SetOnline(ctx context.Context, deviceID string) { + if strings.TrimSpace(deviceID) == "" { + s.logger.Warn("Empty device_id; skipping SetOnline") + s.metrics.IncrementCacheOperation(operationSet, statusError) + s.metrics.IncrementStatusSet(false) + return + } dt := time.Now().UTC().Format(time.RFC3339)Add import:
import ( "context" "fmt" "time" + "strings"
34-35: Name the logger for this module.Scopes logs consistently to "online".
- logger: logger, + logger: logger.Named("online"),
22-23: Confirm cache TTL and Drain atomicity.Ensure the "online" cache uses a sensible TTL (e.g., 10–30m) and Drain is atomic w.r.t. concurrent Set to avoid dropping in-flight updates.
#!/bin/bash # Inspect Cache interface and implementations for Drain semantics and default TTL. rg -nP -C3 --type=go 'type\s+Cache\b[\s\S]*?Drain\(' rg -nP -C3 --type=go 'New\w*Cache|NewMemory|NewRedis|WithTTL\(' rg -nP -C3 --type=go 'factory\.New\("online"\)|Named\("online"\)'Also applies to: 56-76, 79-126
108-115: Avoid data loss: re-queue drained items if persistence fails.On SetLastSeen error, drained entries are dropped. Reinsert best-effort to the cache.
- if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil { - persistErr = fmt.Errorf("can't set last seen: %w", err) - s.metrics.IncrementPersistenceError(operationPersist) - return - } + if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil { + persistErr = fmt.Errorf("can't set last seen: %w", err) + s.metrics.IncrementPersistenceError(operationPersist) + // Best-effort requeue drained items + for id, v := range items { + if setErr := s.cache.Set(ctx, id, v); setErr != nil { + s.logger.Warn("Failed to requeue last_seen", zap.String("device_id", id), zap.Error(setErr)) + } + } + return + }
🧹 Nitpick comments (6)
internal/sms-gateway/online/metrics.go (2)
31-33: Use plain Histogram instead of HistogramVec with zero labels.Slightly simpler and avoids WithLabelValues() calls; behavior is identical here.
type metrics struct { statusSetCounter *prometheus.CounterVec cacheOperations *prometheus.CounterVec - cacheLatency *prometheus.HistogramVec - persistenceLatency *prometheus.HistogramVec + cacheLatency prometheus.Histogram + persistenceLatency prometheus.Histogram persistenceErrors *prometheus.CounterVec } @@ - cacheLatency: promauto.NewHistogramVec(prometheus.HistogramOpts{ + cacheLatency: promauto.NewHistogram(prometheus.HistogramOpts{ Namespace: "sms", Subsystem: "online", Name: metricCacheLatency, Help: "Cache operation latency in seconds", Buckets: []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1}, - }, []string{}), + }), @@ - persistenceLatency: promauto.NewHistogramVec(prometheus.HistogramOpts{ + persistenceLatency: promauto.NewHistogram(prometheus.HistogramOpts{ Namespace: "sms", Subsystem: "online", Name: metricPersistenceLatency, Help: "Persistence operation latency in seconds", Buckets: []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1}, - }, []string{}), + }), @@ func (m *metrics) ObserveCacheLatency(f func()) { - timer := prometheus.NewTimer(m.cacheLatency.WithLabelValues()) + timer := prometheus.NewTimer(m.cacheLatency) f() timer.ObserveDuration() } @@ func (m *metrics) ObservePersistenceLatency(f func()) { - timer := prometheus.NewTimer(m.persistenceLatency.WithLabelValues()) + timer := prometheus.NewTimer(m.persistenceLatency) f() timer.ObserveDuration() }Also applies to: 53-60, 61-68, 92-97, 99-104
36-76: Ensure metrics are single-registered or accept a prometheus.Registerer
- Verified: newMetrics is provided via fx.Provide in these modules: internal/sms-gateway/online/module.go, internal/sms-gateway/modules/push/module.go, internal/sms-gateway/modules/events/module.go, internal/sms-gateway/modules/sse/module.go. Each package registers metrics with promauto.New using distinct Namespace+Subsystem combinations — no immediate duplicate metric-name registration found.
- Action: to avoid panics in tests or future duplicate construction, accept a prometheus.Registerer (use promauto.With(reg)) or ensure newMetrics is only constructed once (singleton wiring).
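As an illustration of the Registerer option, a minimal sketch — the constructor name, Help text, and the omitted fields are assumptions, not the module's actual code:

```go
// Sketch: build the collectors against an injected Registerer instead of the
// default global registry, so tests can pass prometheus.NewRegistry().
func newMetricsWith(reg prometheus.Registerer) *metrics {
	f := promauto.With(reg)
	return &metrics{
		statusSetCounter: f.NewCounterVec(prometheus.CounterOpts{
			Namespace: "sms",
			Subsystem: "online",
			Name:      "status_set_total",
			Help:      "Total status set operations",
		}, []string{"status"}),
		// ...remaining collectors created the same way via f.New*...
	}
}
```

In tests, `newMetricsWith(prometheus.NewRegistry())` then avoids duplicate-registration panics without touching production wiring.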
deployments/prometheus/alerts/online-alerts.yml (1)
10-12: Replace placeholder runbook URLs. example.com placeholders reduce SRE usefulness during pages.
I can draft concise runbooks if you share the intended remediation steps.
Also applies to: 20-22, 30-32, 40-42
internal/sms-gateway/online/service.go (3)
97-101: Demote per-item parse errors to warn (or sample) to avoid log spam.Parsing issues on individual timestamps can flood logs; warn once per item is fine, but consider sampling or counting.
- s.logger.Error("Can't parse last seen", zap.String("last_seen", v), zap.Error(err)) + s.logger.Warn("Can't parse last seen", zap.String("last_seen", v), zap.Error(err))
40-48: Optional: add jitter to the tick to avoid herd effects across replicas.Small per-instance jitter (e.g., ±10–20s) prevents synchronized drains.
I can propose a jittered ticker helper if desired.
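For example, a small helper along these lines (a sketch; the name and bounds are illustrative, and it needs math/rand and time):

```go
// jitteredInterval returns base plus a random offset in [0, maxJitter), so
// replicas started at the same moment do not drain in lockstep.
func jitteredInterval(base, maxJitter time.Duration) time.Duration {
	if maxJitter <= 0 {
		return base
	}
	return base + time.Duration(rand.Int63n(int64(maxJitter)))
}

// Usage in Run (illustrative):
//	ticker := time.NewTicker(jitteredInterval(time.Minute, 15*time.Second))
//	defer ticker.Stop()
```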
20-26: Depend on an interface for devices service to improve testability.Field is a concrete *devices.Service; prefer a narrow interface defined here.
Example (a sketch follows after this list):
- Define: type lastSeenWriter interface { SetLastSeen(context.Context, map[string]time.Time) error }
- Store lastSeenWriter in service; adapt New to accept that.
Also applies to: 28-37
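A minimal sketch of that narrowing, assuming the constructor otherwise keeps its current shape; the interface name is illustrative:

```go
// lastSeenWriter is the narrow dependency the online service actually needs;
// *devices.Service satisfies it implicitly.
type lastSeenWriter interface {
	SetLastSeen(ctx context.Context, batch map[string]time.Time) error
}

type service struct {
	devicesSvc lastSeenWriter
	cache      cache.Cache
	logger     *zap.Logger
	metrics    *metrics
}

func New(devicesSvc lastSeenWriter, cache cache.Cache, logger *zap.Logger, metrics *metrics) Service {
	return &service{
		devicesSvc: devicesSvc,
		cache:      cache,
		logger:     logger.Named("online"),
		metrics:    metrics,
	}
}
```

With fx, the wiring would also need to bind `*devices.Service` to `lastSeenWriter` (for example via a small adapter provider), while tests can pass a stub directly.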
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- deployments/grafana/dashboards/online.json (1 hunks)
- deployments/prometheus/alerts/online-alerts.yml (1 hunks)
- internal/sms-gateway/online/metrics.go (1 hunks)
- internal/sms-gateway/online/module.go (1 hunks)
- internal/sms-gateway/online/service.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- deployments/grafana/dashboards/online.json
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/sms-gateway/online/module.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/sms-gateway/online/service.go (3)
- internal/sms-gateway/modules/devices/service.go (1): Service (30-39)
- internal/sms-gateway/modules/auth/service.go (2): Service (39-52), New (54-68)
- internal/sms-gateway/cache/factory.go (1): Cache (15-15)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: E2E
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
Actionable comments posted: 0
♻️ Duplicate comments (1)
deployments/prometheus/alerts/online-alerts.yml (1)
1-38: Nice fix: counters use increase/rate; latency uses histogram_quantile over _bucket.This addresses the prior alerting pitfalls and looks correct.
🧹 Nitpick comments (1)
deployments/prometheus/alerts/online-alerts.yml (1)
13-21: Optional: surface `operation` as a static alert label for routing. Prometheus will carry the `operation` label from the expr, but adding it to `labels` can simplify Alertmanager routing/grouping.

```diff
   labels:
     severity: warning
+    operation: '{{ $labels.operation }}'
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- deployments/prometheus/alerts/online-alerts.yml (1 hunks)
- internal/sms-gateway/online/metrics.go (1 hunks)
- internal/sms-gateway/online/service.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/sms-gateway/online/metrics.go
- internal/sms-gateway/online/service.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Benchmark
- GitHub Check: E2E
- GitHub Check: Analyze (go)
🔇 Additional comments (1)
deployments/prometheus/alerts/online-alerts.yml (1)
31-38: Verify histogram label dimensions for HighPersistenceLatency alert. rg returned no matches for sms_online_persistence_latency_seconds_bucket; confirm the metric name and its label set and either scope the aggregation by those labels or create per-dimension alerts. Examples:
- Per-label: histogram_quantile(0.95, sum by (le, operation) (rate(sms_online_persistence_latency_seconds_bucket[5m]))) > 0.5
- Global (only if no extra labels): histogram_quantile(0.95, sum by (le) (rate(sms_online_persistence_latency_seconds_bucket[5m]))) > 0.5
Verify by searching the repo without a type filter: rg -n -C3 'sms_online_persistence_latency_seconds(_bucket)?' or inspect the instrumentation that registers the metric.
File: deployments/prometheus/alerts/online-alerts.yml:31-38
Actionable comments posted: 1
♻️ Duplicate comments (5)
internal/sms-gateway/online/service.go (5)
28-36: Scope the logger with a module name.Name the logger to keep logs consistently scoped to this service.
func New(devicesSvc *devices.Service, cache cache.Cache, logger *zap.Logger, metrics *metrics) Service { return &service{ devicesSvc: devicesSvc, cache: cache, - logger: logger, + logger: logger.Named("online"), metrics: metrics, } }
3-12: Import strings for input validation.Needed for the deviceID guard.
import ( "context" "fmt" "time" + "strings" "github.com/android-sms-gateway/server/internal/sms-gateway/modules/devices" "github.com/android-sms-gateway/server/pkg/cache" "github.com/capcom6/go-helpers/maps" "go.uber.org/zap" )
56-61: Guard against empty/whitespace deviceID to avoid empty-key writes.Skip cache writes when deviceID is blank.
func (s *service) SetOnline(ctx context.Context, deviceID string) { + if strings.TrimSpace(deviceID) == "" { + s.logger.Warn("Empty device_id; skipping SetOnline") + return + } dt := time.Now().UTC().Format(time.RFC3339)
82-91: Confirm cache Drain semantics (atomicity and namespacing/TTL).Ensure Drain is atomic across workers and that the “online” cache has a non-zero TTL to expire entries if persisting stalls.
#!/bin/bash # Inspect cache interface/impl for Drain atomicity and TTL for "online" rg -nP -C3 --type=go 'type\s+Cache\b|interface\s*\{[^}]*Drain' pkg/cache internal/sms-gateway/cache || true rg -nP -C5 --type=go 'New\("online"\)|WithTTL|TTL:' internal/sms-gateway || true
110-114: Re-queue drained items when persistence fails to prevent data loss.If SetLastSeen fails, put items back into the cache.
- if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil { - persistErr = fmt.Errorf("can't set last seen: %w", err) - s.metrics.IncrementPersistenceError() - return - } + if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil { + persistErr = fmt.Errorf("can't set last seen: %w", err) + s.metrics.IncrementPersistenceError() + for id, v := range items { + if setErr := s.cache.Set(ctx, id, v); setErr != nil { + s.logger.Warn("Failed to requeue last_seen", zap.String("device_id", id), zap.Error(setErr)) + } + } + return + }
🧹 Nitpick comments (11)
deployments/grafana/dashboards/online.json (8)
491-501: Fix mismatch: title says 5m, query uses $__rate_interval. Use a fixed 5m window to match the title.
- "expr": "increase(sms_online_status_set_total{status=\"error\", instance=~\"$instance\"}[$__rate_interval])", + "expr": "increase(sms_online_status_set_total{status=\"error\", instance=~\"$instance\"}[5m])",
78-79: Use ops/s unit for rate panelsThese queries are rates; switch unit to ops/s.
- "unit": "ops" + "unit": "ops/s"
173-174: Use ops/s unit for rate panelsSame here; this is a per‑second rate.
- "unit": "ops" + "unit": "ops/s"
22-25: Avoid hard‑coded datasource UID; parameterize for portabilityHard‑coded UID will break across environments. Use a templated UID and remove per‑target overrides.
- "datasource": { - "type": "prometheus", - "uid": "edqp0a73uh2bka" - }, + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + },And for variables:
- "datasource": { - "type": "prometheus", - "uid": "edqp0a73uh2bka" - }, + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + },After applying, verify import works in your Grafana via Provisioning or UI import.
Also applies to: 117-120, 212-215, 311-314, 410-413, 504-507, 615-617
105-106: Remove per‑target datasource to prevent conflictsTargets inherit the panel datasource; the mixed string “Prometheus” can cause DS resolution issues on import.
- "datasource": "Prometheus",Also applies to: 200-201, 299-300, 398-399, 492-493, 586-587
18-18: Drop static dashboard id and reset version"id" is instance-specific and causes churn; version can start at 1 (Grafana will manage).
- "id": 19, ... - "version": 6 + "version": 1Also applies to: 660-661
414-466: Set unit for error counts (optional clarity)These panels show counts over a window. Adding a unit improves readability.
"thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": 0 }, { "color": "red", "value": 0 } ] - } + }, + "unit": "ops"
508-560: Set unit for error counts (optional clarity)Same for persistence errors.
"thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": 0 }, { "color": "red", "value": 0 } ] - } + }, + "unit": "ops"internal/sms-gateway/online/service.go (3)
98-106: Don’t stamp “now” on parse failures; skip or requeue invalid entries.Using current time can distort last-seen. Prefer to skip and requeue parse failures.
- timestamps := maps.MapValues(items, func(v string) time.Time { - t, err := time.Parse(time.RFC3339, v) - if err != nil { - s.logger.Warn("Can't parse last seen", zap.String("last_seen", v), zap.Error(err)) - return time.Now().UTC() - } - - return t - }) + timestamps := make(map[string]time.Time, len(items)) + for id, v := range items { + t, err := time.Parse(time.RFC3339, v) + if err != nil { + s.logger.Warn("Can't parse last seen; requeuing", zap.String("device_id", id), zap.String("last_seen", v), zap.Error(err)) + if setErr := s.cache.Set(ctx, id, v); setErr != nil { + s.logger.Warn("Failed to requeue unparsable last_seen", zap.String("device_id", id), zap.Error(setErr)) + } + continue + } + timestamps[id] = t + }
40-41: Make persist interval configurable and add small jitter to avoid herd effects.Helps multi-node deployments.
- ticker := time.NewTicker(1 * time.Minute) + interval := 1 * time.Minute // TODO: inject from config + jitter := time.Duration(rand.Int63n(int64(5 * time.Second))) // import math/rand + ticker := time.NewTicker(interval + jitter)
56-58: Consider RFC3339Nano for higher precision timestamps.Only if consumers can parse it; otherwise keep as-is.
- dt := time.Now().UTC().Format(time.RFC3339) + dt := time.Now().UTC().Format(time.RFC3339Nano)- t, err := time.Parse(time.RFC3339, v) + t, err := time.Parse(time.RFC3339Nano, v)Also applies to: 99-101
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
deployments/grafana/dashboards/online.json(1 hunks)internal/sms-gateway/online/metrics.go(1 hunks)internal/sms-gateway/online/service.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/sms-gateway/online/metrics.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/sms-gateway/online/service.go (3)
internal/sms-gateway/modules/devices/service.go (1)
Service(30-39)internal/sms-gateway/modules/auth/service.go (2)
Service(39-52)New(54-68)internal/sms-gateway/cache/factory.go (1)
Cache(15-15)
Actionable comments posted: 0
🧹 Nitpick comments (8)
deployments/grafana/dashboards/online.json (8)
22-26: Unify datasource references; avoid hardcoded UID where possiblePanels mix object UID refs and plain string "Prometheus". Pick one convention. If this dashboard is shared across envs, prefer a named datasource or a dashboard input over a hardcoded UID to avoid import failures.
Also applies to: 200-203
199-209: Disambiguate legend or aggregate by status to avoid duplicate series namesMultiple instances will render identical legends ("success"/"error") and overlay lines. Either aggregate or include instance in legend.
Apply one of the following:
Option A (aggregate across instances):
- "expr": "rate(sms_online_status_set_total{instance=~\"$instance\"}[$__rate_interval])", - "legendFormat": "{{status}}", + "expr": "sum by (status) (rate(sms_online_status_set_total{instance=~\"$instance\"}[$__rate_interval]))", + "legendFormat": "{{status}}",Option B (keep per-instance):
- "legendFormat": "{{status}}", + "legendFormat": "{{instance}} · {{status}}",
295-304: Cache ops: add aggregation to match legend and reduce series cardinalityWithout aggregation, legend duplicates per instance. Sum by (operation,status) or add instance to legend.
- "expr": "rate(sms_online_cache_operations_total{instance=~\"$instance\", operation=~\"$operation\"}[$__rate_interval])", - "legendFormat": "{{operation}} - {{status}}", + "expr": "sum by (operation, status) (rate(sms_online_cache_operations_total{instance=~\"$instance\", operation=~\"$operation\"}[$__rate_interval]))", + "legendFormat": "{{operation}} - {{status}}",Or:
- "legendFormat": "{{operation}} - {{status}}", + "legendFormat": "{{instance}} · {{operation}} - {{status}}",
390-408: Latency quantiles: consider grouping by instance (or operation) for per-node SLOsCurrent queries aggregate all selected instances; useful for fleet view, but hides per-instance outliers. If you want per-instance SLOs, keep instance in the sum and adjust legend.
- "expr": "histogram_quantile(0.95, sum(rate(sms_online_cache_latency_seconds_bucket{instance=~\"$instance\"}[$__rate_interval])) by (le))", - "legendFormat": "95th Percentile", + "expr": "histogram_quantile(0.95, sum by (le, instance) (rate(sms_online_cache_latency_seconds_bucket{instance=~\"$instance\"}[$__rate_interval])))", + "legendFormat": "{{instance}} p95",(Apply similarly for persistence latency.)
Also applies to: 498-516
565-576: Thresholds define green@0 and red@0 — everything may appear redSet a non-zero red threshold so low error rates aren’t flagged.
- { "color": "green", "value": 0 }, - { "color": "red", "value": 0 } + { "color": "green", "value": 0 }, + { "color": "red", "value": 1 }Tune value to your alerting tolerance.
Also applies to: 659-670
741-760: Add datasource to the “operation” variable for consistencyThe first variable specifies the Prometheus datasource; this one doesn’t. Missing datasource can break provisioning when no default is set.
{ "allValue": ".*", "allowCustomValue": false, "current": { "text": "All", "value": "$__all" }, + "datasource": { "type": "prometheus", "uid": "edqp0a73uh2bka" }, "definition": "label_values(sms_online_cache_operations_total,operation)", "includeAll": true, "name": "operation",(Or switch both variables to a named datasource/input to avoid hardcoded UID.)
710-711: Provisioning hygiene: explicit refresh, avoid pinned id/version
- Set a concrete refresh interval (e.g., "30s"); "auto" may be ambiguous across Grafana versions.
- Omit fixed dashboard id and version in provisioned JSON to avoid collisions and churn.
- "refresh": "auto", + "refresh": "30s",And drop:
- "id": 19, - "version": 9Also applies to: 18-19, 769-772
109-116: Use a clearer legend for Batch Size“__auto” is vague. Show instance explicitly.
- "legendFormat": "__auto", + "legendFormat": "{{instance}}",
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
deployments/grafana/dashboards/online.json(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Analyze (go)
- GitHub Check: Benchmark
Actionable comments posted: 0
🧹 Nitpick comments (2)
deployments/grafana/dashboards/online.json (2)
24-24: Consider using templated datasource UIDs for portabilityThe dashboard uses hardcoded datasource UID
"edqp0a73uh2bka"in all panels. For better portability across environments, consider using templated datasource variables or consistent custom UIDs when provisioning datasources.Apply this approach by either:
Option 1: Use templated datasource (recommended)
"datasource": { "type": "prometheus", - "uid": "edqp0a73uh2bka" + "uid": "${datasource}" },And add a datasource template variable to the
templating.listsection.Option 2: Use consistent custom UID in provisioning
Ensure your Prometheus datasource is provisioned with a consistent custom UID likeprometheus-mainacross all environments.Also applies to: 121-121, 216-216, 311-311, 420-420, 531-531, 625-625
361-363: Consider adjusting latency thresholdsThe latency thresholds appear quite aggressive:
- Cache latency: 0.3s warning threshold
- Persistence latency: 0.3s warning, 0.5s critical
For a batch-oriented service processing device status updates, these thresholds might generate false alarms.
Consider adjusting based on your SLAs:
"steps": [ { "color": "green", "value": 0 }, { "color": "red", - "value": 0.3 + "value": 1.0 } ]Also applies to: 473-475
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
deployments/grafana/dashboards/online.json(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Benchmark
- GitHub Check: E2E
- GitHub Check: Analyze (go)
🔇 Additional comments (5)
deployments/grafana/dashboards/online.json (5)
81-86: Consistent grid positioning supports good dashboard layoutThe panels are well-positioned using Grafana's 24-column grid system with appropriate heights (8 units) and logical arrangement. The grid positioning follows Grafana best practices with proper width (w) and height (h) values.
Also applies to: 179-184, 274-279, 370-375, 482-487, 589-594, 683-688
757-769: Template variables enhance dashboard flexibility. The template variables for `instance` and `operation` filtering are well-configured, allowing users to scope the dashboard to specific instances or cache operations. This follows Grafana dashboard best practices for creating flexible, reusable dashboards.
778-780: Dashboard metadata follows conventions. The dashboard has appropriate metadata with descriptive title "SMS Online Metrics", consistent UID `online_metrics`, and relevant tags. This supports version control and dashboard organization best practices.
106-106: Verify Prometheus metric names and exposure
- Confirm the metrics in deployments/grafana/dashboards/online.json (lines 106, 204, 299, 399, 407, 511, 519, 614, 708) are exported by the online service and that each metric's type matches its name — counters must end with _total (OpenMetrics/Prometheus).
- If any counter is missing the _total suffix, rename it and add/verify a HELP string; if a non-counter uses magic suffixes (_total, _count, _sum, _bucket, _created), rename to avoid collisions and high-cardinality labels.
1-781: LGTM — dashboard metrics verifiedDashboard metrics in deployments/grafana/dashboards/online.json match the Prometheus metrics defined in internal/sms-gateway/online/metrics.go (status_set_total, batch_size, cache_operations_total, cache_latency_seconds, persistence_latency_seconds, persistence_errors_total); no changes required.
Force-pushed f9e0953 to 6f1dc1f
Actionable comments posted: 0
♻️ Duplicate comments (12)
pkg/cache/memory_profile_test.go (5)
45-61: Use HeapAlloc delta (not TotalAlloc) and guard underflow.TotalAlloc is cumulative; use heap-in-use and signed delta.
- allocDiff := m2.TotalAlloc - m1.TotalAlloc - allocPerItem := float64(allocDiff) / float64(numItems) + heapDiff := int64(m2.HeapAlloc) - int64(m1.HeapAlloc) + allocPerItem := float64(heapDiff) / float64(numItems) @@ - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff) @@ - if allocPerItem > 300 { + if allocPerItem > 300 { t.Errorf("Expected less than 300 bytes per item, got %.2f bytes", allocPerItem) }
95-109: Cleanup check: compare heap-in-use, not TotalAlloc.Align metric and logs; avoid unsigned wrap.
- allocDiff := m2.TotalAlloc - m1.TotalAlloc + heapDiff := int64(m2.HeapAlloc) - int64(m1.HeapAlloc) @@ - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff) @@ - if allocDiff > 2*1024*1024 { // 2MB - t.Errorf("Expected less than 2MB memory growth after cleanup, got %d bytes", allocDiff) + if heapDiff > 2*1024*1024 { // 2MB + t.Errorf("Expected less than 2MB heap growth after cleanup, got %d bytes", heapDiff)
150-166: Pressure test: compare HeapAlloc delta.Prevents false positives from cumulative allocation.
- allocDiff := m2.TotalAlloc - m1.TotalAlloc + heapDiff := int64(m2.HeapAlloc) - int64(m1.HeapAlloc) @@ - t.Logf(" Total allocated: %d bytes", m2.TotalAlloc) - t.Logf(" Allocation difference: %d bytes", allocDiff) + t.Logf(" Heap alloc: %d bytes", m2.HeapAlloc) + t.Logf(" Heap delta: %d bytes", heapDiff) @@ - if allocDiff > expectedMaxGrowth { - t.Errorf("Expected less than %d bytes memory growth under pressure, got %d bytes", expectedMaxGrowth, allocDiff) + if heapDiff > int64(expectedMaxGrowth) { + t.Errorf("Expected < %d bytes heap growth under pressure, got %d bytes", expectedMaxGrowth, heapDiff)
243-245: Handle Drain error in leak test.Don’t drop possible errors.
- tempCache.Drain(ctx) + if _, err := tempCache.Drain(ctx); err != nil { + t.Errorf("Drain failed: %v", err) + }
25-39: Fix: invalid range over integers (won’t compile).Convert all range-over-int loops to classic indexed loops.
- for i := range numItems { + for i := 0; i < numItems; i++ {- for i := range numItems { + for i := 0; i < numItems; i++ {- for i := range numItems { + for i := 0; i < numItems; i++ {- for range numGCs { + for i := 0; i < numGCs; i++ {- for j := range 100 { + for j := 0; j < 100; j++ {- for i := range numItems { + for i := 0; i < numItems; i++ {- for i := range numCaches { + for i := 0; i < numCaches; i++ {- for j := range 10 { + for j := 0; j < 10; j++ {Also applies to: 75-84, 175-184, 187-199, 201-215, 228-248
pkg/cache/memory_edge_test.go (1)
232-249: Fix: invalid range over integers (won’t compile).Replace range-over-int with classic for-loops.
- for i := range numOperations { + for i := 0; i < numOperations; i++ {- for i := range numGoroutines { + for i := 0; i < numGoroutines; i++ {Also applies to: 373-389
.github/workflows/go.yml (2)
70-74: Codecov on forks: guard missing secret or switch to tokenless.Unconditional use of CODECOV_TOKEN will fail for forked PRs.
Option A — guard by secret and event:
- - name: Upload coverage to Codecov - uses: codecov/codecov-action@v5 - with: - token: ${{ secrets.CODECOV_TOKEN }} + - name: Upload coverage to Codecov + if: > + ${{ + (github.event_name != 'pull_request' || github.event.pull_request.head.repo.fork == false) + && secrets.CODECOV_TOKEN != '' + }} + uses: codecov/codecov-action@v5 + with: + token: ${{ secrets.CODECOV_TOKEN }} + fail_ci_if_error: falseOption B — tokenless (public repos):
- - name: Upload coverage to Codecov - uses: codecov/codecov-action@v5 - with: - token: ${{ secrets.CODECOV_TOKEN }} + - name: Upload coverage to Codecov + uses: codecov/codecov-action@v5 + with: + fail_ci_if_error: false
75-81: Benchmark baselines not persisted; fail-on-alert risks flakes.You’re reading an external JSON but never persisting updates. Also commenting/push needs proper perms.
Option A — persist to gh-pages (recommended):
benchmark: name: Benchmark runs-on: ubuntu-latest permissions: - contents: read + contents: write pull-requests: write @@ - name: Upload benchmark results uses: benchmark-action/github-action-benchmark@v1 with: tool: "go" output-file-path: benchmark.txt external-data-json-path: ./cache/benchmark-data.json - fail-on-alert: true + auto-push: true + comment-on-alert: true + fail-on-alert: trueOption B — no persistence: disable alerting to avoid false failures:
external-data-json-path: ./cache/benchmark-data.json - fail-on-alert: true + comment-on-alert: false + fail-on-alert: falseAlso ensure the external path exists:
- name: Install all Go dependencies run: go mod download + + - name: Ensure benchmark cache dir exists + run: mkdir -p cacheAlso applies to: 100-111
internal/sms-gateway/online/service.go (4)
28-36: Name the logger for this module.Consistent scoping helps log filtering.
func New(devicesSvc *devices.Service, cache cache.Cache, logger *zap.Logger, metrics *metrics) Service { return &service{ devicesSvc: devicesSvc, cache: cache, - logger: logger, + logger: logger.Named("online"), metrics: metrics, } }
39-46: Flush once on shutdown to avoid dropping buffered updates.Do a best-effort final persist with a short timeout.
for { select { case <-ctx.Done(): - return + s.logger.Debug("Shutting down; flushing online status") + flushCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + if err := s.persist(flushCtx); err != nil { + s.logger.Warn("Final persist failed", zap.Error(err)) + } + return
56-67: Guard against empty deviceID to avoid empty-key writes.Skip and warn early.
func (s *service) SetOnline(ctx context.Context, deviceID string) { + if deviceID == "" { + s.logger.Warn("Empty device_id; skipping SetOnline") + return + } dt := time.Now().UTC().Format(time.RFC3339)
82-117: Re-queue drained items on persist failure to prevent data loss.Drain clears the buffer; on DB error, best-effort reinsert.
items, err := s.cache.Drain(ctx) @@ - if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil { + if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil { persistErr = fmt.Errorf("can't set last seen: %w", err) s.metrics.IncrementPersistenceError() + // Best-effort requeue + requeueCtx, cancel := context.WithTimeout(context.Background(), 3*time.Second) + defer cancel() + for id, v := range items { + if setErr := s.cache.Set(requeueCtx, id, v); setErr != nil { + s.logger.Warn("Failed to requeue last_seen", zap.String("device_id", id), zap.Error(setErr)) + } + } return }
🧹 Nitpick comments (13)
pkg/cache/memory_edge_test.go (3)
47-56: Harden expiry timing to reduce flakiness.1ms TTL with 2ms sleep is tight on CI; bump margins.
- ttl := 1 * time.Millisecond + ttl := 10 * time.Millisecond @@ - time.Sleep(2 * ttl) + time.Sleep(5 * ttl)
254-257: Loosen duration bound (avoid timing flakes).Increase tolerance for RapidOperations or this will flap on busy runners.
- if durationTaken > 2*duration { + if durationTaken > 5*duration { t.Errorf("Operations took too long: %v", durationTaken) }
16-37: Avoid shadowing the imported package namecache.Renaming local vars improves readability and prevents accidental package masking.
- cache := cache.NewMemory(0) + c := cache.NewMemory(0) @@ - err := cache.Set(ctx, key, value) + err := c.Set(ctx, key, value) @@ - retrieved, err := cache.Get(ctx, key) + retrieved, err := c.Get(ctx, key)(Apply similarly in the other functions in this file.)
Also applies to: 63-82, 84-104, 260-287, 330-352
pkg/cache/memory_profile_test.go (2)
15-17: Avoid shadowing the imported packagecache.Rename locals for clarity.
- cache := cache.NewMemory(0) + c := cache.NewMemory(0) @@ - err := cache.Set(ctx, key, value) + err := c.Set(ctx, key, value) @@ - _, err = cache.Get(ctx, key) + _, err = c.Get(ctx, key)(Apply similarly where the local is named
cache.)Also applies to: 65-67, 170-172, 275-277
58-60: Thresholds may be too strict across Go/GC versions.Consider widening or logging-only to avoid flaky failures.
Also applies to: 162-165
.github/workflows/go.yml (2)
35-39: Enable Go module caching to speed CI.setup-go v5 supports built-in caching.
- name: Set up Go uses: actions/setup-go@v5 with: go-version: stable + cache: true(Apply in Lint, Test, and Benchmark jobs.)
Also applies to: 56-60, 87-91
41-45: Pin golangci-lint version; avoid “latest” for reproducibility.Floating linter versions can break CI unexpectedly.
Example:
- name: Run golangci-lint uses: golangci/golangci-lint-action@v8 with: - version: latest + version: v1.59.1 # pin to a known-good version args: --timeout=5minternal/sms-gateway/cache/factory.go (3)
3-9: Default TTL for "online" cache + minor robustness (lowercase scheme).Avoid non-expiring “online” entries and normalize scheme. Apply:
import ( "fmt" "net/url" + "strings" + "time" "github.com/android-sms-gateway/core/redis" "github.com/android-sms-gateway/server/pkg/cache" ) const ( keyPrefix = "sms-gateway:" + onlineDefaultTTL = 15 * time.Minute ) @@ - switch u.Scheme { + switch strings.ToLower(u.Scheme) { case "memory": return &factory{ new: func(name string) (cache.Cache, error) { - return cache.NewMemory(0), nil + ttl := time.Duration(0) + if name == keyPrefix+"online" || strings.HasSuffix(name, ":online") { + ttl = onlineDefaultTTL + } + return cache.NewMemory(ttl), nil }, }, nil case "redis": client, err := redis.New(redis.Config{URL: config.URL}) @@ return &factory{ new: func(name string) (cache.Cache, error) { - return cache.NewRedis(client, name, 0), nil + ttl := time.Duration(0) + if name == keyPrefix+"online" || strings.HasSuffix(name, ":online") { + ttl = onlineDefaultTTL + } + return cache.NewRedis(client, name, ttl), nil }, }, nilEnsure the “online” cache is instantiated via factory.New("online") (it is in module wiring). If other names should expire too, extend the predicate accordingly.
Also applies to: 11-13, 35-51, 57-60
15-23: Use the local alias consistently.Prefer the exported alias to reduce verbosity.
-type Cache = cache.Cache +type Cache = cache.Cache type Factory interface { - New(name string) (cache.Cache, error) + New(name string) (Cache, error) } type factory struct { - new func(name string) (cache.Cache, error) + new func(name string) (Cache, error) }
35-51: Optional: memoize memory caches by name.Multiple New("online") calls return separate in‑memory stores; consider memoizing per name for parity with Redis namespacing.
If desired, I can draft a small sync.Map-based memoization inside factory.
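For instance, a sketch of that memoization (assumes the factory shape shown in this PR; the type and field names are illustrative, and the keyPrefix handling is elided):

```go
// memoizingFactory returns the same in-memory instance for repeated calls
// with the same name instead of creating separate stores.
type memoizingFactory struct {
	new       func(name string) (cache.Cache, error)
	instances sync.Map // map[string]cache.Cache
}

func (f *memoizingFactory) New(name string) (cache.Cache, error) {
	if v, ok := f.instances.Load(name); ok {
		return v.(cache.Cache), nil
	}
	c, err := f.new(name)
	if err != nil {
		return nil, err
	}
	// LoadOrStore keeps the first instance if two goroutines race on a name.
	v, _ := f.instances.LoadOrStore(name, c)
	return v.(cache.Cache), nil
}
```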
internal/sms-gateway/modules/devices/service.go (1)
104-124: Batch to DB to cut round-trips; keep per-item fallback as a guard.Current loop issues one query per device. If the repository can handle bulk updates, prefer a single batch call.
-func (s *Service) SetLastSeen(ctx context.Context, batch map[string]time.Time) error { +func (s *Service) SetLastSeen(ctx context.Context, batch map[string]time.Time) error { if len(batch) == 0 { return nil } - var multiErr error - for deviceID, lastSeen := range batch { - if err := ctx.Err(); err != nil { - return errors.Join(err, multiErr) - } - if err := s.devices.SetLastSeen(ctx, deviceID, lastSeen); err != nil { - multiErr = errors.Join(multiErr, fmt.Errorf("device %s: %w", deviceID, err)) - s.logger.Error("can't set last seen", - zap.String("device_id", deviceID), - zap.Time("last_seen", lastSeen), - zap.Error(err), - ) - } - } - return multiErr + // Prefer a repository-level bulk update when available. + if bulkErr := s.devices.SetLastSeenBatch(ctx, batch); bulkErr == nil { + return nil + } + // Fallback: per-item updates. + var multiErr error + for deviceID, lastSeen := range batch { + if err := ctx.Err(); err != nil { + return errors.Join(multiErr, err) + } + if err := s.devices.SetLastSeen(ctx, deviceID, lastSeen); err != nil { + multiErr = errors.Join(multiErr, fmt.Errorf("device %s: %w", deviceID, err)) + s.logger.Error("can't set last seen", zap.String("device_id", deviceID), zap.Time("last_seen", lastSeen), zap.Error(err)) + } + } + return multiErr }If SetLastSeenBatch doesn’t exist yet, I can sketch the repo SQL for your DB dialect.
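In case it helps, a rough MySQL-flavoured sketch of such a bulk update — `setLastSeenBatch` is hypothetical, the `GREATEST` guard mirrors the per-row `last_seen < ?` check, and GORM v2 is assumed:

```go
// setLastSeenBatch updates many devices in one statement; not existing code.
func setLastSeenBatch(ctx context.Context, db *gorm.DB, batch map[string]time.Time) error {
	if len(batch) == 0 {
		return nil
	}

	var (
		caseExpr strings.Builder
		args     []any
		ids      []string
	)
	caseExpr.WriteString("CASE id ")
	for id, ts := range batch {
		caseExpr.WriteString("WHEN ? THEN ? ")
		args = append(args, id, ts.UTC())
		ids = append(ids, id)
	}
	caseExpr.WriteString("END")

	// GREATEST keeps the newer of the stored and incoming timestamps.
	query := "UPDATE devices SET last_seen = GREATEST(last_seen, " + caseExpr.String() + ") WHERE id IN ?"
	args = append(args, ids)

	return db.WithContext(ctx).Exec(query, args...).Error
}
```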
internal/sms-gateway/online/service.go (2)
40-41: Make the persist interval configurable.Hard-coding 1m limits ops flexibility; read from module config with a sane default.
I can wire a Config{PersistInterval time.Duration} into fx if you want.
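A minimal sketch of that wiring — the field name and the one-minute default are assumptions:

```go
// Config for the online module; zero value falls back to a one-minute interval.
type Config struct {
	PersistInterval time.Duration
}

func (c Config) persistInterval() time.Duration {
	if c.PersistInterval <= 0 {
		return time.Minute
	}
	return c.PersistInterval
}

// In Run, the hard-coded ticker then becomes:
//	ticker := time.NewTicker(s.config.persistInterval())
```

The module could provide this Config from the application config loader (or fx.Supply a default), keeping the current behaviour when nothing is set.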
56-65: Set a short TTL on cache writes (use cache.WithTTL). pkg/cache exposes WithTTL — set a short TTL (e.g., 15*time.Minute) when calling Set to self‑heal on crashes. Example replacement:

```go
if err = s.cache.Set(ctx, deviceID, dt, cache.WithTTL(15*time.Minute)); err != nil { ... }
```

Location: internal/sms-gateway/online/service.go (around lines 56–65)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- .github/workflows/docker-publish.yml (0 hunks)
- .github/workflows/go.yml (1 hunks)
- .github/workflows/golangci-lint.yml (0 hunks)
- configs/config.example.yml (1 hunks)
- deployments/grafana/dashboards/online.json (1 hunks)
- deployments/prometheus/alerts/online-alerts.yml (1 hunks)
- internal/sms-gateway/app.go (4 hunks)
- internal/sms-gateway/cache/factory.go (1 hunks)
- internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql (1 hunks)
- internal/sms-gateway/models/models.go (1 hunks)
- internal/sms-gateway/modules/auth/service.go (5 hunks)
- internal/sms-gateway/modules/devices/repository.go (2 hunks)
- internal/sms-gateway/modules/devices/service.go (3 hunks)
- internal/sms-gateway/online/metrics.go (1 hunks)
- internal/sms-gateway/online/module.go (1 hunks)
- internal/sms-gateway/online/service.go (1 hunks)
- pkg/cache/memory.go (1 hunks)
- pkg/cache/memory_bench_test.go (1 hunks)
- pkg/cache/memory_concurrency_test.go (1 hunks)
- pkg/cache/memory_edge_test.go (1 hunks)
- pkg/cache/memory_profile_test.go (1 hunks)
- pkg/cache/redis.go (1 hunks)
💤 Files with no reviewable changes (2)
- .github/workflows/docker-publish.yml
- .github/workflows/golangci-lint.yml
🚧 Files skipped from review as they are similar to previous changes (14)
- internal/sms-gateway/app.go
- pkg/cache/memory_concurrency_test.go
- internal/sms-gateway/online/module.go
- pkg/cache/redis.go
- deployments/prometheus/alerts/online-alerts.yml
- internal/sms-gateway/modules/auth/service.go
- internal/sms-gateway/modules/devices/repository.go
- internal/sms-gateway/models/models.go
- deployments/grafana/dashboards/online.json
- pkg/cache/memory.go
- internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql
- pkg/cache/memory_bench_test.go
- configs/config.example.yml
- internal/sms-gateway/online/metrics.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
- pkg/cache/memory_profile_test.go
- pkg/cache/memory_edge_test.go
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.
Applied to files:
pkg/cache/memory_edge_test.go
🧬 Code graph analysis (5)
internal/sms-gateway/online/service.go (3)
- internal/sms-gateway/modules/devices/service.go (1): Service (30-39)
- internal/sms-gateway/modules/auth/service.go (2): Service (39-52), New (54-68)
- internal/sms-gateway/cache/factory.go (1): Cache (15-15)
pkg/cache/memory_profile_test.go (1)
- pkg/cache/memory.go (1): NewMemory (16-23)
internal/sms-gateway/modules/devices/service.go (2)
- internal/sms-gateway/modules/events/service.go (3): s (71-116), deviceSvc (13-24), s (40-57)
- internal/sms-gateway/handlers/mobile.go (1): h (149-165)
internal/sms-gateway/cache/factory.go (5)
- pkg/cache/cache.go (1): Cache (5-36)
- internal/config/config.go (2): Cache (74-76), Config (10-18)
- internal/sms-gateway/cache/config.go (1): Config (4-6)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/redis.go (1): NewRedis (44-56)
pkg/cache/memory_edge_test.go (3)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (22-30)
- pkg/cache/errors.go (2): ErrKeyExpired (9-9), ErrKeyNotFound (7-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Benchmark
- GitHub Check: E2E
- GitHub Check: Analyze (go)
🔇 Additional comments (3)
pkg/cache/memory_edge_test.go (1)
289-329: LGTM: Drain semantics validated correctly.Test matches memory implementation: expired items are removed prior to Drain callback, so only non‑expired entries are returned and expired keys become “not found”.
.github/workflows/go.yml (1)
10-19: Fix glob: `**.go` won't match files in subdirs; use `**/*.go`. Current patterns may skip CI on most Go changes.
push: branches: [master] paths: - - "**.go" + - "**/*.go" - "go.mod" - "go.sum" pull_request: branches: [master] paths: - - "**.go" + - "**/*.go" - "go.mod" - "go.sum"Likely an incorrect or invalid review comment.
internal/sms-gateway/modules/devices/service.go (1)
136-144: Token cache invalidation now matches GetByToken hashing. LGTM.
Force-pushed 02145a2 to e149f8e
Actionable comments posted: 4
🧹 Nitpick comments (13)
go.mod (1)
3-3: Pin CI to Go 1.24.1 via go-version-file to avoid drift.Your workflow uses setup-go: stable. Safer to bind CI to the version declared here (1.24.1) to prevent unexpected toolchain changes.
Apply in .github/workflows/go.yml for each setup-go step:
- with: - go-version: stable + with: + go-version-file: go.mod.github/workflows/go.yml (5)
11-13: Fix path filters to match all Go files. Use `**/*.go` instead of `**.go` for recursive matches.
- - "**.go" + - "**/*.go"Also applies to: 17-19
35-39: Bind Go toolchain to go.mod.Use go-version-file to ensure CI uses Go 1.24.1 as in go.mod.
- with: - go-version: stable + with: + go-version-file: go.mod
56-60: Same: bind toolchain to go.mod in test job.- with: - go-version: stable + with: + go-version-file: go.mod
87-91: Same: bind toolchain to go.mod in benchmark job.- with: - go-version: stable + with: + go-version-file: go.mod
97-99: Run only benchmarks (skip tests) for speed/stability.- run: go test -bench=. -benchmem ./... | tee benchmark.txt + run: go test -run='^$$' -bench=. -benchmem ./... | tee benchmark.txtinternal/sms-gateway/modules/devices/repository.go (1)
74-88: last_seen update guards look correct; consider optional NULL-tolerant condition.If schema ever allows NULLs, add “OR last_seen IS NULL” for robustness. Current model sets NOT NULL, so this is optional.
- Where("id = ? AND last_seen < ?", id, lastSeen). + Where("id = ? AND (last_seen IS NULL OR last_seen < ?)", id, lastSeen).pkg/cache/redis.go (2)
58-61: Document Cleanup as a no-op for Redis (TTL handled server-side).Add a brief comment to avoid confusion vs memory impl that actively prunes.
// Cleanup implements Cache. +// Redis handles field expiration natively; nothing to do here. func (r *redisCache) Cleanup(_ context.Context) error {
130-137: Optional: use TxPipeline for fewer RTTs (no behavior change).Tiny improvement; current Pipelined is fine.
- _, err := r.client.Pipelined(ctx, func(p redis.Pipeliner) error { + p := r.client.TxPipeline() + { p.HSet(ctx, r.key, key, value) if !options.validUntil.IsZero() { p.HExpireAt(ctx, r.key, options.validUntil, key) } - return nil - }) - if err != nil { + } + if _, err := p.Exec(ctx); err != nil { return fmt.Errorf("can't set cache item: %w", err) }pkg/cache/memory_edge_test.go (1)
221-258: Time-based perf assertion may be flaky in CI. Either increase the threshold substantially or convert to a benchmark. Current 2x of 100ms is tight on shared runners.
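One way to de-flake time-sensitive assertions is to poll with a deadline instead of sleeping once. The helper below is hypothetical; it assumes the pkg/cache Cache interface with a Get method that returns ErrKeyExpired or ErrKeyNotFound once the entry lapses.

```go
package cache_test

import (
	"context"
	"errors"
	"testing"
	"time"

	"github.com/android-sms-gateway/server/pkg/cache"
)

// waitForExpiry polls until Get reports the key as expired or missing, or the
// deadline passes. Polling tolerates scheduler jitter on shared CI runners
// better than a single fixed sleep.
func waitForExpiry(t *testing.T, c cache.Cache, key string, deadline time.Duration) {
	t.Helper()
	stop := time.Now().Add(deadline)
	for time.Now().Before(stop) {
		if _, err := c.Get(context.Background(), key); errors.Is(err, cache.ErrKeyExpired) || errors.Is(err, cache.ErrKeyNotFound) {
			return
		}
		time.Sleep(5 * time.Millisecond)
	}
	t.Fatalf("key %q did not expire within %s", key, deadline)
}
```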
internal/sms-gateway/cache/factory.go (2)
3-9: Normalize URL scheme; accept mixed‑case and future-proof imports. Switch to strings.ToLower(u.Scheme) and add the import. Prevents failures on "REDIS://..." etc.
```diff
 import (
 	"fmt"
 	"net/url"
+	"strings"

 	"github.com/android-sms-gateway/core/redis"
 	"github.com/android-sms-gateway/server/pkg/cache"
 )
@@
-	switch u.Scheme {
+	switch strings.ToLower(u.Scheme) {
 	case "memory":
 		return &factory{
 			new: func(name string) (Cache, error) {
 				return cache.NewMemory(0), nil
 			},
 		}, nil
 	case "redis":
```
Also applies to: 35-54
58-60: Guard against empty cache names. Avoid creating a root-scoped prefix ("sms-gateway:") if callers pass an empty name.
```diff
 func (f *factory) New(name string) (Cache, error) {
-	return f.new(keyPrefix + name)
+	n := keyPrefix + name
+	if strings.TrimSpace(name) == "" {
+		return nil, fmt.Errorf("empty cache name")
+	}
+	return f.new(n)
 }
```
pkg/cache/memory_profile_test.go (1)
1-1: Drop //nolint:errcheck; fix missed checks instead. The file already checks most errors; the pragma hides real misses (e.g., Drain above). Remove it once fixes are applied.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sum is excluded by !**/*.sum
📒 Files selected for processing (33)
- .github/workflows/docker-publish.yml (0 hunks)
- .github/workflows/go.yml (1 hunks)
- .github/workflows/golangci-lint.yml (0 hunks)
- build/package/Dockerfile (1 hunks)
- build/package/Dockerfile.dev (1 hunks)
- configs/config.example.yml (1 hunks)
- deployments/grafana/dashboards/online.json (1 hunks)
- deployments/prometheus/alerts/online-alerts.yml (1 hunks)
- go.mod (4 hunks)
- internal/config/config.go (3 hunks)
- internal/config/module.go (2 hunks)
- internal/sms-gateway/app.go (4 hunks)
- internal/sms-gateway/cache/config.go (1 hunks)
- internal/sms-gateway/cache/factory.go (1 hunks)
- internal/sms-gateway/cache/module.go (1 hunks)
- internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql (1 hunks)
- internal/sms-gateway/models/models.go (1 hunks)
- internal/sms-gateway/modules/auth/service.go (5 hunks)
- internal/sms-gateway/modules/devices/repository.go (2 hunks)
- internal/sms-gateway/modules/devices/service.go (3 hunks)
- internal/sms-gateway/online/metrics.go (1 hunks)
- internal/sms-gateway/online/module.go (1 hunks)
- internal/sms-gateway/online/service.go (1 hunks)
- pkg/cache/cache.go (1 hunks)
- pkg/cache/errors.go (1 hunks)
- pkg/cache/memory.go (1 hunks)
- pkg/cache/memory_bench_test.go (1 hunks)
- pkg/cache/memory_concurrency_test.go (1 hunks)
- pkg/cache/memory_edge_test.go (1 hunks)
- pkg/cache/memory_profile_test.go (1 hunks)
- pkg/cache/memory_test.go (1 hunks)
- pkg/cache/options.go (1 hunks)
- pkg/cache/redis.go (1 hunks)
💤 Files with no reviewable changes (2)
- .github/workflows/golangci-lint.yml
- .github/workflows/docker-publish.yml
🚧 Files skipped from review as they are similar to previous changes (22)
- internal/config/config.go
- internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql
- build/package/Dockerfile.dev
- internal/sms-gateway/online/module.go
- pkg/cache/memory.go
- pkg/cache/cache.go
- configs/config.example.yml
- internal/sms-gateway/modules/devices/service.go
- pkg/cache/memory_concurrency_test.go
- internal/sms-gateway/online/metrics.go
- pkg/cache/errors.go
- internal/sms-gateway/cache/module.go
- build/package/Dockerfile
- internal/sms-gateway/modules/auth/service.go
- internal/sms-gateway/app.go
- internal/config/module.go
- deployments/grafana/dashboards/online.json
- pkg/cache/options.go
- internal/sms-gateway/cache/config.go
- pkg/cache/memory_test.go
- deployments/prometheus/alerts/online-alerts.yml
- internal/sms-gateway/models/models.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.
Applied to files:
- pkg/cache/memory_edge_test.go
- pkg/cache/redis.go
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
- pkg/cache/memory_edge_test.go
- pkg/cache/redis.go
- pkg/cache/memory_profile_test.go
- pkg/cache/memory_bench_test.go
📚 Learning: 2025-09-10T23:54:16.828Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:118-121
Timestamp: 2025-09-10T23:54:16.828Z
Learning: Redis 7.4.0 introduced HEXPIRE, HEXPIREAT, and related commands that support per-field TTL on hash data structures. go-redis v9.6.0+ includes support for these commands with methods like HExpireAt.
Applied to files:
pkg/cache/redis.go
🧬 Code graph analysis (7)
pkg/cache/memory_edge_test.go (3)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (22-30)
- pkg/cache/errors.go (2): ErrKeyExpired (9-9), ErrKeyNotFound (7-7)
internal/sms-gateway/modules/devices/repository.go (2)
- internal/sms-gateway/models/models.go (1): Device (25-36)
- internal/sms-gateway/handlers/mobile.go (1): h (149-165)
internal/sms-gateway/online/service.go (3)
- internal/sms-gateway/modules/devices/service.go (1): Service (30-39)
- internal/sms-gateway/modules/auth/service.go (2): Service (39-52), New (54-68)
- internal/sms-gateway/cache/factory.go (1): Cache (15-15)
pkg/cache/redis.go (3)
- pkg/cache/cache.go (1): Cache (5-36)
- pkg/cache/errors.go (2): ErrKeyNotFound (7-7), ErrKeyExists (11-11)
- pkg/cache/options.go (1): Option (6-6)
pkg/cache/memory_profile_test.go (1)
- pkg/cache/memory.go (1): NewMemory (16-23)
pkg/cache/memory_bench_test.go (2)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (22-30)
internal/sms-gateway/cache/factory.go (5)
- internal/config/config.go (2): Cache (74-76), Config (10-18)
- pkg/cache/cache.go (1): Cache (5-36)
- internal/sms-gateway/cache/config.go (1): Config (4-6)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/redis.go (1): NewRedis (44-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: E2E
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
🔇 Additional comments (18)
go.mod (1)
21-21: Redis client OK; ensure server >= 7.4 for HEXPIRE/HEXPIREAT. Code relies on HExpireAt. Please verify runtime Redis is 7.4+ and document this requirement (README/helm/chart).
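A lightweight startup guard could parse redis_version from INFO and refuse to run against an older server. This is a sketch only; it assumes a go-redis v9 *redis.Client and the standard "redis_version:" line in the INFO output, and the helper name is illustrative.

```go
package redischeck

import (
	"context"
	"fmt"
	"strings"

	"github.com/redis/go-redis/v9"
)

// requireRedis74 fails fast when the connected server cannot do per-field
// hash TTLs (HEXPIRE/HEXPIREAT), which need Redis 7.4+.
func requireRedis74(ctx context.Context, client *redis.Client) error {
	info, err := client.Info(ctx, "server").Result()
	if err != nil {
		return fmt.Errorf("can't read server info: %w", err)
	}
	for _, line := range strings.Split(info, "\n") {
		if v, ok := strings.CutPrefix(strings.TrimSpace(line), "redis_version:"); ok {
			var major, minor int
			if _, err := fmt.Sscanf(v, "%d.%d", &major, &minor); err != nil {
				return fmt.Errorf("can't parse redis_version %q: %w", v, err)
			}
			if major > 7 || (major == 7 && minor >= 4) {
				return nil
			}
			return fmt.Errorf("redis %s lacks HEXPIREAT; need >= 7.4", v)
		}
	}
	return fmt.Errorf("redis_version not reported")
}
```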
.github/workflows/go.yml (1)
70-74: Guard Codecov upload on forks.
```diff
-      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v5
-        with:
-          token: ${{ secrets.CODECOV_TOKEN }}
+      - name: Upload coverage to Codecov
+        if: ${{ secrets.CODECOV_TOKEN != '' }}
+        uses: codecov/codecov-action@v5
+        with:
+          token: ${{ secrets.CODECOV_TOKEN }}
+          fail_ci_if_error: false
```
internal/sms-gateway/modules/devices/repository.go (2)
70-72: Good fix: parameterized WHERE for push_token.
99-106: Confirm soft-delete intent and index usage. Delete respects GORM soft delete if SoftDeletableModel is present; confirm that’s desired. Also ensure an index on last_seen exists (models shows it), since this is a range delete.
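If the intent is a permanent purge rather than a soft delete, a sketch along these lines would bypass GORM's soft-delete filter. The function name, the free-standing form, and the module path are assumptions; drop Unscoped() if soft deletion is actually wanted.

```go
package devices

import (
	"time"

	"gorm.io/gorm"

	"github.com/android-sms-gateway/server/internal/sms-gateway/models"
)

// hardDeleteStale permanently removes devices whose last_seen is older than
// `until`. Unscoped() bypasses GORM's soft-delete filter; without it, GORM
// only sets deleted_at. The WHERE clause is a range scan that can use the
// new last_seen index.
func hardDeleteStale(db *gorm.DB, until time.Time) error {
	return db.
		Unscoped().
		Where("last_seen < ?", until).
		Delete(&models.Device{}).
		Error
}
```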
pkg/cache/redis.go (3)
72-92: Drain is atomic via Lua — LGTM.
145-168: Bug: HExpireAt called without field — wrong arity at runtime. This sets no per-field TTL and will error. Pass the field name.
```diff
-	if !options.validUntil.IsZero() {
-		if err := r.client.HExpireAt(ctx, r.key, options.validUntil).Err(); err != nil {
+	if !options.validUntil.IsZero() {
+		if err := r.client.HExpireAt(ctx, r.key, options.validUntil, key).Err(); err != nil {
 			return fmt.Errorf("can't set cache item ttl: %w", err)
 		}
 	}
```
94-106: Critical: redis.Get violates Cache.Get contract — returns ErrKeyNotFound for expired keys (should return ErrKeyExpired). Redis removes expired hash fields (field-level TTL / HExpireAt), so HGet cannot distinguish "expired" vs "never existed". Either:
- Update Cache interface/docs/tests to accept ErrKeyNotFound for expired items on the Redis backend; or
- Add expiry metadata (side hash or zset with expires_at) and change pkg/cache/redis.go Get to consult it and return ErrKeyExpired when appropriate (a sketch of this option follows below).
Location: pkg/cache/redis.go (Get).
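A sketch of the second option: keep the logical expiry next to the value so Get can classify a lapsed entry. The encoding format, helper names, and the local ErrExpired sentinel are illustrative stand-ins (they would map to cache.ErrKeyExpired / ErrKeyNotFound in the real package), not the PR's API.

```go
// Package expirymeta is an illustrative sketch, not part of the PR.
package expirymeta

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// ErrExpired stands in for pkg/cache's ErrKeyExpired in this sketch.
var ErrExpired = errors.New("key expired")

// Encode prefixes the value with its expiry as unix seconds ("0" = no expiry)
// so the Redis backend can keep the field beyond its logical TTL and report
// "expired" instead of "not found".
func Encode(value string, validUntil time.Time) string {
	exp := int64(0)
	if !validUntil.IsZero() {
		exp = validUntil.Unix()
	}
	return fmt.Sprintf("%d|%s", exp, value)
}

// Decode reverses Encode and classifies the entry at time now.
func Decode(raw string, now time.Time) (string, error) {
	expStr, value, ok := strings.Cut(raw, "|")
	if !ok {
		return "", fmt.Errorf("malformed cache entry")
	}
	exp, err := strconv.ParseInt(expStr, 10, 64)
	if err != nil {
		return "", fmt.Errorf("bad expiry prefix: %w", err)
	}
	if exp != 0 && !now.Before(time.Unix(exp, 0)) {
		return "", ErrExpired
	}
	return value, nil
}
```

The trade-off is that expired fields linger until the next Cleanup or Drain, so the field TTL would need to be set with some slack beyond the logical expiry.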
pkg/cache/memory_edge_test.go (2)
373-389: Same compile error in goroutine launcher.
```diff
-	for i := range numGoroutines {
+	for i := 0; i < numGoroutines; i++ {
```
232-249: Fix compile error: range over int.
```diff
-	for i := range numOperations {
+	for i := 0; i < numOperations; i++ {
```
internal/sms-gateway/online/service.go (5)
28-36: Name the logger ("online").
```diff
 func New(devicesSvc *devices.Service, cache cache.Cache, logger *zap.Logger, metrics *metrics) Service {
 	return &service{
 		devicesSvc: devicesSvc,
 		cache:      cache,
-		logger:     logger,
+		logger:     logger.Named("online"),
 		metrics:    metrics,
 	}
 }
```
3-13: Validate deviceID early; drop empty/whitespace IDs.
```diff
 import (
 	"context"
 	"fmt"
 	"time"
+	"strings"
@@
 func (s *service) SetOnline(ctx context.Context, deviceID string) {
+	if strings.TrimSpace(deviceID) == "" {
+		s.logger.Warn("empty device_id; skipping SetOnline")
+		return
+	}
 	dt := time.Now().UTC().Format(time.RFC3339)
```
Also applies to: 56-57
61-69: Set per-item TTL for “online” updates (prevents unbounded retention).
```diff
 s.metrics.ObserveCacheLatency(func() {
-	if err = s.cache.Set(ctx, deviceID, dt); err != nil {
+	if err = s.cache.Set(ctx, deviceID, dt, cache.WithTTL(15*time.Minute)); err != nil {
 		s.metrics.IncrementCacheOperation(operationSet, statusError)
 		s.logger.Error("Can't set online status", zap.String("device_id", deviceID), zap.Error(err))
 		s.metrics.IncrementStatusSet(false)
 	}
 })
```
Note: ensure pkg/cache/options.go WithTTL semantics are correct (set validUntil only when ttl > 0). See separate comment.
39-54: Flush once on shutdown to avoid losing in-memory updates.
```diff
 for {
 	select {
 	case <-ctx.Done():
-		return
+		s.logger.Debug("Shutting down; flushing online status")
+		flushCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+		defer cancel()
+		if err := s.persist(flushCtx); err != nil {
+			s.logger.Warn("Final persist failed", zap.Error(err))
+		}
+		return
```
110-116: Re-queue drained items on persist failure (best-effort).
```diff
-	if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil {
-		persistErr = fmt.Errorf("can't set last seen: %w", err)
-		s.metrics.IncrementPersistenceError()
-		return
-	}
+	if err := s.devicesSvc.SetLastSeen(ctx, timestamps); err != nil {
+		persistErr = fmt.Errorf("can't set last seen: %w", err)
+		s.metrics.IncrementPersistenceError()
+		// best-effort requeue
+		for id, v := range items {
+			if setErr := s.cache.Set(ctx, id, v); setErr != nil {
+				s.logger.Warn("Failed to requeue last_seen", zap.String("device_id", id), zap.Error(setErr))
+			}
+		}
+		return
+	}
```
45-55: Use HeapAlloc delta, not TotalAlloc (avoids false positives).
```diff
-	allocDiff := m2.TotalAlloc - m1.TotalAlloc
-	allocPerItem := float64(allocDiff) / float64(numItems)
+	allocDiff := m2.HeapAlloc - m1.HeapAlloc
+	allocPerItem := float64(allocDiff) / float64(numItems)
@@
-	t.Logf("  Total allocated: %d bytes", m2.TotalAlloc)
-	t.Logf("  Allocation difference: %d bytes", allocDiff)
+	t.Logf("  Heap alloc: %d bytes", m2.HeapAlloc)
+	t.Logf("  Heap delta: %d bytes", allocDiff)
@@
-	allocDiff := m2.TotalAlloc - m1.TotalAlloc
+	allocDiff := m2.HeapAlloc - m1.HeapAlloc
@@
-	t.Logf("  Total allocated: %d bytes", m2.TotalAlloc)
-	t.Logf("  Allocation difference: %d bytes", allocDiff)
+	t.Logf("  Heap alloc: %d bytes", m2.HeapAlloc)
+	t.Logf("  Heap delta: %d bytes", allocDiff)
@@
-	if allocDiff > expectedMaxGrowth {
-		t.Errorf("Expected less than %d bytes memory growth under pressure, got %d bytes", expectedMaxGrowth, allocDiff)
+	if allocDiff > expectedMaxGrowth {
+		t.Errorf("Expected < %d bytes heap growth under pressure, got %d bytes", expectedMaxGrowth, allocDiff)
```
Also applies to: 95-109, 150-166
244-245: Handle Drain error (don’t ignore).
```diff
-	// Clear the cache
-	tempCache.Drain(ctx)
+	// Clear the cache
+	if _, err := tempCache.Drain(ctx); err != nil {
+		t.Errorf("Drain failed: %v", err)
+	}
```
1-1: Stop silencing errors; add minimal checks in hot paths.
- Remove //nolint:errcheck and check errors for Set/Get/Delete/Cleanup/Drain in each benchmark loop. Fail fast with b.Fatal/b.Fatalf on unexpected errors. For SetOrFail, ignore cache.ErrKeyExists only.
If helpful, I can push a patch covering all benches consistently.
Also applies to: 22-27, 37-45, 55-65, 74-80, 95-99, 106-118, 132-137, 152-157, 181-189, 208-221, 246-263, 286-292, 299-304, 324-336, 364-375, 413-417, 438-445, 455-460, 465-470
316-334: Verify WithTTL semantics before benchmarking TTL overhead. pkg/cache/options.go shows WithTTL setting validUntil even when ttl <= 0. Ensure it is (see the sketch after this list):
- ttl > 0: o.validUntil = time.Now().Add(ttl)
- else: o.validUntil = time.Time{}
Also applies to: 12-13
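A minimal sketch of the intended semantics, assuming the option shape that pkg/cache/options.go appears to use (an options struct with a validUntil field and a functional Option type); the field and type names are assumptions, and only the guard on ttl > 0 is the point.

```go
package cache

import "time"

// options mirrors the shape suggested by the review; field names are assumed.
type options struct {
	validUntil time.Time
}

// Option mutates Set options.
type Option func(*options)

// WithTTL sets an absolute expiry only for positive TTLs; zero or negative
// TTLs leave validUntil at its zero value, meaning "no expiration".
func WithTTL(ttl time.Duration) Option {
	return func(o *options) {
		if ttl > 0 {
			o.validUntil = time.Now().Add(ttl)
		} else {
			o.validUntil = time.Time{}
		}
	}
}
```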
Force-pushed from 5d76e09 to 2da81b9.
Actionable comments posted: 0
🧹 Nitpick comments (4)
pkg/cache/memory_edge_test.go (2)
54-61: Reduce timing flakiness in TTL tests. Sleep-based expiry checks can be flaky on contended runners. Consider polling until deadline or adding slack (e.g., wait TTL + jitter and assert within a time-bound loop).
Also applies to: 182-189
16-17: Avoid shadowing the imported package name. Local var named “cache” shadows the imported package alias “cache”, which can confuse readers. Prefer “c” for the instance.
```diff
-	cache := cache.NewMemory(0)
+	c := cache.NewMemory(0)
```
pkg/cache/redis.go (1)
58-61: Document no-op Cleanup for Redis. Clarify that Redis enforces TTL itself.
Apply this diff:
```diff
-// Cleanup implements Cache.
+// Cleanup implements Cache.
+// No-op for Redis: field-level TTL is enforced by Redis.
```
internal/sms-gateway/cache/factory.go (1)
35-54: Support rediss scheme (TLS). Allow secure Redis URLs.
Apply this diff:
```diff
-	switch u.Scheme {
+	switch u.Scheme {
 	case "memory":
 		...
-	case "redis":
+	case "redis", "rediss":
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- .github/workflows/docker-publish.yml (0 hunks)
- .github/workflows/go.yml (1 hunks)
- .github/workflows/golangci-lint.yml (0 hunks)
- internal/sms-gateway/cache/factory.go (1 hunks)
- internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql (1 hunks)
- internal/sms-gateway/models/models.go (1 hunks)
- internal/sms-gateway/modules/devices/repository.go (2 hunks)
- internal/sms-gateway/modules/devices/service.go (3 hunks)
- pkg/cache/memory.go (1 hunks)
- pkg/cache/memory_bench_test.go (1 hunks)
- pkg/cache/memory_edge_test.go (1 hunks)
- pkg/cache/memory_profile_test.go (1 hunks)
- pkg/cache/redis.go (1 hunks)
💤 Files with no reviewable changes (2)
- .github/workflows/docker-publish.yml
- .github/workflows/golangci-lint.yml
🚧 Files skipped from review as they are similar to previous changes (5)
- internal/sms-gateway/models/models.go
- pkg/cache/memory.go
- .github/workflows/go.yml
- internal/sms-gateway/modules/devices/repository.go
- internal/sms-gateway/models/migrations/mysql/20250918231606_add_last_seen_index.sql
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.
Applied to files:
- pkg/cache/memory_profile_test.go
- pkg/cache/redis.go
- pkg/cache/memory_bench_test.go
- pkg/cache/memory_edge_test.go
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.
Applied to files:
- pkg/cache/redis.go
- pkg/cache/memory_edge_test.go
📚 Learning: 2025-09-10T23:54:16.828Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:118-121
Timestamp: 2025-09-10T23:54:16.828Z
Learning: Redis 7.4.0 introduced HEXPIRE, HEXPIREAT, and related commands that support per-field TTL on hash data structures. go-redis v9.6.0+ includes support for these commands with methods like HExpireAt.
Applied to files:
pkg/cache/redis.go
🧬 Code graph analysis (5)
pkg/cache/memory_profile_test.go (1)
- pkg/cache/memory.go (1): NewMemory (16-23)
internal/sms-gateway/cache/factory.go (5)
- internal/config/config.go (1): Cache (74-76)
- pkg/cache/cache.go (1): Cache (5-36)
- internal/sms-gateway/online/service.go (1): New (28-37)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/redis.go (1): NewRedis (44-56)
pkg/cache/redis.go (3)
- pkg/cache/cache.go (1): Cache (5-36)
- pkg/cache/errors.go (2): ErrKeyNotFound (7-7), ErrKeyExists (11-11)
- pkg/cache/options.go (1): Option (6-6)
pkg/cache/memory_bench_test.go (2)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (22-30)
pkg/cache/memory_edge_test.go (3)
- pkg/cache/memory.go (1): NewMemory (16-23)
- pkg/cache/options.go (1): WithTTL (22-30)
- pkg/cache/errors.go (2): ErrKeyExpired (9-9), ErrKeyNotFound (7-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build / Docker image (linux/amd64)
- GitHub Check: Build / Docker image (linux/arm64)
- GitHub Check: Benchmark
- GitHub Check: Analyze (go)
🔇 Additional comments (21)
pkg/cache/memory_edge_test.go (3)
289-328: Drain behavior assertion is correct (expired items removed). Good alignment with memory cleanup semantics: expired entries are removed before callback, so Drain returns only valid items and subsequent Get yields ErrKeyNotFound.
232-249: Fix: invalid range over int (won’t compile). Replace the range loop with a classic indexed for-loop.
```diff
-	for i := range numOperations {
+	for i := 0; i < numOperations; i++ {
```
373-389: Fix: invalid range over int in goroutine launcher (won’t compile). Use an indexed for-loop.
```diff
-	for i := range numGoroutines {
+	for i := 0; i < numGoroutines; i++ {
```
pkg/cache/memory_profile_test.go (5)
46-55: Measure heap-in-use deltas, not cumulative TotalAlloc. TotalAlloc is monotonic; use HeapAlloc deltas for per-item growth checks.
```diff
-	allocDiff := m2.TotalAlloc - m1.TotalAlloc
-	allocPerItem := float64(allocDiff) / float64(numItems)
+	heapDiff := int64(m2.HeapAlloc) - int64(m1.HeapAlloc)
+	heapPerItem := float64(heapDiff) / float64(numItems)
@@
-	t.Logf("  Total allocated: %d bytes", m2.TotalAlloc)
-	t.Logf("  Allocation difference: %d bytes", allocDiff)
-	t.Logf("  Allocations per item: %.2f bytes", allocPerItem)
+	t.Logf("  Heap alloc: %d bytes", m2.HeapAlloc)
+	t.Logf("  Heap delta: %d bytes", heapDiff)
+	t.Logf("  Heap delta per item: %.2f bytes", heapPerItem)
@@
-	if allocPerItem > 300 {
-		t.Errorf("Expected less than 300 bytes per item, got %.2f bytes", allocPerItem)
+	if heapPerItem > 300 {
+		t.Errorf("Expected less than 300 bytes per item, got %.2f bytes", heapPerItem)
```
95-109: MemoryCleanup: compare HeapAlloc delta. Avoid TotalAlloc for post-cleanup growth checks.
```diff
-	allocDiff := m2.TotalAlloc - m1.TotalAlloc
+	heapDiff := int64(m2.HeapAlloc) - int64(m1.HeapAlloc)
@@
-	t.Logf("  Total allocated: %d bytes", m2.TotalAlloc)
-	t.Logf("  Allocation difference: %d bytes", allocDiff)
+	t.Logf("  Heap alloc: %d bytes", m2.HeapAlloc)
+	t.Logf("  Heap delta: %d bytes", heapDiff)
@@
-	if allocDiff > 2*1024*1024 { // 2MB
-		t.Errorf("Expected less than 2MB memory growth after cleanup, got %d bytes", allocDiff)
+	if heapDiff > 2*1024*1024 { // 2MB
+		t.Errorf("Expected less than 2MB heap growth after cleanup, got %d bytes", heapDiff)
```
150-165: MemoryPressure: use HeapAlloc delta and update threshold check.
```diff
-	allocDiff := m2.TotalAlloc - m1.TotalAlloc
+	heapDiff := int64(m2.HeapAlloc) - int64(m1.HeapAlloc)
@@
-	t.Logf("  Total allocated: %d bytes", m2.TotalAlloc)
-	t.Logf("  Allocation difference: %d bytes", allocDiff)
+	t.Logf("  Heap alloc: %d bytes", m2.HeapAlloc)
+	t.Logf("  Heap delta: %d bytes", heapDiff)
@@
-	if allocDiff > expectedMaxGrowth {
-		t.Errorf("Expected less than %d bytes memory growth under pressure, got %d bytes", expectedMaxGrowth, allocDiff)
+	if heapDiff > int64(expectedMaxGrowth) {
+		t.Errorf("Expected < %d bytes heap growth under pressure, got %d bytes", expectedMaxGrowth, heapDiff)
```
243-245: Check Drain error in leak detection test. Surface failures if Drain ever returns an error.
```diff
-	// Clear the cache
-	tempCache.Drain(ctx)
+	// Clear the cache
+	if _, err := tempCache.Drain(ctx); err != nil {
+		t.Errorf("Drain failed: %v", err)
+	}
```
25-39: Fix: invalid range over integers (won’t compile). Replace all “range over int” loops with indexed loops; also add an index for the GC loop and the fixed-iteration inner loops.
```diff
-	for i := range numItems {
+	for i := 0; i < numItems; i++ {
@@
-	for i := range numItems {
+	for i := 0; i < numItems; i++ {
@@
-	for i := range numItems {
+	for i := 0; i < numItems; i++ {
@@
-	for range numGCs {
+	for i := 0; i < numGCs; i++ {
@@
-	for j := range 100 {
+	for j := 0; j < 100; j++ {
@@
-	for i := range numItems {
+	for i := 0; i < numItems; i++ {
@@
-	for i := range numCaches {
+	for i := 0; i < numCaches; i++ {
@@
-	for j := range 10 {
+	for j := 0; j < 10; j++ {
```
pkg/cache/memory_bench_test.go (3)
356-376: LGTM: large-value benchmark includes error checks. Good to see Set/Get errors handled here while measuring large values.
15-28: Benchmarks ignore errors due to //nolint:errcheck; consider minimal checks in hot loops. To avoid masking regressions, add lightweight error checks (or validate outside timing windows); see the sketch below. If keeping //nolint for perf, at least pre-populate and validate keys where reads are expected to succeed.
Also applies to: 30-46, 48-66, 67-81, 82-100, 101-119, 120-139, 140-159, 160-191, 193-224, 226-266, 268-306, 308-338, 340-377, 379-405, 406-429, 430-471
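One way to keep the checks cheap, as suggested above, is to pre-populate and validate outside the timed section and then fail fast inside the loop only on unexpected errors. The benchmark below is a sketch that assumes the pkg/cache memory backend and its Set/Get signatures as used elsewhere in this review.

```go
package cache_test

import (
	"context"
	"strconv"
	"testing"

	"github.com/android-sms-gateway/server/pkg/cache"
)

func BenchmarkGetPrepopulated(b *testing.B) {
	ctx := context.Background()
	c := cache.NewMemory(0)

	// Populate and validate outside the timed section.
	for i := 0; i < 1024; i++ {
		if err := c.Set(ctx, "key-"+strconv.Itoa(i), "value"); err != nil {
			b.Fatal(err)
		}
	}
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		if _, err := c.Get(ctx, "key-"+strconv.Itoa(i%1024)); err != nil {
			b.Fatal(err) // cheap check; unexpected errors abort the run
		}
	}
}
```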
391-401: Fix: non-existent testing.B API and invalid range over int (won’t compile). Use the standard b.N loop and an indexed loop over size; also handle Drain/Set errors to avoid silent failures.
```diff
-	for b.Loop() {
+	for i := 0; i < b.N; i++ {
 		// Clear cache
-		cache.Drain(ctx)
+		if _, err := cache.Drain(ctx); err != nil {
+			b.Fatal(err)
+		}
 		// Add new items
-		for j := range size {
+		for j := 0; j < size; j++ {
 			key := "key-" + strconv.Itoa(j)
 			value := "value-" + strconv.Itoa(j)
-			cache.Set(ctx, key, value)
+			if err := cache.Set(ctx, key, value); err != nil {
+				b.Fatal(err)
+			}
 		}
 	}
```
internal/sms-gateway/modules/devices/service.go (3)
104-124: LGTM: batch SetLastSeen with ctx cancellation and multi-error aggregation. Behavior is correct: empty batch short‑circuits, ctx cancellation is respected, per‑item errors are aggregated and logged.
7-7: Confirmed — Go >= 1.20 present (no action required). go.mod declares "go 1.24.1" and Dockerfiles use "golang:1.24-alpine", so errors.Join is supported.
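For reference, the aggregation pattern being praised looks roughly like this. The batch type (a map of device ID to timestamp) matches what the review diffs suggest, but the persist callback is a stand-in for the real repository call, so this is a sketch rather than the PR's implementation.

```go
package devices

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// setLastSeenBatch illustrates the pattern: stop early on ctx cancellation,
// keep going past individual failures, and return everything via errors.Join.
func setLastSeenBatch(ctx context.Context, batch map[string]time.Time,
	persist func(ctx context.Context, id string, t time.Time) error) error {
	if len(batch) == 0 {
		return nil
	}
	var errs []error
	for id, ts := range batch {
		if err := ctx.Err(); err != nil {
			return errors.Join(append(errs, err)...)
		}
		if err := persist(ctx, id, ts); err != nil {
			errs = append(errs, fmt.Errorf("device %s: %w", id, err))
		}
	}
	return errors.Join(errs...) // nil when no per-item errors occurred
}
```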
136-145: Don't use cache.ErrNotFound — the cache package doesn't export it; change approach
- capcom6/go-helpers/cache does not expose a sentinel ErrNotFound, so the proposed errors.Is(err, cache.ErrNotFound) will not compile.
- Options:
- Preferable: add/export a sentinel (cache.ErrNotFound) in the cache package and have Delete return it for missing keys; then use errors.Is(err, cache.ErrNotFound) and keep the suggested logging diff.
- Alternative: if you can't change the dependency, avoid noisy logs by lowering the log level for Delete errors (Debug), making Delete idempotent in a local wrapper (treat not-found as nil), or (last resort, fragile) detect a not-found error by message.
- Also extract tokenCacheKey(token string) to DRY the SHA‑256 + hex logic used here and in GetByToken (a sketch follows below).
Likely an incorrect or invalid review comment.
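A minimal version of the suggested helper, assuming GetByToken does hash tokens with SHA-256 and hex-encode the digest as described above; the function name comes from the review suggestion, not the PR.

```go
package devices

import (
	"crypto/sha256"
	"encoding/hex"
)

// tokenCacheKey derives the cache key for a device token so Set, Get, and
// Delete all hash the token the same way (SHA-256, hex-encoded).
func tokenCacheKey(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}
```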
pkg/cache/redis.go (6)
15-24: Good: Atomic GetAndDelete via Lua. The script correctly ensures atomicity for get-and-delete.
94-103: Expired vs missing are indistinguishable — interface requires ErrKeyExpired. Current Redis impl returns ErrKeyNotFound for expired fields. Decide: either change interface/docs or store expiry metadata to return ErrKeyExpired.
Would you like to (A) align the interface to treat expired as not found across backends, or (B) add metadata (e.g., write validUntil alongside value) to support ErrKeyExpired?
108-120: LGTM: Atomic GetAndDelete implementation. Matches the Lua script behavior and maps not-found correctly.
26-33: Drain deletes concurrent writes — delete only returned fields. UNLINK/DEL of the whole hash will drop fields written after HGETALL but before deletion. Make the script HDEL only the fields returned.
Apply this diff:
```diff
-	hgetallAndDeleteScript = `
-local items = redis.call('HGETALL', KEYS[1])
-if #items > 0 then
-    local ok = pcall(redis.call, 'UNLINK', KEYS[1])
-    if not ok then redis.call('DEL', KEYS[1]) end
-end
-return items
-`
+	hgetallAndDeleteScript = `
+local items = redis.call('HGETALL', KEYS[1])
+if #items > 0 then
+    for i = 1, #items, 2 do
+        redis.call('HDEL', KEYS[1], items[i])
+    end
+end
+return items
+`
```
130-134: HExpireAt arg order is correct — no change needed. go-redis v9 signature is HExpireAt(ctx context.Context, key string, tm time.Time, fields ...string); the existing p.HExpireAt(ctx, r.key, options.validUntil, key) call is correct.
Likely an incorrect or invalid review comment.
161-164: Bug: HExpireAt missing field argument (won’t compile / wrong TTL target). You must pass the field name.
Apply this diff:
```diff
-	if !options.validUntil.IsZero() {
-		if err := r.client.HExpireAt(ctx, r.key, options.validUntil).Err(); err != nil {
-			return fmt.Errorf("can't set cache item ttl: %w", err)
-		}
-	}
+	if !options.validUntil.IsZero() {
+		if err := r.client.HExpireAt(ctx, r.key, options.validUntil, key).Err(); err != nil {
+			return fmt.Errorf("can't set cache item ttl: %w", err)
+		}
+	}
```
(The field name goes after the time argument, matching the HExpireAt signature noted in the 130-134 comment above.)
internal/sms-gateway/cache/factory.go (1)
43-51: Ensure runtime Redis supports HEXPIRE (Redis ≥ 7.4). go.mod pins github.com/redis/go-redis/v9 v9.9.0 (>= v9.6) so the client supports HExpireAt; the Redis server itself must be ≥ 7.4 to provide HEXPIRE/HEXPIREAT — otherwise Set may partially succeed (value set, per-field TTL fails). Verify production/runtime Redis version or add a fallback (feature-detect HEXPIRE or use per-key TTL) before relying on per-field TTL behavior.
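If feature detection is preferred over a version check, a startup probe could try HEXPIRE on a scratch hash and fall back (or abort) when the server rejects it. This sketch assumes go-redis ≥ v9.6, whose HExpire takes (ctx, key, expiration, fields...); the scratch key and the error-message matching are illustrative choices, not the PR's code.

```go
package redischeck

import (
	"context"
	"fmt"
	"strings"
	"time"

	"github.com/redis/go-redis/v9"
)

// supportsFieldTTL probes HEXPIRE on a scratch hash; servers older than 7.4
// answer with an "unknown command" error, so the caller can fall back to a
// per-key TTL (or refuse to start) instead of silently losing field TTLs.
func supportsFieldTTL(ctx context.Context, client *redis.Client) (bool, error) {
	const key = "sms-gateway:probe:hexpire"
	if err := client.HSet(ctx, key, "f", "1").Err(); err != nil {
		return false, fmt.Errorf("probe hset: %w", err)
	}
	defer client.Del(ctx, key)

	err := client.HExpire(ctx, key, time.Minute, "f").Err()
	if err == nil {
		return true, nil
	}
	if strings.Contains(strings.ToLower(err.Error()), "unknown command") {
		return false, nil
	}
	return false, fmt.Errorf("probe hexpire: %w", err)
}
```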
Deployed...
Summary by CodeRabbit
New Features
Configuration
Behavior
Build & CI
Dependencies
Tests
Monitoring