Skip to content

Conversation

fm3
Copy link
Member

@fm3 fm3 commented Sep 3, 2025

  • BuildInfo field datastoreApiVersion was unused and out of date for a long time, removed now
  • Refactored BuildInfo out of the controller into separate service
  • Add heartbeat event

  • Removed dev-only changes like prints and application.conf edits
  • Considered common edge cases
  • Needs datastore update after deployment

Copy link
Contributor

coderabbitai bot commented Sep 3, 2025

📝 Walkthrough

Walkthrough

Adds BuildInfoService and refactors Application to use it for build info. Introduces AnalyticsService scheduling to emit periodic heartbeat events using the oldest active user and build info. Wires AnalyticsService as an eager singleton. Adds UserDAO.findOldestActive and WebknossosHeartbeatAnalyticsEvent. Removes datastoreApiVersion build info keys.

Changes

Cohort / File(s) Summary
Dependency wiring
app/WebknossosModule.scala
Reorganized imports and added an eager singleton binding for AnalyticsService.
Application build info refactor
app/controllers/Application.scala
Replaced constructor DI of ReleaseInformationDAO and StoreModules with BuildInfoService; removed ApiVersioning mixin; buildInfo endpoint now returns buildInfoService.buildInfoJson wrapped with remote origin headers.
Analytics heartbeat scheduling
app/models/analytics/AnalyticsService.scala, app/models/analytics/AnalyticsEvent.scala
Added WebknossosHeartbeatAnalyticsEvent. AnalyticsService now mixes in IntervalScheduler, injects UserDAO, BuildInfoService, ActorSystem, and ApplicationLifecycle; schedules periodic ticks (tickerInterval = 24 hours) to emit heartbeat events built with oldest active user and build info.
User DAO extension
app/models/user/User.scala
Added findOldestActive method to fetch the oldest non-deactivated user.
Build info service introduction
app/utils/BuildInfoService.scala
New BuildInfoService assembling buildInfoJson from webknossos.BuildInfo (with certificatePublicKey removed), ReleaseInformationDAO schema version, API versioning info, and StoreModules flags.
Build info settings update
project/BuildInfoSettings.scala
Added import of sbtbuildinfo.BuildInfoOption and removed datastoreApiVersion -> 2.0 from build info keys.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • MichaelBuessemeyer
  • frcroth

Poem

I hop between the wires and streams, a tiny dev delight,
I send a heartbeat in the night, build-info glowing bright.
Oldest user leads the way, the scheduler hums along,
Modules snug and bindings set — a rabbit’s little song. 🐇

✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch buildinfo-cleanup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@fm3 fm3 self-assigned this Sep 3, 2025
@fm3 fm3 requested a review from normanrz September 3, 2025 13:24
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (7)
app/models/user/User.scala (1)

171-176: Heartbeat should not rely on an arbitrary real user; handle empty DB and reduce PII exposure

  • Using the oldest active user for a periodic heartbeat ties analytics to a real user's org/admin flags. Consider a system/synthetic user or event-level userProperties independent of any real user. Also, if there are no active users, this yields an empty Fox and the tick silently no-ops.
  • If you keep this approach, at least exclude unlisted users and add a deterministic tiebreaker.

Apply to improve selection stability and avoid unlisted users:

-        q"SELECT $columns FROM $existingCollectionName WHERE NOT isDeactivated ORDER BY created LIMIT 1".as[UsersRow])
+        q"SELECT $columns FROM $existingCollectionName WHERE NOT isDeactivated AND NOT isUnlisted ORDER BY created ASC, _id ASC LIMIT 1".as[UsersRow])

Optionally, expose findOldestActiveOption: Fox[Option[User]] and have the ticker skip emitting when None to avoid hard dependency on user presence. I can draft that change if desired.

project/BuildInfoSettings.scala (1)

53-56: Remove trailing comma for broader Scala/SBT compatibility

Some Scala versions/settings reject trailing commas in argument lists in build definitions.

Apply:

-      "version" -> webknossosVersion,
+      "version" -> webknossosVersion
app/WebknossosModule.scala (1)

46-47: Eager-starting AnalyticsService runs the 12s heartbeat in all environments

If tests/dev don’t expect background DB and network activity, gate the eager binding behind config or start the ticker conditionally when sending or DB persistence is enabled.

I can provide a provider-based binding that checks config before instantiating the ticker if helpful.

app/models/analytics/AnalyticsEvent.scala (1)

234-244: Drop the trailing comma in Json.obj for consistency

Minor style/compat nit; the rest of this file doesn’t use trailing commas in Json.obj calls.

     Fox.successful(
       Json.obj(
-        "build_info" -> buildInfoJson,
+        "build_info" -> buildInfoJson
       )
     )
app/utils/BuildInfoService.scala (2)

18-23: Avoid deprecated view/filterKeys/mapValues combo; simplify and future-proof

Use a straightforward filter/map to drop certificatePublicKey and stringize values.

-        "webknossos" -> Json.toJson(
-          webknossos.BuildInfo.toMap.view.mapValues(_.toString).filterKeys(_ != "certificatePublicKey").toMap),
+        "webknossos" -> Json.toJson(
+          webknossos.BuildInfo.toMap
+            .filterNot { case (k, _) => k == "certificatePublicKey" }
+            .map { case (k, v) => k -> v.toString }
+        ),

13-24: Consider memoizing build info to reduce DB load from the 12s heartbeat

Schema version and build info change rarely. Cache the assembled JsObject for, e.g., 1–5 minutes to avoid hitting the DB every tick.

I can add a simple in-memory cache with TTL using an AtomicReference and timestamp or Caffeine if you prefer.

app/controllers/Application.scala (1)

77-86: Optional: move ReleaseInformationDAO out of controllers

DAO types in a controller file are a layering smell. Consider relocating to models/dao to avoid accidental tight coupling.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 24d620c and f2f149b.

📒 Files selected for processing (7)
  • app/WebknossosModule.scala (2 hunks)
  • app/controllers/Application.scala (1 hunks)
  • app/models/analytics/AnalyticsEvent.scala (1 hunks)
  • app/models/analytics/AnalyticsService.scala (3 hunks)
  • app/models/user/User.scala (1 hunks)
  • app/utils/BuildInfoService.scala (1 hunks)
  • project/BuildInfoSettings.scala (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
app/models/user/User.scala (5)
app/utils/sql/SimpleSQLDAO.scala (1)
  • run (28-48)
app/utils/sql/SqlInterpolation.scala (2)
  • q (19-38)
  • as (53-73)
app/models/dataset/credential/CredentialDAO.scala (1)
  • columns (24-24)
app/utils/sql/SQLDAO.scala (3)
  • columns (21-21)
  • parseFirst (31-34)
  • parseFirst (34-40)
app/utils/sql/SecuredSQLDAO.scala (1)
  • existingCollectionName (16-16)
app/models/analytics/AnalyticsEvent.scala (2)
app/utils/BuildInfoService.scala (1)
  • buildInfoJson (13-25)
util/src/main/scala/com/scalableminds/util/tools/Fox.scala (1)
  • successful (53-56)
app/utils/BuildInfoService.scala (1)
app/controllers/Application.scala (2)
  • ReleaseInformationDAO (77-85)
  • getSchemaVersion (80-85)
app/WebknossosModule.scala (1)
app/models/analytics/AnalyticsService.scala (2)
  • AnalyticsService (23-98)
  • AnalyticsSessionService (117-138)
app/models/analytics/AnalyticsService.scala (7)
app/utils/BuildInfoService.scala (2)
  • BuildInfoService (10-25)
  • buildInfoJson (13-25)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (2)
  • tickerInterval (44-44)
  • tick (55-62)
app/models/storage/UsedStorageService.scala (2)
  • tickerInterval (36-36)
  • tick (44-55)
app/mail/MailchimpTicker.scala (2)
  • tickerInterval (27-27)
  • tick (31-37)
app/models/job/Worker.scala (2)
  • tickerInterval (112-112)
  • tick (114-120)
app/models/user/User.scala (1)
  • findOldestActive (171-178)
app/models/analytics/AnalyticsEvent.scala (1)
  • WebknossosHeartbeatAnalyticsEvent (234-244)
app/controllers/Application.scala (2)
app/utils/BuildInfoService.scala (2)
  • BuildInfoService (10-25)
  • buildInfoJson (13-25)
util/src/main/scala/com/scalableminds/util/mvc/ExtendedController.scala (1)
  • addRemoteOriginHeaders (69-71)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: frontend-tests
  • GitHub Check: backend-tests
  • GitHub Check: build-smoketest-push
🔇 Additional comments (8)
project/BuildInfoSettings.scala (1)

2-2: LGTM on BuildInfoOption import and datastoreApiVersion removal

Import looks correct and aligns with ToJson usage; removal of datastoreApiVersion matches PR intent.

app/controllers/Application.scala (4)

15-15: Good: build info logic centralized via service

Importing and using BuildInfoService here reduces controller scope and duplication. LGTM.


22-22: Constructor DI swap to BuildInfoService is appropriate

Using the service instead of assembling JSON in the controller improves separation of concerns.


36-37: LGTM: buildInfo delegated to service

Endpoint simply returns the prebuilt JSON and keeps CORS headers. Clean and readable.


28-28: Application safely drops ApiVersioning
No ApiVersioning methods are invoked in Application.scala and addRemoteOriginHeaders continues to come from ExtendedController. (Optional: remove the now-unused import com.scalableminds.util.mvc.ApiVersioning.)

app/models/analytics/AnalyticsService.scala (3)

11-17: Adopting IntervalScheduler and wiring new deps looks good

Bringing in IntervalScheduler and BuildInfoService is the right direction for a periodic heartbeat.


33-34: LGTM: extends IntervalScheduler

Consistent with other tickers in the codebase.


27-31: AnalyticsService eager singleton binding confirmed
Bound in app/WebknossosModule.scala (line 46) via asEagerSingleton(), so IntervalScheduler starts on app startup.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
app/models/analytics/AnalyticsService.scala (1)

89-95: Heartbeat currently attributed to a real user → PII emission and session skew; also fails when no active users

  • Attributes to oldest active user, which will include userProperties in payload and refresh their session id artificially once per tick.
  • If there are zero active users, findOldestActive fails; tick will error/log each run.

Action:

  • Use a dedicated system/anonymous identity for heartbeat events, or ensure the event anonymizes userProperties (no userId/org/isAdmin/timezone, etc.).
  • Guard empty-user case: skip the tick (or fall back to system identity) when no active users exist.

I can draft a small patch (here + AnalyticsEvent.scala) if you want.

Check anonymization override exists:

#!/bin/bash
# Heartbeat event should override userProperties
rg -n 'case class\\s+WebknossosHeartbeatAnalyticsEvent' app/models/analytics/AnalyticsEvent.scala -n -C5
rg -n 'override\\s+def\\s+userProperties' app/models/analytics/AnalyticsEvent.scala -n -C3
🧹 Nitpick comments (2)
app/models/analytics/AnalyticsService.scala (2)

87-88: Make heartbeat cadence configurable (default to 24h)

Avoid hard-coding; read from config with 24h default.

Apply:

-  override protected def tickerInterval: FiniteDuration = 24 hours
+  override protected def tickerInterval: FiniteDuration =
+    wkConf.BackendAnalytics.Heartbeat.tickerInterval // keep config default at 24h

89-95: Minor: avoid recomputing build info per tick if expensive

At a 24h cadence this is fine; if you later tighten the interval, consider memoizing buildInfoJson with a short TTL.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f2f149b and a9fe790.

📒 Files selected for processing (2)
  • app/controllers/Application.scala (1 hunks)
  • app/models/analytics/AnalyticsService.scala (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/controllers/Application.scala
🧰 Additional context used
🧬 Code graph analysis (1)
app/models/analytics/AnalyticsService.scala (4)
app/utils/BuildInfoService.scala (2)
  • BuildInfoService (10-25)
  • buildInfoJson (13-25)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (2)
  • tickerInterval (44-44)
  • tick (55-62)
app/models/user/User.scala (1)
  • findOldestActive (171-178)
app/models/analytics/AnalyticsEvent.scala (1)
  • WebknossosHeartbeatAnalyticsEvent (234-244)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-smoketest-push
  • GitHub Check: backend-tests
🔇 Additional comments (2)
app/models/analytics/AnalyticsService.scala (2)

11-16: Imports and wiring surface look fine

The added imports (IntervalScheduler, ActorSystem, ApplicationLifecycle, BuildInfoService) align with the new scheduling/use sites.


26-31: Verify eager singleton binding for AnalyticsService
Ensure AnalyticsService is bound with .asEagerSingleton() in your DI module (e.g., bind[AnalyticsService].asEagerSingleton()). Without an eager singleton binding, the IntervalScheduler won’t start.

@fm3 fm3 merged commit d29c6d5 into master Sep 4, 2025
5 checks passed
@fm3 fm3 deleted the buildinfo-cleanup branch September 4, 2025 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants