@chaptersix chaptersix commented Oct 13, 2025

What Changed

This PR introduces a new --config-file flag (and the TEMPORAL_SERVER_CONFIG_FILE_PATH environment variable) to remove the dependency on dockerize in the Temporal server Docker image.

When a configuration file is specified using either the CLI flag or the environment variable, the server will load configuration only from that file.
Users who want templating behavior similar to dockerize can enable it by adding the comment # enable-template at the top of the configuration file.


Key Changes

  1. New --config-file flag:

    • Adds a global --config-file flag that accepts a path to a single configuration file (absolute, or relative to the current working directory).
    • Can also be set via the TEMPORAL_SERVER_CONFIG_FILE_PATH environment variable.
  2. Deprecated legacy flags:

    • The --config, --env, and --zone flags are now marked as deprecated in CLI help text.
    • These flags still work for backward compatibility.
  3. Embedded config template:

    • The config_template.yaml file is now embedded in the binary to support loading configuration from environment variables.
    • Templating is supported if the file includes the # enable-template comment at the top.
  4. Templating support:

    • Configuration files can use templating by including # enable-template at the beginning of the YAML file.
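The # enable-template opt-in can be pictured with a small Go sketch. This is an illustration only, not the server's implementation: it assumes "at the top" means the first line of the file, and the helper names (templatingEnabled, render) and the template functions env and default are hypothetical and built on the standard text/template package.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"strings"
	"text/template"
)

// enableTemplateMarker is the opt-in comment named in the PR description.
const enableTemplateMarker = "# enable-template"

// templatingEnabled reports whether the first line of the file is the marker.
// Assumption: only the first line is inspected.
func templatingEnabled(content []byte) bool {
	firstLine, _, _ := bytes.Cut(content, []byte("\n"))
	return strings.TrimSpace(string(firstLine)) == enableTemplateMarker
}

// render returns the content unchanged unless templating is enabled, in which
// case it is executed as a text/template. The env and default functions are
// hypothetical helpers for this sketch.
func render(content []byte, lookup func(string) string) ([]byte, error) {
	if !templatingEnabled(content) {
		return content, nil
	}
	funcs := template.FuncMap{
		"env": lookup,
		// In a pipeline the piped value arrives as the last argument.
		"default": func(fallback, value string) string {
			if value == "" {
				return fallback
			}
			return value
		},
	}
	tmpl, err := template.New("config").Funcs(funcs).Parse(string(content))
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, nil); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	cfg := []byte("# enable-template\nlog:\n  level: {{ env \"LOG_LEVEL\" | default \"info\" }}\n")
	out, err := render(cfg, os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(string(out))
}
```

A file without the marker passes through untouched, so existing static YAML files that happen to contain braces are not reinterpreted.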

Configuration Loading Priority (Highest to Lowest)

  1. --config-file specified → Load that specific file
  2. --config, --env, or --zone specified → Load from configuration directory (deprecated)
  3. No configuration specified → Load from embedded template using environment variables (default)
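The priority order above can be sketched as a single decision function. The Options struct, the Source type, and ResolveSource are hypothetical names chosen for this illustration; the real loader may be structured differently.

```go
package main

import "fmt"

// Source identifies where configuration would be loaded from.
type Source int

const (
	SourceFile        Source = iota // --config-file or TEMPORAL_SERVER_CONFIG_FILE_PATH
	SourceLegacyDir                 // deprecated --config, --env, or --zone
	SourceEmbeddedEnv               // embedded template filled from environment variables
)

func (s Source) String() string {
	switch s {
	case SourceFile:
		return "config file"
	case SourceLegacyDir:
		return "configuration directory (deprecated)"
	default:
		return "embedded template + environment variables"
	}
}

// Options is a hypothetical container for the already-parsed inputs.
type Options struct {
	ConfigFile string // --config-file, or the environment variable when the flag is unset
	ConfigDir  string // deprecated --config
	Env        string // deprecated --env
	Zone       string // deprecated --zone
}

// ResolveSource applies the documented priority, highest first.
func ResolveSource(o Options) Source {
	switch {
	case o.ConfigFile != "":
		return SourceFile
	case o.ConfigDir != "" || o.Env != "" || o.Zone != "":
		return SourceLegacyDir
	default:
		return SourceEmbeddedEnv
	}
}

func main() {
	fmt.Println(ResolveSource(Options{ConfigFile: "/path/to/config.yaml"}))
	fmt.Println(ResolveSource(Options{Env: "development"}))
	fmt.Println(ResolveSource(Options{}))
}
```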

Expected Behavior

The following examples illustrate how the new configuration loading logic behaves:

  • Default behavior:
    Running temporal start without flags loads configuration solely from environment variables, which are applied to the embedded template.

  • Using --config-file:
    temporal --config-file=/path/to/config.yaml start loads configuration from the specified file path.

  • Using TEMPORAL_SERVER_CONFIG_FILE_PATH:
    Running TEMPORAL_SERVER_CONFIG_FILE_PATH=/path/to/config.yaml temporal start has the same effect as passing the --config-file flag.

  • Validation and error handling:
    The CLI returns clear error messages when conflicting flags or environment variables are used, or when a specified file does not exist.
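The kind of validation described above could look like the following sketch. The PR text does not spell out which combinations count as conflicting, so this example assumes that --config-file combined with any legacy flag is rejected; the validate function and its error messages are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// validate checks a requested config file path before loading.
// Assumption: combining --config-file with legacy flags is an error.
func validate(configFile string, legacyFlagsSet bool) error {
	if configFile == "" {
		return nil // nothing to check; another source will be used
	}
	if legacyFlagsSet {
		return errors.New("--config-file cannot be combined with --config, --env, or --zone")
	}
	info, err := os.Stat(configFile)
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			return fmt.Errorf("config file %q does not exist", configFile)
		}
		return fmt.Errorf("cannot access config file %q: %w", configFile, err)
	}
	if info.IsDir() {
		return fmt.Errorf("config file %q is a directory", configFile)
	}
	return nil
}

func main() {
	path := os.Getenv("TEMPORAL_SERVER_CONFIG_FILE_PATH")
	if err := validate(path, false); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("configuration source looks valid")
}
```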


Breaking Change

The default behavior of temporal start has changed.
It now loads configuration from environment variables instead of using a default template path.

@chaptersix

tabled until we start work

chaptersix and others added 13 commits October 30, 2025 14:46
Update config tests to properly handle the new loading mechanism.
Fix sqlite file config to use correct path.
Ensure that existing configuration files and environment variables
continue to work with the new config loader implementation.
Expose config file path as an option in the FX dependency injection
provider to support programmatic configuration.
## What changed?
Fix a data race by extracting the close transfer task information before
releasing the lock. Added more context to the comment.


## Why?
Bug

## How did you test it?
- [x] built
- [x] run locally and tested manually
- [x] covered by existing tests
- [ ] added new unit test(s)
- [ ] added new functional test(s)
## What changed?
- Revert the removal of slice count check in multi-cursor slice count
action.
- The check was incorrectly removed in [#8416 L117](https://github.com/temporalio/temporal/pull/8416/files#diff-deecce1e374d8c4db074d9c923c2b80d1b8e28cef778202aa01397e8d56e1bafL117)

## Why?
- Without the check, the code will panic later in
`pickCompactCandidates` when currentSliceCount < targetSliceCount.

## How did you test it?
- [x] built
- [ ] run locally and tested manually
- [ ] covered by existing tests
- [ ] added new unit test(s)
- [ ] added new functional test(s)
- [x] will follow up with a test PR
## What changed?
- Wire up chasm workflow library

## Why?
- #8485 makes workflow start use CHASM as well, so
we need to register workflow as a CHASM library.

## How did you test it?
- [x] built
- [ ] run locally and tested manually
- [ ] covered by existing tests
- [ ] added new unit test(s)
- [ ] added new functional test(s)

## Potential risks
- Logic is only used when chasm feature flag is turned on. The change is
mainly for functional tests.
## What changed?
Add locking in newRateLimitManager and Stop.

## Why?
An unlikely but possible data race: a dynamic config subscription
callback can arrive after Subscribe returns but before the constructor
assigns to the field.

## How did you test it?
- [x] built and ran existing tests
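The race described in this commit, and the locking that closes it, can be illustrated with a minimal sketch. The manager type and the subscribe function signature are hypothetical stand-ins for the rate limit manager and the dynamic config subscription. The sketch assumes callbacks are delivered on another goroutine, since a synchronous callback would deadlock on the non-reentrant mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a hypothetical stand-in for a component that subscribes to
// dynamic config changes in its constructor.
type manager struct {
	mu     sync.Mutex
	rate   float64
	cancel func()
}

// newManager holds the lock across the subscribe call and the field
// assignments. A callback that fires right after subscribe returns blocks in
// onChange until the constructor has finished assigning the fields.
func newManager(subscribe func(cb func(float64)) (float64, func())) *manager {
	m := &manager{}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.rate, m.cancel = subscribe(m.onChange)
	return m
}

// onChange is the subscription callback; it takes the same lock.
func (m *manager) onChange(v float64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.rate = v
}

// Rate returns the current value under the lock.
func (m *manager) Rate() float64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.rate
}

// Stop reads the cancel function under the lock, then calls it unlocked.
func (m *manager) Stop() {
	m.mu.Lock()
	cancel := m.cancel
	m.cancel = nil
	m.mu.Unlock()
	if cancel != nil {
		cancel()
	}
}

func main() {
	done := make(chan struct{})
	// Fake subscription: delivers one update asynchronously.
	subscribe := func(cb func(float64)) (float64, func()) {
		go func() { cb(2.0); close(done) }()
		return 1.0, func() {}
	}
	m := newManager(subscribe)
	<-done
	fmt.Println(m.Rate())
	m.Stop()
}
```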
## What changed?
- Fix last change tracking for pri/fairness tasks.
- Don't allow migration for sticky queues.
- Improve fairness migration tests:
  - Set config only on test task queue
  - Use only ApproximateBacklogCount

## Why?
This fixes some situations where we wouldn't update
ApproximateBacklogCount on partition unload. It should also make the
test less flaky.

## How did you test it?
- [x] covered by existing tests
## What changed?
No-oping close transfer tasks for SyncWorkflowState tasks


## Why?
We need non state based replication to be eligible for this
optimization.


## How did you test it?
- [x] built
- [x] run locally and tested manually
- [x] covered by existing tests
- [ ] added new unit test(s)
- [x] added new functional test(s)

`go test -v -tags test_dep ./tests/xdc -run
TestStreamBasedReplicationTestSuite/DisableTransitionHistory/TestCloseTransferTaskAckedReplication
-timeout 10m -count=1`

shows

`2025-10-20T08:15:08.290-0700 info Skipping close transfer task
generation - already acked on active cluster {"cluster-name":
"standby_aadnd", "host": "127.0.0.1:57179", "shard-id": 1, "address":
"127.0.0.1:57179", "wf-namespace-id":
"e550305e-0b43-4bcd-a490-8e3223f51ce1", "wf-id":
"test-replication-e2c094d3-c34f-42d9-a166-a967d4e7f602", "wf-run-id":
"019a0230-299a-74ba-a31f-ac247c05f2b9", "logging-call-at":
"/Users/michaely520/projects/temporal/service/history/workflow/task_generator.go:206"}
stream_based_replication_test.go:975: Verified IsCloseTransferTaskAcked
and IsForceReplication flags in SyncWorkflowStateTask`
Change help text to specify path is relative to current working
directory rather than root directory, improving user clarity.
Update test assertions to match new config loading behavior.
@chaptersix chaptersix marked this pull request as draft October 30, 2025 22:00
@chaptersix chaptersix marked this pull request as ready for review October 30, 2025 22:30

@bergundy bergundy left a comment


Mostly LGTM. Didn't have any blocking comments. Feel free to address whatever and merge.

@chaptersix chaptersix enabled auto-merge (squash) November 18, 2025 14:23
@chaptersix chaptersix merged commit 70c2b81 into main Nov 18, 2025
94 of 96 checks passed
@chaptersix chaptersix deleted the alex/fix_loader branch November 18, 2025 16:23
@chaptersix chaptersix mentioned this pull request Nov 20, 2025
2 tasks