Skip to content

Mark resource as unhealthy#6029

Open
npmenard wants to merge 1 commit into
viamrobotics:mainfrom
npmenard:RSDK-13138-weak-dependency
Open

Mark resource as unhealthy#6029
npmenard wants to merge 1 commit into
viamrobotics:mainfrom
npmenard:RSDK-13138-weak-dependency

Conversation

@npmenard
Copy link
Copy Markdown
Member

When a resource declares a weak dependency there is a path where it remains in an healthy state in the resource graph after Close was called on it and the subsequent call to the resource Constructor fails. Allowing any clients to call APIs on a closed resource

To illustrate better the bug here is a simple go module to reproduce the issue:

//... import and module declaration

const lockPath = "/tmp/arm-singleton.lock"

// optionalDepName is the hard-coded name of the optional arm dependency.
// Editing that arm's config (without touching arm-singleton) is what fires the
// weak/optional dependency update path in local_robot.
const optionalDepName = "opt"

type armSingleton struct {
	resource.Named
	resource.AlwaysRebuild

	closed atomic.Bool
}

// Validate declares an optional dependency on the arm named "opt".
// The second return value is the list of optional dep name strings.
func (cfg *Config) Validate(path string) ([]string, []string, error) {
	return nil, []string{arm.Named(optionalDepName).String()}, nil
}

func newArm(
	ctx context.Context,
	deps resource.Dependencies,
	conf resource.Config,
	logger logging.Logger,
) (arm.Arm, error) {
	logger.Infow("constructing arm-singleton", "lockPath", lockPath)

	// Look up the optional dep. Missing is fine (it's optional); we just log.
	if _, err := arm.FromProvider(deps, optionalDepName); err == nil {
		logger.Infow("optional dep resolved", "name", optionalDepName)
	} else {
		logger.Infow("optional dep not present (this is fine)", "name", optionalDepName, "err", err)
	}

	// Refuse to construct if the sentinel already exists.
	if _, err := os.Stat(lockPath); err == nil {
		logger.Errorw("lock file already exists; refusing to construct", "lockPath", lockPath)
		return nil, errors.New("arm-singleton: lock file already present at " + lockPath +
			" (a previous instance did not release it). Delete it to reconstruct.")
	} else if !errors.Is(err, os.ErrNotExist) {
		return nil, err
	}

	f, err := os.Create(lockPath)
	if err != nil {
		return nil, err
	}
	if err := f.Close(); err != nil {
		return nil, err
	}

	a := &armSingleton{
		Named:  conf.ResourceName().AsNamed(),
		logger: logger,
	}
	return a, nil
}

func (a *armSingleton) errIfClosed() error {
	if a.closed.Load() {
		return errors.New("arm is closed!")
	}
	return nil
}

// --- Resource ---

func (a *armSingleton) Close(ctx context.Context) error {
	a.closed.Store(true)
	a.logger.Info("Close called; closed flag set")
	return nil
}

func (a *armSingleton) DoCommand(ctx context.Context, cmd map[string]interface{}) (map[string]interface{}, error) {
        // we will hit this forever
	if err := a.errIfClosed(); err != nil {
		return nil, err
	}
	return map[string]interface{}{}, nil
}
// more apis....

Associated config

{
  "components": [
    {
      "name": "myarm",
      "api": "rdk:component:arm",
      "model": "local:demo:singleton",
      "attributes": {
        "note": "v2"
      }
    },
    {
      "name": "opt",
      "api": "rdk:component:arm",
      "model": "rdk:builtin:fake",
      "attributes": {}
    }
  ],
  "modules": [
    {
      "type": "local",
      "name": "arm-singleton",
      "executable_path": "/tmp/arm-not-nice/bin/arm-singleton"
    }
  ]
}

When RDK comes up it will build the arm-singleton. If the optional resource changes or comes up RDK will close the resource and attempt to build it again which will fail. The closed resource will remain accessible in the resource graph

screenshot-failed

Leaving the resources in a broken state (use after Close) with no automatic recovery and a potentially opaque error message hard to decipher for an end user

@npmenard npmenard requested review from cheukt and dgottlieb May 20, 2026 15:20
@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label May 20, 2026
Copy link
Copy Markdown
Member

@cheukt cheukt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change makes sense - will approve with ci checks passing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test This pull request is marked safe to test from a trusted zone

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants