Skip to content

Conversation

villebro
Copy link
Member

SUMMARY

WIP

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@dosubot dosubot bot added the change:backend Requires changing the backend label Sep 24, 2025
Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Status
Design Incomplete Abstract Interface Definition ▹ view 🧠 Incorrect
Functionality Empty models module missing required exports ▹ view 🧠 Not in standard
Performance Missing batching guidance for bulk ID queries ▹ view 🧠 Not in standard
Files scanned
File Path Reviewed
superset-core/src/superset_core/init.py
superset-core/src/superset_core/commands/init.py
superset-core/src/superset_core/dao/init.py
superset-core/src/superset_core/models/init.py
superset-core/src/superset_core/commands/types.py
superset-core/src/superset_core/models/base.py
superset/core/api/types/models.py
superset-core/src/superset_core/api/types/models.py
superset/commands/importers/v1/init.py
superset-core/src/superset_core/dao/types.py
superset/daos/base.py
superset/models/core.py
superset/connectors/sqla/models.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

@villebro villebro force-pushed the villebro/core-protocols branch from 754388d to 87b387a Compare September 24, 2025 04:40
@villebro villebro marked this pull request as draft September 24, 2025 04:44

# Class attributes that implementations should define
model_cls: Optional[type[T_Model]]
base_filter: Optional[BaseFilter]
Copy link
Member

@betodealmeida betodealmeida Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use something more generic here instead of BaseFilter, which is tightly coupled with FAB? Maybe we could rething how base filter works — it could be something passed to the instance, eg:

dao = DatasetDAO()
datasets = dao.find_by_ids([1, 2])  # equivalent to `skip_base_filter=True`

dao_for_user = dao.filtered(User.id == current_user.id)
datasets = dao_for_user.find_by_ids([1, 2])  # equivalent to `skip_base_filter=False`

In general I think the less dependencies superset-core has, the better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fluent Interfaces are cool! but like that makes me think why not just use Query from SQLAlchemy?
Have a somewhat static base filter is a plus for security, we enforce that all DAO operations take into account a certain filter that enforces a security constraint

@abstractmethod
def find_by_ids(
self,
model_ids: Union[list[str], list[int]],
Copy link
Member

@betodealmeida betodealmeida Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use Collection instead of list? Here and in other places where order doesn't matter.

Also, can we use this opportunity to just use UUIDs everywhere, instead of having int IDs, string IDs, and UUIDs?

@mistercrunch
Copy link
Member

A tiny bit concerned about having superset-core treated (explicitely or implicitely) as a wide public API and the change management concerns that come with that. To help manage that, I'd propose that we:

  • use python convention for private/public (_method= you probably shouldn't use that if care about easy upgrade, this is meant for the internal ecosystem) and __method -> definitely don't want to use that, even in the ecosystem/monorepo
  • actively reduce the API surface, go through all the methods and "privatize" what's not essential in the public API
  • make superset-core pre-1.0 and allow ourselves to break API and privatize things until 1.0 is officially out. From that point onwards we do proper semver/change-management on public methods/objects/properties

@mistercrunch
Copy link
Member

About Commands more specifically, I can see how some are useful here, but thinking most should either live in the app, or be treated as private.

@mistercrunch
Copy link
Member

Note: another option I mentioned before would be to treat superset-core as 100% private, and have something like a superset-extension-sdk as the thing that's public/semvered, it would import/re-export the bits of superset-core we want to actively expose (and maintain) as a public API.

@staticmethod
@abstractmethod
def get_datasets(query: BaseQuery | None = None, **kwargs: Any) -> list[Any]:
def get_datasets(query: BaseQuery | None = None, **kwargs: Any) -> list[Dataset]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We can use Query from SQLAlchemy

@abstractmethod
def find_by_id(
self, model_id: Union[str, int], skip_base_filter: bool = False
) -> Optional[T_Model]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: In the future we could have models with composite keys

__abstract__ = True


class Database(CoreModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call this AbstractDatabase or BaseDatabase ?

@michael-s-molina
Copy link
Member

A tiny bit concerned about having superset-core treated (explicitely or implicitely) as a wide public API and the change management concerns that come with that.

@mistercrunch I share the same concerns as you and this is one of the motivations for exposing only the API definitions in superset-core without any implementation or private methods. In essence, the APIs should only contain public methods that we will support with semver. Any private method will not be part of superset-core and will be defined at the host implementation. For example:

ISomeDAO: (superset-core)
     @abstractmethod
     def public_function():
          ...

SomeDAO(ISomeDAO):  (host)
    def public_function():
         ... implementation logic
    
    def private_function():
         ... implementation logic

make superset-core pre-1.0 and allow ourselves to break API and privatize things until 1.0 is officially out. From that point onwards we do proper semver/change-management on public methods/objects/properties

That's exactly what we defined in the SIP and the reason we publish the packages as 0.0.1-rcx in both NPM and PyPI.

@michael-s-molina
Copy link
Member

michael-s-molina commented Sep 25, 2025

Note: another option I mentioned before would be to treat superset-core as 100% private, and have something like a superset-extension-sdk as the thing that's public/semvered, it would import/re-export the bits of superset-core we want to actively expose (and maintain) as a public API.

@mistercrunch I think we're achieving the same objective but through different means. The superset-core contains only the bits we want to actively expose and the host application contain the private parts. The private parts don't need to be a separate package, they are already part of the apache-superset package.

Correlating with your example, superset-core is what you're calling as superset-extension-sdk but it's not only for extensions. It can be used by embedded and MCP tools.

@mistercrunch
Copy link
Member

mistercrunch commented Sep 25, 2025

the APIs should only contain public methods

but clearly there are useful bits/methods/objects that we want to use/reuse within superset-core but don't want to or don't need to expose as part of the public API. I say we just put an underscore in their name to signal they're not part of the public api.

One example might be something like _is_uuid_or_int(any_id: int | str), clearly it's something we need to use/reuse, but really no need to make that public...

Python really lacking here, though doing it by convention with _ and __ prefix is fine.

Also restating that I think there are many pieces of logic that logically belong in superset-core/ (especially if we want to keep the very-related logic together, in one place) that are logically private or only meant for the main app.

Two main approaches here:

  • have those bits of logic live in superset-core, close to the code they are highly related-to and marked as private
  • crazy inheritance schemes (some highly-related DAOs/Commands live in superset-core, and superset-app inherits-and-augments them) -> not my favorite approach! ... Some commands are in one module/app, others are in another, calling super() or methods from the parent class, .... = inheritance confusion

@abstractmethod
def find_by_id(
self, model_id: Union[str, int], skip_base_filter: bool = False
) -> Optional[T_Model]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding Optional[T_Model], consider: many ORMs provide get() (raises) and get_or_none() (returns None) to handle different use cases. Do we want/need both of these or are we good with what is here?

When debugging missing entities, would we rather see AttributeError: 'NoneType' object has no attribute 'name' on line 47, or EntityNotFoundError: Dashboard with id=123 not found on line 23?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
change:backend Requires changing the backend review:draft size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants