Consider creating binary artifacts via CI #108

Closed
@cdleary

Right now we require users to build XLS from source, which can take a long time because of some of our larger dependencies [1] (e.g. LLVM, Z3) and requires users to set up Bazel / a build environment.

For comparison, in JAX we implicitly publish XLA as a shared library by way of Python extensions that can be pip installed. If we wanted users to easily obtain XLS in Python (via our pybind11 bindings) we'd need to do something similar and have extension-module creation flows. (Notably, our pybind11 XLS bindings expose a fairly rich API surface, with ownership and complex types being passed around.)

However, the open question is how to make C/C++ embeddings work nicely too, in a way that fits naturally into the XLS development process.

If we can determine an appropriate set of "public" C++ APIs we should be able to have CI create a mostly-statically-linked dynamic shared object (DSO) libxls.so and publish that with associated headers. No stability would be guaranteed (API or ABI, across any time window). However, publishing any exposed C++ API brings considerations around libstdc++ vs libc++ usage (and versions of those) for C++ types in the exposed APIs! Also, this still doesn't enable C embeddings, or the many other language environments that are happy to try to interop with C-API-exposed functionality.
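To make the libstdc++ vs libc++ concern concrete, here is a minimal sketch of how a consumer could check which standard library it was compiled against before linking a published artifact. `ArtifactInfo`, `StdlibFlavor`, and `IsCompatible` are hypothetical names for illustration only; nothing like this exists in XLS today. The sketch relies only on the version macros that libc++ (`_LIBCPP_VERSION`) and libstdc++ (`__GLIBCXX__`) define once any standard header is included.

```cpp
#include <cassert>
#include <cstddef>  // any standard header pulls in the library's version macro
#include <cstring>

// Reports which C++ standard library this translation unit is built against.
inline const char* StdlibFlavor() {
#if defined(_LIBCPP_VERSION)
  return "libc++";
#elif defined(__GLIBCXX__)
  return "libstdc++";
#else
  return "unknown";
#endif
}

// Hypothetical record of how a published artifact was built
// (an assumption for illustration; not an XLS type).
struct ArtifactInfo {
  const char* stdlib;  // e.g. "libc++" or "libstdc++"
};

// True when the artifact and the consumer agree on the standard library.
// This checks the flavor only; version skew would need a separate check.
inline bool IsCompatible(const ArtifactInfo& artifact) {
  assert(artifact.stdlib != nullptr);
  return std::strcmp(artifact.stdlib, StdlibFlavor()) == 0;
}
```

A check like this only matters when C++ standard library types (for example `std::string`) cross the DSO boundary; a flat C API avoids the question entirely.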

Off the cuff, a conceptually nice approach would be to (1) make a flat C API to expose from a DSO, (2) use CFFI to build our Python bindings from that, and (3) create a C++ wrapper "client" library around (1) that's compile-time-polymorphic between the DSO being there and not being there (the latter being used for source builds). But (a) we already have pybind11 working nicely, (b) what I'm describing is a good chunk of work, and (c) generally restraining ourselves to expose all functionality via a flat C API object model feels like a bump in our typical development process [2] ... for a future embedding scenario that's neither Python nor C++.
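As a rough illustration of steps (1) and (3), here is a sketch of a flat C API built around an opaque handle, plus a thin C++ RAII wrapper over it. Every name here (`demo_adder`, `demo_adder_create`, `Adder`, `RunAdderDemo`) is made up for the example and is not part of XLS. The "implementation side" is defined in the same file only so the sketch is self-contained; in the scheme described above it would live inside the DSO (or be compiled in directly for source builds).

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// (1) Flat C API: opaque handle, plain C types only, C linkage.
extern "C" {
typedef struct demo_adder demo_adder;  // layout is hidden from callers
demo_adder* demo_adder_create(int64_t initial);
void demo_adder_free(demo_adder* a);
int64_t demo_adder_add(demo_adder* a, int64_t x);
}

// Implementation side (hypothetical). Callers never see this definition.
struct demo_adder {
  int64_t total;
};
extern "C" demo_adder* demo_adder_create(int64_t initial) {
  return new demo_adder{initial};
}
extern "C" void demo_adder_free(demo_adder* a) { delete a; }
extern "C" int64_t demo_adder_add(demo_adder* a, int64_t x) {
  a->total += x;
  return a->total;
}

// (3) C++ "client" wrapper: owns the handle and forwards to the C API.
class Adder {
 public:
  explicit Adder(int64_t initial) : handle_(demo_adder_create(initial)) {
    assert(handle_ != nullptr);
  }
  ~Adder() {
    if (handle_ != nullptr) demo_adder_free(handle_);
  }
  // Move-only, so the handle is freed exactly once.
  Adder(const Adder&) = delete;
  Adder& operator=(const Adder&) = delete;
  Adder(Adder&& other) noexcept
      : handle_(std::exchange(other.handle_, nullptr)) {}
  Adder& operator=(Adder&& other) noexcept {
    if (this != &other) {
      if (handle_ != nullptr) demo_adder_free(handle_);
      handle_ = std::exchange(other.handle_, nullptr);
    }
    return *this;
  }

  int64_t Add(int64_t x) { return demo_adder_add(handle_, x); }

 private:
  demo_adder* handle_;
};

// Small usage example: 40 + 1 + 1.
int64_t RunAdderDemo() {
  Adder a(40);
  a.Add(1);
  return a.Add(1);
}
```

Because only C types cross the `extern "C"` boundary, the same declarations could in principle be consumed by CFFI for step (2). The compile-time switch between "DSO present" and "source build" is not shown; one plausible form would be a build-time macro selecting which implementation the wrapper links against.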

But it could be enabling! There are always cool use cases that are impossible to anticipate.

Priority-wise, as a judgement call, I'd suggest we decide on the public API boundary, build a libstdc++ artifact and a libc++ artifact in CI, and call it a day until we better understand the upside of a simpler (C) embedding.

[1]: Even if we had a great solution for shared artifact caching (e.g. via RBE) it would still be useful for folks to not have to set up and use Bazel to try XLS; e.g. in some embedded context.

[2]: Forcing all the thinking into how to make a flat API (vs arbitrary C++ object model) can potentially slow things down vs just working in the C++ object model directly. But also flat APIs usually encourage nice design thinking on composition and orthogonality. On the flip side, for complex embeddings working directly with the richer C++ object model can be nice instead of depending on things to be explicitly exposed through the flat API. War story: in XLA we had the "client" boundary as a natural cut point, where things could either delegate via C++ objects or via protobuf RPCs, and since it was proto oriented it could even do things stably and via inter-process-communication, yet sophisticated embeddings didn't usually want to use the more limited API "cut" because they wanted "all the possible power". Also in XLS we don't have that mostly-stable sort of separation yet.

Labels: build (related to build flow, build system, or build macros), enhancement (new feature or request)
