Bindings for the XGBoost system library. The aim of this package is to mimic the XGBoost Python bindings while taking advantage of Swift and its C interoperability. Some things therefore behave differently, but they should give you maximum flexibility over XGBoost.
Install XGBoost from sources
git clone https://github.com/dmlc/xgboost
cd xgboost
git checkout tags/v1.1.1
git submodule update --init --recursive
mkdir build
cd build
cmake ..
make
make install
ldconfig
Or you can use the provided installation script
./install.sh
On macOS, you can build and install the same way as on Linux, or just use brew
brew install xgboost
Before version 1.1.1, XGBoost did not generate a pkg-config file. This was fixed with PR Add pkgconfig to cmake #5744.
If for some reason you are using an older version, you may need to specify the path to the XGBoost libraries while building, e.g.:
swift build -Xcc -I/usr/local/include -Xlinker -L/usr/local/lib
or create the pkg-config file manually. An example for macOS 10.15 and XGBoost 1.1.0 is
prefix=/usr/local/Cellar/xgboost/1.1.0
exec_prefix=${prefix}/bin
libdir=${prefix}/lib
includedir=${prefix}/include
Name: xgboost
Description: XGBoost machine learning library.
Version: 1.1.0
Cflags: -I${includedir}
Libs: -L${libdir} -lxgboost
and needs to be placed at /usr/local/lib/pkgconfig/xgboost.pc
Add a dependency in your Package.swift
.package(url: "https://github.com/kongzii/SwiftXGBoost.git", from: "0.0.0"),
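For context, the dependency line above sits inside a full manifest. The following is a minimal sketch, not the package's official template: the target name `MyApp` is hypothetical, the product name `XGBoost` is assumed from the module imported later in this document, and the package name `SwiftXGBoost` is assumed from the repository URL. Adjust these if the build reports an unknown product or package.

```swift
// swift-tools-version:5.5
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical name for your own package
    dependencies: [
        .package(url: "https://github.com/kongzii/SwiftXGBoost.git", from: "0.0.0"),
    ],
    targets: [
        .executableTarget(
            name: "MyApp", // sources expected in Sources/MyApp
            dependencies: [
                // Assumed product and package names; see the note above.
                .product(name: "XGBoost", package: "SwiftXGBoost"),
            ]
        ),
    ]
)
```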
Import Swifty XGBoost
import XGBoost
or the C library directly
import CXGBoost
Both the Booster and DMatrix classes expose pointers to the underlying C handles, so you can use the C API directly for more advanced usage.
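As a small illustration of calling the C API from Swift, the sketch below queries the version of the linked XGBoost library. It assumes the installed XGBoost exposes the `XGBoostVersion` function from its C API header and that the `CXGBoost` module re-exports that header; it does not touch any Booster or DMatrix pointer.

```swift
import CXGBoost

// XGBoostVersion writes the version components through three int pointers.
var major: Int32 = 0
var minor: Int32 = 0
var patch: Int32 = 0
XGBoostVersion(&major, &minor, &patch)

print("Linked XGBoost version: \(major).\(minor).\(patch)")
```

The same pattern (Swift variables passed with `&` as out-parameters) applies to other C functions that return results through pointers.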
As the library is still evolving, there can be incompatible changes between updates; releases before version 1.0.0 do not follow Semantic Versioning. Please pin the exact version if you do not want to worry about updating your packages.
.package(url: "https://github.com/kongzii/SwiftXGBoost.git", .exact("0.1.0")),
A DMatrix can be created from a NumPy array, just like in Python
let pandas = Python.import("pandas")
let dataFrame = pandas.read_csv("data.csv")
let data = try DMatrix(
name: "training",
from: dataFrame.values
)
and a Swift array can be converted back to NumPy
let predicted = try booster.predict(
from: validationData
)
let compare = pandas.DataFrame([
"Label lower bound": yLowerBound[validIndex],
"Label upper bound": yUpperBound[validIndex],
"Predicted": predicted.makeNumpyArray(),
])
print(compare)
This is possible thanks to PythonKit. For more detailed usage and workarounds for known issues, check out the examples.
Swift4TensorFlow is a great project from Google. If you are using one of the S4TF swift toolchains, you can combine its power directly with XGBoost.
let tensor = Tensor<Float>(shape: TensorShape([2, 3]), scalars: [1, 2, 3, 4, 5, 6])
let data = try DMatrix(name: "training", from: tensor)
Swift4TensorFlow toolchains ship with PythonKit preinstalled, so you may run into problems when using a package that has its own PythonKit dependency. If so, use the package version with the -tensorflow suffix, in which the PythonKit dependency is removed.
.package(url: "https://github.com/kongzii/SwiftXGBoost.git", .exact("0.7.0-tensorflow")),
This bug is known and hopefully will be resolved soon.
More examples can be found in the Examples directory and run inside Docker
docker-compose run swiftxgboost swift run exampleName
or on host
swift run exampleName
import XGBoost
// Register your own callback function for log(info) messages
try XGBoost.registerLogCallback {
print("Swifty log:", String(cString: $0!))
}
// Create some random features and labels
let randomArray = (0 ..< 1000).map { _ in Float.random(in: 0 ..< 2) }
let labels = (0 ..< 100).map { _ in Float([0, 1].randomElement()!) }
// Initialize data, DMatrixHandle in the background
let data = try DMatrix(
name: "data",
from: randomArray,
shape: Shape(100, 10),
label: labels,
threads: 1
)
// Slice array into train and test
let train = try data.slice(indexes: 0 ..< 90, newName: "train")
let test = try data.slice(indexes: 90 ..< 100, newName: "test")
// Parameters for Booster, check https://xgboost.readthedocs.io/en/latest/parameter.html
let parameters = [
Parameter("verbosity", "2"),
Parameter("seed", "0"),
]
// Create Booster model, `with` data will be cached
let booster = try Booster(
with: [train, test],
parameters: parameters
)
// Train booster, optionally provide callback functions called before and after each iteration
try booster.train(
iterations: 10,
trainingData: train,
evaluationData: [train, test]
)
// Predict from test data
let predictions = try booster.predict(from: test)
// Save
try booster.save(to: "model.xgboost")
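The DMatrix in the example above is built from a flat array plus a shape. The following sketch shows how nested rows could be flattened into that form, assuming row-major order (each row stored contiguously, which is what the dense-matrix constructor in the XGBoost C API expects). `flattenRowMajor` is a hypothetical helper written for this illustration and is not part of the package.

```swift
// Flatten nested rows into a single row-major array and report the dimensions.
// Hypothetical helper for illustration only.
func flattenRowMajor(_ rows: [[Float]]) -> (values: [Float], rowCount: Int, columnCount: Int) {
    let columnCount = rows.first?.count ?? 0
    precondition(
        rows.allSatisfy { $0.count == columnCount },
        "All rows must have the same number of columns"
    )
    return (rows.flatMap { $0 }, rows.count, columnCount)
}

let rows: [[Float]] = [
    [1, 2, 3],
    [4, 5, 6],
]
let flat = flattenRowMajor(rows)

// In row-major order, element (row, column) lives at row * columnCount + column.
let row = 1
let column = 2
let value = flat.values[row * flat.columnCount + column]

print(flat.values, flat.rowCount, flat.columnCount, value)
```

Under the same assumption, `flat.values` could be passed as the `from:` argument and `Shape(flat.rowCount, flat.columnCount)` as the `shape:` argument of the DMatrix initializer shown above.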
Jazzy is used for the generation of documentation.
You can generate documentation locally using
make documentation
GitHub Pages will be updated automatically when changes are merged into master.
Where possible, the Swift implementation is tested against the reference Python implementation via PythonKit. For example, the test of the score method in scoreEmptyFeatureMapTest
let pyFMap = [String: Int](pyXgboost.get_score(
fmap: "", importance_type: "weight"))!
let (fMap, _) = try booster.score(featureMap: "", importance: .weight)
XCTAssertEqual(fMap, pyFMap)
On Ubuntu using Docker
docker-compose run test
On host
swift test
SwiftFormat is used for code formatting.
make format