[ML] Add system call restrictions to the ML processes #98

Merged
merged 18 commits into from
May 28, 2018
4 changes: 4 additions & 0 deletions bin/autoconfig/Main.cc
@@ -28,6 +28,8 @@
#include <config/CAutoconfigurerParams.h>
#include <config/CReportWriter.h>

#include <seccomp/CSystemCallFilter.h>

#include "CCmdLineParser.h"

#include <boost/bind.hpp>
@@ -76,6 +78,8 @@ int main(int argc, char** argv) {

ml::core::CProcessPriority::reducePriority();

ml::seccomp::CSystemCallFilter::installSystemCallFilter();

if (ioMgr.initIo() == false) {
LOG_FATAL(<< "Failed to initialise IO");
return EXIT_FAILURE;
4 changes: 4 additions & 0 deletions bin/autodetect/Main.cc
@@ -42,6 +42,8 @@
#include <api/CSingleStreamSearcher.h>
#include <api/CStateRestoreStreamFilter.h>

#include <seccomp/CSystemCallFilter.h>

#include "CCmdLineParser.h"

#include <boost/bind.hpp>
@@ -120,6 +122,8 @@ int main(int argc, char** argv) {

ml::core::CProcessPriority::reducePriority();

ml::seccomp::CSystemCallFilter::installSystemCallFilter();

if (ioMgr.initIo() == false) {
LOG_FATAL(<< "Failed to initialise IO");
return EXIT_FAILURE;
4 changes: 4 additions & 0 deletions bin/categorize/Main.cc
@@ -38,6 +38,8 @@
#include <api/CSingleStreamSearcher.h>
#include <api/CStateRestoreStreamFilter.h>

#include <seccomp/CSystemCallFilter.h>

#include "CCmdLineParser.h"

#include <boost/bind.hpp>
@@ -91,6 +93,8 @@ int main(int argc, char** argv) {

ml::core::CProcessPriority::reducePriority();

ml::seccomp::CSystemCallFilter::installSystemCallFilter();

if (ioMgr.initIo() == false) {
LOG_FATAL(<< "Failed to initialise IO");
return EXIT_FAILURE;
4 changes: 4 additions & 0 deletions bin/normalize/Main.cc
@@ -28,6 +28,8 @@
#include <api/CLineifiedJsonOutputWriter.h>
#include <api/CResultNormalizer.h>

#include <seccomp/CSystemCallFilter.h>

#include "CCmdLineParser.h"

#include <boost/bind.hpp>
@@ -78,6 +80,8 @@ int main(int argc, char** argv) {

ml::core::CProcessPriority::reducePriority();

ml::seccomp::CSystemCallFilter::installSystemCallFilter();

if (ioMgr.initIo() == false) {
LOG_FATAL(<< "Failed to initialise IO");
return EXIT_FAILURE;
6 changes: 5 additions & 1 deletion docs/CHANGELOG.asciidoc
@@ -11,7 +11,7 @@

=== Deprecations

=== New Features
=== New Features

=== Enhancements

@@ -36,6 +36,10 @@ Improve partition analysis memory usage ({pull}97[#97])
Forecasting of Machine Learning job time series is now supported for large jobs by temporarily storing
model state on disk ({pull}89[#89])

Secure the ML processes by preventing system calls such as fork and exec. The Linux implementation uses
Seccomp BPF to intercept system calls and is available in kernels since 3.5. On Windows, Job Objects prevent
new processes being created, and macOS uses the sandbox functionality ({pull}98[#98])

=== Bug Fixes

Age seasonal components in proportion to the fraction of values with which they're updated ({pull}88[#88])
43 changes: 43 additions & 0 deletions include/seccomp/CSystemCallFilter.h
@@ -0,0 +1,43 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/
#ifndef INCLUDED_ml_seccomp_CSystemCallFilter_h
#define INCLUDED_ml_seccomp_CSystemCallFilter_h

#include <core/CNonInstantiatable.h>

namespace ml {
namespace seccomp {

//! \brief
//! Installs secure computing modes for Linux, macOS and Windows
//!
//! DESCRIPTION:\n
//! ML processes require only a subset of system calls to function correctly:
//! creating a named pipe, connecting to a named pipe, reading and writing.
//! No other system calls are necessary, so they should be restricted to
//! prevent malicious actions.
//!
//! IMPLEMENTATION DECISIONS:\n
//! Implementations are platform specific; more details can be found in the
//! particular .cc files.
//!
//! Linux:
//! Seccomp BPF is used to restrict system calls on kernels since 3.5.
//!
//! macOS:
//! The sandbox facility is used to restrict access to system resources.
//!
//! Windows:
//! Job Objects prevent the process spawning another.
//!
class CSystemCallFilter : private core::CNonInstantiatable {
public:
static void installSystemCallFilter();
};
}
}

#endif // INCLUDED_ml_seccomp_CSystemCallFilter_h
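The header exposes one static entry point, while each platform gets its own .cc file. As a rough illustration of that split, the sketch below folds the three variants into a single translation unit using preprocessor checks. Everything in the `sketch` namespace is hypothetical: the stand-in `CNonInstantiatable`, the class name and the `mechanism()` function are not part of this PR, which presumably picks the platform file through the build system instead.

```cpp
#include <string>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for core::CNonInstantiatable: with the default
// constructor deleted, derived classes can only offer static members.
class CNonInstantiatable {
private:
    CNonInstantiatable() = delete;
};

class CSystemCallFilterSketch : private CNonInstantiatable {
public:
    // Name of the mechanism the header comment associates with the
    // platform this file is compiled for (assumed macro checks).
    static std::string mechanism() {
#if defined(__linux__)
        return "Seccomp BPF";
#elif defined(__APPLE__)
        return "sandbox";
#elif defined(_WIN32)
        return "Job Objects";
#else
        return "none";
#endif
    }
};

// The class cannot be instantiated, only called statically.
static_assert(!std::is_default_constructible<CSystemCallFilterSketch>::value,
              "filter class should be static-only");
}
```

Callers would use it as `sketch::CSystemCallFilterSketch::mechanism()`, mirroring the way each `main` calls `installSystemCallFilter()` without creating an object.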
2 changes: 2 additions & 0 deletions lib/Makefile
@@ -18,6 +18,8 @@ COMPONENTS= \
test \
api \
config \
seccomp \



include $(CPP_SRC_HOME)/mk/toplevel.mk
Empty file added lib/seccomp/.objs/.gitignore
Empty file.
151 changes: 151 additions & 0 deletions lib/seccomp/CSystemCallFilter_Linux.cc
@@ -0,0 +1,151 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/
#include "seccomp/CSystemCallFilter.h"

#include <core/CLogger.h>

#include <linux/audit.h>
#include <linux/filter.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

#include <cerrno>
#include <cstdint>
#include <cstring>

namespace ml {
namespace seccomp {

namespace {
// The old x32 ABI always has bit 30 set in the sys call numbers.
// The x64 architecture should fail these calls
const std::uint32_t UPPER_NR_LIMIT = 0x3FFFFFFF;

// Offset to the nr field in struct seccomp_data
const std::uint32_t SECCOMP_DATA_NR_OFFSET = 0x00;
// Offset to the arch field in struct seccomp_data
const std::uint32_t SECCOMP_DATA_ARCH_OFFSET = 0x04;

// Copied from seccomp.h
// seccomp.h cannot be included as it was added in Linux kernel 3.17
// and this must build on older versions.
// TODO: remove once the minimum build kernel version supports seccomp
#define SECCOMP_MODE_FILTER 2
#define SECCOMP_RET_ERRNO 0x00050000U
#define SECCOMP_RET_ALLOW 0x7fff0000U
#define SECCOMP_RET_DATA 0x0000ffffU

// Added in Linux 3.5
#ifndef PR_SET_NO_NEW_PRIVS
#define PR_SET_NO_NEW_PRIVS 38
#endif

const struct sock_filter FILTER[] = {
// Load architecture from 'seccomp_data' buffer into accumulator
BPF_STMT(BPF_LD | BPF_W | BPF_ABS, SECCOMP_DATA_ARCH_OFFSET),
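// Jump to disallow if architecture is not X86_64. Jump offsets are relative
// to the next instruction, so skipping the 36 instructions between this
// one and the disallow return requires an offset of 36.
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 0, 36),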
// Load the system call number into accumulator
BPF_STMT(BPF_LD | BPF_W | BPF_ABS, SECCOMP_DATA_NR_OFFSET),
// Only applies to X86_64 arch. Jump to disallow for calls using the x32 ABI
BPF_JUMP(BPF_JMP | BPF_JGT | BPF_K, UPPER_NR_LIMIT, 34, 0),
// Allowed sys calls, jump to return allow on match
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_read, 34, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_write, 33, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_writev, 32, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_lseek, 31, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_lstat, 30, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_readlink, 29, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_stat, 28, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_fstat, 27, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_open, 26, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_close, 25, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_connect, 24, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone, 23, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_statfs, 22, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_dup2, 21, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_rmdir, 20, 0), // for forecast temp storage
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_getdents, 19, 0), // for forecast temp storage
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_openat, 18, 0), // for forecast temp storage
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_tgkill, 17, 0), // for the crash handler
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_rt_sigaction, 16, 0), // for the crash handler
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_rt_sigreturn, 15, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_futex, 14, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_madvise, 13, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_unlink, 12, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_mknod, 11, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_nanosleep, 10, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_set_robust_list, 9, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_mprotect, 8, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_munmap, 7, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_mmap, 6, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_getuid, 5, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_exit_group, 4, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_access, 3, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_brk, 2, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_exit, 1, 0),
// Disallow call with error code EACCES
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EACCES & SECCOMP_RET_DATA)),
// Allow call
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)};
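Classic BPF jump offsets are relative to the instruction that follows the jump, so every `jt`/`jf` value in a hand-written table such as `FILTER` depends on how many instructions sit between the jump and its target. The following is a simplified model rather than kernel code: `EOp`, `SInstruction`, `SSyscall`, `buildFilter` and `evaluate` are hypothetical names, and the opcodes are stand-ins, not the real BPF encodings. It builds a whitelist with the same layout as `FILTER`, computing each offset as `target - pc - 1`, and then walks the program for a given call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Simplified stand-ins for the BPF opcodes used by the filter.
enum class EOp { LoadArch, LoadNr, JumpIfEqual, JumpIfGreater, Return };

// Same shape as struct sock_filter: jt/jf are offsets relative to the
// instruction after the jump.
struct SInstruction {
    EOp op;
    std::uint8_t jt;
    std::uint8_t jf;
    std::uint32_t k;
};

// The two seccomp_data fields the filter reads.
struct SSyscall {
    std::uint32_t nr;
    std::uint32_t arch;
};

const std::uint32_t DENY = 0;
const std::uint32_t ALLOW = 1;

// Layout: arch check, x32 limit check, one test per allowed call,
// then the deny return followed by the allow return.
std::vector<SInstruction> buildFilter(std::uint32_t arch,
                                      std::uint32_t nrLimit,
                                      const std::vector<std::uint32_t>& allowed) {
    assert(allowed.size() <= 253); // offsets must fit in 8 bits
    const std::size_t denyPc = 4 + allowed.size();
    const std::size_t allowPc = denyPc + 1;
    auto offset = [](std::size_t target, std::size_t pc) {
        return static_cast<std::uint8_t>(target - pc - 1);
    };

    std::vector<SInstruction> filter;
    filter.push_back({EOp::LoadArch, 0, 0, 0});
    filter.push_back({EOp::JumpIfEqual, 0, offset(denyPc, 1), arch});
    filter.push_back({EOp::LoadNr, 0, 0, 0});
    filter.push_back({EOp::JumpIfGreater, offset(denyPc, 3), 0, nrLimit});
    for (std::uint32_t nr : allowed) {
        filter.push_back({EOp::JumpIfEqual, offset(allowPc, filter.size()), 0, nr});
    }
    filter.push_back({EOp::Return, 0, 0, DENY});
    filter.push_back({EOp::Return, 0, 0, ALLOW});
    return filter;
}

// Run the program against one system call and return ALLOW or DENY.
std::uint32_t evaluate(const std::vector<SInstruction>& filter, const SSyscall& call) {
    std::uint32_t accumulator = 0;
    std::size_t pc = 0;
    while (true) {
        assert(pc < filter.size());
        const SInstruction& ins = filter[pc];
        switch (ins.op) {
        case EOp::LoadArch: accumulator = call.arch; ++pc; break;
        case EOp::LoadNr: accumulator = call.nr; ++pc; break;
        case EOp::JumpIfEqual: pc += 1 + (accumulator == ins.k ? ins.jt : ins.jf); break;
        case EOp::JumpIfGreater: pc += 1 + (accumulator > ins.k ? ins.jt : ins.jf); break;
        case EOp::Return: return ins.k;
        }
    }
}
}
```

For instance, `evaluate(buildFilter(0xC000003E, 0x3FFFFFFF, {0, 1, 60}), {59, 0xC000003E})` yields `DENY`, because 59 (`execve` on x86_64) is not in the allowed list. Generating the offsets this way avoids recounting them by hand whenever a system call is added to or removed from the table.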

bool canUseSeccompBpf() {
// This call is expected to fail due to the nullptr argument
// but the failure mode informs us if the kernel was configured
// with CONFIG_SECCOMP_FILTER
// http://man7.org/linux/man-pages/man2/prctl.2.html
int result = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, nullptr);
int configError = errno;
if (result != -1) {
LOG_ERROR(<< "prctl set seccomp with null argument should have failed");
return false;
}

// If the kernel is not configured with CONFIG_SECCOMP_FILTER
// or CONFIG_SECCOMP the error is EINVAL. EFAULT indicates the
// seccomp filters are enabled but the 3rd argument (nullptr)
// was invalid.
return configError == EFAULT;
}
}
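The probe in `canUseSeccompBpf` relies on which error the deliberately invalid `prctl` call reports. A minimal sketch of that decision as a pure function is given below, so the mapping can be checked without touching the kernel. The enum and `classifyProbe` are hypothetical names, and treating every error other than `EFAULT` and `EINVAL` as unexpected is an assumption of this sketch; the code in the diff simply returns false for anything that is not `EFAULT`.

```cpp
#include <cerrno>

namespace sketch {

enum class ESeccompSupport { Available, NotConfigured, Unexpected };

// Classify the outcome of
//   prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, nullptr)
// given its return value and the errno captured right after the call.
ESeccompSupport classifyProbe(int result, int error) {
    if (result != -1) {
        // A null filter pointer should never be accepted.
        return ESeccompSupport::Unexpected;
    }
    if (error == EFAULT) {
        // The kernel got as far as reading the filter, so
        // seccomp filtering is compiled in.
        return ESeccompSupport::Available;
    }
    if (error == EINVAL) {
        // The kernel does not recognise the request.
        return ESeccompSupport::NotConfigured;
    }
    return ESeccompSupport::Unexpected;
}
}
```

Only `ESeccompSupport::Available` corresponds to the case where the filter is installed.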

void CSystemCallFilter::installSystemCallFilter() {
if (canUseSeccompBpf()) {
LOG_DEBUG(<< "Seccomp BPF filters available");

// Ensure more permissive privileges cannot be set in future.
// This must be set before installing the filter.
// PR_SET_NO_NEW_PRIVS was added in kernel 3.5
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
LOG_ERROR(<< "prctl PR_SET_NO_NEW_PRIVS failed: " << std::strerror(errno));
return;
}

struct sock_fprog prog = {
.len = static_cast<unsigned short>(sizeof(FILTER) / sizeof(FILTER[0])),
.filter = const_cast<sock_filter*>(FILTER)};

// Install the filter.
// prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, filter) was introduced
// in kernel 3.5. This is functionally equivalent to
// seccomp(SECCOMP_SET_MODE_FILTER, 0, filter) which was added in
// kernel 3.17. We choose the older more compatible function.
// Note this precludes calling seccomp() with the
// SECCOMP_FILTER_FLAG_TSYNC flag, which is acceptable if the filter
// is installed by the main thread before any other threads are
// spawned.
if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) {
LOG_ERROR(<< "Unable to install Seccomp BPF: " << std::strerror(errno));
} else {
LOG_DEBUG(<< "Seccomp BPF installed");
}

} else {
LOG_DEBUG(<< "Seccomp BPF not available");
}
}
}
}
92 changes: 92 additions & 0 deletions lib/seccomp/CSystemCallFilter_MacOSX.cc
@@ -0,0 +1,92 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/
#include "seccomp/CSystemCallFilter.h"

#include <core/CLogger.h>

#include <paths.h>
#include <sandbox.h>
#include <unistd.h>

#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

namespace ml {
namespace seccomp {

namespace {
// The Sandbox rules deny all actions apart from creating fifos,
// opening files, reading and writing.
// (allow file-write*) is required for mkfifo and that permission
// can not be set using the more granular controls.
const std::string SANDBOX_RULES("\
(version 1) \
(deny default) \
(allow file-read*) \
(allow file-read-data) \
(allow file-write*) \
(allow file-write-data)");

// mkstemps will replace the Xs with random characters
const std::string FILE_NAME_TEMPLATE("ml.XXXXXX.sb");
// The length of the suffix '.sb'
const int FILE_NAME_TEMPLATE_SUFFIX_LEN = 3;

std::string getTempDir() {
// Prefer to use the temporary directory set by the Elasticsearch JVM
const char* tmpDir(::getenv("TMPDIR"));

// If TMPDIR is not set or is empty use _PATH_VARTMP
std::string path((tmpDir == nullptr || *tmpDir == '\0') ? _PATH_VARTMP : tmpDir);
// Make sure path ends with a slash so it's ready to have a file name appended
if (path[path.length() - 1] != '/') {
path += '/';
}
return path;
}

std::string writeTempRulesFile() {
std::string profileFilename = getTempDir() + FILE_NAME_TEMPLATE;

// Create and open a temporary file with a random name
// profileFilename is updated with the new filename.
int fd = mkstemps(&profileFilename[0], FILE_NAME_TEMPLATE_SUFFIX_LEN);
if (fd == -1) {
LOG_ERROR(<< "Opening a temporary file with mkstemps failed: "
<< std::strerror(errno));
return std::string();
}
write(fd, SANDBOX_RULES.c_str(), SANDBOX_RULES.size());
close(fd);

return profileFilename;
}
}

void CSystemCallFilter::installSystemCallFilter() {
std::string profileFilename = writeTempRulesFile();
if (profileFilename.empty()) {
LOG_WARN(<< "Cannot write sandbox rules. macOS sandbox will not be initialized");
return;
}

char* errorbuf = nullptr;
if (sandbox_init(profileFilename.c_str(), SANDBOX_NAMED, &errorbuf) != 0) {
std::string msg("Error initializing macOS sandbox");
if (errorbuf != nullptr) {
msg += ": ";
msg += errorbuf;
sandbox_free_error(errorbuf);
}
LOG_ERROR(<< msg);
} else {
LOG_DEBUG(<< "macOS sandbox initialized");
}
Review comment (Contributor):

Can we delete the file here? I think it's not necessary to keep around for the lifetime of the program as the ES Java code deletes it immediately after the sandbox is initialized.


std::remove(profileFilename.c_str());
}
}
}
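The temporary file handling in `writeTempRulesFile` can be tried out separately from the sandbox call. The following is a minimal sketch, assuming a system where `mkstemps` is available (it is a BSD and glibc extension, not part of POSIX). The names `withTrailingSlash` and `writeTempFile` and the `/tmp/` fallback are hypothetical. Unlike the code in the diff, the sketch also checks the result of `write` and removes the file if the rules were not fully written.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

#include <unistd.h>

namespace sketch {

// Returns dir ending with a slash; an empty input falls back to the
// assumed default "/tmp/".
std::string withTrailingSlash(std::string dir) {
    if (dir.empty()) {
        return "/tmp/";
    }
    if (dir.back() != '/') {
        dir += '/';
    }
    return dir;
}

// Writes contents to a new, uniquely named file "<dir>ml.XXXXXX.sb" and
// returns its path, or an empty string on failure.
std::string writeTempFile(const std::string& dir, const std::string& contents) {
    std::string path = withTrailingSlash(dir) + "ml.XXXXXX.sb";
    const int suffixLen = 3; // length of ".sb"

    // mkstemps replaces the Xs in place and opens the file.
    int fd = ::mkstemps(&path[0], suffixLen);
    if (fd == -1) {
        std::fprintf(stderr, "mkstemps failed: %s\n", std::strerror(errno));
        return std::string();
    }

    ssize_t written = ::write(fd, contents.data(), contents.size());
    ::close(fd);
    if (written != static_cast<ssize_t>(contents.size())) {
        // Do not leave a truncated rules file behind.
        std::remove(path.c_str());
        return std::string();
    }
    return path;
}
}
```

A caller would pass the returned path on and then delete the file with `std::remove(path.c_str())` once it is no longer needed, which is the cleanup the review comment above asks for.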