In this article, we'd like to introduce a new cloud CI ecosystem called SAVE. Its primary purpose is to simplify the lives of dev-tool developers. SAVE can serve them as:
- A Continuous Integration system for validating their tools;
- A Benchmarking Platform to compare their tool with others and share tests;
- A Validation Platform to check their tool against a predefined standard;
- As a future bonus, a Community Competition platform that SAVE plans to offer.
Our team focuses on creating libraries and tools for developers. The concept of SAVE came to us during the development of an open-source Static Analyzer named Diktat. We realized there was a shortage of test frameworks for such tools. Each author of a compiler or static analyzer seems to reinvent a test framework for their own specific needs. We wanted to change this!
After examining various open-source static analyzers and compilers, the only useful tool we found for this type of testing was LLVM LIT. However, even this tool has some disadvantages:
- LIT is essentially a set of Python scripts and isn't ready-to-go without extra installations;
- LIT was designed for compilers and lacks functionality for extending application logic;
- As LIT was developed by community system programmers, it has no surrounding ecosystem and is merely a simple command-line tool.
One crucial point is that the testing scenario is much the same for all these types of programming tools (compilers, code analyzers, parsers, license scanners, etc.):
- Some source code (in string format) is passed to the tool;
- The tool performs its internal logic as a black box;
- The tool outputs some result: fixed code (for linters), a code execution result (for compilers), warnings (for static analyzers), some Internal Representation (for parsers), etc. All output data can be in a string format (for example, in SARIF format).
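The scenario above can be sketched in a few lines of Python. This is only an illustration of the black-box idea, not how save-cli is implemented: `run_tool` is a hypothetical helper, and the "tool" here is a stand-in Python one-liner that uppercases its input in place of a real linter or compiler.

```python
import subprocess
import sys


def run_tool(command: list[str], source: str) -> str:
    """Pass source code as a string to a tool and return its textual output."""
    result = subprocess.run(
        command,
        input=source,          # the source code goes in as a string
        capture_output=True,
        text=True,
        check=True,            # fail loudly if the tool exits with an error
    )
    return result.stdout       # the result comes back as a string


# Stand-in "tool" (an assumption for this sketch): reads code from stdin
# and writes a "fixed" version to stdout.
TOY_TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

output = run_tool(TOY_TOOL, "val x = 1\n")
```

Because the harness only sees strings going in and strings coming out, it does not need to know what language the tool is written in.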
This motivation led us to create save-cli, a simple native command-line application for functional testing of your tool. Our goal was to create a universal framework that anyone could use, eliminating dependencies on particular ecosystems (JVM/Python/etc).
For instance, we recognize that C++ compiler engineers may not want to install Java, and Java developers of static analyzers may not want to install Python. Users only need to write tests in the SAVE format without worrying about other considerations.
We wanted to make save-cli as extendable as possible by creating plugin-like interfaces. Everyone can easily contribute and add their plugin to SAVE. Currently, we offer "fix" and "warn" plugins right out of the box:
The fix plugin runs the provided executable on the initial test file and compares its output with an expected result. For this comparison, we use our own diff library.
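The comparison step of a fix-style check might look roughly like the sketch below. It uses Python's standard `difflib` as a stand-in for SAVE's own diff library, and `compare_fixed` is a hypothetical name chosen for this example.

```python
import difflib


def compare_fixed(actual: str, expected: str) -> list[str]:
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


expected = "fun main() {\n    println(1)\n}\n"
actual_ok = "fun main() {\n    println(1)\n}\n"
actual_bad = "fun main() {\n  println(1)\n}\n"  # wrong indentation

diff_ok = compare_fixed(actual_ok, expected)
diff_bad = compare_fixed(actual_bad, expected)

# Show the reader what went wrong in the failing case.
for line in diff_bad:
    print(line)
```

An empty diff means the test passes; a non-empty diff doubles as the failure report.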
The warn plugin is particularly interesting, as we have prepared our own DSL for this case that can be used in the source code of test examples. This plugin reflects a common scenario: validating the warnings generated by a Static Analyzer or, for example, the errors generated by a Compiler's Frontend. The DSL syntax for this plugin is fully configurable in SAVE, so you can easily match the comment syntax of your programming language.
For save-cli, we implemented our own recursive resource detection mechanism. To make SAVE detect your test suites, you need to put a save.toml file in each directory where you have tests that should be run. These configuration files inherit configuration from their parent directories. For instance, if you have the following directory hierarchy:
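To illustrate the idea, here is a minimal sketch of how expected warnings could be read from comments in a test file and compared with what a tool reports. The marker syntax `// ;warn:<line>:<column>: <message>` and both function names are assumptions made for this example; the real DSL is defined by SAVE's configuration and may look different.

```python
import re

# Hypothetical marker syntax: "// ;warn:<line>:<column>: <message>".
# The pattern is a parameter, so another comment style (e.g. "#") can be used.
DEFAULT_PATTERN = re.compile(r"// ;warn:(\d+):(\d+): (.+)")


def expected_warnings(source, pattern=DEFAULT_PATTERN):
    """Collect (line, column, message) triples declared in test comments."""
    found = set()
    for text_line in source.splitlines():
        match = pattern.search(text_line)
        if match:
            line, column, message = match.groups()
            found.add((int(line), int(column), message.strip()))
    return found


def compare_warnings(expected, actual):
    """Return (missing, unexpected) sets of warnings."""
    return expected - actual, actual - expected


test_source = """\
// ;warn:3:1: Class name should be in PascalCase
package example
class bad_name
"""

# Warnings a hypothetical analyzer reported for this file.
actual = {
    (3, 1, "Class name should be in PascalCase"),
    (3, 7, "Missing KDoc"),
}

missing, unexpected = compare_warnings(expected_warnings(test_source), actual)
```

Keeping the expectations next to the code they describe means a test is a single file, and swapping the pattern is enough to reuse the same check for a language with a different comment syntax.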
A
|- save.toml
|- B
   |- save.toml
then save.toml from directory B will inherit settings and properties from directory A.
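One way to picture this inheritance is as a recursive merge of parsed configurations, where values from the child directory override those from the parent. The sketch below works on plain dictionaries; the section and key names (`general`, `exec_cmd`, `description`, `warn`, `enabled`) are hypothetical, and the actual merge rules in save-cli may differ.

```python
def merge_config(parent: dict, child: dict) -> dict:
    """Merge two configs; child values override inherited parent values."""
    merged = dict(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides have a section with this name: merge it key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical parsed contents of A/save.toml and A/B/save.toml.
config_a = {"general": {"exec_cmd": "./my-tool", "description": "All tests"}}
config_b = {"general": {"description": "Suite B"}, "warn": {"enabled": True}}

effective_b = merge_config(config_a, config_b)
```

Here the suite in B keeps the command defined in A but replaces the description and adds its own section.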
In SAVE, we have established the rule: one suite - one save.toml config.
This means that if you have a directory with tests in your project and you want SAVE to detect them, you should put a save.toml file there with at least basic information: suite name/description. Otherwise, this directory will be ignored.
SAVE will treat all files with a Test suffix (configurable) as test resources and will automatically use the configuration from the save.toml file located in the same directory (and inherited configuration):
A
|- save.toml        <<< configuration from A dir that will be inherited by B
|- B                <<< test suite
   |- myTest.java   <<< test resource
   |- save.toml     <<< configuration for B
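The detection rule can be sketched as a directory walk. This is an illustration rather than save-cli's actual algorithm: `discover_suites` is a hypothetical function, and it assumes that "Test suffix" means the file name before the extension ends with `Test`, as in `myTest.java`.

```python
import tempfile
from pathlib import Path


def discover_suites(root: Path, suffix: str = "Test") -> dict:
    """Map every directory that contains save.toml to its test resources."""
    suites = {}
    for config in sorted(root.rglob("save.toml")):
        directory = config.parent
        suites[directory] = sorted(
            path
            for path in directory.iterdir()
            if path.is_file() and path.stem.endswith(suffix)
        )
    return suites


with tempfile.TemporaryDirectory() as tmp:
    # Recreate the hierarchy from the text, plus a directory without a config.
    root = Path(tmp) / "A"
    (root / "B").mkdir(parents=True)
    (root / "save.toml").write_text("")
    (root / "B" / "save.toml").write_text("")
    (root / "B" / "myTest.java").write_text("class My {}")
    (root / "C").mkdir()  # no save.toml here, so it is ignored
    (root / "C" / "otherTest.java").write_text("")

    found = discover_suites(root)
    summary = {
        directory.relative_to(root).as_posix(): [test.name for test in tests]
        for directory, tests in found.items()
    }
```

Directory C is skipped even though it holds a file with the right suffix, because it has no save.toml of its own.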
When developing our own Static Analyzer, we found that the lack of a test framework wasn't the only problem. Every creator of static analyzers begins by identifying the types of issues their tool will detect. This leads to searching for existing lists of potential issues or test packages that can be used to measure the result of their work or for Test-Driven Development (TDD).
In other areas of system programming, such benchmarks and test sets already exist. For instance, SPEC.org benchmarks are used worldwide to test the functionality and measure the performance of various applications and hardware. But there are no such test sets or even strict standards for detecting issues in popular programming languages.
Test suites from NIST, Juliet, MISRA, CWE, etc. do exist, but their frameworks and ecosystems remain very limited. As a result, every developer who invents a new code style or mechanism for static analysis often ends up reinventing a test framework and writing test sets that have already been created thousands of times for their specific programming language. A lot of time is spent on reinventing, writing, and debugging tests.
We decided to change this situation and make an open-source product that contains the following components:
- CI platform to execute tests in save-cli format in the Cloud in parallel;
- Dashboards to visualize and search results and to check logs;
- Storage for historical test results: detection of regression and flaky tests;
- Archive of benchmarks that can be used for certification or comparison of tools;
- Sharing of tests and benchmarks in the community;
- Contests in the area of bug hunting with a rating of the best tools;
- Online demo service for code analyzers to show your tool's capabilities.
When we were examining existing testing tools in large projects like GCC or Clang, we found that these projects have hundreds of thousands of tests. More than 500,000 tests in each compiler! Given this volume, we realized that we definitely need a batching and parallelization mechanism using a Cloud environment, especially considering that all tests are usually encapsulated and isolated from each other.
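The batching idea can be shown with a small local sketch. It uses a thread pool as a stand-in for cloud workers, and `make_batches` and `run_batch` are hypothetical names; in this example `run_batch` simply marks every test as passed instead of invoking a real tool.

```python
from concurrent.futures import ThreadPoolExecutor


def make_batches(tests: list, batch_size: int) -> list:
    """Split the full test list into consecutive batches."""
    return [tests[i:i + batch_size] for i in range(0, len(tests), batch_size)]


def run_batch(batch: list) -> dict:
    """Placeholder for running one isolated batch; every test 'passes' here."""
    return {name: True for name in batch}


tests = [f"test_{i}" for i in range(10)]
batches = make_batches(tests, batch_size=4)

results = {}
with ThreadPoolExecutor(max_workers=3) as pool:
    # Batches are independent, so they can run in parallel and be merged later.
    for batch_result in pool.map(run_batch, batches):
        results.update(batch_result)
```

Since the tests are isolated from each other, the only coordination needed is splitting the list up front and merging the results at the end.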
So after completing save-cli, we began implementing our ideas in save-cloud. We wanted these projects to be separate:
- save-cli for local testing and configuration checking by the user;
- save-cloud for Cloud testing with save-cli and storing historical results.
At a high level, the logic is as follows:
- save-cloud provides a REST API or a WEB UI for the user;
- The user can select existing or upload their own benchmarks;
- save-cloud triggers processing and orchestrates Kubernetes nodes, where the native save-cli executes tests in Docker containers;
- All historical results for executions are saved in a database for analysis.
The SAVE instance is deployed at saveourtool.com.
Of course, save-cloud also has a REST API and a Java library that can be used as a client for this API. Using this API, save-cloud can be easily integrated into various CI/CD platforms, like GitHub Actions, Jenkins, TeamCity, and many others. You can read about the API here.
We believe that SAVE will become a standard for testing, benchmarking, and standardizing dev-tools. We are always open to proposals, discussions, and collaboration. Please feel free to contact us on our website or via GitHub. We would also appreciate it if you starred our repositories, as this attracts developers to the community.