
Overview

Arctern is a fast, scalable spatial-temporal analytics framework.

Scalability is key to building productive data science pipelines. To address the scalability challenge, we launched Arctern, an open-source spatial-temporal analytics framework for boosting end-to-end data science performance. Arctern aims to improve scalability in two respects:

A unified data analytics and processing interface across different platforms, from laptops to clusters and the cloud.
Rich and consistent algorithms and models, including trajectory processing, spatial clustering, and regression, across the different stages of a data science pipeline.

Arctern's approach and current progress

We adopt GeoPandas's interface and plan to build GeoDataFrame/GeoSeries implementations that scale both up and out. On top of GeoDataFrame/GeoSeries, we will develop a set of spatial-temporal algorithms that behaves consistently across execution environments.

We have developed an efficient multithreaded GeoSeries implementation, and the distributed version is in progress. In the latest version, 0.2.0, Arctern achieves a 24x speedup over GeoPandas. Even under single-threaded execution, Arctern outperforms GeoPandas by 7x on average. The detailed evaluation results are illustrated in the figure below.
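To make the idea of a multithreaded series operation concrete, here is a minimal sketch in pure Python. It is not Arctern's implementation: the names `shoelace_area` and `parallel_area` are hypothetical, geometries are assumed to be plain lists of (x, y) vertices, and the per-chunk work is a simple polygon area. It only shows the general pattern of splitting a series into chunks, evaluating each chunk on a worker thread, and reassembling the results in input order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

# Assumption for this sketch: a polygon is its exterior ring of (x, y) vertices.
Ring = Sequence[Tuple[float, float]]


def shoelace_area(ring: Ring) -> float:
    """Unsigned area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def parallel_area(rings: Sequence[Ring], workers: int = 4,
                  chunk_size: int = 1024) -> List[float]:
    """Compute per-polygon areas chunk by chunk on a thread pool."""
    chunks = [rings[i:i + chunk_size] for i in range(0, len(rings), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, so the output
        # stays aligned with the input series.
        parts = pool.map(lambda chunk: [shoelace_area(r) for r in chunk], chunks)
        return [area for part in parts for area in part]


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
areas = parallel_area([unit_square, triangle] * 3, workers=2, chunk_size=2)
print(areas)
```

In CPython, threads running pure Python code are serialized by the global interpreter lock, so this sketch illustrates the chunking pattern rather than a real speedup. Parallel gains of the kind reported above generally require the per-chunk work to run in native code.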

We are also experimenting with GPU acceleration for spatial-temporal data analysis and rendering. Arctern currently provides six GPU-accelerated rendering methods and eight spatial-relation operations, which outperform their CPU-based counterparts by up to 36x.

In the next few releases, our team will focus on:

Developing a distributed version of GeoSeries. Our first distributed implementation of GeoDataFrame/GeoSeries will be based on Spark. It has been developed in sync with Spark 3.0 since its preview release. Spark's support for GPU scheduling and column-based processing is closely aligned with our idea of high-performance spatial-temporal data processing. In addition, the newly introduced Koalas interface offers a promising option for implementing consistent GeoDataFrame/GeoSeries interfaces on Spark.
Enriching our spatial-temporal algorithm sets. We will concentrate on KNN search and trajectory analysis in the project's early stages.
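
As a point of reference for the KNN search item above, the following is a minimal brute-force sketch in pure Python. It is not Arctern's algorithm or API: the function name `knn` is hypothetical, points are assumed to be planar (x, y) tuples, and distance is plain Euclidean rather than geodesic. A production implementation would typically use a spatial index instead of scanning every point.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def knn(query: Point, points: Sequence[Point], k: int) -> List[Tuple[float, int]]:
    """Return up to k (distance, index) pairs, nearest first (brute force)."""
    if k <= 0:
        return []
    distances = (
        (math.hypot(p[0] - query[0], p[1] - query[1]), i)
        for i, p in enumerate(points)
    )
    # nsmallest keeps only k candidates in memory while scanning all points.
    return heapq.nsmallest(k, distances)


points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (10.0, 10.0)]
nearest = knn((0.0, 0.0), points, k=2)
print([index for _, index in nearest])
```

Each query here costs one pass over all points, which is acceptable as a correctness baseline but not for large datasets.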
