[FEATURE] Object Storage (S3) Data Ingestion through Streaming Query #948

**Is your feature request related to a problem?**
One of the key technical challenges in https://github.com/opensearch-project/sql/issues/719 is how to maintain consistency between the base table (S3 data) and the derived table (OpenSearch index/materialized view).

**What solution would you like?**
One solution is to refresh new data from S3 into OpenSearch incrementally. We propose enhancing our query engine to unify batch processing and stream processing in a single architecture, as Apache Flink and Spark already do. In particular, the enhancement includes changes to query planning, the query execution engine, and the query plan itself.

PoC branch: https://github.com/opensearch-project/sql/tree/poc/maximus-m1. A user manual and a detailed design doc will be published later, as planned below.
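
To illustrate the idea of incremental refresh with a unified batch/stream plan, here is a minimal Python sketch. It is not the PoC implementation: `ObjectStoreSource`, `read_all`, `read_new`, and `plan` are hypothetical names, the in-memory dict stands in for an S3 bucket listing, and the Python list `index` stands in for an OpenSearch index. It also assumes objects are only ever added, never updated or deleted.

```python
from typing import Dict, List, Set

Record = Dict[str, int]


class ObjectStoreSource:
    """Hypothetical stand-in for an S3 bucket: object key -> records."""

    def __init__(self) -> None:
        self.objects: Dict[str, List[Record]] = {}
        self._seen: Set[str] = set()  # keys already ingested by the stream

    def put(self, key: str, records: List[Record]) -> None:
        self.objects[key] = records

    def read_all(self) -> List[Record]:
        """Batch mode: a bounded scan over every object."""
        return [r for key in sorted(self.objects) for r in self.objects[key]]

    def read_new(self) -> List[Record]:
        """Streaming mode: one micro-batch containing only unseen objects."""
        new_keys = sorted(k for k in self.objects if k not in self._seen)
        self._seen.update(new_keys)
        return [r for key in new_keys for r in self.objects[key]]


def plan(records: List[Record]) -> List[Record]:
    """The same (hypothetical) query plan is used in both modes:
    keep server errors and project two fields."""
    return [
        {"status": r["status"], "size": r["size"]}
        for r in records
        if r["status"] >= 500
    ]


source = ObjectStoreSource()
index: List[Record] = []  # stand-in for the derived table

# First micro-batch: one object has landed in the bucket.
source.put("logs/1.json", [{"status": 200, "size": 10}, {"status": 500, "size": 7}])
index.extend(plan(source.read_new()))

# Second micro-batch: only the newly arrived object is read.
source.put("logs/2.json", [{"status": 503, "size": 3}])
index.extend(plan(source.read_new()))

# The incrementally maintained table matches a full batch run of the same plan.
assert index == plan(source.read_all())
```

The point of the sketch is that the plan itself does not change between modes; only the source decides whether it exposes a bounded scan or a sequence of micro-batches.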

**What alternatives have you considered?**
The alternative is to rebuild the derived table (full refresh) on user demand or on a regular schedule. The current batch processing architecture can already do this, but it would introduce significant overhead for large S3 datasets.
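
A rough way to see the overhead is to count how many S3 objects each strategy reads over a series of refresh cycles. The sketch below is only a back-of-the-envelope model, not a measurement: it assumes every object costs the same to read, that objects are only added, and the function names are hypothetical.

```python
from typing import List


def objects_read_full(arrivals: List[int]) -> int:
    """Full refresh: every cycle re-reads all objects stored so far."""
    total_read = 0
    stored = 0
    for new_objects in arrivals:
        stored += new_objects
        total_read += stored
    return total_read


def objects_read_incremental(arrivals: List[int]) -> int:
    """Incremental refresh: every cycle reads only the newly arrived objects."""
    return sum(arrivals)


# Assumed workload: one new object arrives before each of 10 refresh cycles.
arrivals = [1] * 10
full = objects_read_full(arrivals)                # 1 + 2 + ... + 10 = 55
incremental = objects_read_incremental(arrivals)  # 10
```

Under these assumptions, full refresh grows quadratically with the number of cycles while incremental refresh grows linearly, which is the overhead the proposal aims to avoid.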

**Do you have any additional context?**

## Phase 1 
### Goal:
* Ready for performance evaluation
* Ready for feature evaluation
* Missing
  * Failure recovery
  * Security
### Tasks
- [x] Infra Enhancement
    - [x] https://github.com/opensearch-project/sql/pull/822
    - [x] https://github.com/opensearch-project/sql/pull/845
    - [x] https://github.com/opensearch-project/sql/pull/1085
    - [x] https://github.com/opensearch-project/sql/pull/1091
- [x] https://github.com/opensearch-project/sql/issues/968
    - [x] https://github.com/opensearch-project/sql/pull/1044
    - [x] https://github.com/opensearch-project/sql/pull/1068
- [x] opensearch-project/sql#969
   - [x] opensearch-project/sql#974
   - [x] https://github.com/opensearch-project/sql/pull/994
- [ ] https://github.com/opensearch-project/sql/issues/1093
    - [x] https://github.com/opensearch-project/sql/pull/1094
    - [ ] https://github.com/opensearch-project/sql/pull/1139
- [x] https://github.com/opensearch-project/sql/issues/951
    - [x] https://github.com/opensearch-project/sql/pull/950
    - [x] https://github.com/opensearch-project/sql/pull/958
    - [ ] https://github.com/opensearch-project/sql/pull/1100
- [x] https://github.com/opensearch-project/sql/issues/953
    - [x] https://github.com/opensearch-project/sql/pull/959
- [ ] https://github.com/opensearch-project/sql/issues/954
    - [x] https://github.com/opensearch-project/sql/pull/990
    - [ ] Refactor AggregateOperator to support stream processing
- [ ] https://github.com/opensearch-project/sql/issues/955
    - [ ] Add INSERT STREAM statement
    - [ ] Add CREATE TABLE statement. https://github.com/penghuo/os-sql/tree/hp/test/maximus-m1
- [x] opensearch-project/sql#972
   - [ ] [S3 impl](https://github.com/penghuo/os-sql/tree/hp/test/maximus-m1) is blocked by https://github.com/opensearch-project/OpenSearch/issues/5359
- [x] opensearch-project/sql#1151

## Phase 2
### Goal:
* Ready for experimental release
* Missing
  * Pipeline Execution
  * Distributed Execution
### Tasks
- [ ] Enhancement
  - [ ] https://github.com/opensearch-project/sql/issues/1071
- [ ] Fault Tolerance
  - [ ] https://github.com/opensearch-project/sql/issues/1007
  - [ ] https://github.com/opensearch-project/sql/issues/1072
- [ ] Security
- [ ] Use-case-related features
  - [ ] object/array support
  - [ ] full text search capability in streaming - match
- [ ] Test
- [ ] Documentation
- [ ] User Interface

## Phase 3
### Goal:
* Ready for production deployment

### Tasks
- [ ] Pipeline Execution
- [ ] Distributed Execution

