refactor: move 'EVA' to 'EvaDB' #834

Merged
merged 13 commits into from
Jun 8, 2023
8 changes: 4 additions & 4 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,9 +1,9 @@
👋 Thanks for submitting a Pull Request to EVA DB!
👋 Thanks for submitting a Pull Request to EvaDB!

🙌 We want to make contributing to EVA DB as easy and transparent as possible. Here are a few tips to get you started:
🙌 We want to make contributing to EvaDB as easy and transparent as possible. Here are a few tips to get you started:

- 🔍 Search existing EVA DB [PRs](https://github.com/georgia-tech-db/eva/pulls) to see if a similar PR already exists.
- 🔗 Link this PR to a EVA DB [issue](https://github.com/georgia-tech-db/eva/issues) to help us understand what bug fix or feature is being implemented.
- 🔍 Search existing EvaDB [PRs](https://github.com/georgia-tech-db/eva/pulls) to see if a similar PR already exists.
- 🔗 Link this PR to a EvaDB [issue](https://github.com/georgia-tech-db/eva/issues) to help us understand what bug fix or feature is being implemented.
- 📈 Provide before and after profiling results to help us quantify the improvement your PR provides (if applicable).

👉 Please see our ✅ [Contributing Guide](https://evadb.readthedocs.io/en/stable/source/contribute/index.html) for more details.

22 changes: 8 additions & 14 deletions CHANGELOG.md
@@ -17,8 +17,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
* PR #830: docs: updating docs based on Python API
* PR #826: fix: orderby bug
* PR #821: feat: adding python API drop and drop_udf
* PR #822: refactor: eva -> evadb
* PR #820: refactor: mv eva -> evadb
* PR #810: feat: grouping paragraphs in documents and samples in audio
* PR #800: test: api testing -> similarity between text and relevance keyword
* PR #818: fix: update youtube qa app with new api updates
@@ -36,14 +34,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [0.2.5] - 2023-06-02

* PR #801: feat / doc: example app that summarizes any youtube video with eva api
* PR #801: feat / doc: example app that summarizes any youtube video with Python api
* PR #796: docs: Updating Docs based on feedback
* PR #787: feat: enable ray by default
* PR #777: feat: Hugging face entity extraction
* PR #789: feat: create table from select query
* PR #794: docs: bump python for docs
* PR #793: refactor: Create .git-blame-ignore-revs
* PR #792: refactor: updated eva license
* PR #784: feat: support relational apis
* PR #774: Improve create mat
* PR #785: fix: update mnist notebook
@@ -57,12 +54,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
* PR #768: feat: create materialized view infers column name
* PR #770: tutorials: hugging face text summarizer, text classifier, + pdf loader
* PR #769: feat: load pdf
* PR #766: feat: change default dir to "eva_data" and make it configurable
* PR #767: ci: improve docker support
* PR #751: feat: integration with langchain
* PR #764: feat: db apis made more pythonic
* PR #765: test: fix test case
* PR #752: fix: restart eva_server process for every tutorial
* PR #753: chore
* PR #748: feat: common image transformation UDFs
* PR #731: feat: extensible parallel execution
@@ -155,15 +150,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

* PR #647: feat: LOAD CSV Notebook
* PR #626: docs: Documentation for creating UDFs using Decorators.
* PR #599: feat: EVA x HuggingFace
* PR #599: feat: EvaDB x HuggingFace
* PR #621: feat: Ray integration

### [Changed]

* PR #649: fix: Expr bugs
* PR #628: test: adding support for pytest-xdist
* PR #633: fix: Install Decord from EVA-Fork
* PR #646: update doc for extending eva
* PR #633: fix: Install Decord from EvaDB-Fork
* PR #642: Build fix
* PR #641: fix: Unnest bug

@@ -188,7 +182,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
* PR #617: feat: Add UDF Cost Catalog
* PR #616: feat: Add support for iframe based video sampling
* PR #606: feat: Add metadata to UDFs in catalog
* PR #589: feat: Fuzzy Join support in EVA
* PR #589: feat: Fuzzy Join support in EvaDB
* PR #601: feat: Decorators for UDF
* PR #619: chore: reducing coverage loss
* PR #595: doc: Adding CatalogManager, INSERT and DELETE documentation
@@ -208,7 +202,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
* PR #611: fix: insert and delete executor
* PR #615: fix: dropbox links fixed
* PR #614: Fix: updated dropbox links
* PR #602: fix: EVA on Ray bugs
* PR #602: fix: EvaDB on Ray bugs
* PR #596: fix: Raise Error on Missing Files during Load
* PR #593: fix: Windows path error in S3 testcases
* PR #584: Rename Array_Count to ArrayCount
@@ -299,7 +293,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
* PR #419: Updated Dockerfile
* PR #407: Movie analysis notebook
* PR #404: feat: Lateral join + sampling
* PR #393: feat: EVA config updates
* PR #393: feat: EvaDB config updates

### [Contributors]

@@ -312,7 +306,7 @@ Thanks to @gaurav274, @xzdandy, @LordDarkula, @jarulraj, @Anirudh58, @Aryan-Rajo
* PR #372: bugfix: Make ConfigurationManager read and update operate on eva.yml (#372)
* PR #367: Dataset support (#367)
* PR #362: Automatically adding Tutorial Notebooks to docs (#362)
* PR #359: Layout for EVA Documentation (#359)
* PR #359: Layout for EvaDB Documentation (#359)
* PR #355: docs: Adding instructions for setup on M1 Mac (#355)
* PR #344: Updated the tutorial notebooks (#344)
* PR #342: Support for NOT NULL (#342)
@@ -329,7 +323,7 @@ Thanks to @gaurav274, @jarulraj, @xzdandy, @LordDarkula, @Anirudh58, @Aryan-Rajo
## [0.0.9] - 2022-08-13
### [Added]

* PR #323: Fix EVA Configuration
* PR #323: Fix EvaDB Configuration
* PR #321: CI: Caching dependencies
* PR #315: Unified load
* PR #313: Logo update

40 changes: 20 additions & 20 deletions README.md
@@ -1,8 +1,8 @@
# EVA AI-SQL Database System
# EvaDB AI-SQL Database System

<div>
<a href="https://colab.research.google.com/github/georgia-tech-db/eva/blob/master/tutorials/03-emotion-analysis.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open EVA on Colab"/>
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open EvaDB on Colab"/>
</a>
<a href="https://join.slack.com/t/eva-db/shared_invite/zt-1i10zyddy-PlJ4iawLdurDv~aIAq90Dg">
<img alt="Slack" src="https://img.shields.io/badge/slack-eva-ff69b4.svg?logo=slack">
@@ -22,13 +22,13 @@
<img alt="Python Versions" src="https://img.shields.io/badge/Python--versions-3.8%20|%203.9%20|%203.10-brightgreen"/>
</div>

<p align="center"> <b><h3>EVA DB is a database system for building simpler and faster AI-powered applications.</b></h3> </p>
<p align="center"> <b><h3>EvaDB is a database system for building simpler and faster AI-powered applications.</b></h3> </p>

EVA DB is an AI-SQL database system for developing applications powered by AI models. We aim to simplify the development and deployment of AI-powered applications that operate on structured (tables, feature stores) and unstructured data (videos, text, podcasts, PDFs, etc.).
EvaDB is an AI-SQL database system for developing applications powered by AI models. We aim to simplify the development and deployment of AI-powered applications that operate on structured (tables, feature stores) and unstructured data (videos, text, podcasts, PDFs, etc.).

EVA DB accelerates AI pipelines by 10-100x using a collection of performance optimizations inspired by time-tested SQL database systems, including data-parallel query execution, function caching, sampling, and cost-based predicate reordering. EVA supports an AI-oriented SQL-like query language tailored for analyzing both structured and unstructured data. It has first-class support for PyTorch, Hugging Face, YOLO, and Open AI models.
EvaDB accelerates AI pipelines by 10-100x using a collection of performance optimizations inspired by time-tested SQL database systems, including data-parallel query execution, function caching, sampling, and cost-based predicate reordering. EvaDB supports an AI-oriented SQL-like query language tailored for analyzing both structured and unstructured data. It has first-class support for PyTorch, Hugging Face, YOLO, and Open AI models.

The high-level SQL API allows even beginners to use EVA in a few lines of code. Advanced users can define custom user-defined functions that wrap around any AI model or Python library. EVA DB is fully implemented in Python and licensed under the Apache license.
The high-level SQL API allows even beginners to use EvaDB in a few lines of code. Advanced users can define custom user-defined functions that wrap around any AI model or Python library. EvaDB is fully implemented in Python and licensed under the Apache license.

## Quick Links

@@ -56,7 +56,7 @@ The high-level SQL API allows even beginners to use EVA in a few lines of code.

## Illustrative Applications

Here are some illustrative EVA-powered applications (each Jupyter notebook can be opened on Google Colab):
Here are some illustrative EvaDB-powered applications (each Jupyter notebook can be opened on Google Colab):

* 🔮 <a href="https://evadb.readthedocs.io/en/stable/source/tutorials/08-chatgpt.html">Using ChatGPT to ask questions based on videos</a>
* 🔮 <a href="https://evadb.readthedocs.io/en/stable/source/tutorials/02-object-detection.html">Analysing traffic flow at an intersection</a>
@@ -70,8 +70,8 @@ Here are some illustrative EVA-powered applications (each Jupyter notebook can b
## Documentation

* [Detailed Documentation](https://evadb.readthedocs.io/)
- The <a href="https://evadb.readthedocs.io/en/stable/source/overview/installation.html">Getting Started</a> page shows how you can use EVA for different AI tasks and how you can easily extend EVA to support your custom deep learning model through user-defined functions.
- The <a href="https://evadb.readthedocs.io/en/latest/source/tutorials/11-similarity-search-for-motif-mining.html">User Guides</a> section contains Jupyter Notebooks that demonstrate how to use various features of EVA. Each notebook includes a link to Google Colab, where you can run the code yourself.
- The <a href="https://evadb.readthedocs.io/en/stable/source/overview/installation.html">Getting Started</a> page shows how you can use EvaDB for different AI tasks and how you can easily extend EvaDB to support your custom deep learning model through user-defined functions.
- The <a href="https://evadb.readthedocs.io/en/latest/source/tutorials/11-similarity-search-for-motif-mining.html">User Guides</a> section contains Jupyter Notebooks that demonstrate how to use various features of EvaDB. Each notebook includes a link to Google Colab, where you can run the code yourself.
* [Tutorials](https://github.com/georgia-tech-db/eva/blob/master/tutorials/03-emotion-analysis.ipynb)
* [Join us on Slack](https://join.slack.com/t/eva-db/shared_invite/zt-1i10zyddy-PlJ4iawLdurDv~aIAq90Dg)
* [Follow us on Twitter](https://twitter.com/evadb_ai)
@@ -80,18 +80,18 @@ Here are some illustrative EVA-powered applications (each Jupyter notebook can b

## Quick Start

- Install EVA using the pip package manager. EVA supports Python versions >= 3.8:
- Install EvaDB using the pip package manager. EvaDB supports Python versions >= 3.8:

```shell
pip install evadb
```

- To launch and connect to an EVA server in a Jupyter notebook, check out this [illustrative emotion analysis notebook](https://github.com/georgia-tech-db/eva/blob/master/tutorials/03-emotion-analysis.ipynb):
- To launch and connect to an EvaDB server in a Jupyter notebook, check out this [illustrative emotion analysis notebook](https://github.com/georgia-tech-db/eva/blob/master/tutorials/03-emotion-analysis.ipynb):
```shell
cursor = connect_to_server()
```

- Load a video onto the EVA server (we use [ua_detrac.mp4](data/ua_detrac/ua_detrac.mp4) for illustration):
- Load a video onto the EvaDB server (we use [ua_detrac.mp4](data/ua_detrac/ua_detrac.mp4) for illustration):

```mysql
LOAD VIDEO "data/ua_detrac/ua_detrac.mp4" INTO TrafficVideo;
@@ -141,11 +141,11 @@ TYPE ultralytics
WHERE id < 15;
```

- **EVA runs queries faster using its AI-centric query optimizer**. Two key optimizations are:
- **EvaDB runs queries faster using its AI-centric query optimizer**. Two key optimizations are:

💾 **Caching**: EVA automatically caches and reuses previous query results (especially model inference results), eliminating redundant computation and reducing query processing time.
💾 **Caching**: EvaDB automatically caches and reuses previous query results (especially model inference results), eliminating redundant computation and reducing query processing time.

🎯 **Predicate Reordering**: EVA optimizes the order in which the query predicates are evaluated (e.g., runs the faster, more selective model first), leading to faster queries and lower inference costs.
🎯 **Predicate Reordering**: EvaDB optimizes the order in which the query predicates are evaluated (e.g., runs the faster, more selective model first), leading to faster queries and lower inference costs.
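
The two optimizations described above can be illustrated with a small, self-contained Python sketch. This is not EvaDB's implementation: `CachedModel`, `run_query`, the per-call costs, and the toy `IMAGES` data are all hypothetical names and values chosen for illustration. The sketch memoizes each model's results and, before running a query, sorts the predicates by how much uncached work each one still needs.

```python
from typing import Callable, Dict, Hashable, List, Tuple


class CachedModel:
    """Hypothetical stand-in for a model UDF: memoizes results per input."""

    def __init__(self, name: str, fn: Callable[[Hashable], str], cost: float):
        self.name = name
        self.fn = fn
        self.cost = cost  # assumed per-call inference cost (made-up units)
        self.cache: Dict[Hashable, str] = {}
        self.calls = 0  # number of real (uncached) invocations

    def __call__(self, key: Hashable) -> str:
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.fn(key)
        return self.cache[key]

    def remaining_cost(self, keys: List[Hashable]) -> float:
        # Only inputs without a cached result still cost anything.
        return self.cost * sum(1 for k in keys if k not in self.cache)


Predicate = Tuple[CachedModel, str]


def run_query(rows: List[Hashable], predicates: List[Predicate]) -> List[Hashable]:
    # Evaluate the predicate with the least outstanding work first;
    # all() short-circuits, so later models skip rows already rejected.
    ordered = sorted(predicates, key=lambda p: p[0].remaining_cost(rows))
    return [r for r in rows if all(m(r) == want for m, want in ordered)]


# Toy data (hypothetical): image id -> (breed, color).
IMAGES = {
    1: ("great dane", "black"),
    2: ("great dane", "white"),
    3: ("poodle", "black"),
    4: ("poodle", "white"),
}
breed = CachedModel("breed", lambda i: IMAGES[i][0], cost=10.0)
color = CachedModel("color", lambda i: IMAGES[i][1], cost=1.0)
rows = list(IMAGES)

# First query runs the expensive breed model on every row and fills its cache.
first = run_query(rows, [(breed, "great dane")])

# Second query lists the color predicate first, but the breed results are
# already cached, so breed is evaluated first and color runs on 2 rows only.
second = run_query(rows, [(color, "black"), (breed, "great dane")])

print(first, second, breed.calls, color.calls)  # [1, 2] [1] 4 2
```

In this toy setup the second query triggers no new calls to the expensive model, which is the effect the README attributes to combining cached inference results with predicate reordering.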

Consider these two exploratory queries on a dataset of dog images:
<img align="right" style="display:inline;" width="40%" src="https://github.com/georgia-tech-db/eva/blob/master/data/assets/eva_performance_comparison.png?raw=true"></a>
@@ -165,11 +165,11 @@ Consider these two exploratory queries on a dataset of dog images:
AND Color(Crop(data, bbox)) = 'black';
```

By reusing the results of the first query and reordering the predicates based on the available cached inference results, EVA runs the second query **10x faster**!
By reusing the results of the first query and reordering the predicates based on the available cached inference results, EvaDB runs the second query **10x faster**!

## Architecture Diagram

This diagram presents the key components of EVA DB. EVA's AI-centric Query Optimizer takes a parsed query as input and generates a query plan that is then executed by the Query Engine. The Query Engine hits multiple storage engines to retrieve the data required for efficiently running the query:
This diagram presents the key components of EvaDB. EvaDB's AI-centric Query Optimizer takes a parsed query as input and generates a query plan that is then executed by the Query Engine. The Query Engine hits multiple storage engines to retrieve the data required for efficiently running the query:
1. Structured data (SQL database system connected via `sqlalchemy`).
2. Unstructured media data (on cloud buckets or local filesystem).
3. Vector data (vector database system).
@@ -202,10 +202,10 @@ This diagram presents the key components of EVA DB. EVA's AI-centric Query Optim

## Community and Support

👋 If you have general questions about EVA, want to say hello or just follow along, we'd like to invite you to join our [Slack Community](https://join.slack.com/t/eva-db/shared_invite/zt-1i10zyddy-PlJ4iawLdurDv~aIAq90Dg)and to [follow us on Twitter](https://twitter.com/evadb_ai).
👋 If you have general questions about EvaDB, want to say hello or just follow along, we'd like to invite you to join our [Slack Community](https://join.slack.com/t/eva-db/shared_invite/zt-1i10zyddy-PlJ4iawLdurDv~aIAq90Dg)and to [follow us on Twitter](https://twitter.com/evadb_ai).

<a href="https://join.slack.com/t/eva-db/shared_invite/zt-1i10zyddy-PlJ4iawLdurDv~aIAq90Dg">
<img src="https://raw.githubusercontent.com/georgia-tech-db/eva/master/docs/images/eva/eva-slack.png" alt="EVA Slack Channel" width="500">
<img src="https://raw.githubusercontent.com/georgia-tech-db/eva/master/docs/images/eva/eva-slack.png" alt="EvaDB Slack Channel" width="500">
</a>

If you run into any problems or issues, please create a Github issue and we'll try our best to help.
@@ -218,7 +218,7 @@ Don't see a feature in the list? Search our issue tracker if someone has already
[![CI Status](https://circleci.com/gh/georgia-tech-db/eva.svg?style=svg)](https://circleci.com/gh/georgia-tech-db/eva)
[![Documentation Status](https://readthedocs.org/projects/evadb/badge/?version=latest)](https://evadb.readthedocs.io/en/latest/index.html)

EVA is the beneficiary of many [contributors](https://github.com/georgia-tech-db/eva/graphs/contributors). All kinds of contributions to EVA are appreciated. To file a bug or to request a feature, please use <a href="https://github.com/georgia-tech-db/eva/issues">GitHub issues</a>. <a href="https://github.com/georgia-tech-db/eva/pulls">Pull requests</a> are welcome.
EvaDB is the beneficiary of many [contributors](https://github.com/georgia-tech-db/eva/graphs/contributors). All kinds of contributions to EvaDB are appreciated. To file a bug or to request a feature, please use <a href="https://github.com/georgia-tech-db/eva/issues">GitHub issues</a>. <a href="https://github.com/georgia-tech-db/eva/pulls">Pull requests</a> are welcome.

For more information, see our
[contribution guide](https://evadb.readthedocs.io/en/stable/source/contribute/index.html).

6 changes: 3 additions & 3 deletions RELEASING.md
@@ -1,8 +1,8 @@
# EVA Release Guide
# EvaDB Release Guide

## Before You Start

Make sure you have [PyPI](https://pypi.org) account with maintainer access to the EVA project.
Make sure you have [PyPI](https://pypi.org) account with maintainer access to the EvaDB project.
Create a .pypirc in your `$HOME` directory.

It should look like this (contain your PyPI credentials):
@@ -102,7 +102,7 @@ Then run `chmod 600 ./.pypirc` so only you can read/write.
git push --set-upstream origin bump-v0.9.1+dev


1. Add the new tag to [the EVA project on ReadTheDocs](https://readthedocs.org/projects/evadb),
1. Add the new tag to [the EvaDB project on ReadTheDocs](https://readthedocs.org/projects/evadb),
* Trigger a build for main to pull new tags.
* Go to the "Versions" tab, and "Activate" the new tag.
* Go to Admin/Advanced to set this tag as the new default version.

4 changes: 2 additions & 2 deletions apps/story_qa/README.md
@@ -1,11 +1,11 @@
# Story Question and Answering
This example demonstrates the capability of EvaDBDB in extracting embedding from texts, building similarity index, searching similar sources, and using LLM to answer question based on that. For this example, we use "War and Peace" story as the source for our demonstration purpose.
This example demonstrates the capability of EvaDB in extracting embedding from texts, building similarity index, searching similar sources, and using LLM to answer question based on that. For this example, we use "War and Peace" story as the source for our demonstration purpose.

## Hardware Setup
For all examples in this folder, the performance results on measured on a server with AMD EPYC 7452 32-Cores CPU with `256`GB memory and one A40 NVIDIA GPU, which has `48`GB GPU memory.

## Single Question Answering
The major performance benefit of EvaDBDB in single question answering comes from its capability of parallelizing the feature extraction step.
The major performance benefit of EvaDB in single question answering comes from its capability of parallelizing the feature extraction step.

### How to Run
```bash

6 changes: 3 additions & 3 deletions apps/youtube_qa/youtube_qa.py
@@ -41,7 +41,7 @@ def analyze_video():
    """Extracts speech from video for llm processing.

    Returns:
        EvaDBDBCursor: evadb api cursor.
        EvaDBCursor: evadb api cursor.
    """
    print("Analyzing video. This may take a while...")
    start = time.time()
@@ -78,8 +78,8 @@ def cleanup():
    """Removes any temporary file / directory created by EvaDB."""
    if os.path.exists("online_video.mp4"):
        os.remove("online_video.mp4")
    if os.path.exists("eva_data"):
        shutil.rmtree("eva_data")
    if os.path.exists("evadb_data"):
        shutil.rmtree("evadb_data")


if __name__ == "__main__":

6 changes: 3 additions & 3 deletions docker/Dockerfile
@@ -26,11 +26,11 @@ RUN chown -R evauser:evauser /app
USER evauser
ENV PATH="/home/evauser/.local/bin:${PATH}"

# Install EVA
# Install EvaDB
RUN python3.9 -m pip install evadb

# Expose the default port EVA runs on
# Expose the default port EvaDB runs on
EXPOSE 8803

# Start EVA
# Start EvaDB
CMD ["eva_server"]