Merging all the codes of dev to main branch. #9

Merged 64 commits on Oct 19, 2022
37650b3
[ML] Improve NLP model import by using nicely defined types (#459)
benwtrent May 3, 2022
74245f5
[ML] add support for question_answering NLP tasks (#457)
benwtrent May 4, 2022
f0b7272
[ML] improve general pytorch model import and add tests (#463)
benwtrent May 5, 2022
f468edb
Release 8.2.0
sethmlarson May 5, 2022
893c89d
[ML] fixes decision tree classifier upload to account for probabiliti…
benwtrent May 17, 2022
3ec7e6a
Add authentication methods for import model script (#466)
lcawl May 18, 2022
3a2353d
Ignore type checking for `agg_value`
technige May 31, 2022
29d3498
[DOCS] Adds question_answering task type for eland_import_hub_model
lcawl May 31, 2022
9865bed
Stop explicitly pulling master
sethmlarson May 31, 2022
205f989
Remove 'numpydoc' to stop reformatting
sethmlarson May 31, 2022
07b088a
Also pin traitlets
sethmlarson May 31, 2022
003a47d
[DOCS] Include missing attributes (#468)
lcawl May 31, 2022
becb9b4
[ML] ensure quantization is applied (#472)
benwtrent Jun 15, 2022
48969b9
Freeze the traced PyTorch model
davidkyle Jun 21, 2022
c8e1138
Bump minimum PyTorch version to 1.11
davidkyle Jun 21, 2022
3ba0ec8
[ML] adds new auto task type that attempts to automatically determine…
benwtrent Jun 23, 2022
1c6e8c1
added opensearch as dependency
LEFTA98 Jul 1, 2022
6bdaa33
replaced core mentions of elasticsearch client w opensearch
LEFTA98 Jul 1, 2022
0044256
changed index names for testing
LEFTA98 Jul 5, 2022
699c6bd
modified test dataframes to accommodate opensearch indexing
LEFTA98 Jul 6, 2022
eb95724
fixed aggregatable field name tests
LEFTA98 Jul 7, 2022
b4e1a71
fixing pytests that mention indices of ed/pd dataframes
LEFTA98 Jul 7, 2022
6323b05
fixed equality boolean filter to accommodate terminology difference i…
LEFTA98 Jul 7, 2022
70b688f
fixed pytests with indexing issues, geolocation field renaming issues
LEFTA98 Jul 7, 2022
849d45f
modified test setup code to work for opensearch
LEFTA98 Jul 8, 2022
f1c43f7
reverted many erroneous "fixes" to tests
LEFTA98 Jul 8, 2022
ee41ba6
fixed opensearch integration so remaining non-ml tests run
LEFTA98 Jul 8, 2022
c0a227a
added initial connection to predicting with sagemaker
LEFTA98 Jul 19, 2022
100aa13
added sagemaker predict api
LEFTA98 Jul 21, 2022
397a65a
added band-aid to fix iterating over rows
LEFTA98 Jul 25, 2022
a9952f1
debugging indexing issue
LEFTA98 Jul 25, 2022
691797e
reverted indexing change for sagemaker predict
LEFTA98 Jul 28, 2022
b8887c5
added deprecation warnings to ml module
LEFTA98 Aug 2, 2022
f8f6420
refactoring elasticsearch names to opensearch
LEFTA98 Aug 2, 2022
995912e
continued renaming opensearch variables
LEFTA98 Aug 3, 2022
7a4ee0a
more renaming changes
LEFTA98 Aug 4, 2022
e2d547f
first commit for ml common integration
LEFTA98 Aug 11, 2022
afa0446
PoC for model upload
LEFTA98 Aug 12, 2022
881a228
renamed model chunk uploading path
LEFTA98 Aug 23, 2022
3c5fd17
added total chunks to model upload
LEFTA98 Aug 24, 2022
9503bc1
fixed docstring typo
LEFTA98 Aug 31, 2022
6318e3e
added first iteration of custom model load supprot
LEFTA98 Sep 6, 2022
12e4fae
removed unsupported features
LEFTA98 Sep 6, 2022
e675a6a
renaming all instances of elastic in code
LEFTA98 Sep 8, 2022
11364b0
created new dev requirements file
LEFTA98 Sep 8, 2022
d2f2768
typo fix
LEFTA98 Sep 9, 2022
27f89e7
PR feedback
LEFTA98 Sep 6, 2022
7c22abf
implement PR feedback
LEFTA98 Sep 6, 2022
50395ad
PR feedback
LEFTA98 Sep 6, 2022
f909aed
implement pr feedback
LEFTA98 Sep 9, 2022
a540cae
Update README.md
LEFTA98 Sep 9, 2022
a56871d
added demo materials
LEFTA98 Sep 9, 2022
1995a65
refactoring
dhrubo-os Sep 29, 2022
de7a635
refactoring code and changed code to address some of the deprection w…
dhrubo-os Oct 3, 2022
9ed7f6f
adding header license info
dhrubo-os Oct 4, 2022
c2033b5
formatted code with black, isort, mypy
dhrubo-os Oct 4, 2022
4c216fc
updating git ci workflow
dhrubo-os Oct 4, 2022
c930411
refactoring code + adding pytest in the ci workflow
dhrubo-os Oct 7, 2022
10e99ed
removing test from ci workflow
dhrubo-os Oct 7, 2022
0a7b43b
setup CI for integration test
dhrubo-os Oct 12, 2022
9a1293b
resolving conflicts
dhrubo-os Oct 19, 2022
f8e57ad
adding files required for CI
dhrubo-os Oct 19, 2022
5d916a3
adding files which got deleted during merge
dhrubo-os Oct 19, 2022
07800e6
adding deleted files by git merge
dhrubo-os Oct 19, 2022
resolving conflicts
dhrubo-os committed Oct 19, 2022
commit 9a1293b0058521ca6746de8e470fb828be66253c
4 changes: 4 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,4 @@
## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.
145 changes: 37 additions & 108 deletions CONTRIBUTING.md
@@ -1,133 +1,62 @@
# Contributing to eland
# Contributing Guidelines

Eland is an open source project and we love to receive contributions
from our community --- you! There are many ways to contribute, from
writing tutorials or blog posts, improving the documentation, submitting
bug reports and feature requests or writing code which can be
incorporated into eland itself.
Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
documentation, we greatly value feedback and contributions from our community.

## Bug reports
Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
information to effectively respond to your bug report or contribution.

If you think you have found a bug in eland, first make sure that you are
testing against the [latest version of
eland](https://github.com/elastic/eland) - your issue may already have
been fixed. If not, search our [issues
list](https://github.com/elastic/eland/issues) on GitHub in case a
similar issue has already been opened.

It is very helpful if you can prepare a reproduction of the bug. In
other words, provide a small test case which we can run to confirm your
bug. It makes it easier to find the problem and to fix it. Test cases
should be provided as python scripts, ideally with some details of your
Elasticsearch environment and index mappings, and (where appropriate) a
pandas example.
## Reporting Bugs/Feature Requests

Provide as much information as you can. You may think that the problem
lies with your query, when actually it depends on how your data is
indexed. The easier it is for us to recreate your problem, the faster it
is likely to be fixed.
We welcome you to use the GitHub issue tracker to report bugs or suggest features.

## Feature requests
When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:

If you find yourself wishing for a feature that doesn\'t exist in eland,
you are probably not alone. There are bound to be others out there with
similar needs. Many of the features that eland has today have been added
because our users saw the need. Open an issue on our [issues
list](https://github.com/elastic/eland/issues) on GitHub which describes
the feature you would like to see, why you need it, and how it should
work.
* A reproducible test case or series of steps
* The version of our code being used
* Any modifications you've made relevant to the bug
* Anything unusual about your environment or deployment

## Contributing code and documentation changes

If you have a bugfix or new feature that you would like to contribute to
eland, please find or open an issue about it first. Talk about what you
would like to do. It may be that somebody is already working on it, or
that there are particular issues that you should know about before
implementing the change.
## Contributing via Pull Requests
Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:

We enjoy working with contributors to get their code accepted. There are
many approaches to fixing a problem and it is important to find the best
approach before writing too much code.
1. You are working against the latest source on the *main* branch.
2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
3. You open an issue to discuss any significant work - we would hate for your time to be wasted.

Note that it is unlikely the project will merge refactors for the sake
of refactoring. These types of pull requests have a high cost to
maintainers in reviewing and testing with little to no tangible benefit.
This especially includes changes generated by tools.
To send us a pull request, please:

The process for contributing to any of the [Elastic
repositories](https://github.com/elastic/) is similar. Details for
individual projects can be found below.
1. Fork the repository.
2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
3. Ensure local tests pass.
4. Commit to your fork using clear commit messages.
5. Send us a pull request, answering any default questions in the pull request interface.
6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.

### Fork and clone the repository
GitHub provides additional document on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).

You will need to fork the main eland code or documentation repository
and clone it to your local machine. See [github help
page](https://docs.github.com/en/free-pro-team@latest/github/getting-started-with-github/fork-a-repo) for help.

Further instructions for specific projects are given below.
## Finding contributions to work on
Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.

### Submitting your changes

Once your changes and tests are ready to submit for review:
## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.

1. Run the linter and test suite to ensure your changes do not break the existing code:

(TODO Add link to the testing document)
## Security issue notifications
If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue.

``` bash
# Run Auto-format, lint, mypy type checker for your changes
$ nox -s format

# Run the test suite
$ pytest --doctest-modules eland/ tests/
$ pytest --nbval tests/notebook/

```

2. Sign the Contributor License Agreement

Please make sure you have signed our [Contributor License Agreement](https://www.elastic.co/contributor-agreement/).
We are not asking you to assign copyright to us, but to give us the right to distribute your code without restriction.
We ask this of all contributors in order to assure our users of the origin and continuing existence of the code.
You only need to sign the CLA once.

3. Rebase your changes

Update your local repository with the most recent code from the main
eland repository, and rebase your branch on top of the latest main
branch. We prefer your initial changes to be squashed into a single
commit. Later, if we ask you to make changes, add them as separate
commits. This makes them easier to review. As a final step before
merging we will either ask you to squash all commits yourself or
we\'ll do it for you.

4. Submit a pull request

Push your local changes to your forked copy of the repository and
[submit a pull
request](https://docs.github.com/en/free-pro-team@latest/github/collaborating-with-issues-and-pull-requests/proposing-changes-to-your-work-with-pull-requests) .
In the pull request, choose a title which sums up the changes that you
have made, and in the body provide more details about what your
changes do. Also mention the number of the issue where discussion
has taken place, eg "Closes \#123".

Then sit back and wait. There will probably be discussion about the pull
request and, if any changes are needed, we would love to work with you
to get your pull request merged into `eland` .

Please adhere to the general guideline that you should never force push
to a publicly shared branch. Once you have opened your pull request, you
should consider your branch publicly shared. Instead of force pushing
you can just add incremental commits; this is generally easier on your
reviewers. If you need to pick up changes from main, you can merge
main into your branch. A reviewer might ask you to rebase a
long-running pull request in which case force pushing is okay for that
request. Note that squashing at the end of the review process should
also not be done, that can be done when the pull request is [integrated
via GitHub](https://github.com/blog/2141-squash-your-commits).

## Contributing to the eland codebase
## Licensing

See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
**Repository:** <https://github.com/elastic/eland>

We internally develop using the PyCharm IDE. For PyCharm, we are
28 changes: 1 addition & 27 deletions LICENSE.txt → LICENSE
@@ -1,3 +1,4 @@

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
@@ -172,30 +173,3 @@
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
1 change: 1 addition & 0 deletions NOTICE
@@ -0,0 +1 @@
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
52 changes: 0 additions & 52 deletions NOTICE.txt

This file was deleted.

9 changes: 0 additions & 9 deletions setup.cfg

This file was deleted.

Binary file removed tests/anonreviews.csv.gz
Binary file not shown.
Empty file.
Binary file removed tests/ecommerce.json.gz
Binary file not shown.
Binary file removed tests/ecommerce_df.json.gz
Binary file not shown.
Binary file removed tests/flights.json.gz
Binary file not shown.
Binary file removed tests/flights_df.json.gz
Binary file not shown.
Binary file removed tests/flights_small.json.gz
Binary file not shown.