Merge branch 'instruct-lab:main' into word_chain_game
spk-hebbar authored Mar 25, 2024
2 parents 69a02ae + 5041065 commit 3e33276
Showing 182 changed files with 13,765 additions and 532 deletions.
36 changes: 28 additions & 8 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -1,22 +1,42 @@
---
name: Bug report
name: Bug/Problem report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.
<!-- If you want to report a problem with the current model or taxonomy, please, fill out the following questionnaire. If the questionnaire doesn't match the type of problem you want to report, just delete the sections related to the model. -->

**Describe the bug/problem**

<!-- A concise description of what the problem is, replace "..." in the bullet list. -->

- ...
- ...
- ...

**Input given at the prompt**
What you entered.

<!-- What you entered, replace "..." -->

```
...
```

**Response that was received from the current model**

<!-- What you received from the current model in response to your input,
replace "..." -->

**Response that was received**
What you received in response to your input.
```
...
```

**Response that you expected instead**
<!-- What you expected to receive instead, replace "...". -->

**Response that was expected**
What you expected to receive instead.
```
...
```
40 changes: 40 additions & 0 deletions .github/ISSUE_TEMPLATE/proposal.md
@@ -0,0 +1,40 @@
---
name: Proposal
about: Create a contribution proposal
title: ''
labels: ''
assignees: ''

---

**Describe the proposed contribution to the taxonomy**

<!-- A concise description of what the proposed contribution would bring, replace "..." in the bullet list. -->

- ...
- ...
- ...

**Input given at the prompt**

<!-- What you entered, replace "..." -->

```
...
```

**Response from the current model**

<!-- What you received from the current model in response to your input,
replace "..." -->

```
...
```

**Response that you would expect instead with the contribution**
<!-- What you expect to receive instead with the finetuned model, replace "...". -->

```
...
```
42 changes: 42 additions & 0 deletions .github/labeler.yml
@@ -0,0 +1,42 @@
# Copyright The InstructLab Authors, 2024
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

ci:
- changed-files:
- any-glob-to-any-file:
- .github/scripts/**
- .github/workflows/**
- .github/*.yml

documentation:
- changed-files:
- any-glob-to-any-file:
- "*.md"
- docs/**

knowledge:
- changed-files:
- any-glob-to-any-file:
- knowledge/**

skill:
- changed-files:
- any-glob-to-any-file:
- compositional_skills/**

triage-needed:
- changed-files:
- any-glob-to-any-file:
- compositional_skills/**
- knowledge/**
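Reviewer note: a rough sketch of which labels the globs above would yield for a changed path. `labels_for` is a hypothetical helper written for illustration, not code from actions/labeler. It assumes the action's documented glob behaviour, where `*` does not cross a `/` (so `*.md` and `.github/*.yml` match only direct children) while `**` does.

```shell
# Hypothetical helper: approximates how the globs in labeler.yml map one
# changed path to labels. Illustration only, not the labeler's own matcher.
labels_for() {
  path=$1
  labels=""
  # ci: .github/scripts/**, .github/workflows/**, .github/*.yml
  case $path in
    .github/scripts/*|.github/workflows/*) labels="$labels ci" ;;
    .github/*/*) ;;   # deeper paths are assumed not to match .github/*.yml
    .github/*.yml) labels="$labels ci" ;;
  esac
  # documentation: "*.md" (top level only, by assumption) and docs/**
  case $path in
    docs/*) labels="$labels documentation" ;;
    */*) ;;           # nested files are assumed not to match "*.md"
    *.md) labels="$labels documentation" ;;
  esac
  # knowledge and skill paths also receive triage-needed
  case $path in
    knowledge/*) labels="$labels knowledge triage-needed" ;;
    compositional_skills/*) labels="$labels skill triage-needed" ;;
  esac
  echo "${labels# }"
}

labels_for compositional_skills/writing/qna.yaml
labels_for .github/ISSUE_TEMPLATE/bug_report.md
```

Under these assumptions, the issue templates touched by this commit would get no label, because they are `.md` files below the top level and outside `docs/`.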
22 changes: 14 additions & 8 deletions .github/pull_request_template.md
@@ -21,18 +21,23 @@ following questionnaire with whatever information is applicable to your PR.
```


**Response that was received**
**Response from the original model**

<!-- What you received in response to your input, replace "..." -->

<!-- What you received from the original model in response to your input,
replace "..." -->

```
...
```


**Response that is now received instead**
**Response from the fine-tuned model**


<!-- What you receive with your contribution, replace "..." -->
<!-- Generate a synthetic dataset based on your newly added seed data; train the model
with the synthetic data and now re-test the model's response with the same prompt.
Replace "..." with what you receive with the finetuned model. -->

```
...
@@ -42,7 +47,8 @@ following questionnaire with whatever information is applicable to your PR.

<!-- Insert an x between the empty brackets: [ ] >> [x] -->

- [ ] tested contribution with `lab generate`
- [ ] `lab generate` does not produce any warnings or errors
- [ ] all [commits are signed off](https://github.com/instruct-lab/taxonomy/blob/main/CONTRIBUTING.md#legal) (DCO)
- [ ] the `qna.yaml` file was [linted](https://yamllint.com)
- [ ] The contribution was tested with `lab generate`
- [ ] No errors or warnings were produced by `lab generate`
- [ ] All [commits are signed off](https://github.com/instruct-lab/taxonomy/blob/main/CONTRIBUTING.md#legal) (DCO)
- [ ] The `qna.yaml` file contains at least 5 `seed_examples`
- [ ] The `qna.yaml` file was [linted](https://yamllint.com) and [prettified](https://onlineyamltools.com/prettify-yaml) ([yaml-validator](https://jsonformatter.org/yaml-validator) can do both)
7 changes: 7 additions & 0 deletions .github/schemas/knowledge_schema.yaml
@@ -0,0 +1,7 @@
seed_examples: list(include('seed'), min=5)
task_description: str(min=1)
domain: str(required=False)
---
seed:
answer: str(min=1)
question: str(min=1)
7 changes: 7 additions & 0 deletions .github/schemas/skills_extraction_schema.yaml
@@ -0,0 +1,7 @@
seed_examples: list(include('seed'), min=5)
task_description: str(min=1)
---
seed:
answer: str(min=1)
context: str(min=1)
question: str(min=1)
7 changes: 7 additions & 0 deletions .github/schemas/skills_freeform_schema.yaml
@@ -0,0 +1,7 @@
seed_examples: list(include('seed'), min=5)
task_description: str(min=1)
---
seed:
answer: str(min=1)
context: str(min=1,required=False)
question: str(min=1)
7 changes: 7 additions & 0 deletions .github/schemas/skills_grounded_schema.yaml
@@ -0,0 +1,7 @@
seed_examples: list(include('seed'), min=5)
task_description: str(min=1)
---
seed:
answer: str(min=1)
context: str(min=1)
question: str(min=1)
45 changes: 45 additions & 0 deletions .github/scripts/check-yaml.sh
@@ -0,0 +1,45 @@
#!/usr/bin/env bash
# Copyright The InstructLab Authors, 2024
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

if [ $# -lt 1 ]; then
echo "Usage: check-yaml.sh [CHANGED_FILES] - Ensure changed Taxonomy YAML files follow proper schema"
exit 1
fi

CHANGED_FILES="$@"
ERR=0
error() { echo "ERROR: $file:$@" 1>&2; ERR=1; }
warn() { echo "WARN: $file:$@" 1>&2; }
for file in ${CHANGED_FILES}; do
case $file in knowledge*)
error "1:1: We do not accept knowledge PRs at this time"
esac
if [[ "$file" != *"/qna.yaml" ]]; then
error "1:1: Skills file has to be named 'qna.yaml', not '$(basename $file)'"
fi
yq '.created_by | length > 0' $file | grep -q false && error "$(yq '.created_by|line' $file):1: missing/empty 'created_by'"
yq '.task_description | length > 0' $file | grep -q false && warn "$(yq '.task_description|line' $file):1: missing/empty 'task_description'"
yq '.seed_examples' $file | grep -q null && error "$(yq '.seed_examples|line' $file):1: missing 'seed_examples'"
yq '.seed_examples | length >= 5' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: less than 5 'seed_examples'"
yq '.seed_examples[] | .question | length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'question's"
yq '.seed_examples[] | .answer | length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'answer's"
if $( yq '.seed_examples[] | has("context")' $file | grep -q true ); then
yq '.seed_examples[] | .context| length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'context's"
fi
yq '.seed_examples[].attribution | length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'attribution's"
yq '.seed_examples[].attribution[].source | length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'attribution source's"
yq '.seed_examples[].attribution[].license | length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'attribution license's"
done
exit $ERR
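Reviewer note: the two path-based checks at the top of the loop can be tried without `yq`. The sketch below is a hypothetical `check_path` function that mirrors only those two checks and prints messages in the same `ERROR: file:line:col: message` shape. The content checks are left out, and unlike the script it does not set an exit status.

```shell
# Hypothetical helper mirroring only the path checks of check-yaml.sh.
# The yq-based content checks are omitted so this runs without dependencies.
check_path() {
  file=$1
  # Anything whose path starts with "knowledge" is rejected outright.
  case $file in
    knowledge*) echo "ERROR: $file:1:1: We do not accept knowledge PRs at this time" ;;
  esac
  # The file must be named qna.yaml and sit inside a directory.
  case $file in
    */qna.yaml) ;;
    *) echo "ERROR: $file:1:1: Skills file has to be named 'qna.yaml', not '$(basename "$file")'" ;;
  esac
}

check_path compositional_skills/writing/qna.yaml   # prints nothing
check_path compositional_skills/writing/skill.yml  # prints the naming error
check_path knowledge/science/qna.yaml              # prints the knowledge error
```

Note that the pattern requires a `/` before `qna.yaml`, so a `qna.yaml` at the repository root would also be reported.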
30 changes: 30 additions & 0 deletions .github/workflows/label.yml
@@ -0,0 +1,30 @@
# Copyright The InstructLab Authors, 2024
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

name: "Pull Request Labeler"

on:
- pull_request_target

jobs:
labeler:
permissions:
contents: read
pull-requests: write
runs-on: ubuntu-latest
steps:
- uses: actions/labeler@v5
with:
repo-token: "${{ secrets.GITHUB_TOKEN }}"
sync-labels: true
32 changes: 7 additions & 25 deletions .github/workflows/lint.yml
@@ -32,6 +32,10 @@ on:
- knowledge/**/*.yaml
- knowledge/**/*.yml

defaults:
run:
shell: bash

jobs:
lint:
runs-on: ubuntu-latest
@@ -41,7 +45,7 @@ jobs:
with:
fetch-depth: 0

- name: "Find changed skills files"
- name: "Find changed skills and knowledge files"
id: changed-files
uses: tj-actions/changed-files@v42
with:
@@ -59,31 +63,9 @@ jobs:
# use yq to verify YAML fields exist and generate a matchable log message with line number and error severity
# the added matcher .github/workflows/matchers/lint.json will parse those log messages to generate annotation
# the generated annotations are then applied in the GH PR Changed Files view under the changed line of code
- name: "Check skills file contents"
- name: "Check file contents"
if: steps.changed-files.outputs.any_changed == 'true'
env:
CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
run: |
echo "::add-matcher::.github/workflows/matchers/lint.json"
ERR=0
error() { echo "ERROR: $file:$@" 1>&2; ERR=1; }
warn() { echo "WARN: $file:$@" 1>&2; }
echo
for file in ${CHANGED_FILES}; do
case $file in knowledge*)
error "1:1: We do not accept knowledge PRs at this time"
esac
if [[ "$file" != *"/qna.yaml" ]]; then
error "1:1: Skills file has to be named 'qna.yaml', not '$(basename $file)'"
fi
yq '.created_by | length > 0' $file | grep -q false && warn "$(yq '.created_by|line' $file):1: missing/empty 'created_by'"
yq '.task_description | length > 0' $file | grep -q false && warn "$(yq '.task_description|line' $file):1: missing/empty 'task_description'"
yq '.seed_examples' $file | grep -q null && error "$(yq '.seed_examples|line' $file):1: missing 'seed_examples'"
yq '.seed_examples | length >= 3' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: less than 3 'seed_examples'"
yq '.seed_examples[] | .question | length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'question's"
yq '.seed_examples[] | .answer | length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'answer's"
if $( yq '.seed_examples[] | has("context")' $file | grep -q true ); then
yq '.seed_examples[] | .context| length > 0' $file | grep -q false && error "$(yq '.seed_examples|line' $file):1: missing/empty 'context's"
fi
done
exit $ERR
.github/scripts/check-yaml.sh "${{ steps.changed-files.outputs.all_changed_files }}"
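Reviewer note: the workflow now passes the whole changed-file list as one quoted argument, and the script recovers the individual paths through unquoted word splitting. The sketch below uses a hypothetical `list_files` stand-in for that loop. It assumes the list is space-separated and joins its arguments with `"$*"` in place of the script's `"$@"` assignment.

```shell
# Hypothetical stand-in for the loop in check-yaml.sh: join all arguments
# into one string, then let unquoted expansion split it on whitespace.
list_files() {
  CHANGED_FILES="$*"
  for file in ${CHANGED_FILES}; do
    echo "file=$file"
  done
}

# One quoted argument and two separate arguments give the same result.
list_files "a/qna.yaml b/qna.yaml"
list_files a/qna.yaml b/qna.yaml

# A path containing a space is split into two bogus entries.
list_files "my skill/qna.yaml"
```

The last call shows the trade-off of this approach: paths with spaces are not handled, which may be acceptable if taxonomy paths never contain them.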
6 changes: 6 additions & 0 deletions .gitignore
@@ -130,6 +130,12 @@ dmypy.json

scratch.ipynb

# IDEs
.vscode/
.idea/

# Mac personalization files
.DS_Store

# Ignore config.yaml from the cli
config.yaml
5 changes: 4 additions & 1 deletion CODEOWNERS
@@ -8,4 +8,7 @@
# https://help.github.com/en/articles/about-code-owners
#

* @instruct-lab/taxonomy-maintainers
* @instruct-lab/taxonomy-approvers
*.md @mairin @juliadenham @kelbrown20 @obuzek @lehors @instruct-lab/taxonomy-approvers
/.github/ @instruct-lab/taxonomy-maintainers @instruct-lab/taxonomy-approvers
/docs/ @mairin @juliadenham @kelbrown20 @obuzek @lehors @instruct-lab/taxonomy-approvers
17 changes: 15 additions & 2 deletions CONTRIBUTING.md
@@ -24,7 +24,9 @@ The whole point of InstructLab 🥼 is to enable true collaborative development

So, the general gist of making a contribution to this project consists of writing an extension to the existing taxonomy, make a pull request, and your work will be reviewed and merged so that it can benefit the whole community.

If you're planning on making a large addition or if you find a problem with the existing taxonomy, it's best to [write an issue](https://github.com/instruct-lab/cli/issues/new?assignees=&labels=&template=feature_request.md&title=) first to discuss it with maintainers.
Before investing time into developing a contribution, it's best to [open an issue](https://github.com/instruct-lab/taxonomy/issues/new?assignees=&labels=&template=proposal.md&title=) first to discuss your proposal idea with the maintainers.

You should also review what others have already proposed, in open issues and pull requests, to avoid duplicating efforts. You might instead be able to join forces with them by providing input to what they have started.

To contribute to this repo, you'll use the Fork and Pull model common in many open source repositories. For details on this process, check out [The GitHub Workflow
Guide](https://github.com/kubernetes/community/blob/master/contributors/guide/github-workflow.md)
@@ -62,7 +64,18 @@ Once you've [created a pull request](#how-can-i-contribute), maintainers will re
We have tried to make it as easy as possible to make contributions.
This applies to how we handle the legal aspects of contribution.
We use the same approach - the [Developer's Certificate of Origin 1.1 (DCO)][DCO] - that [the Linux Kernel community uses][Linux-DCO] to manage code contributions.
For this project, unless the file says otherwise, the relevant open source license is [the Apache License, Version 2.0](LICENSE).

For this project, unless the file says otherwise, or unless the attributed source provided in the file says otherwise, the relevant open source license is [the Apache License, Version 2.0](LICENSE).
All contributions that leverage third party content should either come from the public domain or be licensed with an open data license that does not restrict commercial use or the creation of derivative works, including the following license types:

- CC0-1.0
- CDLA-Permissive-2.0
- CC-BY-4.0
- Apache-2.0
- MIT

Any third party content contributed to this project undergoes modifications in order to formulate it in the templated format required for submission to this project.

We simply ask that when submitting a patch for review, the developer must include a sign-off statement in the commit message.

Here is an example `Signed-off-by` line, which indicates that the submitter accepts the DCO: