-
Notifications
You must be signed in to change notification settings - Fork 26
Documentation improvements #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@carlosms could you help me please with |
@smacker of course |
man, it's really awesome! Thanks a lot. You should write more documentation for our projects 😂 |
README.md
Outdated
@@ -1,16 +1,23 @@ | |||
# Source Code Annotation application | |||
[](https://travis-ci.org/src-d/go-git) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's update it to src-d/code-annotation
README.md
Outdated
CREATE TABLE files ( | ||
blob_id_a TEXT, repository_id_a TEXT, commit_hash_a TEXT, path_a TEXT, content_a TEXT, | ||
blob_id_b TEXT, repository_id_b TEXT, commit_hash_b TEXT, path_b TEXT, content_b TEXT, | ||
score DOUBLE PRECISION); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep README concise and maintainable, I suggest to remove DDL from here and just link to the relevant place in FS/code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, can you point to such place? It's expected input schema. Not schema created by our tool.
Or do you suggest to use cli/examples/import/example.sql
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is relevant to document clearly the expected input DB schema that must be created by the users of the app.
This line is not a reference for developers, it is actually an important required first step, without this the annotation tool will have no data to work with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the dbutil
are internal things that prepare the db. I understood they're provided as a kind of helpers, but there could be used alternative approaches.
Considering that point, I'd move the dbutil
documentation into a separated documentation, and link it from the main README.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not suggesting to avoid documenting it, for sure; I'm just suggesting to document it separately, not in the README.md
that should be only a first and quick contact with the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the readme is not that big and spreading sections in different files can make it difficult to find things. But I guess it's up to @bzz.
What I am strongly against is removing documentation and pointing to source code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, @carlosms you have the point in #29 (comment)
I agree, but If possible I would prefer to have a link in REAME to the https://github.com/src-d/code-annotation/blob/master/cli/examples/import/example.sql#L6 as from what I can see - it contains the very same DDL.
In case of future changes - there is no need to maintain it in two different places, which has a benefit of avoiding documentation rot.
Let me know if you think that might work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that can also work. even if it's not my favourite approach (personal preference). I'll make the change right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carlosms I might be missing something, but I think I have already saw it being replaced by the link to the file and now it's SQL in README here again here :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my bad. I fetched a wrong origin branch during rebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, in general, but...
#29 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after #29 (comment) is resolved @carlosms discretion.
@dpordomingo I'm not sure if I understood what do you mean by |
Looks awesome to me |
plumbing: packp, Skip argument validations for unknown capabilities. Fixes #623 | ||
``` | ||
|
||
## Development |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this section needs a bit more detail.
For example:
I can run the frontend in dev mode with yarn start. Does that mean the backend is not needed? Or needs to be started in some other way?
If I run make gorun, do I need to build the frontend code before?
What about the github auth tokens mentioned in the REAME? Is also needed for development?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
about frontend is it more clear in #36 in description?
I'll add info about token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in that comment the steps are more clearly put
CONTRIBUTING.md
Outdated
|
||
Frontend: | ||
|
||
If you want to benifit from frontend hot reloading feature this line in your `.env` file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"hot reloading feature put this line..."
I think "hot reloading" should be explained, or at least have a link to some explanation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really. It's super duper common technology in frontend world. If you don't know what it is - you don't need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough, it's not that crucial. I was thinking on designers instead of developers, since we talked about design people editing code directly.
CONTRIBUTING.md
Outdated
UI_DOMAIN=http://127.0.0.1:3000 | ||
``` | ||
|
||
And then restart server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then restart the server.
Might be obvious, but you could mention explicitly the command.
CONTRIBUTING.md
Outdated
|
||
And then restart server. | ||
|
||
To run frontend in dev mode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To run the frontend in dev mode, execute:
``` | ||
yarn | ||
yarn start | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
```bash
I think we should also have a common style to put (or not) a $ before commands that are supposed to be run in the shell.
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Please do another pass. |
## Development | ||
|
||
```bash | ||
$ go get -d -u github.com/src-d/code-annotation/... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does #33 being merged change any of these, here and below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. go get
will download everything with vendors and you still need make serve
to build FE.
Link to example.sql instead Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Not yet implemented, but the commands will use filepaths for the input and output DB, limited to SQLite See PR src-d#28: src-d#28 (comment) Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments made using this documentation to execute code in PR #36.
```bash | ||
$ go get -d -u github.com/src-d/code-annotation/... | ||
$ cd $GOPATH/github.com/src-d/code-annotation | ||
$ make serve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a missing step, setting .env. Otherwise make serve will complain.
Also make serve will not inform that the server is listening on :8080
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly the same than wrote in the README.md
under the "Non-docker" section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same than wrote in README.md
under the "Installation section", so unnecessary here
$ go get -d -u github.com/src-d/code-annotation/... | ||
$ cd $GOPATH/github.com/src-d/code-annotation | ||
$ make serve | ||
$ yarn start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be confusing, as make server does not stop until it is killed. It should be mentioned that "yarn start" needs to be executed in another terminal
README.md
Outdated
|
||
# Source Code Annotation Tool | ||
|
||
In order to evaluate quality of ML models, as well as to create “ImageNet for source core” there is a need for tools to automate the data collection/labeling/annotation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ImageNet for Source Code
|
||
In order to evaluate quality of ML models, as well as to create “ImageNet for source core” there is a need for tools to automate the data collection/labeling/annotation. | ||
|
||
 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what this screenshot shows.
Does it detect there's a new line of code? Is that line relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it shows the interface of the tool. There are 2 files, the difference is in 1 line. We ask a user to mark it as identical/similar/different. It is what this tool does.
README.md
Outdated
### Github OAuth tokens | ||
|
||
First you need OAuth application on github. [Read how to create it](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/). | ||
1. You need OAuth application on github. [Read how to create it](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to write links that describe what they point to.
You need an OAuth application on GitHub. See how to create OAuth applications on GitHub.
README.md
Outdated
2. Copy `.env.tpl` to `.env`. | ||
|
||
Copy `.env.tpl` to `.env` and set tokens there. | ||
3. On a [page](https://github.com/settings/developers) with your application find `Client ID` and `Client Secret` and put them in `.env` file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Retrieve the values for your application's
Client ID
andClient Secret
from the GitHub Developer Settings page and add them to the end of the corresponding lines in.env
.
README.md
Outdated
The `import` command will use those file pairs to create a new [SQLite](https://sqlite.org/) or [PostgreSQL](https://www.postgresql.org/) database that will be used internally by the Annotation Tool. The destination database does not need to be empty, new imported file pairs can be added to previous imports. | ||
|
||
If you want to benifit from frontend hot reloading feature this line in your `.env` file: | ||
_Please note_: if a file pair is identical to an existing one it will not be detected. A new pair entry will be created with the same contents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this phrase.
Does this mean we can end up with duplicate results for files that are equal if we run more than the analysis more than once?
Please clarify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carlosms could you please handle it. It's a part about the import.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify: if a user logs in to annotate file pairs more than once, we will not end up with duplicate results. Previous annotation results are already saved for each user.
This only applies if you add new data to annotate to an already existing installation, using the "import" tool.
Let's say you decide to add new data, and one of those new file pairs is repoA repoB, src/A src/A.
If that very same file pair was existing from before, the DB will contain that previous pair, and the new one.
The only reason it works this way is because we decided not to guess too much. This can be changed easily when the ML team starts using the app and providing feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it a super-useful implementation-detail notice for the internal users and developer of the project.
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
@dpordomingo @campoy please, feel free to do one more pass on it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for such effort of improving the documentation!!!
Let me suggest some things; here is a list of the main ones:
README.md
:- explain first the app purpose, and then its motivation,
- refactor the installation section to cover the full process from the beginning to the end, explaining the two different alternatives: with/without docker, and how to setup the backend DB,
- fix docker commands, that does not work,
- simplify the import/export section to its minimum necesities,
- add a Requirements section (example in landing)
- add a link to the Contributing#Development
Contributing.md
- simplify the Contributing#Development section,
README.md
Outdated
In order to evaluate quality of ML models, as well as to create “ImageNet for source core” there is a need for tools to automate the data collection/labeling/annotation. | ||
# Source Code Annotation Tool | ||
|
||
In order to evaluate quality of ML models, as well as to create “ImageNet for Source Code” there is a need for tools to automate the data collection/labeling/annotation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering if it would more clear if we explain first what this tool does, and second why is it needed.
I'd try with something like:
Source code annotation tool offers an UI to annotate source code and review these annotations, and a CLI to define the code to be annotated and export the annotations.
Why is is needed?
To evaluate the quality of ML models, as well as to create “ImageNet for Source Code” it is needed to have a big amount of source code already annotated by humans. To automate the data collection/labeling/annotation process it was decided to build this tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the first section added. The second one kept unchanged. Your proposal sounds just more complicated but saying the same thing in my opinion.
README.md
Outdated
First you need OAuth application on github. [Read how to create it](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/). | ||
1. You need OAuth application on github. You need an OAuth application on GitHub. See [how to create OAuth applications on GitHub](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/). | ||
|
||
`Authorization callback URL: http://127.0.0.1:8080/oauth-callback` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a development configuration value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to change it to http(s)://<hostname:port>/oauth-callback
? I think it's even more confusing.
README.md
Outdated
### Github OAuth tokens | ||
|
||
First you need OAuth application on github. [Read how to create it](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/). | ||
1. You need OAuth application on github. You need an OAuth application on GitHub. See [how to create OAuth applications on GitHub](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First sentence is repeated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd reword:
Application users are created through GitHub OAuth App, so it is needed to create and configure one to use the UI of this application
1. Create a GitHub OAuth App.
When you're configuring the OAuth App, the "Authorization callback URL" must behttp://<APP_HOST_NAME>/oauth-callback
(see how to create OAuth applications on GitHub)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't have to create. You just need one.
2. Copy `.env.tpl` to `.env`. | ||
|
||
Copy `.env.tpl` to `.env` and set tokens there. | ||
3. Retrieve the values for your application's Client ID and Client Secret from the [GitHub Developer Settings page](https://github.com/settings/developers) and add them to the end of the corresponding lines in .env. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd reword:
Edit the
.env
file to use theClient ID
andClient Secret
you obtained after creating your OAuth App. You can recover both codes from your registered OAuth Apps at GitHub Developer settings: OAuth Apps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess "recover" isn't a correct English word for it.
This text was suggested by @campoy, I prefer to stick to it.
#29 (comment)
README.md
Outdated
```bash | ||
docker build -t srcd/code-annotation . | ||
docker run --env-file .env --rm -p 8080:8080 srcd/code-annotation | ||
$ docker build -t srcd/code-annotation . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dep
is not a built-in command in golang:1.8-alpine3.6
image, so it should be added to the Dockerfile
; otherwise, the suggested docker build
command will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's solved by smacker#3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the section about docker. A user should run the only image created by us. We don't have such image yet.
More info here: #36 (comment)
To work with the annotation results, the internal data can be extracted into a new SQLite database using the `export` command. | ||
|
||
```bash | ||
$ export <origin-DSN> <path-to-sqlite.db> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it is said that:
the internal data can be extracted into a new SQLite database using the
export
command.
I'd use the same argument hint for the input than used in the input
documentation:
export <application-internal-db-DSN> <path-to-output-sqlite.db>
```bash | ||
$ go get -d -u github.com/src-d/code-annotation/... | ||
$ cd $GOPATH/github.com/src-d/code-annotation | ||
$ make serve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly the same than wrote in the README.md
under the "Non-docker" section.
CONTRIBUTING.md
Outdated
|
||
### Frontend: | ||
|
||
If you want to benifit from frontend hot reloading feature put this line in your `.env` file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/benifit/benefit
CONTRIBUTING.md
Outdated
|
||
### Frontend: | ||
|
||
If you want to benifit from frontend hot reloading feature put this line in your `.env` file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find a great idea to explain how to run the front in hot-reloading mode, but suggesting to add a different UI_DOMAIN
into the .env
file will ruin the server if it is launched with the recommended command (make serve
). To avoid that problem, I would not recommend modifying the .env
file, rewording this section as it follows:
### Running the frontend with hot reloading.
First, run the frontend:
yarn startand then, serve the backend with a proper
UI_DOMAIN
value, running:UI_DOMAIN=http://127.0.0.1:3000 make gorun
@@ -0,0 +1,73 @@ | |||
# Contributor Covenant Code of Conduct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for such effort of improving the documentation!!!
Let me suggest some things; here is a list of the main ones:
README.md
:- explain first the app purpose, and then its motivation,
- refactor the installation section to cover the full process from the beginning to the end, explaining the two different alternatives: with/without docker, and how to setup the backend DB,
- fix docker commands, that does not work,
- simplify the import/export section to its minimum necesities,
- add a Requirements section (example in landing)
- add a link to the Contributing#Development
Contributing.md
- simplify the Contributing#Development section,
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
@dpordomingo feedback on README (besides import/export) is applied or answered. |
Every commit message should describe what was changed, under which context and, if applicable, the GitHub issue it relates to: | ||
|
||
``` | ||
plumbing: packp, Skip argument validations for unknown capabilities. Fixes #623 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a single line?
Normally the Fixes #123
tends to be on its own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It comes from the template in the guide: https://github.com/src-d/guide/blob/master/engineering/documents/CONTRIBUTING.md
From my understanding, it should be common for all src-d projects.
At least it is in random 2 projects:
https://github.com/src-d/go-git/blob/master/CONTRIBUTING.md#format-of-the-commit-message
https://github.com/bblfsh/dashboard/blob/master/CONTRIBUTING.md#format-of-the-commit-message
|
||
``` | ||
go version; # prints your go version | ||
echo $GOPATH; # prints your $GOPATH path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use go env GOPATH
instead so it prints the default GOPATH if none has been defined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry? We need $GOPATH
later to do cd
.
README.md
Outdated
cd $GOPATH/github.com/src-d/code-annotation | ||
make serve | ||
``` | ||
cd $GOPATH/src/github.com/src-d/landing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
landing?
README.md
Outdated
``` | ||
|
||
### Non-docker | ||
The project must be under the `$GOPATH`, following the Go import conventions, what means you can go to its directory running: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence doesn't make much sense.
The project must be under
$GOPATH
, as required by the Go tooling.
You should be able to navigate into the source code by running:
README.md
Outdated
1. You need an OAuth application on GitHub. See [how to create OAuth applications on GitHub](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/). | ||
|
||
If you want to benifit from frontend hot reloading feature this line in your `.env` file: | ||
`Authorization callback URL: http://127.0.0.1:8080/oauth-callback` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain what this means.
In order to be able to use this token while running your application locally, make sure you add
http://127.0.0.1:8080/oauth-callback
to theauthorization callback URL
field.
Maybe add a screenshot of the form with the value in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what token? README doesn't mention any token at this point.
I changed
In order to be able to use this application while running the tool locally
screenshot might be too much. The is already a screenshot of the field in github documentation we are pointing to.
README.md
Outdated
``` | ||
UI_DOMAIN=http://127.0.0.1:3000 | ||
|
||
It runs everything you need to get the tool working at [http://localhost:8080](http://localhost:8080) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will start a server locally, which you can access on http://localhost:8080.
README.md
Outdated
|
||
### Import File Pairs for Annotation | ||
|
||
The file pairs must be initially provided via an [SQLite](https://sqlite.org/) database. The database **must follow the expected schema**, please [follow this link](./cli/examples/import/example.sql) to see an example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop the "initially", it implies that should have happened before this step.
Also since the schema is so important you should make it explicit and maybe paste the result of calling DESCRIBE TABLE
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the schema: it was actually removed as requested, see #29 (comment).
Pinging @bzz.
README.md
Outdated
|
||
The `import` command will use those file pairs to create a new [SQLite](https://sqlite.org/) or [PostgreSQL](https://www.postgresql.org/) database that will be used internally by the Annotation Tool. The destination database does not need to be empty, new imported file pairs can be added to previous imports. | ||
|
||
_Please note_: if a file pair is identical to an existing one it will not be detected. A new pair entry will be created with the same contents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: duplicate entries are not filtered, so running an import multiple times will result in repeated rows.
Some usage examples: | ||
|
||
```bash | ||
$ import ./input.db sqlite:///home/user/internal.db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any output expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is. I've added the output to the usage examples, a few lines below.
README.md
Outdated
|
||
# Code of Conduct | ||
|
||
All activities source{d} projects are governed by the [source{d} code of conduct](CODE_OF_CONDUCT.md). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there was a mistake on the template!
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
@dpordomingo @campoy ready for another pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd require:
- explain better the features provided by our CLU and GUI #29 (comment),
- assume no context/background in the people reading the
README.md
#29 (comment), - link to
Contributing#Development
from ourREADME.md
#29 (comment)
Unless @campoy disagree, I'd like to have:
- a sequential install process #29 (comment)
```bash | ||
$ go get -d -u github.com/src-d/code-annotation/... | ||
$ cd $GOPATH/github.com/src-d/code-annotation | ||
$ make serve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same than wrote in README.md
under the "Installation section", so unnecessary here
the coordinates of the box containing the objects to be identified in an object detection problem, etc. | ||
|
||
### Github OAuth tokens | ||
This tool provides a simple UI to add annotations to existing datasets, a command line tool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the current wording, I understand the following:
- UI to annotate,
- CLI to fetch (retrieve/download...) elements and to export.
But current app offers:
- GUI
UIto annotate and to review the annotations, - CLI to add/push/load
fetchnew elements to be annotated and to export the annotations made.
@campoy do you think we could reword this section to clarify the features offered by our app?
 | ||
|
||
### Docker | ||
## Requirements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With current order, the process cannot be sequentially followed
- requirements
- dependencies
- OAuth (requires
.env.tpl
that has not yet been downloaded)
- download and run (if you run it without data, the app won't "work")
- dbtools
- import (should be done before running the app)
- export
I'd purpose the following structure, that explains the process secuentially:
- Installation
- dependencies (go and yarn)
- download (go get)
- env vars (because no only OAuth thing is needed)
- load db (to ensure the all will be properly run)
- run
|
||
### Import File Pairs for Annotation | ||
|
||
The file pairs must be provided via an [SQLite](https://sqlite.org/) database. The database **must follow the expected schema**, please [follow this link](./cli/examples/import/example.sql) to see an example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it assumed that the reader of this README.md
will have any extra context/background about this project?
Or –in the other hand– should the README.md
be something understandable from the beginning to the end –without extra context–?
If we choose the second, considering the intro of the README.md
:
Currently, the project provides one (feature) consisting on labeling two pieces of code as being identical, similar, or different.
I'd reword the first phrase of this section:
The pieces of code to be annotated as identical/similar/different (aka
file pairs
) must be provided via...
``` | ||
|
||
And then restart server. | ||
Where the `DSN` (Data Source Name) argument must be one of: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the export section, it was added a hint about the command parameters:
In this case, origin will be the internal database, and destination the new database.
I think it would be a good idea to do so here:
In this case,
<path-to-sqlite.db>
is the database with the pairs to import, and<destination-DSN>
will be the DSN of the internal database.
|
||
Please take a look at [CONTRIBUTING](CONTRIBUTING.md) file to see how to contribute in this project, get more information about the dashboard [architecture](CONTRIBUTING.md#Architecture) and how to launch it for [development](CONTRIBUTING.md#Development) purposes. | ||
[Contributions](https://github.com/src-d/code-annotation/issues) are more than welcome, if you are interested please take a look to | ||
our [Contributing Guidelines](CONTRIBUTING.md). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since links to other internal docs are welcomed, could you add it? :D
Guys, its really nice that we care about our docs so much, but it's getting a bit out of hand - discussion on documentation change is open for 14 days in a single not merged PR. It is inconvenient for many reasons, biggest one is the other PR can not have doc changes as this one is so massive. Ofcourse it's important to have awesome documentation, so if I may, I would suggest merging current state as is, carefully moving links to the rest of the feedback as TODO to a new issue, which can be addressed in subsequent PRs according to the priorities. |
This new env var is introduced by code in src-d#52 Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Here is requested issue: #67 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Next time sending smaller PRs improving pieces of documentation separately might be better.
Once something has been merged, the probability that it will be improved is low, so it makes sense that such a PR will take longer to approved.
@campoy it used to be small. Just adding templates and moving dev part to contributions.md. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM,
requested changes can be addressed through the new issue #67
@dpordomingo request: src-d#29 (comment) Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
@dpordomingo request: src-d#29 (comment) Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Fixes: #17