Skip to content

Conversation

smacker
Copy link
Contributor

@smacker smacker commented Jan 24, 2018

Fixes: #17

@smacker
Copy link
Contributor Author

smacker commented Jan 24, 2018

@carlosms could you help me please with Import files for annotation section of readme?

@carlosms
Copy link
Contributor

@smacker of course

@carlosms
Copy link
Contributor

@smacker you have a PR for your branch here

@smacker
Copy link
Contributor Author

smacker commented Jan 24, 2018

man, it's really awesome! Thanks a lot. You should write more documentation for our projects 😂

@smacker smacker changed the title [WIP] Documentation improvements Documentation improvements Jan 24, 2018
@bzz bzz removed the review label Jan 25, 2018
README.md Outdated
@@ -1,16 +1,23 @@
# Source Code Annotation application
[![Build Status](https://travis-ci.org/src-d/go-git.svg)](https://travis-ci.org/src-d/go-git)
Copy link
Contributor

@bzz bzz Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update it to src-d/code-annotation

README.md Outdated
CREATE TABLE files (
blob_id_a TEXT, repository_id_a TEXT, commit_hash_a TEXT, path_a TEXT, content_a TEXT,
blob_id_b TEXT, repository_id_b TEXT, commit_hash_b TEXT, path_b TEXT, content_b TEXT,
score DOUBLE PRECISION);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep README concise and maintainable, I suggest to remove DDL from here and just link to the relevant place in FS/code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, can you point to such place? It's expected input schema. Not schema created by our tool.
Or do you suggest to use cli/examples/import/example.sql?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is relevant to document clearly the expected input DB schema that must be created by the users of the app.
This line is not a reference for developers, it is actually an important required first step, without this the annotation tool will have no data to work with.

Copy link
Contributor

@dpordomingo dpordomingo Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the dbutil are internal things that prepare the db. I understood they're provided as a kind of helpers, but there could be used alternative approaches.
Considering that point, I'd move the dbutil documentation into a separated documentation, and link it from the main README.md

Copy link
Contributor

@dpordomingo dpordomingo Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not suggesting to avoid documenting it, for sure; I'm just suggesting to document it separately, not in the README.md that should be only a first and quick contact with the app.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the readme is not that big and spreading sections in different files can make it difficult to find things. But I guess it's up to @bzz.

What I am strongly against is removing documentation and pointing to source code.

Copy link
Contributor

@bzz bzz Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, @carlosms you have the point in #29 (comment)

I agree, but If possible I would prefer to have a link in REAME to the https://github.com/src-d/code-annotation/blob/master/cli/examples/import/example.sql#L6 as from what I can see - it contains the very same DDL.

In case of future changes - there is no need to maintain it in two different places, which has a benefit of avoiding documentation rot.

Let me know if you think that might work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that can also work. even if it's not my favourite approach (personal preference). I'll make the change right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carlosms I might be missing something, but I think I have already saw it being replaced by the link to the file and now it's SQL in README here again here :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad. I fetched a wrong origin branch during rebase.

Copy link
Contributor

@dpordomingo dpordomingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, in general, but...
#29 (comment)

@src-d src-d deleted a comment from smacker Jan 25, 2018
Copy link
Contributor

@bzz bzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after #29 (comment) is resolved @carlosms discretion.

@bzz
Copy link
Contributor

bzz commented Jan 25, 2018

@dpordomingo I'm not sure if I understood what do you mean by dbutil documentation - is there a chance you could take another pass on it, and see if it somehow was already addressed?

@bzz
Copy link
Contributor

bzz commented Jan 26, 2018

Looks awesome to me

plumbing: packp, Skip argument validations for unknown capabilities. Fixes #623
```

## Development
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this section needs a bit more detail.

For example:
I can run the frontend in dev mode with yarn start. Does that mean the backend is not needed? Or needs to be started in some other way?
If I run make gorun, do I need to build the frontend code before?
What about the github auth tokens mentioned in the REAME? Is also needed for development?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

about frontend is it more clear in #36 in description?
I'll add info about token.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in that comment the steps are more clearly put

CONTRIBUTING.md Outdated

Frontend:

If you want to benifit from frontend hot reloading feature this line in your `.env` file:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"hot reloading feature put this line..."

I think "hot reloading" should be explained, or at least have a link to some explanation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really. It's super duper common technology in frontend world. If you don't know what it is - you don't need it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough, it's not that crucial. I was thinking on designers instead of developers, since we talked about design people editing code directly.

CONTRIBUTING.md Outdated
UI_DOMAIN=http://127.0.0.1:3000
```

And then restart server.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then restart the server.

Might be obvious, but you could mention explicitly the command.

CONTRIBUTING.md Outdated

And then restart server.

To run frontend in dev mode:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To run the frontend in dev mode, execute:

```
yarn
yarn start
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

```bash

I think we should also have a common style to put (or not) a $ before commands that are supposed to be run in the shell.

@smacker smacker mentioned this pull request Jan 26, 2018
smacker and others added 4 commits January 26, 2018 19:44
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
@smacker
Copy link
Contributor Author

smacker commented Jan 26, 2018

Please do another pass.

## Development

```bash
$ go get -d -u github.com/src-d/code-annotation/...
Copy link
Contributor

@bzz bzz Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does #33 being merged change any of these, here and below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. go get will download everything with vendors and you still need make serve to build FE.

Link to example.sql instead

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Not yet implemented, but the commands will use
filepaths for the input and output DB, limited
to SQLite
See PR src-d#28: src-d#28 (comment)

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Copy link
Contributor

@carlosms carlosms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments made using this documentation to execute code in PR #36.

```bash
$ go get -d -u github.com/src-d/code-annotation/...
$ cd $GOPATH/github.com/src-d/code-annotation
$ make serve
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a missing step, setting .env. Otherwise make serve will complain.

Also make serve will not inform that the server is listening on :8080

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the same than wrote in the README.md under the "Non-docker" section.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same than wrote in README.md under the "Installation section", so unnecessary here

$ go get -d -u github.com/src-d/code-annotation/...
$ cd $GOPATH/github.com/src-d/code-annotation
$ make serve
$ yarn start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be confusing, as make server does not stop until it is killed. It should be mentioned that "yarn start" needs to be executed in another terminal

README.md Outdated

# Source Code Annotation Tool

In order to evaluate quality of ML models, as well as to create “ImageNet for source core” there is a need for tools to automate the data collection/labeling/annotation.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImageNet for Source Code


In order to evaluate quality of ML models, as well as to create “ImageNet for source core” there is a need for tools to automate the data collection/labeling/annotation.

![Screenshot](.github/screenshot.png?raw=true)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this screenshot shows.
Does it detect there's a new line of code? Is that line relevant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it shows the interface of the tool. There are 2 files, the difference is in 1 line. We ask a user to mark it as identical/similar/different. It is what this tool does.

README.md Outdated
### Github OAuth tokens

First you need OAuth application on github. [Read how to create it](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/).
1. You need OAuth application on github. [Read how to create it](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to write links that describe what they point to.

You need an OAuth application on GitHub. See how to create OAuth applications on GitHub.

README.md Outdated
2. Copy `.env.tpl` to `.env`.

Copy `.env.tpl` to `.env` and set tokens there.
3. On a [page](https://github.com/settings/developers) with your application find `Client ID` and `Client Secret` and put them in `.env` file.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Retrieve the values for your application's Client ID and Client Secret from the GitHub Developer Settings page and add them to the end of the corresponding lines in .env.

README.md Outdated
The `import` command will use those file pairs to create a new [SQLite](https://sqlite.org/) or [PostgreSQL](https://www.postgresql.org/) database that will be used internally by the Annotation Tool. The destination database does not need to be empty, new imported file pairs can be added to previous imports.

If you want to benifit from frontend hot reloading feature this line in your `.env` file:
_Please note_: if a file pair is identical to an existing one it will not be detected. A new pair entry will be created with the same contents.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this phrase.
Does this mean we can end up with duplicate results for files that are equal if we run more than the analysis more than once?

Please clarify

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carlosms could you please handle it. It's a part about the import.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: if a user logs in to annotate file pairs more than once, we will not end up with duplicate results. Previous annotation results are already saved for each user.

This only applies if you add new data to annotate to an already existing installation, using the "import" tool.

Let's say you decide to add new data, and one of those new file pairs is repoA repoB, src/A src/A.
If that very same file pair was existing from before, the DB will contain that previous pair, and the new one.

The only reason it works this way is because we decided not to guess too much. This can be changed easily when the ML team starts using the app and providing feedback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a super-useful implementation-detail notice for the internal users and developer of the project.

Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
@bzz
Copy link
Contributor

bzz commented Jan 30, 2018

@dpordomingo @campoy please, feel free to do one more pass on it

Copy link
Contributor

@dpordomingo dpordomingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for such effort of improving the documentation!!!

Let me suggest some things; here is a list of the main ones:

  • README.md:
    • explain first the app purpose, and then its motivation,
    • refactor the installation section to cover the full process from the beginning to the end, explaining the two different alternatives: with/without docker, and how to setup the backend DB,
    • fix docker commands, that does not work,
    • simplify the import/export section to its minimum necesities,
    • add a Requirements section (example in landing)
    • add a link to the Contributing#Development
  • Contributing.md
    • simplify the Contributing#Development section,

README.md Outdated
In order to evaluate quality of ML models, as well as to create “ImageNet for source core” there is a need for tools to automate the data collection/labeling/annotation.
# Source Code Annotation Tool

In order to evaluate quality of ML models, as well as to create “ImageNet for Source Code” there is a need for tools to automate the data collection/labeling/annotation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if it would more clear if we explain first what this tool does, and second why is it needed.

I'd try with something like:

Source code annotation tool offers an UI to annotate source code and review these annotations, and a CLI to define the code to be annotated and export the annotations.
Why is is needed?
To evaluate the quality of ML models, as well as to create “ImageNet for Source Code” it is needed to have a big amount of source code already annotated by humans. To automate the data collection/labeling/annotation process it was decided to build this tool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first section added. The second one kept unchanged. Your proposal sounds just more complicated but saying the same thing in my opinion.

README.md Outdated
First you need OAuth application on github. [Read how to create it](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/).
1. You need OAuth application on github. You need an OAuth application on GitHub. See [how to create OAuth applications on GitHub](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/).

`Authorization callback URL: http://127.0.0.1:8080/oauth-callback`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a development configuration value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to change it to http(s)://<hostname:port>/oauth-callback? I think it's even more confusing.

README.md Outdated
### Github OAuth tokens

First you need OAuth application on github. [Read how to create it](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/).
1. You need OAuth application on github. You need an OAuth application on GitHub. See [how to create OAuth applications on GitHub](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First sentence is repeated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd reword:

Application users are created through GitHub OAuth App, so it is needed to create and configure one to use the UI of this application

1. Create a GitHub OAuth App.
When you're configuring the OAuth App, the "Authorization callback URL" must be http://<APP_HOST_NAME>/oauth-callback
(see how to create OAuth applications on GitHub)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to create. You just need one.

2. Copy `.env.tpl` to `.env`.

Copy `.env.tpl` to `.env` and set tokens there.
3. Retrieve the values for your application's Client ID and Client Secret from the [GitHub Developer Settings page](https://github.com/settings/developers) and add them to the end of the corresponding lines in .env.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd reword:

Edit the .env file to use the Client ID and Client Secret you obtained after creating your OAuth App. You can recover both codes from your registered OAuth Apps at GitHub Developer settings: OAuth Apps

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess "recover" isn't a correct English word for it.
This text was suggested by @campoy, I prefer to stick to it.
#29 (comment)

README.md Outdated
```bash
docker build -t srcd/code-annotation .
docker run --env-file .env --rm -p 8080:8080 srcd/code-annotation
$ docker build -t srcd/code-annotation .
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dep is not a built-in command in golang:1.8-alpine3.6 image, so it should be added to the Dockerfile; otherwise, the suggested docker build command will fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's solved by smacker#3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the section about docker. A user should run the only image created by us. We don't have such image yet.
More info here: #36 (comment)

To work with the annotation results, the internal data can be extracted into a new SQLite database using the `export` command.

```bash
$ export <origin-DSN> <path-to-sqlite.db>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is said that:

the internal data can be extracted into a new SQLite database using the export command.

I'd use the same argument hint for the input than used in the input documentation:

export <application-internal-db-DSN> <path-to-output-sqlite.db>

```bash
$ go get -d -u github.com/src-d/code-annotation/...
$ cd $GOPATH/github.com/src-d/code-annotation
$ make serve
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the same than wrote in the README.md under the "Non-docker" section.

CONTRIBUTING.md Outdated

### Frontend:

If you want to benifit from frontend hot reloading feature put this line in your `.env` file:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/benifit/benefit

CONTRIBUTING.md Outdated

### Frontend:

If you want to benifit from frontend hot reloading feature put this line in your `.env` file:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find a great idea to explain how to run the front in hot-reloading mode, but suggesting to add a different UI_DOMAIN into the .env file will ruin the server if it is launched with the recommended command (make serve). To avoid that problem, I would not recommend modifying the .env file, rewording this section as it follows:

### Running the frontend with hot reloading.

First, run the frontend:

yarn start

and then, serve the backend with a proper UI_DOMAIN value, running:

UI_DOMAIN=http://127.0.0.1:3000 make gorun

@@ -0,0 +1,73 @@
# Contributor Covenant Code of Conduct
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for such effort of improving the documentation!!!

Let me suggest some things; here is a list of the main ones:

  • README.md:
    • explain first the app purpose, and then its motivation,
    • refactor the installation section to cover the full process from the beginning to the end, explaining the two different alternatives: with/without docker, and how to setup the backend DB,
    • fix docker commands, that does not work,
    • simplify the import/export section to its minimum necesities,
    • add a Requirements section (example in landing)
    • add a link to the Contributing#Development
  • Contributing.md
    • simplify the Contributing#Development section,

Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
@smacker
Copy link
Contributor Author

smacker commented Feb 3, 2018

@dpordomingo feedback on README (besides import/export) is applied or answered.
@carlosms your turn.

@carlosms carlosms mentioned this pull request Feb 6, 2018
2 tasks
Every commit message should describe what was changed, under which context and, if applicable, the GitHub issue it relates to:

```
plumbing: packp, Skip argument validations for unknown capabilities. Fixes #623
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a single line?
Normally the Fixes #123 tends to be on its own.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


```
go version; # prints your go version
echo $GOPATH; # prints your $GOPATH path
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use go env GOPATH instead so it prints the default GOPATH if none has been defined

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry? We need $GOPATH later to do cd.

README.md Outdated
cd $GOPATH/github.com/src-d/code-annotation
make serve
```
cd $GOPATH/src/github.com/src-d/landing
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

landing?

README.md Outdated
```

### Non-docker
The project must be under the `$GOPATH`, following the Go import conventions, what means you can go to its directory running:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence doesn't make much sense.

The project must be under $GOPATH, as required by the Go tooling.
You should be able to navigate into the source code by running:

README.md Outdated
1. You need an OAuth application on GitHub. See [how to create OAuth applications on GitHub](https://developer.github.com/apps/building-oauth-apps/creating-an-oauth-app/).

If you want to benifit from frontend hot reloading feature this line in your `.env` file:
`Authorization callback URL: http://127.0.0.1:8080/oauth-callback`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain what this means.

In order to be able to use this token while running your application locally, make sure you add http://127.0.0.1:8080/oauth-callback to the authorization callback URL field.

Maybe add a screenshot of the form with the value in it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what token? README doesn't mention any token at this point.
I changed

In order to be able to use this application while running the tool locally

screenshot might be too much. The is already a screenshot of the field in github documentation we are pointing to.

README.md Outdated
```
UI_DOMAIN=http://127.0.0.1:3000

It runs everything you need to get the tool working at [http://localhost:8080](http://localhost:8080)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will start a server locally, which you can access on http://localhost:8080.

README.md Outdated

### Import File Pairs for Annotation

The file pairs must be initially provided via an [SQLite](https://sqlite.org/) database. The database **must follow the expected schema**, please [follow this link](./cli/examples/import/example.sql) to see an example.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the "initially", it implies that should have happened before this step.

Also since the schema is so important you should make it explicit and maybe paste the result of calling DESCRIBE TABLE here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the schema: it was actually removed as requested, see #29 (comment).
Pinging @bzz.

README.md Outdated

The `import` command will use those file pairs to create a new [SQLite](https://sqlite.org/) or [PostgreSQL](https://www.postgresql.org/) database that will be used internally by the Annotation Tool. The destination database does not need to be empty, new imported file pairs can be added to previous imports.

_Please note_: if a file pair is identical to an existing one it will not be detected. A new pair entry will be created with the same contents.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: duplicate entries are not filtered, so running an import multiple times will result in repeated rows.

Some usage examples:

```bash
$ import ./input.db sqlite:///home/user/internal.db
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any output expected?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is. I've added the output to the usage examples, a few lines below.

README.md Outdated

# Code of Conduct

All activities source{d} projects are governed by the [source{d} code of conduct](CODE_OF_CONDUCT.md).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there was a mistake on the template!

src-d/guide#123

smacker and others added 3 commits February 6, 2018 15:23
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
@bzz
Copy link
Contributor

bzz commented Feb 7, 2018

@dpordomingo @campoy ready for another pass

Copy link
Contributor

@dpordomingo dpordomingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd require:

  • explain better the features provided by our CLU and GUI #29 (comment),
  • assume no context/background in the people reading the README.md #29 (comment),
  • link to Contributing#Development from our README.md #29 (comment)

Unless @campoy disagree, I'd like to have:

```bash
$ go get -d -u github.com/src-d/code-annotation/...
$ cd $GOPATH/github.com/src-d/code-annotation
$ make serve
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same than wrote in README.md under the "Installation section", so unnecessary here

the coordinates of the box containing the objects to be identified in an object detection problem, etc.

### Github OAuth tokens
This tool provides a simple UI to add annotations to existing datasets, a command line tool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current wording, I understand the following:

  • UI to annotate,
  • CLI to fetch (retrieve/download...) elements and to export.

But current app offers:

  • GUI UI to annotate and to review the annotations,
  • CLI to add/push/load fetch new elements to be annotated and to export the annotations made.

@campoy do you think we could reword this section to clarify the features offered by our app?

![Screenshot](.github/screenshot.png?raw=true)

### Docker
## Requirements
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With current order, the process cannot be sequentially followed

  • requirements
    • dependencies
    • OAuth (requires .env.tpl that has not yet been downloaded)
  • download and run (if you run it without data, the app won't "work")
  • dbtools
    • import (should be done before running the app)
    • export

I'd purpose the following structure, that explains the process secuentially:

  • Installation
    • dependencies (go and yarn)
    • download (go get)
    • env vars (because no only OAuth thing is needed)
    • load db (to ensure the all will be properly run)
    • run


### Import File Pairs for Annotation

The file pairs must be provided via an [SQLite](https://sqlite.org/) database. The database **must follow the expected schema**, please [follow this link](./cli/examples/import/example.sql) to see an example.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it assumed that the reader of this README.md will have any extra context/background about this project?
Or –in the other hand– should the README.md be something understandable from the beginning to the end –without extra context–?

If we choose the second, considering the intro of the README.md:

Currently, the project provides one (feature) consisting on labeling two pieces of code as being identical, similar, or different.

I'd reword the first phrase of this section:

The pieces of code to be annotated as identical/similar/different (aka file pairs) must be provided via...

```

And then restart server.
Where the `DSN` (Data Source Name) argument must be one of:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the export section, it was added a hint about the command parameters:

In this case, origin will be the internal database, and destination the new database.

I think it would be a good idea to do so here:

In this case, <path-to-sqlite.db> is the database with the pairs to import, and <destination-DSN> will be the DSN of the internal database.


Please take a look at [CONTRIBUTING](CONTRIBUTING.md) file to see how to contribute in this project, get more information about the dashboard [architecture](CONTRIBUTING.md#Architecture) and how to launch it for [development](CONTRIBUTING.md#Development) purposes.
[Contributions](https://github.com/src-d/code-annotation/issues) are more than welcome, if you are interested please take a look to
our [Contributing Guidelines](CONTRIBUTING.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since links to other internal docs are welcomed, could you add it? :D

@bzz
Copy link
Contributor

bzz commented Feb 7, 2018

Guys, its really nice that we care about our docs so much, but it's getting a bit out of hand - discussion on documentation change is open for 14 days in a single not merged PR.

It is inconvenient for many reasons, biggest one is the other PR can not have doc changes as this one is so massive.

Ofcourse it's important to have awesome documentation, so if I may, I would suggest merging current state as is, carefully moving links to the rest of the feedback as TODO to a new issue, which can be addressed in subsequent PRs according to the priorities.

This new env var is introduced by code in
src-d#52

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
@smacker
Copy link
Contributor Author

smacker commented Feb 7, 2018

Here is requested issue: #67

Copy link

@campoy campoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next time sending smaller PRs improving pieces of documentation separately might be better.

Once something has been merged, the probability that it will be improved is low, so it makes sense that such a PR will take longer to approved.

@smacker
Copy link
Contributor Author

smacker commented Feb 7, 2018

@campoy it used to be small. Just adding templates and moving dev part to contributions.md.

Copy link
Contributor

@dpordomingo dpordomingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,
requested changes can be addressed through the new issue #67

@bzz bzz merged commit d1d34be into src-d:master Feb 7, 2018
carlosms added a commit to carlosms/code-annotation that referenced this pull request Feb 28, 2018
@dpordomingo request:
src-d#29 (comment)

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
carlosms added a commit to carlosms/code-annotation that referenced this pull request Mar 1, 2018
@dpordomingo request:
src-d#29 (comment)

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
dpordomingo pushed a commit that referenced this pull request Mar 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants