Crowd Knowledge Answer Generator. Feel free to try our tool. For more details about the technique, please check our journal paper, which is an extension of our original paper. This project is an extension of the original project.
To provide comprehensive solutions for daily programming tasks, containing code examples and succinct explanations (limited to Java in this initial version).
- Input: API related query in natural language.
- Output: Code examples containing explanations.
CROKAGE receives as input a query written in natural language and uses state-of-the-art text retrieval models combined with three state-of-the-art API recommender tools to retrieve the Stack Overflow answers most related to that query, sorted by relevance. CROKAGE then uses natural language processing to extract the code and the relevant sentences and composes a summary containing the solution for the query.
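To make the "retrieve, then sort by relevance" step concrete, here is a minimal sketch. It is not CROKAGE's implementation: it replaces the retrieval models and API recommenders with a plain Jaccard token-overlap score, and the class and method names (OverlapRanker, tokens, jaccard, topK) are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Toy stand-in for relevance ranking; NOT the models used by CROKAGE.
class OverlapRanker {

    /** Lowercases the text and splits it into a set of alphanumeric tokens. */
    static Set<String> tokens(String text) {
        return Arrays.stream(text.toLowerCase().split("[^a-z0-9]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toSet());
    }

    /** Jaccard similarity of two token sets (0 when both are empty). */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    /** Returns the k candidates most similar to the query, best first. */
    static List<String> topK(String query, List<String> candidates, int k) {
        Set<String> q = tokens(query);
        return candidates.stream()
                .sorted(Comparator.comparingDouble((String c) -> jaccard(q, tokens(c))).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up answer titles, used only for illustration.
        List<String> answers = Arrays.asList(
                "How to read a file into a String",
                "Sort a list of integers",
                "Convert string to int in java");
        topK("read file into string java", answers, 2).forEach(System.out::println);
    }
}
```

The real tool goes further by extracting code and relevant sentences from the top-ranked answers; this sketch covers only the ordering of candidates.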
- AnswerBot is limited as it does not provide code.
- BIKER is limited as its documentation is limited to Java SE and it does not provide code for every query.
- CROKAGE addresses both limitations by providing relevant code and explanations in the form of summaries.
Note: all the experiments were conducted on a server equipped with 86 GB of RAM, 12 cores at 3.1 GHz, and a 64-bit Linux Mint Cinnamon operating system. We strongly recommend a similar or better hardware environment. The operating system, however, can be changed.
Software:
- Java 1.8
- Postgres 9.4 - Configure your DB to accept local connections. An example of pg_hba.conf configuration:
...
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all md5
# IPv4 local connections:
host all all 127.0.0.1/32 md5
...
- PgAdmin (we used PgAdmin 4), but feel free to use any DB tool for PostgreSQL.
- Download the SO Dump of March 2019 here. This is a preprocessed dump, downloaded from the official website, containing the main tables we use. The Postsmin table (representing the posts table) has extra columns with the preprocessed data used by CROKAGE.
- On your DB tool, create a new database named stackoverflow2019emse-min. Because the name contains a hyphen, it must be double-quoted in SQL. This is a query example:
CREATE DATABASE "stackoverflow2019emse-min"
WITH OWNER = postgres
ENCODING = 'UTF8'
TABLESPACE = pg_default
LC_COLLATE = 'en_US.UTF-8'
LC_CTYPE = 'en_US.UTF-8'
CONNECTION LIMIT = -1;
- Restore the downloaded dump to the created database.
Obs: restoring this dump requires at least 10 GB of free space. If your operating system runs in a partition with insufficient free space, create a tablespace pointing to a larger partition and associate the database with it by replacing the "TABLESPACE" value with the new tablespace name: TABLESPACE = tablespacename.
- Verify that the database is sound by executing the following SQL command:
select id, title, body, processedtitle, processedbody, code, processedcode from postsmin po limit 10
The result should list the main fields for 10 posts.
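The same sanity check can be scripted over JDBC. The sketch below is only an illustration: the class name PostsminCheck and its command-line arguments are hypothetical, and it assumes the PostgreSQL JDBC driver is on the classpath (it is not part of the JDK).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; runs the sanity-check query described above.
class PostsminCheck {

    // Columns named in the sanity-check query of this README.
    static final List<String> EXPECTED_COLUMNS = Arrays.asList(
            "id", "title", "body", "processedtitle", "processedbody", "code", "processedcode");

    /** Builds the sanity-check query for the given row limit. */
    static String buildQuery(int limit) {
        return "select " + String.join(", ", EXPECTED_COLUMNS) + " from postsmin limit " + limit;
    }

    /** Runs the query and returns the number of rows that came back. */
    static int countRows(Connection conn, int limit) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(buildQuery(limit))) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }

    public static void main(String[] args) throws SQLException {
        // Assumption: JDBC URL, user and password are passed as arguments.
        if (args.length < 3) {
            System.err.println("usage: PostsminCheck <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            int rows = countRows(conn, 10);
            System.out.println(rows == 10
                    ? "postsmin returned 10 rows, the restore looks sound"
                    : "expected 10 rows, got " + rows);
        }
    }
}
```

If a column is missing from the restored table, the query fails with an SQLException, which is itself a useful signal that the restore did not complete.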
git clone https://github.com/muldon/crokage-emse-replication-package.git
In the end, you will have the structure: /home/user/crokage/crokage-emse-replication-package
Download our fat jar here. Place it along with the downloaded files (/home/user/crokage/crokage-emse-replication-package).
Make sure your crokage folder (/home/user/crokage/crokage-emse-replication-package) contains this structure:
..
./data
crokage.jar
main.properties (not "main.properties.txt")
...
Obs: if for some reason you opt to download the repository as a zip file, make sure the extracted file main.properties is not renamed to main.properties.txt.
Obs 2: for now we only provide the replication package containing the files for the reproduction of CROKAGE, along with the results of the User Study, described in our paper. The complete source code will be released soon.
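The expected folder layout (a data folder, crokage.jar, and main.properties rather than main.properties.txt) can be checked before running anything. The following is a small sketch using only the JDK; the class name LayoutCheck and the wording of its messages are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; checks the folder structure listed in this README.
class LayoutCheck {

    /** Returns a list of problems found under home; an empty list means the layout looks right. */
    static List<String> findProblems(Path home) {
        List<String> problems = new ArrayList<>();
        if (!Files.isDirectory(home.resolve("data"))) {
            problems.add("missing folder: data");
        }
        if (!Files.isRegularFile(home.resolve("crokage.jar"))) {
            problems.add("missing file: crokage.jar");
        }
        if (!Files.isRegularFile(home.resolve("main.properties"))) {
            // Distinguish the renamed-file case called out in the note above.
            problems.add(Files.exists(home.resolve("main.properties.txt"))
                    ? "main.properties.txt found: rename it to main.properties"
                    : "missing file: main.properties");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumption: the folder to check is the first argument, defaulting to the current folder.
        Path home = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> problems = findProblems(home);
        if (problems.isEmpty()) {
            System.out.println("Layout looks correct: " + home.toAbsolutePath());
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```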
Edit main.properties and set the following parameters, listed under "######### Must be set":
- CROKAGE_HOME = the root folder of the project (e.g. /home/user/crokage/crokage-emse-replication-package).
- spring.datasource.username = your DB user.
- spring.datasource.password = your DB password.
- spring.datasource.url = your database URL, for example: jdbc:postgresql://localhost:5432/stackoverflow2019emse-min.
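Before launching the jar, it can help to confirm that the four required parameters are actually set. The sketch below reads a properties file with java.util.Properties and reports keys that are missing or blank; the class name PropertiesCheck is hypothetical, and treating a blank value as "not set" is an assumption of this example.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper; verifies the "must be set" parameters of main.properties.
class PropertiesCheck {

    // The parameters this README says must be set.
    static final List<String> REQUIRED = Arrays.asList(
            "CROKAGE_HOME",
            "spring.datasource.username",
            "spring.datasource.password",
            "spring.datasource.url");

    /** Returns the required keys that are absent or blank, in REQUIRED order. */
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    /** Parses properties from a string; convenient for quick checks. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the file path is the first argument, defaulting to ./main.properties.
        File file = new File(args.length > 0 ? args[0] : "main.properties");
        if (!file.isFile()) {
            System.err.println("File not found: " + file.getPath());
            return;
        }
        Properties props = new Properties();
        try (Reader reader = new FileReader(file)) {
            props.load(reader);
        }
        List<String> missing = missingKeys(props);
        System.out.println(missing.isEmpty() ? "All required parameters are set" : "Not set: " + missing);
    }
}
```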
Open a terminal, go to the folder where the jar file and main.properties are located, and run the following command:
java -Xms1024M -Xmx50g -jar crokage.jar --spring.config.location=./main.properties
This command uses the file main.properties to override the default parameters, which must be set as described above.
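If you prefer to start the run from a script or another Java program, the command above can be assembled with ProcessBuilder. This is a sketch only: the class name CrokageLauncher and the --run switch are hypothetical, and the maximum heap is a parameter so that machines with less memory can try a smaller value (the experiments used 50g).

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical launcher; wraps the java -jar command shown above.
class CrokageLauncher {

    /** Builds the launch command; maxHeap is a JVM size such as "50g". */
    static List<String> buildCommand(String maxHeap) {
        return Arrays.asList(
                "java", "-Xms1024M", "-Xmx" + maxHeap,
                "-jar", "crokage.jar",
                "--spring.config.location=./main.properties");
    }

    /** Starts the process in the given folder and waits for it; returns the exit code. */
    static int run(File workingDir, String maxHeap) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(buildCommand(maxHeap));
        builder.directory(workingDir); // folder containing crokage.jar and main.properties
        builder.inheritIO();           // show the tool's output in this console
        return builder.start().waitFor();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        List<String> command = buildCommand("50g");
        System.out.println(String.join(" ", command));
        // Only launch when explicitly asked, so a dry run just prints the command.
        if (args.length > 0 && args[0].equals("--run")) {
            System.exit(run(new File("."), "50g"));
        }
    }
}
```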
The results are displayed in the terminal/console and are also stored in the database, in the table metricsresults. The following query should return the results:
select * from metricsresults
We implemented our approach in the form of a tool to assist developers with their daily programming issues. The figure below shows the tool architecture. We follow a REST (Representational State Transfer) architecture. The tool is in beta and only provides solutions for the Java language, but we expect to release the full version soon. If you wish to use our tool to obtain solutions for your natural language queries, please follow the instructions here.
If you intend to use this work, please cite us:
@article{SilvaEMSE2020,
author = {Silva, Rodrigo F. G. and Roy, Chanchal K. and Rahman, Mohammad Masudur and Schneider, Kevin A. and Paixao, Klerisson and Dantas, Carlos and de Almeida Maia, Marcelo},
title={{CROKAGE}: Effective Solution Recommendations for Programming Tasks by Leveraging Crowd Knowledge},
journal={Empirical Software Engineering (accepted)},
year=2020}
This project is licensed under the MIT License; see the LICENSE file for details.