[Task]: Spotting commands in the stream from coding assistants like cline #844
Description
We have done some work on spotting suspicious commands in #34. The task here is to bring that code into codegate. This involves:
- Creating the model from the code in https://github.com/stacklok/research/blob/command-detection/command_detection/command_models.ipynb. This should result in a function that returns good or bad when fed a command.
- Spotting, in a platform-neutral way (cline, copilot edits, etc.), when a command is returned, categorising it, and logging it if the command is bad.
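To make the first bullet concrete, here is a minimal sketch of the shape such a function could take: an embedding step followed by a small scoring step that returns good or bad. Everything here is an assumption for illustration. `toy_embed`, `CommandClassifier`, the weights, and the threshold are hypothetical stand-ins, not the model from the notebook; a real version would swap in the sentence embedding and the trained ANN.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical type: any function that maps a command string to a feature vector.
Embedder = Callable[[str], Sequence[float]]


def toy_embed(command: str) -> list[float]:
    """Stand-in for a real sentence embedding (NOT the MiniLM model)."""
    text = command.lower()
    return [
        1.0 if "rm -rf" in text else 0.0,
        1.0 if "curl" in text and "| sh" in text else 0.0,
    ]


@dataclass
class CommandClassifier:
    """Single logistic unit over the embedding; a placeholder for the small ANN."""

    embed: Embedder
    weights: Sequence[float]
    bias: float
    threshold: float = 0.5

    def score(self, command: str) -> float:
        # Weighted sum of the features, squashed to a probability-like value.
        features = self.embed(command)
        logit = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))

    def classify(self, command: str) -> str:
        return "bad" if self.score(command) >= self.threshold else "good"


# Hand-picked illustrative weights, not trained values.
classifier = CommandClassifier(embed=toy_embed, weights=[4.0, 4.0], bias=-2.0)
print(classifier.classify("rm -rf /"))  # bad
print(classifier.classify("ls -lh"))    # good
```

Keeping the embedder as a plain callable means the embedding model can be changed without touching the calling code.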
Extensions for the future
- Have more than two categories - e.g. safe, risky, and block
- Block commands in the 'block' category
- Have the block behaviour configurable
- Have more options around context - e.g. files and dirs that are writable
- Have the NN learn from feedback from the user (i.e. retrain the NN from feedback in the codegate UI)
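The first three extensions could be sketched roughly as follows, assuming the classifier exposes a score between 0 and 1. The names `Category`, `Policy`, `categorise`, and `should_block`, as well as the threshold values, are hypothetical and only show how a configurable block behaviour might be wired up.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    SAFE = "safe"
    RISKY = "risky"
    BLOCK = "block"


@dataclass(frozen=True)
class Policy:
    """Hypothetical configuration; thresholds are illustrative, not tuned."""

    risky_threshold: float = 0.5
    block_threshold: float = 0.9
    enforce_block: bool = False  # when False, 'block' commands are only logged


def categorise(score: float, policy: Policy) -> Category:
    # Check the higher threshold first so 'block' takes priority over 'risky'.
    if score >= policy.block_threshold:
        return Category.BLOCK
    if score >= policy.risky_threshold:
        return Category.RISKY
    return Category.SAFE


def should_block(score: float, policy: Policy) -> bool:
    # Blocking only happens when it is both warranted and enabled in the config.
    return policy.enforce_block and categorise(score, policy) is Category.BLOCK


print(categorise(0.95, Policy()))                          # Category.BLOCK
print(should_block(0.95, Policy()))                        # False
print(should_block(0.95, Policy(enforce_block=True)))      # True
```

Separating categorisation from enforcement lets logging-only and blocking modes share the same scoring path.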
We will probably have to intercept the commands at
snippets = extract_snippets(current_content)
and write the comment back at
async def _snippet_comment(self, snippet: CodeSnippet, context: PipelineContext) -> str:
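As a rough illustration of what could happen between those two points, the sketch below walks over already-extracted snippets, picks out shell-like ones, and logs any command the classifier marks as bad. It is written under several assumptions: `Snippet` is a hypothetical stand-in and not codegate's `CodeSnippet`, the `language` and `code` fields and the `SHELL_LANGUAGES` set are invented for the example, and `classify` is any function that returns good or bad for a command.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Iterable

logger = logging.getLogger("command_detection")

# Assumed language tags that mark a snippet as containing shell commands.
SHELL_LANGUAGES = {"bash", "sh", "shell", "zsh", "console"}


@dataclass
class Snippet:
    """Hypothetical stand-in for an extracted snippet (not codegate's CodeSnippet)."""

    language: str
    code: str


def find_bad_commands(
    snippets: Iterable[Snippet], classify: Callable[[str], str]
) -> list[str]:
    """Return the commands classified as bad, logging each one as it is found."""
    flagged = []
    for snippet in snippets:
        if snippet.language.lower() not in SHELL_LANGUAGES:
            continue
        for line in snippet.code.splitlines():
            command = line.strip()
            # Skip blank lines and shell comments.
            if not command or command.startswith("#"):
                continue
            if classify(command) == "bad":
                logger.warning("Suspicious command: %s", command)
                flagged.append(command)
    return flagged
```

Working on extracted snippets rather than on a client-specific message format is one way to stay platform neutral, provided each client's output ends up as a snippet. The returned list could then feed the comment written back to the user.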
As a baseline we decided to use the hybrid-all-MiniLM-L6-v2 model, with post-processing by a small ANN. We didn't want the extra cost of codebert, but the local ANN seems to produce some benefit.
Additional Context
We need to decide which model to use for the embeddings. all-minilm-L6-v2 works well, especially with an ANN post-processing step, and it is already in codegate, so we get it for free. microsoft/codebert-base works better, as expected, but at a cost of 476 MB.
The ANNs are much smaller:
ls -lh | grep hybrid
-rw-r--r-- 1 nigel staff 228K 29 Jan 18:21 hybrid-all-MiniLM-L6-v2.model
-rw-r--r-- 1 nigel staff 420K 29 Jan 18:21 hybrid-microsoft-codebert-base.model