This project is an Edge AI execution environment that runs Meta's LLaMa2 and Code LLaMa on a Raspberry Pi 4 with a 64-bit OS to automatically generate Verilog code.
The setup procedure is described below.
- Hardware
  - Raspberry Pi 4 8GB
- OS
  - Raspberry Pi OS Lite 64bit
- SD Card
  - 128GB or more
If you have trouble setting up your own system, you can use this image.
- ID:ishikai
- Pass:ishikai
Download the latest version of llama.cpp (any version after 6/6/2023) and build it: https://github.com/ggerganov/llama.cpp
sudo apt install git build-essential
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
Download the LLaMa2 data from Meta's website and convert it to GGML format using the converter that is built along with llama.cpp.
convert-llama2c-to-ggml
However, the conversion requires a fairly powerful GPU, so a file already converted to GGML is available and can be downloaded instead.
wget https://huggingface.co/TheBloke/Llama-2-7B-GGML/resolve/main/llama-2-7b.ggmlv3.q4_K_M.bin
Code LLaMa, a version of LLaMa2 fine-tuned on programming languages, is also prepared.
As with LLaMa2, the conversion requires a fairly powerful GPU, so a file already converted to GGML is available and can be downloaded instead.
wget https://huggingface.co/TheBloke/CodeLlama-7B-GGML/resolve/main/codellama-7b.ggmlv3.Q4_K_M.bin
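The model files are several gigabytes, so a download that saved a web page instead of the model (for example, when a repository page URL is fetched rather than a direct file URL) is easy to miss until ./main fails to load it. The following is a minimal sketch of a sanity check; the function name `looks_like_html` and the 512-byte probe size are illustrative assumptions, not part of llama.cpp.

```python
import sys


def looks_like_html(path, probe_bytes=512):
    """Return True if the file starts like an HTML document.

    Only the first few hundred bytes are read, so this is cheap even
    for multi-gigabyte model files.
    """
    with open(path, "rb") as f:
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))


if __name__ == "__main__":
    # Usage: python check_download.py llama-2-7b.ggmlv3.q4_K_M.bin ...
    for name in sys.argv[1:]:
        if looks_like_html(name):
            print(f"{name}: looks like an HTML page, download it again")
        else:
            print(f"{name}: binary data")
```

A file reported as an HTML page should be downloaded again before it is copied into ./models/7B/.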
The following commands start an interactive prompt, first with LLaMa2 and then with Code LLaMa:
./main -m ./models/7B/llama-2-7b.ggmlv3.q4_K_M.bin \
--color \
--ctx_size 2048 \
-n -1 \
-ins -b 256 \
--top_k 10000 \
--temp 0.2 \
--repeat_penalty 1.1 \
-t 8
./main -m ./models/7B/codellama-7b.ggmlv3.Q4_K_M.bin \
--color \
--ctx_size 2048 \
-n -1 \
-ins -b 256 \
--top_k 10000 \
--temp 0.2 \
--repeat_penalty 1.1 \
-t 8
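The two commands above differ only in the model file, so they can be generated from one helper. The sketch below uses only the flags shown above; the helper name `build_main_command` is hypothetical. It also defaults the thread count to the number of CPU cores reported by the OS (four on a Raspberry Pi 4) rather than a fixed 8. Whether that is faster on a given board is an assumption to verify, not a measured result.

```python
import os
import shlex


def build_main_command(model_path, threads=None):
    """Build the llama.cpp ./main argument list used in this README.

    If `threads` is not given, use the number of CPU cores the OS reports.
    """
    if threads is None:
        threads = os.cpu_count() or 1
    return [
        "./main", "-m", model_path,
        "--color",
        "--ctx_size", "2048",
        "-n", "-1",
        "-ins", "-b", "256",
        "--top_k", "10000",
        "--temp", "0.2",
        "--repeat_penalty", "1.1",
        "-t", str(threads),
    ]


if __name__ == "__main__":
    for model in ("./models/7B/llama-2-7b.ggmlv3.q4_K_M.bin",
                  "./models/7B/codellama-7b.ggmlv3.Q4_K_M.bin"):
        # Print a copy-pasteable shell command; pass the list to
        # subprocess.run() instead to launch it directly.
        print(shlex.join(build_main_command(model)))
```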
The Verilog code generated by the above Edge AI is built in an environment prepared on a PC.
git clone https://github.com/noritsuna/Edge_Circuit_Designer.git
export PDK=sky130A
make setup
make Edge_Circuit_Designer
make user_project_wrapper
make precheck
make run-precheck
The prompt used in the Edge_Circuit_Designer project is as follows:
Please generate a 16 bits counter in verilog.
The following is the output from LLaMa2.
The build fails because count is already declared in the port list (output [15:0] count), so the second declaration, reg [15:0] count;, is not valid Verilog.
module counter(
input clk,
input reset,
output [15:0] count);
reg [15:0] count;
always @ (posedge clk) begin
if (!reset) begin
count <= 16'h0000;
end else begin
count <= count + 1;
end
end
endmodule
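This kind of failure can be screened for before starting a full build. The following is a rough heuristic sketch, not a Verilog parser: it looks for names that are given a direction inside the module's port list and then declared again in the module body. The function name `redeclared_ports` is hypothetical, and the regular expressions assume a single module without comments or parameters.

```python
import re

LLAMA2_OUTPUT = """
module counter(
input clk,
input reset,
output [15:0] count);
reg [15:0] count;
always @ (posedge clk) begin
if (!reset) begin
count <= 16'h0000;
end else begin
count <= count + 1;
end
end
endmodule
"""


def redeclared_ports(verilog):
    """Return port names declared in the port list and again in the body."""
    header = re.search(r"module\s+\w+\s*\((.*?)\)\s*;", verilog, re.S)
    if header is None:
        return []
    body = verilog[header.end():]
    # Collect ports that carry a direction keyword inside the parentheses.
    # A plain name list such as (clk, rst, q) yields no entries, because
    # there the body declarations are the only ones and are legal.
    ports = []
    for item in header.group(1).split(","):
        m = re.match(r"\s*(?:input|output|inout)\b.*?(\w+)\s*$", item, re.S)
        if m:
            ports.append(m.group(1))
    clashes = []
    for name in ports:
        decl = rf"\b(?:reg|wire|input|output|inout)\b[^;]*\b{re.escape(name)}\b[^;]*;"
        if re.search(decl, body):
            clashes.append(name)
    return clashes


if __name__ == "__main__":
    print(redeclared_ports(LLAMA2_OUTPUT))  # ['count']
```

One way to repair the module by hand is to declare the port as output reg [15:0] count in the port list and delete the separate reg line.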
The following is the output from Code LLaMa, which is based on LLaMa2 and fine-tuned for programming languages.
Unlike LLaMa2, it outputs working code. However, the output is correct only about 1 time in 10.
The GDSII output for this project is based on this code.
module counter(clk,rst,q);
input clk, rst;
output [15:0] q;
reg [15:0] q;
always @ (posedge clk or posedge rst) begin
if (rst == 1'b1) begin
q <= 16'h000;
end else begin
q <= q + 16'h0001;
end
end
endmodule
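To put the "about 1 in 10" success rate in perspective, the number of generations needed can be estimated with a short calculation. The sketch below assumes each generation is an independent trial with a success probability of 0.1; that independence is an assumption for illustration, not something measured in this project.

```python
import math


def attempts_needed(p_success, confidence):
    """Smallest n such that 1 - (1 - p_success)**n >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_success))


if __name__ == "__main__":
    p = 0.1  # roughly 1 correct output in 10, as observed above
    print("expected generations until the first correct output:", 1 / p)
    for confidence in (0.5, 0.9, 0.95):
        n = attempts_needed(p, confidence)
        print(f"{n} generations give at least a {confidence:.0%} chance of one correct output")
```

Under this assumption, about 10 generations are needed on average, and a few dozen are needed to be fairly sure of getting at least one correct result.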
It is a very simple 16-bit counter.
module counter(clk,rst,q);
input clk, rst;
output [15:0] q;
reg [15:0] q;
always @ (posedge clk or posedge rst) begin
if (rst == 1'b1) begin
q <= 16'h000;
end else begin
q <= q + 16'h0001;
end
end
endmodule
A test bench has been created for the above code (16-bit counter).
More details can be found here.
The test bench is executed with the following commands.
make simenv
make verify-counter-rtl
make verify-counter-gl
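As a rough illustration of what a test bench for this counter has to check, the following is a small Python reference model of the expected behavior: reset forces the value to 0, every other clock edge adds 1, and the value wraps after 16 bits. The class name `Counter16Model` is hypothetical and the model is not part of the project's test bench. For simplicity it samples rst only on clock edges, whereas the Verilog reset is asynchronous.

```python
class Counter16Model:
    """Cycle-level reference model of the generated 16-bit counter."""

    MASK = 0xFFFF  # keep the value within 16 bits

    def __init__(self):
        self.q = 0

    def clock(self, rst):
        """Apply one rising clock edge and return the new value of q."""
        self.q = 0 if rst else (self.q + 1) & self.MASK
        return self.q


if __name__ == "__main__":
    model = Counter16Model()
    model.clock(rst=True)            # reset: q becomes 0
    for _ in range(0xFFFF):          # count up to the maximum value
        model.clock(rst=False)
    print(hex(model.q))              # 0xffff
    print(model.clock(rst=False))    # wraps around to 0
```

Values from such a model can serve as the expected results that a simulation of the RTL or gate-level netlist is compared against.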
Timing files are generated by the following commands.
make setup-timing-scripts
make install_mcw
make extract-parasitics
make create-spef-mapping
make caravel-sta
The merit of this edge computing system lies not in the AI-generated results themselves, but in what becomes possible by applying the AI generation system.
We believe it offers three merits.
Because it runs on hardware of Raspberry Pi class, it can be integrated into processors at the level of an embedded device (such as a smartphone). In other words, it could be built into each piece of semiconductor manufacturing equipment.
Although we did not go as far as generating a test bench this time, it is theoretically possible to generate test bench source code with a generative AI.
This could be applied to have semiconductor manufacturing equipment (the manufacturing process) generate test benches automatically.
Because the current process does not provide for this, information about the fab and its equipment is made public so that the necessary test benches can be written. However, that information probably includes details the company would prefer not to disclose externally.
Therefore, if the designer (client) supplies "test bench specifications" and the test bench is generated by AI within the process, the information that has to be disclosed could be kept to a minimum.
This is a merit for both the fab operator and the designer who does not want to write test benches.
You can fine-tune the LLM data yourself. By fine-tuning on your own source code or circuits, you can have the AI generate source code or circuits that imitate your habits.
This is a very important merit.
An engineer who has a generative AI produce source code or circuits would not use them as they are, but would modify them. To do so, the engineer must read and understand the generated source code and circuits, and engineers find source code and circuits in their own style easier to understand than those written by someone else.
For example, personal habits, such as whether or not to use ternary operators, show up in source code and circuits. A fine-tuned generative AI produces source code and circuits that incorporate these individual quirks.
This is a merit for all engineers.
Multiple Raspberry Pis, which are very inexpensive devices, can be clustered and used in the same way as a PC with a powerful GPU costing $100,000 to $1 million.
This means that researchers and engineers, who until now could only use generative AI provided by others, can take part in the research and development of generative AI themselves. With more researchers and engineers carrying out a variety of research and development, more results can be expected.
This is a merit for all mankind.
As described above, running your own customized LLM on a small, inexpensive device such as the Raspberry Pi, or on your personal PC, could open up a new world.
This project is an Edge AI execution environment that runs Meta's LLaMa2 and Code LLaMa on a Raspberry Pi 4 with a 64-bit OS to automatically generate Verilog code.
In this case, only a simple 16-bit counter was generated. However, as noted under Technical Merit, this is a system that shows promise.
We call this system "Edge Circuit Designer".
The results of this study showed that Code LLaMa, whose LLM data is fine-tuned for programming languages, performed better than the generic LLaMa2.
In other words, tuning LLM data (such as LLaMa2) with a Verilog-specific training set, or with your own source code as training data, could potentially support a variety of models.
If you want to create a Verilog-specific training model, you can simply scrape the Verilog files on GitHub. However, those files are difficult to use as public data because they may not be licensed for use as training data.
Therefore, the more Verilog files the community makes available, the more valuable this system will be. We are convinced that this will be of great help to those who use this system.
- Software for fine-tuning LLM data (LLaMa2)
  - finetune.py from Alpaca-lora
- Rewrite two parameters in finetune.py
base_model: str = "",  # The LLM data path
data_path: str = "",  # The training data path
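The training data that data_path points to has to be prepared separately. The sketch below shows one way it might be assembled from a directory of your own Verilog files. It assumes the instruction/input/output JSON layout used by the Alpaca dataset. The function names, the instruction wording, and the file layout are all hypothetical, so check them against the finetune.py version you use.

```python
import json
from pathlib import Path


def build_records(source_dir,
                  instruction_template="Please generate a {name} module in verilog."):
    """Turn every *.v file in source_dir into one Alpaca-style record."""
    records = []
    for path in sorted(Path(source_dir).glob("*.v")):
        records.append({
            # Hypothetical instruction text derived from the file name.
            "instruction": instruction_template.format(name=path.stem),
            "input": "",
            "output": path.read_text(encoding="utf-8"),
        })
    return records


def write_dataset(source_dir, out_path):
    """Write the records as a JSON list and return how many were written."""
    records = build_records(source_dir)
    Path(out_path).write_text(json.dumps(records, indent=2), encoding="utf-8")
    return len(records)


if __name__ == "__main__":
    # Example: collect ./my_verilog/*.v into verilog_train.json
    count = write_dataset("./my_verilog", "verilog_train.json")
    print(f"wrote {count} records")
```

The resulting JSON file would then be the value given for data_path.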
❗ Important Note |
---|
Refer to README for a quickstart of how to use caravel_user_project
Refer to README for this sample project documentation.