Update the Base: Vector Add sample README.md #1019

Merged: 1 commit, Aug 4, 2022
296 changes: 117 additions & 179 deletions DirectProgramming/DPC++/DenseLinearAlgebra/vector-add/README.md
@@ -1,240 +1,177 @@
# `vector-add` Sample
# `Base: Vector Add` Sample

Vector Add is the equivalent of a ‘Hello, World!’ sample for data parallel
programs. Building and running the code sample verifies that your development
environment is set up correctly and demonstrates the use of the core features of
SYCL*.
The `Base: Vector Add` is the equivalent of a ‘Hello, World!’ sample for data parallel programs. Building and running this sample verifies that your development environment is set up correctly, and the sample code demonstrates some core features of SYCL*.

For comprehensive instructions, see the [Intel® oneAPI Programming
Guide](https://software.intel.com/en-us/oneapi-programming-guide) and search
based on relevant terms noted in the comments.


| Optimized for | Description
|:--- |:---
| OS | Linux* Ubuntu* 18.04 <br>Windows* 10
| Hardware | Skylake with GEN9 or newer <br>Intel&reg; Programmable Acceleration Card with Intel&reg; Arria&reg; 10 GX FPGA
| Software | Intel&reg; oneAPI DPC++/C++ Compiler
| Area | Description
|:--- |:---
| What you will learn | How to begin using SYCL* to offload computations to a GPU
| Time to complete | 15 minutes

## Purpose
The `Base: Vector Add` is a simple program that adds two large vectors of integers and verifies the results. This program uses C++ and SYCL* for Intel® CPUs and accelerators.
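
As a rough illustration of what "adds two vectors and verifies the results" amounts to, the following host-only C++ sketch computes the sum sequentially and compares a candidate result against it. It is not taken from the sample source and uses no SYCL; the names `AddOnHost` and `MatchesHostReference` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise sum of two equally sized vectors,
// computed sequentially on the host.
std::vector<int> AddOnHost(const std::vector<int>& a, const std::vector<int>& b) {
  assert(a.size() == b.size());
  std::vector<int> sum(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) sum[i] = a[i] + b[i];
  return sum;
}

// Hypothetical helper: true when `candidate` (for example, a result computed
// on a device) matches the sequential host reference.
bool MatchesHostReference(const std::vector<int>& a, const std::vector<int>& b,
                          const std::vector<int>& candidate) {
  return candidate == AddOnHost(a, b);
}
```

A device result would be passed as `candidate`; a mismatch suggests a problem in the offloaded computation or in the environment setup.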

The `vector-add` is a simple program that adds two large vectors of integers and
verifies the results. This program is implemented using C++ and SYCL* for
Intel&reg; CPU and accelerators.

In this sample, you can learn how to use the most basic code in C++ language
that offloads computations to a GPU. This includes using Unified Shared Memory
(USM) and buffers. USM requires an explicit wait for the asynchronous kernel's
computation to complete. Buffers, at the time they go out of scope, bring main
memory in sync with device memory implicitly; the explicit wait on the event is
not required as a result. This sample provides examples of both implementations
for simple side-by-side reviews (the Windows sample only supports USM).

The code attempts to execute on an available GPU and falls back to the system CPU
if a compatible GPU is not detected. If successful, the name of the offload
device and a success message are displayed, which indicates your development
environment is set up correctly.

In addition, you can target an FPGA device using the build scripts described
below. If you do not have FPGA hardware, the sample will run in emulation mode,
which includes static optimization reports for design analysis.
In this sample, you can learn how to use C++ code to offload computations to a GPU, using both Unified Shared Memory (USM) and buffers. USM requires an explicit wait for the asynchronous kernel's computation to complete. Buffers implicitly bring main memory in sync with device memory when they go out of scope, so no explicit wait on the event is required. This sample provides both implementations for side-by-side review (the Windows sample only supports USM).
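
The scope-based synchronization described for buffers can be pictured with a plain C++ analogy. The sketch below is not SYCL and is not the sample's code: `WriteBackBuffer` is a hypothetical stand-in that works on a private copy and writes it back to the host vector in its destructor, so the host data is only guaranteed to be current once the scope closes. The USM path, where the caller must wait explicitly, is not modeled here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a buffer: it works on a private copy of the host
// data and writes that copy back to the host vector when it is destroyed.
class WriteBackBuffer {
 public:
  explicit WriteBackBuffer(std::vector<int>& host)
      : host_(host), device_copy_(host) {}
  ~WriteBackBuffer() { host_ = device_copy_; }  // implicit sync on scope exit

  WriteBackBuffer(const WriteBackBuffer&) = delete;
  WriteBackBuffer& operator=(const WriteBackBuffer&) = delete;

  std::vector<int>& data() { return device_copy_; }

 private:
  std::vector<int>& host_;
  std::vector<int> device_copy_;
};

// Adds `a` into `sum` through the stand-in buffer. `sum` is only guaranteed
// to hold the result after the inner scope closes.
void AddThroughBuffer(const std::vector<int>& a, std::vector<int>& sum) {
  {
    WriteBackBuffer buffer(sum);
    for (std::size_t i = 0; i < a.size() && i < sum.size(); ++i) {
      buffer.data()[i] += a[i];
    }
  }  // destructor copies the result back to `sum` here
}
```

Reading `sum` inside the inner scope would still show the old values, which mirrors why buffer-based code reads its results only after the buffer goes out of scope.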

A detailed code walkthrough can be found in the [Explore SYCL* with Samples from
Intel](https://software.intel.com/content/www/us/en/develop/documentation/explore-dpcpp-samples-from-intel/top.html#top_STEP1_VECTOR_ADD)
guide.

> **Note**: For comprehensive information about oneAPI programming, see the [Intel® oneAPI Programming Guide](https://software.intel.com/en-us/oneapi-programming-guide). (Use search or the table of contents to find relevant information quickly.)

## Prerequisites
| Optimized for | Description
|:--- |:---
| OS | Ubuntu* 18.04 <br> Windows* 10
| Hardware | Skylake with GEN9 or newer <br>Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA
| Software | Intel® oneAPI DPC++/C++ Compiler

## Key Implementation Details
The basic SYCL* implementation explained in the code includes device selector, USM, buffer, accessor, kernel, and command groups.

The basic SYCL* implementation explained in the code includes device selector,
USM, buffer, accessor, kernel, and command groups.
The code attempts to execute on an available GPU and falls back to the system CPU if a compatible GPU is not detected. If successful, the name of the offload device and a success message are displayed, which indicates your development environment is set up correctly.
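
The GPU-first, CPU-fallback policy can be sketched in plain C++ as follows. This is only an illustration of the selection logic under assumed types; the sample itself relies on a SYCL device selector, and `DeviceKind`, `Device`, and `PickDevice` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical device description; the sample queries real devices via SYCL.
enum class DeviceKind { kCpu, kGpu };

struct Device {
  DeviceKind kind;
  std::string name;
};

// Returns the first GPU in `available`, otherwise the first CPU.
// Returns nullptr when neither kind is present.
const Device* PickDevice(const std::vector<Device>& available) {
  const Device* cpu = nullptr;
  for (const Device& device : available) {
    if (device.kind == DeviceKind::kGpu) return &device;
    if (device.kind == DeviceKind::kCpu && cpu == nullptr) cpu = &device;
  }
  return cpu;
}
```

Printing the `name` of the chosen device corresponds to the "Running on device" line in the sample output.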

## License
Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt)
for details.
In addition, you can target an FPGA device using the build scripts described below. If you do not have FPGA hardware, the sample will run in emulation mode, which includes static optimization reports for design analysis.

Third party program Licenses can be found here:
[third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).
### Known Issues
With oneAPI 2021.4, the argument for accessors changed from `noinit` to `no_init`. This follows a change between the SYCL 2020 provisional specification and SYCL 2020 revision 3.

## Known Issues
With oneAPI 2021.4 the argument for accessors was changed from `noinit` to
`no_init`. The change was derived from a change between the SYCL 2020
provisional spec and that of the 2020Rev3 spec
If this sample fails to run, do one of the following:
- Update the Intel® oneAPI Base Toolkit to 2021.4 or later.
- Change the `no_init` argument to `noinit`.

If this sample fails to run, do one of the following:
- Update the Intel® oneAPI Base Toolkit to 2021.4.
- Change the `no_init` argument to `noinit`.
## Setting Environment Variables
When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development.

> **Note**: If you have not already done so, set up your CLI environment by
> sourcing the `setvars` script located in the root of your oneAPI
> installation.
## Build the `Base: Vector Add` Sample for GPU and FPGA
> **Note**: If you have not already done so, set up your CLI
> environment by sourcing the `setvars` script in the root of your oneAPI installation.
>
> Linux:
> Linux*:
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
> - For private installations: `. ~/intel/oneapi/setvars.sh`
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
>
> Windows:
> Windows*:
> - `C:\Program Files (x86)\Intel\oneAPI\setvars.bat`
> - For Windows PowerShell*, use the following command: `cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'`
>
>For more information on environment variables, see Use the setvars Script for
>[Linux or
>macOS](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html),
>or
>[Windows](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-windows.html).
> For more information on configuring environment variables, see [Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html) or [Use the setvars Script with Windows*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-windows.html).

### Running Samples in DevCloud

If running a sample in the Intel DevCloud, you must specify the compute node
(CPU, GPU, FPGA) and whether to run in batch or interactive mode. For more
information, see the Intel&reg; oneAPI Base Toolkit [Get Started
Guide](https://devcloud.intel.com/oneapi/get_started/).

### Using Visual Studio Code* (Optional)

You can use Visual Studio Code (VS Code) extensions to set your environment,
### Using Visual Studio Code* (VS Code) (Optional)
You can use Visual Studio Code* (VS Code) extensions to set your environment,
create launch configurations, and browse and download samples.

The basic steps to build and run a sample using VS Code include:
- Download a sample using the extension **Code Sample Browser for Intel oneAPI
Toolkits**.
- Configure the oneAPI environment with the extension **Environment
Configurator for Intel oneAPI Toolkits**.
- Open a Terminal in VS Code (**Terminal>New Terminal**).
- Run the sample in the VS Code terminal using the instructions below.

To learn more about the extensions and how to configure the oneAPI environment,
see the [Using Visual Studio Code with Intel&reg; oneAPI Toolkits User
Guide](https://software.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html).
1. Configure the oneAPI environment with the extension **Environment Configurator for Intel® oneAPI Toolkits**.
2. Download a sample using the extension **Code Sample Browser for Intel® oneAPI Toolkits**.
3. Open a terminal in VS Code (**Terminal > New Terminal**).
4. Run the sample in the VS Code terminal using the instructions below.

After learning how to use the extensions for Intel&reg; oneAPI Toolkits, return
to this readme for instructions on how to build and run a sample.
To learn more about the extensions and how to configure the oneAPI environment, see the
[Using Visual Studio Code with Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html).


### On Linux* using the Command Line
Perform the following steps:

1. Build the program using the following `make` commands (default uses buffers):
### On Linux* for CPU and GPU
1. Build the program.
```
make all
make build_usm
```
> **Note**: For USM use `make build_usm`
> **Note**: To build everything, use `make all`.

2. Run the program using:
### On Linux* for FPGA
1. Build for FPGA emulation using the following commands:
```
make run
make fpga_emu -f Makefile.fpga
```
> **Note**: For USM use `make run_usm`

3. Clean the program using:
2. Build for FPGA hardware. (Compiling for hardware can take a long
time.)
```
make clean
make hw -f Makefile.fpga
```

### On Windows* Using a Command Line Interface

1. Select **Programs** > **Intel oneAPI 2021** > **Intel oneAPI Command Prompt**
to launch a command window.
2. Build the program using the following `nmake` commands:
3. Generate static optimization reports for design analysis. (The path to the
reports is `vector-add_report.prj/reports/report.html`.)
```
nmake -f Makefile.win
make report -f Makefile.fpga
```
> **Note**: For USM use `nmake -f Makefile.win build_usm`

3. Run the program using:
### On Windows* for CPU and GPU
1. Open the **Intel oneAPI Command Prompt**.
2. Build the program.
```
nmake -f Makefile.win run
nmake -f Makefile.win build_usm
```
> **Note**: For USM use `nmake -f Makefile.win run_usm`
> **Note**: To build everything, use `nmake -f Makefile.win`.

4. Clean the program using:
```
nmake -f Makefile.win clean
```
### On Windows for FPGA Emulation Only
> **Note**: On Windows*, you can compile and run on the FPGA
> emulator only. Generating optimization reports and compiling or running on
> the FPGA hardware are not supported.

### On a Windows* System Using Visual Studio* Version 2017 or Newer
Perform the following steps:
1. Launch the Visual Studio* 2017.
2. Select the menu sequence **File** > **Open** > **Project/Solution**.
3. Locate the `vector-add` folder.
1. Open the **Intel oneAPI Command Prompt**.

2. Build the program.
```
nmake -f Makefile.win.fpga
```
### On Windows Using Visual Studio* 2017 or Newer
1. Change to the sample directory.
2. Launch Visual Studio*.
3. Select the menu sequence **File** > **Open** > **Project/Solution**.
4. Select the `vector-add.sln` file.
5. Select the configuration 'Debug' or 'Release'
6. Select **Project** > **Build** menu option to build the selected
configuration.
7. Select **Debug** > **Start Without Debugging** menu option to run the
program.
5. For CPU and GPU, skip to Step 7 (below).
6. For FPGA emulation only, select the configuration **Debug-fpga**, which contains the settings shown below. Alternatively, confirm the following settings from the **Project Property** dialog.

a. Select the **DPC++** tab.

b. **General** > **Perform ahead of time compilation for the FPGA** is set to **Yes**.

## Building the `vector-add` Program for Intel&reg; FPGA
c. **Preprocessor** > **Preprocessor Definitions** contains **FPGA_EMULATOR=1**.

### Linux*
7. Select **Project** > **Build** menu option to build the selected
configuration.
8. Select **Debug** > **Start Without Debugging** menu option to run the
program.

Perform the following steps:
#### Troubleshooting
If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility.

1. Clean the `vector-add` program using:
## Run the Sample
### On Linux for CPU and GPU
1. Run the program.
```
make clean -f Makefile.fpga
make run_usm
```
2. Based on your requirements, you can perform the following:
* Build and run for FPGA emulation using the following commands:
> **Note**: To run everything, use `make run`.

### On Linux for FPGA
1. Run for FPGA emulation.
```
make fpga_emu -f Makefile.fpga
make run_emu -f Makefile.fpga
```
* Build and run for FPGA hardware. (The hardware compilation can take a long
time to complete.)
2. Run on FPGA hardware.
```
make hw -f Makefile.fpga
make run_hw -f Makefile.fpga
```
* Generate static optimization reports for design analysis. Path to the
reports is `vector-add_report.prj/reports/report.html`
### On Windows for CPU and GPU
1. Open the **Intel oneAPI Command Prompt**.
2. Run the program using:
```
make report -f Makefile.fpga
nmake -f Makefile.win run_usm
```
> **Note**: To run everything, use `nmake -f Makefile.win run`.

### On Windows Using a Command Line Interface
Perform the following steps:

> **Note** On a Windows* system, you can only compile and run on the FPGA
emulator. Generating an HTML optimization report and compiling and running on
the FPGA hardware is not currently supported.

1. Select **Programs** > **Intel oneAPI 2021** > **Intel oneAPI Command Prompt**
to launch a command window.
2. Build the program using the following `nmake` commands:
### On Windows for FPGA Emulation
1. Open the **Intel oneAPI Command Prompt**.
2. Build and run the program.
```
nmake -f Makefile.win.fpga clean
nmake -f Makefile.win.fpga
nmake -f Makefile.win.fpga run
```
### Run the `Base: Vector Add` Sample in Intel® DevCloud (Optional)
When running a sample in the Intel® DevCloud, you must specify the compute node (CPU, GPU, FPGA) and whether to run in batch or interactive mode. For more information, see the Intel® oneAPI Base Toolkit [Get Started Guide](https://devcloud.intel.com/oneapi/get_started/).

### On Windows Using Visual Studio* Version 2017 or Newer
Perform the following steps:
1. Launch the Visual Studio* 2017.
2. Select the menu sequence **File** > **Open** > **Project/Solution**.
3. Locate the `vector-add` folder.
4. Select the `vector-add.sln` file.
5. Select the configuration 'Debug-fpga', which has the necessary project
settings listed below:

Under the 'Project Property' dialog:

a. Select the **DPC++** tab. <br>b. In the **General** subtab, the
**Perform ahead of time compilation for the FPGA** setting is set to
**Yes**. <br>c. In the **Preprocessor** subtab, the **Preprocessor
Definitions** setting has **FPGA_EMULATOR=1** added. <br>d. Close the
dialog.

6. Select **Project** > **Build** menu option to build the selected
configuration.
7. Select **Debug** > **Start Without Debugging** menu option to run the
program.

## Running the Sample
### Application Parameters
There is an optional parameter which determines the size of vector. Default
value is 10000.
An optional parameter sets the vector size. The default value is `10000`.
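
One plausible way to handle such an optional size argument is sketched below. It assumes the size is passed as the first command-line argument; falling back to the default on invalid or non-positive input is a choice made for this sketch, not documented behavior of the sample. `ParseVectorSize` and `kDefaultVectorSize` are hypothetical names.

```cpp
#include <cstddef>
#include <exception>
#include <string>

constexpr std::size_t kDefaultVectorSize = 10000;

// Hypothetical helper: reads the vector size from argv[1] when present,
// otherwise returns the default. Invalid or non-positive input also falls
// back to the default (an assumption of this sketch).
std::size_t ParseVectorSize(int argc, const char* const argv[]) {
  if (argc < 2) return kDefaultVectorSize;
  try {
    const long long parsed = std::stoll(argv[1]);
    return parsed > 0 ? static_cast<std::size_t>(parsed) : kDefaultVectorSize;
  } catch (const std::exception&) {
    return kDefaultVectorSize;  // not a number, or out of range
  }
}
```

Called from `main` as `ParseVectorSize(argc, argv)`, the helper yields `10000` when the program is run without arguments.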

### Example of Output
## Example Output
```
Running on device: Intel(R) Gen(R) HD Graphics NEO
Vector size: 10000
@@ -246,9 +183,10 @@ Vector size: 10000
Vector add successfully completed on device.
```

## Troubleshooting
If you receive an error message, troubleshoot the problem using the Diagnostics
Utility for Intel&reg; oneAPI Toolkits, which provides system checks to find
missing dependencies and permissions errors. See [Diagnostics Utility for
Intel&reg; oneAPI Toolkits User
Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html).
## License
Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt)
for details.

Third party program Licenses can be found here:
[third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).