Test final Monero POW cryptonight_v8 #1851

Open
psychocrypt opened this issue Sep 24, 2018 · 142 comments
@psychocrypt
Collaborator

psychocrypt commented Sep 24, 2018

Monero is changing their POW in October 2018. Please test the implementation of the new algorithm against the test pool (http://killallasics.moneroworld.com/).

You can find the source code of xmr-stak in pull request #1850 or download the zipped source directly.

Please report here only the speed comparison between cryptonight_v7 and cryptonight_v8. If you find any bugs, please report them in pull request #1850.
Please also take the time to mine for a few minutes against the testnet pool to check that you do not get invalid results.

How to bench the system:

Please start the miner once with ./xmr-stak to create pools.txt and all other config files.
Change cryptonight_v8 to cryptonight_v7 to measure the performance of the current Monero POW. Please do not forget to remove the backend configs if you switch the algorithm, because "strided_index" : 1 is not allowed for cryptonight_v8.
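
As a rough illustration of the restriction above, the sketch below flags GPU thread entries that still use "strided_index" : 1 before a cryptonight_v8 run. The function name and the use of plain Python dicts are assumptions made for this example; xmr-stak itself does not ship such a helper, and the sample entries are illustrative.

```python
# Hypothetical pre-flight check, not part of xmr-stak.
# Each dict mirrors one entry of "gpu_threads_conf" as quoted in this thread.

def invalid_for_v8(gpu_threads):
    """Return the positions of thread entries that use strided_index 1."""
    return [i for i, t in enumerate(gpu_threads) if t.get("strided_index") == 1]


# Illustrative entries: the first one would have to be changed or regenerated.
threads = [
    {"index": 0, "intensity": 512, "worksize": 32, "strided_index": 1},
    {"index": 0, "intensity": 512, "worksize": 32, "strided_index": 2},
]
print(invalid_for_v8(threads))
```

Deleting the backend config files and letting the miner regenerate them, as described above, avoids the problem without any manual check.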

CPU:

./xmr-stak  --currency cryptonight_v8 --noAMD --noNVIDIA --benchmark 8 --benchwait  20 --benchwork 30

CUDA/AMD OpenCL:

./xmr-stak --currency cryptonight_v8  --benchmark 8 --benchwait  20 --benchwork 30

CUDA is currently not supported. I am currently trying to get some performance out of it.

NVIDIA via OpenCL

./xmr-stak --currency cryptonight_v8 --openCLVendor NVIDIA --benchmark 8 --benchwait  20 --benchwork 30
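
For anyone scripting the comparison, the following sketch builds the benchmark command lines for both algorithms. It only uses flags that appear in the commands above; the function name and the default binary path are assumptions for this example.

```python
# Hypothetical helper for building benchmark command lines.
# Only flags quoted in this issue are used.

def bench_cmd(currency, extra_flags=(), binary="./xmr-stak"):
    """Build the argv list for one benchmark run."""
    return [binary, "--currency", currency, *extra_flags,
            "--benchmark", "8", "--benchwait", "20", "--benchwork", "30"]


cpu_only = ("--noAMD", "--noNVIDIA")
for algo in ("cryptonight_v7", "cryptonight_v8"):
    # Print the command; pass the list to subprocess.run() to execute it.
    print(" ".join(bench_cmd(algo, cpu_only)))
```

The backend config files still have to be removed between the two runs, so a wrapper script would need to handle that step as well.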

Template for speed reporting:

OS: XX
Backend: (including the type e.g. AMD RX570)
 - CPU
  - NVIDIA (native CUDA or via OpenCL)
  - AMD 
# if the CPU/GPU is overclocked please add the modifications here
speed:  
 - cryptonight_v7: XXX H/s
 - cryptonight_v8: XXX H/s
Miner config: please add here your config for the backend
@SChernykh
Contributor

I didn't run the benchmark this time; I just started mining for a few minutes and checked the highest reported hashrate.

OS: Windows 7
Backend: CPU, Intel Core i5 3210M

speed:  
 - cryptonight_v7: 74.3 H/s
 - cryptonight_v8: 67.3 H/s

config:
"cpu_threads_conf" :
[
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 0 },
],

This is strange - I get 68.3 H/s with xmrig on this machine. It uses the same asm code, the only difference is that I used Visual Studio 2017 to compile xmr-stak and MSYS2 with GCC 8.2.0 to compile xmrig.

@SChernykh
Contributor

SChernykh commented Sep 24, 2018

OS: Windows 10
Backend: CPU, AMD Ryzen 5 2600 @ 4 GHz

speed:  
 - cryptonight_v7: 630.6 H/s
 - cryptonight_v8: 627.3 H/s

config:
"cpu_threads_conf" :
[
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 0 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 2 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 4 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 5 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 6 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 8 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 10 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 11 },
],

xmrig showed identical performance this time.

@SChernykh
Contributor

OS: Windows 7
Backend: CPU, Intel Core i7 2600k

speed:  
 - cryptonight_v7: 287.5 H/s
 - cryptonight_v8: 264.3 H/s

config:
"cpu_threads_conf" :
[
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 0 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 2 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 4 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 6 },
],

Again, xmrig showed a bit higher hashrate - 267.9 H/s.

@psychocrypt
Collaborator Author

@SChernykh Thanks for your tests. A difference of a few hashes is possible, depending on which port of the test pool you are mining on. If you are lucky and find a lot of hashes, the hash rate will go down by a few hashes. But let us wait for other results.

Big thanks again for the asm code.

@SChernykh
Contributor

SChernykh commented Sep 24, 2018

I've also tested RX 560: hashrate numbers here are what was reported as highest when mining. I double checked it - performance is identical, I was mining v7 against v7 pool and v8 against v8 pool, all shares were accepted. I tried a few different configs, but I couldn't find faster settings for v7.

OS: Windows 10
Backend: OpenCL, AMD Radeon RX 560 4GB, 1 click PBE timing straps, core @ 1196 MHz, memory @ 2200 MHz

speed:  
 - cryptonight_v7: 469.3 H/s
 - cryptonight_v8: 469.3 H/s

config for v7:
"gpu_threads_conf" : [
  // gpu: Baffin memory:2752
  // compute units: 16
  { "index" : 0,
    "intensity" : 512, "worksize" : 32,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 8, "comp_mode" : true
  },
  { "index" : 0,
    "intensity" : 512, "worksize" : 32,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 8, "comp_mode" : true
  },
],

config for v8:
"gpu_threads_conf" : [
  // gpu: Baffin memory:2752
  // compute units: 16
  { "index" : 0,
    "intensity" : 1024, "worksize" : 32,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 16, "comp_mode" : true
  },
],

@SChernykh
Contributor

60 second benchmark on the same RX 560 with configs listed above:

speed:  
 - cryptonight_v7: 466.2 H/s
 - cryptonight_v8: 461.8 H/s

I'm not sure what numbers to trust more.
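
One way to put the two measurement methods side by side is to express v8 as a percentage of v7 for each. The sketch below does that with the RX 560 numbers from the two posts above; the function name is an assumption for this example.

```python
# Compare v8 against v7 for both measurement methods reported for the RX 560.

def v8_vs_v7(v7_hs, v8_hs):
    """v8 hashrate as a percentage of v7, rounded to one decimal."""
    return round(100.0 * v8_hs / v7_hs, 1)


peak = v8_vs_v7(469.3, 469.3)    # highest value seen while mining
bench = v8_vs_v7(466.2, 461.8)   # 60 second benchmark
print(f"peak while mining: {peak}%, 60 s benchmark: {bench}%")
```

Both methods put v8 within about one percent of v7 on this card, so the choice of method matters little for the comparison requested in this issue.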

@SChernykh
Contributor

SChernykh commented Sep 24, 2018

@MoneroCrusher @mobilepolice @kio3i0j9024vkoenio @Bathmat This is the final code for the next Monero PoW, it would be good if you tested it on everything you got and posted results here.

Edit: everything except NVIDIA GPUs, CUDA version is not ready yet.

@Spudz76
Contributor

Spudz76 commented Sep 24, 2018

CPU is i7-2600 non-K cache 8M

OS: XX
Backend: CPU
speed:  
 - cryptonight_v7: 161 H/s
 - cryptonight_v8: 162 H/s
Miner config:
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 0 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 2 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 4 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 6 },

Also tried prefetch both ways and all three asm settings; in all cases the config above was the winner.
High variance between tests; I blame Firefox in the background, but I ran it a bunch of times and took the best.
Did the N=2 optimization get in? I skipped l_p_m:2 tests for now.

@SChernykh
Contributor

@Spudz76 it's better to test without anything running in background, right after reboot. Double hash asm code is not added to xmr-stak yet.

@plavirudar

plavirudar commented Sep 24, 2018

Nvidia OpenCL causes severe lag for my computer (1080ti) and makes it unusable while mining. Previously, using CUDA with bfactor=12 and bsleep=100 caused no slowdown of the computer while doing word processing and basic videos.

Is there a tweak that I'm missing here? Currently it's so laggy that it causes music to stutter and even pressing "h" to show hashrate isn't working properly.

@Bathmat

Bathmat commented Sep 24, 2018

Nvidia OpenCL causes severe lag for my computer (1080ti) and makes it unusable while mining. Previously, using CUDA with bfactor=12 and bsleep=100 caused no slowdown of the computer while doing word processing and basic videos.

Is there a tweak that I'm missing here?

@plavirudar you could try lowering the intensity... might help
@psychocrypt do you expect to be able to get a working CUDA version? I have an older Win7 rig with a GTX970 and 2x GTX1050s that OpenCL is not playing nice with. I'll keep troubleshooting though.

@plavirudar

plavirudar commented Sep 24, 2018

@Bathmat I started off with intensity 896 (the recommended value) and it was hashing ~1000 h/s with ethlargement + 600 MHz mem OC (which is similar to its performance on cnh/cnv1); however, the computer was unusable. When I dropped intensity to 640, the computer was still unusable and the hashrate fell to 800, which was lower than its performance on cnv2 CUDA.

@plavirudar

plavirudar commented Sep 24, 2018

@Spudz76 Are you sure you're using the right CPU? I have an i7-2600 non-k and I'm getting 220 with "asm":"off", 256 with "asm":"intel" , 250 with "asm":"ryzen" and 270 with CNv7 (so ~5% slowdown after using the best asm optimizations). That CPU is also running a bunch of random shitcoin daemons as well, so it's not even performing at max.

Not sure what you meant by OS:XX, I tried mine on Ubuntu 16.04 with threads 0,1,2,3.

@w104tcl

w104tcl commented Sep 24, 2018

OS: Kubuntu
Backend: - CPU - Intel Core i5-3320M - stock settings
speed:

  • cryptonight_v7: 74.1 H/s

  • cryptonight_v8: 74.6 H/s

    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 0 },
    { "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 2 },

@Spudz76
Contributor

Spudz76 commented Sep 24, 2018

@plavirudar Win7 with all sorts of garbage running in the background (my daily driver desktop box). Definitely not rebooting, let alone closing all these tabs. I have other machines I can test that are only for mining, though none is an i7-2600; most of them are non-AES and Linux, so I was testing the applicable AES-capable stuff I have first.

Also, this is about the comparison from v7 to v8, not global competition. I ran many passes and took the highest, which should account for background task variance. So whatever is holding me back is doing the same thing to v8, and the delta still applies (in this case v8 was faster). I don't mine on this box normally, but it is an extra test point (and it has a GTX970 as well). I did tell Firefox to quit fiddling with the GPU for offloading and took a few other easy avoidance measures (mostly to free more VRAM).

@Spudz76
Contributor

Spudz76 commented Sep 24, 2018

Same Win7 box as above GTX970-4GB stock / driver profile max performance + P0

Backend: OpenCL->NVIDIA
speed:  
 - cryptonight_v7: 448 H/s
 - cryptonight_v8: 378 H/s
Miner config:
  // gpu: GeForce GTX 970 memory:3968
  // compute units: 13
  { "index" : 0,
    "intensity" : 832, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 0, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : false
  },

832 = 13 * 8 * 8 = smx * 8 * worksize which seemed to work best
unroll 4 and 8 same performance
strided 2 and chunk 2,3,4 did not change much (mostly same, some slightly worse)
Similar with worksizes, intensities.
In all cases windows chokes the entire time the miner is hashing.
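
The intensity arithmetic above can be written down as a small helper. This is only a sketch of the rule of thumb reported for this GTX 970 (compute units times 8 times worksize); the function name is an assumption, and the factor of 8 is an observation from this one card rather than a documented xmr-stak rule.

```python
# Rule of thumb from the post above: intensity = smx * 8 * worksize.
# The multiplier of 8 is an empirical value for this card, not a general rule.

def intensity_for(compute_units, worksize, per_unit=8):
    """Intensity suggested by the compute-unit based rule of thumb."""
    return compute_units * per_unit * worksize


# GTX 970: 13 compute units, worksize 8.
print(intensity_for(13, 8))
```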

@Bathmat

Bathmat commented Sep 24, 2018

OS: Win10 1803
Backend: AMD GPUs (4), all bios modded with 1 click timings (all hynix mem, 2000 clock)
0: RX580 8GB Gigabyte, 99W
1: RX480 4GB AMD brand, 82W
2: RX570 4GB Sapphire ITX, 101W
3: RX470 4GB Sapphire, 80W
speed:  
 - cryptonight_v7: 3530 H/s
 - cryptonight_v8: 3350 H/s
HASHRATE REPORT - AMD
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |  394.0 |  394.0 |   (na) |  1 |  395.2 |  393.7 |   (na) |
|  2 |  425.5 |  425.5 |   (na) |  3 |  424.9 |  425.2 |   (na) |
|  4 |  435.3 |  435.0 |   (na) |  5 |  434.4 |  434.7 |   (na) |
|  6 |  420.7 |  421.1 |   (na) |  7 |  421.2 |  421.1 |   (na) |
Totals (AMD):  3351.1 3350.5    0.0 H/s
-----------------------------------------------------------------
Totals (ALL):   3351.1 3350.5    0.0 H/s
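
For collecting results from several rigs, a hashrate report like the one above can be parsed with a short script. The sketch below is an assumption-laden example: the regular expression is written against the table layout shown here and may need adjusting for other miner versions.

```python
import re

# Per-thread rows copied from the report above.
REPORT = """\
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |  394.0 |  394.0 |   (na) |  1 |  395.2 |  393.7 |   (na) |
|  2 |  425.5 |  425.5 |   (na) |  3 |  424.9 |  425.2 |   (na) |
|  4 |  435.3 |  435.0 |   (na) |  5 |  434.4 |  434.7 |   (na) |
|  6 |  420.7 |  421.1 |   (na) |  7 |  421.2 |  421.1 |   (na) |
"""

# One thread entry: ID, 10s, 60s, 15m (the 15m column may be "(na)").
ENTRY = re.compile(r"\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|\s*([\d.]+)\s*\|\s*(\S+)\s*")


def parse_report(text):
    """Map thread ID -> (10s, 60s) hashrate for every entry in the table."""
    return {int(m[1]): (float(m[2]), float(m[3])) for m in ENTRY.finditer(text)}


rates = parse_report(REPORT)
total_10s = sum(r[0] for r in rates.values())
total_60s = sum(r[1] for r in rates.values())
print(f"{len(rates)} threads, 10s total {total_10s:.1f}, 60s total {total_60s:.1f}")
```

The sums land within a few tenths of the miner's own Totals line; the small gap is consistent with the per-thread values being rounded for display.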

amd.txt config:

"gpu_threads_conf" : [
  // gpu: Ellesmere memory:3920
  // compute units: 36
  { "index" : 0,
    "intensity" : 896, "worksize" : 16,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : true
  },
  { "index" : 0,
    "intensity" : 896, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : true
  },
  // gpu: Ellesmere memory:3712
  // compute units: 36
  { "index" : 1,
    "intensity" : 896, "worksize" : 16,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : true
  },
  { "index" : 1,
    "intensity" : 896, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : true
  },
  // gpu: Ellesmere memory:3712
  // compute units: 32
  { "index" : 2,
    "intensity" : 896, "worksize" : 16,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : true
  },
  { "index" : 2,
    "intensity" : 896, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : true
  },
  // gpu: Ellesmere memory:3712
  // compute units: 32
  { "index" : 3,
    "intensity" : 896, "worksize" : 16,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : true
  },
  { "index" : 3,
    "intensity" : 896, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 4, "comp_mode" : true
  },
],

Performance is in line with what I was expecting. Power consumption is from HWMonitor for CNv8-2. Power use for CNv7 was lower by 3-7 watts depending on the GPU, but again, this was expected. Turns out that changing unroll didn't seem to have an effect. I tested all 8, all 4, all 1 and a mix of 8 on W:16 threads, 4 on W:8 threads, and total hashrate remained within 10 h/s for each.

Note: I did try a single thread test; however, hashrate was 7% slower than dual thread, and power consumption was the same as dual thread.
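
The power figures in this post also allow a rough efficiency comparison. The sketch below assumes the listed wattages are the CNv8 readings for the four GPUs and brackets the CNv7 figure using the stated 3 to 7 W reduction per GPU; the function and variable names are made up for this example.

```python
# Rough H/s per watt estimate from the numbers in this post.
V8_WATTS = [99, 82, 101, 80]      # per-GPU readings listed above (CNv8)
V8_HS, V7_HS = 3350.0, 3530.0     # totals from the "speed" section


def hs_per_watt(hashrate, watts):
    """Hashes per second delivered per watt of GPU power."""
    return hashrate / watts


v8_eff = hs_per_watt(V8_HS, sum(V8_WATTS))
# CNv7 drew 3 to 7 W less per GPU, so bracket the v7 figure between both ends.
v7_low = hs_per_watt(V7_HS, sum(w - 3 for w in V8_WATTS))
v7_high = hs_per_watt(V7_HS, sum(w - 7 for w in V8_WATTS))
print(f"v8: {v8_eff:.2f} H/s/W, v7: {v7_low:.2f} to {v7_high:.2f} H/s/W")
```

Under these assumptions the efficiency loss on this rig is somewhat larger than the raw hashrate loss, because v8 is both slower and slightly more power hungry.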

@kio3i0j9024vkoenio

OS: Ubuntu 16.04

Backend: CPU Only

  • CPU 4x Xeon E7-8837's in a HP DL580 G7

  • cryptonight_v7: 1624 H/s - 100%

  • cryptonight_v8: 1293 H/s - 79.6%

Note that using SChernykh XMR-Stak-CPU latest code with all the same v8 changes and the optimized asm for 1x and 2x threads produces:

  • cryptonight_v8: 1525 H/s - 93.9%

Miner config:

"cpu_threads_conf" :
[

{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 0 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 1 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 2 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 3 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 4 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 5 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 6 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 7 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 8 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 9 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 10 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 11 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 12 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 13 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 14 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 15 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 16 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 17 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 18 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 19 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 20 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 21 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 22 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 23 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 24 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 25 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 26 },
{ "low_power_mode" : true, "no_prefetch" : false,  "asm" : "ryzen", "affine_to_cpu" : 27 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 28 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 29 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 30 },
{ "low_power_mode" : false, "no_prefetch" : true,  "asm" : "ryzen", "affine_to_cpu" : 31 },
],

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Sep 25, 2018

OS: Ubuntu 16.04

Backend: 8x Nvidia GTX 750 1GB

CUDA

cryptonight_v7 1x GTX 750 1GB: varies from 228 to 249 H/s for each card
cryptonight_v7 8x GTX 750 1GB: 1897 H/s - 100%

OpenCL

cryptonight_v8 1x GTX 750 1GB: varies from 118 to 130 H/s for each card
cryptonight_v8 8x GTX 750 1GB: 1011 H/s - 53.3%

Losing almost half of the hash rate going from V7 to V8 is brutal.

Below are the auto-generated GPU config files from v7 and v8 for the first two GPUs; the remaining six have exactly the same settings as the second GPU. The first GPU has a display attached.

Miner config:

V7 CUDA Nvidia.txt configuration

// gpu: GeForce GTX 750 architecture: 50
//      memory: 859/976 MiB
//      smx: 4

{ "index" : 0,
"threads" : 30, "blocks" : 12,
"bfactor" : 8, "bsleep" :  100,
"affine_to_cpu" : false, "sync_mode" : 3,
},

// gpu: GeForce GTX 750 architecture: 50
//      memory: 943/981 MiB
//      smx: 4

{ "index" : 1,
"threads" : 32, "blocks" : 12,
"bfactor" : 2, "bsleep" :  0,
"affine_to_cpu" : false, "sync_mode" : 3,
},

V8 OpenCL AMD.txt Nvidia configuration

// gpu: GeForce GTX 750 memory:848
// compute units: 4
{ "index" : 0,
"intensity" : 416, "worksize" : 8,
"affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
"unroll" : 8, "comp_mode" : true
},

// gpu: GeForce GTX 750 memory:853
// compute units: 4
{ "index" : 1,
"intensity" : 416, "worksize" : 8,
"affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
"unroll" : 8, "comp_mode" : true
},

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Sep 25, 2018

OS: Ubuntu 16.04

Backend: 8x Nvidia GTX 750 1GB

CUDA

cryptonight_v7 1x GTX 750 1GB: varies from 228 to 249 H/s for each card
cryptonight_v7 8x GTX 750 1GB: 1897 H/s - 100%

OpenCL

cryptonight_v8 1x GTX 750 1GB: varies from 156 to 180 H/s for each card
cryptonight_v8 8x GTX 750 1GB: 1373 H/s - 72.4%

Going from V7 to V8 is now a little less brutal but still a 27.6% lower hash rate than v7.

Below are the GPU config files for v7 and v8

I have tweaked the v8 settings from the auto generated to the best settings I could obtain by various changes and retesting.

Unroll of 4 is the best, going to 8 reduces performance
Intensity of 352 is also the best, going with the autodefined 416 kills performance
Also changing worksize to either 12 or 4 kills performance

Miner config:

V7 CUDA Nvidia.txt configuration

// gpu: GeForce GTX 750 architecture: 50
//      memory: 859/976 MiB
//      smx: 4

{ "index" : 0,
"threads" : 30, "blocks" : 12,
"bfactor" : 8, "bsleep" :  100,
"affine_to_cpu" : false, "sync_mode" : 3,
},

// gpu: GeForce GTX 750 architecture: 50
//      memory: 943/981 MiB
//      smx: 4

{ "index" : 1,
"threads" : 32, "blocks" : 12,
"bfactor" : 2, "bsleep" :  0,
"affine_to_cpu" : false, "sync_mode" : 3,
},

V8 OpenCL AMD.txt Nvidia configuration

// gpu: GeForce GTX 750 memory:848
// compute units: 4
{ "index" : 0,
"intensity" : 352, "worksize" : 8,
"affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
"unroll" : 4, "comp_mode" : true
},

// gpu: GeForce GTX 750 memory:853
// compute units: 4
{ "index" : 1,
"intensity" : 352, "worksize" : 8,
"affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
"unroll" : 4, "comp_mode" : true
},

@SChernykh
Contributor

SChernykh commented Sep 25, 2018

OS: Kubuntu
Backend: - CPU - Intel Core i5-3320M - stock settings
speed:
cryptonight_v7: 74.1 H/s
cryptonight_v8: 74.6 H/s
{ "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 0 },
{ "low_power_mode" : false, "no_prefetch" : true, "asm" : "intel", "affine_to_cpu" : 2 },

I can confirm these numbers. I got 73.6 H/s on CNv7 and 74.2 H/s on CNv8 with Core i5-3210M and these settings. Even though it has only 3 MB cache, second CPU thread helps a lot more when running CNv8.

@Bathmat

Bathmat commented Sep 25, 2018

@kio3i0j9024vkoenio did you try "strided_index" : 0?

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Sep 25, 2018

Just tried "strided_index" : 0 and the results are exactly the same as with "strided_index" : 2:

cryptonight_v7 1x GTX 750 1GB: varies from 228 to 249 H/s for each card
cryptonight_v7 8x GTX 750 1GB: 1897 H/s - 100%

cryptonight_v8 1x GTX 750 1GB: varies from 156 to 180 H/s for each card
cryptonight_v8 8x GTX 750 1GB: 1376 H/s - 72.5%

EDIT

I have tried many other changes to the config file, and the absolute best I can get with OpenCL is:

cryptonight_v8 1x GTX 750 1GB: varies from 158 to 181 H/s for each card
cryptonight_v8 8x GTX 750 1GB: 1392 H/s - 73.4%

The final config is:

// gpu: GeForce GTX 750 memory:848
// compute units: 4 
{ "index" : 0,
"intensity" : 352, "worksize" : 8,
"affine_to_cpu" : false, "strided_index" : 0, "mem_chunk" : 0,
"unroll" : 2, "comp_mode" : false
}, 

I hope that the CUDA version can be made available soon and I hope for better results with it.

@Bathmat

Bathmat commented Sep 26, 2018

I have a Win7 rig with 3 Nvidia GPUs that is giving me issues with OpenCL... GPUs are one GTX970, and 2 GTX1050. If I run just 1 gpu, hashrates are about what I expect for OpenCL; however, if I try to run all 3, hashrate drops significantly and watching HWmonitor shows that GPU Utilization will only be 100% for 1 gpu at a time and it rotates between the gpus (thus causing the low hashrate). Does anyone know how to force each GPU to work simultaneously using OpenCL and Win7?

Thoughts @Spudz76, @kio3i0j9024vkoenio? I've tried Googling, but my searches are coming up empty. Perhaps something in nvidia-smi? I've never really used nvidia-smi, so I'm not very familiar.

EDIT: P.S. this rig works just fine on CNv7 and CUDA

@psychocrypt
Collaborator Author

Everyone can now check the performance of the native CUDA backend. Please note that the default config for CUDA devices is completely different from the old configs.
From my first checks, it looks like old Kepler GPUs will have only 1/3 of the performance compared to v7.

@Q-Sharp

Q-Sharp commented Oct 8, 2018

My 550 hashes are good now ~450 h/s on v8

But vega 56 + 64 are really low hashes... ~1000h/s

@Spudz76
Contributor

Spudz76 commented Oct 8, 2018

I will do a windows CUDA 10 build and try with my 970GTX and 411 driver on the test pool for a bit
Until now I only bothered with local benchmarks

Update built with both cuda patches (cudaFunction and volatileCUDA):
all invalid with comp_mode:false
all valid comp_mode:true

@caowenyu75

My 550 hashes are good now ~450 h/s on v8

But vega 56 + 64 are really low hashes... ~1000h/s

what's 550's config?

@MoneroCrusher

My 550 hashes are good now ~450 h/s on v8
But vega 56 + 64 are really low hashes... ~1000h/s

what's 550's config?

I use 2 threads, intensity 448, worksize 32, unroll 8 strided:mem/2:2
SChernykh claims 1 thread is better (for RX 560) but I have yet to find a one thread setting for RX 550 that is better than 2 threads.

@MoneroCrusher

XMR-Stak + latest 2.8.1 xmrig-proxy working flawlessly together on pool & CNv2. Great work guys.

WORKER NAME             | LAST IP         | COUNT | ACCEPTED | REJ |  10 MINUTES |    24 HOURS |
Test                     | 1.1.1.1    |     1 |       77 |   0 |   5.67 kH/s |   0.18 kH/s |

@caowenyu75

@MoneroCrusher thanks a lot. I will try it out when i get back home.

@psychocrypt
Collaborator Author

For documentation:

GTX1080 on Linux (not overclocked)

"gpu_threads_conf" :
[
  // gpu: GeForce GTX 1080 architecture: 61
  //      memory: 7994/8113 MiB
  //      smx: 20
  { "index" : 0,
    "threads" : 16, "blocks" : 160,
    "bfactor" : 0, "bsleep" :  0,
    "affine_to_cpu" : false, "sync_mode" : 3,
    "mem_mode" : 1,
  },
],

-> 536H/s

before:

"gpu_threads_conf" :
[
  // gpu: GeForce GTX 1080 architecture: 61
  //      memory: 7994/8113 MiB
  //      smx: 20
  { "index" : 0,
    "threads" : 4, "blocks" : 160,
    "bfactor" : 0, "bsleep" :  0,
    "affine_to_cpu" : false, "sync_mode" : 3,
    "mem_mode" : 1,
  },
],

-> 515 H/s

@psychocrypt
Collaborator Author

psychocrypt commented Oct 11, 2018 via email

@psychocrypt
Collaborator Author

Can someone please test #1898 on Windows (NVIDIA GPUs) with CUDA 8+ and/or CUDA 10 and give me feedback on whether everything is working?

@Spudz76
Contributor

Spudz76 commented Oct 11, 2018

Win7 / driver 411.70 / CUDA10 / 970GTX works well @ 370H/s

@Spudz76
Contributor

Spudz76 commented Oct 11, 2018

@kio3i0j9024vkoenio You've been clearing .openclcache between versions right?

@psychocrypt
Collaborator Author

psychocrypt commented Oct 11, 2018 via email

@psychocrypt
Collaborator Author

psychocrypt commented Oct 11, 2018 via email

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Oct 11, 2018

I finally got CUDA (1320 H/s) close to the OpenCL result (1373 H/s) that I was getting before on the 8x GTX 750s.

It was a lot of trial and error doing it.

This was the auto-generated config that produced only 816 H/s:

// gpu: GeForce GTX 750 architecture: 50
//      memory: 843/976 MiB 
//      smx: 4
{ "index" : 0-7,
  "threads" : 4, "blocks" : 32,
  "bfactor" : 2, "bsleep" :  0,
  "affine_to_cpu" : false, "sync_mode" : 3, 
  "comp_mode" : true,
},

Changing "threads" : 4, "blocks" : 32 to "threads" : 32, "blocks" : 8 brought the hashes to 1169 H/s:

// gpu: GeForce GTX 750 architecture: 50
//      memory: 843/976 MiB 
//      smx: 4
{ "index" : 0-7,
  "threads" : 32, "blocks" : 8,
  "bfactor" : 2, "bsleep" :  0,
  "affine_to_cpu" : false, "sync_mode" : 3, 
  "comp_mode" : true,
},

Finally changing "comp_mode" to false produced the best 1320 H/S:

// gpu: GeForce GTX 750 architecture: 50
//      memory: 843/976 MiB 
//      smx: 4
{ "index" : 0-7,
  "threads" : 32, "blocks" : 8,
  "bfactor" : 2, "bsleep" :  0,
  "affine_to_cpu" : false, "sync_mode" : 3, 
  "comp_mode" : false,
},
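
To see why the threads and blocks change matters, it helps to count how many hashes each config computes in parallel. The sketch below assumes that the CUDA backend runs threads * blocks hashes per launch on each GPU; that reading of the two settings is an assumption here and not something confirmed in this thread, and the names are made up for the example.

```python
# Assumption: hashes in flight per GPU = threads * blocks.
# Hashrates are the 8-GPU totals reported in the post above.
CONFIGS = [
    ("auto-generated", 4, 32, 816),
    ("tuned, comp_mode true", 32, 8, 1169),
    ("tuned, comp_mode false", 32, 8, 1320),
]


def hashes_in_flight(threads, blocks):
    """Number of hashes computed in parallel on one GPU (assumed model)."""
    return threads * blocks


for label, threads, blocks, total_hs in CONFIGS:
    per_card = total_hs / 8
    print(f"{label}: {hashes_in_flight(threads, blocks)} hashes in flight, "
          f"{per_card:.1f} H/s per card")
```

Under this model the tuned configs keep twice as many hashes in flight as the auto-generated one, which would fit the reported jump in hashrate.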

So the V7 vs V8 numbers are as follows:

System: HP DL580 G7 with 4x Xeon E7-8837 processors and 8x Nvidia GTX 750's OS: Ubuntu 16.04

Backend: CPU Only 4x Xeon E7-8837's

cryptonight_v7: 1624 H/s - 100%
cryptonight_v8: 1493 H/s - 91.9%

Backend: 8x Nvidia GTX 750 1GB Only

CUDA

cryptonight_v7 8x GTX 750 1GB: 1897 H/s - 100%
cryptonight_v8 8x GTX 750 1GB: 1320 H/s - 69.6%

Complete System: HP DL580 G7 with 4x Xeon E7-8837 processors and 8x Nvidia GTX 750's

cryptonight_v7 8x GTX 750 1GB: 3521 H/s - 100%
cryptonight_v8 8x GTX 750 1GB: 2812 H/s - 79.9%

@SChernykh
Contributor

@kio3i0j9024vkoenio It looks like, with the current code, you must have intensity * work_size * 2 < GPU memory in MB for OpenCL to work on NVIDIA. Performance is very bad when it works.

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Oct 11, 2018

I have found the problem using OpenCL on Nvidia. It turns out to be a brain fart on my part. During my testing both OpenCL and CUDA were running on the GPUs, and that caused the Error CL_MEM_OBJECT_ALLOCATION_FAILURE when calling clEnqueueNDRangeKernel for kernel 0.

This is the command line that I run now to only test OpenCL on Nvidia:

./xmr-stak --currency cryptonight_v8 --noCPU --openCLVendor NVIDIA --noNVIDIA --noAMDCache --benchmark 8 --benchwait  10 --benchwork 30

The --noNVIDIA is needed to not run CUDA

@psychocrypt - Maybe a check could be added to not allow CUDA and OpenCL to run on the same GPU at the same time.

And the reason that the older version xmr-stak 2.4.7 0fef2cf from Sept 24th worked was because I compiled that version without CUDA.

Thanks for all the help. I will be deleting my posts above that had incorrect information.

This is the V8 config I am using for OpenCL on Nvidia GTX 750's

// gpu: GeForce GTX 750 memory:848
// compute units: 4 
{ "index" : 0-7,
"intensity" : 352, "worksize" : 8,
"affine_to_cpu" : 7, "strided_index" : 0, "mem_chunk" : 0,
"unroll" : 2, "comp_mode" : false
},

Backend: 8x Nvidia GTX 750 1GB Only

CUDA

cryptonight_v7 8x GTX 750 1GB: 1897 H/s - 100%
cryptonight_v8 8x GTX 750 1GB: 1320 H/s - 69.6%

OpenCL

cryptonight_v8 8x GTX 750 1GB: 1371 H/s - 72.3%

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Oct 11, 2018

@kio3i0j9024vkoenio It looks like, with the current code, you must have intensity * work_size * 2 < GPU memory in MB for OpenCL to work on NVIDIA.

The issue I was having has been resolved here: #1851 (comment)

This is the V8 config I am using for OpenCL on Nvidia GTX 750's

// gpu: GeForce GTX 750 memory:848
// compute units: 4 
{ "index" : 0-7,
"intensity" : 352, "worksize" : 8,
"affine_to_cpu" : 7, "strided_index" : 0, "mem_chunk" : 0,
"unroll" : 2, "comp_mode" : false
},

Using the intensity * work_size * 2 < GPU memory in MB rule with this config gives 352 * 8 * 2 = 5632, which is far higher than memory:848, and the above config works just fine.
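
The check above can be reproduced in a few lines. The sketch also includes an alternative estimate based on the 2 MiB scratchpad that CryptoNight uses per hash; treating intensity as the number of scratchpads held in GPU memory is an assumption for this example, and both function names are hypothetical.

```python
# Compare the quoted rule of thumb with a scratchpad-based estimate.

def rule_of_thumb_ok(intensity, worksize, mem_mib):
    """The rule quoted above: intensity * work_size * 2 < GPU memory in MB."""
    return intensity * worksize * 2 < mem_mib


def scratchpads_fit(intensity, mem_mib, scratchpad_mib=2):
    """Assumed model: one 2 MiB CryptoNight scratchpad per unit of intensity."""
    return intensity * scratchpad_mib < mem_mib


# GTX 750 config from this post: intensity 352, worksize 8, memory:848.
print(rule_of_thumb_ok(352, 8, 848))   # the quoted rule rejects this config
print(scratchpads_fit(352, 848))       # 352 * 2 = 704 MiB fits in 848 MiB
```

The scratchpad estimate agrees with the observation that this config runs, which suggests the worksize factor does not belong in the memory bound.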

Performance is very bad when it works.

You are correct in that OpenCL is worse than CUDA.

Backend: 8x Nvidia GTX 750 1GB Only

CUDA

cryptonight_v7 8x GTX 750 1GB: 1897 H/s - 100%
cryptonight_v8 8x GTX 750 1GB: 1694 H/s - 89.3%

OpenCL

cryptonight_v8 8x GTX 750 1GB: 1371 H/s - 72.3%

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Oct 11, 2018

It looks like CUDA has additional optimizations in the current version as I can now use "threads" : 32, "blocks" : 12 whereas before I could not get that to produce good hash rates.

#1851 (comment)

So now V8 CUDA is producing 1694 H/S on the 8x Nvidia GTX 750's.

CUDA Nvidia.txt Config:

// gpu: GeForce GTX 750 architecture: 50
//      memory: 843/976 MiB 
//      smx: 4
// The display is on this GPU so threads is lowered by two and bsleep is set to 100
{ "index" : 0,
  "threads" : 30, "blocks" : 12,
  "bfactor" : 8, "bsleep" :  100,
  "affine_to_cpu" :  false, "sync_mode" : 3, 
  "comp_mode" : false,
},

// gpu: GeForce GTX 750 architecture: 50
//      memory: 843/976 MiB 
//      smx: 4
// bfactor needs to be 8 and bsleep needs to be 25 to prevent GPU errors
{ "index" : 1-7,
  "threads" : 32, "blocks" : 12,
  "bfactor" : 8, "bsleep" :  25,
  "affine_to_cpu" : false, "sync_mode" : 3, 
  "comp_mode" : false,
},

Backend: 8x Nvidia GTX 750 1GB Only

CUDA

cryptonight_v7 8x GTX 750 1GB: 1897 H/s - 100%
cryptonight_v8 8x GTX 750 1GB: 1694 H/s - 89.3%

Complete System: HP DL580 G7 with 4x Xeon E7-8837 processors and 8x Nvidia GTX 750's

cryptonight_v7 8x GTX 750 1GB: 3521 H/s - 100%
cryptonight_v8 8x GTX 750 1GB: 3216 H/s - 91.3%

@psychocrypt - Thanks for all the hard work. V8 reaching 91.3% of the V7 hash rate on pretty old hardware is very nice.

Xeon E7-8837: Introduction date | Apr 3, 2011, Microarchitecture | Westmere
http://www.cpu-world.com/CPUs/Xeon/Intel-Xeon%20E7-8837.html

It is amazing how hardware prices drop over time as the E7-8837 was $2280 at introduction and I have been getting four of them for around $40 or $10 each.

Nvidia GTX 750: Introduction date | Feb 18, 2014
https://www.geforce.com/hardware/desktop-gpus/geforce-gtx-750/specifications
https://www.anandtech.com/show/7764/the-nvidia-geforce-gtx-750-ti-and-gtx-750-review-maxwell

@SChernykh
Contributor

SChernykh commented Oct 12, 2018

@kio3i0j9024vkoenio Try threads=8,12,16 and blocks=32, should be a bit faster.

@onweer

onweer commented Oct 12, 2018

OS: Windows 10
Backend:

RX Vega56
core: 1407 , mem: 900
speed:

cryptonight_v7: 1800 H/s 100%
cryptonight_v8: 900 H/s only 50%

Miner config:

"gpu_threads_conf" : [
{ "index" : 0,
"intensity" : 1696, "worksize" : 8,
"affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
"unroll" : 8, "comp_mode" : true
},
],

/*
 * Platform index. This will be 0 unless you have different OpenCL platform - eg. AMD and Intel.
 */
"platform_index" : 1,

@SChernykh
Contributor

SChernykh commented Oct 12, 2018

@onweer try worksize=16 or 32, unroll=8 or 16 in different combinations. Also try two threads per GPU.
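The suggested combinations can be enumerated rather than typed by hand. This is a minimal sketch that prints one candidate amd.txt entry per worksize/unroll pair, reusing the other values from the config posted above. The helper name `candidate_configs` is hypothetical, and each candidate still has to be benchmarked separately.

```python
import json
from itertools import product


def candidate_configs(index, intensity, worksizes, unrolls):
    """Yield one GPU thread entry per (worksize, unroll) combination."""
    for worksize, unroll in product(worksizes, unrolls):
        yield {
            "index": index,
            "intensity": intensity,
            "worksize": worksize,
            "affine_to_cpu": False,
            "strided_index": 2,
            "mem_chunk": 2,
            "unroll": unroll,
            "comp_mode": True,
        }


# Values taken from the Vega 56 config above; only worksize and unroll vary.
configs = list(candidate_configs(0, 1696, worksizes=(16, 32), unrolls=(8, 16)))

for cfg in configs:
    print(json.dumps(cfg) + ",")
```

To try two threads per GPU, the same entry would be listed twice with the same index, as in the "2 x" entries posted later in this thread.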

@kio3i0j9024vkoenio

OS: Windows 10

Backend: CPU Only

  • CPU: Intel Xeon E3-1225 V3

  • cryptonight_v7: 270.3 H/s - 100%

  • cryptonight_v8: 260.3 H/s - 96.3%

Miner config: quad core with four 1x threads (one per core)

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Oct 13, 2018

Final results for my HP DL580 G7 with 4x E7-8837 Xeons and 8x Nvidia GTX 750's running on Ubuntu 16.04 and CUDA version 9.2.

V7 - xmr-stak 2.4.4 c0ab173
V8 - xmr-stak 2.5.0 9fe30b2

CUDA Nvidia.txt Config:

// gpu: GeForce GTX 750 architecture: 50
// memory: 843/976 MiB
// smx: 4
// The display is on this GPU so bsleep is set to 100
{ "index" : 0,
"threads" : 32, "blocks" : 12,
"bfactor" : 8, "bsleep" : 100,
"affine_to_cpu" : 7, "sync_mode" : 3,
"mem_mode" : 1,
},

// gpu: GeForce GTX 750 architecture: 50
// memory: 843/976 MiB
// smx: 4
// bfactor needs to be 8 and bsleep needs to be 25 to prevent GPU errors
{ "index" : 1-7,
"threads" : 32, "blocks" : 12,
"bfactor" : 8, "bsleep" : 25,
"affine_to_cpu" : 7, "sync_mode" : 3,
"mem_mode" : 1,
},

Backend: 8x Nvidia GTX 750 1GB Only CUDA

cryptonight_v7 8x GTX 750 1GB: 1896 H/s - 100%
cryptonight_v8 8x GTX 750 1GB: 1696 H/s - 89.5%

Backend: CPU Only 4x Xeon E7-8837's

cryptonight_v7: 1637 H/s - 100%
cryptonight_v8: 1535 H/s - 93.8%

Complete System: HP DL580 G7 with 4x Xeon E7-8837 processors and 8x Nvidia GTX 750's

cryptonight_v7 8x GTX 750 1GB: 3533 H/s - 100%
cryptonight_v8 8x GTX 750 1GB: 3231 H/s - 91.5%
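The percentages in this report can be reproduced from the raw hash rates. The sketch below also shows that the complete-system figures are the sum of the GPU-only and CPU-only results. The helper name `relative_speed` is hypothetical.

```python
def relative_speed(v8_hs: float, v7_hs: float) -> float:
    """Return the v8 hash rate as a percentage of v7, to one decimal place."""
    return round(100.0 * v8_hs / v7_hs, 1)


# Hash rates (H/s) from the report above.
gpu_v7, gpu_v8 = 1896, 1696
cpu_v7, cpu_v8 = 1637, 1535

print(relative_speed(gpu_v8, gpu_v7))                    # 89.5
print(relative_speed(cpu_v8, cpu_v7))                    # 93.8
print(gpu_v7 + cpu_v7, gpu_v8 + cpu_v8)                  # 3533 3231
print(relative_speed(gpu_v8 + cpu_v8, gpu_v7 + cpu_v7))  # 91.5
```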

@kio3i0j9024vkoenio

kio3i0j9024vkoenio commented Oct 13, 2018

@kio3i0j9024vkoenio Try threads=8,12,16 and blocks=32, should be a bit faster.

Backend: 8x Nvidia GTX 750 1GB Only CUDA

T32,B12: gets 1713.2 H/s without any issues

T8,B32: gets 1494.4 H/s and produces this error:
[CUDA] Error gpu 5: </home/miner/xmr-stak/xmrstak/backend/nvidia/nvcc_code/cuda_core.cu>:791
suggestion: Try to increase the value of the attribute 'bfactor' or reduce 'threads' in the NVIDIA config file.
terminate called after throwing an instance of 'std::runtime_error'

T12,B32: gets 1749.5 H/s and produces this error:
Error gpu 4: </home/miner/xmr-stak/xmrstak/backend/nvidia/nvcc_code/cuda_core.cu>:791
suggestion: Try to increase the value of the attribute 'bfactor' or reduce 'threads' in the NVIDIA config file.
terminate called after throwing an instance of 'std::runtime_error'

T16,B32: doesn't even run:
[CUDA] Error gpu 0: </home/miner/xmr-stak/xmrstak/backend/nvidia/nvcc_code/cuda_extra.cu>:330
suggestion: Try to reduce the value of the attribute 'threads' in the NVIDIA config file.
terminate called after throwing an instance of 'std::runtime_error'
what(): [CUDA] Error: out of memory

T12,B32: For an additional 2% gain it doesn't seem worthwhile to try to tweak it to not produce errors.
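The out-of-memory failure and the size of the gain can be sanity-checked with a little arithmetic. The sketch assumes, and this is not stated in the thread, that threads * blocks hashes run in parallel and that each needs a 2 MiB scratchpad. It does not explain the cuda_core.cu:791 errors, only the memory side. The helper name is hypothetical.

```python
# Sketch only. Assumption (not from the thread): threads * blocks hashes run
# concurrently and each one needs a 2 MiB scratchpad.
SCRATCHPAD_MIB = 2
FREE_MEMORY_MIB = 843  # from "memory: 843/976 MiB" in the config comments


def scratchpad_mib(threads: int, blocks: int) -> int:
    """Estimated scratchpad memory for one GPU, in MiB."""
    return threads * blocks * SCRATCHPAD_MIB


for threads, blocks in [(32, 12), (8, 32), (12, 32), (16, 32)]:
    need = scratchpad_mib(threads, blocks)
    print(f"T{threads},B{blocks}: {need} MiB, fits={need < FREE_MEMORY_MIB}")

# Relative gain of T12,B32 (1749.5 H/s) over T32,B12 (1713.2 H/s).
gain_percent = round(100 * (1749.5 / 1713.2 - 1), 1)
print(gain_percent)  # 2.1
```

Under that assumption T16,B32 would need 1024 MiB, more than the 843 MiB available, which matches the "out of memory" error. T32,B12 and T12,B32 both come to 768 MiB.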

@psychocrypt
Collaborator Author

If anyone finds that the hash rate for cryptonight_v7 is too low with 2.5.0, please have a look at #1930

@trasherdk

First test of CN/2 vs CN/1

CPU....: Intel(R) Core(TM) i3-7100 CPU @ 3.90GHz
RAM....: 8 GB RAM DDR4.
OS.....: Windows 10 Pro x64 (Version 10.0.16299.611)
Driver.: Radeon Software Version Adrenalin 18.5.1

Version: xmr-stak 2.4.7 c5f0505
Custom amd.txt config.
HashVault - CN/1

HASHRATE REPORT - CPU
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |   29.5 |   30.1 |   29.1 |  1 |   31.0 |   31.7 |   30.9 |
Totals (CPU):    60.5   61.8   60.0 H/s
-----------------------------------------------------------------
HASHRATE REPORT - AMD
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |  234.9 |  235.4 |  235.3 |  1 |  236.1 |  235.2 |  235.3 | MSI RX 550 2GB
|  2 |  238.3 |  237.3 |  237.0 |  3 |  234.5 |  236.9 |  237.0 | MSI RX 550 2GB
|  4 |  231.3 |  231.0 |  231.0 |  5 |  230.9 |  231.1 |  231.1 | GIGABYTE RX 560 4GB
|  6 |  236.8 |  237.2 |  237.3 |  7 |  238.1 |  237.9 |  237.3 | MSI RX 550 2GB
|  8 |  233.9 |  234.3 |  234.0 |  9 |  233.8 |  233.7 |  234.0 | MSI RX 550 2GB
Totals (AMD):  2348.5 2350.0 2349.3 H/s
-----------------------------------------------------------------
Totals (ALL):   2409.0 2411.8 2409.3 H/s
Highest:  2417.2 H/s
-----------------------------------------------------------------

Version: xmr-stak 2.5.0 9012512
Default autodetected amd.txt settings.
HashVault - CN/1

HASHRATE REPORT - CPU
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |   28.4 |   29.0 |   28.8 |  1 |   29.9 |   30.3 |   30.1 |
Totals (CPU):    58.3   59.2   58.9 H/s
-----------------------------------------------------------------
HASHRATE REPORT - AMD
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |  377.1 |  376.6 |  376.0 |  1 |  380.6 |  380.6 |  380.6 |
|  2 |  426.7 |  426.6 |  423.9 |  3 |  379.6 |  377.9 |  380.6 |
|  4 |  377.5 |  378.8 |  378.8 |
Totals (AMD):  1941.5 1940.5 1939.9 H/s
-----------------------------------------------------------------
Totals (ALL):   1999.8 1999.7 1998.9 H/s
Highest:  2011.0 H/s
-----------------------------------------------------------------

Version: xmr-stak 2.5.0 9012512
Custom amd.txt config. Same as ver. 2.4.7 + unroll: 8
HashVault - CN/1

HASHRATE REPORT - CPU
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |   29.7 |   29.0 |   30.5 |  1 |   31.3 |   30.4 |   31.7 |
Totals (CPU):    61.0   59.3   62.1 H/s
-----------------------------------------------------------------
HASHRATE REPORT - AMD
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |  234.5 |  235.4 |  235.2 |  1 |  236.8 |  234.9 |  235.2 | MSI RX 550 2GB
|  2 |  236.9 |  237.2 |  237.2 |  3 |  237.6 |  237.2 |  237.2 | MSI RX 550 2GB
|  4 |  230.9 |  231.1 |  231.1 |  5 |  231.3 |  231.0 |  231.0 | GIGABYTE RX 560 4GB
|  6 |  235.6 |  237.2 |  237.4 |  7 |  238.8 |  237.5 |  237.5 | MSI RX 550 2GB
|  8 |  231.4 |  233.8 |  233.9 |  9 |  235.3 |  233.6 |  233.9 | MSI RX 550 2GB
Totals (AMD):  2349.0 2348.9 2349.5 H/s
-----------------------------------------------------------------
Totals (ALL):   2410.0 2408.3 2411.7 H/s
Highest:  2417.4 H/s
-----------------------------------------------------------------

Version: xmr-stak 2.5.0 9012512
Custom amd.txt config. Same as ver. 2.4.7 + unroll: 8
KILL ALL ASICS - CN/2-2

HASHRATE REPORT - CPU
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |   29.1 |   26.2 |   28.3 |  1 |   30.5 |   27.9 |   29.7 |
Totals (CPU):    59.6   54.1   58.0 H/s
-----------------------------------------------------------------
HASHRATE REPORT - AMD
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |  198.3 |  190.6 |  189.9 |  1 |  192.4 |  191.7 |  189.9 | MSI RX 550 2GB
|  2 |  198.3 |  193.3 |  192.9 |  3 |  196.2 |  193.4 |  192.9 | MSI RX 550 2GB
|  4 |  212.3 |  211.4 |  214.0 |  5 |  217.5 |  210.5 |  214.0 | GIGABYTE RX 560 4GB
|  6 |  189.6 |  188.2 |  187.6 |  7 |  190.0 |  187.8 |  187.6 | MSI RX 550 2GB
|  8 |  202.2 |  194.9 |  193.3 |  9 |  195.2 |  195.2 |  193.3 | MSI RX 550 2GB
Totals (AMD):  1992.1 1957.1 1955.3 H/s
-----------------------------------------------------------------
Totals (ALL):   2051.7 2011.2 2013.3 H/s
Highest:  2061.2 H/s
-----------------------------------------------------------------
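For comparing runs, the per-thread rows of a report can be parsed and summed instead of read by eye. This is a minimal sketch over the CN/2 AMD rows above (device names omitted). The regular expression and helper name are illustrative and not part of xmr-stak.

```python
import re

# AMD rows of the CN/2 report above, without the device-name column.
CN2_AMD_REPORT = """\
|  0 |  198.3 |  190.6 |  189.9 |  1 |  192.4 |  191.7 |  189.9 |
|  2 |  198.3 |  193.3 |  192.9 |  3 |  196.2 |  193.4 |  192.9 |
|  4 |  212.3 |  211.4 |  214.0 |  5 |  217.5 |  210.5 |  214.0 |
|  6 |  189.6 |  188.2 |  187.6 |  7 |  190.0 |  187.8 |  187.6 |
|  8 |  202.2 |  194.9 |  193.3 |  9 |  195.2 |  195.2 |  193.3 |
"""

# One entry is "| ID | 10s | 60s | 15m"; each row holds two entries.
ENTRY = re.compile(
    r"\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|\s*([\d.]+)\s*\|\s*([\d.]+)\s*"
)


def parse_threads(report: str) -> dict:
    """Map thread ID to its (10s, 60s, 15m) hash rates."""
    threads = {}
    for line in report.splitlines():
        for tid, h10, h60, h15m in ENTRY.findall(line):
            threads[int(tid)] = (float(h10), float(h60), float(h15m))
    return threads


threads = parse_threads(CN2_AMD_REPORT)
total_15m = sum(rates[2] for rates in threads.values())
print(len(threads), round(total_15m, 1))

# CN/2 15m AMD total relative to the CN/1 15m AMD total (2349.5 H/s)
# from the run with the same custom config.
print(round(100 * 1955.3 / 2349.5, 1))  # 83.2
```

The summed 15m column lands within rounding of the reported "Totals (AMD)" value, and CN/2 comes out at roughly 83% of CN/1 on these cards.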

amd.txt

RX 550 - 2 x { "index" : 0, "intensity" : 432, "worksize" : 16, "affine_to_cpu" : false, "strided_index" : 1, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true },
RX 560 - 2 x { "index" : 2, "intensity" : 424, "worksize" : 8, "affine_to_cpu" : false, "strided_index" : 1, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true },

Just before switching back to CN/1, I copied the report on CN/2, and noticed that "Highest" is close to what I'd expect on CN/1

HASHRATE REPORT - CPU
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |   18.0 |   24.3 |   27.6 |  1 |   20.8 |   26.1 |   28.9 |
Totals (CPU):    38.8   50.4   56.5 H/s
-----------------------------------------------------------------
HASHRATE REPORT - AMD
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |  179.3 |  190.6 |  191.6 |  1 |  192.4 |  190.0 |  191.7 |
|  2 |  192.4 |  192.7 |  194.4 |  3 |  190.0 |  192.6 |  194.4 |
|  4 |  214.9 |  215.3 |  212.3 |  5 |  222.8 |  216.4 |  212.2 |
|  6 |  193.1 |  189.7 |  189.8 |  7 |  185.7 |  188.8 |  189.9 |
|  8 |  199.0 |  195.3 |  194.8 |  9 |  200.5 |  195.0 |  194.8 |
Totals (AMD):  1970.0 1966.5 1966.0 H/s
-----------------------------------------------------------------
Totals (ALL):   2008.8 2016.9 2022.5 H/s
Highest:  2436.0 H/s
-----------------------------------------------------------------

@vasilevskykv

Could you please describe how to run xmr-stak-cpu from Visual Studio 2017?

@vasilevskykv

Also, why not make it so that all parameters are set in config.txt?
It is difficult to compile and run it from the command line.
