Benchmarking Pin Performance #375

Open
coofercat opened this issue Oct 25, 2019 · 2 comments
Labels
notice Issues that are solved/do not require input, but preserved and marked of interest to users.


coofercat commented Oct 25, 2019

I've been playing about with NeoPixels, and found that SPI performance on my Pi Zero falls well short of the "theoretical" figure (about 30-40 µs per LED). So I did some benchmarking. The code I used is below, but here are the results:
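
For reference, the lower bound implied by the signal rate can be worked out directly. This is a minimal sketch, assuming 24 data bits per RGB LED (32 for RGBW parts) at the 800 kHz rate used in the script further down. The function name is illustrative, and the per-frame reset gap is ignored.

```python
LED_FREQ_HZ = 800_000  # same signal rate as the benchmark script


def theoretical_us_per_led(bits_per_led, freq_hz=LED_FREQ_HZ):
    """Time to clock out one LED's data, in microseconds (ignores the reset gap)."""
    return bits_per_led * 1_000_000 / freq_hz


print(theoretical_us_per_led(24))  # RGB  -> 30.0
print(theoretical_us_per_led(32))  # RGBW -> 40.0
```

That gives 30 µs for an RGB LED and 40 µs for an RGBW one, which matches the range quoted above.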

Pin 12 (PWM0) over 100 LEDs: 2330 frames in 10.01 seconds = 232.73 fps (42.97us/LED)
Pin 13 (PWM1) over 100 LEDs: 2330 frames in 10.01 seconds = 232.75 fps (42.97us/LED)
Pin 21 (PCM) over 100 LEDs: 2460 frames in 10.04 seconds = 245.11 fps (40.80us/LED)
Pin 10 (SPI) over 100 LEDs: 1290 frames in 10.03 seconds = 128.59 fps (77.77us/LED)

Pin 12 (PWM0) over 500 LEDs: 500 frames in 10.15 seconds = 49.27 fps (40.60us/LED)
Pin 13 (PWM1) over 500 LEDs: 500 frames in 10.15 seconds = 49.27 fps (40.59us/LED)
Pin 21 (PCM) over 500 LEDs: 500 frames in 10.04 seconds = 49.80 fps (40.16us/LED)
Pin 10 (SPI) over 500 LEDs: 280 frames in 10.11 seconds = 27.69 fps (72.22us/LED)

Pin 12 (PWM0) over 1000 LEDs: 250 frames in 10.07 seconds = 24.82 fps (40.30us/LED)
Pin 13 (PWM1) over 1000 LEDs: 250 frames in 10.07 seconds = 24.82 fps (40.30us/LED)
Pin 21 (PCM) over 1000 LEDs: 250 frames in 10.02 seconds = 24.95 fps (40.08us/LED)
Pin 10 (SPI) over 1000 LEDs: 140 frames in 10.01 seconds = 13.98 fps (71.52us/LED)

Pin 12 (PWM0) over 2000 LEDs: 130 frames in 10.44 seconds = 12.45 fps (40.15us/LED)
Pin 13 (PWM1) over 2000 LEDs: 130 frames in 10.44 seconds = 12.45 fps (40.15us/LED)
Pin 21 (PCM) over 2000 LEDs: 130 frames in 10.41 seconds = 12.49 fps (40.04us/LED)
Pin 10 (SPI) over 2000 LEDs: 80 frames in 11.38 seconds = 7.03 fps (71.15us/LED)

The good news here is that the length of the strip barely affects the time per LED: beyond 100 LEDs it settles at about 40 µs for PWM and PCM, and about 71-72 µs for SPI. It also turns out PCM is the fastest way to update LEDs, though only slightly ahead of PWM. I'd like to try this on a Pi 2, Pi 3 etc. too, just to see if it makes any difference. It might also be interesting to try this in a different language to see if the language bindings have any effect on performance.
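
One way to check this is to fit a straight line to frame time against LED count, which splits each result into a fixed per-frame cost and a per-LED cost. The sketch below uses the Pi Zero us/LED figures posted above (PWM1 is left out because it matches PWM0). The helper names are made up for illustration, and the inputs are rounded to two decimals, so treat the output as approximate.

```python
# us/LED from the Pi Zero results above, keyed by LED count.
RESULTS = {
    "PWM0": {100: 42.97, 500: 40.60, 1000: 40.30, 2000: 40.15},
    "PCM": {100: 40.80, 500: 40.16, 1000: 40.08, 2000: 40.04},
    "SPI": {100: 77.77, 500: 72.22, 1000: 71.52, 2000: 71.15},
}


def fit_line(points):
    """Ordinary least-squares fit; returns (intercept, slope) for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_pin(name):
    """Fit frame time in us (us/LED * LED count) against LED count for one pin type."""
    points = [(leds, us * leds) for leds, us in RESULTS[name].items()]
    return fit_line(points)


for name in RESULTS:
    overhead_us, us_per_led = fit_pin(name)
    print("%s: ~%.0fus fixed per frame + %.2fus/LED" % (name, overhead_us, us_per_led))
```

On these numbers the per-LED cost comes out at roughly 40 µs for PWM and PCM and about 71 µs for SPI, with a fixed per-frame cost under a millisecond in every case. That fixed cost is why the 100-LED runs show a slightly higher us/LED figure than the longer strips.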

Code I used (run it with sudo):

#!/usr/bin/env python3
import time
from neopixel import Adafruit_NeoPixel

LED_FREQ_HZ    = 800000  # LED signal frequency in hertz (usually 800 kHz)
LED_DMA        = 10      # DMA channel to use for generating signal (try 10)
LED_BRIGHTNESS = 255     # Set to 0 for darkest and 255 for brightest
LED_INVERT     = False   # True to invert the signal (when using NPN transistor level shift)

test_time = 10
led_lengths = [100, 500, 1000, 2000]
pins_to_test = [
    {'pin': 12, 'type': 'PWM0', 'channel': 0},
    {'pin': 13, 'type': 'PWM1', 'channel': 1},
    {'pin': 21, 'type': 'PCM', 'channel': 0},
    {'pin': 10, 'type': 'SPI', 'channel': 0},
]

for led_length in led_lengths:
    for pin in pins_to_test:
        # Create NeoPixel object with appropriate configuration.
        strip = Adafruit_NeoPixel(led_length, pin['pin'], LED_FREQ_HZ, LED_DMA, LED_INVERT, LED_BRIGHTNESS, pin['channel'])
        # Initialize the library (must be called once before other functions).
        strip.begin()

        for i in range(0, strip.numPixels()):
            strip.setPixelColor(i, 0x00ffffff)
        strip.show()

        frames = 0

        start = time.time()
        end = start + test_time

        while time.time() < end:
            # Count in batches of 10 so the clock is checked once per batch, not per frame.
            for i in range(10):
                strip.show()
                frames += 1

        diff = time.time() - start
        time_per_led = (diff * 1000000) / (led_length * frames)
        print("Pin %s (%s) over %s LEDs: %s frames in %.2f seconds = %.2f fps (%.2fus/LED)" % (pin['pin'], pin['type'], led_length, frames, diff, frames/diff, time_per_led))

    print("")
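
For anyone wanting to reuse the timing loop without LED hardware attached, here is a sketch that separates it from the strip. This is not part of the script above: `benchmark_show` and `fake_show` are hypothetical names, and on a real setup `show` would presumably be `strip.show`. It uses `time.perf_counter()` rather than `time.time()` so that a system clock adjustment during a run cannot skew the result.

```python
import time


def benchmark_show(show, led_count, test_time=10.0, batch=10):
    """Call show() repeatedly for about test_time seconds and report throughput."""
    frames = 0
    start = time.perf_counter()  # monotonic clock, unaffected by wall-clock changes
    deadline = start + test_time
    while time.perf_counter() < deadline:
        # Run a batch of frames between clock checks, as the original loop does.
        for _ in range(batch):
            show()
        frames += batch
    elapsed = time.perf_counter() - start
    return {
        "frames": frames,
        "seconds": elapsed,
        "fps": frames / elapsed,
        "us_per_led": elapsed * 1000000 / (led_count * frames),
    }


def fake_show():
    """Stand-in for strip.show(): sleeps for about 1 ms instead of driving LEDs."""
    time.sleep(0.001)


result = benchmark_show(fake_show, led_count=100, test_time=0.2)
print("%(frames)d frames in %(seconds).2f seconds = %(fps).2f fps (%(us_per_led).2fus/LED)" % result)
```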

Gadgetoid added the notice label Nov 6, 2019

Gadgetoid (Collaborator) commented

Interesting stuff! I'm not at all surprised by the speed of SPI vs PCM/PWM, but I am intrigued by the difference between PCM and PWM. As far as I know, the PCM hardware is very similar to the PWM hardware, and is also driven by feeding a buffer of specially crafted bytes from memory into the peripheral via DMA. I wonder if the time difference, which doesn't seem to scale with the number of LEDs, is entirely down to the setup time of the peripheral/transfer.
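
The "doesn't scale" observation can be checked against the Pi Zero figures from the first post by converting us/LED back into whole-frame times and subtracting. A small sketch, with illustrative names, using only the posted (rounded) numbers:

```python
LED_COUNTS = [100, 500, 1000, 2000]
PWM0_US_PER_LED = [42.97, 40.60, 40.30, 40.15]  # Pi Zero, pin 12
PCM_US_PER_LED = [40.80, 40.16, 40.08, 40.04]   # Pi Zero, pin 21


def per_frame_gap_us(leds, pwm_us_per_led, pcm_us_per_led):
    """How much longer one PWM frame takes than one PCM frame, in microseconds."""
    return leds * (pwm_us_per_led - pcm_us_per_led)


gaps = [
    per_frame_gap_us(n, pwm, pcm)
    for n, pwm, pcm in zip(LED_COUNTS, PWM0_US_PER_LED, PCM_US_PER_LED)
]
for n, gap in zip(LED_COUNTS, gaps):
    print("%4d LEDs: PWM0 frame takes ~%.0fus longer than PCM" % (n, gap))
```

The gap works out to about 217-220 µs per frame at every strip length, which is consistent with a fixed per-frame cost rather than a per-LED one. The numbers alone do not say where that cost comes from.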

coofercat (Author) commented

I got some time on a Raspberry Pi 4. I couldn't do the necessary setup to fiddle with SPI, but the PWM and PCM results are slightly better than on a Pi Zero:

Pin 12 (PWM0) over 100 LEDs: 2440 frames in 10.03 seconds = 243.39 fps (41.09us/LED)
Pin 13 (PWM1) over 100 LEDs: 2440 frames in 10.03 seconds = 243.32 fps (41.10us/LED)
Pin 21 (PCM) over 100 LEDs: 2510 frames in 10.01 seconds = 250.69 fps (39.89us/LED)

Pin 12 (PWM0) over 500 LEDs: 510 frames in 10.09 seconds = 50.54 fps (39.57us/LED)
Pin 13 (PWM1) over 500 LEDs: 510 frames in 10.09 seconds = 50.56 fps (39.56us/LED)
Pin 21 (PCM) over 500 LEDs: 510 frames in 10.01 seconds = 50.94 fps (39.26us/LED)

Pin 12 (PWM0) over 1000 LEDs: 260 frames in 10.23 seconds = 25.42 fps (39.34us/LED)
Pin 13 (PWM1) over 1000 LEDs: 260 frames in 10.23 seconds = 25.42 fps (39.33us/LED)
Pin 21 (PCM) over 1000 LEDs: 260 frames in 10.19 seconds = 25.52 fps (39.19us/LED)

Pin 12 (PWM0) over 2000 LEDs: 130 frames in 10.20 seconds = 12.75 fps (39.22us/LED)
Pin 13 (PWM1) over 2000 LEDs: 130 frames in 10.20 seconds = 12.75 fps (39.22us/LED)
Pin 21 (PCM) over 2000 LEDs: 130 frames in 10.18 seconds = 12.77 fps (39.15us/LED)

I imagine the improvement is because the Pi 4 has considerably better performance than a Zero. Either way, for those of us who care about such things, you can squeeze a couple of extra 'fps' out of a Pi 4.
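
To put a number on "a couple of extra fps", the PCM frame rates from the two sets of results can be compared directly. A short sketch using only the posted figures (the names are illustrative):

```python
LED_COUNTS = [100, 500, 1000, 2000]
PI_ZERO_PCM_FPS = [245.11, 49.80, 24.95, 12.49]
PI_4_PCM_FPS = [250.69, 50.94, 25.52, 12.77]


def percent_gain(old_fps, new_fps):
    """Relative frame-rate improvement of new_fps over old_fps, in percent."""
    return (new_fps - old_fps) / old_fps * 100.0


gains = [percent_gain(old, new) for old, new in zip(PI_ZERO_PCM_FPS, PI_4_PCM_FPS)]
for n, old, new, gain in zip(LED_COUNTS, PI_ZERO_PCM_FPS, PI_4_PCM_FPS, gains):
    print("%4d LEDs: %.2f -> %.2f fps (+%.1f%%)" % (n, old, new, gain))
```

For PCM the gain is a little over 2% at every strip length.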
