How to do fast and battery efficient DMA-driven SPI transfers? #992

@juj

I am trying to do fast continuous SPI transfers out from a Pi (working on a Model 3 B and a Zero W) and while the transfers are running, I'd like to make the main CPU idle until the transfers are finished.

The BCM core idles at 250 MHz and boosts to 400 MHz when the CPU is under load. The CPU frequency appears to be tied to this same turbo: it idles at 600 MHz and boosts to 1200 MHz on the Pi 3, and to 1000 MHz on the Zero.

Originally I was doing SPI Polled Mode transfers, busy spinning the CPU in a loop that pushes bytes out to the FIFO and reads them back as they become available. This gave me a nice 400/6 ≈ 66.7 Mbit/s transfer rate with CDIV=6, but the busy spinning kills the battery and occupies one hardware thread, so it was not feasible on the Zero.

After migrating from Polled Mode to DMA, I get the same 400/6 ≈ 66.7 Mbit/s transfer rate as long as I busy spin the CPU while waiting for the DMA transfer to complete. However, when I switch from busy spinning to actually sleeping the CPU until the DMA completes, the BCM core frequency drops to 250 MHz and my transfer rate drops to 250/6 ≈ 41.7 Mbit/s, a dramatic 37.5% reduction in SPI throughput.
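The arithmetic above can be written out as a minimal sketch. It assumes the SPI bit rate is simply the core clock divided by CDIV, as the numbers in this post suggest; the function names are hypothetical and are not part of any Pi library.

```c
#include <stdio.h>

/* SPI bit rate in Mbit/s, assuming bit rate = core clock / CDIV. */
double spi_mbits_per_sec(double core_mhz, unsigned cdiv)
{
    return core_mhz / (double)cdiv;
}

/* Fraction of throughput lost when the core clock falls from turbo to idle. */
double throughput_loss(double turbo_mhz, double idle_mhz)
{
    return 1.0 - idle_mhz / turbo_mhz;
}

/* Prints the two rates discussed above and the relative drop between them. */
void print_spi_rates(void)
{
    printf("turbo: %.2f Mbit/s\n", spi_mbits_per_sec(400.0, 6));
    printf("idle:  %.2f Mbit/s\n", spi_mbits_per_sec(250.0, 6));
    printf("loss:  %.1f%%\n", 100.0 * throughput_loss(400.0, 250.0));
}
```

Since CDIV cancels out, the relative loss depends only on the ratio of the two core clocks (250/400), so a different divider would not change the 37.5% figure.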

It seems that heavy SPI activity by itself, in the absence of CPU activity, does not cause the BCM core to turbo up and increase the SPI transfer speed. The turbo appears to be controlled only by activity on the main CPU core.

Ideally, I would like the BCM core to automatically turbo up whenever there is heavy SPI activity in the FIFO (or perhaps whenever DMA writes to the SPI TX or RX PER_MAPs are ongoing), while keeping the main CPU core frequency at idle, so that the system bumps itself up from 600 MHz/250 MHz to 600 MHz/400 MHz while SPI transfers are ongoing.

If such a "600 MHz/400 MHz" turbo mode is not technically possible and the main CPU and BCM core clocks must turbo together, I'd then like the system to automatically detect SPI activity and turbo up to 1200 MHz/400 MHz, while user code can still call usleep() or do a futex/mutex wait for a signal/interrupt to occur.

Is either of the above technically feasible?

As a third fallback option, my application could control turbo manually via some kind of hinting, if such a method is feasible. My DMA transfers can be anything from a few bytes up to 480 * 320 * 2 bytes in size at a time, and before I start a DMA transfer, I could add a hint to tell the system e.g. "please keep BCM core turbo up for the next 0.7/1.3/2.5 msecs". This kind of hinting would allow the BCM core to get a breather as soon as the application does not need to do any SPI transfers, dropping back to idle to save power.
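To illustrate how such a hint duration could be derived from the transfer size, here is a rough sketch. It assumes an idealized bus with exactly 8 SPI clocks per byte, no inter-byte gaps and no DMA setup cost, so real transfers would take somewhat longer. The function name `spi_transfer_usecs` is hypothetical, and no actual hinting interface is implied.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Estimated wire time in microseconds for `bytes` bytes of SPI data,
 * rounded up. Assumes bit rate = core_hz / cdiv and 8 clocks per byte
 * (idealized: ignores inter-byte gaps and DMA setup overhead).
 */
uint64_t spi_transfer_usecs(size_t bytes, uint32_t core_hz, uint32_t cdiv)
{
    if (core_hz == 0)
        return 0; /* avoid division by zero on a bogus clock value */

    uint64_t bits = (uint64_t)bytes * 8u;
    uint64_t numerator = bits * cdiv * 1000000ull;

    /* Ceiling division so that the hint never undershoots the transfer. */
    return (numerator + core_hz - 1) / core_hz;
}
```

Under these assumptions, a full 480 * 320 * 2 byte transfer with CDIV=6 comes out at 36864 usecs with the core at 400 MHz and 58983 usecs at 250 MHz, so `spi_transfer_usecs(480 * 320 * 2, 400000000u, 6)` would be the value to pass along with a hint for the largest transfer.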

My application implements a power- and performance-efficient display driver for SPI-connected displays, the fbcp-ili9341 project.

There is also a demo video of Quake running at 60fps.

The transfer footprint of my application ranges from long periods of heavy activity, through short bursts of heavy activity, to long periods of no activity, depending on how much pixel animation the on-screen content has. Ideally I'd be able to turbo up the BCM core quickly when SPI transfers are performed, and drop it back to idle when there are none.

As a workaround to avoid busy spinning on the main CPU just to make the SPI transfers keep up, I have added force_turbo=1 to /boot/config.txt. That way I can keep the main CPU asleep but still have the core behind the SPI bus running at 400 MHz, which lets the CPU schedule other processes on the Pi Zero W and keeps things running smoothly. However, this is not a feasible solution: as I understand it, booting with force_turbo=1 irrevocably sets a "warranty void" bit on the device, and it is likely excessive to have the main CPU core run at 1200 MHz (1000 MHz on the Zero W) when it sleeps idle for most of those cycles.
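For checking which core clock is actually in effect while the CPU sleeps, a small helper along these lines could be used. It is only a sketch: it shells out to `vcgencmd measure_clock core` and assumes the reply has the form `frequency(1)=250000000`, reading just the digits after the `=` sign. The function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L /* for popen/pclose */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a reply such as "frequency(1)=250000000" and return the value in Hz.
 * Returns -1 if the line has no '=' or no digits after it.
 */
long long parse_measure_clock(const char *line)
{
    const char *eq = strchr(line, '=');
    if (eq == NULL)
        return -1;

    char *end = NULL;
    long long hz = strtoll(eq + 1, &end, 10);
    if (end == eq + 1 || hz < 0)
        return -1;
    return hz;
}

/* Run `vcgencmd measure_clock core` and return the core clock in Hz, or -1. */
long long read_core_clock_hz(void)
{
    FILE *pipe = popen("vcgencmd measure_clock core", "r");
    if (pipe == NULL)
        return -1;

    char line[128];
    long long hz = -1;
    if (fgets(line, sizeof line, pipe) != NULL)
        hz = parse_measure_clock(line);

    pclose(pipe);
    return hz;
}
```

Calling `read_core_clock_hz()` from a separate monitoring thread while a DMA transfer is in flight should show whether the core sits at roughly 250 MHz or 400 MHz. Keep in mind that polling it does put some load on the CPU, which may itself influence the turbo state.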

Any thoughts on what would be the best way to proceed? Thanks in advance!
