What are the benefits of DMA on STM32?

Posted 2023-05-12

Take UART transmission as an example: it works just as well without DMA! Please advise.

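Of course you can transmit either way. Without DMA, the MCU has to take part in real time: it sends the data one byte at a time and monitors the transfer. With DMA, once you have set the start address, the data length and the other parameters, a dedicated DMA module sends the data on its own and the MCU is not involved during the transfer. When the transfer is finished, an interrupt tells the MCU. So DMA saves MCU resources and lets the MCU do more work in the same amount of time.

Reference answer A: It does not occupy the CPU, and it increases data throughput.

Reference answer B: Make the ADC sampling time longer (use a larger ADC_SampleTime). If that still does not help, the ADC supply on VDDA and VSSA is probably not stable enough. Measure whether the voltage across VDDA and VSSA changes while sampling; if it changes a lot, you need to redesign that supply.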
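To put a rough number on the UART case from the first answer, the sketch below estimates how many CPU cycles pass while bytes are on the wire. The clock and baud values in the usage note (72 MHz, 115200 baud, 10 bits per frame for 8N1) are assumptions for illustration, and the function names are hypothetical. The result is an upper bound for a busy-wait driver; an interrupt-per-byte driver costs less than this but still more than DMA.

```cpp
#include <cassert>
#include <cstdint>

// CPU cycles that elapse while one UART frame is on the wire.
// A busy-wait driver spends these cycles polling the "transmit register
// empty" flag; with DMA the core can run other code during that time.
constexpr std::uint64_t cyclesPerUartByte(std::uint64_t cpuHz,
                                          std::uint64_t baud,
                                          std::uint64_t bitsPerFrame)
{
    return cpuHz * bitsPerFrame / baud;
}

// Upper bound on the cycles a busy-wait loop would burn for `length` bytes.
std::uint64_t cyclesForTransfer(std::uint64_t cpuHz, std::uint64_t baud,
                                std::uint64_t bitsPerFrame, std::uint64_t length)
{
    assert(baud > 0);
    return cyclesPerUartByte(cpuHz, baud, bitsPerFrame) * length;
}
```

With the assumed figures, `cyclesPerUartByte(72000000, 115200, 10)` evaluates to 6250 cycles per byte, so a 100-byte message spans 625000 cycles that the core could spend elsewhere.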

How will circular DMA periph to memory behave at the end of the transfer in STM32?

【Title】: How will circular DMA periph to memory behave at the end of the transfer in STM32? 【Posted】: 2020-05-24 16:36:19 【Question】:

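I would like to ask how DMA SPI rx on an STM32 will behave in the following situation. I have a (for example) 96-byte array called A, which stores the data received over SPI. I turn on circular SPI DMA that operates on each byte and is configured for 96 bytes. Is it possible that, when the DMA has filled my 96-byte array, the Transfer Complete interrupt fires so that I can quickly copy the 96-byte array to another one, B, before the circular DMA starts writing to A again (and destroys the data saved in B)? I want to transfer the data from B over USB quickly, every time new data has been copied from A into B.

I am just wondering how to transfer a continuous SPI data stream from the STM32 to a PC over USB, because I think sending a 96-byte block over USB every so often is easier than streaming SPI to USB in real time on the STM32. I do not even know whether that is possible.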

【Comments】:

Yes, it is possible, but it is a race.

【Answer 1】:

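For that to work, you would have to guarantee that you can copy all the data before the next SPI byte is received and transferred to the start of the buffer. Whether that is possible depends on the processor clock speed and the SPI speed, and you would also have to guarantee that no higher-priority interrupt occurs that might delay the copy. To be safe you would need an exceptionally slow SPI clock, in which case you probably would not need DMA at all.

All in all it is a bad idea, and entirely unnecessary. The DMA controller has a "half-transfer" (HT) interrupt for exactly this purpose. You get the HT interrupt when the first 48 bytes have been transferred, and the DMA continues transferring the remaining 48 bytes while you copy the lower half of the buffer. When you get the transfer-complete (TC) interrupt, you copy the upper half. That extends the time you have to move the data from the receive time of a single byte to the receive time of 48 bytes.

If you actually need 96 bytes on each transfer, simply make the buffer 192 bytes long (2 x 96).

In pseudocode: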

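#include <string.h>   /* memcpy */

#define BUFFER_LENGTH 96
char DMA_Buffer[2][BUFFER_LENGTH] ;   /* contiguous 192 bytes, used as one circular DMA buffer */
char B[BUFFER_LENGTH] ;               /* block that is handed on to USB */

/* Pseudocode: DMA_IT_Flag(), Clear_IT_Flag(), DMA_HT, DMA_TC and SET are
   placeholders for the device-specific flag accessors. */
void DMA_IRQHandler()
{
    if( DMA_IT_Flag(DMA_HT) == SET )
    {
        /* First half is complete; the DMA is now filling DMA_Buffer[1] */
        memcpy( B, DMA_Buffer[0], BUFFER_LENGTH ) ;
        Clear_IT_Flag(DMA_HT) ;
    }
    else if( DMA_IT_Flag(DMA_TC) == SET )
    {
        /* Second half is complete; the DMA has wrapped to DMA_Buffer[0] */
        memcpy( B, DMA_Buffer[1], BUFFER_LENGTH ) ;
        Clear_IT_Flag(DMA_TC) ;
    }
}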
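
The half-transfer scheme in the pseudocode can be tried out on a PC before touching hardware. The following is only a host-side sketch: `SimulatedCircularDma` and its members are hypothetical names, and the "DMA" is an ordinary function that stores one byte per call and raises the HT and TC events at the same buffer positions the hardware would.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t BUFFER_LENGTH = 96;

// Host-side stand-in for a circular DMA channel writing into a
// 2 x 96 byte buffer (hypothetical type, not an STM32 API).
struct SimulatedCircularDma {
    std::uint8_t buffer[2][BUFFER_LENGTH];
    std::uint8_t B[BUFFER_LENGTH];   // block handed on to the USB side
    std::size_t  position = 0;       // next write offset, 0..191
    unsigned     blocksCopied = 0;

    // Same work as the two branches of the ISR in the pseudocode.
    void onHalfTransfer()     { std::memcpy(B, buffer[0], BUFFER_LENGTH); ++blocksCopied; }
    void onTransferComplete() { std::memcpy(B, buffer[1], BUFFER_LENGTH); ++blocksCopied; }

    // One received SPI byte arriving through the DMA.
    void receive(std::uint8_t byte) {
        assert(position < 2 * BUFFER_LENGTH);
        buffer[position / BUFFER_LENGTH][position % BUFFER_LENGTH] = byte;
        ++position;
        if (position == BUFFER_LENGTH) {
            onHalfTransfer();          // lower half is stable now
        } else if (position == 2 * BUFFER_LENGTH) {
            position = 0;              // circular mode wraps to the start
            onTransferComplete();      // upper half is stable now
        }
    }
};
```

Feeding 192 bytes through `receive()` produces exactly two copies into `B`, one per 96-byte half, and each copy reads a half that the simulated DMA is no longer writing to.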

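Regarding transferring the data to a PC over USB: first of all you need to be sure that your USB transfer rate is at least as fast as the SPI transfer rate. The USB transfer is likely to be less deterministic, because it is controlled by the PC host (you can only put data on the USB when the host explicitly asks for it). So even if the average transfer rate is sufficient, there may be latency that requires further buffering. Rather than simply copying from DMA buffer A to USB buffer B, you may need a circular buffer or a FIFO queue to feed the USB. On the other hand, if you already have the buffers DMA_Buffer[0], DMA_Buffer[1] and B, you effectively already have a FIFO of three 96-byte blocks, which may be sufficient.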

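【Comments】:

Nice! I get your idea. But what is the name of the variable I have to set in the DMA? Is it just "DMA_Buffer"? And I forgot: what if the device sending me SPI data stops sending? Will I get an interrupt telling me it stopped?

@Niko: Regarding the address of the DMA buffer, yes, DMA_Buffer is valid (cast to uint32_t). All data in an array is contiguous, so it points to a 192-byte block; the two-dimensional array is only there to simplify the code.

@Niko: Regarding a transfer that fails to complete, you will have to implement some sort of timeout. You should really post a new question for that, but essentially you start or restart a timer on each HT/TC interrupt. If the timer interrupt occurs before the DMA interrupt, the data stream has stalled. You might then grab the partial buffer and restart the DMA, or leave the DMA running and keep a note of how much you have already taken, so that when the next DMA interrupt occurs you take only the remainder that was not read before.

@Niko: Further on the timeout issue, it is only relevant if you are the SPI slave. If you are the SPI master, the data will not "stop" unless you, as master, stop the SPI. Even if the slave is not actively updating its shift-out register, the master will clock and shift in whatever level is on the MISO line, which will be a delayed copy of whatever it sent.

【Answer 2】: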

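I had a similar problem in one of my projects. The task was to transfer data coming from an external ADC chip (connected via SPI) to a PC over full-speed USB. The data was 8 ch x 16-bit, and I was asked to achieve the highest sampling frequency possible.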

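I ended up with a triple-buffer solution. A buffer can be in one of 4 states:

READY: the buffer is full of data and waiting to be sent over USB.

SENT: the buffer has already been sent over USB; its data is stale.

IN_USE: the DMA is currently filling this buffer.

NEXT: the buffer is treated as empty; the DMA moves on to it when the IN_USE buffer is full.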

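Because the timing of the USB requests cannot be synchronized with the SPI process, I believe a double-buffer solution would not work. Without a NEXT buffer, by the time you decide to send the READY buffer, the DMA may finish filling the IN_USE buffer and start overwriting the READY buffer. In a triple-buffer solution, the READY buffer is safe to send over USB, because the DMA will not write to it even when the current IN_USE buffer becomes full.

So, as time passes, the buffer states look like this: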

Buf0     Buf1      Buf2
====     ====      ====
READY    IN_USE    NEXT
SENT     IN_USE    NEXT
NEXT     READY     IN_USE
NEXT     SENT      IN_USE
IN_USE   NEXT      READY

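Of course, if the PC does not issue USB requests fast enough, you may still lose a READY buffer when it turns into NEXT (before it ever becomes SENT). The PC sends USB IN requests asynchronously, with no information about the current buffer states. If there is no READY buffer (it is already in the SENT state), the STM32 responds with a ZLP (zero-length packet) and the PC tries again after a 1 ms delay.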
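
The state rotation in the table can be written down as a small state machine. This is a sketch of the bookkeeping only, not the answerer's actual code: `TripleBuffer`, `onDmaComplete()` and `onUsbInRequest()` are hypothetical names, and the real DMA and USB calls are left out.

```cpp
#include <cassert>

enum class BufState { READY, SENT, IN_USE, NEXT };

struct TripleBuffer {
    // Initial layout matches the first row of the state table.
    BufState state[3] = { BufState::READY, BufState::IN_USE, BufState::NEXT };

    int find(BufState wanted) const {
        for (int i = 0; i < 3; ++i)
            if (state[i] == wanted) return i;
        return -1;
    }

    // DMA transfer complete: the filled buffer becomes READY, the NEXT
    // buffer starts filling, and the old READY/SENT buffer is recycled as
    // NEXT. If that old buffer was still READY, its data is lost.
    // Returns the index of the buffer the DMA fills from now on.
    int onDmaComplete() {
        const int filled = find(BufState::IN_USE);
        const int next   = find(BufState::NEXT);
        assert(filled >= 0 && next >= 0);
        const int old = 3 - filled - next;   // the one remaining index
        state[old]    = BufState::NEXT;
        state[filled] = BufState::READY;
        state[next]   = BufState::IN_USE;
        return next;
    }

    // USB IN request: returns the index of the buffer to send,
    // or -1 when there is nothing new and a ZLP should be returned.
    int onUsbInRequest() {
        const int ready = find(BufState::READY);
        if (ready >= 0) state[ready] = BufState::SENT;
        return ready;
    }
};
```

Alternating `onUsbInRequest()` and `onDmaComplete()` from the initial state walks through the same five rows as the table, and a second `onUsbInRequest()` before the next DMA completion returns -1, which corresponds to the ZLP case.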

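For the implementation on STM32, I use double-buffer mode and modify the M0AR and M1AR registers in the DMA Transfer Complete ISR so that they address 3 buffers.

By the way, I used (3 x 4000) byte buffers and achieved a 32 kHz sampling frequency in the end. USB is configured as a vendor-specific class and uses bulk transfers.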
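
The figures in this answer (8 channels x 16 bit, 32 kHz, 4000-byte buffers) can be turned into a data rate and a per-buffer time budget. The helper names below are hypothetical, and the calculation is plain arithmetic rather than a measurement.

```cpp
#include <cassert>
#include <cstdint>

// Sustained byte rate produced by the ADC.
constexpr std::uint32_t bytesPerSecond(std::uint32_t channels,
                                       std::uint32_t bytesPerSample,
                                       std::uint32_t sampleRateHz)
{
    return channels * bytesPerSample * sampleRateHz;
}

// Time in microseconds (rounded down) that the DMA needs to fill one buffer.
std::uint32_t bufferFillTimeUs(std::uint32_t bufferBytes, std::uint32_t byteRate)
{
    assert(byteRate > 0);
    return static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(bufferBytes) * 1000000u / byteRate);
}
```

For these numbers, `bytesPerSecond(8, 2, 32000)` is 512000 bytes/s (about 4.1 Mbit/s, against the 12 Mbit/s signalling rate of full-speed USB), and `bufferFillTimeUs(4000, 512000)` is 7812 µs. The host therefore has roughly 7.8 ms to fetch a READY buffer before the next DMA completion recycles it as NEXT.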

【Comments】:

【Answer 3】:

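Generally, circular DMA only works if you trigger on half full / half empty; otherwise you do not have enough time to copy the data out of the buffer.

I would recommend against copying the data out of the buffer during the interrupt. Instead, use the data directly from the buffer, without an additional copy step.

If you do the copy in the interrupt, you block other, lower-priority interrupts for the duration of the copy. On an STM32, a simple naive byte-by-byte copy of 48 bytes may take an additional 48*6 ~ 300 clock cycles.

If you track the buffer's read and write positions independently, you only need to update a single pointer and post a deferred notification call to the consumer of the buffer.

If you want a longer period, do not use circular DMA. Use normal DMA in 48-byte blocks instead, and implement the circular byte buffer as a data structure.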

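I did this for a USART running at 460k baud that receives variable-length packets asynchronously. If you make sure that the producer only updates the write pointer and the consumer only updates the read pointer, you avoid data races for the most part. Note that on Cortex-M3/M4, reads and writes of an aligned variable of 32 bits or less are atomic.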

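The included code is a simplified version of the circular buffer with DMA support that I used. It is limited to buffer sizes of 2^n and uses templates and C++11 features, so it may not be suitable, depending on your development/platform constraints.

To use the buffer, call getDmaReadBlock() or getDmaWriteBlock() to obtain the DMA memory address and block length. Once the DMA completes, use skipRead() / skipWrite() to advance the read or write pointer by the amount that was actually transferred.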

  #include <stdint.h>

  /**
   * Creates a circular buffer. There is a read pointer and a write pointer.
   * The buffer is full when the write pointer == read pointer - 1.
   */
  template<uint16_t SIZE=256>
  class CircularByteBuffer {
    public:
      struct MemBlock {
          uint8_t  *blockStart;
          uint16_t blockLength;
      };

    private:
      uint8_t *_data;
      uint16_t _readIndex;
      uint16_t _writeIndex;

      static constexpr uint16_t _mask = SIZE - 1;

      // The buffer size must be a power of 2 so that indices can wrap with a mask
      static_assert((SIZE & (SIZE - 1)) == 0, "SIZE must be a power of 2");

    public:
      CircularByteBuffer &operator=(const CircularByteBuffer &) = default;

      CircularByteBuffer(uint8_t (&data)[SIZE]);

      CircularByteBuffer(const CircularByteBuffer &) = default;

      ~CircularByteBuffer() = default;

    private:
      static uint16_t wrapIndex(int32_t index);

    public:
      /**
       * The number of bytes available to be read. Writing bytes to the buffer can only increase this amount.
       */
      uint16_t readBytesAvail() const;

      /**
       * Return the number of bytes that can still be written. Reading bytes can only increase this amount.
       */
      uint16_t writeBytesAvail() const;

      /**
       * Read a byte from the buffer and increment the read pointer.
       * Returns 0 if the buffer is empty.
       */
      uint8_t readByte();

      /**
       * Write a byte to the buffer and increment the write pointer. Throws away the byte if there is no space left.
       * @param byte the byte to store
       */
      void writeByte(uint8_t byte);

      /**
       * Provide read-only access to the buffer without incrementing the pointer. The index wraps, so memory
       * outside the allocated buffer is never accessed, but garbage can still be read if that byte does not
       * contain valid data.
       * @param pos the offset from the current read pointer
       * @return the byte at the given offset in the buffer.
       */
      uint8_t operator[](uint32_t pos) const;

      /**
       * Increment the read pointer by a given amount
       */
      void skipRead(uint16_t amount);

      /**
       * Increment the write pointer by a given amount
       */
      void skipWrite(uint16_t amount);

      /**
       * Get the start and length of the memory block used for DMA writes into the queue.
       * @return the largest contiguous free block starting at the write pointer
       */
      MemBlock getDmaWriteBlock();

      /**
       * Get the start and length of the memory block used for DMA reads from the queue.
       * @return the largest contiguous block of valid data starting at the read pointer
       */
      MemBlock getDmaReadBlock();

  };

  // CircularByteBuffer
  // ------------------
  template<uint16_t SIZE>
  inline CircularByteBuffer<SIZE>::CircularByteBuffer(uint8_t (&data)[SIZE]):
      _data(data),
      _readIndex(0),
      _writeIndex(0) {
  }

  template<uint16_t SIZE>
  inline uint16_t CircularByteBuffer<SIZE>::wrapIndex(int32_t index) {
    // SIZE is a power of 2, so masking is the same as index modulo SIZE
    return static_cast<uint16_t>(index & _mask);
  }

  template<uint16_t SIZE>
  inline uint16_t CircularByteBuffer<SIZE>::readBytesAvail() const {
    return wrapIndex(_writeIndex - _readIndex);
  }

  template<uint16_t SIZE>
  inline uint16_t CircularByteBuffer<SIZE>::writeBytesAvail() const {
    // One slot always stays empty so that full and empty can be told apart
    return wrapIndex(_readIndex - _writeIndex - 1);
  }

  template<uint16_t SIZE>
  inline uint8_t CircularByteBuffer<SIZE>::readByte() {
    if (readBytesAvail()) {
      uint8_t result = _data[_readIndex];
      _readIndex = wrapIndex(_readIndex+1);
      return result;
    } else {
      return 0;
    }
  }

  template<uint16_t SIZE>
  inline void CircularByteBuffer<SIZE>::writeByte(uint8_t byte) {
    if (writeBytesAvail()) {
      _data[_writeIndex] = byte;
      _writeIndex = wrapIndex(_writeIndex+1);
    }
  }

  template<uint16_t SIZE>
  inline uint8_t CircularByteBuffer<SIZE>::operator[](uint32_t pos) const {
    return _data[wrapIndex(_readIndex + pos)];
  }

  template<uint16_t SIZE>
  inline void CircularByteBuffer<SIZE>::skipRead(uint16_t amount) {
    _readIndex = wrapIndex(_readIndex + amount);
  }

  template<uint16_t SIZE>
  inline void CircularByteBuffer<SIZE>::skipWrite(uint16_t amount) {
    _writeIndex = wrapIndex(_writeIndex + amount);
  }

  template <uint16_t SIZE>
  inline typename CircularByteBuffer<SIZE>::MemBlock  CircularByteBuffer<SIZE>::getDmaWriteBlock() {
    uint16_t len = static_cast<uint16_t>(SIZE - _writeIndex);
    // full is (write == (read - 1)), so on wrap around we need to ensure that we stop 1 off from the read pointer.
    if( _readIndex == 0) {
      len = static_cast<uint16_t>(len - 1);
    }
    if( _readIndex > _writeIndex) {
      len = static_cast<uint16_t>(_readIndex - _writeIndex - 1);
    }
    return {&_data[_writeIndex], len};
  }

  template <uint16_t SIZE>
  inline typename CircularByteBuffer<SIZE>::MemBlock  CircularByteBuffer<SIZE>::getDmaReadBlock() {
    if( _readIndex > _writeIndex) {
      // Valid data wraps around: hand out only the part up to the end of the array
      return {&_data[_readIndex], static_cast<uint16_t>(SIZE - _readIndex)};
    } else {
      return {&_data[_readIndex], static_cast<uint16_t>(_writeIndex - _readIndex)};
    }
  }
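
The call order described before the listing (get a block, let the DMA run, then advance the pointer by what was actually transferred) might look like the sketch below. To keep it self-contained it uses `MiniRing`, a condensed stand-in with the same index logic as `CircularByteBuffer` but owning its storage, and `fakeDmaReceive()`, a hypothetical function that plays the part of the DMA hardware. Neither name comes from the answer.

```cpp
#include <cassert>
#include <cstdint>

// Condensed stand-in for CircularByteBuffer (power-of-two SIZE only),
// reduced to the members needed to show the DMA receive call order.
template <std::uint16_t SIZE>
class MiniRing {
    static_assert((SIZE & (SIZE - 1)) == 0, "SIZE must be a power of two");
    std::uint8_t  _data[SIZE];
    std::uint16_t _readIndex = 0;
    std::uint16_t _writeIndex = 0;
    static std::uint16_t wrap(std::int32_t i) {
        return static_cast<std::uint16_t>(i & (SIZE - 1));
    }

public:
    struct MemBlock { std::uint8_t* blockStart; std::uint16_t blockLength; };

    std::uint16_t readBytesAvail() const { return wrap(_writeIndex - _readIndex); }

    std::uint8_t readByte() {
        assert(readBytesAvail() > 0);
        const std::uint8_t b = _data[_readIndex];
        _readIndex = wrap(_readIndex + 1);
        return b;
    }

    void skipWrite(std::uint16_t amount) { _writeIndex = wrap(_writeIndex + amount); }

    // Largest contiguous free region starting at the write index.
    MemBlock getDmaWriteBlock() {
        std::uint16_t len = static_cast<std::uint16_t>(SIZE - _writeIndex);
        if (_readIndex == 0) len = static_cast<std::uint16_t>(len - 1);
        if (_readIndex > _writeIndex)
            len = static_cast<std::uint16_t>(_readIndex - _writeIndex - 1);
        return { &_data[_writeIndex], len };
    }
};

// Hypothetical stand-in for a DMA transfer that stopped after `received` bytes.
template <typename Block>
void fakeDmaReceive(Block block, const std::uint8_t* src, std::uint16_t received) {
    assert(received <= block.blockLength);
    for (std::uint16_t i = 0; i < received; ++i) block.blockStart[i] = src[i];
}
```

A receive cycle is then: `auto block = ring.getDmaWriteBlock();`, start the transfer into `block.blockStart` for at most `block.blockLength` bytes, and when it ends call `ring.skipWrite(n)` with the number of bytes that actually arrived. Only the producer touches the write index and only the consumer touches the read index.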

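【Comments】:

Reviving an old answer, but how do you use DMA efficiently when receiving variable-width packets? TX is easy because you set the transfer length, but for RX you do not know what will arrive, so you either use a transfer length of one byte or some kind of timeout mechanism, right?

For the STM32 UARTs, they implement a character timeout interrupt. That is what you want, rather than a general timeout. The interrupt fires x bit intervals after the last character was received, if no further character arrives. So either the DMA fires its interrupt or the character timeout interrupt fires, and in both cases you need to check the state of the DMA and transfer out whatever is there.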
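
"Check the state of the DMA and transfer out whatever is there" comes down to comparing the programmed transfer length with the channel's remaining-count register, which counts down as bytes arrive (NDTR or CNDTR, depending on the STM32 family). The sketch below shows only that bookkeeping; `RxProgress` and `onEvent()` are hypothetical names, and reading the register itself is left to the caller.

```cpp
#include <cassert>
#include <cstdint>

// Bytes the DMA has stored so far in the current block.
// `programmedLength` is the length written when the transfer was started;
// `remainingCount` is the value read back from the count-down register.
std::uint16_t bytesTransferred(std::uint16_t programmedLength,
                               std::uint16_t remainingCount)
{
    assert(remainingCount <= programmedLength);
    return static_cast<std::uint16_t>(programmedLength - remainingCount);
}

// Remembers how much of the current block was already handed to the
// consumer, so that a character timeout followed by a DMA interrupt
// (or the reverse) never reports the same bytes twice.
struct RxProgress {
    std::uint16_t programmedLength;
    std::uint16_t alreadyReported;

    // Call from either interrupt; returns the number of new bytes.
    std::uint16_t onEvent(std::uint16_t remainingCount) {
        const std::uint16_t total = bytesTransferred(programmedLength, remainingCount);
        const std::uint16_t fresh = static_cast<std::uint16_t>(total - alreadyReported);
        alreadyReported = total;
        return fresh;
    }
};
```

With a 64-byte block, a timeout that finds 54 bytes remaining reports 10 new bytes, and a later transfer-complete event (0 remaining) reports the other 54. The returned count is the amount to pass to something like skipWrite() on the receive buffer.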
