PulsePins RLE-Decoder

Project summary

PulsePins is a flexible run-length–encoded (RLE) pattern generator for 32-bit (or wider) parallel data buses, with 10 ns timing resolution and advanced triggering capabilities. It is designed for reliable, robust operation with extensive self-testing. Common use cases are quick to implement, and the architecture is easy to extend.

Main features

  • High-speed RLE decoding core: zero-wait-state decoding (one update per clock period), limited only by the clock frequency. At 100 MHz this provides 10 ns timing resolution for pulse durations and separations.
  • Two data sources: streaming from the hard processor system (ARM core) through a FIFO queue, or from predefined sequences in memory (buffer size up to 512 MB, streamed from RAM via DMA).
  • Output FIFO with throttling to guarantee loss-free streaming.
  • Preprocessor implementing a second level of run-length decoding (repetitions of short sequences of RLE elements), enabling compact representation of periodic signals.
  • Internal clock (PLL-generated) or external clock input.
  • Rich set of data-path update operations: load, set, clear, toggle, rotate left/right, NOT, AND, OR, XOR, XNOR.
  • Pseudorandom bitstream generator based on the xoroshiro128+ algorithm.
  • Explicit control over output enabling (asserted/deasserted valid signal or presence/absence of strobe pulses), configurable on the fly while streaming a sequence.
  • Advanced multi-bit, multi-stage triggering with arbitrarily long trigger programs (bounded only by the configurable trigger-stage buffer, 256 stages by default). Each stage is defined by a mask (which bits are observed) and a pattern (expected bit values). The 8 trigger inputs can be extended to a wider trigger bus.
  • Switchable trigger sources (external inputs, internal signals, on-board push-buttons/switches) with per-bit masking and inversion.
  • Multiple streamer cores (four instances by default) with independent triggering for conditional streaming controlled by external signals. Outputs from the cores are combined by an advanced multiplexer that supports:
    • selection between streamers,
    • logic operations (AND, OR, XOR, XNOR, majority),
    • concatenation of data blocks from different cores,
    • arithmetic sums and differences,
    • per-bit masking and inversion.
  • Pausing and retriggering to support conditional branching and looping.
  • Gating: output streaming can be halted by a gate signal.
  • High-level, modern, object-oriented C++ API.
  • Python bindings for the C++ API (nanobind), with unit tests based on pytest.
  • Buffer-underrun detection and read-back circuitry with an on-chip run-length encoder for verification and high-assurance scenarios where reliability and correctness under all operating conditions are critical.
  • Comprehensive hardware self-tests via the read-back interface and a suite of test cases for systematic, intensive validation of correct device operation; most of the functionality is covered by these tests.
  • 8-bit auxiliary input/output lines for general-purpose use.
  • Time-stamping circuit for synchronization and timing purposes.
  • General-purpose operation as a delay generator or function generator.
  • Clean, well-documented Verilog implementation with test benches and high test coverage.
  • KiCad schematics and layouts for interface cards that provide easy interfacing (PMOD, SMA), buffering (50 Ω drivers), ESD protection, status LEDs, external clock input with threshold control, and pads for CMOS crystal oscillators.
  • Proven stability: no lockups or errors observed during five days of continuous stress testing at a 100 MHz streamer clock, even without a heatsink on the FPGA.
  • Configurable widths for the output data bus (32 or 64 bits) and for the run-length counter (32 or 64 bits).
  • Reference and user manuals (these web pages).
  • Liberal MIT license, requiring only attribution.
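
As a rough illustration of the two decoding levels in the feature list (RLE elements, and repetitions of short sequences of RLE elements), the following Python sketch expands repetition blocks into elements and then into one bus value per clock tick. The tuple layouts `(value, run_length)` and `(repeat_count, elements)` and all function names are assumptions made for this example; they are not the PulsePins data format or API.

```python
from typing import Iterable, Iterator, List, Tuple

TICK_NS = 10  # one clock period at 100 MHz, as stated in the feature list

# Hypothetical layouts, chosen only for this sketch:
Element = Tuple[int, int]          # (bus value, run length in clock ticks)
Block = Tuple[int, List[Element]]  # (repeat count, short sequence of elements)


def expand_blocks(blocks: Iterable[Block]) -> Iterator[Element]:
    """Second decoding level: repeat each short element sequence `count` times."""
    for count, elements in blocks:
        for _ in range(count):
            yield from elements


def decode(elements: Iterable[Element]) -> Iterator[int]:
    """First decoding level: emit one bus value per clock tick."""
    for value, run_length in elements:
        for _ in range(run_length):
            yield value


def duration_ns(elements: Iterable[Element]) -> int:
    """Total duration of a sequence of elements in nanoseconds."""
    return sum(run_length for _, run_length in elements) * TICK_NS


# Three periods of a pulse: bit 0 high for 2 ticks, then low for 3 ticks.
pulse_train = [(3, [(0x1, 2), (0x0, 3)])]
samples = list(decode(expand_blocks(pulse_train)))
total_ns = duration_ns(expand_blocks(pulse_train))
```

Here a single repetition block describes the whole periodic signal, which is the compactness benefit the preprocessor is meant to provide; the hardware performs the equivalent expansion at one update per clock period.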

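The mask/pattern description of a trigger stage can be sketched in a few lines of Python. This is a simplified model under stated assumptions: a stage matches when the observed (masked) bits equal the expected bits, each matching sample advances the program by one stage, and the trigger fires when the last stage matches. The function names and the exact advance rule are hypothetical; the hardware semantics may differ in detail.

```python
from typing import List, Optional, Sequence, Tuple

Stage = Tuple[int, int]  # (mask: which bits are observed, pattern: expected bit values)


def stage_matches(inputs: int, mask: int, pattern: int) -> bool:
    """True when every observed bit of `inputs` equals the expected value."""
    return (inputs & mask) == (pattern & mask)


def run_trigger(samples: Sequence[int], stages: List[Stage]) -> Optional[int]:
    """Return the index of the sample that completes the last stage, or None.

    Assumption for this sketch: one sample is consumed per stage, so the
    sample that satisfies stage k is not also tested against stage k + 1.
    """
    if not stages:
        return None
    current = 0
    for index, inputs in enumerate(samples):
        mask, pattern = stages[current]
        if stage_matches(inputs, mask, pattern):
            current += 1
            if current == len(stages):
                return index
    return None


# Stage 0: bit 0 high and bit 1 low. Stage 1: bit 7 high, other bits ignored.
stages = [(0b0000_0011, 0b0000_0001), (0b1000_0000, 0b1000_0000)]
samples = [0b0000_0000, 0b0000_0011, 0b0000_0001, 0b0000_0001, 0b1000_0001]
fired_at = run_trigger(samples, stages)
```

With 8 trigger inputs, masks and patterns fit in one byte each; a wider trigger bus would only change the integer width used here.
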
Typical use cases

  • Control of complex scientific apparatus under strict timing constraints (where 10 ns resolution is sufficient).
  • Driving serializer circuits for generating high-frequency signals.
  • Driving digital-to-analog converters (DACs) for generating analog waveforms.
  • Driving direct digital synthesis (DDS) chips for RF/microwave signal generation.
  • Delay generation and synchronization.
  • Characterization of logic devices and circuits.
  • Protocol emulation (I²C, SPI, and other serial buses).
  • Generation of periodic signals (repetitive bit patterns) or pseudorandom sequences for communications testing.
  • Burn-in and stress testing.
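
To make the protocol-emulation use case more concrete, here is a hedged Python sketch that encodes one SPI mode-0 byte (MSB first) as a list of `(bus value, run length)` pairs. The bit assignment of CS, SCLK and MOSI on the bus, the default timings, and the function names are all assumptions for illustration; none of this reflects the PulsePins API.

```python
from typing import List, Tuple

# Hypothetical bit assignment on the output bus:
CS, SCLK, MOSI = 0x1, 0x2, 0x4

Run = Tuple[int, int]  # (bus value, run length in clock ticks)


def spi_byte_to_rle(byte: int, half_period_ticks: int = 5, idle_ticks: int = 10) -> List[Run]:
    """Encode one byte as SPI mode 0 (clock idles low, data sampled on rising edge)."""
    runs: List[Run] = [(CS, idle_ticks)]  # idle: chip select high, clock low
    for bit_index in range(7, -1, -1):    # MSB first
        data = MOSI if (byte >> bit_index) & 1 else 0
        runs.append((data, half_period_ticks))         # CS low, SCLK low: data set up
        runs.append((data | SCLK, half_period_ticks))  # SCLK high: receiver samples
    runs.append((0, half_period_ticks))   # clock back low before releasing CS
    runs.append((CS, idle_ticks))         # chip select released
    return runs


def sample_mosi(runs: List[Run]) -> int:
    """Recover the byte by sampling MOSI on rising SCLK edges while CS is low."""
    value = 0
    previous = runs[0][0]
    for bus, _ in runs[1:]:
        rising = (bus & SCLK) and not (previous & SCLK)
        if rising and not (bus & CS):
            value = (value << 1) | (1 if bus & MOSI else 0)
        previous = bus
    return value


frame = spi_byte_to_rle(0xA5)
total_ticks = sum(length for _, length in frame)
```

With the default of 5 ticks per half period and a 10 ns tick, the emulated SCLK period is 100 ns. Because each run stores only a value and a duration, the frame length in memory does not grow when the clock is slowed down.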

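For pseudorandom test sequences, the feature list names xoroshiro128+ as the generator. The Python sketch below follows the publicly available reference algorithm with the rotation constants 24, 16 and 37 from its 2018 revision. Which constants, seeding scheme, and bit-to-bus mapping the PulsePins hardware uses is not stated here, so those choices, along with the class and method names, are assumptions.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, k: int) -> int:
    """Rotate a 64-bit value left by k bits."""
    return ((x << k) | (x >> (64 - k))) & MASK64


class Xoroshiro128Plus:
    """Software model of xoroshiro128+ (constants 24, 16, 37 assumed)."""

    def __init__(self, s0: int, s1: int) -> None:
        s0 &= MASK64
        s1 &= MASK64
        if s0 == 0 and s1 == 0:
            raise ValueError("state must not be all zero")
        self.s0 = s0
        self.s1 = s1

    def next(self) -> int:
        """Return the next 64-bit output and advance the state."""
        s0, s1 = self.s0, self.s1
        result = (s0 + s1) & MASK64
        s1 ^= s0
        self.s0 = rotl64(s0, 24) ^ s1 ^ ((s1 << 16) & MASK64)
        self.s1 = rotl64(s1, 37)
        return result

    def next_bus_word(self, width: int = 32) -> int:
        """Return the upper `width` bits of the next output (assumed mapping)."""
        if not 1 <= width <= 64:
            raise ValueError("width must be between 1 and 64")
        return self.next() >> (64 - width)


generator = Xoroshiro128Plus(1, 2)
first = generator.next()
second = generator.next()
```

The sketch takes the upper bits for the bus word because the lowest bits of xoroshiro128+ are known to be its statistically weakest. A model like this can also be used to predict the expected stream when checking read-back data, provided its constants and seed match the hardware.
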
Acknowledgments

This project incorporates code from the rsyocto project (© Robin Sebastian), licensed under the MIT License. See third_party/rsyocto/LICENSE for details.

Rok Zitko, rok.zitko@ijs.si, 2025