v0.1.0 // open source

WAVEPX

Transmit files, images, and data through sound. Fountain codes. DEFLATE compression. Reliable transfer protocol. No internet required.

Built on ggwave.
An application framework on top.

Powered by ggwave

ggwave by Georgi Gerganov is a brilliant data-over-sound library that handles the hardest part — turning bytes into precisely tuned audio tones and decoding them back using FSK modulation and FFT. It supports 9 protocols across audible, ultrasound, and dual-tone frequency bands. It’s the foundation that makes everything here possible.

WAVEPX is an application framework layer on top of ggwave. It adds multi-frame chunked images with progressive rendering, DEFLATE compression for higher throughput, fountain codes for reliable delivery, palette quantization, dithering, and full application protocols for file transfer and games.

Think of it this way: ggwave is the brilliant physical layer that turns data into sound. WAVEPX adds the application stack on top — chunking, compression, error recovery, and the UI.
Everything the air can carry
Nine data types, four compression strategies, and a complete UI. Each feature is both functional and educational.
📡

File Transfer

Drag-drop any file up to 64KB. SYN/ACK handshake, fountain-coded data burst, CRC-32 verification. Reliable delivery over a lossy audio channel.

fountain codes
🖼

Image Transfer

Load photos, apply dithering (Floyd-Steinberg, Atkinson), and transmit as multi-chunk pixel data. Supports B&W, 4-gray, 16-gray, and 16-color palette modes.

chunked protocol
💬

Text Messages

Send UTF-8 text up to 138 bytes in a single frame. With DEFLATE compression in transfer mode, send entire paragraphs efficiently.

single frame
🎯

QR Over Sound

Generate and transmit QR codes (versions 1–4) as bit-packed audio frames. URLs, contact cards, WiFi configs — no camera needed.

auto-version
🎮

Battleship Game

Full game protocol over sound: ship placement, shot/result exchanges, win detection. Coordinate labels, ghost preview, ship roster, status messages.

game protocol
🏆

Compression Arena

Side-by-side comparison of 7 encoding strategies. See raw vs. RLE vs. palette, byte counts, chunk counts, and visual previews. Learn by seeing.

educational
🎨

Pixel Drawing

Interactive pixel canvas with adjustable grid (8×8 to 512×512), grayscale modes, and real-time encoding stats. Draw and send pixel art over audio.

interactive
🔍

Dithering Engine

Floyd-Steinberg and Atkinson error diffusion algorithms. Convert photos to low-bit-depth pixel data while preserving visual quality through perceptual tricks.

image processing
📚

Built-in Education

Every tab has expandable blurbs explaining the science: how FSK works, why fountain codes tolerate loss, what dithering does. Learning is the feature.

details/summary
Five layers deep
From your file to audible tones — each layer adds reliability, efficiency, or structure.
Application
File transfer, image viewer, QR generator, text chat, battleship game — each with its own frame subtype and UI
Fountain
XOR-based erasure coding — K source blocks + 50% parity blocks. Receiver reconstructs from any sufficient subset
Compression
DEFLATE-raw via CompressionStream API. RLE for pixel data. Palette quantization with row-run encoding for color images
Chunking
Splits payloads into 140-byte frames with headers: chunk index, total count, dimensions, compression flags, CRC-32
Transport
ggwave by Georgi Gerganov — 9 FSK protocols across audible, ultrasound, and dual-tone bands. 48kHz sample rate, AudioWorklet capture
Three lines to transmit
Install the library, create an instance, send. Or just run npx wavepx for the full app.
import { SonicPixel } from 'wavepx';

const sp = new SonicPixel();
await sp.init();

// Send text
await sp.sendText('hello from the other side');

// Send a QR code
await sp.sendQr('https://github.com/0xNtive/wavepx');

// Send a pixel image (16x16 B&W)
const pixels = new Array(256).fill(false);
await sp.sendImage(16, 16, pixels);

// Receive: dispatch decoded frames by type
import { SonicPixel, FrameType } from 'wavepx';

const sp = new SonicPixel({
  onReceive: (msg) => {
    switch (msg.type) {
      case FrameType.TXT:
        console.log('Text:', msg.text);
        break;
      case FrameType.IMG:
        console.log(`Image: ${msg.width}x${msg.height}`);
        break;
      case FrameType.CHUNK:
        console.log('Chunked image complete');
        break;
    }
  },
});

await sp.init();
await sp.startListening();  // mic access required

// File transfer: fountain codes + compression, with progress reporting
import { SonicPixel } from 'wavepx';

const sp = new SonicPixel({
  onTransferMessage: (msg) => {
    // Handle incoming transfer protocol frames
    // SYN, DATA, DONE are handled automatically
  },
});
await sp.init();

// Send a file with fountain codes + compression
const data = new TextEncoder().encode('file contents...');
await sp.sendFileTransfer(
  data.buffer,
  'notes.txt',
  (sent, total, state) => {
    console.log(`${state}: ${sent}/${total} blocks`);
  }
);

# Launch the full app locally
$ npx wavepx

# Custom port
$ npx wavepx -p 8080

# Or install globally
$ npm install -g wavepx
$ wavepx

  WAVEPX  data over sound

  Local:   http://localhost:3000

  Press Ctrl+C to stop
API Reference
The complete SonicPixel class API. Import anything from 'wavepx'.
SonicPixel — Core
new SonicPixel(config?: SonicPixelConfig)
Create a new instance. Config accepts the callbacks onReceive, onError, onStateChange, onAudioLevel, onChunkProgress, onGameMessage, and onTransferMessage, plus protocol (0-8) and volume (0-100).
init(): Promise<void>
Initialize audio context and ggwave transport. Must be called from a user gesture (click/tap) in browsers.
startListening(): Promise<void>
Begin capturing audio from the microphone. Decoded frames are delivered via the onReceive callback. Requires mic permission.
stopListening(): void
Stop audio capture and release microphone.
destroy(): void
Clean up all resources (audio context, ggwave instance, mic stream).
Sending
sendText(text: string): Promise<void>
Send a UTF-8 text message (up to 138 bytes). Single frame, no chunking.
sendQr(text: string, opts?: { ecLevel?, version? }): Promise<void>
Generate and send a QR code. Auto-selects the smallest QR version (1-4) that fits. EC levels: L, M, Q, H.
sendImage(w, h, pixels): Promise<void>
Send a 1-bit B&W image. If it fits in 140 bytes, sends as single frame (IMG). Otherwise automatically uses chunked multi-frame.
sendChunkedImage(w, h, pixels, onProgress?): Promise<void>
Explicitly chunked 1-bit image. Uses RLE if more compact than raw. Progress callback: (sent, total) => void.
sendGrayImage(w, h, pixels, bitDepth, onProgress?): Promise<void>
Multi-frame grayscale image. bitDepth: 1, 2 (4 levels), or 4 (16 levels). Chooses raw or RLE automatically.
sendPaletteImage(img: PaletteImage): Promise<void>
Send an indexed-color image (up to 16 RGB colors) with row-run compression. Use quantizeColors() to prepare.
sendFileTransfer(data, fileName, onProgress?): Promise<void>
Reliable file transfer (up to 64KB) using fountain codes + DEFLATE + CRC-32. Full SYN/SYN_ACK/DATA/DONE handshake. Progress: (sent, total, state) => void.
sendRaw(frame: Uint8Array): Promise<void>
Send a pre-encoded frame (e.g., game protocol frames). No validation — you build the bytes.
pauseSend() / resumeSend() / abortSend()
Control multi-frame sends. Pause suspends after the current frame; resume continues; abort cancels remaining frames.
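The 138-byte limit on sendText counts UTF-8 bytes, not characters, so accented letters and emoji use up the budget faster than ASCII. A minimal sketch of a pre-flight check follows; utf8ByteLength and fitsSingleTextFrame are hypothetical helpers written for this example, not part of the wavepx API.

```javascript
// Hypothetical helpers for illustration; not exported by wavepx.
const MAX_TEXT_BYTES = 138; // single-frame limit quoted for sendText

function utf8ByteLength(text) {
  // TextEncoder always produces UTF-8
  return new TextEncoder().encode(text).length;
}

function fitsSingleTextFrame(text) {
  return utf8ByteLength(text) <= MAX_TEXT_BYTES;
}

console.log(utf8ByteLength('héllo'));              // 6, because 'é' takes two bytes
console.log(fitsSingleTextFrame('a'.repeat(138))); // true
console.log(fitsSingleTextFrame('é'.repeat(70)));  // false, that is 140 bytes
```

Longer text would presumably go through the transfer path with DEFLATE, as described under Text Messages.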
Configuration
setProtocol(protocol: SonicProtocol): void
Set the audio protocol. 9 variants: Audible/Ultrasound/DT800, each with Normal/Fast/Fastest speed. Higher speed = lower reliability.
setVolume(vol: number): void
Set playback volume (0-100). Higher volume improves range but may cause distortion.
getFrequencyData(): Uint8Array | null
Get real-time FFT spectrum (128 bins, 0-255 per bin). Useful for audio visualization. Returns null if not listening.
generateWav(frames: Uint8Array[]): Promise<Blob>
Encode frames to a WAV file blob (48kHz, 16-bit PCM) without playing them. Useful for offline generation.
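For a feel of what a 48kHz, 16-bit PCM WAV looks like on disk, here is a rough sketch of a mono WAV writer. It is not the library's generateWav; the function name floatToWavBytes, the mono layout, and the choice to put silence only between waveforms (not after the last one) are assumptions made for this example.

```javascript
// Hypothetical sketch: packs Float32 waveforms into a mono 16-bit PCM WAV buffer.
function floatToWavBytes(waveforms, gapMs, sampleRate = 48000) {
  const gapSamples = Math.round((gapMs / 1000) * sampleRate);
  const totalSamples = waveforms.reduce((n, w) => n + w.length, 0)
    + gapSamples * Math.max(0, waveforms.length - 1);
  const dataBytes = totalSamples * 2; // 2 bytes per 16-bit sample
  const buffer = new ArrayBuffer(44 + dataBytes); // zero-filled, so gaps are silent
  const view = new DataView(buffer);
  const ascii = (offset, s) => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  ascii(0, 'RIFF');
  view.setUint32(4, 36 + dataBytes, true);   // file size minus 8
  ascii(8, 'WAVE');
  ascii(12, 'fmt ');
  view.setUint32(16, 16, true);              // fmt chunk size
  view.setUint16(20, 1, true);               // format 1 = PCM
  view.setUint16(22, 1, true);               // channels (mono, assumed)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true);  // byte rate
  view.setUint16(32, 2, true);               // block align
  view.setUint16(34, 16, true);              // bits per sample
  ascii(36, 'data');
  view.setUint32(40, dataBytes, true);

  let offset = 44;
  waveforms.forEach((waveform, idx) => {
    for (const sample of waveform) {
      const clamped = Math.max(-1, Math.min(1, sample));
      view.setInt16(offset, Math.round(clamped * 32767), true);
      offset += 2;
    }
    if (idx < waveforms.length - 1) offset += gapSamples * 2; // skip over silence
  });
  return buffer;
}
```

In a browser, the buffer can be wrapped as new Blob([buffer], { type: 'audio/wav' }) to make it downloadable.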
Lower-Level Exports
encodeFrame / decodeFrame
Encode/decode QR, IMG, TXT messages to/from raw bytes. Returns typed message objects.
packBits / unpackBits / packValues / unpackValues
MSB-first bit packing for 1-bit booleans and N-bit values (1, 2, 4 bits per value).
rleEncode / rleDecode / rleEncodeGray / rleDecodeGray
Run-length encoding for 1-bit (7-bit length + color bit) and grayscale (value + length pairs).
encodeBlocks / FountainDecoder
XOR-based fountain codes. encodeBlocks(data, blockSize, redundancy) produces K systematic + parity blocks. Decoder reconstructs from any sufficient subset of received blocks (at least K are needed).
quantizeColors / encodePaletteImage / decodePaletteImage
Median-cut color quantization (up to 16 colors) and row-run palette encoding. Sparse colors compress to near-zero overhead.
ditherImage(luma, w, h, levels, algorithm): number[]
Floyd-Steinberg or Atkinson error diffusion dithering. Input: Float64Array luminance. Output: quantized pixel values.
deflateCompress / deflateDecompress
DEFLATE-raw compression via browser CompressionStream API. Falls back to passthrough in Node.js.
computeCrc32(data: Uint8Array): number
IEEE 802.3 CRC-32 with pre-computed lookup table. Used for file transfer integrity verification.
waveformsToWav(waveforms, gapMs): Blob
Concatenate Float32 PCM waveforms with silence gaps into a downloadable WAV file.
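The CRC-32 used for integrity checks is the standard IEEE 802.3 variant, which is small enough to sketch in full. The version below is an independent illustration of the table-driven algorithm (reflected polynomial 0xEDB88320), not the library's computeCrc32 source; the name crc32 is chosen for this example.

```javascript
// Table-driven IEEE 802.3 CRC-32 (reflected polynomial 0xEDB88320).
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
  }
  CRC_TABLE[n] = c >>> 0;
}

function crc32(bytes) {
  let crc = 0xFFFFFFFF;                        // initial value
  for (const b of bytes) {
    crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
  }
  return (crc ^ 0xFFFFFFFF) >>> 0;             // final XOR, forced to unsigned
}

// Standard check value for the ASCII string "123456789"
const check = crc32(new TextEncoder().encode('123456789'));
console.log(check.toString(16)); // cbf43926
```

A receiver would run the same function over the reassembled file and compare the result with the CRC announced by the sender.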
Game & Transfer
encodeGameFrame / decodeGameFrame
Battleship protocol: SETUP (placement hash), SHOT (x, y), RESULT (hit/miss/sunk), WIN. 3-7 byte frames.
createGameState / placeShip / receiveShot / checkWin
Pure game state machine. 8×8 grid, 3 ships (3, 2, 2). FNV-1a hash of placement determines turn order.
TransferSenderSession / TransferReceiverSession
Full file transfer lifecycle managers. Sender: prepare → SYN → SYN_ACK → DATA burst → DONE. Receiver: SYN → SYN_ACK → collect DATA → verify CRC → DONE.
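The placement hash mentioned above is FNV-1a, a tiny non-cryptographic hash. Below is a minimal 32-bit sketch; the way a ship placement is serialized to bytes here (x, y, orientation per ship) is purely an assumption for illustration, and how wavepx turns the hash into a turn order is not shown.

```javascript
// 32-bit FNV-1a: XOR each byte in, then multiply by the FNV prime.
function fnv1a32(bytes) {
  let hash = 0x811c9dc5;                        // FNV offset basis
  for (const b of bytes) {
    hash ^= b;
    hash = Math.imul(hash, 0x01000193) >>> 0;   // FNV prime, kept to 32 bits
  }
  return hash >>> 0;
}

// Hypothetical serialization: [x, y, horizontal ? 1 : 0] for each ship.
const placement = Uint8Array.of(0, 0, 1, 3, 2, 0, 5, 5, 1);
console.log(fnv1a32(placement).toString(16));
```

Math.imul keeps the multiplication in 32-bit integer space, which plain `*` would not do once values exceed 2^53.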
Wire Protocol
All frames are single ggwave transmissions (max 140 bytes). Byte 0 identifies the frame type.
Type      Byte 0  Header                                     Payload                     Max Size
QR        0x01    version/EC (1B) + proto ver (1B)           Bit-packed QR modules       140B
IMG       0x02    width (1B) + height (1B) + proto ver (1B)  Bit-packed pixels           140B
TXT       0x03    proto ver (1B)                             UTF-8 text (up to 138B)     140B
CHUNK     0x04    index + total + flags + dimensions         Compressed pixel data       140B/frame
GAME      0x05    subtype (1B) + session fields              Setup/shot/result/win data  7B
TRANSFER  0x06    subtype (1B) + session ID + fields         Fountain-coded blocks       140B/frame
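To make the IMG row concrete, here is a sketch that packs 1-bit pixels MSB-first and prepends the four header bytes in the order the table lists them. The function names and the protocol version value of 1 are assumptions for this example; for real frames, use the library's encodeFrame.

```javascript
// Hypothetical sketch of an IMG frame builder based on the table above.
function packBitsMsbFirst(bits) {
  const out = new Uint8Array(Math.ceil(bits.length / 8));
  bits.forEach((bit, i) => {
    if (bit) out[i >> 3] |= 0x80 >> (i & 7); // first pixel lands in the top bit
  });
  return out;
}

function buildImgFrame(width, height, pixels, protoVersion = 1) {
  if (pixels.length !== width * height) throw new Error('pixel count mismatch');
  const payload = packBitsMsbFirst(pixels);
  const frame = new Uint8Array(4 + payload.length);
  frame.set([0x02, width, height, protoVersion]); // type, width, height, proto ver
  frame.set(payload, 4);
  if (frame.length > 140) throw new Error('too large for a single frame');
  return frame;
}

const frame = buildImgFrame(16, 16, new Array(256).fill(false));
console.log(frame.length); // 36: 4 header bytes + 32 payload bytes
```

With a 4-byte header, a single frame has room for 136 payload bytes, or 1088 pixels; anything larger needs the chunked path.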
Transfer Subtypes
Subtype  Code  Direction          Purpose
SYN      0x01  Sender → Receiver  File metadata, block params, CRC-32, filename
SYN_ACK  0x02  Receiver → Sender  Acknowledge, ready to receive
DATA     0x03  Sender → Receiver  Fountain-coded block (index, degree, source indices, payload)
DONE     0x04  Receiver → Sender  CRC verified: success or mismatch
ABORT    0x05  Either             Cancel: user, timeout, or error
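As a rough illustration of how a receiver might dispatch on these codes, the sketch below classifies a TRANSFER frame and advances a very small sender-side state machine. It assumes the subtype sits at byte 1, directly after the type byte; the state names and both function names are invented for this example and are not the TransferSenderSession internals.

```javascript
// Hypothetical dispatch table built from the subtype table above.
const TRANSFER_TYPE = 0x06;
const TRANSFER_SUBTYPES = {
  0x01: { name: 'SYN', from: 'sender' },
  0x02: { name: 'SYN_ACK', from: 'receiver' },
  0x03: { name: 'DATA', from: 'sender' },
  0x04: { name: 'DONE', from: 'receiver' },
  0x05: { name: 'ABORT', from: 'either' },
};

function classifyTransferFrame(frame) {
  if (frame.length < 2 || frame[0] !== TRANSFER_TYPE) return null;
  return TRANSFER_SUBTYPES[frame[1]] ?? null; // assumes subtype is byte 1
}

// Sender side: which state follows an incoming frame? (state names are made up)
function nextSenderState(state, subtypeName) {
  if (subtypeName === 'ABORT') return 'aborted';
  if (state === 'awaiting_syn_ack' && subtypeName === 'SYN_ACK') return 'sending_data';
  if (state === 'awaiting_done' && subtypeName === 'DONE') return 'finished';
  return state; // ignore anything unexpected
}

const info = classifyTransferFrame(Uint8Array.of(0x06, 0x02, 0x2a));
console.log(info.name, nextSenderState('awaiting_syn_ack', info.name));
// SYN_ACK sending_data
```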
Chunk Frame Layout
Byte   First Chunk (idx=0)                    Subsequent Chunks
0      0x04 (type)                            0x04 (type)
1      Chunk index                            Chunk index
2      Total chunks                           Total chunks
3      Flags [compress:2|depth:2|reserved:4]  Flags [compress:2|depth:2|reserved:4]
4-5    Width (big-endian)                     Proto version (1B) + payload...
6-7    Height (big-endian)                    (payload continues)
8      Protocol version                       (payload continues)
9-139  Payload (131B max)                     Payload (135B max)
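Reading the layout above back out of a frame is mostly a matter of byte offsets. The sketch below is one interpretation of the table, with the compress field in bits 7-6 of the flags byte and the depth field in bits 5-4; parseChunkFrame is a hypothetical name, and the depth field is returned as its raw 2-bit value because the table does not say how it maps to bit depths.

```javascript
// Hypothetical parser for CHUNK frames, following the layout table above.
function parseChunkFrame(frame) {
  if (frame[0] !== 0x04) throw new Error('not a CHUNK frame');
  const flags = frame[3];
  const common = {
    index: frame[1],
    total: frame[2],
    compress: flags >> 6,          // bits 7-6
    depth: (flags >> 4) & 0b11,    // bits 5-4 (raw field value)
  };
  if (common.index === 0) {
    // First chunk carries the image dimensions, big-endian
    return {
      ...common,
      width: (frame[4] << 8) | frame[5],
      height: (frame[6] << 8) | frame[7],
      protoVersion: frame[8],
      payload: frame.subarray(9),  // up to 131 bytes
    };
  }
  return {
    ...common,
    protoVersion: frame[4],
    payload: frame.subarray(5),    // up to 135 bytes
  };
}

const first = new Uint8Array(140);
first.set([0x04, 0, 3, 0b01100000, 0x02, 0x00, 0x01, 0x00, 1]);
const parsed = parseChunkFrame(first);
console.log(parsed.width, parsed.height, parsed.payload.length); // 512 256 131
```

The 131 and 135 byte limits fall straight out of the arithmetic: 140 minus a 9-byte first header, and 140 minus a 5-byte follow-up header.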
Compression Strategies (Flags bits 7-6)
Value  Strategy     Best For
00     Raw          Random patterns, high-entropy images
01     RLE (1-bit)  Solid regions, line art, sparse graphics
10     RLE (gray)   Grayscale with large uniform areas
11     Palette      Color images with ≤16 colors
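Why does RLE win on solid regions and lose on noise? A quick sketch of the 1-bit scheme (one byte per run: a color bit plus a 7-bit length) shows it. Placing the color in the top bit and the length in the low seven bits is an assumption here; the document only says "7-bit length + color bit". The function names are chosen for this example.

```javascript
// Hypothetical 1-bit RLE: each output byte = (color << 7) | runLength, runLength 1..127.
function rleEncode1Bit(bits) {
  const out = [];
  let i = 0;
  while (i < bits.length) {
    const color = bits[i] ? 1 : 0;
    let run = 1;
    while (i + run < bits.length && (bits[i + run] ? 1 : 0) === color && run < 127) run++;
    out.push((color << 7) | run);
    i += run;
  }
  return Uint8Array.from(out);
}

function rleDecode1Bit(bytes) {
  const bits = [];
  for (const b of bytes) {
    const color = (b & 0x80) !== 0;
    for (let n = b & 0x7f; n > 0; n--) bits.push(color);
  }
  return bits;
}

const blank = new Array(256).fill(false);               // 16x16 empty canvas
const checker = Array.from({ length: 256 }, (_, i) => i % 2 === 1);
console.log(rleEncode1Bit(blank).length);   // 3 bytes, versus 32 bytes raw
console.log(rleEncode1Bit(checker).length); // 256 bytes, far worse than 32 raw
```

This is why a sender would compare both sizes and fall back to raw when RLE does not help.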
Game Subtypes
Subtype  Code  Size  Fields
SETUP    0x01  7B    hash (4 bytes, FNV-1a of ship placement)
SHOT     0x02  5B    x, y coordinates
RESULT   0x03  6B    x, y, result (0=miss, 1=hit, 2=sunk)
WIN      0x04  3B    (none)
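The sizes in the table are consistent with a 3-byte prefix (type, subtype, and one session byte) followed by the listed fields. That single session byte is an inference from the sizes, not something the table states, so treat the sketch below as one plausible reading rather than the real encodeGameFrame layout. All names are hypothetical.

```javascript
// Hypothetical layout: [0x05, subtype, session, ...fields]
const GAME_TYPE = 0x05;
const GAME_SUBTYPE = { SETUP: 0x01, SHOT: 0x02, RESULT: 0x03, WIN: 0x04 };
const RESULT_NAMES = ['miss', 'hit', 'sunk'];

function encodeShot(session, x, y) {
  if (x < 0 || x > 7 || y < 0 || y > 7) throw new RangeError('off the 8x8 grid');
  return Uint8Array.of(GAME_TYPE, GAME_SUBTYPE.SHOT, session, x, y);
}

function encodeResult(session, x, y, result) {
  return Uint8Array.of(GAME_TYPE, GAME_SUBTYPE.RESULT, session, x, y, result);
}

function decodeGameSketch(frame) {
  if (frame[0] !== GAME_TYPE) return null;
  const [, subtype, session, ...fields] = frame;
  switch (subtype) {
    case GAME_SUBTYPE.SHOT:
      return { kind: 'SHOT', session, x: fields[0], y: fields[1] };
    case GAME_SUBTYPE.RESULT:
      return { kind: 'RESULT', session, x: fields[0], y: fields[1], result: RESULT_NAMES[fields[2]] };
    case GAME_SUBTYPE.WIN:
      return { kind: 'WIN', session };
    case GAME_SUBTYPE.SETUP:
      return { kind: 'SETUP', session, hashBytes: fields.slice(0, 4) };
    default:
      return null;
  }
}

console.log(decodeGameSketch(encodeResult(7, 3, 4, 1)));
// { kind: 'RESULT', session: 7, x: 3, y: 4, result: 'hit' }
```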
The numbers
~1 KB/s: effective throughput with compression
64 KB max: file transfer limit per session
3-5x compression: DEFLATE on text and structured data
50% redundancy: fountain code parity for loss tolerance
The science
Every layer uses real signal processing and information theory. Here's what's happening when you hear the chirps.

FSK Modulation

Powered by ggwave. Frequency-Shift Keying maps each byte to a specific audio frequency. The receiver uses FFT (Fast Fourier Transform) to decompose the audio back into frequency components and read the data. Like a musical barcode.

byte → frequency(byte) → FFT → byte
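The byte-to-frequency idea can be tried without any audio hardware. The toy below maps each byte to one of 256 tones and recovers it by measuring the energy at each candidate frequency. It is deliberately simplified: ggwave's real scheme is more involved (several simultaneous tones plus error correction), the 1875 Hz base and 46.875 Hz spacing are illustrative values, and a plain single-bin DFT stands in for a real FFT.

```javascript
// Toy one-tone-per-byte FSK. All constants are illustrative assumptions.
const SAMPLE_RATE = 48000;
const TONE_SAMPLES = 1024;                  // samples per tone
const BIN_HZ = SAMPLE_RATE / TONE_SAMPLES;  // 46.875 Hz between adjacent tones
const BASE_BIN = 40;                        // byte 0 maps to 40 * 46.875 = 1875 Hz

function byteToTone(byte) {
  const freq = (BASE_BIN + byte) * BIN_HZ;
  const samples = new Float32Array(TONE_SAMPLES);
  for (let n = 0; n < TONE_SAMPLES; n++) {
    samples[n] = Math.sin((2 * Math.PI * freq * n) / SAMPLE_RATE);
  }
  return samples;
}

function toneToByte(samples) {
  let bestByte = 0;
  let bestPower = -1;
  for (let byte = 0; byte < 256; byte++) {
    const bin = BASE_BIN + byte;
    let re = 0;
    let im = 0;
    for (let n = 0; n < TONE_SAMPLES; n++) {
      const phase = (2 * Math.PI * bin * n) / TONE_SAMPLES;
      re += samples[n] * Math.cos(phase);
      im -= samples[n] * Math.sin(phase);
    }
    const power = re * re + im * im;        // energy at this candidate frequency
    if (power > bestPower) {
      bestPower = power;
      bestByte = byte;
    }
  }
  return bestByte;
}

console.log(toneToByte(byteToTone(65))); // 65
```

Because every tone completes a whole number of cycles in 1024 samples, the candidates do not leak into each other, which is what makes the strongest-bin decision reliable in this noise-free setting.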

Fountain Codes

Rateless erasure codes that let the receiver reconstruct data from any sufficient subset of encoded blocks. Each parity block XORs 2–3 source blocks. If a block is lost, it can be recovered from the parity without retransmission.

parity = src[i] XOR src[j] XOR src[k]
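Here is a toy version of the XOR idea with a peeling decoder: any block whose sources are all known except one can be "peeled" to reveal the missing one. For brevity each parity block XORs a pair of neighboring source blocks rather than 2-3 arbitrary ones, and the names encodeSketch and decodeSketch are hypothetical; this is not the library's encodeBlocks or FountainDecoder.

```javascript
// Toy XOR erasure code with 50% parity. Parity layout is an assumption.
function xorInto(target, source) {
  for (let i = 0; i < target.length; i++) target[i] ^= source[i];
}

function encodeSketch(data, blockSize) {
  const k = Math.ceil(data.length / blockSize);
  const blocks = [];
  for (let i = 0; i < k; i++) {
    const payload = new Uint8Array(blockSize); // last block is zero-padded
    payload.set(data.subarray(i * blockSize, (i + 1) * blockSize));
    blocks.push({ indices: [i], payload });
  }
  for (let j = 0; j < Math.ceil(k / 2); j++) {
    const indices = [...new Set([(2 * j) % k, (2 * j + 1) % k])];
    const payload = new Uint8Array(blockSize);
    for (const i of indices) xorInto(payload, blocks[i].payload);
    blocks.push({ indices, payload });
  }
  return { k, blocks };
}

function decodeSketch(received, k) {
  const known = new Map();
  const pending = received.map((b) => ({
    indices: [...b.indices],
    payload: Uint8Array.from(b.payload),
  }));
  let progress = true;
  while (known.size < k && progress) {
    progress = false;
    for (const block of pending) {
      // XOR out every source we already know
      block.indices = block.indices.filter((i) => {
        if (!known.has(i)) return true;
        xorInto(block.payload, known.get(i));
        return false;
      });
      if (block.indices.length === 1) {
        known.set(block.indices[0], block.payload); // one unknown left: recovered
        block.indices = [];
        progress = true;
      }
    }
  }
  if (known.size < k) return null;              // not a sufficient subset
  return Array.from({ length: k }, (_, i) => known.get(i));
}

const encoded = encodeSketch(Uint8Array.from({ length: 16 }, (_, i) => i), 4);
const withLoss = encoded.blocks.filter((_, i) => i !== 1); // drop source block 1
console.log(decodeSketch(withLoss, encoded.k)[1]);         // bytes 4, 5, 6, 7 recovered
```

Note that not every set of K blocks is enough: if both sources covered by the same parity block are lost, this toy code cannot recover them, which is why the wording is "any sufficient subset".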

Error Diffusion

Floyd-Steinberg dithering distributes quantization error to neighboring pixels, creating the illusion of more gray levels than actually exist. Your eye averages nearby pixels, perceiving smooth gradients from binary dots.

error = old_pixel - quantized_pixel
neighbors += error * weights
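The two-line formula expands into a short loop. Below is a minimal 1-bit Floyd-Steinberg sketch using the classic 7/16, 3/16, 5/16, 1/16 weights. It assumes luminance values in the 0..1 range and a 0.5 threshold; the input range the library's ditherImage expects is not stated here, and floydSteinberg1Bit is a name made up for this example.

```javascript
// Hypothetical 1-bit Floyd-Steinberg dither. Assumes luma values in 0..1.
function floydSteinberg1Bit(luma, width, height) {
  const work = Float64Array.from(luma);  // working copy that accumulates error
  const out = new Array(width * height);
  const spread = (x, y, amount) => {
    if (x >= 0 && x < width && y < height) work[y * width + x] += amount;
  };
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = y * width + x;
      const oldPixel = work[i];
      const quantized = oldPixel >= 0.5 ? 1 : 0;
      out[i] = quantized;
      const error = oldPixel - quantized;
      spread(x + 1, y, (error * 7) / 16);      // right
      spread(x - 1, y + 1, (error * 3) / 16);  // below left
      spread(x, y + 1, (error * 5) / 16);      // below
      spread(x + 1, y + 1, (error * 1) / 16);  // below right
    }
  }
  return out;
}

// A flat mid-gray patch turns into a roughly half-on dot pattern
const gray = new Float64Array(64).fill(0.5);
const dots = floydSteinberg1Bit(gray, 8, 8);
console.log(dots.slice(0, 8).join('')); // first row alternates: 10101010
```

Error that would fall off the right or bottom edge is simply dropped, which is the usual simplification.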

Median Cut

Color quantization finds the N most representative colors by recursively splitting the color space along its widest axis. Each pixel is then mapped to the nearest palette entry, with row-run compression for efficient encoding.

split(colors, widest_channel) → palette[16]
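A compact sketch of the split step: keep a list of color "boxes", repeatedly split the one with the widest channel range at its median, then average each box into a palette entry. This is a textbook median cut written for illustration, with hypothetical names; the library's quantizeColors may differ in how it picks and splits boxes.

```javascript
// Textbook median cut over [r, g, b] triples. Names are hypothetical.
function medianCut(pixels, maxColors) {
  let boxes = [pixels.slice()];
  while (boxes.length < maxColors) {
    // Find the box and channel with the widest value range
    let best = null;
    for (const box of boxes) {
      if (box.length < 2) continue;
      for (let c = 0; c < 3; c++) {
        let lo = 255;
        let hi = 0;
        for (const p of box) {
          lo = Math.min(lo, p[c]);
          hi = Math.max(hi, p[c]);
        }
        if (!best || hi - lo > best.range) best = { box, channel: c, range: hi - lo };
      }
    }
    if (!best || best.range === 0) break;      // nothing left worth splitting
    const { box, channel } = best;
    const sorted = box.slice().sort((a, b) => a[channel] - b[channel]);
    const mid = sorted.length >> 1;
    boxes = boxes.filter((b) => b !== box);
    boxes.push(sorted.slice(0, mid), sorted.slice(mid));
  }
  // Each palette entry is the average color of its box
  return boxes.map((box) =>
    [0, 1, 2].map((c) => Math.round(box.reduce((sum, p) => sum + p[c], 0) / box.length)));
}

function nearestPaletteIndex(palette, pixel) {
  let bestIndex = 0;
  let bestDist = Infinity;
  palette.forEach((entry, i) => {
    const dist = entry.reduce((sum, v, c) => sum + (v - pixel[c]) ** 2, 0);
    if (dist < bestDist) {
      bestDist = dist;
      bestIndex = i;
    }
  });
  return bestIndex;
}

const image = [[255, 0, 0], [250, 5, 0], [0, 0, 255], [5, 0, 250]];
const palette = medianCut(image, 2);
console.log(palette.length, nearestPaletteIndex(palette, [240, 10, 10]));
```

After mapping, each row becomes a sequence of small palette indices, which is where the row-run encoding gets its long runs.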
Start transmitting
Open the app on two devices. Hit listen on one, send on the other. Watch data travel through the air.