Parabun
A fork of Bun with extra runtime modules: a worker pool with shared typed arrays, raw CUDA and Metal kernels, SIMD primitives, V4L2 / ALSA capture, GGUF LLM inference, and statically-linked image / audio / CSV codecs.
These aren't npm packages — they're built into the runtime, so there's no node-gyp step and no
per-platform binary distribution. Imports look the same as Bun's other built-ins (import gpu from "para:gpu"). Plain .ts / .js files behave the same as upstream Bun.
Targets edge devices and IoT — Linux SBCs like Raspberry Pi 5 and Jetson Orin, NUCs, and anything else running a real OS with capable CPU/GPU. Not microcontrollers (Cortex-M / ESP32 / RP2040): those need a different runtime — JavaScriptCore alone is bigger than an MCU's flash budget. The positioning is "device that's basically a small computer," not "device that's basically a register file."
Runs on Linux and macOS; a Windows build is in progress. parabun self-update refreshes an existing install
along with the VS Code extension.
Installing also adds the VS Code extension to any of code, cursor, or kiro found on
$PATH. The extension provides the .pts / .pjs TextMate grammar and an LSP
with hover, go-to-definition, purity diagnostics, memo hints, and operator documentation.
Module index
Modules grouped by what you'd reach for them to do. Higher-level modules compose the lower-level ones: para:assistant is built on para:audio + para:speech + para:llm; para:llm on para:gpu + para:simd; and so on. The imports are the truth, but you don't need to know the dependency graph to pick a module.
Runtime modules
para:parallel
pmap and preduce chunk arrays across a persistent worker pool. Functions are
serialized via fn.toString(), so they must be pure — no closures, no outer references.
TypedArrays are passed through a SharedArrayBuffer, so postMessage transfers a
handle rather than a copy.
import { pmap } from "para:parallel";
const rows = Array.from({ length: 1_000_000 }, (_, i) => `record-${i}`);
function score(row: string): number {
let h = 0;
for (let i = 0; i < row.length; i++) h = (h * 31 + row.charCodeAt(i)) | 0;
return h * h;
}
const scores = await pmap(score, rows, { concurrency: 8 });
console.log("scored", scores.length, "rows; first =", scores[0]);
// Parabun (.pts) version of the same program
import { pmap } from "para:parallel";
const rows = Array.from({ length: 1_000_000 }, (_, i) => `record-${i}`);
pure function score(row: string): number {
let h = 0;
for (let i = 0; i < row.length; i++) h = (h * 31 + row.charCodeAt(i)) | 0;
return h * h;
}
const scores = await pmap(score, rows, { concurrency: 8 });
console.log("scored", scores.length, "rows; first =", scores[0]);
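The purity requirement follows from how a source-text round trip behaves. The sketch below is plain TypeScript, not Parabun's worker code: `revive` is a hypothetical helper that rebuilds a function from `fn.toString()` output, the way a worker would have to. A closure-free function survives the trip; one that closes over `factor` does not.

```typescript
// Hypothetical helper (not a para:parallel API): rebuild a function from its
// source text, as a worker must do after receiving fn.toString().
function revive<A, R>(source: string): (arg: A) => R {
  // new Function evaluates in the global scope, so nothing from the
  // original lexical scope is visible to the revived copy.
  return new Function("return (" + source + ");")() as (arg: A) => R;
}

// Closure-free: everything the body needs arrives through its parameter.
function square(x: number): number {
  return x * x;
}

// Closes over `factor`, which exists only inside makeScaler's scope.
function makeScaler(factor: number): (x: number) => number {
  return (x: number) => x * factor;
}

const revivedSquare = revive<number, number>(square.toString());
const revivedScale = revive<number, number>(makeScaler(3).toString());

console.log(revivedSquare(7)); // 49

let closureSurvived = true;
try {
  revivedScale(2); // `factor` is not defined in the revived copy
} catch (err) {
  closureSurvived = false;
}
console.log("closure survived:", closureSurvived); // closure survived: false
```

The same reasoning applies to imports and module-level constants: anything the function body names must come in through its arguments.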
para:simd
WebAssembly v128 kernels for Float32Array (f32x4) and Float64Array (f64x2). Inputs
above 4 MiB are processed in place rather than copied into WASM memory. alloc() returns a
typed array backed by the WASM linear memory for zero-copy use.
import { mulScalar, add, dot, sum } from "para:simd";
const a = new Float32Array([1, 2, 3, 4]);
const b = new Float32Array([5, 6, 7, 8]);
const y = mulScalar(a, 3); // [3, 6, 9, 12]
const z = add(a, b); // [6, 8, 10, 12]
const d = dot(a, b); // 70
const s = sum(a); // 10
| op (N=100k, f32) | .map / .reduce | tight loop | para:simd |
|---|---|---|---|
| mulScalar(a, 3) | 808 µs | 60 µs | 30 µs |
| add(a, b) | 884 µs | 73 µs | 40 µs |
| sum(a) | 574 µs | 43 µs | 17 µs |
| dot(a, b) | 716 µs | 51 µs | 24 µs |
para:gpu
Metal on macOS, CUDA on Linux and Windows, CPU fallback on hosts without a GPU. A matrix passed to
gpu.hold() stays resident across matVec calls, so only the input vector crosses the
host↔device boundary per call. Pure Float32Array → Float32Array functions are
runtime-compiled to PTX (via NVRTC) or MSL (via newLibraryWithSource:) when the body fits a
supported shape: arithmetic, ternary, Math.*.
import gpu from "para:gpu";
const M = 1024, K = 768;
const mat = gpu.alloc(M * K, "f32");
for (let i = 0; i < mat.length; i++) mat[i] = Math.random();
const held = gpu.hold(mat); // uploaded once
const queries = [
new Float32Array(K).fill(0.1),
new Float32Array(K).fill(0.2),
];
for (const q of queries) {
const scores = gpu.matVec(held, q, M, K); // no copy
console.log("top score:", Math.max(...scores));
}
gpu.release(held);
Beyond matVec / simdMap, para:gpu ships
conv2D, scan, reduce, argMin / argMax,
histogram, and median / quantile. These run on CPU correctness paths today; the same dispatch surface has
optional CUDA / Metal hooks so device kernels can follow.
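For checking shapes and results on a host without a GPU, a CPU reference of the matVec product is easy to write. The sketch below assumes the matrix is stored row-major in a flat Float32Array (the layout is not stated above, so treat it as an assumption), and `matVecRef` is a hypothetical name rather than part of para:gpu.

```typescript
// CPU reference for y = A * x, with A stored row-major in a flat array.
// Row-major layout is an assumption; matVecRef is not a para:gpu function.
function matVecRef(
  mat: Float32Array,
  vec: Float32Array,
  rows: number,
  cols: number,
): Float32Array {
  if (mat.length !== rows * cols) throw new RangeError("mat must hold rows * cols values");
  if (vec.length !== cols) throw new RangeError("vec must hold cols values");
  const out = new Float32Array(rows);
  for (let r = 0; r < rows; r++) {
    const base = r * cols;
    let acc = 0;
    for (let c = 0; c < cols; c++) acc += mat[base + c] * vec[c];
    out[r] = acc;
  }
  return out;
}

const smallMat = new Float32Array([1, 2, 3, 4, 5, 6]); // 2 rows x 3 cols
const smallVec = new Float32Array([1, 0, -1]);
console.log(Array.from(matVecRef(smallMat, smallVec, 2, 3))); // [-2, -2]
```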
para:arena
A pool of SharedArrayBuffer-backed typed arrays. para:parallel and
para:pipeline draw from it so per-chunk work doesn't allocate a fresh buffer every
time. Entries return to the pool at the end of an arena { } block or a pmap
chunk, instead of waiting for the GC.
para:pipeline
A chain of para:simd calls (mulScalar, add,
relu, …) is collapsed into a single pass at .run() time, so the intermediate arrays
don't get allocated. If the input is large enough that GPU dispatch wins (gpu.winsForSize(...)),
the fused chain runs as a single para:gpu simdMap instead.
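To make the fusion idea concrete, here is a toy version in plain TypeScript. `FusedPipeline` and its method names are hypothetical and only mimic the shape of the idea: each chained call records a per-element step, and `run()` pushes every element through all recorded steps in one loop, so no intermediate array is created.

```typescript
// Toy fused pipeline (hypothetical class, not the para:pipeline API).
type ElementOp = (x: number) => number;

class FusedPipeline {
  private ops: ElementOp[] = [];

  mulScalar(k: number): this {
    this.ops.push((x) => x * k);
    return this;
  }

  addScalar(k: number): this {
    this.ops.push((x) => x + k);
    return this;
  }

  relu(): this {
    this.ops.push((x) => (x > 0 ? x : 0));
    return this;
  }

  // One pass over the input: each element goes through every recorded op
  // before the next element is touched, so only the output is allocated.
  run(input: Float32Array): Float32Array {
    const out = new Float32Array(input.length);
    for (let i = 0; i < input.length; i++) {
      let v = input[i];
      for (let j = 0; j < this.ops.length; j++) v = this.ops[j](v);
      out[i] = v;
    }
    return out;
  }
}

const fused = new FusedPipeline().mulScalar(2).addScalar(-3).relu();
console.log(Array.from(fused.run(new Float32Array([0, 1, 2, 3])))); // [0, 0, 1, 3]
```

An unfused chain of three calls would allocate two throwaway arrays of the input's length; the fused form allocates only the result.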
para:signals
signal() is a reactive cell, derived() derives one from others, and
effect() runs side effects when something it read changes. Reads inside an effect register a
dependency; writes invalidate downstream and a microtask flush re-runs only the effects that observed a value
that actually changed. batch() coalesces multi-write transactions; untrack() reads
inside a reactive context without registering a dep. Pairs with the signal /
effect { } / ~> language extensions.
Many other modules expose their own state as Signals — para:audio's capture stream
(peakLevel, active), para:llm's LLM and
WhisperModel instances (busy, device),
para:speech's listen() stream (active,
noiseFloor, lastUtterance), and para:assistant's
state / history / lastTurn / interrupted. Wire any of them
into UI without polling.
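The dependency-tracking rule (reads register, writes invalidate, unchanged values wake nobody) fits in a few dozen lines. This is a toy model with hypothetical names (`Cell`, `watch`, `computed`), not the para:signals implementation: it flushes synchronously instead of on a microtask and never cleans up stale dependencies.

```typescript
// Toy reactive core. Illustrative only: synchronous flush, no dep cleanup.
type Runner = () => void;

let activeRunner: Runner | null = null;
const pending: Runner[] = [];
let flushing = false;

function flush(): void {
  if (flushing) return; // a flush further up the stack will drain the queue
  flushing = true;
  try {
    while (pending.length > 0) (pending.shift() as Runner)();
  } finally {
    flushing = false;
  }
}

class Cell<T> {
  private value: T;
  private readonly subscribers: Runner[] = [];

  constructor(initial: T) {
    this.value = initial;
  }

  get(): T {
    // A read inside a running effect registers that effect as a dependent.
    if (activeRunner !== null && this.subscribers.indexOf(activeRunner) === -1) {
      this.subscribers.push(activeRunner);
    }
    return this.value;
  }

  set(next: T): void {
    if (Object.is(next, this.value)) return; // unchanged value: nobody re-runs
    this.value = next;
    for (let i = 0; i < this.subscribers.length; i++) {
      if (pending.indexOf(this.subscribers[i]) === -1) pending.push(this.subscribers[i]);
    }
    flush();
  }
}

function watch(fn: () => void): void {
  const runner: Runner = () => {
    const outer = activeRunner;
    activeRunner = runner;
    try {
      fn();
    } finally {
      activeRunner = outer;
    }
  };
  runner();
}

function computed<T>(fn: () => T): Cell<T> {
  const holder: { cell?: Cell<T> } = {};
  watch(() => {
    const value = fn();
    if (holder.cell) holder.cell.set(value);
    else holder.cell = new Cell(value);
  });
  return holder.cell as Cell<T>;
}

const count = new Cell(0);
const doubled = computed(() => count.get() * 2);
const seen: string[] = [];
watch(() => {
  seen.push(count.get() + "," + doubled.get());
});

count.set(1); // the effect re-runs once, after doubled has been refreshed
count.set(1); // same value, so nothing re-runs
console.log(seen); // ["0,0", "1,2"]
```

The pending queue is what keeps the effect from running twice when both `count` and `doubled` change in the same write; a microtask flush generalizes the same idea across separate writes.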
para:rtp
RFC 3550 packet pack / parse and a jitter buffer. Built to sit under
para:audio's Opus encoder for a WebRTC-style send/receive path.
rtp.pack({ payloadType, sequence, timestamp, ssrc, payload }) produces a wire-format packet; the
jitter buffer reorders by sequence number with a configurable depth.
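For reference, the fixed 12-byte header defined in RFC 3550 section 5.1 can be packed and parsed with a DataView. `packRtp` and `parseRtp` are illustrative names, not the para:rtp functions, and the sketch skips padding and header extensions.

```typescript
// Fixed RTP header (RFC 3550, section 5.1): no CSRC list written, no extension.
interface RtpFields {
  payloadType: number; // 7 bits
  sequence: number; // 16 bits, wraps
  timestamp: number; // 32 bits, wraps
  ssrc: number; // 32 bits
  payload: Uint8Array;
  marker?: boolean;
}

function packRtp(f: RtpFields): Uint8Array {
  const out = new Uint8Array(12 + f.payload.length);
  const view = new DataView(out.buffer);
  view.setUint8(0, 2 << 6); // V=2, P=0, X=0, CC=0
  view.setUint8(1, (f.marker ? 0x80 : 0) | (f.payloadType & 0x7f));
  view.setUint16(2, f.sequence & 0xffff); // DataView writes big-endian by default
  view.setUint32(4, f.timestamp >>> 0);
  view.setUint32(8, f.ssrc >>> 0);
  out.set(f.payload, 12);
  return out;
}

function parseRtp(packet: Uint8Array): RtpFields {
  if (packet.length < 12) throw new RangeError("shorter than the fixed RTP header");
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  const b0 = view.getUint8(0);
  if (b0 >> 6 !== 2) throw new Error("not RTP version 2");
  const csrcCount = b0 & 0x0f;
  const b1 = view.getUint8(1);
  return {
    marker: (b1 & 0x80) !== 0,
    payloadType: b1 & 0x7f,
    sequence: view.getUint16(2),
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
    payload: packet.subarray(12 + 4 * csrcCount), // skips any CSRC entries
  };
}

const packet = packRtp({
  payloadType: 111,
  sequence: 65535,
  timestamp: 960,
  ssrc: 0x12345678,
  payload: new Uint8Array([1, 2, 3]),
});
const parsed = parseRtp(packet);
console.log(packet.length, parsed.sequence, parsed.ssrc.toString(16)); // 15 65535 12345678
```

Because the sequence field is 16 bits wide, a jitter buffer has to compare sequence numbers modulo 65536 rather than as plain integers.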
para:image
A Sharp-class image module baked into the runtime — JPEG / PNG / WebP decode and encode (libjpeg-turbo,
libpng, libwebp + libsharpyuv vendored statically), bilinear and Lanczos resize, separable Gaussian blur,
unsharp-mask sharpen, Sobel edge-detect, 90 / 180 / 270 rotate, flip, crop, brightness / contrast / saturation
adjust, threshold, invert, grayscale, per-channel histogram, and Porter-Duff source-over alpha compositing. No
npm install sharp, no Node-ABI-versioned binary distribution.
import image from "para:image";
const bytes = await Bun.file("photo.jpg").bytes();
const img = image.decode(bytes);
const small = image.resize(img, { width: 800, height: 600, kernel: "lanczos" });
const sharp = image.sharpen(small, { amount: 1.5 });
const webp = image.encode(sharp, { format: "webp", quality: 85 });
await Bun.write("photo.webp", webp);
para:audio
A from-scratch audio toolkit: WAV / MP3 decode, Opus encode and decode (libopus 1.6.1), rnnoise-based denoiser, FFT, RBJ Audio EQ Cookbook biquads (lowpass / highpass / bandpass / notch), resample, STFT spectrogram, mel spectrogram (Whisper-mode included for STT pipelines), voice-activity detection, AGC, peak / RMS / windowed envelope, mix, normalize, interleave / deinterleave, and PCM type conversion. Heavy codecs (libopus, minimp3, rnnoise) ship statically.
OS audio I/O is wired on Linux: audio.devices() enumerates ALSA capture and playback devices,
audio.capture({ device, sampleRate, channels }) returns a stream whose
.frames() async iterator yields frames whose .samples field is Float32Array PCM straight from
snd_pcm_readi, and audio.play({ ... }).write(samples) pushes PCM through
snd_pcm_writei. The capture stream exposes reactive peakLevel and
active Signals (the RMS level is rate-limited to 10 Hz), so a level meter is one effect() away.
CoreAudio and WASAPI backends are planned on the same surface as follow-ups.
import audio from "para:audio";
import rtp from "para:rtp";
await using mic = await audio.capture({ sampleRate: 48000, channels: 1 });
const enc = new audio.OpusEncoder({ sampleRate: 48000, channels: 1, application: "voip" });
const den = new audio.Denoiser();
const agc = new audio.Gain({ targetLevel: 0.1 });
const ssrc = 0x12345678;
let sequence = 0, timestamp = 0;
for await (const frame of mic.frames()) {
den.process(frame.samples); // suppress noise (in place)
agc.process(frame.samples); // normalize loudness
const opus = enc.encode(frame.samples);
const packet = rtp.pack({ payloadType: 111, sequence: sequence++, timestamp, ssrc, payload: opus });
// send `packet` over your transport (UDP, WebRTC, …)
timestamp += frame.samples.length;
}
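As background for the biquad filters listed above, this is the lowpass case from the RBJ Audio EQ Cookbook written out in plain TypeScript. The function and field names are illustrative and are not the para:audio API; the coefficients are normalized so that a0 = 1.

```typescript
// RBJ cookbook lowpass, normalized by a0. Names are illustrative only.
interface Biquad {
  b0: number;
  b1: number;
  b2: number;
  a1: number;
  a2: number;
}

function rbjLowpass(sampleRate: number, cutoffHz: number, q: number): Biquad {
  const w0 = (2 * Math.PI * cutoffHz) / sampleRate;
  const cosW0 = Math.cos(w0);
  const alpha = Math.sin(w0) / (2 * q);
  const a0 = 1 + alpha;
  return {
    b0: (1 - cosW0) / 2 / a0,
    b1: (1 - cosW0) / a0,
    b2: (1 - cosW0) / 2 / a0,
    a1: (-2 * cosW0) / a0,
    a2: (1 - alpha) / a0,
  };
}

// Direct Form I:
// y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
function applyBiquad(c: Biquad, input: Float32Array): Float32Array {
  const out = new Float32Array(input.length);
  let x1 = 0, x2 = 0, y1 = 0, y2 = 0;
  for (let n = 0; n < input.length; n++) {
    const x0 = input[n];
    const y0 = c.b0 * x0 + c.b1 * x1 + c.b2 * x2 - c.a1 * y1 - c.a2 * y2;
    out[n] = y0;
    x2 = x1;
    x1 = x0;
    y2 = y1;
    y1 = y0;
  }
  return out;
}

const lp = rbjLowpass(48000, 1000, Math.SQRT1_2);
// A lowpass should pass DC unchanged: H(z = 1) = (b0 + b1 + b2) / (1 + a1 + a2).
const dcGain = (lp.b0 + lp.b1 + lp.b2) / (1 + lp.a1 + lp.a2);
console.log(dcGain.toFixed(6)); // 1.000000
```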
para:csv
Streaming RFC 4180 parser — async generator, full quote and escape handling, configurable delimiter, header
mode that yields records keyed by column name, per-cell type inference (number / boolean / null). An opt-in
parallel: true mode chunks the input across para:parallel's worker
pool when the input has no quoted cells and is large enough.
import csv from "para:csv";
for await (const row of csv.parseCsv(Bun.file("rows.csv"), { header: true })) {
console.log(row.id, row.name, row.score);
}
| fixture | serial (med) | parallel (med) | speedup |
|---|---|---|---|
| 5 MB · 128k rows | 152 ms | 129 ms | 1.18× |
| 50 MB · 1.25M rows | 1446 ms | 1528 ms | 0.95× |
| 200 MB · 4.92M rows | 5892 ms | 6363 ms | 0.93× |
parallel: true is not a per-file speedup. The serial state machine is already
memory-bandwidth-bound, and the parallel path's materialize-and-fork overhead grows with input size, so it
helps a little on small files (1.18× at 5 MB), is already slightly slower at 50 MB (0.95×), and gets worse from there. Use it to keep the event
loop responsive while parsing (parsing N files concurrently does scale across cores), not because you expect
bigger files to go faster. bench/parabun-csv-parallel/ reproduces these numbers.
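The quote handling that RFC 4180 requires is a small state machine. The sketch below splits a single record in plain TypeScript; `splitCsvRecord` is a hypothetical name, and a streaming parser would additionally have to carry the `inQuotes` state across chunk boundaries.

```typescript
// Split one CSV record into fields with RFC 4180 quoting rules:
// a quoted field may contain the delimiter, and "" inside quotes is a literal quote.
function splitCsvRecord(line: string, delimiter: string = ","): string[] {
  const fields: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"') {
        if (line[i + 1] === '"') {
          field += '"'; // escaped quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === delimiter) {
      fields.push(field);
      field = "";
    } else {
      field += ch;
    }
  }
  fields.push(field);
  return fields;
}

console.log(splitCsvRecord('1,"Doe, Jane","said ""hi"""'));
// [ '1', 'Doe, Jane', 'said "hi"' ]
```

Whether a given comma or newline is a separator depends on every quote that came before it, which is why splitting a file at arbitrary byte offsets is only safe when no cells are quoted.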
para:camera
V4L2 capture on Linux. camera.devices() reads /sys/class/video4linux/ and runs
VIDIOC_QUERYCAP on each to filter to actual capture devices.
camera.formats(path) enumerates the supported (format, width, height, fps) tuples.
camera.open(...) mmaps the kernel ring buffer and starts streaming, and
cam.frames() is an async iterator of frames. AVFoundation (macOS) and Media Foundation (Windows)
backends are planned on the same JS surface.
para:video
Scaffold only — the JS surface is in place (video.probe, video.decode,
video.encode, video.decodeAll, with codec / container / acceleration options) but
the native side hasn't been wired yet. The plan is libavcodec on desktop, V4L2 M2M on Pi 5, NVDEC/NVENC on
Jetson, all behind the same JS API.
para:llm
An in-tree native inference stack covering three model classes: Llama / Qwen2 chat + completion
(LLM), BERT-family sentence embedders (Encoder), and Whisper STT
(WhisperModel). Weights mmap off disk; residual stream and KV cache live on-device.
Per-token traffic across PCIe is a 4-byte argmax. Q4_K and Q6_K matVec kernels use a 1-warp-per-row,
4-warps-per-block layout; QKV and Gate+Up projections are byte-concatenated at load time and dispatched as one
matVec per layer.
import llm from "para:llm";
using m = await llm.LLM.load("./Llama-3.2-1B-Instruct-Q4_K_M.gguf");
for await (const piece of m.chat([
{ role: "system", content: "You are helpful and concise." },
{ role: "user", content: "What is the capital of France?" },
])) {
process.stdout.write(piece);
}
| Llama-3.2-1B Q4_K_M · RTX 4070 Ti | parabun | ollama |
|---|---|---|
| greedy decode (device-only) | 340 tok/s | ~350 tok/s |
| greedy decode (logits DtoH) | 275 tok/s | — |
| prompt prefill | 295 tok/s | — |
Numbers are within run-to-run noise of ollama on this model and hardware. Chat templates for Llama-3, ChatML,
and Mistral-Instruct are detected from the GGUF's tokenizer.chat_template. Only the CUDA backend
is wired in this module today; Metal kernels are pending.
llm.serve({ engine, modelId, port }) exposes any model (or anything else implementing
.chat() / .generate() / .embed()) over an OpenAI-compatible HTTP API.
Routes: GET /v1/models, POST /v1/chat/completions (sync and SSE streaming),
POST /v1/completions, POST /v1/embeddings. Optional bearer auth and a FIFO
concurrency gate (default 1). Default port is 11434, matching ollama's, so OpenAI clients that auto-discover a
local ollama work unchanged.
WhisperModel loads whisper.cpp ggml-*.bin files (F32 / F16 / Q4_0 / Q5_0 / Q5_1 /
Q8_0) and runs encoder-decoder STT — KV cache, chunked long-audio, beam search, language detection across all
99 Whisper languages. CUDA-accelerated end-to-end (encoder im2col conv + matmuls + per-head batched attention;
decoder per-token matVecs + LM head). On an RTX 4070 Ti, an 11 s JFK clip transcribes in 1.6 s with
tiny.en — about 6.9× real-time.
Both LLM and WhisperModel instances expose reactive
para:signals Signals: m.busy (refcounted, flips while a
chat / generate / embed / transcribe call is in flight)
and m.device ("cuda" | "metal" | "cpu", stable for the life of the instance). Wire a
busy spinner or a backend badge with a one-liner effect().
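From the client side, the POST /v1/chat/completions route takes the usual OpenAI-style JSON body. The sketch below only builds the request and reads the reply field, so it runs without a server; the base URL uses the default port mentioned above, and `my-model` is a placeholder model id rather than a real one.

```typescript
// Request builder for an OpenAI-compatible chat endpoint.
// Helper names and the model id are placeholders, not part of para:llm.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string,
): HttpRequest {
  const headers: { [name: string]: string } = { "content-type": "application/json" };
  if (apiKey) headers["authorization"] = "Bearer " + apiKey; // optional bearer auth
  return {
    url: baseUrl.replace(/\/+$/, "") + "/v1/chat/completions",
    method: "POST",
    headers,
    body: JSON.stringify({ model, messages, stream: false }),
  };
}

// A non-streaming response carries the reply at choices[0].message.content.
function extractReply(json: { choices: { message: { content: string } }[] }): string {
  return json.choices[0].message.content;
}

const chatReq = buildChatRequest("http://localhost:11434/", "my-model", [
  { role: "user", content: "What is the capital of France?" },
]);
console.log(chatReq.url); // http://localhost:11434/v1/chat/completions
```

Against a running server, the request object can be passed to `fetch(chatReq.url, chatReq)` and the parsed JSON handed to `extractReply`.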
para:vision
vision.frames(stream, { decodeMjpg? }) takes a frame iterator from
para:camera (or any source yielding the same shape) and yields packed-RGBA8 frames.
yuyv, nv12, and rgb24 are converted inline; mjpeg requires the caller to pass image.decode from
para:image as decodeMjpg (cross-builtin imports between para: modules aren't
supported, so dependencies are passed in at the call site). vision.detectMotion adds a
downsampled-luma frame-diff estimator with temporal smoothing.
vision.detect (YOLO / SSD / RT-DETR) and vision.recognize (Tesseract / EasyOCR) are
typed but throw — both need an ONNX runtime vendored before they can do anything. The interfaces are there so
callers can write against them now and have them work later.
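A frame-diff motion estimator of the kind described for vision.detectMotion can be sketched in a few lines. Everything specific here is an assumption: the BT.601 luma weights, the subsampling step, the smoothing factor, and the names `frameDiff` and `MotionEstimator`. It is not the para:vision implementation.

```typescript
// Luma of the RGBA8 pixel starting at byte offset i (BT.601 weights, assumed).
function luma(rgba: Uint8Array, i: number): number {
  return 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
}

// Mean absolute luma difference over a subsampled grid, normalized to 0..1.
function frameDiff(
  prev: Uint8Array,
  curr: Uint8Array,
  width: number,
  height: number,
  step: number,
): number {
  let total = 0;
  let samples = 0;
  for (let y = 0; y < height; y += step) {
    for (let x = 0; x < width; x += step) {
      const i = (y * width + x) * 4;
      total += Math.abs(luma(curr, i) - luma(prev, i));
      samples++;
    }
  }
  return samples === 0 ? 0 : total / samples / 255;
}

// Temporal smoothing with an exponential moving average.
class MotionEstimator {
  private smoothed = 0;
  private readonly alpha: number;

  constructor(alpha: number) {
    this.alpha = alpha;
  }

  update(rawScore: number): number {
    this.smoothed = this.alpha * rawScore + (1 - this.alpha) * this.smoothed;
    return this.smoothed;
  }
}

const blackFrame = new Uint8Array(4 * 4 * 4); // 4x4 RGBA, all zero
const whiteFrame = new Uint8Array(4 * 4 * 4).fill(255);
const estimator = new MotionEstimator(0.5);
console.log(estimator.update(frameDiff(blackFrame, blackFrame, 4, 4, 2))); // 0
console.log(estimator.update(frameDiff(blackFrame, whiteFrame, 4, 4, 2)));
```

Smoothing trades latency for stability: a single noisy frame moves the score only part of the way toward its raw value.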
para:speech
speech.listen(stream, { sampleRate }) takes an audio chunk iterator (para:audio's capture stream, a file reader, anything yielding { samples }) and yields one utterance per
detected speech burst. The classifier is RMS-against-an-adaptive-noise-floor, with pre-roll to catch word
onsets, hangover to seal on silence, and a minimum-length filter to drop clicks and breath sounds.
speech.transcribe(utt, { engine: "whisper", model }) dispatches to the
WhisperModel in para:llm, with a per-process model cache so the
weights aren't reloaded between calls. speech.speak(text, { engine: "piper", model }) drives the
Piper voice synthesizer (a subprocess in v1; a libpiper FFI backend is tracked for v2) and returns f32 mono PCM at the voice's
native sample rate, ready to hand straight to audio.play().write(). The
listen stream also exposes reactive active / noiseFloor /
lastUtterance signals.
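The RMS-against-noise-floor classifier with hangover can be sketched as follows. The thresholds and adaptation rate are made-up values, `EnergyVad` is a hypothetical name, and pre-roll and the minimum-length filter are left out to keep the sketch short. It is a simplified model, not the speech.listen() code.

```typescript
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) sumSquares += samples[i] * samples[i];
  return Math.sqrt(sumSquares / samples.length);
}

class EnergyVad {
  private noiseFloor = 0.01; // assumed starting floor
  private quietRun = 0;
  private speaking = false;
  private readonly ratio: number;
  private readonly hangover: number;

  // ratio: how far above the floor a chunk must be to count as speech.
  // hangover: how many quiet chunks to tolerate before sealing the utterance.
  constructor(ratio: number, hangover: number) {
    this.ratio = ratio;
    this.hangover = hangover;
  }

  // Returns true while the chunk is considered part of an utterance.
  push(chunk: Float32Array): boolean {
    const level = rms(chunk);
    if (level > this.noiseFloor * this.ratio) {
      this.speaking = true;
      this.quietRun = 0;
    } else {
      // Adapt the floor on quiet chunks only, so speech does not raise it.
      this.noiseFloor = Math.max(1e-4, 0.95 * this.noiseFloor + 0.05 * level);
      this.quietRun++;
      if (this.speaking && this.quietRun > this.hangover) this.speaking = false;
    }
    return this.speaking;
  }
}

const vad = new EnergyVad(3, 1);
const quietChunk = new Float32Array(160).fill(0.005);
const loudChunk = new Float32Array(160).fill(0.5);
const vadFlags = [quietChunk, loudChunk, quietChunk, quietChunk].map((c) => vad.push(c));
console.log(vadFlags); // [false, true, true, false]
```

The third chunk is quiet but still reported as speech: that is the hangover keeping short pauses inside one utterance.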
para:assistant
The 3-line case. Composes para:audio (mic + speaker),
para:speech (VAD + STT + TTS), and para:llm (Llama /
Qwen2 inference) into a complete on-device voice loop:
await using bot = await assistant.create({ llm, stt, tts, system });, then
await bot.run(). Mic captures, VAD gates, Whisper transcribes, the LLM generates, Piper
synthesizes, ALSA plays — fully local, no cloud round-trip. bot.turns() exposes the loop as an
async iterator; bot.ask(text) skips STT for text-only turns; bot.say(text) pushes a
proactive utterance.
Reactive surface: bot.state ("idle" | "listening" | "thinking" | "speaking"),
bot.history, bot.lastTurn, and bot.interrupted are all
para:signals Signals — wire them straight into UI without polling. Persistent
memory is one option away: pass memory: "/path/to/memory.sqlite" and the conversation transcript
replays into history on every create. Power users keep their seat — bot.llm exposes
the underlying model, so anything reachable directly via para:llm /
para:speech / para:audio is reachable through
bot too.
para:arrow
In-memory columnar tables. arrow.recordBatch({ ... }) takes a map of typed arrays (or plain
arrays — types are inferred) and returns a RecordBatch with a Schema. Columns are
typed-array views with optional validity bitmaps; arrow.table([...]) concatenates batches across
one schema. Computes: sum, mean, min, max,
count, variance, stddev, quantile, median,
distinct, filter, groupBy. fromRows /
toRows bridge between row-shaped JS data (e.g. para:csv output) and the columnar
form; concat materializes a table-wide column into one typed array.
arrow.toIPC(table) and arrow.fromIPC(bytes) handle both the Arrow IPC
streaming format (continuation-prefixed Schema + RecordBatch messages) and the
file format (ARROW1-bracketed messages + a Footer flatbuffer with random-access
Block offsets). fromIPC auto-detects via the head/tail magic bytes. Pass "file" as
the second arg to toIPC to write the file format; default is streaming.
Seven logical types round-trip: int32, int64, float32,
float64, bool (bit-packed on the wire), utf8 (offsets + bytes), and
list<T> (offsets + recursive child column — including
list<list<T>>). On read, narrow ints (int8 / int16 / uint8 / uint16) are widened to
int32, uint32 widens to int64; Date / Time / Timestamp coerce to int / int64. DictionaryBatch decode handles
apache-arrow's default Dictionary<Utf8>. The FlatBuffers builder/reader is hand-rolled — no
npm dep.
Wire compat verified against apache-arrow@21.1.0 in
bench/parabun-arrow-ipc-interop/ — six round-trip directions (streaming + file ×
Parabun↔apache-arrow, plus Parabun→apache-arrow → Parabun for List). The bytes Parabun produces are the same
wire format pyarrow, arrow-rs, nanoarrow, polars, and duckdb consume on both the streaming and file paths.
arrow.fromParquet(bytes) and arrow.toParquet(table, opts?) read and write Apache
Parquet files. Hand-rolled Thrift compact-protocol codec, Snappy compressor + decompressor, dictionary + RLE +
bit-pack hybrid decoders, RLE writer for definition levels — no npm dep. Covers BOOLEAN /
INT32 / INT64 / FLOAT / DOUBLE /
BYTE_ARRAY physical types under UNCOMPRESSED, SNAPPY, and
GZIP. Verified end-to-end against pyarrow in both directions on 10,000-row multi-row-group
fixtures with scattered nulls; null counts match exactly across all three codecs.
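The "typed-array view plus optional validity bitmap" column shape is easy to model. The sketch uses the Arrow bit order (least-significant bit first, a set bit means the row is non-null); the interface and function names are illustrative rather than the para:arrow API.

```typescript
interface NullableColumn {
  values: Float64Array;
  validity?: Uint8Array; // bit i set means row i is non-null
}

// Pack one flag per row into bytes, least-significant bit first.
function packValidity(flags: boolean[]): Uint8Array {
  const bits = new Uint8Array(Math.ceil(flags.length / 8));
  for (let i = 0; i < flags.length; i++) {
    if (flags[i]) bits[i >> 3] |= 1 << (i & 7);
  }
  return bits;
}

function isValid(col: NullableColumn, row: number): boolean {
  if (col.validity === undefined) return true; // no bitmap means no nulls
  return (col.validity[row >> 3] & (1 << (row & 7))) !== 0;
}

// Aggregate that skips null rows, the way sum / count have to.
function sumValid(col: NullableColumn): { sum: number; count: number } {
  let sum = 0;
  let count = 0;
  for (let row = 0; row < col.values.length; row++) {
    if (isValid(col, row)) {
      sum += col.values[row];
      count++;
    }
  }
  return { sum, count };
}

const bitmap = packValidity([true, false, true, true]);
const scoreCol: NullableColumn = {
  values: new Float64Array([1.5, 0, 2.5, 4]),
  validity: bitmap,
};
console.log(bitmap[0]); // 13 (0b00001101)
console.log(sumValid(scoreCol)); // { sum: 8, count: 3 }
```

The value stored in a null slot (0 here) is arbitrary; only the bitmap decides whether it participates.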
Example: LangChain VectorStore
ParabunVectorStore extends VectorStore from @langchain/core and
implements the addVectors and similaritySearchVectorWithScore methods, so call sites
that accept any VectorStore work against it without changes. The shared setup below feeds both
snippets:
import { OpenAIEmbeddings } from "@langchain/openai";
const emb = new OpenAIEmbeddings({ modelName: "text-embedding-3-small" });
const docs = ["hello world", "good morning", "see you later"];
const vectors = await emb.embedDocuments(docs);
const q = await emb.embedQuery("greetings");
import { MemoryVectorStore }
from "langchain/vectorstores/memory";
const store = new MemoryVectorStore(emb);
await store.addVectors(vectors, docs);
const hits = await store
.similaritySearchVectorWithScore(q, 10);
import { ParabunVectorStore }
from "./parabun-store.pjs";
const store = new ParabunVectorStore(emb);
await store.addVectors(vectors, docs);
const hits = await store
.similaritySearchVectorWithScore(q, 10);
| 100k × 384 f32, top-10 | add_ms | score_ms | vs LangChain |
|---|---|---|---|
| LangChain MemoryVectorStore | 4.0 | 48.2 | 1.00× |
| ParabunVectorStore | 82.7 | 15.9 | 2.83× |
add_ms is higher because rows are packed into a single SharedArrayBuffer-backed Float32Array and normalized
in place: one-time O(N·D) work amortized across subsequent queries. Top-K indices and scores match
LangChain's to four decimal places.
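The pack-then-normalize trade-off can be shown without any runtime support. The class below is a plain-TypeScript sketch of the idea, not ParabunVectorStore itself: rows are copied into one flat Float32Array and scaled to unit length when added, so each query reduces to one dot product per row.

```typescript
interface Hit {
  index: number;
  score: number;
}

class PackedVectors {
  private readonly data: Float32Array;
  private readonly dim: number;
  private readonly count: number;

  // One-time O(N*D) work: copy every row into a flat buffer at unit length.
  constructor(rows: number[][]) {
    this.count = rows.length;
    this.dim = rows.length > 0 ? rows[0].length : 0;
    this.data = new Float32Array(this.count * this.dim);
    for (let r = 0; r < this.count; r++) {
      let sumSquares = 0;
      for (let c = 0; c < this.dim; c++) sumSquares += rows[r][c] * rows[r][c];
      const norm = Math.sqrt(sumSquares) || 1; // all-zero rows stay zero
      for (let c = 0; c < this.dim; c++) this.data[r * this.dim + c] = rows[r][c] / norm;
    }
  }

  // Cosine similarity is a dot product once rows are unit length;
  // only the query norm is computed per call.
  topK(query: number[], k: number): Hit[] {
    let qSquares = 0;
    for (let c = 0; c < this.dim; c++) qSquares += query[c] * query[c];
    const qNorm = Math.sqrt(qSquares) || 1;
    const hits: Hit[] = [];
    for (let r = 0; r < this.count; r++) {
      let dot = 0;
      for (let c = 0; c < this.dim; c++) dot += this.data[r * this.dim + c] * query[c];
      hits.push({ index: r, score: dot / qNorm });
    }
    hits.sort((a, b) => b.score - a.score);
    return hits.slice(0, k);
  }
}

const packed = new PackedVectors([[1, 0], [0, 2], [3, 3]]);
const topHits = packed.topK([2, 1], 2);
console.log(topHits.map((h) => h.index)); // [2, 0]
```

A full sort is fine at this size; for large N a bounded heap of size k would avoid sorting all scores.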
Composition examples
Where signals and module composition pay off: wiring multiple para:* modules into one program
without postMessage, child processes, or N-API bindings.
Voice → LLM → tool dispatch
Mic captures, Whisper transcribes, the LLM picks a tool under a JSON schema (decoding is grammar-constrained, so the
output always parses against the schema and needs no parse retries), the runtime dispatches it, Piper speaks the reply. The dispatch table is plain JS
below; para:mcp (now shipped, with stdio and WebSocket transports) lets you swap the table
for a Model Context Protocol client without changing this control flow. effect() over
bot.state drives a status line here (the hand-rolled version under "What's underneath" reads mic.peakLevel / chat.busy / wsp.busy instead): no polling,
no observers, no event emitters.
import assistant from "para:assistant";
import { effect } from "para:signals";
await using bot = await assistant.create({
llm: "./Llama-3.2-1B-Instruct-Q4_K_M.gguf",
stt: "./ggml-tiny.en.bin",
tts: "./en_US-lessac-medium.onnx",
system: "You are a helpful home assistant. Keep replies short.",
tools: {
setLight: ({ room, on, brightness }) => console.log(`light ${room} ${on ? "on" : "off"} @ ${brightness}`),
playMusic: ({ track }) => console.log(`play ${track}`),
},
});
// One reactive line redraws a live status line in place.
effect(() => process.stdout.write(`\r${bot.state.get()}`));
await bot.run(); // VAD → STT → LLM (with grammar-constrained tool calls) → TTS → speaker
// Parabun (.pts) version of the same program
import assistant from "para:assistant";
await using bot = await assistant.create({
llm: "./Llama-3.2-1B-Instruct-Q4_K_M.gguf",
stt: "./ggml-tiny.en.bin",
tts: "./en_US-lessac-medium.onnx",
system: "You are a helpful home assistant. Keep replies short.",
tools: {
setLight: ({ room, on, brightness }) => console.log(`light ${room} ${on ? "on" : "off"} @ ${brightness}`),
playMusic: ({ track }) => console.log(`play ${track}`),
},
});
// One reactive line redraws a live status line in place.
`\r${bot.state.get()}` -> process.stdout.write;
await bot.run(); // VAD → STT → LLM (with grammar-constrained tool calls) → TTS → speaker
para:assistant composes para:audio +
para:speech + para:llm +
para:signals internally. bot.state is a Signal that cycles through
"idle" | "listening" | "thinking" | "speaking"; the tools dispatch table works the same as a
para:mcp client. What's underneath shows the same
loop hand-rolled: same five modules, no facade.
Webcam motion → reactive assistant
Module composition: para:camera + para:vision +
para:assistant over para:signals.
vision.detectMotion emits frame-by-frame scores; the onRising() predicate auto-tracks
every signal it reads, so the greeting fires once on each false→true edge — no derived() wrapper,
no state machine, no debounce timer, no wasPresent flag. The Parabun tab collapses each
await using x = await resource.open(…) to await using x ..= resource.open(…) — same
semantics, less doubled-await.
import camera from "para:camera";
import vision from "para:vision";
import assistant from "para:assistant";
import { onRising } from "para:signals";
await using cam = await camera.open("/dev/video0", { format: "yuyv", width: 640, height: 480 });
await using bot = await assistant.create({
llm: "/models/Llama-3.2-1B-Instruct-Q4_K_M.gguf",
tts: "/models/en_US-lessac-medium.onnx",
system: "You are a friendly home assistant. Keep replies short.",
});
// motion.score / motion.detected auto-fill as frames flow.
const motion = vision.detectMotion(vision.frames(cam.frames()), { sensitivity: 0.04 }).run();
// Greet whenever motion appears AND the bot is idle. The predicate is
// auto-tracked — every signal it reads becomes a dep, no derived() wrapper.
onRising(
() => motion.detected.get() && bot.state.get() === "idle",
() => bot.say("Welcome back!"),
);
// Parabun (.pts) version of the same program
import camera from "para:camera";
import vision from "para:vision";
import assistant from "para:assistant";
import { onRising } from "para:signals";
await using cam ..= camera.open("/dev/video0", { format: "yuyv", width: 640, height: 480 });
await using bot ..= assistant.create({
llm: "/models/Llama-3.2-1B-Instruct-Q4_K_M.gguf",
tts: "/models/en_US-lessac-medium.onnx",
system: "You are a friendly home assistant. Keep replies short.",
});
const motion = vision.detectMotion(vision.frames(cam.frames()), { sensitivity: 0.04 }).run();
onRising(
() => motion.detected.get() && bot.state.get() === "idle",
() => bot.say("Welcome back!"),
);
Four modules, three signals, zero glue code. Barge-in is built into
para:assistant now (rising edge on vad.active drops the queued TTS via
spk.stop() and stamps turn.interrupted); programmatic cancel is
bot.interrupt().
What's underneath
para:assistant isn't magic — it stitches the same five modules a user could call directly. The
version below is the loop the home-assistant facade performs for you. Useful when you want a non-default flow
(custom VAD threshold, separate transcribe + chat sessions, your own JSON dispatch). Routine smart-home or IoT
cases should still pick the facade; this one's longer.
import audio from "para:audio";
import speech from "para:speech";
import llm from "para:llm";
import { effect } from "para:signals";
const tools: Record<string, (args: any) => any> = {
setLight: ({ room, on, brightness }) => console.log(`light ${room} ${on ? "on" : "off"} @ ${brightness}`),
playMusic: ({ track }) => console.log(`play ${track}`),
reply: ({ text }) => text,
};
const ToolSchema = { /* JSON schema with oneOf for each tool */ };
await using mic = await audio.capture({ sampleRate: 16000, channels: 1 });
using wsp = await llm.WhisperModel.load("./ggml-tiny.en.bin");
using chat = await llm.LLM.load("./Llama-3.2-1B-Instruct-Q4_K_M.gguf");
effect(() => {
process.stdout.write(`\rmic ${mic.peakLevel.get().toFixed(3)} llm ${chat.busy.get() ? "🤔" : "✅"} whisper ${wsp.busy.get() ? "🎙️" : "✅"}`);
});
for await (const utt of speech.listen(mic.frames(), { sampleRate: 16000 })) {
const heard = await wsp.transcribe(utt.samples, { language: "en" });
const { tool, args } = await chat.chatJSON([{ role: "user", content: heard }], { schema: ToolSchema, maxTokens: 80 });
const result = await tools[tool](args);
if (typeof result === "string") {
await speech.say(result, { engine: "piper", model: "./en_US-lessac-medium.onnx" });
}
}
// Parabun (.pts) version of the same program
import audio from "para:audio";
import speech from "para:speech";
import llm from "para:llm";
const tools: Record<string, (args: any) => any> = {
setLight: ({ room, on, brightness }) => console.log(`light ${room} ${on ? "on" : "off"} @ ${brightness}`),
playMusic: ({ track }) => console.log(`play ${track}`),
reply: ({ text }) => text,
};
const ToolSchema = { /* JSON schema with oneOf for each tool */ };
await using mic = await audio.capture({ sampleRate: 16000, channels: 1 });
using wsp = await llm.WhisperModel.load("./ggml-tiny.en.bin");
using chat = await llm.LLM.load("./Llama-3.2-1B-Instruct-Q4_K_M.gguf");
`\rmic ${mic.peakLevel.get().toFixed(3)} llm ${chat.busy.get() ? "🤔" : "✅"} whisper ${wsp.busy.get() ? "🎙️" : "✅"}`
-> process.stdout.write;
for await (const utt of speech.listen(mic.frames(), { sampleRate: 16000 })) {
const heard = await wsp.transcribe(utt.samples, { language: "en" });
const { tool, args } = await chat.chatJSON([{ role: "user", content: heard }], { schema: ToolSchema, maxTokens: 80 });
const result = await tools[tool](args);
if (typeof result === "string") {
await speech.say(result, { engine: "piper", model: "./en_US-lessac-medium.onnx" });
}
}
chat.chatJSON({ schema }) drains the streamed grammar-constrained chat and parses the result in one
call; speech.say(text) wraps speak() + audio.play() +
spk.write() with a process-wide cached PlaybackStream keyed on (sampleRate, channels). Both
ergonomic shortcuts shipped alongside the assistant facade.
Language extensions — .pts / .pjs
Files ending in .pts, .ptsx, .pjs, or .pjsx are parsed with
additional desugarings. All output is standard JS; no runtime support is required, and the runtime modules above
do not depend on any of this syntax. GitHub's TextMate grammars do not cover .pts; the
VS Code / Cursor / Kiro extension
provides the grammar and an LSP.
pure and memo
A pure function is rejected at parse time if it mutates an outer variable, reads this,
or calls a known-impure global. Prefix pure with memo — or drop
pure entirely and write memo as the declarator — and the result is cached by argument
identity: 0-arg singleton, 1-arg Map, multi-arg nested Map chain. Recursive
self-references route through the outer wrapper, so fib below runs the body 21 times for
fib(20), not 21,891.
// Hand-rolled memoization. `memo` is sugar over this Map-keyed pattern.
const fibCache = new Map<number, number>();
function fib(n: number): number {
const hit = fibCache.get(n);
if (hit !== undefined) return hit;
const v = n < 2 ? n : fib(n - 1) + fib(n - 2);
fibCache.set(n, v);
return v;
}
// Single-arg form — same Map-keyed cache, simpler API.
const normCache = new Map<string, string>();
const normalize = (s: string) => {
const hit = normCache.get(s);
if (hit !== undefined) return hit;
const v = s.trim().toLowerCase();
normCache.set(s, v);
return v;
};
// Async-dedupe: cache the in-flight Promise, evict on reject.
const db = { users: { get: async (id: string) => ({ id, name: "alice" }) } };
const profileCache = new Map<string, Promise<any>>();
async function fetchProfile(id: string) {
const hit = profileCache.get(id);
if (hit) return hit;
const p = db.users.get(id);
profileCache.set(id, p);
p.catch(() => profileCache.delete(id));
return p;
}
// Parabun (.pts) equivalents of the three functions above
// declarator form — `memo` implies pure + function
memo fib(n: number): number {
return n < 2 ? n : fib(n - 1) + fib(n - 2);
}
// arrow form — same thing as an expression prefix
const normalize = memo (s: string) => s.trim().toLowerCase();
// async dedupes concurrent in-flight calls, evicts on reject
const db = { users: { get: async (id: string) => ({ id, name: "alice" }) } };
memo async fetchProfile(id: string) { return await db.users.get(id); }
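The hand-rolled examples cover the one-argument case. For the multi-argument "nested Map chain", a two-argument version might look like the sketch below. `memo2` is a hypothetical helper that shows the shape of the cache; it is not the code the desugaring emits.

```typescript
// Two-argument memoization: the outer Map is keyed by the first argument,
// the inner Map by the second. Map keys compare by identity (SameValueZero).
function memo2<A, B, R>(fn: (a: A, b: B) => R): (a: A, b: B) => R {
  const outer = new Map<A, Map<B, R>>();
  return (a: A, b: B): R => {
    let inner = outer.get(a);
    if (inner === undefined) {
      inner = new Map<B, R>();
      outer.set(a, inner);
    }
    // has() rather than get() so a cached `undefined` result still counts as a hit.
    if (inner.has(b)) return inner.get(b) as R;
    const value = fn(a, b);
    inner.set(b, value);
    return value;
  };
}

let bodyRuns = 0;
const area = memo2((w: number, h: number): number => {
  bodyRuns++;
  return w * h;
});

area(2, 3);
area(2, 3); // served from the inner Map
area(2, 4); // same outer key, new inner key
console.log(bodyRuns); // 2
```

Because keys compare by identity, two structurally equal object arguments are different cache entries; that matches the "cached by argument identity" wording above.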
signal, effect, ~>, ->
signal NAME = <rhs> desugars to a Signal binding; bare reads rewrite to
.get(), assignments to .set(). If the RHS references another in-scope signal, the
binding auto-promotes to a read-only derived(). effect { ... } tracks every signal it
reads as a dependency and re-runs on change. A ~> B is reactive binding — it desugars to
effect(() => { B = A; }), so B stays in step with A and whatever
signals A reads from. A -> fn is the function-sink complement — it desugars to
effect(() => { fn(A); }) and replaces the most common
effect { someFn(template) } boilerplate.
import { signal, derived, effect } from "para:signals";
const count = signal(0);
const doubled = derived(() => count.get() * 2);
effect(() => { console.log(count.get(), doubled.get()); });
count.update(n => n + 1); // effect re-runs: 1, 2
// assignment sink — bind into a DOM-ish lvalue
const el = { innerHTML: "" }; // pretend DOM element
effect(() => { el.innerHTML = String(count.get()); });
// function/method sink — bind into a writer
effect(() => { console.log(count.get()); });
// Parabun (.pts) version of the same program
signal count = 0;
signal doubled = count * 2; // auto-derived
effect { console.log(count, doubled); }
count++; // effect re-runs: 1, 2
// assignment sink — bind into a DOM-ish lvalue
const el = { innerHTML: "" }; // pretend DOM element
count ~> el.innerHTML;
// function/method sink — bind into a writer
count -> console.log;
|>, ..!, ..&, ..=
x |> f is f(x). pure functions passed through |> get
inlined at parse time — no call overhead. ..! / ..& are .catch /
.finally in suffix position. ..= is = await in a declaration and
disambiguates to an inclusive-range literal otherwise (0..5 excludes 5, 0..=5 includes
it).
function sq(x: number) { return x * x; }
const result = sq(sq(5)); // 625
console.log(result);
const json = await fetch("https://api.github.com").then(r => r.json())
.catch(err => console.error(err))
.finally(() => console.log("done"));
for (let i = 0; i <= 9; i++) console.log(i); // 0,1,2,…,9
// Parabun (.pts) version of the same program
pure function sq(x: number) { return x * x; }
const result = 5 |> sq |> sq; // 625 — both calls inlined
console.log(result);
const json ..= fetch("https://api.github.com").then(r => r.json())
..! err => console.error(err) // .catch
..& () => console.log("done"); // .finally
for (const i of 0..=9) console.log(i); // 0,1,2,…,9
defer and arena
defer EXPR schedules EXPR to run when the enclosing block exits (return, throw,
fall-through). Multiple defers dispose in LIFO order. defer await EXPR inside an async function
awaits the cleanup. arena { ... } runs the block with the GC paused, then frees everything
allocated inside on exit — useful for tight numeric loops with short-lived intermediate allocations.
import * as fs from "node:fs";
import arena from "para:arena";
function readConfig(path: string) {
const fd = fs.openSync(path, "r");
try {
return JSON.parse(fs.readFileSync(fd, "utf8"));
} finally {
fs.closeSync(fd); // runs on every exit path
}
}
arena.scope(() => {
const buf = new Float32Array(1_000_000);
for (let i = 0; i < buf.length; i++) buf[i] = Math.sin(i * 0.001);
console.log("sum:", buf.reduce((a, b) => a + b, 0));
}); // buf freed here, no GC pressure
import * as fs from "node:fs";
function readConfig(path: string) {
const fd = fs.openSync(path, "r");
defer fs.closeSync(fd); // runs on every exit path
return JSON.parse(fs.readFileSync(fd, "utf8"));
}
arena {
const buf = new Float32Array(1_000_000);
for (let i = 0; i < buf.length; i++) buf[i] = Math.sin(i * 0.001);
console.log("sum:", buf.reduce((a, b) => a + b, 0));
} // buf freed here, no GC pressure
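The examples above register a single defer. To illustrate the LIFO rule for multiple defers, here is a minimal plain-TypeScript sketch. The withDefers helper is hypothetical: it only emulates the ordering with a stack and try/finally and is not how Parabun implements defer.

```typescript
// Hypothetical helper for illustration; not part of Parabun.
type Cleanup = () => void;

function withDefers<T>(body: (defer: (fn: Cleanup) => void) => T): T {
  const stack: Cleanup[] = [];
  try {
    // The body registers cleanups by calling defer(fn).
    return body((fn) => { stack.push(fn); });
  } finally {
    // Pop from the end so the most recently registered cleanup runs first.
    // Simplification: a cleanup that throws stops the remaining ones.
    while (stack.length > 0) {
      const fn = stack.pop()!;
      fn();
    }
  }
}

const order: string[] = [];
const value = withDefers((defer) => {
  defer(() => order.push("first registered"));
  defer(() => order.push("second registered"));
  order.push("body");
  return 42;
});

console.log(order.join(" -> ")); // body -> second registered -> first registered
console.log(value);              // 42
```

Because the cleanups run in a finally block, they also fire when the body throws, which mirrors the "return, throw, fall-through" exit paths described above.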
The full grammar is in LLMs.md. The LSP carries arity-based "could be memo" / "memo probably not worth it" hints plus full purity diagnostics.
Roadmap
Parabun aims to open up typical JS performance bottlenecks through multithreading and the GPU. The shipped modules (para:parallel, para:simd, para:gpu, para:pipeline, para:arena, para:signals, para:llm, para:image, para:audio, para:csv, para:rtp) cover the typed-array, codec, and CPU/GPU-parallel surface; the remaining items below attack the next layer of "I have to shell out / use Python / write native code" pain points.
Each module ships behind a compile-time feature flag. The
configurator generates a bun build --compile invocation with only the
modules you check, so production builds slim down to whatever your app actually imports.
| Status | Module | What it does |
|---|---|---|
| shipped | para:image | JPEG / PNG / WebP decode + encode, resize (bilinear / Lanczos), blur / sharpen / edge-detect, rotate / flip / crop, adjust / threshold / invert / grayscale, histogram, alpha composite. |
| shipped | para:audio | WAV / MP3 / Opus codecs, RBJ biquads, FFT, resample, spectrogram, VAD, denoiser (rnnoise), AGC, mix / normalize / envelope, planar ⇄ frame-major + i16 ⇄ f32 PCM helpers. |
| shipped | para:csv | Streaming RFC 4180 parser. parallel: true is "off-the-main-thread", not a per-file speedup — see the table above. |
| shipped | para:rtp | RFC 3550 packet pack/parse + jitter buffer. Transport for the codec stack. |
| shipped | para:gpu primitives | conv2D, scan, reduce, argMin / argMax, histogram, median / quantile. CPU correctness paths today; CUDA / Metal hooks slot in via the existing dispatch. |
| shipped | para:camera | V4L2 capture on Linux — devices(), formats(path), open(...) with an async-iterator frames() over kernel-mmapped buffers. AVFoundation + Media Foundation follow on the same surface. |
| shipped | OS audio I/O | Live ALSA capture + playback for para:audio. devices() / capture(...) / play(...) with Float32 PCM streams, S16_LE on the wire. CoreAudio + WASAPI follow. |
| partial | para:gpu device kernels | CUDA reduce (sum / min / max) + atomic-privatized histogram shipped. Scan, Metal mirror, and the rest of the secondary primitives still on CPU until wired. |
| partial | para:vision | Frame stream + frame-diff motion detection ship today (vision.frames, vision.detectMotion). Detector (detect) and OCR (recognize) engines stub until ONNX runtime is vendored. |
| shipped | para:speech | VAD-gated speech.listen (with reactive active / noiseFloor / lastUtterance signals), Whisper STT (speech.transcribe, dispatching to para:llm's WhisperModel — encoder-decoder, KV cache, beam search, language detection, CUDA-accelerated end-to-end), and Piper TTS (speech.speak — subprocess in v1; libpiper FFI v2 tracked). |
| shipped | para:assistant | Three-line voice-assistant facade composing para:audio + para:speech + para:llm + para:mcp. bot.run / turns / ask / say / interrupt + reactive state / history / lastTurn / interrupted / toolsActive signals + sqlite-backed persistent memory + tool dispatch (inline + MCP) + VAD-driven barge-in + wake word (wakeWord: "hey jetson") + cron-driven scheduled prompts + RAG (knowledge: { dir, encoder }). Vision (VLM) turns deferred to follow-up. |
| shipped | para:arrow | In-memory columnar tables, computes (sum / mean / min / max / variance / stddev / quantile / median / distinct / filter / groupBy / sort / cumsum / diff / argMin / argMax / count), fromRows / toRows bridges, Arrow IPC streaming + file formats (with dictionary-batch decode and List<T>), and Parquet read + write (fromParquet / toParquet — hand-rolled Thrift / Snappy / RLE / dictionary; UNCOMPRESSED / SNAPPY / GZIP; verified bit-for-bit against pyarrow in both directions). Wire-compat verified against apache-arrow 21.1.0 in bench/parabun-arrow-ipc-interop/. Dictionary write encoding + nested types (Struct / Map / FixedSizeList / Decimal) pending. |
| in progress | para:video | JS surface scaffolded; libavcodec / V4L2 M2M / NVDEC native binding lands with hardware bring-up. Decode + encode + container muxing. |
| next | para:parallel v2 | Closure-aware persistent worker pool + SharedArrayBuffer channels. Lifts today's pmap ceiling. |
| planned | para:image AVIF | AVIF decode/encode (libavif + AOM / dav1d vendor add). Rounds out the codec coverage matrix. |
para:llm serves as a proof of concept for the stack: it is built on para:gpu +
para:simd + para:parallel. Parabun is positioned as a perf runtime, not an AI runtime.