Resources
Research, technical writing, and engineering references.
In development
Substrate prototype
Engineering work on the four-tier architecture proposed in the paper. First milestone replicates the Nickel and Kiela hyperbolic WordNet result.
Engineering notes
Quantization Explained: VRAM, Quality, and the RAM Performance Cliff
What quantization does to model quality and speed, how Q4/Q5/Q8 compare on real hardware, and why running a model split between GPU memory and system RAM can be 5-30x slower. Includes third-party benchmarks and practical recommendations.
Qwen 3.5 Benchmarks: 0.8B to 14B on RDNA4 Vulkan
Qwen 3.5 benchmarks across 80 prompts on AMD RDNA4 16GB with Vulkan. Unconstrained (2048-token) results show qwen3:14b leading at 0.594, with 9B pulling ahead of 4B when not artificially truncated. Includes an analysis of how token-budget constraints distort benchmark rankings.
LLM Benchmark Comparison: 12-Source Rankings
Cross-referenced LLM benchmark data from 12+ sources. Consensus rankings for coding, math, reasoning, tool calling. Interactive charts, cost vs. quality scatter plots, and full methodology.
How We Price: Aggregate Component Costs by Scrape
We publish the exact component costs, eBay search queries, retail sources, and markup formula for every Foundry GPU server. Prices are scraped from live market data and fed into an auto-pricing engine. This is the full breakdown.
ROCm HIP Idle GPU Bug on RDNA4: Vulkan Workaround
When loading Qwen 3/3.5 models via Ollama using ROCm/HIP on AMD RDNA4 GPUs, the GPU becomes permanently stuck at 100% utilization. We trace the root cause to HSA runtime teardown and benchmark the Vulkan backend as a drop-in fix with comparable performance.