Synoros Foundry
Run AI Models Locally
Pre-built computers that run large language models on your desk or network, from compact boxes for personal use to rack servers for production workloads. Every system is tested for 48+ hours, warrantied, and ships with a guide showing exactly which models it can run.
21
Configurations
16-768GB
VRAM Range
$499
Starting Price
90d-1yr
Hardware Warranty
Why Buy From Us
Every build is assembled, tested, and documented by the same engineers who run these systems for production LLM inference.
Tested and Validated
Every system runs real AI workloads for 48+ hours before shipping. We verify thermals, memory, and model performance so you know it works out of the box.
Fraction of the Price
We build from quality refurbished enterprise components instead of charging new-retail markups. Same hardware, 40-60% less than comparable pre-built systems.
Hardware Warranty
Non-GPU components covered for 1 year. Used GPUs covered for 90 days. New GPUs carry a full 1-year warranty. DOA replacement within 14 days.
Model Guidance Included
Each configuration ships with tested model recommendations and quantization guidance so you know exactly what runs and at what quality level.
Transparent Pricing
We publish our exact component costs, sources, and markup formula. See how every dollar is spent.
Read the full breakdown →
No Vendor Lock-in
Standard server hardware, standard GPUs, standard Linux. No proprietary firmware, no licensing fees, no support contracts required. You own it outright.
Three Series
Foundry Lite
CB-16 / CB-24 / CB-32 / CB-32D / CB-48 / CB-64
Local AI boxes for your desk or home network. Quiet, compact, plugs into a standard outlet. Run models like Llama, Mistral, and Qwen privately on your own hardware. From $499.
Foundry Workstations
WS-48 / WS-72 / WS-96 / WS-128 / WS-192T / WS-320 / WS-384
Multi-GPU towers for teams and heavier workloads. Run 70B+ parameter models or serve multiple users on your local network. 3-4 GPUs in a tower chassis.
Foundry Rack & Servers
R4-128 / R8-192 / R8-256 / R4-192A / R4-192W / R4-320 / R4-384 / S8-768
Rackmount servers for production deployment, model training, and infrastructure. 4-8 GPUs, designed for datacenter or server closet installation.
Quality Certified Refurbished Parts
We source quality-certified refurbished and lightly used enterprise components from verified suppliers. Every GPU, CPU, and memory module is individually tested and validated during our 48-hour burn-in process. This is how we keep prices 40-60% below comparable new-build configurations without compromising reliability. Used GPUs carry a 90-day warranty; new GPUs and all other components carry 1 year. DOA replacement within 14 days.
Choose Your Configuration
Ultra-budget inference boxes with datacenter GPUs

16GB
Synoros CostaBox 16
Cheapest local LLM box that actually works
Compact Desktop / First-time local LLM user, student, or hobbyist
Starting at
$499
Ships in 1-2 weeks
Sweet spot
7B-13B models at full quality
Stretch
24B-27B at q4 quantization
Configurable
- +CPU: Xeon E3-1245 v3 4-core (standard, has iGPU) / Xeon E3-1275 v3 4-core (faster, has iGPU) / Xeon E5-1650 v3 6-core (no iGPU) / Xeon E5-2680 v3 12-core (no iGPU)
- +RAM: 8GB DDR4 (standard) / 16GB DDR4 / 32GB DDR4
- +Storage: 256GB SSD (standard) / 512GB NVMe / 1TB NVMe
- +Display: Headless (no display adapter, SSH only, standard) / iGPU display (no extra card) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•Compact chassis limits cooling. GPU is power-limited for thermals
•Single PCIe slot. Not expandable without changing chassis
•16GB VRAM ceiling means larger models require aggressive quantization
⚡Standard 120V outlet. Total system draw under 300W.
↑Upgrade path: CostaBox 24 or CostaBox 32

24GB
Synoros CostaBox 24
24GB VRAM for 27B model inference
Compact Desktop / Developer, researcher, or small team wanting real 27B inference
Starting at
$799
Ships in 1-2 weeks
Sweet spot
13B-27B models (Gemma 2 27B, Qwen 2.5 27B, Mistral Small)
Stretch
35B at q4 quantization
Configurable
- +CPU: Xeon E3-1245 v3 4-core (standard, has iGPU) / Xeon E3-1275 v3 4-core (faster, has iGPU) / Xeon E5-1650 v3 6-core (no iGPU) / Xeon E5-2680 v3 12-core (no iGPU)
- +RAM: 8GB DDR4 (standard) / 16GB DDR4 / 32GB DDR4
- +Storage: 256GB SSD (standard) / 512GB NVMe / 1TB NVMe
- +Display: Headless (no display adapter, SSH only, standard) / iGPU display (no extra card) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•P40 has no tensor cores. Slower FP16 than V100, but 24GB is the draw
•Compact chassis limits cooling. GPU power-limited for thermals
•Single slot. Not expandable
⚡Standard 120V outlet. Total system draw under 350W.
↑Upgrade path: CostaBox 32 or CostaBox Duo 48

32GB
Synoros CostaBox Duo 32
Two P100s for the price of one V100
Mini Tower / Budget buyer who wants 2-way parallelism
Starting at
$749
Ships in 1-2 weeks
Sweet spot
24B-35B with 2-way tensor parallelism
Stretch
70B at q4 with both cards
Configurable
- +RAM: 16GB DDR4 ECC / 8GB DDR4 ECC (budget) / 32GB DDR4 ECC / 64GB DDR4 ECC
- +CPU: Xeon E5-1620 v3 4-core (standard) / Xeon E5-1650 v3 6-core (no iGPU) / Xeon E5-2680 v3 12-core (no iGPU)
- +Storage: 256GB SSD / 512GB SSD / 1TB NVMe
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•2x P100 is slower per-token than 1x V100 (no tensor cores, PCIe communication overhead)
•Models must be split across cards. Single-card models limited to 16GB
•Tower chassis is larger than SFF CostaBox
⚡Standard 120V outlet. 2x250W GPUs + host = ~600W peak.
↑Upgrade path: CostaBox Duo 48 or Foundry WS-48

48GB
Synoros CostaBox Duo 48
48GB VRAM in a desktop tower: Dual P40
Mini Tower / Serious hobbyist or small team wanting 70B at home
Starting at
$1,299
Ships in 1-2 weeks
Sweet spot
35B-70B at q4 quantization
Stretch
70B at q5/q6 with tight VRAM budget
Configurable
- +RAM: 16GB DDR4 ECC / 8GB DDR4 ECC (budget) / 32GB DDR4 ECC / 64GB DDR4 ECC
- +CPU: Xeon E5-1620 v3 (budget) / E5-2680 v3 (performance)
- +Storage: 256GB SSD / 512GB SSD / 1TB NVMe
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•P40 has no tensor cores. Capacity-first, not speed-first
•70B at q4 is a tight fit (~43GB). Limited context window
•2 slots used in Z440/Z620. No further GPU expansion
⚡Standard 120V outlet. 2x250W GPUs + host = ~600W peak.
↑Upgrade path: CostaBox Duo 64 or Foundry WS-72
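
Why a tight fit limits context: whatever VRAM is left after the weights load has to hold the KV cache, which grows with every token in the window. The sketch below is a rough estimate, not a measurement. It assumes the commonly cited Llama 3 70B shape (80 layers, 8 KV heads, head dimension 128) and an FP16 cache, and it ignores runtime buffers, so real limits will be lower. The function names are illustrative.

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache per token: one key and one value vector per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


def max_context_tokens(free_gb: float, layers: int, kv_heads: int,
                       head_dim: int, bytes_per_elem: int = 2) -> int:
    """Upper bound on cached tokens that fit in free_gb (decimal GB) of spare VRAM."""
    per_token = kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_elem)
    return int(free_gb * 1e9) // per_token


# Assumed model shape: 80 layers, 8 KV heads (grouped-query attention), head dim 128.
spare_gb = 48 - 43  # 48GB of total VRAM minus ~43GB of q4 weights
print(kv_bytes_per_token(80, 8, 128))            # bytes per token at FP16
print(max_context_tokens(spare_gb, 80, 8, 128))  # rough ceiling on context length
```

Under these assumptions the spare 5GB holds a context in the low tens of thousands of tokens at most, which is why larger-VRAM configurations are the better choice for long-context work.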

32GB
Synoros CostaBox 32
V100 tensor cores in a desktop form factor
Compact Desktop / Developer who wants speed and capacity in the smallest box
Starting at
$1,399
Ships in 1-2 weeks
Sweet spot
27B-35B models with room for context
Stretch
70B at aggressive q3/q4 with CPU offload
Configurable
- +CPU: Xeon E3-1245 v3 4-core (standard, has iGPU) / Xeon E3-1275 v3 4-core (faster, has iGPU) / Xeon E5-1650 v3 6-core (no iGPU) / Xeon E5-2680 v3 12-core (no iGPU)
- +RAM: 16GB DDR4 / 8GB DDR4 (budget) / 32GB DDR4 / 64GB DDR4
- +Storage: 256GB SSD / 512GB SSD / 1TB NVMe
- +Display: Headless (no display adapter, SSH only, standard) / iGPU display (no extra card) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•Compact chassis limits cooling. May need power limiting
•Single slot. Not expandable without chassis swap
•V100 PCIe (not SXM). Still excellent for single-card inference
⚡Standard 120V outlet. Total system draw under 400W.
↑Upgrade path: CostaBox Duo 64 or Foundry WS-96

64GB
Synoros CostaBox Duo 64
Dual V100 tensor cores in a tower
Mini Tower / Developer or researcher wanting 70B with real speed
Starting at
$2,499
Ships in 1-2 weeks
Sweet spot
70B q4 with comfortable headroom
Stretch
70B q6/q8 for higher quality output
Configurable
- +RAM: 16GB DDR4 ECC (standard) / 32GB DDR4 ECC / 64GB DDR4 ECC / 128GB DDR4 ECC
- +CPU: Xeon E5-1650 v3 6-core (standard) / Xeon E5-2680 v3 12-core / Xeon E5-2690 v3 12-core (higher clocks)
- +Storage: 512GB SSD / 1TB NVMe
- +NVLink: None (standard) / V100 NVLink bridge (doubles inter-GPU bandwidth)
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•V100 PCIe supports 2-way NVLink bridge (available as upgrade). Without bridge, tensor parallelism runs over PCIe
•Z440/Z620 has 2 PCIe x16 slots. Not expandable to 3+ GPUs
•Higher cost than P40 Duo but meaningfully faster per token
⚡Standard 120V outlet. 2x250W GPUs + host = ~650W peak.
↑Upgrade path: Foundry WS-96 or Foundry Rack 128
Multi-GPU tower workstations

48GB
Synoros Foundry WS-48
3x P100 in a dual-Xeon tower: Entry multi-GPU
Tower / Hobbyist or first-time local LLM buyer
Starting at
$1,299
Ships in 1-2 weeks
Sweet spot
24B-35B parameter models
Stretch
70B q4 experiments
Configurable
- +RAM: 16GB DDR4 ECC (headless) / 32GB DDR4 ECC (standard) / 64GB DDR4 ECC / 128GB DDR4 ECC
- +CPU: Xeon E5-2620 v3 (budget) / E5-2680 v4 (performance)
- +Storage: 256GB SSD boot (standard) / 512GB NVMe boot / 1TB NVMe boot / + additional 2.5" or 3.5" SATA drives (up to 4x 3.5" + 4x 2.5" bays available)
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +Display: Headless (no display adapter, SSH only) / GT 710 1GB basic display (standard) / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•3x250W GPUs exceed HP's official 3x225W graphics envelope. Power-limited and thermally validated during burn-in
•PCIe 3.0 bandwidth ceiling visible during model load/offload, not during steady-state decode
•Pascal-generation FP16 performance. Compute-bound on small batch sizes
⚡Standard 120V outlet (1275W PSU). Power-limited GPUs draw ~225W each.
↑Upgrade path: Synoros Foundry WS-96 or Rack 128

72GB
Synoros Foundry WS-72
Cheapest path to 70B-capable workstation VRAM
Tower / Capacity-first tinkerer or budget model collector
Starting at
$2,099
Ships in 1-2 weeks
Sweet spot
70B q4 models
Stretch
122B low-quant experiments
Configurable
- +RAM: 16GB DDR4 ECC (headless) / 32GB DDR4 ECC (standard) / 64GB DDR4 ECC / 128GB DDR4 ECC
- +CPU: Xeon E5-2620 v3 (budget) / E5-2680 v4 (performance)
- +Storage: 256GB SSD boot (standard) / 512GB NVMe boot / 1TB NVMe boot / + additional SATA drives (up to 4x 3.5" + 4x 2.5" bays available)
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +Display: Headless (no display adapter, SSH only) / GT 710 1GB basic display (standard) / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•P40 has weaker FP16 performance than P100/V100. Capacity-first, not speed-first
•PCIe 3.0 bandwidth ceiling on model load/offload
•No tensor cores. Pure CUDA compute
⚡Standard 120V outlet (1275W PSU). 3x250W GPU draw.
↑Upgrade path: Synoros Foundry WS-96 or R4-192

96GB
Synoros Foundry WS-96
Best-value used tower for serious 70B work
Tower / Power user who wants 70B to feel real on a tower
Starting at
$3,999
Ships in 1-2 weeks
Sweet spot
70B q5/q6 models
Stretch
122B q3-ish with CPU offload
Configurable
- +RAM: 16GB DDR4 ECC (headless) / 32GB DDR4 ECC (standard) / 64GB DDR4 ECC / 128GB DDR4 ECC
- +CPU: Xeon E5-2620 v3 (budget) / E5-2680 v4 (performance)
- +Storage: 256GB SSD boot (standard) / 512GB NVMe boot / 1TB NVMe boot / + additional SATA drives (up to 4x 3.5" + 4x 2.5" bays available)
- +NVLink: V100 NVLink bridge for 2-way GPU pairing (doubles inter-GPU bandwidth)
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3)
- +Display: Headless (no display adapter, SSH only) / GT 710 1GB basic display (standard) / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•Still a PCIe 3.0 tower. Best value, not a modern platform
•V100 PCIe supports 2-way NVLink bridge (available as upgrade)
•Model load times limited by PCIe 3.0 host bandwidth
⚡Standard 120V outlet (1275W PSU). 3x250W GPU draw.
↑Upgrade path: Synoros Foundry R4-192 or R4-320

128GB
Synoros Foundry WS-128
128GB VRAM in a tower: Rack performance without the rack
Full Tower Workstation / Lab, startup, or power user who wants 4x V100 on a desk
Starting at
$5,499
Built to order (1-2 weeks)
Sweet spot
70B at comfortable quantization with headroom
Stretch
122B at low quant with 4-way parallelism
Configurable
- +RAM: 64GB DDR4 ECC (standard) / 128GB DDR4 ECC / 256GB DDR4 ECC
- +CPU: Dual Xeon E5-2620 v4 (budget) / E5-2680 v4 (performance) / E5-2697 v4 (18-core)
- +Storage: 512GB NVMe boot / 1TB NVMe / 2TB NVMe + HDD data drive
- +NVLink: None (standard) / V100 NVLink bridges for 2 GPU pairs
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-5 25GbE SmartNIC
- +Cooling: Air cooled (standard) / Enhanced fan configuration for 4-GPU thermals
- +Display: Headless (no display adapter, SSH only) / GT 710 1GB basic display (standard) / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•Full tower is large. This is a floor-standing workstation, not a desk box
•4x250W passive GPUs need good case airflow. Validated during burn-in
•PCIe 3.0 host limits load/offload but steady-state inference is fine
•V100 PCIe supports 2-way NVLink bridges (pairs of cards). Available as upgrade
⚡120V with high-wattage PSU (1400W+). 4x250W GPUs + dual Xeon = ~1.5kW peak.
↑Upgrade path: Synoros Foundry Rack 128 or R8-256

192GB
Synoros Foundry WS-192T
4x Ampere A6000 in a Threadripper Pro tower: 192GB VRAM on PCIe 4.0
Full Tower Workstation / Research team, AI startup, or production-grade desktop inference lab
Starting at
$36,999
Built to order (2-3 weeks)
Sweet spot
122B 4-bit with Ampere tensor cores
Stretch
400B at aggressive quantization with CPU offload
Configurable
- +RAM: 128GB DDR4 ECC (standard) / 256GB DDR4 ECC / 512GB DDR4 ECC
- +CPU: Threadripper PRO 5955WX 16-core (standard) / 5975WX 32-core / 5995WX 64-core
- +Storage: 1TB NVMe Gen4 (standard) / 2TB NVMe / 4TB NVMe + SATA data array
- +NVLink: None (standard) / A6000 NVLink bridges for 2 GPU pairs
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-6 100GbE SmartNIC
- +Cooling: Air cooled (standard) / Liquid cooling for quieter operation
- +Display: Headless (no display adapter, SSH only) / GT 710 1GB basic display (standard) / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•A6000 is active-cooled (fan noise). Louder than passive server cards
•Threadripper Pro platform is more expensive than used Xeon towers
•4 slots used. Not expandable to 8 GPUs without switching to rack
⚡120V with 1600W+ PSU. 4x300W GPUs + Threadripper = ~1.6kW peak.
↑Upgrade path: Synoros Foundry R4-192A or R4-320

384GB
Synoros Foundry WS-384
4x Blackwell 96GB in a modern tower: The premium desktop flagship
Full Tower Workstation / Research group, inference provider, or premium buyer who wants 384GB on a desk
Starting at
$39,999
Built to order (2-4 weeks)
Sweet spot
400B+ models with room to spare
Stretch
Multi-model serving or large-scale training
Configurable
- +RAM: 256GB DDR5 ECC (standard) / 512GB DDR5 ECC / 1TB DDR5 ECC
- +CPU: Xeon W9-3495X (56-core) / W9-3475X (36-core)
- +Storage: 2TB NVMe boot + NVMe data array
- +NVLink: None (standard) / RTX PRO 6000 NVLink bridge (confirm compatibility at order)
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-7 200GbE/400GbE SmartNIC
- +Cooling: Air cooled (standard) / Liquid cooling (recommended for 4x passive GPUs)
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•RTX PRO 6000 is passive-cooled. Requires validated tower airflow (burn-in verified)
•600W TDP per card at full power. Power-limiting available (400-600W configurable)
•Premium pricing reflects new Blackwell cards + modern Xeon W9 platform
⚡240V recommended. 4x400-600W GPUs + Xeon W9 = 2.5-4kW depending on power config.
↑Upgrade path: Synoros Foundry S8-768

320GB
Synoros Foundry WS-320
4x A100 80GB in a modern tower: Datacenter in a box
Full Tower Workstation / Enterprise team, research lab, or AI startup wanting 320GB without a rack
Starting at
$82,999
Built to order (2-3 weeks)
Sweet spot
400B low-quant with 4-way tensor parallelism
Stretch
Serious 122B/397B production serving
Configurable
- +RAM: 256GB DDR5 ECC (standard) / 512GB DDR5 ECC / 1TB DDR5 ECC
- +CPU: Xeon W9-3495X (56-core) / W9-3475X (36-core)
- +Storage: 2TB NVMe boot + NVMe data array
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-7 200GbE/400GbE SmartNIC
- +Cooling: Air cooled (standard) / Liquid cooling for sustained compute
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•A100 PCIe variant. NVLink not available on PCIe A100 (only SXM version has NVLink)
•Premium pricing reflects A100 market rates + modern workstation platform
•Large tower footprint. This is a full-size workstation, not a compact build
⚡120V with 1600W+ PSU or 240V recommended. 4x300W GPUs + Xeon W9 = ~1.8kW peak.
↑Upgrade path: Synoros Foundry R4-384 or S8-768
Production rackmount servers

192GB
Synoros Foundry R8-192
Maximum VRAM per dollar in an 8-GPU chassis
4U Rackmount / Budget-conscious buyer who needs raw VRAM capacity over speed
Starting at
$6,499
Built to order (1-2 weeks)
Sweet spot
122B low-quant with 8-way parallelism
Stretch
400B experiments with aggressive quantization + CPU offload
Configurable
- +RAM: 128GB DDR4 ECC (standard) / 256GB DDR4 ECC / 512GB DDR4 ECC
- +CPU: Dual Xeon / Dual EPYC 7002 (flexible based on chassis)
- +Storage: 1TB NVMe boot + SATA/NVMe array (hot-swap bays available)
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-5 25GbE SmartNIC / ConnectX-5 100GbE SmartNIC
- +GPU swap: Replace P40s with V100 32GB (better compute, same socket)
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•P40 has no tensor cores and weak FP16. This is a capacity play, not a speed play
•8x250W passive cards = 2kW GPU draw alone. 208/240V mandatory
•PCIe 3.0 host limits load/offload speed across all 8 cards
•Older compute architecture. Per-token inference is slower than on Volta or Ampere cards with equivalent VRAM
⚡240V required. 8x250W GPUs + dual Xeon/EPYC host = ~2.5-3kW steady state.
↑Upgrade path: Synoros Foundry R4-192A or R4-320
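
The "VRAM per dollar" comparison can be reproduced from the starting prices on this page. The sketch below is simple arithmetic over the three 8-GPU chassis. It uses base prices only, so configured systems will differ, and the helper names are illustrative.

```python
# (starting price in USD, total VRAM in GB) for the 8-GPU chassis listed on this page.
EIGHT_GPU_CONFIGS = {
    "R8-192": (6499, 192),
    "R8-256": (10999, 256),
    "S8-768": (59999, 768),
}


def dollars_per_gb(price_usd: float, vram_gb: float) -> float:
    """Starting price divided by total VRAM."""
    return price_usd / vram_gb


def rank_by_value(configs: dict) -> list:
    """Return (name, $/GB) pairs sorted cheapest-per-GB first."""
    ranked = [(name, round(dollars_per_gb(price, vram), 2))
              for name, (price, vram) in configs.items()]
    return sorted(ranked, key=lambda item: item[1])


for name, cost in rank_by_value(EIGHT_GPU_CONFIGS):
    print(f"{name}: ${cost:.2f} per GB of VRAM")
```

Cost per GB says nothing about speed: the P40-based R8-192 ranks first on capacity per dollar precisely because it trades compute for VRAM.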

128GB
Synoros Foundry Rack 128
Entry rack for clean 70B deployment
4U Rackmount / Small lab or serious home rack builder
Starting at
$5,499
Ships in 1-2 weeks
Sweet spot
70B at comfortable quantization
Stretch
122B low-quant with CPU offload margin
Configurable
- +RAM: 64GB DDR4 ECC (standard) / 128GB DDR4 ECC / 256GB DDR4 ECC
- +CPU: Xeon E5-2680 v3 (budget) / E5-2697 v4 (18-core)
- +Storage: 512GB NVMe boot (standard) / 1TB NVMe boot / + up to 7x additional 3.5" SATA hot-swap drives (8 bays total)
- +NVLink: None (standard) / V100 NVLink bridge for paired cards
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-5 25GbE SmartNIC / ConnectX-5 100GbE SmartNIC
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•4x250W passive GPUs on 120V is not comfortable. 208/240V strongly recommended
•PCIe 3.0 host limits model load throughput
•V100 PCIe supports 2-way NVLink bridge. Available as upgrade for paired cards
⚡208/240V strongly recommended. 4x250W GPUs + host = ~1.5kW total draw.
↑Upgrade path: Synoros Foundry R4-192

256GB
Synoros Foundry R8-256
8x V100 performance rack with 256GB VRAM
4U Rackmount / Serious lab, training workloads, or multi-model serving
Starting at
$10,999
Built to order (1-2 weeks)
Sweet spot
70B-122B with 8-way tensor parallelism
Stretch
400B at aggressive quantization with CPU offload
Configurable
- +RAM: 128GB DDR4 ECC (standard) / 256GB DDR4 ECC / 512GB DDR4 ECC / 1TB DDR4 ECC
- +CPU: Dual Xeon / Dual EPYC 7002 (flexible based on chassis)
- +Storage: 1TB NVMe boot + SATA/NVMe array (hot-swap bays)
- +NVLink: None (standard) / V100 NVLink bridges for GPU pairing
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-6 100GbE SmartNIC
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•8x250W passive cards = 2kW GPU draw. 208/240V mandatory
•PCIe 3.0 host limits model load throughput but steady-state decode is fine
•V100 PCIe supports 2-way NVLink bridge (available as upgrade)
•Used enterprise chassis. Cosmetic wear does not affect functionality
⚡240V required. 8x250W GPUs + dual CPU host = ~2.5-3kW steady state.
↑Upgrade path: Synoros Foundry R4-320 or R4-384

192GB
Synoros Foundry R4-192A
Modern CUDA 48GB-per-card rack with tensor cores
4U Rackmount / Startup or lab wanting speed and capacity on NVIDIA CUDA
Starting at
$35,999
Built to order (2-3 weeks)
Sweet spot
122B 4-bit inference with Ampere tensor cores
Stretch
400B with aggressive quantization + CPU offload
Configurable
- +RAM: 128GB DDR4 ECC (standard) / 256GB DDR4 ECC / 512GB DDR4 ECC
- +CPU: EPYC 7313 (budget) / EPYC 7443 (24-core) / EPYC 7543 (32-core)
- +Storage: 1TB NVMe boot + NVMe/SATA data array
- +NVLink: None (standard) / A6000 NVLink bridge for GPU pairing
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-6 100GbE SmartNIC
- +Cooling: Air cooled (standard) / Liquid cooling for quieter deployment
- +GPU upgrade: A6000 Ada (Lovelace) when used prices drop
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•A6000 is a workstation card (active cooling, 300W). Louder than passive server cards
•Used A6000 pricing has come down but is still higher than P40/V100 per-GB
•4 slots used. Not expandable beyond 192GB without a chassis swap
⚡208/240V recommended. 4x300W GPUs + EPYC host = ~1.8-2kW peak.
↑Upgrade path: Synoros Foundry R4-320 or R4-384

192GB
Synoros Foundry R4-192W
Modern 48GB-per-card 4-GPU rack
4U Rackmount / Startup, boutique datacenter, or quiet-ish rack buyer
Starting at
$14,999
Built to order (2-3 weeks)
Sweet spot
122B 4-bit inference
Stretch
400B remains out of reach without CPU offload
Configurable
- +RAM: 128GB DDR4 ECC (standard) / 256GB DDR4 ECC / 512GB DDR4 ECC
- +CPU: EPYC 7313 (budget) / EPYC 7443 (24-core) / EPYC 7543 (32-core)
- +Storage: 1TB NVMe boot + NVMe/SATA data array
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-6 100GbE SmartNIC
- +Cooling: Air cooled (standard) / Liquid cooling
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•Active workstation GPUs. Easier acoustically, higher per-card cost
•AMD ROCm ecosystem. Verify framework compatibility for your stack
•Premium pricing reflects current W7900 market rates
⚡208/240V recommended. 4x active-cooled GPUs + EPYC host = ~1.8kW peak.
↑Upgrade path: Synoros Foundry R4-320 or R4-384

384GB
Synoros Foundry R4-384
Premium 4-card single-node 400B platform
4U Rackmount / Provider or research team needing one serious 4-card node
Starting at
$31,999
Built to order (2-3 weeks)
Sweet spot
400B-class models with room to spare
Stretch
Serious multi-model serving infrastructure
Configurable
- +RAM: 256GB DDR4 ECC (standard) / 512GB DDR4 ECC / 1TB DDR4 ECC
- +CPU: Dual EPYC 7443 (standard) / 7713 (64-core)
- +Storage: 2TB NVMe boot + NVMe/SATA data array
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-7 200GbE SmartNIC
- +Cooling: Air cooled (standard) / Liquid cooling
- +PCIe 5.0 host platform upgrade (eliminates bandwidth compromise)
- +GPU expansion: 8x RTX PRO 6000 (768GB) in 8-GPU chassis
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•PCIe 4.0 host halves host-link bandwidth vs Gen5. Acceptable for inference, not ideal for high-throughput batching
•600W TDP per card at full power. Power-limiting available (400-600W configurable)
•New-generation cards. Pricing reflects current market, not used/refurbished
⚡240V required. 4x400-600W GPUs + host = 3-4kW depending on power config.
↑Upgrade path: Synoros Foundry S8-768

320GB
Synoros Foundry R4-320
Used-enterprise 80GB/card performance rack
4U Rackmount / Enterprise pilot or premium used-rack buyer
Starting at
$75,999
Built to order (2-3 weeks)
Sweet spot
122B/397B serving with tensor parallelism
Stretch
400B low-quant single-node inference
Configurable
- +RAM: 256GB DDR4 ECC (standard) / 512GB DDR4 ECC / 1TB DDR4 ECC
- +CPU: Dual EPYC 7443 (standard) / 7543 (32-core) / 7713 (64-core)
- +Storage: 2TB NVMe boot + up to 8x NVMe/SATA hot-swap data drives
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-7 200GbE SmartNIC / ConnectX-7 400GbE SmartNIC
- +Cooling: Liquid cooling option for datacenter deployment
- +GPU expansion: Add 4 more A100s (640GB total) in same chassis
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•Premium used-enterprise gear. Not mainstream budget hardware
•A100 PCIe variant (not SXM). NVLink not available
•High power draw requires dedicated 208/240V circuit
⚡240V required. 4x300W GPUs + dual EPYC host = ~2.5kW steady state.
↑Upgrade path: Synoros Foundry R4-384 or S8-768

768GB
Synoros Foundry S8-768
Budget-datacenter flagship for large-model serving
4U/5U Rackmount / Budget datacenter, inference provider, or halo SKU buyer
Starting at
$59,999
Built to order (3-4 weeks)
Sweet spot
400B+ models with full tensor parallelism
Stretch
Multi-model serving infrastructure or training workloads
Configurable
- +RAM: 512GB DDR4 ECC (standard) / 1TB DDR4 ECC / 2TB DDR4 ECC
- +CPU: Dual EPYC 7443 (standard) / 7713 (64-core)
- +Storage: 2TB NVMe boot + up to 24x 2.5" hot-swap SATA/NVMe bays
- +Network: 1GbE onboard (standard) / 10GbE SFP+ (Mellanox ConnectX-3) / ConnectX-6 25GbE SmartNIC / ConnectX-7 200GbE SmartNIC / ConnectX-7 400GbE SmartNIC
- +Cooling: Air cooled (standard) / Liquid cooling for sustained full-power operation
- +PCIe 5.0 host platform upgrade (eliminates all bandwidth compromise)
- +Power: Dual 240V 30A circuits for redundancy
- +Display: Headless (no display adapter, SSH only, standard) / GT 710 1GB basic display / Quadro P400 3x 4K display / Quadro P620 4x 4K + streaming
- +OS: Costa OS (standard) / Ubuntu 24.04 LTS / Windows 11 Pro / No OS
•PCIe 4.0 host is the cost-down move. Gen5 host would add significant cost
•5kW+ power draw at full load. 240V 30A minimum, 40A recommended
•Passive GPUs require validated server chassis airflow. Do not attempt in consumer cases
⚡240V 30A minimum (40A recommended). 8x400-600W GPUs + dual EPYC + 512GB RAM = 4-6kW.
↑Upgrade path: 2-node cluster or premium Gen5 host refresh
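
The power figures quoted for each configuration can be sanity-checked against a circuit before ordering. The sketch below is a rough planning aid, not electrical advice. It assumes the common practice of keeping continuous load at or below 80% of the breaker rating, and the function names are illustrative. Confirm real requirements with an electrician.

```python
def system_watts(gpu_count: int, watts_per_gpu: float, host_watts: float) -> float:
    """Peak draw: all GPUs at their power limit plus the host."""
    return gpu_count * watts_per_gpu + host_watts


def circuit_amps(watts: float, volts: float) -> float:
    """Current drawn at a given supply voltage."""
    return watts / volts


def fits_continuous(watts: float, volts: float, breaker_amps: float,
                    derate: float = 0.8) -> bool:
    """True if the load stays within the derated (assumed 80%) breaker rating."""
    return circuit_amps(watts, volts) <= breaker_amps * derate


# Dual-GPU desktop: 2x250W GPUs plus a ~100W host, the ~600W peak quoted above.
print(system_watts(2, 250, 100), "W on 120V =", circuit_amps(600, 120), "A")

# S8-768 at the top of its quoted 4-6kW range on a 240V circuit.
print("6kW on 30A:", fits_continuous(6000, 240, 30))
print("6kW on 40A:", fits_continuous(6000, 240, 40))
```

Under the 80% assumption, a 30A circuit at 240V carries about 5.8kW continuously, which matches the "30A minimum, 40A recommended" guidance for the S8-768.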
What Can You Run
✓ = comfortable, ~ = possible with quantization, - = not viable.
| Model Size | CB-16 16GB | CB-24 24GB | CB-32 32GB | CB-32D 32GB | CB-48 48GB | CB-64 64GB | CO-12 12GB | CO-16 16GB | WS-128 128GB | WS-192T 192GB | WS-320 320GB | WS-384 384GB | WS-48 48GB | WS-72 72GB | WS-96 96GB | R4-128 128GB | R8-192 192GB | R8-256 256GB | R4-192A 192GB | R4-192W 192GB | R4-320 320GB | R4-384 384GB | S8-768 768GB |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7B | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 13B | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 24B-35B | ~ | ✓ | ✓ | ✓ | ✓ | ✓ | ~ | ~ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 70B | - | - | ~ | ~ | ~ | ✓ | - | - | ✓ | ✓ | ✓ | ✓ | ~ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 122B | - | - | - | - | - | - | - | - | ~ | ✓ | ✓ | ✓ | - | ~ | ~ | ~ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 400B+ | - | - | - | - | - | - | - | - | - | ~ | ✓ | ✓ | - | - | - | - | ~ | ~ | ~ | ~ | ✓ | ✓ | ✓ |
Model VRAM Guide
Approximate VRAM at Q4_K_M and Q8 quantization. Use this to match models to configurations above.
| Family | Model | Params | Q4 VRAM | Q8 VRAM | Notes |
|---|---|---|---|---|---|
| Qwen 3.5 | qwen3.5:0.8b | 0.8B | 1.5 GB | 2 GB | |
| | qwen3.5:2b | 2B | 2.5 GB | 3.5 GB | |
| | qwen3.5:4b | 4.7B | 4 GB | 6.5 GB | |
| | qwen3.5:9b | 9.7B | 7 GB | 12 GB | |
| Qwen 3 | qwen3:14b | 14B | 9 GB | 16 GB | |
| Apriel | apriel-15b-thinker | 15B | 10 GB | 17 GB | |
| GPT-OSS | gpt-oss:20b | 21B (3.6B active) | 16 GB (MXFP4) | | MoE |
| Llama 3 | llama3.1:8b | 8B | 5.5 GB | 9.5 GB | |
| | llama3.3:70b | 70B | 40 GB | 74 GB | |
| Mistral | mistral-small3.1:24b | 24B | 14 GB | 26 GB | |
| | mistral-nemo:12b | 12B | 8 GB | 14 GB | |
| Gemma 3 | gemma3:4b | 4B | 3.5 GB | 5.5 GB | |
| | gemma3:12b | 12B | 8 GB | 14 GB | |
| | gemma3:27b | 27B | 17 GB | 30 GB | |
| Phi | phi4:14b | 14B | 9 GB | 16 GB | |
| DeepSeek R1 | deepseek-r1:7b | 7B | 5 GB | 8.5 GB | |
| | deepseek-r1:14b | 14B | 9 GB | 16 GB | |
| | deepseek-r1:32b | 32B | 19 GB | 35 GB | |
| Devstral | devstral | 24B | 14 GB | 26 GB | MoE |
| Qwen Coder | qwen3-coder | 30B (3.3B active) | 8.5 GB | 18 GB | MoE |
| | qwen2.5-coder:14b | 14B | 10 GB | 16 GB | |
MoE models load all weights but activate only a fraction per token, so their VRAM needs follow total parameter count. See our quantization guide for details.
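
For models not listed above, weight size can be approximated from parameter count and bits per weight. The sketch below is a rule of thumb, not a measurement. The bits-per-weight values are assumptions (about 4.85 for Q4_K_M, which varies by model, and 8.5 for Q8_0), the 2GB overhead allowance is an illustrative default, and results will not match the table exactly.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in decimal GB."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed overhead allowance fit in vram_gb."""
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


Q4_K_M = 4.85  # assumed average bits per weight; varies by model
Q8_0 = 8.5     # 8-bit weights plus per-block scale

print(round(weights_gb(70, Q4_K_M), 1), "GB for 70B at Q4_K_M")
print(round(weights_gb(70, Q8_0), 1), "GB for 70B at Q8_0")
print("70B Q4 on 48GB:", fits(70, Q4_K_M, 48))
print("70B Q4 on 32GB:", fits(70, Q4_K_M, 32))
```

The estimate covers weights only. Context (KV cache) needs additional VRAM on top, so treat a bare "fits" as the minimum rather than a comfortable configuration.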
Performance Benchmarks
Published inference speeds from third-party benchmarks. Single GPU, single user, llama.cpp / Ollama.
7B Q4 Generation (tokens/sec, single GPU)
| GPU | Tokens/sec |
|---|---|
| P100 16GB | ~35-50 |
| P40 24GB | ~41 |
| V100 32GB | ~85-107 |
| A6000 48GB | ~102 |
| W7900 48GB | ~121 |
| A100 80GB | ~138 |
| RTX PRO 6000 | ~185 |
Sources: GPU-Benchmarks-on-LLM-Inference (GitHub), DatabaseMart, Hardware Corner, LocalScore, GamersNexus.
Frequently Asked Questions
Common questions about our GPU server configurations.
What GPU do I need to run a 70B parameter model locally?
Are these servers new or refurbished?
Can I use these servers for training or just inference?
What is the difference between P100, P40, V100, A100, and RTX PRO 6000?
Do these servers come with an operating system?
What warranty coverage do GPU servers have?
Can I customize the configuration before ordering?
Warranty & Support
1 Year
Non-GPU warranty
90 Days
Used GPU warranty
10 Days
RMA turnaround target
Direct
Builder email support
Non-GPU components covered for 1 year. Used GPUs (which may have prior mining or datacenter runtime) covered for 90 days. New GPUs carry a full 1-year warranty. DOA replacement within 14 days.
Read full warranty terms →
Ready to Build?
Tell us what you need. We respond within 24 hours. Custom configurations available on request.