AI Infrastructure SaaS

AI Precision
Compiler

Using your model's internal geometry, we compile it into the smallest representation that preserves its intelligence.

Works with leading model families

Models are growing 10x faster than hardware is getting cheaper.
The bottleneck isn't compute — it's memory.

Today's compression is one-size-fits-all

Current heuristic methods fail to exploit the best memory-aware representations — leaving performance and cost savings on the table.

Blind compression

Treating every layer the same wastes precision on easy parts and destroys the hard ones.

Models don't fit

The best AI models are too large for most hardware. Memory is the real bottleneck.


Quality cliffs

Push compression too far and quality drops off a cliff. There's a smarter way to reach the same size.

Same model. Fraction of the size.

Original
14 GB
7B parameters · fp16

Compressed
5.2 GB
3.8 bits avg · mixed precision

Same benchmarks. Same behavior. 63% less memory.
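The arithmetic behind these figures checks out (a quick sanity check; note the 5.2 GB compressed size is larger than a pure 3.8-bit payload would be, presumably because mixed precision keeps scales and some tensors at higher precision — that overhead interpretation is our assumption):

```python
# Back-of-envelope memory math for a 7B-parameter model.
params = 7e9

# fp16 stores 2 bytes per parameter.
fp16_gb = params * 2 / 1e9            # 14.0 GB

# Effective bits per weight implied by the 5.2 GB compressed size.
# Higher than the 3.8-bit average because mixed precision carries
# overhead (quantization scales, tensors kept at full precision).
effective_bits = 5.2e9 * 8 / params   # ≈ 5.9 bits

reduction = 1 - 5.2 / 14.0            # ≈ 0.63 → "63% less memory"
print(round(fp16_gb, 1), round(effective_bits, 1), round(reduction, 2))
```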

A fundamentally different workflow

import invariant

model = invariant.load("meta-llama/Llama-3-70B")
compressed = model.compile(target_bits=4.0)

Optimally compile your models in 3 lines.

01

Map sensitivity

We probe the loss landscape to discover which parts of the model are fragile and which are robust.

02

Allocate precision

Sensitive layers get more bits. Robust layers get fewer. The total size stays the same — the quality goes up.

03

Correct errors

After compressing, we use the sensitivity map to fix the remaining distortion — eliminating cascading errors.
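The three steps above can be sketched in miniature. This is a toy illustration only — the layer names and sensitivity values are made up, `quantize` is a plain uniform quantizer, and the "correction" is a simple least-squares rescale standing in for the real error-correction method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a model: three "layers" of weights. Step 1 would
# measure these sensitivities by probing the loss landscape; here we
# hard-code them.
weights = {n: rng.normal(size=512) for n in ("attn", "mlp", "embed")}
sensitivity = {"attn": 8.0, "mlp": 1.0, "embed": 0.2}

def quantize(w, bits):
    # Uniform symmetric quantizer at a given bit width.
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / levels
    return np.clip(np.round(w / scale), -levels, levels) * scale

# Step 2: spend a fixed bit budget where it matters. Here, a simple
# rank-based split around a 4-bit average: most robust layer gets 2
# bits, most sensitive gets 6.
avg_bits = 4
ranked = sorted(sensitivity, key=sensitivity.get)
alloc = {ranked[0]: avg_bits - 2, ranked[1]: avg_bits, ranked[2]: avg_bits + 2}

compressed = {}
for name, w in weights.items():
    q = quantize(w, alloc[name])
    # Step 3 (toy version): a closed-form correction — rescale the
    # quantized weights to minimize squared error against the original.
    a = np.dot(w, q) / np.dot(q, q)
    compressed[name] = a * q

for name in weights:
    mse = np.mean((weights[name] - compressed[name]) ** 2)
    print(f"{name}: {alloc[name]} bits, mse={mse:.5f}")
```

The average stays at 4 bits, but the error concentrates in the layers that can tolerate it.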

01

Find a quantized model

Search HuggingFace for someone who already ran a heuristic quantization. Hope the settings are reasonable.

02

Check if quality fits

Evaluate whether it still meets your task requirements and memory constraints. Usually it doesn't.

03

Tune heuristics yourself

Manually tweak parameters, re-run, re-evaluate. Repeat until something is acceptable.

Still no guarantees

After days of iteration you have "good enough" — with no proof it's optimal.

Explore the tradeoff

Drag to see how quality changes as you compress further. Our approach holds quality longer.

4.2
avg bits
96%
quality retained
0.38x
original size

Smaller models that still perform

Benchmarked against the best compression methods available today.

Average bits per weight
Original quality retained
Memory reduction
Faster inference

Not all parameters
are created equal

Some layers sit on steep ridges in the loss landscape — even tiny changes cause big quality drops. Others sit in flat valleys where you can compress aggressively with no consequence.

We measure this curvature directly, then use it to decide where every bit of precision should go. The result: models that are dramatically smaller but behave almost identically to the original.
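Curvature of this kind can be probed with finite differences: perturb the weights along a direction and see how fast the loss bends. A toy sketch with synthetic quadratic losses, not our production estimator:

```python
import numpy as np

def curvature(loss_fn, w, direction, eps=1e-3):
    """Estimate d²L/dε² along a unit direction via central differences.

    Large value → steep ridge (protect with more bits).
    Small value → flat valley (compress aggressively).
    """
    d = direction / np.linalg.norm(direction)
    return (loss_fn(w + eps * d) - 2 * loss_fn(w) + loss_fn(w - eps * d)) / eps**2

rng = np.random.default_rng(0)
w = rng.normal(size=64)
d = rng.normal(size=64)

steep = lambda v: 50.0 * np.sum(v**2)   # ridge-like toy loss
flat = lambda v: 0.01 * np.sum(v**2)    # valley-like toy loss

print(curvature(steep, w, d))  # ≈ 100 (second derivative of 50·x² is 2·50)
print(curvature(flat, w, d))   # ≈ 0.02
```

Repeating this probe per layer yields a sensitivity map like the one the compiler uses to place bits.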

Built by the people behind the research

Dr. Diego Granziol

Co-founder

Khurshid Juraev

Co-founder

Prof. Jon Keating

Co-founder

Ship models that
actually fit

Dramatically smaller. Virtually identical. Production ready.

Get Started