QUANTITATIVE FINANCE · CHAPTER 04 / 06

Interest-Rate Models — Vasicek, CIR & Hull–White

In Black–Scholes the rate \(r\) was a constant you looked up; here it becomes the quantity being modelled. The field rests on one stylized fact: rates wander but are pulled back toward a long-run level. Mean-reverting short-rate models price every bond and swaption from that idea, with a single stochastic differential equation generating the entire yield curve and the derivatives written on it.

LEVEL: ADVANCED | READING TIME: ≈ 28 MIN | BUILDS ON: QUANT 01 · 03 | INSTRUMENTS: VASICEK SIM · CURVE · CIR vs VASICEK
4.1

The term structure of interest rates

There is not one interest rate; there is a curve of them. Lend money for three months and you earn one rate; lend for thirty years and you earn another. The term structure (the map from maturity to yield) is the central object of fixed-income finance, and almost everything in this chapter is a way to produce it from a smaller set of moving parts.

The cleanest coordinate is the zero-coupon bond \(P(t,T)\): the price today (time \(t\)) of one dollar paid with certainty at maturity \(T\), with no coupons along the way. Every fixed cash flow is a bundle of these, so the function \(T \mapsto P(t,T)\) — the discount curve — prices any default-free instrument by linearity. From it, three equivalent descriptions of the same information:

EQ Q4.1 — THREE FACES OF THE CURVE $$ \underbrace{y(t,T) = -\frac{\ln P(t,T)}{T - t}}_{\text{continuously-compounded yield}}, \qquad \underbrace{f(t,T) = -\frac{\partial \ln P(t,T)}{\partial T}}_{\text{instantaneous forward rate}}, \qquad \underbrace{r_t = f(t,t) = \lim_{T \to t} y(t,T)}_{\text{short rate}} $$
The yield \(y\) is the single rate that, compounded over \([t,T]\), reproduces the bond price. The forward rate \(f(t,T)\) is the rate locked in today for an instantaneous loan starting at \(T\); the yield is the average of forwards across the maturity, \(y(t,T) = \frac{1}{T-t}\int_t^T f(t,u)\,du\). The short rate \(r_t\) is the front end of the curve — the overnight rate. The entire chapter is the project of choosing a stochastic model for \(r_t\) and deriving \(P(t,T)\), hence the whole curve, from it.
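
The three descriptions in EQ Q4.1 can be sketched numerically. The snippet below is a minimal illustration, not a production bootstrapper: the forward curve used to build the discount factors is a hypothetical smooth function chosen only so the finite-difference forwards can be checked against a known answer, and `curve_from_discount` is an illustrative helper name.

```python
import numpy as np

def curve_from_discount(maturities, discount):
    """Return (yields, forwards) for a discount curve P(0,T) sampled on a grid."""
    T = np.asarray(maturities, dtype=float)
    P = np.asarray(discount, dtype=float)
    yields = -np.log(P) / T                    # y(0,T) = -ln P / T
    forwards = -np.gradient(np.log(P), T)      # f(0,T) = -d ln P / dT, finite differences
    return yields, forwards

# Hypothetical forward curve f(0,T) = 0.02 + 0.02 (1 - exp(-0.5 T)), so that
# -ln P(0,T) = integral_0^T f(0,u) du = 0.04 T - 0.04 (1 - exp(-0.5 T))
T = np.linspace(0.01, 30.0, 3000)
lnP = -(0.04 * T - 0.04 * (1 - np.exp(-0.5 * T)))
y, f = curve_from_discount(T, np.exp(lnP))
f_exact = 0.02 + 0.02 * (1 - np.exp(-0.5 * T))

print(f"front end  : y = {y[0]:.5f}, f = {f[0]:.5f}  (both near the short rate)")
print(f"30y point  : y = {y[-1]:.5f}, f = {f[-1]:.5f}")
print(f"max |f - f_exact| on the grid : {np.max(np.abs(f - f_exact)):.2e}")
```

Because the yield is the running average of the forwards, an upward-sloping forward curve keeps the yield below the forward at every maturity past the front end, which is what the 30-year line shows.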

Empirically the curve is usually upward-sloping (long money pays more, compensating for term risk and expected rate rises), occasionally flat, and sometimes inverted — short rates above long rates, the historically reliable recession signal that appeared across 2022–2024 before normalizing. A good model must be able to produce all three shapes without re-tuning, and it must let the curve evolve randomly, because that randomness is exactly what interest-rate options pay off on.

WHY r MATTERS NOW

In Quant 03 we held \(r\) constant and known — harmless for a three-month equity call. It is not harmless for a 30-year swap, a callable bond, or a swaption, where the underlying is the rate and its volatility drives the price. Treating \(r\) as a stochastic process is not a refinement here; it is the whole subject.

A zero-coupon bond maturing in \(T - t = 5\) years trades at \(P = 0.8187\) per dollar of face. What is its continuously-compounded yield \(y = -\ln(P)/(T-t)\)? (Use \(\ln 0.8187 = -0.2000\).)
\(y = -\dfrac{\ln 0.8187}{5} = -\dfrac{-0.2000}{5} = \dfrac{0.2000}{5} = \) 0.04, that is, a 4% continuously-compounded yield at the five-year point.
4.2

Short-rate models — the framework

A short-rate model posits a single stochastic differential equation (SDE) for the instantaneous short rate \(r_t\) under the risk-neutral measure \(\mathbb{Q}\), then derives every bond price as a risk-neutral expectation — the same machine as Quant 03's EQ Q3.1, now applied to the rate that does the discounting:

EQ Q4.2 — BOND PRICE AS A RISK-NEUTRAL EXPECTATION $$ P(t,T) \;=\; \mathbb{E}^{\mathbb{Q}}\!\left[\, \exp\!\Big(-\!\int_t^T r_s\,\mathrm{d}s\Big)\;\Big|\;\mathcal{F}_t \,\right] $$
The bond pays \$1 at \(T\); its value today is the expected discount factor, where discounting itself is now random because \(r_s\) is random. This single formula reduces every short-rate model to one question: can we compute that expectation? When \(r_t\) is a Gaussian or square-root diffusion the integral \(\int_t^T r_s\,\mathrm{d}s\) is tractable and the answer is a closed-form exponential — the defining luxury of the models in this chapter.

A model is called affine when the resulting yield is a linear (affine) function of the short rate. Affine models are the backbone of the field because they collapse the expectation in EQ Q4.2 into an exponential of \(r_t\):

EQ Q4.3 — THE AFFINE TERM-STRUCTURE FORM $$ P(t,T) = A(t,T)\,e^{-B(t,T)\,r_t} \qquad\Longleftrightarrow\qquad y(t,T) = \frac{B(t,T)}{T-t}\,r_t \;-\; \frac{\ln A(t,T)}{T-t} $$
\(A\) and \(B\) are deterministic functions of \(t\) and \(T\), solved from ordinary differential equations (Riccati equations) implied by the SDE. Vasicek, CIR and Hull–White are all affine, which is precisely why each admits a closed-form bond price. \(B(t,T)\) is a duration-like sensitivity: it measures how strongly the bond's log-price responds to a move in the short rate. In Vasicek and Hull–White it grows with maturity but saturates at \(1/a\), so mean reversion caps the rate sensitivity of even very long bonds.

Two more pieces of vocabulary that organize the whole zoo. A model is an equilibrium model if you specify the dynamics and read off whatever curve they imply (Vasicek, CIR); it is a no-arbitrage model if you instead let parameters become time-dependent so the model reproduces today's observed curve exactly (Hull–White, HJM). Equilibrium models are honest about the economics but generically misprice traded bonds on day one; no-arbitrage models fit by construction, at the cost of a function rather than a number to estimate. Production desks use no-arbitrage models because mispricing the hedging instruments is not an option.

ONE-FACTOR LIMITATION

Every model in this chapter is driven by a single Brownian motion. That means all points on the curve move in lockstep — a one-factor model cannot produce a yield curve that simultaneously steepens at the front and flattens at the back, and it forces perfect correlation between all rates. Principal-component analysis of real curves shows roughly three independent factors (level, slope, curvature). One-factor models survive because they are tractable and calibrate adequately to a single product; serious curve-and-vol work uses multi-factor or HJM/LMM frameworks (beyond this chapter).
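
The lockstep claim can be made concrete with a few lines. This is a minimal sketch assuming the Vasicek form of \(B\) (stated later in the chapter) and a hypothetical speed \(a = 0.3\): because the affine yield is \(y = (B/\tau)\,r_t - \ln A/\tau\), every yield change is the same short-rate shock scaled by a maturity-dependent loading, so the correlation between any two yields is one.

```python
import numpy as np

a = 0.3                                        # hypothetical mean-reversion speed

def B(tau):
    """Vasicek B(t,T) as a function of time to maturity tau = T - t."""
    return (1 - np.exp(-a * tau)) / a

rng = np.random.default_rng(1)
dr = 0.001 * rng.standard_normal(10_000)       # arbitrary short-rate shocks

# For a fixed time to maturity, dy = (B/tau) dr: one shock, different loadings
dy_2y = B(2.0) / 2.0 * dr
dy_10y = B(10.0) / 10.0 * dr
corr = np.corrcoef(dy_2y, dy_10y)[0, 1]

print(f"loading on 2y yield  : {B(2.0) / 2.0:.4f}")
print(f"loading on 10y yield : {B(10.0) / 10.0:.4f}")
print(f"corr(dy_2y, dy_10y)  : {corr:.6f}")
```

The loadings differ, so the model can make long yields move less than short yields, but it cannot make them move independently.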

4.3

Vasicek: mean reversion in its purest form

Vasicek (1977) wrote the simplest equation that captures the central stylized fact. The short rate is an Ornstein–Uhlenbeck process: a drift that always points back toward a long-run level, plus constant-magnitude Gaussian noise.

EQ Q4.4 — THE VASICEK SDE $$ \mathrm{d}r_t \;=\; a\,(b - r_t)\,\mathrm{d}t \;+\; \sigma\,\mathrm{d}W_t $$
\(a > 0\) is the speed of mean reversion (how hard the rate is pulled home), \(b\) the long-run mean level it is pulled toward, and \(\sigma\) the instantaneous volatility. When \(r_t > b\) the drift \(a(b - r_t)\) is negative and the rate falls; when \(r_t < b\) it rises. Without the noise, \(r_t\) decays exponentially to \(b\) with time-constant \(1/a\); the noise keeps it perpetually wandering around \(b\). The half-life of a shock is \(\ln 2 / a\).

Because the equation is linear with additive Gaussian noise, it solves in closed form. Conditional on today's rate, the future rate is normally distributed:

EQ Q4.5 — VASICEK: MEAN, VARIANCE, STATIONARY LAW $$ \mathbb{E}[r_T \mid r_t] = b + (r_t - b)\,e^{-a(T-t)}, \qquad \mathrm{Var}[r_T \mid r_t] = \frac{\sigma^2}{2a}\Big(1 - e^{-2a(T-t)}\Big) $$ $$ r_\infty \sim \mathcal{N}\!\left(b,\ \frac{\sigma^2}{2a}\right) $$
The conditional mean is a weighted average of the start \(r_t\) and the target \(b\), with the weight on \(b\) growing as \(e^{-a(T-t)} \to 0\). As \(T \to \infty\) the mean converges to \(b\) and the variance saturates at \(\sigma^2/2a\) — the stationary variance. So Vasicek's long horizon is a Gaussian bell centered on \(b\) with spread set by the noise-to-reversion ratio. Faster reversion (larger \(a\)) or smaller \(\sigma\) gives a tighter long-run distribution.
A Vasicek short rate follows \( \mathrm{d}r_t = a(b - r_t)\,\mathrm{d}t + \sigma\,\mathrm{d}W_t \) with reversion level \( b = 0.03 \), speed \( a = 0.4 \), and vol \( \sigma = 0.01 \). What is its long-run (stationary) mean, \( \lim_{T\to\infty} \mathbb{E}[r_T] \)?
The Ornstein–Uhlenbeck process reverts to its level parameter, so \( \lim_{T\to\infty}\mathbb{E}[r_T] = b = \) 0.03. The speed \(a\) and vol \(\sigma\) set how fast it gets there and how wide it wanders, but not where it centers.
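
Since EQ Q4.5 gives the full conditional law, a path can be advanced by sampling that Gaussian directly rather than by discretizing the SDE. The sketch below assumes the quiz parameters plus a hypothetical starting rate of 6%; `vasicek_exact_step` is an illustrative helper name.

```python
import numpy as np

def vasicek_exact_step(r, a, b, sigma, dt, rng):
    """Draw r_{t+dt} given r_t from the exact Gaussian transition law (EQ Q4.5)."""
    mean = b + (r - b) * np.exp(-a * dt)
    std = np.sqrt(sigma**2 / (2 * a) * (1 - np.exp(-2 * a * dt)))
    return mean + std * rng.standard_normal(np.shape(r))

rng = np.random.default_rng(0)
a, b, sigma, r0 = 0.4, 0.03, 0.01, 0.06        # r0 is a hypothetical start
r = np.full(200_000, r0)
for _ in range(5):                             # five coarse one-year steps
    r = vasicek_exact_step(r, a, b, sigma, 1.0, rng)

T = 5.0
mean_theory = b + (r0 - b) * np.exp(-a * T)
var_theory = sigma**2 / (2 * a) * (1 - np.exp(-2 * a * T))
print(f"mean at T=5 : {r.mean():.5f}   (theory {mean_theory:.5f})")
print(f"var  at T=5 : {r.var():.4e}   (theory {var_theory:.4e})")
```

The step size here is a full year, yet the moments still match theory up to sampling noise, because the transition is sampled exactly and carries no discretization bias.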

Because Vasicek is affine, the bond-price coefficients of EQ Q4.3 have explicit closed forms:

EQ Q4.6 — VASICEK ZERO-COUPON BOND $$ B(t,T) = \frac{1 - e^{-a(T-t)}}{a}, \qquad \ln A(t,T) = \Big(b - \frac{\sigma^2}{2a^2}\Big)\big(B(t,T) - (T-t)\big) - \frac{\sigma^2}{4a}\,B(t,T)^2 $$
Plug into \(P = A\,e^{-B r_t}\) for an exact price; compute the yield \(y(t,T) = -\ln P(t,T)/(T-t)\) across maturities and you can produce upward, flat, humped and inverted curves by moving \(r_t\) relative to \(b\). The \(-\sigma^2/2a^2\) term is a convexity correction: volatility lowers long yields, because a bond price is convex in rates and Jensen's inequality bites. Closed-form bonds plus Gaussian \(r_T\) also make European bond options and swaptions closed-form (Jamshidian's trick), which is why Vasicek's Gaussian descendant, Hull–White, runs trading desks.
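
One way to see where the coefficients in EQ Q4.6 come from is to integrate the Riccati system numerically and compare. The ODEs below are an assumption of this sketch: they are what substituting \(P = A\,e^{-B r}\) into the Vasicek pricing equation yields in time to maturity \(\tau = T - t\), namely \(B' = 1 - aB\) and \((\ln A)' = -ab\,B + \tfrac12\sigma^2 B^2\) with both starting at zero. Parameter values are hypothetical.

```python
import numpy as np

a, b, sigma = 0.3, 0.05, 0.02                  # hypothetical parameters

def riccati_rhs(state):
    """d/dtau of (B, ln A) for the Vasicek SDE, with tau = T - t."""
    B, lnA = state
    return np.array([1.0 - a * B,
                     -a * b * B + 0.5 * sigma**2 * B**2])

def solve_riccati(tau, steps=2000):
    """Classical fourth-order Runge-Kutta from (B, ln A) = (0, 0) at tau = 0."""
    h = tau / steps
    s = np.zeros(2)
    for _ in range(steps):
        k1 = riccati_rhs(s)
        k2 = riccati_rhs(s + 0.5 * h * k1)
        k3 = riccati_rhs(s + 0.5 * h * k2)
        k4 = riccati_rhs(s + h * k3)
        s = s + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return s

tau = 10.0
B_num, lnA_num = solve_riccati(tau)
B_cf = (1 - np.exp(-a * tau)) / a                                   # EQ Q4.6
lnA_cf = (b - sigma**2 / (2 * a**2)) * (B_cf - tau) - sigma**2 / (4 * a) * B_cf**2

print(f"B    : ODE {B_num:.8f}   closed form {B_cf:.8f}")
print(f"ln A : ODE {lnA_num:.8f}   closed form {lnA_cf:.8f}")
```

The same solver works unchanged for any affine model once `riccati_rhs` is swapped out, which is the practical appeal of the affine class when no closed form is available.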

The famous flaw. Because \(r_T\) is Gaussian, Vasicek assigns positive probability to negative rates. For decades this was treated as a fatal defect, until policy rates in the euro area, Japan, Switzerland and elsewhere actually went negative from 2014 into the early 2020s, and the "flaw" became a feature. The honest verdict in 2026: negative-rate capability is sometimes exactly what you want, but Vasicek's symmetric Gaussian tail can still send rates implausibly far below zero, and it cannot reproduce the way real volatility rises with the level of rates. That second point is what CIR fixes.
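
Because the law of \(r_T\) is Gaussian with the moments of EQ Q4.5, the probability of a negative rate is a single normal-CDF evaluation. A minimal sketch, with hypothetical parameter values and an illustrative function name:

```python
from math import erf, exp, sqrt

def vasicek_prob_negative(r0, a, b, sigma, T):
    """P(r_T < 0 | r_0) under Vasicek, from the Gaussian law in EQ Q4.5."""
    mean = b + (r0 - b) * exp(-a * T)
    std = sqrt(sigma**2 / (2 * a) * (1 - exp(-2 * a * T)))
    z = (0.0 - mean) / std
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))    # standard normal CDF at z

# Hypothetical parameters: 3% long-run level, 1.5% vol, 30-year horizon
p_base = vasicek_prob_negative(r0=0.06, a=0.4, b=0.03, sigma=0.015, T=30.0)
p_high_vol = vasicek_prob_negative(r0=0.06, a=0.4, b=0.03, sigma=0.03, T=30.0)
p_low_level = vasicek_prob_negative(r0=0.06, a=0.4, b=0.01, sigma=0.015, T=30.0)

print(f"P(r_T < 0), base case     : {100 * p_base:.2f} %")
print(f"P(r_T < 0), doubled sigma : {100 * p_high_vol:.2f} %")
print(f"P(r_T < 0), b = 1%        : {100 * p_low_level:.2f} %")
```

Raising \(\sigma\) or lowering \(b\) both fatten the negative tail, which is the sensitivity the paragraph above describes.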

INSTRUMENT Q4.1 — VASICEK PATH SIMULATOR · EQ Q4.4 · EULER · 5 PATHS · SEEDED
READOUTS: SHOCK HALF-LIFE (ln2 / a) · STATIONARY MEAN · STATIONARY STD √(σ²/2a)
The dashed mint line is the long-run level \(b\); the shaded band is the stationary \(\pm 1\sigma_\infty = \pm\sqrt{\sigma^2/2a}\) corridor. Start \(r_0\) high and watch every path decay toward \(b\) at rate \(a\) — raise \(a\) and the pull snaps the paths home in a fraction of the window; drop \(a\) toward zero and the band balloons as reversion can no longer contain the noise. Push \(b\) low and \(\sigma\) high and some paths dip below zero — Vasicek's Gaussian tail made visible.
PYTHON · RUNNABLE IN-BROWSER
# Simulate Vasicek short-rate paths; check the long-run mean & variance (EQ Q4.5)
import numpy as np
rng = np.random.default_rng(0)

a, b, sig, r0 = 0.4, 0.03, 0.015, 0.06     # speed, level, vol, start
T, dt = 30.0, 1/52                          # 30 years, weekly steps
n = int(T/dt); paths = 4000
r = np.full(paths, r0)
for _ in range(n):                          # Euler-Maruyama on dr = a(b-r)dt + sig dW
    r += a*(b - r)*dt + sig*np.sqrt(dt)*rng.standard_normal(paths)

print(f"simulated mean at T : {r.mean():.5f}   (theory b = {b})")
print(f"simulated var  at T : {r.var():.6e}   (theory sig^2/2a = {sig**2/(2*a):.6e})")
print(f"simulated std  at T : {r.std():.5f}   (theory = {np.sqrt(sig**2/(2*a)):.5f})")
print(f"fraction of paths below zero : {100*np.mean(r < 0):.2f} %  <- Vasicek can go negative")
print(f"shock half-life ln2/a        : {np.log(2)/a:.2f} years")
INSTRUMENT Q4.2 — YIELD-CURVE SHAPER · EQ Q4.6 · VASICEK · y(0,T) vs MATURITY
READOUTS: SHAPE · 30Y YIELD · CONVEXITY DROP @30Y
The same SDE makes every curve shape. Set the short rate \(r_0\) below the long-run level \(b\) for the textbook upward slope; set it above \(b\) and the curve inverts — the recession signal of 2022–24, here a one-slider phenomenon. The asymptotic long yield is \(b - \sigma^2/2a^2\); crank \(\sigma\) and watch the entire long end sag below \(b\) as the convexity correction grows. With \(\sigma = 0\) the curve is the pure expectation of future short rates.
PYTHON · RUNNABLE IN-BROWSER
# Price a zero-coupon bond under Vasicek: closed form (EQ Q4.6) vs Monte Carlo (EQ Q4.2)
import numpy as np
rng = np.random.default_rng(0)

a, b, sig, r0, T = 0.3, 0.05, 0.02, 0.03, 5.0

def vasicek_bond(a, b, sig, r0, T):                 # closed form P(0,T)
    B = (1 - np.exp(-a*T)) / a
    lnA = (b - sig**2/(2*a**2))*(B - T) - (sig**2/(4*a))*B**2
    return np.exp(lnA) * np.exp(-B*r0)

# Monte Carlo: simulate r, discount by exp(-integral r ds) per EQ Q4.2
dt = 1/250; n = int(T/dt); paths = 60000
r = np.full(paths, r0); acc = np.zeros(paths)
for _ in range(n):
    acc += r*dt                                     # accumulate integral of r
    r   += a*(b - r)*dt + sig*np.sqrt(dt)*rng.standard_normal(paths)
mc = np.exp(-acc).mean()
se = np.exp(-acc).std()/np.sqrt(paths)
cf = vasicek_bond(a, b, sig, r0, T)

print(f"closed-form  P(0,{T:.0f}) : {cf:.6f}   yield {-np.log(cf)/T:.5f}")
print(f"Monte-Carlo  P(0,{T:.0f}) : {mc:.6f}  (+/- {1.96*se:.6f}, 95% CI)")
print(f"gap (MC - CF)            : {mc - cf:+.6f}")
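
The curve shapes described for the yield-curve instrument can also be reproduced offline from EQ Q4.6. This is a sketch under hypothetical parameters; the three starting rates are chosen to land in the upward, inverted and humped regimes, and the last check compares a very long yield with the asymptote \(b - \sigma^2/2a^2\).

```python
import numpy as np

def vasicek_yield(r0, a, b, sigma, T):
    """Continuously-compounded yield y(0,T) from the closed form in EQ Q4.6."""
    T = np.asarray(T, dtype=float)
    B = (1 - np.exp(-a * T)) / a
    lnA = (b - sigma**2 / (2 * a**2)) * (B - T) - sigma**2 / (4 * a) * B**2
    return (B * r0 - lnA) / T

a, b, sigma = 0.3, 0.05, 0.02                  # hypothetical parameters
grid = np.linspace(0.05, 60.0, 1200)

y_up = vasicek_yield(0.02, a, b, sigma, grid)      # r0 well below b
y_inv = vasicek_yield(0.08, a, b, sigma, grid)     # r0 well above b
y_hump = vasicek_yield(0.048, a, b, sigma, grid)   # r0 just under b

y_long = b - sigma**2 / (2 * a**2)                 # asymptotic long yield
y_1000 = float(vasicek_yield(0.02, a, b, sigma, 1000.0))

for name, y in [("upward", y_up), ("inverted", y_inv), ("humped", y_hump)]:
    print(f"{name:9s}: y(0.05)={y[0]:.4f}  max at T={grid[np.argmax(y)]:.2f}  y(60)={y[-1]:.4f}")
print(f"asymptote b - sig^2/2a^2 = {y_long:.5f},  y(0,1000) = {y_1000:.5f}")
```

The humped case shows that the boundary between shapes is not exactly at \(r_0 = b\): the convexity correction shifts it slightly, so a short rate a little below \(b\) already produces a curve that rises and then falls.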
4.4

Cox–Ingersoll–Ross: the square-root fix

Cox, Ingersoll and Ross (1985) kept Vasicek's mean-reverting drift but multiplied the noise by \(\sqrt{r_t}\). One small change buys two important properties.

EQ Q4.7 — THE CIR SDE $$ \mathrm{d}r_t \;=\; a\,(b - r_t)\,\mathrm{d}t \;+\; \sigma\,\sqrt{r_t}\;\mathrm{d}W_t $$
Same drift as Vasicek, but the diffusion now scales with \(\sqrt{r_t}\). As \(r_t \to 0\) the volatility vanishes, so the noise cannot push the rate through zero — the drift \(ab > 0\) at the boundary deterministically lifts it back up. The result: rates that get small naturally calm down, matching the empirical fact that interest-rate volatility tends to rise with the level of rates. The conditional law of \(r_T\) is a (scaled) non-central chi-squared, not a normal.

Whether zero is truly unreachable depends on a sharp threshold. The Feller condition states the boundary at zero is inaccessible — rates stay strictly positive with probability one — exactly when:

EQ Q4.8 — THE FELLER CONDITION $$ 2\,a\,b \;\ge\; \sigma^2 $$
When the mean-reversion "budget" \(2ab\) at the origin dominates the noise \(\sigma^2\), the process never touches zero. If \(2ab < \sigma^2\) the rate can hit zero (and reflect off it) but, crucially, still never goes negative — the square-root diffusion guarantees \(r_t \ge 0\) regardless. This non-negativity is CIR's headline difference from Vasicek, and the reason CIR became the standard whenever the modelled quantity (a nominal rate, a default intensity, a stochastic variance) must stay positive — it is the same square-root process Heston used for variance in Quant 03's EQ Q3.7.
One of the two models in this chapter can produce negative short rates; the other cannot. Which model rules out negative rates by construction — Vasicek or CIR? (Answer with the model name.)
Vasicek's additive Gaussian noise has support on the whole real line, so it can drift below zero. CIR multiplies its noise by \(\sqrt{r_t}\), which vanishes at the boundary and lets the positive drift push the rate back up — keeping \(r_t \ge 0\) always. The model that rules out negative rates is CIR.
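
Both claims, the Feller threshold and non-negativity regardless of it, can be checked by simulation. The sketch below samples the CIR transition exactly as a scaled non-central chi-squared, using NumPy's `Generator.noncentral_chisquare`; the scaling constants are the standard ones for the square-root process and are an assumption of this example rather than something stated in the text. Parameters are hypothetical and deliberately violate Feller.

```python
import numpy as np

def feller_ok(a, b, sigma):
    """True when 2ab >= sigma^2 (EQ Q4.8), i.e. zero is inaccessible."""
    return 2 * a * b >= sigma**2

def cir_exact_step(r, a, b, sigma, dt, rng):
    """Draw r_{t+dt} given r_t from the scaled non-central chi-squared law."""
    c = sigma**2 * (1 - np.exp(-a * dt)) / (4 * a)   # scale factor
    df = 4 * a * b / sigma**2                        # degrees of freedom
    nonc = r * np.exp(-a * dt) / c                   # non-centrality
    return c * rng.noncentral_chisquare(df, nonc)

rng = np.random.default_rng(0)
a, b, sigma, r0 = 0.4, 0.03, 0.25, 0.03        # 2ab = 0.024 < sigma^2 = 0.0625
r = np.full(100_000, r0)
for _ in range(20):                            # 20 quarterly steps, T = 5 years
    r = cir_exact_step(r, a, b, sigma, 0.25, rng)

print(f"Feller satisfied?     : {feller_ok(a, b, sigma)}")
print(f"minimum simulated r_T : {r.min():.3e}")
print(f"mean of r_T           : {r.mean():.5f}   (theory {b:.5f} since r0 = b)")
print(f"share with r_T < 1bp  : {100 * np.mean(r < 1e-4):.2f} %")
```

With Feller violated a visible share of paths sits essentially at zero, yet none is negative. A naive Euler scheme would not share this guarantee, since \(\sqrt{r_t}\) is undefined once a discretized step overshoots below zero.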

CIR is still affine, so the bond price keeps the \(P = A\,e^{-B r_t}\) form — only the coefficients change, now built from \(\gamma = \sqrt{a^2 + 2\sigma^2}\):

EQ Q4.9 — CIR ZERO-COUPON BOND $$ B(t,T) = \frac{2\big(e^{\gamma(T-t)} - 1\big)}{(\gamma + a)\big(e^{\gamma(T-t)} - 1\big) + 2\gamma}, \qquad A(t,T) = \left[\frac{2\gamma\,e^{(a+\gamma)(T-t)/2}}{(\gamma + a)\big(e^{\gamma(T-t)} - 1\big) + 2\gamma}\right]^{\!2ab/\sigma^2} $$
Messier than Vasicek but just as closed-form, with \(\gamma = \sqrt{a^2 + 2\sigma^2}\). As \(\sigma \to 0\) these collapse to the deterministic-rate discount factor, and for small \(\sigma\) they track Vasicek's coefficients closely — the two models only diverge meaningfully when rates approach zero or volatility is large. The price of CIR's realism is that its conditional distribution is non-central \(\chi^2\), so simulation and option pricing are more involved than Vasicek's clean Gaussian.
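
EQ Q4.9 translates directly into code, and the \(\sigma \to 0\) limit gives a convenient sanity check. In this sketch the deterministic benchmark is the discount factor along the noiseless path \(r_s = b + (r_0 - b)e^{-as}\); parameter values and function names are hypothetical.

```python
import numpy as np

def cir_bond(r0, a, b, sigma, tau):
    """CIR zero-coupon bond price P(0,tau) from EQ Q4.9."""
    g = np.sqrt(a**2 + 2 * sigma**2)
    denom = (g + a) * (np.exp(g * tau) - 1) + 2 * g
    B = 2 * (np.exp(g * tau) - 1) / denom
    lnA = (2 * a * b / sigma**2) * np.log(2 * g * np.exp((a + g) * tau / 2) / denom)
    return np.exp(lnA - B * r0)

def deterministic_bond(r0, a, b, tau):
    """Discount factor when sigma = 0, so r follows its mean path exactly."""
    integral = b * tau + (r0 - b) * (1 - np.exp(-a * tau)) / a
    return np.exp(-integral)

r0, a, b, tau = 0.03, 0.3, 0.05, 5.0           # hypothetical parameters
p_det = deterministic_bond(r0, a, b, tau)
p_small = cir_bond(r0, a, b, 1e-3, tau)        # nearly noiseless
p_vol = cir_bond(r0, a, b, 0.10, tau)          # sigma = 10%

print(f"deterministic   : {p_det:.6f}")
print(f"CIR, sigma=0.001: {p_small:.6f}")
print(f"CIR, sigma=0.10 : {p_vol:.6f}")
```

The volatile price sits above the deterministic one: the expected path of \(r\) does not depend on \(\sigma\), so by Jensen's inequality any randomness raises the expected discount factor.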
INSTRUMENT Q4.3 — CIR vs VASICEK · THE ZERO FLOOR · SAME a, b, σ · TERMINAL DENSITY OF r_T
READOUTS: FELLER 2ab vs σ² · VASICEK P(r_T < 0) · CIR P(r_T < 0) = 0.00%
Both densities are the law of \(r_T\) started at \(r_0 = b\) under identical \((a,b,\sigma)\). The mint curve is Vasicek's symmetric Gaussian — push \(\sigma\) up or \(b\) down and watch its left tail spill across the zero line, the shaded negative-rate region. The blue curve is CIR's non-central \(\chi^2\): it is right-skewed, pinned at zero, and never assigns mass to negative rates. When the Feller readout turns red (\(2ab < \sigma^2\)) CIR can touch zero but still reflects — it never crosses.
4.5

Hull–White & fitting the curve exactly

Vasicek and CIR are equilibrium models: feed them constant parameters and they produce a curve, which will generally not match the curve quoted in the market this morning. For a trading desk that hedges with real bonds, a model that misprices its own hedging instruments at \(t = 0\) is unusable. Hull and White (1990) fixed this with one elegant move: make the drift's target time-dependent.

EQ Q4.10 — THE HULL–WHITE (EXTENDED VASICEK) SDE $$ \mathrm{d}r_t \;=\; \big(\theta(t) - a\,r_t\big)\,\mathrm{d}t \;+\; \sigma\,\mathrm{d}W_t $$
This is Vasicek with the constant drift term \(ab\) promoted to a deterministic function \(\theta(t)\), so the reversion target becomes \(\theta(t)/a\). That single degree of freedom (a whole function, not a number) is exactly enough to reproduce today's observed term structure perfectly, by construction. \(a\) and \(\sigma\) are left to calibrate the model's volatility (the prices of caps and swaptions), while \(\theta(t)\) absorbs the shape of the initial curve. It is the workhorse of interest-rate desks.

The function \(\theta(t)\) is not guessed — it is read off the market forward curve \(f(0,t)\) so that EQ Q4.2 returns the observed bond prices:

EQ Q4.11 — CALIBRATING θ(t) TO THE MARKET CURVE $$ \theta(t) \;=\; \frac{\partial f(0,t)}{\partial t} \;+\; a\,f(0,t) \;+\; \frac{\sigma^2}{2a}\Big(1 - e^{-2at}\Big) $$
Here \(f(0,t)\) is the instantaneous forward rate observed today (EQ Q4.1). The first two terms make the model's expected rate track the forward curve; the third is the same convexity correction seen in Vasicek. Once \(\theta(t)\) is fixed this way, the model fits every quoted zero-coupon bond exactly and Hull–White retains Vasicek's Gaussian tractability — closed-form bonds, bond options, and (via Jamshidian's decomposition) European swaptions. The cost, inherited from Vasicek, is that rates can still go negative.
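
The "fits by construction" claim can be checked without Monte Carlo. The sketch below assumes a hypothetical smooth market forward curve, builds \(\theta(t)\) from EQ Q4.11, integrates the mean equation \(m'(t) = \theta(t) - a\,m(t)\) implied by EQ Q4.10, and then prices a bond using the Gaussian identity \(P = \exp(-\mathbb{E}[I] + \tfrac12\mathrm{Var}[I])\) with \(I = \int_0^T r_s\,\mathrm{d}s\). The variance formula is the standard one for an Ornstein–Uhlenbeck integral and is an assumption of this example.

```python
import numpy as np

a, sigma = 0.1, 0.01                           # hypothetical speed and vol

def fwd(t):                                    # hypothetical market forward curve f(0,t)
    return 0.03 + 0.02 * (1 - np.exp(-0.3 * t))

def dfwd(t):                                   # its derivative with respect to t
    return 0.02 * 0.3 * np.exp(-0.3 * t)

def theta(t):                                  # EQ Q4.11
    return dfwd(t) + a * fwd(t) + sigma**2 / (2 * a) * (1 - np.exp(-2 * a * t))

# Mean of r_t under EQ Q4.10: m'(t) = theta(t) - a m(t), m(0) = f(0,0), solved by RK4
T, n = 10.0, 4000
h = T / n
t = np.linspace(0.0, T, n + 1)
m = np.empty(n + 1)
m[0] = fwd(0.0)
for i in range(n):
    k1 = theta(t[i]) - a * m[i]
    k2 = theta(t[i] + h / 2) - a * (m[i] + h / 2 * k1)
    k3 = theta(t[i] + h / 2) - a * (m[i] + h / 2 * k2)
    k4 = theta(t[i] + h) - a * (m[i] + h * k3)
    m[i + 1] = m[i] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

mean_I = np.sum((m[1:] + m[:-1]) / 2) * h      # trapezoid rule for E[I]
B = (1 - np.exp(-a * T)) / a
var_I = sigma**2 / a**2 * (T - 2 * B + (1 - np.exp(-2 * a * T)) / (2 * a))

model_P = np.exp(-mean_I + 0.5 * var_I)
market_P = np.exp(-(0.05 * T - 0.02 * (1 - np.exp(-0.3 * T)) / 0.3))  # exp(-integral of fwd)

print(f"market P(0,{T:.0f}) : {market_P:.8f}")
print(f"model  P(0,{T:.0f}) : {model_P:.8f}")
```

The expected short rate runs slightly above the forward curve, and that excess is exactly offset by the convexity term \(\tfrac12\mathrm{Var}[I]\), which is why the model bond price lands back on the market curve.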

That trade-off is the whole reason the model zoo exists. There is no single best short-rate model; there is a menu of compromises along three axes — tractability, realism, and exact fit to the market — and which corner you choose depends on the product you must price and hedge.

| Model | SDE | r < 0? | Fits today's curve? | Where it wins |
|---|---|---|---|---|
| Vasicek | dr = a(b − r)dt + σ dW | yes | no (equilibrium) | The pedagogical baseline; cleanest closed forms. |
| CIR | dr = a(b − r)dt + σ√r dW | no (r ≥ 0) | no (equilibrium) | When positivity matters: nominal rates, default intensity, variance. |
| Hull–White | dr = (θ(t) − a r)dt + σ dW | yes | yes (no-arbitrage) | Production desks; exact curve fit + closed-form swaptions. |
| Black–Karasinski | d ln r = (θ(t) − a ln r)dt + σ dW | no (r > 0) | yes | Lognormal rate, positive + curve-fitting; no closed-form bond. |
| G2++ / HW two-factor | two correlated Gaussian factors | yes | yes | Realistic curve moves (de-correlated front vs back). |

Where the field sits in 2026. Low-dimensional Gaussian models (one-factor Hull–White, two-factor G2++) remain the default for vanilla rate derivatives because they are fast and calibrate cleanly. For the full smile of caps and swaptions, desks layer on the SABR stochastic-vol model per expiry/tenor, or move to the LIBOR/forward Market Model (LMM) framework, which models observable forward rates directly rather than the unobservable instantaneous short rate. The post-2021 transition from LIBOR to risk-free overnight benchmarks (SOFR, €STR, SONIA) reshaped the plumbing (discounting, fixings, and convexity adjustments) but left the short-rate mathematics of this chapter intact: SOFR-based curves are still bootstrapped to zero-coupon bonds, and Hull–White still prices the options on them.

NEXT

Three of this chapter's instruments leaned on Monte-Carlo when no closed form was at hand — the Vasicek bond cross-check, CIR's non-central \(\chi^2\), every path-dependent exotic. Quant 05 makes that the main event: simulating SDEs properly (Euler vs Milstein, where discretization bias hides), variance reduction (antithetics, control variates, the closed-form Vasicek bond as its own control), quasi-random sequences, and why the same engine that priced these bonds prices the whole derivatives book.

4.R

References

  1. Vasicek, O. (1977). An Equilibrium Characterization of the Term Structure. Journal of Financial Economics 5(2) — the Ornstein–Uhlenbeck short rate and its closed-form bond (EQ Q4.4–Q4.6).
  2. Cox, J. C., Ingersoll, J. E. & Ross, S. A. (1985). A Theory of the Term Structure of Interest Rates. Econometrica 53(2) — the square-root diffusion, the Feller condition, and non-negative rates (EQ Q4.7–Q4.9).
  3. Hull, J. & White, A. (1990). Pricing Interest-Rate-Derivative Securities. Review of Financial Studies 3(4) — time-dependent drift that fits the initial curve exactly (EQ Q4.10–Q4.11).
  4. Jamshidian, F. (1989). An Exact Bond Option Formula. Journal of Finance 44(1) — decomposes a swaption into a portfolio of bond options, making Gaussian models swaption-closed-form.
  5. Heath, D., Jarrow, R. & Morton, A. (1992). Bond Pricing and the Term Structure of Interest Rates: A New Methodology. Econometrica 60(1) — the HJM no-arbitrage framework that generalizes all short-rate models to the whole forward curve.
  6. Brigo, D. & Mercurio, F. (2006). Interest Rate Models — Theory and Practice (2nd ed.). Springer Finance — the standard practitioner reference for Vasicek, CIR, Hull–White, G2++, and calibration.