---
name: finite-horizon-lqr
description: Solving finite-horizon LQR via dynamic programming for MPC.
---
# Finite-Horizon LQR for MPC
## Problem Formulation
Minimize the quadratic cost over a horizon of N steps, subject to the linear dynamics x_{k+1} = A x_k + B u_k:
```
J = Σ(k=0 to N-1) [x_k' Q x_k + u_k' R u_k] + x_N' P_N x_N
```
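To make the cost concrete, here is a minimal sketch that evaluates J for a zero-input rollout; the double-integrator matrices and horizon are illustrative assumptions, not part of the skill:

```python
import numpy as np

# Illustrative double integrator (dt = 0.1); all values are assumptions
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])
P_N = Q  # simple terminal-cost default

def trajectory_cost(xs, us):
    # xs holds N+1 states, us holds N inputs
    J = sum(float(x @ Q @ x + u @ R @ u) for x, u in zip(xs[:-1], us))
    return J + float(xs[-1] @ P_N @ xs[-1])

x0 = np.array([1.0, 0.0])
us = [np.array([0.0])] * 5  # zero-input rollout over N = 5 steps
xs = [x0]
for u in us:
    xs.append(A @ xs[-1] + B @ u)
J = trajectory_cost(xs, us)  # position error persists, so each stage costs 1
```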
## Backward Riccati Recursion
Initialize: P_N = Q (or the infinite-horizon Riccati solution when closed-loop stability matters)
For k = N-1 down to 0:
```python
K_k = np.linalg.solve(R + B.T @ P_next @ B, B.T @ P_next @ A)  # P_next = P_{k+1}
P_k = Q + A.T @ P_next @ (A - B @ K_k)
```
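The recursion can be checked by hand on a scalar toy system (A = B = Q = R = 1, an illustrative assumption rather than anything from the skill itself):

```python
import numpy as np

# Scalar toy system: every matrix is 1x1 so the gains can be verified by hand
A = np.array([[1.0]]); B = np.array([[1.0]])
Q = np.array([[1.0]]); R = np.array([[1.0]])
N = 3

P = Q.copy()          # P_N = Q
gains = [None] * N
for k in range(N - 1, -1, -1):
    # K_k = (R + B'PB)^{-1} B'PA, computed via a linear solve
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    gains[k] = K
    P = Q + A.T @ P @ (A - B @ K)
```

Working backward by hand: K_2 = 1/(1+1) = 0.5, then P_2 = 1.5 gives K_1 = 1.5/2.5 = 0.6, and P_1 = 1.6 gives K_0 = 1.6/2.6 ≈ 0.615, showing how the gains grow as more horizon remains.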
## Forward Simulation
Starting from x_0:
```python
u_k = -K_k @ x_k
x_next = A @ x_k + B @ u_k  # x_{k+1}
```
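Combining the backward and forward passes on a scalar toy system (A = B = Q = R = 1, illustrative values only), the rollout contracts toward the origin:

```python
import numpy as np

# Scalar toy system; all numbers are illustrative assumptions
A = np.array([[1.0]]); B = np.array([[1.0]])
Q = np.array([[1.0]]); R = np.array([[1.0]])
N = 3

# Backward pass: compute the time-varying gains K_0 .. K_{N-1}
P = Q.copy()
K = [None] * N
for k in range(N - 1, -1, -1):
    K[k] = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K[k])

# Forward pass: u_k = -K_k x_k, x_{k+1} = A x_k + B u_k
x = np.array([1.0])
xs = [x]
for k in range(N):
    u = -K[k] @ x
    x = A @ x + B @ u
    xs.append(x)
```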
## Python Implementation
```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, N, x0):
    """Return the first optimal control u_0 = -K_0 @ x0."""
    nx, nu = A.shape[0], B.shape[1]
    K = np.zeros((nu, nx, N))
    P = Q.copy()  # terminal cost P_N = Q
    # Backward pass: Riccati recursion from k = N-1 down to 0
    for k in range(N - 1, -1, -1):
        K[:, :, k] = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K[:, :, k])
    # Return the first control
    return -K[:, :, 0] @ x0
```
## MPC Application
At each timestep:
1. Measure current state x
2. Solve finite-horizon LQR from x
3. Apply first control u_0
4. Repeat at the next timestep (receding horizon)
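The steps above can be sketched as a closed loop. The plant matrices below are a hypothetical double integrator, and for simplicity the plant equals the model; in practice the true plant differs and the feedback provides robustness:

```python
import numpy as np

# Hypothetical double integrator (dt = 0.1); values are illustrative
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])

def finite_horizon_lqr(A, B, Q, R, N, x0):
    # Backward Riccati pass; only the k = 0 gain is needed online
    P = Q.copy()
    K0 = None
    for k in range(N - 1, -1, -1):
        K0 = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K0)
    return -K0 @ x0  # first control u_0 = -K_0 x_0

x = np.array([1.0, 0.0])
for t in range(50):  # MPC loop: measure, solve, apply, repeat
    u = finite_horizon_lqr(A, B, Q, R, N=20, x0=x)
    x = A @ x + B @ u  # plant update (here: plant == model)
```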
This skill implements a finite-horizon Linear Quadratic Regulator (LQR) solver using dynamic programming for Model Predictive Control (MPC). It computes time-varying feedback gains over a specified horizon and returns the optimal first control action for the current state. The implementation is lightweight and suitable for online MPC loops where a quadratic cost and linear dynamics are assumed.
The solver runs a backward Riccati recursion from the end of the horizon to the present, producing a sequence of cost-to-go matrices and corresponding feedback gains. Only the first gain matters online: the applied control is u_0 = -K_0 x_0, where K_0 is the gain at time 0. In MPC use, this repeats every control step: measure the state, solve the finite-horizon LQR, apply the first control, and shift the horizon forward.
What should I pick for the terminal cost P_N?
Use the infinite-horizon Riccati solution if stability is critical; otherwise P_N = Q is a simple default.
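One numpy-only way to obtain that infinite-horizon terminal cost is to iterate the same Riccati recursion until it stops changing. This is a sketch with illustrative matrices; `scipy.linalg.solve_discrete_are` is the usual direct route:

```python
import numpy as np

# Illustrative double integrator; matrices are assumptions for the sketch
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])

# Iterate the Riccati recursion to a fixed point: P_N ~= DARE solution
P = Q.copy()
for _ in range(2000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P_next = Q + A.T @ P @ (A - B @ K)
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next
P_N = P
```

Using this P_N as the terminal cost makes the finite-horizon controller inherit the infinite-horizon cost-to-go at the end of the horizon, which is the standard stability argument for unconstrained MPC.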
Can this handle constraints on states or inputs?
No—finite-horizon LQR assumes unconstrained linear dynamics with quadratic costs. Use constrained MPC solvers for hard limits.
How often must I re-solve the finite-horizon problem?
Re-solve at every control timestep for receding-horizon MPC. If the dynamics and environment are unchanged, less frequent re-solves are possible, but they reduce reactivity to disturbances.