Stochastic Redistribution vs. LP Surrogates for Circuit Balancing
Abstract
We study circuit balancing when service points (SPs) exhibit stochastic load deviations with lognormally distributed overshoot probabilities. The operator periodically redistributes SPs across circuits to minimize imbalance. We model SP deviation dynamics, derive aggregate circuit loads, and compare a discrete Redistribution objective (RMS risk) with a linear LP surrogate (L1 risk). Our main theorem shows that, under exogenous stochastic dynamics and lognormal heterogeneity, the LP relaxation is exact almost surely, reducing a stochastic assignment problem to a convex program. We also delineate when this equivalence breaks (RMS objectives, capacities, topology) and how to extend it with MILP/MIQP formulations.
Components of the Paper
1) The Lognormal Formalization
- Introduce a global lognormal factor: \( \delta_j(t) = a_j\,\varepsilon_t \), with \( \varepsilon_t \) i.i.d. lognormal.
- This gives multiplicative separability (rank-1 in time), enabling all simplifications below.
2) RMS ↔ L1 Equivalence under Lognormal
With \( A_T=\tfrac{1}{T}\sum_t \varepsilon_t \) and \( B_T=\sqrt{\tfrac{1}{T}\sum_t \varepsilon_t^2} \), we show \( J_{L_2}(x) = \dfrac{B_T}{A_T}\, J_{L_1}(x) \). Because the ratio is independent of \(x\), both objectives have the same minimizer. This eliminates the need to solve the RMS objective directly in this setting.
3) The L1 Objective Collapse to a Small LP
The time dimension factors out; only \(m\) absolute-value epigraphs remain: \[ \min_{x,u}\ \sum_i u_i \quad \text{s.t.}\quad u_i \ge \sum_j a_j x_{ij},\ \ u_i \ge -\sum_j a_j x_{ij},\ \ \sum_i x_{ij}=1,\ 0\le x_{ij}\le 1. \] This LP is small, exact, and polynomial-time solvable; it admits a one-hot (integral) optimal solution.
Together, these three steps give the pipeline: hard nonlinear RMS → equivalent L1 → exact small LP.
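To make the pipeline concrete, the following minimal numeric check (our illustration, not from the paper) simulates the rank-1 model and verifies that the \(J_{L_2}/J_{L_1}\) ratio equals \(B_T/A_T\) for arbitrary assignments; all parameter values are arbitrary.

```python
# Sanity check of the three-step collapse (a sketch; parameters arbitrary).
# Assumes the rank-1 model delta_j(t) = a_j * eps_t with a global lognormal eps_t.
import numpy as np

rng = np.random.default_rng(0)
m, n, T = 4, 12, 500
a = rng.normal(size=n)                      # per-SP coefficients (signs arbitrary here)
eps = rng.lognormal(mean=0.0, sigma=0.5, size=T)
A_T, B_T = eps.mean(), np.sqrt((eps**2).mean())

for _ in range(3):                          # a few random one-hot assignments
    assign = rng.integers(0, m, size=n)
    s = np.array([a[assign == i].sum() for i in range(m)])   # s_i(x)
    J_L2 = sum(np.sqrt(((s_i * eps) ** 2).mean()) for s_i in s)
    J_L1 = sum(np.abs(s_i * eps).mean() for s_i in s)
    print(J_L2 / J_L1, B_T / A_T)           # identical ratio, independent of x
```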
1. Problem Setting (stochastic primitives)
We formalize the grid and the stochastic processes that drive deviations and, ultimately, redistribution decisions.
Let circuits \(\mathcal{C}=\{C_1,\dots,C_m\}\) and service points \(\mathcal{P}=\{P_1,\dots,P_n\}\). Each SP \(P_j\) is assigned to one circuit:
\[ x_{ij}\in\{0,1\},\qquad \sum_{i=1}^m x_{ij}=1\quad\text{for all } j. \]
For discrete time \(t=1,\dots,T\), actual and predicted loads satisfy
\[ \ell_j(t) \;=\; \hat{\ell}_j(t) + \delta_j(t) + \eta_j(t), \]
where \(\eta_j(t)\) is zero-mean noise. We encode asymmetric stochastic spikes via
\[ \delta_j(t) \;=\; a_j\,\varepsilon_t,\qquad \varepsilon_t \overset{\text{i.i.d.}}{\sim} \mathrm{LogNormal}(\mu,\sigma^2), \]
so most SPs (small negative \(a_j\)) are biased slightly negative, while a fat-tailed minority (positive \(a_j\)) spike positive more often.
Circuit aggregates (random processes) are
\[ \Delta_i(t) \;=\; \sum_{j=1}^n x_{ij}\,\delta_j(t),\qquad i=1,\dots,m. \]
Intuition. The operator cannot change \(\delta_j(\cdot)\) realizations, but can redistribute which sums they enter, reshaping the distribution of \(\Delta_i(\cdot)\).
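A minimal simulation sketch of these primitives, assuming the asymmetric coefficient mix described above (all distribution parameters are illustrative choices, not taken from the paper):

```python
# Minimal simulation of the Section 1 primitives (hypothetical parameters).
import numpy as np

rng = np.random.default_rng(1)
m, n, T = 3, 10, 200
# Most SPs slightly negative, a fat-tailed minority positive (cf. the asymmetry above).
a = np.where(rng.random(n) < 0.8,
             -np.abs(rng.normal(0.1, 0.05, n)),
             np.abs(rng.lognormal(0.0, 0.6, n)))
eps = rng.lognormal(0.0, 0.5, size=T)        # global lognormal driver
delta = np.outer(a, eps)                     # delta[j, t] = a_j * eps_t
assign = rng.integers(0, m, size=n)          # x: SP j -> circuit assign[j]
x = np.zeros((m, n)); x[assign, np.arange(n)] = 1.0
Delta = x @ delta                            # Delta[i, t] = sum_j x_ij * delta_j(t)
```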
2. Stochastic Objectives & Risk (what we control)
We cast redistribution as minimizing a risk proxy of random circuit deviations, using sample-path estimators over a horizon \(T\).
Redistribution (RMS risk, discrete):
\[ \min_{x\in\{0,1\}^{m\times n},\ \sum_i x_{ij}=1}\quad J(x) \;=\; \sum_{i=1}^m \sqrt{\tfrac{1}{T}\sum_{t=1}^T \Big(\sum_j x_{ij}\,\delta_j(t)\Big)^2}. \]
Intuition. RMS emphasizes volatility/peaks, not just average error. It is physically meaningful: thermal/operational stress is closer to RMS than mean.
LP surrogate (L1 risk, convex):
\[ \tilde J(x) \;=\; \sum_{i,t} w_{it}\,\Big|\sum_j x_{ij}\,\delta_j(t)\Big|,\qquad w_{it}\ge 0. \]
Linearization with \(z_{it}\ge0\) gives
\[ \min_{x,z}\ \sum_{i,t} w_{it}\,z_{it}\quad\text{s.t.}\quad z_{it}\ge \sum_j x_{ij}\,\delta_j(t),\ \ z_{it}\ge -\sum_j x_{ij}\,\delta_j(t),\ \ \sum_i x_{ij}=1,\ \ x_{ij}\ge 0. \]
Intuition. \(\tilde J\) targets robustness to outliers and is a tight piecewise-linear envelope for RMS, enabling efficient LP solvers on stochastic data.
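For reference, sample-path estimators of both risks can be sketched as follows (a sketch assuming uniform weights \(w_{it}=1/T\); function names are ours):

```python
# Sample-path risk estimators for a given assignment (uniform weights assumed).
import numpy as np

def rms_risk(x: np.ndarray, delta: np.ndarray) -> float:
    """J(x): sum over circuits of the RMS of Delta_i(t), with w_it = 1/T."""
    Delta = x @ delta                        # (m, T) circuit aggregates
    return float(np.sqrt((Delta ** 2).mean(axis=1)).sum())

def l1_risk(x: np.ndarray, delta: np.ndarray) -> float:
    """J~(x): weighted sum of |Delta_i(t)| over circuits and times, w_it = 1/T."""
    Delta = x @ delta
    return float(np.abs(Delta).mean(axis=1).sum())
```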
3. Main Result (stochastic/lognormal structural reduction)
Under lognormal overshoot dynamics, the redistribution problem with an L1 objective collapses to an exact LP almost surely, and thus also in expectation.
We give two complementary proofs that lead to the same optimal redistribution. First, a simple analytic reduction shows that with a global lognormal shock the RMS objective is a fixed positive multiple of the L1 surrogate, so both objectives have the same minimizers (§3.2). Second, a geometric/pathwise argument shows that on any realized trajectory, the LP surrogate admits an optimal solution whose assignment is one-hot per service point (equivalently, integral), so the LP optimum equals the binary optimum (§3.1). In the lognormal setting, these two views are consistent: proportionality in §3.2 explains why the total-unimodularity (TU)/assignment geometry in §3.1 yields the same decision rule.
Stochastic model. We work under three standing assumptions, referenced as (S1)–(S3) below:
- (S1) Exogenous dynamics: the realizations \(\delta_j(\cdot)\) do not depend on the assignment \(x\).
- (S2) Global lognormal driver: \(\delta_j(t)=\tilde a_j\,\varepsilon_t\) with \(\varepsilon_t\) i.i.d. lognormal.
- (S3) No capacity coupling: any number of SPs may share a circuit.
Theorem (Lognormal L2→L1 Proportionality)
Assume a global multiplicative driver \(\varepsilon_t>0\) with \[ \delta_j(t)=\tilde a_j\,\varepsilon_t,\qquad \tilde a_j:=a_j\tau_j,\qquad \varepsilon_t\sim \mathrm{LogNormal}(\mu,\sigma^2), \] and take uniform temporal weights \(w_{it}\equiv \tfrac{1}{T}\) (or any weights independent of circuit index \(i\)). For an assignment \(x\), define \(s_i(x):=\sum_j \tilde a_j\,x_{ij}\) and the sample factors \[ A_T:=\tfrac{1}{T}\sum_{t=1}^T \varepsilon_t,\qquad B_T:=\sqrt{\tfrac{1}{T}\sum_{t=1}^T \varepsilon_t^2}\,. \] Then
\[ J_{L_2}(x)=B_T\sum_{i=1}^m |s_i(x)|,\qquad J_{L_1}(x)=A_T\sum_{i=1}^m |s_i(x)|,\qquad\text{hence}\quad J_{L_2}(x)=\frac{B_T}{A_T}\,J_{L_1}(x). \]
Collapsed “Small” LP (post-collapse; circuit-only)
Because time factors out, minimizing either objective is equivalent to the following LP with only \(m\) absolute-value epigraphs:
\[ \min_{x,u}\ \sum_{i=1}^m u_i\quad\text{s.t.}\quad u_i \ \ge\ \sum_j \tilde a_j\,x_{ij},\quad u_i\ \ge\ -\sum_j \tilde a_j\,x_{ij},\quad \sum_i x_{ij}=1,\quad 0\le x_{ij}\le 1. \]
Extreme-point integrality. Over the assignment polytope, an optimal solution exists with each column one-hot; thus this LP returns a discrete redistribution without rounding.
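A minimal sketch of this LP using `scipy.optimize.linprog` (the variable layout and helper name are illustrative, not the paper's reference implementation):

```python
# Collapsed LP via scipy.optimize.linprog (HiGHS). Variables are ordered
# [x_11..x_1n, ..., x_m1..x_mn, u_1..u_m]; a_tilde holds the coefficients.
import numpy as np
from scipy.optimize import linprog

def solve_collapsed_lp(a_tilde: np.ndarray, m: int):
    n = a_tilde.size
    nx = m * n
    c = np.concatenate([np.zeros(nx), np.ones(m)])     # minimize sum_i u_i
    # u_i >= +/- sum_j a~_j x_ij   <=>   +/-(a~ . x_i) - u_i <= 0
    A_ub = np.zeros((2 * m, nx + m))
    for i in range(m):
        A_ub[2 * i, i * n:(i + 1) * n] = a_tilde
        A_ub[2 * i + 1, i * n:(i + 1) * n] = -a_tilde
        A_ub[2 * i, nx + i] = A_ub[2 * i + 1, nx + i] = -1.0
    # each SP assigned exactly once: sum_i x_ij = 1
    A_eq = np.zeros((n, nx + m))
    for j in range(n):
        A_eq[j, j:nx:n] = 1.0
    bounds = [(0, 1)] * nx + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * m),
                  A_eq=A_eq, b_eq=np.ones(n), bounds=bounds)
    return res.x[:nx].reshape(m, n), res.fun
```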
Proof (analytic; full)
By the global factorization \(\Delta_i(t)=\sum_j x_{ij}\delta_j(t)=\big(\sum_j \tilde a_j x_{ij}\big)\varepsilon_t=s_i(x)\varepsilon_t\). Therefore \[ \sqrt{\tfrac{1}{T}\sum_t \Delta_i(t)^2}=\sqrt{\tfrac{1}{T}\sum_t s_i(x)^2\varepsilon_t^2}=|s_i(x)|\,B_T, \] and \[ \tfrac{1}{T}\sum_t |\Delta_i(t)|=\tfrac{1}{T}\sum_t |s_i(x)|\,|\varepsilon_t|=|s_i(x)|\,A_T. \] Summing over \(i\) yields the two displayed identities and the ratio \(J_{L_2}(x)=(B_T/A_T)J_{L_1}(x)\) with \(B_T/A_T>0\) independent of \(x\). If \(x^\star\) minimizes \(J_{L_1}\), then for any \(x\), \(J_{L_1}(x^\star)\le J_{L_1}(x)\Rightarrow J_{L_2}(x^\star)=(B_T/A_T)J_{L_1}(x^\star)\le (B_T/A_T)J_{L_1}(x)=J_{L_2}(x)\), so the minimizers coincide for any finite \(T\). Finally, by the strong law of large numbers, \(A_T\to \mathbb{E}[\varepsilon]=e^{\mu+\sigma^2/2}\) and \(B_T\to \sqrt{\mathbb{E}[\varepsilon^2]}=\sqrt{e^{2\mu+2\sigma^2}}=e^{\mu+\sigma^2}\), hence \(B_T/A_T\to e^{\sigma^2/2}\). \(\square\)
3.1 Geometric Proof (pathwise LP exactness on realized trajectories)
Setup. Fix a realized sample path \(\{\delta_j(t)\}_{t=1}^T\). Consider the LP surrogate with the standard linearization of absolute values:
\[ \min_{x,z}\ \sum_{i,t} w_{it}\,z_{it}\quad\text{s.t.}\quad z_{it}\ge \sum_j x_{ij}\,\delta_j(t),\quad z_{it}\ge -\sum_j x_{ij}\,\delta_j(t),\quad \sum_i x_{ij}=1,\quad x_{ij}\ge 0. \]
The feasible set in \(x\) is the assignment polytope \(P:=\{x\in\mathbb{R}_+^{m\times n}:\ \sum_i x_{ij}=1\ \forall j\}\), i.e., a Cartesian product of simplices—each column \(x_{\cdot j}\) lies in \(\Delta_m\). The extreme points of \(P\) are exactly the one-hot matrices (each column has a single 1 and zeros elsewhere).
Proof (geometric/pathwise; full)
Step 1 (tightness of linearization). Because all weights \(w_{it}\ge 0\), for any feasible \((x,z)\) we can remove any slack in the two inequalities for \(z_{it}\) without violating feasibility and without increasing the objective. Hence some optimal solution satisfies \[ z_{it}= \bigg|\sum_{j=1}^n x_{ij}\,\delta_j(t)\bigg| \quad \text{for all } (i,t), \] and every optimal solution does when \(w_{it}>0\). Consequently, the optimal LP value equals \[ \min_{x\in P}\ \sum_{i,t} w_{it}\,\bigg|\sum_j x_{ij}\,\delta_j(t)\bigg|. \]
Step 2 (existence of an extreme-point minimizer). Take weights independent of the circuit index, \(w_{it}=w_t\ge 0\), as in the theorem. For any \(x\in P\), the triangle inequality and the column constraints \(\sum_i x_{ij}=1\) give, for every \(t\), \[ \sum_{i}\bigg|\sum_j x_{ij}\,\delta_j(t)\bigg| \ \ge\ \bigg|\sum_i\sum_j x_{ij}\,\delta_j(t)\bigg| \ =\ \bigg|\sum_j \delta_j(t)\bigg|. \] Multiplying by \(w_t\) and summing over \(t\) bounds the objective below by \(\sum_t w_t\,\big|\sum_j \delta_j(t)\big|\), a constant independent of \(x\). This bound is attained by a one-hot assignment: route every SP to a single fixed circuit \(i_0\), so that circuit \(i_0\) carries \(\sum_j \delta_j(t)\) and every other circuit carries zero. Since that assignment is an extreme point of \(P\), the LP optimum equals the lower bound and is attained at an extreme point, i.e., one-hot per column. (Other one-hot optima may also attain the bound when positive and negative deviations cancel across circuits.)
Step 3 (equality with the binary problem). Every binary (one-hot) assignment is LP-feasible with identical cost \(\sum_{i,t} w_{it}\,|\sum_j x_{ij}\delta_j(t)|\), so the LP optimum is a lower bound on the binary optimum. By Step 2 the LP optimum is attained at a one-hot assignment, which is binary-feasible; hence the optimal LP value equals the optimal binary value on the realized path. \(\square\)
Discussion. The key geometric facts are: (i) linearization is tight at optimum; (ii) minimizing a separable sum of absolute values over a product of simplices admits an extreme-point minimizer; (iii) extreme points of the assignment polytope are one-hot. Thus the LP surrogate is pathwise exact and returns a discrete redistribution.
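The pathwise exactness can be spot-checked numerically on a small instance by enumerating all binary assignments and solving the time-expanded LP (a sketch with arbitrary data and weights \(w_t=1/T\)):

```python
# Pathwise exactness check on a tiny instance: brute-force binary optimum
# versus the time-expanded LP (a sketch; weights w_t = 1/T, data arbitrary).
import itertools
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n, T = 2, 4, 6
delta = rng.normal(size=(n, T))              # an arbitrary realized path

def path_cost(assign):
    x = np.zeros((m, n)); x[list(assign), np.arange(n)] = 1.0
    return np.abs(x @ delta).mean(axis=1).sum()

best_binary = min(path_cost(a) for a in itertools.product(range(m), repeat=n))

# LP variables: [x (m*n), z (m*T)]; minimize (1/T) * sum_{i,t} z_it.
nx, nz = m * n, m * T
c = np.concatenate([np.zeros(nx), np.full(nz, 1.0 / T)])
A_ub = np.zeros((2 * nz, nx + nz)); b_ub = np.zeros(2 * nz)
for i in range(m):
    for t in range(T):
        k = i * T + t
        A_ub[2 * k, i * n:(i + 1) * n] = delta[:, t]
        A_ub[2 * k + 1, i * n:(i + 1) * n] = -delta[:, t]
        A_ub[2 * k, nx + k] = A_ub[2 * k + 1, nx + k] = -1.0
A_eq = np.zeros((n, nx + nz))
for j in range(n):
    A_eq[j, j:nx:n] = 1.0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=np.ones(n),
              bounds=[(0, None)] * (nx + nz))
print(best_binary, res.fun)                  # equal up to solver tolerance
```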
3.2 Simple Analytic Reduction (global lognormal shock)
With \(\Delta_i(t)=s_i(x)\varepsilon_t\), RMS and L1 factor as \(\sqrt{\tfrac{1}{T}\sum_t \Delta_i(t)^2}=|s_i(x)|\,B_T\) and \(\tfrac{1}{T}\sum_t|\Delta_i(t)|=|s_i(x)|\,A_T\), giving \(J_{L_2}(x)=\frac{B_T}{A_T}\,J_{L_1}(x)\). Because \(\frac{B_T}{A_T}>0\) does not depend on \(x\), both objectives pick the same assignments for any finite horizon, and in the ergodic limit the ratio tends to \(e^{\sigma^2/2}\). This proportionality explains why the geometric/pathwise exactness yields the same decision rule under a global lognormal driver. (See §6 for when this collapse breaks and how to restore it.)
3.3 Remarks
Intuitively, a single multiplicative factor makes every circuit’s deviation a scaled copy of one shock, so RMS and L1 differ only by a scalar the operator cannot influence. Geometrically, the assignment polytope already encodes one-hot redistributions; tight linearization plus convexity ensures an extreme-point minimizer. Two views—same optimizer.
4. When Equivalence Breaks (and how to fix it)
The reduction hinges on linearity and decoupling; realistic systems re-introduce coupling.
- True RMS risk (nonlinear): \(J(x)=\sum_i \sqrt{\tfrac1T\sum_t(\sum_j x_{ij}\delta_j(t))^2}\) is not linear. Fix: MIQP, or MILP with a piecewise-linear approximation of RMS; greedy/local search with provable improvement.
- Capacity coupling: constraints \(\sum_j x_{ij}p_j \le \kappa_i\) destroy total unimodularity (TU); the LP can return fractional solutions. Fix: capacity-aware MILP (valid inequalities or Lagrangian relaxation); a sketch follows this list.
- Topology-dependent dynamics: if moving \(P_j\) changes \(\delta_j(t)\) (voltage/impedance/controls), (S1) fails. Fix: mixed-integer OPF or bilevel formulations linking assignments to power-flow states.
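As an illustration of the capacity-aware fix, a minimal MILP sketch using `scipy.optimize.milp` (SciPy ≥ 1.9); the variable layout [x binary, u continuous] and the helper name are ours:

```python
# Capacity-aware L1 redistribution as a small MILP (illustrative sketch).
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def solve_capacity_milp(a_tilde, p, kappa, m):
    n = a_tilde.size
    nx = m * n
    c = np.concatenate([np.zeros(nx), np.ones(m)])    # minimize sum_i u_i
    rows, lb, ub = [], [], []
    for i in range(m):
        r = np.zeros(nx + m); r[i*n:(i+1)*n] = a_tilde; r[nx+i] = -1.0
        rows.append(r); lb.append(-np.inf); ub.append(0.0)      # +s_i - u_i <= 0
        r2 = -r.copy(); r2[nx+i] = -1.0
        rows.append(r2); lb.append(-np.inf); ub.append(0.0)     # -s_i - u_i <= 0
        rcap = np.zeros(nx + m); rcap[i*n:(i+1)*n] = p
        rows.append(rcap); lb.append(-np.inf); ub.append(kappa[i])  # capacity
    for j in range(n):
        r = np.zeros(nx + m); r[j:nx:n] = 1.0
        rows.append(r); lb.append(1.0); ub.append(1.0)          # assign once
    cons = LinearConstraint(np.array(rows), lb, ub)
    integrality = np.concatenate([np.ones(nx), np.zeros(m)])    # x binary, u cont.
    bnds = Bounds(np.zeros(nx + m),
                  np.concatenate([np.ones(nx), np.full(m, np.inf)]))
    res = milp(c=c, constraints=cons, integrality=integrality, bounds=bnds)
    return res.x[:nx].reshape(m, n), res.fun
```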
Rule of thumb. Use the LP when penalizing absolute deviations and there is no capacity coupling; upgrade to MILP/MIQP when you care about RMS and/or must honor capacities.
5. Practical Metrics & Evaluation (stochastic reporting)
We evaluate redistribution on sample paths and across Monte-Carlo draws to reflect stochastic risk.
Interpretation. Minimizing \(J\) suppresses persistent volatility at the feeder level; minimizing \(\tilde J\) suppresses large absolute swings and is robust to outliers. Report \(J,\tilde J\) per trajectory and their expectations/quantiles (e.g., CVaR) over draws of \(\{\delta_j(t)\}\).
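A sketch of such Monte-Carlo reporting for a fixed assignment, assuming the rank-1 lognormal model (the 95% CVaR level and all parameters are illustrative):

```python
# Monte-Carlo evaluation of a fixed assignment: mean, quantile, and CVaR of J.
import numpy as np

def evaluate(x, a, mu, sigma, T, draws=1000, alpha=0.95, seed=0):
    rng = np.random.default_rng(seed)
    Js = np.empty(draws)
    for k in range(draws):
        eps = rng.lognormal(mu, sigma, size=T)
        Delta = x @ np.outer(a, eps)                       # circuit aggregates
        Js[k] = np.sqrt((Delta ** 2).mean(axis=1)).sum()   # RMS risk J
    q = np.quantile(Js, alpha)
    cvar = Js[Js >= q].mean()                              # empirical CVaR_alpha
    return Js.mean(), q, cvar
```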
6. Lognormal Collapse: Conditions, Scale Tuning, and Realism
We clarify when the \(L_2 \rightarrow L_1\) collapse holds, how a per–SP scale tuner preserves it, and why this setup matches realistic load balancing.
6.1 Collapse requires a global lognormal multiplier
The collapse relies on a rank-1 stochastic structure in time:
\[ \delta_j(t)=\tilde a_j\,\varepsilon_t,\qquad \varepsilon_t\ \text{i.i.d. } \mathrm{LogNormal}(\mu,\sigma^2). \]
With a global lognormal multiplier \( \varepsilon_t \), circuit aggregates factor as
\[ \Delta_i(t)=\Big(\sum_j \tilde a_j\,x_{ij}\Big)\varepsilon_t = s_i(x)\,\varepsilon_t, \]
yielding, over a horizon \(T\),
\[ \sqrt{\tfrac1T\sum_t \Delta_i(t)^2}=|s_i(x)|\,B_T,\qquad \tfrac1T\sum_t|\Delta_i(t)|=|s_i(x)|\,A_T,\qquad \kappa:=\frac{B_T}{A_T}. \]
The ratio is independent of \(x\), hence \(J_{L_2}(x)=\kappa\,J_{L_1}(x)\) and both objectives share the same minimizer.
If instead each SP has its own multiplier, \( \delta_j(t)=a_j\,\varepsilon_{j,t} \), cross terms depend on \(x\) and the collapse breaks in general.
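A quick numeric illustration of this breakage (our sketch; parameters arbitrary): with per-SP multipliers, the sample \(L_2/L_1\) ratio varies across assignments.

```python
# With idiosyncratic multipliers eps_{j,t}, the L2/L1 ratio depends on x.
import numpy as np

rng = np.random.default_rng(3)
m, n, T = 3, 9, 2000
a = rng.normal(size=n)
delta = a[:, None] * rng.lognormal(0.0, 0.5, size=(n, T))  # per-SP shocks

for _ in range(3):
    assign = rng.integers(0, m, size=n)
    x = np.zeros((m, n)); x[assign, np.arange(n)] = 1.0
    Delta = x @ delta
    ratio = np.sqrt((Delta**2).mean(axis=1)).sum() / np.abs(Delta).mean(axis=1).sum()
    print(ratio)   # no longer constant across assignments
```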
6.2 Scale tuner preserves collapse
A realistic way to capture heterogeneous sensitivity while preserving the rank-1 structure is a scale tuner:
\[ \delta_j(t)=\underbrace{a_j\,\tau_j}_{\tilde a_j}\,\varepsilon_t,\qquad \tau_j>0. \]
Then \( \Delta_i(t) = (\sum_j \tilde a_j x_{ij})\,\varepsilon_t \) and the same factorization applies; \(L_2\) and \(L_1\) retain the same minimizer.
6.3 Realism: why this matches load balancing
- Macro driver + local sensitivity. Many grids face systemic, multiplicative fluctuations (weather, market, behavior) that move all SPs together; per-SP gains \( \tau_j \) model how strongly each responds.
- Parsimonious and identifiable. Estimating \( \tau_j \) from historical ratios of deviations to a global index is straightforward (see the sketch after this list); the common shock explains most temporal variance with one factor.
- Computationally robust. The surrogate stays LP-solvable and (under our assumptions) integral, giving fast, discrete reallocations without rounding heuristics.
- Extendable. Clusters with their own global factors can be modeled by repeating this construction per cohort; weak idiosyncratic noise can be handled with reweighted \(L_1\) or a light two-stage polish.
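For the identifiability point above, a least-squares sketch of the scale estimation, assuming a historical deviation matrix and an observed global index (names are ours):

```python
# Estimating the per-SP scales \tilde a_j = a_j * tau_j from history.
import numpy as np

def estimate_scales(delta: np.ndarray, eps: np.ndarray) -> np.ndarray:
    """delta: (n, T) historical deviations; eps: (T,) global index.
    Returns ~a_j = argmin_a sum_t (delta_j(t) - a * eps_t)^2."""
    return (delta @ eps) / (eps @ eps)
```

Under weak idiosyncratic noise, a robust alternative is the median of the per-period ratios \(\delta_j(t)/\varepsilon_t\).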
7. Interactive 3D Demo (Latent a–p–d Space)
Load Rebalancing Demo. We provide an interactive visualization that animates the redistribution process on a single realized sample path. Service points (SPs) appear as glowing spheres labeled with their per–SP statistics—actual a, prediction p, and deviation d—and are grouped inside semi-opaque circuit spheres. The animation proceeds in two phases: the first half shows the original distribution; at the midpoint, points smoothly transition to their optimized circuits (with a progress bar indicating time). Circuit spheres fade out and back in to highlight the redistribution event, and the overlay displays per-circuit risks and aggregate objective values.
7.1 Mapping and Visual Design
Let circuits be \(\mathcal{C}=\{C_1,\dots,C_m\}\) and SPs \(\mathcal{P}=\{P_1,\dots,P_n\}\) with assignments \(x_{ij}\in\{0,1\}\), \(\sum_i x_{ij}=1\). For each SP \(P_j\), the demo displays the pair \((a_j, p_j)\) and the residual \(d_j\) (sign convention consistent with the paper), and places a labeled sphere for \(P_j\) inside the sphere of its assigned circuit.
7.2 Objective and Per-Circuit Risk
The demo’s text panel summarizes the risk the operator seeks to minimize. For a realized trajectory \(\{\delta_j(t)\}_{t=1}^T\), circuit-level RMS risk and the aggregate objective are
\[ R_i(x)=\sqrt{\tfrac{1}{T}\sum_{t=1}^T \Delta_i(t)^2},\qquad J(x)=\sum_{i=1}^m R_i(x). \]
The overlay reports per-circuit \(R_i\) and the aggregate \(J\) before and after redistribution, mirroring the paper’s redistribution objective (RMS risk) and its LP surrogate discussion.
7.3 Animation Procedure
- Initialization (original assignment). Points are positioned within their circuits and labeled with a, p, d and an SP identifier.
- Midpoint transition. At \(t=T/2\), each point interpolates from its “before” circuit center to its “after” circuit center over a short window, while circuit spheres dissolve and reappear to emphasize the segmentation change.
- Optimized phase. Points remain stationary in their new circuits; the overlay updates to show per-circuit \(R_i\) and aggregate \(J\) for the redistributed configuration. The timeline then resets and loops.
7.4 Relation to the Model (and Deviations)
- Objective consistency. The quantities reported in the panel correspond to the paper’s RMS functional \(J(x)=\sum_i R_i(x)\) computed on a realized sample path. The demo can be wired directly to the implementation of `compute_J` for the actual dataset so that “before/after” values reflect the true assignments.
- LP surrogate vs. RMS. The write-up establishes an L1 (LP) surrogate \(\tilde J(x)=\sum_{i,t} w_{it}\big|\sum_j x_{ij}\delta_j(t)\big|\) and shows pathwise exactness under stated assumptions. The visual narrative mentions this surrogate but renders the RMS objective in the panel; no LP geometry is visualized.
- Capacity constraint. For demonstration, we enforce a practical cardinality cap (e.g., \(\le 5\) SPs per circuit) and deploy a local improvement routine (hill-climb with swaps; a sketch follows this list). This departs from assumption (S3) of the paper (“no capacity coupling”) under which the LP relaxation is integral; hence, the demo illustrates a realistic, capacity-aware variant where the LP-integrality guarantee does not strictly apply.
- Spatial encoding. The on-screen coordinates are chosen for cluster legibility—points are kept inside circuit spheres using fixed offsets. The spatial layout is therefore symbolic rather than an isometric embedding of \((a,p,d)\). The values of \(a,p,d\) are shown in the labels; geometry encodes circuit membership and the temporal change of assignment, not a metric in \(\mathbb{R}^3\).
- Single-trajectory view. The animation is sample-path–level (one realization of \(\{\delta_j(t)\}\)); it does not animate time series within a phase. This aligns with the pathwise interpretation of the objectives used in the paper’s formulation.
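A sketch of the capacity-capped local search mentioned in the capacity bullet above, assuming the initial assignment already satisfies the cap (names and parameters are illustrative):

```python
# Capacity-capped hill-climb with single-SP moves and pairwise swaps.
import numpy as np

def hill_climb(assign, delta, m, cap=5, iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    n = delta.shape[0]
    def cost(asg):                                   # RMS risk J on the path
        x = np.zeros((m, n)); x[asg, np.arange(n)] = 1.0
        return np.sqrt(((x @ delta) ** 2).mean(axis=1)).sum()
    best = assign.copy(); best_c = cost(best)
    for _ in range(iters):
        cand = best.copy()
        if rng.random() < 0.5:                       # move one SP
            j = rng.integers(n); cand[j] = rng.integers(m)
        else:                                        # swap two SPs' circuits
            j, k = rng.integers(n, size=2)
            cand[j], cand[k] = cand[k], cand[j]
        counts = np.bincount(cand, minlength=m)      # enforce cardinality cap
        if counts.max() <= cap and (c := cost(cand)) < best_c:
            best, best_c = cand, c
    return best, best_c
```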
7.5 Interpretation
Visually, the operator’s action is to redistribute which deviations sum together, reshaping each circuit’s aggregate process \(\Delta_i(t)=\sum_j x_{ij}\delta_j(t)\) and, consequently, its risk \(R_i\). A successful redistribution brings the post-transition points into a configuration whose reported \(J\) is lower than before, indicating reduced circuit stress on the realized path (RMS) or reduced absolute swings under the surrogate (L1).
7.6 Practical Notes
- Data wiring. Replace the panel’s placeholders with the computed before/after values of \(R_i\) and \(J\) from the dataset (`compute_J` on the original and optimized assignments).
- Embedding variant. If desired, map \((a,p,d)\) into a true 3D embedding (e.g., affine-scaled axes or PCA on \([a,p,d]\)) and place points at those coordinates within circuit spheres; labels remain unchanged.
- Scalability. For larger \(n\), fade labels at distance or show them on hover; the risk panel still conveys the effect of redistribution succinctly.
8. Conclusion (stochastic takeaways)
The stochastic framing reveals a structural shortcut between discrete and convex formulations. Under lognormal overshoot probabilities and exogenous SP dynamics, the L1 redistribution problem admits a convex, integral LP solution on every realization, showing that stochasticity does not destroy tractability. In realistic grids, where RMS, capacities, or topology enter, MILP/MIQP and piecewise-linear constructions preserve the stochastic framing while respecting physics and operations.
Future directions. (i) variance bounds under lognormal asymmetry; (ii) capacity-aware MILP with CVaR-based stochastic regularization; (iii) topology-aware mixed-integer OPF that couples redistribution to volt/VAR limits.