Spatiotemporal Event Detection in Structured Grid Data

Abstract

We propose comment-centric event detection over power-grid assets. Each comment inherits a feeder coordinate and an incident interval, but comments may arrive during the incident or days later. We formalize the data and propose two clustering formulations: (A) metric density clustering via DBSCAN in a designed spatiotemporal geometry, and (B) constraint-based graph connectivity induced by explicit domain predicates (feeder proximity, interval overlap/gap, comment-time proximity). We give definitions and runtime notes.

1. Introduction

Operational “events” in grid workflows are often loosely defined: multiple incidents may be related (same region, overlapping activity windows), and the observed narrative arrives as a stream of comments attached to incidents. Because comment timestamps can lag the underlying incident interval, clustering must account for both observed incident intervals and noisy/delayed comment observations. Our aim is to partition comments into coherent clusters representing significant latent events.

2. Data model

Feeders form an ordinal spatial axis \(\mathcal F=\{1,\dots,F\}\) with distance \(d_F(f,f'):=|f-f'|\). Incidents are \(\mathcal I=\{1,\dots,N\}\). Each incident \(i\in\mathcal I\) is assigned to a feeder \(\phi(i)\in\mathcal F\) and has observed active interval \(\Delta_i=[t_i^{start},t_i^{end}]\subset\mathbb R\). Comments are \(\mathcal C=\{1,\dots,M\}\). Each comment \(c\in\mathcal C\) attaches to an incident \(i_c=\psi(c)\in\mathcal I\) and has timestamp \(\tau_c\in\mathbb R\). Delayed comments are allowed: \(\tau_c\notin\Delta_{i_c}\).

\[ x_c := (f_c,\Delta_{i_c},\tau_c),\qquad f_c:=\phi(i_c). \]

Problem. Find a partition \(\{E_k\}\) of \(\mathcal C\) into “significant clusters” (events) such that members of an event are close along feeders, aligned in incident activity (overlap or small gap), and close in comment time (with possible delays).

3. Two clustering formulations

3.1 Interval primitive

Define the interval gap between incidents \(i,j\) by

\[ g(\Delta_i,\Delta_j)=\max\{0,\ t_i^{start}-t_j^{end},\ t_j^{start}-t_i^{end}\}, \]

so \(g(\Delta_i,\Delta_j)=0\) iff \(\Delta_i\cap\Delta_j\neq\emptyset\). This captures the structural notion “incidents overlap” while still allowing small separations via a bounded gap.

3.2 Solution A: metric density clustering (DBSCAN)

Choose weights \(\alpha,\beta,\gamma\ge 0\) and define a comment distance

\[ d(c,c')=\alpha|f_c-f_{c'}|+\beta\,g(\Delta_{i_c},\Delta_{i_{c'}})+\gamma|\tau_c-\tau_{c'}|. \]

DBSCAN with parameters \((\varepsilon,m)\) declares \(c\) a core point if \(|\mathcal N_\varepsilon(c)|\ge m\), where \(\mathcal N_\varepsilon(c)=\{c': d(c,c')\le\varepsilon\}\), and returns clusters as maximal density-connected sets. Intuitively, events are dense “blobs” in the designed geometry; the weights \((\alpha,\beta,\gamma)\) govern additive tradeoffs.

Runtime. Naively \(O(M^2)\) distance checks. With spatial/temporal indexing, typical cost is \(O(M\log M + K)\) where \(K\) is the number of neighbor pairs (worst-case \(O(M^2)\)).

3.3 Solution B: constraint graph connectivity

Fix thresholds \(\Delta_f,\Delta_I,\Delta_t\ge 0\) and define a hard adjacency relation

\[ c\sim c'\iff |f_c-f_{c'}|\le \Delta_f\ \land\ g(\Delta_{i_c},\Delta_{i_{c'}})\le \Delta_I\ \land\ |\tau_c-\tau_{c'}|\le \Delta_t. \]

Construct \(G=(\mathcal C,E)\) with \((c,c')\in E\iff c\sim c'\). Output clusters as connected components \(E_k\) with \(|E_k|\ge m\). Intuitively, events are connected structures satisfying independent domain constraints; there is no implicit “trade” between space/time/overlap unless introduced explicitly.

Runtime. Naively \(O(M^2)\) candidate edges. With feeder bucketing and time-window candidate generation, edge creation is often \(O(M\cdot d)\) for average local candidate count \(d\); connected components via BFS/DFS are \(O(M+|E|)\).

4. Discussion and tradeoffs

Both approaches can be read as “build a graph on comments, then take connected components,” but they differ in the edge rule: DBSCAN induces edges via a single scalar inequality \(d(c,c')\le\varepsilon\) and adds a density criterion, yielding smooth boundaries but sensitivity to scaling; the constraint graph induces edges via a conjunction of predicates, yielding transparency and easy extensibility (e.g., add a sentiment gate or keyword clause), but requiring careful threshold selection to avoid fragmentation when the phenomenon is graded rather than thresholded.

\[ \text{Design choice: “dense regions in a designed geometry”}\\ \text{vs. “connected structure under explicit rules.”} \]