Flow Sampling (from $\mathbf{x}_0$ and $u_t$ to get $\mathbf{x}_t$)
The key point is that the sampled trajectory $\mathbf{x}_t$ is not a straight line, because $\mathbf{u}_t(\mathbf{x}_t)$ is an average over many conditional paths rather than a constant vector. Each conditional path, however, is a straight line (in the OT case).
In principle sampling is simple: solve the following ODE with initial condition $\mathbf{x}_0$: \(\frac{d\mathbf{x}}{dt} = \mathbf{u}_t(\mathbf{x})\) General form (two independent Gaussians):
The vector field is given by:
\(\mathbf{u}_t(\mathbf{x}) = \underbrace{\dot{\boldsymbol{\mu}}_t}_{\text{Mean component}} + \underbrace{\frac{1}{2} \dot{\boldsymbol{\Sigma}}_t \boldsymbol{\Sigma}_t^{-1} (\mathbf{x} - \boldsymbol{\mu}_t)}_{\text{Covariance component}}\) \(\boxed{\mathbf{u}_t(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + ( t \boldsymbol{\Sigma}_1 - (1-t) \boldsymbol{\Sigma}_0)\boldsymbol{\Sigma}_t^{-1} (\mathbf{x} - \boldsymbol{\mu}_t)}\)
Isotropic and equal variance
\(\boxed{\mathbf{u}_t(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \frac{1}{2}\frac{d\log\sigma_t^{2}}{dt} (\mathbf{x} - \boldsymbol{\mu}_t)}\) \(\begin{aligned} \mathbf{x}(t) &= \left[(1-t) \boldsymbol{\mu}_0 + t \boldsymbol{\mu}_1\right] + \dfrac{\sqrt{(1-t)^2 \sigma_0^2 + t^2 \sigma_1^2}}{\sigma_0} (\mathbf{x}_0 - \boldsymbol{\mu}_0)\\ \phi_t(\mathbf{x}_0) &= \mathbf{x}(t)=\boldsymbol{\mu}_t + \dfrac{\sigma_t}{\sigma_0} (\mathbf{x}_0 - \boldsymbol{\mu}_0), \qquad \sigma_t = \sqrt{(1-t)^2 \sigma_0^2 + t^2 \sigma_1^2} \end{aligned}\)
- $\boldsymbol{\mu}_t = (1-t) \boldsymbol{\mu}_0 + t \boldsymbol{\mu}_1 = \boldsymbol{\mu}_0 + t(\boldsymbol{\mu}_1-\boldsymbol{\mu}_0)$ is a straight line that starts at $\boldsymbol{\mu}_0$ with slope $(\boldsymbol{\mu}_1-\boldsymbol{\mu}_0)$. A sample follows exactly this line only when $\mathbf{x}_0 = \boldsymbol{\mu}_0$. When $\mathbf{x}_0 \ne \boldsymbol{\mu}_0$, the offset from the mean is scaled by the standard-deviation ratio $\frac{\sigma_t}{\sigma_0}$ and added to this line.
- $t =0$: $\phi_0(\mathbf{x}_0) = \mathbf{x}_0$
- $t =1$: $\phi_1(\mathbf{x}_0) = \boldsymbol{\mu}_1 + \frac{\sigma_1}{\sigma_0}(\mathbf{x}_0-\boldsymbol{\mu}_0)$. If $\sigma_1=\sigma_0=\sigma$, then $\phi_1(\mathbf{x}_0) = \mathbf{x}_0+(\boldsymbol{\mu}_1-\boldsymbol{\mu}_0)$, i.e. every end point is its start point shifted by the mean difference.
- $\boldsymbol{\mu}_t = (1 - t) \boldsymbol{\mu}_0 + t \boldsymbol{\mu}_1$ holds whether or not $\mathbf{x}_0$ and $\mathbf{x}_1$ are independently coupled.
- $\boldsymbol{\Sigma}_t = (1 - t)^2 \boldsymbol{\Sigma}_0 + t^2 \boldsymbol{\Sigma}_1$. Note: this assumes that $\mathbf{x}_0$ and $\mathbf{x}_1$ are independently coupled.
If $\boldsymbol{\Sigma}_0$ and $\boldsymbol{\Sigma}_1$ commute, the flow simplifies to: \(\boxed{\phi_t(\mathbf{x}_0)=\mathbf{x}(t) = \boldsymbol{\mu}_t + \boldsymbol{\Sigma}_t^{1/2} \boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)}\) A useful identity, or invariant (possibly specific to Gaussian flows): \(\boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right) = \boldsymbol{\Sigma}_t^{-1/2} \left( \mathbf{x}_t - \boldsymbol{\mu}_t \right)\) Sanity check at $t=1$: \(\boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right) = \boldsymbol{\Sigma}_1^{-1/2} \left( \mathbf{x}_1 - \boldsymbol{\mu}_1 \right)\) This makes sense: the set of all $\mathbf{x}_0$ normalized to $\mathcal{N}(0, I)$ on the left matches the set of all $\mathbf{x}_1$ normalized to $\mathcal{N}(0, I)$ on the right. It also gives a deterministic relationship directly (useful later for reflow):
\(\mathbf{x}_1 = \boldsymbol{\mu}_1+\boldsymbol{\Sigma}_1^{1/2}\boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)\) In the reverse direction, given $\mathbf{x}_1$: \(\mathbf{x}_0 = \boldsymbol{\mu}_0+\boldsymbol{\Sigma}_0^{1/2}\boldsymbol{\Sigma}_1^{-1/2} \left( \mathbf{x}_1 - \boldsymbol{\mu}_1 \right)\)
Therefore \(\phi_t(\mathbf{x}_0)=\mathbf{x}(t) = \boldsymbol{\mu}_t + (\mathbf{x}_t - \boldsymbol{\mu}_t) = \boldsymbol{\mu}_t + \boldsymbol{\Sigma}_t^{1/2} \boldsymbol{\Sigma}_t^{-1/2} \left( \mathbf{x}_t - \boldsymbol{\mu}_t \right) =\boldsymbol{\mu}_t + \boldsymbol{\Sigma}_t^{1/2} \boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)\)
\(v_t(\mathbf{x})=\dfrac{d\mathbf{x}(t)}{dt} = \dot{\boldsymbol{\mu}}_t + \dot{\boldsymbol{\Sigma}}_t^{1/2} \boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)\) \(v_t(\mathbf{x})=({\boldsymbol{\mu}}_1-{\boldsymbol{\mu}}_0) + \dot{\boldsymbol{\Sigma}}_t^{1/2} \boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)=({\boldsymbol{\mu}}_1-{\boldsymbol{\mu}}_0) + {\dot{\sigma}_t} {\dfrac{\mathbf{x}_0 - \boldsymbol{\mu}_0}{\sigma_0}}\)
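Whether $x_t$ is a straight line (linear in $t$) is easy to check:
- $\boldsymbol{\mu}_t$ is linear in $t$ by construction, and $\dot{\boldsymbol{\mu}}_t = {\boldsymbol{\mu}}_1-{\boldsymbol{\mu}}_0$ is a constant vector. This part is always straight.
- What matters is $\boldsymbol{\Sigma}_t^{1/2} = \sigma_t \mathbf{I}$ in the isotropic case. Under independent coupling, $\sigma_t = \sqrt{(1-t)^2\sigma_0^2 + t^2\sigma_1^2}$ is clearly not linear in $t$. The path can be linear in $t$ only if $\sigma_t = (1-t)\sigma_0 + t\sigma_1$, which requires $x_0$ and $x_1$ to be coupled (dependent).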
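The closed-form flow and the bending of the path under independent coupling can be checked numerically. The following is a minimal sketch, assuming the isotropic case; the function names (`u_field`, `flow_closed_form`, `integrate_rk4`) and the parameter values are illustrative choices, not part of the original notes. It integrates $\dot{\mathbf{x}} = \mathbf{u}_t(\mathbf{x})$ with RK4, compares the result with $\boldsymbol{\mu}_t + \frac{\sigma_t}{\sigma_0}(\mathbf{x}_0-\boldsymbol{\mu}_0)$, and measures how far the path deviates from the chord between its end points.

```python
import numpy as np

def u_field(x, t, mu0, mu1, s0, s1):
    """Marginal field (mu1 - mu0) + (sigma_dot_t / sigma_t) (x - mu_t), isotropic case."""
    mu_t = (1 - t) * mu0 + t * mu1
    var_t = (1 - t) ** 2 * s0 ** 2 + t ** 2 * s1 ** 2
    rate = (t * s1 ** 2 - (1 - t) * s0 ** 2) / var_t  # = (1/2) d log(sigma_t^2) / dt
    return (mu1 - mu0) + rate * (x - mu_t)

def flow_closed_form(x0, t, mu0, mu1, s0, s1):
    """phi_t(x0) = mu_t + (sigma_t / sigma_0) (x0 - mu0)."""
    mu_t = (1 - t) * mu0 + t * mu1
    sig_t = np.sqrt((1 - t) ** 2 * s0 ** 2 + t ** 2 * s1 ** 2)
    return mu_t + sig_t / s0 * (x0 - mu0)

def integrate_rk4(x0, mu0, mu1, s0, s1, n_steps=200):
    """Classical RK4 on t in [0, 1]; returns the path at t = k / n_steps."""
    h = 1.0 / n_steps
    x = np.asarray(x0, dtype=float)
    path = [x]
    for k in range(n_steps):
        t = k * h
        k1 = u_field(x, t, mu0, mu1, s0, s1)
        k2 = u_field(x + 0.5 * h * k1, t + 0.5 * h, mu0, mu1, s0, s1)
        k3 = u_field(x + 0.5 * h * k2, t + 0.5 * h, mu0, mu1, s0, s1)
        k4 = u_field(x + h * k3, t + h, mu0, mu1, s0, s1)
        x = x + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        path.append(x)
    return np.array(path)

# Illustrative parameters (assumed for this sketch).
mu0, mu1 = np.array([0.0, 0.0]), np.array([4.0, 0.0])
s0, s1 = 1.0, 0.5
x0 = np.array([0.0, 1.5])

ts = np.linspace(0.0, 1.0, 201)
path = integrate_rk4(x0, mu0, mu1, s0, s1, n_steps=200)
closed = np.array([flow_closed_form(x0, t, mu0, mu1, s0, s1) for t in ts])

max_ode_error = np.abs(path - closed).max()       # ODE solution vs closed form
chord = x0 + ts[:, None] * (path[-1] - x0)        # straight line between end points
max_bend = np.abs(path - chord).max()             # deviation from a straight line
print(max_ode_error, max_bend)
```

With these values the numerical path agrees with the closed form, while its distance from the chord is clearly nonzero, which is the curvature discussed above.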
Introduction
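The derivation of naive Flow Matching assumes that $x_0$ and $x_1$ are independently coupled. This matches practice, because we want to sample from a simple distribution (usually a Gaussian) and map it through the flow to the target distribution. It has two drawbacks. (1) The resulting global flow is an average over many paths, so it is a curve rather than the shortest or straightest path, and sampling needs many steps. Is PQ also affected by this curvature? (2) Local neighborhood information is lost: sampling nearby points, or interpolating between samples, may produce poor images.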
Reflow
The idea of reflow is simple but deep: break the assumption that $x_0$ and $x_1$ are independently coupled. How? Iterate, so that $x_0$ and $x_1$ become dependent.
Framework
Reference: https://arxiv.org/pdf/2209.03003
Rectified flow: Given empirical observations of $X_0 \sim \pi_0, X_1 \sim \pi_1$, the rectified flow induced from $(X_0, X_1)$ is an ordinary differential equation (ODE) model on time $t \in[0,1]$,
\[\mathrm{d} Z_t=v\left(Z_t, t\right) \mathrm{d} t\]
which converts $Z_0$ from $\pi_0$ to a $Z_1$ following $\pi_1$. The drift force $v: \mathbb{R}^d \rightarrow \mathbb{R}^d$ is set to drive the flow to follow the direction $\left(X_1-X_0\right)$ of the linear path pointing from $X_0$ to $X_1$ as much as possible, by solving a simple least squares regression problem:
\[\min _v \int_0^1 \mathbb{E}\left[\left\|\left(X_1-X_0\right)-v\left(X_t, t\right)\right\|^2\right] \mathrm{d} t, \quad \text { with } \quad X_t=t X_1+(1-t) X_0\]
Key point, the solution of the problem above: for a given input coupling $(X_0, X_1)$, it is easy to see that the exact minimum is achieved if
\[v^X(x, t)=\mathbb{E}\left[X_1-X_0 \mid X_t=x\right]\]
Reducing transport costs: The coupling $\left(Z_0, Z_1\right)$ yields lower or equal convex transport costs than the input $\left(X_0, X_1\right)$ in that $\mathbb{E}\left[c\left(Z_1-Z_0\right)\right] \leq \mathbb{E}\left[c\left(X_1-X_0\right)\right]$ for any convex cost $c: \mathbb{R}^d \rightarrow \mathbb{R}$.
For the first pass, \(v^1(x, t)=\mathbb{E}\left[X_1-X_0 \mid X_t=x\right]\), which for two independent isotropic Gaussians evaluates to \(\boxed{\mathbf{v}^1_t(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \frac{1}{2}\frac{d\log\sigma_t^{2}}{dt} (\mathbf{x} - \boldsymbol{\mu}_t)}\)
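At inference time the vector field is evaluated at the current position $x_t$,
whereas a trajectory is indexed by its start point $x_0$. Straightness is hard to read off the vector field as a function of $x_t$; is the trajectory needed to see it? Not necessarily: the time derivative of the trajectory is the vector field along that trajectory, so the useful object is the vector field expressed in terms of $x_0$.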
Summary
\(\boxed{\mathbf{u}_t(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + ( t \boldsymbol{\Sigma}_1 - (1-t) \boldsymbol{\Sigma}_0)\boldsymbol{\Sigma}_t^{-1} (\mathbf{x} - \boldsymbol{\mu}_t)}\)
\(v_t^{(1)}(\mathbf{x})=({\boldsymbol{\mu}}_1-{\boldsymbol{\mu}}_0) + \dot{\boldsymbol{\Sigma}}_t^{1/2} \boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)=({\boldsymbol{\mu}}_1-{\boldsymbol{\mu}}_0) + {\dot{\sigma}_t} {\dfrac{\mathbf{x}_0 - \boldsymbol{\mu}_0}{\sigma_0}}\) This is not a straight line: although $\mu_0,\mu_1,\sigma_0$ and $x_0$ do not depend on $t$, $\dot{\sigma}_t$ does, since $\dot{\sigma}_t=\dfrac{t\sigma_1^2 - (1-t)\sigma_0^2}{\sqrt{(1-t)^2 \sigma_0^2 + t^2 \sigma_1^2}}$ \(\begin{aligned} \mathbf{x}(t) &= \left[(1-t) \boldsymbol{\mu}_0 + t \boldsymbol{\mu}_1\right] + \dfrac{\sqrt{(1-t)^2 \sigma_0^2 + t^2 \sigma_1^2}}{\sigma_0} (\mathbf{x}_0 - \boldsymbol{\mu}_0)\\ \phi_t(\mathbf{x}_0) &= \mathbf{x}(t)=\boldsymbol{\mu}_t + \dfrac{\sigma_t}{\sigma_0} (\mathbf{x}_0 - \boldsymbol{\mu}_0)\\ \end{aligned}\) The PDF is: \(\boxed{p^{(1)}(\mathbf{x}, t) = \mathcal{N}\left( \mathbf{x} \mid \boldsymbol{\mu}_t, \left[(1-t)^2 \sigma_0^2 + t^2 \sigma_1^2\right] \mathbf{I} \right)}\) Covariance: \(\boxed{\text{Cov}(\mathbf{Z}_0^{(1)}, \mathbf{Z}_1^{(1)}) = 0}\)
Second pass (the rectified flow after one reflow)
\(\begin{array}{c} {v}_t^{(2)} = (\boldsymbol{\mu}_{1} - \boldsymbol{\mu}_{0}) + \left( \dfrac{\sigma_{1}}{\sigma_{0}} - 1 \right) (\mathbf{Z}_{0} - \boldsymbol{\mu}_{0})=(\boldsymbol{\mu}_{1} - \boldsymbol{\mu}_{0}) + ({\sigma_{1}-\sigma_{0}}) \dfrac{(\mathbf{Z}_{0} - \boldsymbol{\mu}_{0})}{\sigma_0} \end{array}\) This is a straight line, because none of the parameters depends on $t$: a Gaussian flow becomes straight after a single reflow. Working backwards, $\sigma_t$ must be linear in time to satisfy both straightness and the boundary conditions: $\dot{\sigma}_t=\sigma_1-\sigma_0$, i.e. $\sigma_t = (1-t)\sigma_0 + t \sigma_1$. In other words, along the flow the solution $p(x,t)$ of the Fokker-Planck equation must have a mean and a standard deviation that are both linear in $t$.
\(\mathbf{Z}_t^{(2)} = (1-t)\mathbf{Z}_0^{(2)} + t \left[ \boldsymbol{\mu}_1 + \dfrac{\sigma_1}{\sigma_0} (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0) \right]\) The PDF is: \(\boxed{p^{(2)}(\mathbf{x}, t) = \mathcal{N}\left( \mathbf{x} \mid \boldsymbol{\mu}_t, \left[(1-t)\sigma_0 + t\sigma_1\right]^2 \mathbf{I} \right)}\) Covariance:
\[\boxed{\text{Cov}(\mathbf{Z}_0^{(2)}, \mathbf{Z}_1^{(2)}) = \dfrac{\sigma_1}{\sigma_0} \cdot \sigma_0^2 \mathbf{I} = \sigma_0 \sigma_1 \mathbf{I}}\]
Fokker-Planck PDF
PDF of the first rectified flow, $p^{(1)}(\mathbf{x}, t)$
In the first rectified flow, the interpolation path is defined as: \(\mathbf{Z}_t = (1-t)\mathbf{Z}_0 + t\mathbf{Z}_1\) where:
- $\mathbf{Z}_0 \sim \mathcal{N}(\boldsymbol{\mu}_0, \sigma_0^2 \mathbf{I})$
- $\mathbf{Z}_1 \sim \mathcal{N}(\boldsymbol{\mu}_1, \sigma_1^2 \mathbf{I})$
- $\mathbf{Z}_0$ and $\mathbf{Z}_1$ are independent.
Since $\mathbf{Z}_t$ is a linear combination of independent Gaussian random variables, it is again Gaussian:
- Mean: \(\mathbb{E}[\mathbf{Z}_t] = (1-t)\boldsymbol{\mu}_0 + t\boldsymbol{\mu}_1 = \boldsymbol{\mu}_t\)
- Variance (per coordinate): \(\text{Var}(\mathbf{Z}_t) = (1-t)^2 \text{Var}(\mathbf{Z}_0) + t^2 \text{Var}(\mathbf{Z}_1) = (1-t)^2 \sigma_0^2 + t^2 \sigma_1^2\)
The PDF is therefore: \(\boxed{p^{(1)}(\mathbf{x}, t) = \mathcal{N}\left( \mathbf{x} \mid \boldsymbol{\mu}_t, \left[(1-t)^2 \sigma_0^2 + t^2 \sigma_1^2\right] \mathbf{I} \right)}\)
PDF of the second rectified flow, $p^{(2)}(\mathbf{x}, t)$
In the second rectified flow, the interpolation path is defined as: \(\mathbf{Z}_t^{(2)} = (1-t)\mathbf{Z}_0^{(2)} + t\mathbf{Z}_1^{(2)}\) where:
- $\mathbf{Z}_0^{(2)} \sim \mathcal{N}(\boldsymbol{\mu}_0, \sigma_0^2 \mathbf{I})$
- $\mathbf{Z}_1^{(2)} = \boldsymbol{\mu}_1 + \dfrac{\sigma_1}{\sigma_0} (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0)$ (a linear map)
- $\mathbf{Z}_0^{(2)}$ and $\mathbf{Z}_1^{(2)}$ are linearly dependent (not independent).
Substituting $\mathbf{Z}_1^{(2)}$ into the path: \(\mathbf{Z}_t^{(2)} = (1-t)\mathbf{Z}_0^{(2)} + t \left[ \boldsymbol{\mu}_1 + \dfrac{\sigma_1}{\sigma_0} (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0) \right]\) Rearranging: \(\mathbf{Z}_t^{(2)} = \left[(1-t) + t \dfrac{\sigma_1}{\sigma_0}\right] \mathbf{Z}_0^{(2)} + t \boldsymbol{\mu}_1 - t \dfrac{\sigma_1}{\sigma_0} \boldsymbol{\mu}_0\) This is an affine transformation of $\mathbf{Z}_0^{(2)}$, so $\mathbf{Z}_t^{(2)}$ is again Gaussian:
- Mean: \(\mathbb{E}[\mathbf{Z}_t^{(2)}] = \left[(1-t) + t \dfrac{\sigma_1}{\sigma_0}\right] \boldsymbol{\mu}_0 + t \boldsymbol{\mu}_1 - t \dfrac{\sigma_1}{\sigma_0} \boldsymbol{\mu}_0 = (1-t)\boldsymbol{\mu}_0 + t\boldsymbol{\mu}_1 = \boldsymbol{\mu}_t\)
- Variance (per coordinate): \(\text{Var}(\mathbf{Z}_t^{(2)}) = \left[(1-t) + t \dfrac{\sigma_1}{\sigma_0}\right]^2 \text{Var}(\mathbf{Z}_0^{(2)}) = \left[(1-t) + t \dfrac{\sigma_1}{\sigma_0}\right]^2 \sigma_0^2 = \left[(1-t)\sigma_0 + t\sigma_1\right]^2\)
The PDF is therefore: \(\boxed{p^{(2)}(\mathbf{x}, t) = \mathcal{N}\left( \mathbf{x} \mid \boldsymbol{\mu}_t, \left[(1-t)\sigma_0 + t\sigma_1\right]^2 \mathbf{I} \right)}\)
Key comparison

| Property | First rectified flow $p^{(1)}(\mathbf{x}, t)$ | Second rectified flow $p^{(2)}(\mathbf{x}, t)$ |
| --- | --- | --- |
| Distribution type | Gaussian | Gaussian |
| Mean | $\boldsymbol{\mu}_t = (1-t)\boldsymbol{\mu}_0 + t\boldsymbol{\mu}_1$ | same |
| Variance | $(1-t)^2 \sigma_0^2 + t^2 \sigma_1^2$ | $[(1-t)\sigma_0 + t\sigma_1]^2$ |
| Trajectory linearity | Nonlinear ($\sigma_t = \sqrt{(1-t)^2\sigma_0^2 + t^2\sigma_1^2}$) | Linear ($\sigma_t = (1-t)\sigma_0 + t\sigma_1$) |
| Data dependence | $\mathbf{Z}_0$ and $\mathbf{Z}_1$ independent | $\mathbf{Z}_1^{(2)}$ generated linearly from $\mathbf{Z}_0^{(2)}$ |
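Physical interpretation
- First rectified flow:
  - The variance $(1-t)^2 \sigma_0^2 + t^2 \sigma_1^2$ reflects the addition of variances of independent Gaussians (nonlinear in the standard deviation).
  - The ODE trajectories are curved, because $\mathbf{Z}_0$ and $\mathbf{Z}_1$ are independent and their noise adds up under interpolation.
- Second rectified flow:
  - The variance $[(1-t)\sigma_0 + t\sigma_1]^2$ corresponds to linear interpolation of the standard deviation (the optimal transport path).
  - The trajectories are straight, because $\mathbf{Z}_1^{(2)}$ is a deterministic linear map of $\mathbf{Z}_0^{(2)}$ with no extra noise.
\(\boxed{ \begin{array}{c} \text{First rectified flow:} \\ p^{(1)}(\mathbf{x}, t) = \mathcal{N}\left( \mathbf{x} \mid \boldsymbol{\mu}_{t},\ \left[(1-t)^{2} \sigma_{0}^{2} + t^{2} \sigma_{1}^{2}\right] \mathbf{I} \right) \\ \\ \text{Second rectified flow:} \\ p^{(2)}(\mathbf{x}, t) = \mathcal{N}\left( \mathbf{x} \mid \boldsymbol{\mu}_{t},\ \left[(1-t)\sigma_{0} + t\sigma_{1}\right]^{2} \mathbf{I} \right) \end{array} }\)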
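The two variance formulas and the two covariances can be checked by Monte Carlo. This is a rough 1-D sketch with illustrative parameter values; the helper name `interpolant_stats` is hypothetical and only used here. It draws $Z_0$, builds $Z_1$ either independently or through the linear map $\mu_1 + \frac{\sigma_1}{\sigma_0}(Z_0-\mu_0)$, and compares sample statistics of $Z_t = (1-t)Z_0 + tZ_1$ with the closed forms.

```python
import numpy as np

def interpolant_stats(z0, z1, t):
    """Return (variance of Z_t, covariance of Z_0 and Z_1) from paired samples."""
    z_t = (1 - t) * z0 + t * z1
    return z_t.var(), np.cov(z0, z1)[0, 1]

# Illustrative 1-D parameters (assumed for this sketch).
mu0, mu1, s0, s1 = -1.0, 2.0, 1.0, 3.0
t = 0.5
n = 200_000
rng = np.random.default_rng(0)

z0 = rng.normal(mu0, s0, n)
z1_indep = rng.normal(mu1, s1, n)              # independent coupling (first flow)
z1_coupled = mu1 + s1 / s0 * (z0 - mu0)        # deterministic coupling (after reflow)

var_indep, cov_indep = interpolant_stats(z0, z1_indep, t)
var_coupled, cov_coupled = interpolant_stats(z0, z1_coupled, t)

var_indep_theory = (1 - t) ** 2 * s0 ** 2 + t ** 2 * s1 ** 2    # 2.5 for these values
var_coupled_theory = ((1 - t) * s0 + t * s1) ** 2               # 4.0 for these values
print(var_indep, var_indep_theory)
print(var_coupled, var_coupled_theory)
print(cov_indep, cov_coupled, s0 * s1)
```

The sample values should match the theory up to Monte Carlo noise: the independent coupling has near-zero covariance, while the coupled pairs have covariance close to $\sigma_0\sigma_1$.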
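Procedure
Step 1: Sample $x_0$ and $x_1$ independently and run flow matching, i.e. regress $v_t(x)$ onto the conditional velocity $\dfrac{dx_t}{dt}=x_1-x_0$ of the path $x_t=(1-t)x_0+tx_1$.
Step 2: Using the $v_t(x)$ from Step 1, start from each originally sampled $x_0$ and integrate the flow to obtain its corresponding $x_1$. This creates coupled pairs ($x_0, x_1$).
Two twists:
- Do we need to resample $x_0$?
- Can we go in the reverse direction and obtain $x_0$ from $x_1$?
Step 3: Run flow matching again on the coupled ($x_0, x_1$).
The full procedure is as follows.
- Inputs: a coupling pair ($X_0, X_1$). In the first round $\mathbf{Z}^0 =(Z_0, Z_1)=(X_0, X_1)$ comes from independent sampling; in every later round the inputs are coupled pairs.
- Training: fit an averaged vector field $v_\theta$.
- Sampling: using that vector field, solve the ODE $\dfrac{dZ_t}{dt}=v_{\theta}(Z_t)$ with $Z_0\sim\pi_0$, or integrate in the reverse direction.
- Collect the new coupling pairs and repeat.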
![[Pasted image 20250625102913.png]]
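The loop above can be sketched for the 1-D Gaussian case without any neural network. In this sketch, which is an illustration under stated assumptions and not the training procedure of the rectified flow paper, the closed-form first-pass field stands in for the trained $v_\theta$, the ODE is integrated with RK4 to create coupled pairs, and the "retraining" step is replaced by an ordinary least-squares line fit of $X_1 - X_0$ against $X_t$ at one fixed $t$. The names `v_first`, `simulate`, and `fit_linear_field` are hypothetical.

```python
import numpy as np

# Illustrative 1-D parameters (assumed for this sketch).
MU0, MU1, S0, S1 = 0.0, 3.0, 1.0, 2.0

def v_first(x, t):
    """Closed-form first-pass field; stands in for a trained v_theta."""
    mu_t = (1 - t) * MU0 + t * MU1
    var_t = (1 - t) ** 2 * S0 ** 2 + t ** 2 * S1 ** 2
    return (MU1 - MU0) + (t * S1 ** 2 - (1 - t) * S0 ** 2) / var_t * (x - mu_t)

def simulate(v, z0, n_steps=200):
    """Integrate dz/dt = v(z, t) from t=0 to t=1 with RK4 (vectorized over samples)."""
    h = 1.0 / n_steps
    z = np.array(z0, dtype=float)
    for k in range(n_steps):
        t = k * h
        k1 = v(z, t)
        k2 = v(z + 0.5 * h * k1, t + 0.5 * h)
        k3 = v(z + 0.5 * h * k2, t + 0.5 * h)
        k4 = v(z + h * k3, t + h)
        z = z + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

def fit_linear_field(x0, x1, t):
    """Least-squares fit of (x1 - x0) as a linear function of x_t at a fixed t."""
    x_t = (1 - t) * x0 + t * x1
    slope, intercept = np.polyfit(x_t, x1 - x0, 1)
    return slope, intercept

rng = np.random.default_rng(0)
z0 = rng.normal(MU0, S0, 5000)          # Step 2: keep the sampled x_0 ...
z1 = simulate(v_first, z0)              # ... and flow each one to its x_1

# Step 3: refit on the coupled pairs and compare with the closed-form reflowed field
# v(x, t) = ((S1 - S0) x + S0 MU1 - S1 MU0) / ((1 - t) S0 + t S1).
t = 0.3
slope, intercept = fit_linear_field(z0, z1, t)
s_t = (1 - t) * S0 + t * S1
slope_expected = (S1 - S0) / s_t
intercept_expected = (S0 * MU1 - S1 * MU0) / s_t
print(slope, slope_expected, intercept, intercept_expected)
```

Because the coupled pairs are a deterministic function of $x_0$, the regression target is an exact linear function of $x_t$ here, so a plain line fit is enough; in general the field is nonlinear and a neural network is used instead.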
Gaussian ReFlow
How do we tell whether the path is a straight line? In the simplest Gaussian-to-Gaussian case, the question is whether the mean and the standard deviation (not the variance) are both affine functions of $t$.
So the new coupling pair ($Z_0, Z_1$) becomes $(\mathbf{x}_0, \boldsymbol{\mu}_1+\boldsymbol{\Sigma}_1^{1/2}\boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right))$ in the forward direction and $( \boldsymbol{\mu}_0+\boldsymbol{\Sigma}_0^{1/2}\boldsymbol{\Sigma}_1^{-1/2} \left( \mathbf{x}_1 - \boldsymbol{\mu}_1 \right), \mathbf{x}_1)$ in the reverse direction.
The new displacement is $Z_1 - Z_0 = \boldsymbol{\mu}_1+\boldsymbol{\Sigma}_1^{1/2}\boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)-\mathbf{x}_0= (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) +(\boldsymbol{\Sigma}_1^{1/2}\boldsymbol{\Sigma}_0^{-1/2} -I)\left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)$
or, in the reverse direction, $Z_1 - Z_0 = \mathbf{x}_1-( \boldsymbol{\mu}_0+\boldsymbol{\Sigma}_0^{1/2}\boldsymbol{\Sigma}_1^{-1/2} \left( \mathbf{x}_1 - \boldsymbol{\mu}_1 \right))=(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) +(I - \boldsymbol{\Sigma}_0^{1/2}\boldsymbol{\Sigma}_1^{-1/2})\left( \mathbf{x}_1 - \boldsymbol{\mu}_1 \right)$
- If $\Sigma_0=\Sigma_1$, then $\dfrac{dz_t}{dt} = Z_1 - Z_0=\mu_1 - \mu_0$ is indeed a constant vector (a straight line), and $z_t = (\mu_1-\mu_0)t + Z_0$ is simply a translation by the mean difference, with $z_t(t=0)=Z_0$ and $z_t(t=1)=Z_1$.
- If $\Sigma_0 \ne \Sigma_1$, assume the isotropic case $\Sigma_0=\sigma_0^2 I$, $\Sigma_1=\sigma_1^2 I$.
We need to solve $\dfrac{dz_t}{dt} = Z_1 - Z_0=(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) +(\dfrac{\sigma_1}{\sigma_0} -1)\left( \mathbf{Z}_0 - \boldsymbol{\mu}_0 \right)$
The next step is to replace $Z_0$ by $z_t$ and solve the resulting differential equation.
Express $Z_0$ in Terms of $\mathbf{z}_t$ (Appendix A)
From the trajectory definition: \(\mathbf{z}_t = (1-t)Z_0 + tZ_1\) Substitute $Z_1$ again: \(\mathbf{z}_t = (1-t)Z_0 + t\left[\boldsymbol{\mu}_1 + \frac{\sigma_1}{\sigma_0}(Z_0 - \boldsymbol{\mu}_0)\right]\) Solve for $Z_0$: \(Z_0 = \frac{\mathbf{z}_t - \left[t\boldsymbol{\mu}_1 - t\frac{\sigma_1}{\sigma_0}\boldsymbol{\mu}_0\right]}{(1-t) + t\frac{\sigma_1}{\sigma_0}} = \frac{\mathbf{z}_t \sigma_0 - t\boldsymbol{\mu}_1 \sigma_0 + t{\sigma_1}\boldsymbol{\mu}_0}{(1-t)\sigma_0 + t{\sigma_1}}\)
Substitute into Velocity Expression
Plug this $Z_0$ back into the velocity: \(\frac{d\mathbf{z}_t}{dt} = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left(\frac{\sigma_1}{\sigma_0} - 1\right)\left(\frac{\mathbf{z} \sigma_0 - t\boldsymbol{\mu}_1 \sigma_0 + t{\sigma_1}\boldsymbol{\mu}_0}{(1-t)\sigma_0 + t{\sigma_1}} - \boldsymbol{\mu}_0\right)\)
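Combining the terms over the common denominator gives the reflowed field as a function of the current position (this is $v_t^{(2)}$ of the summary above, written in terms of $\mathbf{z}$ instead of $\mathbf{Z}_0$): \(\frac{d\mathbf{z}_t}{dt} = \boldsymbol{v}_t^{(2)}(\mathbf{z}) = \frac{(\sigma_1 - \sigma_0)\mathbf{z} + \sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{\sigma_0(1-t) + t\sigma_1}\)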
Solve the differential Equation (Appendix B)
The result is very simple: a straight line from $Z_0$ to $Z_1$, even when $\sigma_0\ne\sigma_1$. \(\boxed{ \begin{array}{c} \mathbf{z}(t) = \mathbf{A} t + \mathbf{Z}_{0}\\ \mathbf{A} = (\boldsymbol{\mu}_{1} - \boldsymbol{\mu}_{0}) + \left( \dfrac{\sigma_{1}}{\sigma_{0}} - 1 \right) (\mathbf{Z}_{0} - \boldsymbol{\mu}_{0}) \end{array} }\)
At $t=0$, $\mathbf{z}(t=0)=\mathbf{Z}_0$. At $t=1$, $\mathbf{z}(t=1)=\mathbf{A}+\mathbf{Z}_0=\mathbf{Z}_1$, so $\mathbf{A}=(\mathbf{Z}_1-\mathbf{Z}_0) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) +(\frac{\sigma_1} {\sigma_0} -1)\left( \mathbf{Z}_0 - \boldsymbol{\mu}_0 \right)$
This shows that an isotropic Gaussian flow becomes a straight line after a single reflow.
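A practical consequence of the straight-line result is that one Euler step from $t=0$ to $t=1$ already lands on $Z_1$. The snippet below is a small sketch with illustrative values; `v_reflow` is a hypothetical name for the closed-form field $\frac{(\sigma_1-\sigma_0)\mathbf{z} + \sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{\sigma_0(1-t)+t\sigma_1}$. It also checks that the velocity evaluated along the line $\mathbf{z}(t) = \mathbf{Z}_0 + t\mathbf{A}$ stays equal to $\mathbf{A}$.

```python
import numpy as np

def v_reflow(z, t, mu0, mu1, s0, s1):
    """Closed-form velocity field after one reflow (isotropic Gaussian case)."""
    s_t = s0 * (1 - t) + s1 * t
    return ((s1 - s0) * z + s0 * mu1 - s1 * mu0) / s_t

# Illustrative parameters (assumed for this sketch).
mu0, mu1 = np.array([0.0, 0.0]), np.array([4.0, 1.0])
s0, s1 = 1.0, 0.5
z0 = np.array([0.7, -1.2])

z1_target = mu1 + s1 / s0 * (z0 - mu0)                        # reflow coupling
A = (mu1 - mu0) + (s1 / s0 - 1.0) * (z0 - mu0)                # constant direction

# One Euler step with step size 1, starting at t = 0.
z1_one_step = z0 + 1.0 * v_reflow(z0, 0.0, mu0, mu1, s0, s1)

# Velocity sampled along the straight line z(t) = z0 + t A.
ts = np.linspace(0.0, 1.0, 11)
velocities = np.array([v_reflow(z0 + t * A, t, mu0, mu1, s0, s1) for t in ts])
print(z1_one_step, z1_target)
```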
Reverse Reflow
In the reverse direction, $Z_1 - Z_0 = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) +(I - \boldsymbol{\Sigma}_0^{1/2}\boldsymbol{\Sigma}_1^{-1/2})\left( \mathbf{Z}_1 - \boldsymbol{\mu}_1 \right)$
$Z_0 = Z_1 - (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) +(\boldsymbol{\Sigma}_0^{1/2}\boldsymbol{\Sigma}_1^{-1/2}-I)\left( \mathbf{Z}_1 - \boldsymbol{\mu}_1 \right) = \boldsymbol{\mu}_0 + \boldsymbol{\Sigma}_0^{1/2}\boldsymbol{\Sigma}_1^{-1/2}\left( \mathbf{Z}_1 - \boldsymbol{\mu}_1 \right)$
Express $Z_1$ in Terms of $\mathbf{z}_t$
From the trajectory definition: \(\mathbf{z}_t = (1-t)Z_0 + tZ_1\) Substitute $Z_0$ and solve $Z_1$: \(Z_1 = ?\) and Solve \(\frac{d\mathbf{z}_t}{dt} = Z_1 - Z_0\)
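One possible way to carry out this step, assuming the isotropic case $\boldsymbol{\Sigma}_i = \sigma_i^2 \mathbf{I}$ and using the shorthand $s(t) = (1-t)\sigma_0 + t\sigma_1$ and $\boldsymbol{\mu}_t = (1-t)\boldsymbol{\mu}_0 + t\boldsymbol{\mu}_1$, is the following worked sketch. It starts from the reverse coupling $Z_0 = \boldsymbol{\mu}_0 + \frac{\sigma_0}{\sigma_1}(Z_1 - \boldsymbol{\mu}_1)$:

\[\begin{aligned} \mathbf{z}_t &= (1-t)\left[\boldsymbol{\mu}_0 + \frac{\sigma_0}{\sigma_1}(Z_1 - \boldsymbol{\mu}_1)\right] + tZ_1 = \boldsymbol{\mu}_t + \frac{s(t)}{\sigma_1}(Z_1 - \boldsymbol{\mu}_1) \\ Z_1 &= \boldsymbol{\mu}_1 + \frac{\sigma_1}{s(t)}(\mathbf{z}_t - \boldsymbol{\mu}_t), \qquad Z_0 = \boldsymbol{\mu}_0 + \frac{\sigma_0}{s(t)}(\mathbf{z}_t - \boldsymbol{\mu}_t) \\ \frac{d\mathbf{z}_t}{dt} &= Z_1 - Z_0 = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \frac{\sigma_1 - \sigma_0}{s(t)}(\mathbf{z}_t - \boldsymbol{\mu}_t) \end{aligned}\]

Under these assumptions the reverse coupling leads to the same velocity field as the forward one, so the reverse trajectories are the same straight lines traversed backwards.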
How does this generalize to a general flow, for example a GMM?
How does it compare with the Sinkhorn algorithm?
Summary
\(v_t^{(1)}(\mathbf{x})=({\boldsymbol{\mu}}_1-{\boldsymbol{\mu}}_0) + \dot{\boldsymbol{\Sigma}}_t^{1/2} \boldsymbol{\Sigma}_0^{-1/2} \left( \mathbf{x}_0 - \boldsymbol{\mu}_0 \right)=({\boldsymbol{\mu}}_1-{\boldsymbol{\mu}}_0) + {\dot{\sigma}_t} {\dfrac{\mathbf{x}_0 - \boldsymbol{\mu}_0}{\sigma_0}}\) \(\begin{array}{c} {v}_t^{(2)} = (\boldsymbol{\mu}_{1} - \boldsymbol{\mu}_{0}) + \left( \dfrac{\sigma_{1}}{\sigma_{0}} - 1 \right) (\mathbf{Z}_{0} - \boldsymbol{\mu}_{0})=(\boldsymbol{\mu}_{1} - \boldsymbol{\mu}_{0}) + ({\sigma_{1}-\sigma_{0}}) \dfrac{(\mathbf{Z}_{0} - \boldsymbol{\mu}_{0})}{\sigma_0} \end{array}\) A single reflow is enough to make the path straight.
$\dot{\sigma}_t=\sigma_1-\sigma_0$, i.e. $\sigma_t = (1-t)\sigma_0 + t \sigma_1$
Reference
MIT 6.S184: Flow Matching and Diffusion Models https://www.youtube.com/watch?v=GCoP2w-Cqtg&t=28s&ab_channel=PeterHolderrieth
Yaron Meta paper: [2210.02747] Flow Matching for Generative Modeling
An Introduction to Flow Matching: https://mlg.eng.cam.ac.uk/blog/2024/01/20/flow-matching.html
youtube video: next generation model https://www.youtube.com/watch?v=swKdn-qT47Q&ab_channel=%E9%98%BF%E5%86%89
MIT Kaiming He https://arxiv.org/abs/2505.13447
Appendix A: 1st Reflow as Straight Line
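Step-by-Step Explanation of the Velocity Field Derivation after 1 Reflow Iteration
In this appendix the superscript in $\boldsymbol{v}_t^{(1)}$ counts reflow iterations, so it denotes the field that the summary sections call $v_t^{(2)}$ (the second rectified flow).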
1. Velocity Definition
The velocity is defined as the time derivative of the trajectory $\mathbf{z}_t$: \(\frac{d\mathbf{z}_t}{dt} = Z_1 - Z_0\) This comes directly from the definition of the straight-line trajectory $\mathbf{z}_t = (1-t)Z_0 + tZ_1$.
2. Substitute $Z_1$
From the reflow coupling: \(Z_1 = \boldsymbol{\mu}_1 + \frac{\sigma_1}{\sigma_0} (Z_0 - \boldsymbol{\mu}_0)\) Substitute this into the velocity expression: \(Z_1 - Z_0 = \left(\boldsymbol{\mu}_1 + \frac{\sigma_1}{\sigma_0} (Z_0 - \boldsymbol{\mu}_0)\right) - Z_0\) Simplify: \(= \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 + \frac{\sigma_1}{\sigma_0}Z_0 - \frac{\sigma_1}{\sigma_0}\boldsymbol{\mu}_0 - Z_0\) \(= (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left(\frac{\sigma_1}{\sigma_0} - 1\right)(Z_0 - \boldsymbol{\mu}_0)\)
3. Express $Z_0$ in Terms of $\mathbf{z}_t$
From the trajectory definition: \(\mathbf{z}_t = (1-t)Z_0 + tZ_1\) Substitute $Z_1$ again: \(\mathbf{z}_t = (1-t)Z_0 + t\left[\boldsymbol{\mu}_1 + \frac{\sigma_1}{\sigma_0}(Z_0 - \boldsymbol{\mu}_0)\right]\) Expand: \(= (1-t)Z_0 + t\boldsymbol{\mu}_1 + t\frac{\sigma_1}{\sigma_0}Z_0 - t\frac{\sigma_1}{\sigma_0}\boldsymbol{\mu}_0\) Group $Z_0$ terms: \(= \left[(1-t) + t\frac{\sigma_1}{\sigma_0}\right]Z_0 + \left[t\boldsymbol{\mu}_1 - t\frac{\sigma_1}{\sigma_0}\boldsymbol{\mu}_0\right]\) Solve for $Z_0$: \(Z_0 = \frac{\mathbf{z}_t - \left[t\boldsymbol{\mu}_1 - t\frac{\sigma_1}{\sigma_0}\boldsymbol{\mu}_0\right]}{(1-t) + t\frac{\sigma_1}{\sigma_0}}\)
4. Substitute into Velocity Expression
Plug this $Z_0$ back into the velocity: \(\frac{d\mathbf{z}_t}{dt} = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left(\frac{\sigma_1}{\sigma_0} - 1\right)\left(\frac{\mathbf{z}_t - t\boldsymbol{\mu}_1 + t\frac{\sigma_1}{\sigma_0}\boldsymbol{\mu}_0}{(1-t) + t\frac{\sigma_1}{\sigma_0}} - \boldsymbol{\mu}_0\right)\)
5. Simplify to Final Form
After algebraic manipulation (combining terms over a common denominator), we get the closed-form velocity field:
\[\boxed{\boldsymbol{v}_t^{(1)}(\mathbf{z}) = \frac{(\sigma_1 - \sigma_0)\mathbf{z} + \sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{\sigma_0(1-t) + t\sigma_1}}\]
Where:
- $\mathbf{z}$ is the current position at time $t$
- $\sigma_0, \sigma_1$ are standard deviations
- $\boldsymbol{\mu}_0, \boldsymbol{\mu}_1$ are means
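The algebra summarized in step 5 can be written out explicitly. The following worked sketch uses the shorthand $s(t) = (1-t)\sigma_0 + t\sigma_1$ and $\boldsymbol{\mu}_t = (1-t)\boldsymbol{\mu}_0 + t\boldsymbol{\mu}_1$, which is introduced here only for compactness:

\[\begin{aligned} Z_0 - \boldsymbol{\mu}_0 &= \frac{\sigma_0\mathbf{z}_t - t\sigma_0\boldsymbol{\mu}_1 + t\sigma_1\boldsymbol{\mu}_0 - s(t)\boldsymbol{\mu}_0}{s(t)} = \frac{\sigma_0(\mathbf{z}_t - \boldsymbol{\mu}_t)}{s(t)} \\ \frac{d\mathbf{z}_t}{dt} &= (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \frac{(\sigma_1 - \sigma_0)(\mathbf{z}_t - \boldsymbol{\mu}_t)}{s(t)} \\ &= \frac{(\sigma_1 - \sigma_0)\mathbf{z}_t + s(t)(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) - (\sigma_1 - \sigma_0)\boldsymbol{\mu}_t}{s(t)} = \frac{(\sigma_1 - \sigma_0)\mathbf{z}_t + \sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{s(t)} \end{aligned}\]

The last equality uses $s(t)(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) - (\sigma_1 - \sigma_0)\boldsymbol{\mu}_t = \sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0$, in which all $t$-dependent terms cancel.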
Key Insights
- Linear in $\mathbf{z}$: The velocity field is linear in $\mathbf{z}$, making it computationally efficient.
- Time-Dependent Denominator: The denominator $\sigma_0(1-t) + t\sigma_1$ acts as a time-varying scaling factor.
- Constant Offset: The term $\sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0$ provides a constant direction bias.
- Optimal Transport Case: When $\sigma_0 = \sigma_1$, this simplifies to $\boldsymbol{v}_t^{(1)} = \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0$, the constant optimal transport displacement.
Why This Matters
This closed-form solution:
- Shows reflow straightens trajectories in just 1 iteration
- Provides an exact reference for validating neural network implementations
- Reveals the mathematical structure of the velocity field
- Confirms that for equal variances, we recover the optimal transport solution
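The derivation shows that after reflow the velocity along each trajectory is constant, even though the field still depends on $t$ through its denominator. In this Gaussian case a single Euler step from $t=0$ to $t=1$ therefore integrates the ODE exactly, which is the source of the sampling-efficiency gain.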
Appendix B: Elaborate on Appendix A
To demonstrate that the trajectory $\mathbf{z}_t$ is a straight line, we solve the ordinary differential equation (ODE) given by:
\[\frac{d\mathbf{z}_t}{dt} = \boldsymbol{v}_t^{(1)}(\mathbf{z}) = \frac{(\sigma_1 - \sigma_0)\mathbf{z} + \sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{\sigma_0(1-t) + t\sigma_1}\]
Define the denominator as a function of $t$:
\[s(t) = \sigma_0(1-t) + t\sigma_1 = \sigma_0 + t(\sigma_1 - \sigma_0)\]
and the constant vector:
\[\mathbf{b} = \sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0\]
The ODE becomes:
\[\frac{d\mathbf{z}}{dt} = \frac{(\sigma_1 - \sigma_0)\mathbf{z} + \mathbf{b}}{s(t)}\]
This is a first-order linear ODE. Rearrange it into standard form:
\[\frac{d\mathbf{z}}{dt} - \frac{\sigma_1 - \sigma_0}{s(t)} \mathbf{z} = \frac{\mathbf{b}}{s(t)}\]
Step 1: Solve the Homogeneous Equation
The homogeneous equation is:
\[\frac{d\mathbf{z}_h}{dt} = \frac{\sigma_1 - \sigma_0}{s(t)} \mathbf{z}_h\]
Separate variables and integrate (componentwise, since the coefficient is a scalar):
\[\int \frac{d\mathbf{z}_h}{\mathbf{z}_h} = \int \frac{\sigma_1 - \sigma_0}{s(t)} dt\]
The right-hand side integral is:
\[\int \frac{\sigma_1 - \sigma_0}{\sigma_0 + t(\sigma_1 - \sigma_0)} dt = \int \frac{du}{u} = \ln|u| + C = \ln|s(t)| + C\]
where $u = s(t)$ and $du = (\sigma_1 - \sigma_0) dt$. Thus,
\[\ln|\mathbf{z}_h| = \ln|s(t)| + C \quad \Rightarrow \quad \mathbf{z}_h(t) = \mathbf{C} s(t)\]
where $\mathbf{C}$ is a constant vector.
Step 2: Find a Particular Solution
Assume $\sigma_1 \ne \sigma_0$ (when $\sigma_1 = \sigma_0$ the ODE reduces to $d\mathbf{z}/dt = \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0$, which is already a straight line) and look for a constant particular solution $\mathbf{z}_p = \mathbf{k}$. Substituting into the ODE:
\[0 - \frac{\sigma_1 - \sigma_0}{s(t)} \mathbf{k} = \frac{\mathbf{b}}{s(t)}\]
This simplifies to:
\[-(\sigma_1 - \sigma_0)\mathbf{k} = \mathbf{b} \quad \Rightarrow \quad \mathbf{k} = -\frac{\mathbf{b}}{\sigma_1 - \sigma_0} = \frac{\mathbf{b}}{\sigma_0 - \sigma_1}\]
Substitute $\mathbf{b} = \sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0$:
\[\mathbf{k} = \frac{\sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{\sigma_0 - \sigma_1}\]
Step 3: General Solution
The general solution is the sum of homogeneous and particular solutions:
\[\mathbf{z}(t) = \mathbf{z}_h(t) + \mathbf{z}_p = \mathbf{C} s(t) + \frac{\sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{\sigma_0 - \sigma_1}\]
Substitute $s(t) = \sigma_0(1-t) + t\sigma_1$:
\[\mathbf{z}(t) = \mathbf{C} \left[ \sigma_0(1-t) + t\sigma_1 \right] + \frac{\sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{\sigma_0 - \sigma_1}\]
Step 4: Linearity in $t$
Expand the expression:
\[\mathbf{z}(t) = \mathbf{C} \sigma_0 (1-t) + \mathbf{C} t \sigma_1 + \mathbf{d}\]
where $\mathbf{d} = \frac{\sigma_0\boldsymbol{\mu}_1 - \sigma_1\boldsymbol{\mu}_0}{\sigma_0 - \sigma_1}$ is a constant vector. Rearranging terms:
\[\mathbf{z}(t) = t \left[ \mathbf{C} (\sigma_1 - \sigma_0) \right] + \left[ \mathbf{C} \sigma_0 + \mathbf{d} \right]\]
Define constant vectors $\mathbf{A} = \mathbf{C} (\sigma_1 - \sigma_0)$ and $\mathbf{B} = \mathbf{C} \sigma_0 + \mathbf{d}$:
\[\mathbf{z}(t) = \mathbf{A} t + \mathbf{B}\]
This is the parametric equation of a straight line in the vector space of $\mathbf{z}$, with direction vector $\mathbf{A}$ and intercept $\mathbf{B}$ at $t=0$.
Conclusion
The solution $\mathbf{z}(t)$ is linear in $t$, confirming that the trajectory is a straight line.
\[\boxed{\text{The solution } \mathbf{z}_t \text{ is a straight line because it is linear in } t\text{, as shown by } \mathbf{z}(t) = \mathbf{A}t + \mathbf{B}\text{ for constant vectors } \mathbf{A} \text{ and } \mathbf{B}.}\]
Boundary Condition Check
Using the derivation above and the boundary conditions, the constant vectors $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ and $\mathbf{d}$ are:
1. $\mathbf{A}$ (direction vector)
- Definition: $\mathbf{A} = \mathbf{Z}_1 - \mathbf{Z}_0$
- Explicit expression:
\(\mathbf{A} = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left( \frac{\sigma_1}{\sigma_0} - 1 \right) (\mathbf{Z}_0 - \boldsymbol{\mu}_0)\)
2. $\mathbf{B}$ (intercept vector, the start point)
- Boundary condition: $\mathbf{z}(t=0) = \mathbf{Z}_0$
- Expression:
\(\mathbf{B} = \mathbf{Z}_0\)
3. $\mathbf{C}$ (proportionality coefficient)
- Relation: $\mathbf{A} = \mathbf{C} (\sigma_1 - \sigma_0)$
- Explicit expression:
\(\mathbf{C} = \frac{\mathbf{A}}{\sigma_1 - \sigma_0} = \frac{1}{\sigma_1 - \sigma_0} \left[ (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left( \frac{\sigma_1}{\sigma_0} - 1 \right) (\mathbf{Z}_0 - \boldsymbol{\mu}_0) \right]\)
4. $\mathbf{d}$ (constant offset)
- Definition:
\(\mathbf{d} = \frac{\sigma_0 \boldsymbol{\mu}_1 - \sigma_1 \boldsymbol{\mu}_0}{\sigma_0 - \sigma_1} = \frac{\sigma_1 \boldsymbol{\mu}_0 - \sigma_0 \boldsymbol{\mu}_1}{\sigma_1 - \sigma_0}\)
Verifying the relation $\mathbf{B} = \mathbf{C} \sigma_0 + \mathbf{d}$
- Substitute $\mathbf{C}$ and $\mathbf{d}$:
\(\mathbf{C} \sigma_0 + \mathbf{d} = \left[ \frac{\mathbf{A}}{\sigma_1 - \sigma_0} \right] \sigma_0 + \frac{\sigma_1 \boldsymbol{\mu}_0 - \sigma_0 \boldsymbol{\mu}_1}{\sigma_1 - \sigma_0}\)
- Substitute $\mathbf{A}$ and simplify:
\(= \frac{1}{\sigma_1 - \sigma_0} \left[ \sigma_0 \mathbf{A} + (\sigma_1 \boldsymbol{\mu}_0 - \sigma_0 \boldsymbol{\mu}_1) \right]\) \(= \frac{1}{\sigma_1 - \sigma_0} \left[ \sigma_0 \left( (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left( \frac{\sigma_1}{\sigma_0} - 1 \right) (\mathbf{Z}_0 - \boldsymbol{\mu}_0) \right) + \sigma_1 \boldsymbol{\mu}_0 - \sigma_0 \boldsymbol{\mu}_1 \right]\)
- After expanding, the $\boldsymbol{\mu}_0$ and $\boldsymbol{\mu}_1$ terms cancel and the bracket reduces to $(\sigma_1 - \sigma_0)\mathbf{Z}_0$, so:
\(\mathbf{C} \sigma_0 + \mathbf{d} = \mathbf{Z}_0 = \mathbf{B}\) which satisfies the boundary condition.
Final summary
\(\boxed{ \begin{array}{c} \mathbf{A} = (\boldsymbol{\mu}_{1} - \boldsymbol{\mu}_{0}) + \left( \dfrac{\sigma_{1}}{\sigma_{0}} - 1 \right) (\mathbf{Z}_{0} - \boldsymbol{\mu}_{0}) \\ \\ \mathbf{B} = \mathbf{Z}_{0} \\ \\ \mathbf{C} = \dfrac{1}{\sigma_{1} - \sigma_{0}} \left[ (\boldsymbol{\mu}_{1} - \boldsymbol{\mu}_{0}) + \left( \dfrac{\sigma_{1}}{\sigma_{0}} - 1 \right) (\mathbf{Z}_{0} - \boldsymbol{\mu}_{0}) \right] \\ \\ \mathbf{d} = \dfrac{\sigma_{0} \boldsymbol{\mu}_{1} - \sigma_{1} \boldsymbol{\mu}_{0}}{\sigma_{0} - \sigma_{1}} \end{array} }\)
Appendix C: Mean Flow Two Gaussian Case
Given the specific conditions $\sigma_0^2 = \sigma_1^2 = \sigma^2$, $\boldsymbol{\mu}_0 = -\boldsymbol{\mu}$, and $\boldsymbol{\mu}_1 = +\boldsymbol{\mu}$, the instantaneous vector field is:
\[\mathbf{v}_t(\mathbf{x}) = \frac{(2t-1)\mathbf{x} + \boldsymbol{\mu}}{2t^2 - 2t + 1}\]
The mean flow matching field $u(z_t, t_{\text{end}}, t)$ is defined as the time-average of $\mathbf{v}$ along the trajectory from $t$ to $t_{\text{end}}$:
\[u(z_t, t_{\text{end}}, t) = \frac{1}{t_{\text{end}} - t} \int_t^{t_{\text{end}}} \mathbf{v}_s(\mathbf{x}_s) ds\]
where the trajectory $\mathbf{x}_s$ starting from $z_t$ at time $t$ is given by:
\[\mathbf{x}_s = \boldsymbol{\mu}_s + \frac{\sigma_s}{\sigma_t} (z_t - \boldsymbol{\mu}_t)\]
with:
- $\boldsymbol{\mu}_s = (2s - 1) \boldsymbol{\mu}$
- $\sigma_s = \sigma \sqrt{2s^2 - 2s + 1}$
- $\boldsymbol{\mu}_t = (2t - 1) \boldsymbol{\mu}$
- $\sigma_t = \sigma \sqrt{2t^2 - 2t + 1}$
Substitute $\mathbf{v}_s(\mathbf{x}_s)$ and simplify:
\[\mathbf{v}_s(\mathbf{x}_s) = \frac{(2s-1)\mathbf{x}_s + \boldsymbol{\mu}}{2s^2 - 2s + 1}\]
Using $\mathbf{x}_s = \boldsymbol{\mu}_s + \frac{\sigma_s}{\sigma_t} (z_t - \boldsymbol{\mu}_t)$ and $\boldsymbol{\mu}_s = (2s-1)\boldsymbol{\mu}$, together with $(2s-1)^2 + 1 = 2(2s^2 - 2s + 1)$:
\[\mathbf{v}_s(\mathbf{x}_s) = 2\boldsymbol{\mu} + \frac{(2s-1)}{\sqrt{2s^2 - 2s + 1} \sqrt{2t^2 - 2t + 1}} (z_t - \boldsymbol{\mu}_t)\]
The integral becomes:
\[\int_t^{t_{\text{end}}} \mathbf{v}_s(\mathbf{x}_s) ds = 2\boldsymbol{\mu} (t_{\text{end}} - t) + \frac{1}{\sqrt{2t^2 - 2t + 1}} (z_t - \boldsymbol{\mu}_t) \left[ \sqrt{2s^2 - 2s + 1} \right]_t^{t_{\text{end}}}\]
Evaluating the definite integral:
\[\int_t^{t_{\text{end}}} \mathbf{v}_s(\mathbf{x}_s) ds = 2\boldsymbol{\mu} (t_{\text{end}} - t) + \frac{\sqrt{2t_{\text{end}}^2 - 2t_{\text{end}} + 1} - \sqrt{2t^2 - 2t + 1}}{\sqrt{2t^2 - 2t + 1}} (z_t - \boldsymbol{\mu}_t)\]
Divide by $t_{\text{end}} - t$:
\[u(z_t, t_{\text{end}}, t) = 2\boldsymbol{\mu} + \frac{\sqrt{2t_{\text{end}}^2 - 2t_{\text{end}} + 1} - \sqrt{2t^2 - 2t + 1}}{\sqrt{2t^2 - 2t + 1} \cdot (t_{\text{end}} - t)} (z_t - \boldsymbol{\mu}_t)\]
Substituting $\boldsymbol{\mu}_t = (2t - 1) \boldsymbol{\mu}$:
\[\boxed{u(z_t, t_{\text{end}}, t) = 2\boldsymbol{\mu} + \dfrac{\sqrt{2t_{\text{end}}^2 - 2t_{\text{end}} + 1} - \sqrt{2t^2 - 2t + 1}}{\sqrt{2t^2 - 2t + 1} \cdot (t_{\text{end}} - t)} \left( z_t - (2t - 1) \boldsymbol{\mu} \right)}\]
where:
- $z_t$ is the position at time $t$,
- $t_{\text{end}}$ is the end time,
- $t$ is the current time,
- $\boldsymbol{\mu}$ is a constant vector,
- $\sigma$ is a constant scalar (though it cancels out in the expression).
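This is the mean-flow field for the special case $\sigma_0^2 = \sigma_1^2$ with symmetric means $\pm\boldsymbol{\mu}$, written out explicitly for these parameters. It equals $(\mathbf{x}_{t_{\text{end}}} - z_t)/(t_{\text{end}} - t)$, the average displacement along the trajectory.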
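The boxed expression can be sanity-checked against direct numerical integration of the definition. The code below is a minimal sketch with illustrative values; the function names are hypothetical. It averages the instantaneous velocity along the closed-form trajectory with a trapezoid rule and compares the result with the closed-form $u(z_t, t_{\text{end}}, t)$.

```python
import numpy as np

def q(r):
    """sigma_r / sigma = sqrt(2 r^2 - 2 r + 1)."""
    return np.sqrt(2 * r ** 2 - 2 * r + 1)

def v_inst(x, t, mu):
    """Instantaneous field ((2t - 1) x + mu) / (2 t^2 - 2 t + 1)."""
    return ((2 * t - 1) * x + mu) / (2 * t ** 2 - 2 * t + 1)

def x_traj(s, t, z_t, mu):
    """Trajectory through z_t at time t, evaluated at time s."""
    return (2 * s - 1) * mu + q(s) / q(t) * (z_t - (2 * t - 1) * mu)

def u_mean_closed(z_t, t_end, t, mu):
    """Closed-form mean-flow field from the boxed formula."""
    coef = (q(t_end) - q(t)) / (q(t) * (t_end - t))
    return 2 * mu + coef * (z_t - (2 * t - 1) * mu)

def u_mean_quadrature(z_t, t_end, t, mu, n=2001):
    """Time-average of v along the trajectory via the trapezoid rule."""
    s = np.linspace(t, t_end, n)
    vals = np.array([v_inst(x_traj(si, t, z_t, mu), si, mu) for si in s])
    h = (t_end - t) / (n - 1)
    integral = h * (0.5 * vals[0] + vals[1:-1].sum(axis=0) + 0.5 * vals[-1])
    return integral / (t_end - t)

# Illustrative values (assumed for this sketch).
mu = np.array([1.0, -0.5])
z_t = np.array([0.3, 0.8])
t, t_end = 0.2, 0.9

u_closed = u_mean_closed(z_t, t_end, t, mu)
u_numeric = u_mean_quadrature(z_t, t_end, t, mu)
displacement_rate = (x_traj(t_end, t, z_t, mu) - z_t) / (t_end - t)
print(u_closed, u_numeric, displacement_rate)
```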
Appendix E : Transport Cost
The transport cost of a flow quantifies the “effort” required to move mass from an initial distribution $\pi_0$ to a target distribution $\pi_1$ along trajectories defined by a flow. For continuous-time flows governed by an ODE $d\mathbf{z}_t = \boldsymbol{v}(\mathbf{z}_t, t)dt$, the transport cost is defined as:
Kinetic Energy-Based Transport Cost
\(\mathcal{C}(\boldsymbol{v}) = \mathbb{E}\left[ \int_0^1 \frac{1}{2} \lVert \boldsymbol{v}(\mathbf{z}_t, t) \rVert^2 dt \right]\)
Components Explained:
- Integral of Kinetic Energy:
- $\frac{1}{2} \lVert \boldsymbol{v}(\mathbf{z}_t, t) \rVert^2$ represents the kinetic energy at position $\mathbf{z}_t$ and time $t$
- Kinetic energy quantifies the “effort” of moving mass at velocity $\boldsymbol{v}$
- Expectation $\mathbb{E}$:
- Taken over all trajectories $\mathbf{z}_t$ starting from $\mathbf{z}_0 \sim \pi_0$
- Accounts for all possible paths weighted by their probability
- Time Integral $\int_0^1$:
- Accumulates the total kinetic energy over the entire transport process
- From $t=0$ ($\pi_0$) to $t=1$ ($\pi_1$)
Connection to Optimal Transport
This cost directly relates to the $L^2$-Wasserstein distance $W_2$ via the Benamou-Brenier formula: \(W_2^2(\pi_0, \pi_1) = \min_{\boldsymbol{v}} \left\{ 2 \cdot \mathcal{C}(\boldsymbol{v}) \bigm| \partial_t \rho_t + \nabla \cdot (\rho_t \boldsymbol{v}) = 0 \right\}\) where $\rho_t$ is the probability density at time $t$, satisfying $\rho_0 = \pi_0$, $\rho_1 = \pi_1$.
Key Implications:
- The flow minimizing $\mathcal{C}(\boldsymbol{v})$ achieves optimal transport
- $2 \cdot \mathcal{C}(\boldsymbol{v}) = W_2^2(\pi_0, \pi_1)$ for the optimal flow
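- Reflow does not increase convex transport costs such as this one, so Rectified Flow with reflow moves toward this minimum, although it is not guaranteed to reach it for arbitrary distributions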
Practical Computation
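To estimate $\mathcal{C}(\boldsymbol{v})$ numerically, one straightforward approach is Monte Carlo: draw samples $\mathbf{z}_0 \sim \pi_0$, integrate the ODE forward in time, accumulate $\frac{1}{2} \lVert \boldsymbol{v}(\mathbf{z}_t, t) \rVert^2\,dt$ along each trajectory, and average over the samples.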
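A minimal sketch of a Monte Carlo estimate of $\mathcal{C}(\boldsymbol{v})$, assuming NumPy, Euler time stepping, and a left-endpoint rule for the time integral. The function name `transport_cost`, the step count, and the example parameters are illustrative choices, not taken from the original notes. The example velocity field is the straight-line field between two isotropic Gaussians, so the estimate can be compared with the closed-form value $\frac{1}{2}\left[\lVert \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \rVert^2 + d(\sigma_1 - \sigma_0)^2\right]$.

```python
import numpy as np

def transport_cost(velocity, z0, n_steps=200):
    """Monte Carlo estimate of E[ int_0^1 0.5 * ||v(z_t, t)||^2 dt ].

    velocity(z, t) maps an (n, d) array and a scalar time to an (n, d) array.
    z0 holds n samples from pi_0. Trajectories are simulated with Euler steps
    and the time integral uses the left-endpoint rule.
    """
    z = np.array(z0, dtype=float)
    dt = 1.0 / n_steps
    energy = np.zeros(len(z))
    for k in range(n_steps):
        v = velocity(z, k * dt)
        energy += 0.5 * np.sum(v**2, axis=1) * dt  # kinetic energy of this step
        z = z + v * dt                             # Euler update of the samples
    return energy.mean()

# Hypothetical example: straight-line flow between isotropic Gaussians.
rng = np.random.default_rng(0)
d = 2
mu0, mu1 = np.zeros(d), np.array([3.0, 0.0])
s0, s1 = 1.0, 2.0
z0 = mu0 + s0 * rng.standard_normal((20000, d))

def v_straight(z, t):
    mu_t = (1 - t) * mu0 + t * mu1
    scale = (s1 - s0) / ((1 - t) * s0 + t * s1)
    return (mu1 - mu0) + scale * (z - mu_t)

estimate = transport_cost(v_straight, z0)
closed_form = 0.5 * (np.sum((mu1 - mu0) ** 2) + d * (s1 - s0) ** 2)
print(estimate, closed_form)
```

For a learned velocity network, `velocity` would wrap the model's forward pass; the remaining error then comes from the finite sample size and from the ODE discretization.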
Special Case: Gaussian Distributions
For $\pi_0 = \mathcal{N}(\boldsymbol{\mu}_0, \sigma_0^2 I)$, $\pi_1 = \mathcal{N}(\boldsymbol{\mu}_1, \sigma_1^2 I)$:
After 1 Reflow Iteration:
\(\mathcal{C}(\boldsymbol{v}^{(1)}) = \frac{1}{2} \left[ \lVert \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \rVert^2 + d(\sigma_1 - \sigma_0)^2 \right]\) where $d$ = dimension. This equals $\frac{1}{2} W_2^2(\pi_0, \pi_1)$ – the minimal possible transport cost.
Why This Matters:
- Reflow achieves optimal transport in 1 iteration for isotropic Gaussians
- Cost reduces to 2 terms:
- $\lVert \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \rVert^2$: Cost of moving centers
- $d(\sigma_1 - \sigma_0)^2$: Cost of changing the standard deviation from $\sigma_0$ to $\sigma_1$ in each of the $d$ dimensions
- Trajectories are straight and traversed at constant speed, minimizing kinetic energy
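The two-term cost can be checked by hand. The following is a sketch under the assumption that, after reflow, each sample moves along the straight line from $\mathbf{z}_0$ to $\mathbf{z}_1 = \boldsymbol{\mu}_1 + \frac{\sigma_1}{\sigma_0}(\mathbf{z}_0 - \boldsymbol{\mu}_0)$ at constant velocity $\mathbf{z}_1 - \mathbf{z}_0$, so the time integral is trivial:

\(\begin{aligned} \mathbf{z}_1 - \mathbf{z}_0 &= (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left(\frac{\sigma_1}{\sigma_0} - 1\right)(\mathbf{z}_0 - \boldsymbol{\mu}_0) \\ \mathcal{C} &= \frac{1}{2}\,\mathbb{E}\left[\lVert \mathbf{z}_1 - \mathbf{z}_0 \rVert^2\right] = \frac{1}{2}\left[\lVert \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \rVert^2 + \left(\frac{\sigma_1}{\sigma_0} - 1\right)^2 d\,\sigma_0^2\right] \\ &= \frac{1}{2}\left[\lVert \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \rVert^2 + d(\sigma_1 - \sigma_0)^2\right] \end{aligned}\)

The cross term vanishes because $\mathbb{E}[\mathbf{z}_0 - \boldsymbol{\mu}_0] = \mathbf{0}$, and $\mathbb{E}\lVert \mathbf{z}_0 - \boldsymbol{\mu}_0 \rVert^2 = d\,\sigma_0^2$.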
Key Properties of Transport Cost
- Lower is better: Measures efficiency of mass transport
- Sensitive to path curvature:
- Curved paths → higher $\lVert \boldsymbol{v} \rVert$ → higher cost
- Straight paths minimize cost for given endpoints
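- Not invariant to time reparameterization: For a fixed path over $[0, 1]$, the cost is smallest at constant speed (by the Cauchy-Schwarz inequality), so it depends on the speed profile as well as on path geometry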
- Benchmarks flow quality: Lower cost ≈ better flow
- Matches OT theory: $\min \mathcal{C} = \frac{1}{2} W_2^2$ for optimal flows
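To illustrate how a non-straight trajectory raises the cost, the sketch below compares the scale part of the kinetic energy for two paths between isotropic Gaussians. The first is the closed-form trajectory $\mathbf{x}(t) = \boldsymbol{\mu}_t + \frac{\sigma_t}{\sigma_0}(\mathbf{x}_0 - \boldsymbol{\mu}_0)$ with $\sigma_t = \sqrt{(1-t)^2\sigma_0^2 + t^2\sigma_1^2}$. The second is the straight path whose scale goes linearly from $\sigma_0$ to $\sigma_1$. The mean part contributes $\lVert \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \rVert^2$ in both cases, so only $\int_0^1 \dot{\sigma}_t^2\,dt$ is compared with $(\sigma_1 - \sigma_0)^2$. The parameter values are assumptions chosen for illustration.

```python
import numpy as np

s0, s1 = 1.0, 2.0
t = np.linspace(0.0, 1.0, 10001)

# Scale factor of the non-straight trajectory and its time derivative.
sigma_t = np.sqrt((1 - t) ** 2 * s0**2 + t**2 * s1**2)
dsigma_dt = (t * s1**2 - (1 - t) * s0**2) / sigma_t

# Per-dimension scale energy: trapezoidal rule for the integral of (d sigma_t/dt)^2.
f = dsigma_dt**2
curved = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t))

# Straight path: the scale changes at the constant rate (s1 - s0).
straight = (s1 - s0) ** 2

print(curved, straight)
```

The full costs are $\frac{1}{2}\left[\lVert \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \rVert^2 + d \cdot \texttt{curved}\right]$ and $\frac{1}{2}\left[\lVert \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \rVert^2 + d \cdot \texttt{straight}\right]$; the first comes out larger even though both paths share the same endpoints.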
The transport cost provides a rigorous, physics-grounded metric to evaluate and compare flow-based generative models, with Rectified Flow + reflow achieving the theoretical minimum for isotropic Gaussian distributions.
Appendix F
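Conclusion: the $\boldsymbol{v}_t(\mathbf{x})$ of the first and second rectified flows are not equal
Although the first and second rectified flows are trained on interpolation paths of the same form (straight lines in both cases), the closed-form expressions of their velocity fields $\boldsymbol{v}_t(\mathbf{x})$ are not the same. A detailed analysis follows: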
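1. Why do the $\boldsymbol{v}_t(\mathbf{x})$ of the first and second rectified flows differ?
(1) The training pairs are generated differently

| Rectified flow iteration | Relation between $\mathbf{Z}_0$ and $\mathbf{Z}_1$ | Path construction | Velocity field computation |
|---|---|---|---|
| First rectified flow | $\mathbf{Z}_0$ and $\mathbf{Z}_1$ are sampled independently | $\mathbf{Z}_t = (1-t)\mathbf{Z}_0 + t\mathbf{Z}_1$ | Conditional expectation under the joint distribution |
| Second rectified flow | $\mathbf{Z}_1$ is generated linearly from $\mathbf{Z}_0$ ($\mathbf{Z}_1 = T(\mathbf{Z}_0)$) | $\mathbf{Z}_t = (1-t)\mathbf{Z}_0 + t\mathbf{Z}_1$ | Deterministic linear relation |
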
- First rectified flow: because $\mathbf{Z}_0$ and $\mathbf{Z}_1$ are independent, computing $\mathbb{E}[\mathbf{Z}_1 - \mathbf{Z}_0 \mid \mathbf{Z}_t = \mathbf{x}]$ requires the joint distribution, which gives: \(\boldsymbol{v}_t(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \frac{t\sigma_1^2 - (1-t)\sigma_0^2}{(1-t)^2\sigma_0^2 + t^2\sigma_1^2} (\mathbf{x} - \boldsymbol{\mu}_t)\)
- Second rectified flow: because $\mathbf{Z}_1$ is a linear function of $\mathbf{Z}_0$, computing $\mathbb{E}[\mathbf{Z}_1 - \mathbf{Z}_0 \mid \mathbf{Z}_t = \mathbf{x}]$ reduces to a direct substitution, which gives: \(\boldsymbol{v}_t^{(2)}(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left( \frac{\sigma_1}{\sigma_0} - 1 \right) \left( \frac{\mathbf{x} - t\boldsymbol{\mu}_1 + t \frac{\sigma_1}{\sigma_0} \boldsymbol{\mu}_0}{1 - t + t \frac{\sigma_1}{\sigma_0}} - \boldsymbol{\mu}_0 \right)\)
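The substitution behind the second expression can be written out step by step. The shorthand $r = \sigma_1/\sigma_0$ is introduced here only for readability. Starting from $\mathbf{Z}_1 = \boldsymbol{\mu}_1 + r(\mathbf{Z}_0 - \boldsymbol{\mu}_0)$ and $\mathbf{Z}_t = \mathbf{x}$:

\(\begin{aligned} \mathbf{x} &= (1-t)\mathbf{Z}_0 + t\left[\boldsymbol{\mu}_1 + r(\mathbf{Z}_0 - \boldsymbol{\mu}_0)\right] = (1 - t + tr)\,\mathbf{Z}_0 + t\boldsymbol{\mu}_1 - tr\boldsymbol{\mu}_0 \\ \mathbf{Z}_0 &= \frac{\mathbf{x} - t\boldsymbol{\mu}_1 + tr\boldsymbol{\mu}_0}{1 - t + tr}, \qquad \mathbf{Z}_0 - \boldsymbol{\mu}_0 = \frac{\mathbf{x} - \boldsymbol{\mu}_t}{1 - t + tr} \\ \mathbf{Z}_1 - \mathbf{Z}_0 &= (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + (r - 1)(\mathbf{Z}_0 - \boldsymbol{\mu}_0) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \frac{\sigma_1 - \sigma_0}{(1-t)\sigma_0 + t\sigma_1}(\mathbf{x} - \boldsymbol{\mu}_t) \end{aligned}\)

Since $1 - t + tr > 0$ on $[0, 1]$, the map from $\mathbf{Z}_0$ to $\mathbf{x}$ is invertible, so the conditional expectation has a single value to average over and equals the last line.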
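(2) The closed forms differ
Both expressions contain:
- a mean-shift term $(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0)$,
- a linear scaling term in $(\mathbf{x} - \boldsymbol{\mu}_t)$,
but the scaling coefficients have different structure:
- First rectified flow: the coefficient is $\frac{t\sigma_1^2 - (1-t)\sigma_0^2}{(1-t)^2\sigma_0^2 + t^2\sigma_1^2}$ (it depends nonlinearly on $\sigma_0, \sigma_1, t$).
- Second rectified flow: the coefficient is $\left( \frac{\sigma_1}{\sigma_0} - 1 \right) \frac{1}{1 - t + t \frac{\sigma_1}{\sigma_0}} = \frac{\sigma_1 - \sigma_0}{(1-t)\sigma_0 + t\sigma_1}$ (written explicitly in terms of the ratio $\frac{\sigma_1}{\sigma_0}$).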
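(3) The interpretations differ
- First rectified flow: the velocity field is fitted to independently sampled pairs, so it is a genuine conditional expectation that averages over all pairs $(\mathbf{Z}_0, \mathbf{Z}_1)$ whose interpolation passes through $\mathbf{x}$ at time $t$.
- Second rectified flow: the coupling is already given by the optimal map ($\mathbf{Z}_1 = T(\mathbf{Z}_0)$), so exactly one pair passes through each $(\mathbf{x}, t)$ and the velocity field reduces to a deterministic evaluation.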
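2. Which paths are straight?
Both flows are trained on straight interpolation paths $\mathbf{Z}_t = (1-t)\mathbf{Z}_0 + t\mathbf{Z}_1$, but the ODE trajectories they generate are not the same:
- First rectified flow: the ODE solution is $\mathbf{x}(t) = \boldsymbol{\mu}_t + \frac{\sigma_t}{\sigma_0}(\mathbf{x}_0 - \boldsymbol{\mu}_0)$ with $\sigma_t = \sqrt{(1-t)^2\sigma_0^2 + t^2\sigma_1^2}$, which is not of the form $\mathbf{A}t + \mathbf{B}$ unless $\mathbf{x}_0 = \boldsymbol{\mu}_0$. Only the interpolation paths used for training are straight.
- Second rectified flow: because $\mathbf{Z}_1$ is a linear function of $\mathbf{Z}_0$, the interpolation lines do not cross, so the ODE trajectories coincide with them and are exactly straight ($\mathbf{Z}_t = \mathbf{A}t + \mathbf{B}$).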
3. Mathematical verification
(1) $\boldsymbol{v}_t(\mathbf{x})$ of the first rectified flow
\(\boldsymbol{v}_t(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \frac{t\sigma_1^2 - (1-t)\sigma_0^2}{(1-t)^2\sigma_0^2 + t^2\sigma_1^2} (\mathbf{x} - \boldsymbol{\mu}_t)\)
(2) $\boldsymbol{v}_t^{(2)}(\mathbf{x})$ of the second rectified flow
\(\boldsymbol{v}_t^{(2)}(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0) + \left( \frac{\sigma_1}{\sigma_0} - 1 \right) \left( \frac{\mathbf{x} - t\boldsymbol{\mu}_1 + t \frac{\sigma_1}{\sigma_0} \boldsymbol{\mu}_0}{1 - t + t \frac{\sigma_1}{\sigma_0}} - \boldsymbol{\mu}_0 \right)\)
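(3) Why the two are not equal
- The first coefficient $\frac{t\sigma_1^2 - (1-t)\sigma_0^2}{(1-t)^2\sigma_0^2 + t^2\sigma_1^2}$ is not algebraically equal to the second coefficient $\left( \frac{\sigma_1}{\sigma_0} - 1 \right) \frac{1}{1 - t + t \frac{\sigma_1}{\sigma_0}}$.
- For example, at $t = 0.5$:
- First coefficient: $\frac{0.5\sigma_1^2 - 0.5\sigma_0^2}{0.25\sigma_0^2 + 0.25\sigma_1^2} = \frac{\sigma_1^2 - \sigma_0^2}{0.5(\sigma_0^2 + \sigma_1^2)}$.
- Second coefficient: $\left( \frac{\sigma_1}{\sigma_0} - 1 \right) \frac{1}{0.5 + 0.5 \frac{\sigma_1}{\sigma_0}} = \frac{2(\sigma_1 - \sigma_0)}{\sigma_0 + \sigma_1}$. Their ratio is $\frac{(\sigma_0 + \sigma_1)^2}{\sigma_0^2 + \sigma_1^2} \ne 1$ for positive $\sigma_0, \sigma_1$, so the two coefficients differ whenever $\sigma_0 \ne \sigma_1$ (when $\sigma_0 = \sigma_1$ both vanish at $t = 0.5$, but they still differ at other values of $t$).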
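The comparison can also be checked numerically. The snippet below is a small sketch that evaluates both scaling coefficients; the function names `coef_first` and `coef_second` and the parameter values are illustrative assumptions.

```python
def coef_first(t, s0, s1):
    """Scaling coefficient of the first rectified flow."""
    return (t * s1**2 - (1 - t) * s0**2) / ((1 - t) ** 2 * s0**2 + t**2 * s1**2)

def coef_second(t, s0, s1):
    """Scaling coefficient of the second rectified flow."""
    r = s1 / s0
    return (r - 1) / (1 - t + t * r)

# Unequal standard deviations at t = 0.5.
c1 = coef_first(0.5, 1.0, 2.0)
c2 = coef_second(0.5, 1.0, 2.0)

# Equal standard deviations at t = 0.25: the second coefficient is zero,
# the first is not.
e1 = coef_first(0.25, 1.0, 1.0)
e2 = coef_second(0.25, 1.0, 1.0)

print(c1, c2, e1, e2)
```

With $\sigma_0 = 1$, $\sigma_1 = 2$ the two values at $t = 0.5$ are $1.2$ and $2/3$, matching the closed forms above.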
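4. Summary
- The expression for $\boldsymbol{v}_t(\mathbf{x})$ of the first rectified flow differs from the expression for $\boldsymbol{v}_t^{(2)}(\mathbf{x})$ of the second rectified flow.
- Both are trained on straight interpolation paths, but the velocity fields are computed differently:
- First: a conditional expectation over independently sampled pairs.
- Second: a deterministic linear relation.
- Convergence of Rectified Flow: for isotropic Gaussians, the first rectified flow already yields the optimal transport map from $\mathbf{Z}_0$ to $\mathbf{Z}_1$; the second rectified flow leaves this endpoint map unchanged and only straightens the trajectories.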
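The following sketch integrates both velocity fields from the same starting point to illustrate that they reach the same endpoint $\boldsymbol{\mu}_1 + \frac{\sigma_1}{\sigma_0}(\mathbf{x}_0 - \boldsymbol{\mu}_0)$ while passing through different intermediate points. It assumes NumPy and a hand-written classical RK4 integrator; the helper `integrate` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

mu0, mu1 = np.array([0.0, 0.0]), np.array([3.0, 1.0])
s0, s1 = 1.0, 2.0

def v_first(x, t):
    """Velocity field of the first rectified flow (independent coupling)."""
    mu_t = (1 - t) * mu0 + t * mu1
    c = (t * s1**2 - (1 - t) * s0**2) / ((1 - t) ** 2 * s0**2 + t**2 * s1**2)
    return (mu1 - mu0) + c * (x - mu_t)

def v_second(x, t):
    """Velocity field of the second rectified flow (linear coupling)."""
    mu_t = (1 - t) * mu0 + t * mu1
    c = (s1 - s0) / ((1 - t) * s0 + t * s1)
    return (mu1 - mu0) + c * (x - mu_t)

def integrate(v, x0, n_steps=1000):
    """Classical RK4 from t=0 to t=1; returns the whole trajectory."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    path = [x.copy()]
    for k in range(n_steps):
        t = k * dt
        k1 = v(x, t)
        k2 = v(x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = v(x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = v(x + dt * k3, t + dt)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        path.append(x.copy())
    return np.array(path)

x0 = np.array([1.0, -0.5])
path1 = integrate(v_first, x0)
path2 = integrate(v_second, x0)
target = mu1 + (s1 / s0) * (x0 - mu0)  # closed-form endpoint map

print(path1[-1], path2[-1], target)    # endpoints agree
print(path1[500], path2[500])          # midpoints differ
```

The midpoint of the second trajectory lies on the straight segment from $\mathbf{x}_0$ to the target, while the midpoint of the first does not.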
Appendix F: Covariance of $Z_0$ and $Z_1$
In the second rectified flow, the covariance of $\mathbf{Z}_0^{(2)}$ and $\mathbf{Z}_1^{(2)}$ is computed as follows:
1. Definitions
- $\mathbf{Z}_0^{(2)} \sim \mathcal{N}(\boldsymbol{\mu}_0, \sigma_0^2 \mathbf{I})$
- $\mathbf{Z}_1^{(2)} = \boldsymbol{\mu}_1 + \dfrac{\sigma_1}{\sigma_0} (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0)$
2. Computing the cross-covariance matrix
The cross-covariance is defined as: \(\text{Cov}(\mathbf{Z}_0^{(2)}, \mathbf{Z}_1^{(2)}) = \mathbb{E}\left[ (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0) (\mathbf{Z}_1^{(2)} - \boldsymbol{\mu}_1)^\top \right]\)
Substituting $\mathbf{Z}_1^{(2)} - \boldsymbol{\mu}_1 = \dfrac{\sigma_1}{\sigma_0} (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0)$: \(\text{Cov}(\mathbf{Z}_0^{(2)}, \mathbf{Z}_1^{(2)}) = \mathbb{E}\left[ (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0) \left( \dfrac{\sigma_1}{\sigma_0} (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0) \right)^\top \right]\)
Since $\dfrac{\sigma_1}{\sigma_0}$ is a constant scalar, it factors out of the expectation: \(= \dfrac{\sigma_1}{\sigma_0} \mathbb{E}\left[ (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0) (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0)^\top \right]\)
Here $\mathbb{E}\left[ (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0) (\mathbf{Z}_0^{(2)} - \boldsymbol{\mu}_0)^\top \right] = \text{Var}(\mathbf{Z}_0^{(2)}) = \sigma_0^2 \mathbf{I}$, and therefore: \(\boxed{\text{Cov}(\mathbf{Z}_0^{(2)}, \mathbf{Z}_1^{(2)}) = \dfrac{\sigma_1}{\sigma_0} \cdot \sigma_0^2 \mathbf{I} = \sigma_0 \sigma_1 \mathbf{I}}\)
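3. Interpretation
- The covariance is positive: $\sigma_0 \sigma_1 > 0$, so $\mathbf{Z}_0^{(2)}$ and $\mathbf{Z}_1^{(2)}$ are positively correlated.
- Source of the correlation: $\mathbf{Z}_1^{(2)}$ is a deterministic linear function of $\mathbf{Z}_0^{(2)}$, so each pair of matching components has correlation coefficient $\frac{\sigma_0\sigma_1}{\sigma_0 \cdot \sigma_1} = 1$.
- Contrast with the first rectified flow: there $\mathbf{Z}_0$ and $\mathbf{Z}_1$ are independent, so $\text{Cov}(\mathbf{Z}_0, \mathbf{Z}_1) = \mathbf{0}$.
4. Summary
\(\boxed{\text{Cov}\left(\mathbf{Z}_{0}^{(2)},\ \mathbf{Z}_{1}^{(2)}\right) = \sigma_{0} \sigma_{1} \mathbf{I}}\)
This result reflects the complete linear dependence between the paired samples in the second rectified flow, which is a direct consequence of the optimal transport map.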
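The boxed result can be checked empirically. The sketch below assumes NumPy and uses illustrative values for the dimension, sample size, means, and standard deviations. It draws $\mathbf{Z}_0^{(2)}$, applies the linear map to get $\mathbf{Z}_1^{(2)}$, and forms the sample cross-covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 200_000
mu0, mu1 = np.zeros(d), np.ones(d)
s0, s1 = 1.5, 0.5

# Z0 ~ N(mu0, s0^2 I) and Z1 = mu1 + (s1/s0) (Z0 - mu0)
z0 = mu0 + s0 * rng.standard_normal((n, d))
z1 = mu1 + (s1 / s0) * (z0 - mu0)

# Sample cross-covariance matrix E[(Z0 - mean0)(Z1 - mean1)^T]
cross_cov = (z0 - z0.mean(axis=0)).T @ (z1 - z1.mean(axis=0)) / (n - 1)

print(np.round(cross_cov, 3))
print(s0 * s1)  # theoretical diagonal value sigma_0 * sigma_1
```

The diagonal entries should be close to $\sigma_0\sigma_1$ and the off-diagonal entries close to zero, up to sampling noise.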