본문 바로가기

Math

Derivation of Fourier Series Based on Discrete Fourier Transform

  Fourier Series is a very useful tool in mathematics for analyzing functions, expressing an arbitrary continuous function as infinite sum of trigonometric functions. Fourier Series is derived on the basis of norm and orthogonality of functions, and an abstract algebraic description that the trigonometric series form an orthonormal basis. However, these concepts are not easy to understand at the elementary level, and the constriction of 'orthonormality' of the basis keeps us from expanding this theory to various sets of functions. For example, this way of approaching Fourier series don't lead us to any insight about the Taylor and Maclaurin series.

  Therefore, I will write about a completely different explanation of Fourier Series, based on Discrete Fourier Transform (DFT). While the idea behind Fourier Series is to approximate the function with finite sums, the idea of DFT is constructing a function that passes through n given points exactly. So DFT is much easier and requires less knowledge than FFT to understand. However, I realized that FFT can be justified partially by taking limit of DFT. The logic is not complete because it doesn't cover the whole nature of differences between discrete and continuous functions but rather relies on intution. Nevertheless, it seems clear that this derivation is much easier and intutitive, as well as applicable to other mathematical phenomena including Maclaurin series.

Discrete Fourier Transform (DFT)

  We are given a sequence of complex numbers $a_0, a_1, a_2, ..., a_{N-1}$. We want to express this sequence as a linear combination of exponential functions with frequency $0, 1, ..., N-1$ that is:

$$a_x=\sum _{n=0}^{N-1}A_n\exp\left(\frac{2\pi nxi}{N}\right)=\sum _{n=0}^{N-1}A_nw^{nx}$$

where $w=\exp(2\pi i/N)$ is the Nth root of unity. In order to find the coefficients $A_0, A_1, A_2, ..., A_{N-1}$, we right this in matrix notation:

$$\begin{pmatrix}
a_0\\ 
a_1 \\ 
a_2\\ 
...\\ 
a_{N-1}
\end{pmatrix}=T\begin{pmatrix}
A_0\\ 
A_1 \\ 
A_2\\ 
...\\ 
A_{N-1}
\end{pmatrix}, T=
\begin{pmatrix}
w^0 & w^0 & w^0 & ... & w^0\\ 
w^0 &  w^1&w^2  &...  &w^{N-1} \\ 
w^0 &w^2  &w^4  &...  &w^{2(N-1)} \\ 
... &...  &...  &...  &... \\ 
w^0 &w^{N-1}  &w^{2(N-1)}  &...  &w^{(N-1)(N-1)} 
\end{pmatrix}$$

Because $\sum _{k=0}^{N-1}w^{nk}=N\delta_{n, k}$, where the Kronecker delta $\delta_{n, k}$ is 1 if $n=k$ and 0 otherwise, the inverse of matrix T can be easily found. It can be shown that,

$$\begin{pmatrix}
A_0\\ 
A_1 \\ 
A_2\\ 
...\\ 
A_{N-1}
\end{pmatrix}=T^{-1}\begin{pmatrix}
a_0\\ 
a_1 \\ 
a_2\\ 
...\\ 
a_{N-1}
\end{pmatrix}, T^{-1}=\frac{1}{N}
\begin{pmatrix}
w^0 & w^0 & w^0 & ... & w^0\\ 
w^0 &  w^{-1}&w^{-2}  &...  &w^{-(N-1)} \\ 
w^0 &w^{-2}  &w^{-4}  &...  &w^{-2(N-1)} \\ 
... &...  &...  &...  &... \\ 
w^0 &w^{-(N-1)}  &w^{-2(N-1)}  &...  &w^{-(N-1)(N-1)} 
\end{pmatrix}$$

Hence the fourier coefficients are given by the above matrix equation, or more directly $A_n=\sum _{x=0}^{N-1}a_xw^{-nx}/N$.

Explaining Fourier Series with DFT

  The concept of Fourier series is essentially same as that of DFT, except for the difference that Fourier Series deals with continuous functions. Thus, we can speculate that it can be obtained by using DFT for sufficiently many points on the function. In fact, physical analysis programs use DFT(more precisely, an algorithm called Fast Fourier Transform that does DFT quickly) to express waves as a combination of sinusoidal waves, because the digital data stored in a computer is discrete. But we know that the results of software is same as Fourier Series of the continuous wave, provided that sampling rate is sufficiently high. Here, I will prove this fact mathematically.

  Suppose that $f:[0, P)\rightarrow\mathbb{C}$ is a complex-valued function, and we know N values $a_0=f(0), a_1=f(P/N), a_2=f(2P/N), ..., a_{N-1}=f((N-1)P/N)$. Then we can perform DFT with these N values, yielding the result $$f\left ( \frac{xP}{N} \right )=a_x=\sum_{n=0}^{N-1}A_n\exp\left(\frac{2\pi nxi}{N}\right), A_n=\frac{1}{N}\sum _{x=0}^{N-1}a_x\exp\left(-\frac{2\pi nxi}{N}\right)$$

  Now we will send N to infinity. Before that, we should remember that rearranging the terms preserves the sum for finite sum, but not for infinite series. It is likely that the exponential terms with low frequency is crucial in the sum. In $\exp(2\pi nxi/N)$ can be interpreted as a phasor rotating counterclockwise by $n/N$ parts of a circle every unit of x. However, since x is discrete for now, we can also think that the phasor rotates counterclockwise at frequency $(N-n)/N$. Consequently, the values for n which $n$ or $N-n$ is small, namely $n=0, 1, 2, ...$ and $n=N-1, N-2, ...$, are crucial. To emphasize this, we right the sum again as $$f\left ( \frac{xP}{N} \right )=\sum_{n=-(N-1)/2}^{(N-1)/2}A_n\exp\left(\frac{2\pi nxi}{N}\right)$$

for odd N, which is valid since $\exp\left(2\pi nxi/N\right)=\exp\left(2\pi (n-N)xi/N\right))$ for $(N+1)/2\leq n\leq N-1$.

  When sending N to infinity, the upper bound and lower bound in the summand goes to positive and negative infinity respectively, with the middle terms near $n=0$ with low frequency being the first terms to be calculated as it should. Replacing $x$ with $xP/N$ gives $$f(x)=\sum_{n=-\infty}^{+\infty}A_n\exp\left(\frac{2\pi nxi}{P}\right)$$

Furthermore, the equations for $A_n$ changes to integrals, giving $$A_n=\lim_{N\rightarrow\infty}\frac{1}{N}\sum _{x=0}^{N-1}f\left(\frac{xP}{N} \right )\exp\left(-\frac{2\pi nxi}{N}\right)=\frac{1}{P}\int_{0}^{P}f(t)\exp\left(\frac{-2\pi nti}{P}\right)dt$$

These equations make the well-known form of Fourier series.

Derivation of Maclaurin Series

  Maclaurin series also can be derived in a similar manner, that is, constructing polynomials for N discrete values and observing them as N tends to infinity. Suppose that we are given N values of the function $f: \mathbb{R}\rightarrow\mathbb{C}$, namely $f(0), f(1), ..., f(N-1)$. We want to make a polynomial with these N given points. As there are N constrictions, degree of $N-1$ will be necessary and sufficient, so we set $f(x)=a_0+a_1x+a_2x^2+...+a_{N-1}x^{N-1}$. Then a calculation similar to that of DFT gives $$\begin{pmatrix}
f(0)\\ 
f(1) \\ 
f(2)\\ 
...\\ 
f(N-1)
\end{pmatrix}=T\begin{pmatrix}
a_0\\ 
a_1 \\ 
a_2\\ 
...\\ 
a_{N-1}
\end{pmatrix}, T=
\begin{pmatrix}
0^0 & 0^1 & 0^2 & ... & 0^{N-1}\\ 
1^0 &  1^1&1^2  &...  &1^{N-1} \\ 
2^0 &2^1  &2^2  &...  &2^{N-1} \\ 
... &...  &...  &...  &... \\ 
(N-1)^0 &(N-1)^1 &(N-1)^2 &...  &(N-1)^{N-1}
\end{pmatrix}$$

where $0^0=1$ in this context. As all rows of T are linearly independent, T is invertible. However, it seems that the inverse of T cannot be found easily. However, we can find the last row of $T^{-1}$, which we set $x_0, x_1, ..., x_{N-1}$, easily. By the definition of matrix multiplication, these numbers should satisfy $$\sum_{k=0}^{N-1}x_k k^j=\delta_{N-1, j}=\left\{\begin{matrix}0, j\leq N-2
\\ 
1, j=N-1
\end{matrix}\right.$$

As the equations in this system are linearly independent, it is sufficient to find one such sequence of numbers. the case $j=N-1$ shows us that not all $x_i$ can be zero. So, we should find a nontrivial solution to the system of equations for $ j\leq N-2$ and then use $j=N-1$ to scale. The form of the equations for $ j\leq N-2$  reminds us of the equality $$\sum_{k=0}^{N-1}(-1)^k\binom{N-1}{k}k^j=0(0\leq j\leq N-2)$$.

This is derived via differentiating the identity $$(1+x)^{N-1}=\sum_{k=0}^{N-1}\binom{N-1}{k}x^k$$ $N-2$ times, substituting $x=-1$ and expanding the set of j for which the equation holds for every differentiation. Consequently, $x_k=C(-1)^k\binom{N-1}{k}$ for some constant C. The constriction for $j=N-1$ gives $$C\sum_{k=0}^{N-1}(-1)^k\binom{N-1}{k}k^{N-1}=1$$.

After differentiating the binomial theorem identity $N-1$ times, we get $$(N-1)!=\sum_{k=0}^{N-1}\binom{N-1}{k}k^{N-1}x^k \Rightarrow C=\frac{1}{(N-1)!}$$ because all powers of $k$ less than $N-1$ vanish. So we found the last row of $T^{-1}$, and $$a_{N-1}=\sum_{k=0}^{N-1}x_kf(k)=\frac{1}{(N-1)!}\sum_{k=0}^{N-1}(-1)^k\binom{N-1}{k}f(k)$$

On the other hand, note that for sufficiently large N (and f with sufficiently little change with respect to x; this can be achieved, for example, by "stratching" the graph of f by a factor of $\sqrt{N}$ each time), the derivatives can be estimated as follows:

$$f'(x)\approx f(x+1)-f(x)$$
$$f''(x)\approx f'(x+1)-f'(x)\approx(f(x+2)-f(x+1))-(f(x+1)-f(x))=f(x+2)-2f(x+1)+f(x)$$
$$f'''(x)\approx f''(x+1)-f''(x)\approx f(x+3)-3f(x+2)+3f(x+1)-f(x)$$

Mathematical induction shows that $$f^{(N)}(x)\approx\sum_{k=0}^{N-1}(-1)^k\binom{N-1}{k}f(x+k)$$, which means that the term $ a_{N-1}$ is in fact approximately $f^{(N-1)}(x)/(N-1)!$. This gives the Maclaurin series.