Merge branch 'main' of https://git.victormylle.be/VictorMylle/Thesis

Worked further on thesis
2024-05-08 17:56:59 +02:00 · 2024-05-08 17:53:19 +02:00
16 changed files with 569 additions and 423 deletions
--- a/Reports/Thesis/images/diffusion/results/intermediates/Testing
+++ b/Reports/Thesis/images/diffusion/results/intermediates/Testing
--- a/Reports/Thesis/images/diffusion/results/intermediates/Testing
+++ b/Reports/Thesis/images/diffusion/results/intermediates/Testing
--- a/Reports/Thesis/images/diffusion/results/intermediates/Testing
+++ b/Reports/Thesis/images/diffusion/results/intermediates/Testing
--- a/Reports/Thesis/images/diffusion/results/intermediates/Testing
+++ b/Reports/Thesis/images/diffusion/results/intermediates/Testing
--- a/Reports/Thesis/sections/results/diffusion.tex
+++ b/Reports/Thesis/sections/results/diffusion.tex
@@ -5,9 +5,38 @@ This training process can also be used for other data types. An image is just a

 Once the diffusion model is trained, it can generate new samples efficiently. Samples can be generated in parallel, which is not possible with autoregressive models. The diffusion model thus combines the parallel sample generation of non-autoregressive models with quarter-hour NRV values that still depend on each other. A batch of noise vectors can be sampled and passed through the model at once to generate the new samples. Each generated sample contains the 96 NRV values for the next day, without needing to sample every quarter sequentially.

-TODO: Visualization of the diffusion model in the context of the NRV data.
-
 The model is trained in a completely different way than the quantile regression models. A simple implementation of the Denoising Diffusion Probabilistic Model (DDPM) is used to perform the experiments. More complex implementations with more advanced techniques could improve the results, but this is outside the scope of this thesis. The goal is to show that more recent generative models can also be used to model the NRV data. These results can then be compared to the quantile regression models to see if the diffusion model can generate better samples.

 % TODO: In background information?
 First of all, the model architecture needs to be chosen. The model takes multiple inputs: the noisy NRV time series, the positional encoding of the current denoising step and the conditional input features. The model needs to predict the noise in the current time series. The time series can then be denoised by subtracting the predicted noise in every denoising step. Multiple model architectures can be used as long as the model can predict the noise in the time series. A simple feedforward neural network is used. The neural network consists of multiple linear layers with ReLU activation functions. To predict the noise in a noisy time series, the current denoising step index must also be provided. This integer is transformed into a vector using sine and cosine functions. The positional encoding is then concatenated with the noisy time series and the conditional input features. This tensor is passed through the first linear layer and activation function of the neural network, which results in a tensor of the chosen hidden size. Before passing this tensor to the next layer, the positional encoding and conditional input features are concatenated to it again. This process is repeated until the last layer is reached, so every layer in the neural network has the information necessary to predict the noise in the time series. The output of the last layer is the predicted noise in the time series. The model is trained by minimizing the mean squared error between the predicted noise and the real noise in the time series.
+
+Other hyperparameters that need to be chosen are the number of denoising steps, the number of layers and the hidden size of the neural network. Experiments are performed to gain insight into the influence these parameters have on the model performance. Results are shown in Table \ref{tab:diffusion_results}.
+
+\begin{figure}[h]
+    \centering
+    \begin{tikzpicture}
+        % First row
+        % Node for Image 1
+        \node (img1) {\includegraphics[width=0.45\textwidth]{images/diffusion/results/intermediates/Testing Intermediates 864_Sample intermediate 1_00000000.jpeg}};
+        % Node for Image 2 with an arrow from Image 1
+        \node[right=of img1] (img2) {\includegraphics[width=0.45\textwidth]{images/diffusion/results/intermediates/Testing Intermediates 864_Sample intermediate 2_00000000.jpeg}};
+        \draw[-latex] (img1) -- (img2);
+
+        % Second row
+        % Node for Image 3 below Image 1 with an arrow from Image 2
+        \node[below=of img1] (img3) {\includegraphics[width=0.45\textwidth]{images/diffusion/results/intermediates/Testing Intermediates 864_Sample intermediate 3_00000000.jpeg}};
+        % Node for Image 4 with an arrow from Image 3
+        \node[right=of img3] (img4) {\includegraphics[width=0.45\textwidth]{images/diffusion/results/intermediates/Testing Intermediates 864_Sample intermediate 4_00000000.jpeg}};
+        \draw[-latex] (img3) -- (img4);
+        
+        % Complex arrow from Image 2 to Image 3
+        % Calculate midpoint for the horizontal segment
+        \coordinate (Middle) at ($(img2.south)!0.5!(img3.north)$);
+        \draw[-latex] (img2.south) |- (Middle) -| (img3.north);
+    \end{tikzpicture}
+    \caption{Intermediate steps of the diffusion model for example 864 from the test set. The confidence intervals shown in the plots are made using 100 samples.}
+    \label{fig:diffusion_intermediates}
+\end{figure}
+
+Figure \ref{fig:diffusion_intermediates} shows multiple intermediate steps of the denoising process for an example from the test set. The model starts with noisy full-day NRV samples, which can be seen in the first steps. These noisy samples are then denoised in multiple steps until realistic samples are generated, as can be seen in the last image in the figure. The confidence intervals narrow over time as the noise is removed from the samples.
+
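The denoising process described in the added paragraph can be sketched as a standard DDPM ancestral sampling loop. This is a minimal sketch in plain Python, not the code from this repository: `make_schedule`, `sample` and the `predict_noise` callable are hypothetical names, the linear beta schedule is an assumption based on the `beta_start` and `beta_end` values in the trainer, and the conditional input features are left out for brevity.

```python
import math
import random


def make_schedule(noise_steps=300, beta_start=0.0001, beta_end=0.02):
    """Linear beta schedule (assumed) and the cumulative products of 1 - beta."""
    betas = [
        beta_start + (beta_end - beta_start) * i / (noise_steps - 1)
        for i in range(noise_steps)
    ]
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return betas, alpha_bars


def sample(predict_noise, ts_length=96, noise_steps=300, seed=0):
    """Generate one full-day sample by denoising pure Gaussian noise.

    predict_noise(x, t) is a hypothetical stand-in for the trained network:
    it returns the predicted noise for the series x at step index t.
    """
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(noise_steps)
    # Start from pure noise: one value per quarter of the day.
    x = [rng.gauss(0.0, 1.0) for _ in range(ts_length)]
    for t in reversed(range(noise_steps)):
        eps = predict_noise(x, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        scale = 1.0 / math.sqrt(1.0 - betas[t])
        # Remove the predicted noise contribution for this step.
        x = [scale * (xi - coef * ei) for xi, ei in zip(x, eps)]
        if t > 0:
            # Re-inject a smaller amount of noise on all but the final step.
            sigma = math.sqrt(betas[t])
            x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
    return x
```

Calling `sample` repeatedly with different seeds (for example 100 times) would give a set of full-day samples from which intervals like those in the figure could be computed.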
--- a/Reports/Thesis/verslag
+++ b/Reports/Thesis/verslag
--- a/Reports/Thesis/verslag.aux
+++ b/Reports/Thesis/verslag.aux
@@ -80,6 +80,8 @@
 \@writefile{lof}{\contentsline {figure}{\numberline {13}{\ignorespaces Over/underestimation of the quantiles for the autoregressive and non-autoregressive GRU models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }}{36}{figure.caption.22}\protected@file@percent }
 \newlabel{fig:gru_model_quantile_over_underestimation}{{13}{36}{Over/underestimation of the quantiles for the autoregressive and non-autoregressive GRU models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }{figure.caption.22}{}}
 \@writefile{toc}{\contentsline {subsection}{\numberline {6.3}Diffusion}{36}{subsection.6.3}\protected@file@percent }
+\@writefile{lof}{\contentsline {figure}{\numberline {14}{\ignorespaces Intermediate steps of the diffusion model for example 864 from the test set. The confidence intervals shown in the plots are made using 100 samples.\relax }}{38}{figure.caption.23}\protected@file@percent }
+\newlabel{fig:diffusion_intermediates}{{14}{38}{Intermediate steps of the diffusion model for example 864 from the test set. The confidence intervals shown in the plots are made using 100 samples.\relax }{figure.caption.23}{}}
 \@writefile{toc}{\contentsline {section}{\numberline {7}Policies for battery optimization}{38}{section.7}\protected@file@percent }
 \@writefile{toc}{\contentsline {subsection}{\numberline {7.1}Baselines}{38}{subsection.7.1}\protected@file@percent }
 \@writefile{toc}{\contentsline {subsection}{\numberline {7.2}Policies using NRV predictions}{38}{subsection.7.2}\protected@file@percent }
--- a/Reports/Thesis/verslag.bcf
+++ b/Reports/Thesis/verslag.bcf
@@ -2818,70 +2818,4 @@
      <bcf:entrytype>article</bcf:entrytype>
      <bcf:entrytype>report</bcf:entrytype>
      <bcf:constraint type="mandatory">
-        <bcf:field>author</bcf:field>
-        <bcf:field>title</bcf:field>
-      </bcf:constraint>
-    </bcf:constraints>
-  </bcf:datamodel>
-  <!-- CITATION DATA -->
-  <!-- SECTION 0 -->
-  <bcf:bibdata section="0">
-    <bcf:datasource type="file" datatype="bibtex" glob="false">./references.bib</bcf:datasource>
-  </bcf:bibdata>
-  <bcf:section number="0">
-    <bcf:citekey order="1" intorder="1">weron_electricity_2014</bcf:citekey>
-    <bcf:citekey order="2" intorder="1">poggi_electricity_2023</bcf:citekey>
-    <bcf:citekey order="3" intorder="1">lu_scenarios_2022</bcf:citekey>
-    <bcf:citekey order="4" intorder="1">dumas_deep_2022</bcf:citekey>
-    <bcf:citekey order="5" intorder="1">rasul_autoregressive_2021</bcf:citekey>
-    <bcf:citekey order="6" intorder="1">dumas_deep_2022</bcf:citekey>
-  </bcf:section>
-  <!-- SORTING TEMPLATES -->
-  <bcf:sortingtemplate name="nyt">
-    <bcf:sort order="1">
-      <bcf:sortitem order="1">presort</bcf:sortitem>
-    </bcf:sort>
-    <bcf:sort order="2" final="1">
-      <bcf:sortitem order="1">sortkey</bcf:sortitem>
-    </bcf:sort>
-    <bcf:sort order="3">
-      <bcf:sortitem order="1">sortname</bcf:sortitem>
-      <bcf:sortitem order="2">author</bcf:sortitem>
-      <bcf:sortitem order="3">editor</bcf:sortitem>
-      <bcf:sortitem order="4">translator</bcf:sortitem>
-      <bcf:sortitem order="5">sorttitle</bcf:sortitem>
-      <bcf:sortitem order="6">title</bcf:sortitem>
-    </bcf:sort>
-    <bcf:sort order="4">
-      <bcf:sortitem order="1">sortyear</bcf:sortitem>
-      <bcf:sortitem order="2">year</bcf:sortitem>
-    </bcf:sort>
-    <bcf:sort order="5">
-      <bcf:sortitem order="1">sorttitle</bcf:sortitem>
-      <bcf:sortitem order="2">title</bcf:sortitem>
-    </bcf:sort>
-    <bcf:sort order="6">
-      <bcf:sortitem order="1">volume</bcf:sortitem>
-      <bcf:sortitem literal="1" order="2">0</bcf:sortitem>
-    </bcf:sort>
-  </bcf:sortingtemplate>
-  <!-- DATALISTS -->
-  <bcf:datalist section="0"
-                name="nyt/apasortcite//global/global"
-                type="entry"
-                sortingtemplatename="nyt"
-                sortingnamekeytemplatename="apasortcite"
-                labelprefix=""
-                uniquenametemplatename="global"
-                labelalphanametemplatename="global">
-  </bcf:datalist>
-  <bcf:datalist section="0"
-                name="nyt/global//global/global"
-                type="entry"
-                sortingtemplatename="nyt"
-                sortingnamekeytemplatename="global"
-                labelprefix=""
-                uniquenametemplatename="global"
-                labelalphanametemplatename="global">
-  </bcf:datalist>
-</bcf:controlfile>
+        <bcf:field>
--- a/Reports/Thesis/verslag.log
+++ b/Reports/Thesis/verslag.log
--- a/Reports/Thesis/verslag.out
+++ b/Reports/Thesis/verslag.out
@@ -1,30 +0,0 @@
-\BOOKMARK [1][-]{section.1}{\376\377\000I\000n\000t\000r\000o\000d\000u\000c\000t\000i\000o\000n}{}% 1
-\BOOKMARK [1][-]{section.2}{\376\377\000E\000l\000e\000c\000t\000r\000i\000c\000i\000t\000y\000\040\000m\000a\000r\000k\000e\000t}{}% 2
-\BOOKMARK [1][-]{section.3}{\376\377\000G\000e\000n\000e\000r\000a\000t\000i\000v\000e\000\040\000m\000o\000d\000e\000l\000i\000n\000g}{}% 3
-\BOOKMARK [2][-]{subsection.3.1}{\376\377\000Q\000u\000a\000n\000t\000i\000l\000e\000\040\000R\000e\000g\000r\000e\000s\000s\000i\000o\000n}{section.3}% 4
-\BOOKMARK [2][-]{subsection.3.2}{\376\377\000A\000u\000t\000o\000r\000e\000g\000r\000e\000s\000s\000i\000v\000e\000\040\000v\000s\000\040\000N\000o\000n\000-\000A\000u\000t\000o\000r\000e\000g\000r\000e\000s\000s\000i\000v\000e\000\040\000m\000o\000d\000e\000l\000s}{section.3}% 5
-\BOOKMARK [2][-]{subsection.3.3}{\376\377\000M\000o\000d\000e\000l\000\040\000T\000y\000p\000e\000s}{section.3}% 6
-\BOOKMARK [3][-]{subsubsection.3.3.1}{\376\377\000L\000i\000n\000e\000a\000r\000\040\000M\000o\000d\000e\000l}{subsection.3.3}% 7
-\BOOKMARK [3][-]{subsubsection.3.3.2}{\376\377\000N\000o\000n\000-\000L\000i\000n\000e\000a\000r\000\040\000M\000o\000d\000e\000l}{subsection.3.3}% 8
-\BOOKMARK [3][-]{subsubsection.3.3.3}{\376\377\000R\000e\000c\000u\000r\000r\000e\000n\000t\000\040\000N\000e\000u\000r\000a\000l\000\040\000N\000e\000t\000w\000o\000r\000k\000\040\000\050\000R\000N\000N\000\051}{subsection.3.3}% 9
-\BOOKMARK [2][-]{subsection.3.4}{\376\377\000D\000i\000f\000f\000u\000s\000i\000o\000n\000\040\000m\000o\000d\000e\000l\000s}{section.3}% 10
-\BOOKMARK [3][-]{subsubsection.3.4.1}{\376\377\000O\000v\000e\000r\000v\000i\000e\000w}{subsection.3.4}% 11
-\BOOKMARK [3][-]{subsubsection.3.4.2}{\376\377\000A\000p\000p\000l\000i\000c\000a\000t\000i\000o\000n\000s}{subsection.3.4}% 12
-\BOOKMARK [3][-]{subsubsection.3.4.3}{\376\377\000G\000e\000n\000e\000r\000a\000t\000i\000o\000n\000\040\000p\000r\000o\000c\000e\000s\000s}{subsection.3.4}% 13
-\BOOKMARK [2][-]{subsection.3.5}{\376\377\000E\000v\000a\000l\000u\000a\000t\000i\000o\000n}{section.3}% 14
-\BOOKMARK [1][-]{section.4}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s}{}% 15
-\BOOKMARK [2][-]{subsection.4.1}{\376\377\000B\000a\000s\000e\000l\000i\000n\000e\000s}{section.4}% 16
-\BOOKMARK [2][-]{subsection.4.2}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s\000\040\000b\000a\000s\000e\000d\000\040\000o\000n\000\040\000N\000R\000V\000\040\000g\000e\000n\000e\000r\000a\000t\000i\000o\000n\000s}{section.4}% 17
-\BOOKMARK [1][-]{section.5}{\376\377\000L\000i\000t\000e\000r\000a\000t\000u\000r\000e\000\040\000S\000t\000u\000d\000y}{}% 18
-\BOOKMARK [2][-]{subsection.5.1}{\376\377\000E\000l\000e\000c\000t\000r\000i\000c\000i\000t\000y\000\040\000P\000r\000i\000c\000e\000\040\000F\000o\000r\000e\000c\000a\000s\000t\000i\000n\000g}{section.5}% 19
-\BOOKMARK [2][-]{subsection.5.2}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s\000\040\000f\000o\000r\000\040\000B\000a\000t\000t\000e\000r\000y\000\040\000O\000p\000t\000i\000m\000i\000z\000a\000t\000i\000o\000n}{section.5}% 20
-\BOOKMARK [1][-]{section.6}{\376\377\000R\000e\000s\000u\000l\000t\000s\000\040\000\046\000\040\000D\000i\000s\000c\000u\000s\000s\000i\000o\000n}{}% 21
-\BOOKMARK [2][-]{subsection.6.1}{\376\377\000D\000a\000t\000a}{section.6}% 22
-\BOOKMARK [2][-]{subsection.6.2}{\376\377\000Q\000u\000a\000n\000t\000i\000l\000e\000\040\000R\000e\000g\000r\000e\000s\000s\000i\000o\000n}{section.6}% 23
-\BOOKMARK [3][-]{subsubsection.6.2.1}{\376\377\000L\000i\000n\000e\000a\000r\000\040\000M\000o\000d\000e\000l}{subsection.6.2}% 24
-\BOOKMARK [3][-]{subsubsection.6.2.2}{\376\377\000N\000o\000n\000-\000L\000i\000n\000e\000a\000r\000\040\000M\000o\000d\000e\000l}{subsection.6.2}% 25
-\BOOKMARK [3][-]{subsubsection.6.2.3}{\376\377\000G\000R\000U\000\040\000M\000o\000d\000e\000l}{subsection.6.2}% 26
-\BOOKMARK [2][-]{subsection.6.3}{\376\377\000D\000i\000f\000f\000u\000s\000i\000o\000n}{section.6}% 27
-\BOOKMARK [1][-]{section.7}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s\000\040\000f\000o\000r\000\040\000b\000a\000t\000t\000e\000r\000y\000\040\000o\000p\000t\000i\000m\000i\000z\000a\000t\000i\000o\000n}{}% 28
-\BOOKMARK [2][-]{subsection.7.1}{\376\377\000B\000a\000s\000e\000l\000i\000n\000e\000s}{section.7}% 29
-\BOOKMARK [2][-]{subsection.7.2}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s\000\040\000u\000s\000i\000n\000g\000\040\000N\000R\000V\000\040\000p\000r\000e\000d\000i\000c\000t\000i\000o\000n\000s}{section.7}% 30
--- a/Reports/Thesis/verslag.pdf
+++ b/Reports/Thesis/verslag.pdf
--- a/Reports/Thesis/verslag.synctex(busy)
+++ b/Reports/Thesis/verslag.synctex(busy)
--- a/Reports/Thesis/verslag.synctex.gz
+++ b/Reports/Thesis/verslag.synctex.gz
--- a/Reports/Thesis/verslag.tex
+++ b/Reports/Thesis/verslag.tex
@@ -28,6 +28,8 @@
 \usepackage{caption}
 \usepackage{subcaption}
 \usepackage{booktabs}
+\usepackage{tikz}
+\usetikzlibrary{positioning, calc}

 % Electricity market
 % Generative Modeling
--- a/src/trainers/diffusion_trainer.py
+++ b/src/trainers/diffusion_trainer.py
@@ -84,7 +84,7 @@ class DiffusionTrainer:
        self.model = model
        self.device = device

-        self.noise_steps = 1000
+        self.noise_steps = 300
        self.beta_start = 0.0001
        self.beta_end = 0.02
        self.ts_length = 96
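To illustrate what the change from 1000 to 300 noise steps means for the noising process, the sketch below builds a noise schedule from the values in this hunk. It assumes a linear schedule between `beta_start` and `beta_end` and the usual closed-form DDPM forward step; the function names are hypothetical and the trainer's actual implementation is not shown in this diff.

```python
import math
import random


def linear_betas(noise_steps, beta_start=0.0001, beta_end=0.02):
    """Evenly spaced betas from beta_start to beta_end (assumed schedule)."""
    step = (beta_end - beta_start) / (noise_steps - 1)
    return [beta_start + i * step for i in range(noise_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta), usually written alpha_bar."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_sample(x0, t, alpha_bars, rng):
    """Closed-form forward step: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a = math.sqrt(alpha_bars[t])
    s = math.sqrt(1.0 - alpha_bars[t])
    x_t = [a * xi + s * ei for xi, ei in zip(x0, eps)]
    return x_t, eps  # eps is the regression target for the noise predictor


alpha_bars_300 = cumulative_alphas(linear_betas(300))
alpha_bars_1000 = cumulative_alphas(linear_betas(1000))
```

Under these assumptions, the last entry of `alpha_bars_300` is roughly 0.05, while the last entry of `alpha_bars_1000` is below 0.0001. With the same `beta_end`, the shorter schedule therefore leaves a small trace of the original series in the fully noised sample.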
--- a/src/training_scripts/diffusion_training.py
+++ b/src/training_scripts/diffusion_training.py
@@ -2,7 +2,7 @@ from src.utils.clearml import ClearMLHelper

 clearml_helper = ClearMLHelper(project_name="Thesis/NrvForecast")
 task = clearml_helper.get_task(
-    task_name="Diffusion Training: hidden_sizes=[256, 256, 256], lr=0.0001, time_dim=8 + Load + PV + Wind + NP"
+    task_name="Diffusion Training: hidden_sizes=[1024, 1024, 1024, 1024] (300 steps), lr=0.0001, time_dim=8 + Load + Wind + PV + NP"
 )
 task.execute_remotely(queue_name="default", exit_process=True)

@@ -42,7 +42,7 @@ print("Input dim: ", inputDim)
 model_parameters = {
    "epochs": 15000,
    "learning_rate": 0.0001,
-    "hidden_sizes": [256, 256, 256],
+    "hidden_sizes": [1024, 1024, 1024, 1024],
    "time_dim": 8,
 }
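How `hidden_sizes` and `time_dim` relate to the architecture described in the thesis text can be sketched as follows. This is a plain-Python illustration under stated assumptions: the exact sinusoidal formula, the helper names `step_encoding` and `layer_shapes`, and the value `cond_dim=4` are hypothetical and do not come from this repository.

```python
import math


def step_encoding(step, time_dim=8):
    """Sinusoidal encoding of the denoising step index (time_dim must be even).

    One common formulation; the repository's exact variant is not shown here.
    """
    half = time_dim // 2
    freqs = [1.0 / (10000 ** (i / half)) for i in range(half)]
    return [math.sin(step * f) for f in freqs] + [math.cos(step * f) for f in freqs]


def layer_shapes(ts_length, time_dim, cond_dim, hidden_sizes):
    """(in_features, out_features) per linear layer.

    The step encoding and conditional features are concatenated to the input
    of every layer, so each layer sees time_dim + cond_dim extra inputs.
    """
    extra = time_dim + cond_dim
    widths = [ts_length] + list(hidden_sizes)
    outs = list(hidden_sizes) + [ts_length]
    return [(w + extra, o) for w, o in zip(widths, outs)]


# cond_dim=4 is a placeholder; the real value depends on the input features.
shapes = layer_shapes(ts_length=96, time_dim=8, cond_dim=4,
                      hidden_sizes=[1024, 1024, 1024, 1024])
```

With these placeholder values, the first layer maps 108 inputs to 1024 units, each following hidden layer maps 1036 inputs to 1024 units, and the output layer maps 1036 inputs back to the 96 quarter-hour values.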
Author	SHA1	Message	Date
Victor Mylle	4c4914e227	Merge branch 'main' of https://git.victormylle.be/VictorMylle/Thesis	2024-05-08 17:56:59 +02:00
Victor Mylle	8a2e1ce7d5	Worked further on thesis	2024-05-08 17:53:19 +02:00