\documentclass[10pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage{numprint}
\npdecimalsign{.}
\nprounddigits{3}
\usepackage[english]{babel}
%AU CHOIX
\usepackage{MATISpdfhisto}
%\usepackage{MATISpdfupe}% page couverture et seconde page a facon
\usepackage{geometry}
\usepackage{amstext,amsmath,amssymb}
\usepackage{color}
\usepackage{booktabs}
\usepackage[usenames,dvipsnames]{xcolor}
\usepackage{tikz}
\usetikzlibrary{arrows,shapes,positioning,trees}
\PassOptionsToPackage{hyphens}{url}\usepackage[
colorlinks=true,
urlcolor=PineGreen,
linkcolor=RoyalBlue,
citecolor=blue
]{hyperref}
\usepackage{multirow}
\usepackage{amsfonts}
\usepackage{csquotes}
\usepackage{pgffor}
\usepackage{chngcntr}
\usepackage{mathtools}
\usepackage{slashbox}
\newcommand{\exposantdevant}[2]{\prescript{#1}{}{#2}}
\newcommand{\indicedevant}[2]{\prescript{}{#1}{#2}}
\newcommand{\transpose}[1]{\prescript{t}{}{#1}}
\DeclareMathOperator{\atan}{atan}
\DeclareMathOperator{\sgn}{sgn}
\DeclareMathOperator{\abs}{abs}
\DeclareMathOperator{\argmax}{argmax}
\DeclareMathOperator{\argmin}{argmin}
\usepackage{algorithm}
\usepackage[noend]{algpseudocode}
\makeatletter
\def\BState{\State\hskip-\ALG@thistlm}
\makeatother
\usepackage{cellspace}
\usepackage{pdfpages}
\usepackage{enumitem}
\usepackage[pdftex]{graphicx}
\usepackage{subcaption}
\graphicspath{{Figures/}{Figures/finistere_T41000_30000/}{Figures/finistere_T35000_40000/}{Figures/gironde_T10500_34500/}{Figures/finistere_all/}}
%Bibliography
\usepackage[
backend=biber,
style=authoryear,
citestyle=authoryear,
uniquename=init,
maxcitenames=1,
giveninits=true,
hyperref=auto,
sorting=nyt
]{biblatex}
\addbibresource{rapport.bib}
\newcommand{\legende}{\vspace{3mm}
\small\centering
\fcolorbox{black}{red}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Building \hspace{2mm} \fcolorbox{black}{gray}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Road \hspace{2mm}
\fcolorbox{black}{green}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Forest \hspace{2mm}
\fcolorbox{black}{yellow}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Other Vegetation \hspace{2mm}
\fcolorbox{black}{blue}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Water
}
\newcommand{\legendeBinEval}{\vspace{3mm}
\small\centering
\fcolorbox{black}{green}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Good \hspace{2mm}
\fcolorbox{black}{red}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Bad (GT=1) \hspace{2mm}
\fcolorbox{black}{blue}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Bad (Classification=1)
}
\newcommand{\legendebuf}{\vspace{3mm}
\small\centering
\fcolorbox{black}{red}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Building \hspace{2mm} \fcolorbox{black}{gray}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Road \hspace{2mm}
\fcolorbox{black}{green}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Forest \hspace{2mm}
\fcolorbox{black}{yellow}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Other Vegetation \hspace{2mm}
\fcolorbox{black}{blue}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Water\hspace{2mm}
\fcolorbox{black}{cyan}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Buffer Class
}
\newcommand{\legendebin}{\vspace{3mm}
\small\centering
\fcolorbox{black}{red}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Artificialized area
\fcolorbox{black}{green}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Non-artificialized area
}
\newcommand{\tile}{41000_30000}
\newcommand{\region}{finistere}
\MATISdate{\today}
\MATISdatee{\today}
\MATISauthor{Cyril Wendl}
\MATISauthore{Cyril \textsc{Wendl}}
\MATISauthormail{cyril.wendl@epfl.ch}
\authorhead{Cyril \textsc{Wendl}}
\titlehead{Fusion of Multi-Temporal Sentinel-2 image Series and Very-High Spatial Resolution Images for Detection of Urban Areas}
\MATISetitle{Fusion of Multi-Temporal Sentinel-2 image Series and Very-High Spatial Resolution Images for Detection of Urban Areas}
%\MATISaffiliation{IGN - Laboratoire MATIS, CRNI - Strasbourg, EPFL}
\MATISaffiliatione{IGN - Laboratoire MATIS, CRNI - Strasbourg, EPFL}
%Supervisors: Arnaud Le-Bris (IGN), Anne Le-Puissant (CRNI), Frank De Morsier (EPFL)
%\MATISaffiliationune{IGN - Laboratoire MATIS}
%%%%%%%%%%%%%%%%%%%%%%%%
% Resume & abstract
%%%%%%%%%%%%%%%%%%%%%%%%
%\MATISresume{Summary}
%\MATISmotcle{}
\MATISabstract{Fusion of very high spatial resolution multispectral images with lower spatial resolution image time series having a higher number of bands can enable better land use classification, combining the geometric and semantic advantages of both sources. This study presents a workflow to find the extent of urbanized ground using decision-level fusion and regularization of individual classifications on Sentinel-2 and SPOT-6 satellite images. First, both images are classified individually into five classes, using state-of-the-art supervised classification approaches and Deep Convolutional Neural Networks. Decision-level fusion and regularization are then used to combine the spatial and spectral advantages of both classifiers: first, both sources are merged in order to extract building labels with as high semantic and geometric precision as possible. Second, the building labels are used together with the Sentinel-2 classification as input for a binary classification of the artificialized area. The building labels from the regularization are dilated in order to connect the building objects, and a binary classification is derived from the original Sentinel-2 classification. These two separate binary classifications are reintroduced in a fusion and regularization to find the artificialized area. Segmentation of the Sentinel-2 satellite image and majority voting of the object-level classification are used to refine the contours of the artificialized area.}
\MATISkeyword{Multispectral, Decision fusion, Regularization, Urban classification, Artificialized area, Segmentation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% FIN DE LA PRESENTATION
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%% Debut du rapport a proprement parle %%%%%%%%%%%%%%%
\begin{document}
%%%% On importe le style
\makeMATIS
%%%% Apres, c'est comme d'habitude, on ne change rien. %%%%
\tableofcontents
\newpage
\section{Introduction}
% Importance of ground occupation classification: urban spread, planning
% Fusion: combination of high spectral and spatial resolution
Classification of urban land cover is of central importance for monitoring urban sprawl and soil impermeabilization and for predicting their further evolution (\cite{kurtz_histogram_2012,kurtz_extraction_2012,wemmert_multiresolution_2009,Lefebvre_2016}). Supervised classification approaches using satellite imagery have been studied to automate the process of land use classification \parencite{inglada_operational_2017,li_urban_2016,Pesaresi_2016}. High spatial resolution satellite images enable the delineation of much smaller features; however, they are often characterized by semantic uncertainty, that is, not having enough spectral information to distinguish land cover types precisely, resulting in confusion of classes. On the other hand, satellites using many bands offer high spectral depth and thus more semantic information, but less geometric detail. Fusion of several sources aims at combining their advantages to reduce spatial and semantic uncertainties \parencite{ouerghemmi_two-step_2017,fauvel_decision_fusion,hervieu_fusion_2016}. \\
% Fusion at decision level
Fusion schemes can be grouped into three levels \parencite{ouerghemmi_two-step_2017}:
\begin{enumerate}
\item At the data level, where the original images are merged, using methods such as pan-sharpening of the multispectral image in order to produce the best single classification;
\item At the feature level, applying a single classification to extract features from both sources;
\item At the decision level, where two separate classifications from heterogeneous data sets are merged.
\end{enumerate}
\cite{ouerghemmi_two-step_2017} have proposed several fusion approaches at decision level as well as a regularization whose aim is to smooth the fusion result while using the contrast information of the original image to follow the object contours and reinforce the geometric accuracy after the fusion. \\
% Aim
The aim of this paper is to propose a decision-level fusion processing chain using freely available and commercial satellite imagery to improve automatic ground use classification in urban environments into the following five categories: buildings, roads, water, forest and other vegetation. Firstly, a ground classification map is produced following as precisely as possible the contours of individual buildings and roads. Secondly, the building labels from this classification, together with the classification on the Sentinel-2 time series, are used as an \textit{a priori} input data set for the derivation of a binary land use map of the "urban footprint", which can be understood as the area made irreversibly impermeable through the construction of buildings and roads, including smaller enclosed structures such as backyards or public green spaces \parencite{puissant_object-oriented_2014}. This bigger aggregation of buildings, roads and enclosed small structures will hereafter be called "artificialized area".\\
The aim of this study is twofold:
\begin{enumerate}
\item To provide an accurate classification of buildings and roads using decision-level fusion of Sentinel-2 and SPOT-6 classifications;
\item To produce a map of the artificialized area using the extracted building labels of the previous fusion and the classification of the Sentinel-2 time series as \textit{a priori} input data.
\end{enumerate}
The overall goal is to find the artificialized area, following the real-world city boundary contour as closely as possible.
\subsection{Satellite data and Study Sites}
The satellite data used for the ground land classification are described in tables \ref{table:bands_s2} and \ref{table:bands_spot}.
\begin{table}[H]
\centering
\begin{subtable}{.52\textwidth}
\centering
\begin{tabular}{@{}lp{2cm}p{2.2cm}l@{}}\toprule
Band & GSD [m] & Central Wavelength [nm] & Domain \\\hline
B1 & 60 & 443 & Violet \\
B2 & 10 & 490 & Blue \\
B3 & 10 & 560 & Green \\
B4 & 10 & 665 & Red \\
B5 & 20 & 705 & VNIR \\
B6 & 20 & 740 & VNIR \\
B7 & 20 & 783 & VNIR \\
B8 & 10 & 842 & VNIR \\
B8a & 20 & 865 & VNIR \\
B9 & 60 & 940 & SWIR \\
B10 & 60 & 1375 & SWIR \\
B11 & 20 & 1610 & SWIR \\
B12 & 20 & 2190 & SWIR\\\bottomrule
\end{tabular}
\caption{Sentinel-2 Level 2A (\cite{esa-s2res,theia})}
\label{table:bands_s2}
\end{subtable}
\begin{subtable}{.45\textwidth}
\centering
\begin{tabular}{@{}lp{1.7cm}p{2.1cm}l@{}}\toprule
Band & GSD [m] & Wavelength range [nm] & Domain \\\hline
B0 & 6.0 & 450--520 & Blue \\
B1 & 6.0 & 530--590 & Green \\
B2 & 6.0 & 625--695 & Red \\
B3 & 6.0 & 760--890 & VNIR \\
B4 & 1.5 & 450--745 & Panchromatic \\\bottomrule
\end{tabular}
\caption{SPOT-6 (\cite{SPOT6_technical-sheet})}
\label{table:bands_spot}
\end{subtable}
\caption{Spatial and spectral resolution of satellite data. GSD = Ground Sampling Distance, VNIR = Visible and Near Infrared, SWIR = Short Wave Infrared}
\label{table:bands}
\end{table}
Regarding Sentinel-2, only the 10 bands with a GSD of 10 or 20 meters were used; in practice, the 20 m bands were up-sampled to 10 m resolution. Regarding SPOT-6, the four multispectral bands B0-B3 were used and pan-sharpened to 1.5 m resolution using the panchromatic band. \\
Sentinel-2 data was downloaded for several dates in order to mitigate seasonal effects on land cover classification. Table \ref{table:dates} lists the dates used for each satellite and test region.
\begin{table}[H]
\centering
\begin{tabular}{lll}\toprule
Region & Sentinel-2 dates & SPOT-6 dates \\\hline
\multirow{5}{*}{Finistère} & 2017-05-25 & \multirow{5}{*}{2014-04-16} \\
& 2017-04-12 & \\
& 2017-03-16 & \\
& 2017-01-25 & \\
& 2016-08-15 & \\\hline
\multirow{4}{*}{Gironde} & 2017-06-18 & \multirow{4}{*}{2016-03-22} \\
& 2017-04-19 & \\
& 2016-11-30 & \\
& 2016-09-01 & \\\bottomrule
\end{tabular}
\caption{Dates used for each classification}
\label{table:dates}
\end{table}
All classifiers presented below have been applied to a zone in the Finistère region in North-Western France and one in Gironde in Western France. Fusion and regularization products are produced for the entire zone covered by both available datasets for Finistère, spanning 648 km$^2$. Results are presented visually for a sub-zone of 0.64 km$^2$ (fig. \ref{fig:area\tile}) and for two more sub-zones of 0.64 km$^2$ (Finistère, sec. \ref{sec:otherZone_finistere}) and 1.44 km$^2$ (Gironde, sec. \ref{sec:otherZone_gironde}), respectively.
\newcommand{\figureTile}{
\begin{figure}[H]
\centering
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_Im_SPOT6}
\caption{SPOT-6 image tile (3000 $\times$ 3000 m)}
\end{subfigure}
\hfill
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_Im_SPOT6_crop}
\caption{Zoomed area (800 $\times$ 800 m)}
\end{subfigure}
\caption{SPOT-6 image tile and crop window}
\label{fig:area\tile}
\centering
\end{figure}
}
\figureTile
\section{Methodology}\label{sec:method}
\subsection{General Workflow}
Figure \ref{fig:methods} shows the overall workflow for producing the artificialized area, consisting of two steps:\\
% Fusion, regularization
First, different fusion metrics and regularization options are applied to the initial five-class classifications (buildings, roads, water, forest and other vegetation) of both sources, with the aim of extracting the building labels as precisely as possible. The fusion and regularization method producing the best results in terms of visual quality, absence of artifacts and accuracy measures is retained and used for the subsequent steps.\\
% Artificialized area
Second, a fusion and regularization is performed between the building labels from the regularization step and the binary Sentinel-2 classification to find the artificialized area. The contours of the artificialized area are refined using segmentation of the Sentinel-2 image and majority voting within the segments.
\begin{figure}[H]
\centering
\includegraphics[width=.7\textwidth]{IGN-methods}
\caption{Methods}
\label{fig:methods}
\end{figure}
\subsection{Initial Fusion}
The data sets were first classified individually before fusion: the Sentinel-2 image series was classified using the 10 bands of each date (tables \ref{table:bands_s2}, \ref{table:dates}) with a Random Forest classifier with 50'000 samples per class and 100 trees \parencite{Breiman2001}. The SPOT-6 image was classified using a Deep Convolutional Neural Network (DCNN) on its four pan-sharpened bands, described in tables \ref{table:bands_spot}, \ref{table:dates} (\cite{postadjian_investigating_2017}). Both classifiers produced membership probabilities for the 5 classes described in the introduction (buildings, roads, water, forest and other vegetation).\\
The Random Forest classifier was trained using ground truth data of the freely available BDTOPO and RPG databases, from which a ground truth map of 5 class labels was derived (\cite{bdtopo,RPG}, fig. \ref{fig:vt\tile}).
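As an illustration, the following minimal Python sketch shows how such per-pixel membership probabilities can be obtained with a Random Forest; the file names and array shapes are hypothetical placeholders, not the actual processing chain.
\begin{verbatim}
# Minimal sketch: per-pixel Random Forest classification with class
# membership probabilities (file names and array shapes are hypothetical).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X_train = np.load("s2_train_features.npy")  # (n_samples, 10 bands x n_dates)
y_train = np.load("s2_train_labels.npy")    # labels in {0,...,4}, 5 classes
X_image = np.load("s2_image_features.npy")  # (n_pixels, n_features)

rf = RandomForestClassifier(n_estimators=100)
rf.fit(X_train, y_train)

proba = rf.predict_proba(X_image)           # (n_pixels, 5), rows sum to 1
labels = proba.argmax(axis=1)               # hard labels for visualization
\end{verbatim}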
\newcommand{\figureGT}[2]{
\begin{figure}[H]
\centering
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{#1}
\caption{Original image (SPOT-6)}
\label{fig:SPOT6\tile}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{#2}
\caption{Ground Truth}
\label{fig:GT\tile}
\end{subfigure}
\legende
\caption{Original image and ground truth. Black parts indicate missing ground truth data.}
\label{fig:vt\tile}
\end{figure}
}
\figureGT{\region_T\tile_Im_SPOT6_crop}{\region_T\tile_train_5cl}
\subsection{Fusion schemes}\label{sec:fusion}
The fusion rules presented below are based on previous work by \cite{ouerghemmi_two-step_2017} and propose pixel-level decision schemes between two or more classifiers based on their class membership probabilities. The following notation is used:
\begin{align}
\mathcal{L}&=\{c_i\}_{1\leq i\leq n_c}&& \text{Set of classes}\\
P_S^{(c)}&=P(x\in c\,|\,S)&& \text{Membership probability of a sample $x$ in class $c$ according to source $S$}
\end{align}
\subsubsection{Fuzzy Rules}\label{sec:fuzzyLogic}
The first tested fusion approach is based on fuzzy rules (\cite{zadeh_fuzzy_1965}). Fuzzy rule theory states that a fuzzy set $A$ in a reference set $\mathcal{L}$ is a set of ordered pairs:
\begin{equation}
A=\big\{\big(x,P_A(x)\big)\mid x\in \mathcal{L}\big\}
\end{equation}
where the membership probability of $x$ in $A$ is given by $P_A:\mathcal{L}\rightarrow[0,1]$. The measure of conflict between two sources ($1-K$) is given by (\cite{dubois_possibility}):
\begin{equation}
K=\sup_x\min(P_A(x), P_B(x))
\end{equation}
In order to account for the fact that fuzzy sets with strong fuzziness possibly hold unreliable information, each fuzzy set $i$ is weighted by a pointwise confidence measure $w_i$ (\cite{fauvel_decision_fusion}):
\begin{equation}
w_i=\frac{\sum_{k=0,k\neq i}^{n}H_{\alpha QE}(P_k)}{(n-1)\sum_{k=0}^{n}H_{\alpha QE}(P_k)}
\end{equation}
$n$ being the number of sources and $H_{\alpha QE}$ a fuzziness measure called $\alpha$-quadratic entropy (QE) \parencite{pal_measuring_1994}. Every fuzzy set $i$ is weighted by the fuzziness degree of all other fuzzy sets (classifications): if the fuzziness degree of the other sets is high, the weight of a given source $i$ will be high too. For all of the following fusion rules, the fuzzy sets have been weighted as $\tilde{P}_A(x)=w_A\cdot P_A(x)$, $\tilde{P}_B(x)=w_B\cdot P_B(x)$, with $P_A(x)$ and $P_B(x)$ the original membership probabilities and $w_A, w_B$ their corresponding pointwise measures.\\
All fuzzy rules have been tested using the original class probabilities as input as well as the probabilities weighted by this pointwise measure. In fig. \ref{fig:proba_point}, an example is shown where the SPOT-6 classifier has a higher fuzziness degree than the Sentinel-2 classifier; hence the probability of the SPOT-6 classifier is reduced more strongly by the pointwise measure.
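As an illustration, a minimal Python sketch of this pointwise weighting is given below; plain quadratic entropy $p(1-p)$ is used as a stand-in for the $\alpha$-quadratic entropy, whose exact form is not reproduced here.
\begin{verbatim}
# Sketch of the pointwise confidence weighting: each source i is weighted
# by the fuzziness of all other sources. Plain quadratic entropy p*(1-p)
# stands in for the alpha-quadratic entropy H_alphaQE (assumption).
import numpy as np

def fuzziness(proba):
    # proba: (n_pixels, n_classes) -> pointwise fuzziness per pixel
    return np.sum(proba * (1.0 - proba), axis=1)

def pointwise_weights(probas):
    # probas: list of n source probability maps, each (n_pixels, n_classes)
    H = np.stack([fuzziness(p) for p in probas])  # (n_sources, n_pixels)
    n = len(probas)
    total = H.sum(axis=0) + 1e-12                 # sum_k H(P_k)
    return [(total - H[i]) / ((n - 1) * total) for i in range(n)]

# Weighted probabilities P~_A = w_A * P_A and P~_B = w_B * P_B:
# w_A, w_B = pointwise_weights([P_A, P_B])
# P_A_w, P_B_w = w_A[:, None] * P_A, w_B[:, None] * P_B
\end{verbatim}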
\begin{figure}[H]
\centering
\includegraphics[width=.7\textwidth]{\region_T\tile_proba_point}
\caption{Original and weighted class probabilities}
\label{fig:proba_point}
\end{figure}
The following fusion rules based on fuzzy logic were considered:
\begin{enumerate}
\item Minimum rule as the intersection of two fuzzy sets $P_A$ and $P_B$, given by the minimum of their membership probabilities (conjunctive behaviour):
\begin{equation}
\forall x\in\mathcal{L} \;\;\; (P_A\cap P_B)(x) = P_{fusion}(x) =\min\big(P_A(x),P_B(x)\big)
\end{equation}
\item Maximum rule as the union between the two fuzzy sets $P_A$ and $P_B$, given by the maximum of their membership probabilities (disjunctive behaviour):
\begin{equation}
\forall x\in\mathcal{L} \;\;\; (P_A\cup P_B)(x) = P_{fusion}(x)=\max\big(P_A(x),P_B(x)\big)
\end{equation}
\item Compromise operator:
\begin{equation}\label{eq:compro}
P_{fusion}(x)=
\begin{cases}
\max\Big(T_1,\min\big(T_2,(1-K)\big)\Big)& \text{if } (1-K)\neq 1\\
\max\Big(P_A(x),P_B(x)\Big) &\text{if }(1-K)= 1
\end{cases}
\end{equation}
where $T_1=\frac{\min\big(P_A(x),P_B(x)\big)}{K}$ and $T_2=\max\big(P_A(x),P_B(x)\big)$, noticing that the operator behavior is conjunctive when the conflict between $A$ and $B$ is low ($1-K\approx 0$) and disjunctive when the conflict is high ($1-K\approx 1$). When the conflict is partial, the operator behaves in a compromise way \parencite{ouerghemmi_two-step_2017}.
\item Modified compromise: since a pure compromise fusion rule would favour $T_1$ at fusion level, \cite{ouerghemmi_two-step_2017} propose to measure the intra-class conflict as $f_c=\abs\big(P^{(cbest1)}(x)-P^{(cbest2)}(x)\big)$, $cbest1$ and $cbest2$ being the two highest-ranked classes, and to set a conflict threshold $t_c=0.25$ to be used as follows:
\begin{algorithm}[H]
\begin{algorithmic}
\If {$f_c<t_c$}
\Comment{existence of intra-membership conflict}
\State $P_{fusion}=\max\big(P_A(x),P_B(x)\big)$
\Else
\Comment{no intra-membership conflict}
\State $P_{fusion}$ following equation \ref{eq:compro}
\EndIf
\end{algorithmic}
\caption{Compromise rule according to \cite{ouerghemmi_two-step_2017}}
\label{alg:comp-wo}
\end{algorithm}
\item Prioritized operators (Prior 1, eq. \ref{eq:prior1} and Prior 2, eq. \ref{eq:prior2}):
\begin{align}
P_{fusion}(x)&=\max\Big(P_A(x),\min(P_B(x),K)\Big)\label{eq:prior1}\\
P_{fusion}(x)&=\min\Big(P_A(x),\max\big(P_B(x),(1-K)\big)\Big)\label{eq:prior2}
\end{align}
If the conflict between $A$ and $B$ is high (i.e., $K\approx 0$), only $P_A$ is taken into account (prioritized) and $P_B$ is considered a specific piece of information. Thus, the fusion result depends on the order of the sources $A$ and $B$. Fusion results are shown for both orders (fig. \ref{fig:app:fusion-fl-45}), with "-inv" denoting the inverted order; only the configuration prioritizing the SPOT-6 classifier was retained for evaluation, due to missing geometric detail otherwise. A numerical sketch of the main rules above follows this list.
\end{enumerate}
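The following minimal Python sketch illustrates the Min, Max and compromise rules on per-pixel, per-class probability maps; the pointwise weighting step is assumed to have been applied beforehand.
\begin{verbatim}
# Fuzzy fusion rules on probability maps pA, pB of shape
# (n_pixels, n_classes); here x ranges over the classes of each pixel.
import numpy as np

def fuse_min(pA, pB):              # conjunctive behaviour
    return np.minimum(pA, pB)

def fuse_max(pA, pB):              # disjunctive behaviour
    return np.maximum(pA, pB)

def fuse_compromise(pA, pB):
    K = np.max(np.minimum(pA, pB), axis=1, keepdims=True)  # sup_x min(.)
    t1 = np.minimum(pA, pB) / np.maximum(K, 1e-12)
    t2 = np.maximum(pA, pB)
    fused = np.maximum(t1, np.minimum(t2, 1.0 - K))
    # where the conflict (1-K) equals 1, fall back to the Max rule
    return np.where(np.isclose(1.0 - K, 1.0), t2, fused)
\end{verbatim}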
\subsubsection{Bayesian combination}\label{sec:bayesian}
A very simple approach is to sum or multiply the input probabilities as a Bayesian sum or product:
\begin{align}
P_{fusion}&=P_A(x)+P_B(x)\\
P_{fusion}&=P_A(x)\times P_B(x)
\end{align}
These fusion rules are referred to as Bayesian Sum and Bayesian Product.
\subsubsection{Margin-based Rule (Margin-Max)}\label{sec:margin}
In the Margin-Max approach, the aim is to keep the most confident source at each pixel, that is, the one having the largest margin:
\begin{equation}
margin^{(s)}(x)=P_s^{(cbest1)}(x)-P_s^{(cbest2)}(x)
\end{equation}
where $cbest1=\argmax_{c\in\mathcal{L}}P_s^{(c)}(x)$ and $cbest2=\argmax_{c\in\mathcal{L}\setminus cbest1}P_s^{(c)}(x)$. The fusion chooses for each pixel the source for which the margin between the two highest probabilities is the biggest, with sources $\mathcal{S}=\{A,B\}$ and classes $\mathcal{L}=\{\mathcal{C}_i\}_{i\in[1,n]}$:
\begin{equation}
\forall x,\forall c\in \mathcal{L}\;\;\; P_{fusion}^{(c)}(x)=P_{S_{best}}^{(c)}(x)
\end{equation}
where $S_{best}=\argmax_{s\in\mathcal{S}}margin^{(s)}(x)$.
Fusion was carried out preferring the source with the highest margin (Margin Max) as well as using the probability weighted by the margin for the Bayesian sum and product (Margin Bayesian Sum Weighted, Margin Bayesian Product Weighted).
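A minimal sketch of the Margin-Max rule follows; the weighted variants differ only in multiplying the probabilities by the margin before applying the Bayesian sum or product.
\begin{verbatim}
# Margin-Max rule: for each pixel keep the source whose margin between
# its two highest class probabilities is largest.
import numpy as np

def margin(p):
    # p: (n_pixels, n_classes); margin = P^(cbest1) - P^(cbest2)
    top2 = np.sort(p, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

def fuse_margin_max(pA, pB):
    keep_A = margin(pA) >= margin(pB)
    return np.where(keep_A[:, None], pA, pB)
\end{verbatim}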
\subsubsection{Dempster-Shafer evidence theory}\label{sec:DS}
According to the Dempster-Shafer (DS) formalism, information from a source $s$ for a class $c$ can be given as a mass function $m_c$, $m_c\in [0,1]$ (\cite{shafer-evidence}). Dempster-Shafer's evidence theory considers simple classes $c\in \mathcal{L}$ as well as compound classes, which were limited here to at most two simple classes (\cite{ouerghemmi_two-step_2017}). The mass associated to each simple class is simply its class membership probability:
\begin{align}
m_s^{(c)}(x)=P_s^{(c)}(x)
\end{align}
For compound classes, $\forall c_1,c_2\in \mathcal{L}$, $\forall$ pixel $x$ and $\forall s\in \mathcal{S}$, two versions were tested, denoted $V_1$ and $V_2$ respectively:
\begin{align}
m_s^{(c_1\cup c_2)}(x)&=\big(P_s^{(c_1)}(x)+P_s^{(c_2)}(x)\big)\times \Big(1-\max\big(P_s^{(c_1)}(x),P_s^{(c_2)}(x)\big)\Big)+\min\big(P_s^{(c_1)}(x),P_s^{(c_2)}(x)\big)\\
m_s^{(c_1\cup c_2)}(x)&=\frac{1}{2}\big(P_s^{(c_1)}(x)+P_s^{(c_2)}(x)\big)\times \Big(1-\max\big(P_s^{(c_1)}(x),P_s^{(c_2)}(x)\big)\Big)+\min\big(P_s^{(c_1)}(x),P_s^{(c_2)}(x)\big)
\end{align}
This yields a mass $m_s^{(c_1\cup c_2)}(x)\in [0,1]$, which vanishes if $P_s^{(c_1)}=0, P_s^{(c_2)}=1$ or $P_s^{(c_2)}=0, P_s^{(c_1)}=1$. In both versions, all masses are normalized such that $\sum_{c \in \mathcal{L}'}m_s^{(c)}(x)=1$.
The fusion rule is based on the following conflict measure between two sources $A$ and $B$:
\begin{align}
K(x)=\sum_{\substack{c,d\in\mathcal{L}'\\c\cap d=\emptyset}}m_A^{(c)}(x)\,m_B^{(d)}(x)
\end{align}
$c,d\in\mathcal{L}'$ being simple or compound classes with $c\cap d =\emptyset$. The fusion is performed as:
\begin{equation}
m_{fusion}^{(c)}(x)=\frac{1}{1-K(x)}\sum_{\substack{c_1,c_2\in\mathcal{L}'\\c_1 \cap c_2 = c}}m_A^{(c_1)}(x)\,m_B^{(c_2)}(x)
\end{equation}
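The following Python sketch illustrates this combination for a single pixel, with focal elements represented as sets of class indices and the mass assignment of version $V_1$; it is an illustrative reading of the formulas above, not the production implementation.
\begin{verbatim}
# Dempster-Shafer fusion with compound classes limited to pairs
# (mass version V1). pA, pB: probability vectors of length n_classes.
import numpy as np
from itertools import combinations

def masses_v1(p):
    m = {frozenset([c]): p[c] for c in range(len(p))}
    for c1, c2 in combinations(range(len(p)), 2):
        m[frozenset([c1, c2])] = ((p[c1] + p[c2]) * (1 - max(p[c1], p[c2]))
                                  + min(p[c1], p[c2]))
    z = sum(m.values())                  # normalize: masses sum to 1
    return {k: v / z for k, v in m.items()}

def ds_combine(pA, pB):
    mA, mB = masses_v1(pA), masses_v1(pB)
    K = sum(vA * vB for cA, vA in mA.items() for cB, vB in mB.items()
            if not (cA & cB))            # conflict: disjoint focal sets
    fused = {}
    for cA, vA in mA.items():
        for cB, vB in mB.items():
            inter = cA & cB
            if inter:
                fused[inter] = fused.get(inter, 0.0) + vA * vB
    return {c: v / (1.0 - K) for c, v in fused.items()}
\end{verbatim}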
\subsubsection{Fusion by Classification}
Supervised classification methods were used to perform fusion at the decision level. The following models were trained on the ground truth derived from the BDTOPO and RPG databases (\cite{bdtopo,RPG}), using the LIBSVM and OpenCV libraries (\cite{libsvm,opencv}):
\begin{enumerate}
\item Random Forest (RF), with 10'000 training samples per class and 100 trees (\cite{opencv})
\item Support Vector Machine (SVM) with a linear kernel ($u^\top v$), with 10'000 training samples per class (\cite{libsvm})
\item SVM with a radial basis kernel function $\exp(-\gamma\|u-v\|^2)$, with 500 training samples per class (\cite{libsvm}). Parameters of the SVM model were optimized using cross-validation.
\end{enumerate}
Each classifier was trained using the ground truth from the BDTOPO and RPG databases \cite{bdtopo,RPG}. Since the ground truth contained gaps between buildings, the classifiers tended to aggregate individual buildings, as there was no training data available around buildings. A sixth, arbitrary buffer class extending within a 10 m radius around each building was therefore added to the ground truth to preserve more detail of the buildings during classification (fig. \ref{fig:vt_buffer}). Since only building labels were extracted for the subsequent steps, the other class labels as well as the buffer label could be ignored.
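A minimal sketch of the buffer class construction is given below, assuming a rasterized ground truth at the 1.5 m SPOT-6 resolution with hypothetical class codes.
\begin{verbatim}
# Add a buffer class within ~10 m around buildings in the ground truth
# raster (class codes and file name are hypothetical; the dilation with
# the default cross-shaped structuring element only approximates a
# 10 m radius).
import numpy as np
from scipy import ndimage

BUILDING, BUFFER = 1, 6
gt = np.load("ground_truth.npy")        # (H, W) labels, 0 = no data

building = gt == BUILDING
radius_px = int(round(10 / 1.5))        # 10 m at 1.5 m GSD
dilated = ndimage.binary_dilation(building, iterations=radius_px)
ring = dilated & ~building

gt_buffer = gt.copy()
gt_buffer[ring & (gt == 0)] = BUFFER    # fill only unlabeled gaps
\end{verbatim}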
\begin{figure}[H]
\centering
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_train_5cl}
\caption{Original Ground Truth}
\label{fig:GT_5cl}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_train}
\caption{Ground Truth with filled-in gaps}
\label{fig:GT_6cl}
\end{subfigure}
\legendebuf
\caption{Ground truth with buffer class}
\label{fig:vt_buffer}
\end{figure}
Training samples for all fusion classifiers were taken within an area of 45 km$^2$, including the first zone presented below (fig. \ref{fig:area\tile}). The classification model was also applied to a zone outside of the training region (sec. \ref{sec:otherZone_finistere}).
\subsection{Regularization}\label{sec:regularization}
In order to smooth the fusion result, which still contains noisy patches, and to make it follow real-world contours more accurately, an energy-minimization regularization was applied, based on previous work by \cite{hervieu_fusion_2016} and \cite{ouerghemmi_two-step_2017}. The energy consists of two terms, related to data fidelity and to spatial regularization respectively:
\begin{equation}
E(P_{fusion},C_{fusion},C)=\sum_{x\in I_{MS}}E_{data}(C(x))+\lambda\sum _{\substack{x,y\in N\\x\neq y}}E_{reg}(C(x),C(y))
\end{equation}
where:
\begin{equation*}
\begin{aligned}
E_{data}\big(C(x)\big)&=f\Big(P_{fusion}\big(C(x)\big)\Big)\\
E_{reg}\big(C(x)=C(y)\big)&=g\Big(P_{fusion}\big(C(x)\big),C_{fusion}\Big)\\
E_{reg}\big(C(x)\neq C(y)\big)&=h\Big(P_{fusion}\big(C(x)\big),C_{fusion}\Big)
\end{aligned}
\end{equation*}
The data term is a fit-to-data attachment term modelling the probability distribution $P_{fusion}$, defined by a function $f(t)$. The following options were tested:
\begin{align}
\text{Option 1: } f(t)&=-\log(t)\\
\text{Option 2: } f(t)&=1-t
\end{align}
The regularization term $E_{reg}$ is based on an enhanced Potts model (\cite{schindler_overview_2012}) that favours smoother label changes. A simple Potts model is defined as:
\begin{equation}
\begin{aligned}
E_{reg}\big(C(x)=C(y)\big)&=0\\
E_{reg}\big(C(x)\neq C(y)\big)&=1
\end{aligned}
\end{equation}
The regularization term is enhanced by contrast information, accounting for the fact that label changes should be penalized less strongly in high-contrast areas of the image:
\begin{equation}
\begin{aligned}
E_{reg}\big(C(x)=C(y)\big)&=0\\
E_{reg}\big(C(x)\neq C(y)\big)&=(1-\gamma)+\gamma V(x,y,\varepsilon)
\end{aligned}
\end{equation}
with $V(x,y,\varepsilon)$ being the contrast term, $\gamma = 0$ yielding a pure Potts model and $\gamma = 1$ putting all weight on the contrast term. Two alternative contrast models were compared:
\begin{enumerate}
\item Option 1, based on \cite{hervieu_fusion_2016,Rother_2004}:
\begin{equation}
V(x,y)=\frac{1}{dim}\sum_{i\in[0,dim]}V_i(x,y)^\varepsilon
\end{equation}
where $V_i(x,y)=\exp\Bigg(\frac{-\big(I_i(x)-I_i(y)\big)^2}{2\big(\overline{I_i(x)-I_i(y)}\big)^2}\Bigg)$, $dim$ being the number of bands of the image $I_{MS}$ and $I_i(x)$ the intensity of pixel $x$ in band $i$.
\item Option 2, based on \cite{schindler_overview_2012}:
\begin{equation}
\begin{aligned}
&\mathcal{G}_{0.5}\odot
\begin{bmatrix}
-1 &1
\end{bmatrix}& \mathcal{G}_{0.5}\odot
\begin{bmatrix}
1 \\
-1
\end{bmatrix}&\\
&\mathcal{G}_{0.5}\odot
\begin{bmatrix}
-1 & 0 \\
0 &-1
\end{bmatrix}&\mathcal{G}_{0.5}\odot
\begin{bmatrix}
0& 1 \\
-1 &0
\end{bmatrix}&\\
\end{aligned}
\end{equation}
For example, the gradient in the horizontal direction would be $\dot{I}=||\mathcal{G}_{0.5}\odot\begin{bmatrix}-1 &1\end{bmatrix}\odot I||$. The contrast term penalizes label changes most strongly if the contrast is zero and decreases linearly up to a point where it becomes zero, for which $\phi=70\%$ is chosen. The contrast formula thus writes:
\begin{equation}
E_{reg}(C(u)\neq C(v))=\max\Big(0,1-\frac{2-\phi}{\dot{I}_{MAX}}\dot{I}_{u\rightarrow v}\Big)
\end{equation}
\end{enumerate}
For both contrast alternatives, a large number of parameter value combinations was tested on a small test zone and narrowed down successively to reach the best combination. Cross-validation between parameters was performed to find the values of $\lambda$, $\gamma$ and $\varepsilon$ yielding the smoothest possible result while following the real object contours.\\
In addition, the image used for the contrast term was blurred by a Gaussian filter with standard deviation 2 to obtain smoother contours.
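A minimal Python sketch of contrast model 1 for horizontal pixel neighbours is given below; the per-band mean of the absolute intensity differences is used as a reading of the overlined term, which is an assumption.
\begin{verbatim}
# Contrast model 1 for horizontal neighbours: per band,
# V_i = exp(-d^2 / (2*mean(d)^2)) with d the intensity difference,
# then V = mean_i V_i^eps. Gaussian blur (sigma=2) is applied first,
# as described in the text.
import numpy as np
from scipy import ndimage

def contrast_horizontal(img, eps):
    # img: (H, W, dim) multispectral image
    img = ndimage.gaussian_filter(img, sigma=(2, 2, 0))
    d = img[:, 1:, :] - img[:, :-1, :]          # I_i(x) - I_i(y)
    mean_d = np.mean(np.abs(d), axis=(0, 1))    # per-band mean difference
    V_i = np.exp(-d**2 / (2.0 * mean_d**2 + 1e-12))
    return np.mean(V_i**eps, axis=2)            # average over bands
\end{verbatim}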
\subsection{Evaluation}
Evaluation of the 5-class classifications was performed on the fusion and regularization results for all five classes, applied to 5 tiles of 3000 $\times$ 3000 m size, totalling an area of 45 km$^2$. The class labels of each classification were compared against the ground truth labels derived from the freely available BDTOPO and RPG databases (\cite{bdtopo,RPG}, fig. \ref{fig:vt\tile}). For the classification by fusion, the sixth buffer class around buildings was ignored in the evaluation.\\
The following indicators were produced for each classification (\cite{deMorsier}):
\begin{itemize}
\item Overall Accuracy (OA) and User Accuracy (UA) scores based on Confusion matrices (table \ref{table:cm}).
\begin{table}[H]
\centering
\begin{tabular}{cc|c|c|c|c|cl}
& & \multicolumn{4}{c|}{True labels}&UA\\
\multirow{5}{*}{Predicted labels} & & 1& 2& $\hdots$ &r &\\ \cline{1-7}
& 1 & $n_{11}$ & $n_{12}$ & $\hdots$ & $n_{1r}$& $n_{11}/n_{1\bullet}$ \\ \cline{2-7}
& 2 & $n_{21}$ & $n_{22}$ &$\hdots$ &$n_{2r}$ &$n_{22}/n_{2\bullet}$\\ \cline{2-7}
& $\vdots$ & $\vdots$ & $\vdots$ &$\ddots$ &$\vdots$&$\vdots$\\ \cline{2-7}
& r & $n_{r1}$ & $n_{r2}$ &$\hdots$ & $n_{rr}$ & $n_{rr}/n_{r\bullet}$ \\\cline{1-7}
\multicolumn{2}{c|}{PA} & $n_{11}/n_{\bullet 1}$ & $n_{22}/n_{\bullet 2}$ & $\hdots$ & $n_{rr}/n_{\bullet r}$ & OA$=\sum_in_{ii}/n_{\bullet\bullet}$\\
\end{tabular}
\caption{Confusion matrix: UA = User's Accuracy, PA = Producer's Accuracy, OA = Overall Accuracy, $1,2,\hdots,r$ = classes. Bullet indexes signify either the sum of the row (e.g., $n_{i\bullet}$), the sum of the column (e.g., $n_{\bullet i}$) or the sum of all elements of the confusion matrix ($n_{\bullet\bullet}$).}
\label{table:cm}
\end{table}
\item Cohen's Kappa:
\begin{equation}
\kappa=\frac{P_{obs}-P_{est}}{1-P_{est}},P_{obs}=OA=\frac{\sum_in_{ii}}{n}, P_{est}=\frac{\sum_i\big(\frac{\sum_jn_{ij}\sum_jn_{ji}}{n}\big)}{n}
\end{equation}
\item The F-Score is based on user accuracy (precision) and the producer accuracy (recall) for a given class (\cite{zhang_f-measure_2009,ting_precision_2011}):
\begin{itemize}
\item Producer accuracy (recall) measures how large a fraction of the expected results is actually found:
\begin{equation}
R=\frac{tp}{tp+fn}
\end{equation}
\item User accuracy (precision) measures how many of the returned results are actually relevant:
\begin{equation}
P=\frac{tp}{tp+fp}
\end{equation}
\end{itemize}
True positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) are calculated as the number of labels correctly classified or not with respect to the ground truth labels.
\begin{table}[H]
\begin{center}
\begin{tabular}{c|c c| c}
\multicolumn{4}{c}{Confusion matrix} \\
& Relevant & Non-relevant & Total \\
\hline
Retrieved & TP & FP & Pp \\
Not retrieved & FN & TN & Np \\
\hline
Total & P & N & n
\end{tabular}
\caption{Confusion matrix for a single class. Pp/Np = total predicted positives/negatives, P/N = total actual positives/negatives, n = total number of samples.}
\label{table:confusion_oneclass}
\end{center}
\end{table}
The F-Score is the weighted harmonic mean of precision and recall:
\begin{equation}
F1=\frac{2PR}{P+R}
\end{equation}
The mean F-Score over all classes and the F-Score specifically for the buildings class are indicated. Methods are ranked with respect to their Average Accuracy (AA) and F-Scores for the buildings class; a numerical sketch of these indicators follows this list.
\end{itemize}
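The sketch below computes the indicators from a confusion matrix with predicted labels as rows (the per-class IoU used for the binary evaluation later on is included); it is illustrative, not the evaluation code actually used.
\begin{verbatim}
# Accuracy indicators from a confusion matrix cm (rows = predicted,
# columns = true labels).
import numpy as np

def metrics(cm):
    n = cm.sum()
    oa = np.trace(cm) / n                             # overall accuracy
    p_est = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2
    kappa = (oa - p_est) / (1 - p_est)                # Cohen's kappa
    ua = np.diag(cm) / cm.sum(axis=1)                 # precision per class
    pa = np.diag(cm) / cm.sum(axis=0)                 # recall per class
    f1 = 2 * ua * pa / (ua + pa)
    iou = np.diag(cm) / (cm.sum(axis=1) + cm.sum(axis=0) - np.diag(cm))
    return oa, kappa, ua, pa, f1, iou
\end{verbatim}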
\section{Artificialized area}
\subsection{Binary classification}
Firstly, the binary class probabilities were extracted. A linearly decreasing function assigning the probability of being in an artificialized area was applied around all buildings, starting at 1 at the building boundaries and reaching 0 at a distance of 200 meters (fig. \ref{fig:regul_fusion_steps}).\\
A more sophisticated approach was tested, taking into account the mean membership probability of the building class from the fusion result for each building instead of assigning probability 1 within buildings, admitting that class confusion in boundary regions is higher. However, this approach showed similar results and was not used further.
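A minimal sketch of this distance prior follows, assuming a building mask at the 10 m Sentinel-2 resolution (file name and pixel size are assumptions).
\begin{verbatim}
# Building-distance prior: probability 1 inside buildings, decreasing
# linearly to 0 at 200 m (file name and 10 m pixel size are assumptions).
import numpy as np
from scipy import ndimage

buildings = np.load("regul_buildings.npy")     # (H, W) boolean mask
gsd = 10.0                                     # pixel size in metres
dist = ndimage.distance_transform_edt(~buildings) * gsd
P_art = np.clip(1.0 - dist / 200.0, 0.0, 1.0)  # P(a); P(not a) = 1 - P(a)
\end{verbatim}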
\begin{figure}[H]
\centering
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_binary_1_bati}
\caption{Buildings from regularization}
{\vspace{3mm}
\small\centering
\fcolorbox{black}{red}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad Buildings
}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_binary_2_dist}
\caption{Distance around each building}
{\vspace{3mm}
\small\centering
\fcolorbox{black}{black}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad d=0
\fcolorbox{black}{white}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad d=max
}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_binary_3_bati-dist}
\caption{Distance and buildings superposed}
{\vspace{3mm}
\small\centering
\fcolorbox{black}{black}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad d=0
\fcolorbox{black}{white}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad d=max
\fcolorbox{black}{red}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad buildings
}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_binary_4_proba_regul_urbain}
\caption{Probability to be in artificialized area as linearly decreasing distance function around buildings}
{\vspace{3mm}
\small\centering
\fcolorbox{black}{black}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad P(a)=0
\fcolorbox{black}{white}{\rule{0pt}{6pt}\rule{6pt}{0pt}}\quad P(a)=1
}
\end{subfigure}
\caption{Urban probability extraction}
\label{fig:regul_fusion_steps}
\end{figure}
The class probabilities from the RF classification on the Sentinel-2 image were converted to a binary classification in order to obtain one class for artificialized areas ($a$) and one for non-artificialized areas ($\neg a$). Two options were tested:
\begin{align}
P(a)&=P(b), &P(\neg a)&= 1 - P(a)\label{eq:bin1}\\
P(a)&=P(b)+P(r), &P(\neg a)&= 1 - P(a)\label{eq:bin2}
\end{align}
$P(b)$ and $P(r)$ are the class probabilities for buildings and roads respectively, $b, r\in \mathcal{C}$ being classes of the Sentinel-2 and regularization classifications, each consisting of membership probabilities for five classes $\mathcal{C}$, $\sum_{c \in \mathcal{C}}P(c) = 1$. Only option two was retained, since buildings and roads were confused in the Sentinel-2 classification.\\
The binary class probabilities were used to perform a second fusion of the regularization result with the Sentinel-2 classification in order to find the extent of the urban tissue, consisting of contiguous structures of buildings and roads on a coarser level. The dilated building probability map together with the binary class probabilities from the Sentinel-2 classifier were used for fusion and regularization at the resolution of the Sentinel-2 classifier, using the Min rule and the same regularization parameters as described above.\\
\subsection{Segmentation}
Majority voting of the binary fusion and regularization result within the segments of the segmented Sentinel-2 image was performed to follow the real-world contours of artificialized areas more closely. In addition, the BD Parcellaire (parcel database, \cite{bdparcellaire}) was also tested for majority voting within each parcel; it was not retained, however, due to mixed residential and agricultural parcels. \\
Segmentation is performed using the "scale climbing" algorithm described in \cite{guigues_scale-sets_2003}, repeated with cuts of 3, 8, 20 and 30. The segmentation starts from an initial over-segmentation (e.g., a watershed segmentation) and calculates an energy term for each merged pair of regions $R_1\cup R_2\;\forall (R_1,R_2)\in\mathcal{R}$, with $\mathcal{R}$ being the set of all regions resulting from the segmentation:
\begin{equation}
E_{total}^{(R_1\cup R_2)}=E_{data}^{(R_1\cup R_2)}+\lambda E_{compl}^{(R_1\cup R_2)}
\end{equation}
with $E_{data}$ a data term assessing how well the segmentation represents the original image and $E_{compl}$ a complexity term calculated over each sub-region. $E_{compl}$ is the length of the boundary of the region $R=R_1\cup R_2$:
\begin{equation}
E^{(R_1\cup R_2)}_{compl}=L_{R_1\cup R_2}
\end{equation}
$E_{data}$ is calculated based on the pixel intensities $I$:
\begin{align}
E_{data}^{(R_1\cup R_2)}=\sum_{(u,v)\in R_1\cup R_2}\big(I(u,v)-\bar{I}_R\big)^2
\end{align}
$I(u,v)$ being the intensity at a given pixel coordinate $(u,v)$ and $\bar{I}_R$ the mean intensity over the region. The total energy term over a set of regions $\mathcal{R}$ is calculated as:
\begin{align}
E_{total}=\sum_{R\in \mathcal{R}} E_{data}^{(R)}+\lambda\sum_{R\in \mathcal{R}} E_{compl}^{(R)}
\end{align}
One tries to minimize $E_{total}$, thus to make $\Delta E_{total}$ of each merge as small as possible. The merging cost $\lambda$ is obtained by setting $\Delta E_{total}=0$:
\begin{equation}
\begin{split}
\Delta E_{total}=&E_{data}^{(R_1\cup R_2)}+\lambda E_{compl}^{(R_1\cup R_2)}\\
&-E_{data}^{(R_1)}-E_{data}^{(R_2)}-\lambda\big(E_{compl}^{(R_1)}+E_{compl}^{(R_2)}\big)\\
=&\big(E_{data}^{(R_1\cup R_2)} - E_{data}^{(R_1)} - E_{data}^{(R_2)}\big)\\
&+\lambda\big(E_{compl}^{(R_1\cup R_2)}-E_{compl}^{(R_1)}-E_{compl}^{(R_2)}\big)
\end{split}
\end{equation}
\begin{equation}
\therefore \lambda=\frac{E_{data}^{(R_1)}+E_{data}^{(R_2)}-E_{data}^{(R_1\cup R_2)}}{E_{compl}^{(R_1\cup R_2)} -E_{compl}^{(R_1)}-E_{compl}^{(R_2)}}
\end{equation}
With each merge of two segmented regions, $\lambda$ increases. The cut parameter sets the threshold on $\lambda$ at which the merging of sub-regions stops.
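A minimal sketch of the subsequent majority voting within segments follows, assuming a segment id map produced by the segmentation (array names are hypothetical).
\begin{verbatim}
# Majority vote of the binary labels within each segment.
import numpy as np

def majority_vote(labels, segments):
    # labels: (H, W) binary map; segments: (H, W) integer segment ids
    out = np.zeros_like(labels)
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        out[mask] = np.bincount(labels[mask].ravel()).argmax()
    return out
\end{verbatim}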
\subsection{Evaluation}
Evaluation of the artificialized area was performed against binary classifications derived from the following sources (fig. \ref{fig:gt-bin}):
\begin{enumerate}
\item Dilated BDTOPO ground truth: Buildings of the ground truth were extracted and dilated by 20 metres (\cite{bdtopo}).
\item OpenStreetMap (OSM) data, which uses refined Corine land cover data (\cite{osm,corine}).
\item The CESBIO land use (OSO, \textit{\textbf{o}ccupation du \textbf{so}l}) product, regrouping the classes "urban diffuse", "urban dense" and "industrial and commercial zones" (\cite{oso}).
\end{enumerate}
\begin{figure}[H]
\centering
\foreach \n/\captiontext in {bdtopo_original/BDTOPO,
bdtopo/BDTOPO dilated,
osm/OpenStreetMap,
oso/OSO
}{
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_gt_\n}
\caption{\captiontext}
\end{subfigure}
}
\legendebin
\caption{Binary ground truth}
\label{fig:gt-bin}
\end{figure}
In addition to the qualitative comparison between ground truth data and binary maps and the accuracy measures presented above, the Intersection over Union (IoU) was calculated based on the confusion matrix (tables \ref{table:cm}, \ref{table:confusion_oneclass}) for a binary classification with class $i$ (\cite{jaccard1912distribution}):
\begin{equation}
IoU_{i}=\frac{n_{ii}}{n_{i\bullet}+n_{\bullet i}-n_{ii}}=\frac{\text{TP}}{\text{Pp}+\text{P}-\text{TP}}
\end{equation}
\section{Results}
\subsection{Original Classifications}
The original classifications confirm the initial observation that the SPOT-6 classifier tends to preserve geometry, whereas the Sentinel-2 classifier is semantically more reliable overall while mixing buildings and roads due to its coarse spatial resolution. Figure \ref{fig:cl_orig\tile} shows the result of the individual classifications of the SPOT-6 and Sentinel-2 images. The presented zone was selected specifically to illustrate problems of both classifiers and the improvements yielded by fusion and regularization; the initial classifiers show better results in other zones (sec. \ref{subsec:otherZones}).
A few particularly interesting confusions can be highlighted on the example of figure \ref{fig:cl_orig\tile}:
\begin{enumerate}
\item The SPOT-6 classifier detects a big patch of buildings and road in the top-left corner where there is actually a field (probabilities are shown in fig. \ref{fig:proba_point}). This is probably because the field was barren at the time the image was taken, leading to a wrong classification due to the low number of bands. The patch is correctly labelled in the Sentinel-2 classification thanks to the use of a time series.
\item In the middle of the image, there is an industrial complex visibly surrounded by roads and concrete ground. Several patches around the building are wrongly classified as water by the SPOT-6 classification, while there is a smaller amount of confusion in Sentinel-2. The confusion is due to the training set of both classifiers, which contains muddy waters in zones other than the one represented below, whose brighter texture is more similar to the pavement surrounding the industrial area.
\item Individual buildings in the left-hand corner are nicely recognizable in the SPOT-6 classification, whereas they are mixed with the roads in the Sentinel-2 classification due to its lower resolution.
\end{enumerate}
Overall, the Sentinel-2 RF classifier tends to recognize bigger patches of culture and forest well, while showing some confusion between roads and buildings. The SPOT-6 DCNN classifier shows good discrimination between buildings and roads as well as a finer geometric level of detail, but it shows more confusion between classes. Buildings, which are of main interest here, are generally less confused in the Finistère region.
\newcommand{\figureOrigClass}{
\begin{figure}[H]
\centering
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_classif_SPOT6_all}
\caption{SPOT-6 classification}
\label{fig:SPOT6cl\tile}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_classif_S2_all}
\caption{Sentinel-2 classification}
\label{fig:S2cl\tile}
\end{subfigure}
\legende
\caption{SPOT-6 and Sentinel-2 classifications. The SPOT-6 image is superposed transparently in the background of the classifications.}
\label{fig:cl_orig\tile}
\end{figure}
}
\figureOrigClass
\subsection{Fusion}
All results of the fusion are shown in appendix \ref{app:subsec:fusion}. Among the different tested fusion metrics, the Min and Bayesian Product rules produced the best results, since they followed the objects most precisely while producing the fewest class confusions (fig. \ref{fig:fusion\tile}). The fusion by classification (RF, SVM) initially tended to produce patches of buildings rather than separate buildings, due to missing ground truth data. Adding a sixth buffer class around buildings helped refine the buildings' contours and obtain a higher level of detail than the Min or Bayesian Product rules. In certain regions, such as the industrial area shown below, some confusion was caused between buildings and the newly introduced buffer class. Since this was an exception, however, the SVM t2 classifier with buffer class was used further in the regularization.
% RF vs SVMt2: better overall quality
\begin{figure}[H]
\centering
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_classif_Fusion_Min_weighted}
\caption{Fusion: Min rule}
\label{fig:fusion_min\tile}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_classif_Fusion_Bayes}
\caption{Fusion: Bayesian Product rule}
\label{fig:fusion_bayes\tile}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_classif_Fusion_svmt2_5cl}
\caption{Fusion: SVM t2 without buffer class}
\label{fig:fusion_svmt2\tile}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_classif_Fusion_svmt2}
\caption{Fusion: SVM t2 with buffer class}
\label{fig:fusion_svmt2buf\tile}
\end{subfigure}
\legendebuf
\caption{Fuzzy fusion rules and fusion by classification}
\label{fig:fusion\tile}
\end{figure}
With respect to the comments on the individual classifications before fusion:
\begin{enumerate}
\item All fusion rules manage to get rid of the wrongly classified building patch in the top-left corner (cf. fig. \ref{fig:SPOT6cl\tile}), preferring the Sentinel-2 classification pixels over those of SPOT-6. The Min rule follows the field contours less smoothly than the other classifiers.
\item The industrial zone in the middle is still confused with water in both classifiers, however the southeastern part of it is now correctly classified as vegetation.
\item All fusion methods manage to conserve the level of detail of individual buildings in the bottom-right corner. The SVM t2 classifier with buffer class preserves most detail of all buildings.
\item The fusion by classification with additional buffer class occasionally erases small patches of buildings, such as in the top-right part of the image, but overall manages to preserve more detail of individual buildings.
\end{enumerate}
The same observations were made in all test zones, even those on which the classification model was not trained (sec. \ref{sec:otherZone_finistere}).
\subsection{Regularization}
The model and parameters chosen for the final regularization are as follows:
\begin{itemize}
\item Data term: $f(t)=1-t$, since a logarithmic model would penalize locally bad classifications too strongly (the energy term rises to $\infty$ as $t\rightarrow 0$);
\item Contrast model 1, since model 2 was not adaptable enough (it behaved very similarly under different parameter sets and did not show significant improvement over model 1);
\item Optimal parameters: $\lambda = 10, \gamma = 0.7, \varepsilon = 50$. The parameters were selected by cross-validation, optimizing the visual quality of the regularization result.
\end{itemize}
The regularization product using the above-mentioned regularization model and parameters is shown in figure \ref{subfig:regularisation\tile}.
\newcommand{\figureFusionRegul}{
\begin{figure}[H]
\centering
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_classif_Fusion_svmt2}
\caption{Fusion by SVM t2}
\label{subfig:fusion_svmt2\tile}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_regul_svmt2_G2_l1000_g70_e500_0_0_0}
\caption{Regularization}
\label{subfig:regularisation\tile}
\end{subfigure}
\legendebuf
\caption{SVM t2 Fusion and Regularization}
\label{fig:fusion_regularisation\tile}
\end{figure}
}
\figureFusionRegul
The regularization smooths out small noisy patches and yields a visually more appealing result, although its classification accuracy can be lower than that of the fusion (cf. table \ref{table:eval}).
\subsection{Evaluation}
Evaluation measures for individual classifications, fusion and regularization all using 5 classes are shown in table \ref{table:eval}, both using the weighted and non-weighted class membership probabilities as input. Accuracy measures were calculated over 5 SPOT-6 tiles of $3000\times3000$ meters size each, totalling an area of 45 km$^2$, distributed over the entire study zone of 648 km$^2$ area.
\begin{table}[H]
\centering
\begin{tabular}{p{1.8cm}llllll}\toprule
\multicolumn{2}{c}{\textbf{Classification}}& \textbf{Kappa} [\%] & \textbf{OA} [\%] & \textbf{AA}[\%] & \textbf{Fm} [\%] & \textbf{Fb} [\%]\\\hline
\multirow{2}{*}{Direct}& Sentinel-2 & 72.0 & 83.7 & 81.5 & 64.6 & 52.2 \\
& SPOT-6 & 73.3 & 85.2 & 70.8 & 63.4 & 62.5 \\\hline
\multirow{12}{*}{\parbox{1.8cm}{Fusion: Fuzzy rules}}& \textbf{Min} & 79.1 & 88.4 & 84.7 & 76.7 & 73.8 \\
& Max & 77.3 & 87.4 & 84.2 & 73.2 & 70.5 \\
& Compromise & 78.8 & 88.2 & 84.5 & 76.2 & 72.5 \\
& Compromise WO & 78.8 & 88.2 & 84.5 & 76.2 & 72.5 \\
& Prior 1 & 73.3 & 85.2 & 70.8 & 63.4 & 62.5 \\
& Prior 2 & 73.3 & 85.2 & 70.8 & 63.4 & 62.5 \\
& Max weighted & 76.7 & 87.1 & 83.9 & 72.7 & 70.3 \\
& Min weighted & 78.9 & 88.2 & 83.8 & 76.4 & 73.5 \\
& Compromise weighted & 78.6 & 88.1 & 83.6 & 75.9 & 72.2 \\
& Compromise WO weighted & 78.6 & 88.1 & 83.6 & 75.9 & 72.2 \\
& Prior 1 weighted & 73.3 & 85.2 & 70.8 & 63.4 & 62.5 \\
& Prior 2 weighted & 73.3 & 85.2 & 70.8 & 63.4 & 62.5 \\\hline
\multirow{2}{*}{\parbox{1.8cm}{Fusion: Bayesian}}& Bayesian Sum & 78.3 & 87.9 & 84.8 & 74.8 & 71.9 \\
& Bayesian Product & 79.1 & 88.5 & 85.0 & 76.7 & 73.8 \\\hline
\multirow{3}{*}{\parbox{1.8cm}{Fusion: Dempster-Shafer}}& DS Mass Sum & 78.9 & 88.3 & 85.1 & 76.1 & 73.6 \\
& \textbf{DS V1} & 79.1 & 88.4 & 85.1 & 76.5 & 73.7 \\
& DS V2 & 79.0 & 88.4 & 85.1 & 76.4 & 73.7 \\\hline
\multirow{3}{*}{\parbox{1.8cm}{Fusion: Margin}}& Margin Maximum & 77.7 & 87.6 & 84.5 & 73.4 & 70.2 \\
& Margin Bayesian Product & 78.4 & 88.0 & 84.7 & 75.1 & 72.0 \\
& Margin Bayesian Sum & 78.0 & 87.7 & 84.6 & 74.0 & 71.0 \\\hline
\multirow{3}{*}{\parbox{1.8cm}{Fusion: classification}} & \textbf{rf} & 81.8 & 90.0 & 90.1 & 81.2 & 81.6 \\
&\textbf{svmt0} & 80.5 & 89.3 & 88.6 & 77.9 & 80.4 \\
&\textbf{svmt2} & 81.0 & 89.6 & 89.1 & 79.1 & 83.4 \\\hline
\multirow{2}{*}{Regularization} & regul Min & 75.1 & 85.8 & 82.3 & 73.9 & 73.8 \\
& \textbf{regul svmt2} & 81.4 & 89.8 & 89.3 & 79.8 & 83.9\\\bottomrule
\end{tabular}
\caption{Evaluation indicators for each method. OA = Overall Accuracy, AA = Average Accuracy, Fm = Mean F-Score, Fb = F-Score for the buildings class. The top 5 classifiers either in terms of AA or Fb are indicated in bold.}
\label{table:eval}
\end{table}
\subsection{Artificialized area}
\subsubsection{Fusion with Sentinel-2}
Results using the dilated binary class probabilities from the regularization and the binary Sentinel-2 class probabilities are shown in figure \ref{fig:regul_fusion}. The upper two images show the input data: the dilated binary regularization produced according to fig. \ref{fig:regul_fusion_steps}, and the Sentinel-2 binary classification produced according to equation \ref{eq:bin2} by regrouping the 5 classes into two. The fusion metrics yielded visually similar results, and the Min fusion metric was chosen. Regularization parameters were kept the same as for the 5-class regularization.
\begin{figure}[H]
\centering
\foreach \n/\captiontext in {classif_regul_urbain/Input: Dilated binary regularization,
classif_S2_urbain/Input: Sentinel-2 binary classification,
classif_Fusion_Min/Fusion by Min rule,
regul_proba_Fusion_Min_100_1000_100_0_100_70_100_200_0_0_0/Regularization
}{
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_BIN_\n}
\caption{\captiontext}
\end{subfigure}
}
\legendebin
\caption{Artificialized Area: Input data, Fusion and Regularization}
\label{fig:regul_fusion}
\end{figure}
\subsubsection{Segmentation}
Results using majority vote of the dilated binary class probabilities within the segmented image regions of Sentinel-2 are shown in figure \ref{fig:segmentation}.
\begin{figure}[H]
\centering
\foreach \n/\captiontext in {3,8,20,30}{
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_regul_seg_maj_\n}
\caption{cut=\n}
\end{subfigure}
}
\legendebin
\caption{Segmentation}
\label{fig:segmentation}
\end{figure}
Too high a cut leads to over-generalization and loss of useful information, while too low a cut keeps too many small zones from the initial over-segmentation. A cut of 8 is a good trade-off and will be analyzed further.
\subsubsection{Comparison}
\newcommand{\figureBinRegulSeg}{
\begin{figure}[H]
\centering
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_BIN_regul_proba_Fusion_Min_100_1000_100_0_100_70_100_200_0_0_0}
\caption{Fusion and Regularization}
\label{subfig:fusionRegComp\tile}
\end{subfigure}
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{\region_T\tile_regul_seg_maj_8}
\caption{Segmentation with cut=8}
\end{subfigure}
\legendebin
\caption{Fusion and regularization followed by segmentation and majority voting. The segmentation step allows following contours in the original image more closely.}
\label{fig:comparison\tile}
\end{figure}
}
\figureBinRegulSeg
The segmentation yields some refinements to the fusion and regularization:
\begin{itemize}
\item It provides a more accurate classification in terms of following the original image contours. The effect of the non-directional dilation of the building probability is clearly visible in figure \ref{subfig:fusionRegComp\tile}, where the industrial area in the North-Western part of the image extends in all directions, which it does not in the segmentation result.
\item The segmentation approach manages to remove some artifacts which were still present in the regularization, such as the classification of the road intersection as a building near the North-Eastern part.
\end{itemize}
However, a drawback of the segmentation approach is the arbitrary choice of the cut parameter, which is done manually.
\subsection{Binary Evaluation}
\begin{figure}[H]
\centering
\foreach \n/\captiontext in {bdtopo/BDTOPO,
osm/OSM,
oso/OSO
}{
\begin{subfigure}{0.49\textwidth}
\centering
\includegraphics[width=.9\textwidth]{\region_T\tile_gt_eval_\n-regul_seg_maj_8}
\caption{\captiontext}
\end{subfigure}
}
\legendeBinEval
\caption{Binary evaluation: Segmentation, cut=8}
\label{fig:eval-bin}
\end{figure}
The following aspects can be underlined:
\begin{itemize}
\item The OSM ground truth is rather simplified and contains few, rough urbanized patches.
\item The OSO ground truth is partial, containing gaps of artificial areas
within urban patches. This is due to the resolution and methodology used for the classification and the selection of classes considered as urban.
\item The dilated BDTOPO ground truth is the most truthful; however, it is also approximate and exceeds the urban area due to the dilation used to create it.
\end{itemize}
Accuracy measures calculated on 5 SPOT-6 tiles of a total area of 45 km$^2$ for every ground truth are given in table \ref{table:accuracy-bin\region}.
\begin{table}[H]
\centering
\begin{tabular}{llp{1.5cm}llp{1.5cm}}
\toprule
\textbf{Classification} & \textbf{Ground Truth} & \textbf{F-Score artif.} & \textbf{Kappa} & \textbf{OA} & \textbf{IoU artif.} \\\hline
\multirow{3}{*}{Binary Regularization Input}& BDTOPO & 86.7 & 83.2 & 94.5 & 76.5 \\
& OSO & 56.8 & 50.1 & 86.8 & 39.7 \\
& OSM & 58.3 & 51.9 & 87.3 & 41.2 \\\hline
\multirow{3}{*}{Binary Sentinel-2 Input}& BDTOPO & 65.2 & 56.4 & 85.9 & 48.3 \\
& OSO & 63.9 & 58.3 & 89.3 & 46.9 \\
& OSM & 52.6 & 45.5 & 86.0 & 35.7 \\\hline
\multirow{3}{*}{Fusion Min}& BDTOPO & 79.8 & 75.2 & 92.4 & 66.3 \\
& OSO & 66.9 & 62.2 & 91.2 & 50.3 \\
& OSM & 62.9 & 57.6 & 90.2 & 45.9 \\\hline
\multirow{3}{*}{Regularization}& BDTOPO & 79.7 & 75.1 & 92.5 & 66.2 \\
& OSO & 67.4 & 62.8 & 91.5 & 50.9 \\
& OSM & 64.4 & 59.4 & 90.7 & 47.4 \\\hline
\multirow{3}{*}{Segmentation (cut=8)}& BDTOPO & 68.6 & 62.5 & 89.3 & 52.2 \\
& OSO & 65.0 & 60.5 & 91.9 & 48.1 \\
& OSM & 65.3 & 60.9 & 92.0 & 48.5 \\
\bottomrule
\end{tabular}
\caption{Accuracy measures for the artificialized area: F-Score, Kappa, Overall Accuracy (OA) and Intersection over Union (IoU) for the artificialized class.}
\label{table:accuracy-bin\region}
\end{table}
The dilated BDTOPO ground truth generally yields the highest accuracy measures, showing the strongest correspondence with the binary classifications. The fusion and regularization improve the accuracy measures over the individual input classifications. The segmentation reduces the accuracy scores in most cases, because it tends to reduce the extent of the areas classified as artificialized while all ground truth maps over-estimate the artificialized area.
\subsection{Experiments over the entire area}
In addition to the small-scale visualizations shown above, the same products were also generated for the entire area covered by both data sets, spanning 648 km$^2$, although accuracy measures were not computed for the entire area. The results show some interesting classification behaviour in the Northern part, which was cloudy in some of the Sentinel-2 images, producing confusions between urban areas and water (appendix \ref{app:subsec:entire_area}).
\subsection{Other zones}\label{subsec:otherZones}
The following section presents results of the processing chain applied to other study zones:
\begin{itemize}
\item Finistère, where the initial classifiers produce better results in terms of class confusions
\item Gironde, where the classifiers behave differently due to a different spectral response.
\end{itemize}
\subsubsection{Finistère}
\label{sec:otherZone_finistere}
The figures below show the results of applying the same processing chain as described in the methodology to a zone with fewer class confusions in the initial classifications. The classification models for fusion were trained in a different zone.
\renewcommand{\tile}{35000_40000}