zfengyan/Kmeans_clustering

Kmeans-clustering

About classification encoding: https://desktop.arcgis.com/en/arcmap/latest/manage-data/las-dataset/lidar-point-classification.htm

Some data structures are based on: https://github.com/luizgh/knn

HOW TO USE

build:

All files can be downloaded here.

Build the project from all the files in the src folder. If you build on a different platform, check the settings in CMakeLists.txt first.

Compile and run:

Compiling the source files produces a binary executable (Kmeans.exe). Run it directly; no arguments are required.

INPUT

The program automatically reads all .xyz files in the data folder.
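As a rough illustration of what reading one .xyz file can look like, here is a minimal sketch. It assumes each line holds three whitespace-separated coordinates (x y z); the names `Point3` and `read_xyz` are hypothetical and are not taken from the project's source.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One lidar point. The struct and field names are illustrative only.
struct Point3 {
    double x, y, z;
};

// Read "x y z" records from a stream, one record per line.
// Assumption: lines that do not start with three numbers are skipped.
std::vector<Point3> read_xyz(std::istream& in) {
    std::vector<Point3> points;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Point3 p{};
        if (fields >> p.x >> p.y >> p.z) {
            points.push_back(p);
        }
    }
    return points;
}
```

Passing a `std::ifstream` opened on a file in the data folder would yield the points of that file; how the actual program iterates over the folder is not shown here.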

OUTPUT

The output files are in the same data folder as the input files.

dataset_truth

dataset_result

  • report.txt -- the classification statistics report

Comment:

Manhattan distance is used by default in this program because it gave relatively good results.
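To make the choice of metric concrete, the sketch below shows the Manhattan distance next to the Euclidean distance, plus a nearest-centroid lookup of the kind used in the k-means assignment step. The function names and the `std::vector<double>` feature representation are assumptions for illustration, not the project's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Manhattan (L1) distance: sum of absolute coordinate differences.
double manhattan_distance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += std::fabs(a[i] - b[i]);
    }
    return sum;
}

// Euclidean (L2) distance: square root of the summed squared differences.
double euclidean_distance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

using Distance = double (*)(const std::vector<double>&, const std::vector<double>&);

// Index of the centroid closest to `sample` under the given metric.
// Ties keep the centroid with the lower index.
std::size_t nearest_centroid(const std::vector<double>& sample,
                             const std::vector<std::vector<double>>& centroids,
                             Distance distance = manhattan_distance) {
    assert(!centroids.empty());
    std::size_t best = 0;
    double best_distance = distance(sample, centroids[0]);
    for (std::size_t i = 1; i < centroids.size(); ++i) {
        const double d = distance(sample, centroids[i]);
        if (d < best_distance) {
            best_distance = d;
            best = i;
        }
    }
    return best;
}
```

The two metrics can assign the same sample to different clusters. For a sample at (0, 0) and centroids (3, 3) and (0, 5), the Manhattan distances are 6 and 5, so the second centroid is chosen, while the Euclidean distances are about 4.24 and 5, so the first is chosen.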

Euclidean distance has also been tested; the corresponding result files are:

Designed features:

(1) area : (max_x - min_x) * (max_y - min_y)

(2) height difference : max_z - min_z

(3) density : number of points in one file / area

(4) ratio : x_distance / y_distance (the ratio of width / height)
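The four features above can be sketched as follows. This is a minimal version under two assumptions: x_distance and y_distance are taken to be max_x - min_x and max_y - min_y, and each input file has a non-degenerate footprint (otherwise the divisions are by zero). The names `Point3`, `Features` and `compute_features` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One lidar point. The struct and field names are illustrative only.
struct Point3 {
    double x, y, z;
};

// The four per-file features listed above.
struct Features {
    double area;               // (max_x - min_x) * (max_y - min_y)
    double height_difference;  // max_z - min_z
    double density;            // number of points / area
    double ratio;              // x_distance / y_distance
};

Features compute_features(const std::vector<Point3>& points) {
    assert(!points.empty());
    double min_x = points[0].x, max_x = points[0].x;
    double min_y = points[0].y, max_y = points[0].y;
    double min_z = points[0].z, max_z = points[0].z;
    for (const Point3& p : points) {
        min_x = std::min(min_x, p.x);  max_x = std::max(max_x, p.x);
        min_y = std::min(min_y, p.y);  max_y = std::max(max_y, p.y);
        min_z = std::min(min_z, p.z);  max_z = std::max(max_z, p.z);
    }
    // Assumption: the x and y extents are the bounding-box side lengths.
    const double x_distance = max_x - min_x;
    const double y_distance = max_y - min_y;

    Features f;
    f.area = x_distance * y_distance;
    f.height_difference = max_z - min_z;
    // Not guarded: a zero area or zero y extent yields inf or NaN.
    f.density = static_cast<double>(points.size()) / f.area;
    f.ratio = x_distance / y_distance;
    return f;
}
```

For three points (0, 0, 0), (2, 4, 3) and (1, 1, 1), this gives an area of 8, a height difference of 3, a density of 0.375 and a ratio of 0.5.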

Overall statistics:

Numbers:

Correct numbers: 349 out of 500

Wrong numbers: 151 out of 500

Accuracy:

Avg accuracy (mAcc) = 0.776148 -- the average of the accuracy rates for the five categories

Overall accuracy (OA) = 0.698 -- total numbers of correct data / total numbers of data
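The two accuracy figures can be computed along the following lines. This sketch assumes that the accuracy of a category is the fraction of its ground-truth samples that were labelled correctly, and that labels are integers in the range 0 to num_classes - 1; the report may define the per-category rate differently. The names `Accuracy` and `compute_accuracy` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Accuracy {
    double overall;         // OA: correct samples / all samples
    double mean_per_class;  // mAcc: average of the per-category accuracies
};

// Assumption: per-category accuracy = correct in category / truth count of category.
// Categories with no ground-truth samples are left out of the average.
Accuracy compute_accuracy(const std::vector<int>& truth,
                          const std::vector<int>& predicted,
                          std::size_t num_classes) {
    assert(truth.size() == predicted.size());
    std::vector<std::size_t> total(num_classes, 0);
    std::vector<std::size_t> correct(num_classes, 0);
    std::size_t all_correct = 0;
    for (std::size_t i = 0; i < truth.size(); ++i) {
        assert(truth[i] >= 0 && static_cast<std::size_t>(truth[i]) < num_classes);
        const std::size_t c = static_cast<std::size_t>(truth[i]);
        ++total[c];
        if (predicted[i] == truth[i]) {
            ++correct[c];
            ++all_correct;
        }
    }
    double sum = 0.0;
    std::size_t seen = 0;
    for (std::size_t c = 0; c < num_classes; ++c) {
        if (total[c] > 0) {
            sum += static_cast<double>(correct[c]) / static_cast<double>(total[c]);
            ++seen;
        }
    }
    Accuracy result;
    result.overall = truth.empty()
        ? 0.0
        : static_cast<double>(all_correct) / static_cast<double>(truth.size());
    result.mean_per_class = seen == 0 ? 0.0 : sum / static_cast<double>(seen);
    return result;
}
```

With the numbers reported above, the overall accuracy is 349 / 500 = 0.698.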