The main objective of this tool is to perform semantic segmentation on satellite images, assigning each pixel to one of five classes: greenery, soil, water, building, and utility.
Satellite image segmentation has a wide range of real-world applications, such as monitoring deforestation and urbanization, traffic analysis, identification of natural resources, and urban planning. Image segmentation involves colour-coding each pixel of the image as one of the training classes. Typical training classes include vegetation, land, buildings, roads, cars, and water bodies.
Though traditional convolutional architectures like U-Net have shown decent accuracy in satellite image segmentation, they still have drawbacks: they confuse classes whose features are barely distinguishable, and they fail to predict precise boundaries. To address these drawbacks, we performed satellite image segmentation using the FPN + PointRend model from the Detectron2 library, which significantly mitigated the above-mentioned problems and showed a 15% increase in accuracy over the U-Net model on the validation dataset used.
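As a rough orientation, setting up a PointRend semantic-segmentation model in Detectron2 looks like the sketch below. The config file path, checkpoint filename, and exact class-count keys are assumptions based on the PointRend project folder in the Detectron2 repository; check your own checkout for the precise paths.

```python
# Minimal configuration sketch for PointRend semantic segmentation in
# Detectron2. Paths and weight files below are assumptions, not the
# exact setup used in this project.
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.projects.point_rend import add_pointrend_config

cfg = get_cfg()
add_pointrend_config(cfg)  # register PointRend-specific config keys
cfg.merge_from_file(
    "projects/PointRend/configs/SemanticSegmentation/"
    "pointrend_semantic_R_101_FPN_1x_cityscapes.yaml"  # assumed config path
)
cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = 5  # greenery, soil, water, building, utility
cfg.MODEL.POINT_HEAD.NUM_CLASSES = 5
cfg.MODEL.WEIGHTS = "model_final.pth"   # path to trained weights (placeholder)
predictor = DefaultPredictor(cfg)
```

The predictor can then be called on a BGR image array to obtain per-pixel class predictions.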
The basic idea of the PointRend model is to treat segmentation as a computer-graphics rendering problem. Just as rendering refines high-variance pixels through subdivision and adaptive sampling, PointRend takes the most uncertain pixels in the semantic segmentation output, upsamples them, and makes point-wise predictions, which results in more refined masks. The PointRend model performs two main tasks to generate its final predictions:
Point selection – how uncertain points are chosen during training and inference
Point-wise prediction – how predictions are made for the selected uncertain points
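The two steps above can be sketched in plain NumPy. This is a toy illustration, not the Detectron2 implementation: the uncertainty measure (the margin between the two highest class logits), the feature shapes, and the random linear "point head" are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 5, 8, 8                      # 5 classes: greenery, soil, water, building, utility
coarse_logits = rng.normal(size=(C, H, W))

# --- Step 1: point selection ------------------------------------------
# A pixel is uncertain when the margin between its two highest class
# logits is small (a common proxy; an assumption here).
sorted_logits = np.sort(coarse_logits, axis=0)
margin = sorted_logits[-1] - sorted_logits[-2]   # (H, W), small = uncertain
N = 16                                           # number of points to refine
flat_idx = np.argsort(margin, axis=None)[:N]     # N most uncertain pixels
ys, xs = np.unravel_index(flat_idx, (H, W))

# --- Step 2: point-wise prediction ------------------------------------
# Sample fine-grained features at the selected points and run a small
# point head on them (here a random linear layer, purely illustrative).
F = 32                                           # fine feature channels
fine_features = rng.normal(size=(F, H, W))
point_feats = fine_features[:, ys, xs].T         # (N, F) per-point features
point_head = rng.normal(size=(F, C)) * 0.1
point_logits = point_feats @ point_head          # (N, C) refined logits

# Replace the coarse predictions only at the uncertain points.
refined = coarse_logits.argmax(axis=0)           # (H, W) coarse label map
refined[ys, xs] = point_logits.argmax(axis=1)
```

In the real model this selection-and-refinement loop runs over progressively upsampled outputs, so only the uncertain fraction of pixels ever pays the cost of the point head.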