Tuesday, January 30, 2018

[C#] OpenCvSharp DNN with YOLO2

yolo
In the comments on my post "OpenCV DNN speed compare in Python, C#, C++", Blaise Thunderbytes asked me to implement pjreddie's YOLO with OpenCvSharp, so that is why this post exists :P

Since OpenCV 3.3.1, the DNN module can parse YOLO models, so we can easily use a pre-trained YOLO model now. The OpenCV docs have a YOLO object detection tutorial written in C++; if you use C++, check that out. I will use C# with OpenCvSharp.


Better Speed

yolo4
Compared with my previous test, YOLOv2 544x544 was almost 2x faster than SSD 512x512 (1000ms vs 1900ms using CPU and OpenCvSharp), and took about a third less time than SSD with Python (1000ms vs 1500ms).
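The full program in this post times a single net.Forward() call with a Stopwatch. If you want steadier numbers, here is a minimal sketch of a helper that times several runs and returns the median. The class name Timing and the method MedianMilliseconds are hypothetical names I made up for illustration; they are not part of OpenCvSharp.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of OpenCvSharp): times an action several
// times and returns the median elapsed time in milliseconds.
public static class Timing
{
    public static double MedianMilliseconds(Action action, int runs)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        var samples = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            samples[i] = sw.Elapsed.TotalMilliseconds;
        }

        // Median: middle sample, or the mean of the two middle samples.
        Array.Sort(samples);
        int mid = runs / 2;
        return runs % 2 == 1
            ? samples[mid]
            : (samples[mid - 1] + samples[mid]) / 2.0;
    }
}
```

With the code from this post it could be called as Timing.MedianMilliseconds(() => { using (var p = net.Forward()) { } }, 10), so that each output Mat is disposed after the run.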


Let's look at the code. Because we use the DNN module to load the Darknet model, the code template is similar to before.

            var cfg = "yolo-voc.cfg";
            var model = "yolo-voc.weights"; //YOLOv2 544x544
            var threshold = 0.3;
We use the YOLOv2 VOC 544x544 model.



            var blob = CvDnn.BlobFromImage(org, 1 / 255.0, new Size(544, 544), new Scalar(), true, false);
            var net = CvDnn.ReadNetFromDarknet(cfg, model);
            net.SetInput(blob, "data");
Set up the blob. Remember that the parameter values are important: changing them changes the result.
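To show why those values matter, here is a small sketch of what I understand the scale factor (1 / 255.0) and the swapRB flag (true) do to a single BGR pixel when the mean is zero, as in the call above. BlobMath and Normalize are hypothetical names for illustration only; the real work happens inside CvDnn.BlobFromImage, which also resizes the image to 544x544.

```csharp
using System;

// Hypothetical illustration (not an OpenCvSharp API): mimics the per-pixel
// effect of scaling and the optional blue/red swap, assuming a zero mean.
public static class BlobMath
{
    // Takes one pixel in OpenCV's default BGR order and returns the three
    // scaled channel values in the order the network would receive them.
    public static float[] Normalize(byte b, byte g, byte r, double scale, bool swapRB)
    {
        byte first = swapRB ? r : b; // swapRB puts red first (RGB order)
        byte last = swapRB ? b : r;
        return new[]
        {
            (float)(first * scale),
            (float)(g * scale),
            (float)(last * scale)
        };
    }
}
```

For example, BlobMath.Normalize(255, 0, 51, 1 / 255.0, true) gives roughly { 0.2, 0, 1 }: red comes first and every channel lands in the 0~1 range. If the scale or the channel order is wrong, the network sees very different input values.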



const int prefix = 5;   //skip 0~4

for (int i = 0; i < prob.Rows; i++)
{
 var confidence = prob.At<float>(i, 4);
 if (confidence > threshold)
 {
  //get classes probability
  Cv2.MinMaxLoc(prob.Row[i].ColRange(prefix, prob.Cols), out _, out Point max);
  var classes = max.X;
  var probability = prob.At<float>(i, classes + prefix);

  if (probability > threshold) //more accuracy
  {
   //get center and width/height
   var centerX = prob.At<float>(i, 0) * w;
   var centerY = prob.At<float>(i, 1) * h;
   var width = prob.At<float>(i, 2) * w;
   var height = prob.At<float>(i, 3) * h;
   //label formatting
   var label = $"{Labels[classes]} {probability * 100:0.00}%";
   Console.WriteLine($"confidence {confidence * 100:0.00}% {label}");
   var x1 = (centerX - width / 2) < 0 ? 0 : centerX - width / 2; //avoid left side over edge
   //draw result
   org.Rectangle(new Point(x1, centerY - height / 2), new Point(centerX + width / 2, centerY + height / 2), Colors[classes], 2);
   var textSize = Cv2.GetTextSize(label, HersheyFonts.HersheyTriplex, 0.5, 1, out var baseline);
   Cv2.Rectangle(org, new Rect(new Point(x1, centerY - height / 2 - textSize.Height - baseline),
     new Size(textSize.Width, textSize.Height + baseline)), Colors[classes], Cv2.FILLED);
   Cv2.PutText(org, label, new Point(x1, centerY - height / 2-baseline), HersheyFonts.HersheyTriplex, 0.5, Scalar.Black);
  }
 }
}
YOLO's output format looks like this:
0,1 : Center x, y (relative to the image size)
2,3 : Width, Height (relative to the image size)
4 : Confidence
rest : Individual class probabilities
In this case, VOC has 20 classes, so columns 5~24 are the class probabilities.
After taking some time to figure this out, the remaining part just draws the result like before.
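The layout above can be sketched without OpenCvSharp at all, using one output row as a plain float array. YoloRow, BestClass and ToPixelBox are hypothetical names for illustration. Unlike the code in this post, which only clamps the left edge, this sketch clamps all four edges; that is my own choice.

```csharp
using System;

// Hypothetical helper (not part of OpenCvSharp): decodes one row of the
// YOLOv2 output laid out as [cx, cy, w, h, confidence, class scores...].
public static class YoloRow
{
    public const int Prefix = 5; // columns 0~4 hold the box and confidence

    // Returns the 0-based class id with the largest class score.
    public static int BestClass(float[] row)
    {
        int best = 0;
        for (int c = 1; c < row.Length - Prefix; c++)
        {
            if (row[Prefix + c] > row[Prefix + best]) best = c;
        }
        return best;
    }

    // Converts the relative center/size into pixel corners clamped to the image.
    public static (int X1, int Y1, int X2, int Y2) ToPixelBox(
        float[] row, int imageWidth, int imageHeight)
    {
        float cx = row[0] * imageWidth;
        float cy = row[1] * imageHeight;
        float bw = row[2] * imageWidth;
        float bh = row[3] * imageHeight;

        int x1 = Math.Max(0, (int)(cx - bw / 2));
        int y1 = Math.Max(0, (int)(cy - bh / 2));
        int x2 = Math.Min(imageWidth - 1, (int)(cx + bw / 2));
        int y2 = Math.Min(imageHeight - 1, (int)(cy + bh / 2));
        return (x1, y1, x2, y2);
    }
}
```

For a 1000x500 image and a row with center (0.5, 0.5) and size (0.25, 0.5), ToPixelBox returns the corners (375, 125) and (625, 375). With the VOC label list used in this post, a best class of 11 means "dog".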

BTW, I added an extra check, if (probability > threshold), to make the result look better. Without it, the result looks like this.
yolo1


And here is the final result.
yolo2

And some other pictures.
yolo5

yolo3


Here is the full code; you can also get it from GitHub.

using System;
using System.Diagnostics;
using System.Linq;
using OpenCvSharp;
using OpenCvSharp.Dnn;

namespace OpenCvDnnYolo
{
    class Program
    {
        private static readonly string[] Labels = { "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor" };
        private static readonly Scalar[] Colors = Enumerable.Repeat(false, 20).Select(x => Scalar.RandomColor()).ToArray();
        static void Main()
        {
            var file = "bali.jpg";
            // https://pjreddie.com/darknet/yolo/
            var cfg = "yolo-voc.cfg";
            var model = "yolo-voc.weights"; //YOLOv2 544x544
            var threshold = 0.3;

            var org = Cv2.ImRead(file);
            var w = org.Width;
            var h = org.Height;
            //setting blob, parameter are important
            var blob = CvDnn.BlobFromImage(org, 1 / 255.0, new Size(544, 544), new Scalar(), true, false);
            var net = CvDnn.ReadNetFromDarknet(cfg, model);
            net.SetInput(blob, "data");

            Stopwatch sw = new Stopwatch();
            sw.Start();
            //forward model
            var prob = net.Forward();
            sw.Stop();
            Console.WriteLine($"Runtime:{sw.ElapsedMilliseconds} ms");

            /* YOLO2 VOC output
             0 1 : center                    2 3 : w/h
             4 : confidence                  5 ~24 : class probability */
            const int prefix = 5;   //skip 0~4

            for (int i = 0; i < prob.Rows; i++)
            {
                var confidence = prob.At<float>(i, 4);
                if (confidence > threshold)
                {
                    //get classes probability
                    Cv2.MinMaxLoc(prob.Row[i].ColRange(prefix, prob.Cols), out _, out Point max);
                    var classes = max.X;
                    var probability = prob.At<float>(i, classes + prefix);

                    if (probability > threshold) //more accuracy
                    {
                        //get center and width/height
                        var centerX = prob.At<float>(i, 0) * w;
                        var centerY = prob.At<float>(i, 1) * h;
                        var width = prob.At<float>(i, 2) * w;
                        var height = prob.At<float>(i, 3) * h;
                        //label formatting
                        var label = $"{Labels[classes]} {probability * 100:0.00}%";
                        Console.WriteLine($"confidence {confidence * 100:0.00}% {label}");
                        var x1 = (centerX - width / 2) < 0 ? 0 : centerX - width / 2; //avoid left side over edge
                        //draw result
                        org.Rectangle(new Point(x1, centerY - height / 2), new Point(centerX + width / 2, centerY + height / 2), Colors[classes], 2);
                        var textSize = Cv2.GetTextSize(label, HersheyFonts.HersheyTriplex, 0.5, 1, out var baseline);
                        Cv2.Rectangle(org, new Rect(new Point(x1, centerY - height / 2 - textSize.Height - baseline),
                                new Size(textSize.Width, textSize.Height + baseline)), Colors[classes], Cv2.FILLED);
                        Cv2.PutText(org, label, new Point(x1, centerY - height / 2-baseline), HersheyFonts.HersheyTriplex, 0.5, Scalar.Black);
                    }
                }
            }
            using (new Window("died.tw", org))
            {
                Cv2.WaitKey();
            }
        }
    }
}

Hope you enjoy it.
