The DarkHelp C++ API is a wrapper (not a replacement!) for the libdarknet.so C API.
To use DarkHelp, you must include the project header file within your C++ application:
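Assuming a default installation, where the header is named DarkHelp.hpp, that looks like this:

```cpp
// DarkHelp.hpp also pulls in the OpenCV headers used by the API
#include <DarkHelp.hpp>
```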
Instantiate a DarkHelp::NN object. It can easily be placed on the stack, or created dynamically with new. You'll want this object to persist for a long time, since the constructor loads the neural network into memory, which takes a (relatively) long time.
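For example, a minimal sketch using placeholder filenames for the usual Darknet configuration, weights, and names files:

```cpp
// Keep this object alive for as long as you need the network, since the
// constructor is what loads the neural network into memory.
DarkHelp::NN nn("animals.cfg", "animals_best.weights", "animals.names");
```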
At this point, the neural network has been fully loaded and is ready to use. But just prior to using it, if there are settings you'd like to tweak, see the DarkHelp::Config class. Several examples:
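For instance, a few illustrative tweaks made through the DarkHelp::NN::config member (see the DarkHelp::Config documentation for the full list of fields):

```cpp
nn.config.threshold                 = 0.35f; // minimum detection threshold
nn.config.include_all_names         = false; // only label with the best class name
nn.config.annotation_line_thickness = 1;     // thinner annotation rectangles
```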
The alternative method is to instantiate a DarkHelp::Config object first and configure it as needed. Once it has been set up correctly, use it to instantiate the DarkHelp::NN object. This allows the DarkHelp configuration to be passed around if necessary, or lets you maintain multiple configurations and swap between them as needed.
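A sketch of that alternative, reusing the placeholder filenames from above:

```cpp
DarkHelp::Config cfg("animals.cfg", "animals_best.weights", "animals.names");
cfg.threshold                 = 0.35f;
cfg.annotation_line_thickness = 1;

// the network is only loaded once the NN object is created from the config
DarkHelp::NN nn(cfg);
```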
The only thing left is to loop through every image and call DarkHelp::NN::predict(). If you want DarkHelp to annotate the image with the results, you must also call DarkHelp::NN::annotate():
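For example, assuming the image filenames are in a container of strings:

```cpp
const std::vector<std::string> image_filenames = { "image_0.jpg", "image_1.jpg" };

for (const auto & filename : image_filenames)
{
    // run the neural network against this image
    const auto results = nn.predict(filename);

    // draw the results onto a copy of the image
    cv::Mat annotated_image = nn.annotate();

    // ...do something with the results and the annotated image...
}
```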
Calling any of the DarkHelp::NN::predict() overloads gives back a std::vector of DarkHelp::PredictionResult objects, which should be extremely simple to manage.
While the previous example used image filenames, you can also use cv::Mat objects. Here is an example with cv::Mat images obtained from video frames:
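A sketch using OpenCV's cv::VideoCapture, with a placeholder video filename:

```cpp
cv::VideoCapture cap("traffic.mp4");
while (cap.isOpened())
{
    cv::Mat frame;
    cap >> frame;
    if (frame.empty())
    {
        break;
    }

    // predict() also accepts cv::Mat images directly
    const auto results = nn.predict(frame);
    std::cout << results << std::endl;
}
```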
For example, the std::cout line in the previous example might result in the following text:
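The exact text depends on the network and the version of DarkHelp, but each prediction is summarized on its own line, along these lines (the values here are invented for illustration):

```text
-> prediction results: 2
-> 1/2: "car 97%" #1 prob=0.974 x=245 y=118 w=308 h=164
-> 2/2: "truck 82%" #2 prob=0.823 x=542 y=83 w=231 h=197
```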
But most likely you'll want to handle the result vector yourself instead of dumping lines of text to std::cout. See DarkHelp::PredictionResult and the various members it provides, such as DarkHelp::PredictionResult::rect and DarkHelp::PredictionResult::all_probabilities.
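For example, continuing from the video-frame loop above, the rectangles could be drawn manually instead (a minimal sketch; name and best_probability are two of the other DarkHelp::PredictionResult members):

```cpp
for (const auto & prediction : results)
{
    // prediction.rect is a cv::Rect in the coordinates of the original image
    cv::rectangle(frame, prediction.rect, cv::Scalar(0, 255, 0), 2);

    std::cout << prediction.name << ": " << prediction.best_probability << std::endl;
}
```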
The results from object detection can be passed in to an instance of DarkHelp::PositionTracker. This will attempt to do simple position-based object tracking. The information gathered during object tracking can then be leveraged to draw tails, or uniquely count the objects in a video.
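A sketch of how the tracker fits into a video loop, assuming DarkHelp::PositionTracker::add() is called with each frame's results:

```cpp
DarkHelp::PositionTracker tracker;

cv::VideoCapture cap("traffic.mp4");
while (cap.isOpened())
{
    cv::Mat frame;
    cap >> frame;
    if (frame.empty())
    {
        break;
    }

    // results must be non-const so the tracker can record an ID in each prediction
    auto results = nn.predict(frame);
    tracker.add(results);

    // each prediction now carries an object ID which persists across frames
}
```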