PopSift Library¶
PopSift is an open-source implementation of the SIFT algorithm in CUDA [GCH18]. PopSift tries to stick as closely as possible to David Lowe’s famous paper [Low04], while extracting features from an image in real time on an NVIDIA GTX 980 Ti GPU or better.
Requirements¶
Hardware¶
PopSift is a GPU implementation that requires an NVIDIA GPU card with a CUDA compute capability >= 3.0 (including, e.g., the GT 650M). The code was originally developed with the compute capability 5.2 card GTX 980 Ti in mind.
You can check your NVIDIA GPU card CC support here or on the NVIDIA dev page. If you do not have an NVIDIA card you will still be able to compile and use the CPU version.
Here are the minimum hardware requirements for PopSift:
| Minimum requirements | |
|---|---|
| Operating systems | Windows x64, Linux, macOS |
| CPU | Recent Intel or AMD CPUs |
| RAM Memory | 8 GB |
| Hard Drive | No particular requirements |
| GPU | NVIDIA CUDA-enabled GPU (compute capability >= 3.5) |
Software¶
The core library depends only on CUDA >= 7.0.
The library includes a few sample applications that show how to use the library. They require:
Boost >= 1.55 (required components atomic, chrono, date-time, system, thread)
[optionally] DevIL (libdevil-dev) can be used to load a broader range of image formats, otherwise only pgm is supported.
vcpkg¶
vcpkg is a cross-platform (Windows, Linux and macOS), open-source package manager created by Microsoft.
Starting from v0.9, the PopSift package can be installed on each platform via vcpkg. To install the library:
vcpkg install popsift --triplet <arch>
where <arch> is the architecture to build for (e.g. x64-windows, x64-linux-dynamic, etc.).
If you want to install the demo applications that come with the library you can add the option apps:
vcpkg install popsift[apps] --triplet <arch>
Building the library¶
Building tools¶
Required tools:
CMake >= 3.14 to build the code
Git
C/C++ compiler supporting the C++11 standard (gcc >= 4.6, Visual Studio, or clang)
CUDA >= 7.0
Dependencies¶
vcpkg¶
vcpkg can be used to install all the dependencies on all the supported platforms. This is particularly useful on Windows. To install the dependencies:
vcpkg install cuda devil boost-system boost-program-options boost-thread boost-filesystem
You can add the flag --triplet to specify the architecture and the version you want to build.
For example:
--triplet x64-windows will build the dynamic version for Windows 64 bit
--triplet x64-windows-static will build the static version for Windows 64 bit
--triplet x64-linux-dynamic will build the dynamic version for Linux 64 bit
and so on. More information can be found here
Linux¶
On Linux you can install from the package manager:
For Ubuntu/Debian package system:
sudo apt-get install g++ git-all libboost-all-dev libdevil-dev
For CentOS package system:
sudo yum install gcc-c++ git boost-devel devil
MacOS¶
On macOS, using Homebrew, install the following packages:
brew install git boost devil
Getting the sources¶
git clone https://github.com/alicevision/PopSift.git
CMake configuration¶
From the PopSift root folder you can run cmake:
mkdir build && cd build
cmake ..
make -j `nproc`
On Windows add -G "Visual Studio 16 2019" -A x64 to generate the Visual Studio solution according to your VS version (see CMake documentation).
If you are using the dependencies built with vcpkg, you need to pass -DCMAKE_TOOLCHAIN_FILE=path/to/vcpkg/scripts/buildsystems/vcpkg.cmake at the cmake step to let it know where to find the dependencies.
CMake options¶
CMake configuration can be controlled by changing the values of the following variables (shown here with their default values):
BUILD_SHARED_LIBS:BOOL=ON to enable/disable the building of shared libraries
PopSift_BUILD_EXAMPLES:BOOL=ON to enable/disable the building of applications
PopSift_BUILD_DOC:BOOL=OFF to enable/disable building this documentation and the Doxygen one.
For example, if you do not want to build the applications, you have to pass -DPopSift_BUILD_EXAMPLES:BOOL=OFF and so on.
PopSift as third party¶
When you install PopSift, a file PopSiftConfig.cmake is installed in <install_prefix>/lib/cmake/PopSift/ that allows you to import the library in your CMake project.
In your CMakeLists.txt file you can add the dependency in this way:
# Find the package from the PopSiftConfig.cmake
# in <prefix>/lib/cmake/PopSift/. Under the namespace PopSift::
# it exposes the target PopSift that allows you to compile
# and link with the library
find_package(PopSift CONFIG REQUIRED)
...
# suppose you want to try it out in an executable
add_executable(popsiftTest yourfile.cpp)
# add link to the library
target_link_libraries(popsiftTest PUBLIC PopSift::PopSift)
Then, in order to build, just pass the location of PopSiftConfig.cmake from the cmake command line:
cmake .. -DPopSift_DIR=<install_prefix>/lib/cmake/PopSift/
Docker image¶
A docker image can be built using the Ubuntu-based Dockerfile, which is based on the nvidia/cuda image (https://hub.docker.com/r/nvidia/cuda/).
Building the dependency image¶
We provide a Dockerfile_deps containing a cuda image with all the necessary PopSift dependencies installed.
A parameter CUDA_TAG can be passed when building the image to select the cuda version.
Similarly, OS_TAG can be passed to select the Ubuntu version.
By default, CUDA_TAG=10.2 and OS_TAG=18.04.
For example, to create the dependency image based on Ubuntu 18.04 with CUDA 8.0 for development, use:
docker build --build-arg CUDA_TAG=8.0 --tag alicevision/popsift-deps:cuda8.0-ubuntu18.04 -f Dockerfile_deps .
The complete list of available tags can be found on the nvidia [dockerhub page](https://hub.docker.com/r/nvidia/cuda/)
Building the PopSift image¶
Once you have built the dependency image, you can build the popsift image in the same manner using Dockerfile:
docker build --tag alicevision/popsift:cuda8.0-ubuntu18.04 .
Running the PopSift image¶
In order to run the image, nvidia docker is needed: see the installation instructions. Once it is installed, the image can be run, e.g., in interactive mode with
docker run -it --runtime=nvidia alicevision/popsift:cuda8.0-ubuntu18.04
Official images on DockerHub¶
Check the docker hub PopSift repository for the available images.
Library usage¶
Detection¶
API References¶
Main Classes¶
-
class SiftJob¶
Public Functions
-
SiftJob(int w, int h, const unsigned char *imageData)¶
Constructor for byte images, value range 0..255.
- Parameters
w – [in] the width in pixels of the image
h – [in] the height in pixels of the image
imageData – [in] the image buffer
-
SiftJob(int w, int h, const float *imageData)¶
Constructor for float images, value range [0..1[.
- Parameters
w – [in] the width in pixels of the image
h – [in] the height in pixels of the image
imageData – [in] the image buffer
-
~SiftJob()¶
Destructor releases all the resources.
-
popsift::FeaturesHost *getHost()¶
- Returns
-
void setFeatures(popsift::FeaturesBase *f)¶
fulfill the promise
-
class PopSift¶
Public Types
-
enum ImageMode¶
Image modes.
Values:
-
enumerator ByteImages¶
byte image, value range 0..255
-
enumerator FloatImages¶
float images, value range [0..1[
-
enum AllocTest¶
Results for the allocation test.
Values:
-
enumerator Ok¶
the image dimensions are supported by this device’s CUDA texture engine.
-
enumerator ImageExceedsLinearTextureLimit¶
the input image size exceeds the dimensions of the CUDA Texture used for loading.
-
enumerator ImageExceedsLayeredSurfaceLimit¶
the scaled input image exceeds the dimensions of the CUDA Surface used for the image pyramid.
Public Functions
-
explicit PopSift(ImageMode imode = ByteImages, int device = 0)¶
We support more than one stream, but only one sigma parameter and one level parameter.
-
explicit PopSift(const popsift::Config &config, popsift::Config::ProcessingMode mode = popsift::Config::ExtractingMode, ImageMode imode = ByteImages, int device = 0)¶
- Parameters
config –
mode –
imode –
-
~PopSift()¶
Release all the resources.
-
bool configure(const popsift::Config &config, bool force = false)¶
Provide the configuration if you used the PopSift default constructor.
-
void uninit()¶
Release the resources.
-
AllocTest testTextureFit(int width, int height)¶
Check whether the current CUDA device can support the image resolution (width,height) with the current configuration based on the card’s texture engine. The function does not check if there is sufficient available memory.
The first part of the test depends on the parameters width and height. It checks whether the image size is supported by CUDA 2D linear textures on this card. This is used to load the image into the first level of the first octave. For the second part of the test, two values of the configuration are important:
“downsampling”, because it determines the required texture size after loading. The CUDA 2D layered texture must support the scaled width and height.
“levels”, because it determines the number of levels in each octave. The CUDA 2D layered texture must support enough depth for each level.
- Remark
If you want to call configure() before extracting features, you should call configure() before testTextureFit().
- Remark
The current CUDA device is determined by a call to cudaGetDevice(); card properties are only read once.
- See
- Parameters
width – [in] The width of the input image
height – [in] The height of the input image
- Returns
AllocTest::Ok if the image dimensions are supported by this device’s CUDA texture engine; AllocTest::ImageExceedsLinearTextureLimit if the input image size exceeds the dimensions of the CUDA Texture used for loading, in which case the input image must be scaled; AllocTest::ImageExceedsLayeredSurfaceLimit if the scaled input image exceeds the dimensions of the CUDA Surface used for the image pyramid, in which case the scaling factor must be changed to fit.
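The two-part check described above can be sketched in plain C++ without any CUDA calls. This is only an illustration under stated assumptions, not PopSift's implementation: the `TextureLimits` struct, the `FitResult` enum and the `checkTextureFit` function are hypothetical names, the scale exponent is assumed to be derived from the “downsampling” setting, and the required layer depth is assumed to be levels + 3.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the limits a CUDA device reports.
// The field names are illustrative and not part of the PopSift API.
struct TextureLimits {
    int linearMaxWidth;   // widest supported 2D linear texture
    int linearMaxHeight;  // tallest supported 2D linear texture
    int layeredMaxWidth;  // widest supported 2D layered surface
    int layeredMaxHeight; // tallest supported 2D layered surface
    int layeredMaxDepth;  // maximum number of layers
};

// Mirrors the three outcomes documented for AllocTest.
enum class FitResult {
    Ok,
    ImageExceedsLinearTextureLimit,
    ImageExceedsLayeredSurfaceLimit
};

// scaleExponent: the loaded image is assumed to be scaled by 2^scaleExponent.
// levels: non-augmented levels per octave (levels + 3 layers assumed).
inline FitResult checkTextureFit(const TextureLimits& lim, int width, int height,
                                 float scaleExponent, int levels)
{
    assert(width > 0 && height > 0 && levels > 0);

    // Part 1: the unscaled input must fit a 2D linear texture.
    if (width > lim.linearMaxWidth || height > lim.linearMaxHeight)
        return FitResult::ImageExceedsLinearTextureLimit;

    // Part 2: the scaled image and the layer count must fit the layered surface.
    const float scale = std::pow(2.0f, scaleExponent);
    const int scaledW = static_cast<int>(std::ceil(width * scale));
    const int scaledH = static_cast<int>(std::ceil(height * scale));
    if (scaledW > lim.layeredMaxWidth || scaledH > lim.layeredMaxHeight ||
        levels + 3 > lim.layeredMaxDepth)
        return FitResult::ImageExceedsLayeredSurfaceLimit;

    return FitResult::Ok;
}
```

As with the documented function, this sketch says nothing about whether enough device memory is available.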
-
std::string testTextureFitErrorString(AllocTest err, int w, int h)¶
Create a warning string for an AllocTest error code.
-
SiftJob *enqueue(int w, int h, const unsigned char *imageData)¶
Enqueue a byte image, value range [0,255].
- See
- Parameters
w – [in] the width of the image.
h – [in] the height of the image.
imageData – [in] the image buffer.
- Returns
the associated job
-
SiftJob *enqueue(int w, int h, const float *imageData)¶
Enqueue a float image, value range [0,1].
- See
- Parameters
w – [in] the width of the image.
h – [in] the height of the image.
imageData – [in] the image buffer.
- Returns
the associated job
-
inline void uninit(int)¶
- Deprecated:
-
inline bool init(int, int w, int h)¶
- Deprecated:
-
inline popsift::FeaturesBase *execute(int, const unsigned char *imageData)¶
- Deprecated:
-
struct popsift::Config¶
Struct containing the parameters that control the extraction algorithm.
Public Types
-
enum GaussMode¶
The way the Gaussian mode is computed.
Each setting makes it possible to mimic and reproduce the behaviour of other SIFT implementations.
Values:
-
enumerator VLFeat_Compute¶
-
enumerator VLFeat_Relative¶
-
enumerator VLFeat_Relative_All¶
-
enumerator OpenCV_Compute¶
-
enumerator Fixed9¶
-
enumerator Fixed15¶
-
enum SiftMode¶
General setting to reproduce the results of other SIFT implementations.
Values:
-
enumerator PopSift¶
Popsift implementation.
-
enumerator OpenCV¶
OpenCV implementation.
-
enumerator VLFeat¶
VLFeat implementation.
-
enum ScalingMode¶
The scaling mode.
Values:
-
enumerator ScaleDirect¶
-
enumerator ScaleDefault¶
Indirect - only working method.
-
enum DescMode¶
Modes for descriptor extraction.
Values:
-
enumerator Loop¶
scan horizontal, extract valid points
-
enumerator ILoop¶
scan horizontal, extract valid points, interpolate with tex engine
-
enumerator Grid¶
scan in rotated mode, round pixel address
-
enumerator IGrid¶
scan in rotated mode, interpolate with tex engine
-
enumerator NoTile¶
variant of IGrid, no duplicate gradient fetching
-
enum NormMode¶
Type of norm to use for matching.
Values:
-
enumerator RootSift¶
The L1-inspired norm, gives better matching results (“RootSift”)
-
enumerator Classic¶
The L2-inspired norm, all descriptors on a hypersphere (“classic”)
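For readers unfamiliar with the two norms, the following host-side sketch shows the commonly used definitions: classic normalization scales a descriptor to unit L2 length, while RootSift scales it to unit L1 length and then takes the element-wise square root. The function names are hypothetical, and PopSift's CUDA kernels may differ in details such as clamping or a final scaling step.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// "Classic": scale the descriptor so that its L2 norm is 1,
// which places every descriptor on a unit hypersphere.
inline std::vector<float> normalizeClassic(std::vector<float> d)
{
    float sumSq = 0.0f;
    for (float v : d) sumSq += v * v;
    if (sumSq > 0.0f) {
        const float inv = 1.0f / std::sqrt(sumSq);
        for (float& v : d) v *= inv;
    }
    return d;
}

// "RootSift": scale the descriptor so that its L1 norm is 1,
// then replace every element by its square root.
// SIFT descriptor entries are gradient histogram bins, so they are non-negative.
inline std::vector<float> normalizeRootSift(std::vector<float> d)
{
    float sumAbs = 0.0f;
    for (float v : d) {
        assert(v >= 0.0f);
        sumAbs += v;
    }
    if (sumAbs > 0.0f) {
        for (float& v : d) v = std::sqrt(v / sumAbs);
    }
    return d;
}
```

A RootSift descriptor built this way also has unit L2 length, because the squares of its elements sum to the L1-normalized total.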
-
enum GridFilterMode¶
Filtering strategy.
To reduce time used in descriptor extraction, some extrema can be filtered immediately after finding them. It is possible to keep those with the largest scale (LargestScaleFirst), smallest scale (SmallestScaleFirst), or a random selection. Note that largest and smallest give a stable result, random does not.
Values:
-
enumerator RandomScale¶
keep a random selection
-
enumerator LargestScaleFirst¶
keep those with the largest scale
-
enumerator SmallestScaleFirst¶
keep those with the smallest scale
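The two deterministic strategies can be illustrated on the host with a sort followed by a truncation. This is a minimal sketch under assumptions, not the library's GPU code: the `Extremum` struct, the `FilterOrder` enum and the `keepByScale` function are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical extremum record; only the scale matters for filtering.
struct Extremum {
    float x;
    float y;
    float scale;
};

enum class FilterOrder { LargestScaleFirst, SmallestScaleFirst };

// Keep at most maxKeep extrema, preferring large or small scales.
// A stable sort keeps the result reproducible when scales are equal.
inline std::vector<Extremum> keepByScale(std::vector<Extremum> pts,
                                         std::size_t maxKeep,
                                         FilterOrder order)
{
    auto cmp = [order](const Extremum& a, const Extremum& b) {
        return order == FilterOrder::LargestScaleFirst ? a.scale > b.scale
                                                       : a.scale < b.scale;
    };
    std::stable_sort(pts.begin(), pts.end(), cmp);
    if (pts.size() > maxKeep) pts.resize(maxKeep);
    assert(pts.size() <= maxKeep); // postcondition
    return pts;
}
```

A random selection would replace the sort with a shuffle, which is why it does not give a stable result across runs.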
Public Functions
-
void setGaussMode(const std::string &m)¶
Set the Gaussian mode from string.
- See
- Parameters
m – [in] The string version of the GaussMode
-
void setGaussMode(GaussMode m)¶
Set the Gaussian mode.
- Parameters
m – [in] The Gaussian mode to use.
-
void setVerbose(bool on = true)¶
Enable/disable verbose mode.
- Parameters
on – [in] Whether to display additional information.
-
void setDescMode(const std::string &byname)¶
Set the descriptor mode by string.
- See
- Parameters
byname – [in] The string containing the descriptor mode.
-
void setDescMode(DescMode mode = Loop)¶
Set the descriptor mode.
- See
- Parameters
mode – [in] The descriptor mode.
-
float getPeakThreshold() const¶
computes the actual peak threshold depending on the threshold parameter and the non-augmented number of levels
-
bool ifPrintGaussTables() const¶
print Gauss spans and tables?
-
SiftMode getSiftMode() const¶
Get the SIFT mode for more detailed sub-modes.
- See
- Returns
The SiftMode
-
DEPRECATED(void setUseRootSift(bool on))¶
Set the normalization mode.
- Deprecated:
- See
- Parameters
on – [in] Use RootSift (true) or the L2-norm (false).
-
int getNormalizationMultiplier() const¶
Functions related to descriptor normalization: multiply with a power of 2.
-
inline float getUpscaleFactor() const¶
The input image is stretched by 2^upscale_factor before processing. The factor 1 is default.
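As a small worked example of the 2^upscale_factor relation, the helper below computes the stretched size of one image dimension. The function name `scaledDimension` is hypothetical, and rounding up to a whole pixel is an assumption rather than documented behaviour.

```cpp
#include <cassert>
#include <cmath>

// Size of one image dimension after stretching by 2^upscaleFactor.
// Rounding up to the next whole pixel is an assumption of this sketch.
inline int scaledDimension(int pixels, float upscaleFactor)
{
    assert(pixels > 0);
    return static_cast<int>(std::ceil(pixels * std::pow(2.0f, upscaleFactor)));
}
```

With the default factor of 1, a 640-pixel-wide image is processed at 1280 pixels; a factor of 0 leaves the size unchanged, and a factor of -1 halves it.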
-
bool getCanFilterExtrema() const¶
Have we enabled filtering? This is a compile-time decision. The reason is that we use Thrust, which increases compile time considerably and can be deactivated at the CMake level when you work on something else.
-
inline int getFilterMaxExtrema() const¶
Get the approximate number of extrema whose orientation and descriptor should be computed. Default is -1, which sets the hard limit defined by “number of octaves * getMaxExtrema()”.
-
inline int getFilterGridSize() const¶
Get the grid size for filtering.
To prevent grid filtering from concentrating on a tiny part of an image, the image is split into getFilterGridSize() X getFilterGridSize() tiles and we allow getFilterMaxExtrema() / getFilterGridSize() extrema in each tile.
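The tiling can be pictured with the following sketch, which maps a pixel to its tile and computes the per-tile budget exactly as the sentence above states it. The names `TileIndex`, `tileOf` and `extremaPerTile` are hypothetical, uniform tile sizes are an assumption, and the special default value -1 of getFilterMaxExtrema() is not handled here.

```cpp
#include <cassert>

// Column and row of a tile inside the gridSize x gridSize partition.
struct TileIndex {
    int col;
    int row;
};

// Map pixel (x, y) of a width x height image to its tile,
// assuming tiles of uniform size.
inline TileIndex tileOf(int x, int y, int width, int height, int gridSize)
{
    assert(gridSize > 0 && width > 0 && height > 0);
    assert(x >= 0 && x < width && y >= 0 && y < height);
    // 64-bit intermediate products avoid overflow for very large images.
    return { static_cast<int>(static_cast<long long>(x) * gridSize / width),
             static_cast<int>(static_cast<long long>(y) * gridSize / height) };
}

// Per-tile budget as stated in the text: maximum extrema divided by grid size.
inline int extremaPerTile(int filterMaxExtrema, int filterGridSize)
{
    assert(filterGridSize > 0 && filterMaxExtrema >= 0);
    return filterMaxExtrema / filterGridSize;
}
```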
-
inline GridFilterMode getFilterSorting() const¶
Get the filtering mode.
- Returns
the filtering mode.
-
inline ScalingMode getScalingMode() const¶
Get the scaling mode.
- See
- Returns
the scaling mode.
Public Members
-
int octaves¶
The number of octaves is chosen freely. If not specified, it is: log_2( min(x,y) ) - 3 - start_sampling
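The default formula can be evaluated with integer arithmetic as in the sketch below. The function names are hypothetical, x and y are taken to be the image width and height, and rounding the logarithm down to an integer is an assumption, since the formula above does not say how the result is rounded.

```cpp
#include <algorithm>
#include <cassert>

// Integer floor of log2(v) for v > 0.
inline int floorLog2(int v)
{
    assert(v > 0);
    int r = 0;
    while (v > 1) {
        v >>= 1;
        ++r;
    }
    return r;
}

// log_2(min(x, y)) - 3 - start_sampling, with the logarithm rounded down.
inline int defaultOctaves(int width, int height, int startSampling)
{
    return floorLog2(std::min(width, height)) - 3 - startSampling;
}
```

For a 1920 x 1080 image the smaller side is 1080, whose base-2 logarithm rounds down to 10, so a start_sampling of 0 gives 7 octaves and a start_sampling of -1 gives 8.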
-
int levels¶
The number of levels per octave. This is actually the number of inner DoG levels where we can search for feature points. The number of …
This is the non-augmented number of levels, meaning that this is not the number of Gauss-filtered picture layers (which is levels+3), but the number of DoG layers in which we can search for extrema.
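The relation between the three layer counts can be written down directly. In this sketch the struct and function names are hypothetical; the count of levels + 2 DoG layers is not stated above but follows from taking differences of adjacent Gauss-filtered layers, as in Lowe's paper [Low04].

```cpp
#include <cassert>

// Layer counts for one octave, derived from the non-augmented level count.
struct OctaveLayout {
    int gaussLayers;         // Gauss-filtered picture layers: levels + 3
    int dogLayers;           // differences of adjacent Gauss layers: levels + 2
    int searchableDogLayers; // inner DoG layers searched for extrema: levels
};

inline OctaveLayout layoutFor(int levels)
{
    assert(levels > 0);
    return { levels + 3, levels + 2, levels };
}
```

For example, levels = 3 gives 6 Gauss-filtered layers and 5 DoG layers per octave, of which the 3 inner ones have a neighbour on both sides and can be searched for extrema.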
-
float _edge_limit¶
default edge_limit 16.0f from Celebrandil; default edge_limit 10.0f from Bemap
Functions¶
Utility Classes¶
About¶
License¶
PopSift is licensed under the MPLv2 license.
More info about the license and what you can do with the code can be found at the tldrlegal website.
SIFT was patented in the United States from 1999-03-08 to 2020-03-28. See the patent link for more information. PopSift license only concerns the PopSift source code and does not release users of this code from any requirements that may arise from patents.
Contact us¶
You can contact us on the public mailing list at alicevision@googlegroups.com
You can also contact us privately at alicevision-team@googlegroups.com
Cite us¶
If you want to cite this work in your publication, please use the following
@inproceedings{Griwodz2018Popsift,
author = {Griwodz, Carsten and Calvet, Lilian and Halvorsen, P{\aa}l},
title = {Popsift: A Faithful SIFT Implementation for Real-time Applications},
booktitle = {Proceedings of the 9th {ACM} Multimedia Systems Conference},
series = {MMSys '18},
year = {2018},
isbn = {978-1-4503-5192-8},
location = {Amsterdam, Netherlands},
pages = {415--420},
numpages = {6},
doi = {10.1145/3204949.3208136},
acmid = {3208136},
publisher = {ACM},
address = {New York, NY, USA},
}
Acknowledgements¶
This has been developed in the context of the European project POPART, funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 644874.
Bibliography¶
- GCH18
Carsten Griwodz, Lilian Calvet, and Pål Halvorsen. Popsift: a faithful sift implementation for real-time applications. In Proceedings of the 9th ACM Multimedia Systems Conference, MMSys '18, 415–420. New York, NY, USA, 2018. ACM. doi:10.1145/3204949.3208136.
- Low04
DG Lowe. Distinctive image features from scale-invariant keypoints. International journal of computer vision, pages 1–29, 2004. doi:10.1023/B:VISI.0000029664.99615.94.