PopSift Library¶
PopSift is an open-source implementation of the SIFT algorithm in CUDA [GCH18]. PopSift tries to stick as closely as possible to David Lowe’s famous paper [Low04], while extracting features from an image in real time at least on an NVIDIA GTX 980 Ti GPU.
Requirements¶
Hardware¶
PopSift is a GPU implementation that requires an NVIDIA GPU card with a CUDA compute capability >= 3.0 (including, e.g., the GT 650M). The code was originally developed with the compute capability 5.2 card GTX 980 Ti in mind.
You can check the compute capability (CC) of your NVIDIA GPU card here or on the NVIDIA dev page. If you do not have an NVIDIA card, you will still be able to compile and use the CPU version.
Here are the minimum hardware requirements for PopSift:
Minimum requirements | |
---|---|
Operating systems | Windows x64, Linux, macOS |
CPU | Recent Intel or AMD CPUs |
RAM | 8 GB |
Hard drive | No particular requirements |
GPU | NVIDIA CUDA-enabled GPU (compute capability >= 3.5) |
Software¶
The core library depends only on CUDA >= 7.0.
The library includes a few sample applications that show how to use the library. They require:
Boost >= 1.55 (required components atomic, chrono, date-time, system, thread)
[optionally] DevIL (libdevil-dev) can be used to load a broader range of image formats, otherwise only pgm is supported.
vcpkg¶
vcpkg is a cross-platform (Windows, Linux and macOS), open-source package manager created by Microsoft.
We are planning to release a port of the library so that it can be easily built using the package manager on all supported platforms. Stay tuned!
Building the library¶
Building tools¶
Required tools:
CMake >= 3.14 to build the code
Git
C/C++ compiler supporting the C++11 standard (gcc >= 4.6, Visual Studio or clang)
CUDA >= 7.0
Dependencies¶
vcpkg¶
vcpkg can be used to install all the dependencies on all the supported platforms. This is particularly useful on Windows. To install the dependencies:
vcpkg install cuda devil boost-system boost-program-options boost-thread boost-filesystem
You can add the flag --triplet to specify the architecture and the version you want to build.
For example:
--triplet x64-windows will build the dynamic version for Windows 64 bit
--triplet x64-windows-static will build the static version for Windows 64 bit
--triplet x64-linux-dynamic will build the dynamic version for Linux 64 bit
and so on. More information can be found here
Linux¶
On Linux you can install from the package manager:
For Ubuntu/Debian package system:
sudo apt-get install g++ git-all libboost-all-dev libdevil-dev
For CentOS package system:
sudo yum install gcc-c++ git boost-devel devil
Getting the sources¶
git clone https://github.com/alicevision/PopSift.git
CMake configuration¶
From PopSift root folder you can run cmake:
mkdir build && cd build
cmake ..
make -j `nproc`
On Windows, add -G "Visual Studio 16 2019" -A x64 to generate the Visual Studio solution according to your VS version (see CMake documentation).
If you are using the dependencies built with vcpkg, you need to pass -DCMAKE_TOOLCHAIN_FILE=path/to/vcpkg/scripts/buildsystems/vcpkg.cmake at the cmake step to let it know where to find the dependencies.
CMake options¶
CMake configuration can be controlled by changing the values of the following variables (shown here with their default values):
BUILD_SHARED_LIBS:BOOL=ON to enable/disable building shared libraries
PopSift_BUILD_EXAMPLES:BOOL=ON to enable/disable building the applications
PopSift_BUILD_DOC:BOOL=OFF to enable/disable building this documentation and the Doxygen one.
For example, if you do not want to build the applications, you have to pass -DPopSift_BUILD_EXAMPLES:BOOL=OFF and so on.
PopSift as third party¶
When you install PopSift, a file PopSiftConfig.cmake is installed in <install_prefix>/lib/cmake/PopSift/ that allows you to import the library in your CMake project.
In your CMakeLists.txt file you can add the dependency in this way:
# Find the package from the PopSiftConfig.cmake
# in <prefix>/lib/cmake/PopSift/. Under the namespace PopSift::
# it exposes the target PopSift that allows you to compile
# and link with the library
find_package(PopSift CONFIG REQUIRED)
...
# suppose you want to try it out in an executable
add_executable(popsiftTest yourfile.cpp)
# add link to the library
target_link_libraries(popsiftTest PUBLIC PopSift::PopSift)
Then, to build, pass the location of PopSiftConfig.cmake on the cmake command line:
cmake .. -DPopSift_DIR=<install_prefix>/lib/cmake/PopSift/
Docker image¶
A Docker image can be built using the Ubuntu-based Dockerfile, which is based on the nvidia/cuda image (https://hub.docker.com/r/nvidia/cuda/).
Building the dependency image¶
We provide a Dockerfile_deps containing a CUDA image with all the necessary PopSift dependencies installed.
A parameter CUDA_TAG can be passed when building the image to select the CUDA version. Similarly, OS_TAG can be passed to select the Ubuntu version.
By default, CUDA_TAG=10.2 and OS_TAG=18.04.
For example, to create the dependency image based on Ubuntu 18.04 with CUDA 8.0 for development, use
docker build --build-arg CUDA_TAG=8.0 --tag alicevision/popsift-deps:cuda8.0-ubuntu18.04 -f Dockerfile_deps .
The complete list of available tags can be found on the nvidia [dockerhub page](https://hub.docker.com/r/nvidia/cuda/)
Building the PopSift image¶
Once you have built the dependency image, you can build the PopSift image in the same manner using Dockerfile:
docker build --tag alicevision/popsift:cuda8.0-ubuntu18.04 .
Running the PopSift image¶
To run the image, nvidia docker is needed: see the installation instructions. Once it is installed, the container can be run, e.g., in interactive mode with
docker run -it --runtime=nvidia alicevision/popsift:cuda8.0-ubuntu18.04
Official images on DockerHub¶
Check the docker hub PopSift repository for the available images.
API References¶
Main Classes¶
class SiftJob¶
Public Functions
SiftJob(int w, int h, const unsigned char *imageData)¶
Constructor for byte images, value range 0..255.
Parameters
[in] w: the width in pixels of the image
[in] h: the height in pixels of the image
[in] imageData: the image buffer
SiftJob(int w, int h, const float *imageData)¶
Constructor for float images, value range [0..1[.
Parameters
[in] w: the width in pixels of the image
[in] h: the height in pixels of the image
[in] imageData: the image buffer
~SiftJob()¶
Destructor releases all the resources.
popsift::FeaturesHost *getHost()¶
void setFeatures(popsift::FeaturesBase *f)¶
fulfill the promise
class PopSift¶
Public Types
enum ImageMode¶
Image modes.
Values:
enumerator ByteImages¶
byte image, value range 0..255
enumerator FloatImages¶
float images, value range [0..1[
enum AllocTest¶
Results for the allocation test.
Values:
enumerator Ok¶
the image dimensions are supported by this device’s CUDA texture engine.
enumerator ImageExceedsLinearTextureLimit¶
the input image size exceeds the dimensions of the CUDA Texture used for loading.
enumerator ImageExceedsLayeredSurfaceLimit¶
the scaled input image exceeds the dimensions of the CUDA Surface used for the image pyramid.
Public Functions
PopSift(ImageMode imode = ByteImages)¶
We support more than one stream, but we support only one sigma and one level parameter.
PopSift(const popsift::Config &config, popsift::Config::ProcessingMode mode = popsift::Config::ExtractingMode, ImageMode imode = ByteImages)¶
Parameters
config:
mode:
imode:
~PopSift()¶
Release all the resources.
bool configure(const popsift::Config &config, bool force = false)¶
Provide the configuration if you used the PopSift default constructor.
void uninit()¶
Release the resources.
AllocTest testTextureFit(int width, int height)¶
Check whether the current CUDA device can support the image resolution (width,height) with the current configuration based on the card’s texture engine. The function does not check if there is sufficient available memory.
The first part of the test depends on the parameters width and height. It checks whether the image size is supported by CUDA 2D linear textures on this card. This is used to load the image into the first level of the first octave.
For the second part of the test, two values of the configuration are important:
“downsampling”, because it determines the required texture size after loading. The CUDA 2D layered texture must support the scaled width and height.
“levels”, because it determines the number of levels in each octave. The CUDA 2D layered texture must support enough depth for each level.
Return
AllocTest::Ok if the image dimensions are supported by this device’s CUDA texture engine.
AllocTest::ImageExceedsLinearTextureLimit if the input image size exceeds the dimensions of the CUDA Texture used for loading. The input image must be scaled.
AllocTest::ImageExceedsLayeredSurfaceLimit if the scaled input image exceeds the dimensions of the CUDA Surface used for the image pyramid. The scaling factor must be changed to fit in.
Remark
If you want to call configure() before extracting features, you should call configure() before testTextureFit().
Remark
The current CUDA device is determined by a call to cudaGetDevice(); card properties are only read once.
Parameters
[in] width: The width of the input image
[in] height: The height of the input image
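The two-part test described for testTextureFit() can be sketched as a small standalone function. This is only an illustration and does not call PopSift or CUDA: the TextureLimits struct, the function name textureFitSketch, and the local AllocTest enum are hypothetical, and two points are assumptions rather than documented behaviour: that the scaled size equals the input size times 2^(-downsampling), and that each octave needs levels + 3 layers.

```cpp
#include <cassert>
#include <cmath>

// Local stand-in that mirrors the documented AllocTest values (illustration only).
enum class AllocTest { Ok, ImageExceedsLinearTextureLimit, ImageExceedsLayeredSurfaceLimit };

// Hypothetical container for device limits; a real program would read
// these from the CUDA device properties.
struct TextureLimits {
    int linear2DWidth;
    int linear2DHeight;
    int layered2DWidth;
    int layered2DHeight;
    int layered2DLayers;
};

// Two-part check following the description of testTextureFit().
// Assumptions: scaled size = input size * 2^(-downsampling),
// and the required depth per octave is levels + 3.
AllocTest textureFitSketch(int width, int height, float downsampling, int levels,
                           const TextureLimits& lim)
{
    assert(width > 0 && height > 0 && levels > 0);

    // Part 1: the input image must fit into a 2D linear texture.
    if (width > lim.linear2DWidth || height > lim.linear2DHeight)
        return AllocTest::ImageExceedsLinearTextureLimit;

    // Part 2: the scaled image and the per-octave depth must fit
    // into the 2D layered texture.
    const float scale = std::pow(2.0f, -downsampling);
    const int scaledW = static_cast<int>(std::ceil(width * scale));
    const int scaledH = static_cast<int>(std::ceil(height * scale));
    const int depth = levels + 3;
    if (scaledW > lim.layered2DWidth || scaledH > lim.layered2DHeight ||
        depth > lim.layered2DLayers)
        return AllocTest::ImageExceedsLayeredSurfaceLimit;

    return AllocTest::Ok;
}
```

Like the documented function, this sketch only compares dimensions against limits; it says nothing about whether enough device memory is available.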
std::string testTextureFitErrorString(AllocTest err, int w, int h)¶
Create a warning string for an AllocTest error code.
SiftJob *enqueue(int w, int h, const unsigned char *imageData)¶
Enqueue a byte image, value range [0,255].
Return
the associated job
Parameters
[in] w: the width of the image.
[in] h: the height of the image.
[in] imageData: the image buffer.
SiftJob *enqueue(int w, int h, const float *imageData)¶
Enqueue a float image, value range [0,1].
Return
the associated job
Parameters
[in] w: the width of the image.
[in] h: the height of the image.
[in] imageData: the image buffer.
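If the source data is 8-bit but the float overload is to be used, the buffer has to be converted first. The following is a minimal sketch of such a conversion; the helper name toFloatImage is hypothetical, and dividing by 256 is an assumption chosen so that every value stays below 1, which is compatible with both the [0..1[ and [0,1] ranges given in this reference. It does not call PopSift itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convert an 8-bit grayscale buffer of w * h pixels to floats.
// Assumption: dividing by 256 keeps every value in [0, 1).
std::vector<float> toFloatImage(const unsigned char* bytes, int w, int h)
{
    assert(bytes != nullptr && w > 0 && h > 0);
    const std::size_t n = static_cast<std::size_t>(w) * static_cast<std::size_t>(h);
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<float>(bytes[i]) / 256.0f;
    return out;
}
```

The resulting vector's data() pointer has the const float * type expected by the float overload of enqueue(); whether the library prefers a different scaling is not stated here.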
void uninit(int)¶
bool init(int, int w, int h)¶
popsift::FeaturesBase *execute(int, const unsigned char *imageData)¶
struct popsift::Config¶
Struct containing the parameters that control the extraction algorithm.
Public Types
enum GaussMode¶
The way the Gaussian mode is computed.
Each setting makes it possible to mimic and reproduce the behaviour of other SIFT implementations.
Values:
enumerator VLFeat_Compute¶
enumerator VLFeat_Relative¶
enumerator VLFeat_Relative_All¶
enumerator OpenCV_Compute¶
enumerator Fixed9¶
enumerator Fixed15¶
enum SiftMode¶
General setting to reproduce the results of other SIFT implementations.
Values:
enumerator PopSift¶
PopSift implementation.
enumerator OpenCV¶
OpenCV implementation.
enumerator VLFeat¶
VLFeat implementation.
enum ScalingMode¶
The scaling mode.
Values:
enumerator ScaleDirect¶
enumerator ScaleDefault¶
Indirect - only working method.
enum DescMode¶
Modes for descriptor extraction.
Values:
enumerator Loop¶
scan horizontal, extract valid points
enumerator ILoop¶
scan horizontal, extract valid points, interpolate with tex engine
enumerator Grid¶
scan in rotated mode, round pixel address
enumerator IGrid¶
scan in rotated mode, interpolate with tex engine
enumerator NoTile¶
variant of IGrid, no duplicate gradient fetching
enum NormMode¶
Type of norm to use for matching.
Values:
enumerator RootSift¶
The L1-inspired norm, gives better matching results (“RootSift”)
enumerator Classic¶
The L2-inspired norm, all descriptors on a hypersphere (“classic”)
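The two normalization modes can be illustrated with their commonly used formulations: RootSIFT as L1 normalization followed by an element-wise square root, and the classic mode as plain L2 normalization. The sketch below is self-contained and does not call PopSift; the local NormMode enum, the function name normalizeDescriptor, and the multiplier argument (a power-of-two scaling of the kind mentioned for getNormalizationMultiplier()) are assumptions for illustration, not the library's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Local stand-in for the documented NormMode values (illustration only).
enum class NormMode { RootSift, Classic };

// Normalize a descriptor in place, then scale it by 2^multiplier.
// Assumption: RootSift = L1-normalize + element-wise sqrt, Classic = L2-normalize.
void normalizeDescriptor(std::vector<float>& d, NormMode mode, int multiplier = 0)
{
    assert(!d.empty() && multiplier >= 0);

    if (mode == NormMode::RootSift) {
        float l1 = 0.0f;
        for (float v : d) l1 += std::fabs(v);
        if (l1 > 0.0f)
            for (float& v : d) v = std::sqrt(std::fabs(v) / l1);
    } else {
        float sq = 0.0f;
        for (float v : d) sq += v * v;
        const float l2 = std::sqrt(sq);
        if (l2 > 0.0f)
            for (float& v : d) v /= l2;
    }

    // Optional power-of-two scaling, e.g. before quantizing to integers.
    const float scale = std::ldexp(1.0f, multiplier);
    for (float& v : d) v *= scale;
}
```

With multiplier = 0 both branches yield a vector of unit L2 length, since the squared RootSIFT entries sum to one.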
enum GridFilterMode¶
Filtering strategy.
To reduce time used in descriptor extraction, some extrema can be filtered immediately after finding them. It is possible to keep those with the largest scale (LargestScaleFirst), smallest scale (SmallestScaleFirst), or a random selection. Note that largest and smallest give a stable result, random does not.
Values:
enumerator RandomScale¶
keep a random selection
enumerator LargestScaleFirst¶
keep those with the largest scale
enumerator SmallestScaleFirst¶
keep those with the smallest scale
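The deterministic filtering strategies amount to sorting the extrema by scale and keeping the first few. The following standalone sketch shows that idea on the CPU; the Extremum struct, the keepByScale function, and the local two-value enum are hypothetical and only mirror the documented names. The random strategy is left out because, as noted above, it does not give a stable result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for the two deterministic filter modes (illustration only).
enum class GridFilterMode { LargestScaleFirst, SmallestScaleFirst };

// Hypothetical extremum record: position and scale.
struct Extremum {
    float x;
    float y;
    float scale;
};

// Keep at most maxCount extrema, preferring large or small scales.
// stable_sort keeps the input order among extrema with equal scale.
std::vector<Extremum> keepByScale(std::vector<Extremum> pts, std::size_t maxCount,
                                  GridFilterMode mode)
{
    assert(maxCount > 0);
    std::stable_sort(pts.begin(), pts.end(),
                     [mode](const Extremum& a, const Extremum& b) {
                         return mode == GridFilterMode::LargestScaleFirst
                                    ? a.scale > b.scale
                                    : a.scale < b.scale;
                     });
    if (pts.size() > maxCount)
        pts.resize(maxCount);
    return pts;
}
```

Because the selection depends only on the scales and the input order, repeated runs on the same extrema return the same subset.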
Public Functions
void setGaussMode(const std::string &m)¶
Set the Gaussian mode from string.
Parameters
[in] m: The string version of the GaussMode
void setVerbose(bool on = true)¶
Enable/disable verbose mode.
Parameters
[in] on: Whether to display additional information.
void setDescMode(const std::string &byname)¶
Set the descriptor mode by string.
Parameters
[in] byname: The string containing the descriptor mode.
void setDescMode(DescMode mode = Loop)¶
Set the descriptor mode.
Parameters
[in] mode: The descriptor mode.
float getPeakThreshold() const¶
computes the actual peak threshold depending on the threshold parameter and the non-augmented number of levels
bool ifPrintGaussTables() const¶
print Gauss spans and tables?
DEPRECATED(void setUseRootSift(bool on))¶
Set the normalization mode.
Parameters
[in] on: Use RootSift (true) or the L2-norm (false).
int getNormalizationMultiplier() const¶
Functions related to descriptor normalization: multiply with a power of 2.
float getUpscaleFactor() const¶
The input image is stretched by 2^upscale_factor before processing. The factor 1 is default.
bool getCanFilterExtrema() const¶
Have we enabled filtering? This is a compile-time decision. The reason is that we use Thrust, which increases compile time considerably and can be deactivated at the CMake level when you work on something else.
int getFilterMaxExtrema() const¶
Get the approximate number of extrema whose orientation and descriptor should be computed. Default is -1, which sets the hard limit defined by “number of octaves * getMaxExtrema()”.
int getFilterGridSize() const¶
Get the grid size for filtering.
To prevent grid filtering from happening only in a tiny piece of an image, the image is split into getFilterGridSize() X getFilterGridSize() tiles and we allow getFilterMaxExtrema() / getFilterGridSize() extrema in each tile.
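The tiling can be pictured with a small standalone sketch that maps a pixel to its tile and computes the per-tile budget exactly as stated above (maximum extrema divided by grid size). The function names tileIndex and extremaPerTile are hypothetical, and the row-by-row tile numbering and the even split of the image are assumptions; the code does not call PopSift.

```cpp
#include <algorithm>
#include <cassert>

// Index of the tile containing pixel (x, y) when a w x h image is split
// into gridSize x gridSize tiles. Assumption: tiles are numbered row by row.
int tileIndex(int x, int y, int w, int h, int gridSize)
{
    assert(gridSize > 0 && w > 0 && h > 0);
    assert(x >= 0 && x < w && y >= 0 && y < h);
    const int tx = std::min(x * gridSize / w, gridSize - 1);
    const int ty = std::min(y * gridSize / h, gridSize - 1);
    return ty * gridSize + tx;
}

// Per-tile budget as stated in the text: max extrema divided by grid size.
int extremaPerTile(int filterMaxExtrema, int gridSize)
{
    assert(gridSize > 0 && filterMaxExtrema >= 0);
    return filterMaxExtrema / gridSize;
}
```

For a 640 x 480 image and a grid size of 2, the four corners fall into tiles 0, 1, 2 and 3, and a limit of 1000 extrema gives a budget of 500 per tile under this reading.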
GridFilterMode getFilterSorting() const¶
Get the filtering mode.
Return
the filtering mode.
ScalingMode getScalingMode() const¶
Get the scaling mode.
Return
the scaling mode.
Public Members
int octaves¶
The number of octaves is chosen freely. If not specified, it is: log_2( min(x,y) ) - 3 - start_sampling
int levels¶
The number of levels per octave. This is actually the number of inner DoG levels where we can search for feature points. The number of …
This is the non-augmented number of levels, meaning that this is not the number of Gauss-filtered picture layers (which is levels+3), but the number of DoG layers in which we can search for extrema.
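The default octave count and the relation between levels and Gauss-filtered layers can be worked through in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, start_sampling is treated as an integer, the logarithm is floored, and the result is clamped to at least one octave; none of these details is confirmed by the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Default octave count from: log_2( min(x,y) ) - 3 - start_sampling.
// Assumptions: the logarithm is floored and the result is clamped to >= 1.
int defaultOctaves(int width, int height, int startSampling)
{
    assert(width > 0 && height > 0);
    const double smaller = static_cast<double>(std::min(width, height));
    const int log2min = static_cast<int>(std::floor(std::log2(smaller)));
    return std::max(1, log2min - 3 - startSampling);
}

// Number of Gauss-filtered layers per octave for a given number of
// DoG levels, as stated in the description of `levels`.
int gaussLayersPerOctave(int levels)
{
    assert(levels > 0);
    return levels + 3;
}
```

For a 640 x 480 image with start_sampling = 0 this gives floor(log2(480)) - 3 = 8 - 3 = 5 octaves, and 3 levels correspond to 6 Gauss-filtered layers per octave.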
float _edge_limit¶
default edge_limit 16.0f from Celebrandil
default edge_limit 10.0f from Bemap
Functions¶
Utility Classes¶
About¶
License¶
PopSift is licensed under MPLv2 license.
More info about the license and what you can do with the code can be found at tldrlegal website
SIFT was patented in the United States from 1999-03-08 to 2020-03-28. See the patent link for more information. PopSift license only concerns the PopSift source code and does not release users of this code from any requirements that may arise from patents.
Contact us¶
You can contact us on the public mailing list at alicevision@googlegroups.com
You can also contact us privately at alicevision-team@googlegroups.com
Cite us¶
If you want to cite this work in your publication, please use the following
@inproceedings{Griwodz2018Popsift,
author = {Griwodz, Carsten and Calvet, Lilian and Halvorsen, P{\aa}l},
title = {Popsift: A Faithful SIFT Implementation for Real-time Applications},
booktitle = {Proceedings of the 9th {ACM} Multimedia Systems Conference},
series = {MMSys '18},
year = {2018},
isbn = {978-1-4503-5192-8},
location = {Amsterdam, Netherlands},
pages = {415--420},
numpages = {6},
doi = {10.1145/3204949.3208136},
acmid = {3208136},
publisher = {ACM},
address = {New York, NY, USA},
}
Acknowledgements¶
This has been developed in the context of the European project POPART, funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 644874.
Bibliography¶
- GCH18
Carsten Griwodz, Lilian Calvet, and Pål Halvorsen. Popsift: a faithful sift implementation for real-time applications. In Proceedings of the 9th ACM Multimedia Systems Conference, MMSys ‘18, 415–420. New York, NY, USA, 2018. ACM. doi:10.1145/3204949.3208136.
- Low04
DG Lowe. Distinctive image features from scale-invariant keypoints. International journal of computer vision, pages 1–29, 2004. doi:10.1023/B:VISI.0000029664.99615.94.