diff --git a/CMakeLists.txt b/CMakeLists.txt
index 47dceaf85..480fc6f32 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -91,11 +91,11 @@
 SET(CMAKE_RELEASE_POSTFIX "")
 SET(CMAKE_DEBUG_POSTFIX "-debug")

 SET(THEIA_VERSION_MAJOR 0)
-SET(THEIA_VERSION_MINOR 3)
+SET(THEIA_VERSION_MINOR 4)
 SET(THEIA_VERSION_PATCH 0)
 SET(THEIA_VERSION ${THEIA_VERSION_MAJOR}.${THEIA_VERSION_MINOR}.${THEIA_VERSION_PATCH})
-SET(THEIA_ABI_VERSION 0.3.0)
+SET(THEIA_ABI_VERSION 0.4.0)

 # THEIA data directory
 ADD_DEFINITIONS(-DTHEIA_DATA_DIR="${CMAKE_SOURCE_DIR}/data")
diff --git a/docs/source/bibliography.rst b/docs/source/bibliography.rst
index 3dee7493f..e6a0428dd 100644
--- a/docs/source/bibliography.rst
+++ b/docs/source/bibliography.rst
@@ -99,10 +99,17 @@ Bibliography
    Estimation via Residual Consensus**. *International Conference on Computer
    Vision (ICCV)*, 2011.

+.. [PhotoTourism] N. Snavely, S. Seitz, and R. Szeliski. **Photo tourism:
+   exploring photo collections in 3D**. *ACM Transactions on Graphics (TOG)*,
+   2006.
+
 .. [Stewenius5pt] H. Stewénius, C. Engels, D. Nistér. **Recent developments
    on direct relative orientation**. *ISPRS Journal of Photogrammetry and
    Remote Sensing*, 2006.

+.. [SweeneyCVPR2015] C. Sweeney, L. Kneip, T. Hollerer, M. Turk. **Computing
+   Similarity Transformations from Only Image Correspondences**. *Computer
+   Vision and Pattern Recognition (CVPR)*, 2015.
+
 .. [SweeneyGDLS] C. Sweeney, V. Fragoso, T. Hollerer, M. Turk. **gDLS: A
    Scalable Solution to the Generalized Pose and Scale Problem**. *European
    Conference on Computer Vision (ECCV)*, 2014.
@@ -119,5 +126,8 @@ Bibliography
    between two point patterns**. *IEEE Transactions on Pattern Analysis and
    Machine Intelligence*, 1991.

+.. [VisualSfM] C. Wu. **Towards a Linear Time Incremental Structure from
+   Motion**. *International Conference on 3D Vision (3DV)*, 2013.
+
 .. [WilsonECCV2014] K. Wilson and N. Snavely. **Robust Global Translation
    with 1DSfM**. *European Conference on Computer Vision (ECCV)*, 2014.
diff --git a/docs/source/global_linear_position_estimation.png b/docs/source/global_linear_position_estimation.png
new file mode 100644
index 000000000..869725292
Binary files /dev/null and b/docs/source/global_linear_position_estimation.png differ
diff --git a/docs/source/global_sfm.png b/docs/source/global_sfm.png
new file mode 100644
index 000000000..9695d05d0
Binary files /dev/null and b/docs/source/global_sfm.png differ
diff --git a/docs/source/incremental_sfm.png b/docs/source/incremental_sfm.png
new file mode 100644
index 000000000..7a47534cc
Binary files /dev/null and b/docs/source/incremental_sfm.png differ
diff --git a/docs/source/pisa.png b/docs/source/pisa.png
new file mode 100644
index 000000000..daa0c441c
Binary files /dev/null and b/docs/source/pisa.png differ
diff --git a/docs/source/releases.rst b/docs/source/releases.rst
index f277afa39..6225137cd 100644
--- a/docs/source/releases.rst
+++ b/docs/source/releases.rst
@@ -9,12 +9,34 @@ HEAD

 New Features
 ------------
-* Better rendering for point clouds.

 Bug Fixes
 ---------
-* Some Visual Studio bugs and incompatabilities (thanks to Pierre Moulon and Brojeshwar Bhowmick)
+
+`0.4.0 `_
+=======================================================================
+
+New Features
+------------
+* Incremental SfM pipeline.
+* New website: `www.theia-sfm.org <http://www.theia-sfm.org>`_.
+* Linear method for camera pose registration [JiangICCV]_.
+* Better rendering for point clouds.
+* Significantly better CMake scripts for Windows (thanks to bvanevery for
+  testing).
+* Mutable priority queue class.
+* Bundle adjustment method for cameras only (points held constant).
+* Calibrated and uncalibrated absolute pose estimators.
+* Two-view bundle adjustment will now optimize camera intrinsics if they are
+  not known.
+* New small- and large-scale benchmarking results on the Theia website.
+
+Bug Fixes
+---------
+* Some Visual Studio bugs and incompatibilities (thanks to Pierre Moulon and
+  Brojeshwar Bhowmick).
+* Sample Consensus estimators were incorrectly counting the number of samples
+  needed (found by inspirit).
+* Proper normalization of the 1dSfM axis of projection.
+* OpenGL viewer properly sets zero-values of matrices upon initialization.
+* Relative translation optimization (with known rotation) is dramatically
+  improved (thanks to Onur Ozyesil).
+* Translations solver uses SPARSE_NORMAL_CHOLESKY when no 3D points are used.

 `0.3.0 `_
 =======================================================================
diff --git a/docs/source/sfm.rst b/docs/source/sfm.rst
index 1afa19772..c978d3fb5 100644
--- a/docs/source/sfm.rst
+++ b/docs/source/sfm.rst
@@ -12,15 +12,17 @@ Theia has a full Structure-from-Motion pipeline that is extremely efficient. Our
 overall pipeline consists of several steps. First, we extract features (SIFT is
 the default). Then, we perform two-view matching and geometric verification to
 obtain relative poses between image pairs and create a :class:`ViewGraph`. Next,
-we perform global pose estimation with global SfM. Global SfM is different from
-incremental SfM in that it considers the entire view graph at the same time
-instead of incrementally adding more and more images to the
-:class:`Reconstruction`. Global SfM methods have been proven to be very fast
-with comparable or better accuracy to incremental SfM approaches (See
-[JiangICCV]_, [MoulonICCV]_, [WilsonECCV2014]_), and they are much more readily
-parallelized. After we have obtained camera poses, we perform triangulation and
-:class:`BundleAdjustment` to obtain a valid 3D reconstruction consisting of
-cameras and 3D points.
+we perform either incremental or global SfM. Incremental SfM is the standard
+approach that adds one image at a time to grow the reconstruction. While this
+method is robust, it is not scalable because it requires repeated runs of
+expensive bundle adjustment. Global SfM is different from incremental SfM in
+that it considers the entire view graph at the same time instead of
+incrementally adding more and more images to the :class:`Reconstruction`. Global
+SfM methods have been shown to be very fast, with accuracy comparable or
+superior to incremental SfM approaches (see [JiangICCV]_, [MoulonICCV]_,
+[WilsonECCV2014]_), and they are much more readily parallelized. After we have
+obtained camera poses, we perform triangulation and :class:`BundleAdjustment` to
+obtain a valid 3D reconstruction consisting of cameras and 3D points.

 The first step towards creating a reconstruction is to determine images which
 view the same objects. To do this, we must create a :class:`ViewGraph`.
@@ -29,8 +31,6 @@ view the same objects. To do this, we must create a :class:`ViewGraph`.
 #. Match features to obtain image correspondences.
 #. Estimate camera poses from two-view matches and geometries.

-.. TODO:: Insert figure.
-
 #1. and #2. have been covered in other sections, so we will focus on creating a
 reconstruction from two-view matches and geometry. First, we will describe the
 fundamental elements of our reconstruction.
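+
+As a rough sketch of how these stages fit together in code (the ``Estimate``
+call shape shown here is an assumption; see :class:`ReconstructionEstimator`
+below for the documented interface):
+
+.. code-block:: c++
+
+  #include <theia/theia.h>
+
+  // Assumes feature extraction and matching (steps 1 and 2) have already
+  // produced a view graph and a reconstruction seeded with views and tracks.
+  theia::ReconstructionEstimatorOptions options;
+  theia::ReconstructionEstimator estimator(options);
+
+  // Estimate camera poses, triangulate 3D points, and bundle adjust to
+  // produce the final cameras and 3D points.
+  estimator.Estimate(&view_graph, &reconstruction);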
@@ -40,7 +40,7 @@ Reconstruction

 .. class:: Reconstruction

-.. TODO:: Insert figure.
+.. image:: pisa.png

 At the core of our SfM pipeline is an SfM :class:`Reconstruction`. A
 :class:`Reconstruction` is the representation of a 3D reconstruction consisting
@@ -117,8 +117,6 @@ ViewGraph

 .. class:: ViewGraph

-.. TODO:: INSERT FIGURE HERE
-
 A :class:`ViewGraph` is a basic SfM construct that is created from two-view
 matching information. Any pair of views that have a view correlation form an
 edge in the :class:`ViewGraph` such that the nodes in the graph are
@@ -296,9 +294,83 @@ In addition to typical getter/setter methods for the camera parameters, the
 according to the camera orientation in 3D space. The returned vector is not
 unit length.

+Incremental SfM Pipeline
+========================
+
+.. image:: incremental_sfm.png
+
+The incremental SfM pipeline closely follows the pipelines of `Bundler
+`_ [PhotoTourism]_ and `VisualSfM `_ [VisualSfM]_. The method begins by
+estimating the 3D structure and camera poses of two cameras from their
+relative pose. Additional cameras are then added sequentially, and new 3D
+structure is estimated as new parts of the scene are observed. Bundle
+adjustment is performed repeatedly as cameras are added to ensure high-quality
+reconstructions and to avoid drift.
+
+The incremental SfM pipeline is as follows:
+
+ #. Choose an initial camera pair to reconstruct.
+ #. Estimate the 3D structure of the scene.
+ #. Perform bundle adjustment on the two-view reconstruction.
+ #. Localize a new camera to the current 3D points. Choose the camera that
+    observes the most 3D points currently in the scene.
+ #. Estimate new 3D structure.
+ #. Perform bundle adjustment if the model has grown by more than 5% since
+    the last bundle adjustment.
+ #. Repeat steps 4-6 until all cameras have been added.
+
+Incremental SfM is generally considered to be more robust than global SfM
+methods; however, it requires many more rounds of bundle adjustment (which is
+very costly), so incremental SfM is not as efficient or scalable. The options
+below control this process.
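+
+For illustration, configuring the incremental pipeline might look like the
+sketch below. The ``reconstruction_estimator_type`` field and enum names are
+assumptions; the remaining options are documented here:
+
+.. code-block:: c++
+
+  theia::ReconstructionEstimatorOptions options;
+  // Select the incremental pipeline (assumed enum name).
+  options.reconstruction_estimator_type =
+      theia::ReconstructionEstimatorType::INCREMENTAL;
+  // Localize every view that observes at least 80% as many 3D points as the
+  // best unlocalized candidate view.
+  options.multiple_view_localization_ratio = 0.8;
+  // A 2D-3D match is a localization inlier below 8 pixels of reprojection
+  // error; require 30 inliers to accept an absolute pose.
+  options.absolute_pose_reprojection_error_threshold = 8.0;
+  options.min_num_absolute_pose_inliers = 30;
+  // Run full bundle adjustment after 5% model growth, and partial bundle
+  // adjustment over the 20 most recently added views otherwise.
+  options.full_bundle_adjustment_growth_percent = 5.0;
+  options.partial_bundle_adjustment_num_views = 20;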
+
+.. member:: double ReconstructionEstimatorOptions::multiple_view_localization_ratio
+
+  DEFAULT: ``0.8``
+
+  If M is the maximum number of 3D points observed by any view, we want to
+  localize all views that observe > M * multiple_view_localization_ratio 3D
+  points. This allows multiple well-conditioned views to be added to the
+  reconstruction before bundle adjustment is needed.
+
+.. member:: double ReconstructionEstimatorOptions::absolute_pose_reprojection_error_threshold
+
+  DEFAULT: ``8.0``
+
+  When adding a new view to the current reconstruction, this is the
+  reprojection error that determines whether a 2D-3D correspondence is an
+  inlier during localization.
+
+.. member:: int ReconstructionEstimatorOptions::min_num_absolute_pose_inliers
+
+  DEFAULT: ``30``
+
+  Minimum number of inliers for absolute pose estimation to be considered
+  successful.
+
+.. member:: double ReconstructionEstimatorOptions::full_bundle_adjustment_growth_percent
+
+  DEFAULT: ``5.0``
+
+  Bundle adjustment of the entire reconstruction is triggered when the
+  reconstruction has grown by more than this percent. That is, if we last ran
+  bundle adjustment when there were K views in the reconstruction and there
+  are now N views, then G = (N - K) / K is the percent that the model has
+  grown. We run bundle adjustment only if G is greater than this variable.
+  This variable is expressed as a percentage, e.g., 5.0 = 5%.
+
+.. member:: int ReconstructionEstimatorOptions::partial_bundle_adjustment_num_views
+
+  DEFAULT: ``20``
+
+  During incremental SfM we run "partial" bundle adjustment on the most
+  recent views that have been added to the 3D reconstruction. This parameter
+  controls how many views should be part of the partial bundle adjustment.
+
 Global SfM Pipeline
 ===================

+.. image:: global_sfm.png
+
 The global SfM pipelines in Theia follow a general procedure of filtering
 outliers and estimating camera poses or structure. Removing outliers can help
 increase performance dramatically for global SfM, though robust estimation
@@ -325,8 +397,8 @@ follows:
 .. class:: ReconstructionEstimator

   This is the base class from which all SfM reconstruction pipelines derive.
-  The reconstruction estimation type can be specified at runtime, though
-  currently only ``NONLINEAR`` is implemented.
+  The reconstruction estimation type can be specified at runtime (currently
+  ``NONLINEAR`` and ``INCREMENTAL`` are implemented).

 .. function:: ReconstructionEstimator::ReconstructionEstimator(const ReconstructionEstimatorOptions& options)
@@ -583,15 +655,17 @@ Estimating Global Positions
 ===========================

 Positions of cameras may be estimated simultaneously after the rotations are
-known. We use a nonlinear optimization to estimate camera positions based. Given
-pairwise relative translations from :class:`TwoViewInfo` and the estimated
-rotation, the constraint
+known. We use either a linear or a nonlinear optimization to estimate camera
+positions.
+
+Given pairwise relative translations from :class:`TwoViewInfo`
+and the estimated rotation, the constraint

 .. math:: R_i * (c_j - c_i) = \alpha_{i,j} * t_{i,j}

-Where :math:`\alpha_{i,j} = ||c_j - c_i||^2`. This ensures that we optimize for
-positions that agree with the relative positions computed in two-view
-estimation.
+is used to determine the global camera positions, where :math:`\alpha_{i,j} =
+||c_j - c_i||^2`. This ensures that we optimize for positions that agree with
+the relative positions computed in two-view estimation.

 .. class:: NonlinearPositionEstimatorOptions
@@ -653,6 +727,40 @@ estimation.
   using the nonlinear algorithm described above. Only positions that have an
   orientation set are estimated. Returns true upon success and false on
   failure.
+
+.. class:: LinearPositionEstimator
+
+.. image:: global_linear_position_estimation.png
+   :width: 40%
+   :align: center
+
+For the linear position estimator of [JiangICCV]_, we use an approximate
+geometric error to determine the camera positions within a triplet, as shown
+above. The cost function we minimize is:
+
+.. math:: f(i, j, k) = c_k - \dfrac{1}{2}\left((c_i + ||c_k - c_i|| c_{ik}) + (c_j + ||c_k - c_j|| c_{jk})\right)
+
+This can be formed as a linear constraint in the unknown camera positions
+:math:`c_i`. The solution that minimizes this cost lies in the null space of
+the resulting linear system. Instead of extracting the entire null space as
+[JiangICCV]_ does, we hold one camera constant at the origin and use the
+inverse-iteration power method to efficiently determine the null vector that
+best solves our minimization. This results in a dramatic speedup without
+sacrificing accuracy.
+
+.. NOTE:: Currently this position estimation method is not integrated into
+   the Theia global SfM pipeline. More testing needs to be done with this
+   method before it can be reliably integrated.
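+
+To make the inverse-iteration power method concrete, here is a simplified,
+self-contained sketch (not Theia's implementation). It assumes the system
+matrix has already been made non-singular by fixing one camera at the origin:
+
+.. code-block:: c++
+
+  #include <Eigen/Core>
+  #include <Eigen/Sparse>
+
+  // Inverse power iteration: repeatedly solving (A^T A) * x_next = x and
+  // normalizing converges to the eigenvector of A^T A with the smallest
+  // eigenvalue, i.e. the vector closest to spanning the null space of A.
+  Eigen::VectorXd ComputeSmallestEigenvector(
+      const Eigen::SparseMatrix<double>& AtA,
+      const int max_power_iterations,
+      const double eigensolver_threshold) {
+    // Factorize once so that each iteration is a cheap back-substitution.
+    Eigen::SimplicialLDLT<Eigen::SparseMatrix<double> > solver(AtA);
+    Eigen::VectorXd x = Eigen::VectorXd::Random(AtA.cols()).normalized();
+    for (int i = 0; i < max_power_iterations; ++i) {
+      Eigen::VectorXd x_next = solver.solve(x);
+      x_next.normalize();
+      const bool converged =
+          (x_next - x).squaredNorm() < eigensolver_threshold;
+      x = x_next;
+      if (converged) break;
+    }
+    return x;
+  }
+
+The ``max_power_iterations`` and ``eigensolver_threshold`` parameters mirror
+the options documented below.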
+
+.. member:: int LinearPositionEstimator::Options::num_threads
+
+  DEFAULT: ``1``
+
+  The number of threads to use to solve for camera positions.
+
+.. member:: int LinearPositionEstimator::Options::max_power_iterations
+
+  DEFAULT: ``1000``
+
+  Maximum number of power iterations to perform while solving for camera
+  positions.
+
+.. member:: double LinearPositionEstimator::Options::eigensolver_threshold
+
+  DEFAULT: ``1e-8``
+
+  This number determines the convergence of the power iteration method. The
+  lower the threshold, the longer it will take to converge.
+
 Triangulation
 =============
@@ -853,7 +961,7 @@ the reprojection error.

 Similarity Transformation
 =========================

-  .. function:: void AlignPointCloudsICP(const int num_points, const double left[], const double right[], double rotation[3 * 3], double translation[3])
+  .. function:: void AlignPointCloudsICP(const int num_points, const double left[], const double right[], double rotation[], double translation[])

   We implement ICP for point clouds. We use Besl-McKay registration to align
   point clouds. We use SVD decomposition to find the rotation, as this is much
@@ -863,7 +971,7 @@ Similarity Transformation
   the left and right reconstructions have the same number of points, and that
   the points are aligned by correspondence (i.e. left[i] corresponds to right[i]).

-  .. function:: void AlignPointCloudsUmeyama(const int num_points, const double left[], const double right[], double rotation[3 * 3], double translation[3], double* scale)
+  .. function:: void AlignPointCloudsUmeyama(const int num_points, const double left[], const double right[], double rotation[], double translation[], double* scale)

   This function estimates the 3D similarity transformation using the least
   squares method of [Umeyama]_. The returned rotation, translation, and scale
@@ -903,3 +1011,17 @@ Similarity Transformation
   ``solution_translation``: the translation of the candidate solutions

   ``solution_scale``: the scale of the candidate solutions
+
+  .. function:: void SimTransformPartialRotation(const Eigen::Vector3d& rotation_axis, const Eigen::Vector3d image_one_ray_directions[5], const Eigen::Vector3d image_one_ray_origins[5], const Eigen::Vector3d image_two_ray_directions[5], const Eigen::Vector3d image_two_ray_origins[5], std::vector<Eigen::Quaterniond>* soln_rotations, std::vector<Eigen::Vector3d>* soln_translations, std::vector<double>* soln_scales)
+
+  Solves for the similarity transformation that transforms rays in image two
+  so that they intersect with the corresponding rays in image one, such that:
+
+  .. math:: s * R * X' + t = X
+
+  where s, R, t are the scale, rotation, and translation returned, X' is a
+  point in coordinate system 2, and X is the point transformed back to
+  coordinate system 1. Up to 8 solutions will be returned.
+
+  Please cite the paper "Computing Similarity Transformations from Only Image
+  Correspondences" by C. Sweeney et al. (CVPR 2015) [SweeneyCVPR2015]_ when
+  using this algorithm.
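+
+  As a usage sketch for the point cloud alignment functions above (the
+  ``theia/theia.h`` header and the left-to-right transform direction are
+  assumptions; consult the headers for the exact conventions):
+
+  .. code-block:: c++
+
+    #include <theia/theia.h>
+
+    // Three non-collinear points, stored as packed x, y, z triplets.
+    const int num_points = 3;
+    const double left[3 * 3] = {0, 0, 0,  1, 0, 0,  0, 1, 0};
+    // The same points scaled by 2 and translated by (1, 1, 1).
+    const double right[3 * 3] = {1, 1, 1,  3, 1, 1,  1, 3, 1};
+
+    double rotation[3 * 3];
+    double translation[3];
+    double scale;
+    theia::AlignPointCloudsUmeyama(num_points, left, right,
+                                   rotation, translation, &scale);
+    // Expect scale ~= 2, rotation ~= identity, translation ~= (1, 1, 1).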