Generation of 3D-city models and their utilisation in image sequences

Three-dimensional city models are attracting growing interest in city and regional planning. They are used for visualisation, e.g. to demonstrate the influence of a planned building on the surrounding townscape. Furthermore, there is a great demand for such models in civil and military mission planning, in disaster management and as a basis for simulations. Detailed 3D descriptions can also support the interpretation of scenes in image sequences taken by different airborne sensors or from different views. If the sensor geometry is known, together with the position and orientation of the platform from Global Positioning System (GPS) and Inertial Navigation System (INS) measurements, the visible structures of the city model can be calculated and projected into the image sequence. This can be used for GIS-aided interpretation, offline or, in future applications, even on the fly. Possible GIS applications include the automated overlay of selected buildings, or querying detailed building information from a database by interactively pointing at a frame of a sequence. In practice, the projected building contours do not coincide exactly with their image locations. Images taken with a large focal length in particular suffer from severe misalignment, even for very small errors in the navigation data. To overcome this problem, an automated approach for matching the image and model descriptions is required. In this paper we describe the construction of the city model and its use in supporting the analysis of image sequences. Our 3D city model consists of building models generated from maps and laser elevation data. Building descriptions are derived from large-scale digital maps and used to mask the elevation data. Depending on the task, prismatic or polyhedral object models are reconstructed from the masked elevation data. Using the given navigation data and camera parameters, the 3D polygons are projected into the images. A hidden-line algorithm is applied to extract the visible lines only.
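The projection step and its sensitivity to orientation errors can be illustrated with a minimal sketch. It assumes an ideal pinhole camera without lens distortion; the function names, the rotation convention (world to camera) and the focal length expressed in pixels are assumptions made for illustration and are not taken from the paper.

```python
import math

def rotation_y(angle_rad):
    """Rotation about the camera y-axis, used here to mimic a small orientation error."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def project_point(X, R, C, f, cx, cy):
    """Project a world point X into pixel coordinates (assumed pinhole model).

    R: 3x3 world-to-camera rotation, C: camera centre in world coordinates,
    f: focal length in pixels, (cx, cy): principal point in pixels.
    Returns None if the point lies behind the camera.
    """
    d = [X[i] - C[i] for i in range(3)]                       # vector camera -> point
    xc = [sum(R[r][k] * d[k] for k in range(3)) for r in range(3)]  # camera frame
    if xc[2] <= 0:
        return None
    return (cx + f * xc[0] / xc[2], cy + f * xc[1] / xc[2])

def pixel_shift(f, error_deg, depth=1000.0):
    """Image displacement of a point on the optical axis caused by a rotation error."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    X, C = (0.0, 0.0, depth), (0.0, 0.0, 0.0)
    u0, _ = project_point(X, identity, C, f, 0.0, 0.0)
    u1, _ = project_point(X, rotation_y(math.radians(error_deg)), C, f, 0.0, 0.0)
    return abs(u1 - u0)
```

Under these assumptions, an orientation error of 0.1 degrees displaces the projected point by roughly 1.7 pixels at a focal length of 1000 pixels and by roughly 8.7 pixels at 5000 pixels, since the shift grows as f·tan(error). This is consistent with the stronger misalignment reported for long focal lengths.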
Endpoints of lines enclosing a suitable angle are selected as reference tie points for matching. Candidate correspondences for these reference points are corners extracted from the image. To this end, an edge detector (Burns) is applied to every image of the sequence. Two lines that fulfil requirements concerning distance and angle form an angle structure. The vertices of the angle structures form the image tie point set. The two sets of tie points are matched with a Geometric Hashing algorithm. Because the matching is carried out in the image co-ordinate system, the search space can be approximated by an affine transformation. Given corresponding model and image points, or model and image lines, it is possible to calculate the camera orientation and the navigation data of the platform. The pose determination is based on linear and non-linear methods known from the literature. In this way we obtain an image-based determination of the navigation data. Using these calculated parameters, the projection of the building contours can be improved. For airborne image sequences, initial approximations for each frame can be obtained in two ways: either from navigation data recorded for every single frame, or by predicting the camera parameters of subsequent frames. Video sequences in oblique (side-looking) view were taken during multiple flights over the campus of the University of Karlsruhe. The test data differ in carrier elevation, flight direction and focal length of the camera. Ongoing investigations consider scaling effects in the field of view by projecting the objects at different levels of detail.
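The construction of an angle structure from two extracted lines can be sketched as follows. This is only an illustration: the thresholds (max_gap, min_angle_deg, max_angle_deg), the choice of the line intersection as the vertex, and the function name are hypothetical and are not specified in the paper.

```python
import math

def angle_structure(seg_a, seg_b, max_gap=5.0, min_angle_deg=30.0, max_angle_deg=150.0):
    """Return the vertex (x, y) of an angle structure, or None if the test fails.

    Each segment is ((x1, y1), (x2, y2)) in image coordinates.
    All thresholds are illustrative assumptions.
    """
    # Find the pair of endpoints (one per segment) that lie closest together.
    ia, ib = min(
        ((i, j) for i in (0, 1) for j in (0, 1)),
        key=lambda ij: math.dist(seg_a[ij[0]], seg_b[ij[1]]),
    )
    pa, pb = seg_a[ia], seg_b[ib]
    if math.dist(pa, pb) > max_gap:
        return None  # distance requirement not met

    # Directions pointing from the near endpoint to the far endpoint.
    fa, fb = seg_a[1 - ia], seg_b[1 - ib]
    da = (fa[0] - pa[0], fa[1] - pa[1])
    db = (fb[0] - pb[0], fb[1] - pb[1])

    cross = da[0] * db[1] - da[1] * db[0]
    dot = da[0] * db[0] + da[1] * db[1]
    angle = math.degrees(math.atan2(abs(cross), dot))  # enclosed angle in [0, 180]
    if not (min_angle_deg <= angle <= max_angle_deg):
        return None  # angle requirement not met

    # Vertex: intersection of the two supporting lines.
    w = (pb[0] - pa[0], pb[1] - pa[1])
    t = (w[0] * db[1] - w[1] * db[0]) / cross
    return (pa[0] + t * da[0], pa[1] + t * da[1])
```

For example, the segments ((1, 0), (10, 0)) and ((0, 1), (0, 10)) enclose a right angle and have nearby endpoints, so the sketch returns a vertex at the intersection of their supporting lines, while two parallel segments are rejected. The vertices collected in this way would form the image tie point set passed on to the matching step.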
Stilla U, Sörgel U, Jäger K (2000) Generation of 3D-city models and their utilisation in image sequences. International Archives of Photogrammetry and Remote Sensing. Vol. 33, Part B2, 518-524