Bag-of-words approaches to image categorization are currently very popular owing to their relative simplicity, robustness, and high efficiency. However, they cannot represent the spatial composition of an image. Several approaches address this drawback, with spatial pyramids being the most popular. Spatial pyramids divide an image into smaller blocks and compute a feature vector for each block. The feature vectors of these blocks are concatenated to form the feature vector of the whole image. This increases the dimension of the whole image's feature vector by a factor equal to the number of blocks the image is divided into, and consequently increases computation time in proportion to the number of blocks. We propose extending the image feature vector with spatial features, which yields a descriptor of similar size to that of the standard bag-of-words approach. The classification performance, however, is similar to that of spatial pyramids, which use a significantly larger feature vector and are therefore more computationally expensive.
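The growth in descriptor size described above can be illustrated with a minimal sketch. It assumes each local feature has already been quantized to a visual-word index and has a known pixel position. The function names `bow_histogram` and `grid_histogram`, the vocabulary size of 100, and the 640x480 image size are hypothetical choices for illustration only. The sketch uses a single grid level rather than a full multi-level pyramid, and it does not attempt to show the proposed spatial features.

```python
import numpy as np

def bow_histogram(word_ids, vocab_size):
    """Plain bag-of-words: count how often each visual word occurs."""
    return np.bincount(word_ids, minlength=vocab_size).astype(float)

def grid_histogram(word_ids, xs, ys, width, height, grid, vocab_size):
    """One histogram per block of a grid x grid partition, concatenated.

    Assumption: a single grid level stands in for a spatial pyramid here.
    """
    # Map each feature position to a block column and row, clamping
    # positions that fall on the far image border into the last block.
    col = np.minimum((xs * grid // width).astype(int), grid - 1)
    row = np.minimum((ys * grid // height).astype(int), grid - 1)
    block = row * grid + col
    per_block = [
        bow_histogram(word_ids[block == b], vocab_size)
        for b in range(grid * grid)
    ]
    return np.concatenate(per_block)

# Synthetic data (assumed): 500 features, 100 visual words, 640x480 image.
rng = np.random.default_rng(0)
vocab_size, n_features = 100, 500
word_ids = rng.integers(0, vocab_size, size=n_features)
xs = rng.uniform(0, 640, size=n_features)
ys = rng.uniform(0, 480, size=n_features)

bow = bow_histogram(word_ids, vocab_size)
blocks = grid_histogram(word_ids, xs, ys, 640, 480, 2, vocab_size)

print(bow.shape[0])     # descriptor length without spatial information
print(blocks.shape[0])  # descriptor length with a 2x2 grid
```

With a 2x2 grid the concatenated descriptor is four times as long as the plain bag-of-words histogram, which is the factor-of-blocks growth that a compact spatial extension aims to avoid.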