Articles | Volume 18, issue 4
https://doi.org/10.5194/essd-18-2609-2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
BuildingSense: a new multimodal building function classification dataset
Pengxiang Su
College of Surveying and Geo-informatics, Tongji University, Shanghai, China
Runfei Chen
Urban Mobility Institute, Tongji University, Shanghai, China
Heng Xu
College of Surveying and Geo-informatics, Tongji University, Shanghai, China
Wei Huang
CORRESPONDING AUTHOR
College of Surveying and Geo-informatics, Tongji University, Shanghai, China
Urban Mobility Institute, Tongji University, Shanghai, China
Department of Civil Engineering, Toronto Metropolitan University, Toronto, Canada
Xinling Deng
Cornell Tech, Cornell University, New York City, USA
Songnian Li
Department of Civil Engineering, Toronto Metropolitan University, Toronto, Canada
Wanglin Yan
Faculty of Environment and Information Studies, Keio University, Fujisawa City, Japan
Hangbin Wu
College of Surveying and Geo-informatics, Tongji University, Shanghai, China
Chun Liu
College of Surveying and Geo-informatics, Tongji University, Shanghai, China
Short summary
Accessible building function information is essential for urban research. We reviewed recent work and identified three limitations: few open-source datasets, coarse building function categories, and poor model interpretability with inadequate multimodal feature fusion. We therefore created BuildingSense, a dataset with fine-grained categories and multimodal data, and showed that large models can improve the interpretability of results. We also outline three directions for enhancing their performance.