Internal Special Research Projects
-
2024
Between April 2024 and March 2025, we made several contributions to hyperspectral image (HSI) classification. Our first journal paper, "Segmented Recurrent Transformer With Cubed 3-D Multiscanning Strategy for Hyperspectral Image Classification," was published in IEEE Transactions on Geoscience and Remote Sensing, and the second, "Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification," appeared in Neurocomputing. We also submitted a paper to the ISPRS Journal of Photogrammetry and Remote Sensing, which is currently under review. Additionally, we presented a joint conference paper, "Adversarial Detection Transformer for Kuzushiji Recognition," at the 2024 IEEE International Conference on Image Processing.
The first paper introduces an algorithm that extends the multiscanning strategy to 3D and integrates it with a recurrent Transformer for hyperspectral image processing. We propose a segmented recurrent module that processes the scanned sequences both locally and globally, producing what we term short-term and long-term features. On four public datasets, the method achieved better classification results than state-of-the-art methods.
The second paper explores the use of the state-space model for hyperspectral image classification, comparing it with RNNs and Transformers. We present the Mamba-in-Mamba (MiM) architecture, which includes (1) a centralized Mamba-Cross-Scan (MCS) mechanism for image-to-sequence data transformation, (2) a Tokenized Mamba (T-Mamba) encoder with components such as the Gaussian Decay Mask (GDM), Semantic Token Learner (STL), and Semantic Token Fuser (STF) for better feature generation, and (3) a Weighted MCS Fusion (WMF) module with a Multi-Scale Loss Design to improve training efficiency. Our approach outperforms existing methods across four public HSI datasets in terms of overall accuracy.
The third paper, currently under review, introduces HSIseg, a model for HSI classification based on segmentation techniques rather than traditional patch-based methods. It incorporates multi-source data collaboration and a progressive pseudo-labeling strategy to address unlabeled regions, which improves training and classification performance. Further revisions are ongoing.
-
2023
During this research period, we developed a deep learning model for hyperspectral image (HSI) classification based on the popular Transformer architecture. After identifying shortcomings in the existing Transformer model, we proposed integrating Recurrent Neural Networks (RNNs) with Transformers so that the two complement each other. The proposal consisted of three components: 1) an RNN-Transformer encoder, 2) Soft Masked Spectral-Spatial-Based Self-Attention (SMSA), and 3) a Multiscanning Fusion Transformer. The resulting journal paper, titled "Multiscanning-based RNN-Transformer for Hyperspectral Image Classification," was accepted by IEEE Transactions on Geoscience and Remote Sensing (TGRS). Compared with baseline methods, this work achieved a 6% to 11% accuracy improvement. Compared with other state-of-the-art methods, it obtained a 1% to 5% accuracy improvement while reducing processing time by almost 50% and model size by almost 40%.
Meanwhile, the "multiscanning strategy" was extended to another field, image compression, in work carried out by our coworkers. The paper, titled "Learned Image Compression with Multi-Scan Based Channel Fusion," was accepted by the International Conference on Image Processing (ICIP) in 2023. This work verified the effectiveness of the multiscanning strategy and showed that the idea generalizes beyond HSI classification. We hope to extend this concept to other research fields.
Furthermore, to extend the multiscanning strategy to 3D, we proposed a cubed 3D multiscanning strategy. The manuscript, titled "Segmented Recurrent Transformer with Cubed 3D Multiscanning Strategy for Hyperspectral Image Classification," was accepted on 26 March by IEEE Transactions on Geoscience and Remote Sensing (TGRS).
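As a rough illustration of the multiscanning idea mentioned in this summary, the sketch below flattens a 2D patch into several pixel sequences, each following a different scan order. The summary does not specify which scan orders the papers use, so the four orders here (raster, column, and their snake variants) and the function name `multiscan` are assumptions chosen for illustration only, not the published method.

```python
import numpy as np


def _snake(grid):
    """Flatten an (R, C, D) grid row by row, reversing every odd row."""
    out = grid.copy()
    out[1::2] = grid[1::2, ::-1]  # odd rows are traversed right to left
    return out.reshape(-1, grid.shape[-1])


def multiscan(patch):
    """Turn an (H, W, C) patch into four (H*W, C) sequences.

    Hypothetical helper: the scan orders are illustrative assumptions.
    """
    h, w, c = patch.shape
    by_columns = patch.transpose(1, 0, 2)  # swap the two spatial axes
    return {
        "raster": patch.reshape(h * w, c),       # rows, left to right
        "column": by_columns.reshape(h * w, c),  # columns, top to bottom
        "row_snake": _snake(patch),              # rows, alternating direction
        "col_snake": _snake(by_columns),         # columns, alternating direction
    }


if __name__ == "__main__":
    # Tiny 2x3 single-band patch so the orders are easy to read.
    demo = np.arange(6).reshape(2, 3, 1)
    for name, seq in multiscan(demo).items():
        print(name, seq[:, 0].tolist())
```

Each sequence could then be fed to a sequence model (an RNN or Transformer encoder) and the per-scan outputs fused; how that fusion is done in the papers is not described here.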
-
2022 Sei-ichiro Kamata
I published two papers at international conferences in 2022: the 26th International Conference on Pattern Recognition (ICPR) and the International Conference on Image Processing (ICIP). The first paper [1] presented a novel approach to designing a unified spectral-spatial Transformer for hyperspectral image classification. Specifically, I proposed a cascaded integration of the spectral vision Transformer with the spatial pyramid vision Transformer, along with a cross-scale fusion module. Moreover, I introduced a local-global encoder in the spatial domain, which validated the effectiveness of incorporating local features into the Transformer model. Overall, the paper contributed to the advancement and practicality of pure vision Transformer-based models for hyperspectral image classification. The second paper [2] proposed a new approach to hyperspectral image classification that leverages a 3D configuration of the vision Transformer, enabling simultaneous correlation of spectral and spatial features. To this end, I introduced a novel 3D coordinate positional embedding method that distinguishes the relative distances among all hyper-cubes resulting from the 3D partition operation. I also designed a local-global feature combination approach that integrates with the 3D configuration of the vision Transformer. We presented this research at the two conferences and received positive feedback.
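To make the 3D partition and coordinate idea concrete, the following sketch splits a hyperspectral cube into non-overlapping hyper-cubes and assigns each one an integer 3D grid coordinate, from which pairwise distances can be computed. This is only an illustration under assumptions: the function names, the non-overlapping partition, and the use of Euclidean distance are hypothetical, and the way the paper turns coordinates into a positional embedding is not reproduced here.

```python
import numpy as np


def partition_with_coords(cube, size):
    """Split an (H, W, B) cube into non-overlapping size^3 hyper-cubes.

    Returns tokens of shape (N, size**3) and grid coordinates of shape (N, 3).
    Hypothetical helper; assumes each dimension is divisible by `size`.
    """
    h, w, b = cube.shape
    if h % size or w % size or b % size:
        raise ValueError("each cube dimension must be divisible by size")
    nh, nw, nb = h // size, w // size, b // size
    # Split every axis into (block index, offset inside block), then group
    # the three block indices in front of the three offsets.
    blocks = cube.reshape(nh, size, nw, size, nb, size)
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    tokens = blocks.reshape(nh * nw * nb, size ** 3)
    # One (row, column, band) grid coordinate per hyper-cube, in token order.
    grid = np.meshgrid(np.arange(nh), np.arange(nw), np.arange(nb),
                       indexing="ij")
    coords = np.stack(grid, axis=-1).reshape(-1, 3)
    return tokens, coords


def pairwise_distances(coords):
    """Euclidean distance between every pair of hyper-cube coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


if __name__ == "__main__":
    cube = np.arange(4 * 4 * 2).reshape(4, 4, 2)
    tokens, coords = partition_with_coords(cube, size=2)
    print(tokens.shape, coords.tolist())
    print(pairwise_distances(coords))
```

A positional embedding could be derived from `coords` (for example, by a learned or sinusoidal mapping per axis), but that choice is left open in this sketch.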
The data was downloaded from the Scopus API on March 12, 2025, via http://api.elsevier.com and http://www.scopus.com .