Updated on 2023/10/01


IUCHI, Hitoshi
Faculty of Science and Engineering, Waseda Research Institute for Science and Engineering
Job title
Junior Researcher (Assistant Professor)

Research Areas

  • Genome biology / System genome science


Papers

  • Bioinformatics approaches for unveiling virus-host interactions.

    Hitoshi Iuchi, Junna Kawasaki, Kento Kubo, Tsukasa Fukunaga, Koki Hokao, Gentaro Yokoyama, Akiko Ichinose, Kanta Suga, Michiaki Hamada

    Computational and structural biotechnology journal   21   1774 - 1784  2023  [International journal]

    The coronavirus disease-2019 (COVID-19) pandemic has elucidated major limitations in the capacity of medical and research institutions to appropriately manage emerging infectious diseases. We can improve our understanding of infectious diseases by unveiling virus-host interactions through host range prediction and protein-protein interaction prediction. Although many algorithms have been developed to predict virus-host interactions, numerous issues remain to be solved, and the entire network remains veiled. In this review, we comprehensively surveyed algorithms used to predict virus-host interactions. We also discuss the current challenges, such as dataset biases toward highly pathogenic viruses, and the potential solutions. The complete prediction of virus-host interactions remains difficult; however, bioinformatics can contribute to progress in research on infectious diseases and human health.


  • Jonckheere-Terpstra-Kendall-based non-parametric analysis of temporal differential gene expression.

    Hitoshi Iuchi, Michiaki Hamada

    NAR genomics and bioinformatics   3 ( 1 ) lqab021  2021.03  [International journal]

    Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. degree or degree of freedom) for one dataset. This approach risks modeling of linearly increasing genes with higher-order functions, or fitting of cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere-Terpstra-Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks, using simulation data, show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, application of JTK in the analysis of time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that the wave pattern contributes to the TEG identification of JTK, not the difference in expression levels. This result suggests that JTK is a suitable algorithm when focusing on expression patterns over time rather than expression levels, such as comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.


  • Representation learning applications in biological sequence analysis.

    Hitoshi Iuchi, Taro Matsutani, Keisuke Yamada, Natsuki Iwano, Shunsuke Sumi, Shion Hosoda, Shitao Zhao, Tsukasa Fukunaga, Michiaki Hamada

    Computational and structural biotechnology journal   19   3198 - 3208  2021  [International journal]

    Although remarkable advances have been reported in high-throughput sequencing, the ability to aptly analyze a substantial amount of rapidly generated biological (DNA/RNA/protein) sequencing data remains a critical hurdle. To tackle this issue, the application of natural language processing (NLP) to biological sequence analysis has received increased attention. In this method, biological sequences are regarded as sentences while the single nucleic acids/amino acids or k-mers in these sequences represent the words. Embedding is an essential step in NLP, which performs the conversion of these words into vectors. Specifically, representation learning is an approach used for this transformation process, which can be applied to biological sequences. Vectorized biological sequences can then be applied for function and structure estimation, or as input for other probabilistic models. Considering the importance and growing trend for the application of representation learning to biological research, in the present study, we have reviewed the existing knowledge in representation learning for biological sequence analysis.
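As an illustration of the "sequences as sentences, k-mers as words" idea in the abstract above, the following minimal sketch splits a sequence into overlapping k-mers and builds a simple count vector. The function names `kmer_tokens` and `kmer_count_vector` are hypothetical, and a count vector is only a bag-of-words baseline, not a learned representation of the kind the review surveys.

```python
from collections import Counter


def kmer_tokens(sequence, k=3):
    """Split a biological sequence into overlapping k-mers ("words")."""
    seq = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_count_vector(sequence, k, vocabulary):
    """Count how often each vocabulary k-mer occurs in the sequence.

    This is a plain bag-of-words vector (an assumption for illustration);
    learned embeddings would replace it in a real pipeline.
    """
    counts = Counter(kmer_tokens(sequence, k))
    return [counts[word] for word in vocabulary]


if __name__ == "__main__":
    print(kmer_tokens("ATGCGA", k=3))
    print(kmer_count_vector("ATGCGA", 3, ["ATG", "GCG", "TTT"]))
```

The token list could then be passed to any embedding model that accepts sequences of words.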


  • MICOP: Maximal information coefficient-based oscillation prediction to detect biological rhythms in proteomics data.

    Hitoshi Iuchi, Masahiro Sugimoto, Masaru Tomita

    BMC bioinformatics   19 ( 1 ) 249 - 249  2018.06  [International journal]

    BACKGROUND: Circadian rhythms comprise oscillating molecular interactions, the disruption of the homeostasis of which would cause various disorders. To understand this phenomenon systematically, an accurate technique to identify oscillating molecules among omics datasets must be developed; however, this is still impeded by many difficulties, such as experimental noise and attenuated amplitude. RESULTS: To address these issues, we developed a new algorithm named Maximal Information Coefficient-based Oscillation Prediction (MICOP), a sine curve-matching method. The performance of MICOP in labeling oscillation or non-oscillation was compared with four reported methods using Matthews correlation coefficient (MCC) values. The numerical experiments were performed with time-series data with (1) mimicking of molecular oscillation decay, (2) high noise and low sampling frequency and (3) one-cycle data. The first experiment revealed that MICOP could accurately identify the rhythmicity of decaying molecular oscillation (MCC > 0.7). The second experiment revealed that MICOP was robust against high-level noise (MCC > 0.8) even upon the use of low-sampling-frequency data. The third experiment revealed that MICOP could accurately identify the rhythmicity of noisy one-cycle data (MCC > 0.8). As an application, we utilized MICOP to analyze time-series proteome data of mouse liver. MICOP identified that novel oscillating candidates numbered 14 and 30 for C57BL/6 and C57BL/6J, respectively. CONCLUSIONS: In this paper, we presented MICOP, which is an MIC-based algorithm, for predicting periodic patterns in large-scale time-resolved protein expression profiles. The performance test using artificially generated simulation data revealed that the performance of MICOP for decaying data was superior to that of the existing widely used methods. It can reveal novel findings from time-series data and may contribute to biologically significant results. This study suggests that MICOP is an ideal approach for detecting and characterizing oscillations in time-resolved omics data sets.
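The abstract above reports performance as MCC values for oscillation versus non-oscillation labels. A minimal sketch of how that score is computed from binary labels follows; the function name and the example labels are hypothetical and are not data from the paper.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = oscillating)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical labels: three oscillating and three non-oscillating series.
    truth = [1, 1, 1, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 1]
    print(matthews_corrcoef(truth, predicted))
```

MCC ranges from -1 (every label inverted) through 0 (no better than chance) to 1 (perfect agreement), so thresholds such as MCC > 0.8 indicate strong agreement with the true labels.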



Books and Other Publications

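  • Bio Databases and Web Tools: The Latest 70 Selections for the Lab: Know, Learn, Use, a Compass for the Bio-DX Era (バイオDBとウェブツール : ラボで使える最新70選 : 知る・学ぶ・使う、バイオDX時代の羅針盤)

    Ono, Hiromasa (小野, 浩雅) (Part: Joint author)

    Yodosha (羊土社)  2022.11 ISBN: 9784758104067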

  • Epigenetic Methods in Neuroscience Research (Neuromethods)

    (Part: Joint author)

    Humana  2016.08 ISBN: 1493949411


Research Projects

  • Algorithm development for detecting differentially expressed genes in single-cell RNA-seq

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research Grant-in-Aid for Early-Career Scientists

    Project Year :