Updated 2024/02/25

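Hitoshi Iuchi
井内 仁志
Affiliation
Faculty of Science and Engineering, Research Institute for Science and Engineering
Job title
Junior Researcher (Assistant Professor)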

Research Areas

  • Genome biology / System genome science
 

Papers

  • Bioinformatics approaches for unveiling virus-host interactions.

    Hitoshi Iuchi, Junna Kawasaki, Kento Kubo, Tsukasa Fukunaga, Koki Hokao, Gentaro Yokoyama, Akiko Ichinose, Kanta Suga, Michiaki Hamada

    Computational and structural biotechnology journal   21   1774 - 1784  2023  [International journal]

    The coronavirus disease-2019 (COVID-19) pandemic has elucidated major limitations in the capacity of medical and research institutions to appropriately manage emerging infectious diseases. We can improve our understanding of infectious diseases by unveiling virus-host interactions through host range prediction and protein-protein interaction prediction. Although many algorithms have been developed to predict virus-host interactions, numerous issues remain to be solved, and the entire network remains veiled. In this review, we comprehensively surveyed algorithms used to predict virus-host interactions. We also discuss the current challenges, such as dataset biases toward highly pathogenic viruses, and the potential solutions. The complete prediction of virus-host interactions remains difficult; however, bioinformatics can contribute to progress in research on infectious diseases and human health.

    Citations (Scopus): 5
  • Jonckheere-Terpstra-Kendall-based non-parametric analysis of temporal differential gene expression.

    Hitoshi Iuchi, Michiaki Hamada

    NAR genomics and bioinformatics   3 ( 1 ) lqab021  March 2021  [International journal]

    Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. degree or degree of freedom) for one dataset. This approach risks modeling of linearly increasing genes with higher-order functions, or fitting of cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere-Terpstra-Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks, using simulation data, show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, application of JTK in the analysis of time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that the wave pattern contributes to the TEG identification of JTK, not the difference in expression levels. This result suggests that JTK is a suitable algorithm when focusing on expression patterns over time rather than expression levels, such as comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.

    Citations (Scopus): 2
  • Representation learning applications in biological sequence analysis.

    Hitoshi Iuchi, Taro Matsutani, Keisuke Yamada, Natsuki Iwano, Shunsuke Sumi, Shion Hosoda, Shitao Zhao, Tsukasa Fukunaga, Michiaki Hamada

    Computational and structural biotechnology journal   19   3198 - 3208  2021  [International journal]

    Although remarkable advances have been reported in high-throughput sequencing, the ability to aptly analyze a substantial amount of rapidly generated biological (DNA/RNA/protein) sequencing data remains a critical hurdle. To tackle this issue, the application of natural language processing (NLP) to biological sequence analysis has received increased attention. In this method, biological sequences are regarded as sentences while the single nucleic acids/amino acids or k-mers in these sequences represent the words. Embedding is an essential step in NLP, which performs the conversion of these words into vectors. Specifically, representation learning is an approach used for this transformation process, which can be applied to biological sequences. Vectorized biological sequences can then be applied for function and structure estimation, or as input for other probabilistic models. Considering the importance and growing trend for the application of representation learning to biological research, in the present study, we have reviewed the existing knowledge in representation learning for biological sequence analysis.

    Citations (Scopus): 32
  • MICOP: Maximal information coefficient-based oscillation prediction to detect biological rhythms in proteomics data.

    Hitoshi Iuchi, Masahiro Sugimoto, Masaru Tomita

    BMC bioinformatics   19 ( 1 ) 249 - 249  June 2018  [International journal]

    BACKGROUND: Circadian rhythms comprise oscillating molecular interactions, the disruption of the homeostasis of which would cause various disorders. To understand this phenomenon systematically, an accurate technique to identify oscillating molecules among omics datasets must be developed; however, this is still impeded by many difficulties, such as experimental noise and attenuated amplitude. RESULTS: To address these issues, we developed a new algorithm named Maximal Information Coefficient-based Oscillation Prediction (MICOP), a sine curve-matching method. The performance of MICOP in labeling oscillation or non-oscillation was compared with four reported methods using Matthews correlation coefficient (MCC) values. The numerical experiments were performed with time-series data with (1) mimicking of molecular oscillation decay, (2) high noise and low sampling frequency and (3) one-cycle data. The first experiment revealed that MICOP could accurately identify the rhythmicity of decaying molecular oscillation (MCC > 0.7). The second experiment revealed that MICOP was robust against high-level noise (MCC > 0.8) even upon the use of low-sampling-frequency data. The third experiment revealed that MICOP could accurately identify the rhythmicity of noisy one-cycle data (MCC > 0.8). As an application, we utilized MICOP to analyze time-series proteome data of mouse liver. MICOP identified 14 and 30 novel oscillating candidates for C57BL/6 and C57BL/6J, respectively. CONCLUSIONS: In this paper, we presented MICOP, which is an MIC-based algorithm, for predicting periodic patterns in large-scale time-resolved protein expression profiles. The performance test using artificially generated simulation data revealed that the performance of MICOP for decaying data was superior to that of the existing widely used methods. It can reveal novel findings from time-series data and may contribute to biologically significant results. This study suggests that MICOP is an ideal approach for detecting and characterizing oscillations in time-resolved omics data sets.

    Citations (Scopus): 7
  • Epigenetic Methods in Neuroscience Research

    Hitoshi Iuchi

    2016
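The representation learning review listed above describes treating a biological sequence as a sentence whose words are single residues or k-mers, which are then converted into vectors. A minimal sketch of that tokenization step follows. The function names `kmer_tokens` and `bag_of_kmers` are hypothetical, and the count vector is only a simple stand-in for a learned embedding, not a method taken from the review.

```python
from collections import Counter


def kmer_tokens(sequence, k=3):
    """Split a sequence into overlapping k-mers, the "words" of the sentence."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def bag_of_kmers(sequence, vocabulary, k=3):
    """Count how often each vocabulary k-mer occurs in the sequence.

    This is a plain count vector (an assumption made for illustration);
    a representation learning model would learn dense vectors instead.
    """
    counts = Counter(kmer_tokens(sequence, k))
    return [counts[kmer] for kmer in vocabulary]


if __name__ == "__main__":
    print(kmer_tokens("ATGCA"))
    print(bag_of_kmers("ATGATG", ["ATG", "TGA", "GAT", "CCC"]))
```

In practice the k-mer tokens would be passed to an embedding model, and the resulting vectors used for the function or structure estimation tasks that the abstract mentions.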

Books and Other Publications

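  • バイオDBとウェブツール : ラボで使える最新70選 : 知る・学ぶ・使う、バイオDX時代の羅針盤 [Bio DBs and Web Tools: The Latest 70 Picks for the Lab: Know, Learn, Use, a Compass for the Bio-DX Era]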

    小野, 浩雅 (Role: co-author)

    Yodosha  November 2022  ISBN: 9784758104067

  • Epigenetic Methods in Neuroscience Research (Neuromethods)

    (Role: co-author)

    Humana  August 2016  ISBN: 1493949411

Research Projects (Joint Research, Competitive Funding, etc.)

  • Development of an algorithm for detecting differentially expressed genes in cell populations with diverged cell fates

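    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research, Grant-in-Aid for Early-Career Scientists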

    Project period: April 2021 - March 2025

    Hitoshi Iuchi (井内 仁志)

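    In recent years, single-cell gene expression analysis has been used to model the changes in cell state that accompany development or stimulation from gene expression profiles, and thereby to estimate the cell-state changes and pseudotime in each sample. Pseudotime does not necessarily coincide with real time; it indicates how far a cell has progressed through its state transition. Once the trajectory of cell-state changes and the pseudotime have been estimated, a common next step is to extract genes whose expression patterns differ between cell types whose fates have diverged. Although many software packages have been published for estimating the trajectory of cell-state changes and pseudotime, algorithms for extracting differentially expressed genes on a pseudotime basis have been limited. The aim of this study is to develop a Fourier transform-based algorithm for detecting differentially expressed genes on a pseudotime basis. In fiscal year 2021, we first selected software for estimating the trajectory of cell-state changes and pseudotime, and implemented the Fourier transform-based algorithm.
    In addition, analyzing genomics data sometimes requires representing high-dimensional data as low-dimensional vectors. We therefore comprehensively surveyed representation learning, an approach for learning feature representations that are useful for solving a given task. Focusing in particular on papers that apply natural language processing to biological sequences such as DNA, RNA, and protein, we published the survey as a review article (Iuchi et al., Comput Struct Biotechnol J., 2021).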
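
The project summary above names a Fourier transform-based approach but does not describe the algorithm itself. Purely as an illustration of the general idea, the sketch below compares two expression profiles sampled on a shared pseudotime grid by their low-frequency Fourier magnitudes. Everything here is an assumption and not the project's method: the function names `dft_magnitudes` and `spectral_distance`, the use of three frequency components, and the Euclidean distance.

```python
import cmath
import math


def dft_magnitudes(values, n_freqs):
    """Return normalized magnitudes of DFT components k = 1..n_freqs.

    The constant (k = 0) term is skipped so that a difference in mean
    expression level alone does not affect the result.
    """
    n = len(values)
    magnitudes = []
    for k in range(1, n_freqs + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(values)
        )
        magnitudes.append(abs(coeff) / n)
    return magnitudes


def spectral_distance(profile_a, profile_b, n_freqs=3):
    """Euclidean distance between low-frequency magnitude spectra.

    Both profiles are assumed to be sampled at the same, evenly spaced
    pseudotime points (a hypothetical preprocessing step).
    """
    mags_a = dft_magnitudes(profile_a, n_freqs)
    mags_b = dft_magnitudes(profile_b, n_freqs)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mags_a, mags_b)))


if __name__ == "__main__":
    n_points = 16
    wave = [math.sin(2 * math.pi * t / n_points) for t in range(n_points)]
    flat = [1.0] * n_points
    print(spectral_distance(wave, flat))
```

Because only magnitudes are compared, two profiles with the same shape but shifted along pseudotime would score as identical. A real method would need to decide whether phase matters and how to assess statistical significance.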