Details of a Researcher - HAMADA, Michiaki

HAMADA, Michiaki

Scopus Paper Info

Paper Count: 130 Citation Count: 3373 h-index: 32

The data was downloaded from the Scopus API on March 4, 2026, via http://api.elsevier.com and http://www.scopus.com.

Google Scholar Information (Citations per year)

Citation Count: 5026 h-index: 35 i10-index: 73

News & Topics

2026.02.03

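Toward Social Implementation of the Results of Medicine-Science-Engineering Collaboration (Report on the 5th Nippon Medical School and Waseda University Joint Symposium)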

2024.10.03

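Emerging Results and Prospects of Medicine-Science-Engineering Collaboration (Report on the 4th Nippon Medical School and Waseda University Joint Symposium)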

2024.01.19

Constructing a Deep Generative Approach for Functional RNA Design

2023.10.16

Advancing Medicine-Science-Engineering Research Exchange (Report on the 3rd Nippon Medical School and Waseda University Joint Symposium)

2022.09.01

Waseda Researcher – Michiaki Hamada

2022.06.08

Scientists Develop Novel Computational Model for Aptamer Generation, With Wide Applications

2021.12.13

Announcing the recipients of the 2021 Waseda Research Award

Affiliation

Faculty of Science and Engineering, School of Advanced Science and Engineering

Job title

Professor

Degree

Ph.D. (2009.03, Tokyo Institute of Technology)

Homepage URL

https://www.hamadalab.com/

Profile

In March 2000, I graduated from the Department of Mathematics, Faculty of Science, Tohoku University, and completed the Master's program in Mathematics at the Graduate School of Science, Tohoku University in March 2002, specializing in operator theory. From April 2002, I spent eight and a half years as a researcher at Fuji Research Institute Corporation (now Mizuho Research & Technologies, Ltd.), where I was engaged in contract R&D related to science and technology. During this time, I participated in the NEDO "Functional RNA Project," where I first encountered bioinformatics. In this project, I was involved in the development of various RNA sequence analysis technologies, including CentroidFold.

While still employed at the company, I completed a doctoral program for working professionals at the Department of Computational Intelligence and Systems Science, Graduate School of Interdisciplinary Science and Engineering, Tokyo Institute of Technology, earning a Ph.D. in Science in March 2009. On October 1, 2010, I was appointed Project Associate Professor at the Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo. Since April 1, 2014, I have been serving as Associate Professor (and Professor since April 2018) at the Faculty of Science and Engineering, Waseda University, where I lead a bioinformatics research laboratory.

Since October 1, 2016, I have also been a Visiting Researcher at the National Institute of Advanced Industrial Science and Technology (AIST), and since April 1, 2017, a Visiting Professor at the Graduate School of Medicine, Nippon Medical School. In 2023, I was selected as one of Waseda University’s “Next Generation Core Researchers,” and as of April 2024, I have assumed the position of President and Chair of the Japanese Society for Bioinformatics (JSBi). To foster human resources and establish a structured body of knowledge in the field of bioinformatics, I am currently supervising the Corona Publishing Bioinformatics Series (a 15-volume set), and actively advancing research, education, and outreach in bioinformatics.

Among my major awards are the 2017 Young Scientists’ Award from the Minister of Education, Culture, Sports, Science and Technology (MEXT), the 2022 Waseda University Research Award for International Research Dissemination, the 2023 Waseda University Teaching Award, and the 2024 Okuma Memorial Academic Prize (Encouragement Award). As a Principal Investigator, I have led various competitive research projects, including two KAKENHI Young Researcher (A) grants, Challenging Exploratory Research, Challenging Research (Exploratory and Pioneering), New Academic Fields (Open Research), two KAKENHI Grant-in-Aid for Scientific Research (A), and the Strategic Basic Research Program (CREST).

My current research focuses broadly on bioinformatics, with particular emphasis on functional RNA, epigenetics, and RNA drug discovery. My ultimate goal is to develop long-lasting “killer tools” that catalyze breakthroughs in biology, medicine, and pharmaceutical sciences through the lens of software and information technology.

Research Experience

2025.04 - Now: Japanese Society for Bioinformatics (JSBi), President
2018.04 - Now: Waseda University, Faculty of Science and Engineering, Professor
2025.04 - Now: National Institute of Advanced Industrial Science and Technology
2017.04 - Now: Nippon Medical School
2023.04 - 2025.03: Japanese Society for Bioinformatics (JSBi), Vice President
2022.04 - 2025.03: Waseda University, Next Generation Core Researcher
2016.10 - 2025.03: National Institute of Advanced Industrial Science and Technology, Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), Visiting Researcher / Team Leader
2014.04 - 2018.03: Waseda University, Faculty of Science and Engineering, Associate Professor
2010.10 - 2014.03: The University of Tokyo
2002.04 - 2010.09: Fuji Research Institute Corporation, Researcher

Education Background

2005.10 - 2009.03: Tokyo Institute of Technology, Interdisciplinary Science and Engineering, Intelligent Systems Science
2000.04 - 2002.03: Tohoku University, Graduate School of Science, Department of Mathematics
1996.04 - 2000.03: Tohoku University

Committee Memberships

2025.04 - Now: Japan Agency for Medical Research and Development (AMED), AMED Project Evaluation Committee member
2025.04 - Now: Japanese Society for Bioinformatics, President
2024.09 - Now: Waseda University Ethics Review Committee on Research with Human Subjects, Committee B, Chair
2022.06 - Now: mRNA Targeted Drug Discovery Research Organization, Board of Directors
2020.04 - Now: Japanese Society for Bioinformatics
2023.04 - 2025.03: Japanese Society for Bioinformatics, Vice President
2021.09 -: 2021 Annual Meeting of the Japanese Society for Bioinformatics and 10th Informatics in Biology, Medicine and Pharmacology Joint Conference (IIBMP2021), Congress President
2015.04 - 2017.03: Japanese Society for Bioinformatics, Board member

Research Areas

System genome science / Genome biology / Life, health and medical informatics / Intelligent informatics

Research Interests

RNA therapeutics
Artificial Intelligence
probabilistic model
RNA-protein interaction
RNA-RNA interaction
interactome
long noncoding RNA (lncRNA)
epi-transcriptome
epi-genome
RNA aptamer
sequence analysis
RNA
Bioinformatics

Awards

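Okuma Memorial Academic Award (Encouragement Award)
2024.11, Waseda University
Winner: Michiaki Hamada

Waseda University Teaching Award
2023.07, Waseda University, Bioinformatics
Winner: Michiaki Hamada, Chao Zeng

Waseda University Next Generation Core Researcher 2022
2022.04, Waseda University
Winner: Michiaki Hamada

Waseda University Research Award (International Research Dissemination)
2021.12, Waseda University
Winner: Michiaki Hamada

FY2017 Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology: Young Scientists' Award
2017.04, Ministry of Education, Culture, Sports, Science and Technology (MEXT)
Winner: Michiaki Hamada

AIST President's Award (Research)
2016.04, National Institute of Advanced Industrial Science and Technology (AIST)
Winner: Michiaki Hamada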

Papers

Deep generative design of RNA family sequences

Shunsuke Sumi, Michiaki Hamada, Hirohide Saito

Nature Methods 21 ( 3 ) 435 - 443 2024.03 [Refereed] [International journal]

Authorship：Corresponding author

RNA engineering has immense potential to drive innovation in biotechnology and medicine. Despite its importance, a versatile platform for the automated design of functional RNA is still lacking. Here, we propose RNA family sequence generator (RfamGen), a deep generative model that designs RNA family sequences in a data-efficient manner by explicitly incorporating alignment and consensus secondary structure information. RfamGen can generate novel and functional RNA family sequences by sampling points from a semantically rich and continuous representation. We have experimentally demonstrated the versatility of RfamGen using diverse RNA families. Furthermore, we confirmed the high success rate of RfamGen in designing functional ribozymes through a quantitative massively parallel assay. Notably, RfamGen successfully generates artificial sequences with higher activity than natural sequences. Overall, RfamGen significantly improves our ability to design functional RNA and opens up new potential for generative RNA engineering in synthetic biology.

Scopus citations: 37

Landscape of semi-extractable RNAs across five human cell lines.

Chao Zeng, Takeshi Chujo, Tetsuro Hirose, Michiaki Hamada

Nucleic Acids Research 51 ( 15 ) 7820 - 7831 2023.07 [Refereed] [International journal]

Authorship：Last author, Corresponding author

Phase-separated membraneless organelles often contain RNAs that exhibit unusual semi-extractability using the conventional RNA extraction method, and can be efficiently retrieved by needle shearing or heating during RNA extraction. Semi-extractable RNAs are promising resources for understanding RNA-centric phase separation. However, limited assessments have been performed to systematically identify and characterize semi-extractable RNAs. In this study, 1074 semi-extractable RNAs, including ASAP1, DANT2, EXT1, FTX, IGF1R, LIMS1, NEAT1, PHF21A, PVT1, SCMH1, STRG.3024.1, TBL1X, TCF7L2, TVP23C-CDRT4, UBE2E2, ZCCHC7, ZFAND3 and ZSWIM6, which exhibited consistent semi-extractability were identified across five human cell lines. By integrating publicly available datasets, we found that semi-extractable RNAs tend to be distributed in the nuclear compartments but are dissociated from the chromatin. Long and repeat-containing semi-extractable RNAs act as hubs to provide global RNA-RNA interactions. Semi-extractable RNAs were divided into four groups based on their k-mer content. The NEAT1 group preferred to interact with paraspeckle proteins, such as FUS and NONO, implying that RNAs in this group are potential candidates of architectural RNAs that constitute nuclear bodies.

Scopus citations: 6

PBSIM3: a simulator for all types of PacBio and ONT long reads.

Yukiteru Ono, Michiaki Hamada, Kiyoshi Asai

NAR Genomics and Bioinformatics 4 ( 4 ) lqac092 2022.12 [International journal]

Long-read sequencers, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) sequencers, have improved their read length and accuracy, thereby opening up unprecedented research. Many tools and algorithms have been developed to analyze long reads, and rapid progress in PacBio and ONT has further accelerated their development. Together with the development of high-throughput sequencing technologies and their analysis tools, many read simulators have been developed and effectively utilized. PBSIM is one of the popular long-read simulators. In this study, we developed PBSIM3 with three new functions: error models for long reads, multi-pass sequencing for high-fidelity read simulation and transcriptome sequencing simulation. Therefore, PBSIM3 is now able to meet a wide range of long-read simulation requirements.

Scopus citations: 51

Generative aptamer discovery using RaptGen

Natsuki Iwano, Tatsuo Adachi, Kazuteru Aoki, Yoshikazu Nakamura, Michiaki Hamada

Nature Computational Science 2 ( 6 ) 378 - 386 2022.06 [Refereed]

Authorship：Last author, Corresponding author

Nucleic acid aptamers are generated by an in vitro molecular evolution method known as systematic evolution of ligands by exponential enrichment (SELEX). Various candidates are limited by actual sequencing data from an experiment. Here we developed RaptGen, which is a variational autoencoder for in silico aptamer generation. RaptGen exploits a profile hidden Markov model decoder to represent motif sequences effectively. We showed that RaptGen embedded simulation sequence data into low-dimensional latent space on the basis of motif information. We also performed sequence embedding using two independent SELEX datasets. RaptGen successfully generated aptamers from the latent space even though they were not included in high-throughput sequencing. RaptGen could also generate a truncated aptamer with a short learning model. We demonstrated that RaptGen could be applied to activity-guided aptamer generation according to Bayesian optimization. We concluded that a generative method by RaptGen and latent representation are useful for aptamer discovery.

Scopus citations: 75

RaptRanker: in silico RNA aptamer selection from HT-SELEX experiment based on local sequence and structure information.

Ryoga Ishida, Tatsuo Adachi, Aya Yokota, Hidehito Yoshihara, Kazuteru Aoki, Yoshikazu Nakamura, Michiaki Hamada

Nucleic Acids Research 48 ( 14 ) e82 2020.08 [International journal]

Aptamers are short single-stranded RNA/DNA molecules that bind to specific target molecules. Aptamers with high binding-affinity and target specificity are identified using an in vitro procedure called high throughput systematic evolution of ligands by exponential enrichment (HT-SELEX). However, the development of aptamer affinity reagents takes a considerable amount of time and is costly because HT-SELEX produces a large dataset of candidate sequences, some of which have insufficient binding-affinity. Here, we present RNA aptamer Ranker (RaptRanker), a novel in silico method for identifying high binding-affinity aptamers from HT-SELEX data by scoring and ranking. RaptRanker analyzes HT-SELEX data by evaluating the nucleotide sequence and secondary structure simultaneously, and by ranking according to scores reflecting local structure and sequence frequencies. To evaluate the performance of RaptRanker, we performed two new HT-SELEX experiments, and evaluated binding affinities of a part of sequences that include aptamers with low binding-affinity. In both datasets, the performance of RaptRanker was superior to Frequency, Enrichment and MPBind. We also confirmed that the consideration of secondary structures is effective in HT-SELEX data analysis, and that RaptRanker successfully predicted the essential subsequence motifs in each identified sequence.

Scopus citations: 51

RaptScore: a large language model-based algorithm for versatile aptamer evaluation

Akira Kimura-Yamazaki, Tatsuo Adachi, Shigetaka Nakamura, Yoshikazu Nakamura, Michiaki Hamada

Nucleic Acids Research 2026.01

Identification and Sequence Characterization of Semi-extractable RNAs.

Chao Zeng, Michiaki Hamada

Methods in Molecular Biology (Clifton, N.J.) 2949 239 - 252 2026 [International journal]

We present customized bioinformatic protocols for the identification and sequence characterization of semi-extractable RNAs, which are promising in understanding RNA-centric phase separation. Through a comprehensive bioinformatic analysis of multiple public datasets, we describe the unique sequence characteristics and interaction profiles of these RNAs, shedding light on their biological significance. Our findings provide a foundation for future investigations into the molecular mechanisms of RNA-mediated phase separations and their implications in RNA biology. The associated analysis code can be accessed at ( https://github.com/zengXchao/serna_mimb ).

RNA degradation modulates unique aging-related gene expression in naked mole-rats

Ryuma Matsubara, Jinyu Wu, Atsuko Nakanishi Ozeki, Yoshimi Kawamura, Michiaki Hamada, Kyoko Miura, Nobuyoshi Akimitsu

2025.11

Blind Prediction of Complex Water and Ion Ensembles Around RNA in CASP16.

Rachael C Kretsch, Elisa Posani, Eugene F Baulin, Janusz M Bujnicki, Giovanni Bussi, Thomas E Cheatham 3rd, Shi-Jie Chen, Arne Elofsson, Masoud Amiri Farsani, Olivia N Fisher, M Michael Gromiha, Ayush Gupta, Michiaki Hamada, K Harini, Gang Hu, David Huang, Junichi Iwakiri, Anika Jain, Yuki Kagaya, Daisuke Kihara, Sebastian Kmiecik, Sowmya Ramaswamy Krishnan, Ikuo Kurisaki, Olivier Languin-Cattoën, Jun Li, Shanshan Li, Karim Malekzadeh, Tsukasa Nakamura, Wentao Ni, Chandran Nithin, Michael Z Palo, Joon Hong Park, Smita P Pilla, Simón Poblete, Fabrizio Pucci, Pranav Punuru, Anouka Saha, Kengo Sato, Ambuj Srivastava, Genki Terashi, Emilia Tugolukova, Jacob Verburgt, Qiqige Wuyun, Gül H Zerze, Kaiming Zhang, Sicheng Zhang, Wei Zheng, Yuanzhe Zhou, Wah Chiu, David A Case, Rhiju Das

Proteins 2025.11 [International journal]

Biomolecules rely on water and ions for stable folding, but these interactions are often transient, dynamic, or disordered and thus hidden from experiments and evaluation challenges that represent biomolecules as single, ordered structures. Here, we compare blindly predicted ensembles of water and ion structure to the cryo-EM densities observed around the Tetrahymena ribozyme at 2.2-2.3 Å resolution, collected through target R1260 in the CASP16 competition. Twenty-six groups participated in this solvation "cryo-ensemble" prediction challenge, submitting over 350 million atoms in total, offering the first opportunity to compare blind predictions of dynamic solvent shell ensembles to cryo-EM density. Predicted atomic ensembles were converted to density through local alignment and these densities were compared to the cryo-EM densities using Pearson correlation, Spearman correlation, mutual information, and precision-recall curves. These predictions show that an ensemble representation is able to capture information of transient or dynamic water and ions better than traditional atomic models, but there remains a large accuracy gap to the performance ceiling set by experimental uncertainty. Overall, molecular dynamics approaches best matched the cryo-EM density, with blind predictions from bussilab_plain_md, SoutheRNA, bussilab_replex, coogs2, and coogs3 outperforming the baseline molecular dynamics prediction. This study indicates that simulations of water and ions can be quantitatively evaluated with cryo-EM maps. We propose that further community-wide blind challenges can drive and evaluate progress in modeling water, ions, and other previously hidden components of biomolecular systems.

Scopus citations: 3

Blind prediction of complex water and ion ensembles around RNA in CASP16.

Rachael C Kretsch, Elisa Posani, Eugene F Baulin, Janusz M Bujnicki, Giovanni Bussi, Thomas E Cheatham, Shi-Jie Chen, Arne Elofsson, Masoud Amiri Farsani, Olivia N Fisher, M Michael Gromiha, Ayush Gupta, Michiaki Hamada, K Harini, Gang Hu, David Huang, Junichi Iwakiri, Anika Jain, Yuki Kagaya, Daisuke Kihara, Sebastian Kmiecik, Sowmya Ramaswamy Krishnan, Ikuo Kurisaki, Olivier Languin-Cattoën, Jun Li, Shanshan Li, Karim Malekzadeh, Tsukasa Nakamura, Wentao Ni, Chandran Nithin, Michael Z Palo, Joon Hong Park, Smita P Pilla, Simón Poblete, Fabrizio Pucci, Pranav Punuru, Anouka Saha, Kengo Sato, Ambuj Srivastava, Genki Terashi, Emilia Tugolukova, Jacob Verburgt, Qiqige Wuyun, Gül H Zerze, Kaiming Zhang, Sicheng Zhang, Wei Zheng, Yuanzhe Zhou, Wah Chiu, David A Case, Rhiju Das

bioRxiv : the preprint server for biology 2025.11 [International journal]

Biomolecules rely on water and ions for stable folding, but these interactions are often transient, dynamic, or disordered and thus hidden from experiments and evaluation challenges that represent biomolecules as single, ordered structures. Here, we compare blindly predicted ensembles of water and ion structure to the cryo-EM densities observed around the Tetrahymena ribozyme at 2.2-2.3 Å resolution, collected through target R1260 in the CASP16 competition. 26 groups participated in this solvation 'cryo-ensemble' prediction challenge, submitting over 350 million atoms in total, offering the first opportunity to compare blind predictions of dynamic solvent shell ensembles to cryo-EM density. Predicted atomic ensembles were converted to density through local alignment and these densities were compared to the cryo-EM densities using Pearson correlation, Spearman correlation, mutual information, and precision-recall curves. These predictions show that an ensemble representation is able to capture information of transient or dynamic water and ions better than traditional atomic models, but there remains a large accuracy gap to the performance ceiling set by experimental uncertainty. Overall, molecular dynamics approaches best matched the cryo-EM density, with blind predictions from bussilab_plain_md, SoutheRNA, bussilab_replex, coogs2, and coogs3 outperforming the baseline molecular dynamics prediction. This study indicates that simulations of water and ions can be quantitatively evaluated with cryo-EM maps. We propose that further community-wide blind challenges can drive and evaluate progress in modeling water, ions and other previously hidden components of biomolecular systems.

Fitness landscapes and thermodynamic approaches to development of nucleic acids enzymes: from classical methods to AI integration.

Shuntaro Takahashi, Michiaki Hamada, Hisae Tateishi-Karimata, Naoki Sugimoto

RSC Chemical Biology 6 ( 11 ) 1667 - 1685 2025.10 [International journal]

Nucleic acids (NA), namely DNA and RNA, dynamically fold and unfold to perform their functions in cells. Functional NAs include NA enzymes, such as ribozymes and DNAzymes. Their folding and target binding are governed by interactions between nucleobases, including base pairings, which follow thermodynamic principles. To elucidate biological mechanisms and enable diverse technical applications, it is essential to clarify the relationship between the primary sequence and the catalytic activity of NA enzymes. Unlike methods for predicting the stability of NA duplexes, which have been widely used for over half a century, predictive approaches for the catalytic activity of NA enzymes remain limited due to the low throughput of activity assays. However, recent advances in genome analysis and computational data science have significantly improved our understanding of the sequence-function relationship in NA enzymes. This article reviews the contributions of data-driven chemistry to understanding the reaction mechanisms of NA enzymes at the nucleotide level and predicting novel NA enzymes with catalytic activity from sequence information. Furthermore, we discuss potential databases for predicting NA enzyme activity under various solution conditions and their integration with artificial intelligence for future applications.

Downregulation of HLA Class I Expression Through HLA-A DNA Methylation Is Associated with Reduced CD8+ T Cell Infiltration in Cervical Cancer.

Daisuke Yoshimoto, Hitoshi Iuchi, Ayumi Taguchi, Kenbun Sone, Kana Tamai, Ayako Mori, Shuhei Kitamura, Anh Quynh Duong, Aya Ishizaka, Misako Kusakabe, Yoko Yamamoto, Akiko Takase, Masako Ikemura, Hiroko Matsunaga, Takayuki Iriyama, Iwao Kukimoto, Masahito Kawazu, Michiaki Hamada, Tetsuo Ushiku, Katsutoshi Oda, Haruko Takeyama, Yasushi Hirota, Yutaka Osuga

Cancer Immunology Research 2025.10 [International journal]

Human leukocyte antigen class I (HLA-I) is central to tumor immune recognition, but its regulatory mechanisms in cervical cancer remain poorly understood. This study aimed to elucidate the impact of HLA-I regulatory mechanisms on CD8+ T cell infiltration and identify distinct histotype-specific immune escape strategies across cervical cancer subtypes. Using 98 cervical cancer cases, including squamous cell carcinoma (SCC, n=53), adenocarcinoma (AC, n=32), gastric-type adenocarcinoma (GAS, n=5), small cell carcinoma (Small, n=4), and mixed histological types (MIX, n=4), we examined the relationship between CD8+ T cell infiltration patterns (categorized as Infiltrated, Excluded, or Absent) and HLA-I expression, HLA-A DNA methylation, and HLA-I loss of heterozygosity (LOH). CD8+ T cell infiltration patterns varied significantly by histological subtype (P<0.0001). SCC showed the highest frequency of the Infiltrated pattern (73.6%), whereas GAS and Small predominantly displayed an Absent pattern. Reduced CD8+ T cell infiltration correlated with poor survival (P<0.0001). HLA-I expression mirrored these trends, being highest in SCC and lowest in Small and GAS. HLA-A DNA methylation emerged as a key driver of HLA-I downregulation, leading to reduced CD8+ infiltration (P<0.05). In SCC, both HLA-A methylation and HLA-I LOH contributed to immune evasion; cases lacking these alterations exhibited the highest CD8+ T cell infiltration levels (P<0.01). This study identifies distinct HLA-I regulatory mechanisms in cervical cancer, highlighting HLA-A methylation, and particularly HLA-I LOH in SCC, as key drivers of immune evasion. These findings provide a foundation for developing predictive biomarkers and suggest that targeting these specific HLA-I regulatory mechanisms could enhance immunotherapy efficacy.

Nematode telomerase RNA hitchhikes on introns of germline–up-regulated genes

Jingjing Zhang, Eriko Kajikawa, Osamu Nishimura, Shunsuke Tagami, Mitsutaka Kadota, Michiaki Hamada, Takefumi Kondo, Hiroki Shibuya, Yutaka Takeda, Io Yamamoto, Morié Ishida, Masahiro Onoguchi, Hirohide Saito, Fumiya Ito, Shunsuke Sumi, Tatsuyuki Yoshii

Science 390 ( 6771 ) eads7778 2025.10 [International journal]

Telomerase is a ribonucleoprotein complex that elongates telomeric DNA, ensuring germline immortality. In this study, we identified the Caenorhabditis elegans telomerase RNA component 1 (terc-1), as the first known telomerase RNA expressed as an intronic long noncoding RNA (lncRNA), embedded in an intron of germline-up-regulated gene nmy-2. terc-1 undergoes splicing, polyadenylation, and nuclear RNA exosome-dependent maturation, stabilized by H/ACA small nucleolar ribonucleoproteins, thus co-opting the H/ACA small nucleolar RNA (snoRNA) biogenesis machinery. Mutations in terc-1 led to progressive telomere shortening and sterility in successive generations. Artificially transplanting the nmy-2 intron into the introns of germline-expressed genes but not non-germline-expressed genes restored germline immortality, highlighting the importance of genomic context. Our findings suggest that nematode telomerase RNA is a snoRNA-like intronic lncRNA that exploits the introns of germline-up-regulated genes to ensure species survival.

Elucidating Alterations in Viral and Human Gene Expression Due to Human Papillomavirus Integration by Using Multimodal RNA Sequencing

Kana Tamai, Sonoko Kinjo, Ayumi Taguchi, Kazunori Nagasaka, Daisuke Yoshimoto, Anh Quynh Duong, Yoko Yamamoto, Hitoshi Iuchi, Mayuyo Mori, Kenbun Sone, Michiaki Hamada, Kei Kawana, Kazuho Ikeo, Yasushi Hirota, Yutaka Osuga

Viruses 17 ( 10 ) 2025.10 [International journal]

Human papillomavirus (HPV) infection is a primary driver of cervical cancer. Integration of HPV into the human genome causes persistent expression of viral oncogenes E6 and E7, which promote carcinogenesis and disrupt host genomic function. However, the impact of integration on host gene expression remains incompletely understood. We used multimodal RNA sequencing, combining total RNA-seq and Cap Analysis of Gene Expression (CAGE), to clarify virus-host interactions after HPV integration. HPV-derived transcripts were detected in 17 of 20 clinical samples. In most specimens, transcriptional start sites (TSSs) showed predominant early promoter usage, and transcript patterns differed with detectable E4 RNA region. Notably, the high RNA expressions of E4 region and viral-human chimeric RNAs were mutually exclusive. Chimeric RNAs were identified in 13 of 17 samples, revealing 16 viral integration sites (ISs). CAGE data revealed two patterns of TSS upregulation centered on the ISs: a two-sided pattern (43.8%) and a one-sided pattern (31.3%). Total RNA-seq showed upregulation of 12 putative cancer-related genes near ISs, including MAGI1-AS1, HAS3, CASC8, BIRC2, and MMP12. These findings indicate that HPV integration drives transcriptional activation near ISs, enhancing expression of adjacent oncogenes. Our study deepens understanding of HPV-induced carcinogenesis and informs precision medicine strategies for cervical cancer.

Molecular and genetic evidence for the role of AMBRA1 in suppressing S-phase entry and tumorigenesis.

Hisako Akatsuka, Tomohiro Kashikawa, Kaori Masuhara, Mizuki Tokusanai, Chenyang Li, Yumi Iida, Chisa Okada-Yamaguchi, Yoshinori Okada, Masayuki Tanaka, Takahiro Suzuki, Norio Yamamoto, Katsuto Hozumi, Tomoaki Tanaka, Hirofumi Nakaoka, Kazuyoshi Hosomichi, Yu Hamaguchi, Michiaki Hamada, Yoshiki Shiraishi, Akihide Kamiya, Yoshihiko Nakamura, Kaito Harada, Abd Aziz Ibrahim, Takashi Yahata, Masato Ohtsuka, Naoya Nakamura, Hiroyuki Hosokawa, Minoru Kimura, Ituro Inoue, Takehito Sato

iScience 28 ( 8 ) 113054 - 113054 2025.08 [International journal]

AMBRA1, which was initially reported to be essential for nervous system development via autophagy and cell proliferation control, also functions as a tumor suppressor by regulating the ubiquitination of D-type cyclins through interaction with DDB1-Cullin4A/4B E3 ligase. We had identified a missense mutation in AMBRA1 through exome analysis of a family with Cowden syndrome. The patient-type mutant showed reduced DDB1 binding and impaired cyclin D degradation. To investigate the physiological role of AMBRA1, we generated Ambra1 flox mice crossed with Rosa-Cre-ERT2-Tg mice. These inducible Ambra1 conditional knockout mice exhibited increased body weight, organ size, and enhanced S phase entry, with elevated cyclin D expression in a cell lineage- or differentiation-specific manner. Notably, their susceptibility to spontaneous, radiation-, and chemically induced malignancies was significantly higher. These findings support the role of AMBRA1 as a tumor suppressor that regulates cyclin Ds, although other targets may also contribute.

Scopus citations: 1

Screening and machine learning-based prediction of translation-enhancing peptides that reduce ribosomal stalling in Escherichia coli

Teruyo Ojima-Kato, Gentaro Yokoyama, Hideo Nakano, Michiaki Hamada, Chie Motono

RSC Chemical Biology 2025.07 [International journal]

We previously reported that the nascent SKIK peptide enhances translation and alleviates ribosomal stalling caused by arrest peptides (APs) such as SecM and polyproline when positioned immediately upstream of the APs in both Escherichia coli in vivo and in vitro translation systems. In this study, we conducted a comprehensive screening of translation-enhancing peptides (TEPs) using a randomized artificial tetrapeptide library. The screening focused on the ability of the peptides to suppress SecM AP-induced translational stalling in E. coli cells. We identified TEPs exhibiting a range of translation-enhancing activities. In vitro translation analysis suggested that the fourth amino acid in the tetrapeptide influences the reduction of SecM AP-mediated stalling. Additionally, we developed a machine learning model using a random forest algorithm to predict TEP activity, which showed a strong correlation with experimentally measured activities. These findings provide a compact peptide toolkit and a data-driven approach for alleviating AP-induced ribosome stalling, with potential applications in synthetic biology.

Integrative epigenome and transcriptome analyses reveal transcriptional programs differentially regulated by ASCL1 and NEUROD1 in small cell lung cancer

Hiroshi Takumida, Akira Saito, Yugo Okabe, Yasuhiro Terasaki, Yu Mikami, Hidenori Tanaka, Masami Suzuki, Yu Hamaguchi, Chao Zeng, Michiaki Hamada, Hiroshi I. Suzuki, Hidenori Kage, Masafumi Horie

Oncogene 44 ( 34 ) 3113 - 3125 2025.07 [International journal]

Small cell lung cancer (SCLC), an aggressive neuroendocrine carcinoma, has an extremely poor prognosis. ASCL1 and NEUROD1 are key regulators of neuroendocrine features, and previous studies have suggested that SCLC plasticity occurs during the transition from ASCL1-positive (SCLC-A) to NEUROD1-positive (SCLC-N) subtypes. In this study, we attempted to understand the transcriptional programs governed by ASCL1 and NEUROD1 to identify markers of SCLC plasticity. Immunohistochemistry and epigenome and transcriptome analyses in ASCL1/NEUROD1 double-positive SCLC cells (SCLC-A/N) revealed co-expression of ASCL1 and NEUROD1 in almost half of SCLC cases. Genome-wide profiling of histone modifications, ASCL1 and NEUROD1 binding sites, and gene co-expression patterns revealed that both ASCL1 and NEUROD1 are active in SCLC-A/N and regulate partially distinct target genes. Furthermore, SCLC-A/N exhibited characteristics that were intermediate between SCLC-A and SCLC-N subtypes. NEUROD1 knockout, followed by RNA-seq, suggested an association between NEUROD and NHLH transcription factors that might shape the NEUROD1-mediated regulatory network. Small RNA-seq further indicated that miR-139-5p is specifically expressed in NEUROD1-positive SCLC, and transcriptomic studies suggested that miR-139-5p might regulate an array of pathologically relevant genes in collaboration with other NEUROD1-associated miRNAs. Our integrative analyses provide deeper insights into SCLC heterogeneity and multi-layered transcriptional programs differentially governed by ASCL1 and NEUROD1.

DOI PubMed

Scopus citations: 2
Mitotic phosphorylation of ADAR1 regulates its centromeric localization and is required for faithful mitotic progression

Yuxi Yang, Mai Kubota, Kokone Hasegawa, Chao Zeng, Kazuko Nishikura, Michiaki Hamada, Masayuki Sakurai

2025.05

DOI
Deciphering the comprehensive relationship between 5′UTR and 3′UTR sequences with deep learning

Kanta Suga, Keisuke Yamada, Michiaki Hamada

2025.05

DOI
Hidden challenges in evaluating spillover risk of zoonotic viruses using machine learning models

Junna Kawasaki, Tadaki Suzuki, Michiaki Hamada

Communications Medicine 5 ( 1 ) 187 - 187 2025.05 [International journal]

BACKGROUND: Machine learning models have been deployed to assess the zoonotic spillover risk of viruses by identifying their potential for human infectivity. However, the lack of comprehensive datasets for viral infectivity poses a major challenge, limiting the predictable range of viruses. METHODS: In this study, we address this limitation through two key strategies: constructing expansive datasets across 26 viral families and developing the BERT-infect model, which leverages large language models pre-trained on extensive nucleotide sequences. RESULTS: Here we show that our approach substantially boosts model performance. This enhancement is particularly notable in segmented RNA viruses, which are involved with severe zoonoses but have been overlooked due to limited data availability. Our model also exhibits high predictive performance even with partial viral sequences, such as high-throughput sequencing reads or contig sequences from de novo sequence assemblies, indicating the model's applicability for mining zoonotic viruses from virus metagenomic data. Furthermore, models trained on data up to 2018 demonstrate robust predictive capability for most viruses identified post-2018. Nonetheless, high-resolution evaluation based on phylogenetic analysis reveals general limitations in current machine learning models: the difficulty in alerting the human infectious risk in specific zoonotic viral lineages, including SARS-CoV-2. CONCLUSIONS: Our study provides a comprehensive benchmark for viral infectivity prediction models and highlights unresolved issues in fully exploiting machine learning to prepare for future zoonotic threats.

DOI PubMed

Scopus citations: 6
Multi-objective computational optimization of human 5′ UTR sequences

Keisuke Yamada, Kanta Suga, Naoko Abe, Koji Hashimoto, Susumu Tsutsumi, Masahito Inagaki, Fumitaka Hashiya, Hiroshi Abe, Michiaki Hamada

Briefings in Bioinformatics 26 ( 3 ) 2025.05 [International journal]

The computational design of messenger RNA (mRNA) sequences is a critical technology for both scientific research and industrial applications. Recent advances in prediction and optimization models have enabled the automatic scoring and optimization of 5′ UTR sequences, key upstream elements of mRNA. However, fully automated design of 5′ UTR sequences with more than two objective scores has not yet been explored. In this study, we present a computational pipeline that optimizes human 5′ UTR sequences in a multi-objective framework, addressing up to four distinct and conflicting objectives. Our work represents an important advancement in the multi-objective computational design of mRNA sequences, paving the way for more sophisticated mRNA engineering.
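The idea of optimizing against several conflicting objectives can be illustrated with a Pareto-front filter. The sketch below is not the authors' pipeline: the candidate names and their four scores are invented for illustration, and every objective is assumed to be maximized.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if score vector a Pareto-dominates b (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, vec in scores.items():
        dominated = any(
            dominates(other, vec)
            for other_name, other in scores.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical 5' UTR candidates scored on four made-up objectives.
candidates = {
    "utr_a": (0.9, 0.2, 0.5, 0.7),
    "utr_b": (0.8, 0.1, 0.4, 0.6),  # worse than utr_a on every objective
    "utr_c": (0.3, 0.9, 0.6, 0.2),  # best on the second and third objectives
    "utr_d": (0.9, 0.2, 0.5, 0.7),  # ties with utr_a, so neither dominates
}
front = pareto_front(candidates)
print(front)
```

With more than two objectives there is usually no single best sequence, so an optimizer of this kind typically reports the whole non-dominated set and leaves the final trade-off to the user.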

DOI PubMed

Scopus
REPrise: de novo interspersed repeat detection using inexact seeding.

Atsushi Takeda, Daisuke Nonaka, Yuta Imazu, Tsukasa Fukunaga, Michiaki Hamada

Mobile DNA 16 ( 1 ) 16 - 16 2025.04 [International journal]

BACKGROUND: Interspersed repeats occupy a large part of many eukaryotic genomes, and thus their accurate annotation is essential for various genome analyses. Database-free de novo repeat detection approaches are powerful for annotating genomes that lack well-curated repeat databases. However, existing tools do not yet have sufficient repeat detection performance. RESULTS: In this study, we developed REPrise, a de novo interspersed repeat detection software program based on a seed-and-extension method. Although the algorithm of REPrise is similar to that of RepeatScout, which is currently the de facto standard tool, we incorporated three unique techniques into REPrise: inexact seeding, affine gap scoring and loose masking. Analyses of rice and simulation genome datasets showed that REPrise outperformed RepeatScout in terms of sensitivity, especially when the repeat sequences contained many mutations. Furthermore, when applied to the complete human genome dataset T2T-CHM13, REPrise demonstrated the potential to detect novel repeat sequence families. CONCLUSION: REPrise can detect interspersed repeats with high sensitivity even in long genomes. Our software enhances repeat annotation in diverse genomic studies, contributing to a deeper understanding of genomic structures.
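To make the notion of inexact seeding concrete, the sketch below counts how often each k-mer occurs in a sequence when a limited number of substitutions is tolerated. It is a toy illustration under stated assumptions, not the REPrise algorithm: the sequence, the value of k, the mismatch threshold, and the function names are all hypothetical, and the quadratic comparison is only practical for short inputs.

```python
from collections import Counter
from typing import Dict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def seed_frequencies(genome: str, k: int, max_mismatch: int) -> Dict[str, int]:
    """For each distinct k-mer, count the windows matching it within max_mismatch substitutions."""
    windows = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    exact = Counter(windows)
    return {
        seed: sum(n for other, n in exact.items() if hamming(seed, other) <= max_mismatch)
        for seed in exact
    }


# Toy sequence: three diverged copies of a hypothetical repeat "ACGTAC",
# separated by spacer bases.
genome = "ACGTACTTTTACGAACGGGGACGTCC"
exact = seed_frequencies(genome, k=6, max_mismatch=0)
inexact = seed_frequencies(genome, k=6, max_mismatch=1)
print(exact["ACGTAC"], inexact["ACGTAC"])
```

With exact matching the seed is seen only once and would not stand out as repetitive, whereas allowing one mismatch recovers all three diverged copies. This is the kind of sensitivity gain on mutated repeats that the summary describes.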

DOI PubMed

Scopus
Generation of RNA aptamers against chikungunya virus E2 envelope protein.

Kaku Goto, Ryo Amano, Akiko Ichinose, Akiya Michishita, Michiaki Hamada, Yoshikazu Nakamura, Masaki Takahashi

Journal of virology 99 ( 3 ) e0209524 2025.03 [International journal]

Nucleic acid aptamers are a promising drug modality, whereas the generation of virus-neutralizing aptamers has remained difficult due to the lack of a robust system for targeting the viral particles of interest. Here, we took advantage of our latest platform technology of Systematic Evolution of Ligands by EXponential enrichment (SELEX) with virus-like particles (VLPs) and targeted chikungunya virus (CHIKV) as a model, the pathogenic reemerging virus with an unmet need for control. The identified aptamer against CHIKV-VLPs, Apt#1, and its truncated derivatives showed neutralizing activity with nanomolar IC50 values in a cell-based assay system using a pseudoviral particle of CHIKV (CHIKVpp). An antiviral-based chemical genetics approach revealed significant competition of Apt#1 with suramin, a reported interactant with domain A of the E2 envelope protein (E2DA), in both CHIKVpp and surface plasmon resonance (SPR) analyses, predicting E2DA to be the Apt#1 interface. In addition, Apt#1 interfered with the attachment of CHIKVpp, collectively suggesting its property as an attachment inhibitor via E2DA of CHIKV. Thus, the generation of the VLP-targeted aptamers proved to contribute to anti-CHIKV strategies and confirmed the utility of the platform as a novel and viable option for the development of neutralizing agents against viral particles of interest. IMPORTANCE: Our latest SELEX technology using VLPs has generated aptamers that bind the native conformation of the incorporated envelope protein and achieve the virus binding and neutralizing effects. Indeed, the aptamer-probed target E2DA is a representative neutralization site on the surface of the viral particle, validating the utility of the VLP-driven procedure. Simultaneously, the enhanced antiviral effects of the aptamer in combination with approved drugs using the CHIKVpp assay with human cells indicated potential therapeutic strategies that are expected to help address unmet needs in CHIKV control. The robust affinity of the aptamer to viral particles demonstrated by SPR analysis can also lead to conjugates with antivirals as guiding molecules and aptasensors for diagnostic tools. Overall, our VLP-based method provided anti-CHIKV as well as a versatile platform applicable to other emerging and reemerging viruses, in preparation for outbreaks with the need for rapid development of antiviral strategies as next-generation theranostics.

DOI PubMed

Scopus citations: 2
Involvement of lncRNA MIR205HG in idiopathic pulmonary fibrosis and IL-33 regulation via Alu elements

Tsuyoshi Takashima, Chao Zeng, Eitaro Murakami, Naoko Fujiwara, Masaharu Kohara, Hideki Nagata, Zhaozu Feng, Ayako Sugai, Yasue Harada, Rika Ichijo, Daisuke Okuzaki, Satoshi Nojima, Takahiro Matsui, Yasushi Shintani, Gota Kawai, Michiaki Hamada, Tetsuro Hirose, Kazuhiko Nakatani, Eiichi Morii

JCI Insight 10 ( 5 ) 2025.03 [International journal]

Idiopathic pulmonary fibrosis (IPF) causes remodeling of the distal lung. Pulmonary remodeling is histologically characterized by fibrosis, as well as appearance of basal cells; however, the involvement of basal cells in IPF remains unclear. Here, we focus on the long noncoding RNA MIR205HG, which is highly expressed in basal cells, using RNA sequencing. Through RNA sequencing of genetic manipulations using primary cells and organoids, we discovered that MIR205HG regulates IL-33 expression. Mechanistically, the AluJb element of MIR205HG plays a key role in IL-33 expression. Additionally, we identified a small molecule that targets the AluJb element, leading to decreased IL-33 expression. IL-33 is known to induce type 2 innate lymphoid cells (ILC2s), and we observed that MIR205HG expression was positively correlated with the number of ILC2s in patients with IPF. Collectively, these findings provide insights into the mechanisms by which basal cells contribute to IPF and suggest potential therapeutic targets.

DOI PubMed

Scopus citations: 6
Hybrid MD-generative modeling expands RNA ensembles to include cryptic ligand-binding conformations: application to HIV-1 TAR

Ikuo Kurisaki, Michiaki Hamada

2025.01

DOI
Deep learning generates apo RNA conformations with cryptic ligand binding site

Ikuo Kurisaki, Michiaki Hamada

2025.01

RNA plays vital roles in diverse biological processes and has therefore drawn much attention as a potential drug target. Structure Based Drug Design (SBDD) for RNA is a promising approach, but the available tertiary structures of RNA-ligand complexes are so far substantially limited. Theoretical RNA-ligand docking simulations therefore play a central role. However, the success of SBDD depends strongly on using RNA structures sufficiently close to ligand-boundable conformations, which do not necessarily appear among experimentally resolved or theoretically sampled apo RNA conformations. To overcome this difficulty in SBDD for RNA, we efficiently sample ligand-boundable apo RNA conformations using a generative deep learning (DL) model. We succeeded in generating HIV-1 Transactivation Response Element (TAR) conformations with a cryptic MV2003 binding cavity without a priori knowledge of the cavity. These conformations bind to MV2003 with binding scores similar to those calculated for experimentally resolved TAR-MV2003 complexes, illustrating a new application of DL to promote SBDD for RNA.

DOI
RaptGen-UI: an integrated platform for exploring and analyzing the sequence landscape of HT-SELEX experiments

Ryota Nakano, Natsuki Iwano, Akiko Ichinose, Michiaki Hamada

Bioinformatics Advances 5 ( 1 ) vbaf120 2024.12 [International journal]

SUMMARY: RaptGen-UI provides an intuitive graphical user interface for exploring and analyzing the sequence landscape of high-throughput (HT)-SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiments through machine learning-driven visualization with optimization capabilities. This software enables wet-lab researchers to efficiently analyze HT-SELEX datasets and optimize RNA aptamers without requiring extensive computational expertise. The containerized architecture ensures secure local deployment and supports both high-performance Graphics Processing Unit (GPU) acceleration and CPU-only environments, making it suitable for various research settings. AVAILABILITY AND IMPLEMENTATION: This software is a web-based application running locally on the user's PC. The frontend is constructed using Next.js and Plotly.js with TypeScript, while the backend is developed using FastAPI, Celery, PostgreSQL RDBMS, and Redis with Python. Each module is encapsulated within Docker containers and deployed via Docker Compose. The system supports both CUDA GPU and CPU-only environments. Source code and documentation are freely available at https://github.com/hmdlab/RaptGen-UI.

DOI PubMed

Scopus
The MTR4/hnRNPK complex surveils aberrant polyadenylated RNAs with multiple exons.

Kenzui Taniue, Anzu Sugawara, Chao Zeng, Han Han, Xinyue Gao, Yuki Shimoura, Atsuko Nakanishi Ozeki, Rena Onoguchi-Mizutani, Masahide Seki, Yutaka Suzuki, Michiaki Hamada, Nobuyoshi Akimitsu

Nature communications 15 ( 1 ) 8684 - 8684 2024.10 [Refereed] [International journal]

RNA surveillance systems degrade aberrant RNAs that result from defective transcriptional termination, splicing, and polyadenylation. Defective RNAs in the nucleus are recognized by RNA-binding proteins and MTR4, and are degraded by the RNA exosome complex. Here, we detect aberrant RNAs in MTR4-depleted cells using long-read direct RNA sequencing and 3' sequencing. MTR4 destabilizes intronic polyadenylated transcripts generated by transcriptional read-through over one or more exons, termed 3' eXtended Transcripts (3XTs). MTR4 also associates with hnRNPK, which recognizes 3XTs with multiple exons. Moreover, the aberrant protein translated from KCTD13 3XT is a target of the hnRNPK-MTR4-RNA exosome pathway and forms aberrant condensates, which we name KCTD13 3eXtended Transcript-derived protein (KeXT) bodies. Our results suggest that RNA surveillance in human cells inhibits the formation of condensates of a defective polyadenylated transcript-derived protein.

DOI PubMed

Scopus
Landscape of evolutionary arms races between transposable elements and KRAB-ZFP family.

Masato Kosuge, Jumpei Ito, Michiaki Hamada

Scientific reports 14 ( 1 ) 23358 - 23358 2024.10 [Refereed] [International journal]

Authorship：Last author, Corresponding author

Transposable elements (TEs) are mobile parasitic sequences that have expanded within the host genome. It has been hypothesized that host organisms have expanded the Krüppel-associated box-containing zinc finger proteins (KRAB-ZFPs), which epigenetically suppress TEs, to counteract disorderly TE transpositions. This process is referred to as the evolutionary arms race. However, the extent to which this evolutionary arms race occurred across various TE families remains unclear. In the present study, we systematically explored the evolutionary arms race between TE families and human KRAB-ZFPs using public ChIP-seq data. We discovered and characterized new instances of evolutionary arms races with KRAB-ZFPs in endogenous retroviruses. Furthermore, we found that the regulatory landscape shaped by this arms race contributed to the gene regulatory networks. In summary, our results provide insight into the impact of the evolutionary arms race on TE families, the KRAB-ZFP family, and host gene regulatory networks.

DOI PubMed

Scopus citations: 10
A chimeric RNA consisting of siRNA and aptamer for inhibiting dengue virus replication

Ryo Amano, Masaki Takahashi, Kazumi Haga, Mizuki Yamamoto, Kaku Goto, Akiko Ichinose, Michiaki Hamada, Jin Gohda, Jun-ichiro Inoue, Yasushi Kawaguchi, Meng Ling Moi, Yoshikazu Nakamura

NAR Molecular Medicine 1 ( 4 ) ugae025 2024.10 [International journal]

Silencing viruses by chimeric RNAs, wherein small interfering RNAs (siRNAs) targeting viral RNAs are conjugated with RNA aptamers specific to viral envelope proteins, is a promising treatment for viral infectious diseases; however, practical evaluations are apparently lacking. Here, we present a chimeric RNA comprising an siRNA and an RNA aptamer, both of which target all four serotypes of dengue virus (DENV), for suppressing DENV replication. The siRNA targeting consensus sequences in the 3'-UTR of all four DENV serotypes suppressed the expression of a reporter gene carrying the siRNA-targeted sequence of DENV-1 by ∼70%. The RNA aptamer generated by VLP-SELEX using DENV-1-VLPs as baits showed an affinity for all four DENV-VLP serotypes, presumably without affecting the fusion process. After conjugation of each modality, the chimeric RNA significantly suppressed authentic DENV-1 and DENV-2 production in vitro. Our study provides evidence that chimeric RNA is a potentially effective antiviral agent.

DOI PubMed
The Lomb-Scargle periodogram-based differentially expressed gene detection along pseudotime

Hitoshi Iuchi, Michiaki Hamada

2024.08 [Refereed]

Authorship：Last author, Corresponding author

Abstract

Motivation

In recent years, single-cell RNA sequencing (scRNA-seq) has provided high-resolution snapshots of biological processes and has contributed to the understanding of cell dynamics. Trajectory inference has the potential to provide a quantitative representation of cell dynamics, and several trajectory inference algorithms have been developed. However, the downstream analysis of trajectory inference, such as the analysis of differentially expressed genes (DEG), remains challenging.

Results

In this study, we introduce a Lomb-Scargle (LS) periodogram-based algorithm for identifying DEGs associated with pseudotime in a trajectory analysis. The algorithm is capable of analyzing any inferred trajectory, including tree structures with multiple branching points, leading to diverse cell types. We validated this approach using simulated data and real datasets, and our results showed that our approach was superior when performing DEG analysis on complex structured trajectories. Our approach will contribute to gene characterization in trajectory analysis and help gain deeper biological insights.

Availability

All code used in our proposed method can be found at https://github.com/hiuchi/LS.

Contact

hitoshi.iuchi@hamadalab.com

Supplementary information

Supplementary data are available at Journal Name online.
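As background for the Lomb-Scargle periodogram named in this summary, the sketch below computes the classical normalized periodogram of one gene's expression over unevenly spaced pseudotime. It illustrates only the periodogram itself and is not the authors' differential-expression procedure (their code is at the repository cited above). The cell count, pseudotime values, expression profile, and frequency grid are hypothetical.

```python
import math
from typing import List, Sequence


def lomb_scargle(t: Sequence[float], y: Sequence[float], freqs: Sequence[float]) -> List[float]:
    """Classical normalized Lomb-Scargle power of y(t) at each frequency (cycles per unit of t)."""
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]
    var = sum(r * r for r in resid) / (n - 1)
    powers = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # The offset tau makes the sine and cosine terms orthogonal on the sample times.
        tau = math.atan2(
            sum(math.sin(2.0 * w * ti) for ti in t),
            sum(math.cos(2.0 * w * ti) for ti in t),
        ) / (2.0 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        cos_term = sum(r * ci for r, ci in zip(resid, c)) ** 2 / sum(ci * ci for ci in c)
        sin_term = sum(r * si for r, si in zip(resid, s)) ** 2 / sum(si * si for si in s)
        powers.append((cos_term + sin_term) / (2.0 * var))
    return powers


# Unevenly spaced pseudotime values in [0, 1] for 40 hypothetical cells.
n_cells = 40
pseudotime = [(i / (n_cells - 1)) ** 1.5 for i in range(n_cells)]
# Hypothetical gene whose expression oscillates twice along the trajectory.
expression = [math.sin(2.0 * math.pi * 2.0 * ti) for ti in pseudotime]

freqs = [1.0, 2.0, 3.0, 4.0, 5.0]
power = lomb_scargle(pseudotime, expression, freqs)
peak_freq = freqs[power.index(max(power))]
print(peak_freq)
```

The periodogram does not require evenly spaced samples, which is why it suits pseudotime, where cells are distributed irregularly along a trajectory.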

DOI
Identification of a novel RNA transcript TISPL upregulated by stressors that stimulate ATF4.

Yutaro Wakabayashi, Aika Shimono, Yuki Terauchi, Chao Zeng, Michiaki Hamada, Kentaro Semba, Shinya Watanabe, Kosuke Ishikawa

Gene 917 148464 - 148464 2024.07 [Refereed] [International journal]

Cells sense, respond, and adapt to environmental conditions that cause stress. In a previous study using HeLa cells, we isolated reporter cells responding to the endoplasmic reticulum (ER) stress inducers, thapsigargin and tunicamycin, using a highly sensitive promoter trap vector system. Splinkerette PCR and 5' rapid amplification of cDNA ends (5' RACE) identified a novel transcript that is upregulated by ER stress. Its endogenous expression increased approximately 10-fold in response to thapsigargin and tunicamycin within 1 h, but was down-regulated after 4 h. Because the transcript starts from an intron of a long noncoding RNA known as LINC-PINT, we designated the newly identified transcript TISPL (transcript induced by stressors from LINC-PINT locus). TISPL was also expressed under several other stress conditions. It increased more than 10-fold upon glucose starvation and 7-fold upon arsenite exposure. Furthermore, in silico analyses, including a ChIP-Atlas search, revealed that there is an ATF4-binding region with a C/EBP-ATF response element (CARE) downstream of the transcription start site of TISPL. Based on these results, we hypothesized that TISPL may be induced by the phospho-eIF2α/ATF4 axis of the integrated stress response pathway, which is known to be activated by the stress conditions listed above. As expected, knockout of ATF4 abolished the stress-induced upregulation of TISPL. Our results indicate that TISPL may be a useful biomarker for detecting stress conditions that activate ATF4. Our highly sensitive trap vector system proved beneficial in discovering new biomarkers.

DOI PubMed

Scopus citations: 3
Inflammation primes the murine kidney for recovery by activating AZIN1 adenosine-to-inosine editing.

Segewkal Hawaze Heruye, Jered Myslinski, Chao Zeng, Amy Zollman, Shinichi Makino, Azuma Nanamatsu, Quoseena Mir, Sarath Chandra Janga, Emma H Doud, Michael T Eadon, Bernhard Maier, Michiaki Hamada, Tuan M Tran, Pierre C Dagher, Takashi Hato

The Journal of clinical investigation 134 ( 17 ) 2024.07 [Refereed] [International journal]

The progression of kidney disease varies among individuals, but a general methodology to quantify disease timelines is lacking. Particularly challenging is the task of determining the potential for recovery from acute kidney injury following various insults. Here, we report that quantitation of post-transcriptional adenosine-to-inosine (A-to-I) RNA editing offers a distinct genome-wide signature, enabling the delineation of disease trajectories in the kidney. A well-defined murine model of endotoxemia permitted the identification of the origin and extent of A-to-I editing, along with temporally discrete signatures of double-stranded RNA stress and Adenosine Deaminase isoform switching. We found that A-to-I editing of Antizyme Inhibitor 1 (AZIN1), a positive regulator of polyamine biosynthesis, serves as a particularly useful temporal landmark during endotoxemia. Our data indicate that AZIN1 A-to-I editing, triggered by preceding inflammation, primes the kidney and activates endogenous recovery mechanisms. By comparing genetically modified human cell lines and mice locked in either A-to-I edited or uneditable states, we uncovered that AZIN1 A-to-I editing not only enhances polyamine biosynthesis but also engages glycolysis and nicotinamide biosynthesis to drive the recovery phenotype. Our findings imply that quantifying AZIN1 A-to-I editing could potentially identify individuals who have transitioned to an endogenous recovery phase. This phase would reflect their past inflammation and indicate their potential for future recovery.

DOI PubMed

Scopus citations: 9
Selection and characterization of aptamers targeting the Vif-CBFβ-ELOB-ELOC-CUL5 complex.

Kazuyuki Kumagai, Keisuke Kamba, Takuya Suzuki, Yuto Sekikawa, Chisato Yuki, Michiaki Hamada, Kayoko Nagata, Akifumi Takaori-Kondo, Li Wan, Masato Katahira, Takashi Nagata, Taiichi Sakamoto

Journal of biochemistry 176 ( 3 ) 205 - 215 2024.05 [Refereed] [International journal]

The viral infectivity factor (Vif) of human immunodeficiency virus 1 forms a complex with host proteins, designated as Vif-CBFβ-ELOB-ELOC-CUL5 (VβBCC), initiating the ubiquitination and subsequent proteasomal degradation of the human antiviral protein APOBEC3G (A3G), thereby negating its antiviral function. While recent cryo-electron microscopy (cryo-EM) studies have implicated RNA molecules in the Vif-A3G interaction that leads to A3G ubiquitination, our findings indicated that the VβBCC complex can also directly impede A3G-mediated DNA deamination, bypassing the proteasomal degradation pathway. Employing the Systematic Evolution of Ligands by EXponential enrichment (SELEX) method, we have identified RNA aptamers with high affinity for the VβBCC complex. These aptamers not only bind to the VβBCC complex but also reinstate A3G's DNA deamination activity by inhibiting the complex's function. Moreover, we delineated the sequences and secondary structures of these aptamers, providing insights into the mechanistic aspects of A3G inhibition by the VβBCC complex. Analysis using selected aptamers will enhance our understanding of the inhibition of A3G by the VβBCC complex, offering potential avenues for therapeutic intervention.

DOI PubMed

Scopus
Prediction of antibiotic resistance mechanisms using a protein language model

Yagimoto K, Hosoda S, Sato M, Hamada M

Bioinformatics (Oxford, England) 40 ( 10 ) 2024.05 [Refereed] [International journal]

Authorship：Last author, Corresponding author

MOTIVATION: Antibiotic resistance has emerged as a major global health threat, with an increasing number of bacterial infections becoming difficult to treat. Predicting the underlying resistance mechanisms of antibiotic resistance genes (ARGs) is crucial for understanding and combating this problem. However, existing methods struggle to accurately predict resistance mechanisms for ARGs with low similarity to known sequences and lack sufficient interpretability of the prediction models. RESULTS: In this study, we present a novel approach for predicting ARG resistance mechanisms using ProteinBERT, a protein language model (pLM) based on deep learning. Our method outperforms state-of-the-art techniques on diverse ARG datasets, including those with low homology to the training data, highlighting its potential for predicting the resistance mechanisms of unknown ARGs. Attention analysis of the model reveals that it considers biologically relevant features, such as conserved amino acid residues and antibiotic target binding sites, when making predictions. These findings provide valuable insights into the molecular basis of antibiotic resistance and demonstrate the interpretability of pLMs, offering a new perspective on their application in bioinformatics. AVAILABILITY AND IMPLEMENTATION: The source code is available for free at https://github.com/hmdlab/ARG-BERT. The output results of the model are published at https://waseda.box.com/v/ARG-BERT-suppl.

DOI PubMed
RaptGen-Assisted Generation of an RNA/DNA Hybrid Aptamer against SARS-CoV-2 Spike Protein.

Tatsuo Adachi, Shigetaka Nakamura, Akiya Michishita, Daiki Kawahara, Mizuki Yamamoto, Michiaki Hamada, Yoshikazu Nakamura

Biochemistry 63 ( 7 ) 906 - 912 2024.03 [Refereed] [International journal]

Optimization of aptamers in length and chemistry is crucial for industrial applications. Here, we developed aptamers against the SARS-CoV-2 spike protein and achieved optimization with a deep-learning-based algorithm, RaptGen. We conducted a primer-less SELEX against the receptor binding domain (RBD) of the spike with an RNA/DNA hybrid library, and the resulting sequences were subjected to RaptGen analysis. Based on the sequence profiling by RaptGen, a short truncated aptamer of 26 nucleotides was obtained and further optimized by chemical modification of relevant nucleotides. The resulting aptamer bound to the RBD not only of wild-type SARS-CoV-2 but also of its variants, SARS-CoV-1, and Middle East respiratory syndrome coronavirus (MERS-CoV). We concluded that RaptGen-assisted discovery is efficient for developing optimized aptamers.

DOI PubMed

Scopus citations: 7
Systematic discovery of directional regulatory motifs associated with human insulator sites

Naoki Osato, Michiaki Hamada

2024.01

DOI
Inflammation primes the kidney for recovery by activating AZIN1 A-to-I editing.

Segewkal Heruye, Jered Myslinski, Chao Zeng, Amy Zollman, Shinichi Makino, Azuma Nanamatsu, Quoseena Mir, Sarath Chandra Janga, Emma H Doud, Michael T Eadon, Bernhard Maier, Michiaki Hamada, Tuan M Tran, Pierre C Dagher, Takashi Hato

bioRxiv : the preprint server for biology 2023.11 [International journal]

The progression of kidney disease varies among individuals, but a general methodology to quantify disease timelines is lacking. Particularly challenging is the task of determining the potential for recovery from acute kidney injury following various insults. Here, we report that quantitation of post-transcriptional adenosine-to-inosine (A-to-I) RNA editing offers a distinct genome-wide signature, enabling the delineation of disease trajectories in the kidney. A well-defined murine model of endotoxemia permitted the identification of the origin and extent of A-to-I editing, along with temporally discrete signatures of double-stranded RNA stress and Adenosine Deaminase isoform switching. We found that A-to-I editing of Antizyme Inhibitor 1 (AZIN1), a positive regulator of polyamine biosynthesis, serves as a particularly useful temporal landmark during endotoxemia. Our data indicate that AZIN1 A-to-I editing, triggered by preceding inflammation, primes the kidney and activates endogenous recovery mechanisms. By comparing genetically modified human cell lines and mice locked in either A-to-I edited or uneditable states, we uncovered that AZIN1 A-to-I editing not only enhances polyamine biosynthesis but also engages glycolysis and nicotinamide biosynthesis to drive the recovery phenotype. Our findings imply that quantifying AZIN1 A-to-I editing could potentially identify individuals who have transitioned to an endogenous recovery phase. This phase would reflect their past inflammation and indicate their potential for future recovery.

DOI PubMed
DeepRaccess: high-speed RNA accessibility prediction using deep learning

Kaisei Hara, Natsuki Iwano, Tsukasa Fukunaga, Michiaki Hamada

Frontiers in Bioinformatics 3 1275787 - 1275787 2023.10 [Refereed] [International journal]

Authorship：Corresponding author

RNA accessibility is a useful RNA secondary structural feature for predicting RNA-RNA interactions and translation efficiency in prokaryotes. However, conventional accessibility calculation tools, such as Raccess, are computationally expensive and require considerable computational time to perform transcriptome-scale analysis. In this study, we developed DeepRaccess, which predicts RNA accessibility based on deep learning methods. DeepRaccess was trained to take artificial RNA sequences as input and to predict the accessibility of these sequences as calculated by Raccess. Simulation and empirical dataset analyses showed that the accessibility predicted by DeepRaccess was highly correlated with the accessibility calculated by Raccess. In addition, we confirmed that DeepRaccess could predict protein abundance in E. coli with moderate accuracy from the sequences around the start codon. We also demonstrated that DeepRaccess achieved a speed-up of tens to hundreds of times in a GPU environment. The source code and the trained models of DeepRaccess are freely available at https://github.com/hmdlab/DeepRaccess.

DOI PubMed

Scopus citations: 3
Neat1 lncRNA organizes the inflammatory gene expressions in the dorsal root ganglion in neuropathic pain caused by nerve injury

Motoyo Maruyama, Atsushi Sakai, Tsukasa Fukunaga, Yoshitaka Miyagawa, Takashi Okada, Michiaki Hamada, Hidenori Suzuki

Frontiers in Immunology 14 1185322 - 1185322 2023.08 [Refereed] [International journal]

Primary sensory neurons regulate inflammatory processes in innervated regions through neuro-immune communication. However, how their immune-modulating functions are regulated in concert remains largely unknown. Here, we show that Neat1 long non-coding RNA (lncRNA) organizes the proinflammatory gene expressions in the dorsal root ganglion (DRG) in chronic intractable neuropathic pain in rats. Neat1 was abundantly expressed in the DRG and was upregulated after peripheral nerve injury. Neat1 overexpression in primary sensory neurons caused mechanical and thermal hypersensitivity, whereas its knockdown alleviated neuropathic pain. Bioinformatics analysis of comprehensive transcriptome changes indicated the inflammatory response was the most relevant function of genes upregulated through Neat1. Consistent with this, upregulation of proinflammatory genes in the DRG following nerve injury was suppressed by Neat1 knockdown. Expression changes of these proinflammatory genes were regulated through Neat1-mRNA interaction-dependent and -independent mechanisms. Notably, Neat1 increased proinflammatory genes by stabilizing its interacting mRNAs in neuropathic pain. Finally, Neat1 in primary sensory neurons contributed to spinal inflammatory processes that mediated peripheral neuropathic pain. These findings demonstrate that Neat1 lncRNA is a key regulator of neuro-immune communication in neuropathic pain.

DOI PubMed

Scopus citations: 7
Transposons contribute to the acquisition of cell type-specific cis-elements in the brain

Kotaro Sekine, Masahiro Onoguchi, Michiaki Hamada

Communications Biology 6 ( 1 ) 631 - 631 2023.06 [Refereed] [International journal]

Authorship: Last author, Corresponding author

Abstract

Mammalian brains have evolved in stages over a long history to acquire higher functions. Recently, several transposable element (TE) families have been shown to evolve into cis-regulatory elements of brain-specific genes. However, it is not fully understood how TEs are important for gene regulatory networks. Here, we performed a single-cell level analysis using public data of scATAC-seq to discover TE-derived cis-elements that are important for specific cell types. Our results suggest that DNA elements derived from TEs, MER130 and MamRep434, can function as transcription factor-binding sites based on their internal motifs for Neurod2 and Lhx2, respectively, especially in glutamatergic neuronal progenitors. Furthermore, MER130- and MamRep434-derived cis-elements were amplified in the ancestors of Amniota and Eutheria, respectively. These results suggest that the acquisition of cis-elements with TEs occurred in different stages during evolution and may contribute to the acquisition of different functions or morphologies in the brain.

DOI PubMed

Citations (Scopus): 7

Recent trends in RNA informatics: a review of machine learning and deep learning for RNA secondary structure prediction and RNA drug discovery

Kengo Sato, Michiaki Hamada

Briefings in Bioinformatics 24 ( 4 ) 2023.05 [Refereed] [International journal]

Abstract

Computational analysis of RNA sequences constitutes a crucial step in the field of RNA biology. As in other domains of the life sciences, the incorporation of artificial intelligence and machine learning techniques into RNA sequence analysis has gained significant traction in recent years. Historically, thermodynamics-based methods were widely employed for the prediction of RNA secondary structures; however, machine learning-based approaches have demonstrated remarkable advancements in recent years, enabling more accurate predictions. Consequently, the precision of sequence analysis pertaining to RNA secondary structures, such as RNA–protein interactions, has also been enhanced, making a substantial contribution to the field of RNA biology. Additionally, artificial intelligence and machine learning are also introducing technical innovations in the analysis of RNA–small molecule interactions for RNA-targeted drug discovery and in the design of RNA aptamers, where RNA serves as its own ligand. This review will highlight recent trends in the prediction of RNA secondary structure, RNA aptamers and RNA drug discovery using machine learning, deep learning and related technologies, and will also discuss potential future avenues in the field of RNA informatics.

DOI PubMed

Citations (Scopus): 61

Mobile element variation contributes to population-specific genome diversification, gene regulation and disease risk.

Shohei Kojima, Satoshi Koyama, Mirei Ka, Yuka Saito, Erica H Parrish, Mikiko Endo, Sadaaki Takata, Misaki Mizukoshi, Keiko Hikino, Atsushi Takeda, Asami F Gelinas, Steven M Heaton, Rie Koide, Anselmo J Kamada, Michiya Noguchi, Michiaki Hamada, Yoichiro Kamatani, Yasuhiro Murakawa, Kazuyoshi Ishigaki, Yukio Nakamura, Kaoru Ito, Chikashi Terao, Yukihide Momozawa, Nicholas F Parrish

Nature genetics 55 ( 6 ) 939 - 951 2023.05 [Refereed] [International journal]

Mobile genetic elements (MEs) are heritable mutagens that recursively generate structural variants (SVs). ME variants (MEVs) are difficult to genotype and integrate in statistical genetics, obscuring their impact on genome diversification and traits. We developed a tool that accurately genotypes MEVs using short-read whole-genome sequencing (WGS) and applied it to global human populations. We find unexpected population-specific MEV differences, including an Alu insertion distribution distinguishing Japanese from other populations. Integrating MEVs with expression quantitative trait loci (eQTL) maps shows that MEV classes regulate tissue-specific gene expression by shared mechanisms, including creating or attenuating enhancers and recruiting post-transcriptional regulators, supporting class-wide interpretability. MEVs more often associate with gene expression changes than SNVs, thus plausibly impacting traits. Performing genome-wide association study (GWAS) with MEVs pinpoints potential causes of disease risk, including a LINE-1 insertion associated with keloid and fasciitis. This work implicates MEVs as drivers of human divergence and disease risk.

DOI PubMed

Citations (Scopus): 36

Bioinformatics approaches for unveiling virus-host interactions.

Hitoshi Iuchi, Junna Kawasaki, Kento Kubo, Tsukasa Fukunaga, Koki Hokao, Gentaro Yokoyama, Akiko Ichinose, Kanta Suga, Michiaki Hamada

Computational and structural biotechnology journal 21 1774 - 1784 2023 [Refereed] [International journal]

Authorship: Last author, Corresponding author

The coronavirus disease-2019 (COVID-19) pandemic has elucidated major limitations in the capacity of medical and research institutions to appropriately manage emerging infectious diseases. We can improve our understanding of infectious diseases by unveiling virus-host interactions through host range prediction and protein-protein interaction prediction. Although many algorithms have been developed to predict virus-host interactions, numerous issues remain to be solved, and the entire network remains veiled. In this review, we comprehensively surveyed algorithms used to predict virus-host interactions. We also discuss the current challenges, such as dataset biases toward highly pathogenic viruses, and the potential solutions. The complete prediction of virus-host interactions remains difficult; however, bioinformatics can contribute to progress in research on infectious diseases and human health.

DOI PubMed

Citations (Scopus): 21

Web Services for RNA-RNA Interaction Prediction

Tsukasa Fukunaga, Junichi Iwakiri, Michiaki Hamada

Methods in Molecular Biology 2586 175 - 195 2023 [Refereed] [International journal]

Authorship: Last author

Non-coding RNAs have various biological functions such as translational regulation, and RNA-RNA interactions play essential roles in the mechanisms of action of these RNAs. Therefore, RNA-RNA interaction prediction is an important problem in bioinformatics, and many tools have been developed for the computational prediction of RNA-RNA interactions. In addition to the development of novel algorithms with high accuracy, the development and maintenance of web services is essential for enhancing usability by experimental biologists. In this review, we survey web services for RNA-RNA interaction predictions and introduce how to use primary web services. We present various prediction tools, including general interaction prediction tools, prediction tools for specific RNA classes, and RNA-RNA interaction-based RNA design tools. Additionally, we discuss the future perspectives of the development of RNA-RNA interaction prediction tools and the sustainability of web services.

DOI PubMed

Scopus
Structure-based screening for functional non-coding RNAs in fission yeast identifies a factor repressing untimely initiation of sexual differentiation.

Yu Ono, Kenta Katayama, Tomoki Onuma, Kento Kubo, Hayato Tsuyuzaki, Michiaki Hamada, Masamitsu Sato

Nucleic acids research 50 ( 19 ) 11229 - 11242 2022.10 [Refereed] [International journal]

Authorship: Corresponding author

Non-coding RNAs (ncRNAs) are ubiquitous in normal and cancer cells. Despite their prevalent distribution, the functions of most long ncRNAs remain uncharacterized. The fission yeast Schizosaccharomyces pombe expresses >1800 ncRNAs annotated to date, but most unconventional ncRNAs (excluding tRNA, rRNA, snRNA and snoRNA) remain uncharacterized. To discover functional ncRNAs, we performed a combined screen of computational and biological tests. First, all S. pombe ncRNAs were screened in silico for those showing conservation in sequence as well as in secondary structure with ncRNAs in closely related species. Almost half of the 151 selected conserved ncRNA genes were uncharacterized. Twelve ncRNA genes that did not overlap with protein-coding sequences were next chosen for biological screening that examines defects in growth or sexual differentiation, as well as sensitivities to drugs and stresses. Finally, we highlighted an ncRNA transcribed from SPNCRNA.1669, which inhibited untimely initiation of sexual differentiation. A domain that the computational analysis predicted to form a conserved secondary structure was essential for the ncRNA to function. Thus, this study demonstrates that in silico selection focusing on conservation of the secondary structure across species is a powerful method to pinpoint novel functional ncRNAs.

DOI PubMed

Citations (Scopus): 4

Mobile elements in human population-specific genome and phenotype divergence

Shohei Kojima, Satoshi Koyama, Mirei Ka, Yuka Saito, Erica H. Parrish, Mikiko Endo, Sadaaki Takata, Misaki Mizukoshi, Keiko Hikino, Atsushi Takeda, Asami F. Gelinas, Steven M. Heaton, Rie Koide, Anselmo J. Kamada, Michiya Noguchi, Michiaki Hamada, Yoichiro Kamatani, Yasuhiro Murakawa, Kazuyoshi Ishigaki, Yukio Nakamura, Kaoru Ito, Chikashi Terao, Yukihide Momozawa, Nicholas F. Parrish

2022.03

Abstract

Mobile genetic elements (MEs) are heritable mutagens that contribute to divergence between lineages by recursively generating structural variants. ME variants (MEVs) are difficult to genotype, obscuring their impact on recent genome and trait diversification. We developed a tool that uses short-read sequence data to accurately genotype MEVs, enabling us to study them using statistical genetics methods in global human genomes. We observe population-specific differences in the distribution of Alu insertions that distinguish Japanese from other populations. We integrated MEVs with epigenomic and expression quantitative trait loci (eQTL) maps to determine how they impact traits. This reveals coherent patterns by which specific MEs regulate tissue-specific gene expression, including creating or attenuating enhancers and recruiting post-transcriptional regulators. We pinpoint MEVs as genetic causes of disease risk, including a LINE-1 insertion linked to keloid and other diseases of fibroblast inflammation, by introducing MEVs into the genome-wide association study (GWAS) framework. In addition to nominating previously-hidden MEVs as causes of human diseases, this work highlights MEs as accelerators of human population divergence and begins to decipher the semantics of MEs.

DOI
Probiotic responder identification in cross-over trials for constipation using a Bayesian statistical model considering lags between intake and effect periods

Shion Hosoda, Yuichiro Nishimoto, Yohsuke Yamauchi, Takuji Yamada, Michiaki Hamada

Computational and Structural Biotechnology Journal 21 5350 - 5357 2022.03 [Refereed] [International journal]

Authorship: Last author, Corresponding author

Recent advances in microbiome research have led to the further development of microbial interventions, such as probiotics and prebiotics, which are potential treatments for constipation. However, the effects of probiotics vary from person to person; therefore, the effectiveness of probiotics needs to be verified for each individual. Individuals showing significant effects of the target probiotic are called responders. A statistical model for the evaluation of responders was proposed in a previous study. However, the previous model does not consider the lag between the intake and effect periods of the probiotic. A lag is expected between when probiotics are administered and when they take effect. In this study, we propose a Bayesian statistical model that estimates the probability that a subject is a responder by considering the lag between intake and effect periods. In synthetic dataset experiments, the proposed model outperformed the base model, which did not factor in the lag. Further, we found that the proposed model could distinguish responders showing large uncertainty in terms of the lag between intake and effect periods.

DOI PubMed
G0S2 regulates innate immunity in Kawasaki disease via lncRNA HSD11B1-AS1.

Mako Okabe, Shinya Takarada, Nariaki Miyao, Hideyuki Nakaoka, Keijiro Ibuki, Sayaka Ozawa, Kazuhiro Watanabe, Harue Tsuji, Ikuo Hashimoto, Kiyoshi Hatasaki, Shotaro Hayakawa, Yu Hamaguchi, Michiaki Hamada, Fukiko Ichida, Keiichi Hirono

Pediatric research 92 ( 2 ) 378 - 387 2022.03 [Refereed] [International journal]

BACKGROUND: Kawasaki disease (KD) is a systemic vasculitis that is currently the most common cause of acquired heart disease in children. However, its etiology remains unknown. Long non-coding RNAs (lncRNAs) contribute to the pathophysiology of various diseases. Few studies have reported the role of lncRNAs in KD inflammation; thus, we investigated the role of lncRNA in KD inflammation. METHODS: A total of 50 patients with KD (median age, 19 months; 29 males and 21 females) were enrolled. We conducted cap analysis gene expression sequencing to determine differentially expressed genes in monocytes of the peripheral blood of the subjects. RESULTS: About 21 candidate lncRNA transcripts were identified. The analyses of transcriptome and gene ontology revealed that the immune system was involved in KD. Among these genes, G0/G1 switch gene 2 (G0S2) and its antisense lncRNA, HSD11B1-AS1, were upregulated during the acute phase of KD (P < 0.0001 and <0.0001, respectively). Moreover, G0S2 increased when lipopolysaccharides induced inflammation in THP-1 monocytes, and silencing of G0S2 suppressed the expression of HSD11B1-AS1 and tumor necrosis factor-α. CONCLUSIONS: This study uncovered the crucial role of lncRNAs in innate immunity in acute KD. LncRNA may be a novel target for the diagnosis of KD. IMPACT: This study revealed the whole aspect of the gene expression profile of monocytes of patients with Kawasaki disease (KD) using cap analysis gene expression sequencing and identified KD-specific molecules: G0/G1 switch gene 2 (G0S2) and long non-coding RNA (lncRNA) HSD11B1-AS1. We demonstrated that G0S2 and its antisense HSD11B1-AS1 were associated with inflammation of innate immunity in KD. lncRNA may be a novel key target for the diagnosis of patients with KD.

DOI PubMed

Citations (Scopus): 15

HT-SELEX-based identification of binding pre-miRNA hairpin-motif for small molecules.

Sanjukta Mukherjee, Asako Murata, Ryoga Ishida, Ayako Sugai, Chikara Dohno, Michiaki Hamada, Sudhir Krishna, Kazuhiko Nakatani

Molecular therapy. Nucleic acids 27 165 - 174 2022.03 [Refereed] [International journal]

Selective targeting of biologically relevant RNAs with small molecules is a long-standing challenge due to the lack of clear understanding of the binding RNA motifs for small molecules. The standard SELEX procedure allows the identification of specific RNA binders (aptamers) for the target of interest. However, more effort is needed to identify and characterize the sequence-structure motifs in the aptamers important for binding to the target. Herein, we described a strategy integrating high-throughput (HT) sequencing with conventional SELEX followed by bioinformatic analysis to identify aptamers with high binding affinity and target specificity to unravel the sequence-structure motifs of pre-miRNA, which is essential for binding to the recently developed new water-soluble small-molecule CMBL3aL. To confirm the fidelity of this approach, we investigated the binding of CMBL3aL to the identified motifs by surface plasmon resonance (SPR) spectroscopy and its potential regulatory activity on dicer-mediated cleavage of the obtained aptamers and endogenous pre-miRNAs comprising the identified motif in its hairpin loop. This new approach would significantly accelerate the identification process of binding sequence-structure motifs of pre-miRNA for the compound of interest and would contribute to increase the spectrum of biomedical application.

DOI PubMed

Citations (Scopus): 9

Prediction of RNA-protein interactions using a nucleotide language model

Keisuke Yamada, Michiaki Hamada

Bioinformatics Advances 2 ( 1 ) vbac023 2022 [Refereed] [International journal]

Authorship: Last author, Corresponding author

MOTIVATION: The accumulation of sequencing data has enabled researchers to predict the interactions between RNA sequences and RNA-binding proteins (RBPs) using novel machine learning techniques. However, existing models are often difficult to interpret and require information in addition to sequences. Bidirectional encoder representations from transformers (BERT) is a language-based deep learning model that is highly interpretable. Therefore, a model based on the BERT architecture can potentially overcome such limitations. RESULTS: Here, we propose BERT-RBP as a model to predict RNA-RBP interactions by adapting the BERT architecture pretrained on a human reference genome. Our model outperformed state-of-the-art prediction models using the eCLIP-seq data of 154 RBPs. The detailed analysis further revealed that BERT-RBP could recognize both the transcript region type and RNA secondary structure based only on sequence information. Overall, the results provide insights into the fine-tuning mechanism of BERT in biological contexts and provide evidence of the applicability of the model to other RNA-related problems. AVAILABILITY AND IMPLEMENTATION: Python source codes are freely available at https://github.com/kkyamada/bert-rbp. The datasets underlying this article were derived from sources in the public domain: [RBPsuite (http://www.csbio.sjtu.edu.cn/bioinf/RBPsuite/), Ensembl Biomart (http://asia.ensembl.org/biomart/martview/)]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.

DOI PubMed

Citations (Scopus): 55

LinAliFold and CentroidLinAliFold: fast RNA consensus secondary structure prediction for aligned sequences using beam search methods

Tsukasa Fukunaga, Michiaki Hamada

Bioinformatics Advances 2 ( 1 ) vbac078 2022.01 [International journal]

Abstract

Motivation

RNA consensus secondary structure prediction from aligned sequences is a powerful approach for improving the secondary structure prediction accuracy. However, because the computational complexities of conventional prediction tools scale with the cube of the alignment lengths, their application to long RNA sequences, such as viral RNAs or long non-coding RNAs, requires significant computational time.

Results

In this study, we developed LinAliFold and CentroidLinAliFold, fast RNA consensus secondary structure prediction tools based on minimum free energy and maximum expected accuracy principles, respectively. We achieved software acceleration using beam search methods that were successfully used for fast secondary structure prediction from a single RNA sequence. Benchmark analyses showed that LinAliFold and CentroidLinAliFold were much faster than the existing methods while preserving the prediction accuracy. As an empirical application, we predicted the consensus secondary structure of coronaviruses with approximately 30 000 nt in 5 and 79 min by LinAliFold and CentroidLinAliFold, respectively. We confirmed that the predicted consensus secondary structure of coronaviruses was consistent with the experimental results.

Availability and implementation

The source codes of LinAliFold and CentroidLinAliFold are freely available at https://github.com/fukunagatsu/LinAliFold-CentroidLinAliFold.

Supplementary information

Supplementary data are available at Bioinformatics Advances online.

DOI PubMed

Citations (Scopus): 4

Bioinformatics Approaches for Determining the Functional Impact of Repetitive Elements on Non-coding RNAs.

Chao Zeng, Atsushi Takeda, Kotaro Sekine, Naoki Osato, Tsukasa Fukunaga, Michiaki Hamada

Methods in molecular biology (Clifton, N.J.) 2509 315 - 340 2022 [International journal]

With a large number of annotated non-coding RNAs (ncRNAs), repetitive sequences are found to constitute functional components (termed as repetitive elements) in ncRNAs that perform specific biological functions. Bioinformatics analysis is a powerful tool for improving our understanding of the role of repetitive elements in ncRNAs. This chapter summarizes recent findings that reveal the role of repetitive elements in ncRNAs. Furthermore, relevant bioinformatics approaches are systematically reviewed, which promises to provide valuable resources for studying the functional impact of repetitive elements on ncRNAs.

DOI PubMed

Citations (Scopus): 3

Clone decomposition based on mutation signatures provides novel insights into mutational processes.

Taro Matsutani, Michiaki Hamada

NAR genomics and bioinformatics 3 ( 4 ) lqab093 2021.12 [International journal]

Intra-tumor heterogeneity is a phenomenon in which mutation profiles differ from cell to cell within the same tumor and is observed in almost all tumors. Understanding intra-tumor heterogeneity is essential from the clinical perspective. Numerous methods have been developed to predict this phenomenon based on variant allele frequency. Among the methods, CloneSig models the variant allele frequency and mutation signatures simultaneously and provides an accurate clone decomposition. However, this method has limitations in terms of clone number selection and modeling. We propose SigTracer, a novel hierarchical Bayesian approach for analyzing intra-tumor heterogeneity based on mutation signatures to tackle these issues. We show that SigTracer predicts more reasonable clone decompositions than the existing methods against artificial data that mimic cancer genomes. We applied SigTracer to whole-genome sequences of blood cancer samples. The results were consistent with past findings that single base substitutions caused by a specific signature (previously reported as SBS9) related to the activation-induced cytidine deaminase intensively lie within immunoglobulin-coding regions for chronic lymphocytic leukemia samples. Furthermore, we showed that this signature mutates regions responsible for cell-cell adhesion. Accurate assignments of mutations to signatures by SigTracer can provide novel insights into signature origins and mutational processes.

DOI PubMed

Scopus
Multi-resBind: a residual network-based multi-label classifier for in vivo RNA binding prediction and preference visualization.

Shitao Zhao, Michiaki Hamada

BMC bioinformatics 22 ( 1 ) 554 - 554 2021.11 [International journal]

BACKGROUND: Protein-RNA interactions play key roles in many processes regulating gene expression. To understand the underlying binding preference, ultraviolet cross-linking and immunoprecipitation (CLIP)-based methods have been used to identify the binding sites for hundreds of RNA-binding proteins (RBPs) in vivo. Using these large-scale experimental data to infer RNA binding preference and predict missing binding sites has become a great challenge. Some existing deep-learning models have demonstrated high prediction accuracy for individual RBPs. However, it remains difficult to avoid significant bias due to the experimental protocol. The DeepRiPe method was recently developed to solve this problem by introducing multi-task or multi-label learning into this field. However, this method has not reached an ideal level of prediction power due to the weak neural network architecture. RESULTS: Compared to the DeepRiPe approach, our Multi-resBind method demonstrated substantial improvements using the same large-scale PAR-CLIP dataset with respect to an increase in the area under the receiver operating characteristic curve and average precision. We conducted extensive experiments to evaluate the impact of various types of input data on the final prediction accuracy. The same approach was used to evaluate the effect of loss functions. Finally, a modified integrated gradient was employed to generate attribution maps. The patterns disentangled from relative contributions according to context offer biological insights into the underlying mechanism of protein-RNA interactions. CONCLUSIONS: Here, we propose Multi-resBind as a new multi-label deep-learning approach to infer protein-RNA binding preferences and predict novel interactions. The results clearly demonstrate that Multi-resBind is a promising tool to predict unknown binding sites in vivo and gain biological insights into why the neural network makes a given prediction.

DOI PubMed

Citations (Scopus): 9

Impact of human gene annotations on RNA-seq differential expression analysis.

Yu Hamaguchi, Chao Zeng, Michiaki Hamada

BMC genomics 22 ( 1 ) 730 - 730 2021.10 [International journal]

BACKGROUND: Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Different sets of gene annotations are available for the human genome and are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear. RESULTS: Using "mappability", a metric of the complexity of gene annotation, we compared three distinct human gene annotations, GENCODE, RefSeq, and NONCODE, and evaluated how mappability affected DE analysis. We found that mappability was significantly different among the human gene annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically through the downstream steps of DE analysis. CONCLUSIONS: We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that an approach that excludes unnecessary gene models from gene annotations improves the performance of DE analysis.

DOI PubMed

Citations (Scopus): 7

Binding patterns of RNA-binding proteins to repeat-derived RNA sequences reveal putative functional RNA elements.

Masahiro Onoguchi, Chao Zeng, Ayako Matsumaru, Michiaki Hamada

NAR genomics and bioinformatics 3 ( 3 ) lqab055 2021.09 [International journal]

Recent reports have revealed that repeat-derived sequences embedded in introns or long noncoding RNAs (lncRNAs) are targets of RNA-binding proteins (RBPs) and contribute to biological processes such as RNA splicing or transcriptional regulation. These findings suggest that repeat-derived RNAs are important as scaffolds of RBPs and functional elements. However, the overall functional sequences of the repeat-derived RNAs are not fully understood. Here, we show the putative functional repeat-derived RNAs by analyzing the binding patterns of RBPs based on ENCODE eCLIP data. We mapped all eCLIP reads to repeat sequences and observed that 10.75% and 7.04% of reads on average were enriched (at least 2-fold over control) in the repeats in K562 and HepG2 cells, respectively. Using these data, we predicted functional RNA elements on the sense and antisense strands of long interspersed element 1 (LINE1) sequences. Furthermore, we found several new sets of RBPs on fragments derived from other transposable element (TE) families. Some of these fragments show specific and stable secondary structures and are found to be inserted into the introns of genes or lncRNAs. These results suggest that the repeat-derived RNA sequences are strong candidates for the functional RNA elements of endogenous noncoding RNAs.

DOI PubMed

Citations (Scopus): 5

Umibato: estimation of time-varying microbial interaction using continuous-time regression hidden Markov model.

Shion Hosoda, Tsukasa Fukunaga, Michiaki Hamada

Bioinformatics (Oxford, England) 37 ( Suppl_1 ) i16-i24 2021.07 [International journal]

MOTIVATION: Accumulating evidence has highlighted the importance of microbial interaction networks. Methods have been developed for estimating microbial interaction networks, of which the generalized Lotka-Volterra equation (gLVE)-based method can estimate a directed interaction network. The previous gLVE-based method for estimating microbial interaction networks did not consider time-varying interactions. RESULTS: In this study, we developed the unsupervised learning-based microbial interaction inference method using Bayesian estimation (Umibato), a method for estimating time-varying microbial interactions. The Umibato algorithm comprises Gaussian process regression (GPR) and a new Bayesian probabilistic model, the continuous-time regression hidden Markov model (CTRHMM). Growth rates are estimated by GPR, and interaction networks are estimated by CTRHMM. CTRHMM can estimate time-varying interaction networks using interaction states, which are defined as hidden variables. Umibato outperformed the existing methods on synthetic datasets. In addition, it yielded reasonable estimations in experiments on a mouse gut microbiota dataset, thus providing novel insights into the relationship between consumed diets and the gut microbiota. AVAILABILITY AND IMPLEMENTATION: The C++ and Python source codes of the Umibato software are available at https://github.com/shion-h/Umibato. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
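
As a rough illustration of the gLVE idea mentioned in this abstract, the sketch below simulates a two-species generalized Lotka-Volterra system and recovers a single, time-invariant interaction matrix by ordinary least squares on the growth rates. It is not the Umibato algorithm (no Gaussian process regression, no CTRHMM, no time-varying states). The function names, parameter values, and the log-space update scheme are all assumptions chosen for this example.

```python
import numpy as np

def simulate_glv(x0, r, A, dt, steps):
    """Simulate dx_i/dt = x_i * (r_i + sum_j A_ij x_j).

    The update is done in log space (assumed scheme), which keeps
    abundances positive: log x(t+dt) = log x(t) + dt * (r + A x(t)).
    """
    xs = np.empty((steps + 1, len(x0)))
    xs[0] = x0
    for t in range(steps):
        xs[t + 1] = xs[t] * np.exp(dt * (r + A @ xs[t]))
    return xs

def fit_glv(xs, dt):
    """Estimate (r, A) by regressing growth rates on abundances."""
    # Growth rate of each species: finite difference of log abundance.
    growth = np.diff(np.log(xs), axis=0) / dt
    # Design matrix: an intercept column (for r) plus the abundances (for A).
    X = np.hstack([np.ones((len(xs) - 1, 1)), xs[:-1]])
    coef, *_ = np.linalg.lstsq(X, growth, rcond=None)
    return coef[0], coef[1:].T  # r_hat, A_hat with A_hat[i, j] = effect of j on i

# Hypothetical parameters for a two-species community.
r_true = np.array([1.0, 0.5])
A_true = np.array([[-1.0, -0.5],
                   [0.3, -1.0]])
traj = simulate_glv(np.array([0.1, 0.1]), r_true, A_true, dt=0.05, steps=400)
r_hat, A_hat = fit_glv(traj, dt=0.05)
print("estimated growth rates:", r_hat)
print("estimated interaction matrix:")
print(A_hat)
```

Because the toy data are noise-free and generated by the same update rule that the regression assumes, the fit is essentially exact here; real abundance data would need smoothing and a noise model.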

DOI PubMed

Citations (Scopus): 4

Possible roles for the hominoid-specific DSCR4 gene in human cells.

Morteza M Saber, Marziyeh Karimiavargani, Takanori Uzawa, Nilmini Hettiarachchi, Michiaki Hamada, Yoshihiro Ito, Naruya Saitou

Genes & genetic systems 96 ( 1 ) 1 - 11 2021.05 [Domestic journal]

Down syndrome in humans is caused by trisomy of chromosome 21. DSCR4 (Down syndrome critical region 4) is a de novo-originated protein-coding gene present only in human chromosome 21 and its homologous chromosomes in apes. Despite being located in a medically critical genomic region and an abundance of evidence indicating its functionality, the roles of DSCR4 in human cells are unknown. We used a bioinformatic approach to infer the biological importance and cellular roles of this gene. Our analysis indicates that DSCR4 is likely involved in the regulation of interconnected biological pathways related to cell migration, coagulation and the immune system. We also showed that these predicted biological functions are consistent with tissue-specific expression of DSCR4 in migratory immune system leukocyte cells and neural crest cells (NCCs) that shape facial morphology in the human embryo. The immune system and NCCs are known to be affected in Down syndrome individuals, who suffer from DSCR4 misregulation, which further supports our findings. Providing evidence for the critical roles of DSCR4 in human cells, our findings establish the basis for further experimental investigations that will be necessary to confirm the roles of DSCR4 in the etiology of Down syndrome.

DOI PubMed

Citations (Scopus): 5

PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores.

Yukiteru Ono, Kiyoshi Asai, Michiaki Hamada

Bioinformatics (Oxford, England) 37 ( 5 ) 589 - 595 2021.05 [International journal]

MOTIVATION: Recent high-throughput long-read sequencers, such as PacBio and Oxford Nanopore sequencers, produce longer reads with more errors than short-read sequencers. In addition to the high error rates of reads, non-uniformity of errors leads to difficulties in various downstream analyses using long reads. Many useful simulators, which characterize long-read error patterns and simulate them, have been developed. However, there is still room for improvement in the simulation of the non-uniformity of errors. RESULTS: To capture characteristics of errors in reads for long-read sequencers, here, we introduce a generative model for quality scores, in which a hidden Markov model with a recent model selection method, called factorized information criteria, is utilized. We evaluated our simulator from various perspectives, and the results indicate that it successfully simulates reads that are consistent with real reads. AVAILABILITY AND IMPLEMENTATION: The source codes of PBSIM2 are freely available from https://github.com/yukiteruono/pbsim2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
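
To make the notion of a hidden Markov model that generates quality scores more concrete, here is a minimal sketch that samples a per-base quality string from a hand-written two-state HMM. It is not PBSIM2's model: the number of states, every probability, and the score alphabet below are hypothetical, and no model selection (such as factorized information criteria) is performed.

```python
import numpy as np

def sample_quality_scores(length, start, trans, emit, scores, seed=0):
    """Sample `length` quality scores from a discrete HMM.

    start: initial state distribution, shape (n_states,)
    trans: state transition matrix, rows sum to 1
    emit:  per-state emission distribution over `scores`, rows sum to 1
    """
    rng = np.random.default_rng(seed)
    n_states = len(start)
    state = rng.choice(n_states, p=start)
    out = np.empty(length, dtype=int)
    for i in range(length):
        # Emit a score from the current hidden state, then move on.
        out[i] = rng.choice(scores, p=emit[state])
        state = rng.choice(n_states, p=trans[state])
    return out

# Hypothetical two-state model: state 0 = "reliable stretch", state 1 = "noisy stretch".
scores = np.array([5, 10, 20, 30])
start = np.array([0.8, 0.2])
trans = np.array([[0.95, 0.05],
                  [0.20, 0.80]])
emit = np.array([[0.05, 0.10, 0.35, 0.50],
                 [0.50, 0.30, 0.15, 0.05]])

quals = sample_quality_scores(60, start, trans, emit, scores, seed=42)
print(quals)
```

Because the hidden state tends to persist from one position to the next, low scores come in runs rather than being spread evenly along the read, which is the kind of non-uniformity the abstract refers to.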

Citations (Scopus): 94

Long Non-Coding RNA CRNDE Is Involved in Resistance to EGFR Tyrosine Kinase Inhibitor in EGFR-Mutant Lung Cancer via eIF4A3/MUC1/EGFR Signaling.

Satoshi Takahashi, Rintaro Noro, Masahiro Seike, Chao Zeng, Masaru Matsumoto, Akiko Yoshikawa, Shinji Nakamichi, Teppei Sugano, Mariko Hirao, Kuniko Matsuda, Michiaki Hamada, Akihiko Gemma

International journal of molecular sciences 22 ( 8 ) 2021.04 [International journal]

(1) Background: Acquired resistance to epidermal growth factor receptor-tyrosine kinase inhibitors (EGFR-TKIs) is an intractable problem for many clinical oncologists. The mechanisms of resistance to EGFR-TKIs are complex. Long non-coding RNAs (lncRNAs) may play an important role in cancer development and metastasis. However, the biological process between lncRNAs and drug resistance to EGFR-mutated lung cancer remains largely unknown. (2) Methods: Osimertinib- and afatinib-resistant EGFR-mutated lung cancer cells were established using a stepwise method. A microarray analysis of non-coding and coding RNAs was performed using parental and resistant EGFR-mutant non-small cell lung cancer (NSCLC) cells and evaluated by bioinformatics analysis through medical-industrial collaboration. (3) Results: Colorectal neoplasia differentially expressed (CRNDE) and DiGeorge syndrome critical region gene 5 (DGCR5) lncRNAs were highly expressed in EGFR-TKI-resistant cells by microarray analysis. RNA-protein binding analysis revealed eukaryotic translation initiation factor 4A3 (eIF4A3) bound in an overlapping manner to CRNDE and DGCR5. The CRNDE downregulates the expression of eIF4A3, mucin 1 (MUC1), and phospho-EGFR. Inhibition of CRNDE activated the eIF4A3/MUC1/EGFR signaling pathway and apoptotic activity, and restored sensitivity to EGFR-TKIs. (4) Conclusions: The results showed that CRNDE is associated with the development of resistance to EGFR-TKIs. CRNDE may be a novel therapeutic target to conquer EGFR-mutant NSCLC.

Citations (Scopus): 34

Jonckheere-Terpstra-Kendall-based non-parametric analysis of temporal differential gene expression.

Hitoshi Iuchi, Michiaki Hamada

NAR genomics and bioinformatics 3 ( 1 ) lqab021 2021.03 [International journal]

Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. degree or degree of freedom) for one dataset. This approach risks modeling of linearly increasing genes with higher-order functions, or fitting of cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere-Terpstra-Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks, using simulation data, show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, application of JTK in the analysis of time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that the wave pattern contributes to the TEG identification of JTK, not the difference in expression levels. This result suggests that JTK is a suitable algorithm when focusing on expression patterns over time rather than expression levels, such as comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.
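To make the rank-based idea concrete, the sketch below computes the classical Jonckheere-Terpstra statistic for expression values grouped by ordered time points. It illustrates only this one building block; it is not the TEG detection algorithm described in the abstract, and the function name and input layout are assumptions.

```python
def jonckheere_terpstra_statistic(groups):
    """Classical Jonckheere-Terpstra statistic for ordered groups.

    `groups` is a list of lists, one list of measurements per time point,
    given in temporal order. For every pair of groups (earlier, later),
    count the value pairs where the earlier value is smaller; ties count
    as one half.
    """
    total = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    if x < y:
                        total += 1.0
                    elif x == y:
                        total += 0.5
    return total


def max_statistic(groups):
    """Largest possible value: every earlier value below every later value."""
    sizes = [len(g) for g in groups]
    return sum(sizes[i] * sizes[j]
               for i in range(len(sizes))
               for j in range(i + 1, len(sizes)))
```

A statistic close to `max_statistic(groups)` suggests a monotonically increasing trend and one close to zero a decreasing trend, without assuming any parametric curve shape.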

Citations (Scopus): 3

RaptGen: A variational autoencoder with profile hidden Markov model for generative aptamer discovery

Natsuki Iwano, Tatsuo Adachi, Kazuteru Aoki, Yoshikazu Nakamura, Michiaki Hamada

2021.02

Association analysis of repetitive elements and R-loop formation across species.

Chao Zeng, Masahiro Onoguchi, Michiaki Hamada

Mobile DNA 12 ( 1 ) 3 - 3 2021.01 [International journal]

BACKGROUND: Although recent studies have revealed the genome-wide distribution of R-loops, our understanding of R-loop formation is still limited. Genomes are known to have a large number of repetitive elements. Emerging evidence suggests that these sequences may play an important regulatory role. However, few studies have investigated the effect of repetitive elements on R-loop formation. RESULTS: We found different repetitive elements related to R-loop formation in various species. By controlling length and genomic distributions, we observed that satellite, long interspersed nuclear elements (LINEs), and DNA transposons were each specifically enriched for R-loops in humans, fruit flies, and Arabidopsis thaliana, respectively. R-loops also tended to arise in regions of low-complexity or simple repeats across species. We also found that the repetitive elements associated with R-loop formation differ according to developmental stage. For instance, in fruit fly, LINEs and long terminal repeat retrotransposons (LTRs) are more likely to contain R-loops in embryos, whereas low-complexity and simple repeats are more likely to contain them in post-developmental S2 cells. CONCLUSIONS: Our results indicate that repetitive elements may have species-specific or development-specific regulatory effects on R-loop formation. This work advances our understanding of repetitive elements and R-loop biology.

Citations (Scopus): 23

Representation learning applications in biological sequence analysis.

Hitoshi Iuchi, Taro Matsutani, Keisuke Yamada, Natsuki Iwano, Shunsuke Sumi, Shion Hosoda, Shitao Zhao, Tsukasa Fukunaga, Michiaki Hamada

Computational and structural biotechnology journal 19 3198 - 3208 2021 [International journal]

Although remarkable advances have been reported in high-throughput sequencing, the ability to aptly analyze a substantial amount of rapidly generated biological (DNA/RNA/protein) sequencing data remains a critical hurdle. To tackle this issue, the application of natural language processing (NLP) to biological sequence analysis has received increased attention. In this method, biological sequences are regarded as sentences while the single nucleic acids/amino acids or k-mers in these sequences represent the words. Embedding is an essential step in NLP, which performs the conversion of these words into vectors. Specifically, representation learning is an approach used for this transformation process, which can be applied to biological sequences. Vectorized biological sequences can then be applied for function and structure estimation, or as input for other probabilistic models. Considering the importance and growing trend for the application of representation learning to biological research, in the present study, we have reviewed the existing knowledge in representation learning for biological sequence analysis.
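The "sequences as sentences, k-mers as words" framing in the abstract above can be sketched as a tokenization step that precedes any embedding model. The function names, the default k of 3, and the integer-ID vocabulary are illustrative assumptions rather than a method taken from the review.

```python
def kmer_tokens(sequence, k=3, stride=1):
    """Split a biological sequence into k-mer 'words'.

    With stride=1 the k-mers overlap; with stride=k they do not.
    Sequences shorter than k yield an empty list.
    """
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocabulary(tokenized_sequences):
    """Assign an integer ID to each distinct k-mer, in order of first appearance."""
    vocab = {}
    for tokens in tokenized_sequences:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Map k-mer tokens to integer IDs, ready to index an embedding table."""
    return [vocab[token] for token in tokens]
```

For example, `kmer_tokens("ACGTAC")` yields the four overlapping words `ACG`, `CGT`, `GTA`, and `TAC`; the resulting IDs are what an embedding layer would convert into vectors.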

Citations (Scopus): 70

Corrigendum: Possible roles for the hominoid-specific DSCR4 gene in human cells [Genes Genet. Syst. (2021) 96, p. 1-11].

Morteza M Saber, Marziyeh Karimiavargani, Takanori Uzawa, Nilmini Hettiarachchi, Michiaki Hamada, Yoshihiro Ito, Naruya Saitou

Genes & genetic systems 96 ( 2 ) 105 - 105 2021 [Domestic journal]

Legends to Figures 4 and 5 (p. 7) should be exchanged. Below are the correct legends to Figure 4 and Figure 5. Fig. 4. Interconnection of DSCR4 overexpression-mediated perturbed pathways. KEGG analysis of DSCR4 overexpression-mediated DEGs shows enrichment for the tightly interconnected pathways of the coagulation cascade and the complement cascade (highlighted in red) and further confirm the connection of these cascades with cell adhesion, migration and proliferation (red circle). Fig. 5. Expression profile of DSCR4 across human cell lines and tissues. According to Roadmap Epigenomics Project data, DSCR4 and DSCR8, which share a bidirectional promoter, are highly expressed only in K562 cells, a type of leukemia cell. Analysis of transcriptome data provided by Prescott et al. (2015) showed that DSCR4 and DSCR8 also display high expression in human and chimpanzee neural crest cells, which are critical migratory cells involved in facial morphogenesis in the embryo. (1) Data from Prescott et al. (2015). (2) Samples also include esophagus, lung, spleen and fetal large intestine. (3) Samples also include brain germinal matrix, hippocampus, fetal small intestine, stomach, left ventricle, small intestine, sigmoid colon, HEPG2 cells and HMEC cells. The PDF file for DOI: https://doi.org/10.1266/ggs.20-00012 has been replaced with the corrected version as of June 17, 2021.

Identification of m6A-Associated RNA Binding Proteins Using an Integrative Computational Framework.

Yiqian Zhang, Michiaki Hamada

Frontiers in genetics 12 625797 - 625797 2021 [International journal]

N6-methyladenosine (m6A) is an abundant modification on mRNA that plays an important role in regulating essential RNA activities. Several wet lab studies have identified some RNA binding proteins (RBPs) that are related to m6A's regulation. The objective of this study was to identify potential m6A-associated RBPs using an integrative computational framework. The framework was composed of an enrichment analysis and a classification model. Utilizing RBPs' binding data, we analyzed reproducible m6A regions from independent studies using this framework. The enrichment analysis identified known m6A-associated RBPs including YTH domain-containing proteins; it also identified RBM3 as a potential m6A-associated RBP for mouse. Furthermore, a significant correlation for the identified m6A-associated RBPs is observed at the protein expression level rather than the gene expression level. On the other hand, a Random Forest classification model was built for the reproducible m6A regions using RBPs' binding data. The RBP-based predictor demonstrated not only competitive performance when compared with sequence-based predictions but also reflected m6A's action of repelling against RBPs, which suggested that our framework can infer interaction between m6A and m6A-associated RBPs beyond sequence level when utilizing RBPs' binding data. In conclusion, we designed an integrative computational framework for the identification of known and potential m6A-associated RBPs. We hope the analysis will provide more insights on the studies of m6A and RNA modifications.

Citations (Scopus): 10

Detection and Characterization of Ribosome-Associated Long Noncoding RNAs.

Chao Zeng, Michiaki Hamada

Methods in molecular biology (Clifton, N.J.) 2254 179 - 194 2021 [International journal]

Ribosome profiling shows potential for studying the function of long noncoding RNAs (lncRNAs). We introduce a bioinformatics pipeline for detecting ribosome-associated lncRNAs (ribo-lncRNAs) from ribosome profiling data. Further, we describe a machine-learning approach for the characterization of ribo-lncRNAs based on their sequence features. Scripts for ribo-lncRNA analysis can be accessed at ( https://ribolnc.hamadalab.com/ ).

Citations (Scopus): 2

Parallelized Latent Dirichlet Allocation Provides a Novel Interpretability of Mutation Signatures in Cancer Genomes.

Taro Matsutani, Michiaki Hamada

Genes 11 ( 10 ) 2020.09 [International journal]

Mutation signatures are defined as the distribution of specific mutations such as activity of AID/APOBEC family proteins. Previous studies have reported numerous signatures, using matrix factorization methods for mutation catalogs. Different mutation signatures are active in different tumor types; hence, signature activity varies greatly among tumor types and becomes sparse. Because of this, many previous methods require dividing mutation catalogs for each tumor type. Here, we propose parallelized latent Dirichlet allocation (PLDA), a novel Bayesian model to simultaneously predict mutation signatures with all mutation catalogs. PLDA is an extended model of latent Dirichlet allocation (LDA), which is one of the methods used for signature prediction. It has parallelized hyperparameters of Dirichlet distributions for LDA, and they represent the sparsity of signature activities for each tumor type, thus facilitating simultaneous analyses. First, we conducted a simulation experiment to compare PLDA with previous methods (including SigProfiler and SignatureAnalyzer) using artificial data and confirmed that PLDA could predict signature structures as accurately as previous methods without searching for the optimal hyperparameters. Next, we applied PLDA to PCAWG (Pan-Cancer Analysis of Whole Genomes) mutation catalogs and obtained a signature set different from the one predicted by SigProfiler. Further, we have shown that the mutation spectrum represented by the predicted signature with PLDA provides a novel interpretability through post-analyses.

Citations (Scopus): 3

Free-Energy Calculation of Ribonucleic Inosines and Its Application to Nearest-Neighbor Parameters.

Shun Sakuraba, Junichi Iwakiri, Michiaki Hamada, Tomoshi Kameda, Genichiro Tsuji, Yasuaki Kimura, Hiroshi Abe, Kiyoshi Asai

Journal of chemical theory and computation 16 ( 9 ) 5923 - 5935 2020.09 [Refereed] [International journal]

Can current simulations quantitatively predict the stability of ribonucleic acids (RNAs)? In this research, we apply a free-energy perturbation simulation of RNAs containing inosine, a modified ribonucleic base, to the derivation of RNA nearest-neighbor parameters. A parameter set derived solely from 30 simulations was used to predict the free-energy difference of the RNA duplex with a mean unbiased error of 0.70 kcal/mol, which is a level of accuracy comparable to that obtained with parameters derived from 25 experiments. We further show that the error can be lowered to 0.60 kcal/mol by combining the simulation-derived free-energy differences with experimentally measured differences. This protocol can be used as a versatile method for deriving nearest-neighbor parameters of RNAs with various modified bases.

Citations (Scopus): 5

RNA-Seq Analysis Reveals Localization-Associated Alternative Splicing across 13 Cell Lines.

Chao Zeng, Michiaki Hamada

Genes 11 ( 7 ) 2020.07 [International journal]

Alternative splicing, a ubiquitous phenomenon in eukaryotes, is a regulatory mechanism for the biological diversity of individual genes. Most studies have focused on the effects of alternative splicing on protein synthesis. However, the transcriptome-wide influence of alternative splicing on RNA subcellular localization has rarely been studied. By analyzing RNA-seq data obtained from subcellular fractions across 13 human cell lines, we identified 8720 switching genes between the cytoplasm and the nucleus. Consistent with previous reports, intron retention was observed to be enriched in the nuclear transcript variants. Interestingly, we found that short and structurally stable introns were positively correlated with nuclear localization. Motif analysis reveals that fourteen RNA-binding proteins (RBPs) tend to bind preferentially to such introns. To our knowledge, this is the first transcriptome-wide study to analyze and evaluate the effect of alternative splicing on RNA subcellular localization. Our findings reveal that alternative splicing plays a promising role in regulating RNA subcellular localization.

Citations (Scopus): 12

Revealing the microbial assemblage structure in the human gut microbiome using latent Dirichlet allocation.

Shion Hosoda, Suguru Nishijima, Tsukasa Fukunaga, Masahira Hattori, Michiaki Hamada

Microbiome 8 ( 1 ) 95 - 95 2020.06 [International journal]

BACKGROUND: The human gut microbiome has been suggested to affect human health and thus has received considerable attention. To clarify the structure of the human gut microbiome, clustering methods are frequently applied to human gut taxonomic profiles. Enterotypes, i.e., clusters of individuals with similar microbiome composition, are well-studied and characterized. However, only a few detailed studies on assemblages, i.e., clusters of co-occurring bacterial taxa, have been conducted. Particularly, the relationship between the enterotype and assemblage is not well-understood. RESULTS: In this study, we detected gut microbiome assemblages using a latent Dirichlet allocation (LDA) method. We applied LDA to a large-scale human gut metagenome dataset and found that a 4-assemblage LDA model could represent relationships between enterotypes and assemblages with high interpretability. This model indicated that each individual tends to have several assemblages, three of which corresponded to the three classically recognized enterotypes. Conversely, the fourth assemblage corresponded to no enterotypes and emerged in all enterotypes. Interestingly, the dominant genera of this assemblage (Clostridium, Eubacterium, Faecalibacterium, Roseburia, Coprococcus, and Butyrivibrio) included butyrate-producing species such as Faecalibacterium prausnitzii. Indeed, the fourth assemblage significantly positively correlated with three butyrate-producing functions. CONCLUSIONS: We conducted an assemblage analysis on a large-scale human gut metagenome dataset using LDA. The present study revealed that there is an enterotype-independent assemblage.

Citations (Scopus): 34

MoAIMS: efficient software for detection of enriched regions of MeRIP-Seq.

Yiqian Zhang, Michiaki Hamada

BMC bioinformatics 21 ( 1 ) 103 - 103 2020.03 [International journal]

BACKGROUND: Methylated RNA immunoprecipitation sequencing (MeRIP-Seq) is a popular sequencing method for studying RNA modifications and, in particular, N6-methyladenosine (m6A), the most abundant RNA methylation modification found in various species. The detection of enriched regions is a main challenge of MeRIP-Seq analysis; however, current tools either require a long time or do not fully utilize features of RNA sequencing, such as strand information, which could cause ambiguous calling. In addition, with more attention being paid to MeRIP-Seq treatment experiments, biologists need an intuitive evaluation of the treatment effect from comparisons. Therefore, efficient and user-friendly software that can solve these tasks must be developed. RESULTS: We developed software named "model-based analysis and inference of MeRIP-Seq (MoAIMS)" to detect enriched regions of MeRIP-Seq and infer signal proportion based on a mixture negative-binomial model. MoAIMS is designed for transcriptome immunoprecipitation sequencing experiments; therefore, it is compatible with different RNA sequencing protocols. MoAIMS offers excellent processing speed and competitive performance when compared with other tools. When MoAIMS is applied to studies of m6A, the detected enriched regions contain known biological features of m6A. Furthermore, signal proportion inferred from MoAIMS for m6A treatment datasets (perturbation of m6A methyltransferases) showed a decreasing trend that is consistent with experimental observations, suggesting that the signal proportion can be used as an intuitive indicator of treatment effect. CONCLUSIONS: MoAIMS is efficient and easy-to-use software implemented in R. MoAIMS can not only detect enriched regions of MeRIP-Seq efficiently but also provide an intuitive evaluation of the treatment effect for MeRIP-Seq treatment datasets.

Citations (Scopus): 12

Nucleosome destabilization by nuclear non-coding RNAs.

Risa Fujita, Tatsuro Yamamoto, Yasuhiro Arimura, Saori Fujiwara, Hiroaki Tachiwana, Yuichi Ichikawa, Yuka Sakata, Liying Yang, Reo Maruyama, Michiaki Hamada, Mitsuyoshi Nakao, Noriko Saitoh, Hitoshi Kurumizaka

Communications biology 3 ( 1 ) 60 - 60 2020.02 [Refereed] [International journal]

In the nucleus, genomic DNA is wrapped around histone octamers to form nucleosomes. In principle, nucleosomes are substantial barriers to transcriptional activities. Nuclear non-coding RNAs (ncRNAs) are proposed to function in chromatin conformation modulation and transcriptional regulation. However, it remains unclear how ncRNAs affect the nucleosome structure. Eleanors are clusters of ncRNAs that accumulate around the estrogen receptor-α (ESR1) gene locus in long-term estrogen deprivation (LTED) breast cancer cells, and markedly enhance the transcription of the ESR1 gene. Here we detected nucleosome depletion around the transcription site of Eleanor2, the most highly expressed Eleanor in the LTED cells. We found that the purified Eleanor2 RNA fragment drastically destabilized the nucleosome in vitro. This activity was also exerted by other ncRNAs, but not by poly(U) RNA or DNA. The RNA-mediated nucleosome destabilization may be a common feature among natural nuclear RNAs, and may function in transcription regulation in chromatin.

Citations (Scopus): 10

Targeting the TR4 nuclear receptor-mediated lncTASR/AXL signaling with tretinoin increases the sunitinib sensitivity to better suppress the RCC progression.

Hangchuan Shi, Yin Sun, Miao He, Xiong Yang, Michiaki Hamada, Tsukasa Fukunaga, Xiaoping Zhang, Chawnshang Chang

Oncogene 39 ( 3 ) 530 - 545 2020.01 [Refereed] [International journal]

Renal cell carcinoma (RCC) is one of the most lethal urological tumors. Sunitinib, used to improve survival, has become the first-line therapy for metastatic RCC patients. However, the occurrence of sunitinib resistance in clinical application has curtailed its efficacy. Here we found that the TR4 nuclear receptor might alter sunitinib resistance in RCC via the TR4/lncTASR/AXL signaling. Mechanism dissection revealed that TR4 could modulate lncTASR (ENST00000600671.1) expression via transcriptional regulation, which might then increase AXL protein expression by enhancing the stability of AXL mRNA, thereby increasing sunitinib resistance in RCC. Human clinical surveys also linked the expression of TR4, lncTASR, and AXL to RCC survival. Results from multiple RCC cell lines revealed that targeting this newly identified TR4-mediated signaling with small molecules, including tretinoin and metformin, or with TR4-shRNAs increased sunitinib sensitivity and better suppressed RCC progression, and our preclinical study using an in vivo mouse model further showed that tretinoin had a better synergistic effect in increasing sunitinib sensitivity to suppress RCC progression. Future successful clinical trials may help in the development of a novel therapy to better suppress RCC progression.

Citations (Scopus): 24

Discovering novel mutation signatures by latent Dirichlet allocation with variational Bayes inference.

Taro Matsutani, Yuki Ueno, Tsukasa Fukunaga, Michiaki Hamada

Bioinformatics (Oxford, England) 35 ( 22 ) 4543 - 4552 2019.11 [Refereed] [International journal]

MOTIVATION: A cancer genome includes many mutations derived from various mutagens and mutational processes, leading to specific mutation patterns. It is known that each mutational process leads to characteristic mutations, and when a mutational process has preferences for mutations, this situation is called a 'mutation signature.' Identification of mutation signatures is an important task for elucidation of carcinogenic mechanisms. In previous studies, analyses with statistical approaches (e.g. non-negative matrix factorization and latent Dirichlet allocation) revealed a number of mutation signatures. Nonetheless, strictly speaking, these existing approaches employ an ad hoc method or incorrect approximation to estimate the number of mutation signatures, and the whole picture of mutation signatures is unclear. RESULTS: In this study, we present a novel method for estimating the number of mutation signatures, latent Dirichlet allocation with variational Bayes inference (VB-LDA), in which variational lower bounds are utilized for finding a plausible number of mutation patterns. In addition, we performed cluster analyses for estimated mutation signatures to extract novel mutation signatures that appear in multiple primary lesions. In a simulation with artificial data, we confirmed that our method estimated the correct number of mutation signatures. Furthermore, applying our method in combination with clustering procedures for real mutation data revealed many interesting mutation signatures that have not been previously reported. AVAILABILITY AND IMPLEMENTATION: All the predicted mutation signatures with clustering results are freely available at http://www.f.waseda.jp/mhamada/MS/index.html. All the C++ source code and Python scripts utilized in this study can be downloaded from https://github.com/qkirikigaku/MS_LDA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Citations (Scopus): 11

Stromal fibroblasts induce metastatic tumor cell clusters via epithelial-mesenchymal plasticity.

Yuko Matsumura, Yasuhiko Ito, Yoshihiro Mezawa, Kaidiliayi Sulidan, Yataro Daigo, Toru Hiraga, Kaoru Mogushi, Nadila Wali, Hiromu Suzuki, Takumi Itoh, Yohei Miyagi, Tomoyuki Yokose, Satoru Shimizu, Atsushi Takano, Yasuhisa Terao, Harumi Saeki, Masayuki Ozawa, Masaaki Abe, Satoru Takeda, Ko Okumura, Sonoko Habu, Okio Hino, Kazuyoshi Takeda, Michiaki Hamada, Akira Orimo

Life science alliance 2 ( 4 ) 2019.08 [Refereed] [International journal]

Emerging evidence supports the hypothesis that multicellular tumor clusters invade and seed metastasis. However, whether tumor-associated stroma induces epithelial-mesenchymal plasticity in tumor cell clusters, to promote invasion and metastasis, remains unknown. We demonstrate herein that carcinoma-associated fibroblasts (CAFs) frequently present in tumor stroma drive the formation of tumor cell clusters composed of two distinct cancer cell populations, one in a highly epithelial (E-cadherin^hi ZEB1^lo/neg: E^hi) state and another in a hybrid epithelial/mesenchymal (E-cadherin^lo ZEB1^hi: E/M) state. The E^hi cells highly express oncogenic cell-cell adhesion molecules, such as carcinoembryonic antigen-related cell adhesion molecule 5 (CEACAM5) and CEACAM6 that associate with E-cadherin, resulting in increased tumor cell cluster formation and metastatic seeding. The E/M cells also retain associations with E^hi cells, which follow the E/M cells leading to collective invasion. CAF-produced stromal cell-derived factor 1 and transforming growth factor-β confer the E^hi and E/M states as well as invasive and metastatic traits via Src activation in apposed human breast tumor cells. Taken together, these findings indicate that invasive and metastatic tumor cell clusters are induced by CAFs via epithelial-mesenchymal plasticity.

Citations (Scopus): 57

Identification of RNA biomarkers for chemical safety screening in neural cells derived from mouse embryonic stem cells using RNA deep sequencing analysis.

Hidenori Tani, Taro Matsutani, Hiroshi Aoki, Kaoru Nakamura, Yu Hamaguchi, Tetsuya Nakazato, Michiaki Hamada

Biochemical and biophysical research communications 512 ( 4 ) 641 - 646 2019.05 [Refereed] [International journal]

Chemical safety screening requires the development of more efficient assays that do not involve testing in animals. In vitro cell-based assays are among the most appropriate alternatives to animal testing for screening of chemical toxicity. Most studies performed to date made use of mRNAs as biomarkers. Recent studies have, however, indicated the presence of many unannotated non-coding RNAs (ncRNAs) in the transcriptome that do not appear to encode proteins. In the present study, we performed whole-transcriptome sequencing analysis (RNA-Seq) to identify novel RNA biomarkers, including ncRNAs, which showed marked responses to the toxicity of nine chemicals. Chemical safety screening was performed in cell-based assays using mouse embryonic stem cell (mESC)-derived neural cells. Marked responses in the expression of some ncRNAs to the chemical compounds were observed. The results of the present study suggested that ncRNAs may be useful in chemical safety screening as novel RNA biomarkers.

Citations (Scopus): 1

LncRRIsearch: A Web Server for lncRNA-RNA Interaction Prediction Integrated With Tissue-Specific Expression and Subcellular Localization Data.

Tsukasa Fukunaga, Junichi Iwakiri, Yukiteru Ono, Michiaki Hamada

Frontiers in genetics 10 462 - 462 2019 [Refereed] [International journal]

Long non-coding RNAs (lncRNAs) play critical roles in various biological processes, but the function of the majority of lncRNAs is still unclear. One approach for estimating the function of a lncRNA is to identify its interaction targets, because lncRNAs often exert their functions through interaction with other biomolecules. In this paper, we developed "LncRRIsearch," which is a web server for comprehensive prediction of human and mouse lncRNA-lncRNA and lncRNA-mRNA interactions. The prediction was conducted using RIblast, which is a fast and accurate RNA-RNA interaction prediction tool. Users can investigate interaction target RNAs of a particular lncRNA through a web interface. In addition, we integrated tissue-specific expression and subcellular localization data for the lncRNAs with the web server. These data enable users to examine tissue-specific or subcellular localized lncRNA interactions. LncRRIsearch is publicly accessible at http://rtools.cbrc.jp/LncRRIsearch/.

Citations (Scopus): 108

DeepM6ASeq: prediction and characterization of m6A-containing sequences using deep learning.

Yiqian Zhang, Michiaki Hamada

BMC bioinformatics 19 ( Suppl 19 ) 524 - 524 2018.12 [Refereed] [International journal]

BACKGROUND: N6-methyladenosine (m6A) is a common and abundant RNA methylation modification found in various species. As a type of post-transcriptional methylation, m6A plays an important role in diverse RNA activities such as alternative splicing, an interplay with microRNAs and translation efficiency. Although existing tools can predict m6A at single-base resolution, it is still challenging to extract the biological information surrounding m6A sites. RESULTS: We implemented a deep learning framework, named DeepM6ASeq, to predict m6A-containing sequences and characterize surrounding biological features based on miCLIP-Seq data, which detects m6A sites at single-base resolution. DeepM6ASeq showed better performance as compared to other machine learning classifiers. Moreover, an independent test on m6A-Seq data, which identifies m6A-containing genomic regions, revealed that our model is competitive in predicting m6A-containing sequences. The learned motifs from DeepM6ASeq correspond to known m6A readers. Notably, DeepM6ASeq also identifies a newly recognized m6A reader: FMR1. In addition, we found that a saliency map in the deep learning model could be utilized to visualize locations of m6A sites. CONCLUSION: We developed a deep-learning-based framework to predict and characterize m6A-containing sequences and hope that it helps investigators gain more insight into m6A research. The source code is available at https://github.com/rreybeyb/DeepM6ASeq .

Citations (Scopus): 124

Identifying sequence features that drive ribosomal association for lncRNA.

Chao Zeng, Michiaki Hamada

BMC genomics 19 ( Suppl 10 ) 906 - 906 2018.12 [Refereed] [International journal]

BACKGROUND: With the increasing number of annotated long noncoding RNAs (lncRNAs) from the genome, researchers are continually updating their understanding of lncRNAs. Recently, thousands of lncRNAs have been reported to be associated with ribosomes in mammals. However, their biological functions or mechanisms are still unclear. RESULTS: In this study, we investigated the sequence features involved in the ribosomal association of lncRNA. We extracted ninety-nine sequence features corresponding to different biological mechanisms (i.e., RNA splicing, putative ORF, k-mer frequency, RNA modification, RNA secondary structure, and repeat element). An L1-regularized logistic regression model was applied to screen these features. Finally, we obtained fifteen and nine important features for the ribosomal association of human and mouse lncRNAs, respectively. CONCLUSION: To our knowledge, this is the first study to characterize ribosome-associated lncRNAs and ribosome-free lncRNAs from the perspective of sequence features. The sequence features identified in this study may shed light on the biological mechanism of the ribosomal association and provide important clues for functional analysis of lncRNAs.

DOI PubMed

Citations (Scopus): 18
Nearest-neighbor parameter for inosine-cytosine pairs through a combined experimental and computational approach

Shun Sakuraba, Junichi Iwakiri, Michiaki Hamada, Tomoshi Kameda, Genichiro Tsuji, Yasuaki Kimura, Hiroshi Abe, Kiyoshi Asai

2018.10 [Refereed]

In RNA secondary structure prediction, nearest-neighbor parameters are used to determine the stability of a given structure. We derived the nearest-neighbor parameters for RNAs containing inosine-cytosine pairs. For parameter derivation, we developed a method that combines UV absorption measurement experiments with free-energy calculations using molecular dynamics simulations. The method provides fast drop-in parameters for modified bases. Derived parameters were compared and found to be consistent with existing parameters for canonical RNAs. On average, a duplex with an internal inosine-cytosine pair is 0.9 kcal/mol less stable than the same duplex with an internal guanine-cytosine pair, and is about as stable as the one with an internal adenine-uracil pair (only 0.1 kcal/mol more stable).

DOI
A Novel Method for Assessing the Statistical Significance of RNA-RNA Interactions Between Two Long RNAs.

Tsukasa Fukunaga, Michiaki Hamada

Journal of computational biology : a journal of computational molecular cell biology 25 ( 9 ) 976 - 986 2018.09 [Refereed] [International journal]

RNA-RNA interactions are key mechanisms through which noncoding RNA (ncRNA) regions exert biological functions. Computational prediction of RNA-RNA interactions is an essential method for detecting novel RNA-RNA interactions because their comprehensive detection by biological experimentation is still quite difficult. Many RNA-RNA interaction prediction tools have been developed, but they tend to produce many false positives. Accordingly, assessment of the statistical significance of computationally predicted interactions is an important task. However, there is no method to evaluate the statistical significance of RNA-RNA interactions that is applicable to interactions between two long RNA sequences. We developed a method to calculate the p-value for the minimal interaction energy between two long RNA sequences. The developed method depends on the fact that minimum interaction energies of RNA-RNA interactions between long RNAs follow a Gumbel distribution when repeat sequences in RNAs are masked. To show the usefulness of the developed method, we applied it to whole human 5'-untranslated region (UTR) and 3'-UTR sequences to detect novel 5'-UTR-3'-UTR interactions. We thus identified two significant 5'-UTR-3'-UTR interactions. Specifically, the human small proline-rich repeat protein 3 shows conserved 5'-UTR-3'-UTR interactions with some nucleotide variations preserving base pairings among primates. Our developed method enables us to detect statistically significant RNA-RNA interactions between long RNAs such as long ncRNAs. Statistical significance estimates help in identification of interactions for experimental validation and provide novel insights into the function of ncRNA regions.
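
As a rough illustration of the p-value idea described in this summary, the Python sketch below fits a Gumbel distribution for minima to a set of background minimum interaction energies and evaluates the left-tail probability of an observed energy. The method-of-moments fit, the function names, and the energy values are assumptions made for this example; the summary does not state how the distribution parameters are estimated.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_min(energies):
    """Method-of-moments fit of a Gumbel (minimum) distribution.

    For this distribution, mean = mu - gamma * beta and
    variance = (pi * beta) ** 2 / 6.
    """
    mean = statistics.fmean(energies)
    std = statistics.stdev(energies)
    beta = std * math.sqrt(6) / math.pi
    mu = mean + EULER_GAMMA * beta
    return mu, beta


def gumbel_min_pvalue(energy, mu, beta):
    """P(E_min <= energy): chance of seeing an energy at least this low."""
    z = (energy - mu) / beta
    # CDF of the Gumbel (minimum) distribution: 1 - exp(-exp(z)).
    return -math.expm1(-math.exp(z))


# Hypothetical background: minimum energies (kcal/mol) from shuffled sequences.
background = [-10.0, -12.0, -11.0, -15.0, -9.5, -13.0]
mu, beta = fit_gumbel_min(background)
print(gumbel_min_pvalue(-20.0, mu, beta))
```

Lower (more negative) observed energies map to smaller p-values, so a threshold on this value selects interactions that are unlikely under the fitted background.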

DOI PubMed

Citations (Scopus): 2
Computational approaches for alternative and transient secondary structures of ribonucleic acids.

Tsukasa Fukunaga, Michiaki Hamada

Briefings in functional genomics 18 ( 3 ) 182 - 191 2018.06 [Refereed] [International journal]

Transient and alternative structures of ribonucleic acids (RNAs) play essential roles in various regulatory processes, such as translation regulation in living cells. Because experimental analyses for RNA structures are difficult and time-consuming, computational approaches based on RNA secondary structures are promising. In this article, we review computational methods for detecting and analyzing transient/alternative secondary structures of RNAs, including static approaches based on probabilistic distributions of RNA secondary structures and dynamic approaches such as kinetic folding and folding pathway predictions.

DOI PubMed

Citations (Scopus): 3
Identification and analysis of ribosome-associated lncRNAs using ribosome profiling data.

Chao Zeng, Tsukasa Fukunaga, Michiaki Hamada

BMC genomics 19 ( 1 ) 414 - 414 2018.05 [Refereed] [International journal]

BACKGROUND: Although the number of discovered long non-coding RNAs (lncRNAs) has increased dramatically, their biological roles have not been established. Many recent studies have used ribosome profiling data to assess the protein-coding capacity of lncRNAs. However, very little work has been done to identify ribosome-associated lncRNAs, here defined as lncRNAs interacting with ribosomes related to protein synthesis as well as other unclear biological functions. RESULTS: On average, 39.17% of expressed lncRNAs were observed to interact with ribosomes in human and 48.16% in mouse. We developed the ribosomal association index (RAI), which quantifies the evidence for ribosomal associability of lncRNAs over various tissues and cell types, to catalog 691 and 409 lncRNAs that are robustly associated with ribosomes in human and mouse, respectively. Moreover, we identified 78 and 42 lncRNAs with a high probability of coding peptides in human and mouse, respectively. Compared with ribosome-free lncRNAs, ribosome-associated lncRNAs were observed to be more likely to be located in the cytoplasm and more sensitive to nonsense-mediated decay. CONCLUSION: Our results suggest that RAI can be used as an integrative and evidence-based tool for distinguishing between ribosome-associated and free lncRNAs, providing a valuable resource for the study of lncRNA functions.

DOI PubMed

Citations (Scopus): 58
Estimating energy parameters for RNA secondary structure predictions using both experimental and computational data.

Nishida S, Sakuraba S, Asai K, Hamada M

IEEE/ACM transactions on computational biology and bioinformatics 16 ( 5 ) 1645 - 1655 2018.03 [Refereed]

DOI PubMed
Beyond similarity assessment: selecting the optimal model for sequence alignment via the Factorized Asymptotic Bayesian algorithm.

Taikai Takeda, Michiaki Hamada, John Hancock

Bioinformatics (Oxford, England) 34 ( 4 ) 576 - 584 2018.02 [Refereed] [International journal]

Motivation: Pair Hidden Markov Models (PHMMs) are probabilistic models used for pairwise sequence alignment, a quintessential problem in bioinformatics. PHMMs include three types of hidden states: match, insertion and deletion. Most previous studies have used one or two hidden states for each PHMM state type. However, few studies have examined the number of states suitable for representing sequence data or improving alignment accuracy. Results: We developed a novel method to select superior models (including the number of hidden states) for PHMM. Our method selects models with the highest posterior probability using Factorized Information Criterion, which is widely utilized in model selection for probabilistic models with hidden variables. Our simulations indicated that this method has excellent model selection capabilities with slightly improved alignment accuracy. We applied our method to DNA datasets from 5 and 28 species, ultimately selecting more complex models than those used in previous studies. Availability and implementation: The software is available at https://github.com/bigsea-t/fab-phmm. Contact: mhamada@waseda.jp. Supplementary information: Supplementary data are available at Bioinformatics online.

DOI PubMed

Scopus
In silico approaches to RNA aptamer design.

Michiaki Hamada

Biochimie 145 8 - 14 2018.02 [Refereed] [International journal]

RNA aptamers are ribonucleic acids that bind to specific target molecules. An RNA aptamer for a disease-related protein has great potential for development into a new drug. However, huge time and cost investments are required to develop an RNA aptamer into a pharmaceutical. Recently, SELEX combined with high-throughput sequencers (i.e., HT-SELEX) has been widely used to select candidate RNA aptamers that bind to a target protein with high affinity and specificity. After candidate selection, further optimizations such as shortening and modifying candidate sequences are performed. In these steps, in silico approaches are expected to reduce the time and cost associated with aptamer drug development. In this article, we review existing in silico approaches to RNA aptamer development, including a method for ranking the candidates of RNA aptamers from HT-SELEX data, clustering a huge number of aptamer sequences, and finding motifs amidst a set of significant RNA aptamers. It is expected that further studies in addition to these methods will be utilized for in silico RNA aptamer design, permitting a minimal number of experiments to be performed through the utilization of sophisticated computational methods.

DOI PubMed

Citations (Scopus): 50
Identification of Transposable Elements Contributing to Tissue-Specific Expression of Long Non-Coding RNAs.

Takafumi Chishima, Junichi Iwakiri, Michiaki Hamada

Genes 9 ( 1 ) 2018.01 [Refereed] [International journal]

It has been recently suggested that transposable elements (TEs) are re-used as functional elements of long non-coding RNAs (lncRNAs). This is supported by some examples such as the human endogenous retrovirus subfamily H (HERVH) elements contained within lncRNAs and expressed specifically in human embryonic stem cells (hESCs), as required to maintain hESC identity. There are at least two unanswered questions about all lncRNAs. How many TEs are re-used within lncRNAs? Are there any other TEs that affect tissue specificity of lncRNA expression? To answer these questions, we comprehensively identify TEs that are significantly related to tissue-specific expression levels of lncRNAs. We downloaded lncRNA expression data corresponding to normal human tissue from the Expression Atlas and transformed the data into tissue specificity estimates. Then, Fisher's exact tests were performed to verify whether the presence or absence of TE-derived sequences influences the tissue specificity of lncRNA expression. Many TE-tissue pairs associated with tissue-specific expression of lncRNAs were detected, indicating that multiple TE families can be re-used as functional domains or regulatory sequences of lncRNAs. In particular, we found that the antisense promoter region of L1PA2, a LINE-1 subfamily, appears to act as a promoter for lncRNAs with placenta-specific expression.
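
The association test mentioned in this summary can be sketched as a one-sided Fisher's exact test on a 2x2 table. In the Python example below, the table layout (TE present or absent versus tissue-specific or not), the choice of a one-sided test, the function name, and the counts are all assumptions for illustration; the summary does not give these details.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins,
    i.e. the probability of at least `a` co-occurrences by chance.
    """
    row1 = a + b          # e.g. lncRNAs containing the TE
    col1 = a + c          # e.g. lncRNAs specific to the tissue
    n = a + b + c + d     # all lncRNAs considered
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Hypothetical counts:
#                      tissue-specific   not tissue-specific
# TE present                  12                  30
# TE absent                   40                 900
print(fisher_exact_greater(12, 30, 40, 900))
```

When many TE-tissue pairs are tested this way, the resulting p-values would normally need a multiple-testing correction before being called significant.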

DOI PubMed

Citations (Scopus): 49
The hominoid-specific gene DSCR4 is involved in regulation of human leukocyte migration

Saber MM, Karimiavargani M, Hettiarachchi N, Hamada M, Uzawa T, Ito Y, Saitou N

2017.09

DOI
RIblast: an ultrafast RNA-RNA interaction prediction system based on a seed-and-extension approach.

Tsukasa Fukunaga, Michiaki Hamada

Bioinformatics (Oxford, England) 33 ( 17 ) 2666 - 2674 2017.09 [Refereed] [International journal]

Motivation: LncRNAs play important roles in various biological processes. Although more than 58 000 human lncRNA genes have been discovered, most known lncRNAs are still poorly characterized. One approach to understanding the functions of lncRNAs is the detection of the interacting RNA target of each lncRNA. Because experimental detections of comprehensive lncRNA-RNA interactions are difficult, computational prediction of lncRNA-RNA interactions is an indispensable technique. However, the high computational costs of existing RNA-RNA interaction prediction tools prevent their application to large-scale lncRNA datasets. Results: Here, we present 'RIblast', an ultrafast RNA-RNA interaction prediction method based on the seed-and-extension approach. RIblast discovers seed regions using suffix arrays and subsequently extends seed regions based on an RNA secondary structure energy model. Computational experiments indicate that RIblast achieves a level of prediction accuracy similar to those of existing programs, but at speeds over 64 times faster than existing programs. Availability and implementation: The source code of RIblast is freely available at https://github.com/fukunagatsu/RIblast . Contact: t.fukunaga@kurenai.waseda.jp or mhamada@waseda.jp. Supplementary information: Supplementary data are available at Bioinformatics online.

DOI PubMed

Citations (Scopus): 91
Computational prediction of lncRNA-mRNA interactions by integrating tissue specificity in human transcriptome.

Junichi Iwakiri, Goro Terai, Michiaki Hamada

Biology direct 12 ( 1 ) 15 - 15 2017.06 [Refereed] [International journal]

Long noncoding RNAs (lncRNAs) play a key role in normal tissue differentiation and cancer development through their tissue-specific expression in the human transcriptome. Recent investigations of macromolecular interactions have shown that tissue-specific lncRNAs form base-pairing interactions with various mRNAs associated with tissue differentiation, suggesting that tissue specificity is an important factor controlling human lncRNA-mRNA interactions. Here, we report investigations of the tissue specificities of lncRNAs and mRNAs using RNA-seq data across various human tissues, as well as computational predictions of tissue-specific lncRNA-mRNA interactions inferred by integrating the tissue specificity of lncRNAs and mRNAs into our comprehensive prediction of human lncRNA-RNA interactions. Our predicted lncRNA-mRNA interactions were evaluated by comparison with experimentally validated lncRNA-mRNA interactions (between the TINCR lncRNA and mRNAs), showing improved prediction accuracy over previous prediction methods that did not account for tissue specificities of lncRNAs and mRNAs. In addition, our predictions suggest potential functions of the TINCR lncRNA not only in epidermal differentiation but also in esophageal development through lncRNA-mRNA interactions. REVIEWERS: This article was reviewed by Dr. Weixiong Zhang and Dr. Bojan Zagrovic.

DOI PubMed

Citations (Scopus): 42
AMAP: A pipeline for whole-genome mutation detection in Arabidopsis thaliana.

Kotaro Ishii, Yusuke Kazama, Tomonari Hirano, Michiaki Hamada, Yukiteru Ono, Mieko Yamada, Tomoko Abe

Genes & genetic systems 91 ( 4 ) 229 - 233 2017.03 [Refereed] [Domestic journal]

Detection of mutations at the whole-genome level is now possible by the use of high-throughput sequencing. However, determining mutations is a time-consuming process due to the number of false positives provided by mutation-detecting programs. AMAP (automated mutation analysis pipeline) was developed to overcome this issue. AMAP integrates a set of well-validated programs for mapping (BWA), removal of potential PCR duplicates (Picard), realignment (GATK) and detection of mutations (SAMtools, GATK, Pindel, BreakDancer and CNVnator). Thus, all types of mutations such as base substitution, deletion, insertion, translocation and chromosomal rearrangement can be detected by AMAP. In addition, AMAP automatically distinguishes false positives by comparing lists of candidate mutations in sequenced mutants. We tested AMAP by inputting already analyzed read data derived from three individual Arabidopsis thaliana mutants and confirmed that all true mutations were included in the list of candidate mutations. The result showed that the number of false positives was reduced to 12% of that obtained in a previous analysis that lacked a process of reducing false positives. Thus, AMAP will accelerate not only the analysis of mutation induction by individual mutagens but also the process of forward genetics.

DOI PubMed

Citations (Scopus): 4
Training alignment parameters for arbitrary sequencers with LAST-TRAIN.

Michiaki Hamada, Yukiteru Ono, Kiyoshi Asai, Martin C Frith

Bioinformatics (Oxford, England) 33 ( 6 ) 926 - 928 2017.03 [Refereed] [International journal]

Summary: LAST-TRAIN improves sequence alignment accuracy by inferring substitution and gap scores that fit the frequencies of substitutions, insertions, and deletions in a given dataset. We have applied it to mapping DNA reads from IonTorrent and PacBio RS, and we show that it reduces reference bias for Oxford Nanopore reads. Availability and Implementation: the source code is freely available at http://last.cbrc.jp/. Contact: mhamada@waseda.jp or mcfrith@edu.k.u-tokyo.ac.jp. Supplementary information: Supplementary data are available at Bioinformatics online.

DOI PubMed

Citations (Scopus): 58
Improved Accuracy in RNA-Protein Rigid Body Docking by Incorporating Force Field for Molecular Dynamics Simulation into the Scoring Function.

Junichi Iwakiri, Michiaki Hamada, Kiyoshi Asai, Tomoshi Kameda

Journal of chemical theory and computation 12 ( 9 ) 4688 - 97 2016.09 [Refereed] [International journal]

RNA-protein interactions play fundamental roles in many biological processes. To understand these interactions, it is necessary to know the three-dimensional structures of RNA-protein complexes. However, determining the tertiary structure of these complexes is often difficult, suggesting that an accurate rigid body docking for RNA-protein complexes is needed. In general, the rigid body docking process is divided into two steps: generating candidate structures from the individual RNA and protein structures and then narrowing down the candidates. In this study, we focus on the former problem to improve the prediction accuracy in RNA-protein docking. Our method is based on the integration of physicochemical information about RNA into ZDOCK, which is known as one of the most successful computer programs for protein-protein docking. Because recent studies showed that the current force fields for molecular dynamics simulation of proteins and nucleic acids are quite accurate, we modeled the physicochemical information about RNA using force fields such as AMBER and CHARMM. A comprehensive benchmark of RNA-protein docking, using three recently developed data sets, reveals the remarkable prediction accuracy of the proposed method compared with existing programs for docking: the highest success rate is 34.7% for the predicted structure of the RNA-protein complex with the best score and 79.2% for 3,600 predicted ones. Three full atomistic force fields for RNA (AMBER94, AMBER99, and CHARMM22) produced almost equally accurate results, which shows that current force fields for nucleic acids are quite accurate. In addition, we found that the electrostatic interaction and the representation of shape complementarity between protein and RNA play important roles in the accurate prediction of the native structures of RNA-protein complexes.

DOI PubMed J-GLOBAL

Citations (Scopus): 27
RIblast: An ultrafast RNA-RNA interaction prediction system for comprehensive lncRNA interaction analysis

Fukunaga T, Hamada M

2016.09

DOI
Computational Approaches for Long Non-coding RNA Research

IWAKIRI Junichi, HAMADA Michiaki

Seibutsu Butsuri 56 ( 4 ) 217 - 220 2016.08 [Refereed]

Recent advances in high-throughput sequencing technologies have revealed that a large number of long non-coding RNAs (lncRNAs) are transcribed from the human genome. These emerging transcripts now need to be functionally classified and annotated. Here we review several bioinformatic approaches for analyzing the important characteristics of lncRNAs toward discovering their functions: 1) tissue specificities of lncRNA expression, 2) two types of macromolecular interactions (RNA-RNA and RNA-protein interactions), and 3) secondary structures of lncRNAs.

DOI CiNii
Rtools: a web server for various secondary structural analyses on single RNA sequences

Michiaki Hamada, Yukiteru Ono, Hisanori Kiryu, Kengo Sato, Yuki Kato, Tsukasa Fukunaga, Ryota Mori, Kiyoshi Asai

NUCLEIC ACIDS RESEARCH 44 ( W1 ) W302 - W307 2016.07 [Refereed]

Secondary structures, as well as nucleotide sequences, are important features of RNA molecules for characterizing their functions. According to the thermodynamic model, however, the probability of any single secondary structure is very small. As a consequence, any tool that predicts the secondary structures of RNAs has limited accuracy. On the other hand, there are a few tools that compensate for the imperfect predictions by calculating and visualizing secondary structural information from RNA sequences. It is desirable to obtain the rich information from those tools through a friendly interface. We implemented a web server of tools to predict secondary structures and to calculate various structural features based on the energy models of secondary structures. By just giving an RNA sequence to the web server, the user can get different types of solutions for the secondary structures, marginal probabilities such as base-pairing probabilities, loop probabilities and accessibilities of the local bases, the energy changes caused by arbitrary base mutations, as well as measures for validating the predicted secondary structures. The web server is available at http://rtools.cbrc.jp, and integrates the software tools CentroidFold, CentroidHomfold, IPKnot, CapR, Raccess, Rchange and RintD.

DOI PubMed
Comprehensive prediction of lncRNA-RNA interactions in human transcriptome

Goro Terai, Junichi Iwakiri, Tomoshi Kameda, Michiaki Hamada, Kiyoshi Asai

BMC GENOMICS 17 ( S-1 ) 12 2016 [Refereed]

Motivation: Recent studies have revealed that large numbers of non-coding RNAs are transcribed in humans, but only a few of them have been identified with their functions. Identification of the interaction target RNAs of the non-coding RNAs is an important step in predicting their functions. The current experimental methods to identify RNA-RNA interactions, however, are not fast enough to apply to a whole human transcriptome. Therefore, computational predictions of RNA-RNA interactions are desirable, but this is a challenging task due to the huge computational costs involved.
Results: Here, we report comprehensive predictions of the interaction targets of lncRNAs in a whole human transcriptome for the first time. To achieve this, we developed an integrated pipeline for predicting RNA-RNA interactions on the K computer, which is one of the fastest super-computers in the world. Comparisons with experimentally-validated lncRNA-RNA interactions support the quality of the predictions. Additionally, we have developed a database that catalogs the predicted lncRNA-RNA interactions to provide fundamental information about the targets of lncRNAs.

DOI

Citations (Scopus): 55
Bioinformatics tools for lncRNA research

Junichi Iwakiri, Michiaki Hamada, Kiyoshi Asai

BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 1859 ( 1 ) 23 - 30 2016.01 [Refereed]

Current experimental methods to identify the functions of the large number of candidate long non-coding RNAs (lncRNAs) are limited in their throughput. Therefore, it is essential to know which tools are effective for understanding lncRNAs so that reasonable speed and accuracy can be achieved. In this paper, we review the currently available bioinformatics tools and databases that are useful for finding non-coding RNAs and analyzing their structures, conservation, interactions, co-expressions and localization. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa. (C) 2015 Elsevier B.V. All rights reserved.

DOI PubMed J-GLOBAL

Citations (Scopus): 50
Privacy-preserving search for chemical compound databases

Kana Shimizu, Koji Nuida, Hiromi Arai, Shigeo Mitsunari, Nuttapong Attrapadung, Michiaki Hamada, Koji Tsuda, Takatsugu Hirokawa, Jun Sakuma, Goichiro Hanaoka, Kiyoshi Asai

BMC BIOINFORMATICS 16 ( 18 ) S6 2015.12 [Refereed]

Background: Searching for similar compounds in a database is the most important process for in-silico drug screening. Since a query compound is an important starting point for the new drug, a query holder, who is afraid of the query being monitored by the database server, usually downloads all the records in the database and uses them in a closed network. However, a serious dilemma arises when the database holder also wants to output no information except for the search results, and such a dilemma prevents the use of many important data resources.
Results: In order to overcome this dilemma, we developed a novel cryptographic protocol that enables database searching while keeping both the query holder's privacy and database holder's privacy. Generally, the application of cryptographic techniques to practical problems is difficult because versatile techniques are computationally expensive while computationally inexpensive techniques can perform only trivial computation tasks. In this study, our protocol is successfully built only from an additive-homomorphic cryptosystem, which allows only addition performed on encrypted values but is computationally efficient compared with versatile techniques such as general purpose multi-party computation. In an experiment searching ChEMBL, which consists of more than 1,200,000 compounds, the proposed method was 36,900 times faster in CPU time and 12,000 times as efficient in communication size compared with general purpose multi-party computation.
Conclusion: We proposed a novel privacy-preserving protocol for searching chemical compound databases. The proposed method, easily scaling for large-scale databases, may help to accelerate drug discovery research by making full use of unused but valuable data that includes sensitive information.
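
To make the notion of an additive-homomorphic cryptosystem concrete, the toy Python sketch below uses the Paillier scheme, in which multiplying two ciphertexts yields an encryption of the sum of their plaintexts. Paillier is chosen here only because it is a well-known example; the summary does not say which cryptosystem the protocol uses, and the fingerprint-overlap function, key sizes, and all names are assumptions for illustration rather than the published protocol. The primes are far too small to be secure.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy Paillier keys from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse; valid for generator g = n + 1
    return n, (phi, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n with g = n + 1."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # (n + 1) ** m is congruent to 1 + m * n modulo n ** 2.
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(n, private_key, c):
    phi, mu = private_key
    l_value = (pow(c, phi, n * n) - 1) // n
    return l_value * mu % n


def encrypted_overlap(n, enc_query_bits, db_bits):
    """Server side: encrypted count of positions where both bit vectors are 1.

    The server sees only ciphertexts of the query bits and multiplies the
    ones selected by its own plaintext bits, which adds the hidden values.
    """
    n2 = n * n
    acc = encrypt(n, 0)
    for c, bit in zip(enc_query_bits, db_bits):
        if bit:
            acc = acc * c % n2
    return acc


n, private_key = keygen()
query_bits = [1, 0, 1, 1, 0]   # hypothetical query fingerprint
db_bits = [1, 1, 1, 0, 0]      # hypothetical database fingerprint
enc_query = [encrypt(n, b) for b in query_bits]
print(decrypt(n, private_key, encrypted_overlap(n, enc_query, db_bits)))
```

Only addition of hidden values is available in this setting, which is why a protocol built on it has to express its similarity computation in terms of sums.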

DOI PubMed J-GLOBAL

Citations (Scopus): 12
Learning chromatin states with factorized information criteria

Michiaki Hamada, Yukiteru Ono, Ryohei Fujimaki, Kiyoshi Asai

BIOINFORMATICS 31 ( 15 ) 2426 - 2433 2015.08 [Refereed]

　View Summary

Motivation: Recent studies have suggested that both the genome and the genome with epigenetic modifications, the so-called epigenome, play important roles in various biological functions, such as transcription and DNA replication, repair, and recombination. It is well known that specific combinations of histone modifications (e.g. methylations and acetylations) of nucleosomes induce chromatin states that correspond to specific functions of chromatin. Although the advent of next-generation sequencing (NGS) technologies enables measurement of epigenetic information for entire genomes at high-resolution, the variety of chromatin states has not been completely characterized.
Results: In this study, we propose a method to estimate the chromatin states indicated by genome-wide chromatin marks identified by NGS technologies. The proposed method automatically estimates the number of chromatin states and characterize each state on the basis of a hidden Markov model (HMM) in combination with a recently proposed model selection technique, factorized information criteria. The method is expected to provide an unbiased model because it relies on only two adjustable parameters and avoids heuristic procedures as much as possible. Computational experiments with simulated datasets show that our method automatically learns an appropriate model, even in cases where methods that rely on Bayesian information criteria fail to learn the model structures. In addition, we comprehensively compare our method to ChromHMM on three real datasets and show that our method estimates more chromatin states than ChromHMM for those datasets.

Scopus citations: 9

RNA secondary structure prediction from multi-aligned sequences

Michiaki Hamada

RNA Bioinformatics 1269 17 - 38 2015.01 [Refereed]

It has been well accepted that the RNA secondary structures of most functional non-coding RNAs (ncRNAs) are closely related to their functions and are conserved during evolution. Hence, prediction of conserved secondary structures from evolutionarily related sequences is one important task in RNA bioinformatics; the methods are useful not only to further functional analyses of ncRNAs but also to improve the accuracy of secondary structure predictions and to find novel functional RNAs from the genome. In this review, I focus on common secondary structure prediction from a given aligned RNA sequence, in which one secondary structure whose length is equal to that of the input alignment is predicted. I systematically review and classify existing tools and algorithms for the problem, by utilizing the information employed in the tools and by adopting a unified viewpoint based on maximum expected gain (MEG) estimators. I believe that this classification will allow a deeper understanding of each tool and provide users with useful information for selecting tools for common secondary structure predictions.

Scopus citations: 3

Efficient calculation of exact probability distributions of integer features on RNA secondary structures

Ryota Mori, Michiaki Hamada, Kiyoshi Asai

BMC GENOMICS 15 S6 2014.12 [Refereed]

Background: Although the need for analyses of RNA secondary structures is increasing, predictions of RNA secondary structures are not always reliable. Because an RNA may have a complicated energy landscape, comprehensive representations of the whole ensemble of secondary structures, such as the probability distributions of various features of RNA secondary structures, are required.
Results: A general method to efficiently compute the distribution of any integer scalar/vector function on the secondary structure is proposed. We also show two concrete algorithms, for Hamming distance from a reference structure and for 5' - 3' distance, which can be constructed by following our general method. These practical applications show the effectiveness of the proposed method.
Conclusions: The proposed method provides a clear and comprehensive procedure to construct algorithms for distributions of various integer features. In addition, distributions of integer vectors, that is, combinations of different integer scores, can also be described by applying our 2D expanding technique.

Scopus citations: 12

Reference-free prediction of rearrangement breakpoint reads

Edward Wijaya, Kana Shimizu, Kiyoshi Asai, Michiaki Hamada

BIOINFORMATICS 30 ( 18 ) 2559 - 2567 2014.09 [Refereed]

Motivation: Chromosome rearrangement events are triggered by atypical breaking and rejoining of DNA molecules, which are observed in many cancer-related diseases. The detection of rearrangement is typically done by using short reads generated by next-generation sequencing (NGS) and combining the reads with knowledge of a reference genome. Because structural variations and genomes differ from one person to another, intermediate comparison via a reference genome may lead to loss of information.
Results: In this article, we propose a reference-free method for detecting clusters of breakpoints from the chromosomal rearrangements. This is done by directly comparing a set of NGS normal reads with another set that may be rearranged. Our method SlideSort-BPR (breakpoint reads) is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing depth of 100x, it finds ~88% of the breakpoints correctly with no false-positive reads. Moreover, evaluation on a real prostate cancer dataset shows that the proposed method predicts more fusion transcripts correctly than previous approaches, and yet produces fewer false-positive reads. To our knowledge, this is the first method to detect breakpoint reads without using a reference genome.

Scopus citations: 5

Fighting against uncertainty: an essential issue in bioinformatics

Michiaki Hamada

BRIEFINGS IN BIOINFORMATICS 15 ( 5 ) 748 - 767 2014.09 [Refereed]

Many bioinformatics problems, such as sequence alignment, gene prediction, phylogenetic tree estimation and RNA secondary structure prediction, are often affected by the 'uncertainty' of a solution, that is, the probability of the solution is extremely small. This situation arises for estimation problems on high-dimensional discrete spaces in which the number of possible discrete solutions is immense. In the analysis of biological data or the development of prediction algorithms, this uncertainty should be handled carefully and appropriately. In this review, I will explain several methods to combat this uncertainty, presenting a number of examples in bioinformatics. The methods include (i) avoiding point estimation, (ii) maximum expected accuracy (MEA) estimations and (iii) several strategies to design a pipeline involving several prediction methods. I believe that the basic concepts and ideas described in this review will be generally useful for estimation problems in various areas of bioinformatics.
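
As a toy illustration of item (ii), the sketch below compares the single most probable structure with the structure that minimizes the expected base-pair distance to the whole ensemble, which is one simple instance of a maximum-expected-accuracy style estimate. The three-structure ensemble, its probabilities, and the function names are hypothetical and chosen only for illustration; they are not taken from the review.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def expected_distance(candidate, ensemble):
    """Expected number of base pairs that differ between candidate and the ensemble."""
    cand = base_pairs(candidate)
    return sum(prob * len(cand ^ base_pairs(s)) for s, prob in ensemble)


# Hypothetical ensemble: (structure, probability); probabilities sum to 1.
ensemble = [("......", 0.4), ("((..))", 0.3), ("(....)", 0.3)]

# Point estimate: the single most probable structure.
map_structure = max(ensemble, key=lambda sp: sp[1])[0]

# Expected-accuracy estimate: the structure closest to the ensemble on average.
mea_structure = min((s for s, _ in ensemble),
                    key=lambda s: expected_distance(s, ensemble))

print(map_structure, mea_structure)
```

In this toy case the most probable structure is the empty one (probability 0.4), yet `(....)` has the lower expected distance (0.7 versus 0.9), because the pair (0, 5) carries 60% of the ensemble's probability mass.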

Scopus citations: 13

RNA structural alignments, part II: non-Sankoff approaches for structural alignments.

Asai K, Hamada M

Methods in molecular biology (Clifton, N.J.) 1097 291 - 301 2014 [Refereed]

2P126 Three dimensional structure prediction of RNA-protein complexes by MD simulation (05B. Nucleic acid: Interaction & Complex formation, Poster, The 52nd Annual Meeting of the Biophysical Society of Japan (BSJ2014))

Yura Kei, Iwakiri Junichi, Hamada Michiaki, Asai Kiyoshi, Kameda Tomoshi

Seibutsu Butsuri 54 ( 1 ) S215 2014

Analysis of base-pairing probabilities of RNA molecules involved in protein-RNA interactions

Junichi Iwakiri, Tomoshi Kameda, Kiyoshi Asai, Michiaki Hamada

BIOINFORMATICS 29 ( 20 ) 2524 - 2528 2013.10 [Refereed]

Motivation: Understanding the details of protein-RNA interactions is important to reveal the functions of both the RNAs and the proteins. In these interactions, the secondary structures of the RNAs play an important role. Because RNA secondary structures in protein-RNA complexes are variable, considering the ensemble of RNA secondary structures is a useful approach. In particular, recent studies have supported the idea that, in the analysis of RNA secondary structures, the base-pairing probabilities (BPPs) of RNAs (i.e. the probabilities of forming a base pair in the ensemble of RNA secondary structures) provide richer and more robust information about the structures than a single RNA secondary structure, for example, the minimum free energy structure or a snapshot of structures in the Protein Data Bank. However, there has been no investigation of the BPPs in protein-RNA interactions.
Results: In this study, we analyzed BPPs of RNA molecules involved in known protein-RNA complexes in the Protein Data Bank. Our analysis suggests that, in the tertiary structures, the BPPs (which are computed using only sequence information) for unpaired nucleotides with intermolecular hydrogen bonds (hbonds) to amino acids were significantly lower than those for unpaired nucleotides without hbonds. On the other hand, no difference was found between the BPPs for paired nucleotides with and without intermolecular hbonds. Those findings were commonly supported by three probabilistic models, which provide the ensemble of RNA secondary structures, including the McCaskill model based on Turner's free energy of secondary structures.

Scopus citations: 10

Fighting against uncertainty: An essential issue in bioinformatics

Michiaki Hamada

Briefings in Bioinformatics 15 ( 5 ) 748 - 767 2013.05 [Refereed]

Many bioinformatics problems, such as sequence alignment, gene prediction, phylogenetic tree estimation and RNA secondary structure prediction, are often affected by the 'uncertainty' of a solution, that is, the probability of the solution is extremely small. This situation arises for estimation problems on high-dimensional discrete spaces in which the number of possible discrete solutions is immense. In the analysis of biological data or the development of prediction algorithms, this uncertainty should be handled carefully and appropriately. In this review, I will explain several methods to combat this uncertainty, presenting a number of examples in bioinformatics. The methods include (i) avoiding point estimation, (ii) maximum expected accuracy (MEA) estimations and (iii) several strategies to design a pipeline involving several prediction methods. I believe that the basic concepts and ideas described in this review will be generally useful for estimation problems in various areas of bioinformatics.

Scopus citations: 13

CentroidAlign-Web: A fast and accurate multiple aligner for long non-coding RNAs

Haruka Yonemoto, Kiyoshi Asai, Michiaki Hamada

International Journal of Molecular Sciences 14 ( 3 ) 6144 - 6156 2013.03 [Refereed]

Due to the recent discovery of non-coding RNAs (ncRNAs), multiple sequence alignment (MSA) of those long RNA sequences is becoming increasingly important for classifying and determining the functional motifs in RNAs. However, not only primary (nucleotide) sequences, but also secondary structures of ncRNAs are closely related to their function and are conserved evolutionarily. Hence, information about secondary structures should be considered in the sequence alignment of ncRNAs. Yet, in general, a huge computational time is required in order to compute MSAs, taking secondary structure information into account. In this paper, we describe a fast and accurate web server, called CentroidAlign-Web, which can handle long RNA sequences. The web server also appropriately incorporates information about known secondary structures into MSAs. Computational experiments indicate that our web server is fast and accurate enough to handle long RNA sequences. CentroidAlign-Web is freely available from http://centroidalign.ncrna.org/.

Scopus citations: 4

Generalized Centroid Estimators in Bioinformatics

Michiaki Hamada, Hisanori Kiryu, Wataru Iwasaki, Kiyoshi Asai

CoRR abs/1305.4339 2013 [Refereed]
PBSIM: PacBio reads simulator-toward accurate genome assembly

Yukiteru Ono, Kiyoshi Asai, Michiaki Hamada

BIOINFORMATICS 29 ( 1 ) 119 - 121 2013.01 [Refereed]

Motivation: PacBio sequencers produce two types of characteristic reads (continuous long reads: long and high error rate and circular consensus sequencing: short and low error rate), both of which could be useful for de novo assembly of genomes. Currently, there is no available simulator that targets the specific generation of PacBio libraries.
Results: Our analysis of 13 PacBio datasets showed characteristic features of PacBio reads (e.g. the read length of PacBio reads follows a log-normal distribution). We have developed a read simulator, PBSIM, that captures these features using either a model-based or sampling-based method. Using PBSIM, we conducted several hybrid error correction and assembly tests for PacBio reads, suggesting that a continuous long reads coverage depth of at least 15 in combination with a circular consensus sequencing coverage depth of at least 30 achieved extensive assembly results.
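
The log-normal read-length observation above can be sketched in a few lines. This is a minimal illustration of model-based length sampling only, not PBSIM's actual code: the function name, the mean of 3000 and the standard deviation of 2300 are hypothetical values chosen for the example.

```python
import math
import random


def sample_read_lengths(n, mean_len, sd_len, seed=0):
    """Draw n read lengths from a log-normal distribution with the given mean and SD."""
    # Convert the target mean/SD of the lengths into the parameters (mu, sigma)
    # of the underlying normal distribution.
    sigma2 = math.log(1.0 + (sd_len / mean_len) ** 2)
    mu = math.log(mean_len) - sigma2 / 2.0
    rng = random.Random(seed)
    # Round to whole bases and keep every read at least 1 base long.
    return [max(1, round(rng.lognormvariate(mu, math.sqrt(sigma2))))
            for _ in range(n)]


lengths = sample_read_lengths(20000, mean_len=3000, sd_len=2300)
print(min(lengths), sum(lengths) / len(lengths), max(lengths))
```

The long right tail of the log-normal distribution is what makes it a natural fit for long-read data: most reads cluster around a typical length while a small fraction are several times longer.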

Scopus citations: 229

1P125 Tertiary structure prediction of Protein-RNA complexes (05A. Nucleic acid: Structure & Property, Poster, The 51st Annual Meeting of the Biophysical Society of Japan)

Kameda Tomoshi, Iwakiri Junichi, Hamada Michiaki, Asai Kiyoshi

Seibutsu Butsuri 53 ( 1 ) S126 2013

Direct Updating of an RNA Base-Pairing Probability Matrix with Marginal Probability Constraints

Michiaki Hamada

JOURNAL OF COMPUTATIONAL BIOLOGY 19 ( 12 ) 1265 - 1276 2012.12 [Refereed]

A base-pairing probability matrix (BPPM) stores the probabilities for every possible base pair in an RNA sequence and has been used in many algorithms in RNA informatics (e.g., RNA secondary structure prediction and motif search). In this study, we propose a novel algorithm to perform iterative updates of a given BPPM, satisfying marginal probability constraints that are (approximately) given by recently developed biochemical experiments, such as SHAPE, PAR, and FragSeq. The method is easily implemented and is applicable to common models for RNA secondary structures, such as energy-based or machine-learning-based models. In this article, we focus mainly on the details of the algorithms, although preliminary computational experiments will also be presented.
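
To make the objects in this abstract concrete, the sketch below builds a BPPM from a small explicit ensemble and derives the per-nucleotide marginal probability of being paired. It assumes that the marginal constraints refer to these per-nucleotide probabilities, and it does not implement the paper's iterative update. The toy ensemble and all function names are hypothetical.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def bppm_from_ensemble(ensemble):
    """Build a symmetric BPPM from (structure, probability) pairs that sum to 1."""
    n = len(ensemble[0][0])
    p = [[0.0] * n for _ in range(n)]
    for structure, prob in ensemble:
        for i, j in base_pairs(structure):
            p[i][j] += prob
            p[j][i] += prob
    return p


def paired_marginals(p):
    """Probability that each nucleotide is paired with any partner (row sums)."""
    return [sum(row) for row in p]


# Hypothetical ensemble over a sequence of length 6.
toy = [("((..))", 0.5), ("(....)", 0.3), ("......", 0.2)]
P = bppm_from_ensemble(toy)
q = paired_marginals(P)
unpaired = [1.0 - x for x in q]
print(q)
```

An experiment that reports how accessible each nucleotide is would constrain values like `unpaired`, while the full matrix `P` is what downstream algorithms consume.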

Scopus citations: 8

Shape-based alignment of genomic landscapes in multi-scale resolution

Hiroki Ashida, Kiyoshi Asai, Michiaki Hamada

NUCLEIC ACIDS RESEARCH 40 ( 14 ) 6435 - 6448 2012.08 [Refereed]

Due to dramatic advances in DNA technology, quantitative measures of annotation data can now be obtained in continuous coordinates across the entire genome, allowing various heterogeneous 'genomic landscapes' to emerge. Although much effort has been devoted to comparing DNA sequences, not much attention has been given to comparing these large quantities of data comprehensively. In this article, we introduce a method for rapidly detecting local regions that show high correlations between genomic landscapes. We overcame the size problem for genome-wide data by converting the data into series of symbols and then carrying out sequence alignment. We also decomposed the oscillation of the landscape data into different frequency bands before analysis, since the real genomic landscape is a mixture of embedded and confounded biological processes working at different scales in the cell nucleus. To verify the usefulness and generality of our method, we applied our approach to well investigated landscapes from the human genome, including several histone modifications. Furthermore, by applying our method to over 20 genomic landscapes in human and 12 in mouse, we found that DNA replication timing and the density of Alu insertions are highly correlated genome-wide in both species, even though the Alu elements have amplified independently in the two genomes. To our knowledge, this is the first method to align genomic landscapes at multiple scales according to their shape.

Scopus citations: 5

A Classification of Bioinformatics Algorithms from the Viewpoint of Maximizing Expected Accuracy (MEA)

Michiaki Hamada, Kiyoshi Asai

JOURNAL OF COMPUTATIONAL BIOLOGY 19 ( 5 ) 532 - 549 2012.05 [Refereed]

Many estimation problems in bioinformatics are formulated as point estimation problems in a high-dimensional discrete space. In general, it is difficult to design reliable estimators for this type of problem, because the number of possible solutions is immense, which leads to an extremely low probability for every solution, even for the one with the highest probability. Therefore, maximum score and maximum likelihood estimators do not work well in this situation although they are widely employed in a number of applications. Maximizing expected accuracy (MEA) estimation, in which accuracy measures of the target problem and the entire distribution of solutions are considered, is a more successful approach. In this review, we provide an extensive discussion of algorithms and software based on MEA. We describe how a number of algorithms used in previous studies can be classified from the viewpoint of MEA. We believe that this review will be useful not only for users wishing to utilize software to solve the estimation problems appearing in this article, but also for developers wishing to design algorithms on the basis of MEA.

Scopus citations: 15

Privacy preservation in information retrieval

Hiromi Arai, Kana Shimizu, Michiaki Hamada

Proceedings of the Annual Conference of JSAI 26 1 - 4 2012

Probabilistic alignments with quality scores: an application to short-read mapping toward accurate SNP/indel detection

Michiaki Hamada, Edward Wijaya, Martin C. Frith, Kiyoshi Asai

BIOINFORMATICS 27 ( 22 ) 3085 - 3092 2011.11 [Refereed]

Motivation: Recent studies have revealed the importance of considering quality scores of reads generated by next-generation sequencing (NGS) platforms in various downstream analyses. It is also known that probabilistic alignments based on marginal probabilities (e.g. aligned-column and/or gap probabilities) provide more accurate alignment than conventional maximum score-based alignment. There exists, however, no study about probabilistic alignment that considers quality scores explicitly, although the method is expected to be useful in SNP/indel callers and bisulfite mapping, because accurate estimation of aligned columns or gaps is important in those analyses.
Results: In this study, we propose methods of probabilistic alignment that consider quality scores of (one of) the sequences as well as a usual score matrix. The method is based on posterior decoding techniques in which various marginal probabilities are computed from a probabilistic model of alignments with quality scores, and can arbitrarily trade off sensitivity and positive predictive value (PPV) of prediction (aligned columns and gaps). The method is directly applicable to read mapping (alignment) toward accurate detection of SNPs and indels. Several computational experiments indicated that probabilistic alignments can estimate aligned columns and gaps accurately, compared with other mapping algorithms, e.g. SHRiMP2, Stampy, BWA and Novoalign. The study also suggested that our approach yields favorable precision for SNP/indel calling.
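
The trade-off between sensitivity and PPV mentioned above can be illustrated with a simple probability threshold on aligned columns. This is a simplified sketch, not the paper's decoding algorithm: the posterior probabilities, the reference alignment, and the function name are hypothetical.

```python
def sensitivity_and_ppv(posteriors, reference, threshold):
    """Score the aligned columns whose posterior probability reaches the threshold."""
    predicted = {col for col, p in posteriors.items() if p >= threshold}
    true_pos = len(predicted & reference)
    # Sensitivity: fraction of reference columns that were recovered.
    sensitivity = true_pos / len(reference) if reference else 0.0
    # PPV: fraction of predicted columns that are correct (0.0 by convention if none).
    ppv = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Hypothetical posteriors for aligned columns (read position, genome position).
posteriors = {(0, 0): 0.95, (1, 1): 0.90, (2, 2): 0.60, (3, 4): 0.55, (4, 5): 0.30}
# Hypothetical true alignment.
reference = {(0, 0), (1, 1), (2, 2), (4, 5)}

for t in (0.2, 0.5, 0.8):
    sens, ppv = sensitivity_and_ppv(posteriors, reference, t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} PPV={ppv:.2f}")
```

Raising the threshold keeps only high-confidence columns, so PPV rises while sensitivity falls; a SNP/indel caller that must avoid false calls would lean toward the stricter setting.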

Scopus citations: 13

CentroidHomfold-LAST: accurate prediction of RNA secondary structure using automatically collected homologous sequences

Michiaki Hamada, Koichiro Yamada, Kengo Sato, Martin C. Frith, Kiyoshi Asai

NUCLEIC ACIDS RESEARCH 39 ( Web-Server-Issue ) W100 - W106 2011.07 [Refereed]

Although secondary structure predictions of an individual RNA sequence have been widely used in a number of sequence analyses of RNAs, accuracy is still limited. Recently, we proposed a method (called 'CentroidHomfold'), which incorporates information about homologous sequences into the prediction of the secondary structure of the target sequence, and showed that it substantially improved the performance of secondary structure predictions. CentroidHomfold, however, forces users to prepare homologous sequences of the target sequence. We have developed a Web application (CentroidHomfold-LAST) that predicts the secondary structure of the target sequence using automatically collected homologous sequences. LAST, which is a fast and sensitive local aligner, and CentroidHomfold are employed in the Web application. Computational experiments with a commonly used data set indicated that CentroidHomfold-LAST substantially outperformed conventional secondary structure predictions including CentroidFold and RNAfold.

DOI

Scopus

22

Citation

(Scopus)
IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming

Kengo Sato, Yuki Kato, Michiaki Hamada, Tatsuya Akutsu, Kiyoshi Asai

BIOINFORMATICS 27 ( 13 ) I85 - I93 2011.07 [Refereed]

　View Summary

Motivation: Pseudoknots found in secondary structures of a number of functional RNAs play various roles in biological processes. Recent methods for predicting RNA secondary structures cover certain classes of pseudoknotted structures, but only a few of them achieve satisfying predictions in terms of both speed and accuracy.
Results: We propose IPknot, a novel computational method for predicting RNA secondary structures with pseudoknots based on maximizing expected accuracy of a predicted structure. IPknot decomposes a pseudoknotted structure into a set of pseudoknot-free substructures and approximates a base-pairing probability distribution that considers pseudoknots, leading to the capability of modeling a wide class of pseudoknots and running quite fast. In addition, we propose a heuristic algorithm for refining base-paring probabilities to improve the prediction accuracy of IPknot. The problem of maximizing expected accuracy is solved by using integer programming with threshold cut. We also extend IPknot so that it can predict the consensus secondary structure with pseudoknots when a multiple sequence alignment is given. IPknot is validated through extensive experiments on various datasets, showing that IPknot achieves better prediction accuracy and faster running time as compared with several competitive prediction methods.

DOI

Scopus

238

Citation

(Scopus)
Antagonistic RNA aptamer specific to a heterodimeric form of human interleukin-17A/F

Hironori Adachi, Akira Ishiguro, Michiaki Hamada, Eri Sakota, Kiyoshi Asai, Yoshikazu Nakamura

BIOCHIMIE 93 ( 7 ) 1081 - 1088 2011.07 [Refereed]

Interleukin-17 (IL-17) is a pro-inflammatory cytokine produced primarily by a subset of CD4+ cells, called Th17 cells, that is involved in host defense, inflammation and autoimmune disorders. The two most structurally related IL-17 family members, IL-17A and IL-17F, form homodimeric (IL-17A/A, IL-17F/F) and heterodimeric (IL-17A/F) complexes. Although the biological significance of IL-17A and IL-17F has been investigated using respective antibodies or gene knockout mice, functional study of the IL-17A/F heterodimeric form has been hampered by the lack of an inhibitory tool specific to IL-17A/F. In this study, we aimed to develop an RNA aptamer that specifically inhibits IL-17A/F. Aptamers are short single-stranded nucleic acid sequences that are selected in vitro based on their high affinity to a target molecule. One selected aptamer against human IL-17A/F, AptAF42, was isolated by repeated cycles of selection and counterselection against heterodimeric and homodimeric complexes, respectively. Thus, AptAF42 bound IL-17A/F but not IL-17A/A or IL-17F/F. The optimized derivative, AptAF42dope1, blocked the binding of IL-17A/F, but not of IL-17A/A or IL-17F/F, to the IL-17 receptor in the surface plasmon resonance assay in vitro. Consistently, AptAF42dope1 blocked cytokine GRO-alpha production induced by IL-17A/F, but not by IL-17A/A or IL-17F/F, in human cells. An RNA footprinting assay using ribonucleases against AptAF42dope1 in the presence or absence of IL-17A/F revealed that part of the predicted secondary structure fluctuates between alternate forms and that AptAF42dope1 is globally protected from ribonuclease cleavage by IL-17A/F. These results suggest that the selected aptamer recognizes a global conformation specified by the heterodimeric surface of IL-17A/F.

DOI PubMed J-GLOBAL

Citations (Scopus): 21

CentroidHomfold-LAST: accurate prediction of RNA secondary structure using automatically collected homologous sequences

Michiaki Hamada, Koichiro Yamada, Kengo Sato, Martin C. Frith, Kiyoshi Asai

NUCLEIC ACIDS RESEARCH 39 ( Web Server issue ) W100 - W106 2011.07 [Refereed]

Although secondary structure predictions of an individual RNA sequence have been widely used in a number of sequence analyses of RNAs, accuracy is still limited. Recently, we proposed a method (called 'CentroidHomfold'), which incorporates information about homologous sequences into the prediction of the secondary structure of the target sequence, and showed that it substantially improved the performance of secondary structure predictions. CentroidHomfold, however, requires users to prepare homologous sequences of the target sequence. We have developed a Web application (CentroidHomfold-LAST) that predicts the secondary structure of the target sequence using automatically collected homologous sequences. LAST, which is a fast and sensitive local aligner, and CentroidHomfold are employed in the Web application. Computational experiments with a commonly used data set indicated that CentroidHomfold-LAST substantially outperformed conventional secondary structure predictions including CentroidFold and RNAfold.

DOI PubMed J-GLOBAL

Citations (Scopus): 22

IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming

Kengo Sato, Yuki Kato, Michiaki Hamada, Tatsuya Akutsu, Kiyoshi Asai

BIOINFORMATICS 27 ( 13 ) I85 - I93 2011.07 [Refereed]

Motivation: Pseudoknots found in secondary structures of a number of functional RNAs play various roles in biological processes. Recent methods for predicting RNA secondary structures cover certain classes of pseudoknotted structures, but only a few of them achieve satisfactory predictions in terms of both speed and accuracy.
Results: We propose IPknot, a novel computational method for predicting RNA secondary structures with pseudoknots based on maximizing expected accuracy of a predicted structure. IPknot decomposes a pseudoknotted structure into a set of pseudoknot-free substructures and approximates a base-pairing probability distribution that considers pseudoknots, leading to the capability of modeling a wide class of pseudoknots and running quite fast. In addition, we propose a heuristic algorithm for refining base-pairing probabilities to improve the prediction accuracy of IPknot. The problem of maximizing expected accuracy is solved by using integer programming with threshold cut. We also extend IPknot so that it can predict the consensus secondary structure with pseudoknots when a multiple sequence alignment is given. IPknot is validated through extensive experiments on various datasets, showing that IPknot achieves better prediction accuracy and faster running time as compared with several competitive prediction methods.

DOI PubMed J-GLOBAL

Citations (Scopus): 238

Generalized Centroid Estimators in Bioinformatics

Michiaki Hamada, Hisanori Kiryu, Wataru Iwasaki, Kiyoshi Asai

PLOS ONE 6 ( 2 ) e16450 2011.02 [Refereed]

In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit commonly used accuracy measures (e.g. sensitivity, PPV, MCC and F-score), can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. The concept presented in this paper not only gives a useful framework for designing MEA-based estimators but is also highly extendable and sheds new light on many problems in bioinformatics.

DOI PubMed

Citations (Scopus): 14

Improving the accuracy of predicting secondary structure for aligned RNA sequences

Michiaki Hamada, Kengo Sato, Kiyoshi Asai

NUCLEIC ACIDS RESEARCH 39 ( 2 ) 393 - 402 2011.01 [Refereed]

Considerable attention has been focused on predicting the secondary structure for aligned RNA sequences since it is useful not only for improving the limited accuracy of conventional secondary structure prediction but also for finding non-coding RNAs in genomic sequences. Although many algorithms exist for predicting the secondary structure of aligned RNA sequences, further improvement of the accuracy is still awaited. In this article, toward improving the accuracy, a theoretical classification of state-of-the-art algorithms for predicting the secondary structure of aligned RNA sequences is presented. The classification is based on the viewpoint of maximum expected accuracy (MEA), which has been successfully applied to various problems in bioinformatics. The classification reveals several disadvantages of the current algorithms, and we propose an improvement of a previously introduced algorithm (CentroidAlifold). Finally, computational experiments strongly support the theoretical classification and indicate that the improved CentroidAlifold substantially outperforms other algorithms.

DOI PubMed CiNii J-GLOBAL

Citations (Scopus): 55

Prediction of RNA secondary structure by maximizing pseudo-expected accuracy

Michiaki Hamada, Kengo Sato, Kiyoshi Asai

BMC BIOINFORMATICS 11 586 2010.11 [Refereed]

Background: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure predictions; therefore, new types of estimators, including the maximum expected accuracy (MEA) estimator, have been proposed. The MEA-based estimators have been designed to maximize the expected accuracy of the base pairs and have achieved the highest level of accuracy. Those methods, however, do not give a single best prediction of the structure, but employ parameters to control the trade-off between the sensitivity and the positive predictive value (PPV). It is unclear what parameter value should be used, and even the well-trained default parameter value does not, in general, give the best result in popular accuracy measures for each RNA sequence.
Results: Instead of using the expected values of the popular accuracy measures for RNA secondary structure prediction, which are difficult to calculate, the pseudo-expected accuracy, which can easily be computed from base-pairing probabilities, is introduced. It is shown that the pseudo-expected accuracy is a good approximation in terms of sensitivity, PPV, MCC, or F-score. The pseudo-expected accuracy can be approximately maximized for each RNA sequence by stochastic sampling. It is also shown that secondary structures that are well balanced between sensitivity and PPV can be predicted with a small computational overhead by combining the pseudo-expected accuracy of MCC or F-score with the gamma-centroid estimator.
Conclusions: This study gives not only a method for predicting the secondary structure that balances sensitivity and PPV, but also a general method for approximately maximizing the (pseudo-) expected accuracy with respect to various evaluation measures including MCC and F-score.

DOI PubMed J-GLOBAL

Citations (Scopus): 22

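As an illustration of the summary above, the following Python sketch scores candidate structures by a pseudo-expected F-score and MCC computed from base-pairing probabilities. It assumes that "pseudo-expected" means evaluating the accuracy measure on the expected counts of true/false positives and negatives; the function names and the toy probabilities are hypothetical and are not taken from the paper or its software.

```python
import math
from typing import Dict, Iterable, Tuple

Pair = Tuple[int, int]


def expected_counts(n: int, bpp: Dict[Pair, float], structure: Iterable[Pair]):
    """Expected TP, TN, FP, FN of `structure` under base-pairing probabilities."""
    predicted = set(structure)
    all_pairs = n * (n - 1) // 2      # number of candidate pairs (i < j)
    total_prob = sum(bpp.values())    # expected number of true base pairs
    e_tp = sum(bpp.get(p, 0.0) for p in predicted)
    e_fp = len(predicted) - e_tp
    e_fn = total_prob - e_tp
    e_tn = all_pairs - len(predicted) - e_fn
    return e_tp, e_tn, e_fp, e_fn


def pseudo_expected_f(n: int, bpp: Dict[Pair, float], structure: Iterable[Pair]) -> float:
    """F-score evaluated on expected counts (assumed definition)."""
    tp, _tn, fp, fn = expected_counts(n, bpp, structure)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def pseudo_expected_mcc(n: int, bpp: Dict[Pair, float], structure: Iterable[Pair]) -> float:
    """MCC evaluated on expected counts (assumed definition)."""
    tp, tn, fp, fn = expected_counts(n, bpp, structure)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0


# Toy (made-up) base-pairing probabilities for a sequence of length 10.
toy_bpp = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.3, (2, 6): 0.1}

# Pick the candidate with the highest pseudo-expected F-score.
candidates = [[(0, 9), (1, 8)], [(0, 9), (1, 8), (2, 7)]]
best = max(candidates, key=lambda s: pseudo_expected_f(10, toy_bpp, s))
```

In this toy case the two-pair candidate is selected, because adding the low-probability pair (2, 7) raises the expected false positives more than it lowers the expected false negatives. In practice the candidates would come from stochastic sampling, as the summary describes.
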
RactIP: Fast and accurate prediction of RNA-RNA interaction using integer programming

Yuki Kato, Kengo Sato, Michiaki Hamada, Yoshihide Watanabe, Kiyoshi Asai, Tatsuya Akutsu

Bioinformatics 26 ( 18 ) i460 - i466 2010.09 [Refereed]

Motivation: Considerable attention has been focused on predicting RNA-RNA interaction since it is a key to identifying possible targets of non-coding small RNAs that regulate gene expression post-transcriptionally. A number of computational studies have so far been devoted to predicting joint secondary structures or binding sites under a specific class of interactions. In general, there is a trade-off between range of interaction type and efficiency of a prediction algorithm, and thus efficient computational methods for predicting comprehensive type of interaction are still awaited.
Results: We present RactIP, a fast and accurate prediction method for RNA-RNA interaction of general type using integer programming. RactIP can integrate approximate information on an ensemble of equilibrium joint structures into the objective function of integer programming using posterior internal and external base-pairing probabilities. Experimental results on real interaction data show that prediction accuracy of RactIP is at least comparable to that of several state-of-the-art methods for RNA-RNA interaction prediction. Moreover, we demonstrate that RactIP can run incomparably faster than competitive methods for predicting joint secondary structures.

DOI PubMed J-GLOBAL

Citations (Scopus): 3

A non-parametric Bayesian approach for predicting RNA secondary structures

Kengo Sato, Michiaki Hamada, Toutai Mituyama, Kiyoshi Asai, Yasubumi Sakakibara

Journal of Bioinformatics and Computational Biology 8 ( 4 ) 727 - 742 2010.08 [Refereed]

Since many functional RNAs form stable secondary structures which are related to their functions, RNA secondary structure prediction is a crucial problem in bioinformatics. We propose a novel model for generating RNA secondary structures based on a non-parametric Bayesian approach, called hierarchical Dirichlet processes for stochastic context-free grammars (HDP-SCFGs). Here non-parametric means that some meta-parameters, such as the number of non-terminal symbols and production rules, do not have to be fixed. Instead their distributions are inferred in order to be adapted (in the Bayesian sense) to the training sequences provided. The results of our RNA secondary structure predictions show that HDP-SCFGs are more accurate than the MFE-based and other generative models.

DOI

Citations (Scopus): 11

Parameters for accurate genome alignment

Martin C. Frith, Michiaki Hamada, Paul Horton

BMC Bioinformatics 11 80 2010.02 [Refereed]

Background: Genome sequence alignments form the basis of much research. Genome alignment depends on various mundane but critical choices, such as how to mask repeats and which score parameters to use. Surprisingly, there has been no large-scale assessment of these choices using real genomic data. Moreover, rigorous procedures to control the rate of spurious alignment have not been employed.
Results: We have assessed 495 combinations of score parameters for alignment of animal, plant, and fungal genomes. As our gold-standard of accuracy, we used genome alignments implied by multiple alignments of proteins and of structural RNAs. We found the HOXD scoring schemes underlying alignments in the UCSC genome database to be far from optimal, and suggest better parameters. Higher values of the X-drop parameter are not always better. E-values accurately indicate the rate of spurious alignment, but only if tandem repeats are masked in a non-standard way. Finally, we show that γ-centroid (probabilistic) alignment can find highly reliable subsets of aligned bases.
Conclusions: These results enable more accurate genome alignment, with reliability measures for local alignments and for individual aligned bases. This study was made possible by our new software, LAST, which can align vertebrate genomes in a few hours (http://last.cbrc.jp/).

DOI PubMed J-GLOBAL

Citations (Scopus): 175

CentroidAlign: fast and accurate aligner for structured RNAs by maximizing expected sum-of-pairs score

Michiaki Hamada, Kengo Sato, Hisanori Kiryu, Toutai Mituyama, Kiyoshi Asai

BIOINFORMATICS 25 ( 24 ) 3236 - 3243 2009.12 [Refereed]

Motivation: The importance of accurate and fast predictions of multiple alignments for RNA sequences has increased due to recent findings about functional non-coding RNAs. Recent studies suggest that maximizing the expected accuracy of predictions will be useful for many problems in bioinformatics.
Results: We designed a novel estimator for multiple alignments of structured RNAs, based on maximizing the expected accuracy of predictions. First, we define the maximum expected accuracy (MEA) estimator for pairwise alignment of RNA sequences. This maximizes the expected sum-of-pairs score (SPS) of a predicted alignment under a probability distribution of alignments given by marginalizing the Sankoff model. Then, by approximating the MEA estimator, we obtain an estimator whose time complexity is O(L^3 + c^2 d L^2), where L is the length of input sequences and both c and d are constants independent of L. The proposed estimator can handle uncertainty of secondary structures and alignments that are obstacles in bioinformatics because it considers all the secondary structures and all the pairwise alignments as input sequences. Moreover, we integrate the probabilistic consistency transformation (PCT) on alignments into the proposed estimator. Computational experiments using six benchmark datasets indicate that the proposed method achieved a favorable SPS and was the fastest of many state-of-the-art tools for multiple alignments of structured RNAs.

DOI PubMed J-GLOBAL

Citations (Scopus): 42

CENTROIDFOLD: a web server for RNA secondary structure prediction

Kengo Sato, Michiaki Hamada, Kiyoshi Asai, Toutai Mituyama

NUCLEIC ACIDS RESEARCH 37 ( Web Server issue ) W277 - W280 2009.07 [Refereed]

The CENTROIDFOLD web server (http://www.ncrna.org/centroidfold/) is a web application for RNA secondary structure prediction powered by one of the most accurate prediction engines. The server accepts two kinds of sequence data: a single RNA sequence and a multiple alignment of RNA sequences. It responds with a prediction result shown in a popular base-pair notation and as a graph representation. A PDF version of the graph representation is also available. For a multiple alignment, the server predicts a common secondary structure. Usage of the server is quite simple: paste a single RNA sequence (FASTA or plain sequence text) or a multiple alignment (CLUSTAL-W format) into the text area, then click the 'execute CentroidFold' button. The server quickly responds with a prediction result. The major advantage of this server is that it employs our original CENTROIDFOLD software as its prediction engine, which achieved the best accuracy in our benchmark results. Our web server is freely available with no login requirement.

DOI PubMed J-GLOBAL

Citations (Scopus): 280

Predictions of RNA secondary structure by combining homologous sequence information.

Hamada M, Sato K, Kiryu H, Mituyama T, Asai K

Bioinformatics (Oxford, England) 25 ( 12 ) 330 - 338 2009.06 [Refereed] [International journal]

Motivation: Secondary structure prediction of RNA sequences is an important problem. There has been progress in this area, but the accuracy of prediction from an RNA sequence is still limited. In many cases, however, homologous RNA sequences are available with the target RNA sequence whose secondary structure is to be predicted.
Results: In this article, we propose a new method for secondary structure predictions of individual RNA sequences by taking the information of their homologous sequences into account without assuming the common secondary structure of the entire sequences. The proposed method is based on posterior decoding techniques, which consider all the suboptimal secondary structures of the target and homologous sequences and all the suboptimal alignments between the target sequence and each of the homologous sequences. In our computational experiments, the proposed method provides better predictions than those performed only on the basis of the formation of individual RNA sequences and those performed by using methods for predicting the common secondary structure of the homologous sequences. Remarkably, we found that the common secondary structure predictions sometimes give worse predictions for the secondary structure of a target sequence than the predictions from the individual target sequence, while the proposed method always gives good predictions for the secondary structure of target sequences in all tested cases.
Availability: Supporting information and software are available online at: http://www.ncrna.org/software/centroidfold/ismb2009/.
Supplementary information: Supplementary data are available at Bioinformatics online.

DOI PubMed J-GLOBAL

Citations (Scopus): 46

Prediction of RNA secondary structure using generalized centroid estimators

Michiaki Hamada, Hisanori Kiryu, Kengo Sato, Toutai Mituyama, Kiyoshi Asai

BIOINFORMATICS 25 ( 4 ) 465 - 473 2009.02 [Refereed]

Motivation: Recent studies have shown that methods for predicting secondary structures of RNAs on the basis of posterior decoding of the base-pairing probabilities have an advantage with respect to prediction accuracy over the conventionally utilized minimum free energy methods. However, there is room for improvement in the objective functions presented in previous studies, which are maximized in the posterior decoding with respect to the accuracy measures for secondary structures.
Results: We propose novel estimators which improve the accuracy of secondary structure prediction of RNAs. The proposed estimators maximize an objective function which is the weighted sum of the expected number of the true positives and that of the true negatives of the base pairs. The proposed estimators are also improved versions of the ones used in previous works, namely CONTRAfold for secondary structure prediction from a single RNA sequence and McCaskill-MEA for common secondary structure prediction from multiple alignments of RNA sequences. We clarify the relations between the proposed estimators and the estimators presented in previous works, and theoretically show that the previous estimators include additional unnecessary terms in the evaluation measures with respect to the accuracy. Furthermore, computational experiments confirm the theoretical analysis by indicating improvement in the empirical accuracy. The proposed estimators represent extensions of the centroid estimators proposed in Ding et al. and Carvalho and Lawrence, and are applicable to a wide variety of problems in bioinformatics.

DOI PubMed J-GLOBAL

Citations (Scopus): 201

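The objective described in the summary above (a weighted sum of the expected numbers of true positives and true negatives of base pairs) can be illustrated with a small Python sketch. Assuming a weight γ on true positives and a weight of 1 on true negatives, the objective equals a constant plus the sum of (γ + 1)·p_ij − 1 over the predicted pairs, so a Nussinov-style dynamic program over nested structures maximizes it. The function name, the `min_loop` parameter, and the toy probabilities are hypothetical; this is a sketch under those assumptions, not the CentroidFold implementation.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def gamma_centroid(n: int, bpp: Dict[Pair, float], gamma: float,
                   min_loop: int = 3) -> List[Pair]:
    """Nested structure maximizing the sum of ((gamma + 1) * p_ij - 1) over its pairs."""
    best = [[0.0] * n for _ in range(n)]    # best[i][j]: optimal score on i..j
    back = [[None] * n for _ in range(n)]   # back-pointers for the traceback
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score, move = best[i + 1][j], ("skip_i",)          # i unpaired
            if best[i][j - 1] > score:                         # j unpaired
                score, move = best[i][j - 1], ("skip_j",)
            gain = (gamma + 1.0) * bpp.get((i, j), 0.0) - 1.0
            if gain > 0 and best[i + 1][j - 1] + gain > score:  # i pairs with j
                score, move = best[i + 1][j - 1] + gain, ("pair",)
            for k in range(i + 1, j):                          # bifurcation
                split = best[i][k] + best[k + 1][j]
                if split > score:
                    score, move = split, ("split", k)
            best[i][j], back[i][j] = score, move

    pairs: List[Pair] = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j or back[i][j] is None:
            continue
        move = back[i][j]
        if move[0] == "skip_i":
            stack.append((i + 1, j))
        elif move[0] == "skip_j":
            stack.append((i, j - 1))
        elif move[0] == "pair":
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            stack.append((i, move[1]))
            stack.append((move[1] + 1, j))
    return sorted(pairs)


# Toy (made-up) base-pairing probabilities for a sequence of length 10.
toy_bpp = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.3, (2, 6): 0.1}
print(gamma_centroid(10, toy_bpp, gamma=1.0))  # [(0, 9), (1, 8)]
print(gamma_centroid(10, toy_bpp, gamma=4.0))  # [(0, 9), (1, 8), (2, 7)]
```

A pair contributes positively only when p_ij > 1/(γ + 1), so a larger γ admits lower-probability pairs (higher sensitivity) and a smaller γ keeps only confident pairs (higher PPV).
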
A Non-parametric Bayesian Approach for Predicting RNA Secondary Structures

Kengo Sato, Michiaki Hamada, Toutai Mituyama, Kiyoshi Asai, Yasubumi Sakakibara

ALGORITHMS IN BIOINFORMATICS, PROCEEDINGS 5724 286 - + 2009 [Refereed]

Since many functional RNAs form stable secondary structures which are related to their functions, RNA secondary structure prediction is a crucial problem in bioinformatics. We propose a novel model for generating RNA secondary structures based on a non-parametric Bayesian approach, called hierarchical Dirichlet processes for stochastic context-free grammars (HDP-SCFGs). Here non-parametric means that some meta-parameters, such as the number of non-terminal symbols and production rules, do not have to be fixed. Instead their distributions are inferred in order to be adapted (in the Bayesian sense) to the training sequences provided. The results of our RNA secondary structure predictions show that HDP-SCFGs are more accurate than the MFE-based and other generative models.

Large scale similarity search for locally stable secondary structures among RNA sequences

Michiaki Hamada, Toutai Mituyama, Kiyoshi Asai

IPSJ Transactions on Bioinformatics 2 36 - 46 2009 [Refereed]

Recently, a large number of candidate non-coding RNAs (ncRNAs) have been predicted by experimental or computational approaches. Moreover, in genomic sequences, there are still many interesting regions whose functions are unknown (e.g., indel conserved regions, human accelerated regions, ultraconserved elements and transposon free regions) and some of those regions may be ncRNAs. On the other hand, it is known that many ncRNAs have characteristic secondary structures which are strongly related to their functions. Therefore, detecting clusters which have mutually similar secondary structures is important for revealing new ncRNA families. In this paper, we describe a novel method, called RNAclique, which is able to search for clusters containing mutually similar and locally stable secondary structures among a large number of unaligned RNA sequences. Our problem is formulated as a constraint quasiclique search problem, and we use an approximate combinatorial optimization method, called GRASP, for solving the problem. Several computational experiments show that our method is useful and scalable for detecting ncRNA families from large sequences. We also present two examples of large scale sequence analysis using RNAclique.

DOI CiNii

Citations (Scopus): 1

Software.ncrna.org: web servers for analyses of RNA sequences

Kiyoshi Asai, Hisanori Kiryu, Michiaki Hamada, Yasuo Tabei, Kengo Sato, Hiroshi Matsui, Yasubumi Sakakibara, Goro Terai, Toutai Mituyama

NUCLEIC ACIDS RESEARCH 36 ( Web Server issue ) W75 - W78 2008.07 [Refereed]

We present web servers for analysis of non-coding RNA sequences on the basis of their secondary structures. Software tools for structural multiple sequence alignments, structural pairwise sequence alignments and structural motif findings are available from the integrated web server and the individual stand-alone web servers. The servers are located at http://software.ncrna.org, along with the information for the evaluation and downloading. This website is freely available to all users and there is no login requirement.

DOI PubMed J-GLOBAL

Citations (Scopus): 5

Mining frequent stem patterns from unaligned RNA sequences

Michiaki Hamada, Koji Tsuda, Taku Kudo, Taishin Kin, Kiyoshi Asai

BIOINFORMATICS 22 ( 20 ) 2480 - 2487 2006.10 [Refereed]

Motivation: In detection of non-coding RNAs, it is often necessary to identify the secondary structure motifs from a set of putative RNA sequences. Most of the existing algorithms aim to provide the best motif or a few good motifs, but biologists often need to inspect all the possible motifs thoroughly.
Results: Our method RNAmine employs a graph theoretic representation of RNA sequences and detects all the possible motifs exhaustively using a graph mining algorithm. The motif detection problem boils down to finding frequently appearing patterns in a set of directed and labeled graphs. In the tasks of common secondary structure prediction and local motif detection from long sequences, our method compared favorably in both accuracy and efficiency with state-of-the-art methods such as CMFinder.

DOI PubMed J-GLOBAL

Citations (Scopus): 39

Books and Other Publications

Chapter Sixteen - ATAC-seq method applied to embryonic germ cells and neural stem cells from mouse: Practical tips and modifications

Soichiro Yamanaka, Masahiro Onoguchi, Rena Onoguchi-Mizutani, Yusuke Kishi, Michiaki Hamada, Nobuyoshi Akimitsu, Haruhiko Siomi (Part: Joint author, Pages 371-391)

Academic Press 2025.08 ISBN: 9780443267598

生体の科学 Vol.76 No.2 2025年 04月号

加藤晃代, 横山源太朗, 中野秀雄, 本野千恵, 浜田道昭( Part： Joint author, アミノ酸配列に着目した翻訳促進技術の開発)

医学書院 2025.04
トランスクリプトーム解析

松本, 拡高, 浜田, 道昭

コロナ社 2025.04 ISBN: 9784339027365
ベイズ最適化の活用事例材料探索／物性予測／配合・プロセス条件の最適化

本郷研太, 倉上拓真, 杉澤宏樹, 徳田陽明, 田邊祐介, 山下智樹, 堀田一海, 都甲薫, 森貴裕, 粟屋康輝, 仲村武瑠, 小嗣真人, 岡本昌彦, 岩崎悠真, 稲垣英輔, 高橋亮, 長岡正宏, 嶋田雄介, 桐淵大貴, 足立吉隆, 芝公仁, 岩田浩明, 山田和明, 浜田道昭, 田中冬彦, 山崎直人, 水牧仁一朗, 中江隆博, 花岡悟一郎, 長藤圭介, 田上湖都, 中野智宏, 天本義史, 芦刈洋祐, 菊地淳, 岡野泰則, 小木曽望, 黒川康良, 金羽晴, 沓掛健太朗, 林周斗, 前田佳弘, 松田翔一, 内山祐介, 豊田丈紫, 佐藤孝磨, 谷口卓也, 小林翔太, 山口祐貴, 渡部洋介, 高橋竜太, 田中大介, 岡﨑湧一, 小寺正徳, 加藤央, 小野寛太( Part： Joint author, 第5章、第2節「ベイズ最適化を用いたRNA配列デザイン」)

技術情報協会 2025.03 ISBN: 9784867980668
RNAデザイン、Drug Delivery System, Vol.39 No.5 NOVEMBER 2024

浜田道昭( Part： Contributor, RNAデザイン Pages 333-345)

日本DDS学会 2024.11
ゲノム配列情報解析

三澤, 計治, 浜田, 道昭

コロナ社 2024.08 ISBN: 9784339027358
核酸医薬〜モダリティ・合成・分析・DDSの最新動向〜

青木華古, 永田哲也, 横田隆徳, 他( Part： Joint author, 第6章「核酸医薬研究を加速する情報技術」)

（株）エヌ・ティー・エス 2024.04 ISBN: 9784860438869
あなたのラボから薬を生み出すアカデミア創薬の実践 : all Japan体制の先端技術支援を利用した創薬の最前線

善光, 龍哉, 辻川, 和丈( Part： Joint author, 第1章最新の疾患標的分子の探索・評価技術，第4節シングルセル/微小組織マルチオミクス解析)

羊土社 2024.01 ISBN: 9784758104166
機械学習を用いたアプタマー配列の解析と創薬、実験医学別冊 Pythonで実践生命科学データの機械学習

岩野夏樹, 浜田道昭, 清水秀幸(第 12 章発展編①)

羊土社 2023.03 ISBN: 9784758122634
機械学習による遺伝子転写制御に関わる因子の探索, 月刊細胞

大里直樹, 浜田道昭(2022年11月号 62-64)

ニューサイエンス社 2022.11
システムバイオロジー

宇田, 新介, 浜田, 道昭

コロナ社 2022.11 ISBN: 9784339027341
バイオインフォマティクスのための生命科学入門

福永, 津嵩, 岩切, 淳一, 浜田, 道昭

コロナ社 2022.08 ISBN: 9784339027310
RNA情報科学・AI技術を融合したAIアプタマー創薬技術の開発，革新的AI創薬最前線

浜田道昭(第５章，第５節)

エヌ・ティー・エス 2022.07 ISBN: 9784860437886
生物統計

木立, 尚孝, 浜田, 道昭

コロナ社 2022.05 ISBN: 9784339027334
生物ネットワーク解析

竹本, 和広, 浜田, 道昭

コロナ社 2021.11 ISBN: 9784339027327
よくわかるバイオインフォマティクス入門 (KS生命科学専門書)

岩部直之, 川端猛, 浜田道昭, 門田幸二, 須山幹太, 光山統泰, 黒川顕, 森宙史, 東光一, 吉沢明康, 片山俊明, 藤博幸( Part： Joint author)

講談社 2018.11 ISBN: 4065138213

生命情報処理における機械学習多重検定と推定量設計 (機械学習プロフェッショナルシリーズ)

瀬々潤, 浜田道昭( Part： Joint author)

講談社 2015.12 ISBN: 4061529110


Presentations

RNAデータサイエンス

浜田道昭 [Invited]

一般社団法人ゲノムテクノロジー研究会第8回バイオインフォマティクス分科会「データサイエンス」

Presentation date： 2024.09

Event date： 2024.09

RNA創薬を加速する情報技術、[S41] 外来性RNAに対する防御機構解明が切り拓くRNA創薬のニューフロンティア

浜田道昭 [Invited]

日本薬学会第144年会

Presentation date： 2024.03

Event date： 2024.03

核酸医薬研究を加速する情報技術の開発と応用

浜田道昭 [Invited]

第453回CBI学会講演会「中分子創薬を革新する計算科学、情報科学の最前線」

Presentation date： 2024.02

Event date： 2024.02

mRNAのトータルデザインに向けた情報技術

浜田道昭 [Invited]

日本核酸医薬学会第8回年会 mRNAシンポジウム

Presentation date： 2023.07
RNAバイオインフォマティクスを用いた核酸医薬研究

浜田道昭 [Invited]

日本核酸医薬学会第8回年会教育セッション（生物）

Presentation date： 2023.07
AIアプタマー創薬 ―人工知能技術を用いたRNAアプタマー創薬の加速―

浜田道昭 [Invited]

日本コンピュータ化学会20周年記念シンポジウム

Presentation date： 2022.06
AI aptamer drug discovery project

Michiaki Hamada [Invited]

Presentation date： 2022.03
RNAウイルスゲノム2次構造のLadder Distanceに基づく空間的直径制限の解明

荻野祥平, 川崎純菜, 浜田道昭

第48回日本分子生物学会年会

Presentation date： 2025.12
RNA-リガンドドッキングシミュレーションを用いたバーチャルスクリーニング最適化のためのパイプライン構築

濵田一輝, 栗崎以久男, 浜田道昭

第48回日本分子生物学会年会

Presentation date： 2025.12
リピート配列由来の潜在的機能エレメントの同定

曽超, 浜田道昭

第48回日本分子生物学会年会

Presentation date： 2025.12
オープンデータを活用した撹乱RNAの網羅的探索と性質予測

川崎純菜, 伊東潤平, 浜田道昭, 鈴木忠樹

第48回日本分子生物学会年会

Presentation date： 2025.12
ノンコーディングRNAが形成する熱応答性核内構造体,HiNoCo bodyによる熱ストレス応答機構の解明

小野口玲菜, 小野口真広, 川村猛, 浜田道昭, 秋光信佳

第48回日本分子生物学会年会

Presentation date： 2025.12
シングルセルメタ解析と連続モデリングが解明する脳血管老化の分子マーカー

佐藤由弥, 髙良和宏, 能瀬由翔, 木戸屋浩康, 井内仁志, 浜田道昭, 朝日透, 片岡孝介

第48回日本分子生物学会年会

Presentation date： 2025.12
AI提案配列のハイスループット評価に向けたセルフリー反応ベース酵素自動評価技術の開

太田真理, 田邉麻衣子, 木村晃, 三森隆広, 愛甲和秀, 浜田道昭, 木賀大介, 伊藤潔人, 神鳥明彦

第48回日本分子生物学会年会

Presentation date： 2025.12
生薬誘発トランスクリプトームの解析による寒熱証の理解

砂金美月, 浜田道昭

第48回日本分子生物学会年会

Presentation date： 2025.12
グラフ対照学習と二次構造情報を利用したncRNA埋め込み手法の開発

本田皓子, 浜田道昭

第48回日本分子生物学会年会

Presentation date： 2025.12
CRISPRiスクリーニングによる肺腺癌細胞の生存に必須なlncRNAの同定

藤原佐紀, 藤原奈央子, 曽超, 岩切淳一, 浜田道昭, 浅井潔, 廣瀬哲郎

第48回日本分子生物学会年会

Presentation date： 2025.12
深層学習による自在なmRNA翻訳制御のための人工5'UTR設計

木下通理, 角俊輔, 浜田道昭, 齊藤博英

第48回日本分子生物学会年会

Presentation date： 2025.12
mRNA医薬の安定性最適化に重要な要素の特定

阿部有佑子, 山田啓介, 浜田道昭

第48回日本分子生物学会年会

Presentation date： 2025.12
RNAMergeDistill: 多教師知識蒸留によるRNA言語モデルの開発

橋本和磨, 山田啓介, 浜田道昭

第48回日本分子生物学会年会

Presentation date： 2025.12
Information Technologies Accelerating RNA Therapeutics

Michiaki Hamada [Invited]

The 23rd Asia Pacific Bioinformatics Conference (APBC2025)

Presentation date： 2025.11
Odd Sketchを用いたゲノム間類似度推定

菊田恭平, 浜田道昭

第28回情報論的学習理論ワークショップ

Presentation date： 2025.11
複数構造を標的とするRNA設計のための償却推論：折り畳みエネルギーランドスケープを制御するためのMRFポテンシャル学習

西村友宏, 浜田道昭

第28回情報論的学習理論ワークショップ

Presentation date： 2025.11
情報量基準と価値関数を組み合わせた行動空間選択手法

楊之介, 浜田道昭

第28回情報論的学習理論ワークショップ

Presentation date： 2025.11
バイオインフォマティクスを用いた一細胞・空間解析研究，一細胞マルチモーダル解析が拓く生物学フロンティア

浜田道昭 [Invited]

第98回日本生化学会大会

Presentation date： 2025.11
Discovery of Translation-Enhancing Peptides in Escherichia coli through Machine Learning and Iterative Validation

Chie Motono, Gentaro Yokoyama, Hideo Nakano, Michiaki Hamada, Teruyo Ojima-Kato

International Conference on Bottom-up Biotechnology for Understanding and Engineering Living Systems 2025

Presentation date： 2025.11
RNAインフォマティクスー生命医薬研究への応用ー

浜田道昭 [Invited]

弘前大学農学生命科学部 2025年度第11回研究推進セミナー

Presentation date： 2025.10
量子・AI次世代創薬

浜田道昭 [Invited]

BioJapan

Presentation date： 2025.10
RNA標的創薬研究に資する情報技術の開発

浜田道昭 [Invited]

BioJapan

Presentation date： 2025.10
RNAインフォマティクスを用いた生命医薬学研究（特別講演）

浜田道昭 [Invited]

第６回細胞分子工学研究部門研究報告会

Presentation date： 2025.10
ワクチン副反応メカニズムの解明を目指した免疫プロファイリング解析

高野智弘, 井内仁志, 浜田道昭, 新海正晴, 松村隆之, 高橋宜聖

第29回日本ワクチン学会

Presentation date： 2025.09
大腸菌における翻訳促進ペプチドの機械学習による予測と実証

本野千恵, 横山源太朗, 中野秀雄, 加藤晃代, 浜田道昭

第63回日本生物物理学会年会

Presentation date： 2025.09
深層学習でリガンド結合状態のRNA構造を生成する

栗崎以久男, 浜田道昭

第63回日本生物物理学会年会

Presentation date： 2025.09
Strategy for Efficient Protein Production using Translation-Enhancing Peptides in Escherichia coli

Teruyo OJIMA-KATO, Yuma NISHIKAWA, Riko FUJIKAWA, Hideo NAKANO, Gentaro YOKOYAMA, Chie MOTONO, Michiaki HAMADA

KSBB-AFOB Conference 2025

Presentation date： 2025.09
Optimized design of lipid nanoparticle-mRNA vaccine for minimizing reactogenicity while preserving immunogenicity through immune profiling

Tomohiro Takano, Keigo Kumagai, Hitoshi Iuchi, Kazutaka Terahara, Aya Mizuike, Eita Sasaki, Yu Adachi, Ryutaro Kotaki, Shinichiro Ota, Mizuki Fujisawa, Tomoharu Mizukami, Kyoko Saito, Masanori Isogawa, Kohei Soga, Haruyo Nakajima-Adachi, Satoshi Hachimura, Kouji Kobiyama, Ken J. Ishii, Michiaki Hamada, Masayoshi Fukasawa, Masaharu Shinkai, Takayuki Matsumura, Yoshimasa Takahashi [Invited]

19th Vaccine Congress

Presentation date： 2025.09
ランチョンセミナー「BI人材をめぐるキャリアと多様性(2) – Meet the Professors / Managers」

浜田道昭 [Invited]

2025年日本バイオインフォマティクス学会年会・第13回生命医薬情報学連合大会（IIBMP2025）

Presentation date： 2025.09
RNA 創薬を加速するバイオインフォマティクス技術

浜田道昭

2025年日本バイオインフォマティクス学会年会・第13回生命医薬情報学連合大会（IIBMP2025）

Presentation date： 2025.09
タンパク質発現向上のための翻訳促進ペプチドの機械学習と実験による探索

本野千恵, 横山源太朗, 中野秀雄, 浜田道昭, 加藤晃代

2025年日本バイオインフォマティクス学会年会・第13回生命医薬情報学連合大会（IIBMP2025）

Presentation date： 2025.09
BWT-based de novo interspersed repeat detection for large-scale genomes

Presentation date： 2025.09
Neural posterior estimation for switching stochastic differential equation-based models applied to biological time-series data

Presentation date： 2025.09
Meta-Analysis of Host Transcriptomic Signatures to Assess Infection and Pathogenicity of Animal-Derived Viruses

Presentation date： 2025.09
Benchmarking Robust PCA Methods for scRNA-seq Dimensionality Reduction

Presentation date： 2025.09
Development of a controllable model for the generation of novel target RNA sequences for RNA-binding proteins using Prefix-tuning.

Presentation date： 2025.09
RaptGFN: RNA Aptamer Sequence Design with GFlowNets

Presentation date： 2025.09
RNAMergeDistill: Multi-Teacher Knowledge Distillation for RNA Language Model

Presentation date： 2025.09
Machine learning-guided discovery and validation of translation-enhancing peptides in Escherichia coli

本野千恵, 横山源太朗, 中野秀雄, 浜田道昭, 加藤晃代

2025年日本バイオインフォマティクス学会年会・第13回生命医薬情報学連合大会（IIBMP2025）

Presentation date： 2025.09
バイオインフォマティクスが解き明かすトランスポゾン駆動のゲノム進化と制御機構

浜田道昭 [Invited]

日本進化学会第27回大会

Presentation date： 2025.08
Comprehensive RNA-protein interactome mapping along the entire RNA sequences

[Invited]

Presentation date： 2025.08
Deep Learning Design of Synthetic 5′UTRs for Programmable mRNA Translation

Michitaka Kinoshita, Shunsuke Sumi, Michiaki Hamada, Hirohide Saito

第26回日本RNA学会

Presentation date： 2025.07
SERIPH: A Refined Method for the Targeted Enrichment and Molecular Characterization of Semi-Extractable RNAs

Naoko Fujiwara, Daiki Kohsoh, Junichi Iwakiri, Chao Zeng, Takeshi Chujo, Kiyoshi Asai, Michiaki Hamada, Tetsuro Hirose

第26回日本RNA学会

Presentation date： 2025.07
ADAR1p110 regulates mitotic progression through chromatin association

Yuxi Yang, Mai Kubota, Kokone Hasegawa, Kazuko Nishikura, Chao Zeng, Michiaki Hamada, Masayuki Sakurai

第26回日本RNA学会

Presentation date： 2025.07
Participation Report on RNA 3D structure prediction at CASP16

Junichi Iwakiri, Takumi Otagaki, Kazuteru Yamamura, Shunsuke Sumi, Ikuo Kurisaki, Michiaki Hamada, Jiro Kondo, Kengo Sato

第26回日本RNA学会

Presentation date： 2025.07
核酸医薬研究を加速する情報技術

浜田道昭 [Invited]

第70回野依フォーラム例会

Presentation date： 2025.04
マウス着床期の子宮内膜上皮細胞におけるゲノムワイドなエンハンサー動態

小林良祐, Elena Solovieva, 髙橋直紀, 渡邉敬文, 豊田敦, 浜田道昭

第5回日本獣医解剖アカデミア

Presentation date： 2025.03
アミノ酸配列に着目した翻訳されやすいタンパク質の設計技術開発

加藤晃代, 横山源太朗, 本野千恵, 西河佑馬, 中野秀雄, 浜田道昭

第3回 MfIP連携探索ワークショップ

Presentation date： 2025.03
LINE1 RNAと相互作用するタンパク質複合体の網羅的解析とクロマチン制御機構の解明

小野口真広, 足達俊吾, 浜田道昭

第42回染色体ワークショップ・第23回核ダイナミクス研究会

Presentation date： 2025.01
拡散モデルによる選択的配列濃縮過程のモデリング

松本英倫, 中野涼太, 佐藤健吾, 浜田道昭

情報処理学会第80回BIO研究発表会

Presentation date： 2024.12

Event date： 2024.12

拡散モデルを用いたRNA配列および二次構造の同時生成

中野涼太, 浜田道昭

情報処理学会第80回BIO研究発表会

Presentation date： 2024.12

Event date： 2024.12

Cap-R-G4: G4 構造を考慮した RNA 二次構造の構造プロファイル計算

福永津嵩, 浜田道昭

情報処理学会第80回BIO研究発表会

Presentation date： 2024.12

Event date： 2024.12

Immune profiling of less reactogenic mRNA vaccine revealed the pathways associated with adverse reaction

Tomohiro Takano, Keigo Kumagai, Hitoshi Iuchi, Aya Mizuike, Tomoharu Mizukami, Eita Sasaki, Koji Kobiyama, Ken Ishii, Michiaki Hamada, Masayoshi Fukasawa, Takayuki Matsumura, Yoshimasa Takahashi

第53回日本免疫学会学術集会

Presentation date： 2024.12
LINE1のRNA機能エレメントと結合タンパク質複合体の網羅的解析

小野口真広, 足達俊吾, 浜田道昭

第47回日本分子生物学会年会

Presentation date： 2024.11

Event date： 2024.11

血清飢餓ストレスによって誘導される新規難抽出性arcRNA SISTの解析

金子修士, 藤原奈央子, 中條岳志, 曽超, 浜田道昭, 廣瀬哲郎

第47回日本分子生物学会年会

Presentation date： 2024.11

Event date： 2024.11

チクングニアウイルス中和アプタマーはE2を介した吸着を阻害する

後藤覚, 天野亮, 一ノ瀬顕子, 道下瑛陽, 浜田道昭, 中村義一, 高橋理貴

第47回日本分子生物学会年会

Presentation date： 2024.11

Event date： 2024.11

VLP-SELEX法とin silico解析を活用したデングウイルス中和アプタマーの創製とその有効性評価

天野亮, 道下瑛陽, 芳賀和美, 中野涼太, 一ノ瀬顕子, モイメイリン, 浜田道昭, 中村義一, 高橋理貴

第47回日本分子生物学会年会

Presentation date： 2024.11

Event date： 2024.11

多因子間ネットワークが司る遺伝子発現ユニティー

小野口玲菜, 小野口真広, 川村猛, 浜田道昭, 秋光信佳

第47回日本分子生物学会年会

Presentation date： 2024.11

Event date： 2024.11

GFlowNets による多様性制御生成モデルの学習

三森隆広, 浜田道昭

第27回情報論的学習理論ワークショップ（IBIS2024）、電子情報通信学会情報論的学習理論と機械学習研究会

Event date： 2024.11

ベイズ最適化によるCap2型mRNA生産の最適条件の探索

木村祐太, 須賀幹太, 木村康明, 浜田道昭, 阿部洋

第55回中部化学関係学協会支部連合秋季大会

Event date： 2024.11

mRNA脂質ナノ粒子ワクチン関連副反応メカニズムの解析

高野智弘, 熊谷圭悟, 井内仁志, 水池彩, 齊藤恭子, 水上智晴, 佐々木永太, 足立（中嶋）はるよ, 八村敏志, 小檜山康司, 石井健, 浜田道昭, 深澤征義, 松村隆之, 高橋宜聖

第28回日本ワクチン学会

Presentation date： 2024.10
Deep generative model for functional RNA design

Shunsuke Sumi, Michiaki Hamada, Hirohide Saito [Invited]

BIOINFO 2024

Event date： 2024.10

Participation Report on RNA 3D structure prediction at CASP16: Usage and limitations of AlphaFold3

Junichi Iwakiri, Takumi Otagaki, Kazuteru Yamamura, Shunsuke Sumi, Ikuo Kurisaki, Michiaki Hamada, Jiro Kondo, Kengo Sato

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

Deep Learning-based ncRNA Detection: Surpassing RNAz with Advanced MSA Transformer Models

Ibuki Kaburagi, Tsukasa Fukunaga, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

Conditional Variational AutoEncoder for Multi-Objective Controlled 5'UTR Design

Kazuma Hashimoto, Kenta Suga, Keisuke Yamada, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

RaptCouple: a versatile tool for characterization of de novo RNA discovered by SELEX

Shunsuke Sumi, Tatsuo Adachi, Hirohide Saito, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

Structure-conditioned generative modeling for RNA design

Takahiro Mimori, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

Cellular heterogeneity guided molecule generation

Zhijie Yang, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

De novo interspersed repeat detection using inexact seeding

Atsushi Takeda, Daisuke Nonaka, Yuta Imazu, Tsukasa Fukunaga, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

Progress and Pitfalls in Genome-Based Machine Learning Models for Zoonotic Virus Prediction

Junna Kawasaki, Tadaki Suzuki, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

A fast RNA secondary structure prediction method using homologous sequence information

Tomohiro Nishimura, Kento Kubo, Tsukasa Fukunaga, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

Landscape of Evolutionary Arms Races between Transposable Elements and KRAB-ZFP Family

Masato Kosuge, Jumpei Ito, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

Landscape of Viruses' RNA Secondary Structure to Find New Viruses

Shohei Ogino, Junna Kawasaki, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

The Lomb-Scargle periodogram based differentially expressed gene detection along pseudotime

Hitoshi Iuchi, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Event date： 2024.10

Information technology accelerating RNA therapeutics

Michiaki Hamada [Invited]

RNA and Developmental Biology A Symposium Commemorating the End of the 12-Year RNA Medical Science Laboratory

Presentation date： 2024.10
Comprehensive analysis of the LINE1-associated protein complexes reveals putative functional elements of LINE1 RNA

Masahiro Onoguchi, Shungo Adachi, Michiaki Hamada

1st Asia & Pacific Bioinformatics Joint Conference

Presentation date： 2024.10
ncRNA informatics for drug discovery

Michiaki Hamada

Non-coding RNA World 2024: Exploring Mechanisms

Presentation date： 2024.10
タンパク質言語モデルを用いたコロナウイルスの宿主予測

川崎千晶, 井内仁志, 浜田道昭

生命情報科学若手の会第16回年会

Event date： 2024.09

トランスポゾンのサブファミリー分類手法の確立

山本晃大, 小菅将斗, 武田淳志, 福永津嵩, 浜田道昭

生命情報科学若手の会第16回年会

Event date： 2024.09

Prefix-tuned RNA言語モデルによる新規タンパク質結合RNA配列の生成

横山源太朗, 浜田道昭

生命情報科学若手の会第16回年会

Event date： 2024.09

大規模ゲノムに適用可能なde novo散在反復配列検出手法の開発

武田淳志, 福永津嵩, 浜田道昭

生命情報科学若手の会第16回年会

Event date： 2024.09

トランスポゾンとKRAB-ZFPの進化的軍拡競争

小菅将斗, 伊東潤平, 浜田道昭

生命情報科学若手の会第16回年会

Event date： 2024.09

RNAバイオインフォマティクスを用いた生命医薬学研究

浜田道昭 [Invited]

公益財団法人ときわ会先端医学研究所セミナー

Presentation date： 2024.09

Event date： 2024.09

De novo interspersed repeat detection using inexact seed

Atsushi Takeda, Daisuke Nonaka, Yuta Imazu, Tsukasa Fukunaga, Michiaki Hamada

ISMB2024

Event date： 2024.07

Identify the effect of R-loop on transcriptional regulatory mechanisms

Ryotaro Yanoshita, Eito Ichihash, Mai Kubota, Chao Zeng, Michiaki Hamada, Masayuki Sakurai

第25回日本RNA学会

Event date： 2024.06

Exploring architectural RNAs associated to cellular senescence

Saki Fujiwara, Naoko Fujiwara, Takeshi Chujo, Chao Zeng, Michiaki Hamada, Tetsuro Hirose

第25回日本RNA学会

Event date： 2024.06

Comprehensive Database for RNA-Targeting Drug Discovery

Chao Zeng, Michiaki Hamada

第25回日本RNA学会

Event date： 2024.06

The MTR4/hRNPK complex-mediated degradation of aberrant polyadenylated RNAs with multiple exons

Xinyue Gao, Kenzui Taniue, Anzu Sugawara, Chao Zeng, Han Han, Masahide Seki, Yutaka Suzuki, Michiaki Hamada, Nobuyoshi Akimitsu

第25回日本RNA学会

Event date： 2024.06

A universal tool for characterization of RNA discovered by SELEX

Shunsuke Sumi, Tatsuyuki Yoshii, Tatsuo Adachi, Hirohide Saito, Michiaki Hamada

第25回日本RNA学会

Event date： 2024.06

Sequence characterization and prediction of semi-extractable RNAs

Ryoma Yamawaki, Chao Zeng, Michiaki Hamada

第25回日本RNA学会

Event date： 2024.06

Deciphering the relationship between 5'UTR and 3'UTR sequence of mRNA

Kanta Suga, Michiaki Hamada

第25回日本RNA学会

Event date： 2024.06

Landscape of semi-extractable RNAs across five human cell lines

Chao Zeng, Takeshi Chujo, Tetsuro Hirose, Michiaki Hamada

The 29th Annual Meeting of the RNA Society

Presentation date： 2024.05
Systematic discovery of regulatory motifs associated with human insulator sites

Naoki Osato, Michiaki Hamada

Human Genome Meeting 2024

Event date： 2024.04

大腸菌における翻訳促進新生ペプチドの網羅的探索

加藤晃代, 西河佑馬, 中野秀雄, 横山源太朗, 浜田道昭, 本野千恵

日本農芸化学会2024年度大会

Presentation date： 2024.03
Deep generative design of RNA family sequences

Shunsuke Sumi, Michiaki Hamada, Hirohide Saito

Winter Q-bio 2025

Event date： 2024.02

バイオインフォマティクス：情報科学で生命・医学・薬学研究にブレイクスルーを

浜田道昭 [Invited]

早稲田大学校友会稲門医師会第8回総会

Presentation date： 2024.02

Event date： 2024.02

バイオインフォマティクス：情報科学で生命・医学・薬学研究にブレイクスルーを

浜田道昭 [Invited]

先進技術研究会

Presentation date： 2023.11

Event date： 2023.11

情報科学を用いたmRNA・核酸医薬研究

浜田道昭 [Invited]

第8回 mRNA薬検討会

Presentation date： 2023.09

Event date： 2023.09

RNA構造予測ソフトウエアの紹介と比較

浜田道昭, 栗崎以久男 [Invited]

NPO法人mRNAターゲット創薬研究機構 2023年度第1回講演会

Presentation date： 2023.06
バイオインフォマティクス：情報科学で生命・医学・薬学研究にブレイクスルーを

浜田道昭 [Invited]

千代田稲門会2023年度定時総会講演会

Presentation date： 2023.06
AI aptamer drug discovery, Special session invited talk

Michiaki Hamada [Invited]

GIW / ISCB-Asia 2022

Presentation date： 2022.12
情報科学を用いた核酸医薬・mRNA医薬研究

浜田道昭 [Invited]

第31回WAKO Web受託セミナー RNA合成の進展

Presentation date： 2022.11
AIアプタマー創薬プロジェクト

浜田道昭 [Invited]

2022年度CREST「バイオDX」領域キックオフシンポジウム

Presentation date： 2022.11
RNAバイオインフォマティクス研究の最前線

浜田道昭 [Invited]

千葉工業大学大学院最先端生命科学特論講演会

Presentation date： 2022.09
RNA 情報学を用いた医薬学研究

浜田道昭 [Invited]

特定非営利活動法人 mRNAターゲット創薬研究機構 2022年度第2回講演会

Presentation date： 2022.08
RNA情報学を基軸とした創薬基盤研究

浜田道昭 [Invited]

RNA情報学を基軸とした創薬基盤研究

Presentation date： 2022.05
【RNA研究の最前線】RNA情報学を基軸とした生命科学・医薬学研究

浜田道昭 [Invited]

日本医科大学講演会

Presentation date： 2022.02
RNAを基軸とした創薬研究

浜田道昭 [Invited]

EWE講演会

Presentation date： 2022.01
AIアプタマー創薬

浜田道昭 [Invited]

分子生物学会

Presentation date： 2021.12
ゲノム社会とバイオインフォマティクス

浜田道昭 [Invited]

日本バイオインフォマティクス学会・日本オミックス医学会合同シンポジウム, IIBMP2021

Presentation date： 2021.09
AIアプタマー創薬プロジェクト

浜田道昭 [Invited]

日本医科大学・早稲田大学合同シンポジウム

Presentation date： 2021.06
RNA情報学の最前線

浜田道昭 [Invited]

生命情報科学勉強会＠宮崎大学

Presentation date： 2021.05
RNAバイオインフォマティクスの最前線

浜田道昭 [Invited]

名古屋大学特別講演

Presentation date： 2021.01
RNAを基軸とした創薬研究

浜田道昭 [Invited]

EWE講演会

Presentation date： 2021.01
核酸医薬品開発に向けたバイオインフォマティクス技術

浜田道昭 [Invited]

第15回理研「バイオものづくり」シンポジウム

Presentation date： 2020.12
AIアプタマー創薬の実現に向けた情報技術

浜田道昭 [Invited]

NVIDIA GPU Technology Conference (GTC)

Event date： 2020.10

AIアプタマー創薬プロジェクト

浜田道昭 [Invited]

CREST「人工知能」領域第3回成果展開シンポジウム

Presentation date： 2020.09
長鎖ノンコーディングRNAの機能の解明に向けたバイオインフォマティクス

浜田道昭 [Invited]

ゲノム創薬・創発フォーラム第 3 回シンポジウム（主要テーマ：RNA関連の基礎研究とその創薬応用） (東京大学医科学研究所附属病院 A棟8階トミーホール)

Presentation date： 2020.02
長鎖ノンコーディングRNAの機能の解明に向けたバイオインフォマティクス技術

浜田道昭 [Invited]

2019年度 RNAフロンティアミーティング (IBM 天城ホームステッド)

Presentation date： 2019.09
RNAバイオインフォマティクス：技術開発と応用

浜田道昭 [Invited]

2019年度第１回核酸を標的とした低分子創薬研究会 (大阪大学産業科学研究所)

Presentation date： 2019.08
Model Learning meets Biology ー生物データの背後に潜む「構造」を情報科学で明らかにするー

浜田道昭 [Invited]

第7回生命医薬情報学連合大会（IIBMP2018）

Event date： 2018.09

長鎖ノンコーディング RNA の機能の解明に向けたバイオインフォマティクス技術

浜田道昭 [Invited]

EWE 三月会 11 月例会 (日比谷市政会館)

Presentation date： 2017.11
生命情報科学と私

浜田道昭 [Invited]

第9回生命情報科学若手の会 (西浦温泉ホテルたつき)

Presentation date： 2017.10

Research Projects

Molecular design and application of circular RNAs using novel translation induction techniques

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2025.04 - 2030.03

Organized group activities for the area of perRNA research

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2025.04 - 2030.03

Platform for Advanced Genome Science

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Transformative Research Areas (platforms for Advanced Technologies and Research Resources)

Project Year : 2022.04 - 2028.03

RNAリインカネーション

日本学術振興会科学研究費助成事業

Project Year : 2024.06 - 2027.03

浜田道昭, 秋光信佳, 櫻井雅之
The lncRNA landscape of skeletal muscle cell biotransformation with aging.

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2024.04 - 2026.03

Overview and Systematic Understanding of Biological Phase Separation Based on RNA-Centric Molecular Networks

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2023.04 - 2026.03

AIアプタマー創薬プロジェクト

国立研究開発法人科学技術振興機構戦略的創造研究推進事業(CREST)

Project Year : 2021.04 - 2024.03

浜田道昭

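To drastically shorten the drug discovery timeline for RNA aptamers, which are attracting attention as next-generation drugs that can replace small-molecule compounds, this project establishes "AI aptamer drug discovery," which integrates aptamer drug discovery experiments with RNA informatics and artificial intelligence technologies.
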
リピート要素のde novo発見に基づく長鎖ノンコーディングRNAの機能の解明

日本学術振興会科学研究費助成事業基盤研究(A)

Project Year : 2020.04 - 2023.03

浜田道昭, 小野口真広, 福永津嵩
発達期ダイオキシンと老年期の高次認知機能低下の関係性解明

日本学術振興会科学研究費助成事業基盤研究(A)

Project Year : 2019.04 - 2022.03

掛山正心, 浜田道昭, 久保健一郎, 皆川栄子, 前川文彦

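In animal experiments, we have reported that fetal exposure to dioxins and related compounds impairs cognitive function, as shown both by performance on cognitive tasks and by changes in the fine morphology of neurons. The goal of this study is to accumulate scientific evidence that developmental exposure to dioxins and related compounds contributes to the onset and exacerbation of dementia, and to demonstrate the importance of dementia as a toxicity endpoint. Specifically, we aim (1) to focus on the decline in cognitive flexibility induced in old age by dioxins and related compounds and to clarify, through human surveys and animal toxicity experiments, the nature and extent of the effects and their toxic mechanisms, and (2) based on these results, to establish phenotyping techniques for higher cognitive function in human surveys and animal toxicity experiments. This fiscal year, to carry out the human cohort survey and the animal toxicity experiments, we created the task application used in the human survey and completed the procedures for the cohort survey. We also advanced the infrastructure for remote assessment, in which tasks are presented on tablet devices. For the animal experiments, we designed tasks and prepared toxicity tests for the quantitative evaluation of cognitive flexibility and brain activity. In addition to tasks using IntelliCage, we established tasks using a touchscreen operant apparatus. In collaboration with RIKEN, we performed phenotypic analysis of Alzheimer's disease model mice and obtained promising findings on the relationship between dementia and mental schemas (manuscript submitted). Furthermore, to model the data to be acquired in this project, we conducted a meta-analysis of existing data.
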
Development of strategy to design anticancer drug based on ceRNA network

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Project Year : 2018.07 - 2021.03

Nobuyoshi Akimitsu

Post-transcriptional gene regulation is achieved by RNA-based gene regulatory networks, which are classified into RNA-RNA and RNA-protein networks. In this study, we investigated these RNA-based networks to reveal gene regulation in cancer, with the aim of developing a new strategy for designing anticancer drugs. Our study revealed that the RNA-protein network is remodeled in response to chronic hypoxia, and we identified target molecules of the RNA-protein network under hypoxia.

RNA-クロマチン相互作用予測と応用

日本学術振興会科学研究費助成事業挑戦的研究(萌芽)

Project Year : 2017.06 - 2021.03

浜田道昭, 岩切淳一

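Most of the mammalian genome is transcribed into coding or non-coding RNAs. Some of these non-coding RNAs have been suggested to interact with chromatin and exert epigenetic regulation. To elucidate the mechanism of RNA-chromatin interactions, we built a model that predicts interactions between lncRNAs and chromatin, and examined which features of the model contribute to the interactions. The features considered were R-loop formation, RNA:DNA triplexes, and scaffolding through RNA binding. R-loop formation was estimated by identifying sequence complementarity through alignment, also taking RNA accessibility into account. For RNA:DNA triplexes, we used an existing triplex prediction tool. As the machine learning model, we mainly used random forests, because they make it easy to derive the features that contributed to classification. As data, we used large-scale experimental data on RNA-chromatin interactions to create positive and negative examples and trained the model. Prediction accuracy was evaluated by cross-validation, but sufficient accuracy has not yet been achieved. We are currently examining both the features and the training data in detail, and plan to consider other machine learning models, including deep learning.
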
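As a rough illustration of the sequence-complementarity feature mentioned in the summary above, the following Python sketch measures the longest perfectly complementary, ungapped stretch between an RNA and a DNA strand. It is a deliberately simplified stand-in, not the project's method: the summary describes alignment-based detection that also considers RNA accessibility, whereas this sketch assumes exact Watson-Crick pairing only, and the names `pairing_target` and `longest_complementary_stretch` are hypothetical.

```python
# Watson-Crick partner, in the DNA alphabet, for each RNA base.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def pairing_target(rna: str) -> str:
    """Return the DNA sequence (5'->3') that pairs antiparallel with `rna` (5'->3')."""
    return "".join(RNA_TO_DNA_PARTNER[base] for base in reversed(rna.upper()))


def longest_complementary_stretch(rna: str, dna: str) -> int:
    """Length of the longest ungapped stretch of `rna` that is exactly
    complementary to some stretch of `dna` (assumption: no mismatches or gaps)."""
    target = pairing_target(rna)
    dna = dna.upper()
    best = 0
    # Longest-common-substring dynamic programming over `target` and `dna`.
    prev = [0] * (len(dna) + 1)
    for t in target:
        curr = [0] * (len(dna) + 1)
        for j, d in enumerate(dna, start=1):
            if t == d:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best
```

For example, `longest_complementary_stretch("AUGC", "AAGCATAA")` returns 4, because the whole RNA can pair with the `GCAT` segment. A score of this kind could serve as one input feature to a classifier such as a random forest, alongside triplex and scaffold features.
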
人工知能技術を用いた革新的アプタマー創薬システムの開発

JST 戦略的創造研究推進事業（CREST）

Project Year : 2018.10 - 2021.03

浜田道昭

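This research proposal aims to dramatically shorten the drug discovery process for RNA aptamers, a key class of next-generation drugs, and to improve its success rate, thereby bringing about a breakthrough in pharmaceutical development. To this end, we will research and develop an "AI aptamer drug discovery system" that uses artificial intelligence and nucleic acid informatics to automate the steps of the aptamer drug discovery process up to truncation (shortening of the aptamer sequence). We will introduce the system at the pharmaceutical company RIBOMIC to verify its versatility and effectiveness, and then release it publicly.
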
Platform for Advanced Genome Science

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2016 - 2021

KOHARA Yuji, Kato Kazuto, Kawashima Minae, TOYODA Atsushi, Suzuki Yutaka, MITSUI Jun, Hayashi Tetsuya, TOKINO Takashi, Kurokawa Ken, Nakamura Yasukazu, Noguchi Hideki, IWASAKI WATARU, Morishita Shinichi, Asai Kiyoshi, Kasahara Masahiro, Ito Takehiko, Yamada Takuji, KUHARA Satoshi, Takahashi Hiroki, Sakakibara Yasubumi, HAMADA MICHIAKI, Takagi Toshihisa, SESE JUN, Ogura Yoshitoshi, Ida Ryuichi, YAMAGATA Zentaro, Masui Toru, Muto Kaori, Kodama Satoshi, Setoyama Koichi, Kokado Minori, Ohashi Noriko, FUJIYAMA Asao, INOUE Ituro, Nakaoka Hirofumi, Sugano Sumio, Tsuji Shoji, Gotoh Yasuhiro, Nakamura Keiji, Ogura Yoshitoshi, Okuno Miki, Nakase Hiroshi, SASAKI Yasushi, IDOGAWA Masashi, Tange Shoichiro, Mori Hiroshi, OGASAWARA Osamu, Tanizawa Yasuhiro, Kondo Shinji, kiryu hisanori, Kajitani Rei, TASHIRO Kosuke, Frith Martin, HIRAKAWA Hideki, Suzuki Hiromu, NOSHO KATSUHIKO, KAI Masahiro

Our group has provided state-of-the-art genome technologies, named PAGS Support, including de novo genome sequencing, variation analysis, epigenomics, RNA analysis, metagenome analysis and single-cell analysis, to projects selected from proposals based on KAKENHI projects. Thus far, we have provided PAGS Support to 912 proposals selected from 1988 submissions. The proposals cover most fields of the life sciences and extend to the physical sciences, environmental studies and other areas. Thus far, 556 papers have been published as outcomes, ranging from biology to agriculture, medicine and pharmacy, and from basic to applied sciences. Our group has also developed new technologies and algorithms to overcome problems that emerged during PAGS Support, and these are now used in other PAGS Support projects. This creates a positive cycle, making our system a very effective way to promote the biological sciences.

RNA-クロマチン相互作用予測と応用

文部科学省挑戦的研究(萌芽)

Project Year : 2017.03 - 2020.04

浜田道昭
Functional classification of long noncoding RNAs based on functional elements and deep learning

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Young Scientists (A)

Project Year : 2016.04 - 2020.03

Hamada Michiaki

To elucidate the function of long non-coding RNAs (lncRNAs), which are not translated into proteins but have their own functions, we have conducted many studies from an informatics perspective, focusing on "functional elements" such as RNA sequences, structures, modifications, and interactions with biological macromolecules. For example, we attempted to elucidate the functions of lncRNAs by clarifying that repeat sequences, which were thought to be junk, contribute to the tissue-specific expression of lncRNAs and to their interactions with proteins and DNA. Through these efforts, we have established information infrastructure technology that contributes to the classification of lncRNA functions, and have made it widely available to the public.
機能エレメントと深層学習に基づく長鎖ノンコーディングRNAの機能分類

文部科学省若手研究(A)

Project Year : 2016.04 - 2020.03

浜田道昭
Long non-coding RNA associated with drug resistance in lung cancer with driver mutation

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (C)

Project Year : 2016.04 - 2019.03

SEIKE MASAHIRO, HAMADA Michiaki

We sought to identify long non-coding RNAs (lncRNAs) associated with resistance to molecular targeted therapy in lung cancer with driver mutations. Using microarray and bioinformatic analyses, we analyzed lncRNA expression profiles of 4 drug-sensitive lung cancer cells and 10 drug-resistant lung cancer cells showing cancer stem cell properties and epithelial-mesenchymal transition. We identified CRNDE as an lncRNA, and IRX5 as its target protein, associated with resistance to molecular targeted therapy in lung cancer with driver mutations. Inhibition of IRX5 using siRNA induced apoptotic activity in drug-resistant lung cancer cells. CRNDE and IRX5 may be promising targets for overcoming resistance to molecular targeted therapy in lung cancer with driver mutations.

RNA informatics for epi-transcriptome analysis

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)

Project Year : 2016.04 - 2019.03

Asai Kiyoshi

The energy parameters of two important modified bases, inosine and N6-methyladenosine, were determined by a combination of thermometric experiments and molecular simulations. The effect of estimation error on structure prediction was evaluated through theoretical analysis and computer experiments. In a joint study using the determined inosine parameters, we constructed a model of the effect of A-to-I editing on the efficiency of translational repression by miRNA.
We improved RintD, an analysis tool for secondary structure probability distributions, by developing RintW, which calculates the distribution of base-pair probabilities, and RintC, which speeds up the calculation under a maximum base-pair constraint. In doing so, we analyzed the effect of the Fourier transform on numerical error using accuracy-guaranteed computation and showed that large probability values are reliable.

ヒストンバリアントに基づくクロマチンの機能の推定

日本学術振興会科学研究費助成事業新学術領域研究(研究領域提案型)

Project Year : 2016.04 - 2018.03

浜田道昭

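(1) Estimation of chromatin states from chromatin marks including histone variants.
For histone variant data, we used Kujirai+, NAR (2016) 44, 6127-41 for human and Maehara+, Epigenetics Chromatin (2015) 17;8:35 for mouse. Using these data, we estimated chromatin states with a method developed by the principal investigator. We further investigated the correlation between the estimated chromatin states and various genome annotations.
(2) Database lncRRIdb: an lncRNA-RNA interaction database integrating expression and localization information.
To characterize chromatin function from the perspective of long non-coding RNAs (lncRNAs), we constructed a comprehensive database of RNAs that interact with lncRNAs. The database stores the results of comprehensive computational interaction prediction performed with RIblast, a tool developed by the principal investigator and colleagues, together with experimental information on expression and localization.
(3) Development of information technology for estimating hierarchical chromatin states.
We hypothesized that promoters and enhancers also have a hierarchical structure; for example, promoter ⇒ strong promoter, weak promoter, bivalent promoter. Because conventional chromatin state estimation methods could not account for such a hierarchy, we developed our own method. We built a prototype system and verified its effectiveness using a small dataset.
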
ヒストンバリアントに基づくクロマチンの機能の推定

文部科学省新学術領域研究(研究領域提案型)

Project Year : 2016.04 - 2018.03

浜田道昭
A population genetics analysis of RNA secondary structures

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Young Scientists (B)

Project Year : 2013.04 - 2017.03

Kiryu Hisanori, ASAI Kiyoshi, HAMADA Michiaki, SATO Kengo, KATO Yuki, IWASAKI Wataru, ONO Yukiteru, TERAI Goro, OZAKI Haruka, MATSUMOTO Hirotaka, FUKUNAGA Tsukasa, MORI Ryota, KASHIHARA Yuki, KAWAGUCHI Risa

RNA molecules play essential roles in the cell, enabling the genetic information encoded in the genome to be realized as proteins that perform actual functions. The three-dimensional structure of an RNA can be understood through the arrangement of its stem structures, so investigating the properties of this secondary structure is important for understanding RNA function. In this study, we developed, for the first time, an algorithm (ParasoR) that computes several properties of the secondary structures of very long RNAs such as messenger RNAs and long non-coding RNAs. We also developed an algorithm (CapR) that computes the secondary structural context around the binding regions of RNA-binding proteins.

Comprehensive prediction of RNA-protein interactions

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)

Project Year :

2013.04

-

2016.03

Asai Kiyoshi, YURA Kei

　View Summary

The aim of the research was to predict RNA-protein interactions for non-coding RNAs and proteins of known function. Our analysis of RNA-protein complexes in the PDB showed that nucleotides that do not form base pairs in RNA 2D structures but form hydrogen bonds with amino acids have lower base-pairing probabilities than nucleotides that form neither base pairs nor hydrogen bonds with amino acids. We developed a new method for understanding the landscape of the distribution of RNA 2D structures by efficiently calculating the probabilities of all structures at specific Hamming distances from the canonical structures. To predict the joint structure of RNA-protein complexes, we performed rigid-body docking simulations. After we revised the force field for RNAs, our docking simulations showed better accuracy than previous methods, and we reported this in a peer-reviewed journal.
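To make the notion of a Hamming distance between RNA 2D structures concrete, the following is a minimal sketch. It assumes that structures are written in dot-bracket notation and that the distance counts the base pairs present in one structure but not the other; the summary does not state the exact definition used in the project, and the function names are hypothetical.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)          # remember the opening position
        elif ch == ")":
            pairs.add((stack.pop(), i))  # close the most recent open bracket
    return pairs


def hamming_distance(structure_a, structure_b):
    """Count base pairs that occur in exactly one of the two structures
    (assumed definition; see the lead-in)."""
    return len(base_pairs(structure_a) ^ base_pairs(structure_b))


print(hamming_distance("((...))", "(.....)"))
```

Here the two structures differ only in the inner pair (1, 5), so the distance is 1.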
Development of basic technology for privacy-preserving bioinformatics and its application

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Challenging Exploratory Research

Project Year :

2013.04

-

2016.03

Hamada Michiaki, Shimizu Kana, Hanaoka Goichiro, Tsuda Koji, Frith Martin, Asai Kiyoshi

　View Summary

There is strong demand for handling personal genome and chemical compound information confidentially, because such sensitive information must not be leaked. On the other hand, from the viewpoint of "open" science, it is important to perform data mining by combining this sensitive information with other data. In this study, we developed several methods that perform data mining while keeping the information secret. Specifically, we developed (i) privacy-preserving search for chemical databases, (ii) privacy-preserving genome sequence search with hidden Markov models (HMMs), and (iii) privacy-preserving sequence alignment, all of which will be useful for open science in biology.
Research on structure predictions of RNA with modified nucleotides

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Young Scientists (A)

Project Year :

2012.04

-

2016.03

Hamada Michiaki

　View Summary

We have developed bioinformatic methods for predicting secondary structures including modified bases. Due to the limitation of the known structures with modified bases, we employed a semi-supervised learning approach for predicting RNA secondary structures using RNA sequences with and without secondary structures. Moreover, we have developed an integrated web server, Rtools, for performing various analyses based on RNA secondary structures.
Platform of large scale and high quality genomics and bioinformatics: Towards the advancement of genome sciences in academia

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)

Project Year :

2010.04

-

2016.03

KOHARA Yuji, KATO Kazuto, TOYODA Atsushi, KUROKI Yoko, SUGANO Sumio, SUZUKI Yutaka, HAYASHI Tetsuya, YAMAMOTO Ken, TSUJI Shoji, INOUE Ituro, KUROKAWA Ken, MORISHITA Shinichi, NAKAMURA Yasukazu, TABATA Satoshi, KUHARA Satoshi, IWASAKI Wataru, SESE Jun, TAKAHASHI Hiroki, ASAI Kiyoshi, KASAHARA Masahiro, SAKAKIBARA Yasubumi, YADA Tetsushi, YAMAGATA Zentaro, MUTO Kaori, IDA Ryuichi, MASUI Tohru, KURIYAMA Mariko, TAKAGI Toshihisa, FUJIYAMA Asao, HATTORI Masahira, OGURA Yoshitoshi, TOKUNAGA Katsushi, KUWANO Ryozo, OHASHI Jun, ITOH Takehiko, HIRAKAWA Hideki, NOGUCHI Hideki, MATSUOKA Satoshi, OGASAWARA Naotake, NAKAMURA Kensuke, HAMADA Michiaki, KANAYA Shigehiko, ANZAI Yuichiro, OKADA Kiyotaka, SAKAKI Yoshiyuki, TAKAKU Fumimaro, TOYOSHIMA Kumao, NAKAMURA Keiko, HOTTA Yoshiki, YONEZAWA Akinori, YOSHIKAWA Hiroshi, YOSHIDA Mitsuaki, INOKO Hidetoshi, TODA Tatsushi, INAZAWA Johji, GOJOBORI Takashi, URUSHIHARA Hideko, TAKEDA Hiroyuki, SHIROISHI Toshihiko, ITOH Takashi, SATOH Noriyuki, MATSUDA Hideo, GOTO Susumu, TSUDA Masataka

　View Summary

We have provided technologies of large scale and high quality genomics and bioinformatics to many KAKENHI projects, 60 to 90 subjects every year and altogether 464 subjects, based on application and selection. This kind of support became possible by concentrating on a limited number of DNA sequencing centers at a time when these technologies were advancing unexpectedly fast worldwide. Our activity has led to 363 papers, including the Coelacanth genome paper. The KAKENHI subjects that we supported cover all the KAKENHI items and almost all divisions of the life science domain. Furthermore, we have developed new methodologies to solve problems that emerged from the support activity: one of them is the genome assembly software PLATANUS, which has become a key method for deciphering difficult genomes. Such a virtuous circle and its outcomes show that the platform is essential and effective in the life sciences.


Misc

Fast RNA-RNA Interaction Prediction Methods for Interaction Analysis of Transcriptome-Scale Large Datasets

Tsukasa Fukunaga, Michiaki Hamada

Methods in molecular biology (Clifton, N.J.) 2586 163 - 173 2023 [International journal]

　View Summary

The computational prediction of RNA-RNA interactions has long been studied in RNA informatics. Most of the existing approaches focused on the interaction prediction of short RNAs in small datasets. However, in recent years, two fast prediction methods, RIsearch2 and RIblast, have been developed to predict transcriptome-scale interactions or long RNA interactions. The key idea of the software acceleration of these tools was the integration of a seed-and-extend method, which is used in fast sequence alignment tools, into RNA-RNA interaction prediction. As a result, the two software programs were ten to a thousand times faster than the existing tools; because of this acceleration, detection of genome-wide microRNA target sites or interaction partners of function-unknown long noncoding RNAs has become possible. In this review, we describe the basic concept of the algorithm, its applications, and the future perspectives of the fast RNA-RNA interaction prediction tools.
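The seed-and-extend idea mentioned above can be sketched as follows. This is a toy illustration only, not the algorithm of RIsearch2 or RIblast: it assumes perfect Watson-Crick complementarity (no G-U wobble pairs, no energy model, no gaps), uses a plain dictionary as the seed index, and all function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_and_extend(query, target, k=4):
    """Find maximal perfectly complementary duplexes between query and target.

    Returns sorted (query_start, target_start, length) tuples.
    An antiparallel duplex is an exact match between the query and the
    reverse complement of the target, so seeds are shared k-mers.
    """
    rc = reverse_complement(target)
    # Seed index: every k-mer of the reverse-complemented target.
    index = {}
    for p in range(len(rc) - k + 1):
        index.setdefault(rc[p:p + k], []).append(p)

    hits = set()
    for q in range(len(query) - k + 1):
        for p in index.get(query[q:q + k], []):
            # Extend the seed to the left while bases keep matching.
            qs, ps = q, p
            while qs > 0 and ps > 0 and query[qs - 1] == rc[ps - 1]:
                qs -= 1
                ps -= 1
            # Extend the seed to the right.
            qe, pe = q + k, p + k
            while qe < len(query) and pe < len(rc) and query[qe] == rc[pe]:
                qe += 1
                pe += 1
            length = qe - qs
            # Map the reverse-complement coordinate back to the target.
            hits.add((qs, len(target) - ps - length, length))
    return sorted(hits)


print(seed_and_extend("AAAGGCUCAAA", "CCGAGCCCC"))
```

In this example, query positions 3-7 (GGCUC) pair with target positions 2-6 (GAGCC). The speed-up comes from the index lookup: only positions that share a seed are ever extended.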

DOI PubMed
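Significance of long non-coding RNAs in the drug resistance mechanisms of lung cancer with driver gene alterations

高橋聡, 野呂林太郎, 吉川明子, 中道真仁, 菅野哲平, 松本優, 武内進, 平尾真季子, 松田久仁子, Zeng Chao, 浜田道昭, 久保田馨, 清家正博, 弦間昭彦

Annals of the Japanese Respiratory Society 9 ( Suppl. ) 177 - 177 2020.08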
CAFs induce formation of metastatic human breast tumor cell clusters with partial epithelial-mesenchymal transition

Akira Orimo, Yasuhiko Ito, Yoshihiro Mezawa, Kaidiliavi Sulidan, Yataro Daigo, Nadila Wali, Okio Hino, Kazuyoshi Takeda, Michiaki Hamada, Yuko Matsumura

CANCER SCIENCE 109 797 - 797 2018.12 [Refereed]

Research paper, summary (international conference)
The non-coding RNA Eleanor promotes histone exchange in nucleosomes

藤田理紗, 有村泰宏, 山本達郎, 浜田道昭, 斉藤典子, 胡桃坂仁志

Joint Annual Meeting of Life Science Societies, FY2017 [3PT18 - 0555)] 2017.12
Mutation signature analysis of cancer genomes using topic models (Neurocomputing)

松谷太郎, 宇恵野雄貴, 福永津嵩, 浜田道昭

IEICE Technical Report 117 ( 109 ) 159 - 164 2017.06

CiNii
Mutation signature analysis of cancer genomes using topic models (Information-Based Induction Sciences and Machine Learning)

松谷太郎, 宇恵野雄貴, 福永津嵩, 浜田道昭

IEICE Technical Report 117 ( 110 ) 105 - 110 2017.06

CiNii
Applications of Machine Learning to Bioimage Informatics

56 ( 2 ) 163 - 167 2017

CiNii
Implementation and evaluation of privacy-preserving HMM using homomorphic encryption toward disease risk estimation (version 2016/6/2)

116 ( 120 ) 121 - 126 2016.07

CiNii
Privacy-preserving search for chemical compound databases

Shimizu K, Nuida K, Arai H, Mitsunari S, Attrapadung N, Hamada M, Tsuda K, Hirokawa T, Sakuma J, Hanaoka G, Asai K

bioRxiv ( 013995 ) 2015.01

Internal/External technical report, pre-print, etc.

DOI
RNA secondary structure prediction from multi-aligned sequences

Michiaki Hamada

2013.07

Internal/External technical report, pre-print, etc.

　View Summary

It has been well accepted that the RNA secondary structures of most 
functional non-coding RNAs (ncRNAs) are closely related to their functions and 
are conserved during evolution. Hence, prediction of conserved secondary 
structures from evolutionarily related sequences is one important task in RNA 
bioinformatics; the methods are useful not only to further functional analyses 
of ncRNAs but also to improve the accuracy of secondary structure predictions 
and to find novel functional RNAs from the genome. In this review, I focus on 
common secondary structure prediction from a given aligned RNA s...
Generalized Centroid Estimators in Bioinformatics

Michiaki Hamada, Hisanori Kiryu, Wataru Iwasaki, Kiyoshi Asai

PLoS ONE 6(2):e16450, 2011 2013.05

Internal/External technical report, pre-print, etc.

　View Summary

In a number of estimation problems in bioinformatics, accuracy measures of 
the target problem are usually given, and it is important to design estimators 
that are suitable to those accuracy measures. However, there is often a 
discrepancy between an employed estimator and a given accuracy measure of the 
problem. In this study, we introduce a general class of efficient estimators 
for estimation problems on high-dimensional binary spaces, which represent many 
fundamental problems in bioinformatics. Theoretical analysis reveals that the 
proposed estimators generally fit with commonly-used accura...

DOI
Fighting against uncertainty: An essential issue in bioinformatics

Michiaki Hamada

2013.05

Internal/External technical report, pre-print, etc.

　View Summary

Many bioinformatics problems, such as sequence alignment, gene prediction, 
phylogenetic tree estimation and RNA secondary structure prediction, are often 
affected by the "uncertainty" of a solution; that is, the probability of the 
solution is extremely small. This situation arises for estimation problems on 
high-dimensional discrete spaces in which the number of possible discrete 
solutions is immense. In the analysis of biological data or the development of 
prediction algorithms, this uncertainty should be handled carefully and 
appropriately. In this review, I will explain several methods t...
A privacy-preserving search protocol for chemical compound databases using additively homomorphic encryption

縫田光司, 清水佳奈, 荒井ひろみ, 浜田道昭, 津田宏治, 広川貴次, 花岡悟一郎, 佐久間淳, 浅井潔

IPSJ Symposium Series (CD-ROM) 2012 ( 3 ) Paper No. 2C2-1 - 389 2012.10

CiNii J-GLOBAL
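A proposal of an RNA secondary structure prediction algorithm using semi-supervised learning

米本悠, 浜田道昭, 浜田道昭, 浅井潔, 浅井潔

Abstracts of the Annual Meeting of the RNA Society of Japan 14th 160 2012.07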

J-GLOBAL
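Development of a method for analyzing RNA secondary structure stability based on the canonical distribution

森遼太, 浜田道昭, 浜田道昭, 浅井潔, 浅井潔

Abstracts of the Annual Meeting of the RNA Society of Japan 14th 154 2012.07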

J-GLOBAL
Privacy protection in search behavior

荒井ひろみ, 清水佳奈, 浜田道昭, 津田宏治, 広川貴次, 佐久間淳, 浅井潔, 浅井潔

Proceedings of the Annual Conference of JSAI (CD-ROM) 26th Paper No. 3I2-OS-20-1 2012

J-GLOBAL
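Development of a method for describing the existence probability distribution of RNA secondary structures based on the canonical distribution

森遼太, 浜田道昭, 浜田道昭, 浅井潔, 浅井潔

Program and Abstracts of the Annual Meeting of the Molecular Biology Society of Japan (Web) 35th WEB ONLY 1P-0244 2012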

J-GLOBAL
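A proposal of an RNA secondary structure prediction algorithm using semi-supervised learning

米本悠, 浜田道昭, 浜田道昭, 浅井潔, 浅井潔

Program and Abstracts of the Annual Meeting of the Molecular Biology Society of Japan (Web) 35th WEB ONLY 3P-0071 2012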

J-GLOBAL
Maximizing Expected Accuracy in Bioinformatics(Industrial Materials)

Hamada Michiaki, Asai Kiyoshi

Bulletin of the Japan Society for Industrial and applied Mathematics 21 ( 1 ) 34 - 39 2011.03

DOI CiNii
Expected accuracy maximization and bioinformatics

浜田道昭, 浅井潔

Bulletin of the Japan Society for Industrial and Applied Mathematics 21 ( 1 ) 34 - 39 2011.03

DOI CiNii J-GLOBAL
RNA-RNA interaction prediction using integer programming with threshold cut (Neurocomputing)

Kato Yuki, Sato Kengo, Hamada Michiaki

IEICE Technical Report 110 ( 83 ) 183 - 190 2010.06

CiNii
RNA-RNA Interaction Prediction Using Integer Programming with Threshold Cut

KATO YUKI, SATO KENGO, HAMADA MICHIAKI, WATANABE YOSHIHIDE, ASAI KIYOSHI, AKUTSU TATSUYA

2010 ( 32 ) 1 - 8 2010.06

CiNii
CentroidFold: a web server for RNA secondary structure prediction

佐藤健吾, 佐藤健吾, 浜田道昭, 浜田道昭, 浅井潔, 浅井潔, 光山統泰

Abstracts of the Annual Meeting of the RNA Society of Japan 11th 96 2009.07

J-GLOBAL
Large Scale Similarity Search for Locally stable Secondary Structures among RNA Sequences (IPSJ Transactions on Bioinformatics Vol.2)

2008 ( 2 ) 36 - 46 2009.04

CiNii
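CentroidHomfold: RNA secondary structure prediction using information from homologous sequences

浜田道昭, 浜田道昭, 佐藤健吾, 佐藤健吾, 木立尚孝, 木立尚孝, 光山統泰, 浅井潔, 浅井潔

Abstracts of the Annual Meeting of the Molecular Biology Society of Japan 32nd ( Vol.1 ) 48 2009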

J-GLOBAL
Development of RNA informatics methods that maximize expected accuracy

浜田道昭, 浜田道昭, 木立尚孝, 佐藤健吾, 佐藤健吾, 光山統泰, 浅井潔, 浅井潔

Seikagaku 2P-0776 2008

J-GLOBAL
Classification of functional RNA families using support vector machines

浜田道昭, 浜田道昭, 浜田道昭, 加藤毅, 加藤毅, 金大真, 津田宏治, 浅井潔, 浅井潔

RNA Meeting 7th 69 2005

J-GLOBAL
A High Performance Computing Environments for Prediction of Activity and function of Biomolecules : An Application to Analysis of HIV Protease Inhibitors

Hamada Michiaki, Feng Cheng, Inagaki Yuichiro, Nagashima Umpei, Murakami Kazuaki, Chuman Hiroshi

Transactions of the Japan Society for Industrial and Applied Mathematics 14 ( 4 ) 267 - 288 2004.12

　View Summary

We have developed an object oriented large-scale scientific simulations system that contains algorithms of molecular scientific computing programs, called Embedded High-Performance Computing (EHPC). As an application of the system, "EHPC-Drug platform" has been constructed for rational drug design. It can provide a high-performance computing ability for exhaustive conformational analyses of biomolecules, generating computation of their three-dimensional topological descriptors, and docking calculations with their target receptors. To enhance its computing abilities, we are also planning to ...

DOI CiNii
A High Performance Computing Environments for Prediction of Activity and Function of Biomolecules:-An Application to Analysis of HIV Protease Inhibitors

HAMADA MICHIAKI, FENG C, INAGAKI YUICHIRO, NAGASHIMA UMPEI, MURAKAMI KAZUAKI, CHUMAN HIROSHI

Transactions of the Japan Society for Industrial and Applied Mathematics 14 ( 4 ) 267 - 288 2004.12

　View Summary

We have developed an object oriented large-scale scientific simulations system that contains algorithms of molecular scientific computing programs, called Embedded High-Performance Computing (EHPC). As an application of the system, "EHPC-Drug platform" has been constructed for rational drug design. It can provide a high-performance computing ability for exhaustive conformational analyses of biomolecules, generating computation of their three-dimensional topological descriptors, and docking calculations with their target receptors. To enhance its computing abilities, we are also planning to apply Grid computing technology to this system for parallel and distributed computing and Grid Data processing. As a critical test of our approach, we applied it to a prediction of bound conformation of several HIV protease inhibitors with the protease.

DOI CiNii J-GLOBAL
Development and application of a platform for drug discovery using grid technology and XML database

HAMADA MICHIAKI, INAGAKI YUICHIRO, CHUMAN HIROSHI

Abstracts of the Symposium on Structure-Activity Relationships 32nd 141 - 144 2004.11

J-GLOBAL
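Xsi (Yakushi): development of an integrated virtual screening system for drug discovery

稲垣祐一郎, 浜田道昭, 山崎一人, 金岡昌治, 中馬寛

Proceedings of the Annual Meeting of the Chem-Bio Informatics Society 2004 205 - 206 2004.07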

J-GLOBAL
DrugML and Grid-based drug discovery

浜田道昭, 稲垣祐一郎, 中馬寛

Proceedings of the Annual Meeting of the Society of Computer Chemistry, Japan 2004 51 2004.05

J-GLOBAL
Drug Discovery Using Grid Technologies and DrugML.

HAMADA MICHIAKI, INAGAKI YUICHIRO, CHUMAN HIROSHI

Abstracts of the Symposium on Structure-Activity Relationships 31st 101 - 102 2003.11

J-GLOBAL


Industrial Property Rights

Method for comprehensive and quantitative analysis of RNA-protein complex interactions using artificially synthesized RNA

Japanese Patent No. 7682581

小野口真広, 足達俊吾, 浜田道昭

Patent

J-GLOBAL
Search system, search method, and program

岩村佳奈, 広川貴次, 津田宏治, 荒井ひろみ, 佐久間淳, 浅井潔, 浜田道昭, 花岡悟一郎, 縫田光司

Patent

J-GLOBAL
Search system, search method, and program

Japanese Patent No. 5975490

岩村佳奈, 広川貴次, 津田宏治, 荒井ひろみ, 佐久間淳, 浅井潔, 浜田道昭, 花岡悟一郎, 縫田光司

Patent

J-GLOBAL
RNA sequence information processing device

Japanese Patent No. 4940396

津田宏治, 金大真, 浜田道昭, 浅井潔

Patent

J-GLOBAL
RNA sequence information processing device

津田宏治, 金大真, 浜田道昭, 浅井潔

Patent

J-GLOBAL

Syllabus

Science and Engineering Laboratory 1B II

School of Fundamental Science and Engineering

2026 fall semester
Science and Engineering Laboratory 1B II

School of Creative Science and Engineering

2026 fall semester
Science and Engineering Laboratory 1B II

School of Advanced Science and Engineering

2026 fall semester
C Programming Densei3

School of Advanced Science and Engineering

2026 fall semester
Introduction to C Programming Densei3

School of Advanced Science and Engineering

2026 spring semester
Frontiers of Electrical Engineering and Bioscience [S Grade]

School of Advanced Science and Engineering

2026 spring semester
Frontiers of Electrical Engineering and Bioscience

School of Advanced Science and Engineering

2026 spring semester
Bioinformatics

School of Advanced Science and Engineering

2026 fall semester
Graduation Thesis B [Spring]

School of Advanced Science and Engineering

2026 spring semester
Graduation Thesis B

School of Advanced Science and Engineering

2026 fall semester
Graduation Thesis A

School of Advanced Science and Engineering

2026 spring semester
Graduation Thesis B [Spring]

School of Advanced Science and Engineering

2026 spring semester
Graduation Thesis A [Fall]

School of Advanced Science and Engineering

2026 fall semester
Laboratory C on Electrical Engineering and Bioscience

School of Advanced Science and Engineering

2026 fall semester
Graduation Thesis B

School of Advanced Science and Engineering

2026 fall semester
Project Laboratory B

School of Advanced Science and Engineering

2026 fall semester
Laboratory C on Electrical Engineering and Bioscience [S Grade]

School of Advanced Science and Engineering

2026 fall semester
Project Laboratory A [S Grade]

School of Advanced Science and Engineering

2026 spring semester
Project Laboratory A

School of Advanced Science and Engineering

2026 spring semester
Frontiers of Electrical Engineering and Bioscience

School of Advanced Science and Engineering

2026 spring semester
Frontiers of Electrical Engineering and Bioscience [S Grade]

School of Advanced Science and Engineering

2026 spring semester
Graduation Thesis Fall [S Grade]

School of Advanced Science and Engineering

2026 fall semester
Graduation Thesis Fall

School of Advanced Science and Engineering

2026 fall semester
Graduation Thesis Spring

School of Advanced Science and Engineering

2026 spring semester
Graduation Thesis Spring [S Grade]

School of Advanced Science and Engineering

2026 spring semester
Research on Bioinformatics

Graduate School of Advanced Science and Engineering

2026 full year
Seminar on Bioinformatics and Medical Science D

Graduate School of Advanced Science and Engineering

2026 fall semester
Seminar on Bioinformatics and Medical Science C

Graduate School of Advanced Science and Engineering

2026 spring semester
Seminar on Bioinformatics and Medical Science B

Graduate School of Advanced Science and Engineering

2026 fall semester
Seminar on Bioinformatics D

Graduate School of Advanced Science and Engineering

2026 fall semester
Seminar on Bioinformatics C

Graduate School of Advanced Science and Engineering

2026 spring semester
Advanced Seminar B

Graduate School of Advanced Science and Engineering

2026 fall semester
Advanced Seminar A

Graduate School of Advanced Science and Engineering

2026 spring semester
Research on Bioinformatics and Medical Science

Graduate School of Advanced Science and Engineering

2026 full year
Seminar on Bioinformatics and Medical Science D

Graduate School of Advanced Science and Engineering

2026 fall semester
Seminar on Bioinformatics and Medical Science C

Graduate School of Advanced Science and Engineering

2026 spring semester
Seminar on Bioinformatics and Medical Science B

Graduate School of Advanced Science and Engineering

2026 fall semester
Seminar on Bioinformatics and Medical Science A

Graduate School of Advanced Science and Engineering

2026 spring semester
Seminar on Bioinformatics D

Graduate School of Advanced Science and Engineering

2026 fall semester
Seminar on Bioinformatics C

Graduate School of Advanced Science and Engineering

2026 spring semester
Seminar on Bioinformatics B

Graduate School of Advanced Science and Engineering

2026 fall semester
Seminar on Bioinformatics A

Graduate School of Advanced Science and Engineering

2026 spring semester
Advanced Seminar B

Graduate School of Advanced Science and Engineering

2026 fall semester
Advanced Seminar A

Graduate School of Advanced Science and Engineering

2026 spring semester
Topics on Bioinformatics

Graduate School of Advanced Science and Engineering

2026 spring semester
Research on Bioinformatics and Medical Science

Graduate School of Advanced Science and Engineering

2026 full year
Research on Bioinformatics

Graduate School of Advanced Science and Engineering

2026 full year
Seminar on Bioinformatics B

Graduate School of Advanced Science and Engineering

2026 fall semester
Seminar on Bioinformatics A

Graduate School of Advanced Science and Engineering

2026 spring semester
Seminar on Bioinformatics and Medical Science A

Graduate School of Advanced Science and Engineering

2026 spring semester
Research on Bioinformatics and Medical Science

Graduate School of Advanced Science and Engineering

2026 full year
Research on Bioinformatics

Graduate School of Advanced Science and Engineering

2026 full year
Introduction to Natural Science for a Carbon Neutral Society (Life Science)

Global Education Center

2026 winter quarter


Sub-affiliation

Affiliated organization Global Education Center
Faculty of Science and Engineering Graduate School of Advanced Science and Engineering

Research Institute

2025

-

2026

Center for Data Science Concurrent Researcher
2024

-

2026

Waseda Research Institute for Science and Engineering Concurrent Researcher
2024

-

2026

Waseda Center for a Carbon Neutral Society Concurrent Researcher

Internal Special Research Projects

Advancing AI-driven aptamer drug discovery

2024

　View Summary

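To improve the usability of RaptGen, the generative AI for RNA aptamers that we developed in the previous fiscal year, we developed a user interface, RaptGen-UI. RaptGen-UI is software that provides an interactive interface for identifying and optimizing RNA aptamers. The system uses latent-space Bayesian optimization in a variational autoencoder (VAE) trained on SELEX datasets, allowing experimental researchers without computational expertise to identify and optimize RNA aptamers efficiently. The software consists of three main components: a VAE trainer, a VAE viewer, and a Bayesian optimization module. Together, these modules support training on SELEX data, visualizing the latent space, and optimizing aptamer sequences in a single workflow. Technically, it features a front end built with Next.js and TypeScript, a back end built with FastAPI and Python, and easy deployment through Docker containers. In addition, it is designed to run locally, which preserves data confidentiality, and it supports both GPU and CPU environments. The tool has already been used in actual drug discovery research, including the identification of an aptamer (SPA1) against the COVID-19 spike protein, and it is expected to be particularly useful in drug discovery research that addresses urgent therapeutic needs.
Development of RNA structure analysis technology using simulation techniques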

2023

　View Summary

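We pursued the following research and development on RNA structure analysis using simulation techniques. 1. We developed a deep learning method for predicting the three-dimensional structures of RNA-protein complexes. A manuscript is in preparation. 2. We developed a deep learning method for projecting multiple three-dimensional structures, obtained for example from molecular dynamics, onto a low-dimensional latent space. A manuscript is in preparation.
RNA reincarnation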

2022

　View Summary

We developed RNA bioinformatics technologies that are expected to contribute to elucidating RNA reincarnation. For example, the following is a tool that enables fast discovery of structures common to RNAs: Tsukasa Fukunaga*, Michiaki Hamada, LinAliFold and CentroidLinAliFold: Fast RNA consensus secondary structure prediction for aligned sequences using beam search methods, Bioinformatics Advances, vbac078, https://doi.org/10.1093/bioadv/vbac078 Published: 22 October 2022. Preliminary experiments are also ongoing.
Development of an informatics platform for long non-coding RNA analysis

2021

　View Summary

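Long non-coding RNAs (lncRNAs) do not function alone in vivo; they achieve their various functions by interacting with other functional molecules. This fiscal year we carried out several studies on the computational analysis of RNA-binding proteins (RBPs) that interact with lncRNAs. First, we developed RBP-BERT, which predicts RNA sequences that bind to RBPs using a pretrained BERT model. By analyzing what the model had learned, we also extracted biological characteristics of RBP binding. Second, we performed a comprehensive analysis of RBPs that bind to repeat elements such as transposons. This revealed that repeat elements serve as functional sequences for RBP binding.
Research on fundamental information technology for non-coding RNA analysis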

2020

　View Summary

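To elucidate the functions of the long non-coding RNAs that have been discovered in large numbers in higher eukaryotes such as humans, we built fundamental information technologies and performed a variety of bioinformatics analyses. Specifically, we carried out the following.
・A comprehensive analysis of the relationship between localization and alternative splicing
・Development of MoAIMS, a tool for accurately identifying m6A modification sites from transcriptome-wide m6A measurement data
・Genome-wide identification of R-loop structures and extraction of their features
Research on privacy-preserving bioinformatics analysis using secret sharing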

2019

　View Summary

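Using secret sharing, we devised and implemented a method for securely performing sequence comparison with affine gap penalties. In a comparison with existing methods, we confirmed that computation speed improved substantially. [1] 深見匠, 浜田道昭, Secure personal genome comparison with affine gaps, 2019/12/3, The 42nd Annual Meeting of the Molecular Biology Society of Japan, Fukuoka International Congress Center and Marine Messe Fukuoka [2] 深見匠, 浜田道昭, Secure computation of personal genome similarity, 2019 Symposium on Cryptography and Information Security, January 22-25, 2019, Lake Biwa Otsu Prince Hotel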
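As background for the secret sharing mentioned above, the following is a minimal sketch of additive secret sharing. It is not the project's alignment protocol: it only shows that values split into random shares can be added share by share without any single party seeing the inputs. The modulus, the number of parties, and all function names are illustrative assumptions.

```python
import secrets

MODULUS = 2**61 - 1  # illustrative modulus; all arithmetic is done mod this value


def share(value, n_parties=3):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def add_shared(shares_a, shares_b):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(a + b) % MODULUS for a, b in zip(shares_a, shares_b)]


def reconstruct(shares):
    """Recover the secret by summing all shares."""
    return sum(shares) % MODULUS


# Two alignment-style scores are shared, added in shared form, then revealed.
total = reconstruct(add_shared(share(7), share(35)))
print(total)
```

Alignment with affine gaps also needs comparisons (max or min) on shared values, which require interaction between parties and are beyond this sketch.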
Mathematical and informatics foundations and practice of integrated omics data-driven biology

2018

　View Summary

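As bioinformatics technology toward elucidating the functions of long non-coding RNAs, we developed an algorithm and tool for predicting m6A modifications using deep learning. We also developed an algorithm for fast and accurate prediction of RNA-RNA interactions from sequence information alone. In addition, we developed fundamental information technology for predicting mutation signatures in cancer genome data using model selection techniques.
Mathematical and informatics foundations of integrated omics data-driven biology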

2016

　View Summary

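We obtained the following results on methods for the computational analysis of various omics data.
・We developed a probabilistic model for metagenomic data. The model applies LDA, which is used in natural language processing, to metagenomic data, making it possible to estimate bacterial assemblages. By examining in detail the relationship between the estimated assemblages and the widely known enterotypes, we gave the assemblages a biological interpretation.
・We built a pipeline for identifying mutations in plant genomes from sequencing data and used it for a detailed analysis of plant mutants. This work is a collaboration with RIKEN.
・We developed a method for searching with profile HMMs, which are probabilistic models of protein and DNA sequence motifs, while using cryptographic techniques to keep both the model information and the query information secret. The method relies essentially on the fact that additively homomorphic encryption allows addition to be performed on values while they remain encrypted.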
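The additive property that the last item relies on can be illustrated with a toy example. The sketch below uses the Paillier cryptosystem purely as a well-known additively homomorphic scheme; the summary does not say which scheme the project used, the primes are far too small for real use, and the function names are hypothetical.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy Paillier keys from two small primes (insecure, for illustration)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g is fixed to n + 1
    return n, (lam, mu)


def encrypt(n, message):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals 1 + m*n mod n^2.
    return (1 + message * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, private_key, ciphertext):
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return (x - 1) // n * mu % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


n, private_key = keygen()
c_sum = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
print(decrypt(n, private_key, c_sum))
```

Because HMM scores in log space are sums of per-position terms, a server can in principle accumulate encrypted terms this way without decrypting them; how the actual method handles the remaining operations is not described in the summary.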
Prediction of RNA-chromatin interactions and its applications

2016

　View Summary

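We obtained the following results toward developing a method for estimating RNA-chromatin interactions from sequence information alone. 1. We developed a new method for predicting (docking) RNA-protein complex structures. By incorporating the results of molecular dynamics simulations into the scoring function for complex structures, the method achieved a substantial improvement in accuracy over existing methods. 2. We built and released Rtools, an integrated web server for RNA structure prediction. With this web server, various structure-related predictions (secondary structure, base-pairing probability matrix, formation probabilities of stems, bulges, loops, and so on) can be obtained from the RNA sequence alone. Such information is also useful when predicting RNA-chromatin interactions.
Comprehensive prediction of lncRNA-RNA interactions and construction of a database integrating experimental information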

2015

　View Summary

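In this study, we first built a pipeline system for fast prediction of RNA-RNA interactions and implemented it on the K computer. Second, we used the pipeline to comprehensively predict interaction partners of human lncRNAs and released the results as a database. Hamada gave an oral presentation at APBC2016, and the paper was published in a journal (BMC Genomics).
Development of information technology for an integrated understanding of the epigenome and practice of data-driven biology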

2015

　View Summary

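This fiscal year, we organized and improved the program from the paper [1] published last fiscal year in preparation for releasing its source code publicly. Specifically, we modified it so that the posterior probability of the chromatin state can be output at each position. [1] Michiaki Hamada*, Yukiteru Ono, Ryohei Fujimaki, Kiyoshi Asai, Learning chromatin states with factorized information criteria, Bioinformatics (2015) doi: 10.1093/bioinformatics/btv163 First published online: March 24, 2015
Research on and application of methodology for estimating chromatin states from epigenetic data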

2014

　View Summary

Motivation: Recent studies have suggested that both the genome and the genome with epigenetic modifications, the so-called epigenome, play important roles in various biological functions, such as transcription and DNA replication, repair, and recombination. It is well known that specific combinations of histone modifications (e.g. methylations and acetylations) of nucleosomes induce chromatin states that correspond to specific functions of chromatin. Although the advent of next-generation sequencing (NGS) technologies enables measurement of epigenetic information for entire genomes at high resolution, the variety of chromatin states has not been completely characterized. Results: In this study, we propose a method to estimate the chromatin states indicated by genome-wide chromatin marks identified by NGS technologies. The proposed method automatically estimates the number of chromatin states and characterizes each state on the basis of a hidden Markov model (HMM) in combination with a recently proposed model selection technique, factorized information criteria. The method is expected to provide an unbiased model because it relies on only two adjustable parameters and avoids heuristic procedures as much as possible. Computational experiments with simulated datasets show that our method automatically learns an appropriate model, even in cases where methods that rely on Bayesian information criteria fail to learn the model structures. In addition, we comprehensively compare our method to ChromHMM on three real datasets and show that our method estimates more chromatin states than ChromHMM for those datasets.
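To illustrate how an HMM assigns chromatin states to genomic positions, here is a minimal sketch of forward-backward posterior decoding. It assumes that each position carries binary presence/absence calls for a few chromatin marks and that marks are independent given the state (Bernoulli emissions); these modeling choices, the parameter values, and the function names are illustrative assumptions, and the model selection step with factorized information criteria is not shown.

```python
def posterior_states(obs, init, trans, emit):
    """Posterior probability of each hidden state at each position.

    obs:   list of tuples of 0/1 mark calls, one tuple per position
    init:  init[s] = P(first state is s)
    trans: trans[r][s] = P(next state is s | current state is r)
    emit:  emit[s][m] = P(mark m is present | state s)
    """
    n_states, length = len(init), len(obs)

    def likelihood(s, marks):
        p = 1.0
        for m, present in enumerate(marks):
            p *= emit[s][m] if present else 1.0 - emit[s][m]
        return p

    # Forward pass, rescaled at every position to avoid underflow.
    fwd, scale = [], []
    for t in range(length):
        row = []
        for s in range(n_states):
            if t == 0:
                prior = init[s]
            else:
                prior = sum(fwd[t - 1][r] * trans[r][s] for r in range(n_states))
            row.append(prior * likelihood(s, obs[t]))
        total = sum(row)
        fwd.append([v / total for v in row])
        scale.append(total)

    # Backward pass using the same scaling factors.
    bwd = [[1.0] * n_states for _ in range(length)]
    for t in range(length - 2, -1, -1):
        for s in range(n_states):
            bwd[t][s] = sum(
                trans[s][r] * likelihood(r, obs[t + 1]) * bwd[t + 1][r]
                for r in range(n_states)
            ) / scale[t + 1]

    # Combine and normalize to get per-position posteriors.
    post = []
    for t in range(length):
        row = [fwd[t][s] * bwd[t][s] for s in range(n_states)]
        total = sum(row)
        post.append([v / total for v in row])
    return post


# Two hypothetical states ("marked" and "unmarked") over two marks.
posteriors = posterior_states(
    obs=[(1, 1), (1, 1), (0, 0), (0, 0)],
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    emit=[[0.9, 0.8], [0.1, 0.2]],
)
for row in posteriors:
    print([round(p, 3) for p in row])
```

In this toy input the first two positions favor state 0 and the last two favor state 1, which is the kind of per-position output a chromatin state annotation is built from.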
