Updated 2024/04/18

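Michiaki Hamada (浜田 道昭)
Affiliation
Faculty of Science and Engineering, School of Advanced Science and Engineering
Job title
Professor
Degree
Doctor of Science (March 2009, Tokyo Institute of Technology)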
Profile

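Graduated from the Department of Mathematics, Faculty of Science, Tohoku University in March 2000, and completed the master's program in the Department of Mathematics, Graduate School of Science, Tohoku University in March 2002 (specialty: operator theory; master's thesis). From April 2002, worked for eight and a half years as a researcher (consultant) at Fuji Research Institute Corporation (now Mizuho Research & Technologies, Ltd.), mainly on contract research and development in science and technology. During that period, took part in the NEDO "Functional RNA Project", the first encounter with bioinformatics, and developed many RNA sequence analysis techniques in that project, CentroidFold being a representative example. In March 2009, completed the doctoral program as a working professional at the Department of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, and received a Doctor of Science degree. After serving as Project Associate Professor at the Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo from October 1, 2010, has led the Bioinformatics Laboratory as Associate Professor (Professor since April 2018) at the Faculty of Science and Engineering, Waseda University since April 1, 2014. Concurrently serves as Invited Researcher at the National Institute of Advanced Industrial Science and Technology (AIST) since October 1, 2016 (Team Leader, AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory) and as Visiting Professor at the Graduate School of Medicine, Nippon Medical School since April 1, 2017. Served as a Director of the Japanese Society for Bioinformatics from April 1, 2015 to March 31, 2017. Received the Kawai Mathematics Encouragement Prize in 1998, the Aoba Foundation for the Promotion of Science Encouragement Prize in March 1999, the Oxford Journals JSBi Prize in December 2008, the 2008-2009 SIGBIO Best Paper Award in March 2010, and the Young Scientists' Prize of the Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology in 2017. Has served as principal investigator for KAKENHI Grants-in-Aid for Young Scientists (A) (two grants), Challenging Exploratory Research, Challenging Research (Exploratory), Scientific Research on Innovative Areas (publicly offered research), and Scientific Research (A), as well as the Strategic Basic Research Programs (CREST), among others. Current research covers bioinformatics in general (in particular functional RNA, epigenetics, information technology for next-generation sequencing, and RNA-centered drug discovery). Aims to build killer tools that will remain in use for many years and to bring breakthroughs to biology, medicine, and pharmaceutical science from the side of software and information technology.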

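Career

  • 2022/04 - present

    Waseda University, Next-Generation Core Researcher (次代の中核研究者)

  • 2018/04 - present

    Waseda University, Faculty of Science and Engineering, Professor

  • 2017/04 - present

    Nippon Medical School, Visiting Professor

  • 2016/10 - present

    National Institute of Advanced Industrial Science and Technology (AIST), Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), Invited Researcher / Team Leader

  • 2014/04 - 2018/03

    Waseda University, Faculty of Science and Engineering, Associate Professor

  • 2010/10 - 2014/03

    The University of Tokyo, Graduate School of Frontier Sciences, Project Associate Professor

  • 2002/04 - 2010/09

    Fuji Research Institute Corporation, Researcher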

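Education

  • 2005/10 - 2009/03

    Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Department of Computational Intelligence and Systems Science

  • 2000/04 - 2002/03

    Tohoku University, Graduate School of Science, Department of Mathematics

  • 1996/04 - 2000/03

    Tohoku University, Faculty of Science, Department of Mathematics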

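Committee Memberships

  • 2023/04 - present

    Japanese Society for Bioinformatics, Vice President

  • 2022/06 - present

    mRNA-Targeted Drug Discovery Research Organization (mRNAターゲット創薬研究機構), Director

  • 2020/04 - present

    Japanese Society for Bioinformatics, Secretary

  • 2021/09

    Conference Chair, 2021 Annual Meeting of the Japanese Society for Bioinformatics and 10th Joint Conference on Informatics in Biology, Medicine and Pharmacology (IIBMP2021)

  • 2015/04 - 2017/03

    Japanese Society for Bioinformatics, Director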

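Research Fields

  • Life, health and medical informatics / Intelligent informatics

Research Keywords

  • RNA drug discovery

  • Artificial intelligence

  • Probabilistic models

  • RNA-protein interactions

  • RNA-RNA interactions

  • Interactome

  • Long non-coding RNA

  • Epitranscriptome

  • Epigenome

  • RNA aptamers

  • Sequence analysis

  • RNA

  • Bioinformatics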

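Awards

  • Waseda University Teaching Award

    2023/07   Waseda University   Bioinformatics

    Recipients: Michiaki Hamada, Chao Zeng (曽超)

  • Waseda University Next-Generation Core Researcher 2022 (次代の中核研究者)

    2022/04   Waseda University

    Recipient: Michiaki Hamada

  • Waseda Research Award (High-Impact Publication)

    2021/12   Waseda University

    Recipient: Michiaki Hamada

  • FY2017 Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology, Young Scientists' Prize

    2017/04   Ministry of Education, Culture, Sports, Science and Technology (MEXT)

    Recipient: Michiaki Hamada

  • AIST President's Award (Research)

    2016/04   National Institute of Advanced Industrial Science and Technology (AIST)

    Recipient: Michiaki Hamada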

 

Papers

  • Identification of a novel RNA transcript TISPL upregulated by stressors that stimulate ATF4

    Yutaro Wakabayashi, Aika Shimono, Yuki Terauchi, Chao Zeng, Michiaki Hamada, Kentaro Semba, Shinya Watanabe, Kosuke Ishikawa

    Gene     148464  2024/04

  • RaptGen-Assisted Generation of an RNA/DNA Hybrid Aptamer against SARS-CoV-2 Spike Protein.

    Tatsuo Adachi, Shigetaka Nakamura, Akiya Michishita, Daiki Kawahara, Mizuki Yamamoto, Michiaki Hamada, Yoshikazu Nakamura

    Biochemistry    2024/03  [International journal]

    Optimization of aptamers in length and chemistry is crucial for industrial applications. Here, we developed aptamers against the SARS-CoV-2 spike protein and achieved optimization with a deep-learning-based algorithm, RaptGen. We conducted a primer-less SELEX against the receptor binding domain (RBD) of the spike with an RNA/DNA hybrid library, and the resulting sequences were subjected to RaptGen analysis. Based on the sequence profiling by RaptGen, a short truncation aptamer of 26 nucleotides was obtained and further optimized by a chemical modification of relevant nucleotides. The resulting aptamer is bound to RBD not only of SARS-CoV-2 wildtype but also of its variants, SARS-CoV-1, and Middle East respiratory syndrome coronavirus (MERS-CoV). We concluded that the RaptGen-assisted discovery is efficient for developing optimized aptamers.

  • Inflammation primes the kidney for recovery by activating AZIN1 A-to-I editing.

    Segewkal Heruye, Jered Myslinski, Chao Zeng, Amy Zollman, Shinichi Makino, Azuma Nanamatsu, Quoseena Mir, Sarath Chandra Janga, Emma H Doud, Michael T Eadon, Bernhard Maier, Michiaki Hamada, Tuan M Tran, Pierre C Dagher, Takashi Hato

    bioRxiv : the preprint server for biology    2023/11  [International journal]

    The progression of kidney disease varies among individuals, but a general methodology to quantify disease timelines is lacking. Particularly challenging is the task of determining the potential for recovery from acute kidney injury following various insults. Here, we report that quantitation of post-transcriptional adenosine-to-inosine (A-to-I) RNA editing offers a distinct genome-wide signature, enabling the delineation of disease trajectories in the kidney. A well-defined murine model of endotoxemia permitted the identification of the origin and extent of A-to-I editing, along with temporally discrete signatures of double-stranded RNA stress and Adenosine Deaminase isoform switching. We found that A-to-I editing of Antizyme Inhibitor 1 (AZIN1), a positive regulator of polyamine biosynthesis, serves as a particularly useful temporal landmark during endotoxemia. Our data indicate that AZIN1 A-to-I editing, triggered by preceding inflammation, primes the kidney and activates endogenous recovery mechanisms. By comparing genetically modified human cell lines and mice locked in either A-to-I edited or uneditable states, we uncovered that AZIN1 A-to-I editing not only enhances polyamine biosynthesis but also engages glycolysis and nicotinamide biosynthesis to drive the recovery phenotype. Our findings implicate that quantifying AZIN1 A-to-I editing could potentially identify individuals who have transitioned to an endogenous recovery phase. This phase would reflect their past inflammation and indicate their potential for future recovery.

  • DeepRaccess: high-speed RNA accessibility prediction using deep learning

    Kaisei Hara, Natsuki Iwano, Tsukasa Fukunaga, Michiaki Hamada

    Frontiers in Bioinformatics   3  2023/10  [Peer-reviewed]

    Role: Corresponding author

    RNA accessibility is a useful RNA secondary structural feature for predicting RNA-RNA interactions and translation efficiency in prokaryotes. However, conventional accessibility calculation tools, such as Raccess, are computationally expensive and require considerable computational time to perform transcriptome-scale analysis. In this study, we developed DeepRaccess, which predicts RNA accessibility based on deep learning methods. DeepRaccess was trained to take artificial RNA sequences as input and to predict the accessibility of these sequences as calculated by Raccess. Simulation and empirical dataset analyses showed that the accessibility predicted by DeepRaccess was highly correlated with the accessibility calculated by Raccess. In addition, we confirmed that DeepRaccess could predict protein abundance in E.coli with moderate accuracy from the sequences around the start codon. We also demonstrated that DeepRaccess achieved tens to hundreds of times software speed-up in a GPU environment. The source codes and the trained models of DeepRaccess are freely available at https://github.com/hmdlab/DeepRaccess.

  • Neat1 lncRNA organizes the inflammatory gene expressions in the dorsal root ganglion in neuropathic pain caused by nerve injury

    Motoyo Maruyama, Atsushi Sakai, Tsukasa Fukunaga, Yoshitaka Miyagawa, Takashi Okada, Michiaki Hamada, Hidenori Suzuki

    Frontiers in Immunology   14  2023/08

    Primary sensory neurons regulate inflammatory processes in innervated regions through neuro-immune communication. However, how their immune-modulating functions are regulated in concert remains largely unknown. Here, we show that Neat1 long non-coding RNA (lncRNA) organizes the proinflammatory gene expressions in the dorsal root ganglion (DRG) in chronic intractable neuropathic pain in rats. Neat1 was abundantly expressed in the DRG and was upregulated after peripheral nerve injury. Neat1 overexpression in primary sensory neurons caused mechanical and thermal hypersensitivity, whereas its knockdown alleviated neuropathic pain. Bioinformatics analysis of comprehensive transcriptome changes indicated the inflammatory response was the most relevant function of genes upregulated through Neat1. Consistent with this, upregulation of proinflammatory genes in the DRG following nerve injury was suppressed by Neat1 knockdown. Expression changes of these proinflammatory genes were regulated through Neat1-mRNA interaction-dependent and -independent mechanisms. Notably, Neat1 increased proinflammatory genes by stabilizing its interacting mRNAs in neuropathic pain. Finally, Neat1 in primary sensory neurons contributed to spinal inflammatory processes that mediated peripheral neuropathic pain. These findings demonstrate that Neat1 lncRNA is a key regulator of neuro-immune communication in neuropathic pain.

  • Landscape of semi-extractable RNAs across five human cell lines.

    Chao Zeng, Takeshi Chujo, Tetsuro Hirose, Michiaki Hamada

    Nucleic Acids Research    2023/07  [International journal]

    Phase-separated membraneless organelles often contain RNAs that exhibit unusual semi-extractability using the conventional RNA extraction method, and can be efficiently retrieved by needle shearing or heating during RNA extraction. Semi-extractable RNAs are promising resources for understanding RNA-centric phase separation. However, limited assessments have been performed to systematically identify and characterize semi-extractable RNAs. In this study, 1074 semi-extractable RNAs, including ASAP1, DANT2, EXT1, FTX, IGF1R, LIMS1, NEAT1, PHF21A, PVT1, SCMH1, STRG.3024.1, TBL1X, TCF7L2, TVP23C-CDRT4, UBE2E2, ZCCHC7, ZFAND3 and ZSWIM6, which exhibited consistent semi-extractability were identified across five human cell lines. By integrating publicly available datasets, we found that semi-extractable RNAs tend to be distributed in the nuclear compartments but are dissociated from the chromatin. Long and repeat-containing semi-extractable RNAs act as hubs to provide global RNA-RNA interactions. Semi-extractable RNAs were divided into four groups based on their k-mer content. The NEAT1 group preferred to interact with paraspeckle proteins, such as FUS and NONO, implying that RNAs in this group are potential candidates of architectural RNAs that constitute nuclear bodies.

  • Transposons contribute to the acquisition of cell type-specific cis-elements in the brain

    Kotaro Sekine, Masahiro Onoguchi, Michiaki Hamada

    Communications Biology   6 ( 1 ) 631  2023/06  [Peer-reviewed]  [International journal]

    Mammalian brains have evolved in stages over a long history to acquire higher functions. Recently, several transposable element (TE) families have been shown to evolve into cis-regulatory elements of brain-specific genes. However, it is not fully understood how TEs are important for gene regulatory networks. Here, we performed a single-cell level analysis using public data of scATAC-seq to discover TE-derived cis-elements that are important for specific cell types. Our results suggest that DNA elements derived from TEs, MER130 and MamRep434, can function as transcription factor-binding sites based on their internal motifs for Neurod2 and Lhx2, respectively, especially in glutamatergic neuronal progenitors. Furthermore, MER130- and MamRep434-derived cis-elements were amplified in the ancestors of Amniota and Eutheria, respectively. These results suggest that the acquisition of cis-elements with TEs occurred in different stages during evolution and may contribute to the acquisition of different functions or morphologies in the brain.

  • Recent trends in RNA informatics: a review of machine learning and deep learning for RNA secondary structure prediction and RNA drug discovery

    Kengo Sato, Michiaki Hamada

    Briefings in Bioinformatics    2023/05  [Peer-reviewed]  [International journal]

    Computational analysis of RNA sequences constitutes a crucial step in the field of RNA biology. As in other domains of the life sciences, the incorporation of artificial intelligence and machine learning techniques into RNA sequence analysis has gained significant traction in recent years. Historically, thermodynamics-based methods were widely employed for the prediction of RNA secondary structures; however, machine learning-based approaches have demonstrated remarkable advancements in recent years, enabling more accurate predictions. Consequently, the precision of sequence analysis pertaining to RNA secondary structures, such as RNA–protein interactions, has also been enhanced, making a substantial contribution to the field of RNA biology. Additionally, artificial intelligence and machine learning are also introducing technical innovations in the analysis of RNA–small molecule interactions for RNA-targeted drug discovery and in the design of RNA aptamers, where RNA serves as its own ligand. This review will highlight recent trends in the prediction of RNA secondary structure, RNA aptamers and RNA drug discovery using machine learning, deep learning and related technologies, and will also discuss potential future avenues in the field of RNA informatics.

    Citations (Scopus): 9

  • Mobile element variation contributes to population-specific genome diversification, gene regulation and disease risk.

    Shohei Kojima, Satoshi Koyama, Mirei Ka, Yuka Saito, Erica H Parrish, Mikiko Endo, Sadaaki Takata, Misaki Mizukoshi, Keiko Hikino, Atsushi Takeda, Asami F Gelinas, Steven M Heaton, Rie Koide, Anselmo J Kamada, Michiya Noguchi, Michiaki Hamada, Yoichiro Kamatani, Yasuhiro Murakawa, Kazuyoshi Ishigaki, Yukio Nakamura, Kaoru Ito, Chikashi Terao, Yukihide Momozawa, Nicholas F Parrish

    Nature Genetics   55 ( 6 ) 939 - 951  2023/05  [International journal]

    Mobile genetic elements (MEs) are heritable mutagens that recursively generate structural variants (SVs). ME variants (MEVs) are difficult to genotype and integrate in statistical genetics, obscuring their impact on genome diversification and traits. We developed a tool that accurately genotypes MEVs using short-read whole-genome sequencing (WGS) and applied it to global human populations. We find unexpected population-specific MEV differences, including an Alu insertion distribution distinguishing Japanese from other populations. Integrating MEVs with expression quantitative trait loci (eQTL) maps shows that MEV classes regulate tissue-specific gene expression by shared mechanisms, including creating or attenuating enhancers and recruiting post-transcriptional regulators, supporting class-wide interpretability. MEVs more often associate with gene expression changes than SNVs, thus plausibly impacting traits. Performing genome-wide association study (GWAS) with MEVs pinpoints potential causes of disease risk, including a LINE-1 insertion associated with keloid and fasciitis. This work implicates MEVs as drivers of human divergence and disease risk.

    Citations (Scopus): 4

  • Bioinformatics approaches for unveiling virus-host interactions.

    Hitoshi Iuchi, Junna Kawasaki, Kento Kubo, Tsukasa Fukunaga, Koki Hokao, Gentaro Yokoyama, Akiko Ichinose, Kanta Suga, Michiaki Hamada

    Computational and Structural Biotechnology Journal   21   1774 - 1784  2023  [International journal]

    The coronavirus disease-2019 (COVID-19) pandemic has elucidated major limitations in the capacity of medical and research institutions to appropriately manage emerging infectious diseases. We can improve our understanding of infectious diseases by unveiling virus-host interactions through host range prediction and protein-protein interaction prediction. Although many algorithms have been developed to predict virus-host interactions, numerous issues remain to be solved, and the entire network remains veiled. In this review, we comprehensively surveyed algorithms used to predict virus-host interactions. We also discuss the current challenges, such as dataset biases toward highly pathogenic viruses, and the potential solutions. The complete prediction of virus-host interactions remains difficult; however, bioinformatics can contribute to progress in research on infectious diseases and human health.

    Citations (Scopus): 6

  • Web Services for RNA-RNA Interaction Prediction

    Tsukasa Fukunaga, Junichi Iwakiri, Michiaki Hamada

    Methods in Molecular Biology   2586   175 - 195  2023  [International journal]

    Non-coding RNAs have various biological functions such as translational regulation, and RNA-RNA interactions play essential roles in the mechanisms of action of these RNAs. Therefore, RNA-RNA interaction prediction is an important problem in bioinformatics, and many tools have been developed for the computational prediction of RNA-RNA interactions. In addition to the development of novel algorithms with high accuracy, the development and maintenance of web services is essential for enhancing usability by experimental biologists. In this review, we survey web services for RNA-RNA interaction predictions and introduce how to use primary web services. We present various prediction tools, including general interaction prediction tools, prediction tools for specific RNA classes, and RNA-RNA interaction-based RNA design tools. Additionally, we discuss the future perspectives of the development of RNA-RNA interaction prediction tools and the sustainability of web services.

  • PBSIM3: a simulator for all types of PacBio and ONT long reads.

    Yukiteru Ono, Michiaki Hamada, Kiyoshi Asai

    NAR Genomics and Bioinformatics   4 ( 4 ) lqac092  2022/12  [International journal]

    Long-read sequencers, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) sequencers, have improved their read length and accuracy, thereby opening up unprecedented research. Many tools and algorithms have been developed to analyze long reads, and rapid progress in PacBio and ONT has further accelerated their development. Together with the development of high-throughput sequencing technologies and their analysis tools, many read simulators have been developed and effectively utilized. PBSIM is one of the popular long-read simulators. In this study, we developed PBSIM3 with three new functions: error models for long reads, multi-pass sequencing for high-fidelity read simulation and transcriptome sequencing simulation. Therefore, PBSIM3 is now able to meet a wide range of long-read simulation requirements.

    Citations (Scopus): 10

  • Structure-based screening for functional non-coding RNAs in fission yeast identifies a factor repressing untimely initiation of sexual differentiation.

    Yu Ono, Kenta Katayama, Tomoki Onuma, Kento Kubo, Hayato Tsuyuzaki, Michiaki Hamada, Masamitsu Sato

    Nucleic Acids Research   50 ( 19 ) 11229 - 11242  2022/10  [International journal]

    Non-coding RNAs (ncRNAs) ubiquitously exist in normal and cancer cells. Despite their prevalent distribution, the functions of most long ncRNAs remain uncharacterized. The fission yeast Schizosaccharomyces pombe expresses >1800 ncRNAs annotated to date, but most unconventional ncRNAs (excluding tRNA, rRNA, snRNA and snoRNA) remain uncharacterized. To discover the functional ncRNAs, here we performed a combinatory screening of computational and biological tests. First, all S. pombe ncRNAs were screened in silico for those showing conservation in sequence as well as in secondary structure with ncRNAs in closely related species. Almost a half of the 151 selected conserved ncRNA genes were uncharacterized. Twelve ncRNA genes that did not overlap with protein-coding sequences were next chosen for biological screening that examines defects in growth or sexual differentiation, as well as sensitivities to drugs and stresses. Finally, we highlighted an ncRNA transcribed from SPNCRNA.1669, which inhibited untimely initiation of sexual differentiation. A domain that was predicted as conserved secondary structure by the computational operations was essential for the ncRNA to function. Thus, this study demonstrates that in silico selection focusing on conservation of the secondary structure over species is a powerful method to pinpoint novel functional ncRNAs.

  • Generative aptamer discovery using RaptGen

    Natsuki Iwano, Tatsuo Adachi, Kazuteru Aoki, Yoshikazu Nakamura, Michiaki Hamada

    Nature Computational Science   2 ( 6 ) 378 - 386  2022/06

    Nucleic acid aptamers are generated by an in vitro molecular evolution method known as systematic evolution of ligands by exponential enrichment (SELEX). Various candidates are limited by actual sequencing data from an experiment. Here we developed RaptGen, which is a variational autoencoder for in silico aptamer generation. RaptGen exploits a profile hidden Markov model decoder to represent motif sequences effectively. We showed that RaptGen embedded simulation sequence data into low-dimensional latent space on the basis of motif information. We also performed sequence embedding using two independent SELEX datasets. RaptGen successfully generated aptamers from the latent space even though they were not included in high-throughput sequencing. RaptGen could also generate a truncated aptamer with a short learning model. We demonstrated that RaptGen could be applied to activity-guided aptamer generation according to Bayesian optimization. We concluded that a generative method by RaptGen and latent representation are useful for aptamer discovery.

    Citations (Scopus): 19

  • Mobile elements in human population-specific genome and phenotype divergence

    Shohei Kojima, Satoshi Koyama, Mirei Ka, Yuka Saito, Erica H. Parrish, Mikiko Endo, Sadaaki Takata, Misaki Mizukoshi, Keiko Hikino, Atsushi Takeda, Asami F. Gelinas, Steven M. Heaton, Rie Koide, Anselmo J. Kamada, Michiya Noguchi, Michiaki Hamada, Yoichiro Kamatani, Yasuhiro Murakawa, Kazuyoshi Ishigaki, Yukio Nakamura, Kaoru Ito, Chikashi Terao, Yukihide Momozawa, Nicholas F. Parrish

    2022/03

    Mobile genetic elements (MEs) are heritable mutagens that contribute to divergence between lineages by recursively generating structural variants. ME variants (MEVs) are difficult to genotype, obscuring their impact on recent genome and trait diversification. We developed a tool that uses short-read sequence data to accurately genotype MEVs, enabling us to study them using statistical genetics methods in global human genomes. We observe population-specific differences in the distribution of Alu insertions that distinguish Japanese from other populations. We integrated MEVs with epigenomic and expression quantitative trait loci (eQTL) maps to determine how they impact traits. This reveals coherent patterns by which specific MEs regulate tissue-specific gene expression, including creating or attenuating enhancers and recruiting post-transcriptional regulators. We pinpoint MEVs as genetic causes of disease risk, including a LINE-1 insertion linked to keloid and other diseases of fibroblast inflammation, by introducing MEVs into the genome-wide association study (GWAS) framework. In addition to nominating previously-hidden MEVs as causes of human diseases, this work highlights MEs as accelerators of human population divergence and begins to decipher the semantics of MEs.

  • G0S2 regulates innate immunity in Kawasaki disease via lncRNA HSD11B1-AS1.

    Mako Okabe, Shinya Takarada, Nariaki Miyao, Hideyuki Nakaoka, Keijiro Ibuki, Sayaka Ozawa, Kazuhiro Watanabe, Harue Tsuji, Ikuo Hashimoto, Kiyoshi Hatasaki, Shotaro Hayakawa, Yu Hamaguchi, Michiaki Hamada, Fukiko Ichida, Keiichi Hirono

    Pediatric Research   92 ( 2 ) 378 - 387  2022/03  [International journal]

    BACKGROUND: Kawasaki disease (KD) is a systemic vasculitis that is currently the most common cause of acquired heart disease in children. However, its etiology remains unknown. Long non-coding RNAs (lncRNAs) contribute to the pathophysiology of various diseases. Few studies have reported the role of lncRNAs in KD inflammation; thus, we investigated the role of lncRNA in KD inflammation. METHODS: A total of 50 patients with KD (median age, 19 months; 29 males and 21 females) were enrolled. We conducted cap analysis gene expression sequencing to determine differentially expressed genes in monocytes of the peripheral blood of the subjects. RESULTS: About 21 candidate lncRNA transcripts were identified. The analyses of transcriptome and gene ontology revealed that the immune system was involved in KD. Among these genes, G0/G1 switch gene 2 (G0S2) and its antisense lncRNA, HSD11B1-AS1, were upregulated during the acute phase of KD (P < 0.0001 and <0.0001, respectively). Moreover, G0S2 increased when lipopolysaccharides induced inflammation in THP-1 monocytes, and silencing of G0S2 suppressed the expression of HSD11B1-AS1 and tumor necrosis factor-α. CONCLUSIONS: This study uncovered the crucial role of lncRNAs in innate immunity in acute KD. LncRNA may be a novel target for the diagnosis of KD. IMPACT: This study revealed the whole aspect of the gene expression profile of monocytes of patients with Kawasaki disease (KD) using cap analysis gene expression sequencing and identified KD-specific molecules: G0/G1 switch gene 2 (G0S2) and long non-coding RNA (lncRNA) HSD11B1-AS1. We demonstrated that G0S2 and its antisense HSD11B1-AS1 were associated with inflammation of innate immunity in KD. lncRNA may be a novel key target for the diagnosis of patients with KD.

    Citations (Scopus): 6

  • HT-SELEX-based identification of binding pre-miRNA hairpin-motif for small molecules.

    Sanjukta Mukherjee, Asako Murata, Ryoga Ishida, Ayako Sugai, Chikara Dohno, Michiaki Hamada, Sudhir Krishna, Kazuhiko Nakatani

    Molecular Therapy. Nucleic Acids   27   165 - 174  2022/03  [International journal]

    Selective targeting of biologically relevant RNAs with small molecules is a long-standing challenge due to the lack of clear understanding of the binding RNA motifs for small molecules. The standard SELEX procedure allows the identification of specific RNA binders (aptamers) for the target of interest. However, more effort is needed to identify and characterize the sequence-structure motifs in the aptamers important for binding to the target. Herein, we described a strategy integrating high-throughput (HT) sequencing with conventional SELEX followed by bioinformatic analysis to identify aptamers with high binding affinity and target specificity to unravel the sequence-structure motifs of pre-miRNA, which is essential for binding to the recently developed new water-soluble small-molecule CMBL3aL. To confirm the fidelity of this approach, we investigated the binding of CMBL3aL to the identified motifs by surface plasmon resonance (SPR) spectroscopy and its potential regulatory activity on dicer-mediated cleavage of the obtained aptamers and endogenous pre-miRNAs comprising the identified motif in its hairpin loop. This new approach would significantly accelerate the identification process of binding sequence-structure motifs of pre-miRNA for the compound of interest and would contribute to increase the spectrum of biomedical application.

    Citations (Scopus): 3

  • Prediction of RNA-protein interactions using a nucleotide language model

    Keisuke Yamada, Michiaki Hamada

    Bioinformatics Advances   2 ( 1 ) vbac023  2022  [International journal]

    MOTIVATION: The accumulation of sequencing data has enabled researchers to predict the interactions between RNA sequences and RNA-binding proteins (RBPs) using novel machine learning techniques. However, existing models are often difficult to interpret and require additional information to sequences. Bidirectional encoder representations from transformer (BERT) is a language-based deep learning model that is highly interpretable. Therefore, a model based on BERT architecture can potentially overcome such limitations. RESULTS: Here, we propose BERT-RBP as a model to predict RNA-RBP interactions by adapting the BERT architecture pretrained on a human reference genome. Our model outperformed state-of-the-art prediction models using the eCLIP-seq data of 154 RBPs. The detailed analysis further revealed that BERT-RBP could recognize both the transcript region type and RNA secondary structure only based on sequence information. Overall, the results provide insights into the fine-tuning mechanism of BERT in biological contexts and provide evidence of the applicability of the model to other RNA-related problems. AVAILABILITY AND IMPLEMENTATION: Python source codes are freely available at https://github.com/kkyamada/bert-rbp. The datasets underlying this article were derived from sources in the public domain: [RBPsuite (http://www.csbio.sjtu.edu.cn/bioinf/RBPsuite/), Ensembl Biomart (http://asia.ensembl.org/biomart/martview/)]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.

    Citations (Scopus): 14

  • LinAliFold and CentroidLinAliFold: fast RNA consensus secondary structure prediction for aligned sequences using beam search methods

    Tsukasa Fukunaga, Michiaki Hamada

    Bioinformatics Advances   2 ( 1 )  2022/01

    Motivation

    RNA consensus secondary structure prediction from aligned sequences is a powerful approach for improving the secondary structure prediction accuracy. However, because the computational complexities of conventional prediction tools scale with the cube of the alignment lengths, their application to long RNA sequences, such as viral RNAs or long non-coding RNAs, requires significant computational time.

    Results

    In this study, we developed LinAliFold and CentroidLinAliFold, fast RNA consensus secondary structure prediction tools based on minimum free energy and maximum expected accuracy principles, respectively. We achieved software acceleration using beam search methods that were successfully used for fast secondary structure prediction from a single RNA sequence. Benchmark analyses showed that LinAliFold and CentroidLinAliFold were much faster than the existing methods while preserving the prediction accuracy. As an empirical application, we predicted the consensus secondary structure of coronaviruses with approximately 30 000 nt in 5 and 79 min by LinAliFold and CentroidLinAliFold, respectively. We confirmed that the predicted consensus secondary structure of coronaviruses was consistent with the experimental results.

    Availability and implementation

    The source codes of LinAliFold and CentroidLinAliFold are freely available at https://github.com/fukunagatsu/LinAliFold-CentroidLinAliFold.

    Supplementary information

    Supplementary data are available at Bioinformatics Advances online.

    Citations (Scopus): 1

  • Bioinformatics Approaches for Determining the Functional Impact of Repetitive Elements on Non-coding RNAs.

    Chao Zeng, Atsushi Takeda, Kotaro Sekine, Naoki Osato, Tsukasa Fukunaga, Michiaki Hamada

    Methods in Molecular Biology (Clifton, N.J.)   2509   315 - 340  2022  [International journal]

    With a large number of annotated non-coding RNAs (ncRNAs), repetitive sequences are found to constitute functional components (termed as repetitive elements) in ncRNAs that perform specific biological functions. Bioinformatics analysis is a powerful tool for improving our understanding of the role of repetitive elements in ncRNAs. This chapter summarizes recent findings that reveal the role of repetitive elements in ncRNAs. Furthermore, relevant bioinformatics approaches are systematically reviewed, which promises to provide valuable resources for studying the functional impact of repetitive elements on ncRNAs.

    Citations (Scopus): 2
  • Clone decomposition based on mutation signatures provides novel insights into mutational processes.

    Taro Matsutani, Michiaki Hamada

    NAR genomics and bioinformatics   3 ( 4 ) lqab093  December 2021  [International journal]

    Intra-tumor heterogeneity is a phenomenon in which mutation profiles differ from cell to cell within the same tumor and is observed in almost all tumors. Understanding intra-tumor heterogeneity is essential from the clinical perspective. Numerous methods have been developed to predict this phenomenon based on variant allele frequency. Among the methods, CloneSig models the variant allele frequency and mutation signatures simultaneously and provides an accurate clone decomposition. However, this method has limitations in terms of clone number selection and modeling. We propose SigTracer, a novel hierarchical Bayesian approach for analyzing intra-tumor heterogeneity based on mutation signatures to tackle these issues. We show that SigTracer predicts more reasonable clone decompositions than the existing methods against artificial data that mimic cancer genomes. We applied SigTracer to whole-genome sequences of blood cancer samples. The results were consistent with past findings that single base substitutions caused by a specific signature (previously reported as SBS9) related to the activation-induced cytidine deaminase intensively lie within immunoglobulin-coding regions for chronic lymphocytic leukemia samples. Furthermore, we showed that this signature mutates regions responsible for cell-cell adhesion. Accurate assignments of mutations to signatures by SigTracer can provide novel insights into signature origins and mutational processes.

  • Multi-resBind: a residual network-based multi-label classifier for in vivo RNA binding prediction and preference visualization.

    Shitao Zhao, Michiaki Hamada

    BMC bioinformatics   22 ( 1 ) 554 - 554  November 2021  [International journal]

    BACKGROUND: Protein-RNA interactions play key roles in many processes regulating gene expression. To understand the underlying binding preference, ultraviolet cross-linking and immunoprecipitation (CLIP)-based methods have been used to identify the binding sites for hundreds of RNA-binding proteins (RBPs) in vivo. Using these large-scale experimental data to infer RNA binding preference and predict missing binding sites has become a great challenge. Some existing deep-learning models have demonstrated high prediction accuracy for individual RBPs. However, it remains difficult to avoid significant bias due to the experimental protocol. The DeepRiPe method was recently developed to solve this problem by introducing multi-task or multi-label learning into this field. However, this method has not reached an ideal level of prediction power due to the weak neural network architecture. RESULTS: Compared to the DeepRiPe approach, our Multi-resBind method demonstrated substantial improvements using the same large-scale PAR-CLIP dataset with respect to an increase in the area under the receiver operating characteristic curve and average precision. We conducted extensive experiments to evaluate the impact of various types of input data on the final prediction accuracy. The same approach was used to evaluate the effect of loss functions. Finally, a modified integrated gradient was employed to generate attribution maps. The patterns disentangled from relative contributions according to context offer biological insights into the underlying mechanism of protein-RNA interactions. CONCLUSIONS: Here, we propose Multi-resBind as a new multi-label deep-learning approach to infer protein-RNA binding preferences and predict novel interactions. The results clearly demonstrate that Multi-resBind is a promising tool to predict unknown binding sites in vivo and gain biological insights into why the neural network makes a given prediction.

    Citations (Scopus): 3
  • Impact of human gene annotations on RNA-seq differential expression analysis.

    Yu Hamaguchi, Chao Zeng, Michiaki Hamada

    BMC genomics   22 ( 1 ) 730 - 730  October 2021  [International journal]

    BACKGROUND: Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Different sets of gene annotations are available for the human genome and are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear. RESULTS: Using "mappability", a metric of the complexity of gene annotation, we compared three distinct human gene annotations, GENCODE, RefSeq, and NONCODE, and evaluated how mappability affected DE analysis. We found that mappability was significantly different among the human gene annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically to the downstream steps of DE analysis. CONCLUSIONS: We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that an approach that excludes unnecessary gene models from gene annotations improves the performance of DE analysis.

    Citations (Scopus): 4
  • Binding patterns of RNA-binding proteins to repeat-derived RNA sequences reveal putative functional RNA elements.

    Masahiro Onoguchi, Chao Zeng, Ayako Matsumaru, Michiaki Hamada

    NAR genomics and bioinformatics   3 ( 3 ) lqab055  September 2021  [International journal]

    Recent reports have revealed that repeat-derived sequences embedded in introns or long noncoding RNAs (lncRNAs) are targets of RNA-binding proteins (RBPs) and contribute to biological processes such as RNA splicing or transcriptional regulation. These findings suggest that repeat-derived RNAs are important as scaffolds of RBPs and functional elements. However, the overall functional sequences of the repeat-derived RNAs are not fully understood. Here, we show the putative functional repeat-derived RNAs by analyzing the binding patterns of RBPs based on ENCODE eCLIP data. We mapped all eCLIP reads to repeat sequences and observed that 10.75 % and 7.04 % of reads on average were enriched (at least 2-fold over control) in the repeats in K562 and HepG2 cells, respectively. Using these data, we predicted functional RNA elements on the sense and antisense strands of long interspersed element 1 (LINE1) sequences. Furthermore, we found several new sets of RBPs on fragments derived from other transposable element (TE) families. Some of these fragments show specific and stable secondary structures and are found to be inserted into the introns of genes or lncRNAs. These results suggest that the repeat-derived RNA sequences are strong candidates for the functional RNA elements of endogenous noncoding RNAs.

    Citations (Scopus): 3
  • Umibato: estimation of time-varying microbial interaction using continuous-time regression hidden Markov model.

    Shion Hosoda, Tsukasa Fukunaga, Michiaki Hamada

    Bioinformatics (Oxford, England)   37 ( Suppl_1 ) i16-i24  July 2021  [International journal]

    MOTIVATION: Accumulating evidence has highlighted the importance of microbial interaction networks. Methods have been developed for estimating microbial interaction networks, of which the generalized Lotka-Volterra equation (gLVE)-based method can estimate a directed interaction network. The previous gLVE-based method for estimating microbial interaction networks did not consider time-varying interactions. RESULTS: In this study, we developed the unsupervised learning-based microbial interaction inference method using Bayesian estimation (Umibato), a method for estimating time-varying microbial interactions. The Umibato algorithm comprises Gaussian process regression (GPR) and a new Bayesian probabilistic model, the continuous-time regression hidden Markov model (CTRHMM). Growth rates are estimated by GPR, and interaction networks are estimated by CTRHMM. CTRHMM can estimate time-varying interaction networks using interaction states, which are defined as hidden variables. Umibato outperformed the existing methods on synthetic datasets. In addition, it yielded reasonable estimations in experiments on a mouse gut microbiota dataset, thus providing novel insights into the relationship between consumed diets and the gut microbiota. AVAILABILITY AND IMPLEMENTATION: The C++ and Python source code of the Umibato software is available at https://github.com/shion-h/Umibato. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    Citations (Scopus): 3
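
For readers unfamiliar with the generalized Lotka-Volterra equation mentioned in the abstract above, its standard textbook form is sketched below. The symbols are assumptions chosen for illustration and are not taken from the Umibato paper: $x_i(t)$ is the abundance of taxon $i$, $r_i$ is its intrinsic growth rate, and $a_{ij}$ is the effect of taxon $j$ on taxon $i$.

```latex
\frac{dx_i}{dt} = x_i \Bigl( r_i + \sum_{j} a_{ij}\, x_j \Bigr)
\quad\Longleftrightarrow\quad
\frac{d \ln x_i}{dt} = r_i + \sum_{j} a_{ij}\, x_j
\qquad (x_i > 0)
```

The second form suggests why estimated growth rates (the left-hand side) can be regressed linearly on abundances to obtain a directed interaction matrix. A time-varying model would let $r_i$ and $a_{ij}$ depend on a hidden state, although the exact parameterization used by Umibato may differ from this sketch.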
  • Possible roles for the hominoid-specific DSCR4 gene in human cells.

    Morteza M Saber, Marziyeh Karimiavargani, Takanori Uzawa, Nilmini Hettiarachchi, Michiaki Hamada, Yoshihiro Ito, Naruya Saitou

    Genes & genetic systems   96 ( 1 ) 1 - 11  May 2021  [Domestic journal]

    Down syndrome in humans is caused by trisomy of chromosome 21. DSCR4 (Down syndrome critical region 4) is a de novo-originated protein-coding gene present only in human chromosome 21 and its homologous chromosomes in apes. Despite being located in a medically critical genomic region and an abundance of evidence indicating its functionality, the roles of DSCR4 in human cells are unknown. We used a bioinformatic approach to infer the biological importance and cellular roles of this gene. Our analysis indicates that DSCR4 is likely involved in the regulation of interconnected biological pathways related to cell migration, coagulation and the immune system. We also showed that these predicted biological functions are consistent with tissue-specific expression of DSCR4 in migratory immune system leukocyte cells and neural crest cells (NCCs) that shape facial morphology in the human embryo. The immune system and NCCs are known to be affected in Down syndrome individuals, who suffer from DSCR4 misregulation, which further supports our findings. Providing evidence for the critical roles of DSCR4 in human cells, our findings establish the basis for further experimental investigations that will be necessary to confirm the roles of DSCR4 in the etiology of Down syndrome.

    Citations (Scopus): 4
  • PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores.

    Yukiteru Ono, Kiyoshi Asai, Michiaki Hamada

    Bioinformatics (Oxford, England)   37 ( 5 ) 589 - 595  May 2021  [International journal]

    MOTIVATION: Recent advances in high-throughput long-read sequencers, such as PacBio and Oxford Nanopore sequencers, produce longer reads with more errors than short-read sequencers. In addition to the high error rates of reads, non-uniformity of errors leads to difficulties in various downstream analyses using long reads. Many useful simulators, which characterize long-read error patterns and simulate them, have been developed. However, there is still room for improvement in the simulation of the non-uniformity of errors. RESULTS: To capture characteristics of errors in reads for long-read sequencers, here, we introduce a generative model for quality scores, in which a hidden Markov model is combined with a recent model selection method called factorized information criteria. We evaluated our simulator from various viewpoints, and the results indicate that it successfully simulates reads that are consistent with real reads. AVAILABILITY AND IMPLEMENTATION: The source code of PBSIM2 is freely available from https://github.com/yukiteruono/pbsim2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    Citations (Scopus): 55
  • Long Non-Coding RNA CRNDE Is Involved in Resistance to EGFR Tyrosine Kinase Inhibitor in EGFR-Mutant Lung Cancer via eIF4A3/MUC1/EGFR Signaling.

    Satoshi Takahashi, Rintaro Noro, Masahiro Seike, Chao Zeng, Masaru Matsumoto, Akiko Yoshikawa, Shinji Nakamichi, Teppei Sugano, Mariko Hirao, Kuniko Matsuda, Michiaki Hamada, Akihiko Gemma

    International journal of molecular sciences   22 ( 8 )  April 2021  [International journal]

    (1) Background: Acquired resistance to epidermal growth factor receptor-tyrosine kinase inhibitors (EGFR-TKIs) is an intractable problem for many clinical oncologists. The mechanisms of resistance to EGFR-TKIs are complex. Long non-coding RNAs (lncRNAs) may play an important role in cancer development and metastasis. However, the biological process between lncRNAs and drug resistance to EGFR-mutated lung cancer remains largely unknown. (2) Methods: Osimertinib- and afatinib-resistant EGFR-mutated lung cancer cells were established using a stepwise method. A microarray analysis of non-coding and coding RNAs was performed using parental and resistant EGFR-mutant non-small cell lung cancer (NSCLC) cells and evaluated by bioinformatics analysis through medical-industrial collaboration. (3) Results: Colorectal neoplasia differentially expressed (CRNDE) and DiGeorge syndrome critical region gene 5 (DGCR5) lncRNAs were highly expressed in EGFR-TKI-resistant cells by microarray analysis. RNA-protein binding analysis revealed eukaryotic translation initiation factor 4A3 (eIF4A3) bound in an overlapping manner to CRNDE and DGCR5. The CRNDE downregulates the expression of eIF4A3, mucin 1 (MUC1), and phospho-EGFR. Inhibition of CRNDE activated the eIF4A3/MUC1/EGFR signaling pathway and apoptotic activity, and restored sensitivity to EGFR-TKIs. (4) Conclusions: The results showed that CRNDE is associated with the development of resistance to EGFR-TKIs. CRNDE may be a novel therapeutic target to conquer EGFR-mutant NSCLC.

    Citations (Scopus): 24
  • Jonckheere-Terpstra-Kendall-based non-parametric analysis of temporal differential gene expression.

    Hitoshi Iuchi, Michiaki Hamada

    NAR genomics and bioinformatics   3 ( 1 ) lqab021  March 2021  [International journal]

    Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. degree or degree of freedom) for one dataset. This approach risks modeling of linearly increasing genes with higher-order functions, or fitting of cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere-Terpstra-Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks, using simulation data, show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, application of JTK in the analysis of time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that the wave pattern contributes to the TEG identification of JTK, not the difference in expression levels. This result suggests that JTK is a suitable algorithm when focusing on expression patterns over time rather than expression levels, such as comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.

    Citations (Scopus): 2
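
As a rough illustration of the rank-based idea behind the JTK abstract above, the sketch below computes Kendall's tau between a set of time points and an expression series. This is only the underlying rank correlation, not the authors' JTK algorithm or software; the function name `kendall_tau` and the example values are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs contribute zero to the numerator.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    score = 0
    # Compare every pair of observations once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        if prod > 0:
            score += 1   # concordant pair
        elif prod < 0:
            score -= 1   # discordant pair
    n = len(x)
    return score / (n * (n - 1) / 2)


# Hypothetical time course: expression rises steadily over five time points.
time_points = [0, 1, 2, 3, 4]
rising = [1.0, 1.5, 2.1, 3.0, 4.2]
falling = list(reversed(rising))

tau_up = kendall_tau(time_points, rising)
tau_down = kendall_tau(time_points, falling)
```

Because only the ordering of values matters, the statistic is insensitive to the absolute expression level, which is in line with the abstract's point that the pattern over time, rather than the magnitude, drives detection.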
  • Association analysis of repetitive elements and R-loop formation across species.

    Chao Zeng, Masahiro Onoguchi, Michiaki Hamada

    Mobile DNA   12 ( 1 ) 3 - 3  January 2021  [International journal]

    BACKGROUND: Although recent studies have revealed the genome-wide distribution of R-loops, our understanding of R-loop formation is still limited. Genomes are known to have a large number of repetitive elements. Emerging evidence suggests that these sequences may play an important regulatory role. However, few studies have investigated the effect of repetitive elements on R-loop formation. RESULTS: We found different repetitive elements related to R-loop formation in various species. By controlling length and genomic distributions, we observed that satellite, long interspersed nuclear elements (LINEs), and DNA transposons were each specifically enriched for R-loops in humans, fruit flies, and Arabidopsis thaliana, respectively. R-loops also tended to arise in regions of low-complexity or simple repeats across species. We also found that the repetitive elements associated with R-loop formation differ according to developmental stage. For instance, LINEs and long terminal repeat retrotransposons (LTRs) are more likely to contain R-loops in embryos (fruit fly) and then turn out to be low-complexity and simple repeats in post-developmental S2 cells. CONCLUSIONS: Our results indicate that repetitive elements may have species-specific or development-specific regulatory effects on R-loop formation. This work advances our understanding of repetitive elements and R-loop biology.

    Citations (Scopus): 11
  • Representation learning applications in biological sequence analysis.

    Hitoshi Iuchi, Taro Matsutani, Keisuke Yamada, Natsuki Iwano, Shunsuke Sumi, Shion Hosoda, Shitao Zhao, Tsukasa Fukunaga, Michiaki Hamada

    Computational and structural biotechnology journal   19   3198 - 3208  2021  [International journal]

    Although remarkable advances have been reported in high-throughput sequencing, the ability to aptly analyze a substantial amount of rapidly generated biological (DNA/RNA/protein) sequencing data remains a critical hurdle. To tackle this issue, the application of natural language processing (NLP) to biological sequence analysis has received increased attention. In this method, biological sequences are regarded as sentences while the single nucleic acids/amino acids or k-mers in these sequences represent the words. Embedding is an essential step in NLP, which performs the conversion of these words into vectors. Specifically, representation learning is an approach used for this transformation process, which can be applied to biological sequences. Vectorized biological sequences can then be applied for function and structure estimation, or as input for other probabilistic models. Considering the importance and growing trend for the application of representation learning to biological research, in the present study, we have reviewed the existing knowledge in representation learning for biological sequence analysis.

    Citations (Scopus): 33
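
The "sequences as sentences, k-mers as words" idea in the review abstract above can be sketched as follows. This is a minimal illustration under assumed names (`kmer_tokens`, `build_vocab`) and an arbitrary example sequence; it shows only the tokenization step that precedes embedding and is not code from the review.

```python
def kmer_tokens(sequence, k=3, stride=1):
    """Split a biological sequence into k-mer 'words' (overlapping when stride < k)."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(tokens):
    """Assign each distinct k-mer an integer id in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


# Hypothetical RNA fragment treated as a 'sentence' of 3-mers.
tokens = kmer_tokens("AUGGCUAUG", k=3)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]  # integer ids that an embedding layer would look up
```

An embedding model would then map each integer id to a vector; the choice of `k` and `stride` trades vocabulary size against context per token.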
  • Corrigendum: Possible roles for the hominoid-specific DSCR4 gene in human cells [Genes Genet. Syst. (2021) 96, p. 1-11].

    Morteza M Saber, Marziyeh Karimiavargani, Takanori Uzawa, Nilmini Hettiarachchi, Michiaki Hamada, Yoshihiro Ito, Naruya Saitou

    Genes & genetic systems   96 ( 2 ) 105 - 105  2021  [Domestic journal]

    Legends to Figures 4 and 5 (p. 7) should be exchanged. Below are the correct legends to Figure 4 and Figure 5. Fig. 4. Interconnection of DSCR4 overexpression-mediated perturbed pathways. KEGG analysis of DSCR4 overexpression-mediated DEGs shows enrichment for the tightly interconnected pathways of the coagulation cascade and the complement cascade (highlighted in red) and further confirm the connection of these cascades with cell adhesion, migration and proliferation (red circle). Fig. 5. Expression profile of DSCR4 across human cell lines and tissues. According to Roadmap Epigenomics Project data, DSCR4 and DSCR8, which share a bidirectional promoter, are highly expressed only in K562 cells, a type of leukemia cell. Analysis of transcriptome data provided by Prescott et al. (2015) showed that DSCR4 and DSCR8 also display high expression in human and chimpanzee neural crest cells, which are critical migratory cells involved in facial morphogenesis in the embryo. (1) Data from Prescott et al. (2015). (2) Samples also include esophagus, lung, spleen and fetal large intestine. (3) Samples also include brain germinal matrix, hippocampus, fetal small intestine, stomach, left ventricle, small intestine, sigmoid colon, HEPG2 cells and HMEC cells. The PDF file for DOI: https://doi.org/10.1266/ggs.20-00012 has been replaced with the corrected version as of June 17, 2021.

  • Identification of m6A-Associated RNA Binding Proteins Using an Integrative Computational Framework.

    Yiqian Zhang, Michiaki Hamada

    Frontiers in genetics   12   625797 - 625797  2021  [International journal]

    N6-methyladenosine (m6A) is an abundant modification on mRNA that plays an important role in regulating essential RNA activities. Several wet lab studies have identified some RNA binding proteins (RBPs) that are related to m6A's regulation. The objective of this study was to identify potential m6A-associated RBPs using an integrative computational framework. The framework was composed of an enrichment analysis and a classification model. Utilizing RBPs' binding data, we analyzed reproducible m6A regions from independent studies using this framework. The enrichment analysis identified known m6A-associated RBPs including YTH domain-containing proteins; it also identified RBM3 as a potential m6A-associated RBP for mouse. Furthermore, a significant correlation for the identified m6A-associated RBPs is observed at the protein expression level rather than the gene expression level. On the other hand, a Random Forest classification model was built for the reproducible m6A regions using RBPs' binding data. The RBP-based predictor demonstrated not only competitive performance when compared with sequence-based predictions but also reflected m6A's action of repelling against RBPs, which suggested that our framework can infer interaction between m6A and m6A-associated RBPs beyond sequence level when utilizing RBPs' binding data. In conclusion, we designed an integrative computational framework for the identification of known and potential m6A-associated RBPs. We hope the analysis will provide more insights on the studies of m6A and RNA modifications.

    Citations (Scopus): 4
  • Detection and Characterization of Ribosome-Associated Long Noncoding RNAs.

    Chao Zeng, Michiaki Hamada

    Methods in molecular biology (Clifton, N.J.)   2254   179 - 194  2021  [International journal]

    Ribosome profiling shows potential for studying the function of long noncoding RNAs (lncRNAs). We introduce a bioinformatics pipeline for detecting ribosome-associated lncRNAs (ribo-lncRNAs) from ribosome profiling data. Further, we describe a machine-learning approach for the characterization of ribo-lncRNAs based on their sequence features. Scripts for ribo-lncRNA analysis can be accessed at ( https://ribolnc.hamadalab.com/ ).

    Citations (Scopus): 2
  • Parallelized Latent Dirichlet Allocation Provides a Novel Interpretability of Mutation Signatures in Cancer Genomes.

    Taro Matsutani, Michiaki Hamada

    Genes   11 ( 10 )  September 2020  [International journal]

    Mutation signatures are defined as the distribution of specific mutations such as activity of AID/APOBEC family proteins. Previous studies have reported numerous signatures, using matrix factorization methods for mutation catalogs. Different mutation signatures are active in different tumor types; hence, signature activity varies greatly among tumor types and becomes sparse. Because of this, many previous methods require dividing mutation catalogs for each tumor type. Here, we propose parallelized latent Dirichlet allocation (PLDA), a novel Bayesian model to simultaneously predict mutation signatures with all mutation catalogs. PLDA is an extended model of latent Dirichlet allocation (LDA), which is one of the methods used for signature prediction. It has parallelized hyperparameters of Dirichlet distributions for LDA, and they represent the sparsity of signature activities for each tumor type, thus facilitating simultaneous analyses. First, we conducted a simulation experiment to compare PLDA with previous methods (including SigProfiler and SignatureAnalyzer) using artificial data and confirmed that PLDA could predict signature structures as accurately as previous methods without searching for the optimal hyperparameters. Next, we applied PLDA to PCAWG (Pan-Cancer Analysis of Whole Genomes) mutation catalogs and obtained a signature set different from the one predicted by SigProfiler. Further, we have shown that the mutation spectrum represented by the predicted signature with PLDA provides a novel interpretability through post-analyses.

    Citations (Scopus): 2
  • Free-Energy Calculation of Ribonucleic Inosines and Its Application to Nearest-Neighbor Parameters.

    Shun Sakuraba, Junichi Iwakiri, Michiaki Hamada, Tomoshi Kameda, Genichiro Tsuji, Yasuaki Kimura, Hiroshi Abe, Kiyoshi Asai

    Journal of chemical theory and computation   16 ( 9 ) 5923 - 5935  September 2020  [Peer-reviewed]  [International journal]

    Can current simulations quantitatively predict the stability of ribonucleic acids (RNAs)? In this research, we apply a free-energy perturbation simulation of RNAs containing inosine, a modified ribonucleic base, to the derivation of RNA nearest-neighbor parameters. A parameter set derived solely from 30 simulations was used to predict the free-energy difference of the RNA duplex with a mean unbiased error of 0.70 kcal/mol, which is a level of accuracy comparable to that obtained with parameters derived from 25 experiments. We further show that the error can be lowered to 0.60 kcal/mol by combining the simulation-derived free-energy differences with experimentally measured differences. This protocol can be used as a versatile method for deriving nearest-neighbor parameters of RNAs with various modified bases.

    Citations (Scopus): 2
  • RaptRanker: in silico RNA aptamer selection from HT-SELEX experiment based on local sequence and structure information.

    Ryoga Ishida, Tatsuo Adachi, Aya Yokota, Hidehito Yoshihara, Kazuteru Aoki, Yoshikazu Nakamura, Michiaki Hamada

    Nucleic acids research   48 ( 14 ) e82  August 2020  [International journal]

    Aptamers are short single-stranded RNA/DNA molecules that bind to specific target molecules. Aptamers with high binding affinity and target specificity are identified using an in vitro procedure called high-throughput systematic evolution of ligands by exponential enrichment (HT-SELEX). However, the development of aptamer affinity reagents takes a considerable amount of time and is costly because HT-SELEX produces a large dataset of candidate sequences, some of which have insufficient binding affinity. Here, we present RNA aptamer Ranker (RaptRanker), a novel in silico method for identifying high binding-affinity aptamers from HT-SELEX data by scoring and ranking. RaptRanker analyzes HT-SELEX data by evaluating the nucleotide sequence and secondary structure simultaneously, and by ranking according to scores reflecting local structure and sequence frequencies. To evaluate the performance of RaptRanker, we performed two new HT-SELEX experiments and evaluated the binding affinities of a subset of sequences, including aptamers with low binding affinity. In both datasets, the performance of RaptRanker was superior to Frequency, Enrichment and MPBind. We also confirmed that the consideration of secondary structures is effective in HT-SELEX data analysis, and that RaptRanker successfully predicted the essential subsequence motifs in each identified sequence.

    Citations (Scopus): 32
  • RNA-Seq Analysis Reveals Localization-Associated Alternative Splicing across 13 Cell Lines.

    Chao Zeng, Michiaki Hamada

    Genes   11 ( 7 )  July 2020  [International journal]

    Alternative splicing, a ubiquitous phenomenon in eukaryotes, is a regulatory mechanism for the biological diversity of individual genes. Most studies have focused on the effects of alternative splicing on protein synthesis. However, the transcriptome-wide influence of alternative splicing on RNA subcellular localization has rarely been studied. By analyzing RNA-seq data obtained from subcellular fractions across 13 human cell lines, we identified 8720 switching genes between the cytoplasm and the nucleus. Consistent with previous reports, intron retention was observed to be enriched in the nuclear transcript variants. Interestingly, we found that short and structurally stable introns were positively correlated with nuclear localization. Motif analysis reveals that fourteen RNA-binding proteins (RBPs) tend to bind preferentially to such introns. To our knowledge, this is the first transcriptome-wide study to analyze and evaluate the effect of alternative splicing on RNA subcellular localization. Our findings reveal that alternative splicing plays a promising role in regulating RNA subcellular localization.

    Citations (Scopus): 8
  • Revealing the microbial assemblage structure in the human gut microbiome using latent Dirichlet allocation.

    Shion Hosoda, Suguru Nishijima, Tsukasa Fukunaga, Masahira Hattori, Michiaki Hamada

    Microbiome   8 ( 1 ) 95 - 95  June 2020  [International journal]

    BACKGROUND: The human gut microbiome has been suggested to affect human health and thus has received considerable attention. To clarify the structure of the human gut microbiome, clustering methods are frequently applied to human gut taxonomic profiles. Enterotypes, i.e., clusters of individuals with similar microbiome composition, are well-studied and characterized. However, only a few detailed studies on assemblages, i.e., clusters of co-occurring bacterial taxa, have been conducted. Particularly, the relationship between the enterotype and assemblage is not well-understood. RESULTS: In this study, we detected gut microbiome assemblages using a latent Dirichlet allocation (LDA) method. We applied LDA to a large-scale human gut metagenome dataset and found that a 4-assemblage LDA model could represent relationships between enterotypes and assemblages with high interpretability. This model indicated that each individual tends to have several assemblages, three of which corresponded to the three classically recognized enterotypes. Conversely, the fourth assemblage corresponded to no enterotypes and emerged in all enterotypes. Interestingly, the dominant genera of this assemblage (Clostridium, Eubacterium, Faecalibacterium, Roseburia, Coprococcus, and Butyrivibrio) included butyrate-producing species such as Faecalibacterium prausnitzii. Indeed, the fourth assemblage significantly positively correlated with three butyrate-producing functions. CONCLUSIONS: We conducted an assemblage analysis on a large-scale human gut metagenome dataset using LDA. The present study revealed that there is an enterotype-independent assemblage. Video Abstract.

    Citations (Scopus): 20
  • MoAIMS: efficient software for detection of enriched regions of MeRIP-Seq.

    Yiqian Zhang, Michiaki Hamada

    BMC bioinformatics   21 ( 1 ) 103 - 103  March 2020  [International journal]

    BACKGROUND: Methylated RNA immunoprecipitation sequencing (MeRIP-Seq) is a popular sequencing method for studying RNA modifications and, in particular, N6-methyladenosine (m6A), the most abundant RNA methylation modification found in various species. The detection of enriched regions is a main challenge of MeRIP-Seq analysis; however, current tools either require a long time or do not fully utilize features of RNA sequencing such as strand information, which can cause ambiguous calling. In addition, with growing attention on MeRIP-Seq treatment experiments, biologists need an intuitive evaluation of the treatment effect from comparisons. Therefore, efficient and user-friendly software that can solve these tasks must be developed. RESULTS: We developed software named "model-based analysis and inference of MeRIP-Seq (MoAIMS)" to detect enriched regions of MeRIP-Seq and infer signal proportion based on a mixture negative-binomial model. MoAIMS is designed for transcriptome immunoprecipitation sequencing experiments; therefore, it is compatible with different RNA sequencing protocols. MoAIMS offers excellent processing speed and competitive performance when compared with other tools. When MoAIMS is applied to studies of m6A, the detected enriched regions contain known biological features of m6A. Furthermore, signal proportion inferred from MoAIMS for m6A treatment datasets (perturbation of m6A methyltransferases) showed a decreasing trend that is consistent with experimental observations, suggesting that the signal proportion can be used as an intuitive indicator of treatment effect. CONCLUSIONS: MoAIMS is efficient and easy-to-use software implemented in R. MoAIMS can not only detect enriched regions of MeRIP-Seq efficiently but also provide an intuitive evaluation of the treatment effect for MeRIP-Seq treatment datasets.

    DOI PubMed

    Citations (Scopus): 12
  • Nucleosome destabilization by nuclear non-coding RNAs.

    Risa Fujita, Tatsuro Yamamoto, Yasuhiro Arimura, Saori Fujiwara, Hiroaki Tachiwana, Yuichi Ichikawa, Yuka Sakata, Liying Yang, Reo Maruyama, Michiaki Hamada, Mitsuyoshi Nakao, Noriko Saitoh, Hitoshi Kurumizaka

    Communications biology   3 ( 1 ) 60 - 60  February 2020  [Refereed]  [International journal]

    In the nucleus, genomic DNA is wrapped around histone octamers to form nucleosomes. In principle, nucleosomes are substantial barriers to transcriptional activities. Nuclear non-coding RNAs (ncRNAs) are proposed to function in chromatin conformation modulation and transcriptional regulation. However, it remains unclear how ncRNAs affect the nucleosome structure. Eleanors are clusters of ncRNAs that accumulate around the estrogen receptor-α (ESR1) gene locus in long-term estrogen deprivation (LTED) breast cancer cells, and markedly enhance the transcription of the ESR1 gene. Here we detected nucleosome depletion around the transcription site of Eleanor2, the most highly expressed Eleanor in the LTED cells. We found that the purified Eleanor2 RNA fragment drastically destabilized the nucleosome in vitro. This activity was also exerted by other ncRNAs, but not by poly(U) RNA or DNA. The RNA-mediated nucleosome destabilization may be a common feature among natural nuclear RNAs, and may function in transcription regulation in chromatin.

    DOI PubMed

    Citations (Scopus): 6
  • Targeting the TR4 nuclear receptor-mediated lncTASR/AXL signaling with tretinoin increases the sunitinib sensitivity to better suppress the RCC progression.

    Hangchuan Shi, Yin Sun, Miao He, Xiong Yang, Michiaki Hamada, Tsukasa Fukunaga, Xiaoping Zhang, Chawnshang Chang

    Oncogene   39 ( 3 ) 530 - 545  January 2020  [Refereed]  [International journal]

    Renal cell carcinoma (RCC) is one of the most lethal urological tumors. Using sunitinib to improve the survival has become the first-line therapy for metastatic RCC patients. However, the occurrence of sunitinib resistance in the clinical application has curtailed its efficacy. Here we found TR4 nuclear receptor might alter the sunitinib resistance to RCC via altering the TR4/lncTASR/AXL signaling. Mechanism dissection revealed that TR4 could modulate lncTASR (ENST00000600671.1) expression via transcriptional regulation, which might then increase AXL protein expression via enhancing the stability of AXL mRNA to increase the sunitinib resistance in RCC. Human clinical surveys also linked the expression of TR4, lncTASR, and AXL to the RCC survival, and results from multiple RCC cell lines revealed that targeting this newly identified TR4-mediated signaling with small molecules, including tretinoin, metformin, or TR4-shRNAs, all led to increase the sunitinib sensitivity to better suppress the RCC progression, and our preclinical study using the in vivo mouse model further proved tretinoin had a better synergistic effect to increase sunitinib sensitivity to suppress RCC progression. Future successful clinical trials may help in the development of a novel therapy to better suppress the RCC progression.

    DOI PubMed

    Citations (Scopus): 21
  • Discovering novel mutation signatures by latent Dirichlet allocation with variational Bayes inference.

    Taro Matsutani, Yuki Ueno, Tsukasa Fukunaga, Michiaki Hamada

    Bioinformatics (Oxford, England)   35 ( 22 ) 4543 - 4552  November 2019  [Refereed]  [International journal]

    MOTIVATION: A cancer genome includes many mutations derived from various mutagens and mutational processes, leading to specific mutation patterns. It is known that each mutational process leads to characteristic mutations, and when a mutational process has preferences for mutations, this situation is called a 'mutation signature.' Identification of mutation signatures is an important task for elucidation of carcinogenic mechanisms. In previous studies, analyses with statistical approaches (e.g. non-negative matrix factorization and latent Dirichlet allocation) revealed a number of mutation signatures. Nonetheless, strictly speaking, these existing approaches employ an ad hoc method or incorrect approximation to estimate the number of mutation signatures, and the whole picture of mutation signatures is unclear. RESULTS: In this study, we present a novel method for estimating the number of mutation signatures, latent Dirichlet allocation with variational Bayes inference (VB-LDA), in which variational lower bounds are utilized for finding a plausible number of mutation patterns. In addition, we performed cluster analyses for estimated mutation signatures to extract novel mutation signatures that appear in multiple primary lesions. In a simulation with artificial data, we confirmed that our method estimated the correct number of mutation signatures. Furthermore, applying our method in combination with clustering procedures for real mutation data revealed many interesting mutation signatures that have not been previously reported. AVAILABILITY AND IMPLEMENTATION: All the predicted mutation signatures with clustering results are freely available at http://www.f.waseda.jp/mhamada/MS/index.html. All the C++ source code and python scripts utilized in this study can be downloaded on the Internet (https://github.com/qkirikigaku/MS_LDA). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    DOI PubMed

    Citations (Scopus): 10
  • Stromal fibroblasts induce metastatic tumor cell clusters via epithelial-mesenchymal plasticity.

    Yuko Matsumura, Yasuhiko Ito, Yoshihiro Mezawa, Kaidiliayi Sulidan, Yataro Daigo, Toru Hiraga, Kaoru Mogushi, Nadila Wali, Hiromu Suzuki, Takumi Itoh, Yohei Miyagi, Tomoyuki Yokose, Satoru Shimizu, Atsushi Takano, Yasuhisa Terao, Harumi Saeki, Masayuki Ozawa, Masaaki Abe, Satoru Takeda, Ko Okumura, Sonoko Habu, Okio Hino, Kazuyoshi Takeda, Michiaki Hamada, Akira Orimo

    Life science alliance   2 ( 4 )  August 2019  [Refereed]  [International journal]

    Emerging evidence supports the hypothesis that multicellular tumor clusters invade and seed metastasis. However, whether tumor-associated stroma induces epithelial-mesenchymal plasticity in tumor cell clusters, to promote invasion and metastasis, remains unknown. We demonstrate herein that carcinoma-associated fibroblasts (CAFs) frequently present in tumor stroma drive the formation of tumor cell clusters composed of two distinct cancer cell populations, one in a highly epithelial (E-cadherin^hi ZEB1^lo/neg: E^hi) state and another in a hybrid epithelial/mesenchymal (E-cadherin^lo ZEB1^hi: E/M) state. The E^hi cells highly express oncogenic cell-cell adhesion molecules, such as carcinoembryonic antigen-related cell adhesion molecule 5 (CEACAM5) and CEACAM6 that associate with E-cadherin, resulting in increased tumor cell cluster formation and metastatic seeding. The E/M cells also retain associations with E^hi cells, which follow the E/M cells leading to collective invasion. CAF-produced stromal cell-derived factor 1 and transforming growth factor-β confer the E^hi and E/M states as well as invasive and metastatic traits via Src activation in apposed human breast tumor cells. Taken together, these findings indicate that invasive and metastatic tumor cell clusters are induced by CAFs via epithelial-mesenchymal plasticity.

    DOI PubMed

    Citations (Scopus): 47
  • Identification of RNA biomarkers for chemical safety screening in neural cells derived from mouse embryonic stem cells using RNA deep sequencing analysis.

    Hidenori Tani, Taro Matsutani, Hiroshi Aoki, Kaoru Nakamura, Yu Hamaguchi, Tetsuya Nakazato, Michiaki Hamada

    Biochemical and biophysical research communications   512 ( 4 ) 641 - 646  May 2019  [Refereed]  [International journal]

    Chemical safety screening requires the development of more efficient assays that do not involve testing in animals. In vitro cell-based assays are among the most appropriate alternatives to animal testing for screening of chemical toxicity. Most studies performed to date made use of mRNAs as biomarkers. Recent studies have however indicated the presence of many unannotated non-coding RNAs (ncRNAs) in the transcriptome that do not appear to encode proteins. In the present study, we performed whole-transcriptome sequencing analysis (RNA-Seq) to identify novel RNA biomarkers, including ncRNAs, which showed marked responses to the toxicity of nine chemicals. Chemical safety screening was performed in cell-based assays using mouse embryonic stem cell (mESC)-derived neural cells. Marked responses in the expression of some ncRNAs to the chemical compounds were observed. The results of the present study suggested that ncRNAs may be useful in chemical safety screening as novel RNA biomarkers.

    DOI PubMed

    Citations (Scopus): 1
  • LncRRIsearch: A Web Server for lncRNA-RNA Interaction Prediction Integrated With Tissue-Specific Expression and Subcellular Localization Data.

    Tsukasa Fukunaga, Junichi Iwakiri, Yukiteru Ono, Michiaki Hamada

    Frontiers in genetics   10   462 - 462  2019  [Refereed]  [International journal]

    Long non-coding RNAs (lncRNAs) play critical roles in various biological processes, but the function of the majority of lncRNAs is still unclear. One approach for estimating a function of a lncRNA is the identification of its interaction target because functions of lncRNAs are expressed through interaction with other biomolecules in quite a few cases. In this paper, we developed "LncRRIsearch," which is a web server for comprehensive prediction of human and mouse lncRNA-lncRNA and lncRNA-mRNA interaction. The prediction was conducted using RIblast, which is a fast and accurate RNA-RNA interaction prediction tool. Users can investigate interaction target RNAs of a particular lncRNA through a web interface. In addition, we integrated tissue-specific expression and subcellular localization data for the lncRNAs with the web server. These data enable users to examine tissue-specific or subcellular localized lncRNA interactions. LncRRIsearch is publicly accessible at http://rtools.cbrc.jp/LncRRIsearch/.

    DOI PubMed

    Citations (Scopus): 80
  • DeepM6ASeq: prediction and characterization of m6A-containing sequences using deep learning.

    Yiqian Zhang, Michiaki Hamada

    BMC bioinformatics   19 ( Suppl 19 ) 524 - 524  December 2018  [Refereed]  [International journal]

    BACKGROUND: N6-methyladenosine (m6A) is a common and abundant RNA methylation modification found in various species. As a type of post-transcriptional methylation, m6A plays an important role in diverse RNA activities such as alternative splicing, an interplay with microRNAs and translation efficiency. Although existing tools can predict m6A at single-base resolution, it is still challenging to extract the biological information surrounding m6A sites. RESULTS: We implemented a deep learning framework, named DeepM6ASeq, to predict m6A-containing sequences and characterize surrounding biological features based on miCLIP-Seq data, which detects m6A sites at single-base resolution. DeepM6ASeq showed better performance as compared to other machine learning classifiers. Moreover, an independent test on m6A-Seq data, which identifies m6A-containing genomic regions, revealed that our model is competitive in predicting m6A-containing sequences. The learned motifs from DeepM6ASeq correspond to known m6A readers. Notably, DeepM6ASeq also identifies a newly recognized m6A reader: FMR1. Besides, we found that a saliency map in the deep learning model could be utilized to visualize locations of m6A sites. CONCLUSION: We developed a deep-learning-based framework to predict and characterize m6A-containing sequences and hope to help investigators to gain more insights for m6A research. The source code is available at https://github.com/rreybeyb/DeepM6ASeq .

    DOI PubMed

    Citations (Scopus): 90
  • Identifying sequence features that drive ribosomal association for lncRNA.

    Chao Zeng, Michiaki Hamada

    BMC genomics   19 ( Suppl 10 ) 906 - 906  December 2018  [Refereed]  [International journal]

    BACKGROUND: With the increasing number of annotated long noncoding RNAs (lncRNAs) from the genome, researchers are continually updating their understanding of lncRNAs. Recently, thousands of lncRNAs have been reported to be associated with ribosomes in mammals. However, their biological functions or mechanisms are still unclear. RESULTS: In this study, we tried to investigate the sequence features involved in the ribosomal association of lncRNA. We have extracted ninety-nine sequence features corresponding to different biological mechanisms (i.e., RNA splicing, putative ORF, k-mer frequency, RNA modification, RNA secondary structure, and repeat element). An L1-regularized logistic regression model was applied to screen these features. Finally, we obtained fifteen and nine important features for the ribosomal association of human and mouse lncRNAs, respectively. CONCLUSION: To our knowledge, this is the first study to characterize ribosome-associated lncRNAs and ribosome-free lncRNAs from the perspective of sequence features. These sequence features that were identified in this study may shed light on the biological mechanism of the ribosomal association and provide important clues for functional analysis of lncRNAs.

    DOI PubMed

    Citations (Scopus): 15
  • Nearest-neighbor parameter for inosine-cytosine pairs through a combined experimental and computational approach

    Shun Sakuraba, Junichi Iwakiri, Michiaki Hamada, Tomoshi Kameda, Genichiro Tsuji, Yasuaki Kimura, Hiroshi Abe, Kiyoshi Asai

    October 2018  [Refereed]

    DOI

  • A Novel Method for Assessing the Statistical Significance of RNA-RNA Interactions Between Two Long RNAs.

    Tsukasa Fukunaga, Michiaki Hamada

    Journal of computational biology : a journal of computational molecular cell biology   25 ( 9 ) 976 - 986  September 2018  [Refereed]  [International journal]

    RNA-RNA interactions are key mechanisms through which noncoding RNA (ncRNA) regions exert biological functions. Computational prediction of RNA-RNA interactions is an essential method for detecting novel RNA-RNA interactions because their comprehensive detection by biological experimentation is still quite difficult. Many RNA-RNA interaction prediction tools have been developed, but they tend to produce many false positives. Accordingly, assessment of the statistical significance of computationally predicted interactions is an important task. However, there is no method to evaluate the statistical significance of RNA-RNA interactions that is applicable to interactions between two long RNA sequences. We developed a method to calculate the p-value for the minimal interaction energy between two long RNA sequences. The developed method depends on the fact that minimum interaction energies of RNA-RNA interactions between long RNAs follow a Gumbel distribution when repeat sequences in RNAs are masked. To show the usefulness of the developed method, we applied it to whole human 5'-untranslated region (UTR) and 3'-UTR sequences to detect novel 5'-UTR-3'-UTR interactions. We thus identified two significant 5'-UTR-3'-UTR interactions. Specifically, the human small proline-rich repeat protein 3 shows conserved 5'-UTR-3'-UTR interactions with some nucleotide variations preserving base pairings among primates. Our developed method enables us to detect statistically significant RNA-RNA interactions between long RNAs such as long ncRNAs. Statistical significance estimates help in identification of interactions for experimental validation and provide novel insights into the function of ncRNA regions.

    DOI PubMed

    Citations (Scopus): 2
  • Computational approaches for alternative and transient secondary structures of ribonucleic acids.

    Tsukasa Fukunaga, Michiaki Hamada

    Briefings in functional genomics   18 ( 3 ) 182 - 191  June 2018  [Refereed]  [International journal]

    Transient and alternative structures of ribonucleic acids (RNAs) play essential roles in various regulatory processes, such as translation regulation in living cells. Because experimental analyses for RNA structures are difficult and time-consuming, computational approaches based on RNA secondary structures are promising. In this article, we review computational methods for detecting and analyzing transient/alternative secondary structures of RNAs, including static approaches based on probabilistic distributions of RNA secondary structures and dynamic approaches such as kinetic folding and folding pathway predictions.

    DOI PubMed

    Citations (Scopus): 1
  • Identification and analysis of ribosome-associated lncRNAs using ribosome profiling data.

    Chao Zeng, Tsukasa Fukunaga, Michiaki Hamada

    BMC genomics   19 ( 1 ) 414 - 414  May 2018  [Refereed]  [International journal]

    BACKGROUND: Although the number of discovered long non-coding RNAs (lncRNAs) has increased dramatically, their biological roles have not been established. Many recent studies have used ribosome profiling data to assess the protein-coding capacity of lncRNAs. However, very little work has been done to identify ribosome-associated lncRNAs, here defined as lncRNAs interacting with ribosomes related to protein synthesis as well as other unclear biological functions. RESULTS: On average, 39.17% of expressed lncRNAs were observed to interact with ribosomes in human and 48.16% in mouse. We developed the ribosomal association index (RAI), which quantifies the evidence for ribosomal associability of lncRNAs over various tissues and cell types, to catalog 691 and 409 lncRNAs that are robustly associated with ribosomes in human and mouse, respectively. Moreover, we identified 78 and 42 lncRNAs with a high probability of coding peptides in human and mouse, respectively. Compared with ribosome-free lncRNAs, ribosome-associated lncRNAs were observed to be more likely to be located in the cytoplasm and more sensitive to nonsense-mediated decay. CONCLUSION: Our results suggest that RAI can be used as an integrative and evidence-based tool for distinguishing between ribosome-associated and free lncRNAs, providing a valuable resource for the study of lncRNA functions.

    DOI PubMed

    Citations (Scopus): 45
  • Estimating energy parameters for RNA secondary structure predictions using both experimental and computational data.

    Nishida S, Sakuraba S, Asai K, Hamada M

    IEEE/ACM transactions on computational biology and bioinformatics   16 ( 5 ) 1645 - 1655  March 2018  [Refereed]

    DOI PubMed

    Citations (Scopus): 2
  • Beyond similarity assessment: Selecting the optimal model for sequence alignment via the Factorized Asymptotic Bayesian algorithm.

    Takeda T, Hamada M

    Bioinformatics (Oxford, England)   34 ( 4 ) 576 - 584  February 2018  [Refereed]  [International journal]

    DOI PubMed

    Scopus

  • In silico approaches to RNA aptamer design.

    Hamada M

    Biochimie   145   8 - 14  February 2018  [Refereed]  [International journal]

    DOI PubMed

    Citations (Scopus): 36
  • Identification of Transposable Elements Contributing to Tissue-Specific Expression of Long Non-Coding RNAs.

    Chishima T, Iwakiri J, Hamada M

    Genes   9 ( 1 )  January 2018  [Refereed]  [International journal]

    DOI PubMed

    Citations (Scopus): 38
  • RIblast: an ultrafast RNA-RNA interaction prediction system based on a seed-and-extension approach.

    Fukunaga T, Hamada M

    Bioinformatics (Oxford, England)   33 ( 17 ) 2666 - 2674  September 2017  [Refereed]  [International journal]

    DOI PubMed

    Citations (Scopus): 72
  • Computational prediction of lncRNA-mRNA interactions by integrating tissue specificity in human transcriptome.

    Iwakiri J, Terai G, Hamada M

    Biology direct   12 ( 1 ) 15 - 15  June 2017  [Refereed]  [International journal]

    DOI PubMed

    Citations (Scopus): 37
  • AMAP: A pipeline for whole-genome mutation detection in Arabidopsis thaliana.

    Ishii K, Kazama Y, Hirano T, Hamada M, Ono Y, Yamada M, Abe T

    Genes & genetic systems   91 ( 4 ) 229 - 233  March 2017  [Refereed]  [Domestic journal]

    DOI PubMed

    Citations (Scopus): 4
  • Training alignment parameters for arbitrary sequencers with LAST-TRAIN.

    Hamada M, Ono Y, Asai K, Frith MC

    Bioinformatics (Oxford, England)   33 ( 6 ) 926 - 928  March 2017  [Refereed]  [International journal]

    DOI PubMed

    Citations (Scopus): 43
  • Improved Accuracy in RNA-Protein Rigid Body Docking by Incorporating Force Field for Molecular Dynamics Simulation into the Scoring Function.

    Iwakiri J, Hamada M, Asai K, Kameda T

    Journal of chemical theory and computation   12 ( 9 ) 4688 - 97  September 2016  [Refereed]  [International journal]

    DOI PubMed J-GLOBAL

    Citations (Scopus): 22
  • Bioinformatics for long non-coding RNAs (長鎖ノンコーディングRNAのためのバイオインフォマティクス)

    Junichi Iwakiri, Michiaki Hamada

    生物物理 (Seibutsu Butsuri)   56 ( 4 ) 217 - 220  August 2016  [Refereed]

    Recent advances in high-throughput sequencing technologies have revealed that a large number of long non-coding RNAs (lncRNAs) are transcribed from the human genome. These emerging transcripts now need to be functionally classified and annotated. Here we review several bioinformatic approaches for analyzing the important characteristics of lncRNAs toward discovering their functions: 1) tissue specificities of lncRNA expression, 2) two types of macromolecular interactions (RNA-RNA and RNA-protein interactions), and 3) secondary structures of lncRNAs.

    DOI CiNii

  • Rtools: a web server for various secondary structural analyses on single RNA sequences

    Michiaki Hamada, Yukiteru Ono, Hisanori Kiryu, Kengo Sato, Yuki Kato, Tsukasa Fukunaga, Ryota Mori, Kiyoshi Asai

    NUCLEIC ACIDS RESEARCH   44 ( W1 ) W302 - W307  July 2016  [Refereed]

    The secondary structures, as well as the nucleotide sequences, are the important features of RNA molecules to characterize their functions. According to the thermodynamic model, however, the probability of any secondary structure is very small. As a consequence, any tool to predict the secondary structures of RNAs has limited accuracy. On the other hand, there are a few tools to compensate the imperfect predictions by calculating and visualizing the secondary structural information from RNA sequences. It is desirable to obtain the rich information from those tools through a friendly interface. We implemented a web server of the tools to predict secondary structures and to calculate various structural features based on the energy models of secondary structures. By just giving an RNA sequence to the web server, the user can get the different types of solutions of the secondary structures, the marginal probabilities such as base-pairing probabilities, loop probabilities and accessibilities of the local bases, the energy changes by arbitrary base mutations as well as the measures for validations of the predicted secondary structures. The web server is available at http://rtools.cbrc.jp, which integrates software tools, CentroidFold, CentroidHomfold, IPKnot, CapR, Raccess, Rchange and RintD.

    DOI

  • Comprehensive prediction of lncRNA-RNA interactions in human transcriptome

    Goro Terai, Junichi Iwakiri, Tomoshi Kameda, Michiaki Hamada, Kiyoshi Asai

    BMC GENOMICS   17 ( S-1 ) 12  2016  [Refereed]

    Motivation: Recent studies have revealed that large numbers of non-coding RNAs are transcribed in humans, but only a few of them have been identified with their functions. Identification of the interaction target RNAs of the non-coding RNAs is an important step in predicting their functions. The current experimental methods to identify RNA-RNA interactions, however, are not fast enough to apply to a whole human transcriptome. Therefore, computational predictions of RNA-RNA interactions are desirable, but this is a challenging task due to the huge computational costs involved.
    Results: Here, we report comprehensive predictions of the interaction targets of lncRNAs in a whole human transcriptome for the first time. To achieve this, we developed an integrated pipeline for predicting RNA-RNA interactions on the K computer, which is one of the fastest super-computers in the world. Comparisons with experimentally-validated lncRNA-RNA interactions support the quality of the predictions. Additionally, we have developed a database that catalogs the predicted lncRNA-RNA interactions to provide fundamental information about the targets of lncRNAs.

    DOI

    Citations (Scopus): 50
  • Bioinformatics tools for lncRNA research

    Junichi Iwakiri, Michiaki Hamada, Kiyoshi Asai

    BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS   1859 ( 1 ) 23 - 30  January 2016  [Refereed]

    Current experimental methods to identify the functions of a large number of the candidates of long non-coding RNAs (lncRNAs) are limited in their throughput. Therefore, it is essential to know which tools are effective for understanding lncRNAs so that reasonable speed and accuracy can be achieved. In this paper, we review the currently available bioinformatics tools and databases that are useful for finding non-coding RNAs and analyzing their structures, conservation, interactions, co-expressions and localization. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa. (C) 2015 Elsevier B.V. All rights reserved.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 45
  • Privacy-preserving search for chemical compound databases

    Kana Shimizu, Koji Nuida, Hiromi Arai, Shigeo Mitsunari, Nuttapong Attrapadung, Michiaki Hamada, Koji Tsuda, Takatsugu Hirokawa, Jun Sakuma, Goichiro Hanaoka, Kiyoshi Asai

    BMC BIOINFORMATICS   16 ( 18 ) S6  December 2015  [Refereed]

    Background: Searching for similar compounds in a database is the most important process for in-silico drug screening. Since a query compound is an important starting point for the new drug, a query holder, who is afraid of the query being monitored by the database server, usually downloads all the records in the database and uses them in a closed network. However, a serious dilemma arises when the database holder also wants to output no information except for the search results, and such a dilemma prevents the use of many important data resources.
    Results: In order to overcome this dilemma, we developed a novel cryptographic protocol that enables database searching while keeping both the query holder's privacy and database holder's privacy. Generally, the application of cryptographic techniques to practical problems is difficult because versatile techniques are computationally expensive while computationally inexpensive techniques can perform only trivial computation tasks. In this study, our protocol is successfully built only from an additive-homomorphic cryptosystem, which allows only addition performed on encrypted values but is computationally efficient compared with versatile techniques such as general purpose multi-party computation. In an experiment searching ChEMBL, which consists of more than 1,200,000 compounds, the proposed method was 36,900 times faster in CPU time and 12,000 times as efficient in communication size compared with general purpose multi-party computation.
    Conclusion: We proposed a novel privacy-preserving protocol for searching chemical compound databases. The proposed method, easily scaling for large-scale databases, may help to accelerate drug discovery research by making full use of unused but valuable data that includes sensitive information.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 9
  • A semi-supervised learning approach for RNA secondary structure prediction

    Haruka Yonemoto, Kiyoshi Asai, Michiaki Hamada

    COMPUTATIONAL BIOLOGY AND CHEMISTRY   57   72 - 79  August 2015  [Refereed]

    RNA secondary structure prediction is a key technology in RNA bioinformatics. Most algorithms for RNA secondary structure prediction use probabilistic models, in which the model parameters are trained with reliable RNA secondary structures. Because of the difficulty of determining RNA secondary structures by experimental procedures, such as NMR or X-ray crystal structural analyses, there are still many RNA sequences that could be useful for training whose secondary structures have not been experimentally determined. In this paper, we introduce a novel semi-supervised learning approach for training parameters in a probabilistic model of RNA secondary structures in which we employ not only RNA sequences with annotated secondary structures but also ones with unknown secondary structures. Our model is based on a hybrid of generative (stochastic context-free grammars) and discriminative models (conditional random fields) that has been successfully applied to natural language processing. Computational experiments indicate that the accuracy of secondary structure prediction is improved by incorporating RNA sequences with unknown secondary structures into training. To our knowledge, this is the first study of a semi-supervised learning approach for RNA secondary structure prediction. This technique will be useful when the number of reliable structures is limited. (C) 2015 Elsevier Ltd. All rights reserved.

    DOI

    Scopus

    11
    被引用数
    (Scopus)
  • Learning chromatin states with factorized information criteria

    Michiaki Hamada, Yukiteru Ono, Ryohei Fujimaki, Kiyoshi Asai

    BIOINFORMATICS   31 ( 15 ) 2426 - 2433  2015年08月  [査読有り]

     概要を見る

    Motivation: Recent studies have suggested that both the genome and the genome with epigenetic modifications, the so-called epigenome, play important roles in various biological functions, such as transcription and DNA replication, repair, and recombination. It is well known that specific combinations of histone modifications (e.g. methylations and acetylations) of nucleosomes induce chromatin states that correspond to specific functions of chromatin. Although the advent of next-generation sequencing (NGS) technologies enables measurement of epigenetic information for entire genomes at high-resolution, the variety of chromatin states has not been completely characterized.
    Results: In this study, we propose a method to estimate the chromatin states indicated by genome-wide chromatin marks identified by NGS technologies. The proposed method automatically estimates the number of chromatin states and characterize each state on the basis of a hidden Markov model (HMM) in combination with a recently proposed model selection technique, factorized information criteria. The method is expected to provide an unbiased model because it relies on only two adjustable parameters and avoids heuristic procedures as much as possible. Computational experiments with simulated datasets show that our method automatically learns an appropriate model, even in cases where methods that rely on Bayesian information criteria fail to learn the model structures. In addition, we comprehensively compare our method to ChromHMM on three real datasets and show that our method estimates more chromatin states than ChromHMM for those datasets.

    DOI

    Scopus

    9
    被引用数
    (Scopus)
  • A semi-supervised learning approach for RNA secondary structure prediction

    Haruka Yonemoto, Kiyoshi Asai, Michiaki Hamada

    COMPUTATIONAL BIOLOGY AND CHEMISTRY   57   72 - 79  2015-08  [Peer-reviewed]

    RNA secondary structure prediction is a key technology in RNA bioinformatics. Most algorithms for RNA secondary structure prediction use probabilistic models, in which the model parameters are trained with reliable RNA secondary structures. Because of the difficulty of determining RNA secondary structures by experimental procedures, such as NMR or X-ray crystal structural analyses, there are still many RNA sequences that could be useful for training whose secondary structures have not been experimentally determined. In this paper, we introduce a novel semi-supervised learning approach for training parameters in a probabilistic model of RNA secondary structures in which we employ not only RNA sequences with annotated secondary structures but also ones with unknown secondary structures. Our model is based on a hybrid of generative (stochastic context-free grammars) and discriminative models (conditional random fields) that has been successfully applied to natural language processing. Computational experiments indicate that the accuracy of secondary structure prediction is improved by incorporating RNA sequences with unknown secondary structures into training. To our knowledge, this is the first study of a semi-supervised learning approach for RNA secondary structure prediction. This technique will be useful when the number of reliable structures is limited. (C) 2015 Elsevier Ltd. All rights reserved.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 11
  • Learning chromatin states with factorized information criteria

    Michiaki Hamada, Yukiteru Ono, Ryohei Fujimaki, Kiyoshi Asai

    BIOINFORMATICS   31 ( 15 ) 2426 - 2433  2015-08  [Peer-reviewed]

    Motivation: Recent studies have suggested that both the genome and the genome with epigenetic modifications, the so-called epigenome, play important roles in various biological functions, such as transcription and DNA replication, repair, and recombination. It is well known that specific combinations of histone modifications (e.g. methylations and acetylations) of nucleosomes induce chromatin states that correspond to specific functions of chromatin. Although the advent of next-generation sequencing (NGS) technologies enables measurement of epigenetic information for entire genomes at high resolution, the variety of chromatin states has not been completely characterized.
    Results: In this study, we propose a method to estimate the chromatin states indicated by genome-wide chromatin marks identified by NGS technologies. The proposed method automatically estimates the number of chromatin states and characterizes each state on the basis of a hidden Markov model (HMM) in combination with a recently proposed model selection technique, factorized information criteria. The method is expected to provide an unbiased model because it relies on only two adjustable parameters and avoids heuristic procedures as much as possible. Computational experiments with simulated datasets show that our method automatically learns an appropriate model, even in cases where methods that rely on Bayesian information criteria fail to learn the model structures. In addition, we comprehensively compare our method to ChromHMM on three real datasets and show that our method estimates more chromatin states than ChromHMM for those datasets.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 9
  • RNA secondary structure prediction from multi-aligned sequences

    Michiaki Hamada

    RNA Bioinformatics   1269   17 - 38  2015-01  [Peer-reviewed]

    It has been well accepted that the RNA secondary structures of most functional non-coding RNAs (ncRNAs) are closely related to their functions and are conserved during evolution. Hence, prediction of conserved secondary structures from evolutionarily related sequences is one important task in RNA bioinformatics; the methods are useful not only to further functional analyses of ncRNAs but also to improve the accuracy of secondary structure predictions and to find novel functional RNAs from the genome. In this review, I focus on common secondary structure prediction from a given aligned RNA sequence, in which one secondary structure whose length is equal to that of the input alignment is predicted. I systematically review and classify existing tools and algorithms for the problem, by utilizing the information employed in the tools and by adopting a unified viewpoint based on maximum expected gain (MEG) estimators. I believe that this classification will allow a deeper understanding of each tool and provide users with useful information for selecting tools for common secondary structure predictions.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 2
  • Efficient calculation of exact probability distributions of integer features on RNA secondary structures

    Ryota Mori, Michiaki Hamada, Kiyoshi Asai

    BMC GENOMICS   15   S6  2014-12  [Peer-reviewed]

    Background: Although the need for analyses of secondary structures of RNAs is increasing, predictions of the secondary structures of RNAs are not always reliable. Because an RNA may have a complicated energy landscape, comprehensive representations of the whole ensemble of the secondary structures, such as the probability distributions of various features of RNA secondary structures, are required.
    Results: A general method to efficiently compute the distribution of any integer scalar/vector function on the secondary structure is proposed. We also show two concrete algorithms, for Hamming distance from a reference structure and for 5' - 3' distance, which can be constructed by following our general method. These practical applications of this method show the effectiveness of the proposed method.
    Conclusions: The proposed method provides a clear and comprehensive procedure to construct algorithms for distributions of various integer features. In addition, distributions of integer vectors, that is, combinations of different integer scores, can also be described by applying our 2D expanding technique.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 10
  • Reference-free prediction of rearrangement breakpoint reads

    Edward Wijaya, Kana Shimizu, Kiyoshi Asai, Michiaki Hamada

    BIOINFORMATICS   30 ( 18 ) 2559 - 2567  2014-09  [Peer-reviewed]

    Motivation: Chromosome rearrangement events are triggered by atypical breaking and rejoining of DNA molecules, which are observed in many cancer-related diseases. The detection of rearrangement is typically done by using short reads generated by next-generation sequencing (NGS) and combining the reads with knowledge of a reference genome. Because structural variations and genomes differ from one person to another, intermediate comparison via a reference genome may lead to loss of information.
    Results: In this article, we propose a reference-free method for detecting clusters of breakpoints from the chromosomal rearrangements. This is done by directly comparing a set of NGS normal reads with another set that may be rearranged. Our method SlideSort-BPR (breakpoint reads) is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing depth of 100x, it finds ~88% of the breakpoints correctly with no false-positive reads. Moreover, evaluation on a real prostate cancer dataset shows that the proposed method predicts more fusion transcripts correctly than previous approaches, and yet produces fewer false-positive reads. To our knowledge, this is the first method to detect breakpoint reads without using a reference genome.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 5
  • Fighting against uncertainty: an essential issue in bioinformatics

    Michiaki Hamada

    BRIEFINGS IN BIOINFORMATICS   15 ( 5 ) 748 - 767  2014-09  [Peer-reviewed]

    Many bioinformatics problems, such as sequence alignment, gene prediction, phylogenetic tree estimation and RNA secondary structure prediction, are often affected by the 'uncertainty' of a solution, that is, the probability of the solution is extremely small. This situation arises for estimation problems on high-dimensional discrete spaces in which the number of possible discrete solutions is immense. In the analysis of biological data or the development of prediction algorithms, this uncertainty should be handled carefully and appropriately. In this review, I will explain several methods to combat this uncertainty, presenting a number of examples in bioinformatics. The methods include (i) avoiding point estimation, (ii) maximum expected accuracy (MEA) estimations and (iii) several strategies to design a pipeline involving several prediction methods. I believe that the basic concepts and ideas described in this review will be generally useful for estimation problems in various areas of bioinformatics.

    DOI

    Citations (Scopus): 10
  • RNA structural alignments, part II: non-Sankoff approaches for structural alignments.

    Asai K, Hamada M

    Methods in molecular biology (Clifton, N.J.)   1097   291 - 301  2014  [Peer-reviewed]

    DOI PubMed J-GLOBAL

  • Analysis of base-pairing probabilities of RNA molecules involved in protein-RNA interactions

    Junichi Iwakiri, Tomoshi Kameda, Kiyoshi Asai, Michiaki Hamada

    BIOINFORMATICS   29 ( 20 ) 2524 - 2528  2013-10  [Peer-reviewed]

    Motivation: Understanding the details of protein-RNA interactions is important to reveal the functions of both the RNAs and the proteins. In these interactions, the secondary structures of the RNAs play an important role. Because RNA secondary structures in protein-RNA complexes are variable, considering the ensemble of RNA secondary structures is a useful approach. In particular, recent studies have supported the idea that, in the analysis of RNA secondary structures, the base-pairing probabilities (BPPs) of RNAs (i.e. the probabilities of forming a base pair in the ensemble of RNA secondary structures) provide richer and more robust information about the structures than a single RNA secondary structure, for example, the minimum free energy structure or a snapshot of structures in the Protein Data Bank. However, there has been no investigation of the BPPs in protein-RNA interactions.
    Results: In this study, we analyzed BPPs of RNA molecules involved in known protein-RNA complexes in the Protein Data Bank. Our analysis suggests that, in the tertiary structures, the BPPs (which are computed using only sequence information) for unpaired nucleotides with intermolecular hydrogen bonds (hbonds) to amino acids were significantly lower than those for unpaired nucleotides without hbonds. On the other hand, no difference was found between the BPPs for paired nucleotides with and without intermolecular hbonds. Those findings were commonly supported by three probabilistic models, which provide the ensemble of RNA secondary structures, including the McCaskill model based on Turner's free energy of secondary structures.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 8
  • Fighting against uncertainty: An essential issue in bioinformatics

    Michiaki Hamada

    Briefings in Bioinformatics   15 ( 5 ) 748 - 767  2013-05  [Peer-reviewed]

    Many bioinformatics problems, such as sequence alignment, gene prediction, phylogenetic tree estimation and RNA secondary structure prediction, are often affected by the 'uncertainty' of a solution, that is, the probability of the solution is extremely small. This situation arises for estimation problems on high-dimensional discrete spaces in which the number of possible discrete solutions is immense. In the analysis of biological data or the development of prediction algorithms, this uncertainty should be handled carefully and appropriately. In this review, I will explain several methods to combat this uncertainty, presenting a number of examples in bioinformatics. The methods include (i) avoiding point estimation, (ii) maximum expected accuracy (MEA) estimations and (iii) several strategies to design a pipeline involving several prediction methods. I believe that the basic concepts and ideas described in this review will be generally useful for estimation problems in various areas of bioinformatics.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 10
  • CentroidAlign-Web: A fast and accurate multiple aligner for long non-coding RNAs

    Haruka Yonemoto, Kiyoshi Asai, Michiaki Hamada

    International Journal of Molecular Sciences   14 ( 3 ) 6144 - 6156  2013-03  [Peer-reviewed]

    Due to the recent discovery of non-coding RNAs (ncRNAs), multiple sequence alignment (MSA) of those long RNA sequences is becoming increasingly important for classifying and determining the functional motifs in RNAs. However, not only primary (nucleotide) sequences, but also secondary structures of ncRNAs are closely related to their function and are conserved evolutionarily. Hence, information about secondary structures should be considered in the sequence alignment of ncRNAs. Yet, in general, a huge computational time is required in order to compute MSAs, taking secondary structure information into account. In this paper, we describe a fast and accurate web server, called CentroidAlign-Web, which can handle long RNA sequences. The web server also appropriately incorporates information about known secondary structures into MSAs. Computational experiments indicate that our web server is fast and accurate enough to handle long RNA sequences. CentroidAlign-Web is freely available from http://centroidalign.ncrna.org/. © 2013 by the authors; licensee MDPI, Basel, Switzerland.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 4
  • Generalized Centroid Estimators in Bioinformatics

    Michiaki Hamada, Hisanori Kiryu, Wataru Iwasaki, Kiyoshi Asai

    CoRR   abs/1305.4339  2013  [Peer-reviewed]

  • PBSIM: PacBio reads simulator-toward accurate genome assembly

    Yukiteru Ono, Kiyoshi Asai, Michiaki Hamada

    BIOINFORMATICS   29 ( 1 ) 119 - 121  2013-01  [Peer-reviewed]

    Motivation: PacBio sequencers produce two types of characteristic reads (continuous long reads: long and high error rate and circular consensus sequencing: short and low error rate), both of which could be useful for de novo assembly of genomes. Currently, there is no available simulator that targets the specific generation of PacBio libraries.
    Results: Our analysis of 13 PacBio datasets showed characteristic features of PacBio reads (e.g. the read length of PacBio reads follows a log-normal distribution). We have developed a read simulator, PBSIM, that captures these features using either a model-based or sampling-based method. Using PBSIM, we conducted several hybrid error correction and assembly tests for PacBio reads, suggesting that a continuous long reads coverage depth of at least 15 in combination with a circular consensus sequencing coverage depth of at least 30 achieved extensive assembly results.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 207
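  • 1P125 Prediction of the three-dimensional structures of protein-RNA complexes (05A. Nucleic acids: Structure and properties, Poster, The 51st Annual Meeting of the Biophysical Society of Japan (FY2013))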

    Kameda Tomoshi, Iwakiri Junichi, Hamada Michiaki, Asai Kiyoshi

    Seibutsu Butsuri (生物物理)   53 ( 1 ) S126  2013

    DOI CiNii

  • Direct updating of an RNA base-pairing probability matrix with marginal probability constraints

    Michiaki Hamada

    Journal of Computational Biology   19 ( 12 ) 1265 - 1276  2012-12  [Peer-reviewed]

    A base-pairing probability matrix (BPPM) stores the probabilities for every possible base pair in an RNA sequence and has been used in many algorithms in RNA informatics (e.g., RNA secondary structure prediction and motif search). In this study, we propose a novel algorithm to perform iterative updates of a given BPPM, satisfying marginal probability constraints that are (approximately) given by recently developed biochemical experiments, such as SHAPE, PAR, and FragSeq. The method is easily implemented and is applicable to common models for RNA secondary structures, such as energy-based or machine-learning-based models. In this article, we focus mainly on the details of the algorithms, although preliminary computational experiments will also be presented. © 2012 Mary Ann Liebert, Inc.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 8
  • Shape-based alignment of genomic landscapes in multi-scale resolution

    Hiroki Ashida, Kiyoshi Asai, Michiaki Hamada

    NUCLEIC ACIDS RESEARCH   40 ( 14 ) 6435 - 6448  2012-08  [Peer-reviewed]

    Due to dramatic advances in DNA technology, quantitative measures of annotation data can now be obtained in continuous coordinates across the entire genome, allowing various heterogeneous 'genomic landscapes' to emerge. Although much effort has been devoted to comparing DNA sequences, not much attention has been given to comparing these large quantities of data comprehensively. In this article, we introduce a method for rapidly detecting local regions that show high correlations between genomic landscapes. We overcame the size problem for genome-wide data by converting the data into series of symbols and then carrying out sequence alignment. We also decomposed the oscillation of the landscape data into different frequency bands before analysis, since the real genomic landscape is a mixture of embedded and confounded biological processes working at different scales in the cell nucleus. To verify the usefulness and generality of our method, we applied our approach to well investigated landscapes from the human genome, including several histone modifications. Furthermore, by applying our method to over 20 genomic landscapes in human and 12 in mouse, we found that DNA replication timing and the density of Alu insertions are highly correlated genome-wide in both species, even though the Alu elements have amplified independently in the two genomes. To our knowledge, this is the first method to align genomic landscapes at multiple scales according to their shape.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 5
  • A Classification of Bioinformatics Algorithms from the Viewpoint of Maximizing Expected Accuracy (MEA)

    Michiaki Hamada, Kiyoshi Asai

    JOURNAL OF COMPUTATIONAL BIOLOGY   19 ( 5 ) 532 - 549  2012-05  [Peer-reviewed]

    Many estimation problems in bioinformatics are formulated as point estimation problems in a high-dimensional discrete space. In general, it is difficult to design reliable estimators for this type of problem, because the number of possible solutions is immense, which leads to an extremely low probability for every solution, even for the one with the highest probability. Therefore, maximum score and maximum likelihood estimators do not work well in this situation, although they are widely employed in a number of applications. Maximizing expected accuracy (MEA) estimation, in which accuracy measures of the target problem and the entire distribution of solutions are considered, is a more successful approach. In this review, we provide an extensive discussion of algorithms and software based on MEA. We describe how a number of algorithms used in previous studies can be classified from the viewpoint of MEA. We believe that this review will be useful not only for users wishing to utilize software to solve the estimation problems appearing in this article, but also for developers wishing to design algorithms on the basis of MEA.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 14
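  • Privacy protection in search behavior (The 26th Annual Conference of the Japanese Society for Artificial Intelligence: Culture, Science and Technology, and the Future) -- (Organized Session "OS-20 Privacy-Preserving Data Mining")

    Hiromi Arai, Kana Shimizu, Michiaki Hamada

    Proceedings of the Annual Conference of the Japanese Society for Artificial Intelligence (人工知能学会全国大会論文集)   26   1 - 4  2012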

    CiNii

  • Probabilistic alignments with quality scores: an application to short-read mapping toward accurate SNP/indel detection

    Michiaki Hamada, Edward Wijaya, Martin C. Frith, Kiyoshi Asai

    BIOINFORMATICS   27 ( 22 ) 3085 - 3092  2011-11  [Peer-reviewed]

    Motivation: Recent studies have revealed the importance of considering quality scores of reads generated by next-generation sequence (NGS) platforms in various downstream analyses. It is also known that probabilistic alignments based on marginal probabilities (e.g. aligned-column and/or gap probabilities) provide more accurate alignment than conventional maximum score-based alignment. There exists, however, no study about probabilistic alignment that considers quality scores explicitly, although the method is expected to be useful in SNP/indel callers and bisulfite mapping, because accurate estimation of aligned columns or gaps is important in those analyses.
    Results: In this study, we propose methods of probabilistic alignment that consider quality scores of (one of) the sequences as well as a usual score matrix. The method is based on posterior decoding techniques in which various marginal probabilities are computed from a probabilistic model of alignments with quality scores, and can arbitrarily trade-off sensitivity and positive predictive value (PPV) of prediction (aligned columns and gaps). The method is directly applicable to read mapping (alignment) toward accurate detection of SNPs and indels. Several computational experiments indicated that probabilistic alignments can estimate aligned columns and gaps accurately, compared with other mapping algorithms e.g. SHRiMP2, Stampy, BWA and Novoalign. The study also suggested that our approach yields favorable precision for SNP/indel calling.

    DOI PubMed J-GLOBAL

    Scopus

    13
    被引用数
    (Scopus)
  • CentroidHomfold-LAST: accurate prediction of RNA secondary structure using automatically collected homologous sequences

    Michiaki Hamada, Koichiro Yamada, Kengo Sato, Martin C. Frith, Kiyoshi Asai

    NUCLEIC ACIDS RESEARCH   39 ( Web Server issue ) W100 - W106  July 2011  [Peer-reviewed]

    Although secondary structure predictions of an individual RNA sequence have been widely used in a number of sequence analyses of RNAs, accuracy is still limited. Recently, we proposed a method (called 'CentroidHomfold'), which includes information about homologous sequences into the prediction of the secondary structure of the target sequence, and showed that it substantially improved the performance of secondary structure predictions. CentroidHomfold, however, forces users to prepare homologous sequences of the target sequence. We have developed a Web application (CentroidHomfold-LAST) that predicts the secondary structure of the target sequence using automatically collected homologous sequences. LAST, which is a fast and sensitive local aligner, and CentroidHomfold are employed in the Web application. Computational experiments with a commonly-used data set indicated that CentroidHomfold-LAST substantially outperformed conventional secondary structure predictions including CentroidFold and RNAfold.

    DOI

    Citations (Scopus): 21
  • IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming

    Kengo Sato, Yuki Kato, Michiaki Hamada, Tatsuya Akutsu, Kiyoshi Asai

    BIOINFORMATICS   27 ( 13 ) I85 - I93  July 2011  [Peer-reviewed]

    Motivation: Pseudoknots found in secondary structures of a number of functional RNAs play various roles in biological processes. Recent methods for predicting RNA secondary structures cover certain classes of pseudoknotted structures, but only a few of them achieve satisfying predictions in terms of both speed and accuracy.
    Results: We propose IPknot, a novel computational method for predicting RNA secondary structures with pseudoknots based on maximizing expected accuracy of a predicted structure. IPknot decomposes a pseudoknotted structure into a set of pseudoknot-free substructures and approximates a base-pairing probability distribution that considers pseudoknots, leading to the capability of modeling a wide class of pseudoknots and running quite fast. In addition, we propose a heuristic algorithm for refining base-pairing probabilities to improve the prediction accuracy of IPknot. The problem of maximizing expected accuracy is solved by using integer programming with threshold cut. We also extend IPknot so that it can predict the consensus secondary structure with pseudoknots when a multiple sequence alignment is given. IPknot is validated through extensive experiments on various datasets, showing that IPknot achieves better prediction accuracy and faster running time as compared with several competitive prediction methods.

    DOI

    Citations (Scopus): 195
  • Antagonistic RNA aptamer specific to a heterodimeric form of human interleukin-17A/F

    Hironori Adachi, Akira Ishiguro, Michiaki Hamada, Eri Sakota, Kiyoshi Asai, Yoshikazu Nakamura

    BIOCHIMIE   93 ( 7 ) 1081 - 1088  July 2011  [Peer-reviewed]

    Interleukin-17 (IL-17) is a pro-inflammatory cytokine produced primarily by a subset of CD4(+) cells, called Th17 cells, that is involved in host defense, inflammation and autoimmune disorders. The two most structurally related IL-17 family members, IL-17A and IL-17F, form homodimeric (IL-17A/A, IL-17F/F) and heterodimeric (IL-17A/F) complexes. Although the biological significance of IL-17A and IL-17F has been investigated using respective antibodies or gene knockout mice, the functional study of the IL-17A/F heterodimeric form has been hampered by the lack of an inhibitory tool specific to IL-17A/F. In this study, we aimed to develop an RNA aptamer that specifically inhibits IL-17A/F. Aptamers are short single-stranded nucleic acid sequences that are selected in vitro based on their high affinity to a target molecule. One selected aptamer against human IL-17A/F, AptAF42, was isolated by repeated cycles of selection and counterselection against heterodimeric and homodimeric complexes, respectively. Thus, AptAF42 bound IL-17A/F but not IL-17A/A or IL-17F/F. The optimized derivative, AptAF42dope1, blocked the binding of IL-17A/F, but not of IL-17A/A or IL-17F/F, to the IL-17 receptor in the surface plasmon resonance assay in vitro. Consistently, AptAF42dope1 blocked cytokine GRO-alpha production induced by IL-17A/F, but not by IL-17A/A or IL-17F/F, in human cells. An RNA footprinting assay using ribonucleases against AptAF42dope1 in the presence or absence of IL-17A/F revealed that part of the predicted secondary structure fluctuates between alternate forms and that AptAF42dope1 is globally protected from ribonuclease cleavage by IL-17A/F. These results suggest that the selected aptamer recognizes a global conformation specified by the heterodimeric surface of IL-17A/F. © 2011 Elsevier Masson SAS. All rights reserved.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 19
  • Generalized Centroid Estimators in Bioinformatics

    Michiaki Hamada, Hisanori Kiryu, Wataru Iwasaki, Kiyoshi Asai

    PLOS ONE   6 ( 2 ) e16450  February 2011  [Peer-reviewed]

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit commonly used accuracy measures (e.g. sensitivity, PPV, MCC and F-score), can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. The concept presented in this paper not only gives a useful framework for designing MEA-based estimators but is also highly extendable and sheds new light on many problems in bioinformatics.

    DOI PubMed

    Citations (Scopus): 14
  • Improving the accuracy of predicting secondary structure for aligned RNA sequences

    Michiaki Hamada, Kengo Sato, Kiyoshi Asai

    NUCLEIC ACIDS RESEARCH   39 ( 2 ) 393 - 402  January 2011  [Peer-reviewed]

    Considerable attention has been focused on predicting the secondary structure for aligned RNA sequences since it is useful not only for improving the limiting accuracy of conventional secondary structure prediction but also for finding non-coding RNAs in genomic sequences. Although there exist many algorithms of predicting secondary structure for aligned RNA sequences, further improvement of the accuracy is still awaited. In this article, toward improving the accuracy, a theoretical classification of state-of-the-art algorithms of predicting secondary structure for aligned RNA sequences is presented. The classification is based on the viewpoint of maximum expected accuracy (MEA), which has been successfully applied in various problems in bioinformatics. The classification reveals several disadvantages of the current algorithms but we propose an improvement of a previously introduced algorithm (CentroidAlifold). Finally, computational experiments strongly support the theoretical classification and indicate that the improved CentroidAlifold substantially outperforms other algorithms.

    DOI

    Citations (Scopus): 49
  • Prediction of RNA secondary structure by maximizing pseudo-expected accuracy

    Michiaki Hamada, Kengo Sato, Kiyoshi Asai

    BMC BIOINFORMATICS   11   586  November 2010  [Peer-reviewed]

    Background: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure predictions; therefore, a new type of estimator is proposed including the maximum expected accuracy (MEA) estimator. The MEA-based estimators have been designed to maximize the expected accuracy of the base-pairs and have achieved the highest level of accuracy. Those methods, however, do not give the single best prediction of the structure, but employ parameters to control the trade-off between the sensitivity and the positive predictive value (PPV). It is unclear what parameter value we should use, and even the well-trained default parameter value does not, in general, give the best result in popular accuracy measures for each RNA sequence.
    Results: Instead of using the expected values of the popular accuracy measures for RNA secondary structure prediction, which are difficult to calculate, the pseudo-expected accuracy, which can easily be computed from base-pairing probabilities, is introduced. It is shown that the pseudo-expected accuracy is a good approximation in terms of sensitivity, PPV, MCC, or F-score. The pseudo-expected accuracy can be approximately maximized for each RNA sequence by stochastic sampling. It is also shown that well-balanced secondary structures between sensitivity and PPV can be predicted with a small computational overhead by combining the pseudo-expected accuracy of MCC or F-score with the gamma-centroid estimator.
    Conclusions: This study gives not only a method for predicting the secondary structure that balances between sensitivity and PPV, but also a general method for approximately maximizing the (pseudo-) expected accuracy with respect to various evaluation measures including MCC and F-score.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 22
  • RactIP: Fast and accurate prediction of RNA-RNA interaction using integer programming

    Yuki Kato, Kengo Sato, Michiaki Hamada, Yoshihide Watanabe, Kiyoshi Asai, Tatsuya Akutsu

    Bioinformatics   26 ( 18 ) i460 - i466  September 2010  [Peer-reviewed]

    Motivation: Considerable attention has been focused on predicting RNA-RNA interaction since it is a key to identifying possible targets of non-coding small RNAs that regulate gene expression post-transcriptionally. A number of computational studies have so far been devoted to predicting joint secondary structures or binding sites under a specific class of interactions. In general, there is a trade-off between range of interaction type and efficiency of a prediction algorithm, and thus efficient computational methods for predicting comprehensive type of interaction are still awaited.
    Results: We present RactIP, a fast and accurate prediction method for RNA-RNA interaction of general type using integer programming. RactIP can integrate approximate information on an ensemble of equilibrium joint structures into the objective function of integer programming using posterior internal and external base-pairing probabilities. Experimental results on real interaction data show that prediction accuracy of RactIP is at least comparable to that of several state-of-the-art methods for RNA-RNA interaction prediction. Moreover, we demonstrate that RactIP can run incomparably faster than competitive methods for predicting joint secondary structures. © The Author(s) 2010. Published by Oxford University Press.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 3
  • A non-parametric Bayesian approach for predicting RNA secondary structures

    Kengo Sato, Michiaki Hamada, Toutai Mituyama, Kiyoshi Asai, Yasubumi Sakakibara

    Journal of Bioinformatics and Computational Biology   8 ( 4 ) 727 - 742  August 2010  [Peer-reviewed]

    Since many functional RNAs form stable secondary structures which are related to their functions, RNA secondary structure prediction is a crucial problem in bioinformatics. We propose a novel model for generating RNA secondary structures based on a non-parametric Bayesian approach, called hierarchical Dirichlet processes for stochastic context-free grammars (HDP-SCFGs). Here non-parametric means that some meta-parameters, such as the number of non-terminal symbols and production rules, do not have to be fixed. Instead their distributions are inferred in order to be adapted (in the Bayesian sense) to the training sequences provided. The results of our RNA secondary structure predictions show that HDP-SCFGs are more accurate than the MFE-based and other generative models. © 2010 Imperial College Press.

    DOI

    Citations (Scopus): 10
  • Parameters for accurate genome alignment

    Martin C. Frith, Michiaki Hamada, Paul Horton

    BMC Bioinformatics   11   80  February 2010  [Peer-reviewed]

    Background: Genome sequence alignments form the basis of much research. Genome alignment depends on various mundane but critical choices, such as how to mask repeats and which score parameters to use. Surprisingly, there has been no large-scale assessment of these choices using real genomic data. Moreover, rigorous procedures to control the rate of spurious alignment have not been employed.
    Results: We have assessed 495 combinations of score parameters for alignment of animal, plant, and fungal genomes. As our gold-standard of accuracy, we used genome alignments implied by multiple alignments of proteins and of structural RNAs. We found the HOXD scoring schemes underlying alignments in the UCSC genome database to be far from optimal, and suggest better parameters. Higher values of the X-drop parameter are not always better. E-values accurately indicate the rate of spurious alignment, but only if tandem repeats are masked in a non-standard way. Finally, we show that γ-centroid (probabilistic) alignment can find highly reliable subsets of aligned bases.
    Conclusions: These results enable more accurate genome alignment, with reliability measures for local alignments and for individual aligned bases. This study was made possible by our new software, LAST, which can align vertebrate genomes in a few hours http://last.cbrc.jp/. © 2010 Frith et al; licensee BioMed Central Ltd.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 149
  • CentroidAlign: fast and accurate aligner for structured RNAs by maximizing expected sum-of-pairs score

    Michiaki Hamada, Kengo Sato, Hisanori Kiryu, Toutai Mituyama, Kiyoshi Asai

    BIOINFORMATICS   25 ( 24 ) 3236 - 3243  December 2009  [Peer-reviewed]

    Motivation: The importance of accurate and fast predictions of multiple alignments for RNA sequences has increased due to recent findings about functional non-coding RNAs. Recent studies suggest that maximizing the expected accuracy of predictions will be useful for many problems in bioinformatics.
    Results: We designed a novel estimator for multiple alignments of structured RNAs, based on maximizing the expected accuracy of predictions. First, we define the maximum expected accuracy (MEA) estimator for pairwise alignment of RNA sequences. This maximizes the expected sum-of-pairs score (SPS) of a predicted alignment under a probability distribution of alignments given by marginalizing the Sankoff model. Then, by approximating the MEA estimator, we obtain an estimator whose time complexity is O(L^3 + c^2 dL^2) where L is the length of input sequences and both c and d are constants independent of L. The proposed estimator can handle uncertainty of secondary structures and alignments that are obstacles in bioinformatics because it considers all the secondary structures and all the pairwise alignments as input sequences. Moreover, we integrate the probabilistic consistency transformation (PCT) on alignments into the proposed estimator. Computational experiments using six benchmark datasets indicate that the proposed method achieved a favorable SPS and was the fastest of many state-of-the-art tools for multiple alignments of structured RNAs.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 39
  • CENTROIDFOLD: a web server for RNA secondary structure prediction

    Kengo Sato, Michiaki Hamada, Kiyoshi Asai, Toutai Mituyama

    NUCLEIC ACIDS RESEARCH   37 ( Web Server issue ) W277 - W280  July 2009  [Peer-reviewed]

    The CENTROIDFOLD web server (http://www.ncrna.org/centroidfold/) is a web application for RNA secondary structure prediction powered by one of the most accurate prediction engines. The server accepts two kinds of sequence data: a single RNA sequence and a multiple alignment of RNA sequences. It responds with a prediction result shown as a popular base-pair notation and a graph representation. A PDF version of the graph representation is also available. For a multiple alignment sequence, the server predicts a common secondary structure. Usage of the server is quite simple. You can paste a single RNA sequence (FASTA or plain sequence text) or a multiple alignment (CLUSTAL-W format) into the textarea, then click on the 'execute CentroidFold' button. The server quickly responds with a prediction result. The major advantage of this server is that it employs our original CENTROIDFOLD software as its prediction engine, which scores the best accuracy in our benchmark results. Our web server is freely available with no login requirement.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 236
  • Predictions of RNA secondary structure by combining homologous sequence information.

    Hamada M, Sato K, Kiryu H, Mituyama T, Asai K

    Bioinformatics (Oxford, England)   25 ( 12 ) 330 - 338  June 2009  [Peer-reviewed]  [International journal]

    DOI PubMed J-GLOBAL

    Citations (Scopus): 40
  • Prediction of RNA secondary structure using generalized centroid estimators

    Michiaki Hamada, Hisanori Kiryu, Kengo Sato, Toutai Mituyama, Kiyoshi Asai

    BIOINFORMATICS   25 ( 4 ) 465 - 473  February 2009  [Peer-reviewed]

    Motivation: Recent studies have shown that methods for predicting secondary structures of RNAs on the basis of posterior decoding of the base-pairing probabilities have an advantage with respect to prediction accuracy over the conventionally utilized minimum free energy methods. However, there is room for improvement in the objective functions presented in previous studies, which are maximized in the posterior decoding with respect to the accuracy measures for secondary structures.
    Results: We propose novel estimators which improve the accuracy of secondary structure prediction of RNAs. The proposed estimators maximize an objective function which is the weighted sum of the expected number of the true positives and that of the true negatives of the base pairs. The proposed estimators are also improved versions of the ones used in previous works, namely CONTRAfold for secondary structure prediction from a single RNA sequence and McCaskill-MEA for common secondary structure prediction from multiple alignments of RNA sequences. We clarify the relations between the proposed estimators and the estimators presented in previous works, and theoretically show that the previous estimators include additional unnecessary terms in the evaluation measures with respect to the accuracy. Furthermore, computational experiments confirm the theoretical analysis by indicating improvement in the empirical accuracy. The proposed estimators represent extensions of the centroid estimators proposed in Ding et al. and Carvalho and Lawrence, and are applicable to a wide variety of problems in bioinformatics.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 183
  • A Non-parametric Bayesian Approach for Predicting RNA Secondary Structures

    Kengo Sato, Michiaki Hamada, Toutai Mituyama, Kiyoshi Asai, Yasubumi Sakakibara

    ALGORITHMS IN BIOINFORMATICS, PROCEEDINGS   5724   286 - +  2009  [Peer-reviewed]

    Since many functional RNAs form stable secondary structures which are related to their functions, RNA secondary structure prediction is a crucial problem in bioinformatics. We propose a novel model for generating RNA secondary structures based on a non-parametric Bayesian approach, called hierarchical Dirichlet processes for stochastic context-free grammars (HDP-SCFGs). Here non-parametric means that some meta-parameters, such as the number of non-terminal symbols and production rules, do not have to be fixed. Instead their distributions are inferred in order to be adapted (in the Bayesian sense) to the training sequences provided. The results of our RNA secondary structure predictions show that HDP-SCFGs are more accurate than the MFE-based and other generative models.

  • Large scale similarity search for locally stable secondary structures among RNA sequences

    Michiaki Hamada, Toutai Mituyama, Kiyoshi Asai

    IPSJ Transactions on Bioinformatics   2   36 - 46  2009  [Peer-reviewed]

    Recently, a large number of candidates of non-coding RNAs (ncRNAs) has been predicted by experimental or computational approaches. Moreover, in genomic sequences, there are still many interesting regions whose functions are unknown (e.g., indel conserved regions, human accelerated regions, ultraconserved elements and transposon free regions) and some of those regions may be ncRNAs. On the other hand, it is known that many ncRNAs have characteristic secondary structures which are strongly related to their functions. Therefore, detecting clusters which have mutually similar secondary structures is important for revealing new ncRNA families. In this paper, we describe a novel method, called RNAclique, which is able to search for clusters containing mutually similar and locally stable secondary structures among a large number of unaligned RNA sequences. Our problem is formulated as a constraint quasiclique search problem, and we use an approximate combinatorial optimization method, called GRASP, for solving the problem. Several computational experiments show that our method is useful and scalable for detecting ncRNA families from large sequences. We also present two examples of large scale sequence analysis using RNAclique. © 2009 Information Processing Society of Japan.

    DOI CiNii

    Citations (Scopus): 1
  • Software.ncrna.org: web servers for analyses of RNA sequences

    Kiyoshi Asai, Hisanori Kiryu, Michiaki Hamada, Yasuo Tabei, Kengo Sato, Hiroshi Matsui, Yasubumi Sakakibara, Goro Terai, Toutai Mituyama

    NUCLEIC ACIDS RESEARCH   36 ( Web Server issue ) W75 - W78  July 2008  [Peer-reviewed]

    We present web servers for analysis of non-coding RNA sequences on the basis of their secondary structures. Software tools for structural multiple sequence alignments, structural pairwise sequence alignments and structural motif findings are available from the integrated web server and the individual stand-alone web servers. The servers are located at http://software.ncrna.org, along with the information for the evaluation and downloading. This website is freely available to all users and there is no login requirement.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 5
  • Mining frequent stem patterns from unaligned RNA sequences

    Michiaki Hamada, Koji Tsuda, Taku Kudo, Taishin Kin, Kiyoshi Asai

    BIOINFORMATICS   22 ( 20 ) 2480 - 2487  2006年10月  [査読有り]

    Motivation: In detection of non-coding RNAs, it is often necessary to identify the secondary structure motifs from a set of putative RNA sequences. Most of the existing algorithms aim to provide the best motif or few good motifs, but biologists often need to inspect all the possible motifs thoroughly.
    Results: Our method RNAmine employs a graph theoretic representation of RNA sequences and detects all the possible motifs exhaustively using a graph mining algorithm. The motif detection problem boils down to finding frequently appearing patterns in a set of directed and labeled graphs. In the tasks of common secondary structure prediction and local motif detection from long sequences, our method compared favorably in both accuracy and efficiency with state-of-the-art methods such as CMFinder.

    DOI PubMed J-GLOBAL

    Citations (Scopus): 38


Books and Other Publications

  • 機械学習を用いた アプタマー配列の解析と創薬、実験医学別冊 Pythonで実践 生命科学データの機械学習

    岩野夏樹, 浜田道昭, 清水秀幸 (Contribution: Chapter 12, Advanced Topics 1)

    羊土社  March 2023  ISBN: 9784758122634

  • 機械学習による遺伝子転写制御に関わる因子の探索, 月刊細胞

    大里直樹, 浜田道昭 (Contribution: November 2022 issue, pp. 62-64)

    ニューサイエンス社  November 2022

  • システムバイオロジー

    宇田 新介, 浜田 道昭

    コロナ社  November 2022  ISBN: 9784339027341

  • バイオインフォマティクスのための生命科学入門

    福永 津嵩, 岩切 淳一, 浜田 道昭

    コロナ社  August 2022  ISBN: 9784339027310

  • RNA情報科学・AI技術を融合したAIアプタマー創薬技術の開発,革新的AI創薬最前線

    浜田道昭 (Contribution: Chapter 5, Section 5)

    エヌ・ティー・エス  July 2022  ISBN: 9784860437886

  • 生物統計

    木立 尚孝, 浜田 道昭

    コロナ社  May 2022  ISBN: 9784339027334

  • 生物ネットワーク解析

    竹本 和広, 浜田 道昭

    コロナ社  November 2021  ISBN: 9784339027327

  • よくわかるバイオインフォマティクス入門 (KS生命科学専門書)

    岩部 直之, 川端 猛, 浜田 道昭, 門田 幸二, 須山 幹太, 光山 統泰, 黒川 顕, 森 宙史, 東 光一, 吉沢 明康, 片山 俊明, 藤 博幸 (Role: Co-author)

    講談社  November 2018  ISBN: 4065138213

  • 生命情報処理における機械学習 多重検定と推定量設計 (機械学習プロフェッショナルシリーズ)

    瀬々 潤, 浜田 道昭 (Role: Co-author)

    講談社  December 2015  ISBN: 4061529110

  • 生命情報処理における機械学習 : 多重検定と推定量設計 = Machine learning in bioinformatics

    瀬々 潤, 浜田 道昭

    講談社  December 2015  ISBN: 9784061529113


Presentations

  • mRNAのトータルデザインに向けた情報技術

    浜田道昭  [Invited]

    日本核酸医薬学会第8回年会 mRNAシンポジウム

    Presentation date: July 2023

  • RNAバイオインフォマティクスを用いた核酸医薬研究

    浜田道昭  [Invited]

    日本核酸医薬学会第8回年会 教育セッション(生物)

    Presentation date: July 2023

  • RNA構造予測ソフトウエアの紹介と比較

    浜田道昭, 栗崎以久男  [Invited]

    NPO法人mRNAターゲット創薬研究機構 2023年度 第1回講演会

    Presentation date: June 2023

  • バイオインフォマティクス:情報科学で生命・医学・薬学研究にブレイクスルーを

    浜田道昭  [Invited]

    千代田稲門会2023年度定時総会講演会

    Presentation date: June 2023

  • AI aptamer drug discovery, Special session invited talk

    Michiaki Hamada  [Invited]

    GIW / ISCB-Asia 2022

    Presentation date: December 2022

  • 情報科学を用いた核酸医薬・mRNA医薬研究

    浜田道昭  [Invited]

    第31回WAKO Web受託セミナー RNA合成の進展

    Presentation date: November 2022

  • AIアプタマー創薬プロジェクト

    浜田道昭  [Invited]

    2022年度CREST「バイオDX」領域キックオフシンポジウム

    Presentation date: November 2022

  • RNAバイオインフォマティクス研究の最前線

    浜田道昭  [Invited]

    千葉工業大学大学院 最先端生命科学特論 講演会

    Presentation date: September 2022

  • RNA 情報学を用いた医薬学研究

    浜田道昭  [Invited]

    特定非営利活動法人 mRNAターゲット創薬研究機構 2022年度第2回講演会

    Presentation date: August 2022

  • AIアプタマー創薬 ―人工知能技術を用いたRNAアプタマー創薬の加速―

    浜田道昭  [Invited]

    日本コンピュータ化学会20周年記念シンポジウム

    Presentation date: June 2022

  • RNA情報学を基軸とした創薬基盤研究

    浜田道昭  [Invited]

    RNA情報学を基軸とした創薬基盤研究

    Presentation date: May 2022

  • AI aptamer drug discovery project

    浜田道昭  [Invited]

    China-Japan Artificial Intelligence for Social Innovation Conference

    Presentation date: March 2022

  • 【RNA研究の最前線】RNA情報学を基軸とした生命科学・医薬学研究

    浜田道昭  [Invited]

    日本医科大学 講演会

    Presentation date: February 2022

  • RNAを基軸とした創薬研究

    浜田道昭  [Invited]

    EWE講演会

    Presentation date: January 2022

  • AIアプタマー創薬

    浜田道昭  [Invited]

    分子生物学会

    Presentation date: December 2021

  • ゲノム社会とバイオインフォマティクス

    浜田道昭  [Invited]

    日本バイオインフォマティクス学会・日本オミックス医学会 合同シンポジウム, IIBMP2021

    Presentation date: September 2021

  • AIアプタマー創薬プロジェクト

    浜田道昭  [Invited]

    日本医科大学・早稲田大学合同シンポジウム

    Presentation date: June 2021

  • RNA情報学の最前線

    浜田道昭  [Invited]

    生命情報科学勉強会@宮崎大学

    Presentation date: May 2021

  • RNAバイオインフォマティクスの最前線

    浜田道昭  [Invited]

    名古屋大学 特別講演

    Presentation date: January 2021

  • RNAを基軸とした創薬研究

    浜田道昭  [Invited]

    EWE講演会

    Presentation date: January 2021

  • 核酸医薬品開発に向けたバイオインフォマティクス技術

    浜田道昭  [Invited]

    第15回理研「バイオものづくり」シンポジウム

    Presentation date: December 2020

  • AIアプタマー創薬の実現に向けた情報技術

    浜田道昭  [Invited]

    NVIDIA GPU Technology Conference (GTC)

    Event date: October 2020

  • AIアプタマー創薬プロジェクト

    浜田道昭  [Invited]

    CREST「人工知能」領域 第3回 成果展開シンポジウム

    Presentation date: September 2020

  • 長鎖ノンコーディングRNAの機能の解明に向けたバイオインフォマティクス

    浜田道昭  [Invited]

    ゲノム創薬・創発フォーラム 第 3 回シンポジウム (主要テーマ:RNA関連の基礎研究とその創薬応用)   (東京大学医科学研究所附属病院 A棟8階 トミーホール)

    Presentation date: February 2020

  • 長鎖ノンコーディングRNAの機能の解明に向けたバイオインフォマティクス技術

    浜田道昭  [Invited]

    2019年度 RNAフロンティアミーティング   (IBM 天城ホームステッド)

    Presentation date: September 2019

  • RNAバイオインフォマティクス:技術開発と応用

    浜田道昭  [Invited]

    2019年度 第1回 核酸を標的とした低分子創薬研究会   (大阪大学 産業科学研究所)

    Presentation date: August 2019

  • Model Learning meets Biology ー生物データの背後に潜む「構造」を情報科学で明らかにするー

    浜田道昭  [Invited]

    第7回生命医薬情報学連合大会(IIBMP2018)

    Event date: September 2018

  • 長鎖ノンコーディング RNA の機能の解明に向けた バイオインフォマティクス技術

    浜田道昭  [Invited]

    EWE 三月会 11 月例会   (日比谷市政会館)

    Presentation date: November 2017

  • 生命情報科学と私

    浜田道昭  [Invited]

    第9回生命情報科学若手の会   (西浦温泉ホテルたつき)

    Presentation date: October 2017


Research Projects (Joint Research and Competitive Funding)

  • 先進ゲノム解析研究推進プラットフォーム

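    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Transformative Research Areas (Platforms for Advanced Technologies and Research Resources)

    Project period: April 2022 - March 2028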

    黒川 顕, 川嶋 実苗, 豊田 敦, 鈴木 穣, 林 哲也, 中村 保一, 森 宙史, 野口 英樹, 浅井 潔, 岩崎 渉, 森下 真一, 笠原 雅弘, 伊藤 武彦, 瀬々 潤, 中谷 明弘, 島村 徹平, 波江野 洋, 熊谷 雄太郎, 高橋 弘喜, 平川 英樹, 浜田 道昭, 山田 拓司

  • RNAを中心とした分子ネットワークに基づく生物学的相分離の俯瞰的・体系的理解

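    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Project period: April 2023 - March 2026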

    浜田 道昭

  • AIアプタマー創薬プロジェクト

    Japan Science and Technology Agency  Strategic Basic Research Programs (CREST)

    Project period: April 2021 - March 2024

    浜田 道昭

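    To dramatically shorten the drug discovery timeline for RNA aptamers, which are attracting attention as next-generation drugs that could replace small-molecule compounds, this project establishes "AI aptamer drug discovery," which integrates aptamer discovery experiments with RNA informatics and artificial intelligence technologies.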

  • リピート要素のde novo発見に基づく長鎖ノンコーディングRNAの機能の解明

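    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Project period: April 2020 - March 2023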

    浜田 道昭, 小野口 真広, 福永 津嵩

  • 発達期ダイオキシンと老年期の高次認知機能低下の関係性解明

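    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Project period: April 2019 - March 2022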

    掛山 正心, 浜田 道昭, 久保 健一郎, 皆川 栄子, 前川 文彦

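    Through animal experiments, we have reported that fetal exposure to dioxins and related compounds impairs cognitive function, as shown both by performance on cognitive tasks and by fine morphological changes in neurons. The goal of this study is to accumulate scientific evidence that developmental exposure to dioxins and related compounds contributes to the onset and exacerbation of dementia, and to demonstrate the importance of dementia as a toxicity endpoint. The aims are (1) to focus on the decline in cognitive flexibility caused in old age by dioxins and related compounds, and to clarify the nature and extent of the effects and the underlying toxic mechanisms through human surveys and animal toxicity experiments, and (2) based on those results, to establish phenotypic analysis techniques for higher cognitive function in human surveys and animal toxicity experiments. This fiscal year, in preparation for the human cohort survey and the animal toxicity experiments, we created the task application to be used in the human survey and completed the procedures for the cohort survey. We also built the infrastructure for remote assessment with tasks presented on tablet devices. For the animal experiments, we created tasks and prepared toxicity tests to quantitatively evaluate cognitive flexibility and brain activity. In addition to tasks using IntelliCage, we established tasks using a touchscreen operant apparatus. In collaboration with RIKEN, we performed phenotypic analysis of Alzheimer's disease model mice and obtained promising findings on the relationship between dementia and mental schemas (manuscript submitted). We also carried out a meta-analysis of existing data in order to model the data to be acquired in this project.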

  • ceRNAネットワーク構造の解読を基盤とした、全く新しい抗がん剤開発戦略の開発

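    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Project period: July 2018 - March 2021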

    秋光 信佳, 浜田 道昭

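    In recent years, gene expression regulatory networks based on RNA-RNA interactions and on interactions between RNAs and RNA-binding proteins have attracted attention. What is interesting is that these RNAs and RNA-binding proteins form a huge network through their interactions. For example, microRNAs, which are small non-coding RNAs, control mRNA expression levels by binding to mRNAs with complementary base sequences and either degrading them or repressing their translation, but a single microRNA targets not one but multiple mRNAs. Conversely, a single mRNA is regulated by multiple kinds of microRNAs. RNA-RNA interactions and interactions between RNAs and RNA-binding proteins are therefore many-to-many. However, much remains unknown about the structure of networks built on such many-to-many interactions and about their physiological roles. In this study, we develop technologies for analyzing RNA-RNA interactions and interactions between RNAs and RNA-binding proteins, and we elucidate the roles of this huge network in physiology and disease. So far, we have been developing technology to elucidate interactions between RNAs and RNA-binding proteins and have published a research paper (Yamada T. et al., Cell Rep). Specifically, we developed a system that examines the correlation in expression levels between RNA-binding proteins and the RNAs they target for degradation, based on next-generation sequencing data available in public databases. We verified the effectiveness of the system on several RNA-binding proteins and published the results.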

  • RNA-クロマチン相互作用予測と応用

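    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Challenging Research (Exploratory)

    Project period: June 2017 - March 2021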

    浜田 道昭, 岩切 淳一

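    Most of the mammalian genome is transcribed into coding or non-coding RNAs. It has been suggested that some of these non-coding RNAs interact with chromatin and exert epigenetic control. To elucidate the mechanisms of RNA-chromatin interaction, we built a model that predicts interactions between lncRNAs and chromatin and examined which features in the model contribute to the interaction. The features considered were R-loop formation, RNA:DNA triplexes, and scaffolds formed through RNA binding. R-loop formation was estimated by identifying sequence complementarity through alignment, taking RNA accessibility into account. For RNA:DNA triplexes, we used an existing triplex prediction tool. As the machine learning model we mainly used random forests, because they make it easy to derive the features that contributed to the classification. For data, we created positive and negative examples from large-scale experimental data on RNA-chromatin interactions and trained the model. Prediction accuracy was evaluated by cross-validation, but sufficient accuracy has not yet been achieved. We are currently examining both the features and the training data in detail, and we plan to consider other machine learning models, including deep learning.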

  • 人工知能技術を用いた革新的アプタマー創薬システムの開発

    JST  Strategic Basic Research Programs (CREST)

    Project period: October 2018 - March 2021

    浜田道昭

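    This research proposal aims to dramatically shorten the drug discovery process and improve the success rate for RNA aptamers, which are key to next-generation drugs, and thereby bring about a breakthrough in pharmaceutical development. To this end, we will research and develop an "AI aptamer drug discovery system" that uses artificial intelligence and nucleic acid informatics to automate the steps of the aptamer discovery process up to sequence truncation. We will introduce the system at the pharmaceutical company RIBOMIC to verify its generality and effectiveness, and then release it publicly.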

  • RNA-クロマチン相互作用予測と応用

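    Ministry of Education, Culture, Sports, Science and Technology  Grant-in-Aid for Challenging Research (Exploratory)

    Project period: March 2017 - April 2020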

    浜田道昭

  • 機能エレメントと深層学習に基づく長鎖ノンコーディングRNAの機能分類

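    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (A)

    Project period: April 2016 - March 2020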

    浜田 道昭

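    It has been suggested that higher organisms such as humans have many long non-coding RNAs (lncRNAs), which function as RNA without being translated into protein, but the functions of most of them remain unknown. To identify functional elements of lncRNAs, we carried out the following studies.
    - Identification and sequence analysis of ribosome-bound lncRNAs: using comprehensive experimental data, we identified lncRNAs bound by ribosomal RNA, extracted their sequence features, and examined their biological significance. Two related papers were published (BMC Genomics. 2018 Dec 31;19(Suppl 10):906, BMC Genomics. 2018 May 29;19(1):414. doi: 10.1186/s12864-018-4765-z.)
    - We released LncRRIsearch, a web server that enables comprehensive prediction of lncRNA-RNA interactions in human and mouse (http://rtools.cbrc.jp/LncRRIsearch/)
    - Comprehensive identification of RBPs that bind to repeats: our previous work showed that repeat elements are associated with the tissue-specific expression of lncRNAs. To advance the functional analysis further, we identified lncRNAs that bind to repeats. The results are currently being examined in detail, and we plan to publish them as a paper within the year.
    - Enhancement of the RNA-RNA interaction tool RIblast: we implemented a method for computing p-values, which is expected to promote use by experimental biologists (J Comput Biol. 2018 Sep;25(9):976-986)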

  • 機能エレメントと深層学習に基づく長鎖ノンコーディングRNAの機能分類

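    Ministry of Education, Culture, Sports, Science and Technology  Grant-in-Aid for Young Scientists (A)

    Project period: April 2016 - March 2020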

    浜田道昭

  • ドライバー遺伝子異常肺癌の薬剤耐性機序における長鎖ノンコーディング RNAの意義

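    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Project period: April 2016 - March 2019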

    清家 正博, 野呂 林太郎, 浜田 道昭

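    The aim is to clarify the significance of long non-coding RNAs (lncRNAs) involved in the resistance of lung cancer to molecularly targeted drugs such as EGFR inhibitors and ALK inhibitors, and in cancer stem cells and epithelial-mesenchymal transition, which underlie that resistance, with a view to clinical applications that overcome resistance and achieve a cure. Through comprehensive lncRNA expression analysis and bioinformatics analysis of 4 lung cancer cell lines sensitive to molecularly targeted drugs and 10 resistant lines, we identified CRNDE as an lncRNA commonly involved in resistance, showed that its associated protein is IRX5, and showed that suppressing IRX5 in resistant cells induces apoptosis. CRNDE and IRX5 may serve as novel therapeutic targets for overcoming resistance to molecularly targeted drugs.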

  • エピトランスクリプトーム解析のためのRNAインフォマティクス基盤技術

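    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Project period: April 2016 - March 2019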

    浅井 潔, 浜田 道昭, 亀田 倫史, 阿部 洋, 櫻庭 俊

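    This project aims to develop secondary structure analysis technology for RNAs containing modified bases and to establish foundational RNA informatics technology for epitranscriptome analysis. For modified bases, which are important but cannot be handled by conventional secondary structure prediction, energy parameters are determined by combining thermal measurement experiments with free energy calculations and are then introduced into various RNA secondary structure analysis software.
    We carried out quantum chemical calculations as the preparatory stage of simulation, determined ΔG by MD simulation, analyzed energy parameters for modified bases, and determined the energy parameters for inosine.
    As a tool for analyzing the probability distribution of RNA secondary structures in more detail, we developed RintW, which rapidly computes base-pairing probabilities within the structural ensemble at each Hamming distance from a reference secondary structure, and published a paper on it.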
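
    The notion of grouping structures by Hamming distance from a reference secondary structure can be illustrated with a minimal sketch. It assumes the distance is counted as the number of base pairs present in exactly one of the two structures and that structures are given in dot-bracket notation without pseudoknots; the function names are hypothetical and this is not the RintW implementation, which computes probabilities over the whole ensemble rather than enumerating structures.

```python
from collections import defaultdict


def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def hamming_distance(structure_a, structure_b):
    """Count base pairs that occur in exactly one of the two structures."""
    if len(structure_a) != len(structure_b):
        raise ValueError("structures must have the same length")
    return len(base_pairs(structure_a) ^ base_pairs(structure_b))


def group_by_distance(reference, candidates):
    """Bucket candidate structures by their distance from the reference."""
    buckets = defaultdict(list)
    for candidate in candidates:
        buckets[hamming_distance(reference, candidate)].append(candidate)
    return dict(buckets)


if __name__ == "__main__":
    reference = "((..))"
    candidates = ["((..))", "(....)", "......", ".(..)."]
    for distance, group in sorted(group_by_distance(reference, candidates).items()):
        print(distance, group)
```

    Representing a structure as a set of base pairs keeps the distance a simple symmetric difference.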

  • ヒストンバリアントに基づくクロマチンの機能の推定

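    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)

    Project period: April 2016 - March 2018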

    浜田 道昭

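    (1) Estimation of chromatin states for chromatin marks including histone variants.
    For histone variant data, we used Kujirai+, NAR (2016) 44, 6127-41 for human and Maehara+, Epigenetics Chromatin (2015) 17;8:35 for mouse. Using these data, we estimated chromatin states with a method developed by the principal investigator, and then investigated correlations between the estimated chromatin states and various genome annotations.
    (2) lncRRIdb: a database of lncRNA-RNA interactions integrating expression and localization information.
    To characterize chromatin function from the viewpoint of long non-coding RNAs (lncRNAs), we built a comprehensive database of RNAs that interact with lncRNAs. The database stores the results of comprehensive computational interaction prediction with RIblast, developed by the principal investigator and colleagues, together with experimental information on expression and localization.
    (3) Development of information technology for estimating hierarchical chromatin states.
    We hypothesized that promoters and enhancers also have a hierarchical structure, for example promoter ⇒ strong promoter, weak promoter, bivalent promoter. Because conventional chromatin state estimation methods cannot take such a hierarchy into account, we developed our own method. We built a prototype system and verified its effectiveness on small datasets.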

  • ヒストンバリアントに基づくクロマチンの機能の推定

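    Ministry of Education, Culture, Sports, Science and Technology  Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)

    Project period: April 2016 - March 2018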

    浜田道昭

  • RNA二次構造の大域的性質の集団遺伝解析

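    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (B)

    Project period: April 2013 - March 2017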

    木立 尚孝, 浅井 潔, 浜田 道昭, 佐藤 健吾, 加藤 有己, 岩崎 渉, 小野 幸輝, 寺井 悟朗, 尾崎 遼, 松本 拡高, 福永 津嵩, 森 遼太, 柏原 裕樹, 河口 理紗

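    RNA molecules in the cell play a very important role in the process by which information written in genes becomes proteins and exerts physiological functions. The three-dimensional structure of an RNA molecule is known to be well understood as a three-dimensional arrangement of local double-helical structures called stems (secondary structure), so elucidating the properties of secondary structure is important for understanding RNA function. In this study, we succeeded for the first time in the world in developing an algorithm that can accurately compute secondary-structure properties of very long RNA molecules such as messenger RNAs and long non-coding RNAs. We also developed a method for computing the secondary-structure features of the regions surrounding the binding sites of RNA-binding proteins.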

  • RNA・タンパク質相互作用の網羅的予測と検証

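    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Project period: April 2013 - March 2016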

    浅井 潔, 浜田 道昭, 亀田 倫史, 石黒 亮, 由良 敬

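    We aimed to predict whether RNAs and proteins interact and to predict the structures of their complexes. Informatics analysis of RNA-protein complexes in the PDB showed that bases that do not form base pairs in the secondary structure but form hydrogen bonds with amino acids have lower base-pairing probabilities than bases that form neither secondary-structure base pairs nor hydrogen bonds with amino acids. To analyze RNA secondary structure from multiple angles, we developed a method that rapidly computes the distribution of the total probability of all structures at each Hamming distance from a reference structure. We also developed a docking method for RNA-protein complexes that introduces a force field accounting for the physicochemical properties of RNA and protein, showed that it substantially outperforms existing methods, and published the results.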

  • プライバシー保護バイオインフォマティクス基盤技術の開発と応用

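    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Challenging Exploratory Research

    Project period: April 2013 - March 2016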

    浜田 道昭, 清水 佳奈, 花岡 悟一郎, 津田 宏治, フリス マーティン, 浅井 潔

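    Personal genome information and information on compounds that may become drug seeds must be handled as confidential. From the standpoint of open science, on the other hand, it is important to actively use such information and mine it together with other data. In this study, we developed methodologies for performing various kinds of data mining while keeping such important biological information secret. Specifically, we developed technologies for privacy-preserving search of compound databases, privacy-preserving search of genomic information using hidden Markov models, and privacy-preserving sequence alignment.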

  • 修飾・編集RNAの構造予測手法の研究開発

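    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (A)

    Project period: April 2012 - March 2016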

    浜田 道昭

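    We carried out research and development of information technology for predicting the structure of RNAs containing modified or edited bases. Because known structural data for such RNAs are extremely limited, we develop methods that predict structures effectively from such limited data. Specifically, we developed a new semi-supervised learning method for learning probabilistic models of secondary structure from a small amount of secondary structure data. We also developed Rtools, an integrated web server for RNA, and made it publicly available.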

  • ゲノム科学の総合的推進に向けた大規模ゲノム情報生産・高度情報解析支援

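    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)

    Project period: April 2010 - March 2016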

    小原 雄治, 加藤 和人, 豊田 敦, 黒木 陽子, 菅野 純夫, 鈴木 穣, 林 哲也, 山本 健, 辻 省次, 井ノ上 逸朗, 黒川 顕, 森下 真一, 中村 保一, 田畑 哲之, 久原 哲, 岩崎 渉, 瀬々 潤, 高橋 弘喜, 浅井 潔, 笠原 雅弘, 榊原 康文, 矢田 哲士, 山縣 然太朗, 武藤 香織, 位田 隆一, 増井 徹, 栗山 真理子, 高木 利久, 藤山 秋佐夫, 服部 正平, 小椋 義俊, 徳永 勝士, 桑野 良三, 大橋 順, 伊藤 武彦, 平川 英樹, 野口 英樹, 松岡 聡, 小笠原 直毅, 中村 建介, 浜田 道昭, 金谷 重彦, 安西 祐一郎, 岡田 清孝, 榊 佳之, 高久 史麿, 豊島 久真男, 中村 桂子, 堀田 凱樹, 米澤 明憲, 吉川 寛, 吉田 光昭, 猪子 英俊, 戸田 達史, 稲澤 譲治, 五條掘 孝, 漆原 秀子, 武田 洋幸, 城石 俊彦, 伊藤 隆司, 佐藤 矩行, 松田 秀雄, 五斗 進, 津田 雅孝, 桑野 良三, 徳永 勝士, 小笠原 直毅

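    While analysis technology advanced internationally faster than expected, consolidating activities at core centers allowed us to provide state-of-the-art technical support, including informatics analysis. We supported 60-90 publicly solicited projects each year, 465 in total, yielding 363 papers, including the decoding of the coelacanth genome. The supported projects spanned all KAKENHI categories and almost all subfields of biology, showing that this activity is essential as a foundation for the life sciences. In addition, a virtuous cycle between support and the advancement of analysis technology took hold, exemplified by the successful in-house development of the genome assembly software Platanus, which became a trump card for difficult genome decoding.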


Misc

  • Fast RNA-RNA Interaction Prediction Methods for Interaction Analysis of Transcriptome-Scale Large Datasets

    Tsukasa Fukunaga, Michiaki Hamada

    Methods in molecular biology (Clifton, N.J.)   2586   163 - 173  2023

    Book review, literature introduction, etc.

    The computational prediction of RNA-RNA interactions has long been studied in RNA informatics. Most of the existing approaches focused on the interaction prediction of short RNAs in small datasets. However, in recent years, two fast prediction methods, RIsearch2 and RIblast, have been developed to predict transcriptome-scale interactions or long RNA interactions. The key idea of the software acceleration of these tools was the integration of a seed-and-extend method, which is used in fast sequence alignment tools, into RNA-RNA interaction prediction. As a result, the two software programs were ten to a thousand times faster than the existing tools; because of this acceleration, detection of genome-wide microRNA target sites or interaction partners of function-unknown long noncoding RNAs has become possible. In this review, we describe the basic concept of the algorithm, its applications, and the future perspectives of the fast RNA-RNA interaction prediction tools.

    DOI PubMed

  • ドライバー遺伝子異常肺癌の薬剤耐性機序における長鎖ノンコーディングRNAの意義

    高橋 聡, 野呂 林太郎, 吉川 明子, 中道 真仁, 菅野 哲平, 松本 優, 武内 進, 平尾 真季子, 松田 久仁子, Zeng Chao, 浜田 道昭, 久保田 馨, 清家 正博, 弦間 昭彦

    日本呼吸器学会誌   9 ( Suppl. ) 177 - 177  August 2020

  • CAFs induce formation of metastatic human breast tumor cell clusters with partial epithelial-mesenchymal transition

    Akira Orimo, Yasuhiko Ito, Yoshihiro Mezawa, Kaidiliavi Sulidan, Yataro Daigo, Nadila Wali, Okio Hino, Kazuyoshi Takeda, Michiaki Hamada, Yuko Matsumura

    CANCER SCIENCE   109   797 - 797  December 2018  [Peer-reviewed]

    Research paper, summary (international conference)

  • 非コードRNA Eleanorはヌクレオソーム中のヒストンの交換を促進する

    藤田 理紗, 有村 泰宏, 山本 達郎, 浜田 道昭, 斉藤 典子, 胡桃坂 仁志

    生命科学系学会合同年次大会   FY2017   [3PT18-0555]  December 2017

  • トピックモデルを用いたがんゲノムの変異シグネチャー解析 (ニューロコンピューティング)

    松谷 太郎, 宇恵野 雄貴, 福永 津嵩, 浜田 道昭

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   117 ( 109 ) 159 - 164  June 2017

    CiNii

  • トピックモデルを用いたがんゲノムの変異シグネチャー解析 (情報論的学習理論と機械学習)

    松谷 太郎, 宇恵野 雄貴, 福永 津嵩, 浜田 道昭

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   117 ( 110 ) 105 - 110  June 2017

    CiNii

  • バイオイメージインフォマティクスにおける機械学習技術の活用 (Imaging Today 人工知能における学習技術)

    福永 津嵩, 浜田 道昭

    日本画像学会誌 = Journal of the Imaging Society of Japan   56 ( 2 ) 163 - 167  2017

    CiNii

  • 疾患リスクの評価へ向けた加法準同型性暗号によるプライバシー保護HMMの実装と評価 (ニューロコンピューティング)

    三品 気吹, 浜田 道昭

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   116 ( 120 ) 121 - 126  July 2016

    CiNii

  • Privacy-Preserving Search for Chemical Compound Databases

    Kana Shimizu, Koji Nuida, Hiromi Arai, Shigeo Mitsunari, Nuttapong Attrapadung, Michiaki Hamada, Koji Tsuda, Takatsugu Hirokawa, Jun Sakuma, Goichiro Hanaoka, Kiyoshi Asai

    bioRxiv   ( 013995 )  January 2015

    Institutional technical report, technical report, preprint, etc.

    DOI

  • RNA secondary structure prediction from multi-aligned sequences

    Michiaki Hamada

    July 2013

    Institutional technical report, technical report, preprint, etc.

    It has been well accepted that the RNA secondary structures of most functional non-coding RNAs (ncRNAs) are closely related to their functions and are conserved during evolution. Hence, prediction of conserved secondary structures from evolutionarily related sequences is one important task in RNA bioinformatics; the methods are useful not only to further functional analyses of ncRNAs but also to improve the accuracy of secondary structure predictions and to find novel functional RNAs from the genome. In this review, I focus on common secondary structure prediction from a given aligned RNA s...

  • Generalized Centroid Estimators in Bioinformatics

    Michiaki Hamada, Hisanori Kiryu, Wataru Iwasaki, Kiyoshi Asai

    PLoS ONE 6(2):e16450, 2011    May 2013

    Institutional technical report, technical report, preprint, etc.

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit with commonly-used accura...

    DOI
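
    As an illustration of designing an estimator around an accuracy measure on a binary space, the sketch below implements a γ-centroid-style rule in its simplest, unconstrained form. It assumes a gain of γ for each true positive and 1 for each true negative, and it assumes that the marginal probabilities `marginals` are already available; under those assumptions the expected gain is maximized by predicting 1 exactly when the marginal exceeds 1/(γ+1). The function names are hypothetical, and real applications such as RNA secondary structure add structural constraints that require dynamic programming rather than independent thresholding.

```python
import itertools


def gamma_centroid(marginals, gamma):
    """Predict 1 wherever the marginal probability exceeds 1 / (gamma + 1)."""
    threshold = 1.0 / (gamma + 1.0)
    return [1 if p > threshold else 0 for p in marginals]


def expected_gain(prediction, marginals, gamma):
    """Expected value of gamma * (true positives) + (true negatives)."""
    return sum(gamma * p if y == 1 else 1.0 - p
               for y, p in zip(prediction, marginals))


def brute_force_best(marginals, gamma):
    """Exhaustively find a prediction with maximal expected gain (tiny inputs only)."""
    return max(itertools.product([0, 1], repeat=len(marginals)),
               key=lambda y: expected_gain(y, marginals, gamma))


if __name__ == "__main__":
    marginals = [0.9, 0.4, 0.1, 0.55]
    for gamma in (1.0, 4.0):
        prediction = gamma_centroid(marginals, gamma)
        print(gamma, prediction, expected_gain(prediction, marginals, gamma))
```

    A larger γ lowers the threshold and so trades specificity for sensitivity.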

  • Fighting against uncertainty: An essential issue in bioinformatics

    Michiaki Hamada

    May 2013

    Institutional technical report, technical report, preprint, etc.

    Many bioinformatics problems, such as sequence alignment, gene prediction, phylogenetic tree estimation and RNA secondary structure prediction, are often affected by the "uncertainty" of a solution; that is, the probability of the solution is extremely small. This situation arises for estimation problems on high-dimensional discrete spaces in which the number of possible discrete solutions is immense. In the analysis of biological data or the development of prediction algorithms, this uncertainty should be handled carefully and appropriately. In this review, I will explain several methods t...

  • 加法準同型暗号を用いた化合物データベースの秘匿検索プロトコル

    縫田光司, 清水佳奈, 荒井ひろみ, 浜田道昭, 津田宏治, 広川貴次, 花岡悟一郎, 佐久間淳, 浅井潔

    情報処理学会シンポジウムシリーズ(CD-ROM)   2012 ( 3 ) ROMBUNNO.2C2-1 - 389  October 2012

    CiNii J-GLOBAL

  • 半教師あり学習を用いたRNA二次構造予測アルゴリズムの提案

    米本悠, 浜田道昭, 浅井潔

    日本RNA学会年会要旨集   14th   160  July 2012

    J-GLOBAL

  • カノニカル分布に基づいたRNA二次構造安定性解析手法の開発

    森遼太, 浜田道昭, 浅井潔

    日本RNA学会年会要旨集   14th   154  July 2012

    J-GLOBAL

  • 検索行動におけるプライバシ保護

    荒井ひろみ, 清水佳奈, 浜田道昭, 津田宏治, 広川貴次, 佐久間淳, 浅井潔

    人工知能学会全国大会論文集(CD-ROM)   26th   ROMBUNNO.3I2-OS-20-1  2012

    J-GLOBAL

  • カノニカル分布に基づくRNA二次構造の存在確率分布記述手法の開発

    森遼太, 浜田道昭, 浅井潔

    日本分子生物学会年会プログラム・要旨集(Web)   35th   WEB ONLY 1P-0244  2012

    J-GLOBAL

  • 半教師あり学習を用いたRNA二次構造予測アルゴリズムの提案

    米本悠, 浜田道昭, 浅井潔

    日本分子生物学会年会プログラム・要旨集(Web)   35th   WEB ONLY 3P-0071  2012

    J-GLOBAL

  • 期待精度最大化とバイオインフォマティクス(インダストリアルマテリアルズ)

    浜田 道昭, 浅井 潔

    応用数理   21 ( 1 ) 34 - 39  March 2011

    DOI CiNii

  • 期待精度最大化とバイオインフォマティクス

    浜田道昭, 浅井潔

    応用数理   21 ( 1 ) 34 - 39  March 2011

    DOI CiNii J-GLOBAL

  • RNA-RNA interaction prediction using integer programming with threshold cut (ニューロコンピューティング)

    Kato Yuki, Sato Kengo, Hamada Michiaki

    電子情報通信学会技術研究報告   110 ( 83 ) 183 - 190  June 2010

    CiNii

  • RNA-RNA Interaction Prediction Using Integer Programming with Threshold Cut (バイオ情報学(BIO) Vol.2010-BIO-21)

    Yuki Kato, Kengo Sato, Michiaki Hamada, Yoshihide Watanabe, Kiyoshi Asai, Tatsuya Akutsu

    研究報告バイオ情報学(BIO)   2010 ( 32 ) 1 - 8  June 2010

    Much attention has been focused on predicting RNA-RNA interaction since it is a key to identifying possible targets of noncoding small RNAs that regulate gene expression post-transcriptionally. A number of computational studies have so far been devoted to predicting joint secondary structures or binding sites under a specific class of interactions. In this technical report, we propose RactIP, a fast and accurate prediction method for RNA-RNA interaction of general type based on integer programming. RactIP can integrate approximate information on an ensemble of equilibrium joint structures into the objective function using posterior internal and external base pairing probabilities. Experimental results show that prediction accuracy of RactIP is at least comparable to that of several state-of-the-art methods for RNA-RNA interaction prediction. Moreover, we demonstrate that RactIP can run incomparably faster than competitive methods for predicting joint secondary structures.

    CiNii

  • CentroidFold:RNA二次構造予測ウェブサーバー

    佐藤健吾, 浜田道昭, 浅井潔, 光山統泰

    日本RNA学会年会要旨集   11th   96  July 2009

    J-GLOBAL

  • Large Scale Similarity Search for Locally stable Secondary Structures among RNA Sequences (IPSJ Transactions on Bioinformatics Vol.2)

    HAMADA MICHIAKI, MITUYAMA TOUTAI, ASAI KIYOSHI

    情報処理学会論文誌 論文誌トランザクション   2008 ( 2 ) 36 - 46  April 2009

    CiNii

  • CentroidHomfold:相同配列群の情報を利用したRNAの2次構造予測

    浜田道昭, 佐藤健吾, 木立尚孝, 光山統泰, 浅井潔

    日本分子生物学会年会講演要旨集   32nd ( Vol.1 ) 48  2009

    J-GLOBAL

  • 期待精度を最大化するRNA情報解析手法の開発

    浜田道昭, 木立尚孝, 佐藤健吾, 光山統泰, 浅井潔

    生化学     2P-0776  2008

    J-GLOBAL

  • Support Vector Machineを用いた機能性RNAファミリーの分類

    浜田道昭, 加藤毅, 金大真, 津田宏治, 浅井潔

    RNAミーティング   7th   69  2005

    J-GLOBAL

  • 超高速計算環境での生体関連分子の活性・機能予測システムの構築 : HIVプロテアーゼ阻害剤の解析への応用

    浜田 道昭, 馮 誠, 稲垣 祐一郎, 長嶋 雲兵, 村上 和彰, 中馬 寛

    日本応用数理学会論文誌   14 ( 4 ) 267 - 288  December 2004

    We have developed an object oriented large-scale scientific simulations system that contains algorithms of molecular scientific computing programs, called Embedded High-Performance Computing (EHPC). As an application of the system, "EHPC-Drug platform" has been constructed for rational drug design. It can provide a high-performance computing ability for exhaustive conformational analyses of biomolecules, generating computation of their three-dimensional topological descriptors, and docking calculations with their target receptors. To enhance its computing abilities, we are also planning to ...

    DOI CiNii

  • 超高速計算環境での生体関連分子の活性・機能予測システムの構築:HIVプロテアーゼ阻害剤の解析への応用

    浜田道昭, FENG C, 稲垣祐一郎, 長嶋雲兵, 村上和彰, 中馬寛

    日本応用数理学会論文誌   14 ( 4 ) 267 - 288  December 2004

    We have developed an object oriented large-scale scientific simulations system that contains algorithms of molecular scientific computing programs, called Embedded High-Performance Computing (EHPC). As an application of the system, "EHPC-Drug platform" has been constructed for rational drug design. It can provide a high-performance computing ability for exhaustive conformational analyses of biomolecules, generating computation of their three-dimensional topological descriptors, and docking calculations with their target receptors. To enhance its computing abilities, we are also planning to apply Grid computing technology to this system for parallel and distributed computing and Grid Data processing. As a critical test of our approach, we applied it to a prediction of bound conformation of several HIV protease inhibitors with the protease.

    DOI CiNii J-GLOBAL

  • Grid技術とXMLデータベースを用いた創薬プラットフォームの構築とその応用

    浜田道昭, 稲垣祐一郎, 中馬寛

    構造活性相関シンポジウム講演要旨集   32nd   141 - 144  November 2004

    J-GLOBAL

  • 薬師(Xsi)―創薬のための仮想スクリーニング統合システムの開発

    稲垣祐一郎, 浜田道昭, 山崎一人, 金岡昌治, 中馬寛

    情報計算化学生物学会大会予稿集   2004   205 - 206  July 2004

    J-GLOBAL

  • DrugMLとGrid創薬

    浜田道昭, 稲垣祐一郎, 中馬寛

    日本コンピュータ化学会年会講演予稿集   2004   51  May 2004

    J-GLOBAL

  • DrugMLとGrid創薬

    浜田道昭, 稲垣祐一郎, 中馬寛

    構造活性相関シンポジウム講演要旨集   31st   101 - 102  November 2003

    J-GLOBAL



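Concurrent Appointments in Other Faculties and Graduate Schools

  • Faculty of Science and Engineering   Graduate School of Advanced Science and Engineering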

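Concurrent Affiliations with University Research Institutes and Affiliated Organizations

  • 2023 - 2024

    Center for Data Science   Concurrent Center Member

  • 2022 - 2024

    Waseda Research Institute for Science and Engineering   Concurrent Researcher

  • 2022 - 2024

    Waseda Center for a Carbon Neutral Society   Concurrent Center Member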

Special Research Projects (Internal Funding)

  • シミュレーション技術を用いたRNA構造解析技術の開発

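    2023

    We pursued the following research and development on RNA structure analysis using simulation technology. 1. We developed a technique that uses deep learning to predict the three-dimensional structures of RNA-protein complexes; a paper is in preparation. 2. We developed a deep learning technique for projecting multiple three-dimensional structures, obtained for example from molecular dynamics, into a low-dimensional latent space; a paper is in preparation.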

  • RNAリンカネーション

    2022

    We developed RNA bioinformatics technologies that are expected to contribute to elucidating RNA reincarnation. For example, the following tool enables fast discovery of structures common to RNAs: Tsukasa Fukunaga*, Michiaki Hamada, LinAliFold and CentroidLinAliFold: Fast RNA consensus secondary structure prediction for aligned sequences using beam search methods, Bioinformatics Advances, vbac078, https://doi.org/10.1093/bioadv/vbac078, published 22 October 2022. Preliminary experiments are also continuing.

  • 長鎖ノンコーディングRNA情報解析基盤の開発

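    2021

    Long non-coding RNAs (lncRNAs) do not function alone in vivo; they realize various functions by interacting with other functional molecules. This fiscal year we carried out several studies on the computational analysis of RNA-binding proteins (RBPs) that interact with lncRNAs. First, we developed RBP-BERT, which predicts RNA sequences bound by RBPs using a pre-trained BERT model, and extracted biological characteristics of RBP binding by analyzing what the model had learned. Second, we carried out a comprehensive analysis of RBPs that bind to repeat elements such as transposons, which revealed that repeat elements serve as functional sequences for RBP binding.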

  • ノンコーディングRNA解析情報基盤技術の研究

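    2020

    To elucidate the functions of the long non-coding RNAs discovered in large numbers in higher eukaryotes such as humans, we built foundational information technologies and carried out various bioinformatics analyses. Specifically, we carried out the following.
    - A comprehensive analysis of the relationship between localization and alternative splicing
    - Development of MoAIMS, a tool for accurately identifying m6A modification sites from transcriptome-wide m6A measurement data
    - Genome-wide identification of R-loop structures and extraction of their features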

  • 秘密分散手法を用いた生命情報秘匿解析手法の研究

    2019

    Using secret sharing, we devised and implemented a method for securely performing sequence comparison with affine gap penalties. Comparison with existing methods confirmed a substantial improvement in computation speed. [1] 深見 匠、浜田 道昭, アフィンギャップを考慮した安全な個人ゲノム比較, 2019/12/3, 第42回日本分子生物学会年会, 福岡国際会議場・マリンメッセ福岡 [2] 深見匠, 浜田道昭, セキュアな個人ゲノム類似度計算, 2019年 暗号と情報セキュリティシンポジウム, January 22-25, 2019, びわ湖大津プリンスホテル
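
    The secret sharing primitive that such a method builds on can be sketched as follows. This is a minimal additive sharing scheme over integers modulo a fixed value, shown only to illustrate how parties can add hidden values without seeing them; the modulus, party count, and function names are assumptions for illustration, and the secure comparison and maximum operations that an affine-gap alignment would need are not shown.

```python
import secrets

MODULUS = 2**61 - 1  # illustrative modulus; any value larger than the data range works


def share(value, n_parties=3):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recover the secret by summing all shares."""
    return sum(shares) % MODULUS


def add_shared(shares_a, shares_b):
    """Each party adds its two shares locally; no secret is revealed."""
    return [(a + b) % MODULUS for a, b in zip(shares_a, shares_b)]


if __name__ == "__main__":
    match_score = share(5)
    gap_penalty = share(7)
    print(reconstruct(add_shared(match_score, gap_penalty)))
```

    Any subset of fewer than all shares is uniformly random, which is why an individual party learns nothing from its own share.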

  • 統合オミックスデータ駆動生物学の数理情報基盤と実践

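    2018

    As bioinformatics technology toward elucidating the functions of long non-coding RNAs, we developed an algorithm and tool that predict m6A modifications using deep learning. We also developed an algorithm for predicting RNA-RNA interactions quickly and accurately from sequence information alone. In addition, we developed foundational information technology for predicting mutational signatures in cancer genome data using model selection techniques.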

  • RNA-クロマチン相互作用予測と応用

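    2016

    We obtained the following results toward developing a method for inferring RNA-chromatin interactions from sequence information alone. 1. We developed a new method for predicting (docking) the structures of RNA-protein complexes. By incorporating the results of molecular dynamics simulations into the scoring function for complex structures, the method achieved a substantial improvement in accuracy over existing methods. 2. We built and released Rtools, an integrated web server for RNA structure prediction. With this web server, various kinds of predicted structural information (secondary structure, base-pairing probability matrix, probabilities of forming stems, bulges, loops, and so on) can be obtained from RNA sequence information alone. Such information is also useful for predicting RNA-chromatin interactions.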

  • 統合オミックスデータ駆動生物学の数理情報基盤

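    2016

    We obtained the following results on methods for the computational analysis of various omics data.
    - We developed a probabilistic model for metagenomic data. The model applies LDA, which is used in natural language processing, to metagenomic data, making it possible to infer bacterial groups. By examining in detail the relationship between the inferred bacterial groups and the widely known enterotypes, we gave the bacterial groups a biological interpretation.
    - We built a pipeline for identifying mutations in plant genomes from sequencing data and used it to analyze plant mutants in detail. This work is a collaboration with RIKEN.
    - We developed a method that uses cryptographic techniques to search with profile HMMs, which are probabilistic models of protein and DNA sequence motifs, while keeping both the model and the query secret. The method relies essentially on additively homomorphic encryption, which allows addition to be performed on values that remain encrypted.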
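
    The property of additively homomorphic encryption mentioned above, that sums can be computed on ciphertexts, can be illustrated with a toy Paillier cryptosystem. This sketch is an assumption-laden illustration: the text does not say which cryptosystem was used, the primes are far too small to be secure, and the function names are hypothetical. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets


def keygen(p=61, q=53):
    """Return (n, (lam, mu)) for toy primes p and q; insecure, illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Encrypt 0 <= message < n as (1 + n)^message * r^n mod n^2."""
    n_squared = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(n, private_key, ciphertext):
    """Recover the message using L(x) = (x - 1) // n."""
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, private_key = keygen()
    total = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
    print(decrypt(n, private_key, total))
```

    Because log-probabilities in an HMM are combined by addition, a scheme with this property lets a server accumulate scores without decrypting the query, which is the role the text attributes to it.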

  • lncRNA-RNA相互作用の網羅的予測と実験情報を統合したデータベースの構築

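    2015

    In this study, we first built a pipeline system for fast prediction of RNA-RNA interactions and implemented it on the K computer. Second, we used the pipeline to comprehensively predict interaction partners for human lncRNAs and released the results as a database. Hamada gave an oral presentation at APBC2016, and the paper was published in a journal (BMC Genomics).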

  • エピゲノムの統合的理解に向けた情報技術の開発とデータ駆動型生物学の実践

    2015

    This fiscal year, in preparation for publicly releasing the source code of the program from the paper published last year [1], we cleaned up and improved the program. Specifically, we modified it so that the posterior probability of the chromatin state can be output at each position. [1] Michiaki Hamada*, Yukiteru Ono, Ryohei Fujimaki, Kiyoshi Asai, Learning chromatin states with factorized information criteria, Bioinformatics (2015) doi: 10.1093/bioinformatics/btv163 First published online: March 24, 2015

  • エピジェネティクスデータからクロマチン状態を推定する方法論の研究と応用

    2014

    Motivation: Recent studies have suggested that both the genome and the genome with epigenetic modifications, the so-called epigenome, play important roles in various biological functions, such as transcription and DNA replication, repair, and recombination. It is well known that specific combinations of histone modifications (e.g. methylations and acetylations) of nucleosomes induce chromatin states that correspond to specific functions of chromatin. Although the advent of next-generation sequencing (NGS) technologies enables measurement of epigenetic information for entire genomes at high-resolution, the variety of chromatin states has not been completely characterized.
    Results: In this study, we propose a method to estimate the chromatin states indicated by genome-wide chromatin marks identified by NGS technologies. The proposed method automatically estimates the number of chromatin states and characterizes each state on the basis of a hidden Markov model (HMM) in combination with a recently proposed model selection technique, factorized information criteria. The method is expected to provide an unbiased model because it relies on only two adjustable parameters and avoids heuristic procedures as much as possible. Computational experiments with simulated datasets show that our method automatically learns an appropriate model, even in cases where methods that rely on Bayesian information criteria fail to learn the model structures. In addition, we comprehensively compare our method to ChromHMM on three real datasets and show that our method estimates more chromatin states than ChromHMM for those datasets.
