Updated on 2022/05/19


 
FUKUNAGA, Tsukasa
 
Affiliation
Affiliated organization, Waseda Institute for Advanced Study
Job title
Assistant Professor(non-tenure-track)

Concurrent Post

  • Faculty of Science and Engineering   School of Advanced Science and Engineering

Education

  • 2011.04
    -
    2016.03

    The University of Tokyo   Graduate School of Frontier Sciences   Department of Computational Biology and Medical Sciences  

  • 2007.04
    -
    2011.03

    The University of Tokyo   Faculty of Science   Undergraduate Program for Bioinformatics and Systems Biology  

Degree

  • The University of Tokyo   Ph. D.

Research Experience

  • 2021.04
    -
    Now

    Waseda University

  • 2017.10
    -
    2021.03

    Waseda University   Research Institute for Science and Engineering

  • 2017.10
    -
    2021.03

    The University of Tokyo

  • 2018.02
    -
    2019.03

    Osaka University

  • 2016.04
    -
    2017.09

    Japan Society for the Promotion of Science   Research Fellow (PD)

  • 2016.04
    -
    2017.09

    Waseda University   Faculty of Science and Engineering


Professional Memberships

  • JAPANESE SOCIETY FOR BIOINFORMATICS

 

Research Areas

  • System genome science

  • Genome biology

  • Life, health and medical informatics

Research Interests

  • Phenotype

  • Gene function prediction

  • Genome evolution

  • Data mining

  • Machine learning

  • RNA

  • Bioinformatics


Papers

  • Inverse Potts model improves accuracy of phylogenetic profiling.

    Tsukasa Fukunaga, Wataru Iwasaki

    Bioinformatics (Oxford, England)    2022.01  [Refereed]  [International journal]


    MOTIVATION: Phylogenetic profiling is a powerful computational method for revealing the functions of function-unknown genes. Although conventional similarity metrics in phylogenetic profiling achieved high prediction accuracy, they have two estimation biases: an evolutionary bias and a spurious correlation bias. While previous studies reduced the evolutionary bias by considering a phylogenetic tree, few studies have analyzed the spurious correlation bias. RESULTS: To reduce the spurious correlation bias, we developed metrics based on the inverse Potts model (IPM) for phylogenetic profiling. We also developed a metric based on both the IPM and a phylogenetic tree. In an empirical dataset analysis, we demonstrated that these IPM-based metrics improved the prediction performance of phylogenetic profiling. In addition, we found that the integration of several metrics, including the IPM-based metrics, had superior performance to a single metric. AVAILABILITY: The source code is freely available at https://github.com/fukunagatsu/Ipm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    DOI PubMed

  • Mirage: Estimation of ancestral gene-copy numbers by considering different evolutionary patterns among gene families

    Tsukasa Fukunaga, Wataru Iwasaki

    Bioinformatics Advances    2021.07  [Refereed]


    Motivation: Reconstruction of gene copy number evolution is an essential approach for understanding how complex biological systems have been organized. Although various models have been proposed for gene copy number evolution, existing evolutionary models have not appropriately addressed the fact that different gene families can have very different gene gain/loss rates. Results: In this study, we developed Mirage (MIxtuRe model for Ancestral Genome Estimation), which allows different gene families to have flexible gene gain/loss rates. Mirage can use three models for formulating heterogeneous evolution among gene families: the discretized Γ model, PDF model, and PM model. Simulation analysis showed that Mirage can accurately estimate heterogeneous gene gain/loss rates and reconstruct gene content evolutionary history. Application to empirical datasets demonstrated that the PM model fits genome data from various taxonomic groups better than the other heterogeneous models. Using Mirage, we revealed that metabolic function-related gene families displayed frequent gene gains and losses in all taxa investigated. Availability: The source code of Mirage is freely available at https://github.com/fukunagatsu/Mirage. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

    DOI

  • Umibato: estimation of time-varying microbial interaction using continuous-time regression hidden Markov model.

    Shion Hosoda, Tsukasa Fukunaga, Michiaki Hamada

    Bioinformatics (Oxford, England)   37 ( Suppl_1 ) i16-i24  2021.07  [Refereed]  [International journal]


    MOTIVATION: Accumulating evidence has highlighted the importance of microbial interaction networks. Methods have been developed for estimating microbial interaction networks, of which the generalized Lotka-Volterra equation (gLVE)-based method can estimate a directed interaction network. The previous gLVE-based method for estimating microbial interaction networks did not consider time-varying interactions. RESULTS: In this study, we developed unsupervised learning-based microbial interaction inference method using Bayesian estimation (Umibato), a method for estimating time-varying microbial interactions. The Umibato algorithm comprises Gaussian process regression (GPR) and a new Bayesian probabilistic model, the continuous-time regression hidden Markov model (CTRHMM). Growth rates are estimated by GPR, and interaction networks are estimated by CTRHMM. CTRHMM can estimate time-varying interaction networks using interaction states, which are defined as hidden variables. Umibato outperformed the existing methods on synthetic datasets. In addition, it yielded reasonable estimations in experiments on a mouse gut microbiota dataset, thus providing novel insights into the relationship between consumed diets and the gut microbiota. AVAILABILITY AND IMPLEMENTATION: The C++ and python source codes of the Umibato software are available at https://github.com/shion-h/Umibato. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    DOI PubMed

  • Representation learning applications in biological sequence analysis.

    Hitoshi Iuchi, Taro Matsutani, Keisuke Yamada, Natsuki Iwano, Shunsuke Sumi, Shion Hosoda, Shitao Zhao, Tsukasa Fukunaga, Michiaki Hamada

    Computational and structural biotechnology journal   19   3198 - 3208  2021  [Refereed]  [International journal]


    Although remarkable advances have been reported in high-throughput sequencing, the ability to aptly analyze a substantial amount of rapidly generated biological (DNA/RNA/protein) sequencing data remains a critical hurdle. To tackle this issue, the application of natural language processing (NLP) to biological sequence analysis has received increased attention. In this method, biological sequences are regarded as sentences while the single nucleic acids/amino acids or k-mers in these sequences represent the words. Embedding is an essential step in NLP, which performs the conversion of these words into vectors. Specifically, representation learning is an approach used for this transformation process, which can be applied to biological sequences. Vectorized biological sequences can then be applied for function and structure estimation, or as input for other probabilistic models. Considering the importance and growing trend for the application of representation learning to biological research, in the present study, we have reviewed the existing knowledge in representation learning for biological sequence analysis.

    DOI PubMed

  • Novel metric for hyperbolic phylogenetic tree embeddings.

    Hirotaka Matsumoto, Takahiro Mimori, Tsukasa Fukunaga

    Biology methods & protocols   6 ( 1 ) bpab006  2021  [Refereed]  [International journal]


    Advances in experimental technologies, such as DNA sequencing, have opened up new avenues for the applications of phylogenetic methods to various fields beyond their traditional application in evolutionary investigations, extending to the fields of development, differentiation, cancer genomics, and immunogenomics. Thus, the importance of phylogenetic methods is increasingly being recognized, and the development of a novel phylogenetic approach can contribute to several areas of research. Recently, the use of hyperbolic geometry has attracted attention in artificial intelligence research. Hyperbolic space can better represent a hierarchical structure compared to Euclidean space, and can therefore be useful for describing and analyzing a phylogenetic tree. In this study, we developed a novel metric that considers the characteristics of a phylogenetic tree for representation in hyperbolic space. We compared the performance of the proposed hyperbolic embeddings, general hyperbolic embeddings, and Euclidean embeddings, and confirmed that our method could be used to more precisely reconstruct evolutionary distance. We also demonstrate that our approach is useful for predicting the nearest-neighbor node in a partial phylogenetic tree with missing nodes. Furthermore, we proposed a novel approach based on our metric to integrate multiple trees for analyzing tree nodes or imputing missing distances. This study highlights the utility of adopting a geometric approach for further advancing the applications of phylogenetic methods.

    DOI PubMed

  • MotiMul: A significant discriminative sequence motif discovery algorithm with multiple testing correction

    Koichi Mori, Haruka Ozaki, Tsukasa Fukunaga

    Proceedings - 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020     186 - 193  2020.12  [Refereed]


    Sequence motifs play essential roles in intermolecular interactions such as DNA-protein interactions. The discovery of novel sequence motifs is therefore crucial for revealing gene functions. Various bioinformatics tools have been developed for finding sequence motifs, but until now there has been no software based on statistical hypothesis testing with statistically sound multiple testing correction. Existing software therefore could not control the type-I error rate. This is because, in the sequence motif discovery problem, conventional multiple testing correction methods produce very low statistical power due to overly strict correction. We developed MotiMul, which comprehensively finds significant sequence motifs using statistically sound multiple testing correction. Our key idea is the application of Tarone's correction, which improves the statistical power of the hypothesis test by ignoring hypotheses that never become statistically significant. For the efficient enumeration of the significant sequence motifs, we integrated a variant of the PrefixSpan algorithm with Tarone's correction. Simulation and empirical dataset analysis showed that MotiMul is a powerful method for finding biologically meaningful sequence motifs. The source code of MotiMul is freely available at https://github.com/ko-ichimo-ri/MotiMul.

    DOI

  • Revealing the microbial assemblage structure in the human gut microbiome using latent Dirichlet allocation

    Shion Hosoda, Suguru Nishijima, Tsukasa Fukunaga, Masahira Hattori, Michiaki Hamada

    Microbiome   8 ( 1 ) 95 - 95  2020.06  [Refereed]  [International journal]


    Background: The human gut microbiome has been suggested to affect human health and thus has received considerable attention. To clarify the structure of the human gut microbiome, clustering methods are frequently applied to human gut taxonomic profiles. Enterotypes, i.e., clusters of individuals with similar microbiome composition, are well-studied and characterized. However, only a few detailed studies on assemblages, i.e., clusters of co-occurring bacterial taxa, have been conducted. Particularly, the relationship between the enterotype and assemblage is not well-understood. Results: In this study, we detected gut microbiome assemblages using a latent Dirichlet allocation (LDA) method. We applied LDA to a large-scale human gut metagenome dataset and found that a 4-assemblage LDA model could represent relationships between enterotypes and assemblages with high interpretability. This model indicated that each individual tends to have several assemblages, three of which corresponded to the three classically recognized enterotypes. Conversely, the fourth assemblage corresponded to no enterotypes and emerged in all enterotypes. Interestingly, the dominant genera of this assemblage (Clostridium, Eubacterium, Faecalibacterium, Roseburia, Coprococcus, and Butyrivibrio) included butyrate-producing species such as Faecalibacterium prausnitzii. Indeed, the fourth assemblage significantly positively correlated with three butyrate-producing functions. Conclusions: We conducted an assemblage analysis on a large-scale human gut metagenome dataset using LDA. The present study revealed that there is an enterotype-independent assemblage.

    DOI PubMed

  • Logicome profiler: Exhaustive detection of statistically significant logic relationships from comparative omics data

    Tsukasa Fukunaga, Wataru Iwasaki

    PLoS ONE   15 ( 5 ) e0232106  2020.05  [Refereed]  [International journal]


    Logic relationship analysis is a data mining method that comprehensively detects item triplets that satisfy logic relationships from a binary matrix dataset, such as an ortholog table in comparative genomics. Thanks to recent technological advancements, many binary matrix datasets are now being produced in genomics, transcriptomics, epigenomics, metagenomics, and many other fields for comparative purposes. However, regardless of presumed interpretability and importance of logic relationships, existing data mining methods are not based on the framework of statistical hypothesis testing. That means, the type-1 and type-2 error rates are neither controlled nor estimated. Here, we developed Logicome Profiler, which exhaustively detects statistically significant triplet logic relationships from a binary matrix dataset (Logicome means ome of logics). To test all item triplets in a dataset while avoiding false positives, Logicome Profiler adjusts a significance level by the Bonferroni or Benjamini-Yekutieli method for the multiple testing correction. Its application to an ocean metagenomic dataset showed that Logicome Profiler can effectively detect statistically significant triplet logic relationships among environmental microbes and genes, which include those among urea transporter, urease, and photosynthesis-related genes. Beyond omics data analysis, Logicome Profiler is applicable to various binary matrix datasets in general for finding significant triplet logic relationships. The source code is available at https://github.com/fukunagatsu/LogicomeProfiler.

    DOI PubMed

  • Targeting the TR4 nuclear receptor-mediated lncTASR/AXL signaling with tretinoin increases the sunitinib sensitivity to better suppress the RCC progression

    Hangchuan Shi, Yin Sun, Miao He, Xiong Yang, Michiaki Hamada, Tsukasa Fukunaga, Xiaoping Zhang, Chawnshang Chang

    Oncogene   39 ( 3 ) 530 - 545  2020.01  [Refereed]  [International journal]


    Renal cell carcinoma (RCC) is one of the most lethal urological tumors. Using sunitinib to improve the survival has become the first-line therapy for metastatic RCC patients. However, the occurrence of sunitinib resistance in the clinical application has curtailed its efficacy. Here we found TR4 nuclear receptor might alter the sunitinib resistance to RCC via altering the TR4/lncTASR/AXL signaling. Mechanism dissection revealed that TR4 could modulate lncTASR (ENST00000600671.1) expression via transcriptional regulation, which might then increase AXL protein expression via enhancing the stability of AXL mRNA to increase the sunitinib resistance in RCC. Human clinical surveys also linked the expression of TR4, lncTASR, and AXL to the RCC survival, and results from multiple RCC cell lines revealed that targeting this newly identified TR4-mediated signaling with small molecules, including tretinoin, metformin, or TR4-shRNAs, all led to increase the sunitinib sensitivity to better suppress the RCC progression, and our preclinical study using the in vivo mouse model further proved tretinoin had a better synergistic effect to increase sunitinib sensitivity to suppress RCC progression. Future successful clinical trials may help in the development of a novel therapy to better suppress the RCC progression.

    DOI PubMed

  • Discovering novel mutation signatures by latent Dirichlet allocation with variational Bayes inference

    Taro Matsutani, Yuki Ueno, Tsukasa Fukunaga, Michiaki Hamada

    Bioinformatics   35 ( 22 ) 4543 - 4552  2019.11  [Refereed]  [International journal]


    A cancer genome includes many mutations derived from various mutagens and mutational processes, leading to specific mutation patterns. It is known that each mutational process leads to characteristic mutations, and when a mutational process has preferences for mutations, this situation is called a 'mutation signature.' Identification of mutation signatures is an important task for elucidation of carcinogenic mechanisms. In previous studies, analyses with statistical approaches (e.g. non-negative matrix factorization and latent Dirichlet allocation) revealed a number of mutation signatures. Nonetheless, strictly speaking, these existing approaches employ an ad hoc method or incorrect approximation to estimate the number of mutation signatures, and the whole picture of mutation signatures is unclear. Results: In this study, we present a novel method for estimating the number of mutation signatures, latent Dirichlet allocation with variational Bayes inference (VB-LDA), in which variational lower bounds are utilized for finding a plausible number of mutation patterns. In addition, we performed cluster analyses for estimated mutation signatures to extract novel mutation signatures that appear in multiple primary lesions. In a simulation with artificial data, we confirmed that our method estimated the correct number of mutation signatures. Furthermore, applying our method in combination with clustering procedures for real mutation data revealed many interesting mutation signatures that have not been previously reported.

    DOI PubMed

  • LncRRIsearch: A web server for lncRNA-RNA interaction prediction integrated with tissue-specific expression and subcellular localization data

    Tsukasa Fukunaga, Junichi Iwakiri, Yukiteru Ono, Michiaki Hamada

    Frontiers in Genetics   10 ( MAY ) 462 - 462  2019  [Refereed]  [International journal]


    Long non-coding RNAs (lncRNAs) play critical roles in various biological processes, but the function of the majority of lncRNAs is still unclear. One approach for estimating a function of a lncRNA is the identification of its interaction target because functions of lncRNAs are expressed through interaction with other biomolecules in quite a few cases. In this paper, we developed “LncRRIsearch,” which is a web server for comprehensive prediction of human and mouse lncRNA-lncRNA and lncRNA-mRNA interaction. The prediction was conducted using RIblast, which is a fast and accurate RNA-RNA interaction prediction tool. Users can investigate interaction target RNAs of a particular lncRNA through a web interface. In addition, we integrated tissue-specific expression and subcellular localization data for the lncRNAs with the web server. These data enable users to examine tissue-specific or subcellular localized lncRNA interactions. LncRRIsearch is publicly accessible at http://rtools.cbrc.jp/LncRRIsearch/.

    DOI PubMed

  • A Novel Method for Assessing the Statistical Significance of RNA-RNA Interactions Between Two Long RNAs

    Tsukasa Fukunaga, Michiaki Hamada

    Journal of Computational Biology   25 ( 9 ) 976 - 986  2018.09  [Refereed]  [International journal]


    RNA-RNA interactions are key mechanisms through which noncoding RNA (ncRNA) regions exert biological functions. Computational prediction of RNA-RNA interactions is an essential method for detecting novel RNA-RNA interactions because their comprehensive detection by biological experimentation is still quite difficult. Many RNA-RNA interaction prediction tools have been developed, but they tend to produce many false positives. Accordingly, assessment of the statistical significance of computationally predicted interactions is an important task. However, there is no method to evaluate the statistical significance of RNA-RNA interactions that is applicable to interactions between two long RNA sequences. We developed a method to calculate the p-value for the minimal interaction energy between two long RNA sequences. The developed method depends on the fact that minimum interaction energies of RNA-RNA interactions between long RNAs follow a Gumbel distribution when repeat sequences in RNAs are masked. To show the usefulness of the developed method, we applied it to whole human 5′-untranslated region (UTR) and 3′-UTR sequences to detect novel 5′-UTR-3′-UTR interactions. We thus identified two significant 5′-UTR-3′-UTR interactions. Specifically, the human small proline-rich repeat protein 3 shows conserved 5′-UTR-3′-UTR interactions with some nucleotide variations preserving base pairings among primates. Our developed method enables us to detect statistically significant RNA-RNA interactions between long RNAs such as long ncRNAs. Statistical significance estimates help in identification of interactions for experimental validation and provide novel insights into the function of ncRNA regions.

    DOI PubMed

  • Computational approaches for alternative and transient secondary structures of ribonucleic acids

    Tsukasa Fukunaga, Michiaki Hamada

    Briefings in Functional Genomics   18 ( 3 ) 182 - 191  2018.06  [Refereed]  [International journal]


    Transient and alternative structures of ribonucleic acids (RNAs) play essential roles in various regulatory processes, such as translation regulation in living cells. Because experimental analyses for RNA structures are difficult and time-consuming, computational approaches based on RNA secondary structures are promising. In this article, we review computational methods for detecting and analyzing transient/alternative secondary structures of RNAs, including static approaches based on probabilistic distributions of RNA secondary structures and dynamic approaches such as kinetic folding and folding pathway predictions.

    DOI PubMed

  • MitoFish and MiFish pipeline: A mitochondrial genome database of fish with an analysis pipeline for environmental DNA metabarcoding

    Yukuto Sato, Masaki Miya, Tsukasa Fukunaga, Tetsuya Sado, Wataru Iwasaki

    Molecular Biology and Evolution   35 ( 6 ) 1553 - 1555  2018.06  [Refereed]


    Fish mitochondrial genome (mitogenome) data form a fundamental basis for revealing vertebrate evolution and hydrosphere ecology. Here, we report recent functional updates of MitoFish, which is a database of fish mitogenomes with a precise annotation pipeline MitoAnnotator. Most importantly, we describe implementation of MiFish pipeline for metabarcoding analysis of fish mitochondrial environmental DNA, which is a fast-emerging and powerful technology in fish studies. MitoFish, MitoAnnotator, and MiFish pipeline constitute a key platform for studies of fish evolution, ecology, and conservation, and are freely available at http://mitofish.aori.u-Tokyo.ac.jp/ (last accessed April 7th, 2018).

    DOI PubMed

  • Identification and analysis of ribosome-associated lncRNAs using ribosome profiling data

    Chao Zeng, Tsukasa Fukunaga, Michiaki Hamada

    BMC Genomics   19 ( 1 ) 414 - 414  2018.05  [Refereed]  [International journal]


    Background: Although the number of discovered long non-coding RNAs (lncRNAs) has increased dramatically, their biological roles have not been established. Many recent studies have used ribosome profiling data to assess the protein-coding capacity of lncRNAs. However, very little work has been done to identify ribosome-associated lncRNAs, here defined as lncRNAs interacting with ribosomes related to protein synthesis as well as other unclear biological functions. Results: On average, 39.17% of expressed lncRNAs were observed to interact with ribosomes in human and 48.16% in mouse. We developed the ribosomal association index (RAI), which quantifies the evidence for ribosomal associability of lncRNAs over various tissues and cell types, to catalog 691 and 409 lncRNAs that are robustly associated with ribosomes in human and mouse, respectively. Moreover, we identified 78 and 42 lncRNAs with a high probability of coding peptides in human and mouse, respectively. Compared with ribosome-free lncRNAs, ribosome-associated lncRNAs were observed to be more likely to be located in the cytoplasm and more sensitive to nonsense-mediated decay. Conclusion: Our results suggest that RAI can be used as an integrative and evidence-based tool for distinguishing between ribosome-associated and free lncRNAs, providing a valuable resource for the study of lncRNA functions.

    DOI PubMed

  • Solar-panel and parasol strategies shape the proteorhodopsin distribution pattern in marine Flavobacteriia

    Yohei Kumagai, Susumu Yoshizawa, Yu Nakajima, Mai Watanabe, Tsukasa Fukunaga, Yoshitoshi Ogura, Tetsuya Hayashi, Kenshiro Oshima, Masahira Hattori, Masahiko Ikeuchi, Kazuhiro Kogure, Edward F. Delong, Wataru Iwasaki

    ISME Journal   12 ( 5 ) 1329 - 1343  2018.05  [Refereed]  [International journal]


    Proteorhodopsin (PR) is a light-driven proton pump that is found in diverse bacteria and archaea species, and is widespread in marine microbial ecosystems. To date, many studies have suggested the advantage of PR for microorganisms in sunlit environments. The ecophysiological significance of PR is still not fully understood however, including the drivers of PR gene gain, retention, and loss in different marine microbial species. To explore this question we sequenced 21 marine Flavobacteriia genomes of polyphyletic origin, which encompassed both PR-possessing as well as PR-lacking strains. Here, we show that the possession or alternatively the lack of PR genes reflects one of two fundamental adaptive strategies in marine bacteria. Specifically, while PR-possessing bacteria utilize light energy ("solar-panel strategy"), PR-lacking bacteria exclusively possess UV-screening pigment synthesis genes to avoid UV damage and would adapt to microaerobic environment ("parasol strategy"), which also helps explain why PR-possessing bacteria have smaller genomes than those of PR-lacking bacteria. Collectively, our results highlight the different strategies of dealing with light, DNA repair, and oxygen availability that relate to the presence or absence of PR phototrophy.

    DOI PubMed

  • RIblast: an ultrafast RNA-RNA interaction prediction system based on a seed-and-extension approach

    Tsukasa Fukunaga, Michiaki Hamada

    Bioinformatics (Oxford, England)   33 ( 17 ) 2666 - 2674  2017.09  [Refereed]  [International journal]


    Motivation: LncRNAs play important roles in various biological processes. Although more than 58 000 human lncRNA genes have been discovered, most known lncRNAs are still poorly characterized. One approach to understanding the functions of lncRNAs is the detection of the interacting RNA target of each lncRNA. Because experimental detections of comprehensive lncRNA-RNA interactions are difficult, computational prediction of lncRNA-RNA interactions is an indispensable technique. However, the high computational costs of existing RNA-RNA interaction prediction tools prevent their application to large-scale lncRNA datasets.

    DOI PubMed

  • Inactivity periods and postural change speed can explain atypical postural change patterns of Caenorhabditis elegans mutants

    Tsukasa Fukunaga, Wataru Iwasaki

    BMC Bioinformatics   18 ( 1 ) 46  2017.01  [Refereed]


    Background: With rapid advances in genome sequencing and editing technologies, systematic and quantitative analysis of animal behavior is expected to be another key to facilitating data-driven behavioral genetics. The nematode Caenorhabditis elegans is a model organism in this field. Several video-tracking systems are available for automatically recording behavioral data for the nematode, but computational methods for analyzing these data are still under development. Results: In this study, we applied the Gaussian mixture model-based binning method to time-series postural data for 322 C. elegans strains. We revealed that the occurrence patterns of the postural states and the transition patterns among these states have a relationship as expected, and such a relationship must be taken into account to identify strains with atypical behaviors that are different from those of wild type. Based on this observation, we identified several strains that exhibit atypical transition patterns that cannot be fully explained by their occurrence patterns of postural states. Surprisingly, we found that two simple factors (overall acceleration of postural movement and elimination of inactivity periods) explained the behavioral characteristics of strains with very atypical transition patterns; therefore, computational analysis of animal behavior must be accompanied by evaluation of the effects of these simple factors. Finally, we found that the npr-1 and npr-3 mutants have similar behavioral patterns that were not predictable by sequence homology, proving that our data-driven approach can reveal the functions of genes that have not yet been characterized. Conclusion: We propose that elimination of inactivity periods and overall acceleration of postural change speed can explain behavioral phenotypes of strains with very atypical postural transition patterns. Our methods and results constitute guidelines for effectively finding strains that show "truly" interesting behaviors and systematically uncovering novel gene functions by bioimage-informatic approaches.

    DOI PubMed

  • Rtools: a web server for various secondary structural analyses on single RNA sequences

    Michiaki Hamada, Yukiteru Ono, Hisanori Kiryu, Kengo Sato, Yuki Kato, Tsukasa Fukunaga, Ryota Mori, Kiyoshi Asai

    Nucleic acids research   44 ( W1 ) W302 - W307  2016.07  [Refereed]


    The secondary structures, as well as the nucleotide sequences, are the important features of RNA molecules to characterize their functions. According to the thermodynamic model, however, the probability of any secondary structure is very small. As a consequence, any tool to predict the secondary structures of RNAs has limited accuracy. On the other hand, there are a few tools to compensate for the imperfect predictions by calculating and visualizing the secondary structural information from RNA sequences. It is desirable to obtain the rich information from those tools through a friendly interface. We implemented a web server of the tools to predict secondary structures and to calculate various structural features based on the energy models of secondary structures. By just giving an RNA sequence to the web server, the user can get the different types of solutions of the secondary structures, the marginal probabilities such as base-pairing probabilities, loop probabilities and accessibilities of the local bases, the energy changes by arbitrary base mutations as well as the measures for validations of the predicted secondary structures. The web server is available at http://rtools.cbrc.jp, which integrates software tools, CentroidFold, CentroidHomfold, IPKnot, CapR, Raccess, Rchange and RintD.

  • GroupTracker: Video tracking system for multiple animals under severe occlusion

    Tsukasa Fukunaga, Shoko Kubota, Shoji Oda, Wataru Iwasaki

    Computational Biology and Chemistry   57   39 - 45  2015.12  [Refereed]

    Quantitative analysis of behaviors shown by interacting multiple animals can provide a key for revealing high-order functions of their nervous systems. To resolve these complex behaviors, a video tracking system that preserves individual identity even under severe overlap in positions, i.e., occlusion, is needed. We developed GroupTracker, a multiple animal tracking system that accurately tracks individuals even under severe occlusion. As maximum likelihood estimation of a Gaussian mixture model whose components can severely overlap is theoretically an ill-posed problem, we devised an expectation-maximization scheme with additional constraints on the eigenvalues of the covariance matrix of the mixture components. Our system was shown to accurately track multiple medaka (Oryzias latipes) which freely swim around in three dimensions and frequently overlap each other. As an accurate multiple animal tracking system, GroupTracker will contribute to revealing unexplored structures and patterns behind animal interactions. The Java source code of GroupTracker is available at https://sites.google.com/site/fukunagatsu/software/group-tracker.

  • MiFish, a set of universal PCR primers for metabarcoding environmental DNA from fishes: Detection of more than 230 subtropical marine species

    M. Miya, Y. Sato, T. Fukunaga, T. Sado, J. Y. Poulsen, K. Sato, T. Minamoto, S. Yamamoto, H. Yamanaka, H. Araki, M. Kondoh, W. Iwasaki

    Royal Society Open Science   2 ( 7 ) 150088  2015.07  [Refereed]

    We developed a set of universal PCR primers (MiFish-U/E) for metabarcoding environmental DNA (eDNA) from fishes. Primers were designed using aligned whole mitochondrial genome (mitogenome) sequences from 880 species, supplemented by partial mitogenome sequences from 160 elasmobranchs (sharks and rays). The primers target a hypervariable region of the 12S rRNA gene (163–185 bp), which contains sufficient information to identify fishes to taxonomic family, genus and species except for some closely related congeners. To test the versatility of the primers across a diverse range of fishes, we sampled eDNA from four tanks in the Okinawa Churaumi Aquarium with known species compositions, prepared dual-indexed libraries and performed paired-end sequencing of the region using high-throughput next-generation sequencing technologies. Out of the 180 marine fish species contained in the four tanks with reference sequences in a custom database, we detected 168 species (93.3%) distributed across 59 families and 123 genera. These fishes are not only taxonomically diverse, ranging from sharks and rays to higher teleosts, but are also greatly varied in their ecology, including both pelagic and benthic species living in shallow coastal to deep waters. We also sampled natural seawaters around coral reefs near the aquarium and detected 93 fish species using this approach. Of the 93 species, 64 were not detected in the four aquarium tanks, bringing the total number of species detected to 232 (from 70 families and 152 genera). The metabarcoding approach presented here is non-invasive, more efficient, more cost-effective and more sensitive than the traditional survey methods. It has the potential to serve as an alternative (or complementary) tool for biodiversity monitoring that revolutionizes natural resource management and ecological studies of fish communities on larger spatial and temporal scales.

  • CapR: Revealing structural specificities of RNA-binding protein target recognition using CLIP-seq data

    Tsukasa Fukunaga, Haruka Ozaki, Goro Terai, Kiyoshi Asai, Wataru Iwasaki, Hisanori Kiryu

    Genome Biology   15 ( 1 ) R16  2014  [Refereed]

    RNA-binding proteins (RBPs) bind to their target RNA molecules by recognizing specific RNA sequences and structural contexts. The development of CLIP-seq and related protocols has made it possible to exhaustively identify RNA fragments that bind to RBPs. However, no efficient bioinformatics method exists to reveal the structural specificities of RBP-RNA interactions using these data. We present CapR, an efficient algorithm that calculates the probability that each RNA base position is located within each secondary structural context. Using CapR, we demonstrate that several RBPs bind to their target RNA molecules under specific structural contexts.

  • MitoFish and MitoAnnotator: A mitochondrial genome database of fish with an accurate and automatic annotation pipeline

    Wataru Iwasaki, Tsukasa Fukunaga, Ryota Isagozawa, Koichiro Yamada, Yasunobu Maeda, Takashi P. Satoh, Tetsuya Sado, Kohji Mabuchi, Hirohiko Takeshima, Masaki Miya, Mutsumi Nishida

    Molecular Biology and Evolution   30 ( 11 ) 2531 - 2540  2013.11  [Refereed]

    MitoFish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface. © The Author 2013.

  • Evolutionary Origin of the Scombridae (Tunas and Mackerels): Members of a Paleogene Adaptive Radiation with 14 Other Pelagic Fish Families

    Masaki Miya, Matt Friedman, Takashi P. Satoh, Hirohiko Takeshima, Tetsuya Sado, Wataru Iwasaki, Yusuke Yamanoue, Masanori Nakatani, Kohji Mabuchi, Jun G. Inoue, Jan Yde Poulsen, Tsukasa Fukunaga, Yukuto Sato, Mutsumi Nishida

    PLoS ONE   8 ( 9 ) e73535  2013.09  [Refereed]

    Uncertainties surrounding the evolutionary origin of the epipelagic fish family Scombridae (tunas and mackerels) are symptomatic of the difficulties in resolving suprafamilial relationships within Percomorpha, a hyperdiverse teleost radiation that contains approximately 17,000 species placed in 13 ill-defined orders and 269 families. Here we find that scombrids share a common ancestry with 14 families based on (i) bioinformatic analyses using partial mitochondrial and nuclear gene sequences from all percomorphs deposited in GenBank (10,733 sequences) and (ii) subsequent mitogenomic analysis based on 57 species from those targeted 15 families and 67 outgroup taxa. Morphological heterogeneity among these 15 families is so extraordinary that they have been placed in six different perciform suborders. However, members of the 15 families are either coastal or oceanic pelagic in their ecology with diverse modes of life, suggesting that they represent a previously undetected adaptive radiation in the pelagic realm. Time-calibrated phylogenies imply that scombrids originated from a deep-ocean ancestor and began to radiate after the end-Cretaceous when large predatory epipelagic fishes were selective victims of the Cretaceous-Paleogene mass extinction. We name this clade of open-ocean fishes containing Scombridae "Pelagia" in reference to the common habitat preference that links the 15 families. © 2013 Miya et al.

Awards

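  • Excellent Oral Presentation Award

    2021   10th Joint Conference on Informatics in Biology, Medicine and Pharmacology   The inverse Potts model improves accuracy of phylogenetic profiling

    Winner: Tsukasa Fukunaga, Wataru Iwasaki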

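  • Poster Award

    2020   9th Joint Conference on Informatics in Biology, Medicine and Pharmacology   Development of sequence motif detection software based on sequential pattern mining that can guarantee statistical significance

    Winner: Koichi Mori (毛利公一), Haruka Ozaki, Tsukasa Fukunaga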

  • Poster Award

    2016   5th Joint Conference on Informatics in Biology, Medicine and Pharmacology   RIblast: A high-speed RNA-RNA interaction prediction system for comprehensive lncRNA interactome analysis.

    Winner: Tsukasa Fukunaga, Michiaki Hamada

Research Projects

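  • Elucidating the functions of long noncoding RNAs based on de novo discovery of repeat elements

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)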

    Project Year :

    2020.04
    -
    2023.03
     

    Michiaki Hamada, Masahiro Onoguchi (小野口 真広), Tsukasa Fukunaga

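  • Function prediction of function-unknown microbial genes based on the inverse Ising model method

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)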

    Project Year :

    2020.04
    -
    2022.03
     

    Tsukasa Fukunaga

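  • Function prediction of function-unknown genes based on statistical logical relationship analysis

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Early-Career Scientists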

    Project Year :

    2019.04
    -
    2022.03
     

    Tsukasa Fukunaga

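  • Function prediction of lncRNAs based on lncRNA-mRNA interaction networks

    Japan Society for the Promotion of Science  KAKENHI (Grant-in-Aid for Scientific Research on Innovative Areas)  "Non-coding RNA Neo-taxonomy" publicly offered research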

    Project Year :

    2017.04
    -
    2019.03
     

    Tsukasa Fukunaga

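  • Elucidating the mechanisms of animal group formation through computational ethology

    Japan Society for the Promotion of Science  KAKENHI (Grant-in-Aid for JSPS Fellows)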

    Project Year :

    2016.04
    -
    2019.03
     

    Tsukasa Fukunaga

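  • Social behavior of medaka approached through quantitative video data analysis

    Japan Society for the Promotion of Science  KAKENHI (Grant-in-Aid for JSPS Fellows)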

    Project Year :

    2015.04
    -
    2016.03
     

    Tsukasa Fukunaga

Presentations

  • The inverse Potts model improves accuracy of phylogenetic profiling

    Tsukasa Fukunaga, Wataru Iwasaki

    10th Joint Conference on Informatics in Biology, Medicine and Pharmacology

    Presentation date: 2021.09

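  • Reconstruction of genome evolutionary history considering heterogeneity in gene gain/loss rates

    Tsukasa Fukunaga, Wataru Iwasaki  [Invited]

    23rd Annual Meeting of the Society of Evolutionary Studies, Japan (Tokyo)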

    Presentation date: 2021.08

 

Syllabus

 

Committee Memberships

  • 2021.03
    -
    Now

    Frontiers in Genetics  Review Editor

  • 2021.04
    -
    2023.03

    Japanese Society for Bioinformatics  Director