Updated on 2022/11/28


 
MORISHIMA, Shigeo
 
Scopus Paper Info  
Paper Count: 0  Citation Count: 0  h-index: 12

The citation count denotes the number of citations received by papers published in a particular year.

Affiliation
Faculty of Science and Engineering, School of Advanced Science and Engineering
Job title
Professor
Mail Address

Concurrent Post

  • Faculty of Science and Engineering   Graduate School of Advanced Science and Engineering

  • Affiliated organization   Global Education Center

Research Institute

  • 2020
    -
    2022

    Waseda Research Institute for Science and Engineering   Concurrent Researcher

Education

  • 1982.04
    -
    1987.03

    University of Tokyo   Graduate School of Engineering   Department of Electronics

  • 1978.04
    -
    1982.03

    University of Tokyo   Faculty of Engineering   Department of Electronics  

Degree

  • Doctor of Engineering

Research Experience

  • 2004.04
    -
    Now

    Waseda University   Faculty of Science and Engineering   Professor

  • 2010.04
    -
    2014.03

    NICT   Spoken Language and Communication Research Institute   Invited Researcher

  • 1999.04
    -
    2010.03

    ATR   Invited Researcher

  • 2001.04
    -
    2004.03

    Seikei University   Faculty of Engineering   Professor

  • 1988.04
    -
    2001.03

    Seikei University   Faculty of Engineering   Associate Professor

  • 1994.07
    -
    1995.08

    University of Toronto   Department of Computer Science   Visiting Professor

  • 1987.04
    -
    1988.03

    Seikei University   Faculty of Engineering   Full-time Lecturer


Professional Memberships

  • THE SOCIETY FOR ART AND SCIENCE

  • JAPANESE ACADEMY OF FACIAL STUDIES

  • THE JAPANESE PSYCHOLOGICAL ASSOCIATION

  • THE INSTITUTE OF IMAGE INFORMATION AND TELEVISION ENGINEERS

  • ACOUSTICAL SOCIETY OF JAPAN

  • INFORMATION PROCESSING SOCIETY OF JAPAN

  • THE INSTITUTE OF IMAGE ELECTRONICS ENGINEERS OF JAPAN

  • IEEE

  • ACM


 

Research Areas

  • Intelligent informatics

Research Interests

  • Deep Learning

  • Audio Signal Processing

  • Face Image Processing

  • Multimedia Information Processing

  • Human Computer Interaction

  • Computer Vision

  • Computer Graphics


Papers

  • The Sound of Bounding-Boxes.

    Takashi Oya, Shohei Iwase, Shigeo Morishima

    CoRR   abs/2203.15991  2022

     View Summary

    In the task of audio-visual sound source separation, which leverages visual
    information for sound source separation, identifying objects in an image is a
    crucial step prior to separating the sound source. However, existing methods
    that assign sound on detected bounding boxes suffer from a problem that their
    approach heavily relies on pre-trained object detectors. Specifically, when
    using these existing methods, it is required to predetermine all the possible
    categories of objects that can produce sound and use an object detector
    applicable to all such categories. To tackle this problem, we propose a fully
    unsupervised method that learns to detect objects in an image and separate
    sound source simultaneously. As our method does not rely on any pre-trained
    detector, our method is applicable to arbitrary categories without any
    additional annotation. Furthermore, despite being fully unsupervised, we found
    that our method performs comparably to existing methods in separation accuracy.

    DOI

  • Community-Driven Comprehensive Scientific Paper Summarization: Insight from cvpaper.challenge.

    Shintaro Yamamoto, Hirokatsu Kataoka, Ryota Suzuki, Seitaro Shinagawa, Shigeo Morishima

    CoRR   abs/2203.09109  2022

     View Summary

    The present paper introduces a group activity involving writing summaries of
    conference proceedings by volunteer participants. The rapid increase in
    scientific papers is a heavy burden for researchers, especially non-native
    speakers, who need to survey scientific literature. To alleviate this problem,
    we organized a group of non-native English speakers to write summaries of
    papers presented at a computer vision conference to share the knowledge of the
    papers read by the group. We summarized a total of 2,000 papers presented at
    the Conference on Computer Vision and Pattern Recognition, a top-tier
    conference on computer vision, in 2019 and 2020. We quantitatively analyzed
    participants' selection regarding which papers they read among the many
    available papers. The experimental results suggest that we can summarize a wide
    range of papers without asking participants to read papers unrelated to their
    interests.

    DOI

  • 360 Depth Estimation in the Wild - The Depth360 Dataset and the SegFuse Network.

    Qi Feng, Hubert P. H. Shum, Shigeo Morishima

    CoRR   abs/2202.08010  2022

  • 360 Depth Estimation in the Wild - the Depth360 Dataset and the SegFuse Network.

    Qi Feng, Hubert P. H. Shum, Shigeo Morishima

    VR     664 - 673  2022

     View Summary

    Single-view depth estimation from omnidirectional images has gained
    popularity with its wide range of applications such as autonomous driving and
    scene reconstruction. Although data-driven learning-based methods demonstrate
    significant potential in this field, scarce training data and ineffective 360
    estimation algorithms are still two key limitations hindering accurate
    estimation across diverse domains. In this work, we first establish a
    large-scale dataset with varied settings called Depth360 to tackle the training
    data problem. This is achieved by exploring the use of a plenteous source of
    data, 360 videos from the internet, using a test-time training method that
    leverages unique information in each omnidirectional sequence. With novel
    geometric and temporal constraints, our method generates consistent and
    convincing depth samples to facilitate single-view estimation. We then propose
    an end-to-end two-branch multi-task learning network, SegFuse, that mimics the
    human eye to effectively learn from the dataset and estimate high-quality depth
    maps from diverse monocular RGB images. With a peripheral branch that uses
    equirectangular projection for depth estimation and a foveal branch that uses
    cubemap projection for semantic segmentation, our method predicts consistent
    global depth while maintaining sharp details at local regions. Experimental
    results show favorable performance against the state-of-the-art methods.

    DOI

    Scopus

  • 3D car shape reconstruction from a contour sketch using GAN and lazy learning.

    Naoki Nozawa, Hubert P. H. Shum, Qi Feng, Edmond S. L. Ho, Shigeo Morishima

    Vis. Comput.   38 ( 4 ) 1317 - 1330  2022

     View Summary

    3D car models are heavily used in computer games, visual effects, and even automotive designs. As a result, producing such models with minimal labour costs is increasingly more important. To tackle the challenge, we propose a novel system to reconstruct a 3D car using a single sketch image. The system learns from a synthetic database of 3D car models and their corresponding 2D contour sketches and segmentation masks, allowing effective training with minimal data collection cost. The core of the system is a machine learning pipeline that combines the use of a generative adversarial network (GAN) and lazy learning. GAN, being a deep learning method, is capable of modelling complicated data distributions, enabling the effective modelling of a large variety of cars. Its major weakness is that, as a global method, modelling the fine details in the local region is challenging. Lazy learning works well to preserve local features by generating a local subspace with relevant data samples. We demonstrate that the combined use of GAN and lazy learning is able to produce high-quality results, in which different types of cars with complicated local features can be generated effectively with a single sketch. Our method outperforms existing ones using other machine learning structures such as the variational autoencoder.

    DOI

    Scopus

    3 Citations (Scopus)

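A minimal sketch of the lazy-learning step described in the summary above: a prediction is refined inside a small local subspace built from a few nearest database samples. This is not the authors' implementation; the descriptor size, neighbour count, and synthetic data are assumptions made purely for illustration.

```python
# Illustrative sketch of lazy learning with a local subspace (hypothetical data).
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
database = rng.normal(size=(500, 64))   # hypothetical shape descriptors of stored cars
query = rng.normal(size=(1, 64))        # descriptor of the car inferred from the sketch

# 1. Retrieve a handful of relevant samples (the "lazy" step).
knn = NearestNeighbors(n_neighbors=8).fit(database)
_, idx = knn.kneighbors(query)
local_samples = database[idx[0]]

# 2. Build a small local subspace around those samples.
pca = PCA(n_components=4).fit(local_samples)

# 3. Project the query into the subspace and reconstruct it; fine detail can be
#    represented with few coefficients because the subspace is local.
coeffs = pca.transform(query)
refined = pca.inverse_transform(coeffs)
print(coeffs.shape, refined.shape)
```
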
  • Audio-Oriented Video Interpolation Using Key Pose

    Takayuki Nakatsuka, Yukitaka Tsuchiya, Masatoshi Hamanaka, Shigeo Morishima

    International Journal of Pattern Recognition and Artificial Intelligence   35 ( 16 )  2021.12

     View Summary

    This paper describes a deep learning-based method for long-term video interpolation that generates intermediate frames between two music performance videos of a person playing a specific instrument. Recent advances in deep learning techniques have successfully generated realistic images with high-fidelity and high-resolution in short-term video interpolation. However, there is still room for improvement in long-term video interpolation due to lack of resolution and temporal consistency of the generated video. Particularly in music performance videos, the music and human performance motion need to be synchronized. We solved these problems by using human poses and music features essential for music performance in long-term video interpolation. By closely matching human poses with music and videos, it is possible to generate intermediate frames that synchronize with the music. Specifically, we obtain the human poses of the last frame of the first video and the first frame of the second video in the performance videos to be interpolated as key poses. Then, our encoder-decoder network estimates the human poses in the intermediate frames from the obtained key poses, with the music features as the condition. In order to construct an end-to-end network, we utilize a differentiable network that transforms the estimated human poses in vector form into the human pose in image form, such as human stick figures. Finally, a video-to-video synthesis network uses the stick figures to generate intermediate frames between two music performance videos. We found that the generated performance videos were of higher quality than the baseline method through quantitative experiments.

    DOI

    Scopus

  • Analysis of Use of Figures and Tables in Computer Vision Papers Using Image Recognition Technique

    YAMAMOTO Shintaro, SUZUKI Ryota, SHINAGAWA Seitaro, KATAOKA Hirokatsu, MORISHIMA Shigeo

    Journal of the Japan Society for Precision Engineering   87 ( 12 ) 995 - 1002  2021.12

     View Summary

    In scientific publications, information is conveyed in the form of figures and tables as well as text. Among the many fields of research, computer vision focuses on visual information such as images and videos. Therefore, figures and tables are useful for conveying information in computer vision papers, for example images used in the experiments or outputs of the proposed method. In the field of computer vision, conference papers are important, which differs from other fields where journal publications are considered important. In this work, we study the use of figures and tables in computer vision papers in conference proceedings. We utilize object detection and image recognition techniques to extract and label figures in papers. We conducted the experiments from five aspects: (1) comparison with other fields, (2) comparison among different conferences in computer vision, (3) comparison with workshop papers, (4) temporal change, and (5) comparison among different research topics in computer vision. Through the experiments, we observed that the use of figures and tables changed between 2013 and 2020. We also revealed that the tendency in the use of figures differs among topics even within computer vision papers.

    DOI CiNii

  • Bowing-Net: Motion Generation for String Instruments Based on Bowing Information

    Asuka Hirata, Keitaro Tanaka, Ryo Shimamura, Shigeo Morishima

    Special Interest Group on Computer Graphics and Interactive Techniques Conference Posters, SIGGRAPH 2021    2021.08

     View Summary

    This paper presents a deep learning based method that generates body motion for string instrument performance from raw audio. In contrast to prior methods which aim to predict joint position from audio, we first estimate information that dictates the bowing dynamics, such as the bow direction and the played string. The final body motion is then determined from this information following a conversion rule. By adopting the bowing information as the target domain, not only is learning the mapping more feasible, but also the produced results have bowing dynamics that are consistent with the given audio. We confirmed that our results are superior to existing methods through extensive experiments.

    DOI

    Scopus

    1 Citation (Scopus)

  • A case study on user evaluation of scientific publication summarization by Japanese students

    Shintaro Yamamoto, Ryota Suzuki, Tsukasa Fukusato, Hirokatsu Kataoka, Shigeo Morishima

    Applied Sciences (Switzerland)   11 ( 14 )  2021.07

     View Summary

    Summaries of scientific publications enable readers to gain an overview of a large number of studies, but users' preferences have not yet been explored. In this paper, we conduct two user studies (i.e., short- and long-term studies) where Japanese university students read summaries of English research articles that were either manually written or automatically generated using text summarization and/or machine translation. In the short-term experiment, subjects compared and evaluated the two types of summaries of the same article. We analyze the characteristics of the generated summaries that readers regard as important, such as content richness and simplicity. The experimental results show that subjects mainly judged the summaries based on four criteria: content richness, simplicity, fluency, and format. In the long-term experiment, subjects read 50 summaries and answered whether they would like to read the original papers after reading the summaries. We discuss the characteristics of the summaries that readers tend to use to determine whether to read the papers, such as topic, methods, and results. The comments from subjects indicate that specific components of scientific publications, including research topics and methods, are important in judging whether or not to read a paper. Our study provides insights to enhance the effectiveness of automatic summarization of scientific publications.

    DOI

    Scopus

    1 Citation (Scopus)

  • Pitch-Timbre Disentanglement Of Musical Instrument Sounds Based On Vae-Based Metric Learning

    Keitaro Tanaka, Ryo Nishikimi, Yoshiaki Bando, Kazuyoshi Yoshii, Shigeo Morishima

    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)   2021-June   111 - 115  2021.06

     View Summary

    This paper describes a representation learning method for disentangling an arbitrary musical instrument sound into latent pitch and timbre representations. Although such pitch-timbre disentanglement has been achieved with a variational autoencoder (VAE), especially for a predefined set of musical instruments, the latent pitch and timbre representations are outspread, making them hard to interpret. To mitigate this problem, we introduce a metric learning technique into a VAE with latent pitch and timbre spaces so that similar (different) pitches or timbres are mapped close to (far from) each other. Specifically, our VAE is trained with additional contrastive losses so that the latent distances between two arbitrary sounds of the same pitch or timbre are minimized, and those of different pitches or timbres are maximized. This training is performed under weak supervision that uses only whether the pitches and timbres of two sounds are the same or not, instead of their actual values. This improves the generalization capability for unseen musical instruments. Experimental results show that the proposed method can find better-structured disentangled representations with pitch and timbre clusters even for unseen musical instruments.

    DOI
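The contrastive part of the metric learning described above can be sketched as follows. This is an illustrative sketch rather than the paper's implementation; the embedding sizes, margin, and random inputs are assumptions, and in the paper this kind of term would be combined with the usual VAE objective.

```python
# Sketch: pull latent embeddings of sounds with the same pitch (or timbre) together
# and push different ones apart, using only same/different labels as weak supervision.
import torch
import torch.nn.functional as F

def contrastive_loss(z_a, z_b, same, margin=1.0):
    """z_a, z_b: (batch, dim) latent vectors; same: (batch,) 1 if same pitch/timbre."""
    dist = F.pairwise_distance(z_a, z_b)
    pull = same * dist.pow(2)                           # same pitch/timbre -> minimize distance
    push = (1 - same) * F.relu(margin - dist).pow(2)    # different -> keep at least `margin` apart
    return (pull + push).mean()

# Toy usage with random "pitch-space" embeddings (dimensions are assumptions).
z1 = torch.randn(16, 32, requires_grad=True)
z2 = torch.randn(16, 32, requires_grad=True)
same_pitch = torch.randint(0, 2, (16,)).float()
loss = contrastive_loss(z1, z2, same_pitch)
loss.backward()   # in training, this term would be added to the VAE (ELBO) loss
print(float(loss))
```
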

  • LSTM-SAKT: LSTM-Encoded SAKT-like Transformer for Knowledge Tracing

    Takashi Oya, Shigeo Morishima

    CoRR   abs/2102.00845  2021.01

     View Summary

    This paper introduces the 2nd-place solution for the Riiid! Answer
    Correctness Prediction competition on Kaggle, the world's largest data
    science competition website. The competition was held from October 16, 2020,
    to January 7, 2021, with 3395 teams and 4387 competitors. The main insights
    and contributions of this paper are as follows. (i) We pointed out that
    existing Transformer-based models suffer from a limit on the information
    their query/key/value vectors can contain. To solve this problem, we proposed
    a method that uses an LSTM to obtain the query/key/value and verified its
    effectiveness. (ii) We pointed out the 'inter-container' leakage problem,
    which occurs in datasets where questions are sometimes served together. To
    solve this problem, we showed special indexing/masking techniques that are
    useful when using RNN variants and Transformers. (iii) We found that
    additional hand-crafted features are effective for overcoming a limit of the
    Transformer, which can never consider samples older than the sequence length.
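Point (i) above, using an LSTM to produce the query/key/value for attention, can be sketched roughly as follows; this is not the competition code, and the model width, head count, and toy inputs are assumptions.

```python
# Sketch: an LSTM encodes the interaction sequence, and its outputs are used as
# query/key/value for self-attention, so each position carries sequence context
# rather than a single embedded item.
import torch
import torch.nn as nn

class LSTMEncodedAttention(nn.Module):
    def __init__(self, d_model=128, n_heads=8):
        super().__init__()
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x, causal_mask):
        h, _ = self.lstm(x)                          # (batch, seq, d_model), context-aware
        out, _ = self.attn(h, h, h, attn_mask=causal_mask)
        return out

seq_len = 10
x = torch.randn(4, seq_len, 128)                     # hypothetical embedded interactions
mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
y = LSTMEncodedAttention()(x, mask)
print(y.shape)  # torch.Size([4, 10, 128])
```
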

  • Property analysis of adversarially robust representation

    Yoshihiro Fukuhara, Takahiro Itazuri, Hirokatsu Kataoka, Shigeo Morishima

    Seimitsu Kogaku Kaishi/Journal of the Japan Society for Precision Engineering   87 ( 1 ) 83 - 91  2021.01

     View Summary

    In this paper, we address the open question: "What do adversarially robust models look at?" Recently, it has been reported in many works that there exists the trade-off between standard accuracy and adversarial robustness. According to prior works, this trade-off is rooted in the fact that adversarially robust and standard accurate models might depend on very different sets of features. However, it has not been well studied what kind of difference actually exists. In this paper, we analyze this difference through various experiments visually and quantitatively. Experimental results show that adversarially robust models look at things at a larger scale than standard models and pay less attention to fine textures. Furthermore, although it has been claimed that adversarially robust features are not compatible with standard accuracy, there is even a positive effect by using them as pre-trained models particularly in low resolution datasets.

    DOI

    Scopus

  • RLTutor: Reinforcement Learning Based Adaptive Tutoring System by Modeling Virtual Student with Fewer Interactions.

    Yoshiki Kubotani, Yoshihiro Fukuhara, Shigeo Morishima

    CoRR   abs/2108.00268  2021

     View Summary

    A major challenge in the field of education is providing review schedules
    that present learned items at appropriate intervals to each student so that
    memory is retained over time. In recent years, attempts have been made to
    formulate item reviews as sequential decision-making problems to realize
    adaptive instruction based on the knowledge state of students. It has been
    reported previously that reinforcement learning can help realize mathematical
    models of students' learning strategies to maintain a high memory rate. However,
    optimization using reinforcement learning requires a large number of
    interactions, and thus it cannot be applied directly to actual students. In
    this study, we propose a framework for optimizing teaching strategies by
    constructing a virtual model of the student while minimizing the interaction
    with the actual teaching target. In addition, we conducted an experiment
    considering actual instructions using the mathematical model and confirmed that
    the model performance is comparable to that of conventional teaching methods.
    Our framework can directly substitute mathematical models used in experiments
    with human students, and our results can serve as a buffer between theoretical
    instructional optimization and practical applications in e-learning systems.

  • Comprehending Research Article in Minutes: A User Study of Reading Computer Generated Summary for Young Researchers.

    Shintaro Yamamoto, Ryota Suzuki, Hirokatsu Kataoka, Shigeo Morishima

    Human Interface and the Management of Information. Information Presentation and Visualization - Thematic Area   12765 LNCS   101 - 112  2021

     View Summary

    The automatic summarization of scientific papers, to assist researchers in conducting literature surveys, has garnered significant attention because of the rapid increase in the number of scientific articles published each year. However, whether and how these summaries actually help readers in comprehending scientific papers has not been examined yet. In this work, we study the effectiveness of automatically generated summaries of scientific papers for students who do not have sufficient knowledge in research. We asked six students, enrolled in bachelor’s and master’s programs in Japan, to prepare a presentation on a scientific paper by providing them either the article alone, or the article and its summary generated by an automatic summarization system, after 15 min of reading time. The comprehension of an article was judged by four evaluators based on the participant’s presentation. The experimental results show that the completeness of the comprehension of the four participants was higher overall when the summary of the paper was provided. In addition, four participants, including the two whose completeness score reduced when the summary was provided, mentioned that the summary is helpful to comprehend a research article within a limited time span.

    DOI

    Scopus

  • LineChaser: A Smartphone-Based Navigation System for Blind People to Stand in Lines.

    Masaki Kuribayashi, Seita Kayukawa, Hironobu Takagi, Chieko Asakawa, Shigeo Morishima

    CHI '21: CHI Conference on Human Factors in Computing Systems(CHI)     33 - 13  2021

    DOI

  • Self-Supervised Learning for Visual Summary Identification in Scientific Publications.

    Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavas, Shigeo Morishima

    Proceedings of the 11th International Workshop on Bibliometric-enhanced Information Retrieval co-located with 43rd European Conference on Information Retrieval (ECIR 2021)(BIR@ECIR)     5 - 19  2021

  • Visual Summary Identification From Scientific Publications via Self-Supervised Learning.

    Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavas, Shigeo Morishima

    Frontiers in Research Metrics and Analytics   6   719004 - 719004  2021  [International journal]

     View Summary

    The exponential growth of scientific literature yields the need to support users in both effectively and efficiently analyzing and understanding the body of research work. This exploratory process can be facilitated by providing graphical abstracts: a visual summary of a scientific publication. Accordingly, previous work recently presented an initial study on automatic identification of a central figure in a scientific publication, to be used as the publication's visual summary. This study, however, has been limited to a single (biomedical) domain. This is primarily because the current state of the art relies on supervised machine learning, which typically requires large amounts of labeled data: the only annotated data set existing until now covered only biomedical publications. In this work, we build a novel benchmark data set for visual summary identification from scientific publications, which consists of papers presented at conferences from several areas of computer science. We couple this contribution with a new self-supervised learning approach that learns a heuristic matching of in-text references to figures with figure captions. Our self-supervised pre-training, executed on a large unlabeled collection of publications, attenuates the need for large annotated data sets for visual summary identification and facilitates domain transfer for this task. We evaluate our self-supervised pre-training for visual summary identification on both the existing biomedical data set and our newly presented computer science data set. The experimental results suggest that the proposed method is able to outperform the previous state of the art without any task-specific annotations.

    DOI PubMed

  • Vocal-Accompaniment Compatibility Estimation Using Self-Supervised and Joint-Embedding Techniques

    Takayuki Nakatsuka, Kento Watanabe, Yuki Koyama, Masahiro Hamasaki, Masataka Goto, Shigeo Morishima

    IEEE ACCESS   9   101994 - 102003  2021

     View Summary

    We propose a learning-based method of estimating the compatibility between vocal and accompaniment audio tracks, i.e., how well they go with each other when played simultaneously. This task is challenging because it is difficult to formulate hand-crafted rules or construct a large labeled dataset to perform supervised learning. Our method uses self-supervised and joint-embedding techniques for estimating vocal-accompaniment compatibility. We train vocal and accompaniment encoders to learn a joint-embedding space of vocal and accompaniment tracks, where the embedded feature vectors of a compatible pair of vocal and accompaniment tracks lie close to each other and those of an incompatible pair lie far from each other. To address the lack of large labeled datasets consisting of compatible and incompatible pairs of vocal and accompaniment tracks, we propose generating such a dataset from songs using singing voice separation techniques, with which songs are separated into pairs of vocal and accompaniment tracks, and then original pairs are assumed to be compatible, and other random pairs are not. We achieved this training by constructing a large dataset containing 910,803 songs and evaluated the effectiveness of our method using ranking-based evaluation methods.

    DOI

    Scopus
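The self-supervised pairing scheme described above can be sketched as follows: separated vocal and accompaniment tracks of the same song act as compatible pairs, and random re-pairings act as incompatible ones. This is only an illustrative sketch, not the paper's implementation; the feature sizes, encoder architectures, and similarity-based loss are assumptions.

```python
# Sketch: two encoders map vocal and accompaniment features into a joint space,
# trained so that cosine similarity predicts compatibility.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocal_enc = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 64))
accomp_enc = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 64))

vocal_feat = torch.randn(32, 512)     # hypothetical per-track audio features
accomp_feat = torch.randn(32, 512)    # accompaniment features of the same songs

pos_sim = F.cosine_similarity(vocal_enc(vocal_feat), accomp_enc(accomp_feat))
neg_sim = F.cosine_similarity(vocal_enc(vocal_feat),
                              accomp_enc(accomp_feat[torch.randperm(32)]))

# Compatible (original) pairs should score high, shuffled pairs low.
logits = torch.cat([pos_sim, neg_sim])
labels = torch.cat([torch.ones(32), torch.zeros(32)])
loss = F.binary_cross_entropy_with_logits(logits, labels)
loss.backward()
print(float(loss))
```
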

  • MirrorNet: A Deep Reflective Approach to 2D Pose Estimation for Single-Person Images

    Takayuki Nakatsuka, Kazuyoshi Yoshii, Yuki Koyama, Satoru Fukayama, Masataka Goto, Shigeo Morishima

    Journal of Information Processing   29   406 - 423  2021

     View Summary

    This paper proposes a statistical approach to 2D pose estimation from human images. The main problems with the standard supervised approach, which is based on a deep recognition (image-to-pose) model, are that it often yields anatomically implausible poses, and its performance is limited by the amount of paired data. To solve these problems, we propose a semi-supervised method that can make effective use of images with and without pose annotations. Specifically, we formulate a hierarchical generative model of poses and images by integrating a deep generative model of poses from pose features with that of images from poses and image features. We then introduce a deep recognition model that infers poses from images. Given images as observed data, these models can be trained jointly in a hierarchical variational autoencoding (image-to-pose-to-feature-to-pose-to-image) manner. The results of experiments show that the proposed reflective architecture makes estimated poses anatomically plausible, and the pose estimation performance is improved by integrating the recognition and generative models and also by feeding non-annotated images.

    DOI

  • Self-supervised learning for visual summary identification in scientific publications

    Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš, Shigeo Morishima

    CEUR Workshop Proceedings   2847   5 - 19  2021

     View Summary

    Providing visual summaries of scientific publications can increase information access for readers and thereby help deal with the exponential growth in the number of scientific publications. Nonetheless, efforts in providing visual publication summaries have been few and far apart, primarily focusing on the biomedical domain. This is primarily because of the limited availability of annotated gold standards, which hampers the application of robust and high-performing supervised learning techniques. To address these problems we create a new benchmark dataset for selecting figures to serve as visual summaries of publications based on their abstracts, covering several domains in computer science. Moreover, we develop a self-supervised learning approach, based on heuristic matching of inline references to figures with figure captions. Experiments in both biomedical and computer science domains show that our model is able to outperform the state of the art despite being self-supervised and therefore not relying on any annotated training data.
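One plausible reading of the heuristic matching mentioned above, pairing sentences that reference a figure with that figure's caption, is sketched below; this is an interpretation for illustration, not the released code, and the example sentences and captions are invented.

```python
# Sketch: sentences with explicit inline figure references become positive
# (sentence, caption) training pairs for self-supervision.
import re

body_sentences = [
    "Figure 2 shows the overall architecture of our network.",
    "We train the model on two datasets.",
    "As shown in Fig. 3, accuracy improves with more data.",
]
captions = {2: "Overview of the proposed architecture.",
            3: "Accuracy as a function of training-set size."}

pairs = []
for sent in body_sentences:
    for num in re.findall(r"(?:Figure|Fig\.)\s*(\d+)", sent):
        num = int(num)
        if num in captions:
            pairs.append((sent, captions[num]))   # positive training pair

print(pairs)
```
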

  • Do We Need Sound for Sound Source Localization?

    Takashi Oya, Shohei Iwase, Ryota Natsume, Takahiro Itazuri, Shugo Yamaguchi, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   12627 LNCS   119 - 136  2021

     View Summary

    During the performance of sound source localization which uses both visual and aural information, it presently remains unclear how much either image or sound modalities contribute to the result, i.e. do we need both image and sound for sound source localization? To address this question, we develop an unsupervised learning system that solves sound source localization by decomposing this task into two steps: (i) “potential sound source localization”, a step that localizes possible sound sources using only visual information (ii) “object selection”, a step that identifies which objects are actually sounding using aural information. Our overall system achieves state-of-the-art performance in sound source localization, and more importantly, we find that despite the constraint on available information, the results of (i) achieve similar performance. From this observation and further experiments, we show that visual information is dominant in “sound” source localization when evaluated with the currently adopted benchmark dataset. Moreover, we show that the majority of sound-producing objects within the samples in this dataset can be inherently identified using only visual information, and thus that the dataset is inadequate to evaluate a system’s capability to leverage aural information. As an alternative, we present an evaluation protocol that enforces both visual and aural information to be leveraged, and verify this property through several experiments.

    DOI

    Scopus

    3 Citations (Scopus)

  • Song2Face: Synthesizing Singing Facial Animation from Audio

    Shohei Iwase, Takuya Kato, Shugo Yamaguchi, Tsuchiya Yukitaka, Shigeo Morishima

    SIGGRAPH Asia 2020 Technical Communications, SA 2020    2020.12  [Refereed]

     View Summary

    We present Song2Face, a deep neural network capable of producing singing facial animation from an input of singing voice and singer label. The network architecture is built upon our insight that, although facial expression when singing varies between different individuals, singing voices store valuable information such as pitch, breath, and vibrato that expressions may be attributed to. Therefore, our network consists of an encoder that extracts relevant vocal features from audio, and a regression network conditioned on a singer label that predicts control parameters for facial animation. In contrast to prior audio-driven speech animation methods, which initially map audio to text-level features, we show that vocal features can be directly learned from singing voice without any explicit constraints. Our network is capable of producing movements for all parts of the face as well as rotational movement of the head itself. Furthermore, stylistic differences in expression between different singers are captured via the singer label, and thus the resulting animation's singing style can be manipulated at test time.

    DOI

    Scopus

  • Audio-visual object removal in 360-degree videos

    Ryo Shimamura, Qi Feng, Yuki Koyama, Takayuki Nakatsuka, Satoru Fukayama, Masahiro Hamasaki, Masataka Goto, Shigeo Morishima

    VISUAL COMPUTER   36 ( 10-12 ) 2117 - 2128  2020.10  [Refereed]

     View Summary

    We present a novel concept, audio-visual object removal in 360-degree videos, in which a target object in a 360-degree video is removed in both the visual and auditory domains synchronously. Previous methods have solely focused on the visual aspect of object removal using video inpainting techniques, resulting in videos with unreasonable remaining sounds corresponding to the removed objects. We propose a solution which incorporates the direction acquired during the video inpainting process into the audio removal process. More specifically, our method identifies the sound corresponding to the visually tracked target object and then synthesizes a three-dimensional sound field by subtracting the identified sound from the input 360-degree video. We conducted a user study showing that our multi-modal object removal supporting both visual and auditory domains could significantly improve the virtual reality experience, and our method could generate sufficiently synchronous, natural and satisfactory 360-degree videos.

    DOI

    Scopus

    2 Citations (Scopus)

  • LinSSS: linear decomposition of heterogeneous subsurface scattering for real-time screen-space rendering

    Tatsuya Yatagawa, Yasushi Yamaguchi, Shigeo Morishima

    VISUAL COMPUTER   36 ( 10-12 ) 1979 - 1992  2020.10  [Refereed]

     View Summary

    Screen-space subsurface scattering is currently the most common approach to represent translucent materials in real-time rendering. However, most of the current approaches approximate the diffuse reflectance profile of translucent materials as a symmetric function, whereas the profile has an asymmetric shape in nature. To address this problem, we propose LinSSS, a numerical representation of heterogeneous subsurface scattering for real-time screen-space rendering. Although our representation is built upon a previous method, it makes two contributions. First, LinSSS formulates the diffuse reflectance profile as a linear combination of radially symmetric Gaussian functions. Nevertheless, it can also represent the spatial variation and the radial asymmetry of the profile. Second, since LinSSS is formulated using only the Gaussian functions, the convolution of the diffuse reflectance profile can be efficiently calculated in screen space. To further improve the efficiency, we deform the rendering equation obtained using LinSSS by factoring common convolution terms and approximate the convolution processes using a MIP map. Consequently, our method works as fast as the state-of-the-art method, while our method successfully represents the heterogeneity of scattering.

    DOI

    Scopus

    1 Citation (Scopus)

  • Guiding Blind Pedestrians in Public Spaces by Understanding Walking Behavior of Nearby Pedestrians

    Seita Kayukawa, Tatsuya Ishihara, Hironobu Takagi, Shigeo Morishima, Chieko Asakawa

    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies   4 ( 3 ) 85 - 22  2020.09  [Refereed]

     View Summary

    We present a guiding system to help blind people walk in public spaces while making their walking seamless with nearby pedestrians. Blind users carry a rolling suitcase-shaped system that has two RGBD cameras, an inertial measurement unit (IMU) sensor, and a light detection and ranging (LiDAR) sensor. The system senses the behavior of surrounding pedestrians, predicts risks of collisions, and alerts users to help them avoid collisions. It has two modes: the "on-path" mode, which helps users avoid collisions without changing their path by adapting their walking speed, and the "off-path" mode, which navigates an alternative path to go around pedestrians standing in the way. Auditory and tactile modalities have been commonly used for non-visual navigation systems, so we implemented two interfaces to evaluate the effectiveness of each modality for collision avoidance. A user study with 14 blind participants in public spaces revealed that participants could successfully avoid collisions with both modalities. We detail the characteristics of each modality.

    DOI

    Scopus

    12 Citations (Scopus)

  • Resolving hand-object occlusion for mixed reality with joint deep learning and model optimization

    Qi Feng, Hubert P. H. Shum, Shigeo Morishima

    COMPUTER ANIMATION AND VIRTUAL WORLDS   31 ( 4-5 )  2020.07  [Refereed]

     View Summary

    By overlaying virtual imagery onto the real world, mixed reality facilitates diverse applications and has drawn increasing attention. Enhancing physical in-hand objects with a virtual appearance is a key component for many applications that require users to interact with tools such as surgery simulations. However, due to complex hand articulations and severe hand-object occlusions, resolving occlusions in hand-object interactions is a challenging topic. Traditional tracking-based approaches are limited by strong ambiguities from occlusions and changing shapes, while reconstruction-based methods show a poor capability of handling dynamic scenes. In this article, we propose a novel real-time optimization system to resolve hand-object occlusions by spatially reconstructing the scene with estimated hand joints and masks. To acquire accurate results, we propose a joint learning process that shares information between two models and jointly estimates hand poses and semantic segmentation. To facilitate the joint learning system and improve its accuracy under occlusions, we propose an occlusion-aware RGB-D hand data set that mitigates the ambiguity through precise annotations and photorealistic appearance. Evaluations show more consistent overlays compared with literature, and a user study verifies a more realistic experience.

    DOI

    Scopus

    1 Citation (Scopus)

  • Asynchronous Eulerian Liquid Simulation

    T. Koike, S. Morishima, R. Ando

    Computer Graphics Forum   39 ( 2 ) 1 - 8  2020.05  [Refereed]

     View Summary

    We present a novel method for simulating liquid with asynchronous time steps on Eulerian grids. Previous approaches focus on Smoothed Particle Hydrodynamics (SPH), the Material Point Method (MPM), or the tetrahedral Finite Element Method (FEM), but a method for simulating liquid purely on Eulerian grids has not yet been investigated. We address several challenges specifically arising from the Eulerian asynchronous time integrator, such as regional pressure solves, asynchronous advection, interpolation, regional volume preservation, and dedicated segregation of the simulation domain according to the liquid velocity. We demonstrate our method on top of staggered grids combined with the level set method and the semi-Lagrangian scheme. We run several examples and show that our method considerably outperforms the global adaptive time step method with respect to computational runtime on scenes where a large variance of velocity is present.

    DOI

    Scopus

  • MirrorNet: A Deep Bayesian Approach to Reflective 2D Pose Estimation from Human Images

    Takayuki Nakatsuka, Kazuyoshi Yoshii, Yuki Koyama, Satoru Fukayama, Masataka Goto, Shigeo Morishima

    CoRR   abs/2004.03811  2020.04

     View Summary

    This paper proposes a statistical approach to 2D pose estimation from human
    images. The main problems with the standard supervised approach, which is based
    on a deep recognition (image-to-pose) model, are that it often yields
    anatomically implausible poses, and its performance is limited by the amount of
    paired data. To solve these problems, we propose a semi-supervised method that
    can make effective use of images with and without pose annotations.
    Specifically, we formulate a hierarchical generative model of poses and images
    by integrating a deep generative model of poses from pose features with that of
    images from poses and image features. We then introduce a deep recognition
    model that infers poses from images. Given images as observed data, these
    models can be trained jointly in a hierarchical variational autoencoding
    (image-to-pose-to-feature-to-pose-to-image) manner. The results of experiments
    show that the proposed reflective architecture makes estimated poses
    anatomically plausible, and the performance of pose estimation improved by
    integrating the recognition and generative models and also by feeding
    non-annotated images.

  • Data compression for measured heterogeneous subsurface scattering via scattering profile blending

    Tatsuya Yatagawa, Hideki Todo, Yasushi Yamaguchi, Shigeo Morishima

    VISUAL COMPUTER   36 ( 3 ) 541 - 558  2020.03  [Refereed]

     View Summary

    Subsurface scattering involves the complicated behavior of light beneath the surfaces of translucent objects that includes scattering and absorption inside the object's volume. Physically accurate numerical representation of subsurface scattering requires a large number of parameters because of the complex nature of this phenomenon. The large amount of data restricts the use of the data on memory-limited devices such as video game consoles and mobile phones. To address this problem, this paper proposes an efficient data compression method for heterogeneous subsurface scattering. The key insight of this study is that heterogeneous materials often comprise a limited number of base materials, and the size of the subsurface scattering data can be significantly reduced by parameterizing only a few base materials. In the proposed compression method, we represent the scattering property of a base material using a function referred to as the base scattering profile. A small subset of the base materials is assigned to each surface position, and the local scattering property near the position is described using a linear combination of the base scattering profiles in the log scale. The proposed method reduces the data by a factor of approximately 30 compared to a state-of-the-art method, without significant loss of visual quality in the rendered graphics. In addition, the compressed data can also be used as bidirectional scattering surface reflectance distribution functions (BSSRDF) without incurring much computational overhead. These practical aspects of the proposed method also facilitate the use of higher-resolution BSSRDFs in devices with large memory capacity.

    DOI

    Scopus

    1 Citation (Scopus)

  • Multi-Instrument Music Transcription Based on Deep Spherical Clustering of Spectrograms and Pitchgrams.

    Keitaro Tanaka, Takayuki Nakatsuka, Ryo Nishikimi, Kazuyoshi Yoshii, Shigeo Morishima

        327 - 334  2020

  • Garment transfer for quadruped characters

    F. Narita, S. Saito, T. Kato, T. Fukusato, S. Morishima

    European Association for Computer Graphics - 37th Annual Conference, EUROGRAPHICS 2016 - Short Papers     57 - 60  2020  [Refereed]

     View Summary

    Modeling clothing to characters is one of the most time-consuming tasks for artists in 3DCG animation production. Transferring existing clothing models is a simple and powerful solution to reduce labor. In this paper, we propose a method to generate a clothing model for various characters from a single template model. Our framework consists of three steps: scale measurement, clothing transformation, and texture preservation. By introducing a novel measurement of the scale deviation between two characters with different shapes and poses, our framework achieves pose-independent transfer of clothing even for quadrupeds (e.g., from human to horse). In addition to a plausible clothing transformation method based on the scale measurement, our method minimizes texture distortion resulting from large deformation. We demonstrate that our system is robust for a wide range of body shapes and poses, which is challenging for current state-of-the-art methods.

    DOI

    Scopus

    1 Citation (Scopus)

  • Hypermask talking head projected onto real object

    Shigeo Morishima, Tatsuo Yotsukura, Kim Binsted, Frank Nielsen, Claudio Pinhanez

    Multimedia Modeling: Modeling Multimedia Information and Systems, MMM 2000     403 - 412  2020

     View Summary

    HYPERMASK is a system which projects an animated face onto a physical mask, worn by an actor. As the mask moves within a prescribed area, its position and orientation are detected by a camera, and the projected image changes with respect to the viewpoint of the audience. The lips of the projected face are automatically synthesized in real time with the voice of the actor, who also controls the facial expressions. As a theatrical tool, HYPERMASK enables a new style of storytelling. As a prototype system, we propose to put a self-contained HYPERMASK system in a trolley (disguised as a linen cart), so that it projects onto the mask worn by the actor pushing the trolley.

  • Foreground-aware dense depth estimation for 360 images

    Qi Feng, Hubert P.H. Shum, Ryo Shimamura, Shigeo Morishima

    Journal of WSCG   28 ( 1-2 ) 79 - 88  2020

     View Summary

    With 360 imaging devices becoming widely accessible, omnidirectional content has gained popularity in multiple fields. The ability to estimate depth from a single omnidirectional image can benefit applications such as robotics navigation and virtual reality. However, existing depth estimation approaches produce sub-optimal results on real-world omnidirectional images with dynamic foreground objects. On the one hand, capture-based methods cannot obtain the foreground due to the limitations of the scanning and stitching schemes. On the other hand, it is challenging for synthesis-based methods to generate highly-realistic virtual foreground objects that are comparable to the real-world ones. In this paper, we propose to augment datasets with realistic foreground objects using an image-based approach, which produces a foreground-aware photorealistic dataset for machine learning algorithms. By exploiting a novel scale-invariant RGB-D correspondence in the spherical domain, we repurpose abundant non-omnidirectional datasets to include realistic foreground objects with correct distortions. We further propose a novel auxiliary deep neural network to estimate both the depth of the omnidirectional images and the mask of the foreground objects, where the two tasks facilitate each other. A new local depth loss considers small regions of interests and ensures that their depth estimations are not smoothed out during the global gradient’s optimization. We demonstrate the system using human as the foreground due to its complexity and contextual importance, while the framework can be generalized to any other foreground objects. Experimental results demonstrate more consistent global estimations and more accurate local estimations compared with state-of-the-arts.

    DOI

    Scopus

    1 Citation (Scopus)

  • Audio-guided Video Interpolation via Human Pose Features

    Takayuki Nakatsuka, Masatoshi Hamanaka, Shigeo Morishima

    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP   5   27 - 35  2020

     View Summary

    This paper describes a method that generates in-between frames of two videos of a musical instrument being played. While image generation achieves a successful outcome in recent years, there is ample scope for improvement in video generation. The keys to improving the quality of video generation are the high resolution and temporal coherence of videos. We solved these requirements by using not only visual information but also aural information. The critical point of our method is using two-dimensional pose features to generate high-resolution in-between frames from the input audio. We constructed a deep neural network with a recurrent structure for inferring pose features from the input audio and an encoder-decoder network for padding and generating video frames using pose features. Our method, moreover, adopted a fusion approach of generating, padding, and retrieving video frames to improve the output video. Pose features played an essential role in both end-to-end training with a differentiable property and combining a generating, padding, and retrieving approach. We conducted a user study and confirmed that the proposed method is effective in generating interpolated videos.

    DOI

  • Single Sketch Image based 3D Car Shape Reconstruction with Deep Learning and Lazy Learning

    Naoki Nozawa, Hubert P. H. Shum, Edmond S. L. Ho, Shigeo Morishima

    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 1: GRAPP   1   179 - 190  2020

     View Summary

    Efficient car shape design is a challenging problem in both the automotive industry and the computer animation/games industry. In this paper, we present a system to reconstruct the 3D car shape from a single 2D sketch image. To learn the correlation between 2D sketches and 3D cars, we propose a Variational Autoencoder deep neural network that takes a 2D sketch and generates a set of multi-view depth and mask images, which form a more effective representation comparing to 3D meshes, and can be effectively fused to generate a 3D car shape. Since global models like deep learning have limited capacity to reconstruct fine-detail features, we propose a local lazy learning approach that constructs a small subspace based on a few relevant car samples in the database. Due to the small size of such a subspace, fine details can be represented effectively with a small number of parameters. With a low-cost optimization process, a high-quality car shape with detailed features is created. Experimental results show that the system performs consistently to create highly realistic cars of substantially different shape and topology.

    DOI

  • Smartphone-Based Assistance for Blind People to Stand in Lines

    Seita Kayukawa, Hironobu Takagi, Joao Guerreiro, Shigeo Morishima, Chieko Asakawa

    CHI'20: EXTENDED ABSTRACTS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS     1 - 8  2020

     View Summary

    We present a system that allows blind people to stand in line in public spaces by using only an off-the-shelf smartphone. The technologies to navigate blind pedestrians in public spaces are rapidly improving, but tasks that require understanding the behavior of surrounding people are still difficult to assist. Standing in line at shops, stations, and other crowded places is one such task. Therefore, we developed a system that continuously detects and notifies the distance to the person in front by using a smartphone with an RGB camera and an infrared depth sensor. The system alerts three levels of distance via vibration patterns to allow users to start/stop moving forward to the right position at the right timing. To evaluate the effectiveness of the system, we performed a study with six blind people. We observed that the system enables blind participants to stand in line successfully, while also gaining more confidence.

    DOI

    Scopus

    7 Citations (Scopus)

  • BlindPilot: A Robotic Local Navigation System that Leads Blind People to a Landmark Object

    Seita Kayukawa, Tatsuya Ishihara, Hironobu Takagi, Shigeo Morishima, Chieko Asakawa

    CHI'20: EXTENDED ABSTRACTS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS     1 - 9  2020

     View Summary

    Blind people face various local navigation challenges in their daily lives such as identifying empty seats in crowded stations, navigating toward a seat, and stopping and sitting at the correct spot. Although voice navigation is a commonly used solution, it requires users to carefully follow frequent navigational sounds over short distances. Therefore, we presented an assistive robot, BlindPilot, which guides blind users to landmark objects using an intuitive handle. BlindPilot employs an RGB-D camera to detect the positions of target objects and uses LiDAR to build a 2D map of the surrounding area. On the basis of the sensing results, BlindPilot then generates a path to the object and guides the user safely. To evaluate our system, we also implemented a sound-based navigation system as a baseline system, and asked six blind participants to approach an empty chair using the two systems. We observed that BlindPilot enabled users to approach a chair faster with a greater feeling of security and less effort compared to the baseline system.

    DOI

    Scopus

    8 Citations (Scopus)

  • Automatic sign dance synthesis from gesture-based sign language

    Naoya Iwamoto, Hubert P.H. Shum, Wakana Asahina, Shigeo Morishima

    Proceedings - MIG 2019: ACM Conference on Motion, Interaction, and Games    2019.10

     View Summary

    Automatic dance synthesis has become more and more popular due to the increasing demand in computer games and animations. Existing research generates dance motions without much consideration for the context of the music. In reality, professional dancers make choreography according to the lyrics and music features. In this research, we focus on a particular genre of dance known as sign dance, which combines gesture-based sign language with full body dance motion. We propose a system to automatically generate sign dance from a piece of music and its corresponding sign gesture. The core of the system is a Sign Dance Model trained by multiple regression analysis to represent the correlations between sign dance and sign gesture/music, as well as a set of objective functions to evaluate the quality of the sign dance. Our system can be applied to music visualization, allowing people with hearing difficulties to understand and enjoy music.

    DOI

    Scopus

    1 Citation (Scopus)

  • 3D car shape reconstruction from a single sketch image

    Naoiki Nozawa, Hubert P.H. Shum, Edmond S.L. Ho, Shigeo Morishima

    Proceedings - MIG 2019: ACM Conference on Motion, Interaction, and Games    2019.10

     View Summary

    Efficient car shape design is a challenging problem in both the automotive industry and the computer animation/games industry. In this paper, we present a system to reconstruct the 3D car shape from a single 2D sketch image. To learn the correlation between 2D sketches and 3D cars, we propose a Variational Autoencoder deep neural network that takes a 2D sketch and generates a set of multiview depth & mask images, which are more effective representation comparing to 3D mesh, and can be combined to form the 3D car shape. To ensure the volume and diversity of the training data, we propose a feature-preserving car mesh augmentation pipeline for data augmentation. Since deep learning has limited capacity to reconstruct fine-detail features, we propose a lazy learning approach that constructs a small subspace based on a few relevant car samples in the database. Due to the small size of such a subspace, fine details can be represented effectively with a small number of parameters. With a low-cost optimization process, a high-quality car with detailed features is created. Experimental results show that the system performs consistently to create highly realistic cars of substantially different shape and topology, with a very low computational cost.

    DOI

    Scopus

    1 Citation (Scopus)

  • Real-time Indirect Illumination of Emissive Inhomogeneous Volumes using Layered Polygonal Area Lights

    Takahiro Kuge, Tatsuya Yatagawa, Shigeo Morishima

    COMPUTER GRAPHICS FORUM   38 ( 7 ) 449 - 460  2019.10

     View Summary

    Indirect illumination involving with visually rich participating media such as turbulent smoke and loud explosions contributes significantly to the appearances of other objects in a rendering scene. However, previous real-time techniques have focused only on the appearances of the media directly visible from the viewer. Specifically, appearances that can be indirectly seen over reflective surfaces have not attracted much attention. In this paper, we present a real-time rendering technique for such indirect views that involves the participating media. To achieve real-time performance for computing indirect views, we leverage layered polygonal area lights (LPALs) that can be obtained by slicing the media into multiple flat layers. Using this representation, radiance entering each surface point from each slice of the volume is analytically evaluated to achieve instant calculation. The analytic solution can be derived for standard bidirectional reflectance distribution functions (BRDFs) based on the microfacet theory. Accordingly, our method is sufficiently robust to work on surfaces with arbitrary shapes and roughness values. In addition, we propose a quadrature method for more accurate rendering of scenes with dense volumes, and a transformation of the domain of volumes to simplify the calculation and implementation of the proposed method. By taking advantage of these computation techniques, the proposed method achieves real-time rendering of indirect illumination for emissive volumes.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Interactive Face Retrieval Framework for Clarifying User's Visual Memory

    Yugo Sato, Tsukasa Fukusato, Shigeo Morishima

    ITE TRANSACTIONS ON MEDIA TECHNOLOGY AND APPLICATIONS   7 ( 2 ) 68 - 79  2019

     View Summary

    This paper presents an interactive face retrieval framework for clarifying an image representation envisioned by a user. Our system is designed for a situation in which the user wishes to find a person but has only visual memory of the person. We address a critical challenge of image retrieval across the user's inputs. Instead of target-specific information, the user can select several images that are similar to an impression of the target person the user wishes to search for. Based on the user's selection, our proposed system automatically updates a deep convolutional neural network. By interactively repeating this process, the system can reduce the gap between human-based similarities and computer-based similarities and estimate the target image representation. We ran user studies with 10 participants on a public database and confirmed that the proposed framework is effective for clarifying the image representation envisioned by the user easily and quickly.
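
    A minimal sketch of the human-in-the-loop update, assuming a toy embedding CNN that is nudged toward the user's selected images in each interaction round; the architecture and loss are illustrative assumptions, not the authors' network:

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        # Illustrative embedding network (not the authors' architecture).
        embed = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                              nn.Linear(16, 64))
        opt = torch.optim.Adam(embed.parameters(), lr=1e-4)

        def update_with_selection(selected, others):
            """Pull user-selected images together in feature space, push others away."""
            fs = F.normalize(embed(selected), dim=1)
            fo = F.normalize(embed(others), dim=1)
            center = fs.mean(dim=0, keepdim=True)
            loss = (1 - fs @ center.t()).mean() + F.relu(fo @ center.t()).mean()
            opt.zero_grad(); loss.backward(); opt.step()
            return center.detach()   # current estimate of the envisioned target

        # One interaction round on dummy data:
        center = update_with_selection(torch.randn(4, 3, 64, 64),
                                       torch.randn(8, 3, 64, 64))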

    DOI

    Scopus

  • A Study on the Sense of Burden and Body Ownership on Virtual Slope

    Ryo Shimamura, Seita Kayukawa, Takayuki Nakatsuka, Shoki Miyakawa, Shigeo Morishima

    2019 26TH IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES (VR)     1154 - 1155  2019

     View Summary

    This paper provides insight into the burden when people are walking up and down slopes in a virtual environment (VE) while actually walking on a flat floor in the real environment (RE). In RE, we feel a physical load while walking uphill or downhill. To reproduce such physical load in the VE, we provided visual stimuli to users by changing their step length. In order to investigate how the stimuli affect the sense of burden and body ownership, we performed a user study where participants walked on slopes in the VE. We found that changing the step length has a significant impact on the user's sense of burden, whereas body ownership shows little correlation with step length.

    DOI

    Scopus

  • Melody Slot Machine

    Masatoshi Hamanaka, Takayuki Nakatsuka, Shigeo Morishima

    ACM SIGGRAPH 2019 EMERGING TECHNOLOGIES (SIGGRAPH '19)    2019

     View Summary

    We developed an interactive music system called the "Melody Slot Machine," which provides an experience of manipulating a music performance. The melodies used in the system are divided into multiple segments, and each segment has multiple variations of melodies. By turning the dials manually, users can switch the variations of melodies freely. When you pull the slot lever, the melody of all segments rotates, and melody segments are randomly selected. Since the performer displayed in a hologram moves in accordance with the selected variation of melody, users can enjoy the feeling of manipulating the performance.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • GPU smoke simulation on compressed DCT space

    D. Ishida, R. Ando, S. Morishima

    European Association for Computer Graphics - 40th Annual Conference, EUROGRAPHICS 2019 - Short Papers     5 - 8  2019

     View Summary

    This paper presents a novel GPU-based algorithm for smoke animation. Our primary contribution is the use of Discrete Cosine Transform (DCT) compressed space for efficient simulation. We show that our method runs an order of magnitude faster than a CPU implementation while retaining visual details with a smaller memory usage. The key component of our method is an on-the-fly compression and expansion of velocity, pressure and density fields. Whenever these physical quantities are requested during a simulation, we perform data expansion and compression only where necessary in a loop. As a consequence, our simulation allows us to simulate a large domain without actually allocating full memory space for it. We show that although our method comes with some extra cost for DCT manipulations, such cost can be minimized with the aid of a devised shared memory usage.
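
    The compression/expansion idea can be illustrated with a CPU-side sketch using SciPy's multidimensional DCT; the truncation scheme and block size are assumptions, and the paper's GPU shared-memory implementation is not reproduced here:

        import numpy as np
        from scipy.fft import dctn, idctn

        def compress(field, keep=16):
            """Keep only the low-frequency keep^3 DCT coefficients of a 3D block."""
            c = dctn(field, norm='ortho')
            return c[:keep, :keep, :keep].copy()

        def expand(coeffs, shape):
            """Zero-pad the coefficients back to full size and invert the DCT."""
            full = np.zeros(shape)
            k = coeffs.shape[0]
            full[:k, :k, :k] = coeffs
            return idctn(full, norm='ortho')

        density = np.random.rand(64, 64, 64)     # e.g. one smoke density block
        small = compress(density)                # 16^3 instead of 64^3 values
        approx = expand(small, density.shape)    # expanded on demand inside the loop
        print(small.size / density.size)         # compression ratio ~0.016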

    DOI

    Scopus

  • Generating Video from Single Image and Sound.

    Yukitaka Tsuchiya, Takahiro Itazuri, Ryota Natsume, Shintaro Yamamoto, Takuya Kato, Shigeo Morishima

    IEEE Conference on Computer Vision and Pattern Recognition Workshops     17 - 20  2019

  • BBeep: A Sonic Collision Avoidance System for Blind Travellers and Nearby Pedestrians

    Seita Kayukawa, Keita Higuchi, Joao Guerreiro, Shigeo Morishima, Yoichi Sato, Kris Kitani, Chieko Asakawa

    CHI 2019: PROCEEDINGS OF THE 2019 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS     52 - 52  2019

     View Summary

    We present an assistive suitcase system, BBeep, for supporting blind people when walking through crowded environments. BBeep uses pre-emptive sound notifications to help clear a path by alerting both the user and nearby pedestrians about the potential risk of collision. BBeep triggers notifications by tracking pedestrians, predicting their future position in real-time, and providing sound notifications only when it anticipates a future collision. We investigate how different types and timings of sound affect nearby pedestrian behavior. In our experiments, we found that sound emission timing has a significant impact on nearby pedestrian trajectories when compared to different sound types. Based on these findings, we performed a real-world user study at an international airport, where blind participants navigated with the suitcase in crowded areas. We observed that the proposed system significantly reduces the number of imminent collisions.
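
    An illustrative sketch of the pre-emptive trigger logic, assuming constant-velocity extrapolation of a tracked pedestrian's position on a 2D ground plane; the horizon, step, and radius are assumptions rather than the system's actual predictor:

        import numpy as np

        def will_collide(ped_pos, ped_vel, user_pos, horizon=3.0, dt=0.1, radius=0.8):
            """Extrapolate a pedestrian with constant velocity and check whether the
            predicted position comes within `radius` metres of the user."""
            for t in np.arange(0.0, horizon, dt):
                future = ped_pos + ped_vel * t
                if np.linalg.norm(future - user_pos) < radius:
                    return True, t      # imminent collision and time until it
            return False, None

        user = np.array([0.0, 0.0])
        ped = np.array([3.0, 0.5])
        vel = np.array([-1.2, -0.1])    # walking toward the user
        hit, t = will_collide(ped, vel, user)
        if hit:
            print(f"emit warning sound, predicted collision in {t:.1f} s")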

    DOI

    Scopus

    50
    Citation
    (Scopus)
  • Real-time Rendering of Layered Materials with Anisotropic Normal Distributions

    Tomoya Yamaguchi, Tatsuya Yatagawa, Yusuke Tokuyoshi, Shigeo Morishima

    SA'19: SIGGRAPH ASIA 2019 TECHNICAL BRIEFS   6 ( 1 ) 87 - 90  2019

     View Summary

    This paper proposes a lightweight bidirectional scattering distribution function (BSDF) model for layered materials with anisotropic reflection and refraction properties. In our method, each layer of the materials can be described by a microfacet BSDF using an anisotropic normal distribution function (NDF). Furthermore, the NDFs of layers can be defined on tangent vector fields, which differ from layer to layer. Our method is based on a previous study in which isotropic BSDFs are approximated by projecting them onto base planes. However, the adequateness of this previous work has not been well investigated for anisotropic BSDFs. In this paper, we demonstrate that the projection is also applicable to anisotropic BSDFs and that they can be approximated by elliptical distributions using covariance matrices.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Audio-based automatic generation of a piano reduction score by considering the musical structure

    Hirofumi Takamori, Takayuki Nakatsuka, Satoru Fukayama, Masataka Goto, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   11296 LNCS   169 - 181  2019  [Refereed]

     View Summary

    This study describes a method that automatically generates a piano reduction score from the audio recordings of popular music while considering the musical structure. The generated score comprises both right- and left-hand piano parts, which reflect the melodies, chords, and rhythms extracted from the original audio signals. Generating such a reduction score from an audio recording is challenging because automatic music transcription is still considered to be inefficient when the input contains sounds from various instruments. Reflecting the long-term correlation structure behind similar repetitive bars is also challenging; further, previous methods have independently generated each bar. Our approach addresses the aforementioned issues by integrating musical analysis, especially structural analysis, with music generation. Our method extracts rhythmic features as well as melodies and chords from the input audio recording and reflects them in the score. To consider the long-term correlation between bars, we use similarity matrices, created for several acoustical features, as constraints. We further conduct a multivariate regression analysis to determine the acoustical features that represent the most valuable constraints for generating a musical structure. We have generated piano scores using our method and have observed that we can produce scores that differently balance between the ability to achieve rhythmic characteristics and the ability to obtain musical structures.
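
    The bar-level similarity constraint can be sketched as a self-similarity matrix over per-bar acoustic features; the chroma-style features and the threshold below are illustrative assumptions:

        import numpy as np

        def self_similarity(bar_features):
            """Cosine self-similarity between per-bar feature vectors (rows)."""
            f = bar_features / (np.linalg.norm(bar_features, axis=1, keepdims=True) + 1e-9)
            return f @ f.T

        # Toy example: 16 bars, 12-dimensional chroma averaged per bar (illustrative).
        bars = np.random.rand(16, 12)
        S = self_similarity(bars)
        # Bar pairs whose similarity exceeds a threshold would be constrained to
        # share the same generated piano pattern.
        similar_pairs = np.argwhere(np.triu(S > 0.95, k=1))
        print(S.shape, len(similar_pairs))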
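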

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Automatic arranging musical score for piano using important musical elements

    Hirofumi Takamori, Haruki Sato, Takayuki Nakatsuka, Shigeo Morishima

    Proceedings of the 14th Sound and Music Computing Conference 2017, SMC 2017     35 - 41  2019  [Refereed]

     View Summary

    There is a demand for arranging music composed using multiple instruments for solo piano because many pianists wish to practice playing their favorite songs. Generally, the method used for piano arrangement entails reducing original notes to fit on a two-line staff. However, a fundamental solution that improves originality and playability in conjunction with score quality continues to elude approaches proposed by extant studies. Hence, the present study proposes a new approach to arranging a musical score for the piano by using four musical components, namely melody, chords, rhythm, and the number of notes, that can be extracted from an original score. The proposed method involves inputting an original score and subsequently generating both right- and left-hand parts of the piano score. With respect to the right-hand part, optional notes from a chord were added to the melody. With respect to the left-hand part, appropriate accompaniments were selected from a database comprising pop music piano scores. The selected accompaniments are considered to correspond to the impression of the original score. High-quality solo piano scores that reflect the characteristics of the original while remaining playable were generated.

  • Understanding Fake Faces

    Ryota Natsume, Kazuki Inoue, Yoshihiro Fukuhara, Shintaro Yamamoto, Shigeo Morishima, Hirokatsu Kataoka

    COMPUTER VISION - ECCV 2018 WORKSHOPS, PT III   11131   566 - 576  2019  [Refereed]

     View Summary

    Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now filling the gap between human-level and computer-driven performance levels in face verification algorithms. However, although the performance gap appears to be narrowing in terms of accuracy-based expectations, a curious question has arisen; specifically, "Is the face understanding of AI really close to that of humans?" In the present study, in an effort to confirm the brain-driven concept, we conduct image-based detection, classification, and generation using an in-house created fake face database. This database has two configurations: (i) false positive face detections produced using both the Viola Jones (VJ) method and convolutional neural networks (CNN), and (ii) simulacra that have fundamental characteristics that resemble faces but are completely artificial. The results show a level of suggestive knowledge that indicates the continuing existence of a gap between the capabilities of recent vision-based face recognition algorithms and human-level performance. On a positive note, however, we have obtained knowledge that will advance the progress of face-understanding models.

    DOI

    Scopus

  • Automatic Paper Summary Generation from Visual and Textual Information

    Shintaro Yamamoto, Yoshihiro Fukuhara, Ryota Suzuki, Shigeo Morishima, Hirokatsu Kataoka

    ELEVENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2018)   11041   V101  2019  [Refereed]

     View Summary

    Due to the recent boom in artificial intelligence (AI) research, including computer vision (CV), it has become impossible for researchers in these fields to keep up with the exponentially increasing number of manuscripts. In response to this situation, this paper proposes the paper summary generation (PSG) task using a simple but effective method to automatically generate an academic paper summary from raw PDF data. We realized PSG by combining a vision-based supervised component detector with a language-based unsupervised important-sentence extractor, which is applicable to manuscripts in the trained format. We show a quantitative evaluation of the simple vision-based component extraction, and a qualitative evaluation showing that our system can extract both visual items and sentences that are helpful for understanding. After processing via our PSG, the 979 manuscripts accepted by the Conference on Computer Vision and Pattern Recognition (CVPR) 2018 are available. It is believed that the proposed method will provide a better way for researchers to stay caught up with important academic papers.
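
    A toy sketch of the language-side idea only (score sentences by how frequent their words are in the document and keep the top ones); the actual extractor, features, and thresholds used in the paper are not specified here:

        import re
        from collections import Counter

        def summarize(text, n_sentences=2):
            """Pick the sentences whose words are most frequent in the document."""
            sentences = re.split(r'(?<=[.!?])\s+', text.strip())
            words = re.findall(r'[a-z]+', text.lower())
            tf = Counter(words)
            def score(s):
                toks = re.findall(r'[a-z]+', s.lower())
                return sum(tf[t] for t in toks) / (len(toks) + 1e-9)
            return sorted(sentences, key=score, reverse=True)[:n_sentences]

        doc = ("Deep networks detect paper components. Components include figures and "
               "captions. The extracted sentences summarize the paper for readers.")
        print(summarize(doc))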

    DOI

    Scopus

  • Preface

    Shigeo Morishima

    Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST     13  2018.11

  • Automatic Paper Summary Generation from Visual and Textual Information

    Shintaro Yamamoto, Yoshihiro Fukuhara, Ryota Suzuki, Shigeo Morishima, Hirokatsu Kataoka

    CoRR   abs/1811.06943  2018.11  [Refereed]

     View Summary

    Due to the recent boom in artificial intelligence (AI) research, including
    computer vision (CV), it has become impossible for researchers in these fields
    to keep up with the exponentially increasing number of manuscripts. In response
    to this situation, this paper proposes the paper summary generation (PSG) task
    using a simple but effective method to automatically generate an academic paper
    summary from raw PDF data. We realized PSG by combination of vision-based
    supervised components detector and language-based unsupervised important
    sentence extractor, which is applicable for a trained format of manuscripts. We
    show the quantitative evaluation of ability of simple vision-based components
    extraction, and the qualitative evaluation that our system can extract both
    visual item and sentence that are helpful for understanding. After processing
    via our PSG, the 979 manuscripts accepted by the Conference on Computer Vision
    and Pattern Recognition (CVPR) 2018 are available. It is believed that the
    proposed method will provide a better way for researchers to stay caught up with
    important academic papers.

  • Occlusion for 3D Object Manipulation with Hands in Augmented Reality

    Q Feng, H Shum, S Morishima

    24th ACM Symposium on Virtual Reality Software and Technology (VRST2018),   24 ( 1166 ) 119  2018.11  [Refereed]

     View Summary

    Due to the need to interact with virtual objects, hand-object interaction has become an important element in mixed reality (MR) applications. In this paper, we propose a novel approach to handle the occlusion of augmented 3D object manipulation with hands by exploiting the nature of hand poses combined with tracking-based and model-based methods, to achieve a complete mixed reality experience without the need for heavy computation, complex manual segmentation processes, or wearing special gloves. The experimental results show a frame rate faster than real-time and high accuracy of rendered virtual appearances, and a user study verifies a more immersive experience compared to past approaches. We believe that the proposed method can improve a wide range of mixed reality applications that involve hand-object interactions.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Efficient Metropolis Path Sampling for Material Editing and Re-rendering

    Tomoya Yamaguchi, Tatsuya Yatagawa, Shigeo Morishima

    Proceedings of Pacific Graphics 2018    2018.10  [Refereed]

     View Summary

    This paper proposes efficient path sampling for re-rendering scenes after material editing. The proposed sampling is based on Metropolis light transport (MLT) and dedicates path samples more to pixels receiving greater changes by editing. To roughly estimate the amount of the changes in pixel values, we first render the difference between images before and after editing. In this step, we render the difference image directly rather than taking the difference of the images by separately rendering them. Then, we sample more paths for the pixels with larger difference values, and render the scene after editing following a recent approach using control variates. As a result, we can obtain fine rendering results with a small number of path samples. We examined the proposed sampling with a range of scenes and demonstrated that it achieves lower estimation errors and variances over the state-of-the-art method.
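
    The sample-allocation idea can be sketched independently of the MLT machinery: distribute a fixed path-sample budget over pixels in proportion to the rendered difference image (the names and the floor-based rounding are illustrative assumptions):

        import numpy as np

        def allocate_samples(diff_image, total_samples, min_per_pixel=1):
            """Distribute a sample budget proportionally to per-pixel difference."""
            w = np.abs(diff_image).astype(float)
            w = w / (w.sum() + 1e-12)
            extra = total_samples - min_per_pixel * diff_image.size
            counts = min_per_pixel + np.floor(extra * w).astype(int)
            return counts

        diff = np.random.rand(64, 64) ** 4      # mostly small edits, a few large ones
        counts = allocate_samples(diff, total_samples=64 * 64 * 8)
        print(counts.min(), counts.max(), counts.sum() <= 64 * 64 * 8)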

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Understanding Fake Faces

    Ryota Natsume, Kazuki Inoue, Yoshihiro Fukuhara, Shintaro Yamamoto, Shigeo Morishima, Hirokatsu Kataoka

    CoRR   abs/1809.08391  2018.09  [Refereed]

     View Summary

    Face recognition research is one of the most active topics in computer vision
    (CV), and deep neural networks (DNN) are now filling the gap between
    human-level and computer-driven performance levels in face verification
    algorithms. However, although the performance gap appears to be narrowing in
    terms of accuracy-based expectations, a curious question has arisen;
    specifically, "Face understanding of AI is really close to that of human?" In
    the present study, in an effort to confirm the brain-driven concept, we conduct
    image-based detection, classification, and generation using an in-house created
    fake face database. This database has two configurations: (i) false positive
    face detections produced using both the Viola Jones (VJ) method and
    convolutional neural networks (CNN), and (ii) simulacra that have fundamental
    characteristics that resemble faces but are completely artificial. The results
    show a level of suggestive knowledge that indicates the continuing existence of
    a gap between the capabilities of recent vision-based face recognition
    algorithms and human-level performance. On a positive note, however, we have
    obtained knowledge that will advance the progress of face-understanding models.

  • RSGAN: Face swapping and editing using face and hair representation in latent spaces

    Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

    ACM SIGGRAPH 2018 Posters, SIGGRAPH 2018     69:1-69:2  2018.08  [Refereed]

     View Summary

    This abstract introduces a generative neural network for face swapping and editing face images. We refer to this network as "region-separative generative adversarial network (RSGAN)". In existing deep generative models such as Variational autoencoder (VAE) and Generative adversarial network (GAN), training data must represent what the generative models synthesize. For example, image inpainting is achieved by training images with and without holes. However, it is difficult or even impossible to prepare a dataset which includes face images both before and after face swapping because faces of real people cannot be swapped without surgical operations. We tackle this problem by training the network so that it synthesizes a natural face image from an arbitrary pair of face and hair appearances. In addition to face swapping, the proposed network can be applied to other editing applications, such as visual attribute editing and random face parts synthesis.
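
    A conceptual sketch of the latent-space swap, with toy stand-in encoders and a decoder; the real RSGAN architecture, training losses, and image resolution are not reproduced here:

        import torch
        import torch.nn as nn

        # Toy stand-ins for separate face / hair encoders and a shared decoder.
        face_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
        hair_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
        decoder  = nn.Sequential(nn.Linear(256, 3 * 64 * 64), nn.Tanh())

        def swap_faces(img_a, img_b):
            """Swap face appearance of A onto the hair of B, and vice versa."""
            za_face, za_hair = face_enc(img_a), hair_enc(img_a)
            zb_face, zb_hair = face_enc(img_b), hair_enc(img_b)
            a_face_on_b = decoder(torch.cat([za_face, zb_hair], dim=1)).view(-1, 3, 64, 64)
            b_face_on_a = decoder(torch.cat([zb_face, za_hair], dim=1)).view(-1, 3, 64, 64)
            return a_face_on_b, b_face_on_a

        a, b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
        out1, out2 = swap_faces(a, b)
        print(out1.shape, out2.shape)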

    DOI

    Scopus

    22
    Citation
    (Scopus)
  • How makeup experience changes how we see cosmetics?

    Kanami Yamagishi, Takuya Kato, Shintaro Yamamoto, Ayano Kaneda, Shigeo Morishima

    Proceedings of ACM Symposium on Applied Perception (SAP2018)    2018.08  [Refereed]

  • RSGAN: Face Swapping and Editing Via Region Separation in Latent Spaces

    Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

    ACM SIGGRAPH 2018 posters    2018.08  [Refereed]

     View Summary

    This poster proposes a new deep generative model that we refer to as region separative generative adversarial network (RSGAN), which can achieve face swapping between arbitrary image pairs and can robustly perform the swapping compared to previous methods using 3DMM.

  • High-Fidelity Facial Reflectance and Geometry Inference From an Unconstrained Image

    Shugo Yamaguchi, Shunsuke Saito, Koki Nagano, Yajie Zhao, Weikai Chen, Kyle Olszewski, Shigeo Morishima, Hao Li

    ACM TRANSACTIONS ON GRAPHICS   37 ( 4 ) 162-1 - 162-14  2018.08  [Refereed]

     View Summary

    We present a deep learning-based technique to infer high-quality facial reflectance and geometry given a single unconstrained image of the subject, which may contain partial occlusions and arbitrary illumination conditions. The reconstructed high-resolution textures, which are generated in only a few seconds, include high-resolution skin surface reflectance maps, representing both the diffuse and specular albedo, and medium-and high-frequency displacement maps, thereby allowing us to render compelling digital avatars under novel lighting conditions. To extract this data, we train our deep neural networks with a high-quality skin reflectance and geometry database created with a state-of-the-art multi-view photometric stereo system using polarized gradient illumination. Given the raw facial texture map extracted from the input image, our neural networks synthesize complete reflectance and displacement maps, as well as complete missing regions caused by occlusions. The completed textures exhibit consistent quality throughout the face due to our network architecture, which propagates texture features from the visible region, resulting in high-fidelity details that are consistent with those seen in visible regions. We describe how this highly underconstrained problem is made tractable by dividing the full inference into smaller tasks, which are addressed by dedicated neural networks. We demonstrate the effectiveness of our network design with robust texture completion from images of faces that are largely occluded. With the inferred reflectance and geometry data, we demonstrate the rendering of high-fidelity 3D avatars from a variety of subjects captured under different lighting conditions. In addition, we perform evaluations demonstrating that our method can infer plausible facial reflectance and geometric details comparable to those obtained from high-end capture devices, and outperform alternative approaches that require only a single unconstrained input image.

    DOI

    Scopus

    63
    Citation
    (Scopus)
  • Face retrieval framework relying on user's visual memory

    Yugo Sato, Tsukasa Fukusato, Shigeo Morishima

    ICMR 2018 - Proceedings of the 2018 ACM International Conference on Multimedia Retrieval     274 - 282  2018.06  [Refereed]

     View Summary

    This paper presents an interactive face retrieval framework for clarifying an image representation envisioned by a user. Our system is designed for a situation in which the user wishes to find a person but has only visual memory of the person. We address a critical challenge of image retrieval across the user's inputs. Instead of target-specific information, the user can select several images (or a single image) that are similar to an impression of the target person the user wishes to search for. Based on the user's selection, our proposed system automatically updates a deep convolutional neural network. By interactively repeating this process (human-in-the-loop optimization), the system can reduce the gap between human-based similarities and computer-based similarities and estimate the target image representation. We ran user studies with 10 subjects on a public database and confirmed that the proposed framework is effective for clarifying the image representation envisioned by the user easily and quickly.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • Placing Music in Space: A Study on Music Appreciation with Spatial Mapping

    Shoki Miyagawa, Yuki Koyama, Jun Kato, Masataka Goto, Shigeo Morishima

    Proceedings of the ACM Symposium on Designing Interactive Systems (DIS2018)     39 - 43  2018.06  [Refereed]

     View Summary

    We investigate the potential of music appreciation using spatial mapping techniques, which allow us to “place” audio sources in various locations within a physical space. We consider possible ways of this new appreciation style and list some design variables, such as how to define coordinate systems, how to show visually, and how to place the sound sources. We conducted an exploratory user study to examine how these design variables affect users’ music listening experiences. Based on our findings from the study, we discuss how we should develop systems that incorporate these design variables for music appreciation in the future.

  • Compression of Measured BSSRDF Data by Mixing Diffusion Profiles of Base Materials

    Tatsuya Yatagawa, Hideki Todo, Yasushi Yamaguchi, Shigeo Morishima

    Visual Computing 2018 (VC 2018)    2018.06  [Refereed]

  • An Image Editing Method Including Face Region Swapping Based on Region Feature Extraction with a Variational Autoencoder

    Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

    Visual Computing 2018 (VC 2018)    2018.06  [Refereed]

  • Thickness-aware voxelization

    Zhuopeng Zhang, Shigeo Morishima, Changbo Wang

    COMPUTER ANIMATION AND VIRTUAL WORLDS   29 ( 3-4 )  2018.05  [Refereed]

     View Summary

    Voxelization is a crucial process for many computer graphics applications such as collision detection, rendering of translucent objects, and global illumination. However, in some situations, although the mesh looks good, the voxelization result may be undesirable. In this paper, we describe a novel voxelization method that uses the graphics processing unit for surface voxelization. Our improvements on the voxelization algorithm can address a problem of state-of-the-art voxelization, which cannot deal with thin parts of the mesh object. We improve the quality of voxelization on both normal mediation and surface correction. Furthermore, we investigate our voxelization methods on indirect illumination, showing the improvement on the quality of real-time rendering.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • RSGAN: Face Swapping and Editing using Face and Hair Representation in Latent Spaces

    Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

    CoRR   abs/1804.03447  2018.04  [Refereed]

     View Summary

    In this paper, we present an integrated system for automatically generating
    and editing face images through face swapping, attribute-based editing, and
    random face parts synthesis. The proposed system is based on a deep neural
    network that variationally learns the face and hair regions with large-scale
    face image datasets. Different from conventional variational methods, the
    proposed network represents the latent spaces individually for faces and hairs.
    We refer to the proposed network as region-separative generative adversarial
    network (RSGAN). The proposed network independently handles face and hair
    appearances in the latent spaces, and then, face swapping is achieved by
    replacing the latent-space representations of the faces and reconstructing the
    entire face image with them. This approach in the latent space robustly
    performs face swapping even for images for which the previous methods fail
    due to inappropriate fitting of the 3D morphable models. In addition,
    the proposed system can further edit face-swapped images with the same network
    by manipulating visual attributes or by composing them with randomly generated
    face or hair parts.

  • Dynamic object scanning: Object-based elastic timeline for quickly browsing first-person videos

    Seita Kayukawa, Keita Higuchi, Ryo Yonetani, Masanori Nakamura, Yoichi Sato, Shigeo Morishima

    Conference on Human Factors in Computing Systems - Proceedings   2018-April  2018.04  [Refereed]

     View Summary

    This work presents the Dynamic Object Scanning (DO-Scanning), a novel interface that helps users browse long and untrimmed first-person videos quickly. The proposed interface offers users a small set of object cues that are automatically generated and tailored to the context of a given video. Users choose which cue to highlight, and the interface in turn fast-forwards the video adaptively while keeping scenes with highlighted cues played at original speed. Our experimental results have revealed that the DO-Scanning arranged an efficient and compact set of cues, and this set of cues is useful for browsing a diverse set of first-person videos.
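
    An illustrative sketch of the object-based elastic timeline: frames containing the highlighted cue play at original speed while all other frames are fast-forwarded (the cue labels and speed factors below are assumptions, not the system's detector output):

        def playback_speeds(cue_per_frame, highlighted, normal=1.0, fast=8.0):
            """Return a per-frame playback speed: original speed where the
            highlighted cue appears, fast-forward elsewhere."""
            return [normal if highlighted in cues else fast for cues in cue_per_frame]

        # Toy cue detections for 8 frames (illustrative labels only).
        cues = [{"car"}, {"car"}, set(), {"dog"}, {"dog", "car"}, set(), set(), {"car"}]
        print(playback_speeds(cues, highlighted="car"))
        # [1.0, 1.0, 8.0, 8.0, 1.0, 8.0, 8.0, 1.0]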

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Dynamic object scanning: Object-based elastic timeline for quickly browsing first-person videos

    Seita Kayukawa, Keita Higuchi, Ryo Yonetani, Masanori Nakamura, Yoichi Sato, Shigeo Morishima

    Conference on Human Factors in Computing Systems - Proceedings   2018-April  2018.04  [Refereed]

     View Summary

    This work presents the Dynamic Object Scanning (DO-Scanning), a novel interface that helps users browse long and untrimmed first-person videos quickly. The proposed interface offers users a small set of object cues that are automatically generated and tailored to the context of a given video. Users choose which cue to highlight, and the interface in turn adaptively fast-forwards the video while keeping scenes with highlighted cues played at original speed. Our experimental results have revealed that the DO-Scanning has an efficient and compact set of cues arranged dynamically, and this set of cues is useful for browsing a diverse set of first-person videos.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Dynamic Object Scanning: Object-Based Elastic Timeline for Quickly Browsing First-Person Videos

    Seita Kayukawa, Keita Higuchi, Ryo Yonetani, Masanori Nakamura, Yoichi Sato, Shigeo Morishima

    Proceedings of 2018 CHI Conference on Human Factors in Computing Systems (CHI '18)    2018.04  [Refereed]

     View Summary

    This work presents the Dynamic Object Scanning (DO-Scanning), a novel interface that helps users browse long and untrimmed first-person videos quickly. The proposed interface offers users a small set of object cues that are automatically generated and tailored to the context of a given video. Users choose which cue to highlight, and the interface in turn adaptively fast-forwards the video while keeping scenes with highlighted cues played at original speed. Our experimental results have revealed that the DO-Scanning has an efficient and compact set of cues arranged dynamically, and this set of cues is useful for browsing a diverse set of first-person videos.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • RSGAN: Face Swapping and Editing using Face and Hair Representation in Latent Spaces

    Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

    arXiv    2018.04

     View Summary

    In this paper, we present an integrated system for automatically generating and editing face images through face swapping, attribute-based editing, and random face parts synthesis. The proposed system is based on a deep neural network that variationally learns the face and hair regions with large-scale face image datasets. Different from conventional variational methods, the proposed network represents the latent spaces individually for faces and hairs. We refer to the proposed network as region-separative generative adversarial network (RSGAN). The proposed network independently handles face and hair appearances in the latent spaces, and then, face swapping is achieved by replacing the latent-space representations of the faces and reconstructing the entire face image with them. This approach in the latent space robustly performs face swapping even for images for which the previous methods fail due to inappropriate fitting of the 3D morphable models. In addition, the proposed system can further edit face-swapped images with the same network by manipulating visual attributes or by composing them with randomly generated face or hair parts.

  • DanceDJ: A 3D Dance Animation Authoring System for Live Performance

    Naoya Iwamoto, Takuya Kato, Hubert P. H. Shum, Ryo Kakitsuka, Kenta Hara, Shigeo Morishima

    ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY, ACE 2017   10714   653 - 670  2018  [Refereed]

     View Summary

    Dance is an important component of live performance for expressing emotion and presenting visual context. Human dance performances typically require expert knowledge of dance choreography and professional rehearsal, which are too costly for casual entertainment venues and clubs. Recent advancements in character animation and motion synthesis have made it possible to synthesize virtual 3D dance characters in real-time. The major problem in existing systems is a lack of intuitive interfaces to control the animation for real-time dance control. We propose a new system called the DanceDJ to solve this problem. Our system consists of two parts. The first part is an underlying motion analysis system that evaluates motion features including dance features such as the postures and movement tempo, as well as audio features such as the music tempo and structure. As a pre-process, given a dancing motion database, our system evaluates the quality of possible timings to connect and switch different dancing motions. During run-time, we propose a control interface that provides visual guidance. We observe that disk jockeys (DJs) effectively control the mixing of music using the DJ controller, and therefore propose a DJ controller for controlling dancing characters. This allows DJs to transfer their skills from music control to dance control using a similar hardware setup. We map different motion control functions onto the DJ controller, and visualize the timing of natural connection points, such that the DJ can effectively govern the synthesized dance motion. We conducted two user experiments to evaluate the user experience and the quality of the dance character. Quantitative analysis shows that our system performs well in both motion control and simulation quality.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Voice animator: Automatic lip-synching in limited animation by audio

    Shoichi Furukawa, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   10714 LNCS   153 - 171  2018  [Refereed]

     View Summary

    Limited animation is one of the traditional techniques for producing cartoon animations. Owing to its expressive style, it has been enjoyed around the world. However, producing high quality animations using this limited style is time-consuming and costly for animators. Furthermore, proper synchronization between the voice-actor’s voice and the character’s mouth and lip motion requires well-experienced animators. This is essential because viewers are very sensitive to audio-lip discrepancies. In this paper, we propose a method that automatically creates high-quality limited-style lip-synched animations using audio tracks. Our system can be applied for creating not only the original animations but also dubbed ones independently of languages. Because our approach follows the standard workflow employed in cartoon animation production, our system can successfully assist animators. In addition, users can implement our system as a plug-in of a standard tool for creating animations (Adobe After Effects) and can easily arrange character lip motion to suit their own style. We visually evaluate our results both absolutely and relatively by comparing them with those of previous works. From the user evaluations, we confirm that our algorithm is able to successfully generate more natural audio-mouth synchronizations in limited-style lip-synched animations than previous algorithms.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • FSNet: An Identity-Aware Generative Model for Image-based Face Swapping.

    Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

    CoRR   abs/1811.12666  2018  [Refereed]

  • Resolving Occlusion for 3D Object Manipulation with Hands in Mixed Reality

    Qi Feng, Hubert P. H. Shum, Shigeo Morishima

    24TH ACM SYMPOSIUM ON VIRTUAL REALITY SOFTWARE AND TECHNOLOGY (VRST 2018)     119:1-119:2  2018  [Refereed]

     View Summary

    Due to the need to interact with virtual objects, hand-object interaction has become an important element in mixed reality (MR) applications. In this paper, we propose a novel approach to handle the occlusion of augmented 3D object manipulation with hands by exploiting the nature of hand poses combined with tracking-based and model-based methods, to achieve a complete mixed reality experience without the need for heavy computation, complex manual segmentation processes, or wearing special gloves. The experimental results show a frame rate faster than real-time and high accuracy of rendered virtual appearances, and a user study verifies a more immersive experience compared to past approaches. We believe that the proposed method can improve a wide range of mixed reality applications that involve hand-object interactions.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Wrinkles individuality preserving aged texture generation using multiple expression images

    Pavel A. Savkin, Tsukasa Fukusato, Takuya Kato, Shigeo Morishima

    VISIGRAPP 2018 - Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications   4   549 - 557  2018  [Refereed]

     View Summary

    Aging of a human face is accompanied by visible changes such as sagging, spots, somberness, and wrinkles. Age progression techniques that estimate an aged facial image are required for long-term criminal or missing person investigations, and also in 3DCG facial animations. This paper focuses on aged facial texture and introduces a novel age progression method based on medical knowledge, which represents the individuality of aged wrinkle shapes and positions. The effectiveness of the idea of including expression wrinkles in aged facial image synthesis is confirmed through subjective evaluation.

    DOI

    Scopus

  • Face Retrieval Framework Relying on User's Visual Memory

    Yugo Sato, Tsukasa Fukusato, Shigeo Morishima

    ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL     274 - 282  2018  [Refereed]

     View Summary

    This paper presents an interactive face retrieval framework for clarifying an image representation envisioned by a user. Our system is designed for a situation in which the user wishes to find a person but has only visual memory of the person. We address a critical challenge of image retrieval across the user's inputs. Instead of target-specific information, the user can select several images (or a single image) that are similar to an impression of the target person the user wishes to search for. Based on the user's selection, our proposed system automatically updates a deep convolutional neural network. By interactively repeating this process (human-in-the-loop optimization), the system can reduce the gap between human-based similarities and computer-based similarities and estimate the target image representation. We ran user studies with 10 subjects on a public database and confirmed that the proposed framework is effective for clarifying the image representation envisioned by the user easily and quickly.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • Placing Music in Space: A Study on Music Appreciation with Spatial Mapping

    Shoki Miyagawa, Shigeo Morishima, Yuki Koyama, Jun Kato, Masataka Goto

    DIS 2018: COMPANION PUBLICATION OF THE 2018 DESIGNING INTERACTIVE SYSTEMS CONFERENCE     39 - 43  2018  [Refereed]

     View Summary

    We investigate the potential of music appreciation using spatial mapping techniques, which allow us to "place" audio sources in various locations within a physical space. We consider possible ways of this new appreciation style and list some design variables, such as how to define coordinate systems, how to show visually, and how to place the sound sources. We conducted an exploratory user study to examine how these design variables affect users' music listening experiences. Based on our findings from the study, we discuss how we should develop systems that incorporate these design variables for music appreciation in the future.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Latent topic similarity for music retrieval and its application to a system that supports DJ performance

    Tatsunori Hirai, Hironori Doi, Shigeo Morishima

    Journal of Information Processing   26   276 - 284  2018.01  [Refereed]

     View Summary

    This paper presents a topic modeling method to retrieve similar music fragments and its application, MusicMixer, which is a computer-aided DJ system that supports DJ performance by automatically mixing songs in a seamless manner. MusicMixer mixes songs based on audio similarity calculated via beat analysis and latent topic analysis of the chromatic signal in the audio. The topic represents latent semantics on how chromatic sounds are generated. Given a list of songs, a DJ selects a song with beats and sounds similar to a specific point of the currently playing song to seamlessly transition between songs. By calculating similarities between all existing song sections that can be naturally mixed, MusicMixer retrieves the best mixing point from a myriad of possibilities and enables seamless song transitions. Although it is comparatively easy to calculate beat similarity from audio signals, considering the semantics of songs from the viewpoint of a human DJ has proven difficult. Therefore, we propose a method to represent audio signals to construct topic models that acquire latent semantics of audio. The results of a subjective experiment demonstrate the effectiveness of the proposed latent semantic analysis method. MusicMixer achieves automatic song mixing using the audio signal processing approach; thus, users can perform DJ mixing simply by selecting a song from a list of songs suggested by the system.
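
    A hedged sketch of the mixing-point retrieval: combine beat-feature similarity and topic-distribution similarity between all section pairs of two songs and pick the best-scoring pair (the feature extraction and the weighting are illustrative assumptions):

        import numpy as np

        def best_mix_point(beats_a, beats_b, topics_a, topics_b, w_topic=0.5):
            """Score every (section of A, section of B) pair and return the best one.
            beats_*: (n, d_beat) beat features per section; topics_*: (n, d_topic)."""
            def cos(x, y):
                x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-9)
                y = y / (np.linalg.norm(y, axis=1, keepdims=True) + 1e-9)
                return x @ y.T
            score = (1 - w_topic) * cos(beats_a, beats_b) + w_topic * cos(topics_a, topics_b)
            i, j = np.unravel_index(np.argmax(score), score.shape)
            return i, j, score[i, j]

        i, j, s = best_mix_point(np.random.rand(20, 8), np.random.rand(30, 8),
                                 np.random.rand(20, 16), np.random.rand(30, 16))
        print(f"mix section {i} of song A into section {j} of song B (score {s:.2f})")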

    DOI

    Scopus

  • Voice Animator: Automatic Lip-Synching in Limited Animation by Audio

    Shoichi Furukawa, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

    ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY, ACE 2017   10714   153 - 171  2018  [Refereed]

     View Summary

    Limited animation is one of the traditional techniques for producing cartoon animations. Owing to its expressive style, it has been enjoyed around the world. However, producing high quality animations using this limited style is time-consuming and costly for animators. Furthermore, proper synchronization between the voice-actor's voice and the character's mouth and lip motion requires well-experienced animators. This is essential because viewers are very sensitive to audio-lip discrepancies. In this paper, we propose a method that automatically creates high-quality limited-style lip-synched animations using audio tracks. Our system can be applied for creating not only the original animations but also dubbed ones independently of languages. Because our approach follows the standard workflow employed in cartoon animation production, our system can successfully assist animators. In addition, users can implement our system as a plug-in of a standard tool for creating animations (Adobe After Effects) and can easily arrange character lip motion to suit their own style. We visually evaluate our results both absolutely and relatively by comparing them with those of previous works. From the user evaluations, we confirm that our algorithm is able to successfully generate more natural audio-mouth synchronizations in limited-style lip-synched animations than previous algorithms.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • Cosmetic Features Extraction by a Single Image Makeup Decomposition

    Kanami Yamagishi, Shintaro Yamamoto, Takuya Kato, Shigeo Morishima

    PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW)   2018-June   1965 - 1967  2018  [Refereed]

     View Summary

    In recent years, a large number of makeup images have been shared on social media. Most of these images lack information about the cosmetics used, such as color or glitter, and such attributes are difficult to infer due to the diversity of skin colors and lighting conditions. In this paper, our goal is to estimate cosmetic features from only a single makeup image. Previous work has measured the material parameters of cosmetic products from pairs of images showing the face with and without makeup, but such comparison images are not always available. Furthermore, this method cannot represent local effects such as pearl or glitter since it adopted physically-based reflectance models. We propose a novel image-based method to extract cosmetic features considering both color and local effects by decomposing the target image into makeup and skin color using Difference of Gaussians (DoG). Our method can be applied to single, standalone makeup images, and considers both local effects and color. In addition, our method is robust to skin color differences because the decomposition separates makeup from skin. The experimental results demonstrate that our method is more robust to skin color differences and captures the characteristics of each cosmetic product.
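
    The Difference-of-Gaussians decomposition can be sketched as follows, splitting an image into a smooth base layer and a detail layer in which glitter-like local effects tend to live; the sigma values are assumptions, not the paper's parameters:

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def dog_decompose(image, sigma_small=2.0, sigma_large=8.0):
            """Split an image into a coarse base layer and a Difference-of-Gaussians
            detail layer (per-channel blurs, no blur across the channel axis)."""
            base = gaussian_filter(image, sigma=(sigma_large, sigma_large, 0))
            fine = gaussian_filter(image, sigma=(sigma_small, sigma_small, 0))
            detail = fine - base
            return base, detail

        img = np.random.rand(128, 128, 3)      # stand-in for a face crop
        base, detail = dog_decompose(img)
        print(base.shape, detail.shape)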

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Latent topic similarity for music retrieval and its application to a system that supports DJ performance

    Tatsunori Hirai, Hironori Doi, Shigeo Morishima

    Journal of Information Processing   26 ( 3 ) 276 - 284  2018.01  [Refereed]

     View Summary

    This paper presents a topic modeling method to retrieve similar music fragments and its application, MusicMixer, which is a computer-aided DJ system that supports DJ performance by automatically mixing songs in a seamless manner. MusicMixer mixes songs based on audio similarity calculated via beat analysis and latent topic analysis of the chromatic signal in the audio. The topic represents latent semantics on how chromatic sounds are generated. Given a list of songs, a DJ selects a song with beats and sounds similar to a specific point of the currently playing song to seamlessly transition between songs. By calculating similarities between all existing song sections that can be naturally mixed, MusicMixer retrieves the best mixing point from a myriad of possibilities and enables seamless song transitions. Although it is comparatively easy to calculate beat similarity from audio signals, considering the semantics of songs from the viewpoint of a human DJ has proven difficult. Therefore, we propose a method to represent audio signals to construct topic models that acquire latent semantics of audio. The results of a subjective experiment demonstrate the effectiveness of the proposed latent semantic analysis method. MusicMixer achieves automatic song mixing using the audio signal processing approach; thus, users can perform DJ mixing simply by selecting a song from a list of songs suggested by the system.

    DOI

    Scopus

  • DanceDJ: A 3D Dance Animation Authoring System for Live Performance

    Naoya Iwamoto, Takuya Kato (joint first author), Hubert P. H. Shum, Ryo Kakitsuka, Kenta Hara, Shigeo Morishima

    14TH INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY (ACE 2017)   10714   653 - 670  2017.12  [Refereed]

     View Summary

    Dance is an important component of live performance for expressing emotion and presenting visual context. Human dance performances typically require expert knowledge of dance choreography and professional rehearsal, which are too costly for casual entertainment venues and clubs. Recent advancements in character animation and motion synthesis have made it possible to synthesize virtual 3D dance characters in real-time. The major problem in existing systems is a lack of intuitive interfaces to control the animation for real-time dance control. We propose a new system called the DanceDJ to solve this problem. Our system consists of two parts. The first part is an underlying motion analysis system that evaluates motion features including dance features such as the postures and movement tempo, as well as audio features such as the music tempo and structure. As a pre-process, given a dancing motion database, our system evaluates the quality of possible timings to connect and switch different dancing motions. During run-time, we propose a control interface that provides visual guidance. We observe that disk jockeys (DJs) effectively control the mixing of music using the DJ controller, and therefore propose a DJ controller for controlling dancing characters. This allows DJs to transfer their skills from music control to dance control using a similar hardware setup. We map different motion control functions onto the DJ controller, and visualize the timing of natural connection points, such that the DJ can effectively govern the synthesized dance motion. We conducted two user experiments to evaluate the user experience and the quality of the dance character. Quantitative analysis shows that our system performs well in both motion control and simulation quality.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Quasi-developable garment transfer for animals

    Fumiya Narita, Shunsuke Saito, Tsukasa Fukusato, Shigeo Morishima

    SIGGRAPH Asia 2017 Technical Briefs, SA 2017     26:1-26:4  2017.11  [Refereed]

     View Summary

    In this paper, we present an interactive framework to model garments for animals from a template garment model based on correspondences between the source and the target bodies. We address two critical challenges of garment transfer across significantly different body shapes and postures (e.g., for quadrupeds and humans): (1) ambiguity in the correspondences and (2) distortion due to large variation in scale of each body part. Our efficient cross-parameterization algorithm and intuitive user interface allow us to interactively compute correspondences and transfer the overall shape of garments. We also introduce a novel algorithm for local coordinate optimization that minimizes the distortion of transferred garments, which leads the resulting model to a quasi-developable surface that is ready for fabrication. Finally, we demonstrate the robustness and effectiveness of our approach on various garments and body shapes, showing that visually pleasing garment models for animals can be generated and fabricated by our system with minimal effort.

    DOI

    Scopus

  • Outside-in monocular IR camera based HMD pose estimation via geometric optimization

    Pavel A. Savkin, Shunsuke Saito, Jarich Vansteenberge, Tsukasa Fukusato, Lochlainn Wilson, Shigeo Morishima

    Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST   Part F131944   7:1-7:9  2017.11  [Refereed]

     View Summary

    Accurately tracking a Head Mounted Display (HMD) with 6 degrees of freedom is essential to achieve a comfortable and nausea-free experience in Virtual Reality. Existing commercial HMD systems using a synchronized Infrared (IR) camera and blinking IR-LEDs can achieve highly accurate tracking. However, most off-the-shelf cameras do not support frame synchronization. In this paper, we propose a novel method for real-time HMD pose estimation without using any camera synchronization or LED blinking. We extend the state-of-the-art pose estimation algorithm by introducing geometrically constrained optimization. In addition, we propose a novel system to increase robustness to the blurred IR-LED patterns appearing at high-velocity movements. The quantitative evaluations showed significant improvements in pose stability and accuracy over wide rotational movements as well as a decrease in runtime.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Simulating the friction sounds using a friction-based adhesion theory model

    Takayuki Nakatsuka, Shigeo Morishima

    The 20th International Conference on Digital Audio Effects (DAFX2017)     32 - 39  2017.09  [Refereed]

     View Summary

    Synthesizing the friction sound of deformable objects by computer is challenging. We propose a novel physics-based approach to synthesize friction sounds based on dynamics simulation. In this work, we calculate the elastic deformation of an object surface when the object comes in contact with other objects. The principle of our method is to divide an object surface into microrectangles. The deformation of each microrectangle is set using two assumptions: the size of a microrectangle (1) changes on contact with another object and (2) obeys a normal distribution. We consider the sound pressure distribution and its spatial spread, consisting of the vibrations of all microrectangles, to synthesize a friction sound at an observation point. We express the global motions of an object by position-based dynamics, to which we add an adhesion constraint. Our proposed method enables the generation of friction sounds of objects of different materials by regulating the initial values of the microrectangle parameters.
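
    A toy sketch of the statistical idea only: sum the vibrations of many micro-rectangles whose sizes follow a normal distribution (the size-to-frequency mapping and all constants are assumptions; the coupling to position-based dynamics is omitted):

        import numpy as np

        def friction_sound(duration=0.5, sr=44100, n_micro=500,
                           mean_size=1.0, std_size=0.2, base_freq=800.0, seed=0):
            """Sum sinusoidal contributions of micro-rectangles whose sizes follow a
            normal distribution; larger elements vibrate at lower frequencies."""
            rng = np.random.default_rng(seed)
            sizes = rng.normal(mean_size, std_size, n_micro).clip(min=0.1)
            t = np.arange(int(duration * sr)) / sr
            signal = np.zeros_like(t)
            for s in sizes:
                f = base_freq / s                    # size controls pitch (assumption)
                phase = rng.uniform(0, 2 * np.pi)
                signal += np.sin(2 * np.pi * f * t + phase) / n_micro
            return signal

        s = friction_sound()
        print(s.shape, float(np.abs(s).max()))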

  • Beautifying Font: Effective Handwriting Template for Mastering Expression of Chinese Calligraphy

    Masanori Nakamura, Shugo Yamaguchi, Shigeo Morishima

    SIGGRAPH 2017 posters    2017.08  [Refereed]

     View Summary

    We propose a novel font called Beautifying Font to assist learning techniques in writing Chinese calligraphy. Chinese calligraphy has various expressions but they are hard to acquire for beginners. Beautifying Font visualizes the speed and pressure of brush-strokes so that users can intuitively understand how to write.

  • Court-aware volleyball video summarization

    Takahiro Itazuri, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

    ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017     74:1-74:2  2017.07  [Refereed]

     View Summary

    We propose a rally-rank evaluation based on the court transition information for volleyball video summarization considering the contents of the game. Our method uses the court transition information instead of non-robust visual features such as the position of the ball and players. Experimental results demonstrate that our method reflects viewers' preferences better than previous methods.

    DOI

    Scopus

  • Beautifying font: Effective handwriting template for mastering expression of Chinese calligraphy

    Masanori Nakamura, Shugo Yamaguchi, Shigeo Morishima

    ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017     3:1-3:2  2017.07  [Refereed]

     View Summary

    We propose a novel font called Beautifying Font to assist learning techniques in writing Chinese calligraphy. Chinese calligraphy has various expressions but they are hard to acquire for beginners. Beautifying Font visualizes the speed and pressure of brush-strokes so that users can intuitively understand how to write.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • 3D model partial-resizing via normal and texture map combination

    Naoki Nozawa, Tsukasa Fukusato, Shigeo Morishima

    ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017     1:1-1:2  2017.07  [Refereed]

     View Summary

    Resizing of 3D models is necessary for computer graphics animation and applications such as games and movies. In general, when users deform a target model, they build a bounding box or a closed polygon mesh (cage) to enclose the target model. Then, the resizing is done by deforming the cage together with the target model. However, these approaches are not good for detailed adjustment of the 3D shape because they do not preserve local information. In contrast, based on local information (e.g., an edge set and a weight map), Sorkine et al. [Sorkine and Alexa 2007; Sorkine et al. 2004] can generate smooth and conformal deformation results with only a few control points. While these approaches are useful for some situations, the results depend on the resolution and topology of the target model. In addition, these approaches do not consider texture (UV) information.

    DOI

    Scopus

  • Retexturing under self-occlusion using hierarchical markers

    Shoki Miyagawa, Yoshihiro Fukuhara, Fumiya Narita, Norihiro Ogata, Shigeo Morishima

    ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017     27:1-27:2  2017.07  [Refereed]

     View Summary

    Marker-based retexturing superimposes a texture on a target object by detecting and identifying markers in the captured image. We propose a new marker that can be identified under large deformations involving self-occlusion, which was not taken into consideration by the following markers. Bradley et al. [Bradley et al. 2009] designed independent markers, but it is difficult to recognize them under complicated occlusion. Scholz et al. [Scholz and Magnor 2006] created circular markers, each with a single color selected from multiple colors; an ID corresponding to the color arrangement of a marker and its surrounding markers is created, and markers are placed so that each ID is unique. However, when some markers are covered by self-occlusion, the positional relationship of the markers appears different from the original, so markers near the self-occlusion fail to be identified. Narita et al. [Narita et al. 2017] handled self-occlusion by improving the identification algorithm. They improved identification accuracy by creating triangle meshes whose vertices are the centers of gravity of markers and assuming that the triangles are close to right isosceles triangles. However, since outliers are removed using angles, identification of markers may fail for objects that are likely to deform in the shear direction, such as cloth. Therefore, we handle self-occlusion by designing hierarchical markers that can be referred to in a global scope. We designed a color-based marker for easy recognition even at low resolution.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Facial video age progression considering expression change

    Shintaro Yamamoto, Pavel A. Savkin, Takuya Kato, Shoichi Furukawa, Shigeo Morishima

    ACM International Conference Proceeding Series   128640   5:1-5:5  2017.06  [Refereed]

     View Summary

    This paper proposes an age progression method for facial videos. Age is one of the main factors that changes the appearance of the face, due to the associated sagging, spots, and wrinkles. These aging features change in appearance depending on facial expressions. For example, wrinkles appear on a young face when smiling, but the shape of those wrinkles changes in older faces. Previous work has not considered the temporal changes of the face, using only static images as databases. To solve this problem, we extend the texture synthesis approach to use facial videos as source videos. First, we spatio-temporally align the database videos to match the sequence of the target video. Then, we synthesize an aging face and apply the temporal changes of the target age to the wrinkles appearing in the facial expression images of the target video. As a result, our method successfully generates expression changes specific to the target age.

    DOI

    Scopus

  • Court-Information-Based Viewing Support for Volleyball Videos

    板摺貴大, 福里司, 山口周悟, 森島繁生

    Visual Computing 2017 (VC 2017)    2017.06  [Refereed]

  • Age-Progressed Facial Video Synthesis Using a Facial Expression Change Database

    山本晋太郎, サフキンパーベル, 加藤卓哉, 佐藤優伍, 古川翔一, 森島繁生

    Visual Computing / グラフィクスとCAD 合同シンポジウム 2017    2017.06  [Refereed]

     View Summary

    In this study, we synthesize age-progressed facial videos that account for how expression changes differ with aging. Age progression of people in videos is important for depicting aging in film and other video productions. Focusing on wrinkles, one of the facial features that appear with aging, the way wrinkles change with facial expressions depends strongly on age. Many age progression methods exist for still images, but none of them focus on the dynamic changes of wrinkles, so they cannot express how expression changes differ with age. In this study, we express the dynamic changes of wrinkles accompanying expression changes by using, as a database, videos of facial expression changes of people in the target age group. Specifically, for each expression in the input video, we construct the aged face using similar expressions in the database. We also manipulate the depth of the subject's wrinkles so that it matches the changes of the target age group. In this way, we realize age-progressed facial video synthesis that accounts for the dynamic changes of wrinkles caused by expression changes.

  • Template-Based Garment Modeling Considering Developable Surface Constraints

    成田 史弥, 齋藤 隼介, 福里 司, 森島 繁生

    Visual Computing / グラフィクスとCAD 合同シンポジウム 2017    2017.06  [Refereed]

     View Summary

    In this paper, aiming to reduce the labor involved in garment modeling, we propose a method that generates, from a single template garment model, a garment model and a sewing pattern fitted to the body shape of an arbitrary character. By introducing an interface that lets the user interactively correct the correspondence between the source body and the target body while checking the rough shape of the generated garment, we realize garment transfer that does not depend on the number of vertices or the connectivity of the body models. Furthermore, by performing a developable surface approximation after an optimization that reflects the shape of the source garment, we prevent the result from deviating greatly from the source garment design and generate a plausible garment model. Since the proposed method can output the sewing pattern of the generated garment model, applications to real-world fabrication support, such as making matching outfits for a person and a pet, are expected.

  • Authoring System for Choreography Using Dance Motion Retrieval and Synthesis

    Ryo Kakitsuka, Kosetsu Tsukuda (AIST), Satoru Fukayama (AIST), Naoya Iwamoto, Masataka Goto (AIST), Shigeo Morishima

    The 30th International Conference on Computer Animation and Social Agents(CASA 2017)    2017.05  [Refereed]

     View Summary

    Dance animations of a three-dimensional (3D) character rendered with computer graphics (CG) have been created using motion capture systems or 3DCG animation editing tools. Since these methods require skills and a large amount of work from the creator, automated choreography has been developed to make synthesizing dance motions much easier. However, due to the limited variation of the results, users could not reflect their preferences in the output. Therefore, supporting the user's preference in dance animation is still important.

    We propose a novel dance creation system that supports the user in creating choreography that reflects their preferences. A user first selects a preferred motion from the database and then assigns it to an arbitrary section of the music. With a few mouse clicks, our system helps the user search for alternative dance motions that reflect his or her preference by using relevance feedback. The system automatically synthesizes a continuous choreography under the constraint that the motions specified by the user are fixed. We conducted user studies and found that users could easily create new dance motions reflecting their preferences by using the system.

  • Fiber-dependent Approach for Fast Dynamic Character Animation

    Ayano Kaneda, Shigeo Morishima

    The 30th International Conference on Computer Animation and Social Agents(CASA 2017)    2017.05  [Refereed]

     View Summary

    Creating secondary motion for character animation, including the jiggling of fat, is in demand in computer animation. In general, secondary motion derived from the primary motion of the character is expressed with shape matching approaches. However, previous methods do not account for directional stretch characteristics and local stiffness at the same time, which makes it difficult to represent the effect of anatomical structures such as muscle fibers. Our framework allows the user to edit the anatomical structure of a character model corresponding to a creature's body, containing muscle and fat, from the tetrahedral model and bone motion. Our method then simulates elastic deformation considering the anatomical structures, with directional stretch characteristics and stiffness defined on each layer. In addition, our method can add constraints for local deformation (e.g., the biceps) that respect the defined model characteristics.

  • 2. IIEEJ activities of conferences and events: 2-1 IIEEJ annual conferences

    Shigeo Morishima

    Journal of the Institute of Image Electronics Engineers of Japan   46 ( 1 ) 6 - 8  2017

    DOI

    Scopus

  • Screen Space Hair Self Shadowing by Translucent Hybrid Ambient Occlusion

    Zhuopeng Zhang, Shigeo Morishima

    SMART GRAPHICS, SG 2015   9317   29 - 40  2017  [Refereed]

     View Summary

    Screen space ambient occlusion is a very efficient means of capturing the shadows caused by adjacent objects. However, it is incapable of expressing the transparency of objects. We introduce an approach that behaves like a combination of ambient occlusion and translucency. This method is an extension of the traditional screen space ambient occlusion algorithm with an extra density field input. It can be applied to rendering mesh objects and, moreover, is very suitable for rendering complex hair models. We use the new algorithm to approximate light attenuation through semi-transparent hair in real time. Our method is implemented on a common GPU and is independent of precomputation. When used with environment lighting, the hair shading is visually similar to, yet one order of magnitude faster than, the existing algorithm.

    DOI

    Scopus

  • Dynamic Subtitle Placement Considering the Region of Interest and Speaker Location

    Wataru Akahori, Tatsunori Hirai, Shigeo Morishima

    PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISIGRAPP 2017), VOL 6   6   102 - 109  2017  [Refereed]

     View Summary

    This paper presents a subtitle placement method that reduces unnecessary eye movements. Although methods that vary the position of subtitles have been discussed in a previous study, subtitles may overlap the region of interest (ROI). Therefore, we propose a dynamic subtitling method that utilizes eye-tracking data to keep subtitles from overlapping important regions. The proposed method calculates the ROI based on the eye-tracking data of multiple viewers. By positioning subtitles immediately below the ROI, the subtitles do not overlap the ROI. Furthermore, we detect speakers in a scene based on audio and visual information and help viewers recognize the speaker by positioning subtitles near the speaker. Experimental results show that the proposed method enables viewers to watch the ROI and the subtitles for a longer duration than traditional subtitles, and is effective in enhancing the comfort and utility of the viewing experience.
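
    As an illustration of the placement rule (a sketch under assumed data, not the authors' implementation), the ROI can be taken as a robust bound of the pooled gaze points and the subtitle anchored just below it:

    ```python
    import numpy as np

    def subtitle_position(gaze_points, frame_h, subtitle_h, margin=10):
        """gaze_points: (N, 2) array of (x, y) fixations pooled over viewers."""
        pts = np.asarray(gaze_points, dtype=float)
        # Robust ROI: central 80% of fixations along each axis.
        x_lo, x_hi = np.percentile(pts[:, 0], [10, 90])
        y_lo, y_hi = np.percentile(pts[:, 1], [10, 90])
        # Place the subtitle immediately below the ROI, clamped to the frame.
        y_sub = min(y_hi + margin, frame_h - subtitle_h)
        x_sub = 0.5 * (x_lo + x_hi)             # horizontally centered on the ROI
        return (x_lo, y_lo, x_hi, y_hi), (x_sub, y_sub)
    ```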

    DOI

    Scopus

    7
    Citation
    (Scopus)
  • Court-based Volleyball Video Summarization Focusing on Rally Scene

    Takahiro Itazuri, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

    2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW)   2017-July   179 - 186  2017  [Refereed]

     View Summary

    In this paper, we propose a video summarization system for volleyball videos. Our system automatically detects rally scenes as self-contained video segments and evaluates a rally-rank for each rally scene to decide its priority. In the priority decision, features representing the contents of the game are necessary; however, such features have not been considered in most previous methods. Although several visual features such as the positions of the ball and players could be used, acquisition of such features is still non-robust and unreliable in low-resolution or low-frame-rate volleyball videos. Instead, we utilize the court transition information caused by camera operation. Experimental results demonstrate the robustness of our rally scene detection and the effectiveness of our rally-rank in reflecting viewers' preferences compared with previous methods.
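
    A rough sketch of the rally-scene step, assuming a per-frame "court view" label is already available from the camera-operation analysis (the labeling itself is not shown): rally scenes are maximal runs of court frames, and the surrounding court transitions are what the rally-rank is computed from.

    ```python
    def detect_rallies(is_court_frame, min_len=60):
        """is_court_frame: list of booleans, one per video frame."""
        rallies, start = [], None
        for i, court in enumerate(is_court_frame):
            if court and start is None:
                start = i                        # a court view begins
            elif not court and start is not None:
                if i - start >= min_len:
                    rallies.append((start, i))   # keep sufficiently long runs
                start = None
        if start is not None and len(is_court_frame) - start >= min_len:
            rallies.append((start, len(is_court_frame)))
        return rallies
    ```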

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Outside-in Monocular IR Camera based HMD Pose Estimation via Geometric Optimization

    Pavel A. Savkin, Shunsuke Saito, Jarich Vansteenberge, Tsukasa Fukusato, Lochlainn Wilson, Shigeo Morishima

    VRST'17: PROCEEDINGS OF THE 23RD ACM SYMPOSIUM ON VIRTUAL REALITY SOFTWARE AND TECHNOLOGY   131944  2017  [Refereed]

     View Summary

    Accurately tracking a Head Mounted Display (HMD) with six degrees of freedom is essential to achieve a comfortable and nausea-free experience in Virtual Reality. Existing commercial HMD systems using a synchronized Infrared (IR) camera and blinking IR-LEDs can achieve highly accurate tracking. However, most off-the-shelf cameras do not support frame synchronization. In this paper, we propose a novel method for real-time HMD pose estimation without using any camera synchronization or LED blinking. We extend a state-of-the-art pose estimation algorithm by introducing geometrically constrained optimization. In addition, we propose a novel system to increase robustness to the blurred IR-LED patterns that appear during high-velocity movements. The quantitative evaluations showed significant improvements in pose stability and accuracy over wide rotational movements as well as a decrease in runtime.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Facial fattening and slimming simulation based on skull structure

    Tsukasa Fukusato, Masahiro Fujisaki, Takuya Kato, Shigeo Morishima

    Journal of the Institute of Image Electronics Engineers of Japan   46 ( 1 ) 197 - 205  2017  [Refereed]

     View Summary

    In this paper, we introduce a technique for facial fattening (or slimming) simulation from a frontal facial image. In previous work, data-driven approaches enabled a rapid design of fattening and slimming images that reflect realistic facial shapes; however, they focus only on the facial surface, such as facial landmarks, so it is difficult to take into account any anatomical model such as the skull structure. We therefore construct a novel facial database containing pairs of frontal facial images and MRI images. Based on our database, we estimate the 2D skull structure from the input image and define a fattening (or slimming) rule and a skull-based constraint. As a result, we can generate naturally deformed images for fattening and slimming simulation.

    DOI

    Scopus

  • Fast subsurface light transport evaluation for real-time rendering using single shortest optical paths

    Tadahiro Ozawa, Tatsuya Yatagawa, Hiroyuki Kubo, Shigeo Morishima

    Journal of the Institute of Image Electronics Engineers of Japan   46 ( 4 ) 533 - 546  2017  [Refereed]

     View Summary

    To synthesize photo-realistic images, simulating subsurface scattering is highly important. However, physically accurate simulation of subsurface scattering requires computationally complex integration over the various optical paths passing through translucent materials. On the other hand, the power of a light ray decreases rapidly when it passes through a material of limited translucency. We leverage this observation to compute the light contribution approximately by considering only the single ray with the most dominant contribution. The proposed method is based on the empirical assumption that the light ray that has passed through such a dominant path has the largest influence on the final rendering result. By considering only the single dominant path to evaluate the radiance, visually plausible rendering results can be obtained in real time, even though the rendering process itself is not strictly physically based. This lightweight rendering process is particularly important for inhomogeneous translucent objects that contain different translucent materials inside, which have been among the targets that are difficult to render in real time due to their high computational complexity.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • A 2D Animation Creation Support System Combining Tracing and a Database

    Tsukasa Fukusato, Shigeo Morishima

    WISS 2016   ( 78 )  2016.12  [Refereed]

    J-GLOBAL

  • Face texture synthesis from multiple images via sparse and dense correspondence

    Shugo Yamaguchi, Shigeo Morishima

    SA 2016 - SIGGRAPH ASIA 2016 Technical Briefs     14  2016.11  [Refereed]

     View Summary

    We desire to edit images for various purposes such as art, entertainment, and film production, and texture synthesis methods have been proposed for this. In particular, the PatchMatch algorithm [Barnes et al. 2009] enabled many easy-to-use image editing tools. However, these tools are applied to a single image. If we can automatically synthesize from various examples, we can create new and higher-quality images. Visio-lization [Mohammed et al. 2009] generated average faces by synthesizing a face image database. However, the synthesis was applied block-wise, so there were artifacts in the results, and free-form features of the source images such as wrinkles could not be preserved. We propose a new synthesis method for multiple images. We apply a sparse and dense nearest-neighbor search so that we can preserve both input and source database image features. Our method allows us to create a novel image from a number of examples.
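
    A toy stand-in for the correspondence step (brute force over strided patches instead of PatchMatch, with hypothetical patch sizes): each target patch is matched against patches from all source images, so features from any example can be reused.

    ```python
    import numpy as np

    def best_patch(target_patch, sources, psize=8, stride=4):
        """Return the patch from any source image closest to target_patch."""
        best, best_err = None, np.inf
        for src in sources:                          # multiple example images
            h, w = src.shape[:2]
            for y in range(0, h - psize + 1, stride):
                for x in range(0, w - psize + 1, stride):
                    cand = src[y:y + psize, x:x + psize].astype(float)
                    err = np.sum((cand - target_patch) ** 2)
                    if err < best_err:
                        best, best_err = cand, err
        return best
    ```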

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • 3D facial geometry reconstruction using patch database

    Tsukasa Nozawa, Takuya Kato, Pavel A. Savkin, Naoki Nozawa, Shigeo Morishima

    SIGGRAPH 2016 - ACM SIGGRAPH 2016 Posters     24:1-24:2  2016.07  [Refereed]

     View Summary

    3D facial shape reconstruction in the wild is an important research task in the fields of CG and CV because it can be applied to many products, such as 3DCG video games and face recognition. One of the most popular 3D facial shape reconstruction techniques is the 3D model-based approach, which approximates a facial shape using a 3D face model computed by principal component analysis. [Blanz and Vetter 1999] performed 3D facial reconstruction by fitting facial feature points of a single input facial image to the vertices of a template 3D facial model named the 3D Morphable Model. This method can reconstruct a facial shape from a variety of images with different lighting and face orientations, as long as facial feature points can be detected. However, the representation quality of the result depends on the resolution of the 3D model.
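
    For intuition, a minimal sketch of model-based fitting (not the cited pipeline): assuming an orthographic projection with known pose, the PCA coefficients are solved by regularized least squares so that the projected model landmarks match the detected 2D feature points.

    ```python
    import numpy as np

    def fit_morphable_model(landmarks_2d, mean_shape, basis, P, reg=1e-3):
        """
        landmarks_2d : (L, 2) detected facial feature points
        mean_shape   : (L, 3) mean-face vertices corresponding to the landmarks
        basis        : (K, L, 3) principal components at those vertices
        P            : (2, 3) orthographic projection / pose matrix
        """
        K = basis.shape[0]
        # Residual between detected landmarks and the projected mean shape.
        b = (landmarks_2d - mean_shape @ P.T).reshape(-1)
        # Each column: effect of one PCA coefficient on all projected landmarks.
        A = np.stack([(basis[k] @ P.T).reshape(-1) for k in range(K)], axis=1)
        # Regularization keeps the reconstructed face close to the mean.
        coeffs = np.linalg.solve(A.T @ A + reg * np.eye(K), A.T @ b)
        return mean_shape + np.tensordot(coeffs, basis, axes=1)   # fitted 3D shape
    ```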

    DOI

    Scopus

  • Automatic dance generation system considering sign language information

    Wakana Asahina, Naoya Iwamoto, Hubert P.H. Shum, Shigeo Morishima

    SIGGRAPH 2016 - ACM SIGGRAPH 2016 Posters     23:1-23:2  2016.07  [Refereed]

     View Summary

    In recent years, thanks to the development of 3DCG animation editing tools (e.g., MikuMikuDance), many 3D character dance animation movies have been created by amateur users. However, it is very difficult to create choreography from scratch without any technical knowledge. Shiratori et al. [2006] produced an automatic dance generation system considering the rhythm and intensity of dance motions. However, each segment is selected randomly from the database, so the generated dance motion has no linguistic or emotional meaning. Takano et al. [2010] produced a human motion generation system considering motion labels. However, they use simple motion labels like "running" or "jump", so they cannot generate motions that express emotions. In reality, professional dancers create choreography based on the musical features or lyrics of a song and express emotion or how they feel about the music. In our work, we aim to generate more emotional dance motion easily. Therefore, we use linguistic information in the lyrics to generate dance motion.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Video reshuffling: Automatic video dubbing without prior knowledge

    Shoichi Furukawa, Takuya Kato, Pavel Savkin, Shigeo Morishima

    SIGGRAPH 2016 - ACM SIGGRAPH 2016 Posters     19:1-19:2  2016.07  [Refereed]

     View Summary

    Numerous videos have been translated using "dubbing," spurred by the recent growth of the video market. However, it is very difficult to achieve visual-audio synchronization; that is to say, in general, the new audio does not synchronize with the actor's mouth motion. This discrepancy can disturb comprehension of the video contents. Therefore, many methods have been researched to solve this problem.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Perception of drowsiness based on correlation with facial image features

    Yugo Sato, Takuya Kato, Naoki Nozawa, Shigeo Morishima

    Proceedings of the ACM Symposium on Applied Perception, SAP 2016     139  2016.07  [Refereed]

     View Summary

    This paper presents a video-based method for detecting drowsiness. Generally, human beings can perceive fatigue and drowsiness by looking at faces, and this ability has been studied in many ways. A drowsiness detection method based on facial videos has been proposed [Nakamura et al. 2014]. In that method, a set of facial features calculated with computer vision techniques and the k-nearest neighbor algorithm are applied to classify the drowsiness degree. However, facial features that are ineffective for reproducing human perception with the machine learning method are not removed, and this factor can decrease the detection accuracy.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Aged Wrinkles Individuality Considering Aging Simulation

    Shigeo Morishima

    IPSJ Journal   57 ( 7 ) 1627 - 1637  2016.07  [Refereed]

     View Summary

    The appearance of a human face changes with aging: sagging, spots, somberness, and wrinkles are observed. Considering these changes, aging simulation techniques that estimate an aged facial image are required for long-term criminal or missing person investigations. One of the latest works proposed a photorealistic aging simulation method using patch-based facial image reconstruction. However, this and other recent works share the problem that they cannot represent the individuality of wrinkles, one of the most important features of human individuality, which is defined by their shape and position. In this paper, we introduce a novel aging simulation method with patch-based facial image reconstruction that overcomes the problem mentioned above. To preserve wrinkle individuality, the wrinkles present in an expressive facial image are synthesized into the input image based on medical knowledge. Our method represents the individuality of wrinkles, and subjective evaluation results show that it generates more accurate results when our results and those of the previous work are compared with the ground truth.

  • Region-of-interest-based subtitle placement using eye-tracking data of multiple viewers

    Wataru Akahori, Shigeo Morishima, Tatsunori Hirai, Shunya Kawamura

    TVX 2016 - Proceedings of the ACM International Conference on Interactive Experiences for TV and Online Video     123 - 128  2016.06  [Refereed]

     View Summary

    We present a subtitle placement method that reduces viewers' eye movement without interfering with the target region of interest (ROI) in a video scene. Subtitles help viewers understand foreign-language videos. However, subtitles tend to attract viewers' line of sight, which causes viewers to lose focus on the video content. To address this problem, previous studies have attempted to improve the viewing experience by dynamically shifting subtitle positions. Nevertheless, in their user studies, some participants felt that the visual appearance of such subtitles was unnatural and caused fatigue. We propose a method that places subtitles below the ROI, which is calculated from the eye-tracking data of multiple viewers. Two experiments were conducted to evaluate viewer impressions and compare lines of sight for videos with subtitles placed by the proposed and previous methods.

    DOI

    Scopus

    9
    Citation
    (Scopus)
  • LyricsRadar: A Lyrics Retrieval Interface Based on Latent Topics of Lyrics

    Shoto Sasaki, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

    IPSJ Journal   57 ( 5 ) 1365 - 1374  2016.05  [Refereed]

  • Proposal of a Garment Transfer System between Four-Limbed Characters

    Fumiya Narita, Shunsuke Saito, Tsukasa Fukusato, Shigeo Morishima

    IPSJ Journal   57 ( 3 ) 863 - 872  2016.03  [Refereed]

  • Acquiring Curvature-Dependent Reflectance Function from Translucent Material

    Midori Okamoto, Hiroyuki Kubo, Yasuhiro Mukaigawa, Tadahiro Ozawa, Keisuke Mochida, Shigeo Morishima

    Proceedings NICOGRAPH International 2016     182 - 185  2016  [Refereed]

     View Summary

    Acquiring scattering parameters from real objects is still a challenging task. To obtain the scattering parameters, physics-based analysis is impractical because a huge computational cost is required to simulate the subsurface scattering effect accurately. Thus, we focus on the Curvature-Dependent Reflectance Function (CDRF), a plausible approximation of the subsurface scattering effect. In this paper, we propose a novel technique to obtain scattering parameters from real objects by revealing the relation between curvature and translucency.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Friction sound synthesis of deformable objects based on adhesion theory.

    T. Nakatsuka, Shigeo Morishima

    Poster Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, Zurich, Switzerland, July 11-13, 2016     6:1 - 1  2016  [Refereed]

  • A choreographic authoring system for character dance animation reflecting a user's preference.

    Ryo Kakitsuka, Kosetsu Tsukuda, Satoru Fukayama, Naoya Iwamoto, Masataka Goto, Shigeo Morishima

    Poster Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, Zurich, Switzerland, July 11-13, 2016     5:1  2016  [Refereed]

  • Creating a realistic face image from a cartoon character.

    Masanori Nakamura, Shugo Yamaguchi, Tsukasa Fukusato, Shigeo Morishima

    Poster Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, Zurich, Switzerland, July 11-13, 2016     2:1  2016  [Refereed]

  • Frame-wise continuity-based video summarization and stretching

    Tatsunori Hirai, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   9516   806 - 817  2016  [Refereed]

     View Summary

    This paper describes a method for freely changing the length of a video clip, leaving its content almost unchanged, by removing video frames while considering both audio and video transitions. In a video clip that contains many video frames, there are less important frames that only extend the length of the clip. Taking the continuity of audio and video frames into account, the method enables changing the length of a video clip by removing or inserting frames that do not significantly affect the content. Our method can be used to change the length of a clip without changing the playback speed. Subjective experimental results demonstrate the effectiveness of our method in preserving the clip content.
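
    A greedy sketch of the frame-removal idea under assumed per-frame features (the paper's exact criterion may differ): repeatedly drop the frame whose neighbours are most similar in both audio and video, so continuity is barely disturbed.

    ```python
    import numpy as np

    def shorten(video_feats, audio_feats, target_len, w_audio=0.5):
        """video_feats, audio_feats: (N, D) per-frame feature vectors."""
        keep = list(range(len(video_feats)))
        while len(keep) > target_len:
            costs = []
            for j in range(1, len(keep) - 1):
                a, b = keep[j - 1], keep[j + 1]
                dv = np.linalg.norm(video_feats[a] - video_feats[b])
                da = np.linalg.norm(audio_feats[a] - audio_feats[b])
                costs.append((1 - w_audio) * dv + w_audio * da)
            keep.pop(1 + int(np.argmin(costs)))   # remove the least disruptive frame
        return keep                                # indices of retained frames
    ```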

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • MusicMixer: Automatic DJ system considering beat and latent topic similarity

    Tatsunori Hirai, Hironori Doi, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   9516   698 - 709  2016  [Refereed]

     View Summary

    This paper presents MusicMixer, an automatic DJ system that mixes songs in a seamless manner. MusicMixer mixes songs based on audio similarity calculated via beat analysis and latent topic analysis of the chromatic signal in the audio. The topic represents latent semantics about how chromatic sounds are generated. Given a list of songs, a DJ selects a song whose beat and sounds are similar to a specific point of the currently playing song to seamlessly transition between songs. By calculating the similarity of all existing pairs of songs, the proposed system can retrieve the best mixing point from innumerable possibilities. Although it is comparatively easy to calculate beat similarity from audio signals, it has been difficult to consider the semantics of songs as a human DJ does. To consider such semantics, we propose a method of representing audio signals to construct topic models that acquire the latent semantics of audio. The results of a subjective experiment demonstrate the effectiveness of the proposed latent semantic analysis method.
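
    Conceptually (a sketch with assumed per-beat features, not MusicMixer's actual scoring), every candidate pair of beat positions is scored by combining beat similarity and latent topic similarity, and the mix happens at the best pair:

    ```python
    import numpy as np

    def best_mix_point(beats_a, beats_b, topics_a, topics_b, w_topic=0.5):
        """
        beats_a/b  : (Na,), (Nb,) per-beat rhythmic features (e.g., onset strength)
        topics_a/b : (Na, T), (Nb, T) per-beat latent topic distributions
        """
        best, best_score = None, -np.inf
        for i in range(len(beats_a)):
            for j in range(len(beats_b)):
                beat_sim = -abs(float(beats_a[i]) - float(beats_b[j]))
                topic_sim = float(topics_a[i] @ topics_b[j])
                score = (1 - w_topic) * beat_sim + w_topic * topic_sim
                if score > best_score:
                    best, best_score = (i, j), score
        return best        # beat indices in songs A and B at which to cross-fade
    ```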

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Computational cartoonist: A comic-style video summarization system for anime films

    Tsukasa Fukusato, Tatsunori Hirai, Shunya Kawamura, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   9516   42 - 50  2016  [Refereed]

     View Summary

    This paper presents Computational Cartoonist, a comic-style anime summarization system that detects key frames and generates comic layouts automatically. In contrast to previous studies, we define evaluation criteria based on the correspondence between anime films and the original comics to determine whether the result of comic-style summarization is relevant. For key frame detection in anime films, the proposed system segments the input video into a series of basic temporal units and computes frame importance using image characteristics such as motion. Subsequently, comic-style layouts are decided on the basis of pre-defined templates stored in a database. Several results demonstrate that our key frame detection outperforms previous methods, evaluated by the matching accuracy between key frames and original comic panels.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • A SOUNDTRACK GENERATION SYSTEM TO SYNCHRONIZE THE CLIMAX OF A VIDEO CLIP WITH MUSIC

    Haruki Sato, Tatsunori Hirai, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

    2016 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME)   2016-August   1 - 6  2016  [Refereed]

     View Summary

    In this paper, we present a soundtrack generation system that can automatically add a soundtrack with its length and climax points aligned to those of a video clip. Adding a soundtrack to a video clip is an important process in video editing. Editors tend to add chorus sections at the climax points of the video clip by replacing and concatenating musical segments. However, this process is time-consuming. Our system automatically detects climaxes of both the video clip and the music based on feature extraction and analysis. This enables the system to add a soundtrack whose climax is synchronized with the climax of the video clip. We evaluated the generated soundtracks through a subjective evaluation.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • RSViewer: An Efficient Video Viewer for Racquet Sports Focusing on Rally Scenes.

    Shunya Kawamura, Tsukasa Fukusato, Tatsunori Hirai, Shigeo Morishima

    Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 2: IVAPP, Rome, Italy, February 27-29, 2016.     249 - 256  2016  [Refereed]

    DOI

  • Real-time rendering of heterogeneous translucent materials with dynamic programming

    Tadahiro Ozawa, Midori Okamoto, Hiroyuki Kubo, Shigeo Morishima

    European Association for Computer Graphics - 37th Annual Conference, EUROGRAPHICS 2016 - Posters     3 - 4  2016  [Refereed]

     View Summary

    Subsurface scattering is important for realistically expressing translucent materials such as skin and marble. However, rendering translucent materials in real time is challenging, since calculating subsurface light transport requires a large computational cost. In this paper, we present a novel algorithm to render heterogeneous translucent materials using Dijkstra's algorithm. Our two main ideas are as follows: the first is fast construction of the graph by solid voxelization; the second is a voxel-shading-like initialization of the graph. With these ideas, we obtain the maximum contribution of emitted light over the whole surface in a single calculation. We realize real-time rendering of animated heterogeneous translucent objects with simple computation, and our approach does not require any precomputation.
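
    The graph idea can be sketched as plain Dijkstra over the voxel grid (illustrative only; the edge weights and initialization here are assumptions): nodes are voxels, edge weights approximate optical thickness, and the shortest distance from an illuminated voxel gives the least-attenuated path.

    ```python
    import heapq

    def min_optical_thickness(start, sigma_t, neighbours):
        """
        start      : index of an illuminated surface voxel
        sigma_t    : dict voxel -> extinction coefficient of that voxel
        neighbours : dict voxel -> iterable of adjacent voxel indices
        """
        dist = {start: 0.0}
        heap = [(0.0, start)]
        while heap:
            d, v = heapq.heappop(heap)
            if d > dist.get(v, float("inf")):
                continue
            for u in neighbours[v]:
                # Edge weight: mean extinction of the two voxels (unit spacing).
                nd = d + 0.5 * (sigma_t[v] + sigma_t[u])
                if nd < dist.get(u, float("inf")):
                    dist[u] = nd
                    heapq.heappush(heap, (nd, u))
        return dist   # exp(-dist[u]) approximates the dominant-path attenuation
    ```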

    DOI

    Scopus

  • Real-time rendering of heterogeneous translucent objects using voxel number map

    Keisuke Mochida, Midori Okamoto, Hiroyuki Kubo, Shigeo Morishima

    European Association for Computer Graphics - 37th Annual Conference, EUROGRAPHICS 2016 - Posters     1 - 2  2016  [Refereed]

     View Summary

    Rendering of translucent objects enhances the realism of computer graphics; however, it is still computationally expensive. In this paper, we introduce a real-time rendering technique for heterogeneous translucent objects that contain complex structures inside. First, as precomputation, we convert mesh models into voxels and generate a Look-Up-Table in which the optical thickness between pairs of surface voxels is stored. Second, we compute radiance in real time using the precomputed optical thickness. At this stage, we generate a Voxel-Number-Map to fetch the texel values of the Look-Up-Table on the GPU. Using the Look-Up-Table and the Voxel-Number-Map, our method can render translucent objects with cavities and different media inside in real time.

    DOI

    Scopus

  • Wrinkles individuality representing aging simulation

    Pavel A. Savkin, Daiki Kuwahara, Masahide Kawai, Takuya Kato, Shigeo Morishima

    SIGGRAPH Asia 2015 Posters, SA 2015     37:1  2015.11  [Refereed]

     View Summary

    The appearance of a human face changes due to aging: sagging, spots, lusters, and wrinkles are observed. Therefore, facial aging simulation techniques are required for long-term criminal investigations. While the appearance of an aged face varies greatly from person to person, wrinkles are one of the most important features representing human individuality, and the individuality of wrinkles is defined by their shape and position. [Maejima et al. 2014] proposed an aging simulation method that preserves the individuality of facial parts using patch-based facial image reconstruction. Since few wrinkles can be observed on a young input face, it is difficult to represent the appearance of wrinkles by reconstruction alone. Therefore, a statistical wrinkle aging pattern model is introduced to produce natural-looking wrinkles by selecting appropriate patches from an age-specific patch database. However, the variation of the statistical wrinkle pattern model is too limited to represent the individuality of wrinkles. Additionally, an appropriate patch size and feature value had to be applied for each facial region to obtain a plausible aged facial image. In this paper, we introduce a novel aging simulation method using patch-based image reconstruction that overcomes the problems mentioned above. Based on medical knowledge [Piérard et al. 2003], the wrinkles in an expressive facial image (defined as expressive wrinkles) of the same person are synthesized into the input image instead of using the statistical wrinkle pattern model, in order to represent the individuality of wrinkles. Furthermore, different patch sizes and feature values are applied for each facial region to achieve a high representation of both the individuality of wrinkles and the age-specific features. The entire process is performed automatically, and a plausible aged facial image is generated.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Automatic facial animation generation system of dancing characters considering emotion in dance and music

    Wakana Asahina, Narumi Okada, Naoya Iwamoto, Taro Masuda, Tsukasa Fukusato, Shigeo Morishima

    SIGGRAPH Asia 2015 Posters, SA 2015     11:1  2015.11  [Refereed]

     View Summary

    In recent years, many 3D character dance animation movies have been created by amateur users with 3DCG animation editing tools (e.g., MikuMikuDance), yet most of them are created manually. An automatic facial animation system for dancing characters would therefore be useful for creating dance movies and visualizing impressions effectively. We address the challenging problem of estimating a dancing character's emotions (which we call "dance emotion"). In previous work considering music features, DiPaola et al. [2006] proposed a music-driven emotionally expressive face system. To detect the mood of the input music, they used a hierarchical framework (the Thayer model) and succeeded in generating facial animation that matches the music's emotion. However, their model cannot express subtleties between two emotions, because the input music is divided sharply into a few moods using a Gaussian mixture model. In addition, they determine more detailed moods based on psychological rules that use score information, so they require MIDI data. In this paper, we propose a "dance emotion model" to visualize a dancing character's emotion as facial expression. Our model is built from frame-by-frame coordinates on an emotional space, obtained through perceptual experiments using a music and dance motion database, without MIDI data. Moreover, by considering the displacement on the emotional space, we can express not only a single emotion but also subtleties between emotions. As a result, our system achieved higher accuracy than the previous work. We can create the facial expression result quickly by inputting audio data and the synchronized motion. Its utility is shown through the comparison with previous work in Figure 1.
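
    A small sketch of how frame-wise emotion-space coordinates could drive a facial rig (the anchor emotions and weighting rule are assumptions, not the paper's trained model): nearby anchor expressions receive larger blend weights, so positions between anchors yield in-between expressions.

    ```python
    import numpy as np

    # Hypothetical anchor emotions and their (valence, arousal) positions.
    ANCHORS = {
        "joy":     np.array([ 0.8,  0.6]),
        "sadness": np.array([-0.7, -0.4]),
        "anger":   np.array([-0.6,  0.7]),
        "calm":    np.array([ 0.5, -0.6]),
    }

    def expression_weights(valence, arousal, softness=4.0):
        """Return normalized blend-shape weights for one animation frame."""
        p = np.array([valence, arousal])
        # Soft nearest neighbour: closer anchors get exponentially larger weights.
        w = {k: np.exp(-softness * np.linalg.norm(p - a)) for k, a in ANCHORS.items()}
        total = sum(w.values())
        return {k: v / total for k, v in w.items()}
    ```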

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Region-based painting style transfer

    Shugo Yamaguchi, Takuya Kato, Tsukasa Fukusato, Chie Furusawa, Shigeo Morishima

    SIGGRAPH Asia 2015 Technical Briefs, SA 2015   45 ( 1 ) 8:1-8:4  2015.11  [Refereed]

     View Summary

    In this paper, we present a novel method for creating a painted image from a photograph using an existing painting as a style source. The core idea is to identify the corresponding objects in the two images in order to select patches more appropriately. We automatically establish a region correspondence between the painted source image and the target photograph by computing color and texture feature distances. Next, we conduct a patch-based synthesis that preserves the appropriate source and target features. Unlike previous example-based approaches to painting style transfer, our results successfully reflect the features of the source images even if the input images have various colors and textures. Our method allows us to automatically render a new painted image preserving the features of the source image.

    DOI J-GLOBAL

    Scopus

    1
    Citation
    (Scopus)
  • Multi-layer Lattice Model for Real-Time Dynamic Character Deformation

    Naoya Iwamoto, Hubert P.H. Shum, Longzhi Yang, Shigeo Morishima

    Computer Graphics Forum   34 ( 7 ) 99 - 109  2015.10  [Refereed]

     View Summary

    Due to the recent advancement of computer graphics hardware and software algorithms, deformable characters have become more and more popular in real-time applications such as computer games. While there are mature techniques to generate primary deformation from skeletal movement, simulating realistic and stable secondary deformation such as jiggling of fats remains challenging. On one hand, traditional volumetric approaches such as the finite element method require higher computational cost and are infeasible for limited hardware such as game consoles. On the other hand, while shape matching based simulations can produce plausible deformation in real-time, they suffer from a stiffness problem in which particles either show unrealistic deformation due to high gains, or cannot catch up with the body movement. In this paper, we propose a unified multi-layer lattice model to simulate the primary and secondary deformation of skeleton-driven characters. The core idea is to voxelize the input character mesh into multiple anatomical layers including the bone, muscle, fat and skin. Primary deformation is applied on the bone voxels with lattice-based skinning. The movement of these voxels is propagated to other voxel layers using lattice shape matching simulation, creating a natural secondary deformation. Our multi-layer lattice framework can produce simulation quality comparable to those from other volumetric approaches with a significantly smaller computational cost. It is best to be applied in real-time applications such as console games or interactive animation creation.
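
    The propagation step can be illustrated with one cluster-wise update in the spirit of lattice shape matching (a sketch, not the paper's full multi-layer solver): the best-fit rigid transform of a voxel cluster is found and its particles are pulled toward the matched goal positions.

    ```python
    import numpy as np

    def shape_matching_step(rest, curr, stiffness=0.5):
        """rest, curr: (N, 3) rest-pose and current particle positions of one cluster."""
        c0, c1 = rest.mean(axis=0), curr.mean(axis=0)
        P, Q = rest - c0, curr - c1
        # Best-fit rotation via SVD of the covariance (polar decomposition).
        U, _, Vt = np.linalg.svd(Q.T @ P)
        R = U @ Vt
        if np.linalg.det(R) < 0:                  # guard against reflections
            U[:, -1] *= -1
            R = U @ Vt
        goal = (R @ P.T).T + c1                   # rigidly matched goal positions
        return curr + stiffness * (goal - curr)   # blend particles toward the goal
    ```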

    DOI

    Scopus

    13
    Citation
    (Scopus)
  • Automatic generation of photorealistic 3D inner mouth animation only from frontal images

    Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

    Journal of Information Processing   23 ( 5 ) 693 - 703  2015.09  [Refereed]

     View Summary

    In this paper, we propose a novel method to generate highly photorealistic three-dimensional (3D) inner mouth animation that is well-fitted to an original ready-made speech animation using only frontal captured images and small-size databases. The algorithms are composed of quasi-3D model reconstruction and motion control of teeth and the tongue, and final compositing of photorealistic speech animation synthesis tailored to the original. In general, producing a satisfactory photorealistic appearance of the inner mouth that is synchronized with mouth movement is a very complicated and time-consuming task. This is because the tongue and mouth are too flexible and delicate to be modeled with the large number of meshes required. Therefore, in some cases, this process is omitted or replaced with a very simple generic model. Our proposed method, on the other hand, can automatically generate 3D inner mouth appearances by improving photorealism with only three inputs: an original tailor-made lip-sync animation, a single image of the speaker’s teeth, and a syllabic decomposition of the desired speech. The key idea of our proposed method is to combine 3D reconstruction and simulation with two-dimensional (2D) image processing using only the above three inputs, as well as a tongue database and mouth database. The satisfactory performance of our proposed method is illustrated by the significant improvement in picture quality of several tailor-made animations to a degree nearly equivalent to that of camera-captured videos.

    DOI

    Scopus

  • VRMixer: Fusing Video Content with the Real World and Verifying Its Applicability

    牧 良樹, 中村聡史, 平井辰典, 湯村 翼, 森島繁生

    Entertainment Computing 2015 Proceedings     1 - 9  2015.09

  • 3D face reconstruction from a single non-frontal face image

    Naoki Nozawa, Daiki Kuwahara, Shigeo Morishima

    ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015    2015.07  [Refereed]

     View Summary

    Reconstruction of a human face shape from a single image is an important theme for criminal investigation, such as recognizing suspects from surveillance cameras with only a few frames. It is, however, still difficult to recover a face shape from a non-frontal face image. Methods using shading cues on the face depend on the lighting conditions and cannot be adapted to images in which shadows occur, for example [Kemelmacher et al. 2011]. On the other hand, [Blanz et al. 2004] reconstructed a shape with the 3D Morphable Model (3DMM) using only facial feature points. This method, however, requires pose-wise correspondences between vertices of the model and feature points of the input image, because the face contour cannot be seen when the face is not oriented toward the front. In this paper, we propose a method that can reconstruct a facial shape from a non-frontal face image with only a single general correspondence table. Our method searches for the correspondences of points on the facial contour in the iterative reconstruction process, which makes the reconstruction simple and stable.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Texture preserving garment transfer

    Fumiya Narita, Shunsuke Saito, Takuya Kato, Tsukasa Fukusato, Shigeo Morishima

    ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015     91:1  2015.07  [Refereed]

     View Summary

    Dressing virtual characters is necessary for many applications, while modeling clothing is a significant bottleneck. Therefore, the idea of Garment Transfer, which transfers a clothing model from one character to another, has been proposed [Brouet et al. 2012]. In recent years, this idea has been extended to be applicable between characters in various poses and shapes [Narita et al. 2014]. However, the texture design of the clothing is not preserved by their method, since they deform the source clothing model to fit the target body (see Figure 1(a)(c)). We propose a novel method to transfer a garment while preserving its texture design. First, we cut the transferred clothing mesh model along the seams. Second, following an approach similar to "as-rigid-as-possible" deformation, we deform the texture space to reflect the shape of the transferred clothing mesh model. Our method keeps the consistency of the texture as clothing by cutting it along the seams. To avoid generating inversions, we modify the "as-rigid-as-possible" formulation. Our method allows users not only to preserve texture uniformly on the transferred clothing (see Figure 1(b)), but also at particular locations the user specifies, such as the location of an appliqué (see Figure 1(e)). Our method is a pioneering approach to Texture Preserving Garment Transfer.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • BG maker: Example-based anime background image creation from a photograph

    Shugo Yamaguchi, Chie Furusawa, Takuya Kato, Tsukasa Fukusato, Shigeo Morishima

    ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015     45:1  2015.07  [Refereed]

     View Summary

    Anime designers often paint actual sceneries as background images based on photographs to complement characters. As painting background scenery is time-consuming and costly, there is high demand for techniques that can convert photographs into anime-styled graphics. Previous approaches for this purpose, such as Image Quilting [Efros and Freeman 2001], transferred a source texture onto a target photograph. These methods synthesized corresponding source patches with the target elements in a photograph, and correspondence was achieved through nearest-neighbor search such as PatchMatch [Barnes et al. 2009]. However, the nearest-neighbor patch is not always the most suitable patch for anime transfer, because photographs and anime background images differ in color and texture. For example, real-world colors need to be converted into specific colors for anime; further, the type of brushwork required to realize an anime effect differs between photograph elements (e.g., sky, mountain, grass). Thus, to obtain the most suitable patch, we propose a method in which we establish a global region correspondence before local patch matching. In our proposed method, BGMaker, (1) we divide the real and anime images into regions; (2) we automatically acquire correspondences between the regions on the basis of color and texture features; and (3) we search for and synthesize the most suitable patch within the corresponding region. Our primary contribution in this paper is a method for automatically acquiring correspondences between target regions and source regions of different color and texture, which allows us to generate an anime background image while preserving the details of the source image.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Automatic synthesis of eye and head animation according to duration and point of gaze

    Hiroki Kagiyama, Masahide Kawai, Daiki Kuwahara, Takuya Kato, Shigeo Morishima

    ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015     44:1  2015.07  [Refereed]

     View Summary

    In movie and video game productions, synthesizing subtle eye and corresponding head movements of a CG character is essential to make the content dramatic and impressive. However, completing them costs a great deal of time and labor because they often have to be made manually by skilled artists. [Itti et al. 2006] and [Yeo et al. 2012] proposed automatic eye and head motion control methods based on measuring a real person watching a displayed gaze point. However, in both approaches, the rotational angle and speed of the eyes and head are treated uniformly, depending only on the gaze point location. In practice, the display duration of the gaze target strongly influences the motion of the eyes and head: the shorter the blink interval of a gaze target, the more quickly a human responds to chase the target through a combination of eye rotation and head movement. In this paper, we propose a method to automatically control the eyes and head by taking into account both the gaze target location and its display duration. Eye and head movement are modeled from measured data as a function whose arguments are the gaze point angle and the duration. As a result, a variety of gaze actions with accompanying head motion, including the vestibulo-ocular reflex, can be generated automatically by changing the gaze angle and duration parameters.
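
    As an illustration of controlling eye and head rotation from the two arguments named above (gaze angle and duration), here is a small sketch; the constants and the linear head/eye split are assumptions for demonstration, not the measured model from the paper.

    ```python
    # Toy gaze controller: shorter display durations produce faster responses,
    # larger angles recruit the head more, and the eye compensates the head
    # (VOR-like) so that eye + head always sums to the gaze angle.
    import numpy as np

    def gaze_pose(angle_deg, duration_s, t):
        """Eye and head yaw (degrees) at time t for a target at angle_deg
        displayed for duration_s seconds. All constants are illustrative."""
        head_share = np.clip(abs(angle_deg) / 60.0, 0.0, 0.8)
        speed = 1.0 / max(duration_s, 0.1)
        progress = 1.0 - np.exp(-4.0 * speed * t)      # smooth approach to the target
        head = head_share * angle_deg * progress
        eye = angle_deg * progress - head              # VOR-like compensation
        return eye, head

    for t in (0.05, 0.2, 0.5):
        print(t, gaze_pose(30.0, 0.4, t))
    ```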

    DOI

    Scopus

  • A music video authoring system synchronizing climax of video clips and music via rearrangement of musical bars

    Haruki Sato, Tatsunori Hirai, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

    ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015     42:1  2015.07  [Refereed]

     View Summary

    This paper presents a system that can automatically add a soundtrack to a video clip by replacing and concatenating an existing song's musical bars according to a user's preference. Since a soundtrack makes a video clip attractive, adding one is one of the most important processes in video editing. To make a clip more attractive, an editor tends to add a soundtrack considering its timing and climax; for example, editors often align chorus sections with the climax of the clip by replacing and concatenating musical bars in an existing song. In this process, however, editors must also take the naturalness of the rearranged soundtrack into account. They therefore have to decide how to replace musical bars while simultaneously considering timing, climax, and naturalness, listening to the rearranged result and checking its synchronization with the video clip. This repetitive work is time-consuming. [Feng et al. 2010] proposed an automatic soundtrack addition method; however, because it adds a soundtrack with a data-driven approach, it cannot reflect the timing and climax a user prefers. Our system takes all patterns of rearranged musical bars into account and finds the most natural soundtrack given the user's intended audio-visual alignment and climax. Specifically, the musical sections between the user's specified points and the beginning and end of the song are automatically interpolated by replacing and concatenating musical bars based on dynamic programming. To reflect the user's intended climax, the system provides an editing interface for specifying it; the intention is reflected immediately and the soundtrack is interactively re-rearranged. These semi-automated processes of rearranging a soundtrack for a video clip let users add songs without having to judge the naturalness of the rearranged song themselves.
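
    A minimal dynamic-programming sketch of the bar-rearrangement step described above: choose a fixed-length path of bars between a start and an end bar that minimizes the summed transition cost. The cost function (feature distance between bars) is only an assumption, not the paper's naturalness measure.

    ```python
    # DP over bar sequences: best[step, j] is the cheapest cost of reaching bar j
    # at position `step`; backpointers recover the chosen bar sequence.
    import numpy as np

    def rearrange_bars(bar_feats, start, end, length):
        """bar_feats: (n_bars, d) features; returns `length` bar indices from
        `start` to `end` with minimal total transition cost."""
        n = len(bar_feats)
        cost = np.linalg.norm(bar_feats[:, None, :] - bar_feats[None, :, :], axis=2)
        best = np.full((length, n), np.inf)
        back = np.zeros((length, n), dtype=int)
        best[0, start] = 0.0
        for step in range(1, length):
            for j in range(n):
                c = best[step - 1] + cost[:, j]
                back[step, j] = int(np.argmin(c))
                best[step, j] = c[back[step, j]]
        path = [end]
        for step in range(length - 1, 0, -1):
            path.append(back[step, path[-1]])
        return path[::-1]

    feats = np.random.rand(8, 4)                 # hypothetical per-bar features
    print(rearrange_bars(feats, start=0, end=7, length=5))
    ```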

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • A Real-Time Flesh Jiggling Method Considering the Character's Body Structure

    Naoya Iwamoto, Shigeo Morishima

    IPSJ Journal   44 ( 3 ) 502 - 511  2015.07  [Refereed]

  • VoiceDub: A Synchronized Voice Recording Support System with Multiple Timing Cues for Video Entertainment

    Shinichi Kawamoto, Shigeo Morishima, Satoshi Nakamura

    IPSJ Journal   56 ( 4 ) 1142 - 1151  2015.04  [Refereed]

  • VRMixer: Mixing Video Content and the Real World through Video Segmentation

    Tatsunori Hirai, Satoshi Nakamura, Tsubasa Yumura, Shigeo Morishima

    Proceedings of Interaction 2015     1 - 6  2015.03

  • A Racquet Sports Video Viewing System Using Automatic Summarization Focusing on Rally Scenes

    Toshiya Kawamura, Tsukasa Fukusato, Tatsunori Hirai, Shigeo Morishima

    IPSJ Journal   56 ( 3 ) 1028 - 1038  2015.03  [Refereed]

  • Affective music recommendation system based on the mood of input video

    Shoto Sasaki, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   8936   299 - 302  2015  [Refereed]

     View Summary

    We present an affective music recommendation system that fits an input video, without requiring textual information. Music that matches our current environmental mood can make a deep impression. However, it is not easy to tell which music in a huge database best matches our present mood, so we often select a well-known popular song repeatedly regardless of the mood. In this paper, we analyze the video sequence that represents the current mood and recommend appropriate music that suits it. Our system matches an input video with music using the valence-arousal plane, an emotional plane.
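
    A small sketch of matching on the valence-arousal plane: the input video is mapped to a (valence, arousal) point and the nearest song on that plane is recommended. The song table and the video's point are hypothetical; how the point is estimated from the video is outside this sketch.

    ```python
    # Nearest-neighbour recommendation on the valence-arousal plane.
    import numpy as np

    songs = {                        # hypothetical song database with V-A points
        "calm_piano":   (0.3, -0.6),
        "upbeat_pop":   (0.8,  0.7),
        "dark_ambient": (-0.7, -0.4),
    }

    def recommend(video_point):
        """Return the song whose valence-arousal point is closest to the video's."""
        v = np.asarray(video_point)
        return min(songs, key=lambda s: np.linalg.norm(v - np.asarray(songs[s])))

    print(recommend((0.6, 0.5)))     # -> "upbeat_pop"
    ```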

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • Facial aging simulator by data-driven component-based texture cloning

    Daiki Kuwahara, Akinobu Maejima, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   8936   295 - 298  2015  [Refereed]

     View Summary

    Facial aging and rejuvenation simulation is a challenging topic because keeping personal characteristics at every age is a difficult problem. In this demonstration, we simulate facial aging and rejuvenation from a single photo. Our system alters an input face image into an aged face by reconstructing every facial component using a face database for the target age. Appropriate facial component images are selected by a special similarity measurement between the current age and the target age to keep personal characteristics as much as possible. Our system successfully generates aged and rejuvenated faces with age-related features such as spots, wrinkles, and sagging while keeping personal characteristics throughout all ages.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Focusing patch: Automatic photorealistic deblurring for facial images by patch-based color transfer

    Masahide Kawai, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   8935   155 - 166  2015  [Refereed]

     View Summary

    © Springer International Publishing Switzerland 2015. Facial image synthesis creates blurred facial images almost without high-frequency components, resulting in flat edges. Moreover, the synthesis process results in inconsistent facial images, such as the conditions where the white part of the eye is tinged with the color of the iris and the nasal cavity is tinged with the skin color. Therefore, we propose a method that can deblur an inconsistent synthesized facial image, including strong blurs created by common image morphing methods, and synthesize photographic quality facial images as clear as an image captured by a camera. Our system uses two original algorithms: patch color transfer and patch-optimized visio-lization. Patch color transfer can normalize facial luminance values with high precision, and patch-optimized visio-lization can synthesize a deblurred, photographic quality facial image. The advantages of our method are that it enables the reconstruction of the high-frequency components (concavo-convex) of human skin and removes strong blurs by employing only the input images used for original image morphing.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Facial Fattening and Slimming Simulation Based on Skull Structure

    Masahiro Fujisaki, Shigeo Morishima

    ADVANCES IN VISUAL COMPUTING, PT II (ISVC 2015)   9475   137 - 149  2015  [Refereed]

     View Summary

    In this paper, we propose a novel facial fattening and slimming deformation method for 2D images that preserves the individuality of the input face by estimating the skull structure from a frontal face image, and prevents unnatural deformation (e.g., penetration into the skull). Our method is composed of skull estimation, optimization of fattening and slimming rules appropriate to the estimated skull, mesh deformation to generate the fattened or slimmed face, and generation of a background image adapted to the generated face contour. Finally, we verify our method through comparison with other rules, the precision of skull estimation, a subjective experiment, and execution time.

    DOI

    Scopus

  • Dance motion segmentation method based on choreographic primitives

    Narumi Okada, Naoya Iwamoto, Tsukasa Fukusato, Shigeo Morishima

    GRAPP 2015 - 10th International Conference on Computer Graphics Theory and Applications; VISIGRAPP, Proceedings     332 - 339  2015  [Refereed]

     View Summary

    Data-driven animation using a large human motion database enables the programming of various natural human motions. While the development of motion capture systems allows the acquisition of realistic human motion, segmenting the captured motion into a series of primitive motions is necessary for the construction of a motion database. Although most segmentation methods have focused on periodic motion, e.g., walking and jogging, segmenting non-periodic and asymmetrical motions such as dance performance remains a challenging problem. In this paper, we present a specialized segmentation approach for human dance motion. Our approach consists of three steps based on the assumption that human dance motion is composed of consecutive choreographic primitives. First, we perform an investigation based on dancer perception to determine segmentation components. After professional dancers have selected segmentation sequences, we use their selected sequences to define rules for the segmentation of choreographic primitives. Finally, the accuracy of our approach is verified by a user study, and we thereby show that our approach is superior to existing segmentation methods. Through these three steps, we demonstrate automatic dance motion synthesis based on the choreographic primitives obtained.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • FG2015 Age Progression Evaluation

    Andreas Lanitis, Nicolas Tsapatsoulis, Kleanthis Soteriou, Daiki Kuwahara, Shigeo Morishima

    2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 4     1 - 6  2015  [Refereed]

     View Summary

    The topic of face aging has received increased attention from the computer vision community in recent years. This interest is motivated by important real-life applications in which accurate age progression algorithms can be used. However, age progression methodologies can only be used in real applications if they are able to produce accurate age-progressed images. It is therefore of utmost importance to encourage the development of accurate age progression algorithms through the formulation of performance evaluation protocols that can be used to obtain accurate evaluation results for the different algorithms reported in the literature. In this paper we describe the organization of the first ever pilot independent age progression competition, which aims to provide the basis of a robust framework for assessing age progression methodologies. The evaluation involves several machine-based and human-based indicators that were used to assess eight age progression methods.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • MusicMixer: Computer-Aided DJ System based on an Automatic Song Mixing

    Tatsunori Hirai, Hironori Doi, Shigeo Morishima

    12TH ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY CONFERENCE (ACE15)   16-19-November-2015   41:1-41:5  2015  [Refereed]

     View Summary

    In this paper, we present MusicMixer, a computer-aided DJ system that helps DJs, specifically with song mixing. MusicMixer continuously mixes and plays songs using an automatic music mixing method that employs audio similarity calculations. By calculating similarities between song sections that can be naturally mixed, MusicMixer enables seamless song transitions. Though song mixing is the most fundamental and important factor in DJ performance, it is difficult for untrained people to seamlessly connect songs. MusicMixer realizes automatic song mixing using an audio signal processing approach; therefore, users can perform DJ mixing simply by selecting a song from a list of songs suggested by the system, enabling effective DJ song mixing and lowering entry barriers for the inexperienced. We also propose personalization for song suggestions using a preference memorization function of MusicMixer.
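
    The following sketch illustrates the section-similarity idea: compare the tail of the currently playing song against every candidate head section of the next song and choose the most similar mix point. The per-beat feature vectors are assumed to be precomputed (e.g., chroma- or MFCC-like); this is not MusicMixer's actual similarity measure.

    ```python
    # Slide a window over the next song and pick the start beat whose section is
    # most similar (cosine similarity) to the end of the current song.
    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    def best_mix_point(tail_feats, next_song_feats, section_len=16):
        """Return (start_beat, similarity) of the best-matching section."""
        tail = tail_feats[-section_len:].ravel()
        best = (0, -1.0)
        for start in range(len(next_song_feats) - section_len + 1):
            head = next_song_feats[start:start + section_len].ravel()
            sim = cosine(tail, head)
            if sim > best[1]:
                best = (start, sim)
        return best

    a = np.random.rand(64, 12)       # beat-synchronous features of the current song
    b = np.random.rand(128, 12)      # features of the candidate next song
    print(best_mix_point(a, b))
    ```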

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Pose-independent garment transfer

    Fumiya Narita, Shunsuke Saito, Takuya Kato, Tsukasa Fukusato, Shigeo Morishima

    SIGGRAPH Asia 2014 Posters, SIGGRAPH ASIA 2014     12  2014.11  [Refereed]

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • VRMixer: Mixing video and real world with video segmentation

    Tatsunori Hirai, Satoshi Nakamura, Tsubasa Yumura, Shigeo Morishima

    ACM International Conference Proceeding Series   2014-November   30 - 7  2014.11  [Refereed]

     View Summary

    This paper presents VRMixer, a system that mixes the real world with a video clip, letting a user enter the clip and virtually co-star with the people appearing in it. Our system constructs a simple virtual space by allocating video frames and the people appearing in the clip within the user's 3D space. By measuring the user's 3D depth in real time, the time space of the video clip and the user's 3D space become mixed. VRMixer automatically extracts human images from a video clip using a video segmentation technique based on 3D graph cut segmentation, employing face detection to detach the human area from the background. A virtual 3D space (i.e., 2.5D space) is constructed by positioning the background at the back and the people at the front. Using a depth camera, the user can stand in front of or behind the people in the video clip. Real objects closer than the distance of the clip's background become part of the constructed virtual 3D space. This synthesis creates a new image in which the user appears to be part of the video clip, or in which people in the clip appear to enter the real world. We aim to realize "video reality," i.e., a mixture of reality and video clips, using VRMixer.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Automatic depiction of onomatopoeia in animation considering physical phenomena

    Tsukasa Fukusato, Shigeo Morishima

    Proceedings - Motion in Games 2014, MIG 2014     161 - 169  2014.11  [Refereed]

     View Summary

    Copyright © ACM. This paper presents a method that enables the estimation and depiction of onomatopoeia in computer-generated animation based on physical parameters. Onomatopoeia is used to enhance physical characteristics and movement, and enables users to understand animation more intuitively. We experiment with onomatopoeia depiction in scenes within the animation process. To quantify onomatopoeia, we employ Komatsu's [2012] assumption, i.e., onomatopoeia can be expressed by n-dimensional vector. We also propose phonetic symbol vectors based on the correspondence of phonetic symbols to the impressions of onomatopoeia using a questionnaire-based investigation. Furthermore, we verify the positioning of onomatopoeia in animated scenes. The algorithms directly combine phonetic symbols to estimate optimum onomatopoeia. They use a view-dependent Gaussian function to display onomatopoeias in animated scenes. Our method successfully recommends optimum onomatopoeias using only physical parameters, so that even amateur animators can easily create onomatopoeia animation.
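
    As a rough illustration of estimating an onomatopoeia from physical parameters, the sketch below maps physics to an impression vector and picks the word with the nearest vector; the two-dimensional space, the word table, and the mapping are purely hypothetical simplifications of the paper's phonetic-symbol vectors.

    ```python
    # Choose an onomatopoeia whose impression vector is nearest to the one
    # derived from physical parameters. All vectors here are illustrative.
    import numpy as np

    word_vectors = {                  # hypothetical (heaviness, sharpness) impressions
        "don":    np.array([0.9, 0.2]),
        "kotsun": np.array([0.3, 0.6]),
        "pishi":  np.array([0.1, 0.9]),
    }

    def impression_from_physics(mass, impact_speed):
        """Map physical parameters of a collision into the same impression space."""
        heaviness = np.tanh(mass / 10.0)
        sharpness = np.tanh(impact_speed / 5.0)
        return np.array([heaviness, sharpness])

    def pick_onomatopoeia(mass, impact_speed):
        target = impression_from_physics(mass, impact_speed)
        return min(word_vectors, key=lambda w: np.linalg.norm(target - word_vectors[w]))

    print(pick_onomatopoeia(mass=20.0, impact_speed=1.0))   # heavy, dull impact
    ```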

    DOI

    Scopus

    7
    Citation
    (Scopus)
  • VRMixer: Creating New Content by Mixing Video and Reality

    Tatsunori Hirai, Satoshi Nakamura, Shigeo Morishima, Tsubasa Yumura

    Proceedings of the OngaCREST Symposium 2014     27 - 27  2014.08

  • Singing Scene Detection in Music Videos Based on Analysis of Singer Footage and Singing Voice

    Tatsunori Hirai, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

    Proceedings of the OngaCREST Symposium 2014     20 - 20  2014.08

  • Hair Features for Face Retrieval Based on Similarity of Impression

    Takahiro Fuji, Tsukasa Fukusato, Shoto Sasaki, Taro Masuda, Tatsunori Hirai, Shigeo Morishima

    Proceedings of the 17th Meeting on Image Recognition and Understanding (MIRU2014)     1 - 2  2014.07

  • Video Summarization Based on Rally Scene Features in Racquet Sports Videos

    Toshiya Kawamura, Tsukasa Fukusato, Tatsunori Hirai, Shigeo Morishima

    Proceedings of the 17th Meeting on Image Recognition and Understanding (MIRU2014)     1 - 2  2014.07

  • A Visuomotor Coordination Model for Obstacle Recognition

    Tomoyori Iwao, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    Journal of WSCG   22 ( 1-2 ) 49 - 56  2014.07  [Refereed]

  • Facial Pattern Measurement and Synthesis Techniques for Entertainment Applications

    Shigeo Morishima

    Keisoku to Seigyo (Journal of the Society of Instrument and Control Engineers)   53 ( 7 ) 593 - 598  2014.07  [Refereed]

    DOI CiNii

  • A Study of Singing Scene Detection Methods for Music Videos Based on Analysis of Singer Footage and Singing Voice

    Tatsunori Hirai, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

    IPSJ SIG Technical Report, Music and Computer (MUS)   2014 ( 54 ) 1 - 8  2014.05

     View Summary

    This paper examines a method for detecting "singing scenes", i.e., scenes in which the singer is singing, in music videos such as live footage and promotional videos. The singer plays the most central role in music, and singing scenes are likewise among the highlights of a music video. Singing scenes are useful for generating video thumbnails and for quickly browsing large collections of music videos. Detecting them requires component techniques such as face recognition of the singer and vocal activity detection in the song, as well as a way of combining them. In this paper, we compare video analysis using face recognition, audio analysis using vocal activity detection, and audio-visual analysis combining the two, and discuss the feasibility of singing scene detection.

    CiNii

  • Macroscopic and microscopic deformation coupling in up-sampled cloth simulation

    Shunsuke Saito, Nobuyuki Umetani, Shigeo Morishima

    COMPUTER ANIMATION AND VIRTUAL WORLDS   25 ( 3-4 ) 437 - 446  2014.05  [Refereed]

     View Summary

    Various methods of predicting the deformation of fine-scale cloth from coarser resolutions have been explored. However, the influence of fine-scale deformation has not been considered in coarse-scale simulations; thus, the simulation of highly nonhomogeneous detailed cloth is prone to large errors. We introduce an effective method to simulate cloth made of nonhomogeneous, anisotropic materials. We precompute a macroscopic stiffness that incorporates anisotropy from the microscopic structure, using the deformation computed for each unit strain. At every time step of the simulation, we compute the deformation of coarse meshes using the coarsened stiffness, which saves computational time, and add higher-level details constructed from the characteristic displacement of the simulated meshes. We demonstrate that anisotropic and inhomogeneous cloth models can be simulated efficiently using our method.
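
    The sketch below illustrates the coarsening principle in one dimension only: precompute how a chain of non-homogeneous springs deforms under a unit strain, derive one effective stiffness, simulate the single coarse degree of freedom, and reconstruct fine-scale displacements from the stored unit-strain mode. It demonstrates the idea, not the paper's cloth model.

    ```python
    # 1D homogenisation stand-in for "precompute macroscopic stiffness from
    # unit-strain deformations, simulate coarse, then upsample detail".
    import numpy as np

    k_fine = np.array([4.0, 1.0, 8.0, 2.0])          # microscopic spring stiffnesses

    # Precompute: effective stiffness of springs in series and the unit-strain mode.
    k_eff = 1.0 / np.sum(1.0 / k_fine)               # macroscopic stiffness
    unit_extension = k_eff / k_fine                  # per-spring share of a unit stretch
    unit_mode = np.concatenate([[0.0], np.cumsum(unit_extension)])  # fine node offsets

    # Runtime: solve only the coarse DOF (total stretch under a force), then upsample.
    force = 3.0
    total_stretch = force / k_eff
    fine_displacements = total_stretch * unit_mode
    print(k_eff, fine_displacements)
    ```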

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Facial Aging Simulation by Patch-Based Texture Synthesis with Statistical Wrinkle Aging Pattern Model

    Akinobu Maejima, Ai Mizokawa, Daiki Kuwahara, Shigeo Morishima

    Mathematical Progress in Expressive Image Synthesis     161 - 170  2014.02  [Refereed]

  • Driver Drowsiness Estimation from Facial Expression Features: Computer Vision Feature Investigation Using a CG Model

    Taro Nakamura, Akinobu Maejima, Shigeo Morishima

    PROCEEDINGS OF THE 2014 9TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, THEORY AND APPLICATIONS (VISAPP 2014), VOL 2   2   207 - 214  2014  [Refereed]

     View Summary

    We propose a method for estimating the degree of a driver's drowsiness on the basis of changes in facial expressions captured by an IR camera. Typically, drowsiness is accompanied by drooping eyelids. Therefore, most related studies have focused on tracking eyelid movement by monitoring facial feature points. However, the drowsiness feature emerges not only in eyelid movements but also in other facial expressions. To more precisely estimate drowsiness, we must select other effective features. In this study, we detected a new drowsiness feature by comparing a video image and CG model that are applied to the existing feature point information. In addition, we propose a more precise degree of drowsiness estimation method using wrinkle changes and calculating local edge intensity on faces, which expresses drowsiness more directly in the initial stage.
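
    Below is a small sketch of the "local edge intensity" wrinkle feature mentioned above: the mean gradient magnitude inside a face region of interest (e.g., between the eyebrows). The region coordinates are placeholders, and gradients simply stand in for wrinkle edges.

    ```python
    # Mean gradient magnitude inside a region of interest as a frown/wrinkle cue.
    import numpy as np

    def edge_intensity(gray_image, roi):
        """gray_image: 2D float array; roi: (y0, y1, x0, x1) region of interest."""
        y0, y1, x0, x1 = roi
        patch = gray_image[y0:y1, x0:x1]
        gy, gx = np.gradient(patch.astype(float))
        return float(np.mean(np.hypot(gx, gy)))      # average edge strength in the ROI

    frame = np.random.rand(240, 320)                  # stand-in for an IR camera frame
    glabella = (60, 100, 140, 180)                    # hypothetical region between eyebrows
    print("frown feature:", edge_intensity(frame, glabella))
    ```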

    DOI

    Scopus

    19
    Citation
    (Scopus)
  • Measured curvature-dependent reflectance function for synthesizing translucent materials in real-time

    Midori Okamoto, Shohei Adachi, Hiroaki Ukaji, Kazuki Okami, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     96:1  2014  [Refereed]

    DOI

    Scopus

  • Patch-based fast image interpolation in spatial and temporal direction

    Shunsuke Saito, Ryuuki Sakamoto, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     70:1  2014  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Face retrieval system by similarity of impression based on hair attribute

    Takahiro Fuji, Tsukasa Fukusato, Shoto Sasaki, Taro Masuda, Tatsunori Hirai, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     65:1  2014  [Refereed]

    DOI

    Scopus

  • Efficient video viewing system for racquet sports with automatic summarization focusing on rally scenes

    Shunya Kawamura, Tsukasa Fukusato, Tatsunori Hirai, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     62:1  2014  [Refereed]

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • Automatic deblurring for facial image based on patch synthesis

    Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     58:1  2014  [Refereed]

    DOI

    Scopus

  • Photorealistic facial image from monochrome pencil sketch

    Ai Mizokawa, Taro Nakamura, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     39:1  2014  [Refereed]

    DOI

    Scopus

  • Facial fattening and slimming simulation considering skull structure

    Masahiro Fujisaki, Daiki Kuwahara, Taro Nakamura, Akinobu Maejima, Takayoshi Yamashita, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     35:1  2014  [Refereed]

    DOI

    Scopus

  • The efficient and robust sticky viscoelastic material simulation

    Kakuto Goto, Naoya Iwamoto, Shunsuke Saito, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     15:1  2014  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Quasi 3D rotation for hand-drawn characters

    Chie Furusawa, Tsukasa Fukusato, Narumi Okada, Tatsunori Hirai, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     12:1  2014  [Refereed]

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Material parameter editing system for volumetric simulation models

    Naoya Iwamoto, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     10:1  2014  [Refereed]

    DOI

    Scopus

  • Example-based blendshape sculpting with expression individuality

    Takuya Kato, Shunsuke Saito, Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014     7:1  2014  [Refereed]

    DOI

    Scopus

  • Application friendly voxelization on GPU by geometry splitting

    Zhuopeng Zhang, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   8698 LNCS   112 - 120  2014  [Refereed]

     View Summary

    In this paper, we present a novel approach that uses the geometry shader to dynamically voxelize 3D models in real time. In the geometry shader, the primitives are split by their Z-order and then rendered to tiles that compose a single 2D texture. This method is based entirely on the graphics pipeline rather than on computational methods such as CUDA/OpenCL implementations, so it can be easily integrated into a rendering or simulation system. Another advantage of our algorithm is that, while performing voxelization, it can simultaneously record additional mesh information such as normals, material properties, and even the speed of vertex displacement. Our method achieves conservative voxelization with only two rendering passes, without any preprocessing, and it runs fully on the GPU. As a result, our algorithm is very useful for dynamic applications.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • LyricsRadar: A Lyrics Retrieval System Based on Latent Topics of Lyrics.

    Shoto Sasaki, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

    Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR 2014, Taipei, Taiwan, October 27-31, 2014     585 - 590  2014  [Refereed]

  • Spotting a query phrase from polyphonic music audio signals based on semi-supervised nonnegative matrix factorization

    Taro Masuda, Kazuyoshi Yoshii, Masataka Goto, Shigeo Morishima

    Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR 2014     227 - 232  2014  [Refereed]

     View Summary

    © Taro Masuda, Kazuyoshi Yoshii, Masataka Goto, Shigeo Morishima. This paper proposes a query-by-audio system that aims to detect temporal locations where a musical phrase given as a query is played in musical pieces. The “phrase” in this paper means a short audio excerpt that is not limited to a main melody (singing part) and is usually played by a single musical instrument. A main problem of this task is that the query is often buried in mixture signals consisting of various instruments. To solve this problem, we propose a method that can appropriately calculate the distance between a query and partial components of a musical piece. More specifically, gamma process nonnegative matrix factorization (GaP-NMF) is used for decomposing the spectrogram of the query into an appropriate number of basis spectra and their activation patterns. Semi-supervised GaP-NMF is then used for estimating activation patterns of the learned basis spectra in the musical piece by presuming the piece to partially consist of those spectra. This enables distance calculation based on activation patterns. The experimental results showed that our method outperformed conventional matching methods.
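
    A compact sketch of the semi-supervised NMF idea described above: learn basis spectra from the query with plain NMF, then factorize the musical piece while keeping the query bases fixed and adding free bases for everything else; high activation of the query bases suggests where the phrase is played. Euclidean multiplicative updates are used here for brevity, whereas the paper uses GaP-NMF.

    ```python
    # Semi-supervised NMF sketch: frozen query bases plus free bases for the piece.
    import numpy as np

    def nmf(V, rank, iters=200, W_fixed=None):
        """Factorise V ~ W @ H with multiplicative updates.
        If W_fixed is given, those columns of W are kept frozen."""
        rng = np.random.default_rng(0)
        W = rng.random((V.shape[0], rank)) + 1e-3
        H = rng.random((rank, V.shape[1])) + 1e-3
        n_fix = 0 if W_fixed is None else W_fixed.shape[1]
        if n_fix:
            W[:, :n_fix] = W_fixed
        for _ in range(iters):
            H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
            W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
            if n_fix:
                W[:, :n_fix] = W_fixed              # re-freeze the query bases
        return W, H

    query = np.random.rand(64, 20)                   # query spectrogram (freq x time)
    piece = np.random.rand(64, 200)                  # full piece spectrogram
    Wq, _ = nmf(query, rank=4)                       # learn query basis spectra
    W, H = nmf(piece, rank=12, W_fixed=Wq)           # semi-supervised factorisation
    query_activation = H[:4].sum(axis=0)             # per-frame evidence for the phrase
    print(int(np.argmax(query_activation)))
    ```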

  • Data-driven speech animation synthesis focusing on realistic inside of the mouth

    Masahide Kawai, Tomoyori Iwao, Daisuke Mima, Akinobu Maejima, Shigeo Morishima

    Journal of Information Processing   22 ( 2 ) 401 - 409  2014  [Refereed]

     View Summary

    Speech animation synthesis is still a challenging topic in the field of computer graphics. Despite many advances, representing the detailed appearance of the inner mouth, such as the tip of the tongue nipped between the teeth or the back of the tongue, has not been achieved in the resulting animations. To solve this problem, we propose a method of data-driven speech animation synthesis that focuses on the inside of the mouth. First, we classify the inner mouth into teeth, labeled with the opening distance of the teeth, and a tongue, according to phoneme information. We then insert them into existing speech animation based on the opening distance of the teeth and phoneme information. Finally, we apply a patch-based texture synthesis technique, with a database of 2,213 images created from 7 subjects, to the resulting animation. Using the proposed method, we can automatically generate speech animation with a realistic inner mouth from existing speech animation created by previous methods.

    DOI

    Scopus

    12
    Citation
    (Scopus)
  • Automatic photorealistic 3D inner mouth restoration from frontal images

    Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   8887   51 - 62  2014  [Refereed]

     View Summary

    © Springer International Publishing Switzerland 2014. In this paper, we propose a novel method to generate highly photorealistic three-dimensional (3D) inner mouth animation that is well-fitted to an original ready-made speech animation using only frontal captured images and a small-size database. The algorithms are composed of quasi-3D model reconstruction and motion control of teeth and the tongue, and final compositing of photorealistic speech animation synthesis tailored to the original.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • Automatic Music Video Generation System by Reusing Posted Web Content with Hidden Markov Model

    Hayato Ohya, Shigeo Morishima

    IIEEJ Transactions on Image Electronics and Visual Computing   11 ( 1 ) 65 - 73  2013.12  [Refereed]

  • Analysis and Synthesis of Eye Movement during Conversation Based on a Probabilistic Model

    Tomoyori Iwao, Daisuke Mima, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    IIEEJ Transactions on Image Electronics and Visual Computing   42 ( 5 ) 661 - 670  2013.09  [Refereed]

  • Automatic Comic-Style Video Summarization of Anime Films by Key-frame Detection

    Tsukasa Fukusato, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

    Proceedings of the Expressive 2013    2013.07  [Refereed]

  • A Video Summarization Method Based on Automatic Key-Frame Extraction from Anime Films

    Tsukasa Fukusato, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

    IIEEJ Transactions on Image Electronics and Visual Computing   42 ( 4 ) 448 - 456  2013.07  [Refereed]

  • An Automatic System for Generating Videos Matched to Music by Reusing Existing Music Videos

    Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

    IPSJ Journal   54 ( 4 ) 1254 - 1262  2013.04  [Refereed]

  • A Recommendation Method for Character Appearance Scenes in Videos Based on Image Similarity within Attention Regions

    Taro Masuda, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

    Proceedings of the 75th IPSJ National Convention   2013 ( 1 ) 601 - 602  2013.03

     View Summary

    In this study, we propose a similarity measure between video features for retrieving scenes in other videos that are similar to a given video scene. Most research on similar-video retrieval computes features over the entire frame, treating the important and unimportant parts of the frame equally. In our method, we estimate regions in the video that tend to attract human attention and extract features within those local regions. By also taking the features of the whole frame into account, we construct a similarity measure between videos that considers both global and local features. We retrieved similar videos by searching a database for the scene with the highest similarity to an input query scene, and examined the effectiveness of the method through experiments.

    CiNii

  • Reflectance estimation of human face from a single shot image

    Kazuki Okami, Naoya Iwamoto, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     105  2013  [Refereed]

     View Summary

    Simulation of the reflectance of translucent materials is one of the most important factors in the creation of realistic CG objects. Estimating the reflectance characteristics of translucent materials from a single image is a very efficient way of re-rendering objects that exist in real environments. However, this task is considerably challenging because this approach leads to problems such as the existence of many unknown parameters. Munoz et al. [2011] proposed a method for the estimation of the bidirectional surface scattering reflectance distribution function (BSSRDF) from a given single image. However, it is difficult or impossible to estimate the BSSRDF of materials with complex shapes because this method's target was the convexity of objects therefore, it used a rough depth recovery technique for global convex objects. In this paper, we propose a method for accurately estimating the BSSRDF of human faces, which have complex shapes. We use a 3D face reconstruction technique to satisfy the above assumption. We are able to acquire more accurate geometries of human faces, and it enables us to estimate the reflectance characteristics of faces.

    DOI

    Scopus

  • Real-time dust rendering by parametric shell texture synthesis

    Shohei Adachi, Hiroaki Ukaji, Takahiro Kosaka, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     104  2013  [Refereed]

     View Summary

    When we synthesize a realistic appearance of a dust-covered object with CG, it is necessary to accurately express a large number of fabric components of dust with many short fibers, and as a result this process is time-consuming. Hsu [1995] proposed modeling and rendering techniques for dusty surfaces, including a dust amount prediction function. These techniques describe dust accumulation only as a shading function, and therefore cannot express the volume of dust on the surfaces. In this study, we present a novel method to model and render the appearance and volume of dust in real time using shell texturing. Each shell texture, which can express several components, is automatically generated by our procedural approach. Therefore, we can draw any arbitrary appearance of dust rapidly and interactively by controlling simple parameters.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Affective music recommendation system using input images

    Shoto Sasaki, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     90  2013  [Refereed]

     View Summary

    Music that matches our current mood can create a deep impression, and this is usually what we want when we listen to music. However, we do not know which music best matches our present mood; we have to listen to each song, searching for one that matches. As it is difficult to select music manually, we need a recommendation system that can operate affectively. Most recommendation methods, such as collaborative filtering or content similarity, do not target a specific mood. In addition, there may be no word that exactly specifies the mood, so textual retrieval is not effective. In this paper, we assume that there is a relationship between our mood and images, because visual information affects our mood when we listen to music. We present an affective music recommendation system that uses an input image without textual information.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • Photorealistic aged face image synthesis by wrinkles manipulation

    Ai Mizokawa, Hiroki Nakai, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     64  2013  [Refereed]

     View Summary

    Many studies on aged face image synthesis have been reported, motivated by security applications such as investigations of criminals or kidnapped children, and by entertainment applications such as movies and video games.

    DOI

    Scopus

    6
    Citation
    (Scopus)
  • Driver drowsiness estimation using facial wrinkle feature

    Taro Nakamura, Tatsuhide Matsuda, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     30  2013  [Refereed]

     View Summary

    In recent years, the rate of fatal motor vehicle accidents caused by distracted driving, resulting from factors such as falling asleep at the wheel, has been increasing. Therefore, an alert system that detects driver drowsiness and prevents accidents by warning drivers before they fall asleep is urgently required. Non-contact measuring systems using computer vision techniques have been studied, and in a vision-based approach it is important to decide what kind of feature should be used for estimating drowsiness.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Photorealistic inner mouth expression in speech animation

    Masahide Kawai, Tomoyori Iwao, Daisuke Mima, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     9  2013  [Refereed]

     View Summary

    We often see close-ups of CG characters' faces in movies or video games. In such situations, the quality of a character's face (mainly in dialogue scenes) primarily determines that of the entire movie. Creating highly realistic speech animation is essential because viewers watch these scenes carefully. In general, such speech animations are created manually by skilled artists. However, creating them requires a considerable effort and time.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Generating eye movement during conversations using markov process

    Tomoyori Iwao, Daisuke Mima, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     6  2013  [Refereed]

     View Summary

    Generating realistic eye movements is a significant topic in the computer graphics (CG) content production field. Appropriate modeling and synthesis of eye movements is very difficult because they have many important features. Gu et al. [2007] proposed a method for automatically synthesizing realistic eye movements during conversations according to probability models. Although eye movements during conversations include both saccades and fixational eye movements (FEMs), they synthesized only saccades, which are relatively large eye movements.

    DOI

    Scopus

  • Expressive dance motion generation

    Narumi Okada, Kazuki Okami, Tsukasa Fukusato, Naoya Iwamoto, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     4  2013  [Refereed]

     View Summary

    The power of expression such as accent in motion and movement of arms is an indispensable factor in dance performance because there is a large difference in appearance between natural dance and expressive motions. Needless to say, expressive dance motion makes a great impression on viewers. However, creating such a dance motion is challenging because most of the creators have little knowledge about dance performance. Therefore, there is a demand for a system that generates expressive dance motion with ease. Tsuruta et al. [2010] generated expressive dance motion by changing only the speed of input motion or altering joint angles. However, the power of expression was not evaluated with certainty, and the generated motion did not synchronize with music. Therefore, the generated motion did not always satisfy the viewers.

    DOI

    Scopus

  • Efficient speech animation synthesis with vocalic lip shapes

    Daisuke Mima, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013     2  2013  [Refereed]

     View Summary

    Computer-generated speech animations are commonly seen in video games and movies. Although high-quality facial motions can be created by the hand crafted work of skilled artists, this approach is not always suitable because of time and cost constraints. A data-driven approach [Taylor et al. 2012], such as machine learning to concatenate video portions of speech training data, has been utilized to generate natural speech animation, while a large number of target shapes are often required for synthesis. We can obtain smooth mouth motions from prepared lip shapes for typical vowels by using an interpolation of lip shapes with Gaussian mixture models (GMMs) [Yano et al. 2007]. However, the resulting animation is not directly generated from the measured lip motions of someone's actual speech.

    DOI

    Scopus

  • Real-time hair simulation on mobile device

    Zhuopeng Zhang, Shigeo Morishima

    Proceedings - Motion in Games 2013, MIG 2013     127 - 132  2013  [Refereed]

     View Summary

    Hair rendering and simulation is a fundamental part of the representation of virtual characters, but the intensive calculation of the dynamics of thousands of hair strands makes the task challenging, especially on a portable device. The aim of this short paper is to perform real-time hair simulation and rendering on a mobile device. We adapt the hair simulation and rendering process to the properties of mobile hardware. To increase the number of simulated hair strands, we adopt the Dynamic Follow-The-Leader (DFTL) method and extend it with a new interpolation method. We also design a rendering strategy based on a survey of the limitations of mobile GPUs. Lastly, we present a method that achieves order-independent transparency at relatively low cost.
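
    A minimal sketch of the Dynamic Follow-The-Leader (DFTL) strand update referenced above: after an unconstrained step, each particle is projected back to its rest distance from its predecessor, and part of the correction is fed back into the velocities. The constants and damping term are illustrative, not the paper's tuned values.

    ```python
    # One DFTL time step for a single strand; particle 0 is the fixed root.
    import numpy as np

    def dftl_step(pos, vel, rest_len, dt,
                  gravity=np.array([0.0, -9.8, 0.0]), damping=0.9):
        proposed = pos + dt * vel + dt * dt * gravity
        proposed[0] = pos[0]                          # root stays attached to the head
        corr = np.zeros_like(pos)
        for i in range(1, len(pos)):
            d = proposed[i] - proposed[i - 1]
            d *= rest_len / (np.linalg.norm(d) + 1e-9)
            target = proposed[i - 1] + d              # enforce segment rest length
            corr[i] = target - proposed[i]
            proposed[i] = target
        new_vel = (proposed - pos) / dt
        new_vel[:-1] -= damping * corr[1:] / dt       # follow-the-leader velocity feedback
        return proposed, new_vel

    pos = np.array([[0.0, 0.0, 0.0], [0.0, -0.1, 0.0], [0.0, -0.2, 0.0]])
    vel = np.zeros_like(pos)
    for _ in range(10):
        pos, vel = dftl_step(pos, vel, rest_len=0.1, dt=1 / 60)
    print(pos)
    ```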

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Automatic Mash Up Music Video Generation System by Remixing Existing Video Content

    Hayato Ohya, Shigeo Morishima

    2013 INTERNATIONAL CONFERENCE ON CULTURE AND COMPUTING (CULTURE AND COMPUTING 2013)     157 - 158  2013  [Refereed]

     View Summary

    A music video is a short film that presents a visual representation of a piece of music. These days, there is a trend of amateur users creating music videos on video sharing websites. In particular, a music video created by cutting and pasting existing video is called a mashup music video. In this paper, we propose a system with which users can easily create mashup music videos from existing music videos, and we conducted an assessment experiment on the system. The system first extracts music features and video features from existing music videos. Each feature is then clustered, and the relationship between the features is learned by a hidden Markov model. Finally, the system cuts the learned video scene whose features are closest among the learned videos and pastes it in synchronization with the input song. Experiments show that our method can generate more synchronized video than a previous method.
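
    The sketch below illustrates the learn-and-reuse idea: cluster labels of music and video segments from existing music videos are used to estimate how video clusters follow one another (transitions) and which video clusters co-occur with each music cluster (emissions), and a Viterbi pass then picks a video-cluster sequence for a new song. Counting-based estimation replaces full HMM training here, and all label sequences are hypothetical.

    ```python
    # Estimate HMM-style tables by counting, then decode with Viterbi.
    import numpy as np

    def estimate_tables(video_seq, music_seq, n_video, n_music):
        trans = np.ones((n_video, n_video))          # add-one smoothing
        emit = np.ones((n_video, n_music))
        for v_prev, v in zip(video_seq[:-1], video_seq[1:]):
            trans[v_prev, v] += 1
        for v, m in zip(video_seq, music_seq):
            emit[v, m] += 1
        trans /= trans.sum(1, keepdims=True)
        emit /= emit.sum(1, keepdims=True)
        return trans, emit

    def viterbi(music_seq, trans, emit):
        logp = np.log(emit[:, music_seq[0]])
        back = []
        for m in music_seq[1:]:
            scores = logp[:, None] + np.log(trans) + np.log(emit[:, m])[None, :]
            back.append(scores.argmax(0))
            logp = scores.max(0)
        path = [int(logp.argmax())]
        for b in reversed(back):
            path.append(int(b[path[-1]]))
        return path[::-1]

    video = [0, 0, 1, 2, 1, 0]                       # cluster labels from existing videos
    music = [0, 1, 1, 2, 2, 0]
    trans, emit = estimate_tables(video, music, n_video=3, n_music=3)
    print(viterbi([0, 1, 2, 2], trans, emit))        # video clusters for a new song
    ```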

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Affective Music Recommendation System Reflecting the Mood of Input Image

    Shoto Sasaki, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

    2013 INTERNATIONAL CONFERENCE ON CULTURE AND COMPUTING (CULTURE AND COMPUTING 2013)     153 - 154  2013  [Refereed]

     View Summary

    We present an affective music recommendation system using input images without textual information. Music that matches our current mood can create a deep impression. However, we do not know which music best matches our present mood. As it is difficult to select music manually, we need a recommendation system that can operate affectively. In this paper, we assume that there exists a relationship between our mood and images because visual information affects our mood when we listen to music. Our system matches an input image with music using valence-arousal plane which is an emotional plane.

    DOI

    Scopus

    8
    Citation
    (Scopus)
  • Interactive Aged-Face Simulation with Freehand Wrinkle Drawing

    Ai Mizokawa, Akinobu Maejima, Shigeo Morishima

    2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013)     765 - 769  2013  [Refereed]

     View Summary

    Recently, many studies on facial aging synthesis have been reported for security applications, such as criminal and kidnapping investigations, and entertainment applications such as movies and video games. However, the representation of wrinkles, which is one of the most important elements when reflecting age characteristics, remains difficult. Additionally, the influence of lighting conditions and each individual's skin color is significant, and it is difficult to infer the location and shape of future wrinkles because they depend on factors such as one's living environment, eating habits, and DNA. Therefore, we must consider several possibilities for the locations of wrinkles. In this paper, we propose a facial aging synthesis method that can create plausible aged facial images and can represent wrinkles at any desired location from freehand wrinkle drawings while retaining photorealistic quality.

    DOI

    Scopus

  • Detection of Driver's Drowsy Facial Expression

    Taro Nakamura, Akinobu Maejima, Shigeo Morishima

    2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013)     749 - 753  2013  [Refereed]

     View Summary

    We propose a method for estimating the degree of a driver's drowsiness on the basis of changes in facial expressions captured by an IR camera. Typically, drowsiness is accompanied by drooping eyelids. Therefore, most related studies have focused on tracking eyelid movement by monitoring facial feature points. However, textural changes that arise from frowning are also very important and sensitive features in the initial stage of drowsiness, and it is difficult to detect such changes solely from facial feature points. In this paper, we propose a more precise drowsiness-degree estimation method that considers wrinkle changes by calculating local edge intensity on the face, which expresses drowsiness more directly in the initial stage.

    DOI

    Scopus

    14
    Citation
    (Scopus)
  • Facial Aging Simulator Based on Patch-based Facial Texture Reconstruction

    Akinobu Maejima, Ai Mizokawa, Daiki Kuwahara, Shigeo Morishima

    2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013)     732 - 733  2013  [Refereed]

     View Summary

    We propose a facial aging simulator which can synthesize a photorealistic human aged-face image for criminal investigation. Our aging simulator is based on the patch-based facial texture reconstruction with a wrinkle aging pattern model. The advantage of our method is to synthesize an aged-face image with detailed skin texture such as spots and somberness of facial skin, as well as age-related facial wrinkles without blurs that are derived from lack of accurate pixel-wise alignments as in the linear combination model, while maintaining the identity of the original face.

    DOI

    Scopus

  • Automatic Mash up Music Video Generation System by Perceptual Synchronization of Music and Video Features

    Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

    Special Interest Group on Computer Graphics and Interactive Techniques Conference, SIGGRAPH '12, Los Angeles, CA, USA, 2012, Poster Proceedings     449:1  2012.08  [Refereed]

  • Automatic Feature Point Detection Using Linear Predictors with Facial Shape Constraint

    MATSUDA Tatsuhide, HARA Tomoya, MAEJIMA Akinobu, MORISHIMA Shigeo

    The IEICE Transactions on Information and Systems (Japanese Edition)   95 ( 8 ) 1530 - 1540  2012.08  [Refereed]

    CiNii J-GLOBAL

  • Acquiring shell textures from a single image for realistic fur rendering

    Hiroaki Ukaji, Takahiro Kosaka, Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

    ACM SIGGRAPH 2012 Posters, SIGGRAPH'12     100  2012  [Refereed]

    DOI

    Scopus

  • Fast-automatic 3D face generation using a single video camera

    Tomoya Hara, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2012 Posters, SIGGRAPH'12     91  2012  [Refereed]

    DOI

    Scopus

  • Facial aging simulator considering geometry and patch-tiled texture

    Yusuke Tazoe, Hiroaki Gohara, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2012 Posters, SIGGRAPH'12     90  2012  [Refereed]

     View Summary

    People can estimate the approximate age of others by looking at their faces, because faces contain elements from which a person's age can be judged. If computers could extract and manipulate such information, a wide variety of applications for entertainment and security purposes could be expected.

    DOI

    Scopus

    36
    Citation
    (Scopus)
  • Analysis and synthesis of realistic eye movement in face-to-face communication

    Tomoyori Iwao, Daisuke Mima, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2012 Posters, SIGGRAPH'12     87  2012  [Refereed]

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • 3D human head geometry estimation from a speech

    Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2012 Posters, SIGGRAPH'12     85  2012  [Refereed]

     View Summary

    We can visualize an acquaintance's appearance just by hearing their voice if we have met them in the past few years. It therefore appears that some relationship exists between voice and appearance. If 3D head geometry could be estimated from a voice, several applications would become possible (e.g., avatar generation and character modeling for video games). Many researchers have reported on the relationship between the acoustic features of a voice and the corresponding dynamic visual features, including lip, tongue, and jaw movements and vocal articulation during speech; however, there have been few reports on the relationship between acoustic features and static 3D head geometry. In this paper, we focus on estimating 3D head geometry from a voice. Acoustic features vary depending on the speech context and its intonation, so we restrict the context to the five Japanese vowels. Under this assumption, we estimate 3D head geometry using a feedforward neural network (FNN) trained on correspondences between an individual's acoustic features, extracted from a Japanese vowel, and 3D head geometry generated from a 3D range scan. The performance of our method is shown by both closed and open tests. As a result, we found that 3D head geometry that is acoustically similar to an input voice can be estimated under this limited condition.

    DOI

    Scopus

  • Hair motion capturing from multiple view videos

    Tsukasa Fukusato, Naoya Iwamoto, Shoji Kunitomo, Hirofumi Suda, Shigeo Morishima

    ACM SIGGRAPH 2012 Posters, SIGGRAPH'12     58  2012  [Refereed]

    DOI

    Scopus

  • Automatic music video generating system by remixing existing contents in video hosting service based on hidden Markov model

    Hayato Ohya, Shigeo Morishima

    ACM SIGGRAPH 2012 Posters, SIGGRAPH'12     47  2012  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Rapid and authentic rendering of translucent materials using depth-maps from multi-viewpoint

    Takahiro Kosaka, Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

    SIGGRAPH Asia 2012 Posters, SA 2012     45  2012  [Refereed]

     View Summary

    We present a real-time rendering method for translucent materials with complex shapes that precisely estimates the object's thickness between the light source and the view point. Wang et al. [2010] proposed a real-time rendering method handling arbitrary shapes, but it requires such large computational costs and graphics memory that it is very difficult to implement in a practical rendering pipeline. Inside a translucent object, the energy of incident light attenuates highly depending on the object's optical thickness. Translucent Shadow Maps (TSM) [2003] can compute object thickness using a depth map rendered at the light position; however, TSM cannot calculate the thickness of concave objects accurately. In this paper, we propose a novel technique to compute object thickness precisely, and as a result we achieve real-time rendering of translucent materials with complex shapes by adding only one rendering pass to conventional TSM.
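
    A small sketch of thickness-based attenuation in the spirit of translucent shadow maps: given a front-face and a back-face depth map rendered from the light, per-pixel thickness is their difference and transmitted light falls off exponentially with it. The synthetic depth maps and extinction coefficient below are stand-ins for the extra rendering pass, not the paper's exact formulation.

    ```python
    # Per-pixel thickness from two light-space depth maps and the resulting
    # exponential transmittance.
    import numpy as np

    front_depth = np.full((4, 4), 1.0)                # nearest surface seen from the light
    back_depth = front_depth + np.random.uniform(0.05, 0.4, (4, 4))  # farthest surface

    thickness = np.clip(back_depth - front_depth, 0.0, None)
    sigma_t = 6.0                                     # extinction coefficient (assumed)
    transmittance = np.exp(-sigma_t * thickness)      # fraction of light passing through
    print(transmittance.round(3))
    ```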

    DOI

    Scopus

    1
    Citation
    (Scopus)
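
    The method above extends Translucent Shadow Maps by computing object thickness more precisely. The sketch below only illustrates the basic TSM-style step it builds on (thickness from light-space depths plus Beer–Lambert attenuation); the multi-viewpoint depth maps and the extra rendering pass of the paper are not reproduced, and the extinction values are made up.

```python
import numpy as np

def transmittance(depth_entry, depth_exit, sigma_t):
    """Beer-Lambert attenuation through a slab of optical thickness (exit - entry).

    depth_entry: light-space depth of the front (lit) surface, read from a depth map.
    depth_exit:  light-space depth of the shaded back-surface point.
    sigma_t:     extinction coefficient of the material (assumed, per colour channel).
    """
    thickness = np.maximum(depth_exit - depth_entry, 0.0)
    return np.exp(-sigma_t * thickness)

# Toy example: a point 0.35 units behind the entry surface in light space,
# with a made-up per-RGB extinction coefficient.
print(transmittance(depth_entry=1.20, depth_exit=1.55,
                    sigma_t=np.array([4.0, 6.0, 9.0])))
```
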
  • Fast-Accurate 3D Face Model Generation Using a Single Video Camera

    Tomoya Hara, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012)     1269 - 1272  2012  [Refereed]

     View Summary

    In this paper, we present a new method to generate a 3D face model based on both Data-Driven and Structure-from-Motion approaches. By considering a 2D frontal face image constraint, a 3D geometric constraint, and a likelihood constraint, we are able to reconstruct the subject's face model accurately, robustly, and automatically. Using our method, it is possible to create a 3D face model in 5.8 seconds simply by shaking one's head freely in front of a single video camera.

  • Automatic Face Replacement for a Humanoid Robot with 3D Face Shape Display

    Akinobu Maejima, Takaaki Kuratate, Brennand Pierce, Shigeo Morishima, Gordon Cheng

    2012 12TH IEEE-RAS INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS (HUMANOIDS)     469 - 474  2012  [Refereed]

     View Summary

    In this paper, we propose a method to apply any new face to a retro-projected 3D face system, the Mask-bot, which we have developed as a human-robot interface. The robot face using facial animation projected onto a 3D face mask can be quickly replaced by a new face based on a single frontal image of any person. Our contribution is to apply an automatic face replacement technique with the modified texture morphable model fitting to the 3D face mask. Using our technique, a face model displayed on Mask-bot can be automatically replaced within approximately 3 seconds, which makes Mask-bot widely suitable to applications such as video conferencing and cognitive experiments.

    DOI

    Scopus

    7
    Citation
    (Scopus)
  • Curvature-approximated Estimation of Real-time Ambient Occlusion.

    Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

    GRAPP & IVAPP 2012: Proceedings of the International Conference on Computer Graphics Theory and Applications and International Conference on Information Visualization Theory and Applications, Rome, Italy, 24-26 February, 2012     268 - 273  2012  [Refereed]

  • Development of an integrated multi-modal communication robotic face

    Brennand Pierce, Takaaki Kuratate, Akinobu Maejima, Shigeo Morishima, Yosuke Matsusaka, Marko Durkovic, Klaus Diepold, Gordon Cheng

    2012 IEEE WORKSHOP ON ADVANCED ROBOTICS AND ITS SOCIAL IMPACTS (ARSO)     104 - +  2012  [Refereed]

     View Summary

    This paper presents an overview of the new version of our multi-modal communication face "Mask-Bot", a rear-projected animated robotic head, including our display system, face animation, speech communication, and sound localization.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • Analysis of Relation between Movement of Smile Expression Process and Impression

    FUJISHIRO Hiroki, MAEJIMA Akinobu, MORISHIMA Shigeo

    The Transactions of the Institute of Electronics, Information and Communication Engineers. A   95 ( 1 ) 128 - 135  2012.01  [Refereed]

    CiNii J-GLOBAL

  • Identifying Scenes with the Same Person in Video Content on the Basis of Scene Continuity and Face Similarity Measurement

    平井辰典, 中野倫靖, 後藤真孝, 森島繁生

    映像情報メディア学会誌(Web)   66 ( 7 ) 251 - 259  2012  [Refereed]

    J-GLOBAL

  • Image Technologies for New Industries Pioneered by Face and Human-Body Media

    Masato Kawade, Masaaki Mochimaru, Shigeo Morishima

    映像情報メディア学会誌   65 ( 11 ) 1534 - 1544  2011.11  [Refereed]

  • Curvature-Dependent Reflectance Function for Interactive Rendering of Subsurface Scattering

    Hiroyuki Kubo, Yoshinori Dobashi, Shigeo Morishima

    The International Journal of Virtual Reality   10 ( 1 ) 41 - 47  2011.05  [Refereed]

  • A Proposal of Innovative Entertainment System "Dive Into the Movie"

    MORISHIMA Shigeo, YAGI Yasushi, NAKAMURA Satoshi, ISE Sirou, MUKAIGAWA Yasuhiro, MAKIHARA Yasushi, MASHITA Tomohiro, KONDO Kazuaki, ENOMOTO Seigo, KAWAMOTO Shinichi, YOTSUKURA Tatsuo, IKEDA Yusuke, MAEJIMA Akinobu, KUBO Hiroyuki

    The Journal of the Institute of Electronics, Information and Communication Engineers   94 ( 3 ) 250 - 268  2011.03  [Refereed]

    CiNii

  • Example-based Deformation with Support Joints

    Kentaro Yamanaka, Akane Yano, Shigeo Morishima

    WSCG 2011: COMMUNICATION PAPERS PROCEEDINGS     83 - +  2011  [Refereed]

     View Summary

    In the character animation field, many deformation techniques have been proposed. Example-based deformation methods are widely used, especially for interactive applications, and are mainly divided into two types. One is interpolation: methods of this type are designed to interpolate examples in a pose space. The advantage is that the deformed meshes can precisely correspond to the example meshes; the disadvantage is that a larger number of examples is needed to generate arbitrary plausible interpolated meshes between each example. The other is example-based skinning, which optimizes particular parameters by referencing examples so as to represent the example meshes as accurately as possible. These methods provide plausible deformations with fewer examples, but they cannot perfectly depict the example meshes. In this paper, we present an idea that combines techniques belonging to the two types, taking advantage of both. We propose an example-based skinning method to be combined with Pose Space Deformation (PSD). It optimizes transformation matrices in Skeleton Subspace Deformation (SSD) by introducing "support joints". Our method itself generates plausible intermediate meshes with a small set of examples, as do other example-based skinning methods. We then explain the benefit of combining our method with PSD, and show that the provided examples are precisely represented and plausible deformations at arbitrary poses are obtained by our integrated method.
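
    For background, the sketch below shows plain Skeleton Subspace Deformation (linear blend skinning) in NumPy; the paper's optimization of support-joint transforms against example meshes, and its combination with PSD, are not reproduced here. Support joints would simply contribute extra rows to the transform and weight arrays.

```python
import numpy as np

def ssd(rest_vertices, bone_transforms, weights):
    """Skeleton Subspace Deformation (linear blend skinning).

    rest_vertices:   (V, 3) vertex positions in the rest pose.
    bone_transforms: (B, 4, 4) homogeneous transforms of each joint
                     (support joints would appear as additional rows here).
    weights:         (V, B) skinning weights, each row summing to 1.
    """
    V = rest_vertices.shape[0]
    homo = np.hstack([rest_vertices, np.ones((V, 1))])           # (V, 4)
    per_bone = np.einsum('bij,vj->vbi', bone_transforms, homo)   # (V, B, 4)
    blended = np.einsum('vb,vbi->vi', weights, per_bone)         # (V, 4)
    return blended[:, :3]

# Toy usage: two bones, three vertices.
verts = np.array([[0., 0, 0], [1, 0, 0], [2, 0, 0]])
T = np.stack([np.eye(4), np.eye(4)])
T[1, :3, 3] = [0.0, 0.5, 0.0]                 # second bone translated upward
w = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
print(ssd(verts, T, w))
```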

  • 3D reconstruction of detail change on dynamic non-rigid objects

    Daichi Taneda, Hirofumi Suda, Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH 2011 Posters, SIGGRAPH'11     56  2011  [Refereed]

    DOI

    Scopus

  • Estimating fluid simulation parameters from videos

    Naoya Iwamoto, Ryusuke Sagawa, Shoji Kunitomo, Shigeo Morishima

    ACM SIGGRAPH 2011 Posters, SIGGRAPH'11     3  2011  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Real-Time and Interactive Rendering for Translucent Materials Such as Human Skin

    Hiroyuki Kubo, Yoshinori Dobashi, Shigeo Morishima

    HUMAN INTERFACE AND THE MANAGEMENT OF INFORMATION: INTERACTING WITH INFORMATION, PT 2   6772   388 - 395  2011  [Refereed]

     View Summary

    To synthesize realistic human animation using computer graphics, it is necessary to simulate subsurface scattering inside human skin. We have developed a curvature-dependent reflectance function (CDRF) which mimics the presence of a subsurface scattering effect. In this approach, we provide only a single parameter that represents the intensity of incident-light scattering in a translucent material. We implemented our algorithm as a hardware-accelerated real-time renderer with an HLSL pixel shader. This approach is easily implementable on the GPU and does not require any complicated pre-processing or multi-pass rendering, as is often the case in this area of research.

    DOI

    Scopus
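
    The sketch below is only a stand-in for the general idea of a curvature-dependent diffuse response: a simple wrap-lighting term whose wrap width grows with curvature times a scattering parameter. The actual CDRF formulation in the paper may differ.

```python
import numpy as np

def curvature_wrap_diffuse(n_dot_l, curvature, scatter):
    """Diffuse term softened as a function of surface curvature.

    A stand-in for the idea that high-curvature (thin, strongly curved) regions
    let scattered light 'wrap' further past the terminator. Uses a plain
    wrap-lighting formula; the wrap amount grows with curvature * scatter.
    """
    w = np.clip(curvature * scatter, 0.0, 1.0)            # wrap amount in [0, 1]
    return np.clip((n_dot_l + w) / (1.0 + w), 0.0, 1.0)

# Flat region (curvature ~0) vs. strongly curved region (e.g. an ear),
# same lighting angle just past the terminator.
print(curvature_wrap_diffuse(-0.1, curvature=0.05, scatter=2.0))
print(curvature_wrap_diffuse(-0.1, curvature=0.60, scatter=2.0))
```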

  • The online gait measurement for characteristic gait animation synthesis

    Yasushi Makihara, Mayu Okumura, Yasushi Yagi, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   6773 ( 1 ) 325 - 334  2011  [Refereed]

     View Summary

    This paper presents a method to measure gait features online from gait silhouette images and to synthesize characteristic gait animation for an audience-participation digital entertainment. First, both static and dynamic gait features are extracted from silhouette images captured by an online gait measurement system. Then, key motion data for various gaits are captured, and new motion data is synthesized by blending the key motion data. Finally, blend ratios of the key motion data are estimated to minimize the gait feature errors between the blended model and the online measurement. In experiments, the effectiveness of the gait feature extraction was confirmed using 100 subjects from the OU-ISIR Gait Database, and characteristic gait animations were created based on the measured gait features. © 2011 Springer-Verlag.

    DOI

    Scopus

    1
    Citation
    (Scopus)
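
    A minimal sketch of the blend-ratio estimation step described in the entry above, assuming a least-squares fit of blended gait features to the measured features under non-negativity and sum-to-one constraints; the feature definitions and the optimizer are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def estimate_blend_ratios(key_features, measured):
    """Blend ratios of key motions whose blended gait features best match
    the online measurement (non-negative, summing to one).

    key_features: (K, F) gait feature vector of each key motion.
    measured:     (F,)   gait features measured from the silhouette images.
    """
    K = key_features.shape[0]

    def error(w):
        return np.sum((w @ key_features - measured) ** 2)

    w0 = np.full(K, 1.0 / K)
    res = minimize(error, w0, bounds=[(0.0, 1.0)] * K,
                   constraints=[{'type': 'eq', 'fun': lambda w: w.sum() - 1.0}])
    return res.x

# Toy usage: three key motions, four-dimensional gait features (made-up numbers).
keys = np.array([[1.0, 0.2, 0.0, 0.5], [0.4, 1.0, 0.3, 0.1], [0.0, 0.5, 1.0, 0.9]])
print(estimate_blend_ratios(keys, measured=np.array([0.5, 0.6, 0.4, 0.5])))
```
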
  • Realistic facial animation by automatic individual head modeling and facial muscle adjustment

    Akinobu Maejima, Hiroyuki Kubo, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   6774 ( 2 ) 260 - 269  2011  [Refereed]

     View Summary

    We propose a technique for automatically generating a realistic facial animation with precise individual facial geometry and characteristic facial expressions. Our method is divided into two key processes: the head modeling process automatically generates a whole head model from facial range scan data alone, and the facial animation setup process automatically generates key shapes which represent individual facial expressions based on physics-based facial muscle simulation with an individual muscle layout estimated from facial expression videos. Facial animations considering individual characteristics can be synthesized using the generated head model and key shapes. Experimental results show that the proposed method can generate facial animations in which 84% of subjects can identify themselves. Therefore, we conclude that our head modeling techniques are effective for entertainment systems such as the Future Cast. © 2011 Springer-Verlag.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Instant movie casting with personality: Dive into the movie system

    Shigeo Morishima, Yasushi Yagi, Satoshi Nakamura

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   6774 ( 2 ) 187 - 196  2011  [Refereed]

     View Summary

    "Dive into the Movie (DIM)" is a name of project to aim to realize a world innovative entertainment system which can provide an immersion experience into the story by giving a chance to audience to share an impression with his family or friends by watching a movie in which all audience can participate in the story as movie casts. To realize this system, we are trying to model and capture the personal characteristics instantly and precisely in face, body, gait, hair and voice. All of the modeling, character synthesis, rendering and compositing processes have to be performed on real-time without any manual operation. In this paper, a novel entertainment system, Future Cast System (FCS), is introduced as a prototype of DIM. The first experimental trial demonstration of FCS was performed at the World Exposition 2005 in which 1,630,000 people have experienced this event during 6 months. And finally up-to-date DIM system to realize more realistic sensation is introduced. © 2011 Springer-Verlag.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Personalized voice assignment techniques for synchronized scenario speech output in entertainment systems

    Shin-Ichi Kawamoto, Tatsuo Yotsukura, Satoshi Nakamura, Shigeo Morishima

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   6774 ( 2 ) 177 - 186  2011  [Refereed]

     View Summary

    The paper describes voice assignment techniques for synchronized scenario speech output in an instant casting movie system that enables anyone to be a movie star using his or her own voice and face. Two prototype systems were implemented, and both systems worked well for various participants, ranging from children to the elderly. © 2011 Springer-Verlag.

    DOI

    Scopus

  • Rapid Rendering of Translucent Materials Using Curvature-Dependent Reflectance Functions

    KUBO Hiroyuki, DOBASHI Yoshinori, MORISHIMA Shigeo

    The IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (Japanese edition) A   93 ( 11 ) 708 - 717  2010.11  [Refereed]

    CiNii

  • The effects of virtual characters on audiences' movie experience

    Tao Lin, Shigeo Morishima, Akinobu Maejima, Ningjiu Tang

    INTERACTING WITH COMPUTERS   22 ( 3 ) 218 - 229  2010.05  [Refereed]

     View Summary

    In this paper, we first present a new audience-participating movie form in which 3D virtual characters of audience members are constructed by computer graphics (CG) technologies and embedded into a pre-rendered movie in different roles. We then investigate how audiences respond to these virtual characters using physiological and subjective evaluation methods. To facilitate the investigation, we present three versions of a movie to an audience: a Traditional version, its SDIM version with the participation of the audience member's virtual character, and its SFDIM version with the co-participation of the audience member's and their friends' virtual characters. The subjective evaluation results show that the participation of virtual characters indeed causes an increased subjective sense of spatial presence, engagement, and emotional reaction; moreover, SFDIM performs significantly better than SDIM due to the co-participation of friends' virtual characters. We also find that audiences not only experience significantly different galvanic skin response (GSR) changes on average (changing trend over time and number of fluctuations) but also show increased phasic GSR responses to the appearance of their own or friends' virtual 3D characters on the screen. The evaluation results demonstrate the success of the new audience-participating movie form and contribute to understanding how people respond to virtual characters in a role-playing entertainment interface. (C) 2009 Elsevier B.V. All rights reserved.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • Automatic generation of head models and facial animations considering personal characteristics

    Akinobu Maejima, Hiroto Yarimizu, Hiroyuki Kubo, Shigeo Morishima

    Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST     71 - 78  2010  [Refereed]

     View Summary

    We propose a new automatic head modeling system to generate individualized head models which can express person-specific facial expressions. The head modeling system consists of two core processes. The head modeling process with the proposed automatic mesh completion generates a whole head model only from facial range scan data. The key shape generation process generates key shapes for the generated head model based on physics-based facial muscle simulation with an individual muscle layout estimated from subject's facial expression videos. Facial animations considering personal characteristics can be synthesized using the individualized head model and key shapes. Experimental results show that the proposed system can generate head models where 84% of subjects can identify themselves. Therefore, we conclude that our head modeling system is effective to games and entertainment systems like a Future Cast System. Copyright © 2010 by the Association for Computing Machinery, Inc.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • Curvature-dependent reflectance function for rendering translucent materials

    Hiroyuki Kubo, Yoshinori Dobashi, Shigeo Morishima

    ACM SIGGRAPH 2010 Talks, SIGGRAPH '10     a46  2010  [Refereed]

     View Summary

    Simulating sub-surface scattering is one of the most effective ways for realistically synthesizing translucent materials such as marble, milk and human skin. In previous work, the method developed by Jensen et al. [2002] significantly improved the speed of the simulation. However, the process is still not fast enough to produce realtime rendering. Thus, we have developed a curvature-dependent reflectance function (CDRF) which mimics the presence of a subsurface scattering effect.

    DOI

    Scopus

    4
    Citation
    (Scopus)
  • Curvature depended local illumination approximation of ambient occlusion

    Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

    ACM SIGGRAPH 2010 Posters, SIGGRAPH '10   2009 ( 6 ) 122:1  2010  [Refereed]

     View Summary

    This paper discusses an approach for computing the ambient occlusion by curvature depended approximation of occlusion. Ambient occlusion is widely used to improve the realism of fast lighting simulation. The ambient occlusion is defined as follows. © ACM 2010.

    DOI J-GLOBAL

    Scopus

    4
    Citation
    (Scopus)
  • Optimization of cloth simulation parameters by considering static and dynamic features

    Shoji Kunitomo, Shinsuke Nakamura, Shigeo Morishima

    ACM SIGGRAPH 2010 Posters, SIGGRAPH '10     15:1  2010  [Refereed]

     View Summary

    Realistic drape and motion of virtual clothing are now possible using an up-to-date cloth simulator, but it is still difficult and time-consuming to adjust and tune the many parameters needed to achieve an authentic look for a particular real fabric. Bhat et al. [2003] proposed a way to estimate the parameters from video data of real fabrics. However, this approach projects structured light patterns onto the fabrics, so it might not be possible to estimate accurate parameter values if the fabrics have colors and textures. In addition to the structured light patterns, they use a motion capture system to track how the fabrics move. In this paper, we introduce a new method using only a motion capture system, attaching a few markers on the fabric surface without any other devices. Moreover, animators can easily estimate the parameters of many kinds of fabrics with this method. An authentic look and motion of simulated fabrics are realized by minimizing an error function between the captured motion data and the synthetic motion, considering both static and dynamic cloth features. © ACM 2010.

    DOI

    Scopus

    10
    Citation
    (Scopus)
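
    A minimal sketch of the parameter-fitting loop described in the entry above, assuming an external cloth solver is available as a black-box simulate(params) function and that the static/dynamic comparison can be approximated by position and velocity errors of the tracked markers; both are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def fit_cloth_parameters(simulate, captured_markers, p0):
    """Estimate cloth-simulation parameters by minimizing the discrepancy
    between captured marker trajectories and simulated ones.

    simulate(params) -> (T, M, 3) simulated marker trajectories (the actual
    cloth solver is assumed to exist; it is not implemented here).
    captured_markers:  (T, M, 3) motion-capture marker trajectories on the fabric.
    p0: initial guess for the parameter vector (stretch, bend, damping, ...).
    """
    def error(params):
        sim = simulate(params)
        # Static term (overall drape) + dynamic term (frame-to-frame velocity),
        # a rough stand-in for the paper's static/dynamic feature comparison.
        static = np.mean((sim - captured_markers) ** 2)
        dynamic = np.mean((np.diff(sim, axis=0) - np.diff(captured_markers, axis=0)) ** 2)
        return static + dynamic

    return minimize(error, p0, method='Nelder-Mead').x

# Toy usage with a dummy "simulator" that just scales a template trajectory.
template = np.random.default_rng(1).normal(size=(30, 8, 3))
captured = 1.7 * template
print(fit_cloth_parameters(lambda p: p[0] * template, captured, p0=np.array([1.0])))
```
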
  • Data driven in-betweening for hand drawn rotating face

    Hiroaki Gohara, Shiori Sugimoto, Shigeo Morishima

    ACM SIGGRAPH 2010 Posters, SIGGRAPH '10     7:1  2010  [Refereed]

     View Summary

    In anime production, some key frames are drawn precisely by the artist, and then a great number of in-between frames are drawn by assistants' hands. However, drawing many characters, especially with face rotation, is seriously time-consuming and skilled work. In this paper, we propose an automatic in-betweening technique for the rotating face of a hand-drawn character using only a front image and a diagonal image (Fig. 1). Baxter [2009] presented in-between generation using an image morphing technique; however, that approach does not consider reflecting the artist's style and touch. Accordingly, we reflect style and touch using a morphing technique trained on the artist's own database, introduced especially to generate rotational in-between faces. This database contains the center of gravity of each part (right eye, left eye, nose, mouth, eyebrow) and the contours on the facial image. © ACM 2010.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • A skinning technique considering the shape of human skeletons

    Hirofumi Suda, Kentaro Yamanaka, Shigeo Morishima

    ACM SIGGRAPH 2010 Posters, SIGGRAPH '10     4:1  2010  [Refereed]

     View Summary

    We propose a skinning technique to improve expressive power of Skeleton Subspace Deformation (SSD) by adding the influence of the shape of skeletons to the deformation result by postprocessing. © ACM 2010.

    DOI

    Scopus

  • Learning arm motion strategies for balance recovery of humanoid robots

    Masaki Nakada, Brian Allen, Shigeo Morishima, Demetri Terzopoulos

    Proceedings - EST 2010 - 2010 International Conference on Emerging Security Technologies, ROBOSEC 2010 - Robots and Security, LAB-RS 2010 - Learning and Adaptive Behavior in Robotic Systems     165 - 170  2010  [Refereed]

     View Summary

    Humans are able to robustly maintain balance in the presence of disturbances by combining a variety of control strategies using posture adjustments and limb motions. Such responses can be applied to balance control in two-armed bipedal robots. We present an upper-body control strategy for improving balance in a humanoid robot. Our method improves on lower-body balance techniques by introducing an arm rotation strategy (ARS). The ARS uses Q-learning to map the sensed state to the appropriate arm control torques. We demonstrate successful balance in a physically simulated humanoid robot in response to perturbations that overwhelm lower-body balance strategies alone. © 2010 IEEE.

    DOI

    Scopus

    10
    Citation
    (Scopus)
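
    A minimal tabular Q-learning sketch in the spirit of the arm rotation strategy described in the entry above; the state discretization (lean-angle bins), the discrete torque actions, and the learning constants are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

# Tabular Q-learning sketch: discretized torso-lean states, discrete arm-torque actions.
N_STATES, N_ACTIONS = 21, 5                   # lean-angle bins, candidate arm torques
ACTIONS = np.linspace(-1.0, 1.0, N_ACTIONS)   # normalized arm rotation torques
Q = np.zeros((N_STATES, N_ACTIONS))

def choose_action(state, eps=0.1, rng=np.random.default_rng(0)):
    """Epsilon-greedy selection of an arm torque index for the sensed state."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state]))

def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Standard one-step Q-learning update toward the temporal-difference target."""
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])
```
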
  • Interactive Shadow design tool for Cartoon Animation -KAGEZOU-.

    Shiori Sugimoto, Hidehito Nakajima, Yohei Shimotori, Shigeo Morishima

    Journal of WSCG   18 ( 1-3 ) 25 - 32  2010  [Refereed]

  • Interactive shadowing for 2D Anime

    Eiji Sugisaki, Hock Soon Seah, Feng Tian, Shigeo Morishima

    COMPUTER ANIMATION AND VIRTUAL WORLDS   20 ( 2-3 ) 395 - 404  2009.06  [Refereed]

     View Summary

    In this paper, we propose an instant shadow generation technique for 2D animation, especially Japanese Anime. In traditional 2D Anime production, the entire animation, including shadows, is drawn by hand, so it takes a long time to complete. Shadows play an important role in the creation of symbolic visual effects. However, shadows are not always drawn due to time constraints and a lack of animators, especially when the production schedule is tight. To solve this problem, we develop an easy shadowing approach that enables animators to easily create a layer of shadow and its animation based on the character's shapes. Our approach is both instant and intuitive. The only inputs required are the character or object shapes in the input animation sequence, with the alpha values generally used in the Anime production pipeline. First, shadows are automatically rendered on a virtual plane by using a Shadow Map based on these inputs. Then the rendered shadows can be edited by simple operations and simplified with a Gaussian filter. Several special effects such as blurring can be applied to the rendered shadow at the same time. Compared to existing approaches, ours is more efficient and effective for handling automatic shadowing in real time. Copyright (C) 2009 John Wiley & Sons, Ltd.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • Facial expression recognition after orthodontic treatment using computer graphics

    TERADA Kazuto, YOSHIDA Mitsuru, SANO Natsuki, SAITO Isao, MIYANAGA Michiyo, MORISHIMA Shigeo, HU Min

    J Oromax Biomech   14 ( 1 ) 1 - 13  2009.03  [Refereed]

  • Dive into the movie: an instant casting and immersive experience in the story.

    Shigeo Morishima

    Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST 2009, Kyoto, Japan, November 18-20, 2009     9  2009  [Refereed]

    DOI

  • Muscle-based facial animation considering fat layer structure captured by MRI.

    Hiroto Yarimizu, Yasushi Ishibashi, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings    2009  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Example based skinning with progressively optimized support joints

    Kentaro Yamanaka, Akane Yano, Shigeo Morishima

    ACM SIGGRAPH ASIA 2009 Posters, SIGGRAPH ASIA '09     55:1  2009  [Refereed]

     View Summary

    Skeleton-Subspace Deformation (SSD), which is the most popular method for articulated character animation, often causes some artifacts. Animators have to edit mesh each time, which is seriously tedious and time-consuming. So example based skinning has been proposed. It employs edited mesh as target poses and generates plausible animation efficiently. In this technique, character mesh should be deformed to accurately fit target poses. Mohr et al. [2003] introduced additional joints. They expect animators to embed skeleton precisely.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Accurate skin deformation model of forearm using MRI

    Kentaro Yamanaka, Shinsuke Nakamura, Shota Kobayashi, Akane Yano, Masashi Shiraishi, Shigeo Morishima

    SIGGRAPH 2009: Posters, SIGGRAPH '09    2009  [Refereed]

     View Summary

    This paper presents a new methodology for constructing a skin deformation model using MRI and generating accurate skin deformations based on the model. Many methods to generate skin deformations have been proposed and they are classified into three main types. The first type is anatomically based modeling. Anatomically accurate deformations can be reconstructed but computation time is long and controlling generated motion is difficult. In addition, modeling whole body is very difficult. The second is skeleton-subspace deformation (SSD). SSD is easy to implement and fast to compute so it is the most common technique today. However, accurate skin deformations can't be easily realized with SSD. The last type consists of data-driven approaches including example-based methods. In order to construct our model from MRI images, we employ an example-based method. Using examples obtained from medical images, skin deformations can be modeled related to skeleton motions. Retargeting generated motions to other characters is generally difficult with this kind of methods. Kurihara and Miyata realize accurate skin deformations from CT images [Kurihara et al. 2004], but it doesn't mention the possibility of retargeting. With our model, however, generated deformations can be retargeted. Once the model is constructed, accurate skin deformations are easily generated applying our model to a skin mesh. In our experiment, we construct a skin deformation model which reconstructs pronosupination, rotational movement of forearm, and we use range scan data as a skin mesh to apply our model and generate accurate skin deformations.

    DOI

    Scopus

  • Expressive facial subspace construction from key face selection.

    Ryo Takamizawa, Takanori Suzuki, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings    2009  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Directable anime-like shadow based on water mapping filter.

    Yohei Shimotori, Shiori Sugimoto, Shigeo Morishima

    International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings    2009  [Refereed]

    DOI

    Scopus

  • Characteristic gait animation synthesis from single view silhouette

    Shinsuke Nakamura, Masashi Shiraishi, Shigeo Morishima, Mayu Okumura, Yasushi Makihara, Yasushi Yagi

    SIGGRAPH 2009: Posters, SIGGRAPH '09    2009  [Refereed]

     View Summary

    Characteristics of human motion, such as walking, running, or jumping, vary from person to person. Differences in human motion enable people to identify themselves or a friend. However, it is challenging to generate animation in which individual characters exhibit characteristic motion using computer graphics. Our goal is to construct a system that synthesizes characteristic gait animation automatically, so that, for instance, crowd animation can be generated with individual variation in motion. In our system, we first acquire a silhouette image as input data using a video camera. Second, we extract gait features from the single-view silhouette. Finally, we automatically synthesize 3D gait animation using a method that blends a small number of motion data [Kovar et al. 2003]; the blending weights are estimated automatically from the gait features.

    DOI

  • Human head modeling based on fast-automatic mesh completion

    Akinobu Maejima, Shigeo Morishima

    ACM SIGGRAPH ASIA 2009 Posters, SIGGRAPH ASIA '09     53:1  2009  [Refereed]

     View Summary

    The need to rapidly create 3D human head models is still an important issue in game and film production. Blanz et al have developed a morphable model which can semi-automatically reconstruct the facial appearance (3D shape and texture) and simulated hairstyles of "new" faces (faces not yet scanned into an existing database) using photographs taken from the front or other angles [Blanz et al. 2004]. However, this method still requires manual marker specification and approximately 4 minutes of computational time. Moreover, the facial reconstruction produced by this system is not accurate unless a database containing a large variety of facial models is available. We have developed a system that can rapidly generate human head models using only frontal facial range scan data. Where it is impossible to measure the 3D geometry accurately (as with hair regions) the missing data is complemented using the 3D geometry of the template mesh (TM). Our main contribution is to achieve the fast mesh completion for the head modeling based on the "Automatic Marker Setting" and the "Optimized Local Affine Transform (OLAT)". The proposed system generates a head model in approximately 8 seconds. Therefore, if users utilize a range scanner which can quickly produce range data, it is possible to generate a complete 3D head model in one minute using our system on a PC.

    DOI

    Scopus

    2
    Citation
    (Scopus)
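
    A heavily simplified sketch of template-based completion in the spirit of the entry above: a single global affine transform is fitted from template landmarks to scanned landmarks by least squares and applied to the template mesh. The paper's Automatic Marker Setting and per-region Optimized Local Affine Transform are not reproduced; the landmark choice and the global fit are assumptions for illustration.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform (A, t) mapping source landmarks to targets.

    src, dst: (N, 3) corresponding landmark positions (template vs. range scan),
              with N >= 4 non-coplanar points for a well-posed fit.
    """
    src_h = np.hstack([src, np.ones((src.shape[0], 1))])    # (N, 4)
    M, *_ = np.linalg.lstsq(src_h, dst, rcond=None)          # (4, 3)
    return M[:3].T, M[3]                                     # A (3, 3), t (3,)

def warp(vertices, A, t):
    """Apply the fitted affine transform to all template vertices."""
    return vertices @ A.T + t

# Toy usage: fit a template to four hypothetical scanned landmarks.
tmpl = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
scan = tmpl * 1.1 + np.array([0.02, -0.01, 0.03])
A, t = fit_affine(tmpl, scan)
print(warp(tmpl, A, t))
```
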
  • Curvature-dependent local illumination approximation for translucent materials.

    Hiroyuki Kubo, Mai Hariu, Shuhei Wemler, Shigeo Morishima

    International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings    2009  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Aging model of human face by averaging geometry and filtering texture in database.

    Satoko Kasai, Shigeo Morishima

    International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings    2009  [Refereed]

    DOI

  • A natural smile synthesis from an artificial smile.

    Hiroki Fujishiro, Takanori Suzuki, Shinya Nakano, Akinobu Maejima, Shigeo Morishima

    International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings    2009  [Refereed]

    DOI

  • AUTOMATIC VOICE ASSIGNMENT TOOL FOR INSTANT CASTING MOVIE SYSTEM

    Yoshihiro Adachi, Shinichi Kawamoto, Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS     1897 - +  2009  [Refereed]

     View Summary

    In the Instant Casting Movie System, a personal CG character is automatically generated. The character resembles the participant in face geometry and texture. However, the voice of a character was previously an alternative voice determined only by the gender of the participant, which is sometimes not enough to convey the personality of a CG character. In this paper, an automatic pre-scored voice assignment tool for a personal CG character is presented. Voice is as essential as facial features for identifying a personal character. Our proposed system selects the voice most similar to the participant's from a voice database and assigns it as the voice of the CG character. The voice similarity criterion is expressed as a combination of eight acoustic features. After assigning voice data to a personal character, the voice track is played back in synchronization with the movement of the CG character. 60 voice variations have been prepared in our voice database. The validity of the assigned voice has been evaluated by MOS value, and the proposed method achieved 68% of the theoretical figure calculated in preliminary experiments.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • "Dive into the Movie" audience-driven immersive experience in the story

    Shigeo Morishima

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS   E91D ( 6 ) 1594 - 1603  2008.06  [Refereed]

     View Summary

    "Dive into the Movie (DIM)" is a name of project to aim to realize a world innovative entertainment system which can provide an immersion experience into the story by giving a chance to audience to share an impression with his family or friends by watching a movie in which all audience can participate in the story as movie casts. To realize this system, several techniques to model and capture the personal characteristics instantly in face, body, gesture, hair and voice by combining computer graphics, computer vision and speech signal processing technique. Anyway, all of the modeling, casting, character synthesis, rendering and compositing processes have to be performed on real-time without any operator, In this paper, first a novel entertainment system, Future Cast System (FCS), is introduced which can create DIM movie with audience's participation by replacing the original roles' face in a pre-created CG movie with audiences' own highly realistic 3D CG faces. Then the effects of DIM movie on audience experience are evaluated subjectively. The result suggests that most of the participants are seeking for higher realism, impression and satisfaction by replacing not only face part but also body, hair and voice. The first experimental trial demonstration of FCS was performed at the Mitsui-Toshiba pavilion of the 2005 World: Exposition in Aichi Japan. Then, 1,640,000 people have experienced this event during 6 months of exhibition and FCS became one of the most popular events at Expo.2005.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • Instant casting movie theater: The Future Cast Systems

    Akinobu Maejima, Shuhei Wemler, Tamotsu Machida, Masao Takebayashi, Shigeo Morishima

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS   E91D ( 4 ) 1135 - 1148  2008.04  [Refereed]

     View Summary

    We have developed a visual entertainment system called "Future Cast" which enables anyone to easily participate in a pre-recorded or pre-created film as an instant CG movie star. This system provides audiences with the amazing opportunity to join the cast of a movie in real-time. The Future Cast System can automatically perform all the processes required to make this possible, from capturing participants' facial characteristics to rendering them into the movie. Our system can also be applied to any movie created using the same production process. We conducted our first experimental trial demonstration of the Future Cast System at the Mitsui-Toshiba pavilion at the 2005 World Exposition in Aichi Japan.

    DOI

    Scopus

    16
    Citation
    (Scopus)
  • Synthesizing facial animation using dynamical property of facial muscle

    Hiroyuki Kubo, Yasushi Ishibashi, Akinobu Maejima, Shigeo Morishima

    SIGGRAPH'08 - ACM SIGGRAPH 2008 Posters     110  2008  [Refereed]

    DOI

    Scopus

  • Hair animation and styling based on 3D range scanning data

    Shiori Sugimoto, Shigeo Morishima

    SIGGRAPH'08 - ACM SIGGRAPH 2008 Posters     107  2008  [Refereed]

    DOI

    Scopus

  • Automatic and accurate mesh fitting based on 3D range scanning data

    Shinya Nakano, Yusuke Nonaka, Akinobu Maejima, Shigeo Morishima

    SIGGRAPH'08 - ACM SIGGRAPH 2008 Posters     104  2008  [Refereed]

    DOI

    Scopus

  • 3D facial animation from high speed video

    Takanori Suzuki, Yasushi Ishibashi, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    SIGGRAPH'08 - ACM SIGGRAPH 2008 Posters     1  2008  [Refereed]

    DOI

    Scopus

  • An empirical study of bringing audience into the movie

    Tao Lin, Akinobu Maejima, Shigeo Morishima

    SMART GRAPHICS, PROCEEDINGS   5166   70 - 81  2008  [Refereed]

     View Summary

    In this paper we first present an audience-participating movie experience, DIM, in which a photo-realistic 3D virtual actor of the audience member is constructed by computer graphics technologies, and then evaluate the effects of DIM on the audience experience using physiological and subjective methods. The empirical results suggest that the participation of virtual actors causes an increased subjective sense of presence and engagement, and more intense emotional responses, compared to the traditional movie form; interestingly, there are also significantly different physiological responses caused by the participation of virtual actors, objectively indicating the improvement of interaction between the audience and the movie.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Post-recording tool for instant casting movie system

    Shin-Ichi Kawamoto, Shigeo Morishima, Tatsuo Yotsukura, Satoshi Nakamura

    MM'08 - Proceedings of the 2008 ACM International Conference on Multimedia, with co-located Symposium and Workshops     893 - 896  2008  [Refereed]

     View Summary

    This paper proposes a universal, user-friendly post-recording tool for an Instant Casting Movie System (ICS) that enables anyone to be a movie star using his or her own voice and face. In ICS, a personal CG character is automatically generated by scanning one's face geometry and image. Voice is as essential as the face for identifying a person; however, a character's voice in ICS is based only on gender. We propose a novel voice recording tool that serves participants of all ages in a short time. Post-recording tasks are very difficult because speakers must speak in synchronization with the mouth movements of the CG characters, so this task is generally performed by professional voice actors. Our proposed tool has the following four features: 1) various supporting information for synchronizing the voice with the mouth-movement timing; 2) automatic post-processing of recorded voices for compositing mixed audio; 3) an intuitive display operation for people of all ages; and 4) handling of multiple users in parallel for quick recording. We developed a prototype speech synchronization system using the post-recording tool and conducted subjective evaluation experiments. Over 60% of the subjects responded that the tool's interface can be operated easily. © 2009 IEEE.

    DOI

    Scopus

  • Perceptual similarity measurement of speech by combination of acoustic features

    Yoshihiro Adachi, Shinichi Kawamoto, Shigeo Morishima, Satoshi Nakamura

    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12     4861 - +  2008  [Refereed]

     View Summary

    The Future Cast System is a new entertainment system in which a participant's face is captured and rendered into the movie as an instant computer graphics (CG) movie star; it was first exhibited at the 2005 World Exposition in Aichi, Japan. We are working to add new functionality that enables mapping not only faces but also speech individualities to the cast. Our approach is to find the speaker with the closest speech individuality and apply voice conversion. This paper investigates acoustic features for estimating the perceptual similarity of speech individuality. We propose a method that linearly combines eight acoustic features related to the perception of speech individuality, optimizing the weights of the acoustic features against perceptual similarities. We have evaluated the performance of our method with Spearman's rank correlation coefficient against perceptual similarities; the experiments show that the proposed method achieves a correlation coefficient of 0.66.

    DOI

    Scopus

    11
    Citation
    (Scopus)
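
    A minimal sketch of combining per-feature distances with weights evaluated against perceptual judgments via Spearman's rank correlation, as described in the entry above. The paper's actual weight-optimization procedure is not reproduced; a coarse random search over the weight simplex stands in for it, and the data shapes are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr

def fit_weights(feature_distances, perceptual_scores, n_trials=2000, seed=0):
    """Coarse random search for non-negative weights (summing to 1) that maximize
    the Spearman rank correlation between the combined distance and perception.

    feature_distances: (P, 8) distances of P speaker pairs for 8 acoustic features.
    perceptual_scores: (P,) perceptual dissimilarity judgments for the same pairs.
    """
    rng = np.random.default_rng(seed)
    best_w, best_rho = None, -np.inf
    for _ in range(n_trials):
        w = rng.dirichlet(np.ones(feature_distances.shape[1]))   # point on the simplex
        rho, _ = spearmanr(feature_distances @ w, perceptual_scores)
        if rho > best_rho:
            best_w, best_rho = w, rho
    return best_w, best_rho

# Toy usage: 50 speaker pairs, 8 per-feature distances, synthetic perceptual scores.
rng = np.random.default_rng(0)
D = rng.random((50, 8))
perceived = D @ np.array([0.3, 0.3, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05])
print(fit_weights(D, perceived))
```
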
  • Directable and stylized hair simulation

    Yosuke Kazama, Eiji Sugisaki, Shigeo Morishima

    GRAPP 2008: PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS THEORY AND APPLICATIONS     316 - 321  2008  [Refereed]

     View Summary

    Creating natural looking hair motion is considered to be one of the most difficult and time consuming challenges in CG animation. A detailed physics-based model is essential in creating convincing hair animation. However, hair animation created using detailed hair dynamics might not always be the result desired by creators. For this reason, a hair simulation system that is both detailed and editable is required in contemporary Computer Graphics. In this paper we therefore, propose the use of External Force Field (EFF) to construct hair motion using a motion capture system. Furthermore, we have developed a system for editing the hair motion obtained using this process. First, the environment around a subject is captured using a motion capture system and the EFF is defined. Second, we apply our EFF-based hair motion editing system to produce creator-oriented hair animation. Consequently, our editing system enables creators to develop desired hair animation intuitively without physical discontinuity.

  • Preliminary evaluation of the audience-driven movie

    Tao Lin, Akinobu Maejima, Shigeo Morishima, Atsumi Imamiya

    Conference on Human Factors in Computing Systems - Proceedings     3273 - 3278  2008  [Refereed]

     View Summary

    In this paper we introduce an audience-driven theater experience, the DIM Movie, in which the audience participates in a pre-created CG movie as its roles, and report subjective and physiological evaluations of the audience experience offered by the DIM movie. Specifically, we present three different experiences to an audience: a traditional movie, its Self-DIM (SDIM) version with the audience member's participation, and its Self-Friend-DIM (SFDIM) version with the co-participation of the audience member and his friends. The evaluation results show that the DIM movies (SDIM and SFDIM) elicit a greater subjective sense of presence, engagement, and emotional reaction, and a stronger physiological response (galvanic skin response, GSR), compared with the traditional movie form; moreover, audiences show a phasic GSR increase responding to the appearance of their own or friends' CG characters on the movie screen.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Using subjective and physiological measures to evaluate audience-participating movie experience

    Tao Lin, Akinobu Maejima, Shigeo Morishima

    Proceedings of the Workshop on Advanced Visual Interfaces AVI     49 - 56  2008  [Refereed]

     View Summary

    In this paper we subjectively and physiologically investigate the effects of audiences' 3D virtual actors in a movie on their movie experience, using the audience-participating movie DIM as the object of study. In DIM, photo-realistic 3D virtual actors of audience members are constructed by combining current computer graphics (CG) technologies and can act in different roles in a pre-rendered CG movie. To facilitate the investigation, we presented three versions of a CG movie to an audience: a Traditional version, its Self-DIM (SDIM) version with the participation of the audience member's virtual actor, and its Self-Friend-DIM (SFDIM) version with the co-participation of the audience member's and his friends' virtual actors. The results show that the participation of the audience's 3D virtual actors indeed causes an increased subjective sense of presence, engagement, and emotional reaction; moreover, SFDIM performs significantly better than SDIM due to increased social presence. Interestingly, when watching the three movie versions, subjects experienced not only significantly different galvanic skin response (GSR) changes on average (changing trend over time and number of fluctuations) but also a phasic GSR increase when watching their own and friends' virtual 3D actors appear on the movie screen. These results suggest that the participation of 3D virtual actors in a movie can improve interaction and communication between the audience and the movie. Copyright 2008 ACM.

    DOI

    Scopus

    7
    Citation
    (Scopus)
  • Tweakable Shadows for Cartoon Animation

    Hidehito Nakajima, Eiji Sugisaki, Shigeo Morishima

    WSCG 2007, FULL PAPERS PROCEEDINGS I AND II     233 - 240  2007  [Refereed]

     View Summary

    The role of shadow is important in cartoon animation. Shadows in hand-drawn animation reflect the expression of the animator's style rather than mere physical phenomena; however, shadows in 3DCG cannot express such an animator's touch. In this paper we propose a novel method for editing shadows that has the advantages of both hand-drawn animation and 3DCG technology. In particular, we discuss two phases that enable animators to transform and deform the shadow tweakably. In the first phase, a shadow projection matrix is applied to deform the shape of the shadow. In the second, we manipulate vectors to transform the shadow, for example by scaling and translation; these vectors are used in the shadow volume method. The shadows are edited directably by integrating these two phases. Our approach enables animators to edit the shadow by simple mouse operations. As a result, animators can not only produce shadows automatically but also reflect their style easily. Once the shape and location of a shadow are decided by the animator's style in our method, 3DCG techniques can produce a consistent shadow for object motion interactively.

  • Interactive shade control for cartoon animation.

    Yohei Shimotori, Hidehito Nakajima, Eiji Sugisaki, Akinobu Maejima, Shigeo Morishima

    34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters     170  2007  [Refereed]

  • Hair motion reconstruction using motion capture system.

    Takahito Ishikawa, Yosuke Kazama, Eiji Sugisaki, Shigeo Morishima

    34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters     78  2007  [Refereed]

  • Data-driven efficient production of cartoon character animation

    Shigeo Morishima, Shigeru Kuriyama, Shinichi Kawamoto, Tadamichi Suzuki, Masaaki Taira, Tatsuo Yotsukura, Satoshi Nakamura

    ACM SIGGRAPH 2007 Sketches, SIGGRAPH'07     76  2007  [Refereed]

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Designing a new car body shape by PCA of existing car database.

    Tatsunori Hayakawa, Yusuke Sekine, Akinobu Maejima, Shigeo Morishima

    34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters     43  2007  [Refereed]

  • Variable rate speech animation synthesis.

    Akane Yano, Hiroyuki Kubo, Yoshihiro Adachi, Demetri Terzopoulos, Shigeo Morishima

    34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters     28 - 28  2007  [Refereed]

  • Facial muscle adaptation for expression customization.

    Yasushi Ishibashi, Hiroyuki Kubo, Akinobu Maejima, Demetri Terzopoulos, Shigeo Morishima

    34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters     26 - 26  2007  [Refereed]

  • Acoustic features for estimation of perceptional similarity

    Yoshihiro Adachi, Shinichi Kawamoto, Shigeo Morishima, Satoshi Nakamura

    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2007   4810   306 - +  2007  [Refereed]

     View Summary

    This paper describes an examination of acoustic features for the estimation of perceptional similarity between speeches. We firstly extract some acoustic features including personality from speeches of 36 persons. Secondly, we calculate each distance between extracted features using Gaussian Mixture Model (GMM) or Dynamic Time Warping (DTW), and then we sort speeches based on the physical similarity. On the other hand, there is the permutation based on the perceptional similarity which is sorted according to the subject. We evaluate the physical features by the Spearman's rank correlation coefficient with two permutations. Consequently, the results show that DTW distance with high STRAIGHT Cepstrum is an optimum feature for estimation of perceptional similarity.

    DOI

    Scopus

  • Car designing tool using multivariate analysis

    Yusuke Sekine, Akinobu Maejima, Eiji Sugisaki, Shigeo Morishima

    ACM SIGGRAPH 2006 Research Posters, SIGGRAPH 2006     94  2006.07  [Refereed]

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Facial animation by the manipulation of a few control points subject to muscle constraints

    Hiroyuki Kubo, Hiroaki Yanagisawa, Akinobu Maejima, Demetri Terzopoulos, Shigeo Morishima

    ACM SIGGRAPH 2006 Research Posters, SIGGRAPH 2006     65  2006.07  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Hair motion re-modeling from cartoon animation sequence

    Yosuke Kazama, Eiji Sugisaki, Natsuko Tanaka, Akiko Sato, Shigeo Morishima

    ACM SIGGRAPH 2006 Research Posters, SIGGRAPH 2006     4  2006.07  [Refereed]

    DOI

    Scopus

  • Hair motion cloning from cartoon animation sequences

    Eiji Sugisaki, Yosuke Kazama, Shigeo Morishima, Natsuko Tanaka, Akiko Sato

    COMPUTER ANIMATION AND VIRTUAL WORLDS   17 ( 3-4 ) 491 - 499  2006.07  [Refereed]

     View Summary

    This paper describes a new approach to creating cartoon hair animation that allows users to reuse existing cel character animation sequences. We demonstrate the generation of cartoon hair animation accentuated with 'anime-like' motions. The novelty of this method is that users can choose existing cel animation of a character's hair and apply environmental elements such as wind to other characters with a three-dimensional structure. In fact, users can reuse existing cartoon sequences as input to endow another character with environmental elements as if both characters existed in the same scene. A three-dimensional character's hair motions are created based on the hair motions in the input cartoon animation sequences. First, users extract hair shapes at each frame from the input sequences, from which physical equations are then constructed. 'Anime-like' hair motion is created using these physical equations. Copyright (c) 2006 John Wiley & Sons, Ltd.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • Facial Expression Synthesis from the Motion of a Few Control Points Using a Facial Muscle Constraint Model

    久保尋之, 柳澤博昭, 前島謙宣, 森島繁生

    日本顔学会論文誌    2006.01  [Refereed]

  • Construction of audio-visual speech corpus using motion-capture system and corpus based facial animation

    T Yotsukura, S Morishima, S Nakamura

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS   E88D ( 11 ) 2477 - 2483  2005.11  [Refereed]

     View Summary

    An accurate audio-visual speech corpus is indispensable for talking-heads research. This paper presents our audio-visual speech corpus collection and proposes a head-movement normalization method and a facial motion generation method. The audio-visual corpus contains speech data, movie data of faces, and the positions and movements of facial organs. The corpus consists of Japanese phoneme-balanced sentences uttered by a female native speaker. Accurate facial capture is realized by using an optical motion-capture system. We captured high-resolution 3D data by arranging many markers on the speaker's face. In addition, we propose a method of acquiring the facial movements and removing head movements by using an affine transformation to compute the displacements of the pure facial organs. Finally, in order to easily create facial animation from this motion data, we propose a technique for assigning the captured data to the facial polygon model. Evaluation results demonstrate the effectiveness of the proposed facial motion generation method and show the relationship between the number of markers and errors.

    DOI

    Scopus

    4
    Citation
    (Scopus)
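
    A minimal sketch of head-movement normalization in the spirit of the corpus entry above: a least-squares affine transform fitted on markers assumed to move rigidly with the head is applied to all markers, after which pure facial-organ displacements can be taken relative to the neutral frame. The rigid-marker choice and the single global affine fit are assumptions for illustration.

```python
import numpy as np

def remove_head_motion(frame_markers, neutral_markers, rigid_idx):
    """Normalize captured facial markers by cancelling global head movement.

    frame_markers:   (M, 3) marker positions in the current frame.
    neutral_markers: (M, 3) marker positions in the reference (neutral) frame.
    rigid_idx:       indices of markers assumed to move only with the head
                     (e.g. nose bridge, temples); at least 4 non-coplanar points.
    Returns the frame markers expressed in the neutral head coordinate frame.
    """
    src = frame_markers[rigid_idx]
    dst = neutral_markers[rigid_idx]
    src_h = np.hstack([src, np.ones((len(src), 1))])
    M, *_ = np.linalg.lstsq(src_h, dst, rcond=None)          # least-squares affine fit
    all_h = np.hstack([frame_markers, np.ones((len(frame_markers), 1))])
    return all_h @ M                                          # head motion removed

# Pure facial displacement = remove_head_motion(frame, neutral, idx) - neutral.
```
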
  • Special section on life-like agent and its communication

    M Ishizuka, S Morishima

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS   E88D ( 11 ) 2443 - 2444  2005.11  [Refereed]

    DOI

    Scopus

  • Automatic head-movement control for emotional speech

    Shin-Ichi Kawamoto, Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

    ACM SIGGRAPH 2005 Posters, SIGGRAPH 2005     28  2005.07  [Refereed]

     View Summary

    Head movements could be automatically generated from speech data. The expression of head movement could be also controlled by user-defined emotional factors, as shown in the video demonstration.

    DOI

    Scopus

  • Speech to talking heads system based on hidden Markov models

    Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

    ACM SIGGRAPH 2005 Posters, SIGGRAPH 2005     27  2005.07  [Refereed]

     View Summary

    Facial animation using the proposed talking heads was created from input speech signals, as shown in the video demonstration. We have confirmed facial animations of various facial objects.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Future cast system

    Shigeo Morishima, Akinobu Maejima, Shuhei Wemler, Tamotsu Machida, Masao Takebayashi

    ACM SIGGRAPH 2005 Sketches, SIGGRAPH 2005     20  2005.07  [Refereed]

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • Simulation-Based Cartoon Hair Animation.

    Eiji Sugisaki, Yizhou Yu, Ken Anjyo, Shigeo Morishima

    The 13-th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision'2005, WSCG 2005, University of West Bohemia, Campus Bory, Plzen-Bory, Czech Republic, January 31 - February 4, 2005     117 - 122  2005  [Refereed]

  • Reconstructing motion using a human structure model.

    Shohei Nishimura, Shoichiro Iwasawa, Eiji Sugisaki, Shigeo Morishima

    32. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2005, Los Angeles, California, USA, July 31 - August 4, 2005, Posters     109  2005  [Refereed]

    DOI

  • Quantitative representation of face expression using motion capture system.

    Hiroaki Yanagisawa, Akinobu Maejima, Tatsuo Yotsukura, Shigeo Morishima

    32. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2005, Los Angeles, California, USA, July 31 - August 4, 2005, Posters     108  2005  [Refereed]

    DOI

    Scopus

  • Interactive speech conversion system cloning speaker intonation automatically.

    Yoshihiro Adachi, Shigeo Morishima

    32. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2005, Los Angeles, California, USA, July 31 - August 4, 2005, Posters     77  2005  [Refereed]

    DOI

  • Multimodal translation system using texture-mapped lip-sync images for video mail and automatic dubbing applications

    S Morishima, S Nakamura

    EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING   2004 ( 11 ) 1637 - 1647  2004.09  [Refereed]

     View Summary

    We introduce a multimodal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion by synchronizing it to the translated speech. This system also introduces both a face synthesis technique that can generate any viseme lip shape and a face tracking technique that can estimate the original position and rotation of a speaker's face in an image sequence. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a 3D wire-frame model that is adaptable to any speaker. Our approach provides translated image synthesis with an extremely small database. The tracking motion of the face from a video image is performed by template matching. In this system, the translation and rotation of the face are detected by using a 3D personal face model whose texture is captured from a video frame. We also propose a method to customize the personal face model by using our GUI tool. By combining these techniques and the translated voice synthesis technique, an automatic multimodal translation can be achieved that is suitable for video mail or automatic dubbing systems into other languages.

    DOI

    Scopus

    6
    Citation
    (Scopus)
  • Mocap+MRI=?

    Shoichiro Iwasawa, Kenji Mase, Shigeo Morishima

    ACM SIGGRAPH 2004 Posters, SIGGRAPH 2004     116  2004.08  [Refereed]

     View Summary

    This poster proposes a basic idea for observing the differences between skeletal postures obtained by two methods: postures generated by an ordinary mocap process alone, and anatomically and individually accurate skeletal postures generated by an ordinary mocap process combined with a medical imaging method such as MRI.

    DOI

    Scopus

  • Face expression synthesis based on a facial motion distribution chart

    Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

    ACM SIGGRAPH 2004 Posters, SIGGRAPH 2004     85  2004.08  [Refereed]

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Cartoon hair animation based on physical simulation

    Eiji Sugisaki, Yizhou Yu, Ken Anjyo, Shigeo Morishima

    ACM SIGGRAPH 2004 Posters, SIGGRAPH 2004     27  2004.08  [Refereed]

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents.

    Shinichi Kawamoto, Hiroshi Shimodaira, Tsuneo Nitta, Takuya Nishimoto, Satoshi Nakamura, Katsunobu Itou, Shigeo Morishima, Tatsuo Yotsukura, Atsuhiko Kai, Akinobu Lee, Yoichi Yamashita, Takao Kobayashi, Keiichi Tokuda, Keikichi Hirose, Nobuaki Minematsu, Atsushi Yamada, Yasuharu Den, Takehito Utsuro, Shigeki Sagayama

    Life-like characters - tools, affective functions, and applications.     187 - 212  2004  [Refereed]

  • How to capture absolute human skeletal posture

    Shoichiro Iwasawa, Kiyoshi Kojima, Kenji Mase, Shigeo Morishima

    ACM SIGGRAPH 2003 Sketches and Applications, SIGGRAPH 2003    2003.07  [Refereed]

     View Summary

    Commercially available motion capture products give us fairly precise movements of human body segments but do not measure enough information to define skeletal posture in its entirety. This sketch describes how to obtain the complete posture of skeletal structure with the help of marker locations relative to bones that are derived from MRI data sets.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Face analysis and synthesis for interactive entertainment

    Shoichiro Iwasawa, Tatsuo Yotsukura, Shigeo Morishima

    IFIP Advances in Information and Communication Technology   112   157 - 164  2003  [Refereed]

     View Summary

    A stand-in is a common technique for movies and TV programs in foreign languages. However, the current practice of a stand-in that substitutes only the voice channel results in awkward matching with the mouth motion. Videophones with automatic voice translation are expected to be widely used in the near future and may face the same problem unless the speaking face image is translated with lip synchronization. In this paper, we propose a method that tracks the motion of the face in the video image and then replaces the face, or only the mouth region, with a synthesized one that is synchronized with a synthetic or spoken voice. This is one of the key technologies not only for speaking-image translation and communication systems, but also for interactive entertainment systems. Finally, an interactive movie system is introduced as an entertainment application. © 2003 by Springer Science+Business Media New York.

    DOI

  • Model-based talking face synthesis for anthropomorphic spoken dialog agent system.

    Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

    Proceedings of the Eleventh ACM International Conference on Multimedia, Berkeley, CA, USA, November 2-8, 2003     351 - 354  2003  [Refereed]

    DOI

    Scopus

    6
    Citation
    (Scopus)
  • Magical face: Integrated tool for muscle based facial animation

    Tatsuo Yotsukura, Mitsunori Takahashi, Shigeo Morishima, Kazunori Nakamura, Hirokazu Kudoh

    ACM SIGGRAPH 2002 Conference Abstracts and Applications, SIGGRAPH 2002     212  2002.07  [Refereed]

     View Summary

    In recent years, tremendous advances have been achieved in the 3D computer graphics used in the entertainment industry, and in the semiconductor technologies used to fabricate graphics chips and CPUs. However, although good reproduction of facial expressions is possible through 3D CG, the creation of realistic expressions and mouth motion is not a simple task.

    DOI

    Scopus

  • HyperMask - projecting a talking head onto a real object

    T Yotsukura, S Morishima, F Nielsen, K Binsted, C Pinhanez

    VISUAL COMPUTER   18 ( 2 ) 111 - 120  2002.04  [Refereed]

     View Summary

    HyperMask is a system which projects an animated face onto a physical mask worn by an actor. As the mask moves within a prescribed area, its position and orientation are detected by a camera and the projected image changes with respect to the viewpoint of the audience. The lips of the projected face are automatically synthesized in real time with the voice of the actor, who also controls the facial expressions. As a theatrical tool, HyperMask enables a new style of storytelling. As a prototype system, we put a self-contained HyperMask system in a trolley (disguised as a linen cart), so that it projects onto the mask worn by the actor pushing the trolley.

    DOI

    Scopus

    22
    Citation
    (Scopus)
  • Styling and animating human hair

    Keisuke Kishi, Shigeo Morishima

    Systems and Computers in Japan   33 ( 3 ) 31 - 40  2002.03  [Refereed]

     View Summary

    Synthesizing facial images by computer graphics (CG) has attracted attention in connection with the current trends toward synthesizing virtual faces and realizing communication systems in cyberspace. In this paper, a method for representing human hair, which is known to be difficult to synthesize in computer graphics, is presented. In spite of the fact that hair is visually important in human facial imaging, it has frequently been replaced by simple curved surfaces or a part of the background. Although the methods of representing hair by mapping techniques have achieved results, such methods are inappropriate in representing motions of hair. Thus, spatial curves are used to represent hair, without using textures or polygons. In addition, hair style design is simplified by modeling hair in units of tufts, which are partially concentrated areas of hair. This paper describes the collision decisions and motion representations in this new hair style design system, the modeling of tufts, the rendering method, and the four-branch (quadtree) method. In addition, hair design using this hair style design system and the animation of wind-blown hair are illustrated. © 2002 Scripta Technica, Syst. Comp. Jpn.

    DOI

    Scopus

  • Multi-modal translation system and its evaluation

    S Morishima, S Nakamura

    FOURTH IEEE INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES, PROCEEDINGS     241 - 246  2002  [Refereed]

     View Summary

    Speech-to-speech translation has been studied to realize natural human communication beyond language barriers. Toward further multi-modal natural communication, visual information such as face and lip movements will be necessary. In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database. We conduct subjective evaluation with a connected-digit discrimination test using data with and without audio-visual lip synchronization. The results confirm the significant quality of the proposed audio-visual translation system and the importance of lip synchronization.

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Audio-visual speech translation with automatic lip synchronization and face tracking based on 3-D head model

    S Morishima, S Ogata, K Murai, S Nakamura

    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS     2117 - 2120  2002  [Refereed]

     View Summary

    Speech-to-speech translation has been studied to realize natural human communication beyond language barriers. Toward further multi-modal natural communication, visual information such as face and lip movements will be necessary. In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database. We conduct subjective evaluation by connected digit discrimination using data with and without audiovisual lip-synchronicity. The results confirm the sufficient quality of the proposed audio-visual translation system.

    DOI

  • An open source development tool for anthropomorphic dialog agent - Face image synthesis and lip synchronization

    T Yotsukura, S Morishima

    PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING     272 - 275  2002  [Refereed]

     View Summary

    We describe the design and report on the development of an open-source toolkit for building an easily customizable anthropomorphic dialog agent. This toolkit consists of four modules for multi-modal dialog integration, speech recognition, speech synthesis, and face image synthesis. In this paper, we focus on the construction of the agent's face image synthesis.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Audio-Video Tracking System for Multi-Modal Interface

    D. Zotkin, K. Takahashi, T. Yotsukura, S. Morishima, N. Tetsutani

    Journal of IIEEJ   30 ( 4 ) 452 - 463  2001  [Refereed]

     View Summary

    In this paper, a front-end system that uses audio and video information to track people or other sound sources in an ordinary room is developed. A microphone array is used to determine the spatial location of the sound; an active video camera acquires an image of the area where the sound is detected, detects people in the image using skin color, and can zoom in on and track a speaker. Several add-ons to the system include various visualization tools such as on-screen displays of waveforms, correlation plots, spectrum plots, spatial acoustic energy distributions, and running time-frequency acoustic energy plots, as well as real-time beamforming with real-time output to headphones. The system can be used as a front end for non-encumbering human-computer interaction by video and audio means.

    DOI CiNii
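
    To make the audio part of the pipeline concrete, the following is a minimal sketch (not code from the paper) of how a two-microphone pair can estimate the direction of a sound source from the inter-microphone time delay, here using GCC-PHAT cross-correlation; the sampling rate, microphone spacing, and function names are assumptions for this example.

        import numpy as np

        def gcc_phat_delay(sig_a, sig_b, fs):
            # Estimate the time delay between two microphone channels with
            # GCC-PHAT cross-correlation (a standard choice; the paper does not
            # specify which correlation its array front end uses).
            n = sig_a.size + sig_b.size
            A = np.fft.rfft(sig_a, n=n)
            B = np.fft.rfft(sig_b, n=n)
            cross = A * np.conj(B)
            cross /= np.abs(cross) + 1e-12             # PHAT weighting
            cc = np.fft.irfft(cross, n=n)
            max_shift = n // 2
            cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
            shift = np.argmax(np.abs(cc)) - max_shift
            return shift / fs                          # delay in seconds

        def direction_of_arrival(delay, mic_distance, speed_of_sound=343.0):
            # Convert the delay into an azimuth angle for one microphone pair.
            sin_theta = np.clip(delay * speed_of_sound / mic_distance, -1.0, 1.0)
            return np.degrees(np.arcsin(sin_theta))

        # Hypothetical usage: two 16 kHz channels recorded 30 cm apart.
        fs, spacing = 16000, 0.30
        source = np.random.randn(fs)
        left, right = source, np.roll(source, 4)       # simulate a small lag
        tau = gcc_phat_delay(left, right, fs)
        print("estimated azimuth (deg):", direction_of_arrival(tau, spacing))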

  • Model-based lip synchronization with automatically translated synthetic voice toward a multi-modal translation system

    Shin Ogata, Kazumasa Murai, Satoshi Nakamura, Shigeo Morishima

    Proceedings - IEEE International Conference on Multimedia and Expo     28 - 31  2001  [Refereed]

     View Summary

    In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database.

    DOI

    Scopus

    8
    Citation
    (Scopus)
  • Automatic face tracking and model match-move in video sequence using 3D face model

    Takafumi Misawa, Kazumasa Murai, Satoshi Nakamura, Shigeo Morishima

    Proceedings - IEEE International Conference on Multimedia and Expo     234 - 236  2001  [Refereed]

     View Summary

    A stand-in is a common technique for movies and TV programs in foreign languages. However, the current practice of a stand-in that substitutes only the voice channel results in awkward matching with the mouth motion. Videophones with automatic voice translation are expected to be widely used in the near future and may face the same problem unless the speaking face image is translated with lip synchronization. In this paper, we propose a method for tracking the motion of the face in a video image, which is one of the key technologies for speaking-image translation. Most previous tracking algorithms aim to detect feature points of the face; however, they suffer from problems such as blurring of feature points between frames and occlusion of feature points caused by head rotation. Instead, our method detects the movement and rotation of the head, given the three-dimensional shape of the face, by template matching using a 3D personal face wireframe model. Evaluation experiments are carried out against measured reference data of the head, and the proposed method achieves an average angle error of 0.48, confirming its effectiveness.

    DOI

    Scopus

    1
    Citation
    (Scopus)
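
    The following is a minimal illustrative sketch, not the authors' implementation, of the basic idea of estimating head translation and rotation by template matching with a 3D personal face model: candidate poses are enumerated, model points are projected into the frame, and the pose whose sampled intensities best match a reference template is kept. The pinhole camera model, candidate grids, and nearest-neighbour sampling are simplifications assumed for the example.

        import numpy as np

        def euler_to_matrix(rx, ry, rz):
            # Rotation matrix from Euler angles (radians), applied in Z.Y.X order.
            cx, sx = np.cos(rx), np.sin(rx)
            cy, sy = np.cos(ry), np.sin(ry)
            cz, sz = np.cos(rz), np.sin(rz)
            Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
            Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
            Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
            return Rz @ Ry @ Rx

        def project(points3d, focal, center):
            # Simple pinhole projection of Nx3 model points to pixel coordinates.
            z = np.maximum(points3d[:, 2], 1e-6)
            u = focal * points3d[:, 0] / z + center[0]
            v = focal * points3d[:, 1] / z + center[1]
            return np.stack([u, v], axis=1)

        def sample_gray(image, uv):
            # Nearest-neighbour sampling of intensities at projected points.
            h, w = image.shape
            x = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
            y = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
            return image[y, x]

        def track_pose(image, model_points, template_values, focal, center,
                       angle_candidates, trans_candidates):
            # Exhaustively test rotation/translation candidates and keep the one
            # whose sampled intensities best match the reference template.
            best, best_err = None, np.inf
            for rx in angle_candidates:
                for ry in angle_candidates:
                    for rz in angle_candidates:
                        R = euler_to_matrix(rx, ry, rz)
                        for t in trans_candidates:
                            pts = model_points @ R.T + t
                            uv = project(pts, focal, center)
                            err = np.sum((sample_gray(image, uv) - template_values) ** 2)
                            if err < best_err:
                                best, best_err = (rx, ry, rz, t), err
            return best

        # Toy usage with a random frame and a hypothetical 3-D point model.
        img = np.random.rand(240, 320)
        pts3d = np.random.rand(50, 3) + np.array([0.0, 0.0, 2.0])
        ref = sample_gray(img, project(pts3d, 300.0, (160, 120)))
        pose = track_pose(img, pts3d, ref, 300.0, (160, 120),
                          angle_candidates=np.radians([-2, 0, 2]),
                          trans_candidates=[np.zeros(3), np.array([0.01, 0, 0])])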
  • HyperMask: Talking head projected onto moving surface

    S Morishima, T Yotsukura

    2001 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS     947 - 950  2001  [Refereed]

     View Summary

    HYPERMASK is a system which projects an animated face onto a physical mask, worn by an actor. As the mask moves within a prescribed area, its position and orientation are detected by a camera, and the projected image changes with respect to the viewpoint of the audience. The lips of the projected face are automatically synthesized in real time with the voice of the actor, who also controls the facial expressions. As a theatrical tool, HYPERMASK enables a new style of storytelling. As a prototype system, we propose to put a self-contained HYPERMASK system in a trolley (disguised as a linen cart), so that it projects onto the mask worn by the actor pushing the trolley.

    DOI

  • Multimodal translation.

    Shigeo Morishima, Shin Ogata, Satoshi Nakamura

    Auditory-Visual Speech Processing, AVSP 2001, Aalborg, Denmark, September 7-9, 2001     98 - 103  2001  [Refereed]

    CiNii

  • マルチモーダル擬人化インタフェースとその感性基盤機能

    石塚 満, 橋本周司, 森島繁生, 小林哲則, 金子正秀, 相澤清晴, 苗村 健, 伊庭斉志, 土肥 浩

    感性的ヒューマンインタフェース公開シンポジウム資料,2000.11.22     99 - 111  2000.11

  • 3D face expression estimation and generation from 2D image based on a physically constraint model

    T Ishikawa, S Morishima, D Terzopoulos

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS   E83D ( 2 ) 251 - 258  2000.02  [Refereed]

     View Summary

    Muscle-based face image synthesis is one of the most realistic approaches to the realization of a life-like agent in computers. A facial muscle model is composed of facial tissue elements and simulated muscles. In this model, the forces acting on a facial tissue element are calculated from the contraction of each muscle string, so the combination of muscle contraction forces determines a specific facial expression. These muscle parameters are determined on a trial-and-error basis by comparing a sample photograph with a generated image using our Muscle-Editor to generate a specific face image. In this paper, we propose a strategy for the automatic estimation of facial muscle parameters from the movements of 2D markers located on the face using a neural network. This corresponds to non-real-time 3D facial motion capture from a 2D camera image under a physics-based constraint.
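
    As a rough illustration of the estimation step described above, and not the network from the paper, the sketch below trains a small neural-network regressor that maps 2D marker displacements to muscle contraction parameters; the marker count, muscle count, layer sizes, and training data are all assumed.

        import torch
        import torch.nn as nn

        # Hypothetical sizes: 2-D displacements of 20 facial markers (40 values)
        # mapped to 14 muscle contraction parameters; the actual marker set and
        # muscle count in the paper may differ.
        N_MARKERS, N_MUSCLES = 20, 14

        regressor = nn.Sequential(
            nn.Linear(N_MARKERS * 2, 64), nn.Tanh(),
            nn.Linear(64, 64), nn.Tanh(),
            nn.Linear(64, N_MUSCLES), nn.Sigmoid(),   # contractions kept in [0, 1]
        )

        def train(marker_displacements, muscle_params, epochs=200, lr=1e-3):
            # Fit the network on (2-D marker displacement, muscle parameter) pairs,
            # e.g. produced beforehand with a physics-based face model.
            opt = torch.optim.Adam(regressor.parameters(), lr=lr)
            loss_fn = nn.MSELoss()
            for _ in range(epochs):
                opt.zero_grad()
                loss = loss_fn(regressor(marker_displacements), muscle_params)
                loss.backward()
                opt.step()
            return regressor

        # Toy usage with random stand-in data of the assumed shapes.
        x = torch.randn(500, N_MARKERS * 2)
        y = torch.rand(500, N_MUSCLES)
        train(x, y)
        estimated = regressor(x[:1])   # muscle parameters for one new marker frame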

  • Networked Theater - Movie Production System Based on a Networked Environment

    K. Takahashi, J. Kurumisawa, S. Morishima, T. Yotsukura, S. Kawato

    Proceedings of the Computer Graphics Annual Conference Series, 2000 (SIGGRAPH2000)     92  2000  [Refereed]

  • Human body postures from trinocular camera images

    Shoichiro Iwasawa, Jun Ohya, Kazuhiko Takahashi, Tatsumi Sakaguchi, Kazuyuki Ebihara, Shigeo Morishima

    Proceedings - 4th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2000     326 - 331  2000  [Refereed]

     View Summary

    This paper proposes a new real-time method for estimating human postures in 3D from trinocular images. In this method, an upper body orientation detection and a heuristic contour analysis are performed on the human silhouettes extracted from the trinocular images so that representative points such as the top of the head can be located. The major joint positions are estimated based on a genetic algorithm-based learning procedure. 3D coordinates of the representative points and joints are then obtained from the two views by evaluating the appropriateness of the three views. The proposed method implemented on a personal computer runs in real-time. Experimental results show high estimation accuracies and the effectiveness of the view selection process. © 2000 IEEE.

    DOI

    Scopus

    19
    Citation
    (Scopus)
  • Multi-Media Ambiance Communication Based on Actual Moving Pictures.

    Tadashi Ichikawa, Tetsuya Yoshimura, Kunio Yamada, Toshifumi Kanamaru, Hiromichi Suga, Shoichiro Iwasawa, Takeshi Naemura, Kiyoharu Aizawa, Shigeo Morishima, Takahiro Saito

    Proceedings of the 1999 International Conference on Image Processing, ICIP '99, Kobe, Japan, October 24-28, 1999     36 - 40  1999  [Refereed]

    DOI

  • Face-To-Face Communicative Avatar Driven by Voice.

    Shigeo Morishima, Tatsuo Yotsukura

    Proceedings of the 1999 International Conference on Image Processing, ICIP '99, Kobe, Japan, October 24-28, 1999     11 - 15  1999  [Refereed]

    DOI

  • Multiple points face-to-face communication in cyberspace using multi-modal agent.

    Shigeo Morishima

    Human-Computer Interaction: Communication, Cooperation, and Application Design, Proceedings of HCI International '99 (the 8th International Conference on Human-Computer Interaction), Munich, Germany, August 22-26, 1999, Volume 2   2   177 - 181  1999  [Refereed]

    CiNii

  • Physics-model-based 3D facial image reconstruction from frontal images using optical flow.

    Shigeo Morishima, Takahiro Ishikawa, Demetri Terzopoulos

    ACM SIGGRAPH 98 Conference Abstracts and Applications, Orlando, Florida, USA, July 19-24, 1998     258  1998  [Refereed]

    DOI

  • Dynamic modeling of human hair and GUI based hair style designing system.

    Keisuke Kishi, Shigeo Morishima

    ACM SIGGRAPH 98 Conference Abstracts and Applications, Orlando, Florida, USA, July 19-24, 1998     255  1998  [Refereed]

    DOI

  • Facial muscle parameter decision from 2D frontal image

    S Morishima, T Ishikawa, D Terzopoulos

    FOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2     160 - 162  1998  [Refereed]

     View Summary

    Muscle-based face image synthesis is one of the most realistic approaches to realizing a life-like agent in the computer. The facial muscle model is composed of facial tissue elements and muscles. In this model, the forces acting on a facial tissue element are calculated from the contraction strength of each muscle, so the combination of muscle parameters determines a specific facial expression. Currently, each muscle parameter is decided by a trial-and-error procedure that compares a sample photograph with the image generated by our Muscle-Editor for a specific face.
    In this paper, we propose a strategy for the automatic estimation of facial muscle parameters from 2D marker movements using a neural network.
    This is also 3D motion estimation from 2D point or flow information in a captured image under the restrictions of a physics-based face model.

    DOI

  • Real-time human posture estimation using monocular thermal images

    S Iwasawa, K Ebihara, J Ohya, S Morishima

    AUTOMATIC FACE AND GESTURE RECOGNITION - THIRD IEEE INTERNATIONAL CONFERENCE PROCEEDINGS     492 - 497  1998  [Refereed]

     View Summary

    This paper introduces a new real-time method to estimate the posture of a human from thermal images acquired by an infrared camera, regardless of the background and lighting conditions. A distance transformation is performed on the human body area extracted from the thresholded thermal image to calculate the center of gravity. After the orientation of the upper half of the body is obtained by calculating the moment of inertia, significant points such as the top of the head and the tips of the hands and feet are heuristically located. In addition, the elbow and knee positions are estimated from the detected significant points using a genetic-algorithm-based learning procedure.
    The experimental results demonstrate the robustness of the proposed algorithm and its real-time (faster than 20 frames per second) performance.
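
    The following is a minimal sketch of the silhouette-processing steps described above (thresholding, center of gravity, and body orientation from second moments), written with plain NumPy; for brevity it uses the silhouette centroid rather than a distance-transform-weighted center, and the head-point heuristic is an assumption, so it is an illustration rather than the paper's algorithm.

        import numpy as np

        def body_centroid_and_orientation(thermal_image, threshold):
            # Threshold the thermal image, compute the silhouette centroid, and
            # obtain the body-axis orientation from the second moments
            # (moment-of-inertia-like matrix) of the silhouette pixels.
            mask = thermal_image > threshold          # human region assumed hotter
            ys, xs = np.nonzero(mask)
            cy, cx = ys.mean(), xs.mean()             # center of gravity
            dy, dx = ys - cy, xs - cx
            cov = np.array([[np.mean(dx * dx), np.mean(dx * dy)],
                            [np.mean(dx * dy), np.mean(dy * dy)]])
            eigvals, eigvecs = np.linalg.eigh(cov)
            major = eigvecs[:, np.argmax(eigvals)]    # principal (body) axis
            angle = np.degrees(np.arctan2(major[1], major[0]))
            return (cx, cy), angle

        def top_of_head(thermal_image, threshold, angle_deg):
            # Heuristically take the silhouette pixel farthest along the body axis
            # as the top of the head (one of the "significant points").
            mask = thermal_image > threshold
            ys, xs = np.nonzero(mask)
            a = np.radians(angle_deg)
            proj = xs * np.cos(a) + ys * np.sin(a)
            i = np.argmax(proj)
            return xs[i], ys[i]

        # Toy usage with a random stand-in "thermal" frame.
        frame = np.random.rand(120, 160)
        (cx, cy), angle = body_centroid_and_orientation(frame, threshold=0.7)
        head = top_of_head(frame, 0.7, angle)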

  • Facial image reconstruction by estimated muscle parameter

    T Ishikawa, H Sera, S Morishima, D Terzopoulos

    AUTOMATIC FACE AND GESTURE RECOGNITION - THIRD IEEE INTERNATIONAL CONFERENCE PROCEEDINGS     342 - 347  1998  [Refereed]

     View Summary

    Muscle-based face image synthesis is one of the most realistic approaches to realizing a life-like agent in the computer.
    The facial muscle model is composed of facial tissue elements and muscles. In this model, the forces acting on a facial tissue element are calculated from the contraction strength of each muscle, so the combination of muscle parameters determines a specific facial expression.
    Currently, each muscle parameter is decided by a trial-and-error procedure that compares a sample photograph with the image generated by our Muscle-Editor for a specific face. In this paper, we propose a strategy for the automatic estimation of facial muscle parameters from 2D marker movements using a neural network. This corresponds to non-real-time 3D facial motion tracking from a 2D image under a physics-based constraint.

  • Real-time Talking Head Driven by Voice and its Application to Communication and Entertainment.

    Shigeo Morishima

    Auditory-Visual Speech Processing, AVSP '98, Sydney, NSW, Australia, December 4-6, 1998     195 - 200  1998  [Refereed]

  • 3D Estimation of Facial Muscle Parameter from the 2D Marker Movement Using Neural Network.

    Takahiro Ishikawa, Hajime Sera, Shigeo Morishima, Demetri Terzopoulos

    Computer Vision - ACCV'98, Third Asian Conference on Computer Vision, Hong Kong, China, January 8-10, 1998, Proceedings, Volume II     671 - 678  1998  [Refereed]

    DOI CiNii

    Scopus

    1
    Citation
    (Scopus)
  • Expression recognition and synthesis for face-to-face communication

    S Morishima

    DESIGN OF COMPUTING SYSTEMS: SOCIAL AND ERGONOMIC CONSIDERATIONS   21   415 - 418  1997  [Refereed]

     View Summary

    Muscle-based face image synthesis [1] [2] is one of the most realistic approaches to realizing an interface agent. The facial muscle model is composed of facial tissue elements and muscles. In this model, the forces affecting a facial tissue element are calculated from the contraction strength of each muscle, so the combination of muscle parameters determines a specific facial expression.
    In this paper, we introduce the facial muscle model and the strategy for the automatic estimation of facial muscle parameters from 2-D marker movements.

  • Real-time estimation of human body posture from monocular thermal images

    S Iwasawa, K Ebihara, J Ohya, S Morishima

    1997 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS     15 - 20  1997  [Refereed]

     View Summary

    This paper introduces a new real-time method to estimate the posture of a human from thermal images acquired by an infrared camera, regardless of the background and lighting conditions. A distance transformation is performed on the human body area extracted from the thresholded thermal image to calculate the center of gravity. After the orientation of the upper half of the body is obtained by calculating the moment of inertia, significant points such as the top of the head and the tips of the hands and feet are heuristically located. In addition, the elbow and knee positions are estimated from the detected significant points using a genetic-algorithm-based learning procedure.
    The experimental results demonstrate the robustness of the proposed algorithm and its real-time (faster than 20 frames per second) performance.

    DOI

  • Face feature extraction from spatial frequency for dynamic expression recognition

    Tatsumi Sakaguchi, Shigeo Morishima

    Proceedings - International Conference on Pattern Recognition   3   451 - 455  1996  [Refereed]

     View Summary

    A new facial feature extraction technique for expression recognition is proposed. We employ spatial-frequency-domain information to obtain performance that is robust to random noise in the image and to lighting conditions. It exhibits sufficiently high performance even when combined with a low-performance region tracking method. As an application of this technique, we have constructed a dynamic facial expression recognition system. We use hidden Markov models to exploit temporal changes in the facial expressions. Together, the spatial frequency information and the temporal information improve facial expression recognition rates. In the experiment, we achieved a correct recognition rate of approximately 84.1% over six categories. © 1996 IEEE.

    DOI

    Scopus

    4
    Citation
    (Scopus)
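
    As an illustration of the feature side of the pipeline (not the paper's exact features), the sketch below computes spatial-frequency features as radial band energies of the 2-D FFT of a facial region and stacks them into a per-frame sequence that a per-category hidden Markov model could then score; the band count and patch size are assumptions.

        import numpy as np

        def band_energies(gray_patch, n_bands=8):
            # Spatial-frequency features: energy of the 2-D FFT magnitude grouped
            # into concentric radial frequency bands (a generic stand-in for the
            # paper's spatial-frequency features).
            spec = np.abs(np.fft.fftshift(np.fft.fft2(gray_patch)))
            h, w = spec.shape
            yy, xx = np.mgrid[0:h, 0:w]
            r = np.hypot(yy - h / 2, xx - w / 2)
            r_max = r.max()
            feats = [spec[(r >= r_max * b / n_bands) & (r < r_max * (b + 1) / n_bands)].sum()
                     for b in range(n_bands)]
            feats = np.array(feats)
            return feats / (feats.sum() + 1e-12)

        def sequence_features(face_patches):
            # Turn a tracked facial-region image sequence into a feature sequence
            # that could be fed to one hidden Markov model per expression category,
            # scoring each category's model and taking the maximum likelihood.
            return np.stack([band_energies(p) for p in face_patches])

        # Toy usage: 30 frames of a hypothetical 64x64 mouth-region patch.
        frames = np.random.rand(30, 64, 64)
        X = sequence_features(frames)        # shape (30, 8), one row per frame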
  • Construction and Psychological Evaluation of 3-D Emotion Space

    KAWAKAMI FUMIO, YAMADA HIROSHI, MORISHIMA SHIGEO, HARASHIMA HIROSHI

    International Journal of Biomedical Soft Computing and Human Sciences: the official journal of the Biomedical Fuzzy Systems Association   1 ( 1 ) 33 - 41  1995

     View Summary

    The goal of this research is to realize a natural human-machine communication system by giving facial expressions to the computer. To realize this environment, it is necessary for the machine to recognize the human's emotional condition appearing on the human face and to synthesize a reasonable facial image for the machine. For this purpose, the machine should have an emotion model based on parameterized faces that can express an emotional condition quantitatively. In particular, both the mapping from the face to the model and the inverse mapping have to be achieved. Facial expressions are parameterized with the Facial Action Coding System (FACS), which is translated into the motion of the grid points of the face model. An emotional condition is described compactly by a point in a 3D space generated by a 5-layered neural network, and the evaluation results show the high performance of this model.

    DOI CiNii

  • 3-D emotion space for interactive communication

    F Kawakami, M Ohkura, H Yamada, H Harashima, S Morishima

    IMAGE ANALYSIS APPLICATIONS AND COMPUTER GRAPHICS   1024   471 - 478  1995  [Refereed]

     View Summary

    In this paper, methods for modeling facial expression and emotion are proposed. This emotion model, called the 3-D Emotion Space, can represent both human and computer emotional conditions appearing on the face as a coordinate in the 3-D space.
    For the construction of this 3-D Emotion Space, a 5-layer neural network, which is superior in non-linear mapping performance, is applied. After training the network with backpropagation to realize an identity mapping, both the mapping from facial expression parameters to the 3-D Emotion Space and the inverse mapping from the Emotion Space to the expression parameters were realized.
    As a result, a system that can simultaneously analyze and synthesize facial expressions was constructed.
    Moreover, this inverse mapping to facial expressions is evaluated by subjective evaluation using the synthesized expressions as test images. The evaluation results proved the high performance of this model in describing natural facial expressions and emotional conditions.

    DOI

    Scopus

    2
    Citation
    (Scopus)
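
    A minimal sketch of the identity-mapping idea described above, assuming hypothetical dimensions (44 expression parameters, a 3-unit bottleneck) and modern tooling rather than the original implementation: a 5-layer network is trained to reproduce its input, and the 3-unit middle layer then serves as the emotion-space coordinate.

        import torch
        import torch.nn as nn

        # Hypothetical dimensionality: a facial expression described by 44
        # FACS-based parameters is compressed to a 3-D "emotion space" by a
        # 5-layer network trained as an identity mapping (autoencoder).
        N_PARAMS, HIDDEN, EMOTION_DIM = 44, 20, 3

        class EmotionSpace(nn.Module):
            def __init__(self):
                super().__init__()
                self.encode = nn.Sequential(nn.Linear(N_PARAMS, HIDDEN), nn.Sigmoid(),
                                            nn.Linear(HIDDEN, EMOTION_DIM), nn.Sigmoid())
                self.decode = nn.Sequential(nn.Linear(EMOTION_DIM, HIDDEN), nn.Sigmoid(),
                                            nn.Linear(HIDDEN, N_PARAMS), nn.Sigmoid())

            def forward(self, x):
                z = self.encode(x)            # mapping: expression -> emotion space
                return self.decode(z), z      # inverse mapping: emotion space -> expression

        net = EmotionSpace()
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        data = torch.rand(200, N_PARAMS)      # stand-in expression parameter vectors
        for _ in range(300):                  # backpropagation toward identity mapping
            opt.zero_grad()
            recon, _ = net(data)
            loss = nn.functional.mse_loss(recon, data)
            loss.backward()
            opt.step()
        _, coords = net(data[:1])             # 3-D emotion coordinate of one expression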
  • Expression and motion control of hair using fast collision detection methods

    M Ando, S Morishima

    IMAGE ANALYSIS APPLICATIONS AND COMPUTER GRAPHICS   1024   463 - 470  1995  [Refereed]

     View Summary

    Attempts to generate objects of the natural world by computer graphics are now being actively pursued. Normally, huge computing power and storage capacity are necessary to produce realistic and natural movement of human hair. In this paper, a technique for synthesizing human hair with short processing time and little storage capacity is discussed. A new Space Curve Model and a Rigid Segment Model are proposed, and high-speed collision detection with the human body is also discussed.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • FACIAL EXPRESSION SYNTHESIS BASED ON NATURAL VOICE FOR VIRTUAL FACE-TO-FACE COMMUNICATION WITH MACHINE

    S MORISHIMA, H HARASHIMA

    IEEE VIRTUAL REALITY ANNUAL INTERNATIONAL SYMPOSIUM     486 - 491  1993  [Refereed]

    DOI

  • FACIAL ANIMATION SYNTHESIS FOR HUMAN-MACHINE COMMUNICATION-SYSTEM

    S MORISHIMA, H HARASHIMA

    HUMAN-COMPUTER INTERACTION, VOL 2   19   1085 - 1090  1993  [Refereed]

  • IMAGE SYNTHESIS AND EDITING SYSTEM FOR A MULTIMEDIA HUMAN INTERFACE WITH SPEAKING HEAD

    S MORISHIMA, H HARASHIMA

    INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND ITS APPLICATIONS   354   270 - 273  1992  [Refereed]

  • A MEDIA CONVERSION FROM SPEECH TO FACIAL IMAGE FOR INTELLIGENT MAN-MACHINE INTERFACE

    S MORISHIMA, H HARASHIMA

    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS   9 ( 4 ) 594 - 600  1991.05  [Refereed]

     View Summary

    An automatic facial motion image synthesis scheme, driven by speech, and a real-time image synthesis design are presented. The purpose of this research is to realize an "intelligent" human-machine interface or "intelligent" communication system with talking head images. A human face is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized naturally by transformation of the lattice points on 3-D wire frames. Two driving motion methods, a text-to-image conversion scheme, and a voice-to-image conversion scheme are proposed in this paper. In the first method, the synthesized head image can appear to speak some given words and phrases naturally. In the latter case, some mouth and jaw motions can be synthesized in synchronization with voice signals from a speaker. Facial expressions, other than mouth shape and jaw position, also can be added at any moment, so it is easy to make the facial model appear angry, to smile, to appear sad, etc., by special modification rules. These schemes were implemented on a parallel image computer system. A real-time image synthesizer was able to generate facial motion images on the display, at a TV image video rate.

    DOI

    Scopus

    74
    Citation
    (Scopus)
  • A realtime model‐based facial image synthesis based on multiprocessor network

    Shigeo Morishima, Seiji Kobayashi, Hiroshi Harashima

    Systems and Computers in Japan   22 ( 13 ) 59 - 69  1991  [Refereed]

     View Summary

    Model‐based image coding has been highlighted recently as a high‐efficiency coding method for TV telephone and TV conference systems. In a model‐based coding system, an ultralow‐rate image transmission is realized by obtaining common models of a facial image at both sides of a communication and by transmitting only modification parameters between them. However, it is difficult to realize a realtime processing of a model‐based coding with a conventional iterative‐processing type computer since the amount of material to analyze and synthesize at both sides is very large. Realtime processing is absolutely necessary to realize a practical system using this method, i.e., highspeed processings at both the transmitter and receiver are required. This paper describes a realtime facial image synthesis method based on a multiprocessor construction with a transputer. The transputer is a microcomputer having a communication capability with multiple processors and 10 MIPS CPU performance. Using this system in a pipeline processing with 20 processors, it is possible to synthesize realtime facial images at a rate of about 16 frames per second. If only a part around the lips is transmitted, it is possible to synthesize an image at a rate of about 40 frames per second with a network using five processors. Copyright © 1991 Wiley Periodicals, Inc., A Wiley Company

    DOI

    Scopus

  • A facial motion synthesis for intelligent man‐machine interface

    Shigeo Morishima, Shin'Ichi Okada, Hiroshi Harashima

    Systems and Computers in Japan   22 ( 5 ) 50 - 59  1991  [Refereed]

     View Summary

    A facial motion image synthesis method for intelligent man‐machine interface is examined. Here, the intelligent man‐machine interface is a kind of friendly man‐machine interface with voices and pictures in which human faces appear on a screen and answer questions, compared to the currently existing user interfaces which primarily uses letters. Thus what appears on the screen is human faces, and if speech mannerisms and facial expressions are natural, then the interactions with the machine are similar to those with actual human beings. To implement such an intelligent man‐machine interface it is necessary to synthesize natural facial expressions on the screen. This paper investigates a method to synthesize facial motion images based on given information on text and emotion. The proposed method utilizes the analysis‐synthesis image coding method. It constructs facial images by assigning intensity data to the parameters of a 3‐dimensional (3‐D) model matching the person in question. Moreover, it synthesizes facial expressions by modifying the 3‐D model according to the predetermined set of rules based on the input phonemes and emotion, and also synthesizes reasonably natural facial images. Copyright © 1991 Wiley Periodicals, Inc., A Wiley Company

    DOI

    Scopus

    2
    Citation
    (Scopus)
  • Automatic Rule Extraction from Statistical Data and Fuzzy Tree Search

    Shigeo Morishima, Hiroshi Harashima

    Systems and Computers in Japan   19 ( 5 ) 26 - 37  1988  [Refereed]

     View Summary

    Generally speaking, it is necessary to extract knowledge from an expert in a given discipline and implement this knowledge into a system when constructing an expert system. However, it is not easy to extract knowledge in such fields as medical diagnosis or pattern recognition because the inference logic depends on the experience and intuition of the expert. This paper proposes an automatic rule extraction mechanism using statistical analysis. In this system, production rules are expressed in the form of the threshold function. Because the threshold function can describe any kind of inference logic, it is expressed easily as a linear combination of input vectors and weighting coefficients. Thus, the weighting coefficients can be calculated by the same method as a discriminant function. If only one threshold is defined, general Boolean logic can be expressed. Moreover, an ambiguous inference rule can be expressed when the threshold levels are multiply defined and a membership function is defined for each category. Further, the Fuzzy Tree Search algorithm, which combines ambiguous inference and tree search, is proposed at the end of this paper. This algorithm can search for and determine an optimum cluster with little calculation and good performance. In practice, a medical diagnostic system applied to psychiatry, which has the most ambiguous diagnostic logic, has been constructed based on this algorithm, and inference rules have been extracted automatically. In this experiment, Fuzzy Tree Search was as fast as the tree search technique and performed as well as a full-search clustering technique. Copyright © 1988 Wiley Periodicals, Inc., A Wiley Company

    DOI

    Scopus

    3
    Citation
    (Scopus)
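
    The following is a small sketch of a production rule expressed as a threshold function, as described above; the piecewise-linear membership between the lowest and highest thresholds and the example weights are assumptions made for illustration.

        import numpy as np

        def fuzzy_rule(x, weights, thresholds):
            # A production rule expressed as a threshold function: a linear
            # combination of the input vector is compared with several thresholds,
            # and a membership value in [0, 1] is returned (a single threshold
            # reduces this to ordinary Boolean logic).
            score = float(np.dot(weights, x))
            lo, hi = min(thresholds), max(thresholds)
            if score <= lo:
                return 0.0
            if score >= hi:
                return 1.0
            return (score - lo) / (hi - lo)     # assumed linear membership in between

        # Toy usage: a two-feature rule whose weights might be learned as a
        # discriminant function from labelled statistical data.
        w = np.array([0.8, -0.3])
        print(fuzzy_rule(np.array([1.0, 0.5]), w, thresholds=(0.2, 0.9)))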

▼display all

Misc

  • 強化学習を用いた最適指導法獲得について

    久保谷 善記, 福原 吉博, 森島 繁生

      2021 ( 1 ) 343 - 344  2021.03

     View Summary

    In recent years, Internet-based styles of education have become widespread. When such tools are used, learners can proceed at their own pace, but it is difficult to provide instruction optimized for each individual. In this study, by combining Knowledge Tracing, which estimates a learner's knowledge state from mass data, with reinforcement learning, which learns an optimal policy through repeated interaction with the environment, we aimed to acquire instruction strategies optimal for each individual with a small number of interactions.

    CiNii

  • アニメ制作過程における画素対応を用いた作画ミス検出

    沖川 翔太, 山口 周悟, 森島 繁生

    情報処理学会研究報告(Web)   2021 ( 1 ) 139 - 140  2021.03

     View Summary

    In anime production, an enormous number of animation frames must be inspected in order to find and correct mistakes, which is a heavy burden. It is therefore necessary to reduce the number of images that must be inspected. This paper aims to detect coloring mistakes by exploiting the continuity of animation frames. First, semantic pixel-wise correspondences are established between consecutive frames. When one of the frames contains a coloring mistake, there are regions where correspondences cannot be established well; such frames are regarded as anomalous images. Using this method, we succeeded in detecting various kinds of coloring mistakes.

    CiNii J-GLOBAL

  • 弓動作を反映した演奏モーションの自動生成

    平田 明日香, 田中 啓太郎, 島村 僚, 森島 繁生

      2021 ( 1 ) 263 - 264  2021.03

     View Summary

    This paper describes a method for automatically generating performance motion that reflects bowing from the audio signal of a bowed string instrument. To generate natural performance motion for bowed string instruments, it is necessary to reflect the right-hand bowing motion, which is strongly tied to timbre. Recently, performance-motion generation methods based on deep learning have been proposed, but in most cases motion is generated directly from the audio signal, and poses estimated by existing methods are used as ground truth; as a result, the output is unnatural, with the right hand not moving sufficiently in accordance with the audio. In this work, we estimate the string in use and the bow direction changes from acoustic features and generate performance motion from these results in a rule-based manner, producing more natural motion that reflects bowing.

    CiNii

  • 変分自己符号化器を用いた距離学習による楽器音の音高・音色分離表現

    田中啓太郎, 錦見亮, 坂東宜昭, 吉井和佳, 森島繁生

    情報処理学会研究報告(Web)   2021 ( MUS-131 )  2021

    J-GLOBAL

  • 画素対応を用いた自動着色

    沖川翔太, 森島繁生

    情報処理学会研究報告(Web)   2021 ( CG-184 )  2021

    J-GLOBAL

  • Corridor-Walker:視覚障害者が障害物を回避し交差点を認識するためのスマートフォン型屋内歩行支援システム

    栗林雅希, 粥川青汰, 粥川青汰, VONGKULBHISAL Jayakorn, 浅川智恵子, 佐藤大介, 高木啓伸, 森島繁生

    日本ソフトウェア科学会研究会資料シリーズ(Web)   ( 94 )  2021

    J-GLOBAL

  • Verification of Cyclical Annealing for Object-oriented Representation Learning

    小林篤史, 綱島秀樹, 綱島秀樹, 大川武彦, 相澤宏旭, YUE Qiu, 片岡裕雄, 森島繁生

    電子情報通信学会技術研究報告(Web)   121 ( 304(PRMU2021 24-59) )  2021

    J-GLOBAL

  • 個人情報保護のための写真内の指紋情報自動除去

    中村 和也, 夏目 亮太, 土屋 志高, 森島 繁生

      2020 ( 1 ) 137 - 138  2020.02

     View Summary

    In recent years, with advances in technology, digital cameras capable of taking high-resolution photographs have become widespread. As a result, fingerprint information that was difficult to capture with conventional digital cameras can now be extracted from photographs. There is therefore a risk that a person may publish a photo of themselves on the Internet that unintentionally contains fingerprint information, and that the fingerprint may be abused for unauthorized logins and the like. In this study, we propose a method that automatically removes fingerprint information contained in personal photographs. By replacing the fingerprints in a photo with other, specified fingerprints while generating a natural-looking image, fingerprint information is removed without degrading the quality of the photo. We also evaluate, through a user test, the effect that applying the proposed method has on user satisfaction.

    CiNii

  • 深層クラスタリングを用いた任意楽器パートの自動採譜

    田中 啓太郎, 中塚 貴之, 錦見 亮, 吉井 和佳, 森島 繁生

      2020 ( 1 ) 365 - 366  2020.02

     View Summary

    In this study, we propose a method for automatically transcribing each part of a music audio signal performed by an arbitrary combination of instruments. In recent years, automatic transcription of multiple instruments has been proposed as deep learning has greatly improved discrimination performance and representation learning. In many cases, however, supervised data must be prepared for the instruments to be transcribed, and it is not practical to train in advance for every possible instrument and sound source. To transcribe arbitrary music audio signals, we propose an unsupervised learning framework that uses clustering instead of classification by instrument labels. Transcription of each part is realized by multi-task learning of source separation and part transcription through optimization of the whole network.

    CiNii

  • スペクトログラムとピッチグラムの深層クラスタリングに基づく複数楽器パート採譜

    田中啓太郎, 中塚貴之, 錦見亮, 吉井和佳, 森島繁生

    情報処理学会研究報告(Web)   2020 ( MUS-128 )  2020

    J-GLOBAL

  • LineChaser:視覚障碍者が列に並ぶためのスマートフォン型支援システム

    栗林雅希, 粥川青汰, 高木啓伸, 浅川智恵子, 森島繁生

    日本ソフトウェア科学会研究会資料シリーズ(Web)   ( 91 )  2020

    J-GLOBAL

  • 分離型畳み込みカーネルを用いた非均一表面下散乱の効率的な計測と実時間レンダリング法

    谷田川達也, 山口 泰, 森島繁生

    Visual Computing 2019 論文集   P26  2019.06

  • What Do Adversarially Robust Models Look At?

    Takahiro Itazuri, Yoshihiro Fukuhara, Hirokatsu Kataoka, Shigeo Morishima

       2019.05

     View Summary

    In this paper, we address the open question: "What do adversarially robust
    models look at?" Recently, it has been reported in many works that there exists
    the trade-off between standard accuracy and adversarial robustness. According
    to prior works, this trade-off is rooted in the fact that adversarially robust
    and standard accurate models might depend on very different sets of features.
    However, it has not been well studied what kind of difference actually exists.
    In this paper, we analyze this difference through various experiments visually
    and quantitatively. Experimental results show that adversarially robust models
    look at things at a larger scale than standard models and pay less attention to
    fine textures. Furthermore, although it has been claimed that adversarially
    robust features are not compatible with standard accuracy, there is even a
    positive effect by using them as pre-trained models particularly in low
    resolution datasets.

  • 音響情報を用いた一枚画像からの動画生成

    土屋 志高, 板摺 貴大, 夏目 亮太, 加藤 卓哉, 山本 晋太郎, 森島 繁生

      2019 ( 1 ) 169 - 170  2019.02

     View Summary

    Humans can imagine visual information such as video from auditory information such as sound. As research toward realizing such a capability on a computer, there are studies that generate mouth or body motion using features such as facial landmarks and body bones. However, because these methods use features specialized for particular targets, they cannot be applied to every phenomenon in which sound and motion are linked. In this paper, we propose a deep-learning-based method that is generally applicable to the problem of generating a video that matches a single input image and a few seconds of input audio. In experiments, we verified the effectiveness of the proposed method not only for mouth and body motion but also for various videos such as ocean waves and fireworks.

    CiNii

  • 一枚画像と音情報を用いた動画生成

    土屋志高, 板摺貴大, 夏目亮太, 加藤卓哉, 山本晋太郎, 森島繁生

    情報処理学会研究報告(Web)   2019 ( CG-173 )  2019

    J-GLOBAL

  • Linearly Transformed Cosinesを用いた非等方関与媒質のリアルタイムレンダリング

    久家隆宏, 谷田川達也, 森島繁生

    画像電子学会誌(CD-ROM)   48 ( 1 )  2019

    J-GLOBAL

  • VRによる視覚操作を用いた仮想勾配昇降時の知覚調査

    島村僚, 粥川青汰, 中塚貴之, 宮川翔貴, 森島繁生

    画像電子学会誌(CD-ROM)   48 ( 1 )  2019

    J-GLOBAL

  • GPUによる大規模煙シミュレーションとその高速化手法

    石田大地, 安東遼一, 森島繁生

    画像電子学会誌(CD-ROM)   48 ( 1 )  2019

    J-GLOBAL

  • 深層学習を用いた顔画像レンダリング

    山口周悟, 斉藤隼介, 斉藤隼介, 斉藤隼介, 長野光希, LI Hao, LI Hao, LI Hao, 森島繁生

    画像電子学会誌(CD-ROM)   48 ( 1 )  2019

    J-GLOBAL

  • Singing Facial Animation Synthesis Using Singing Voice and Musical Information

    加藤卓哉, 深山覚, 中野倫靖, 後藤真孝, 森島繁生

    画像電子学会誌(CD-ROM)   48 ( 2 )  2019

    J-GLOBAL

  • PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization

    Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, Hao Li

    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019)   2019-October   2304 - 2314  2019

     View Summary

    We introduce Pixel-aligned Implicit Function (PIFu), an implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles, clothing, as well as their variations and deformations can be digitized in a unified way. Compared to existing representations used for 3D deep learning, PIFu produces high-resolution surfaces including largely unseen regions such as the back of a person. In particular, it is memory efficient unlike the voxel representation, can handle arbitrary topology, and the resulting surface is spatially aligned with the input image. Furthermore, while previous techniques are designed to process either a single image or multiple views, PIFu extends naturally to arbitrary number of views. We demonstrate high-resolution and robust reconstructions on real world images from the DeepFashion dataset, which contains a variety of challenging clothing types. Our method achieves state-of-the-art performance on a public benchmark and outperforms the prior work for clothed human digitization from a single image.

    DOI
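
    The sketch below illustrates the core pixel-aligned idea in a few lines of PyTorch, and is not the released PIFu code: an image feature is sampled at the projection of each 3-D query point with bilinear interpolation, concatenated with the point's depth, and passed through an MLP that predicts occupancy. The backbone, feature sizes, and coordinate conventions are simplified assumptions.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class PixelAlignedImplicitFunction(nn.Module):
            # Minimal sketch of a pixel-aligned implicit function: project a 3-D
            # query point into the image, sample a feature there, concatenate the
            # point's depth, and predict inside/outside occupancy with an MLP.
            def __init__(self, feat_dim=64):
                super().__init__()
                self.image_encoder = nn.Sequential(      # stand-in for a real backbone
                    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, feat_dim, 3, padding=1), nn.ReLU())
                self.mlp = nn.Sequential(
                    nn.Linear(feat_dim + 1, 128), nn.ReLU(),
                    nn.Linear(128, 128), nn.ReLU(),
                    nn.Linear(128, 1), nn.Sigmoid())

            def forward(self, image, points_xy, points_z):
                # image: (B,3,H,W); points_xy: (B,N,2) in [-1,1] image coordinates;
                # points_z: (B,N,1) depth along the camera ray.
                feat_map = self.image_encoder(image)                       # (B,C,H,W)
                grid = points_xy.unsqueeze(2)                              # (B,N,1,2)
                feats = F.grid_sample(feat_map, grid, align_corners=True)  # (B,C,N,1)
                feats = feats.squeeze(-1).permute(0, 2, 1)                 # (B,N,C)
                return self.mlp(torch.cat([feats, points_z], dim=-1))      # (B,N,1)

        # Toy query: occupancy of 1000 random points against one 256x256 image.
        net = PixelAlignedImplicitFunction()
        img = torch.rand(1, 3, 256, 256)
        xy = torch.rand(1, 1000, 2) * 2 - 1
        z = torch.rand(1, 1000, 1)
        occ = net(img, xy, z)   # values in (0,1); a surface could then be extracted
                                # with marching cubes at the 0.5 level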

  • 分離型畳み込みカーネルを用いた非均一表面下散乱の効率的推定法

    谷田川達也, 山口泰, 森島繁生

    情報処理学会研究報告(Web)   2019 ( CG-174 ) 1 - 8  2019

    J-GLOBAL

  • SiCloPe: Silhouette-Based Clothed People

    Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, Shigeo Morishima

    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019)   2019-June   4475 - 4485  2019

     View Summary

    We introduce a new silhouette-based representation for modeling clothed human bodies using deep generative models. Our method can reconstruct a complete and textured 3D model of a person wearing clothes from a single input picture. Inspired by the visual hull algorithm, our implicit representation uses 2D silhouettes and 3D joints of a body pose to describe the immense shape complexity and variations of clothed people. Given a segmented 2D silhouette of a person and its inferred 3D joints from the input picture, we first synthesize consistent silhouettes from novel view points around the subject. The synthesized silhouettes which are the most consistent with the input segmentation are fed into a deep visual hull algorithm for robust 3D shape prediction. We then infer the texture of the subject's back view using the frontal image and segmentation mask as input to a conditional generative adversarial network. Our experiments demonstrate that our silhouette-based model is an effective representation and the appearance of the back view can be predicted reliably using an image-to-image translation network. While classic methods based on parametric models often fail for single-view images of subjects with challenging clothing, our approach can still produce successful results, which are comparable to those obtained from multi-view input.

    DOI

  • FSNet: An Identity-Aware Generative Model for Image-Based Face Swapping

    Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

    COMPUTER VISION - ACCV 2018, PT VI   11366   117 - 132  2019

     View Summary

    This paper presents FSNet, a deep generative model for image-based face swapping. Traditionally, face-swapping methods are based on three-dimensional morphable models (3DMMs), and facial textures are replaced between the estimated three-dimensional (3D) geometries in two images of different individuals. However, the estimation of 3D geometries along with different lighting conditions using 3DMMs is still a difficult task. We herein represent the face region with a latent variable that is assigned with the proposed deep neural network (DNN) instead of facial textures. The proposed DNN synthesizes a face-swapped image using the latent variable of the face region and another image of the non-face region. The proposed method is not required to fit to the 3DMM; additionally, it performs face swapping only by feeding two face images to the proposed network. Consequently, our DNN-based face swapping performs better than previous approaches for challenging inputs with different face orientations and lighting conditions. Through several experiments, we demonstrated that the proposed method performs face swapping in a more stable manner than the state-of-the-art method, and that its results are compatible with the method thereof.

    DOI

  • Vision and Language for Automatic Paper Summary Generation

    山本晋太郎, 福原吉博, 鈴木亮太, 片岡裕雄, 森島繁生

    電子情報通信学会技術研究報告   118 ( 362(PRMU2018 75-95)(Web) )  2018

    J-GLOBAL

  • 音響特徴量を考慮したミュージックビデオの色調編集手法

    井上和樹, 中塚貴之, 柿塚亮, 高森啓史, 宮川翔貴, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   80th ( 3 ) 3.271‐3.272  2018

    J-GLOBAL

  • 領域分離型敵対的生成ネットワークによる髪型編集手法

    夏目亮太, 谷田川達也, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   80th ( 2 ) 2.207‐2.208  2018

    J-GLOBAL

  • 複数色の重ね合わせによるユーザーの好みを反映した化粧色推薦

    山岸奏実, 加藤卓哉, 古川翔一, 山本晋太郎, 森島繁生

    情報処理学会全国大会講演論文集   80th ( 4 ) 4.167‐4.168  2018

    J-GLOBAL

  • 楽曲構造を考慮した音楽音響信号からの自動ピアノアレンジ

    高森啓史, 中塚貴之, 深山覚, 後藤真孝, 森島繁生

    情報処理学会研究報告(Web)   2018 ( MUS-120 ) Vol.2018‐MUS‐120,No.11,1‐6 (WEB ONLY)  2018

    J-GLOBAL

  • Quasi-developable garment transfer for animals

    Fumiya Narita, Shunsuke Saito, Tsukasa Fukusato, Shigeo Morishima

    SIGGRAPH Asia 2017 Technical Briefs, SA 2017    2017.11

     View Summary

    In this paper, we present an interactive framework to model garments for animals from a template garment model based on correspondences between the source and the target bodies. We address two critical challenges of garment transfer across significantly different body shapes and postures (e.g., for quadruped and human): (1) ambiguity in the correspondences and (2) distortion due to large variation in scale of each body part. Our efficient cross-parameterization algorithm and intuitive user interface allow us to interactively compute correspondences and transfer the overall shape of garments. We also introduce a novel algorithm for local coordinate optimization that minimizes the distortion of transferred garments, which leads the resulting model to a quasi-developable surface that is hence ready for fabrication. Finally, we demonstrate the robustness and effectiveness of our approach on various garments and body shapes, showing that visually pleasing garment models for animals can be generated and fabricated by our system with minimal effort.

    DOI

  • 密な画素対応による例示ベースのテクスチャ合成

    山口周悟, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2017  2017

    J-GLOBAL

  • 一人称視点映像の高速閲覧に有効なキューの自動生成手法

    粥川青汰, 樋口啓太, 中村優文, 米谷竜, 佐藤洋一, 森島繁生

    日本ソフトウェア科学会研究会資料シリーズ(Web)   ( 81 ) WEB ONLY  2017

    J-GLOBAL

  • DanceDJ:ライブパフォーマンスを実現する実時間ダンス生成システム

    岩本尚也, 岩本尚也, 加藤卓哉, 柿塚亮, SHUM Hubert P.H, 原健太, 森島繁生

    日本ソフトウェア科学会研究会資料シリーズ(Web)   ( 81 ) ROMBUNNO.2‐A07 (WEB ONLY)  2017

    J-GLOBAL

  • コート情報に基づくバレーボール映像の鑑賞支援と戦術解析への応用の検討

    板摺貴大, 福里司, 山口周悟, 森島繁生, 森島繁生

    電子情報通信学会技術研究報告   116 ( 411(PRMU2016 127-151) ) 227‐234  2017

    J-GLOBAL

  • Facial Fattening and Slimming Simulation Based on Skull Structure

    福里司, 藤崎匡裕, 加藤卓哉, 森島繁生

    画像電子学会誌(CD-ROM)   46 ( 1 ) 197‐205  2017

    J-GLOBAL

  • 曲率に依存する反射関数を用いた半透明物体における法線マップ推定手法の提案

    久保尋之, 岡本翠, 向川康博, 森島繁生

    画像ラボ   28 ( 2 ) 14‐21  2017

    J-GLOBAL

  • Arranging Music for Piano using Musical Features of Original Score

    高森啓史, 佐藤晴紀, 中塚貴之, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2017 ( MUS-114 ) Vol.2017‐MUS‐114,No.16,1‐6 (WEB ONLY)  2017

    J-GLOBAL

  • 表情変化を考慮した経年変化顔動画合成

    山本晋太郎, サフキン パーベル, 加藤卓哉, 山口周悟, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2017 ( CG-166 ) Vol.2017‐CG‐166,No.3,1‐6 (WEB ONLY)  2017

    J-GLOBAL

  • キャラクタの局所的な身体構造を考慮したリアルタイム二次動作自動生成

    金田綾乃, 福里司, 福原吉博, 中塚貴之, 森島繁生

    情報処理学会研究報告(Web)   2017 ( CG-166 ) Vol.2017‐CG‐166,No.2,1‐5 (WEB ONLY)  2017

    J-GLOBAL

  • 笑顔動画データベースを用いた顔動画の経年変化

    山本晋太郎, SAVKIN Pavel, 佐藤優伍, 加藤卓哉, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   79th ( 4 ) 4.103‐4.104  2017

    J-GLOBAL

  • 物体の形状による堆積への影響を考慮した埃の高速描画手法の提案

    佐藤樹, 小澤禎裕, 持田恵佑, 谷田川達也, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   79th ( 4 ) 4.145‐4.146  2017

    J-GLOBAL

  • キャラクタの局所的な身体構造を考慮した二次動作自動生成

    金田綾乃, 福里司, 福原吉博, 中塚貴之, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   79th ( 4 ) 4.89‐4.90  2017

    J-GLOBAL

  • 原曲の楽譜情報に基づいたピアノアレンジ譜面の生成

    高森啓史, 佐藤晴紀, 中塚貴之, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   79th ( 2 ) 2.91‐2.92  2017

    J-GLOBAL

  • コート情報に基づくバレーボール映像の鑑賞支援とラリー解析

    板摺貴大, 福里司, 山口周悟, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   79th ( 2 ) 2.317‐2.318  2017

    J-GLOBAL

  • ラリーシーンに着目したラケットスポーツ映像鑑賞システム

    板摺貴大, 福里司, 河村俊哉, 森島繁生

    画像ラボ   28 ( 6 ) 12‐19  2017

    J-GLOBAL

  • 可展面制約を考慮したテンプレートベース衣服モデリング

    成田史弥, 齋藤隼介, 福里司, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2017   ROMBUNNO.22  2017

    J-GLOBAL

  • 表情変化データベースを用いた経年変化顔動画合成

    山本晋太郎, SAVKIN Pavel, 加藤卓哉, 佐藤優伍, 古川翔一, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2017   ROMBUNNO.18  2017

    J-GLOBAL

  • 自己遮蔽下におけるリテクスチャリングのための階層型マーカの提案

    宮川翔貴, 福原吉博, 成田史弥, 小形憲弘, 森島繁生, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2017   ROMBUNNO.37  2017

    J-GLOBAL

  • 局所的な異方性と硬さを考慮した高速なキャラクタの二次動作生成の提案

    金田綾乃, 福里司, 福原吉博, 中塚貴之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2017   ROMBUNNO.29  2017

    J-GLOBAL

  • コート情報に基づくバレーボール映像の鑑賞支援

    板摺貴大, 福里司, 山口周悟, 森島繁生, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2017   ROMBUNNO.09  2017

    J-GLOBAL

  • Automatic Piano-Arrangement Based on Music Elements Obtained from Music Audio Signals

    高森啓史, 深山覚, 後藤真孝, 森島繁生

    情報処理学会研究報告(Web)   2017 ( MUS-116 ) Vol.2017‐MUS‐116,No.13,1‐5 (WEB ONLY)  2017

    J-GLOBAL

  • Fast Subsurface Light Transport Evaluation for Real-Time Rendering Using Single Shortest Optical Paths

    小澤禎裕, 谷田川達也, 久保尋之, 森島繁生

    画像電子学会誌(CD-ROM)   46 ( 4 ) 533‐546  2017

    J-GLOBAL

  • 物体検出とユーザ入力に基づく一人称視点映像の高速閲覧手法

    粥川青汰, 樋口啓太, 米谷竜, 中村優文, 佐藤洋一, 森島繁生

    情報処理学会研究報告(Web)   2017 ( DCC-17 ) Vol.2017‐DCC‐17,No.4,1‐8 (WEB ONLY)  2017

    J-GLOBAL

  • Simulating the friction sounds using a friction-based adhesion theory model

    Takayuki Nakatsuka, Shigeo Morishima

    DAFx 2017 - Proceedings of the 20th International Conference on Digital Audio Effects     32 - 39  2017

     View Summary

    Synthesizing the friction sound of deformable objects by computer is challenging. We propose a novel physics-based approach to synthesize friction sounds based on dynamics simulation. In this work, we calculate the elastic deformation of an object surface when the object comes in contact with other objects. The principle of our method is to divide an object surface into microrectangles. The deformation of each microrectangle is set using two assumptions: the size of a microrectangle (1) changes on contact with another object and (2) obeys a normal distribution. We consider the sound pressure distribution and its spatial spread, consisting of the vibrations of all microrectangles, to synthesize a friction sound at an observation point. We express the global motions of an object by position-based dynamics, to which we add an adhesion constraint. Our proposed method enables the generation of friction sounds for objects of different materials by adjusting the initial values of the microrectangle parameters.
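
    To make the summation idea concrete, the following is a deliberately simplified sketch (not the paper's simulation): micro-patch sizes are drawn from a normal distribution and each patch contributes a small damped oscillation to the output waveform; the frequencies, damping constant, and slip-event model are invented for the example.

        import numpy as np

        def friction_sound(duration=1.0, fs=44100, n_patches=500,
                           mean_size=1.0, size_std=0.2, base_freq=900.0, seed=0):
            # Sum the contributions of micro-patches whose sizes follow a normal
            # distribution; each patch emits one damped oscillation when it slips.
            # All constants here are assumptions for illustration only.
            rng = np.random.default_rng(seed)
            t = np.arange(int(duration * fs)) / fs
            sound = np.zeros_like(t)
            sizes = rng.normal(mean_size, size_std, n_patches).clip(min=0.1)
            for size in sizes:
                onset = rng.uniform(0.0, duration)            # random slip event
                freq = base_freq / size                       # larger patch -> lower pitch
                active = t >= onset
                tau = t[active] - onset
                sound[active] += size * np.exp(-80.0 * tau) * np.sin(2 * np.pi * freq * tau)
            return sound / np.max(np.abs(sound))

        wave = friction_sound()      # could be written out, e.g., as a WAV file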

  • Dynamic subtitle placement considering the region of interest and speaker location

    VISIGRAPP 2017 - Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications   6   102 - 109  2017.01

     View Summary

    © 2017 by SCITEPRESS - Science and Technology Publications, Lda. This paper presents a subtitle placement method that reduces unnecessary eye movements. Although methods that vary the position of subtitles have been discussed in a previous study, subtitles may overlap the region of interest (ROI). Therefore, we propose a dynamic subtitling method that utilizes eye-tracking data to prevent subtitles from overlapping with important regions. The proposed method calculates the ROI based on the eye-tracking data of multiple viewers. By positioning subtitles immediately under the ROI, the subtitles do not overlap the ROI. Furthermore, we detect speakers in a scene based on audio and visual information to help viewers recognize the speaker by positioning subtitles near the speaker. Experimental results show that the proposed method enables viewers to watch the ROI and the subtitles for a longer duration than with traditional subtitles, and that it is effective in enhancing the comfort and utility of the viewing experience.
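
    A minimal sketch of the placement rule described above, assuming a percentile-based ROI estimate and pixel units; the actual ROI computation and speaker handling in the paper are not reproduced here.

        import numpy as np

        def place_subtitle(gaze_points, frame_size, subtitle_size, margin=10):
            # Estimate the region of interest (ROI) from several viewers' gaze
            # points as a robust bounding box (10th-90th percentiles, an assumed
            # choice) and place the subtitle immediately below it, clamped to
            # the frame so it never overlaps the ROI.
            gaze = np.asarray(gaze_points, dtype=float)        # (N, 2) pixel coords
            x_lo, x_hi = np.percentile(gaze[:, 0], [10, 90])
            y_lo, y_hi = np.percentile(gaze[:, 1], [10, 90])
            frame_w, frame_h = frame_size
            sub_w, sub_h = subtitle_size
            sub_x = np.clip((x_lo + x_hi) / 2 - sub_w / 2, 0, frame_w - sub_w)
            sub_y = np.clip(y_hi + margin, 0, frame_h - sub_h) # just under the ROI
            return int(sub_x), int(sub_y)

        # Toy usage: gaze samples from multiple viewers on a 1280x720 frame.
        gaze = np.random.normal(loc=(640, 300), scale=(120, 60), size=(200, 2))
        print(place_subtitle(gaze, (1280, 720), (400, 60)))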

  • motebi~文字を手書きで美しく書くための支援ツール~

    中村優文, 山口周悟, 森島繁生

    日本ソフトウェア科学会研究会資料シリーズ(Web)   ( 78 ) ROMBUNNO.2‐A06 (WEB ONLY)  2016

    J-GLOBAL

  • 視線情報と話者情報とを組み合わせた動画への動的字幕配置手法

    赤堀渉, 平井辰典, 森島繁生

    日本ソフトウェア科学会研究会資料シリーズ(Web)   ( 78 ) ROMBUNNO.1‐A10 (WEB ONLY)  2016

    J-GLOBAL

  • トレーシングとデータベースを併用する2Dアニメーション作成支援システム

    福里司, 森島繁生

    日本ソフトウェア科学会研究会資料シリーズ(Web)   ( 78 ) WEB ONLY  2016

    J-GLOBAL

  • Dance DJ:ライブパフォーマンスのためのダンス動作ミックスシステム

    岩本尚也, 加藤卓哉, 原健太, 柿塚亮, 森島繁生

    日本ソフトウェア科学会研究会資料シリーズ(Web)   ( 78 ) ROMBUNNO.1‐A02 (WEB ONLY)  2016

    J-GLOBAL

  • 曲率に依存した反射関数を用いた半透明物体の照度差ステレオ法

    岡本翠, 久保尋之, 向川康博, 森島繁生

    画像電子学会誌(CD-ROM)   45 ( 1 ) 119‐120  2016

    J-GLOBAL

  • ラリーシーンに着目したラケットスポーツ動画鑑賞システム

    河村俊哉, 福里司, 平井辰典, 森島繁生

    画像電子学会誌(CD-ROM)   45 ( 1 ) 121‐122  2016

    J-GLOBAL

  • Region-based Painting Style Transfer

    山口周悟, 加藤卓哉, 福里司, 古澤知英, 森島繁生

    画像電子学会誌(CD-ROM)   45 ( 1 ) 125‐126  2016

    J-GLOBAL

  • フレームリシャッフリングに基づく事前知識を用いない吹替映像の生成

    古川翔一, 加藤卓哉, 野澤直樹, PAVEL Savkin, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 4 ) 4.107-4.108  2016

    J-GLOBAL

  • 顔画像特徴と眠気の相関に基づくドライバーの眠気検出

    佐藤優伍, 野澤直樹, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 3 ) 3.331-3.332  2016

    J-GLOBAL

  • 要素間補間による共回転系弾性体シミュレーションの高速化

    福原吉博, 齋藤隼介, 成田史弥, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 4 ) 4.181-4.182  2016

    J-GLOBAL

  • 似顔絵の個性を考慮した実写化手法の提案

    中村優文, 山口周悟, 福里司, 古澤知英, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 4 ) 4.93-4.94  2016

    J-GLOBAL

  • 凝着説に基づく物体表面の弾性変形を考慮した摩擦音の生成手法の提案

    中塚貴之, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 4 ) 4.187-4.188  2016

    J-GLOBAL

  • 好みを反映したダンス生成のための振付編集手法

    柿塚亮, 柿塚亮, 岩本尚也, 岩本尚也, 朝比奈わかな, 朝比奈わかな, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 4 ) 4.103-4.104  2016

    J-GLOBAL

  • パッチ単位の法線推定による三次元顔形状復元

    野沢綸佐, 加藤卓哉, 野澤直樹, PARVEL Savkin, 森島繁生, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 2 ) 2.113-2.114  2016

    J-GLOBAL

  • 不均一な半透明物体の描画のためのTranslucent Shadow Mapsの拡張

    持田恵佑, 岡本翠, 久保尋之, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 4 ) 4.57-4.58  2016

    J-GLOBAL

  • 輝度の最大寄与値を用いた半透明物体のリアルタイムレンダリング

    小澤禎裕, 岡本翠, 森島繁生

    情報処理学会全国大会講演論文集   78th ( 4 ) 4.55-4.56  2016

    J-GLOBAL

  • Garment Transfer between Biped and Quadruped Characters

    成田史弥, 齋藤隼介, 福里司, 森島繁生

    情報処理学会論文誌ジャーナル(Web)   57 ( 3 ) 863‐872 (WEB ONLY)  2016

    J-GLOBAL

  • LyricsRadar: A Lyrics Retrieval Interface Based on Latent Topics of Lyrics

    佐々木将人, 吉井和佳, 中野倫靖, 後藤真孝, 森島繁生

    情報処理学会論文誌ジャーナル(Web)   57 ( 5 ) 1365‐1374 (WEB ONLY)  2016

    J-GLOBAL

  • 寄与の大きな表面下散乱光の高速取得による半透明物体のリアルタイムレンダリング

    小澤禎裕, 久保尋之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.28  2016

    J-GLOBAL

  • 要素間補間による共回転系弾性体シミュレーションの高速化

    福原吉博, 斎藤隼介, 成田史弥, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.13  2016

    J-GLOBAL

  • Voxel Number Mapを用いた不均一半透明物体のリアルタイムレンダリング

    持田恵佑, 久保尋之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.29  2016

    J-GLOBAL

  • フレームリシャッフリングに基づく音素情報を用いない吹替え映像の生成

    古川翔一, 加藤卓哉, SAVKIN Pavel, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.16  2016

    J-GLOBAL

  • 好みを反映した3Dダンス制作のための振付編集手法

    柿塚亮, 岩本尚也, 朝比奈わかな, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.26  2016

    J-GLOBAL

  • 複数人の視線追跡データから推定される関心領域に基づく動画への動的字幕配置手法

    赤堀渉, 平井辰典, 河村俊哉, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.39  2016

    J-GLOBAL

  • CGアニメーションのための物体表面の凝着を考慮した摩擦音の生成

    中塚貴之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.20  2016

    J-GLOBAL

  • 頭蓋骨形状に基づいた顔の三次元肥痩シミュレーション

    藤崎匡裕, 鍵山裕貴, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.25  2016

    J-GLOBAL

  • 法線情報を含むパッチデータベースを用いた三次元顔形状復元

    野沢綸佐, 加藤卓哉, SAVKIN Pavel, 山口周悟, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.07  2016

    J-GLOBAL

  • 肖像画実写化手法の提案

    中村優文, 山口周悟, 福里司, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2016   ROMBUNNO.09  2016

    J-GLOBAL

  • Aged Wrinkles Individuality Considering Aging Simulation

    SAVKIN Pavel A., 加藤卓哉, 福里司, 森島繁生

    情報処理学会論文誌ジャーナル(Web)   57 ( 7 ) 1627‐1637 (WEB ONLY)  2016

    J-GLOBAL

  • A choreographic authoring system reflecting a user’s preference

    柿塚亮, 佃洸摂, 深山覚, 岩本尚也, 後藤真孝, 森島繁生

    情報処理学会研究報告(Web)   2016 ( MUS-112 ) Vol.2016‐MUS‐112,No.16,1‐6 (WEB ONLY)  2016

    J-GLOBAL

  • 肖像画からの写実的な顔画像生成手法

    中村優文, 山口周悟, 福里司, 森島繁生

    情報処理学会研究報告(Web)   2016 ( CG-163 ) Vol.2016‐CG‐163,No.10,1‐6 (WEB ONLY)  2016

    J-GLOBAL

  • 似顔絵から顔を知る~似顔絵実写化の可能性~

    中村優文, 森島繁生

    日本顔学会誌   16 ( 1 ) 36  2016

    J-GLOBAL

  • 顔の変化による眠そうな顔の認知

    佐藤優伍, 加藤卓哉, 野澤直樹, 森島繁生

    日本顔学会誌   16 ( 1 ) 71  2016

    J-GLOBAL

  • 顔の発話動作と音声とを同期させた映像を生成する手法の提案

    古川翔一, 加藤卓哉, サフキン パーベル, 森島繁生

    日本顔学会誌   16 ( 1 ) 38  2016

    J-GLOBAL

  • ラリーシーンの自動抽出と解析に基づくバレーボール映像の要約手法の提案

    板摺貴大, 福里司, 山口周悟, 森島繁生

    映像情報メディア学会冬季大会講演予稿集(CD-ROM)   2016   ROMBUNNO.25C‐1  2016

    J-GLOBAL

  • Automatic Generation of Photorealistic 3D Inner Mouth Animation only from Frontal Images

    Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

      56 ( 9 )  2015.09

    CiNii

  • 単一楽曲の切り貼りによる動画の盛り上がりに同期したBGM自動付加手法

    佐藤 晴紀, 平井 辰典, 中野 倫靖, 後藤 真孝, 森島 繁生

    第77回全国大会講演論文集   2015 ( 1 ) 375 - 376  2015.03

     View Summary

    This paper proposes a method that, given a video with its climax point and a song with the passage to be used there, automatically attaches the song as BGM so that the specified passage of the song coincides with the climax of the video. Previous work adds BGM to a new video by learning the relationship between BGM and video in advance, but it does not reflect user intent such as "I want the chorus of the song to fall on the decisive scene of the video." In this work, the user specifies when the chorus section should be played, and the song is spliced together from fragments so that the chorus arrives at the specified time while the beginning and end of the song stay aligned with those of the video. Concretely, cutting and pasting the song bar by bar with dynamic programming realizes automatic BGM attachment that reflects the user's intent. (A small illustrative sketch follows this entry.)

    CiNii
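
    A rough dynamic-programming sketch of the bar-level cut-and-paste idea in the entry above. The cost terms, feature representation, and function names are illustrative assumptions, not the paper's formulation: continuing with consecutive source bars is free, a jump pays a feature-distance plus a fixed penalty, and the chorus bar is pinned to the user-specified output slot.

    ```python
    import numpy as np

    def splice_bars_to_chorus(bar_feats, chorus_idx, chorus_slot, w_jump=1.0):
        """Toy DP: choose one source bar per output slot 0..chorus_slot so that the
        chorus bar (chorus_idx) lands exactly on chorus_slot.  Consecutive source
        bars cost nothing; a jump costs the feature distance plus a fixed penalty."""
        feats = np.asarray(bar_feats, dtype=float)          # (n_bars, n_features)
        n_bars = len(feats)
        cost = np.zeros(n_bars)                             # best cost ending at each bar
        back = np.zeros((chorus_slot + 1, n_bars), dtype=int)
        for t in range(1, chorus_slot + 1):
            new_cost = np.empty(n_bars)
            for j in range(n_bars):
                trans = np.linalg.norm(feats - feats[j], axis=1) + w_jump
                if j > 0:
                    trans[j - 1] = 0.0                      # continuing the song is free
                k = int(np.argmin(cost + trans))
                new_cost[j] = cost[k] + trans[k]
                back[t, j] = k
            cost = new_cost
        path = [chorus_idx]                                 # pin the chorus at chorus_slot
        for t in range(chorus_slot, 0, -1):
            path.append(int(back[t, path[-1]]))
        return path[::-1]                                   # source bar index per output slot
    ```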

  • ダンスモーションにシンクロした音楽印象推定手法の提案とダンサーの表情自動合成への応用

    朝比奈わかな, 岡田成美, 岩本尚也, 増田太郎, 福里司, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( 23 ) 1 - 6  2015.02

    CiNii J-GLOBAL

  • 映像の盛り上がり箇所に音楽のサビを同期させるBGM付加支援手法

    佐藤晴紀, 佐藤晴紀, 平井辰典, 中野倫靖, 後藤真孝, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( 10 ) 1 - 6  2015.02

    CiNii J-GLOBAL

  • 実写画像に基づく特定画風を反映したアニメ背景画像への自動変換

    山口周悟, 古澤知英, 福里司, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( 14 ) 1 - 6  2015.02

    CiNii J-GLOBAL

  • 人物の皺の発生位置と形状を反映した経年変化顔画像合成

    サフキン パーベル, 桑原大樹, 川井正英, 加藤卓哉, 森島繁生

    情報処理学会研究報告(Web)   2015 ( 12 ) 1 - 8  2015.02

    CiNii J-GLOBAL

  • 注視点の変化に追随するゲームキャラクタの頭部および眼球運動の自動合成

    鍵山裕貴, 川井正英, 桑原大樹, 森島繁生

    情報処理学会研究報告(Web)   2015 ( 13 ) 1 - 7  2015.02

    CiNii J-GLOBAL

  • 半透明物体における曲率と透過度合の相関分析

    岡本 翠, 安達 翔平, 久保 尋之, 向川 康博, 森島 繁生

    研究報告コンピュータビジョンとイメージメディア(CVIM)   2015 ( 26 ) 1 - 6  2015.01

     View Summary

    Aiming at an analysis of the subsurface scattering of light inside translucent objects, this work examines the correlation between the curvature of an object's surface and subsurface scattering. In computer graphics, subsurface scattering has been actively modeled for image synthesis, but physically based optical simulation is computationally expensive, which makes it very difficult to analyze images with such models. We therefore focus on the curvature-dependent reflectance function (CDRF), an approximate model of subsurface scattering with low computational cost. Based on measurements of subsurface scattering in translucent objects of various curvatures, we analyze the correlation between curvature and the degree of light transmission, and verify the effectiveness of image analysis of translucent objects using the CDRF. (A small illustrative sketch follows this entry.)

    CiNii
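
    A minimal sketch of the analysis described above, under the simplifying assumption that each measurement comes from a sphere of known radius (so curvature is 1/r) and that a single Pearson coefficient summarizes the curvature-transmittance relation; the variable names are hypothetical.

    ```python
    import numpy as np

    def curvature_radiance_correlation(radii, transmitted_radiance):
        """Toy analysis: spheres of radius r have curvature 1/r; report the Pearson
        correlation between curvature and the measured transmitted radiance."""
        curvature = 1.0 / np.asarray(radii, dtype=float)
        radiance = np.asarray(transmitted_radiance, dtype=float)
        r = float(np.corrcoef(curvature, radiance)[0, 1])
        return curvature, r
    ```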

  • 似顔絵からのフォトリアルな顔画像生成

    溝川あい, 森島繁生

    画像センシングシンポジウム講演論文集(CD-ROM)   21st   ROMBUNNO.IS3-32  2015

    J-GLOBAL

  • 半透明物体における曲率と透過度合の相関分析

    岡本翠, 安達翔平, 久保尋之, 向川康博, 森島繁生

    電子情報通信学会技術研究報告   114 ( 410(MVE2014 49-73) ) 147 - 152  2015

    J-GLOBAL

  • A System for Viewing Racquet Sports Video with Automatic Summarization Focusing on Rally Scene

    河村俊哉, 福里司, 福里司, 平井辰典, 森島繁生, 森島繁生

    情報処理学会論文誌ジャーナル(Web)   56 ( 3 ) 1028-1038 (WEB ONLY)  2015

    J-GLOBAL

  • ダンスモーションに同期した表情自動合成のための楽曲印象解析手法の提案

    朝比奈わかな, 岡田成美, 岩本尚也, 増田太郎, 福里司, 森島繁生

    情報処理学会全国大会講演論文集   77th ( 2 ) 2.383-2.384  2015

    J-GLOBAL

  • 単一楽曲の切り貼りによる動画の盛り上がりに同期したBGM自動付加手法

    佐藤晴紀, 平井辰典, 中野倫靖, 後藤真孝, 森島繁生

    情報処理学会全国大会講演論文集   77th ( 2 ) 2.375-2.376  2015

    J-GLOBAL

  • 皺の個人性を考慮した経年変化顔画像合成

    SAVKIN Pavel, 桑原大樹, 川井正英, 加藤卓哉, 森島繁生

    情報処理学会全国大会講演論文集   77th ( 4 ) 4.111-4.112  2015

    J-GLOBAL

  • 実測に基づくゲームキャラクタの頭部および眼球運動の自動合成

    鍵山裕貴, 川井正英, 桑原大樹, 森島繁生

    情報処理学会全国大会講演論文集   77th ( 4 ) 4.109-4.110  2015

    J-GLOBAL

  • 顔画像のシルエット情報に基づく3次元顔形状復元

    野澤直樹, 桑原大樹, 森島繁生

    情報処理学会全国大会講演論文集   77th ( 2 ) 2.525-2.526  2015

    J-GLOBAL

  • フィッティングを保持した体型の変化に頑健な衣装転写システムの提案

    成田史弥, 齋藤隼介, 加藤卓哉, 福里司, 森島繁生

    情報処理学会全国大会講演論文集   77th ( 4 ) 4.105-4.106  2015

    J-GLOBAL

  • 雑音下での音源定位・音源分離に与える伝達関数測定法の影響の評価

    赤堀渉, 増田太郎, 奥乃博, 森島繁生

    情報処理学会全国大会講演論文集   77th ( 2 ) 2.119-2.120  2015

    J-GLOBAL

  • 実写画像に基づく画風を考慮したアニメ背景画像生成システム

    山口周悟, 古澤知英, 福里司, 森島繁生

    情報処理学会全国大会講演論文集   77th ( 4 ) 4.71-4.72  2015

    J-GLOBAL

  • 外科的矯正治療後のスマイルについて

    寺田員人, 佐野奈都貴, 寺嶋縁里, 亀田剛, 小原彰浩, 齋藤功, 森島繁生

    顎顔面バイオメカニクス学会誌   19/20 ( 1 ) 64‐70  2015

    J-GLOBAL

  • VoiceDub: Post-scoring System with Multiple Timing Information for Visual Entertainment

    川本真一, 森島繁生, 中村哲

    情報処理学会論文誌ジャーナル(Web)   56 ( 4 ) 1142-1151 (WEB ONLY)  2015

    J-GLOBAL

  • 音声と映像の変化に注目したフレーム間引きによる動画要約手法

    平井辰典, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( MUS-107 ) VOL.2015-MUS-107,NO.18 (WEB ONLY)  2015

    J-GLOBAL

  • 頬のシルエット情報を活用した単一斜め向き顔画像に対する顔形状3次元復元手法

    野澤直樹, 森島繁生

    情報処理学会研究報告(Web)   2015 ( CG-159 ) VOL.2015-CG-159,NO.7 (WEB ONLY)  2015

    J-GLOBAL

  • 手描き画像の特徴を保存した実写画像への画風転写

    山口周悟, 加藤卓哉, 福里司, 古澤知英, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.32  2015

    J-GLOBAL

  • 注視点の変化に追随するゲームキャラクタの頭部および眼球運動の自動合成

    鍵山裕貴, 川井正英, 桑原大樹, 加藤卓哉, 藤崎匡裕, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.23  2015

    J-GLOBAL

  • 中割り自動生成による手描きストロークベースのキーフレームアニメーション作成支援ツール

    福里司, 古澤知英, 森島繁生, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.22  2015

    J-GLOBAL

  • ポーズに依存しない4足キャラクタ間の衣装転写システムの提案

    成田史弥, 齋藤隼介, 加藤卓哉, 福里司, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.07  2015

    J-GLOBAL

  • 単一楽曲の切り貼りにより映像と楽曲の指定箇所を同期させるBGM付加支援インタフェース

    佐藤晴紀, 佐藤晴紀, 平井辰典, 中野倫靖, 後藤真孝, 森島繁生, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.26  2015

    J-GLOBAL

  • ダンスにシンクロした楽曲印象推定によるダンスキャラクタの表情アニメーション生成手法の提案

    朝比奈わかな, 朝比奈わかな, 岡田成美, 岡田成美, 岩本尚也, 岩本尚也, 増田太郎, 増田太郎, 福里司, 森島繁生, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.09  2015

    J-GLOBAL

  • 皺の発生過程を考慮した経年変化顔画像合成

    SAVKIN Pavel, 桑原大樹, 川井正英, 加藤卓哉, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.03  2015

    J-GLOBAL

  • 似顔絵実写化手法の提案

    溝川あい, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.01  2015

    J-GLOBAL

  • 多重レイヤーボリューム構造を考慮したキャラクターのリアルタイム肉揺れアニメーション生成手法

    岩本尚也, 岩本尚也, 森島繁生, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2015   ROMBUNNO.08  2015

    J-GLOBAL

  • ラリーシーンに着目したラケットスポーツ動画の効率的鑑賞システム

    河村俊哉, 福里司, 平井辰典, 森島繁生

    画像ラボ   26 ( 7 ) 1 - 7  2015

    J-GLOBAL

  • Flesh Jigging Method Considered in Character Body Structure in Real-Time

    岩本尚也, 岩本尚也, 森島繁生, 森島繁生

    画像電子学会誌(CD-ROM)   44 ( 3 ) 502 - 511  2015

    DOI J-GLOBAL

  • 要素間補間による共回転系弾性体シミュレーションの高速化

    福原吉博, 斎藤隼介, 成田史弥, 森島繁生

    情報処理学会研究報告(Web)   2015 ( CG-160 ) VOL.2015-CG-160,NO.8 (WEB ONLY)  2015

    J-GLOBAL

  • MusicMixer:ビート及び潜在トピックの類似度を用いたDJシステム

    平井辰典, 土井啓成, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( MUS-108 ) VOL.2015-MUS-108,NO.3,HIRAI (WEB ONLY)  2015

    J-GLOBAL

  • 楽曲のビート類似度及び潜在トピックの類似度に基づくDJプレイの自動化

    平井辰典, 土井啓成, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( MUS-108 ) VOL.2015-MUS-108,NO.14 (WEB ONLY)  2015

    J-GLOBAL

  • キャラクタの個性的な表情特徴を反映した表情モデリング法の提案

    加藤卓哉, 森島繁生

    日本顔学会誌   15 ( 1 ) 102  2015

    J-GLOBAL

  • 楽曲印象に基づくダンスモーションに同期したダンスキャラクタの表情自動合成

    朝比奈わかな, 朝比奈わかな, 岡田成美, 岡田成美, 岩本尚也, 岩本尚也, 増田太郎, 増田太郎, 福里司, 森島繁生, 森島繁生

    日本顔学会誌   15 ( 1 ) 124  2015

    J-GLOBAL

  • 骨格を基にした顔の肥痩シミュレーション

    藤崎匡裕, 森島繁生

    日本顔学会誌   15 ( 1 ) 140  2015

    J-GLOBAL

  • 似顔絵実写化手法の提案

    中村優文, 森島繁生

    日本顔学会誌   15 ( 1 ) 125  2015

    J-GLOBAL

  • 曲率依存反射関数を用いた半透明物体における照度差ステレオ法の改善

    岡本翠, 久保尋之, 向川康博, 森島繁生

    電子情報通信学会技術研究報告   115 ( 224(PRMU2015 67-91) ) 129 - 133  2015

    J-GLOBAL

  • パッチタイリングを用いた法線推定による3次元顔形状復元

    野沢綸佐, 加藤卓哉, 藤崎匡裕, サフキン パーベル, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( CVIM-199 ) VOL.2015-CVIM-199,NO.26 (WEB ONLY)  2015

    J-GLOBAL

  • 動的計画法を用いた半透明物体のリアルタイムレンダリング

    小澤禎裕, 岡本翠, 久保尋之, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( CVIM-199 ) VOL.2015-CVIM-199,NO.1 (WEB ONLY)  2015

    J-GLOBAL

  • 三次元形状を考慮した半透明物体のリアルタイムレンダリング

    持田恵佑, 岡本翠, 小澤禎裕, 久保尋之, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( CVIM-199 ) VOL.2015-CVIM-199,NO.2 (WEB ONLY)  2015

    J-GLOBAL

  • 主成分分析に基づく類似口形状検出によるビデオ翻訳動画の生成

    古川翔一, 加藤卓哉, 野澤直樹, サフキン パーベル, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( CVIM-199 ) VOL.2015-CVIM-199,NO.14 (WEB ONLY)  2015

    J-GLOBAL

  • 視線追跡データから算出された注目領域に基づく視線移動の少ない字幕配置法の提案と評価

    赤堀渉, 平井辰典, 森島繁生, 森島繁生

    情報処理学会研究報告(Web)   2015 ( EC-38 ) VOL.2015‐EC‐38,NO.8 (WEB ONLY)  2015

    J-GLOBAL

  • Musicmean: Fusion-based music generation

    Tatsunori Hirai, Shoto Sasaki, Shigeo Morishima

    Proceedings of the 12th International Conference in Sound and Music Computing, SMC 2015     323 - 327  2015

     View Summary

    © 2015 Tatsunori Hirai et al. In this paper, we propose MusicMean, a system that fuses existing songs to create an "in-between song" such as an "average song," by calculating the average acoustic pitch of musical notes and the occurrence frequency of drum elements from multiple MIDI songs. We generate an in-between song for generative music by defining rules based on simple music theory. The system realizes interactive generation of in-between songs, which represents a new form of interaction between humans and digital content. Using MusicMean, users can create personalized songs by fusing their favorite songs.
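
    A toy sketch of the "average song" idea in the entry above, assuming two note lists that are already aligned one-to-one; the paper's music-theory rules and drum handling are not reproduced.

    ```python
    def average_song(notes_a, notes_b):
        """Toy 'in-between song': notes are (onset_beat, midi_pitch) pairs; pair the
        two songs' notes by order, average the pitches, and keep song A's onsets."""
        return [(beat_a, round((pitch_a + pitch_b) / 2))
                for (beat_a, pitch_a), (_, pitch_b) in zip(notes_a, notes_b)]
    ```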

  • Flesh jigging method considered in character body structure in real-time

    Naoya Iwamoto, Shigeo Morishima

    Journal of the Institute of Image Electronics Engineers of Japan   44 ( 3 ) 502 - 511  2015

     View Summary

    This paper presents a method for synthesizing soft-body character animation at high speed and in real time. In particular, it considers not only the primary deformation but also the secondary, physics-based deformation of the fat structure under the skin driven by skeletal motion. High-fidelity soft-body animation usually relies on costly FEM-based methods, and a previous method achieved fast and robust soft-body animation with a simplified elastic simulation; however, because it applied the skinning result directly to the fat and skin, the resulting vibration was very small and looked unnatural. In this paper, we therefore separate skinning from simulation by modeling the structure under the skin approximately and automatically. The volume and elasticity of each layer can be controlled freely, so through simple parameter adjustment a user can define location-dependent soft-body material and generate and edit the desired jiggling motion. An evaluation on several character models shows that the method is effective for producing character animation with soft bodies.
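
    A minimal sketch of one common way to obtain secondary "flesh" motion of the kind described above, assuming a simple damped spring per fat-layer vertex chasing the skinned surface position; the stiffness and damping values are illustrative, and the paper's layered volume structure is not modeled.

    ```python
    import numpy as np

    def jiggle_step(x, v, skinned_target, dt, stiffness=120.0, damping=8.0):
        """Toy secondary motion: the fat-layer position x is a damped spring chasing
        the skinned surface position, so fast skeletal motion leaves a small,
        decaying overshoot (the 'jiggle')."""
        a = stiffness * (np.asarray(skinned_target) - x) - damping * v
        v = v + a * dt
        x = x + v * dt
        return x, v
    ```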

  • Automatic singing voice to music video generation via mashup of singing video clips

    Tatsunori Hirai, Yukara Ikemiya, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

    Proceedings of the 12th International Conference in Sound and Music Computing, SMC 2015     153 - 159  2015

     View Summary

    © 2015 Tatsunori Hirai et al. This paper presents a system that takes the audio signal of any song sung by a singer as input and automatically generates a music video clip in which the singer appears to be actually singing the song. Although music video clips have gained popularity on video streaming services, not all existing songs have corresponding video clips. Given a song sung by a singer, our system generates a singing video clip by reusing existing singing video clips featuring that singer. More specifically, the system retrieves short fragments of singing video clips whose singing voices are similar to the voice in the target song, and then concatenates these fragments using dynamic programming (DP). To achieve this, we propose a method to extract singing scenes from music video clips by combining vocal activity detection (VAD) with mouth aperture detection (MAD). Subjective experimental results demonstrate the effectiveness of our system.
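
    A toy sketch of the singing-scene detection step (VAD combined with mouth-aperture detection) mentioned above, assuming per-frame scores have already been computed by some detector; the threshold and minimum run length are hypothetical.

    ```python
    import numpy as np

    def singing_scene_mask(vocal_activity, mouth_aperture, thresh=0.5, min_len=15):
        """Toy detector: a frame belongs to a singing scene when the vocal-activity
        score (audio) and the mouth-aperture score (video) are both above a
        threshold; runs shorter than min_len frames are discarded."""
        va = np.asarray(vocal_activity, dtype=float)
        ma = np.asarray(mouth_aperture, dtype=float)
        mask = (va > thresh) & (ma > thresh)
        out = mask.copy()
        run_start = None
        for i, v in enumerate(mask.tolist() + [False]):   # sentinel closes the last run
            if v and run_start is None:
                run_start = i
            elif not v and run_start is not None:
                if i - run_start < min_len:
                    out[run_start:i] = False
                run_start = None
        return out
    ```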

  • Estimation of Reflectance Function by Measuring Translucent Materials for Real-time Rendering

    Midori Okamoto, Shohei Adachi, Hiroaki Ukaji, Kazuki Okami, Shigeo Morishima

    IPSJ SIG Notes   2014 ( 3 ) 1 - 7  2014.02

     View Summary

    It is important to render translucent materials realistically in computer graphics. In this paper, we propose a method for rendering translucent materials in real time based on actual measurement. First, we illuminate several translucent spheres of varying radii and measure the actual subsurface scattering. Next, we acquire the radiance and the angle between the normal and the light vector at each point on the sphere. We then correct the radiance by analyzing its dependence on the angle between the normal and the view vector. For each translucent material, we relate curvature to radiance. Finally, this relation allows fast and realistic rendering of translucent materials.

    CiNii J-GLOBAL

  • Estimation of Onomatopoeia Considering Physical Phenomena

    Tsukasa Fukusato, Shigeo Morishima

    IPSJ SIG Notes   2014 ( 10 ) 1 - 8  2014.02

     View Summary

    This paper presents a technique for estimating onomatopoeia from physical parameters. Onomatopoeia is often used in non-photorealistic animation (NPR) to emphasize materials and character motion, and it helps viewers understand anime content intuitively. However, animators currently select onomatopoeia empirically, according to the situation and character references. We therefore propose a method that quantifies and recommends onomatopoeia using physical parameters computed during the animation process, making it possible to visualize information such as material and speed.

    CiNii J-GLOBAL

  • Facial Fattening Simulation Based on Skull Bone

    Masahiro Fujisaki, Daiki Kuwahara, Ai Mizokawa, Tomoyori Iwao, Taro Nakamura, Akinobu Maejima, Shigeo Morishima

    IPSJ SIG Notes   2014 ( 20 ) 1 - 7  2014.02

     View Summary

    A precise facial fattening simulation is in strong demand for beautification, health, and entertainment applications. Previous works ignored the individuality of facial deformation because they applied the same fattening rule to everyone; moreover, because the rule was defined only from the structure of the facial surface, faces were deformed unnaturally without regard to the shape of the skull. We therefore propose a facial fattening deformation that preserves individuality by estimating the skull shape from a frontal facial image, and that prevents the facial surface from penetrating the estimated skull. The result is a realistic fattening simulation that keeps the individuality of facial deformation while avoiding unnatural results.

    CiNii J-GLOBAL

  • Realistic Facial Retargeting with Individual Characteristics

    Takuya Kato, Masahide Kawai, Shunsuke Saito, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

    IPSJ SIG Notes   2014 ( 15 ) 1 - 8  2014.02

     View Summary

    Although facial retargeting using blendshape animation is a major method for creating CG facial animation, sculpting the blendshapes is a drawback because it requires enormous labor. In this paper, we propose a method that builds a mapping between blendshapes without individual characteristics and a set of training examples, and applies it to an input facial model created by transferring the geometry of other characters. Our system transfers the individual characteristics defined by a few training examples, efficiently achieving blendshape facial retargeting that retains individuality.

    CiNii J-GLOBAL

  • LYRICS RADAR:歌詞の潜在的意味分析に基づく歌詞検索インタフェース

    佐々木将人, 吉井和佳, 中野倫靖, 後藤真孝, 森島繁生

    情報処理学会研究報告(Web)   2014 ( MUS-102 ) VOL.2014-MUS-102,NO.26 (WEB ONLY)  2014

    J-GLOBAL

  • Query by Phrase:半教師あり非負値行列因子分解を用いた音楽信号中のフレーズ検出

    増田太郎, 吉井和佳, 後藤真孝, 森島繁生

    情報処理学会研究報告(Web)   2014 ( MUS-102 ) VOL.2014-MUS-102,NO.25 (WEB ONLY)  2014

    J-GLOBAL

  • 話者類似度の時間的変化を用いた多人数音声モーフィングに基づく話者変換

    浜崎皓介, 河原英紀, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2014   ROMBUNNO.3-6-1  2014

    J-GLOBAL

  • ラケットスポーツ動画の構造解析に基づく映像要約と鑑賞インタフェースの提案

    河村俊哉, 福里司, 平井辰典, 森島繁生

    情報処理学会全国大会講演論文集   76th ( 2 ) 2.117-2.118  2014

    J-GLOBAL

  • キャラクタに固有な表情変化の特徴を反映したキーシェイプ自動生成手法の提案

    加藤卓哉, 川井正英, 桑原大樹, 斉藤隼介, 岩尾知頼, 前島謙宣, 森島繁生

    情報処理学会全国大会講演論文集   76th ( 4 ) 4.339-4.340  2014

    J-GLOBAL

  • 髪の特徴に基づく類似顔画像検索

    藤賢大, 福里司, 増田太郎, 平井辰典, 森島繁生

    情報処理学会全国大会講演論文集   76th ( 1 ) 1.539-1.540  2014

    J-GLOBAL

  • 実測に基づく反射関数による半透明物体のリアルタイムレンダリング

    岡本翠, 安達翔平, 宇梶弘晃, 岡見和樹, 森島繁生

    情報処理学会全国大会講演論文集   76th ( 4 ) 4.301-4.302  2014

    J-GLOBAL

  • 正面および側面の手描き顔画像からの顔回転シーン自動生成

    古澤知英, 福里司, 岡田成美, 平井辰典, 森島繁生

    情報処理学会全国大会講演論文集   76th ( 4 ) 4.345-4.346  2014

    J-GLOBAL

  • 頭蓋骨の形状を考慮した顔の肥痩シミュレーション

    藤崎匡裕, 桑原大樹, 溝川あい, 中村太郎, 前島謙宣, 森島繁生

    情報処理学会全国大会講演論文集   76th ( 2 ) 2.283-2.284  2014

    J-GLOBAL

  • 歌手映像と歌声の解析に基づく音楽動画中の歌唱シーン検出手法の検討

    平井辰典, 中野倫靖, 後藤真孝, 森島繁生

    電子情報通信学会技術研究報告   114 ( 52(SP2014 1-45) ) 271 - 278  2014

    J-GLOBAL

  • 物理現象を考慮した映像シーンへの擬音語自動付加の研究

    福里司, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2014   ROMBUNNO.16  2014

    J-GLOBAL

  • 正面口内画像群からのリアルな三次元口内アニメーションの自動生成

    川井正英, 岩尾知頼, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2014   ROMBUNNO.22  2014

    J-GLOBAL

  • キャラクタ固有の表情特徴を考慮した顔アニメーション生成手法

    加藤卓哉, 斉藤隼介, 川井正英, 岩尾知頼, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2014   ROMBUNNO.37  2014

    J-GLOBAL

  • 曲率依存反射関数の実測に基づく半透明物体のリアルタイムレンダリング

    岡本翠, 安達翔平, 宇梶弘晃, 岡見和樹, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2014   ROMBUNNO.50  2014

    J-GLOBAL

  • ラケットスポーツのラリーシーンに着目した映像要約と効率的鑑賞インタフェース

    河村俊哉, 福里司, 平井辰典, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2014   ROMBUNNO.4  2014

    J-GLOBAL

  • 2枚の手描き顔画像を用いたキャラクタ顔回転シーン自動生成

    古澤知英, 福里司, 岡田成美, 平井辰典, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2014   ROMBUNNO.35  2014

    J-GLOBAL

  • 頭蓋骨形状に基づいた顔の肥痩シミュレーション

    藤崎匡裕, 桑原大樹, 溝川あい, 中村太郎, 前島謙宣, 山下隆義, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2014   ROMBUNNO.20  2014

    J-GLOBAL

  • 髪の特徴に基づく顔の印象類似検索システム

    藤賢大, 福里司, 佐々木将人, 増田太郎, 平井辰典, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2014   ROMBUNNO.41  2014

    J-GLOBAL

  • 正面および側面のイラストからのキャラクタ顔回転シーンの自動生成

    古澤知英, 福里司, 岡田成美, 平井辰典, 森島繁生

    情報処理学会研究報告(Web)   2014 ( CG-156 ) VOL.2014-CG-156,NO.8 (WEB ONLY)  2014

    J-GLOBAL

  • 振り付けの構成要素を考慮したダンスモーションのセグメンテーション手法の提案

    岡田成美, 福里司, 岩本尚也, 森島繁生

    情報処理学会研究報告(Web)   2014 ( CG-156 ) VOL.2014-CG-156,NO.9 (WEB ONLY)  2014

    J-GLOBAL

  • 個人性を保持したデータドリブンなパーツベース経年変化顔画像合成

    桑原大樹, 前島謙宣, 藤崎匡裕, 森島繁生

    情報処理学会研究報告(Web)   2014 ( CVIM-194 ) VOL.2014-CVIM-194,NO.23 (WEB ONLY)  2014

    J-GLOBAL

  • Automatic Photorealistic 3D Inner Mouth Restoration from Frontal Images

    Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

    ADVANCES IN VISUAL COMPUTING (ISVC 2014), PT 1   8887   51 - 62  2014

     View Summary

    In this paper, we propose a novel method to generate highly photorealistic three-dimensional (3D) inner mouth animation that is well fitted to an original ready-made speech animation, using only frontally captured images and a small database. The algorithm consists of quasi-3D model reconstruction, motion control of the teeth and tongue, and final compositing that tailors the photorealistic speech animation to the original.

  • A visuomotor coordination model for obstacle recognition

    Tomoyori Iwao, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    Journal of WSCG   22 ( 2 ) 49 - 56  2014.01

     View Summary

    In this paper, we propose a novel method for animating CG characters that pay heed to obstacles while walking or running. Our primary contribution is to formulate a generic visuomotor coordination model for obstacle recognition with whole-body movements. In addition, our model easily generates gaze shifts that express the individuality of characters. Based on experimental evidence, we also incorporate the coordination of eye movements in response to obstacle-recognition behavior via simple parameters related to the target position and the individuality of the characters' gaze shifts. The overall model can generate plausible visuomotor-coordinated movements in various scenes by manipulating the parameters of our proposed functions.

  • Character transfer: Example-based individuality retargeting for facial animations

    22nd International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, WSCG 2014, Full Papers Proceedings - in co-operation with EUROGRAPHICS Association     121 - 129  2014.01

     View Summary

    A key disadvantage of blendshape animation is the labor-intensive task of sculpting blendshapes with individual expressions for each character. In this paper, we propose a novel system, "Character Transfer", that automatically sculpts blendshapes with individual expressions by extracting them from training examples; this extraction creates a mapping that drives the sculpting process. Compared with the naive method of transferring facial expressions from other characters, Character Transfer effectively sculpts blendshapes without the need to create such unnecessary blendshapes for other characters. Character Transfer is applicable even when only a few training examples are available, by using region segmentation of the face and blending of the mappings.

  • ラケットスポーツ動画の構造解析による映像要約手法の提案

    河村俊哉, 福里司, 平井辰典, 森島繁生

    情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア]   2013 ( 15 ) 1 - 6  2013.11

     View Summary

    Sports videos have recently become easy to watch casually, and efficient ways of viewing them are needed. Previous video summarization for racquet-sport videos generated summaries of important rally scenes, but there were problems with the rally detection and with how rallies were scored, and efficient viewing of the summary itself was not considered. This paper therefore proposes a new rally-scene detection method for racquet-sport videos, a summarization method based on the importance of each rally, and a way of viewing the result. The proposed method achieves accurate rally detection by clustering similar shots in a shot-segmented video and selecting the clusters that contain rallies. Each rally is then scored using audio information, and the user can adjust the result to watch the video within an arbitrary amount of time. In addition, we propose a fast-playback viewing mode specialized for racquet sports, which makes viewing even more efficient.

    CiNii

  • Conversion of a Motion Captured Dance into Expressive Motion

    Narumi Okada, Tsukasa Fukusato, Shigeo Morishima

    IPSJ SIG Notes   2013 ( 4 ) 1 - 7  2013.06

     View Summary

    We propose a method that transforms arbitrary dance motion into more expressive motion by adding accent and power through filtering. The original dance motion is divided into segments and converted to expressive dance while keeping the original tempo. The expression conversion rule is extracted by analyzing motion capture data from training dance motions that include neutral and expressive motions labeled by subjective assessment.

    CiNii J-GLOBAL

  • 顔形状による制約を考慮した線形回帰に基づく顔特徴点自動検出

    松田龍英, 前島謙宣, 森島繁生

    画像ラボ   24 ( 6 ) 32 - 38  2013.06

    CiNii J-GLOBAL

  • レコむし:画像と楽曲の印象の一致による楽曲推薦システム

    佐々木将人, 平井辰典, 大矢隼士, 森島繁生

    研究報告エンタテインメントコンピューティング(EC)   2013 ( 10 ) 1 - 6  2013.03

     View Summary

    We propose レコむし (RECOmmendation of MUSic using an input Image), a system that recommends songs that emotionally match an input image. The current scene is an important element in enjoying music: the more the impression of a song harmonizes with the scene, the stronger the emotional response to the song. However, manually finding a song that matches the current scene from a huge music collection is not easy. In this work, we place both images and songs in a psychological space called the AV (Arousal-Valence) space, which associates the impression of a scene with the impression of a song. Using this association, レコむし recommends, as a playlist, songs whose impression matches the current scene. We evaluated レコむし by comparing it with randomly selected songs. (A small illustrative sketch follows this entry.)

    CiNii
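
    A minimal sketch of the recommendation step described in the entry above, assuming the image and every song have already been mapped to 2-D Arousal-Valence coordinates by some estimator; the playlist is simply the k songs nearest to the image in that space.

    ```python
    import numpy as np

    def recommend_songs(image_av, song_avs, k=5):
        """Toy recommender: both the query image and every song live in the 2-D
        Arousal-Valence space; return the indices of the k closest songs."""
        image_av = np.asarray(image_av, dtype=float)      # shape (2,)
        song_avs = np.asarray(song_avs, dtype=float)      # shape (n_songs, 2)
        dists = np.linalg.norm(song_avs - image_av, axis=1)
        return np.argsort(dists)[:k]                      # playlist as song indices
    ```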

  • レコむし:画像と楽曲の印象の一致による楽曲推薦システム

    佐々木将人, 平井辰典, 大矢隼士, 森島繁生

    研究報告音楽情報科学(MUS)   2013 ( 10 ) 1 - 6  2013.03

     View Summary

    We propose レコむし (RECOmmendation of MUSic using an input Image), a system that recommends songs that emotionally match an input image. The current scene is an important element in enjoying music: the more the impression of a song harmonizes with the scene, the stronger the emotional response to the song. However, manually finding a song that matches the current scene from a huge music collection is not easy. In this work, we place both images and songs in a psychological space called the AV (Arousal-Valence) space, which associates the impression of a scene with the impression of a song. Using this association, レコむし recommends, as a playlist, songs whose impression matches the current scene. We evaluated レコむし by comparing it with randomly selected songs.

    CiNii

  • 年齢別パッチを用いた画像再構成による経年変化顔画像合成

    前島謙宣, 溝川あい, 松田龍英, 森島繁生

    研究報告コンピュータビジョンとイメージメディア(CVIM)   2013 ( 28 ) 1 - 6  2013.03

     View Summary

    To build a criminal-investigation support system for a safe and secure society, an age-progression technique that can synthesize a subject's past and future faces from a photograph is required. This paper proposes a method that synthesizes age-progressed face images by reconstructing the face from age-specific patch images taken from a database of face photographs captured in the same environment. Through this reconstruction, the method reproduces fine skin features such as spots and dullness that were difficult for conventional methods, and it simulates aging by modulating the reconstruction with a statistical wrinkle model. To verify its effectiveness, we conducted subjective experiments on age estimation and person identification, showing that the method can synthesize age-progressed images that look like the target age while retaining the impression of the original person.

    CiNii

  • Interactive Facial Aging Synthesis by user's simple operation

    溝川 あい, 中井 宏紀, 前島 謙宣, 森島 繁生

    研究報告コンピュータビジョンとイメージメディア(CVIM)   2013 ( 29 ) 1 - 5  2013.03

     View Summary

    Many studies on aged face image synthesis have been reported for security applications, such as searches for criminals or kidnapped children, and for entertainment. Tazoe et al. proposed a facial aging technique that reproduces skin texture such as rough or dull skin, which helps in conveying a person's age. In their method, however, it is difficult to represent wrinkles, one of the most important elements reflecting age, and the results are strongly influenced by lighting conditions and individual skin color. It is also hard to infer the location and shape of future wrinkles because they depend on individual factors, so several possible wrinkle layouts must be considered. In this paper, we propose an aged face synthesis method that creates plausible aged face images and can add photorealistic wrinkles at any location the user indicates with simple freehand strokes.

    CiNii

  • A Study of Age-Invariant Face Authentication Based on the Edge of a Face Parts

    中井 宏紀, 平井 辰典, 前島 謙宣, 森島 繁生

    研究報告コンピュータビジョンとイメージメディア(CVIM)   2013 ( 30 ) 1 - 6  2013.03

     View Summary

    Face authentication accuracy falls when a face's appearance changes with aging. In this paper, we propose an age-invariant face authentication system that uses the contours of facial parts (eyes, nose, mouth, and so on), which change little in appearance with age, to improve authentication accuracy on image databases with age variation. Specifically, we use a face graph as the geometric feature and Histogram of Oriented Gradients (HOG) descriptors around the facial feature points as the texture feature. Evaluation experiments on the public FG-NET Aging Database show higher accuracy than previous work, confirming the effectiveness of the method. (A small illustrative sketch follows this entry.)

    CiNii
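
    A rough sketch of the texture-feature side described above, using scikit-image's HOG on patches around given landmarks; the patch size, HOG parameters, and the distance function are assumptions, not the paper's settings.

    ```python
    import numpy as np
    from skimage.feature import hog

    def landmark_hog(gray, landmarks, patch=32):
        """Toy descriptor: HOG of the patch around each facial landmark, concatenated
        into one vector per face; two faces are then compared by vector distance."""
        feats = []
        for x, y in landmarks:
            x, y = int(x), int(y)
            x0, y0 = max(0, x - patch // 2), max(0, y - patch // 2)
            p = gray[y0:y0 + patch, x0:x0 + patch]
            if p.shape != (patch, patch):                 # landmark too close to the border
                p = np.zeros((patch, patch), dtype=gray.dtype)
            feats.append(hog(p, orientations=9, pixels_per_cell=(8, 8),
                             cells_per_block=(2, 2)))
        return np.concatenate(feats)

    def face_distance(face_a, face_b):
        return float(np.linalg.norm(face_a - face_b))
    ```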

  • 短繊維を考慮した埃の描画手法

    安達翔平, 宇梶弘晃, 小坂昂大, 森島繁生

    全国大会講演論文集   2013 ( 1 ) 299 - 301  2013.03

     View Summary

    Among the many ways of depicting dirt and grime, accumulated dust is an important element for expressing the aging of an object. Previous methods rendered dust by formulating its deposition position as a function or by measuring reflectance properties, but they did not consider the fibers that dust consists of or the three-dimensional structure produced by deposition. This work addresses those problems and aims at a more photorealistic rendering. The amount of deposited dust is expressed pseudo-volumetrically with the shell method, a layered texture-mapping technique that allows fast processing. For the fibers, exploiting the fact that deposited fibers bend in random directions, we procedurally generate a texture that represents fibers by giving random motion to points on the UV plane.

    CiNii

  • リアルな口内表現を実現する発話アニメーションの自動生成

    川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

    全国大会講演論文集   2013 ( 1 ) 229 - 231  2013.03

     View Summary

    Live-action-based CG animation has recently become common in video content. In synthesizing speech scenes, however, previous methods still have difficulty expressing the complex motion of the tongue and the structure of the inner mouth. In this work, taking as input speech face images whose inner mouth is insufficiently rendered, we automatically generate live-action-quality speech face images by completing the inner-mouth region of the input with lip image sets captured in advance. Specifically, tooth and tongue images of a single person are inserted into the input image using the mouth-opening distance and phoneme information, and patch images of many people's lip regions are tiled onto the composited image. This enables the complex inner-mouth appearance that was particularly difficult for previous methods.

    CiNii J-GLOBAL

  • D-12-31 Face Aging Synthesis by Addition of Artificial Wrinkles to the Face Image

    Mizokawa Ai, Nakai Hiroki, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2013 ( 2 ) 124 - 124  2013.03

    CiNii J-GLOBAL

  • D-12-39 Precision inprovement in 3D Face Reconstruction with Integration of Structure From Motion and Photometric Stereo

    Kuwahara Daiki, Matsuda Tatsuhide, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2013 ( 2 ) 133 - 133  2013.03

    CiNii J-GLOBAL

  • Rapid Dust Rendering by Generating Short Fiber Texture with Perlin Noise

    安達翔平, 宇梶弘晃, 小坂昂大, 森島繁生

    情報処理学会研究報告(CD-ROM)   2013 ( 6 ) 1 - 7  2013.02

    CiNii J-GLOBAL

  • Data-Driven Speech Animation Synthesis Focusing on Realistic Inside of the Mouth

    川井 正英, 岩尾 知頼, 三間 大輔, 前島 謙宣, 森島 繁生

    研究報告グラフィクスとCAD(CG)   2013 ( 2 ) 1 - 8  2013.02

     View Summary

    Speech animation synthesis is still a challenging topic in computer graphics. Despite much work, detailed appearance of the inner mouth, such as the tip of the tongue nipped between the teeth or the back of the tongue, has not been achieved in the resulting animation. To solve this problem, we propose a data-driven speech animation synthesis method that focuses on a realistic inner mouth. First, we separate the inner mouth into teeth images labeled with the mouth-opening distance and tongue images classified by phoneme. We then insert them into a pre-created speech animation based on the opening distance and phoneme information. Finally, we apply patch-based texture synthesis with a database of 2,213 images collected from 7 subjects to the resulting animation. The proposed method automatically produces speech animation with a realistic inner mouth from speech animation created by previous methods.

    CiNii J-GLOBAL

  • Facial Feature Point Tracking using Linear Predictors with Time Continuity and Geometrical Constraint

    MATSUDA Tatsuhide, MAEJIMA Akinnobu, MORISHIMA Shigeo

    Technical report of IEICE. PRMU   112 ( 385 ) 145 - 150  2013.01

     View Summary

    In this paper, we propose a novel method that tracks facial feature points using linear predictors with temporal continuity and a geometrical constraint. In previous studies, a facial feature point tracking method using linear predictors was proposed; it is a recursive linear regression of the displacement vector from the current position to the correct position, computed from pixel features around the target point, and it can accurately track feature points when trained on multiple frames of the target facial video. In our method, we track the facial feature points of unknown persons robustly by adding temporal continuity with optical flow and a geometrical constraint based on PCA. (A small illustrative sketch follows this entry.)

    CiNii
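
    A minimal sketch of a linear predictor in the sense used above: ridge regression from local pixel features to the displacement toward the true landmark. The feature representation and regularization are illustrative assumptions; the optical-flow and PCA constraints of the method are not included here.

    ```python
    import numpy as np

    def train_linear_predictor(features, displacements, reg=1e-3):
        """Toy linear predictor: ridge-regress the (dx, dy) displacement from the
        current point to the true landmark onto local pixel features."""
        X = np.asarray(features, dtype=float)       # (n_samples, n_pixels)
        Y = np.asarray(displacements, dtype=float)  # (n_samples, 2)
        A = X.T @ X + reg * np.eye(X.shape[1])
        return np.linalg.solve(A, X.T @ Y)          # W, shape (n_pixels, 2)

    def predict_shift(W, feature):
        """Predicted displacement for one feature vector sampled at the current guess."""
        return np.asarray(feature, dtype=float) @ W
    ```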

  • Facial Feature Point Tracking using Linear Predictors with Time Continuity and Geometrical Constraint

    MATSUDA Tatsuhide, MAEJIMA Akinnobu, MORISHIMA Shigeo

    Technical report of IEICE. Multimedia and virtual environment   112 ( 386 ) 145 - 150  2013.01

     View Summary

    In this paper, we propose a novel method that tracks facial feature points using linear predictors with time continuity and geometrical constraint. In previous studies, a facial feature point tracking method using linear predictors was proposed which is a recursive linear regression method for displacement vectors from the current position to the correct position by considering pixel features around a target point. This method can accurately track feature points using multiple frames in a target facial video in training. In our proposed method, we performed facial feature point tracking for unknown persons robustly with time continuity using optical flow and geometrical constraint using PCA.

    CiNii J-GLOBAL

  • 経年変化顔シミュレータ

    前島謙宣, 森島繁生

    画像センシングシンポジウム講演論文集(CD-ROM)   19th   ROMBUNNO.DS2-01  2013

    J-GLOBAL

  • 疑似的な皺の付加による加齢変化顔合成

    溝川あい, 中井宏紀, 前島謙宜, 森島繁生

    画像センシングシンポジウム講演論文集(CD-ROM)   19th   ROMBUNNO.IS2-13  2013

    J-GLOBAL

  • Facial Feature Point Tracking using Linear Predictors with Time Continuity and Geometrical Constraint

    松田龍英, 前島謙宣, 森島繁生

    電子情報通信学会技術研究報告   112 ( 385(PRMU2012 84-129) ) 145 - 150  2013

    J-GLOBAL

  • アニメ作品のコミック画像解析に基づく動画要約手法の提案

    福里司, 岩本尚也, 森島繁生

    画像電子学会誌(CD-ROM)   42 ( 1 ) 117  2013

    J-GLOBAL

  • 音素情報からの口唇動作推定を利用した発話アニメーションの生成

    三間大輔, 前島謙宣, 森島繁生

    画像電子学会誌(CD-ROM)   42 ( 1 ) 118  2013

    J-GLOBAL

  • 動画フレームの時間連続性と顔類似度に基づく動画コンテンツの同一人物抽出手法

    平井辰典, 中野倫靖, 後藤真孝, 森島繁生

    画像電子学会誌(CD-ROM)   42 ( 1 ) 116  2013

    J-GLOBAL

  • 対話時の感情を反映した眼球運動の分析及び合成

    岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

    画像電子学会誌(CD-ROM)   42 ( 1 ) 117  2013

    J-GLOBAL

  • 繊維構造を考慮した埃の高速描画手法

    安達翔平, 宇梶弘晃, 小坂昂大, 森島繁生

    情報処理学会全国大会講演論文集   75th ( 4 ) 4.299-4.300  2013

    J-GLOBAL

  • リアルな口内表現を実現する発話アニメーションの自動生成

    川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

    情報処理学会全国大会講演論文集   75th ( 4 ) 4.229-4.230  2013

    J-GLOBAL

  • 注目領域中の画像類似度に基づく動画中のキャラクター登場シーンの推薦手法

    増田太郎, 平井辰典, 大矢隼士, 森島繁生

    情報処理学会全国大会講演論文集   75th ( 2 ) 2.601-2.602  2013

    J-GLOBAL

  • 入力画像に感性的に一致した楽曲を推薦するシステム

    佐々木将人, 平井辰典, 大矢隼士, 森島繁生

    情報処理学会全国大会講演論文集   75th ( 2 ) 2.45-2.46  2013

    J-GLOBAL

  • ダンスモーションにおける表現のバリエーション生成

    岡田成美, 岡見和樹, 福里司, 岩本尚也, 森島繁生

    情報処理学会全国大会講演論文集   75th ( 4 ) 4.227-4.228  2013

    J-GLOBAL

  • 年齢別パッチを用いた画像再構成による経年変化顔画像合成

    前島謙宣, 溝川あい, 松田龍英, 森島繁生

    情報処理学会研究報告(CD-ROM)   2012 ( 6 ) ROMBUNNO.CVIM-186,NO.28  2013

    J-GLOBAL

  • Data-Driven Speech Animation Synthesis Focusing on Realistic Inside of the Mouth

    川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

    情報処理学会研究報告(CD-ROM)   2012 ( 6 ) ROMBUNNO.CG-150,NO.2  2013

    J-GLOBAL

  • レコむし:画像と楽曲の印象の一致による楽曲推薦システム

    佐々木将人, 平井辰典, 大矢隼士, 森島繁生

    情報処理学会研究報告(CD-ROM)   2012 ( 6 ) ROMBUNNO.MUS-98,NO.10  2013

    J-GLOBAL

  • A Study of Age-Invariant Face Authentication Based on the Edge of a Face Parts

    中井宏紀, 平井辰典, 前島謙宣, 森島繁生

    情報処理学会研究報告(CD-ROM)   2012 ( 6 ) ROMBUNNO.CVIM-186,NO.30  2013

    J-GLOBAL

  • Interactive Facial Aging Synthesis by user’s simple operation

    溝川あい, 中井宏紀, 前島謙宣, 森島繁生

    情報処理学会研究報告(CD-ROM)   2012 ( 6 ) ROMBUNNO.CVIM-186,NO.29  2013

    J-GLOBAL

  • An Automatic Music Video Generation System by Reusing Existent Music Video

    平井辰典, 平井辰典, 大矢隼士, 大矢隼士, 森島繁生, 森島繁生

    情報処理学会論文誌ジャーナル(Web)   54 ( 4 ) 1254 - 1262  2013

    CiNii J-GLOBAL

  • 音楽と映像が同期した音楽動画の自動生成システム

    平井辰典, 大矢隼士, 森島繁生

    情報処理学会研究報告(Web)   2013 ( MUS-99 ) WEB ONLY VOL.2013-MUS-99,NO.26  2013

    J-GLOBAL

  • 視線計測に基づく対話時の眼球運動の分析と合成

    岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

    画像ラボ   24 ( 5 ) 32 - 40  2013

    J-GLOBAL

  • 繊維構造を考慮したShell Textureのプロシージャル生成による埃の高速描画手法

    安達翔平, 宇梶弘晃, 小坂昂大, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.30  2013

    J-GLOBAL

  • 顔領域の画像類似度に基づく動画中のキャラクタ登場シーン推薦

    増田太郎, 福里司, 平井辰典, 大矢隼士, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.45  2013

    J-GLOBAL

  • 皺特徴の簡易的付加による加齢変化顔の合成

    溝川あい, 中井宏紀, 前島謙宜, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.4  2013

    J-GLOBAL

  • 自動領域分割によるブレンドシェイプのためのリアルな表情転写

    小坂昂大, 前島謙宣, 久保尋之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.18  2013

    J-GLOBAL

  • 時系列性を考慮した感情を含んだ眼球運動の分析及び合成

    岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.33  2013

    J-GLOBAL

  • 均質化法を用いた詳細な布の変形シミュレーションの高速化

    斉藤隼介, 梅谷信行, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.28  2013

    J-GLOBAL

  • 画像の感じ方に基づいた楽曲推薦を行うシステム

    佐々木将人, 平井辰典, 大矢隼士, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.15  2013

    J-GLOBAL

  • 写実性豊かな口内表現を実現する発話アニメーションの自動生成

    川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.17  2013

    J-GLOBAL

  • 一枚顔画像を入力とした顔の反射特性推定

    岡見和樹, 岩本尚也, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.35  2013

    J-GLOBAL

  • アニメ作品のキーフレーム検出による漫画形式の映像要約手法の提案

    福里司, 平井辰典, 大矢隼士, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.19  2013

    J-GLOBAL

  • 母音口形に基づく効率的な発話アニメーション生成手法の提案

    三間大輔, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2013   ROMBUNNO.49  2013

    J-GLOBAL

  • Automatic Video Summarization Based-on Key-frame Detection from Animated Film

    福里司, 平井辰典, 大矢隼士, 森島繁生, 森島繁生

    画像電子学会誌(CD-ROM)   42 ( 4 ) 448 - 456  2013

    DOI J-GLOBAL

  • Analysis and Synthesis of Eye Movement in Face-to-Face Conversation Based on Probability Model

    岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

    画像電子学会誌(CD-ROM)   42 ( 5 ) 661 - 670  2013

    DOI J-GLOBAL

  • ラケットスポーツ動画の構造解析による映像要約手法の提案

    河村俊哉, 福里司, 平井辰典, 森島繁生

    情報処理学会研究報告(Web)   2013 ( CG-153 ) WEB ONLY VOL.2013-CG-153,NO.15  2013

    J-GLOBAL

  • 髪の特徴に基づく顔画像の印象類似検索

    藤賢大, 福里司, 増田太郎, 平井辰典, 森島繁生

    情報処理学会研究報告(Web)   2013 ( CG-153 ) WEB ONLY VOL.2013-CG-153,NO.17  2013

    J-GLOBAL

  • A-15-10 Analysis and synthesis of involuntary eye movements in conversations

    Iwao Tomoyori, Mima Daisuke, Kubo Hiroyuki, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2012   230 - 230  2012.03

    CiNii J-GLOBAL

  • A-15-21 Modeling Walking Motion Transformation Related to Ground Condition

    Okami Kazuki, Iwamoto Naoya, Kunitomo Shoji, Suda Hirofumi, Morishima Shigeo

    Proceedings of the IEICE General Conference   2012   241 - 241  2012.03

    CiNii J-GLOBAL

  • D-12-72 Hair Motion Capture based on Multi Views Analysis

    Fukusato Tsukasa, Iwamoto Naoya, Kunitomo Shoji, Suda Hirofumi, Morishima Shigeo

    Proceedings of the IEICE General Conference   2012 ( 2 ) 166 - 166  2012.03

    CiNii J-GLOBAL

  • D-12-79 Real-time Fur Rendering from Realistic Single Image

    Ukaji Hiroaki, Kosaka Takahiro, Hattori Tomohito, Kubo Hiroyuki, Morishima Shigeo

    Proceedings of the IEICE General Conference   2012 ( 2 ) 173 - 173  2012.03

     View Summary

    In recent 3DCG content, animals frequently appear as characters. Because animals are largely covered with fur, the realism of the fur rendering greatly affects the quality of the whole scene. The shell method is a representative real-time fur rendering technique: it expresses fur by drawing shell textures, which correspond to cross-sections of the fur, in stacked layers, and achieves relatively realistic fur in real time. To render realistic fur with the shell method, however, shell textures that produce the desired look must be prepared in advance for every layer, which makes texture preparation costly. Our method automatically generates the specified number of shell textures from a single photograph of fur, so the coat seen in the input image can be reproduced on a 3D object. Because every layer is generated from one input image, the preparation cost is lower than in previous methods, and since only that single image is transferred to the shader, graphics memory usage is also kept low.

    CiNii J-GLOBAL

  • A Method to Identify Artist's Name and Its Performance Scenes in Music Video Content

    Tatsunori Hirai, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

    IPSJ SIG Notes   2012 ( 24 ) 1 - 8  2012.01

     View Summary

    In this paper, we propose a method that can automatically annotate when and which artist is appearing in a music video clip. Previous face recognition methods were not robust against the different shooting conditions in a music video clip, such as variable lighting and face directions, and had difficulty identifying an artist's name and his or her performance scenes. To overcome such difficulties, our method groups consecutive video frames (scenes) into clusters, each having the same artist's face, and identifies an artist by using many video frames in each cluster. In our experiments, accuracy with our method was approximately two or three times higher than a previous method that recognizes a face in each frame. Furthermore, we discuss possible improvements based on the relationship between the appearance of a vocalist in a video clip and the sung sections of its song.

    CiNii J-GLOBAL

  • Analysis and synthesis of saccades and involuntary eye movements in fixation during conversations

    IWAO Tomoyori, MIMA Daisuke, KUBO Hiroyuki, MAEJIMA Akinobu, MORISHIMA Shigeo

    Technical report of IEICE. Multimedia and virtual environment   111 ( 380 ) 239 - 244  2012.01

     View Summary

    To generate natural human motion in computer graphics, it is important to synthesize realistic eye movements. In this research, we build probability models of eye movements from measured data. First, we separate eye movements into two types: saccades and involuntary eye movements during fixation. Second, we approximate each type with a probability model and apply the models to CG characters. As a result, realistic eye movements can be synthesized automatically.

    CiNii

  • Analysis and synthesis of saccades and involuntary eye movements in fixation during conversations

    IWAO Tomoyori, MIMA Daisuke, KUBO Hiroyuki, MAEJIMA Akinobu, MORISHIMA Shigeo

    Technical report of IEICE. PRMU   111 ( 379 ) 239 - 244  2012.01

     View Summary

    To generate natural human motion in computer graphics, it is important to synthesize realistic eye movements. In this research, we build probability models of eye movements from measured data. First, we separate eye movements into two types: saccades and involuntary eye movements during fixation. Second, we approximate each type with a probability model and apply the models to CG characters. As a result, realistic eye movements can be synthesized automatically.

    CiNii J-GLOBAL

  • パッチタイリングを用いた顔画像復元に基づく顔形状推定

    郷原裕明, 前島謙宣, 森島繋生

    画像センシングシンポジウム講演論文集(CD-ROM)   18th  2012

    J-GLOBAL

  • 笑顔表出過程の表情の動きと受け手の印象の相関分析

    藤代裕紀, 前島謙宣, 森島繁生

    電子情報通信学会論文誌 A   J95-A ( 1 ) 128 - 135  2012

    J-GLOBAL

  • Analysis and synthesis of saccades and involuntary eye movements in fixation during conversations

    岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

    電子情報通信学会技術研究報告   111 ( 380(MVE2011 56-94) ) 239 - 244  2012

    J-GLOBAL

  • 足元条件の変化に伴う歩行動作特徴変化のモデリング

    岡見和樹, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

    電子情報通信学会大会講演論文集   2012   241  2012

    J-GLOBAL

  • 会話時の不随意な眼球運動の分析及び合成手法の提案

    岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2012   230  2012

    J-GLOBAL

  • 実写画像に基づく毛皮の実時間描画手法

    宇梶弘晃, 小坂昂大, 服部智仁, 久保尋之, 森島繁生

    電子情報通信学会大会講演論文集   2012   173  2012

    J-GLOBAL

  • 母音スペクトルのブレンドを用いた母音交換による話者変換

    浜崎皓介, 田中茉莉, 河原英紀, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2012   ROMBUNNO.3-11-13  2012

    J-GLOBAL

  • 複数台のビデオ映像解析による頭髪モーションキャプチャ

    福里司, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

    電子情報通信学会大会講演論文集   2012   166  2012

    J-GLOBAL

  • Feature Extraction and Real-time Rendering of Fur from Realistic Image

    宇梶弘晃, 小坂昂大, 服部智仁, 久保尋之, 森島繁生

    情報処理学会研究報告(CD-ROM)   2011 ( 6 ) ROMBUNNO.CG-146,NO.22  2012

    J-GLOBAL

  • Proposal of Music Video Similarity Measuring scale

    長谷川裕記, 森島繁生

    情報処理学会研究報告(CD-ROM)   2011 ( 6 ) ROMBUNNO.EC-23,NO.21  2012

    J-GLOBAL

  • Realistic Hair Motion Reconstruction based on Multi Views Analysis

    福里司, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

    情報処理学会研究報告(CD-ROM)   2011 ( 6 ) ROMBUNNO.CG-146,NO.1  2012

    J-GLOBAL

  • Modeling Walking Motion Transformation Related to Ground Condition Based on Usual Walking

    岡見和樹, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

    情報処理学会研究報告(CD-ROM)   2011 ( 6 ) ROMBUNNO.CG-146,NO.21  2012

    J-GLOBAL

  • テクスチャの周波数解析に基づく年齢変化顔の生成

    中井宏紀, 松田龍英, 田副佑典, 前島謙宣, 森島繁生

    画像センシングシンポジウム講演論文集(CD-ROM)   18th   ROMBUNNO.IS1-05  2012

    J-GLOBAL

  • 形状変形とパッチタイリングに基づく顔のエージングシミュレーション

    田副佑典, 郷原裕明, 前島謙宣, 森島繁生

    画像センシングシンポジウム講演論文集(CD-ROM)   18th   ROMBUNNO.IS1-07  2012

    J-GLOBAL

  • SfMと顔変形モデルに基づく動画像からの3次元顔モデル高速自動生成

    原朋也, 前島謙宣, 森島繁生

    画像センシングシンポジウム講演論文集(CD-ROM)   18th   ROMBUNNO.IS1-08  2012

    J-GLOBAL

  • 顔のしわ特徴を考慮したドライバーの眠気度合推定

    中村太郎, 松田龍英, 原朋也, 前島謙宣, 森島繁生

    画像センシングシンポジウム講演論文集(CD-ROM)   18th   ROMBUNNO.IS1-04  2012

    J-GLOBAL

  • 形状変形とパッチタイリングに基づくテクスチャ変換による年齢変化顔シミュレーション

    田副佑典, 郷原裕明, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.2  2012

    J-GLOBAL

  • 実写画像1枚からのShell Texture自動生成手法の提案

    宇梶弘晃, 小坂昂大, 服部智仁, 久保尋之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.12  2012

    J-GLOBAL

  • 既存の音楽動画を用いて音楽に合った映像を自動生成するシステム

    平井辰典, 大矢隼士, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.18  2012

    J-GLOBAL

  • 会話時のリアルな眼球運動の分析及び合成手法の提案

    岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.4  2012

    J-GLOBAL

  • 事前知識とStructure‐from‐Motionを併用した1台のビデオ画像からの3次元顔モデル高速自動生成手法

    原朋也, 久保尋之, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.1  2012

    J-GLOBAL

  • パッチタイリング手法による正面顔画像と両目位置情報からの顔3次元形状推定

    郷原裕明, 川井正英, 松田龍英, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.43  2012

    J-GLOBAL

  • スキニングを用いた三色光源下における動的な次元立体形状の再現

    須田洋文, 岡見和樹, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.21  2012

    J-GLOBAL

  • 動画サイトコンテンツ再利用によるHMMに基づく音楽からの動画自動生成システム

    大矢隼士, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.50  2012

    J-GLOBAL

  • ステレオカメラ画像の色相検出に基づくマーカレス頭髪モーションキャプチャ

    福里司, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.40  2012

    J-GLOBAL

  • 人の発話特性を考慮したリップシンクアニメーションの生成

    三間大輔, 小坂昂大, 久保尋之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.44  2012

    J-GLOBAL

  • モーションブラーシャドウのリアルタイム生成手法

    小坂昂大, 服部智仁, 久保尋之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.36  2012

    J-GLOBAL

  • 単母音に含まれる音響特徴からの3次元頭部形状推定に関する一検討

    前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.45  2012

    J-GLOBAL

  • 正面顔画像からの形状ディスプレイ用テクスチャ自動生成

    前島謙宣, 倉立尚明, PIERCE Brennand, CHENG Gordon, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2012   ROMBUNNO.42  2012

    J-GLOBAL

  • 顔形状の制約を付加したLinear Predictorsに基づく特徴点自動検出

    松田龍英, 原朋也, 前島謙宣, 森島繁生

    電子情報通信学会論文誌 D   J95-D ( 8 ) 1530 - 1540  2012

    J-GLOBAL

  • Aging Face Composition Based on Frequency Analysis for Skin Texture

    中井宏紀, 松田龍英, 前島謙宣, 森島繁生

    映像情報メディア学会技術報告   36 ( 35(AIT2012 101-108) ) 5 - 8  2012

    J-GLOBAL

  • 実測に基づいた対話時の眼球運動の分析及び合成

    岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

    日本顔学会誌   12 ( 1 ) 166  2012

    J-GLOBAL

  • Identifying Scenes with the Same Person in Video Content on the Basis of Scene Continuity and Face Similarity Measurement

    Hirai Tatsunori, Nakano Tomoyasu, Goto Masataka, Morishima Shigeo

    The Journal of The Institute of Image Information and Television Engineers   66 ( 7 ) J251 - J259  2012

     View Summary

    We present a method that can automatically annotate when and who is appearing in a video stream that is shot in an unstaged condition. Previous face recognition methods were not robust against different shooting conditions, such as those with variable lighting, face directions, and other factors, in a video stream and had difficulties identifying a person and the scenes the person appears in. To overcome such difficulties, our method groups consecutive video frames (scenes) into clusters that each have the same person's face, which we call a "facial-temporal continuum," and identifies a person by using many video frames in each cluster. In our experiments, accuracy with our method was approximately two or three times higher than a previous method that recognizes a face in each frame.

    DOI CiNii J-GLOBAL

  • Aging Face Composition Based on Frequency Analysis for Skin Texture

    NAKAI Hiroki, MATSUDA Tatsuhide, MAEJIMA Akinobu, MORISHIMA Shigeo

    ITE Technical Report   36 ( 0 ) 5 - 8  2012

     View Summary

    In this paper, an aging face composition method based on frequency analysis is proposed. First, we apply a two-dimensional Fourier transform to flat skin areas of the face images in the database and compute the correlation coefficient between age and each frequency component. Next, the frequency components with high correlation are shifted toward the target age, and the target texture is obtained by the inverse Fourier transform. The proposed method can generate an aged face while maintaining the individual's skin characteristics. (A small illustrative sketch follows this entry.)

    DOI CiNii
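
    A toy sketch of the frequency-domain aging step described above, assuming a precomputed per-frequency correlation map and a simple linear gain; the actual age-conversion rule fitted in the paper is not reproduced.

    ```python
    import numpy as np

    def shift_age_texture(patch, corr_map, age_delta, gain=0.02):
        """Toy aging: scale each 2-D frequency component of a flat skin patch in
        proportion to its correlation with age (corr_map, same shape as the FFT),
        then invert the transform; .real drops any small imaginary residue."""
        spec = np.fft.fft2(patch)
        spec *= 1.0 + gain * age_delta * corr_map
        return np.clip(np.fft.ifft2(spec).real, 0.0, 1.0)
    ```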

  • 3D Face Reconstruction From A Facial Image based on Texture-Depth Patch Techniques

    郷原 裕明, 前島 謙宣, 森島 繁生

    研究報告コンピュータビジョンとイメージメディア(CVIM)   2011 ( 20 ) 1 - 7  2011.11

     View Summary

    This paper presents an adaptation of the image quilting algorithm for 3D reconstruction and synthesis from 2D images. We build a database of 3D faces that are normalized and converted into depth maps together with their color maps, divided into patches. Then, given a 2D face image, we compute its 3D depth map by selecting and tiling texture-depth patches from the database within a minimization framework, exploiting the relationship between texture and depth.

    CiNii

  • Automatic 3D Face Generation from Video based on Sparse Feature Points and Deformable Face Model

    原 朋也, 前島 謙宣, 森島 繁生

    研究報告コンピュータビジョンとイメージメディア(CVIM)   2011 ( 25 ) 1 - 8  2011.11

     View Summary

    In this paper, we propose a hybrid 3D face reconstruction method that combines a Structure-from-Motion (SfM) approach based on the factorization method, which estimates accurate 3D point depths, with a generic-model approach based on a deformable face model, which keeps the local face shape appropriate. Unlike other methods, ours requires no manual operation from image capture to 3D face model output and runs quickly. Consequently, it expresses more plausible facial geometry than the previous method.

    CiNii

  • 3D Face Reconstruction From A Facial Image based on Texture-Depth Patch Techniques

    郷原 裕明, 前島 謙宣, 森島 繁生

    研究報告グラフィクスとCAD(CG)   2011 ( 20 ) 1 - 7  2011.11

     View Summary

    This paper presents an adaptation of the image quilting algorithm for 3D reconstruction and synthesis from 2D images. We build a database of 3D faces that are normalized and converted into depth maps together with their color maps, divided into patches. Then, given a 2D face image, we compute its 3D depth map by selecting and tiling texture-depth patches from the database within a minimization framework, exploiting the relationship between texture and depth.

    CiNii J-GLOBAL

  • Automatic 3D Face Generation from Video based on Sparse Feature Points and Deformable Face Model

    原 朋也, 前島 謙宣, 森島 繁生

    研究報告グラフィクスとCAD(CG)   2011 ( 25 ) 1 - 8  2011.11

     View Summary

    In this paper, we propose a hybrid 3D face reconstruction method that combines a Structure-from-Motion (SfM) approach based on the factorization method, which estimates accurate 3D point depths, with a generic-model approach based on a deformable face model, which keeps the local face shape appropriate. Unlike other methods, ours requires no manual operation from image capture to 3D face model output and runs quickly. Consequently, it expresses more plausible facial geometry than the previous method.

    CiNii J-GLOBAL

  • Imaging Technologies of Face and Human Body for New Industries

    KAWADE Masato, MOCHIMARU Masaaki, MORISHIMA Shigeo

    The Journal of The Institute of Image Information and Television Engineers   65 ( 11 ) 1534 - 1544  2011.11

    DOI CiNii

  • 三色光源下における動物体の高精度かつ詳細な三次元形状再現

    須田洋文, 前島謙宣, 森島繁生

    画像の認識・理解シンポジウム(MIRU2011)論文集   2011   1466 - 1472  2011.07

    CiNii

  • 特徴量の経年変化解析に基づく個人識別手法の検討

    原田健希, 田副佑典, 前島謙宣, 森島繁生

    画像の認識・理解シンポジウム(MIRU2011)論文集   2011   780 - 785  2011.07

    CiNii

  • 幾何学的制約を考慮したLinear Predictorsに基づく顔特徴点自動検出

    松田龍英, 原朋也, 前島謙宣, 森島繁生

    画像の認識・理解シンポジウム(MIRU2011)論文集   2011   773 - 779  2011.07

    CiNii

  • An automatic dance video creation system based on comprehension of image using annotation

    長谷川 裕記, 前島 謙宣, 森島 繁生

    研究報告音楽情報科学(MUS)   2011 ( 20 ) 1 - 6  2011.07

     View Summary

    This paper presents a system that automatically generates a dance video clip matched to music by segmenting and concatenating existing dance video clips. The system is based on machine learning over video annotations and image features. Because it uses contour features that describe the shapes of objects in a frame, together with the tags attached to video content on the web, it can infer what is depicted in the image; this lets it take the composition of the frame into account and gives the user more flexible control than the prior study.

    CiNii J-GLOBAL

  • Proposal and Implementation of Curvature-Dependent Reflectance Function as a Real-time Skin Shader

    久保 尋之, 土橋 宜典, 津田 順平, 森島 繁生

    研究報告 グラフィクスとCAD(CG)   2011 ( 2 ) 1 - 6  2011.06

     View Summary

    Simulating subsurface scattering is one of the most effective ways to realistically synthesize translucent materials, especially human skin. This paper describes the Curvature-Dependent Reflectance Function (CDRF) as a real-time skin shader and its implementation for practical use. For a production pipeline, we build a simple workflow and provide a way to exaggerate the scattering effect, so that the distinctive, stylized look of each character's skin can be expressed. (A small illustrative sketch follows this entry.)

    CiNii J-GLOBAL
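
    The published CDRF is defined by the paper itself; as a loose, hypothetical illustration of the idea that perceived scattering depends on local curvature, the sketch below implements a curvature-driven wrap-lighting diffuse term in Python. The parameter names and the falloff model are assumptions for illustration only.

        import numpy as np

        def curvature_dependent_diffuse(n_dot_l, curvature, scatter_width=0.5):
            """Illustrative curvature-dependent diffuse term (not the published CDRF).

            n_dot_l:       cosine between the surface normal and light direction.
            curvature:     approximate local curvature (1 / radius); higher values
                           let scattered light wrap further past the terminator.
            scatter_width: material-dependent scale of the scattering falloff.
            """
            # The softened terminator widens with curvature times scattering.
            wrap = np.clip(curvature * scatter_width, 0.0, 1.0)
            # Wrap lighting: remap n.l from [-wrap, 1] to [0, 1] and clamp.
            return np.clip((n_dot_l + wrap) / (1.0 + wrap), 0.0, 1.0)

        # Exaggerating curvature, as the abstract suggests, simply scales the input.
        shaded = curvature_dependent_diffuse(n_dot_l=0.1, curvature=2.0 * 1.5)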

  • An Automatic Music Video Creation System By Reusing Music Video Contents

    HIRAI Tatsunori, OHYA Hayato, HASEGAWA Yuki, MORISHIMA Shigeo

    IEICE technical report   111 ( 76 ) 143 - 148  2011.05

     View Summary

    In this paper, we propose a music video creation system that cuts and pastes existing video content based on perceptual synchronization between music and video features. The system matches video features of existing content, such as flicker and the speed of movement, with the RMS energy of the input song; a subjective evaluation experiment in this work confirmed that people tend to feel music and video are well synchronized under this criterion. To create a video, the system first builds a database of flicker and movement information for existing content, obtained from the optical flow of each video and the luminance of each frame. It then searches the database for the feature sequences that correlate best with the RMS of the input music and concatenates the corresponding video fragments together with the music. (A correlation-matching sketch follows this entry.)

    CiNii J-GLOBAL
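
    The core matching step described above, selecting the clip whose motion or flicker curve best follows the music's RMS energy, can be illustrated with a short NumPy sketch. The frame length, the synthetic data and the function names are hypothetical; this is not the authors' code.

        import numpy as np

        def frame_rms(audio, frame_len=1024):
            """Per-frame RMS energy of a mono audio signal."""
            n = len(audio) // frame_len
            frames = audio[:n * frame_len].reshape(n, frame_len)
            return np.sqrt((frames ** 2).mean(axis=1))

        def best_matching_clip(rms_curve, clip_motion_curves):
            """Pick the clip whose motion/flicker curve correlates best with the music RMS."""
            best_idx, best_corr = -1, -np.inf
            for i, motion in enumerate(clip_motion_curves):
                # Resample the clip's curve to the length of the RMS curve.
                m = np.interp(np.linspace(0, 1, len(rms_curve)),
                              np.linspace(0, 1, len(motion)), motion)
                corr = np.corrcoef(rms_curve, m)[0, 1]
                if corr > best_corr:
                    best_idx, best_corr = i, corr
            return best_idx, best_corr

        # Toy usage with synthetic data: 5 clips with mean optical-flow per frame.
        audio = np.random.randn(44100)
        clips = [np.abs(np.random.randn(120)) for _ in range(5)]
        idx, corr = best_matching_clip(frame_rms(audio), clips)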

  • An Automatic Music Video Creation System By Reusing Music Video Contents

    HIRAI Tatsunori, OHYA Hayato, HASEGAWA Yuki, MORISHIMA Shigeo

    IEICE technical report   111 ( 77 ) 143 - 148  2011.05

     View Summary

    In this paper, we propose a music video creation system that cuts and pastes existing video content based on perceptual synchronization between music and video features. The system matches video features of existing content, such as flicker and the speed of movement, with the RMS energy of the input song; a subjective evaluation experiment in this work confirmed that people tend to feel music and video are well synchronized under this criterion. To create a video, the system first builds a database of flicker and movement information for existing content, obtained from the optical flow of each video and the luminance of each frame. It then searches the database for the feature sequences that correlate best with the RMS of the input music and concatenates the corresponding video fragments together with the music.

    CiNii

  • Dance Video Creation System by Synchronization between Music and Video Features

    平井 辰典, 大矢 隼士, 長谷川 裕記, 森島 繁生

    研究報告 音楽情報科学(MUS)   2010 ( 6 ) 1 - 6  2011.04

     View Summary

    This paper proposes a system that, given an input piece of music, generates a dance video that people perceive as synchronized with it, based on a music-video synchronization criterion validated by a subjective evaluation experiment. The criterion underlying the system is that when the RMS energy of the music matches the accents of the video (flicker, speed of motion, and so on), people tend to feel that the music and the pictures are "in sync". To generate a video, the system first builds a database by extracting the person region from existing dance video sequences and computing their optical flow; it then selects, for each segment of the input music, the dance sequence whose optical-flow behavior is closest to the behavior of that segment's RMS, and concatenates the selected sequences to produce the dance video most synchronized with the music.

    CiNii

  • D-12-9 Rapid Rendering of Translucent Materials Using Multi-view Depth-Maps

    Kosaka Takahiro, Hattori Tomohito, Kubo Hiroyuki, Morishima Shigeo

    Proceedings of the IEICE General Conference   2011 ( 2 ) 112 - 112  2011.02

    CiNii

  • D-12-10 Estimating Fluid Simulation Parameters Considering Dynamic Liquid Surface

    Iwamoto Naoya, Kunitomo Shoji, Morishima Shigeo

    Proceedings of the IEICE General Conference   2011 ( 2 ) 113 - 113  2011.02

    CiNii J-GLOBAL

  • D-12-30 Dynamic 3D Shape Reconstruction from Multi-View Videos by Deformation of Standard Shape

    Taneda Daichi, Yamanaka Kentaro, Kunitomo Shoji, Suda Hirofumi, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2011 ( 2 ) 133 - 133  2011.02

    CiNii J-GLOBAL

  • D-12-37 A Study on Face Identification considering Age Progression

    Harada Tatsuki, Tazoe Yusuke, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2011 ( 2 ) 140 - 140  2011.02

    CiNii J-GLOBAL

  • D-12-38 Automatic Facial Feature Points Extraction using Linear Predictors with Geometry Constraints

    Matsuda Tatsuhide, Hara Tomoya, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2011 ( 2 ) 141 - 141  2011.02

    CiNii J-GLOBAL

  • D-12-39 Expression Generation with Illumination Variance

    Mima Daisuke, Yarimizu Hiroto, Kubo Hiroyuki, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2011 ( 2 ) 142 - 142  2011.02

    CiNii J-GLOBAL

  • Natural Smile Synthesis Considering Impression of Facial Expression Process

    FUJISHIRO Hiroki, MAEJIMA Akinobu, MORISHIMA Shigeo

    IEICE technical report   110 ( 459 ) 31 - 36  2011.02

     View Summary

    Facial expression is important for communicating emotion to others smoothly. There has been much research on smiles because a smile gives a good impression in daily life, but most of it deals mainly with the impression of still face images. Recently, however, it has been suggested that the transient features of an expression change also affect its impression. In this paper we focus on the transient features of the smile, examining which facial parts contribute to the impression of a natural smile. By analyzing captured videos of smiling, we define rules for making an original smile look more natural. Subjective assessment of the synthesized smiles shows that naturalness is improved by our method.

    CiNii

  • Dancereproducer: An automatic mashup music video generation system by reusing dance video clips on the web

    Proceedings of the 8th Sound and Music Computing Conference, SMC 2011    2011.01

     View Summary

    We propose a dance video authoring system, DanceReProducer, that can automatically generate a dance video clip appropriate to a given piece of music by segmenting and concatenating existing dance video clips. In this paper, we focus on the reuse of ever-increasing user-generated dance video clips on a video sharing web service. In a video clip consisting of music (audio signals) and image sequences (video frames), the image sequences are often synchronized with or related to the music. Such relationships are diverse in different video clips, but were not dealt with by previous methods for automatic music video generation. Our system employs machine learning and beat tracking techniques to model these relationships. To generate new music video clips, short image sequences that have been previously extracted from other music clips are stretched and concatenated so that the emerging image sequence matches the rhythmic structure of the target song. Besides automatically generating music videos, DanceReProducer offers a user interface in which a user can interactively change image sequences just by choosing different candidates. This way people with little knowledge or experience in MAD movie generation can interactively create personalized video clips. © 2011 Tomoyasu Nakano et al.
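
    The abstract above describes segment selection as a cost-minimization problem solved by Viterbi search. The sketch below is a self-contained toy version of that dynamic program in Python: a per-unit fit cost plus a transition cost between segments, with hypothetical random costs standing in for the learned music-video relationships.

        import numpy as np

        def select_segments(fit_cost, transition_cost):
            """Viterbi-style selection of one video segment per musical unit.

            fit_cost:        (T, K) cost of assigning segment k to musical unit t
                             (e.g. how poorly its motion matches the music there).
            transition_cost: (K, K) cost of cutting from segment i to segment j
                             (temporal continuity / structure penalty).
            Returns the minimum-cost segment index sequence of length T.
            """
            T, K = fit_cost.shape
            total = fit_cost[0].copy()
            back = np.zeros((T, K), dtype=int)
            for t in range(1, T):
                cand = total[:, None] + transition_cost          # (K, K): prev -> next
                back[t] = cand.argmin(axis=0)
                total = cand.min(axis=0) + fit_cost[t]
            path = [int(total.argmin())]
            for t in range(T - 1, 0, -1):
                path.append(int(back[t, path[-1]]))
            return path[::-1]

        # Toy usage: 8 musical units, 5 candidate segments.
        path = select_segments(np.random.rand(8, 5), np.random.rand(5, 5))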

  • Automatic generation of facial wrinkles according to expression changes

    Daisuke Mima, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

    SIGGRAPH Asia 2011 Posters, SA'11    2011

     View Summary

    In order to synthesize attractive facial expressions, it is necessary to consider detailed expression changes such as facial wrinkles. Nevertheless, most expression-synthesis techniques (blend shapes, image morphing, simulation of mimic muscles, and so on) focus entirely on large-scale deformation of the face and ignore small-scale details such as wrinkles and bulges. Consequently, unless large-scale photographic equipment is used, hand-crafted work by skilled artists remains necessary to make a face attractive in the end.

    DOI

  • Automatic 3D face generation from video with sparse point constraint and dense deformable model

    Tomoya Hara, Akinobu Maejima, Shigeo Morishima

    SIGGRAPH Asia 2011 Posters, SA'11    2011

     View Summary

    3D face models are widely applied in various fields (e.g. biometrics, movies, video games). In particular, reconstructing a 3D face model with only a single camera, without attaching landmarks or projecting laser dots or structured-light patterns onto the face, is one of the most popular and challenging tasks in computer vision and computer graphics. For example, Maejima proposed a generic-model approach that can quickly reconstruct a 3D face model from a single 2D photograph using a deformable face model [Maejima et al. 2008]. However, since it assumes a frontal face image as input, that method cannot accurately express the geometry of individual facial parts such as the height of the nose and the cheek contour.

    DOI

  • Real time ambient occlusion by curvature dependent occlusion function

    Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

    SIGGRAPH Asia 2011 Posters, SA'11    2011

     View Summary

    We present a novel technique for computing ambient occlusion [2008] on real-time graphics hardware. Current real-time ambient-occlusion techniques such as SSAO need at least 16 sampling rays and are too computationally expensive to use in computer games. Our method instead approximates occlusion as a local illumination model by introducing a curvature-dependent function. (A toy local-occlusion sketch follows this entry.)

    DOI
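
    As a purely illustrative stand-in for the idea of a local, curvature-driven occlusion term, the sketch below maps mean curvature directly to an ambient-visibility factor in Python. The falloff and parameter names are assumptions; this is not the authors' occlusion function.

        import numpy as np

        def curvature_ambient_occlusion(mean_curvature, strength=1.0):
            """Illustrative local ambient-occlusion term driven by mean curvature.

            Concave regions (negative mean curvature here) receive more occlusion,
            convex regions less; a flat surface stays unoccluded.  This is a purely
            local approximation in the spirit of the abstract above.
            """
            occlusion = np.clip(-mean_curvature * strength, 0.0, 1.0)
            return 1.0 - occlusion   # ambient visibility factor in [0, 1]

        # Toy usage: one concave, one flat and one convex sample point.
        visibility = curvature_ambient_occlusion(np.array([-0.8, 0.0, 0.5]))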

  • 音楽と映像と同期手法に基づくダンス動画生成システム

    平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

    音楽音響研究会資料   29 ( 7 ) 153 - 163  2011

    J-GLOBAL

  • 経年変化を考慮した個人識別手法の検討

    原田健希, 田副佑典, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2011   140  2011

    J-GLOBAL

  • 顔画像における陰影変化を伴う表情生成

    三間大輔, 鑓水裕刀, 久保尋之, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2011   142  2011

    J-GLOBAL

  • 複数視点からの深度マップを用いた半透明物体の高速描画

    小坂昂大, 服部智仁, 久保尋之, 森島繁生

    電子情報通信学会大会講演論文集   2011   112  2011

    J-GLOBAL

  • Natural Smile Synthesis Considering Impression of Facial Expression Process

    藤代裕紀, 前島謙宣, 森島繁生

    電子情報通信学会技術研究報告   110 ( 459(HCS2010 56-69) ) 31 - 36  2011

    J-GLOBAL

  • 動的な水の表面形状を考慮した流体のパラメータ推定

    岩本尚也, 國友翔次, 森島繁生

    電子情報通信学会大会講演論文集   2011   113  2011

    J-GLOBAL

  • 基準形状変形による多視点動画像からの動的立体形状再現

    種田大地, 山中健太郎, 國友翔次, 須田洋文, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2011   133  2011

    J-GLOBAL

  • 幾何学的制約を考慮したLinear Predictorsによる顔特徴点自動抽出

    松田龍英, 原朋也, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2011   141  2011

    J-GLOBAL

  • A Proposal of Innovative Entertainment System "Dive Into the Movie"

    森島繁生, 八木康史, 中村哲, 伊勢史郎, 向川康博, 槇原靖, 間下以大, 近藤一晃, 榎本成悟, 川本真一, 四倉達夫, 池田雄介, 前島謙宣, 久保尋之

    電子情報通信学会誌   94 ( 3 ) 250 - 268  2011

    CiNii J-GLOBAL

  • 既存の動画を再利用して音楽に合わせた動画を自動生成するシステムの提案

    大矢隼士, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2011   ROMBUNNO.3-1-10  2011

    J-GLOBAL

  • 主観評価に基づく音楽と映像の同期手法を用いた音楽動画生成システム

    平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2011   ROMBUNNO.3-1-11  2011

    J-GLOBAL

  • An Automatic Music Video Creation System By Reusing Music Video Contents

    平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

    電子情報通信学会技術研究報告   111 ( 76(DE2011 1-26) ) 143 - 148  2011

    J-GLOBAL

  • 複数視点からの深度マップ利用による半透明物質の実時間描画法

    小坂昂大, 服部智仁, 久保尋之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2011   POSUTAHAPPYO,FUKUSUSHITENKARANOSHINDOMAPPURIYO  2011

    J-GLOBAL

  • 顔画像における表情変化に伴う表情皺の自動生成手法の提案

    三間大輔, 久保尋之, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2011   POSUTAHAPPYO,KAOGAZONIOKERUHYOJOHENKA  2011

    J-GLOBAL

  • 動的な水の表面形状を考慮した流体パラメータ推定

    岩本尚也, 國友翔次, 佐川立昌, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2011   POSUTAHAPPYO,DOTEKINAMIZUNOHYOMENKEIJO  2011

    J-GLOBAL

  • 三色光源下における非剛体の動的立体形状再現

    種田大地, 須田洋文, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2011   KEIJOFUKUGEN,TANEDADAICHI  2011

    J-GLOBAL

  • Proposal and Implementation of Curvature-Dependent Reflectance Function as a Real-time Skin Shader

    久保尋之, 土橋宜典, 津田順平, 森島繁生

    情報処理学会研究報告(CD-ROM)   2011 ( 2 ) ROMBUNNO.CG-143,NO.2  2011

    J-GLOBAL

  • An automatic dance video creation system based on comprehension of image using annotation

    長谷川裕記, 前島謙宣, 森島繁生

    情報処理学会研究報告(CD-ROM)   2011 ( 2 ) ROMBUNNO.MUS-91,NO.20  2011

    J-GLOBAL

  • 医用画像を用いた個性を反映した表情アニメーション生成

    鑓水裕刀, 久保尋之, 前島謙宣, 森島繁生

    日本顔学会誌   11 ( 1 ) 154  2011

    J-GLOBAL

  • 表情の個性を表現するための解剖学的アプローチ

    三間大輔, 久保尋之, 前島謙宣, 森島繁生, 島田和幸

    日本顔学会誌   11 ( 1 ) 169  2011

    J-GLOBAL

  • 表情変化に伴う顔画像への表情皺の自動付加手法の提案

    三間大輔, 久保尋之, 前島謙宣, 森島繁生

    日本顔学会誌   11 ( 1 ) 188  2011

    J-GLOBAL

  • 個人性を反映した話者交換法に有効な話者分類の検討

    田中茉莉, 河原英紀, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2011   ROMBUNNO.2-8-2  2011

    J-GLOBAL

  • 自然発話スペクトログラム再現法による母音交換に基づく母音変換

    浜崎皓介, 山本達也, 田中茉莉, 河原英紀, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2011   ROMBUNNO.2-8-3  2011

    J-GLOBAL

  • Imaging technologies of face and human body for new industries

    Masato Kawade, Masaaki Mochimaru, Shigeo Morishima

    Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers   65 ( 11 ) 1534 - 1544  2011

    DOI J-GLOBAL

  • Automatic 3D Face Generation from Video based on Sparse Feature Points and Deformable Face Model

    原朋也, 前島謙宣, 森島繁生

    情報処理学会研究報告(CD-ROM)   2011 ( 4 ) ROMBUNNO.CG-145,NO.25  2011

     View Summary

    3D face models are widely applied in various fields (e.g. biometrics, movies, video games). In particular, reconstructing a 3D face model with only a single camera, without attaching landmarks or projecting laser dots or structured-light patterns onto the face, is one of the most popular and challenging tasks in computer vision and computer graphics. For example, Maejima proposed a generic-model approach that can quickly reconstruct a 3D face model from a single 2D photograph using a deformable face model [Maejima et al. 2008]. However, since it assumes a frontal face image as input, that method cannot accurately express the geometry of individual facial parts such as the height of the nose and the cheek contour.

    DOI J-GLOBAL

  • 3D Face Reconstruction From A Facial Image based on Texture-Depth Patch Techniques

    郷原裕明, 前島謙宣, 森島繁生

    情報処理学会研究報告(CD-ROM)   2011 ( 4 ) ROMBUNNO.CG-145,NO.20  2011

    J-GLOBAL

  • Acoustic features affecting speaker identification by imitated voice analysis

    20th International Congress on Acoustics 2010, ICA 2010 - Incorporating Proceedings of the 2010 Annual Conference of the Australian Acoustical Society   5   3677 - 3680  2010.12

     View Summary

    In this paper, physical correlates of perceived personal identity are investigated using imitated 16 utterances spoken by 11 mimicry speakers and 24 test subjects. Our unique strategy to use non-professional impersonators enabled to prepare test utterances with wide range of perceived similarities. Reasonably high correlations (0.46 and 0.44) in multiple regression analysis were attained by grouping subjects into three groups based on cluster analysis of the subjective test results. Without clustering, the correlation was only 0.17. Cluster analysis also revealed differences in their focusing physical correlates between three groups indicating importance of individual differences both in speakers and listeners.

  • BT-2-2 Consumer Participable Digital Contents

    MORISHIMA Shigeo

    Proceedings of the Society Conference of IEICE   2010 ( 2 ) SS-92 - SS-95  2010.08

    CiNii J-GLOBAL

  • A study of relationship between speaker identification and acoustic features using perceptual similarity of imitated voice

    TANAKA Mari, KAWAHARA Hideki, MORISHIMA Shigeo

    IEICE technical report   110 ( 143 ) 1 - 5  2010.07

     View Summary

    Physical correlates of perceived personal identity are investigated using 16 imitated utterances spoken by 11 mimicry speakers and rated by 24 test subjects. Our strategy of using non-professional impersonators made it possible to prepare test utterances with a wide range of perceived similarities. Reasonably high correlations (0.46 and 0.44) were attained in multiple regression analysis by grouping the subjects into three groups based on cluster analysis of the subjective test results; without clustering, the correlation was only 0.17. The cluster analysis also revealed differences between the three groups in the physical correlates they focus on, indicating the importance of individual differences in both speakers and listeners. (A clustering-plus-regression sketch follows this entry.)

    CiNii
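
    The analysis pipeline described above, cluster the listeners by their rating patterns and then fit a multiple regression per group, can be sketched with scikit-learn. The data below are random placeholders for the ratings and the acoustic feature set, which are not reproduced here.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.linear_model import LinearRegression

        # Hypothetical data: 24 listeners rate 16 imitated utterances; each utterance
        # also has a vector of acoustic features (F0 statistics, spectral measures, ...).
        rng = np.random.default_rng(0)
        ratings = rng.random((24, 16))            # perceived similarity per listener
        acoustic = rng.random((16, 8))            # 8 acoustic features per utterance

        # 1) Cluster listeners by their rating patterns (as the abstract describes).
        groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(ratings)

        # 2) Fit one multiple regression per listener group: acoustic features -> mean rating.
        for g in range(3):
            target = ratings[groups == g].mean(axis=0)        # group-averaged similarity
            model = LinearRegression().fit(acoustic, target)
            r = np.corrcoef(model.predict(acoustic), target)[0, 1]
            print(f"group {g}: multiple correlation ~ {r:.2f}")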

  • 3D Face Reconstruction from Multi-view Images

    HARA Tomoya, FUJISHIRO Hiroki, NAKANO Shinya, MAEJIMA Akinobu, MORISHIMA Shigeo

    IEICE technical report   109 ( 471 ) 73 - 78  2010.03

     View Summary

    3D face models are widely applied in fields such as filmmaking and personal identification. We previously proposed a method that quickly constructs a 3D face model from a 2D facial image based on a deformable head model, without a 3D scanner, but that method could not accurately express individual facial parts such as the height of the nose and the cheek contour. In this paper, we propose a method that synthesizes a 3D face model closer to the original face shape by constructing 3D face models from multi-view images taken at different face angles and integrating them while optimizing a weight for each facial region. Compared with the previous method, accuracy improves especially in the nose and mouth regions.

    CiNii

  • Fast-Automatic 3D Face Model Generation from Single Snapshot

    MAEJIMA Akinobu, MORISHIMA Shigeo

    IEICE technical report   109 ( 471 ) 145 - 150  2010.03

     View Summary

    In this paper, we develop a method that quickly and automatically generates an individual 3D face model with a plausible 3D geometry estimated from a facial snapshot, using prior knowledge of 3D facial geometry. This prior knowledge consists of a deformable face model and a Gaussian Mixture Model (GMM) learned from 1,153 pre-modeled faces of male and female, young and elderly subjects. The 3D geometry is estimated by optimizing an energy function over the parameters of the deformable face model and deforming the model according to the optimal parameters. The energy function combines a vertex error term and a likelihood term: the vertex error term is the sum of the Euclidean distances between each feature point detected in the input snapshot and its corresponding vertex of the deformable face model, while the likelihood term, given by the GMM, constrains the parameters so that the estimated 3D geometry does not collapse. An evaluation experiment shows that the proposed method can generate a 3D face model with 2.1 mm estimation error in 1.2 seconds. (A toy energy-minimization sketch follows this entry.)

    CiNii
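
    To make the structure of such an energy concrete, here is a self-contained toy sketch of a data term plus a GMM shape prior minimized with SciPy. The linear deformable model, the projection, the data and every name are hypothetical stand-ins, not the paper's model or numbers.

        import numpy as np
        from scipy.optimize import minimize
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(1)

        # Hypothetical deformable model: vertices = mean + basis @ params (linear).
        n_vertices, n_params = 60, 5
        mean_shape = rng.random((n_vertices, 3))
        basis = rng.random((n_vertices, 3, n_params)) * 0.01

        # Prior over parameters learned from pre-modeled faces (stand-in for the paper's GMM).
        gmm = GaussianMixture(n_components=2, random_state=0).fit(rng.normal(size=(200, n_params)))

        observed_idx = np.arange(10)                     # feature points detected in the snapshot
        observed_2d = mean_shape[observed_idx, :2]       # toy "detected" 2D positions

        def energy(params, lam=0.1):
            verts = mean_shape + basis @ params          # deform the model
            proj = verts[observed_idx, :2]               # orthographic projection sketch
            vertex_error = np.sum((proj - observed_2d) ** 2)
            log_likelihood = gmm.score_samples(params[None, :])[0]
            return vertex_error - lam * log_likelihood   # data term + shape prior

        result = minimize(energy, x0=np.zeros(n_params), method="L-BFGS-B")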

  • Aging Simulation of Personal Face Based on Conversion of 3D Geometry and Texture

    TAZOE Yusuke, FUJISHIRO Hiroki, NAKANO Shinya, KASAI Satoko, MAEJIMA Akinobu, MORISHIMA Shigeo

    IEICE technical report   109 ( 471 ) 151 - 156  2010.03

     View Summary

    In this paper, we develop a method for synthesizing aged or rejuvenated faces by adding age characteristics to both the 3D geometry and the texture of an individual face model generated from accurately range-scanned data. The 3D geometry of the individual face model is converted to the target age by adding an age feature vector learned from an age-specific facial database, and the facial texture is converted by adding the luminance difference between the average facial texture of the subject's own generation and that of the target generation. Integrating the two yields the aged or rejuvenated face. The proposed method can represent age-related variation of both facial geometry and texture, such as freckles and wrinkles.

    CiNii

  • Dynamic Shape Reconstruction from Multiple Views using Three Colored Lightings

    KOBAYASHI Shota, MORISHIMA Shigeo

    IEICE technical report   109 ( 471 ) 157 - 162  2010.03

     View Summary

    In recent years, many researchers have studied methods that reconstruct 3D shape from 2D images using shading information, but it is difficult to reconstruct a dynamic, complete 3D model with such methods. In this paper, we obtain shading information using three colored light sources and estimate the normal vectors of an initial shape from that shading. To estimate the normals, we build a reflection model specific to each camera and lighting setup by collecting sufficient data, and we finally estimate the vertex coordinates of the initial shape using the estimated normals. (A photometric-stereo-style sketch follows this entry.)

    CiNii
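
    The per-pixel normal estimation the abstract alludes to is closely related to classical three-light photometric stereo. The NumPy sketch below solves for normals from one RGB frame under three calibrated colored lights with a Lambertian assumption; the light directions and data are hypothetical, and the paper's learned reflection model is replaced by this simple linear solve.

        import numpy as np

        # Hypothetical calibrated directions of the red, green and blue light sources.
        light_dirs = np.array([[1.0, 0.0, 1.0],
                               [-0.5, 0.8, 1.0],
                               [-0.5, -0.8, 1.0]])
        light_dirs /= np.linalg.norm(light_dirs, axis=1, keepdims=True)

        def normals_from_rgb(rgb):
            """Photometric-stereo-style normal estimation from one RGB frame.

            rgb: (H, W, 3) image where each channel is lit by one colored source
                 (Lambertian surface and calibrated lights assumed).
            Returns unit normals of shape (H, W, 3).
            """
            h, w, _ = rgb.shape
            # Solve L @ (albedo * n) = I per pixel; L is the 3x3 light-direction matrix.
            g = np.linalg.solve(light_dirs, rgb.reshape(-1, 3).T).T   # (H*W, 3)
            norms = np.linalg.norm(g, axis=1, keepdims=True) + 1e-8
            return (g / norms).reshape(h, w, 3)

        normals = normals_from_rgb(np.random.rand(4, 4, 3))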

  • 3D Face Reconstruction from Multi-view Images

    HARA Tomoya, FUJISHIRO Hiroki, NAKANO Shinya, MAEJIMA Akinobu, MORISHIMA Shigeo

    IEICE technical report   109 ( 470 ) 73 - 78  2010.03

     View Summary

    3D face models are widely applied in fields such as filmmaking and personal identification. We previously proposed a method that quickly constructs a 3D face model from a 2D facial image based on a deformable head model, without a 3D scanner, but that method could not accurately express individual facial parts such as the height of the nose and the cheek contour. In this paper, we propose a method that synthesizes a 3D face model closer to the original face shape by constructing 3D face models from multi-view images taken at different face angles and integrating them while optimizing a weight for each facial region. Compared with the previous method, accuracy improves especially in the nose and mouth regions.

    CiNii J-GLOBAL

  • Fast-Automatic 3D Face Model Generation from Single Snapshot

    MAEJIMA Akinobu, MORISHIMA Shigeo

    IEICE technical report   109 ( 470 ) 145 - 150  2010.03

     View Summary

    In this paper, we develop a method that quickly and automatically generates an individual 3D face model with a plausible 3D geometry estimated from a facial snapshot, using prior knowledge of 3D facial geometry. This prior knowledge consists of a deformable face model and a Gaussian Mixture Model (GMM) learned from 1,153 pre-modeled faces of male and female, young and elderly subjects. The 3D geometry is estimated by optimizing an energy function over the parameters of the deformable face model and deforming the model according to the optimal parameters. The energy function combines a vertex error term and a likelihood term: the vertex error term is the sum of the Euclidean distances between each feature point detected in the input snapshot and its corresponding vertex of the deformable face model, while the likelihood term, given by the GMM, constrains the parameters so that the estimated 3D geometry does not collapse. An evaluation experiment shows that the proposed method can generate a 3D face model with 2.1 mm estimation error in 1.2 seconds.

    CiNii J-GLOBAL

  • Aging Simulation of Personal Face Based on Conversion of 3D Geometry and Texture

    TAZOE Yusuke, FUJISHIRO Hiroki, NAKANO Shinya, KASAI Satoko, MAEJIMA Akinobu, MORISHIMA Shigeo

    IEICE technical report   109 ( 470 ) 151 - 156  2010.03

     View Summary

    In this paper, we develop a method for synthesizing aged or rejuvenated faces by adding age characteristics to both the 3D geometry and the texture of an individual face model generated from accurately range-scanned data. The 3D geometry of the individual face model is converted to the target age by adding an age feature vector learned from an age-specific facial database, and the facial texture is converted by adding the luminance difference between the average facial texture of the subject's own generation and that of the target generation. Integrating the two yields the aged or rejuvenated face. The proposed method can represent age-related variation of both facial geometry and texture, such as freckles and wrinkles.

    CiNii

  • Dynamic Shape Reconstruction from Multiple Views using Three Colored Lightings

    KOBAYASHI Shota, MORISHIMA Shigeo

    IEICE technical report   109 ( 470 ) 157 - 162  2010.03

     View Summary

    In recent years, many researchers have studied methods that reconstruct 3D shape from 2D images using shading information, but it is difficult to reconstruct a dynamic, complete 3D model with such methods. In this paper, we obtain shading information using three colored light sources and estimate the normal vectors of an initial shape from that shading. To estimate the normals, we build a reflection model specific to each camera and lighting setup by collecting sufficient data, and we finally estimate the vertex coordinates of the initial shape using the estimated normals.

    CiNii J-GLOBAL

  • A-15-10 Aging Simulation of Face by Converting Both 3D Geometry and Texture

    Tazoe Yusuke, Fujishiro Hiroki, Nakano Shinya, Nonaka Yusuke, Kasai Satoko, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2010   192 - 192  2010.03

    CiNii J-GLOBAL

  • A-15-11 Skinning Technique Considering the Shape of Human Skeletons

    Suda Hirofumi, Nakamura Shinsuke, Yamanaka Kentaro, Morishima Shigeo

    Proceedings of the IEICE General Conference   2010   193 - 193  2010.03

     View Summary

    We propose a skinning technique to improve expressive power of Skeleton Subspace Deformation (SSD) by adding the influence of the shape of skeletons to the deformation result by postprocessing. © ACM 2010.

    DOI CiNii J-GLOBAL

  • A-16-10 Estimating Cloth Simulation Parameters Considering Static and Dynamic Features

    Kunitomo Shoji, Nakamura Shinsuke, Morishima Shigeo

    Proceedings of the IEICE General Conference   2010   228 - 228  2010.03

     View Summary

    Realistic drape and motion of virtual clothing is now possible with an up-to-date cloth simulator, but adjusting and tuning its many parameters to reproduce the authentic look of a particular real fabric remains difficult and time-consuming. Bhat et al. [2003] proposed estimating the parameters from video of real fabrics, but their approach projects structured-light patterns onto the fabric, so it may not estimate accurate parameter values for fabrics with colors and textures, and it also requires a motion-capture system to track how the fabric moves. In this paper we introduce a method that uses only a motion-capture system, attaching a few markers to the fabric surface without any other devices, so that animators can easily estimate the parameters of many kinds of fabric. An authentic look and motion of the simulated fabric is achieved by minimizing an error function between the captured motion data and the synthetic motion, considering both static and dynamic cloth features. (A toy parameter-fitting sketch follows this entry.) © ACM 2010.

    DOI CiNii J-GLOBAL
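
    The fitting loop described above can be illustrated, at a much reduced scale, by estimating the stiffness and damping of a one-dimensional damped spring so that its simulated trajectory matches a "captured" one. The toy simulator, the initial guess and the synthesized data are all assumptions, standing in for a real cloth simulator and real capture data.

        import numpy as np
        from scipy.optimize import minimize

        def simulate_marker(stiffness, damping, steps=200, dt=0.01):
            """Toy 1-D damped-spring stand-in for a cloth marker trajectory."""
            x, v, traj = 1.0, 0.0, []
            for _ in range(steps):
                a = -stiffness * x - damping * v
                v += a * dt
                x += v * dt
                traj.append(x)
            return np.array(traj)

        # "Captured" trajectory with unknown parameters (synthesized here for the demo).
        captured = simulate_marker(stiffness=25.0, damping=1.2)

        def error(params):
            sim = simulate_marker(*params)
            return np.sum((sim - captured) ** 2)   # mismatch between simulation and capture

        fit = minimize(error, x0=[10.0, 0.5], method="Nelder-Mead")
        print("estimated stiffness/damping:", fit.x)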

  • D-11-81 Cartoon facial animation with minimal 2D input based on non-linear morphing

    Gohara Hiroaki, Sugimoto Shiori, Morishima Shigeo

    Proceedings of the IEICE General Conference   2010 ( 2 ) 81 - 81  2010.03

    CiNii J-GLOBAL

  • D-12-8 3D Face Reconstruction considering Weights of Each Facial-parts from Multi-view Images

    Hara Tomoya, Fujishiro Hiroki, Nakano Shinya, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2010 ( 2 ) 119 - 119  2010.03

    CiNii J-GLOBAL

  • Voice Output System Considering Personal Voice for Instant Casting Movie

    川本 真一, 足立 吉広, 大谷 大和, 四倉 達夫, 森島 繁生, 中村 哲

    情報処理学会論文誌   51 ( 2 ) 250 - 264  2010.02

     View Summary

    In this paper, we propose an improved Future Cast System (FCS) that enables anyone to be a movie star while retaining their individuality in terms of how they look and how they sound. The proposed system produces voices that are significantly matched to their targets by integrating the results of multiple methods: similar speaker selection and voice morphing. After assigning one CG character to the audience, the system produces voices in synchronization with the CG character's movement. We constructed the speech synchronization system using a voice actor database with 60 different kinds of voices. Our system achieved higher voice similarity than conventional systems; the preference score of our system was 56.5% over other conventional systems.

    CiNii

  • Gait Animation Synthesis Exaggerated Characteristics Based on Perception Similarity

    NAKAMURA Shinsuke, MORISHIMA Shigeo

    研究報告グラフィクスとCAD(CG)   138 ( 6 ) F1 - F6  2010.02

     View Summary

    Human gait contains information about personal identity, and gait-based person authentication has recently become an active research area, yet it is difficult to create gait animation that emphasizes and reflects individual characteristics. In this work we assume that the individuality of a gait is expressed by its deviation from an average gait, and we synthesize gait motion with exaggerated individuality by increasing that deviation. The synthesized motion is represented as a score in a PCA space built from multiple sample gait motions. The degree of exaggeration is determined from a subjective experiment in which subjects search for a specific person's gait among several people's gaits, using the ratio that yields the highest recognition rate, i.e. the shortest search time. (A PCA-exaggeration sketch follows this entry.)

    CiNii
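
    The exaggeration step, scaling a motion's deviation from the sample mean inside a PCA subspace, can be sketched in a few lines of NumPy. The dimensionality, the gain and the random sample motions are hypothetical placeholders rather than the experiment's data.

        import numpy as np

        def exaggerate_motion(sample_motions, target_motion, gain=1.5, n_components=10):
            """Exaggerate a motion by scaling its offset from the mean in PCA space.

            sample_motions: (N, D) flattened motion vectors (e.g. stacked joint angles).
            target_motion:  (D,) motion to exaggerate.
            gain:           how much the individual deviation is amplified.
            """
            mean = sample_motions.mean(axis=0)
            # PCA basis from the samples (SVD of the centered data).
            _, _, Vt = np.linalg.svd(sample_motions - mean, full_matrices=False)
            basis = Vt[:n_components]                       # (n_components, D)
            score = basis @ (target_motion - mean)          # project the deviation
            return mean + (gain * score) @ basis            # amplified reconstruction

        # Toy usage: 20 sample motions of dimension 300.
        motions = np.random.rand(20, 300)
        exaggerated = exaggerate_motion(motions, motions[0], gain=1.8)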

  • Ambient Occlusion by Curvature Depended Local Illumination Approximation of Occlusion

    HATTORI Tomohito, KUBO Hiroyuki, MORISHIMA Shigeo

    ACM SIGGRAPH 2010 Posters, SIGGRAPH '10   138 ( 13 ) M1 - M6  2010.02

     View Summary

    This paper discusses an approach for computing ambient occlusion by a curvature-dependent approximation of occlusion. Ambient occlusion is widely used to improve the realism of fast lighting simulation. © ACM 2010.

    DOI CiNii

  • Voice Output System Considering Personal Voice for Instant Casting Movie

    川本 真一, 足立 吉広, 大谷 大和, 四倉 達夫, 森島 繁生, 中村 哲

       2010.02

     View Summary

    In this paper, we propose an improved Future Cast System (FCS) that enables anyone to be a movie star while retaining their individuality in terms of how they look and how they sound. The proposed system produces voices that are significantly matched to their targets by integrating the results of multiple methods: similar speaker selection and voice morphing. After assigning one CG character to the audience, the system produces voices in synchronization with the CG character's movement. We constructed the speech synchronization system using a voice actor database with 60 different kinds of voices. Our system achieved higher voice similarity than conventional systems; the preference score of our system was 56.5% over other conventional systems.

    CiNii

  • Facial animation reflecting personal characteristics by automatic head modeling and facial muscle adjustment

    Akinobu Maejima, Hiroyuki Kubo, Shigeo Morishima

    ISCIT 2010 - 2010 10th International Symposium on Communications and Information Technologies     7 - 12  2010

     View Summary

    We propose a new automatic character modeling system that generates an individualized head model from only a facial range scan and produces individualized facial animation with expression changes. The system consists of two core modules: a head modeling module that generates a head model from personal facial range-scan data using automatic mesh completion, and a key shape generation module that generates key shapes for the model through physics-based facial-muscle simulation, with a personal muscle layout estimated from videos of the subject's facial expressions. As a result, we can generate a head model that synthesizes facial expressions and an impression similar to the target person. Experiments show that 84% of subjects could identify themselves in the synthesized CG characters, so we conclude that our head modeling system is effective for games and entertainment systems such as the Future Cast. ©2010 IEEE.

    DOI

  • Voice Output System Considering Personal Voice for Instant Casting Movie

    川本真一, 足立吉広, 大谷大和, 四倉達夫, 森島繁生, 中村哲

    情報処理学会論文誌ジャーナル(CD-ROM)   51 ( 2 ) 250 - 264  2010

    J-GLOBAL

  • 音楽特徴量と印象語の分析に基づく楽曲のサムネイル表現技術

    長谷川裕記, 室伏空, 山本達也, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2010   ROMBUNNO.2-8-20  2010

    J-GLOBAL

  • 物まね音声の知覚特性を反映した音声類似度評価尺度の提案

    田中茉莉, 山本達也, 室伏空, 河原英紀, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2010   ROMBUNNO.1-7-11  2010

    J-GLOBAL

  • 人体の骨格形状を考慮したスキニング手法の提案

    須田洋文, 中村槙介, 山中健太郎, 森島繁生

    電子情報通信学会大会講演論文集   2010   193  2010

    J-GLOBAL

  • 非線形モーフィングに基づく手描き顔アニメーションの中割り画像生成

    郷原裕明, 杉本志織, 森島繁生

    電子情報通信学会大会講演論文集   2010   81  2010

    J-GLOBAL

  • 多視点顔画像に基づく顔器官毎の重みを考慮した3次元顔形状推定

    原朋也, 藤代裕紀, 中野真也, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2010   119  2010

    J-GLOBAL

  • 静的・動的特徴を考慮した布の物理パラメータ推定

    國友翔次, 中村槙介, 森島繁生

    電子情報通信学会大会講演論文集   2010   228  2010

    J-GLOBAL

  • 3次元形状とテクスチャの双方の変換による年齢変化顔の生成

    田副佑典, 藤代裕紀, 中野真也, 野中悠介, 笠井聡子, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2010   192  2010

    J-GLOBAL

  • Aging Simulation of Personal Face Based on Conversion of 3D Geometry and Texture

    田副佑典, 藤代裕紀, 中野真也, 笠井聡子, 前島謙宣, 森島繁生

    電子情報通信学会技術研究報告   109 ( 471(HIP2009 118-210) ) 151 - 156  2010

    J-GLOBAL

  • Fast-Automatic 3D Face Model Generation from Single Snapshot

    前島謙宣, 森島繁生

    電子情報通信学会技術研究報告   109 ( 471(HIP2009 118-210) ) 145 - 150  2010

    J-GLOBAL

  • 3D Face Reconstruction from Multi-view Images

    原朋也, 藤代裕紀, 中野真也, 前島謙宣, 森島繁生

    電子情報通信学会技術研究報告   109 ( 471(HIP2009 118-210) ) 73 - 78  2010

    J-GLOBAL

  • Dynamic Shape Reconstruction from Multiple Views using Three Colored Lightings

    小林昭太, 森島繁生

    電子情報通信学会技術研究報告   109 ( 471(HIP2009 118-210) ) 157 - 162  2010

    J-GLOBAL

  • Ambient Occlusion by Curvature Depended Local Illumination Approximation of Occlusion.

    服部智仁, 久保尋之, 森島繁生

    情報処理学会研究報告(CD-ROM)   2009 ( 6 ) ROMBUNNO.CG-138,13  2010

    J-GLOBAL

  • Gait Animation Synthesis Exaggerated Characteristics Based on Perception Similarity

    中村槙介, 森島繁生

    情報処理学会研究報告(CD-ROM)   2009 ( 6 ) ROMBUNNO.CG-138,6  2010

    J-GLOBAL

  • 遮蔽度の曲率近似を用いたアンビエントオクルージョンの局所照明モデル化

    服部智仁, 久保尋之, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2010   MODERINGU,HATTORITOMOHITO  2010

    J-GLOBAL

  • 半透明物体の高速描画に向けた曲率に依存する反射関数の近似式

    久保尋之, 土橋宜典, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2010   MODERINGU,KUBOHIROYUKI  2010

    J-GLOBAL

  • 静的・動的特徴を考慮した布シミュレーションの物理パラメータ推定

    國友翔次, 中村槙介, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2010   MODERINGU,KUNITOMOSHOJI  2010

    J-GLOBAL

  • コンシューマ参加型デジタルコンテンツ

    森島繁生

    電子情報通信学会大会講演論文集   2010   SS.92-SS.95  2010

    J-GLOBAL

  • 人肌などの半透明物体の高速描画法

    久保尋之, 土橋宜典, 森島繁生

    日本顔学会誌   10 ( 1 ) 111  2010

    J-GLOBAL

  • 話速変化に応じたリップシンクアニメーションの作成

    高見澤涼, 矢野茜, 久保尋之, 前島謙宣, 森島繁生

    日本顔学会誌   10 ( 1 ) 132  2010

    J-GLOBAL

  • Gait Animation Synthesis having Characteristics Exaggeration by Perceptual Similarity Basis

    中村槙介, 森島繁生

    画像電子学会誌   39 ( 5 ) 615 - 620  2010

     View Summary

    Characteristics of human motion, such as walking, running or jumping vary from person to person. Differences in human motion enable people to identify oneself or a friend. However, it is challenging to generate animation where individual characters exhibit characteristic motion using computer graphics. In our research, differences between an average motion in some sample motions and a target motion are considered as characteristics target motion includes. We are able to synthesize gait animation having exaggerated characteristics by increasing the differences. The synthesized motion is represented as PCA (Principal Component Analysis) score in PCA space composed of sample motions. In the experiment of looking for a target motion from crowd, we estimate the optimum degree of exaggerated characteristics to minimize the finding time of the target motion. © 2010, The Institute of Image Electronics Engineers of Japan. All rights reserved.

    DOI J-GLOBAL

  • 曲率に依存する反射関数を用いた半透明物体の高速レンダリング

    久保尋之, 土橋宜典, 森島繁生

    電子情報通信学会論文誌 A   J93-A ( 11 ) 708 - 717  2010

    J-GLOBAL

  • Gait Animation Synthesis having Characteristics Exaggeration by Perceptual Similarity Basis

    NAKAMURA Shinsuke, MORISHIMA Shigeo

    The Journal of the Institute of Image Electronics Engineers of Japan   39 ( 5 ) 615 - 620  2010

     View Summary

    Characteristics of human motion, such as walking, running or jumping vary from person to person. Differences in human motion enable people to identify oneself or a friend. However, it is challenging to generate animation where individual characters exhibit characteristic motion using computer graphics. In our research, differences between an average motion in some sample motions and a target motion are considered as characteristics target motion includes. We are able to synthesize gait animation having exaggerated characteristics by increasing the differences. The synthesized motion is represented as PCA (Principal Component Analysis) score in PCA space composed of sample motions. In the experiment of looking for a target motion from crowd, we estimate the optimum degree of exaggerated characteristics to minimize the finding time of the target motion.

    DOI CiNii J-GLOBAL

  • プログラム担当より

    森島 繁生

    日本バーチャルリアリティ学会誌 = Journal of the Virtual Reality Society of Japan   14 ( 4 ) 207 - 208  2009.12

    CiNii

  • 座長からの報告

    渡邊 淳司, 藤田 欣也, 加藤 博一, ピトヨ ハルトノ, 苗村 健, 昆陽 雅司, 小林 稔, 酒井 幸仁, 筧 康明, 梶本 裕之, 北崎 充晃, 宮崎 慎也, 唐山 英明, 池井 寧, 柳田 康幸, 森島 繁生, 広田 光一, 前田 太郎, 中口 俊哉, 長谷川 晶一, 小木 哲朗, 井原 雅行, 妹尾 武治, 稲見 昌彦, 清川 清, 橋本 直己, 安藤 英由樹, 宮田 一乘, 吉田 俊介, 大田 友一, 原田 哲也, 寺林 賢司, 北島 律之, 松本 光春, 綿貫 啓一, 青木 義満

    日本バーチャルリアリティ学会誌 = Journal of the Virtual Reality Society of Japan   14 ( 4 ) 213 - 223  2009.12

    CiNii

  • Development of a toolkit for spoken dialog systems with an anthropomorphic agent: Galatea

    Takuya Nishimoto, Yoichi Yamashita, Tsuneo Nitta

    APSIPA ASC 2009 - Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference     148 - 153  2009.12

     View Summary

    The Interactive Speech Technology Consortium (ISTC) has been developing a toolkit called Galatea that comprises four fundamental modules for speech recognition, speech synthesis, face synthesis, and dialog control, that can be used to realize an interface for spoken dialog systems with an anthropomorphic agent. This paper describes the development of the Galatea toolkit and the functions of each module; in addition, it discusses the standardization of the description of multi-modal interactions.

  • High Realistic Contents that Make Audience Dive Into the Movie

    MORISHIMA Shigeo

    電気学会研究会資料. EDD, 電子デバイス研究会   2009 ( 81 ) 27 - 32  2009.11

    CiNii

  • High realistic contents that make audience dive into the movie

    森島 繁生

    研究会講演予稿   247   27 - 32  2009.11

    CiNii J-GLOBAL

  • High Realistic Contents that Make Audience Dive Into the Movie

    MORISHIMA Shigeo

    IEICE technical report   109 ( 267 ) 27 - 32  2009.10

     View Summary

    We have proposed a new style of movie content in which every member of the audience participates in the story as a cast member, providing an immersive experience of the story. In 2005, at the Mitsui-Toshiba pavilion of the Aichi Expo, the Future Cast System was implemented for the first time in the world, and 1,630,000 people experienced the content "Grand Odyssey"; the system was then commercialized as the Future Cast Theater at HUIS TEN BOSCH, Nagasaki, in 2007. A key factor in its success is customizing each CG character fully automatically and instantly, without imposing any burden on the audience. In this paper, we propose a new character customization method that automatically reflects personality in facial expression, body size, gait, voice quality and hair style within a few minutes. Real-time full-body motion synthesis and rendering, including hair, is also realized; as a result, the self-awareness rate at which audience members recognize themselves on screen improves to almost 90%, compared with 59% at EXPO 2005.

    CiNii

  • An Automatic Music Video Creation System by Reusing Dance Video Content

    MUROFUSHI SORA, NAKANO TOMOYASU, GOTO MASATAKA, MORISHIMA SHIGEO

    研究報告音楽情報科学(MUS)   81 ( 21 ) U1 - U7  2009.07

     View Summary

    This paper presents a system that automatically generates a dance video clip appropriate to a piece of music by segmenting and concatenating existing dance video clips. Although there has been previous work on automatic video generation based on cut-and-paste, it did not model the diverse associations between music and video. To model such associations, our system uses a large amount of fan-made (derivative) content published on the web and selects video segments appropriate to the music using clustering and multiple linear regression models. In addition to the music-video associations, costs representing the temporal continuity and the musical structure of the generated clip are introduced, so that video generation is solved as a cost-minimization problem by Viterbi search.

    CiNii

  • Automatic Head Model Generation Based on Optimized Local Affine Transform Using Facial Range Scan Data

    MAEJIMA Akinobu, MORISHIMA Shigeo

    The Journal of the Institute of Image Electronics Engineers of Japan   38 ( 4 ) 404 - 413  2009.07

     View Summary

    We propose an automatic 3D human head modeling method that uses both a frontal facial image and geometry. In general, template-mesh fitting methods are used to create a face model from facial range data obtained with a range scanner, but previous fitting techniques require manually specifying markers on the scanned 3D geometry and manually correcting the geometry of regions, such as the hair, whose shape cannot be measured accurately. We therefore complement the geometry of such regions with that of the template mesh. Our technique generates a head model in which the scanned 3D face geometry and the template mesh are seamlessly connected, and its computational time is much shorter than that of previous template-mesh fitting methods. We therefore conclude that the proposed method is effective for creating large numbers of head models in the game and film industries and in entertainment systems.

    DOI CiNii J-GLOBAL

  • Voice conversion based on vowel-change using coarticulation model

    YAMAMOTO Tatsuya, MUROFUSHI Sora, MORISHIMA Shigeo

    IEICE technical report   109 ( 139 ) 37 - 42  2009.07

     View Summary

    Statistical spectrum conversion has long been studied as a voice conversion technique for synthesized speech. It is known that high-quality speaker conversion becomes possible by collecting a large amount of utterance data from the target speaker in advance, but recording such a large amount of data places a heavy burden on that speaker. Speaker conversion by vowel exchange, on the other hand, can convert arbitrary utterance content by acquiring only the vowels from the target speaker. The authors previously proposed a technique that synthesizes natural speech by applying a coarticulation model near phoneme boundaries to alleviate the discontinuities that were a problem of the vowel exchange method. In the present study, we propose a technique that synthesizes speech closer to natural utterances by learning, from the content of the input speaker's utterance, the intervals to which the coarticulation model is applied.

    CiNii

  • Generating Forearm Motion with Skin Deformation Model Based on MRI images

    YAMANAKA Kentaro, NAKAMURA Shinsuke, KOBAYASHI Shota, MORISHIMA Shigeo

    Human Interface   11 ( 2 ) 85 - 90  2009.05

    CiNii

  • Generating Forearm Motion with Skin Deformation Model Based on MRI images

    YAMANAKA Kentaro, NAKAMURA Shinsuke, KOBAYASHI Shota, MORISHIMA Shigeo

    IEICE technical report   109 ( 29 ) 85 - 90  2009.05

     View Summary

    This paper presents a new methodology for constructing an example-based skin deformation model of the human forearm from MRI images. Generating realistic skin animation of forearm movement is generally difficult in CG because the structure of a virtual human's forearm differs crucially from that of a real human. We therefore propose a skin deformation model built from MRI images, which allow example skin shapes to be modeled accurately in association with the locations of the bones. We also describe how to apply the model to characters to generate skin animation, using Radial Basis Functions (RBF): once the model is constructed, skin animation is easily generated by applying it to a character's forearm by means of RBF interpolation. In this paper we first describe how the model is constructed, and then explain how it is applied to characters to generate skin animation. (An RBF-interpolation sketch follows this entry.)

    CiNii
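
    As a minimal illustration of example-based interpolation with RBFs, the sketch below maps a hypothetical forearm pose parameter to a flattened skin shape with SciPy's RBFInterpolator. The poses, vertex count and random example shapes are placeholders, not the MRI data.

        import numpy as np
        from scipy.interpolate import RBFInterpolator

        # Hypothetical example data: for a few measured forearm poses (e.g. pronation
        # angles from the scans) we have the corresponding flattened skin vertex positions.
        example_poses = np.array([[0.0], [30.0], [60.0], [90.0]])        # (M, 1) pose parameters
        example_shapes = np.random.rand(4, 500 * 3)                      # (M, V*3) flattened vertices

        # One RBF model maps the pose parameter to the whole flattened skin shape.
        skin_model = RBFInterpolator(example_poses, example_shapes, kernel="thin_plate_spline")

        # Skin shape for an unseen pose, reshaped back to (V, 3) vertices.
        new_shape = skin_model(np.array([[45.0]])).reshape(-1, 3)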

  • Generating Forearm Motion with Skin Deformation Model Based on MRI images

    YAMANAKA Kentaro, NAKAMURA Shinsuke, KOBAYASHI Shota, MORISHIMA Shigeo

    IEICE technical report   109 ( 27 ) 85 - 90  2009.05

     View Summary

    This paper presents a new methodology for constructing an example-based skin deformation model of human forearm based on MRI images. Generating realistic skin animation of forearm movement is generally difficult in CG domain because there is a crucial difference between a structure of forearm of a virtual human and a real human. So we propose a new kind of skin deformation model based on MRI images. Using MRI images, we can model example skin shapes associated with location of bones with accuracy. We also mention how to apply the model to characters to generate skin animation. For this purpose, we employ RBF, Radial Basis Functions. Once the model is constructed, skin animation is easily generated by applying the model to the forearm of a character by means of RBF. In this paper, we describe how to construct our model, first. Then we explain the method to apply the model to characters and generate skin animation.

    CiNii

  • Generating Forearm Motion with Skin Deformation Model Based on MRI images

    YAMANAKA Kentaro, NAKAMURA Shinsuke, KOBAYASHI Shota, MORISHIMA Shigeo

    IEICE technical report   109 ( 28 ) 85 - 90  2009.05

     View Summary

    This paper presents a new methodology for constructing an example-based skin deformation model of human forearm based on MRI images. Generating realistic skin animation of forearm movement is generally difficult in CG domain because there is a crucial difference between a structure of forearm of a virtual human and a real human. So we propose a new kind of skin deformation model based on MRI images. Using MRI images, we can model example skin shapes associated with location of bones with accuracy. We also mention how to apply the model to characters to generate skin animation. For this purpose, we employ RBF, Radial Basis Functions. Once the model is constructed, skin animation is easily generated by applying the model to the forearm of a character by means of RBF. In this paper, we describe how to construct our model, first. Then we explain the method to apply the model to characters and generate skin animation.

    CiNii

  • CGキャラクタの存在感

    森島 繁生

    日本バーチャルリアリティ学会誌 = Journal of the Virtual Reality Society of Japan   14 ( 1 ) 23 - 28  2009.03

    CiNii J-GLOBAL

  • D-14-4 Speaker Conversion System Based on Vowel Change Using Coarticulation Correcting

    Yamamoto Tatsuya, Murofushi Sora, Kondo Kojiro, Morishima Shigeo

    Proceedings of the IEICE General Conference   2009 ( 1 ) 167 - 167  2009.03

    CiNii

  • D-11-109 Data Driven GUI Development for Car Shape Design

    Nakada Masaki, Hayakawa Tatsunori, Sugimoto Shiori, Morishima Shigeo

    Proceedings of the IEICE General Conference   2009 ( 2 ) 109 - 109  2009.03

    CiNii J-GLOBAL

  • A-10-2 Correction system for overtone mistakes using template matching

    Fujisawa Kentaro, Murofushi Sora, Kondo Kojiro, Morishima Shigeo

    Proceedings of the IEICE General Conference   2009   197 - 197  2009.03

    CiNii

  • A-15-14 Construction of Facial Muscle Model based on Controlling Fat-Layer Thickness using MRI

    Yarimizu Hiroto, Ishibashi Yasushi, Kubo Hiroyuki, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2009   250 - 250  2009.03

    CiNii J-GLOBAL

  • A-15-17 Construction of Facial Eigen-space for Synthesizing Various Facial Expressions

    Takamizawa Ryo, Suzuki Takanori, Kubo Hiroyuki, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2009   253 - 253  2009.03

    CiNii

  • A-15-16 A Classification of Artificial Smile and Natural Smile based on Optical Flow in HD Video Sequence

    Fujishiro Hiroki, Suzuki Takanori, Nakano Shinya, Nonaka Yusuke, Maejima Akinobu, Morishima Shigeo

    Proceedings of the IEICE General Conference   2009   252 - 252  2009.03

    CiNii J-GLOBAL

  • A-15-15 Accurate Reconstruction of Skin Deformation during Forearm Movement Using MRI

    Yamanaka Kentaro, Nakamura Shinsuke, Yano Akane, Morishima Shigeo

    Proceedings of the IEICE General Conference   2009   251 - 251  2009.03

    CiNii

  • MRIに基づく皮膚下構造を反映した顔面筋肉モデルの構築

    鑓水裕刀, 石橋康, 久保尋之, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2009   250  2009

    J-GLOBAL

  • 楽器音テンプレートマッチングによる倍音誤り補正システム

    藤澤賢太郎, 室伏空, 近藤康治郎, 森島繁生

    電子情報通信学会大会講演論文集   2009   197  2009

    J-GLOBAL

  • MRIを用いた前腕運動時の皮膚形状変化の精密な再現

    山中健太郎, 中村槙介, 矢野茜, 森島繁生

    電子情報通信学会大会講演論文集   2009   251  2009

    J-GLOBAL

  • 調音結合補正を用いた母音交換法に基づく話者変換法

    山本達也, 室伏空, 近藤康治郎, 森島繁生

    電子情報通信学会大会講演論文集   2009   167  2009

    J-GLOBAL

  • データベースに基づく車体形状デザインGUIの構築

    仲田真輝, 早川達順, 杉本志織, 森島繁生

    電子情報通信学会大会講演論文集   2009   109  2009

    J-GLOBAL

  • 多様な表情を合成可能な固有顔空間の構築

    高見澤涼, 鈴木孝章, 久保尋之, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2009   253  2009

    J-GLOBAL

  • 顔動画像のオプティカルフローに基づく作り笑い・自然な笑いの識別

    藤代裕紀, 鈴木孝章, 中野真也, 野中悠介, 前島謙宣, 森島繁生

    電子情報通信学会大会講演論文集   2009   252  2009

    J-GLOBAL

  • 英語情動文的“I love you”中国語話者による認知と音響特性相関(2)

    ヤーッコラ伊勢井敏子, 広瀬啓吉, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2009   ROMBUNNO.2-P-16  2009

    J-GLOBAL

  • 音と同期した3DCGを用いた日本人英語学習者に苦手な構音運動のためのトレーニングシステム―唇の突き出し

    ヤーッコラ(伊勢井)敏子, 鈴木茂樹, 広瀬啓吉, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2009   ROMBUNNO.3-6-13  2009

    J-GLOBAL

  • アンドロイドやエージェントに感じる人の存在感 CGキャラクタの存在感

    森島繁生

    日本バーチャルリアリティ学会誌   14 ( 1 ) 23-28,1(1)  2009

    J-GLOBAL

  • Generating Forearm Motion with Skin Deformation Model Based on MRI images

    山中健太郎, 中村槙介, 小林昭太, 森島繁生

    電子情報通信学会技術研究報告   109 ( 29(WIT2009 1-47) ) 85 - 90  2009

    J-GLOBAL

  • MRI計測から獲得される皮膚下の厚みを適用した顔面筋肉モデルの構築

    鑓水裕刀, 石橋康, 久保尋之, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2009   ROMBUNNO.31  2009

    J-GLOBAL

  • MRIを用いた前腕運動に伴う皮膚形状変化モデルの構築

    山中健太郎, 中村槙介, 小林昭太, 白石允梓, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2009   ROMBUNNO.21  2009

    J-GLOBAL

  • Voice conversion based on vowel-change using coarticulation model

    山本達也, 室伏空, 森島繁生

    電子情報通信学会技術研究報告   109 ( 139(SP2009 41-48) ) 37 - 42  2009

    J-GLOBAL

  • Automatic Head Model Generation Based on Optimized Local Affine Transform Using Facial Range Scan Data

    前島謙宣, 森島繁生

    画像電子学会誌   38 ( 4 ) 404 - 413  2009

     View Summary

    We propose an automatic 3D human head modeling method that uses both a frontal facial image and geometry. In general, template-mesh fitting methods are used to create a face model from facial range data obtained with a range scanner, but previous fitting techniques require manually specifying markers on the scanned 3D geometry and manually correcting the geometry of regions, such as the hair, whose shape cannot be measured accurately. We therefore complement the geometry of such regions with that of the template mesh. Our technique generates a head model in which the scanned 3D face geometry and the template mesh are seamlessly connected, and its computational time is much shorter than that of previous template-mesh fitting methods. We therefore conclude that the proposed method is effective for creating large numbers of head models in the game and film industries and in entertainment systems. © 2009, The Institute of Image Electronics Engineers of Japan. All rights reserved.

    DOI J-GLOBAL

  • An Automatic Music Video Creation System by Reusing Dance Video Content

    室伏空, 中野倫靖, 後藤真孝, 森島繁生

    情報処理学会研究報告(CD-ROM)   2009 ( 2 ) ROMBUNNO.MUS-NO.81(21)  2009

    J-GLOBAL

  • 多波長照明を用いた多視点画像からの動的立体形状再現

    小林昭太, 森島繁生

    日本バーチャルリアリティ学会大会論文集(CD-ROM)   14th   ROMBUNNO.1A4-1  2009

    J-GLOBAL

  • 平均顔を用いた実験用日本人表情刺激作成の試み

    木村あやの, 鈴木竜太, 吉田宏之, 渡邊伸行, 續木大介, CHANDRASIRI Naiwala P, 小泉憲生, 時田学, 森島繁生, 山田寛

    日本顔学会誌   9 ( 1 ) 53 - 69  2009

    J-GLOBAL

  • MRI画像に基づく皮膚下の厚みを反映した顔面筋肉モデルの構築

    鑓水裕刀, 久保尋之, 前島謙宣, 森島繁生

    日本顔学会誌   9 ( 1 ) 196  2009

    J-GLOBAL

  • 同世代女性の疲労感と表情の関係

    鈴木めぐみ, 永嶋義直, 矢田幸博, 森島繁生, 山田寛

    日本顔学会誌   9 ( 1 ) 245  2009

    J-GLOBAL

  • High Realistic Contents that Make Audience Dive Into the Movie

    森島繁生

    画像電子学会研究会講演予稿   247th   27 - 32  2009

    J-GLOBAL

  • High Realistic Contents that Make Audience Dive Into the Movie

    MORISHIMA Shigeo

    ITE Technical Report   33 ( 0 ) 27 - 32  2009

     View Summary

    We have proposed a new style of movie content in which every member of the audience participates in the story as a cast member, providing an immersive experience of the story. In 2005, at the Mitsui-Toshiba pavilion of the Aichi Expo, the Future Cast System was implemented for the first time in the world, and 1,630,000 people experienced the content "Grand Odyssey"; the system was then commercialized as the Future Cast Theater at HUIS TEN BOSCH, Nagasaki, in 2007. A key factor in its success is customizing each CG character fully automatically and instantly, without imposing any burden on the audience. In this paper, we propose a new character customization method that automatically reflects personality in facial expression, body size, gait, voice quality and hair style within a few minutes. Real-time full-body motion synthesis and rendering, including hair, is also realized; as a result, the self-awareness rate at which audience members recognize themselves on screen improves to almost 90%, compared with 59% at EXPO 2005.

    DOI CiNii

  • Visual Entertainment System Considering Personal Voice

    足立 吉広, 大谷 大和, 川本 真一, 四倉 達夫, 森島 繁生, 中村 哲

    情報処理学会論文誌   49 ( 12 ) 3908 - 3917  2008.12

     View Summary

    In this paper, we propose an improved Future Cast System (FCS) that enables anyone to be a movie star while retaining their individuality in voice as well as face. The previous system created a CG character that closely resembles the audience member's face, but the character's voice was selected considering only gender and was therefore not distinctive enough for self-identification. The proposed system produces a voice much closer to the audience member's by selecting one from a voice-actor database, where speaker similarity is estimated using a combination of eight acoustic features. After assigning a CG character to the audience member, the system produces voices in synchronization with the character's movement. We constructed the speech synchronization system using a voice-actor database with 60 voice qualities and conducted five-grade subjective evaluations of voice similarity. The proposed method achieves 68% of the theoretical figure obtained by considering the acceptance rate of the selected speaker relative to the database size.

    CiNii
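
    The following is a hypothetical sketch of the speaker selection step described in the summary above: several acoustic features are combined into one vector per speaker, and the voice-actor database entry closest to the audience member is chosen. The feature dimensionality, the per-feature weights and the weighted Euclidean distance are assumptions for illustration only; the paper's actual similarity estimator is not reproduced here.

        import numpy as np

        def select_similar_voice(audience_features, actor_database, weights):
            """audience_features: (D,) combined acoustic feature vector for the audience member.
            actor_database: dict mapping actor id -> (D,) feature vector.
            weights: (D,) non-negative per-feature weights. Returns the closest actor id."""
            best_id, best_dist = None, np.inf
            for actor_id, features in actor_database.items():
                dist = np.sqrt(np.sum(weights * (audience_features - features) ** 2))
                if dist < best_dist:
                    best_id, best_dist = actor_id, dist
            return best_id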

  • Face Synthesis and its Emotional Evaluation

    MORISHIMA Shigeo

    The Journal of The Institute of Image Information and Television Engineers   62 ( 12 ) 1924 - 1927  2008.12

    DOI CiNii

  • Research and Development toward Highly Efficient Anime Production

    森島 繁生

    画像ラボ   19 ( 7 ) 34 - 39  2008.07

    CiNii J-GLOBAL

  • A Psychophysical Research on Perception of Facial Expressions using a Virtual Realistic Facial Image Processing System : Especially on the relationship between the strength of basic expressions and our sensation

    YAMADA Hiroshi, NAKAMURA Hironobu, MORISHIMA Shigeo, HARASHIMA Hiroshi

    Technical report of IEICE. HCS   95 ( 477 ) 15 - 20  2008.05

     View Summary

    Recent progress in facial image processing technology has made it possible to study the perception of facial expressions by psychophysical methods using realistic facial stimuli. This paper reports three experiments as our first step in this direction. Each experiment attempted to clarify the relationship between the strength of facial expressions and our sensation. Experiment 1 measured the thresholds of perceived differences in facial appearance at five levels of expression strength for 'surprise' and 'anger'. Experiment 2 measured the same thresholds for 'happiness' and 'sadness'. The results showed that there are two types of relationship between the physical strength of expressions and the perceived strength, which seem to be caused by two different visual properties of expressions. Experiment 3 confirmed that the subjects in the previous experiments judged differences between facial images by perceiving the expressions in the images, by measuring the same thresholds for inverted 'surprise' stimuli.

    CiNii

  • An attempt to develop tongue movement and sound by 3DCG for teaching English pronunciation: link to lexicon

    ヤーッコラ伊勢井敏子, 鈴木茂樹, 森島繁生, 広瀬啓吉

    ITE technical report   32 ( 15 ) 29 - 32  2008.03

    CiNii J-GLOBAL

  • Data-Driven Modeling of Backbone Motions Based on Actual Bone Structure of Vertebra

    SEKINE Takao, MORISHIMA Shigeo

    IPSJ SIG Notes   130 ( 14 ) 11 - 16  2008.02

     View Summary

    In this research, we create a three-dimensional backbone model whose vertebra shapes are based on the structure of actual humans, and we develop a system for generating natural backbone motions for CG characters. With the advancement of CG technology, character animations in movies and video games are produced with skeleton models. Compared with body parts of simple composition, such as arms and legs, it is difficult for users to model a backbone composed of many detailed bones. We therefore generate and control natural backbone motions by dividing the backbone into three parts: the lumbar, thoracic and cervical vertebrae. (A small illustrative sketch of this three-part control follows this entry.)

    CiNii J-GLOBAL
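
    As a small illustration of the three-part control described in the summary above, the sketch below splits the spine into cervical, thoracic and lumbar sections (7, 12 and 5 vertebrae in an actual human) and distributes each section's bend evenly over its vertebrae. The class, its field names and the even distribution are assumptions for illustration, not the authors' model.

        from dataclasses import dataclass

        @dataclass
        class SpineSection:
            name: str
            vertebra_count: int
            bend_degrees: float = 0.0  # total bend currently assigned to this section

            def per_vertebra_rotation(self):
                # Spread the section's bend evenly over its vertebrae.
                return self.bend_degrees / self.vertebra_count

        spine = [SpineSection("cervical", 7), SpineSection("thoracic", 12), SpineSection("lumbar", 5)]
        spine[2].bend_degrees = 30.0  # e.g. bend the lumbar section by 30 degrees in total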

  • "Dive into the Move" Project to realize an immersive experience in the story

    MORISHIMA Shigeo, YAGI Yasushi, KAWAMOTO Shinichi

    IPSJ SIG Notes. CVIM   161 ( 3 ) 121 - 128  2008.01

     View Summary

    "Dive into the Movie" is world new entertainment system which can provide two styles of immersion experience into the story by giving a chance to audience to share an impression with his family or friend by watching a movie in which all audience can participate in the story as movie casts and by giving a chance to experience the panoramic scene and acoustic environment surrounding with the movie cast by special environment capturing and playback system. To realize this system, several techniques not only to model and capture the personal characteristics instantly in face, body, gesture and voice, but also to capture and playback high fidelity acoustic and visual environment automatically.

    CiNii

  • "Dive into the Move" Project to realize an immersive experience in the story

    MORISHIMA Shigeo, YAGI Yasushi, KAWAMOTO Shinichi

    IEICE technical report   107 ( 427 ) 153 - 160  2008.01

     View Summary

    "Dive into the Movie" is world new entertainment system which can provide two styles of immersion experience into the story by giving a chance to audience to share an impression with his family or friend by watching a movie in which all audience can participate in the story as movie casts and by giving a chance to experience the panoramic scene and acoustic environment surrounding with the movie cast by special environment capturing and playback system. To realize this system, several techniques not only to model and capture the personal characteristics instantly in face, body, gesture and voice, but also to capture and playback high fidelity acoustic and visual environment automatically.

    CiNii

  • Soft Tissue Changes When Smiling after Orthognathic Surgery

    吉田満, 寺田員人, 杉野伸一郎, 佐野奈都貴, 長谷川優, 小原彰浩, 齋藤功, 森島繁生

    日本矯正歯科学会大会プログラム・抄録集   67th   213  2008

    J-GLOBAL

  • "Fundamental Technologies Supporting the Creation of Digital Media Works" 2008: Research on Component Technologies for Highly Efficient Content Production

    森島繁生, 安生健一, バクスター ウィリアム, 中村哲, 四倉達夫, 川本真一

    デジタルメディア作品の制作を支援する基盤技術 第2回領域シンポジウム 平成20年     14 - 15  2008

    J-GLOBAL

  • Research on Component Technologies for Highly Efficient Content Production

    森島繁生

    戦略的創造研究推進事業研究年報(CD-ROM)   2008   DEJITARUMEDIA,MORISHIMA  2008

    J-GLOBAL

  • “Dive into the Movie” Project to realize an immersive experience in the story

    森島繁生, 八木康史, 中村哲

    情報処理学会研究報告   2008 ( 3(CVIM-161) ) 121 - 128  2008

    J-GLOBAL

  • Highly Efficient Character Animation Production

    Morishima Shigeo, Kuriyama Shigeru, Kawamoto Shinichi

    The Journal of The Institute of Image Information and Television Engineers   62 ( 2 ) 156 - 160  2008

    DOI CiNii J-GLOBAL

  • An Interactive Tool for Editing Shadows in Cartoon Animation

    Nakajima Hidehito, Sugisaki Eiji, Morishima Shigeo

    The Journal of The Institute of Image Information and Television Engineers   62 ( 2 ) 234 - 239  2008

     View Summary

    Shadows in cartoon animation are generally used to dramatize scenes. In hand-drawn animation, these shadows reflect the animators' intention and style rather than physical phenomena. On the other hand, shadows in 3DCG animation are rendered photorealistically, and animators cannot fully reflect their intention, because in 3DCG animation shadows are generated automatically once the light source is defined. We therefore developed an interactive tool for editing shadows that combines the advantages of hand-drawn animation and 3DCG technology. The advantage of our tool is that shadow attributes are inherited once animators edit the shape and location of shadows, and animators need only mouse operations to edit them. Consequently, our tool enables animators to create shadows automatically and easily while reflecting their intention and style. (An illustrative sketch of the attribute inheritance follows this entry.)

    DOI CiNii J-GLOBAL
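
    The sketch below is a hypothetical illustration of the attribute inheritance mentioned in the summary above: when an animator edits a shadow's shape or position, its remaining attributes (color, softness, and so on) are carried over unchanged. All class and field names are assumptions for illustration, not the tool's actual data model.

        from dataclasses import dataclass, replace

        @dataclass(frozen=True)
        class CartoonShadow:
            outline: tuple            # 2D contour points defining the shadow shape
            position: tuple           # placement of the shadow in the frame
            color: str = "#404060"
            softness: float = 0.0

        def edit_shadow_shape(shadow, new_outline, new_position):
            # replace() copies the shadow, overriding only shape and position,
            # so color and softness are inherited from the previous version.
            return replace(shadow, outline=new_outline, position=new_position)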

  • Data-Driven Modeling of Backbone Motions Based on Actual Bone Structure of Vertebra

    関根孝雄, 森島繁生

    情報処理学会研究報告   2008 ( 14(CG-130) ) 11 - 16  2008

    J-GLOBAL

  • Perceptual Voice Similarity Estimation by Integrating Multiple Acoustic Features

    足立吉広, 川本真一, 森島繁生, 中村哲

    日本音響学会研究発表会講演論文集(CD-ROM)   2008   ROMBUNNO.1-11-14  2008

    J-GLOBAL

  • Facial Expression Animation Considering the Transient Characteristics of a Facial Muscle Motion Model

    久保尋之, 石橋康, 前島謙宣, 森島繁生

    Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM)   2008   ROMBUNNO.20  2008

    J-GLOBAL

  • Research and Development toward Highly Efficient Anime Production

    森島繁生

    画像ラボ   19 ( 7 ) 34 - 39  2008

    J-GLOBAL

  • Recognition of the English Emotional Sentence "I love you" by Chinese Speakers and Its Correlation with Acoustic Properties (1)

    ヤーッコラ伊勢井敏子, 広瀬啓吉, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2008   ROMBUNNO.1-Q-5  2008

    J-GLOBAL

  • Evaluation of Similar Speakers Using Voice Morphing

    近藤康治郎, 足立吉広, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)   2008   ROMBUNNO.1-Q-18  2008

    J-GLOBAL

  • Selection of Acoustic Features for Perceptual Voice Similarity Estimation

    足立吉広, 川本真一, 森島繁生, 中村哲

    日本音響学会研究発表会講演論文集(CD-ROM)   2008   ROMBUNNO.2-P-20  2008

    J-GLOBAL

  • Modeling of Performance Expression Using Melody Scores and Transcribed Performance Records as Training Data

    室伏空, 近藤康治郎, 足立吉広, 森島繁生

    日本音響学会研究発表会講演論文集(CD-ROM)