Researcher Details - Shigeo Morishima


Shigeo Morishima (森島 繁生)

Scopus Publication Information

Papers: 355 Citations: 3279 h-index: 22

The data was downloaded from the Scopus API on October 20, 2025, via http://api.elsevier.com and http://www.scopus.com.

Google Scholar Information (Citations per year)

Citations: 7146 h-index: 33 i10-index: 117


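Affiliation

Faculty of Science and Engineering, School of Advanced Science and Engineering

Job Title

Professor

Degree

Doctor of Engineering

Email Address

Homepage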

https://morishima-lab.jp/

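Career

April 2004 - Present: Professor, Faculty of Science and Engineering, Waseda University
April 2010 - March 2014: Visiting Researcher, Spoken Language Communication Laboratory, NICT
April 1999 - March 2010: Visiting Researcher, Advanced Telecommunications Research Institute International (ATR)
April 2001 - March 2004: Professor, Faculty of Engineering, Seikei University
April 1988 - March 2001: Associate Professor, Faculty of Engineering, Seikei University
July 1994 - August 1995: Visiting Professor, Department of Computer Science, University of Toronto
April 1987 - March 1988: Lecturer, Faculty of Engineering, Seikei University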

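Education

April 1982 - March 1987: Electronic Engineering Program, Graduate School of Engineering, The University of Tokyo
April 1978 - March 1982: Department of Electronic Engineering, Faculty of Engineering, The University of Tokyo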

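Committee Memberships

September 2023 - Present: JST ASPIRE Advisor
August 2021 - Present: Pacific Graphics 2022 General Chair
April 2018 - Present: JST CREST Advisor
From August 2022: JST FOREST (創発) Advisor
April 2020 - March 2021: ACM VRST 2021 Sponsorship Chair
May 2016 - May 2020: Chair, Visual Computing Technical Committee, Institute of Image Electronics Engineers of Japan (IIEEJ)
December 2018 - December 2019: ACM VRST 2019 Program Chair
December 2018: ACM SIGGRAPH ASIA 2018 VR/AR Advisor
November 2018 - December 2018: ACM VRST 2018 General Chair
May 2014 - May 2016: Vice President, Institute of Image Electronics Engineers of Japan (IIEEJ)
November 2015: ACM SIGGRAPH ASIA 2015 Workshop/Partner Event Chair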

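Professional Memberships

The Society for Art and Science
Japanese Academy of Facial Studies
The Institute of Image Information and Television Engineers (ITE)
Information Processing Society of Japan (IPSJ)
The Institute of Image Electronics Engineers of Japan (IIEEJ)
IEEE
ACM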

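Research Field

Intelligent informatics

Research Keywords

Deep learning
Acoustic signal processing
Facial image processing
Multimedia information processing
Human-computer interaction
Computer vision
Computer graphics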

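Awards

Azusa Ono Memorial Academic Award

March 2025, Waseda University. WanderGuide: Indoor Map-less Robotic Guide for Exploration by Blind People

Recipients: Masaki Kuribayashi, Kohei Uehara, Allan Wang, Shigeo Morishima, Chieko Asakawa
Azusa Ono Memorial Academic Award

March 2025, Waseda University. Onset-and-Offset-Aware Sound Event Detection via Differentiable Frame-to-Event Mapping

Recipients: Tomoya Yoshinaga, Keitaro Tanaka, Yoshiaki Bando, Keisuke Imoto, Shigeo Morishima
Azusa Ono Memorial Academic Award

March 2025, Waseda University. Understanding and Supporting Formal Email Exchange by Answering AI-Generated Questions

Recipients: Yusuke Miura, Chi-Lan Yang, Masaki Kuribayashi, Keigo Matsumoto, Hideaki Kuzuoka, Shigeo Morishima
CHI2025 Honorable Mention

March 2025, CHI2025. Understanding and Supporting Formal Email Exchange by Answering AI-Generated Questions

Recipients: Yusuke Miura, Chi-Lan Yang, Masaki Kuribayashi, Keigo Matsumoto, Hideaki Kuzuoka, Shigeo Morishima
The 29th Acoustical Society of Japan Student Excellent Presentation Award

December 2024. Event-wise Loss Based on Hidden Semi-Markov Models for Sound Event Detection

Recipients: Tomoya Yoshinaga, Yoshiaki Bando, Keitaro Tanaka, Keisuke Imoto, Masaki Onishi, Shigeo Morishima
Technical Committee on Engineering Acoustics (EA), Student Research Encouragement Award

November 2024. HSMM-Based Event-wise Training for Sound Event Detection Using General-Purpose Pre-trained Models

Recipients: Tomoya Yoshinaga, Keitaro Tanaka, Yoshiaki Bando, Keisuke Imoto, Masaki Onishi, Shigeo Morishima
VC Student Research Award

September 2024, Visual Computing 2024. A Method for Improving Lip-Reading Accuracy Based on Multi-Task Learning Focusing on Speaker-Specific Speech Characteristics

Recipients: Sara Kashiwagi, Keitaro Tanaka, Shigeo Morishima
Best Presentation Award (Best Research category)

August 2024, IPSJ 141st Special Interest Group on Music and Computer (SIGMUS) Meeting. Decomposition of Monophonic Music Signals into Pitch, Timbre, and Variation Using a Variational Autoencoder

Recipients: Keitaro Tanaka, Kazuyoshi Yoshii, Simon Dixon, Shigeo Morishima
IBM Academic Research Award

February 2022, IBM. Real-World Accessibility for a New Era

Recipients: Shigeo Morishima
Azusa Ono Memorial Award

March 2021, Waseda University. Multi-Instrument Music Transcription Based on Deep Spherical Clustering of Spectrograms and Pitchgrams

Recipients: Keitaro Tanaka, Takayuki Nakatsuka, Ryo Nishikimi, Kazuyoshi Yoshii, Shigeo Morishima
Azusa Ono Memorial Award

March 2021, Waseda University. LineChaser: Smartphone-Based Assistance for Visually Impaired People Standing in Lines

Recipients: Masaki Kuribayashi, Seita Kayukawa, Hironobu Takagi, Chieko Asakawa, Shigeo Morishima
Best Paper Award

December 2020, Japan Society for Software Science and Technology. LineChaser: A Smartphone-Based Assistance System for Visually Impaired People to Stand in Lines

Recipients: Masaki Kuribayashi, Seita Kayukawa, Hironobu Takagi, Chieko Asakawa, Shigeo Morishima
CG Japan Award

November 2020, The Society for Art and Science. Cutting-Edge Research and Practical Development in CG, CV, and Music Information Processing

Recipients: Shigeo Morishima
Hagura Award, FORUM8 Prize

November 2020, State of the Art Technologies Expression Association (最先端表現技術利用推進協会). Automatic Generation of Photorealistic Singing and Dancing Characters from a Single Image

Recipients: Shigeo Morishima, Naoya Iwamoto, Takuya Kato, Takayuki Nakatsuka, Shugo Yamaguchi
Best Paper Finalist

June 2019, IEEE CVPR 2019. SiCloPe: Silhouette-Based Clothed People

Recipients: Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, Shigeo Morishima
Interaction 2019 Paper Award

March 2019, Information Processing Society of Japan. BBeep: A Collision Avoidance Support System for Visually Impaired People Using Warning Sounds Based on Pedestrian Collision Prediction

Recipients: Seita Kayukawa, Keita Higuchi, João Guerreiro, Shigeo Morishima, Yoichi Sato, Kris Kitani, Chieko Asakawa
Gold Paper Award

December 2017, International Conference on Advances in Computer Entertainment Technology (ACE2017). DanceDJ: A 3D Dance Animation Authoring System for Live Performance

Recipients: Naoya Iwamoto, Takuya Kato, Hubert Shum, Ryo Kakitsuka, Kenta Hara, Shigeo Morishima
Gold Paper Award

December 2017, International Conference on Advances in Computer Entertainment Technology (ACE2017). Voice Animator: Automatic Lip-Synching in Limited-Style Animation from Audio

Recipients: Shoichi Furukawa, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima
Visual Computing / Graphics and CAD Joint Symposium 2017 Excellent Research Presentation Award

June 2017, The Institute of Image Electronics Engineers of Japan. Template-Based Garment Modeling Considering Developable Surface Constraints

Recipients: Fumiya Narita, Shunsuke Saito, Tsukasa Fukusato, Shigeo Morishima

Abstract

To reduce the effort of garment modeling, this paper proposes a method that takes a single template garment model and generates a garment model and sewing pattern fitted to the body shape of an arbitrary character. An interface lets the user interactively correct the correspondence between the source body and the target body while checking the rough shape of the generated garment, which makes the garment transfer independent of the vertex count and vertex connectivity of the body models. In addition, developable-surface approximation is applied after an optimization step that reflects the shape of the source garment, which prevents the result from deviating greatly from the source garment's design and yields plausible garment models. Because the proposed method can output the sewing pattern of the generated garment model, it is expected to support real-world fabrication, for example making matching outfits for a person and their pet.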

Papers

Generative AI Framework to Enhance Joint Attention for Visually Impaired

Ryudai Inoue, Shigeo Morishima

ACM Conference on Human Factors in Computing Systems, CHI2025, April 2025 [Peer-reviewed]
WanderGuide: Indoor Map-less Robotic Guide for Exploration by Blind People

Masaki Kuribayashi, Kohei Uehara, Allan Wang, Shigeo Morishima, Chieko Asakawa

ACM Conference on Human Factors in Computing Systems, CHI2025, April 2025 [Peer-reviewed]
Beyond Omakase: Designing Shared Control for Navigation Robots with Blind People

Rie Kamikubo, Seita Kayukawa, Yuka Kaniwa, Allan Wang, Hernisa Kacorri, Hironobu Takagi, Chieko Asakawa

ACM Conference on Human Factors in Computing Systems, CHI2025, April 2025 [Peer-reviewed]
Understanding and Supporting Formal Email Exchange by Answering AI-Generated Questions

Yusuke Miura, Chi-Lan Yang, Masaki Kuribayashi, Keigo Matsumoto, Hideaki Kuzuoka, Shigeo Morishima

ACM Conference on Human Factors in Computing Systems, CHI2025, April 2025 [Peer-reviewed]
Formula-Supervised Sound Event Detection: Pre-Training Without Real Data

Yuto Shibata, Keitaro Tanaka, Yoshiaki Bando, Keisuke Imoto, Hirokatsu Kataoka, Yoshimitsu Aoki

IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP2025, April 2025 [Peer-reviewed]
SynchroDexterity: Rapid Non-Dominant Hand Skill Acquisition with Synchronized Guidance in Mixed Reality

Ryudai Inoue, Qi Feng, Shigeo Morishima

IEEE Conference on Virtual Reality and 3D User Interfaces, March 2025 [Peer-reviewed]
SyncViolinist: Music-Oriented Violin Motion Generation Based on Bowing and Fingering

Hiroki Nishizawa, Keitaro Tanaka, Asuka Hirata, Shugo Yamaguchi, Qi Feng, Masatoshi Hamanaka, Shigeo Morishima, equal contribution

IEEE/CVF Winter Conference on Applications of Computer Vision, WACV2025, February 2025 [Peer-reviewed]

Role: Last author
Unsupervised Pitch-Timbre-Variation Disentanglement of Monophonic Music Signals Based on Random Perturbation and Re-entry Training

Keitaro Tanaka, Kazuyoshi Yoshii, Simon Dixon, Shigeo Morishima

APSIPA Transactions on Signal and Information Processing 14(1), February 2025 [Peer-reviewed]
Capturing Dynamic Identity Features for Speaker-Adaptive Visual Speech Recognition

Sara Kashiwagi, Keitaro Tanaka, Shigeo Morishima

APSIPA ASC 2024, December 2024 [Peer-reviewed]

Role: Last author
Onset-and-Offset-Aware Sound Event Detection via Differentiable Frame-to-Event Mapping

Tomoya Yoshinaga, Keitaro Tanaka, Yoshiaki Bando, Keisuke Imoto, Shigeo Morishima

IEEE Signal Processing Letters 32, 186-190, November 2024 [Peer-reviewed]

Role: Last author
ChitChatGuide: Conversational Interaction Using Large Language Models for Assisting People with Visual Impairments to Explore a Shopping Mall

Yuka Kaniwa, Masaki Kuribayashi, Seita Kayukawa, Daisuke Sato, Hironobu Takagi, Chieko Asakawa, Shigeo Morishima

Proceedings of the ACM on Human-Computer Interaction 8(MHCI), 1-25, September 2024 [Peer-reviewed]

Abstract

To enable people with visual impairments (PVI) to explore shopping malls, it is important to provide information for selecting destinations and obtaining information based on the individual's interests. We achieved this through conversational interaction by integrating a large language model (LLM) with a navigation system. ChitChatGuide allows users to plan a tour through contextual conversations, receive personalized descriptions of surroundings based on transit time, and make inquiries during navigation. We conducted a study in a shopping mall with 11 PVI, and the results reveal that the system allowed them to explore the facility with increased enjoyment. The LLM-based conversational interaction, by understanding vague and context-based questions, enabled the participants to explore unfamiliar environments effectively. The personalized and in-situ information generated by the LLM was both useful and enjoyable. Considering the limitations we identified, we discuss the criteria for integrating LLMs into navigation systems to enhance the exploration experiences of PVI.

Citations: 4 (Scopus)
Snap&Nav: Smartphone-based Indoor Navigation System For Blind People via Floor Map Analysis and Intersection Detection

Masaya Kubota, Masaki Kuribayashi, Seita Kayukawa, Hironobu Takagi, Chieko Asakawa, Shigeo Morishima

Proceedings of the ACM on Human-Computer Interaction 8(MHCI), 1-22, September 2024 [Peer-reviewed]

Role: Last author

Abstract

We present Snap&Nav, a navigation system for blind people in unfamiliar buildings, without prebuilt digital maps. Instead, the system utilizes the floor map as its primary information source for route guidance. The system requires a sighted assistant to capture an image of the floor map, which is analyzed to create a node map containing intersections, destinations, and current positions on the floor. The system provides turn-by-turn navigation instructions while tracking users' positions on the node map by detecting intersections. Additionally, the system estimates the scale difference of the node map to provide distance information. Our system was validated through two user studies with 20 sighted and 12 blind participants. Results showed that sighted participants processed floor map images without being accustomed to the system, while blind participants navigated with increased confidence and lower cognitive load compared to the condition using only a cane, appreciating the system's potential for use in various buildings.

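Citations: 3 (Scopus)
A Method for Improving Lip-Reading Accuracy Based on Multi-Task Learning Focusing on Speaker-Specific Speech Characteristics

Sara Kashiwagi, Keitaro Tanaka, Shigeo Morishima

Visual Computing, VC2024, September 2024 [Peer-reviewed]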
Detect Fake with Fake: Leveraging Synthetic Data-driven Representation for Synthetic Image Detection

Hina Otake, Yoshihiro Fukuhara, Yoshiki Kubotani, Shigeo Morishima

Trust What You learN (TWYN) Workshop (organized in conjunction with ECCV 2024); Women in Computer Vision Workshop (organized in conjunction with ECCV 2024), September 2024 [Peer-reviewed]

Role: Last author
The Gap in the Strategy of Recovering Task Failure between GPT-4V and Humans in a Visual Dialogue

Ryosuke Oshima, Seitaro Shinagawa, Shigeo Morishima

Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 728-745, September 2024 [Peer-reviewed]

Role: Last author
Adaptive Sampling for Monte-Carlo Event Imagery Rendering

Yuichiro Manabe, Tatsuya Yatagawa, Shigeo Morishima, Hiroyuki Kubo

ACM SIGGRAPH 2024 Posters, July 2024
Idea Track: Improving Sample Efficiency in World Models through Semantic Exploration via Expert Demonstration

Kensuke Tatematsu, Hideki Tsunashima, Shigeo Morishima

Forty-first International Conference on Machine Learning, ICML2024, July 2024 [Peer-reviewed]

Role: Last author
Keep Eyes on the Sentence: An Interactive Sentence Simplification System for English Learners Based on Eye Tracking and Large Language Models

Taichi Higasa, Keitaro Tanaka, Qi Feng, Shigeo Morishima

Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, May 2024 [Peer-reviewed]

Role: Last author

Citations: 4 (Scopus)
PathFinder: Designing a Map-less Navigation System for Blind People in Unfamiliar Buildings

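Masaki Kuribayashi, Tatsuya Ishihara, Daisuke Sato (Carnegie Mellon University), Jayakorn Vongkulbhisal (IBM Japan), Ram Karnik (Carnegie Mellon University), Seita Kayukawa, Hironobu Takagi, Shigeo Morishima, Chieko Asakawa (Carnegie Mellon University)

WISS2023, November 2023 [Peer-reviewed]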
Event-Based Camera Simulation Using Monte Carlo Path Tracing with Adaptive Denoising

Yuta Tsuji, Tatsuya Yatagawa, Hiroyuki Kubo, Shigeo Morishima

2023 IEEE International Conference on Image Processing (ICIP), October 2023

Role: Last author
Scapegoat Generation for Privacy Protection from Deepfake

Gido Kato, Yoshihiro Fukuhara, Mariko Isogawa, Hideki Tsunashima, Hirokatsu Kataoka, Shigeo Morishima

2023 IEEE International Conference on Image Processing (ICIP), October 2023

Role: Last author
On the Use of Synthesized Datasets and Transformer Adaptors for Musical Instrument Recognition

Tanaka, Keitaro, Luo, Yin-Jyun, Cheuk, Kin Wai, Yoshii, Kazuyoshi, Morishima, Shigeo, Dixon, Simon

ISMIR 2023, LP-10, October 2023 [Peer-reviewed]
Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability

Taichi Higasa, Keitaro Tanaka, Qi Feng, Shigeo Morishima

The 25th International Conference on Multimodal Interaction, ICMI 2023, October 2023 [Peer-reviewed]

Citations: 1 (Scopus)
Enhancing Perception and Immersion in Pre-Captured Environments through Learning-Based Eye Height Adaptation

Qi Feng, Hubert P.H. Shum, Shigeo Morishima

The 22nd IEEE International Symposium on Mixed and Augmented Reality, ISMAR2023, October 2023 [Peer-reviewed]
Audio-Visual Speech Enhancement With Preserving Specific Off-Screen Speech

Tomoya Yoshinaga, Keitaro Tanaka, Shigeo Morishima

VC2023 short, 39, September 2023 [Peer-reviewed]
Detecting Unknown Multiword Expressions in Natural English Reading via Eye Gaze

Taichi Higasa, Asuka Hirata, Keitaro Tanaka, Qi Feng, Shigeo Morishima

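VC2023 short, 38, September 2023 [Peer-reviewed]
Deformable NeRF for Reducing Object Motion Blur

Kazuhito Sato, Shugo Yamaguchi, Tsukasa Takeda, Shigeo Morishima

VC2023 short, 29, September 2023 [Peer-reviewed]
A Metric-Learning-Based Method for Reducing the Accuracy Gap in Speech Content Estimation from Videos of Normal and Silent Speech

Sara Kashiwagi, Keitaro Tanaka, Shigeo Morishima

VC2023 long, 37, September 2023 [Peer-reviewed]
Conservative Fluid Simulation with Shallow Water Equations Using a Semi-Implicit Scheme

Haruka Hirae, Shigeo Morishima, Ryoichi Ando

VC2023 long, 33, VC Student Research Award, September 2023 [Peer-reviewed]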
Audio-Visual Speech Enhancement With Selective Off-Screen Speech Extraction

Tomoya Yoshinaga, Keitaro Tanaka, Shigeo Morishima

The 31st European Signal Processing Conference, EUSIPCO2023, Best Student Paper Contest Finalist, September 2023 [Peer-reviewed]
Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning

Sara Kashiwagi, Keitaro Tanaka, Qi Feng, Shigeo Morishima

INTERSPEECH 2023, August 2023 [Peer-reviewed]
A Conservative Semi-Implicit Scheme for Shallow Water Equations

Haruka Hirae, Shigeo Morishima, Ryoichi Ando

Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, August 2023 [Peer-reviewed]
Deformable Neural Radiance Fields for Object Motion Blur Removal

Kazuhito Sato, Shugo Yamaguchi, Tsukasa Takeda, Shigeo Morishima

Proceedings - SIGGRAPH 2023 Posters, August 2023 [Peer-reviewed]

Abstract

In this paper, we present a novel approach to remove object motion blur in 3D scene renderings using deformable neural radiance fields. Our technique adapts the hyperspace representation to accommodate shape changes induced by object motion blur. Experiments on Blender-generated datasets demonstrate the effectiveness of our method in producing higher-quality images with reduced object motion blur artifacts.

Citations: 1 (Scopus)
Efficient 3D Reconstruction of NeRF using Camera Pose Interpolation and Photometric Bundle Adjustment

Tsukasa Takeda, Shugo Yamaguchi, Kazuhito Sato, Kosuke Fukazawa, Shigeo Morishima

Proceedings - SIGGRAPH 2023 Posters, August 2023 [Peer-reviewed]

Citations: 2 (Scopus)
Scapegoat Generation for Privacy Protection from Malicious Deepfake

Gido Kato, Yoshihiro Fukuhara, Mariko Isogawa, Hideki Tsunashima, Hirokatsu Kataoka, Shigeo Morishima

Meeting on Image Recognition and Understanding, MIRU2023, July 2023 [Peer-reviewed]
Textual and Directional Sign Recognition Algorithm for People with Visual Impairment by Linking Texts and Arrows

Masaki Kuribayashi, Hironobu Takagi, Chieko Asakawa, Shigeo Morishima

The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 Workshop (CVPR 2023 Workshop), June 2023 [Peer-reviewed]
Memory Efficient Diffusion Probabilistic Models via Patch-based Generation

Shinei Arakawa, Hideki Tsunashima, Daichi Horita, Keitaro Tanaka, Shigeo Morishima

The Generative Models for Computer Vision Workshop at CVPR 2023, June 2023 [Peer-reviewed]
3DMovieMap: an Interactive Route Viewer for Multi-Level Buildings

Seita Kayukawa, Keita Higuchi, Shigeo Morishima, Ken Sakurada

Conference on Human Factors in Computing Systems - Proceedings, April 2023 [Peer-reviewed]

Abstract

We present an interactive route viewer system, 3DMovieMap, which generates and shows navigation movies walking through multi-level buildings, such as a science museum, airport, and university building. Movie map systems can provide users with visual cues by synthesizing navigation movies based on their inputs of routes. However, existing systems are limited to flat areas such as city areas. We aim to extend Movie Map to generate navigation movies for multi-level buildings. The 3DMovieMap system generates a movie map from an equirectangular movie via a visual Simultaneous Localization and Mapping technology. Users select waypoints on the floor maps. 3DMovieMap calculates the shortest path that visits these points and generates a navigation movie along the route. We created four movie maps of buildings and asked two participants to use our system and provide feedback for further improvements. We will be releasing an open dataset of equirectangular movies captured in a science museum.

Citations: 7 (Scopus)
PathFinder: Designing a Map-less Navigation System for Blind People in Unfamiliar Buildings

Masaki Kuribayashi, Tatsuya Ishihara, Daisuke Sato, Jayakorn Vongkulbhisal, Karnik Ram, Seita Kayukawa, Hironobu Takagi, Shigeo Morishima, Chieko Asakawa

Conference on Human Factors in Computing Systems - Proceedings, April 2023 [Peer-reviewed]

Abstract

Indoor navigation systems with prebuilt maps have shown great potential in navigating blind people even in unfamiliar buildings. However, blind people cannot always benefit from them in every building, as prebuilt maps are expensive to build. This paper explores a map-less navigation system for blind people to reach destinations in unfamiliar buildings, which is implemented on a robot. We first conducted a participatory design with five blind people, which revealed that intersections and signs are the most relevant information in unfamiliar buildings. Then, we prototyped PathFinder, a navigation system that allows blind people to determine their way by detecting and conveying information about intersections and signs. Through a participatory study, we improved the interface of PathFinder, such as the feedback for conveying the detection results. Finally, a study with seven blind participants validated that PathFinder could assist users in navigating unfamiliar buildings with increased confidence compared to their regular aid.

Citations: 25 (Scopus)
Enhancing Blind Visitor's Autonomy in a Science Museum Using an Autonomous Navigation Robot

Seita Kayukawa, Daisuke Sato, Masayuki Murata, Tatsuya Ishihara, Hironobu Takagi, Shigeo Morishima, Chieko Asakawa

Conference on Human Factors in Computing Systems - Proceedings, April 2023 [Peer-reviewed]

Abstract

Enabling blind visitors to explore museum floors while feeling the facility's atmosphere and increasing their autonomy and enjoyment are imperative for giving them a high-quality museum experience. We designed a science museum exploration system for blind visitors using an autonomous navigation robot. Blind users can control the robot to navigate them toward desired exhibits while playing short audio descriptions along the route. They can also browse detailed explanations on their smartphones and call museum staff if interactive support is needed. Our real-world user study at a science museum during its opening hour revealed that blind participants could explore the museum safely and independently at their own pace. The study also showed that the sighted visitors who saw the participants walking with the robot accepted the assistive robot well. We finally conducted focus group sessions with the blind participants and discussed further requirements toward a more independent museum experience.

Citations: 24 (Scopus)
Exploration of Sonification Feedback for People with Visual Impairment to Use Ski Simulator

Yusuke Miura, Erwin Wu, Masaki Kuribayashi, Hideki Koike, Shigeo Morishima

The Augmented Humans 2023, AHs2023, 147-158, March 2023 [Peer-reviewed]

Abstract

Training opportunities for visually impaired (VI) skiers are limited because it is essential for them to have sighted people who guide them with their voices. This study investigates an auditory feedback system that enables ski training using a ski simulator for VI skiers alone. Based on the results of interviews with actual VI skiers and their guides, we designed the following three types of sounds: 1) a single sound (ATS: Advance Turn Sound) that conveys information about turns; 2) a continuous sound (CES: Continuous Error Sound) that is emitted according to the difference between the user's future position and the position he/she should progress to; and 3) a single sound (Gate-Passed Sound) which is emitted when a user passed through a gate. We conducted an evaluation experiment with four blind skiers and three sighted guides. Results showed that three out of four skiers performed better under the conditions in which ATS and gate-passed sound were emitted than the condition in which a human guide gave calls. The result suggests that a sonification-based method such as ATS is effective for ski training on the ski simulator for VI skiers.

Citations: 4 (Scopus)
Pointing out Human Answer Mistakes in a Goal-Oriented Visual Dialogue

Ryosuke Oshima, Seitaro Shinagawa, Hideki Tsunashima, Qi Feng, Shigeo Morishima

Proceedings - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023, 4665-4670, 2023

Abstract

Effective communication between humans and intelligent agents has promising applications for solving complex problems. One such approach is visual dialogue, which leverages multimodal context to assist humans. However, real-world scenarios occasionally involve human mistakes, which can cause intelligent agents to fail. While most prior research assumes perfect answers from human interlocutors, we focus on a setting where the agent points out unintentional mistakes for the interlocutor to review, better reflecting real-world situations. In this paper, we show that human answer mistakes depend on question type and QA turn in the visual dialogue by analyzing a previously unused data collection of human mistakes. We demonstrate the effectiveness of those factors for the model's accuracy in a pointing-human-mistake task through experiments using a simple MLP model and a Visual Language Model.

Citations: 1 (Scopus)
A Study on Sonification Method of Simulator-Based Ski Training for People with Visual Impairment

Yusuke Miura, Masaki Kuribayashi, Erwin Wu, Hideki Koike, Shigeo Morishima

Proceedings - SIGGRAPH Asia 2022 Posters, December 2022

Abstract

People with visual impairment (PVI) are eager to push their limits in extreme sports such as alpine skiing. However, training skiing is very difficult as it always requires assistance from an experienced guide. This paper explores sonification-based methods that enable PVI to train skiing using a simulator, which will allow them to train without a guide. Two types of sonification feedback for PVI are proposed and studied in our experiment. The results suggest that users without any visual information can also pass through over 80% of the poles compared to with visual information on average.

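Citations: 1 (Scopus)
Estimating Comprehension of English Phrases Based on Gaze Information and Degree of Figurativeness

Taichi Higasa, Asuka Hirata, Keitaro Tanaka, Shigeo Morishima

The 30th Workshop on Interactive Systems and Software, WISS 2022, December 2022 [Peer-reviewed]
A Study of Auditory Feedback for Visually Impaired People to Use a Ski Simulator

Yusuke Miura, Erwin Wu, Masaki Kuribayashi, Hideki Koike, Shigeo Morishima

The 30th Workshop on Interactive Systems and Software, WISS 2022, December 2022 [Peer-reviewed]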
Unsupervised Disentanglement of Timbral, Pitch, and Variation Features From Musical Instrument Sounds With Random Perturbation

Keitaro Tanaka, Yoshiaki Bando, Kazuyoshi Yoshii, Shigeo Morishima

2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 709-716, November 2022

Abstract

This paper describes an unsupervised disentangled representation learning method for musical instrument sounds with pitched and unpitched spectra. Since conventional methods have commonly attempted to disentangle timbral features (e.g., instruments) and pitches (e.g., MIDI note numbers and F0s), they can be applied to only pitched sounds. Global timbres unique to instruments and local variations (e.g., expressions and playstyles) are also treated without distinction. Instead, we represent the spectrogram of a musical instrument sound with a variational autoencoder (VAE) that has timbral, pitch, and variation features as latent variables. The pitch clarity or percussiveness, brightness, and F0s (if existing) are considered to be represented in the abstract pitch features. The unsupervised disentanglement is achieved by extracting time-invariant and time-varying features as global timbres and local variations from randomly pitch-shifted input sounds and time-varying features as local pitch features from randomly timbre-distorted input sounds. To enhance the disentanglement of timbral and variation features from pitch features, input sounds are separated into spectral envelopes and fine structures with cepstrum analysis. The experiments showed that the proposed method can provide effective timbral and pitch features for better musical instrument classification and pitch estimation.
Geometric Features Informed Multi-person Human-Object Interaction Recognition in Videos

Tanqiu Qiao, Qianhui Men, Frederick W. B. Li, Yoshiki Kubotani, Shigeo Morishima, Hubert P. H. Shum

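Lecture Notes in Computer Science, 474-491, October 2022 [Peer-reviewed]

Citations: 11 (Scopus)
Automatic Generation of Violin Performance Animation from Audio Signals Reflecting Fingering and Bowing

Asuka Hirata, Keitaro Tanaka, Masatoshi Hamanaka, Shigeo Morishima

VC + VCC 2022, short, October 2022 [Peer-reviewed]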
Corridor-Walker: Mobile Indoor Walking Assistance for Blind People to Avoid Obstacles and Recognize Intersections

Masaki Kuribayashi, Seita Kayukawa, Jayakorn Vongkulbhisal, Chieko Asakawa, Daisuke Sato, Hironobu Takagi, Shigeo Morishima

Proceedings of the ACM on Human-Computer Interaction 6(MHCI), September 2022

Abstract

Navigating in an indoor corridor can be challenging for blind people as they have to be aware of obstacles while also having to recognize the intersections that lead to the destination. To aid blind people in such tasks, we propose Corridor-Walker, a smartphone-based system that assists blind people to avoid obstacles and recognize intersections. The system uses a LiDAR sensor equipped with a smartphone to construct a 2D occupancy grid map of the surrounding environment. Then, the system generates an obstacle-avoiding path and detects upcoming intersections on the grid map. Finally, the system navigates the user to trace the generated path and notifies the user of each intersection’s existence and the shape using vibration and audio feedback. A user study with 14 blind participants revealed that Corridor-Walker allowed participants to avoid obstacles, rely less on the wall to walk straight, and recognize intersections.

Citations: 28 (Scopus)
Audio-Driven Violin Performance Animation with Clear Fingering and Bowing

Asuka Hirata, Keitaro Tanaka, Masatoshi Hamanaka, Shigeo Morishima

Proceedings - SIGGRAPH 2022 Posters, July 2022

Abstract

This paper presents an audio-to-animation synthesis method for violin performance. This new approach provides a fine-grained violin performance animation using information on playing procedure consisting of played string, finger number, position, and bow direction. We demonstrate that our method is capable of synthesizing natural violin performance animation with fine fingering and bowing through extensive evaluation.

Citations: 3 (Scopus)
Real-time Shading with Free-form Planar Area Lights using Linearly Transformed Cosines

Takahiro Kuge, Tatsuya Yatagawa, Shigeo Morishima

ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, i3D 2022, May 2022 [Peer-reviewed]
How Users, Facility Managers, and Bystanders Perceive and Accept a Navigation Robot for Visually Impaired People in Public Buildings

Seita Kayukawa, Daisuke Sato, Masayuki Murata, Tatsuya Ishihara, Akihiro Kosugi, Hironobu Takagi, Shigeo Morishima, Chieko Asakawa

RO-MAN 2022 - 31st IEEE International Conference on Robot and Human Interactive Communication: Social, Asocial, and Antisocial Robots, 546-553, 2022

Abstract

Autonomous navigation robots have a considerable potential to offer a new form of mobility aid to people with visual impairments. However, to deploy such robots in public buildings, it is imperative to receive acceptance from not only robot users but also people that use the buildings and managers of those facilities. Therefore, we conducted three studies to investigate the acceptance and concerns of our prototype robot, which looks like a regular suitcase. First, an online survey revealed that people could accept the robot navigating blind users. Second, in the interviews with facility managers, they were cautious about the robot's camera and the privacy of their customers. Finally, focus group sessions with legally blind participants who experienced the robot navigation revealed that the robot may cause trouble when it collides with those who may not be aware of the user's blindness. Still, many participants liked the design of the robot which assimilated into the surroundings.

Citations: 18 (Scopus)
The Sound of Bounding-Boxes.

Takashi Oya, Shohei Iwase, Shigeo Morishima

CoRR abs/2203.15991, 9-15, 2022

Abstract

In the task of audio-visual sound source separation, which leverages visual information for sound source separation, identifying objects in an image is a crucial step prior to separating the sound source. However, existing methods that assign sound on detected bounding boxes suffer from a problem that their approach heavily relies on pre-trained object detectors. Specifically, when using these existing methods, it is required to predetermine all the possible categories of objects that can produce sound and use an object detector applicable to all such categories. To tackle this problem, we propose a fully unsupervised method that learns to detect objects in an image and separate sound source simultaneously. As our method does not rely on any pre-trained detector, our method is applicable to arbitrary categories without any additional annotation. Furthermore, although being fully unsupervised, we found that our method performs comparably in separation accuracy.
Community-Driven Comprehensive Scientific Paper Summarization: Insight from cvpaper.challenge.

Shintaro Yamamoto, Hirokatsu Kataoka, Ryota Suzuki, Seitaro Shinagawa, Shigeo Morishima

CoRR abs/2203.09109, 2022

Abstract

The present paper introduces a group activity involving writing summaries of conference proceedings by volunteer participants. The rapid increase in scientific papers is a heavy burden for researchers, especially non-native speakers, who need to survey scientific literature. To alleviate this problem, we organized a group of non-native English speakers to write summaries of papers presented at a computer vision conference to share the knowledge of the papers read by the group. We summarized a total of 2,000 papers presented at the Conference on Computer Vision and Pattern Recognition, a top-tier conference on computer vision, in 2019 and 2020. We quantitatively analyzed participants' selection regarding which papers they read among the many available papers. The experimental results suggest that we can summarize a wide range of papers without asking participants to read papers unrelated to their interests.
360 Depth Estimation in the Wild - the Depth360 Dataset and the SegFuse Network.

Qi Feng, Hubert P. H. Shum, Shigeo Morishima

VR 664 - 673 2022年

　概要を見る

Single-view depth estimation from omnidirectional images has gained
popularity with its wide range of applications such as autonomous driving and
scene reconstruction. Although data-driven learning-based methods demonstrate
significant potential in this field, scarce training data and ineffective 360
estimation algorithms are still two key limitations hindering accurate
estimation across diverse domains. In this work, we first establish a
large-scale dataset with varied settings called Depth360 to tackle the training
data problem. This is achieved by exploring the use of a plenteous source of
data, 360 videos from the internet, using a test-time training method that
leverages unique information in each omnidirectional sequence. With novel
geometric and temporal constraints, our method generates consistent and
convincing depth samples to facilitate single-view estimation. We then propose
an end-to-end two-branch multi-task learning network, SegFuse, that mimics the
human eye to effectively learn from the dataset and estimate high-quality depth
maps from diverse monocular RGB images. With a peripheral branch that uses
equirectangular projection for depth estimation and a foveal branch that uses
cubemap projection for semantic segmentation, our method predicts consistent
global depth while maintaining sharp details at local regions. Experimental
results show favorable performance against the state-of-the-art methods.
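
As a rough illustration of the two projections named in the abstract, the sketch below maps a normalized equirectangular pixel coordinate to a unit viewing direction and then picks the cubemap face that direction falls on. The function names, the axis convention (+z forward, +y up), and the [0, 1] coordinate normalization are assumptions chosen for this example; they are not taken from the SegFuse implementation.

```python
import math

def equirect_to_direction(u, v):
    """Map normalized equirectangular coords (u, v in [0, 1]) to a unit vector.

    Assumed convention: u = 0.5 looks along +z, v = 0 is straight up (+y).
    """
    lon = (u - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi]
    lat = (0.5 - v) * math.pi         # latitude in [-pi/2, pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

def cubemap_face(direction):
    """Return the cube face ('+x', '-x', ...) that a direction vector hits."""
    # The face is determined by the component with the largest magnitude.
    axis = max(range(3), key=lambda i: abs(direction[i]))
    sign = "+" if direction[axis] >= 0 else "-"
    return sign + "xyz"[axis]
```

For example, `cubemap_face(equirect_to_direction(0.5, 0.5))` selects the forward face under the assumed convention. A single equirectangular image covers the full sphere at once, while each cubemap face covers a limited field of view with less distortion.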

DOI

Scopus citations: 20
3D car shape reconstruction from a contour sketch using GAN and lazy learning.

Naoki Nozawa, Hubert P. H. Shum, Qi Feng, Edmond S. L. Ho, Shigeo Morishima

Vis. Comput. 38 ( 4 ) 1317 - 1330 2022

Abstract

3D car models are heavily used in computer games, visual effects, and even automotive designs. As a result, producing such models with minimal labour costs is increasingly important. To tackle the challenge, we propose a novel system to reconstruct a 3D car using a single sketch image. The system learns from a synthetic database of 3D car models and their corresponding 2D contour sketches and segmentation masks, allowing effective training with minimal data collection cost. The core of the system is a machine learning pipeline that combines the use of a generative adversarial network (GAN) and lazy learning. GAN, being a deep learning method, is capable of modelling complicated data distributions, enabling the effective modelling of a large variety of cars. Its major weakness is that as a global method, modelling the fine details in the local region is challenging. Lazy learning works well to preserve local features by generating a local subspace with relevant data samples. We demonstrate that the combined use of GAN and lazy learning is able to produce high-quality results, in which different types of cars with complicated local features can be generated effectively with a single sketch. Our method outperforms existing ones using other machine learning structures such as the variational autoencoder.

DOI

Scopus citations: 16
Audio-Oriented Video Interpolation Using Key Pose

Takayuki Nakatsuka, Yukitaka Tsuchiya, Masatoshi Hamanaka, Shigeo Morishima

International Journal of Pattern Recognition and Artificial Intelligence 35 ( 16 ) December 2021

Abstract

This paper describes a deep learning-based method for long-term video interpolation that generates intermediate frames between two music performance videos of a person playing a specific instrument. Recent advances in deep learning techniques have successfully generated realistic images with high-fidelity and high-resolution in short-term video interpolation. However, there is still room for improvement in long-term video interpolation due to lack of resolution and temporal consistency of the generated video. Particularly in music performance videos, the music and human performance motion need to be synchronized. We solved these problems by using human poses and music features essential for music performance in long-term video interpolation. By closely matching human poses with music and videos, it is possible to generate intermediate frames that synchronize with the music. Specifically, we obtain the human poses of the last frame of the first video and the first frame of the second video in the performance videos to be interpolated as key poses. Then, our encoder-decoder network estimates the human poses in the intermediate frames from the obtained key poses, with the music features as the condition. In order to construct an end-to-end network, we utilize a differentiable network that transforms the estimated human poses in vector form into the human pose in image form, such as human stick figures. Finally, a video-to-video synthesis network uses the stick figures to generate intermediate frames between two music performance videos. We found that the generated performance videos were of higher quality than the baseline method through quantitative experiments.

DOI

Scopus citations: 4
Light Source Selection in Primary-Sample-Space Neural Photon Sampling

Yuta Tsuji, Tatsuya Yatagawa, Shigeo Morishima

Proceedings - SIGGRAPH Asia 2021 Posters, SA 2021 December 2021

Abstract

This paper proposes a light source selection for photon mapping combined with recent deep-learning-based importance sampling. Although applying such neural importance sampling (NIS) to photon mapping is not difficult, a straightforward approach can sample inappropriate photons for each light source because NIS relies on the approximation of a smooth continuous probability density function on the primary sample space, whereas the light source selection follows a discrete probability distribution. To alleviate this problem, we introduce a normalizing flow conditioned by a feature vector representing the index for each light source. When the neural network for NIS is trained to sample visible photons, we achieved lower variance with the same sample budgets, compared to a previous photon sampling using Markov chain Monte Carlo.

DOI

Scopus citations: 1
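Analysis of Figure and Table Use in Computer Vision Papers Using Image Recognition Technology

Shintaro Yamamoto, Ryota Suzuki, Seitaro Shinagawa, Hirokatsu Kataoka, Shigeo Morishima

Seimitsu Kogaku Kaishi/Journal of the Japan Society for Precision Engineering 87 ( 12 ) 995 - 1002 December 2021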

DOI CiNii
Bowing-Net: Motion Generation for String Instruments Based on Bowing Information

Asuka Hirata, Keitaro Tanaka, Ryo Shimamura, Shigeo Morishima

Special Interest Group on Computer Graphics and Interactive Techniques Conference Posters, SIGGRAPH 2021 August 2021

Abstract

This paper presents a deep learning based method that generates body motion for string instrument performance from raw audio. In contrast to prior methods which aim to predict joint position from audio, we first estimate information that dictates the bowing dynamics, such as the bow direction and the played string. The final body motion is then determined from this information following a conversion rule. By adopting the bowing information as the target domain, not only is learning the mapping more feasible, but also the produced results have bowing dynamics that are consistent with the given audio. We confirmed that our results are superior to existing methods through extensive experiments.

DOI

Scopus citations: 3
A case study on user evaluation of scientific publication summarization by Japanese students

Shintaro Yamamoto, Ryota Suzuki, Tsukasa Fukusato, Hirokatsu Kataoka, Shigeo Morishima

Applied Sciences (Switzerland) 11 ( 14 ) July 2021

Abstract

Summaries of scientific publications enable readers to gain an overview of a large number of studies, but users’ preferences have not yet been explored. In this paper, we conduct two user studies (i.e., short- and long-term studies) where Japanese university students read summaries of English research articles that were either manually written or automatically generated using text summarization and/or machine translation. In the short-term experiment, subjects compared and evaluated the two types of summaries of the same article. We analyze the characteristics in the generated summaries that readers regard as important, such as content richness and simplicity. The experimental results show that subjects mainly judged the summaries based on four criteria: content richness, simplicity, fluency, and format. In the long-term experiment, subjects read 50 summaries and answered whether they would like to read the original papers after reading the summaries. We discuss the characteristics in the summaries that readers tend to use to determine whether to read the papers, such as topic, methods, and results. The comments from subjects indicate that specific components of scientific publications, including research topics and methods, are important to judge whether to read or not. Our study provides insights to enhance the effectiveness of automatic summarization of scientific publications.

DOI

Scopus citations: 2
Pitch-Timbre Disentanglement of Musical Instrument Sounds Based on VAE-Based Metric Learning

Keitaro Tanaka, Ryo Nishikimi, Yoshiaki Bando, Kazuyoshi Yoshii, Shigeo Morishima

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-June 111 - 115 June 2021

Abstract

This paper describes a representation learning method for disentangling an arbitrary musical instrument sound into latent pitch and timbre representations. Although such pitch-timbre disentanglement has been achieved with a variational autoencoder (VAE), especially for a predefined set of musical instruments, the latent pitch and timbre representations are outspread, making them hard to interpret. To mitigate this problem, we introduce a metric learning technique into a VAE with latent pitch and timbre spaces so that similar (different) pitches or timbres are mapped close to (far from) each other. Specifically, our VAE is trained with additional contrastive losses so that the latent distances between two arbitrary sounds of the same pitch or timbre are minimized, and those of different pitches or timbres are maximized. This training is performed under weak supervision that uses only whether the pitches and timbres of two sounds are the same or not, instead of their actual values. This improves the generalization capability for unseen musical instruments. Experimental results show that the proposed method can find better-structured disentangled representations with pitch and timbre clusters even for unseen musical instruments.
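
The weakly supervised contrastive term described above can be sketched as follows. This is a minimal illustration that assumes a standard margin-based contrastive loss on Euclidean distance; the exact loss used in the paper is not given in the abstract, and the names `contrastive_loss`, `pair_loss`, and `margin` are hypothetical. In a full model this term would be added to the usual VAE objective.

```python
import math

def distance(a, b):
    """Euclidean distance between two latent vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(z_a, z_b, same, margin=1.0):
    """Pull a pair together if `same`, otherwise push it beyond `margin`."""
    d = distance(z_a, z_b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

def pair_loss(sound_a, sound_b, same_pitch, same_timbre, margin=1.0):
    """Sum the contrastive terms over the pitch and timbre latent spaces.

    Only the binary same/different flags are needed (weak supervision),
    not the actual pitch or instrument labels.
    """
    return (contrastive_loss(sound_a["pitch"], sound_b["pitch"], same_pitch, margin)
            + contrastive_loss(sound_a["timbre"], sound_b["timbre"], same_timbre, margin))
```

Because the loss consumes only same/different flags, it can be evaluated on pairs of sounds from instruments that never appeared with explicit labels.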

DOI
LSTM-SAKT: LSTM-Encoded SAKT-like Transformer for Knowledge Tracing

Takashi Oya, Shigeo Morishima

CoRR abs/2102.00845 January 2021

Abstract

This paper introduces the 2nd place solution for the Riiid! Answer
Correctness Prediction in Kaggle, the world's largest data science competition
website. This competition was held from October 16, 2020, to January 7, 2021,
with 3395 teams and 4387 competitors. The main insights and contributions of
this paper are as follows. (i) We pointed out that existing Transformer-based
models are limited in the information that their query/key/value can contain.
To solve this problem, we proposed a method that uses an LSTM to obtain
query/key/value and verified its effectiveness. (ii) We pointed out the
'inter-container' leakage problem, which occurs in datasets where questions
are sometimes served together. To solve this problem, we showed special
indexing/masking techniques that are useful with RNN variants and
Transformers. (iii) We found that additional hand-crafted features are
effective in overcoming a limit of the Transformer, which can never consider
samples older than the sequence length.
Property analysis of adversarially robust representation

Yoshihiro Fukuhara, Takahiro Itazuri, Hirokatsu Kataoka, Shigeo Morishima

Seimitsu Kogaku Kaishi/Journal of the Japan Society for Precision Engineering 87 ( 1 ) 83 - 91 January 2021

Abstract

In this paper, we address the open question: "What do adversarially robust models look at?" Recently, many works have reported that there exists a trade-off between standard accuracy and adversarial robustness. According to prior works, this trade-off is rooted in the fact that adversarially robust and standard accurate models might depend on very different sets of features. However, it has not been well studied what kind of difference actually exists. In this paper, we analyze this difference through various experiments, both visually and quantitatively. Experimental results show that adversarially robust models look at things at a larger scale than standard models and pay less attention to fine textures. Furthermore, although it has been claimed that adversarially robust features are not compatible with standard accuracy, using them as pre-trained models even has a positive effect, particularly on low-resolution datasets.

DOI

Scopus
RLTutor: Reinforcement Learning Based Adaptive Tutoring System by Modeling Virtual Student with Fewer Interactions.

Yoshiki Kubotani, Yoshihiro Fukuhara, Shigeo Morishima

CoRR abs/2108.00268 2021

Abstract

A major challenge in the field of education is providing review schedules
that present learned items at appropriate intervals to each student so that
memory is retained over time. In recent years, attempts have been made to
formulate item reviews as sequential decision-making problems to realize
adaptive instruction based on the knowledge state of students. It has been
reported previously that reinforcement learning can help realize mathematical
models of students learning strategies to maintain a high memory rate. However,
optimization using reinforcement learning requires a large number of
interactions, and thus it cannot be applied directly to actual students. In
this study, we propose a framework for optimizing teaching strategies by
constructing a virtual model of the student while minimizing the interaction
with the actual teaching target. In addition, we conducted an experiment
considering actual instructions using the mathematical model and confirmed that
the model performance is comparable to that of conventional teaching methods.
Our framework can directly substitute mathematical models used in experiments
with human students, and our results can serve as a buffer between theoretical
instructional optimization and practical applications in e-learning systems.
Comprehending Research Article in Minutes: A User Study of Reading Computer Generated Summary for Young Researchers.

Shintaro Yamamoto, Ryota Suzuki, Hirokatsu Kataoka, Shigeo Morishima

Human Interface and the Management of Information. Information Presentation and Visualization - Thematic Area 12765 LNCS 101 - 112 2021

Abstract

The automatic summarization of scientific papers, to assist researchers in conducting literature surveys, has garnered significant attention because of the rapid increase in the number of scientific articles published each year. However, whether and how these summaries actually help readers in comprehending scientific papers has not been examined yet. In this work, we study the effectiveness of automatically generated summaries of scientific papers for students who do not have sufficient knowledge in research. We asked six students, enrolled in bachelor’s and master’s programs in Japan, to prepare a presentation on a scientific paper by providing them either the article alone, or the article and its summary generated by an automatic summarization system, after 15 min of reading time. The comprehension of an article was judged by four evaluators based on the participant’s presentation. The experimental results show that the completeness of the comprehension of the four participants was higher overall when the summary of the paper was provided. In addition, four participants, including the two whose completeness score reduced when the summary was provided, mentioned that the summary is helpful to comprehend a research article within a limited time span.

DOI

Scopus citations: 1
LineChaser: A Smartphone-Based Navigation System for Blind People to Stand in Lines.

Masaki Kuribayashi, Seita Kayukawa, Hironobu Takagi, Chieko Asakawa, Shigeo Morishima

CHI '21: CHI Conference on Human Factors in Computing Systems (CHI) 33 - 13 2021

DOI
Self-Supervised Learning for Visual Summary Identification in Scientific Publications.

Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavas, Shigeo Morishima

Proceedings of the 11th International Workshop on Bibliometric-enhanced Information Retrieval co-located with 43rd European Conference on Information Retrieval (ECIR 2021) (BIR@ECIR) 5 - 19 2021
Visual Summary Identification From Scientific Publications via Self-Supervised Learning.

Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavas, Shigeo Morishima

Frontiers in Research Metrics and Analytics 6 719004 - 719004 2021 [International journal]

Abstract

The exponential growth of scientific literature yields the need to support users to both effectively and efficiently analyze and understand the body of research work. This exploratory process can be facilitated by providing graphical abstracts, i.e., visual summaries of scientific publications. Accordingly, previous work recently presented an initial study on automatic identification of a central figure in a scientific publication, to be used as the publication's visual summary. This study, however, has been limited only to a single (biomedical) domain. This is primarily because the current state-of-the-art relies on supervised machine learning, typically relying on the existence of large amounts of labeled data: the only existing annotated data set until now covered only the biomedical publications. In this work, we build a novel benchmark data set for visual summary identification from scientific publications, which consists of papers presented at conferences from several areas of computer science. We couple this contribution with a new self-supervised learning approach to learn a heuristic matching of in-text references to figures with figure captions. Our self-supervised pre-training, executed on a large unlabeled collection of publications, attenuates the need for large annotated data sets for visual summary identification and facilitates domain transfer for this task. We evaluate our self-supervised pretraining for visual summary identification on both the existing biomedical and our newly presented computer science data set. The experimental results suggest that the proposed method is able to outperform the previous state-of-the-art without any task-specific annotations.

DOI PubMed

Scopus citations: 3
Vocal-Accompaniment Compatibility Estimation Using Self-Supervised and Joint-Embedding Techniques

Takayuki Nakatsuka, Kento Watanabe, Yuki Koyama, Masahiro Hamasaki, Masataka Goto, Shigeo Morishima

IEEE ACCESS 9 101994 - 102003 2021

Abstract

We propose a learning-based method of estimating the compatibility between vocal and accompaniment audio tracks, i.e., how well they go with each other when played simultaneously. This task is challenging because it is difficult to formulate hand-crafted rules or construct a large labeled dataset to perform supervised learning. Our method uses self-supervised and joint-embedding techniques for estimating vocal-accompaniment compatibility. We train vocal and accompaniment encoders to learn a joint-embedding space of vocal and accompaniment tracks, where the embedded feature vectors of a compatible pair of vocal and accompaniment tracks lie close to each other and those of an incompatible pair lie far from each other. To address the lack of large labeled datasets consisting of compatible and incompatible pairs of vocal and accompaniment tracks, we propose generating such a dataset from songs using singing voice separation techniques, with which songs are separated into pairs of vocal and accompaniment tracks, and then original pairs are assumed to be compatible, and other random pairs are not. We achieved this training by constructing a large dataset containing 910,803 songs and evaluated the effectiveness of our method using ranking-based evaluation methods.
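
The dataset construction step described above (original vocal/accompaniment pairs treated as compatible, random cross-song pairs as incompatible) can be sketched as follows. The function name `build_pairs`, its parameters, and the use of song identifiers in place of separated audio tracks are assumptions made for illustration only.

```python
import random

def build_pairs(song_ids, negatives_per_song=1, seed=0):
    """Return (vocal_song, accompaniment_song, label) triples.

    label 1: both tracks were separated from the same song (assumed compatible).
    label 0: the accompaniment comes from a different, randomly chosen song.
    """
    if len(song_ids) < 2:
        raise ValueError("need at least two songs to draw negatives")
    rng = random.Random(seed)
    pairs = []
    for song in song_ids:
        pairs.append((song, song, 1))
        # Candidates for negatives: every other song in the collection.
        others = [s for s in song_ids if s != song]
        for _ in range(negatives_per_song):
            pairs.append((song, rng.choice(others), 0))
    return pairs
```

No manual labels are involved: the only supervision signal is whether the two tracks originate from the same recording, which is what makes the approach self-supervised.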

DOI

Scopus citations: 4
MirrorNet: A Deep Reflective Approach to 2D Pose Estimation for Single-Person Images

Takayuki Nakatsuka, Kazuyoshi Yoshii, Yuki Koyama, Satoru Fukayama, Masataka Goto, Shigeo Morishima

Journal of Information Processing 29 406 - 423 2021

Abstract

This paper proposes a statistical approach to 2D pose estimation from human images. The main problems with the standard supervised approach, which is based on a deep recognition (image-to-pose) model, are that it often yields anatomically implausible poses, and its performance is limited by the amount of paired data. To solve these problems, we propose a semi-supervised method that can make effective use of images with and without pose annotations. Specifically, we formulate a hierarchical generative model of poses and images by integrating a deep generative model of poses from pose features with that of images from poses and image features. We then introduce a deep recognition model that infers poses from images. Given images as observed data, these models can be trained jointly in a hierarchical variational autoencoding (image-to-pose-to-feature-to-pose-to-image) manner. The results of experiments show that the proposed reflective architecture makes estimated poses anatomically plausible, and the pose estimation performance is improved by integrating the recognition and generative models and also by feeding non-annotated images.

DOI
Self-supervised learning for visual summary identification in scientific publications

Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš, Shigeo Morishima

CEUR Workshop Proceedings 2847 5 - 19 2021

Abstract

Providing visual summaries of scientific publications can increase information access for readers and thereby help deal with the exponential growth in the number of scientific publications. Nonetheless, efforts in providing visual publication summaries have been few and far between, primarily focusing on the biomedical domain. This is primarily because of the limited availability of annotated gold standards, which hampers the application of robust and high-performing supervised learning techniques. To address these problems, we create a new benchmark dataset for selecting figures to serve as visual summaries of publications based on their abstracts, covering several domains in computer science. Moreover, we develop a self-supervised learning approach, based on heuristic matching of inline references to figures with figure captions. Experiments in both biomedical and computer science domains show that our model is able to outperform the state of the art despite being self-supervised and therefore not relying on any annotated training data.
Do We Need Sound for Sound Source Localization?

Takashi Oya, Shohei Iwase, Ryota Natsume, Takahiro Itazuri, Shugo Yamaguchi, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 12627 LNCS 119 - 136 2021

Abstract

During the performance of sound source localization which uses both visual and aural information, it presently remains unclear how much either image or sound modalities contribute to the result, i.e., do we need both image and sound for sound source localization? To address this question, we develop an unsupervised learning system that solves sound source localization by decomposing this task into two steps: (i) “potential sound source localization”, a step that localizes possible sound sources using only visual information, and (ii) “object selection”, a step that identifies which objects are actually sounding using aural information. Our overall system achieves state-of-the-art performance in sound source localization, and more importantly, we find that despite the constraint on available information, the results of (i) achieve similar performance. From this observation and further experiments, we show that visual information is dominant in “sound” source localization when evaluated with the currently adopted benchmark dataset. Moreover, we show that the majority of sound-producing objects within the samples in this dataset can be inherently identified using only visual information, and thus that the dataset is inadequate to evaluate a system’s capability to leverage aural information. As an alternative, we present an evaluation protocol that enforces both visual and aural information to be leveraged, and verify this property through several experiments.

DOI

Scopus citations: 8
Song2Face: Synthesizing Singing Facial Animation from Audio

Shohei Iwase, Takuya Kato, Shugo Yamaguchi, Yukitaka Tsuchiya, Shigeo Morishima

SIGGRAPH Asia 2020 Technical Communications, SA 2020 December 2020 [Peer-reviewed]

Abstract

We present Song2Face, a deep neural network capable of producing singing facial animation from an input of singing voice and singer label. The network architecture is built upon our insight that, although facial expression when singing varies between different individuals, singing voices store valuable information such as pitch, breath, and vibrato that expressions may be attributed to. Therefore, our network consists of an encoder that extracts relevant vocal features from audio, and a regression network conditioned on a singer label that predicts control parameters for facial animation. In contrast to prior audio-driven speech animation methods which initially map audio to text-level features, we show that vocal features can be directly learned from singing voice without any explicit constraints. Our network is capable of producing movements for all parts of the face and also rotational movement of the head itself. Furthermore, stylistic differences in expression between different singers are captured via the singer label, and thus the resulting animation's singing style can be manipulated at test time.

DOI

Scopus citations: 6
Audio-visual object removal in 360-degree videos

Ryo Shimamura, Qi Feng, Yuki Koyama, Takayuki Nakatsuka, Satoru Fukayama, Masahiro Hamasaki, Masataka Goto, Shigeo Morishima

VISUAL COMPUTER 36 ( 10-12 ) 2117 - 2128 October 2020 [Peer-reviewed]

Abstract

We present a novel concept, audio-visual object removal in 360-degree videos, in which a target object in a 360-degree video is removed in both the visual and auditory domains synchronously. Previous methods have solely focused on the visual aspect of object removal using video inpainting techniques, resulting in videos with unreasonable remaining sounds corresponding to the removed objects. We propose a solution which incorporates direction acquired during the video inpainting process into the audio removal process. More specifically, our method identifies the sound corresponding to the visually tracked target object and then synthesizes a three-dimensional sound field by subtracting the identified sound from the input 360-degree video. We conducted a user study showing that our multi-modal object removal supporting both visual and auditory domains could significantly improve the virtual reality experience, and our method could generate sufficiently synchronous, natural and satisfactory 360-degree videos.

DOI

Scopus citations: 6
LinSSS: linear decomposition of heterogeneous subsurface scattering for real-time screen-space rendering

Tatsuya Yatagawa, Yasushi Yamaguchi, Shigeo Morishima

VISUAL COMPUTER 36 ( 10-12 ) 1979 - 1992 October 2020 [Peer-reviewed]

Abstract

Screen-space subsurface scattering is currently the most common approach to represent translucent materials in real-time rendering. However, most of the current approaches approximate the diffuse reflectance profile of translucent materials as a symmetric function, whereas the profile has an asymmetric shape in nature. To address this problem, we propose LinSSS, a numerical representation of heterogeneous subsurface scattering for real-time screen-space rendering. Although our representation is built upon a previous method, it makes two contributions. First, LinSSS formulates the diffuse reflectance profile as a linear combination of radially symmetric Gaussian functions. Nevertheless, it can also represent the spatial variation and the radial asymmetry of the profile. Second, since LinSSS is formulated using only the Gaussian functions, the convolution of the diffuse reflectance profile can be efficiently calculated in screen space. To further improve the efficiency, we deform the rendering equation obtained using LinSSS by factoring common convolution terms and approximate the convolution processes using a MIP map. Consequently, our method works as fast as the state-of-the-art method, while our method successfully represents the heterogeneity of scattering.
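
The reason a Gaussian-only formulation convolves cheaply in screen space can be illustrated with a short sketch: a normalized 2D Gaussian factors into two 1D Gaussians, so each term of the linear combination can be applied as a horizontal pass followed by a vertical pass. The code below shows only a radially symmetric weighted sum; it does not reproduce how LinSSS represents spatial variation or radial asymmetry, and the function names and parameters are assumptions for illustration.

```python
import math

def gaussian_1d(x, sigma):
    """Normalized 1D Gaussian."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)

def gaussian_2d(x, y, sigma):
    """Normalized, radially symmetric 2D Gaussian."""
    r2 = x * x + y * y
    return math.exp(-r2 / (2.0 * sigma * sigma)) / (2.0 * math.pi * sigma * sigma)

def profile(x, y, weights, sigmas):
    """Reflectance profile modeled as a weighted sum of 2D Gaussians."""
    return sum(w * gaussian_2d(x, y, s) for w, s in zip(weights, sigmas))
```

Since `gaussian_2d(x, y, s)` equals `gaussian_1d(x, s) * gaussian_1d(y, s)`, blurring with a kernel of width n costs on the order of 2n samples per pixel instead of n squared.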

DOI

Scopus citations: 4
Guiding Blind Pedestrians in Public Spaces by Understanding Walking Behavior of Nearby Pedestrians

Seita Kayukawa, Tatsuya Ishihara, Hironobu Takagi, Shigeo Morishima, Chieko Asakawa

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 4 ( 3 ) 85 - 22 September 2020 [Peer-reviewed]

Abstract

We present a guiding system to help blind people walk in public spaces while making their walking seamless with nearby pedestrians. Blind users carry a rolling suitcase-shaped system that has two RGBD cameras, an inertial measurement unit (IMU) sensor, and a light detection and ranging (LiDAR) sensor. The system senses the behavior of surrounding pedestrians, predicts risks of collisions, and alerts users to help them avoid collisions. It has two modes: the "on-path" mode that helps users avoid collisions without changing their path by adapting their walking speed; and the "off-path" mode that navigates an alternative path to go around pedestrians standing in the way. Auditory and tactile modalities have been commonly used for non-visual navigation systems, so we implemented two interfaces to evaluate the effectiveness of each modality for collision avoidance. A user study with 14 blind participants in public spaces revealed that participants could successfully avoid collisions with both modalities. We detail the characteristics of each modality.

DOI

Scopus citations: 40
Resolving hand-object occlusion for mixed reality with joint deep learning and model optimization

Qi Feng, Hubert P. H. Shum, Shigeo Morishima

COMPUTER ANIMATION AND VIRTUAL WORLDS 31 ( 4-5 ) July 2020 [Peer-reviewed]

Abstract

By overlaying virtual imagery onto the real world, mixed reality facilitates diverse applications and has drawn increasing attention. Enhancing physical in-hand objects with a virtual appearance is a key component for many applications that require users to interact with tools such as surgery simulations. However, due to complex hand articulations and severe hand-object occlusions, resolving occlusions in hand-object interactions is a challenging topic. Traditional tracking-based approaches are limited by strong ambiguities from occlusions and changing shapes, while reconstruction-based methods show a poor capability of handling dynamic scenes. In this article, we propose a novel real-time optimization system to resolve hand-object occlusions by spatially reconstructing the scene with estimated hand joints and masks. To acquire accurate results, we propose a joint learning process that shares information between two models and jointly estimates hand poses and semantic segmentation. To facilitate the joint learning system and improve its accuracy under occlusions, we propose an occlusion-aware RGB-D hand data set that mitigates the ambiguity through precise annotations and photorealistic appearance. Evaluations show more consistent overlays compared with literature, and a user study verifies a more realistic experience.

DOI

Scopus citations: 9
Asynchronous Eulerian Liquid Simulation

T. Koike, S. Morishima, R. Ando

Computer Graphics Forum 39 ( 2 ) 1 - 8 May 2020 [Peer-reviewed]

Abstract

We present a novel method for simulating liquid with asynchronous time steps on Eulerian grids. Previous approaches focus on Smoothed Particle Hydrodynamics (SPH), Material Point Method (MPM) or tetrahedral Finite Element Method (FEM), but methods for simulating liquid purely on Eulerian grids have not yet been investigated. We address several challenges specifically arising from the Eulerian asynchronous time integrator such as regional pressure solve, asynchronous advection, interpolation, regional volume preservation, and dedicated segregation of the simulation domain according to the liquid velocity. We demonstrate our method on top of staggered grids combined with the level set method and the semi-Lagrangian scheme. We run several examples and show that our method considerably outperforms the global adaptive time step method with respect to the computational runtime on scenes where a large variance of velocity is present.

Citations (Scopus): 3
MirrorNet: A Deep Bayesian Approach to Reflective 2D Pose Estimation from Human Images

Takayuki Nakatsuka, Kazuyoshi Yoshii, Yuki Koyama, Satoru Fukayama, Masataka Goto, Shigeo Morishima

CoRR abs/2004.03811, April 2020

Abstract

This paper proposes a statistical approach to 2D pose estimation from human images. The main problems with the standard supervised approach, which is based on a deep recognition (image-to-pose) model, are that it often yields anatomically implausible poses and that its performance is limited by the amount of paired data. To solve these problems, we propose a semi-supervised method that can make effective use of images with and without pose annotations. Specifically, we formulate a hierarchical generative model of poses and images by integrating a deep generative model of poses from pose features with that of images from poses and image features. We then introduce a deep recognition model that infers poses from images. Given images as observed data, these models can be trained jointly in a hierarchical variational autoencoding (image-to-pose-to-feature-to-pose-to-image) manner. The results of experiments show that the proposed reflective architecture makes estimated poses anatomically plausible, and that the performance of pose estimation is improved by integrating the recognition and generative models and also by feeding non-annotated images.
Data compression for measured heterogeneous subsurface scattering via scattering profile blending

Tatsuya Yatagawa, Hideki Todo, Yasushi Yamaguchi, Shigeo Morishima

VISUAL COMPUTER 36 (3) 541-558, March 2020 [peer-reviewed]

Abstract

Subsurface scattering involves the complicated behavior of light beneath the surfaces of translucent objects that includes scattering and absorption inside the object's volume. Physically accurate numerical representation of subsurface scattering requires a large number of parameters because of the complex nature of this phenomenon. The large amount of data restricts the use of the data on memory-limited devices such as video game consoles and mobile phones. To address this problem, this paper proposes an efficient data compression method for heterogeneous subsurface scattering. The key insight of this study is that heterogeneous materials often comprise a limited number of base materials, and the size of the subsurface scattering data can be significantly reduced by parameterizing only a few base materials. In the proposed compression method, we represent the scattering property of a base material using a function referred to as the base scattering profile. A small subset of the base materials is assigned to each surface position, and the local scattering property near the position is described using a linear combination of the base scattering profiles in the log scale. The proposed method reduces the data by a factor of approximately 30 compared to a state-of-the-art method, without significant loss of visual quality in the rendered graphics. In addition, the compressed data can also be used as bidirectional scattering surface reflectance distribution functions (BSSRDF) without incurring much computational overhead. These practical aspects of the proposed method also facilitate the use of higher-resolution BSSRDFs in devices with large memory capacity.
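The log-scale blending described in the abstract can be illustrated with a minimal sketch. It assumes each base scattering profile is a positive function of radial distance. The profile shapes, weights, and all function names here are hypothetical and chosen only for illustration, not taken from the paper.

```python
import math


def blend_profiles_log(profiles, weights, radius):
    """Blend base scattering profiles at `radius` as a weighted sum in log scale.

    `profiles` are callables returning positive values; `weights` are the
    per-position blending coefficients (both hypothetical here).
    """
    log_value = sum(w * math.log(p(radius)) for p, w in zip(profiles, weights))
    return math.exp(log_value)


# Two made-up base profiles: exponential falloffs with different widths.
def profile_a(r):
    return math.exp(-r / 0.5)


def profile_b(r):
    return math.exp(-r / 2.0)


# Local scattering property at one surface position, evaluated at r = 1.0.
value = blend_profiles_log([profile_a, profile_b], [0.25, 0.75], 1.0)
```

Blending in log scale amounts to a weighted geometric mean of the profiles, so only the few weights per surface position need to be stored alongside the shared base profiles.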

Citations (Scopus): 2
Adversarial Knowledge Distillation for a Compact Generator

Hideki Tsunashima, Hirokatsu Kataoka, Junji Yamato, Qiu Chen, Shigeo Morishima

25th International Conference on Pattern Recognition (ICPR) 10636-10643, 2020

Citations (Scopus): 2
Multi-Instrument Music Transcription Based on Deep Spherical Clustering of Spectrograms and Pitchgrams

Keitaro Tanaka, Takayuki Nakatsuka, Ryo Nishikimi, Kazuyoshi Yoshii, Shigeo Morishima

Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR) 327-334, 2020
Garment transfer for quadruped characters

F. Narita, S. Saito, T. Kato, T. Fukusato, S. Morishima

European Association for Computer Graphics - 37th Annual Conference, EUROGRAPHICS 2016 - Short Papers 57-60, 2020 [peer-reviewed]

Abstract

Modeling clothing for characters is one of the most time-consuming tasks for artists in 3DCG animation production. Transferring existing clothing models is a simple and powerful solution to reduce labor. In this paper, we propose a method to generate a clothing model for various characters from a single template model. Our framework consists of three steps: scale measurement, clothing transformation, and texture preservation. By introducing a novel measurement of the scale deviation between two characters with different shapes and poses, our framework achieves pose-independent transfer of clothing even for quadrupeds (e.g., from human to horse). In addition to a plausible clothing transformation method based on the scale measurement, our method minimizes texture distortion resulting from large deformation. We demonstrate that our system is robust for a wide range of body shapes and poses, which is challenging for current state-of-the-art methods.

Citations (Scopus): 1
Hypermask talking head projected onto real object

Shigeo Morishima, Tatsuo Yotsukura, Kim Binsted, Frank Nielsen, Claudio Pinhanez

Multimedia Modeling: Modeling Multimedia Information and Systems, MMM 2000 403-412, 2020

Abstract

HYPERMASK is a system which projects an animated face onto a physical mask, worn by an actor. As the mask moves within a prescribed area, its position and orientation are detected by a camera, and the projected image changes with respect to the viewpoint of the audience. The lips of the projected face are automatically synthesized in real time with the voice of the actor, who also controls the facial expressions. As a theatrical tool, HYPERMASK enables a new style of storytelling. As a prototype system, we propose to put a self-contained HYPERMASK system in a trolley (disguised as a linen cart), so that it projects onto the mask worn by the actor pushing the trolley.
Foreground-aware dense depth estimation for 360 images

Qi Feng, Hubert P.H. Shum, Ryo Shimamura, Shigeo Morishima

Journal of WSCG 28 (1-2) 79-88, 2020

Abstract

With 360 imaging devices becoming widely accessible, omnidirectional content has gained popularity in multiple fields. The ability to estimate depth from a single omnidirectional image can benefit applications such as robot navigation and virtual reality. However, existing depth estimation approaches produce sub-optimal results on real-world omnidirectional images with dynamic foreground objects. On the one hand, capture-based methods cannot obtain the foreground due to the limitations of the scanning and stitching schemes. On the other hand, it is challenging for synthesis-based methods to generate highly realistic virtual foreground objects that are comparable to the real-world ones. In this paper, we propose to augment datasets with realistic foreground objects using an image-based approach, which produces a foreground-aware photorealistic dataset for machine learning algorithms. By exploiting a novel scale-invariant RGB-D correspondence in the spherical domain, we repurpose abundant non-omnidirectional datasets to include realistic foreground objects with correct distortions. We further propose a novel auxiliary deep neural network to estimate both the depth of the omnidirectional images and the mask of the foreground objects, where the two tasks facilitate each other. A new local depth loss considers small regions of interest and ensures that their depth estimations are not smoothed out during the optimization of the global gradient. We demonstrate the system using humans as the foreground due to their complexity and contextual importance, while the framework can be generalized to any other foreground objects. Experimental results demonstrate more consistent global estimations and more accurate local estimations compared with state-of-the-art methods.

Citations (Scopus): 3
Audio-guided Video Interpolation via Human Pose Features

Takayuki Nakatsuka, Masatoshi Hamanaka, Shigeo Morishima

PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP 5 27-35, 2020

Abstract

This paper describes a method that generates in-between frames of two videos of a musical instrument being played. While image generation has achieved successful outcomes in recent years, there is ample scope for improvement in video generation. The keys to improving the quality of video generation are the high resolution and temporal coherence of videos. We addressed these requirements by using not only visual information but also aural information. The critical point of our method is using two-dimensional pose features to generate high-resolution in-between frames from the input audio. We constructed a deep neural network with a recurrent structure for inferring pose features from the input audio and an encoder-decoder network for padding and generating video frames using pose features. Moreover, our method adopted a fusion approach of generating, padding, and retrieving video frames to improve the output video. Pose features played an essential role both in end-to-end training with a differentiable property and in combining the generating, padding, and retrieving approaches. We conducted a user study and confirmed that the proposed method is effective in generating interpolated videos.

Single Sketch Image based 3D Car Shape Reconstruction with Deep Learning and Lazy Learning

Naoki Nozawa, Hubert P. H. Shum, Edmond S. L. Ho, Shigeo Morishima

PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 1: GRAPP 1 179-190, 2020

Abstract

Efficient car shape design is a challenging problem in both the automotive industry and the computer animation/games industry. In this paper, we present a system to reconstruct the 3D car shape from a single 2D sketch image. To learn the correlation between 2D sketches and 3D cars, we propose a Variational Autoencoder deep neural network that takes a 2D sketch and generates a set of multi-view depth and mask images, which form a more effective representation compared to 3D meshes and can be effectively fused to generate a 3D car shape. Since global models like deep learning have limited capacity to reconstruct fine-detail features, we propose a local lazy learning approach that constructs a small subspace based on a few relevant car samples in the database. Due to the small size of such a subspace, fine details can be represented effectively with a small number of parameters. With a low-cost optimization process, a high-quality car shape with detailed features is created. Experimental results show that the system performs consistently to create highly realistic cars of substantially different shape and topology.

Citations (Scopus): 8
Smartphone-Based Assistance for Blind People to Stand in Lines

Seita Kayukawa, Hironobu Takagi, Joao Guerreiro, Shigeo Morishima, Chieko Asakawa

CHI'20: EXTENDED ABSTRACTS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS 1-8, 2020

Abstract

We present a system that allows blind people to stand in line in public spaces using only an off-the-shelf smartphone. The technologies to navigate blind pedestrians in public spaces are rapidly improving, but tasks that require understanding the behavior of surrounding people are still difficult to assist. Standing in line at shops, stations, and other crowded places is one such task. Therefore, we developed a system that continuously detects the distance to the person in front and notifies the user, using a smartphone with an RGB camera and an infrared depth sensor. The system conveys three levels of distance via vibration patterns so that users can start or stop moving forward to the right position at the right time. To evaluate the effectiveness of the system, we performed a study with six blind people. We observed that the system enables blind participants to stand in line successfully, while also gaining more confidence.
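The three-level distance feedback mentioned in the abstract might be sketched as follows. The threshold values, level names, and vibration patterns are all hypothetical placeholders, since the abstract does not specify them.

```python
# Hypothetical vibration patterns: lists of (on_ms, off_ms) pulses per level.
VIBRATION_PATTERNS = {
    "near": [(400, 100)],               # one long pulse: stop
    "middle": [(100, 100), (100, 100)], # two short pulses: get ready
    "far": [(100, 400)],                # one short pulse: move forward
}


def distance_level(distance_m, near_m=0.8, far_m=1.5):
    """Map the measured distance to the person in front onto three levels.

    `near_m` and `far_m` are illustrative thresholds, not values from the paper.
    """
    if distance_m < near_m:
        return "near"
    if distance_m < far_m:
        return "middle"
    return "far"


def pattern_for(distance_m):
    """Return the vibration pattern to play for the current distance."""
    return VIBRATION_PATTERNS[distance_level(distance_m)]
```

Quantizing the continuous depth reading into a few discrete levels keeps the haptic signal easy to tell apart without requiring the user to look at or listen to the phone.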

Citations (Scopus): 10
BlindPilot: A Robotic Local Navigation System that Leads Blind People to a Landmark Object

Seita Kayukawa, Tatsuya Ishihara, Hironobu Takagi, Shigeo Morishima, Chieko Asakawa

CHI'20: EXTENDED ABSTRACTS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS 1-9, 2020

Abstract

Blind people face various local navigation challenges in their daily lives such as identifying empty seats in crowded stations, navigating toward a seat, and stopping and sitting at the correct spot. Although voice navigation is a commonly used solution, it requires users to carefully follow frequent navigational sounds over short distances. Therefore, we presented an assistive robot, BlindPilot, which guides blind users to landmark objects using an intuitive handle. BlindPilot employs an RGB-D camera to detect the positions of target objects and uses LiDAR to build a 2D map of the surrounding area. On the basis of the sensing results, BlindPilot then generates a path to the object and guides the user safely. To evaluate our system, we also implemented a sound-based navigation system as a baseline system, and asked six blind participants to approach an empty chair using the two systems. We observed that BlindPilot enabled users to approach a chair faster with a greater feeling of security and less effort compared to the baseline system.

Citations (Scopus): 27
Automatic sign dance synthesis from gesture-based sign language

Naoya Iwamoto, Hubert P.H. Shum, Wakana Asahina, Shigeo Morishima

Proceedings - MIG 2019: ACM Conference on Motion, Interaction, and Games, October 2019

Abstract

Automatic dance synthesis has become more and more popular due to the increasing demand in computer games and animations. Existing research generates dance motions without much consideration for the context of the music. In reality, professional dancers make choreography according to the lyrics and music features. In this research, we focus on a particular genre of dance known as sign dance, which combines gesture-based sign language with full body dance motion. We propose a system to automatically generate sign dance from a piece of music and its corresponding sign gesture. The core of the system is a Sign Dance Model trained by multiple regression analysis to represent the correlations between sign dance and sign gesture/music, as well as a set of objective functions to evaluate the quality of the sign dance. Our system can be applied to music visualization, allowing people with hearing difficulties to understand and enjoy music.

Citations (Scopus): 3
3D car shape reconstruction from a single sketch image

Naoki Nozawa, Hubert P.H. Shum, Edmond S.L. Ho, Shigeo Morishima

Proceedings - MIG 2019: ACM Conference on Motion, Interaction, and Games, October 2019

Abstract

Efficient car shape design is a challenging problem in both the automotive industry and the computer animation/games industry. In this paper, we present a system to reconstruct the 3D car shape from a single 2D sketch image. To learn the correlation between 2D sketches and 3D cars, we propose a Variational Autoencoder deep neural network that takes a 2D sketch and generates a set of multiview depth and mask images, which are a more effective representation compared to 3D meshes and can be combined to form the 3D car shape. To ensure the volume and diversity of the training data, we propose a feature-preserving car mesh augmentation pipeline for data augmentation. Since deep learning has limited capacity to reconstruct fine-detail features, we propose a lazy learning approach that constructs a small subspace based on a few relevant car samples in the database. Due to the small size of such a subspace, fine details can be represented effectively with a small number of parameters. With a low-cost optimization process, a high-quality car with detailed features is created. Experimental results show that the system performs consistently to create highly realistic cars of substantially different shape and topology, with a very low computational cost.

Citations (Scopus): 2
Real-time Indirect Illumination of Emissive Inhomogeneous Volumes using Layered Polygonal Area Lights

Takahiro Kuge, Tatsuya Yatagawa, Shigeo Morishima

COMPUTER GRAPHICS FORUM 38 (7) 449-460, October 2019

Abstract

Indirect illumination involving visually rich participating media, such as turbulent smoke and loud explosions, contributes significantly to the appearances of other objects in a rendering scene. However, previous real-time techniques have focused only on the appearances of the media directly visible from the viewer. Specifically, appearances that can be indirectly seen over reflective surfaces have not attracted much attention. In this paper, we present a real-time rendering technique for such indirect views that involve participating media. To achieve real-time performance for computing indirect views, we leverage layered polygonal area lights (LPALs) that can be obtained by slicing the media into multiple flat layers. Using this representation, radiance entering each surface point from each slice of the volume is analytically evaluated to achieve instant calculation. The analytic solution can be derived for standard bidirectional reflectance distribution functions (BRDFs) based on the microfacet theory. Accordingly, our method is sufficiently robust to work on surfaces with arbitrary shapes and roughness values. In addition, we propose a quadrature method for more accurate rendering of scenes with dense volumes, and a transformation of the domain of volumes to simplify the calculation and implementation of the proposed method. By taking advantage of these computation techniques, the proposed method achieves real-time rendering of indirect illumination for emissive volumes.

Citations (Scopus): 2
Interactive Face Retrieval Framework for Clarifying User's Visual Memory

Yugo Sato, Tsukasa Fukusato, Shigeo Morishima

ITE TRANSACTIONS ON MEDIA TECHNOLOGY AND APPLICATIONS 7 (2) 68-79, 2019

Abstract

This paper presents an interactive face retrieval framework for clarifying an image representation envisioned by a user. Our system is designed for a situation in which the user wishes to find a person but has only visual memory of the person. We address a critical challenge of image retrieval across the user's inputs. Instead of target-specific information, the user can select several images that are similar to an impression of the target person the user wishes to search for. Based on the user's selection, our proposed system automatically updates a deep convolutional neural network. By interactively repeating this process, the system can reduce the gap between human-based similarities and computer-based similarities and estimate the target image representation. We ran user studies with 10 participants on a public database and confirmed that the proposed framework is effective for clarifying the image representation envisioned by the user easily and quickly.

Citations (Scopus): 2
A Study on the Sense of Burden and Body Ownership on Virtual Slope

Ryo Shimamura, Seita Kayukawa, Takayuki Nakatsuka, Shoki Miyakawa, Shigeo Morishima

2019 26TH IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES (VR) 1154-1155, 2019

Abstract

This paper provides insight into the burden felt when people walk up and down slopes in a virtual environment (VE) while actually walking on a flat floor in the real environment (RE). In the RE, we feel a physical load while walking uphill or downhill. To reproduce such a physical load in the VE, we provided visual stimuli to users by changing their step length. To investigate how the stimuli affect the sense of burden and body ownership, we performed a user study in which participants walked on slopes in the VE. We found that changing the step length has a significant impact on the burden on the user, and that body ownership is less correlated with step length.

Citations (Scopus): 2
Melody Slot Machine

Masatoshi Hamanaka, Takayuki Nakatsuka, Shigeo Morishima

ACM SIGGRAPH 2019 EMERGING TECHNOLOGIES (SIGGRAPH '19), 2019

Abstract

We developed an interactive music system called the "Melody Slot Machine," which provides an experience of manipulating a music performance. The melodies used in the system are divided into multiple segments, and each segment has multiple variations of melodies. By turning the dials manually, users can switch the variations of melodies freely. When you pull the slot lever, the melody of all segments rotates, and melody segments are randomly selected. Since the performer displayed in a hologram moves in accordance with the selected variation of melody, users can enjoy the feeling of manipulating the performance.

Citations (Scopus): 5
GPU smoke simulation on compressed DCT space

D. Ishida, R. Ando, S. Morishima

European Association for Computer Graphics - 40th Annual Conference, EUROGRAPHICS 2019 - Short Papers 5-8, 2019

Abstract

This paper presents a novel GPU-based algorithm for smoke animation. Our primary contribution is the use of Discrete Cosine Transform (DCT) compressed space for efficient simulation. We show that our method runs an order of magnitude faster than a CPU implementation while retaining visual details with a smaller memory usage. The key component of our method is an on-the-fly compression and expansion of velocity, pressure, and density fields. Whenever these physical quantities are requested during a simulation, we perform data expansion and compression only where necessary in a loop. As a consequence, our method can simulate a large domain without actually allocating full memory space for it. We show that although our method incurs some extra cost for DCT manipulations, this cost can be minimized with the aid of a carefully devised use of shared memory.
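The idea of compressing and expanding a field through the DCT can be illustrated with a small CPU-side sketch. This is a one-dimensional toy in pure Python, not the authors' GPU implementation; the block size, the number of retained coefficients, and all function names are assumptions made for illustration.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a 1D block."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs, n):
    """Inverse of `dct`; coefficients missing from `coeffs` count as zero."""
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(total)
    return samples


def compress(block, keep):
    """Store only the `keep` lowest-frequency DCT coefficients of a block."""
    return dct(block)[:keep]


def expand(coeffs, n):
    """Reconstruct an n-sample block from its stored coefficients."""
    return idct(coeffs, n)


# A smooth, made-up density block of 16 samples, stored with 8 coefficients.
density = [math.exp(-((i - 7.5) / 4.0) ** 2) for i in range(16)]
restored = expand(compress(density, 8), 16)
```

Smooth fields concentrate their energy in low frequencies, which is why dropping the high-frequency coefficients saves memory while changing the reconstructed block only slightly.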

Generating Video from Single Image and Sound

Yukitaka Tsuchiya, Takahiro Itazuri, Ryota Natsume, Shintaro Yamamoto, Takuya Kato, Shigeo Morishima

IEEE Conference on Computer Vision and Pattern Recognition Workshops 17-20, 2019
BBeep: A Sonic Collision Avoidance System for Blind Travellers and Nearby Pedestrians

Seita Kayukawa, Keita Higuchi, Joao Guerreiro, Shigeo Morishima, Yoichi Sato, Kris Kitani, Chieko Asakawa

CHI 2019: PROCEEDINGS OF THE 2019 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS 52-52, 2019

Abstract

We present an assistive suitcase system, BBeep, for supporting blind people when walking through crowded environments. BBeep uses pre-emptive sound notifications to help clear a path by alerting both the user and nearby pedestrians about the potential risk of collision. BBeep tracks pedestrians, predicts their future positions in real time, and emits sound notifications only when it anticipates a future collision. We investigate how different types and timings of sound affect nearby pedestrian behavior. In our experiments, we found that sound emission timing has a significant impact on nearby pedestrian trajectories when compared to different sound types. Based on these findings, we performed a real-world user study at an international airport, where blind participants navigated with the suitcase in crowded areas. We observed that the proposed system significantly reduces the number of imminent collisions.
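One simple way to anticipate a future collision from tracked positions is to extrapolate both the user and the pedestrian at constant velocity. The sketch below makes that assumption. The abstract does not state the prediction model, and the horizon, safety radius, and function name are hypothetical.

```python
import math


def will_collide(user_pos, user_vel, ped_pos, ped_vel,
                 horizon_s=5.0, radius_m=1.0, step_s=0.1):
    """Return True if the two come within `radius_m` within `horizon_s` seconds.

    Positions are (x, y) in meters and velocities are (vx, vy) in m/s.
    Both parties are extrapolated at constant velocity, an assumption made
    for this sketch only.
    """
    steps = int(round(horizon_s / step_s))
    for s in range(steps + 1):
        t = s * step_s
        dx = (ped_pos[0] + ped_vel[0] * t) - (user_pos[0] + user_vel[0] * t)
        dy = (ped_pos[1] + ped_vel[1] * t) - (user_pos[1] + user_vel[1] * t)
        if math.hypot(dx, dy) < radius_m:
            return True
    return False


# A pedestrian walking head-on toward the user versus one walking alongside.
head_on = will_collide((0.0, 0.0), (0.0, 1.0), (0.0, 8.0), (0.0, -1.0))
alongside = will_collide((0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (0.0, 1.0))
```

Triggering a sound only when such a check succeeds matches the abstract's goal of notifying only when a collision is anticipated, rather than whenever a pedestrian is detected.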

Citations (Scopus): 96
Real-time Rendering of Layered Materials with Anisotropic Normal Distributions

Tomoya Yamaguchi, Tatsuya Yatagawa, Yusuke Tokuyoshi, Shigeo Morishima

SA'19: SIGGRAPH ASIA 2019 TECHNICAL BRIEFS 6 (1) 87-90, 2019

Abstract

This paper proposes a lightweight bidirectional scattering distribution function (BSDF) model for layered materials with anisotropic reflection and refraction properties. In our method, each layer of the materials can be described by a microfacet BSDF using an anisotropic normal distribution function (NDF). Furthermore, the NDFs of layers can be defined on tangent vector fields, which differ from layer to layer. Our method is based on a previous study in which isotropic BSDFs are approximated by projecting them onto base planes. However, the adequateness of this previous work has not been well investigated for anisotropic BSDFs. In this paper, we demonstrate that the projection is also applicable to anisotropic BSDFs and that they can be approximated by elliptical distributions using covariance matrices.

Citations (Scopus): 10
Audio-based automatic generation of a piano reduction score by considering the musical structure

Hirofumi Takamori, Takayuki Nakatsuka, Satoru Fukayama, Masataka Goto, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 11296 LNCS 169-181, 2019 [peer-reviewed]

Abstract

This study describes a method that automatically generates a piano reduction score from the audio recordings of popular music while considering the musical structure. The generated score comprises both right- and left-hand piano parts, which reflect the melodies, chords, and rhythms extracted from the original audio signals. Generating such a reduction score from an audio recording is challenging because automatic music transcription is still considered to be inefficient when the input contains sounds from various instruments. Reflecting the long-term correlation structure behind similar repetitive bars is also challenging; further, previous methods have independently generated each bar. Our approach addresses the aforementioned issues by integrating musical analysis, especially structural analysis, with music generation. Our method extracts rhythmic features as well as melodies and chords from the input audio recording and reflects them in the score. To consider the long-term correlation between bars, we use similarity matrices, created for several acoustical features, as constraints. We further conduct a multivariate regression analysis to determine the acoustical features that represent the most valuable constraints for generating a musical structure. We have generated piano scores using our method and have observed that we can produce scores that strike different balances between the ability to achieve rhythmic characteristics and the ability to obtain musical structures.

Citations (Scopus): 8
Automatic arranging musical score for piano using important musical elements

Hirofumi Takamori, Haruki Sato, Takayuki Nakatsuka, Shigeo Morishima

Proceedings of the 14th Sound and Music Computing Conference 2017, SMC 2017 35-41, 2019 [peer-reviewed]

Abstract

There is a demand for arranging music composed using multiple instruments for a solo piano because there are several pianists who wish to practice playing their favorite songs or music. Generally, the method used for piano arrangement entails reducing original notes to fit on a two-line staff. However, a fundamental solution that improves originality and playability in conjunction with score quality continues to elude approaches proposed by extant studies. Hence, the present study proposes a new approach to arranging a musical score for the piano by using four musical components, namely melody, chords, rhythm, and the number of notes that can be extracted from an original score. The proposed method involves inputting an original score and subsequently generating both right- and left-hand playing parts of piano scores. With respect to the right part, optional notes from a chord were added to the melody. With respect to the left part, appropriate accompaniments were selected from a database comprising pop musical piano scores. The selected accompaniments are considered to correspond to the impression of an original score. High-quality solo piano scores reflecting original characteristics were generated and considered as part of playability.
Understanding Fake Faces

Ryota Natsume, Kazuki Inoue, Yoshihiro Fukuhara, Shintaro Yamamoto, Shigeo Morishima, Hirokatsu Kataoka

COMPUTER VISION - ECCV 2018 WORKSHOPS, PT III 11131 566-576, 2019 [peer-reviewed]

Abstract

Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now filling the gap between human-level and computer-driven performance levels in face verification algorithms. However, although the performance gap appears to be narrowing in terms of accuracy-based expectations, a curious question has arisen: is AI's understanding of faces really close to that of humans? In the present study, in an effort to confirm the brain-driven concept, we conduct image-based detection, classification, and generation using an in-house created fake face database. This database has two configurations: (i) false positive face detections produced using both the Viola Jones (VJ) method and convolutional neural networks (CNN), and (ii) simulacra that have fundamental characteristics that resemble faces but are completely artificial. The results show a level of suggestive knowledge that indicates the continuing existence of a gap between the capabilities of recent vision-based face recognition algorithms and human-level performance. On a positive note, however, we have obtained knowledge that will advance the progress of face-understanding models.

Citations (Scopus): 1
Automatic Paper Summary Generation from Visual and Textual Information

Shintaro Yamamoto, Yoshihiro Fukuhara, Ryota Suzuki, Shigeo Morishima, Hirokatsu Kataoka

ELEVENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2018) 11041 V101, 2019 [peer-reviewed]

Abstract

Due to the recent boom in artificial intelligence (AI) research, including computer vision (CV), it has become impossible for researchers in these fields to keep up with the exponentially increasing number of manuscripts. In response to this situation, this paper proposes the paper summary generation (PSG) task using a simple but effective method to automatically generate an academic paper summary from raw PDF data. We realized PSG by combining a vision-based supervised component detector with a language-based unsupervised important-sentence extractor, which is applicable to manuscript formats seen in training. We present a quantitative evaluation of the simple vision-based component extraction and a qualitative evaluation showing that our system can extract both visual items and sentences that are helpful for understanding. After processing via our PSG, the 979 manuscripts accepted by the Conference on Computer Vision and Pattern Recognition (CVPR) 2018 are available. It is believed that the proposed method will provide a better way for researchers to stay caught up with important academic papers.

Citations (Scopus): 1
Preface

Shigeo Morishima

Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST 13, November 2018
Automatic Paper Summary Generation from Visual and Textual Information

Shintaro Yamamoto, Yoshihiro Fukuhara, Ryota Suzuki, Shigeo Morishima, Hirokatsu Kataoka

CoRR abs/1811.06943, November 2018 [peer-reviewed]

Abstract

Due to the recent boom in artificial intelligence (AI) research, including computer vision (CV), it has become impossible for researchers in these fields to keep up with the exponentially increasing number of manuscripts. In response to this situation, this paper proposes the paper summary generation (PSG) task using a simple but effective method to automatically generate an academic paper summary from raw PDF data. We realized PSG by combining a vision-based supervised component detector with a language-based unsupervised important-sentence extractor, which is applicable to manuscript formats seen in training. We present a quantitative evaluation of the simple vision-based component extraction and a qualitative evaluation showing that our system can extract both visual items and sentences that are helpful for understanding. After processing via our PSG, the 979 manuscripts accepted by the Conference on Computer Vision and Pattern Recognition (CVPR) 2018 are available. It is believed that the proposed method will provide a better way for researchers to stay caught up with important academic papers.
Occlusion for 3D Object Manipulation with Hands in Augmented Reality

Q Feng, H Shum, S Morishima

24th ACM Symposium on Virtual Reality Software and Technology (VRST2018), 24 ( 1166 ) 119 November 2018 [Refereed]

Due to the need to interact with virtual objects, the hand-object interaction has become an important element in mixed reality (MR) applications. In this paper, we propose a novel approach to handle the occlusion of augmented 3D object manipulation with hands by exploiting the nature of hand poses combined with tracking-based and model-based methods, to achieve a complete mixed reality experience without necessities of heavy computations, complex manual segmentation processes or wearing special gloves. The experimental results show a frame rate faster than real-time and a great accuracy of rendered virtual appearances, and a user study verifies a more immersive experience compared to past approaches. We believe that the proposed method can improve a wide range of mixed reality applications that involve hand-object interactions.

DOI

Citations (Scopus): 10
Efficient Metropolis Path Sampling for Material Editing and Re-rendering

Tomoya Yamaguchi, Tatsuya Yatagawa, Shigeo Morishima

Proceedings of Pacific Graphics 2018 October 2018 [Refereed]

This paper proposes efficient path sampling for re-rendering scenes after material editing. The proposed sampling is based on Metropolis light transport (MLT) and dedicates path samples more to pixels receiving greater changes by editing. To roughly estimate the amount of the changes in pixel values, we first render the difference between images before and after editing. In this step, we render the difference image directly rather than taking the difference of the images by separately rendering them. Then, we sample more paths for the pixels with larger difference values, and render the scene after editing following a recent approach using the control variates. As a result, we can obtain fine rendering results with a small number of path samples. We examined the proposed sampling with a range of scenes and demonstrated that it achieves lower estimation errors and variances over the state-of-the-art method.

DOI

Citations (Scopus): 1
Understanding Fake Faces

Ryota Natsume, Kazuki Inoue, Yoshihiro Fukuhara, Shintaro Yamamoto, Shigeo Morishima, Hirokatsu Kataoka

CoRR abs/1809.08391 September 2018 [Refereed]

Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now filling the gap between human-level and computer-driven performance levels in face verification algorithms. However, although the performance gap appears to be narrowing in terms of accuracy-based expectations, a curious question has arisen; specifically, "Face understanding of AI is really close to that of human?" In the present study, in an effort to confirm the brain-driven concept, we conduct image-based detection, classification, and generation using an in-house created fake face database. This database has two configurations: (i) false positive face detections produced using both the Viola Jones (VJ) method and convolutional neural networks (CNN), and (ii) simulacra that have fundamental characteristics that resemble faces but are completely artificial. The results show a level of suggestive knowledge that indicates the continuing existence of a gap between the capabilities of recent vision-based face recognition algorithms and human-level performance. On a positive note, however, we have obtained knowledge that will advance the progress of face-understanding models.
RSGAN: Face swapping and editing using face and hair representation in latent spaces

Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

ACM SIGGRAPH 2018 Posters, SIGGRAPH 2018 69:1-69:2 August 2018 [Refereed]

This abstract introduces a generative neural network for face swapping and editing face images. We refer to this network as "region-separative generative adversarial network (RSGAN)". In existing deep generative models such as Variational autoencoder (VAE) and Generative adversarial network (GAN), training data must represent what the generative models synthesize. For example, image inpainting is achieved by training images with and without holes. However, it is difficult or even impossible to prepare a dataset which includes face images both before and after face swapping because faces of real people cannot be swapped without surgical operations. We tackle this problem by training the network so that it synthesizes a natural face image from an arbitrary pair of face and hair appearances. In addition to face swapping, the proposed network can be applied to other editing applications, such as visual attribute editing and random face parts synthesis.

DOI

Citations (Scopus): 66
How makeup experience changes how we see cosmetics?

Kanami Yamagishi, Takuya Kato, Shintaro Yamamoto, Ayano Kaneda, Shigeo Morishima

Proceedings of ACM Symposium on Applied Perception (SAP2018) August 2018 [Refereed]
RSGAN: Face Swapping and Editing Via Region Separation in Latent Spaces

Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

ACM SIGGRAPH 2018 posters August 2018 [Refereed]

This poster proposes a new deep generative model that we refer to as region separative generative adversarial network (RSGAN), which can achieve face swapping between arbitrary image pairs and can robustly perform the swapping compared to previous methods using 3DMM.
High-Fidelity Facial Reflectance and Geometry Inference From an Unconstrained Image

Shugo Yamaguchi, Shunsuke Saito, Koki Nagano, Yajie Zhao, Weikai Chen, Kyle Olszewski, Shigeo Morishima, Hao Li

ACM TRANSACTIONS ON GRAPHICS 37 ( 4 ) 162-1 - 162-14 August 2018 [Refereed]

We present a deep learning-based technique to infer high-quality facial reflectance and geometry given a single unconstrained image of the subject, which may contain partial occlusions and arbitrary illumination conditions. The reconstructed high-resolution textures, which are generated in only a few seconds, include high-resolution skin surface reflectance maps, representing both the diffuse and specular albedo, and medium- and high-frequency displacement maps, thereby allowing us to render compelling digital avatars under novel lighting conditions. To extract this data, we train our deep neural networks with a high-quality skin reflectance and geometry database created with a state-of-the-art multi-view photometric stereo system using polarized gradient illumination. Given the raw facial texture map extracted from the input image, our neural networks synthesize complete reflectance and displacement maps, as well as complete missing regions caused by occlusions. The completed textures exhibit consistent quality throughout the face due to our network architecture, which propagates texture features from the visible region, resulting in high-fidelity details that are consistent with those seen in visible regions. We describe how this highly underconstrained problem is made tractable by dividing the full inference into smaller tasks, which are addressed by dedicated neural networks. We demonstrate the effectiveness of our network design with robust texture completion from images of faces that are largely occluded. With the inferred reflectance and geometry data, we demonstrate the rendering of high-fidelity 3D avatars from a variety of subjects captured under different lighting conditions. In addition, we perform evaluations demonstrating that our method can infer plausible facial reflectance and geometric details comparable to those obtained from high-end capture devices, and outperform alternative approaches that require only a single unconstrained input image.

DOI

Citations (Scopus): 115
Face retrieval framework relying on user's visual memory

Yugo Sato, Tsukasa Fukusato, Shigeo Morishima

ICMR 2018 - Proceedings of the 2018 ACM International Conference on Multimedia Retrieval 274 - 282 June 2018 [Refereed]

This paper presents an interactive face retrieval framework for clarifying an image representation envisioned by a user. Our system is designed for a situation in which the user wishes to find a person but has only visual memory of the person. We address a critical challenge of image retrieval across the user's inputs. Instead of target-specific information, the user can select several images (or a single image) that are similar to an impression of the target person the user wishes to search for. Based on the user's selection, our proposed system automatically updates a deep convolutional neural network. By interactively repeating this process (human-in-the-loop optimization), the system can reduce the gap between human-based similarities and computer-based similarities and estimate the target image representation. We ran user studies with 10 subjects on a public database and confirmed that the proposed framework is effective for clarifying the image representation envisioned by the user easily and quickly.

DOI

Citations (Scopus): 4
Placing Music in Space: A Study on Music Appreciation with Spatial Mapping

Shoki Miyagawa, Yuki Koyama, Jun Kato, Masataka Goto, Shigeo Morishima

Proceedings of the ACM Symposium on Designing Interactive Systems (DIS2018) 39 - 43 June 2018 [Refereed]

We investigate the potential of music appreciation using spatial mapping techniques, which allow us to “place” audio sources in various locations within a physical space. We consider possible ways of this new appreciation style and list some design variables, such as how to define coordinate systems, how to show visually, and how to place the sound sources. We conducted an exploratory user study to examine how these design variables affect users’ music listening experiences. Based on our findings from the study, we discuss how we should develop systems that incorporate these design variables for music appreciation in the future.
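Compression of Measured BSSRDF Data by Mixing Diffusion Profiles of Basis Materials (基本材質の拡散プロファイル混合による実測BSSRDFデータの圧縮)

Tatsuya Yatagawa, Hideki Todo, Yasushi Yamaguchi, Shigeo Morishima

Visual Computing 2018 (VC 2018) June 2018 [Refereed]

Representing a heterogeneous translucent medium as physically correct data requires quantifying the diffusion parameters at every point in the medium's volume, and a naive model produces several gigabytes of data even at relatively low resolution. This study proposes a method that compresses the data to a few hundred kilobytes, one thirtieth of the size achieved by conventional methods, without greatly degrading rendering quality. The proposed method exploits the observation that many heterogeneous materials are formed by mixing at most a few dozen basis materials, and uses basis diffusion profile functions, which describe the diffusion behavior of these basis materials, for data compression. Experiments show that assigning about two basis materials to each pixel of the measured region yields a sufficiently accurate approximation.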
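An Image Editing Method Including Face Swapping Based on Region Feature Extraction Using a Variational Autoencoder (変分オートエンコーダを用いた領域特徴抽出による顔領域入れ替えを含む画像編集法)

Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

Visual Computing 2018 (VC 2018) June 2018 [Refereed]

This study proposes an integrated editing system that enables face swapping and appearance editing for face images. Most conventional face swapping methods rely on 3D reconstruction of the face shape, but even slight reconstruction errors look highly unnatural to human observers. We achieve face swapping with deep learning, without going through 3D shape reconstruction. The proposed method uses a VAE to learn from face images segmented into face and hair regions, and extracts the appearance features of the face and the hair separately as latent variables. Through training, the generator network learns to synthesize natural face images not only from pairs of extracted face and hair latent variables but also from pairs of random latent variables. Face swapping is achieved by synthesizing a face image from a pair of latent variables extracted from two different images.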
Thickness-aware voxelization

Zhuopeng Zhang, Shigeo Morishima, Changbo Wang

COMPUTER ANIMATION AND VIRTUAL WORLDS 29 ( 3-4 ) May 2018 [Refereed]

Voxelization is a crucial process for many computer graphics applications such as collision detection, rendering of translucent objects, and global illumination. However, in some situations, although the mesh looks good, the voxelization result may be undesirable. In this paper, we describe a novel voxelization method that uses the graphics processing unit for surface voxelization. Our improvements on the voxelization algorithm can address a problem of state-of-the-art voxelization, which cannot deal with thin parts of the mesh object. We improve the quality of voxelization on both normal mediation and surface correction. Furthermore, we investigate our voxelization methods on indirect illumination, showing the improvement on the quality of real-time rendering.

DOI

Citations (Scopus): 2
RSGAN: Face Swapping and Editing using Face and Hair Representation in Latent Spaces

Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

CoRR abs/1804.03447 April 2018 [Refereed]

In this paper, we present an integrated system for automatically generating and editing face images through face swapping, attribute-based editing, and random face parts synthesis. The proposed system is based on a deep neural network that variationally learns the face and hair regions with large-scale face image datasets. Different from conventional variational methods, the proposed network represents the latent spaces individually for faces and hairs. We refer to the proposed network as region-separative generative adversarial network (RSGAN). The proposed network independently handles face and hair appearances in the latent spaces, and then, face swapping is achieved by replacing the latent-space representations of the faces and reconstructing the entire face image with them. This approach in the latent space robustly performs face swapping even for images for which the previous methods result in failure due to inappropriate fitting of the 3D morphable models. In addition, the proposed system can further edit face-swapped images with the same network by manipulating visual attributes or by composing them with randomly generated face or hair parts.
Dynamic object scanning: Object-based elastic timeline for quickly browsing first-person videos

Seita Kayukawa, Keita Higuchi, Ryo Yonetani, Masanori Nakamura, Yoichi Sato, Shigeo Morishima

Conference on Human Factors in Computing Systems - Proceedings 2018-April April 2018 [Refereed]

This work presents the Dynamic Object Scanning (DO-Scanning), a novel interface that helps users browse long and untrimmed first-person videos quickly. The proposed interface offers users a small set of object cues generated automatically tailored to the context of a given video. Users choose which cue to highlight, and the interface in turn fast-forwards the video adaptively while keeping scenes with highlighted cues played at original speed. Our experimental results have revealed that the DO-Scanning arranged an efficient and compact set of cues, and this set of cues is useful for browsing a diverse set of first-person videos.

DOI

Citations (Scopus): 4
Dynamic object scanning: Object-based elastic timeline for quickly browsing first-person videos

Seita Kayukawa, Keita Higuchi, Ryo Yonetani, Masanori Nakamura, Yoichi Sato, Shigeo Morishima

Conference on Human Factors in Computing Systems - Proceedings 2018-April April 2018 [Refereed]

This work presents the Dynamic Object Scanning (DO-Scanning), a novel interface that helps users browse long and untrimmed first-person videos quickly. The proposed interface offers users a small set of object cues generated automatically tailored to the context of a given video. Users choose which cue to highlight, and the interface in turn adaptively fast-forwards the video while keeping scenes with highlighted cues played at original speed. Our experimental results have revealed that the DO-Scanning has an efficient and compact set of cues arranged dynamically and this set of cues is useful for browsing a diverse set of first-person videos.

DOI

Citations (Scopus): 1
Dynamic Object Scanning: Object-Based Elastic Timeline for Quickly Browsing First-Person Videos

Seita Kayukawa, Keita Higuchi, Ryo Yonetani, Masanori Nakamura, Yoichi Sato, Shigeo Morishima

Proceedings of 2018 CHI Conference on Human Factors in Computing Systems (CHI '18) April 2018 [Refereed]

This work presents the Dynamic Object Scanning (DO-Scanning), a novel interface that helps users browse long and untrimmed first-person videos quickly. The proposed interface offers users a small set of object cues generated automatically tailored to the context of a given video. Users choose which cue to highlight, and the interface in turn adaptively fast-forwards the video while keeping scenes with highlighted cues played at original speed. Our experimental results have revealed that the DO-Scanning has an efficient and compact set of cues arranged dynamically and this set of cues is useful for browsing a diverse set of first-person videos.

DOI

Citations (Scopus): 1
RSGAN: Face Swapping and Editing using Face and Hair Representation in Latent Spaces

Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

arXiv April 2018

In this paper, we present an integrated system for automatically generating and editing face images through face swapping, attribute-based editing, and random face parts synthesis. The proposed system is based on a deep neural network that variationally learns the face and hair regions with large-scale face image datasets. Different from conventional variational methods, the proposed network represents the latent spaces individually for faces and hairs. We refer to the proposed network as region-separative generative adversarial network (RSGAN). The proposed network independently handles face and hair appearances in the latent spaces, and then, face swapping is achieved by replacing the latent-space representations of the faces and reconstructing the entire face image with them. This approach in the latent space robustly performs face swapping even for images for which the previous methods result in failure due to inappropriate fitting of the 3D morphable models. In addition, the proposed system can further edit face-swapped images with the same network by manipulating visual attributes or by composing them with randomly generated face or hair parts.
DanceDJ: A 3D Dance Animation Authoring System for Live Performance

Naoya Iwamoto, Takuya Kato, Hubert P. H. Shum, Ryo Kakitsuka, Kenta Hara, Shigeo Morishima

ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY, ACE 2017 10714 653 - 670 2018 [Refereed]

Dance is an important component of live performance for expressing emotion and presenting visual context. Human dance performances typically require expert knowledge of dance choreography and professional rehearsal, which are too costly for casual entertainment venues and clubs. Recent advancements in character animation and motion synthesis have made it possible to synthesize virtual 3D dance characters in real-time. The major problem in existing systems is the lack of an intuitive interface to control the animation for real-time dance controls. We propose a new system called the DanceDJ to solve this problem. Our system consists of two parts. The first part is an underlying motion analysis system that evaluates motion features including dance features such as the postures and movement tempo, as well as audio features such as the music tempo and structure. As a pre-process, given a dancing motion database, our system evaluates the quality of possible timings to connect and switch different dancing motions. During run-time, we propose a control interface that provides visual guidance. We observe that disk jockeys (DJs) effectively control the mixing of music using the DJ controller, and therefore propose a DJ controller for controlling dancing characters. This allows DJs to transfer their skills from music control to dance control using a similar hardware setup. We map different motion control functions onto the DJ controller, and visualize the timing of natural connection points, such that the DJ can effectively govern the synthesized dance motion. We conducted two user experiments to evaluate the user experience and the quality of the dance character. Quantitative analysis shows that our system performs well in both motion control and simulation quality.

DOI

Citations (Scopus): 4
Voice animator: Automatic lip-synching in limited animation by audio

Shoichi Furukawa, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10714 LNCS 153 - 171 2018 [Refereed]

Limited animation is one of the traditional techniques for producing cartoon animations. Owing to its expressive style, it has been enjoyed around the world. However, producing high quality animations using this limited style is time-consuming and costly for animators. Furthermore, proper synchronization between the voice-actor’s voice and the character’s mouth and lip motion requires well-experienced animators. This is essential because viewers are very sensitive to audio-lip discrepancies. In this paper, we propose a method that automatically creates high-quality limited-style lip-synched animations using audio tracks. Our system can be applied for creating not only the original animations but also dubbed ones independently of languages. Because our approach follows the standard workflow employed in cartoon animation production, our system can successfully assist animators. In addition, users can implement our system as a plug-in of a standard tool for creating animations (Adobe After Effects) and can easily arrange character lip motion to suit their own style. We visually evaluate our results both absolutely and relatively by comparing them with those of previous works. From the user evaluations, we confirm that our algorithm is able to successfully generate more natural audio-mouth synchronizations in limited-style lip-synched animations than previous algorithms.

DOI

Citations (Scopus): 7
FSNet: An Identity-Aware Generative Model for Image-based Face Swapping

Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

CoRR abs/1811.12666 2018 [Refereed]
Proceedings of the 24th ACM Symposium on Virtual Reality Software and Technology, VRST 2018, Tokyo, Japan, November 28 - December 01, 2018

VRST 2018 2018 [Refereed]

DOI
Resolving Occlusion for 3D Object Manipulation with Hands in Mixed Reality

Qi Feng, Hubert P. H. Shum, Shigeo Morishima

24TH ACM SYMPOSIUM ON VIRTUAL REALITY SOFTWARE AND TECHNOLOGY (VRST 2018) 119:1-119:2 2018 [Refereed]

Due to the need to interact with virtual objects, the hand-object interaction has become an important element in mixed reality (MR) applications. In this paper, we propose a novel approach to handle the occlusion of augmented 3D object manipulation with hands by exploiting the nature of hand poses combined with tracking-based and model-based methods, to achieve a complete mixed reality experience without necessities of heavy computations, complex manual segmentation processes or wearing special gloves. The experimental results show a frame rate faster than real-time and a great accuracy of rendered virtual appearances, and a user study verifies a more immersive experience compared to past approaches. We believe that the proposed method can improve a wide range of mixed reality applications that involve hand-object interactions.

DOI

Citations (Scopus): 10
Wrinkles individuality preserving aged texture generation using multiple expression images

Pavel A. Savkin, Tsukasa Fukusato, Takuya Kato, Shigeo Morishima

VISIGRAPP 2018 - Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications 4 549 - 557 2018 [Refereed]

Aging of a human face is accompanied by visible changes such as sagging, spots, somberness, and wrinkles. Age progression techniques that estimate an aged facial image are required for long-term criminal or missing person investigations, and also in 3DCG facial animations. This paper focuses on aged facial texture and introduces a novel age progression method based on medical knowledge, which represents the individuality of aged wrinkle shapes and positions. The effectiveness of including expression wrinkles in aging facial image synthesis is confirmed through subjective evaluation.

DOI

Scopus
Face Retrieval Framework Relying on User's Visual Memory

Yugo Sato, Tsukasa Fukusato, Shigeo Morishima

ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL 274 - 282 2018 [Refereed]

This paper presents an interactive face retrieval framework for clarifying an image representation envisioned by a user. Our system is designed for a situation in which the user wishes to find a person but has only visual memory of the person. We address a critical challenge of image retrieval across the user's inputs. Instead of target-specific information, the user can select several images (or a single image) that are similar to an impression of the target person the user wishes to search for. Based on the user's selection, our proposed system automatically updates a deep convolutional neural network. By interactively repeating this process (human-in-the-loop optimization), the system can reduce the gap between human-based similarities and computer-based similarities and estimate the target image representation. We ran user studies with 10 subjects on a public database and confirmed that the proposed framework is effective for clarifying the image representation envisioned by the user easily and quickly.

DOI

Citations (Scopus): 4
Placing Music in Space: A Study on Music Appreciation with Spatial Mapping

Shoki Miyagawa, Shigeo Morishima, Yuki Koyama, Jun Kato, Masataka Goto

DIS 2018: COMPANION PUBLICATION OF THE 2018 DESIGNING INTERACTIVE SYSTEMS CONFERENCE 39 - 43 2018 [Refereed]

We investigate the potential of music appreciation using spatial mapping techniques, which allow us to "place" audio sources in various locations within a physical space. We consider possible ways of this new appreciation style and list some design variables, such as how to define coordinate systems, how to show visually, and how to place the sound sources. We conducted an exploratory user study to examine how these design variables affect users' music listening experiences. Based on our findings from the study, we discuss how we should develop systems that incorporate these design variables for music appreciation in the future.

DOI

Citations (Scopus): 3
Latent topic similarity for music retrieval and its application to a system that supports DJ performance

Tatsunori Hirai, Hironori Doi, Shigeo Morishima

Journal of Information Processing 26 276 - 284 January 2018 [Refereed]

This paper presents a topic modeling method to retrieve similar music fragments and its application, MusicMixer, which is a computer-aided DJ system that supports DJ performance by automatically mixing songs in a seamless manner. MusicMixer mixes songs based on audio similarity calculated via beat analysis and latent topic analysis of the chromatic signal in the audio. The topic represents latent semantics on how chromatic sounds are generated. Given a list of songs, a DJ selects a song with beats and sounds similar to a specific point of the currently playing song to seamlessly transition between songs. By calculating similarities between all existing song sections that can be naturally mixed, MusicMixer retrieves the best mixing point from a myriad of possibilities and enables seamless song transitions. Although it is comparatively easy to calculate beat similarity from audio signals, considering the semantics of songs from the viewpoint of a human DJ has proven difficult. Therefore, we propose a method to represent audio signals to construct topic models that acquire latent semantics of audio. The results of a subjective experiment demonstrate the effectiveness of the proposed latent semantic analysis method. MusicMixer achieves automatic song mixing using the audio signal processing approach; thus, users can perform DJ mixing simply by selecting a song from a list of songs suggested by the system.

DOI

Citations (Scopus): 1
Voice Animator: Automatic Lip-Synching in Limited Animation by Audio

Shoichi Furukawa, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY, ACE 2017 10714 153 - 171 2018 [Refereed]

Limited animation is one of the traditional techniques for producing cartoon animations. Owing to its expressive style, it has been enjoyed around the world. However, producing high quality animations using this limited style is time-consuming and costly for animators. Furthermore, proper synchronization between the voice-actor's voice and the character's mouth and lip motion requires well-experienced animators. This is essential because viewers are very sensitive to audio-lip discrepancies. In this paper, we propose a method that automatically creates high-quality limited-style lip-synched animations using audio tracks. Our system can be applied for creating not only the original animations but also dubbed ones independently of languages. Because our approach follows the standard workflow employed in cartoon animation production, our system can successfully assist animators. In addition, users can implement our system as a plug-in of a standard tool for creating animations (Adobe After Effects) and can easily arrange character lip motion to suit their own style. We visually evaluate our results both absolutely and relatively by comparing them with those of previous works. From the user evaluations, we confirm that our algorithm is able to successfully generate more natural audio-mouth synchronizations in limited-style lip-synched animations than previous algorithms.

DOI

Citations (Scopus): 7
Cosmetic Features Extraction by a Single Image Makeup Decomposition

Kanami Yamagishi, Shintaro Yamamoto, Takuya Kato, Shigeo Morishima

PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW) 2018-June 1965 - 1967 2018 [Refereed]

In recent years, a large number of makeup images have been shared on social media. Most of these images lack information about the cosmetics used, such as color or glitter, which is difficult to infer due to the diversity of skin color or lighting conditions. In this paper, our goal is to estimate cosmetic features only from a single makeup image. Previous work has measured the material parameters of cosmetic products from pairs of images showing the face with and without makeup, but such comparison images are not always available. Furthermore, this method cannot represent local effects such as pearl or glitter since they adapted physically-based reflectance models. We propose a novel image-based method to extract cosmetic features considering both color and local effects by decomposing the target image into makeup and skin color using Difference of Gaussian (DoG). Our method can be applied for single, standalone makeup images, and considers both local effects and color. In addition, our method is robust to the skin color difference due to the decomposition separating makeup from skin. The experimental results demonstrate that our method is more robust to skin color difference and captures characteristics of each cosmetic product.

DOI

Citations (Scopus): 2
Latent topic similarity for music retrieval and its application to a system that supports DJ performance

Tatsunori Hirai, Hironori Doi, Shigeo Morishima

Journal of Information Processing 26 ( 3 ) 276 - 284 January 2018 [Refereed]

This paper presents a topic modeling method to retrieve similar music fragments and its application, MusicMixer, which is a computer-aided DJ system that supports DJ performance by automatically mixing songs in a seamless manner. MusicMixer mixes songs based on audio similarity calculated via beat analysis and latent topic analysis of the chromatic signal in the audio. The topic represents latent semantics on how chromatic sounds are generated. Given a list of songs, a DJ selects a song with beats and sounds similar to a specific point of the currently playing song to seamlessly transition between songs. By calculating similarities between all existing song sections that can be naturally mixed, MusicMixer retrieves the best mixing point from a myriad of possibilities and enables seamless song transitions. Although it is comparatively easy to calculate beat similarity from audio signals, considering the semantics of songs from the viewpoint of a human DJ has proven difficult. Therefore, we propose a method to represent audio signals to construct topic models that acquire latent semantics of audio. The results of a subjective experiment demonstrate the effectiveness of the proposed latent semantic analysis method. MusicMixer achieves automatic song mixing using the audio signal processing approach; thus, users can perform DJ mixing simply by selecting a song from a list of songs suggested by the system.

DOI

Scopus

1

被引用数

(Scopus)
DanceDJ: A 3D Dance Animation Authoring System for Live Performance

Naoya Iwamoto, Takuya Kato (joint first author), Hubert P. H. Shum, Ryo Kakitsuka, Kenta Hara, Shigeo Morishima

14TH INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY (ACE 2017) 10714 653 - 670 2017年12月 [査読有り]

　概要を見る

Dance is an important component of live performance for expressing emotion and presenting visual context. Human dance performances typically require expert knowledge of dance choreography and professional rehearsal, which are too costly for casual entertainment venues and clubs. Recent advancements in character animation and motion synthesis have made it possible to synthesize virtual 3D dance characters in real time. The major problem in existing systems is the lack of an intuitive interface for controlling the animation in real time. We propose a new system called DanceDJ to solve this problem. Our system consists of two parts. The first part is an underlying motion analysis system that evaluates motion features, including dance features such as postures and movement tempo, as well as audio features such as the music tempo and structure. As a pre-process, given a dancing motion database, our system evaluates the quality of possible timings to connect and switch different dancing motions. The second part is a run-time control interface that provides visual guidance. We observe that disk jockeys (DJs) effectively control the mixing of music using the DJ controller, and therefore propose a DJ controller for controlling dancing characters. This allows DJs to transfer their skills from music control to dance control using a similar hardware setup. We map different motion control functions onto the DJ controller and visualize the timing of natural connection points, such that the DJ can effectively govern the synthesized dance motion. We conducted two user experiments to evaluate the user experience and the quality of the dance character. Quantitative analysis shows that our system performs well in both motion control and simulation quality.

DOI

Scopus

4

被引用数

(Scopus)
Quasi-developable garment transfer for animals

Fumiya Narita, Shunsuke Saito, Tsukasa Fukusato, Shigeo Morishima

SIGGRAPH Asia 2017 Technical Briefs, SA 2017 26:1-26:4 2017年11月 [査読有り]

　概要を見る

© 2017 ACM. In this paper, we present an interactive framework to model garments for animals from a template garment model based on correspondences between the source and the target bodies. We address two critical challenges of garment transfer across significantly different body shapes and postures (e.g., for quadruped and human): (1) ambiguity in the correspondences and (2) distortion due to large variation in the scale of each body part. Our efficient cross-parameterization algorithm and intuitive user interface allow us to interactively compute correspondences and transfer the overall shape of garments. We also introduce a novel algorithm for local coordinate optimization that minimizes the distortion of transferred garments, which makes the resulting model a quasi-developable surface and hence ready for fabrication. Finally, we demonstrate the robustness and effectiveness of our approach on various garments and body shapes, showing that visually pleasing garment models for animals can be generated and fabricated by our system with minimal effort.

DOI

Scopus

2

被引用数

(Scopus)
Outside-in monocular IR camera based HMD pose estimation via geometric optimization

Pavel A. Savkin, Shunsuke Saito, Jarich Vansteenberge, Tsukasa Fukusato, Lochlainn Wilson, Shigeo Morishima

Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST Part F131944 7:1-7:9 2017年11月 [査読有り]

　概要を見る

© 2017 ACM. Accurately tracking a Head Mounted Display (HMD) with a 6 degree of freedom is essential to achieve a comfortable and a nausea free experience in Virtual Reality. Existing commercial HMD systems using synchronized Infrared (IR) camera and blinking IR-LEDs can achieve highly accurate tracking. However, most of the off-the-shelf cameras do not support frame synchronization. In this paper, we propose a novel method for real time HMD pose estimation without using any camera synchronization or LED blinking. We extended over the state of the art pose estimation algorithm by introducing geometrically constrained optimization. In addition, we propose a novel system to increase robustness to the blurred IR-LEDs patterns appearing at high-velocity movements. The quantitative evaluations showed significant improvements in pose stability and accuracy over wide rotational movements as well as a decrease in runtime.

DOI

Scopus

1

被引用数

(Scopus)
Simulating the friction sounds using a friction-based adhesion theory model

Takayuki Nakatsuka, Shigeo Morishima

The 20th International Conference on Digital Audio Effects (DAFX2017) 32 - 39 2017年09月 [査読有り]

　概要を見る

Synthesizing a friction sound of deformable objects by a computer is challenging. We propose a novel physics-based approach to synthesize friction sounds based on dynamics simulation. In this work, we calculate the elastic deformation of an object surface when the object comes in contact with other objects. The principle of our method is to divide an object surface into microrectangles. The deformation of each microrectangle is set using two assumptions: the size of a microrectangle (1) changes by contacting other object and (2) obeys a normal distribution. We consider the sound pressure distribution and its space spread, consisting of vibrations of all microrectangles, to synthesize a friction sound at an observation point. We express the global motions of an object by position based dynamics where we add an adhesion constraint. Our proposed method enables the generation of friction sounds of objects in different materials by regulating the initial value of microrectangular parameters.
Beautifying Font: Effective Handwriting Template for Mastering Expression of Chinese Calligraphy

Masanori Nakamura, Shugo Yamaguchi, Shigeo Morishima

SIGGRAPH 2017 posters 2017年08月 [査読有り]

　概要を見る

We propose a novel font called Beautifying Font to assist learning techniques in writing Chinese calligraphy. Chinese calligraphy has various expressions but they are hard to acquire for beginners. Beautifying Font visualizes the speed and pressure of brush-strokes so that users can intuitively understand how to write.
Court-aware volleyball video summarization

Takahiro Itazuri, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017 74:1-74:2 2017年07月 [査読有り]

　概要を見る

We propose a rally-rank evaluation based on court transition information for volleyball video summarization that considers the contents of the game. Our method uses the court transition information instead of non-robust visual features such as the positions of the ball and players. Experimental results demonstrate that our method reflects viewers' preferences more effectively than previous methods.

DOI

Scopus

1

被引用数

(Scopus)
Beautifying font: Effective handwriting template for mastering expression of Chinese calligraphy

Masanori Nakamura, Shugo Yamaguchi, Shigeo Morishima

ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017 3:1-3:2 2017年07月 [査読有り]

　概要を見る

© 2017 Copyright held by the owner/author(s). We propose a novel font called Beautifying Font to assist learning techniques in writing Chinese calligraphy. Chinese calligraphy has various expressions but they are hard to acquire for beginners. Beautifying Font visualizes the speed and pressure of brush-strokes so that users can intuitively understand how to write.

DOI

Scopus

3

被引用数

(Scopus)
3D model partial-resizing via normal and texture map combination

Naoki Nozawa, Tsukasa Fukusato, Shigeo Morishima

ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017 1:1-1:2 2017年07月 [査読有り]

　概要を見る

© 2017 Copyright held by the owner/author(s). Resizing of 3D models is necessary for computer graphics animation and applications such as games and movies. In general, when users deform a target model, they build a bounding box or a closed polygon mesh (cage) that encloses the target model. The resizing is then done by deforming the cage together with the target model. However, these approaches are not suited to detailed adjustment of a 3D shape because they do not preserve local information. In contrast, based on local information (e.g., edge set and weight map), Sorkine et al. [Sorkine and Alexa 2007; Sorkine et al. 2004] can generate smooth and conformal deformation results with only a few control points. While these approaches are useful in some situations, the results depend on the resolution and topology of the target model. In addition, these approaches do not consider texture (UV) information.

DOI

Scopus

1

被引用数

(Scopus)
Court-aware volleyball video summarization

Takahiro Itazuri, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017 2017年07月 [査読有り]

　概要を見る

© 2017 Copyright held by the owner/author(s). We propose a rally-rank evaluation based on court transition information for volleyball video summarization that considers the contents of the game. Our method uses the court transition information instead of non-robust visual features such as the positions of the ball and players. Experimental results demonstrate that our method reflects viewers' preferences more effectively than previous methods.

DOI

Scopus

1

被引用数

(Scopus)
3D model partial-resizing via normal and texture map combination

Naoki Nozawa, Tsukasa Fukusato, Shigeo Morishima

ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017 2017年07月 [査読有り]

　概要を見る

Resizing of 3D models is necessary for computer graphics animation and applications such as games and movies. In general, when users deform a target model, they build a bounding box or a closed polygon mesh (cage) that encloses the target model. The resizing is then done by deforming the cage together with the target model. However, these approaches are not suited to detailed adjustment of a 3D shape because they do not preserve local information. In contrast, based on local information (e.g., edge set and weight map), Sorkine et al. [Sorkine and Alexa 2007; Sorkine et al. 2004] can generate smooth and conformal deformation results with only a few control points. While these approaches are useful in some situations, the results depend on the resolution and topology of the target model. In addition, these approaches do not consider texture (UV) information.

DOI

Scopus

1

被引用数

(Scopus)
Retexturing under self-occlusion using hierarchical markers

Shoki Miyagawa, Yoshihiro Fukuhara, Fumiya Narita, Norihiro Ogata, Shigeo Morishima

ACM SIGGRAPH 2017 Posters, SIGGRAPH 2017 27:1-27:2 2017年07月 [査読有り]

　概要を見る

© 2017 Copyright held by the owner/author(s). Marker-based retexturing superimposes a texture on a target object by detecting and identifying markers in the captured image. We propose a new marker that can be identified under large deformations involving self-occlusion, which the following previous markers did not take into consideration. Bradley et al. [Bradley et al. 2009] designed independent markers, but these are difficult to recognize under complicated occlusion. Scholz et al. [Scholz and Magnor 2006] created circular markers, each with a single color selected from multiple colors. They formed an ID from the arrangement of the colors of a marker and the markers around it, and placed the markers so that each ID would be unique. However, when some markers are hidden by self-occlusion, the positional relationship of the markers appears different from the original, so markers near the self-occlusion fail to be identified. Narita et al. [Narita et al. 2017] handled self-occlusion by improving the identification algorithm. They improved identification accuracy by creating triangle meshes whose vertices are the centers of gravity of the markers and assuming that the triangles are close to right isosceles triangles. However, since outliers are removed using angles, marker identification may fail for objects that are likely to deform in the shear direction, such as cloth. Therefore, we handle self-occlusion by designing hierarchical markers so that they can be referred to in a global scope. We designed a color-based marker for easy recognition even at low resolution.

DOI

Scopus

1

被引用数

(Scopus)
Facial video age progression considering expression change

Shintaro Yamamoto, Pavel A. Savkin, Takuya Kato, Shoichi Furukawa, Shigeo Morishima

ACM International Conference Proceeding Series 128640 5:1-5:5 2017年06月 [査読有り]

　概要を見る

This paper proposes an age progression method for facial videos. Age is one of the main factors that changes the appearance of the face, due to the associated sagging, spots, and wrinkles. These aging features change in appearance depending on facial expressions. As an example, we see wrinkles appear in the face of the young when smiling, but the shape of wrinkles changes in older faces. Previous work has not considered the temporal changes of the face, using only static images as databases. To solve this problem, we extend the texture synthesis approach to use facial videos as source videos. First, we spatio-temporally align the videos of database to match the sequence of a target video. Then, we synthesize an aging face and apply the temporal changes of the target age to the wrinkles appearing in the facial expression image in the target video. As a result, our method successfully generates expression changes specific to the target age.

DOI

Scopus
Facial video age progression considering expression change

Shintaro Yamamoto, Pavel A. Savkin, Takuya Kato, Shoichi Furukawa, Shigeo Morishima

ACM International Conference Proceeding Series Part F128640 2017年06月 [査読有り]

　概要を見る

© 2017 ACM. This paper proposes an age progression method for facial videos. Age is one of the main factors that changes the appearance of the face, due to the associated sagging, spots, and wrinkles. These aging features change in appearance depending on facial expressions. As an example, we see wrinkles appear in the face of the young when smiling, but the shape of wrinkles changes in older faces. Previous work has not considered the temporal changes of the face, using only static images as databases. To solve this problem, we extend the texture synthesis approach to use facial videos as source videos. First, we spatio-temporally align the videos of database to match the sequence of a target video. Then, we synthesize an aging face and apply the temporal changes of the target age to the wrinkles appearing in the facial expression image in the target video. As a result, our method successfully generates expression changes specific to the target age.

DOI

Scopus
コート情報に基づくバレーボール映像の鑑賞支援

板摺貴大, 福里司, 山口周悟, 森島繁生

Visual Computing 2017 (VC 2017) 2017年06月 [査読有り]

　概要を見る

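Because most sports videos are long, it is difficult to watch every match one wants to see in the time available for viewing in daily life. For this reason, many sports video summarization methods have been proposed. Since the content viewers want to see and the time they can spend differ from viewer to viewer, methods with both temporal flexibility and content flexibility are required. Temporal flexibility means that a summary video of any length specified by the viewer can be generated, and content flexibility means that the summary video can be edited to match the content the viewer wants to see. Achieving content flexibility first requires taking the game content into account, but no video summarization method has so far considered game content while retaining temporal flexibility. We propose a viewing support method for volleyball videos that achieves both temporal and content flexibility. First, assuming that rally scenes are the most important for understanding the game content, we automatically detect rally scenes. Next, we evaluate the importance of each rally and output, as the summary video, a video containing only rally scenes whose importance is at or above a threshold. Temporal flexibility is achieved by automatically determining this threshold so that the summary video fits within the arbitrary duration specified by the viewer. In addition, to realize summarization based on game content, which is indispensable for content flexibility, we evaluate importance as a weighted linear sum of features that include game content, in addition to the affective features used in conventional methods. Letting the viewer adjust these weights is expected to make content flexibility achievable. Tracking information for players and the ball is one example of features that include game content, but such information is difficult to obtain from low-quality, single-viewpoint video such as broadcast footage or video on video-sharing sites. Focusing on the fact that the camera is operated so as to follow the movement of the ball, we use the movement of the court center in the video caused by camera operation (hereafter, court motion information) as a feature that includes game content. Building on this method, we also perform video retrieval by computing the dissimilarity of court motion information. Because the proposed method uses court motion information, it can also be applied to videos of other sports with camera operation, such as basketball.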
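
The abstract above describes scoring each rally as a weighted linear sum of features and automatically choosing the importance threshold so that the summary fits a viewer-specified duration. A minimal sketch of that selection step follows. It is an illustration under assumptions, not the authors' code: the tuple layout `(start, duration, features)`, the function names, and the strategy of lowering the threshold one score at a time are all hypothetical.

```python
def importance(features, weights):
    # Weighted linear sum of per-rally features.
    return sum(w * f for w, f in zip(weights, features))

def summarize(rallies, weights, time_budget):
    # rallies: list of (start_seconds, duration_seconds, feature_vector).
    # Lower the threshold step by step and stop before the selected
    # rallies would exceed the time budget.
    scored = [(importance(f, weights), start, dur) for start, dur, f in rallies]
    best, threshold = [], float("inf")
    for t in sorted({s for s, _, _ in scored}, reverse=True):
        selected = [(start, dur) for s, start, dur in scored if s >= t]
        if sum(d for _, d in selected) > time_budget:
            break
        best, threshold = selected, t
    # Return the chosen threshold and the rallies in playback order.
    return threshold, sorted(best)
```

Changing `weights` changes which rallies rank highest, which is one way to picture the viewer-adjustable weighting mentioned in the abstract, while `time_budget` plays the role of the viewer-specified duration.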
表情変化データベースを用いた経年変化顔動画合成

山本晋太郎, サフキンパーベル, 加藤卓哉, 佐藤優伍, 古川翔一, 森島繁生

Visual Computing / グラフィクスとCAD 合同シンポジウム 2017 2017年06月 [査読有り]

　概要を見る

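In this study, we synthesize videos of aged faces that take into account how expression changes differ with aging. Aging a person in a video is important for expressing age changes in video works such as films. Focusing on wrinkles, one of the facial features that emerge with aging, the way wrinkles change with facial expression depends strongly on age. Many age progression methods for still images exist, but none focuses on the dynamic changes of wrinkles, so they cannot express age-dependent differences in expression changes. We therefore represent the dynamic changes of wrinkles that accompany expression changes by using, as a database, videos of expression changes of people in the target age group. Specifically, for each expression in the input video, we construct the aged face using similar expressions from the database. We also manipulate the depth of the target person's wrinkles so that it matches the changes of the target age group. In this way, we realized video synthesis of aged faces that accounts for the dynamic changes of wrinkles caused by expression changes.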
可展面制約を考慮したテンプレートベース衣服モデリング

成田史弥, 齋藤隼介, 福里司, 森島繁生

Visual Computing / グラフィクスとCAD 合同シンポジウム 2017 2017年06月 [査読有り]

　概要を見る

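In this paper, aiming to reduce the labor of garment modeling, we propose a method that generates, from a single template garment model, a garment model and sewing pattern fitted to the body shape of an arbitrary character. By introducing an interface that lets the user interactively correct the correspondence between the source body and the target body while checking the rough shape of the generated garment, we realize garment transfer that does not depend on the number of vertices or the vertex connectivity of the body models. In addition, by performing a developable-surface approximation after an optimization that reflects the shape of the source garment, we prevent the shape from deviating greatly from the source garment design and generate plausible garment models. Because the proposed method can output sewing patterns for the generated garment models, it is expected to be applied to supporting real-world fabrication, for example making matching outfits for a person and their pet.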
Authoring System for Choreography Using Dance Motion Retrieval and Synthesis

Ryo Kakitsuka, Kosetsu Tsukuda (AIST), Satoru Fukayama (AIST), Naoya Iwamoto, Masataka Goto (AIST), Shigeo Morishima

The 30th International Conference on Computer Animation and Social Agents(CASA 2017) 2017年05月 [査読有り]

　概要を見る

Dance animations of a three-dimensional (3D) character rendered with computer graphics (CG) have been created by using motion capture systems or 3DCG animation editing tools. Since these methods require skills and a large amount of work from the creator, automated choreography has been developed to make synthesizing dance motions much easier. However, due to the limited variations of the results, users could not reflect their preferences in the output. Therefore, supporting the user’s preference in dance animation is still important. We propose a novel dance creation system which supports the user in creating choreography with their preferences. A user first selects a preferred motion from the database, and then assigns it to an arbitrary section of the music. With a few mouse clicks, our system helps a user search for alternative dance motions that reflect his or her preference by using relevance feedback. The system automatically synthesizes a continuous choreography with the constraint condition that the motions specified by the user are fixed. We conducted user studies and found that users could create new dance motions easily with their preferences by using the system.
Fiber-dependent Approach for Fast Dynamic Character Animation

金田綾乃, 森島繁生

The 30th International Conference on Computer Animation and Social Agents(CASA 2017) 2017年05月 [査読有り]

　概要を見る

Creating secondary motion for character animation, including the jiggling of fat, is in demand in computer animation. In general, secondary motion arising from the primary motion of the character is expressed using shape matching approaches. However, previous methods do not account for directional stretch characteristics and local stiffness at the same time, which is problematic for representing the effects of anatomical structures such as muscle fibers. Our framework allows the user to edit the anatomical structure of the character model, corresponding to a creature’s body containing muscle and fat, from the tetrahedral model and bone motion. Our method then simulates the elastic deformation considering the anatomical structures, with directional stretch characteristics and stiffness defined on each layer. In addition, our method can add constraints for local deformation (e.g., biceps) considering the defined model’s characteristics.
2. IIEEJ activities of conferences and events: 2-1 IIEEJ annual conferences

Shigeo Morishima

Journal of the Institute of Image Electronics Engineers of Japan 46 ( 1 ) 6 - 8 2017年

DOI

Scopus
Screen Space Hair Self Shadowing by Translucent Hybrid Ambient Occlusion

Zhuopeng Zhang, Shigeo Morishima

SMART GRAPHICS, SG 2015 9317 29 - 40 2017年 [査読有り]

　概要を見る

Screen space ambient occlusion is a very efficient means of capturing the shadows caused by adjacent objects. However, it is incapable of expressing the transparency of objects. We introduce an approach that behaves like a combination of ambient occlusion and translucency. This method is an extension of the traditional screen space ambient occlusion algorithm with an extra density field input. It can be applied to rendering mesh objects, and it is particularly suitable for rendering complex hair models. We use the new algorithm to approximate light attenuation through semi-transparent hair in real time. Our method is implemented on a common GPU and requires no pre-computation. When used with environment lighting, the hair shading is visually similar to, but one order of magnitude faster than, existing algorithms.

DOI

Scopus
Screen space hair self shadowing by translucent hybrid ambient occlusion

Zhuopeng Zhang, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9317 LNCS 29 - 40 2017年 [査読有り]

　概要を見る

© Springer International Publishing AG 2017. Screen space ambient occlusion is a very efficient means of capturing the shadows caused by adjacent objects. However, it is incapable of expressing the transparency of objects. We introduce an approach that behaves like a combination of ambient occlusion and translucency. This method is an extension of the traditional screen space ambient occlusion algorithm with an extra density field input. It can be applied to rendering mesh objects, and it is particularly suitable for rendering complex hair models. We use the new algorithm to approximate light attenuation through semi-transparent hair in real time. Our method is implemented on a common GPU and requires no pre-computation. When used with environment lighting, the hair shading is visually similar to, but one order of magnitude faster than, existing algorithms.

DOI

Scopus
Dynamic Subtitle Placement Considering the Region of Interest and Speaker Location

Wataru Akahori, Tatsunori Hirai, Shigeo Morishima

PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISIGRAPP 2017), VOL 6 6 102 - 109 2017年 [査読有り]

　概要を見る

This paper presents a subtitle placement method that reduces unnecessary eye movements. Although methods that vary the position of subtitles have been discussed in a previous study, subtitles may overlap the region of interest (ROI). Therefore, we propose a dynamic subtitling method that utilizes eye-tracking data to prevent the subtitles from overlapping important regions. The proposed method calculates the ROI based on the eye-tracking data of multiple viewers. By positioning subtitles immediately under the ROI, the subtitles do not overlap the ROI. Furthermore, we detect speakers in a scene based on audio and visual information and position subtitles near the speaker to help viewers recognize who is speaking. Experimental results show that the proposed method enables viewers to watch the ROI and the subtitles for longer than traditional subtitles do, and is effective in enhancing the comfort and utility of the viewing experience.

DOI

Scopus

12

被引用数

(Scopus)
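
The subtitle placement abstract above computes an ROI from the gaze data of multiple viewers and positions the subtitle immediately under it. The sketch below illustrates that geometric step only. It is not the authors' method: taking the ROI as the bounding box of pooled gaze points, the pixel coordinate system with the origin at the top left, the `margin` parameter, and the function names are all assumptions for illustration.

```python
def roi_from_gaze(points):
    # Bounding box (left, top, right, bottom) of gaze points pooled
    # over all viewers. A real system might filter outliers first.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def place_subtitle(roi, sub_w, sub_h, frame_w, frame_h, margin=10):
    # Center the subtitle box horizontally on the ROI and put its top
    # edge just below the ROI, then clamp it inside the frame.
    left, top, right, bottom = roi
    x = (left + right) / 2 - sub_w / 2
    y = bottom + margin
    x = min(max(x, 0), frame_w - sub_w)
    y = min(max(y, 0), frame_h - sub_h)
    return (x, y)
```

When the ROI sits near the bottom of the frame, the clamping in this sketch can push the subtitle back over the ROI, so a fuller version would need a fallback position for that case.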
Court-based Volleyball Video Summarization Focusing on Rally Scene

Takahiro Itazuri, Tsukasa Fukusato, Shugo Yamaguchi, Shigeo Morishima

2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW) 2017-July 179 - 186 2017年 [査読有り]

　概要を見る

In this paper, we propose a video summarization system for volleyball videos. Our system automatically detects rally scenes as self-consumable video segments and evaluates rally-rank for each rally scene to decide priority. In the priority decision, features representing the contents of the game are necessary; however such features have not been considered in most previous methods. Although several visual features such as the position of a ball and players should be used, acquisition of such features is still non-robust and unreliable in low resolution or low frame rate volleyball videos. Instead, we utilize the court transition information caused by camera operation. Experimental results demonstrate the robustness of our rally scene detection and the effectiveness of our rally-rank to reflect viewers' preferences over previous methods.

DOI

Scopus

5

被引用数

(Scopus)
Outside-in Monocular IR Camera based HMD Pose Estimation via Geometric Optimization

Pavel A. Savkin, Shunsuke Saito, Jarich Vansteenberge, Tsukasa Fukusato, Lochlainn Wilson, Shigeo Morishima

VRST'17: PROCEEDINGS OF THE 23RD ACM SYMPOSIUM ON VIRTUAL REALITY SOFTWARE AND TECHNOLOGY 131944 2017年 [査読有り]

　概要を見る

Accurately tracking a Head Mounted Display (HMD) with a 6 degree of freedom is essential to achieve a comfortable and a nausea free experience in Virtual Reality. Existing commercial HMD systems using synchronized Infrared (IR) camera and blinking IR-LEDs can achieve highly accurate tracking. However, most of the off-the-shelf cameras do not support frame synchronization. In this paper, we propose a novel method for real time HMD pose estimation without using any camera synchronization or LED blinking. We extended over the state of the art pose estimation algorithm by introducing geometrically constrained optimization. In addition, we propose a novel system to increase robustness to the blurred IR-LEDs patterns appearing at high-velocity movements. The quantitative evaluations showed significant improvements in pose stability and accuracy over wide rotational movements as well as a decrease in runtime.

DOI

Scopus

1

被引用数

(Scopus)
頭蓋骨形状を考慮した肥痩変化顔画像合成

福里司, 藤崎匡裕, 加藤卓哉, 森島繁生

画像電子学会誌 46 ( 1 ) 197 - 205 2017年 [査読有り]

DOI

Scopus
光学的最短経路長を用いた表面下散乱の高速計算による半透明物体のリアルタイム・レンダリング

小澤禎裕, 谷田川達也, 久保尋之, 森島繁生

画像電子学会誌 46 ( 4 ) 533 - 546 2017年 [査読有り]

　概要を見る

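Rendering that accounts for subsurface scattering is indispensable for creating photorealistic CG. However, handling subsurface scattering in a physically correct way requires considering the many paths that light takes through a translucent object, and this large number of paths makes fast rendering of translucent objects difficult. For materials that are not very transparent and in which multiply scattered light is dominant, the contribution of subsurface-scattered light is known to decrease rapidly as the light path becomes longer. From this observation, light that passes through the interior of a translucent object along the shortest optical distance is considered to be visually more important to the rendering result. We therefore consider both the intensity of the incident light and the shortest optical distance, and identify the single subsurface-scattered light path whose contribution to brightness is largest. In this paper we call this the maximum-contribution path, and we estimate the radiance on the object surface using only this one path. The method is based on the heuristic that subsurface-scattered light traveling along the maximum-contribution path has great visual importance. By computing subsurface scattering from a single maximum-contribution path, physical correctness is not guaranteed, but plausible rendering results can be obtained in real time. In particular, although the cost of computing light paths that account for the object interior is very high, limiting the number of paths considered allows our method to render, in real time, objects that contain translucent objects of different materials inside them.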

DOI

Scopus

1

被引用数

(Scopus)
Face Texture Synthesis from Multiple Images via Sparse and Dense Correspondence

Shugo Yamaguchi, Shigeo Morishima

ACM SIGGRAPH ASIA 2016 Technical Brief 2016年12月 [査読有り]

　概要を見る

Because we want to edit images for various purposes such as art, entertainment, and film production, texture synthesis methods have been proposed. In particular, the PatchMatch algorithm [Barnes et al. 2009] enabled us to easily use many image editing tools. However, these tools are applied to one image. If we can automatically synthesize from various examples, we can create new and higher quality images. Visio-lization [Mohammed et al. 2009] generated an average face by synthesis from a face image database. However, the synthesis was applied block-wise, so there were artifacts in the result, and free-form features of the source images such as wrinkles could not be preserved. We propose a new synthesis method for multiple images. We apply sparse and dense nearest neighbor search so that we can preserve the features of both the input and the source database images. Our method allows us to create a novel image from a number of examples.
トレーシングとデータベースを併用する2Dアニメーション作成支援システム

福里司, 森島繁生

第24回インタラクティブシステムとソフトウェアに関するワークショップ(WISS2016) ( 78 ) 2016年12月 [査読有り]

J-GLOBAL
Face texture synthesis from multiple images via sparse and dense correspondence

Shugo Yamaguchi, Shigeo Morishima

SA 2016 - SIGGRAPH ASIA 2016 Technical Briefs 14 2016年11月 [査読有り]

　概要を見る

© 2016 ACM. We have a desire to edit images for various purposes such as art, entertainment, and film production so texture synthesis methods have been proposed. Especially, PatchMatch algorithm [Barnes et al. 2009] enabled us to easily use many image editing tools. However, these tools are applied to one image. If we can automatically synthesize from various examples, we can create new and higher quality images. Visio-lization [Mohammed et al. 2009] generated average face by synthesis of face image database. However, the synthesis was applied block-wise so there were artifacts on the result and free form features of source images such as wrinkles could not be preserved. We proposed a new synthesis method for multiple images. We applied sparse and dense nearest neighbor search so that we can preserve both input and source database image features. Our method allows us to create a novel image from a number of examples.

DOI

Scopus

2

被引用数

(Scopus)
3D facial geometry reconstruction using patch database

Tsukasa Nozawa, Takuya Kato, Pavel A. Savkin, Naoki Nozawa, Shigeo Morishima

SIGGRAPH 2016 - ACM SIGGRAPH 2016 Posters 24:1-24:2 2016年07月 [査読有り]

　概要を見る

3D facial shape reconstruction in the wild is an important research task in the fields of CG and CV because it can be applied to many products, such as 3DCG video games and face recognition. One of the most popular 3D facial shape reconstruction techniques is the 3D model-based approach, which approximates a facial shape using a 3D face model calculated by principal component analysis. [Blanz and Vetter 1999] performed 3D facial reconstruction by fitting the facial feature points of a single input facial image to the vertices of a template 3D facial model named the 3D Morphable Model. This method can reconstruct a facial shape from a variety of images with different lighting and face orientations, as long as facial feature points can be detected. However, the representation quality of the result depends on the resolution of the 3D model.

DOI

Scopus
Automatic dance generation system considering sign language information

Wakana Asahina, Naoya Iwamoto, Hubert P.H. Shum, Shigeo Morishima

SIGGRAPH 2016 - ACM SIGGRAPH 2016 Posters 23:1-23:2 2016年07月 [査読有り]

　概要を見る

In recent years, thanks to the development of 3DCG animation editing tools (e.g. MikuMikuDance), many 3D character dance animation movies have been created by amateur users. However, it is very difficult to create choreography from scratch without any technical knowledge. Shiratori et al. [2006] produced an automatic dance generation system that considers the rhythm and intensity of dance motions. However, each segment is selected randomly from the database, so the generated dance motion has no linguistic or emotional meaning. Takano et al. [2010] produced a human motion generation system that considers motion labels. However, they use simple motion labels like "running" or "jump", so they cannot generate motions that express emotions. In reality, professional dancers create choreography based on the musical features or lyrics and express emotion or how they feel about the music. In our work, we aim to generate more emotional dance motion easily. Therefore, we use the linguistic information in lyrics to generate dance motion.

DOI

Scopus

4

被引用数

(Scopus)
Video reshuffling: Automatic video dubbing without prior knowledge

Shoichi Furukawa, Takuya Kato, Pavel Savkin, Shigeo Morishima

SIGGRAPH 2016 - ACM SIGGRAPH 2016 Posters 19:1-19:2 2016年07月 [査読有り]

　概要を見る

Numerous videos have been translated using "dubbing," spurred by the recent growth of the video market. However, it is very difficult to achieve audio-visual synchronization; that is, in general the new audio does not synchronize with the actor's mouth motion. This discrepancy can disturb comprehension of the video content. Therefore, many methods have been studied so far to solve this problem.

DOI

Scopus

5

被引用数

(Scopus)
Perception of drowsiness based on correlation with facial image features

Yugo Sato, Takuya Kato, Naoki Nozawa, Shigeo Morishima

Proceedings of the ACM Symposium on Applied Perception, SAP 2016 139 July 2016 [Peer-reviewed]

This paper presents a video-based method for detecting drowsiness. Generally, human beings can perceive fatigue and drowsiness by looking at faces, and this ability has been studied in many ways. A drowsiness detection method based on facial videos has been proposed [Nakamura et al. 2014]. In that method, a set of facial features computed with computer vision techniques and the k-nearest neighbor algorithm are used to classify the degree of drowsiness. However, facial features that do not help the machine learning method reproduce human perception are not removed, which can decrease detection accuracy.

DOI

Scopus

Citations (Scopus): 1
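Age-progressed facial image synthesis considering the individuality of wrinkles in aging (老化時の皺の個人性を考慮した経年変化顔画像合成)

Pavel Savkin, Takuya Kato, Tsukasa Fukusato, Shigeo Morishima

IPSJ Journal (情報処理学会論文誌) 57 ( 7 ) 1627 - 1637 July 2016 [Peer-reviewed]

As people age, blemishes, dullness, wrinkles, and sagging appear on the face, greatly changing its impression. For this reason, facial age-progression techniques are needed for long-term criminal investigations and searches for missing persons. One existing study proposed a method that synthesizes photorealistic age-progressed facial images by reconstructing the input facial image patch by patch using an age-specific facial image database. However, conventional age-progression methods, including this one, cannot account for the position and shape of person-specific wrinkles, which are an important element of a person's individuality in aging. In this paper, we propose a new age-progressed facial image synthesis method that solves this problem. Specifically, based on the medical finding that wrinkles formed by facial expressions in youth cause the wrinkles that appear with aging, we estimate the position and shape of aging wrinkles by transferring the wrinkles that appear in an expressive facial image onto a neutral facial image. We then synthesize the age-progressed facial image by reconstructing the result, with its estimated wrinkle positions and shapes, patch by patch using the age-specific facial image database. The proposed method reflects the individuality of wrinkle position and shape, and a subjective evaluation experiment demonstrated its usefulness.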

CiNii
Region-of-interest-based subtitle placement using eye-tracking data of multiple viewers

Wataru Akahori, Shigeo Morishima, Tatsunori Hirai, Shunya Kawamura

TVX 2016 - Proceedings of the ACM International Conference on Interactive Experiences for TV and Online Video 123 - 128 June 2016 [Peer-reviewed]

We present a subtitle-placement method that reduces viewers' eye movement without interfering with the target region of interest (ROI) in a video scene. Subtitles help viewers understand foreign-language videos. However, subtitles tend to attract viewers' line of sight, which causes viewers to lose focus on the video content. To address this problem, previous studies have attempted to improve the viewing experience by dynamically shifting subtitle positions. Nevertheless, in their user studies, some participants felt that the visual appearance of such subtitles was unnatural and caused fatigue. We propose a method that places subtitles below the ROI, which is calculated from the eye-tracking data of multiple viewers. Two experiments were conducted to evaluate viewer impressions and compare lines of sight for videos with subtitles placed by the proposed and previous methods.

DOI

Scopus

Citations (Scopus): 14
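LyricsRadar: A lyrics retrieval interface based on the latent meaning of lyrics (LyricsRadar：歌詞の潜在的意味に基づく歌詞検索インタフェース)

Shoto Sasaki, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

IPSJ Journal (情報処理学会論文誌) 57 ( 5 ) 1365 - 1374 May 2016 [Peer-reviewed]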
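A proposal of a garment transfer system between four-limbed characters (四肢キャラクタ間の衣装転写システムの提案)

Fumiya Narita, Shunsuke Saito, Tsukasa Fukusato, Shigeo Morishima

IPSJ Journal (情報処理学会論文誌) 57 ( 3 ) 863 - 872 March 2016 [Peer-reviewed]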
Acquiring Curvature-Dependent Reflectance Function from Translucent Material

Midori Okamoto, Hiroyuki Kubo, Yasuhiro Mukaigawa, Tadahiro Ozawa, Keisuke Mochida, Shigeo Morishima

Proceedings NICOGRAPH International 2016 182 - 185 2016 [Peer-reviewed]

Acquiring scattering parameters from real objects is still a challenging task. Physics-based analysis is ineffective for obtaining the scattering parameters because accurately simulating the subsurface scattering effect has a huge computational cost. We therefore focus on the Curvature-Dependent Reflectance Function (CDRF), a plausible approximation of the subsurface scattering effect. In this paper, we propose a novel technique for obtaining scattering parameters from real objects by revealing the relation between curvature and translucency.

DOI

Scopus

Citations (Scopus): 1
Friction sound synthesis of deformable objects based on adhesion theory.

T. Nakatsuka, Shigeo Morishima

Poster Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, Zurich, Switzerland, July 11-13, 2016 6:1 - 1 2016 [Peer-reviewed]
A choreographic authoring system for character dance animation reflecting a user's preference.

Ryo Kakitsuka, Kosetsu Tsukuda, Satoru Fukayama, Naoya Iwamoto, Masataka Goto, Shigeo Morishima

Poster Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, Zurich, Switzerland, July 11-13, 2016 5:1 2016 [Peer-reviewed]
Creating a realistic face image from a cartoon character.

Masanori Nakamura, Shugo Yamaguchi, Tsukasa Fukusato, Shigeo Morishima

Poster Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, Zurich, Switzerland, July 11-13, 2016 2:1 - 1 2016 [Peer-reviewed]
Frame-wise continuity-based video summarization and stretching

Tatsunori Hirai, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9516 806 - 817 2016 [Peer-reviewed]

This paper describes a method for freely changing the length of a video clip while leaving its content almost unchanged, by removing video frames with both audio and video transitions taken into account. A video clip that contains many frames includes less important frames that only extend its length. Taking the continuity of audio and video frames into account, the method changes the length of a clip by removing or inserting frames that do not significantly affect the content. It can therefore change the length of a clip without changing the playback speed. Subjective experimental results demonstrate the effectiveness of our method in preserving the clip content.

DOI

Scopus

Citations (Scopus): 1
MusicMixer: Automatic DJ system considering beat and latent topic similarity

Tatsunori Hirai, Hironori Doi, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9516 698 - 709 2016 [Peer-reviewed]

This paper presents MusicMixer, an automatic DJ system that mixes songs seamlessly. MusicMixer mixes songs based on audio similarity calculated via beat analysis and latent topic analysis of the chromatic signal in the audio. The topic represents latent semantics about how chromatic sounds are generated. Given a list of songs, a DJ selects a song whose beat and sounds are similar to those at a specific point in the currently playing song in order to transition seamlessly between songs. By calculating the similarity of all existing pairs of songs, the proposed system can retrieve the best mixing point from innumerable possibilities. Although it is comparatively easy to calculate beat similarity from audio signals, it has been difficult to consider the semantics of songs as a human DJ does. To consider such semantics, we propose a method of representing audio signals that allows topic models to be constructed and the latent semantics of the audio to be acquired. The results of a subjective experiment demonstrate the effectiveness of the proposed latent semantic analysis method.

DOI

Scopus

Citations (Scopus): 6
Computational cartoonist: A comic-style video summarization system for anime films

Tsukasa Fukusato, Tatsunori Hirai, Shunya Kawamura, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9516 42 - 50 2016 [Peer-reviewed]

This paper presents Computational Cartoonist, a comic-style anime summarization system that detects key frames and generates a comic layout automatically. In contrast to previous studies, we define evaluation criteria based on the correspondence between anime films and the original comics to determine whether the result of comic-style summarization is relevant. To detect key frames in anime films, the proposed system segments the input video into a series of basic temporal units and computes frame importance using image characteristics such as motion. Subsequently, comic-style layouts are decided on the basis of predefined templates stored in a database. Several results demonstrate the efficiency of our key frame detection over previous methods by evaluating the matching accuracy between key frames and original comic panels.

DOI

Scopus

Citations (Scopus): 8
A SOUNDTRACK GENERATION SYSTEM TO SYNCHRONIZE THE CLIMAX OF A VIDEO CLIP WITH MUSIC

Haruki Sato, Tatsunori Hirai, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

2016 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME) 2016-August 1 - 6 2016 [Peer-reviewed]

In this paper, we present a soundtrack generation system that can automatically add a soundtrack whose length and climax points are aligned with those of a video clip. Adding a soundtrack to a video clip is an important process in video editing. Editors tend to add chorus sections to the climax points of the video clip by replacing and concatenating musical segments. However, this process is time-consuming. Our system automatically detects the climaxes of both the video clip and the music based on feature extraction and analysis. This enables the system to add a soundtrack whose climax is synchronized with the climax of the video clip. We evaluated the generated soundtracks through a subjective evaluation.

DOI

Scopus

Citations (Scopus): 3
RSViewer: An Efficient Video Viewer for Racquet Sports Focusing on Rally Scenes.

Shunya Kawamura, Tsukasa Fukusato, Tatsunori Hirai, Shigeo Morishima

Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 2: IVAPP, Rome, Italy, February 27-29, 2016. 249 - 256 2016 [Peer-reviewed]

DOI
Real-time rendering of heterogeneous translucent materials with dynamic programming

Tadahiro Ozawa, Midori Okamoto, Hiroyuki Kubo, Shigeo Morishima

European Association for Computer Graphics - 37th Annual Conference, EUROGRAPHICS 2016 - Posters 3 - 4 2016 [Peer-reviewed]

Subsurface scattering is important for realistically rendering translucent materials such as skin and marble. However, rendering translucent materials in real time is challenging, since calculating subsurface light transport has a large computational cost. In this paper, we present a novel algorithm for rendering heterogeneous translucent materials using Dijkstra's algorithm. Our two main ideas are as follows. The first is fast construction of the graph by solid voxelization. The second is a voxel-shading-like initialization of the graph. With these ideas, we obtain the maximum contribution of emitted light over the whole surface in a single calculation. We achieve real-time rendering of animated heterogeneous translucent objects with simple computation, and our approach does not require any precomputation.

DOI

Scopus
Real-time rendering of heterogeneous translucent objects using voxel number map

Keisuke Mochida, Midori Okamoto, Hiroyuki Kubo, Shigeo Morishima

European Association for Computer Graphics - 37th Annual Conference, EUROGRAPHICS 2016 - Posters 1 - 2 2016 [Peer-reviewed]

Rendering translucent objects enhances the realism of computer graphics; however, it is still computationally expensive. In this paper, we introduce a real-time rendering technique for heterogeneous translucent objects that contain complex internal structure. First, as precomputation, we convert mesh models into voxels and generate a Look-Up-Table that stores the optical thickness between two surface voxels. Second, we compute radiance in real time using the precomputed optical thickness. At this stage, we generate a Voxel-Number-Map to fetch the texel values of the Look-Up-Table on the GPU. Using the Look-Up-Table and the Voxel-Number-Map, our method can render translucent objects with cavities and different media inside in real time.

DOI

Scopus
Wrinkles individuality representing aging simulation

Pavel A. Savkin, Daiki Kuwahara, Masahide Kawai, Takuya Kato, Shigeo Morishima

SIGGRAPH Asia 2015 Posters, SA 2015 37:1 November 2015 [Peer-reviewed]

The appearance of a human face changes with aging: sagging, spots, luster, and wrinkles can be observed. Therefore, facial aging simulation techniques are required for long-term criminal investigation. While the appearance of an aged face varies greatly from person to person, wrinkles are one of the most important features representing individuality. The individuality of wrinkles is defined by their shape and position. [Maejima et al. 2014] proposed an aging simulation method that preserves the individuality of facial parts using patch-based facial image reconstruction. Since few wrinkles can be observed on an input young face, it is difficult to represent the appearance of wrinkles by reconstruction alone. Therefore, a statistical wrinkle aging pattern model is introduced to produce natural-looking wrinkles by selecting appropriate patches from an age-specific patch database. However, the variation of the statistical wrinkle pattern model is too limited to represent the individuality of wrinkles. Additionally, an appropriate patch size and feature value had to be chosen for each facial region to obtain a plausible aged facial image. In this paper, we introduce a novel aging simulation method using patch-based image reconstruction that overcomes the problems mentioned above. Based on medical knowledge [Piérard et al. 2003], wrinkles in an expressive facial image of the same person (defined as expressive wrinkles) are synthesized onto the input image, instead of using the statistical wrinkle pattern model, to represent the individuality of wrinkles. Furthermore, different patch sizes and feature values are applied to each facial region to represent both the individuality of wrinkles and the age-specific features well. The entire process is performed automatically, and a plausible aged facial image is generated.

DOI

Scopus

Citations (Scopus): 2
Automatic facial animation generation system of dancing characters considering emotion in dance and music

Wakana Asahina, Narumi Okada, Naoya Iwamoto, Taro Masuda, Tsukasa Fukusato, Shigeo Morishima

SIGGRAPH Asia 2015 Posters, SA 2015 11:1 November 2015 [Peer-reviewed]

In recent years, amateur users have created many 3D character dance animations using 3DCG animation editing tools (e.g., MikuMikuDance). However, most of them are created manually. An automatic facial animation system for dancing characters would therefore be useful for creating dance movies and visualizing impressions effectively. We thus address the challenging problem of estimating a dancing character's emotions (which we call "dance emotion"). In previous work considering music features, DiPaola et al. [2006] proposed a music-driven, emotionally expressive face system. To detect the mood of the input music, they used a hierarchical framework (the Thayer model) and succeeded in generating facial animation that matches the emotion of the music. However, their model cannot express subtle emotions that lie between two emotions, because the input music is divided sharply into a few moods using a Gaussian mixture model. In addition, they determine more detailed moods based on psychological rules that use score information, so they require MIDI data. In this paper, we propose a "dance emotion model" to visualize a dancing character's emotion as facial expression. Our model is built from frame-by-frame coordinates in an emotional space, obtained through a perceptual experiment using a music and dance motion database, without MIDI data. Moreover, by considering displacement in the emotional space, we can express not only a particular emotion but also subtleties of emotion. As a result, our system achieved higher accuracy than the previous work. The facial expression result can be generated quickly from input audio data and synchronized motion. The usefulness of our system is shown through the comparison with previous work in Figure 1.

DOI

Scopus

Citations (Scopus): 3
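Region-based painting style transfer (領域ベースの画風転写)

Shugo Yamaguchi, Takuya Kato, Tsukasa Fukusato, Chie Furusawa, Shigeo Morishima

The Journal of the Institute of Image Electronics Engineers of Japan (画像電子学会誌, CD-ROM) 45 ( 1 ) 8:1-8:4 November 2015 [Peer-reviewed]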

DOI J-GLOBAL

Scopus

Citations (Scopus): 1
Multi-layer Lattice Model for Real-Time Dynamic Character Deformation

Naoya Iwamoto, Hubert P.H. Shum, Longzhi Yang, Shigeo Morishima

Computer Graphics Forum 34 ( 7 ) 99 - 109 October 2015 [Peer-reviewed]

Due to recent advances in computer graphics hardware and software algorithms, deformable characters have become more and more popular in real-time applications such as computer games. While there are mature techniques for generating primary deformation from skeletal movement, simulating realistic and stable secondary deformation, such as the jiggling of fat, remains challenging. On one hand, traditional volumetric approaches such as the finite element method have a higher computational cost and are infeasible on limited hardware such as game consoles. On the other hand, while shape-matching-based simulations can produce plausible deformation in real time, they suffer from a stiffness problem in which particles either show unrealistic deformation due to high gains or cannot catch up with the body movement. In this paper, we propose a unified multi-layer lattice model to simulate the primary and secondary deformation of skeleton-driven characters. The core idea is to voxelize the input character mesh into multiple anatomical layers, including bone, muscle, fat, and skin. Primary deformation is applied to the bone voxels with lattice-based skinning. The movement of these voxels is propagated to the other voxel layers using lattice shape matching simulation, creating natural secondary deformation. Our multi-layer lattice framework can produce simulation quality comparable to that of other volumetric approaches at a significantly smaller computational cost. It is best suited to real-time applications such as console games or interactive animation creation.

DOI

Scopus

Citations (Scopus): 18
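VRMixer: Fusing video content with the real world and verifying its applicability (VRMixer: 動画コンテンツと現実世界の融合とその適用可能性の検証)

Yoshiki Maki, Satoshi Nakamura, Tatsunori Hirai, Tsubasa Yumura, Shigeo Morishima

Proceedings of the Entertainment Computing Symposium 2015 (エンタテインメントコンピューティングシンポジウム2015論文集) ( 2015 ) 557 - 565 September 2015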

CiNii
Automatic generation of photorealistic 3D inner mouth animation only from frontal images

Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

Journal of Information Processing 23 ( 5 ) 693 - 703 September 2015 [Peer-reviewed]

In this paper, we propose a novel method to generate highly photorealistic three-dimensional (3D) inner mouth animation that is well fitted to an original ready-made speech animation using only frontal captured images and small databases. The algorithm consists of quasi-3D model reconstruction, motion control of the teeth and the tongue, and final compositing of photorealistic speech animation tailored to the original. In general, producing a satisfactory photorealistic appearance of the inner mouth that is synchronized with mouth movement is a very complicated and time-consuming task. This is because the tongue and mouth are too flexible and delicate to be modeled with the large number of meshes required. Therefore, in some cases, this process is omitted or replaced with a very simple generic model. Our proposed method, on the other hand, can automatically generate 3D inner mouth appearances with improved photorealism from only three inputs: an original tailor-made lip-sync animation, a single image of the speaker’s teeth, and a syllabic decomposition of the desired speech. The key idea of our proposed method is to combine 3D reconstruction and simulation with two-dimensional (2D) image processing using only the above three inputs, as well as a tongue database and a mouth database. The satisfactory performance of our proposed method is illustrated by the significant improvement in picture quality of several tailor-made animations to a degree nearly equivalent to that of camera-captured videos.

DOI

Scopus
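VRMixer: Fusing video content with the real world and verifying its applicability (VRMixer: 動画コンテンツと現実世界の融合とその適用可能性の検証)

Yoshiki Maki, Satoshi Nakamura, Tatsunori Hirai, Tsubasa Yumura, Shigeo Morishima

Proceedings of Entertainment Computing 2015 (Entertainment Computing 2015講演論文集) 1 - 9 September 2015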
3D face reconstruction from a single non-frontal face image

Naoki Nozawa, Daiki Kuwahara, Shigeo Morishima

ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015 July 2015 [Peer-reviewed]

Reconstructing a human face shape from a single image is an important topic for criminal investigation, such as identifying suspects from surveillance camera footage with only a few frames. It is, however, still difficult to recover a face shape from a non-frontal face image. Methods that use shading cues on a face, for example [Kemelmacher et al. 2011], depend on the lighting conditions and cannot be applied to images in which shadows occur. On the other hand, [Blanz et al. 2004] reconstructed a shape using a 3D Morphable Model (3DMM) with facial feature points only. This method, however, requires pose-wise correspondences between vertices of the model and feature points of the input image, because the face contour cannot be seen when the face is not frontal. In this paper, we propose a method that can reconstruct a facial shape from a non-frontal face image with only a single general correspondence table. Our method searches for the correspondences of points on the facial contour during the iterative reconstruction process, which makes the reconstruction simple and stable.

DOI

Scopus

Citations (Scopus): 1
Texture preserving garment transfer

Fumiya Narita, Shunsuke Saito, Takuya Kato, Tsukasa Fukusato, Shigeo Morishima

ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015 91:1 July 2015 [Peer-reviewed]

Dressing virtual characters is necessary for many applications, yet modeling clothing is a significant bottleneck. Garment Transfer has therefore been proposed for transferring a clothing model from one character to another [Brouet et al. 2012]. In recent years, this idea has been extended to apply between characters in various poses and shapes [Narita et al. 2014]. However, the texture design of the clothing is not preserved in these methods, since they deform the source clothing model to fit the target body (see Figure 1(a)(c)). We propose a novel method to transfer a garment while preserving its texture design. First, we cut the transferred clothing mesh model along the seams. Second, following an approach similar to "as-rigid-as-possible" deformation, we deform the texture space to reflect the shape of the transferred clothing mesh model. Cutting along the seams keeps the texture consistent as clothing. To avoid generating inversions, we modify the "as-rigid-as-possible" formulation. Our method allows users to preserve texture not only uniformly over the transferred clothing (see Figure 1(b)) but also at particular locations that the user specifies, such as the location of an appliqué (see Figure 1(e)). Our method is a pioneer of Texture Preserving Garment Transfer.

DOI

Scopus

Citations (Scopus): 3
BG maker: Example-based anime background image creation from a photograph

Shugo Yamaguchi, Chie Furusawa, Takuya Kato, Tsukasa Fukusato, Shigeo Morishima

ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015 45:1 July 2015 [Peer-reviewed]

Anime designers often paint actual scenery from photographs to serve as background images that complement characters. As painting background scenery is time-consuming and costly, there is high demand for techniques that can convert photographs into anime-style graphics. Previous approaches for this purpose, such as Image Quilting [Efros and Freeman 2001], transferred a source texture onto a target photograph. These methods synthesized source patches corresponding to the target elements in a photograph, and correspondence was achieved through nearest-neighbor search such as PatchMatch [Barnes et al. 2009]. However, the nearest-neighbor patch is not always the most suitable patch for anime transfer, because photographs and anime background images differ in color and texture. For example, real-world colors need to be converted into specific colors for anime; further, the type of brushwork required to realize an anime effect differs between photograph elements (e.g., sky, mountain, grass). Thus, to obtain the most suitable patch, we propose a method that establishes global region correspondence before local patch matching. In our proposed method, BGMaker, (1) we divide the real and anime images into regions; (2) we automatically acquire correspondence between regions on the basis of color and texture features; and (3) we search for and synthesize the most suitable patch within the corresponding region. Our primary contribution is a method for automatically acquiring correspondence between target regions and source regions of different color and texture, which allows us to generate an anime background image while preserving the details of the source image.

DOI

Scopus

Citations (Scopus): 1
Automatic synthesis of eye and head animation according to duration and point of gaze

Hiroki Kagiyama, Masahide Kawai, Daiki Kuwahara, Takuya Kato, Shigeo Morishima

ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015 44:1 July 2015 [Peer-reviewed]

In movie and video game production, synthesizing subtle eye movements and the corresponding head movements of a CG character is essential for making content dramatic and impressive. However, this takes a lot of time and labor because the movements often have to be created manually by skilled artists. [Itti et al. 2006] and [Yeo et al. 2012] proposed automatic eye and head motion control methods based on measurements of a real person watching a displayed gaze point. However, in both approaches, the rotational angle and speed of the eyes and head are treated uniformly, depending only on the gaze point location. In fact, the display duration of a gaze target strongly influences the motion of the eyes and head: the shorter the blink interval of a gaze target, the more quickly a human responds to chase the target by combining eye rotation and head movement. In this paper, we propose a method to automatically control the eyes and head by taking into account both the gaze target location and its display duration. Eye and head movements are modeled from measured data as a function whose arguments are the gaze point angle and the duration. A variety of gaze actions with accompanying head motion, including the vestibulo-ocular reflex, can thus be generated automatically by changing the gaze angle and duration parameters.

DOI

Scopus
A music video authoring system synchronizing climax of video clips and music via rearrangement of musical bars

Haruki Sato, Tatsunori Hirai, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

ACM SIGGRAPH 2015 Posters, SIGGRAPH 2015 42:1 July 2015 [Peer-reviewed]

This paper presents a system that can automatically add a soundtrack to a video clip by replacing and concatenating the musical bars of an existing song according to a user's preference. Since a soundtrack makes a video clip attractive, adding a soundtrack is one of the most important processes in video editing. To make a video clip more attractive, an editor tends to add a soundtrack with its timing and climax in mind. For example, editors often align chorus sections with the climax of the clip by replacing and concatenating musical bars in an existing song. In this process, however, editors must take the naturalness of the rearranged soundtrack into account. Editors therefore have to decide how to replace musical bars in a song while considering its timing, its climax, and the naturalness of the rearranged soundtrack simultaneously. They must optimize the soundtrack by listening to the rearranged result and checking its naturalness and its synchronization with the video clip. This repetitious work is time-consuming. [Feng et al. 2010] proposed an automatic soundtrack addition method. However, since this method adds a soundtrack automatically with a data-driven approach, it cannot take into account the timing and climax that a user prefers. Our system takes all patterns of rearranged musical bars into account and finds the most natural soundtrack given the user's intended audio-visual alignment and the climax of the resulting soundtrack. Specifically, the musical sections between the user-specified points and the beginning and end of the song are automatically interpolated by replacing and concatenating musical bars based on dynamic programming. To reflect the user's intention for the climax of the soundtrack, the system allows the user to specify the intended climax through an editing interface. The system immediately reflects this intention, and the soundtrack is interactively rearranged. These semi-automated processes for rearranging a soundtrack for a video clip help users add songs without having to check the naturalness of the rearranged song themselves.

DOI

Scopus

4

被引用数

(Scopus)
キャラクターの身体構造を考慮した実時間肉揺れ生成手法

Naoya Iwamoto, Shigeo Morishima

情報処理学会論文誌 44 ( 3 ) 502 - 511 2015年07月 [査読有り]

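This paper proposes a method for generating a character's flesh jiggle in real time. Here, flesh jiggle refers to secondary motion, such as the wobbling of fat layers, caused by character motion that originates from skeletal deformation. Many methods based on the computationally expensive finite element method have been proposed for faithful flesh-jiggle representation, while robust real-time methods that use simplified elastic-body computation have also been proposed in recent years. However, because the influence of rigid skinning driven by bone deformation was applied even to the fat layer and skin, which form the simulation region, only slight jiggling on the skin surface could be expressed. Aiming at real-time representation of larger flesh jiggle, this paper proposes a method that automatically layers the internal structure beneath the character's skin and separates the skinning region from the simulation region. Because the user can also freely change the volume of each layer and the elastic-body parameters, the method makes it possible to set elastic material properties that reflect the internal structure and to specify the jiggling body parts. Finally, we demonstrate the effectiveness of the method through results obtained by applying it to a variety of character models.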

DOI CiNii
VoiceDub：複数タイミング情報をともなう映像エンタテイメント向け音声同期収録支援システム

川本真一, 森島繁生, 中村哲

情報処理学会論文誌 56 ( 4 ) 1142 - 1151 2015年04月 [査読有り]
VRMixer: 動画セグメンテーションによる動画コンテンツと現実世界の融合

平井辰典, 中村聡史, 湯村翼, 森島繁生

インタラクション2015講演論文集 1 - 6 2015年03月
ラリーシーンに着目した映像自動要約によるラケットスポーツ動画鑑賞システム

河村俊哉, 福里司, 平井辰典, 森島繁生

情報処理学会論文誌 56 ( 3 ) 1028 - 1038 2015年03月 [査読有り]
FG2015 Age Progression Evaluation

Andreas Lanitis, Nicolas Tsapatsoulis, Kleanthis Soteriou, Daiki Kuwahara, Shigeo Morishima

2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 3 2015年 [査読有り]

The topic of face-aging received increased attention by the computer vision community during the recent years. This interest is motivated by important real life applications where accurate age progression algorithms can be used. However, age progression methodologies may only be used in real applications provided that they have the ability to produce accurate age progressed images. Therefore it is of utmost importance to encourage the development of accurate age progression algorithms through the formulation of performance evaluation protocols that can be used for obtaining accurate performance evaluation results for different algorithms reported in the literature. In this paper we describe the organization of the, first ever, pilot independent age progression competition that aims to provide the basis of a robust framework for assessing age progression methodologies. The evaluation carried out involves the use of several machine-based and human-based indicators that were used for assessing eight age progression methods.
FG2015 Age Progression Evaluation

Andreas Lanitis, Nicolas Tsapatsoulis, Kleanthis Soteriou, Daiki Kuwahara, Shigeo Morishima

2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 1 2015年 [査読有り]

The topic of face-aging received increased attention by the computer vision community during the recent years. This interest is motivated by important real life applications where accurate age progression algorithms can be used. However, age progression methodologies may only be used in real applications provided that they have the ability to produce accurate age progressed images. Therefore it is of utmost importance to encourage the development of accurate age progression algorithms through the formulation of performance evaluation protocols that can be used for obtaining accurate performance evaluation results for different algorithms reported in the literature. In this paper we describe the organization of the, first ever, pilot independent age progression competition that aims to provide the basis of a robust framework for assessing age progression methodologies. The evaluation carried out involves the use of several machine-based and human-based indicators that were used for assessing eight age progression methods.
FG2015 Age Progression Evaluation

Andreas Lanitis, Nicolas Tsapatsoulis, Kleanthis Soteriou, Daiki Kuwahara, Shigeo Morishima

2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 5 2015年 [査読有り]

The topic of face-aging received increased attention by the computer vision community during the recent years. This interest is motivated by important real life applications where accurate age progression algorithms can be used. However, age progression methodologies may only be used in real applications provided that they have the ability to produce accurate age progressed images. Therefore it is of utmost importance to encourage the development of accurate age progression algorithms through the formulation of performance evaluation protocols that can be used for obtaining accurate performance evaluation results for different algorithms reported in the literature. In this paper we describe the organization of the, first ever, pilot independent age progression competition that aims to provide the basis of a robust framework for assessing age progression methodologies. The evaluation carried out involves the use of several machine-based and human-based indicators that were used for assessing eight age progression methods.

DOI

Scopus

4

被引用数

(Scopus)
FG2015 Age Progression Evaluation

Andreas Lanitis, Nicolas Tsapatsoulis, Kleanthis Soteriou, Daiki Kuwahara, Shigeo Morishima

2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 2 2015年 [査読有り]

The topic of face-aging received increased attention by the computer vision community during the recent years. This interest is motivated by important real life applications where accurate age progression algorithms can be used. However, age progression methodologies may only be used in real applications provided that they have the ability to produce accurate age progressed images. Therefore it is of utmost importance to encourage the development of accurate age progression algorithms through the formulation of performance evaluation protocols that can be used for obtaining accurate performance evaluation results for different algorithms reported in the literature. In this paper we describe the organization of the, first ever, pilot independent age progression competition that aims to provide the basis of a robust framework for assessing age progression methodologies. The evaluation carried out involves the use of several machine-based and human-based indicators that were used for assessing eight age progression methods.
Affective music recommendation system based on the mood of input video

Shoto Sasaki, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8936 299 - 302 2015年 [査読有り]

© Springer International Publishing Switzerland 2015. We present an affective music recommendation system that fits music to an input video without using textual information. Music that matches the mood of our current surroundings can deepen the impression it leaves. However, it is not easy to know which music in a huge database best matches our present mood, so we often select the same well-known popular song repeatedly regardless of that mood. In this paper, we analyze a video sequence that represents the current mood and recommend appropriate music for that mood. Our system matches an input video with music on the valence-arousal plane, an emotional plane.

DOI

Scopus

14

被引用数

(Scopus)
Facial aging simulator by data-driven component-based texture cloning

Daiki Kuwahara, Akinobu Maejima, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8936 295 - 298 2015年 [査読有り]

© Springer International Publishing Switzerland 2015. Facial aging and rejuvenation simulation is a challenging topic because preserving personal characteristics at every age is a difficult problem. In this demonstration, we simulate facial aging and rejuvenation from only a single photo. Our system alters an input face image into an aged face by reconstructing every facial component using a face database for the target age. Appropriate facial component images are selected by a special similarity measure between the current age and the target age so as to keep personal characteristics as much as possible. Our system successfully generated aged and rejuvenated faces with age-related features such as spots, wrinkles, and sagging while keeping personal characteristics throughout all ages.

DOI

Scopus

3

被引用数

(Scopus)
Focusing patch: Automatic photorealistic deblurring for facial images by patch-based color transfer

Masahide Kawai, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8935 155 - 166 2015年 [査読有り]

© Springer International Publishing Switzerland 2015. Facial image synthesis creates blurred facial images almost without high-frequency components, resulting in flat edges. Moreover, the synthesis process results in inconsistent facial images, such as the conditions where the white part of the eye is tinged with the color of the iris and the nasal cavity is tinged with the skin color. Therefore, we propose a method that can deblur an inconsistent synthesized facial image, including strong blurs created by common image morphing methods, and synthesize photographic quality facial images as clear as an image captured by a camera. Our system uses two original algorithms: patch color transfer and patch-optimized visio-lization. Patch color transfer can normalize facial luminance values with high precision, and patch-optimized visio-lization can synthesize a deblurred, photographic quality facial image. The advantages of our method are that it enables the reconstruction of the high-frequency components (concavo-convex) of human skin and removes strong blurs by employing only the input images used for original image morphing.

DOI

Scopus

2

被引用数

(Scopus)
Facial Fattening and Slimming Simulation Based on Skull Structure

Masahiro Fujisaki, Shigeo Morishima

ADVANCES IN VISUAL COMPUTING, PT II (ISVC 2015) 9475 137 - 149 2015年 [査読有り]

In this paper, we propose a novel facial fattening and slimming deformation method in 2D images that preserves the individuality of the input face by estimating the skull structure from a frontal face image and prevents unnatural deformation (e.g. penetration into the skull). Our method is composed of skull estimation, optimizing fattening and slimming rules appropriate to the estimated skull, mesh deformation to generate fattening and slimming face, and generation background image adapted to the generated face contour. Finally, we verify our method by comparison with other rules, precision of skull estimation, subjective experiment, and execution time.

DOI

Scopus
Dance motion segmentation method based on choreographic primitives

Narumi Okada, Naoya Iwamoto, Tsukasa Fukusato, Shigeo Morishima

GRAPP 2015 - 10th International Conference on Computer Graphics Theory and Applications; VISIGRAPP, Proceedings 332 - 339 2015年 [査読有り]

Copyright © 2015 SCITEPRESS - Science and Technology Publications. All rights reserved. Data-driven animation using a large human motion database enables the programing of various natural human motions. While the development of a motion capture system allows the acquisition of realistic human motion, segmenting the captured motion into a series of primitive motions for the construction of a motion database is necessary. Although most segmentation methods have focused on periodic motion, e.g., walking and jogging, segmenting non-periodic and asymmetrical motions such as dance performance, remains a challenging problem. In this paper, we present a specialized segmentation approach for human dance motion. Our approach consists of three steps based on the assumption that human dance motion is composed of consecutive choreographic primitives. First, we perform an investigation based on dancer perception to determine segmentation components. After professional dancers have selected segmentation sequences, we use their selected sequences to define rules for the segmentation of choreographic primitives. Finally, the accuracy of our approach is verified by a user-study, and we thereby show that our approach is superior to existing segmentation methods. Through three steps, we demonstrate automatic dance motion synthesis based on the choreographic primitives obtained.

DOI

Scopus

4

被引用数

(Scopus)
FG2015 Age Progression Evaluation

Andreas Lanitis, Nicolas Tsapatsoulis, Kleanthis Soteriou, Daiki Kuwahara, Shigeo Morishima

2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 4 1 - 6 2015年 [査読有り]

The topic of face-aging received increased attention by the computer vision community during the recent years. This interest is motivated by important real life applications where accurate age progression algorithms can be used. However, age progression methodologies may only be used in real applications provided that they have the ability to produce accurate age progressed images. Therefore it is of utmost importance to encourage the development of accurate age progression algorithms through the formulation of performance evaluation protocols that can be used for obtaining accurate performance evaluation results for different algorithms reported in the literature. In this paper we describe the organization of the, first ever, pilot independent age progression competition that aims to provide the basis of a robust framework for assessing age progression methodologies. The evaluation carried out involves the use of several machine-based and human-based indicators that were used for assessing eight age progression methods.

DOI

Scopus

4

被引用数

(Scopus)
MusicMixer: Computer-Aided DJ System based on an Automatic Song Mixing

Tatsunori Hirai, Hironori Doi, Shigeo Morishima

12TH ADVANCES IN COMPUTER ENTERTAINMENT TECHNOLOGY CONFERENCE (ACE15) 16-19-November-2015 41:1-41:5 2015年 [査読有り]

In this paper, we present MusicMixer, a computer-aided DJ system that helps DJs, specifically with song mixing. MusicMixer continuously mixes and plays songs using an automatic music mixing method that employs audio similarity calculations. By calculating similarities between song sections that can be naturally mixed, MusicMixer enables seamless song transitions. Though song mixing is the most fundamental and important factor in DJ performance, it is difficult for untrained people to seamlessly connect songs. MusicMixer realizes automatic song mixing using an audio signal processing approach; therefore, users can perform DJ mixing simply by selecting a song from a list of songs suggested by the system, enabling effective DJ song mixing and lowering entry barriers for the inexperienced. We also propose personalization for song suggestions using a preference memorization function of MusicMixer.

DOI

Scopus

4

被引用数

(Scopus)
Pose-independent garment transfer

Fumiya Narita, Shunsuke Saito, Takuya Kato, Tsukasa Fukusato, Shigeo Morishima

SIGGRAPH Asia 2014 Posters, SIGGRAPH ASIA 2014 12 2014年11月 [査読有り]

DOI

Scopus

4

被引用数

(Scopus)
VRMixer: Mixing video and real world with video segmentation

Tatsunori Hirai, Satoshi Nakamura, Tsubasa Yumura, Shigeo Morishima

ACM International Conference Proceeding Series 2014-November 30 - 7 2014年11月 [査読有り]

Copyright © 2014 ACM. This paper presents VRMixer, a system that mixes real world and a video clip letting a user enter the video clip and realize a virtual co-starring role with people appearing in the clip. Our system constructs a simple virtual space by allocating video frames and the people appearing in the clip within the user's 3D space. By measuring the user's 3D depth in real time, the time space of the video clip and the user's 3D space become mixed. VRMixer automatically extracts human images from a video clip by using a video segmentation technique based on 3D graph cut segmentation that employs face detection to detach the human area from the background. A virtual 3D space (i.e., 2.5D space) is constructed by positioning the background in the back and the people in the front. In the video clip, the user can stand in front of or behind the people by using a depth camera. Real objects that are closer than the distance of the clip's background will become part of the constructed virtual 3D space. This synthesis creates a new image in which the user appears to be a part of the video clip, or in which people in the clip appear to enter the real world. We aim to realize "video reality," i.e., a mixture of reality and video clips using VRMixer.

DOI

Scopus

4

被引用数

(Scopus)
Automatic depiction of onomatopoeia in animation considering physical phenomena

Tsukasa Fukusato, Shigeo Morishima

Proceedings - Motion in Games 2014, MIG 2014 161 - 169 2014年11月 [査読有り]

Copyright © ACM. This paper presents a method that enables the estimation and depiction of onomatopoeia in computer-generated animation based on physical parameters. Onomatopoeia is used to enhance physical characteristics and movement, and enables users to understand animation more intuitively. We experiment with onomatopoeia depiction in scenes within the animation process. To quantify onomatopoeia, we employ Komatsu's [2012] assumption, i.e., onomatopoeia can be expressed by n-dimensional vector. We also propose phonetic symbol vectors based on the correspondence of phonetic symbols to the impressions of onomatopoeia using a questionnaire-based investigation. Furthermore, we verify the positioning of onomatopoeia in animated scenes. The algorithms directly combine phonetic symbols to estimate optimum onomatopoeia. They use a view-dependent Gaussian function to display onomatopoeias in animated scenes. Our method successfully recommends optimum onomatopoeias using only physical parameters, so that even amateur animators can easily create onomatopoeia animation.

DOI

Scopus

13

被引用数

(Scopus)
VRMixer: 動画と現実の融合による新たなコンテンツの生成

平井辰典, 中村聡史, 森島繁生, 湯村翼

OngaCRESTシンポジウム2014予稿集 27 - 27 2014年08月
歌手映像と歌声の解析に基づく音楽動画中の歌唱シーン検出

平井辰典, 中野倫靖, 後藤真孝, 森島繁生

OngaCRESTシンポジウム2014予稿集 20 - 20 2014年08月
顔の印象類似検索のための髪特徴量の提案

藤賢大, 福里司, 佐々木将人, 増田太郎, 平井辰典, 森島繁生

第17回画像の認識・理解シンポジウム(MIRU2014)講演論文集 1 - 2 2014年07月
ラケットスポーツ動画のラリーシーンの特徴に基づく映像要約

河村俊哉, 福里司, 平井辰典, 森島繁生

第17回画像の認識・理解シンポジウム(MIRU2014)講演論文集 1 - 2 2014年07月
A Visuomotor Coordination Model for Obstacle Recognition

Tomoyori Iwao, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

Journal of WSCG 22 ( 1-2 ) 49 - 56 2014年07月 [査読有り]
エンタテインメント応用のための人物顔パターン計測・合成技術

森島繁生

計測と制御 53 ( 7 ) 593 - 598 2014年07月 [査読有り]

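Security applications for realizing a safe and secure society are one of the main targets of human pattern measurement technology, but the technology is also an important building block in entertainment. In particular, supported by advances in hardware, it has become an effective means of information input for interactive applications such as games and smartphones. Pattern measurement for entertainment applications differs fundamentally from accuracy-oriented computer vision (CV) applications in that it must overcome the trade-off between speed and accuracy and deliver the required performance as far as possible while remaining real-time. With user-participatory CG entertainment particularly in mind, this article concerns modeling, simulation, and animation methods that reflect the individuality of the human characters who appear in it [...]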

DOI CiNii
歌手映像と歌声の解析に基づく音楽動画中の歌唱シーン検出手法の検討

平井辰典, 中野倫靖, 後藤真孝, 森島繁生

研究報告音楽情報科学（MUS） 2014 ( 54 ) 1 - 8 2014年05月

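This report examines methods for detecting "singing scenes", i.e., scenes in which a singer is singing, in music videos such as live performance videos and promotional videos (PVs). The singer plays the leading role in music, and singing scenes are likewise among the highlights of a music video. Singing scenes are useful for generating video thumbnails and for quickly browsing large numbers of music videos. Detecting them requires component techniques such as recognizing the singer's face and detecting vocal segments in a song, as well as ways of combining these techniques. We discuss the feasibility of singing scene detection by comparing and examining video analysis using face recognition, audio analysis using vocal segment detection, and audio-visual analysis that combines the two.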

CiNii
Macroscopic and microscopic deformation coupling in up-sampled cloth simulation

Shunsuke Saito, Nobuyuki Umetani, Shigeo Morishima

COMPUTER ANIMATION AND VIRTUAL WORLDS 25 ( 3-4 ) 437 - 446 2014年05月 [査読有り]

Various methods of predicting the deformation of fine-scale cloth from coarser resolutions have been explored. However, the influence of fine-scale deformation has not been considered in coarse-scale simulations. Thus, the simulation of highly nonhomogeneous detailed cloth is prone to large errors. We introduce an effective method to simulate cloth made of nonhomogeneous, anisotropic materials. We precompute a macroscopic stiffness that incorporates anisotropy from the microscopic structure, using the deformation computed for each unit strain. At every time step of the simulation, we compute the deformation of coarse meshes using the coarsened stiffness, which saves computational time and add higher-level details constructed by the characteristic displacement of simulated meshes. We demonstrate that anisotropic and inhomogeneous cloth models can be simulated efficiently using our method. (c) 2014 The Authors. Computer Animation and Virtual Worlds published by John Wiley & Sons, Ltd.

DOI

Scopus

3

被引用数

(Scopus)
Facial Aging Simulation by Patch-Based Texture Synthesis with Statistical Wrinkle Aging Pattern Model

Akinobu Maejima, Ai Mizokawa, Daiki Kuwahara, Shigeo Morishima

Mathematical Progress in Expressive Image Synthesis 161 - 170 2014年02月 [査読有り]
Driver Drowsiness Estimation from Facial Expression Features: Computer Vision Feature Investigation using a CG Model

Taro Nakamura, Akinobu Maejima, Shigeo Morishima

PROCEEDINGS OF THE 2014 9TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, THEORY AND APPLICATIONS (VISAPP 2014), VOL 2 2 207 - 214 2014年 [査読有り]

We propose a method for estimating the degree of a driver's drowsiness on the basis of changes in facial expressions captured by an IR camera. Typically, drowsiness is accompanied by drooping eyelids. Therefore, most related studies have focused on tracking eyelid movement by monitoring facial feature points. However, the drowsiness feature emerges not only in eyelid movements but also in other facial expressions. To more precisely estimate drowsiness, we must select other effective features. In this study, we detected a new drowsiness feature by comparing a video image and CG model that are applied to the existing feature point information. In addition, we propose a more precise degree of drowsiness estimation method using wrinkle changes and calculating local edge intensity on faces, which expresses drowsiness more directly in the initial stage.

DOI

Scopus

23

被引用数

(Scopus)
Measured curvature-dependent reflectance function for synthesizing translucent materials in real-time

Midori Okamoto, Shohei Adachi, Hiroaki Ukaji, Kazuki Okami, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 96:1 2014年 [査読有り]

DOI

Scopus
Patch-based fast image interpolation in spatial and temporal direction

Shunsuke Saito, Ryuuki Sakamoto, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 70:1 2014年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Face retrieval system by similarity of impression based on hair attribute

Takahiro Fuji, Tsukasa Fukusato, Shoto Sasaki, Taro Masuda, Tatsunori Hirai, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 65:1 2014年 [査読有り]

DOI

Scopus
Efficient video viewing system for racquet sports with automatic summarization focusing on rally scenes

Shunya Kawamura, Tsukasa Fukusato, Tatsunori Hirai, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 62:1 2014年 [査読有り]

DOI

Scopus

5

被引用数

(Scopus)
Automatic deblurring for facial image based on patch synthesis

Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 58:1 2014年 [査読有り]

DOI

Scopus
Photorealistic facial image from monochrome pencil sketch

Ai Mizokawa, Taro Nakamura, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 39:1 2014年 [査読有り]

DOI

Scopus
Facial fattening and slimming simulation considering skull structure

Masahiro Fujisaki, Daiki Kuwahara, Taro Nakamura, Akinobu Maejima, Takayoshi Yamashita, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 35:1 2014年 [査読有り]

DOI

Scopus
The efficient and robust sticky viscoelastic material simulation

Kakuto Goto, Naoya Iwamoto, Shunsuke Saito, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 15:1 2014年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Quasi 3D rotation for hand-drawn characters

Chie Furusawa, Tsukasa Fukusato, Narumi Okada, Tatsunori Hirai, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 12:1 2014年 [査読有り]

DOI

Scopus

3

被引用数

(Scopus)
Material parameter editing system for volumetric simulation models

Naoya Iwamoto, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 10:1 2014年 [査読有り]

DOI

Scopus
Example-based blendshape sculpting with expression individuality

Takuya Kato, Shunsuke Saito, Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2014 Posters, SIGGRAPH 2014 7:1 2014年 [査読有り]

DOI

Scopus
Application friendly voxelization on GPU by geometry splitting

Zhuopeng Zhang, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8698 LNCS 112 - 120 2014年 [査読有り]

In this paper, we present a novel approach that utilizes the geometry shader to dynamically voxelize 3D models in real-time. In the geometry shader, the primitives are split by their Z-order, and then rendered to tiles which compose a single 2D texture. This method is completely based on graphic pipeline, rather than computational methods like CUDA/OpenCL implementation. So it can be easily integrated into a rendering or simulation system. Another advantage of our algorithm is that while doing voxelization, it can simultaneously record the additional mesh information like normal, material properties and even speed of vertex displacement. Our method achieves conservative voxelization by only two passes of rendering without any preprocessing and it fully runs on GPU. As a result, our algorithm is very useful for dynamic application. © 2014 Springer International Publishing.

DOI

Scopus

5

被引用数

(Scopus)
LyricsRadar: A Lyrics Retrieval System Based on Latent Topics of Lyrics

Shoto Sasaki, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR 2014, Taipei, Taiwan, October 27-31, 2014 585 - 590 2014年 [査読有り]
Spotting a query phrase from polyphonic music audio signals based on semi-supervised nonnegative matrix factorization

Taro Masuda, Kazuyoshi Yoshii, Masataka Goto, Shigeo Morishima

Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR 2014 227 - 232 2014年 [査読有り]

© Taro Masuda, Kazuyoshi Yoshii, Masataka Goto, Shigeo Morishima. This paper proposes a query-by-audio system that aims to detect temporal locations where a musical phrase given as a query is played in musical pieces. The “phrase” in this paper means a short audio excerpt that is not limited to a main melody (singing part) and is usually played by a single musical instrument. A main problem of this task is that the query is often buried in mixture signals consisting of various instruments. To solve this problem, we propose a method that can appropriately calculate the distance between a query and partial components of a musical piece. More specifically, gamma process nonnegative matrix factorization (GaP-NMF) is used for decomposing the spectrogram of the query into an appropriate number of basis spectra and their activation patterns. Semi-supervised GaP-NMF is then used for estimating activation patterns of the learned basis spectra in the musical piece by presuming the piece to partially consist of those spectra. This enables distance calculation based on activation patterns. The experimental results showed that our method outperformed conventional matching methods.
Data-driven speech animation synthesis focusing on realistic inside of the mouth

Masahide Kawai, Tomoyori Iwao, Daisuke Mima, Akinobu Maejima, Shigeo Morishima

Journal of Information Processing 22 ( 2 ) 401 - 409 2014年 [査読有り]

Speech animation synthesis is still a challenging topic in the field of computer graphics. Despite many challenges, representing detailed appearance of inner mouth such as nipping tongue's tip with teeth and tongue's back hasn't been achieved in the resulting animation. To solve this problem, we propose a method of data-driven speech animation synthesis especially when focusing on the inside of the mouth. First, we classify inner mouth into teeth labeling opening distance of the teeth and a tongue according to phoneme information. We then insert them into existing speech animation based on opening distance of the teeth and phoneme information. Finally, we apply patch-based texture synthesis technique with a 2,213 images database created from 7 subjects to the resulting animation. By using the proposed method, we can automatically generate a speech animation with the realistic inner mouth from the existing speech animation created by previous methods. © 2014 Information Processing Society of Japan.

DOI

Scopus

12

被引用数

(Scopus)
Automatic photorealistic 3D inner mouth restoration from frontal images

Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8887 51 - 62 2014年 [査読有り]

© Springer International Publishing Switzerland 2014. In this paper, we propose a novel method to generate highly photorealistic three-dimensional (3D) inner mouth animation that is well-fitted to an original ready-made speech animation using only frontal captured images and a small-size database. The algorithms are composed of quasi-3D model reconstruction and motion control of teeth and the tongue, and final compositing of photorealistic speech animation synthesis tailored to the original.

DOI

Scopus

4

被引用数

(Scopus)
Automatic Music Video Generation System by Reusing Posted Web Content with Hidden Markov Model

Hayato Ohya, Shigeo Morishima

IIEEJ Transactions on Image Electronics and Visual Computing 11 ( 1 ) 65 - 73 2013年12月 [査読有り]

CiNii
確率モデルに基づく対話時の眼球運動の分析及び合成

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

画像電子学会誌 42 ( 5 ) 661 - 670 2013年09月 [査読有り]

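Correctly reproducing eye movements is important for realistic character animation. At present, however, reproducing a character's eye movements during conversation requires manual crafting by artists, which incurs great cost and labor. This study therefore focuses on human eye movements during conversation and generates eye movements automatically by modeling eye movements and blinks with probability functions based on measurement results. First, we measure eye movements during conversation, including blinks, and fixational eye movements separately. Next, based on the measurements, we classify eye movements during conversation into saccades and fixational eye movements. We then approximate the classified saccades, fixational eye movements, and blinks each with a probabilistic model, and apply these models to a character, which enables automatic generation of realistic eye movements.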

DOI CiNii
Automatic Comic-Style Video Summarization of Anime Films by Key-frame Detection

Tsukasa Fukusato, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

Proceedings of the Expressive 2013 2013年07月 [査読有り]
アニメ作品におけるキーフレーム自動抽出に基づく映像要約手法の提案

福里司, 平井辰典, 大矢隼士, 森島繁生

画像電子学会誌 42 ( 4 ) 448 - 456 2013年07月 [査読有り]

　概要を見る

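In recent years, comic-style video summarization methods that use play logs or image features have been proposed. However, most conventional methods do not address the amount of information or the granularity needed to summarize a video appropriately, because no evaluation measure exists for judging whether the extracted key frames are appropriate as a comic. In this study, noting that anime works are produced from an original comic by interpolating between its panels, we propose the degree of agreement between an anime work and its original comic as an evaluation measure for video summarization. We further propose a method that determines, from image features, the key frames needed to summarize an anime work appropriately as a comic. By dividing the anime into shots and computing importance at two levels, per shot and per frame, we achieve high-accuracy extraction of key frames corresponding to the panels of the original comic, and quantitative evaluation results demonstrate the effectiveness of the method.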

DOI CiNii
既存音楽動画の再利用による音楽に合った動画の自動生成システム

平井辰典, 大矢隼士, 森島繁生

情報処理学会論文誌 54 ( 4 ) 1254 - 1262 2013年04月 [査読有り]
注目領域中の画像類似度に基づく動画中のキャラクター登場シーンの推薦手法

増田太郎, 平井辰典, 大矢隼士, 森島繁生

第75回全国大会講演論文集 2013 ( 1 ) 601 - 602 2013年03月

　概要を見る

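In this study, we propose a similarity measure between video features for retrieving scenes in other videos that resemble a given video scene. Many studies on similar-video retrieval compute features over the entire frame and therefore treat important and unimportant parts of the frame equally. Our method instead estimates the regions of a video that tend to attract human attention and extracts features within those local regions. By also taking features of the entire frame into account, we construct a similarity measure between videos that considers both global and local features. Similar videos are retrieved by searching a database for the scene most similar to the query scene. We also examine the effectiveness of the method through experiments.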

CiNii
Reflectance estimation of human face from a single shot image

Kazuki Okami, Naoya Iwamoto, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 105 2013年 [査読有り]

　概要を見る

Simulation of the reflectance of translucent materials is one of the most important factors in the creation of realistic CG objects. Estimating the reflectance characteristics of translucent materials from a single image is a very efficient way of re-rendering objects that exist in real environments. However, this task is considerably challenging because it involves many unknown parameters. Munoz et al. [2011] proposed a method for estimating the bidirectional surface scattering reflectance distribution function (BSSRDF) from a single image. However, it is difficult or impossible to estimate the BSSRDF of materials with complex shapes with this method, because it targets convex objects and therefore uses a rough depth recovery technique for globally convex objects. In this paper, we propose a method for accurately estimating the BSSRDF of human faces, which have complex shapes. We use a 3D face reconstruction technique to satisfy the above assumption. This allows us to acquire more accurate geometries of human faces and thus to estimate the reflectance characteristics of faces.

DOI

Scopus
Real-time dust rendering by parametric shell texture synthesis

Shohei Adachi, Hiroaki Ukaji, Takahiro Kosaka, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 104 2013年 [査読有り]

　概要を見る

When synthesizing a realistic appearance of a dust-covered object in CG, it is necessary to accurately express the large number of fabric components of dust with many short fibers, which makes the process time-consuming. Hsu [1995] proposed modeling and rendering techniques for dusty surfaces based on a dust amount prediction function. These techniques describe dust accumulation only as a shading function, however, and cannot express the volume of dust on the surfaces. In this study, we present a novel method to model and render the appearance and volume of dust in real time by using shell texturing. Each shell texture, which can express several components, is automatically generated by our procedural approach. Therefore, we can draw an arbitrary appearance of dust rapidly and interactively by controlling only a few simple parameters.

DOI

Scopus
Affective music recommendation system using input images

Shoto Sasaki, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 90 2013年 [査読有り]

　概要を見る

Music that matches our current mood can create a deep impression, which we usually want to enjoy when we listen to music. However, we do not know which music best matches our present mood. We have to listen to each song, searching for music that matches our mood. As it is difficult to select music manually, we need a recommendation system that can operate affectively. Most recommendation methods, such as collaborative filtering or content similarity, do not target a specific mood. In addition, there may be no word exactly specifying the mood. Therefore, textual retrieval is not effective. In this paper, we assume that there exists a relationship between our mood and images because visual information affects our mood when we listen to music. We now present an affective music recommendation system using an input image without textual information.

DOI

Scopus

5

被引用数

(Scopus)
Photorealistic aged face image synthesis by wrinkles manipulation

Ai Mizokawa, Hiroki Nakai, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 64 2013年 [査読有り]

　概要を見る

Many studies on aged face image synthesis have been reported, aimed at security applications such as searching for criminals or kidnapped children and at entertainment applications such as movies or video games.

DOI

Scopus

6

被引用数

(Scopus)
Driver drowsiness estimation using facial wrinkle feature

Taro Nakamura, Tatsuhide Matsuda, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 30 2013年 [査読有り]

　概要を見る

In recent years, the rate of fatal motor vehicle accidents caused by distracted driving resulting from factors such as sleeping at the wheel has been increasing. Therefore, an alert system that detects driver drowsiness and prevents accidents by warning drivers before they fall asleep is urgently required. Non-contact measuring systems using computer vision techniques have been studied, and in a vision-based approach it is important to decide what kind of features to use for estimating drowsiness.

DOI

Scopus

3

被引用数

(Scopus)
Photorealistic inner mouth expression in speech animation

Masahide Kawai, Tomoyori Iwao, Daisuke Mima, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 9 2013年 [査読有り]

　概要を見る

We often see close-ups of CG characters' faces in movies or video games. In such situations, the quality of a character's face (mainly in dialogue scenes) primarily determines that of the entire movie. Creating highly realistic speech animation is essential because viewers watch these scenes carefully. In general, such speech animations are created manually by skilled artists. However, creating them requires a considerable effort and time.

DOI

Scopus

3

被引用数

(Scopus)
Generating eye movement during conversations using markov process

Tomoyori Iwao, Daisuke Mima, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 6 2013年 [査読有り]

　概要を見る

Generating realistic eye movements is a significant topic in the field of computer graphics (CG) content production. Appropriately modeling and synthesizing eye movements is very difficult because they have many important features. Gu et al. [2007] proposed a method for automatically synthesizing realistic eye movements during conversations according to probability models. Although eye movements during conversations include both saccades and fixational eye movements (FEMs), they synthesized only saccades, which are relatively large eye movements.
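As a rough illustration of driving eye-movement states with a Markov process, the sketch below samples one state per frame from a transition table. The three state names and every probability are hypothetical placeholders chosen for the example; they are not values or models from the poster.

```python
import random

# Hypothetical transition probabilities between eye-movement states.
# These numbers are placeholders for illustration, not measured data.
TRANSITIONS = {
    "fixation": {"fixation": 0.80, "saccade": 0.15, "blink": 0.05},
    "saccade":  {"fixation": 0.90, "saccade": 0.05, "blink": 0.05},
    "blink":    {"fixation": 0.85, "saccade": 0.15, "blink": 0.00},
}


def generate_states(n_frames, start="fixation", seed=0):
    """Sample a per-frame sequence of states from the Markov chain above."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    state, sequence = start, []
    for _ in range(n_frames):
        sequence.append(state)
        options = TRANSITIONS[state]
        # Draw the next state in proportion to its transition probability.
        state = rng.choices(list(options), weights=list(options.values()))[0]
    return sequence


states = generate_states(200)
```

Each sampled state would then select a motion model for that frame, for example a large gaze jump for a saccade and small jitter for a fixational eye movement.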

DOI

Scopus

1

被引用数

(Scopus)
Expressive dance motion generation

Narumi Okada, Kazuki Okami, Tsukasa Fukusato, Naoya Iwamoto, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 4 2013年 [査読有り]

　概要を見る

The power of expression such as accent in motion and movement of arms is an indispensable factor in dance performance because there is a large difference in appearance between natural dance and expressive motions. Needless to say, expressive dance motion makes a great impression on viewers. However, creating such a dance motion is challenging because most of the creators have little knowledge about dance performance. Therefore, there is a demand for a system that generates expressive dance motion with ease. Tsuruta et al. [2010] generated expressive dance motion by changing only the speed of input motion or altering joint angles. However, the power of expression was not evaluated with certainty, and the generated motion did not synchronize with music. Therefore, the generated motion did not always satisfy the viewers.

DOI

Scopus
Efficient speech animation synthesis with vocalic lip shapes

Daisuke Mima, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2013 Posters, SIGGRAPH 2013 2 2013年 [査読有り]

　概要を見る

Computer-generated speech animations are commonly seen in video games and movies. Although high-quality facial motions can be created by the hand crafted work of skilled artists, this approach is not always suitable because of time and cost constraints. A data-driven approach [Taylor et al. 2012], such as machine learning to concatenate video portions of speech training data, has been utilized to generate natural speech animation, while a large number of target shapes are often required for synthesis. We can obtain smooth mouth motions from prepared lip shapes for typical vowels by using an interpolation of lip shapes with Gaussian mixture models (GMMs) [Yano et al. 2007]. However, the resulting animation is not directly generated from the measured lip motions of someone's actual speech.

DOI

Scopus
Real-time hair simulation on mobile device

Zhuopeng Zhang, Shigeo Morishima

Proceedings - Motion in Games 2013, MIG 2013 127 - 132 2013年 [査読有り]

　概要を見る

Hair rendering and simulation are fundamental to the representation of virtual characters. However, the intensive calculation of the dynamics of thousands of hair strands makes the task very challenging, especially on a portable device. The aim of this short paper is to perform real-time hair simulation and rendering on a mobile device. The hair simulation and rendering process is adapted to the properties of mobile device hardware. To increase the number of simulated hair strands, we adopt the Dynamic Follow-The-Leader (DFTL) method and extend it with our new interpolation method. We also devise a rendering strategy based on a survey of the limitations of mobile GPUs. Lastly, we present a novel method that achieves order-independent transparency at a relatively low cost. © 2013 ACM.

DOI

Scopus

1

被引用数

(Scopus)
Automatic Mash Up Music Video Generation System by Remixing Existing Video Content

Hayato Ohya, Shigeo Morishima

2013 INTERNATIONAL CONFERENCE ON CULTURE AND COMPUTING (CULTURE AND COMPUTING 2013) 157 - 158 2013年 [査読有り]

　概要を見る

A music video is a short film that presents a visual representation of music. Recently, amateur users have increasingly been creating music videos on video sharing websites. In particular, a music video created by cutting and pasting existing video is called a mashup music video. In this paper, we propose a system with which users can easily create mashup music videos from existing music videos, and we conduct an evaluation experiment on the system. The system first extracts music features and video features from existing music videos. Then, each feature is clustered and the relationship between the features is learned by a Hidden Markov Model. Finally, the system cuts the learned video scene whose features are closest and pastes it in synchronization with the input song. The experiment shows that our method can generate more synchronized video than a previous method.

DOI

Scopus

3

被引用数

(Scopus)
Affective Music Recommendation System Reflecting the Mood of Input Image

Shoto Sasaki, Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

2013 INTERNATIONAL CONFERENCE ON CULTURE AND COMPUTING (CULTURE AND COMPUTING 2013) 153 - 154 2013年 [査読有り]

　概要を見る

We present an affective music recommendation system using input images without textual information. Music that matches our current mood can create a deep impression. However, we do not know which music best matches our present mood. As it is difficult to select music manually, we need a recommendation system that can operate affectively. In this paper, we assume that there exists a relationship between our mood and images because visual information affects our mood when we listen to music. Our system matches an input image with music using valence-arousal plane which is an emotional plane.
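A minimal sketch of matching on a valence-arousal plane, assuming the image and each track have already been mapped to a (valence, arousal) point. Nearest-neighbor matching by Euclidean distance is an assumption made for illustration, and the track names and coordinates are hypothetical; the abstract does not specify the matching rule.

```python
import math


def closest_track(image_va, tracks):
    """Return the track whose (valence, arousal) point is nearest to the image's."""
    return min(tracks, key=lambda name: math.dist(image_va, tracks[name]))


# Hypothetical coordinates in [-1, 1] x [-1, 1]; not taken from the paper.
tracks = {
    "calm_piano": (0.4, -0.6),
    "upbeat_pop": (0.8, 0.7),
    "dark_ambient": (-0.7, -0.3),
}
image_va = (0.6, 0.5)  # a bright, lively image under this toy mapping
best = closest_track(image_va, tracks)
```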

DOI

Scopus

13

被引用数

(Scopus)
Interactive Aged-Face Simulation with Freehand Wrinkle Drawing

Ai Mizokawa, Akinobu Maejima, Shigeo Morishima

2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013) 765 - 769 2013年 [査読有り]

　概要を見る

Recently, many studies on facial aging synthesis have been reported for security applications, such as investigations of crimes and kidnappings, and for entertainment applications, such as movies or video games. However, the representation of wrinkles, which is one of the most important elements when reflecting age characteristics, remains difficult. Additionally, the influence of lighting conditions and each individual's skin color is significant, and it is difficult to infer the location and shape of future wrinkles because they depend on factors such as one's living environment, eating habits, and DNA. Therefore, we must consider several possibilities for the locations of wrinkles. In this paper, we propose a facial aging synthesis method that can create plausible aged facial images and can represent wrinkles at any desired location by drawing artificial freehand wrinkles while retaining photorealistic quality.

DOI

Scopus
Detection of Driver's Drowsy Facial Expression

Taro Nakamura, Akinobu Maejima, Shigeo Morishima

2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013) 749 - 753 2013年 [査読有り]

　概要を見る

We propose a method for estimating the degree of a driver's drowsiness on the basis of changes in facial expressions captured by an IR camera. Typically, drowsiness is accompanied by drooping eyelids. Therefore, most related studies have focused on tracking eyelid movement by monitoring facial feature points. However, textural changes that arise from frowning are also very important and sensitive features in the initial stage of drowsiness, and it is difficult to detect such changes using facial feature points alone. In this paper, we propose a more precise drowsiness-degree estimation method that takes wrinkle changes into account by calculating local edge intensity on the face, which expresses drowsiness more directly in the initial stage.
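One plausible way to compute a local edge intensity is the mean Sobel gradient magnitude over a face patch. The sketch below works on a grayscale image stored as a list of rows. The choice of the Sobel operator, the patch layout, and the toy images are assumptions for illustration, not the paper's exact feature.

```python
def local_edge_intensity(image, top, left, height, width):
    """Mean Sobel gradient magnitude over a rectangular patch of a grayscale image."""
    total, count = 0.0, 0
    # Skip the one-pixel image border, where the 3x3 kernel does not fit.
    for y in range(max(top, 1), min(top + height, len(image) - 1)):
        for x in range(max(left, 1), min(left + width, len(image[0]) - 1)):
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            total += (gx * gx + gy * gy) ** 0.5
            count += 1
    return total / count if count else 0.0


# Toy 6x6 patches: flat skin versus vertical stripes standing in for wrinkles.
smooth = [[100] * 6 for _ in range(6)]
wrinkled = [[100 if (x // 2) % 2 == 0 else 160 for x in range(6)] for _ in range(6)]

smooth_score = local_edge_intensity(smooth, 0, 0, 6, 6)
wrinkled_score = local_edge_intensity(wrinkled, 0, 0, 6, 6)
```

A rise in this score within a region such as the area between the eyebrows could then serve as one input to a drowsiness estimator.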

DOI

Scopus

19

被引用数

(Scopus)
Facial Aging Simulator Based on Patch-based Facial Texture Reconstruction

Akinobu Maejima, Ai Mizokawa, Daiki Kuwahara, Shigeo Morishima

2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013) 732 - 733 2013年 [査読有り]

　概要を見る

We propose a facial aging simulator which can synthesize a photorealistic human aged-face image for criminal investigation. Our aging simulator is based on the patch-based facial texture reconstruction with a wrinkle aging pattern model. The advantage of our method is to synthesize an aged-face image with detailed skin texture such as spots and somberness of facial skin, as well as age-related facial wrinkles without blurs that are derived from lack of accurate pixel-wise alignments as in the linear combination model, while maintaining the identity of the original face.

DOI

Scopus
Automatic Mash up Music Video Generation System by Perceptual Synchronization of Music and Video Features

Tatsunori Hirai, Hayato Ohya, Shigeo Morishima

Special Interest Group on Computer Graphics and Interactive Techniques Conference, SIGGRAPH '12, Los Angeles, CA, USA, 2012, Poster Proceedings 449:1 2012年08月 [査読有り]
顔形状の制約を付加した Linear Predictors に基づく特徴点自動検出

松田龍英, 原朋也, 前島謙宣, 森島繁生

電子情報通信学会論文誌. D, 情報・システム = The IEICE transactions on information and systems (Japanese edition) 95 ( 8 ) 1530 - 1540 2012年08月 [査読有り]

　概要を見る

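Applications that use face images have become increasingly common in recent years, and many of them extract features with reference to facial feature points. Methods that detect feature points accurately from face images are therefore needed. In this study, we propose a new facial feature point detection method that adds geometric constraints to the Linear Predictors proposed by Eng-Jon et al. Linear Predictors use linear regression to relate the intensity values around a pixel of interest to the displacement vector toward the correct feature point position, and they obtain accurate estimated displacement vectors from about 20 training images. The proposed method defines a valid range for each feature point relative to the centroid of each facial part and constrains the feature points so that they do not leave this range during displacement vector estimation, which improves detection accuracy. In addition, by estimating the face orientation angle in advance and selecting training data according to the estimate, the method enables pose-independent feature point detection.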

CiNii J-GLOBAL
Acquiring shell textures from a single image for realistic fur rendering

Hiroaki Ukaji, Takahiro Kosaka, Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

ACM SIGGRAPH 2012 Posters, SIGGRAPH'12 100 2012年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Fast-automatic 3D face generation using a single video camera

Tomoya Hara, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2012 Posters, SIGGRAPH'12 91 2012年 [査読有り]

DOI

Scopus
Facial aging simulator considering geometry and patch-tiled texture

Yusuke Tazoe, Hiroaki Gohara, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2012 Posters, SIGGRAPH'12 90 2012年 [査読有り]

　概要を見る

People can estimate the approximate age of others by looking at their faces. This is because faces have certain elements by which people can judge a person's age. If computers could extract and manipulate such information, a wide variety of applications for entertainment and security purposes could be expected. © 2012 ACM.

DOI

Scopus

50

被引用数

(Scopus)
Analysis and synthesis of realistic eye movement in face-to-face communication

Tomoyori Iwao, Daisuke Mima, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2012 Posters, SIGGRAPH'12 87 2012年 [査読有り]

DOI

Scopus

3

被引用数

(Scopus)
3D human head geometry estimation from a speech

Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2012 Posters, SIGGRAPH'12 85 2012年 [査読有り]

　概要を見る

We can visualize an acquaintance's appearance just by hearing their voice if we have met them in the past few years. Thus, it would appear that some relationship exists between voice and appearance. If 3D head geometry could be estimated from a voice, applications such as avatar generation and character modeling for video games could be realized. Many researchers have reported on the relationship between acoustic features of a voice and the corresponding dynamic visual features, including lip, tongue, and jaw movements or vocal articulation during speech; however, there have been few reports on the relationship between acoustic features and static 3D head geometry. In this paper, we focus on estimating 3D head geometry from a voice. Acoustic features vary depending on the speech context and its intonation. Therefore, we restrict the context to the five Japanese vowels. Under this assumption, to estimate 3D head geometry, we use a Feedforward Neural Network (FNN) trained on the correspondence between an individual's acoustic features extracted from a Japanese vowel and 3D head geometry generated from a 3D range scan. The performance of our method is shown by both closed and open tests. As a result, we found that 3D head geometry that is acoustically similar to an input voice can be estimated under the limited condition. © 2012 ACM.

DOI

Scopus
Hair motion capturing from multiple view videos

Tsukasa Fukusato, Naoya Iwamoto, Shoji Kunitomo, Hirofumi Suda, Shigeo Morishima

ACM SIGGRAPH 2012 Posters, SIGGRAPH'12 58 2012年 [査読有り]

DOI

Scopus
Automatic music video generating system by remixing existing contents in video hosting service based on hidden Markov model

Hayato Ohya, Shigeo Morishima

ACM SIGGRAPH 2012 Posters, SIGGRAPH'12 47 2012年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Rapid and authentic rendering of translucent materials using depth-maps from multi-viewpoint

Takahiro Kosaka, Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

SIGGRAPH Asia 2012 Posters, SA 2012 45 2012年 [査読有り]

　概要を見る

We present a real-time method for rendering translucent materials with complex shapes by precisely estimating the object's thickness between the light source and the viewpoint. Wang et al. [2010] have already proposed a real-time rendering method that handles arbitrary shapes, but it requires such large computational costs and graphics memory that it is very difficult to implement in a practical rendering pipeline. Inside a translucent object, the attenuation of incident light energy depends strongly on the object's optical thickness. Translucent Shadow Maps (TSM) [2003] can compute an object's thickness using a depth map at the light position. However, TSM cannot calculate thickness accurately for concave objects. In this paper, we propose a novel technique to compute an object's thickness precisely and, as a result, achieve real-time rendering of translucent materials with complex shapes by adding only one rendering pass to conventional TSM. Copyright is held by the author / owner(s).
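The thickness idea can be sketched with a single depth value per light ray: the depth map stores where light enters the object, and the shaded point's own depth from the light gives where it is observed. The exponential (Beer-Lambert) falloff used here is a simple stand-in for the scattering model, and the function names and numbers are hypothetical.

```python
import math


def thickness_from_light(entry_depth, point_depth):
    """Distance travelled inside the object along the light ray.

    entry_depth: depth stored in the light-space depth map (first surface hit).
    point_depth: depth of the shaded point measured from the light.
    """
    return max(0.0, point_depth - entry_depth)


def transmitted_fraction(thickness, extinction):
    """Fraction of light surviving the path, using exponential attenuation."""
    return math.exp(-extinction * thickness)


d = thickness_from_light(entry_depth=2.0, point_depth=3.5)
fraction = transmitted_fraction(d, extinction=2.0)
```

With a concave object the ray can leave and re-enter the material between the two depths, so this single difference overestimates the distance travelled inside it, which is the inaccuracy the abstract attributes to conventional TSM.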

DOI

Scopus

1

被引用数

(Scopus)
Fast-Accurate 3D Face Model Generation Using a Single Video Camera

Tomoya Hara, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012) 1269 - 1272 2012年 [査読有り]

　概要を見る

In this paper, we present a new method to generate a 3D face model based on both data-driven and structure-from-motion approaches. By considering a 2D frontal face image constraint, a 3D geometric constraint, and a likelihood constraint, we are able to reconstruct the subject's face model accurately, robustly, and automatically. Using our method, it is possible to create a 3D face model in 5.8 seconds simply by freely shaking one's head in front of a single video camera.
Automatic Face Replacement for a Humanoid Robot with 3D Face Shape Display

Akinobu Maejima, Takaaki Kuratate, Brennand Pierce, Shigeo Morishima, Gordon Cheng

2012 12TH IEEE-RAS INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS (HUMANOIDS) 469 - 474 2012年 [査読有り]

　概要を見る

In this paper, we propose a method to apply any new face to a retro-projected 3D face system, the Mask-bot, which we have developed as a human-robot interface. The robot face using facial animation projected onto a 3D face mask can be quickly replaced by a new face based on a single frontal image of any person. Our contribution is to apply an automatic face replacement technique with the modified texture morphable model fitting to the 3D face mask. Using our technique, a face model displayed on Mask-bot can be automatically replaced within approximately 3 seconds, which makes Mask-bot widely suitable to applications such as video conferencing and cognitive experiments.

DOI

Scopus

8

被引用数

(Scopus)
Curvature-approximated Estimation of Real-time Ambient Occlusion.

Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

GRAPP & IVAPP 2012: Proceedings of the International Conference on Computer Graphics Theory and Applications and International Conference on Information Visualization Theory and Applications, Rome, Italy, 24-26 February, 2012 268 - 273 2012年 [査読有り]
Development of an integrated multi-modal communication robotic face

Brennand Pierce, Takaaki Kuratate, Akinobu Maejima, Shigeo Morishima, Yosuke Matsusaka, Marko Durkovic, Klaus Diepold, Gordon Cheng

2012 IEEE WORKSHOP ON ADVANCED ROBOTICS AND ITS SOCIAL IMPACTS (ARSO) 104 - + 2012年 [査読有り]

　概要を見る

This paper presents an overview of the new version of our multi-modal communication face "Mask-Bot", a rear-projected animated robotic head, including our display system, face animation, speech communication and sound localization.

DOI

Scopus

4

被引用数

(Scopus)
笑顔表出過程の表情の動きと受け手の印象の相関分析

藤代裕紀, 前島謙宣, 森島繁生

電子情報通信学会論文誌. A, 基礎・境界 = The transactions of the Institute of Electronics, Information and Communication Engineers. A 95 ( 1 ) 128 - 135 2012年01月 [査読有り]

　概要を見る

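Facial expressions play an important role in smooth communication between people. Smiles in particular have been studied extensively because they give a positive impression. Several studies on the impressions conveyed by smiles have been reported, but most of them dealt with still images. Recently, however, it has been pointed out that not only the final expression but also the process of forming it strongly influences the impression on the other person. In this study, we focus on the naturalness of a smile and investigate whether the movements of the eyes, cheeks, and mouth during the formation of the expression contribute to the impression of naturalness. We also aim to identify experimentally, for real smiles, objective indices that give rise to the naturalness of a smile, and to create more natural synthesized smile videos by synthesizing expressions based on these indices. Furthermore, through subjective evaluation experiments on the synthesized videos, we show the validity of the correspondence between the objective indices and the naturalness of a smile.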

CiNii J-GLOBAL
シーンの連続性と顔類似度に基づく動画コンテンツ中の同一人物登場シーンの同定

平井辰典, 中野倫靖, 後藤真孝, 森島繁生

映像情報メディア学会誌(Web) 66 ( 7 ) 251 - 259 2012年 [査読有り]

J-GLOBAL
顔・人体メディアが拓く新産業の画像技術

川出雅人, 持丸正明, 森島繁生

映像情報メディア学会誌 65 ( 11 ) 1534 - 1544 2011年11月 [査読有り]
Curvature-Dependent Reflectance Function for Interactive Rendering of Subsurface Scattering

森島繁生

The International Journal of Virtual Reality 10 ( 1 ) 41 - 47 2011年05月 [査読有り]

DOI
新映像技術「ダイブイントゥザムービー」

森島繁生, 八木康史, 中村哲, 伊勢史郎, 向川康博, 槇原靖, 間下以大, 近藤一晃, 榎本成悟, 川本真一, 四倉達夫, 池田雄介, 前島謙宣, 久保尋之

電子情報通信学会誌 = The journal of the Institute of Electronics, Information and Communication Engineers 94 ( 3 ) 250 - 268 2011年03月 [査読有り]

　概要を見る

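This article describes "Dive into the Movie", a technology for an entirely new form of video content in which audience members themselves become characters in a movie or similar work and, sometimes watching together with friends or family, become deeply immersed in the story, feeling unprecedented emotion and at times indulging in a sense of heroism. Realizing this requires technology that automatically generates characters closely resembling each audience member without placing any burden on them; character synthesis of sufficient quality to satisfy the feeling of taking part in the story; and high-quality video and sound reproduction and recording technology that creates the illusion of being immersed in the scene. These are the key factors that determine how strongly the audience is moved. The technology originated with "Future Cast", demonstrated experimentally at Expo 2005 Aichi, and has advanced markedly thanks to progress in hardware and to a Special Coordination Funds for Promoting Science and Technology project, supported by the Ministry of Education, Culture, Sports, Science and Technology, that started in 2007. As a result, it has become possible to model the individuality of a wide variety of audience members fully automatically, quickly, and without stress, and to faithfully reflect each person's individuality in the face, whole body, and voice of characters synthesized in real time within the work. At the same time, a system was realized that lets viewers experience immersion in the content from a first-person perspective, with the sound field and viewpoint experienced by the actor.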

CiNii
Example-based Deformation with Support Joints

Kentaro Yamanaka, Akane Yano, Shigeo Morishima

WSCG 2011: COMMUNICATION PAPERS PROCEEDINGS 83 - + 2011年 [査読有り]

　概要を見る

In the field of character animation, many deformation techniques have been proposed. Example-based deformation methods are widely used, especially for interactive applications. Example-based methods are mainly divided into two types. One is interpolation: methods of this type interpolate examples in a pose space. The advantage is that the deformed meshes can precisely correspond to the example meshes. The disadvantage is that a larger number of examples is needed to generate plausible interpolated meshes between examples. The other is example-based skinning, which optimizes particular parameters with reference to examples so as to represent the example meshes as accurately as possible. These methods provide plausible deformations with fewer examples; however, they cannot perfectly reproduce the example meshes. In this paper, we present an idea that combines techniques from the two types, taking advantage of both. We propose an example-based skinning method to be combined with Pose Space Deformation (PSD). It optimizes transformation matrices in Skeleton Subspace Deformation (SSD) by introducing "support joints". Like other example-based skinning methods, our method by itself generates plausible intermediate meshes from a small set of examples. We then explain the benefit of combining our method with PSD. We show that the provided examples are precisely represented and plausible deformations at arbitrary poses are obtained by our integrated method.
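For orientation, Skeleton Subspace Deformation in its basic form places each vertex at a weighted sum of the positions given by its influencing joint transforms. The 2D sketch below shows only that baseline blend; the helper names are hypothetical, and the paper's support joints and optimization of the matrices are not reproduced here.

```python
import math


def rotation(angle_rad, pivot):
    """3x3 homogeneous 2D transform that rotates about a pivot point."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    px, py = pivot
    return [[c, -s, px - c * px + s * py],
            [s, c, py - s * px - c * py],
            [0.0, 0.0, 1.0]]


def skin_vertex(vertex, transforms, weights):
    """Basic SSD: weighted sum of the vertex transformed by each joint."""
    x, y = vertex
    out_x = out_y = 0.0
    for m, w in zip(transforms, weights):
        out_x += w * (m[0][0] * x + m[0][1] * y + m[0][2])
        out_y += w * (m[1][0] * x + m[1][1] * y + m[1][2])
    return out_x, out_y


identity = rotation(0.0, (0.0, 0.0))
quarter_turn = rotation(math.pi / 2, (0.0, 0.0))
blended = skin_vertex((1.0, 0.0), [identity, quarter_turn], [0.5, 0.5])
```

Here `blended` lands near (0.5, 0.5), closer to the joint than the original unit distance. This shrinkage of linearly blended transforms is one reason example-driven corrections are attractive.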
3D reconstruction of detail change on dynamic non-rigid objects

Daichi Taneda, Hirofumi Suda, Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH 2011 Posters, SIGGRAPH'11 56 2011年 [査読有り]

DOI

Scopus
Estimating fluid simulation parameters from videos

Naoya Iwamoto, Ryusuke Sagawa, Shoji Kunitomo, Shigeo Morishima

ACM SIGGRAPH 2011 Posters, SIGGRAPH'11 3 2011年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Real-Time and Interactive Rendering for Translucent Materials Such as Human Skin

Hiroyuki Kubo, Yoshinori Dobashi, Shigeo Morishima

HUMAN INTERFACE AND THE MANAGEMENT OF INFORMATION: INTERACTING WITH INFORMATION, PT 2 6772 388 - 395 2011年 [査読有り]

　概要を見る

To synthesize realistic human animation using computer graphics, it is necessary to simulate subsurface scattering inside human skin. We have developed a curvature-dependent reflectance function (CDRF) that mimics the presence of a subsurface scattering effect. In this approach, we provide only a single parameter that represents the intensity of incident light scattering in a translucent material. We implemented our algorithm as a hardware-accelerated real-time renderer with an HLSL pixel shader. This approach is easily implementable on the GPU and does not require the complicated pre-processing or multi-pass rendering that is often needed in this area of research.

DOI

Scopus
The online gait measurement for characteristic gait animation synthesis

Yasushi Makihara, Mayu Okumura, Yasushi Yagi, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6773 ( 1 ) 325 - 334 2011年 [査読有り]

　概要を見る

This paper presents a method to measure gait features online from gait silhouette images and to synthesize characteristic gait animation for audience-participation digital entertainment. First, both static and dynamic gait features are extracted from the silhouette images captured by an online gait measurement system. Then, key motion data for various gaits are captured and new motion data are synthesized by blending the key motion data. Finally, blend ratios of the key motion data are estimated to minimize gait feature errors between the blended model and the online measurement. In experiments, the effectiveness of gait feature extraction was confirmed using 100 subjects from the OU-ISIR Gait Database, and characteristic gait animations were created based on the measured gait features. © 2011 Springer-Verlag.
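The blend-ratio step can be illustrated for the simplest case of two key motions, where the least-squares ratio has a closed form. The assumption that blended gait features are a linear mix of the key motions' features, and the feature vectors themselves, are simplifications made for this sketch; the paper blends several key motions.

```python
def estimate_blend_ratio(feat_a, feat_b, measured):
    """Least-squares alpha in [0, 1] with alpha*feat_a + (1-alpha)*feat_b close to measured."""
    diff = [a - b for a, b in zip(feat_a, feat_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        return 0.5  # identical key motions: any ratio fits equally well
    alpha = sum((m - b) * d for m, b, d in zip(measured, feat_b, diff)) / denom
    return min(1.0, max(0.0, alpha))  # keep the ratio a valid blend weight


# Hypothetical two-dimensional gait features (for example stride and arm swing).
key_a = [1.0, 0.2]
key_b = [0.0, 0.6]
measured = [0.25, 0.5]
alpha = estimate_blend_ratio(key_a, key_b, measured)
```

The estimated `alpha` would then weight the two key motion clips when synthesizing the participant's gait animation.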

DOI

Scopus

1

被引用数

(Scopus)
Realistic facial animation by automatic individual head modeling and facial muscle adjustment

Akinobu Maejima, Hiroyuki Kubo, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6774 ( 2 ) 260 - 269 2011年 [査読有り]

　概要を見る

We propose a technique for automatically generating realistic facial animation with precise individual facial geometry and characteristic facial expressions. Our method consists of two key processes: the head modeling process automatically generates a whole head model from facial range scan data alone, and the facial animation setup process automatically generates key shapes that represent individual facial expressions, based on physics-based facial muscle simulation with an individual muscle layout estimated from facial expression videos. Facial animations that reflect individual characteristics can be synthesized using the generated head model and key shapes. Experimental results show that the proposed method can generate facial animations in which 84% of subjects can identify themselves. Therefore, we conclude that our head modeling techniques are effective for entertainment systems such as Future Cast. © 2011 Springer-Verlag.

DOI

Scopus

1

被引用数

(Scopus)
Instant movie casting with personality: Dive into the movie system

Shigeo Morishima, Yasushi Yagi, Satoshi Nakamura

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6774 ( 2 ) 187 - 196 2011年 [査読有り]

　概要を見る

"Dive into the Movie (DIM)" is a project that aims to realize an innovative entertainment system providing an immersive experience of a story: audience members participate in the movie as cast members and can share the impression with family or friends while watching it. To realize this system, we are trying to model and capture personal characteristics of the face, body, gait, hair, and voice instantly and precisely. All of the modeling, character synthesis, rendering, and compositing processes have to be performed in real time without any manual operation. In this paper, a novel entertainment system, the Future Cast System (FCS), is introduced as a prototype of DIM. The first experimental demonstration of FCS was performed at the World Exposition 2005, where 1,630,000 people experienced it over 6 months. Finally, the up-to-date DIM system, which realizes a more realistic sensation, is introduced. © 2011 Springer-Verlag.

DOI

Scopus

1

被引用数

(Scopus)
Personalized voice assignment techniques for synchronized scenario speech output in entertainment systems

Shin-Ichi Kawamoto, Tatsuo Yotsukura, Satoshi Nakamura, Shigeo Morishima

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6774 ( 2 ) 177 - 186 2011年 [査読有り]

　概要を見る

The paper describes voice assignment techniques for synchronized scenario speech output in an instant casting movie system that enables anyone to be a movie star using his or her own voice and face. Two prototype systems were implemented, and both systems worked well for various participants, ranging from children to the elderly. © 2011 Springer-Verlag.

DOI

Scopus
Fast rendering of translucent objects using a curvature-dependent reflectance function

久保尋之, 土橋宜典, 森島繁生

電子情報通信学会論文誌. A, 基礎・境界 = The transactions of the Institute of Electronics, Information and Communication Engineers. A 93 ( 11 ) 708 - 717 2010年11月 [査読有り]

　概要を見る

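Accounting for the subsurface scattering that occurs inside an object is an effective way to render translucent materials such as human skin and marble in CG. However, a physically accurate model of subsurface scattering requires a global illumination model, which remains far from real-time rendering. The effect of subsurface scattering is considered to be especially pronounced at points where the surface undulates strongly. This study therefore introduces curvature as a measure of the strength of surface undulation. Because curvature is a parameter determined locally on the surface, a local illumination model can be applied and fast rendering is achieved. First, for a sphere of radius r with the same material parameters as the given object, the radiance distribution on the sphere surface is estimated under a single white directional light source. Since the curvature of a sphere of radius r is 1/r, examining the radiance distributions for spheres of various radii yields a reflectance function g_r(θ_i, κ) of the curvature κ and the angle θ_i between the light direction and the normal, which is stored as a lookup table (LUT). At render time, the LUT is referenced as a texture for a polygon model whose curvature has been computed in advance, so that the effect of subsurface scattering can be rendered in real time.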
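
The lookup step in this abstract can be illustrated with a minimal sketch. It assumes a table indexed by quantized angle θ_i and curvature κ. The table contents below are a placeholder "wrap lighting" formula chosen only for illustration, not the measured sphere radiance the abstract describes, and the names `build_placeholder_lut` and `lookup_reflectance` are hypothetical.

```python
import math


def build_placeholder_lut(n_theta=64, n_kappa=16, kappa_max=1.0):
    """Build an (n_kappa x n_theta) table of stand-in reflectance values.

    ASSUMPTION: a real table would store radiance estimated on spheres of
    radius r = 1/kappa. A simple wrap-lighting formula stands in for it here.
    """
    lut = []
    for j in range(n_kappa):
        kappa = kappa_max * j / (n_kappa - 1)
        row = []
        for i in range(n_theta):
            theta = math.pi * i / (n_theta - 1)
            wrap = kappa  # higher curvature lets light wrap further around
            row.append(max(0.0, (math.cos(theta) + wrap) / (1.0 + wrap)))
        lut.append(row)
    return lut


def lookup_reflectance(lut, theta_i, kappa, kappa_max=1.0):
    """Nearest-neighbour lookup of g_r(theta_i, kappa) in the table."""
    n_kappa, n_theta = len(lut), len(lut[0])
    theta_i = min(max(theta_i, 0.0), math.pi)   # clamp to the table range
    kappa = min(max(kappa, 0.0), kappa_max)
    i = round(theta_i / math.pi * (n_theta - 1))
    j = round(kappa / kappa_max * (n_kappa - 1))
    return lut[j][i]
```

In a renderer the same lookup would be a 2D texture fetch per fragment, with the curvature supplied as a precomputed vertex attribute. Nearest-neighbour indexing is used here only to keep the sketch short.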

CiNii
The effects of virtual characters on audiences' movie experience

Tao Lin, Shigeo Morishima, Akinobu Maejima, Ningjiu Tang

INTERACTING WITH COMPUTERS 22 ( 3 ) 218 - 229 2010年05月 [査読有り]

　概要を見る

In this paper, we first present a new audience-participating movie form in which 3D virtual characters of audiences are constructed by computer graphics (CG) technologies and are embedded into a pre-rendered movie as different roles. Then, we investigate how the audiences respond to these virtual characters using physiological and subjective evaluation methods. To facilitate the investigation, we present three versions of a movie to an audience: a Traditional version, its SDIM version with the participation of the audience's virtual character, and its SFDIM version with the co-participation of the audience and her/his friends' virtual characters. The subjective evaluation results show that the participation of virtual characters indeed causes increased subjective sense of spatial presence and engagement, and emotional reaction; moreover, SFDIM performs significantly better than SDIM, due to the co-participation of friends' virtual characters. Also, we find that the audiences experience not only significantly different galvanic skin response (GSR) changes on average (changing trend over time and number of fluctuations), but they also show increased phasic GSR responses to the appearance of their own or friends' virtual 3D characters on the screen. The evaluation results demonstrate the success of the new audience-participating movie form and contribute to understanding how people respond to virtual characters in a role-playing entertainment interface. (C) 2009 Elsevier B.V. All rights reserved.

DOI

Scopus

8

被引用数

(Scopus)
Automatic generation of head models and facial animations considering personal characteristics

Akinobu Maejima, Hiroto Yarimizu, Hiroyuki Kubo, Shigeo Morishima

Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST 71 - 78 2010年 [査読有り]

　概要を見る

We propose a new automatic head modeling system to generate individualized head models which can express person-specific facial expressions. The head modeling system consists of two core processes. The head modeling process, with the proposed automatic mesh completion, generates a whole head model from facial range scan data alone. The key shape generation process generates key shapes for the generated head model based on physics-based facial muscle simulation with an individual muscle layout estimated from the subject's facial expression videos. Facial animations that reflect personal characteristics can be synthesized using the individualized head model and key shapes. Experimental results show that the proposed system can generate head models in which 84% of subjects can identify themselves. Therefore, we conclude that our head modeling system is effective for games and entertainment systems such as the Future Cast System. Copyright © 2010 by the Association for Computing Machinery, Inc.

DOI

Scopus

5

被引用数

(Scopus)
Curvature-dependent reflectance function for rendering translucent materials

Hiroyuki Kubo, Yoshinori Dobashi, Shigeo Morishima

ACM SIGGRAPH 2010 Talks, SIGGRAPH '10 a46 2010年 [査読有り]

　概要を見る

Simulating sub-surface scattering is one of the most effective ways for realistically synthesizing translucent materials such as marble, milk and human skin. In previous work, the method developed by Jensen et al. [2002] significantly improved the speed of the simulation. However, the process is still not fast enough to produce realtime rendering. Thus, we have developed a curvature-dependent reflectance function (CDRF) which mimics the presence of a subsurface scattering effect.

DOI

Scopus

5

被引用数

(Scopus)
Local illumination modeling of ambient occlusion by curvature-based approximation of occlusion

服部智仁, 久保尋之, 森島繁生

情報処理学会研究報告(CD-ROM) 2009 ( 6 ) 122:1 2010年 [査読有り]

DOI J-GLOBAL

Scopus

4

被引用数

(Scopus)
Optimization of cloth simulation parameters by considering static and dynamic features

Shoji Kunitomo, Shinsuke Nakamura, Shigeo Morishima

ACM SIGGRAPH 2010 Posters, SIGGRAPH '10 15:1 2010年 [査読有り]

　概要を見る

Realistic drape and motion of virtual clothing is now possible with an up-to-date cloth simulator, but it is still difficult and time-consuming to tune the many parameters needed to achieve the authentic look of a particular real fabric. Bhat et al. [2003] proposed a way to estimate the parameters from video data of real fabrics. However, their method projects structured light patterns onto the fabrics, so it might not estimate the parameters accurately if the fabrics have colors and textures. In addition to the structured light patterns, they use a motion capture system to track how the fabrics move. In this paper, we introduce a new method that uses only a motion capture system, attaching a few markers to the fabric surface without any other devices. With this method, animators can easily estimate the parameters of many kinds of fabrics. An authentic look and motion of simulated fabrics is realized by minimizing an error function between captured motion data and synthetic motion, considering both static and dynamic cloth features. © ACM 2010.
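
The idea of fitting simulation parameters by minimizing an error between captured and synthetic motion can be sketched on a toy problem. The example assumes a single damped spring in place of a cloth simulator and a grid search in place of whatever optimizer the authors used. It covers only a dynamic trajectory error, not the static features mentioned above. All function names are hypothetical.

```python
def simulate(stiffness, damping=0.5, steps=200, dt=0.01):
    """Semi-implicit Euler for a unit-mass damped spring, x(0)=1, v(0)=0.

    ASSUMPTION: this one-dimensional spring is a stand-in for a cloth
    simulator; the returned list plays the role of a marker trajectory.
    """
    x, v = 1.0, 0.0
    trajectory = []
    for _ in range(steps):
        a = -stiffness * x - damping * v
        v += a * dt
        x += v * dt
        trajectory.append(x)
    return trajectory


def trajectory_error(simulated, captured):
    """Sum of squared differences between two marker trajectories."""
    return sum((p - q) ** 2 for p, q in zip(simulated, captured))


def fit_stiffness(captured, candidates):
    """Return the candidate stiffness whose simulation best matches the capture."""
    return min(candidates, key=lambda k: trajectory_error(simulate(k), captured))
```

For example, a "captured" trajectory generated with `simulate(40.0)` is recovered by `fit_stiffness(captured, [10.0, 20.0, 30.0, 40.0, 50.0])`. A real cloth model has many coupled parameters, so a gradient-free or gradient-based optimizer would replace the grid search.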

DOI

Scopus

13

被引用数

(Scopus)
Data driven in-betweening for hand drawn rotating face

Hiroaki Gohara, Shiori Sugimoto, Shigeo Morishima

ACM SIGGRAPH 2010 Posters, SIGGRAPH '10 7:1 2010年 [査読有り]

　概要を見る

In anime production, key-frames are drawn precisely by an artist, and a great number of in-between frames are then drawn by hand by assistants. However, drawing many characters, especially with face rotation, is seriously time-consuming and requires skill. In this paper, we propose an automatic in-betweening technique for the rotating face of a hand-drawn character using only a front image and a diagonal image (Fig. 1). Baxter [2009] generated in-betweens using an image morphing technique. However, that approach does not reflect the artist's style and touch. Accordingly, we reflect style and touch using a morphing technique trained on the artist's own database, introduced specifically to generate rotational in-between faces. This database contains the center of gravity of each part (right eye, left eye, nose, mouth, eyebrow) and the contours on the facial image. © ACM 2010.

DOI

Scopus

3

被引用数

(Scopus)
A skinning technique considering the shape of human skeletons

Hirofumi Suda, Kentaro Yamanaka, Shigeo Morishima

ACM SIGGRAPH 2010 Posters, SIGGRAPH '10 4:1 2010年 [査読有り]

　概要を見る

We propose a skinning technique to improve expressive power of Skeleton Subspace Deformation (SSD) by adding the influence of the shape of skeletons to the deformation result by postprocessing. © ACM 2010.

DOI

Scopus
Learning arm motion strategies for balance recovery of humanoid robots

Masaki Nakada, Brian Allen, Shigeo Morishima, Demetri Terzopoulos

Proceedings - EST 2010 - 2010 International Conference on Emerging Security Technologies, ROBOSEC 2010 - Robots and Security, LAB-RS 2010 - Learning and Adaptive Behavior in Robotic Systems 165 - 170 2010年 [査読有り]

　概要を見る

Humans are able to robustly maintain balance in the presence of disturbances by combining a variety of control strategies using posture adjustments and limb motions. Such responses can be applied to balance control in two-armed bipedal robots. We present an upper-body control strategy for improving balance in a humanoid robot. Our method improves on lower-body balance techniques by introducing an arm rotation strategy (ARS). The ARS uses Q-learning to map sensed state to the appropriate arm control torques. We demonstrate successful balance in a physically-simulated humanoid robot, in response to perturbations that overwhelm lower-body balance strategies alone. © 2010 IEEE.
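
The Q-learning step mentioned in this abstract can be illustrated with a minimal tabular sketch. The state discretization, the three torque levels in `ACTIONS`, the reward, and the learning-rate and discount values are all assumptions made for illustration. The abstract does not specify how the ARS encodes state or torque.

```python
import random

# ASSUMPTION: three discrete arm torque levels (arbitrary units).
ACTIONS = (-1.0, 0.0, 1.0)


def q_update(q, state, action_idx, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q value."""
    best_next = max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action_idx] += alpha * (td_target - q[state][action_idx])
    return q[state][action_idx]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice of an index into ACTIONS for the sensed state."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q[state]
    return row.index(max(row))
```

Here `q` is a list of rows, one per discretized sensed state, each holding one value per torque level. After training, `ACTIONS[choose_action(q, s, 0.0, rng)]` gives the greedy torque for state `s`.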

DOI

Scopus

10

被引用数

(Scopus)
Interactive shadow design tool for Cartoon animation-KAGEZOU

Sugimoto, S., Nakajima, H., Shimotori, Y., Morishima, S.

Journal of WSCG 18 ( 1-3 ) 25 - 32 2010年 [査読有り]
Interactive shadowing for 2D Anime

Eiji Sugisaki, Hock Soon Seah, Feng Tian, Shigeo Morishima

COMPUTER ANIMATION AND VIRTUAL WORLDS 20 ( 2-3 ) 395 - 404 2009年06月 [査読有り]

　概要を見る

In this paper, we propose an instant shadow generation technique for 2D animation, especially Japanese Anime. In traditional 2D Anime production, the entire animation including shadows is drawn by hand, so it takes a long time to complete. Shadows play an important role in the creation of symbolic visual effects. However, shadows are not always drawn due to time constraints and lack of animators, especially when the production schedule is tight. To solve this problem, we develop an easy shadowing approach that enables animators to easily create a layer of shadow and its animation based on the character's shapes. Our approach is both instant and intuitive. The only inputs required are character or object shapes in the input animation sequence with the alpha value generally used in the Anime production pipeline. First, shadows are automatically rendered on a virtual plane by using a Shadow Map based on these inputs. Then the rendered shadows can be edited by simple operations and simplified by the Gaussian Filter. Several special effects such as blurring can be applied to the rendered shadow at the same time. Compared to existing approaches, ours is more efficient and effective to handle automatic shadowing in real-time. Copyright (C) 2009 John Wiley & Sons, Ltd.

DOI

Scopus

5

被引用数

(Scopus)
Perception of facial expressions after orthodontic treatment using computer graphics

寺田員人, 吉田満, 佐野奈都貴, 齋藤功, 宮永美知代, 森島繁生, 胡敏

顎顔面バイオメカニクス学会誌 14 ( 1 ) 1 - 13 2009年03月 [査読有り]
Dive into the movie: an instant casting and immersive experience in the story.

Shigeo Morishima

Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST 2009, Kyoto, Japan, November 18-20, 2009 9 2009年 [査読有り]

DOI
Muscle-based facial animation considering fat layer structure captured by MRI.

Hiroto Yarimizu, Yasushi Ishibashi, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings 2009年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Example based skinning with progressively optimized support joints

Kentaro Yamanaka, Akane Yano, Shigeo Morishima

ACM SIGGRAPH ASIA 2009 Posters, SIGGRAPH ASIA '09 55:1 2009年 [査読有り]

　概要を見る

Skeleton-Subspace Deformation (SSD), which is the most popular method for articulated character animation, often causes some artifacts. Animators have to edit mesh each time, which is seriously tedious and time-consuming. So example based skinning has been proposed. It employs edited mesh as target poses and generates plausible animation efficiently. In this technique, character mesh should be deformed to accurately fit target poses. Mohr et al. [2003] introduced additional joints. They expect animators to embed skeleton precisely.
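
For reference, the baseline SSD blend that this abstract starts from can be sketched as a weighted sum of per-joint rigid transforms applied to a rest-pose vertex. The sketch is in 2D for brevity, which is an assumption, and it does not implement the example-based method or the support joints. The helper names are hypothetical.

```python
import math


def rotation_about(pivot, angle):
    """Return a 2D rigid transform (a, b, c, d, tx, ty) rotating about pivot."""
    c, s = math.cos(angle), math.sin(angle)
    px, py = pivot
    tx = px - c * px + s * py
    ty = py - s * px - c * py
    return (c, -s, s, c, tx, ty)


def apply_transform(transform, point):
    """Apply x' = a*x + b*y + tx, y' = c*x + d*y + ty."""
    a, b, c, d, tx, ty = transform
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def ssd_vertex(rest_position, transforms, weights):
    """Blend the transformed rest position with per-joint skinning weights."""
    x = y = 0.0
    for transform, weight in zip(transforms, weights):
        px, py = apply_transform(transform, rest_position)
        x += weight * px
        y += weight * py
    return (x, y)
```

With an identity transform for one bone and a 90 degree rotation about (1, 0) for the other, the vertex (2, 0) weighted 0.5/0.5 lands at (1.5, 0.5). That point lies inside the arc a true rotation would trace, a small instance of the artifacts the abstract refers to.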

DOI

Scopus

3

被引用数

(Scopus)
Accurate skin deformation model of forearm using MRI

Kentaro Yamanaka, Shinsuke Nakamura, Shota Kobayashi, Akane Yano, Masashi Shiraishi, Shigeo Morishima

SIGGRAPH 2009: Posters, SIGGRAPH '09 2009年 [査読有り]

　概要を見る

This paper presents a new methodology for constructing a skin deformation model using MRI and generating accurate skin deformations based on the model. Many methods to generate skin deformations have been proposed, and they are classified into three main types. The first type is anatomically based modeling. Anatomically accurate deformations can be reconstructed, but computation time is long and controlling the generated motion is difficult. In addition, modeling the whole body is very difficult. The second is skeleton-subspace deformation (SSD). SSD is easy to implement and fast to compute, so it is the most common technique today. However, accurate skin deformations cannot easily be realized with SSD. The last type consists of data-driven approaches, including example-based methods. In order to construct our model from MRI images, we employ an example-based method. Using examples obtained from medical images, skin deformations can be modeled in relation to skeleton motions. Retargeting generated motions to other characters is generally difficult with this kind of method. Kurihara and Miyata realize accurate skin deformations from CT images [Kurihara et al. 2004], but they do not mention the possibility of retargeting. With our model, however, generated deformations can be retargeted. Once the model is constructed, accurate skin deformations are easily generated by applying our model to a skin mesh. In our experiment, we construct a skin deformation model which reconstructs pronosupination, the rotational movement of the forearm, and we use range scan data as the skin mesh to which we apply our model and generate accurate skin deformations.

DOI

Scopus
Expressive facial subspace construction from key face selection.

Ryo Takamizawa, Takanori Suzuki, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings 2009年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Directable anime-like shadow based on water mapping filter.

Yohei Shimotori, Shiori Sugimoto, Shigeo Morishima

International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings 2009年 [査読有り]

DOI

Scopus
Characteristic gait animation synthesis from single view silhouette

Shinsuke Nakamura, Masashi Shiraishi, Shigeo Morishima, Mayu Okumura, Yasushi Makihara, Yasushi Yagi

SIGGRAPH 2009: Posters, SIGGRAPH '09 2009年 [査読有り]

　概要を見る

Characteristics of human motion, such as walking, running or jumping, vary from person to person. Differences in human motion enable people to identify themselves or a friend. However, it is challenging to generate animation in which individual characters exhibit characteristic motion using computer graphics. Our goal is to construct a system that synthesizes characteristic gait animation automatically. As a result, when crowd animation is generated, for instance, motion with variation can be produced using our system. In our system, we first acquire a silhouette image as input data using a video camera. Second, we extract a gait feature from the single-view silhouette. Finally, we automatically synthesize 3D gait animation by blending a small number of motion data [KOVAR, L et al 2003]. The blending weight is estimated automatically from the gait feature.

DOI
Human head modeling based on fast-automatic mesh completion

Akinobu Maejima, Shigeo Morishima

ACM SIGGRAPH ASIA 2009 Posters, SIGGRAPH ASIA '09 53:1 2009年 [査読有り]

　概要を見る

The need to rapidly create 3D human head models is still an important issue in game and film production. Blanz et al have developed a morphable model which can semi-automatically reconstruct the facial appearance (3D shape and texture) and simulated hairstyles of "new" faces (faces not yet scanned into an existing database) using photographs taken from the front or other angles [Blanz et al. 2004]. However, this method still requires manual marker specification and approximately 4 minutes of computational time. Moreover, the facial reconstruction produced by this system is not accurate unless a database containing a large variety of facial models is available. We have developed a system that can rapidly generate human head models using only frontal facial range scan data. Where it is impossible to measure the 3D geometry accurately (as with hair regions) the missing data is complemented using the 3D geometry of the template mesh (TM). Our main contribution is to achieve the fast mesh completion for the head modeling based on the "Automatic Marker Setting" and the "Optimized Local Affine Transform (OLAT)". The proposed system generates a head model in approximately 8 seconds. Therefore, if users utilize a range scanner which can quickly produce range data, it is possible to generate a complete 3D head model in one minute using our system on a PC.

DOI

Scopus

2

被引用数

(Scopus)
Curvature-dependent local illumination approximation for translucent materials.

Hiroyuki Kubo, Mai Hariu, Shuhei Wemler, Shigeo Morishima

International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings 2009年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Aging model of human face by averaging geometry and filtering texture in database.

Satoko Kasai, Shigeo Morishima

International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings 2009年 [査読有り]

DOI CiNii
A natural smile synthesis from an artificial smile.

Hiroki Fujishiro, Takanori Suzuki, Shinya Nakano, Akinobu Maejima, Shigeo Morishima

International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2009, New Orleans, Louisiana, USA, August 3-7, 2009, Poster Proceedings 2009年 [査読有り]

DOI
AUTOMATIC VOICE ASSIGNMENT TOOL FOR INSTANT CASTING MOVIE SYSTEM

Yoshihiro Adachi, Shinichi Kawamoto, Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS 1897 - + 2009年 [査読有り]

　概要を見る

In the Instant Casting movie System, a personal CG character is automatically generated. The character resembles a participant in face geometry and texture. However, the voice of the character was a substitute voice determined by the gender of the participant, which is sometimes not enough to convey the personality of a CG character. In this paper, an automatic pre-scored voice assignment tool for a personal CG character is presented. Voice is as essential as facial features for identifying a personal character. Our proposed system selects the voice most similar to the participant's from a voice database, and assigns it as the voice of the CG character. The voice similarity criterion is a combination of eight acoustic features. After voice data is assigned to a personal character, the voice track is played back in synchronization with the movement of the CG character. 60 voice variations have been prepared in our voice database. The validity of the assigned voice has been evaluated by MOS value. The proposed method has achieved 68% of the theoretical figure calculated by preliminary experiments.

DOI

Scopus

3

被引用数

(Scopus)
"Dive into the Movie" audience-driven immersive experience in the story

Shigeo Morishima

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E91D ( 6 ) 1594 - 1603 2008年06月 [査読有り]

　概要を見る

"Dive into the Movie (DIM)" is a project that aims to realize an innovative entertainment system providing an immersive experience of a story: audience members watch a movie in which they all take part as cast members, and can share the impression with family and friends. To realize this system, several techniques are needed to model and capture personal characteristics of face, body, gesture, hair and voice instantly, by combining computer graphics, computer vision and speech signal processing. All of the modeling, casting, character synthesis, rendering and compositing processes have to be performed in real time without any operator. In this paper, a novel entertainment system, the Future Cast System (FCS), is first introduced, which can create a DIM movie with audience participation by replacing the faces of the original roles in a pre-created CG movie with the audience members' own highly realistic 3D CG faces. Then the effects of the DIM movie on audience experience are evaluated subjectively. The result suggests that most of the participants seek higher realism, impression and satisfaction through replacement of not only the face but also body, hair and voice. The first experimental trial demonstration of FCS was performed at the Mitsui-Toshiba pavilion of the 2005 World Exposition in Aichi, Japan. 1,640,000 people experienced this event during the 6 months of the exhibition, and FCS became one of the most popular events at Expo 2005.

DOI

Scopus

5

被引用数

(Scopus)
Instant casting movie theater: The Future Cast Systems

Akinobu Maejima, Shuhei Wemler, Tamotsu Machida, Masao Takebayashi, Shigeo Morishima

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E91D ( 4 ) 1135 - 1148 2008年04月 [査読有り]

　概要を見る

We have developed a visual entertainment system called "Future Cast" which enables anyone to easily participate in a pre-recorded or pre-created film as an instant CG movie star. This system provides audiences with the amazing opportunity to join the cast of a movie in real-time. The Future Cast System can automatically perform all the processes required to make this possible, from capturing participants' facial characteristics to rendering them into the movie. Our system can also be applied to any movie created using the same production process. We conducted our first experimental trial demonstration of the Future Cast System at the Mitsui-Toshiba pavilion at the 2005 World Exposition in Aichi Japan.

DOI

Scopus

16

被引用数

(Scopus)
Synthesizing facial animation using dynamical property of facial muscle

Hiroyuki Kubo, Yasushi Ishibashi, Akinobu Maejima, Shigeo Morishima

SIGGRAPH'08 - ACM SIGGRAPH 2008 Posters 110 2008年 [査読有り]

DOI

Scopus
Hair animation and styling based on 3D range scanning data

Shiori Sugimoto, Shigeo Morishima

SIGGRAPH'08 - ACM SIGGRAPH 2008 Posters 107 2008年 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Automatic and accurate mesh fitting based on 3D range scanning data

Shinya Nakano, Yusuke Nonaka, Akinobu Maejima, Shigeo Morishima

SIGGRAPH'08 - ACM SIGGRAPH 2008 Posters 104 2008年 [査読有り]

DOI

Scopus
3D facial animation from high speed video

Takanori Suzuki, Yasushi Ishibashi, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

SIGGRAPH'08 - ACM SIGGRAPH 2008 Posters 1 2008年 [査読有り]

DOI

Scopus
An empirical study of bringing audience into the movie

Tao Lin, Akinobu Maejima, Shigeo Morishima

SMART GRAPHICS, PROCEEDINGS 5166 70 - 81 2008年 [査読有り]

　概要を見る

In this paper we first present an audience-participating movie experience, DIM, in which the photo-realistic 3D virtual actor of the audience is constructed by computer graphics technologies, and then evaluate the effects of DIM on audience experience using physiological and subjective methods. The empirical results suggest that the participation of virtual actors causes increased subjective sense of presence and engagement, and more intensive emotional responses as compared to the traditional movie form; interestingly, there are also significantly different physiological responses caused by the participation of virtual actors, objectively indicating the improvement of interaction between audience and movie.

DOI

Scopus

4

被引用数

(Scopus)
Post-recording tool for instant casting movie system

Shin-Ichi Kawamoto, Shigeo Morishima, Tatsuo Yotsukura, Satoshi Nakamura

MM'08 - Proceedings of the 2008 ACM International Conference on Multimedia, with co-located Symposium and Workshops 893 - 896 2008年 [査読有り]

　概要を見る

This paper proposes a universal user-friendly post-recording tool for an Instant Casting Movie System (ICS) that enables anyone to be a movie star using his or her own voice and face. A personal CG character is automatically generated by scanning one's face geometry and image in ICS. Voice is as essential as the face for identifying a person. However, a character's voice is only based on gender in ICS. We propose a novel voice recording tool that lets participants of all ages record in a short time. Post-recording tasks are very difficult because speakers must speak in synchronization with the mouth movements of the CG characters. Therefore this task is generally performed by professional voice actors. Our proposed tool has the following four features: 1) various supporting information that helps users synchronize their voice with mouth movement timing; 2) automatic post-processing of recorded voices for compositing mixed audio; 3) an intuitive display of operations for people of all ages; and 4) handling of multiple users in parallel for quick recording. We developed a prototype speech synchronization system using the post-recording tool and conducted subjective evaluation experiments on it. Over 60% of the subjects responded that the tool's interface can be operated easily. © 2009 IEEE.

DOI

Scopus
Perceptual similarity measurement of speech by combination of acoustic features

Yoshihiro Adachi, Shinichi Kawamoto, Shigeo Morishima, Satoshi Nakamura

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 4861 - + 2008年 [査読有り]

　概要を見る

The Future Cast System is a new entertainment system in which a participant's face is captured and rendered into the movie as an instant Computer Graphics (CG) movie star. It was first exhibited at the 2005 World Exposition in Aichi, Japan. We are working to add new functionality which enables mapping not only faces but also speech individualities to the cast. Our approach is to find a speaker with the closest speech individuality and apply voice conversion. This paper investigates acoustic features to estimate perceptual similarity of speech individuality. We propose a method that linearly combines eight acoustic features related to the perception of speech individualities. The proposed method optimizes weights for the acoustic features considering perceptual similarities. We have evaluated the performance of our method with Spearman's rank correlation coefficients against perceptual similarities. The experiments show that the proposed method achieves a correlation coefficient of 0.66.
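
The two ingredients named in this abstract, a linear combination of per-feature distances and Spearman's rank correlation as the evaluation measure, can be sketched as follows. The sketch assumes the eight per-feature distances are already computed and that there are no tied ranks. It does not reproduce the authors' weight optimization, and the function names are hypothetical.

```python
def combined_distance(feature_distances, weights):
    """Weighted linear combination of per-feature distances between two voices."""
    if len(feature_distances) != len(weights):
        raise ValueError("one weight per acoustic feature is required")
    return sum(w * d for w, d in zip(weights, feature_distances))


def ranks(values):
    """Return 1-based ranks of the values (ASSUMPTION: no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)), valid without ties."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

One would compute `combined_distance` for each speaker pair, then compare those values with listener similarity judgments using `spearman`. Candidate weight vectors could be scored this way, though the search procedure itself is left open here.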

DOI

Scopus

12

被引用数

(Scopus)
Directable and stylized hair simulation

Yosuke Kazama, Eiji Sugisaki, Shigeo Morishima

GRAPP 2008: PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS THEORY AND APPLICATIONS 316 - 321 2008年 [査読有り]

　概要を見る

Creating natural-looking hair motion is considered to be one of the most difficult and time-consuming challenges in CG animation. A detailed physics-based model is essential in creating convincing hair animation. However, hair animation created using detailed hair dynamics might not always be the result desired by creators. For this reason, a hair simulation system that is both detailed and editable is required in contemporary Computer Graphics. In this paper, we therefore propose the use of an External Force Field (EFF) to construct hair motion using a motion capture system. Furthermore, we have developed a system for editing the hair motion obtained using this process. First, the environment around a subject is captured using a motion capture system and the EFF is defined. Second, we apply our EFF-based hair motion editing system to produce creator-oriented hair animation. Consequently, our editing system enables creators to develop the desired hair animation intuitively without physical discontinuity.
Preliminary evaluation of the audience-driven movie

Tao Lin, Akinobu Maejima, Shigeo Morishima, Atsumi Imamiya

Conference on Human Factors in Computing Systems - Proceedings 3273 - 3278 2008年 [査読有り]

　概要を見る

In this paper we introduce an audience-driven theater experience, DIM Movie, in which the audience participates in a pre-created CG movie as its roles, and report the subjective and physiological evaluations of the audience experience offered by DIM movie. Specifically, we present three different experiences to an audience: a traditional movie, its Self-DIM (SDIM) version with the audience's participation, and its Self-Friend-DIM (SFDIM) version with co-participation of the audience and his friends. The evaluation results show that the DIM movies (SDIM and SFDIM) elicit greater subjective sense of presence, engagement, and emotional reaction, and stronger physiological response (galvanic skin response, GSR) as compared with the traditional movie form; moreover, audiences show a phasic GSR increase responding to the appearance of their own or friends' CG characters on the movie screen.

DOI

Scopus

1

被引用数

(Scopus)
Using subjective and physiological measures to evaluate audience-participating movie experience

Tao Lin, Akinobu Maejima, Shigeo Morishima

Proceedings of the Workshop on Advanced Visual Interfaces AVI 49 - 56 2008年 [査読有り]

　概要を見る

In this paper we subjectively and physiologically investigate the effects of the audiences' 3D virtual actor in a movie on their movie experience, using the audience-participating movie DIM as the object of study. In DIM, the photo-realistic 3D virtual actors of the audience are constructed by combining current computer graphics (CG) technologies and can act different roles in a pre-rendered CG movie. To facilitate the investigation, we presented three versions of a CG movie to an audience: a Traditional version, its Self-DIM (SDIM) version with the participation of the audience's virtual actor, and its Self-Friend-DIM (SFDIM) version with the co-participation of the audience and his friends' virtual actors. The results show that the participation of the audience's 3D virtual actors indeed causes increased subjective sense of presence and engagement, and emotional reaction; moreover, SFDIM performs significantly better than SDIM, due to increased social presence. Interestingly, when watching the three movie versions, subjects experienced not only significantly different galvanic skin response (GSR) changes on average (changing trend over time and number of fluctuations), but they also experienced a phasic GSR increase when watching their own and friends' virtual 3D actors appearing on the movie screen. These results suggest that the participation of the 3D virtual actors in a movie can improve interaction and communication between audience and the movie. Copyright 2008 ACM.

DOI

Scopus

7

被引用数

(Scopus)
Tweakable Shadows for Cartoon Animation

Hidehito Nakajima, Eiji Sugisaki, Shigeo Morishima

WSCG 2007, FULL PAPERS PROCEEDINGS I AND II 233 - 240 2007年 [査読有り]

　概要を見る

The role of shadow is important in cartoon animation. Shadows in hand-drawn animation reflect the expression of the animator's style, rather than mere physical phenomena. However, shadows in 3DCG cannot express such an animator's touch. In this paper we propose a novel method for editing the shadow that combines the advantages of hand-drawn animation and 3DCG technology. In particular, we discuss two phases that enable animators to transform and deform the shadow tweakably. In the first phase, a shadow projection matrix is applied to deform the shape of the shadow. In the second, we manipulate vectors to transform the shadow, for example by scaling and translation. These vectors are used in the shadow volume method. The shadows are edited directably by integrating these two phases. Our approach enables animators to edit the shadow by simple mouse operations. As a result, the animators can not only produce shadows automatically but also reflect their style easily. Once the shape and location of the shadow are decided according to the animator's style in our method, 3DCG techniques can produce consistent shadows interactively as the object moves.
Interactive shade control for cartoon animation.

Yohei Shimotori, Hidehito Nakajima, Eiji Sugisaki, Akinobu Maejima, Shigeo Morishima

34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters 170 2007年 [査読有り]
Hair motion reconstruction using motion capture system.

Takahito Ishikawa, Yosuke Kazama, Eiji Sugisaki, Shigeo Morishima

34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters 78 2007年 [査読有り]
Data-driven efficient production of cartoon character animation

Shigeo Morishima, Shigeru Kuriyama, Shinichi Kawamoto, Tadamichi Suzuki, Masaaki Taira, Tatsuo Yotsukura, Satoshi Nakamura

ACM SIGGRAPH 2007 Sketches, SIGGRAPH'07 76 2007年 [査読有り]

DOI

Scopus

3

被引用数

(Scopus)
Designing a new car body shape by PCA of existing car database.

Tatsunori Hayakawa, Yusuke Sekine, Akinobu Maejima, Shigeo Morishima

34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters 43 2007年 [査読有り]
Variable rate speech animation synthesis.

Akane Yano, Hiroyuki Kubo, Yoshihiro Adachi, Demetri Terzopoulos, Shigeo Morishima

34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters 28 - 28 2007年 [査読有り]
Facial muscle adaptation for expression customization.

Yasushi Ishibashi, Hiroyuki Kubo, Akinobu Maejima, Demetri Terzopoulos, Shigeo Morishima

34. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2007, San Diego, California, USA, August 5-9, 2007, Posters 26 - 26 2007年 [査読有り]
Acoustic features for estimation of perceptional similarity

Yoshihiro Adachi, Shinichi Kawamoto, Shigeo Morishima, Satoshi Nakamura

ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2007 4810 306 - + 2007年 [査読有り]

　概要を見る

This paper describes an examination of acoustic features for the estimation of perceptional similarity between speeches. We firstly extract some acoustic features including personality from speeches of 36 persons. Secondly, we calculate each distance between extracted features using Gaussian Mixture Model (GMM) or Dynamic Time Warping (DTW), and then we sort speeches based on the physical similarity. On the other hand, there is the permutation based on the perceptional similarity which is sorted according to the subject. We evaluate the physical features by the Spearman's rank correlation coefficient with two permutations. Consequently, the results show that DTW distance with high STRAIGHT Cepstrum is an optimum feature for estimation of perceptional similarity.

DOI

Scopus
Car designing tool using multivariate analysis

Yusuke Sekine, Akinobu Maejima, Eiji Sugisaki, Shigeo Morishima

ACM SIGGRAPH 2006 Research Posters, SIGGRAPH 2006 94 2006年07月 [査読有り]

DOI

Scopus

2

被引用数

(Scopus)
Facial animation by the manipulation of a few control points subject to muscle constraints

Hiroyuki Kubo, Hiroaki Yanagisawa, Akinobu Maejima, Demetri Terzopoulos, Shigeo Morishima

ACM SIGGRAPH 2006 Research Posters, SIGGRAPH 2006 65 2006年07月 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Hair motion re-modeling from cartoon animation sequence

Yosuke Kazama, Eiji Sugisaki, Natsuko Tanaka, Akiko Sato, Shigeo Morishima

ACM SIGGRAPH 2006 Research Posters, SIGGRAPH 2006 4 2006年07月 [査読有り]

DOI

Scopus
Hair motion cloning from cartoon animation sequences

Eiji Sugisaki, Yosuke Kazama, Shigeo Morishima, Natsuko Tanaka, Akiko Sato

COMPUTER ANIMATION AND VIRTUAL WORLDS 17 ( 3-4 ) 491 - 499 2006年07月 [査読有り]

　概要を見る

This paper describes a new approach to create cartoon hair animation that allows users to use existing cel character animation sequences. We demonstrate the generation of cartoon hair animation accentuated in 'anime-like' motions. The novelty of this method is that users can choose the existing cel animation of a character's hair animation and apply environmental elements such as wind to other characters with a three-dimensional structure. In fact, users can reuse existing cartoon sequences as input to endow another character with environmental elements as if both characters exist in the same scene. A three-dimensional character's hair motions are created based on hair motions from input cartoon animation sequences. First, users extract hair shapes at each frame from input sequences from which they then construct physical equations. 'Anime-like' hair motion is created by using these physical equations. Copyright (c) 2006 John Wiley & Sons, Ltd.

DOI

Scopus

5

被引用数

(Scopus)
表情筋制約モデルを用いた少ない制御点の動きからの表情合成

久保尋之, 柳澤博昭, 前島謙宣, 森島繁生

日本顔学会論文誌 2006年01月 [査読有り]
Construction of audio-visual speech corpus using motion-capture system and corpus based facial animation

T Yotsukura, S Morishima, S Nakamura

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E88D ( 11 ) 2477 - 2483 2005年11月 [査読有り]

　概要を見る

An accurate audio-visual speech corpus is indispensable for talking-heads research. This paper presents our audio-visual speech corpus collection and proposes a head-movement normalization method and a facial motion generation method. The audio-visual corpus contains speech data, movie data on faces, and positions and movements of facial organs. The corpus consists of Japanese phoneme-balanced sentences uttered by a female native speaker. An accurate facial capture is realized by using an optical motion-capture system. We captured high-resolution 3D data by arranging many markers on the speaker's face. In addition, we propose a method of acquiring the facial movements and removing head movements by using affine transformation for computing displacements of pure facial organs. Finally, in order to easily create facial animation from this motion data, we propose a technique assigning the captured data to the facial polygon model. Evaluation results demonstrate the effectiveness of the proposed facial motion generation method and show the relationship between the number of markers and errors.

DOI

Scopus

4

被引用数

(Scopus)
Special section on life-like agent and its communication

M Ishizuka, S Morishima

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E88D ( 11 ) 2443 - 2444 2005年11月 [査読有り]

DOI

Scopus
Automatic head-movement control for emotional speech

Shin-Ichi Kawamoto, Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

ACM SIGGRAPH 2005 Posters, SIGGRAPH 2005 28 2005年07月 [査読有り]

　概要を見る

Head movements could be automatically generated from speech data. The expression of head movement could be also controlled by user-defined emotional factors, as shown in the video demonstration.

DOI

Scopus
Speech to talking heads system based on hidden Markov models

Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

ACM SIGGRAPH 2005 Posters, SIGGRAPH 2005 27 2005年07月 [査読有り]

　概要を見る

Facial animation using the proposed talking heads was created from input speech signals, as shown in the video demonstration. We have confirmed facial animations of various facial objects.

DOI

Scopus

1

被引用数

(Scopus)
Future cast system

Shigeo Morishima, Akinobu Maejima, Shuhei Wemler, Tamotsu Machida, Masao Takebayashi

ACM SIGGRAPH 2005 Sketches, SIGGRAPH 2005 20 2005年07月 [査読有り]

DOI

Scopus

4

被引用数

(Scopus)
Simulation-Based Cartoon Hair Animation.

Eiji Sugisaki, Yizhou Yu, Ken Anjyo, Shigeo Morishima

The 13-th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision'2005, WSCG 2005, University of West Bohemia, Campus Bory, Plzen-Bory, Czech Republic, January 31 - February 4, 2005 117 - 122 2005年 [査読有り]
Reconstructing motion using a human structure model.

Shohei Nishimura, Shoichiro Iwasawa, Eiji Sugisaki, Shigeo Morishima

32. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2005, Los Angeles, California, USA, July 31 - August 4, 2005, Posters 109 2005年 [査読有り]

DOI
Quantitative representation of face expression using motion capture system.

Hiroaki Yanagisawa, Akinobu Maejima, Tatsuo Yotsukura, Shigeo Morishima

32. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2005, Los Angeles, California, USA, July 31 - August 4, 2005, Posters 108 2005年 [査読有り]

DOI

Scopus
Interactive speech conversion system cloning speaker intonation automatically.

Yoshihiro Adachi, Shigeo Morishima

32. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2005, Los Angeles, California, USA, July 31 - August 4, 2005, Posters 77 2005年 [査読有り]

DOI
Multimodal translation system using texture-mapped lip-sync images for video mail and automatic dubbing applications

S Morishima, S Nakamura

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING 2004 ( 11 ) 1637 - 1647 2004年09月 [査読有り]

　概要を見る

We introduce a multimodal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion by synchronizing it to the translated speech. This system also introduces both a face synthesis technique that can generate any viseme lip shape and a face tracking technique that can estimate the original position and rotation of a speaker's face in an image sequence. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a 3D wire-frame model that is adaptable to any speaker. Our approach provides translated image synthesis with an extremely small database. The tracking motion of the face from a video image is performed by template matching. In this system, the translation and rotation of the face are detected by using a 3D personal face model whose texture is captured from a video frame. We also propose a method to customize the personal face model by using our GUI tool. By combining these techniques and the translated voice synthesis technique, an automatic multimodal translation can be achieved that is suitable for video mail or automatic dubbing systems into other languages.

DOI

Scopus

7

被引用数

(Scopus)
Mocap+MRI=?

Shoichiro Iwasawa, Kenji Mase, Shigeo Morishima

ACM SIGGRAPH 2004 Posters, SIGGRAPH 2004 116 2004年08月 [査読有り]

　概要を見る

This poster proposes a basic idea for observing the differences between skeletal postures obtained by two methods: postures generated by only an ordinary mocap process, and anatomically and individually accurate skeletal postures generated by an ordinary mocap process together with a medical imaging method such as MRI.

DOI

Scopus
Face expression synthesis based on a facial motion distribution chart

Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

ACM SIGGRAPH 2004 Posters, SIGGRAPH 2004 85 2004年08月 [査読有り]

DOI

Scopus

3

被引用数

(Scopus)
Cartoon hair animation based on physical simulation

Eiji Sugisaki, Yizhou Yu, Ken Anjyo, Shigeo Morishima

ACM SIGGRAPH 2004 Posters, SIGGRAPH 2004 27 2004年08月 [査読有り]

DOI

Scopus

1

被引用数

(Scopus)
Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents.

Shinichi Kawamoto, Hiroshi Shimodaira, Tsuneo Nitta, Takuya Nishimoto, Satoshi Nakamura, Katsunobu Itou, Shigeo Morishima, Tatsuo Yotsukura, Atsuhiko Kai, Akinobu Lee, Yoichi Yamashita, Takao Kobayashi, Keiichi Tokuda, Keikichi Hirose, Nobuaki Minematsu, Atsushi Yamada, Yasuharu Den, Takehito Utsuro, Shigeki Sagayama

Life-like characters - tools, affective functions, and applications. 187 - 212 2004年 [査読有り]
How to capture absolute human skeletal posture

Shoichiro Iwasawa, Kiyoshi Kojima, Kenji Mase, Shigeo Morishima

ACM SIGGRAPH 2003 Sketches and Applications, SIGGRAPH 2003 2003年07月 [査読有り]

　概要を見る

Commercially available motion capture products give us fairly precise movements of human body segments but do not measure enough information to define skeletal posture in its entirety. This sketch describes how to obtain the complete posture of skeletal structure with the help of marker locations relative to bones that are derived from MRI data sets.

DOI

Scopus

1

被引用数

(Scopus)
Face analysis and synthesis for interactive entertainment

Shoichiro Iwasawa, Tatsuo Yotsukura, Shigeo Morishima

IFIP Advances in Information and Communication Technology 112 157 - 164 2003年 [査読有り]

　概要を見る

A stand-in is a common technique for movies and TV programs in foreign languages. The current stand-in, which only substitutes the voice channel, results in awkward matching to the mouth motion. Videophones with automatic voice translation are expected to be widely used in the near future, and they may face the same problem without lip-synchronized speaking face image translation. In this paper, we propose a method to track motion of the face from the video image and then replace the face part or only the mouth part with a synthesized one which is synchronized with synthetic voice or spoken voice. This is one of the key technologies not only for speaking image translation and communication systems, but also for interactive entertainment systems. Finally, an interactive movie system is introduced as an application of an entertainment system. © 2003 by Springer Science+Business Media New York.

DOI
Model-based talking face synthesis for anthropomorphic spoken dialog agent system.

Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura

Proceedings of the Eleventh ACM International Conference on Multimedia, Berkeley, CA, USA, November 2-8, 2003 351 - 354 2003年 [査読有り]

DOI CiNii

Scopus

6

被引用数

(Scopus)
Magical face: Integrated tool for muscle based facial animation

Tatsuo Yotsukura, Mitsunori Takahashi, Shigeo Morishima, Kazunori Nakamura, Hirokazu Kudoh

ACM SIGGRAPH 2002 Conference Abstracts and Applications, SIGGRAPH 2002 212 2002年07月 [査読有り]

　概要を見る

In recent years, tremendous advances have been achieved in the 3D computer graphics used in the entertainment industry, and in the semiconductor technologies used to fabricate graphics chips and CPUs. However, although good reproduction of facial expressions is possible through 3D CG, the creation of realistic expressions and mouth motion is not a simple task.

DOI

Scopus
HyperMask - projecting a talking head onto a real object

T Yotsukura, S Morishima, F Nielsen, K Binsted, C Pinhanez

VISUAL COMPUTER 18 ( 2 ) 111 - 120 2002年04月 [査読有り]

　概要を見る

HyperMask is a system which projects an animated face onto a physical mask worn by an actor. As the mask moves within a prescribed area, its position and orientation are detected by a camera and the projected image changes with respect to the viewpoint of the audience. The lips of the projected face are automatically synthesized in real time with the voice of the actor, who also controls the facial expressions. As a theatrical tool, HyperMask enables a new style of storytelling. As a prototype system, we put a self-contained HyperMask system in a trolley (disguised as a linen cart), so that it projects onto the mask worn by the actor pushing the trolley.

DOI

Scopus

22

被引用数

(Scopus)
Styling and animating human hair

Keisuke Kishi, Shigeo Morishima

Systems and Computers in Japan 33 ( 3 ) 31 - 40 2002年03月 [査読有り]

　概要を見る

Synthesizing facial images by computer graphics (CG) has attracted attention in connection with the current trends toward synthesizing virtual faces and realizing communication systems in cyberspace. In this paper, a method for representing human hair, which is known to be difficult to synthesize in computer graphics, is presented. In spite of the fact that hair is visually important in human facial imaging, it has frequently been replaced by simple curved surfaces or a part of the background. Although the methods of representing hair by mapping techniques have achieved results, such methods are inappropriate in representing motions of hair. Thus, spatial curves are used to represent hair, without using textures or polygons. In addition, hair style design is simplified by modeling hair in units of tufts, which are partially concentrated areas of hair. This paper describes the collision decisions and motion representations in this new hair style design system, the modeling of tufts, the rendering method, and the four-branch (quadtree) method. In addition, hair design using this hair style design system and the animation of wind-blown hair are illustrated. © 2002 Scripta Technica, Syst. Comp. Jpn.

DOI

Scopus
Multi-modal translation system and its evaluation

S Morishima, S Nakamura

FOURTH IEEE INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES, PROCEEDINGS 241 - 246 2002年 [査読有り]

　概要を見る

Speech-to-speech translation has been studied to realize natural human communication beyond language barriers. Toward further multi-modal natural communication, visual information such as face and lip movements will be necessary. In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database. We conduct subjective evaluation tests using the connected digit discrimination test with data with and without audio-visual lip-synchronization. The results confirm the significant quality of the proposed audio-visual translation system and the importance of lip-synchronization.

DOI

Scopus

2

被引用数

(Scopus)
Audio-visual speech translation with automatic lip synchronization and face tracking based on 3-D head model

S Morishima, S Ogata, K Murai, S Nakamura

2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS 2117 - 2120 2002年 [査読有り]

　概要を見る

Speech-to-speech translation has been studied to realize natural human communication beyond language barriers. Toward further multi-modal natural communication, visual information such as face and lip movements will be necessary. In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database. We conduct subjective evaluation by connected digit discrimination using data with and without audiovisual lip-synchronicity. The results confirm the sufficient quality of the proposed audio-visual translation system.

DOI
An open source development tool for anthropomorphic dialog agent - Face image synthesis and lip synchronization

T Yotsukura, S Morishima

PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING 272 - 275 2002年 [査読有り]

　概要を見る

We describe the design and report the development of an open-source toolkit for building an easily customizable anthropomorphic dialog agent. This toolkit consists of four modules for multi-modal dialog integration, speech recognition, speech synthesis, and face image synthesis. In this paper, we focus on the construction of an agent's face image synthesis.

DOI

Scopus

3

被引用数

(Scopus)
Audio-Video Tracking System for Multi-Modal Interface

D. Zotkin, K. Takahashi, T. Yotsukura, S. Morishima, N. Tetsutani

画像電子学会論文誌 30 ( 4 ) 452 - 463 2001年 [査読有り]

　概要を見る

In this paper, a front-end system which uses audio and video information to track people or other sound sources in an ordinary room has been developed. The microphone array is used for determining the spatial location of the sound; the active video camera acquires the image of the area where the sound is detected, detects the people in the image by using skin color, and can zoom in on and track a speaker. Several add-ons to the system include various visualization tools such as on-screen displays of waveforms, correlation plots, spectrum plots, spatial acoustic energy distribution, running time-frequency acoustic energy plots, and the possibility of real-time beamforming with real-time output to the headphones. The system can be used as a front-end for non-encumbering human-computer interaction by video and audio means.

DOI CiNii
Model-based lip synchronization with automatically translated synthetic voice toward a multi-modal translation system

Shin Ogata, Kazumasa Murai, Satoshi Nakamura, Shigeo Morishima

Proceedings - IEEE International Conference on Multimedia and Expo 28 - 31 2001年 [査読有り]

　概要を見る

In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database.

DOI

Scopus

9

被引用数

(Scopus)
Automatic face tracking and model match-move in video sequence using 3D face model

Takafumi Misawa, Kazumasa Murai, Satoshi Nakamura, Shigeo Morishima

Proceedings - IEEE International Conference on Multimedia and Expo 234 - 236 2001年 [査読有り]

　概要を見る

A stand-in is a common technique for movies and TV programs in foreign languages. The current stand-in, which only substitutes the voice channel, results in awkward matching to the mouth motion. Videophones with automatic voice translation are expected to be widely used in the near future, and they may face the same problem without lip-synchronized speaking face image translation. In this paper, we propose a method to track motion of the face from the video image, which is one of the key technologies for speaking image translation. Almost all earlier tracking algorithms aim to detect feature points of the face. However, these algorithms have problems such as blurring of a feature point between frames and occlusion of a feature point by rotation of the head. In this paper, we propose a method which detects movement and rotation of the head given the three-dimensional shape of the face, by template matching using a 3D personal face wireframe model. The evaluation experiments are carried out with measured reference data of the head. The proposed method achieves an average angle error of 0.48. This result confirms the effectiveness of the proposed method.

DOI

Scopus

1

被引用数

(Scopus)
HyperMask: Talking head projected onto moving surface

S Morishima, T Yotsukura

2001 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS 947 - 950 2001年 [査読有り]

　概要を見る

HYPERMASK is a system which projects an animated face onto a physical mask, worn by an actor. As the mask moves within a prescribed area, its position and orientation are detected by a camera, and the projected image changes with respect to the viewpoint of the audience. The lips of the projected face are automatically synthesized in real time with the voice of the actor, who also controls the facial expressions. As a theatrical tool, HYPERMASK enables a new style of storytelling. As a prototype system, we propose to put a self-contained HYPERMASK system in a trolley (disguised as a linen cart), so that it projects onto the mask worn by the actor pushing the trolley.

DOI
Multimodal translation.

Shigeo Morishima, Shin Ogata, Satoshi Nakamura

Auditory-Visual Speech Processing, AVSP 2001, Aalborg, Denmark, September 7-9, 2001 98 - 103 2001年 [査読有り]

CiNii
マルチモーダル擬人化インタフェースとその感性基盤機能

石塚満, 橋本周司, 森島繁生, 小林哲則, 金子正秀, 相澤清晴, 苗村健, 伊庭斉志, 土肥浩

感性的ヒューマンインタフェース公開シンポジウム資料，2000.11.22 99 - 111 2000年11月
3D face expression estimation and generation from 2D image based on a physically constraint model

T Ishikawa, S Morishima, D Terzopoulos

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E83D ( 2 ) 251 - 258 2000年02月 [査読有り]

　概要を見る

Muscle based face image synthesis is one of the most realistic approaches to the realization of a life-like agent in computers. A facial muscle model is composed of facial tissue elements and simulated muscles. In this model, the forces affecting a facial tissue element are calculated from the contraction of each muscle string, so the combination of each muscle contracting force decides a specific facial expression. This muscle parameter is determined on a trial and error basis by comparing the sample photograph and a generated image using our Muscle-Editor to generate a specific face image. In this paper, we propose the strategy of automatic estimation of facial muscle parameters from 2D markers' movements located on a face using a neural network. This corresponds to non-realtime 3D facial motion capturing from a 2D camera image under the physics based condition.
Networked Theater - Movie Production System Based on a Networked Environment

K. Takahashi, J. Kurumisawa, S. Morishima, T. Yotsukura, S. Kawato

Proceedings of the Computer Graphics Annual Conference Series, 2000 (SIGGRAPH2000) 92 2000年 [査読有り]
Human body postures from trinocular camera images

Shoichiro Iwasawa, Jun Ohya, Kazuhiko Takahashi, Tatsumi Sakaguchi, Kazuyuki Ebihara, Shigeo Morishima

Proceedings - 4th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2000 326 - 331 2000年 [査読有り]

　概要を見る

This paper proposes a new real-time method for estimating human postures in 3D from trinocular images. In this method, an upper body orientation detection and a heuristic contour analysis are performed on the human silhouettes extracted from the trinocular images so that representative points such as the top of the head can be located. The major joint positions are estimated based on a genetic algorithm-based learning procedure. 3D coordinates of the representative points and joints are then obtained from the two views by evaluating the appropriateness of the three views. The proposed method implemented on a personal computer runs in real-time. Experimental results show high estimation accuracies and the effectiveness of the view selection process. © 2000 IEEE.

DOI

Scopus

21

被引用数

(Scopus)
Multi-Media Ambiance Communication Based on Actual Moving Pictures.

Tadashi Ichikawa, Tetsuya Yoshimura, Kunio Yamada, Toshifumi Kanamaru, Hiromichi Suga, Shoichiro Iwasawa, Takeshi Naemura, Kiyoharu Aizawa, Shigeo Morishima, Takahiro Saito

Proceedings of the 1999 International Conference on Image Processing, ICIP '99, Kobe, Japan, October 24-28, 1999 36 - 40 1999年 [査読有り]

DOI
Face-To-Face Communicative Avatar Driven by Voice.

Shigeo Morishima, Tatsuo Yotsukura

Proceedings of the 1999 International Conference on Image Processing, ICIP '99, Kobe, Japan, October 24-28, 1999 11 - 15 1999年 [査読有り]

DOI CiNii
Multiple points face-to-face communication in cyberspace using multi-modal agent.

Shigeo Morishima

Human-Computer Interaction: Communication, Cooperation, and Application Design, Proceedings of HCI International '99 (the 8th International Conference on Human-Computer Interaction), Munich, Germany, August 22-26, 1999, Volume 2 2 177 - 181 1999年 [査読有り]

CiNii
Physics-model-based 3D facial image reconstruction from frontal images using optical flow.

Shigeo Morishima, Takahiro Ishikawa, Demetri Terzopoulos

ACM SIGGRAPH 98 Conference Abstracts and Applications, Orlando, Florida, USA, July 19-24, 1998 258 258 1998年 [査読有り]

DOI CiNii
Dynamic modeling of human hair and GUI based hair style designing system.

Keisuke Kishi, Shigeo Morishima

ACM SIGGRAPH 98 Conference Abstracts and Applications, Orlando, Florida, USA, July 19-24, 1998 255 1998年 [査読有り]

DOI CiNii
Facial muscle parameter decision from 2D frontal image

S Morishima, T Ishikawa, D Terzopoulos

FOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2 160 - 162 1998年 [査読有り]

　概要を見る

Muscle based face image synthesis is one of the most realistic approaches to realizing a life-like agent in a computer. The facial muscle model is composed of facial tissue elements and muscles. In this model, the forces affecting a facial tissue element are calculated from the contraction strength of each muscle, so the combination of muscle parameters decides a specific facial expression. Currently, each muscle parameter is decided by a trial and error procedure, comparing the sample photograph and the generated image using our Muscle-Editor to generate a specific face image.
In this paper we propose the strategy of automatic estimation of facial muscle parameters from 2D marker movements using a neural network.
This is also 3D motion estimation from 2D point or flow information in a captured image under the restriction of a physics based face model.

DOI
Real-time human posture estimation using monocular thermal images

S Iwasawa, K Ebihara, J Ohya, S Morishima

AUTOMATIC FACE AND GESTURE RECOGNITION - THIRD IEEE INTERNATIONAL CONFERENCE PROCEEDINGS 492 - 497 1998年 [査読有り]

　概要を見る

This paper introduces a new real-time method to estimate the posture of a human from thermal images acquired by an infrared camera regardless of the background and lighting conditions. Distance transformation is performed for the human body area extracted from the thresholded thermal image for the calculation of the center of gravity. After the orientation of the upper half of the body is obtained by calculating the moment of inertia, significant points such as the top of the head, the tips of the hands and foot are heuristically located. In addition, the elbow and knee positions are estimated from the detected (significant) points using a genetic algorithm based learning procedure.
The experimental results demonstrate the robustness of the proposed algorithm and real-time (faster than 20 frames per second) performance.
Facial image reconstruction by estimated muscle parameter

T Ishikawa, H Sera, S Morishima, D Terzopoulos

AUTOMATIC FACE AND GESTURE RECOGNITION - THIRD IEEE INTERNATIONAL CONFERENCE PROCEEDINGS 342 - 347 1998年 [査読有り]

　概要を見る

Muscle based face image synthesis is one of the most realistic approaches to realizing a life-like agent in a computer.
The facial muscle model is composed of facial tissue elements and muscles. In this model, the forces affecting a facial tissue element are calculated from the contraction strength of each muscle, so the combination of muscle parameters decides a specific facial expression.
Currently, each muscle parameter is decided by a trial and error procedure, comparing the sample photograph and the generated image using our Muscle-Editor to generate a specific face image. In this paper, we propose the strategy of automatic estimation of facial muscle parameters from 2D marker movements using a neural network. This corresponds to non-realtime 3D facial motion tracking from a 2D image under the physics based condition.
Real-time Talking Head Driven by Voice and its Application to Communication and Entertainment.

Shigeo Morishima

Auditory-Visual Speech Processing, AVSP '98, Sydney, NSW, Australia, December 4-6, 1998 195 - 200 1998年 [査読有り]
3D Estimation of Facial Muscle Parameter from the 2D Marker Movement Using Neural Network.

Takahiro Ishikawa, Hajime Sera, Shigeo Morishima, Demetri Terzopoulos

Computer Vision - ACCV'98, Third Asian Conference on Computer Vision, Hong Kong, China, January 8-10, 1998, Proceedings, Volume II 671 - 678 1998年 [査読有り]

DOI CiNii

Scopus

1

被引用数

(Scopus)
Expression recognition and synthesis for face-to-face communication

S Morishima

DESIGN OF COMPUTING SYSTEMS: SOCIAL AND ERGONOMIC CONSIDERATIONS 21 415 - 418 1997年 [査読有り]

　概要を見る

Muscle-based face image synthesis [1] [2] is one of the most realistic approaches to realizing an interface agent. The facial muscle model is composed of facial tissue elements and muscles. In this model, the forces acting on each facial tissue element are calculated from the contraction strength of each muscle, so the combination of muscle parameters determines a specific facial expression.
In this paper, we introduce the facial muscle model and a strategy for automatically estimating facial muscle parameters from 2-D marker movements.
Real-time estimation of human body posture from monocular thermal images

S Iwasawa, K Ebihara, J Ohya, S Morishima

1997 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS 15 - 20 1997年 [査読有り]

　概要を見る

This paper introduces a new real-time method to estimate the posture of a human from thermal images acquired by an infrared camera, regardless of the background and lighting conditions. A distance transformation is applied to the human body area extracted from the thresholded thermal image to calculate the center of gravity. After the orientation of the upper half of the body is obtained by calculating the moment of inertia, significant points such as the top of the head and the tips of the hands and feet are heuristically located. In addition, the elbow and knee positions are estimated from the detected (significant) points using a genetic-algorithm-based learning procedure.
The experimental results demonstrate the robustness of the proposed algorithm and its real-time performance (faster than 20 frames per second).

DOI
Face feature extraction from spatial frequency for dynamic expression recognition

Tatsumi Sakaguchi, Shigeo Morishima

Proceedings - International Conference on Pattern Recognition 3 451 - 455 1996年 [査読有り]

　概要を見る

A new facial feature extraction technique for expression recognition is proposed. We employ spatial frequency domain information to obtain performance that is robust to random noise in an image and to lighting conditions. The technique performed well even when combined with a low-performance region tracking method. As an application of this technique, we have constructed a dynamic facial expression recognition system. We use hidden Markov models to exploit temporal changes in facial expressions. Together, the spatial frequency information and the temporal information improve facial expression recognition rates. In the experiment, we obtained a correct recognition rate of approximately 84.1% over six categories. © 1996 IEEE.

DOI

Scopus

4

被引用数

(Scopus)
Construction and Psychological Evaluation of 3-D Emotion Space

KAWAKAMI FUMIO, YAMADA HIROSHI, MORISHIMA SHIGEO, HARASHIMA HIROSHI

International Journal of Biomedical Soft Computing and Human Sciences: the official journal of the Biomedical Fuzzy Systems Association 1 ( 1 ) 33 - 41 1995年

　概要を見る

The goal of this research is to realize a natural human-machine communication system by giving a facial expression to the computer. To realize this environment, the machine must recognize the human's emotional condition appearing in the human's face and synthesize a reasonable facial image for the machine. For this purpose, the machine should have an emotion model based on parameterized faces that can express an emotional condition quantitatively. In particular, both the mapping and the inverse mapping between the face and the model have to be achieved. Facial expressions are parameterized with the Facial Action Coding System (FACS), which is translated into the motion of the grid points of the face model. An emotional condition is described compactly by a point in a 3D space generated by a 5-layer neural network, and the evaluation results show the high performance of this model.

DOI CiNii
3-D emotion space for interactive communication

F Kawakami, M Ohkura, H Yamada, H Harashima, S Morishima

IMAGE ANALYSIS APPLICATIONS AND COMPUTER GRAPHICS 1024 471 - 478 1995年 [査読有り]

　概要を見る

In this paper, methods for modeling facial expression and emotion are proposed. This emotion model, called the 3-D Emotion Space, can represent both human and computer emotional conditions appearing on the face as a coordinate in a 3-D space.
To construct this 3-D Emotion Space, a 5-layer neural network, which is superior in non-linear mapping performance, is applied. After the network is trained with backpropagation to realize identity mapping, both the mapping from facial expression parameters to the 3-D Emotion Space and the inverse mapping from the Emotion Space to the expression parameters are obtained.
As a result, a system that can both analyze and synthesize facial expressions was constructed.
Moreover, the inverse mapping to facial expressions is assessed by subjective evaluation using the synthesized expressions as test images. The evaluation results demonstrated the high performance of this model in describing natural facial expressions and emotional conditions.
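The identity-mapping idea in this abstract (a 5-layer network whose middle layer has three units, trained with backpropagation to reproduce its input) can be sketched in a few lines of NumPy. This is only an illustrative sketch under assumed settings: the hidden width, learning rate, tanh activations, and the names `train_identity_network`, `encode`, and `decode` are not taken from the paper.

```python
import numpy as np

def forward(params, x):
    """Return activations of all layers; the last layer is linear."""
    acts = [x]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(z if i == len(params) - 1 else np.tanh(z))
    return acts

def train_identity_network(x, hidden=8, code=3, epochs=500, lr=0.05, seed=0):
    """Train input -> hidden -> 3-unit code -> hidden -> output to copy x."""
    rng = np.random.default_rng(seed)
    sizes = [x.shape[1], hidden, code, hidden, x.shape[1]]
    params = [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
              for m, n in zip(sizes[:-1], sizes[1:])]
    losses = []
    for _ in range(epochs):
        acts = forward(params, x)
        err = acts[-1] - x
        losses.append(float(np.mean(err ** 2)))
        grad = 2.0 * err / err.size          # gradient of the mean squared error
        for i in reversed(range(len(params))):
            W, b = params[i]
            if i != len(params) - 1:
                grad = grad * (1.0 - acts[i + 1] ** 2)   # tanh derivative
            grad_W = acts[i].T @ grad
            grad_b = grad.sum(axis=0)
            grad = grad @ W.T                # propagate with the pre-update weights
            params[i] = (W - lr * grad_W, b - lr * grad_b)
    return params, losses

def encode(params, x):
    """Map expression parameters to the 3-D code (the 'emotion space')."""
    return forward(params[:2], np.tanh(x @ params[0][0] + params[0][1])
                   if False else x)[2] if False else forward(params, x)[2]

def decode(params, code):
    """Map a point in the 3-D code back to expression parameters."""
    h = np.tanh(code @ params[2][0] + params[2][1])
    return h @ params[3][0] + params[3][1]
```

After training, `encode` corresponds to the forward mapping from expression parameters to the 3-D space and `decode` to the inverse mapping, so one trained network serves both analysis and synthesis.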

DOI

Scopus

2

被引用数

(Scopus)
Expression and motion control of hair using fast collision detection methods

M Ando, S Morishima

IMAGE ANALYSIS APPLICATIONS AND COMPUTER GRAPHICS 1024 463 - 470 1995年 [査読有り]

　概要を見る

Attempts to generate natural-world objects with computer graphics are now being actively pursued. Normally, huge computing power and storage capacity are necessary to produce realistic and natural movement of human hair. In this paper, a technique to synthesize human hair with short processing time and little storage capacity is discussed. A new Space Curve Model and a Rigid Segment Model are proposed. High-speed collision detection with the human body is also discussed.

DOI

Scopus

5

被引用数

(Scopus)
FACIAL EXPRESSION SYNTHESIS BASED ON NATURAL VOICE FOR VIRTUAL FACE-TO-FACE COMMUNICATION WITH MACHINE

S MORISHIMA, H HARASHIMA

IEEE VIRTUAL REALITY ANNUAL INTERNATIONAL SYMPOSIUM 486 - 491 1993年 [査読有り]

DOI
FACIAL ANIMATION SYNTHESIS FOR HUMAN-MACHINE COMMUNICATION-SYSTEM

S MORISHIMA, H HARASHIMA

HUMAN-COMPUTER INTERACTION, VOL 2 19 1085 - 1090 1993年 [査読有り]
A facial image synthesis system for human-machine interface

Shigeo Morishima, Tatsumi Sakaguchi, Hiroshi Harashima

1992 Proceedings IEEE International Workshop on Robot and Human Communication, ROMAN 1992 363 - 368 1992年

　概要を見る

We're building "face" interface which is a user-friendly human-machine interface with Multi-media and can realize face-to-face communication environment between an operator and a machine. In this system, human natural face appears on the display of machine and can talk to operator with natural voice. This paper describes face motion and expression synthesis schemes which can be applied to this "face" interface. We express a human head with 3D model. The surface model is built by texture mapping with 2D real image. All the motions and expressions are synthesized and controlled automatically by the movement of some feature points on the model. This "face" interface is one of the applications of model based image coding and media conversion schemes we've been studying.

DOI

Scopus

3

被引用数

(Scopus)
IMAGE SYNTHESIS AND EDITING SYSTEM FOR A MULTIMEDIA HUMAN INTERFACE WITH SPEAKING HEAD

S MORISHIMA, H HARASHIMA

INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND ITS APPLICATIONS 354 270 - 273 1992年 [査読有り]
A MEDIA CONVERSION FROM SPEECH TO FACIAL IMAGE FOR INTELLIGENT MAN-MACHINE INTERFACE

S MORISHIMA, H HARASHIMA

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS 9 ( 4 ) 594 - 600 1991年05月 [査読有り]

　概要を見る

An automatic facial motion image synthesis scheme, driven by speech, and a real-time image synthesis design are presented. The purpose of this research is to realize an "intelligent" human-machine interface or "intelligent" communication system with talking head images. A human face is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized naturally by transformation of the lattice points on 3-D wire frames. Two driving motion methods, a text-to-image conversion scheme, and a voice-to-image conversion scheme are proposed in this paper. In the first method, the synthesized head image can appear to speak some given words and phrases naturally. In the latter case, some mouth and jaw motions can be synthesized in synchronization with voice signals from a speaker. Facial expressions, other than mouth shape and jaw position, also can be added at any moment, so it is easy to make the facial model appear angry, to smile, to appear sad, etc., by special modification rules. These schemes were implemented on a parallel image computer system. A real-time image synthesizer was able to generate facial motion images on the display, at a TV image video rate.

DOI

Scopus

81

被引用数

(Scopus)
A realtime model‐based facial image synthesis based on multiprocessor network

Shigeo Morishima, Seiji Kobayashi, Hiroshi Harashima

Systems and Computers in Japan 22 ( 13 ) 59 - 69 1991年 [査読有り]

　概要を見る

Model‐based image coding has been highlighted recently as a high‐efficiency coding method for TV telephone and TV conference systems. In a model‐based coding system, an ultralow‐rate image transmission is realized by obtaining common models of a facial image at both sides of a communication and by transmitting only modification parameters between them. However, it is difficult to realize a realtime processing of a model‐based coding with a conventional iterative‐processing type computer since the amount of material to analyze and synthesize at both sides is very large. Realtime processing is absolutely necessary to realize a practical system using this method, i.e., highspeed processings at both the transmitter and receiver are required. This paper describes a realtime facial image synthesis method based on a multiprocessor construction with a transputer. The transputer is a microcomputer having a communication capability with multiple processors and 10 MIPS CPU performance. Using this system in a pipeline processing with 20 processors, it is possible to synthesize realtime facial images at a rate of about 16 frames per second. If only a part around the lips is transmitted, it is possible to synthesize an image at a rate of about 40 frames per second with a network using five processors. Copyright © 1991 Wiley Periodicals, Inc., A Wiley Company

DOI

Scopus
A facial motion synthesis for intelligent man‐machine interface

Shigeo Morishima, Shin'Ichi Okada, Hiroshi Harashima

Systems and Computers in Japan 22 ( 5 ) 50 - 59 1991年 [査読有り]

　概要を見る

A facial motion image synthesis method for intelligent man‐machine interface is examined. Here, the intelligent man‐machine interface is a kind of friendly man‐machine interface with voices and pictures in which human faces appear on a screen and answer questions, compared to the currently existing user interfaces which primarily uses letters. Thus what appears on the screen is human faces, and if speech mannerisms and facial expressions are natural, then the interactions with the machine are similar to those with actual human beings. To implement such an intelligent man‐machine interface it is necessary to synthesize natural facial expressions on the screen. This paper investigates a method to synthesize facial motion images based on given information on text and emotion. The proposed method utilizes the analysis‐synthesis image coding method. It constructs facial images by assigning intensity data to the parameters of a 3‐dimensional (3‐D) model matching the person in question. Moreover, it synthesizes facial expressions by modifying the 3‐D model according to the predetermined set of rules based on the input phonemes and emotion, and also synthesizes reasonably natural facial images. Copyright © 1991 Wiley Periodicals, Inc., A Wiley Company

DOI

Scopus

2

被引用数

(Scopus)
Automatic Rule Extraction from Statistical Data and Fuzzy Tree Search

Shigeo Morishima, Hiroshi Harashima

Systems and Computers in Japan 19 ( 5 ) 26 - 37 1988年 [査読有り]

　概要を見る

Generally speaking, it is necessary to extract knowledge from an expert in a given discipline and implement this knowledge into a system when constructing an expert system. However, it is not easy to extract knowledge in such fields as medical diagnosis or pattern recognition because the inference logic depends on the experience and intuition of the expert. This paper proposes an automatic rule extraction mechanism using statistical analysis. In this system, production rules are expressed in the form of a threshold function. The threshold function can describe any kind of inference logic and is easily expressed as a linear combination of input vectors and weighting coefficients. Thus the weighting coefficients can be calculated by the same method as a discriminant function. If only one threshold is defined, general Boolean logic can be expressed. Moreover, an ambiguous inference rule can be expressed when multiple threshold levels are defined and a membership function is defined for each category. Further, the Fuzzy Tree Search algorithm, which combines ambiguous inference and tree search, is proposed at the end of this paper. This algorithm can search for and determine an optimum cluster with little calculation and good performance. In practice, a medical diagnostic system for psychiatry, a field with highly ambiguous diagnostic logic, has been constructed based on this algorithm, and inference rules have been extracted automatically. In this experiment, Fuzzy Tree Search was as fast as the tree search technique and performed as well as a full-search clustering technique. Copyright © 1988 Wiley Periodicals, Inc., A Wiley Company
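As a small illustration of the threshold-function form of a rule described in this abstract, the sketch below expresses Boolean AND and OR as a weighted sum compared against a single threshold, and a graded (multi-threshold) variant. The function names, weights, and threshold values are hypothetical examples, not the rules or membership functions extracted in the paper.

```python
import numpy as np

def threshold_rule(x, w, theta):
    """Fire when the weighted sum of the inputs reaches the threshold."""
    return float(np.dot(w, x)) >= theta

def graded_rule(x, w, thresholds):
    """Return how many of the ordered thresholds the weighted sum reaches.

    With several thresholds the output is an ordinal level rather than a
    yes/no decision, which is one simple way to represent graded rules.
    """
    score = float(np.dot(w, x))
    return int(np.sum(score >= np.asarray(thresholds)))

# Boolean logic with a single threshold (illustrative weights).
AND_RULE = (np.array([1.0, 1.0]), 2.0)
OR_RULE = (np.array([1.0, 1.0]), 1.0)
```

For example, `threshold_rule([1, 0], *AND_RULE)` is false while `threshold_rule([1, 0], *OR_RULE)` is true; only the threshold differs between the two rules.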

DOI

Scopus

3

被引用数

(Scopus)


共同研究・競争的資金等の研究課題

音楽構造に基づき個性を反映したメロディ及びモーション生成手法の確立

日本学術振興会科学研究費助成事業

研究期間:

2024年04月

-

2029年03月

浜中雅俊, 森島繁生, 吉井和佳, 北原鉄朗, 長尾確
ライフログにおける写実性と創造性の融合

日本学術振興会科学研究費助成事業

研究期間:

2024年04月

-

2028年03月

吉井和佳, 河原達也, 延原章平, 森島繁生, 西野恒
多元自動通訳システムと評価法に関する研究とその応用展開

日本学術振興会科学研究費助成事業基盤研究(S)

研究期間:

2021年07月

-

2026年03月

中村哲, 河原達也, 戸田智基, 森島繁生, 猿渡洋, SAKTI Sakriani, 松下佳世, 山田優, 高道慎之介, 渡辺太郎, 須藤克仁, 田中宏季, 品川政太朗

　概要を見る

課題1-A)強調を含んだ原音声をend-to-endで対象言語に変換する方式を検討し有効性を示した．形容詞以外の品詞の強調やフォーカスの抽出および翻訳法に関する検討を開始した．音声とビデオの身体的話者性変換の統合システムの改良を実施した．任意の話者および言語にも対応可能な深層ネットワーク統合型音声変換法を提案し，その有効性を示した．任意のフォトリアルな発話表情合成を実現するため，Nerfに基づく従来の3次元顔モデルベースとは異なる，3次元モデルの仲介や長時間レンダリングを必要としない輝度場の機械学習に基づくスーパーフォトリアル顔画像合成法に着手した．B)分野やキーワード等の情報を明に与える形での事前適応について方式調査を行った．マルチモーダル事前学習モデルを用いた予備検討を行い，本タスクに最適化する際の学習効率が課題であることが明らかになった．C)漸進的音声合成の品質と遅延の改善を優先的に実施した．大きく語順の異なる同時音声翻訳に対応するため，構文情報を利用する方法，プレフィクスを利用する方法を提案し有効性を確認した．
課題2-A)遠隔同時通訳において導入されている既存の通訳者支援システムを特定し分析した．B)現在使われている通訳品質評価法について，先行研究の検証や関連機関へのヒアリングを行い，主観評価ではなく客観的に測定可能なデータに基づく通訳品質評価法の確立に向けた作業に着手した．翻訳評価で用いられる枠組みであるMQMを用いた小規模な同時通訳評価アノテーションを実施した．C)同時通訳中における文単位のASSR反応による認知負荷の測定，認知負荷指数の関係の解析を実施した．
課題3-A)原発話・通訳発話のアラインメントツールプロトタイプを構築し評価アノテーションに活用した．C)モジュールの再構築による統合システムの更新を行いIWSLT 2022の同時翻訳共通タスクに参加した．
分散型匿名化処理によるプライバシープリザーブドAI基盤構築

JST 未来社会創造事業

研究期間:

2019年11月

-

2026年03月

斎藤英雄, 森島繁生
認識・生成過程の統合に基づく視聴覚音楽理解

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

2019年04月

-

2023年03月

吉井和佳, 河原達也, 森島繁生

　概要を見る

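In FY2019, as a step toward quantifying music understanding by the auditory system, we first worked on statistical automatic music transcription based on the integration of generative and recognition models. Specifically, for the chord recognition task, we formulated the process by which a sequence of acoustic features is generated from a chord sequence as a probabilistic generative model, and introduced a recognition model that solves the inverse problem, that is, estimates the chord sequence from the acoustic feature sequence, within the framework of amortized variational inference, devising a method that optimizes both models jointly. This enabled semi-supervised learning that also uses audio signals without chord labels. We consider this to correspond to how a person listening to music and recognizing its chords simultaneously imagines what sound those chords would produce and unconsciously checks its consistency with the original music. We also pursued the symbolic side of music. Specifically, for tasks such as piano fingering estimation and melody style conversion, we devised statistical frameworks that introduce fingering models and score models as prior distributions to obtain physically or musically plausible estimates. Furthermore, as a quantification of speech understanding, we developed a speech enhancement method that uses a deep generative model of speech spectra as a prior distribution and also devised a highly accurate and fast blind source separation technique, approaching the quantification of sound understanding from both the source model and the spatial model. Meanwhile, as a first step toward quantifying dance video understanding by the visual system, we began research on human pose estimation in images. In addition, taking instrument sounds as input, we achieved the generation of high-quality, natural performance video that matches the sound. Specifically, using human pose features as an intermediate representation made possible end-to-end learning that maps between different domains such as sound and video of a person.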
スキルやモティベーションを向上させる現実歪曲時空間の解明

日本学術振興会科学研究費助成事業基盤研究(A)

研究期間:

2019年04月

-

2022年03月

森島繁生, 稲見昌彦, 野嶋琢也, 暦本純一, 小池英樹, 持丸正明

　概要を見る

初年度は、スキルやモティベーションを向上させるための歪曲時空間制御に関する要素技術開発および応用先の模索を行った。森島グループは、１枚の着衣全身画像からの高精度な姿勢の３次元モデル生成およびテクスチャ合成に成功し、任意の角度から対象を鑑賞できる３次元視覚コンテンツのためのプレイヤーの効率的なモデリングの可能性を示した。持丸グループは、力学的な介入環境によって運動スキルを向上させる研究基盤として、非侵襲な運動計測技術の開発およびデジタルヒューマン技術による解析環境の開発を行った。マーカレスのビデオ式モーションキャプチャを整備し、有効性の確認を行った。野嶋グループは、スポーツにおけるヒトの行動を変容させるモティベーションデザインに関して、ビデオゲームと物理スポーツの融合手法の提案および観客の観戦手法に関する調査を行った。またAR技術を利用した観戦システムを構築し，その有効性を確認した。稲見グループでは、スポーツスキルの向上を目的とした現実歪曲時空間として、球技の際のボールの動きがスローモーションとなる空間の実装に着手した。まずは、ボールジャグリングのトレーニングを最初のフィールドとして設定し、速度変更可能なVR環境の構築を開始した。小池グループでは、卓球をターゲットとして、プレイヤートラッキングとボールの着弾点予測、および卓球台への実時間プロジェクションを行った。プレイヤーのサーブ動作を1台のRGBカメラで撮影し、深層学習による動作予測システムFuturePoseを用いてボールの着弾点を実時間予測し、卓球台上に実時間投影される。実験の結果、実際の着弾点をほぼ正確に予測できることが示された。暦本グループは、学習者の発話を音声認識により常に確認し、教材に対して正しくshadowを行えていない場合は、スピーキングの速度をゆるめるなど、学習課題の難易度を自動的に調整する技術を開発した。
次世代音声翻訳の研究

日本学術振興会科学研究費助成事業基盤研究(S)

研究期間:

2017年05月

-

2022年03月

中村哲, 河原達也, 猿渡洋, 戸田智基, 森島繁生, 高道慎之介, 須藤克仁, サクリアニサクティ, 吉野幸一郎, 田中宏季, 松本裕治

　概要を見る

課題①A)雑音下音声認識及びその前処理の音声強調処理に関し、独立深層学習行列分析（IDLMA）を提案した。B)単語単位のEnd-to-End音声認識を提案し、従来比30倍以上の高速化を実現した。また，音声認識と音声合成を人間の聴覚と発声器官のように連携させてモデル学習するMachine Speech Chainを提案し有効性を示した．さらに深層学習ベースの新たな漸進的音声認識，音声合成を提案した．C)入力に対して適応的な訳出遅延が可能な新しい方式を考案し，漸進的翻訳の実現可能性を示した．また同時通訳調の順送りの翻訳文を生成する方式を考案し，翻訳結果を順送りの訳に近づけられることを示した．D)機械翻訳の評価において訳出の長さを制御することで字幕等制約のある状況下での翻訳の実現や訳抜けや重複訳の解消を目指す手法の検討を行い，効果を確認した．E)対話制御に関わる多様なモダリティの情報を処理する研究開発を行った．
課題②A)パラ言語情報を保持したまま音声翻訳を実現するため新たな原言語音声から対象言語音声へ直接翻訳する手法について研究した．従来の類似言語間の直接翻訳でなく異なる構造の言語間でも直接音声翻訳を実現する手法を提案した．B)異なる言語の音声データを用いた学習を可能とする統計的声質変換技術を構築するとともに、深層波形生成モデルの導入による高品質化を達成した．
課題③A)奈良先端大の講義アーカイブシステムで翻訳字幕付与の自動化を実現した．B)音声画像翻訳の実現に向けて、特定人物の顔と全身のモデルをインスタントに自動生成し、任意の翻訳言語にシンクロさせて個性を保持したまま発話するアバタ生成技術を発展させた．
課題④同時通訳者の注意に基づく認知負荷の計測に関して取り組んだ．
課題⑤実際の統合システムとして実現するため，パイプ接続型・クライアントサーバ型の2種類のシステムを開発した．
次世代メディアコンテンツ生態系技術の基盤構築と応用展開

JST 戦略的創造研究推進事業 ACCEL

研究期間:

2016年04月

-

2021年03月

後藤真孝, 森島繁生, 吉井和佳, 中村聡史
個人特徴を表現可能な動画像に基づくフォトリアルな顔の分析・合成の研究

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

2014年04月

-

2017年03月

森島繁生

　概要を見る

本研究課題では、主として次の６つの研究成果をあげることができた。
１）犯罪捜査に威力を発揮する経年変化や肥痩変形を考慮したモンタージュシステムの実現、２）１枚の顔画像のみからの小規模なデータベースを利用する高精度な３次元顔形状復元システムの実現、３）エンタテインメントシステム応用としての他言語音声への３次元モデルを一切使用しないビデオリシャッフリングに基づく顔動画像リップシンクシステムの実現、４）個性あるダンスキャラクタを自動生成可能なインスタントキャスティングシステムの実現。５）表情変化時の個性を反映可能な経年変化ビデオ生成システムの実現。６）似顔絵の個性を反映した実写映像復元の実現である。
コンテンツ共生社会のための類似度を可知化する情報環境の実現

JST 戦略的創造研究推進事業 CREST

研究期間:

2011年04月

-

2016年03月

後藤真孝, 森島繁生, 吉井和佳, 中村聡史
人物映像解析による犯罪捜査支援システム

文部科学省科学技術戦略推進費

研究期間:

2010年04月

-

2015年03月

八木康史, 森島繁生, 黄瀬紘一, 和田俊和, 岡田隆三
個性や年齢等の特徴を忠実に表現可能な顔分析・合成モデルの構築

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

2010年

-

2012年

森島繁生, 島田和幸

　概要を見る

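Among the many attributes conveyed by a person's face, we focused on individuality, age, emotion, and health condition, and studied how to represent these features as parameters with high accuracy and realism. First, in collaboration with co-investigator Professor Shimada, we carried out dissections of actual faces and clarified that the arrangement of the facial muscles and the structure of the fat layer differ from individual to individual. Next, we built a large face database consisting of 3D face shapes measured with a range scanner together with frontal snapshots, found that human face shapes follow certain statistical regularities, and attempted to model them. We first developed an algorithm based on linear predictive analysis that stably and robustly extracts feature points such as the eyes and nose from a frontal face. We showed that the 3D structure of an individual's face can be reconstructed with high accuracy from this feature-point information, and built a system that automatically generates a 3D face model very quickly from a photograph captured by a camera. We further showed that age features can be expressed by controlling wrinkles, and built an interactive system that manipulates a person's apparent age by controlling face shape, texture, and wrinkles. Building on these face models, we developed a face authentication system robust to aging, a drowsiness estimation system for drivers, a face image synthesis system based on adding wrinkles, and a speech animation synthesis system, and elucidated a mechanism that can control posed versus natural smiles, thereby providing new findings in facial image processing and contributing greatly to the development of the field.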
コンテンツ制作の高能率化のための要素技術研究

JST 戦略的創造研究推進事業 CREST

研究期間:

2004年04月

-

2010年03月

森島繁生, 安生健一, 中村哲
新映像技術ダイブイントゥザムービーの研究

文部科学省科学技術振興調整費

研究期間:

2006年08月

-

2009年03月

森島繁生, 八木康史, 中村哲, 伊勢史郎
顔表情変化時の動的特徴量に基づくバイオメトリクス個人認証アルゴリズムの研究

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

2006年

-

2009年

森島繁生, 山田寛, 中村哲

　概要を見る

顔画像を対象とした精度の高いバイオメトリクス個人認証の実現を目標として、表情変化時の動的な特徴や3次元顔立体形状特徴などをパラメータとする顔認証システム構築を行った。さらに、加齢に伴う顔特徴量の変化のモデル化・年齢合成、自然な笑いの特徴抽出・再現、正面画像や任意の方向の画像から3次元の顔形状を精度よく復元する手法の検討を実施し、個人認証システムおよび犯罪捜査支援システムの構築に必要な要素技術の開発を行い実用的に有効な成果を残した。
音声映像合成モデルによる英語情動学習支援システム開発のための研究

日本学術振興会科学研究費助成事業基盤研究(C)

研究期間:

2007年

-

2008年

ヤーッコラ伊勢井敏子, 広瀬啓吉, 森島繁生

　概要を見る

情動認知に関しては, 文化差(同時に母語の言語系統の差)が関係し, また, 情動・感情ごとに差異があるということも判明した。情動の認知結果と物理音響特性(距離)の相関に関しては, 基本周波数よりも音圧の方がより相関が高いことも分かった。顔合成と音声との同期は極めて難しい課題であることがわかった。時間変化と音声の変化および顔の変化をすべて統合して合成することは現在の技術では不可能に近いことが判明した。
外科的矯正治療による心理・精神面と表情との変化の関連性に関する研究

日本学術振興会科学研究費助成事業基盤研究(C)

研究期間:

2004年

-

2006年

寺田員人, 森島繁生, 宮永美知代, 七里佳代

　概要を見る

外科的矯正治療前後の心理・精神面と表情表出の変化とその相互作用を調べ、静的、ならびに動的な表情表出の調和を目指した治療体系を構築することを目的に以下のことを行った。
1.外見と心理・精神面とは関係が深く関わっている笑顔について、外科的矯正治療前後における安静時とスマイル時の表情を二次元ならびに三次元計測し、評価した。
その結果、外科的矯正治療により、骨格形態、中顔面部の陥凹観、軟組織のバランスも改善され、軟組織の動きも滑らかとなった。外科的矯正治療後のスマイルでは、頬部、中下顔面部の軟組織の動きが大きくなり、口角が上外側に引き揚げられ、よりきれいな弓なりとなった口裂を形成していた。結果として、スマイル時、大きな軟組織の移動と細かな変化を表出した。
2.顎変形症を有する患者の多くは、下顎の前突感、顔の曲がりなど顔貌の不調和を悩みながら長期に亘り精神的ストレスを抱え、心理・精神面に悪影響を及ぼしていることが予想される。そこで、外科的矯正治療による心理・精神的な変化を調べることを目的に、外科的矯正治療患者を対象に状態・特性不安検査(State-Trait Anxiety Inventory,日本語版STAI)を用いて、調査を行った。
その結果、状態不安、特性不安ともに、対照群とした大学生と比較して良好な状態にあった。
以上の結果から、外科的矯正治療により、硬組織形態ならびに軟組織形態の均整が得られたことにより表情表出時の軟組織の動きが円滑に行われていた。一方、外科的矯正治療後の精神状態が向上し、より安定した状態となっていた。そのことは、外観と心的状態、両者が関連していることが示唆された。
解剖学的アプローチによる高精細・忠実な顔面筋モデルの作成と運動制御

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

2001年

-

2004年

森島繁生, 島田和幸

　概要を見る

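This research attempts to realize highly realistic facial expression synthesis by modeling the human facial muscles faithfully to their actual anatomy, formulating rules from the dynamic characteristics thus obtained, and drawing on the control of this new facial muscle model and the knowledge obtained from it.
First, to define the expression deformation rules, we examined a scheme for physically simulating each facial muscle in the constructed model. A facial muscle with spatial extent is first assumed to be a collection of springs. By solving the equations of motion, the velocity and acceleration of the muscle during contraction are computed and used to control its motion; volume conservation of the muscles, collision detection between muscles, and the effect of reaction forces are also taken into account, yielding a facial muscle motion model as close to reality as possible.
To evaluate the expression deformation rules so defined, we carried out a subjective evaluation of the naturalness of the expressions produced when individual facial muscles were actually moved. We also recorded EMG measurements together with the expressions displayed at the time, actually drove the facial muscle model with the EMG data, and examined whether the synthesized expressions were close to the observed ones. In addition, we proposed a new objective evaluation method that uses speech signals.
We created a face modeling tool that can add individual variations to the average facial muscles. It fits the generic facial muscle model to a captured frontal face image so that an approximate facial muscle model of the person can be created. Furthermore, by actually moving the facial muscles, comparing the synthesized expressions with the person's own expressions, and setting typical combinations of muscle strengths, it became possible to synthesize expressions of joy, anger, sorrow, and pleasure as well as subtle expressions.
We developed a face fitting tool that fits a standard face wireframe to measured 3D range data with simple operations. We also developed a general-purpose facial animation tool that performs expression synthesis on the fitted individual face model and achieves lip sync synchronized with speech. This tool is incorporated into a toolkit for building anthropomorphic spoken dialogue agents.
Part of the expression synthesis application resulting from this research was realized at Expo 2005 Aichi (愛・地球博), held in 2005. The system, proposed as the Future Cast System, is an audience-participation entertainment system, and part of the expression synthesis algorithm from this project was reflected in its synthesis method for the cast.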
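The spring-based muscle simulation mentioned in this summary (treating a muscle as a set of springs and solving the equation of motion to obtain velocity and acceleration during contraction) can be illustrated with a single damped spring. This is a minimal sketch only: the parameter values, the semi-implicit Euler integrator, and the name `simulate_spring` are assumptions for illustration, and volume conservation and collision handling described in the summary are not modeled.

```python
def simulate_spring(x0, rest, k=50.0, c=2.0, m=1.0, dt=0.001, steps=5000):
    """Integrate m*a = -k*(x - rest) - c*v with semi-implicit Euler.

    x0   : initial position of the free end of the spring
    rest : rest position the spring contracts toward
    Returns the final position and velocity.
    """
    x, v = float(x0), 0.0
    for _ in range(steps):
        a = (-k * (x - rest) - c * v) / m   # acceleration from spring and damper
        v += a * dt                          # update velocity first
        x += v * dt                          # then position with the new velocity
    return x, v
```

Calling `simulate_spring(1.5, 1.0)` lets the free end oscillate and settle near the rest position; a muscle model in this style would connect many such springs between tissue nodes.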
外科的矯正治療が表情認知に与える影響に関する研究

日本学術振興会科学研究費助成事業基盤研究(C)

研究期間:

2000年

-

2001年

寺田員人, 宮永美知代, 森島繁生

　概要を見る

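The purpose of this study is to examine how displays of basic emotions are perceived before and after surgical orthodontic treatment in patients with jaw deformities, and thus to determine the effect of surgical orthodontic treatment on facial expression recognition.
To examine the recognition of facial expressions displayed after surgical orthodontic treatment, we created average faces and examined whether they reflect the characteristics of groups classified by type of malocclusion. The results confirmed that an average face is a useful image that removes the individuality of patients while retaining their shared morphological characteristics, suggesting that it can be used to assess facial skeletal morphology and expression recognition during expression display.
From patients who understood the purpose of this study and gave consent, we selected three groups: female patients with skeletal mandibular prognathism who underwent surgical orthodontic treatment, and female patients with crowding and with maxillary protrusion who underwent orthodontic treatment. After creating average faces for these three groups, we created six expressions of the universal basic emotion categories: surprise, fear, disgust, anger, happiness, and sadness. The raters comprised members of the orthodontics department (Faculty of Dentistry, Niigata University), who are highly knowledgeable about faces, students of a fine arts faculty, and Chinese dentists; for each of the six expressions, each rater selected the face image giving the strongest impression of that expression.
As a result, in patients with skeletal mandibular prognathism, the average-face expressions whose impression became stronger after surgical orthodontic treatment were surprise, fear, disgust, happiness, and sadness, that is, all expressions except anger.
For happiness, the post-treatment average face of patients with maxillary protrusion was more frequently judged to give a stronger impression than the pre-treatment face. Moreover, the three rater groups showed similar selection distributions. These findings suggest that surgical orthodontic treatment, or orthodontic treatment, enriches facial expression display and improves expression recognition.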
感性情報通信に向けた顔の分析合成処理共通プラットホームの構築

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

1999年

-

2001年

原島博, 橋本周司, 原文雄, 谷内田正彦, 山田寛, 森島繁生

　概要を見る

現在までに、多くの研究機関や大学において、顔の認識や顔画像合成に関する様々な研究が行われ、コンピュータの処理能力の向上と共に実用化の直前レベルまでに達している。
しかし、これらの要素技術は、個別の要求に対して研究開発された場合が殆どで、他のシステムとのインタフェースを含む拡張性や統一性を持たないものが多い。このような背景において、本研究では、顔画像を対象として汎用性のある構造化された基本ソフトウェアシステムの構築を目指した。
このソフトウェアシステムの中心は、顔画像の認識と合成技術を統一して扱える汎用的な顔情報処理ライブラリが中心となっている。顔の認識技術としては、顔と顔部品の抽出、記述、ならびに動画像中でのそれらの追跡、さらに個人照合や表情認識のための基本ツールの開発を行った。また汎用化のためのAPI開発も行った。顔画像合成では、自然さとリアルさを併せ持つ顔、顔部品の3次元モデルと顔の表情、動きの合成が可能な顔画像合成ルールの構築を行った。顔形状の標準モデルを用意し、認識システムの出力から得られた顔特徴点に基づいて個人の顔写真に整合して、個人の顔形状に合ったモデルを作成する手法を確立した。
各研究機関で開発された基本アルゴリズムを相互に結合し、さらに汎用化を目指す工夫を行ったこと。個々の表情変形ルールに対して大幅なクオリティ向上を目指したこと。またインタラクション技術として、感情の取り扱い方法や、ネットワークを経由したコミュニケーションについて具体的に検討したこと。またWEB上でのコミュニケーションを汎用的に記述する方法を実現した点が新しい成果である。また心理的な分析により顔表情アニメーションの時間的な変化について検討した点も全く新しい成果である。
顔面表情のカテゴリー分類プロセスに関する仮説の検証:記憶実験パラダイムの導入

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

1998年

-

2000年

山田寛, 鈴木竜太, 森島繁生, 厳島行雄

　概要を見る

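The purpose of this research is to elucidate the semantic processing thought to be involved in the perception and recognition of facial expressions. In particular, the main aim was to examine whether facial expressions are categorized as a function of their distance, in affective semantic space, from the prototype of each category. To achieve this, the research proceeded over three years as follows.
1) FY1998
We first built an experimental face image processing system and used it to create the stimuli for this research. We then conducted an experiment in which participants rated the created expression stimuli using the semantic differential (SD) method. As a result, we extracted basic affective semantic dimensions such as "pleasantness-unpleasantness" and "activity," consistent with those found in past studies and in experiments conducted by the principal investigator.
2) FY1999
Participants were asked to make emotion category judgments on the above expression stimuli, and we measured each judgment and the reaction time it required. The results confirmed the validity of the earlier experimental results.
3) FY2000
In the final year, we examined the mechanism of semantic processing of facial expressions from the viewpoint of memory. That is, by fitting functions to the experimental data, we tested whether a target facial expression is processed by matching it against prototypes already stored in memory within the affective semantic space. The results supported this hypothesis, and the original aim of the research was achieved.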
擬人化エージェントによるマルチモーダルインタフェースプロトタイプの構築

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

1997年

-

1999年

森島繁生, 山田寛

　概要を見る

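We completed a prototype multi-user communication system that realizes face-to-face conversation in a networked cyberspace through anthropomorphic agents. It consists of multiple clients connected via Ethernet and one server, and each client is equipped with a microphone and a video camera. Speech captured by the microphone is transferred to the server as it arrives and converted into mouth shapes in real time by a neural-network-based parameter conversion mechanism. The mouth shape parameters are sent to each client and used to control the facial expression of the speaker's anthropomorphic agent as seen by the other participants. The speech signal is also sent to the clients as-is and played through the loudspeaker together with the expression. By selecting basic expressions assigned to the function keys, users can change the expression shown to others. Users can also walk through the virtual space to control the position and gaze direction of their agent; in addition to the view from the agent's eyes, a fly mode provides a third-person viewpoint for grasping the whole scene. Because the agent's gaze is controlled to match the scene shown on screen, users can converse while making eye contact with the other agents.
Using this prototype, we conducted a communication experiment among three clients. The synthesis rate was about 10 frames per second, and there was degradation in the quality of the transmitted speech as well as loss of synchronization with lip movement due to transmission delay; nevertheless, evaluation experiments showed that natural conversation among multiple people over a network is in fact possible with this system.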
外科的矯正治療による顔の表情の変化に関する研究

日本学術振興会科学研究費助成事業基盤研究(C)

研究期間:

1997年

-

1998年

寺田員人, 森島繁生

　概要を見る

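The purpose of this research is to create a computer system that can display the movement of facial skin during expression formation, to use face image models to examine differences in expression display before and after surgical orthodontic treatment, and to develop the system to the point where patients can train to produce expressions by moving their facial muscles while watching the movement of the face image model after surgical orthodontic treatment.
Research results
1. Creation of face image software for producing expressions
We completed "Face Tool" (顔ツール), face image software that produces expressions by fitting a 3D standard face model to a frontal face photograph. Using any frontal face photograph as material, it applies facial movements corresponding to the Action Units of the Facial Action Coding System on a computer display.
2. Changes in expression display due to surgical orthodontic treatment
Using Face Tool, we created the six expressions of the basic emotion categories from frontal photographs taken before and after surgical orthodontic treatment and had 42 participants rate them; the post-treatment impression was stronger for surprise, fear, and disgust. This made clear the change in expression brought about by surgical orthodontic treatment.
3. Expression display training
With Face Tool, patients were able to train in expression display while comparing a face image in which one or more Action Units were moved with their own face in a mirror. It is therefore expected that performing this training after surgery will also help prevent the sensory paralysis that is a concern postoperatively in patients with jaw deformities.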
音声における感情情報の記述・分類と感情音声認識・合成

日本学術振興会科学研究費助成事業奨励研究(A)

研究期間:

1997年

-

1998年

森島繁生

　概要を見る

音声に含まれるノンバーバル情報のうち感情の特徴に着目して、無感情の音声に感情を付加する感情音声合成、さらに感情を込めて発話された音声を分析して感情のカテゴライズを行う感情音声認識を目的に研究を進めた。
まず音声に含まれる感情特徴パラメータの決定のため、ピッチ、パワー、時間構造、アクセントを任意に変化させて、韻律変換できるツールを完成させた。このツールを用いて各パラメータを試行錯誤で変化させて、無感情で発話された音声に対して、喜び、怒り悲しみの感情をそれぞれ付加して、センテンスによらず、話者によらず感情音声合成が可能となった。
次にピッチ、パワー、発話速度をパラメータとして音声を上記の3つの感情カテゴリーに分類するシステムを構築した。ピッチ抽出はケプストラムに基づき、また発話速度は分析フレーム間のスペクトル距離に基づいて独自の抽出アルゴリズムを開発し、リアルタイムで逐次認識結果を出力できるシステムを完成させた。感情カテゴリー分類には多重判別分析を用いた。このシステムと従来から検討を進めている3次元表情合成とをリンクし、入力された音声にリアルタイムに反応して、その時の感情状態にしたがって表情を変化させる擬人化エージェントシステムも完成させた。
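The summary above states that pitch extraction was based on the cepstrum. A minimal sketch of cepstral pitch estimation for a single frame is given below; the function name `cepstral_pitch`, the search range of 60 to 400 Hz, and the absence of windowing are assumptions made for illustration and are not details of the system described.

```python
import numpy as np

def cepstral_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    The real cepstrum is the inverse FFT of the log magnitude spectrum.
    A periodic signal produces a peak at the quefrency (in samples) equal
    to its pitch period, so the peak is searched within the lag range
    that corresponds to [fmin, fmax].
    """
    spectrum = np.fft.fft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)   # small offset avoids log(0)
    cepstrum = np.fft.ifft(log_mag).real
    qmin = int(fs / fmax)                         # shortest period considered
    qmax = int(fs / fmin)                         # longest period considered
    peak = qmin + int(np.argmax(cepstrum[qmin:qmax + 1]))
    return fs / peak
```

As a quick check, an impulse train with a period of 80 samples at a 16 kHz sampling rate should give an estimate near 200 Hz. A real-time system would apply this frame by frame and combine it with power and speaking-rate features.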
顔面表情認知の内的処理過程モデルに関する実験的検証

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

1995年

-

1996年

山田寛, 森島繁生

　概要を見る

本研究の主題は、顔面表情認知の内的処理過程である。主に、先行研究より示唆されている情報処理モデルの実験的検証と、モデル構築に係わる新たな知見の抽出を目的とした。この2年間の研究期間に実施した研究の主たる内容とその成果は下記の通りである。
1.顔面表情刺激の作成:下記の実験に使用する刺激の作成を行った。まず、実人物の顔面表情写真を用意し、次いで、顔画像処理システムによりワイヤーフレームモデルとテキスチャデータを作成し、それらに基づいて刺激とする平均的な顔面表情画像を作成した。
2.顔面表情刺激のSD法評定実験:先行研究で使用したSD尺度を用いて、被験者に上記の刺激をSD法で評定させる実験を行った。因子分析の結果、先行研究で導きだされていた「快-不快」および「活動性」という伝統的な感情的意味次元の存在が改めて確認された。
3.顔面表情判断の反応時間の測定実験:上記の刺激を被験者に呈示し、各刺激に対する感情カテゴリー判断を求め、被験者の各判断内容とその判断に要した反応時間を測定する実験を行った。
4.情報処理モデルの検討:まずSD法評定実験の結果を基に、各刺激を感情的意味次元空間に位置づけた。さらに、同じ空間に、各感情カテゴリーの概念も位置づけた。次いで、各カテゴリーの概念およびプロトタイプと各刺激の間の意味空間上における距離を計算した。そこで、その距離を関数として上記の実験で得られた反応時間の分析を行った。この結果、各顔面表情刺激に対する被験者のカテゴリー判断時間は、感情カテゴリーのプロトタイプとの意味空間上における距離によって説明される可能性が示唆された。
人物頭部の3次元構造モデルに基づく顔面表情の定量的計測システムの開発

日本学術振興会科学研究費助成事業基盤研究(A)

研究期間:

1994年

-

1996年

大倉元宏, 原島博, 千葉浩彦, 山田寛, 森島繁生

　概要を見る

まず、本研究では、顔面の動きを3次元的に計測するシステムを構築した。これは、被験者の顔面にマーカーを付与し、これを垂直に配置した2台のカメラで撮影して、各特徴点の動きを3次元座標の移動量として取得するものである。このシステムを利用して、心理学分野で提唱されているFacial Action Coding Systemのアクションユニットの定量化や、発話時の口形状の実測を実施した。また頭部を表現した3次元ワイヤフレームの変形と正面画像のテクスチャマッピングにより、計測した原画像の表情をそのままコンピュータグラフィックスとして再現することを可能にした。次のテーマとして、アクションユニットの組み合わせによって実現される6つの基本表情を作成し、この表情記述パラメータからニューラルネットの恒等写像学習によって3次元の感情空間を獲得した。これは、表情記述パラメータの組み合わせによって記述された感情状態を3次元の空間の1点として表現するもので、感情の認識と合成を同時に実現することができる。この感情空間の評価により、空間内では表情が連続的に記述され、心理学的にも妥当な空間が獲得されていることが明かとなった。次に手がけたテーマは、マーカを添付せずに、表情認識する試みである。これは、目と唇領域に注目し、画像の空間周波数情報が表情変形と相関が高いことに着目して、縦方向と横方向の周波数帯域成分から6つの基本感情を推定するシステムを構築した。このシステムでは目と唇領域を自動的に追跡して、フレーム毎に感情状態の認識結果を実時間で出力することができるようになった。実験結果から人間の主観による認識に近い性能を実現できた。
音響情報からの感情情報抽出とそのヒューマンインタフェースへの応用

日本学術振興会科学研究費助成事業重点領域研究

研究期間:

1993年

　

　

森島繁生, 斎藤隆弘

　概要を見る

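We extracted acoustic feature parameters for recognizing the basic emotions (happiness, sadness, anger, disgust) and the neutral state from the speech signal alone. Emotional speech uttered by people with acting experience was analyzed by trial and error with attention to pitch statistics and temporal structure, and marked differences were found. In particular, happiness and anger tend to show high power and rising pitch. In sadness, pitch falls uniformly. In disgust, pitch tends to rise toward the end of the phrase, and distinctive features were found in the temporal structure. Based on this parameter-change information, it became possible to add a specific emotion to neutral speech by converting its pitch and temporal structure. Listening tests confirmed that the emotion had been added.
We also examined the quantification of emotional information itself. This is an attempt to determine the relative spatial positions of the basic emotions by analyzing the expression parameters judged to express each emotion, and at the same time aims to recognize emotion from expressions and to convert specified emotional information into expressions. Specifically, parameterized expressions were input to a neural network trained for identity mapping to acquire the internal structure contained in the data. Using a 5-layer network with a 3-dimensional emotion space, we demonstrated a strong correlation with the emotion spaces previously proposed in psychology. That is, happiness and anger, and surprise and sadness, were located at opposite poles, with disgust between anger and sadness and fear between sadness and surprise. This had a major impact on the field of psychology.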
人に優しいヒューマンインタフェースのためのメディア変換技術

日本学術振興会科学研究費助成事業一般研究(C)

研究期間:

1992年

-

1993年

森島繁生

　概要を見る

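To realize a face-to-face dialogue environment with a person appearing on a computer screen, we built a scenario description system for facial animation and an interface prototype based on it. The head is described by a 3D model, and expressions are represented by displacements of the model's grid points. A facial animation scenario is determined by the placement of expression keyframes: when the desired expressions are selected and placed on the time axis, the parameters are smoothly interpolated between them to produce the animation. For the lips, the input text is analyzed automatically and keyframes are determined from the position of each phoneme. Standard phoneme durations and mouth shape parameters were determined by analyzing conversational scenes. Speech itself can also be input, in which case keyframe positions are determined from the speech segmentation result; this algorithm is a stable one developed in-house. Consider applying this facial animation system to e-mail. The user first enters the mail text and, if needed, also inputs speech of it being spoken. The system then performs phoneme and acoustic analysis, and the keyframes for lip motion are determined and displayed on screen. With these as a guide, the user decides which expression to show at which moment by manipulating icons to set expression keyframes. For expressions other than the basic ones, the user opens an editing window and adjusts parameters with slide bars to create any expression interactively while viewing the screen. Once registered and placed on the time axis, the expression is included in the animation. As a result of this editing process, the control parameters are transmitted, and the receiving side plays back an animation in which the face reads out the mail message with expression and speech synchronized. This fiscal year, we implemented a prototype of this interface on a workstation and realized a user-friendly interface environment.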
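The keyframe mechanism in this summary (expressions placed on a time axis with parameters interpolated between them) can be sketched as follows. Linear interpolation is used here as a simple stand-in, since the summary only states that interpolation is smooth; the function name `interpolate_keyframes` and the two-parameter example are hypothetical.

```python
import numpy as np

def interpolate_keyframes(key_times, key_params, t):
    """Interpolate expression parameters at time(s) t.

    key_times  : increasing keyframe times, shape (K,)
    key_params : parameter vectors at each keyframe, shape (K, P)
    t          : a scalar time or an array of times
    Returns shape (P,) for scalar t, or (len(t), P) for an array.
    """
    key_params = np.asarray(key_params, dtype=float)
    columns = [np.interp(t, key_times, key_params[:, j])
               for j in range(key_params.shape[1])]
    return np.array(columns).T
```

With keyframes at times 0, 1, and 2 holding parameters `[0, 0]`, `[1, 0.5]`, and `[0, 1]`, the value at `t = 0.5` is halfway between the first two keyframes. A spline could replace `np.interp` if smoother transitions are wanted.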
音響情報からの感情情報抽出とそのヒューマンインタフェースへの応用

日本学術振興会科学研究費助成事業重点領域研究

研究期間:

1992年

　

　

森島繁生, 斎藤隆弘

　概要を見る

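We are attempting to quantify the emotion contained in natural speech, aiming at a new media conversion technology that automatically extracts linguistic and emotional information from the speech signal and synthesizes facial animation.
This fiscal year, for six basic emotional states presumed to be expressed regardless of ethnicity (happiness, sadness, anger, disgust, fear, and neutral), we examined how their characteristics appear in actual speech. To create the speech stimuli for analysis, several theater company members, considered skilled at emotional expression, and ordinary students were asked to imagine a scene and act it out. The utterances were chosen so that the linguistic content itself carried no emotion. To avoid discrepancies between the speaker's and listeners' subjective impressions, a listening test with more than ten participants was conducted on the recordings, asking what emotion could be heard in each. Based on the results, stimuli in which a specific emotion was objectively judged to appear clearly were selected for analysis. With a view to later analysis-by-synthesis, the analysis was kept simple so that parameters could easily be varied: we focused on the frequency distribution of the pitch period and its temporal variation, and on the mean and temporal change of the power spectrum. Temporal changes showed large individual differences, and differences due to emotion did not appear clearly. However, as average features over the whole utterance, we found that the six basic emotions are spatially characterized by the change in pitch period from the neutral state and by the proportion of high-frequency power. Meanwhile, as an attempt to quantify facial expressions, we tried to acquire a 2D emotion space automatically through identity-mapping learning with a multilayer neural network. As a result, any expression of a person is described by coordinates in the 2D space, and a change of expression can be represented as a trajectory in that space. From the next fiscal year onward, we aim to realize a mapping from the feature parameters extracted from speech to this emotion space, and thus a system that automatically synthesizes facial expressions from speech alone.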
画像処理・CG手法を用いた表情の動的分析合成システムの開発と行動研究への応用

日本学術振興会科学研究費助成事業試験研究

研究期間:

1989年

-

1991年

原島博, 鈴木直人, 米谷淳, 千葉浩彦, 山田寛, 森島繁生, 水上啓子

　概要を見る

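Over the three years, we obtained the following research results.
1. Basic design of the expression analysis-synthesis system
The system was implemented on a graphics workstation, and various image processing and CG software was set up. Using the intelligent image coding approach proposed by the principal investigator and colleagues, we examined systematic methods for describing facial images and expressions. The results confirmed the effectiveness of the intelligent image coding approach.
2. Development of application software for expression analysis
We developed software for extracting face orientation and expression shape parameters from face images.
3. Development of application software for expression synthesis
With the aim of synthesizing natural face images from expression parameters, we examined aspects of muscle movement based on FACS from a psychological viewpoint and created image generation rules for expression synthesis.
4. Psychological evaluation experiments
We conducted psychological evaluation experiments on the expression parameters extracted from facial images and on the synthesized expression images. The results confirmed the psychological validity of expression synthesis and analysis by the system we have been developing.
5. Examination of applying the system to psychological and behavioral research
We surveyed research on facial expressions and expression-based communication in psychology and the behavioral sciences, and explored the role this system can play in such research.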
顔画像の知的符号化による高次コミュニケーションの基礎研究

日本学術振興会科学研究費助成事業重点領域研究

研究期間:

1989年

　

　

原島博, 相沢清晴, 森島繁生, 斎藤隆弘

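This research aims to establish, through the development of intelligent image coding, higher-level communication technology that will form the basis of future intelligent communication and intelligent man-machine interfaces. This fiscal year we studied the following items, which are key technologies of intelligent image coding for human face images.
1. Automatic extraction of 3D structure information from video
To realize a general-purpose encoder in intelligent image coding, it is desirable that 3D structure information be extracted automatically from the video itself rather than from a model prepared in advance. We studied the following two approaches.
(1) 3D structure and motion estimation from monocular video
(2) 3D structure and motion estimation from stereo video
2. Extraction of facial-expression parameters and expression synthesis
As methods for analyzing and synthesizing facial expressions in intelligent coding of face images, we studied the following two approaches.
(1) FACS-based expression analysis and synthesis
(2) Expression analysis and synthesis based on in-betweening
3. Synthesis of lip images from speech and text information
As an example of media conversion in intelligent image coding, we developed a method that synthesizes lip images on the receiving side from speech and text information. We further attempted to synthesize more realistic face images by combining this method with the facial-expression synthesis described in item 2.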
音声と画像の知的インタラクティブ符号化の研究

日本学術振興会科学研究費助成事業奨励研究(A)

研究期間:

1989年

　

　

森島繁生
顔画像の知的符号化による高次コミュニケーションの基礎研究

日本学術振興会科学研究費助成事業重点領域研究

研究期間:

1988年

　

　

原島博, 相沢清晴, 森島繁生, 斉藤隆弘

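This research aims to establish, through the development of intelligent image coding, higher-level communication technology that will form the basis of future intelligent communication and intelligent man-machine interfaces. This fiscal year we studied the following items, which are key technologies of intelligent image coding for human face images.
(1) Construction and identification of 3D structure models:
In intelligent image coding, the transmitter and receiver share a 3D structure model of the target image as knowledge. When images are restricted to head-and-shoulders views, the 3D structure model of the face and head plays the central role. As construction methods, we studied (a) preparing a standard 3D shape of the coding target (a wireframe model) in advance as knowledge, and (b) estimating motion parameters, depth, and other information directly from the input video sequence and reconstructing the 3D structure of the coding target without any prior knowledge.
(2) Synthesis and display of face images:
As a method for synthesizing face images on the receiving side, we developed a technique that projects and pastes a frontal photograph of the target person's face onto a 3D wireframe model of the face and rotates it to synthesize a face image from an arbitrary direction.
(3) Detection and coding of motion information and motion synthesis:
To detect 3D motion information of face images, we developed a method that simultaneously estimates the motion parameters of the head and the depth coordinates of feature points. We also studied how to select and automatically extract feature points that make such motion estimation possible.
(4) Extraction and coding of expression parameters and expression synthesis:
For faithful transmission of facial expressions, we proposed a "clip-and-paste synthesis method" and a "synthesis method by structural deformation". This fiscal year we studied the latter in particular, examining how to extract expression-related parameters and how to synthesize expressions by rule.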
音声と画像の知的インタラクティブ符号化の研究

日本学術振興会科学研究費助成事業奨励研究(A)

研究期間:

1988年

　

　

森島繁生

Misc

低リソース言語の自動音声認識における他言語データの効率的利用

三森俊祐,柏木爽良,田中啓太郎,森島繁生

情報処理学会第87回全国大会 2025年03月
プロソディ特徴を考慮した感情豊かな音声駆動3D発話顔生成

坂本翔之進,森島繁生

情報処理学会第87回全国大会 2025年03月
Microfacet理論に基づく複数回の反射を考慮した偏光レンダリング

大羽英仁(早大),谷田川達也(一橋大),森島繁生(早大)

情報処理学会第87回全国大会 2025年03月
少数の参照画を用いたアニメ線画の自動彩色

高野悠, 前島謙宣, 山口周悟, 森島繁生

情報処理学会第87回全国大会 2025年03月
ダンスの楽曲と振付に基づく対話型カメラワーク探索システムの提案

鈴木悠, 岩本尚也, 森島繁生

情報処理学会第87回全国大会 2025年03月
SyncViolinist: Music-Oriented Violin Motion Generation Based on Bowing and Fingering

Hiroki Nishizawa, Keitaro Tanaka, Asuka Hirata, Shugo Yamaguchi, Qi Feng, Masatoshi Hamanaka, Shigeo Morishima

第142回音楽情報科学研究発表会 2025年03月
Enhancing Non-Dominant Hand Skills Through Inverted Visual Feedback in a Mixed Reality Environment

Ryudai Inoue, Qi Feng, Shigeo Morishima

第32回インタラクティブシステムとソフトウェアに関するワークショップ, WISS2024 2024年12月
AIによる質問を利用したメール返信支援システムの評価

三浦悠輔, 楊期蘭, 栗林雅希, 松本啓吾, 葛岡英明, 森島繁生

第32回インタラクティブシステムとソフトウェアに関するワークショップ, WISS2024 2024年12月
視覚障害者の探索を支援する事前準備した地図を必要としない案内ロボット

栗林雅希, 上原康平, Allan Wang, 森島繁生, 浅川智恵子

第32回インタラクティブシステムとソフトウェアに関するワークショップ, WISS2024 2024年12月
浅水方程式における剛体カップリングのための浮力計算法の検討

平江陽香, 森島繁生, 安東遼一

VisualComputing, VC2024 2024年11月
生成画像から獲得された表現を活用した生成画像検出

大竹ひな, 福原吉博, 久保谷善記, 森島繁生

VisualComputing, VC2024 2024年09月
少数の参照画を用いたアニメ線画の自動彩色

高野悠, 前島謙宣, 山口周悟, 森島繁生

VisualComputing, VC2024, ポスター発表 2024年09月
Projection-Based Monocular Depth Prediction for 360 Images with Scale Awareness

Qi Feng, Shigeo Morishima

VisualComputing, VC2024 2024年09月
変分オートエンコーダを用いた単旋律音楽信号の音高・音色・変動への分解

田中啓太郎, 吉井和佳, Simon Dixon, 森島繁生

情報処理学会第141回音楽情報科学研究会 2024年08月

担当区分：最終著者
Investigation of Non-Dominant Hand Training through Virtual Reality and Inverted Visual Feedback

Inoue Ryudai (Waseda University), Feng Qi (Waseda Research Institute for Science and Engineering), Morishima Shigeo (Waseda Research Institute for Science and Engineering)

ヒューマンコンピュータインタラクション研究会（IPSJ-HCI） 33 2024年06月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
フロアマップ解析と交差点検出を用いた不慣れな屋内空間における視覚障害者のための案内システム

久保田雅也, 栗林雅希, 粥川青汰 (IBM Research - Tokyo), 高木啓伸 (IBM Research - Tokyo), 浅川智恵子 (IBM Research), 森島繁生

ヒューマンコンピュータインタラクション研究会（IPSJ-HCI） 35 2024年06月

研究発表ペーパー・要旨（全国大会，その他学術会議）
ロボットの視点を含んだ3D Visual Grounding

岩片彰吾, 大島遼祐, 綱島秀樹, 松澤郁哉, YUE QIU, 片岡裕雄, 森島繁生

情報処理学会第86回全国大会, 大会学生奨励賞 2W-06 2024年03月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
敵対的最適化と距離学習を用いたDeepfake検出

大竹ひな, 福原吉博, 久保谷善記, 森島繁生

情報処理学会第86回全国大会, 大会学生奨励賞 6U-02 2024年03月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
カメラ間で時刻同期していない動画を用いたDynamic NeRFの検討

佐々木馨, 佐藤和仁, 山口周悟, 武田司, 森島繁生

情報処理学会第86回全国大会 2U-07 2024年03月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
オフラインデータを利用した意味的探索による世界モデルのサンプル効率の改善

立松健輔, 綱島秀樹, 森島繁生

情報処理学会第86回全国大会, 大会学生奨励賞 6Q-03 2024年03月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
時刻非同期の動画を入力としたDynamic NeRFの検討

佐々木馨, 佐藤和仁, 山口周悟, 武田司, 森島繁生

VCワークショップ 2023 2023年12月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
3D Gaussian表現を用いた映像からアニメーション可能なアバター生成

深澤康介, 森島繁生

VCワークショップ2023 2023年12月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
ショッピングセンターにおける散策体験向上を目的とした大規模言語モデルを用いた視覚障碍者のためのナビゲーションシステム

神庭有花, 栗林雅希, 粥川青汰, 佐藤大介 (カーネギーメロン大), 髙木啓伸, 浅川智恵子 (IBM), 森島繁生

WISS2023, poster 2-B08 2023年11月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
フロアマップ解析と交差点検出を利用した視覚障害者のための案内システム

久保田雅也, 栗林雅希, 粥川青汰, 髙木啓伸, 浅川智恵子 (IBM Research), 森島繁生

WISS2023, poster 1-B02 2023年11月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
布の完全な部分空間シミュレーションに向けた深層非線形項評価法の検討

田中瑞城, 谷田川達也, 森島繁生

第192回研究発表会 (CGVI, CVIM, DCC, PRMU合同) 2023年11月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
固定・移動カメラペアを用いた表情制御可能な動的ニューラル輝度場の取得

山口周悟, 武富貴史 (サイバーエージェント), 楊興超 (サイバーエージェント), 森島繁生

第192回研究発表会 (CGVI, CVIM, DCC, PRMU合同) セッション1A-1 2023年11月

担当区分：最終著者

研究発表ペーパー・要旨（全国大会，その他学術会議）
イベント間の共起構造を導入した隠れセミマルコフモデルに基づく音響イベント検出

吉永朋矢, 坂東宜昭, 井本桂右, 大西正輝, 森島繁生

日本音響学会第150回(2023年秋季)研究発表会, 2-1-3 2023年09月
時刻非同期の動画を入力としたDynamic NeRFの検討

佐々木馨, 山口周悟, 佐藤和仁, 武田司, 森島繁生

VC2023 poster, P32, VCポスター賞 2023年09月
拡散モデルを用いたパッチ単位の任意スケール画像生成

荒川深映, Erik Härkönen, 綱島秀樹, 堀田大地, 森島繁生

VC2023 poster, P20, VCポスター賞 2023年09月
オブジェクトモーションブラー除去のための変形可能なNeRF

佐藤和仁, 山口周悟, 武田司, 森島繁生

画像の認識・理解シンポジウム, MIRU2023 2023年07月
Visual Dialogueにおける人間の応答ミス指摘の検討

大島遼祐, 品川政太朗, 綱島秀樹, 馮起, 森島繁生

画像の認識・理解シンポジウム, MIRU2023 2023年07月
NeRFの効率的な三次元再構成のためのカメラポーズ補間手法の提案

武田司, 山口周悟, 佐藤和仁, 深澤康介, 森島繁生

画像の認識・理解シンポジウム, MIRU2023 2023年07月
パッチ分割による拡散確率モデルのメモリ消費量削減の検討

荒川深映, 綱島秀樹, 堀田大地, 田中啓太郎, 森島繁生

画像の認識・理解シンポジウム, MIRU2023 2023年07月
ブレンドシェイプを用いて個人の表情や個性を反映した3D顔モデルのリターゲティング

疋田善地, 山口周悟, 岩本尚也, 森島繁生

画像の認識・理解シンポジウム, MIRU2023 2023年07月
ブレンドシェイプを用いた個人の表情や個性を反映した3D顔モデルのリターゲティング

疋田善地, 山口周悟, 岩本尚也, 森島繁生

情報処理学会第85回全国大会, 全国大会学生奨励賞 2023年03月
視覚情報に基づくタスク指向型対話における人間の返答に対する間違い指摘の検討

大島遼祐, 品川政太朗, 綱島秀樹, 森島繁生

情報処理学会第85回全国大会, 全国大会学生奨励賞 2023年03月
口パク動画の発話内容推測における距離学習に基づく精度向上手法

柏木爽良, 田中啓太郎, 森島繁生

情報処理学会第85回全国大会, 全国大会学生奨励賞 2023年03月
覚醒度と感情価に基づく音楽による画像スタイル変換

神庭有花, 田中啓太郎, 平田明日香, 森島繁生

情報処理学会第85回全国大会, 全国大会学生奨励賞 2023年03月
動画内話者の音声強調における特定背景音声の透過

吉永朋矢, 田中啓太郎, 森島繁生

情報処理学会第85回全国大会 2023年03月
複数解像度で画像を生成可能な拡散確率モデル

荒川深映, 綱島秀樹, 堀田大地, 森島繁生

情報処理学会第85回全国大会, 全国大会学生奨励賞 2023年03月
保存型浅水方程式を用いた流体シミュレーションの検討

平江陽香,森島繁生,安東遼一

情報処理学会第85回全国大会, 全国大会学生奨励賞 2023年03月
ノイズを含むレンダリング動画に対する重み付き局所線形回帰によるイベント映像生成

辻雄太, 谷田川達也, 久保尋之, 森島繁生

情報処理学会コンピュータグラフィックスとビジュアル情報学研究会, 第189回コンピュータグラフィックスとビジュアル情報学研究発表会 2023年02月
A Combined Finite Element and Finite Volume Method for Liquid Simulation

Tatsuya Koike, Shigeo Morishima, Ryoichi Ando

2023年01月

We introduce a new Eulerian simulation framework for liquid animation that leverages both finite element and finite volume methods. In contrast to previous methods, where the whole simulation domain is discretized using either the finite volume method or the finite element method, our method spatially merges the two, with the two types of discretization tightly coupled at their seams while enforcing second-order accurate boundary conditions at free surfaces. We achieve our formulation via a variational form using new shape functions specifically designed for this purpose. By enabling a mixture of the two methods, we can take advantage of the best of both worlds. For example, the finite volume method (FVM) results in sparse linear systems; however, complexity is encountered when unstructured grids such as tetrahedral or Voronoi elements are used. The finite element method (FEM), on the other hand, results in comparably denser linear systems, but the complexity remains the same even if unstructured elements are chosen, thereby facilitating spatial adaptivity. In this paper, we propose to use FVM for the majority of the domain to retain the sparsity of the linear systems and FEM for the parts where the grid elements are allowed to be freely deformed. An example of this application is locally moving grids. We show that by adapting the local grid movement to an underlying nearly rigid motion, numerical diffusion is noticeably reduced, leading to better preservation of structural details such as sharp edges, thin sheets and spindles of liquids.
マルチ解像度で画像を生成可能な拡散確率モデル

荒川深映, 綱島秀樹, 堀田大地, 森島繁生

VCワークショップ2022 2022年11月
入力動画に対する動画内話者と特定背景話者の同時音声抽出

吉永朋矢, 田中啓太郎, 森島繁生

VCワークショップ2022 2022年11月
口パク動画の発話内容推測における距離学習に基づく精度向上手法の検討

柏木爽良, 田中啓太郎, 森島繁生

VCワークショップ2022 2022年11月
大きく異なるモーション間でも使用可能なMotion Style Transferの提案

深澤康介, 森島繁生

VC + VCC 2022, poster 2022年10月
クロスシミュレーションのための深層学習に基づく部分空間Projective Dynamics

田中瑞城, 谷田川達也, 森島繁生

VC + VCC 2022, poster 2022年10月
画素対応を用いた自動着色手法の提案

沖川翔太, 山口周悟, 森島繁生

VC + VCC 2022, poster 2022年10月
リアルタイムレンダリング可能なNeRFの動的シーンへの拡張

武田司, 山口周悟, 佐藤和仁, 岩瀬翔平, 森島繁生

VC + VCC 2022, poster 2022年10月
AFSM: Adversarial Facial Style Modification for Privacy Protection from Deepfake

加藤義道, 福原吉博, 森島繁生

VC + VCC 2022, poster 2022年10月
カメラポーズが未知の環境下での少ない画像からのNeRFの学習

佐藤和仁, 森島繁生

VC + VCC 2022, poster 2022年10月
HyperNeRF を用いた任意視点・表情コントロール可能な発話動画生成

山口周悟, 武富貴史, 森島繁生

VC + VCC 2022, poster 2022年10月
リアルタイムレンダリング可能なNeRFの動的シーンへの拡張

武田司, 山口周悟, 岩瀬翔平, 佐藤和仁, 森島繁生

第84回全国大会講演論文集 2022 ( 1 ) 265 - 266 2022年02月

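Neural Radiance Fields (NeRF) is a method for high-quality novel view synthesis that builds a neural network taking a position and viewing direction as input and outputting radiance and density. However, it is basically limited to static scenes, and rendering takes a long time. We therefore attempt to remove both limitations by extending PlenOctrees [Yu et al. 2021], which greatly speeds up rendering but is limited to static scenes, to dynamic scenes. Specifically, we (1) train a NeRF with time added to its input, and (2) generate a PlenOctree for each time step. In addition, (3) we extend the renderer along the time axis, aiming to speed up NeRF rendering for dynamic scenes.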
視線情報を用いた英語フレーズの理解度推定

樋笠泰祐, 平田明日香, 田中啓太郎, 森島繁生

第84回全国大会講演論文集 2022 ( 1 ) 559 - 560 2022年02月

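This paper describes a method for estimating, from gaze information, a reader's comprehension of English phrases, that is, expressions whose meaning arises from multiple words. Recently, a support vector machine (SVM) has been proposed that targets individual words, using gaze information obtained with an eye tracker and word difficulty as input features. However, when a phrase consists of easy words, the existing method cannot detect phrases that the reader does not understand. We propose a method that takes additional gaze information into account and does not depend on word difficulty. Specifically, the time taken to finish reading a phrase and the number of regressions from the phrase are added to the SVM input features. We verify the usefulness of the proposed method through a user study.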
カメラポーズが未知の環境下での少ない画像からの深度画像を用いたNeRF

佐藤和仁, 武田司, 岩瀬翔平, 山口周悟, 森島繁生

第84回全国大会講演論文集 2022 ( 1 ) 273 - 274 2022年02月

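Neural Radiance Fields (NeRF) have attracted great attention in 3D scene reconstruction owing to their excellent synthesis quality. A limitation of NeRF, however, is that learning a 3D scene representation requires many input images and accurate camera poses. We propose a method that trains NeRF from few images and imperfect camera poses by using depth images. A loss is added so that the depth estimated by the model approaches the ground-truth depth. We show that this makes it possible to optimize NeRF and the camera poses simultaneously from few images.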
性能の異なるコンピュータからなるクラスタによる流体の移流計算の高速化

生田目大地, 森島繁生, 安東遼一

第84回全国大会講演論文集 2022 ( 1 ) 231 - 232 2022年02月

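Advection in fluid CG can be accelerated by parallel distributed processing on a cluster, but when equal amounts of work are distributed to every computer, efficient speed-up can be difficult unless the computers have equal performance. We propose a method for efficiently accelerating fluid advection on a cluster composed of computers with non-uniform performance. First, at program start-up, each computer solves a sample problem and its processing time is measured. Next, based on the measurements, we determine the task distribution ratio such that the sum of processing time and communication cost becomes uniform across computers. We applied the proposed method to advection with the MacCormack method and evaluated its effectiveness.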
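
The balancing step in this abstract (choose shares so that processing time plus communication cost is the same on every computer) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: it assumes processing time is proportional to the assigned share, that each machine's communication cost is a fixed constant independent of its share, and the function name `balanced_shares` and its parameters are hypothetical.

```python
def balanced_shares(unit_times, comm_costs, total_work):
    """Return each machine's fraction of total_work such that
    share * total_work * unit_time + comm_cost is equal on all machines.

    unit_times: measured seconds per unit of work on each machine
    comm_costs: assumed fixed communication cost (seconds) per machine
    """
    # inv[i] converts compute time on machine i into a share of the work.
    inv = [1.0 / (total_work * t) for t in unit_times]
    # The common finish time T solves sum((T - c_i) * inv_i) == 1.
    finish = (1.0 + sum(c * k for c, k in zip(comm_costs, inv))) / sum(inv)
    shares = [(finish - c) * k for c, k in zip(comm_costs, inv)]
    if min(shares) < 0:
        raise ValueError("a communication cost exceeds the balanced finish time")
    return shares
```

With two machines where the second is twice as slow and communication is free, the first machine receives two thirds of the work. A share-dependent communication cost would change the equation for the common finish time, so the closed form above applies only to the constant-cost assumption.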
Deepfakeを破壊する摂動の転移性調査と効率的な最適化手法の検討

加藤義道, 福原吉博, 森島繁生

第84回全国大会講演論文集 2022 ( 1 ) 219 - 220 2022年02月

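Deepfake is a media synthesis technology based on deep learning. It has become a problem because it is used to create malicious fake videos that damage a person's reputation. To address this, methods that disrupt DNN-based translation models with faint perturbations imperceptible to humans have attracted attention. Existing methods effectively disrupt the model they were optimized against, but their transferability to other models had not been investigated. We comprehensively investigated the transferability of perturbations across multiple translation models. With existing methods, a certain degree of transfer was observed when the noise was increased, but adding large perturbations degraded image quality. Based on this, we examined methods in which image quality does not degrade even when large perturbations are added.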
主成分分析を用いた次元削減によるクロスシミュレーションの高速化

田中瑞城, 森島繁生, 安東遼一

第84回全国大会講演論文集 2022 ( 1 ) 223 - 224 2022年02月

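Demand for cloth simulation in interactive content such as XR and games is expected to grow. At present, however, because of computation-time constraints, practical systems generally compute on coarse meshes or use models that are not physically faithful, and it is difficult to reproduce cloth-specific behavior such as wrinkles. We speed up computation by solving a finite-element cloth simulation in a dimension-reduced subspace. The subspace is constructed by applying principal component analysis to vertex data obtained by precomputation. In addition, artifacts are prevented by using constraints to limit unnatural stretching of the cloth. We applied the method to scenes involving interaction between cloth and rigid bodies and evaluated the computation time.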
スタイル変換を用いた多様な動作合成研究

深澤康介, Hubert Shum, 森島繁生

第84回全国大会講演論文集 2022 ( 1 ) 251 - 252 2022年02月

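We propose a method that synthesizes diverse motions by using motion style transfer to add a style (e.g., angry) to the content of a motion (e.g., walking). In large-scale human-motion production such as games, recording every variation of motion requires a great deal of effort. To reduce this effort, motion style transfer converts a motion to another style while preserving its content. We consider that the approach becomes more versatile if the synthesized motion is not determined uniquely but can be chosen by a person from several candidates. Our method extends existing motion style transfer to synthesize several candidate motions from a single content-style pair.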
強化学習を用いた最適指導法獲得について

久保谷善記, 福原吉博, 森島繁生

2021 ( 1 ) 343 - 344 2021年03月

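In recent years, Internet-based forms of education have been spreading. Such tools have the advantage that learners can proceed at their own pace, but the drawback that it is difficult to provide instruction optimized for each individual. In this study, we combine Knowledge Tracing, which estimates a learner's knowledge state from large-scale data, with reinforcement learning, which learns an optimal policy through repeated interaction with an environment, aiming to acquire an instruction strategy optimized for each individual with few interactions.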

CiNii
アニメ制作過程における画素対応を用いた作画ミス検出

沖川翔太, 山口周悟, 森島繁生

情報処理学会研究報告(Web) 2021 ( 1 ) 139 - 140 2021年03月

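In anime production, an enormous number of animation frames must be inspected to find and correct mistakes, which is a heavy burden. It is therefore necessary to reduce the number of frames that need inspection. This paper aims to detect coloring mistakes by exploiting the continuity of animation frames. First, semantic pixel-wise correspondences are computed between consecutive frames. If one of the frames contains a coloring mistake, correspondences cannot be established well in that region. Frames in which the correspondence fails are therefore treated as anomalous. Using this method, we succeeded in detecting various kinds of coloring mistakes.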

CiNii J-GLOBAL
弓動作を反映した演奏モーションの自動生成

平田明日香, 田中啓太郎, 島村僚, 森島繁生

2021 ( 1 ) 263 - 264 2021年03月

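This paper describes a method for automatically generating performance motion that reflects bowing from the audio signal of a string-instrument performance. To generate natural performance motion for bowed string instruments, it is necessary to reflect the bowing of the right hand, which is strongly tied to timbre. Deep-learning-based methods for generating performance motion have been proposed in recent years, but most generate motion directly from the audio signal and use poses estimated by existing methods as ground truth. As a result, the output is unnatural, with the right hand not moving sufficiently in time with the audio. In this study, we estimate the string being played and the bow direction changes from acoustic features and generate performance motion from these estimates in a rule-based manner, producing more natural motion that reflects the bowing.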

CiNii
変分自己符号化器を用いた距離学習による楽器音の音高・音色分離表現

田中啓太郎, 錦見亮, 坂東宜昭, 吉井和佳, 森島繁生

情報処理学会研究報告(Web) 2021 ( MUS-131 ) 2021年

J-GLOBAL
画素対応を用いた自動着色

沖川翔太, 森島繁生

情報処理学会研究報告(Web) 2021 ( CG-184 ) 2021年

J-GLOBAL
Corridor-Walker:視覚障害者が障害物を回避し交差点を認識するためのスマートフォン型屋内歩行支援システム

栗林雅希, 粥川青汰, VONGKULBHISAL Jayakorn, 浅川智恵子, 佐藤大介, 高木啓伸, 森島繁生

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 94 ) 2021年

J-GLOBAL
Object-aware表現学習におけるKLダイバージェンスの周期性アニーリングによる潜在表現の安定化手法の検証

小林篤史, 綱島秀樹, 大川武彦, 相澤宏旭, YUE Qiu, 片岡裕雄, 森島繁生

電子情報通信学会技術研究報告(Web) 121 ( 304(PRMU2021 24-59) ) 2021年

J-GLOBAL
個人情報保護のための写真内の指紋情報自動除去

中村和也, 夏目亮太, 土屋志高, 森島繁生

2020 ( 1 ) 137 - 138 2020年02月

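In recent years, with advances in technology, digital cameras capable of taking high-resolution photographs have become widespread. At the same time, it is increasingly possible for fingerprint information, which was difficult to capture with earlier digital cameras, to be obtained from photographs. There is thus a risk that people unintentionally publish photographs of themselves containing fingerprint information on the Internet and that the fingerprints are misused, for example for unauthorized logins. We propose a method for automatically removing fingerprint information from personal photographs. By replacing the fingerprint in the photograph with a specified different fingerprint while generating a natural-looking photograph, fingerprint information is removed without impairing photo quality. We also evaluate, through a user test, the effect of applying the method on user satisfaction.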

CiNii
深層クラスタリングを用いた任意楽器パートの自動採譜

田中啓太郎, 中塚貴之, 錦見亮, 吉井和佳, 森島繁生

2020 ( 1 ) 365 - 366 2020年02月

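We propose a method for automatically transcribing each part of a music audio signal performed by an arbitrary set of instruments. In recent years, deep learning has greatly improved classification performance and representation learning, and automatic transcription of multiple instruments has been proposed. In most cases, however, training data must be prepared for each instrument to be transcribed, and it is not realistic to train in advance for every instrument and sound source. To transcribe arbitrary music audio signals, we propose a framework for unsupervised learning that uses clustering instead of classification by instrument label. Transcription of each part is achieved by multi-task learning of source separation and part-wise transcription through optimization of the whole network.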

CiNii
スペクトログラムとピッチグラムの深層クラスタリングに基づく複数楽器パート採譜

田中啓太郎, 中塚貴之, 錦見亮, 吉井和佳, 森島繁生

情報処理学会研究報告(Web) 2020 ( MUS-128 ) 2020年

J-GLOBAL
LineChaser:視覚障碍者が列に並ぶためのスマートフォン型支援システム

栗林雅希, 粥川青汰, 高木啓伸, 浅川智恵子, 森島繁生

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 91 ) 2020年

J-GLOBAL
分離型畳み込みカーネルを用いた非均一表面下散乱の効率的な計測と実時間レンダリング法

谷田川達也, 山口泰, 森島繁生

Visual Computing 2019 論文集 P26 2019年06月
What Do Adversarially Robust Models Look At?

Takahiro Itazuri, Yoshihiro Fukuhara, Hirokatsu Kataoka, Shigeo Morishima

2019年05月

In this paper, we address the open question: "What do adversarially robust models look at?" Recently, many works have reported that there exists a trade-off between standard accuracy and adversarial robustness. According to prior works, this trade-off is rooted in the fact that adversarially robust and standard accurate models might depend on very different sets of features. However, it has not been well studied what kind of difference actually exists. In this paper, we analyze this difference through various experiments visually and quantitatively. Experimental results show that adversarially robust models look at things at a larger scale than standard models and pay less attention to fine textures. Furthermore, although it has been claimed that adversarially robust features are not compatible with standard accuracy, there is even a positive effect from using them as pre-trained models, particularly on low-resolution datasets.
音響情報を用いた一枚画像からの動画生成

土屋志高, 板摺貴大, 夏目亮太, 加藤卓哉, 山本晋太郎, 森島繁生

2019 ( 1 ) 169 - 170 2019年02月

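Humans can imagine visual information such as video from auditory information such as sound. As research toward realizing this ability on computers, there are studies that generate mouth or body motion using features such as facial landmarks or body skeletons. However, because these methods rely on features specialized to the target, they cannot be applied to arbitrary phenomena in which sound and motion are linked. In this paper, we propose a deep-learning-based method that is generally applicable to the problem of generating a matching video from a single input image and a few seconds of input sound. In experiments, we verified whether the proposed method is effective not only for mouth and body motion but also for various videos such as ocean waves and fireworks.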

CiNii
4‒2 ビジュアルコンピューティング分野 4‒2‒1 ビジュアルコンピューティング（VC）研究会

森島繁生

画像電子学会誌 48 ( 1 ) 25 - 26 2019年

DOI CiNii
ビジュアルコンピューティング論文特集号に寄せて

森島繁生

画像電子学会誌 48 ( 4 ) 487 - 487 2019年

DOI
一枚画像と音情報を用いた動画生成

土屋志高, 板摺貴大, 夏目亮太, 加藤卓哉, 山本晋太郎, 森島繁生

情報処理学会研究報告(Web) 2019 ( CG-173 ) 2019年

J-GLOBAL
Linearly Transformed Cosinesを用いた非等方関与媒質のリアルタイムレンダリング

久家隆宏, 谷田川達也, 森島繁生

画像電子学会誌(CD-ROM) 48 ( 1 ) 2019年

J-GLOBAL
VRによる視覚操作を用いた仮想勾配昇降時の知覚調査

島村僚, 粥川青汰, 中塚貴之, 宮川翔貴, 森島繁生

画像電子学会誌(CD-ROM) 48 ( 1 ) 2019年

J-GLOBAL
GPUによる大規模煙シミュレーションとその高速化手法

石田大地, 安東遼一, 森島繁生

画像電子学会誌(CD-ROM) 48 ( 1 ) 2019年

J-GLOBAL
深層学習を用いた顔画像レンダリング

山口周悟, 斉藤隼介, 長野光希, LI Hao, 森島繁生

画像電子学会誌(CD-ROM) 48 ( 1 ) 2019年

J-GLOBAL
歌声と楽曲構造を入力とした歌唱時の表情アニメーション自動生成手法

加藤卓哉, 深山覚, 中野倫靖, 後藤真孝, 森島繁生

画像電子学会誌(CD-ROM) 48 ( 2 ) 2019年

J-GLOBAL
PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization

Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, Hao Li

2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) 2019-October 2304 - 2314 2019年

We introduce Pixel-aligned Implicit Function (PIFu), an implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles, clothing, as well as their variations and deformations can be digitized in a unified way. Compared to existing representations used for 3D deep learning, PIFu produces high-resolution surfaces including largely unseen regions such as the back of a person. In particular, it is memory efficient unlike the voxel representation, can handle arbitrary topology, and the resulting surface is spatially aligned with the input image. Furthermore, while previous techniques are designed to process either a single image or multiple views, PIFu extends naturally to arbitrary number of views. We demonstrate high-resolution and robust reconstructions on real world images from the DeepFashion dataset, which contains a variety of challenging clothing types. Our method achieves state-of-the-art performance on a public benchmark and outperforms the prior work for clothed human digitization from a single image.

DOI
分離型畳み込みカーネルを用いた非均一表面下散乱の効率的推定法

谷田川達也, 山口泰, 森島繁生

情報処理学会研究報告(Web) 2019 ( CG-174 ) 1 - 8 2019年

J-GLOBAL
SiCloPe: Silhouette-Based Clothed People

Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, Shigeo Morishima

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) 2019-June 4475 - 4485 2019年

We introduce a new silhouette-based representation for modeling clothed human bodies using deep generative models. Our method can reconstruct a complete and textured 3D model of a person wearing clothes from a single input picture. Inspired by the visual hull algorithm, our implicit representation uses 2D silhouettes and 3D joints of a body pose to describe the immense shape complexity and variations of clothed people. Given a segmented 2D silhouette of a person and its inferred 3D joints from the input picture, we first synthesize consistent silhouettes from novel view points around the subject. The synthesized silhouettes which are the most consistent with the input segmentation are fed into a deep visual hull algorithm for robust 3D shape prediction. We then infer the texture of the subject's back view using the frontal image and segmentation mask as input to a conditional generative adversarial network. Our experiments demonstrate that our silhouette-based model is an effective representation and the appearance of the back view can be predicted reliably using an image-to-image translation network. While classic methods based on parametric models often fail for single-view images of subjects with challenging clothing, our approach can still produce successful results, which are comparable to those obtained from multi-view input.

DOI
FSNet: An Identity-Aware Generative Model for Image-Based Face Swapping

Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima

COMPUTER VISION - ACCV 2018, PT VI 11366 117 - 132 2019年

This paper presents FSNet, a deep generative model for image-based face swapping. Traditionally, face-swapping methods are based on three-dimensional morphable models (3DMMs), and facial textures are replaced between the estimated three-dimensional (3D) geometries in two images of different individuals. However, the estimation of 3D geometries along with different lighting conditions using 3DMMs is still a difficult task. We herein represent the face region with a latent variable that is assigned with the proposed deep neural network (DNN) instead of facial textures. The proposed DNN synthesizes a face-swapped image using the latent variable of the face region and another image of the non-face region. The proposed method is not required to fit to the 3DMM; additionally, it performs face swapping only by feeding two face images to the proposed network. Consequently, our DNN-based face swapping performs better than previous approaches for challenging inputs with different face orientations and lighting conditions. Through several experiments, we demonstrated that the proposed method performs face swapping in a more stable manner than the state-of-the-art method, and that its results are compatible with the method thereof.

DOI
パーツ分離型敵対的生成ネットワークによる髪型編集手法

夏目亮太, 谷田川達也, 森島繁生

第80回全国大会講演論文集 2018 ( 1 ) 207 - 208 2018年03月

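We propose a method for editing hairstyles in face photographs that combines two deep generative models, VAE and GAN. Ordinary generative models generate the whole face photograph from a single latent vector, whereas the proposed method generates the face image from two latent vectors corresponding to the face region and the hair region. For training, we prepare a dataset in which face photographs are segmented into face and hair regions, and two VAEs extract the face region and the hair region from the photograph. The GAN reconstructs the original photograph from the latent vectors that appear in the intermediate layers of the two VAEs. Extracting latent vectors for the face and hair regions in this way enabled a variety of applications at once, including trying on another person's hairstyle, editing appearance such as color and length, and searching for similar faces and similar hairstyles.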

CiNii
コンピュータビジョンと自然言語処理技術を用いた自動論文要約システム

山本晋太郎, 福原吉博, 鈴木亮太, 片岡裕雄, 森島繁生

電子情報通信学会技術研究報告 118 ( 362(PRMU2018 75-95)(Web) ) 2018年

J-GLOBAL
音響特徴量を考慮したミュージックビデオの色調編集手法

井上和樹, 中塚貴之, 柿塚亮, 高森啓史, 宮川翔貴, 森島繁生

情報処理学会全国大会講演論文集 80th ( 3 ) 3.271‐3.272 2018年

J-GLOBAL
領域分離型敵対的生成ネットワークによる髪型編集手法

夏目亮太, 谷田川達也, 森島繁生

情報処理学会全国大会講演論文集 80th ( 2 ) 2.207‐2.208 2018年

J-GLOBAL
複数色の重ね合わせによるユーザーの好みを反映した化粧色推薦

山岸奏実, 加藤卓哉, 古川翔一, 山本晋太郎, 森島繁生

情報処理学会全国大会講演論文集 80th ( 4 ) 4.167‐4.168 2018年

J-GLOBAL
楽曲構造を考慮した音楽音響信号からの自動ピアノアレンジ

高森啓史, 中塚貴之, 深山覚, 後藤真孝, 森島繁生

情報処理学会研究報告(Web) 2018 ( MUS-120 ) Vol.2018‐MUS‐120,No.11,1‐6 (WEB ONLY) 2018年

J-GLOBAL
Quasi-developable garment transfer for animals

Fumiya Narita, Shunsuke Saito, Tsukasa Fukusato, Shigeo Morishima

SIGGRAPH Asia 2017 Technical Briefs, SA 2017 2017年11月

In this paper, we present an interactive framework to model garments for animals from a template garment model based on correspondences between the source and the target bodies. We address two critical challenges of garment transfer across significantly different body shapes and postures (e.g., for quadruped and human): (1) ambiguity in the correspondences and (2) distortion due to large variation in scale of each body part. Our efficient cross-parameterization algorithm and intuitive user interface allow us to interactively compute correspondences and transfer the overall shape of garments. We also introduce a novel algorithm for local coordinate optimization that minimizes the distortion of transferred garments, which leads the resulting model to a quasi-developable surface that is hence ready for fabrication. Finally, we demonstrate the robustness and effectiveness of our approach on various garments and body shapes, showing that visually pleasant garment models for animals can be generated and fabricated by our system with minimal effort.

DOI
ラリーシーンに着目したラケットスポーツ映像鑑賞システム—An Efficient Video Viewing System for Racquet Sports Focusing on Rally Scenes

板摺貴大, 福里司, 河村俊哉, 森島繁生

画像ラボ / 画像ラボ編集委員会編 28 ( 6 ) 12 - 19 2017年06月

CiNii
物体の形状による堆積への影響を考慮した埃の描画手法の提案

佐藤樹, 森島繁生, 谷田川達也, 小澤禎裕, 持田恵佑

第79回全国大会講演論文集 2017 ( 1 ) 145 - 146 2017年03月

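Among the ways of depicting dirt in CG, dust is an important element: it evokes the passage of time and is frequently seen in the real world. However, accurately reproducing and rendering dust fibers is generally known to require much time and effort. In this study, we realize modeling and rendering of dust fibers with less time and effort. By replacing the shell textures of the shell method, which is used for rendering fur and similar materials, with textures that imitate dust, we succeeded in quickly expressing the three-dimensional appearance of dust that derives from its fibrous structure. When rendering dust, we applied shading techniques used in real-time rendering, such as shadow maps, to reproduce the effect of occlusion by the object itself on the region where dust accumulates. This allowed us to reproduce dust accumulation close to the real-world phenomenon.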

CiNii
解析・計測曲率に依存する反射関数を用いた半透明物体における法線マップ推定手法の提案—Normal Map Estimation using Curvature-Dependent Reflectance Function for Translucent Materials

久保尋之, 岡本翠, 向川康博, 森島繁生

画像ラボ / 画像ラボ編集委員会編 28 ( 2 ) 14 - 21 2017年02月

CiNii
密な画素対応による例示ベースのテクスチャ合成

山口周悟, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2017 2017年

J-GLOBAL
一人称視点映像の高速閲覧に有効なキューの自動生成手法

粥川青汰, 樋口啓太, 中村優文, 米谷竜, 佐藤洋一, 森島繁生

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 81 ) WEB ONLY 2017年

J-GLOBAL
DanceDJ:ライブパフォーマンスを実現する実時間ダンス生成システム

岩本尚也, 加藤卓哉, 柿塚亮, SHUM Hubert P.H., 原健太, 森島繁生

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 81 ) ROMBUNNO.2‐A07 (WEB ONLY) 2017年

J-GLOBAL
コート情報に基づくバレーボール映像の鑑賞支援と戦術解析への応用の検討

板摺貴大, 福里司, 山口周悟, 森島繁生

電子情報通信学会技術研究報告 116 ( 411(PRMU2016 127-151) ) 227‐234 2017年

J-GLOBAL
頭蓋骨形状を考慮した肥痩変化顔画像合成

福里司, 藤崎匡裕, 加藤卓哉, 森島繁生

画像電子学会誌(CD-ROM) 46 ( 1 ) 197‐205 2017年

J-GLOBAL
曲率に依存する反射関数を用いた半透明物体における法線マップ推定手法の提案

久保尋之, 岡本翠, 向川康博, 森島繁生

画像ラボ 28 ( 2 ) 14‐21 2017年

J-GLOBAL
原曲スコアの音楽特徴量に基づくピアノアレンジ

高森啓史, 佐藤晴紀, 中塚貴之, 森島繁生

情報処理学会研究報告(Web) 2017 ( MUS-114 ) Vol.2017‐MUS‐114,No.16,1‐6 (WEB ONLY) 2017年

J-GLOBAL
表情変化を考慮した経年変化顔動画合成

山本晋太郎, サフキンパーベル, 加藤卓哉, 山口周悟, 森島繁生

情報処理学会研究報告(Web) 2017 ( CG-166 ) Vol.2017‐CG‐166,No.3,1‐6 (WEB ONLY) 2017年

J-GLOBAL
キャラクタの局所的な身体構造を考慮したリアルタイム二次動作自動生成

金田綾乃, 福里司, 福原吉博, 中塚貴之, 森島繁生

情報処理学会研究報告(Web) 2017 ( CG-166 ) Vol.2017‐CG‐166,No.2,1‐5 (WEB ONLY) 2017年

J-GLOBAL
笑顔動画データベースを用いた顔動画の経年変化

山本晋太郎, SAVKIN Pavel, 佐藤優伍, 加藤卓哉, 森島繁生

情報処理学会全国大会講演論文集 79th ( 4 ) 4.103‐4.104 2017年

J-GLOBAL
物体の形状による堆積への影響を考慮した埃の高速描画手法の提案

佐藤樹, 小澤禎裕, 持田恵佑, 谷田川達也, 森島繁生

情報処理学会全国大会講演論文集 79th ( 4 ) 4.145‐4.146 2017年

J-GLOBAL
キャラクタの局所的な身体構造を考慮した二次動作自動生成

金田綾乃, 福里司, 福原吉博, 中塚貴之, 森島繁生

情報処理学会全国大会講演論文集 79th ( 4 ) 4.89‐4.90 2017年

J-GLOBAL
原曲の楽譜情報に基づいたピアノアレンジ譜面の生成

高森啓史, 佐藤晴紀, 中塚貴之, 森島繁生

情報処理学会全国大会講演論文集 79th ( 2 ) 2.91‐2.92 2017年

J-GLOBAL
コート情報に基づくバレーボール映像の鑑賞支援とラリー解析

板摺貴大, 福里司, 山口周悟, 森島繁生

情報処理学会全国大会講演論文集 79th ( 2 ) 2.317‐2.318 2017年

J-GLOBAL
ラリーシーンに着目したラケットスポーツ映像鑑賞システム

板摺貴大, 福里司, 河村俊哉, 森島繁生

画像ラボ 28 ( 6 ) 12‐19 2017年

J-GLOBAL
可展面制約を考慮したテンプレートベース衣服モデリング

成田史弥, 齋藤隼介, 福里司, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2017 ROMBUNNO.22 2017年

J-GLOBAL
表情変化データベースを用いた経年変化顔動画合成

山本晋太郎, SAVKIN Pavel, 加藤卓哉, 佐藤優伍, 古川翔一, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2017 ROMBUNNO.18 2017年

J-GLOBAL
自己遮蔽下におけるリテクスチャリングのための階層型マーカの提案

宮川翔貴, 福原吉博, 成田史弥, 小形憲弘, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2017 ROMBUNNO.37 2017年

J-GLOBAL
局所的な異方性と硬さを考慮した高速なキャラクタの二次動作生成の提案

金田綾乃, 福里司, 福原吉博, 中塚貴之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2017 ROMBUNNO.29 2017年

J-GLOBAL
コート情報に基づくバレーボール映像の鑑賞支援

板摺貴大, 福里司, 山口周悟, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2017 ROMBUNNO.09 2017年

J-GLOBAL
音楽音響信号から得られる音楽要素に基づく自動ピアノアレンジ

高森啓史, 深山覚, 後藤真孝, 森島繁生

情報処理学会研究報告(Web) 2017 ( MUS-116 ) Vol.2017‐MUS‐116,No.13,1‐5 (WEB ONLY) 2017年

J-GLOBAL
光学的最短経路長を用いた表面下散乱の高速計算による半透明物体のリアルタイム・レンダリング

小澤禎裕, 谷田川達也, 久保尋之, 森島繁生

画像電子学会誌(CD-ROM) 46 ( 4 ) 533‐546 2017年

J-GLOBAL
物体検出とユーザ入力に基づく一人称視点映像の高速閲覧手法

粥川青汰, 樋口啓太, 米谷竜, 中村優文, 佐藤洋一, 森島繁生

情報処理学会研究報告(Web) 2017 ( DCC-17 ) Vol.2017‐DCC‐17,No.4,1‐8 (WEB ONLY) 2017年

J-GLOBAL
Simulating the friction sounds using a friction-based adhesion theory model

Takayuki Nakatsuka, Shigeo Morishima

DAFx 2017 - Proceedings of the 20th International Conference on Digital Audio Effects 32 - 39 2017年

Synthesizing a friction sound of deformable objects by a computer is challenging. We propose a novel physics-based approach to synthesize friction sounds based on dynamics simulation. In this work, we calculate the elastic deformation of an object surface when the object comes in contact with other objects. The principle of our method is to divide an object surface into microrectangles. The deformation of each microrectangle is set using two assumptions: the size of a microrectangle (1) changes by contacting other object and (2) obeys a normal distribution. We consider the sound pressure distribution and its space spread, consisting of vibrations of all microrectangles, to synthesize a friction sound at an observation point. We express the global motions of an object by position based dynamics where we add an adhesion constraint. Our proposed method enables the generation of friction sounds of objects in different materials by regulating the initial value of microrectangular parameters.
Dynamic subtitle placement considering the region of interest and speaker location

VISIGRAPP 2017 - Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications 6 102 - 109 2017年01月

This paper presents a subtitle placement method that reduces unnecessary eye movements. Although methods that vary the position of subtitles have been discussed in a previous study, subtitles may overlap the region of interest (ROI). Therefore, we propose a dynamic subtitling method that utilizes eye-tracking data to prevent subtitles from overlapping important regions. The proposed method calculates the ROI based on the eye-tracking data of multiple viewers. By positioning subtitles immediately under the ROI, the subtitles do not overlap the ROI. Furthermore, we detect speakers in a scene based on audio and visual information to help viewers recognize the speaker by positioning subtitles near the speaker. Experimental results show that the proposed method enables viewers to watch the ROI and the subtitle for a longer duration than traditional subtitles, and is effective in terms of enhancing the comfort and utility of the viewing experience.
要素間補間による共回転系弾性体の高速化

福原吉博, 斎藤隼介, 成田史弥, 森島繁生

第78回全国大会講演論文集 2016 ( 1 ) 181 - 182 2016年03月

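Corotational elasticity is robust to large deformations and is one of the elastic models widely used in computer graphics (CG). However, simulating a corotational elastic body requires performing a singular value decomposition for every element at each step to compute its rotation matrix, which is one of the bottlenecks in computation time. In this study, singular value decomposition is performed, and an exact rotation matrix computed, only for a small number of sampled elements. The rotation matrices of the remaining elements are approximated by interpolation using quaternion blending, which enables faster simulation.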
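
The quaternion blending mentioned in this abstract can be sketched as a normalized weighted sum of the sampled elements' rotations. This is an illustrative assumption rather than the authors' formulation: the abstract does not state the blend formula or how weights are chosen, and the function name `blend_quaternions` and the `(w, x, y, z)` component order are hypothetical.

```python
import math


def blend_quaternions(quats, weights):
    """Blend unit quaternions (w, x, y, z) by a normalized weighted sum.

    quats: rotations of nearby sampled elements (assumed already computed)
    weights: non-negative blend weights, one per quaternion (assumed given)
    """
    ref = quats[0]
    acc = [0.0, 0.0, 0.0, 0.0]
    for q, w in zip(quats, weights):
        # q and -q encode the same rotation; flip into ref's hemisphere
        # so that opposite signs do not cancel in the sum.
        sign = 1.0 if sum(a * b for a, b in zip(ref, q)) >= 0.0 else -1.0
        for i in range(4):
            acc[i] += sign * w * q[i]
    norm = math.sqrt(sum(a * a for a in acc))
    if norm == 0.0:
        raise ValueError("weights sum to a zero quaternion")
    return tuple(a / norm for a in acc)
```

Blending the identity with a 90-degree rotation about the z axis at equal weights gives a 45-degree rotation about z. The result would still need to be converted to a rotation matrix before being used in the element's stiffness assembly.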

CiNii
ラリーシーンの自動抽出と解析に基づくバレーボール映像の要約手法の提案

板摺貴大, 福里司, 山口周悟, 森島繁生

映像情報メディア学会冬季大会講演予稿集 2016 25C-1 2016年

DOI CiNii
motebi~文字を手書きで美しく書くための支援ツール~

中村優文, 山口周悟, 森島繁生

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 78 ) ROMBUNNO.2‐A06 (WEB ONLY) 2016年

J-GLOBAL
視線情報と話者情報とを組み合わせた動画への動的字幕配置手法

赤堀渉, 平井辰典, 森島繁生

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 78 ) ROMBUNNO.1‐A10 (WEB ONLY) 2016年

J-GLOBAL
トレーシングとデータベースを併用する2Dアニメーション作成支援システム

福里司, 森島繁生

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 78 ) WEB ONLY 2016年

J-GLOBAL
Dance DJ:ライブパフォーマンスのためのダンス動作ミックスシステム

岩本尚也, 加藤卓哉, 原健太, 柿塚亮, 森島繁生

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 78 ) ROMBUNNO.1‐A02 (WEB ONLY) 2016年

J-GLOBAL
曲率に依存した反射関数を用いた半透明物体の照度差ステレオ法

岡本翠, 久保尋之, 向川康博, 森島繁生

画像電子学会誌(CD-ROM) 45 ( 1 ) 119‐120 2016年

J-GLOBAL
ラリーシーンに着目したラケットスポーツ動画鑑賞システム

河村俊哉, 福里司, 平井辰典, 森島繁生

画像電子学会誌(CD-ROM) 45 ( 1 ) 121‐122 2016年

J-GLOBAL
領域ベースの画風転写

山口周悟, 加藤卓哉, 福里司, 古澤知英, 森島繁生

画像電子学会誌(CD-ROM) 45 ( 1 ) 125‐126 2016年

J-GLOBAL
フレームリシャッフリングに基づく事前知識を用いない吹替映像の生成

古川翔一, 加藤卓哉, 野澤直樹, PAVEL Savkin, 森島繁生

情報処理学会全国大会講演論文集 78th ( 4 ) 4.107-4.108 2016年

J-GLOBAL
顔画像特徴と眠気の相関に基づくドライバーの眠気検出

佐藤優伍, 野澤直樹, 森島繁生

情報処理学会全国大会講演論文集 78th ( 3 ) 3.331-3.332 2016年

J-GLOBAL
要素間補間による共回転系弾性体シミュレーションの高速化

福原吉博, 齋藤隼介, 成田史弥, 森島繁生

情報処理学会全国大会講演論文集 78th ( 4 ) 4.181-4.182 2016年

J-GLOBAL
似顔絵の個性を考慮した実写化手法の提案

中村優文, 山口周悟, 福里司, 古澤知英, 森島繁生

情報処理学会全国大会講演論文集 78th ( 4 ) 4.93-4.94 2016年

J-GLOBAL
凝着説に基づく物体表面の弾性変形を考慮した摩擦音の生成手法の提案

中塚貴之, 森島繁生

情報処理学会全国大会講演論文集 78th ( 4 ) 4.187-4.188 2016年

J-GLOBAL
好みを反映したダンス生成のための振付編集手法

柿塚亮, 岩本尚也, 朝比奈わかな, 森島繁生

情報処理学会全国大会講演論文集 78th ( 4 ) 4.103-4.104 2016年

J-GLOBAL
パッチ単位の法線推定による三次元顔形状復元

野沢綸佐, 加藤卓哉, 野澤直樹, PAVEL Savkin, 森島繁生

情報処理学会全国大会講演論文集 78th ( 2 ) 2.113-2.114 2016年

J-GLOBAL
不均一な半透明物体の描画のためのTranslucent Shadow Mapsの拡張

持田恵佑, 岡本翠, 久保尋之, 森島繁生

情報処理学会全国大会講演論文集 78th ( 4 ) 4.57-4.58 2016年

J-GLOBAL
輝度の最大寄与値を用いた半透明物体のリアルタイムレンダリング

小澤禎裕, 岡本翠, 森島繁生

情報処理学会全国大会講演論文集 78th ( 4 ) 4.55-4.56 2016年

J-GLOBAL
四肢キャラクタ間の衣装転写システムの提案

成田史弥, 齋藤隼介, 福里司, 森島繁生

情報処理学会論文誌ジャーナル(Web) 57 ( 3 ) 863‐872 (WEB ONLY) 2016年

J-GLOBAL
LyricsRadar:歌詞の潜在的意味に基づく歌詞検索インタフェース

佐々木将人, 吉井和佳, 中野倫靖, 後藤真孝, 森島繁生

情報処理学会論文誌ジャーナル(Web) 57 ( 5 ) 1365‐1374 (WEB ONLY) 2016年

J-GLOBAL
寄与の大きな表面下散乱光の高速取得による半透明物体のリアルタイムレンダリング

小澤禎裕, 久保尋之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.28 2016年

J-GLOBAL
要素間補間による共回転系弾性体シミュレーションの高速化

福原吉博, 斎藤隼介, 成田史弥, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.13 2016年

J-GLOBAL
Voxel Number Mapを用いた不均一半透明物体のリアルタイムレンダリング

持田恵佑, 久保尋之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.29 2016年

J-GLOBAL
フレームリシャッフリングに基づく音素情報を用いない吹替え映像の生成

古川翔一, 加藤卓哉, SAVKIN Pavel, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.16 2016年

J-GLOBAL
好みを反映した3Dダンス制作のための振付編集手法

柿塚亮, 岩本尚也, 朝比奈わかな, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.26 2016年

J-GLOBAL
複数人の視線追跡データから推定される関心領域に基づく動画への動的字幕配置手法

赤堀渉, 平井辰典, 河村俊哉, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.39 2016年

J-GLOBAL
CGアニメーションのための物体表面の凝着を考慮した摩擦音の生成

中塚貴之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.20 2016年

J-GLOBAL
頭蓋骨形状に基づいた顔の三次元肥痩シミュレーション

藤崎匡裕, 鍵山裕貴, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.25 2016年

J-GLOBAL
法線情報を含むパッチデータベースを用いた三次元顔形状復元

野沢綸佐, 加藤卓哉, SAVKIN Pavel, 山口周悟, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.07 2016年

J-GLOBAL
肖像画実写化手法の提案

中村優文, 山口周悟, 福里司, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2016 ROMBUNNO.09 2016年

J-GLOBAL
老化時の皺の個人性を考慮した経年変化顔画像合成

SAVKIN Pavel A., 加藤卓哉, 福里司, 森島繁生

情報処理学会論文誌ジャーナル(Web) 57 ( 7 ) 1627‐1637 (WEB ONLY) 2016年

J-GLOBAL
適合フィードバックに基づく好みを反映したダンス編集手法

柿塚亮, 佃洸摂, 深山覚, 岩本尚也, 後藤真孝, 森島繁生

情報処理学会研究報告(Web) 2016 ( MUS-112 ) Vol.2016‐MUS‐112,No.16,1‐6 (WEB ONLY) 2016年

J-GLOBAL
肖像画からの写実的な顔画像生成手法

中村優文, 山口周悟, 福里司, 森島繁生

情報処理学会研究報告(Web) 2016 ( CG-163 ) Vol.2016‐CG‐163,No.10,1‐6 (WEB ONLY) 2016年

J-GLOBAL
似顔絵から顔を知る~似顔絵実写化の可能性~

中村優文, 森島繁生

日本顔学会誌 16 ( 1 ) 36 2016年

J-GLOBAL
顔の変化による眠そうな顔の認知

佐藤優伍, 加藤卓哉, 野澤直樹, 森島繁生

日本顔学会誌 16 ( 1 ) 71 2016年

J-GLOBAL
顔の発話動作と音声とを同期させた映像を生成する手法の提案

古川翔一, 加藤卓哉, サフキンパーベル, 森島繁生

日本顔学会誌 16 ( 1 ) 38 2016年

J-GLOBAL
ラリーシーンの自動抽出と解析に基づくバレーボール映像の要約手法の提案

板摺貴大, 福里司, 山口周悟, 森島繁生

映像情報メディア学会冬季大会講演予稿集(CD-ROM) 2016 ROMBUNNO.25C‐1 2016年

J-GLOBAL
Automatic Generation of Photorealistic 3D Inner Mouth Animation only from Frontal Images

Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

情報処理学会論文誌 56 ( 9 ) 2015年09月

In this paper, we propose a novel method to generate highly photorealistic three-dimensional (3D) inner mouth animation that is well-fitted to an original ready-made speech animation using only frontal captured images and small-size databases. The algorithms are composed of quasi-3D model reconstruction and motion control of teeth and the tongue, and final compositing of photorealistic speech animation synthesis tailored to the original. In general, producing a satisfactory photorealistic appearance of the inner mouth that is synchronized with mouth movement is a very complicated and time-consuming task. This is because the tongue and mouth are too flexible and delicate to be modeled with the large number of meshes required. Therefore, in some cases, this process is omitted or replaced with a very simple generic model. Our proposed method, on the other hand, can automatically generate 3D inner mouth appearances by improving photorealism with only three inputs: an original tailor-made lip-sync animation, a single image of the speaker's teeth, and a syllabic decomposition of the desired speech. The key idea of our proposed method is to combine 3D reconstruction and simulation with two-dimensional (2D) image processing using only the above three inputs, as well as a tongue database and mouth database. The satisfactory performance of our proposed method is illustrated by the significant improvement in picture quality of several tailor-made animations to a degree nearly equivalent to that of camera-captured videos.
This is a preprint of an article intended for publication in Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.23 (2015) No.5 (online), DOI http://dx.doi.org/10.2197/ipsjjip.23.693

CiNii
実写画像を入力とする画風を考慮したアニメ背景画像生成システム

山口周悟, 古澤知英, 福里司, 森島繁生

第77回全国大会講演論文集 2015 ( 1 ) 71 - 72 2015年03月

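In hand-drawn animation production, background images are very costly, so a simple way to generate them is needed. This paper proposes a method that converts photographs of real scenery into an anime style. Because existing methods apply a uniform conversion to the whole image, they struggle to express the characteristics of background images in anime works: color usage, sharp outlines, and brush strokes. The proposed method takes a photograph and an anime background image as input and runs three automatic steps: segmenting the photograph and the anime image into regions, establishing correspondences between the regions, and transferring texture from the anime image to the photograph. The result is a new background image that reflects the characteristics of the anime background.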

CiNii
単一楽曲の切り貼りによる動画の盛り上がりに同期したBGM自動付加手法

佐藤晴紀, 平井辰典, 中野倫靖, 後藤真孝, 森島繁生

第77回全国大会講演論文集 2015 ( 1 ) 375 - 376 2015年03月

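This paper proposes a method that, given a video with its climax position and a song with a chosen passage to be used as BGM, automatically adds the song so that the video's climax coincides with the chosen passage. Earlier methods add BGM to new videos by learning the relationship between BGM and video in advance, but they did not consider reflecting user intentions such as "I want the chorus of the song to coincide with the decisive scene of the video." In this work, the user specifies when the chorus section should be played, and the song is pieced together from fragments so that the chorus arrives at the specified time while the start and end of the song and the video are aligned. Specifically, the song is cut and pasted in units of bars using dynamic programming, achieving automatic BGM addition that reflects the user's intention.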

CiNii
ダンスモーションにシンクロした音楽印象推定手法の提案とダンサーの表情自動合成への応用

朝比奈わかな, 岡田成美, 岩本尚也, 増田太郎, 福里司, 森島繁生

情報処理学会研究報告. [音楽情報科学] 2015 ( 23 ) 1 - 6 2015年02月

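With the recent spread of 3DCG production tools such as MikuMikuDance, the number of videos in which CG characters dance to music is increasing. In such dance videos, the character's facial expression strongly affects the impression of the whole work. For example, if the expression conveys an impression completely different from that of the music or the dance motion, the video feels unnatural. Even when the impression of the music is constant, the overall impression can vary greatly with the intensity of the dance motion and the character's posture. Dance motion information therefore has to be considered in addition to the impression of the music when determining the impression of a work. Based on a subjective evaluation experiment on the impressions of dance videos, we performed multiple regression analysis on acoustic and motion features, which enables music impression estimation synchronized with the dance motion.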

CiNii J-GLOBAL
映像の盛り上がり箇所に音楽のサビを同期させるBGM付加支援手法

佐藤晴紀, 平井辰典, 中野倫靖, 後藤真孝, 森島繁生

情報処理学会研究報告. [音楽情報科学] 2015 ( 10 ) 1 - 6 2015年02月

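This paper proposes a method that adds BGM over the entire length of a video while synchronizing a specified position in the input video with a specified position in the input song. Previous work proposed learning acoustic and visual features from existing videos to add BGM automatically. Because the BGM is added automatically, however, that work does not address synchronizing user-specified positions in the song and the video, as in "I want the chorus of the song to coincide with the decisive scene of the video" or "I want the start and end of the song to coincide with the start and end of the video." In this work, the song is pieced together from fragments so that the user-specified positions in the song and the video are synchronized while the lengths of the song and the video are matched, and BGM is thereby added over the entire video. Specifically, BGM synchronized at the user-specified positions is produced by cutting and pasting the song in units of bars based on dynamic programming. A user study showed that the method is particularly effective for instrumental songs with many passages of the same timbre. We also proposed a system with which the user can re-edit generated BGM to match the desired climax of the song.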
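
As a rough illustration of bar-level cut-and-paste by dynamic programming, the sketch below fills a fixed number of video slots with song bars so that the first and last bars are pinned to the ends and a chosen chorus bar lands on a chosen slot. The cost model (a flat penalty for every non-consecutive transition) and all names are assumptions made for this example; the paper's actual cost, which would likely involve acoustic similarity, is not given in the abstract.

```python
def arrange_bars(n_bars, n_slots, chorus_bar, chorus_slot, jump_cost=1.0):
    """Pick one song bar per video slot with Viterbi-style dynamic programming.

    Constraints: slot 0 plays bar 0, the last slot plays the last bar, and
    chorus_slot plays chorus_bar. Playing bar p+1 right after bar p is free;
    any other transition (a cut) costs jump_cost (hypothetical cost model).
    """
    inf = float("inf")

    def allowed(slot, bar):
        if slot == 0:
            return bar == 0
        if slot == n_slots - 1:
            return bar == n_bars - 1
        if slot == chorus_slot:
            return bar == chorus_bar
        return True

    cost = [[inf] * n_bars for _ in range(n_slots)]
    back = [[-1] * n_bars for _ in range(n_slots)]
    cost[0][0] = 0.0
    for t in range(1, n_slots):
        for b in range(n_bars):
            if not allowed(t, b):
                continue
            for p in range(n_bars):
                if cost[t - 1][p] == inf:
                    continue
                c = cost[t - 1][p] + (0.0 if b == p + 1 else jump_cost)
                if c < cost[t][b]:
                    cost[t][b] = c
                    back[t][b] = p

    last = n_bars - 1
    if cost[n_slots - 1][last] == inf:
        raise ValueError("no arrangement satisfies the constraints")
    # Trace the cheapest path back from the final slot.
    path = [last]
    for t in range(n_slots - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1], cost[n_slots - 1][last]

# An 8-bar song squeezed into 6 slots with the chorus (bar 4) at slot 2.
bars, total_cost = arrange_bars(n_bars=8, n_slots=6, chorus_bar=4, chorus_slot=2)
```

Replacing the flat `jump_cost` with a per-pair dissimilarity between bars would favor cuts between passages that sound alike, which is consistent with the reported advantage on instrumental songs with many passages of the same timbre.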

CiNii J-GLOBAL
実写画像に基づく特定画風を反映したアニメ背景画像への自動変換

山口周悟, 古澤知英, 福里司, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 2015 ( 14 ) 1 - 6 2015年02月

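In hand-drawn animation production, creating background images requires a great deal of labor. This work proposes a method that automatically converts photographs of real scenery into anime background images. To reproduce each animator's distinctive color usage, outlines, and brush strokes, the proposed method transfers the characteristics of scenery images in anime works to arbitrary photographs of real scenery. By taking into account the differences in color tone and painting style of each region, the method both preserves outlines, which previous work did insufficiently, and performs transfer that respects the characteristics of each region.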

CiNii J-GLOBAL
人物の皺の発生位置と形状を反映した経年変化顔画像合成

サフキンパーベル, 桑原大樹, 川井正英, 加藤卓哉, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 2015 ( 12 ) 1 - 8 2015年02月

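Long-term criminal investigations call for facial aging synthesis techniques that estimate a target person's past and future faces. Existing methods, however, cannot express the individuality of wrinkles, which is important in determining the impression after aging. The individuality of wrinkles is determined by their position and shape. Based on the finding that age-related wrinkles originate from the shape and position of expression wrinkles, this paper proposes an aging face synthesis method that reflects the individuality of wrinkles. First, face images with changing expressions are taken as input, and the shape and position of the input person's wrinkles are estimated from the expression wrinkles. Based on the estimate, the input person's face is reconstructed from an age-specific face image database to synthesize an aged face image carrying the impression of the target age. This enables aging face synthesis that reflects the individuality of wrinkles.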

CiNii J-GLOBAL
注視点の変化に追随するゲームキャラクタの頭部および眼球運動の自動合成

鍵山裕貴, 川井正英, 桑原大樹, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 2015 ( 13 ) 1 - 7 2015年02月

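Eye movements and the accompanying head movements are important for making game characters move more realistically. At present, however, eye movements are created manually by artists, which takes a great deal of labor and time. This work proposes a method that automatically synthesizes character animation taking into account the position and display time of the gaze target. Specifically, head and eye movements during gaze shifts are measured, and several parameters that determine the movement are extracted from the measured data. By modeling these parameters and setting them according to the position and display time of the gaze target, detailed head and eye movements of a character can be synthesized.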

CiNii J-GLOBAL
半透明物体における曲率と透過度合の相関分析

岡本翠, 安達翔平, 久保尋之, 向川康博, 森島繁生

研究報告コンピュータビジョンとイメージメディア（CVIM） 2015 ( 26 ) 1 - 6 2015年01月

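To analyze the subsurface scattering of light inside translucent objects, this work examines the correlation between the curvature of the object surface and subsurface scattering. In computer graphics, subsurface scattering has been modeled actively for image generation, but physically based optical simulation is computationally heavy, which makes it very difficult to use such models for image analysis. This work therefore focuses on the curvature-dependent reflectance function (CDRF), an approximate model of subsurface scattering with low computational cost. Using measurements of subsurface scattering in translucent objects of various curvatures, we analyze the correlation between curvature and the degree of light transmission and thereby verify the effectiveness of CDRF-based image analysis of translucent objects.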

CiNii
似顔絵からのフォトリアルな顔画像生成

溝川あい, 森島繁生

画像センシングシンポジウム講演論文集(CD-ROM) 21st ROMBUNNO.IS3-32 2015年

J-GLOBAL
半透明物体における曲率と透過度合の相関分析

岡本翠, 安達翔平, 久保尋之, 向川康博, 森島繁生

電子情報通信学会技術研究報告 114 ( 410(MVE2014 49-73) ) 147 - 152 2015年

J-GLOBAL
ラリーシーンに着目した映像自動要約によるラケットスポーツ動画鑑賞システム

河村俊哉, 福里司, 平井辰典, 森島繁生

情報処理学会論文誌ジャーナル(Web) 56 ( 3 ) 1028-1038 (WEB ONLY) 2015年

J-GLOBAL
ダンスモーションに同期した表情自動合成のための楽曲印象解析手法の提案

朝比奈わかな, 岡田成美, 岩本尚也, 増田太郎, 福里司, 森島繁生

情報処理学会全国大会講演論文集 77th ( 2 ) 2.383-2.384 2015年

J-GLOBAL
単一楽曲の切り貼りによる動画の盛り上がりに同期したBGM自動付加手法

佐藤晴紀, 平井辰典, 中野倫靖, 後藤真孝, 森島繁生

情報処理学会全国大会講演論文集 77th ( 2 ) 2.375-2.376 2015年

J-GLOBAL
皺の個人性を考慮した経年変化顔画像合成

SAVKIN Pavel, 桑原大樹, 川井正英, 加藤卓哉, 森島繁生

情報処理学会全国大会講演論文集 77th ( 4 ) 4.111-4.112 2015年

J-GLOBAL
実測に基づくゲームキャラクタの頭部および眼球運動の自動合成

鍵山裕貴, 川井正英, 桑原大樹, 森島繁生

情報処理学会全国大会講演論文集 77th ( 4 ) 4.109-4.110 2015年

J-GLOBAL
顔画像のシルエット情報に基づく3次元顔形状復元

野澤直樹, 桑原大樹, 森島繁生

情報処理学会全国大会講演論文集 77th ( 2 ) 2.525-2.526 2015年

J-GLOBAL
フィッティングを保持した体型の変化に頑健な衣装転写システムの提案

成田史弥, 齋藤隼介, 加藤卓哉, 福里司, 森島繁生

情報処理学会全国大会講演論文集 77th ( 4 ) 4.105-4.106 2015年

J-GLOBAL
雑音下での音源定位・音源分離に与える伝達関数測定法の影響の評価

赤堀渉, 増田太郎, 奥乃博, 森島繁生

情報処理学会全国大会講演論文集 77th ( 2 ) 2.119-2.120 2015年

J-GLOBAL
実写画像に基づく画風を考慮したアニメ背景画像生成システム

山口周悟, 古澤知英, 福里司, 森島繁生

情報処理学会全国大会講演論文集 77th ( 4 ) 4.71-4.72 2015年

J-GLOBAL
外科的矯正治療後のスマイルについて

寺田員人, 佐野奈都貴, 寺嶋縁里, 亀田剛, 小原彰浩, 齋藤功, 森島繁生

顎顔面バイオメカニクス学会誌 19/20 ( 1 ) 64‐70 2015年

J-GLOBAL
VoiceDub:複数タイミング情報をともなう映像エンタテイメント向け音声同期収録支援システム

川本真一, 森島繁生, 中村哲

情報処理学会論文誌ジャーナル(Web) 56 ( 4 ) 1142-1151 (WEB ONLY) 2015年

J-GLOBAL
音声と映像の変化に注目したフレーム間引きによる動画要約手法

平井辰典, 森島繁生

情報処理学会研究報告(Web) 2015 ( MUS-107 ) VOL.2015-MUS-107,NO.18 (WEB ONLY) 2015年

J-GLOBAL
頬のシルエット情報を活用した単一斜め向き顔画像に対する顔形状3次元復元手法

野澤直樹, 森島繁生

情報処理学会研究報告(Web) 2015 ( CG-159 ) VOL.2015-CG-159,NO.7 (WEB ONLY) 2015年

J-GLOBAL
手描き画像の特徴を保存した実写画像への画風転写

山口周悟, 加藤卓哉, 福里司, 古澤知英, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.32 2015年

J-GLOBAL
注視点の変化に追随するゲームキャラクタの頭部および眼球運動の自動合成

鍵山裕貴, 川井正英, 桑原大樹, 加藤卓哉, 藤崎匡裕, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.23 2015年

J-GLOBAL
中割り自動生成による手描きストロークベースのキーフレームアニメーション作成支援ツール

福里司, 古澤知英, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.22 2015年

J-GLOBAL
ポーズに依存しない4足キャラクタ間の衣装転写システムの提案

成田史弥, 齋藤隼介, 加藤卓哉, 福里司, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.07 2015年

J-GLOBAL
単一楽曲の切り貼りにより映像と楽曲の指定箇所を同期させるBGM付加支援インタフェース

佐藤晴紀, 平井辰典, 中野倫靖, 後藤真孝, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.26 2015年

J-GLOBAL
ダンスにシンクロした楽曲印象推定によるダンスキャラクタの表情アニメーション生成手法の提案

朝比奈わかな, 岡田成美, 岩本尚也, 増田太郎, 福里司, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.09 2015年

J-GLOBAL
皺の発生過程を考慮した経年変化顔画像合成

SAVKIN Pavel, 桑原大樹, 川井正英, 加藤卓哉, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.03 2015年

J-GLOBAL
似顔絵実写化手法の提案

溝川あい, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.01 2015年

J-GLOBAL
多重レイヤーボリューム構造を考慮したキャラクターのリアルタイム肉揺れアニメーション生成手法

岩本尚也, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2015 ROMBUNNO.08 2015年

J-GLOBAL
ラリーシーンに着目したラケットスポーツ動画の効率的鑑賞システム

河村俊哉, 福里司, 平井辰典, 森島繁生

画像ラボ 26 ( 7 ) 1 - 7 2015年

J-GLOBAL
キャラクターの身体構造を考慮した実時間肉揺れ生成手法

岩本尚也, 森島繁生

画像電子学会誌(CD-ROM) 44 ( 3 ) 502 - 511 2015年

DOI J-GLOBAL
要素間補間による共回転系弾性体シミュレーションの高速化

福原吉博, 斎藤隼介, 成田史弥, 森島繁生

情報処理学会研究報告(Web) 2015 ( CG-160 ) VOL.2015-CG-160,NO.8 (WEB ONLY) 2015年

J-GLOBAL
MusicMixer:ビート及び潜在トピックの類似度を用いたDJシステム

平井辰典, 土井啓成, 森島繁生

情報処理学会研究報告(Web) 2015 ( MUS-108 ) VOL.2015-MUS-108,NO.3,HIRAI (WEB ONLY) 2015年

J-GLOBAL
楽曲のビート類似度及び潜在トピックの類似度に基づくDJプレイの自動化

平井辰典, 土井啓成, 森島繁生

情報処理学会研究報告(Web) 2015 ( MUS-108 ) VOL.2015-MUS-108,NO.14 (WEB ONLY) 2015年

J-GLOBAL
キャラクタの個性的な表情特徴を反映した表情モデリング法の提案

加藤卓哉, 森島繁生

日本顔学会誌 15 ( 1 ) 102 2015年

J-GLOBAL
楽曲印象に基づくダンスモーションに同期したダンスキャラクタの表情自動合成

朝比奈わかな, 岡田成美, 岩本尚也, 増田太郎, 福里司, 森島繁生

日本顔学会誌 15 ( 1 ) 124 2015年

J-GLOBAL
骨格を基にした顔の肥痩シミュレーション

藤崎匡裕, 森島繁生

日本顔学会誌 15 ( 1 ) 140 2015年

J-GLOBAL
似顔絵実写化手法の提案

中村優文, 森島繁生

日本顔学会誌 15 ( 1 ) 125 2015年

J-GLOBAL
曲率依存反射関数を用いた半透明物体における照度差ステレオ法の改善

岡本翠, 久保尋之, 向川康博, 森島繁生

電子情報通信学会技術研究報告 115 ( 224(PRMU2015 67-91) ) 129 - 133 2015年

J-GLOBAL
パッチタイリングを用いた法線推定による3次元顔形状復元

野沢綸佐, 加藤卓哉, 藤崎匡裕, サフキンパーベル, 森島繁生

情報処理学会研究報告(Web) 2015 ( CVIM-199 ) VOL.2015-CVIM-199,NO.26 (WEB ONLY) 2015年

J-GLOBAL
動的計画法を用いた半透明物体のリアルタイムレンダリング

小澤禎裕, 岡本翠, 久保尋之, 森島繁生

情報処理学会研究報告(Web) 2015 ( CVIM-199 ) VOL.2015-CVIM-199,NO.1 (WEB ONLY) 2015年

J-GLOBAL
三次元形状を考慮した半透明物体のリアルタイムレンダリング

持田恵佑, 岡本翠, 小澤禎裕, 久保尋之, 森島繁生

情報処理学会研究報告(Web) 2015 ( CVIM-199 ) VOL.2015-CVIM-199,NO.2 (WEB ONLY) 2015年

J-GLOBAL
主成分分析に基づく類似口形状検出によるビデオ翻訳動画の生成

古川翔一, 加藤卓哉, 野澤直樹, サフキンパーベル, 森島繁生

情報処理学会研究報告(Web) 2015 ( CVIM-199 ) VOL.2015-CVIM-199,NO.14 (WEB ONLY) 2015年

J-GLOBAL
視線追跡データから算出された注目領域に基づく視線移動の少ない字幕配置法の提案と評価

赤堀渉, 平井辰典, 森島繁生

情報処理学会研究報告(Web) 2015 ( EC-38 ) VOL.2015‐EC‐38,NO.8 (WEB ONLY) 2015年

J-GLOBAL
Musicmean: Fusion-based music generation

Tatsunori Hirai, Shoto Sasaki, Shigeo Morishima

Proceedings of the 12th International Conference in Sound and Music Computing, SMC 2015 323 - 327 2015年

© 2015 Tatsunori Hirai et al. In this paper, we propose MusicMean, a system that fuses existing songs to create an "in-between song" such as an "average song," by calculating the average acoustic pitch of musical notes and the occurrence frequency of drum elements from multiple MIDI songs. We generate an in-between song for generative music by defining rules based on simple music theory. The system realizes the interactive generation of in-between songs. This represents a new form of interaction between humans and digital content. Using MusicMean, users can create personalized songs by fusing their favorite songs.
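
The idea of averaging note pitches between two songs can be sketched as follows. This is only an illustration under strong assumptions: the two melodies are given as MIDI note numbers already aligned one-to-one, and the function name and the `weight` parameter are hypothetical. The music-theory rules and drum handling described in the abstract are not modeled.

```python
def average_melody(notes_a, notes_b, weight=0.5):
    """Blend two aligned melodies given as MIDI note numbers.

    weight=0.0 returns melody A, weight=1.0 returns melody B, and values in
    between give an "in-between" melody rounded to the nearest semitone.
    """
    if len(notes_a) != len(notes_b):
        raise ValueError("melodies must be aligned note-for-note")
    return [round((1.0 - weight) * a + weight * b)
            for a, b in zip(notes_a, notes_b)]

# C major triad (C4, E4, G4) fused with the same shape a major third higher.
in_between = average_melody([60, 64, 67], [64, 68, 71])
```
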
Flesh jigging method considered in character body structure in real-time

Naoya Iwamoto, Shigeo Morishima

Journal of the Institute of Image Electronics Engineers of Japan 44 ( 3 ) 502 - 511 2015年

This paper presents a method for synthesizing soft-body character animation at high speed in real time. In particular, it considers not only the primary but also the secondary physics-based deformation of the subcutaneous fat structure driven by skeletal motion. High-fidelity soft-body animation usually relies on costly FEM-based methods, and a method that achieves fast and robust soft-body animation with a simplified elastic simulation has been proposed. That method, however, applies the skinning result to both fat and skin, so the vibration is limited to a very small and unnatural one. In this paper, we propose a method that separates skinning from simulation by modeling the structure under the skin approximately and automatically. As a result, parameters such as volume and elasticity can be controlled freely at each layer. Through interactive parameter adjustment, the user can define the soft-body material according to location and generate and edit suitable soft-body motion. An evaluation of soft-body motion synthesis on several character models showed the method to be very effective for animating soft-bodied characters.
Automatic singing voice to music video generation via mashup of singing video clips

Tatsunori Hirai, Yukara Ikemiya, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima

Proceedings of the 12th International Conference in Sound and Music Computing, SMC 2015 153 - 159 2015年

© 2015 Tatsunori Hirai et al. This paper presents a system that takes audio signals of any song sung by a singer as the input and automatically generates a music video clip in which the singer appears to be actually singing the song. Although music video clips have gained popularity in video streaming services, not all existing songs have corresponding video clips. Given a song sung by a singer, our system generates a singing video clip by reusing existing singing video clips featuring the singer. More specifically, the system retrieves short fragments of singing video clips that include singing voices similar to those in the target song, and then concatenates these fragments using dynamic programming (DP). To achieve this, we propose a method to extract singing scenes from music video clips by combining vocal activity detection (VAD) with mouth aperture detection (MAD). The subjective experimental results demonstrate the effectiveness of our system.
実計測による半透明物体の反射関数推定とリアルタイムレンダリング

岡本翠, 安達翔平, 宇梶弘晃, 岡見和樹, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 2014 ( 3 ) 1 - 7 2014年02月

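Rendering translucent objects photorealistically is important in many kinds of CG content. This work proposes a method that relates curvature to the degree of light transmission using measured data and renders translucent objects quickly. Translucent spheres of various radii are illuminated, and the change in luminance is measured against the angle between the sphere normal and the light direction. In addition, the luminance distribution against the angle between the sphere normal and the view direction is analyzed to examine the effect of light directionality in translucent objects. From the measured data, we derive the relationship between curvature and subsurface scattering in translucent objects and achieve fast, high-quality rendering.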

CiNii J-GLOBAL
物理的特徴に基づく擬音語可視化手法の検討

福里司, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 2014 ( 10 ) 1 - 8 2014年02月

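This paper examines a method for estimating onomatopoeia from physical parameters. Onomatopoeia puts into words the texture and motion of characters in anime and manga, and is an expressive technique that lets the audience grasp content and situations intuitively. At present, however, animators choose it by intuition based on the situation and design materials. We therefore propose a method that automatically estimates and attaches onomatopoeia from multiple parameters needed in CG animation production. This technique makes it possible to visualize environmental information that cannot be grasped from the video alone, such as the feel of materials and speed.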

CiNii J-GLOBAL
頭蓋骨形状に基づく顔の痩せ太りシミュレーション

藤崎匡裕, 桑原大樹, 溝川あい, 岩尾知頼, 中村太郎, 前島謙宣, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 2014 ( 20 ) 1 - 7 2014年02月

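Demand for accurate simulation of facial weight loss and gain is growing in beauty, healthcare, and entertainment. Existing methods apply the same deformation rule, defined from a database, to the face images of all input persons, so individual characteristics of the deformation are ignored. They also define the weight-change rule from facial surface information only, which leads to unnatural deformations that ignore the shape of the skull. We propose a method that estimates the skull shape inside the face from a single frontal face image and uses it to perform weight-change deformation that accounts for each input person's individuality, while preventing deformation beyond the skull shape. The method achieves realistic facial weight-change simulation that preserves the individuality of the input person and avoids unnatural deformation beyond the skull shape.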

CiNii J-GLOBAL
キャラクタ特有の特徴再現を考慮したリアルな表情リターゲッティング手法の提案

加藤卓哉, 川井正英, 斉藤隼介, 岩尾知頼, 前島謙宣, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 2014 ( 15 ) 1 - 8 2014年02月

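With blendshapes, which are used to create facial expressions of CG characters, creating key shapes that reproduce each character's specific expression changes has required a great deal of labor. This work proposes a method that learns, from a small number of expressions, a mapping from key shapes generated by transferring human expressions to key shapes created by an artist, and applies it to the key shapes of other expressions to which human expressions have been transferred. Using the key shapes generated by this method, we achieved retargeting that takes into account the character-specific expression features defined by the artist with a small number of expressions.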

CiNii J-GLOBAL
LYRICS RADAR:歌詞の潜在的意味分析に基づく歌詞検索インタフェース

佐々木将人, 吉井和佳, 中野倫靖, 後藤真孝, 森島繁生

情報処理学会研究報告(Web) 2014 ( MUS-102 ) VOL.2014-MUS-102,NO.26 (WEB ONLY) 2014年

J-GLOBAL
Query by Phrase:半教師あり非負値行列因子分解を用いた音楽信号中のフレーズ検出

増田太郎, 吉井和佳, 後藤真孝, 森島繁生

情報処理学会研究報告(Web) 2014 ( MUS-102 ) VOL.2014-MUS-102,NO.25 (WEB ONLY) 2014年

J-GLOBAL
話者類似度の時間的変化を用いた多人数音声モーフィングに基づく話者変換

浜崎皓介, 河原英紀, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2014 ROMBUNNO.3-6-1 2014年

J-GLOBAL
ラケットスポーツ動画の構造解析に基づく映像要約と鑑賞インタフェースの提案

河村俊哉, 福里司, 平井辰典, 森島繁生

情報処理学会全国大会講演論文集 76th ( 2 ) 2.117-2.118 2014年

J-GLOBAL
キャラクタに固有な表情変化の特徴を反映したキーシェイプ自動生成手法の提案

加藤卓哉, 川井正英, 桑原大樹, 斉藤隼介, 岩尾知頼, 前島謙宣, 森島繁生

情報処理学会全国大会講演論文集 76th ( 4 ) 4.339-4.340 2014年

J-GLOBAL
髪の特徴に基づく類似顔画像検索

藤賢大, 福里司, 増田太郎, 平井辰典, 森島繁生

情報処理学会全国大会講演論文集 76th ( 1 ) 1.539-1.540 2014年

J-GLOBAL
実測に基づく反射関数による半透明物体のリアルタイムレンダリング

岡本翠, 安達翔平, 宇梶弘晃, 岡見和樹, 森島繁生

情報処理学会全国大会講演論文集 76th ( 4 ) 4.301-4.302 2014年

J-GLOBAL
正面および側面の手描き顔画像からの顔回転シーン自動生成

古澤知英, 福里司, 岡田成美, 平井辰典, 森島繁生

情報処理学会全国大会講演論文集 76th ( 4 ) 4.345-4.346 2014年

J-GLOBAL
頭蓋骨の形状を考慮した顔の肥痩シミュレーション

藤崎匡裕, 桑原大樹, 溝川あい, 中村太郎, 前島謙宣, 森島繁生

情報処理学会全国大会講演論文集 76th ( 2 ) 2.283-2.284 2014年

J-GLOBAL
歌手映像と歌声の解析に基づく音楽動画中の歌唱シーン検出手法の検討

平井辰典, 中野倫靖, 後藤真孝, 森島繁生

電子情報通信学会技術研究報告 114 ( 52(SP2014 1-45) ) 271 - 278 2014年

J-GLOBAL
物理現象を考慮した映像シーンへの擬音語自動付加の研究

福里司, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2014 ROMBUNNO.16 2014年

J-GLOBAL
正面口内画像群からのリアルな三次元口内アニメーションの自動生成

川井正英, 岩尾知頼, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2014 ROMBUNNO.22 2014年

J-GLOBAL
キャラクタ固有の表情特徴を考慮した顔アニメーション生成手法

加藤卓哉, 斉藤隼介, 川井正英, 岩尾知頼, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2014 ROMBUNNO.37 2014年

J-GLOBAL
曲率依存反射関数の実測に基づく半透明物体のリアルタイムレンダリング

岡本翠, 安達翔平, 宇梶弘晃, 岡見和樹, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2014 ROMBUNNO.50 2014年

J-GLOBAL
ラケットスポーツのラリーシーンに着目した映像要約と効率的鑑賞インタフェース

河村俊哉, 福里司, 平井辰典, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2014 ROMBUNNO.4 2014年

J-GLOBAL
2枚の手描き顔画像を用いたキャラクタ顔回転シーン自動生成

古澤知英, 福里司, 岡田成美, 平井辰典, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2014 ROMBUNNO.35 2014年

J-GLOBAL
頭蓋骨形状に基づいた顔の肥痩シミュレーション

藤崎匡裕, 桑原大樹, 溝川あい, 中村太郎, 前島謙宣, 山下隆義, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2014 ROMBUNNO.20 2014年

J-GLOBAL
髪の特徴に基づく顔の印象類似検索システム

藤賢大, 福里司, 佐々木将人, 増田太郎, 平井辰典, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2014 ROMBUNNO.41 2014年

J-GLOBAL
パターン計測技術の深化と広がる産業応用: エンタテインメント応用のための人物顔パターン計測・合成技術

森島繁生

計測と制御 53 ( 7 ) 593 - 598 2014年

DOI J-GLOBAL
正面および側面のイラストからのキャラクタ顔回転シーンの自動生成

古澤知英, 福里司, 岡田成美, 平井辰典, 森島繁生

情報処理学会研究報告(Web) 2014 ( CG-156 ) VOL.2014-CG-156,NO.8 (WEB ONLY) 2014年

J-GLOBAL
振り付けの構成要素を考慮したダンスモーションのセグメンテーション手法の提案

岡田成美, 福里司, 岩本尚也, 森島繁生

情報処理学会研究報告(Web) 2014 ( CG-156 ) VOL.2014-CG-156,NO.9 (WEB ONLY) 2014年

J-GLOBAL
個人性を保持したデータドリブンなパーツベース経年変化顔画像合成

桑原大樹, 前島謙宣, 藤崎匡裕, 森島繁生

情報処理学会研究報告(Web) 2014 ( CVIM-194 ) VOL.2014-CVIM-194,NO.23 (WEB ONLY) 2014年

J-GLOBAL
Automatic Photorealistic 3D Inner Mouth Restoration from Frontal Images

Masahide Kawai, Tomoyori Iwao, Akinobu Maejima, Shigeo Morishima

ADVANCES IN VISUAL COMPUTING (ISVC 2014), PT 1 8887 51 - 62 2014年

In this paper, we propose a novel method to generate highly photorealistic three-dimensional (3D) inner mouth animation that is well-fitted to an original ready-made speech animation using only frontal captured images and a small-size database. The algorithms are composed of quasi-3D model reconstruction and motion control of teeth and the tongue, and final compositing of photorealistic speech animation synthesis tailored to the original.
A visuomotor coordination model for obstacle recognition

Journal of WSCG 22 ( 2 ) 49 - 56 2014年01月

In this paper, we propose a novel method for animating CG characters that pay attention to obstacles while walking or running. Our primary contribution is to formulate a generic visuomotor coordination model for obstacle recognition with whole-body movements. In addition, our model easily generates gaze shifts, which express the individuality of characters. Based on experimental evidence, we also incorporate the coordination of eye movements in response to obstacle recognition behavior via simple parameters related to the target position and the individuality of the characters' gaze shifts. Our overall model can generate plausible visuomotor coordinated movements in various scenes by manipulating the parameters of our proposed functions.
Character transfer: Example-based individuality retargeting for facial animations

22nd International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, WSCG 2014, Full Papers Proceedings - in co-operation with EUROGRAPHICS Association 121 - 129 2014年01月

A key disadvantage of blendshape animation is the labor-intensive task of sculpting blendshapes with individual expressions for each character. In this paper, we propose a novel system, "Character Transfer", that automatically sculpts blendshapes with individual expressions by extracting them from training examples; this extraction creates a mapping that drives the sculpting process. Compared with the naive method of transferring facial expressions from other characters, Character Transfer effectively sculpted blendshapes without the need to create unnecessary blendshapes for other characters. Character Transfer is applicable even when only a few training examples are available, thanks to region segmentation of the face and blending of the mappings.
ラケットスポーツ動画の構造解析による映像要約手法の提案

河村俊哉, 福里司, 平井辰典, 森島繁生

情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア] 2013 ( 15 ) 1 - 6 2013年11月

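Sports videos have recently become easy to watch, and efficient ways of viewing them are needed. As one solution, existing video summarization for racket sports generated summaries of important rally scenes, but it had problems in rally scene detection and its evaluation, and did not consider efficient ways of viewing the summary. This paper proposes a new rally scene detection method for racket sports videos, a video summarization method using the importance of each rally, and a way of viewing the result. The proposed method achieves accurate rally scene detection by clustering similar shots in a shot-segmented video and selecting the clusters that contain rallies. Each rally is then scored for importance using acoustic information, and the user adjusts the result so that the video can be watched within an arbitrary time. We further propose a fast-playback viewing method specialized for racket sports, achieving even more efficient viewing.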

CiNii
CG キャラクタのための感情と時系列性を考慮した眼球運動の分析と合成

岩尾知頼, 久保尋之, 前島謙宣, 森島繁生

エンタテインメントコンピューティングシンポジウム2013論文集 2013 259 - 265 2013年09月

CiNii
データドリブンなフォトリアル口内アニメーションの自動生成

川井正英, 岩尾知頼, 前島謙宣, 森島繁生

エンタテインメントコンピューティングシンポジウム2013論文集 2013 251 - 258 2013年09月

CiNii
ダンスモーションにおける表現力付与システムの提案

岡田成美, 福里司, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 2013 ( 4 ) 1 - 7 2013年06月

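In this work, an optical motion capture system is used to acquire reference dance motion data and dance motion data performed with different expressions, and these are used in a method for easily creating expressive dance motion while taking tempo into account. A subjective evaluation experiment is conducted on whether the dancer's intention is conveyed to viewers, and based on the results a filter that adds expressiveness to the reference motion is created. The filter performs the conversion with attention to pacing and joint angles while keeping the tempo constant. Applying the resulting filter to arbitrary dance motion makes it possible to create richly expressive motion.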

CiNii J-GLOBAL
認識・検出顔形状による制約を考慮した線形回帰に基づく顔特徴点自動検出

松田龍英, 前島謙宣, 森島繁生

画像ラボ 24 ( 6 ) 32 - 38 2013年06月

CiNii J-GLOBAL
レコむし：画像と楽曲の印象の一致による楽曲推薦システム

佐々木将人, 平井辰典, 大矢隼士, 森島繁生

研究報告エンタテインメントコンピューティング（EC） 2013 ( 10 ) 1 - 6 2013年03月

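We propose レコむし (RECOmmendation of MUSic using an input Image), a system that recommends songs that affectively match an input image. The current scene is one of the important elements of enjoying music, because the more a song's impression harmonizes with the scene, the more moving the song is when heard. However, manually finding songs that precisely match the current scene from a huge collection is not easy. In this work, images and songs are placed in a psychological space called the AV (Arousal-Valence) space, which associates the impression of a scene with the impression of a song. Using this association, レコむし recommends as a playlist songs that give an impression matching the current scene. We also evaluated レコむし by comparison with randomly selected songs.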
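
Matching an image to songs in the AV space can be sketched as a nearest-neighbor lookup. The sketch assumes that an image and each song have already been mapped to (arousal, valence) coordinates and that Euclidean distance is a reasonable measure of mismatch; how those coordinates are estimated is not covered, and the function name and sample values are hypothetical.

```python
import numpy as np

def recommend_songs(image_av, song_avs, k=3):
    """Return indices of the k songs closest to the image in AV space."""
    song_avs = np.asarray(song_avs, dtype=float)
    distances = np.linalg.norm(song_avs - np.asarray(image_av, dtype=float), axis=1)
    return np.argsort(distances)[:k].tolist()

# Hypothetical (arousal, valence) coordinates for four songs and one image.
songs = [(0.4, 0.6), (-0.8, -0.2), (0.9, 0.1), (0.5, 0.45)]
playlist = recommend_songs((0.5, 0.5), songs, k=3)
```
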

CiNii
レコむし：画像と楽曲の印象の一致による楽曲推薦システム

佐々木将人, 平井辰典, 大矢隼士, 森島繁生

研究報告音楽情報科学（MUS） 2013 ( 10 ) 1 - 6 2013年03月

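We propose レコむし (RECOmmendation of MUSic using an input Image), a system that recommends songs that affectively match an input image. The current scene is one of the important elements of enjoying music, because the more a song's impression harmonizes with the scene, the more moving the song is when heard. However, manually finding songs that precisely match the current scene from a huge collection is not easy. In this work, images and songs are placed in a psychological space called the AV (Arousal-Valence) space, which associates the impression of a scene with the impression of a song. Using this association, レコむし recommends as a playlist songs that give an impression matching the current scene. We also evaluated レコむし by comparison with randomly selected songs.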

CiNii
年齢別パッチを用いた画像再構成による経年変化顔画像合成

前島謙宣, 溝川あい, 松田龍英, 森島繁生

研究報告コンピュータビジョンとイメージメディア（CVIM） 2013 ( 28 ) 1 - 6 2013年03月

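To build criminal investigation support systems aimed at a safe and secure society, there is a need for facial aging synthesis techniques that can synthesize a subject's past and future faces from a face photograph. This paper proposes a method that synthesizes aged face images by reconstructing the face from age-specific sets of small image patches built from a database of face images captured in the same environment. Through reconstruction of the input face, the proposed method expresses fine skin features such as spots and dullness, which were difficult for existing methods, and it can simulate aging by modulating the reconstruction result with a statistical wrinkle model. To verify its effectiveness, we conducted subjective evaluation experiments on age estimation and personal identification, and show that the method can synthesize aged face images that look like the target age while retaining the impression of the original person.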

CiNii
ユーザの簡易的操作によるインタラクティブな加齢変化顔合成

溝川あい, 中井宏紀, 前島謙宣, 森島繁生

研究報告コンピュータビジョンとイメージメディア（CVIM） 2013 ( 29 ) 1 - 5 2013年03月

Many studies on aged face image synthesis have been reported, aimed at security applications such as searching for criminals or kidnapped children and at entertainment applications. Tazoe et al. proposed a facial aging technique that expresses age-specific skin texture such as rough or dull skin, which helps in determining a person's age. However, in their method it is difficult to represent wrinkles clearly, although they are one of the most important elements reflecting age characteristics. In addition, the result is strongly influenced by the lighting conditions and skin color of the original image. It is also difficult to infer the location and shape of future wrinkles because they depend on individual factors, so a variety of possible wrinkles has to be considered. In this paper, we propose an aged face image synthesis method that binarizes the wrinkle region to reduce the influence of lighting conditions and skin color, and that adds realistic age wrinkles from curves drawn freehand by the user at any location.

CiNii
顔器官の輪郭情報を用いた経年変化にロバストな認証システムの一検討

中井宏紀, 平井辰典, 前島謙宣, 森島繁生

研究報告コンピュータビジョンとイメージメディア（CVIM） 2013 ( 30 ) 1 - 6 2013年03月

Face authentication accuracy falls when the appearance of the face changes with aging, even for the same subject. In this paper, we aim to improve face authentication accuracy under age variation by using as features the contour information of facial parts (eyes, nose, mouth, and so on), whose appearance changes little with aging. Specifically, we use a face graph as the geometric feature, as in previous work, and Histogram of Oriented Gradient (HOG) features in the neighborhood of facial feature points as the texture feature. In authentication experiments on the public FG-NET Aging Database, the method achieved higher authentication accuracy than previous work, confirming its effectiveness.

CiNii
短繊維を考慮した埃の描画手法

安達翔平, 宇梶弘晃, 小坂昂大, 森島繁生

全国大会講演論文集 2013 ( 1 ) 299 - 301 2013年03月

様々な汚れの表現手法の中でも、埃の堆積は物体の経年変化を表す重要な要素である。従来の手法では堆積位置の関数化や反射特性の実測に基づいた描画を行っていたが、埃の構成要素である繊維や、堆積に由来する立体構造は考慮されていなかった。よって、本研究では上記の問題を改善し、写実的な描画手法を提案する。埃の堆積量については、高速処理が可能な階層状のテクスチャマッピングとして知られるShell法を用いて擬似的な立体表現を行った。繊維の表現に対しては、堆積した繊維の屈曲方向が無秩序であることから、UV平面上の点にランダムな運動を与えることで繊維の表現が可能なテクスチャのプロシージャル生成を実現した。

CiNii
リアルな口内表現を実現する発話アニメーションの自動生成

川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

全国大会講演論文集 2013 ( 1 ) 229 - 231 2013年03月

近年、映像コンテンツにおいて実写ベースのCGアニメーションを目にする機会が増加している。中でも発話シーンの合成においては、従来手法では、舌の複雑な動きや口内構造の表現において課題が残されていた。そこで本研究では、口内表現が不十分な発話顔画像を入力とし、予め取得した口唇画像群を利用して、入力画像の口内領域を補完することにより、実写品質の発話顔画像を自動生成する手法を提案する。具体的には、ある一人物の歯画像と舌画像を、開口距離情報と音素情報を用いて入力画像に挿入し、多人数の口唇領域のパッチ画像を合成された入力画像上にタイリングする。これにより、従来手法では特に表現が困難であった複雑な口内表現が可能となった。

CiNii J-GLOBAL
D-12-31 疑似的な皺特徴の追加によるリアルな加齢変化顔合成(D-12.パターン認識・メディア理解B(コンピュータビジョンとコンピュータグラフィックス))

溝川あい, 中井宏紀, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2013 ( 2 ) 124 - 124 2013年03月

CiNii J-GLOBAL
D-12-40 Structure From Motionと照度差ステレオの統合による3次元顔形状復元の高精度化(D-12.パターン認識・メディア理解B(コンピュータビジョンとコンピュータグラフィックス))

桑原大樹, 松田龍英, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2013 ( 2 ) 133 - 133 2013年03月

CiNii J-GLOBAL
Perlin noiseを用いた短繊維生成法による埃の高速描画手法

安達翔平, 宇梶弘晃, 小坂昂大, 森島繁生

研究報告グラフィクスとCAD（CG） 2013 ( 6 ) 1 - 7 2013年02月

埃の表現は，物体の経年変化の描画を写実的に行う重要な要素である．本研究では，埃が短繊維の集合であることに着目し， UV 平面上で点をランダムに運動させ，その軌跡を描画することにより，埃の無秩序な短繊維の形状を表現するテクスチャの生成を行った．そして，テクスチャを階層状に積層させ，高速に描画する Shell 法を用いることで，従来法において実現し得なかった繊維構造由来の立体表現，及び短繊維のリアルタイムでの描画を実現した．

Dust rendering is an important factor in realistically rendering aged objects. In this paper, we focus on the fact that dust consists of short fibers. We generate textures representing disordered short fibers by drawing the trajectories of points moving randomly in UV coordinates. Then, for rapid rendering, we use the Shell texturing method, which lets us stack textures over arbitrary surfaces and render the series of textures rapidly. In this way, we achieve real-time rendering of short fibers and a volumetric representation based on the fiber structure.

CiNii J-GLOBAL
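
The texture generation step described in the abstract above (points moving randomly on the UV plane, with their trajectories drawn as fibers) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain uniform random walk rather than Perlin noise, and the function name `fiber_texture` and its parameters (`size`, `fibers`, `steps`, `seed`) are hypothetical.

```python
import random


def fiber_texture(size=64, fibers=40, steps=30, seed=0):
    """Rasterise random-walk trajectories into a size x size 0/1 texture."""
    rng = random.Random(seed)
    tex = [[0] * size for _ in range(size)]
    for _ in range(fibers):
        # Each fiber starts at a random point in UV space [0, 1) x [0, 1).
        u, v = rng.random(), rng.random()
        for _ in range(steps):
            # Take a random step of at most one texel; wrap so the texture tiles.
            u = (u + rng.uniform(-1.0, 1.0) / size) % 1.0
            v = (v + rng.uniform(-1.0, 1.0) / size) % 1.0
            # Clamp the index in case floating-point wrap-around returns 1.0.
            x = min(size - 1, int(u * size))
            y = min(size - 1, int(v * size))
            tex[y][x] = 1
    return tex


texture = fiber_texture()
print(sum(map(sum, texture)), "texels covered by fibers")
```

In a Shell-style renderer, several such textures would be stacked as layers above the surface; that stacking is outside the scope of this sketch.
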
口内情報のリアルな表現を可能とするデータドリブンな発話アニメーション自動生成

川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

研究報告グラフィクスとCAD（CG） 2013 ( 2 ) 1 - 8 2013年02月

CG 発話アニメーションの自動生成手法は既に数多く提案されている．しかし，音節/θe/のような舌を噛む動きや舌の裏側といった細部の表現は未だ課題が残されている．そこで本研究では，口内画像を，開口距離情報がラベル付けされた歯画像と，音素ごとの動きに分類した舌画像とに分離し，任意の方法で生成された発話アニメーションの口内領域に挿入した後，複数人の口唇画像を用いてパッチタイリングを施すことで，複雑な口内表現を可能にした．

Speech animation synthesis is still a challenging topic in computer graphics. Despite many attempts, detailed inner-mouth appearance, such as the tip of the tongue being nipped between the teeth or the underside of the tongue, has not been achieved in the resulting animations. To solve this problem, we propose a data-driven speech animation synthesis method that focuses on a realistic inner mouth. First, we separate inner-mouth images into teeth images labeled with the opening distance of the teeth and tongue images classified according to phoneme information. We then insert them into a pre-created speech animation based on the opening distance of the teeth and the phoneme information. Finally, we apply a patch-based texture synthesis technique, using a database of 2213 images created from 7 subjects, to the resulting animation. With the proposed method, we can automatically generate a speech animation with a realistic inner mouth from a speech animation created by previous methods.

CiNii J-GLOBAL
時間連続性と顔形状制約を考慮した線形予測に基づく特徴点追跡 (パターン認識・メディア理解)

松田龍英, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 : 信学技報 112 ( 385 ) 145 - 150 2013年01月

本研究では, Eng-Jonらが提案したLinear Predictorsに時間連続性と顔形状制約を考慮した新しい顔特徴点追跡手法を提案する.Linear Predictorsは注目画素周辺の画像特徴量と注目画素から正解位置への移動ベクトルを線形回帰によって対応づける手法であり,動画像内の十数フレームを学習画像として用いることで正確な特徴点追跡を可能とする.提案手法では, Linear Predictorsによって特徴点を検出し,その位置から前フレームの推定位置周辺の輝度値を基にオプティカルフローによって特徴点を移動させる.さらに,従来手法で提案された主成分分析に基づく幾何学的制約によって特徴点の位置を補正することで,未学習人物に対してもロバストな特徴点追跡を実現した.

CiNii
時間連続性と顔形状制約を考慮した線形予測に基づく特徴点追跡 (マルチメディア・仮想環境基礎)

松田龍英, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 : 信学技報 112 ( 386 ) 145 - 150 2013年01月

本研究では, Eng-Jonらが提案したLinear Predictorsに時間連続性と顔形状制約を考慮した新しい顔特徴点追跡手法を提案する.Linear Predictorsは注目画素周辺の画像特徴量と注目画素から正解位置への移動ベクトルを線形回帰によって対応づける手法であり,動画像内の十数フレームを学習画像として用いることで正確な特徴点追跡を可能とする.提案手法では, Linear Predictorsによって特徴点を検出し,その位置から前フレームの推定位置周辺の輝度値を基にオプティカルフローによって特徴点を移動させる.さらに,従来手法で提案された主成分分析に基づく幾何学的制約によって特徴点の位置を補正することで,未学習人物に対してもロバストな特徴点追跡を実現した.

CiNii J-GLOBAL
経年変化顔シミュレータ

前島謙宣, 森島繁生

画像センシングシンポジウム講演論文集(CD-ROM) 19th ROMBUNNO.DS2-01 2013年

J-GLOBAL
疑似的な皺の付加による加齢変化顔合成

溝川あい, 中井宏紀, 前島謙宣, 森島繁生

画像センシングシンポジウム講演論文集(CD-ROM) 19th ROMBUNNO.IS2-13 2013年

J-GLOBAL
時間連続性と顔形状制約を考慮した線形予測に基づく特徴点追跡

松田龍英, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 112 ( 385(PRMU2012 84-129) ) 145 - 150 2013年

J-GLOBAL
アニメ作品のコミック画像解析に基づく動画要約手法の提案

福里司, 岩本尚也, 森島繁生

画像電子学会誌(CD-ROM) 42 ( 1 ) 117 2013年

J-GLOBAL
音素情報からの口唇動作推定を利用した発話アニメーションの生成

三間大輔, 前島謙宣, 森島繁生

画像電子学会誌(CD-ROM) 42 ( 1 ) 118 2013年

J-GLOBAL
動画フレームの時間連続性と顔類似度に基づく動画コンテンツの同一人物抽出手法

平井辰典, 中野倫靖, 後藤真孝, 森島繁生

画像電子学会誌(CD-ROM) 42 ( 1 ) 116 2013年

J-GLOBAL
対話時の感情を反映した眼球運動の分析及び合成

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

画像電子学会誌(CD-ROM) 42 ( 1 ) 117 2013年

J-GLOBAL
繊維構造を考慮した埃の高速描画手法

安達翔平, 宇梶弘晃, 小坂昂大, 森島繁生

情報処理学会全国大会講演論文集 75th ( 4 ) 4.299-4.300 2013年

J-GLOBAL
リアルな口内表現を実現する発話アニメーションの自動生成

川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

情報処理学会全国大会講演論文集 75th ( 4 ) 4.229-4.230 2013年

J-GLOBAL
注目領域中の画像類似度に基づく動画中のキャラクター登場シーンの推薦手法

増田太郎, 平井辰典, 大矢隼士, 森島繁生

情報処理学会全国大会講演論文集 75th ( 2 ) 2.601-2.602 2013年

J-GLOBAL
入力画像に感性的に一致した楽曲を推薦するシステム

佐々木将人, 平井辰典, 大矢隼士, 森島繁生

情報処理学会全国大会講演論文集 75th ( 2 ) 2.45-2.46 2013年

J-GLOBAL
ダンスモーションにおける表現のバリエーション生成

岡田成美, 岡見和樹, 福里司, 岩本尚也, 森島繁生

情報処理学会全国大会講演論文集 75th ( 4 ) 4.227-4.228 2013年

J-GLOBAL
年齢別パッチを用いた画像再構成による経年変化顔画像合成

前島謙宣, 溝川あい, 松田龍英, 森島繁生

情報処理学会研究報告(CD-ROM) 2012 ( 6 ) ROMBUNNO.CVIM-186,NO.28 2013年

J-GLOBAL
口内情報のリアルな表現を可能とするデータドリブンな発話アニメーション自動生成

川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

情報処理学会研究報告(CD-ROM) 2012 ( 6 ) ROMBUNNO.CG-150,NO.2 2013年

J-GLOBAL
レコむし:画像と楽曲の印象の一致による楽曲推薦システム

佐々木将人, 平井辰典, 大矢隼士, 森島繁生

情報処理学会研究報告(CD-ROM) 2012 ( 6 ) ROMBUNNO.MUS-98,NO.10 2013年

J-GLOBAL
顔器官の輪郭情報を用いた経年変化にロバストな認証システムの一検討

中井宏紀, 平井辰典, 前島謙宣, 森島繁生

情報処理学会研究報告(CD-ROM) 2012 ( 6 ) ROMBUNNO.CVIM-186,NO.30 2013年

J-GLOBAL
ユーザの簡易的操作によるインタラクティブな加齢変化顔合成

溝川あい, 中井宏紀, 前島謙宣, 森島繁生

情報処理学会研究報告(CD-ROM) 2012 ( 6 ) ROMBUNNO.CVIM-186,NO.29 2013年

J-GLOBAL
既存音楽動画の再利用による音楽に合った動画の自動生成システム

平井辰典, 大矢隼士, 森島繁生

情報処理学会論文誌ジャーナル(Web) 54 ( 4 ) 1254 - 1262 2013年

本論文では，任意の入力楽曲を基に，既存の音楽動画コンテンツを再利用し，音楽と映像が同期した音楽動画を自動生成するシステムを提案する．本研究では，まずシステムの土台となる音楽と映像の同期手法を主観評価実験により検討した．その結果に基づき，音のエネルギーを表すRMSの変化に，映像のアクセント（明滅や動きなど）を対応させるような音楽動画自動生成システムを実装した．音楽動画の自動生成の手順は以下のとおりである．まずデータベースの構築として既存の音楽動画の各フレームにおける明滅，動きに関する映像特徴量の計算を行う．そして，動画生成として，入力楽曲のRMSを抽出し，その推移に最も近い推移を示す映像特徴量を持つ音楽動画の素片をデータベースから探索し，それらの映像を切り貼りすることで，音楽に最も同期した音楽動画の生成を行う．また，本システムによる生成動画の評価も行った．

In this paper, we present a system that automatically generates mashup music videos by segmenting and concatenating existing video clips. To create a music video synchronized with arbitrary input music, we first ran a subjective evaluation experiment to determine the conditions under which motion in a video is best synchronized with music. Based on the result, the system synchronizes accents in the video, such as movement and flicker, with the RMS energy of the input music. It calculates the RMS energy of the input music in each musical bar and searches a database of music videos for the video sequence that gives the best synchronization. The generated music videos therefore reflect the result of the subjective evaluation, in which changes in brightness and the movement of objects are aligned with the sound. We also performed a subjective evaluation experiment on the music videos output by the system.

CiNii J-GLOBAL
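
The search step in the abstract above (extract the RMS curve of the input music, then find the video segment whose feature curve follows it most closely) might be sketched as follows. This is an illustration under stated assumptions, not the published system: the sum-of-squared-differences cost, the fixed window length, and the names `rms_per_window` and `best_matching_clip` are all hypothetical, and the video feature is assumed to be already normalised to the same scale as the RMS curve.

```python
import math


def rms_per_window(samples, window):
    """RMS energy of consecutive, non-overlapping windows of audio samples."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + window]) / window)
        for i in range(0, len(samples) - window + 1, window)
    ]


def best_matching_clip(rms_curve, video_feature):
    """Start index of the video segment whose feature curve is closest to rms_curve."""
    n = len(rms_curve)
    best_start, best_cost = None, float("inf")
    for start in range(len(video_feature) - n + 1):
        # Assumed cost: sum of squared differences between the two curves.
        cost = sum((video_feature[start + k] - rms_curve[k]) ** 2 for k in range(n))
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start


audio = [0, 0, 0, 0, 1, -1, 1, -1, 0.5, -0.5, 0.5, -0.5]
curve = rms_per_window(audio, window=4)
motion = [0.9, 0.1, 0.0, 1.0, 0.5, 0.2]  # hypothetical per-frame accent feature
print(curve, best_matching_clip(curve, motion))
```

A full system would repeat this search for each segment of the song and concatenate the selected clips.
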
音楽と映像が同期した音楽動画の自動生成システム

平井辰典, 大矢隼士, 森島繁生

情報処理学会研究報告(Web) 2013 ( MUS-99 ) WEB ONLY VOL.2013-MUS-99,NO.26 2013年

J-GLOBAL
視線計測に基づく対話時の眼球運動の分析と合成

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

画像ラボ 24 ( 5 ) 32 - 40 2013年

J-GLOBAL
繊維構造を考慮したShell Textureのプロシージャル生成による埃の高速描画手法

安達翔平, 宇梶弘晃, 小坂昂大, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.30 2013年

J-GLOBAL
顔領域の画像類似度に基づく動画中のキャラクタ登場シーン推薦

増田太郎, 福里司, 平井辰典, 大矢隼士, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.45 2013年

J-GLOBAL
皺特徴の簡易的付加による加齢変化顔の合成

溝川あい, 中井宏紀, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.4 2013年

J-GLOBAL
自動領域分割によるブレンドシェイプのためのリアルな表情転写

小坂昂大, 前島謙宣, 久保尋之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.18 2013年

J-GLOBAL
時系列性を考慮した感情を含んだ眼球運動の分析及び合成

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.33 2013年

J-GLOBAL
均質化法を用いた詳細な布の変形シミュレーションの高速化

斉藤隼介, 梅谷信行, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.28 2013年

J-GLOBAL
画像の感じ方に基づいた楽曲推薦を行うシステム

佐々木将人, 平井辰典, 大矢隼士, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.15 2013年

J-GLOBAL
写実性豊かな口内表現を実現する発話アニメーションの自動生成

川井正英, 岩尾知頼, 三間大輔, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.17 2013年

J-GLOBAL
一枚顔画像を入力とした顔の反射特性推定

岡見和樹, 岩本尚也, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.35 2013年

J-GLOBAL
アニメ作品のキーフレーム検出による漫画形式の映像要約手法の提案

福里司, 平井辰典, 大矢隼士, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.19 2013年

J-GLOBAL
母音口形に基づく効率的な発話アニメーション生成手法の提案

三間大輔, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2013 ROMBUNNO.49 2013年

J-GLOBAL
アニメ作品におけるキーフレーム自動抽出に基づく映像要約手法の提案

福里司, 平井辰典, 大矢隼士, 森島繁生

画像電子学会誌(CD-ROM) 42 ( 4 ) 448 - 456 2013年

DOI J-GLOBAL
確率モデルに基づく対話時の眼球運動の分析及び合成

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

画像電子学会誌(CD-ROM) 42 ( 5 ) 661 - 670 2013年

DOI J-GLOBAL
ラケットスポーツ動画の構造解析による映像要約手法の提案

河村俊哉, 福里司, 平井辰典, 森島繁生

情報処理学会研究報告(Web) 2013 ( CG-153 ) WEB ONLY VOL.2013-CG-153,NO.15 2013年

J-GLOBAL
髪の特徴に基づく顔画像の印象類似検索

藤賢大, 福里司, 増田太郎, 平井辰典, 森島繁生

情報処理学会研究報告(Web) 2013 ( CG-153 ) WEB ONLY VOL.2013-CG-153,NO.17 2013年

J-GLOBAL
A-15-10 会話時の不随意な眼球運動の分析及び合成手法の提案(A-15.ヒューマン情報処理,一般セッション)

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2012 230 - 230 2012年03月

CiNii J-GLOBAL
A-15-21 足元条件の変化に伴う歩行動作特徴変化のモデリング(A-15.ヒューマン情報処理,一般セッション)

岡見和樹, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

電子情報通信学会総合大会講演論文集 2012 241 - 241 2012年03月

CiNii J-GLOBAL
D-12-72 複数台のビデオ映像解析による頭髪モーションキャプチャ(D-12.パターン認識・メディア理解B(コンピュータビジョンとコンピュータグラフィックス),一般セッション)

福里司, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

電子情報通信学会総合大会講演論文集 2012 ( 2 ) 166 - 166 2012年03月

CiNii J-GLOBAL
D-12-79 実写画像に基づく毛皮の実時間描画手法(D-12.パターン認識・メディア理解B(コンピュータビジョンとコンピュータグラフィックス),一般セッション)

宇梶弘晃, 小坂昂大, 服部智仁, 久保尋之, 森島繁生

電子情報通信学会総合大会講演論文集 2012 ( 2 ) 173 - 173 2012年03月

近年の3DCGコンテンツにおいて,動物がキャラクターとして登場する場面は数多く挙げられる.動物はその大部分が毛皮で覆われているため,毛皮の描画の写実性はシーン全体の質を大きく左右する.実時間処理が可能な毛皮の描画手法の代表として,Shell法があげられる.Shell法は毛皮の断面図に相当するShell Textureを積層状に描画することによって毛皮を表現する手法であり,リアルタイムで比較的写実性の高い毛皮の描画が可能となっている.しかしShell法によって写実的な毛を生成する際には,目的の質感をもった毛皮を描画できるShell Textureをあらかじめ積層数分だけ用意しなければならず,テクスチャの準備の段階でコストを要するという問題が挙げられる.本手法では,毛皮の実写画像一枚のみを入力として,指定した枚数のShell Textureを自動的に生成する.これにより,入力画像に見られる毛並を3Dオブジェクト上に再現させることが可能である.各層のテクスチャは入力画像一枚から自動的に生成されるため,従来法と比較してテクスチャを準備するコストが削減可能となっている.なお提案手法では描画においてシェーダに転送する画像は入力一枚に留めているため,グラフィックメモリの使用量を抑えることができる.

CiNii J-GLOBAL
音楽動画コンテンツ中のアーティスト名とその登場シーンの同定手法

平井辰典, 中野倫靖, 後藤真孝, 森島繁生

情報処理学会研究報告. SLP, 音声言語情報処理 2012 ( 24 ) 1 - 8 2012年01月

本稿では,音楽動画コンテンツに対して「どのアーティストがいつ映像中に登場しているか」というアノテーション情報を自動付加する手法を提案する.従来の人物顔認証手法は映像中の照明や顔向きなどの撮影環境の変動に脆弱で,その変動が大きい音楽動画コンテンツにおいて,アーティスト名とその登場シーンを同定することは困難であった.そこで本研究では,映像のフレームの時間的連続性を利用して同一人物の顔をクラスタリングすることで,撮影環境の違いを吸収し,アーティストの顔認証における問題を解決した.本手法により,従来の単一フレーム毎に顔認証を行う手法に比べ,約 2〜3 倍の精度向上を実現した.また,音楽の歌声区間と映像中にボーカリストが登場するシーンとの関係についても調査し,それを利用した今後の精度向上の可能性について考察した.

CiNii J-GLOBAL
会話時の眼球の跳躍運動と固視微動の分析及び合成手法の提案

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 111 ( 380 ) 239 - 244 2012年01月

コンピュータグラフィクスで人間の動作を自然に表現するためには眼球運動をリアルに再現することが非常に重要である。本研究では実際の眼球運動の測定結果をもとにしたリアルな眼球のアニメーションの生成を目的とする。まず、眼球運動を跳躍運動と固視微動の2種類に区別する。さらにそれぞれの眼球運動を確率モデルで表すことによって、リアルな眼球運動の自動的な生成が可能となった。

CiNii
会話時の眼球の跳躍運動と固視微動の分析及び合成手法の提案

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 111 ( 379 ) 239 - 244 2012年01月

コンピュータグラフィクスで人間の動作を自然に表現するためには眼球運動をリアルに再現することが非常に重要である。本研究では実際の眼球運動の測定結果をもとにしたリアルな眼球のアニメーションの生成を目的とする。まず、眼球運動を跳躍運動と固視微動の2種類に区別する。さらにそれぞれの眼球運動を確率モデルで表すことによって、リアルな眼球運動の自動的な生成が可能となった。

CiNii J-GLOBAL
パッチタイリングを用いた顔画像復元に基づく顔形状推定

郷原裕明, 前島謙宣, 森島繁生

画像センシングシンポジウム講演論文集(CD-ROM) 18th 2012年

J-GLOBAL
隠れマルコフモデルに基づく既存コンテンツ学習による音楽動画自動生成システムの提案

大矢隼士, 森島繁生

音楽音響研究会資料 31 ( 1 ) 47 - 52 2012年

J-GLOBAL
笑顔表出過程の表情の動きと受け手の印象の相関分析

藤代裕紀, 前島謙宣, 森島繁生

電子情報通信学会論文誌 A J95-A ( 1 ) 128 - 135 2012年

J-GLOBAL
会話時の眼球の跳躍運動と固視微動の分析及び合成手法の提案

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 111 ( 380(MVE2011 56-94) ) 239 - 244 2012年

J-GLOBAL
足元条件の変化に伴う歩行動作特徴変化のモデリング

岡見和樹, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

電子情報通信学会大会講演論文集 2012 241 2012年

J-GLOBAL
会話時の不随意な眼球運動の分析及び合成手法の提案

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2012 230 2012年

J-GLOBAL
実写画像に基づく毛皮の実時間描画手法

宇梶弘晃, 小坂昂大, 服部智仁, 久保尋之, 森島繁生

電子情報通信学会大会講演論文集 2012 173 2012年

J-GLOBAL
母音スペクトルのブレンドを用いた母音交換による話者変換

浜崎皓介, 田中茉莉, 河原英紀, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2012 ROMBUNNO.3-11-13 2012年

J-GLOBAL
複数台のビデオ映像解析による頭髪モーションキャプチャ

福里司, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

電子情報通信学会大会講演論文集 2012 166 2012年

J-GLOBAL
実写画像に基づく毛皮の特徴抽出と実時間描画手法

宇梶弘晃, 小坂昂大, 服部智仁, 久保尋之, 森島繁生

情報処理学会研究報告(CD-ROM) 2011 ( 6 ) ROMBUNNO.CG-146,NO.22 2012年

J-GLOBAL
音楽動画コンテンツにおける類似性評価尺度の提案

長谷川裕記, 森島繁生

情報処理学会研究報告(CD-ROM) 2011 ( 6 ) ROMBUNNO.EC-23,NO.21 2012年

J-GLOBAL
動画像解析に基づくリアルな頭髪運動再現

福里司, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

情報処理学会研究報告(CD-ROM) 2011 ( 6 ) ROMBUNNO.CG-146,NO.1 2012年

J-GLOBAL
平常時歩行動作からの足元の環境変化に伴う動作特徴変化のモデリング

岡見和樹, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

情報処理学会研究報告(CD-ROM) 2011 ( 6 ) ROMBUNNO.CG-146,NO.21 2012年

J-GLOBAL
テクスチャの周波数解析に基づく年齢変化顔の生成

中井宏紀, 松田龍英, 田副佑典, 前島謙宣, 森島繁生

画像センシングシンポジウム講演論文集(CD-ROM) 18th ROMBUNNO.IS1-05 2012年

J-GLOBAL
形状変形とパッチタイリングに基づく顔のエージングシミュレーション

田副佑典, 郷原裕明, 前島謙宣, 森島繁生

画像センシングシンポジウム講演論文集(CD-ROM) 18th ROMBUNNO.IS1-07 2012年

J-GLOBAL
SfMと顔変形モデルに基づく動画像からの3次元顔モデル高速自動生成

原朋也, 前島謙宣, 森島繁生

画像センシングシンポジウム講演論文集(CD-ROM) 18th ROMBUNNO.IS1-08 2012年

J-GLOBAL
顔のしわ特徴を考慮したドライバーの眠気度合推定

中村太郎, 松田龍英, 原朋也, 前島謙宣, 森島繁生

画像センシングシンポジウム講演論文集(CD-ROM) 18th ROMBUNNO.IS1-04 2012年

J-GLOBAL
形状変形とパッチタイリングに基づくテクスチャ変換による年齢変化顔シミュレーション

田副佑典, 郷原裕明, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.2 2012年

J-GLOBAL
実写画像1枚からのShell Texture自動生成手法の提案

宇梶弘晃, 小坂昂大, 服部智仁, 久保尋之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.12 2012年

J-GLOBAL
既存の音楽動画を用いて音楽に合った映像を自動生成するシステム

平井辰典, 大矢隼士, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.18 2012年

J-GLOBAL
会話時のリアルな眼球運動の分析及び合成手法の提案

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.4 2012年

J-GLOBAL
事前知識とStructure-from-Motionを併用した1台のビデオ画像からの3次元顔モデル高速自動生成手法

原朋也, 久保尋之, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.1 2012年

J-GLOBAL
パッチタイリング手法による正面顔画像と両目位置情報からの顔3次元形状推定

郷原裕明, 川井正英, 松田龍英, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.43 2012年

J-GLOBAL
スキニングを用いた三色光源下における動的な次元立体形状の再現

須田洋文, 岡見和樹, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.21 2012年

J-GLOBAL
動画サイトコンテンツ再利用によるHMMに基づく音楽からの動画自動生成システム

大矢隼士, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.50 2012年

J-GLOBAL
ステレオカメラ画像の色相検出に基づくマーカレス頭髪モーションキャプチャ

福里司, 岩本尚也, 國友翔次, 須田洋文, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.40 2012年

J-GLOBAL
人の発話特性を考慮したリップシンクアニメーションの生成

三間大輔, 小坂昂大, 久保尋之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.44 2012年

J-GLOBAL
モーションブラーシャドウのリアルタイム生成手法

小坂昂大, 服部智仁, 久保尋之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.36 2012年

J-GLOBAL
単母音に含まれる音響特徴からの3次元頭部形状推定に関する一検討

前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.45 2012年

J-GLOBAL
正面顔画像からの形状ディスプレイ用テクスチャ自動生成

前島謙宣, 倉立尚明, PIERCE Brennand, CHENG Gordon, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2012 ROMBUNNO.42 2012年

J-GLOBAL
顔形状の制約を付加したLinear Predictorsに基づく特徴点自動検出

松田龍英, 原朋也, 前島謙宣, 森島繁生

電子情報通信学会論文誌 D J95-D ( 8 ) 1530 - 1540 2012年

J-GLOBAL
肌の質感の周波数解析に基づく年齢変化顔合成

中井宏紀, 松田龍英, 前島謙宣, 森島繁生

映像情報メディア学会技術報告 36 ( 35(AIT2012 101-108) ) 5 - 8 2012年

J-GLOBAL
実測に基づいた対話時の眼球運動の分析及び合成

岩尾知頼, 三間大輔, 久保尋之, 前島謙宣, 森島繁生

日本顔学会誌 12 ( 1 ) 166 2012年

J-GLOBAL
シーンの連続性と顔類似度に基づく動画コンテンツ中の同一人物登場シーンの同定

平井辰典, 中野倫靖, 後藤真孝, 森島繁生

映像情報メディア学会誌 66 ( 7 ) J251 - J259 2012年

We present a method that can automatically annotate when and who is appearing in a video stream that is shot in an unstaged condition. Previous face recognition methods were not robust against different shooting conditions, such as those with variable lighting, face directions, and other factors, in a video stream and had difficulties identifying a person and the scenes the person appears in. To overcome such difficulties, our method groups consecutive video frames (scenes) into clusters that each have the same person's face, which we call a “facial-temporal continuum,” and identifies a person by using many video frames in each cluster. In our experiments, accuracy with our method was approximately two or three times higher than a previous method that recognizes a face in each frame.

DOI CiNii J-GLOBAL
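
The idea in the abstract above (group consecutive frames showing the same face into clusters, then identify the person from many frames per cluster instead of from each frame alone) can be sketched as below. The threshold test on frame-to-frame similarity and the majority vote are assumptions made for illustration; the abstract does not specify either, and the names `group_consecutive` and `label_clusters` are hypothetical.

```python
from collections import Counter


def group_consecutive(similarities, threshold):
    """Split frames 0..n into runs; similarities[i] compares frame i with frame i + 1."""
    clusters, current = [], [0]
    for i, score in enumerate(similarities):
        if score >= threshold:
            current.append(i + 1)      # same face continues
        else:
            clusters.append(current)   # face changed: close the current run
            current = [i + 1]
    clusters.append(current)
    return clusters


def label_clusters(clusters, frame_labels):
    """Give each cluster its most frequent per-frame label (None means no result)."""
    result = []
    for frames in clusters:
        votes = Counter(frame_labels[f] for f in frames if frame_labels[f] is not None)
        result.append(votes.most_common(1)[0][0] if votes else None)
    return result


clusters = group_consecutive([0.9, 0.8, 0.2, 0.95], threshold=0.5)
print(clusters, label_clusters(clusters, ["A", "B", "A", "C", None]))
```

In this toy run, the single misrecognised frame ("B") is outvoted by its neighbours, which is the kind of robustness the clustering step is meant to provide.
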
肌の質感の周波数解析に基づく年齢変化顔合成

中井宏紀, 松田龍英, 前島謙宣, 森島繁生

映像情報メディア学会技術報告 36 ( 0 ) 5 - 8 2012年

本稿では,正面顔画像に対して肌の質感と形状の年齢特徴を操作することにより年齢変化顔を合成する手法を提案する.テクスチャの年齢変換において,まず,データベース中の顔画像から凸凹の少ない平坦領域に対して二次元フーリエ変換を行い,周波数空間における周波数成分の年齢との相関係数を求める.次に,入力画像について相関係数の高い周波数成分を所望する年齢へと変換し,逆フーリエ変換により画像空間に戻すことで目的のテクスチャ画像を得る.提案手法により,入力画像の肌の個人情報を維持した年齢変化顔画像の生成が可能となった.

DOI CiNii
テクスチャ-デプスパッチタイリングに基づく正面顔画像からの3次元形状推定

郷原裕明, 前島謙宣, 森島繁生

研究報告コンピュータビジョンとイメージメディア（CVIM） 2011 ( 20 ) 1 - 7 2011年11月

本稿では，正面顔画像から 3 次元形状を推定する，新たな手法を提案する．提案手法では，3 次元形状を持つ顔モデルからカラーマップとデプスマップを取得し，パッチに区切りデータベースを生成する．そして，入力顔画像のカラーとデータベース中のパッチのカラーマップと比較し，評価関数によりパッチを選択しパッチタイリングを行う．その際にデプスマップも同様に選択しタイリングをすることでテクスチャ情報と奥行情報の関係を利用した，テクスチャ-デプスパッチタイリングに基づく正面顔画像からの 3 次元顔形状推定なる新たな手法を提案する．

The paper presents an adaptation of the image quilting algorithm for 3D reconstruction and synthesis from 2D images. We build a DB of 3D faces that are normalized and converted into depth-maps. Next, given a 2D image, we compute its 3D depth-map by synthesizing texture-depth patches from the database using a minimization framework.

CiNii
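
The patch selection described in the abstract above (compare each input colour patch with the colour patches in the database, then tile the paired depth patch) might look like the following minimal sketch. It assumes a sum-of-squared-differences evaluation function and ignores the overlap and seam terms of image quilting; the name `estimate_depth` and the flat-list patch representation are hypothetical.

```python
def estimate_depth(image_patches, database):
    """For each input colour patch, copy the depth patch of the closest database entry.

    image_patches: list of flat colour patches (lists of numbers).
    database: list of (colour_patch, depth_patch) pairs with matching patch sizes.
    """
    def ssd(a, b):
        # Assumed evaluation function: sum of squared colour differences.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    depth = []
    for patch in image_patches:
        _, best_depth = min(database, key=lambda entry: ssd(patch, entry[0]))
        depth.append(best_depth)
    return depth


db = [([0, 0], [1, 1]), ([10, 10], [5, 5])]  # toy (colour, depth) patch pairs
print(estimate_depth([[1, 0], [9, 9]], db))
```
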
疎な特徴点と顔変形モデルに基づく動画像からの3次元顔モデル自動生成手法

原朋也, 前島謙宣, 森島繁生

研究報告コンピュータビジョンとイメージメディア（CVIM） 2011 ( 25 ) 1 - 8 2011年11月

本稿では，1 台のビデオカメラで撮影された顔を自由に振る動作の動画像から，対象人物の 3 次元顔モデルを高速自動生成する手法を提案する．著者らは先行研究において，顔変形モデルと顔形状の尤度制約を用いた顔の奥行推定に基づき，正面顔画像から高速に 3 次元顔モデルを生成する手法を提案している．しかしながら，従来手法は奥行推定に入力顔画像から得られる特徴点の 2 次元位置座標を用いており，3 次元的な制約がないため，鼻の高さや頬部分の凹凸などの個人特徴を忠実に再現できないという問題点があった．そこで，提案手法では，一般的な 3 次元復元手法として知られる Structure-from-Motion のフレームワークと顔変形モデルに基づく手法を統合することで，より対象人物に近い 3 次元顔モデルの生成を実現した．

In this paper, we propose a 3D face reconstruction method with a hybrid approach that combines a Structure-from-Motion (SfM) approach based on the "Factorization Method", which estimates accurate 3D point depth information, with a generic-model approach based on a "Deformable Face Model", which keeps an appropriate local face shape. Unlike other methods, our method requires no manual operations from image capture to 3D face model output and runs quickly. As a result, our method expresses more plausible facial geometry than the previous method.

CiNii
テクスチャ-デプスパッチタイリングに基づく正面顔画像からの3次元形状推定

郷原裕明, 前島謙宣, 森島繁生

研究報告グラフィクスとCAD（CG） 2011 ( 20 ) 1 - 7 2011年11月

本稿では，正面顔画像から 3 次元形状を推定する，新たな手法を提案する．提案手法では，3 次元形状を持つ顔モデルからカラーマップとデプスマップを取得し，パッチに区切りデータベースを生成する．そして，入力顔画像のカラーとデータベース中のパッチのカラーマップと比較し，評価関数によりパッチを選択しパッチタイリングを行う．その際にデプスマップも同様に選択しタイリングをすることでテクスチャ情報と奥行情報の関係を利用した，テクスチャ-デプスパッチタイリングに基づく正面顔画像からの 3 次元顔形状推定なる新たな手法を提案する．

The paper presents an adaptation of the image quilting algorithm for 3D reconstruction and synthesis from 2D images. We build a DB of 3D faces that are normalized and converted into depth-maps. Next, given a 2D image, we compute its 3D depth-map by synthesizing texture-depth patches from the database using a minimization framework.

CiNii J-GLOBAL
疎な特徴点と顔変形モデルに基づく動画像からの3次元顔モデル自動生成手法

原朋也, 前島謙宣, 森島繁生

研究報告グラフィクスとCAD（CG） 2011 ( 25 ) 1 - 8 2011年11月

本稿では，1 台のビデオカメラで撮影された顔を自由に振る動作の動画像から，対象人物の 3 次元顔モデルを高速自動生成する手法を提案する．著者らは先行研究において，顔変形モデルと顔形状の尤度制約を用いた顔の奥行推定に基づき，正面顔画像から高速に 3 次元顔モデルを生成する手法を提案している．しかしながら，従来手法は奥行推定に入力顔画像から得られる特徴点の 2 次元位置座標を用いており，3 次元的な制約がないため，鼻の高さや頬部分の凹凸などの個人特徴を忠実に再現できないという問題点があった．そこで，提案手法では，一般的な 3 次元復元手法として知られる Structure-from-Motion のフレームワークと顔変形モデルに基づく手法を統合することで，より対象人物に近い 3 次元顔モデルの生成を実現した．

In this paper, we propose a 3D face reconstruction method with a hybrid approach that combines a Structure-from-Motion (SfM) approach based on the "Factorization Method", which estimates accurate 3D point depth information, with a generic-model approach based on a "Deformable Face Model", which keeps an appropriate local face shape. Unlike other methods, our method requires no manual operations from image capture to 3D face model output and runs quickly. As a result, our method expresses more plausible facial geometry than the previous method.

CiNii J-GLOBAL
実世界に学ぶ画像技術-現実と展望- : 顔・人体メディアが拓く新産業の画像技術

川出雅人, 持丸正明, 森島繁生

映像情報メディア学会誌 : 映像情報メディア = The journal of the Institute of Image Information and Television Engineers 65 ( 11 ) 1534 - 1544 2011年11月

DOI CiNii
三色光源下における動物体の高精度かつ詳細な三次元形状再現

須田洋文, 前島謙宣, 森島繁生

画像の認識・理解シンポジウム(MIRU2011)論文集 2011 1466 - 1472 2011年07月

CiNii
特徴量の経年変化解析に基づく個人識別手法の検討

原田健希, 田副佑典, 前島謙宣, 森島繁生

画像の認識・理解シンポジウム(MIRU2011)論文集 2011 780 - 785 2011年07月

CiNii
幾何学的制約を考慮したLinear Predictorsに基づく顔特徴点自動検出

松田龍英, 原朋也, 前島謙宣, 森島繁生

画像の認識・理解シンポジウム(MIRU2011)論文集 2011 773 - 779 2011年07月

CiNii
アノテーション情報を付加した画像内容推定結果に基づく自動ダンス動画生成システム

長谷川裕記, 前島謙宣, 森島繁生

研究報告音楽情報科学（MUS） 2011 ( 20 ) 1 - 6 2011年07月

本研究では動画に付随するアノテーション情報とユーザーが指定した情報に基づき、画像に描写されているターゲット要素の特徴を機械学習することによって、データベース内の動画選択を行い音楽にマッチしたダンス動画を自動生成するシステムを構築した。画像内の輪郭特徴を表す特徴量、アノテーション情報を表す動画コンテンツに割り振られたタグ情報を用いて画像内容推定を行っており、先行研究より画像内の構図を考慮したダンス動画生成ができ、ユーザーがシステムを利用する際の自由度を上げる事が可能となった。

This paper presents a system that automatically generates a dance video clip appropriate to music by segmenting and concatenating existing dance video clips. The system is based on machine learning of annotations and image features. Because it uses features expressing the shape of objects in an image, together with the annotations (tags) attached to videos on the Internet, to estimate what is drawn in the image, it can take the depicted objects into account, and the user can control the system more flexibly than in prior work.

CiNii J-GLOBAL
リアルタイムスキンシェーダとしての曲率に依存する反射関数の提案と実装

久保尋之, 土橋宜典, 津田順平, 森島繁生

研究報告グラフィクスとCAD（CG） 2011 ( 2 ) 1 - 6 2011年06月

人間の肌のような半透明物体のリアルな質感を再現するためには、表面下散乱を考慮して描画することが必要不可欠である。そこで本研究では半透明物体の高速描画を目的とし、曲率に依存する反射関数（CDRF）を提案する。実際の映像作品ではキャラクタの肌はそれぞれに特徴的で誇張した表現手法がとられるため、本研究では材質の散乱特性の調整だけでなく、曲率自体を強調する手法を導入することで、表面下散乱の影響が誇張された印象的な肌を表現可能なスキンシェーダを実現する。

Simulating sub-surface scattering is one of the most effective ways to realistically synthesize translucent materials, especially human skin. This paper describes a Curvature-Dependent Reflectance Function (CDRF) technique as a real-time skin shader and its implementation for practical use. For a production pipeline, we build a simple workflow and provide a method for exaggerating scattering effects to express the individuality of a character's skin.

CiNii J-GLOBAL
既存動画コンテンツを再利用して音楽にマッチした動画を自動生成するシステム

平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

電子情報通信学会技術研究報告. DE, データ工学 111 ( 76 ) 143 - 148 2011年05月

本稿では,入力された任意の音楽を元に,既存の動画コンテンツを再利用し,人間が音楽と映像が同期していると感じる音楽動画を自動的に生成するシステムを提案する.本システムの土台となる音楽と映像の同期手法は,音楽のエネルギーを示す特徴量であるRMSに対し,映像のアクセント(明滅や動きなど)を付加するというものである.これは,本研究で行った主観評価実験により人が音楽と映像が「合っている」と感じると確かめられた同期手法である.本システムの動画生成は,まずデータベースの構築として既存の動画シーケンスから各フレームの明滅,動きに関する映像特徴量の計算を行う.それに対し入力音楽のRMSの挙動に最も近い挙動を示す映像特徴量を持つ動画シーケンスをデータベース中から探索し,それらの映像シーケンスを切り貼りすることで,音楽に最も同期している音楽動画の生成を行うというものである.

CiNii J-GLOBAL
既存動画コンテンツを再利用して音楽にマッチした動画を自動生成するシステム

平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 111 ( 77 ) 143 - 148 2011年05月

本稿では,入力された任意の音楽を元に,既存の動画コンテンツを再利用し,人間が音楽と映像が同期していると感じる音楽動画を自動的に生成するシステムを提案する.本システムの土台となる音楽と映像の同期手法は,音楽のエネルギーを示す特徴量であるRMSに対し,映像のアクセント(明滅や動きなど)を付加するというものである.これは,本研究で行った主観評価実験により人が音楽と映像が「合っている」と感じると確かめられた同期手法である.本システムの動画生成は,まずデータベースの構築として既存の動画シーケンスから各フレームの明滅,動きに関する映像特徴量の計算を行う.それに対し入力音楽のRMSの挙動に最も近い挙動を示す映像特徴量を持つ動画シーケンスをデータベース中から探索し,それらの映像シーケンスを切り貼りすることで,音楽に最も同期している音楽動画の生成を行うというものである.

CiNii
音楽と映像の同期手法に基づくダンス動画生成システム (音楽情報科学(MUS) Vol.2011-MUS-89)

平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

情報処理学会研究報告 2010 ( 6 ) 1 - 6 2011年04月

本稿では，主観評価実験によって評価された音楽と映像の同期手法に基づいて，入力された音楽から，人間が同期していると感じるダンス動画を生成するシステムを提案する．本システムの土台となる音楽と映像の同期手法は，音楽のエネルギーを示す特徴量であるRMSに対し，映像のアクセント（明滅や動きの速さなど）を付加するというものである．これは，本研究において主観評価実験により人が音楽と映像が「合っている」と感じると確かめられた同期手法である．本システムでの動画生成は，まずデータベースの構築として既存のダンス動画シーケンスから人物領域のみを抽出し，Optical flow の計算を行う．それに対し入力音楽を分割した素片の RMS の挙動に最も近い挙動を示す Optical flow のダンスシーケンスをデータベース中から選択し，それらの映像シーケンスの切り貼りを行うことで，音楽に最も同期しているダンス動画の生成を行うというものである．

In this paper, we propose a dance video creation system based on a method for synchronizing music and video features that was evaluated in a subjective experiment. The videos created by the system satisfy the criterion under which people tend to feel that music and pictures are synchronized: when the RMS energy of the music matches the accents of the video. To create a video, the system first builds a database of dance information from existing dance videos, acquired from the optical flow of those videos. It then cuts out the pieces of video whose optical flow correlates best with the RMS of the input music and connects them together.

CiNii
D-12-9 複数視点からの深度マップを用いた半透明物体の高速描画(D-12.パターン認識・メディア理解,一般セッション)

小坂昂大, 服部智仁, 久保尋之, 森島繁生

電子情報通信学会総合大会講演論文集 2011 ( 2 ) 112 - 112 2011年02月

CiNii
D-12-10 動的な水の表面形状を考慮した流体のパラメータ推定(D-12.パターン認識・メディア理解,一般セッション)

岩本尚也, 國友翔次, 森島繁生

電子情報通信学会総合大会講演論文集 2011 ( 2 ) 113 - 113 2011年02月

CiNii J-GLOBAL
D-12-30 基準形状変形による多視点動画像からの動的立体形状再現(D-12.パターン認識・メディア理解,一般セッション)

種田大地, 山中健太郎, 國友翔次, 須田洋文, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2011 ( 2 ) 133 - 133 2011年02月

CiNii J-GLOBAL
D-12-37 経年変化を考慮した個人識別手法の検討(D-12.パターン認識・メディア理解,一般セッション)

原田健希, 田副佑典, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2011 ( 2 ) 140 - 140 2011年02月

CiNii J-GLOBAL
D-12-38 幾何学的制約を考慮したLinear Predictorsによる顔特徴点自動抽出(D-12.パターン認識・メディア理解,一般セッション)

松田龍英, 原朋也, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2011 ( 2 ) 141 - 141 2011年02月

CiNii J-GLOBAL
D-12-39 顔画像における陰影変化を伴う表情生成(D-12.パターン認識・メディア理解,一般セッション)

三間大輔, 鑓水裕刀, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2011 ( 2 ) 142 - 142 2011年02月

CiNii J-GLOBAL
表出過程の印象を考慮したより自然な笑顔動画像の合成

藤代裕紀, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 110 ( 459 ) 31 - 36 2011年02月

顔の表情は,人と人との円滑なコミュニケーションにおいて重要な役割を担っている.特に笑顔は相手に対して肯定的な印象を与えるため様々な研究が行われている.特に笑顔が相手に与える印象の研究は多く,そのほとんどが静止画での研究であった.しかし最近では,表出後のみだけでなく,表出過程が相手に与える印象に大きく影響するとの指摘もある.そこで本研究では,笑いの自然さに注目し,表出過程においてどの部位の動きが笑いの自然さの印象に寄与するかを調べた.また,分析結果に基づく顔部位変形を適用し,オリジナルよりも自然さを強調した合成動画像を作成して,主観評価により分析結果の正当性を示した.

CiNii
DanceReProducer: An automatic mashup music video generation system by reusing dance video clips on the web

Proceedings of the 8th Sound and Music Computing Conference, SMC 2011 2011年01月

We propose a dance video authoring system, DanceReProducer, that can automatically generate a dance video clip appropriate to a given piece of music by segmenting and concatenating existing dance video clips. In this paper, we focus on the reuse of ever-increasing user-generated dance video clips on a video sharing web service. In a video clip consisting of music (audio signals) and image sequences (video frames), the image sequences are often synchronized with or related to the music. Such relationships are diverse in different video clips, but were not dealt with by previous methods for automatic music video generation. Our system employs machine learning and beat tracking techniques to model these relationships. To generate new music video clips, short image sequences that have been previously extracted from other music clips are stretched and concatenated so that the emerging image sequence matches the rhythmic structure of the target song. Besides automatically generating music videos, DanceReProducer offers a user interface in which a user can interactively change image sequences just by choosing different candidates. This way people with little knowledge or experience in MAD movie generation can interactively create personalized video clips. © 2011 Tomoyasu Nakano et al.
Automatic generation of facial wrinkles according to expression changes

Daisuke Mima, Hiroyuki Kubo, Akinobu Maejima, Shigeo Morishima

SIGGRAPH Asia 2011 Posters, SA'11 2011年

In order to synthesize attractive facial expressions, it is necessary to consider detailed expression changes such as facial wrinkles. Nevertheless, most expression synthesis techniques (e.g. blend shapes, image morphing, simulation of mimic muscles) focus entirely on large-scale deformation of the face and ignore small-scale details such as wrinkles and bulges. Therefore, unless large-scale photographic equipment is used, handcrafted work by skilled artists is ultimately unavoidable to make a face attractive.

DOI
Automatic 3D face generation from video with sparse point constraint and dense deformable model

Tomoya Hara, Akinobu Maejima, Shigeo Morishima

SIGGRAPH Asia 2011 Posters, SA'11 2011年

3D face models have been widely applied in various fields (e.g. biometrics, movies, video games). In particular, reconstructing a 3D face model with only a single camera, without attaching landmarks or projecting laser dots or structured light patterns onto the face, is one of the most popular and challenging tasks in computer vision and computer graphics. For example, Maejima et al. proposed a method, based on a generic-model approach, that can quickly reconstruct a 3D face model from a single 2D photograph using a deformable face model [Maejima et al. 2008]. However, since it assumes a frontal face image as input, this method cannot accurately express the geometry of individual facial parts, such as the height of the nose and the cheek contour.

DOI
Real time ambient occlusion by curvature dependent occlusion function

Tomohito Hattori, Hiroyuki Kubo, Shigeo Morishima

SIGGRAPH Asia 2011 Posters, SA'11 2011年

We present a novel technique to compute ambient occlusion [2008] on real-time graphics hardware. Current real-time ambient occlusion techniques such as SSAO need at least 16 ray samples, and their computational cost is too high for use in computer games. Our method approximates occlusion as a local illumination model by introducing a curvature-dependent function.

DOI
音楽と映像の同期手法に基づくダンス動画生成システム

平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

音楽音響研究会資料 29 ( 7 ) 153 - 163 2011年

J-GLOBAL
経年変化を考慮した個人識別手法の検討

原田健希, 田副佑典, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2011 140 2011年

J-GLOBAL
顔画像における陰影変化を伴う表情生成

三間大輔, 鑓水裕刀, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2011 142 2011年

J-GLOBAL
複数視点からの深度マップを用いた半透明物体の高速描画

小坂昂大, 服部智仁, 久保尋之, 森島繁生

電子情報通信学会大会講演論文集 2011 112 2011年

J-GLOBAL
表出過程の印象を考慮したより自然な笑顔動画像の合成

藤代裕紀, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 110 ( 459(HCS2010 56-69) ) 31 - 36 2011年

J-GLOBAL
動的な水の表面形状を考慮した流体のパラメータ推定

岩本尚也, 國友翔次, 森島繁生

電子情報通信学会大会講演論文集 2011 113 2011年

J-GLOBAL
基準形状変形による多視点動画像からの動的立体形状再現

種田大地, 山中健太郎, 國友翔次, 須田洋文, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2011 133 2011年

J-GLOBAL
幾何学的制約を考慮したLinear Predictorsによる顔特徴点自動抽出

松田龍英, 原朋也, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2011 141 2011年

J-GLOBAL
新映像技術「ダイブイントゥザムービー」

森島繁生, 八木康史, 中村哲, 伊勢史郎, 向川康博, 槇原靖, 間下以大, 近藤一晃, 榎本成悟, 川本真一, 四倉達夫, 池田雄介, 前島謙宣, 久保尋之

電子情報通信学会誌 94 ( 3 ) 250 - 268 2011年

映像コンテンツの全く新しい実現形態として, 観客自身が映画等の登場人物となり, 時には友人や家族と一緒にこの作品を鑑賞することによって, 自身がストーリーへ深く没入し, かつてない感動を覚えたり, 時にはヒロイズムに浸ることを実現可能とする技術「ダイブイントゥザムービー」について本稿で解説する.この実現には, 観客に全く負担をかけることなく本人そっくりの個性を有する登場人物を自動生成する技術と, 自ら映像中のストーリーに参加しているという感覚を満足するためのキャラクタ合成のクオリティ, 映像シーンの環境に没入していると錯覚させる高品質な映像・音響再現技術及びその収録技術が, 観客の感動の強さを決定する重要な要素となる.2005年の愛・地球博にて実証実験を行った「フューチャーキャスト」に端を発するこの技術は, ハードウェアの進歩と2007年にスタートした文部科学省の支援による科学技術振興調整費プロジェクトの実施によって, 格段の進歩を遂げた.その結果, 様々なバリエーションの観客の個性を全自動・短時間でストレスなくモデル化することが可能となり, また作品の中でリアルタイム合成されるキャラクタの顔と全身, 声に各人の個性を忠実に反映することが可能となった.また, 同時に役者が感じた音場・視点で1人称的にコンテンツへの没入感を体感することを可能にするシステムを同時に実現した.

CiNii J-GLOBAL
既存の動画を再利用して音楽に合わせた動画を自動生成するシステムの提案

大矢隼士, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2011 ROMBUNNO.3-1-10 2011年

J-GLOBAL
主観評価に基づく音楽と映像の同期手法を用いた音楽動画生成システム

平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2011 ROMBUNNO.3-1-11 2011年

J-GLOBAL
既存動画コンテンツを再利用して音楽にマッチした動画を自動生成するシステム

平井辰典, 大矢隼士, 長谷川裕記, 森島繁生

電子情報通信学会技術研究報告 111 ( 76(DE2011 1-26) ) 143 - 148 2011年

J-GLOBAL
複数視点からの深度マップ利用による半透明物質の実時間描画法

小坂昂大, 服部智仁, 久保尋之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2011 POSUTAHAPPYO,FUKUSUSHITENKARANOSHINDOMAPPURIYO 2011年

J-GLOBAL
顔画像における表情変化に伴う表情皺の自動生成手法の提案

三間大輔, 久保尋之, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2011 POSUTAHAPPYO,KAOGAZONIOKERUHYOJOHENKA 2011年

J-GLOBAL
動的な水の表面形状を考慮した流体パラメータ推定

岩本尚也, 國友翔次, 佐川立昌, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2011 POSUTAHAPPYO,DOTEKINAMIZUNOHYOMENKEIJO 2011年

J-GLOBAL
三色光源下における非剛体の動的立体形状再現

種田大地, 須田洋文, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2011 KEIJOFUKUGEN,TANEDADAICHI 2011年

J-GLOBAL
リアルタイムスキンシェーダとしての曲率に依存する反射関数の提案と実装

久保尋之, 土橋宜典, 津田順平, 森島繁生

情報処理学会研究報告(CD-ROM) 2011 ( 2 ) ROMBUNNO.CG-143,NO.2 2011年

J-GLOBAL
アノテーション情報を付加した画像内容推定結果に基づく自動ダンス動画生成システム

長谷川裕記, 前島謙宣, 森島繁生

情報処理学会研究報告(CD-ROM) 2011 ( 2 ) ROMBUNNO.MUS-91,NO.20 2011年

J-GLOBAL
医用画像を用いた個性を反映した表情アニメーション生成

鑓水裕刀, 久保尋之, 前島謙宣, 森島繁生

日本顔学会誌 11 ( 1 ) 154 2011年

J-GLOBAL
表情の個性を表現するための解剖学的アプローチ

三間大輔, 久保尋之, 前島謙宣, 森島繁生, 島田和幸

日本顔学会誌 11 ( 1 ) 169 2011年

J-GLOBAL
表情変化に伴う顔画像への表情皺の自動付加手法の提案

三間大輔, 久保尋之, 前島謙宣, 森島繁生

日本顔学会誌 11 ( 1 ) 188 2011年

J-GLOBAL
個人性を反映した話者交換法に有効な話者分類の検討

田中茉莉, 河原英紀, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2011 ROMBUNNO.2-8-2 2011年

J-GLOBAL
自然発話スペクトログラム再現法による母音交換に基づく母音変換

浜崎皓介, 山本達也, 田中茉莉, 河原英紀, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2011 ROMBUNNO.2-8-3 2011年

J-GLOBAL
産業を支える画像技術~その広がりと学術・技術的深化~《2》実世界に学ぶ画像技術~現実と展望~《2-6》顔・人体メディアが拓く新産業の画像技術

川出雅人, 持丸正明, 森島繁生

映像情報メディア学会誌 65 ( 11 ) 1534 - 1544 2011年

DOI J-GLOBAL
疎な特徴点と顔変形モデルに基づく動画像からの3次元顔モデル自動生成手法

原朋也, 前島謙宣, 森島繁生

情報処理学会研究報告(CD-ROM) 2011 ( 4 ) ROMBUNNO.CG-145,NO.25 2011年

DOI J-GLOBAL
テクスチャ-デプスパッチタイリングに基づく正面顔画像からの3次元形状推定

郷原裕明, 前島謙宣, 森島繁生

情報処理学会研究報告(CD-ROM) 2011 ( 4 ) ROMBUNNO.CG-145,NO.20 2011年

J-GLOBAL
Acoustic features affecting speaker identification by imitated voice analysis

20th International Congress on Acoustics 2010, ICA 2010 - Incorporating Proceedings of the 2010 Annual Conference of the Australian Acoustical Society 5 3677 - 3680 2010年12月

　概要を見る

In this paper, physical correlates of perceived personal identity are investigated using 16 imitated utterances spoken by 11 mimicry speakers and 24 test subjects. Our unique strategy of using non-professional impersonators enabled us to prepare test utterances with a wide range of perceived similarities. Reasonably high correlations (0.46 and 0.44) in multiple regression analysis were attained by grouping subjects into three groups based on cluster analysis of the subjective test results. Without clustering, the correlation was only 0.17. Cluster analysis also revealed differences among the three groups in the physical correlates they focused on, indicating the importance of individual differences in both speakers and listeners.
BT-2-2 コンシューマ参加型デジタルコンテンツ(BT-2.ネットワークを活用したディジタルメディア〜テクノロジとビジネスの最新動向〜,チュートリアルセッション,ソサイエティ企画)

森島繁生

電子情報通信学会ソサイエティ大会講演論文集 2010 ( 2 ) SS-92 - SS-95 2010年08月

CiNii J-GLOBAL
A study of relationship between speaker identification and acoustic features using perceptual similarity of imitated voice (音声)

Tanaka Mari, Kawahara Hideki, Morishima Shigeo

電子情報通信学会技術研究報告. SP, 音声 110 ( 143 ) 1 - 5 2010年07月

　概要を見る

Physical correlates of perceived personal identity are investigated using 16 imitated utterances spoken by 11 mimicry speakers and 24 test subjects. Our unique strategy of using non-professional impersonators enabled us to prepare test utterances with a wide range of perceived similarities. Reasonably high correlations (0.46 and 0.44) in multiple regression analysis were attained by grouping subjects into three groups based on cluster analysis of the subjective test results. Without clustering, the correlation was only 0.17. The cluster analysis also revealed differences among the three groups in the physical correlates they focused on, indicating the importance of individual differences in both speakers and listeners.

CiNii
遮蔽度の曲率近似によるアンビエントオクルージョンの局所照明モデル化—Ambient Occlusion by Curvature Depended Local Illumination Approximation of Occlusion—グラフィクスとCAD(CG) Vol.2009-CG-138

服部智仁, 久保尋之, 森島繁生

情報処理学会研究報告 2009年度 ( 6 ) 1 - 6 2010年04月

CiNii
主観評価に基づく個人性を強調した歩行動作合成手法の提案—Gait Animation Synthesis Exaggerated Characteristics Based on Perception Similarity—グラフィクスとCAD(CG) Vol.2009-CG-138

中村槙介, 森島繁生

情報処理学会研究報告 2009年度 ( 6 ) 1 - 6 2010年04月

CiNii
人物頭部モデル自動生成システムの実現--最適化局所アフィン変換に基づく人物頭部モデルの自動生成

前島謙宣, 森島繁生

画像ラボ 21 ( 4 ) 29 - 35 2010年04月

CiNii
多視点顔画像に基づく3次元顔形状推定

原朋也, 藤代裕紀, 中野真也, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 109 ( 471 ) 73 - 78 2010年03月

　概要を見る

3次元顔モデルは,映像制作や個人認証など様々な分野で応用されている.筆者らは,先行研究において,大型な装置を必要とせずに3次元顔モデルを構築する手法として,頭部変形モデルに基づき正面顔画像から高速に3次元顔モデルを構築する手法を提案している.しかしながら,鼻の高さや頬部分の凹凸の個人特徴を忠実に再現出来ないという問題点があった.本稿では,顔向きを変化させて撮影した複数枚の画像を入力として,各画像から構築される3次元顔モデルを,顔の部位ごとに最適な重みを与えて統合することで,より本人らしい3次元顔モデルを生成する手法を提案する.提案手法をオープンテストによって評価した結果,従来手法と比べて,特に鼻や口領域に関して,精度の向上を確認することが出来た.

CiNii
スナップ写真からの3次元顔モデル高速自動生成

前島謙宣, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 109 ( 471 ) 145 - 150 2010年03月

　概要を見る

本報告では,スナップ写真から写真中の人物らしい3次元顔モデルを高速自動生成する手法について述べる.提案手法は3次元顔形状の事前知識として1153人の3次元顔モデルから構築される顔変形モデルと,顔形状の分布に対してフィッティングされた混合ガウス分布モデルを用いる.3次元顔形状は,画像から自動検出される顔特徴点と顔変形モデル上の対応頂点との残差の二乗和が最小かつその時の顔変形モデルのモデルパラメータに対する混合ガウス分布モデルの尤度が最大となるようなエネルギー最小化問題を解くことにより推定される.提案手法に対する性能評価実験の結果から,平均1.2秒の処理時間で2.1mmの精度誤差を持つ3次元顔モデルが生成可能であることを示す.

CiNii
個人顔の3次元形状変形とテクスチャ変換に基づくエージングシミュレーション

田副佑典, 藤代裕紀, 中野真也, 笠井聡子, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 109 ( 471 ) 151 - 156 2010年03月

　概要を見る

本稿では,レンジスキャンデータ(3次元顔形状データ+正面顔テクスチャ)に基づいて生成された正確な3次元形状と正面顔テクスチャを有する個人顔モデルに対し,3次元形状とテクスチャの双方に年齢特徴を加えた年齢変化顔の合成手法を提案する.まず,被験者の年齢に基づき3次元顔形状データと正面顔テクスチャを分類し年齢別顔データベースを構築する.次に,データベース中の3次元顔形状データから学習される年齢特徴ベクトルを入力顔に付加することで,3次元顔形状を所望の年齢へと変換する.続いて,入力顔に対し,入力顔が属する年代の平均顔テクスチャの輝度値とターゲットとなる年代の平均顔テクスチャの輝度値との差分を加えることにより,テクスチャをターゲットの年齢へ変換する.最後に,両者を統合することにより年齢変化顔を合成する.提案手法により,年齢変化に伴う3次元形状変化およびしみ,しわ等の表現を可能とした.
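
The texture step described in this abstract (adding the luminance difference between the average face textures of the target and source age groups to the input face) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `shift_texture_age`, the 8-bit luminance range, and the assumption that all three images are already aligned to the same size are hypothetical.

```python
import numpy as np

def shift_texture_age(texture, mean_src, mean_tgt):
    """Move a face texture toward a target age group.

    texture  : luminance image of the input face (uint8, H x W)
    mean_src : average texture of the age group the input belongs to
    mean_tgt : average texture of the target age group
    All three arrays are assumed to be pixel-aligned (hypothetical setup).
    """
    # Per-pixel luminance difference between the two age-group averages
    diff = mean_tgt.astype(np.float64) - mean_src.astype(np.float64)
    # Add the difference to the input face and keep values in the 8-bit range
    aged = texture.astype(np.float64) + diff
    return np.clip(aged, 0, 255).astype(np.uint8)
```

The shape step in the abstract is analogous in spirit: a learned age feature vector is added to the input 3D geometry, but how that vector is learned is not specified here, so it is left out of the sketch.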

CiNii
3種の色光源を用いた多視点動画像からの動的立体構造再現

小林昭太, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 109 ( 471 ) 157 - 162 2010年03月

　概要を見る

カメラを用いた撮影画像の陰影情報から,対象の3次元形状を復元する手法が研究されている.しかし,複数のカメラに同時に陰影情報を与える照明環境を作成することは難しく,対象の全体像の動的な形状変化を再現することは困難である.そこで,本研究では3色の色光源を用い,各照明に対応する陰影情報を撮影画像から分離して取得できる環境を作成することで複数の陰影情報を同時に取得し,視体積交差法により得られる対象の初期形状の法線を再計算する.陰影情報から法線を推定する際,反射モデルが必要になるが,本研究では,撮影環境に特化した反射モデルを,十分なキャリブレーションを行うことで作成した.再計算された法線から,初期形状の頂点座標を再計算することで3次元形状を再現する.

CiNii
多視点顔画像に基づく3次元顔形状推定

原朋也, 藤代裕紀, 中野真也, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 109 ( 470 ) 73 - 78 2010年03月

　概要を見る

3次元顔モデルは,映像制作や個人認証など様々な分野で応用されている.筆者らは,先行研究において,大型な装置を必要とせずに3次元顔モデルを構築する手法として,頭部変形モデルに基づき正面顔画像から高速に3次元顔モデルを構築する手法を提案している.しかしながら,鼻の高さや頬部分の凹凸の個人特徴を忠実に再現出来ないという問題点があった.本稿では,顔向きを変化させて撮影した複数枚の画像を入力として,各画像から構築される3次元顔モデルを,顔の部位ごとに最適な重みを与えて統合することで,より本人らしい3次元顔モデルを生成する手法を提案する.提案手法をオープンテストによって評価した結果,従来手法と比べて,特に鼻や口領域に関して,精度の向上を確認することが出来た.

CiNii J-GLOBAL
スナップ写真からの3次元顔モデル高速自動生成

前島謙宣, 森島繁生

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 109 ( 470 ) 145 - 150 2010年03月

　概要を見る

本報告では,スナップ写真から写真中の人物らしい3次元顔モデルを高速自動生成する手法について述べる.提案手法は3次元顔形状の事前知識として1153人の3次元顔モデルから構築される顔変形モデルと,顔形状の分布に対してフィッティングされた混合ガウス分布モデルを用いる.3次元顔形状は,画像から自動検出される顔特徴点と顔変形モデル上の対応頂点との残差の二乗和が最小かつその時の顔変形モデルのモデルパラメータに対する混合ガウス分布モデルの尤度が最大となるようなエネルギー最小化問題を解くことにより推定される.提案手法に対する性能評価実験の結果から,平均1.2秒の処理時間で2.1mmの精度誤差を持つ3次元顔モデルが生成可能であることを示す.

CiNii J-GLOBAL
個人顔の3次元形状変形とテクスチャ変換に基づくエージングシミュレーション

田副佑典, 藤代裕紀, 中野真也, 笠井聡子, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 109 ( 470 ) 151 - 156 2010年03月

　概要を見る

本稿では,レンジスキャンデータ(3次元顔形状データ+正面顔テクスチャ)に基づいて生成された正確な3次元形状と正面顔テクスチャを有する個人顔モデルに対し,3次元形状とテクスチャの双方に年齢特徴を加えた年齢変化顔の合成手法を提案する.まず,被験者の年齢に基づき3次元顔形状データと正面顔テクスチャを分類し年齢別顔データベースを構築する.次に,データベース中の3次元顔形状データから学習される年齢特徴ベクトルを入力顔に付加することで,3次元顔形状を所望の年齢へと変換する.続いて,入力顔に対し,入力顔が属する年代の平均顔テクスチャの輝度値とターゲットとなる年代の平均顔テクスチャの輝度値との差分を加えることにより,テクスチャをターゲットの年齢へ変換する.最後に,両者を統合することにより年齢変化顔を合成する.提案手法により,年齢変化に伴う3次元形状変化およびしみ,しわ等の表現を可能とした.

CiNii
3種の色光源を用いた多視点動画像からの動的立体構造再現

小林昭太, 森島繁生

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 109 ( 470 ) 157 - 162 2010年03月

　概要を見る

カメラを用いた撮影画像の陰影情報から,対象の3次元形状を復元する手法が研究されている.しかし,複数のカメラに同時に陰影情報を与える照明環境を作成することは難しく,対象の全体像の動的な形状変化を再現することは困難である.そこで,本研究では3色の色光源を用い,各照明に対応する陰影情報を撮影画像から分離して取得できる環境を作成することで複数の陰影情報を同時に取得し,視体積交差法により得られる対象の初期形状の法線を再計算する.陰影情報から法線を推定する際,反射モデルが必要になるが,本研究では,撮影環境に特化した反射モデルを,十分なキャリブレーションを行うことで作成した.再計算された法線から,初期形状の頂点座標を再計算することで3次元形状を再現する.

CiNii J-GLOBAL
A-15-10 3次元形状とテクスチャの双方の変換による年齢変化顔の生成(A-15.ヒューマン情報処理,一般セッション)

田副佑典, 藤代裕紀, 中野真也, 野中悠介, 笠井聡子, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2010 192 - 192 2010年03月

CiNii J-GLOBAL
A-15-11 人体の骨格形状を考慮したスキニング手法の提案(A-15.ヒューマン情報処理,一般セッション)

須田洋文, 中村槙介, 山中健太郎, 森島繁生

電子情報通信学会総合大会講演論文集 2010 193 - 193 2010年03月

DOI CiNii J-GLOBAL
A-16-10 静的・動的特徴を考慮した布の物理パラメータ推定(A-16.マルチメディア・仮想環境基礎,一般セッション)

國友翔次, 中村槙介, 森島繁生

電子情報通信学会総合大会講演論文集 2010 228 - 228 2010年03月

DOI CiNii J-GLOBAL
D-11-81 非線形モーフィングに基づく手描き顔アニメーションの中割り画像生成(D-11.画像工学,一般セッション)

郷原裕明, 杉本志織, 森島繁生

電子情報通信学会総合大会講演論文集 2010 ( 2 ) 81 - 81 2010年03月

CiNii J-GLOBAL
D-12-8 多視点顔画像に基づく顔器官毎の重みを考慮した3次元顔形状推定(D-12.パターン認識・メディア理解,一般セッション)

原朋也, 藤代裕紀, 中野真也, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2010 ( 2 ) 119 - 119 2010年03月

CiNii J-GLOBAL
来場者の声の特徴を反映する映像エンタテインメントシステムのための台詞音声生成システム

川本真一, 足立吉広, 大谷大和, 四倉達夫, 森島繁生, 中村哲

情報処理学会論文誌 51 ( 2 ) 250 - 264 2010年02月

　概要を見る

視聴者の顔をCGで再現し，CGキャラクタとして映画に登場させるFuture Cast System（FCS）を改良し，視聴者から収録した少量の音声サンプルを用いて，視聴者に似た台詞音声を生成するため複数手法を統合し，生成された台詞音声をシーンに合わせて同期再生することで，視聴者の声の特徴をキャラクタに反映させるシステムを提案する．話者データベースから視聴者と声が似た話者を選択する手法（類似話者選択技術）と，複数話者音声を混合することで視聴者の声に似た音声を生成する手法（音声モーフィング技術）を組み合わせたシステムを構築し，複数処理を並列化することで，上映準備時間の要求条件を満たした．実環境を想定してBGM/SEを重畳した音声によって，従来手法である類似話者選択技術より得られる音声と，提案法で導入した音声モーフィング技術より得られる音声を主観評価実験により評価した結果，Preference Scoreで56.5%のモーフィング音声が目標話者の音声に似ていると判断され，音声モーフィングを組み合わせることでシステムが出力する台詞音声の話者類似性を改善できることを示した．In this paper, we propose an improved Future Cast System (FCS) that enables anyone to be a movie star while retaining their individuality in terms of how they look and how they sound. The proposed system produces voices that are significantly matched to their targets by integrating the results of multiple methods: similar speaker selection and voice morphing. After assigning one CG character to the audience, the system produces voices in synchronization with the CG character's movement. We constructed the speech synchronization system using a voice actor database with 60 different kinds of voices. Our system achieved higher voice similarity than conventional systems; the preference score of our system was 56.5% over other conventional systems.

CiNii
主観評価に基づく個人性を強調した歩行動作合成手法の提案

中村槙介, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 138 ( 6 ) F1 - F6 2010年02月

　概要を見る

人間の歩行動作には、個人性情報が含まれており，最近では歩容個人認証の研究も盛んである。しかし個人の特徴を強調し，反映する歩容アニメーションを作ることは困難である．本研究では，歩行動作における個人性とは平均的な歩行動作からの差異によって表現されるものであると仮定し，その差異を増大させることによって個人性を強調した歩行動作を合成する．合成される歩行動作は，複数のサンプル歩行動作の主成分分析によって構築される空間において表現する．また，増大させる差異の大きさについては，複数の人物の歩行動作の中から特定の人物の歩行動作を探す主観評価実験によって最も認識率の高くなる割合を推定し，それを用いる．

Characteristics of human motion, such as walking, running, or jumping, vary from person to person. Differences in human motion enable people to identify themselves or a friend. However, it is challenging to use computer graphics to generate animation in which individual characters exhibit characteristic motion. In our research, the differences between the average of the sample motions and a target motion are regarded as the characteristics of the target motion. We synthesize gait animation with exaggerated characteristics by increasing these differences. The synthesized motion is represented as a PCA score in the PCA space spanned by the sample motions. In an experiment in which subjects look for a target motion in a crowd, we estimate the degree of exaggeration as the one at which subjects find the target motion most quickly.
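
The exaggeration idea in this abstract (scale the difference between a target gait and the average gait, expressed in a PCA space built from sample motions) can be sketched as follows. This is an illustrative sketch under assumptions: each motion is flattened into one feature vector, PCA is computed with a plain SVD, and the names `exaggerate_motion` and `alpha` are hypothetical. The abstract estimates the exaggeration factor from a subjective experiment; here it is simply a parameter.

```python
import numpy as np

def exaggerate_motion(samples, target, alpha, n_components=None):
    """Exaggerate how `target` deviates from the average sample motion.

    samples      : (n_samples, n_features) array, one flattened motion per row
    target       : (n_features,) flattened target motion
    alpha        : exaggeration factor (1.0 keeps the projected target as is)
    n_components : number of principal axes to keep (None keeps all)
    """
    mean = samples.mean(axis=0)
    # PCA via SVD of the centered samples; rows of vt are the principal axes
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    basis = vt[:n_components]
    # PCA score of the target: its offset from the mean in PCA coordinates
    score = basis @ (target - mean)
    # Scale the score and map it back to the original motion space
    return mean + (alpha * score) @ basis
```

With `alpha` greater than 1 the result moves further from the average motion along the same direction, which is the sense in which the individual characteristics are exaggerated.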

CiNii
遮蔽度の曲率近似によるアンビエントオクルージョンの局所照明モデル化

服部智仁, 久保尋之, 森島繁生

情報処理学会研究報告. グラフィクスとCAD研究会報告 138 ( 13 ) M1 - M6 2010年02月

　概要を見る

間接光を考慮した大域照明モデルによって生成される柔らかい陰影は，より物体を立体的にリアルに表現するために必要であり，局所照明モデルによって得難いものである．現在，Ambient Occlusion と呼ばれる大域照明モデルを局所照明モデルに付加することにより，ソフトシャドウを低コストに得る手法が知られている．本稿では，Ambient Occlusion を，曲率を用いた局所照明モデルに落とし込むことにより，従来のモデルより少ない計算量により効果を得る手法を提案する．

Ambient Occlusion is widely used to improve the realism of fast lighting simulation. We propose a new, simple, real-time technique for computing Ambient Occlusion through a curvature-dependent local illumination approximation of occlusion. Our results show that Ambient Occlusion can be obtained from curvature with lower computational complexity than previous techniques.

DOI CiNii
来場者の声の特徴を反映する映像エンタテインメントシステムのための台詞音声生成システム

川本真一, 足立吉広, 大谷大和, 四倉達夫, 森島繁生, 中村哲

2010年02月

　概要を見る

視聴者の顔をCGで再現し,CGキャラクタとして映画に登場させるFuture Cast System(FCS)を改良し,視聴者から収録した少量の音声サンプルを用いて,視聴者に似た台詞音声を生成するため複数手法を統合し,生成された台詞音声をシーンに合わせて同期再生することで,視聴者の声の特徴をキャラクタに反映させるシステムを提案する.話者データベースから視聴者と声が似た話者を選択する手法(類似話者選択技術)と,複数話者音声を混合することで視聴者の声に似た音声を生成する手法(音声モーフィング技術)を組み合わせたシステムを構築し,複数処理を並列化することで,上映準備時間の要求条件を満たした.実環境を想定してBGM/SEを重畳した音声によって,従来手法である類似話者選択技術より得られる音声と,提案法で導入した音声モーフィング技術より得られる音声を主観評価実験により評価した結果,Preference Scoreで56.5%のモーフィング音声が目標話者の音声に似ていると判断され,音声モーフィングを組み合わせることでシステムが出力する台詞音声の話者類似性を改善できることを示した.

CiNii
3次元形状とテクスチャを用いた年齢変化顔シミュレーション

森島繁生

第16回画像センシングシンポジウム(SSII2010) 2010年 [査読有り]
顔形状事前知識に基づく顔画像からの3次元顔モデル高速自動生成

森島繁生

第16回画像センシングシンポジウム(SSII2010) 2010年 [査読有り]
3次元形状変形とテクスチャ変換に基づく年齢変化顔モデルの生成

森島繁生

画像の認識・理解シンポジウム(MIRU2010) 2010年 [査読有り]
顔変形モデルと顔形状分布制約に基づく単一顔画像からの3次元顔モデル高速自動生成

森島繁生

画像の認識・理解シンポジウム(MIRU2010) 2010年 [査読有り]
複数顔向き画像に基づく3次元顔モデル生成

森島繁生

第16回画像センシングシンポジウム(SSII2010) 2010年 [査読有り]
アクティブスナップショット

森島繁生

第15回日本顔学会フォーラム顔学2010 Vol.10 104 2010年
Facial animation reflecting personal characteristics by automatic head modeling and facial muscle adjustment

Akinobu Maejima, Hiroyuki Kubo, Shigeo Morishima

ISCIT 2010 - 2010 10th International Symposium on Communications and Information Technologies 7 - 12 2010年

　概要を見る

We propose a new automatic character modeling system that can generate an individualized head model from facial range scan data alone, together with individualized facial animation with expression changes. The head modeling system consists of two core modules: the head modeling module, which generates a head model from personal facial range scan data using automatic mesh completion, and the key shape generation module, which generates key shapes for the generated head model based on physics-based facial muscle simulation with a personal muscle layout estimated from videos of the subject's facial expressions. As a result, we can generate a head model that can synthesize facial expressions and an impression similar to those of the target person. The experimental results show that subjects could identify themselves in the synthesized CG characters at a rate of 84%. Therefore, we conclude that our head modeling system is effective for games and entertainment systems like the Future Cast. ©2010 IEEE.

DOI
来場者の声の特徴を反映する映像エンタテインメントシステムのための台詞音声生成システム

川本真一, 足立吉広, 大谷大和, 四倉達夫, 森島繁生, 中村哲

情報処理学会論文誌ジャーナル(CD-ROM) 51 ( 2 ) 250 - 264 2010年

J-GLOBAL
音楽特徴量と印象語の分析に基づく楽曲のサムネイル表現技術

長谷川裕記, 室伏空, 山本達也, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2010 ROMBUNNO.2-8-20 2010年

J-GLOBAL
物まね音声の知覚特性を反映した音声類似度評価尺度の提案

田中茉莉, 山本達也, 室伏空, 河原英紀, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2010 ROMBUNNO.1-7-11 2010年

J-GLOBAL
人体の骨格形状を考慮したスキニング手法の提案

須田洋文, 中村槙介, 山中健太郎, 森島繁生

電子情報通信学会大会講演論文集 2010 193 2010年

J-GLOBAL
非線形モーフィングに基づく手描き顔アニメーションの中割り画像生成

郷原裕明, 杉本志織, 森島繁生

電子情報通信学会大会講演論文集 2010 81 2010年

J-GLOBAL
多視点顔画像に基づく顔器官毎の重みを考慮した3次元顔形状推定

原朋也, 藤代裕紀, 中野真也, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2010 119 2010年

J-GLOBAL
静的・動的特徴を考慮した布の物理パラメータ推定

國友翔次, 中村槙介, 森島繁生

電子情報通信学会大会講演論文集 2010 228 2010年

J-GLOBAL
3次元形状とテクスチャの双方の変換による年齢変化顔の生成

田副佑典, 藤代裕紀, 中野真也, 野中悠介, 笠井聡子, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2010 192 2010年

J-GLOBAL
個人顔の3次元形状変形とテクスチャ変換に基づくエージングシミュレーション

田副佑典, 藤代裕紀, 中野真也, 笠井聡子, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 109 ( 471(HIP2009 118-210) ) 151 - 156 2010年

J-GLOBAL
スナップ写真からの3次元顔モデル高速自動生成

前島謙宣, 森島繁生

電子情報通信学会技術研究報告 109 ( 471(HIP2009 118-210) ) 145 - 150 2010年

J-GLOBAL
多視点顔画像に基づく3次元顔形状推定

原朋也, 藤代裕紀, 中野真也, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 109 ( 471(HIP2009 118-210) ) 73 - 78 2010年

J-GLOBAL
3種の色光源を用いた多視点動画像からの動的立体構造再現

小林昭太, 森島繁生

電子情報通信学会技術研究報告 109 ( 471(HIP2009 118-210) ) 157 - 162 2010年

J-GLOBAL
人物頭部モデル自動生成システムの実現--最適化局所アフィン変換に基づく人物頭部モデルの自動生成

前島謙宣, 森島繁生

画像ラボ 21 ( 4 ) 29 - 35 2010年

J-GLOBAL
遮蔽度の曲率近似によるアンビエントオクルージョンの局所照明モデル化

服部智仁, 久保尋之, 森島繁生

情報処理学会研究報告(CD-ROM) 2009 ( 6 ) ROMBUNNO.CG-138,13 2010年

J-GLOBAL
主観評価に基づく個人性を強調した歩行動作合成手法の提案

中村槙介, 森島繁生

情報処理学会研究報告(CD-ROM) 2009 ( 6 ) ROMBUNNO.CG-138,6 2010年

J-GLOBAL
遮蔽度の曲率近似を用いたアンビエントオクルージョンの局所照明モデル化

服部智仁, 久保尋之, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2010 MODERINGU,HATTORITOMOHITO 2010年

J-GLOBAL
半透明物体の高速描画に向けた曲率に依存する反射関数の近似式

久保尋之, 土橋宜典, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2010 MODERINGU,KUBOHIROYUKI 2010年

J-GLOBAL
静的・動的特徴を考慮した布シミュレーションの物理パラメータ推定

國友翔次, 中村槙介, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2010 MODERINGU,KUNITOMOSHOJI 2010年

J-GLOBAL
コンシューマ参加型デジタルコンテンツ

森島繁生

電子情報通信学会大会講演論文集 2010 SS.92-SS.95 2010年

J-GLOBAL
人肌などの半透明物体の高速描画法

久保尋之, 土橋宜典, 森島繁生

日本顔学会誌 10 ( 1 ) 111 2010年

J-GLOBAL
話速変化に応じたリップシンクアニメーションの作成

高見澤涼, 矢野茜, 久保尋之, 前島謙宣, 森島繁生

日本顔学会誌 10 ( 1 ) 132 2010年

J-GLOBAL
歩行における知覚的類似性尺度に基づく個人性を強調した動作合成手法

中村槙介, 森島繁生

画像電子学会誌 39 ( 5 ) 615 - 620 2010年

　概要を見る

Characteristics of human motion, such as walking, running, or jumping, vary from person to person. Differences in human motion enable people to identify themselves or a friend. However, it is challenging to use computer graphics to generate animation in which individual characters exhibit characteristic motion. In our research, the differences between the average of a set of sample motions and a target motion are regarded as the characteristics of the target motion. We are able to synthesize gait animation with exaggerated characteristics by increasing these differences. The synthesized motion is represented as a PCA (Principal Component Analysis) score in the PCA space spanned by the sample motions. In an experiment in which subjects look for a target motion in a crowd, we estimate the optimum degree of exaggeration, which minimizes the time needed to find the target motion. © 2010, The Institute of Image Electronics Engineers of Japan. All rights reserved.

DOI J-GLOBAL
曲率に依存する反射関数を用いた半透明物体の高速レンダリング

久保尋之, 土橋宜典, 森島繁生

電子情報通信学会論文誌 A J93-A ( 11 ) 708 - 717 2010年

J-GLOBAL
歩行における知覚的類似性尺度に基づく個人性を強調した動作合成手法

中村槙介, 森島繁生

画像電子学会誌 39 ( 5 ) 615 - 620 2010年

　概要を見る

人間の歩行動作には，個人性情報が含まれており，最近では歩容個人認証の研究も盛んである．しかし個人の特徴を強調し，反映する歩容アニメーションを作ることは困難である．本研究では，歩行動作における個人性とは平均的な歩行動作からの差異によって表現されるものであると仮定し，その差異を増大させることによって個人性を強調した歩行動作を合成する．合成される歩行動作は，複数のサンプル歩行動作の主成分分析によって構築される空間において表現する．また，増大させる差異の大きさについては，複数の人物の歩行動作の中から特定の人物の歩行動作を探す主観評価実験によって最も認識率の高くなる割合を推定し，それを用いる．

DOI CiNii J-GLOBAL
プログラム担当より

森島繁生

日本バーチャルリアリティ学会誌 = Journal of the Virtual Reality Society of Japan 14 ( 4 ) 207 - 208 2009年12月

CiNii
座長からの報告

渡邊淳司, 藤田欣也, 加藤博一, ピトヨハルトノ, 苗村健, 昆陽雅司, 小林稔, 酒井幸仁, 筧康明, 梶本裕之, 北崎充晃, 宮崎慎也, 唐山英明, 池井寧, 柳田康幸, 森島繁生, 広田光一, 前田太郎, 中口俊哉, 長谷川晶一, 小木哲朗, 井原雅行, 妹尾武治, 稲見昌彦, 清川清, 橋本直己, 安藤英由樹, 宮田一乘, 吉田俊介, 大田友一, 原田哲也, 寺林賢司, 北島律之, 松本光春, 綿貫啓一, 青木義満

日本バーチャルリアリティ学会誌 = Journal of the Virtual Reality Society of Japan 14 ( 4 ) 213 - 223 2009年12月

CiNii
Development of a toolkit for spoken dialog systems with an anthropomorphic agent: Galatea

Takuya Nishimoto, Yoichi Yamashita, Tsuneo Nitta

APSIPA ASC 2009 - Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference 148 - 153 2009年12月

　概要を見る

The Interactive Speech Technology Consortium (ISTC) has been developing a toolkit called Galatea that comprises four fundamental modules for speech recognition, speech synthesis, face synthesis, and dialog control, that can be used to realize an interface for spoken dialog systems with an anthropomorphic agent. This paper describes the development of the Galatea toolkit and the functions of each module; in addition, it discusses the standardization of the description of multi-modal interactions.
ストーリへの没入感を体験可能な高臨場感コンテンツ

森島繁生

電気学会研究会資料. EDD, 電子デバイス研究会 2009 ( 81 ) 27 - 32 2009年11月

CiNii
ストーリへの没入感を体験可能な高臨場感コンテンツ

森島繁生

研究会講演予稿 247 27 - 32 2009年11月

CiNii J-GLOBAL
ストーリへの没入感を体験可能な高臨場感コンテンツ

森島繁生

電子情報通信学会技術研究報告. EID, 電子ディスプレイ 109 ( 267 ) 27 - 32 2009年10月

　概要を見る

観客全員を映画のキャストとして登場させ、ストーリへの没入感を体験可能な新しいコンテンツ形態を提案している。2005年の愛・地球博、三井・東芝館において、世界で初めて実現されたフューチャーキャストシステムは、視聴者体験型の映像コンテンツ『グランオデッセイ』を実現し、6か月間の体験者163万人を記録し、人気を博した。この技術は、2007年にハウステンボスにおいて、フューチャーキャストシアターとして商用化された。いかに短時間で負担を与えることなく、観客の個性を反映するCGキャラクタを自動カスタマイズ合成するかが成功のカギとなるが、今回、当初の顔の3次元形状計測に加えて、表情の個性、体型・歩容の個性、声質の個性、髪形の選択などを短時間かつ自動的にモデル化する手法を新たに実現し、頭髪を含む全身モデルのリアルタイムレンダリングを実現することで、自分自身を映像中に発見できる確率を、万博システムの59%から約90%に劇的に向上させることができた。

CiNii
ダンス動画コンテンツを再利用して音楽に合わせた動画を自動生成するシステム

室伏空, 中野倫靖, 後藤真孝, 森島繁生

情報処理学会研究報告. [音楽情報科学] 81 ( 21 ) U1 - U7 2009年07月

　概要を見る

本研究では、既存のダンス動画コンテンツの複数の動画像を分割して連結（切り貼り）することで、音楽に合ったダンス動画を自動生成するシステムを提案する。従来、切り貼りに基づいた動画の自動生成に関する研究はあったが、音楽-映像間の多様な関係性を対応付ける研究はなかった。本システムでは、そうした多様な関係性をモデル化するために、Web 上で公開されている二次創作された大量のコンテンツを利用し、クラスタリングと複数の線形回帰モデルを用いることで音楽に合う映像の素片を選択する。その際、音楽-映像間の関係だけでなく、生成される動画の時間的連続性や音楽的構造もコストとして考慮することで、動画像の生成をビタビ探索によるコスト最小化問題として解いた。

This paper presents a system that automatically generates a dance video clip appropriate to music by segmenting and concatenating existing dance video clips. Although there have been previous works on automatic music video creation, they did not support the various associations between music and video. To model such associations, our system uses a large amount of fan-made derivative content on the web, and selects video segments appropriate to the music by using linear regression models for multiple clusters. By introducing costs representing the temporal continuity and music structure of the generated video clip as well as the associations between music and video, the video creation problem is solved as a cost minimization by Viterbi search.
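
The cost minimization by Viterbi search mentioned in this abstract can be illustrated with a generic dynamic program. This is only a sketch under assumptions: `local_cost[t, k]` stands in for how poorly video segment `k` matches music section `t`, and `transition_cost[i, j]` stands in for the continuity penalty of cutting from segment `i` to segment `j`. Both matrices and the function name are hypothetical; the paper's actual cost terms (regression-based matching, music structure) are not reproduced here.

```python
import numpy as np

def viterbi_min_cost(local_cost, transition_cost):
    """Return the minimum-cost sequence of segment indices and its total cost.

    local_cost      : (n_steps, n_segments) matching cost per music section
    transition_cost : (n_segments, n_segments) cost of going from i to j
    """
    local_cost = np.asarray(local_cost, dtype=float)
    transition_cost = np.asarray(transition_cost, dtype=float)
    n_steps, n_segments = local_cost.shape
    total = local_cost[0].copy()                # best cost ending in each segment
    back = np.zeros((n_steps, n_segments), dtype=int)
    for t in range(1, n_steps):
        # cand[i, j]: best cost ending in i at step t-1, then cutting to j
        cand = total[:, None] + transition_cost
        back[t] = cand.argmin(axis=0)           # best predecessor for each j
        total = cand.min(axis=0) + local_cost[t]
    # Trace the best path backwards from the cheapest final segment
    path = [int(total.argmin())]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(total.min())
```

The dynamic program considers every pair of consecutive segments once per step, so its cost grows linearly with the number of music sections and quadratically with the number of candidate segments.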

CiNii
最適化局所アフィン変換に基づく正面顔レンジスキャンデータからの頭部モデル自動生成

前島謙宣, 森島繁生

画像電子学会誌 38 ( 4 ) 404 - 413 2009年07月

　概要を見る

人間の三次元頭部モデルの構築において，三次元レンジスキャナから獲得した頭部の形状データに対して，テンプレートメッシュを変形し整合する手法が一般的である．しかし，整合に必要なマーカの手動設定が必要とされ，レンジスキャン不可能な頭髪部位のデータ欠損の影響で後頭部の構築が困難な箇所の手修正を行う必要が生じる．本稿では，レンジスキャンで得られる正面顔のみの形状とテンプレートメッシュから，通常では計測困難な後頭部の形状をテンプレートメッシュの形状から補完可能な自動頭部生成法を提案する．本手法は，実測された顔形状とテンプレートメッシュ形状の境界がシームレスに接続された頭部モデルを構築することが可能である．また，一般的な整合処理は数分の時間を要するのに対し，本手法の処理時間は平均10.4秒であるため，短時間に多くのCG頭部モデル作成を必要とするゲーム，映像制作などエンタテインメント分野において有効であると考えられる．

DOI CiNii J-GLOBAL
調音結合モデルを用いた母音交換に基づく話者変換法

山本達也, 室伏空, 森島繁生

電子情報通信学会技術研究報告. SP, 音声 109 ( 139 ) 37 - 42 2009年07月

　概要を見る

合成音声の声質変換技術として研究されている統計的スペクトル変換法は、ターゲットとなる話者から事前に発話データを取得する必要があり、多くの発話データを得ることによってより品質の高い話者変換が可能になることが知られている。しかし一方でターゲットとなる話者から膨大な発話データを得ることは話者にとって大きな負担となる。一方、母音交換法による話者変換はターゲットとなる話者から母音のみを取得してあらゆる発話内容に対して話者変換を行うことができる。筆者らは母音交換法の問題点であった音声の不連続性を改善するため音素境界付近に調音結合モデルを用いて自然な音声を合成する手法を提案した。また、調音結合モデルの適用区間を入力話者の発話内容から学習することでより自然発話に近い音声を合成する手法を提案する。

CiNii
MRIを用いた前腕皮膚形状変化モデルの構築と運動生成

山中健太郎, 中村槙介, 小林昭太, 森島繁生

ヒューマンインタフェース学会研究報告集 : human interface 11 ( 2 ) 85 - 90 2009年05月

CiNii
MRIを用いた前腕皮膚形状変化モデルの構築と運動生成

山中健太郎, 中村槙介, 小林昭太, 森島繁生

電子情報通信学会技術研究報告. WIT, 福祉情報工学 109 ( 29 ) 85 - 90 2009年05月

　概要を見る

This paper presents a new methodology for constructing an example-based skin deformation model of human forearm based on MRI images. Generating realistic skin animation of forearm movement is generally difficult in CG domain because there is a crucial difference between a structure of forearm of a virtual human and a real human. So we propose a new kind of skin deformation model based on MRI images. Using MRI images, we can model example skin shapes associated with location of bones with accuracy. We also mention how to apply the model to characters to generate skin animation. For this purpose, we employ RBF, Radial Basis Functions. Once the model is constructed, skin animation is easily generated by applying the model to the forearm of a character by means of RBF. In this paper, we describe how to construct our model, first. Then we explain the method to apply the model to characters and generate skin animation.
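
The abstract states that example skin shapes are applied to a character by means of RBF (Radial Basis Functions). A minimal sketch of RBF interpolation is given below, assuming a Gaussian kernel, a pose parameter (for example a forearm rotation angle) as the input, and per-vertex skin offsets as the output. The kernel choice, the parameter `eps`, and the function names are assumptions for illustration; the paper does not specify them here.

```python
import numpy as np

def fit_rbf(centers, values, eps=1.0):
    """Solve for RBF weights so the interpolant reproduces the examples.

    centers : (n_examples, n_dims) pose parameters of the example shapes
    values  : (n_examples, n_outputs) skin offsets measured at those poses
    """
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    phi = np.exp(-(eps * d) ** 2)               # Gaussian kernel matrix
    return np.linalg.solve(phi, values)

def eval_rbf(centers, weights, query, eps=1.0):
    """Interpolate skin offsets at new pose parameters `query` (n_query, n_dims)."""
    d = np.linalg.norm(query[:, None, :] - centers[None, :, :], axis=-1)
    return np.exp(-(eps * d) ** 2) @ weights
```

At the example poses the interpolant returns the measured offsets exactly, and between them it blends the examples smoothly, which is the property an example-based deformation model relies on.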

CiNii
MRIを用いた前腕皮膚形状変化モデルの構築と運動生成

山中健太郎, 中村槙介, 小林昭太, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 109 ( 27 ) 85 - 90 2009年05月

　概要を見る

This paper presents a new methodology for constructing an example-based skin deformation model of human forearm based on MRI images. Generating realistic skin animation of forearm movement is generally difficult in CG domain because there is a crucial difference between a structure of forearm of a virtual human and a real human. So we propose a new kind of skin deformation model based on MRI images. Using MRI images, we can model example skin shapes associated with location of bones with accuracy. We also mention how to apply the model to characters to generate skin animation. For this purpose, we employ RBF, Radial Basis Functions. Once the model is constructed, skin animation is easily generated by applying the model to the forearm of a character by means of RBF. In this paper, we describe how to construct our model, first. Then we explain the method to apply the model to characters and generate skin animation.

CiNii
MRIを用いた前腕皮膚形状変化モデルの構築と運動生成

山中健太郎, 中村槙介, 小林昭太, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 109 ( 28 ) 85 - 90 2009年05月

　概要を見る

This paper presents a new methodology for constructing an example-based skin deformation model of human forearm based on MRI images. Generating realistic skin animation of forearm movement is generally difficult in CG domain because there is a crucial difference between a structure of forearm of a virtual human and a real human. So we propose a new kind of skin deformation model based on MRI images. Using MRI images, we can model example skin shapes associated with location of bones with accuracy. We also mention how to apply the model to characters to generate skin animation. For this purpose, we employ RBF, Radial Basis Functions. Once the model is constructed, skin animation is easily generated by applying the model to the forearm of a character by means of RBF. In this paper, we describe how to construct our model, first. Then we explain the method to apply the model to characters and generate skin animation.

CiNii
CGキャラクタの存在感

森島繁生

日本バーチャルリアリティ学会誌 = Journal of the Virtual Reality Society of Japan 14 ( 1 ) 23 - 28 2009年03月

CiNii J-GLOBAL
D-14-4 調音結合補正を用いた母音交換法に基づく話者変換法(D-14. 音声,一般セッション)

山本達也, 室伏空, 近藤康治郎, 森島繁生

電子情報通信学会総合大会講演論文集 2009 ( 1 ) 167 - 167 2009年03月

CiNii
D-11-109 データベースに基づく車体形状デザインGUIの構築(D-11.画像工学,一般セッション)

仲田真輝, 早川達順, 杉本志織, 森島繁生

電子情報通信学会総合大会講演論文集 2009 ( 2 ) 109 - 109 2009年03月

CiNii J-GLOBAL
A-10-2 楽器音テンプレートマッチングによる倍音誤り補正システム(A-10.応用音響,一般セッション)

藤澤賢太郎, 室伏空, 近藤康治郎, 森島繁生

電子情報通信学会総合大会講演論文集 2009 197 - 197 2009年03月

CiNii
A-15-14 MRIに基づく皮膚下構造を反映した顔面筋肉モデルの構築(A-15.ヒューマン情報処理,一般セッション)

鑓水裕刀, 石橋康, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2009 250 - 250 2009年03月

CiNii J-GLOBAL
A-15-17 多様な表情を合成可能な固有顔空間の構築(A-15.ヒューマン情報処理,一般セッション)

高見澤涼, 鈴木孝章, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2009 253 - 253 2009年03月

CiNii
A-15-16 顔動画像のオプティカルフローに基づく作り笑い・自然な笑いの識別(A-15.ヒューマン情報処理,一般セッション)

藤代裕紀, 鈴木孝章, 中野真也, 野中悠介, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2009 252 - 252 2009年03月

CiNii J-GLOBAL
A-15-15 MRIを用いた前腕運動時の皮膚形状変化の精密な再現(A-15.ヒューマン情報処理,一般セッション)

山中健太郎, 中村槙介, 矢野茜, 森島繁生

電子情報通信学会総合大会講演論文集 2009 251 - 251 2009年03月

CiNii
MRIに基づく皮膚下構造を反映した顔面筋肉モデルの構築

鑓水裕刀, 石橋康, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2009 250 2009年

J-GLOBAL
楽器音テンプレートマッチングによる倍音誤り補正システム

藤澤賢太郎, 室伏空, 近藤康治郎, 森島繁生

電子情報通信学会大会講演論文集 2009 197 2009年

J-GLOBAL
MRIを用いた前腕運動時の皮膚形状変化の精密な再現

山中健太郎, 中村槙介, 矢野茜, 森島繁生

電子情報通信学会大会講演論文集 2009 251 2009年

J-GLOBAL
調音結合補正を用いた母音交換法に基づく話者変換法

山本達也, 室伏空, 近藤康治郎, 森島繁生

電子情報通信学会大会講演論文集 2009 167 2009年

J-GLOBAL
データベースに基づく車体形状デザインGUIの構築

仲田真輝, 早川達順, 杉本志織, 森島繁生

電子情報通信学会大会講演論文集 2009 109 2009年

J-GLOBAL
多様な表情を合成可能な固有顔空間の構築

高見澤涼, 鈴木孝章, 久保尋之, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2009 253 2009年

J-GLOBAL
顔動画像のオプティカルフローに基づく作り笑い・自然な笑いの識別

藤代裕紀, 鈴木孝章, 中野真也, 野中悠介, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2009 252 2009年

J-GLOBAL
英語情動文的“I love you”中国語話者による認知と音響特性相関(2)

ヤーッコラ伊勢井敏子, 広瀬啓吉, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2009 ROMBUNNO.2-P-16 2009年

J-GLOBAL
音と同期した3DCGを用いた日本人英語学習者に苦手な構音運動のためのトレーニングシステム-唇の突き出し

ヤーッコラ(伊勢井)敏子, 鈴木茂樹, 広瀬啓吉, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2009 ROMBUNNO.3-6-13 2009年

J-GLOBAL
アンドロイドやエージェントに感じる人の存在感 CGキャラクタの存在感

森島繁生

日本バーチャルリアリティ学会誌 14 ( 1 ) 23-28,1(1) 2009年

J-GLOBAL
MRIを用いた前腕皮膚形状変化モデルの構築と運動生成

山中健太郎, 中村槙介, 小林昭太, 森島繁生

電子情報通信学会技術研究報告 109 ( 29(WIT2009 1-47) ) 85 - 90 2009年

J-GLOBAL
MRI計測から獲得される皮膚下の厚みを適用した顔面筋肉モデルの構築

鑓水裕刀, 石橋康, 久保尋之, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2009 ROMBUNNO.31 2009年

J-GLOBAL
MRIを用いた前腕運動に伴う皮膚形状変化モデルの構築

山中健太郎, 中村槙介, 小林昭太, 白石允梓, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2009 ROMBUNNO.21 2009年

J-GLOBAL
調音結合モデルを用いた母音交換に基づく話者変換法

山本達也, 室伏空, 森島繁生

電子情報通信学会技術研究報告 109 ( 139(SP2009 41-48) ) 37 - 42 2009年

J-GLOBAL
最適化局所アフィン変換に基づく正面顔レンジスキャンデータからの頭部モデル自動生成

前島謙宣, 森島繁生

画像電子学会誌 38 ( 4 ) 404 - 413 2009年

　概要を見る

We propose an automatic 3D human head modeling method that uses both a frontal facial image and its geometry. In general, template mesh fitting methods are used to create a face model from facial range data obtained by a range scanner. However, previous fitting techniques require markers to be specified manually on the scanned 3D geometry, and require manual correction of the 3D geometry of missing parts, because the head geometry of the hair region cannot be measured accurately. We therefore complement this region's 3D geometry with that of the template mesh. Our technique can generate a head model in which the scanned 3D face geometry and the template mesh geometry are seamlessly connected. Our method is also much faster than previous template mesh fitting methods. We therefore conclude that the proposed method is effective for creating large numbers of head models in the game and film industries and in entertainment systems. © 2009, The Institute of Image Electronics Engineers of Japan. All rights reserved.

DOI J-GLOBAL
ダンス動画コンテンツを再利用して音楽に合わせた動画を自動生成するシステム

室伏空, 中野倫靖, 後藤真孝, 森島繁生

情報処理学会研究報告(CD-ROM) 2009 ( 2 ) ROMBUNNO.MUS-NO.81(21) 2009年

J-GLOBAL
多波長照明を用いた多視点画像からの動的立体形状再現

小林昭太, 森島繁生

日本バーチャルリアリティ学会大会論文集(CD-ROM) 14th ROMBUNNO.1A4-1 2009年

J-GLOBAL
平均顔を用いた実験用日本人表情刺激作成の試み

木村あやの, 鈴木竜太, 吉田宏之, 渡邊伸行, 續木大介, CHANDRASIRI Naiwala P., 小泉憲生, 時田学, 森島繁生, 山田寛

日本顔学会誌 9 ( 1 ) 53 - 69 2009年

J-GLOBAL
MRI画像に基づく皮膚下の厚みを反映した顔面筋肉モデルの構築

鑓水裕刀, 久保尋之, 前島謙宣, 森島繁生

日本顔学会誌 9 ( 1 ) 196 2009年

J-GLOBAL
同世代女性の疲労感と表情の関係

鈴木めぐみ, 永嶋義直, 矢田幸博, 森島繁生, 山田寛

日本顔学会誌 9 ( 1 ) 245 2009年

J-GLOBAL
ストーリへの没入感を体験可能な高臨場感コンテンツ

森島繁生

画像電子学会研究会講演予稿 247th 27 - 32 2009年

J-GLOBAL
ここまで来た!顔情報処理技術の最先端--キャラクターの個性を表現可能な顔画像合成技術

森島繁生

O plus E ( 361 ) 1413 - 1417 2009年

J-GLOBAL
ストーリへの没入感を体験可能な高臨場感コンテンツ(高臨場感ディスプレイフォーラム2009臨場感とは何か?)

森島繁生

映像情報メディア学会技術報告 33 ( 0 ) 27 - 32 2009年

　概要を見る

観客全員を映画のキャストとして登場させ、ストーリへの没入感を体験可能な新しいコンテンツ形態を提案している。2005年の愛・地球博、三井・東芝館において、世界で初めて実現されたフューチャーキャストシステムは、視聴者体験型の映像コンテンツ『グランオデッセイ』を実現し、6か月間の体験者163万人を記録し、人気を博した。この技術は、2007年にハウステンボスにおいて、フューチャーキャストシアターとして商用化された。いかに短時間で負担を与えることなく、観客の個性を反映するCGキャラクタを自動カスタマイズ合成するかが成功のカギとなるが、今回、当初の顔の3次元形状計測に加えて、表情の個性、体型・歩容の個性、声質の個性、髪形の選択などを短時間かつ自動的にモデル化する手法を新たに実現し、頭髪を含む全身モデルのリアルタイムレンダリングを実現することで、自分自身を映像中に発見できる確率を、万博システムの59%から約90%に劇的に向上させることができた。

DOI CiNii
個人の音声を反映する映像エンタテインメントシステム

足立吉広, 大谷大和, 川本真一, 四倉達夫, 森島繁生, 中村哲

情報処理学会論文誌 49 ( 12 ) 3908 - 3917 2008年12月

　概要を見る

視聴者の顔をCGで再現し，CGキャラクタとして映画に登場させるFuture Cast System（FCS）を改良し，視聴者の声の特徴をそのキャラクタの台詞音声へ反映させ，キャラクタの顔と声の一致度を向上させて音声を出力するシステムを構築する．あらかじめ構築した話者データベースから視聴者の知覚的類似話者を選出し，その話者の台詞音声を視聴者のキャラクタに割り当て，短時間で台詞音声を映像と同期出力するシステムを提案する．知覚的類似話者は，個人性の知覚と関係があると報告されている8つの音響特徴量による距離の線形結合を用いて推定する．声優による60種類の声質の台詞音声データベースを用いた音声出力同期システムを構築し，視聴者のキャラクタの顔と選択された音声の一致度に関して5段階の主観評価を行った．登場者数と話者データベースの規模，および類似話者の許容度の関係を予備実験により調査し，実験条件にあてはめたところ，予想される許容度約51%に対して主観実験値において35%の許容が確認され，全体として予備実験で得られた予想値の68%が達成できた．

In this paper, we propose an improved Future Cast System (FCS) that enables anyone to be a movie star with their own individuality in voice as well as face. The previous system created a CG character that closely resembles the face of the audience member; however, the voice of the character was selected considering gender only. Therefore, the voice of a CG character was not sufficient for audience members to distinguish themselves from others. The proposed system produces a voice much closer to that of the audience member by selecting one from a voice actor database, where the voice similarity of speakers is estimated using a combination of 8 acoustic features. After assigning a CG character to the audience member, the system produces voices in synchronization with the CG character's movement. We constructed the speech synchronization system using a voice actor database with 60 voice qualities, and conducted subjective evaluation experiments of voice similarity on a five-grade scale. The proposed method achieved 68% of the theoretical figure, which takes into account the acceptance rate of the selected speaker relative to the database size.
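
The speaker selection step described in this abstract (estimate perceptual similarity as a linear combination of distances over 8 acoustic features, then pick the closest voice actor) might look roughly like the sketch below. The per-feature absolute difference, the weight vector, and the function name `select_similar_speaker` are assumptions for illustration; the abstract does not state which features, distance measures, or weights were used.

```python
import numpy as np

def select_similar_speaker(target_feats, db_feats, weights):
    """Pick the database speaker with the smallest weighted feature distance.

    target_feats : (n_features,) acoustic features of the audience member
    db_feats     : (n_speakers, n_features) features of the voice actors
    weights      : (n_features,) weights of the linear combination (hypothetical)
    """
    # One distance per speaker and per acoustic feature
    dists = np.abs(np.asarray(db_feats) - np.asarray(target_feats))
    # Linear combination of the per-feature distances
    scores = dists @ np.asarray(weights)
    return int(scores.argmin()), scores
```

In practice the features would need to be normalized to comparable scales before weighting; that preprocessing is omitted here.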

CiNii
顔表情のCG合成と感動評価

森島繁生

映像情報メディア学会誌 : 映像情報メディア = The journal of the Institute of Image Information and Television Engineers 62 ( 12 ) 1924 - 1927 2008年12月

DOI CiNii
アニメ作品制作の高能率化をめざす研究開発

森島繁生

画像ラボ 19 ( 7 ) 34 - 39 2008年07月

CiNii J-GLOBAL
仮想現実的顔画像処理システムを用いた顔面表情知覚の精神物理学的研究特に基本表情の強度と感覚量の関係について

山田寛, 中村宏信, 森島繁生, 原島博

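電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 95 ( 477 ) 15 - 20 2008年05月

Recent remarkable progress in facial image processing has made it possible to apply rigorous psychophysical measurement methods to studies of expression perception that use realistic faces. As a first step, this study ran three psychophysical experiments on the relationship between the intensity of a facial expression and the corresponding human sensation magnitude. Experiment 1 took up two expressions, "surprise" and "anger," and measured discrimination thresholds at five different expression intensities. Experiment 2 made the same measurements for "happiness (mouth open)," "happiness (mouth closed)," and "sadness," and showed that the relationship between expression intensity and sensation magnitude falls into two types arising from the physical characteristics of each expression. Experiment 3 made the same measurements with inverted "surprise" stimuli and verified that, in the preceding experiments, the participants were indeed perceiving expressions when they discriminated the presented stimuli.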

CiNii
英語音声教育のための3DCGによる舌の動きと音声のリンク開発の試み--語彙との同期 (ヒューマンインフォメーション・立体映像技術)

ヤーッコラ伊勢井敏子, 鈴木茂樹, 森島繁生

映像情報メディア学会技術報告 32 ( 15 ) 29 - 32 2008年03月

CiNii J-GLOBAL
椎骨骨格形状モデルに基づくデータドリブンな脊椎動作モデリング

関根孝雄, 森島繁生

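情報処理学会研究報告. CG,グラフィクスとCAD研究会報告 130 ( 14 ) 11 - 16 2008年02月

We propose a system that builds a spinal column model from the 3D shapes of vertebra replicas that faithfully reproduce real human bones, and generates natural spinal motion for CG human animation. With recent advances in CG, skeleton models are now frequently used to depict human motion in films and games. However, compared with regions made up of large bones such as the arms and legs, the spinal column, which consists of many small bones, is considered difficult to model. Based on the structure of the human vertebrae, we therefore divide the column into lumbar, thoracic, and cervical sections and generate natural spinal motion by controlling the movement of each section.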

CiNii J-GLOBAL
ストーリへの没入感を実現するダイブイントゥザムービープロジェクト

森島繁生, 八木康史, 中村哲, 川本真一

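情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア] 161 ( 3 ) 121 - 128 2008年01月

Dive into the Movie is an entertainment proposal, unparalleled in the world, that realizes immersion in a story in two styles: one in which viewers themselves act as characters in a movie and then watch it as audience members with family and friends, sharing the emotional experience; and one in which a faithful virtual reproduction of the environment lets audience members experience and share the space and sound field in which the main character exists. Realizing this requires technology that generates a character model closely resembling each viewer in a short time and synthesizes it in real time while faithfully reproducing the person's features, as well as technology that faithfully records and automatically reproduces the visual and sound-field environment surrounding the performer.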

CiNii
ストーリへの没入感を実現するダイブイントゥザムービープロジェクト

森島繁生, 八木康史, 中村哲, 川本真一

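電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 107 ( 427 ) 153 - 160 2008年01月

Dive into the Movie is an entertainment proposal, unparalleled in the world, that realizes immersion in a story in two styles: one in which viewers themselves act as characters in a movie and then watch it as audience members with family and friends, sharing the emotional experience; and one in which a faithful virtual reproduction of the environment lets audience members experience and share the space and sound field in which the main character exists. Realizing this requires technology that generates a character model closely resembling each viewer in a short time and synthesizes it in real time while faithfully reproducing the person's features, as well as technology that faithfully records and automatically reproduces the visual and sound-field environment surrounding the performer.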

CiNii
英語流音学習のための音と同期した3DCGによる構音運動

ヤーッコラ伊勢井敏子, 森島繁生, 広瀬啓吉

日本音響学会秋季大会講演論文集 339 - 340 2008年

英語情動文“I love you”の中国語話者による認知と音響特性相関(1)

広瀬啓吉, 森島繁生

日本音響学会秋季大会講演論文集 331 - 332 2008年

顎矯正手術後のスマイル作成時の軟組織変化

吉田満, 寺田員人, 杉野伸一郎, 佐野奈都貴, 長谷川優, 小原彰浩, 齋藤功, 森島繁生

日本矯正歯科学会大会プログラム・抄録集 67th 213 2008年

J-GLOBAL
「デジタルメディア作品の制作を支援する基盤技術」2008 コンテンツ制作の高能率化のための要素技術研究

森島繁生, 安生健一, バクスターウィリアム, 中村哲, 四倉達夫, 川本真一

デジタルメディア作品の制作を支援する基盤技術第2回領域シンポジウム平成20年 14 - 15 2008年

J-GLOBAL
コンテンツ制作の高能率化のための要素技術研究

森島繁生

戦略的創造研究推進事業研究年報(CD-ROM) 2008 DEJITARUMEDIA,MORISHIMA 2008年

J-GLOBAL
ストーリへの没入感を実現するダイブイントゥザムービープロジェクト

森島繁生, 八木康史, 中村哲

情報処理学会研究報告 2008 ( 3(CVIM-161) ) 121 - 128 2008年

J-GLOBAL
ディジタルコンテンツ制作を支える新技術１．キャラクタアニメーション制作の高能率化手法

森島繁生, 栗山繁, 川本真一

映像情報メディア学会誌 62 ( 2 ) 156 - 160 2008年

DOI CiNii J-GLOBAL
ディジタルコンテンツ制作―ＤＣＳ'０７関連―直感的に影を演出可能な編集ツール

中嶋英仁, 杉崎英嗣, 森島繁生

映像情報メディア学会誌 62 ( 2 ) 234 - 239 2008年

Shadows in cartoon animation are generally used to dramatize scenes. In hand-drawn animation, these shadows reflect the animators' intention and style rather than physical phenomena. In 3DCG animation, by contrast, shadows are rendered photorealistically and animators cannot fully reflect their intention, because shadows are generated automatically once the light source is defined. We therefore developed an interactive shadow-editing tool that combines the advantages of hand-drawn animation and 3DCG technology. The advantage of our tool is that shadow attributes are inherited once animators edit the shape and location of shadows, and editing requires only mouse operations. Consequently, our tool enables animators to create shadows automatically and easily that reflect their intention and style.

DOI CiNii J-GLOBAL
椎骨骨格形状モデルに基づくデータドリブンな脊椎動作モデリング

関根孝雄, 森島繁生

情報処理学会研究報告 2008 ( 14(CG-130) ) 11 - 16 2008年

J-GLOBAL
複数音響特徴量の統合による音声の知覚的類似度推定

足立吉広, 川本真一, 森島繁生, 中村哲

日本音響学会研究発表会講演論文集(CD-ROM) 2008 ROMBUNNO.1-11-14 2008年

J-GLOBAL
表情筋運動モデルの過渡特性を考慮した表情アニメーション

久保尋之, 石橋康, 前島謙宣, 森島繁生

Visual Computing/グラフィクスとCAD合同シンポジウム予稿集(CD-ROM) 2008 ROMBUNNO.20 2008年

J-GLOBAL
アニメ作品制作の高能率化をめざす研究開発

森島繁生

画像ラボ 19 ( 7 ) 34 - 39 2008年

J-GLOBAL
英語情動文“I love you”中国語話者による認知と音響特性相関(1)

ヤーッコラ(伊勢井)敏子, 広瀬啓吉, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2008 ROMBUNNO.1-Q-5 2008年

J-GLOBAL
音声モーフィングを用いた類似話者の評価

近藤康治郎, 足立吉広, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2008 ROMBUNNO.1-Q-18 2008年

J-GLOBAL
音声の知覚的類似度推定のための音響特徴量の選択

足立吉広, 川本真一, 森島繁生, 中村哲

日本音響学会研究発表会講演論文集(CD-ROM) 2008 ROMBUNNO.2-P-20 2008年

J-GLOBAL
メロディの楽譜と採譜された演奏記録を教師データに用いた演奏表情のモデリング

室伏空, 近藤康治郎, 足立吉広, 森島繁生

日本音響学会研究発表会講演論文集(CD-ROM) 2008 ROMBUNNO.1-9-18 2008年

J-GLOBAL
高速度カメラを用いた3次元表情アニメーション生成

鈴木孝章, 石橋康, 久保尋之, 前島謙宣, 森島繁生

日本顔学会誌 8 ( 1 ) 195 2008年

J-GLOBAL
筋配置のカスタマイズが可能な表情生成シミュレータの提案

久保尋之, 上村周平, 石橋康, 前島謙宣, 森島繁生

日本顔学会誌 8 ( 1 ) 189 2008年

J-GLOBAL
いま“顔”が面白い~顔の画像処理とその応用~3.顔表情のCG合成と感動評価

森島繁生

映像情報メディア学会誌 62 ( 12 ) 1924 - 1927, 1910 - 1911 2008年

DOI J-GLOBAL
個人の音声を反映する映像エンタテインメントシステム

足立吉広, 大谷大和, 川本真一, 四倉達夫, 森島繁生, 中村哲

情報処理学会論文誌ジャーナル(CD-ROM) 49 ( 12 ) 3908 - 3917 2008年

J-GLOBAL
情報技術が支えるアートとコンテンツの世界-Art with Science Science with Art- : 5.効率的アニメ制作支援のための3次元CG技術

森島繁生, 安生健一, 中村哲

情報処理 48 ( 12 ) 1351 - 1358 2007年12月

CiNii
ユーザ参加型エンタテインメント『ダイブイントゥザムービー』

森島繁生

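情報処理学会研究報告. CG,グラフィクスとCAD研究会報告 128 ( 84 ) 29 - 34 2007年08月

This paper describes Dive into the Movie, a new form of entertainment in which viewers themselves can act as characters in a movie and, through reproduction of the characters' environment, experience immersion in the story. A character model closely resembling the viewer is generated in a short time, and this character is synthesized in real time and performs in the main feature, a form of entertainment without precedent. Realizing it naturally requires real-time CG synthesis, but it also requires various techniques for expressing individuality, computer vision techniques for capturing individuality, and speech signal processing.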

CiNii J-GLOBAL
骨格性下顎前突症患者における口唇周囲軟組織の三次元運動解析

松原大樹, 寺田員人, 中村康雄, 林豊彦, 森島繁生, 齋藤功

日本顎変形症学会雑誌 17 ( 3 ) 189 - 199 2007年08月

DOI CiNii J-GLOBAL
モーションキャプチャシステムを用いた頭髪アニメーション手法の提案

杉崎英嗣, 風間祥介, 石川貴仁, 白石允梓, 西村昌平, 森島繁生

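画像電子学会誌 = Imaging & Visual Computing The Journal of the Institute of Image Electronics Engineers of Japan 36 ( 4 ) 398 - 406 2007年07月

Representing human motion and facial expressions in computer graphics with data acquired by a motion capture system is a highly effective data-driven approach, and applied research using such data is presented every year at conferences such as SIGGRAPH. The technology is also widely used in actual film production, and supporting software has been developed. This paper proposes a method for simulating hair motion with a motion capture system. Human hair consists of roughly 100,000 strands, with individual differences, so attaching motion capture markers to every strand and capturing its motion is practically impossible. Our method is instead based on the layer model that hairdressers, the professionals of hair design, use when designing hair: the hair tufts to be measured are placed at the end points of each layer. These tufts are captured as the principal axes of hair motion, and we demonstrate that hair motion animation can be created by interpolating the whole head of hair from them.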

DOI CiNii
魅せる表情--似せるキャラクター (シンポジウム表情と動きから見た未来のアニメーション)

森島繁生

美術解剖学雑誌 11 ( 1 ) 30 - 41 2007年07月

CiNii
後側方車両認識支援のためのダイナミックサイドミラーの提案とその評価

桑名潤平, 伊藤誠, 稲垣敏之, 三浦文裕, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 107 ( 60 ) 59 - 64 2007年05月

CiNii
レンジデータに整合された顔モデル3次元座標のPCAによる大規模データベース対応型顔認証システム

野中悠介, 山名信弘, 井辺昭人, 三浦文裕, 前島謙宣, 森島繁生

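電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 106 ( 539 ) 61 - 66 2007年02月

This paper proposes an authentication system that uses the 3D shape of the face. Principal component analysis is applied to the vertex coordinates of a 3D mesh model fitted to the 3D face shape, and the principal component scores are used as features. To reduce computation, the system first applies non-uniform quantization to the principal component scores, assuming a normal distribution along each principal component axis, and performs identification. It then authenticates the identified candidates using the integral of a Gaussian function as the distance measure. In an authentication experiment on a 3D face shape database of 2,782 people with everyone enrolled, the EER was 4.53%. With the number of enrolled people reduced to 15, the EER was 2.32%, showing that high-accuracy authentication is achievable.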
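
One way to read the non-uniform quantization step is to cut each principal component axis into bins that are equally probable under the assumed normal distribution, so bins are narrow near the mean and wide in the tails. The sketch below illustrates that reading only. The equal-probability rule, the function names, and the sample numbers are assumptions; the abstract does not state how the bin edges were chosen.

```python
from statistics import NormalDist


def quantize_score(score, mean, stdev, levels):
    """Map a score to one of `levels` bins, equally probable under N(mean, stdev)."""
    p = NormalDist(mean, stdev).cdf(score)
    # p == 1.0 can occur for extreme scores, so clamp to the last bin.
    return min(int(p * levels), levels - 1)


def quantize_scores(scores, means, stdevs, levels):
    """Quantize one principal component score vector into a tuple of bin indices."""
    return tuple(
        quantize_score(s, m, d, levels) for s, m, d in zip(scores, means, stdevs)
    )


# Hypothetical scores on three principal component axes, four levels per axis.
code = quantize_scores([0.0, -2.5, 4.0], [0.0, 0.0, 1.0], [1.0, 1.0, 2.0], 4)
print(code)
```

Enrolled faces whose quantized code matches the probe's code could then serve as the candidate set for the finer distance-based stage.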

CiNii
外科的矯正治療後のスマイルの三次元的変化

寺田員人, 吉田満, 佐野奈都貴, 松原大樹, 小原彰浩, 齋藤功, 森島繁生

日本矯正歯科学会大会プログラム・抄録集 66th 251 2007年

J-GLOBAL
「デジタルメディア作品の制作を支援する基盤技術」コンテンツ制作の高能率化のための要素技術研究

森島繁生

戦略的創造研究推進事業研究年報(CD-ROM) 2007 DEJITARUMEDIA,MORISHIMA 2007年

J-GLOBAL
レンジデータに整合された顔モデル3次元座標のPCAによる大規模データベース対応型顔認証システム

野中悠介, 山名信弘, 井辺昭人, 三浦文裕, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 106 ( 539(PRMU2006 221-234) ) 61 - 66 2007年

J-GLOBAL
話者類似性知覚における年齢・性別・発話スタイルの影響の検討

足立吉広, 川本真一, 森島繁生, 中村哲

日本音響学会研究発表会講演論文集(CD-ROM) 2007 1-Q-10 2007年

J-GLOBAL
顔情報データベースＦＩＮＤ―日本人の顔画像データベース構築の試み―

渡邊伸行, 山田寛, 鈴木竜太, 吉田宏之, 續木大介, 番場あやの, Chandrasiri Naiwala P., 時田学, 和田万紀, 森島繁生

感情心理学研究 14 ( 1 ) 39 - 53 2007年

This paper offers a database of facial images of Japanese people, available for various kinds of studies on faces and facial expressions. The Facial Information Norm Database (FIND) currently includes more than 13,000 images of 150 Japanese neutral faces, seven prototypical facial expressions of emotion (happiness, surprise, fear, sadness, anger, disgust, and contempt), and facial behaviors of single Action Units (AUs) and AU combinations based on the Facial Action Coding System (FACS; Ekman et al., 2002). FIND also contains information on each image, such as demographics, facial structural (shape) information obtained by fitting a facial wireframe model onto the image, cognitive judgment data, and psychophysiological data obtained in judgment studies. We call the images and all the other information mentioned above "facial information." This paper describes FIND and the efforts to establish the environment for capturing the pictures, the procedures for obtaining the pictures, an interface for database access, and the issue of protecting the personal information of the participants appearing in the images.

DOI CiNii J-GLOBAL
表情筋モデルを用いた表情の再現と個人表情の表現

石橋康, 久保尋之, 柳澤博昭, 前島謙宣, 森島繁生

Visual Computing グラフィクスとCAD合同シンポジウム予稿集 2007 263 - 268 2007年

DOI J-GLOBAL
標準モデルによる車体形状表現および車体形状合成GUIの構築

早川達順, 関根佑介, 前島謙宣, 森島繁生

Visual Computing グラフィクスとCAD合同シンポジウム予稿集 2007 285 - 289 2007年

DOI J-GLOBAL
モーションキャプチャシステムを用いた頭髪アニメーション手法の提案

杉崎英嗣, 風間祥介, 石川貴仁, 白石允梓, 西村昌平, 森島繁生

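画像電子学会誌 36 ( 4 ) 398 - 406 2007年

Representing human motion and facial expressions in computer graphics with data acquired by a motion capture system is a highly effective data-driven approach, and applied research using such data is presented every year at conferences such as SIGGRAPH. The technology is also widely used in actual film production, and supporting software has been developed. This paper proposes a method for simulating hair motion with a motion capture system. Human hair consists of roughly 100,000 strands, with individual differences, so attaching motion capture markers to every strand and capturing its motion is practically impossible. Our method is instead based on the layer model that hairdressers, the professionals of hair design, use when designing hair: the hair tufts to be measured are placed at the end points of each layer. These tufts are captured as the principal axes of hair motion, and we demonstrate that hair motion animation can be created by interpolating the whole head of hair from them.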

DOI CiNii J-GLOBAL
ユーザ参加型エンタテインメント『ダイブイントゥザムービー』

森島繁生

情報処理学会研究報告 2007 ( 84(CG-128) ) 29 - 34 2007年

J-GLOBAL
話者類似度推定のための音響特徴量

足立吉広, 川本真一, 森島繁生, 中村哲

日本音響学会研究発表会講演論文集(CD-ROM) 2007 3-4-18 2007年

J-GLOBAL
顔情報デジタルアーカイブ構築における個人情報取り扱いの問題

吉田宏之, 鈴木竜太, 渡邊伸行, 續木大介, 番場あやの, CHANDRASIRI Naiwala P., 時田学, 和田万紀, 森島繁生, 山田寛

日本顔学会誌 7 ( 1 ) 147 - 159 2007年

J-GLOBAL
平均顔を用いた実験用日本人表情刺激作成の試み

番場あやの, 吉田宏之, 鈴木竜太, 渡邊伸行, 續木大介, CHANDRASIRI Naiwala P., 小泉憲生, 時田学, 和田万紀, 森島繁生, 山田寛

日本顔学会誌 7 ( 1 ) 248 2007年

J-GLOBAL
情報技術が支えるアートとコンテンツの世界-Art with Science, Science with Art-5 効率的アニメ制作支援のための3次元CG技術

森島繁生, 安生健一, 中村哲

情報処理 48 ( 12 ) 1351 - 1358 2007年

J-GLOBAL
動画の3次元周波数成分を用いた顔認証システム

山名信弘, 井辺昭人, 三浦文裕, 前島謙宣, 森島繁生

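電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 106 ( 73 ) 13 - 18 2006年05月

This study proposes a face authentication system whose features are the 3D frequency components obtained by applying a 3D Fourier transform to video of a face whose expression changes. We show that using a face with changing expression helps prevent the spoofing that is a concern for face authentication based on still images. A characteristic of the system is that, because frequency components are used as features, authentication is performed without extracting the face region. The frequency terms to use are selected by analysis of variance. Authentication follows a watch-list scheme and combines two methods in series: multiple discriminant analysis and Mahalanobis generalized distance measurement.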
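
The feature extraction described here can be pictured with a naive 3D discrete Fourier transform over a tiny video. This is a sketch for illustration only: the function names, the use of magnitudes as features, and the toy video are assumptions, and a real system would use an FFT library on full-size frames.

```python
import cmath


def dft3_coefficient(video, kt, ky, kx):
    """Naive 3D DFT coefficient of a video indexed as video[t][y][x]."""
    n_t, n_y, n_x = len(video), len(video[0]), len(video[0][0])
    total = 0j
    for t in range(n_t):
        for y in range(n_y):
            for x in range(n_x):
                angle = 2 * cmath.pi * (kt * t / n_t + ky * y / n_y + kx * x / n_x)
                total += video[t][y][x] * cmath.exp(-1j * angle)
    return total


def frequency_features(video, terms):
    """Magnitudes of the selected (kt, ky, kx) terms, used as a feature vector."""
    return [abs(dft3_coefficient(video, *term)) for term in terms]


# Hypothetical 2-frame, 2x2 video whose brightness flips between frames.
video = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
features = frequency_features(video, [(0, 0, 0), (1, 0, 0), (0, 1, 0)])
print(features)
```

The temporal term `(1, 0, 0)` is non-zero only because the frames differ, which is the kind of information a still photograph cannot supply.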

CiNii J-GLOBAL
動画の3次元周波数成分を用いた顔認証システム

山名信弘, 井辺昭人, 三浦文裕, 前島謙宣, 森島繁生

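電子情報通信学会技術研究報告. MI, 医用画像 106 ( 75 ) 13 - 18 2006年05月

This study proposes a face authentication system whose features are the 3D frequency components obtained by applying a 3D Fourier transform to video of a face whose expression changes. We show that using a face with changing expression helps prevent the spoofing that is a concern for face authentication based on still images. A characteristic of the system is that, because frequency components are used as features, authentication is performed without extracting the face region. The frequency terms to use are selected by analysis of variance. Authentication follows a watch-list scheme and combines two methods in series: multiple discriminant analysis and Mahalanobis generalized distance measurement.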

CiNii
表情筋変形パラメータの推定による表情合成

久保尋之, 柳澤博昭, 前島謙宣, 森島繁生

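電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 105 ( 682 ) 31 - 36 2006年03月

This paper proposes a method that synthesizes realistic facial expressions from a small number of control points, based on the deformation of the face shape caused by contraction of the facial muscles. We modeled a face shape measured with a 3D scanner and synthesized expressions based on facial muscle deformation. We also measured the motion of the facial surface during expression with motion capture. Using the coordinates of the motion capture markers placed on the facial surface at a given instant as control points, we compared the control points with the synthesized expression, estimated the facial muscle deformation parameters, and synthesized the expression. Because the deformation of the face shape is uniquely determined by the facial muscle deformation parameters, the method can also represent facial surface vertices whose 3D coordinates are not measured by motion capture, following the muscle-based deformation rules. The method makes it possible to synthesize realistic expressions based on facial muscle deformation.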

CiNii J-GLOBAL
インタラクティブな声質変換システムの構築

井上隆大, 足立吉広, 森島繁生

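電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 105 ( 682 ) 37 - 42 2006年03月

This paper proposes a system that controls the voice quality of natural speech. First, a full set of the syllables or vowels needed for Japanese speech is recorded from the target speaker, converted to spectrograms, and stored. The input speech whose voice quality is to be converted is then converted to a spectrogram, and each syllable or vowel is replaced with the target speaker's spectrogram before resynthesis, producing speech with converted voice quality. Speech synthesized by syllable replacement and by vowel replacement both preserve the intonation of the input speech. A subjective evaluation showed that the speaker individuality of the output is nearly the same for the two methods. Voice conversion by replacing vowels alone is therefore expected to enable conversion with a relatively small database.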

CiNii J-GLOBAL
リアルな頭部動作のモデリング

関根孝雄, 足立吉広, 森島繁生

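電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 105 ( 683 ) 13 - 18 2006年03月

Aiming to generate natural head motion for CG human synthesis, this study focuses on nodding, one of the most frequent head movements, and proposes a system that generates natural neck motion during a nod. When a person nods, the posture is formed by rotation of the whole spine, not only the cervical vertebrae. One way to create neck motion is to compute the joint angles of the individual cervical vertebrae by inverse kinematics (IK), but because IK keeps the distances between joints constant, a model of only the region above the shoulders cannot fully express the head motion. We therefore generate natural nodding by combining rotation and extension/contraction at each joint of the neck.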

CiNii J-GLOBAL
B-18-2 顔動画像の3次元周波数成分を用いた顔認証システムの研究(B-18.バイオメトリクス・セキュリティ,一般講演)

山名信弘, 井辺昭人, 三浦文裕, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2006 ( 2 ) 536 - 536 2006年03月

CiNii
A-15-3 リアルな頭部動作のモデリング(A-15.ヒューマン情報処理,一般講演)

関根孝雄, 足立吉広, 森島繁生

電子情報通信学会総合大会講演論文集 2006 239 - 239 2006年03月

CiNii
D-11-80 アニメーションのための影編集ツールの開発(D-11.画像工学D(画像処理・計測),一般講演)

中嶋英仁, 杉崎英嗣, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2006 ( 2 ) 80 - 80 2006年03月

CiNii J-GLOBAL
D-11-81 アニメ映像からの頭髪運動の構築(D-11.画像工学D(画像処理・計測),一般講演)

風間祥介, 杉崎英嗣, 森島繁生

電子情報通信学会総合大会講演論文集 2006 ( 2 ) 81 - 81 2006年03月

CiNii J-GLOBAL
D-11-114 車体形状の定量表現によるカーデザインツールの構築(D-11.画像工学D(画像処理・計測),一般講演)

関根佑介, 前島謙宣, 杉崎英嗣, 森島繁生

電子情報通信学会総合大会講演論文集 2006 ( 2 ) 114 - 114 2006年03月

CiNii J-GLOBAL
リファレンス音声に基づく韻律・声質・話者変換システム

足立吉広, 森島繁生

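電子情報通信学会技術研究報告. SP, 音声 105 ( 571 ) 37 - 42 2006年01月

This paper describes an interactive prosody, voice quality, and speaker conversion system for arbitrary natural speech. Given previously recorded speech, the system takes a model utterance spoken by another speaker (the reference speech) and controls the prosody of the recorded speech based on the speaking rate, fundamental frequency contour, and loudness contour extracted from the reference. Assuming that voice quality depends largely on the characteristics of vowels, voice quality is converted by transforming the vowel spectra. Assuming further that speaker identity consists of prosody and voice quality, speaker conversion is realized by controlling these parameters and the spectrogram information.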

CiNii J-GLOBAL
フューチャーキャストシステムの舞台裏と今後の展開

森島繁生

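電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 105 ( 542 ) 39 - 44 2006年01月

This paper describes the whole of the Future Cast System, an entirely new entertainment system realized ahead of the rest of the world at the Mitsui-Toshiba Pavilion of Expo 2005 Aichi, together with the process leading to its realization and the vision for the future. In this system, all audience members (240 people) appear on screen as characters in the movie, and characters that closely resemble them speak lines and express emotions as the story unfolds. Everything from 3D measurement of the face shape, through generation of a personal wireframe, to real-time rendering of the footage runs fully automatically. A system evaluation with visitors in September found that on average 93.4% of people actually appeared in the footage.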

CiNii
動画像の時空間周波数を用いた顔認識システム

森島繁生

「画像の認識・理解シンポジウム(MIRU2006)」ダイジェスト冊子 ISI-40 388 2006年

Subjective evaluation of a synthetic talking face in an acoustically noisy environment

A Maejima, T Yotsukura, S Morishima, S Nakamura

ELECTRONICS AND COMMUNICATIONS IN JAPAN PART III-FUNDAMENTAL ELECTRONIC SCIENCE 89 ( 5 ) 39 - 52 2006年

The realization of an anthropomorphic agent that looks like a real human is an important research topic for broadening the range of human-to-human communication through the use of a computer. We have proposed a technique for synthesizing natural talking-face animation that permits such communication. How to evaluate the performance of talking-face animation, however, has remained an outstanding issue. The performance of talking-face animation is determined by three parameters: (1) Does it reproduce human talking to an extent that permits lipreading? (2) Does it appear visually natural? (3) Is it accurately synchronized with the voice? In this paper, we first presented talking-face animation along with the voice to subjects and conducted experiments on how well the subjects heard the contents of the spoken words, to examine Parameter (1). In the next step, with regard to Parameter (2), the visual naturalness of the talking-face animation and the smoothness of the motion of the talking mouth were evaluated on a 5-point scale. Lastly, with regard to Parameter (3), talking-face animation in which the synchronization of the animation with sound was off by a fixed interval was shown to subjects to investigate the subjective perception of the synchronization gap, and the extent of the resulting strange feeling was evaluated on a 5-point scale. In addition, the effect of the synchronization gap between voice and talking-face animation on the manner in which the spoken words are understood was also evaluated. Through these evaluation experiments, the quality of the synthetic talking-face animation proposed by the authors was evaluated, and we studied naturally appearing synchronization between synthetic talking-face animation and voice. (c) 2006 Wiley Periodicals, Inc.

DOI
コンテンツ制作の高能率化のための要素技術研究

森島繁生

戦略的創造研究推進事業研究年報(CD-ROM) 2006 DEJITARUMEDIASAKUHIN,MORISHIMA 2006年

J-GLOBAL
愛・地球博「三井・東芝館」における新しいエンタテインメントへの挑戦--来場者の顔を3Dセンシングして瞬時に映画の登場人物を生成--

森島繁生, 前島謙宣

画像センシングシンポジウム予稿集(CD-ROM) 12th E-2 2006年

J-GLOBAL
フューチャーキャストシステムの舞台裏と今後の展開

森島繁生

電子情報通信学会技術研究報告 105 ( 542(HCS2005 55-62) ) 39 - 44 2006年

J-GLOBAL
リファレンス音声に基づく韻律・声質・話者変換システム

足立吉広, 森島繁生

電子情報通信学会技術研究報告 105 ( 571(SP2005 139-149) ) 37 - 42 2006年

J-GLOBAL
手本音声を用いた声質変換システム

井上隆大, 足立吉広, 森島繁生

電子情報通信学会大会講演論文集 2006 144 2006年

J-GLOBAL
顔動画像の3次元周波数成分を用いた顔認証システムの研究

山名信弘, 井辺昭人, 三浦文裕, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2006 536 2006年

J-GLOBAL
アニメ映像からの頭髪運動の構築

風間祥介, 杉崎英嗣, 森島繁生

電子情報通信学会大会講演論文集 2006 81 2006年

J-GLOBAL
アニメーションのための影編集ツールの開発

中嶋英仁, 杉崎英嗣, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2006 80 2006年

J-GLOBAL
リアルな頭部動作のモデリング

関根孝雄, 足立吉広, 森島繁生

電子情報通信学会大会講演論文集 2006 239 2006年

J-GLOBAL
車体形状の定量表現によるカーデザインツールの構築

関根佑介, 前島謙宣, 杉崎英嗣, 森島繁生

電子情報通信学会大会講演論文集 2006 114 2006年

J-GLOBAL
表情筋変形パラメータの推定による表情合成

久保尋之, 柳沢博昭, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 105 ( 682(HIP2005 154-167) ) 31 - 36 2006年

J-GLOBAL
リアルな頭部動作のモデリング

関根孝雄, 足立吉広, 森島繁生

電子情報通信学会技術研究報告 105 ( 683(MVE2005 69-81) ) 13 - 18 2006年

J-GLOBAL
インタラクティブな声質変換システムの構築

井上隆大, 足立吉広, 森島繁生

電子情報通信学会技術研究報告 105 ( 682(HIP2005 154-167) ) 37 - 42 2006年

J-GLOBAL
動画の3次元周波数成分を用いた顔認証システム

山名信弘, 井辺昭人, 三浦文裕, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 106 ( 75(MI2006 20-38) ) 13 - 18 2006年

J-GLOBAL
アニメ頭髪運動の再構築手法の提案

風間祥介, 杉崎英嗣, 田中懐子, 佐藤暁子, 森島繁生

Visual Computing グラフィクスとCAD合同シンポジウム予稿集 2006 125 - 130 2006年

DOI J-GLOBAL
車体形状の定量表現によるカーデザインツールの構築

関根佑介, 前島謙宣, 杉崎英嗣, 森島繁生

Visual Computing グラフィクスとCAD合同シンポジウム予稿集 2006 147 - 152 2006年

DOI J-GLOBAL
手本音声を用いた声質変換システム

井上隆大, 足立吉広, 森島繁生

電子情報通信学会総合大会2006年 144 - 144 2006年

CiNii
顔情報データベース構築の基礎的検討(3) : 表情画像の認知的評価とデータベースの信頼性について

鈴木竜太, 吉田宏之, 渡邊伸行, 前田亜希, 番場あやの, 續木大介, 北村麻梨, 時田学, 和田万紀, 森島繁生, 山田寛

映像情報メディア学会技術報告 29 ( 64 ) 93 - 98 2005年11月

CiNii
顔情報データベース構築の基礎的検討(3) : 表情画像の認知的評価とデータベースの信頼性について

鈴木竜太, 吉田宏之, 渡邊伸行, 前田亜希, 番場あやの, 續木大介, 北村麻梨, 時田学, 和田万紀, 森島繁生, 山田寛

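電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 105 ( 385 ) 93 - 98 2005年11月

Academic interest in problems of facial information processing is growing, and we are running a project to build a facial information database that contributes to this scholarly development. The database is now accumulating, at a certain scale, expression images collected under a unified shooting environment and shooting procedure. In this report we ran expression judgment experiments on the recorded facial image data and examined how cognitive evaluation of the expression images can improve reliability. With this, the facial information database under construction in this project is entering a new phase in its development.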

CiNii
フューチャーキャストシステム三井・東芝館 (特集:最先端映像--愛・地球博)

森島繁生

画像ラボ 16 ( 9 ) 25 - 28 2005年09月

CiNii
フューチャーキャストシステム : 三井・東芝館

森島繁生

映像情報メディア学会誌 : 映像情報メディア = The journal of the Institute of Image Information and Television Engineers 59 ( 4 ) 522 - 524 2005年04月

DOI DOI2 CiNii
MRIを用いた骨格・関節のモーションキャプチャリング

西村昌平, 小島潔, 岩澤昭一郎, 森島繁生

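電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 104 ( 747 ) 19 - 24 2005年03月

In this study, we faithfully reproduced a person's skeletal structure by measuring the inside of the subject's body with MRI. We then realized human body motion using the 3D displacements of markers obtained with a motion capture system together with the skeletal information obtained by MRI. A human body model is created from the skeletal information acquired by MRI, and the markers used during motion capture are virtually placed on the skin of the body model in one-to-one correspondence. With this method, joint positions that were previously determined from the motion capture marker positions by statistical rules on skeletal structure can be corrected to the subject's own joint positions, achieving faithful skeletal and joint motion. We also examined how the positional relationship between skin and bone changes during motion.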

CiNii
特徴点の3次元情報を利用した顔認証システムの構築

井辺昭人, 佐藤康之, 前島謙宣, 森島繁生

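電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 104 ( 748 ) 25 - 30 2005年03月

This paper proposes a face authentication system that uses 3D information at feature points. Using 3D data obtained from a range scanner for personal authentication improves accuracy, but using 3D data of the whole face makes the data volume and computation enormous. We therefore define feature points on each part of the face and use their 3D information to reduce the data volume, and we verify the advantage of 3D face data by comparing the authentication accuracy of 2D feature point data with that of 3D data in which depth is added to the 2D data. We also apply analysis of variance to the feature points and examine further data reduction that maintains accuracy by using only feature points with a large variance ratio for authentication.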
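
The variance-ratio selection mentioned at the end of this abstract can be illustrated with a one-way ANOVA F statistic: between-person variance divided by within-person variance, computed per feature. The sketch below is an assumption-laden illustration, not the paper's procedure; the function names, the feature names, and the sample measurements are hypothetical.

```python
from statistics import fmean


def variance_ratio(groups):
    """One-way ANOVA F statistic; each group holds one person's measurements."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    within = sum((x - fmean(g)) ** 2 for g in groups for x in g) / (n - k)
    return float("inf") if within == 0 else between / within


def rank_feature_points(samples_by_point):
    """Sort feature names by decreasing variance ratio."""
    return sorted(
        samples_by_point,
        key=lambda name: variance_ratio(samples_by_point[name]),
        reverse=True,
    )


# Hypothetical depth measurements: two people, three samples each.
samples_by_point = {
    "cheek_depth": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]],
    "nose_tip_depth": [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]],
}
print(rank_feature_points(samples_by_point))
```

A feature with a large ratio separates people well relative to its per-person scatter, so keeping only the top-ranked features reduces data while aiming to preserve accuracy.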

CiNii J-GLOBAL
数字発話時の唇動作に基づく顔認証システムの構築

三浦文裕, 佐藤康之, 森島繁生

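電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 104 ( 748 ) 31 - 34 2005年03月

Using video of a person's face while speaking numbers, we aimed to improve the accuracy of personal identification. First, feature points were defined at characteristic positions of the facial parts (eyes, nose, mouth, and so on) in the captured face video, and each person was described by the coordinates of those points. Because the position, tilt, and size of the subject's face differ from sample to sample, the feature point coordinates were normalized by an affine transformation. Discriminant analysis was applied to the normalized coordinates to obtain the identification result. Comparing the identification rate when only the first frame of the video is used with that when 10 frames are used, we showed the effectiveness of authentication from video.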
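
The normalization step can be sketched with a similarity transform (translation, rotation, and uniform scaling, a special case of an affine transform) that maps two reference points to fixed positions. Using the two eye positions as references, and the function name below, are assumptions made for illustration; the abstract does not say which points anchor the transform.

```python
def normalize_points(points, left_eye, right_eye):
    """Map left_eye to (0, 0) and right_eye to (1, 0), and move all points along.

    2D points are treated as complex numbers, so one complex division
    applies the rotation and the uniform scaling at once.
    """
    a = complex(*left_eye)
    b = complex(*right_eye)
    if a == b:
        raise ValueError("reference points must be distinct")
    normalized = []
    for x, y in points:
        z = (complex(x, y) - a) / (b - a)
        normalized.append((z.real, z.imag))
    return normalized


# Hypothetical feature points: two eyes and a mouth corner.
frame_a = [(2.0, 2.0), (4.0, 2.0), (3.0, 4.0)]
# The same face scaled by 2, rotated by 90 degrees, and shifted.
frame_b = [(6.0, 9.0), (6.0, 13.0), (2.0, 11.0)]
print(normalize_points(frame_a, frame_a[0], frame_a[1]))
print(normalize_points(frame_b, frame_b[0], frame_b[1]))
```

Both calls give the same normalized coordinates, so differences in face position, tilt, and size no longer affect the features passed to the discriminant analysis.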

CiNii J-GLOBAL
モーションキャプチャによる顔表情の定量表現

柳澤博昭, 祖川慎治, 前島謙宣, 四倉達夫, 森島繁生

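電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 104 ( 744 ) 7 - 12 2005年03月

This paper presents a method for synthesizing facial expressions from expression data measured with motion capture, and, by applying principal component analysis to the expression data, derives new mutually uncorrelated features and defines what kind of expression each feature represents. The expression data are obtained by recording a total of 64 expressions: the basic units of human expression based on FACS and expressions that combine those units. Recording used an optical motion capture system with 146 markers placed on the subject's facial surface to acquire detailed 3D displacements during expression changes. Applying the acquired displacements to a 3D face model fitted to a frontal image of the subject makes it possible to synthesize expressions. Furthermore, by applying principal component analysis in stages to the 3D displacements of each expression and orthogonalizing them, we propose expression-change parameters that are uncorrelated with other expressions.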

CiNii J-GLOBAL
B-18-2 特徴点の3次元情報を利用した顔認証システム(B-18. バイオメトリクス・セキュリティ, 通信2)

井辺昭人, 佐藤康之, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2005 ( 2 ) 530 - 530 2005年03月

CiNii J-GLOBAL
B-18-3 表情動画像に基づく顔認証システムの構築(B-18. バイオメトリクス・セキュリティ, 通信2)

三浦文裕, 佐藤康之, 森島繁生

電子情報通信学会総合大会講演論文集 2005 ( 2 ) 531 - 531 2005年03月

CiNii
A-15-10 感情音声と表情動画像を同時に提示した場合の印象評価(A-15. ヒューマン情報処理, 基礎・境界)

比留間庸介, 足立吉広, 森島繁生

電子情報通信学会総合大会講演論文集 2005 253 - 253 2005年03月

CiNii J-GLOBAL
A-15-12 骨格モーションキャプチャ(A-15. ヒューマン情報処理, 基礎・境界)

西村昌平, 小島潔, 岩澤昭一郎, 森島繁生

電子情報通信学会総合大会講演論文集 2005 255 - 255 2005年03月

CiNii
MRIイメージからの骨格抽出と高忠実な骨格および関節のモーションキャプチャリング

小島潔, 西村昌平, 岩澤昭一郎, 森島繁生

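情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア] 148 ( 18 ) 101 - 108 2005年03月

In this study, we faithfully reproduce the skeletal structure of the measured person from whole-body MRI images and reproduce bone motion from the 3D displacements of the markers used by an optical motion capture system. The person's skeletal and joint structure is obtained by processing the MRI images, and a standard bone model is deformed and customized based on this measurement. The initial positions of the optical markers attached to the skin surface are then specified on the human body model in virtual space, establishing a link between the virtual and real worlds; transformation matrices that determine bone positions from multiple marker coordinates are computed, and bone motion is captured from marker motion. Complex rotational deformation of the bones and sliding of the markers over the skin surface are taken into account and corrected, achieving high-fidelity motion capture of the skeleton and joints.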

CiNii J-GLOBAL
コンテンツ制作の高能率化のための要素技術研究

森島繁生

戦略的創造研究推進事業研究年報(CD-ROM) 2005 DEJITARUMEDIASAKUHIN,MORISHIMA 2005年

J-GLOBAL
雑音環境下での音声の聞き取り実験による合成発話顔アニメーションの評価

前島謙宣, 四倉達夫, 森島繁生, 中村哲

電子情報通信学会論文誌 A J88-A ( 1 ) 71 - 82 2005年

J-GLOBAL
話者のイントネーションを模倣するインタラクティブ声質変換システムの構築

足立吉広, 森島繁生

情報処理学会シンポジウム論文集 2005 ( 4 ) 261 - 268 2005年

J-GLOBAL
MRIイメージからの骨格抽出と高忠実な骨格および関節のモーションキャプチャリング

小島潔, 西村昌平, 岩沢昭一郎, 森島繁生

情報処理学会研究報告 2005 ( 18(CVIM-148) ) 101 - 108 2005年

J-GLOBAL
骨格モーションキャプチャ

西村昌平, 小島潔, 岩沢昭一郎, 森島繁生

電子情報通信学会大会講演論文集 2005 255 2005年

J-GLOBAL
特徴点の3次元情報を利用した顔認証システム

井辺昭人, 佐藤康之, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2005 530 2005年

J-GLOBAL
感情音声と表情動画像を同時に提示した場合の印象評価

比留間庸介, 足立吉広, 森島繁生

電子情報通信学会大会講演論文集 2005 253 2005年

J-GLOBAL
表情動画像に基づく顔認証システムの構築

三浦文裕, 佐藤康之, 森島繁生

電子情報通信学会大会講演論文集 2005 531 2005年

J-GLOBAL
特徴点の3次元情報を利用した顔認証システムの構築

井辺昭人, 佐藤康之, 前島謙宣, 森島繁生

電子情報通信学会技術研究報告 104 ( 748(MVE2004 69-76) ) 25 - 30 2005年

J-GLOBAL
数字発話時の唇動作に基づく顔認証システムの構築

三浦文裕, 佐藤康之, 森島繁生

電子情報通信学会技術研究報告 104 ( 748(MVE2004 69-76) ) 31 - 34 2005年

J-GLOBAL
モーションキャプチャによる顔表情の定量表現

柳沢博昭, 祖川慎治, 前島謙宣, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告 104 ( 744(HCS2004 49-59) ) 7 - 12 2005年

J-GLOBAL
MRIを用いた骨格・関節のモーションキャプチャリング

西村昌平, 小島潔, 岩沢昭一郎, 森島繁生

電子情報通信学会技術研究報告 104 ( 747(HIP2004 100-112) ) 19 - 24 2005年

J-GLOBAL
「愛・地球博」における最新映像技術 6.フューチャーキャストシステム『三井・東芝館』

森島繁生

映像情報メディア学会誌 59 ( 4 ) 522 - 524 2005年

DOI CiNii J-GLOBAL
顔の3次元構造に着目した特徴点情報による認証システムの研究

井辺昭人, 佐藤康之, 前島謙宣, 森島繁生

画像センシングシンポジウム講演論文集 11th 353 - 356 2005年

J-GLOBAL
Future Cast System:三井・東芝館

前島謙宣, 森島繁生

3D映像 19 ( 2 ) 45 - 48 2005年

J-GLOBAL
顔形状に基づく特徴点の3次元情報を利用した個人認証システムの提案

井辺昭人, 佐藤康之, 前島謙宣, 森島繁生

情報処理学会シンポジウム論文集 2005 IS2-63 2005年

J-GLOBAL
フューチャーキャストシステム三井・東芝館

森島繁生

画像ラボ 16 ( 9 ) 25 - 28 2005年

J-GLOBAL
顔情報データベース構築の基礎的検討(3)-表情画像の認知的評価とデータベースの信頼性について-

鈴木竜太, 吉田宏之, 渡辺伸行, 前田亜希, 番場あやの, 続木大介, 北村麻梨, 時田学, 和田万紀, 森島繁生, 山田寛

電子情報通信学会技術研究報告 105 ( 385(HCS2005 38-54) ) 93 - 98 2005年

J-GLOBAL
雑音環境下での音声の聞き取り実験による合成発話顔アニメーションの評価

前島謙宣, 四倉達夫, 森島繁生, 中村哲

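電子情報通信学会論文誌. A, 基礎・境界 88 ( 1 ) 71 - 82 2005年01月

Realizing an anthropomorphic agent with a human-like appearance is an important research topic for broadening the range of human-to-human communication through computers. The authors have proposed a method for synthesizing natural talking-face animation that enables such communication, but how to evaluate the performance of talking-face animation has remained an open issue. Performance is determined by three points: (1) whether speech is reproduced well enough to permit lipreading, (2) whether the animation is visually natural, and (3) whether it is accurately synchronized with the voice. In this paper, we first verify (1) by presenting talking-face animation and voice to subjects in a noisy environment and running a listening experiment on the spoken content. Next, for (2), the visual naturalness of the animation and the smoothness of the mouth shapes are rated on a five-point scale. Finally, for (3), we present subjects with stimuli in which the synchronization between voice and animation is shifted by fixed intervals, investigate the subjective perception of the shift, and rate the degree of unnaturalness on a five-point scale. We also evaluate the effect of the synchronization shift on speech perception. Through these experiments, we evaluated the quality of the proposed synthetic talking-face animation and examined natural synchronization between the animation and the voice.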

CiNii J-GLOBAL
感情音声と表情画像を同時に提示した場合のマルチモーダル印象の評価

比留間庸介, 足立吉広, 森島繁生

言語・音声理解と対話処理研究会 42 7 - 12 2004年11月

CiNii
感情音声と表情画像を同時に提示した場合のマルチモーダル印象の評価

比留間庸介, 足立吉広, 森島繁生

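電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 104 ( 445 ) 7 - 12 2004年11月

When emotions are exchanged in human-to-human communication, the emotion heard in the voice and the emotion read from the facial expression are conveyed to the other party consistently, without contradiction. With anthropomorphic agents, the immaturity of emotion expression technology means that highly realistic, emotionally rich expression synthesis and speech synthesis are not always achieved, so the impression received often feels unnatural. In this paper, we evaluate how the impression a person receives changes when the emotion in the speech contradicts the emotion in the facial expression video, in an attempt to clarify which emotions are strongly influenced by the voice and which by the image. We first ran a listening experiment on emotion from voice alone, using natural and synthetic speech. Next, only the emotional video recorded on camera was presented silently and evaluated. Finally, video and audio were presented together and evaluated; the speaking rate of the audio was controlled to synchronize with the video, and we evaluated how the impression changes for combinations of different emotions.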

CiNii J-GLOBAL
モーションキャプチャシステムを用いたマルチモーダル音声コーパスの構築

四倉達夫, 森島繁生, 中村哲

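情報処理学会研究報告. HI,ヒューマンインタフェース研究会報告 110 19 - 24 2004年09月

This paper describes how to build a multimodal speech corpus that contains speech, facial images, and the positions and displacements of the facial parts during speech, and how to process the data. The text to be spoken was the ATR Japanese balanced sentences, and the utterances of one female speaker form the corpus. Displacements were measured with an optical motion capture system; by placing many markers on the speaker's face, we recorded detailed, high-precision 3D data on facial positions that cannot be obtained from facial image information alone. To compute the pure motion of the facial parts, we also propose a method that uses an affine transformation to remove head motion and obtain information on facial positions only. In addition, so that the measured displacements can be easily reproduced as speech animation on a computer, we describe a method for assigning the motion of the facial parts to a face object composed of a mesh.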

CiNii
顔情報データベース構築の基礎的検討(2) : 撮影環境と検索インターフェイスについて

吉田宏之, 鈴木竜太, 渡邊伸行, 山口拓人, 小川宜子, 北村麻梨, 前田亜希, 續木大介, 時田学, 和田万紀, 森島繁生, 山田寛

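電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 104 ( 198 ) 13 - 16 2004年07月

The facial information database collects the facial information indispensable to face research and provides it for a variety of research uses. The facial information to be stored consists of still images of expressions produced according to FACS, structural information on the face extracted from those still images, motion information during expression, and so on. This report describes the environmental conditions under which the registered data are captured, as well as the database server and its interface. Because facial information is highly privacy-sensitive, we also report on our efforts toward personal information protection, including the release methods we have examined so far.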

CiNii J-GLOBAL
コミュニケーションギャップを埋める顔画像処理技術 (特集:知的画像処理への最先端研究動向)

森島繁生

光技術コンタクト 42 ( 4 ) 190 - 197 2004年04月

CiNii J-GLOBAL
頭髪運動のリアルタイムアニメーションツールの開発

杉森大輔, 森島繁生

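電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 103 ( 745 ) 37 - 42 2004年03月

This study proposes several techniques for synthesizing hair motion animation in real time: re-subdivision of the segments that model the hair, and voxel-based modeling of wind and collisions. These techniques cannot produce super-realistic animation, but they greatly reduce processing time. With them, we achieved animation generation at up to 10 frames/sec on an ordinary PC.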

CiNii J-GLOBAL
ポリゴン細分割とテクスチャブレンディングによるリアルな表情合成

岩上圭吾, 大橋俊介, 森島繁生

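電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 103 ( 742 ) 13 - 18 2004年03月

In recent years, research on CG facial expression synthesis has attracted attention as a way to realize communication through anthropomorphic agents in virtual space. Expression synthesis by texture mapping has long been studied, but it had the problem that wrinkles were not synthesized cleanly. This paper proposes a method that synthesizes wrinkles cleanly by creating a wireframe with finer polygons for wrinkle representation and aligning the wrinkles precisely when fitting the wireframe to the image. With this, arbitrary expressions can be represented by blending multiple basic face textures, and the conventional problem of wrinkles fading is solved. Furthermore, by subdividing the polygons to increase the number of wireframe vertices and reflecting range data, a head model more faithful to the original image can be created.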

CiNii J-GLOBAL
音声の韻律情報の変換によるイントネーション変換システム

中野篤志, 足立吉広, 森島繁生

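電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 103 ( 742 ) 53 - 58 2004年03月

We propose a voice conversion method for arbitrary natural or synthetic speech, aimed at adding emotion, emphasis, dialect, and the like to speech. Controlling intonation by controlling the prosodic information of speech has long been studied, but because part of the processing was done at the waveform level, sound quality degradation after conversion was noticeable. We therefore abandoned waveform-level processing and built an intonation conversion system that changes the prosodic information on the spectrogram, preventing the noticeable degradation. The system can convert the intonation of natural speech by applying the analyzed speaking rate, fundamental frequency contour, and power contour of a model reference utterance to speech with the same content.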

CiNii
雑音環境下における合成発話アニメーションの評価

前島謙宣, 四倉達夫, 森島繁生, 中村哲

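電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 103 ( 742 ) 59 - 64 2004年03月

The authors have already proposed a method for synthesizing natural speech animation, but its evaluation relied largely on subjective experiments. This paper proposes a new evaluation method for speech animation that includes objective measures. In this method, the performance of speech animation is evaluated on three factors: whether lipreading is possible, whether the animation is visually natural, and whether it is accurately synchronized with the voice. Lipreading is judged by an experiment in which face animation and voice are presented to subjects in a noisy environment and the proportion of spoken words heard correctly is measured. Next, the visual naturalness of the animation and the smoothness of mouth shape changes are rated on a five-point MOS scale. For natural synchronization with the voice, stimuli in which the synchronization between voice and animation is shifted by fixed intervals are presented to subjects to investigate the subjectively perceived shift, and the degree of unnaturalness is rated on a five-point scale. We also evaluate the effect of the synchronization shift on speech perception. In this way, we evaluated the quality of the synthesized speech animation and examined natural synchronization with the voice.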

CiNii J-GLOBAL
MRIとモーションキャプチャシステムを用いた精度の良い骨格の動き推定

田代和己, 小島潔, 岩澤昭一郎, 森島繁生

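電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 103 ( 743 ) 47 - 52 2004年03月

In computer graphics, to depict people more realistically, a real person is often motion-captured and the motion is projected onto a character. This study describes a method that reproduces skeletal motion in order to reproduce a person's movement more faithfully and accurately from that motion data. Focusing here on the arm, we extract the skeleton from in-body data acquired by MRI and generate a skeleton model. The model is controlled from the skin motion of each body part obtained with an optical motion capture system. Sliding between skin and skeleton is also taken into account, achieving highly accurate skeletal control. This enables human motion that is more faithful and realistic than the skeleton model reproducible from ordinary motion capture data.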

CiNii
房単位で編集可能なヘアスタイルデザインシステム

加藤絢子, 古川利博, 森島繁生

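電子情報通信学会技術研究報告. CAS, 回路とシステム 103 ( 717 ) 87 - 91 2004年03月

Hair plays an important role in determining a person's image. More and more people express themselves through their hairstyle by changing hair length, color, and so on, and much research on hair has been done in CG human representation as well. This study describes a CG hairstyle design system. An existing hairstyle design system represents hair with B-spline curves and the head with a NURBS surface. That system has few regions and allows hair to be edited only region by region, which limits the hairstyles it can represent. We added functions to the system so that a wider variety of hairstyles can be represented: the head is divided into regions using parameter transformation to allow fine editing of the hair, and a cutting function is added by splitting the B-spline curves.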

CiNii J-GLOBAL
房単位で編集可能なヘアスタイルデザインシステム

加藤絢子, 古川利博, 森島繁生

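電子情報通信学会技術研究報告. DSP, ディジタル信号処理 103 ( 719 ) 87 - 91 2004年03月

Hair plays an important role in determining a person's image. More and more people express themselves through their hairstyle by changing hair length, color, and so on, and much research on hair has been done in CG human representation as well. This study describes a CG hairstyle design system. An existing hairstyle design system represents hair with B-spline curves and the head with a NURBS surface. That system has few regions and allows hair to be edited only region by region, which limits the hairstyles it can represent. We added functions to the system so that a wider variety of hairstyles can be represented: the head is divided into regions using parameter transformation to allow fine editing of the hair, and a cutting function is added by splitting the B-spline curves.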

CiNii
房単位で編集可能なヘアスタイルデザインシステム

加藤絢子, 古川利博, 森島繁生

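電子情報通信学会技術研究報告. CS, 通信方式 103 ( 721 ) 87 - 91 2004年03月

Hair plays an important role in determining a person's image. More and more people express themselves through their hairstyle by changing hair length, color, and so on, and much research on hair has been done in CG human representation as well. This study describes a CG hairstyle design system. An existing hairstyle design system represents hair with B-spline curves and the head with a NURBS surface. That system has few regions and allows hair to be edited only region by region, which limits the hairstyles it can represent. We added functions to the system so that a wider variety of hairstyles can be represented: the head is divided into regions using parameter transformation to allow fine editing of the hair, and a cutting function is added by splitting the B-spline curves.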

CiNii
D-8-11 擬人化音声対話システム構築のための顔モデル生成ツールの開発(D-8. 人工知能と知識処理)

林壮, 祖川慎治, 四倉達夫, 森島繁生

電子情報通信学会総合大会講演論文集 2004 ( 1 ) 98 - 98 2004年03月

CiNii J-GLOBAL
D-14-5 発話速度、ピッチ、パワー制御による自然音声のイントネーション変換(D-14. 音声・聴覚)

中野篤志, 足立吉広, 前島謙宣, 森島繁生

電子情報通信学会総合大会講演論文集 2004 ( 1 ) 146 - 146 2004年03月

CiNii J-GLOBAL
D-11-141 制御点の増減による頭髪アニメーション合成のリアルタイム処理(D-11.画像工学D)

土田洋平, 杉森大輔, 森島繁生

電子情報通信学会総合大会講演論文集 2004 ( 2 ) 141 - 141 2004年03月

CiNii
D-11-142 基本動作の合成による多様な人物動作の作成(D-11.画像工学D)

吉田憲彦, 仲野陽介, 小島潔, 森島繁生

電子情報通信学会総合大会講演論文集 2004 ( 2 ) 142 - 142 2004年03月

CiNii J-GLOBAL
D-11-143 前腕部の骨格動作推定(D-11.画像工学D)

田代和己, 小島潔, 岩澤昭一郎, 森島繁生

電子情報通信学会総合大会講演論文集 2004 ( 2 ) 143 - 143 2004年03月

CiNii
D-12-52 奥行き情報を用いた個人認証における顔の最適な角度の検討(D-12.パターン認識・メディア理解A)

高橋悠, 森島繁生

電子情報通信学会総合大会講演論文集 2004 ( 2 ) 218 - 218 2004年03月

CiNii J-GLOBAL
D-12-96 3次元レンジセンサを利用した表情合成における汎用性の実現(D-12. パターン認識・メディア理解B)

山口智之, 大橋俊介, 祖川慎治, 森島繁生

電子情報通信学会総合大会講演論文集 2004 ( 2 ) 262 - 262 2004年03月

CiNii
D-12-98 テクスチャと細分割を利用した皺のある表情合成の構築(D-12. パターン認識・メディア理解B)

岩上圭吾, 大橋俊介, 森島繁生

電子情報通信学会総合大会講演論文集 2004 ( 2 ) 264 - 264 2004年03月

CiNii J-GLOBAL
Activities of Interactive Speech Technology Consortium (ISTC) targeting open software development for MMI systems

T Nitta, S Sagayama, Y Yamashita, T Kawahara, S Morishima, S Nakamura, A Yamada, K Ito, M Kai, A Li, M Mimura, K Hirose, T Kobayashi, K Tokuda, N Minematsu, Y Den, T Utsuro, T Yotsukura, H Shimodaira, M Araki, T Nishimoto, N Kawaguchi, H Banno, K Katsurada

RO-MAN 2004: 13TH IEEE INTERNATIONAL WORKSHOP ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, PROCEEDINGS 165 - 170 2004年

The Interactive Speech Technology Consortium (ISTC), established in November 2003 after three years of activity in the Galatea project supported by the Information-technology Promotion Agency (IPA) of Japan, aims to support open-source free software development of Multi-Modal Interaction (MMI) for human-like agents. The software, named Galatea-toolkit and developed by 24 researchers from 16 research institutes in Japan, includes a Japanese speech recognition engine, a Japanese speech synthesis engine, and a facial image synthesis engine for developing an anthropomorphic agent, as well as a dialogue manager that can integrate multiple modalities, interpret them, and decide on an action while differentiating it into the multiple media of voice and facial expression. ISTC provides members with a one-day technical seminar and a one-week training course for mastering Galatea-toolkit, as well as a software set (CD-ROM) every year.

Face and gesture capturing and cloning for life-like agent

S Morishima

RO-MAN 2004: 13TH IEEE INTERNATIONAL WORKSHOP ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, PROCEEDINGS 171 - 176 2004年

Face and gesture cloning is essential for making a life-like agent more believable and for giving it the personality and character of a target person. Cloning requires accurate face capture and motion capture to collect corpus data on facial expressions, speaking scenes, and gestures. This paper presents our recent approach to capturing the personal features of face and gesture.
For face capture, the location and angles of the face are estimated from a video sequence using a personal 3D face model, and synthetic face model data is then superimposed on the frames to realize an automatic stand-in system or a multimodal translation system.
A stand-in is a common technique for movies and TV programs in foreign languages. The current stand-in, which substitutes only the voice channel, results in awkward matching with the mouth motion. Videophones with automatic voice translation are expected to be widely used in the near future and may face the same problem without lip-synchronized translation of the speaking face image. In this paper, we introduce a method that tracks the motion of the face in the video image and then replaces the face, or only the mouth, with a synthesized one synchronized with synthetic or spoken voice. This is a key technology not only for speaking-image translation and communication systems but also for interactive entertainment systems. An interactive movie system is also introduced as an entertainment application.
Capturing and copying facial expressions under a physics-based facial muscle constraint has already been presented [6], so that part is not described in this paper.
For gesture capture, commercially available motion capture products give fairly precise movements of human body segments but do not measure enough information to define the skeletal posture in its entirety. This paper describes how to obtain the complete posture of the skeletal structure with the help of marker locations relative to the bones, derived from MRI data sets.
コンテンツ制作の高能率化のための要素技術研究

森島繁生

戦略的創造研究推進事業研究年報(CD-ROM) 2004 DEJITARUMEDIASAKUHIN,MORISHIMA 2004年

J-GLOBAL
前腕部の骨格動作推定

田代和己, 小島潔, 岩沢昭一郎, 森島繁生

電子情報通信学会大会講演論文集 2004 143 2004年

J-GLOBAL
制御点の増減による頭髪アニメーション合成のリアルタイム処理

土田洋平, 杉森大輔, 森島繁生

電子情報通信学会大会講演論文集 2004 141 2004年

J-GLOBAL
3次元レンジセンサを利用した表情合成における汎用性の実現

山口智之, 大橋俊介, 祖川慎治, 森島繁生

電子情報通信学会大会講演論文集 2004 262 2004年

J-GLOBAL
基本動作の合成による多様な人物動作の作成

吉田憲彦, 仲野陽介, 小島潔, 森島繁生

電子情報通信学会大会講演論文集 2004 142 2004年

J-GLOBAL
奥行き情報を用いた個人認証における顔の最適な角度の検討

高橋悠, 森島繁生

電子情報通信学会大会講演論文集 2004 218 2004年

J-GLOBAL
発話速度,ピッチ,パワー制御による自然音声のイントネーション変換

中野篤志, 足立吉広, 前島謙宣, 森島繁生

電子情報通信学会大会講演論文集 2004 146 2004年

J-GLOBAL
擬人化音声対話システム構築のための顔モデル生成ツールの開発

林壮, 祖川慎治, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 2004 98 2004年

J-GLOBAL
テクスチャと細分割を利用したしわのある表情合成の構築

岩上圭吾, 大橋俊介, 森島繁生

電子情報通信学会大会講演論文集 2004 264 2004年

J-GLOBAL
房単位で編集可能なヘアスタイルデザインシステム

加藤絢子, 古川利博, 森島繁生

電子情報通信学会技術研究報告 103 ( 721(CS2003 178-197) ) 87 - 91 2004年

J-GLOBAL
頭髪運動のリアルタイムアニメーションツールの開発

杉森大輔, 森島繁生

電子情報通信学会技術研究報告 103 ( 745(MVE2003 121-131) ) 37 - 42 2004年

J-GLOBAL
雑音環境下における合成発話アニメーションの評価

前島謙宣, 四倉達夫, 森島繁生, 中村哲

電子情報通信学会技術研究報告 103 ( 742(HCS2003 56-70) ) 59 - 64 2004年

J-GLOBAL
MRIとモーションキャプチャシステムを用いた精度の良い骨格の動き推定

田代和己, 小島潔, 岩沢昭一郎, 森島繁生

電子情報通信学会技術研究報告 103 ( 743(HIP2003 126-136) ) 47 - 52 2004年

J-GLOBAL
ポリゴン細分割とテクスチャブレンディングによるリアルな表情合成

岩上圭吾, 大橋俊介, 森島繁生

電子情報通信学会技術研究報告 103 ( 742(HCS2003 56-70) ) 13 - 18 2004年

J-GLOBAL
音声の韻律情報の変換によるイントネーション変換システム

中野篤志, 足立吉広, 森島繁生

電子情報通信学会技術研究報告 103 ( 742(HCS2003 56-70) ) 53 - 58 2004年

J-GLOBAL
知的画像処理への最先端研究動向コミュニケーションギャップを埋める顔画像処理技術

森島繁生

光技術コンタクト 42 ( 4 ) 190 - 197 2004年

J-GLOBAL
顔情報データベース構築の基礎的検討(2)-撮影環境と検索インターフェイスについて-

吉田宏之, 鈴木竜太, 渡辺伸行, 山口拓人, 続木大介, 時田学, 和田万紀, 森島繁生, 山田寛

電子情報通信学会技術研究報告 104 ( 198(HCS2004 10-17) ) 13 - 16 2004年

J-GLOBAL
モーションキャプチャシステムを用いたマルチモーダル音声コーパスの構築

四倉達夫, 森島繁生, 中村哲

情報処理学会研究報告 2004 ( 90(HI-110) ) 19 - 24 2004年

J-GLOBAL
感情音声と表情画像を同時に提示した場合のマルチモーダル印象の評価

比留間庸介, 足立吉広, 森島繁生

電子情報通信学会技術研究報告 104 ( 445(HCS2004 22-29) ) 7 - 12 2004年

J-GLOBAL
擬人化音声対話エージェント基本ソフトウェアの開発プロジェクト報告

嵯峨山茂樹, 伊藤克亘, 宇津呂武仁, 甲斐充彦, 小林隆夫, 下平博, 伝康晴, 徳田恵一, 中村哲, 西本卓也, 新田恒雄, 広瀬啓吉, 峯松信明, 森島繁生, 山下洋一, 山田篤, 李晃伸

情報処理学会研究報告. SLP, 音声言語情報処理 49 ( 124 ) 319 - 324 2003年12月

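This report describes the development project for "Galatea", a toolkit for anthropomorphic spoken dialogue agents. Galatea's main functions are speech recognition, speech synthesis, and facial image synthesis, which are integrated and operated under dialogue control. Because the toolkit is intended as a research platform, customizability was emphasized. As a result, the face image can be replaced easily, the speech synthesizer supports speaker adaptation, the dialogue control description is easy to modify, each functional module can readily be swapped for another, and the system copes flexibly with the number of processing machines available. The toolkit can be downloaded and is licensed to the public free of charge.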

CiNii
擬人化音声対話エージェント基本ソフトウェアの開発プロジェクト報告

嵯峨山茂樹, 伊藤克亘, 宇津呂武仁, 甲斐充彦, 小林隆夫, 下平博, 伝康晴, 徳田恵一, 中村哲, 西本卓也, 新田恒雄, 広瀬啓吉, 峯松信明, 森島繁生, 山下洋一, 山田篤, 李晃伸

電子情報通信学会技術研究報告. NLC, 言語理解とコミュニケーション 103 ( 518 ) 73 - 78 2003年12月

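This report describes the development project for "Galatea", a toolkit for anthropomorphic spoken dialogue agents. Galatea's main functions are speech recognition, speech synthesis, and facial image synthesis, which are integrated and operated under dialogue control. Because the toolkit is intended as a research platform, customizability was emphasized. As a result, the face image can be replaced easily, the speech synthesizer supports speaker adaptation, the dialogue control description is easy to modify, each functional module can readily be swapped for another, and the system copes flexibly with the number of processing machines available. The toolkit can be downloaded and is licensed to the public free of charge.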

CiNii
擬人化音声対話エージェント基本ソフトウェアの開発プロジェクト報告

嵯峨山茂樹, 伊藤克亘, 宇津呂武仁, 甲斐充彦, 小林隆夫, 下平博, 伝康晴, 徳田恵一, 中村哲, 西本卓也, 新田恒雄, 広瀬啓吉, 峯松信明, 森島繁生, 山下洋一, 山田篤, 李晃伸

電子情報通信学会技術研究報告. SP, 音声 103 ( 520 ) 73 - 78 2003年12月

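This report describes the development project for "Galatea", a toolkit for anthropomorphic spoken dialogue agents. Galatea's main functions are speech recognition, speech synthesis, and facial image synthesis, which are integrated and operated under dialogue control. Because the toolkit is intended as a research platform, customizability was emphasized. As a result, the face image can be replaced easily, the speech synthesizer supports speaker adaptation, the dialogue control description is easy to modify, each functional module can readily be swapped for another, and the system copes flexibly with the number of processing machines available. The toolkit can be downloaded and is licensed to the public free of charge.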

CiNii
レンジセンサを用いた表情の計測および変形ルールの記述

大橋俊介, 山口智之, 森島繁生

映像情報メディア学会技術報告 27 ( 53 ) 1 - 6 2003年09月

CiNii
レンジセンサを用いた表情の計測および変形ルールの記述

大橋俊介, 山口智之, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 103 ( 328 ) 1 - 6 2003年09月

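In recent years, research on CG-based facial expression synthesis has attracted attention for avatar-mediated communication systems and human interfaces. Smooth communication between computers and people calls for an environment resembling face-to-face conversation between humans, so expressing nonverbal information, including facial expressions, is a key challenge. This paper describes a method that uses a range sensor to measure the 3D shape of the face with high accuracy and, at the same time, extracts expression deformation rules from the correspondence between the neutral face and the face after each basic expression change. The method is based on matching feature points by fitting a standard face model. It also introduces the concept of an expression displacement distribution map and shows that the deformation rules can be applied to an arbitrary face shape model.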

CiNii
<研究装置設備紹介>3次元動作解析システム(平成14年度購入)

森島繁生, 柴田昌明

成蹊大学工学研究報告 40 ( 2 ) 91 - 92 2003年09月

CiNii
顔の分析・合成とその応用

森島繁生

情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア] 139 ( 66 ) 107 - 114 2003年07月

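Entertainment and communication systems have recently attracted attention as applications of virtual humans. The challenge in these applications is to copy and synthesize a person faithfully, but vision-based systems are limited in performance and have difficulty reproducing high-quality synthetic images. Motion capture systems are commonly used in film and games, but online synthesis is difficult and post-processing currently requires a great deal of manual work. This paper focuses on copying facial actions with high fidelity and natural quality, and describes a method that uses facial image analysis and synthesis to control a virtual human in near real time. Application systems for entertainment and communication are also introduced.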

CiNii
擬人化インタフェース

森島繁生

ヒューマンインタフェース学会誌 = Journal of Human Interface Society : human interface 5 ( 2 ) 109 - 112 2003年05月

CiNii
テクスチャブレンディングによる皺の表現と口形アニメーション

石山慎一郎, 高橋光紀, 大橋俊介, 森島繁生

第65回全国大会講演論文集 2003 ( 1 ) 59 - 60 2003年03月

CiNii
モーションキャプチャシステムによる複雑な人物動作の表現

仲野陽介, 四倉達夫, 杉崎英嗣, 森島繁生

第65回全国大会講演論文集 2003 ( 1 ) 179 - 180 2003年03月

CiNii
影響力マップを用いた顔画像合成の提案

祖川慎治, 四倉達夫, 森島繁生

第65回全国大会講演論文集 2003 ( 1 ) 197 - 198 2003年03月

CiNii
モデル・テクスチャ・皺のブレンディングによる顔表情の表現

高橋光紀, 森島繁生

第65回全国大会講演論文集 2003 ( 1 ) 199 - 200 2003年03月

CiNii
モーションキャプチャを用いた内部骨格の動作再現

小島潔, 杉崎英嗣, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 102 ( 736 ) 19 - 24 2003年03月

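Motion capture systems are now widely used in film and video game production to measure motion and reproduce it as animation, but their measurement accuracy is limited, and reproducing the actual motion requires extensive manual post-processing. This study proposes a method for measuring motion accurately with such a system. A custom marker layout is used for capture, and the marker data drive a skeleton model that faithfully represents the human skeleton. Because the skeleton lies inside the body while the markers locate points on the body surface, MRI was used to determine the exact position of the bones relative to the markers. The proposed system was applied to capturing classical ballet, which has many distinctive movements and is considered difficult to model, and it accurately reproduced both the skeletal motion and the ballet movements.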

CiNii J-GLOBAL
音声のパラメータ変換によるイントネーション変換システムの構築

足立吉広, 前島謙宣, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 102 ( 734 ) 1 - 6 2003年03月

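We propose a method for converting the voice quality of arbitrary natural or synthetic speech, with the aim of adding emotion, emphasis, or dialect to speech. Previous work has controlled intonation by manipulating prosodic information, but because the modification was done at the waveform level, the naturalness of the reproduced speech degraded markedly. To limit this degradation, this study adopts the STRAIGHT approach and adds a new way of controlling duration, pitch, and power for each segmented syllable interval, yielding a system that converts speaking rate and intonation. Speaking rate, pitch contour, and power contour are extracted automatically per segment from the analysis of a reference utterance that serves as a model of the speaking style. This prosodic information is then copied directly onto a sample utterance to convert its voice quality.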

CiNii J-GLOBAL
3次元顔テンプレートを用いた静止画中の顔認識と表情変換

高橋悠, 八島隆, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 102 ( 734 ) 31 - 36 2003年03月

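With the rapid spread of digital cameras, people now store large numbers of snapshots of various people on their PCs. There is a need to find only the photos showing a particular person, or to select photos with smiling faces, from such a large photo database without manual labeling. This paper proposes a method that generates a 3D face model of the target person, builds a 3D template from this model, automatically selects the snapshots in which the person appears, and automatically overlays the 3D model on the person's image. By defining expression deformation rules on the 3D face model, the search also covers changes in expression, making it possible to select photos in which a person shows a specific expression. Because 3D template matching estimates the position and orientation of the face simultaneously, the face region can also be converted to another expression by deforming the model, producing natural-looking images.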

CiNii J-GLOBAL
A-14-1 幾何構造とテクスチャの平均化による人物に依存しない表情変形モデルの構築

大橋俊介, 井上真実, 森島繁生

電子情報通信学会総合大会講演論文集 2003 253 - 253 2003年03月

CiNii J-GLOBAL
A-14-2 音声のパラメータ変換によるイントネーション変換システムの構築

足立吉広, 前島謙宣, 四倉達夫, 森島繁生

電子情報通信学会総合大会講演論文集 2003 254 - 254 2003年03月

CiNii J-GLOBAL
A-14-3 複数人話者会話シーンにおける動画像翻訳システムの構築

前島謙宣, 森島繁生, 中村哲

電子情報通信学会総合大会講演論文集 2003 255 - 255 2003年03月

CiNii J-GLOBAL
擬人化エージェントに必要な顔のリアリティとそのモデル化

森島繁生

情報処理学会研究報告. SLP, 音声言語情報処理 45 ( 14 ) 47 - 50 2003年02月

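Using FACS is a well-known way of representing facial expressions. FACS defines the movement of each part of the face and thus specifies only geometric deformation. This paper proposes a method for rendering fine details such as wrinkles, which has been difficult until now. Face models that carry information such as wrinkles (basic faces) are prepared, and arbitrary facial expressions are constructed by blending their geometry and textures. Mouth-shape animation during speech is handled in the same way: geometry and texture are created in advance as a mouth model for each viseme, and blending them gives smooth mouth animation.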

CiNii J-GLOBAL
擬人化音声対話エージェントツールキット Galatea

嵯峨山茂樹, 川本真一, 下平博, 新田恒雄, 西本卓也, 中村哲, 伊藤克亘, 森島繁生, 四倉達夫, 甲斐充彦, 李晃伸, 山下洋一, 小林隆夫, 徳田恵一, 広瀬啓吉, 峯松信明, 山田篤, 伝康晴, 宇津呂武仁

情報処理学会研究報告. SLP, 音声言語情報処理 45 ( 14 ) 57 - 64 2003年02月

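This paper outlines "Galatea", a toolkit for anthropomorphic spoken dialogue agents developed by the authors. Its main functions are speech recognition, speech synthesis, and facial image synthesis, which are integrated and operated under dialogue control. Because the toolkit is intended as a research platform, customizability was emphasized. As a result, the face image can be replaced easily, the speech synthesizer supports speaker adaptation, the dialogue control description is easy to modify, each functional module can readily be swapped for another, and the system copes flexibly with the number of processing machines available. The authors plan to release the source code and license it to the public free of charge.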

CiNii J-GLOBAL
複数人話者会話シーンの動画像翻訳

前島謙宣, 森島繁生, 中村哲

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 102 ( 598 ) 13 - 18 2003年01月

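This paper describes a method for video translation of conversation scenes with multiple speakers. The facial motion of each person in the video is estimated, and speech activity is detected for each speaker in the scene. The mouth region of a speaker detected as talking is replaced with lip imagery synthesized in synchrony with separately prepared speech, achieving lip sync to another language or to altered utterance content.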

CiNii J-GLOBAL
影響力マップ用いた任意表情モデル上での表情合成

祖川慎治, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 102 ( 598 ) 19 - 24 2003年01月

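Research on avatar facial expression synthesis is active, with the aim of realizing face-to-face communication systems and human interfaces. However, conventional expression synthesis is strongly model-dependent: it essentially requires a specific wireframe model on which detailed expression deformation rules have been defined. To make expression synthesis independent of any particular model, this paper defines an influence map with an extremely fine mesh, projects the deformation rules from the rule-bearing face model onto the map as 3D displacements, and proposes a way to transplant these rules to a face model defined by the user. Specifically, the correspondence between the grid points of the rule-bearing model and those of the user-defined model is established with radial basis functions. A notable feature is that rules can be converted even between models with different grid points, and a GUI for this purpose has also been implemented.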

CiNii J-GLOBAL
解剖学から始めるフェイシャルアニメーション

森島繁生

CG WORLD 2003年5月号 32 - 43 2003年
Face analysis and synthesis for interactive entertainment

Shoichiro Iwasawa, Tatsuo Yotsukura, Shigeo Morishima

IFIP Advances in Information and Communication Technology 112 157 - 164 2003年

A stand-in is a common technique for movies and TV programs in foreign languages. The current stand-in, which substitutes only the voice channel, results in awkward matching with the mouth motion. Videophones with automatic voice translation are expected to be widely used in the near future and may face the same problem without lip-synchronized translation of the speaking face image. In this paper, we propose a method that tracks the motion of the face in the video image and then replaces the face, or only the mouth, with a synthesized one synchronized with synthetic or spoken voice. This is a key technology not only for speaking-image translation and communication systems but also for interactive entertainment systems. Finally, an interactive movie system is introduced as an entertainment application. © 2003 by Springer Science+Business Media New York.

DOI
影響力マップ用いた任意表情モデル上での表情合成

祖川慎治, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告 102 ( 598(HCS2002 28-36) ) 19 - 24 2003年

J-GLOBAL
複数人話者会話シーンの動画像翻訳

前島謙宣, 森島繁生, 中村哲

電子情報通信学会技術研究報告 102 ( 598(HCS2002 28-36) ) 13 - 18 2003年

J-GLOBAL
擬人化音声対話エージェントツールキットGalatea

嵯峨山茂樹, 川本真一, 新田恒雄, 中村哲, 伊藤克亘, 森島繁生, 甲斐充彦, 李晃伸, 山下洋一

情報処理学会研究報告 2003 ( 14(SLP-45) ) 57 - 64 2003年

J-GLOBAL
擬人化エージェントに必要な顔のリアリティとそのモデル化

森島繁生

情報処理学会研究報告 2003 ( 14(SLP-45) ) 47 - 50 2003年

J-GLOBAL
複数人話者会話シーンにおける動画像翻訳システムの構築

前島謙宣, 森島繁生, 中村哲

電子情報通信学会大会講演論文集 2003 255 2003年

J-GLOBAL
古典バレエのモーションキャプチャリング

小島潔, 杉崎英嗣, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 2003 148 2003年

J-GLOBAL
音声のパラメータ変換によるイントネーション変換システムの構築

足立吉広, 前島謙宣, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 2003 254 2003年

J-GLOBAL
幾何構造とテクスチャの平均化による人物に依存しない表情変形モデルの構築

大橋俊介, 井上真実, 森島繁生

電子情報通信学会大会講演論文集 2003 253 2003年

J-GLOBAL
音声のパラメータ変換によるイントネーション変換システムの構築

足立吉広, 前島謙宣, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告 102 ( 734(HCS2002 47-56) ) 1 - 6 2003年

J-GLOBAL
3次元顔テンプレートを用いた静止画中の顔認識と表情変換

高橋悠, 八島隆, 森島繁生

電子情報通信学会技術研究報告 102 ( 734(HCS2002 47-56) ) 31 - 36 2003年

J-GLOBAL
モーションキャプチャを用いた内部骨格の動作再現

小島潔, 杉崎英嗣, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告 102 ( 736(HIP2002 77-82) ) 19 - 24 2003年

J-GLOBAL
スナップ写真中の人物の同定,表情認識,表情変換

八島隆, 高橋悠, 森島繁生

情報処理学会全国大会講演論文集 65th ( 2 ) 2.83-2.84 2003年

J-GLOBAL
テクスチャブレンディングによるしわの表現と口形アニメーション

石山慎一郎, 高橋光紀, 大橋俊介, 森島繁生

情報処理学会全国大会講演論文集 65th ( 4 ) 4.59-4.60 2003年

J-GLOBAL
影響力マップ用いた任意表情モデル上での表情合成

祖川慎治, 四倉達夫, 森島繁生

情報処理学会全国大会講演論文集 65th ( 5 ) 5.427-5.430 2003年

J-GLOBAL
モーションキャプチャシステムを用いた複雑な人物動作の表現

仲野陽介, 杉崎英嗣, 四倉達夫, 森島繁生

情報処理学会全国大会講演論文集 65th ( 5 ) 5.391-5.394 2003年

J-GLOBAL
擬人化音声対話エージェントのための顔画像合成モジュールの開発

四倉達夫, 森島繁生

情報処理学会全国大会講演論文集 65th ( 5 ) 5.423-5.426 2003年

J-GLOBAL
場を用いて擬似的に表現した風の力による頭髪の運動アニメーション

浅井崇, 杉森大輔, 杉崎英嗣, 森島繁生

情報処理学会全国大会講演論文集 65th ( 4 ) 4.49-4.50 2003年

J-GLOBAL
グラフマッチングを利用した顔特徴部位の位置推定と追跡

大室学, 森島繁生

情報処理学会全国大会講演論文集 65th ( 2 ) 2.77-2.78 2003年

J-GLOBAL
モデル・テクスチャ・しわのブレンディングによる顔表情の表現

高橋光紀, 森島繁生

情報処理学会全国大会講演論文集 65th ( 5 ) 5.431-5.434 2003年

J-GLOBAL
GUIを越えて-Beyond Desktop特集擬人化インタフェース

森島繁生

ヒューマンインタフェース学会誌 5 ( 2 ) 109 - 112 2003年

J-GLOBAL
フォトリアリスティックな空間共有コミュニケーション技術

望月研二, 相沢清晴, 森島繁生, 斉藤隆弘

3次元画像コンファレンス講演論文集 2003 209 - 212 2003年

J-GLOBAL
顔の分析・合成とその応用

森島繁生

情報処理学会研究報告 2003 ( 66(CVIM-139) ) 107 - 114 2003年

J-GLOBAL
レンジセンサを用いた表情の計測および変形ルールの記述

大橋俊介, 山口智之, 森島繁生

電子情報通信学会技術研究報告 103 ( 328(HCS2003 14-19) ) 1 - 6 2003年

J-GLOBAL
古典バレエのモーションキャプチャリング

小島潔, 杉崎英嗣, 四倉達夫, 森島繁生

電子情報通信学会総合大会、2003 148 - 148 2003年

CiNii
擬人化音声対話システムにおけるエージェント画像生成

四倉達夫, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 102 ( 342 ) 1 - 6 2002年09月

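Anthropomorphic agents are one form of communication between machines and humans. What is needed is a system in which the agent is shown on a computer display, understands and expresses verbal information as well as nonverbal information such as gestures and facial expressions, and creates a realistic communication environment that feels like face-to-face conversation between people. The ultimate goal in building such an agent is to make the agent itself realistic enough that communicating with it compares well with conversation in the real world. This paper introduces the techniques used to build the agent in this system, covering construction of the agent's face model, facial expression synthesis, and animation methods.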

CiNii
TAO-5 空間共有コミュニケーションプロジェクト(大型プロジェクト紹介,学術系企画)

齊藤隆弘, 森島繁生, 相澤清晴, 望月研二

情報科学技術フォーラム学術系・企業系予稿集 2002 42 - 43 2002年09月

CiNii
空間共有コミュニケーションプロジェクト

齋藤隆弘, 森島繁生, 相澤清晴, 望月研二

計測と制御 = Journal of the Society of Instrument and Control Engineers 41 ( 9 ) 653 - 658 2002年09月

DOI CiNii J-GLOBAL
カスタマイズ性を考慮した擬人化音声対話ソフトウェアツールキットの設計

川本真一, 下平博, 新田恒雄, 西本卓也, 中村哲, 伊藤克亘, 森島繁生, 四倉達夫, 甲斐充彦, 李晃伸, 山下洋一, 小林隆夫, 徳田恵一, 広瀬啓吉, 峯松信明, 山田篤, 伝康晴, 宇津呂武仁, 嵯峨山茂樹

情報処理学会論文誌 43 ( 7 ) 2249 - 2263 2002年07月

This paper positions the anthropomorphic spoken dialogue agent as a key technology for future human interfaces and aims at a highly customizable software toolkit that can serve as a common platform for research and development, discussing the required elements and the techniques for realizing them. One major goal for future human interface technology is a computer that behaves like a person, has a human face and figure, and converses with the user in spoken language. Advancing such research requires cooperation across many fields and a common platform on which results can accumulate. That platform needs basic modules for speech recognition, speech synthesis, image synthesis, and dialogue control, together with a mechanism that integrates and controls them. Moreover, to express individuality and support a wide range of applications, each module needs customizability as well as strong basic functionality. The authors therefore designed and implemented an anthropomorphic spoken dialogue agent system in which the face image can be replaced easily, the speech synthesizer supports speaker adaptation, the dialogue control description is easy to modify, and the functional modules themselves can readily be swapped for others. Agents were prototyped for several simple dialogue tasks to confirm how far the required functions had been achieved. This paper discusses the design and architecture of a software toolkit for building an easily customizable anthropomorphic spoken dialog agent (ASDA). A human-like spoken dialogue agent is one of the promising man-machine interfaces for the next generation. However, simply combining existing software modules for speech recognition, speech synthesis, face-animation synthesis, and dialogue control does not lead to a satisfying agent system as might be expected. ASDA requires more sophisticated functions of the modules than when the modules are used independently, as well as an integration mechanism. Another problem with ASDA was that it required great customization effort for any user-system interaction task. Therefore, developing an easy-to-customize software platform for ASDA is quite meaningful, though it is still a great challenge in both research and development aspects. This paper discusses basic and essential requirements for ASDA systems, and software modules for the system are designed to fulfill the requirements. A prototype agent system has been developed on a UNIX-based system using this software toolkit. Finally, we discuss current achievements of the toolkit.

CiNii
擬人化音声対話エージェント開発プロジェクト

嵯峨山茂樹, 伊藤克亘, 宇津呂武仁, 甲斐充彦, 小林隆夫, 下平博, 伝康晴, 徳田恵一, 中村哲, 西本卓也, 新田恒雄, 広瀬啓吉, 森島繁生, 峯松信明, 山下洋一, 山田篤, 李晃伸

日本音響学会研究発表会講演論文集 2002 ( 1 ) 27 - 28 2002年03月

CiNii
A-14-13 レンジファインダを用いた表情編集ツールの構築

大橋俊介, 杉崎英嗣, 伊藤圭, 森島繁生

電子情報通信学会総合大会講演論文集 2002 291 - 291 2002年03月

CiNii J-GLOBAL
A-14-14 奥行き情報を利用した正面顔画像への標準顔モデルの自動整合

井上洋信, 大室学, 伊藤圭, 森島繁生

電子情報通信学会総合大会講演論文集 2002 292 - 292 2002年03月

CiNii J-GLOBAL
A-15-3 空間周波数成分と十字テンプレートを用いたリアルタイム顔器官形状認識と表情合成

中村一雄, 大室学, 島田昌実, 森島繁生

電子情報通信学会総合大会講演論文集 2002 297 - 297 2002年03月

CiNii J-GLOBAL
A-16-34 仮想人物の舌モデル構成と発話アニメーション作成

鳥飼友美, 伊藤圭, 緒方信, 森島繁生

電子情報通信学会総合大会講演論文集 2002 354 - 354 2002年03月

CiNii J-GLOBAL
テクスチャブレンディングによる皺の表現と基本顔モデルによる感情空間の構築

柳澤尋輝, 高橋光紀, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 101 ( 693 ) 17 - 24 2002年02月

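We proposed a method that represents a given facial expression by blending 11 basic expression models, each consisting of a 3D structure acquired beforehand with a range sensor and a texture captured with a digital camera. This made it possible to render wrinkles and subtle expression changes that were difficult to express with conventional models. In addition, the blending weights of the basic expression models were fed to the input and output layers of a five-layer neural network trained as an identity mapping, and a 3D emotion space was constructed from its middle layer. As a result, realistic facial animation can be produced easily by controlling a point in the emotion space and applying the transformation of the last three layers.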

CiNii
ズーム変化を含む動画中の顔自動トラッキングとマッチムーブによる表情合成

長田誉弘, 大室学, 緒方信, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 101 ( 693 ) 25 - 32 2002年02月

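In film production, manual match-moving is often used to replace the lead actor's face with that of another person, but it requires experience and time. In dubbing foreign films, mouth shapes and speech are not synchronized, and the mouth movements often constrain the dubbed lines. This paper addresses these problems by proposing a method that automatically estimates the position and orientation of a person's face in video and replaces all or part of the face. For face tracking, a method using a 3D template is proposed and achieves accurate estimation. Based on the tracking result, a wireframe is fitted to the image, and we propose a system that replaces the face with another person's or swaps the mouth region to produce video of the person speaking different words.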

CiNii
擬人化音声対話エージェントツールキットの基本設計

川本真一, 下平博, 新田恒雄, 西本卓也, 中村哲, 伊藤克亘, 森島繁生, 四倉達夫, 甲斐充彦, 李晃伸, 山下洋一, 小林隆夫, 徳田恵一, 広瀬啓吉, 峯松信明, 山田篤, 伝康晴, 宇津呂武仁, 嵯峨山茂樹

情報処理学会研究報告. HI,ヒューマンインタフェース研究会報告 97 61 - 66 2002年02月

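The authors designed and implemented an anthropomorphic spoken dialogue agent system in which the face image can be replaced easily, the speech synthesizer supports speaker adaptation, the dialogue control description is easy to modify, the functional modules themselves can readily be swapped for others, and the system copes flexibly with the number of processing machines available. The module interfaces are handled uniformly, and the module integration mechanism is realized simply by using standard input and output, as on UNIX systems, for communication between modules. Agents were prototyped for several simple dialogue tasks to confirm how far the required functions had been achieved. Adding a new module that controls the facial image synthesis module also proved easy.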

CiNii
擬人化音声対話エージェントのための表情合成技術

四倉達夫, 森島繁生

情報処理学会研究報告. HI,ヒューマンインタフェース研究会報告 97 79 - 84 2002年02月

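Anthropomorphic agents are one form of communication between machines and humans. What is needed is a system in which the agent is shown on a computer display, understands and expresses verbal information as well as nonverbal information such as gestures and facial expressions, and creates a realistic communication environment that feels like face-to-face conversation between people. The ultimate goal in building such an agent is to make the agent itself realistic enough that communicating with it compares well with conversation in the real world. This paper introduces the techniques used to build the agent in this system, covering construction of the agent's face model, facial expression synthesis, and animation methods.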

CiNii
擬人化音声対話エージェントのための表情合成技術

四倉達夫, 森島繁生

情報処理学会研究報告. SLP, 音声言語情報処理 2002 ( 10 ) 79 - 84 2002年02月

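Anthropomorphic agents are one form of communication between machines and humans. What is needed is a system in which the agent is shown on a computer display, understands and expresses verbal information as well as nonverbal information such as gestures and facial expressions, and creates a realistic communication environment that feels like face-to-face conversation between people. The ultimate goal in building such an agent is to make the agent itself realistic enough that communicating with it compares well with conversation in the real world. This paper introduces the techniques used to build the agent in this system, covering construction of the agent's face model, facial expression synthesis, and animation methods.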

CiNii J-GLOBAL
レンジファインダを用いた表情変形ルールと表情編集ツールの構築

大橋俊介, 杉崎英嗣, 伊藤圭, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 101 ( 610 ) 1 - 6 2002年01月

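In recent years, research on CG-based facial expression synthesis has attracted attention for avatar-mediated communication systems and human interfaces. Smooth communication between computers and people calls for an environment resembling face-to-face conversation between humans, so expressing nonverbal information, including facial expressions, is a key challenge. This paper aims to reproduce with high accuracy, using a range finder, the expression synthesis rules based on the Facial Action Coding System (FACS) that we have used so far. Expression changes were measured for individual action units, and high-resolution 3D facial deformation rules were defined. Based on these rules, an expression editing tool that can describe arbitrary expression changes was implemented.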

CiNii
高速度カメラによる動的な顔面表情の分析および合成

四倉達夫, 内田英子, 山田寛, 赤松茂, 鉄谷信二, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 101 ( 610 ) 7 - 12 2002年01月

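In this paper, facial expressions were filmed with a high-speed camera both when people showed natural expressions (spontaneous expression) and when they acted out typical expressions (posed expression), and facial movement was measured quantitatively from the displacement of feature points placed on each part of the face. Animation of a CG face model was then generated from the measurements. In both the spontaneous and posed conditions, the differences in the onset of movement among facial parts were minute, which demonstrated the value of using a high-speed camera. The amount and speed of facial movement differed characteristically by emotion and by expression condition, yet the way the movement changed over time showed interesting commonalities. The face model animation also produced more natural expressions than keyframe animation with linear interpolation.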

CiNii
マルチモーダル翻訳インタフェース

森島繁生, 四倉達夫

計測自動制御学会部門大会／部門学術講演会資料 si2002 117 - 117 2002年

DOI CiNii
Multi-modal Translation and Evaluation of Lip-synchronization using Noise Added Voice

MORISHIMA Shigeo

The First International Joint Conference on Autonomous Agents & Multi-Agent Systems, Proc. of Workshop 14 : Embodied conversational agents-let's specify and evaluate them!, 2002 2002年

CiNii
Audio-visual speech translation with automatic lip synchronization and face tracking based on 3-D head model

S Morishima, S Ogata, K Murai, S Nakamura

2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS 2 2117 - 2120 2002年

Speech-to-speech translation has been studied to realize natural human communication beyond language barriers. Toward further multi-modal natural communication, visual information such as face and lip movements will be necessary. In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database. We conduct subjective evaluation by connected digit discrimination using data with and without audiovisual lip-synchronicity. The results confirm the sufficient quality of the proposed audio-visual translation system.
リアルな顔合成による音声対話擬人化エージェントの開発 (電子情報通信学会S) (日本ソフトウエア科学会S)

四倉達夫, 森島繁生

エージェント合同シンポジウム(JAWS2002)講演論文集平成14年 142 - 143 2002年

J-GLOBAL
HYPERMASK 3次元顔モデルを用いた仮面の構築

四倉達夫, BINSTED K, NIELSEN F, PINHANEZ C, 鉄谷信二, 中津良平, 森島繁生

電子情報通信学会論文誌 D-2 J85-D-2 ( 1 ) 36 - 45 2002年

J-GLOBAL
高速度カメラによる動的な顔面表情の分析および合成

四倉達夫, 内田英子, 山田寛, 赤松茂, 鉄谷信二, 森島繁生

電子情報通信学会技術研究報告 101 ( 610(HCS2001 32-46) ) 7 - 12 2002年

J-GLOBAL
レンジファインダを用いた表情変形ルールと表情編集ツールの構築

大橋俊介, 杉崎英嗣, 伊藤圭, 森島繁生

電子情報通信学会技術研究報告 101 ( 610(HCS2001 32-46) ) 1 - 6 2002年

J-GLOBAL
擬人化音声対話エージェントツールキットの基本設計

川本真一, 新田恒雄, 西本卓也, 中村哲, 伊藤克亘, 森島繁生, 甲斐充彦, 山下洋一, 嵯峨山茂樹

情報処理学会研究報告 2002 ( 10(HI-97 SLP-40) ) 61 - 66 2002年

J-GLOBAL
擬人化音声対話エージェントのための表情合成技術

四倉達夫, 森島繁生

情報処理学会研究報告 2002 ( 10(HI-97 SLP-40) ) 79 - 84 2002年

J-GLOBAL
レンジファインダと複数アングル画像を用いた3次元顔モデルの生成とその表情合成

伊藤圭, 森島繁生

電子情報通信学会大会講演論文集 2002 299 2002年

J-GLOBAL
ズーム変化を含む動画中の顔自動トラッキング

長田誉弘, 大室学, 緒方信, 森島繁生

電子情報通信学会大会講演論文集 2002 276 2002年

J-GLOBAL
テクスチャブレンディングによるしわの表現と基本顔モデルによる感情空間の構築

柳沢尋輝, 高橋光紀, 森島繁生

電子情報通信学会技術研究報告 101 ( 693(HCS2001 47-53) ) 17 - 24 2002年

J-GLOBAL
パンチルト制御可能な複数のカメラの連携による顔領域追跡

島田昌実, 森島繁生

電子情報通信学会大会講演論文集 2002 190 2002年

J-GLOBAL
テクスチャブレンディングによるしわの表現と基本顔モデルによる感情空間の構築

柳沢尋輝, 高橋光紀, 森島繁生

電子情報通信学会大会講演論文集 2002 298 2002年

J-GLOBAL
自然な頭髪の運動と髪型を保存する復元力の表現

杉森大輔, 杉崎英嗣, 佐野和成, 森島繁生

電子情報通信学会大会講演論文集 2002 173 2002年

J-GLOBAL
仮想人物の舌モデル構成と発話アニメーション作成

鳥飼友美, 伊藤圭, 緒方信, 森島繁生

電子情報通信学会大会講演論文集 2002 354 2002年

J-GLOBAL
人物モデルの構築と歩行動作のルール化によるアニメーション生成

佐藤大, 杉崎英嗣, 佐野和成, 森島繁生

電子情報通信学会大会講演論文集 2002 170 2002年

J-GLOBAL
ズーム変化を含む動画中の顔自動トラッキングとマッチムーブによる表情合成

長田誉弘, 大室学, 緒方信, 森島繁生

電子情報通信学会技術研究報告 101 ( 693(HCS2001 47-53) ) 25 - 32 2002年

J-GLOBAL
レンジファインダを用いた表情編集ツールの構築

大橋俊介, 杉崎英嗣, 伊藤圭, 森島繁生

電子情報通信学会大会講演論文集 2002 291 2002年

J-GLOBAL
有声音区間の対応づけによる自動イントネーション変換

前島謙宣, 緒方信, 森島繁生

電子情報通信学会大会講演論文集 2002 289 2002年

J-GLOBAL
奥行き情報を利用した正面顔画像への標準顔モデルの自動整合

井上洋信, 大室学, 伊藤圭, 森島繁生

電子情報通信学会大会講演論文集 2002 292 2002年

J-GLOBAL
空間周波数成分と十字テンプレートを用いたリアルタイム顔器官形状認識と表情合成

中村一雄, 大室学, 島田昌実, 森島繁生

電子情報通信学会大会講演論文集 2002 297 2002年

J-GLOBAL
スナップ写真データベース中の人物検索

高橋悠, 伊藤圭, 緒方信, 森島繁生

電子情報通信学会大会講演論文集 2002 288 2002年

J-GLOBAL
人物アニメーションに同期した頭髪の運動表現

佐野和成, 森島繁生

電子情報通信学会大会講演論文集 2002 169 2002年

J-GLOBAL
コンピュータグラフィックスによる髪型を保存する復元力を用いた頭髪の自然な運動表現

杉森大輔, 杉崎英嗣, 森島繁生

Visual Computing グラフィクスとCAD合同シンポジウム予稿集 2002 83 - 86 2002年

J-GLOBAL
音声言語情報処理とその応用カスタマイズ性を考慮した擬人化音声対話ソフトウェアツールキットの設計

川本真一, 新田恒雄, 西本卓也, 中村哲, 伊藤克亘, 森島繁生, 甲斐充彦, 李晃伸, 嵯峨山茂樹

情報処理学会論文誌 43 ( 7 ) 2249 - 2263 2002年

J-GLOBAL
複合現実空間共有コミュニケーションプロジェクト

斉藤隆弘, 森島繁生, 相沢清晴, 望月研二

計測と制御 41 ( 9 ) 653 - 658 2002年

DOI J-GLOBAL
空間共有コミュニケーションプロジェクト

斉藤隆弘, 森島繁生, 相沢清晴, 望月研二

情報科学技術フォーラム FIT 2002 42 - 43 2002年

J-GLOBAL
擬人化音声対話システムにおけるエージェント画像生成

四倉達夫, 森島繁生

電子情報通信学会技術研究報告 102 ( 342(HCS2002 22-27) ) 1 - 6 2002年

J-GLOBAL
空間共有コミュニケーションの実験システム:BEOEB

望月研二, 山田邦男, 岩沢昭一郎, 吉田俊介, 鳥羽美奈子, 相沢清晴, 森島繁生, 斉藤隆弘

映像情報メディア学会技術報告 26 ( 70(ME2002 64-69) ) 13 - 16 2002年

J-GLOBAL
HAI ヒューマンエージェントインタラクション HAIにおけるエージェントのリアリティとコミュニケーションギャップ

森島繁生

人工知能学会誌 17 ( 6 ) 687 - 692 2002年

J-GLOBAL
エンタテインメントのための表情分析・合成技術

森島繁生

日本バーチャルリアリティ学会論文誌 7 ( 4 ) 533 - 541 2002年

Copying human action accurately is the main approach in entertainment VR for controlling and generating a virtual human or cartoon character on screen. Motion capture systems are generally used to generate a cloned human in motion pictures or interactive games. However, extensive manual post-processing is unavoidable to obtain high-quality images. This paper focuses on copying facial action and presents a high-quality, accurate method for generating and copying the natural impression of a face through a semi-interactive process based on face image analysis and face image synthesis. Finally, some application systems in the entertainment area are introduced.

DOI CiNii J-GLOBAL
有声音区間の対応付けによる自動イントネーション変換

前島謙宣, 緒方信, 森島繁生

電子情報通信学会2002年総合全国大会, March 289 - 289 2002年

CiNii
テクスチャブレンディングによる皺の表現と基本顔モデルによる感情空間の構築

柳澤尋輝, 高橋光紀, 森島繁生

電子情報通信学会総合大会, 2002-03 2002 ( 693(HCS2001 47-53) ) 298 - 298 2002年

CiNii J-GLOBAL
複数アングル画像とレンジファインダを用いた3次元顔モデルの生成とその表情合成

伊藤圭, 森島繁生

電子情報通信学会2002年総合全国大会, March 299 - 299 2002年

CiNii
空間共有コミュニケーションの実験システム : BEOEB(感性情報処理および一般)

望月研二, 山田邦男, 岩澤昭一郎, 吉田俊介, 鳥羽美奈子, 相澤清晴, 森島繁生, 齊藤隆弘

映像情報メディア学会技術報告 26 ( 0 ) 13 - 16 2002年

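In visual media such as communication and broadcasting, there is growing demand to move from 2D image information to higher-dimensional information such as more immersive 3D video. At 本郷空間共有リサーチセンタ, working toward "shared-space communication" (空間共有コミュニケーション), a form of visual communication people have long dreamed of, we proposed a technique for sharing a realistic virtual space using photographic images and other sources, based on a three-layer representation that depends on distance from the viewpoint. This paper explains the component technologies aimed at realizing such a communication space and describes BEOEB, the experimental system that implements it.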

DOI CiNii
エンタテインメントのための表情分析・合成技術(<特集>エンタテインメントVR)

森島繁生

日本バーチャルリアリティ学会論文誌 7 ( 4 ) 533 - 541 2002年

Copying human action accurately is the main approach in entertainment VR for controlling and generating a virtual human or cartoon character on screen. Motion capture systems are generally used to generate a cloned human in motion pictures or interactive games. However, extensive manual post-processing is unavoidable to obtain high-quality images. This paper focuses on copying facial action and presents a high-quality, accurate method for generating and copying the natural impression of a face through a semi-interactive process based on face image analysis and face image synthesis. Finally, some application systems in the entertainment area are introduced.

DOI CiNii
HYPERMASK : 3次元顔モデルを用いた仮面の構築

四倉達夫, BINSTED Kim, NIELSEN Frank, PINHANEZ Claudio, 鉄谷信二, 中津良平, 森島繁生

電子情報通信学会論文誌. D-II, 情報・システム, II-パターン処理 = The transactions of the Institute of Electronics, Information and Communication Engineers. D-II 85 ( 1 ) 36 - 45 2002年01月

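HYPERMASK is a system that advances the concept of the mask, which traditionally represents a single facial expression or character, so that any expression or character can be freely generated and displayed from one mask. Using the system is expected to widen the expressive range of an actor wearing the mask and to give rise to new staging methods. To display the face, five LEDs attached to the mask are tracked by a camera to obtain the mask's position and orientation, and a projector projects a face image based on the computed parameters. The projected face animates its mouth shape in real time in sync with the performer's voice, which is analyzed as it is spoken, and the user can switch expressions and characters at will. This paper introduces a staging support device using the HYPERMASK system, aiming to establish a new technique of mask expression.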

CiNii J-GLOBAL
擬人化音声対話エージェントツールキットの基本設計

川本真一, 下平博, 新田恒雄, 西本卓也, 中村哲, 伊藤克亘, 森島繁生, 四倉達夫, 甲斐充彦, 李晃伸, 山下洋一, 小林隆夫, 徳田恵一, 広瀬啓吉, 峯松信明, 山田篤, 伝康晴, 宇津呂武仁, 嵯峨山茂樹

情処研報 2002 ( 10 ) 61 - 66 2002年

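The authors designed and implemented an anthropomorphic spoken dialogue agent system in which the face image can be replaced easily, the speech synthesizer supports speaker adaptation, the dialogue control description is easy to modify, the functional modules themselves can readily be swapped for others, and the system copes flexibly with the number of processing machines available. The module interfaces are handled uniformly, and the module integration mechanism is realized simply by using standard input and output, as on UNIX systems, for communication between modules. Agents were prototyped for several simple dialogue tasks to confirm how far the required functions had been achieved. Adding a new module that controls the facial image synthesis module also proved easy.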

CiNii J-GLOBAL
Novel 3D image structure, processing and communications technologies based on hyper-realistic image for multi-media ambiance communication

Takahiro Saito

SID Conference Record of the International Display Research Conference 1353 - 1356 2001年12月

3D image structure, processing, and communications technologies based on hyper-realistic images for multi-media ambiance communication were proposed. A technique was adopted to integrate high-accuracy texture data, captured with a digital camera at an identification point, with range data. Analysis showed that advanced multi-media ambiance communication could be expected when this is combined with a display device.

CiNii
自発・演技表情表出時における顔面動作分析および表情合成

四倉達夫, 内田英子, 山田寛, 赤松茂, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 101 ( 333 ) 39 - 46 2001年09月

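In this paper, a high-speed camera was used to film facial expressions both when people showed natural expressions (spontaneous expression) and when they acted out universal, typical expressions (posed expression), and facial movement was measured quantitatively from the displacement of feature points placed on each part of the face. Animation of a CG face model was then generated from the measurements. In both the spontaneous and posed conditions, the differences in the onset of movement among facial parts were minute, which demonstrated the value of using a high-speed camera. The amount and speed of facial movement differed characteristically by emotion and by expression condition, yet the way the movement changed over time showed interesting commonalities. The face model animation also produced more natural expressions than keyframe animation with linear interpolation.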

CiNii
高速度カメラを用いた顔面動作の分析および表情合成

四倉達夫, 内田英子, 山田寛, 赤松茂, 鉄谷信二, 森島繁生

電子情報通信学会技術研究報告 101 ( 298 ) 15 - 22 2001年09月

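In this paper, a high-speed camera was used to film facial expressions both when people showed natural expressions (spontaneous expression) and when they acted out universal, typical expressions (posed expression), and facial movement was measured quantitatively from the displacement of feature points placed on each part of the face. Animation of a CG face model was then generated from the measurements. In both the spontaneous and posed conditions, the differences in the onset of movement among facial parts were minute, which demonstrated the value of using a high-speed camera. The amount and speed of facial movement differed characteristically by emotion and by expression condition, yet the way the movement changed over time showed interesting commonalities. The face model animation also produced more natural expressions than keyframe animation with linear interpolation.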

CiNii
高速度カメラを用いた顔面動作の分析および表情合成

四倉達夫, 内田英子, 山田寛, 赤松茂, 鉄谷信二, 森島繁生

電子情報通信学会技術研究報告. IE, 画像工学 101 ( 300 ) 15 - 22 2001年09月

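In this paper, a high-speed camera was used to film facial expressions both when people showed natural expressions (spontaneous expression) and when they acted out universal, typical expressions (posed expression), and facial movement was measured quantitatively from the displacement of feature points placed on each part of the face. Animation of a CG face model was then generated from the measurements. In both the spontaneous and posed conditions, the differences in the onset of movement among facial parts were minute, which demonstrated the value of using a high-speed camera. The amount and speed of facial movement differed characteristically by emotion and by expression condition, yet the way the movement changed over time showed interesting commonalities. The face model animation also produced more natural expressions than keyframe animation with linear interpolation.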

CiNii
あなたは人の計測にCV技術を使いますか?

山本正信, 美濃導彦, 岩井儀雄, 白井良明, 上田博唯, 岡本浩幸, 八村広三郎, 坂本浩, 森島繁生, 中村裕一

情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア] 128 ( 66 ) 111 - 111 2001年07月

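In conjunction with the organized session "人を観る" (Observing People), we have arranged a forum for discussing requests, criticisms, and future expectations concerning technologies for measuring and recognizing humans. Researchers active in various fields will offer candid opinions, deepening understanding of human measurement technology and exploring future directions. Why use image sensors? What information should be extracted? What are the advantages and disadvantages of image sensors? Which parts become easier because the subject is human, and which become harder? What kind of human body model is needed? Discussion of these and similar questions is expected.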

CiNii
色ヒストグラムインターセクションを用いたリアルタイム人物

吉村哲也, 市川忠嗣, 森島繁生, 相澤清晴, 齊藤隆弘

映像情報メディア学会誌 : 映像情報メディア = The journal of the Institute of Image Information and Television Engineers 55 ( 3 ) 412 - 416 2001年03月

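To extract faces even when the head moves or the facial expression changes, we improved an object extraction method that uses color histogram intersection as its similarity measure and applied it to face extraction. The improvement is a technique that speeds up the conventional active search method. Face extraction experiments were conducted with the method to evaluate the stability of extraction and the processing speed.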

DOI CiNii
3次元個人顔モデルを用いたビデオ映像中の顔の自動トラッキング及びモデルマッチムーブ処理

三澤貴文, 村井和昌, 中村哲, 森島繁生

電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 100 ( 716 ) 1 - 8 2001年03月

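With recent rapid advances in technology, the day is not far off when people will send and receive video on mobile terminals and converse with people overseas in their own language through speech translation systems. Conversation should become more natural if the image as well as the speech can be translated. Realizing such image translation requires accurate tracking of a person's face in video. Face tracking has been studied by many researchers, but most approaches follow facial feature points and leave problems such as jitter of feature points between frames and occlusion of feature points as the face rotates. As a core technology for image translation, this paper proposes a face tracking method based on template matching with a personal 3D model. In an evaluation experiment, the mean angular error about one axis was about 0.28°, which indicates that the proposed method is effective.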

CiNii J-GLOBAL
空間曲線上の点の直接操作によるヘアスタイルデザインシステム及びカット機能の実現

杉崎英嗣, 佐野和成, 森島繁生

電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 100 ( 716 ) 9 - 16 2001年03月

　概要を見る

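The synthesis of virtual humans in cyberspace and of human figures by computer graphics (CG) has been attracting attention. This paper addresses the representation of hair, which is considered especially difficult to synthesize in human CG. Methods that represent hair using mapping techniques have produced good results, but they are not suited to representing motion. We therefore represent hair with cubic B-spline space curves rather than textures or polygons, and use a system based on a layer model as the interface. This paper describes cubic B-spline space curves that pass through points on the space curve and the cutting of space curves. We also modeled and cut hair using the layer-model system.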

CiNii
高速度カメラで捉えた自発表情と演技表情の動的変化

山田寛, 内田英子, 四倉達夫, 森島繁生, 鉄谷信二, 赤松茂

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 100 ( 712 ) 27 - 34 2001年03月

　概要を見る

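In this study, we used a high-speed camera to film facial movements when people spontaneously showed natural expressions and when they posed expressions said to be universal and typical, and analyzed the quantitative characteristics of facial movement based on measured displacements of facial feature points. Natural expressions were elicited by presenting participants with the emotion-eliciting stimuli standardized by Gross & Levenson (1995). The posed typical expressions were based on FACS definitions. In both the spontaneous and posed conditions, the differences in movement onset among facial parts were minute, which demonstrates the value of using a high-speed camera. Characteristic differences in the amount and speed of movement of each facial part were found across emotions and across expression conditions, yet the pattern of change in movement itself showed an interesting commonality.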

CiNii J-GLOBAL
A-14-6 ネットワークシアタ

高橋和彦, 楜沢順, 四倉達夫, 森島繁生, 鉄谷信二

電子情報通信学会総合大会講演論文集 2001 282 - 282 2001年03月

CiNii
A-14-18 仮想空間を用いた多人数コミュニケーションシステム構築

伊藤圭, 四倉達夫, 森島繁生

電子情報通信学会総合大会講演論文集 2001 294 - 294 2001年03月

CiNii
A-14-19 顔特徴点抽出に基づく正面顔画像への標準顔モデルの自動フィッティング

坂本将人, 伊藤圭, 三澤貴文, 森島繁生

電子情報通信学会総合大会講演論文集 2001 295 - 295 2001年03月

CiNii
A-14-21 高速度カメラを用いた表情表出時の顔面動作の分析および微妙な表情の合成

長谷川佳之, 四倉達夫, 内田英子, 山田寛, 森島繁生

電子情報通信学会総合大会講演論文集 2001 297 - 297 2001年03月

CiNii J-GLOBAL
A-14-23 実画像とワイヤフレームモデルを併用した自動翻訳合成音声とのリップシンクシステムの構築

緒方信, 中村哲, 森島繁生

電子情報通信学会総合大会講演論文集 2001 299 - 299 2001年03月

CiNii J-GLOBAL
A-15-10 複数のカメラによる口・目領域追跡と表情合成のための特徴分析

島田直幸, 森島繁生

電子情報通信学会総合大会講演論文集 2001 309 - 309 2001年03月

CiNii J-GLOBAL
A-15-11 唇の特徴点抽出と音声分析を併用した音声に同期する唇画像の実時間合成

大室学, 伊藤圭, 島田直幸, 森島繁生

電子情報通信学会総合大会講演論文集 2001 310 - 310 2001年03月

CiNii
A-15-24 複数の基本顔の恒等写像学習による感情空間構成

高橋光紀, 伊東大介, 森島繁生

電子情報通信学会総合大会講演論文集 2001 323 - 323 2001年03月

CiNii
A-16-5 舌モデルの付加によるリアルな英語口形の実現と発話アニメーション

石川行一, 伊藤圭, 三澤貴文, 森島繁生

電子情報通信学会総合大会講演論文集 2001 328 - 328 2001年03月

CiNii J-GLOBAL
D-11-110 HyperMask-任意表情及び人物表出可能な仮面の構築

四倉達夫, Binsted Kim, Nielsen Frank, Claudio Pinhanez, 鉄谷信二, 森島繁生

電子情報通信学会総合大会講演論文集 2001 ( 2 ) 110 - 110 2001年03月

CiNii
D-11-113 空間曲線上の点の直接操作による頭髪スタイル制御とカット機能を有するヘアスタイルデザインシステム

杉崎英嗣, 佐野和成, 森島繁生

電子情報通信学会総合大会講演論文集 2001 ( 2 ) 113 - 113 2001年03月

CiNii J-GLOBAL
D-12-32 パンチルトカメラの多段接続による顔の特徴部位追跡

島田昌実, 島田直幸, 森島繁生

電子情報通信学会総合大会講演論文集 2001 ( 2 ) 199 - 199 2001年03月

CiNii J-GLOBAL
D-12-35 十字型テンプレートを利用したリアルタイム視線追跡と瞬き検出

笠嶋健吾, 島田直幸, 森島繁生

電子情報通信学会総合大会講演論文集 2001 ( 2 ) 202 - 202 2001年03月

CiNii
A human face tracking using color histogram intersection matching method

吉村哲也, 市川忠嗣, 森島繁生, 相澤清晴, 齊藤隆弘

Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers 55 ( 3 ) 412 - 416 2001年03月

　概要を見る

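To extract a face even when the head moves or the facial expression changes, we improved an object extraction method that uses color histogram intersection as a similarity measure and applied it to face extraction. The improvement is a technique that speeds up the conventional active search method. Using this method, we conducted face extraction experiments and evaluated the stability and processing speed of face extraction.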

DOI DOI2 CiNii
3次元空間共有コミュニケーション技術の研究開発 : 実写画像をベースとしたマルチメディア・アンビアンスコミュニケーションの実現に向けて

望月研二, 齊藤隆弘, 市川忠嗣, 森島繁生, 相澤清晴, 山田邦男, 須賀弘道, 岩澤昭一郎, 山本健一郎, 児玉和也, 苗村健, 斎藤英雄

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 100 ( 633 ) 47 - 54 2001年02月

　概要を見る

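Building a shared space of realistic 3D images, so that people at remote locations can communicate as if they shared the same space or use it for CSCW, is in demand in the field of advanced image media communication. Approaches that capture images of a wide surrounding environment with video or digital cameras and reconstruct a shared 3D image space from these real images have not yet achieved good quality. Based on the characteristics of human vision, a realistic and efficient 3D display is possible using a three-layer structure of long-range, mid-range, and short-range views together with the idea of setting representation. This paper describes a method for constructing a 3D image space for shared-space communication (Multimedia Ambiance Communication) that uses an approximate representation for viewpoint-based stereoscopic viewing.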

CiNii J-GLOBAL
テキスト情報を利用した発話音声の自動セグメンテーションと感情音声分析への応用

吉住悟, 緒方信, 森島繁生

信学会総合全大, 2001 277 - 277 2001年

CiNii
Video translation system using face tracking and lip synchronization

S. Morishima, Shin Ogata, S. Nakamura

Proceedings - IEEE International Conference on Multimedia and Expo 649 - 652 2001年

　概要を見る

We introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database. Also, we propose a method to track motion of the face from the video image. In this system, movement and rotation of the head is detected by template matching using a 3D personal face wire-frame model. By this technique, an automatic video translation can be achieved.

DOI
Model-based lip synchronization with automatically translated synthetic voice toward a multi-modal translation system

Shin Ogata, Kazumasa Murai, Satoshi Nakamura, Shigeo Morishima

Proceedings - IEEE International Conference on Multimedia and Expo 28 - 31 2001年

　概要を見る

In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database.

DOI
Face analysis and synthesis: For duplication expression and impression

Shigeo Morishima

IEEE Signal Processing Magazine 18 ( 3 ) 26 - 34 2001年

　概要を見る

This article reports on the synthesis of a virtual human or avatar with a realistic texture-mapped face, whose facial expressions and actions are controlled by a multimodal input signal. It covers a face fitting tool that builds a 3-D face model from multiview camera images, and the use of the voice signal to determine the mouth shape when the avatar is speaking.

DOI
Three-dimensional image capturing and representation for multimedia ambiance communication

T Ichikawa, S Iwasawa, K Yamada, T Kanamaru, T Naemura, K Aizawa, S Morishima, T Saito

STEREOSCOPIC DISPLAYS AND VIRTUAL REALITY SYSTEMS VIII 4297 132 - 140 2001年

　概要を見る

Multimedia Ambiance Communication is a means of achieving shared-space communication in an immersive environment consisting of an arch-type stereoscopic projection display. Our goal is to enable shared-space communication by creating a photo-realistic three-dimensional (3D) image space that users can feel a part of. The concept of a layered structure defined for painting, such as long-range, mid-range, and short-range views, can be applied to a 3D image space. New techniques, such as two-plane expression, high quality panorama image generation and setting representation for image processing, 3D image representation and generation for photo-realistic 3D image space have been developed. Also, we propose a life-like avatar within the 3D image space. To obtain the characteristics of a user's body, a human subject is scanned using a Cyberware (TM) whole body scanner. The output from the scanner, a range image, is a good start for modeling the avatar's geometric shape. A generic human surface model is fitted to the range image. The obtained model is topologically equivalent even if our method is applied to another subject. If a generic model with motion definitions is employed, common motion rules can be applied to all models made from the generic model.

DOI
Automatic face tracking and model match-move in video sequence using 3D face model

Takafumi Misawa, Kazumasa Murai, Satoshi Nakamura, Shigeo Morishima

Proceedings - IEEE International Conference on Multimedia and Expo 234 - 236 2001年

　概要を見る

A stand-in is a common technique for movies and TV programs in foreign languages. The current stand-in, which substitutes only the voice channel, results in awkward matching to the mouth motion. Videophones with automatic voice translation are expected to be widely used in the near future and may face the same problem without lip-synchronized speaking face image translation. In this paper, we propose a method to track the motion of the face from the video image, which is one of the key technologies for speaking image translation. Almost all earlier tracking algorithms aim to detect feature points of the face. However, these algorithms had problems such as blurring of a feature point between frames and occlusion of feature points by rotation of the head. We propose a method which detects movement and rotation of the head, given the three-dimensional shape of the face, by template matching using a 3D personal face wireframe model. The evaluation experiments are carried out with measured reference data of the head. The proposed method achieves an average angle error of 0.48. This result confirms the effectiveness of the proposed method.

DOI
Multimedia ambiance communication: Based on actual images

Tadashi Ichikawa, Kunio Yamada, Toshifumi Kanamaru, Takeshi Naemura, Kiyoharu Aizawa, Shigeo Morishima, Takahiro Saito

IEEE Signal Processing Magazine 18 ( 3 ) 43 - 50 2001年

　概要を見る

This article reports on creating a photo-realistic image space that looks natural and inviting while minimizing the computing power needed. The image space is created by layering different types of images to form a 3-D image space. The laws of perspective and the characteristics of human visual perception are used to alter the actual images to make them more realistic.

DOI
Dynamic micro aspects of facial movements in elicited and posed expressions using high-speed camera

S Morishima, T Yotsukura, H Yamada, H Uchida, N Tetsutani, S Akamatsu

ROBOT AND HUMAN COMMUNICATION, PROCEEDINGS 371 - 376 2001年

　概要を見る

The present study investigated the dynamic aspects of facial movements in spontaneously elicited and posed facial expressions of emotion. We recorded participants' facial movements when they were shown a set of emotion-eliciting films, and when they posed typical facial expressions. Those facial movements were recorded by a high-speed camera at 250 frames per second. We measured facial movements frame by frame in terms of displacements of facial feature points. Such micro-temporal analysis showed that, although it was very subtle, there exists a characteristic onset asynchrony of each part's movement. Furthermore, a commonality was found in the temporal change of each part's movement, although the speed and the amount of each movement varied with expressional conditions and emotions.

DOI
ニューラルネットワークによる低解像度顔画像からの層性識別に関する一考察

高橋和彦, 四倉達夫, 森島繁生, 鉄谷信二

インテリジェント・システム・シンポジウム講演論文集 11th 529 - 534 2001年

J-GLOBAL
3次元空間共有コミュニケーション技術の研究開発実写画像をベースとしたマルチメディア・アンビアンスコミュニケーションの実現に向けて

望月研二, 斉藤隆弘, 市川忠嗣, 森島繁生, 相沢清晴, 山本健一郎, 児玉和也, 苗村健, 斎藤英雄

電子情報通信学会技術研究報告 100 ( 633(PRMU2000 186-192) ) 47 - 54 2001年

J-GLOBAL
映像処理研究用標準ディジタル動画像の作成とその評価 (放送文化基金S)

斎藤隆弘, 相沢清晴, 柴田正啓, 全へい東, 外村佳伸, 森島繁生

研究報告放送文化基金 ( 19 ) 52 - 57 2001年

J-GLOBAL
ビデオ翻訳システム自動翻訳合成音声とのモデルベースリップシンクの実現

緒方信, 中村哲, 森島繁生

情報処理学会シンポジウム論文集 2001 ( 5 ) 239 - 246 2001年

J-GLOBAL
パンチルトカメラの多段接続により顔の特徴部位追跡

島田昌実, 島田直幸, 森島繁生

電子情報通信学会大会講演論文集 2001 199 2001年

J-GLOBAL
ネットワークシアタ

高橋和彦, くるみ沢順, 四倉達夫, 森島繁生, 鉄谷信二

電子情報通信学会大会講演論文集 2001 282 2001年

J-GLOBAL
複数のカメラによる口・目領域追跡と表情合成のための特徴分析

島田直幸, 森島繁生

電子情報通信学会大会講演論文集 2001 309 2001年

J-GLOBAL
顔特徴点抽出に基づく正面顔画像への標準顔モデルの自動フィッティング

坂本将人, 伊藤圭, 三沢貴文, 森島繁生

電子情報通信学会大会講演論文集 2001 295 2001年

J-GLOBAL
細分割曲面を用いた顔表情合成

伊東大介, 森島繁生

電子情報通信学会大会講演論文集 2001 298 2001年

J-GLOBAL
実画像とワイヤフレームモデルを併用した自動翻訳合成音声とのリップシンクシステムの構築

緒方信, 中村哲, 森島繁生

電子情報通信学会大会講演論文集 2001 299 2001年

J-GLOBAL
舌モデルの付加によるリアルな英語口形の実現と発話アニメーション

石川行一, 伊藤圭, 三沢貴文, 森島繁生

電子情報通信学会大会講演論文集 2001 328 2001年

J-GLOBAL
十字型テンプレートを利用したリアルタイム視線追跡と瞬き検出

笠嶋健吾, 島田直幸, 森島繁生

電子情報通信学会大会講演論文集 2001 202 2001年

J-GLOBAL
空間曲線上の点の直接操作による頭髪スタイル制御とカット機能を有するヘアスタイルデザインシステム

杉崎英嗣, 佐野和成, 森島繁生

電子情報通信学会大会講演論文集 2001 113 2001年

J-GLOBAL
3次元顔モデルを用いたビデオ映像中の自動顔トラッキングとモデルマッチムーブ

三沢貴文, 村井和昌, 中村哲, 森島繁生

電子情報通信学会大会講演論文集 2001 ( 2 ) 293 - 293 2001年

CiNii J-GLOBAL
HyperMask 任意表情及び人物表出可能な仮面の構築

四倉達夫, BINSTED K, NIELSEN F, CLAUDIO P, 鉄谷信二, 森島繁生

電子情報通信学会大会講演論文集 2001 110 2001年

J-GLOBAL
高速度カメラを用いた表情表出時の顔面動作の分析および微妙な表情の合成

長谷川佳之, 四倉達夫, 内田英子, 山田寛, 森島繁生

電子情報通信学会大会講演論文集 2001 297 2001年

J-GLOBAL
韻律情報のコピーおよび自由制御可能な声質変換のためのGUIツールの作成

田野泰史, 緒方信, 森島繁生

電子情報通信学会大会講演論文集 2001 278 2001年

J-GLOBAL
複数の基本顔の恒等写像学習による感情空間構成

高橋光紀, 伊東大介, 森島繁生

電子情報通信学会大会講演論文集 2001 323 2001年

J-GLOBAL
唇の特徴点抽出と音声分析を併用した音声に同期する唇画像の実時間合成

大室学, 伊藤圭, 島田直幸, 森島繁生

電子情報通信学会大会講演論文集 2001 310 2001年

J-GLOBAL
仮想空間を用いた多人数コミュニケーションシステム構築

伊藤圭, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 2001 294 2001年

J-GLOBAL
映像情報の検索技術と編集処理色ヒストグラムインターセクションを用いたリアルタイム人物顔抽出

吉村哲也, 市川忠嗣, 森島繁生, 相沢清晴, 斉藤隆弘

映像情報メディア学会誌 55 ( 3 ) 412 - 416 2001年

DOI J-GLOBAL
高速度カメラで捉えた自発表情と演技表情の動的変化

山田寛, 内田英子, 四倉達夫, 森島繁生, 鉄谷信二, 赤松茂

電子情報通信学会技術研究報告 100 ( 712(HCS2000 56-61) ) 27 - 34 2001年

J-GLOBAL
空間曲線上の点の直接操作によるヘアスタイルデザインシステム及びカット機能の実現

杉崎英嗣, 佐野和成, 森島繁生

電子情報通信学会技術研究報告 100 ( 716(MVE2000 112-127) ) 9 - 16 2001年

J-GLOBAL
3次元個人顔モデルを用いたビデオ映像中の顔の自動トラッキング及びモデルマッチムーブ処理

三沢貴文, 村井和昌, 中村哲, 森島繁生

電子情報通信学会技術研究報告 100 ( 716(MVE2000 112-127) ) 1 - 8 2001年

J-GLOBAL
顔表情認知におけるサイズの効果

川嶋英嗣, 小田浩一, 四倉達夫, 森島繁生

Vision 13 ( 3 ) 206 - 207 2001年

J-GLOBAL
リアルな英語口形と発話アニメーション

石川行一, 森島繁生, 柴田昌明

電気学会産業応用部門大会講演論文集 2001 ( 2 ) 993 2001年

J-GLOBAL
高速度カメラを用いた顔面動作の分析および表情合成

四倉達夫, 内田英子, 山田寛, 赤松茂, 鉄谷信二, 森島繁生

電子情報通信学会技術研究報告 101 ( 298(OFS2001 20-28) ) 15 - 22 2001年

J-GLOBAL
ネットワークシアタ仮想環境とコンピュータネットワークによるコンテンツ作成システム

高橋和彦, くるみ沢順, 四倉達夫, 森島繁生, 鉄谷信二, 中津良平

画像電子学会誌 30 ( 5 ) 555 - 564 2001年

　概要を見る

This paper proposes a new concept for a movie production system that utilizes a virtual environment and a computer network. The basis of the concept is that via a computer network anybody can be: (1) a performer, (2) a producer, and (3) the audience. A prototype composed of a server and client computers has been developed and is used to produce a movie based on both motion capture systems and CG animation. Experimental results confirm the feasibility of the system. © 2001, The Institute of Image Electronics Engineers of Japan. All rights reserved.

DOI J-GLOBAL
自発・演技表情表出時における顔面動作分析および表情合成

四倉達夫, 内田英子, 山田寛, 赤松茂, 森島繁生

電子情報通信学会技術研究報告 101 ( 333(HCS2001 26-31) ) 39 - 46 2001年

J-GLOBAL
インタラクティブ性を重視した擬人化音声対話エージェントの基本設計

川本真一, 下平博, 嵯峨山茂樹, 新田恒雄, 西本卓也, 中村哲, 伊藤克亘, 森島繁生, 宇津呂武仁

人工知能学会言語・音声理解と対話処理研究会資料 33rd 63 - 68 2001年

J-GLOBAL
細分割曲面を用いた顔表情合成

伊東, 森島繁生

電子情報通信学会2001年総合大会講演論文集, 基礎境界 298 298 - 298 2001年

CiNii J-GLOBAL
高速度カメラを用いた顔面動作の分析および表情合成

四倉達夫, 内田英子, 山田寛, 赤松茂, 鉄谷信二, 森島繁生

映像情報メディア学会技術報告 25 ( 0 ) 15 - 22 2001年

　概要を見る

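This paper uses a high-speed camera to record facial expressions when a person shows a natural expression (spontaneous expression) and when a person performs universal, typical expressions (posed expression), and quantitatively analyzes facial movement based on the displacement of feature points placed on each part of the face. We also generated animation of a CG face model from the measurements. In both the spontaneous and posed conditions, the differences in movement onset among facial parts were minute, which demonstrates the value of using a high-speed camera. Characteristic differences in the amount and speed of facial movement were found across emotions and across expression conditions, yet the pattern of change in movement itself showed an interesting commonality. For the face model animation, the approach produced more natural facial expressions than keyframe animation with linear interpolation.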

DOI CiNii
ネットワークシアタ: 仮想環境とコンピュータネットワークによるコンテンツ作成システム

高橋和彦, 楜沢順, 四倉達夫, 森島繁生, 鉄谷信二, 中津良平

画像電子学会誌 30 ( 5 ) 555 - 564 2001年

　概要を見る

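This paper proposes a network theater system in which users freely produce movie content using a virtual space and a computer network, based on the concept that (1) anyone can be an actor, (2) anyone can be a director, and (3) content can be viewed over the network. We built a prototype as a client-server system and confirmed the operation and feasibility of the basic network theater functions: the story can unfold through the performers' acting, the director can control the story by changing event objects and editing the story script, and the content can be viewed via the Internet.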

DOI CiNii J-GLOBAL
頭髪のスタイリングとアニメーション表現

岸啓補, 森島繁生

電子情報通信学会論文誌. D-2, 情報・システム 2-パターン処理 83 ( 12 ) 2716 - 2724 2000年12月

　概要を見る

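Toward the synthesis of virtual humans in cyberspace and the realization of communication systems, the synthesis of human images by computer graphics has been attracting attention. This paper describes a method for representing hair, which is considered especially difficult to synthesize in human CG. Although hair is visually important in human images, it is often replaced by a simple curved surface or part of the background. Methods that represent hair using mapping techniques have produced good results but are unsuitable for representing motion. We therefore represent hair with space curves rather than textures or polygons. We further simplify hairstyle design by modeling hair in units of tufts, which are partial groups of hair. This paper describes the new hairstyle design system, the tuft modeling method, the rendering method, collision detection by a four-branch method, and motion representation. Using the system, we designed hair and produced an animation of it blowing in the wind.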

CiNii
2. 通信・放送機構本郷空間共有リサーチセンターの研究紹介 : 空間共有コミュニケーションプロジェクト

市川忠嗣, 相澤清晴, 森島繁生, 齊藤隆弘

画像電子学会誌 29 ( 6 ) 854 - 857 2000年11月

CiNii J-GLOBAL
Hyper Mask:表情・口形状制御可能な顔モデルを用いた仮面の構築

四倉達夫, 鉄谷信二, 森島繁生

研究会講演予稿 182 1 - 6 2000年11月

CiNii J-GLOBAL
擬人化音声対話エージェント開発と周辺技術 : (3)対話における顔画像生成

森島繁生, 四倉達夫

情報処理学会研究報告. SLP, 音声言語情報処理 33 ( 101 ) 13 - 18 2000年10月

　概要を見る

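In recent years, research on user-friendly human interfaces has flourished. User-friendly, needless to say, means making dialogue between computer and human smooth. One possible form is a system that displays an anthropomorphic agent on screen; because the agent understands and expresses non-verbal as well as verbal information, users can communicate with the computer as naturally as people talking face to face. The key element of such a system is how to generate an anthropomorphic agent so realistic that it is indistinguishable from a real human and even conveys the breath of life. To achieve this, the first challenge for expression synthesis is how faithfully the expressions and impressions of a real person can be copied to the avatar's expressions. For dialogue in particular, real-time operation with little delay is essential. This paper introduces the face synthesis techniques the authors are currently working on and clarifies the position of this research in the development of anthropomorphic spoken dialogue agents.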

CiNii J-GLOBAL
擬人化技術 : リアルなコミュニケーション環境生成のための表情分析・合成

森島繁生

ヒューマンインタフェース学会誌 = Journal of Human Interface Society : human interface 2 ( 3 ) 41 - 46 2000年08月

CiNii
高速度カメラを用いた顔面表情の動的変化に関する分析

内田英子, 四倉達夫, 森島繁生, 山田寛, 大谷淳, 赤松茂

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 99 ( 722 ) 1 - 6 2000年03月

　概要を見る

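Focusing on facial expressions, we experimentally examined the differences between expressions under intentional control and those that appear spontaneously with some emotional arousal, particularly differences in dynamic change. Changes in participants' facial expressions were filmed with a high-speed camera under two conditions: an intentional (instructed movement) condition and a spontaneous condition. In the intentional condition, participants performed six basic expressions following facial action instructions. In the spontaneous condition, emotion-eliciting videos (happiness, surprise, anger, sadness, disgust, fear) were shown to elicit natural expressions. The dynamic changes of the filmed expressions (displacements of feature points) were measured with an image analysis tool.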

CiNii J-GLOBAL
実写画像をベースとしたマルチメディア・アンビアンスコミュニケーションの提案

市川忠嗣, 吉村哲也, 山田邦男, 金丸利文, 須賀弘道, 岩澤昭一郎, 苗村健, 相澤清晴, 森島繁生, 齋藤隆弘

映像情報メディア学会誌 : 映像情報メディア = The journal of the Institute of Image Information and Television Engineers 54 ( 3 ) 435 - 439 2000年03月

　概要を見る

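Using real images captured with video or digital cameras, we propose a photorealistic 3D image space that is represented and reconstructed as a layered structure of long-range, mid-range, and short-range views, propose communication through this image space, and report how to realize it.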

DOI CiNii J-GLOBAL
ボクセル表現による衝突判定法とバネモデルによる頭髪運動表現

岸啓補, 森島繁生

電子情報通信学会総合大会講演論文集 2000 ( 2 ) 113 - 113 2000年03月

CiNii J-GLOBAL
動画像分析からの3次元表情パラメータの推定と表情再合成

小野哲治, 伊東大介, 森島繁生

電子情報通信学会総合大会講演論文集 2000 ( 2 ) 284 - 284 2000年03月

CiNii
流体場を用いた頭髪アニメーションと頭部との衝突判定法

奥谷敦, 岸啓補, 森島繁生

電子情報通信学会総合大会講演論文集 2000 ( 2 ) 353 - 353 2000年03月

CiNii
自然音声の分析に基づく音声への感情情報の付加

加川大志郎, 四倉達夫, 森島繁生

電子情報通信学会総合大会講演論文集 2000 258 - 258 2000年03月

CiNii J-GLOBAL
韻律情報制御のための感情音声合成GUIツール

緒方信, 四倉達夫, 森島繁生

電子情報通信学会総合大会講演論文集 2000 259 - 259 2000年03月

CiNii
高速度カメラを用いた顔面動作の分析

四倉達夫, 内田英子, 山田寛, 森島繁生, 赤松茂, 大谷淳

電子情報通信学会総合大会講演論文集 2000 260 - 260 2000年03月

CiNii J-GLOBAL
複数アングル画像からの3次元頭部モデルの生成と表情合成

河合丈治, 三澤貴文, 武藤淳一, 森島繁生

電子情報通信学会総合大会講演論文集 2000 ( 582(HIP99 64-75) ) 261 - 261 2000年03月

CiNii J-GLOBAL
画像と音声を併用したリアルタイムリップシンク

川又正太, 島田直幸, 武藤淳一, 森島繁生

電子情報通信学会総合大会講演論文集 2000 285 - 285 2000年03月

CiNii
判別分析法による音声の感情推定及び実時間メディア変換システム

廣瀬陽介, 四倉達夫, 森島繁生

電子情報通信学会総合大会講演論文集 2000 286 - 286 2000年03月

CiNii
複数視点画像を用いた3次元頭部モデルの構築

武藤淳一, 森島繁生

電子情報通信学会総合大会講演論文集 2000 329 - 329 2000年03月

CiNii J-GLOBAL
複数アングル画像からの3次元頭部モデルの作成と表情合成

伊藤圭, 三澤貴文, 武藤淳一, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 99 ( 582 ) 7 - 12 2000年01月

　概要を見る

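For communication in cyberspace, we propose a method for easily creating a 3D head model of a specific person as an avatar from head images taken from multiple angles, without special equipment such as a range scanner. A wireframe is fitted to head images taken from arbitrary angles using a GUI we developed, and the head texture is composed by blending images taken from the front and the side in particular. Based on this model, we also revised the parameters that define mouth shape during speech, making it possible to represent mouth shapes that look natural from any direction.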

CiNii
韻律情報の制御による感情音声合成のための声質変換

緒方信, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 99 ( 582 ) 53 - 58 2000年01月

　概要を見る

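If emotional speech can be synthesized, not only can non-verbal communication between humans and machines be realized, but new communication systems that also facilitate dialogue between people become possible. However, adding emotion to natural speech requires controlling prosodic information while preserving the quality of the original speech, the spoken content, and the speaker information. In this paper, we developed a system that can add emotional expression by extracting each vowel in the speech sequence and controlling its pitch, changing intonation phrase by phrase, and further controlling speaking rate and loudness. With this method, synthetic emotional speech can be created from neutral speech while preserving the quality of the original speech.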

CiNii J-GLOBAL
対話システムにおける顔面像生成

森島繁生

情報処理学会研究報告 13 - 18 2000年

CiNii
3D face expression estimation and generation from 2D image based on a physically constraint model

ISHIKAWA T.

IEICE Transactions on Information and Systems E83-D ( 2 ) 251 - 258 2000年01月

　概要を見る

Muscle-based face image synthesis is one of the most realistic approaches to the realization of a life-like agent in computers. A facial muscle model is composed of facial tissue elements and simulated muscles. In this model, the forces acting on a facial tissue element are calculated from the contraction of each muscle string, so the combination of muscle contraction forces determines a specific facial expression. To generate a specific face image, these muscle parameters have been determined on a trial-and-error basis by comparing a sample photograph with an image generated using our Muscle-Editor. In this paper, we propose a strategy for automatically estimating facial muscle parameters from the movements of 2D markers located on a face using a neural network. This corresponds to non-realtime 3D facial motion capture from a 2D camera image under a physics-based constraint.

CiNii
Multimedia ambience communication based on actual moving pictures in a stereoscopic projection display environment

K Yamada, T Ichikawa, T Naemura, K Aizawa, S Morishima, T Saito

STEREOSCOPIC DISPLAYS AND VIRTUAL REALITY SYSTEMS VII 3957 303 - 311 2000年

　概要を見る

The Multi-media Ambiance Project of TAO has been researching and developing an image space that can be shared by people in different locations and can lend a real sense of presence. The image space is mainly based on photo-realistic texture, and some deformations which depend on human vision characteristics or pictorial expressions are being applied. We aim to accomplish shared-space communication by an immersive environment consisting of the image space stereoscopically projected on an arched screen. We refer to this scheme as "ambiance communication."
The first half of this paper presents global descriptions on basic concepts of the project, the display system and the 3-camera image capturing system. And the latter half focuses on two methods to create a photo-realistic image space using the captured images of a natural environment.
One is the divided expression of the long-range view and ground, which not only gives more realistic setting of the ground but commands more natural view when synthesized with other objects and gives potentialities of deformations for some purposes.
The other is the high quality panorama generation based on even-odd field integration and image enhancement by a two dimensional quadratic Volterra filter.
Realtime face analysis and synthesis using neural network

S Morishima

NEURAL NETWORKS FOR SIGNAL PROCESSING X, VOLS 1 AND 2, PROCEEDINGS 1 13 - 22 2000年

　概要を見る

In this paper we describe recent research results on how to generate an avatar's face in real time that exactly copies a real person's face. For the synthesis of a realistic avatar, it is very important to precisely duplicate the emotion and impression contained in the original face image and voice. A face fitting tool using multi-angle camera images is introduced to make a realistic 3D face model with texture and geometry very close to the original. When an avatar is speaking, the voice signal is essential for deciding the mouth shape. A real-time mouth shape control mechanism is therefore proposed that converts speech parameters to lip shape parameters using a multilayered neural network. For dynamic modeling of facial expression, a muscle structure constraint is introduced to generate facial expressions naturally with a few parameters. We also tried to obtain the muscle parameters that determine an expression automatically from local motion vectors on the face, calculated by optical flow in a video sequence. Finally, we present an approach that enables the modeling of emotions appearing on faces. A system with this approach helps to analyze, synthesize and code face images at the emotional level.
パン,チルト,ズーム制御可能なカメラによる顔特徴の実時間追跡

森島繁生, 島田直幸

成けい大学工学研究報告 37 ( 1 ) 1 - 6 2000年

J-GLOBAL
韻律情報の制御による感情音声合成のための声質変換

緒方信, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告 99 ( 582(HIP99 64-75) ) 53 - 58 2000年

J-GLOBAL
複数アングル画像からの3次元頭部モデルの作成と表情合成

伊藤圭, 三沢貴文, 武藤淳一, 森島繁生

電子情報通信学会技術研究報告 99 ( 582(HIP99 64-75) ) 7 - 12 2000年

J-GLOBAL
EMGによる筋肉モデルの制御と精密な表情合成

塚田章之, 伊東大介, 森島繁生

電子情報通信学会大会講演論文集 2000 284 2000年

J-GLOBAL
レイヤモデルによるヘアスタイルデザインシステムの構築

佐野和成, 岸啓輔, 森島繁生

電子情報通信学会大会講演論文集 2000 114 2000年

J-GLOBAL
空間共有コミュニケーションにおける表情入力のための人物顔追跡機能の実現

吉村哲也, 市川忠嗣, 森島繁生, 相沢清晴, 斉藤隆弘

電子情報通信学会大会講演論文集 2000 293 2000年

J-GLOBAL
仮想空間上におけるリアルな三次元口形状の作成

伊藤圭, 三沢貴文, 武藤淳一, 森島繁生

電子情報通信学会大会講演論文集 2000 328 2000年

J-GLOBAL
画像と音声を併用したリアルタイムリップシンク

川又正太, 島田直幸, 武藤淳一, 森島繁生

電子情報通信学会大会講演論文集 2000 285 2000年

J-GLOBAL
動画像分析からの3次元表情パラメータの推定と表情再合成

小野哲治, 伊東大介, 森島繁生

電子情報通信学会大会講演論文集 2000 284 2000年

J-GLOBAL
複数視点画像を用いた3次元頭部モデルの構築

武藤淳一, 森島繁生

電子情報通信学会大会講演論文集 2000 329 2000年

J-GLOBAL
ボクセル表現による衝突判定法とバネモデルによる頭髪運動表現

岸啓輔, 森島繁生

電子情報通信学会大会講演論文集 2000 113 2000年

J-GLOBAL
流体場を用いた頭髪アニメーションと頭部との衝突判定法

奥谷敦, 岸啓輔, 森島繁生

電子情報通信学会大会講演論文集 2000 353 2000年

J-GLOBAL
複数アングル画像からの3次元頭部モデルの生成と表情合成

河合丈治, 三沢貴文, 武藤淳一, 森島繁生

電子情報通信学会大会講演論文集 2000 261 2000年

J-GLOBAL
韻律情報制御のための感情音声合成GUIツール

緒方信, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 2000 259 2000年

J-GLOBAL
判別分析法による音声の感情推定及び実時間メディア変換システム

広瀬陽介, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 2000 286 2000年

J-GLOBAL
高速度カメラを用いた顔面動作の分析

四倉達夫, 内田英子, 山田寛, 森島繁生, 赤松茂, 大谷淳

電子情報通信学会大会講演論文集 2000 260 2000年

J-GLOBAL
自然音声の分析に基づく音声への感情情報の付加

加川大志郎, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 2000 258 2000年

J-GLOBAL
顔の認識・合成のための標準ツール

森島繁生, 八木康史

システム/制御/情報 44 ( 3 ) 121 - 127 2000年

DOI CiNii J-GLOBAL
3次元映像情報メディア技術実写画像をベースとしたマルチメディア・アンビアンスコミュニケーションの提案

市川忠嗣, 吉村哲也, 山田邦男, 金丸利文, 須賀弘道, 苗村健, 相沢清晴, 森島繁生, 斉藤隆弘

映像情報メディア学会誌 54 ( 3 ) 435 - 439 2000年

DOI J-GLOBAL
高速度カメラを用いた顔面表情の動的変化に関する分析

内田英子, 四倉達夫, 森島繁生, 山田寛, 大谷淳, 赤松茂

電子情報通信学会技術研究報告 99 ( 722(HIP99 76-81) ) 1 - 6 2000年

J-GLOBAL
顔の認識・合成と新メディアの可能性

森島繁生

画像センシングシンポジウム講演論文集 6th 415 - 424 2000年

J-GLOBAL
擬人化技術リアルなコミュニケーション環境生成のための表情分析・合成

森島繁生

ヒューマンインタフェース学会誌 2 ( 3 ) 183 - 188 2000年

CiNii J-GLOBAL
擬人化音声対話エージェント開発と周辺技術 (3) 対話における顔画像生成

森島繁生, 四倉達夫

情報処理学会研究報告 2000 ( 101(SLP-33) ) 13 - 18 2000年

J-GLOBAL
HYPER MASK 表情・口形状制御可能な顔モデルを用いた仮面の構築

四倉達夫, BINSTED K, NIELSEN F, PINHANEZ C, 鉄谷信二, 森島繁生

画像電子学会研究会講演予稿 182nd 1 - 6 2000年

J-GLOBAL
通信・放送機構本郷空間共有リサーチセンターの研究紹介空間共有コミュニケーションプロジェクト

市川忠嗣, 相沢清晴, 森島繁生, 斎藤隆弘

画像電子学会誌 29 ( 6 ) 854 - 857 2000年

J-GLOBAL
頭髪のスタイリングとアニメーション表現

岸啓補, 森島繁生

電子情報通信学会論文誌 D-2 J83-D-2 ( 12 ) 2716 - 2724 2000年

J-GLOBAL
レイヤモデルによるヘアスタイルデザインシステムの構築

佐野和成, 岸啓補, 森島繁生

2000年電子情報通信学会総合大会, 情報システム2 2000 114 - 114 2000年

CiNii J-GLOBAL
空間共有コミュニケーションにおける表情入力のための人物顔追跡機能の実現

吉村哲也, 市川忠嗣, 森島繁生, 相澤清晴, 齊藤隆弘

2000年信学総大 12 293 - 293 2000年

CiNii
EMGによる筋肉モデルの制御と精密な表情合成

塚田, 伊東大介, 森島繁生

電子情報通信学会2000年全国大会講演論文集, March 284 - 284 2000年

CiNii
仮想空間上におけるリアルな三次元口形状の作成

伊藤圭, 三澤貴文, 武藤淳一, 森島繁生

2000信学総大 328 - 328 2000年

CiNii
顔の認識・合成のための標準ツール (豊かなヒューマンコミュニケーションのための顔とジェスチャの認識・合成技術特集号)

森島繁生, 八木康史

システム／制御／情報 44 ( 3 ) 121 - 127 2000年

DOI CiNii
パン、チルト、ズーム制御可能なカメラによる顔特徴の実時間追跡

森島繁生, 島田直幸

成蹊大学工学研究報告 37 ( 1 ) 1 - 6 2000年01月

CiNii
Real-time voice driven facial animation system

Proceedings of the IEEE International Conference on Systems, Man and Cybernetics 6 1999年12月

　概要を見る

Recently, computers can create a cyberspace that users can walk through using interactive virtual reality techniques. An avatar in cyberspace can bring us a virtual face-to-face communication environment. In this paper, we realize an avatar which has a real face in cyberspace to construct a multi-user communication system by voice transmission through a network. Voice from a microphone is transmitted and analyzed, then the mouth shape and facial expression of the avatar are synchronously estimated and synthesized in real time. We also introduce an entertainment application of a real-time voice-driven synthetic face. This project, named 'Fifteen Seconds of Fame', is an example of an interactive movie.
アクティブカメラによる視線追跡・自動 Lip Reading

四倉達夫, 島田直幸, 森島繁生, 大谷淳

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 99 ( 451 ) 31 - 36 1999年11月

　概要を見る

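This paper proposes a method that uses two active cameras, whose pan, tilt, and zoom can be controlled, to continuously capture and automatically track the user's mouth and eye regions at high resolution. The images from each camera are binarized to detect the mouth and eye regions, and the camera's rotation direction, rotation speed, and zoom speed are determined from the position and area of each region in the captured image to control the camera. By analyzing features of the binarized images of the mouth and eye regions extracted by this tracking method, lip reading, blink detection, and gaze tracking became possible.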

CiNii J-GLOBAL
コンピュータグラフィックスを用いた矯正治療による表情の変化

寺田員人, 宮永美知代, 森島繁生, 花田晃治

歯科審美 = Journal of esthetic dentistry 12 ( 1 ) 37 - 51 1999年09月

CiNii
生理学的手法を用いた顔面筋肉モデルの構築

伊東大介, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 99 ( 122 ) 17 - 24 1999年06月

　概要を見る

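In recent years, representing human facial expressions with CG (computer graphics) has become common for movie special effects and as agents for human interfaces, and its quality has reached a level close to live action. However, building such representations currently requires enormous labor from animators and others, substantial funding, and long production periods. To generate realistic face images, this paper builds a system that produces expressions using a facial muscle model with skin tissue and facial muscles, measures the myoelectric signal of each facial muscle with a device that records the signal corresponding to changes in that muscle, and models the contraction of each muscle. It also became possible to build a system that controls the facial muscles of the model from the measured data to realize realistic modeling of mouth shapes.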

CiNii J-GLOBAL
表情筋運動モデルによる顔面モデルの構築

石川貴博, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 98 ( 682 ) 21 - 28 1999年03月

　概要を見る

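The facial muscle model is one method of representing faces with computer graphics (CG). At present, the facial muscles in such models are represented as simple line segments and are not given the complex shapes of actual facial muscles. Focusing on this point, this paper uses shape data of the facial muscles and attempts to represent their motion. The motion is realized by assuming each facial muscle is a collection of springs and solving the equations of motion. We also assume the mandible is a rigid body and propose controlling mandible motion with the masticatory muscles: the mandible rotates under the forces produced when the masticatory muscles contract. As a result, motion representation that accounts for facial muscle shape, and mandible motion driven by the masticatory muscles, became possible.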

CiNii
仮想人物によるサイバースペース上でのコミュニケーションシステムの構築

四倉達夫, 武藤淳一, 今村達也, 藤井英史, 森島繁生

電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 98 ( 683 ) 39 - 46 1999年03月

　概要を見る

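With the rapid growth of computer processing power, interaction environments in which multiple users share cyberspace over a network and engage in dialogue and cooperative work are becoming available. To improve the sense of immersion and presence in this virtual space, human-to-human communication must be realized with the same naturalness as in real space. This paper therefore proposes a system that generates, in the virtual space, an avatar whose face is a projection of the user's own appearance, estimates mouth shape during conversation and emotion in real time in synchronization with natural speech input from a microphone, and synthesizes the avatar's facial expressions. With this system, a face-to-face dialogue environment in which many people can participate in cyberspace became feasible.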

CiNii J-GLOBAL
D-12-72 空間周波数を用いた口領域のモーションキャプチャ

川俣充, 武藤淳一, 近藤淳, 森島繁生

電子情報通信学会総合大会講演論文集 1999 ( 2 ) 245 - 245 1999年03月

CiNii J-GLOBAL
D-12-82 オプティカルフローによる筋肉パラメータの自動推定と表情再合成

伊東大介, 岩澤昭一郎, 森島繁生

電子情報通信学会総合大会講演論文集 1999 ( 2 ) 255 - 255 1999年03月

CiNii
D-12-91 全身像の三眼シルエット画像に基づく姿勢推定の検討

岩澤昭一郎, 大谷淳, 森島繁生

電子情報通信学会総合大会講演論文集 1999 ( 2 ) 264 - 264 1999年03月

CiNii
D-12-154 流体力学に基づくCGによる風に靡くリアルな頭髪の運動表現

平山聡, 岸啓補, 森島繁生

電子情報通信学会総合大会講演論文集 1999 ( 2 ) 327 - 327 1999年03月

CiNii J-GLOBAL
A-14-3 リアルな3次元表情編集ツールの作成

千明太郎, 藤井英史, 四倉達夫, 森島繁生

電子情報通信学会総合大会講演論文集 1999 304 - 304 1999年03月

CiNii J-GLOBAL
A-14-4 音声からの感情推定と実時間メディア変換システム

今村達也, 四倉達夫, 森島繁生

電子情報通信学会総合大会講演論文集 1999 305 - 305 1999年03月

CiNii J-GLOBAL
A-15-6 筋肉パラメータに基づく3次元感情空間の構築

重松陽志, 石川貴博, 森島繁生

電子情報通信学会総合大会講演論文集 1999 330 - 330 1999年03月

CiNii
A-15-17 Gabor Waveletを使用した動画像からの顔表情認識

近藤淳, 森島繁生

電子情報通信学会総合大会講演論文集 1999 341 - 341 1999年03月

CiNii
A-15-19 体積を持った表情筋運動モデルによる顔面モデルの構築

石川貴博, 森島繁生

電子情報通信学会総合大会講演論文集 1999 343 - 343 1999年03月

CiNii J-GLOBAL
A-16-11 3次元アバタの構築とリアルタイム対話システム

三澤貴文, 四倉達夫, 藤井英史, 森島繁生

電子情報通信学会総合大会講演論文集 1999 367 - 367 1999年03月

CiNii J-GLOBAL
A-16-12 3次元頭部モデルを用いた実時間メディア変換

藤井英史, 森島繁生

電子情報通信学会総合大会講演論文集 1999 368 - 368 1999年03月

CiNii J-GLOBAL
サイバースペース上の仮想人物による実時間対話システムの構築

四倉達夫, 藤井英史, 森島繁生

情報処理学会論文誌 40 ( 2 ) 677 - 686 1999年02月

　概要を見る

Recent advances in computer performance have produced interaction environments in which multiple users can share cyberspace over a network and communicate with each other to work cooperatively. To improve the sense of immersion and presence in this virtual space, communication between people must be realized as naturally as in real space. An avatar in cyberspace can bring us a virtual face-to-face communication environment. In this paper, we realize an avatar which has a real face in cyberspace and construct a multiuser communication system by voice transmission through a network. Voice from a microphone is analyzed and transmitted, then the mouth shape of the avatar is synchronously estimated and synthesized in real time. Facial expression is also controlled in real time by key input.

CiNii J-GLOBAL
表情の分析・合成を用いたサイバースペース内でのフェーストローフェース対話システム

森島繁生

電子情報通信学会技術研究報告. IE, 画像工学 98 ( 551 ) 61 - 68 1999年01月

Interactive virtual reality techniques now let computers create a cyberspace that users can walk through. An avatar in cyberspace can bring us a virtual face-to-face communication environment. In this paper, an avatar with a real face is realized in cyberspace, and a multi-user communication system is constructed by voice transmission through a network. Voice from a microphone is transmitted and analyzed, and the mouth shape and facial expression of the avatar are then synchronously estimated and synthesized in real time. An entertainment application of a real-time voice-driven synthetic face is also introduced as an example of an interactive movie. Finally, a face motion capture system using a physics-based face model is introduced.

CiNii
Real-time, 3D estimation of human body postures from trinocular images

Shoichiro Iwasawa, Jun Ohya, Kazuhiko Takahashi, Tatsumi Sakaguchi, Sinjiro Kawato, Kazuyuki Ebihara, Sigeo Morishima

Proceedings - IEEE International Workshop on Modelling People, MPeople 1999 3 - 10 1999年

This paper proposes a new real-time method for estimating human postures in 3D from trinocular images. In this method, an upper body orientation detection and a heuristic contour analysis are performed on the human silhouettes extracted from the trinocular images so that representative points such as the top of the head can be located. The major joint positions are estimated based on a genetic algorithm based learning procedure. 3D coordinates of the representative points and joints are then obtained from the two views by evaluating the appropriateness of the three views. The proposed method implemented on a personal computer runs in real-time (30 frames/second). Experimental results show high estimation accuracies and the effectiveness of the view selection process.

DOI
ヒューマンインタフェースとインタラクションサイバースペース上の仮想人物による実時間対話システムの構築

四倉達夫, 藤井英史, 森島繁生

情報処理学会論文誌 40 ( 2 ) 677 - 686 1999年

J-GLOBAL
筋肉パラメータに基づく3次元感情空間の構築

重松陽志, 石川貴博, 森島繁生

電子情報通信学会大会講演論文集 1999 330 1999年

J-GLOBAL
GaborWaveletを使用した動画像からの顔表情認識

近藤淳, 森島繁生

電子情報通信学会大会講演論文集 1999 341 1999年

J-GLOBAL
オプティカルフローによる筋肉パラメータの自動推定と表情再合成

伊東大介, 岩沢昭一郎, 森島繁生

電子情報通信学会大会講演論文集 1999 255 1999年

J-GLOBAL
Pan Tilt Zoom Controllable Cameraによる目及び唇形状抽出・追跡

島田直幸, 武藤淳一, 森島繁生

電子情報通信学会大会講演論文集 1999 244 1999年

J-GLOBAL
空間周波数を用いた口領域のモーションキャプチャ

川俣充, 武藤淳一, 近藤淳, 森島繁生

電子情報通信学会大会講演論文集 1999 245 1999年

J-GLOBAL
3次元アバタの構築とリアルタイム対話システム

三沢貴文, 四倉達夫, 藤井英史, 森島繁生

電子情報通信学会大会講演論文集 1999 367 1999年

J-GLOBAL
流体力学に基づくCGによる風になびくリアルな頭髪の運動表現

平山聡, 岸啓補, 森島繁生

電子情報通信学会大会講演論文集 1999 327 1999年

J-GLOBAL
音声からの感情推定と実時間メディア変換システム

今村達也, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 1999 305 1999年

J-GLOBAL
体積を持った表情筋運動モデルによる顔面モデルの構築

石川貴博, 森島繁生

電子情報通信学会大会講演論文集 1999 343 1999年

J-GLOBAL
3次元頭部モデルを用いた実時間メディア変換

藤井英史, 森島繁生

電子情報通信学会大会講演論文集 1999 368 1999年

J-GLOBAL
全身像の三眼シルエット画像に基づく姿勢推定の検討

岩沢昭一郎, 大谷淳, 森島繁生

電子情報通信学会大会講演論文集 1999 264 1999年

J-GLOBAL
リアルな3次元表情編集ツールの作成

千明太郎, 藤井英史, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 1999 304 1999年

J-GLOBAL
音声に含まれる感情のモデル化と感情音声合成ツール

古村祐子, 四倉達夫, 森島繁生

電子情報通信学会大会講演論文集 1999 308 1999年

J-GLOBAL
表情筋運動モデルによる顔面モデルの構築

石川貴博, 森島繁生

電子情報通信学会技術研究報告 98 ( 682(HCS98 46-48) ) 21 - 28 1999年

J-GLOBAL
仮想人物によるサイバースペース上でのコミュニケーションシステムの構築

四倉達夫, 武藤淳一, 今村達也, 藤井英史, 森島繁生

電子情報通信学会技術研究報告 98 ( 683(HIP98 52-61) ) 39 - 46 1999年

J-GLOBAL
生理学的手法を用いた顔面筋肉モデルの構築

伊東大介, 四倉達夫, 森島繁生

電子情報通信学会技術研究報告 99 ( 122(HCS99 6-11) ) 17 - 24 1999年

J-GLOBAL
顔表情の実時間変形によるインタラクティブムービーの提案

岩沢昭一郎, 森島繁生

電気学会電子・情報・システム部門大会講演論文集 1999 495 - 496 1999年

J-GLOBAL
コンピュータグラフィックスを用いた矯正治療による表情の変化

寺田員人, 宮永美知代, 森島繁生, 花田晃治

歯科審美 12 ( 1 ) 37 - 51 1999年

J-GLOBAL
アクティブカメラによる視線追跡・自動Lip Reading

四倉達夫, 島田直幸, 森島繁生, 大谷淳

電子情報通信学会技術研究報告 99 ( 451(HIP99 33-46) ) 31 - 36 1999年

J-GLOBAL
空間共有コミュニケーションにおける表情入力のための顔抽出

吉村哲也, 市川忠嗣, 森島繁生, 相沢清晴, 斉藤隆弘

映像情報メディア学会冬季大会講演予稿集 1999 35 1999年

J-GLOBAL
Pan Tilt Zoom Controllable Camera による目及び唇形状抽出追跡

島田直幸, 武藤淳一, 森島繁生

電子情報通信学会 1999年総合大会講演論文集, 3 244 - 244 1999年

CiNii
実写動画像をベースとしたマルチメディアアンビアンスコミュニケーションの提案

市川, 吉村哲也, 山田邦男, 金丸利文, 須賀弘道, 岩澤昭一郎, 苗村健, 相澤清晴, 森島繁生, 齊藤隆弘

1999信学総大 16 398 - 398 1999年

CiNii
表情の分析・合成を用いたサイバースペース内でのフェーストローフェース対話システム

森島繁生

映像情報メディア学会技術報告 23 ( 0 ) 61 - 68 1999年

Interactive virtual reality techniques now let computers create a cyberspace that users can walk through. An avatar in cyberspace can bring us a virtual face-to-face communication environment. In this paper, an avatar with a real face is realized in cyberspace, and a multi-user communication system is constructed by voice transmission through a network. Voice from a microphone is transmitted and analyzed, and the mouth shape and facial expression of the avatar are then synchronously estimated and synthesized in real time. An entertainment application of a real-time voice-driven synthetic face is also introduced as an example of an interactive movie. Finally, a face motion capture system using a physics-based face model is introduced.

DOI CiNii
1-3 空間共有コミュニケーションにおける表情入力のための顔抽出

吉村哲也, 市川忠嗣, 森島繁生, 相澤清晴, 齊藤隆弘

映像情報メディア学会冬季大会講演予稿集 1999 ( 0 ) 35 - 35 1999年

DOI CiNii
コンピュ-タを利用した表情の研究 (特集顔学入門)

森島繁生

言語 27 ( 12 ) 70 - 78 1998年12月

CiNii
複数画像からの実時間身体姿勢推定の検討

岩澤昭一郎, 竹松克浩, 大谷淳, 森島繁生

電子情報通信学会ソサイエティ大会講演論文集 1998 308 - 308 1998年09月

CiNii J-GLOBAL
10)房モデルによるヘアスタイルデザインシステムの開発(ネットワーク映像メディア研究会)

岸啓輔, 三枝太, 森島繁生

映像情報メディア学会誌 : 映像情報メディア 52 ( 7 ) 958 - 958 1998年07月

CiNii
11)音声による実時間口形・表情制御可能なサイバースペース上での仮想人物の実現(ネットワーク映像メディア研究会)

四倉達夫, 藤井英史, 小林智典, 森島繁生

映像情報メディア学会誌 : 映像情報メディア 52 ( 7 ) 958 - 958 1998年07月

CiNii
顔の情報処理--表情認識・合成技術の現状と課題

森島繁生

画像ラボ 9 ( 5 ) 5 - 11 1998年05月

CiNii J-GLOBAL
空間周波数に基づく顔器官の形状認識と再合成

武藤淳一, 藤井英史, 森島繁生

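情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア] 110 ( 26 ) 49 - 56 1998年03月

We propose a system that recognizes and resynthesizes facial expressions in real time using spatial frequency components. A fast Fourier transform is applied in real time to square regions around the eyes and mouth, which are tracked automatically in the image, to obtain their spatial frequency components. From the resulting band powers, a neural network estimates the shape of each facial part, expressed here as parameter values of the Action Units (AUs) defined in FACS. When facial expressions were resynthesized from these estimates and compared with the original images, an impression similar to the original could be reproduced even for expressions not used in training. Blinks and the degree of opening of the mouth and eyes can thus be tracked faithfully. Motion capture at the level of the impression of an expression is therefore possible without attaching markers or any other device to the face.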
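
The band-power features described in this abstract can be illustrated with a minimal sketch. It assumes a square grayscale patch and equal-width radial frequency bands; the function name `band_powers`, the number of bands, and the band layout are illustrative choices, not details taken from the paper, and the neural network stage is omitted.

```python
import numpy as np

def band_powers(patch: np.ndarray, n_bands: int = 8) -> np.ndarray:
    """Sum the 2D FFT power of a square patch inside concentric radial bands."""
    if patch.ndim != 2 or patch.shape[0] != patch.shape[1]:
        raise ValueError("expected a square grayscale patch")
    n = patch.shape[0]
    # Power spectrum with the zero frequency shifted to the centre;
    # the mean is removed so the DC term does not dominate the first band.
    spectrum = np.fft.fftshift(np.fft.fft2(patch - patch.mean()))
    power = np.abs(spectrum) ** 2
    # Radial frequency (in cycles per patch) of every spectrum bin.
    freqs = np.fft.fftshift(np.fft.fftfreq(n)) * n
    fy, fx = np.meshgrid(freqs, freqs, indexing="ij")
    radius = np.hypot(fx, fy)
    # Equal-width bands from 0 up to the Nyquist radius n / 2
    # (bins in the spectrum corners beyond that radius are ignored).
    edges = np.linspace(0.0, n / 2, n_bands + 1)
    features = np.empty(n_bands)
    for i in range(n_bands):
        mask = (radius >= edges[i]) & (radius < edges[i + 1])
        features[i] = power[mask].sum()
    return features
```

A feature vector of this kind, computed for the eye and mouth patches of each frame, would be the input to whatever regressor maps band powers to AU parameter values.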

CiNii J-GLOBAL
顔情報処理のための共通プラットホームの構築

八木康史, 森島繁生, 金子正秀, 原島博, 谷内田正彦, 原文雄, 橋本周司

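情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア] 110 ( 26 ) 65 - 72 1998年03月

Against a background of growing interest in facial image processing across many fields and an accumulating body of research results in engineering, work has been under way to build common software tools for facial image processing. This activity, titled 「感性擬人化エージェントのための顔情報処理システムの開発」 (development of a facial information processing system for emotionally expressive anthropomorphic agents; abbreviated as the Advanced Agent Project), has been carried out intensively under a three-year plan starting in fiscal 1995 as one of the development themes of the 独創的情報技術育成事業 program of 情報処理振興技術協会 (IPA). Anthropomorphic agent technology consists of many technical elements; this project paid particular attention to the role of the face and concentrated on developing a facial image processing system for the recognition and synthesis of face images. At the same time, the system aims to provide widely a common experimental tool for face-related fields, including not only engineering but also psychology and medicine. This paper gives an overview of the project, which ends in March 1998, and introduces the common software.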

CiNii
解剖学に基づいた顔面筋肉モデルによる顔表情合成

世良元, 井土哲也, 森島繁生

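電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 97 ( 605 ) 9 - 16 1998年03月

Synthesizing facial expressions with computer graphics has attracted attention in recent years, and the facial muscle model is regarded as the most effective and natural modeling approach. A facial muscle model synthesizes an expression by simulating the actual process by which it is produced: muscle contraction moves the skin tissue, and the expression appears on the face. The naturalness of the synthesized image is therefore determined by the accuracy of this simulation. In this study we built a new model that accurately reproduces the anatomical shape and structure of the bones and muscles of the head. By reproducing the bone structure and proposing a method for fitting it, jaw motion can be reproduced accurately, which earlier models could not do. Wrinkles are also reproduced for more natural expression synthesis. Using the new face model, we developed an expression editing tool based on FACS.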

CiNii J-GLOBAL
オプティカルフローを用いた正面顔画像からの顔面筋パラメータの自動推定

矢崎和彦, 石川貴博, 森島繁生

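電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 97 ( 596 ) 121 - 128 1998年03月

The facial muscle model is one way to represent facial expressions on a computer. It contains modeled skin tissue and facial muscles, physically computes the forces that the muscles exert on the skin tissue, and produces an expression by deforming the skin. The parameters that determine an expression are the contraction forces of the muscles (muscle parameters). Muscle parameters are converted into an expression by superposing forces, but the muscle parameters for a particular expression have so far had to be found by trial and error. In this study we therefore attempt to estimate the muscle parameters corresponding to a face image automatically from a frontal face image by using optical flow.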

CiNii J-GLOBAL
顔の認識・合成のための標準ソフトウェアの開発

森島繁生, 八木康史, 金子正秀, 原島博, 谷内田正彦, 原文雄

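電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 97 ( 596 ) 129 - 136 1998年03月

Our work toward common software tools for facial image processing, titled 「感性擬人化エージェントのための顔情報処理システムの開発」 (development of a facial information processing system for emotionally expressive anthropomorphic agents), has been carried out intensively under a three-year plan starting in fiscal 1995 as one of the development themes of the 独創的情報技術育成事業 program of 情報処理振興技術協会 (IPA). As an outcome of the project, we built standard software for face recognition and synthesis systems. The system also aims to provide widely a common experimental tool for face-related fields, including not only engineering but also psychology, medicine, and dentistry.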

CiNii
房モデルを用いたGUIヘアスタイルデザインシステム

岸啓補, 三枝太, 森島繁生

電子情報通信学会総合大会講演論文集 1998 ( 2 ) 316 - 316 1998年03月

CiNii J-GLOBAL
オプティカルフローを用いた顔面筋パラメータの自動推定

矢崎和彦, 石川貴博, 森島繁生

電子情報通信学会総合大会講演論文集 1998 ( 2 ) 342 - 342 1998年03月

CiNii J-GLOBAL
空間周波数に基づく顔器官の形状認識と再合成

武藤淳一, 森島繁生

電子情報通信学会総合大会講演論文集 1998 ( 2 ) 343 - 343 1998年03月

CiNii J-GLOBAL
印象語に基づく表現合成

武山聡史, 藤井英史, 森島繁生

電子情報通信学会総合大会講演論文集 1998 322 - 322 1998年03月

CiNii J-GLOBAL
解剖学的構造を考慮した顔面筋肉モデルの構築

世良元, 森島繁生

電子情報通信学会総合大会講演論文集 1998 323 - 323 1998年03月

CiNii J-GLOBAL
物理モデルを用いた表情合成方法の構築および皺の表現

井土哲也, 世良元, 森島繁生

電子情報通信学会総合大会講演論文集 1998 342 - 342 1998年03月

CiNii
リアルな頭髪アニメーションの生成と頭部との自動衝突判定

今村顕, 三枝太, 森島繁生

電子情報通信学会総合大会講演論文集 1998 377 - 377 1998年03月

CiNii J-GLOBAL
感情を理解する実時間インタラクションシステムの構築

小林智典, 藤井英史, 森島繁生

電子情報通信学会総合大会講演論文集 1998 388 - 388 1998年03月

CiNii
房モデルによるヘアスタイルデザインシステムの開発

岸啓補, 三枝太, 森島繁生

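電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 97 ( 566 ) 67 - 74 1998年02月

Synthesis of human images with computer graphics is attracting attention as a means of creating virtual humans in cyberspace and realizing communication systems. This paper describes a method for representing hair, which is considered particularly difficult to synthesize among the elements of a CG human. Although hair is visually important in an image of a person, it is often replaced by a simple curved surface or by part of the background. Methods that treat the hair as a single object and render it with mapping techniques have produced good results, but they cannot represent motion. We therefore create hair from space curves without using textures. We propose a hairstyle design system that is simplified by modeling tufts, which are partial groups of hairs, describe how the tufts are modeled and rendered, and show hair images created with the system.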

CiNii
音声による実時間口形・表情制御可能なサイバースペース上での仮想人物の実現

四倉達夫, 藤井英史, 小林智典, 森島繁生

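電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 97 ( 566 ) 75 - 82 1998年02月

With advances in computing, virtual reality technology that creates a cyberspace on a computer and produces the sensation of being immersed in that environment has developed rapidly in recent years. One goal of such a cyberspace is to bring the virtual space closer to real space and to realize realistic communication between people there. We propose a system that generates in the virtual space an agent (avatar) that serves as the user's alter ego, with a humanlike face onto which the user is projected, estimates the mouth shape and facial expression during conversation by analyzing natural speech input from a microphone, and synthesizes the avatar's expression in real time. With this system the avatar's expression changes in synchronization with speech transmitted over the network, enabling face-to-face conversation among many people in cyberspace.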

CiNii
The Fifteen Seconds of Fame : 視聴者参加型インタラクティブ映画の提案

森島繁生

フォーラム顔学 '98 第3回日本顔学会大会予稿集 1998年

CiNii
Processing of facial information by computer

Osamu Hasegawa, Shigeo Morishima, Masahide Kaneko

Electronics and Communications in Japan, Part III: Fundamental Electronic Science (English translation of Denshi Tsushin Gakkai Ronbunshi) 81 ( 10 ) 36 - 57 1998年

Faces are very familiar to everyone and convey various information, including information that is specific to the individual and information that is part of mutual communication. Usually verbal media is not able to describe such information appropriately. Recently, detailed studies on facial information processing by computer have been carried out in the engineering field for application to communication media and human interfaces. Two main topics are recognition of human faces and synthesis of facial images. The objective of the former is to enable computers to recognize human faces and that of the latter is to provide a natural and impressive interface in the form of a "face" for communication media. These technologies have also been found to be useful in various fields related to the face, such as psychology, anthropology, cosmetology, and dentistry. Although most of the studies in these fields have been carried out independently, the above technologies provide for a common study tool among them. This paper focuses on processing facial information by computer. First, recent research trends on the synthesis of facial images and recognition of facial expressions are outlined. After considering the various characteristics of the face, the engineering applications of facial information are discussed from two standpoints: the support of face-to-face communication between humans, and communication between human and machine using facial information. The tools and databases used in studying facial information processing and some related topics are also described. ©1998 Scripta Technica.

DOI
コンピュータヒューマンインタラクションのための人物表情の合成・認識技術

森島繁生

成蹊大学工学研究報告 35 ( 1 ) 19 - 31 1998年

J-GLOBAL
房モデルによるヘアスタイルデザインシステムの開発

岸啓輔, 三枝太, 森島繁生

Human Interface News and Report 13 ( 1 ) 75 - 82 1998年

J-GLOBAL
音声による実時間口形・表情制御可能なサイバースペース上での仮想人物の実現

四倉達夫, 藤井英史, 小林智典, 森島繁生

Human Interface News and Report 13 ( 1 ) 83 - 90 1998年

J-GLOBAL
房モデルによるヘアスタイルデザインシステムの開発

岸啓補, 三枝太, 森島繁生

電子情報通信学会技術研究報告 97 ( 566(MVE97 93-103) ) 67 - 74 1998年

J-GLOBAL
音声による実時間口形・表情制御可能なサイバースペース上での仮想人物の実現

四倉達夫, 藤井英史, 小林智典, 森島繁生

電子情報通信学会技術研究報告 97 ( 566(MVE97 93-103) ) 75 - 82 1998年

J-GLOBAL
房モデルを用いたGUIヘアスタイルデザインシステム

岸啓補, 三枝太, 森島繁生

電子情報通信学会大会講演論文集 1998 316 1998年

J-GLOBAL
解剖学的構造を考慮した顔面筋肉モデルの構築

世良元, 森島繁生

電子情報通信学会大会講演論文集 1998 323 1998年

J-GLOBAL
音声から画像へのメディア変換を用いたサイバースペース上での多人数コミュニケーションシステム

四倉達夫, 小林智典, 藤井英史, 森島繁生

情報処理学会シンポジウム論文集 98 ( 5 ) 133 - 134 1998年

J-GLOBAL
オプティカルフローを用いた顔面筋パラメータの自動推定

矢崎和彦, 石川貴博, 森島繁生

電子情報通信学会大会講演論文集 1998 342 1998年

J-GLOBAL
印象語に基づく表情合成

武山聡史, 藤井英史, 森島繁生

電子情報通信学会大会講演論文集 1998 322 1998年

J-GLOBAL
サイバースペース上での多人数コミュニケーションシステム

四倉達夫, 藤井英史, 森島繁生

電子情報通信学会大会講演論文集 1998 375 1998年

J-GLOBAL
感情を理解する実時間インタラクションシステムの構築

小林智典, 藤井英史, 森島繁生

電子情報通信学会大会講演論文集 1998 388 1998年

J-GLOBAL
リアルな頭髪アニメーションの生成と頭部との自動衝突判定

今村顕, 三枝太, 森島繁生

電子情報通信学会大会講演論文集 1998 377 1998年

J-GLOBAL
物理モデルを用いた表情合成方法の構築およびしわの再現

井土哲也, 世良元, 森島繁生

電子情報通信学会大会講演論文集 1998 342 1998年

J-GLOBAL
空間周波数に基づく顔器官の形状認識と再合成

武藤淳一, 森島繁生

電子情報通信学会大会講演論文集 1998 343 1998年

J-GLOBAL
解剖学に基づいた顔面筋肉モデルによる顔表情合成

世良元, 井土哲也, 森島繁生

電子情報通信学会技術研究報告 97 ( 605(HCS97 30-32) ) 9 - 16 1998年

J-GLOBAL
オプティカルフローを用いた正面顔画像からの顔面筋パラメータの自動推定

矢崎和彦, 石川貴博, 森島繁生

電子情報通信学会技術研究報告 97 ( 596(PRMU97 266-282) ) 121 - 128 1998年

J-GLOBAL
顔の認識・合成のための標準ソフトウェアの開発

森島繁生, 八木康史, 金子正秀, 原島博, 谷内田正彦, 原文雄

電子情報通信学会技術研究報告 97 ( 596(PRMU97 266-282) ) 129 - 136 1998年

CiNii J-GLOBAL
顔情報処理のための共通プラットホームの構築

八木康史, 森島繁生, 金子正秀, 原島博, 谷内田正彦, 原文雄, 橋本周司

情報処理学会研究報告 98 ( 26(CVIM-110) ) 65 - 72 1998年

J-GLOBAL
空間周波数に基づく顔器官の形状認識と再合成

武藤淳一, 藤井英史, 森島繁生

情報処理学会研究報告 98 ( 26(CVIM-110) ) 49 - 56 1998年

J-GLOBAL
顔の情報処理表情認識・合成技術の現状と課題

森島繁生

画像ラボ 9 ( 5 ) 5 - 11 1998年

J-GLOBAL
自由な髪型が表現可能なヘアスタイルデザインシステムの開発と頭髪の質感・運動表現

岸啓補, 森島繁生

Visual Computing グラフィクスとCAD合同シンポジウム予稿集 1998 109 - 114 1998年

J-GLOBAL
画像の空間周波数成分を用いた実時間顔表情報認識と再合成

武藤淳一, 藤井英史, 森島繁生

情報処理学会シンポジウム論文集 98 ( 10 Pt.2 ) II.325-II.330 1998年

J-GLOBAL
複数画像からの実時間身体姿勢推定の検討

岩沢昭一郎, 竹松克浩, 大谷淳, 森島繁生

電子情報通信学会大会講演論文集 1998 308 1998年

J-GLOBAL
顔画像処理研究会報告最終年度に突入

森島繁生

電子情報通信学会誌 4 1998年

J-GLOBAL
コンピュ-タヒュ-マンインタラクションのための人物表情の合成・認識技術(技術解説)

森島繁生

成蹊大学工学研究報告 35 ( 1 ) 19 - 31 1998年01月

CiNii
音声に含まれる感情のモデル化と感情音声合成ツール

古村, 四倉達夫, 森島繁生

電子情報通信学会総合大会, May, 1998 1999 308 - 308 1998年

CiNii J-GLOBAL
サイバースペース上での多人数コミュニケーションシステム

四倉, 藤井英史, 森島繁生

電子情報通信学会総合大会講演論文集, 1998 375 - 375 1998年

CiNii
房モデルによるヘアスタイルデザインシステムの開発

岸啓補, 三枝太, 森島繁生

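映像情報メディア学会技術報告 22 ( 0 ) 67 - 74 1998年

Synthesis of human images with computer graphics is attracting attention as a means of creating virtual humans in cyberspace and realizing communication systems. This paper describes a method for representing hair, which is considered particularly difficult to synthesize among the elements of a CG human. Although hair is visually important in an image of a person, it is often replaced by a simple curved surface or by part of the background. Methods that treat the hair as a single object and render it with mapping techniques have produced good results, but they cannot represent motion. We therefore create hair from space curves without using textures. We propose a hairstyle design system that is simplified by modeling tufts, which are partial groups of hairs, describe how the tufts are modeled and rendered, and show hair images created with the system.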

DOI CiNii
音声による実時間口形・表情制御可能なサイバースペース上での仮想人物の実現

四倉達夫, 藤井英史, 小林智典, 森島繁生

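映像情報メディア学会技術報告 22 ( 0 ) 75 - 82 1998年

With advances in computing, virtual reality technology that creates a cyberspace on a computer and produces the sensation of being immersed in that environment has developed rapidly in recent years. One goal of such a cyberspace is to bring the virtual space closer to real space and to realize realistic communication between people there. We propose a system that generates in the virtual space an agent (avatar) that serves as the user's alter ego, with a humanlike face onto which the user is projected, estimates the mouth shape and facial expression during conversation by analyzing natural speech input from a microphone, and synthesizes the avatar's expression in real time. With this system the avatar's expression changes in synchronization with speech transmitted over the network, enabling face-to-face conversation among many people in cyberspace.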

DOI CiNii
HTMLブラウザを用いた感情音声刺激のSD法評価実験(研究発表B,IV.第16回大会発表要旨)

佐藤順, 森島繁生, 山田寛

基礎心理学研究 16 ( 2 ) 118 - 118 1998年

DOI CiNii
顔面筋肉モデルに基づく表情トラッキングと再合成

石川貴博, 矢崎和彦, 世良元, 森島繁生

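電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 97 ( 388 ) 39 - 46 1997年11月

The facial muscle model [1][2][3], one way to represent facial expressions on a computer, contains modeled skin tissue and facial muscles, computes the forces that the muscles exert on the skin tissue, and produces an expression by deforming the skin. The parameters that determine an expression are the contraction forces of the muscles (muscle parameters). Muscle parameters are converted into a facial expression by superposing forces, but the muscle parameters for a particular expression have had to be found by trial and error. In this paper, muscle parameters are estimated automatically from the 2D displacements on a face captured by a single camera, and the facial expression is then resynthesized. This amounts to tracking and synthesizing a 3D expression from the 2D expression information available in a frontal image alone, under the constraint of the muscle model.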

CiNii
表情認識・合成の技術課題

森島繁生

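電子情報通信学会技術研究報告. HIP, ヒューマン情報処理 97 ( 388 ) 83 - 92 1997年11月

Recognition and synthesis of facial expressions play an important role as a basic technology for next-generation human interfaces that realize face-to-face dialogue. This paper surveys the current level of the technology, mainly from the standpoint of expression synthesis, and identifies its problems and the technical issues that remain. In particular, it covers methods for modeling the face, dynamic expressions, emotion, and the shape and motion of hair, as well as lip synchronization.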

CiNii
顔面筋肉モデルに基づく表情トラッキングと再合成

石川貴博, 矢崎和彦, 世良元, 森島繁生

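電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 97 ( 386 ) 47 - 54 1997年11月

The facial muscle model [1][2][3], one way to represent facial expressions on a computer, contains modeled skin tissue and facial muscles, computes the forces that the muscles exert on the skin tissue, and produces an expression by deforming the skin. The parameters that determine an expression are the contraction forces of the muscles (muscle parameters). Muscle parameters are converted into a facial expression by superposing forces, but the muscle parameters for a particular expression have had to be found by trial and error. In this paper, muscle parameters are estimated automatically from the 2D displacements on a face captured by a single camera, and the facial expression is then resynthesized. This amounts to tracking and synthesizing a 3D expression from the 2D expression information available in a frontal image alone, under the constraint of the muscle model.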

CiNii J-GLOBAL
表情認識・合成の技術課題

森島繁生

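電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 97 ( 386 ) 91 - 100 1997年11月

Recognition and synthesis of facial expressions play an important role as a basic technology for next-generation human interfaces that realize face-to-face dialogue. This paper surveys the current level of the technology, mainly from the standpoint of expression synthesis, and identifies its problems and the technical issues that remain. In particular, it covers methods for modeling the face, dynamic expressions, emotion, and the shape and motion of hair, as well as lip synchronization.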

CiNii J-GLOBAL
ダイナミックスモデルに基づく頭髪の運動表現

三枝太, 森島繁生

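情報処理学会研究報告. CG,グラフィクスとCAD研究会報告 87 ( 98 ) 25 - 30 1997年10月

Human image synthesis with computer graphics is attracting attention for creating virtual humans in cyberspace and for communication systems. This paper describes a method for representing hair, which is considered especially difficult to synthesize in CG humans. We have previously proposed approximating hair with space curves, which greatly reduces the amount of shape data, and approximating it with a rigid segment model, which makes hair motion controllable. In this paper the motion control method is further improved and described with a model closer to actual motion, making the movement of the hair more natural. We also propose a new collision detection algorithm that performs collision detection at high speed. Furthermore, a rendering method that uses an image buffer with several times the resolution of the display makes a smoother representation of hair possible.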
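
The high-resolution image buffer mentioned at the end of this abstract is a form of supersampling. As a rough illustration only, the sketch below averages each `factor` x `factor` block of an oversized grayscale buffer into one display pixel; the box filter and the name `downsample_box` are assumptions, since the abstract does not say which filter was used.

```python
import numpy as np

def downsample_box(buffer: np.ndarray, factor: int) -> np.ndarray:
    """Average each factor x factor block of a 2D buffer into one output pixel."""
    height, width = buffer.shape
    if height % factor or width % factor:
        raise ValueError("buffer size must be a multiple of the factor")
    # Split both axes into (blocks, pixels-within-block) and average within blocks.
    blocks = buffer.reshape(height // factor, factor, width // factor, factor)
    return blocks.mean(axis=(1, 3))
```

Thin curves such as hair strands cover only part of a display pixel, so averaging a finer buffer turns hard stair-step edges into intermediate intensities.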

CiNii J-GLOBAL
「顔」の情報処理

長谷川修, 森島繁生, 金子正秀

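電子情報通信学会論文誌. D-2, 情報・システム 2-情報処理 80 ( 8 ) 2047 - 2065 1997年08月

The face is very familiar to everyone and carries a wide range of information that is difficult to express verbally, including personal information about the individual to whom the face belongs and information related to communication. In recent years, engineering research on the face has been active, mainly with a view to applications in communication media and human interfaces. Specifically, this concerns face recognition technology, which gives computers a visual capability directed at the human user, and face synthesis technology, which gives computers or communication media an expressive face. These results are also beginning to be used in various face-related fields that had previously been studied separately, such as psychology, anthropology, cosmetology, and dentistry. From this viewpoint, this paper focuses on facial information processing by computer. It first outlines recent technical trends in facial image synthesis and facial expression recognition as component technologies. Next, after considering the various characteristics of the face, it discusses engineering applications of the face from two standpoints: support for face-to-face communication between people, and communication between humans and machines through facial information. Tools and databases for research on facial information processing are also introduced.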

CiNii
「顔」の情報処理

長谷川修, 森島繁生, 金子正秀

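電子情報通信学会論文誌. A, 基礎・境界 80 ( 8 ) 1231 - 1249 1997年08月

The face is very familiar to everyone and carries a wide range of information that is difficult to express verbally, including personal information about the individual to whom the face belongs and information related to communication. In recent years, engineering research on the face has been active, mainly with a view to applications in communication media and human interfaces. Specifically, this concerns face recognition technology, which gives computers a visual capability directed at the human user, and face synthesis technology, which gives computers or communication media an expressive face. These results are also beginning to be used in various face-related fields that had previously been studied separately, such as psychology, anthropology, cosmetology, and dentistry. From this viewpoint, this paper focuses on facial information processing by computer. It first outlines recent technical trends in facial image synthesis and facial expression recognition as component technologies. Next, after considering the various characteristics of the face, it discusses engineering applications of the face from two standpoints: support for face-to-face communication between people, and communication between humans and machines through facial information. Tools and databases for research on facial information processing are also introduced.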

CiNii J-GLOBAL
顔画像を基にした3次元感情モデルの構築とその評価

坂口竜己, 山田寛, 森島繁生

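電子情報通信学会論文誌. A, 基礎・境界 80 ( 8 ) 1279 - 1284 1997年08月

We are studying the construction of an emotion model for representing human emotional states inside a computer. Using this emotion model to describe facial expressions enables applications such as compressed transmission of facial expression images. We have already proposed a method in which a five-layer neural network is trained on an identity mapping of expression description parameters so that a 2D emotion space (emotion model) is formed as an internal structure in its middle layer, but that space had been given only an insufficient psychological interpretation. In this paper we extend the emotion model to three dimensions, verify the psychological validity of the resulting emotion space, and build an expression recognition system. Subjective evaluation experiments with many subjects clarify the correspondence between positions in the space and psychological ratings, and we show that expression recognition based on this emotion model yields responses closer to those of humans.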

CiNii
4-5顔画像によるアドバンストエージェント

金子正秀, 森島繁生

映像情報メディア学会誌 51 ( 8 ) 1169 - 1174 1997年08月

DOI CiNii
熱画像を用いた人物全身像の実時間姿勢推定

岩澤昭一郎, 海老原一之, 大谷淳, 中津良平, 森島繁生

映像情報メディア学会誌 51 ( 8 ) 1270 - 1277 1997年08月

This paper proposes a new real-time method for estimating human body postures from thermal images acquired by an infrared camera, regardless of the background and lighting conditions. Distance transformation is performed for the human body area extracted from the thresholded thermal image, in order to calculate the center of gravity. After the orientation of the upper half of the body is obtained by calculating the moment of inertia, significant points such as the top of the head and the ends of the hands and feet are heuristically located. In addition, the elbow and knee positions are estimated from the detected (significant) points, using a genetic-algorithm-based learning procedure. The experimental results demonstrate the robustness of the proposed algorithm and real-time performance (faster than 20 frames per second).
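
Two of the steps in this abstract, thresholding the thermal image and obtaining a body orientation from moments, can be sketched as follows. This is a simplified illustration, not the authors' procedure: it uses the plain centroid of the thresholded region instead of a distance-transform-based center of gravity, it measures the principal axis of the whole silhouette rather than the upper body only, and the function name and threshold argument are hypothetical.

```python
import numpy as np

def silhouette_pose(thermal: np.ndarray, threshold: float):
    """Return (centroid_row, centroid_col, axis_angle) of the thresholded region.

    axis_angle is the direction of the principal axis in radians,
    measured from the column (x) axis towards the row (y) axis.
    """
    mask = thermal > threshold
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("no pixel exceeds the threshold")
    cy, cx = rows.mean(), cols.mean()
    dy, dx = rows - cy, cols - cx
    # Second-order central moments of the silhouette.
    mu20 = np.mean(dx * dx)
    mu02 = np.mean(dy * dy)
    mu11 = np.mean(dx * dy)
    # Orientation of the axis of least inertia.
    angle = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return cy, cx, angle
```

For an upright silhouette the returned angle is close to plus or minus pi / 2, and it moves away from that value as the region leans to one side.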

DOI CiNii J-GLOBAL
頭髪の質感および運動の表現

三枝太, 森島繁生

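電子情報通信学会ソサイエティ大会講演論文集 1997 256 - 256 1997年08月

This paper describes a method for representing hair, which is especially difficult to synthesize with CG in images of people. We have previously proposed approximating hair with space curves, rendering each space curve by assuming it to be an extremely thin cylindrical tube and computing the normal vector at an arbitrary point on the curve, and controlling hair motion with a rigid segment model. Here we describe an antialiasing method that uses an image buffer with several times the resolution of the display, and propose a motion model that accounts for the mutual influence between segments, which had not been considered before.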

CiNii J-GLOBAL
空間周波数を使用した実時間顔表情認識

近藤淳, 森島繁生

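電子情報通信学会ソサイエティ大会講演論文集 1997 186 - 186 1997年08月

To realize natural communication between humans and computers, we recognize facial expressions by analyzing face images. Spatial frequency has been reported to be effective for facial expression recognition. Based on the features obtained from it, facial expressions are classified into the six basic expressions (anger, disgust, fear, happiness, sadness, and surprise). In this study we propose a real-time facial expression recognition system. To capture the features of each expression, the eye and mouth regions, whose shapes change greatly when the expression changes, are extracted. Each region is transformed into the spatial frequency domain and divided into bands, and the frequency features in each band are obtained. Facial expressions are recognized on the basis of these features.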

CiNii J-GLOBAL
感情音声の印象に関する主観評価実験

佐藤順, 森島繁生

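電子情報通信学会ソサイエティ大会講演論文集 1997 194 - 194 1997年08月

We have been analyzing the emotional information contained in speech and attempting to model it with neural networks, but no decisive modeling method has yet been found. To gain clues to the mechanism by which humans understand emotion conveyed in speech, we carried out subjective evaluation experiments on emotional speech. We first collected speech samples from television dramas and conducted a listening experiment to determine whether the samples carried emotion and, if so, which emotion. We then sampled adjectives for use in a subjective evaluation experiment based on the semantic differential (SD) method and ran an evaluation experiment to check whether they were appropriate for rating emotional speech. Finally, we conducted the subjective evaluation experiment with the SD method. All stimuli were presented and all responses collected through an HTML browser, with subjects operating the computer directly. This paper reports the method of these evaluation experiments, a factor analysis of the results, and the findings obtained.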

CiNii J-GLOBAL
画像の2次元離散コサイン変換を利用した実時間顔表情認識

坂口竜己, 森島繁生

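電子情報通信学会論文誌. D-2, 情報・システム 2-情報処理 80 ( 6 ) 1547 - 1554 1997年06月

Recognition of human facial expressions is expected to find applications in many fields, such as psychology and engineering, and has been studied extensively. In most of this work, however, real-time processing has been difficult because of the trade-off between recognition accuracy and computational cost. This paper proposes a real-time expression recognition system designed to run on relatively slow workstations and PCs. The face region is extracted by detecting blinks in inter-frame difference images, and the region is tracked through the video with one-dimensional correlation matching. Because this approach is implemented with a simple algorithm, it is fast, and it performs well enough when combined with our method of obtaining expression features from spatial frequency components. Expressions are recognized with the discrete cosine transform (DCT) and a neural network. For four expressions of a specific individual, an accuracy of over 90% was confirmed, including region tracking in video.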
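
The DCT feature step named in this abstract can be sketched in a few lines. The example below builds an orthonormal DCT-II basis with NumPy and keeps the lowest-frequency coefficients of a grayscale patch as a feature vector; the function names, the `keep` parameter, and the choice of a square low-frequency block are illustrative assumptions, and the neural network classifier is not shown.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis; row k holds the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    basis[0, :] *= np.sqrt(1.0 / n)
    basis[1:, :] *= np.sqrt(2.0 / n)
    return basis

def dct2_features(patch: np.ndarray, keep: int = 4) -> np.ndarray:
    """Return the keep x keep lowest-frequency 2D DCT coefficients as a vector."""
    rows, cols = patch.shape
    # Separable 2D transform: DCT along the rows, then along the columns.
    coeffs = dct_matrix(rows) @ patch @ dct_matrix(cols).T
    return coeffs[:keep, :keep].ravel()
```

Keeping only a small low-frequency block is what keeps the feature vector short enough for a small network to process at frame rate.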

CiNii
音声に込められた感情の意味次元に関する検討

佐藤順, 森島繁生

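電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 97 ( 99 ) 21 - 28 1997年06月

We conducted three subjective evaluation experiments on the semantic dimensions of emotion conveyed in speech. In Experiment 1, speech samples recorded from television dramas were classified into emotion categories. In Experiment 2, adjective pairs suitable for the semantic differential (SD) ratings in Experiment 3 were selected. Finally, in Experiment 3, the speech samples classified in Experiment 1 were rated by the SD method using those adjective pairs, and a factor analysis was performed on the results. Two factors were obtained: one was an "activation" factor and the other a "pleasant-unpleasant" factor. When factor scores were computed for each speech sample, the samples formed separate groups for each basic emotion category.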

CiNii
ニューラルネットに基づくマーカ移動量からの顔面筋パラメータの推定

石川貴博, 世良元, 森島繁生

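電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 96 ( 604 ) 29 - 36 1997年03月

The facial muscle model [1][2][3], one way to represent facial expressions on a computer, contains modeled skin tissue and facial muscles, computes the forces that the muscles exert on the skin tissue, and produces an expression by deforming the skin. The parameters that determine an expression are the contraction forces of the muscles (muscle parameters). Muscle parameters are converted into a facial expression by superposing forces, but the muscle parameters for a particular expression have had to be found by trial and error. In this paper, muscle parameters are estimated automatically from the 2D displacements of markers on a face captured by a single camera. This amounts to motion capture of a 3D expression from the 2D marker information available in a frontal image alone, under the constraint of the muscle model.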
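
To make the idea of recovering muscle parameters from marker displacements concrete, here is a deliberately simplified sketch. It assumes that marker displacement is a linear superposition of per-muscle contributions and solves for the parameters by least squares; the paper itself uses a neural network, and the `influence` matrix and the function name are hypothetical stand-ins for the real muscle model.

```python
import numpy as np

def estimate_muscle_params(influence: np.ndarray, displacement: np.ndarray) -> np.ndarray:
    """Least-squares muscle parameters m such that influence @ m ~= displacement.

    influence    : (2 * n_markers, n_muscles) matrix; column j is the stacked 2D
                   marker displacement caused by unit contraction of muscle j
                   (an assumed linear model, not the published one).
    displacement : (2 * n_markers,) observed stacked 2D marker displacements.
    """
    params, *_ = np.linalg.lstsq(influence, displacement, rcond=None)
    return params
```

With more marker coordinates than muscles the system is overdetermined, which is why a frontal 2D view can still constrain the parameters of a 3D model.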

CiNii
熱画像を用いた人体姿勢の実時間推定の検討

岩澤昭一郎, 海老原一之, 大谷淳, 森島繁生

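電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 96 ( 604 ) 37 - 44 1997年03月

This report proposes a method for estimating human posture in real time from thermal images acquired by an infrared camera, regardless of background and lighting conditions. A distance transform is applied to the human region obtained by thresholding the thermal image to find the center of gravity of the whole body. After the inclination of the upper body is obtained, the characteristic parts (the top of the head, the hand tips, and the toe tips) are located. The positions of the elbows and knees are then estimated from these parts by learning with a genetic algorithm. The method is non-contact and can be applied to any person.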

CiNii J-GLOBAL
実計測に基づく頭髪の運動表現

近藤淳, 三枝太, 森島繁生

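電子情報通信学会総合大会講演論文集 1997 ( 2 ) 390 - 390 1997年03月

Human images are now synthesized with computer graphics in many fields, and more realistic synthesized humans are in demand. For the head region, we have previously proposed approximating hair with space curves and controlling its motion with a rigid segment model. However, because that method considers the motion of individual segments rather than controlling the hair motion as a whole, the motion breaks down as time passes. To control the motion as a whole, we measure the actual motion of thread-like objects and reflect the measurements in the animation, producing more realistic hair motion.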

CiNii J-GLOBAL
3次元モデルを用いた口形状の制御

藤井英史, 宮下直也, 森島繁生

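電子情報通信学会総合大会講演論文集 1997 ( Sogo Pt 1 ) 317 - 317 1997年03月

For smooth communication between computers and humans, the ideal is an environment in which it feels as if two people are talking directly. This requires not only that image and speech be synchronized but also that mouth movements during speech be rendered naturally. The mouth motion currently used in the real-time media conversion system [1] looks unnatural, so its quality needs to be improved. In this paper, basic mouth shapes for Japanese and English are created from actual measurements, and natural animation is achieved by interpolating between them over time.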
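
Interpolating between basic mouth shapes over time can be illustrated with a short sketch. It assumes each mouth shape is a vector of control parameters and uses piecewise-linear interpolation between key frames; the abstract does not state which interpolation scheme was used, and the function and argument names are hypothetical.

```python
import numpy as np

def interpolate_mouth_shapes(key_times, key_shapes, frame_times):
    """Linearly interpolate mouth-shape parameter vectors between key frames.

    key_times   : increasing 1D sequence of key-frame times in seconds.
    key_shapes  : (n_keys, n_params) array, one parameter vector per key frame.
    frame_times : 1D sequence of times at which to sample the animation.
    """
    key_times = np.asarray(key_times, dtype=float)
    key_shapes = np.asarray(key_shapes, dtype=float)
    frame_times = np.asarray(frame_times, dtype=float)
    # np.interp handles one parameter at a time, so loop over the columns.
    columns = [np.interp(frame_times, key_times, key_shapes[:, j])
               for j in range(key_shapes.shape[1])]
    return np.stack(columns, axis=1)
```

Times outside the key-frame range are clamped to the first or last key shape, which is the default behaviour of `np.interp`.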

CiNii J-GLOBAL
実時間インタラクションシステムの構築

宮下直也, 佐藤昌代, 森島繁生

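電子情報通信学会総合大会講演論文集 1997 318 - 318 1997年03月

Agents with humanlike faces (anthropomorphic agents) have become a hot topic in the human interface field. Such agents are required to provide an environment with a high degree of realism, as if people were interacting with each other directly. The first requirement is natural facial expression synthesis, so that the user is not conscious that the agent is artificial, together with display synchronized with speech in real time. Toward such an environment, we propose a real-time media conversion system that synthesizes lip movement and facial expression during conversation in real time from an analysis of natural speech, either input from a microphone or recorded. This paper reports a prototype system that realizes interaction [1] between a user and an agent.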

CiNii
英語発声のための筋肉モデルによる口形制御

関矢正樹, 世良元, 森島繁生

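電子情報通信学会総合大会講演論文集 1997 ( Sogo Pt 1 ) 344 - 344 1997年03月

We are studying the generation of mouth shapes with a muscle model [1]. Mouth shapes for Japanese have been created so far [2]. As it stands, however, the model cannot represent the mouth shapes specific to English ([f], [v], [θ], [δ]). English also has more mouth shapes than Japanese, so a wider control range of the muscles is needed. We therefore improved the shape of the model's muscles and the control of the jaw, and added a tongue to the model. A model of the inside of the mouth was also added to produce realistic images.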

CiNii J-GLOBAL
正面顔画像のマーカ移動量からの顔面筋パラメータの自動推定

石川貴博, 世良元, 森島繁生

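電子情報通信学会総合大会講演論文集 1997 ( Sogo Pt 1 ) 347 - 347 1997年03月

The facial muscle model [1] contains modeled skin tissue and facial muscles, computes the forces that the muscles exert on the skin tissue, and produces an expression by deforming the skin. The parameters that determine an expression are the muscle forces (muscle parameters). Muscle parameters are converted into a facial expression by superposing forces, but the muscle parameters for a particular expression have had to be found by trial and error. In this paper we attempt to estimate the muscle parameters automatically from the 2D displacements of markers on a face captured by a single camera. This amounts to motion capture of a 3D expression from the 2D marker information available in a frontal image alone, under the constraint of the muscle model.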

CiNii J-GLOBAL
顔画像の空間周波数特徴を用いた実時間表情合成

酒井典子, 森島繁生

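電子情報通信学会総合大会講演論文集 1997 ( Sogo Pt 1 ) 353 - 353 1997年03月

We aim to build an environment for interactive dialogue between humans and computers. In the expression analysis and synthesis system we proposed earlier, the displacements of feature points on the facial parts of the synthesized image are converted into expression parameters to synthesize a facial expression [1]. However, that system targeted still images and had difficulty analyzing the weak expressions that occur while an expression is forming. In this paper we propose a method for synthesizing a face video that runs from a neutral face to a full expression. Regions in the real video to be analyzed are extracted and tracked by image binarization and additive projection. For expression analysis, feature point displacements are estimated with a fast Fourier transform (FFT) and a neural network, and the expression is then synthesized.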

CiNii J-GLOBAL
画像処理技術の基本を学ぶ(第3回)顔画像の処理技術

森島繁生

画像ラボ 8 ( 2 ) 54 - 58 1997年02月

CiNii J-GLOBAL
表情筋の3次元物理モデルに基づく人物表情の分析合成

森島繁生

3D映像 11 ( 2 ) 7 - 18 1997年

CiNii
Virtual Face-to-Face Communication Driven by Voice Through Network

MORISHIMA Shigeo

Proceedings of Workshop on Perceptual User Interfaces (PUI'97), Oct. 1997年

CiNii
3D estimation of facial muscle parameter from the 2D marker movement using neural network

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 1352 671 - 678 1997年01月

© 1997, Springer Verlag. All rights reserved. Muscle-based face image synthesis is one of the most realistic approaches to realizing a life-like agent in a computer. The facial muscle model is composed of facial tissue elements and muscles. In this model, the forces acting on each facial tissue element are calculated from the contraction strength of each muscle, so the combination of muscle parameters determines a specific facial expression. At present, each muscle parameter is decided by trial and error, comparing a sample photograph with an image generated using our Muscle-Editor, in order to generate a specific face image. In this paper, we propose a strategy for automatically estimating facial muscle parameters from 2D marker movements using a neural network. This corresponds to non-real-time 3D facial motion tracking from a 2D image under a physics-based condition.
画像処理技術の基本を学ぶ (第3回) 顔画像の処理技術モデル符号化や次世代インタフェースに必要な顔画像の処理技術

森島繁生

画像ラボ 8 ( 2 ) 54 - 58 1997年

J-GLOBAL
ヒューマンコンピュータインタラクションのための音声から画像へのリアルタイムメディア変換

宮下直也, 坂口竜己, 森島繁生

情報処理学会シンポジウム論文集 97 ( 1 ) 53 - 54 1997年

J-GLOBAL
実時間インタラクションシステムの構築

宮下直也, 佐藤昌代, 森島繁生

電子情報通信学会大会講演論文集 1997 ( Sogo Pt 1 ) 318 1997年

J-GLOBAL
実計測に基づく頭髪の運動表現

近藤淳, 三枝太, 森島繁生

電子情報通信学会大会講演論文集 1997 ( Sogo Pt 7 ) 390 1997年

J-GLOBAL
熱画像からの人体の姿勢推定の高度化の検討

岩沢昭一郎, 海老原一之, 大谷淳, 森島繁生

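電子情報通信学会大会講演論文集 1997 ( Sogo Pt 7 ) 365 - 365 1997年

We have already proposed a non-contact method for estimating human posture from thermal images. However, our previous method could not handle large sideways inclinations of the upper body or wide-ranging leg movements, and because it was monocular it could not obtain 3D information. In this report we improve the earlier monocular algorithm so that it can handle more postures. Specifically, we propose a method that heuristically detects, in real time, the positions of the top of the head, the hand tips, and the toe tips from the human region obtained from a thermal image captured by a monocular infrared camera and from its contour information, and a method that estimates the positions of the elbows and knees using a genetic algorithm (GA). We also examine the acquisition of 3D positions by stereo vision.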

CiNii J-GLOBAL
英語発声のための筋肉モデルによる口形制御

関矢正樹, 世良元, 森島繁生

電子情報通信学会大会講演論文集 1997 ( Sogo Pt 1 ) 344 1997年

J-GLOBAL
3次元モデルを用いた口形状の制御

藤井英史, 宮下直也, 森島繁生

電子情報通信学会大会講演論文集 1997 ( Sogo Pt 1 ) 317 1997年

J-GLOBAL
正面顔画像のマーカ移動量からの顔面筋パラメータの自動推定

石川貴博, 世良元, 森島繁生

電子情報通信学会大会講演論文集 1997 ( Sogo Pt 1 ) 347 1997年

J-GLOBAL
顔画像の空間周波数特徴を用いた実時間表情合成

酒井典子, 森島繁生

電子情報通信学会大会講演論文集 1997 ( Sogo Pt 1 ) 353 1997年

J-GLOBAL
ヘアスタイルモデリングツールの開発

酒井文騰, 三枝太, 森島繁生

電子情報通信学会大会講演論文集 1997 ( Sogo Pt 7 ) 398 1997年

J-GLOBAL
熱画像を用いた人体姿勢の実時間推定の検討

岩沢昭一郎, 海老原一之, 大谷淳, 森島繁生

電子情報通信学会技術研究報告 96 ( 604(HCS96 44-49) ) 37 - 44 1997年

J-GLOBAL
ニューラルネットに基づくマーカ移動量からの顔面筋パラメータの推定

石川貴博, 世良元, 森島繁生

電子情報通信学会技術研究報告 96 ( 604(HCS96 44-49) ) 29 - 36 1997年

J-GLOBAL
画像の認識・理解画像の2次元離散コサイン変換を利用した実時間顔表情認識

坂口竜己, 森島繁生

電子情報通信学会論文誌 D-2 J80-D-2 ( 6 ) 1547 - 1554 1997年

J-GLOBAL
音声に込められた感情の意味次元に関する検討

佐藤順, 森島繁生

電子情報通信学会技術研究報告 97 ( 99(HCS97 1-6) ) 21 - 28 1997年

J-GLOBAL
2次元マーカ移動量からの顔面筋パラメータ自動推定

石川貴博, 世良元, 森島繁生

映像情報メディア学会年次大会講演予稿集 1997 373 - 374 1997年

J-GLOBAL
3次元モデルを用いた発話アニメーションの作成

藤井英史, 宮下直也, 森島繁生

映像情報メディア学会年次大会講演予稿集 1997 68 - 69 1997年

J-GLOBAL
「顔」の情報処理

長谷川修, 森島繁生, 金子正秀

電子情報通信学会論文誌 A J80-A ( 8 ) 1231 - 1249 1997年

J-GLOBAL
「顔」の情報処理

長谷川修, 森島繁生, 金子正秀

電子情報通信学会論文誌 D-2 J80-D-2 ( 8 ) 2047 - 2065 1997年

J-GLOBAL
画像技術における学習・適応・進化熱画像を用いた人物全身像の実時間姿勢推定

岩沢昭一郎, 海老原一之, 大谷淳, 中津良平, 森島繁生

映像情報メディア学会誌 51 ( 8 ) 1270 - 1277 1997年

This paper proposes a new real-time method for estimating human body postures from thermal images acquired by an infrared camera, regardless of the background and lighting conditions. Distance transformation is performed for the human body area extracted from the thresholded thermal image, in order to calculate the center of gravity. After the orientation of the upper half of the body is obtained by calculating the moment of inertia, significant points such as the top of the head and the ends of the hands and feet are heuristically located. In addition, the elbow and knee positions are estimated from the detected (significant) points, using a genetic-algorithm-based learning procedure. The experimental results demonstrate the robustness of the proposed algorithm and real-time performance (faster than 20 frames per second).

DOI J-GLOBAL
人体と顔の画像処理 4. 応用 4-5 顔画像によるアドバンストエージェント

金子正秀, 森島繁生

映像情報メディア学会誌 51 ( 8 ) 1169 - 1174 1997年

DOI CiNii J-GLOBAL
感情,精神の分析の顔認識モデル顔画像を基にした3次元感情モデルの構築とその評価

坂口竜己, 山田寛, 森島繁生

電子情報通信学会論文誌 A J80-A ( 8 ) 1279 - 1284 1997年

J-GLOBAL
感情音声の印象に関する主観評価実験

佐藤順, 森島繁生

電子情報通信学会大会講演論文集 1997 194 1997年

J-GLOBAL
頭髪の質感および運動の表現

三枝太, 森島繁生

電子情報通信学会大会講演論文集 1997 256 1997年

J-GLOBAL
空間周波数を使用した実時間顔表情認識

近藤淳, 森島繁生

電子情報通信学会大会講演論文集 1997 186 1997年

J-GLOBAL
ダイナミックスモデルに基づく頭髪の運動表現

三枝太, 森島繁生

情報処理学会研究報告 97 ( 98(CG-87) ) 25 - 30 1997年

J-GLOBAL
顔面筋肉モデルに基づく表情トラッキングと再合成

石川貴博, 矢崎和彦, 世良元, 森島繁生

電子情報通信学会技術研究報告 97 ( 386(PRMU97 129-151) ) 47 - 54 1997年

J-GLOBAL
表情認識・合成の技術課題

森島繁生

電子情報通信学会技術研究報告 97 ( 386(PRMU97 129-151) ) 91 - 100 1997年

J-GLOBAL
ヘアスタイルモデリングツールの開発

酒井文騰, 三枝太, 森島繁生

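信学'97春大, 03 398 - 398 1997年

The shape, texture, and motion of natural objects are complex, and thin, soft, thread-like objects such as hair in particular are considered difficult to represent with CG; no established method exists. We have already reported a method for interactively generating hair on a head model, but because of limitations such as a fixed hair generation region and indirect hair generation through two editing screens, it was still not easy to reproduce a hairstyle faithfully as imagined. Taking these problems into account, we developed a modeling tool with which more complex hairstyles can be designed easily using a mouse.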

CiNii
6-2 3次元モデルを用いた発話アニメーションの作成

藤井英史, 宮下直也, 森島繁生

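映像情報メディア学会年次大会講演予稿集 1997 ( 0 ) 68 - 69 1997年

We use a 3D model to represent the movement of facial expressions and mouth shapes on a computer. For mouth shapes, parameter values for each mouth shape are assigned to each control point to express mouth movement. For facial expressions, an expression is produced by combining Action Units (AUs), 44 units of movement of the individual parts of the face. In this paper, to represent mouth movement naturally, we measured the displacements of markers on the face with two cameras and determined the parameter values for each mouth shape. We built an animation system that synthesizes each mouth shape from these parameters and renders the transitions between them smoothly.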

DOI CiNii J-GLOBAL
30-2 2次元マーカ移動量からの顔面筋パラメータ自動推定

石川貴博, 世良元, 森島繁生

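映像情報メディア学会年次大会講演予稿集 1997 ( 0 ) 373 - 374 1997年

One model for synthesizing facial expressions on a computer is the facial muscle model. It contains modeled skin tissue and facial muscles, computes the forces that the muscles exert on the skin tissue, and produces an expression by deforming the skin. The parameters that determine an expression are the contraction forces of the muscles (muscle parameters). Muscle parameters are converted into a facial expression by superposing forces, but the muscle parameters for a particular expression have had to be found by trial and error. In this paper, muscle parameters are estimated automatically from the 2D displacements of markers on a face captured from the front by a single camera. This amounts to motion capture of a 3D expression from the 2D marker information available in a frontal image alone, under the constraint of the muscle model.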

DOI CiNii J-GLOBAL
熱画像からの人体の姿勢推定の高度化の検討

岩澤昭一郎, 海老原一之, 大谷淳, 森島繁生

電子情報通信学会総合大会講演論文集 1997 ( 2 ) 15 - 20 1997年

　概要を見る

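The authors have already proposed a non-contact method for estimating human posture from thermal images. However, the previous method could not handle large sideways leaning of the upper body or wide-ranging leg movements, and being monocular, it could not obtain 3D information. This report improves the previous monocular algorithm to handle a wider range of postures. Specifically, we propose a method that heuristically detects, in real time, the positions of the top of the head, the hand tips, and the foot tips based on the human region obtained from the thermal image of a monocular infrared camera and on its contour, and a method that estimates the elbow and knee positions using a genetic algorithm (GA). We also examine obtaining 3D positions by stereo vision.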

CiNii
8)モデルフィッティングのための正面顔画像からの特徴点自動抽出([マルチメディア情報処理研究会ネットワーク映像メディア研究会]合同)

岩澤昭一郎, 森島繁生

テレビジョン学会誌 50 ( 11 ) 1817 - 1817 1996年11月

CiNii
9)空間周波数を利用した実時間顔表情認識([マルチメディア情報処理研究会ネットワーク映像メディア研究会]合同)

坂口竜己, 森島繁生

テレビジョン学会誌 50 ( 11 ) 1817 - 1817 1996年11月

CiNii
8)モデルフィッティングのための正面顔画像からの特徴点自動抽出([マルチメディア情報処理研究会ネットワーク映像メディア研究会]合同)

岩澤昭一郎, 森島繁生

テレビジョン学会誌 50 ( 9 ) 1419 - 1419 1996年09月

CiNii
9)空間周波数を利用した実時間顔表情認識([マルチメディア情報処理研究会ネットワーク映像メディア研究会]合同)

坂口竜己, 森島繁生

テレビジョン学会誌 50 ( 9 ) 1419 - 1419 1996年09月

CiNii
流体場解析における頭髪の運動表現

三枝太, 森島繁生

電子情報通信学会ソサイエティ大会講演論文集 1996 ( Society D ) 285 - 285 1996年09月

　概要を見る

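We are studying realistic image synthesis of human figures toward the realization of anthropomorphic agents. For the head region in particular, we have proposed approximating hair by "space curves" and controlling its motion with a "rigid segment model". Approximating hair by space curves makes it possible to represent its complex shape with little data, and the rigid segment model makes it possible to move the space curves with simple numerical computation. This paper proposes a method, based on fluid dynamics, for approximately computing a fluid field with a small amount of computation in order to represent hair motion, and evaluates the resulting motion animation.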

CiNii J-GLOBAL
動画像からの実時間表情認識

坂口竜己, 森島繁生

電子情報通信学会ソサイエティ大会講演論文集 1996 ( Society A ) 340 - 341 1996年09月

　概要を見る

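Recognition of human facial expressions is expected to find applications in fields such as psychology and engineering, and has been studied extensively. In most of this work, however, real-time processing has been difficult because of the trade-off between recognition accuracy and computational cost. Recently there have been examples of real-time recognition using transputers or fast workstations, but the growing scale of the hardware is a problem. This paper proposes a real-time expression recognition system designed to run on relatively slow workstations and PCs. The face region is extracted and tracked using simple difference images, and expressions are recognized using the discrete cosine transform and a neural network. An accuracy above 90% has been confirmed for recognizing four expressions of a specific individual.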

CiNii J-GLOBAL
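The entry above (動画像からの実時間表情認識) feeds discrete cosine transform coefficients of the face region to a neural network. The sketch below illustrates only the feature-extraction half, under stated assumptions: the function names, the orthonormal DCT-II scaling, and the choice of the lowest `k` × `k` coefficients are illustrative and are not taken from the paper.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform each row, then each column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def low_freq_features(block, k=3):
    """Feature vector made of the k x k lowest-frequency coefficients.

    Assumption: keeping low frequencies is one common choice; the paper's
    actual coefficient selection is not stated in the abstract.
    """
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(k) for v in range(k)]
```

The resulting vector would be the input layer of a small classifier, one output unit per expression category.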
顔動画像からの周波数特性解析による表情認識

小島孝治, 坂口竜己, 森島繁生

電子情報通信学会総合大会講演論文集 1996 ( Sogo Pt 1 ) 379 - 379 1996年03月

　概要を見る

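Facial expression recognition is an important technology for natural communication between a human user and a computer through a virtual person realized by a software agent or similar means. Most previous studies on facial expression recognition have used still images, but using video makes it possible to take the process of expression change into account, which should allow more accurate recognition. In this paper, face images are represented by spatial frequency components, and the change in these components over the course of an expression change is obtained. At recognition time, the facial parts that change most during expression (the mouth and eyes) are extracted, and the temporal change of their frequency components is compared with learned patterns to identify the expression. Because individual differences are large when expressions are judged from images, the person's six basic expressions (anger, disgust, fear, happiness, sadness, surprise) are first prepared as learned patterns, and a newly input image is classified according to which of them it is closest to.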

CiNii J-GLOBAL
分析・合成のための表情記述パラメータ推定

川上文雄, 森島繁生, 山田寛, 原島博

電子情報通信学会総合大会講演論文集 1996 ( Sogo Pt 1 ) 380 - 380 1996年03月

　概要を見る

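We aim to build an environment in which a person and a computer can communicate face to face. Using the identity-mapping capability of a five-layer neural network, we previously compressed a 17-dimensional expression parameter space into three dimensions in the middle layer, regarded this as an emotion space, and built a system that analyzes and synthesizes expressions by realizing the mapping from expressions to this space and its inverse (emotion space → expression). We have also conducted subjective evaluation experiments using expressions synthesized from the emotion space as stimuli, in order to establish the psychological validity of this inverse mapping. However, the emotion space generated by the neural network is based on FACS (Facial Action Coding System), which is used mainly in psychology. The input to the expression → emotion space mapping is therefore AU parameters, so the AUs must first be estimated from the input face image. This paper mainly describes a method for AU estimation using a neural network.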

CiNii J-GLOBAL
新しい表情記述パラメータ(直交化FACS)の提案とその表情分析・合成への応用

松川和正, 川上文雄, 森島繁生, 山田寛

電子情報通信学会総合大会講演論文集 1996 ( Sogo Pt 1 ) 381 - 381 1996年03月

　概要を見る

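We aim to build an environment for communication between people and a computer that can handle the affective (kansei) information conveyed by facial expressions. Until now we have described facial expressions with an extended FACS (Facial Action Coding System), but its units of movement (Action Units) are correlated, so it is not easy to determine the parameters uniquely from the displacements of facial feature points. This paper therefore proposes new FACS-based expression description parameters that are mutually orthogonalized. Expressions can be resynthesized from these parameters, and the emotional state shown on the face can also be described quantitatively.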

CiNii J-GLOBAL
物理法則に基づいた顔モデルによる口形状の表現

世良元, 岩澤昭一郎, 森島繁生

電子情報通信学会総合大会講演論文集 1996 ( Sogo Pt 1 ) 401 - 401 1996年03月

　概要を見る

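We are studying the generation of human facial expressions using a muscle model. However, the number and shapes of the muscles in the current physical model are not suited to producing mouth shapes during conversation. We therefore revised the types and shapes of the muscles, making it possible to produce mouth shapes during conversation. In addition, other parts of the face such as the teeth, eyes, and eyelids have been modeled to produce realistic images.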

CiNii J-GLOBAL
顔モデルにおける口唇形状動作の改善

川原光貴, 岩澤昭一郎, 森島繁生

電子情報通信学会総合大会講演論文集 1996 402 - 402 1996年03月

　概要を見る

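To realize smoother communication between computer systems and their users, the authors are considering the use of faces and facial expressions in interface systems, following human-to-human communication, in which more than half of the transmitted information is said to be based on facial expressions. For an interactive face-based interface, one ideal is to give the sense of reality of facing an actual person. Achieving such a high degree of realism in dialogue requires reproducing lip shapes during speech that are faithful to reality. The lip shape motion of the face model in our current real-time media conversion system cannot be called accurate, and it feels unnatural when interacting with the system. Removing this unnaturalness requires defining new lip motions. This paper describes new rule definitions for lip shape motion based on 3D measurement.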

CiNii
仮想人物とユーザの対話を実現するための音声から画像への実時間メディア変換システム

宮下直也, 岩澤昭一郎, 坂口竜己, 森島繁生

電子情報通信学会総合大会講演論文集 1996 403 - 403 1996年03月

　概要を見る

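Research on synthesizing realistic face images, such as facial expressions and conversation scenes, has been active in recent years. With applications to video conferencing, telepresence communication, and the display side of intelligent interfaces in mind, we are studying real-time synthesis of face images. Building a more human-friendly interface requires synthesizing more realistic images in real time. This paper describes a system that uses this real-time media conversion technology to carry out a dialogue on a computer display between a virtual person (Virtual Agent, VA; at present the agent is operated by a human) and a user.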

CiNii
陰影を考慮した頭髪の表現に関する一検討

安藤真, 森島繁生

電子情報通信学会総合大会講演論文集 1996 406 - 406 1996年03月

　概要を見る

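Focusing on hair in human images, the authors modeled the hair as a whole and thereby represented effects that are difficult with conventional textures, such as changes in highlights on the hair surface and hair motion under external forces. However, because the hair rendering is a local illumination algorithm using a Z-buffer, it cannot represent shadows as it stands. Global illumination algorithms such as ray tracing offer a straightforward way to render shadows, but when the number of objects is extremely large, as with hair, the computational cost becomes enormous and impractical. This paper therefore proposes an algorithm that uses a shadow buffer to render hair with shadows in less computation time.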

CiNii
3台のカメラによるリモートセンシング画像からの標高測定

安藤裕幸, 安藤真, 森島繁生

電子情報通信学会総合大会講演論文集 1996 ( 1 ) 209 - 209 1996年03月

　概要を見る

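One use of remote sensing data is the creation of topographic maps. At present the matching cannot be done fully automatically, and elevation is measured with manual matching by human operators. We therefore proposed a method for measuring elevation from remote sensing data using sensor data from three directions, with DP matching as the time-axis normalization method. The three-direction camera is used as a countermeasure against occlusion: by switching between the forward-nadir and backward-nadir view pairs, the method was improved so that the correct altitude can be obtained.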

CiNii J-GLOBAL
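The entry above (3台のカメラによるリモートセンシング画像からの標高測定) aligns sensor data from different viewing directions by DP matching. The following is a generic sketch of DP matching between two 1-D intensity profiles, not the authors' implementation. The function name `dp_match`, the absolute-difference cost, and the three allowed step directions are assumptions.

```python
def dp_match(a, b):
    """Align sequences a and b by dynamic programming.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    meaning a[i] is matched to b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path
```

In a stereo setting, the offset `j - i` along the path plays the role of a disparity, from which elevation would be derived using the sensor geometry (not modeled here).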
顔面筋の物理モデルに基づく会話時の口形状の制御

世良元, 岩澤昭一郎, 森島繁生, Terzopoulos Demetri

電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 95 ( 553 ) 9 - 16 1996年03月

　概要を見る

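The generation of human figures by CG is desired for realizing anthropomorphic agents in human interfaces and for producing entertainment video. This paper proposes a physics-based muscle model aimed at rendering highly realistic human faces. While there are many studies on generating human facial expressions, few address mouth shapes during conversation. To synthesize natural conversation animation in particular, we revised the types and shapes of the muscles to suit mouth shape representation. Mouth shapes were created based on measurements from real images. Animation synchronized with sound was generated by taking phoneme durations into account.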

CiNii J-GLOBAL
仮想人物との対話を実現するための音声から画像への実時間メディア変換システムの研究

宮下直也, 佐藤順, 坂口竜己, 森島繁生

電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 95 ( 553 ) 17 - 24 1996年03月

　概要を見る

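We are studying the synthesis of human facial expressions and conversation scenes, to give form to anthropomorphic agents in human interfaces or to serve as the display on the receiving side of intelligent communication. In interaction with an anthropomorphic agent, the goal is an environment with a high degree of realism, as if people were in direct contact with each other. This requires natural image synthesis, with image and speech synchronized, convincing enough that the other party seems to be a real person, together with real-time image display. This paper aims to build a near-practical prototype system toward such an agent. We describe a system that introduces a 3D model faithful to a person's facial shape, takes as input the speech of a person actually talking, estimates the mouth shape from the analysis of this speech, and synthesizes facial expressions and mouth shapes in real time. We also describe a system that uses this real-time media conversion technology to carry out a dialogue on a display between a virtual person (virtual agent; at present the agent is operated by a human) and a user. Finally, we touch on emotional speech analysis and synthesis for interactive communication between the virtual agent and the user.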

CiNii
3次元感情モデルに基づく表情分析・合成システムの構築

川上文雄, 山田寛, 原島博, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 95 ( 552 ) 7 - 14 1996年03月

　概要を見る

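Using the excellent identity-mapping capability of a five-layer neural network, we previously represented the emotional state expressed on the human face spatially, called this space the "3D emotion space" (3D emotion model), and built a system to realize both the mapping from expressions to the emotion space and the mapping from the emotion space to expressions. However, many problems remained: the mapping from the emotion space to expressions lacked psychological validity, the input to the emotion space was expression parameters, and even when the mapping from expressions to the emotion space was realized, its psychological meaning was unclear. We therefore conducted psychological experiments to confirm the psychological validity of the emotion space, estimated expression parameters from input face images, and built a system that can grasp the user's emotional state from the user's face image.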

CiNii J-GLOBAL
ダイナミックスモデルに基づく自然な頭髪アニメーション

安藤真, 三枝太, 松坂秀治, 森島繁生

電子情報通信学会技術研究報告. HCS, ヒューマンコミュニケーション基礎 95 ( 552 ) 15 - 22 1996年03月

　概要を見る

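Computer graphics images of people are now used in a variety of fields. This paper describes a method for representing hair, which is a particularly difficult target for CG synthesis among images of the human head. The authors have already greatly reduced the volume of shape data by approximating hair with space curves, and enabled motion control using a rigid segment model. In this paper, the motion control method is described with a model closer to real motion, making hair movement more natural. Collision handling is performed quickly using a new collision detection algorithm. In addition, we propose a new algorithm for quickly rendering shadows cast by hair, which had not previously been considered.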

CiNii J-GLOBAL
物理法則に基づいた筋肉モデルによる口唇形状の制御

森島繁生

第12回NICOGRAPH論文コンテスト論文集, 1996 219 - 229 1996年

CiNii
Modeling of facial expression and emotion for human communication system

Shigeo Morishima

Displays 17 ( 1 ) 15 - 25 1996年

　概要を見る

The goal of this research is to realize a face-to-face communication environment between a human and a machine by giving facial expressions to a computer system. This paper presents methods for modeling facial expression, including a 3D face model, an expression model, and an emotion model. Facial expressions are parameterized with the Facial Action Coding System (FACS), which is translated into motion of the grid points of a face model constructed from 3D range sensor data. An emotional state is described compactly as a point in a 3D space generated by a five-layer neural network, and the evaluation results show the high performance of this model.

DOI
Physics-based muscle model for mouth shape control

H Sera, S Morishima, D Terzopoulos

RO-MAN '96 - 5TH IEEE INTERNATIONAL WORKSHOP ON ROBOT AND HUMAN COMMUNICATION, PROCEEDINGS 207 - 212 1996年

　概要を見る

Human image synthesis by computer graphics is essential for virtual agents in human interfaces and for entertainment visual systems. This paper proposes a muscle model for creating a highly realistic human face. Although there are several studies on synthesizing human expressions, research on mouth shape control during conversation is limited to our group. In particular, we select and modify the muscles suited to mouth shape generation in order to realize a natural conversation scene. Basic mouth shapes are defined by measurement of real images captured by a camera. We also create animation using standard phoneme durations to realize lip-sync.
Emotion modeling in speech production using emotion space

J Sato, S Morishima

RO-MAN '96 - 5TH IEEE INTERNATIONAL WORKSHOP ON ROBOT AND HUMAN COMMUNICATION, PROCEEDINGS 472 - 477 1996年

　概要を見る

This paper describes a scheme for modeling the emotions that appear in speech production using a neural network, and a technique for synthesizing emotional speech from neutral speech.
To model emotional states in speech production, the Emotion Space is introduced; it has already been proposed for facial expression modeling [1, 2]. The Emotion Space can represent the emotional state appearing in speech production in a two-dimensional space and realize both the mapping and the inverse mapping between the emotional state and the speech production.
We developed an Emotional Speech Synthesizer, which synthesizes emotional speech by modifying the timing, pitch, and intensity of neutral speech.
This paper also describes the results of a subjective evaluation of speech synthesized from the Emotion Space.
A face to face communication using real-time media conversion system

N Miyashita, T Sakaguchi, S Morishima

RO-MAN '96 - 5TH IEEE INTERNATIONAL WORKSHOP ON ROBOT AND HUMAN COMMUNICATION, PROCEEDINGS 543 - 544 1996年

　概要を見る

A prototype of a user-friendly human interface that facilitates natural human-machine communication has been developed. The interface uses a real-time media conversion system that has a virtual agent with a human face. The system can synthesize very natural facial motion images at video rate and can be used for communication between a user and the virtual agent.
仮想現実的顔画像処理システムを用いた顔面表情知覚の精神物理学的研究特に基本表情の強度と感覚量の関係について

山田寛, 中村宏信, 森島繁生, 原島博

電子情報通信学会技術研究報告 95 ( 477(HCS95 20-25) ) 15 - 20 1996年

J-GLOBAL
分析・合成のための表情記述パラメータ推定

川上文雄, 森島繁生, 山田寛, 原島博

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 380 1996年

J-GLOBAL
GUIを用いたヘアスタイルデザインシステムの開発

三枝太, 安藤真, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 395 1996年

J-GLOBAL
顔モデルにおける口唇形状動作の改善

川原光貴, 岩沢昭一郎, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 402 1996年

J-GLOBAL
3台のカメラによるリモートセンシング画像からの標高測定

安藤裕幸, 安藤真, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 2 ) 209 1996年

J-GLOBAL
仮想人物とユーザの対話を実現するための音声から画像への実時間メディア変換システム

宮下直也, 岩沢昭一郎, 坂口竜己, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 403 1996年

J-GLOBAL
顔動画像からの周波数特性解析による表情認識

小島孝治, 坂口竜己, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 379 1996年

J-GLOBAL
表情の三次元計測に基づく顔画像合成規則の検討

高橋成晴, 坂口竜己, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 382 1996年

J-GLOBAL
感情音声による感情空間の構築

佐藤順, 川上文雄, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 400 1996年

J-GLOBAL
陰影を考慮した頭髪の表現に関する一検討

安藤真, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 406 1996年

J-GLOBAL
物理法則に基づいた顔モデルによる口形状の表現

世良元, 岩沢昭一郎, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 401 1996年

J-GLOBAL
新しい表情記述パラメータ(直交化FACS)の提案とその表情分析・合成への応用

松川和正, 川上文雄, 森島繁生, 山田寛

電子情報通信学会大会講演論文集 1996 ( Sogo Pt 1 ) 381 1996年

J-GLOBAL
仮想人物との対話を実現するための音声から画像への実時間メディア変換システムの研究

宮下直也, 佐藤順, 坂口竜己, 森島繁生

電子情報通信学会技術研究報告 95 ( 553(MVE95 60-68) ) 17 - 24 1996年

J-GLOBAL
3次元感情モデルに基づく表情分析・合成システムの構築

川上文雄, 山田寛, 原島博, 森島繁生

電子情報通信学会技術研究報告 95 ( 552(HCS95 26-29) ) 7 - 14 1996年

J-GLOBAL
ダイナミックスモデルに基づく自然な頭髪アニメーション

安藤真, 三枝太, 松坂秀治, 森島繁生

電子情報通信学会技術研究報告 95 ( 552(HCS95 26-29) ) 15 - 22 1996年

J-GLOBAL
顔面筋の物理モデルに基づく会話時の口形状の制御

世良元, 岩沢昭一郎, 森島繁生, TERZOPOULOS D

電子情報通信学会技術研究報告 95 ( 553(MVE95 60-68) ) 9 - 16 1996年

J-GLOBAL
モデルフィッティングのための正面顔画像からの特徴点自動抽出

岩沢昭一郎, 森島繁生

テレビジョン学会技術報告 20 ( 41(MIP96 53-63/NIM96 75-85) ) 43 - 48 1996年

J-GLOBAL
空間周波数を利用した実時間顔表情認識

坂口竜己, 森島繁生

テレビジョン学会技術報告 20 ( 41(MIP96 53-63/NIM96 75-85) ) 49 - 54 1996年

J-GLOBAL
流体場解析における頭髪の運動表現

三枝太, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Society D ) 285 1996年

J-GLOBAL
動画像からの実時間表情認識

坂口竜己, 森島繁生

電子情報通信学会大会講演論文集 1996 ( Society A ) 340 - 341 1996年

J-GLOBAL
顔の三次元計測に基づく顔画像合成規則の検討

高橋成晴, 坂口竜己, 森島繁生

1996年電子情報通信学会春季大会 1996 ( Sogo Pt 1 ) 382 - 382 1996年

　概要を見る

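For model-based synthesis of facial expression images, the authors attached 49 measurement markers to a subject's face, recorded the subject from orthogonal frontal and side views with two synchronized cameras and VTRs, and obtained 3D displacements by tracking the markers during expression. The FACS expression synthesis rules we had been using were planar, with displacements determined empirically. Measuring from real images instead captures the 3D deformation, making it possible to synthesize more natural face images.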

CiNii J-GLOBAL
GUIを用いたヘアスタイルデザインシステムの開発

三枝太, 安藤真, 森島繁生

信学会春季大会, 1996 395 - 395 1996年

　概要を見る

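Realistic synthesis of human head images has become essential to facial expression synthesis in fields such as human interfaces and intelligent image coding. By approximating hair with "space curves" and adopting efficient rendering with approximate anti-aliasing and prediction, the authors succeeded in generating higher-quality images faster. For hair generation, we proposed a method that generates hair automatically on the surface of a given 3D model of a person's head. With this method, however, the hairstyle could not be designed interactively. We report that, by realizing an interface for editing hairstyles interactively, we succeeded in generating more natural hair images.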

CiNii
感情音声による感情空間の構築

佐藤順, 川上文雄, 森島繁生

電子情報通信学会総合大会講演論文集 0 ( 399 ) 400 - 400 1996年

　概要を見る

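To realize an interface that enables dialogue between people and machines, we are studying the emotional information contained in speech. It has already been reported that this information is carried mainly by prosody. To simplify the model for each emotion, prosodic information was obtained for each syllable and used as the parameters of the emotion contained in speech. In the field of facial expression, an emotion space using a neural network has been proposed, in which expression description parameters are compressed in the middle layer and the result is called the emotion space, for analyzing and synthesizing expressions. This paper applies that model to speech and proposes a speech-based emotion space. Parameters are also reproduced from the emotion space and added to neutral speech to generate emotional synthetic speech.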

CiNii J-GLOBAL
モデルフィッティングのための正面顔画像からの特徴点自動抽出

岩澤昭一郎, 森島繁生

テレビジョン学会技術報告 20 ( 41 ) 43 - 48 1996年

　概要を見る

Geometric modeling in computer graphics has been very difficult and has required a great deal of work. Facial geometry is particularly complex and individual. In this paper, a generic model is used for facial modeling and is deformed and fitted to facial feature points to make a personal model. This paper also describes an algorithm for extracting facial feature points from a frontal view image. The algorithm consists of region segmentation using color information to select the eye, eyebrow, and lip regions, and heuristic extraction of facial feature points within local regions. Experimental results using actual face images show that the error relative to manual modeling is slight.

DOI CiNii
空間周波数を利用した実時間顔表情認識

坂口竜己, 森島繁生

テレビジョン学会技術報告 20 ( 41 ) 49 - 54 1996年

　概要を見る

Considerable work has been done on facial expression recognition in psychology, engineering, and other fields. However, the trade-off between recognition accuracy and computational cost remains an unsettled problem and hinders real-time processing. In the last few years, real-time recognition systems using a high-speed graphics workstation or a transputer have appeared. Because they need such high-end computer systems, they are difficult to use as a simple interface between computer and human. In this paper, we propose a real-time facial expression recognition method that assumes a generic (low-cost) workstation or PC with a video capture function. Face extraction is based on a simple temporal difference image. A one-dimensional correlation matching method is used for feature tracking. The discrete cosine transform (DCT) is applied to the image, and the resulting coefficients are given to the neural network as feature vectors. In user-dependent expression classification experiments, we confirmed that the accuracy of our method is above 90% for five expression categories.

DOI CiNii
音声駆動による実時間表情変形システム : "Better Face Communication" at SIGGRAPH'95

森島繁生

電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 95 ( 345 ) 9 - 16 1995年10月

　概要を見る

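Synchronizing the speech and facial expression of anthropomorphic virtual agents and animated characters has become an important research topic. Lip sync, which synchronizes lip motion with speech, has been reported on widely. This paper reports a method for realizing lip sync in real time, with application to communication systems in mind. Spectral information is computed frame by frame from speech input through a microphone and converted into mouth shape parameters by a neural network. Expressions are realized by deforming a 3D wireframe model according to these parameters and mapping the person's texture onto it. We implemented this algorithm as a real-time system and exhibited at SIGGRAPH '95 a demonstration system in which, after a frontal still image and a few speech samples were captured from any visitor, visitors could control the mouth motion of their own face and add expressions, and we evaluated it there.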

CiNii
リモートセンシング画像の圧縮とその評価

野村晴尊, 森島繁生, 原島博

電子情報通信学会総合大会講演論文集 1995 ( 1 ) 205 - 205 1995年03月

　概要を見る

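Remote sensing is currently attracting attention as a means of resource exploration and earth observation. Data compression is strongly desired in order to acquire higher-resolution data, and the future goal is compression to about one tenth. This paper describes a compression method and a plan for its evaluation. The compression rate and S/N are defined by the following equations: $\mathrm{rate}(\%) = P_c / P_o \times 100$ (1) and $\mathrm{S/N}(\mathrm{dB}) = 10 \times \log_{10} \sum \{ 255^2 / (i_c - i_o)^2 \}$ (2), where $P_c$ is the data size of the compressed image, $P_o$ the data size of the original image, $i_c$ the luminance of the restored image, and $i_o$ the luminance of the original image.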

CiNii J-GLOBAL
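The entry above (リモートセンシング画像の圧縮とその評価) defines a compression rate as the ratio of compressed to original data size and an S/N in dB with a peak luminance of 255. A small sketch of both measures follows. The function names are hypothetical, and the S/N here is the conventional peak S/N computed from the mean squared error, which is an assumption: the summation in the abstract's equation (2) may be normalized differently.

```python
import math

def compression_rate(compressed_size, original_size):
    """Compression rate in percent: P_c / P_o * 100."""
    return compressed_size / original_size * 100.0

def peak_snr_db(original, restored, peak=255):
    """Peak S/N in dB between two equal-length luminance sequences.

    Assumption: uses the mean squared error over all pixels, the usual
    PSNR convention, rather than the literal form printed in the abstract.
    """
    if len(original) != len(restored):
        raise ValueError("images must have the same number of pixels")
    mse = sum((c - o) ** 2 for c, o in zip(restored, original)) / len(original)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)
```

With these definitions, the stated goal of compressing to about one tenth corresponds to a rate of roughly 10%.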
音声から口唇形状への実時間メディア変換

岩澤昭一郎, 上野雅俊, 森島繁生, 原島博

電子情報通信学会総合大会講演論文集 1995 ( Sogo Pt 1 ) 250 - 250 1995年03月

　概要を見る

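To realize smoother communication between computer systems and their users, the authors are considering the use of faces and facial expressions in interface systems, following human-to-human communication, in which more than half of the transmitted information is said to be based on facial expressions. For a face-based interface system, one ideal is to give the sense of reality of facing an actual person. Achieving such a high degree of realism requires not only that image and speech be synchronized, but also that motion be natural and that image generation and display be close to real time. This paper describes a real-time media conversion system that estimates mouth shape from speech and converts it into moving face images.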

CiNii J-GLOBAL
3次元感情空間を用いた心理学実験

川上文雄, 森島繁生, 山田寛, 原島博

電子情報通信学会総合大会講演論文集 1995 ( Sogo Pt 1 ) 251 - 251 1995年03月

　概要を見る

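Using the identity-mapping capability of a five-layer neural network, the authors have already compressed a 17-dimensional expression parameter space into three dimensions in the middle layer, regarded this as an emotion space, and built a system that analyzes and synthesizes expressions by realizing both the mapping from expressions to this space and its inverse (emotion space → expression). However, this 3D emotion space needs to be evaluated for psychological validity. We therefore conducted psychological experiments on subjects, using expressions synthesized from the emotion space as stimuli.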

CiNii J-GLOBAL
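The entry above (3次元感情空間を用いた心理学実験) relies on a five-layer network trained for identity mapping, with 17 expression parameters at the input and output and a 3-unit middle layer treated as the emotion space. The sketch below shows only the structure of such a network and how it splits into an analysis half and a synthesis half. The hidden-layer width of 10, the tanh units, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """One fully connected layer; the last weight in each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def forward(layer, x):
    """tanh(W x + b) for one layer."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
            for row in layer]

rng = random.Random(0)
SIZES = [17, 10, 3, 10, 17]  # 17 and 3 are from the abstract; 10 is assumed
LAYERS = [make_layer(a, b, rng) for a, b in zip(SIZES, SIZES[1:])]

def expression_to_emotion(params):
    """Analysis half: 17 expression parameters -> point in 3-D emotion space."""
    h = params
    for layer in LAYERS[:2]:
        h = forward(layer, h)
    return h

def emotion_to_expression(point):
    """Synthesis half: 3-D emotion point -> 17 expression parameters."""
    h = point
    for layer in LAYERS[2:]:
        h = forward(layer, h)
    return h
```

After identity-mapping training (omitted here), `emotion_to_expression(expression_to_emotion(p))` would approximate `p`, and points sampled in the 3-D space could be decoded into stimuli for a subjective evaluation.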
インターフェースマネージメントのための表情分析

森島繁生

知能情報メディア 8 197 - 222 1995年

CiNii
An evaluation of 3-D emotion space

F Kawakami, M Okura, H Yamada, H Harashima, S Morishima

RO-MAN'95 TOKYO: 4TH IEEE INTERNATIONAL WORKSHOP ON ROBOT AND HUMAN COMMUNICATION, PROCEEDINGS 269 - 274 1995年

　概要を見る

The goal of this research is to realize a very natural human-machine communication system by giving a facial expression to the computer. The 3-D Emotion Space we have already proposed can compactly represent, in a 3-D space, both human and computer emotional states appearing on the face [1]. Namely, this 3-D Emotion Space can realize both the mapping from facial expression to the 3-D space and its inverse. This paper is mainly about the subjective evaluation using facial expressions synthesized from the 3-D Emotion Space.
A modeling of facial expression and emotion for recognition and synthesis

S MORISHIMA, F KAWAKAMI, H YAMADA, H HARASHIMA

SYMBIOSIS OF HUMAN AND ARTIFACT: FUTURE COMPUTING AND DESIGN FOR HUMAN-COMPUTER INTERACTION 20 251 - 256 1995年

DOI
高精細3次元モデルを用いた音声から画像への実時間メディア変換システム

上野雅俊, 岩沢昭一郎, 森島繁生, 原島博

電子情報通信学会技術研究報告 94 ( 486(HC94 82-86) ) 17 - 24 1995年

J-GLOBAL
リモートセンシングデータの圧縮と標高測定精度への影響

野村晴尊, 森島繁生, 原島博

電子情報通信学会技術研究報告 94 ( 517(SANE94 101-109) ) 7 - 12 1995年

J-GLOBAL
リモートセンシング画像の圧縮とその評価

野村晴尊, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1995 ( Sogo Pt 2 ) 205 1995年

J-GLOBAL
3次元感情空間を用いた心理学実験

川上文雄, 森島繁生, 山田寛, 原島博

電子情報通信学会大会講演論文集 1995 ( Sogo Pt 1 ) 251 1995年

J-GLOBAL
頭髪と人体の高速な衝突判定に関する一検討

安藤真, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1995 ( Sogo Pt 1 ) 249 1995年

J-GLOBAL
音声から口唇形状への実時間メディア変換

岩沢昭一郎, 上野雅俊, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1995 ( Sogo Pt 1 ) 250 1995年

J-GLOBAL
音声駆動による実時間表情変形システム

森島繁生

電子情報通信学会技術研究報告 95 ( 345(MVE95 44-48) ) 9 - 16 1995年

J-GLOBAL
ニューラルネットを用いた表情パラメータ推定の試み

川上文雄, 山田寛, 原島博, 森島繁生

テレビジョン学会映像メディア部門冬季大会講演予稿集 1995 53 1995年

J-GLOBAL
直交化表情記述パラメータによる感情の分析・合成

松川和正, 川上文雄, 山田寛, 森島繁生

テレビジョン学会映像メディア部門冬季大会講演予稿集 1995 50 1995年

J-GLOBAL
流体モデルに基づく髪の毛の運動制御

安藤真, 松坂秀治, 森島繁生

テレビジョン学会映像メディア部門冬季大会講演予稿集 1995 49 1995年

J-GLOBAL
頭髪と人体の高速な衝突判定に関する一検討

安藤真, 森島繁生, 原島博

信学会総合大会, 1995 249 - 249 1995年

　概要を見る

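To make the synthesis of human facial expressions more realistic in fields such as human interfaces, the authors focused on hair and modeled the hair as a whole, thereby realizing responses to environmental change that were difficult with conventional textures, such as deformation of hair by external forces and changes in the hair surface. However, reproducing natural hair motion based on an actual physical model entails a heavy computational load. In particular, collision detection between hair and the human body, which is visually important, requires a very large amount of computation time because every polygon of the body and every hair are involved at once. Two main collision detection methods have been proposed so far. The first sets up a pseudo external force region enveloping the head and prevents hair from entering the head through the action of the pseudo force. The second prepares in advance, as a collision detection buffer, the distance from the head center to the surface in a cylindrical coordinate system. The former is fast because no special collision handling is needed, but it does not strictly avoid collisions, and the pseudo force must be set so that hair does not sink into the head. The latter takes little time for the test, but a collision detection buffer must be prepared for each object, and the hair coordinates must also be transformed for each object. We therefore report an attempt at a method that places the collision detection buffer in a Cartesian coordinate system, so that collisions are detected by simple comparison without any coordinate transformation.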

CiNii
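The entry above (頭髪と人体の高速な衝突判定に関する一検討) places a collision detection buffer in Cartesian coordinates so that a hair point can be tested by simple comparison, with no coordinate transformation. One way to picture this is a precomputed occupancy grid, sketched below. The grid resolution, the spherical stand-in for the head, and all names are hypothetical; the abstract does not describe the buffer layout.

```python
import math

def build_collision_buffer(size, inside):
    """Precompute a size^3 boolean grid; inside(x, y, z) tests a cell centre."""
    return [[[inside(x + 0.5, y + 0.5, z + 0.5) for z in range(size)]
             for y in range(size)]
            for x in range(size)]

def collides(buffer, point):
    """Look up the grid cell containing point; outside the grid is free space."""
    size = len(buffer)
    ix, iy, iz = (math.floor(c) for c in point)
    if not all(0 <= i < size for i in (ix, iy, iz)):
        return False
    return buffer[ix][iy][iz]

def in_sphere(x, y, z, centre=(8.0, 8.0, 8.0), radius=4.0):
    """Hypothetical head shape: a sphere in grid coordinates."""
    cx, cy, cz = centre
    return (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2

HEAD_BUFFER = build_collision_buffer(16, in_sphere)
```

Each hair segment endpoint costs one index computation and one lookup per frame, which is the kind of constant-time test the abstract is after.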
1-3 流体モデルに基づく髪の毛の運動制御

安藤真, 松坂秀治, 森島繁生

テレビジョン学会映像メディア部門冬季大会講演予稿集 1995 ( 0 ) 49 - 49 1995年

DOI CiNii
1-4 直交化表情記述パラメータによる感情の分析・合成

松川和正, 川上文雄, 山田寛, 森島繁生

テレビジョン学会映像メディア部門冬季大会講演予稿集 1995 ( 0 ) 50 - 50 1995年

DOI CiNii
2-1 ニューラルネットを用いた表情パラメータ推定の試み

川上文雄, 山田寛, 原島博, 森島繁生

テレビジョン学会映像メディア部門冬季大会講演予稿集 1995 ( 0 ) 53 - 53 1995年

DOI CiNii
Construction of 3-D emotion space based on parameterized faces

Robot and Human Communication - Proceedings of the IEEE International Workshop 216 - 221 1994年12月

　概要を見る

If machines could treat KANSEI information such as emotion, the relationship between humans and machines would become friendlier. Our goal is to realize a very natural human-machine communication environment by giving a face to a computer terminal or communication system. To realize this environment, the machine must recognize the human's emotional state appearing in the face and synthesize a reasonable facial image in response. For this purpose, the machine should have an emotion model based on parameterized faces that can map a person's emotional state one-to-one into this space and can also map inversely to reply.
EXPRESSION ANALYSIS SYNTHESIS SYSTEM BASED ON EMOTION SPACE CONSTRUCTED BY MULTILAYERED NEURAL-NETWORK

N UEKI, S MORISHIMA, H YAMADA, H HARASHIMA

SYSTEMS AND COMPUTERS IN JAPAN 25 ( 13 ) 95 - 107 1994年11月

　概要を見る

To realize a user-friendly interface where a human and a computer can engage in face-to-face communication, the computer must be able to recognize the emotional state of the human by facial expressions, then synthesize and display a reasonable facial image in response. To describe this analysis and synthesis of facial expressions easily, the computer itself should have some kind of emotion model. By using a five-layered neural network which has generalization and superior nonlinear mapping performance, identity mapping training was performed using parameterized facial expressions.
An emotion space was constructed by treating the space built in the middle layer of the five-layered neural network as an emotion model. Based on this emotion space, an attempt was made to build a system that realizes the mappings from an expression to an emotion and from an emotion to an expression simultaneously. Moreover, to recognize a facial expression from an actual facial image of a human, a method for extracting the facial parameters from the movement of facial points is investigated.
SA-6-2 高精細3次元モデルを用いた音声から画像への実時間メディア変換の一検討(SA-6. メディア変換・統合技術とヒューマンコミュニケーション,シンポジウム)

上野雅俊, 岩澤昭一郎, 森島繁生, 原島博

電子情報通信学会秋季大会講演論文集 1994 261 - 261 1994年09月

CiNii
多層ニューラルネットによって構成された感情空間に基づく表情の分析・合成システムの構築

上木伸夫, 森島繁生, 山田寛, 原島博

電子情報通信学会論文誌. D-II, 情報・システム, II-情報処理 = The transactions of the Institute of Electronics, Information and Communication Engineers 77 ( 3 ) 573 - 582 1994年03月

　概要を見る

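To realize a user-friendly interface in which a human and a computer can converse as if face to face, the computer must recognize the facial expression of the human, grasp the emotional state, and synthesize and display an appropriate, natural expression in response. To describe this expression analysis and synthesis easily, the computer itself needs some kind of emotion model. Using a five-layer neural network, which has generalization ability and excellent nonlinear mapping capability, identity-mapping training was performed on facial expressions described by parameters, and the middle-layer output space generated in this way was regarded as an emotion model, forming an emotion space. Based on this emotion space, we attempted to build a system that simultaneously realizes the mappings from expression to emotion and from emotion to expression. A subjective evaluation of the generated emotion space confirmed the validity of the model. Furthermore, to recognize expressions from actual face images, we also examined a method for obtaining expression parameters from facial feature points.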

CiNii J-GLOBAL
感情空間構成に関する工学的・心理学的研究 (文部省S)

森島繁生, 斎藤隆弘

感性情報処理の情報学・心理学的研究平成5年度 No.04236107 235 - 238 1994年

J-GLOBAL
3次元計測に基づく顔表情変化の分析と合成

坂口竜己, 森島繁生, 大谷淳, 岸野文郎

電子情報通信学会技術研究報告 93 ( 439(HC93 66-78) ) 61 - 68 1994年

J-GLOBAL
音声に含まれる感情情報抽出の一検討

平賀裕, 斉藤善行, 森島繁生, 原島博

電子情報通信学会技術研究報告 93 ( 439(HC93 66-78) ) 1 - 8 1994年

J-GLOBAL
短文音声に含まれる感情情報分析の一検討

平賀裕, 斉藤善行, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1994 ( Shunki Pt 1 ) 1.310 1994年

J-GLOBAL
表情パラメータを用いた三次元感情空間の構成

川上文雄, 森島繁生, 山田寛, 原島博

電子情報通信学会大会講演論文集 1994 ( Shunki Pt 1 ) 1.332 1994年

J-GLOBAL
自然な顔画像合成のための頭髪の表現と運動制御

安藤真, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1994 ( Shunki Pt 1 ) 1.341 1994年

J-GLOBAL
多層ニューラルネットによって構成された感情空間に基づく表情の分析・合成システムの構築

上木伸夫, 森島繁生, 山田寛, 原島博

電子情報通信学会論文誌 D-2 77 ( 3 ) 573 - 582 1994年

J-GLOBAL
高精細ワイヤフレームモデルに基づく自然な表情合成法の一検討

岩沢昭一郎, 上野雅俊, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1994 ( Shunki Pt 1 ) 1.344 1994年

J-GLOBAL
顔画像の3次元計測によるAU定量化の試み

坂口竜己, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1994 ( Shunki Pt 1 ) 1.335 1994年

J-GLOBAL
バンド間のベクトル量子化に基づくリモートセンシング画像の圧縮

野村晴尊, 松原亮彦, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1994 ( Shunki Pt 2 ) 2.183 1994年

J-GLOBAL
正面顔画像からの表情アニメーション作成支援ツールの作成

小沢一将, 岸健治, 坂口竜己, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1994 ( Shunki Pt 1 ) 1.338 1994年

J-GLOBAL
表情に基づく3次元感情空間への工学的・心理学的アプローチ

川上文雄, 坂口竜己, 森島繁生, 山田寛, 原島博

電子情報通信学会技術研究報告 93 ( 492(HC93 86-97) ) 53 - 58 1994年

J-GLOBAL
顔画像と音声のメディア統合

森島繁生, 原島博

マルチメディアと映像処理シンポジウム 1994 67 - 70 1994年

J-GLOBAL
コンピュータイメージフロンティア II 3 知的インタフェースのための表情分析・合成とメディア変換技術

森島繁生

O plus E 177 ( 177 ) 124 - 138 1994年

CiNii J-GLOBAL
高精細3次元モデルを用いた音声から画像への実時間メディア変換の一検討

上野雅俊, 岩沢昭一郎, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1994 ( Shuki Pt 1 ) 261 1994年

J-GLOBAL
高速な頭髪画像の生成と自然な運動制御

安藤真, 森島繁生, 原島博

情報処理学会シンポジウム論文集 94 ( 8 ) 19 - 27 1994年

J-GLOBAL
顔面像認識表情の認識工学の立場から

森島繁生

Medical Imaging Technology 12 ( 6 ) 688 - 693 1994年

DOI CiNii J-GLOBAL
音声に含まれる感情情報抽出の一検討

平賀裕, 斎藤善行, 森島繁生, 原島博

信学技報 93 ( 439(HC93 66-78) ) 1 - 8 1994年

　概要を見る

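To analyze the basic emotions contained in speech, we asked people with acting experience to utter single words and short sentences with emotion, and analyzed each. This study covers four emotions, "anger", "joy", "sadness", and "disgust", and, through comparison with "neutral" speech, focuses on the patterns of change in pitch frequency and amplitude, which had received little attention so far. For a richer emotion analysis, emotional speech was also collected from FM radio, evaluated subjectively, and examined in the same way. Although the results were not entirely free of inconsistencies, a great deal in common was found between the two.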

CiNii J-GLOBAL
3次元計測に基づく顔表情変化の分析と合成

坂口竜己, 森島繁生, 大谷淳, 岸野文郎

信学技報 93 ( 439(HC93 66-78) ) 61 - 68 1994年

　概要を見る

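To realize a more user-friendly environment for communicating with computers, we are studying interfaces that use moving facial expression images. The authors have already proposed generating expression animation with a model-based approach, but the expression deformation rules were built on 2D measurements, so satisfactory performance had not been obtained. In this paper, 3D measurement of the facial surface is used to obtain the displacement of the facial skin for each expression, and new movement control points (feature points) and movement rules are determined. The 3D measurement uses frontal and side images and achieves an accuracy of about ±1.2% error. Based on the measured feature point positions, the quantification of FACS AUs is revised, and an interpolation method for points other than the feature points is examined to synthesize more natural images.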

CiNii J-GLOBAL
表情に基づく3次元感情空間への工学的心理学的アプローチ

川上文雄, 坂口竜巳, 森島繁生, 山田寛, 原島博

信学技報 93 ( 492(HC93 86-97) ) 53 - 58 1994年

　概要を見る

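For the relationship between people and machines to become user-friendly, the machine must recognize the emotional state expressed on the face of the person it interacts with and synthesize an expression in response. For this, the machine needs some kind of quantified emotion model. The authors therefore described expressions as AU parameters and trained them repeatedly on a five-layer neural network, which has generalization ability and excellent nonlinear mapping capability, and built a 3D emotion model by changing the number of middle-layer units from the previous two to three. With this we built a system that realizes expression recognition and synthesis simultaneously. Comparing this 3D emotion model with the expression space proposed in psychology also yielded an interesting correlation.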

CiNii J-GLOBAL
1993 Picture Coding Symposium (PCS '93)

森島繁生

テレビジョン学会誌 47 ( 5 ) 768 - 768 1993年05月

CiNii
自然な表情アニメーションのための感情空間の構成

森島繁生

NICOGRAPH論文集, 1993 1993年

CiNii
表情インタフェースのための感情情報の定量表現とモデル化

森島繁生, 原島博

ヒューマンインタフェースシンポジウム論文集 9th 357 - 360 1993年

CiNii J-GLOBAL
多層ニューラルネットの恒等写像学習による感情空間の構成

上木伸夫, 森島繁生, 山田寛, 原島博

電子情報通信学会技術研究報告 92 ( 443(HC92 58-64) ) 17 - 22 1993年

J-GLOBAL
照明環境を保存する分析合成符号化の一検討

小野英太, 森島繁生, 原島博

電子情報通信学会技術研究報告 92 ( 443(HC92 58-64) ) 29 - 34 1993年

J-GLOBAL
テキストを併用した音声から顔動画像への新しいメディア変換方式

上田亨, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shunki Pt 1 ) 1.251 1993年

J-GLOBAL
基本感情を表現する音響特徴抽出の一検討

岡本勝規, 平賀裕, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shunki Pt 1 ) 1.433-1.434 1993年

J-GLOBAL
アメリカ英語発音訓練CAIシステムの構築

田中宏和, 坂口竜己, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shunki Pt 1 ) 1.253 1993年

J-GLOBAL
感情空間を用いた表情分析・合成の一検討

上木伸夫, 森島繁生, 山田寛, 原島博

電子情報通信学会大会講演論文集 1993 ( Shunki Pt 1 ) 1.243 1993年

J-GLOBAL
3次元モデルに基づく自然顔画像の陰影成分分離と再合成の試み

佐々木康, 小野英太, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shunki Pt 7 ) 7.403 1993年

J-GLOBAL
高精細3次元モデルを用いた表情分析の一検討

上野雅俊, 小野英太, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shunki Pt 1 ) 1.245 1993年

J-GLOBAL
バンド間の相関を利用したリモートセンシング画像の情報圧縮

野村晴尊, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shunki Pt 2 ) 2.144 1993年

J-GLOBAL
照明環境を保存した顔画像合成の一検討

小野英太, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shunki Pt 1 ) 1.247 1993年

J-GLOBAL
人物顔画像からの表情自動認識の試み

坂口竜己, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shuki Pt 1 ) 1.307-1.308 1993年

J-GLOBAL
自然音声における感情特性分析の一検討

平賀裕, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1993 ( Shuki Pt 1 ) 1.166 1993年

J-GLOBAL
コミュニケーションにおける知識(通信の立場から)

森島繁生

人工知能学会合同研究会AIシンポジウム資料 ( 9302 ) 25 - 26 1993年

J-GLOBAL
通信としての対話

森島繁生

人工知能学会合同研究会AIシンポジウム資料 ( 9302 ) 1 - 8 1993年

J-GLOBAL
表情の分析・合成を同時に実現する多層ニューラルネットによる感情空間の構成(VI.第12回大会発表要旨)

森島繁生, 山田寛, 原島博

基礎心理学研究 12 ( 1 ) 67 - 67 1993年

DOI CiNii
5)マルチメディア電子メールシステムの提案(画像通信システム研究会)

森島繁生, 原島博

テレビジョン学会誌 46 ( 2 ) 223 - 224 1992年02月

CiNii
顔画像によるマルチメディアインタフェースとその開発支援環境の構築

森島繁生

第7回画像符号化シンポジウムPCSJ92 231 - 234 1992年

CiNii
発音口形CAIシステムのための英文テキストから顔画像へのメディア変換

須藤学, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1992 ( Shunki Pt 1 ) 1.276 1992年

J-GLOBAL
データグローブのポインティングデバイスへの応用と評価

今井聡, 清岡昌吉, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1992 ( Shunki Pt 1 ) 1.290 1992年

J-GLOBAL
自動いき値設定による顔画像からの口領域抽出

五味秀雄, 小野英太, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1992 ( Shunki Pt 7 ) 7.177 1992年

J-GLOBAL
ヒューマンインタフェースのための表情動画像シナリオ作成ツールの開発

坂口竜己, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1992 ( Shunki Pt 1 ) 1.279 1992年

J-GLOBAL
部分パターンの登録・更新による簡易実時間動画像表情合成システム

平賀裕, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1992 ( Shunki Pt 1 ) 1.269 1992年

J-GLOBAL
音韻セグメントを考慮した蓄積音声から顔画像へのメディア変換

佐藤城二, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1992 ( Shunki Pt 1 ) 1.275 1992年

J-GLOBAL
表情アニメーション作成のためのシナリオ記述ツールとリアルタイム動画像表示

坂口竜己, 平賀裕, 森島繁生, 原島博

電子情報通信学会技術研究報告 91 ( 508(HC91 54-60) ) 23 - 30 1992年

J-GLOBAL
顔画像用高精度3次元モデルの階層的制御の一検討

小野英太, 上野雅俊, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1992 ( Shuki Pt 1 ) 1.156 1992年

J-GLOBAL
唇動画像の自動合成による発音口形CAIシステム

森島繁生, 坂口竜己, 原島博

電子情報通信学会大会講演論文集 1992 ( Shuki Pt 1 ) 1.203 1992年

J-GLOBAL
表情の認識・合成のためのニューラルネットによる感情モデルの生成

上木伸夫, 森島繁生, 原島博

電子情報通信学会大会講演論文集 1992 ( Shuki Pt 1 ) 1.162 1992年

J-GLOBAL
3次元構造モデルに基づく自然顔画像の陰影制御

佐々木康, 小野英太, 森島繁生, 原島博

電子情報通信学会技術研究報告 92 ( 386(PRU92 78-88) ) 17 - 23 1992年

J-GLOBAL
3次元構造モデルに基づく自然顔画像の陰影制御

佐々木康, 小野英太, 森島繁生, 原島博

情報処理学会研究報告 92 ( 101(CG-60) ) 17 - 23 1992年

J-GLOBAL
自然な表情合成のための頭部高精細ワイヤフレームの構成とその階層的制御について

上野雅俊, 小野英太, 森島繁生, 原島博

情報処理学会研究報告 92 ( 101(CG-60) ) 9 - 16 1992年

J-GLOBAL
自然な表情合成のための頭部高精細ワイヤフレームの構成とその階層的制御について

上野雅俊, 小野英太, 森島繁生, 原島博

電子情報通信学会技術研究報告 92 ( 386(PRU92 78-88) ) 9 - 16 1992年

J-GLOBAL
ICASSP '91

森島繁生

テレビジョン学会誌 45 ( 7 ) 882 - 882 1991年07月

CiNii
SPEECH-TO-IMAGE MEDIA CONVERSION BASED ON VQ AND NEURAL NETWORK

S MORISHIMA, H HARASHIMA

ICASSP 91, VOLS 1-5 4 2865 - 2868 1991年

　概要を見る

Automatic media conversion schemes from speech to a facial image and a construction of a real-time image synthesis system are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system with synthesized human face images. A human face image is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized by transformation of the 3-D model. In the motion driving method, based on vector quantization and the neural network, the synthesized head image can appear to speak some given words and phrases naturally, in synchronization with voice signals from a speaker.
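The abstract above drives facial motion from speech using vector quantization and a neural network. As a rough illustration of the vector-quantization half only, the sketch below maps a speech feature vector to the mouth-shape parameters stored for its nearest codeword. The codebook values, the two-dimensional features, and the (opening, width) parameterization are invented for the example and are not from the paper.

```python
def nearest_codeword(vec, codebook):
    """Index of the codeword closest to vec in squared Euclidean distance."""
    def dist2(i):
        return sum((a - b) ** 2 for a, b in zip(vec, codebook[i]))
    return min(range(len(codebook)), key=dist2)

# Hypothetical codebook of speech features and, for each codeword,
# hypothetical mouth-shape parameters (opening, width).
CODEBOOK = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.0)]
MOUTH_SHAPES = [(0.0, 0.5), (0.9, 0.6), (0.3, 1.0)]

def speech_frame_to_mouth(vec):
    """Quantize one speech frame and return its mouth-shape parameters."""
    return MOUTH_SHAPES[nearest_codeword(vec, CODEBOOK)]
```

A frame-by-frame lookup like this yields a sequence of mouth shapes that a 3-D model could then interpolate between.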
糸状物体の運動モデルとCGによるシミュレーション

小林誠司, 森島繁生, 原島博

電子情報通信学会技術研究報告 90 ( 434(PRU90 125-135) ) 15 - 20 1991年

J-GLOBAL
頭部の動き推定を付加した音声から画像へのメディア変換

小野英太, 岡田信一, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1991 ( Spring Pt 1 ) 1.272 1991年

J-GLOBAL
顔画像によるマルチメディアインタフェース構築の試み

安達一文, 今井聡, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1991 ( Spring Pt 1 ) 1.413-1.414 1991年

J-GLOBAL
ヘアデザインのための髪型形状の自動生成の試み

森島繁生, 菅野雅彦, 小林誠司, 原島博

電子情報通信学会全国大会講演論文集 1991 ( Spring Pt 7 ) 7.374 1991年

J-GLOBAL
糸状物体のアニメーション

小林誠司, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1991 ( Spring Pt 7 ) 7.373 1991年

J-GLOBAL
多層ニューラルネットのトポロジカルな特性

上木伸夫, 片山泰男, 森島繁生

電子情報通信学会全国大会講演論文集 1991 ( Spring Pt 6 ) 6.21 1991年

J-GLOBAL
頭部の動きを考慮した顔画像の特徴点抽出

岡田信一, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1991 ( Spring Pt 1 ) 1.266 1991年

J-GLOBAL
リアルタイム・メディア変換システムの改良

森島繁生, 黒部伸也, 下山田浩康, 大山公一, 原島博

電子情報通信学会全国大会講演論文集 1991 ( Spring Pt 1 ) 1.273 1991年

J-GLOBAL
音声と表情の同期表示

森島繁生

情報処理学会シンポジウム論文集 91 ( 4 ) 45 - 60 1991年

J-GLOBAL
表情インタフェースのための画像作成・編集・表示方式

森島繁生, 原島博

電子情報通信学会大会講演論文集 1991 ( Shuki Pt 1 ) 1.114 1991年

J-GLOBAL
マルチメディア電子メールシステムの提案

森島繁生, 原島博

テレビジョン学会技術報告 15 ( 60(ICS91 60-65) ) 25 - 28 1991年

J-GLOBAL
マルチメディア電子メールシステムの提案

森島繁生, 原島博

テレビジョン学会技術報告 15 ( 60 ) 25 - 28 1991年

　概要を見る

We have already proposed automatic facial motion image synthesis schemes driven by speech and text as media conversion schemes. The purpose of these schemes is to realize an intelligent human-machine interface or intelligent communication system with talking head images. The human face is reconstructed with a 3D surface model and a texture mapping technique. In this paper, we apply these schemes to a multimedia human-machine interface. One example is a multimedia e-mail system. A scenario description tool and a real-time image playback system are realized in a workstation window.

DOI CiNii J-GLOBAL
マルチプロセッサ構成による知的画像符号化のためのリアルタイム表情合成の試み

森島繁生, 小林誠司, 原島博

電子情報通信学会論文誌 D-2 情報・システム 73 ( 10 ) p1647 - 1654 1990年10月

CiNii J-GLOBAL
知的インタフェ-スのための顔の表情合成法の一検討

森島繁生, 岡田信一, 原島博

電子情報通信学会論文誌 D-2 情報・システム 73 ( 3 ) p351 - 359 1990年03月

CiNii J-GLOBAL
分散処理による音声から画像への実時間メディア変換システム

森島繁生

1990年画像符号化シンポジウムPCSJ90 7 - 6 1990年

CiNii
A REAL-TIME FACIAL ACTION IMAGE SYNTHESIS SYSTEM DRIVEN BY SPEECH AND TEXT

S MORISHIMA, K AIZAWA, H HARASHIMA

VISUAL COMMUNICATIONS AND IMAGE PROCESSING 90, PTS 1-3 1360 1151 - 1158 1990年

Automatic facial motion image synthesis schemes and a real-time system design are presented. The purpose of these schemes is to realize an intelligent human-machine interface or intelligent communication system with talking head images. A human face is reconstructed on the display of a terminal with a 3D surface model and a texture mapping technique. Facial motion images are synthesized naturally by transforming the lattice points of the wire frame. Two types of motion drive methods, text-to-image conversion and speech-to-image conversion, are proposed in this paper. In the former, the synthesized head can speak given texts naturally; in the latter, mouth and jaw motions are synthesized in time with a speaker's speech signal. These schemes were implemented on a parallel image computer, and a real-time image synthesizer could output facial motion images to the display at video rate.
SPEECH CODING BASED ON A MULTILAYER NEURAL NETWORK

S MORISHIMA, H HARASHIMA, Y KATAYAMA

IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS : ICC 90, VOLS 1-4 2 429 - 433 1990年

The authors present a speech-compression scheme based on a three-layer perceptron in which the number of units in the hidden layer is reduced. Input and output layers have the same number of units in order to achieve identity mapping. Speech coding is realized by scalar or vector quantization of hidden-layer outputs. By analyzing the weighting coefficients, it can be shown that speech coding based on a three-layer neural network is speaker-independent. Transform coding is automatically based on back propagation. The relation between compression ratio and SNR (signal-to-noise ratio) is investigated. The bit allocation and optimum number of hidden-layer units necessary to realize a specific bit rate are given. According to the analysis of weighting coefficients, speech coding based on a neural network is transform coding similar to the Karhunen-Loeve transformation. The characteristics of a five-layer neural network are examined. It is shown that since the five-layer neural network can realize nonlinear mapping, it is more effective than the three-layer network.
音声情報圧縮を実現する多層ニューラルネットワークの特性解析

森島繁生, 中山博文, 清水誠司, 片山泰男, 原島博

電子情報通信学会技術研究報告 89 ( 431(SP89 119-123) ) 15 - 22 1990年

CiNii J-GLOBAL
CELPにおける駆動音源信号の適応符号化について

熊野聡, 森島繁生, 原島博

電子情報通信学会技術研究報告 89 ( 432(SP89 124-130) ) 9 - 16 1990年

J-GLOBAL
恒等写像を実現する多層ニューラルネットの特性について

清水誠司, 片山泰男, 森島繁生

電子情報通信学会全国大会講演論文集 1990 ( Spring Pt.7 ) 7.224 1990年

J-GLOBAL
音源情報の適応ビット配分に基づくCELP符号化

熊野聡, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1990 ( Spring Pt.1 ) 1.431-432 1990年

J-GLOBAL
知的画像符号化のための音声からのリアルタイム表情合成

春山達郎, 小山昌岐, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1990 ( Spring Pt.7 ) 7.298 1990年

J-GLOBAL
次世代マンマシンインターフェースのための表情合成システムの開発

中林靖, 岡田信一, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1990 ( Spring Pt.7 ) 7.207 1990年

J-GLOBAL
多層ニューラルネットに基づく音声符号化の一検討

中山博文, 熊野聡, 片山泰男, 森島繁生

電子情報通信学会全国大会講演論文集 1990 ( Spring Pt.1 ) 429 - 433 1990年

J-GLOBAL
ニューラルネットに基づく音声から画像へのメディア変換

青山和義, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1990 ( Spring Pt.7 ) 7.171 1990年

J-GLOBAL
顔面像からの唇の特徴点抽出法

小林誠司, 中村都, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1990 ( Spring Pt.7 ) 7.26 1990年

J-GLOBAL
知的インタフェースのための顔の表情合成法の一検討

森島繁生, 岡田信一, 原島博

電子情報通信学会論文誌 D-2 73 ( 3 ) 351 - 359 1990年

J-GLOBAL
マルチプロセッサ構成による知的画像符号化のためのリアルタイム表情合成の試み

森島繁生, 小林誠司, 原島博

電子情報通信学会論文誌 D-2 73 ( 10 ) 1647 - 1654 1990年

J-GLOBAL
Intelligent facial image coding driven by speech and phoneme

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 3 1795 - 1798 1989年12月

The authors propose and compare two types of model-based facial motion coding schemes, i.e., synthesis by rules and synthesis by parameters. In synthesis by rules, facial motion images are synthesized on the basis of rules extracted by analysis of training image samples that include all of the phonemes and coarticulation. This system can be utilized as an automatic facial animation synthesizer from text input or as a man-machine interface using the facial motion image. In synthesis by parameters, facial motion images are synthesized on the basis of a code word index of speech parameters. Experimental results indicate good performance for both systems, which can create natural facial-motion images at a very low transmission rate. Details of 3-D modeling, algorithm synthesis, and performance are discussed.

DOI
ニューラルネットに基づく画像圧縮 : 高速化と重み行列について

風山雅裕, 片山泰男, 森島繁生

全国大会講演論文集 38 ( 0 ) 490 - 491 1989年03月

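In experiments on compression coding of image data using a back-propagation neural network, we propose a method for speeding up learning. As an attempt to reduce the code size, we also cut some of the hidden-layer outputs and report the results of evaluating the resulting images. We further present the weights between the input and hidden layers and between the output and hidden layers, and discuss the relationship between them.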

CiNii
知的画像符号化に基づく自然なマンマシンインターフェースの実現

森島繁生, 原島博

ヒューマンインタフェースシンポジウム論文集 5th 449 - 456 1989年

J-GLOBAL
ニューラルネットに基づく音声情報圧縮

森島繁生, 小松一樹, 片山泰男, 原島博

電子情報通信学会技術研究報告 88 ( 453 ) 61-66(SP88-142) 1989年

CiNii J-GLOBAL
知的インターフェースのための顔の表情合成法の一検討

岡田信一, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1989 ( Spring Pt.7 ) 7.153 1989年

J-GLOBAL
トランスピュータによる表情合成の高速化に対する一検討

小林誠司, 市川正浩, 正村和由, 森島繁生

電子情報通信学会全国大会講演論文集 1989 ( Spring Pt.7 ) 7.156 1989年

J-GLOBAL
ニューラルネットによる音声の符号化

森島繁生, 小松一樹, 片山泰男, 原島博

電子情報通信学会全国大会講演論文集 1989 ( Spring Pt.1 ) 1.343-1.344 1989年

J-GLOBAL
音声情報による顔画像の表情合成法の改良への一検討

青木孝泰, 井口陽子, 森島繁生, 原島博

電子情報通信学会全国大会講演論文集 1989 ( Spring Pt.7 ) 7.151 1989年

J-GLOBAL
ニューラルネットに基づく画像圧縮高速化と重み行列について

風山雅裕, 片山泰男, 森島繁生

情報処理学会全国大会講演論文集 38th ( 1 ) 490 - 491 1989年

J-GLOBAL
知的通信と知的符号化

原島博, 森島繁生

日本音響学会誌 45 ( 7 ) 534 - 540 1989年

Article classification: Electrical engineering -- Telecommunications -- Telecommunications applications

DOI CiNii
Very low bit rate speech coding based on a phoneme recognition

25 n 13 71 - 72 1988年12月

Summary form only given, as follows. A new speech compression technique for voice storage or voice mail is presented. The basic coding scheme of this system is stochastic coding (CELP), but the results of phoneme recognition and segmentation are used as the criterion for vector quantization (VQ) codebook selection and voiced-unvoiced control. The recognition process uses heuristic knowledge to decide among nine phonemes. Codebooks for both PARCOR coefficients and excitations for each phoneme are trained on a sequence of 75 spoken words that includes all the VCV patterns. The phoneme code number is quantized at the beginning of each segment to select the optimum codebooks and strategies for that segment. This scheme can be categorized as multiple-stage VQ. Thus the size of each codebook is very small and the length of each segment is very long. Very-low-bit-rate coding with high quality can be realized, and a special procedure can be performed to increase the intelligibility. When the average bit rate is 860 b/s, the experimental results show an average segmental SNR of 6.30 dB, and a subjective test indicates good intelligibility and phoneme clarity.
符号駆動線形予測に基づく低ビットレ-ト音声符号化

熊野聡, 森島繁生

成蹊大学工学部工学報告 ( 46 ) p3123 - 3131 1988年09月

CiNii J-GLOBAL
音声情報に基づく表情の自動合成の研究

森島繁生

第4回NICOGRAPH論文コンテスト論文集 139 - 146 1988年

CiNii
モデルに基づく口画像の分析合成音色から画像へのメディア変換

金子克美, 森島繁生, 相沢清晴, 原島博

電子情報通信学会全国大会講演論文集 1988 ( Pt. D-2 ) 93 1988年

J-GLOBAL
モデルに基づく口画像の規則合成テキストから画像へのメディア変換

向井和彦, 森島繁生, 相沢清晴, 原島博

電子情報通信学会全国大会講演論文集 1988 ( Pt. D-2 ) 94 1988年

J-GLOBAL
顔の知的画像符号化を利用した知的マンマシンインターフェース

森島繁生, 相沢清晴, 原島博

電子情報通信学会全国大会講演論文集 1988 ( Autumn Pt. D-1 ) 233 - 234 1988年

J-GLOBAL
音響処理と記号処理とを融合した単語音声認識システムの構成

森島繁生, 原島博

電子情報通信学会論文誌 D 情報・システム 70 ( 10 ) p1890 - 1901 1987年10月

CiNii J-GLOBAL
電子情報通信学会編, 中前栄八郎著, ニューメディア技術シリーズ, "コンピュータグラフィックス", オーム社, A5判, 244p., ¥3,500, 1987

森島繁生

情報処理 28 ( 6 ) 813 - 814 1987年06月

CiNii
知的音声符号化の構想と認識に基づく音声符号化

森島繁生, 原島博

電子情報通信学会技術研究報告 87 ( 31 ) 17-24(SP87-10) 1987年

J-GLOBAL
音響処理と記号処理とを融合した単語音声認識システムの構成

森島繁生, 原島博

電子情報通信学会論文誌 D 70 ( 10 ) 1890 - 1901 1987年

J-GLOBAL
認識に基づくコードブック選択機能を有する低レート音声符号化

森島繁生, 原島博

電子情報通信学会情報・システム部門全国大会講演論文集 1987 ( 1 ) 337 - 338 1987年

J-GLOBAL
PROPOSAL OF A KNOWLEDGE BASED ISOLATED WORD RECOGNITION.

MORISHIMA S.

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 713 - 716 1986年12月

The authors describe a knowledge-based isolated Japanese word recognition algorithm. The program is written in Prolog/KR and has two basic inference processes: a bottom-up search and a top-down search. In the bottom-up process, segmentation and vowel decision are performed and some target word patterns are generated. The top-down process includes a consonant decision using a score for each candidate word calculated on the basis of fuzzy set theory. In the vowel inference step, template matching is applied. In the segmentation step, heuristic rules based on the spectrum transition and the waveform are used. In the consonant inference step, each rule has a hierarchical structure and the consonant is defined automatically in the form of a multivalued threshold function from learning data.

CiNii
統計的手法に基づくプロダクションル-ルの自動抽出法とファジイ木探索

森島繁生, 原島博

電子通信学会論文誌 D 69 ( 11 ) p1754 - 1764 1986年11月

CiNii
多きょく論理を応用した単語音声中の子音認識ルールについて

森島繁生, 原島博

情報処理学会全国大会講演論文集 32nd ( 2 ) 1477 - 1478 1986年

J-GLOBAL
知識に基づく単語音声認識プロセスの効率化に関する検討

森島繁生, 原島博

電子通信学会技術研究報告 86 ( 93 ) 33-40(SP86-31) 1986年

J-GLOBAL
統計的手法に基づくプロダクションルールの自動抽出法とファジィ木探索

森島繁生, 原島博

電子通信学会論文誌 D 69 ( 11 ) 1754 - 1764 1986年

J-GLOBAL
しきい論理に基づく新しい知識表現形式とルールの自動抽出について

森島繁生, 原島博, 宮川洋

情報処理学会全国大会講演論文集 31st ( 2 ) 1031 - 1032 1985年

J-GLOBAL
MEDICAL INFERENCE AND DECISION SUPPORTING SYSTEM (MINDS) APPLIED FOR PSYCHIATRY.

Japanese Journal of Medical Electronics and Biological Engineering 22 938 - 939 1984年04月

This paper describes MINDS (Medical Inference and Decision Supporting System). In a field like psychiatry, it is difficult to define rules for diagnoses, so the authors tried to define rules automatically in this system. They report correct diagnoses in 94.3% of didactic cases and in 79.0% of new cases with these rules. The system has a three-level structure. The first level includes the patient's answers to standardized tests (TPI etc.); the second level is estimated from symptom patterns; the third level is the diagnostic class. From the first to the second level, statistical methods were used, and from the second to the third, a production rule technique was employed.
医療診断システムにおける診断ルール自動抽出に関する一検討

森島繁生, 原島博, 宮川洋, 斉藤陽一

電子通信学会技術研究報告 83 ( 164 ) 1-6(PRL83-33) 1983年

J-GLOBAL
医療診断システムにおける推論アルゴリズムの一検討

森島繁生, 原島博, 宮川洋, 斉藤陽一

電子通信学会技術研究報告 82 ( 204 ) 115-121(PRL82-63) 1982年

J-GLOBAL

Courses currently taught

物理入門　（物理）　【前年度成績S評価者用】

先進理工学部

2025年春学期
物理入門　（物理）

先進理工学部

2025年春学期
Scientific Research

先進理工学部

2025年春学期
理工学基礎実験２Ｂ　物理

先進理工学部

2025年秋学期
理工学基礎実験２Ｂ　応物

先進理工学部

2025年秋学期
卒業研究　　【前年度成績S評価者用】

先進理工学部

2025年通年
卒業研究

先進理工学部

2025年通年
デジタル信号処理

先進理工学部

2025年秋学期
応用物理学実験Ｂ

先進理工学部

2025年通年
応用物理学実験Ｂ　【前年度成績S評価者用】

先進理工学部

2025年通年
回路理論Ａ　（応物）　【前年度成績S評価者用】

先進理工学部

2025年春学期
回路理論Ａ　（応物）

先進理工学部

2025年春学期
物理入門　（応物）　【前年度成績S評価者用】

先進理工学部

2025年春学期
物理入門　（応物）

先進理工学部

2025年春学期
回路理論Ｂ　（物理）

先進理工学部

2025年秋学期
回路理論Ａ　（物理）

先進理工学部

2025年春学期
卒業研究【前年度成績S評価者用】

先進理工学部

2025年通年
デジタル信号処理

先進理工学部

2025年秋学期
卒業研究

先進理工学部

2025年通年
物理実験Ｂ　　【前年度成績S評価者用】

先進理工学部

2025年通年
物理実験Ｂ

先進理工学部

2025年通年
応用物理学実験Ａ

先進理工学部

2025年通年
回路理論Ｂ　 (応物）

先進理工学部

2025年秋学期
Graduation Thesis Fall [S Grade]

先進理工学部

2025年秋学期
Graduation Thesis Spring [S Grade]

先進理工学部

2025年春学期
Graduation Thesis Fall

先進理工学部

2025年秋学期
Graduation Thesis Spring

先進理工学部

2025年春学期
Current Topics in Physics

先進理工学部

2025年秋学期
Current Topics in Physics [S Grade]

先進理工学部

2025年秋学期
Research on Image Processing

大学院先進理工学研究科

2025年通年
画像情報処理研究

大学院先進理工学研究科

2025年通年
Master's Thesis (Department of Pure and Applied Physics)

大学院先進理工学研究科

2025年通年
修士論文（物理応物）

大学院先進理工学研究科

2025年通年
Seminar on Image Processing D

大学院先進理工学研究科

2025年秋学期
Seminar on Image Processing C

大学院先進理工学研究科

2025年春学期
画像情報処理演習Ｄ

大学院先進理工学研究科

2025年秋学期
画像情報処理演習Ｃ

大学院先進理工学研究科

2025年春学期
物理学及応用物理学海外特別演習Ｄ

大学院先進理工学研究科

2025年通年
物理学及応用物理学海外特別演習Ｂ

大学院先進理工学研究科

2025年通年
物理学及応用物理学海外特別演習Ａ

大学院先進理工学研究科

2025年通年
画像情報処理研究

大学院先進理工学研究科

2025年通年
物理学及応用物理学海外特別演習Ｃ

大学院先進理工学研究科

2025年通年
あいおいニッセイ同和損害保険株式会社寄附講座メタバースと法

法学部

2025年秋学期

Courses previously taught

画像情報処理工学特論

早稲田大学
デジタル信号処理

早稲田大学
回路理論

早稲田大学, 成蹊大学

Concurrent appointments in other faculties and graduate schools

理工学術院大学院先進理工学研究科
附属機関・学校グローバル・エデュケーション・センター

Concurrent appointments at university research institutes and affiliated organizations

2024年

-

2026年

理工学術院総合研究所兼任研究員