Details of a Researcher - KIKUCHI, Hideaki

KIKUCHI, Hideaki

Scopus Paper Info

Paper Count: 42 Citation Count: 253 h-index: 6

The data was downloaded from the Scopus API on November 15, 2025, via http://api.elsevier.com and http://www.scopus.com .

Affiliation

Faculty of Human Sciences, School of Human Sciences

Job title

Professor

Degree

Doctorate in Information Science (Waseda University)

Homepage URL

https://sites.google.com/view/kikuchihideaki/home

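Professional Memberships

Human Interface Society

Information Processing Society of Japan (IPSJ)

The Institute of Electronics, Information and Communication Engineers (IEICE)

The Japanese Society for Artificial Intelligence (JSAI)

Acoustical Society of Japan (ASJ)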

Research Areas

Kansei informatics / Human interface and interaction / Intelligent informatics

Research Interests

Speech Science, Spoken Dialogue, Human Agent Interaction

Awards

Convention Excellence Award, 53rd National Convention of the Information Processing Society of Japan

1996.03

Papers

Individual user tendencies in evaluations to empathetic responses by spoken dialogue systems (Special Feature: Diversity of Speech and Its Applications)

菊池浩史, 宮澤幸希, 佐藤可直, 楊潔, 菊池英明

Journal of the Acoustical Society of Japan 81 ( 1 ) 84 - 87 2025
The Effect of the Repetitive Utterances Variation on User’s Empathy and Engagement by a Chat-Oriented Spoken Dialogue System

YANG Jie, KIKUCHI Hirofumi, KIKUCHI Hideaki

Journal of Japan Society for Fuzzy Theory and Intelligent Informatics 36 ( 4 ) 713 - 721 2024.11

Although the advent of large language models has been addressing the issue of users’ engagement in conversations with dialogue systems, it is still not sufficient. It is believed that empathic responses by a dialogue system are effective in motivating users’ engagement with dialogue systems. It has also been suggested that repetitive utterances have a function of showing empathy and influence the user’s engagement. This study focuses on repetitive utterances with an empathetic function and examines the effect of the variation of repetitive utterances by communication robots on users’ engagement in the dialogue. Based on GPT-4, we constructed a chat-oriented communication robot that automatically generates repetitive utterances and conducted a robot dialogue experiment. As a result, among subjects with a strong ‘anxiety toward behavioral characteristics of robots,’ evaluations of engagement and perceived empathy were significantly higher for a communication robot that produced a sufficiently high variation of repetitive utterances.

A Method for Detecting and Displaying Articulation Movements in Real-time MRI Movies Using Optical Flow (Special Feature: 2022 Conference of the Electronics, Information and Systems Society)

大浦杏奈, 浅井拓也, 菊池英明

IEEJ Transactions on Electronics, Information and Systems 143 ( 7 ) 686 - 693 2023.07
Promoting Users' Empathy and Desire of Continuing Dialogue by Chat-oriented Dialogue Robot with Linguistic Alignment

62 ( 2 ) 772 - 781 2021.02

Children’s Speech Inaccuracies and Developmental Change: An Elicited Production Study in 5- to 13-year-old Japanese Children

Iwamoto Kyoji, Kikuchi Hideaki, Mazuka Reiko

Journal of the Phonetic Society of Japan 25 2021

Analysis of the Associations between Voice Quality of Singing Voice and Color

KANATO Ai, KIKUCHI Hideaki

Transactions of Japan Society of Kansei Engineering 17 ( 1 ) 109 - 118 2018

This paper describes the relationship between the voice quality of singing voices and color. We conducted two experiments. As stimuli, 28 voices were recorded from 11 amateur female singers. First, voice quality and color were matched by paired comparison. Color stimuli were categorized into 3 conditions and selected from the PCCS color system. A binomial test showed that raters tended to agree with each other in the lightness condition. Second, voice stimuli were evaluated with 13 word pairs as psychological features, and 3 factors were extracted by factor analysis. In addition, 10 acoustic features were calculated as physical features. Correlation analysis showed that many color features are associated with impressions such as the “activity” factor. We also found that spectral centroid and spectral tilt might be related to some of the color features in the analysis of physical features.

Vowels in infant-directed speech: More breathy and more variable, but not clearer

Kouki Miyazawa, Takahito Shinya, Andrew Martin, Hideaki Kikuchi, Reiko Mazuka

COGNITION 166 84 - 93 2017.09 [Refereed]

Infant-directed speech (IDS) is known to differ from adult-directed speech (ADS) in a number of ways, and it has often been argued that some of these IDS properties facilitate infants' acquisition of language. An influential study in support of this view is Kuhl et al. (1997), which found that vowels in IDS are produced with expanded first and second formants (F1/F2) on average, indicating that the vowels are acoustically further apart in IDS than in ADS. These results have been interpreted to mean that the way vowels are produced in IDS makes infants' task of learning vowel categories easier. The present paper revisits this interpretation by means of a thorough analysis of IDS vowels using a large-scale corpus of Japanese natural utterances. We will show that the expansion of F1/F2 values does occur in spontaneous IDS even when the vowels' prosodic position, lexical pitch accent, and lexical bias are accounted for. When IDS vowels are compared to carefully read speech (CS) by the same mothers, however, larger variability among IDS vowel tokens means that the acoustic distances among vowels are farther apart only in CS, but not in IDS when compared to ADS. Finally, we will show that IDS vowels are significantly more breathy than ADS or CS vowels. Taken together, our results demonstrate that even though expansion of formant values occurs in spontaneous IDS, this expansion cannot be interpreted as an indication that the acoustic distances among vowels are farther apart, as is the case in CS. Instead, we found that IDS vowels are characterized by breathy voice, which has been associated with the communication of emotional affect. (C) 2017 Elsevier B.V. All rights reserved.

Scopus citations: 47
Assigning a Personality to a Spoken Dialogue Agent by Behavior Reporting

Yoshito Ogawa, Hideaki Kikuchi

NEW GENERATION COMPUTING 35 ( 2 ) 181 - 209 2017.04 [Refereed]

A method to assign a personality to a spoken dialogue agent is proposed and evaluated. The proposed method assigns a personality by having the agent report on its behavior independently of interaction with a user. The proposed method attempts to assign complex personalities. For this purpose, we have defined a behavior report dialogue and designed a personality assigning method using behavior reporting. The proposed method consists of three steps: collecting stereotypes between a personality and behavior through a questionnaire, designing the behavior report dialogue from the collected stereotypes, and having the agent report on its behavior at the start of interactions with a user. Experimental results show that the proposed method can assign a personality by repeating the behavior report dialogue (the assigned personality is equivalent to the personality determined by the collected stereotypes) and that reporting behavior influences the assigned personality. In addition, we verified that the proposed method can assign "kind", "judicious" and the five basic personalities defined in the Tokyo University Egogram Second Edition.

Scopus citations: 2
Turn-taking timing of mother tongue

Ichikawa Akira, Oohashi Hiroki, Naka Makiko, Kikuchi Hideaki, Horiuchi Yasuo, Kuroiwa Shingo

Studies in Science and Technology 5 ( 1 ) 113 - 122 2016

Automatic Estimation of Speaking Style in Speech Corpora Focusing on Speech Transcriptions

Shen Raymond, Kikuchi Hideaki

Journal of Natural Language Processing 21 ( 3 ) 445 - 464 2014

Recent developments in computer technology have allowed the construction and widespread application of large-scale speech corpora. To make data retrieval easier for users of speech corpora, we attempt to characterise the speaking style of speakers recorded in the corpora. We first introduce the three scales for measuring speaking style which were proposed by Eskenazi in 1993. We then use morphological features extracted from speech transcriptions, which have proven effective in discriminating between styles and identifying authors in the field of natural language processing, to construct an estimation model of speaking style. More specifically, we randomly choose transcriptions from various speech corpora as text stimuli with which to conduct a rating experiment on speaking style perception. Then, using the features extracted from these stimuli and the rating results, we construct an estimation model of speaking style using multiple regression analysis. After leave-one-out cross-validation, the results show that among the three scales of speaking style, the ratings of two scales can be estimated with high accuracy, which supports the effectiveness of our method in the estimation of speaking style.

Self Organizing Maps as the Perceptual Acquisition Model:-Unsupervised Phoneme Learning from Continuous Speech-

MIYAZAWA Kouki, SHIROSE Ayako, MAZUKA Reiko, KIKUCHI Hideaki

J. SOFT 26 ( 1 ) 510 - 520 2014

We assume that SOM is adequate as a language acquisition model of the native phonetic system. However, many studies do not consider the quantitative features (the appearance frequency and the number of frames of each phoneme) of the input data. Our model is designed to learn the acoustic characteristics of natural continuous speech and to estimate the number and boundaries of the vowel categories without explicit instruction. In the simulation trial, we investigate the relationship between the quantity of learning and the accuracy for the vowels in a single Japanese speaker's natural speech. As a result, we found that the recognition accuracy of our model ranges from 5% (/u/) to 92% (/s/).

Effect of Schemed Acting Directions on Speech Expressions : Toward the Achievement of Expressive Acted Speech

MIYAJIMA Takahiro, KIKUCHI Hideaki, SHIRAI Katsuhiko, OKAWA Shigeki

Journal of the Phonetic Society of Japan 17 ( 3 ) 10 - 23 2013.12

This paper explains a procedure for enhancing the expressiveness of acted speech. We designed our own "format of acting script" with reference to the theory of drama and created 280 acting scripts. We presented these acting scripts as acting directions to three actresses and collected 840 speech samples. For comparison, using typical emotional words as acting directions, we also collected 160 speech samples from each actress. We then compared the tendencies of various features for each data type and each speaker and found that our acting scripts are effective in enhancing the expressiveness of acted speech both psychologically and acoustically.

Learning Phonemic Vowel Length from Naturalistic Recordings of Japanese Infant-Directed Speech

Ricardo A. H. Bion, Kouki Miyazawa, Hideaki Kikuchi, Reiko Mazuka

PLoS ONE 8 ( 2 ) e51594 - e51594 2013.02

Scopus citations: 32
Towards the Text-level Characterization Based on Speech Generation

53 ( 4 ) 1269 - 1276 2012.04

Comparison between Spoken and Written Utterances Describing the Same Object (Special Issue on Human-Agent Interaction)

高松亮, 菊池英明

The IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (Japanese edition) A 95 ( 1 ) 146 - 156 2012.01

Study on System Utterance of Suggestion to Promote User's Acceptance in Driving Environment

MIYAZAWA Kouki, KAGETANI Takuya, SHEN Raymond, KIKUCHI Hideaki, OGAWA Yoshito, HATA Chihiro, OHTA Katsumi, HOZUMI Hideaki, MITAMURA Takeshi

Transactions of the Japanese Society for Artificial Intelligence 25 ( 6 ) 723 - 732 2010

In this study, we aim to clarify the factors that promote a user's acceptance of suggestions from an interactive agent in a driving environment. Our aim is to determine how people accept encouragement from interaction partners, and which kinds of dialogue or action control are necessary for the design of a car navigation system that makes suggestions and requests to drivers. First, we conducted an experiment to collect dialogues between humans in a driving simulation environment, and then analyzed the drivers' acceptance and evaluation of the navigators. As a result, we found that the presence and reliability of the navigator are strongly related to the acceptance of the navigator's suggestions. When navigators were next to drivers, the drivers' rate of suggestion acceptance rose. However, the drivers' stress also increased. In addition, based on linguistic and acoustic analysis of the navigators' utterances, we identified several points for designing system suggestion utterances that promote user acceptance. We found that stating the grounds for suggestions, giving exact numbers, and using wide pitch ranges are all strongly related to the acceptance of suggestions.

Recognition of Positive/Negative Speech Attitudes and Its Application to Spoken Dialogue Systems

藤江真也, 江尻康, 菊池英明, 小林哲則

IEICE Transactions (Japanese Edition) J88-D-II ( 2 ) 489 - 498 2005.03

Phonetic Labeling of the 'Corpus of Spontaneous Japanese' (Feature Articles: Phonetics and Speech Technology)

KIKUCHI Hideaki, MAEKAWA Kikuo, IGARASHI Yosuke, YONEYAMA Kiyoko, FUJIMOTO Masako

Journal of the Phonetic Society of Japan 7 ( 3 ) 16 - 26 2003

In an attempt to construct a large-scale database of spontaneous speech, the authors planned to give segmental and prosodic labels to spontaneous Japanese speech. This paper reports the method of this labeling and its performance. First, the performance of automatic segmental labeling by Hidden Markov Model was verified. Sample speech about four hours long was automatically labeled with phonemes and compared to the results of hand-labeling. The average difference in label boundaries from the hand-labeled data was 14.3 ms. Second, the performance of prosodic labeling by a newly proposed labeling scheme named X-JToBI (eXtended J_ToBI) was verified. The analysis of labeled data showed that the newly added inventories appeared in the spontaneous speech data and that the rate of inter-labeler agreement increased in nearly all types of labels.

A Meta-Utterance Generation Mechanism with Prediction of Mental State Changes in Spoken Dialogue

菊池英明, 白井克彦

IPSJ Journal Vol.43, No.7 2130 - 2137 2002.07

Transcription Method and Criteria for the 'Corpus of Spontaneous Japanese'

小磯花絵, 土屋菜緒子, 間淵洋子, 斉藤美紀, 籠宮隆之, 菊池英明, 前川喜久雄

Nihongo Kagaku (Japanese Linguistics) Vol.9 43 - 58 2001.04

Design of the Corpus of Spontaneous Japanese

Journal of the Phonetic Society of Japan 4 ( 2 ) 51 - 61 2000.08

Modeling of Spoken Dialogue Control for Improving Dialogue Efficiency

KIKUCHI Hideaki, SHIRAI Katsuhiko

Journal of Human Interface Society Vol.2, No.2 39 - 46 2000.05

Identification of Dialogue Lubricant Words in Task-Oriented Dialogue

NAKAZATO Shu, TAMOTO Masafumi, KIKUCHI Hideaki, YOSHIMURA Takashi

Journal of the Japanese Society for Artificial Intelligence Vol.14, No.5 900 - 906 1999.09

Control of Nonverbal Information at Turn-Taking in the Dialogue Interface of a Humanoid Robot

IPSJ Journal Vol.40, No.2 487 - 496 1999.02

A Communication-Type Digital Library Supporting the Spiral Growth of Information

ASHIZAWA Minoru, KIKUCHI Hideaki, MISHINA Yusuke, FUJISAWA Hiromichi, HIDAKA Minoru, YAMAZAKI Naoko, SAKURAI Akito

IEICE Transactions (Japanese Edition) Vol.J81-D-II, No.5 1014 - 1024 1998.05

Handling Interruptions through Turn Management in a Spoken Dialogue Interface

KIKUCHI Hideaki, KUDO Ikuo, KOBAYASHI Tetsunori, SHIRAI Katsuhiko

IEICE Transactions (Japanese Edition) Vol.J77-D-II, No.8 1502 - 1511 1994.08

Three Different LR Parsing Algorithms for Phoneme-Context-Dependent HMM-Based Continuous Speech Recognition

NAGAI Akito, Sagayama Shigeki, Kita Kenji, Kikuchi Hideaki

IEICE Trans. Inf. & Sys. Vol.E76-D, No.1 29 - 37 1993.01

This paper discusses three approaches for combining an efficient LR parser and phoneme-context-dependent HMMs and compares them through continuous speech recognition experiments. In continuous speech recognition, phoneme-context-dependent allophonic models are considered very helpful for enhancing the recognition accuracy. They precisely represent allophonic variations caused by the difference in phoneme-contexts. With grammatical constraints based on a context free grammar (CFG), a generalized LR parser is one of the most efficient parsing algorithms for speech recognition. Therefore, the combination of allophonic models and a generalized LR parser is a powerful scheme enabling accurate and efficient speech recognition. In this paper, three phoneme-context-dependent LR parsing algorithms are proposed, which make it possible to drive allophonic HMMs. The algorithms are outlined as follows: (1) Algorithm for predicting the phonemic context dynamically in the LR parser using a phoneme-context-independent LR table. (2) Algorithm for converting an LR table into a phoneme-context-dependent LR table. (3) Algorithm for converting a CFG into a phoneme-context-dependent CFG. This paper also includes discussion of the results of recognition experiments, and a comparison of performance and efficiency of these three algorithms.

Books and Other Publications

"Vocal Expression of Emotion", in "Emotion and Thought Sensing Technologies Toward IoH", supervised by 石井克典

KIKUCHI Hideaki

CMC Publishing 2019 ISBN: 9784781314303
"Spoken Dialogue Systems", in "Trends in Spoken Language Processing", edited by SHIRAI Katsuhiko

KIKUCHI Hideaki

Corona Publishing 2010
“Outline of J_ToBI”, in "COMPUTER PROCESSING OF ASIAN SPOKEN LANGUAGES"

KIKUCHI Hideaki

Americas Group Publications,U.S. 2010 ISBN: 0935047727
"Units of Analysis for Speech Communication: ToBI", in "Methods for Analyzing Multi-Party Interaction", edited by 坊農真弓 and 高梨克也

KIKUCHI Hideaki

Ohmsha 2009
"Estimation of Dialogue State Using Prosody", in "Prosody and Spoken Language Information Processing", edited by 広瀬啓吉

KIKUCHI Hideaki, SHIRAI Katsuhiko

Maruzen 2006 ISBN: 4621076744
"Voicing in Japanese," Van de Weijer, Nanjo, Nishihara (Eds.)

MAEKAWA Kikuo, KIKUCHI Hideaki

Mouton de Gruyter, Berlin and New York 2006
"Spoken Language Systems", S. Nakagawa et al. (Eds.)

HATAOKA Nobuo, ANDO Haru, KIKUCHI Hideaki

Ohmsha/IOS Press 2005 ISBN: 427490637x

Presentations

Investigation of acoustic features that affect pop-out evaluation in everyday conversational speech

MANTANI Kazuki, KIKUCHI Hideaki

Proceedings of the Technical Committee on Speech Communication The Technical Committee on Speech Communication,the Acoustical Society of Japan

Presentation date： 2024.09

Event date： 2024.09

In previous research, pop-out voices are defined as voices that are conspicuously perceived in the midst of disturbing sounds such as background noise. Factors that are expected to be associated with pop-out voices include "linguistic characteristics," "spectral shape," "degree of reverberation," and "type and nature of background noise." Therefore, in this study, we focus on the type and magnitude of background noise and examine how acoustic features affect the pop-out evaluation depending on the background noise, using audio recordings of daily conversations.
Construction and Verification of a Model for Estimating Real-time MRI Articulatory Movement Videos from Speech Signals: Toward a Speaker-Independent Model

OURA Anna, KIKUCHI Hideaki

Proceedings of the Technical Committee on Speech Communication The Technical Committee on Speech Communication,the Acoustical Society of Japan

Presentation date： 2024.09

Event date： 2024.09
A Rule-Based Dialogue System Using Pragmatic Dialogue Strategies

楊潔, 菊池浩史, 中下咲帆, 藤後英哲, 菊池英明

JSAI Technical Report, SIG-SLUD The Japanese Society for Artificial Intelligence

Presentation date： 2022.12

Event date： 2022.12

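In this work, we drew on findings from discourse research to build a multimodal dialogue system with SUNABA. The dialogue was divided into an apology dialogue and a free dialogue, and we designed the system to suit each situation and to feel humanlike. For the apology dialogue, we wrote the system utterances with reference to the pragmatic strategies used in apology scenes between humans, their usage rates, and the order in which they occur. We also created conditional branches according to the type of response from the person receiving the apology. In addition, we added devices to increase humanlikeness, such as utterances that show empathy and disfluent utterances. For the free dialogue, we created a system-led dialogue scenario using strategies such as system questions and implicit topic shifts. For speech (pitch, volume, and speaking rate) and gestures, the basic policy was to avoid producing consecutive responses with the same settings. We also controlled them in consideration of the desires that the apologizing party would be expected to have as the apology progresses. In the evaluation, the system was ranked fourth in the preliminary round.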
A Study of User's Acceptance Evaluation Model for Paralinguistic Information of Response Speech in Spoken Dialogue System

KIKUCHI Hirofumi, YANG Jie, KIKUCHI Hideaki

Proceedings of the Annual Conference of JSAI The Japanese Society for Artificial Intelligence

Presentation date： 2022

Event date： 2022

Today, Japan's super-aged society and other social backgrounds are increasing the demand for dialogue partners. Spoken dialogue systems are expected to be utilized for this demand. One of the roles of a dialogue system as a chat partner is to share mental states. When a dialogue system responds to a user's expressed mental state with paralinguistic information that the user cannot tolerate, the dialogue breaks down and the user's desire to continue the dialogue decreases. This research aims to solve the problem of such a breakdown. We have focused on the pleasant and unpleasant states expressed by user utterances and system responses, and have confirmed the existence of an acceptable range of system responses to various user utterances. In this paper, we discuss a tolerance evaluation model for outputting acceptable system responses according to the pleasant and unpleasant states expressed in user speech.
A Study on Various Adaptive Responses in Non-Task-Oriented Dialogue Systems Considering Users' Acceptable Range

KIKUCHI Hirofumi, YANG JIE, KIKUCHI Hideaki

Proceedings of the Annual Conference of JSAI The Japanese Society for Artificial Intelligence

Presentation date： 2021

Event date： 2021

Recently, the number of elderly people living alone is increasing in Japan. In these households, the frequency of conversation is decreasing. There are concerns that less frequent conversations will lead to a decline in health. Spoken dialogue systems are expected to be used to meet this demand for conversation. However, spoken dialogue systems have the problem that users' desire to continue the dialogue decreases. In this research, we aim to solve this problem of dialogue breakdown. We have previously confirmed, using a single speaker's utterances, that there exists an acceptable range of system responses to users' utterances. In this paper, we recorded user utterances by nine speakers and conducted a listening evaluation experiment to confirm the existence of an acceptable range for various types of user utterances. As a result, we clarified the tendency of the relationship between user utterances and system responses that is related to users' acceptance judgments.
Research of User's Acceptance Range Regarding Responses of Non-task-oriented Dialogue System

KIKUCHI Hirofumi, YANG JIE, KIKUCHI Hideaki

Proceedings of the Annual Conference of JSAI The Japanese Society for Artificial Intelligence

Presentation date： 2020

Event date： 2020

Japan is becoming a super-aging society. Elderly people living alone tend to talk with others less often and to have no one to rely on when in trouble. Dialogue agents are expected to reduce the social isolation of such elderly people. One of the roles of non-task-oriented dialogue systems is to share mental states. When users cannot accept the paralinguistic information with which a dialogue system responds to their expressed mental state, their desire to continue the dialogue decreases. This study aims to solve this problem of dialogue breakdown. Assuming a dialogue between a user and the system, we prepared speech stimuli that combined fixed user utterances with system responses and conducted a listening evaluation experiment. For the system response, we used the general-purpose backchannel "Soudesuka". Through the analysis, we investigated the acceptable range of system response variation, focusing on paralinguistic information. For the acceptable range, a measurable correlation was observed between the valence of the user's utterance and that of the system response. The discussion of these results leads to a proposed guideline for making the paralinguistic features of the system's responses to users' utterances acceptable.
The Influence on Users’ Empathy and Desire of Continuing Dialogue by Dialogue Robots with Linguistic Alignment

YANG Jie, KIKUCHI Hideaki

Proceedings of the Annual Conference of JSAI The Japanese Society for Artificial Intelligence

Presentation date： 2020

Event date： 2020

Linguistic alignment refers to the use of words similar to those of a conversational partner. In this research, we analyzed feature values of linguistic alignment in human-human non-task-oriented dialogue. Based on the obtained feature values, we then verified the influence on users of dialogue robots with linguistic alignment, taking users' attributes into account. We conducted a controlled experiment with 38 subjects using the Wizard of Oz method. The experimental conditions were divided into low-frequency, medium-frequency, and high-frequency according to the frequency of linguistic alignment during one dialogue. For each frequency condition, an experimental group (linguistic alignment) and a control group (backchannel) were set up. After each experiment, the subjects were required to answer three questions relating to empathy and desire of continuing dialogue on a five-point Likert scale. Through this experiment, we investigated the influence of dialogue robots with linguistic alignment on users' empathy and desire of continuing dialogue. Furthermore, the results suggest that the more negative the user's attitude toward the robot, the higher the effect of linguistic alignment.
The Effect of Promoting Empathy and Dialogue Continuance Desire by Dialogue Robots with Linguistic Alignment

YANG Jie, KIKUCHI Hideaki

JSAI Technical Report, SIG-SLUD The Japanese Society for Artificial Intelligence

Presentation date： 2019.11

Event date： 2019.11
Japanese children’s speaking rate reflect acquisition of mora-timed rhythm

岩本教慈, 近藤綾子, 菊池英明, 馬塚れい子

IEICE Technical Report

Presentation date： 2018

Event date： 2018
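Automatic Melody Arrangement Considering Music Genre Impressions

伊藤康佑, 金礪愛, 菊池英明

Proceedings of the 79th National Convention of IPSJ, Information Processing Society of Japan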

Presentation date： 2017.03

Event date： 2017.03

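This paper proposes an automatic arrangement method that takes into account the music genre impressions evoked when listening to melodies of Japanese popular music. Unlike conventional arrangement methods, it arranges melodies on the basis of music genre impressions and handles melodies in which multiple music genres are mixed, with the goal of obtaining diverse arrangement results. We conducted an impression evaluation experiment with Japanese participants and built a model that estimates music genre impressions from melodic features. Based on the resulting model, we also developed an automatic melody arrangement system that can arrange a melody entered by the user into one whose music genre impressions are adjusted to arbitrary values.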
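Differences in Turn-Taking between Native and Non-Native Speakers in Japanese Dialogue

張雪薇, 菊池英明

39th Conference of the Japanese Association of Sociolinguistic Sciences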

Presentation date： 2017.03

Event date： 2017.03
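Analysis of the Developmental Process of Children's Speaking Rate Based on Phonetic Methods

岩本教慈, 近藤綾子, 菊池英明, 馬塚れい子

Proceedings of the Meeting of the Acoustical Society of Japan (CD-ROM)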

Presentation date： 2017

Event date： 2017
Humor utterance generation for non-task-oriented dialogue systems

Shohei Fujikura, Yoshito Ogawa, Hideaki Kikuchi

HAI 2015 - Proceedings of the 3rd International Conference on Human-Agent Interaction Association for Computing Machinery, Inc

Presentation date： 2015.10

Event date： 2015.10

We propose a humor utterance generation method that is compatible with dialogue systems, to increase "desire of continuing dialogue". A dialogue system retrieves leading-item: noun pairs from Twitter as knowledge and attempts to select the most humorous reply using word similarity, which reveals that incongruity can be explained by the incongruity-resolution model. We consider the differences among individuals, and confirm the validity of the proposed method. Experimental results indicate that high-incongruity replies are significantly effective against low-incongruity replies with a limited condition.
Constructing the corpus of infant-directed speech and infant-like robot-directed speech

Ryuji Nakamura, Kouki Miyazawa, Hisashi Ishihara, Ken'ya Nishikawa, Hideaki Kikuchi, Minoru Asada, Reiko Mazuka

HAI 2015 - Proceedings of the 3rd International Conference on Human-Agent Interaction Association for Computing Machinery, Inc

Presentation date： 2015.10

Event date： 2015.10

The characteristics of the spoken language used to address infants have been eagerly studied as a part of language acquisition research. Because infants themselves cannot be controlled, researchers have tried to reveal the features and roles of infant-directed speech (IDS) by comparing speech directed toward infants with speech directed toward other listeners. However, those other listeners share few characteristics with infants, while infants have many characteristics from which the features of IDS may derive. In this study, to solve this problem, we introduce a new approach that replaces the infant with an infant-like robot, which is designed so that its motions can be controlled and its appearance closely resembles that of a real infant. We have now recorded both infant-directed and infant-like robot-directed speech and are constructing both corpora. Analysis of these corpora is expected to contribute to studies of infant-directed speech. In this paper, we discuss the contents of this approach and the outline of the corpora.
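Proposal and Evaluation of a Method for Synthesizing Infant Speech by Stretching the Spectral Envelope

岩本教慈, 宮澤幸希, 金礪愛, 馬塚れい子, 菊池英明

Proceedings of the Meeting of the Acoustical Society of Japan (CD-ROM)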

Presentation date： 2015

Event date： 2015
Mother Tongue (Japanese) Acquisition and Talker Alternation Timing of Elder Nursery School Child

市川熹, 川端良子, 菊池英明

Proceedings of the Meeting of the Acoustical Society of Japan, Acoustical Society of Japan (ed.), Tokyo: Acoustical Society of Japan

Presentation date： 2014

Event date： 2014
Humor Utterance Generation Method for Non-task-oriented Dialogue System

FUJIKURA Shohei, OGAWA Yoshito, KIKUCHI Hideaki

IEICE technical report. Natural language understanding and models of communication The Institute of Electronics, Information and Communication Engineers

Presentation date： 2013.11

Event date： 2013.11

In this study, we propose a humor generation method for non-task-oriented dialogue systems using Twitter. We have been aiming to establish design principles for dialogue systems that foster the desire to continue interaction by analyzing the factors that make users feel that they "want to chat again next time". We confirmed that handling humor is effective for the desire to continue interaction. In this paper, we propose a method by which a dialogue system can automatically generate humor using knowledge extracted from Twitter as modifier-noun pairs and value-predicate clause pairs. In an evaluation experiment, we confirmed that the proposed method can generate humor.
Desire of Continuing Interaction with Spoken Dialogue System

KIKUCHI Hideaki, MIYAZAWA Kouki, OGAWA Yoshito, FUJIKURA Shouhei

IEICE technical report. Speech The Institute of Electronics, Information and Communication Engineers

Presentation date： 2013.09

Event date： 2013.09

We aimed at improvement of desire of continuing interaction with a spoken dialogue system through the three cases of construction of spoken dialogue system. System utterances with humor, control of speech rate of system utterances and estimation of user's personality based on user's utterances are effective for improvement of desire of continuing interaction.
E-037 Development of Fatigue Degree Estimation System for Smartphone

Aoki Yuki, Miyajima Takahiro, Kikuchi Hideaki, Shiomi Kakuichi

Forum on Information Technology

Presentation date： 2013.08

Event date： 2013.08
J-054 Personality Recognition and Improvement of Dialogue Continuance Desire Using Control of Speech Rate and Speech Interval Length

Takeya Yuki, Ogawa Yoshito, Kikuchi Hideaki

Forum on Information Technology

Presentation date： 2013.08

Event date： 2013.08
The Relationship between the Level of Intimacy and Manner of Speech

NAKAZATO Shu, OSHIRO Yuji, KIKUCHI Hideaki

Technical report of IEICE. HIP The Institute of Electronics, Information and Communication Engineers

Presentation date： 2013.03

Event date： 2013.03

In this study, we analyzed the change of manner of speech with the level of intimacy. For our experiments, we collected two kinds of dialogue data: the initial meeting dialogue data we called "low intimacy" and that after six months we called "high intimacy". In our experiments, subjects listened to the dialogue of three pairs of speakers, and evaluated their impressions of the manner of speech through a questionnaire. Analyzing the results, we extracted four significant factors: "Liveliness", "Pleasantness", "Fluency" and "Speed". Comparing the factor scores for the low intimacy dialogues with the high intimacy dialogues, we found similar results for different partners in the low intimacy dialogues, but different factor scores for different partners in the high intimacy dialogues. In particular, the fluency score increased in the dialogues after six months.
Effects of an Agent's Feature Grasping on an User's Attachment

OGAWA Yoshito, HARADA Kaho, KIKUCHI Hideaki

IEICE technical report. Speech The Institute of Electronics, Information and Communication Engineers

Presentation date： 2013.02

Event date：
2013.02

　

　

　View Summary

In this study, we consider the effects of an agent's grasp of user features on the user's attachment to the agent. Recently, studies in HAI have investigated strategies that keep users using spoken dialogue systems over long periods of time. In this research, we propose a system that estimates the user's degree of activeness from prosody and accumulates each estimate, together with a correct label determined from the user's response, as training data for subsequent estimation. Our results show that the system produces more stable estimates, and that higher estimation accuracy makes users feel a stronger attachment.
対話システムと心的負担

市川熹, 滝沢恵子, 菊池英明, 大橋浩輝, 堀内靖雄, 黒岩眞吾

人工知能学会全国大会論文集一般社団法人人工知能学会

Presentation date： 2013

Event date：
2013

　

　

　View Summary

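When the intended users are elderly people or people with disabilities, a system that places little load on the user is needed. However, many current systems have not achieved natural dialogue and impose a burden on users. In this study, we analyze the timing of turn-taking in non-native and native-language dialogues and show that the pattern of turn-taking differs in dialogues between non-native and native speakers. This result suggests that, in native-language dialogue, participants take turns in ways that reduce each other's load.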
The construction of an evaluation scale for singing voice of popular music : in Amateur singing voice

KANATO Ai, KIKUCHI Hideaki

IEICE technical report. Speech The Institute of Electronics, Information and Communication Engineers

Presentation date： 2013.01

Event date：
2013.01

　

　

　View Summary

In this research, we tried to construct an evaluation scale for the singing voice in popular music. In this paper, we examined the effectiveness of the scale for amateur singing voices and the factors underlying the evaluation of singing voices. As a result, we constructed a scale of 12 words and confirmed its reliability. We also found a characteristic factor in the singing voice that differs from the speaking voice.
音声の発話者印象情報の知覚・認知モデル構築—Modeling of perception and cognition process of impressions of speaker

佐藤安里, 菊池英明, 市川熹

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報東京 : 電子情報通信学会

Presentation date： 2012.03

Event date：
2012.03

　

　
Acoustic Analysis of Infant Directed Speech Using the Multi-Timescale Phoneme Acquisition Model

MIYAZAWA Kouki, KIKUCHI Hideaki, MAZUKA Reiko

IEICE technical report. Speech The Institute of Electronics, Information and Communication Engineers

Presentation date： 2012.03

Event date：
2012.03

　

　

　View Summary

We propose a phoneme acquisition model that simulates the human auditory system, which perceives temporal variations in speech. Our model is designed to learn the acoustic feature values of continuous speech and to estimate the number and boundaries of phoneme categories without explicit instruction. The purpose of this study is to clarify the role of Infant-Directed Speech (IDS), which has high variability. We performed learning experiments on Adult-Directed Speech (ADS) and IDS using our model and compared the results. The accuracy rate for voiceless stops was significantly higher for IDS than for ADS. Accordingly, it is possible that IDS emphasizes the acoustic features of unsteady phonemes and thereby assists their acquisition.
パラ言語情報の音声知覚 : 小学生と大学生の比較—Perception of Paralinguistic Information in Speech : A Comparison between Elementary School Children and University Students

有賀亮, 菊池英明

論叢 : 玉川大学教育学部紀要 / [玉川大学教育学部] [編] 町田 : 玉川大学教育学部

Presentation date： 2012

Event date：
2012

　

　
音声コーパスの類似性可視化システムの改良

石本祐一, 板橋秀一, 山川仁子, 沈睿, 菊池英明, 松井知子

日本音響学会2011年春季研究発表会講演論文集

Presentation date： 2011.03

Event date：
2011.03

　

　
The Multi Timescale Phoneme Acquisition Model of the Self-Organizing Based on the Dynamic Features

宮澤幸希, 菊池英明, 馬塚れい子

Proceedings of INTERSPEECH 2011

Presentation date： 2011

Event date：
2011

　

　
連続音声からの音韻カテゴリ獲得モデルに関する考察

宮澤幸希, 三浦英朗, 菊池英明, 馬塚れい子

人工知能学会全国大会論文集一般社団法人人工知能学会

Presentation date： 2011

Event date：
2011

　

　

　View Summary

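Healthy infants acquire the phonemes of their native language within the first year of life. Infants are thought to learn from the linguistic input of the adults around them, but the detailed mechanism has not been clarified. We have therefore been simulating the categorization of the phonological system using natural continuous speech as input and a neural network model that mimics human cognitive abilities. In this study, we report a method for handling vowels and consonants in a single model based on the dynamic features of speech.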
A Method of Estimation of Stress for Evaluation of Speech Interface

FUKURO Natsuko, KIKUCHI Hideaki

The Transactions of Human Interface Society Human Interface Society

Presentation date： 2010.11

Event date：
2010.11

　

　

　View Summary

This paper describes a method of estimating stress by speech analysis technology for the evaluation of speech interfaces. First, we conducted an experiment evaluating a speech interface and observed participants' actions and speech when they had trouble using the interface. Some participants seemed to feel anger and anxiety while using the speech interface, and we focused on 'anxiety', which derives from problems with the timing of speaking or the design of system functions. Then, we collected utterances spoken in a state of 'anxiety' in a simulated speech interface environment. We built a decision tree from the collected utterances and confirmed the high accuracy of 'anxiety' estimation. Finally, we applied the tree to the utterances from the speech interface evaluation experiment. The accuracy was high (71.9-90.0%), and it was confirmed that our method of estimating 'anxiety' by speech analysis technology is effective for finding problems in the use of a speech interface.
Unsupervised learning of vowels from continuous speech based on self-organized phoneme acquisition model

宮澤幸希, 菊池英明, 馬塚れい子

Proceedings of Interspeech

Presentation date： 2010

Event date：
2010

　

　
印象語から想起した音声情報の特徴量空間の分析—Analysis of feature space of voice information recalled from impression words

宮島崇浩, 菊池英明, 榑松明

言語・音声理解と対話処理研究会 / 人工知能学会 [編] 東京 : 人工知能学会

Presentation date： 2009.03

Event date：
2009.03

　

　
対乳児発話の母音の時間構造-理研日本語母子会話コーパスを用いた分析-

宮澤幸希, 菊池英明, 馬塚れい子

電子情報通信学会技術研究報告 Vol.109, No.308

Presentation date： 2009

Event date：
2009

　

　
印象空間における音声と文字の対応関係の分析—Analysis on relationship between voice and characters using impression space

宮島崇浩, 菊池英明, 榑松明

言語・音声理解と対話処理研究会 / 人工知能学会 [編] 東京 : 人工知能学会

Presentation date： 2008.07

Event date：
2008.07

　

　
『日本語話し言葉コーパス』のXML文書—話し言葉の日本語 ; 日本語話し言葉コーパス

菊池英明

日本語学東京 : 明治書院

Presentation date： 2008.04

Event date：
2008.04

　

　
日本語話し言葉コーパスを利用した句末境界音調に関する一考察—A study of prosodic phrase boundary in the corpus of spontaneous Japanese

谷口未希, 菊池英明

言語・音声理解と対話処理研究会 / 人工知能学会 [編] 東京 : 人工知能学会

Presentation date： 2007.07

Event date：
2007.07

　

　
B10. アクセント句を単位としてみた自発音声の韻律特徴 : 韻律境界強度の予備的分析(研究発表,日本音声学会2007年度(第21回)全国大会発表要旨)

前川喜久雄, 菊池英明

音声研究日本音声学会

Presentation date： 2007

Event date：
2007

　

　
韻律情報を用いた発話態度認識とその対話システムへの応用

八木大三, 藤江真也, 菊池英明, 小林哲則

日本音響学会2005年春季研究発表会講演論文集

Presentation date： 2005.03

Event date：
2005.03

　

　
早稲田大学eスクールの実践:大学教育におけるeラーニングの展望

向後千春, 松居辰則, 西村昭治, 浅田匡, 菊池英明, 金群, 野嶋栄一郎

第11回日本教育メディア学会年次大会発表論文集

Presentation date： 2004.10

Event date：
2004.10

　

　
韻律情報を利用した文章入力システムのための韻律制御モデル

大久保崇, 菊池英明, 白井克彦

日本音響学会2004年秋季研究発表会講演論文集

Presentation date： 2004.09

Event date：
2004.09

　

　
音声対話における発話の感情判別

小林季実子, 菊池英明, 白井克彦

日本音響学会2004年秋季研究発表会講演論文集

Presentation date： 2004.09

Event date：
2004.09

　

　
日本語話し言葉コーパス

国立国語研究所

Presentation date： 2004.03

Event date：
2004.03

　

　
韻律情報を用いた肯定的/否定的態度の認識

八木大三, 藤江真也, 菊池英明, 小林哲則

日本音響学会2004年春季研究発表会講演論文集

Presentation date： 2004.03

Event date：
2004.03

　

　
アイヌ語音声データベース

早稲田大学語学教育研究所

Presentation date： 2004.03

Event date：
2004.03

　

　
Spoken Dialogue System Using Prosody As Para-Linguistic Information

FUJIE Shinya, YAGI Daizo, MATSUSAKA Yosuke, KIKUCHI Hideaki, KOBAYASHI Tetsunori

proc. of SP2004(International Conference Speech Prosody,2004)

Presentation date： 2004.03

Event date：
2004.03

　

　
Corpus of Spontaneous Japanese: Design, Annotation and XML Representation

Kikuo Maekawa, Hideaki Kikuchi, Wataru Tsukahara

International Symposium on Large-scale Knowledge Resources (LKR2004)

Presentation date： 2004.03

Event date：
2004.03

　

　
音声対話における韻律を用いた話題境界検出

大久保崇, 菊池英明, 白井克彦

電子情報通信学会技術報告

Presentation date： 2003.12

Event date：
2003.12

　

　
パラ言語の理解能力を有する対話ロボット

藤江真也, 江尻康, 菊池英明, 小林哲則

情報処理学会音声言語情報処理研究会 Information Processing Society of Japan (IPSJ)

Presentation date： 2003.10

Event date：
2003.10

　

　

　View Summary

Human-human interaction in spoken dialogue seems to use not only the linguistic information in utterances but also additional information that supports it. We call this additional information "para-linguistic information". In this paper, we present a method for recognizing attitudes from prosodic information and a method for recognizing head gestures. In the former method, F0 pattern and phoneme alignment are introduced as features in order to recognize two attitudes, "positive" and "negative". In the latter method, a left-to-right HMM is introduced as the probabilistic model and optical flow as the features in order to recognize three gestures, "nod", "tilt" and "shake". Experimental results show that these methods are sufficient to recognize the user's attitude as para-linguistic information. Finally, we show a prototype spoken dialogue system using para-linguistic information and how this information contributes to efficient conversation.
パラ言語情報を用いた音声対話システム

藤江真也, 八木大三, 菊池英明, 小林哲則

日本音響学会2003年秋季研究発表会講演論文集

Presentation date： 2003.09

Event date：
2003.09

　

　
Use of a large-scale spontaneous speech corpus in the study of linguistic variation

MAEKAWA Kikuo, KOISO Hanae, KIKUCHI Hideaki, YONEYAMA Kiyoko

proc. of 15th Int'l Congress of Phonetic Sciences

Presentation date： 2003.08

Event date：
2003.08

　

　
Evaluation of the effectiveness of "X-JToBI": A new prosodic labeling scheme for spontaneous Japanese speech

KIKUCHI Hideaki, MAEKAWA Kikuo

proc. of 15th Int'l Congress of Phonetic Sciences

Presentation date： 2003.08

Event date：
2003.08

　

　
自発音声コーパスにおけるF0下降開始位置の分析

籠宮隆之, 五十嵐陽介, 菊池英明, 米山聖子, 前川喜久雄

日本音響学会2003年春季研究発表会講演論文集

Presentation date： 2003.03

Event date：
2003.03

　

　
『日本語話し言葉コーパス』(CSJ)のXML検索環境

塚原渉, 菊池英明, 前川喜久雄

第3回話し言葉の科学と工学ワークショップ講演予稿集

Presentation date： 2003.02

Event date：
2003.02

　

　
XMLを利用した『日本語話し言葉コーパス』(CSJ)の整合性検証

菊池英明, 塚原渉, 前川喜久雄

第3回話し言葉の科学と工学ワークショップ講演予稿集

Presentation date： 2003.02

Event date：
2003.02

　

　
Performance of segmental and prosodic labeling of spontaneous speech

Kikuchi, H, K. Maekawa

proc. of the ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR2003)

Presentation date： 2003.02

Event date：
2003.02

　

　
Recognition of para-linguistic information and its application to spoken dialogue system

S Fujie, Y Ejiri, Y Matsusaka, H Kikuchi, T Kobayashi

ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03 IEEE

Presentation date： 2003

Event date：
2003

　

　

　View Summary

Human-human interaction in spoken dialogue seems to use not only the linguistic information in utterances but also additional information that supports it. We call this additional information "para-linguistic information". In this paper, we present a method for recognizing attitudes from prosodic information and a method for recognizing head gestures. In the former method, F0 pattern and phoneme alignment are introduced as features in order to recognize two attitudes, "positive" and "negative". In the latter method, a left-to-right HMM is introduced as the probabilistic model and optical flow as the features in order to recognize three gestures, "nod", "tilt" and "shake". Experimental results show that these methods are sufficient to recognize the user's attitude as para-linguistic information. Finally, we show a prototype spoken dialogue system using para-linguistic information and how this information contributes to efficient conversation.
日本語自発音声韻律ラベリングスキームX-JToBIの能力検証

菊池英明, 前川喜久雄

人工知能学会言語・音声理解と対話処理研究会

Presentation date： 2002.11

Event date：
2002.11

　

　
自発音声韻律ラベリングスキームX-JToBIによるラベリング精度の検証

菊池英明, 前川喜久雄

日本音響学会2002年春季研究発表会講演論文集

Presentation date： 2002.09

Event date：
2002.09

　

　
大規模自発音声コーパス『日本語話し言葉コーパス』の仕様と作成

籠宮隆之, 小磯花絵, 小椋秀樹, 山口昌也, 菊池英明, 間淵洋子, 土屋菜穂子, 斎藤美紀, 西川賢哉, 前川喜久雄

国語学会2002年度春季大会要旨集

Presentation date： 2002.05

Event date：
2002.05

　

　
日本語自発音声の韻律ラベリング体系: X-JToBI

前川喜久雄, 菊池英明, 五十嵐陽介

日本音響学会2002年春季研究発表会講演論文集

Presentation date： 2002.03

Event date：
2002.03

　

　
自発音声に対する音素自動ラベリング精度の検証

菊池英明, 前川喜久雄

日本音響学会2001年春季研究発表会講演論文集

Presentation date： 2002.03

Event date：
2002.03

　

　
日本語自発音声の韻律ラベリングスキーム: X-JToBI

菊池英明, 前川喜久雄, 五十嵐陽介

第2回話し言葉の科学と工学ワークショップ講演予稿集

Presentation date： 2002.02

Event date：
2002.02

　

　
自発音声に対する音素自動ラベリング精度の検証

菊池英明, 前川喜久雄

第2回話し言葉の科学と工学ワークショップ講演予稿集

Presentation date： 2002.02

Event date：
2002.02

　

　
X-JToBI: an extended j-toBI for spontaneous speech.

Kikuo Maekawa, Hideaki Kikuchi, Yosuke Igarashi, Jennifer J. Venditti

proc. 7th International Congress on Spoken Language Processing (ICSLP2002) ISCA

Presentation date： 2002

Event date：
2002

　

　
X-JToBI : An Intonation Labeling Scheme for Spontaneous Japanese

MAEKAWA Kikuo, KIKUCHI Hideaki, IGARASHI Yosuke

IPSJ SIG Notes Information Processing Society of Japan (IPSJ)

Presentation date： 2001.12

Event date：
2001.12

　

　

　View Summary

An outline of the X-JToBI intonation labeling scheme, an extended version of J_ToBI, is presented. Because the main motive for the extension was its application to spontaneous speech, the following extensions were introduced: 1) exact match between the time stamps of tone labels and the timing of physical events, 2) enlargement of the inventory of Boundary Pitch Movements, 3) extension and ramification of the usage of Boundary Indices, and 4) a proposal for the labeling of filled pauses and non-lexical penult prominence.
自発音声コーパスにおける印象評定とその要因

籠宮隆之, 槙洋一, 菊池英明, 前川喜久雄

日本音響学会2001年秋季研究発表会講演論文集

Presentation date： 2001.09

Event date：
2001.09

　

　
多次元心的状態を扱う音声対話システムの構築

鈴木堅悟, 青山一美, 菊池英明, 白井克彦

情報処理学会音声言語情報処理研究会

Presentation date： 2001.06

Event date：
2001.06

　

　
自発音声に対するJ_ToBIラベリングの問題点検討

菊池英明, 籠宮隆之, 前川喜久雄, 竹内京子

日本音響学会2001年春季研究発表会講演論文集

Presentation date： 2001.03

Event date：
2001.03

　

　
日本語音声への韻律ラベリング

菊池英明

人工知能学会研究会資料

Presentation date： 2001.02

Event date：
2001.02

　

　
音声対話に基づく知的情報検索システム

菊池英明, 阿部賢司, 桐山伸也, 大野澄雄, 河原達也, 板橋秀一, 広瀬啓吉, 中川聖一, 堂下修二, 白井克彦, 藤崎博也

情報処理学会音声言語情報処理研究会 Information Processing Society of Japan (IPSJ)

Presentation date： 2001.02

Event date：
2001.02

　

　

　View Summary

This paper presents an intelligent system for information retrieval based on human-machine dialogue through spoken language with novel features such as use of key concepts, unknown word processing, dialogue management through user and system modeling, and automatic acquisition of knowledge to adapt the system to individual users. It then describes an experimental system for academic information retrieval constructed to implement these features and to demonstrate their feasibility.
『日本語話し言葉コーパス』の構築における計算機利用

前川喜久雄, 菊池英明, 籠宮隆之, 山口昌也, 小磯花絵, 小椋秀

日本語学, 明治書院

Presentation date： 2001

Event date：
2001

　

　
『日本語話し言葉コーパス』の書き起こし基準について

小磯花絵, 土屋菜穂子, 間淵洋子, 斉藤美紀, 籠宮隆之, 菊池英明, 前川喜久雄

電子情報通信学会技術報告

Presentation date： 2000.12

Event date：
2000.12

　

　
モノローグを対象とした自発音声コーパス:その設計について

第14回日本音声学会全国大会予稿集

Presentation date： 2000.10

Event date：
2000.10

　

　
Overview of an Intelligent System for Information Retrieval Based on Human-Machine Dialogue through Spoken Language

proc. of Int'l. Conference on Spoken Language Processing

Presentation date： 2000.10

Event date：
2000.10

　

　
Improvement of Dialogue Efficiency by Dialogue Control Model According to Performance of Processes

proc. of Int'l. Conference on Spoken Language Processing

Presentation date： 2000.10

Event date：
2000.10

　

　
Designing a Domain Independent Platform of Spoken Dialogue System

AOYAMA K.

proc. of Int'l. Conference on Spoken Language Processing

Presentation date： 2000.10

Event date：
2000.10

　

　
大規模話し言葉コーパスにおける発話スタイルの諸相---書き起こしテキストの分析から---

KAGOMIYA Takayuki, KIKUCHI Hideaki, KOISO Hanae, MAEKAWA Kikuo

日本音響学会2000年秋季研究発表会講演論文集

Presentation date： 2000.09

Event date：
2000.09

　

　
音声対話システム汎用プラットフォームにおける行動管理部の構築

人工知能学会全国大会(第14回)

Presentation date： 2000.06

Event date：
2000.06

　

　
音声対話システム汎用プラットフォームの検討

情報処理学会音声言語情報処理研究会

Presentation date： 2000.02

Event date：
2000.02

　

　
Modeling of spoken dialogue control for improvement of dialogue efficiency

H Kikuchi, K Shirai

SMC 2000 CONFERENCE PROCEEDINGS: 2000 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOL 1-5 IEEE

Presentation date： 2000

Event date：
2000

　

　

　View Summary

In this research, we aim to establish a method of controlling dialogues in accordance with changes in computational performance. For the system to take the most suitable dialogue strategy according to computational performance, it needs to continuously calculate internal evaluation functions modeled on that performance. In this paper, we model dialogue control centered on 'system transparency' using the computation time and accuracy of the system's processes. Then, we simulate this model by controlling dialogue strategies. We also confirm the effectiveness of the model in a preliminary experiment using a prototype spoken dialogue system.
Controlling non-verbal information in speaker-change for spoken dialogue

K Aoyama, M Yokoyama, H Kikuchi, K Shirai

SMC 2000 CONFERENCE PROCEEDINGS: 2000 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOL 1-5 IEEE

Presentation date： 2000

Event date：
2000

　

　

　View Summary

In this research, we focus on two main topics concerning a model for the use of non-verbal information. The first is to clarify how the model differs between the use of non-verbal information by human beings and its use by a robot. The second is to clarify how the model differs between a CG robot and a real robot. As a result, we clarified the strength of constraint and the naturalness of various types of non-verbal information. We also confirmed that the appropriate output timing of non-verbal information with the CG robot is the start of utterances, which is the same as the usage of non-verbal information in human-human dialogue. Moreover, we confirmed that non-verbal information made speaker changes smoother with the humanoid robot than with the CG robot.
Improving Recognition Correct Rate of Important Words in Large Vocabulary Speech Recognition

proc. of Eurospeech

Presentation date： 1999.09

Event date：
1999.09

　

　
Controlling Dialogue Strategy According to Performance of Processes

proc. of ESCA Workshop Interactive Dialogue in Multi-modal Systems

Presentation date： 1999.05

Event date：
1999.05

　

　
音声対話システムにおける処理性能と対話戦略の関係についての一考察

KIKUCHI Hideaki, KOBAYASHI Tetsunori, SHIRAI Katsuhiko

日本音響学会講演論文集

Presentation date： 1999.03

Event date：
1999.03

　

　
システムの処理性能を考慮した対話制御方法の検討

人工知能学会言語・音声理解と対話処理研究会予稿集

Presentation date： 1999.02

Event date：
1999.02

　

　
A post-processing of speech for hearing impaired integrate into standard digital audio decoders.

Shinichi Hoshino, Itaru Kaneko, Hideaki Kikuchi, Katsuhiko Shirai

proc. of Eurospeech ISCA

Presentation date： 1999

Event date：
1999

　

　
Use of Nonverbal Information in Communication between Human and Robot

Proc. Of International Conference on Spoken Language Processing (ICSLP)

Presentation date： 1998.12

Event date：
1998.12

　

　
非言語的現象の分析と対話処理 -電子メール討論

KIKUCHI Hideaki, ITOU Katunobu, KAWABATA Takeshi

日本音響学会誌 The Acoustical Society of Japan (ASJ)

Presentation date： 1998.11

Event date：
1998.11

　

　
人間型対話インタフェースにおけるまばたき制御の検討

人工知能学会全国大会論文集

Presentation date： 1998.06

Event date：
1998.06

　

　
時間的制約を考慮した対話制御方法の実現方法

人工知能学会全国大会論文集

Presentation date： 1998.06

Event date：
1998.06

　

　
人間とロボットのコミュニケーションにおける非言語情報の利用

情報処理学会音声言語情報処理研究会資料

Presentation date： 1998.05

Event date：
1998.05

　

　
Multimodal Communication Between Human and Robot

Proc. Of International Wireless and Telecommunications Symposium (IWTS)

Presentation date： 1998.05

Event date：
1998.05

　

　
時間的制約を考慮した対話制御方法の検討

KIKUCHI Hideaki, SUGITA Yosuke, SHIRAI Katsuhiko

日本音響学会講演論文集

Presentation date： 1998.03

Event date：
1998.03

　

　
Controlling gaze of humanoid in communication with human

H Kikuchi, M Yokoyama, K Hoashi, Y Hidaki, T Kobayashi, K Shirai

1998 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS - PROCEEDINGS, VOLS 1-3 IEEE

Presentation date： 1998

Event date：
1998

　

　

　View Summary

This paper describes control of a robot's gaze, which is related to the smoothness of turn-taking in communication. We considered the role of gaze in dialogues between human beings and examined it by CG simulation and with our humanoid. We also analyzed the features of gaze movement in dialogues among multiple persons and, by implementing gaze control on the humanoid, confirmed that it is effective for confirming the communication channel.
自由会話における時間的制約の影響の分析

KIKUCHI Hideaki, SUGITA Yosuke, SHIRAI Katsuhiko

電子情報通信学会技術研究報告 The Institute of Electronics, Information and Communication Engineers

Presentation date： 1997.10

Event date：
1997.10

　

　

　View Summary

In natural dialogue, time constraints influence the speaker's plan for utterance generation. We analyzed the influence of time constraints in natural dialogue using a dialogue corpus. We tried to clarify this influence by measuring the duration of speakers' turns and analyzing tendencies in several situations. The results indicated that the influence of time constraints on the hearer's side appeared in a participant-dependent manner. They also indicated that the function of the utterance, the modality of the dialogue, and the positiveness of the participant influenced time constraints.
音声を利用したマルチモーダルインタフェース

電子情報通信学会誌

Presentation date： 1997.10

Event date：
1997.10

　

　
複数ユーザとロボットの対話における非言語情報の役割

YOKOYAMA Masao, KIKUCHI Hideaki, SHIRAI Katsuhiko

日本音響学会講演論文集

Presentation date： 1997.09

Event date：
1997.09

　

　
The Role of Non-Verbal Information in Spoken Dialogue between a Man and a Robot

International Conference on Speech Processing (ICSP) '97 Proceedings

Presentation date： 1997.08

Event date：
1997.08

　

　
ロボットとの対話における非言語情報の役割

人工知能学会全国大会論文集

Presentation date： 1997.06

Event date：
1997.06

　

　
音響学会員のためのインターネット概説

KIKUCHI Hideaki, HATAOKA Nobuo

日本音響学会誌 The Acoustical Society of Japan (ASJ)

Presentation date： 1996.08

Event date：
1996.08

　

　
情報処理学会第53回全国大会大会優秀賞

Presentation date： 1996.03

Event date：
1996.03

　

　
ハイパーメディア共有アーキテクチャにおけるバージョン管理方式

情報処理学会全国大会講演論文集

Presentation date： 1996.03

Event date：
1996.03

　

　
ハイパーメディア共有アーキテクチャ

情報処理学会全国大会講演論文集

Presentation date： 1996.03

Event date：
1996.03

　

　
Extensions of World-wide Aiming at the construction of a "Virtual Personal Library"

proc. of Seventh ACM Conf. on Hypertext

Presentation date： 1996.03

Event date：
1996.03

　

　
User interface for a digital library to support construction of a ''virtual personal library''

H Kikuchi, Y Mishina, M Ashizawa, N Yamazaki, H Fujisawa

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS I E E E, COMPUTER SOC PRESS

Presentation date： 1996

Event date：
1996

　

　
Multimodal Interface Using Speech and Pointing Gestures, and Its Applications for Interior Design and PC Window Manipulation

proc. of IWHIT95 (International Workshop on Human Interface Technology 95)

Presentation date： 1995.10

Event date：
1995.10

　

　
音声とペンを入力手段とするマルチモーダルインタフェースの構築

情報処理学会音声言語情報処理研究会

Presentation date： 1995.07

Event date：
1995.07

　

　
仮想個人図書館の構築を支援するユーザインタフェースの開発

電子情報通信学会春季大会講演論文集

Presentation date： 1995.03

Event date：
1995.03

　

　
Agent-typed multimodal interface using speech, pointing gestures and CG

H ANDO, H KIKUCHI, N HATAOKA

SYMBIOSIS OF HUMAN AND ARTIFACT: FUTURE COMPUTING AND DESIGN FOR HUMAN-COMPUTER INTERACTION ELSEVIER SCIENCE PUBL B V

Presentation date： 1995

Event date：
1995

　

　
音声・ポインティング・CGによるエージェント型ユーザインタフェースの試作と評価

第10回ヒューマンインタフェースシンポジウム論文集

Presentation date： 1994.10

Event date：
1994.10

　

　
マルチモーダルウインドウシステムの構築

第10回ヒューマンインタフェースシンポジウム論文集

Presentation date： 1994.10

Event date：
1994.10

　

　
音声・ポインティング・CGによるエージェント型ユーザインタフェースシステム

電子情報通信学会秋季大会講演論文集

Presentation date： 1994.09

Event date：
1994.09

　

　
音声対話システムにおける発話権の制御

電子情報通信学会春季大会講演論文集

Presentation date： 1993.03

Event date：
1993.03

　

　
ナビゲーションシステムにおける音声対話インタフェースの構成

人工知能学会言語・音声理解と対話処理研究会

Presentation date： 1992.10

Event date：
1992.10

　

　
自然な模擬対話を収録するために

菊池英明

音声対話のモデル化とその機械処理関する総合的研究研究成果報告書

Presentation date： 1992

Event date：
1992

　

　
自然な模擬対話の収録のために

菊池英明

文部省科研費総合研究A「音声対話のモデル化とその機械処理に関する総合的研究」研究成果報

Presentation date： 1992

Event date：
1992


Research Projects

Construction of phoneme-balanced speech database of Japanese by means of real-time MRI imaging

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2024.04

-

2027.03
Real-time MRI database of articulatory movements of Japanese

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2020.04

-

2024.03
Development of protocol of effective social work interview for older adults with dementia

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2019.04

-

2022.03
バーチャルリアリティ環境におけるオラリティの運用の検討

Project Year :

2017.07

-

2020.03

　View Summary

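This fiscal year, continuing from the previous year, we used psychological experiments to examine the factors that constitute orality in virtual reality space (the visual features of agents, the interactivity of communication, vertical distance perception, and the unreality of the space). In Study 1, we collected data for an empirical experiment on estimating the appropriate personal space toward an agent met face to face in virtual reality space. Combined with the data collected the previous year, we confirmed that the visual features of the facing agent (gender, human or robot, and height difference) affect personal space. In Study 2, taking basic life support training as the theme, we examined how interactivity in a virtual reality environment affects trainees' psychology. In the experiment, trainees learned by experiencing the procedure of using an AED while calling out to bystanders in a scene where an agent had collapsed. The system was programmed so that the behavior of the surrounding agents changed according to the acoustic features of the trainee's calls, and trainees experienced interactive changes in the situation brought about by their own calls. As a result, compared with learning the same content by watching a video, trainees showed higher self-efficacy for basic life support and greater interest in the training method. In Study 3, we partially extended the experiments on distance perception in virtual reality space conducted through the previous year and carried out vertical distance estimation. The results showed that the bias in distance estimation differs between the depth direction and the vertical direction. In Study 4, we examined how the unreality of a virtual reality space affects the psychology of those experiencing it. We placed a bath in virtual reality spaces modeled on three environments (a meadow, the sea, and a lodging facility) and presented them to participants who were taking a bath in real space. The results showed that the gap between the bath in the unrealistic virtual reality space and the real bath changed participants' degree of relaxation. This fiscal year, we had planned to experimentally examine the face-to-face communication and spatial perception studied through the previous year while measuring the position of attention in the virtual reality space. We carried out an examination using a head-mounted display with a gaze-point measurement function, but because the accuracy of gaze-point estimation was lower than initially expected, preparing the verification experiment took more time than anticipated. In addition, owing to the spread of coronavirus infection, we had to abandon preliminary experiments involving participants. As a result, the originally planned verification experiment was not carried out. We plan to carry out the postponed examination, with gaze measurement, of the factors that constitute orality in virtual reality environments. Along with compiling the experimental data, in the final year we will integrate the findings obtained over the four years and discuss the characteristics of the factors related to orality in virtual reality environments.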
Elaboration of articulatory phonetics by means of realtime-MRI and WAVE data

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2017.04

-

2020.03

Maekawa Kikuo

　View Summary

Realtime MRI movies of articulatory movements were recorded for 16 Tokyo Japanese, five Kinki Japanese, and three Inner Mongolian speakers; each movie is about 60 minutes long. A web-based data browsing environment and techniques for automatic contour extraction of articulatory organs (lips, tongue, jaw, palate, pharynx etc.) were developed. As for linguistic application, three papers were published on issues such as 1) the relevance of the tongue root in Mongolian vowel harmony, 2) the influence of the preceding vowel on the place of articulation of the utterance-final moraic nasal in Japanese, and 3) the exclusive relevance of labial (rather than labio-velar) articulation in the Japanese /w/. An analysis of the Japanese /r/ was also presented.
コーパス言語学的手法に基づく会話音声の韻律特徴の体系化

Project Year :

2016.04

-

2020.03

　View Summary

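The aim of this research project is to empirically verify and establish the prosodic system of conversational speech, using corpus-linguistic methods and comparison with monologue and read speech. Toward this aim, in this final fiscal year we did the following. (1) We had been considering extending X-JToBI, the prosodic labeling standard developed for monologue, to everyday conversation; this year we compiled it into a work manual and prepared it for future release together with the labeling results. (2) We carried out prosodic labeling, including a second-pass check, of 16 hours of conversation from the 『日本語日常会話コーパス』 (Corpus of Everyday Japanese Conversation), whose construction is led by the principal investigator. (3) Based on the 16 hours of labeling results, we analyzed the prosodic characteristics of everyday conversation through comparison with the monologues (lectures) in the 『日本語話し言葉コーパス』 (Corpus of Spontaneous Japanese), and found the following: (a) BPMs with a rising component (rising, rise-fall, etc.) tend to be frequent in academic presentations and infrequent in casual chat; (b) looking at the breakdown of BPMs, rising tones at the end of accentual phrases are common in academic presentations, whereas rise-fall tones are common in chat; (c) we built a model using linear discriminant analysis to estimate the register (chat, academic presentation, simulated public speaking) from these phrase-final tone and BI features and examined how each variable contributes to discriminating the register; the accuracy was 82.1%, showing that the three registers can be discriminated with high probability from phrase-final tone and BI features. From these results, it can be considered that academic presentations often adopt a speaking style in which multiple accentual phrases made prominent by a rising tone occur in succession within an intonation phrase, whereas in everyday conversation accentual phrases with rise-fall tones and the like rarely occur in succession and instead constitute an intonation phrase on their own; that is, prosodically fragmented speaking styles are common. Because FY2019 is the final fiscal year, this field is left blank.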
Estimation of User's Impression Space for Improving Desire of Interaction with Spoken Dialogue System

Project Year :

2014.04

-

2017.03

　View Summary

This study aims to investigate how people's personal characteristics influence the impression they form of an agent through human-agent interaction. We conducted an experiment in which subjects interacted with an agent or a human and formed an impression of them. The results showed that subjects with no programming experience tended to evaluate an agent lower than a human. Subjects in the “high emotional-warmth” group also tended to evaluate an agent lower than a human. We also proposed a humor utterance generation method compatible with dialogue systems, to increase the desire to sustain interaction. Through an experiment, we confirmed the validity of the method: high-incongruity replies were significantly more effective than low-incongruity and random replies. Finally, we confirmed that generating humorous utterances is effective in increasing the desire to sustain interaction with a dialogue system.
Fundamental Study for Conversion between Spoken and Written Japanese Considering Influence of Interactivity

Project Year :

2013.04

-

2016.03

　View Summary

Our research clarifies the stylistic differences among four modes of Japanese sentences. Each mode is a combination of two binary distinctions: spoken/written and dialogue/monologue. We developed a method for collecting sentences in each mode while controlling topics and storylines. The collected data show that in the dialogue condition, the stylistic differences between spoken and written sentences are smaller than in the monologue condition. These results imply that the traits of dialogue, in which the speaker is prompted to compose sentences quickly and to pay more attention to the listener in front of him/her, reduce the styles specific to spoken or written language.
音声対話システムに対するインタラクション欲求向上のためのユーザ印象空間の推定

科学研究費助成事業(早稲田大学) 科学研究費助成事業(基盤研究(C))

Project Year :

2014

-

2016
Preparation of Database and Analysis of Dialectial Difference in Phonetics, Phonology, Grammar and Vocabulary among Dialects of Ainu

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2012.04

-

2015.03

OKUDA OSAMI, SHIRAISHI Hidetoshi, KIKUCHI Hideaki

　View Summary

Differences among the person-marking systems of several Ainu dialects were observed, and a prominent difference was found between the Saru and Chitose dialects, which had so far been regarded as very similar. Based on this observation, the general tendencies and historical implications of the Ainu person-marking system were investigated. Differences in kinship terms between the Saru and Shizunai dialects were studied using epic texts in these dialects. An Ainu audio database was constructed experimentally, and records for the database have been accumulated. Tagging of the verbs in the Ainu text records for the database, focusing on the grammatical and semantic role of each morpheme, was attempted and evaluated.
On the physical factors which make mother-tongue dialogues smooth - through the comparison with the non-mother tongue

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2012.04

-

2015.03

KUROIWA Shingo, ICHIKAWA Akira, HORIUCHI Yasuo, KIKUCHI Hideaki, NAKA Makiko, OHASHI Hiroki, KAWABATA Yoshiko, TAKIZAWA Keiko

　View Summary

In dialogue among adult mother-tongue speakers, overlapping utterances are produced at transition-relevance places (TRPs). This phenomenon seems to result from some capability that reduces the mental 'burden' of dialogue in the mother tongue. We examined the age at which native speakers of Japanese acquire this capability. First, the turn-taking timing of mother-tongue and non-mother-tongue speakers was compared. It was found that non-mother-tongue speakers could not maintain TRPs. Next, the turn-taking timing between adults and five-year-olds or six-year-olds was examined. Five-year-olds differed from adults, but six-year-olds did not. These results indicate that the capability is already acquired by the age of six.
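
The summary above does not state how turn-taking timing was measured. As a minimal sketch, assuming turns are available as (speaker, start, end) tuples in seconds, one common way to quantify timing is the offset between the end of one speaker's turn and the start of the next speaker's turn, where a negative value indicates overlap. The function name `transition_offsets` and the data layout are hypothetical and are not taken from the project.

```python
def transition_offsets(turns):
    """Return the offset (in seconds) at each speaker change.

    `turns` is a list of (speaker, start_s, end_s) tuples sorted by start
    time (an assumed layout). Negative offsets mean the next speaker
    started before the previous one finished (overlap); positive offsets
    mean a silent gap.
    """
    offsets = []
    for (spk_a, _, end_a), (spk_b, start_b, _) in zip(turns, turns[1:]):
        if spk_a != spk_b:  # only count real speaker changes
            offsets.append(round(start_b - end_a, 3))
    return offsets


if __name__ == "__main__":
    demo = [("A", 0.0, 1.5), ("B", 1.3, 2.8), ("A", 3.0, 4.0)]
    print(transition_offsets(demo))
```

Comparing the distribution of such offsets across speaker groups is one possible way to express the kind of difference the summary reports.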
Analysis of infant-directed speech by acoustic and computational modeling methods.

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2009

-

2011

MAZUKA Reiko, KIKUCHI Hideaki, ICHIKAWA Akira, TAJIMA Keiichi, MIYAZAWA Kouki, RICARDO Hoffman bion

　View Summary

The goal of the present project was to investigate how infant-directed speech (IDS) differs from adult-directed speech (ADS), and what role the properties of IDS may play in infants' phonological acquisition. In particular, we focused on the acquisition of vowel categories. For vowel categories based on quality differences (i.e., /a/, /i/, /u/, /e/, /o/), SOM models that were developed for ADS were able to learn similar categories from IDS as well. The distinction between short and long vowels, however, turned out to be particularly challenging. This was because the actual duration of the vowels varied widely regardless of whether they were phonologically short or long, and we are continuing our research into what additional information a model needs in order to acquire the long-short vowel distinction.
A Study on a framework of spontaneous communication depending on dialogue situation

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2005

-

2007

SHIRAI Katsuhiko, KUREMATSU Akira, HONDA Masaaki, KOBAYASHI Tetsunori, KIKUCHI Hideaki, OOKAWA Shigeki

　View Summary

In this study, we examined a framework for communication systems that interact with users spontaneously so as to cope with practical dialogue environments. While conventional spoken dialogue systems aimed to achieve specific speech-dialogue tasks efficiently, the framework of a spontaneous communication system was developed in order to evolve these systems into ones that spontaneously start and continue spoken dialogues. For this purpose, three studies were conducted: (i) understanding of the dialogue environment, which aims at advanced human recognition and spoken dialogue recognition using image and speech signals; (ii) a spontaneous communication management model, which models how to start, continue, and end spoken dialogues; and (iii) speech generation and motion expression technology, which concerns how to present the system's intentions through utterances or motions. In study (i), human pose estimation using a stereo camera was developed. Combining depth information from the images with the shapes and textures of human bodies or clothes enabled accurate estimation of human poses. In addition, estimation of utterance intention was studied. Using sentence-final characteristics and word N-grams, more accurate estimation of utterance intention was achieved. In study (ii), models for a robot to start, continue, and end communication with a human were developed. We developed a model of the conversational partner's mental state for these purposes. In study (iii), a speech generation technique for laughter was developed. Based on acoustic analyses of human laughter, synthesis of both laughter and laughter-speech was realized. Through these three studies, a basic framework for spontaneous communication was established.
Quantitative Analysis of Linguistic Variation Using a Spontaneous Speech Corpus

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2004

-

2006

MAEKAWA Kikuo, KOISO Hanae, OGURA Hideki, KIKUCHI Hideaki, DEN Yasuharu, HIBIYA Junko

　View Summary

The first half of the three-year research project was devoted to deriving new research data from the Corpus of Spontaneous Japanese (CSJ), including: 1) Prosodic and morphological information of the CSJ-Core (about 44 hours), reorganized for the study of prosodic variation. The new data is encoded as an XML document whose hierarchical structure reflects that of Japanese prosody. The most basic node of the new XML document corresponds to the so-called 'accentual phrase' of the Japanese language (by Kikuchi). 2) Word-origin information (i.e., Native, Sino-Japanese, Borrowing, and mixtures of these) assigned to a total of forty thousand short-unit words recorded in the CSJ (by Ogura). 3) A phonetic database for the study of the variation of the velar nasal in Tokyo Japanese (by Maekawa and Hibiya). 4) An RDB containing all word-form variations observed in the whole CSJ (7.52 million SUW) (by Maekawa). Based upon these data, we analyzed language variations recorded in the CSJ, including: 1) devoicing of vowels (by Maekawa & Kikuchi); 2) non-lexical lengthening of vowels (by Den); 3) moraic nasalization of particles (by Koiso); 4) variation of the velar nasal (by Hibiya); 5) word-form variation across the whole CSJ (by Maekawa); 6) variation in the accentual-phrase-final rising intonation (by Maekawa & Kikuchi); 7) variation of morphological features at the end of sentences (by Ogura). The results of these studies were presented at international and domestic conferences, and reprinted in the 252-page final report of the project.
Development of new prototypes and models of higher education utilizing broadband networks.

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2003

-

2006

KOGO Chiharu, ASADA Tadashi, KIKUCHI Hideaki, NISHIMURA Shoji, NOJIMA Eiichiro, SUZUKI Katsuaki

　View Summary

To obtain substantive outcomes from e-learning courses, the e-learning system, including the learning management system, must facilitate learners' learning. Teachers, coaches, and support staff must also each play their respective roles. Teachers have three types of work: design, management, and evaluation of the courses. Designing the detailed course structure is a new and important part of the teachers' work. Online coaches, in turn, appear to take on a greater share of the work of supporting the teacher and facilitating classroom activities. Coaches have three types of work: facilitating classroom activities, establishing the classroom atmosphere and standards, and facilitating the discussion process. Many kinds of learning management systems are now available, either free or commercially. The minimum functions are video streaming, a bulletin board system, and testing, but these functions should be carefully designed and made more usable in order to obtain more substantive learning outcomes. As for future learning environments, learners will be able to access their own working spaces directly by opening a web browser.
Analysis of effective communication process by using body and artifact for Faculty Development

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2004

-

2005

HOZAKI Norio, NAKANO Michiko, SUZUKI Hiroko, YAMAJI Hiroki, NAKAJIMA Yoshiaki, NISHIMURA Shoji

　View Summary

A college lecture class and an elementary school English class were analyzed using the Observational System for Instructional Analysis (OSIA) originally developed by Hough and Duncan. The OSIA analysis starts with a transcription of class activities and takes the form of a matrix and a timeline. The following findings were obtained. 1) Compared with the college class, the level of interaction was significantly higher and more student talk was observed in the elementary English class. 2) In the elementary English class, a more experienced Assistant Language Teacher (ALT, a native speaker of English) created a higher level of interaction between teacher and students by not simply starting repetition practice but waiting for students' utterances after showing visual materials. This created more energetic class activities with a higher sense of participation. 3) Richer facial expressions by teachers made the class atmosphere more positive. 4) The more experienced ALT used short sentences and phrases elaborately and effectively as feedback each time students answered. 5) The class proceeded effectively and smoothly when an appropriate artifact was used for instruction.
XML documentation of complex annotation on spontaneous speech data

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year :

2002

-

2003

MAEKAWA Kikuo, TSUKAHARA Wataru, KIKUCHI Hideaki, KOISO Hanae, YONEYAMA Kiyoko

　View Summary

Annotation of spontaneous speech data is a difficult task, but maintaining a large annotated spontaneous speech database and retrieving information from it are all the more difficult. We proposed an XML format that can represent nearly all annotation information of the Corpus of Spontaneous Japanese (CSJ). The CSJ is the world's largest spontaneous speech database, with very rich annotation including transcription, POS information, clause boundary information, dependency-structure information, discourse-boundary information, segment labels, intonation labels, and so forth. Our XML format includes 10 layers (starting with the "Talk" element and ending in the "Phone" and "Tone" elements) arranged according to the structure of natural language. A total of 208 attributes cover linguistic, paralinguistic, and non-linguistic annotation of the speech data as well as various disfluency phenomena. There are also some attributes that were introduced to represent the format of the transcription text. We converted all 3302 talks of the CSJ (661 hours, over 7.5 million morphemes) into XML documents and used them for data validation. Information retrieval experiments were also conducted using the XML documents. The XSLT language turned out to give satisfactory performance. Retrievals of modest complexity could be performed within 15 to 30 minutes on a PC of ordinary performance (3 GHz CPU with 2 GB of memory). Lastly, we developed a simple GUI-based search tool that helps naive users write XSLT query scripts. The software is written in Java and runs on nearly all PC platforms. The XML documents and the GUI search tool will be publicly available as a part of the CSJ in June 2004.
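
As a rough illustration of retrieval over a layered XML annotation of this kind, the sketch below counts elements in the innermost layer of a toy document. It is a minimal sketch under several assumptions: only the "Talk" and "Phone" element names come from the summary above, while the intermediate `Unit` element and the `TalkID`, `UnitID`, and `PhoneEntity` attribute names are hypothetical placeholders rather than the actual CSJ schema. It also uses the limited XPath support in Python's standard `xml.etree.ElementTree` instead of the XSLT used in the project.

```python
import xml.etree.ElementTree as ET

# Toy document: "Talk" and "Phone" are named in the summary; "Unit" and all
# attribute names are hypothetical stand-ins for the intermediate layers.
SAMPLE = """
<Talk TalkID="demo">
  <Unit UnitID="1">
    <Phone PhoneEntity="a"/>
    <Phone PhoneEntity="k"/>
  </Unit>
  <Unit UnitID="2">
    <Phone PhoneEntity="a"/>
  </Unit>
</Talk>
"""


def count_phones(xml_text, entity):
    """Count Phone elements whose (hypothetical) PhoneEntity equals `entity`."""
    root = ET.fromstring(xml_text)
    # ".//Phone" searches every layer below the root Talk element.
    matches = root.iterfind(f".//Phone[@PhoneEntity='{entity}']")
    return sum(1 for _ in matches)


if __name__ == "__main__":
    print(count_phones(SAMPLE, "a"))
```

Because the query descends through all layers, the same pattern applies no matter how many intermediate layers sit between the outermost and innermost elements.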
Dialogue System Centered on Prosody Control

　View Summary

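This fiscal year's results are as follows. a) Dialogue rhythm and prosody control: Building on the results up to the previous year, and taking the identification of topic boundaries in dialogue as the task, we trained a statistical model using accentual-phrase-level prosodic parameters and confirmed that it achieves discrimination accuracy comparable to that of humans even on open data. (白井・菊池) We developed a method that uses prosodic information and surface linguistic information to decide the timing of system back-channel generation and turn-taking, which are important for building natural dialogue systems. The method was implemented in a chat dialogue system on the topic of weather forecasts, and its usefulness was confirmed through subjective evaluation by subjects who talked with the system. (中川) b) Application to dialogue speech understanding: Based on a statistical analysis of the features of repeated corrective utterances in dialogue speech, we evaluated the combined use of phrase-level prosodic features and its application to detecting corrective utterances. In addition, toward robust dialogue speech understanding that combines these, we studied the analysis and modeling of the prosodic features of fillers. (甲斐) c) Application to dialogue speech synthesis: Using adverbs of degree, we analyzed the prosodic markedness of vocabulary from both the production and perception sides, and realized prosodic emphasis control for generating natural conversational speech. We also built a speech-rate control model based on a statistical computational model, advanced the analysis of the local speech rate observed in conversational speech, and enabled flexible control of speech rate. Furthermore, we investigated the effect of prosody control parameters on the naturalness of synthesized speech. (匂坂) d) Dialogue system: Bringing the above results together, we implemented a dialogue system. In particular, we integrated systems for recognizing and generating facial expressions and vocal expressions onto the dialogue platform developed up to the previous year, and built a rhythmic dialogue system capable of exchanging paralinguistic information. (小林)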


Industrial Property Rights

Search Term Clustering Device, Search Term Clustering Method, Search Term Clustering Program, and Recording Medium

白井克彦, 菊池英明, 新関一馬

Patent
Speech Recognition Device and Program for Speech Recognition

白井克彦, 菊池英明, 大久保崇

Patent
Continuous Speech Recognition Device and Method

白井克彦, 城崎康夫, 菊池英明

Patent

Syllabus

Introduction to Human Informatics and Cognitive Sciences

School of Human Sciences

2025 spring quarter
Basic Seminar I

School of Human Sciences

2025 spring semester
Basic Seminar I

School of Human Sciences

2025 spring semester
Seminar of Graduation Thesis Research I (Language and Information Science)

School of Human Sciences

2025 spring semester
Seminar II (Language and Information Science)

School of Human Sciences

2025 fall semester
Seminar I (Language and Information Science)

School of Human Sciences

2025 spring semester
Seminar of Graduation Thesis Research II (Language and Information Science)

School of Human Sciences

2025 fall semester
Language and Information Sciences

School of Human Sciences

2025 spring semester
Measurement and Modeling of Human

School of Human Sciences

2025 spring semester
Research Method for Measurement and Modeling of Human 02

School of Human Sciences

2025 fall quarter
Research Method for Measurement and Modeling of Human 01

School of Human Sciences

2025 fall quarter
Research Method for Measurement and Modeling of Human 03

School of Human Sciences

2025 fall quarter
Language and Information Sciences

School of Human Sciences (Online Degree Program)

2025 fall semester
Seminar(Language and Information Science) (fall)

School of Human Sciences (Online Degree Program)

2025 fall semester
Seminar(Language and Information Science) (spring)

School of Human Sciences (Online Degree Program)

2025 spring semester
Seminar of Graduation Thesis Research(human informatics and cognitive sciences) (Language and Information Science) (fall)

School of Human Sciences (Online Degree Program)

2025 fall semester
Seminar of Graduation Thesis Research(human informatics and cognitive sciences) (Language and Information Science) (spring)

School of Human Sciences (Online Degree Program)

2025 spring semester
Introduction to Human Informatics and Cognitive Sciences

School of Human Sciences (Online Degree Program)

2025 fall quarter
Language and Information Sciences(2) B

Graduate School of Human Sciences

2025 fall semester
Language and Information Sciences(2) A

Graduate School of Human Sciences

2025 spring semester
Language and Information Sciences(1) B

Graduate School of Human Sciences

2025 fall semester
Language and Information Sciences(1) A

Graduate School of Human Sciences

2025 spring semester
Language and Information Sciences A

Graduate School of Human Sciences

2025 spring semester
Language and Information Sciences B

Graduate School of Human Sciences

2025 fall semester
Language and Information Sciences(D) B

Graduate School of Human Sciences

2025 fall semester
Language and Information Sciences(D) A

Graduate School of Human Sciences

2025 spring semester
Language and Information Sciences

Graduate School of Human Sciences

2025 winter quarter


Overseas Activities

Advancing Techniques for Analyzing Emotion, Evaluation, and Attitude in Spoken Language

2009.03

-

2010.03

The Ohio State University, USA

Peking University, China

Sub-affiliation

Faculty of Human Sciences Graduate School of Human Sciences
Faculty of Science and Engineering Graduate School of Fundamental Science and Engineering
Affiliated organization Global Education Center

Research Institute

2025

-

2026

Center for Data Science Concurrent Researcher

Internal Special Research Projects

Refining the Expressive Power of Sound Symbolism

2020

　View Summary

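In this study, stimuli are created and presented using engineering methods, and the results are interpreted through psychological statistical analysis, in order to refine our understanding of the expressive power of sound symbolism. Specifically, we aim to clarify differences in the expressive power of sound symbolism with respect to participants' attributes, viewed in terms of their level of abstraction ability. We conducted an impression rating experiment on sound symbolism and examined whether differences in psychological distance can arise. In the experiment, Scheffé's method of paired comparisons was used to compare the distances on a psychological scale between the selected auditory and visual stimuli. The results showed that, among native speakers of Japanese as well, round-sounding names tend to be strongly associated with "bouba" faces and angular-sounding names with "kiki" faces.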
Annotation Criteria and Automatic Estimation for Assigning Speaking-Style Attributes to Spoken Language Corpora

2020 沈睿

　View Summary

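For the data contained in spoken language corpora, we aim to organize a taxonomy of speaking styles suited to corpus search, to enable automatic estimation of speaking style mainly through language processing techniques, and to make the estimated speaking style usable as attribute information of the corpus. During the Special Research Project period, we examine the contents of corpora in order to identify more concrete and realistic problems and thereby develop the proposed topic further. To that end, we digitized the transcription text data of 20 to 30 corpora at hand and put them into a format that can be handled uniformly by programs. We also examined how comprehensively the corpora at hand cover speaking styles.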
Improving the Desire for Interaction by Controlling Spoken-Language Features of Spoken Dialogue System Utterances

2013

　View Summary

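We aim to develop techniques that change the impression a spoken dialogue system gives its users by controlling the system's utterances. This Special Research Project focused on basic research into a method of giving a personality to an anthropomorphized system through self-disclosure. Through experiments, we confirmed that a specific personality can be conferred by the amount and content of self-disclosure. This result was published as a peer-reviewed paper in the Transactions of the Human Interface Society ("Giving Personality to a Spoken Dialogue Agent through Self-Disclosure"). We also examined a technique for automatically generating humorous utterances from microblogs ("Automatic Generation of Humorous Utterances Using Microblogs in a Non-Task-Oriented Dialogue System") and a method for controlling speech rate or silent-interval length ("Improving Personality Perception and the Desire to Continue Dialogue by Controlling the Speech Rate and Silent-Interval Length of Robot Utterances"), and investigated experimentally how each changes the impression users form of the system. Both results were presented at domestic conferences. We further confirmed experimentally that users' sense of attachment can be changed by estimating the user's mental state from the prosody of the user's speech and altering the system's behavior accordingly, and presented this result at an international conference ("Effects of an Agent Feature Comprehension on the Emotional Attachment of Users"). In all of these studies, we confirmed that these methods of changing users' impressions increase the desire to continue interacting with a spoken dialogue system. We summarized these findings and presented them at a domestic conference ("Desire to Continue Interaction with Spoken Dialogue Systems"). To systematize this series of results and develop more widely applicable techniques, it has become necessary to clarify the relationship between the desire to continue interaction and the user impression space. Accordingly, as an outcome of this Special Research Project, we applied for a fiscal 2014 Grant-in-Aid for Scientific Research (C) on the theme "Estimation of User's Impression Space for Improving Desire of Interaction with Spoken Dialogue System," and the proposal was accepted.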
Advancing Emotion Estimation by Model Training with Biological Signals as the Teacher Signal

2006

　View Summary

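In this study, to estimate a speaker's mental state from speech with higher accuracy, we propose introducing a psychophysiological approach in which models are trained with biological information as the teacher signal. In conventional emotion estimation, ratings based on the experimenter's judgment are used as the teacher signal during model training, so the method is undeniably subjective. The targets of estimation have also been limited mainly to basic emotions. Biological information is considered to be free from intentional manipulation and able to capture changes over time. Using biological signals as the first stage of the experiment should therefore make it possible to target diverse, continuous changes in mental state and to make more objective and more accurate judgments. We prepared two read-aloud tasks of different difficulty and observed whether differences in biological-signal responses between the tasks also appear as differences in speech. We compare the speech of subjects judged to be under stress by the experimenter's subjective evaluation with the speech of those among them who could also be judged to be under stress from changes in their biological signals. As biological signals, we used blood volume pulse (BVP), electrocardiogram (EKG), skin temperature (TEMP), and skin conductance (SC), which were considered usable for estimating mental state. For the speech comparison, we extracted nine features from each utterance (the maximum, minimum, amplitude, and mean of F0 and of power, plus speech rate) and used them for decision tree learning. The C4.5 algorithm was used for decision tree learning, and cross-validation was used for evaluation. The classification rate averaged 63.9% when the model was trained on all data (stress judged only by the experimenter's subjective evaluation) and improved to an average of 77.8% when the model was trained on the selected data (stress judged from changes in biological signals in addition to subjective evaluation). The results suggest that biological signals can serve as one indicator for judging stress. The experiment demonstrated the benefit of using biological information when estimating mental state from speech.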
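
The nine-feature vector described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the frame-level input format, the convention that unvoiced frames are coded as 0 Hz, the reading of "amplitude" as maximum minus minimum, and speech rate measured in morae per second are all assumptions. The C4.5 training and cross-validation steps are not shown.

```python
from statistics import mean


def contour_stats(values):
    """Return (max, min, range, mean) of a non-empty numeric contour."""
    hi, lo = max(values), min(values)
    return hi, lo, hi - lo, mean(values)


def extract_features(f0_hz, power_db, n_morae, duration_s):
    """Build a nine-element feature vector for one utterance.

    Order: F0 max/min/range/mean, power max/min/range/mean, speech rate.
    Assumptions: unvoiced frames are coded as 0 in `f0_hz`, "amplitude"
    is taken as max - min, and speech rate is morae per second.
    """
    voiced = [v for v in f0_hz if v > 0]  # drop unvoiced frames
    if not voiced or not power_db or duration_s <= 0:
        raise ValueError("need voiced F0 frames, power frames, and a positive duration")
    return [*contour_stats(voiced), *contour_stats(power_db), n_morae / duration_s]


if __name__ == "__main__":
    features = extract_features([0, 100, 200, 150], [60, 70, 80], 10, 2.0)
    print(features)
```

Vectors of this shape, one per utterance and paired with a stress label, are the kind of input a decision tree learner would take.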