Details of a Researcher - YAMANA, Hayato

YAMANA, Hayato

Scopus Paper Info

Paper Count: 181 Citation Count: 1091 h-index: 16

The data was downloaded from the Scopus API on February 17, 2026, via http://api.elsevier.com and http://www.scopus.com.

Google Scholar Information (Citations per year)

Citation Count: 2701 h-index: 26 i10-index: 75

Affiliation

Faculty of Science and Engineering, School of Fundamental Science and Engineering

Job title

Professor

Degree

Dr.(Eng.) ( 1993.03 Waseda University )

Mail Address

Homepage URL

http://www.yama.info.waseda.ac.jp/~yamana/

Profile

Hayato Yamana received a Dr. Eng. degree from Waseda University in 1993. He began his career at the Electrotechnical Laboratory of the former Ministry of International Trade and Industry (MITI), and was seconded to MITI’s Machinery and Information Industries Bureau for a year in 1996. He was appointed associate professor of Computer Science and Engineering at Waseda University in 2000, and has been a professor since 2005. Since 2010, he has been a director of DBSJ (Database Society of Japan). He was a director of IPSJ (Information Processing Society of Japan) and vice chair of the Information and Communication Society of the Institute of Electronics, Information and Communication Engineers (IEICE). At Waseda University, he was Deputy Chief Information Officer from 2015 to 2020. Since Oct. 2020, he has been Vice President for IT Promotion and Chief Information Officer. His research interests include fully homomorphic encryption, big data analysis, and computer architecture.

Research Experience

2005.04 - Now: Waseda University Faculty of Science and Engineering Professor
2020.10 - Now: Waseda University Vice President for IT Promotion
2005.04 - : National Institute of Informatics Visiting Professor
2004.04 - 2005.03: National Institute of Informatics Visiting Associate Professor
2000.04 - 2005.03: Waseda University School of Science and Engineering Associate Professor
1999.04 - 2000.03: Seikei Univ. Visiting Lecturer
1997.04 - 2000.03: Electrotechnical Laboratory, METI Chief Researcher
1996.04 - 1997.03: Ministry of International Trade and Industry Electronic Equipment Division, Machinery and Information Industry Bureau
1993.04 - 1997.03: Electrotechnical Laboratory Researcher
1989.11 - 1993.03: Waseda University Research Assistant

Education Background

1989.04 - 1993.03: Waseda University Graduate School of Science and Engineering
1987.04 - 1989.03: Waseda University Graduate School of Science and Engineering
1983.04 - 1987.03: Waseda University School of Science and Engineering Electronics and Communication

Committee Memberships

2018.01 - 2020.12: IEEE Computer Society Board of Governors
2015.06 - 2017.06: IEICE Deputy Chair of ISS Society
2015.06 - 2017.06: Information Processing Society of Japan Board of Governors
2022.06 - Now: Database Society of Japan Auditor
2002 - Now: IEICE Journal Editorial Boards
2023.09 - 2024.07: IEEE SmartComp 2025 General Chair
2010.06 - 2022.06: Database Society of Japan Board of Governors
2010.05 - 2014.04: IPSJ Special Interest Group on Database Systems, Chair
2009 - : General Chair of the 2009 IEEE International Workshop on Quantitative Evaluation of large-scale Systems and Technologies
2009 - : Program Chair of the 2009 IEEE International Symposium on Mining and Web
2008 - : Program Chair of the 2008 IEEE International Symposium on Mining and Web
2007 - : PC Member of the 2007 International Conference on Parallel Processing (ICPP-07)
2007 - : PC Member of the 12th International Conf. on Database Systems and Advanced Applications
2007 - : PC Member of the 2007 IEEE International Symposium on Data Mining and Information Retrieval
2006 - : PC Member of IFIP International Conference on Network and Parallel Computing
2004 - : NTCIR Workshop, Web Task Organizer
2004 - : Ministry of Internal Affairs and Communications, Information and Communications Policy Division, Digital Asset Utilization Strategy Council, Web Information Utilization Working Group, Member
2004 - : HPCAsia2004 Organizing Committee
2003 - 2004: HPC Asia International Conference Organizing Committee, Member
2002 - 2004: IEEE Computer Society Japan Chapter Chair
2002 - 2004: IEEE Japan Tokyo Chapter Committee
2003 - : IEEE Computer Society Japan Chapter Chair (2003-2004)
2002 - 2003: Microsoft Co., Ltd., Study Group on Secure Web Services Based on .NET Architecture, .NET Web Services Study Group, Expert Member
2001 - 2003: NTCIR Workshop, Web Task Advisory Committee Member
2002 - : IEICE Transactions (A), Editorial Committee Member (2002-)
2002 - : Certify Inc., Web Usage and Technology Certification Committee, Member
2001 - 2002: Center of the International Cooperation for Computerization, Multilingual Security-Related Information Collection and Analysis System Development Committee, Member
2000 - 2002: Japan Information Processing Development Corporation, Human-Centered Intelligent Information Technology Survey Group, Member
1998 - 2002: IPSJ Journal Editorial Boards
2000 - : Ministry of International Trade and Industry, Electronics Policy Division, Next-Generation Electronic Information Infrastructure Technology Survey Committee, Member
1999 - : IPSJ Journal Editorial Committee Member (1999-2001), ARC Special Interest Group Committee Member (1992-1997), SLDM Special Interest Group Committee Member (2001-)

Professional Memberships

　

　

　

ACM
IEEE
IEICE
IPSJ
DBSJ

Research Areas

Computational science Homomorphic Encryption / Database / Web informatics and service informatics / Human interface and interaction / Intelligent informatics / Kansei informatics / Life, health and medical informatics / Information network / Computer system

Research Interests

Secure Computation
Big Data Analysis (SNS, Trustworthy, Recommendation, Authorship Identification)
Search Engines
User Interface
Data Mining
Parallelizing Compiler

Awards

2020: Fellow, IPSJ
2018: Golden Core Award, IEEE Computer Society
2018: Fellow, IEICE
2013: Excellent Paper Award, IEICE
2009: Excellent Paper Award, DBSJ (Japan)
2009: IBM Faculty Award
2003: Best Author Award, ITE (Japan)
2002: Best Author Award, IPSJ (Japan)
1995: Yamashita SIG Research Award (IPSJ)
1993: Research Encouragement Award (IPSJ)

Papers

Light distillation for Incremental Graph Convolution Collaborative Filtering.

X. Fan, Fan Mo, Chongxian Chen, Hayato Yamana

CoRR abs/2505.19810 2025.05

Personalized Fashion Recommendation with Image Attributes and Aesthetics Assessment.

Chongxian Chen, Fan Mo, Xin Fan, Hayato Yamana

CoRR abs/2501.03085 2025.01

eCommTouch: A Benchmark Dataset for Touch-based Continuous Mobile Device Authentication for e-Commerce.

Masashi Kudo, Tsubasa Takahashi, Isao Echizen, Hayato Yamana

BigComp 140 - 147 2025

Privacy-Preserving News Recommendation over Homomorphic Encryption.

Eishin Nakano, Takuya Suzuki, Yuki Yada, Hayato Yamana

AINA (5) 84 - 97 2025

Citations (Scopus): 1

Synergistic Fusion Framework: Integrating Training and Non-training Processes for Accelerated Graph Convolution Network-based Recommendation.

Fan Mo, Xin Fan, Chongxian Chen, Hayato Yamana

Pattern Recognit. 167 111829 - 111829 2025

Citations (Scopus): 1

Non-interactive Private Multivariate Function Evaluation using Homomorphic Table Lookup

Ruixiao Li, Hayato Yamana

IACR Communications in Cryptology 1 ( 3 ) 19 - 19 2024.10

SSR: Solving Named Entity Recognition Problems via a Single-stream Reasoner

Yuxiang Zhang, Junjie Wang, Xinyu Zhu, Tetsuya Sakai, Hayato Yamana

ACM Transactions on Information Systems 42 ( 5 ) 138 - 28 2024.09

Citations (Scopus): 2

PCPR: Plaintext Compression and Plaintext Reconstruction for Reducing Memory Consumption on Homomorphically Encrypted CNN

Takuya Suzuki, Hayato Yamana

Advanced Information Networking and Applications 120 - 132 2024.04

MoRF_ESM: Prediction of MoRFs in Disordered Proteins Based on a Deep Transformer Protein Language Model

Chun Fang, Jiasheng He, Hayato Yamana

Journal of Bioinformatics and Computational Biology 22 ( 2 ) 2450006 - 17 2024.04

Citations (Scopus): 3

Sampling-based Epoch Differentiation Calibrated Graph Convolution Network for Point-of-interest Recommendation.

Fan Mo, Xin Fan, Chongxian Chen, Changhao Bai, Hayato Yamana

Neurocomputing 571 127140 - 127140 2024.02

Citations (Scopus): 8

Message from BITS 2024 Co-Chairs and Technical Program Co-Chairs; SMARTCOMP 2024.

Sajal K. Das, Hayato Yamana, Keiichi Yasumoto, Shameek Bhattacharjee

IEEE International Conference on Smart Computing (SMARTCOMP) 2024

Message from the General and TPC Co-Chairs; SMARTCOMP 2024.

Franca Delmastro, Hayato Yamana, Dario Bruneo, Dirk Pesch

IEEE International Conference on Smart Computing(SMARTCOMP) 2024

An Implementation of Private Function Evaluation Using FHE and TEE for Smart Computing Systems.

Ruixiao Li, Ryutaro Onishi, Hayato Yamana

IEEE International Conference on Smart Computing(SMARTCOMP) 231 - 233 2024

Citations (Scopus): 1

Data-Efficient Massive Tool Retrieval: A Reinforcement Learning Approach for Query-Tool Alignment with Language Models.

Yuxiang Zhang, Xin Fan, Junjie Wang, Chongxian Chen, Fan Mo, Tetsuya Sakai, Hayato Yamana

SIGIR-AP 226 - 235 2024

Traffic Jam Detection Using Real-Time Bus Operation Data Considering Timetable Information in Various Conditions.

Nozomi Hatanka, Hiroki Aoyagi, Tomoya Fujita, Hayato Yamana, Masato Oguchi

18th International Conference on Ubiquitous Information Management and Communication(IMCOM) 1 - 6 2024

ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models.

Yuxiang Zhang, Jing Chen, Junjie Wang, Yaxin Liu, Cheng Yang, Chufan Shi, Xinyu Zhu, Zihao Lin, Hanwen Wan, Yujiu Yang, Tetsuya Sakai, Tian Feng, Hayato Yamana

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP) 11388 - 11422 2024

Citations (Scopus): 5

Prediction of Heart Disease Severity Using Hierarchically-Structured Machine-Learning Models with Feature Space Reduction.

Ayami Kiuchi, Tomoya Fujita, Hayato Yamana

Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies 662 - 670 2024

Touch-Based Continuous Mobile Device Authentication Using One-vs-One Classification Approach.

Masashi Kudo, Tsubasa Takahashi, Hayato Yamana

IEEE International Conference on Big Data and Smart Computing (BigComp) 167 - 174 2024

Citations (Scopus): 1

PhiSN: Phishing URL Detection Using Segmentation and NLP Features.

Eint Sandi Aung, Hayato Yamana

Journal of Information Processing 32 973 - 989 2024

Citations (Scopus): 1

HPRoP: Hierarchical Privacy-preserving Route Planning for Smart Cities

Francis Tiausas, Keiichi Yasumoto, Jose Paolo Talusan, Hayato Yamana, Hirozumi Yamaguchi, Shameek Bhattacharjee, Abhishek Dubey, Sajal K. Das

ACM Transactions on Cyber-Physical Systems 2023.10

Citations (Scopus): 5

EPT-GCN: Edge Propagation-based Time-aware Graph Convolution Network for POI Recommendation.

Fan Mo, Hayato Yamana

Neurocomputing 543 126272 - 126272 2023.07

Citations (Scopus): 14

Designing In-Storage Computing for Low Latency and High Throughput Homomorphic Encrypted Execution

Takuya Suzuki, Hayato Yamana

2023 IEEE 8th International Conference on Big Data Analytics (ICBDA) 2023.03

Privacy Preserving Function Evaluation using Lookup Tables with Word-Wise FHE

Ruixiao LI, Hayato YAMANA

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 107 ( 8 ) 1163 - 1177 2023

MOBARec-GCNFP: Champion Recommendation for Multi-Player Online Battle Arena Games Using Graph Convolution Network with Fewer Parameters

Hayato YAMANA

IEEE International Conference on Big Data Analytics, ICBDA 2023

Citations (Scopus): 4

Analysis of Dark Pattern-related Tweets from 2010

Hayato YAMANA

IEEE International Conference on Big Data Analytics, ICBDA 2023

Citations (Scopus): 4

NER-to-MRC: Named-Entity Recognition Completely Solving as Machine Reading Comprehension.

Yuxiang Zhang, Junjie Wang, Xinyu Zhu, Tetsuya Sakai, Hayato Yamana

CoRR abs/2305.03970 2023

Fair and Robust Metric for Evaluating Touch-based Continuous Mobile Device Authentication.

Masashi Kudo, Tsubasa Takahashi, Shojiro Ushiyama, Hayato Yamana

Companion Proceedings of the 28th International Conference on Intelligent User Interfaces 141 - 144 2023

Citations (Scopus): 2

Dark Patterns in E-commerce: A Dataset and Its Baseline Evaluations.

Yuki Yada, Jiaying Feng, Tsuneo Matsumoto, Nao Fukushima, Fuyuko Kido, Hayato Yamana

CoRR abs/2211.06543 2022

Privacy-Preserving Data Falsification Detection in Smart Grids using Elliptic Curve Cryptography and Homomorphic Encryption.

Sanskruti Joshi, Ruixiao Li, Shameek Bhattacharjee, Sajal K. Das, Hayato Yamana

2022 IEEE International Conference on Smart Computing (SMARTCOMP) 229 - 234 2022

Citations (Scopus): 5

Look-Up Table based FHE System for Privacy Preserving Anomaly Detection in Smart Grids.

Ruixiao Li, Shameek Bhattacharjee, Sajal K. Das, Hayato Yamana

2022 IEEE International Conference on Smart Computing (SMARTCOMP) 108 - 115 2022

Citations (Scopus): 9

Decoy Effect of Recommendation Systems on Real E-commerce Websites.

Fan Mo, Tsuneo Matsumoto, Nao Fukushima, Fuyuko Kido, Hayato Yamana

Proceedings of the 9th Joint Workshop on Interfaces and Human Decision Making for Recommender Systems co-located with 16th ACM Conference on Recommender Systems (RecSys 2022)(IntRS@RecSys) 151 - 163 2022
HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method.

Yuxiang Zhang, Hayato Yamana

Proceedings of the Thirteenth Language Resources and Evaluation Conference(LREC) 6059 - 6068 2022
Hybrid Phishing URL Detection Using Segmented Word Embedding.

Eint Sandi Aung, Hayato Yamana

Information Integration and Web Intelligence - 24th International Conference(iiWAS) 507 - 518 2022

Citations (Scopus): 2

Latency-Aware Inference on Convolutional Neural Network Over Homomorphic Encryption.

Takumi Ishiyama, Takuya Suzuki, Hayato Yamana

Information Integration and Web Intelligence - 24th International Conference(iiWAS) 324 - 337 2022

Citations (Scopus): 4

GN-GCN: Combining Geographical Neighbor Concept with Graph Convolution Network for POI Recommendation.

Fan Mo, Hayato Yamana

Information Integration and Web Intelligence - 24th International Conference(iiWAS) 153 - 165 2022

Citations (Scopus): 7

Acceleration of Homomorphic Unrolled Trace-Type Function using AVX512 instructions.

Kotaro Inoue, Takuya Suzuki, Hayato Yamana

Proceedings of the 10th Workshop on Encrypted Computing & Applied Homomorphic Cryptography(WAHC@CCS) 47 - 52 2022

Citations (Scopus): 2

Homomorphic Encryption-Friendly Privacy-Preserving Partitioning Algorithm for Differential Privacy.

Shojiro Ushiyama, Tsubasa Takahashi, Masashi Kudo, Hayato Yamana

IEEE International Conference on Big Data 5812 - 5822 2022

Citations (Scopus): 2

Dark patterns in e-commerce: a dataset and its baseline evaluations.

Yuki Yada, Jiaying Feng, Tsuneo Matsumoto, Nao Fukushima, Fuyuko Kido, Hayato Yamana

IEEE International Conference on Big Data 3015 - 3022 2022

Citations (Scopus): 13

Overfitting measurement of convolutional neural networks using trained network weights.

Satoru Watanabe, Hayato Yamana

International Journal of Data Science and Analytics 14 ( 3 ) 261 - 278 2022

Citations (Scopus): 9

A Survey on Explainable Fake News Detection.

Ken Mishima, Hayato Yamana

IEICE Transactions on Information & Systems 105-D ( 7 ) 1249 - 1257 2022

CHE: Channel-Wise Homomorphic Encryption for Ciphertext Inference in Convolutional Neural Network.

Tianying Xie, Hayato Yamana, Tatsuya Mori

IEEE Access 10 107446 - 107458 2022

Citations (Scopus): 6

Comparing Augmented Reality-based Display Methods to Present Guiding Information.

Riko Horikawa, Manaka Ito, Kosuke Komiya, Tatsuo Nakajima, Hayato Yamana

4th IEEE Global Conference on Life Sciences and Technologies(LifeTech) 22 - 25 2022

Citations (Scopus): 1

Topological measurement of deep neural networks using persistent homology.

Satoru Watanabe, Hayato Yamana

Annals of Mathematics and Artificial Intelligence 90 ( 1 ) 75 - 92 2022

Citations (Scopus): 19

Point of Interest Recommendation Acceleration Using Clustering

Huida Jiao, Fan Mo, Hayato Yamana

2021 IEEE 6th International Conference on Big Data Analytics, ICBDA 2021 175 - 180 2021.03

Point of Interest (POI) recommendation systems exploit information in location-based social networks to predict locations that users may be interested in. POI recommendations have been widely adopted in many applications that are helpful for daily life. POI recommendation services receive a huge volume of visit history data generated by users' daily lives with mobile devices. However, POI recommendation systems require a long time to build a model from such a huge volume of check-in data and recommend suitable POIs to users. Thus, it is indispensable to shorten the execution time in the big data era. In this study, we propose a clustering-based method to divide the data into multiple subsets to accelerate the POI recommendation's execution while maintaining accuracy. Our proposed method can be adapted to any general POI recommendation algorithm. We divide the whole data, that is, users and POIs, into subsets with a tree structure to balance the size of subsets according to both geographical information and user check-in distribution. Evaluation results show that we successfully accelerate the base algorithms by a factor of 17 to 39 while keeping the accuracy almost the same.

Citations (Scopus): 1

Faithful Post-hoc Explanation of Recommendation Using Optimally Selected Features.

Shun Morisawa, Hayato Yamana

Engineering Artificially Intelligent Systems 159 - 173 2021

Construction of Differentially Private Summaries over Fully Homomorphic Encryption.

Shojiro Ushiyama, Tsubasa Takahashi, Masashi Kudo, Hayato Yamana

CoRR abs/2112.08662 2021
Segmentation-based Phishing URL Detection.

Eint Sandi Aung, Hayato Yamana

WI-IAT '21: IEEE/WIC/ACM International Conference on Web Intelligence(WI/IAT) 550 - 556 2021

Citations (Scopus): 9

Improving Text Classification Using Knowledge in Labels

Cheng Zhang, Hayato Yamana

2021 IEEE 6th International Conference on Big Data Analytics (ICBDA 2021) 193 - 197 2021

Various algorithms and models have been proposed to address text classification tasks; however, they rarely consider incorporating the additional knowledge hidden in class labels. We argue that hidden information in class labels leads to better classification accuracy. In this study, instead of encoding the labels into numerical values, we incorporated the knowledge in the labels into the original model without changing the model architecture. We combined the output of an original classification model with the relatedness calculated based on the embeddings of a sequence and a keyword set. A keyword set is a word set that represents the knowledge in the labels. Usually, it is generated from the classes, although it can also be customized by the users. The experimental results show that our proposed method achieved statistically significant improvements in text classification tasks. The source code and experimental details of this study can be found on GitHub.

Citations (Scopus): 7

Topological Measurement of Deep Neural Networks Using Persistent Homology.

Satoru Watanabe, Hayato Yamana

CoRR abs/2106.03016 ( 1 ) 75 - 92 2021

The inner representation of deep neural networks (DNNs) is indecipherable, which makes it difficult to tune DNN models, control their training process, and interpret their outputs. In this paper, we propose a novel approach to investigate the inner representation of DNNs through topological data analysis (TDA). Persistent homology (PH), one of the outstanding methods in TDA, was employed for investigating the complexities of trained DNNs. We constructed clique complexes on trained DNNs and calculated the one-dimensional PH of DNNs. The PH reveals the combinational effects of multiple neurons in DNNs at different resolutions, which are difficult to capture without using PH. Evaluations were conducted using fully connected networks (FCNs) and networks combining FCNs and convolutional neural networks (CNNs) trained on the MNIST and CIFAR-10 data sets. Evaluation results demonstrate that the PH of DNNs reflects both the excess of neurons and problem difficulty, making PH one of the prominent methods for investigating the inner representation of DNNs.

Citations (Scopus): 19

User-centric Distributed Route Planning in Smart Cities based on Multi-objective Optimization.

Francis Tiausas, Jose Paolo Talusan, Yu Ishimaki, Hayato Yamana, Hirozumi Yamaguchi, Shameek Bhattacharjee, Abhishek Dubey, Keiichi Yasumoto, Sajal K. Das

IEEE International Conference on Smart Computing (SMARTCOMP) 77 - 82 2021

The realization of edge-based cyber-physical systems (CPS) poses important challenges in terms of performance, robustness, security, etc. This paper examines a novel approach to providing a user-centric adaptive route planning service over a network of Road Side Units (RSUs) in smart cities. The key idea is to adaptively select routing task parameters such as privacy-cloaked area sizes and number of retained intersections to balance processing time, privacy protection level, and route accuracy for privacy-augmented distributed route search while also handling per-query user preferences. This is formulated as an optimization problem with a set of parameters giving the best result for a set of queries given system constraints. Processing Throughput, Privacy Protection, and Travel Time Accuracy were developed as the objective functions to be balanced. A Multi-Objective Genetic Algorithm based technique (NSGA-II) is applied to recover a feasible solution. The performance of this approach was then evaluated using traffic data from Osaka, Japan. Results show good performance of the approach in balancing the aforementioned objectives based on user preferences.

Citations (Scopus): 4

Real-time Periodic Advertisement Recommendation Optimization under Delivery Constraint using Quantum-inspired Computer.

Fan Mo, Huida Jiao, Shun Morisawa, Makoto Nakamura, Koichi Kimura, Hisanori Fujisawa, Masafumi Ohtsuka, Hayato Yamana

Proceedings of the 23rd International Conference on Enterprise Information Systems 431 - 441 2021

Overfitting Measurement of Deep Neural Networks Using No Data.

Satoru Watanabe, Hayato Yamana

8th IEEE International Conference on Data Science and Advanced Analytics(DSAA) 1 - 10 2021

Citations (Scopus): 16

Construction of Differentially Private Summaries Over Fully Homomorphic Encryption.

Shojiro Ushiyama, Tsubasa Takahashi, Masashi Kudo, Hayato Yamana

Database and Expert Systems Applications - 32nd International Conference 12924 LNCS 9 - 21 2021

Cloud computing has garnered attention as a platform of query processing systems. However, data privacy leakage is a critical problem. Chowdhury et al. proposed Cryptε, which executes differential privacy (DP) over encrypted data on two non-colluding semi-honest servers. Further, the DP index proposed by these authors summarizes a dataset to prevent information leakage while improving the performance. However, two problems persist: 1) the original data are decrypted to apply sorting via a garbled circuit, and 2) the added noise becomes large because the sorted data are partitioned with equal width, regardless of the data distribution. To solve these problems, we propose a new method called DP-summary that summarizes a dataset into differentially private data over a homomorphic encryption without decryption, thereby enhancing data security. Furthermore, our scheme adopts Li et al.’s data-aware and workload-aware (DAWA) algorithm for the encrypted data, thereby minimizing the noise caused by DP and reducing the errors of query responses. An experimental evaluation using torus fully homomorphic encryption (TFHE), a bit-wise fully homomorphic encryption library, confirms the applicability of the proposed method, which summarized eight 16-bit data in 12.5 h. We also confirmed that there was no accuracy degradation even after adopting TFHE along with the DAWA algorithm.

Citations (Scopus): 2

First-Impression-Based Unreliable Web Pages Detection - Does First Impression Work?

Kenta Yamada, Hayato Yamana

Advanced Information Networking and Applications - Proceedings of the 35th International Conference on Advanced Information Networking and Applications (AINA-2021) 227 635 - 641 2021

Considering the continuous increase in the number of web pages worldwide, detecting unreliable pages, such as those containing fake news, is indispensable. Natural language processing and social-information-based methods have been proposed for web page credibility evaluation. However, the applicability of the former to web pages is limited because a model is required for each language, while the latter is poorly adapted to changes, owing to its dependence on external services that can be discontinued. To solve these problems, herein we propose a first-impression-based web credibility evaluation method. Our experimental evaluation of a fake news corpus gave an accuracy of 0.898, which is superior to those of existing methods.

Fast and Accurate Function Evaluation with LUT over Integer-Based Fully Homomorphic Encryption.

Ruixiao Li, Hayato Yamana

Advanced Information Networking and Applications - Proceedings of the 35th International Conference on Advanced Information Networking and Applications (AINA-2021) 226 LNNS 620 - 633 2021

Fully homomorphic encryption (FHE), which can evaluate arbitrary functions using addition and multiplication operations via modular arithmetic (mod q) over ciphertext, can be applied in various privacy-preserving applications. However, FHE is difficult to adopt for big data owing to its high computational cost and the challenges associated with the efficient handling of complex functions such as log(x). To address these problems, we propose a method for handling any multi-input function using a lookup table (LUT) to replace the original calculations with array indexing operations over integer-based FHE. In this study, we extend our LUT-based method to handle any input values, i.e., including values that match no element in the LUT, by matching them with a nearby indexed value and returning an approximated output over FHE. In addition, we propose a technique for splitting the table to handle large integers for improved accuracy with only a slight increase in the execution time. For the experiments, we use the Microsoft/SEAL library, and the results show that our proposed method can evaluate a 16-bit to 16-bit function in 2.110 s and a 16-bit to 32-bit function in 2.268 s, thereby outperforming previous methods implemented via bit-wise calculation over FHE.

Citations (Scopus): 2

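The nearest-index lookup described in the summary above (an input that matches no table element is mapped to a nearby indexed value, and an approximated output is returned) can be sketched on plaintext integers. This is only an illustration under stated assumptions, not the authors' implementation: it runs on unencrypted values, whereas the paper performs the lookup over integer-based FHE ciphertexts, and the helper names `build_lut` and `lookup_nearest` are hypothetical.

```python
import bisect


def build_lut(func, inputs):
    """Precompute func on a sorted list of table inputs (hypothetical helper)."""
    xs = sorted(inputs)
    ys = [func(x) for x in xs]
    return xs, ys


def lookup_nearest(xs, ys, x):
    """Return the table output whose input is closest to x.

    Plaintext stand-in for the encrypted lookup: ties go to the smaller input,
    and values outside the table range clamp to the first or last entry.
    """
    i = bisect.bisect_left(xs, x)
    if i == 0:
        return ys[0]
    if i == len(xs):
        return ys[-1]
    if x - xs[i - 1] <= xs[i] - x:
        return ys[i - 1]
    return ys[i]
```

With a table of squares built from the inputs 1, 2, 4, and 8, an input of 5 is matched to the entry for 4, so the lookup returns 16 rather than the exact value 25; a denser table reduces this approximation error at the cost of a larger LUT.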
Faster Homomorphic Trace-Type Function Evaluation.

Yu Ishimaki, Hayato Yamana

IEEE Access 9 53061 - 53077 2021

Homomorphic encryption enables computations over encrypted data without decryption, and can be used for outsourcing computations to some untrusted source. In homomorphic encryption based on the hardness of ring-learning with errors, offering promising security and functionality, a plaintext is represented by a polynomial. A plaintext is treated as a vector whose homomorphic evaluation enables component-wise addition and multiplication, as well as rotation across the components. We focus on a commonly used and time-consuming subroutine that enables homomorphically summing-up the components of the vector or homomorphically extracting the coefficients of the polynomial, and call it homomorphic trace-type function. We improve the efficiency of the homomorphic trace-type function evaluation. The homomorphic trace-type function evaluation is performed by repeating homomorphic rotation followed by addition (rotations-and-sums). To correctly add up a rotated ciphertext and an unrotated one, a special operation called key-switching should be performed on the rotated one. As key-switching is computationally expensive, the rotations-and-sums is inherently inefficient. We propose a more efficient trace-type function evaluation by using loop-unrolling, which is compatible with other optimization techniques such as hoisting, and can exploit multi-threading. We show that the rotations-and-sums is not the optimal solution in terms of runtime complexity and that a trade-off exists between time and space. Experimental results demonstrate that our proposed method works 1.32-2.12 times faster than the previous method.

Citations (Scopus): 5

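The baseline rotations-and-sums procedure mentioned in the summary above (repeated rotation followed by addition to sum the components of a vector) can be illustrated on a plaintext list. This is a minimal sketch, assuming a power-of-two number of slots; it does not model ciphertexts, key-switching, or the loop-unrolled variant the paper proposes, and the function names are hypothetical.

```python
def rotate_left(slots, k):
    """Cyclically rotate a list of slot values left by k positions."""
    k %= len(slots)
    return slots[k:] + slots[:k]


def rotations_and_sums(slots):
    """Sum all slots using log2(n) rotate-then-add steps.

    Plaintext stand-in: in an RLWE-based scheme, each rotation would be a
    homomorphic rotation that requires a key-switching operation.
    """
    n = len(slots)
    if n == 0 or (n & (n - 1)) != 0:
        raise ValueError("slot count must be a power of two")
    acc = list(slots)
    step = 1
    while step < n:
        rotated = rotate_left(acc, step)
        acc = [a + b for a, b in zip(acc, rotated)]
        step *= 2
    return acc
```

For the input `[1, 2, 3, 4]`, two rotate-then-add steps leave the total 10 in every slot. Each step corresponds to one rotation, which is why the summary identifies key-switching as the dominant cost of this baseline.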
Time Distribution Based Diversified Point of Interest Recommendation

Fan Mo, Huida Jiao, Hayato Yamana

2020 IEEE 5th International Conference on Cloud Computing and Big Data Analytics, ICCCBDA 2020 37 - 44 2020.04 [Refereed]

In location-based social networks (LBSNs), personalized point-of-interest (POI) recommendation helps users mine their interests and find new locations conveniently and quickly. It is one of the most important services to improve users' quality of life and travel. Most POI recommendation systems are devoted to improving accuracy; however, in recent years, the diversity of POI recommendations, such as categorical and geographical diversity, has received much attention because a single type of POI easily causes loss of users' interest. Different from previous diversity-related recommendations, in this paper, we focus on the visiting time of POIs, a unique attribute of the interaction between users and POIs. Users usually have different active visiting time patterns and different frequently visited POIs depending on time. If the proper visiting times of the recommended POIs concentrate on a small range of time, the user might be unsatisfied because the recommendations cannot cover the whole of the user's active time range, which makes it inappropriate for the user to visit those POIs. To solve this problem, we propose a new concept, time diversity, and a time-distribution-based recommendation method to improve the time diversity of recommended POIs. Our experimental result with the Gowalla dataset shows that our proposed method effectively improves time diversity by 25.9% compared with USG, with only 7.9% accuracy loss.

Citations (Scopus): 3

A Proposal of a Method for Presenting Recommendation Reasons in Recommender Systems: Using Machine Learning Interpretation Models

森澤竣, 真鍋智紀, 座間味卓臣, 山名早人

DBSJ Japanese Journal (Web) 18-J 2020

Accelerating Exact Solution Methods for the Bootstrap Problem and the Relinearize Problem in Fully Homomorphic Encryption

佐藤宏樹, 石巻優, 山名早人

DBSJ Japanese Journal (Web) 18-J 2020

Towards Privacy-preserving Anomaly-based Attack Detection against Data Falsification in Smart Grid.

Yu Ishimaki, Shameek Bhattacharjee, Hayato Yamana, Sajal K. Das 0001

2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids(SmartGridComm) 1 - 6 2020

　View Summary

In this paper, we present a novel framework for privacy-preserving anomaly-based data falsification attack detection in a smart grid advanced metering infrastructure (AMI). Specifically, we propose an anomaly detection framework over homomorphically encrypted data. Unlike existing privacy-preserving anomaly detectors, our framework detects the presence of not only energy theft (i.e., deductive attack), but also more advanced data integrity attacks (i.e., additive and camouflage attacks) over encrypted data without diminishing detection sensitivity. We optimize the anomaly detection procedure such that potentially expensive operations over homomorphically encrypted space are avoided. Moreover, we optimize the encryption method designed for a resource constrained device such as smart meters, and the time to complete encryption gets 40x faster over the naïve adoption of the encryption method. We also validate the proposed framework using a real dataset from smart metering infrastructures, and demonstrate that the data integrity attacks can be detected with high sensitivity, without sacrificing user privacy. Experimental results with a real dataset of 200 houses from an AMI in Texas showed that the detection sensitivity of the plaintext algorithm is not degraded due to the use of homomorphic encryption.

Citations (Scopus): 10
Real-Time Periodic Advertisement Recommendation Optimization using Ising Machine.

Fan Mo, Huida Jiao, Shun Morisawa, Makoto Nakamura, Koichi Kimura, Hisanori Fujisawa, Masafumi Ohtsuka, Hayato Yamana

2020 IEEE International Conference on Big Data (IEEE BigData 2020) 5783 - 5785 2020

Online advertising is widely used by commercial companies to attract customers. Tuning advertisement delivery to achieve a high conversion rate (CVR) is crucial for improving advertising effectiveness. Because advertisers require demand-side platforms (DSPs) to deliver a certain number of ads within a fixed period, it is challenging to maximize CVR while satisfying ad delivery constraints. Such a combinatorial optimization problem is NP-hard when there are a considerable number of both ads and users. In this paper, we adopt Digital Annealer (DA), a quantum-inspired Ising computer, to solve the combinatorial optimization problem. The experimental evaluation shows that the proposed method increases accuracy from 0.176 to 0.326 and achieves a 20.8-fold speed-up compared to the baseline.

Citations (Scopus): 5
Highly Accurate CNN Inference Using Approximate Activation Functions over Homomorphic Encryption.

Takumi Ishiyama, Takuya Suzuki, Hayato Yamana

CoRR abs/2009.03727 3989 - 3995 2020

In the big data era, cloud-based machine learning as a service (MLaaS) has attracted considerable attention. However, when handling sensitive data, such as financial and medical data, a privacy issue emerges, because the cloud server can access clients' raw data. A common method of handling sensitive data in the cloud uses homomorphic encryption, which allows computation over encrypted data without decryption. Previous research adopted a low-degree polynomial mapping function, such as the square function, for data classification. However, this technique results in low classification accuracy. This study seeks to improve the classification accuracy for inference processing in a convolutional neural network (CNN) while using homomorphic encryption. We apply polynomial approximations of various orders to Google's Swish and ReLU activation functions. We also adopt batch normalization to normalize the inputs to the approximated activation functions so that they fit the input range and minimize the error. We implemented CNN inference labeling over homomorphic encryption using Microsoft's Simple Encrypted Arithmetic Library (SEAL) for the Cheon-Kim-Kim-Song (CKKS) scheme. The experimental evaluations confirmed classification accuracies of 99.29% and 81.06% for MNIST and CIFAR-10, respectively, which represent improvements of 0.11% and 4.69%, respectively, over previous methods.

Citations (Scopus): 44
Deep Neural Network Pruning Using Persistent Homology.

Satoru Watanabe, Hayato Yamana

3rd IEEE International Conference on Artificial Intelligence and Knowledge Engineering(AIKE) 153 - 156 2020

Deep neural networks (DNNs) have improved the performance of artificial intelligence systems in various fields including image analysis, speech recognition, and text classification. However, the consumption of enormous computation resources prevents DNNs from operating on small computers such as edge sensors and handheld devices. Network pruning (NP), which removes parameters from trained DNNs, is one of the prominent methods of reducing the resource consumption of DNNs. In this paper, we propose a novel method of NP, hereafter referred to as PHPM, using persistent homology (PH). PH investigates the inner representation of knowledge in DNNs, and PHPM utilizes the investigation in NP to improve the efficiency of pruning. PHPM prunes DNNs in ascending order of the magnitudes of the combinational effects among neurons, which are calculated using the one-dimensional PH, to prevent the deterioration of accuracy. We compared PHPM with the global magnitude pruning method (GMP), which is one of the common baselines for evaluating pruning methods. Evaluation results show that the classification accuracy of DNNs pruned by PHPM exceeds that of DNNs pruned by GMP.
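
The GMP baseline named in the summary above can be sketched in a few lines. This is only an illustration of generic global magnitude pruning, not of PHPM and not of the authors' implementation; the function name, the flat-list representation of each layer, and the tie-handling rule are all assumptions made for this sketch.

```python
def global_magnitude_prune(layers, sparsity):
    """Zero out roughly the `sparsity` fraction of weights with the smallest
    absolute value, ranked across ALL layers at once (hypothetical helper).

    layers:   list of layers, each a flat list of float weights (assumption)
    sparsity: fraction of weights to remove, between 0.0 and 1.0
    """
    # Rank every weight in the network by magnitude, ignoring layer boundaries.
    magnitudes = sorted(abs(w) for layer in layers for w in layer)
    k = int(len(magnitudes) * sparsity)
    if k == 0:
        return [list(layer) for layer in layers]
    # Weights at or below this magnitude are pruned; ties may prune slightly more than k.
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in layer] for layer in layers]


if __name__ == "__main__":
    pruned = global_magnitude_prune([[0.5, -0.1, 0.3], [-0.05, 0.8, 0.2]], 0.5)
    print(pruned)
```

Because the threshold is shared by all layers, a layer whose weights are uniformly small can lose most of its parameters, which is one reason magnitude alone is often treated as a baseline rather than a final criterion.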

Citations (Scopus): 8
DAMCREM: Dynamic Allocation Method of Computation REsource to Macro-Tasks for Fully Homomorphic Encryption Applications.

Takuya Suzuki, Yu Ishimaki, Hayato Yamana

IEEE International Conference on Smart Computing(SMARTCOMP) 458 - 463 2020

Smart computing aims to improve the quality of life by utilizing Internet-of-Things devices and cloud computing. Typically, this computing handles private and/or personal information, so concealing such sensitive information is a challenge. Adopting fully homomorphic encryption (FHE) is one approach for handling such sensitive information safely; that is, we can compute over the encrypted data without decryption. However, the time and space complexity of FHE operations is high; thus, computation takes a long time. In this study, we aim to shorten FHE execution time by adopting our new scheduling algorithm, which divides a task into several macro-tasks and then assigns a set of threads to each. We assume a cloud computing system that is equipped with a many-core CPU. We propose the dynamic allocation method of computation resource to macro-tasks (DAMCREM), which dynamically allocates a certain number of threads (selected from pre-defined candidates) to each macro-task of every given job. In the evaluation, we compared DAMCREM to naive methods that allocate a pre-defined number of threads to each macro-task. The results show that the average and maximum latencies of job execution are lower than those of the naive methods, even when the average interval of job arrival is short.

Citations (Scopus): 1
WUY at SemEval-2020 Task 7: Combining BERT and Naive Bayes-SVM for Humor Assessment in Edited News Headlines.

Cheng Zhang, Hayato Yamana

Proceedings of the Fourteenth Workshop on Semantic Evaluation(SemEval@COLING) 1071 - 1076 2020

Citations (Scopus): 8
Topological Measurement of Deep Neural Networks Using Persistent Homology.

Satoru Watanabe, Hayato Yamana

International Symposium on Artificial Intelligence and Mathematics(ISAIM) 90 ( 1 ) 75 - 92 2020

The inner representation of deep neural networks (DNNs) is indecipherable, which makes it difficult to tune DNN models, control their training process, and interpret their outputs. In this paper, we propose a novel approach to investigate the inner representation of DNNs through topological data analysis (TDA). Persistent homology (PH), one of the outstanding methods in TDA, was employed for investigating the complexities of trained DNNs. We constructed clique complexes on trained DNNs and calculated the one-dimensional PH of DNNs. The PH reveals the combinational effects of multiple neurons in DNNs at different resolutions, which are difficult to capture without using PH. Evaluations were conducted using fully connected networks (FCNs) and networks combining FCNs and convolutional neural networks (CNNs) trained on the MNIST and CIFAR-10 data sets. Evaluation results demonstrate that the PH of DNNs reflects both the excess of neurons and problem difficulty, making PH one of the prominent methods for investigating the inner representation of DNNs.

Citations (Scopus): 19
Imitation-Resistant Passive Authentication Interface for Stroke-Based Touch Screen Devices.

Masashi Kudo, Hayato Yamana

HCI International 2020 - Posters - 22nd International Conference 1226 CCIS 558 - 565 2020

Today’s widespread use of stroke-based touchscreen devices creates numerous associated security concerns and requires efficient security measures in response. We propose an imitation-resistant passive authentication interface for stroke-based touchscreen devices that employs classifiers for each individual stroke and is evaluated with respect to 26 features. For experimental validation, we collect stroke-based touchscreen data from 23 participants containing target and imitation stroke patterns using a photo-matching game in the form of an iOS application. The equal error rate (EER), the rate at which false rejection and false acceptance of target and imitator strokes are equal, is used as an indicator of the classification accuracy. Leave-one-out cross-validation was employed to evaluate the datasets based on the mean EER. For each cross-validation, one out of the two target datasets, an imitator dataset, and the remaining 20 imitator datasets were selected as genuine data, imitator test data, and imitator training data, respectively. Our results confirm stroke imitation as a serious threat. Among the 26 stroke features evaluated in terms of their imitation tolerance, the stroke velocity was identified as the most difficult to imitate. Dividing classifiers based on the stroke direction was found to further contribute to classification accuracy.
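
The equal error rate used as the accuracy indicator above can be approximated from two lists of classifier scores. The following is a minimal sketch under assumptions that are not taken from the paper: higher scores mean "more likely genuine", thresholds are swept over the observed scores only, and the function name is hypothetical.

```python
def equal_error_rate(genuine_scores, imitator_scores):
    """Approximate the EER by sweeping a decision threshold over all observed scores.

    A sample is accepted when its score is >= the threshold (assumption).
    FRR = share of genuine samples rejected, FAR = share of imitator samples accepted.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(genuine_scores) | set(imitator_scores)):
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= threshold for s in imitator_scores) / len(imitator_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.7, 0.4]
    imitators = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(genuine, imitators))
```

With few samples the two error curves rarely cross exactly, so averaging FAR and FRR at the closest threshold is one common convention; interpolating between neighbouring thresholds is another.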

Smart SE: Smart Systems and Services Innovative Professional Education Program.

Hironori Washizaki, Kenji Tei, Kazunori Ueda, Hayato Yamana, Yoshiaki Fukazawa, Shinichi Honiden, Shoichi Okazaki, Nobukazu Yoshioka, Naoshi Uchihira

44th IEEE Annual Computers, Software, and Applications Conference(COMPSAC) 1113 - 1114 2020

The Smart Systems and Services Innovative Professional Education (Smart SE) program is a certification program developed as part of the education network for the Practical information Technologies (enPiT-Pro) project, which is funded by the Japan Ministry of Education, Culture, Sports, Science and Technology. The Smart SE program provides industry professionals working in fields related to information and communication technology (ICT) with additional training and education in smart systems and services that utilize various technologies such as IoT, Cloud, Big Data, and Artificial Intelligence (AI) for businesses. Here, we illustrate its purpose, curriculum and features to respond to the needs of industrial professional education.

Citations (Scopus): 2
Privacy Preserving Calculation in Cloud using Fully Homomorphic Encryption with Table Lookup

Ruixiao Li, Yu Ishimaki, Hayato Yamana

2020 5th IEEE International Conference on Big Data Analytics (IEEE ICBDA 2020) 315 - 322 2020 [Refereed]

To protect data in cloud servers, fully homomorphic encryption (FHE) is an effective solution. In addition to encrypting data, FHE allows a third party to evaluate arithmetic circuits (i.e., computations) over encrypted data without decrypting it, guaranteeing protection even during the calculation. However, FHE supports only addition and multiplication. Functions that cannot be directly represented by additions or multiplications cannot be evaluated with FHE. A naive implementation of such arithmetic operations with FHE is a bit-wise operation that encrypts numerical data as a binary string. This incurs huge computation time and storage costs, however. To overcome this limitation, we propose an efficient protocol to evaluate multi-input functions with FHE using a lookup table. We extend our previous work, which evaluates a single-integer input function, such as f(x). Our extended protocol can handle multi-input functions, such as f(x, y). Thus, we propose a new method of constructing lookup tables that can evaluate multi-input functions to handle general functions. We adopt integer encoding rather than bit-wise encoding to speed up the evaluations. By adopting both permutation operations and a private information retrieval scheme, we guarantee that no information from the underlying plaintext is leaked between two parties: a cloud computation server and a decryptor. Our experimental results show that the runtime of our protocol for a two-input function is approximately 13 minutes, when there are 8,192 input elements in the lookup table. By adopting a multi-threading technique, the runtime can be further reduced to approximately three minutes with eight threads. Our work is more practical than a previously proposed bit-wise implementation, which requires 60 minutes to evaluate a single-input function.
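
The idea of evaluating a two-input function through a lookup table while using only additions and multiplications can be simulated in plaintext. The sketch below is not the protocol from the paper (it omits encryption, the permutation operations, and private information retrieval); the one-hot selector vectors, the function names, and the choice of `max` as the example function are assumptions made purely for illustration.

```python
def one_hot(value, domain):
    """Selector vector with a 1 at the position of `value` in `domain`.
    Built in the clear here; how selectors are obtained under encryption is
    not covered by this sketch."""
    return [1 if value == d else 0 for d in domain]


def lut_eval(x, y, domain_x, domain_y, table):
    """Return table[i][j] for the (i, j) matching (x, y), using only + and *.

    table[i][j] is assumed to hold f(domain_x[i], domain_y[j]).
    """
    sel_x = one_hot(x, domain_x)
    sel_y = one_hot(y, domain_y)
    result = 0
    for i, sx in enumerate(sel_x):
        for j, sy in enumerate(sel_y):
            # Only the matching cell has sx * sy == 1; every other term is 0.
            result += sx * sy * table[i][j]
    return result


if __name__ == "__main__":
    domain_x, domain_y = [0, 1, 2], [0, 1]
    table = [[max(a, b) for b in domain_y] for a in domain_x]
    print(lut_eval(2, 1, domain_x, domain_y, table))
```

The double loop touches every table cell, which hints at why table size drives the runtime figures reported in the summary.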

Citations (Scopus): 4
Geographic Diversification of Recommended POIs in Frequently Visited Areas.

Jungkyu Han, Hayato Yamana

ACM Transactions on Information Systems 38 ( 1 ) 1 - 39 2020 [Refereed]

In personalized Point-Of-Interest (POI) (or venue) recommendation, the diversity of recommended POIs is an important aspect. Diversity is especially important when POIs are recommended in the target users' frequently visited areas, because users are likely to revisit such areas. In addition to POI category diversity, which is a popular diversification objective in recommendation domains, diversification of recommended POI locations is an interesting subject in itself. Despite its importance, existing POI recommender studies generally focus on and evaluate prediction accuracy. In this article, geographical diversification (geo-diversification), a novel diversification concept that aims to increase recommendation coverage for a target user's geographic areas of interest, is introduced, from which a method that improves geo-diversity as an addition to existing state-of-the-art POI recommenders is proposed. In experiments with the datasets from two real Location Based Social Networks (LBSNs), we first analyze the performance of four state-of-the-art POI recommenders from various evaluation perspectives, including category diversity and geo-diversity, that have not been examined previously. The proposed method consistently improves geo-diversity (CPR(geo)@20) by 5 to 12% when combined with four state-of-the-art POI recommenders with negligible prediction accuracy (Recall@20) loss and provides 6 to 18% geo-diversity improvement with tolerable prediction accuracy loss (up to 2.4%).

Citations (Scopus): 18
Appearance Frequency-Based Ranking Method for Improving Recommendation Diversity

Seiki Miyamoto, Takumi Zamami, Hayato Yamana

2019 4th IEEE International Conference on Big Data Analytics (ICBDA 2019) 420 - 425 2019 [Refereed]

Recommender systems are used to analyze users' preferences through their past activities and to personalize recommendations for each user based on what they might be interested in. The performance of a recommender system is most commonly measured using only recommendation accuracy. However, recommending accurate items does not mean that the generated recommendation is the best for the user, because it can be biased towards items that have a higher chance of being liked by users, such as popular items. With biased item selection, recommendations become repetitive and obvious and are less likely to be personalized. To mitigate bias and repetitiveness, recommendation diversity has been studied. However, diversity has a trade-off relationship with accuracy. Modifying the recommendation algorithm to consider diversity while learning user preferences would not only cause a loss in accuracy but also lead to a less precise reading of user preferences. Instead, applying a ranking method that re-ranks the items predicted by the recommendation algorithm preserves the precision of the algorithm itself. In this study, we propose a ranking method that uses the appearance frequency of items to restrict items from being recommended too frequently. The experimental results showed that the proposed method consistently improved diversity across multiple diversity metrics.
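
One way to picture a re-ranking step that limits how often an item appears is a simple cap on appearances across users. The sketch below is only an illustration of that general idea; the hard cap, the first-come processing order of users, and all names are assumptions and are not described in the summary above.

```python
from collections import Counter


def rerank_with_frequency_cap(ranked_candidates, top_n, max_appearances):
    """Build top-n lists while limiting how many lists each item may appear in.

    ranked_candidates: dict of user -> items sorted by predicted score (best first),
                       as produced by any underlying recommender (assumption)
    top_n:             number of items to recommend per user
    max_appearances:   hypothetical cap on appearances of one item over all users
    """
    appearances = Counter()
    recommendations = {}
    for user, candidates in ranked_candidates.items():
        picked = []
        for item in candidates:
            if len(picked) == top_n:
                break
            # Skip items that have already been recommended too often.
            if appearances[item] < max_appearances:
                picked.append(item)
                appearances[item] += 1
        recommendations[user] = picked
    return recommendations


if __name__ == "__main__":
    ranked = {"u1": ["a", "b", "c"], "u2": ["a", "b", "c"], "u3": ["a", "c", "b"]}
    print(rerank_with_frequency_cap(ranked, top_n=1, max_appearances=1))
```

The underlying recommender is left untouched here; only the order in which its predictions are surfaced changes, which matches the motivation for re-ranking given in the summary.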

Citations (Scopus): 5
Privacy-preserving Recommendation for Location-based Services

Qiuyi Lyu, Yu Ishimaki, Hayato Yamana

2019 4th IEEE International Conference on Big Data Analytics (ICBDA 2019) 98 - 105 2019 [Refereed]

Location-based recommendation services, such as Foursquare, make consumers' lives more convenient. Users are usually sensitive about disclosing their personal information. Unavoidable security concerns arise because malicious third parties could misuse confidential information, such as the users' preferences. The mainstream approach to this problem employs the privacy-preserving k-NN search algorithm. However, two major bottlenecks exist. One is that it only provides the nearest points of interest (POIs) to the users without any recommendations based on the users' behavior history. This limited service eventually results in a situation in which no user would prefer to continue using it. The other is that only a single user holds the private key; thus, the service providers cannot obtain any user information to analyze for profit. To solve the first problem, our proposed protocol provides recommendation services by adopting collaborative filtering techniques with an encrypted database based on fully homomorphic encryption, in addition to encrypting both the user's location and preferences. For the second problem, a privacy service provider (PSP) is designed to generate and hold the private key. Thus, service providers can homomorphically compute aggregate information concerning user behavior patterns and send the encrypted results to the PSP for decryption while maintaining the privacy of individual users. Compared with previous studies, the novelty of the proposed protocol is the design of a commercially valuable privacy-preserving recommendation mechanism that could benefit both consumers and service providers on LBS.

Citations (Scopus): 6
Fully Homomorphic Encryption with Table Lookup for Privacy-Preserving Smart Grid.

Ruixiao Li, Yu Ishimaki, Hayato Yamana

IEEE International Conference on Smart Computing(SMARTCOMP) 19 - 24 2019 [Refereed]

Smart grids are indispensable applications in smart connected communities (SCC). To construct privacy-preserving anomaly detection systems on a smart grid, we adopt fully homomorphic encryption (FHE) to protect users' sensitive data. Although FHE allows a third party to perform calculations on encrypted data without decryption, FHE only supports addition and multiplication on encrypted data. In anomaly detection, we must calculate both harmonic and arithmetic means consisting of logarithms. A naive implementation of such arithmetic operations with FHE is a bitwise operation; thus, it requires huge computation time. To speed up such calculations, we propose an efficient protocol to evaluate any functions with FHE using a lookup table (LUT). Our protocol allows integer encoding, i.e., a set of integers is encrypted as a single ciphertext, rather than using bitwise encoding. Our experimental results in a multi-threaded environment show that the runtime of our protocol is approximately 51 s when the size of the LUT is 448,000. Our protocol is more practical than the previously proposed bitwise implementation.

Citations (Scopus): 8
URL-based Phishing Detection using the Entropy of Non-Alphanumeric Characters.

Eint Sandi Aung, Hayato Yamana

Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services(iiWAS) 385 - 392 2019 [Refereed]

© 2019 Association for Computing Machinery. Phishing is a type of personal information theft in which phishers lure users to steal sensitive information. Phishing detection mechanisms using various techniques have been developed. Our hypothesis is that phishers create fake websites with as little information as possible in a webpage, which makes detection difficult for content- and visual similarity-based methods that analyze the webpage content. To overcome this, we focus on the use of Uniform Resource Locators (URLs) to detect phishing. Since previous work extracts specific special-character features, we assume that non-alphanumeric (NAN) character distributions highly impact the performance of URL-based detection. We hence propose a new feature called the entropy of NAN characters for URL-based phishing detection. Experimental evaluation with balanced and imbalanced datasets shows 96% ROC AUC on the balanced dataset and 89% ROC AUC on the imbalanced dataset, an increase of 5 to 6% in ROC AUC over not adopting our proposed feature.
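
A feature of this kind can be illustrated with the standard Shannon entropy taken over only the non-alphanumeric characters of a URL. The summary does not give the exact definition used in the paper, so the formula choice, the use of base-2 logarithms, and the function name below are assumptions.

```python
import math
from collections import Counter


def nan_entropy(url: str) -> float:
    """Shannon entropy (in bits) of the non-alphanumeric characters in `url`.

    Returns 0.0 when the URL contains no such characters. Note that
    str.isalnum() also treats non-ASCII letters and digits as alphanumeric.
    """
    nan_chars = [c for c in url if not c.isalnum()]
    if not nan_chars:
        return 0.0
    counts = Counter(nan_chars)
    total = len(nan_chars)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    for url in ("http://example.com/", "http://ex-ample.com/login.php?id=1&x=%20"):
        print(url, round(nan_entropy(url), 3))
```

In a detection pipeline, a value like this would typically be one column among other URL features fed to a classifier rather than a decision rule on its own.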

Citations (Scopus): 6
A Privacy-Preserving Query System using Fully Homomorphic Encryption with Real-World Implementation for Medicine-Side Effect Search.

Yusheng Jiang, Tamotsu Noguchi, Nobuyuki Kanno, Yoshiko Yasumura, Takuya Suzuki, Yu Ishimaki, Hayato Yamana

Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services(iiWAS) 63 - 72 2019 [Refereed]

© 2019 Association for Computing Machinery. The preservation of privacy during a search has become a serious problem in recent years. There is an increasing requirement to ensure that user queries are not abused by a third party, including the search provider. Fully homomorphic encryption (FHE) can conduct addition and multiplication directly over ciphertext. Using FHE, privacy, concerning both the user queries and the database of the search provider, can be protected. In this paper, we propose a privacy-preserving query system model. We implemented the proposed model in a real-world medicine side-effect query system. We applied a filtering technique, prior to the query deployment, to reduce the size of the database and used multi-threading to accelerate the search. The system was tested 10,000 times with a random query, using a database comprising 40,000 records of simulation data, and completed 99.84% of the queries within 60 seconds (s), proving the real-world applicability of our system.

Citations (Scopus): 6
Outsourced Private Set Union on Multi-Attribute Datasets for Search Protocol using Fully Homomorphic Encryption.

Rumi Shakya, Yoshiko Yasumura, Takuya Suzuki, Yu Ishimaki, Hayato Yamana

Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services(iiWAS) 55 - 62 2019 [Refereed]

© 2019 Association for Computing Machinery. In the era of big data and cloud computing, outsourcing data storage to the cloud poses the risk of its abuse or leakage. Thus, we address the problem of delegating computation on outsourced private datasets while maintaining privacy. In this study, we consider a scenario involving two data owners outsourcing their datasets to a cloud service. The cloud performs a set union computation, after which the querier sends a query to obtain information from both datasets. We propose a protocol that uses fully homomorphic encryption (FHE) and Cartesian-join of Bloom filters (CBF) as proposed by Wang et al. The protocol obtains information on the existence of a particular set of elements without learning about the residing source. To the best of our knowledge, our protocol, by using the FHE and CBF matrix, is a novel approach to ensuring the security of outsourced set union operations.

Citations (Scopus): 3
Secure Naïve Bayes Classification Protocol over Encrypted Data Using Fully Homomorphic Encryption.

Yoshiko Yasumura, Yu Ishimaki, Hayato Yamana

Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services(iiWAS) 45 - 54 2019 [Refereed]

© 2019 Association for Computing Machinery. Machine learning classification has a wide range of applications. In the big data era, a client may want to outsource classification tasks to reduce the computational burden at the client. Meanwhile, an entity may want to provide a classification model and classification services to such clients. However, applications such as medical diagnosis require sensitive data that both parties may not want to reveal. Fully homomorphic encryption (FHE) enables secure computation over encrypted data without decryption. By applying FHE, classification can be outsourced to a cloud without revealing any data. However, existing studies on classification over FHE do not achieve the scenario of outsourcing classification to a cloud while preserving the privacy of the classification model, client's data and result. In this work, we apply FHE to a naïve Bayes classifier and, to the best of our knowledge, propose the first concrete secure classification protocol that satisfies the above scenario.

Citations (Scopus): 12
Point of Interest Recommendation by Exploiting Geographical Weighted Center and Categorical Preference.

Fan Mo, Hayato Yamana

2019 International Conference on Data Mining Workshops 2019-November 73 - 76 2019 [Refereed]

© 2019 IEEE. Point of interest (POI) recommendation is one of the indispensable services in location-based social networks (LBSNs). POI recommendation helps users find new locations and better understand the city. In LBSNs, aspects such as geographical information and categorical information improve the accuracy of POI recommendation. In this paper, we propose two new techniques to improve recommendation accuracy: 1) a weighted center for each of a target user's active areas and 2) a category-dependent threshold for categorical preference. The weighted center represents the density-based center of a target user's active area. The geographical aspect usually adopts the active areas that the target user has frequently visited. Although previous studies define an active area by its active center and radius, they choose the location of the most frequently visited POI as the active center even if several POIs have a similar number of check-ins, which results in a misdefined active center. Our weighted center is able to handle the target user's check-in probability, which follows a power-law distribution. In addition, previous studies predict users' preferences for categories; however, they neglect the fact that different categories have different user preference distributions. For example, one category may have a wide range of subcategories preferred by a user, whereas another category may have only a few preferred subcategories even if it contains many subcategories. Thus, we set different thresholds to select candidate subcategories in each category. Experimental results with the Weeplaces dataset show that our method outperforms other baselines by at least 16.93% in F1-score@5.
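
The contrast between "most frequently visited POI" and a weighted center can be made concrete with a check-in-weighted mean of POI coordinates. The exact weighting in the paper is not given in the summary; weighting by raw check-in counts, averaging latitude and longitude directly (reasonable only for a small area), and the function name are assumptions of this sketch.

```python
def weighted_center(pois):
    """Check-in-weighted mean position of the POIs in one active area.

    pois: list of (latitude, longitude, checkin_count) tuples (assumed layout)
    Returns (latitude, longitude) of the weighted center.
    """
    total = sum(count for _, _, count in pois)
    if total == 0:
        raise ValueError("no check-ins in this area")
    lat = sum(la * count for la, _, count in pois) / total
    lon = sum(lo * count for _, lo, count in pois) / total
    return lat, lon


if __name__ == "__main__":
    # Two POIs with similar check-in counts: the center falls between them
    # instead of snapping to the slightly more popular one.
    area = [(35.7000, 139.7000, 10), (35.7100, 139.7200, 9)]
    print(weighted_center(area))
```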

Citations (Scopus): 3
Two-Factor Authentication Using Leap Motion and Numeric Keypad.

Tomoki Manabe, Hayato Yamana

HCI for Cybersecurity, Privacy and Trust - First International Conference 11594 LNCS 38 - 51 2019 [Refereed]

© 2019, Springer Nature Switzerland AG. Biometric authentication has become popular in modern society. It takes less time and effort for users than conventional password authentication. Furthermore, biometric authentication has been considered more secure than password authentication because biometric information is more difficult to steal than passwords. However, given the development of high-spec cameras and image recognition technology, the risk of theft of biometric information, such as fingerprints, is increasing. Additionally, biometric authentication exhibits lower and less stable accuracy than password authentication. To solve these issues, we propose two-factor authentication combining password input and biometric authentication of the hand. We adopt Leap Motion to measure physical and behavioral features related to hands. Subsequently, a random forest classifier determines whether the hand features belong to a genuine user. Our authentication system architecture completes the biometric authentication by using a limited amount of data obtained within a few seconds when a user enters a password. The advantage of the proposed method is that biometric authentication prevents intrusion even if a password is stolen. Our experimental results for 21 testers show 94.98% authentication accuracy within a limited duration, 2.52 s on average, while inputting a password.

Citations (Scopus): 3
Effectiveness of Usability & Performance Features for Web Credibility Evaluation.

Kenta Yamada, Hayato Yamana

2019 IEEE International Conference on Big Data (IEEE BigData) 6257 - 6259 2019 [Refereed]

Unreliable web pages, such as fake news, have become an unavoidable problem. To tackle this problem, recent studies have adopted both content and social features to predict the credibility of web pages; however, the accuracy is almost saturated. In this paper, we propose the adoption of Google Lighthouse features to predict web page credibility. Our experimental results show that the proposed method increases accuracy by 7.9% in comparison with state-of-the-art methods.

Citations (Scopus): 7
Message from the BITS 2018 General Chairs and TPC Chairs

Sajal K. Das, Hayato Yamana (General Co-Chairs), Mauro Conti, Atsuko Miyaji, Jun Sakuma

Proceedings - 2018 IEEE International Conference on Smart Computing, SMARTCOMP 2018 xxiii 2018.07 [Refereed]

Attribute-based proxy re-encryption method for revocation in cloud storage: Reduction of communication cost at re-encryption

Yoshiko Yasumura, Hiroki Imabayashi, Hayato Yamana

2018 IEEE 3rd International Conference on Big Data Analysis, ICBDA 2018 312 - 318 2018.05 [Refereed]

© 2018 IEEE. In recent years, many users have uploaded data to the cloud for easy storage and sharing with other users. At the same time, security and privacy concerns for the data are growing. Attribute-based encryption (ABE) enables both data security and access control by defining users with attributes so that only those users who have matching attributes can decrypt them. For real-world applications of ABE, revocation of users or their attributes is necessary so that revoked users can no longer decrypt the data. In actual implementations, ABE is used in hybrid with a symmetric encryption scheme such as the advanced encryption standard (AES) where data is encrypted with AES and the AES key is encrypted with ABE. The hybrid encryption scheme requires re-encryption of the data upon revocation to ensure that the revoked users can no longer decrypt that data. To re-encrypt the data, the data owner (DO) must download the data from the cloud, then decrypt, encrypt, and upload the data back to the cloud, resulting in both huge communication costs and computational burden on the DO depending on the size of the data to be re-encrypted. In this paper, we propose an attribute-based proxy re-encryption method in which data can be re-encrypted in the cloud without downloading any data by adopting both ABE and Syalim's encryption scheme. Our proposed scheme reduces the communication cost between the DO and cloud storage. Experimental results show that the proposed method reduces the communication cost by as much as one quarter compared to that of the trivial solution.

Citations (Scopus): 16
Loop Circuit Optimization with Bootstrapping over Fully Homomorphic Encryption and its Application to Nearest Neighbor

佐藤宏樹, 馬屋原昂, 石巻優, 山名早人

DBSJ Japanese Journal (Web) 16-J, Paper No. 12 (Web only) 2018.03

J-GLOBAL
Realization of Active Authentication for Smart Phone by Using Online Learning

石山雄大, 山名早人

DBSJ Japanese Journal (Web) 16-J, Paper No. 18 (Web only) 2018.03

J-GLOBAL
ShuttleBoard: A Kana Character Input Method with Fewer Tap Actions for Smartwatches

下岡純也, 山名早人

DBSJ Japanese Journal (Web) 16-J, Paper No. 5 (Web only) 2018.03

J-GLOBAL
History-enhanced Focused Website Segment Crawler.

Tanaphol Suebchua, Bundit Manaskasemsak, Arnon Rungsawang, Hayato Yamana

2018 International Conference on Information Networking(ICOIN) 2018-January 80 - 85 2018 [Refereed]

The primary challenge in focused crawling research is how to efficiently utilize computing resources, e.g., bandwidth, disk space, and time, to find as many web pages related to a specific topic as possible. To meet this challenge, we previously introduced a machine-learning-based focused crawler that aims to crawl a group of relevant web pages located in the same directory path, called a website segment, and has achieved high efficiency so far. One of the limitations of our previous approach is that it may repeatedly visit a website that does not serve any relevant website segments, in the scenario where the website segments share the same linkage characteristics as the relevant ones in the training dataset. In this paper, we propose a "history-enhanced focused website segment crawler" to solve the problem. The idea behind it is that the priority score of an unvisited website segment should be reduced if the crawler has consecutively downloaded many irrelevant web pages from the website. To implement this idea, we propose a new prediction feature, called the "history feature", that is extracted from the recent crawling results, i.e., relevant and irrelevant web pages gathered from the target website. Our experiment shows that our newly proposed feature could improve the crawling efficiency of our focused crawler by a maximum of approximately 5%.

DOI

Scopus

5

Citation

(Scopus)
Efficient Topical Focused Crawling Through Neighborhood Feature.

Tanaphol Suebchua, Bundit Manaskasemsak, Arnon Rungsawang, Hayato Yamana

New Generation Computing 36 ( 2 ) 95 - 118 2018 [Refereed]

A focused web crawler is an essential tool for gathering domain-specific data used by national web corpora, vertical search engines, and so on, since it is more efficient than general breadth-first or depth-first crawlers. The central problem in focused crawling research is how to prioritize the unvisited web pages in the crawling frontier so that they can be crawled in order of priority. The most common feature for prioritizing an unvisited web page, adopted in many focused crawling studies, is the relevancy of the set of its source web pages, i.e., the pages that link to it. However, this feature is limited, because we cannot estimate the relevancy of the unvisited web page correctly if we have few source web pages. To solve this problem and enhance the efficiency of focused web crawlers, we propose a new feature, called the "neighborhood feature". This enables the adoption of additional already-downloaded web pages to estimate the priority of a target web page. The additionally adopted web pages consist of both web pages located in the same directory as the target web page and web pages whose directory paths are similar to that of the target web page. Our experimental results show that our enhanced focused crawlers outperform both the crawlers that do not utilize the neighborhood feature and state-of-the-art focused crawlers, including the HMM crawler.
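
The directory-based part of the neighborhood idea described in this summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the relevance values in [0, 1], and the use of a plain average are assumptions made for illustration, and the "similar directory path" neighbors mentioned in the summary are left out.

```python
from posixpath import dirname
from urllib.parse import urlsplit


def directory_of(url):
    """Return host plus directory path, e.g. 'example.com/news'."""
    parts = urlsplit(url)
    return parts.netloc + dirname(parts.path)


def neighborhood_score(target_url, downloaded):
    """Average relevance of already-downloaded pages in the target's directory.

    `downloaded` maps URL -> relevance in [0, 1] (a hypothetical classifier
    output). Returns None when no same-directory page has been downloaded.
    """
    target_dir = directory_of(target_url)
    scores = [rel for url, rel in downloaded.items()
              if directory_of(url) == target_dir]
    return sum(scores) / len(scores) if scores else None


downloaded = {
    "http://example.com/news/a.html": 1.0,
    "http://example.com/news/b.html": 0.0,
    "http://example.com/shop/c.html": 0.0,
}
print(neighborhood_score("http://example.com/news/new.html", downloaded))  # 0.5
```

A score like this could supplement the in-link relevancy when a target page has few known source pages, which is the situation the summary identifies as the weakness of the common feature.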

DOI

Scopus

9

Citation

(Scopus)
Editor's Message to Special Issue of Young Researchers' Papers.

Hayato Yamana

Journal of Information Processing 26 224 - 224 2018 [Refereed]

DOI

Scopus
Outsourced Private Set Intersection Cardinality with Fully Homomorphic Encryption.

Arisa Tajima, Hiroki Sato, Hayato Yamana

6th International Conference on Multimedia Computing and Systems(ICMCS) 2018-May 1 - 8 2018 [Refereed]

Cloud database services have attracted considerable interest with the increase in the amount of data to be analyzed. Delegating data management to cloud services, however, causes security and privacy issues because cloud services are not always trustworthy. In this study, we address the problem of answering join queries across outsourced private datasets while maintaining data confidentiality. We particularly consider a scenario in which two data owners each own a set of elements and a querier asks the cloud to perform join operations to obtain the number of common elements in the two datasets. To process the join operations without revealing the contents of data to the cloud, we propose two protocols, a basic protocol and a querier-friendly protocol, which adopt a functionality of outsourced private set intersection cardinality (OPSI-CA) with fully homomorphic encryption (FHE) and Bloom filters. The querier-friendly protocol achieves a reduction in communication and computation costs for the querier. Our experimental results show that it takes 436 s for the basic protocol and 298 s for the querier-friendly protocol to execute the join query on the two datasets with 100 elements each. The novelty of this study is that our protocols are the first approaches for outsourced join operations adopting FHE.
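
To make the role of Bloom filters in set intersection cardinality concrete, the following Python sketch counts common elements entirely in plaintext. It is only an illustration under stated assumptions: the class, the SHA-256-based hashing, and the filter size are hypothetical choices, and no encryption is performed, whereas the protocols in the summary evaluate such operations under FHE.

```python
import hashlib


class BloomFilter:
    """Plain (unencrypted) Bloom filter with SHA-256-derived bit positions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def intersection_cardinality(set_a, set_b, num_bits=1024, num_hashes=3):
    """Count elements of set_b that test positive against a filter of set_a.

    Bloom filters have no false negatives, so the result never undercounts;
    false positives may cause a slight overcount.
    """
    bloom = BloomFilter(num_bits, num_hashes)
    for item in set_a:
        bloom.add(item)
    return sum(1 for item in set_b if bloom.might_contain(item))


owner_a = {"alice", "bob", "carol"}
owner_b = {"bob", "carol", "dave"}
print(intersection_cardinality(owner_a, owner_b))
```

The membership test is a product of bits, which is one reason a Bloom filter encoding is a natural fit for arithmetic over encrypted data.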

DOI

Scopus

30

Citation

(Scopus)
Active Authentication on Smartphone using Touch Pressure.

Masashi Kudo, Hayato Yamana

The 31st Annual ACM Symposium on User Interface Software and Technology Adjunct Proceedings 96 - 98 2018 [Refereed]

Smartphone user authentication is still an open challenge because it must balance security and usability. Active authentication is one way to achieve this balance. In this paper, we aim to improve the accuracy of active authentication by adopting online learning with touch pressure. In recent years, smartphones equipped with pressure sensors have become readily available, and we have confirmed the effectiveness of adopting touch pressure as one of the authentication features. Our experiments adopting the online AROW algorithm with touch pressure show that the equal error rate (EER), the point where the miss rate and the false acceptance rate are equal, is reduced up to one-fifth by adding the touch pressure feature. Moreover, we have confirmed that training with data from both sitting and prone postures performs best when testing on a variety of postures including sitting, standing, and prone, achieving an EER of up to 0.14%.
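
The equal error rate mentioned in this summary can be approximated from classifier scores as in the Python sketch below. This is a generic illustration, not the evaluation code of the paper: the function name, the convention that higher scores mean "genuine user", and the threshold sweep over observed scores are all assumptions.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by trying every observed score as a threshold.

    A sample is accepted when its score is >= threshold. The function
    returns the mean of the two error rates at the threshold where the
    false rejection rate and false acceptance rate are closest.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(genuine, impostor))  # 0.25
```

In this toy data, one genuine sample scores below one impostor sample, so at the threshold 0.6 both error rates equal 1/4.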

DOI

Scopus

5

Citation

(Scopus)
Non-Interactive and Fully Output Expressive Private Comparison.

Yu Ishimaki, Hayato Yamana

Progress in Cryptology - INDOCRYPT 2018 - 19th International Conference on Cryptology in India(INDOCRYPT) 11356 LNCS 355 - 374 2018 [Refereed]

Private comparison protocols are fundamental to the field of secure computation. Recently, Lu et al. (ASIACCS 2018) proposed a new protocol, XCMP, which is based on a ring-based fully homomorphic encryption (FHE) scheme. In that scheme, two μ-bit integers a and b are compared in encrypted form without revealing the plaintext to an evaluator. The protocol outputs a bit in encrypted form, which indicates whether a > b. XCMP has the following three advantages: the output can be reused for further processing, the evaluation is performed without any interactions with a decryptor having a secret key, and the required multiplicative depth is only 1. However, XCMP has two potential disadvantages. First, the protocol result preserves both additive and multiplicative homomorphisms over ℤ_t only, whereas the underlying FHE scheme can support a much larger plaintext space of (Formula Presented) for a prime t and a power-of-two N; this restricts the functionality of applications using the comparison result. Second, the bit length μ of the integers to be compared is no more than log N (typically 16 bits, at most). Thus, it is difficult for XCMP to handle larger integers. In this paper, we propose a non-interactive private comparison protocol that solves the aforementioned problems and outputs an additively and multiplicatively reusable comparison result over the ring without adding an extremely large computational overhead over XCMP. Moreover, by regarding a μ-bit integer (μ > 16) as a sequence of chunks, we show that the multiplicative depth required for our comparison protocol is logarithmic in the number of chunks. This value is much smaller than that of the naïve solution, which has a multiplicative depth of log μ. Experimental results demonstrate that our protocol introduces only a slight overhead over XCMP. Remarkably, we experimentally demonstrate that our protocol for a larger domain is comparable to the construction given by one of the state-of-the-art bitwise FHE schemes.
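
The claim that chunked comparison needs a multiplicative depth logarithmic in the number of chunks can be illustrated with a plaintext Python sketch. The combination rule below is the standard lexicographic one and is an assumption about how per-chunk results might be merged; it is not taken from the paper, all names are hypothetical, and ordinary integers stand in for ciphertexts.

```python
def split_chunks(value, chunk_bits, num_chunks):
    """Split a non-negative integer into chunks, most significant first."""
    mask = (1 << chunk_bits) - 1
    return [(value >> (chunk_bits * i)) & mask
            for i in reversed(range(num_chunks))]


def combine(high, low):
    """Merge (gt, eq) flags of a more significant and a less significant part.

    Only additions and multiplications are used, as they would be over FHE.
    """
    gt_h, eq_h = high
    gt_l, eq_l = low
    return (gt_h + eq_h * gt_l, eq_h * eq_l)


def greater_than(a, b, chunk_bits=16, num_chunks=4):
    """Return (1 if a > b else 0, number of multiplication levels used)."""
    # Per-chunk flags; in a real protocol these would come from an
    # encrypted comparison of each chunk pair.
    flags = [(int(x > y), int(x == y))
             for x, y in zip(split_chunks(a, chunk_bits, num_chunks),
                             split_chunks(b, chunk_bits, num_chunks))]
    depth = 0
    while len(flags) > 1:
        merged = [combine(flags[i], flags[i + 1])
                  for i in range(0, len(flags) - 1, 2)]
        if len(flags) % 2:
            merged.append(flags[-1])  # odd chunk carried to the next round
        flags = merged
        depth += 1
    return flags[0][0], depth


print(greater_than(2**40 + 5, 2**40 + 4))  # (1, 2)
```

Pairing neighbors in a binary tree halves the number of partial results per round, so four chunks need two rounds of multiplications instead of a chain of three.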

DOI

Scopus

10

Citation

(Scopus)
External Content-dependent Features for Web Credibility Evaluation.

Kazuyoshi Ootani, Hayato Yamana

IEEE International Conference on Big Data (IEEE BigData 2018) 5414 - 5416 2018 [Refereed]

Unreliable web pages such as fake news have become a global problem in the big data era. Fake news is often published for profit, for example, to earn advertising income by placing ads on the pages. In this paper, we focus on the differences in HTML tag usage between reliable and unreliable web pages, and propose new features for predicting their credibility. The experimental results show that our proposed features increase accuracy when used together with the previously proposed Contents features.

DOI

Scopus

2

Citation

(Scopus)
Improving Recommendation Diversity Across Users by Reducing Frequently Recommended Items.

Seiki Miyamoto, Takumi Zamami, Hayato Yamana

IEEE International Conference on Big Data (IEEE BigData 2018) 5392 - 5394 2018 [Refereed]

Recommender systems analyze users' preferences through their past activities and recommend items in which they might be interested. Much research has been conducted on improving recommendation accuracy so that recommender systems capture user preferences more accurately. However, it is also important to consider recommendation diversity, because a lack of diversity leads to repetitive and obvious recommendations. In this paper, we propose a method that re-ranks the recommendation list according to the appearance frequency of items in order to recommend a wider range of items. The experimental results show that our method consistently performs better than a related work at improving recommendation diversity.
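
One way frequency-based re-ranking could look is sketched below in Python. The summary does not give the scoring rule, so the linear penalty, the `alpha` weight, and the function name are assumptions chosen only to show the general idea of demoting items that appear in many users' lists.

```python
from collections import Counter


def rerank_by_frequency(candidates, top_k=2, alpha=1.0):
    """Re-rank each user's candidates, penalizing widely recommended items.

    candidates: user -> list of (item, score) pairs in any order.
    The penalty (alpha * share of users whose initial top-k contains
    the item) is a hypothetical choice, not the paper's formula.
    """
    initial = {user: sorted(items, key=lambda p: p[1], reverse=True)[:top_k]
               for user, items in candidates.items()}
    freq = Counter(item for top in initial.values() for item, _ in top)
    num_users = len(candidates)
    reranked = {}
    for user, items in candidates.items():
        adjusted = sorted(
            items,
            key=lambda p: p[1] - alpha * freq[p[0]] / num_users,
            reverse=True,
        )
        reranked[user] = [item for item, _ in adjusted[:top_k]]
    return reranked


candidates = {
    "u1": [("hit", 0.9), ("a", 0.8), ("b", 0.7)],
    "u2": [("hit", 0.9), ("c", 0.8), ("d", 0.75)],
}
print(rerank_by_frequency(candidates))  # {'u1': ['b', 'a'], 'u2': ['d', 'c']}
```

With `alpha=0.0` the original accuracy-based ranking is returned unchanged, so the parameter exposes the trade-off between accuracy and diversity across users.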

DOI

Scopus

6

Citation

(Scopus)
A Survey on Recommendation Methods Beyond Accuracy.

Jungkyu Han, Hayato Yamana

IEICE Transactions on Information & Systems 100-D ( 12 ) 2931 - 2944 2017.12 [Refereed]

In recommending to another individual an item that one loves, accuracy is important; however, in most cases, focusing only on accuracy generates less satisfactory recommendations. Studies have repeatedly pointed out that aspects that go beyond accuracy, such as the diversity and novelty of the recommended items, are as important as accuracy in making a satisfactory recommendation. Despite their importance, there is no global consensus about definitions and evaluations regarding beyond-accuracy aspects, as such aspects closely relate to the subjective sensibility of user satisfaction. In addition, devising algorithms for this purpose is difficult, because algorithms concurrently pursue aspects that are in a trade-off relation (e.g., accuracy vs. novelty). Given this situation, it is important for researchers initiating a study in this domain to obtain a systematically integrated view of the domain. This paper reports the results of a survey of about 70 studies published over the last 15 years, each of which addresses recommendations that consider beyond-accuracy aspects. From this survey, we identify diversity, novelty, and coverage as important aspects in achieving serendipity and popularity unbiasedness, factors that are important to user satisfaction and business profits, respectively. The five major groups of algorithms that tackle the beyond-accuracy aspects are multi-objective, modified collaborative filtering (CF), clustering, graph, and hybrid; we classify and describe algorithms as per this typology. The off-line evaluation metrics and user studies carried out by the studies are also described. Based on the survey results, we assert that there is a lot of room for research in the domain. In particular, personalization and generalization are considered important issues that should be addressed in future research (e.g., automatic per-user trade-off among the aspects, and properly establishing beyond-accuracy aspects for various types of applications or algorithms).

DOI

Scopus

23

Citation

(Scopus)
Bits Message from General Co-Chairs

Sajal K. Das, Hayato Yamana

2017 IEEE International Conference on Smart Computing, SMARTCOMP 2017 xxiii 2017.06 [Refereed]

DOI

Scopus
Streamline Computation of Secure Frequent Pattern Mining by Fully Homomorphic Encryption

今林広樹, 石巻優, 馬屋原昂, 佐藤宏樹, 山名早人

情報処理学会論文誌トランザクションデータベース(Web) 10 ( 1 ) 1 - 12 2017.03

CiNii J-GLOBAL
Private Substring Search on Homomorphically Encrypted Data

Yu Ishimaki, Hiroki Imabayashi, Hayato Yamana

2017 IEEE INTERNATIONAL CONFERENCE ON SMART COMPUTING (SMARTCOMP) 457 - 462 2017 [Refereed]

With the rapid development of cloud storage services and IoT environments, how to search securely and efficiently without compromising privacy has become an important problem. To address this problem, many methods have been proposed for searching over encrypted data. Motivated by the need to store sensitive data such as genomic and medical data, substring search over encrypted data has been studied. Previous work either leaks the query access pattern by using a vulnerable cryptographic model or performs the search over plaintext data with an encrypted query. Thus, such work is not compatible with an outsourcing scenario in which the data is stored in encrypted form and is searched by an encrypted substring query without leaking the query access pattern, i.e., private substring search. To perform private substring search, fully homomorphic encryption (FHE) can be adopted, although it incurs a huge computational overhead. Because of this overhead, performing private substring search efficiently over FHE is a challenging task. In this work, we propose a private substring search protocol over encrypted data that adopts FHE, and we examine its feasibility. In particular, we make use of a batching technique that can accelerate homomorphic computation in a SIMD manner. In addition, we propose a data structure that is suited to the search function under batched computation. Our experimental results show that the proposed method is feasible.

DOI

Scopus

11

Citation

(Scopus)
Geographical Diversification in POI Recommendation: Toward Improved Coverage on Interested Areas

Jungkyu Han, Hayato Yamana

PROCEEDINGS OF THE ELEVENTH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS'17) 224 - 228 2017 [Refereed]

In recommending POIs (points of interest), factors such as the diversity of the recommended POIs are as important as accuracy for providing a satisfactory recommendation. Although existing diversification methods can help POI recommender systems suggest more diverse POIs, they lack "geographical diversification," which results in the concentration of the supposedly "diverse" recommended POIs in "a small portion" of the areas where the target user is most active. This is caused by the neglect of POI locations in the diversification, i.e., existing diversification methods try to diversify the categories of recommended items. However, geographical diversification is essential for users whose activity interests span many sub-areas and who require a variety of recommended POIs covering all their activity interests. In this paper, we propose a novel proportional geographical diversification method that recommends a variety of POIs located in the activity district of a user such that the variety of sub-areas in the district is proportional to the frequency of the user's activity in each sub-area. We compare the performance of the proposed method with existing diversification methods using real datasets. The evaluation results show that no method except the proposed one can significantly increase geographical diversity at the expense of a tolerable accuracy loss.

DOI

Scopus

21

Citation

(Scopus)
Virtual co-eating: Making solitary eating experience more enjoyable

Takahashi, M., Tanaka, H., Yamana, H., Nakajima, T.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10507 LNCS 460 - 464 2017 [Refereed]

Recently, research on the eating habits of Japanese college students revealed that they have a strong desire to communicate with others through co-eating. Even though a better eating experience through co-eating is important, they often end up eating alone for reasons such as small households, living alone, and having no time to find others to eat with. Therefore, we believe that the eating experience may be improved by incorporating a fictional character into the real space as an eating partner. To validate the idea, we have developed a virtual co-eating system for solving issues caused by solitary eating, and we present some insights from its user study.

DOI

Scopus

26

Citation

(Scopus)
A Variable-Length Motifs Discovery Method in Time Series using Hybrid Approach

Chaw Thet Zan, Hayato Yamana

19TH INTERNATIONAL CONFERENCE ON INFORMATION INTEGRATION AND WEB-BASED APPLICATIONS & SERVICES (IIWAS2017) 49 - 57 2017 [Refereed]

Discovery of repeated patterns, known as motifs, from long time series is essential for providing hidden knowledge to real-world applications such as medical, financial, and weather analysis. Motifs can be discovered either directly on raw time series or on their transformed abstract representations. Most time series motif discovery methods require a predefined motif length, which results in long execution times because the length has to be varied to discover motifs of different lengths. To solve this problem, we propose an efficient method for discovering variable-length motifs that combines an approximate method with exact verification. First, a symbolic representation is adopted to discover motifs roughly; the found motifs are then examined exactly against the original real-valued data to achieve fast and exact discovery. The experiments show that our proposed method discovers significant motifs efficiently in comparison with the state-of-the-art methods MK and SBF.

DOI

Scopus

2

Citation

(Scopus)
Securing Big Data and IoT Networks in Smart Cyber-Physical Environments

Sajal K. Das, Hayato Yamana

2017 INTERNATIONAL CONFERENCE ON SMART DIGITAL ENVIRONMENT (ICSDE'17) Part F130526 189 - 194 2017 [Refereed]

This position paper highlights security and privacy issues in smart environments based on cyber-physical systems. It also summarizes some of our recent research activities and projects in this area.

DOI

Scopus
Attribute-based Proxy Re-encryption Method for Revocation in Cloud Data Storage.

Yoshiko Yasumura, Hiroki Imabayashi, Hayato Yamana

2017 IEEE International Conference on Big Data (IEEE BigData 2017) 2018-January 4858 - 4860 2017 [Refereed]

In the big data era, many users upload data to the cloud while security concerns are growing. By using attribute-based encryption (ABE), users can securely store data in the cloud while exerting access control over it. Revocation is necessary for real-world applications of ABE so that revoked users can no longer decrypt data. In actual implementations, however, revocation requires re-encryption of the data on the client side through download, decryption, encryption, and upload, which results in a huge communication cost between the client and the cloud depending on the data size. In this paper, we propose a new method in which the data can be re-encrypted in the cloud without downloading any data. The experimental results show that our method reduces the communication cost by one quarter in comparison with the trivial solution in which re-encryption is performed on the client side.

DOI

Scopus

10

Citation

(Scopus)
MCMalloc: A Scalable Memory Allocator for Multithreaded Applications on a Many-core Shared-memory Machine.

Akira Umayabara, Hayato Yamana

2017 IEEE International Conference on Big Data (IEEE BigData 2017) 2018-January 4846 - 4848 2017 [Refereed]

In the big data era, multithreaded processing on a many-core machine, whose number of cores is still increasing, has become essential, besides distributed computing, for parallelizing the execution of big data applications. On such a machine, malloc-intensive applications cannot scale because of lock contention among threads, which becomes worse as the number of threads increases. To solve this problem, we propose a new method that reduces lock contention through batch malloc, pseudo free, and fine-grained data locking. Experimental results show a 4.72-fold speed-up in comparison with JEmalloc, which is the fastest of the previous memory allocators.

DOI

Scopus

3

Citation

(Scopus)
Familiarity-aware POI Recommendation in Urban Neighborhoods.

Jungkyu Han, Hayato Yamana

Journal of Information Processing 25 386 - 396 2017 [Refereed]

Users’ visiting patterns to POIs (Points-Of-Interest) vary with the users’ familiarity with the areas they visit. For instance, users visit tourist sites in unfamiliar cities rather than in their familiar home city. Previous studies have shown that familiarity can improve POI recommendation performance. However, such studies have focused on the differences between home and other cities, and not on those among small urban neighborhoods in the same city where user activities frequently occur. Applying these studies directly to such areas is difficult because simple distance-based familiarity measures, or visit-pattern differences represented by topics (groups of POIs that share common functions, such as arts or French restaurants), are too coarse to capture the differences observed among different areas. Among the urban neighborhoods of the same city, user visit-pattern differences originate at the more precise level of individual POIs. In order to extend the previously proposed familiarity-aware POI recommendation so that it can be adopted for different areas in the same city, we propose a method that employs visit-frequency-based familiarity and POI-level visit-pattern differentiation. In experiments on real LBSN data consisting of over 800,000 check-ins for three cities (NYC, LA, and Tokyo), our proposed method outperforms state-of-the-art methods by 0.05 to 0.06 in the Recall@20 metric.

DOI

Scopus

1

Citation

(Scopus)
Dynamic SAX Parameter Estimation for Time Series.

Chaw Thet Zan, Hayato Yamana

International Journal of Web Information Systems 13 ( 4 ) 387 - 404 2017 [Refereed]

Purpose - The paper aims to estimate the segment size and alphabet size of Symbolic Aggregate approXimation (SAX). In SAX, time series data are divided into a set of equal-sized segments. Each segment is represented by its mean value and mapped to a symbol of an alphabet, where the number of adopted symbols is called the alphabet size. Both parameters control the data compression ratio and the accuracy of time series mining tasks. Moreover, optimal parameter selection highly depends on the application and the data set. In practice, these parameters are selected iteratively by analyzing entire data sets, which limits the handling of huge amounts of time series and reduces the applicability of SAX. Design/methodology/approach - The segment size is estimated based on the Shannon sampling theorem (autoSAXSD_S) and adaptive hierarchical segmentation (autoSAXSD_M). The alphabet size is estimated by focusing on how the mean values of all the segments are distributed. A small alphabet size is set for a large distribution so that the differences among segments are easily distinguished. Findings - Experimental evaluation using University of California Riverside (UCR) data sets shows that the proposed schemes are able to select the parameters well with high classification accuracy and show comparable efficiency in comparison with state-of-the-art methods, SAX and auto_iSAX. Originality/value - The originality of this paper is the way to find the optimal parameters of SAX using the proposed estimation schemes. The first parameter, the segment size, is automatically estimated by two approaches, and the second parameter, the alphabet size, is estimated from the most frequent average (mean) value among segments.
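
For readers unfamiliar with SAX, the two parameters discussed in this summary appear as `num_segments` and `alphabet_size` in the minimal Python sketch below. It follows the textbook SAX procedure (z-normalization, segment means, equiprobable Gaussian breakpoints) and is not the estimation scheme proposed in the paper; it assumes the series length is divisible by the number of segments and that the series is not constant.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, num_segments, alphabet_size):
    """Convert a numeric series into a SAX word such as 'abcd'."""
    mu, sigma = mean(series), pstdev(series)
    normalized = [(x - mu) / sigma for x in series]

    # Piecewise aggregate approximation: one mean value per segment.
    seg_len = len(series) // num_segments
    paa = [mean(normalized[i * seg_len:(i + 1) * seg_len])
           for i in range(num_segments)]

    # Breakpoints split the standard normal curve into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]

    word = ""
    for value in paa:
        index = sum(value > bp for bp in breakpoints)
        word += chr(ord("a") + index)
    return word


print(sax([0, 0, 0, 0, 10, 10, 10, 10], num_segments=2, alphabet_size=2))  # ab
print(sax(list(range(16)), num_segments=4, alphabet_size=4))  # abcd
```

Fewer segments and a smaller alphabet compress the series more strongly but discard more detail, which is why the choice of both values affects mining accuracy.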

DOI

Scopus

12

Citation

(Scopus)
An improved symbolic aggregate approximation distance measure based on its statistical features

Chaw Thet Zan, Hayato Yamana

ACM International Conference Proceeding Series 72 - 80 2016.11 [Refereed]

The challenges of efficient data representation and similarity measures for massive amounts of time series have an enormous impact on many applications. This paper addresses an improvement on Symbolic Aggregate approXimation (SAX), which is one of the efficient representations for time series mining. Because SAX represents its symbols by the average (mean) value of a segment under the assumption of a Gaussian distribution, it is insufficient to convey the entire deterministic information and sometimes causes incorrect results in time series classification. In this work, the SAX representation and distance measure are improved by adding another moment of the prior distribution, the standard deviation; the resulting method, SAX SD, is proposed. We provide a comprehensive analysis of the proposed SAX SD and confirm both the highest classification accuracy and the highest dimensionality reduction ratio on University of California, Riverside (UCR) datasets in comparison with state-of-the-art methods such as SAX, Extended SAX (ESAX), and SAX Trend Distance (SAX TD).

DOI

Scopus

35

Citation

(Scopus)
ICT Utilization at Waseda University: Past, Present, and Toward the Future (A New Stage of ICT Utilization)

山名早人

IDE : 現代の高等教育 ( 585 ) 11 - 16 2016.11

CiNii
Message from the MAW 2016 Symposium Organizers

Takahiro Hara, Kin Fun Li, Shengrui Wang, Hayato Yamana

Proceedings - IEEE 30th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2016 lviii 2016.05

DOI

Scopus
Nowcasting Economic Indicators by Analyzing the Diet Proceedings

高杉亮介, 山名早人

日本データベース学会和文論文誌(Web) 14-J 2016

J-GLOBAL
What is your Mother Tongue?: Improving Chinese Native Language Identification by Cleaning Noisy Data and Adopting BM25

Lan Wang, Masahiro Tanaka, Hayato Yamana

PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA) 42 - 47 2016 [Refereed]

Native language identification (NLI) is a process by which an author's native language is identified from essays written in the author's second language. In this work, a supervised model is built to accomplish this based on a Chinese learner corpus. In the NLI field, this is the first work to (1) eliminate noisy data automatically before the training phase and (2) employ a BM25 term weighting technique to score each feature. We also adopt a hierarchical structure of linear support vector machine classifiers and achieve a state-of-the-art accuracy of 77.1%, which is greater than those of other Chinese NLI methods by over 10%.
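
BM25 term weighting, named in this summary as the feature scoring technique, can be sketched in Python as follows. The sketch uses a common Okapi BM25 form with a non-negative IDF variant and the conventional defaults k1 = 1.2 and b = 0.75; which variant, parameters, and feature types the paper actually used is not stated here, so these are assumptions.

```python
import math
from collections import Counter


def bm25_weights(docs, k1=1.2, b=0.75):
    """Return one {term: BM25 weight} dict per document.

    docs: list of token lists (the tokens stand in for whatever features
    an NLI system extracts, e.g. word or character n-grams).
    """
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    doc_freq = Counter(term for doc in docs for term in set(doc))
    all_weights = []
    for doc in docs:
        # Longer-than-average documents are normalized more strongly.
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        weights = {}
        for term, tf in Counter(doc).items():
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1
            )
            weights[term] = idf * tf * (k1 + 1) / (tf + length_norm)
        all_weights.append(weights)
    return all_weights


essays = [
    ["i", "am", "student"],
    ["i", "like", "tea"],
    ["she", "am", "teacher"],
]
weights = bm25_weights(essays)
print(weights[0]["student"] > weights[0]["i"])  # True
```

The resulting per-document weight dictionaries could then be vectorized and passed to a linear classifier such as the support vector machines mentioned in the summary.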

DOI

Scopus

6

Citation

(Scopus)
Identifying protein short linear motifs by position-specific scoring matrix

Fang, C., Noguchi, T., Yamana, H., Sun, F.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9713 LNCS 206 - 214 2016 [Refereed]

Short linear motifs (SLiMs) play a central role in several biological functions, such as cell regulation, scaffolding, cell signaling, post-translational modification, and cleavage. Identifying SLiMs is an important step toward understanding their functions and mechanisms. Because of their short length and particular properties, the discovery of SLiMs in proteins is a challenge both experimentally and computationally. So far, many existing computational methods have adopted predicted sequence or structural features as input for prediction, but there is no report of using position-specific scoring matrix (PSSM) profiles of proteins directly for SLiM prediction. In this study, we describe a simple method, named PSSMpred, which uses only the evolutionary information generated in the form of PSSM profiles of protein sequences for SLiM prediction. Compared with other methods tested on the same datasets, PSSMpred achieves the best performance: (1) it achieves 0.03-0.1 higher AUC than other methods when tested on HumanTest151; (2) it achieves 0.03-0.05 and 0.03-0.06 higher AUC than other methods when tested on ANCHOR-short and ANCHOR-long, respectively.

DOI

Scopus

2

Citation

(Scopus)
Adaptive Focused Website Segment Crawler

Tanaphol Suebchua, Arnon Rungsawang, Hayato Yamana

PROCEEDINGS OF 2016 19TH INTERNATIONAL CONFERENCE ON NETWORK-BASED INFORMATION SYSTEMS (NBIS) 181 - 187 2016 [Refereed]

Focused web crawlers have become indispensable for vertical search engines that provide a search service for specialized datasets. These vertical search engines have to collect specific web pages in the web space, whereas search engines such as Google and Bing gather web pages from all over the world. The problem in focused crawling research is how to collect specific web pages with minimal computing resources. We previously addressed this problem by proposing a focused crawling strategy that utilizes an ensemble machine learning classifier to find groups of relevant web pages, referred to as relevant website segments. In this paper, we enhance the proposed crawler as follows: 1) We increase the accuracy of predicting website segments by preparing two predictors: one learned from features extracted from relevant source website segments and another learned from features extracted from irrelevant ones. The idea is that these two types of source website segments may have different characteristics. 2) We also propose a noisy data elimination method for updating the predictor incrementally during the crawling process. A preliminary experiment shows that our enhanced crawler outperforms a crawler equipped with neither of these approaches by around 12% at most.

DOI

Scopus

7

Citation

(Scopus)
Secure frequent pattern mining by fully homomorphic encryption with ciphertext packing

Imabayashi, H., Ishimaki, Y., Umayabara, A., Sato, H., Yamana, H.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9963 LNCS 181 - 195 2016 [Refereed]

We propose an efficient and secure frequent pattern mining protocol with fully homomorphic encryption (FHE). Nowadays, secure outsourcing of mining tasks to the cloud with FHE is gaining attention. However, FHE execution incurs significant time and space costs. P3CC, the first proposed secure protocol with FHE for frequent pattern mining, suffers from these problems. It generates a ciphertext for each component of the item-transaction data matrix and executes numerous operations over the encrypted components. To address this issue, we propose efficient frequent pattern mining with ciphertext packing. By adopting the packing method, our scheme requires fewer ciphertexts and associated operations than P3CC, thus reducing both encryption and calculation times. We have also optimized its implementation by reusing previously produced results so as not to repeat calculations. Our experimental evaluation shows that the proposed scheme runs 430 times faster than P3CC and uses 94.7% less memory with 10,000 transactions.
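
The effect of packing on support counting can be illustrated with a plaintext Python sketch in which each list stands in for one packed ciphertext. The layout (one packed vector per item, one slot per transaction) and the function names are assumptions for illustration; no encryption is performed and this is not the paper's protocol.

```python
def pack_rows(matrix):
    """Turn a transaction-by-item 0/1 matrix into one packed vector per item.

    matrix[t][i] is 1 if transaction t contains item i. With ciphertext
    packing, each returned vector would be a single ciphertext.
    """
    return [list(column) for column in zip(*matrix)]


def support(packed, itemset):
    """Support of an itemset via slot-wise products and a final slot sum.

    Each slot-wise product corresponds to one homomorphic multiplication
    acting on all transactions at once (SIMD style).
    """
    product = [1] * len(packed[0])
    for item in itemset:
        product = [p * x for p, x in zip(product, packed[item])]
    return sum(product)


transactions = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 1],
]
packed = pack_rows(transactions)
print(support(packed, [0, 1]))  # 2
```

Without packing, every matrix entry would be its own ciphertext, so the four transactions and three items above would need twelve ciphertexts instead of three.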

DOI

Scopus

8

Citation

(Scopus)
Privacy-Preserving String Search for Genome Sequences with FHE bootstrapping optimization

Yu Ishimaki, Hiroki Imabayashi, Kana Shimizu, Hayato Yamana

2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) 3989 - 3991 2016 [Refereed]

　View Summary

Privacy-preserving string search is a crucial task for analyzing genomics-driven big data. In this work, we propose a cryptographic protocol that uses Fully Homomorphic Encryption (FHE) to enable a client to search a genome sequence database without leaking his/her query to the server. Although FHE supports both addition and multiplication over encrypted data, the random noise inside ciphertexts grows with every arithmetic operation, especially multiplication, which results in incorrect decryption when the amount of noise exceeds a threshold called the level. There are two approaches to avoiding incorrect decryption: one is to set a level sufficient to assure correct decryption within a limited number of operations, and the other is to reset the noise with a method called bootstrapping. It is important to find an optimal balance between the overhead caused by the level and the overhead caused by bootstrapping, since a higher level degrades the performance of all arithmetic operations, while a larger number of bootstrappings incurs more expensive overhead. In this study, we propose an efficient approach that minimizes the number of bootstrappings while reducing the level as much as possible. Our experimental results show that it runs up to 10 times faster than a naive approach.

DOI

Scopus

16

Citation

(Scopus)
Fast and Space-Efficient Secure Frequent Pattern Mining by FHE

Hiroki Imabayashi, Yu Ishimaki, Akira Umayabara, Hayato Yamana

2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) 3983 - 3985 2016 [Refereed]

　View Summary

In the big data era, security and privacy concerns are growing. One of the big challenges is secure Frequent Pattern Mining (FPM) over Fully Homomorphic Encryption (FHE). Some research efforts have aimed at speeding it up; however, there is still much room to reduce time and space complexity. Apriori over FHE, in particular, generates a large number of ciphertexts during the support calculation, which results in both large time and space complexity. To solve this, we previously proposed a speed-up technique that runs around 430 times faster and uses 18.9 times less memory than the state-of-the-art method by adopting both packing and caching mechanisms. In this paper, we further propose to decrease the memory space used for caching. Our goal is to discard redundant cached ciphertexts without increasing the execution time. Our experimental results show that our method decreases memory usage by up to 6.09% in comparison with our previous method without increasing the execution time.

DOI

Scopus

7

Citation

(Scopus)
A study on individual mobility patterns based on individuals’ familiarity to visited areas

Han, J., Yamana, H.

International Journal of Pervasive Computing and Communications 12 ( 1 ) 23 - 48 2016 [Refereed]

　View Summary

Purpose - The purpose of this paper is to clarify the correlations between the amount of an individual's knowledge of a specific area and his/her pattern of visits to points of interest (POIs) located in the area. Design/methodology/approach - This paper proposes a visit-frequency-based familiarity estimation method that estimates individuals' knowledge of areas in a quantitative manner. Based on the degree of familiarity, individuals' visit logs to POIs are divided into a set of groups, and the differences among the groups are then analyzed from various points of view, such as user preference, POI categories/popularity, visit time/date and subsequent visits. Findings - Statistically significant correlations between individuals' familiarity with areas and their visit patterns are observed in our analysis of 1.4 million POI visit logs collected from a popular location-based social network (LBSN), Foursquare. The distributions of visit time and of visited POIs and their popularity are skewed differently depending on familiarity. For instance, users go to unfamiliar areas on weekends and visit POIs for cultural experiences, such as museums. A notable point is that the correlations can be detected even in areas within the home city, which has not been known so far. Originality/value - This is the first in-depth work that studies both the estimation of individuals' familiarity and the correlations between familiarity and individuals' mobility patterns by analyzing massive LBSN data. The methodologies used and the findings of this work are applicable not only to human mobility analysis for sociology, but also to POI recommendation system design.

DOI

Scopus

10

Citation

(Scopus)
Why people go to unfamiliar areas?: Analysis of mobility pattern based on users' familiarity

Jungkyu Han, Hayato Yamana

17th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2015 - Proceedings 2015.12 [Refereed]

　View Summary

Human mobility analysis with Location-Based Social Network (LBSN) data is the basis of personalized point-of-interest (POI) recommendations or location-aware advertisements. In addition to personal preference and spatiotemporal factors such as time and distance, personal context has a strong influence on mobility. An individual's familiarity with an area is an interesting context because it can bias the influence of certain factors. For example, the mobility patterns of two persons who have similar preferences are different when their familiarity with the area is different, even in the same area. In this paper, we analyze familiarity's effect on mobility patterns by using over 1.4 million check-ins gathered from Foursquare. The analysis indicates that there is a skewness of the visit time and visited venue distribution in unfamiliar areas. For instance, people go to unfamiliar areas on weekends, and venues for cultural experiences, such as museums, strongly contribute to the motivation to visit.

DOI

Scopus

4

Citation

(Scopus)
Super-information Society based on Big Data - Information Technologies of Searching the Whole, from Platform Technologies to Applications -：2. Big Data Program Trends in US and EU

YAMANA HAYATO

情報処理 56 ( 10 ) 962 - 967 2015.09

CiNii J-GLOBAL
Cross-lingual Investigation of User Evaluations for Global Restaurants

LE Jiawen, YAMANA Hayato

IMT 10 ( 2 ) 317 - 322 2015

　View Summary

Twitter, as one of the popular social network services, is now widely used to query public opinions. In this paper, tweets, along with reviews collected from review websites, are used to carry out sentiment analysis in order to identify language-based and location-based effects on user evaluations of six global restaurants. The analysis is extended to take 34 languages into account. Using a range of new and standard features, a series of classifiers are trained and applied in the later steps of sentiment analysis. Our experimental results show that location and language effects on user evaluations of restaurants do exist.

DOI CiNii
Detecting Learner's To-Be-Forgotten Items using Online Handwritten Data.

Hiroki Asai, Hayato Yamana

Proceedings of the 15th New Zealand Conference on Human-Computer Interaction, CHINZ 2015, Hamilton, New Zealand, September 3-4, 2015 17 - 20 2015 [Refereed]

　View Summary

An effective learning system is indispensable for human beings with a limited life span. Traditional learning systems schedule repetition based on both the results of a recall test and learning theories such as the spacing effect. However, there is room for improvement from the perspective of remembrance-level estimation. In this paper, we focus on on-line handwritten data obtained from handwriting using a computer. We collected handwritten data from remembrance tests both to analyze the problems of traditional estimation methods and to build a new estimation model that uses handwritten data as input. The evaluation found that our proposed model can output a continuous remembrance-level value from 0 to 1, whereas traditional methods output only a binary decision. In addition, the experiment showed that our proposed model achieves the best performance with an F-value of 0.69.

DOI

Scopus

3

Citation

(Scopus)
Predicting Various Types of User Attributes in Twitter by Using Personalized PageRank

Kazuya Uesato, Hiroki Asai, Hayato Yamana

PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA 2825 - 2827 2015 [Refereed]

　View Summary

Predicting various types of user attributes in social networks has become indispensable for personalizing applications, since many attributes in social networks are not disclosed. However, the attributes extracted in existing works are limited to pre-defined types, so unexpected types of attributes cannot be extracted. In this paper, we therefore propose a novel method that extracts various, i.e., unlimited, types of attributes by applying personalized PageRank to a large social network. The experimental results using over 7.9 million Japanese Twitter users show that our proposed method successfully extracts four types of attributes per user on average with a MAP@20 of 0.841.
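For readers unfamiliar with personalized PageRank, a minimal power-iteration sketch follows. It is a generic textbook formulation, not the authors' implementation: the toy graph, the damping factor `alpha`, the handling of dangling nodes, and the function name `personalized_pagerank` are all assumptions, and the way scores are mapped to user attributes is not covered here.

```python
from typing import Dict, List


def personalized_pagerank(graph: Dict[str, List[str]], seed: str,
                          alpha: float = 0.85, iters: int = 50) -> Dict[str, float]:
    """Power iteration in which every teleport jumps back to the seed node.

    graph maps each node to its out-neighbours; every neighbour must also be a key.
    """
    nodes = list(graph)
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = alpha * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: return its mass to the seed (one common convention).
                nxt[seed] += alpha * rank[n]
        nxt[seed] += 1.0 - alpha  # teleport mass
        rank = nxt
    return rank


toy_graph = {"u": ["a", "b"], "a": ["u"], "b": ["c"], "c": []}
scores = personalized_pagerank(toy_graph, seed="u")
print(sorted(scores, key=scores.get, reverse=True))
```

Because all teleport mass returns to the seed, nodes close to the seed user receive higher scores than distant ones, which is what makes the ranking "personalized".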

DOI

Scopus

1

Citation

(Scopus)
Condensing position-specific scoring matrixs by the Kidera factors for ligand-binding site prediction

Fang, C., Noguchi, T., Yamana, H.

International Journal of Data Mining and Bioinformatics 12 ( 1 ) 70 - 84 2015 [Refereed]

　View Summary

protein functional sites. However, it is 20-dimensional and contains many redundant features. The Kidera factors were reported to contain information relating to almost all physical properties of amino acids, but they require appropriate weighting coefficients to express these properties. We developed a novel method, named KSPSSMpred, which integrates PSSM and the Kidera factors into a 10-dimensional matrix (KSPSSM) for ligand-binding site prediction. Flavin adenine dinucleotide (FAD) was chosen as a representative ligand for this study. When compared with five other feature-based methods on a benchmark dataset, KSPSSMpred performed the best. This study demonstrates that KSPSSM is an effective feature extraction method that can enrich PSSM with information relating to 188 physical properties of residues and reduce feature dimensions by 50% without losing the information included in the PSSM.

DOI PubMed

Scopus

4

Citation

(Scopus)
A Recognition Model of Selected Regions Indicated by Handwritten Annotations on Electronic Documents

ASAI HIROKI, YAMANA HAYATO

情報処理学会論文誌トランザクションデータベース(Web) 7 ( 4 ) 1 - 12 2014.12

　View Summary

Handwritten annotation of paper-based documents is widely used both to add information and to emphasize part of a document. When annotations are made on electronic documents using a computer, the annotation information could improve usability, for example in searching and sharing, but this requires estimating which part of the document is annotated. Traditional methods achieve insufficient recognition accuracy because they rely on heuristics that ignore how people habitually annotate. In this paper, we therefore propose a recognition model for handwritten targeting annotations, which is important for solving these problems. Our recognition model can detect the targeting annotations commonly made by users, such as underlines, enclosures and vertical lines. Our user study found that the proposed model correctly estimates the selected region 85% of the time on average when characters are selected and 91% of the time when text lines are selected.

CiNii J-GLOBAL
Intelligent ink annotation framework that uses user's intention in electronic document annotation

Hiroki Asai, Hayato Yamana

ITS 2014 - Proceedings of the 2014 ACM International Conference on Interactive Tabletops and Surfaces 333 - 338 2014.11 [Refereed]

　View Summary

Annotating documents is one of the indispensable interactions between humans and documents. An annotation system for electronic documents can implement effective functions, such as information retrieval and annotation-based navigation, by using the annotation information; however, traditional systems require users to perform gestures in addition to the common gestures used for paper-based documents. This can reduce the "learnability" of the system. We propose an intelligent ink annotation framework that increases the learnability of annotation systems by detecting recognizable intentions from natural annotation behavior on paper-based documents. Our framework recognizes "Targeting Content" and "Commenting," which are related to the extraction of annotation information. We have developed a prototype annotation system using our proposed framework and conducted a user study to identify future directions.

DOI

Scopus

8

Citation

(Scopus)
Twitter User Profile Estimation from Mention Information

OKUTANI TAKASHI, YAMANA HAYATO

日本データベース学会和文論文誌 13 ( 1 ) 1 - 6 2014.10

CiNii J-GLOBAL
An Image Blur Alert System for Mobile Device Cameras

TEZUKA SHOTA, ASAI HIROKI, YAMANA HAYATO

日本データベース学会和文論文誌 13 ( 1 ) 58 - 63 2014.10

CiNii J-GLOBAL
Low-Latency Data Stream Processing in Multicore CPU Environments

上田高徳, 秋岡明香, 山名早人

情報・システムソサイエティ誌 19 ( 3 ) 14 - 14 2014

DOI CiNii
Analyzing conservation patterns and its influence on identifying protein functional

Chun Fang, Tamotsu Noguchi, Hayato Yamana

Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 73 - 79 2014.01

　View Summary

Evolutionary conservation information included in the position-specific scoring matrix (PSSM) has been adopted by almost all sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linear contextual, some are mingled with highly variable residues, and others seem to be conserved independently. All existing studies used the direct output of PSSM for functional site prediction, without considering the relationship between the conservation patterns of residues and the distribution of conservation scores in PSSMs. In order to demonstrate the importance of analyzing conservation patterns, three PSSM-based methods for identifying three kinds of functional sites have been compared. Experimental results show that although all the methods are based on the same feature, the PSSM of the protein sequence, they are suited to identifying different patterns of functional sites: the PSSM-based method is suited to functional sites that are independently conserved; the smoothed-PSSM method is suited to functional sites that are continuously conserved; and the masked-smoothed-PSSM method is suited to functional sites in which highly flexible and highly conserved residues are intensively mingled. Copyright © (2014) by the International Society for Computers and Their Applications.
Image Annotation Fusing Content-based and Tag-based Technique Using Support Vector Machine and Vector Space Model

Shan-Bin Chan, Hayato Yamana, Duy-Dinh Le, Shin'ichi Satoh

10TH INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY AND INTERNET-BASED SYSTEMS SITIS 2014 272 - 276 2014 [Refereed]

　View Summary

In this paper, we propose a new image annotation method that combines content-based and tag-based image annotation techniques. The content-based image annotation technique is adopted to extract "loosely defined concepts" by analyzing features of pre-given images, such as color moment (CM), edge orientation histogram (EOH), and local binary pattern (LBP), followed by constructing a set of SVMs for 100 loosely defined concepts. A base vector for each concept, similar to that of the tag-based image annotation technique, is then constructed from the SVMs' predicted probabilistic results for sample images whose main concepts are known. Finally, the cosine similarity between a query-image vector and the base vector is calculated for each concept. Experimental results show that our proposed method outperforms the content-based image annotation technique by about 23% in accuracy.
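The final matching step described above can be sketched as follows. This is only an illustration under assumptions: the three-dimensional vectors, the concept names, and the function names `cosine` and `rank_concepts` are made up for the example, and the construction of the vectors from SVM outputs is not shown.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_concepts(query: List[float], base_vectors: Dict[str, List[float]]) -> List[str]:
    """Order concepts by the similarity between the query vector and each base vector."""
    scores = {concept: cosine(query, vec) for concept, vec in base_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical base vectors (e.g. per-concept SVM probabilities) and one query image.
base_vectors = {"dog": [0.9, 0.1, 0.2], "car": [0.1, 0.8, 0.3]}
query = [0.8, 0.2, 0.1]
print(rank_concepts(query, base_vectors))
```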

DOI

Scopus

3

Citation

(Scopus)
EA snippets: Generating summarized view of handwritten documents based on emphasis annotations

Asai, H., Yamana, H.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8522 LNCS ( PART 2 ) 20 - 31 2014 [Refereed]

　View Summary

Owing to the recent development of handwriting input devices such as tablets and digital pens, digital notebooks have become an alternative to traditional paper-based notebooks. Digital notebooks are available for various device types. To display a list of text documents on a device screen, we often use scaled thumbnails or text snippets summarized through natural language processing or structural analyses. However, these are ineffective in conveying summaries of handwritten documents, because informal and unstructured handwritten data are difficult to summarize using traditional methods. We therefore propose the use of emphasis-based snippets, i.e., summarized handwritten documents based on natural emphasis annotations, such as underlines and enclosures. Our proposed method places emphasized words into thumbnails or text snippets. User studies showed that the proposed method is effective for keyword-based navigation.

DOI

Scopus
A Challenge of Authorship Identification for Ten-thousand-scale Microblog Users

Syunya Okuno, Hiroki Asai, Hayato Yamana

2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) 52 - 54 2014 [Refereed]

　View Summary

Internet security issues require authorship identification for all kinds of internet content; however, authorship identification for microblog users is much harder than for other documents because microblog texts are very short. Moreover, when the number of candidates becomes large, i.e., big data, identification takes a long time. Our proposed method addresses these problems. The experimental results show that our method identifies authorship among 10,000 microblog users with a precision of 53.2% in almost half the execution time of the previous method.

DOI

Scopus

18

Citation

(Scopus)
Analysis of evolutionary conservation patterns and their influence on identifying protein functional sites

Fang, C., Noguchi, T., Yamana, H.

Journal of Bioinformatics and Computational Biology 12 ( 5 ) 1440003 2014 [Refereed]

　View Summary

Evolutionary conservation information included in the position-specific scoring matrix (PSSM) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linear contextual, some are mingled with highly variable residues, and others seem to be conserved independently. Every value in a PSSM is calculated independently of the others, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of PSSM for prediction fails to consider the relationship between the conservation patterns of residues and the distribution of conservation scores in PSSMs. In order to demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. Results suggest that different PSSM-based methods differ in their capability to identify different patterns of functional sites, and that better combining PSSMs with the specific conservation patterns of residues would largely facilitate the prediction.

DOI PubMed

Scopus

3

Citation

(Scopus)
Simplified sequence-based method for ATP-binding prediction using contextual local evolutionary conservation

Fang, C., Noguchi, T., Yamana, H.

Algorithms for Molecular Biology 9 ( 1 ) 7 2014 [Refereed]

　View Summary

Background: Identifying ligand-binding sites is a key step in annotating protein functions and in finding applications in drug design. Many sequence-based methods now adopt various predicted results from other classifiers, such as predicted secondary structure, predicted solvent accessibility and predicted disorder probabilities, and combine them with the position-specific scoring matrix (PSSM) as input for binding site prediction. These predicted features not only easily result in a high-dimensional feature space, but also greatly increase the complexity of the algorithms. Moreover, the performance of these predictors is largely influenced by the other classifiers. Results: In order to verify that conservation is the most powerful attribute in identifying ligand-binding sites, and to show the importance of revising the PSSM to match the detailed conservation pattern of the functional site in prediction, we analyzed the Adenosine-5'-triphosphate (ATP) ligand as an example and proposed a simple method for ATP-binding site prediction, named CLCLpred (Contextual Local evolutionary Conservation-based method for Ligand-binding prediction). Our method employs no predicted results from other classifiers as input; all features used are extracted from the PSSM only. We tested our method on two separate data sets. Experimental results showed that, compared with nine other existing methods on the same data sets, our method achieved the best performance. Conclusions: This study demonstrates that: 1) exploiting the signal from the detailed conservation pattern of residues will largely facilitate the prediction of protein functional sites; and 2) local evolutionary conservation enables accurate prediction of ATP-binding sites directly from protein sequence.

DOI PubMed

Scopus

16

Citation

(Scopus)
Editorial Preface

Hayato Yamana, Miyuki Nakano, Yohei Seki

情報処理学会論文誌. データベース 6 ( 5 ) i - iii 2013.12
Low-Latency Data Stream Processing in Multicore CPU Environments

上田高徳, 秋岡明香, 山名早人

電子情報通信学会論文誌 D J96-D ( 5 ) 1094 - 1104 2013.05

J-GLOBAL
A Parallel Distributed Web Crawler Consisting of Producer-Consumer Modules

Takanori Ueda, Koh Satoh, Daichi Suzuki, Kenji Uchida, Kousuke Morimoto, Sayaka Akioka, Hayato Yamana

情報処理学会論文誌データベース（TOD） 6 ( 2 ) 85 - 97 2013.03

　View Summary

Web crawlers must collect Web data while performing tasks such as detecting crawled URLs and preventing consecutive accesses to a particular Web server. Parallel-distributed crawling is carried out at a high speed for the enormous number of URLs existing on the Web. However, in order to crawl efficiently, a crawler must realize load balancing between computers in addition to reducing time and space complexities in the crawling process. The Web crawler proposed in this paper crawls the Web using producer-consumer modules, which compose the crawler, and it realizes load balancing per module and not per crawled Web site. In other words, it realizes load balancing that is appropriate to certain computer resources necessary for the modules that compose the Web crawler; in this way, it improves biases in computation loads and memory utilization between computers. Moreover, the crawler is able to crawl the Web on a large scale while conserving resources, because the modules that manage host names or URLs are implemented by data structures that are temporally and spatially efficient.

CiNii J-GLOBAL
SCPSSMpred: A General Sequence-based Method for Ligand-binding Site Prediction

Fang Chun, Noguchi Tamotsu, Yamana Hayato

IMT 8 ( 3 ) 890 - 897 2013

　View Summary

In this paper, we propose a novel method, named SCPSSMpred (Smoothed and Condensed PSSM based prediction), which uses a simplified position-specific scoring matrix (PSSM) for predicting ligand-binding sites. Although the simplified PSSM has only ten dimensions, it combines abundant features, such as amino acid arrangement, information of neighboring residues, physicochemical properties, and evolutionary information. Our method employs no predicted results from other classifiers as input, i.e., all features used in this method are extracted from the sequences only. Three ligands (FAD, NAD and ATP) were used to verify the versatility of our method, and three alternative traditional methods were also analyzed for comparison. All the methods were tested at both the residue level and the protein sequence level. Experimental results showed that the SCPSSMpred method achieved the best performance besides reducing 50% of redundant features in PSSM. In addition, it showed a remarkable adaptability in dealing with unbalanced data compared to other methods when tested on the protein sequence level. This study not only demonstrates the importance of reducing redundant features in PSSM, but also identifies sequence-derived hallmarks of ligand-binding sites, such that both the arrangements and physicochemical properties of neighboring residues significantly impact ligand-binding behavior.

DOI CiNii
Detecting student frustration based on handwriting behavior

Hiroki Asai, Hayato Yamana

UIST 2013 Adjunct - Adjunct Publication of the 26th Annual ACM Symposium on User Interface Software and Technology 77 - 78 2013 [Refereed]

　View Summary

Detecting states of frustration among students engaged in learning activities is critical to the success of teaching assistance tools. We examine the relationship between a student's pen activity and his/her state of frustration while solving handwritten problems. Based on a user study involving mathematics problems, we found that our detection method was able to detect student frustration with a precision of 87% and a recall of 90%. We also identified several particularly discriminative features, including writing stroke number, erased stroke number, pen activity time, and air stroke speed. © 2013 Authors.

DOI

Scopus

13

Citation

(Scopus)
A Negative Sample Image Selection Method Referring to Semantic Hierarchical Structure for Image Annotation

Shan-Bin Chan, Hayato Yamana, Shin'ichi Satoh

2013 INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY & INTERNET-BASED SYSTEMS (SITIS) 162 - 167 2013 [Refereed]

　View Summary

When an SVM is adopted for image annotation, high-quality sample images are known to improve image recognition accuracy. Images with the same visual/semantic features are adopted as positive sample images, and images with different visual/semantic features are adopted as negative sample images. However, selecting high-quality sample images is labor intensive, especially when they are collected by visual features. Most researchers randomly choose positive and negative sample images for classifier training. In many applications, adopting different negative sample image datasets changes the annotation accuracy. In this research, we discuss how accuracy differs among negative sample image datasets collected by semantic features. We adopted ImageNet as the image dataset in this study, and we adopted WordNet to build a semantic hierarchical tree. The semantic hierarchical tree is used to calculate the distance between nodes, and this distance relationship is then used to prepare positive and negative sample images. We prepare one baseline method and propose six different negative sample image selection methods for the experiment. Binary SVM classifier training and prediction are performed to compare the accuracy and Mean Reciprocal Rank (MRR) of the baseline and each proposed method. Our results show that the highest accuracy is achieved when a uniform number of negative sample images is selected at each distance in the semantic hierarchical tree.
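The best-performing strategy reported above, taking the same number of negatives at every distance in the semantic tree, could be sketched like this. It is a hedged illustration only: the input layout (candidate image IDs already grouped by tree distance), the image IDs, and the function name `select_negatives` are assumptions, and the distance computation over WordNet is not shown.

```python
import random
from typing import Dict, List


def select_negatives(candidates_by_distance: Dict[int, List[str]],
                     per_distance: int, seed: int = 0) -> List[str]:
    """Draw the same number of negative samples from every distance bucket.

    Buckets smaller than per_distance contribute all of their images.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    chosen: List[str] = []
    for distance in sorted(candidates_by_distance):
        pool = candidates_by_distance[distance]
        chosen.extend(rng.sample(pool, min(per_distance, len(pool))))
    return chosen


# Hypothetical candidates grouped by their tree distance from the positive concept.
candidates = {
    1: ["img_a1", "img_a2", "img_a3"],
    2: ["img_b1", "img_b2", "img_b3", "img_b4"],
    3: ["img_c1"],
}
negatives = select_negatives(candidates, per_distance=2)
print(len(negatives))
```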

DOI

Scopus
WSD Team's Approaches for Textual Entailment Recognition at the NTCIR10 (RITE2).

Daiki Ito, Masahiro Tanaka, Hayato Yamana

Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, NTCIR-10, National Center of Sciences, Tokyo, Japan, June 18-21, 2013 2013 [Refereed]
IC-BIDE: Intensity Constraint-based Closed Sequential Pattern Mining for Coding Pattern Extraction

Hiromasa Takei, Hayato Yamana

2013 IEEE 27TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA) 976 - 983 2013 [Refereed]

　View Summary

We propose an intensity constraint-based closed sequential pattern mining algorithm, called IC-BIDE, for coding pattern extraction. Source code often contains frequent patterns of function calls or control flows, i.e., "coding patterns." Previous studies used sequential pattern mining to extract coding patterns; however, these algorithms have not been optimized for coding pattern extraction, which results in useless patterns as well as long execution times. We propose a new constraint, called the "intensity constraint," in order to enhance closed sequential pattern mining and efficiently extract coding patterns. Our proposed algorithm is based on BI-Directional Execution (BIDE), an algorithm proposed expressly for closed sequential pattern mining. The BIDE algorithm cannot be adapted directly to constraint-based closed sequential pattern mining. We extend the BIDE algorithm and prove that our extended algorithm can be adapted to intensity constraint-based closed sequential pattern mining. Our contributions are as follows: 1) we propose a new constraint, which we call "intensity"; 2) we propose an intensity constraint-based closed sequential pattern mining algorithm, which we call the "IC-BIDE" algorithm. Experimental results with open source software (Bullet Physics, MySQL, and OpenCV) show that the IC-BIDE algorithm effectively excludes useless patterns. Moreover, our proposed method accelerates the extraction by a factor of 8.9 in comparison with the BIDE algorithm.

DOI

Scopus

2

Citation

(Scopus)
MFSPSSMpred: Identifying short disorder-to-order binding regions in disordered proteins based on contextual local evolutionary conservation

Fang, C., Noguchi, T., Tominaga, D., Yamana, H.

BMC Bioinformatics 14 ( 1 ) 300 2013 [Refereed]

　View Summary

Background: Molecular recognition features (MoRFs) are short binding regions located in longer intrinsically disordered protein regions. Although these short regions lack a stable structure in the natural state, they readily undergo disorder-to-order transitions upon binding to their partner molecules. MoRFs play critical roles in the molecular interaction network of a cell, and are associated with many human genetic diseases. Therefore, identification of MoRFs is an important step in understanding functional aspects of these proteins and in finding applications in drug design. Results: Here, we propose a novel method for identifying MoRFs, named MFSPSSMpred (Masked, Filtered and Smoothed Position-Specific Scoring Matrix-based Predictor). Firstly, a masking method is used to calculate the average local conservation scores of residues within a masking-window length in the position-specific scoring matrix (PSSM). Then, the scores below the average are filtered out. Finally, a smoothing method is used to incorporate the features of flanking regions for each residue to prepare the feature sets for prediction. Our method employs no predicted results from other classifiers as input, i.e., all features used in this method are extracted from the PSSM of the sequence only. Experimental results show that, compared with other methods tested on the same datasets, our method achieves the best performance: an AUC 0.004 to 0.079 higher than other methods when tested on TEST419, and an AUC 0.045 to 0.212 higher than other methods when tested on TEST2012. In addition, when tested on an independent membrane proteins-related dataset, MFSPSSMpred significantly outperformed the existing predictor MoRFpred. Conclusions: This study suggests that: 1) amino acid composition and physicochemical properties in the flanking regions of MoRFs are very different from those in the general non-MoRF regions; 2) MoRFs contain both highly conserved residues and highly variable residues and, on the whole, are highly locally conserved; and 3) combining contextual information with local conservation information of residues facilitates the prediction of MoRFs.
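To make the smoothing step concrete, here is a minimal sketch of window-based smoothing over PSSM rows. It covers only the smoothing idea and makes several assumptions not stated in the abstract: a centred window, a plain mean, truncation at the sequence ends, and the function name `smooth_pssm`. The masking and filtering steps are not reproduced.

```python
from typing import List


def smooth_pssm(pssm: List[List[float]], window: int = 3) -> List[List[float]]:
    """Replace each residue's row by the mean of the rows inside a centred window.

    pssm has one row per residue; window is assumed to be odd. Windows are
    truncated at the ends of the sequence rather than padded.
    """
    half = window // 2
    n = len(pssm)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        rows = pssm[lo:hi]
        smoothed.append([sum(col) / len(rows) for col in zip(*rows)])
    return smoothed


# Tiny made-up matrix: 3 residues, 2 score columns (a real PSSM has 20 columns).
toy = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
print(smooth_pssm(toy, window=3))
```

Averaging over neighbouring rows is one simple way for each residue's feature vector to carry information about its flanking region.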

DOI PubMed

Scopus

61

Citation

(Scopus)
Editorial Note

山名早人, 酒井哲也, 石川佳治

情報処理学会論文誌. データベース 5 ( 2 ) i - iii 2012.06

CiNii
On the Publication of the Special Section on Student Papers (<Special Section> Student Papers)

山名早人

電子情報通信学会論文誌. D, 情報・システム 95 ( 3 ) 31 - 36 2012.03

CiNii
Authorship Attribution Method using N-gram Part-of-Speech Tagger : Evaluation of Robustness in Topic-Independence

10 ( 3 ) 7 - 12 2012.02

CiNii J-GLOBAL
An Operator Execution Method for Latency Reduction and High Availability in Data Stream Processing

上田高徳, 打田研二, 秋岡明香, 山名早人

日本データベース学会論文誌 10 ( 3 ) 2012

J-GLOBAL
Hit count reliability: How much can we trust hit counts?

Koh Satoh, Hayato Yamana

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7235 LNCS 751 - 758 2012 [Refereed]

DOI

Scopus

6

Citation

(Scopus)
A survey of aggregated search for web search engines(<Special feature>Aggregated search)

YAMANA Hayato

The journal of Information Science and Technology Association 61 ( 9 ) 343 - 348 2011.09

　View Summary

Nowadays, web search engines have features to blend other vertical search results into web search results. It is called aggregated search. Here, vertical search includes news, blog, image, video, real time information search such as twitter, and so on. Web search engines do not always blend these vertical search results. Instead, they decide when they should blend vertical search results, and which vertical search results should be blended depending on input queries. In this paper, we survey the techniques and evaluation schemes to blend vertical search results into web search results. Afte...

DOI CiNii
Time-weighted web authoritative ranking

Bundit Manaskasemsak, Arnon Rungsawang, Hayato Yamana

INFORMATION RETRIEVAL 14 ( 2 ) 133 - 157 2011.04

DOI

Scopus

17

Citation

(Scopus)
An Eager Database Replication Middleware Managing Locks

HORII Hiroshi, ONODERA Tamiya, YAMANA Hayato

The IEICE transactions on information and systems (Japanese edition) 94 ( 3 ) 515 - 524 2011.03

　View Summary

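Among middleware-based approaches that replicate an existing database, synchronous (eager) replication, which replicates each update within a transaction, can improve the performance of query-centric applications without compromising database consistency. However, conventional methods cannot detect deadlocks that span replicas and cannot provide repeatable-read isolation. In this paper, we propose Yama, a synchronous replication middleware that performs concurrency control inside the middleware, thereby guaranteeing repeatable-read isolation and making deadlocks detectable. Yama's concurrency control identifies the lock targets needed to process an SQL statement by analyzing the SQL text and querying the replicas directly, and then acquires the locks from the lock manager inside Yama. In addition, by requesting that each replica process statements at the read-uncommitted isolation level, it prevents processing from stalling on the replicas' internal concurrency control. Applying the method to TPC-W showed higher throughput than conventional methods.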

CiNii
Legible thumbnail: Summarizing on-line handwritten documents based on emphasized expressions

Hiroki Asai, Takanori Ueda, Hayato Yamana

Mobile HCI 2011 - 13th International Conference on Human-Computer Interaction with Mobile Devices and Services 551 - 556 2011 [Refereed]

　View Summary

In recent years, digital notebooks have been replacing traditional paper-based notebooks with the development of handwriting input devices. Currently, we can access digital notebooks in various devices, including mobile devices. When we use such mobile devices, however, their limited screen size results in difficulty in understanding the summary of hand-written documents, without the use of a zoom feature. In this paper, we therefore propose the "Legible Thumbnail" that helps us to understand the summary without zooming. Our method detects the important words based on emphasis, such as an underline, and the method outputs the emphasized words to the thumbnail. Experiments show our thumbnail reduces search time by 21%. © 2011 Authors.

DOI

Scopus

1

Citation

(Scopus)
Retweet Reputation: A Bias-Free Evaluation Method for Tweeted Contents.

Shino Fujiki, Hiroya Yano, Takashi Fukuda, Hayato Yamana

Social Innovation and Social Media, Papers from the 2011 ICWSM Workshop, Barcelona, Catalonia, Spain, July 21, 2011 WS-11-01 10 - 13 2011 [Refereed]

　View Summary

The spread of word of mouth through retweets on Twitter has enabled us to estimate trends in the real world. Previous methods estimate the value of tweeted content by counting the subscribers who receive the tweet. However, we should consider the numbers of followers of both the tweeter and the retweeter(s), because a greater number of followers may result in more retweets, which we call "bias." In this paper, we propose a bias-free evaluation method for tweeted contents. Experiments show that our method is successful at evaluating tweets without biases. Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved.
A Proposal of a Lock-Managing Synchronous Replication Middleware

堀井洋, 小野寺民也, 山名早人

電子情報通信学会論文誌 D J94-D ( 3 ) 515 - 524 2011

J-GLOBAL
A survey of aggregated search for web search engines

山名早人

The Journal of Information Science and Technology Association 61 ( 9 ) 343 - 348 2011

DOI CiNii J-GLOBAL
Hit count dance: scientific basis to adopt search engines' hit counts

舟橋卓也, 山名早人

DBSJ journal 9 ( 1 ) 18 - 22 2010.06

CiNii J-GLOBAL
A Method for Automatic Group Commit of OLTP Systems

HORII Hiroshi, ONODERA Tamiya, YAMANA Hayato

The IEICE transactions on information and systems (Japanese edition) 93 ( 3 ) 222 - 231 2010.03

　View Summary

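With the widespread adoption of databases, applications that make heavy use of update transactions are increasing. Because update transactions must be processed on expensive high-availability servers, scale-up is required. Group commit, which processes multiple transactions as a single transaction, is effective for this purpose. However, using group commit requires modifying the application, so it could not be applied to existing applications. In this paper, we propose a method that performs group commit inside middleware without requiring prior knowledge of transaction contents or modification of the application. Without application modification, it is necessary to identify the set of transactions to be group-committed and to schedule batch updates. Our method predicts the SQL statements in a transaction in advance based on the transaction execution history, and uses the prediction to choose group-commit targets and to schedule batch updates. We implemented the method in Java and evaluated it in a five-client environment; it achieved about twice the throughput of the conventional approach while keeping database CPU utilization low.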

CiNii J-GLOBAL
Nb-GCLOCK: A Non-blocking Buffer Management Based on the Generalized CLOCK

Makoto Yui, Jun Miyazaki, Shunsuke Uemura, Hayato Yamana

26TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING ICDE 2010 745 - 756 2010 [Refereed]

　View Summary

In this paper, we propose a non-blocking buffer management scheme based on a lock-free variant of the GCLOCK page replacement algorithm. Concurrent access to the buffer management module is a major factor that prevents database scalability to processors. Therefore, we propose a non-blocking scheme for bufferfix operations that fix buffer frames for requested pages without locks by combining Nb-GCLOCK and a non-blocking hash table. Our experimental results revealed that our scheme can obtain nearly linear scalability to processors up to 64 processors, although the existing locking-based schemes do not scale beyond 16 processors.
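For readers unfamiliar with the baseline policy, the following is a minimal single-threaded sketch of generalized CLOCK (GCLOCK) page replacement. It is not the Nb-GCLOCK algorithm from the paper and contains no lock-free logic; the class name `GClockBuffer`, the method `fix`, and the choice of an initial reference count of 1 are assumptions made for illustration only.

```python
class GClockBuffer:
    """Illustrative single-threaded GCLOCK buffer pool (hypothetical names)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = []   # each frame is [page_id, reference_count]
        self.index = {}    # page_id -> frame position
        self.hand = 0      # clock hand position

    def fix(self, page_id):
        """Return True on a buffer hit, False if the page had to be loaded."""
        pos = self.index.get(page_id)
        if pos is not None:
            self.frames[pos][1] += 1          # a hit raises the frame's count
            return True
        if len(self.frames) < self.capacity:  # free frame still available
            self.index[page_id] = len(self.frames)
            self.frames.append([page_id, 1])  # assumed initial count of 1
            return False
        while True:                           # sweep until a victim is found
            frame = self.frames[self.hand]
            if frame[1] == 0:                 # count exhausted: evict here
                del self.index[frame[0]]
                frame[0], frame[1] = page_id, 1
                self.index[page_id] = self.hand
                self.hand = (self.hand + 1) % self.capacity
                return False
            frame[1] -= 1                     # otherwise decrement and move on
            self.hand = (self.hand + 1) % self.capacity


buf = GClockBuffer(2)
for page in ["A", "B", "A", "C"]:
    buf.fix(page)
print(sorted(buf.index))  # pages currently buffered
```

Frequently fixed pages accumulate a higher count and survive more sweeps of the hand. A concurrent variant would have to replace the plain counter updates and the dictionary with atomic operations and a non-blocking hash table, which is the part the paper addresses.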

DOI

Scopus

9

Citation

(Scopus)
Hit Count Dance: Reliability Verification of Search Engines' Hit Counts

舟橋卓也, 山名早人

日本データベース学会論文誌 Vol.9 ( No.1 ) 18 - 22 2010
Reliability Verification of Search Engines' Hit Counts: How to Select a Reliable Hit Count for a Query

Takuya Funahashi, Hayato Yamana

CURRENT TRENDS IN WEB ENGINEERING 6385s 114 - 125 2010

DOI

Scopus

10

Citation

(Scopus)
A Lock-free GCLOCK Page Replacement Algorithm

Makoto Yui, Jun Miyazaki, Shunsuke Uemura, Hirokazu Kato, Hayato Yamana

情報処理学会論文誌. データベース 2 ( 4 ) 32 - 48 2009.12

　View Summary

In this paper, we propose a lock-free variant of the GCLOCK page replacement algorithm, named Nb-GCLOCK. Concurrent access to the buffer management module is a major factor that prevents database scalability to processors. Therefore, we propose a non-blocking scheme for bufferfix operations that fix buffer frames for requested pages without locks by combining Nb-GCLOCK and a wait-free hash table. Our experimental results revealed that our scheme can obtain nearly linear scalability to processors up to 64 processors, although the existing locking-based schemes do not scale beyond 16 processors.

CiNii
Improvement in Speed and Accuracy of Multiple Sequence Alignment Program PRIME

Yamada Shinsuke, Gotoh Osamu, Yamana Hayato

Information and Media Technologies 4 ( 2 ) 317 - 327 2009

　View Summary

Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. We have developed an MSA program PRIME, whose crucial feature is the use of a group-to-group sequence alignment algorithm with a piecewise linear gap cost. We have shown that PRIME is one of the most accurate MSA programs currently available. However, PRIME is slower than other leading MSA programs. To improve computational performance, we newly incorporate anchoring and grouping heuristics into PRIME. An anchoring method is to locate well-conserved regions in a given MSA as anchor points to reduce the region of DP matrix to be examined, while a grouping method detects conserved subfamily alignments specified by phylogenetic tree in a given MSA to reduce the number of iterative refinement steps. The results of BAliBASE 3.0 and PREFAB 4 benchmark tests indicated that these heuristics contributed to reduction in the computational time of PRIME by more than 60% while the average alignment accuracy measures decreased by at most 2%. Additionally, we evaluated the effectiveness of iterative refinement algorithm based on maximal expected accuracy (MEA). Our experiments revealed that when many sequences are aligned, the MEA-based algorithm significantly improves alignment accuracy compared with the standard version of PRIME at the expense of a considerable increase in computation time.

DOI CiNii
8. Challenges to Gathering and Analyzing over 10 Billion of Web Pages(<Special Feature>Development of Advanced Development of Fundamental Software through Tight Collaboration of Academia and Industry)

MURAOKA Yoichi, YAMANA Hayato, MATSUI Kunio, HASIMOTO Minako, AKABANE Kyoko, HAGIWARA Junichi

Journal of Information Processing Society of Japan 49 ( 11 ) 1277 - 1283 2008.11

　View Summary

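The number of Web pages was estimated at 53.7 billion as of November 2006. Between January 2004 and July 2006, we crawled 55.48 million Web servers worldwide, targeting text only, and collected about 14.45 billion unique Web pages. On the collected pages, we analyzed the top-level domain distribution, the language distribution, and the geographic locations of Web servers, and performed backlink analysis and PageRank computation to characterize the current state of the Web. Furthermore, to show that Web page analysis can be used for business, we built an e-company survey prototype that visualizes corporate activity on Web sites, and extracted company characteristics, strategies, reputations, and so on.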

CiNii J-GLOBAL
A Proposal of a Time-Series News Article Classification Method Using Single-Article Filtering

中村智浩, 平野孝佳, 平手勇宇, 山名早人

日本データベース学会論文誌 7 ( 2 ) 2008

J-GLOBAL
Web structure in 2005

Yu Hirate, Shin Kato, Hayato Yamana

ALGORITHMS AND MODELS FOR THE WEB-GRAPH 4936 36 - 46 2008 [Refereed]

DOI

Scopus

9

Citation

(Scopus)
Detecting article errors in English using search engines

平野孝佳, 平手勇宇, 山名早人

DBSJ letters 6 ( 3 ) 1 - 4 2007.12

　View Summary

The purpose of this study is to show the effectiveness of shell script execution on multi-core and/or SMT (Simultaneous Multi-Threading) processors. Recently, multi-core processors and SMT techniques have become popular even at home and in business. However, using programs or compilers without consideration of parallelism does not give us the benefits of multi-core and multi-thread execution. Programmers have to do parallel programming to receive the benefits. Therefore, automatic parallelizing techniques have been studied actively. This paper proposes automatic parallelizing scheme for shell script prog...

CiNii J-GLOBAL
EPCI: Extracting potentially copyright infringement texts from the web

Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu Hirate, Hayato Yamana

16th International World Wide Web Conference, WWW2007 1151 - 1152 2007 [Refereed]

　View Summary

In this paper, we propose a new system, called EPCI, that extracts texts potentially infringing copyright from the Web. EPCI extracts them in the following way: (1) generating a set of queries based on a given copyright-reserved seed-text, (2) submitting every query to a search engine API, (3) gathering the search result Web pages in rank order until the similarity between the given seed-text and the search result pages falls below a given threshold value, and (4) merging all the gathered pages, then re-ranking them in order of their similarity. Our experimental result using 40 seed-texts shows that EPCI is able to extract 132 potentially copyright-infringing Web pages per given copyright-reserved seed-text with 94% precision on average.

DOI

Scopus

3

Citation

(Scopus)
Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost

Shinsuke Yamada, Osamu Gotoh, Hayato Yamana

BMC BIOINFORMATICS 7 524 2006.12 [Refereed]

DOI PubMed

Scopus

32

Citation

(Scopus)
Search Engines 2006 -Guides to the Web-:Introduction to Search Engines

YAMANA H., MURATA Tsuyoshi

IPSJ-MGN 46 ( 9 ) 981 - 987 2005.09

CiNii
TF^2P-growth : Frequent Itemset Mining Algorithm without Any Thresholds

HIRATE YU, IWAHASHI EIGO, YAMANA HAYATO

情報処理学会論文誌. データベース 46 ( 8 ) 60 - 71 2005.06

　View Summary

Conventional frequent itemset mining algorithms require some user-specified minimum support, and then mine frequent itemsets with support values that are higher than the minimum support. As it is difficult to predict how many frequent itemsets will be mined with a specified minimum support, the Top-κ mining concept has been proposed. The Top-κ Mining concept is based on an algorithm for mining frequent itemsets without a minimum support, but with the number of most κ frequent itemsets ordered according to their support values. However, the Top-κ mining concept still requires a threshold κ. ...

CiNii J-GLOBAL
The Branch Predictor Referring a BTB Entry Existence(Processor Architecture)

SAITO FUMIKO, YAMANA HAYATO

情報処理学会論文誌. コンピューティングシステム 45 ( 11 ) 71 - 79 2004.10

　View Summary

Branch prediction is implemented in recent processors to avoid pipeline stalls. Branch prediction is a kind of speculative execution for control dependences. In recent years, as pipelines have become deeper, the branch misprediction penalty has grown. Thus, the branch misprediction rate must be lowered to raise processor performance. A branch predictor predicts a branch direction and a branch target address. The BTB (Branch Target Buffer) registers Taken branches. We found that most branches that do not have a BTB entry are NotTaken branches. We propose the branch predictor referring a BT...

CiNii J-GLOBAL
The Architecture of Search Engines (<Special feature> Internet Search Engines)

YAMANA Hayato

The journal of Information Science and Technology Association 54 ( 2 ) 84 - 89 2004.02

　View Summary

Search engines are indispensable for using the Internet today. However, their architecture is not widely known. In this paper, the architecture of search engines is described using Google as an example, focusing on Web crawlers and the indexing and searching schemes. Moreover, the problems of handling many queries and the cost of running search engines are discussed.

DOI CiNii J-GLOBAL
Exploitation of Informational Applications : Toward the Global Web Information Archive(<Special Edition>Exploitation of Internet New Applications)

Yamana Hayato

The journal of the Institute of Image Information and Television Engineers 57 ( 12 ) 1632 - 1637 2003.12

DOI CiNii J-GLOBAL

Scopus
(IT in the Home)The Internet Leading IT Society : The Present Internet Access at Home and the Future(Special Issue on IT in Daily Life: IT Application Systems at Our Fingertips)

YAMANA Hayato

The Journal of the Institute of Electronics, Information, and Communication Engineers 86 ( 5 ) 304 - 310 2003.05

　View Summary

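According to a survey by the Ministry of Internal Affairs and Communications, the household Internet penetration rate in Japan exceeded 80% of all households at the end of 2002. Penetration among all households exceeded 10% at the end of 1998, so reaching 10% took five years. Compared with 13 years for personal computers and 15 years for automobiles and mobile phones, this shows how rapidly the Internet has spread. Focusing on the relationship between the Internet and the home, this article presents the past, present, and future of Internet access from the home based on concrete data.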

CiNii J-GLOBAL
Design of Automatic Parallelizing Intermediate Code Interpreter

KOIKE HANPEI, YAMANA HAYATO, YAMAGUCHI YOSHINORI

情報処理学会論文誌. プログラミング 40 ( 10 ) 64 - 74 1999.12

　View Summary

In this paper, the design of the intermediate code interpreter, which executes a sequential program in parallel using speculative method, is discussed. Software techniques which enable an efficient parallel speculative execution without hardware support, such as the check point execution mechanism with which an appropriate parallel execution granularity is established, and the efficient implementation of the speculative memory operations which minimize the overhead of searching, recording and the mutual exclusion, are proposed. Experiment results to see the basic performance of these techni...

CiNii J-GLOBAL
Evaluation of Communication Mechanisms for Distributed Memory Parallel Computers in Wavefront Computation (Special Issue on Parallel Processings)

SAKANE HIROFUMI, KODAMA YUETSU, TATEBE OSAMU, KOIKE HANPEI, YAMANA HAYATO, YAMAGUCHI YOSHINORI, YUBA TOSHITSUGU

Transactions of Information Processing Society of Japan 40 ( 5 ) 2281 - 2292 1999.05

　View Summary

In this paper, we discuss efficient parallel execution of a dense-matrix problem considering trade-offs between fine-grain and coarse-grain communication in distributed memory machines. The solution of the triangular system of equations involves data dependencies between consecutive iterations in the outer-loop. The dependencies can be naturally solved and processed in parallel by wavefront computation. Two ways of parallelizing are presented; the element-wise fine-grain approach and the coarse-grain approach. We implemented these algorithms on both EM-X and AP 1000+. Fine-grain support mec...

CiNii J-GLOBAL
"New Developments in Information Retrieval: From Test Collections to Search Engines"

細野公男, 小川泰嗣, 神門典子, 木谷勉, 住田一男, 福島俊一, 山名早人

情報処理 40 ( 2 ) 34 - 35 1999.02

CiNii
Experiments of Collecting WWW Information Using Distributed WWW Robots

Hayato Yamana, Kent Tamura, Hiroyuki Kawano, Satoshi Kamei, Masanori Harada, Hideki Nishimura, Isao Asai, Hiroyuki Kusumoto, Yoichi Shinoda

SIGIR Forum (ACM Special Interest Group on Information Retrieval) 379 - 380 1998 [Refereed]

DOI

Scopus
Prototype of a Next-Generation Parallel Processing Computer Developed: Communication Overhead Greatly Reduced

山名早人

電子情報通信学会誌 79 ( 8 ) 855 - 855 1996.08

CiNii
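Successful Development of a Highly Stable Vacuum Microdevice with a MOS Transistor Structure: A Major Step Toward Practical Use in Displays and Other Applications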

山名早人

電子情報通信学会誌 79 ( 6 ) 630 - 630 1996.06

CiNii
Identifying the capability of overlapping computation with communication

A Sohn, J Ku, Y Kodama, M Sato, H Sakane, H Yamana, S Sakai, Y Yamaguchi

PROCEEDINGS OF THE 1996 CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT '96) 133 - 138 1996 [Refereed]

　View Summary

Overlapping computation with communication is central to obtaining high performance on distributed-memory multiprocessors. Identifying the capability of overlapping, machine architects and programmers will be able to provide tools which can effectively utilize the underlying architecture. This report explicates the overlapping capability of two distributed-memory multiprocessors: the laboratory prototype EM-X multithreaded multiprocessor and a commercially available IBM SP2 with wide nodes. The well-known bitonic sorting algorithm is selected for experiments. Various message sizes are used to determine when, where, how much and why overlapping takes place. Experimental results indicate that both multiprocessors would yield up to 30% to 40% overlap of communication time when the message size is approximately 1K integers. EM-X is found message-size insensitive yielding high overlap for various message sizes while SP2 was effective for a window of message size of 512 to 2K integers, depen...
A Distributed Control Scheme of Macrotask-level Speculative Execution on the EM-4 Multiprocessor

Yamana,Hayato, Sato,Mitsuhisa, Kodama,Yuetsu, Sakane,Hirohumi, Sakai,Shuichi, Yamaguchi,Yoshinori

IPSJ Journal 36 ( 7 ) 1578 - 1588 1995.07 [Refereed]

　View Summary

The purpose of this paper is to propose a new fast control scheme for macrotasks with speculation. A macrotask is a coarse-grain task that serves as the unit of speculation. Previous works have reported that the speedup ratio is 12 to 630 times in comparison with conventional execution schemes without speculation when both the speculation depth and the computational resources are infinite, which is called the oracle model. The control overhead, however, prevents the speedup from attaining the theoretical speedup ratio. Thus, a control scheme with small overhead is desired. The distributed control scheme achieves small control overhead by adopting two mechanisms: 1) each macrotask creates its successor macrotasks, and 2) each macrotask snoops the broadcast control signals and determines its next state by itself. Thus, the control of macrotasks can be parallelized, and the overhead of controlling macrotasks does not depend on the number of macrotasks. The scheme has been implemented on the EM-4 multiprocessor. Preliminary evaluations using Boolean Recurrence loops show that the control overhead of the proposed scheme is smaller than that of conventional centralized control schemes.

J-GLOBAL
A Computation Scheme of the Execution Time of Single Doacross Loops on Distributed Shared Memory Machines

Yamana Hayato, Yasue Toshiaki, Muraoka Yoichi, Yamaguchi Yoshinori

The transactions of the Institute of Electronics, Information and Communication Engineers 78 ( 2 ) 170 - 178 1995.02

　View Summary

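Recent distributed shared memory parallel computers have a mechanism that transfers data asynchronously and independently of processor execution, which keeps the processor from being consumed by data transfer. In addition, although computing performance has improved through faster processors and 64-bit architectures, network transfer performance is often set lower than processor computing performance for cost-performance reasons. In this paper, we propose a method for computing the execution time of single Doacross loops on such distributed shared memory parallel computers. To compute the execution time, we newly introduce the data transfer pitch, the time interval between data inputs/outputs between a processor and the network, as a measure of network transfer performance, define the data transfer slack time and the data transfer issue delay, and derive the loop execution time. As an example of using the derived loop execution time, we show that, when the data transfer pitch is taken into account, the execution time can be reduced by changing the order of data transfers. Simulation targeting the T3D confirmed that the method computes execution times closer to measured values than the conventional method.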

CiNii J-GLOBAL
Dynamic characteristics of multithreaded execution in the EM-X multiprocessor

H Sakane, R Sato, Y Kodama, H Yamana, S Sakai, Y Yamaguchi

1995 INTERNATIONAL WORKSHOP ON COMPUTER PERFORMANCE MEASUREMENT AND ANALYSIS (PERMEAN '95), PROCEEDINGS 14 - 22 1995 [Refereed]

　View Summary

Multithreading is known to be effective for tolerating communication latency in distributed-memory multiprocessors. Two types of support for multithreading have been used to date: software and hardware. This paper presents the impact of multithreading on performance through empirical studies. In particular, we explicate the performance difference between software support and hardware support for the 80-processor EM-X distributed-memory multiprocessor which we have designed and implemented. The EM-X provides three types of hardware support for fine-grain multithreading: direct remote memory access, fast thread invocation, and dedicated instructions for generating fixed-sized communication packets. To demonstrate the effect of multithreading, we have performed various experiments using micro benchmark programs and MP3D, one of the SPLASH benchmarks. Three types of performance parameters have been measured: processor efficiency, remote memory latency, and network load. Experimental results indicate that the EM-X architecture is highly effective for supporting the multithreading principles of execution through dedicated hardware and software. Keywords: multithreading, latency hiding, fine grain communication, direct remote memory access, shared memory benchmark, synthetic workload.
A Speculative Execution Scheme of Macrotasks for Parallel Processing System

Yamana Hayato, Yasue Toshiaki, Ishii Yoshihiko, Muraoka Yoichi

The transactions of the Institute of Electronics, Information and Communication Engineers 77 ( 5 ) 343 - 353 1994.05

　View Summary

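In this paper, we propose a program parallelization and execution scheme that uses speculative execution across multiple levels of conditional branches to execute FORTRAN programs at high speed on parallel processing systems. Several methods for parallelizing programs containing conditional branches have been proposed. As a method without speculative execution, there is (1) the derivation of the earliest executable condition of tasks; as methods with speculative execution, there are (2) speculative execution limited to one level of conditional branch, targeting superscalar processors and VLIW machines, and (3) multi-level speculative execution targeting specific loops. However, these have the following problems: (1) deriving the earliest executable condition alone does not yield sufficient parallelism; (2) the speedup obtained by speculating on one level of conditional branch is at most a factor of two; and (3) the applicable targets are limited to specific loops. To address these problems, this paper divides a program into macrotasks and defines multi-level speculative execution between macrotasks on a general parallel processing system. We then propose a macrotask execution control method that uses, for each macrotask, an execution start condition, a control determination condition, and an execution stop condition.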

CiNii J-GLOBAL
A FORTRAN compiling method for dataflow machines and its prototype compiler for the parallel processing system -Harray-

T. Yasue, H. Yamana, Y. Muraoka

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 757 LNCS 482 - 496 1993

DOI

Scopus
A Flow-Executing Scheme of Doacross Loops on Dynamic Dataflow Machines

ISHII Yoshihiko, YASUE Toshiaki, YAMANA Hayato, MURAOKA Yoichi

The Transactions of the Institute of Electronics,Information and Communication Engineers. 75 ( 7 ) 440 - 449 1992.07

CiNii J-GLOBAL
An environment for dataflow program development of parallel processing system-Harray-.

山名早人, 神舘淳, 安江俊明, 村岡洋一

電子情報通信学会論文誌 D-1 73 ( 6 ) 1990

J-GLOBAL
System architecture of parallel processing system - Harray

Hayato Yamana, Toshikazu Marushima, Takashi Hagiwara, Yoichi Muraoka

Proceedings of the International Conference on Supercomputing Part F130184 76 - 89 1988.06 [Refereed]

　View Summary

This paper proposes a parallel processing system - Harray - for scientific computations. Data flow computers are expected to obtain high performance because they can fully extract parallelism from a program. However, they have many problems, such as the difficulty of controlling the sequence of execution. The - Harray - system is an array processor which adopts two levels of control mechanism: data flow execution in each processor and control flow between processors, in order to take full advantage of both mechanisms. A task which is assigned to a processor is called a "macro-block". Three types of macro-blocking and three types of activation schemes for the macro-block which initiates its execution are introduced in order to attain high performance. Moreover, a hardware synchronization mechanism is used to reduce synchronization overhead and to gain linear speedup of the - Harray - system. In this paper, the system architecture of the - Harray - system and its performance evaluation by software simulation are presented.

DOI

Scopus

3

Citation

(Scopus)


Books and Other Publications

Optimizing Web Sites: Optimization Techniques for SEO, Servers, and Clients

Andrew B. King, 山名早人, 原隆文

オライリージャパン 2009.12 ISBN: 9784873114316

ASIN
Algorithms and Models for the Web-Graph: Fourth International Workshop, WAW 2006, Banff, Canada, November 30 - December 1, 2006, Revised Papers ... Applications, incl. Internet/Web, and HCI)

William Aiello, Andrei Broder, Jeannette Janssen, Evangelos Milios

Springer 2008.04 ISBN: 3540788077

ASIN
Google Hacks, 3rd Edition: 100 Techniques & Tools Used by Professionals

Rael Dornfest, Paul Bausch, Tara Calishain, 山名早人, 石川隼輔, 堀井洋, 村上明子, 鹿島久嗣, 小柳光生( Part： Supervisor (editorial))

オライリー・ジャパン 2007.04 ISBN: 9784873113210

ASIN
Google Hacks, 2nd Edition: 100 Techniques & Tools Used by Professionals

Tara Calishain, Rael Dornfest, 山名早人, 石川隼輔, 堀井洋, 村上明子, 鹿島久嗣, 小柳光生( Part： Joint author)

オライリージャパン 2005.08 ISBN: 4873112338

ASIN
Google Pocket Guide

山名早人( Part： Joint translator)

オライリージャパン 2003.10 ISBN: 4873111536

ASIN
Google Hacks: 100 Techniques & Tools Used by Professionals

Calishain, Tara, Dornfest, Rael, 田中, 裕子, 山名, 早人( Part： Joint translator)

オライリー・ジャパン,オーム社 (発売) 2003.08 ISBN: 4873111366

ASIN
The Master of World Wide Web Information Retrieval: A Complete Guide to WWW Search Services

山名早人, 田村健人( Part： Joint author)

カットシステム 1996.12 ISBN: 4906391389

ASIN
Things That Evolve the Internet (MyGaia Books)

山名早人( Part： Sole author)

マイガイア 1996.08 ISBN: 488528211X

ASIN
Illustrated Introduction to Massively Parallel Computers (COM Series)

村岡洋一, 山名早人( Part： Joint author)

オーム社 1992.06 ISBN: 4274129039

ASIN
Parallel Processing in Computational Mechanics (New Generation Computing)

Hojjat Adeli

CRC Press 1991.09 ISBN: 0824785576

ASIN


Works

CREST SECURE DATA SHARING AND DISTRIBUTION PLATFORM FOR INTEGRATED BIG DATA UTILIZATION

Software

2015.10

-

2021.09
Construction of a Multimedia Web Analysis Platform and Development of Social Analysis Software

2008

-

　
Reliability of Search Engines

2007

-

　
Trustworthiness of Search Engines

2007

-

　
e-Society: Platform Construction Technology Enabling Knowledge Aggregation on the Internet

2002

-

2007
e-Society Project

2002

-

2007
Productive ICT Academia Program (21st Century COE)

2002

-

2006
Survey on Efficient Information Gathering (Web)

2002

　

　
Research on Load Balancing Methods for Distributed Software Robots

2000

-

2002
Development of Advanced Parallelizing Compiler Technology (NEDO/METI)

2000

-

2002
Research on Load Balancing Technique for Distributed WWW Robots

2000

-

2002
Research on Advanced Parallelizing Compiler

2000

-

2002
Survey of WWW Information Retrieval Systems

2000

-

　
Survey in WWW Search Engines

2000

-


Presentations

A Proposal of a Word Importance Calculation Method for Specific Domains and Its Application to Estimating Author Expertise from Short Texts

滝川真弘, 山名早人

情報処理学会研究報告(Web)

Presentation date： 2017.10

Event date：
2017.10

　

　
Estimating Online Advertisement Placement on Web Pages to Improve CTR

大谷一善, 滝川真弘, 堀田弘明, 山名早人

情報科学技術フォーラム講演論文集

Presentation date： 2017.09

Event date：
2017.09

　

　
FCMalloc: A Memory Allocator for Accelerating Fully Homomorphic Encryption

馬屋原昂, 佐藤宏樹, 石巻優, 今林広樹, 山名早人

情報処理学会研究報告(Web)

Presentation date： 2017.07

Event date：
2017.07

　

　
A Method for Classifying Geometry Answer Patterns from Handwritten Answer Data Collected with a Digital Pen

森山優姫菜, 下岡純也, 浅井洋樹, 山名早人

情報科学技術フォーラム講演論文集

Presentation date： 2016.08

Event date：
2016.08

　

　
A Proposal of a Word Importance Calculation Method for Specific Domains and Its Application to Expertise Estimation on Twitter

滝川真弘, 山名早人

情報科学技術フォーラム講演論文集

Presentation date： 2016.08

Event date：
2016.08

　

　
Research Trends in the Use of Fully Homomorphic Encryption for Data Mining

佐藤宏樹, 馬屋原昂, 石巻優, 今林広樹, 山名早人

情報科学技術フォーラム講演論文集

Presentation date： 2016.08

Event date：
2016.08

　

　
Fast Privacy-Preserving Genome Search Using Fully Homomorphic Encryption

石巻優, 清水佳奈, 縫田光司, 山名早人

2016年暗号と情報セキュリティシンポジウム(SCIS2016)予稿集

Presentation date： 2016.01

Event date：
2016.01

　

　
A study of effective visit history utilization for Location recommendation-User's Familiarity with area and Visit pattern change-

HAN JUNGKYU, YAMANA HAYATO

電子情報通信学会技術研究報告

Presentation date： 2015.06

Event date：
2015.06

　

　
Comparison of Different Semantic Negative Concepts Selection Methods in SVM Classifier Training for Image Annotation

Shan-Bin Chan, Shin'ichi Satoh, Hayato Yamana

第5回データ工学と情報マネジメントに関するフォーラム (DEIM2013)

Presentation date： 2015.03

Event date：
2015.03

　

　
Improved Native Language Identification with Upper Phrase Information and Training Data Selection

TANAKA MASAHIRO, WANG LAN, YAMANA HAYATO

電子情報通信学会技術研究報告

Presentation date： 2014.12

Event date：
2014.12

　

　
A System for Estimating Unconsolidated Memory Using Online Handwriting Information

ASAI HIROKI, YAMANA HAYATO

情報処理学会研究報告(Web)

Presentation date： 2014.11

Event date：
2014.11

　

　
A Study of Word Importance Calculation Methods for Twitter User Profile Estimation Using Mention Information

UESATO KAZUYA, TANAKA MASAHIRO, ASAI HIROKI, YAMANA HAYATO

電子情報通信学会技術研究報告

Presentation date： 2014.07

Event date：
2014.07

　

　
A Related-Paper Search System Based on Keyword Generation Using a Word Semantic Concept Matrix

HAYASHI YUUMA, OKUNO SHUN'YA, YAMANA HAYATO

電子情報通信学会技術研究報告

Presentation date： 2014.07

Event date：
2014.07

　

　
A Proposal of an Authorship Identification Method for Microblogs: Authorship Identification at the Scale of 10,000 Users

OKUNO SHUN'YA, ASAI HIROKI, YAMANA HAYATO

電子情報通信学会技術研究報告

Presentation date： 2014.07

Event date：
2014.07

　

　
Detecting Hijacked Tweets Using Writing Style and Tweet Metadata

上里和也, 奥谷貴志, 浅井洋樹, 奥野峻弥, 田中正浩, 山名早人

研究報告データベースシステム（DBS）一般社団法人情報処理学会

Presentation date： 2013.11

Event date：
2013.11

　

　

　View Summary

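While the number of Twitter users continues to grow, there is an increase in cases where IDs and passwords are obtained illicitly and tweets are posted by someone else. To address this, we previously proposed a method for detecting spam tweets, which are a subset of the messages posted through account hijacking, and obtained an accuracy of about 80%. That method targeted spam tweets containing specific words and demonstrated the effectiveness of detection. In this study, we broaden the detection target: we define all tweets posted by someone other than the account owner as "hijacked tweets" and propose a method to detect them. We also re-tune the parameters of the previously proposed method and exploit the fact that the hashtags an account uses frequently and the users it replies to are characteristic of each account, in order to improve the F-measure. In an evaluation experiment on 100 accounts, the method improved the F-measure by 0.1984 over our previous method, reaching 0.8570.
Information Extraction through Integrated Analysis of Multimedia Big Data Including Social Media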

上田高徳, 浅井洋樹, 藤木紫乃, 山本祐輔, 武井宏将, 秋岡明香, 山名早人

情報処理学会研究報告. データベース・システム研究会報告一般社団法人情報処理学会

Presentation date： 2012.12

Event date：
2012.12

　

　

　View Summary

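This paper describes our efforts to extract information through the integrated analysis of multimedia big data. With the spread of social media, a wide variety of information is now uploaded to the Internet in real time. We believe that more useful information can be extracted by analyzing "multimedia data" that combines multiple information sources rather than a single social medium. This paper describes the multimedia analysis we are working on, as well as QueueLinker, a parallel distributed processing framework that we are developing to support the analysis of large-scale real-time data.
Information Extraction through Integrated Analysis of Multimedia Big Data Including Social Media

上田高徳, 浅井洋樹, 藤木紫乃, 山本祐輔, 武井宏将, 秋岡明香, 山名早人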

電子情報通信学会技術研究報告

Presentation date： 2012.12

Event date：
2012.12

　

　
Information Extraction through Integrated Analysis of Multimedia Big Data Including Social Media (Social Media, Big Data and Social Computing, and General)

上田高徳, 浅井洋樹, 藤木紫乃, 山本祐輔, 武井宏将, 秋岡明香, 山名早人

電子情報通信学会技術研究報告. DE, データ工学一般社団法人電子情報通信学会

Presentation date： 2012.12

Event date：
2012.12

　

　
Development of a Parallel Distributed Web Crawler Composed of Producer-Consumer Modules

上田高徳, 佐藤亘, 鈴木大地, 打田研二, 森本浩介, 秋岡明香, 山名早人

情報処理学会シンポジウムシリーズ(CD-ROM)

Presentation date： 2012.11

Event date：
2012.11

　

　
形態素間の優先関係を考慮した略語生成手法

田中友樹, 及川孝徳, 山名早人, 大西貴士, 土田正明, 石川開

情報処理学会シンポジウムシリーズ(CD-ROM)

Presentation date： 2012.11

Event date：
2012.11

　

　
筆記情報と時系列モデルを用いた学習者つまずき検出

浅井洋樹, 野澤明里, 苑田翔吾, 山名早人

電子情報通信学会技術研究報告

Presentation date： 2012.10

Event date：
2012.10

　

　
筆記情報と時系列モデルを用いた学習者つまずき検出(教育・学習支援プラットフォーム/一般)

浅井洋樹, 野澤明里, 苑田翔吾, 山名早人

電子情報通信学会技術研究報告. ET, 教育工学一般社団法人電子情報通信学会

Presentation date： 2012.10

Event date：
2012.10

　

　

　View Summary

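Detecting where a student stumbles is a necessary process in supporting students' learning. Research on stumbling detection in CAI has used grading results, the time required to answer, biometric information such as facial images and pulse obtained from sensors, and the operation histories of input devices such as keyboards and mice. However, primary education today is centered on handwriting, and stumbling detection in such an environment has not been discussed in depth. This report examines a method for detecting stumbling based on the handwriting information obtained from the pen a student uses. For detection, we use an AR (autoregressive) model, a time-series model, to detect the change points at which the learner's handwriting behavior changes, and perform estimation for each interval between change points. A trial evaluation confirmed a certain level of detection performance.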
The 2010 IEEE International Workshop on Quantitative Evaluation of large-scale Systems and Technologies (QuEST): Welcome message from workshop organizers

Kin Fun Li, Rick McGeer, Stephen Neville, Hayato Yamana

24th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2010

Presentation date： 2010.07

Event date：
2010.07

　

　
The 2010 IEEE International Symposium on Mining and Web (MAW): Welcome message from symposium organizers

Takahiro Hara, Kin Fun Li, Shengrui Wang, Hayato Yamana, Laurence T. Yang, Yanchun Zhang

24th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2010

Presentation date： 2010.07

Event date：
2010.07

　

　
Two step adjustment technique of term weight

YANO Hiroya, NAKAJIMA Tai, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2010.06

Event date：
2010.06

　

　

　View Summary

The TF-IDF method is one of the methods used to weight terms in document retrieval. The IDF value indicates how rarely a term appears in a document set, and it depends on the document set being searched. The problem, therefore, is that even if a term rarely appears in a document set from the same field as the query (that is, the term is highly specific within that field), its IDF value is small if the term appears frequently in the document set being searched. In this paper, we propose and study a two-step adjustment technique for term weights. In the first step, we get documents r...
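As a rough illustration of the dependence described in this summary, the sketch below computes IDF for the same term over two different document sets. It assumes the common `log(N / df)` form of IDF; the function names and toy collections are hypothetical, and the two-step adjustment itself is not shown because the summary is truncated.

```python
import math
from collections import Counter


def idf(term, documents):
    """Inverse document frequency, log(N / df); 0.0 when the term is absent."""
    df = sum(1 for doc in documents if term in doc)
    return math.log(len(documents) / df) if df else 0.0


def tf_idf(term, doc, documents):
    """Raw term count in `doc` times the IDF measured over `documents`."""
    return Counter(doc)[term] * idf(term, documents)


# Hypothetical toy collections of tokenised documents (not from the paper).
general_docs = [["cache", "disk"], ["query", "index"],
                ["cache", "index"], ["graph", "path"]]
retrieval_docs = [["query", "index"], ["query", "rank"], ["query", "cache"]]

idf_general = idf("query", general_docs)      # rare here, so the weight is high
idf_retrieval = idf("query", retrieval_docs)  # in every document, so the weight is 0
```

The same term receives a very different weight depending only on which collection the IDF is measured over, which is the effect the summary identifies as the problem.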
Similar object detection using template matching focused on positional relationship of feature regions

Keisuke Arai, Kosuke Morimoto, Hayato Yamana

IPSJ SIG Notes. CVIM Information Processing Society of Japan (IPSJ)

Presentation date： 2010.05

Event date：
2010.05

　

　

　View Summary

Detecting similar objects in a large quantity of images helps us organize images by category and conduct market research using images on the Web. Template matching, which can detect similar objects, is not suited to unknown images because it assumes that the target image contains the same object. In this paper, we aim to decrease the false-positive rate caused by this premise of template matching. We propose a method that considers the positional relationships of feature regions in addition to conventional template matching. Each feature region in template image matches target imag...
6K-7 Data Mining Algorithms Classified Based on Data Access Patterns

Akioka Sayaka, Muraoka Yoichi, Yamana Hayato, Nakajima Tatsuo

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 2010.03

Event date：
2010.03

　

　
Model-Based Gaze Tracking with Low-cost Web Cameras

FUKUDA Takashi, MATSUZAKI Katsuhiko, YAMANA Hayato

Technical report of IEICE. PRMU 社団法人電子情報通信学会

Presentation date： 2010.03

Event date：
2010.03

　

　

　View Summary

Gaze estimation that does not restrain users at home will contribute to the reform of user interfaces. Commercial gaze estimation systems achieve high accuracy by using infrared light; however, gaze estimation systems using web cameras are highly desirable for home use because of their low price. The problem with web cameras is their low resolution for gaze estimation. Low-resolution images are strongly affected by noise, so past studies have not estimated the detailed direction of the eyes. In this paper, we propose a new gaze estimation method which shows high accuracy using image processing and geometrica...
Model-Based Gaze Tracking with Low-cost Web Cameras

FUKUDA Takashi, MATSUZAKI Katsuhiko, YAMANA Hayato

Technical report of IEICE. HIP 一般社団法人電子情報通信学会

Presentation date： 2010.03

Event date：
2010.03

　

　

　View Summary

Gaze estimation that does not restrain users at home will contribute to the reform of user interfaces. Commercial gaze estimation systems achieve high accuracy by using infrared light; however, gaze estimation systems using web cameras are highly desirable for home use because of their low price. The problem with web cameras is their low resolution for gaze estimation. Low-resolution images are strongly affected by noise, so past studies have not estimated the detailed direction of the eyes. In this paper, we propose a new gaze estimation method which shows high accuracy using image processing and geometrica...
Cross-media impact on Twitter in Japan

Sayaka Akioka, Norikazu Kato, Yoichi Muraoka, Hayato Yamana

International Conference on Information and Knowledge Management, Proceedings

Presentation date： 2010

Event date：
2010

　

　

　View Summary

Twitter, a microblogging service, is attracting attention as a new communication channel. To better understand this new service, this paper reports the characteristics of Twitter users in Japan and the impact of media such as publications and TV programs on the Twitter community. To the best of our knowledge, this paper is the first to quantitatively analyze the mutual impact between Twitter and other media. For the analyses, we crawled user profiles whose language setting is Japanese and conducted several analyses with well-known methodologies, as in conventional work. We confirmed the characteristics of the collected user profiles: the distributions of the number of friends and the number of follows both follow a power law, and the number of friends correlates with the number of follows. Besides the collected user profiles, we also used closed-caption data of TV programs in Japan and other information on media that picked up Twitter. We ran a batch process matching these data from outside Twitter against the collected user profiles, and concluded that Twitter has already spread widely among Japanese people, but that media still have a huge impact on the growth of Twitter users. We also conjecture that the impact is not one-sided but is a mutual influence between Twitter and other media. © 2010 ACM.
Community QA Question Classification: Is the Asker Looking for Subjective Answers or Not?

Naoyoshi AIKAWA, Tetsuya SAKAI, Hayato YAMANA

WebDBForum2011

Presentation date： 2010

Event date：
2010

　

　
The Method of Improving the Specific Language Focused Crawler

Shan-Bin Chan, Hayato Yamana

Proc. of the 1st CIPS-SIGHAN Joint Conf. on Chinese Language Processing(CLP2010)

Presentation date： 2010

Event date：
2010

　

　
Data Access Pattern Analysis on Stream Mining Algorithms for Cloud Computation

Sayaka Akioka, Hayato Yamana, Yoichi Muraoka

Proc. of the 2010 Int'l Conf. on Parallel and Distributed Processing Techniques and Applications

Presentation date： 2010

Event date：
2010

　

　
Reliability Verification of Search Engines' Hit Counts: How to Select a Reliable Hit Count for a Query

Takuya Funahashi, Hayato Yamana

CURRENT TRENDS IN WEB ENGINEERING SPRINGER-VERLAG BERLIN

Presentation date： 2010

Event date：
2010

　

　

　View Summary

In this paper, we investigate the trustworthiness of search engines' hit counts, the numbers returned as search result counts. Since many studies adopt search engines' hit counts to estimate the popularity of input queries, the reliability of hit counts is indispensable for achieving trustworthy studies. However, hit counts are unreliable because they change when a user clicks the "Search" button more than once or clicks the "Next" button on the search results page, or when a user queries the same term on separate days. In this paper, we analyze the characteristics of hit count transition by gathering various types of hit counts over two months using 10,000 queries. The results of our study show that the hit counts with the largest search offset just before search engines adjust their hit counts are the most reliable. Moreover, hit counts are the most reliable when they are consistent over approximately a week.
Search Engines’ Trustworthiness-Current Status

Hayato YAMANA

Proc. of the 5th Korea-Japan Database Workshop

Presentation date： 2010

Event date：
2010

　

　
Time-weighted web authoritative ranking

Bundit Manaskasemsak, Arnon Rungsawang, Hayato Yamana

Information Retrieval

Presentation date： 2010

Event date：
2010

　

　
Speed-Up of Resizable-LSH for Similarity-Based Range Query

山崎邦弘, 山名早人

情報処理学会研究報告(CD-ROM)

Presentation date： 2010

Event date：
2010

　

　
Localized Multiple Kernel Learningを用いた画像分類

小林大輔, 相川直視, 山名早人

MIRU2010, IS2-43

Presentation date： 2010

Event date：
2010

　

　
低解像度目画像からのModel-Based視線推定

福田崇, 松崎勝彦, 山名早人

MIRU2010, IS1-46

Presentation date： 2010

Event date：
2010

　

　
動画像における正面画像推定からの衣服領域抽出

金正文, 森本浩介, 山名早人

MIRU2010, IS3-36

Presentation date： 2010

Event date：
2010

　

　
領域分割と色特徴を利用したテンプレートマッチングによる類似物体検出

新井啓介, 森本浩介, 山名早人

MIRU2010,IS2-42

Presentation date： 2010

Event date：
2010

　

　
検索語の重みの2段階調整手法

矢野博也, 中島泰, 山名早人

信学技報

Presentation date： 2010

Event date：
2010

　

　
Similar object detection using template matching focused on positional relationship of feature regions

新井啓介, 森本浩介, 山名早人

情報処理学会研究報告(CD-ROM)

Presentation date： 2010

Event date：
2010

　

　
データアクセスパターンに基づくデータマイニング手法の分類

秋岡明香, 村岡洋一, 山名早人, 中島達夫

第72回情処全大

Presentation date： 2010

Event date：
2010

　

　
安価なWebカメラを用いたModel-Based視線推定

福田崇, 松崎勝彦, 山名早人

信学技報(PRMU)

Presentation date： 2010

Event date：
2010

　

　
Hit Count Dance -検索エンジンのヒット数に関する信頼性検証-

舟橋卓也, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
LittleWeb: 類似ノード集約によるWebグラフ圧縮手法

片瀬弘晶, 上田高徳, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
QueueLinker: パイプライン型アプリケーションのための分散処理フレームワーク

上田高徳, 片瀬弘晶, 森本浩介, 打田研二, 油井誠, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
Unexpected and Interesting: 動画視聴サイトにおける発見性を重視した動画推薦手法の提案

中村智浩, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
WWWにおけるP3Pコンパクトポリシーの利用状況に関する調査

櫻井宏樹, 高木浩光, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
Winnyネットワーク上を流通するコンテンツの傾向と分析

打田研二, 高木浩光, 山崎邦弘, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
アンカーテキストとリンク構造を用いた同義語抽出手法

黒木さやか, 立石健二, 細見格, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
字幕テキストの利用によるブログで引用されたテレビ番組の推定

及川孝徳, 中島泰, 松崎勝彦, 黒木さやか, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
特定言語Webページ収集のためのフォーカストクローラの性能改善手法

詹善斌, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010)

Presentation date： 2010

Event date：
2010

　

　
A Lock-free GCLOCK Page Replacement Algorithm

Presentation date： 2009.12

Event date：
2009.12

　

　

　View Summary

In this paper, we propose a lock-free variant of the GCLOCK page replacement algorithm, named Nb-GCLOCK. Concurrent access to the buffer management module is a major factor that prevents database scalability to processors. Therefore, we propose a non-blocking scheme for bufferfix operations that fix buffer frames for requested pages without locks by combining Nb-GCLOCK and a wait-free hash table. Our experimental results revealed that our scheme can obtain nearly linear scalability to processors up to 64 processors, although the existing locking-based schemes do not scale beyond 16 processors.
Prediction of GPCR ligands by 2-way prediction method

Hiroto Hyakkoku, Minoru Sugihara, Makiko Suwa, Tsuyoshi Kato, Hayato Yamana, Wataru Fujibuchi

IPSJ SIG technical reports Information Processing Society of Japan (IPSJ)

Presentation date： 2009.09

Event date：
2009.09

　

　

　View Summary

G-protein coupled receptors (GPCRs) are important pharmacological targets, and predicting unknown interactions between GPCRs and ligands is one of the most interesting topics in current computational biology. However, the ligands of many GPCRs have not yet been identified experimentally, and it is difficult to predict unknown ligands of GPCRs because training data are insufficient. We have developed a 2-way prediction method based on the support vector machine. In this method, the prediction is performed by using both information of ligands and GPCRs and one can apply this method to the case...
Proposal of Word Salad Spam Detection Method using N-gram and Interrupted Collocations

MORIMOTO Kosuke, KATASE Hiroaki, YAMANA Hayato

情報処理学会研究報告. データベース・システム研究会報告情報処理学会

Presentation date： 2009.07

Event date：
2009.07

　

　

　View Summary

The explosive growth in the number of Web pages has made information on the Internet increasingly important. However, spam pages have also multiplied, lowering the reliability of information obtained from the Internet. Although there are many spamming methods, in this article we focus on "word salad," which generates text automatically, and propose an effective method of word salad detection. We detect word salad using a score based on Kullback-Leibler divergence calculated with n-grams and interrupted collocations. In our experiment, our method improves the F-value by 0.18 points over the existing method.
Resizable-LSH : An Approximate Similarity Search Algorithm for Resizable Range-Search

YAMAZAKI Kunihiro, NAKAMURA Tomohiro, FUNAHASHI Takuya, YAMANA Hayato

情報処理学会研究報告. 情報学基礎研究会報告情報処理学会

Presentation date： 2009.07

Event date：
2009.07

　

　

　View Summary

We introduce an efficient algorithm named "Resizable-LSH" for approximate similarity search, which enables the search range to be resized flexibly. Locality-Sensitive Hashing (LSH) has recently been drawing attention as an efficient algorithm for approximate nearest neighbor search. LSH adopts hash functions that collide with high probability if two vectors are close, so it finds approximate nearest neighbors quickly even if the dataset is high-dimensional. However, LSH must generate its hash tables in advance, so resizing the search range is expensive: the hash tables must be regenerated whenever the search range changes. To solve this problem, our proposed Resizable-LSH retrieves not only data with the same hash value as the query but also data with nearby hash values, thereby achieving resizable range search. Our experiments show that Resizable-LSH runs about 1,000 times faster than LSH that rebuilds its hash tables for each threshold, with almost the same quality.
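The idea of probing nearby hash values instead of rebuilding the table can be sketched as follows. This is only an illustration under stated assumptions, not the authors' Resizable-LSH: it assumes sign-of-random-projection hashes and treats "nearby" as a small Hamming distance between bit signatures. The class name `ProbingLSH` and all parameters are hypothetical.

```python
import itertools
import random
from collections import defaultdict


class ProbingLSH:
    """Single-table LSH that can widen a query by probing nearby buckets."""

    def __init__(self, dim, bits=8, seed=0):
        rng = random.Random(seed)
        # Assumption: random hyperplanes; each contributes one sign bit.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
        self.table = defaultdict(list)

    def _hash(self, vec):
        return tuple(int(sum(p * x for p, x in zip(plane, vec)) >= 0)
                     for plane in self.planes)

    def add(self, key, vec):
        self.table[self._hash(vec)].append(key)

    def query(self, vec, radius=0):
        """Return keys in buckets within `radius` bit flips of the query's bucket."""
        h = self._hash(vec)
        found = []
        for r in range(radius + 1):
            for flips in itertools.combinations(range(len(h)), r):
                probe = tuple(b ^ 1 if i in flips else b for i, b in enumerate(h))
                found.extend(self.table.get(probe, []))
        return found


index = ProbingLSH(dim=3, bits=4, seed=1)
index.add("x", [1.0, 0.0, 0.0])
index.add("y", [-1.0, 0.0, 0.0])
exact = index.query([1.0, 0.0, 0.0], radius=0)   # only the query's own bucket
widest = index.query([1.0, 0.0, 0.0], radius=4)  # every bucket of a 4-bit table
```

The table is built once, and the `radius` argument changes the search range at query time. The number of probed buckets grows combinatorially with `radius`, so a practical design would keep it small.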
Resizable-LSH : An Approximate Similarity Search Algorithm for Resizable Range-Search

YAMAZAKI Kunihiro, NAKAMURA Tomohiro, FUNAHASHI Takuya, YAMANA Hayato

研究報告データベースシステム（DBS）情報処理学会

Presentation date： 2009.07

Event date：
2009.07

　

　

　View Summary

We introduce an efficient algorithm named "Resizable-LSH" for approximate similarity search, which enables the search range to be resized flexibly. Locality-Sensitive Hashing (LSH) has recently been drawing attention as an efficient algorithm for approximate nearest neighbor search. LSH adopts hash functions that collide with high probability if two vectors are close, so it finds approximate nearest neighbors quickly even if the dataset is high-dimensional. However, LSH must generate its hash tables in advance, so resizing the search range is expensive: the hash tables must be regenerated whenever the search range changes. To solve this problem, our proposed Resizable-LSH retrieves not only data with the same hash value as the query but also data with nearby hash values, thereby achieving resizable range search. Our experiments show that Resizable-LSH runs about 1,000 times faster than LSH that rebuilds its hash tables for each threshold, with almost the same quality.
Proposal of Word Salad Spam Detection Method using N-gram and Interrupted Collocations

MORIMOTO Kosuke, KATASE Hiroaki, YAMANA Hayato

研究報告情報学基礎（FI）

Presentation date： 2009.07

Event date：
2009.07

　

　

　View Summary

The explosive growth in the number of Web pages has made information on the Internet increasingly important. However, spam pages have also multiplied, lowering the reliability of information obtained from the Internet. Although there are many spamming methods, in this article we focus on "word salad," which generates text automatically, and propose an effective method of word salad detection. We detect word salad using a score based on Kullback-Leibler divergence calculated with n-grams and interrupted collocations. In our experiment, our method improves the F-value by 0.18 points over the existing method.
Reliability Verification of Search Engines' Hit Count using Multi Query

FUNAHASHI Takuya, SONE Hiroaki, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2009.07

Event date：
2009.07

　

　

　View Summary

A number of studies have used search engines' hit counts. The goal of these studies is to build applications for translation support or natural language processing support. These studies assume that the hit count is reliable. We have been working to verify the reliability of search engines' hit counts. In the past, we verified hit counts using only single-keyword queries. The contribution of this paper is to verify hit counts using multi-keyword queries.
Efficient duplicated URL detection for web crawlers

久保田展行, 上田高徳, 山名早人

DBSJ journal 日本データベース学会

Presentation date： 2009.06

Event date：
2009.06

　

　
Extending ALT algorithm to use multiple landmarks

MATSUNAGA TAKU, HIRATE YU, YAMANA HAYATO

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2009.01

Event date：
2009.01

　

　

　View Summary

Recently, the ALT algorithm was proposed as a speed-up algorithm for computing shortest paths in general graph structures. The ALT algorithm offers a landmark-based heuristic function to estimate distance in A* search. Before computing shortest paths, the ALT algorithm computes the distances between all nodes and the landmarks, and stores them in memory or storage space prepared in advance. However, as the number of landmarks increases, the required space increases linearly. To solve this problem, in this paper, we propose a novel heuristic function for computing shortest paths in general graph structures...
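The baseline ALT scheme described in this summary (precomputed landmark distances used as an A* lower bound via the triangle inequality) can be sketched as below. This is a minimal illustration assuming a connected, undirected graph with non-negative weights. The function names and the toy graph are hypothetical, and the novel heuristic the paper proposes is not shown because the summary is truncated.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from `source` to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def alt_heuristic(landmark_dists, v, target):
    """Lower bound on d(v, target): max over landmarks L of |d(L,target) - d(L,v)|."""
    return max((abs(d[target] - d[v]) for d in landmark_dists), default=0)


def alt_search(graph, landmark_dists, source, target):
    """A* search guided by the landmark heuristic; returns the path length."""
    g = {source: 0}
    heap = [(alt_heuristic(landmark_dists, source, target), source)]
    closed = set()
    while heap:
        _, u = heapq.heappop(heap)
        if u == target:
            return g[u]
        if u in closed:
            continue
        closed.add(u)
        for v, w in graph[u]:
            ng = g[u] + w
            if ng < g.get(v, float("inf")):
                g[v] = ng
                heapq.heappush(heap, (ng + alt_heuristic(landmark_dists, v, target), v))
    return None


# Hypothetical toy graph: adjacency lists of (neighbor, weight).
graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("a", 2), ("c", 1), ("d", 4)],
    "c": [("a", 5), ("b", 1), ("d", 1)],
    "d": [("b", 4), ("c", 1)],
}
# Preprocessing: one full distance table per landmark (here "a" and "d").
landmark_dists = [dijkstra(graph, lm) for lm in ("a", "d")]
shortest = alt_search(graph, landmark_dists, "a", "d")
```

Each landmark adds one distance table covering every node, which is the linear growth in storage that the summary points out.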
The Challenge of Eliminating Storage Bottlenecks in Distributed Systems

Takanori Ueda, Yu Hirate, Hayato Yamana

FIRST INTERNATIONAL WORKSHOP ON SOFTWARE TECHNOLOGIES FOR FUTURE DEPENDABLE DISTRIBUTED SYSTEMS, PROCEEDINGS IEEE COMPUTER SOC

Presentation date： 2009

Event date：
2009

　

　

　View Summary

One of the most difficult problems in distributed systems is load-balancing. Even if we take care of load-balancing, heavily-loaded nodes often occur while there are still lightly-loaded nodes that have idle memory and idle CPU power. Our idea is to exploit this idle memory and idle CPU power to improve the storage performance of heavily-loaded nodes. Idle memory can be used for caching file data and idle CPU power can be used for extracting file access patterns from file access logs. File access patterns are valuable sources for optimizing a cache strategy. Our project goal is to improve the overall performance of distributed systems by improving storage access performance. This paper gives an overview of this project idea and reports the current status of the project. In addition, we show benchmark results from our prototype cache extension system, which is implemented in Linux Kernel 2.6. The DBT-3 (TPC-H) benchmark results show that our system can increase computer speed by a factor of 6.68.
Profiling Node Conditions of Distributed System with Sequential Pattern Mining

Yu Hirate, Hayato Yamana

FIRST INTERNATIONAL WORKSHOP ON SOFTWARE TECHNOLOGIES FOR FUTURE DEPENDABLE DISTRIBUTED SYSTEMS, PROCEEDINGS IEEE COMPUTER SOC

Presentation date： 2009

Event date：
2009

　

　

　View Summary

Recently, with the widespread use of distributed systems, distributed monitoring systems are needed to manage such systems. However, since the monitoring architecture of a distributed system faces a huge amount of log data coming from local computing nodes, information aggregation is a fundamental scheme for monitoring distributed systems. In this paper, we present a novel approach for extracting computing node-condition profiles by using sequential pattern mining, which is one of the data mining techniques. The extracted profiles represent node condition patterns that occur frequently in many computing nodes. Thus, the extracted profiles allow the conditions of a distributed system to be summarized as compact, easily understandable information.
A Scalable Monitoring System for Distributed Environments

Sayaka Akioka, Junichi Ikeda, Takanori Ueda, Yuki Ohno, Midori Sugaya, Yu Hirate, Jiro Katto, Shigeki Goto, Yoichi Muraoka, Hayato Yamana, Tatsuo Nakajima

FIRST INTERNATIONAL WORKSHOP ON SOFTWARE TECHNOLOGIES FOR FUTURE DEPENDABLE DISTRIBUTED SYSTEMS, PROCEEDINGS IEEE COMPUTER SOC

Presentation date： 2009

Event date：
2009

　

　

　View Summary

The total amount of information to process or analyze is jumping sharply with the quick spread of computers and networks. Our project, «Highly scalable monitoring architecture for information explosion», develops a monitoring system that allows observing systems, merging the system logs, and discovering intelligence to share. More concretely, the project builds the total system to maintain, optimize, and protect autonomically. This paper reports the outcomes of the project after the first half of the development period. The rest of the paper is organized as follows. Section 2 describes the concept and details of the monitoring system on a single node, and Section 3 addresses the aggregation of the collected information in distributed environments. Section 4 and Section 5 introduce applications of the monitoring systems. Section 6 summarizes the project and mentions future plans. © 2009 IEEE.
Resizable-LSH: An Approximate Similarity Search Algorithm for Resizable Range-Search

山崎邦弘, 中村智浩, 舟橋卓也, 山名早人

情報処理学会研究報告(CD-ROM)

Presentation date： 2009

Event date：
2009

　

　
QueueLinker: Distributed Producer/Consumer Queue Framework

上田高徳, 片瀬弘晶, 森本浩介, 打田研二, 山名早人

WebDB Forum2009

Presentation date： 2009

Event date：
2009

　

　
A Lock-free GCLOCK Page Replacement Algorithm

油井誠, 宮崎純, 植村俊亮, 加藤博一, 山名早人

情報処理学会論文誌トランザクション(CD-ROM)

Presentation date： 2009

Event date：
2009

　

　
ウィキペディア記事閲覧回数の特徴分析

曽根広哲, 山名早人

Wikimedia Conference Japan 2009

Presentation date： 2009

Event date：
2009

　

　
Prediction of GPCR ligands by 2-way prediction method

百石弘澄, 杉原稔, 諏訪牧子, 加藤毅, 山名早人, 藤渕航

情報処理学会研究報告(CD-ROM)

Presentation date： 2009

Event date：
2009

　

　
Proposal of Word Salad Spam Detection Method using N-gram and Interrupted Collocations

森本浩介, 片瀬弘晶, 山名早人

情報処理学会研究報告(CD-ROM)

Presentation date： 2009

Event date：
2009

　

　
ブログにおける話題語の出現理由の抽出と話題に関する詳細記事推薦

中島泰, 黒木さやか, 櫻井宏樹, 山名早人

第15回Webインテリジェンスとインタラクション研究会

Presentation date： 2009

Event date：
2009

　

　
複数キーワードクエリに対する検索ヒット数の信頼性検証

舟橋卓也, 曽根広哲, 山名早人

信学技報

Presentation date： 2009

Event date：
2009

　

　
Webページ間の関連性の伝播を用いたWebコミュニティ抽出手法の評価

飯村卓也, 平手勇宇, 山名早人

DEIM2009

Presentation date： 2009

Event date：
2009

　

　
印象語からの概念推定システム

永井洋平, 黒木さやか, 山名早人

信学技報(Webインテリジェンスとインタラクション研究会)

Presentation date： 2009

Event date：
2009

　

　
核となるアイテムセットによる頻出アイテムセット抽出数削減手法

松崎勝彦, 平手勇宇, 山名早人

DEIM2009

Presentation date： 2009

Event date：
2009

　

　
検索ヒット数のクラスタリングを用いた補正手法の検討

舟橋卓也, 平手勇宇, 山名早人

DEIM2009

Presentation date： 2009

Event date：
2009

　

　
商用検索エンジンにランキングされたサイトのランク変動パターンの解析

吉田泰明, 平手勇宇, 山名早人

DEIM2009

Presentation date： 2009

Event date：
2009

　

　
Exploiting idle CPU cores to improve file access performance

Takanori Ueda, Yu Hirate, Hayato Yamana

Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, ICUIMC'09

Presentation date： 2009

Event date：
2009

　

　

　View Summary

Many-core CPUs require many parallel computation tasks to reach their full potential because CPU cores become idle if they do not have enough computation tasks. How best to utilize a number of cores in many-core CPUs should be examined. In this paper, we propose exploitation of idle cores for improving file access performance. Idle cores are used to extract file access patterns from access logs and the extracted patterns are used to improve file cache efficiency by reordering the LRU (Least Recently Used) list based on the extracted patterns. Data mining techniques are used to extract access patterns to reduce computation overhead. Our method was evaluated by simulation and also implemented on Linux kernel 2.6.26 as a prototype system. In the simulation experiment, our method improved the cache-hit ratio up to 1.09% on DBT-2 (TPC-C) trace logs. Our prototype implementation on Linux improves DBT-2 performance up to 5.24% on a real machine. Copyright 2009 ACM.
Implementing and Evaluating Graph Engine for Large Scale Graphs

MATSUNAGA Taku, KATASE Hiroaki, UEDA Takanori, KUBOTA Nobuyuki, MORIMOTO Kosuke, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2008.11

Event date：
2008.11

　

　
Improvement in speed and accuracy of multiple sequence alignment program PRIME

Shinsuke Yamada, Osamu Gotoh, Hayato Yamana

IPSJ Transactions on Bioinformatics 一般社団法人情報処理学会

Presentation date： 2008.11

Event date：
2008.11

　

　

　View Summary

Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. We have developed an MSA program PRIME, whose crucial feature is the use of a group-to-group sequence alignment algorithm with a piecewise linear gap cost. We have shown that PRIME is one of the most accurate MSA programs currently available. However, PRIME is slower than other leading MSA programs. To improve computational performance, we newly incorporate anchoring and grouping heuristics into PRIME. An anchoring method is to locate well-conserved regions in a given MSA as anchor points to reduce the region of DP matrix to be examined, while a grouping method detects conserved subfamily alignments specified by phylogenetic tree in a given MSA to reduce the number of iterative refinement steps. The results of BAliBASE 3.0 and PREFAB 4 benchmark tests indicated that these heuristics contributed to reduction in the computational time of PRIME by more than 60% while the average alignment accuracy measures decreased by at most 2%. Additionally, we evaluated the effectiveness of iterative refinement algorithm based on maximal expected accuracy (MEA). Our experiments revealed that when many sequences are aligned, the MEA-based algorithm significantly improves alignment accuracy compared with the standard version of PRIME at the expense of a considerable increase in computation time. © 2008 Information Processing Society of Japan.
Search Engines' Trustworthiness(<Special Issue>Trust Assessment of Web Information)

Yamana Hayato

Journal of Japanese Society for Artificial Intelligence 社団法人人工知能学会

Presentation date： 2008.11

Event date：
2008.11

　

　
Reliability Verification of Search Engines' Hit Count

FUNAHASHI Takuya, UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2008.09

Event date：
2008.09

　

　

　View Summary

A number of studies have used search engines' hit counts. The goal of these studies is to build applications for translation support or natural language processing support. These studies assume that the hit count is reliable. However, none of the studies has verified the reliability of search engines' hit counts. If the hit count is unreliable, studies using the hit count also become unreliable. The purpose of this paper is to verify the reliability of search engines' hit counts. In this experiment, we used the Search APIs provided by Google, Yahoo! Japan and Live Search. Furthermore, we r...
Web Community Extraction Method with Web Pages' Relevance Forwarding

IIMURA Takuya, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2008.09

Event date：
2008.09

　

　

　View Summary

To find information in a large collection of Web pages, several methods for extracting Web communities have been proposed. Past studies succeeded in improving precision by applying strict rules for deciding whether to include a given Web page in a Web community. However, recall may worsen because Web pages that should be included in the Web community are left out. In this paper, we propose a Web community extraction method that can improve recall without decreasing precision. The method adds Web pages that have many links from/to the Web pages in a sam...
Dynamic I/O Optimization with Access Pattern Mining at OS Level

UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2008.09

Event date：
2008.09

　

　

　View Summary

Many-core CPUs improve parallel performance but also raise the problem of a storage performance bottleneck. I/O optimization should be performed at the operating system level because various applications run in parallel in a many-core CPU environment and I/O optimization requires cross-cutting knowledge about applications. We propose a new method that uses disk access patterns to improve the efficiency of the disk cache replacement algorithm. Our method is implemented on Linux 2.6.26 and extracts access patterns from the file access logs of applications. The experimental results show our method improv...
Message from the MAW 2008 co-chairs

Takahiro Hara, Yanchun Zhang, William K. Cheung, Shengrui Wang, Hayato Yamana, Kin Fun Li, Laurence T. Yang

Proceedings - International Conference on Advanced Information Networking and Applications, AINA

Presentation date： 2008.09

Event date：
2008.09

　

　
Analyzing geographical location and number of back-links of web servers all over the world

平手勇宇, 片瀬弘晶, 山名早人

Journal of the DBSJ 日本データベース学会

Presentation date： 2008.09

Event date：
2008.09

　

　
Measuring Editor's Trustworthiness in Wikipedia by Using Edit History without Depending on the Edit Frequency

SAKURAI Hiroki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

Technical report of IEICE. PRMU 社団法人電子情報通信学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Wikipedia, the free encyclopedia on the Internet, contains 490,000 Japanese articles, and its volume of information is huge and useful. However, the trustworthiness of the articles' contents is uncertain because anyone can edit them easily. In other words, because we cannot know exactly who edits the contents, it is difficult to measure the trustworthiness of the articles' contents. To address this problem, a method that uses the edit history has been proposed for measuring the trustworthiness of articles. However, that method is unsuitable for articles and editors with a low edit freque...
Gathering Over 10 Billion of Web Pages and its Applications

YAMANA Hayato

Technical report of IEICE. PRMU 社団法人電子情報通信学会

Presentation date： 2008.06

Event date：
2008.06

　

　
Influence of Wikipedia on Search Engine Rankings

SONE Hiroaki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

Technical report of IEICE. PRMU 社団法人電子情報通信学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Search engines are necessary for finding information on the Internet. According to one survey report, users consider information from search engine results to be as believable as information from television. That is, we can understand part of the influence that a given site has on society by examining search engine rankings. In this paper, we conducted experiments to examine the influence of Wikipedia, which has become synonymous with encyclopedias on the Internet. We collected the ranking positions of articles in the Japanese version of Wikipedia by using sea...
Temporal Clustering of Internet News Articles with Excluding Single Articles

NAKAMURA Tomohiro, HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

Technical report of IEICE. PRMU 社団法人電子情報通信学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Clustering Internet news articles makes it possible to detect various kinds of useful information, such as related articles and the latest topic words. This area has been widely researched since the TDT project. Conventional clustering methods have difficulty detecting a single article as a single cluster even though many single articles exist. In this paper, we propose a method for clustering news articles that excludes single articles in advance by using proper noun information, topographic information and other characteristics that distinguish single from non-single articles. In the evaluation, we use half a y...
Measuring Editor's Trustworthiness in Wikipedia by Using Edit History without Depending on the Edit Frequency

SAKURAI Hiroki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Wikipedia, the free encyclopedia on the Internet, contains 490,000 Japanese articles, and its volume of information is huge and useful. However, the trustworthiness of the articles' contents is uncertain because anyone can edit them easily. In other words, because we cannot know exactly who edits the contents, it is difficult to measure the trustworthiness of the articles' contents. To address this problem, a method that uses the edit history has been proposed for measuring the trustworthiness of articles. However, that method is unsuitable for articles and editors with a low edit freque...
Gathering Over 10 Billion of Web Pages and its Applications

YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

The number of Web pages distributed from Web servers was estimated at about 53.7 billion as of Oct. 2005. We gathered 14,456,201,906 Web pages from 5,548 Web servers between Jan. 2004 and July 2006. This was conducted as part of the e-Society project, one of the leading projects of MEXT (Ministry of Education, Culture, Sports, Science and Technology). Speeding up Web crawling conflicts with Web-site-friendly crawling; however, both are indispensable for gathering Web pages. In the project, we have studied and proposed a dynamic delay adjustment scheme for accessing Web servers to prevent...
Influence of Wikipedia on Search Engine Rankings

SONE Hiroaki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Search engines are necessary for finding information on the Internet. According to one survey report, users consider information from search engine results to be as believable as information from television. That is, we can understand part of the influence that a given site has on society by examining search engine rankings. In this paper, we conducted experiments to examine the influence of Wikipedia, which has become synonymous with encyclopedias on the Internet. We collected the ranking positions of articles in the Japanese version of Wikipedia by using sea...
Temporal Clustering of Internet News Articles with Excluding Single Articles

NAKAMURA Tomohiro, HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Clustering Internet news articles makes it possible to detect various kinds of useful information, such as related articles and the latest topic words. This area has been widely researched since the TDT project. Conventional clustering methods have difficulty detecting a single article as a single cluster even though many single articles exist. In this paper, we propose a method for clustering news articles that excludes single articles in advance by using proper noun information, topographic information and other characteristics that distinguish single from non-single articles. In the evaluation, we use half a y...
OS Level I/O Optimization in the Many-Core Era

UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Many-core CPUs with a dozen or more cores in one package will certainly appear in the near future. In many-core environments, storage will become an even greater bottleneck because many applications run in parallel, which increases the frequency of random access, a pattern that HDDs handle especially poorly. To address this problem, we aim to ease the storage bottleneck in software. Specifically, one of our goals is to implement in Linux a disk cache mechanism that exploits the access patterns of applications and to evaluate it on a real system. In this workshop talk, we give an overview of our research so far and related work, present our latest results, and outline our future research direction.
Geographical Location and Number of Back-Links of Web Servers All Over the World

HIRATE Yu, KATASE Hiroaki, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

According to our investigation in Oct. 2005, the number of Web pages all over the world is estimated at 53.7 billion. We have investigated the TLD distribution and language distribution of Web pages based on a dataset of 10.7 billion Web pages. In this paper, as part of our series of Web statistics investigations, we analyzed three kinds of distribution based on the 10.7 billion Web page dataset: the geographical location of Web servers, the number of virtual hosts per Web server, and the number of back-links, i.e. the in-degree, per Web server. Our results show (1) about 95.5% of...
Measuring Editor's Trustworthiness in Wikipedia by Using Edit History without Depending on the Edit Frequency

SAKURAI Hiroki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. DE, データ工学 The Institute of Electronics, Information and Communication Engineers

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Wikipedia, the free encyclopedia on the Internet, contains 490,000 Japanese articles, and its volume of information is huge and useful. However, the trustworthiness of the articles' contents is uncertain because anyone can edit them easily. In other words, because we cannot know exactly who edits the contents, it is difficult to measure the trustworthiness of the articles' contents. To address this problem, a method that uses the edit history has been proposed for measuring the trustworthiness of articles. However, that method is unsuitable for articles and editors with a low edit frequency. Therefore, in this paper, we propose a method for measuring editors' trustworthiness that does not depend on the edit frequency. The proposed method is based on the ratio at which an editor's edits remain in the latest version of the contents. Our evaluation shows that the proposed method rates highly reliable editors high and unreliable editors low, without depending on the edit frequency.
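The ratio-based idea in this summary can be illustrated with a minimal sketch. It assumes a simplified, hypothetical data model in which each edit contributes a list of text fragments and the latest version is a set of fragments; the function name `editor_survival_ratio` and this representation are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

def editor_survival_ratio(edits, latest_fragments):
    """Return, per editor, the fraction of contributed fragments that
    still appear in the latest version (hypothetical data model)."""
    contributed = defaultdict(int)
    surviving = defaultdict(int)
    for editor, fragments in edits:
        for fragment in fragments:
            contributed[editor] += 1
            if fragment in latest_fragments:
                surviving[editor] += 1
    # The ratio is normalized by each editor's own contribution count,
    # so it does not grow simply because an editor edits often.
    return {editor: surviving[editor] / contributed[editor] for editor in contributed}
```

Because the score is a ratio rather than a count, an editor with three surviving fragments out of three scores higher than one with one surviving fragment out of four, regardless of who edited more.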
Geographical Location and Number of Back-Links of Web Servers All Over the World

HIRATE Yu, KATASE Hiroaki, YAMANA Hayato

IPSJ SIG Notes Information Processing Society of Japan (IPSJ)

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

According to our investigation in Oct. 2005, the number of Web pages all over the world is estimated at 53.7 billion. We have investigated the TLD distribution and language distribution of Web pages based on a dataset of 10.7 billion Web pages. In this paper, as part of our series of Web statistics investigations, we analyzed three kinds of distribution based on the 10.7 billion Web page dataset: the geographical location of Web servers, the number of virtual hosts per Web server, and the number of back-links, i.e. the in-degree, per Web server. Our results show (1) about 95.5% of Web servers are located in North America, Europe, and Asia, (2) hosts located in Latin America and Eastern Europe have a large number of virtual hosts, and (3) the relationship between in-degree and the number of Web servers follows a power law.
Temporal Clustering of Internet News Articles with Excluding Single Articles

NAKAMURA Tomohiro, HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 The Institute of Electronics, Information and Communication Engineers

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Clustering Internet news articles makes it possible to detect various kinds of useful information, such as related articles and the latest topic words. This area has been widely researched since the TDT project. Conventional clustering methods have difficulty detecting a single article as a single cluster even though many single articles exist. In this paper, we propose a method for clustering news articles that excludes single articles in advance by using proper noun information, topographic information and other characteristics that distinguish single from non-single articles. In the evaluation, we use half a year of Japanese news articles. Compared to the Single-Link Method, which alone has difficulty judging whether an article is single, our proposed method improves precision by 10.2% and reduces the computation time to approximately one third.
Influence of Wikipedia on Search Engine Rankings

SONE Hiroaki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 The Institute of Electronics, Information and Communication Engineers

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Search engines are necessary for finding information on the Internet. According to one survey report, users consider information from search engine results to be as believable as information from television. That is, we can understand part of the influence that a given site has on society by examining search engine rankings. In this paper, we conducted experiments to examine the influence of Wikipedia, which has become synonymous with encyclopedias on the Internet. We collected the ranking positions of articles in the Japanese version of Wikipedia by using search engines' APIs. Among all articles in the Japanese version of Wikipedia, about 90% were ranked in the top 10 by Yahoo! JAPAN and Google, and about 70% were ranked in the top 10 by MSN. In the case of Yahoo! JAPAN and MSN, newly created Wikipedia articles tend to appear in the top 10 from the start and keep their high rankings. Our experiments confirmed the influence of Wikipedia on search engine rankings.
Temporal Clustering of Internet News Articles with Excluding Single Articles

NAKAMURA Tomohiro, HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. DE, データ工学

Presentation date： 2008.06

Event date：
2008.06

　

　
OS Level I/O Optimization in the Many-Core Era

UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Many-core CPUs with a dozen or more cores in one package will certainly appear in the near future. In many-core environments, storage will become an even greater bottleneck because many applications run in parallel, which increases the frequency of random access, a pattern that HDDs handle especially poorly. To address this problem, we aim to ease the storage bottleneck in software. Specifically, one of our goals is to implement in Linux a disk cache mechanism that exploits the access patterns of applications and to evaluate it on a real system. In this workshop talk, we give an overview of our research so far and related work, present our latest results, and outline our future research direction.
Influence of Wikipedia on Search Engine Rankings

SONE Hiroaki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. DE, データ工学 The Institute of Electronics, Information and Communication Engineers

Presentation date： 2008.06

Event date：
2008.06

　

　

　View Summary

Search engines are necessary for finding information on the Internet. According to one survey report, users consider information from search engine results to be as believable as information from television. That is, we can understand part of the influence that a given site has on society by examining search engine rankings. In this paper, we conducted experiments to examine the influence of Wikipedia, which has become synonymous with encyclopedias on the Internet. We collected the ranking positions of articles in the Japanese version of Wikipedia by using search engines' APIs. Among all articles in the Japanese version of Wikipedia, about 90% were ranked in the top 10 by Yahoo! JAPAN and Google, and about 70% were ranked in the top 10 by MSN. In the case of Yahoo! JAPAN and MSN, newly created Wikipedia articles tend to appear in the top 10 from the start and keep their high rankings. Our experiments confirmed the influence of Wikipedia on search engine rankings.
5L-1 Analysis of the TLD and Language Distribution of Web Pages Worldwide (Leading Project e-Society: Web Archiving and Web Data Analysis Technology, General Session, Leading Project e-Society)

平手勇宇, 山名早人

全国大会講演論文集一般社団法人情報処理学会

Presentation date： 2008.03

Event date：
2008.03

　

　
3ZK-10 A System for Finding Shortest Paths Between Web Pages

Matsunaga Taku, Hirate Yu, Yamana Hayato

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 2008.03

Event date：
2008.03

　

　

　View Summary

According to our investigation in Oct. 2005, the number of Web pages all over the world is estimated at 53.7 billion. We have investigated the TLD distribution and language distribution of Web pages based on a dataset of 10.7 billion Web pages. In this paper, as part of our series of Web statistics investigations, we analyzed three kinds of distribution based on the 10.7 billion Web page dataset: the geographical location of Web servers, the number of virtual hosts per Web server, and the number of back-links, i.e. the in-degree, per Web server. Our results show (1) about 95.5% of...
A Similar Source Code Search System Using Program Code Abstraction

黒木さやか, 上田高徳, 平手勇宇, 山名早人

DEWS2008

Presentation date： 2008

Event date：
2008

　

　
Constructing a Reduced Web Link Structure for Speeding Up Link Structure Analysis Algorithms

片瀬弘晶, 松永拓, 上田高徳, 田代崇, 平手勇宇, 山名早人

DEWS2008

Presentation date： 2008

Event date：
2008

　

　
Evaluation of EPCI, a Similar Text Search System Using Search Engines

田代崇, 上田高徳, 平手勇宇, 山名早人

DEWS2008

Presentation date： 2008

Event date：
2008

　

　
Collection and Analysis of the Lower-Ranked Results That Cannot Be Obtained from Commercial Search Engine Results

舟橋卓也, 上田高徳, 平手勇宇, 山名早人

DEWS2008

Presentation date： 2008

Event date：
2008

　

　
Analysis of the Language Distribution of Web Sites Worldwide and of the Links and Geographical Locations of Web Sites Containing Japanese

童芳, 平手勇宇, 山名早人

DEWS2008

Presentation date： 2008

Event date：
2008

　

　
Analysis of the TLD and Language Distribution of Web Pages Worldwide

平手勇宇, 山名早人

第70回情処全大

Presentation date： 2008

Event date：
2008

　

　
A High-Precision Method for Extracting Expressions That Denote Attributes or Parts of Evaluation Targets in Reputation Information

臼渕護, 平手勇宇, 山名早人

言語処理学会第14回年次大会(NLP2008)

Presentation date： 2008

Event date：
2008

　

　
Realizing a Content Delivery System with the Distributed Meta P2P Storage "DiMPS"

岡本雄太, 山名早人

DEWS2008

Presentation date： 2008

Event date：
2008

　

　
EReM-DiCE: Exploiting Remote Memory for Disk Cache Extension

Takanori UEDA, Yu HIRATE, Hayato YAMANA

Proc. of 1st International Workshop on Storage and I/O Virtualization, Performance, Energy, Evaluation and Dependability (SPEED2008)

Presentation date： 2008

Event date：
2008

　

　
Sequential pattern mining with time intervals

Yu Hirate, Hayato Yamana

ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS SPRINGER-VERLAG BERLIN

Presentation date： 2006

Event date：
2006

　

　

　View Summary

Sequential pattern mining can be used to extract frequent sequences maintaining their transaction order. As conventional sequential pattern mining methods do not consider transaction occurrence time intervals, it is impossible to predict the time intervals of any two transactions extracted as frequent sequences. Thus, from extracted sequential patterns, although users are able to predict what events will occur, they are not able to predict when the events will occur. Here, we propose a new sequential pattern mining method that considers time intervals. Using Japanese earthquake data, we confirmed that our method is able to extract new types of frequent sequences that are not extracted by conventional sequential pattern mining methods.
Text Mining using PrefixSpan constrained by Item Interval and Item Attribute

Issei Sato, Yu Hirate, Hayato Yamana

ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops Institute of Electrical and Electronics Engineers Inc.

Presentation date： 2006

Event date：
2006

　

　

　View Summary

Applying conventional sequential pattern mining methods to text data extracts many uninteresting patterns, which increases the time to interpret the extracted patterns. To solve this problem, we propose a new sequential pattern mining algorithm by adopting the following two constraints. One is to select sequences with regard to item intervals (the number of items between any two adjacent items in a sequence), and the other is to select sequences with regard to item attributes. Using Amazon customer reviews in the book category, we have confirmed that our method is able to extract patterns faster than the conventional method, and is better able to exclude uninteresting patterns while retaining the patterns of interest.
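As a rough illustration of the item interval constraint described in this summary, the sketch below checks whether a pattern occurs in a word sequence with at most a given number of items between adjacent pattern items. The function name `occurs_with_max_gap` and the `max_gap` parameter are hypothetical; this is only a constraint check under those assumptions, not PrefixSpan and not the authors' algorithm.

```python
def occurs_with_max_gap(sequence, pattern, max_gap):
    """Return True if `pattern` occurs in `sequence` as a subsequence in which
    at most `max_gap` items lie between any two adjacent pattern items."""
    def search(seq_pos, pat_pos, last_match):
        if pat_pos == len(pattern):
            return True
        # The first pattern item may match anywhere; later items must fall
        # within max_gap + 1 positions after the previous match.
        if last_match is None:
            end = len(sequence)
        else:
            end = min(len(sequence), last_match + max_gap + 2)
        for i in range(seq_pos, end):
            if sequence[i] == pattern[pat_pos] and search(i + 1, pat_pos + 1, i):
                return True
        return False
    return search(0, 0, None)
```

For the review fragment "this book is really very good", the pattern ("book", "good") has three items between its two words, so it is accepted with `max_gap=3` and rejected with `max_gap=2`.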
Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost

Shinsuke Yamada, Osamu Gotoh, Hayato Yamana

BMC Bioinformatics

Presentation date： 2006

Event date：
2006

　

　

　View Summary

Background: Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. In the alignment of a family of protein sequences, global MSA algorithms perform better than local ones in many cases, while local ones perform better than global ones when some sequences have long insertions or deletions (indels) relative to others. Many recent leading MSA algorithms have incorporated pairwise alignment information obtained from a mixture of sources into their scoring system to improve accuracy of alignment containing long indels. Results: We propose a novel group-to-group sequence alignment algorithm that uses a piecewise linear gap cost. We developed a program called PRIME, which employs our proposed algorithm to optimize the well-defined sum-of-pairs score. PRIME stands for Profile-based Randomized Iteration MEthod. We evaluated PRIME and some recent MSA programs using BAliBASE version 3.0 and PREFAB version 4.0 benchmarks. The results of benchmark tests showed that PRIME can construct accurate alignments comparable to the most accurate programs currently available, including L-INS-i of MAFFT, ProbCons, and T-Coffee. Conclusion: PRIME enables users to construct accurate alignments without having to employ pairwise alignment information. PRIME is available at http://prime.cbrc.jp/. © 2006 Yamada et al licensee BioMed Central Ltd.
Prediction of domain and disordered regions in proteins by fold recognition and secondary structure prediction

Masatoshi Takizawa, Naoko Inoue, Kentaro Tomii, Hayato Yamana, Tamotsu Noguchi

Critical Assessment of Techniques for Protein Structure Prediction Seventh Meeting

Presentation date： 2006

Event date：
2006

　

　
Automatic extraction of conserved region from alignment based on protein structure

Shinsuke Yamada, Kouratou Yamada, Hayato Yamana, Tamotsu Noguchi

EABS & BSJ 2006

Presentation date： 2006

Event date：
2006

　

　
Web Structure in 2005

Yu Hirate, Hayato Yamana

WAW2006, Banff

Presentation date： 2006

Event date：
2006

　

　
Contour Extraction using Texture and Non-Texture Distinction

IGUCHI Shigeru, YAMANA Hayato

Technical report of IEICE. PRMU 社団法人電子情報通信学会

Presentation date： 2005.11

Event date：
2005.11

　

　

　View Summary

This paper proposes a technique that divides an image into a texture region and a non-texture region and then applies a suitable contour extraction method to each region to improve the accuracy of contour extraction. The most basic approach to contour extraction is edge detection with derivative filters; however, edges do not necessarily coincide with borderlines. Thus, texture analysis is essential to obtain an accurate result. Most conventional studies apply either edge detection or texture analysis to the whole image. In contrast, in this paper, we first extract a texture region a...
Sample Collection System for Online Handwritten Mathematical Expressions written by Digital Pen and Preliminary Recognition Experiments

KASUYA Yuji, YAMANA Hayato

Technical report of IEICE. PRMU 社団法人電子情報通信学会

Presentation date： 2005.10

Event date：
2005.10

　

　

　View Summary

This paper proposes a sample collection system for online handwritten mathematical expressions based on digital pens. Prior online handwritten character recognition systems have used samples collected with pen tablets. However, pen tablet data are (1) difficult to collect because users are not familiar with pen tablets and (2) different from real handwriting because users have to look at their monitors while writing characters. In contrast, we use digital pens, which are easy to use even for first-time users, and collect samples written by 74 examinees. By recognition experiments following facts...
C-013 A Consideration on Thread-Level Speculative Execution

SAITO Fumiko, YAMANA Hayato

情報科学技術フォーラム一般講演論文集 FIT(電子情報通信学会・情報処理学会)推進委員会

Presentation date： 2005.08

Event date：
2005.08

　

　
Sequential Pattern Mining based on Event Intervals

HIRATE YU, KOMATSU SHUNSUKE, YAMANA HAYATO

IPSJ SIG Notes Information Processing Society of Japan (IPSJ)

Presentation date： 2005.07

Event date：
2005.07

　

　

　View Summary

In data mining research, sequential pattern mining extracts frequent sequences while keeping the order in which their events occur. Because conventional sequential pattern mining methods do not consider the time intervals between event occurrences, it is impossible to know the time interval between any two events included in the resulting sequences. In this paper, we propose a new sequential pattern mining method that considers event occurrence time intervals. In an evaluation using earthquake data, we confirmed that our new method can extract new kinds of frequent sequences that could not be extrac...
Sequential Pattern Mining based on Event Intervals

HIRATE YU, KOMATSU SHUNSUKE, YAMANA HAYATO

IPSJ SIG Notes Information Processing Society of Japan (IPSJ)

Presentation date： 2005.07

Event date：
2005.07

　

　

　View Summary

In data mining research, sequential pattern mining extracts frequent sequences while keeping the order in which their events occur. Because conventional sequential pattern mining methods do not consider the time intervals between event occurrences, it is impossible to know the time interval between any two events included in the resulting sequences. In this paper, we propose a new sequential pattern mining method that considers event occurrence time intervals. In an evaluation using earthquake data, we confirmed that our new method can extract new kinds of frequent sequences that could not be extracted by conventional sequential pattern mining methods.
Sequential Pattern Mining based on Event Intervals

HIRATE Yu, KOMATSU Shunsuke, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2005.07

Event date：
2005.07

　

　

　View Summary

In data mining research, sequential pattern mining extracts frequent sequences while keeping the order in which their events occur. Because conventional sequential pattern mining methods do not consider the time intervals between event occurrences, it is impossible to know the time interval between any two events included in the resulting sequences. In this paper, we propose a new sequential pattern mining method that considers event occurrence time intervals. In an evaluation using earthquake data, we confirmed that our new method can extract new kinds of frequent sequences that could not be extrac...
From the Search Engine to the Analysis Engine

Yamana Hayato

Journal of Japanese Society for Artificial Intelligence 社団法人人工知能学会

Presentation date： 2005.07

Event date：
2005.07

　

　
TF2P-growth: Frequent Itemset Mining Algorithm without Any Thresholds

HIRATE YU, IWAHASHI EIGO, YAMANA HAYATO

情報処理学会論文誌データベース（TOD）一般社団法人情報処理学会

Presentation date： 2005.06

Event date：
2005.06

　

　

　View Summary

Conventional frequent itemset mining algorithms require a user-specified minimum support and then mine frequent itemsets whose support values are higher than the minimum support. As it is difficult to predict how many frequent itemsets will be mined with a specified minimum support, the Top-κ mining concept has been proposed. Top-κ mining finds frequent itemsets without a minimum support; instead, it returns the κ most frequent itemsets ordered by their support values. However, the Top-κ mining concept still requires a threshold κ, so users must decide the value of κ before initiating mining. In this paper, we propose a new mining algorithm, called "TF²P-growth," which does not require any thresholds. This algorithm mines itemsets in descending order of their support values without any thresholds and returns frequent itemsets to users sequentially with a short response time.
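To make the threshold-free interface concrete, here is a small sketch that yields itemsets in descending order of support so that the caller decides when to stop. It is a brute-force enumeration for tiny inputs only, with the hypothetical name `itemsets_by_descending_support`; it is not TF²P-growth and, unlike the algorithm described in the summary, it counts everything before returning the first result.

```python
from itertools import combinations

def itemsets_by_descending_support(transactions):
    """Yield (itemset, support) pairs in descending order of support.
    Brute-force enumeration, practical only for tiny inputs."""
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] = counts.get(itemset, 0) + 1
    # Sort by support (descending), then by itemset for a deterministic order.
    ordered = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    for itemset, support in ordered:
        yield itemset, support
```

A caller can consume results with `next()` and stop at any point, which mirrors the idea of needing neither a minimum support nor a fixed κ.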
The Current Status of the Art of the 21st COE Programs in the Information Sciences Field (1) Productive ICT Academia Project

上田和紀, 大石進一, 甲藤二郎, 中島達夫, 村岡洋一, 山名早人

情報処理 Information Processing Society of Japan (IPSJ)

Presentation date： 2005.04

Event date：
2005.04

　

　
10. Productive ICT Academia Project

UEDA Kazunori, OISHI Shinichi, KATTO Jiro, NAKAJIMA Tatsuo, MURAOKA Yoichi, YAMANA Hayato

Journal of Information Processing Society of Japan 一般社団法人情報処理学会

Presentation date： 2005.04

Event date：
2005.04

　

　
Defense against Buffer Overflow by Segmenting Stack Frame

HIRUTA Tomonori, YAMANA Hayato

Information Processing Society of Japan (IPSJ)

Presentation date： 2005.03

Event date：
2005.03

　

　

　View Summary

In recent years, buffer overflow attacks have been increasing. A buffer overflow is caused by inputting data larger than the data space prepared for variables. The most dangerous buffer overflow is the stack overflow. When a stack overflow occurs, the return address is overwritten and malicious code becomes executable. This paper proposes a defense technique against buffer overflow that segments the stack frame. We implement this technique on the SimpleScalar tool set ver. 3.0d and evaluate it with SPEC CINT95. The evaluation results show that this technique causes a 2.3% performance degradation.
Defense against Buffer Overflow by Segmenting Stack Frame

Hiruta Tomonori, Yamana Hayato

情報処理学会研究報告. SLDM, [システムLSI設計技術] 一般社団法人情報処理学会

Presentation date： 2005.03

Event date：
2005.03

　

　

　View Summary

In recent years, buffer overflow attacks have been increasing. A buffer overflow is caused by inputting data larger than the data space prepared for variables. The most dangerous buffer overflow is the stack overflow. When a stack overflow occurs, the return address is overwritten and malicious code becomes executable. This paper proposes a defense technique against buffer overflow that segments the stack frame. We implement this technique on the SimpleScalar tool set ver. 3.0d and evaluate it with SPEC CINT95. The evaluation results show that this technique causes a 2.3% performance degradation.
Defense against Buffer Overflow by Segmenting Stack Frame

Hiruta Tomonori, Yamana Hayato

IEICE technical report. Computer systems 社団法人電子情報通信学会

Presentation date： 2005.03

Event date：
2005.03

　

　

　View Summary

In recent years, buffer overflow attacks have been increasing. A buffer overflow is caused by inputting data larger than the data space prepared for variables. The most dangerous buffer overflow is the stack overflow. When a stack overflow occurs, the return address is overwritten and malicious code becomes executable. This paper proposes a defense technique against buffer overflow that segments the stack frame. We implement this technique on the SimpleScalar tool set ver. 3.0d and evaluate it with SPEC CINT95. The evaluation results show that this technique causes a 2.3% performance degradation.
MPIETE2: Improvement of the MPI Execution Time Estimator in Prediction Error of Communication Time

IWABUCHI Toshihiro, SUGITA Shu, YAMANA Hayato

IPSJ SIG Notes Information Processing Society of Japan (IPSJ)

Presentation date： 2005.03

Event date：
2005.03

　

　

　View Summary

In this paper, we improve the MPI Execution Time Estimator (MPIETE) to reduce the prediction error of communication time. MPIETE, which we proposed previously, is an execution time estimation tool for MPI programs. MPIETE divides an MPI program into computation blocks and communication blocks, and then predicts the total execution time by summing the execution time of each block. Since estimating the block execution time is fast, MPIETE can predict the total execution time faster than actually executing the MPI program. However, MPIETE assumes no network contention, which causes errors when predicting the delay time under network contention. In this paper, we improve MPIETE by proposing a new estimation scheme for communication blocks that includes this delay time. The proposed scheme makes it possible to predict performance degradation and to find the number of processing units (PUs) at which the target platform achieves its best performance. We evaluated MPIETE2, which extends MPIETE with the proposed scheme, using EP, CG, FT, and MG from the NAS Parallel Benchmarks 2.4. For 2 to 128 PUs, the prediction errors are less than 14% and the prediction takes 1/4 of the actual execution time. Moreover, MPIETE2 exactly predicts the number of PUs at which the target platform achieves its best performance.
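
The block-summing idea in this abstract can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the `Block` type, the `contention_delay` field, and every number below are hypothetical and are not taken from MPIETE2 itself.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str                       # "comp" (computation) or "comm" (communication)
    base_time: float                # estimated time without contention, in seconds
    contention_delay: float = 0.0   # hypothetical extra delay, used for "comm" only


def predict_total_time(blocks):
    """Sum per-block estimates; communication blocks also add their delay."""
    total = 0.0
    for b in blocks:
        total += b.base_time
        if b.kind == "comm":
            total += b.contention_delay
    return total


def best_pu_count(blocks_by_pu):
    """Return the PU count with the smallest predicted total time."""
    return min(blocks_by_pu, key=lambda pu: predict_total_time(blocks_by_pu[pu]))


# Hypothetical per-PU-count profiles (not measured data).
profiles = {
    2: [Block("comp", 8.0), Block("comm", 0.5, 0.0)],
    4: [Block("comp", 4.0), Block("comm", 0.5, 0.5)],
    8: [Block("comp", 2.0), Block("comm", 0.5, 3.0)],
}
```

With these made-up profiles, computation time halves as PUs double while the contention delay grows, so the predicted best configuration is not the largest one.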
MPIETE2 : Improvement of the MPI Execution Time Estimator in Prediction Error of Communication Time

IWABUCHI Toshihiro, SUGITA Shu, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2005.03

Event date：
2005.03

　

　

　View Summary

In this paper, we improve the MPI Execution Time Estimator (MPIETE) to reduce the prediction error of communication time. MPIETE, which we proposed previously, is an execution time estimation tool for MPI programs. MPIETE divides an MPI program into computation blocks and communication blocks, and then predicts the total execution time by summing the execution time of each block. Since estimating the block execution time is fast, MPIETE can predict the total execution time faster than actually executing the MPI program. However, MPIETE assumes no network contention. This results in so...
The Proposal of Tri-Mode Branch Predictor

SAITO FUMIKO, YAMANA HAYATO

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2005.03

Event date：
2005.03

　

　

　View Summary

Branch prediction is implemented in recent processors to avoid pipeline stalls. It is a kind of speculative execution for control dependences. In recent years, as pipelines have become deeper, the branch misprediction penalty has grown. Thus, the branch misprediction rate must be lowered to raise processor performance. Recently, many branch predictors have been proposed. Hybrid branch predictors composed of multiple pattern history tables (PHTs) show the highest accuracy among them. The Bi-Mode branch predictor is the best known of the hybrid branch predictors. On th...
The Proposal of Tri-Mode Branch Predictor

SAITO FUMIKO, YAMANA HAYATO

IPSJ SIG Notes Information Processing Society of Japan (IPSJ)

Presentation date： 2005.03

Event date：
2005.03

　

　

　View Summary

Branch prediction is implemented in recent processors to avoid pipeline stalls. It is a kind of speculative execution for control dependences. In recent years, as pipelines have become deeper, the branch misprediction penalty has grown. Thus, the branch misprediction rate must be lowered to raise processor performance. Recently, many branch predictors have been proposed. Hybrid branch predictors composed of multiple pattern history tables (PHTs) show the highest accuracy among them. The Bi-Mode branch predictor is the best known of the hybrid branch predictors. In the Bi-Mode predictor, the Choice PHT judges the branch bias and selects a Direction PHT (the Taken PHT or the NotTaken PHT). This paper focuses on the Weakly branches, which the Choice PHT judges as Weakly Taken or Weakly NotTaken and which have no branch bias. To keep the Weakly branches from influencing the Direction PHTs, we propose the Tri-Mode branch predictor, which adds a Weakly PHT that predicts the Weakly branches. In a SPECint95 (ref inputs) benchmark simulation, the 12 KB Tri-Mode predictor reduces branch mispredictions by 2.78% on average relative to the Bi-Mode predictor.
Selective Attention System by Residual Information of Predictive Coding

Saitoh Jun, Yamana Hayato

IPSJ SIG Notes. CVIM 一般社団法人情報処理学会

Presentation date： 2005.03

Event date：
2005.03

　

　

　View Summary

This paper describes an unsupervised selective attention system that uses the residual information of Predictive Coding. Predictive Coding, the model of the brain's visual pathway by Rao and Ballard, uses a learning rule that minimizes the residuals between the inputs and the internal predictions, which yields a linear coding of subimages by a basis set. The residual-based selective attention model can select "informative" subimages from a validation image because they have features that are rarely seen in the training sample subimages. We experiment our system to have it learn some simi...
Selective Attention System by Residual Information of Predictive Coding

SAITOH Jun, YAMANA Hayato

情報処理学会研究報告. CVIM, [コンピュータビジョンとイメージメディア]

Presentation date： 2005.03

Event date：
2005.03

　

　
A Branch Prediction Technique focused on Weak States of Prediction Table

Nakazawa Yukari, Saito Fumiko, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2005.01

Event date：
2005.01

　

　

　View Summary

In recent years, as pipelines get deeper and the instruction fetch width and issue width become wider, more accurate branch predictors are needed. Branch predictors predict with 2-bit saturating counters (prediction counters) whose state is changed by the execution result of the branch. An analysis of the prediction accuracy in each state of the prediction counters (Strongly Taken, Weakly Taken, Weakly Not-taken, and Strongly Not-taken) shows that the prediction accuracy in the Weak states of the gshare predictor is especially low. We propose the predictor sele...
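
The four counter states named in this abstract can be illustrated with a minimal Python sketch of a single 2-bit saturating counter that also tallies accuracy per state. This is a generic textbook counter, not the predictor proposed in the paper; a real gshare predictor indexes a table of such counters, and the names `TwoBitCounter` and `accuracy_by_state` are hypothetical.

```python
STATE_NAMES = [
    "Strongly Not-taken",  # state 0
    "Weakly Not-taken",    # state 1
    "Weakly Taken",        # state 2
    "Strongly Taken",      # state 3
]


class TwoBitCounter:
    def __init__(self, state=1):
        self.state = state  # integer in 0..3

    def predict(self):
        """Predict taken when the counter is in one of the two Taken states."""
        return self.state >= 2

    def update(self, taken):
        """Move one step toward the actual outcome, saturating at 0 and 3."""
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy_by_state(outcomes, start_state=1):
    """Return {state name: [correct, total]} for a sequence of branch outcomes."""
    counter = TwoBitCounter(start_state)
    stats = {name: [0, 0] for name in STATE_NAMES}
    for taken in outcomes:
        name = STATE_NAMES[counter.state]
        stats[name][1] += 1
        if counter.predict() == taken:
            stats[name][0] += 1
        counter.update(taken)
    return stats


stats = accuracy_by_state([True, True, True, False, True])
```

Grouping hits and misses by the state in which each prediction was made is one simple way to reproduce the kind of per-state accuracy analysis the abstract mentions.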
An Efficient Synchronization Scheme Using Speculative Threads on Hyper-Threading Technology

HONDA Dai, SAITO Fumiko, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2005.01

Event date：
2005.01

　

　

　View Summary

Recently, the gap between CPU processing speed and the data transfer speed from main memory has lowered execution speed. Thus, data caching techniques have become more important. In pointer-based programs with nonlinear access patterns in particular, the cache miss rate is very high. To solve this problem, Pre-Execution has been proposed as a cache miss latency tolerance technique that runs one or more helper threads on the CPU's spare resources ahead of the main computation. This paper proposes a synchronization technique between the main thread and the helper thread. Furthermore, t...
PlusDBG: Web community extraction scheme improving both precision and pseudo-recall

N Saida, A Umezawa, H Yamana

WEB TECHNOLOGIES RESEARCH AND DEVELOPMENT - APWEB 2005 SPRINGER-VERLAG BERLIN

Presentation date： 2005

Event date：
2005

　

　

　View Summary

This paper proposes PlusDBG to improve both precision and pseudo-recall by extending the conventional Web community extraction scheme. Precision is defined as the percentage of relevant Web pages extracted as members of Web communities and pseudo-recall is defined as the sum of the number of relevant Web pages extracted as members of Web communities. The proposed scheme adopts the new distance parameter defined by the relevance between a Web page and a Web community, and extracts the Web community with higher precision and pseudo-recall. Moreover, we have implemented and evaluated the proposed scheme. Our results confirm that the proposed scheme is able to extract about 3.2-fold larger numbers of members of Web communities than the conventional scheme, while maintaining equivalent precision.
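
The two measures defined in this abstract can be written down as a short Python sketch. It assumes each extracted community comes as a pair of sets, its member pages and the pages judged relevant to it. That input format and the function names are hypothetical and do not come from the PlusDBG paper.

```python
def pseudo_recall(communities):
    """Total number of relevant pages extracted as community members."""
    return sum(len(members & relevant) for members, relevant in communities)


def precision(communities):
    """Percentage of extracted member pages that are relevant."""
    extracted = sum(len(members) for members, _ in communities)
    if extracted == 0:
        return 0.0
    return 100.0 * pseudo_recall(communities) / extracted


# Hypothetical example: two communities as (members, relevant pages) pairs.
communities = [
    ({"a", "b", "c", "d"}, {"a", "b", "c"}),
    ({"x", "y"}, {"x", "z"}),
]
```

Here pseudo-recall is a raw count instead of a ratio, which matches the abstract's wording that it is a sum of relevant extracted pages.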
Improving Search Accuracy by Combining Homology Search Methods

滝沢雅俊, 山田真介, 山名早人

情報処理学会全国大会講演論文集

Presentation date： 2005

Event date：
2005

　

　
Improving the Efficiency of Communication Prediction in an Execution Time Prediction Method Based on Simple Execution of MPI Programs

杉田秀, 岩淵寿寛, 山名早人

情報処理学会全国大会講演論文集

Presentation date： 2005

Event date：
2005

　

　
Extracting Characteristic Parts from Similar Motions in HMM-Based Motion Recognition

井口茂, 山名早人

情報処理学会全国大会講演論文集

Presentation date： 2005

Event date：
2005

　

　
An Online Character Recognition System Using Recurrent Networks

糟谷勇児, 山名早人

情報処理学会全国大会講演論文集

Presentation date： 2005

Event date：
2005

　

　
Overview of the NTCIR-5 WEB Navigational Retrieval Subtask 2 (Navi-2)

Keizo Oyama, Masao Takaku, Haruko Ishikawa, Akiko Aizawa, Hayato Yamana

Proc. of NTCIR-5 Workshop

Presentation date： 2005

Event date：
2005

　

　
Multiple sequence alignment

Osamu Gotoh, Shinsuke Yamada, Tetsushi Yada

Handbook of Computational Molecular Biology

Presentation date： 2005

Event date：
2005

　

　
Spyware

山名早人監訳, 斎藤純, 平手勇, 糟谷勇児, 柳井佳孝, 蛭田智則, 杉田秀, 井口茂訳

CACM日本語版

Presentation date： 2005

Event date：
2005

　

　
A Meta File System Running on P2P File-Sharing Networks

岡本雄太, 山名早人

日本ソフトウェア科学会インターネットテクノロジワークショップ2005(WIT2005)

Presentation date： 2005

Event date：
2005

　

　
Contour Extraction Using the Distinction between Texture and Non-Texture

井口茂, 山名早人

信学技報(PMRU)

Presentation date： 2005

Event date：
2005

　

　
PRIME - an implementation of a doubly nested randomized iterative refinement strategy with the piecewise linear gap cost

Shinsuke Yamada, Osamu Gotoh

CBRC2005, Poster No.2

Presentation date： 2005

Event date：
2005

　

　
Introduction of a System for Collecting Mathematical Expression Samples Using a Digital Pen and Analysis of the Collected Samples

糟谷勇児, 山名早人

信学技報(PRMU)

Presentation date： 2005

Event date：
2005

　

　
Development of a Domain Prediction Method Using FORTE1

滝沢雅俊, 山名早人, 野口保

産総研生命情報科学人材養成コース最終シンポジウム、ポスター番号038

Presentation date： 2005

Event date：
2005

　

　
A Study of Thread-Level Speculative Execution

斎藤史子, 山名早人

FIT2005,C-1

Presentation date： 2005

Event date：
2005

　

　
Development of a Multiple Alignment Algorithm Using a Piecewise Linear Gap Cost

山田真介, 山名早人, 後藤修

産総研生命情報科学人材養成コース最終シンポジウム、ポスター番号002

Presentation date： 2005

Event date：
2005

　

　
Automatic Determination of Conserved Regions Using Three-Dimensional Information

山田晃太郎, 山田真介, 山名早人, 野口保

産総研生命情報科学人材養成コース最終シンポジウム、ポスター番号040

Presentation date： 2005

Event date：
2005

　

　
Search Engines 2005-Guides to the Web-Introduction to Search Engines

山名早人, 村田剛志

情報処理 Information Processing Society of Japan (IPSJ)

Presentation date： 2005

Event date：
2005

　

　
Sequential Pattern Mining Considering Event Occurrence Intervals

平手勇宇, 小松俊介, 山名早人

情報研報(DBS）

Presentation date： 2005

Event date：
2005

　

　
Fighting Off Spam

山名早人監修, J.グッドマン, D.ベッカーマン, R.ラウンスウェイト

日経サイエンス

Presentation date： 2005

Event date：
2005

　

　
Smarter Search Engines That Surpass Google

山名早人監修, J.モスタファ

日経サイエンス

Presentation date： 2005

Event date：
2005

　

　
3P266 Identification of rigid domains by using complete graph and application to SCOP

Mashiko R, Wako H, Yamana H

Biophysics 日本生物物理学会

Presentation date： 2004.11

Event date：
2004.11

　

　
F-033 Parallel Learning Methods of Reinforcement Learning on Shared Memory Multiprocessors

Mori Kouichirou, Yamana Hayato

情報科学技術フォーラム一般講演論文集 FIT(電子情報通信学会・情報処理学会)推進委員会

Presentation date： 2004.08

Event date：
2004.08

　

　
An Efficient Caching Technique Using Speculative Threads on Hyper-Threading Technology

HONDA Dai, SAITO Fumiko, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2004.07

Event date：
2004.07

　

　

　View Summary

Recently, the gap between CPU processing speed and the data transfer speed from main memory has greatly influenced execution speed. Thus, data caching techniques have become more important. However, in pointer-based programs with nonlinear access patterns, cache memory does not function effectively. To solve this problem, Pre-Execution is a cache miss latency tolerance technique that uses one or more helper threads running on the CPU's spare resources ahead of the main computation. This paper proposes a synchronization technique for the helper thread. Furthermore, this paper examines the ...
A Translation Support System using Search Engines

OSHIKA Hironori, SATOU Manabu, ANDO Susumu, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2004.07

Event date：
2004.07

　

　

　View Summary

This paper proposes a new Japanese-to-English translation support system using search engines. The system uses Google to address some of the problems that non-native speakers of English often encounter when trying to write English sentences. In particular, choosing appropriate English nouns and prepositions is a challenge for Japanese speakers. To solve these problems, we focus on the techniques presented in the book titled "Using Google to Improve Your Translation Skills" written by Susumu Ando, published by Maruzen in 2003. The proposed system is implemented by automatically constructing opti...
A Translation Support System using Search Engines

OSHIKA Hironori, SATOU Manabu, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2004.07

Event date：
2004.07

　

　

　View Summary

This paper proposes a new Japanese-to-English translation support system using search engines. The system uses Google to address some of the problems that non-native speakers of English often encounter when trying to write English sentences. In particular, choosing appropriate English nouns and prepositions is a challenge for Japanese speakers. To solve these problems, we focus on the techniques presented in the book titled "Using Google to Improve Your Translation Skills" written by Susumu Ando, published by Maruzen in 2003. The proposed system is implemented by automatically constructing opti...
Toward the Exploitation of New Applications based on Web Data

YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2004.05

Event date：
2004.05

　

　

　View Summary

The amount of information on the Web is huge, and the number of Web pages was estimated at about 9.25 billion in April 2004. Moreover, one billion Web pages will be added to the Web repository every year, an estimate calculated from the average increase over the past two years. It is not too much to say that the huge Web repository holds all kinds of information, knowledge, and know-how, more than a human could learn even in a whole lifetime. In this paper, we introduce the major research projects concerned with how to crawl the huge number of Web pages, how to keep them up to date, and how to make full use of them.
Parallelization of Loops Including Loop Carried Dependences Using Thread Level Speculative Execution

Ishikawa Shunsuke, Yamana Hayato

情報処理学会研究報告. SLDM, [システムLSI設計技術] 一般社団法人情報処理学会

Presentation date： 2004.03

Event date：
2004.03

　

　

　View Summary

Loop parallelization dramatically improves program performance. However, when the data dependence cannot be analyzed statically, conventional parallelization schemes assume that the data dependence exists. Thus, a loop containing an unanalyzed loop-carried dependence cannot be parallelized. Thread-level speculative execution is known to speed up such loops. In this paper, we propose a scheme that applies speculative execution selectively, only to the portions expected to be sped up effectively, using the overhead parameter required for the book-keeping p...
A Fast Learning Method for Reinforcement Learning on Shared Memory Multiprocessors

MORI Kouichirou, YAMANA Hayato

IPSJ SIG Notes. ICS 一般社団法人情報処理学会

Presentation date： 2004.03

Event date：
2004.03

　

　

　View Summary

In reinforcement learning, the agent learns by trial and error, starting from a state without knowledge. Therefore, reinforcement learning has the drawback that learning is slow, and how to learn quickly is a serious problem. Several methods have been proposed to speed up learning. In these methods, the value function is divided; each divided value function is then assigned to a processor and updated in parallel. However, these methods need to exchange experiences frequently between the divided value functions because of the nature of reinforcement learning. It was a problem in the previou...
Parallelization of Loops Including Loop Carried Dependences Using Thread Level Speculative Execution

Ishikawa Shunsuke, Yamana Hayato

IEICE technical report. Computer systems 社団法人電子情報通信学会

Presentation date： 2004.03

Event date：
2004.03

　

　

　View Summary

Loop parallelization dramatically improves program performance. However, when the data dependence cannot be analyzed statically, conventional parallelization schemes assume that the data dependence exists. Thus, a loop containing an unanalyzed loop-carried dependence cannot be parallelized. Thread-level speculative execution is known to speed up such loops. In this paper, we propose a scheme that applies speculative execution selectively, only to the portions expected to be sped up effectively, using the overhead parameter required for the book-keeping p...
A Fast Learning Method for Reinforcement Learning on Shared Memory Multiprocessors

MORI Kouichirou, YAMANA Hayato

IEICE technical report. Artificial intelligence and knowledge-based processing 社団法人電子情報通信学会

Presentation date： 2004.03

Event date：
2004.03

　

　

　View Summary

In reinforcement learning, the agent learns by trial and error, starting from a state without knowledge. Therefore, reinforcement learning has the drawback that learning is slow, and how to learn quickly is a serious problem. Several methods have been proposed to speed up learning. In these methods, the value function is divided; each divided value function is then assigned to a processor and updated in parallel. However, these methods need to exchange experiences frequently between the divided value functions because of the nature of reinforcement learning. It was a problem in the previou...
The Branch Predictor Referring a BTB Entry Existence

SAITO FUMIKO, YAMANA HAYATO

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2004.03

Event date：
2004.03

　

　

　View Summary

Branch prediction is implemented in recent processors to avoid pipeline stalls. It is a kind of speculative execution for control dependences. In recent years, as pipelines have become deeper, the branch misprediction penalty has grown. Thus, the branch misprediction rate must be lowered to raise processor performance. Branch prediction predicts a branch direction and a branch target address. The BTB (Branch Target Buffer) registers Taken branches. We found that most branches that do not have a BTB entry are NotTaken branches. We propose a branch predictor referring to a BTB...
MPIETE : An Execution Time Estimator for MPI Programs

HORII Hiroshi, IWABUCHI Toshihiro, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2004.03

Event date：
2004.03

　

　

　View Summary

In this paper, we propose the MPI Execution Time Estimator (MPIETE), an execution time estimation tool for MPI programs that helps users choose the computing platform best suited to executing an MPI program. Conventional execution time estimation schemes cannot model a computing platform or an MPI program perfectly, so none of the parameters of either the computing platform or the MPI program can be reused. In contrast, the proposed scheme makes it possible to reuse all the parameters of both the computing platform and the MPI program, even for the estimation on another computing platfo...
Extension of Prrn: implementation of a doubly nested randomized iterative refinement strategy under a piecewise linear gap cost

山田真介, 後藤修, 山名早人

the Fifteenth International Conference on Genome Informatics

Presentation date： 2004

Event date：
2004

　

　
TF2P-growth: An Efficient Algorithm for Mining Frequent Patterns without any Thresholds

平手勇宇, 岩橋永悟, 山名早人

IEEE ICDM'04 Workshop on Alternatives Techniques for Data Mining and Knowledge Discovery

Presentation date： 2004

Event date：
2004

　

　
The Branch Predictor Referring a BTB Entry Existence

SAITO FUMIKO, YAMANA HAYATO

Information Processing Society of Japan (IPSJ)

Presentation date： 2004

Event date：
2004

　

　

　View Summary

Branch prediction is implemented in recent processors to avoid pipeline stalls. It is a kind of speculative execution for control dependences. In recent years, as pipelines have become deeper, the branch misprediction penalty has grown. Thus, the branch misprediction rate must be lowered to raise processor performance. Branch prediction predicts a branch direction and a branch target address. The BTB (Branch Target Buffer) registers Taken branches. We found that most branches that do not have a BTB entry are NotTaken branches. We propose a branch predictor that refers to the existence of a BTB entry. To alleviate aliasing, the proposed predictor updates only the entries of branches whose target addresses are registered in the BTB. On SPECint95 (train), the branch misprediction rate is lowered by 1.5% on average for an 8 KB Gshare predictor and by 0.4% on average for a 1.5 KB Bi-Mode predictor.
Extension of group-to-group sequence alignment algorithm under a piecewise linear gap cost

山田真介, 後藤修, 山名早人

Proc. of Intelligent Systems for Molecular Biology 2004

Presentation date： 2004

Event date：
2004

　

　
Improving Cache Efficiency Using Speculative Threads in a Hyper-Threading Environment

本田大, 斎藤史子, 山名早人

SWoPP2004

Presentation date： 2004

Event date：
2004

　

　
Building a Translation Support System Using Search Engines

大鹿広憲, 佐藤学, 安藤進, 山名早人

DBWS2004

Presentation date： 2004

Event date：
2004

　

　
Service-Oriented Computing

山名早人監訳, 石川隼輔, 堀井洋, 岩渕寿寛, 岩橋永悟, 山口正男訳

CACM日本語版

Presentation date： 2004

Event date：
2004

　

　
A Challenge to Gather 10 billion of Web Pages

山名早人

2004 CORS/INFORMS International Meeting (2004.5)

Presentation date： 2004

Event date：
2004

　

　
An Efficient Algorithm for Mining Top-k Frequent Patterns with a Small Response Time

平手勇宇, 岩橋永悟, 山名早人

2004 CORS/INFORMS International Meeting (2004.5)

Presentation date： 2004

Event date：
2004

　

　
The Branch Predictor Referring a BTB Entry Existence

斎藤史子, 山名早人

情報処理学会シンポジウム論文集

Presentation date： 2004

Event date：
2004

　

　
Cutting down the amount of communications for frequent pattern mining on Grid

加藤真, 平手勇宇, 岩橋永悟, 山名早人

情報処理学会シンポジウム論文集

Presentation date： 2004

Event date：
2004

　

　
Evaluation of MPIETE, an Execution Time Prediction Tool Using Simple Execution Results of MPI Programs

堀井洋, 岩渕寿寛, 山名早人

情処研報(HPC)

Presentation date： 2004

Event date：
2004

　

　
Search Using Grouped Web Pages

梅沢晃, 山名早人

DEWS2004

Presentation date： 2004

Event date：
2004

　

　
Parallelization of Loops with Loop-Carried Dependences of Indeterminate Distance by Thread-Level Speculative Execution

石川隼輔, 山名早人

情処研報(SLDM)

Presentation date： 2004

Event date：
2004

　

　
Packet Route Selection Using Transport-Layer Information

高見進太郎, 山名早人, 廣津登志夫

第66回情処全大

Presentation date： 2004

Event date：
2004

　

　
Web Community Extraction Considering Page-Community Relevance

斉田直幸, 梅沢晃, 山名早人

第66回情処全大

Presentation date： 2004

Event date：
2004

　

　
An Evaluation Method for Web Search Systems That Considers Users' Perception

大塚崇志, 江口浩二, 山名早人

DEWS2004

Presentation date： 2004

Event date：
2004

　

　
A Top-k Frequent Pattern Extraction Algorithm Emphasizing Response Time to Users

平手勇宇, 岩橋永悟, 山名早人

DEWS2004

Presentation date： 2004

Event date：
2004

　

　
A Method for Detecting Web Page Updates Using Link Structure

熊谷英樹, 山名早人

DEWS2004

Presentation date： 2004

Event date：
2004

　

　
Speeding Up Learning by Parallelizing Reinforcement Learning

森紘一郎, 山名早人

情処研報(ICS)

Presentation date： 2004

Event date：
2004

　

　
Proposal of an Online Auction Model Applying the Iterated Prisoner's Dilemma Game and Observation of Cooperative Behavior

久野木彩子, 山名早人

DEWS2004

Presentation date： 2004

Event date：
2004

　

　
A Branch Direction Predictor Composed of Prediction Tables for Each Strength of Branch Direction Bias

斎藤史子, 山名早人

情処研報(ARC)

Presentation date： 2004

Event date：
2004

　

　
Search Engine Architecture

山名早人

情報の科学と技術

Presentation date： 2004

Event date：
2004

　

　
How Search Engines Work to Show the Site You Want on the First Try

山名早人

インターネットマガジン（インプレス）

Presentation date： 2004

Event date：
2004

　

　
Branch Direction Miss Prediction Rate Diminution in Cooperation with a Selector and Predictors on Hybrid Branch Predictor

Saito Fumiko, Nakazawa Yukari, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2003.08

Event date：
2003.08

　

　

　View Summary

In recent years, as pipelines have become deeper, more accurate branch direction predictors are needed. Most branch predictors tend to increase the hardware budget to alleviate aliasing. This research proposes a means of reducing the misprediction rate of a Hybrid Predictor within the same hardware budget. The predictor is called the Hybrid Predictor Referenced Prediction Counter State (Hybrid-RPCS). Generally, low-predictability branches have a high transition rate and no direction bias. For low-predictability branches, the prediction is reversed, based on a selector counter state and predicto...
Evaluation of Execution-time Prediction Method of MPI Programs based Simple Execution

IWABUCHI Toshihiro, HORII Hiroshi, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2003.08

Event date：
2003.08

　

　

　View Summary

In this paper, we show evaluation results of our execution-time prediction method, which simply executes an MPI program on 2 PUs. We predict the execution time of the NAS Parallel Benchmarks ver. 2.3 on 2 to 128 PUs. Execution time prediction is an effective technique for determining the optimal number of PUs for a target application. Most existing methods aim not only to predict execution time but also to obtain information on various overheads, and hence need a long simulation time. On the other hand, since our purpose is to obtain the execution time only, our method can predict faster than actual execu...
Parallel FP-growth Algorithm for Frequent Pattern Mining

IWAHASHI Eigo, YAMANA Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2003.07

Event date：
2003.07

　

　

　View Summary

Frequent pattern mining is one of the important problems in data mining research. Apriori is a prominent algorithm with many variants. In 2000, FP-growth, which is reported to be faster than Apriori, was proposed. However, many parallel algorithms for frequent pattern mining are still based on Apriori. In this paper, we propose a parallelized version of FP-growth, which accesses disks in parallel and constructs local FP-trees in each node's local memory. In an evaluation using a 32-node PC cluster, our method is approximately 2 and 130 times faster than sequent...
Parallel FP-growth Algorithm for Frequent Pattern Mining

IWAHASHI Eigo, YAMANA Hayato

IEICE technical report. Data engineering 社団法人電子情報通信学会

Presentation date： 2003.07

Event date：
2003.07

　

　

　View Summary

Frequent pattern mining is one of the important problems in data mining research. Apriori is a prominent algorithm with many variants. In 2000, FP-growth, which is reported to be faster than Apriori, was proposed. However, many parallel algorithms for frequent pattern mining are still based on Apriori. In this paper, we propose a parallelized version of FP-growth, which accesses disks in parallel and constructs local FP-trees in each node's local memory. In an evaluation using a 32-node PC cluster, our method is approximately 2 and 130 times faster than sequent...
Exploitation of Informational Applications - Toward the Global Web Information Archive

Hayato Yamana

Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers Inst. of Image Information and Television Engineers

Presentation date： 2003

Event date：
2003

　

　
Relevance Search of Entries in Genome Databases

三村徹, 諸岡慎士, 山名早人

情報処理学会全国大会講演論文集

Presentation date： 2003

Event date：
2003

　

　
Evaluation of a Method for Improving Search Efficiency in P2P Systems

難波貞暁, 山名早人

情報処理学会全国大会講演論文集

Presentation date： 2003

Event date：
2003

　

　
Evaluation Experiments on Web Ranking Using a Markov Model

赤津秀之, 山名早人

情報処理学会全国大会講演論文集

Presentation date： 2003

Event date：
2003

　

　
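Design and FPGA Implementation of a Mechanism for Collecting Speculative Execution Support Information Focusing on Branch Instructions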

蛭田智則, 小池汎平, 佐谷野健二, 山名早人

情報処理学会全国大会講演論文集

Presentation date： 2003

Event date：
2003

　

　
Exploring "Information" Applications: The Challenge of Building an Archive of the World's Web Information

山名早人

映像情報メディア学会誌

Presentation date： 2003

Event date：
2003

　

　
Evaluation of Execution Time Prediction by Simple Execution of MPI Programs

岩渕寿寛, 堀井洋, 山名早人

情処研報(HPC)

Presentation date： 2003

Event date：
2003

　

　
Reducing the Misprediction Rate through Cooperation of the Selector and Predictors in a Hybrid Predictor

斎藤史子, 仲沢由香里, 山名早人

情処研報(ARC)

Presentation date： 2003

Event date：
2003

　

　
Speeding Up Frequent Pattern Extraction by Parallelizing FP-growth

岩橋永悟, 山名早人

情処研報(DBS)

Presentation date： 2003

Event date：
2003

　

　
The Internet Leading the IT Society: The Present and Future of Home Internet Access

山名早人

電子情報通信学会誌

Presentation date： 2003

Event date：
2003

　

　
Proposal of a Traffic Reduction Method Using Query Hit in Gnutella

難波貞暁, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Execution Time Prediction by Simple Execution of MPI Programs

岩渕寿寛, 堀井洋, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
A New Evaluation Method for Web Search Engines

大塚崇志, 山名早人

電子情報通信学会第14回データ工学ワークショップDEWS2003

Presentation date： 2003

Event date：
2003

　

　
Proposal of an Efficient Crawling Method Based on Web Page Update Trends

熊谷英樹, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Identifying Image Content by Keywords Considering Web Page Structure

大鹿広憲, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
PC Performance Evaluation Using Application Response Time

堀井洋, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Similarity Search of Entries Using Annotation Fields in Genome Databases

三村徹, 諸岡慎士, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Data Value Prediction Targeting Data-Dependent Instructions

仲沢由香里, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Evaluation Experiments on a Web Ranking Method Using a Markov Model

赤津秀之, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Proposal of an Information Retrieval System Using Users' Search Histories

三浦典之, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
A Method for Improving the Accuracy of Automatic Web Page Classification Using Link Structure

大西高裕, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Community Extraction from Large-Scale Web Data

梅沢晃, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
A Study of Improving Cache Efficiency Using Speculative Data Prefetching

本田大, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Speeding Up Hard-to-Parallelize Loops by Speculative Execution

石川隼輔, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
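Design and FPGA Implementation of a Mechanism for Collecting Speculative Execution Support Information Focusing on Branch Instruction Execution Counts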

蛭田智則, 山名早人, 佐谷野健二, 小池汎平

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Recent Technical Trends in Methods for Constructing Molecular Phylogenetic Trees

益子理絵, 山田真介, 山名早人

第65回情処全大

Presentation date： 2003

Event date：
2003

　

　
Hybrid Branch Predictors Evaluation on Prediction Accuracy

Saito Fumiko, Kitamura Takeshi, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2002.11

Event date：
2002.11

　

　

　View Summary

In recent years, branch predictors with multiple prediction tables, which are called "Hybrid Predictors" in this paper, have been proposed. Hybrid predictors are classified into two categories: the "Combining Predictor" and the "De-Aliased Predictor". The difference between the two is the means of selecting a prediction table: the "Combining Predictor" selects a prediction table by confidence, whereas the "De-Aliased Predictor" selects one by branch direction bias. Although the prediction accuracy in "Combining Predictor" is the highest, "Combining Pr...
Necessity for Confidence in Multiple PHT Branch Predictors

Saito Fumiko, Hiruta Tomonori, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2002.08

Event date：
2002.08

　

　

　View Summary

In recent years, branch predictors with multiple PHTs (Pattern History Tables) have been proposed. In this paper, branch predictors with multiple PHTs are classified into two categories: the "Hybrid Predictor" and what we call in this paper the "Multiple PHT Predictor". The difference between the two is PHT selection confidence. In this paper, we compare "Hybrid Predictor" with "Multiple PHT Predictor". As a result, if "Hybrid Predictor" is the same size as "Multiple PHT Predictor", "Hybrid Predictor" predicts branch directions better tha...
A Proposal of the Branch Prediction Technique based on the Transition Rate

Umezawa Akira, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2002.05

Event date：
2002.05

　

　

　View Summary

To raise processing speed, today's processors adopt pipelining to extract instruction-level parallelism. However, a pipeline stall occurs when a conditional branch exists. Much research has been done to raise the accuracy of branch prediction. In this paper, we propose a new branch prediction technique based on the transition rate, specifically the number of consecutive times a branch goes in the same direction. The proposed scheme targets branches that are classified as difficult to predict. We applied the prop...
Search Pattern Modeling based on its Search Interval

Suzuki Shunsuke, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2002.03

Event date：
2002.03

　

　

　View Summary

Conventional search engines search based on a pre-generated index. Thus, when several users search with the same query, the search engine returns the same results, even if the users want different results. To solve this problem, we propose a user modeling scheme based on the user's search pattern to infer the user's intention. We have classified the intervals between repeated searches into two patterns. Furthermore, we have classified 91% of a user's queries into nine patterns. Using these patterns, search engines will be able to return results that suit the user's intention.
An Efficient Speculative Execution Scheme for Loops

Ishikawa Shunsuke, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2002.03

Event date：
2002.03

　

　

　View Summary

In this paper, we propose an efficient speculative execution scheme for loops, and confirm the usefulness of the scheme using the compress program from the SPECcpu95 benchmark. Generally, since the execution time of loops accounts for a large portion of the total execution time, loop parallelization improves program performance dramatically. However, when the data dependence cannot be analyzed statically, the conventional parallelization scheme assumes that the data dependence exists. For this reason, such a loop cannot be parallelized even if the loop-carried dependence (LCD) occurs dynamically only once in 10,000 times. Speculative execution, however, is known to speed up such loops. In this paper, we propose a scheme that applies speculative execution selectively, only to the portions expected to be sped up effectively, using an overhead parameter for the book-keeping process required when the speculation fails. Such overhead has not been considered in conventional speculative execution schemes. The proposed scheme enables selective speculative execution using the overhead parameter for book-keeping, the LCD existence probability, and the timing of the speculative execution initiation. As a result, at the present stage, the execution speed falls to one third. To solve this problem, we also propose a new speculative execution.
Development of a Local Structure Prediction Method Using Structural Profiles

山田真介, 富井健太郎, 太田元規, 秋山泰, 山名早人

第2回日本蛋白質科学会年会ポスター

Presentation date： 2002

Event date：
2002

　

　
A Study of an Efficient Crawling Method Based on Web Page Update Frequency and Access Frequency

熊谷英樹, 山名早人

第64回情処全大

Presentation date： 2002

Event date：
2002

　

　
An Attempt at Automatic Extraction of Paper Files from the Web

田伏真之, 山名早人

第64回情処全大

Presentation date： 2002

Event date：
2002

　

　
A Survey Estimating the Number of Web Pages in Japan Considering the Skew in Page Counts per Domain

西村真幸, 山名早人

第64回情処全大

Presentation date： 2002

Event date：
2002

　

　
Web Ranking Using a Markov Model

赤津秀之, 山名早人

第64回情処全大

Presentation date： 2002

Event date：
2002

　

　
A Survey of Site Characteristics and Usefulness by Checking Backlinks

高見進太郎, 山名早人

第64回情処全大

Presentation date： 2002

Event date：
2002

　

　
The Latest Trends in Research on Brain-Like Information Processing

齋藤雅浩, 山名早人

第64回情処全大

Presentation date： 2002

Event date：
2002

　

　
The Latest Technical Trends in Speculative Execution

Saito Fumiko, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2001.11

Event date：
2001.11

　

　

　View Summary

Instruction-level speculative execution schemes are classified into branch prediction, which alleviates control dependence, and data prediction, which alleviates data dependence. In this paper, we summarize 36 papers on branch prediction and 27 papers on data prediction in HPCA from 1996 to 2001, and in ISCA, MICRO, and ASPLOS from 1996 to 2000. As a general trend, until 1998, more than half of the research on speculative execution was related to branch prediction. However, since 1997, research on data prediction has increased.
An Estimation Scheme of the Execution Time for MPI Programs using Measured Primitives

Horii Hiroshi, Yamana Hayato

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2001.10

Event date：
2001.10

　

　

　View Summary

In this paper, we propose a scheme for estimating the execution time of MPI programs, and confirm the usefulness of the scheme using NAS Parallel Benchmarks (NPB) ver 2.3. The scheme estimates the execution time of an MPI program by dividing it into the computation part and the communication part. In estimating the computation time, we divide an MPI program into blocks that have loop structure, measure the execution time of every block, and estimate the total execution time. In estimating the communication time, we measure the communication time with the same message size which is...
Search Engine Google

YAMANA Hayato, KONDO Hidekazu

Journal of Information Processing Society of Japan 一般社団法人情報処理学会

Presentation date： 2001.08

Event date：
2001.08

　

　

　View Summary

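Google is famous as the search engine holding the world's largest amount of information. Google started as a research project in the Computer Science Department of Stanford University, then received a total of 25 million dollars in investment from two major Silicon Valley venture capital firms, and was founded as a company in September 1998 by Larry (Lawrence) Page and Sergey Brin, then 25-year-old doctoral students.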
Invited Talk 2: Information Retrieval Technology of the Search Engine Google (15th AI Symposium: WWW Information Retrieval and Information Integration)

山名早人

AIシンポジウム 人工知能学会

Presentation date： 2001.07

Event date：
2001.07

　

　
Invited Paper: Information Retrieval Technology of the Search Engine Google

山名早人

第15回AIシンポジウム

Presentation date： 2001

Event date：
2001

　

　
An Effective Method for Applying Speculative Execution to Loops

山名早人

情報処理学会第６２回全国大会

Presentation date： 2001

Event date：
2001

　

　
Database Frontline (12): Search Engines and High-Speed Page Collection Technology: Distributed WWW Robot Experiment

山名早人

Bit 共立出版

Presentation date： 2000.12

Event date：
2000.12

　

　
2000-ARC-139-28 Unlimited Speculative Execution for Loops

YAMANA Hayato, Koike Hanpei

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 2000.08

Event date：
2000.08

　

　

　View Summary

This paper discusses how to adopt the "Unlimited Speculative Execution" on loops. A task-level speculative execution scheme, called the "Unlimited Speculative Execution", is adopted on loops that cannot be parallelized because of memory ambiguity or control dependences. In this paper, loops are classified into nine categories to clarify which loops the scheme applies to. Moreover, we discuss the results of applying the scheme to the SPEC95int compress program.
Distributed WWW Robot Experiment

山名早人

Bit,共立出版

Presentation date： 2000

Event date：
2000

　

　
Research and Development of a Wide-Area Distributed Internet Search Robot

村岡洋一, 山名早人, 田村健人, 河野浩之, 森英雄, 浅井勇夫, 西村英樹, 楠本博之, 篠田洋一

第１９回ＩＰＡ技術発表会

Presentation date： 2000

Event date：
2000

　

　
Preliminary Evaluation of a Distributed WWW Robot and a Study of Its Speedup

山名早人, 森英雄, 田村健人, 河野浩之, 村岡洋一

日本ソフトウェア科学会インターネットテクノロジワークショップ

Presentation date： 2000

Event date：
2000

　

　
Applying Unlimited Speculative Execution to Loops

山名早人, 小池汎平

情報処理学会計算機アーキテクチャ研究会(SWoPP00)

Presentation date： 2000

Event date：
2000

　

　
An Experiment in Collecting Domestic WWW Data in Japan with a Distributed WWW Robot

山名早人

ACM SIGMOD Japanシンポジウム講演集

Presentation date： 2000

Event date：
2000

　

　
A Survey Study of Supercompiler Technology

平成11年度先導調査研究報告書／新エネルギー・産業総合開発機構

Presentation date： 2000

Event date：
2000

　

　
Current State and Challenges of Wide-Area Distributed Computing: The Distributed WWW Robot as an Example

山名早人

北海道地域ネットワーク協議会シンポジウム2000／北海道地域ネットワーク協議会

Presentation date： 2000

Event date：
2000

　

　
Status of the Distributed WWW Robot Experiment (Special Feature: Prospects for the Next-Generation Internet)

山名早人

機械振興 機械振興協会

Presentation date： 1999.08

Event date：
1999.08

　

　
User Support on Narrowing Retrieval using the Unlimited Speculative Search Service.

山名早人, 小池汎平, 児玉祐悦, 坂根広史, 山口喜教

人工知能学会知識ベースシステム研究会資料

Presentation date： 1999.03

Event date：
1999.03

　

　
Speculative Control/Data Dependence Graph and Java Jog-time Analyzer : A Preliminary Evaluation for Java Virtual Accelerator

KOIKE HANPEI, YAMANA HAYATO, YAMAGUCHI YOSHINORI

Transactions of Information Processing Society of Japan Information Processing Society of Japan (IPSJ)

Presentation date： 1999.02

Event date：
1999.02

　

　

　View Summary

The authors are investigating the possibility of Java Virtual Accelerator, a run-time parallelizing interpreter/JIT compiler system which speeds up Java execution through parallelizing emulation. To realize parallelizing emulation, automatic extraction of parallelism from sequential binary programs is important. We developed the "speculative control-data dependence graph" model to relieve the control and data dependence constraints inherent in sequential programs. The speculative control-data dependence graph is constructed by measuring the prediction rate for both control and data values during a test run, and replacing highly predictable arcs with predict-confirm nodes. Java Jog-time Analyzer (JJA) was developed to experiment with the model described above. JJA analyzes control and data dependences statically while class files are loaded, and the intermediate code interpreter of JJA invokes data and branch prediction modules and gathers run-time statistics every time a basic block boundary is crossed. Run-time statistics such as the block execution counts, the prediction rates, the critical path execution time and the average parallelism, as well as the plot of the dependence graphs, are shown at the end of the execution. In this paper, several experimental results with JJA are shown.
An Attempt at a Distributed WWW Robot That Collects All Domestic WWW Data in Japan within 24 Hours

山名早人, 田村健人, 森英雄, 黒田洋介, 西村英樹, 浅井勇夫, 楠本博之, 篠田陽一, 村岡洋一

Proceedings of NORTH Internet Symposium

Presentation date： 1999

Event date：
1999

　

　
Design of Automatic Parallelizing Intermediate Code Interpreter

KOIKE HANPEI, YAMANA HAYATO, YAMAGUCHI YOSHINORI

Information Processing Society of Japan (IPSJ)

Presentation date： 1999

Event date：
1999

　

　

　View Summary

In this paper, the design of an intermediate code interpreter, which executes a sequential program in parallel using a speculative method, is discussed. We propose software techniques that enable efficient parallel speculative execution without hardware support, such as a checkpoint execution mechanism with which an appropriate parallel execution granularity is established, and an efficient implementation of the speculative memory operations which minimizes the overhead of searching, recording and mutual exclusion. Experimental results showing the basic performance of these techniques are also presented. From the experiments, we confirmed that a speculative intermediate code interpreter that achieves speedup can be implemented if the software techniques described in this paper are adopted.
Experimental Status of the Distributed WWW Robot and Future Challenges

インターネットコンファレンス99論文集／日本ソフトウェア科学会

Presentation date： 1999

Event date：
1999

　

　
Research and Development of a Wide-Area Distributed Cooperative Internet Search Robot

IPA第18回技術発表会論文集／情報処理振興事業協会

Presentation date： 1999

Event date：
1999

　

　
経営学大事典第二版

中央経済社

Presentation date： 1999

Event date：
1999

　

　
Distributed WWW robot experiment.

山名早人

機械振興

Presentation date： 1999

Event date：
1999

　

　
Preliminary Evaluation of the Unlimited Speculative Search Service on Parallel Computers.

山名早人, 小池汎平, 児玉祐悦, 坂根広史, 山口喜教

情報処理学会シンポジウム論文集

Presentation date： 1999

Event date：
1999

　

　
Evaluation of Communication Mechanisms for Distributed Memory Parallel Computers in Wavefront Computation

SAKANE Hirofumi, KODAMA Yuetsu, TATEBE Osamu, KOIKE Hanpei, YAMANA Hayato, YAMAGUCHI Yoshinori, YUBA Yoshitsugu

Transactions of Information Processing Society of Japan Information Processing Society of Japan (IPSJ)

Presentation date： 1999

Event date：
1999

　

　

　View Summary

In this paper, we discuss efficient parallel execution of a dense-matrix problem considering trade-offs between fine-grain and coarse-grain communication in distributed memory machines. The solution of the triangular system of equations involves data dependencies between consecutive iterations in the outer loop. The dependencies can be naturally solved and processed in parallel by wavefront computation. Two ways of parallelizing are presented: the element-wise fine-grain approach and the coarse-grain approach. We implemented these algorithms on both EM-X and AP1000+. The fine-grain support mechanisms of the EM-X had a great effect on the performance of the element-wise method for relatively small problem sizes, while the RISC processors employed in the AP1000+ brought high performance to the coarse-grain method for larger sizes.
Fast speculative search engine on the highly parallel computer EM-X

Hayato Yamana, Hanpei Koike, Yuetsu Kodama, Hirofumi Sakane, Yoshinori Yamaguchi

SIGIR Forum (ACM Special Interest Group on Information Retrieval)

Presentation date： 1998.12

Event date：
1998.12

　

　

　View Summary

A WWW search engine called fast speculative search engine that uses speculative execution of multiprocessor systems to shorten the total time to retrieve information from the WWW is presented. This engine predicts the user's next queries and initiates the searches with the predicted queries before receiving them to accelerate narrowing the search space. This fast speculative search engine is implemented using the data speculation on the EM-X, a highly parallel computer which can tolerate communication latency by using low latency communication and multithreading.
New trends of information retrieval in multimedia and Internet. Globally distributed cooperative search robot for Internet.

山名早人

Computer Today

Presentation date： 1998.09

Event date：
1998.09

　

　
A Study of Adopting the unlimited Speculative Execution on Multigrain Parallelizing Compilers

YAMANA Hayato, KOIKE Hanpei, KODAMA Yuetsu, SAKANE Hirohumi, YAMAGUCHI Yoshinori

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 1998.08

Event date：
1998.08

　

　

　View Summary

This paper discusses the effectiveness of the unlimited speculative execution and how to adopt the scheme on multigrain parallelizing compilers. The multigrain parallelizing compilers exploit parallelism among coarse-grain tasks like loops, medium-grain tasks such as loop iterations, and near-fine-grain tasks such as statements. When we adopt the unlimited speculative execution scheme on multigrain parallelizing compilers, code that is not parallelized because of memory ambiguity or control dependences can be parallelized. In this paper, loops are classified into nine categori...
A Study of Unlimited Speculative Execution on Multigrain Parallel Processing

YAMANA Hayato, KOIKE Hanpei, KODAMA Yuetsu, SAKANE Hirofumi, YAMAGUCHI Yoshinori

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1998.03

Event date：
1998.03

　

　
WWW Information Collection by Distributed Robots

山名早人

第9回データ工学ワークショップ(DEWS'98), 電子情報通信学会データ工学専門委員会

Presentation date： 1998

Event date：
1998

　

　
Speculative Control-Data Dependence Graph and Java Jog-time Analyzer-A Step toward Java Virtual Accelerator.

小池汎平, 山名早人, 山口喜教

情報処理学会シンポジウム論文集

Presentation date： 1998

Event date：
1998

　

　
A Survey of World Wide Web Search Engines

Yamana Hayato

コンピュータソフトウェア 一般社団法人日本ソフトウェア科学会

Presentation date： 1997.09

Event date：
1997.09

　

　

　View Summary

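As of January 1997, about 830,000 organizations and about 16 million computers worldwide are connected to the Internet, and more than 100 million pages of information, ranging from academic papers to hobbies, are published from WWW servers. To make effective use of this vast amount of information, it is essential to find the pages containing the needed information instantly and accurately. WWW information retrieval services that provide this function began to appear around 1994, and there are now more than 100 of them. This article describes the current state of WWW information retrieval services and their problems.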
Parallelization and Performance Evaluation of Sparse Matrix Computation in The EM - X Multiprocessor

SATO Mitsuhisa, KODAMA Yuetsu, SAKANE Hirofumi, YAMANA Hayato, SAKAI Shuichi, YAMAGUCHI Yoshinori

Transactions of Information Processing Society of Japan Information Processing Society of Japan (IPSJ)

Presentation date： 1997.09

Event date：
1997.09

　

　

　View Summary

In this paper, we describe the parallelization of a sparse matrix computation, the CG (Conjugate Gradient method) kernel taken from the NAS parallel benchmark suite, for the EM-X multiprocessor. The dataflow mechanism of EM-X supports fine-grain communication very efficiently, providing low-latency communication and a flexible message-passing facility. We compare the performance of sparse matrix-vector multiplications by complete exchange communication, by element-wise remote update, and by element-wise remote read with multithreading. The measurements taken on the EM-X indicate the effectiveness of fine-grain communication, which enables efficient element-wise access. Fine-grain communication is effective when the problem size per PE becomes small in large-scale multiprocessor systems. The complete exchange version suffers from the limitation of its bandwidth, and the performance of the element-wise remote read version is degraded by the overhead of context switching for multithreading.
Parallel Execution of Radix Sort Programs Using Fine - grain Communication

KODAMA Yuetsu, SAKANE Hirofumi, SATO Mitsuhisa, YAMANA Hayato, SAKAI Shuichi, YAMAGUCHI Yoshinori

Transactions of Information Processing Society of Japan Information Processing Society of Japan (IPSJ)

Presentation date： 1997.09

Event date：
1997.09

　

　

　View Summary

EM-X is a highly parallel computer with distributed memory. It supports fine-grain communication, with a fixed size of two words, on an instruction execution pipeline. It achieves high communication throughput by overlapping remote memory access with thread execution, and tolerates communication latency by rapid switching of threads. We developed an 80-processor EM-X system, and are evaluating its architectural features on the system. In this paper, we execute radix sort programs to evaluate the parallel performance of EM-X and compare the results with other parallel computers. The results show that fine-grain communication achieves very good scalability, while coarse-grain message passing decreases the performance on a large number of processors because of contention on the network.
Using the Unlimited Speculative Execution on WWW Information Retrieval

YAMANA Hayato, KOIKE Hanpei, KODAMA Yuetsu, TODA Kenji, YAMAGUCHI Yoshinori

IEICE technical report. Computer systems 社団法人電子情報通信学会

Presentation date： 1997.08

Event date：
1997.08

　

　

　View Summary

This paper explains how to use the Unlimited Speculative Execution scheme to accelerate information retrieval on the World Wide Web. The Unlimited Speculative Execution scheme uses lightly loaded processors to speculatively execute tasks whose initiation has not yet been decided. The goal of this research is to present a fast WWW information retrieval system by using the Unlimited Speculative Execution scheme. We use the EM-X parallel computer, which consists of 80 processors, as the platform.
Developing information industry. Trends of WWW information retrieval service.

山名早人

機械振興

Presentation date： 1997.08

Event date：
1997.08

　

　
Fine-grain multithreading with the EM-X multiprocessor

Andrew Sohn, Yuetsu Kodama, Jui Ku, Mitsuhisa Sato, Hirofumi Sakane, Hayato Yamana, Shuichi Sakai, Yoshinori Yamaguchi

Annual ACM Symposium on Parallel Algorithms and Architectures

Presentation date： 1997.01

Event date：
1997.01

　

　

　View Summary

Multithreading aims to tolerate latency by overlapping communication with computation. This report explicates the multithreading capabilities of the EM-X distributed-memory multiprocessor through empirical studies. The EM-X provides hardware supports for fine-grain multithreading, including a by-passing mechanism for direct remote reads and writes, hardware FIFO thread scheduling, and dedicated instructions for generating fixed-sized communication packets. Bitonic sorting and Fast Fourier Transform are selected for experiments. Parameters that characterize the performance of multithreading are investigated, including the number of threads, the number of thread switches, the run length, and the number of remote reads. Experimental results indicate that the best communication performance occurs when the number of threads is two to four. FFT yielded over 95% overlapping due to a large amount of computation and communication parallelism across threads. Even in the absence of thread computation parallelism, multithreading helps overlap over 35% of the communication time for bitonic sorting.
Experience with fine-grain communication in EM-X multiprocessor for parallel sparse matrix computation

M Sato, Y Kodama, H Sakane, H Yamana, S Sakai, Y Yamaguchi

11TH INTERNATIONAL PARALLEL PROCESSING SYMPOSIUM, PROCEEDINGS IEEE, COMPUTER SOC PRESS

Presentation date： 1997

Event date：
1997

　

　

　View Summary

Sparse matrix problems require a communication paradigm different from those used in conventional distributed-memory multiprocessors. We present in this paper how fine-grain communication can help obtain high performance in the experimental distributed-memory multiprocessor, EM-X, developed at ETL, which can handle fine-grain communication very efficiently. The sparse matrix kernel, Conjugate Gradient, is selected for the experiments. Among the steps in CG, we focus on the sparse matrix-vector multiplications in the study. Some communication methods are developed for performance comparison, including coarse-grain and fine-grain implementations. Fine-grain communication allows exact data access in an unstructured problem to reduce the amount of communication. While CG presents bottlenecks in terms of a large number of fine-grain remote reads, the multithreaded principles of execution are designed to tolerate such latency. Experimental results indicate that the performance of the fine-grain read implementation is comparable to that of the coarse-grain implementation on 64 processors. The results demonstrate that fine-grain communication can be a viable and efficient approach to unstructured sparse matrix problems on large-scale distributed-memory multiprocessors.
Superspeed computer application technology for elucidating complicated phenomena.

関口智嗣, 佐藤三久, 山名早人

国立機関原子力試験研究成果報告書

Presentation date： 1997

Event date：
1997

　

　
Fine-grain parallel processing of a dense-matrix problem on EM-X-Efficient execution of wavefront parallelism.

坂根広史, 児玉祐悦, 小池汎平, 佐藤三久, 山名早人, 坂井修一, 山口喜教

並列処理シンポジウム論文集

Presentation date： 1997

Event date：
1997

　

　
Attractiveness of the Internet.

山名早人

CIAJ Journal (Communications and Information Network Association of Japan)

Presentation date： 1997

Event date：
1997

　

　
Message-based efficient remote memory access on a highly parallel computer EM-X

Yuetsu Kodama, Hirohumi Sakane, Mitsuhisa Sato, Hayato Yamana, Shuichi Sakai, Yoshinori Yamaguchi

IEICE Transactions on Information and Systems

Presentation date： 1996.12

Event date：
1996.12

　

　

　View Summary

Communication latency is central to multiprocessor design. This study presents the design principles of the EM-X distributed-memory multiprocessor towards tolerating communication latency. The EM-X overlaps computation with communication for latency tolerance by multithreading. In particular, we present two types of hardware support for remote memory access: (1) priority-based packet scheduling for thread invocation, and (2) direct remote memory access. The priority-based scheduling policy extends a FIFO-ordered thread invocation policy to adapt to different computational needs. The direct remote memory access is designed to overlap remote memory operations with thread execution. The 80-processor prototype of EM-X was developed and has been operational since December 1995. We execute several programs on the machine and evaluate how the EM-X effectively overlaps computation with communication toward tolerating communication latency for high-performance parallel computing.
Performance Evaluation for a Matrix Operation Benchmark on EM-X Multiprocessor

SAKANE HIROFUMI, KODAMA YUETSU, SATO MITSUHISA, YAMANA HAYATO, SAKAI SHUICHI, YAMAGUCHI YOSHINORI

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 1996.08

Event date：
1996.08

　

　

　View Summary

In this paper, we discuss an implementation of the LINPACK benchmark parallelized on the EM-X multiprocessor and evaluate its performance, focusing on the floating point operations in which a regular repetitive pattern occurs. It is important to overlap the communication and calculation as much as relationship between the broadcast algorithms and load balancing. Exploiting the potential for reducing the number of memory accesses and adopting the multi-column simultaneous elimination technique, we also further accelerated the innermost loop code we had already reported for optimization on a...
Message-based efficient remote memory access on a highly parallel computer EM-X

Y Kodama, H Sakane, M Sato, H Yamana, S Sakai, Y Yamaguchi

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS IEICE-INST ELECTRON INFO COMMUN ENG

Presentation date： 1996.08

Event date：
1996.08

　

　

　View Summary

Communication latency is central to multiprocessor design. This study presents the design principles of the EM-X distributed-memory multiprocessor towards tolerating communication latency. The EM-X overlaps computation with communication for latency tolerance by multithreading. In particular, we present two types of hardware support for remote memory access: (1) priority-based packet scheduling for thread invocation, and (2) direct remote memory access. The priority-based scheduling policy extends a FIFO-ordered thread invocation policy to adapt to different computational needs. The direct remote memory access is designed to overlap remote memory operations with thread execution. The 80-processor prototype of EM-X was developed and has been operational since December 1995. We execute several programs on the machine and evaluate how the EM-X effectively overlaps computation with communication toward tolerating communication latency for high-performance parallel computing.
Enjoy the WWW!

YAMANA Hayato

The Journal of the Institute of Electronics, Information, and Communication Engineers 社団法人電子情報通信学会

Presentation date： 1996.01

Event date：
1996.01

　

　
Application technology of superspeed computer for elucidation of complicated phenomena.

関口智嗣, 佐藤三久, 山名早人

国立機関原子力試験研究成果報告書

Presentation date： 1996

Event date：
1996

　

　
Parallel execution of radix sort program on a highly parallel computer EM-X.

児玉祐悦, 坂根広史, 佐藤三久, 山名早人, 坂井修一, 山口喜教

並列処理シンポジウム論文集

Presentation date： 1996

Event date：
1996

　

　
Survey of Speculative Execution and the Effect of Task-level Speculation

Yamana Hayato, Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirohumi, Sakai Shuichi, Yamaguchi Yoshinori

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1995.09

Event date：
1995.09

　

　


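We report a survey of speculative execution covering 1994 through July 1995 and show the effectiveness of the inter-task speculative execution that we have proposed. For the survey of work up to 1994, see reference [2]. Table 1 lists the papers surveyed, and Figure 1 shows the recent trend in the number of papers on speculative execution. As Figure 1 shows, papers on speculative execution have increased sharply since around 1991, when VLIW and superscalar processors began to appear. These studies fall into three classes: (1) investigations of the instruction-level parallelism inherent in programs, (2) speculative execution on superscalar/VLIW processors, and (3) speculative execution on parallel computers. In the early 1990s most papers concerned (1), after which papers on (2) increased sharply. Among the 1994-95 papers, those on branch prediction and predicated execution account for 70% of the total, while papers on architectural implementation methods, which were common in 1989-93, have decreased drastically. This report focuses on branch prediction and predicated execution, currently the hottest topics.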
Parallelization and Performance of Sparse Matrix Computation in The EM-X Multiprocessor

Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirofumi, Yamana Hayato, Sakai Shuichi, Yamaguti Yoshinori

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 1995.08

Event date：
1995.08

　

　


In this paper, we describe the parallelization of a sparse matrix computation, CG (Conjugate Gradient method) kernel taken from NAS parallel benchmark suite, for the EM-X multiprocessor. EM-X is a distributed-memory multiprocessor. Dataflow mechanism of EM-X supports fine-grain communication very efficiently, which provides low latency communication, and flexible message-passing with direct remote memory access. We compare the performance of sparse matrix vector multiplications by the complete exchange communication and by the element-wise remote memory access with multithreading. The measu...
Decreasing the Control Overhead of the Unlimited Speculative Execution

Yamana Hayato, Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirohumi, Sakai Shuichi, Yamaguchi Yoshinori

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 1995.08

Event date：
1995.08

　

　


This paper discusses how to decrease the control overhead of tasks with speculation on multiprocessors. First, we implemented unlimited speculative execution on the EM-4 multiprocessor. Second, we classified the overhead by source. Measuring each class of overhead confirmed that neither the broadcast latency nor the overhead of initiating tasks is a major factor. Instead, the overhead of receiving and manipulating broadcast control data is the major factor. When the factor is decreased by 1/4, the speedup ratio increases up to 3 and we will have 1...
A Distributed Control Scheme of Macrotask-level Speculative Execution on the EM-4 Multiprocessor

Yamana Hayato, Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirohumi, Sakai Shuichi, Yamaguchi Yoshinori

Transactions of Information Processing Society of Japan 一般社団法人情報処理学会

Presentation date： 1995.07

Event date：
1995.07

　

　


The purpose of this paper is to propose a new fast control scheme for macrotasks with speculation. A macrotask is a coarse-grain task that serves as the unit of speculation. Previous work has reported that the speedup ratio is 12 to 630 times that of conventional execution schemes without speculation when both the speculation depth and the computational resources are infinite, which is called the oracle model. The control overhead, however, prevents the speedup from attaining the theoretical speedup ratio. Thus, a control scheme with small overhead is desired. The distributed control scheme a...
A macrotask-level unlimited speculative execution on multiprocessors

Hayato Yamana, Mitsuhisa Sato, Yuetsu Kodama, Hirofumi Sakane, Shuichi Sakai, Yoshinori Yamaguchi

Proceedings of the International Conference on Supercomputing Association for Computing Machinery

Presentation date： 1995.07

Event date：
1995.07

　

　


The purpose of this paper is to propose a new fast execution scheme for FORTRAN programs. The proposed scheme enables fast initiation of a macrotask when its data dependences are satisfied, even if the control flow has not yet reached it. Previous schemes for parallelizing a program that includes conditional branches have a number of problems: 1) although the theoretical speedup ratio is up to N when N conditional branches are jumped on either a VLIW or a superscalar machine, N is limited by the number of ALUs on a chip; 2) since conventional control schemes use a few processors to control macrotasks, the overhead of controlling them is large. The proposed scheme solves these problems: 1) it enables speculative execution between coarse-grain tasks, i.e., macrotasks, on multiprocessors by jumping many conditional branches; 2) a distributed control scheme is proposed and implemented on the EM-4 multiprocessor to decrease the control overhead of macrotasks. Preliminary evaluations show that the control overhead of the proposed scheme is smaller than that of the other control schemes. Moreover, it is confirmed that the distributed control can be implemented in software when the average macrotask execution time is larger than 14.4 μs on the EM-4 multiprocessor.
A SPECULATIVE EXECUTION SCHEME OF MACROTASKS FOR PARALLEL-PROCESSING SYSTEMS

H YAMANA, T YASUE, Y ISHII, Y MURAOKA

SYSTEMS AND COMPUTERS IN JAPAN SCRIPTA TECHNICA PUBL

Presentation date： 1995.06

Event date：
1995.06

　

　


This paper considers the high-speed execution of FORTRAN programs on parallel processing systems and proposes a scheme for parallelizing and executing programs based on speculative execution over multiple conditional branches. Several techniques have been proposed for parallelizing programs that include conditional branches. One method does not use speculative execution: (1) the method called earliest execution condition determination. Two methods use speculative execution: (2) the speculative evaluation scheme for a single conditional branch on a superscalar processor or VLIW computer; and (3) the multiple speculative execution scheme that assumes particular loops. These have the following problems: (1) sufficient parallelism is not extracted by determining the earliest execution condition alone; (2) the speed improvement that can be realized by speculative execution of a single conditional branch is at most twofold; and (3) the scheme can be applied only to particular loops. This paper divides the program into macrotasks and defines a multiple-stage speculative execution scheme between macrotasks on a general parallel processing system. Then, execution control for each individual macrotask is proposed, using the execution start condition, the control establishment condition, and the execution stop condition.
EM-X parallel computer: Architecture and basic performance

Yuetsu Kodama, Hirohumi Sakane, Mitsuhisa Sato, Hayato Yamana, Shuichi Sakai, Yoshinori Yamaguchi

Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA

Presentation date： 1995.01

Event date：
1995.01

　

　


Latency tolerance is essential in achieving high performance on parallel computers for remote function calls and fine-grained remote memory accesses. EM-X supports interprocessor communication on an execution pipeline with small and simple packets. It can create a packet in one cycle, and receive a packet from the network in the on-chip buffer without interruption. EM-X invokes threads on packet arrival, minimizing the overhead of thread switching. It can tolerate communication latency by using efficient multi-threading and optimizing packet flow of fine grain communication. EM-X also supports the synchronization of two operands, direct remote memory read/write operations and flexible packet scheduling with priority. This paper describes distinctive features of the EM-X architecture and reports the performance of small synthetic programs and larger more realistic programs.
The EM-X parallel computer: Architecture and basic performance

Y KODAMA, H SAKANE, M SATO, H YAMANA, S SAKAI, Y YAMAGUCHI

22ND ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, PROCEEDINGS ASSOC COMPUTING MACHINERY

Presentation date： 1995

Event date：
1995

　

　


Latency tolerance is essential in achieving high performance on parallel computers for remote function calls and fine-grained remote memory accesses. EM-X supports interprocessor communication on an execution pipeline with small and simple packets. It can create a packet in one cycle, and receive a packet from the network in the on-chip buffer without interruption. EM-X invokes threads on packet arrival, minimizing the overhead of thread switching. It can tolerate communication latency by using efficient multi-threading and optimizing packet flow of fine grain communication. EM-X also supports the synchronization of two operands, direct remote memory read/write operations and flexible packet scheduling with priority. This paper describes distinctive features of the EM-X architecture and reports the performance of small synthetic programs and larger more realistic programs.
A Method for Calculating the Execution Time of Singly Nested Doacross Loops on Distributed Shared-Memory Parallel Computers

山名,安江, 村岡,山口

電子情報通信学会論文誌

Presentation date： 1995

Event date：
1995

　

　
An Evaluation of Doacross among Loops on the EM-4 Multiprocessor

YAMANA Hayato, SATO Mitsuhisa, KODAMA Yuetsu, SAKANE Hirohumi, SAKAI Shuichi, YAMAGUCHI Yoshinori

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1994.03

Event date：
1994.03

　

　


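Doacross [Cytr86] and Pipelining [PaKL80] have been proposed as schemes for executing loops other than Doall loops on parallel computers. However, these schemes were originally designed for tightly coupled parallel computers, and they cannot deliver sufficient performance on loosely coupled parallel computers, in which processors exchange data by message passing. This is due to the following problems, where N denotes the number of loop iterations. In Doacross, the interprocessor communication delay is added to the total execution time (N-1) times, so no speedup is obtained unless this delay is sufficiently small. In Pipelining, the execution time Ts of each statement is added to the total execution time (N-1) times, so no speedup is obtained unless Ts is sufficiently small. On loosely coupled parallel computers, processors exchange data by message passing, so it is difficult to make the communication delay small. In addition, Ts includes the time for data input and output with other processors, so it is also difficult to make Ts small. In contrast, the Doacross among loops proposed in this report is a scheme that reduces the effect of the interprocessor communication delay, and of the per-statement execution time Ts, on the total execution time. ...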
Optimization of network interface in a processing element for a parallel computer EM-X

Sakane Hirofumi, Kodama Yuetsu, Sato Mitsuhisa, Yamana Hayato, Sakai Shuichi, Yamaguchi Yoshinori

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 1994.01

Event date：
1994.01

　

　


This paper discusses some of the design parameters for the network interface of a single chip processor EMC-Y to achieve high throughput and high performance. We are currently designing the EMC-Y, a processing element of a parallel computer EM-X. The design parameters include the arbitration method in the network switch, memory access priority and the size of internal FIFOs. To optimize parallel execution performance, we have examined the parameters of the network interface by using a register transfer level simulator of the EM-X.
The Current State of Speculative Execution and a Proposal of the Unlimited Speculative Execution Scheme

山名

情報処理学会研究報告

Presentation date： 1994

Event date：
1994

　

　
Fundamental performance evaluation of a processing element EMC-Y for a parallel computer.

坂根広史, 児玉祐悦, 佐藤三久, 山名早人, 坂井修一, 山口喜教

情報処理学会研究報告

Presentation date： 1994

Event date：
1994

　

　
Survey of Today’s Speculative Execution Schemes and a Proposal of Unlimited Speculative Execution Scheme.

山名早人, 佐藤三久, 児玉祐悦, 坂根広史, 坂井修一, 山口喜教

情報処理学会研究報告

Presentation date： 1994

Event date：
1994

　

　
A Distributed Controlling Scheme of the Multistage Speculative Execution on the EM-4 Multiprocessor.

山名早人, 佐藤三久, 児玉祐悦, 坂根広史, 坂井修一, 山口喜教

並列処理シンポジウム論文集

Presentation date： 1994

Event date：
1994

　

　
Automatic Tuning of Loop-Doacross Execution Scheme on the EM-4 Multiprocessor.

山名早人, 佐藤三久, 児玉祐悦, 坂根広史, 坂井修一, 山口喜教

情報処理学会研究報告

Presentation date： 1994

Event date：
1994

　

　
An Eager Evaluation Scheme between Macrotasks for Parallel Processing Systems

山名,安江, 石井,村岡

電子情報通信学会論文誌

Presentation date： 1994

Event date：
1994

　

　
An Experimental Evaluation of the Multi-stage Speculative Execution Scheme on the EM-4 Multiprocessor

Yamana Hayato, Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirohumi, Sakai Shuichi, Yamaguchi Yoshinori

IPSJ SIG Notes 一般社団法人情報処理学会

Presentation date： 1993.08

Event date：
1993.08

　

　


The purpose of this paper is to evaluate a new fast execution scheme for programs with speculative execution on the EM-4 multiprocessor. Conventional schemes for parallelizing programs that include conditional branches have some problems. The multi-stage speculative execution scheme enables (1) solving the side-effects problem, (2) decreasing the number of processors needed to execute the speculative execution scheme, (3) jumping multi-stage conditional branches, and (4) suitability for general multiprocessor systems. An experimental evaluation shows that the measured speedup ratio is 50% of the theoretical rat...
A Macro-tasking Scheme for Eager Evaluation

YAMANA Hayato, YASUE Toshiaki, ISHII Yoshihiko, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1993.03

Event date：
1993.03

　

　


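We report a method for constructing macrotasks in the eager evaluation scheme between macrotasks that we have previously proposed. Eager evaluation is a scheme that advances execution beyond the conditional branch statements in a program. Macrotask generation has two aims: (1) avoiding the side-effect problem caused by double definition of variables, and (2) reducing the number of processors required for tentative (speculative) execution. The side effects of eager evaluation arise when the same data item is defined twice during eager evaluation. To avoid double definition, this paper constructs macrotasks so that the data dependences into each macrotask are unique regardless of control. Next, to reduce the number of processors at run time, macrotask generation uses the relation between data dependence and control dependence to form a single macrotask from any portion whose fusion does not lose the benefit of eager evaluation. Whereas conventional macrotask generation methods considered only control dependence, this method is novel in that it takes data dependence into account.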
An Optimal Data Transfer Order of Doacross Loops

YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1993.03

Event date：
1993.03

　

　


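This report derives the data communication order that minimizes the execution time of DOACROSS loops. Previous studies of DOACROSS loop execution have used the execution time of arithmetic instructions (hereafter, computation time) and the delay of data communication (hereafter, communication delay) as the parameters describing processor capability. However, when a DOACROSS loop is executed on a multiprocessor that can process computation and communication in parallel, the communication pitch must be considered in addition to these parameters. The communication pitch is the time interval between successive data inputs or outputs between a processor and the interconnection network. When the communication pitch is larger than the interval at which data communications occur, communication becomes the bottleneck of the total execution time. This is because data communications cannot be started (hereafter, issued) at intervals shorter than the communication pitch, so issue is delayed. In that case, the actual execution time exceeds the conventional theoretical value. We show below that, in such cases, the execution time can be shortened by changing the communication order instead of sending data to other processors in definition order.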
An Implementation of Sparse BLAS-3 on a Distributed Memory Parallel Machine AP1000

Utino Satosi, Hagiwara Junichi, Yasue Toshiaki, Yamana Hayato, Muraoka Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1993.03

Event date：
1993.03

　

　


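We parallelized BLAS-3 (Basic Linear Algebra Subroutine Level 3) for sparse matrices and implemented it on the Fujitsu distributed-memory parallel computer AP1000. BLAS-3 consists of the following subroutines: matrix products (GEMM, SYMM); rank-k and rank-2k updates of symmetric matrices (SYRK, SYR2K); products with a triangular matrix (TRMM); and systems of linear equations with a triangular coefficient matrix and multiple right-hand-side columns (TRSM). In the five routines other than TRSM, the main operation is a matrix-matrix product, so the method used to parallelize the sparse matrix product is central to the implementation. Compared with parallelizing a dense matrix product, parallelizing a sparse matrix product raises the following issues. 1. In computing the sparse matrix product C ← C + AB, rewriting C is time-consuming because sparse matrices are stored in compressed form. 2. When the matrices are partitioned among cells by the method described later, the submatrices held by the cells (PEs) differ in size, so the memory needed to store them and the communication volume become unbalanced. This paper therefore proposes a computation order that resolves issue 1, and a communication method that resolves issue 2 when the computation is executed in that order. We further evaluate the approach with a GEMM routine (product of nonsymmetric matrices) implemented according to the proposed method. The sparse matrices are assumed to be stored under the following conditions. To make the program general-purpose...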
A flow‐executing scheme for DOACROSS loops on dynamic dataflow machines

Yoshihiko Ishii, Hayato Yamana, Toshiaki Yasue, Yoichi Muraoka

Systems and Computers in Japan

Presentation date： 1993

Event date：
1993

　

　


This paper modifies a flow‐executing scheme of the color‐reuse type, using multiple initial loop control packets, and then proves that the flow‐executing scheme is best suited for executing DOACROSS loops on dynamic dataflow machines. Flow‐executing schemes can be divided into four categories: (1) those using a single initial loop control packet; (2) those using multiple initial loop control packets; (a) the color‐overflow type; and (b) the color‐reuse type. The flow‐executing schemes can then be classified into Classes (1‐a), (1‐b), (2‐a), and (2‐b) through the combination of Categories (1), (2), (a), and (b). This paper suggests that Class (2‐b) is best suited for executing DOACROSS loops, as it extracts full parallelism from DOACROSS loops, no synchronization overhead exists, and no memory access overhead exists after the synchronization. Copyright © 1993 Wiley Periodicals, Inc., A Wiley Company
A Distributed Control Scheme of the Multi-stage Jumping Execution of Conditional Branches for Macrotasks

YAMANA Hayato, YASUE Toshiaki, ISHII Yoshihiko, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1992.09

Event date：
1992.09

　

　


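This report proposes a distributed control method for macrotasks as an effective way to control macrotasks in the multi-stage tentative execution scheme based on eager evaluation. Of the data dependences and control dependences in a program, the multi-stage tentative execution scheme starts executing a task, called a macrotask, as soon as its data dependences are satisfied, and later selects the macrotasks whose control has been resolved according to the control dependences. The problem in realizing this scheme on an actual multiprocessor is reducing the various overheads that arise at run time. Because execution starts before control is resolved, the run-time overheads include (1) overhead due to increased memory bandwidth demand and (2) control overhead incurred in controlling a large number of macrotasks. To reduce overhead (2), this paper proposes a macrotask control method that adds hardware dedicated to macrotask control to each processor and eliminates centralized control. Problem (1) is a macrotask scheduling problem and is left for future work.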
A Description and its Application of the Data Dependence between Loops for : HAREDAS

KANEKO Masanori, YASUE Toshiaki, HAGIWARA Junichi, TAHARA Ayumu, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1992.09

Event date：
1992.09

　

　


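This paper describes how data dependences between loops are represented in the multi-architecture compiler development environment HAREDAS, and gives examples of their application. Dependence vectors have conventionally been used to characterize data dependences in a program. However, a dependence vector describes data dependences within a single loop or outside loops, and no general notation had been defined for characterizing data dependences between loops. In this paper we define the inter-loop dependence vector and describe how it represents data dependences between loops. We also show that inter-loop dependence vectors make it easier than conventional methods to decide whether loops can be fused, and we describe how to apply them to fuse loops without losing the parallelism of each loop.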
The inner expression of HAREDAS: The compiler development environment or multi-architecture compiler for massive parallel computing

YASUE Toshiaki, KANEKO Masanori, HAGIWARA Junichi, TAHARA Ayumu, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1992.09

Event date：
1992.09

　

　


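HAREDAS is a development environment aimed at developing a massively parallelizing multi-architecture compiler. This paper describes the internal representation of HAREDAS and how eager evaluation is expressed in it. One approach to massive parallelization is to extract, by eager evaluation, the parallelism implicit in existing languages. Eager evaluation is a speedup technique that removes precedence constraints other than data dependences by altering the control dependences in a program. In conventional work, however, eager evaluation has been realized only as a supplementary means of compensating for insufficient parallelism in instruction-level scheduling. Because HAREDAS can handle eager evaluation generally at the level of the internal representation, the parallelism that eager evaluation can expose can be used effectively. This paper describes how eager evaluation is expressed in this internal representation. Section 2 describes the structure of HAREDAS. Section 3 explains the structure and features of the internal representation, and Section 4 details how eager evaluation is expressed and manipulated.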
A Parallelizing and Executing Scheme of FORTRAN Programs with Eager Evaluation

YAMANA Hayato, YASUE Toshiaki, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1992.02

Event date：
1992.02

　

　


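This report proposes a program parallelization method and an execution scheme based on eager evaluation for fast execution of FORTRAN programs on multiprocessors. Methods previously proposed for parallelizing programs that contain conditional branches include finding the earliest execution condition of each task and execution schemes that go beyond control dependences. These have the following problems: (1) finding the earliest execution condition alone does not yield sufficient parallelism, and (2) the target programs are restricted and no execution scheme has been proposed. To address these problems, we have proposed a tentative execution scheme using flow graph unfolding and an n-stage eager evaluation control scheme for conditional branches using data-driven execution. This paper generalizes these methods and discusses the theoretical speedup.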
A Control Scheme of Multistage Eager Evaluation for Multiprocessor System

ISHII Yoshihiko, YASUE Toshiaki, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1992.02

Event date：
1992.02

　

　


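This paper describes the difference, on multiprocessor systems, between control of eager evaluation over a single stage of tasks (single-stage eager evaluation control) and control of eager evaluation across multiple stages of tasks (multistage eager evaluation control). We have proposed single-stage and multistage eager evaluation control in the context of a concrete multiprocessor system, the parallel processing system -Harray-. In this paper we express both forms of control in temporal logic and generalize them. We then use this temporal logic to reason about the differences between the two, and through this reasoning we establish the correctness of the single-stage and multistage eager evaluation control that we proposed for the concrete multiprocessor system.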
An optimized vectorization scheme for multiply nested loops

SHINKAI Masashi, YASUE Toshiaki, KANEKO Masanori, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1992.02

Event date：
1992.02

　

　


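To achieve optimal vectorization of multiply nested loops, this paper proposes a vectorization method based on two analysis policies: (1) making loops tightly nested starting from the innermost loop, and (2) aggressive loop distribution. Conventional vectorization methods for multiply nested loops cannot achieve optimal vectorization because (1) they tighten from the outermost loop, so loops cannot be distributed sufficiently, and (2) their evaluation of the gains and losses of loop distribution is incomplete. This paper proposes analysis methods that solve these problems and quantitatively evaluates the method on a real machine (Fujitsu VP2200).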
A Scheme to Reduce the Access Rate to Shared Memory for Multiprocessor System.

山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1992

Event date：
1992

　

　
An Execution Scheme for DO-loops on Distributed Memory Machines.

萩原純一, 安江俊明, 金子正教, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1992

Event date：
1992

　

　
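Yoichi Muraoka Laboratory, Department of Electronics and Communication Engineering, School of Science and Engineering, Waseda University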

田渕仁浩, 山名早人

人工知能学会誌社団法人人工知能学会

Presentation date： 1991.05

Event date：
1991.05

　

　
Parallel execution scheme of conditional branches with graph unfolding for the parallel processing system - Harray

Hayato Yamana, Toshiaki Yasue, Jun Kohdate, Yoichi Muraoka

Bulletin of Centre for Informatics (Waseda University)

Presentation date： 1991.03

Event date：
1991.03

　

　


The purpose of this paper is to propose and evaluate a new scheme, called the Preceding Activation Scheme with Graph Unfolding, which translates a FORTRAN program into a dataflow graph and executes it efficiently. The problem in restructuring a FORTRAN program into a dataflow graph is that a FORTRAN program has an explicit control flow, which results in little parallelism because many gate operations, such as T/F gates, are introduced in the dataflow graph to synchronize the data movement. Thus, discarding these gate operations is the key to exposing parallelism in a FORTRAN program, which is the main purpose of the proposed scheme. In the software simulation, it is shown that the execution speed with the proposed scheme for flow graphs without backward branches is about 1.5 times as fast as that of the pure dataflow computer. Moreover, the execution speed is 2.7 times as fast as that of the pure dataflow computer if a flow graph including backward branches is unfolded by the proposed scheme.
A Control Scheme of Processing Element for the Parallel Processing System : Harray

ISHIZAKI Kazuaki, ISHII Yoshihiko, HAGIMOTO Takeshi, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1991.02

Event date：
1991.02

　

　


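This report describes the control method within a processing element (PE) during tentative execution in the parallel processing system -Harray-. Tentative execution is a scheme that performs eager evaluation beyond control dependences, which are one of the factors preventing parallel execution of a program. Eager evaluation in conventional pipelined processors and similar machines was performed within a single processor, so its scope was small. In -Harray-, tentative execution uses multiple processors and performs eager evaluation across multiple stages. The problem here is how to control PEs across multiple PEs once a branch is resolved. If control is performed centrally in one place, the overhead cannot be ignored with processors on the order of 1000. We have therefore proposed a scheme in which execution control is distributed to each PE. In this report, we first define a unit of control called the Activation Set as the unit of tentative execution. Next, we outline the control method, independent for each PE, used during tentative execution with Activation Sets. Finally, we present the concrete processing procedure within a PE.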
A Parallel Execution Scheme of Conditional Branches and its Evaluation for the Parallel Processing System : Harray

YAMANA Hayato, YASUE Toshiaki, Kohdate Jun, MURAOKA Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1991.02

Event date：
1991.02

　

　


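This report describes how parallel processing of the conditional branches in a program shortens program execution time, and presents a performance prediction for the conditional-branch parallel processing method on the parallel processing system -Harray- that we have proposed. Many attempts have been made to process the conditional branches in a program in parallel, mainly on VLIW computers. However, because these schemes do not target large-scale parallel computers, the number of eager evaluation stages for conditional branches is small, and so is the parallelism obtained. In contrast, -Harray- has on the order of 1000 processing elements, so it increases the number of eager evaluation stages and extracts sufficient parallelism from the program. As a method for increasing the number of eager evaluation stages, we have previously proposed flow graph unfolding. Flow graph unfolding does not synchronize at conditional branch points; it simultaneously executes the operations on all control flows that diverge according to whether the condition holds, and later selects the flow that control makes valid. Evaluations so far have confirmed a 1.5- to 5.2-fold speedup for the portions to which flow graph unfolding applies. This paper first (1) shows the speedup from parallel processing of conditional branches using simulation results for several scientific computing programs, and then (2) presents an evaluation of how much speedup from flow graph unfolding can be expected for a program as a whole.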
An environment for dataflow program development of parallel processing system‐harray

Hayato Yamana, Jun Kohdate, Toshiaki Yasue, Yoichi Muraoka

Systems and Computers in Japan

Presentation date： 1991

Event date：
1991

　

　


This paper considers the dataflow program development environment for the system programmer who develops the compiler and proposes a method to improve the debugging efficiency. The conventional debugging methods are either: (1) to monitor the packet in the dataflow ring, or (2) to specify the function containing a bug. The former contains unsolved problems such as the determination of start timing for the data monitoring and the presentation of a large amount of information to the user. The latter contains a problem in that the debugging is impossible at the dataflow level. This paper aims at the solution of those problems, and the detailed debugging is executed on the software, not on the real machine. The information presentation on a dataflow graph is considered for systematic presentation of the debugging information. As the development environment, the parallel processing system Harray proposed by the authors is considered. In the proposed system, a two‐stage process is employed in which the first step is to specify the macro‐block (which is a task unit in Harray) containing the bug, and the second step is the detailed debugging of the specified macro‐block. The debugging within the macroblock is executed on the software, and the debugging efficiency is improved by: (1) diagram representation for easier visual recognition, and (2) backward tracing function. Copyright © 1991 Wiley Periodicals, Inc., A Wiley Company
A Scheduling Scheme for the Parallel Processing System. Harray. A Task Restructuring for the Reduction of Inter-Processor Communication and Synchronization.

萩原純一, 安江俊明, 山名早人, 村岡洋一

情報処理学会全国大会講演論文集

Presentation date： 1991

Event date：
1991

　

　
A Network Construction of Parallel Processor for Eager Evaluation.

石崎一明, 安江俊明, 山名早人, 村岡洋一

情報処理学会研究報告

Presentation date： 1991

Event date：
1991

　

　
A Construction of Execution Unit for Parallel Processing System. Harray.

萩本猛, 山名早人, 村岡洋一

情報処理学会全国大会講演論文集

Presentation date： 1991

Event date：
1991

　

　
Loop Parallelizing Scheme:Dependent-flow Loop.

金子正教, 中里倫明, 安江俊明, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1991

Event date：
1991

　

　
A Scheme to Reduce the Access Rate to Shared Memory for the Parallel Processing System -Harray-.

山名早人, 大段智志, 村岡洋一

並列処理シンポジウム論文集

Presentation date： 1991

Event date：
1991

　

　
A Parallel Execution Scheme of Conditional Branches Using Eager Evaluation for the Parallel Processing System-Harray.

山名早人, 石崎一明, 安江俊明, 村岡洋一

情報処理学会研究報告

Presentation date： 1991

Event date：
1991

　

　
Prototype FORTRAN Compiler for Parallel Processing System -Harray-.

安江俊明, 神舘淳, 山名早人, 村岡洋一

並列処理シンポジウム論文集

Presentation date： 1991

Event date：
1991

　

　
Dataflow program developing environment for the parallel processing system -Harray-.

安江俊明, 神舘淳, 山名早人, 村岡洋一

並列処理シンポジウム論文集

Presentation date： 1990.06

Event date：
1990.06

　

　
Parallel processing system -Harray-

H. Yamana, Y. Kusano, T. Yasue, J. Kohdate, T. Hagiwara, Y. Muraoka

Computing Systems in Engineering

Presentation date： 1990

Event date：
1990

　

　


The parallel processing system -Harray- for scientific computations is introduced. The special features of the -Harray- system described are (1) the Controlled Dataflow (CD flow) mechanism, (2) the preceding activation scheme with graph unfolding, and (3) the visual environment for dataflow program development. The CD flow mechanism, which controls the sequence of execution at two levels (dataflow execution in each processor and control flow execution between processors), is adopted in the -Harray- system. Though dataflow computers are expected to extract parallelism fully from a program, they have many problems, such as the difficulty of controlling the sequence of execution. To solve these problems, the CD flow mechanism is adopted. The preceding activation scheme makes it possible to bypass control dependencies in a program, such as IF-GOTO statements, which decrease the parallelism in a program. The flow graph of a program is unfolded to decrease the control dependency and to increase the parallelism. The visual environment helps programmers in the writing and debugging of a dataflow program. The environment consists of a graphical editor of a dataflow graph, and a debugger. These special features of the -Harray- system and its execution mechanism are described. © 1990.
A FORTRAN compiler for parallel processing system. Harray.

安江俊明, 神館淳, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1990

Event date：
1990

　

　
A method of function calls. parallel processing system -Harray-.

石崎一明, 神舘淳, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1990

Event date：
1990

　

　
Evaluation of color management on parallel processing system -Harray-.

石井吉彦, 安江俊明, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1990

Event date：
1990

　

　
A macro-block controlling scheme of parallel processing system. Harray.

山名早人, 安江俊明, 神館淳, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1990

Event date：
1990

　

　
The organization of global memory for the parallel processing system -Harray-.

山名早人, 片山啓, 草野義博, 村岡洋一

並列処理シンポジウム論文集

Presentation date： 1990

Event date：
1990

　

　
A construction of switching unit for parallel processing system -Harray-.

石崎一明, 山名早人, 村岡洋一

情報処理学会研究報告

Presentation date： 1990

Event date：
1990

　

　
Implementation of color management scheme on data driven computer.

石井吉彦, 安江俊明, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1990

Event date：
1990

　

　
A Structure Handling Scheme of Parallel Processing System -Harray-

Yamana Hayato, Kusano Yoshihiro, Muraoka Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1989.10

Event date：
1989.10

　

　


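This paper describes how the structure handling scheme [2] is realized in the parallel processing system -Harray- [1]. -Harray- adopts the CD flow (Controlled Dataflow) scheme [3] as its execution scheme. In the CD flow scheme, control-flow control is performed between processing units called macro-blocks, and dataflow execution is performed within each macro-block. Dataflow execution has no notion of memory, but in building an actual computer a structure memory for storing large structures is indispensable. I-structures [4] and similar schemes have been proposed for structure handling. However, these schemes strictly implement the single-assignment rule of the dataflow model: a structure can be referenced many times but defined only once. Consequently, on a second definition the structure must be copied, which incurs overhead. For -Harray-, we have proposed a structure handling scheme that allows multiple definitions and references to the structure memory (hereafter, global memory) and eliminates the copy overhead on redefinition [2]. This scheme guarantees access order for definitions and references that span multiple macro-blocks, and does not cover definitions and references confined to a single macro-block. This is because, when data defined in a macro-block is used within the same macro-block, the defined data [...]. In this paper, we begin with the global memory of -Harray- and...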
A Run-time Error Handling Scheme for Parallel Processing System -Harray-

Hagimoto Takeshi, Kusano Yoshihiro, Yamana Hayato, Muraoka Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1989.10

Event date：
1989.10

　

　


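We have proposed -Harray- (-HARRAY-: Hybrid ARRAY), a parallel processing system for scientific computation. -Harray- is a parallel processing system with 1024 processing elements whose goal is fast execution of scientific programs written in FORTRAN. Its execution scheme is the CD flow scheme: at compile time a program is divided into units called macro-blocks, with control flow between macro-blocks and dataflow execution within each macro-block. For dataflow programs, we have confirmed that gate post-placement (ゲート後置), described later, improves execution speed about threefold when computing resources are assumed to be unlimited. Since computing resources are finite, however, -Harray- applies gate post-placement to the portions of a program whose degree of parallelism is smaller than the available computing resources, and speeds up those portions. When a control gate exists in order to avoid a run-time error, however, gate post-placement can cause a run-time error in the eagerly evaluated portion. Such a run-time error is not caused by a mistake in the user's program, so it must not be reported to the user. Therefore, when a run-time error that could have been caused by gate post-placement occurs, it is necessary to determine whether the cause is gate post-placement or an error in the program. In this paper, when a run-time error that could have been caused by gate post-placement occurs, ...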
A Method of Color Management for Parallel Processing System -Harray-

Ishii Yoshihiko, Yasue Toshiaki, Yamana Hayato, Muraoka Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1989.10

Event date：
1989.10

　

　


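This paper proposes a color management scheme for the parallel processing system -Harray- [1]. -Harray- adopts dynamic data-driven execution within processing units called macro-blocks [1]. Dynamic data-driven execution uses colors to process loops. Because colors are finite, however, they must be managed as a resource. For color resource management, that is, the reclamation and reassignment of colors, a conventional scheme has been proposed that creates a new loop when colors overflow [2]. However, it suffers from large loop creation overhead. Moreover, since computing resources are finite, using more colors than the computing resources can support yields no further speedup. In other words, it suffices to use a number of colors commensurate with the computing resources. With these points in mind, this paper proposes a loop processing scheme that uses no more colors than necessary and reduces the overhead of color reclamation and reassignment. We first perform dataflow analysis on the loop body to determine the required number of colors (denoted L). We then present a loop processing scheme that avoids color overflow and reclaims and reassigns colors limited to L. This report covers the case where L does not exceed the computing resources.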
A PRECEDING ACTIVATION SCHEME WITH GRAPH UNFOLDING FOR THE PARALLEL PROCESSING SYSTEM HARRAY

H YAMANA, T HAGIWARA, J KOHDATE, Y MURAOKA

PROCEEDINGS : SUPERCOMPUTING 89 ASSOC COMPUTING MACHINERY

Presentation date： 1989

Event date： 1989

The purpose of this work is to propose and evaluate the preceding activation scheme with graph unfolding, which translates a Fortran program into a dataflow graph and executes it efficiently. The problems in restructuring a Fortran program into a dataflow graph are that a Fortran program is not written under a single assignment rule and that it has an explicit control flow. These problems result in little parallelism because many gate operations, such as T/F gates, are introduced in the dataflow graph to synchronize the data movement. Therefore, discarding these gate operations is the key to exposing parallelism in a Fortran program. The preceding activation scheme with graph unfolding is proposed to discard these gate operations. The result of the performance evaluation by the 'Harray' software simulator is presented. It is shown that execution with the proposed scheme for flow graphs without backward branches is about 1.5 times as fast as with the extended activation scheme, which initiates execution only after it is confirmed that a basic block will be selected at a conditional branch. Moreover, execution is 2.7 times as fast as with the extended activation scheme if a flow graph including backward branches is unfolded by the proposed scheme.
A compiling algorithm of parallel processing system - Harray with graph unfolding scheme.

神舘淳, 安江俊明, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1989

Event date： 1989
Parallel processing system HARE.

萩原孝, 山名早人, 丸島敏一, 村岡洋一

BIT (Tokyo)

Presentation date： 1989

Event date： 1989
A controlled dataflow mechanism of parallel processing system - Harray.

山名早人, 草野義博, 神舘淳, 安江敏明, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1989

Event date： 1989
Evaluation of unfolded flow graph for the parallel processing system -Harray-.

荻原孝, 山名早人, 神館純, 村岡洋一

情報処理学会研究報告

Presentation date： 1989

Event date： 1989
Visual environment for lower level program development of parallel processing system -Harray-.

安江俊明, 神舘淳, 萩原孝, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1989

Event date： 1989
The Completion Detection of Macro-Block in Parallel Computer System -Harray-

Kusano Yoshihiro, Hagiwara Takashi, Yamana Hayato, Muraoka Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1988.09

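Event date： 1988.09

In -Harray-, the dataflow multiprocessor system for scientific and engineering computation that we are proposing, the concept of a macro block is used to partition the tasks assigned to the processor elements. A macro block is a part of the program divided according to certain criteria, as in Fig. 1, and -Harray- assigns tasks to processor elements in units of macro blocks. Inside a macro block, computation proceeds under data-driven control, which extracts parallelism naturally; in addition, control-flow control is introduced between macro blocks, giving a hierarchical control structure. This approach compensates for drawbacks of data-driven control such as the increase in control instructions. However, various problems arise when tasks are assigned in units of macro blocks. One of them is the need to detect the completion of a macro block quickly. This paper therefore describes a method for fast completion detection of macro blocks and gives a simple evaluation.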
A Construction of Waiting Memory for Parallel Processing System -Harray-

Yamana Hayato, Kusano Yoshihiro, Hagiwara Takashi, Muraoka Yoichi

全国大会講演論文集 Information Processing Society of Japan (IPSJ)

Presentation date： 1988.09

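Event date： 1988.09

We have proposed -Harray-, a parallel processing system aimed mainly at scientific and engineering computation. -Harray- incorporates dataflow execution to fully draw out the parallelism inherent in programs. In dataflow execution, speeding up the waiting memory (WM), which governs the firing control of nodes, is a key point in speeding up the whole system. This paper describes the organization of the waiting memory WM used in the -Harray- prototype and gives a simple evaluation using a software simulator.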
A design concept of the compiler for parallel processing system-Harray-.

萩原孝, 山名早人, 村岡洋一

電子情報通信学会技術研究報告

Presentation date： 1988

Event date： 1988
Evaluation of a simulated parallel processing system - Harray - .

山名早人, 萩原孝, 草野義博, 村岡洋一

情報処理学会研究報告

Presentation date： 1988

Event date： 1988
Execution mechanism of parallel processing system -Harray-.

丸島敏一, 山名早人, 萩原孝, 草野義博, 村岡洋一

情報処理学会研究報告

Presentation date： 1988

Event date： 1988
Evaluation of processing element in parallel computer system-Harray.

草野義博, 山名早人, 丸島敏一, 村岡洋一

情報処理学会全国大会講演論文集

Presentation date： 1988

Event date： 1988
A construction of processing element in a parallel processing system -Harray-.

山名早人, 丸島敏一, 草野義博, 村岡洋一

情報処理学会研究報告

Presentation date： 1988

Event date： 1988
EXPERIENCE USING THE RESTRUCTURING COMPILER PARAFRASE.

Toshikazu Marushima, Takashi Hagiwara, Hayato Yamana, Yoichi Muraoka

Bulletin of Centre for Informatics (Waseda University)

Presentation date： 1987.03

Event date： 1987.03

Parallel processing with an ordinary sequential language is important from the viewpoint of its simplicity and the effective use of existing software. This paper reports experience gained by using Parafrase, a restructuring compiler developed at the University of Illinois.
A scheme of macro blocking for parallel scientific computer system -Harray-.

萩原孝, 山名早人, 丸島敏一, 村岡洋一

情報処理学会全国大会講演論文集

Presentation date： 1987

Event date： 1987
A construction of processing element in parallel scientific computer system -Harray-.

山名早人, 丸島敏一, 萩原孝, 村岡洋一

情報処理学会全国大会講演論文集

Presentation date： 1987

Event date： 1987
Sequential Pattern Mining with Time Interval

Yu Hirate, Hayato Yamana

Proc. of PAKDD2006
RetweetReputation: バイアスを排除したTwitter投稿内容評価手法

藤木紫乃, 矢野博也, 山名早人

DEIM2011


Research Seeds

System for the handwritten input of mathematical expressions: MathBox

Information Communication
Creation of effective thumbnails based on handwritten notes

Information Communication

Research Projects

Precise Detection of "Learning Transitions" Using Online Handwriting Data and Eye Gaze - Toward a Personalized Learning

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2024.04 - 2028.03
Sustainable system design for health care and long-term care - Utilization of administrative big data through international comparative studies

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2022.04 - 2026.03
Identification of logical thinking ability from online handwritten data

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2020.04 - 2024.03
Credibility Analysis of Web contents based on 10 billion Web pages

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2017.07 - 2022.03

YAMANA HAYATO

This research project completed the following: efficient web page crawlers (gathering programs); web page content analysis methods; methods for estimating web content reliability without accessing the content (i.e., using only URLs); an analysis revealing problems in previous benchmarks, whose ground truth is usually based on human first-impression judgments; and distribution of a survey of related research on web content reliability. In particular, the crawler improved efficiency by 10% over previous methods, and the URL-only credibility method reached an accuracy of 99.4%, a significant result for future practical use because credibility can be judged without accessing the content.
戦略的創造研究推進事業（CREST）「ビッグデータ統合利活用のための次世代基盤技術の創出・体系化」領域

JST 戦略的創造研究推進事業(CREST)

Project Year : 2015.10 - 2021.09
Authorship Identification for Hundred-thousand-scale Microblog Users in the Web

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2013.04 - 2017.03

YAMANA Hayato, OYAMA Keizo, UNO Takeaki, OKUNO Syunya, OKUTANI Takashi, ASAI Hiroki, UESATO Kazuya, TANAKA Masahiro, SHINOHARA Shota, ISHIYAMA Takehiro, Wang Lan

Because a flood of varied information circulates on the Internet, its credibility is becoming a social problem. In this study, we researched authorship identification for short messages such as SNS posts, aiming to identify the author of a message from among 100,000 candidates. That is, if documents written in advance by an author are available, the writer can be estimated. As a result, we established a mechanism that finds a specific user among 100,000 SNS users with an accuracy of 60% given only 30 messages. In addition, the probability that the correct author appears in the top 10 was 74%. This is a major contribution, given that other research worldwide is limited to about 20% accuracy for 100,000 candidates.
Frustration Detection using Online Handwriting Behavior

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2014.04 - 2016.03

YAMANA HAYATO, ASAI Hiroki

In the era of educational computerization, this research aimed to detect students' frustration during study from online handwriting data, with a view to promoting the effective personalized study that will be realized in the near future.
We classified students' frustration into two categories: 1) frustration caused by non-established memory and 2) frustration during the answering process. For 1), we classified memory into non-established, subjectively established, and subjectively non-established memory. Our proposed system, targeting Japanese Kanji memorization, then tried to automatically detect non-established memory within subjectively established memory, i.e., items that students believed they had memorized but in fact had not. Our evaluation shows an F-value of 0.69, which is applicable to the real world. For 2), we took up mathematical problems. Categorizing students' answering processes yielded F-values of 0.5 to 0.7, which will be applicable to the real world.
Integrated analysis of web information structure and users' behavior and its application to advanced information access

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2010.04 - 2013.03

OYAMA Keizo, AIZAWA Akiko, MIYAO Yusuke, SUN Yuan, KOBAYASHI Tetsuro, HAN Hao, KISHIDA Kazuaki, YAMANA Hayato, OKUMURA Manabu, YOSHIOKA Masaharu, ISHITA Emi, MURATA Tsuyoshi, EGUCHI Koji

To understand Web structure and users' information retrieval and browsing behavior in an integrated way, and to extend this understanding to various applications, we collected various data reflecting Web information structure and Web users' behavior (e.g., Web view log data and microblog data), obtained user data through questionnaires, and carried out an integrated analysis of them.
As a result, we obtained various findings from the data, for example that there is a gap between the information people want to know and the information people want to convey, and that using Web portal sites leads to unexpected contact with a variety of information. Moreover, we proposed and studied various methods for advanced information access, such as information recommendation and information retrieval, based on the information obtained through the integrated analysis.
Challenges and Successful Approaches in Multimedia Event Detection

MEXT

Project Year : 2009 - 2011.03

Shinichi SATO, Masaru KITSUREGAWA, Satoshi TOYODA
Analysis of Search Engines' Trustworthiness

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2009 - 2011

YAMANA Hayato, MATSUYAMA Yasuo

Search engines have become indispensable in daily life; however, their trustworthiness is unclear. In particular, the number of search results, i.e., the hit count, often varies by a factor of 100 to 1000 even for the same query word. In this research, we clarified the transition characteristics of hit counts based on a 15-month investigation of Google, Yahoo! JAPAN and Bing. Moreover, we proposed a new method for choosing trustworthy hit counts, which achieves 99.5% precision when comparing two hit counts to determine which query word has the larger number of search results.
メニーコアCPUにおける冬眠コアのゼロ化

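Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2009 - 2010

山名早人

In FY2010, we aimed to evaluate on real machines the automatic system optimization algorithm developed in FY2009. For applications built from Producer-Consumer modules, the goal of this algorithm is to automatically determine the machines and the number of threads assigned to each module so that many-core CPUs are used to the fullest and application performance is optimized. The research used QueueLinker, a distributed processing framework that we are developing.
In FY2010, we first developed a Web crawler as an evaluation application for the automatic optimization algorithm and confirmed that it runs on a QueueLinker prototype. All modules of this crawler are of the Producer-Consumer type and can be executed in a distributed manner by QueueLinker. Prior to the experiments, to reduce the load the crawler places on Web servers, we developed a crawling scheduler that strictly guarantees a minimum interval between accesses to the same Web server. The scheduler has O(1) time complexity, and the upper bound of its space complexity does not depend on the number of URLs to be crawled. This algorithm was presented at DEIM 2011.
We then developed an automatic profiling function for QueueLinker, using the developed Web crawler as the application. The profiling function can profile the CPU time used by each module and the volume of network communication. After that, we revised the design of the automatic system optimization algorithm developed in the previous fiscal year so that it operates on actual profiling data. Based on the amount of resources each module uses, the algorithm automatically determines the machines and the number of threads assigned to the modules so as to maximize application performance.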
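The per-server access constraint mentioned in this summary can be illustrated with a minimal Python sketch. This is not the scheduler presented at DEIM 2011, whose design is not given here; it only shows the constraint itself, namely that two accesses to the same Web server are never closer together than a minimum interval, checked in O(1) average time per request with a dictionary. The class name `PolitenessGate`, its method, and the host names are hypothetical, and unlike the scheduler in the summary this sketch keeps one entry per host.

```python
class PolitenessGate:
    """Toy per-host gate enforcing a minimum interval between accesses.

    Hypothetical illustration only; not the QueueLinker crawling scheduler.
    """

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        # host -> earliest time at which the next access is allowed
        self._next_allowed: dict[str, float] = {}

    def try_acquire(self, host: str, now: float) -> bool:
        """Return True and record the access if `host` may be fetched at `now`."""
        if now < self._next_allowed.get(host, float("-inf")):
            return False  # too soon after the previous access to this host
        self._next_allowed[host] = now + self.min_interval
        return True


# Usage with hypothetical hosts and a 10-second minimum interval.
gate = PolitenessGate(min_interval=10.0)
first = gate.try_acquire("a.example", now=0.0)       # allowed
too_soon = gate.try_acquire("a.example", now=5.0)    # rejected
other_host = gate.try_acquire("b.example", now=5.0)  # allowed, different host
later = gate.try_acquire("a.example", now=10.0)      # allowed again
```

A rejected request would normally be re-queued rather than dropped; how the actual scheduler orders such requests is outside the scope of this sketch.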
Highly Scalable Monitoring Architecture for Information Explosion Environments

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2006 - 2010

NAKAJIMA Tatsuo, MURAOKA Yoichi, GOTO Shigeki, YAMANA Hayato, KATTO Jiro, OIKAWA Shuichi, AKIOKA Sayaka

This project concerns a monitoring system architecture consisting of a set of software that protects information infrastructures, social infrastructures and everyday human life. The goal of the project is to integrate research areas that had previously been discussed independently. The project developed several monitoring systems for computer systems, network systems and the real world in order to investigate the future information infrastructure.
Design and Development of Advanced IT Research Platform for Information Explosion Era

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2006 - 2010

ADACHI Jun, TANAKA Katsumi, NISHIDA Toyoaki, KUNIYOSHI Yasuo, SUDOH Osamu, KUROHASHI Sadao, HARA Takahiro, MATSUOKA Satoshi, TAURA Kenjiro, TATEBE Osami, MUNETOMO Masaharu, HIROTSU Toshio, MATSUBARA Jin, SHIMOJYO Shinji, CHIBA Shigeru, YUASA Taichi, MATSUYAMA Takashi, CHIKAYAMA Takashi, KONDO Toru, KONO Kenji, OKAMOTO Masahiro, AIDA Kento, KAMADA Tomio, KITSUREGAWA Masaru, YAMANA Hayato, NAKAMURA Yutaka, KOBAYASHI Hiroaki, NAKAJIMA Hiroshi

This project implemented a common research infrastructure for all research groups participating in this priority-area research initiative and thereby supported all research activities in the initiative. By providing this infrastructure, we succeeded in accelerating the shared use of research facilities and resources within the limits of research funding and in strengthening collaboration among research groups. The shared facilities include (a) TSUBAKI: an open search engine for large-scale corpora, (b) InTrigger: a widely distributed computing test-bed, (c) IMADE: an environment for real-world interaction measurement and analysis, and (d) prototyping for sensor-network-based preventive medicine.
Bioinformatics in silico by the Unification of Symbols and Patterns

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2005 - 2007

MATSUYAMA Yasuo, YANAGISAWA Masao, YAMANA Hayato, KURUMIZAKA Hitoshi, INOUE Masato

This project was started with the aim of developing computational intelligence algorithms for finding soft patterns in DNA and amino acid sequences. The main methodology is in Aim. Wet biologists are included in the group so that overly abstract problems are avoided. Unifying computer-based information scientists and test-tube-based life scientists still requires time; however, this project made steady progress toward such collaboration, with the following results:
(1) Prediction methods for the transcription start site were established. For the human genome, a representative of eukaryotes, a combination of the spectrum kernel, hidden Markov models, and FFT integrated by a support vector machine was presented. This mechanism yielded top-class ROC curves. For E. coli, a representative of prokaryotes, a combination of independent component analysis and a support vector machine showed the best prediction performance to date.
(2) A new, effective algorithm for multiple sequence alignment was developed. The new method suppresses the appearance of multiple gaps in the same column. Gap extension can be regulated by piecewise linear penalties. The complete algorithm is implemented as software named PRIME. PRIME outperformed ClustalW and T-Coffee in both the resulting alignments and computational speed.
(3) The wet biology team found evidence concerning Rad51, which repairs cut double strands of DNA. The binding site of Rad51 is altered in breast cancer patients.
As explained above, this research produced fruitful results on post-genome topics: the prediction of promoters and transcription start sites, a new multiple sequence alignment method leading to tertiary structure prediction, and a cancer property caused by protein functions.
Research on Fast Execution using Helper Threads on Multi-threading Processors

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2005 - 2006

YAMANA Hayato, SAITO Fumiko

Recently, many multi-core processors have come onto the market. In this research, we studied how to accelerate programs on these multi-core processors using multi-threading techniques.
In FY2005, we surveyed this research area and studied both algorithms and applications. On the algorithm side, we studied how to reduce the number of empty CPU pipeline slots resulting from branch mispredictions. Moreover, because many multi-core CPUs share an L2 cache, controlling the L2 cache is key to speeding up programs, and we studied how to do so. By proposing a technique that places data in the best-suited area of the cache to decrease wiring delay, we confirmed that the technique speeds up SPECint95/SPECint2000 programs by a factor of 1.17 in IPC on average. On the application side, we investigated many applications, including search-related ones, that are suited to multi-core CPUs.
In FY2006, based on the FY2005 studies, the research focused on speeding up disk accesses. We worked on three areas: 1) pre-loading data from disks, 2) extending the disk cache onto another machine's memory, and 3) accelerating shell scripts by parallelizing disk access. First, we proposed a new scheme for pre-loading from disks using helper threads. Experimental evaluation with the "gzip" application shows a 39.2% speedup compared with using no helper threads. Second, we proposed a new disk caching scheme that uses the remote memory of another PC. The scheme extends the size of the disk cache by using helper threads. We confirmed a speedup ratio of up to 3.08 on the DBT-3 benchmark program. Third, we proposed a new parallelizing scheme using alternative threads for shell script programs. With our scheme, shell script programs achieve a 1.4 to 1.8 times speedup compared with normal execution. The last result is now being productized by USP Lab (http://www.usp-lab.com/).
広域分散型情報収集・検索システムにおける負荷分散方式の研究

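Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 2001 - 2002

山名早人

In the FY2002 research, building on the FY2001 results, we studied how to select a route when multiple routes exist to a given WWW server, with the aim of proposing a distributed collection mechanism that takes network congestion into account.
Specifically, we examined whether analyzing the transport-layer information in packets makes it possible to find the best route when multiple network routes exist. We first analyzed the contents of the TCP header, the transport-layer information in a packet, aiming to find parameters that indicate which of several networks allows efficient data transfer.
First, we analyzed the relationship between the transfer rate and various TCP parameters (average window size, maximum window size, RTT). The analysis showed that the relationship between window size and transfer rate is easier to obtain from connections that transfer less than 1 KB than from connections that transfer 1 KB or more. Furthermore, it is easier to obtain from connections with short transfer times (under 1 second in the experiment) than from connections with long transfer times (1 second or more).
We attribute these results to packets being sent stably on connections with small transfer volumes or short transfer times. Connections with large transfer volumes or long transfer times may have encountered some problem during transmission, so we found it better not to use them as parameters for selecting the best route.
Based on these results, we proposed a method for selecting a route when multiple routes exist to a WWW server during Web page collection.
In addition, continuing from the previous fiscal year, we developed an algorithm for discovering the update interval of a Web page without collecting the page.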
科学技術計算用並列処理システム-晴-のアーキテクチャに関する研究

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

Project Year : 1991

山名早人


Misc

Webはパワースーツ

山名早人

情報・システムソサイエティ誌 27 ( 3 ) 3 - 3 2022.11

DOI
A Survey of Explainable Recommender System

松島ひろむ, 森澤竣, 石山琢己, 山名早人

情報科学技術フォーラム講演論文集 20th 2021

J-GLOBAL
論文を書くということ

山名早人

情報・システムソサイエティ誌 25 ( 1 ) 10 - 11 2020.05

DOI CiNii
Predicting Answers with Hints from Online Answer Data-Targeting Geometric Problems-

三浦将人, 村上統馬, 中山祐貴, 山名早人

電子情報通信学会技術研究報告 119 ( 393(ET2019 69-75)(Web) ) 2020

J-GLOBAL
A Survey of the Hardware Implementations for Improving Performance of Fully Homomorphic Encryption

井上紘太朗, 鈴木拓也, 山名早人

情報科学技術フォーラム講演論文集 19th 2020

J-GLOBAL
The Use of survey materials for the repair landscape project in the "Higashiyama-Higashi" Important Preservation District for Groups of Traditional Buildings, Kanazawa Part 3

UCHIDA Shin, HIRATE Yu, YAMANA Hayato

National Institute of Technology, Ishikawa College Bulletin 51 19 - 24 2019

The aim of this study is to utilize survey materials in a landscape repair project, taking the "Higashiyama-higashi" Important Preservation District for Groups of Historic Buildings in Ishikawa Prefecture as a case study. In this paper, we investigated the cleaning method for timber lattices and the sectional structure of wooden fittings, and analyzed the relationship between changes in the appearance of timber lattices and the sectional structure of wooden fittings.

DOI CiNii
メニーコアCPU環境における準同型暗号演算高速化を目的とするタスクスケジューリング手法の検討

鈴木拓也, 石巻優, 山名早人

情報処理学会研究報告(Web) 2019 ( HPC-170 ) 2019

J-GLOBAL
Secureな環境における副作用ファジー検索システムの構築

菅野敦之, 野口保, 山名早人

日本薬剤師会学術大会(Web) 52nd 2019

J-GLOBAL
特集「若手研究者」の編集にあたって

山名, 早人

情報処理学会論文誌 59 ( 3 ) 821 - 821 2018.03

CiNii
2016年度論文賞の受賞論文紹介: The 2016 IPSJ Best Paper Award：Foreword

58 ( 8 ) 709 - 709 2017.07

CiNii
FCMalloc:完全準同型暗号の高速化に向けたメモリアロケータ

馬屋原昂, 佐藤宏樹, 石巻優, 今林広樹, 山名早人

情報処理学会研究報告(Web) 2017 ( OS-141 ) 2017

J-GLOBAL
特定分野における単語重要度計算手法の提案と短い文章における著者の専門性推定への適応

滝川真弘, 山名早人

情報処理学会研究報告(Web) 2017 ( NL-233 ) 2017

J-GLOBAL
CTR向上を目的としたWEBページ上でのオンライン広告配置位置推定

大谷一善, 滝川真弘, 堀田弘明, 山名早人

情報科学技術フォーラム講演論文集 16th 2017

J-GLOBAL
電子ペンを利用した数学手書き答案の戦略分類手法~多項式展開問題を題材として~

浅井洋樹, 山名早人

情報処理学会研究報告(Web) 2016 ( CE-133 ) 2016

J-GLOBAL
電子ペンを用いた手書き解答データによる幾何学解答パターン分類手法

森山優姫菜, 下岡純也, 浅井洋樹, 山名早人

情報科学技術フォーラム講演論文集 15th 2016

J-GLOBAL
完全準同型暗号のデータマイニングへの利用に関する研究動向

佐藤宏樹, 馬屋原昂, 石巻優, 今林広樹, 山名早人

情報科学技術フォーラム講演論文集 15th 2016

J-GLOBAL
特定分野を対象とした単語重要度計算手法の提案とTwitterにおける専門性推定への適応

滝川真弘, 山名早人

情報科学技術フォーラム講演論文集 15th 2016

J-GLOBAL
Super-information Society based on Big Data - Information Technologies of Searching the Whole, from Platform Technologies to Applications -：0. Foreword

56 ( 10 ) 956 - 957 2015.09

CiNii
Cross-lingual Investigation of User Evaluations for Global Restaurants

DBSJ journal 13 ( 1 ) 37 - 42 2015.03

CiNii
A study of effective visit history utilization for Location recommendation-User’s Familiarity with area and Visit pattern change-

HAN Jungkyu, 山名早人

電子情報通信学会技術研究報告 115 ( 110(DE2015 1-11) ) 2015

J-GLOBAL
スマートウォッチにおけるアイズフリー日本語入力手法

下岡純也, 浅井洋樹, 山名早人

日本ソフトウェア科学会研究会資料シリーズ(Web) ( 76 ) 2015

J-GLOBAL
Improved Native Language Identification with Upper Phrase Information and Training Data Selection

TANAKA MASAHIRO, WANG LAN, YAMANA HAYATO

IEICE technical report. Natural language understanding and models of communication 114 ( 366 ) 127 - 132 2014.12

Native Language Identification (NLI), the task of identifying the native language (L1) of a writer based solely on a sample of his/her writing in a non-native language (L2), is an authorship attribution problem. In this paper, we propose i) "upper phrase information" as a new feature and ii) discarding essay data that appear to be outliers from the training dataset. NLI is applicable to many other NLP tasks, such as Second Language Acquisition. Since 2005, many researchers have approached this task in different ways. Combining the proposed techniques with existing methods, our system achieves 85.6% accuracy on the NLI Shared Task 2014 data. To the best of our knowledge, this is state-of-the-art accuracy for NLI tasks.

CiNii J-GLOBAL
オンライン手書き情報を用いた未定着記憶推定システム

浅井洋樹, 山名早人

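研究報告コンピュータと教育（CE） 2014 ( 1 ) 1 - 6 2014.11

Rote learning, such as memorizing kanji or English words, aims to establish memories so that they can be recalled without being forgotten, and a learning system that establishes memories more efficiently is useful to learners. Establishing a memory is said to require repeated relearning of the items to be memorized, and memorizing efficiently requires picking out the memories that are not yet established and relearning them with priority. However, learners' correct/incorrect test results alone cannot detect items that were answered correctly but are poorly established and will soon be forgotten, so non-established memories cannot be covered completely. Moreover, such results give only a binary established/non-established judgment and cannot determine the priority of relearning. In this study, we therefore propose a method that estimates how well a learner's memory is established from online handwriting data, including time-series information and pen pressure, obtainable from tablet devices and the like. By giving priority to items with a low "memory degree", the continuous value produced by the proposed system, a learning support system that enables efficient memorization can be built.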

CiNii J-GLOBAL
マイクロブログを対象とした著者推定手法の提案 : 10,000人レベルでの著者推定 (データ工学)

奥野峻弥, 浅井洋樹, 山名早人

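電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 173 ) 65 - 70 2014.08

Authorship identification research has traditionally focused on novels and has dealt with small, restricted sets of candidate authors. In contrast, we proposed authorship identification on microblogs for a large, unspecified set of candidates. In that work we proposed a normalization method for the exclamatory phrases (叫喚フレーズ) characteristic of microblogs to improve accuracy, and a method that reduces the number of messages required for identification to cut the computational cost. In this paper, we examine problems that arise when identifying authors among a larger number of microblog users, in particular how the difference between the collection periods of the training data and the test data affects accuracy, and we propose a method that reduces the influence of the training data's collection period on accuracy. In an experiment on 10,000 Twitter users, we achieved a Precision@1 of 0.535 and an MRR of 0.602.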
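For readers unfamiliar with the two metrics reported in this summary, the following Python sketch shows how Precision@1 and MRR (mean reciprocal rank) are commonly computed from ranked candidate lists. It is a generic illustration under the standard definitions, not the authors' evaluation code; the function names and the toy data are hypothetical.

```python
from typing import Hashable, Sequence


def precision_at_1(rankings: Sequence[Sequence[Hashable]],
                   truths: Sequence[Hashable]) -> float:
    """Fraction of queries whose top-ranked candidate is the true author."""
    if not truths or len(rankings) != len(truths):
        raise ValueError("rankings and truths must be non-empty and equally long")
    hits = sum(1 for ranked, truth in zip(rankings, truths)
               if ranked and ranked[0] == truth)
    return hits / len(truths)


def mean_reciprocal_rank(rankings: Sequence[Sequence[Hashable]],
                         truths: Sequence[Hashable]) -> float:
    """Mean of 1/rank of the true author; a query scores 0 if the author is absent."""
    if not truths or len(rankings) != len(truths):
        raise ValueError("rankings and truths must be non-empty and equally long")
    total = 0.0
    for ranked, truth in zip(rankings, truths):
        for position, candidate in enumerate(ranked, start=1):
            if candidate == truth:
                total += 1.0 / position
                break
    return total / len(truths)


# Hypothetical toy data: three messages, each actually written by "alice".
rankings = [
    ["alice", "bob", "carol"],   # true author ranked 1st
    ["bob", "alice", "carol"],   # true author ranked 2nd
    ["carol", "bob", "alice"],   # true author ranked 3rd
]
truths = ["alice", "alice", "alice"]

p1 = precision_at_1(rankings, truths)         # 1 of 3 queries correct at rank 1
mrr = mean_reciprocal_rank(rankings, truths)  # (1 + 1/2 + 1/3) / 3
```

Under these definitions both values lie between 0 and 1, and MRR is never lower than Precision@1 because every rank-1 hit contributes a full 1 to the reciprocal-rank sum.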

CiNii J-GLOBAL
メンション情報を利用したTwitterユーザプロフィール推定における単語重要度算出手法の考察 (データ工学)

上里和也, 田中正浩, 浅井洋樹, 山名早人

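電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 173 ) 125 - 130 2014.08

In large-scale social services such as Twitter, knowing users' profiles, such as their interests and affiliations, is important for effective marketing. Against this background, research on profile estimation in Twitter has been conducted. Conventional profile estimation methods extract communities from the social graph built from follow information and estimate a user's profile by estimating the attributes of the community to which the user belongs. However, because these methods cannot take into account the purpose of each follow or whether active interaction takes place, it is difficult to extract groups of users who actually have close relationships as communities. To address this, Okutani et al. proposed a method that builds the social graph from mention information instead of follows. That method, however, has the problem that words appearing widely and commonly in the profiles of users surrounding the target user are unlikely to be output as the profile. In this paper, we therefore propose a method that solves this problem by changing how word importance is calculated in the profile estimation method of Okutani et al. and by filtering general words using data from 100,000 users randomly sampled from all Twitter users. In an experiment with six subjects, compared with the method of Okutani et al., Precision@10 improved from 0.37 to 0.78 and MRR from 1.44 to 2.61.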

CiNii J-GLOBAL
Topics and Influential User Identification in Twitter using Twitter Lists

Zhou Guanying, Asai Hiroki, Yamana Hayato

IEICE technical report. Data engineering 114 ( 173 ) 71 - 76 2014.08

Twitter, one of the most popular social network services, is drawing the attention of more and more researchers worldwide. With a large amount of information tweeted every day, it becomes essential to identify the influential users we are interested in. Previous research mainly identifies topics from tweets and ranks users by the follow relationship; however, the follow relationship is strongly related to users' reputation in the real world and cannot accurately describe their influence and activity level on Twitter. Instead, in this paper, to identify topics and influential users, we use "Twitter List," whose name represents the topic of the listed members. By analyzing Twitter Lists, we can detect topics and identify influential users in the corresponding topic more efficiently. In our experimental evaluation on two selected topics, the influential users identified by our proposed method received average topic-related influence scores from interviewees of 3.7 and 3.33, outperforming ranking by follower count, which received average scores of 3.22 and 3.27, respectively.

CiNii
Cross-Domain Investigations of User Evaluations under the Multi-cultural Backgrounds

LE JIAWEN, YAMANA HAYATO

IEICE technical report. Data engineering 114 ( 173 ) 137 - 142 2014.08

Twitter, one of the most popular social network services, is widely used to query public opinion. In this research, a large corpus of Twitter data, along with online reviews, is used for sentiment and culture-based analysis in order to determine cultural effects on user evaluations. Posts written in more than 30 languages from more than 30 countries are collected. To carry out the cross-domain investigations, global restaurants and world attractions are taken as the research subjects, and a series of high-performance classifiers are trained and applied in the experiments. Various analysis methods are then applied to obtain informative results and conclusions about user evaluations of the targets. As its contributions, this research validates the capability and field transferability of the proposed methods for cross-lingual sentiment analysis, and concludes that cultural effects on user evaluations exist in both the restaurant and travel domains and are pronounced for some countries and cultural backgrounds.

CiNii
単語の意味概念行列を用いたキーワード生成による関連論文検索システム (データ工学)

林佑磨, 奥野峻弥, 山名早人

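電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 173 ) 53 - 58 2014.08

Researchers survey papers related to their own field to learn the significance of research and existing methods. Paper search systems widely used for such surveys generally rely on keyword search, in which the user supplies keywords as the query. In technical fields with many specialized terms, it is difficult, especially for researchers not yet familiar with the field, to supply appropriate keywords and obtain satisfactory results. To solve this problem, we proposed a related-paper search system that takes a paper's abstract as input. The system automatically generates the search query by representing the meanings of the words in the input abstract as a semantic concept matrix and taking them into account. In this paper, we extend our previously proposed system. Specifically, we 1) support searching Japanese papers and 2) achieve higher-quality keyword generation by using paper clustering with RSM. In a comparison with an existing paper search system that supports Japanese, we improved p@10 by 0.17 on average.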

CiNii J-GLOBAL
Topics and Influential User Identification in Twitter using Twitter Lists

Guanying Zhou, Hiroki Asai, Hayato Yamana

IPSJ SIG Notes 2014 ( 13 ) 1 - 6 2014.07

Twitter, one of the most popular social network services, is drawing the attention of more and more researchers worldwide. With a large amount of information tweeted every day, it becomes essential to identify the influential users we are interested in. Previous research mainly identifies topics from tweets and ranks users by the follow relationship; however, the follow relationship is strongly related to users' reputation in the real world and cannot accurately describe their influence and activity level on Twitter. Instead, in this paper, to identify topics and influential users, we use "Twitter List," whose name represents the topic of the listed members. By analyzing Twitter Lists, we can detect topics and identify influential users in the corresponding topic more efficiently. In our experimental evaluation on two selected topics, the influential users identified by our proposed method received average topic-related influence scores from interviewees of 3.7 and 3.33, outperforming ranking by follower count, which received average scores of 3.22 and 3.27, respectively.

CiNii
マイクロブログを対象とした著者推定手法の提案－10,000人レベルでの著者推定－

奥野峻弥, 浅井洋樹, 山名早人

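研究報告情報基礎とアクセス技術（IFAT） 2014 ( 12 ) 1 - 6 2014.07

Authorship identification research has traditionally focused on novels and has dealt with small, restricted sets of candidate authors. In contrast, we proposed authorship identification on microblogs for a large, unspecified set of candidates. In that work we proposed a normalization method for the exclamatory phrases (叫喚フレーズ) characteristic of microblogs to improve accuracy, and a method that reduces the number of messages required for identification to cut the computational cost. In this paper, we examine problems that arise when identifying authors among a larger number of microblog users, in particular how the difference between the collection periods of the training data and the test data affects accuracy, and we propose a method that reduces the influence of the training data's collection period on accuracy. In an experiment on 10,000 Twitter users, we achieved a Precision@1 of 0.535 and an MRR of 0.602.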

CiNii
メンション情報を利用したTwitterユーザプロフィール推定における単語重要度算出手法の考察

上里和也, 田中正浩, 浅井洋樹, 山名早人

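研究報告情報基礎とアクセス技術（IFAT） 2014 ( 22 ) 1 - 6 2014.07

In large-scale social services such as Twitter, knowing users' profiles, such as their interests and affiliations, is important for effective marketing. Against this background, research on profile estimation in Twitter has been conducted. Conventional profile estimation methods extract communities from the social graph built from follow information and estimate a user's profile by estimating the attributes of the community to which the user belongs. However, because these methods cannot take into account the purpose of each follow or whether active interaction takes place, it is difficult to extract groups of users who actually have close relationships as communities. To address this, Okutani et al. proposed a method that builds the social graph from mention information instead of follows. That method, however, has the problem that words appearing widely and commonly in the profiles of users surrounding the target user are unlikely to be output as the profile. In this paper, we therefore propose a method that solves this problem by changing how word importance is calculated in the profile estimation method of Okutani et al. and by filtering general words using data from 100,000 users randomly sampled from all Twitter users. In an experiment with six subjects, compared with the method of Okutani et al., Precision@10 improved from 0.37 to 0.78 and MRR from 1.44 to 2.61.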

CiNii
メンション情報を利用したTwitterユーザプロフィール推定における単語重要度算出手法の考察

上里和也, 田中正浩, 浅井洋樹, 山名早人

研究報告データベースシステム（DBS） 2014 ( 22 ) 1 - 6 2014.07

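In large-scale social services such as Twitter, knowing users' profiles, such as their interests and affiliations, is important for effective marketing. Against this background, profile estimation on Twitter has been studied. Conventional profile estimation methods extract communities from the social graph built from follow information and estimate a user's profile by estimating the attributes of the community the user belongs to. However, because they cannot take into account the purpose of each follow or whether active interaction actually takes place, it is difficult to extract groups of users who are actually close to each other as communities. To address this, Okutani et al. proposed a method that builds the social graph from mention information instead of follows. That method, however, has the problem that words appearing widely in common across the profiles of the users surrounding the target user are unlikely to be output as the profile. In this paper, we change how word importance is calculated in the profile estimation method of Okutani et al. and filter out general words using data from 100,000 users randomly sampled from all Twitter users, thereby solving this problem. In an experiment with six subjects, Precision@10 improved from 0.37 to 0.78 and MRR from 1.44 to 2.61 compared with the method of Okutani et al.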

CiNii
Topics and Influential User Identification in Twitter using Twitter Lists

Guanying Zhou, Hiroki Asai, Hayato Yamana

IPSJ SIG Notes 2014 ( 13 ) 1 - 6 2014.07

Twitter, one of the most popular social network services, draws the attention of more and more researchers worldwide. With a large amount of information tweeted every day, it has become essential to identify the influential users we are interested in. Previous research mainly identifies topics from tweets and ranks users by the follow relationship; however, the follow relationship is strongly tied to a user's real-world reputation and cannot exactly describe their influence and activity level on Twitter. Instead, in this paper, we identify topics and influential users with "Twitter Lists," whose names represent the topic of their listed members. By analyzing Twitter Lists, we are able to detect topics and identify the influential users in each topic more efficiently. In our experimental evaluation on two selected topics, the influential users identified by our proposed method received average topic-related influence scores from interviewees of 3.7 and 3.33, outperforming ranking by follower count, which scored 3.22 and 3.27 on average, respectively.

CiNii
アミノ酸配列情報とPPIネットワークを使用したヘテロ二量体タンパク質複合体予測手法の開発

石巻優, 油谷幸代, 山名早人

日本分子生物学会年会プログラム・要旨集(Web) 37th 2014

J-GLOBAL
包括的遺伝子ネットワーク構造からの活性化部位推定手法の開発

下岡純也, 油谷幸代, 山名早人

日本分子生物学会年会プログラム・要旨集(Web) 37th 2014

J-GLOBAL
編集にあたって

山名早人, 中野美由紀, 関洋平

情報処理学会論文誌データベース（TOD） 6 ( 5 ) i - iii 2013.12

CiNii
教育環境における書き込み可能な電子ペーパー端末の利活用

浅井洋樹, 山名早人

MNC Communications 15 ( 15 ) 2013.12

CiNii
文体及びツイート付随情報を用いた乗っ取りツイート検出

上里和也, 奥谷貴志, 浅井洋樹, 奥野峻弥, 田中正浩, 山名早人

情報処理学会研究報告. データベース・システム研究会報告 2013 ( 21 ) 1 - 8 2013.11

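While the number of Twitter users continues to grow, so do cases in which IDs and passwords are obtained illegitimately and tweets are posted by someone else. In response, we previously proposed a method for detecting spam tweets, which are one part of the messages posted through account hijacking, and obtained an accuracy of about 80%. That method targeted spam tweets containing specific words and demonstrated that detection was effective. In this study, we broaden the detection target, define all tweets posted by anyone other than the account owner as "hijacked tweets," and propose a method to detect them. We also re-tune the parameters of our previously proposed method and, exploiting the fact that the types of frequently used hashtags and the recipients of replies are characteristic of each account, aim to improve the F-measure. In an evaluation experiment on 100 accounts, we improved the F-measure by 0.1984 over our previous method and achieved an F-measure of 0.8570.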

CiNii J-GLOBAL
編集にあたって

山名早人, 酒井哲也, 石川佳治

情報処理学会論文誌データベース（TOD） 6 ( 4 ) i - iii 2013.09

CiNii
刊行500号までの軌跡とこれからの論文誌のあり方

山名早人

電子情報通信学会論文誌. D, 情報・システム = The IEICE transactions on information and systems (Japanese edition) 96 ( 8 ) 1661 - 1662 2013.08

CiNii
医薬品副作用情報を用いた副作用検索システムの提案 (データ工学)

三上拓也, 駒田康孝, 野口保, 菅野敦之, 山名早人

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 150 ) 59 - 64 2013.07

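Early detection of and response to adverse drug reactions is an important issue in clinical practice. To achieve this, physicians and pharmacists would need to know every adverse reaction of every drug. However, a single drug has many known adverse reactions, and grasping all of them is difficult. In addition, adverse reactions have many similar notations that are understood as synonymous, so the same adverse reaction may be mistaken for a different one because of a difference in notation, which can delay its discovery. Furthermore, there may be unknown adverse reactions whose association with a drug has not been established. In this paper, we therefore propose an adverse reaction search system that is robust to notational variation and supports searching for unknown adverse reactions of drugs. The proposed method estimates unknown adverse reactions of a drug from the adverse reactions listed in drug package inserts and from case reports of suspected adverse reactions. In our experiments, we input 150 cases for which suspected adverse reactions had actually been reported, and evaluated usefulness by comparing the search results with the adverse reactions suspected of being associated with the drug in those case reports. The detection rate for adverse reactions was 72.7%, of which 42.2% were detected as unknown adverse reactions. In addition, 29.3% of known adverse reactions, which previously could not be detected as the same adverse reaction because of notational variation, were detected as the same adverse reaction once the variation was resolved, confirming that the proposed method is useful.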

CiNii J-GLOBAL
マイクロブログを対象とした1,000人レベルでの著者推定手法構築に向けて

奥野峻弥, 浅井洋樹, 山名早人

研究報告データベースシステム（DBS） 2013 ( 7 ) 1 - 6 2013.07

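Authorship attribution research has traditionally focused on novels and has dealt with limited numbers of candidate authors. We have previously proposed an authorship attribution method at the 10,000-author level for text posted on the Internet and obtained an accuracy of about 80%. However, messages posted to microblogs, which have large numbers of users, are numerous but each is short and contains many unknown words and typographical errors, so the accuracy of earlier methods drops. In this study, we therefore propose a method that builds a dictionary from the messages and uses a morphological analyzer equipped with that dictionary to perform authorship attribution over a large number of candidates from a small number of messages. In an evaluation experiment that identified the author from among 900 candidates, we confirmed that accuracy improved over existing authorship attribution methods.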

CiNii
マイクロブログを対象とした1,000人レベルでの著者推定手法構築に向けて

奥野峻弥, 浅井洋樹, 山名早人

情報処理学会研究報告. 情報学基礎研究会報告 2013 ( 7 ) 1 - 6 2013.07

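Authorship attribution research has traditionally focused on novels and has dealt with limited numbers of candidate authors. We have previously proposed an authorship attribution method at the 10,000-author level for text posted on the Internet and obtained an accuracy of about 80%. However, messages posted to microblogs, which have large numbers of users, are numerous but each is short and contains many unknown words and typographical errors, so the accuracy of earlier methods drops. In this study, we therefore propose a method that builds a dictionary from the messages and uses a morphological analyzer equipped with that dictionary to perform authorship attribution over a large number of candidates from a small number of messages. In an evaluation experiment that identified the author from among 900 candidates, we confirmed that accuracy improved over existing authorship attribution methods.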

CiNii
医薬品副作用情報を用いた副作用検索システムの提案

三上拓也, 駒田康孝, 野口保, 菅野敦之, 山名早人

研究報告データベースシステム（DBS） 2013 ( 11 ) 1 - 6 2013.07

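Early detection of and response to adverse drug reactions is an important issue in clinical practice. To achieve this, physicians and pharmacists would need to know every adverse reaction of every drug. However, a single drug has many known adverse reactions, and grasping all of them is difficult. In addition, adverse reactions have many similar notations that are understood as synonymous, so the same adverse reaction may be mistaken for a different one because of a difference in notation, which can delay its discovery. Furthermore, there may be unknown adverse reactions whose association with a drug has not been established. In this paper, we therefore propose an adverse reaction search system that is robust to notational variation and supports searching for unknown adverse reactions of drugs. The proposed method estimates unknown adverse reactions of a drug from the adverse reactions listed in drug package inserts and from case reports of suspected adverse reactions. In our experiments, we input 150 cases for which suspected adverse reactions had actually been reported, and evaluated usefulness by comparing the search results with the adverse reactions suspected of being associated with the drug in those case reports. The detection rate for adverse reactions was 72.7%, of which 42.2% were detected as unknown adverse reactions. In addition, 29.3% of known adverse reactions, which previously could not be detected as the same adverse reaction because of notational variation, were detected as the same adverse reaction once the variation was resolved, confirming that the proposed method is useful.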

CiNii
マイクロブログを対象とした1,000人レベルでの著者推定手法構築に向けて(ブログ・ソーシャルネットワーク,ビッグデータを対象とした管理・情報検索・知識獲得及び一般)

奥野峻弥, 浅井洋樹, 山名早人

電子情報通信学会技術研究報告. DE, データ工学 113 ( 150 ) 37 - 42 2013.07

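Authorship attribution research has traditionally focused on novels and has dealt with limited numbers of candidate authors. We have previously proposed an authorship attribution method at the 10,000-author level for text posted on the Internet and obtained an accuracy of about 80%. However, messages posted to microblogs, which have large numbers of users, are numerous but each is short and contains many unknown words and typographical errors, so the accuracy of earlier methods drops. In this study, we therefore propose a method that builds a dictionary from the messages and uses a morphological analyzer equipped with that dictionary to perform authorship attribution over a large number of candidates from a small number of messages. In an evaluation experiment that identified the author from among 900 candidates, we confirmed that accuracy improved over existing authorship attribution methods.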

CiNii J-GLOBAL
編集にあたって

山名早人, 酒井哲也, 石川佳治

情報処理学会論文誌データベース（TOD） 6 ( 3 ) i - iii 2013.06

CiNii
Interaction prediction method of G-protein-coupled receptor and chemical compound with SVM

OHNO Yorihito, TOH Hiroyuki, YAMANA Hayato

IEICE technical report. Neurocomputing 113 ( 111 ) 55 - 61 2013.06

G-protein-coupled receptors (GPCRs) transduce signals carried by endogenous ligands into the cytosol and are regarded as important targets for new drug development. Accurate prediction of interactions between GPCRs and chemical compounds is keenly needed for drug development, because the number of GPCR-compound combinations is too large to examine experimentally. Such computational approaches have therefore been investigated extensively. A preceding study by Okuno et al. achieved high prediction performance by using the entire amino acid sequence of a GPCR and the chemical features of a compound. However, the amino acid residues involved in ligand binding are quite limited, and we assume that these residues strongly affect binding. We therefore identified the amino acid residues constituting the ligand-binding region from the 3D structure of a GPCR and examined whether using these residues, instead of the entire amino acid sequence, improves prediction. A support vector machine (SVM) was used for the prediction. Experimental results showed that accuracy improved by 3.6%, F-value by 0.038%, and AUC by 0.002% compared with the approach of Okuno et al.

CiNii J-GLOBAL
Identifying Topics and Influential Users based on Information Propagation in Twitter

ZHOU Guanying, ZHANG Xuan, YAMANA Hayato

IEICE technical report. Data engineering 113 ( 105 ) 29 - 33 2013.06

Recently, Twitter has become an efficient tool for product promotion. Thus, both measuring the influence of individuals and identifying influential Twitter users have great research value. Most previous research on identifying influential Twitter users has concentrated on the following and/or friend relationships in the user network, without taking actual information propagation into account. We believe influential users are those who spread information during propagation, and thus they are key figures in advertising for commercial companies. In this paper, we propose a new method to identify influential Twitter users in popular Twitter topics based on the retweet relationship. We use LDA to detect topics and then rank Twitter users in each topic.

CiNii
Accuracy Evaluation for Search Engine's Hit Count : Comparison with Document Frequency in Large-Scale Crawl Data

12 ( 1 ) 13 - 18 2013.06

CiNii J-GLOBAL
SCPSSMpred: A general sequence-based method for ligand-binding site prediction

Chun Fang, Tamotsu Noguchi, Hayato Yamana

IPSJ Transactions on Bioinformatics 6 35 - 42 2013.06

DOI
Low Latency Data Stream Processing on Multi-Core CPU Environments

UEDA Takanori, AKIOKA Sayaka, YAMANA Hayato

The IEICE transactions on information and systems (Japanese edition) 96 ( 5 ) 1094 - 1104 2013.05

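Some data stream processing applications, such as algorithmic trading and network packet monitoring, need to process high-volume data streams with low latency. Parallel processing on multi-core CPUs makes it possible to process high-volume streams, but assigning a thread to every operator increases latency because of the overhead of inter-core communication and thread waiting. Conversely, too few threads cannot exploit parallelism and limit the amount of data that can be processed. In this paper, we propose a thread allocation method that shortens processing latency by taking the CPU architecture and thread waiting overhead into account. We give a definition of latency for data stream processing in multi-core environments and show that the optimal thread allocation can be found under the model. Furthermore, we propose a method for relocating operators in response to changes in the data rate of the input stream, without stopping stream processing and while preserving the order in which tuples are applied.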

CiNii
編集にあたって

山名早人, 酒井哲也, 石川佳治

情報処理学会論文誌データベース（TOD） 6 ( 2 ) i - iii 2013.03

CiNii
Task graphs for machine learning algorithms

AKIOKA SAYAKA, MURAOKA YOICHI, YAMANA HAYATO

電子情報通信学会技術研究報告 112 ( 454(IBISML2012 93-109) ) 25 - 30 2013.02

Applications that process massive amounts of data, so-called "big data analysis," are one of the recent hot requirements, and machine learning algorithms are expected to run much faster in scalable environments to fulfill them. A machine learning algorithm often behaves quite differently from a data-intensive application, which has been investigated in depth in the high-performance computing area. Therefore, a clear model of data access patterns, dependency analysis, and parallelism extraction for machine learning algorithms is indispensable for running them faster in parallel and distributed computing environments such as the Cloud. This article reports the generation of task graphs for well-known machine learning algorithms; the task graphs carry all the information required for parallel execution of the algorithms.

CiNii J-GLOBAL
三角形特徴を用いた部分形状検索(ポスターセッション,大規模データベースとパターン認識)

武井宏将, 山名早人

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 112 ( 441 ) 109 - 109 2013.02

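In recent years, 3D shape data has come to be used in a variety of fields, and a large amount of 3D data has been stored. As 3D shape data grows, so does the need to search it. Shape retrieval falls broadly into two kinds: whole shape retrieval and partial shape retrieval. Whole shape retrieval searches for shape data that exactly matches the shape data given as the query, whereas partial shape retrieval searches for shape data that contains the shape data given as the query. In most situations where shape retrieval is performed, the user rarely has shape data identical to the shape being sought, and the query is usually partial shape data. The need for partial shape retrieval is therefore greater than that for whole shape retrieval. On the other hand, because the query and the target shape are not identical, establishing correspondences between shapes is difficult, and partial shape retrieval is known as a challenging problem. Known approaches to partial shape retrieval include methods based on bag-of-features and methods based on feature point matching. However, bag-of-features methods represent whole or partial shape data as histograms in advance, so they are difficult to apply unless the query resembles a shape that was represented as a histogram beforehand. In feature point matching methods, the accuracy of feature point matching affects retrieval accuracy; in particular, mismatched feature points degrade accuracy. The method proposed in this paper improves matching accuracy by using triangles formed from three feature points. Feature points are extracted from the shape data given as the query, and each set of three feature points forms a triangle. Based on correspondences determined by the distance between local feature vectors, triangles are constructed from the feature points of the stored shape data, and the triangles are compared with each other. Using three feature points takes into account not only local information but also the positional relationships among feature points, and checking the correspondence between triangles removes mismatches and improves matching accuracy. In addition, we index the local feature vector of each feature point and combine the index with the triangle correspondence check to achieve fast retrieval. Our contributions in this paper are the following two: (1) we improve accuracy by using triangle features; (2) we achieve fast retrieval by combining indexing of the local feature vectors of feature points with the triangle correspondence check. As an experiment, we built a database indexing 100 shape models and, for 20 models chosen at random from the indexed data, performed partial shape retrieval using as queries data cut out to roughly 30 to 50% of the whole shape. In this experiment, the proposed method achieved an accuracy of 0.85, compared with 0.65 when feature points were used individually. We also measured retrieval speed while increasing the number of indexed models and showed that the proposed method retrieves robustly with respect to the number of indexed models.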

CiNii
編集にあたって

山名早人, 酒井哲也, 石川佳治

情報処理学会論文誌データベース（TOD） 6 ( 1 ) i - iii 2013.01

CiNii
A Comparative Study of User Evaluations for Global Restaurants under the Multi-cultural Backgrounds

山名早人

WebDBフォーラム2013論文集 - 2013 [Refereed]
マイクロブログを対象とした著者推定手法の提案-5,000人規模での著者推定-

奥野峻弥, 浅井洋樹, 山名早人

情報処理学会シンポジウムシリーズ(CD-ROM) 2013 ( 5 ) 2013

J-GLOBAL
教育環境における書き込み可能な電子ペーパー端末の利活用

浅井洋樹, 山名早人

大学ICT推進協議会年次大会論文集 3p 2013

CiNii
The Reliability of Web Search Engine and its Appropriate Usage

( 262 ) 2 - 7 2013

CiNii J-GLOBAL
ソーシャルメディアを含む多メディアビッグデータの統合的解析による情報抽出(ソーシャルメディア,ビッグデータとソーシャルコンピューティング,及び一般)

上田高徳, 浅井洋樹, 藤木紫乃, 山本祐輔, 武井宏将, 秋岡明香, 山名早人

電子情報通信学会技術研究報告. DE, データ工学 112 ( 346 ) 53 - 58 2012.12

CiNii
ソーシャルメディアを含む多メディアビッグデータの統合的解析による情報抽出

上田高徳, 浅井洋樹, 藤木紫乃, 山本祐輔, 武井宏将, 秋岡明香, 山名早人

電子情報通信学会技術研究報告 112 ( 346(DE2012 27-40) ) 53 - 58 2012.12

CiNii J-GLOBAL
ソーシャルメディアを含む多メディアビッグデータの統合的解析による情報抽出

上田高徳, 浅井洋樹, 藤木紫乃, 山本祐輔, 武井宏将, 秋岡明香, 山名早人

情報処理学会研究報告. データベース・システム研究会報告 2012 ( 8 ) 1 - 6 2012.12

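This paper describes our ongoing effort to extract information through integrated analysis of multi-media big data. With the spread of social media, all kinds of information are now uploaded to the Internet in real time. We believe that more useful information can be extracted by analyzing "multi-media data," which combines multiple information sources rather than a single social medium. This paper describes the multi-media analysis we are working on, as well as QueueLinker, a parallel distributed processing framework we are developing to support the analysis of large-scale real-time data.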

CiNii
形態素間の優先関係を考慮した略語生成手法

田中友樹, 及川孝徳, 山名早人, 大西貴士, 土田正明, 石川開

情報処理学会シンポジウムシリーズ(CD-ROM) 2012 ( 5 ) ROMBUNNO.B4,TANAKA 2012.11

J-GLOBAL
Producer‐Consumer型モジュールで構成された並列分散Webクローラの開発

上田高徳, 佐藤亘, 鈴木大地, 打田研二, 森本浩介, 秋岡明香, 山名早人

情報処理学会シンポジウムシリーズ(CD-ROM) 2012 ( 5 ) ROMBUNNO.B3,UEDA 2012.11

J-GLOBAL
筆記情報と時系列モデルを用いた学習者つまずき検出(教育・学習支援プラットフォーム/一般)

浅井洋樹, 野輝明里, 苑田翔吾, 山名早人

電子情報通信学会技術研究報告. ET, 教育工学 112 ( 269 ) 65 - 70 2012.10

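Detecting where a student is struggling is a necessary process in supporting student learning. In CAI research on detecting learner difficulties, detection has been based on grading results and answering time, biometric information such as the learner's facial images and pulse obtained from sensors, and operation logs of input devices such as the keyboard and mouse. However, primary education today is still centered on handwriting, and detection of learner difficulties in such an environment has not been discussed in depth. This report examines a method for detecting learner difficulties from handwriting information obtained from the pen the student uses. For detection, we use an AR model, a time-series model, to detect change points where the learner's handwriting behavior changes, and perform estimation for each interval between change points. A trial evaluation confirmed a certain level of detection performance.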

CiNii J-GLOBAL
強調表記を利用した手書きドキュメント検索スニペット生成

浅井洋樹, 山名早人

情報処理学会研究報告(CD-ROM) 2012 ( 3 ) ROMBUNNO.DBS-154,NO.8 2012.10

J-GLOBAL
編集にあたって

山名早人, 酒井哲也, 石川佳治

情報処理学会論文誌データベース（TOD） 5 ( 3 ) i - iii 2012.09

CiNii
強調表記を利用した手書きドキュメント検索スニペット生成

浅井洋樹, 山名早人

情報処理学会研究報告. データベース・システム研究会報告 2012 ( 8 ) 1 - 7 2012.07

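In recent years, devices that accept handwritten input, typified by tablets and electronic pens, have begun to spread, and handwritten documents are increasingly being digitized. When browsing documents on a device to grasp the gist of each one during search and viewing, thumbnails that shrink the original document or text snippets that output a summary are often used as the snippets in a list view. However, when conventional, simply shrunken thumbnails are used for handwritten documents, the characters are shrunk without being summarized, so the content cannot be read and it is difficult to grasp the gist. Moreover, to the best of our knowledge, no study has summarized handwritten documents, which contain non-text information such as figures and allow only imperfect character recognition. In this paper, we therefore propose a search snippet that makes the gist easy to grasp by summarizing a handwritten document using the writer's emphasis marks, such as underlines and enclosures. In a user evaluation of information retrieval, using our proposed snippets reduced search time by 42% on average compared with the conventional approach.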

CiNii
強調表記を利用した手書きドキュメント検索スニペット生成

浅井洋樹, 山名早人

情報処理学会研究報告. 情報学基礎研究会報告 2012 ( 8 ) 1 - 7 2012.07

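In recent years, devices that accept handwritten input, typified by tablets and electronic pens, have begun to spread, and handwritten documents are increasingly being digitized. When browsing documents on a device to grasp the gist of each one during search and viewing, thumbnails that shrink the original document or text snippets that output a summary are often used as the snippets in a list view. However, when conventional, simply shrunken thumbnails are used for handwritten documents, the characters are shrunk without being summarized, so the content cannot be read and it is difficult to grasp the gist. Moreover, to the best of our knowledge, no study has summarized handwritten documents, which contain non-text information such as figures and allow only imperfect character recognition. In this paper, we therefore propose a search snippet that makes the gist easy to grasp by summarizing a handwritten document using the writer's emphasis marks, such as underlines and enclosures. In a user evaluation of information retrieval, using our proposed snippets reduced search time by 42% on average compared with the conventional approach.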

CiNii
編集にあたって

山名早人, 酒井哲也, 石川佳治

情報処理学会論文誌データベース（TOD） 5 ( 2 ) i - iii 2012.06

CiNii
Welcome message from MAW-2012 international symposium organizers

Takahiro Hara, Kin Fun Li, Hayato Yamana, Shengrui Wang

Proceedings - 26th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2012 2012.05

DOI
ThumbPop : 注目物体を強調した疑似立体サムネイル生成 (ヒューマン情報処理)

新井啓介, 武井宏将, 山名早人

電子情報通信学会技術研究報告 : 信学技報 111 ( 500 ) 177 - 182 2012.03

CiNii
Model-based Gaze Estimation Method with Low-Resolution RGB Images of Eyes

111 ( 499 ) 31 - 36 2012.03

CiNii
ThumbPop : Visual Attention-Based Pseudo 3D Thumbnail

111 ( 499 ) 177 - 182 2012.03

CiNii
低解像度可視光目画像を用いたモデルベース視線推定手法 (ヒューマン情報処理)

福田崇, 山名早人

電子情報通信学会技術研究報告 : 信学技報 111 ( 500 ) 31 - 36 2012.03

CiNii
Model-based Gaze Estimation Method with Low-Resolution RGB Images of Eyes

FUKUDA Takashi, YAMANA Hayato

Technical report of IEICE. HIP 111 ( 500 ) 31 - 36 2012.03

Recording eye-gaze data from many people in natural situations is valuable in human factors, market strategy, and other studies. A gaze tracking system equipped with webcams is suitable for recording such data. However, the resolution of eye images captured by webcams is low, and low-resolution eye images tend to be distorted and cause errors in gaze estimation. We have developed a gaze tracking system that works with low-resolution eye images. In our past study, we extract contours of pupils, remove distorted areas in these contours, approximate pupil contours into an ellipse, and calculate gaze directi...

CiNii
ThumbPop : Visual Attention-Based Pseudo 3D Thumbnail

ARAI Keisuke, Takei Hiromasa, YAMANA Hayato

Technical report of IEICE. HIP 111 ( 500 ) 177 - 182 2012.03

We propose a pseudo 3D thumbnail generation method that improves the recognizability of visual attention objects by decomposing an original image into visual attention objects and a background. Thumbnails created by scaling, which are in general use, are not suitable for mobile devices equipped with a small display, because attention objects become too small to recognize. Furthermore, previous visual attention-based thumbnail generation methods cannot preserve the recognizability of visual attention objects because they bury them in the background image. To solve these problems, we propose a pseudo 3D thumbnail genera...

CiNii
Model-based Gaze Estimation Method with Low-Resolution RGB Images of Eyes

FUKUDA Takashi, YAMANA Hayato

Technical report of IEICE. PRMU 111 ( 499 ) 31 - 36 2012.03

Recording eye-gaze data from many people in natural situations is valuable in human factors, market strategy, and other studies. A gaze tracking system equipped with webcams is suitable for recording such data. However, the resolution of eye images captured by webcams is low, and low-resolution eye images tend to be distorted and cause errors in gaze estimation. We have developed a gaze tracking system that works with low-resolution eye images. In our past study, we extract contours of pupils, remove distorted areas in these contours, approximate pupil contours into an ellipse, and calculate gaze directi...

CiNii J-GLOBAL
ThumbPop : Visual Attention-Based Pseudo 3D Thumbnail

ARAI Keisuke, Takei Hiromasa, YAMANA Hayato

Technical report of IEICE. PRMU 111 ( 499 ) 177 - 182 2012.03

We propose a pseudo 3D thumbnail generation method that improves the recognizability of visual attention objects by decomposing an original image into visual attention objects and a background. Thumbnails created by scaling, which are in general use, are not suitable for mobile devices equipped with a small display, because attention objects become too small to recognize. Furthermore, previous visual attention-based thumbnail generation methods cannot preserve the recognizability of visual attention objects because they bury them in the background image. To solve these problems, we propose a pseudo 3D thumbnail genera...

CiNii J-GLOBAL
Visual-Attention-based Thumbnail using Two-Stage GrabCut

Keisuke Arai, Hiromasa Takei, Hayato Yamana

2012 INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS (ICMCS) 96 - 101 2012

This paper proposes a new thumbnail generation method to improve the recognizability of visual attention objects on small displays. Previous methods such as simple scaling reduce the recognizability of original images because the visual attention objects become too small to recognize. When we view thumbnails on small displays such as those of mobile devices, recognizability is indispensable for handling many images simultaneously. To solve the problem of low recognizability of visual attention objects, we adopt GrabCut to extract visual attention objects from an original image and then divide the original image into visual attention objects and a background image. While the background image is reduced to fit the size of a thumbnail, the extracted visual attention objects are merged into the reduced background image to preserve their recognizability. In adopting GrabCut, we propose a two-stage GrabCut method to automate the extraction of attention objects; the extraction was performed by hand in previous methods. Our experimental results show that our proposed method is able to shorten the search time by 44% and improve the precision of the search by 19% in comparison with simple scaling.

DOI
データストリーム処理におけるレイテンシ最小化と高可用性のためのオペレータ実行方法

上田高徳, 打田研二, 秋岡明香, 山名早人

情報処理学会シンポジウムシリーズ(CD−ROM) 2011 ( 5 ) ROMBUNNO.2G-2,UEDA 2011.10

J-GLOBAL
品詞n‐gramを用いた著者推定手法—話題に依存しない頑健性の評価—

井上雅翔, 中島泰, 山名早人

情報処理学会シンポジウムシリーズ(CD−ROM) 2011 ( 5 ) ROMBUNNO.ROMBUNSHOSESSHON,INOE 2011.10

J-GLOBAL
A Proposal and Validity Inspection of Reliability Evaluation Method for Search Engines' Hit Count

SATO KO, UCHIDA KENJI, YAMANA HAYATO

情報処理学会研究報告(CD−ROM) 2011 ( 3 ) ROMBUNNO.DBS-152,NO.8 2011.10

J-GLOBAL
A Proposal and Validity Inspection of Reliability Evaluation Method for Search Engines' Hit Count

2011 ( 8 ) 1 - 8 2011.07

CiNii
A Proposal and Validity Inspection of Reliability Evaluation Method for Search Engines' Hit Count

Koh Satoh, Kenji Uchida, Hayato Yamana

IPSJ SIG Notes 2011 ( 8 ) 1 - 8 2011.07

Recently, numerous studies have relied on the number of search results, or hit count. However, hit counts returned by search engines can fluctuate unnaturally when observed on different days, and the resulting errors may be too large for research use. It is therefore important to discuss how to evaluate and improve the reliability of hit counts. We have carried out several studies on this problem, such as one specifying the conditions under which search engines return reliable hit counts and one defining reliability evaluation metrics. In this paper, in ...

CiNii J-GLOBAL
Extraction of Emphasized Words from On-line Handwritten Notebooks

浅井洋樹, 山名早人

日本データベース学会論文誌 10 ( 1 ) 67 - 72 2011.06

CiNii J-GLOBAL
Welcome message from MAW 2011 symposium chairs

Takahiro Hara, Kin Fun Li, Shengrui Wang, Hayato Yamana

Proceedings - 25th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2011 48 2011.05

DOI
Welcome message from the QuEST 2011 workshop chairs

Kin Fun Li, Rick McGeer, Stephen Neville, Hayato Yamana

Proceedings - 25th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2011 2011.05

DOI
Time-weighted web authoritative ranking

Bundit Manaskasemsak, Arnon Rungsawang, Hayato Yamana

INFORMATION RETRIEVAL 14 ( 2 ) 133 - 157 2011.04

We investigate temporal factors in assessing the authoritativeness of web pages. We present three different metrics related to time: age, event, and trend. These metrics measure recentness, special event occurrence, and trend in revisions, respectively. An experimental dataset is created by crawling selected web pages for a period of several months. This data is used to compare page rankings by human users with rankings computed by the standard PageRank algorithm (which does not include temporal factors) and three algorithms that incorporate temporal factors, including the Time-Weighted PageRank (TWPR) algorithm introduced here. Analysis of the rankings shows that all three temporal-aware algorithms produce rankings more like those of human users than does the PageRank algorithm. Of these, the TWPR algorithm produces rankings most similar to human users', indicating that all three temporal factors are relevant in page ranking. In addition, analysis of parameter values used to weight the three temporal factors reveals that age factor has the most impact on page rankings, while trend and event factors have the second and the least impact. Proper weighting of the three factors in TWPR algorithm provides the best ranking results.

DOI
情報検索コース

監修, 神門典子, 山名早人

Webラーニングプラザ：技術者Web学習システム(技術者向けeラーニング）, 科学技術振興機構 2011.03 [Invited]
レビューからの商品比較表の自動生成

相川直視, 山名早人

言語処理学会年次大会発表論文集 17th (CD-ROM) ROMBUNNO.D2-3 2011.03

J-GLOBAL
ロック制御型同期複製ミドルウェアの提案

堀井洋, 小野寺民也, 山名早人

電子情報通信学会論文誌 D J94-D ( 3 ) 515-524 2011.03

J-GLOBAL
Increase the Image Search Results by Using Flickr Tags

ShanBin Chan, 佐藤真一, 山名早人

DEIM2011 B1-3 2011
2段階LDAを用いたインクリメンタルなソフトウェアレポジトリの自動分類手法

井上雅翔, 新井啓介, 山名早人

DEIM2011 E4-5 2011
ウェブサーバへの最短訪問間隔を保証する時間計算量がO(1)のウェブクローリングスケジューラ

森本浩介, 上田高徳, 打田研二, 山名早人

DEIM2011 B5-6 2011
品詞と助詞の出現パターンを用いた類似著者の推定とコミュニティ抽出

中島泰, 山名早人

DEIM2011 C6-5 2011
検索エンジンのヒット数の信頼性に対する評価

佐藤亘, 打田研二, 山名早人

DEIM2011 E6-1 2011
結晶化環境におけるpH値を考慮したSVMによるタンパク質結晶化の予測

片岡義雅, 野口保, 百石弘澄, 小林大輔, 山名早人

DEIM2011 D8-1 2011
筆記者の強調表現に基づいたオンライン手書きノートの圧縮サムネイル生成手法

浅井洋樹, 小林大輔, 山名早人

DEIM2011 E8-6 2011
Cannyエッジ情報に基づく人物画像における髪型の定量化

須藤優介, 福田崇, 山名早人

DEIM2011 E9-6 2011
字幕テキストの利用によるマイクロブログからのテレビ番組に言及したメッセージ検出手法

山本祐輔, 及川孝徳, 山名早人

DEIM2011 A10-1 2011
レビューからの商品比較表の自動生成

相川直視, 山名早人

自然言語処理学会第17回年次大会 D2-3 2011
Extraction of Distinctive Phrases from Mini Blog Entries and Application for Topic Tracking across the Media

KATO NORIKAZU, AKIOKA SAYAKA, MURAOKA YOICHI, YAMANA HAYATO

情報処理学会研究報告(CD−ROM) 2010 ( 4 ) ROMBUNNO.DBS-151,22 - 8 2010.12

CiNii J-GLOBAL
Dynamic Resource Allocation for Streaming Applications in Cloud Environment

2010 ( 4 ) 1 - 7 2010.12

CiNii
Dynamic Resource Allocation for Streaming Applications in Cloud Environment

Sayaka Akioka, Norikazu Kato, Yoichi Muraoka, Hayato Yamana

IPSJ SIG Notes 2010 ( 8 ) 1 - 7 2010.11

Streaming applications, which must process data that arrives frequently in chronological order, are now a center of interest. This paper proposes a methodology for parallelizing streaming applications and dynamically allocating them over a distributed environment such as a cloud computing environment. The simulation results confirmed that practical streaming applications need to be processed in parallel to avoid losing data for lack of processing time. However, the methodology proposed in this paper enables all the input data to be processed with 26% overhead of average execution time of each block o...

CiNii J-GLOBAL
Extraction of Distinctive Phrases from Mini Blog Entries and Application for Topic Tracking across the Media

Norikazu Kato, Sayaka Akioka, Yoichi Muraoka, Hayato Yamana

IPSJ SIG Notes 2010 ( 22 ) 1 - 8 2010.11

Mini blog services, including Twitter, are among the emerging media of note. Across-the-board analysis of posted blogs and of descriptions in related media, such as TV and newspapers, is indispensable for social analysis. Posts in mini blogs, however, often include names of particular movies, novels, and products, many of which are compound words. A compound word is often split into several words by word processors and is difficult to extract as one solid word. Here, if a hot compound word is extracted as it is supposed to be, the quality of morphological analysis is improved to contrib...

CiNii J-GLOBAL
Speed-Up of Resizable-LSH for Similarity-Based Range Query

Kunihiro Yamazaki, Hayato Yamana

IPSJ SIG Notes 2010 ( 5 ) 1 - 8 2010.09

In this paper, we improve our previously proposed Resizable-LSH to speed up range queries in approximate similarity search. Locality-Sensitive Hashing (LSH) is now drawing attention as an effective algorithm for approximate nearest neighbor search. LSH adopts hash functions that collide with high probability if two vectors are close, so it finds approximate nearest neighbors quickly even if the dataset is high-dimensional. However, LSH uses a fixed search range when generating hash tables, so resizing the search range is expensive. As a solution, we've proposed the alg...

CiNii
The 2010 IEEE International Symposium on Mining and Web (MAW): Welcome message from symposium organizers

Takahiro Hara, Kin Fun Li, Shengrui Wang, Hayato Yamana, Laurence T. Yang, Yanchun Zhang

24th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2010 2010.07

DOI
The 2010 IEEE International Workshop on Quantitative Evaluation of large-scale Systems and Technologies (QuEST): Welcome message from workshop organizers

Kin Fun Li, Rick McGeer, Stephen Neville, Hayato Yamana

24th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2010 90 2010.07

DOI
Two step adjustment technique of term weight

YANO Hiroya, NAKAJIMA Tai, YAMANA Hayato

IEICE technical report. Data engineering 110 ( 107 ) 45 - 50 2010.06

The TF・IDF method is one of the methods for weighting terms in document retrieval. The IDF value indicates how rarely a term appears in a document set and depends on the document set being searched. The problem, therefore, is that even if a term rarely appears in a document set from the same field as the query (which means the term is highly specific in the document), its IDF value is small if it appears frequently in the document set being searched. In this paper, we propose and study a two-step adjustment technique for term weights. In the first step, we get documents r...

CiNii J-GLOBAL
Similar object detection using template matching focused on positional relationship of feature regions

Keisuke Arai, Kosuke Morimoto, Hayato Yamana

IPSJ SIG Notes. CVIM 2010 ( 4 ) 1 - 8 2010.05

Detecting similar objects in a large quantity of images helps us organize images by category and conduct market research using images on the Web. Template matching, which can detect similar objects, is not suited to unknown images because it assumes that the target image contains the same object. In this paper, we aim to decrease the false-positive rate caused by this premise of template matching. We propose a method that takes the positional relationships of feature regions into account in addition to conventional template matching. Each feature region in template image matches target imag...

CiNii
Model-Based Gaze Tracking with Low-cost Web Cameras

FUKUDA Takashi, MATSUZAKI Katsuhiko, YAMANA Hayato

Technical report of IEICE. HIP 109 ( 471 ) 113 - 118 2010.03

Gaze estimation that does not restrain users at home will contribute to the reform of user interfaces. Commercial gaze estimation systems achieve high accuracy by using infrared light; however, gaze estimation systems with web cameras are highly desirable for home use because of their price. The problem with web cameras is their low resolution for gaze estimation. Low-resolution images are strongly influenced by noise, so past studies do not estimate the detailed direction of the eyes. In this paper, we propose a new gaze estimation method which shows high accuracy using image processing and geometrica...

CiNii J-GLOBAL
Model-Based Gaze Tracking with Low-cost Web Cameras

FUKUDA Takashi, MATSUZAKI Katsuhiko, YAMANA Hayato

Technical report of IEICE. PRMU 109 ( 470 ) 113 - 118 2010.03

Gaze estimation that does not restrain users at home will contribute to the reform of user interfaces. Commercial gaze estimation systems achieve high accuracy by using infrared light; however, gaze estimation systems with web cameras are highly desirable for home use because of their price. The problem with web cameras is their low resolution for gaze estimation. Low-resolution images are strongly influenced by noise, so past studies do not estimate the detailed direction of the eyes. In this paper, we propose a new gaze estimation method which shows high accuracy using image processing and geometrica...

CiNii
6K-7 Data Mining Algorithms Classified Based on Data Access Patterns

Akioka Sayaka, Muraoka Yoichi, Yamana Hayato, Nakajima Tatsuo

全国大会講演論文集 72 ( 5 ) "5 - 105"-"5-106" 2010.03

CiNii
Time-weighted web authoritative ranking

Bundit Manaskasemsak, Arnon Rungsawang, Hayato Yamana

Information Retrieval Vol.13 ( No.4 ) 2010
A Method for Improving the Performance of a Focused Crawler for Collecting Web Pages in a Specific Language

山名早人

第2回データ工学と情報マネジメントに関するフォーラム論文集 B2-1 16 2010
Estimating TV Programs Cited in Blogs by Using Closed-Caption Text

及川孝徳, 中島泰, 松崎勝彦, 黒木さやか, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010) 2010
A Synonym Extraction Method Using Anchor Text and Link Structure

黒木さやか, 立石健二, 細見格, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010) 2010
Trends and Analysis of Content Distributed over the Winny Network

打田研二, 高木浩光, 山崎邦弘, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010) 2010
A Survey of the Usage of P3P Compact Policies on the WWW

山名早人

第2回データ工学と情報マネジメントに関するフォーラム論文集 D8-5 18 2010
Unexpected and Interesting: A Proposal of a Video Recommendation Method Emphasizing Discovery on Video Viewing Sites

中村智浩, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010) 2010
QueueLinker: A Distributed Processing Framework for Pipeline Applications

上田高徳, 片瀬弘晶, 森本浩介, 打田研二, 油井誠, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010) 17 2010
LittleWeb: A Web Graph Compression Method Based on Aggregating Similar Nodes

片瀬弘晶, 上田高徳, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010) 2010
Hit Count Dance: Reliability Verification of Search Engines' Hit Counts

舟橋卓也, 山名早人

第2回データ工学と情報マネジメントに関するフォーラム(DEIM2010) 2010
Model-Based Gaze Estimation Using Low-Cost Web Cameras

福田崇, 松崎勝彦, 山名早人

信学技報(PRMU) Vol.2009 ( No.252 ) 113 - 118 2010
Classification of Data Mining Methods Based on Data Access Patterns

秋岡明香, 村岡洋一, 山名早人, 中島達夫

第72回情処全大 6K-7 ( 5 ) 2010

J-GLOBAL
Similar object detection using template matching focused on positional relationship of feature regions

新井啓介, 森本浩介, 山名早人

情報処理学会研究報告(CD-ROM) Vol.2010-CVIM-172 ( No.4 ) 1 - 8 2010

J-GLOBAL
Search Engines’ Trustworthiness-Current Status

Hayato YAMANA

Proc. of the 5th Korea-Japan Database Workshop 219 - 240 2010
A Two-Stage Adjustment Method for Search Term Weights

矢野博也, 中島泰, 山名早人

信学技報 Vol.110 ( No.107 ) 45 - 50 2010
Similar Object Detection by Template Matching Using Region Segmentation and Color Features

新井啓介, 森本浩介, 山名早人

MIRU2010,IS2-42 2010
Clothing Region Extraction from Frontal Image Estimation in Video

金正文, 森本浩介, 山名早人

MIRU2010, IS3-36 2010
Model-Based Gaze Estimation from Low-Resolution Eye Images

福田崇, 松崎勝彦, 山名早人

MIRU2010, IS1-46 2010
Image Classification Using Localized Multiple Kernel Learning

小林大輔, 相川直視, 山名早人

MIRU2010, IS2-43 2010
Data Access Pattern Analysis on Stream Mining Algorithms for Cloud Computation

Sayaka Akioka, Hayato Yamana, Yoichi Muraoka

Proc. of the 2010 Int'l Conf. on Parallel and Distributed Processing Techniques and Applications 2010
The Method of Improving the Specific Language Focused Crawler

Shan-Bin Chan, Hayato Yamana

Proc. of the 1st CIPS-SIGHAN Joint Conf. on Chinese Language Processing(CLP2010) 2010
Speed-Up of Resizable-LSH for Similarity-Based Range Query

山崎邦弘, 山名早人

情報処理学会研究報告(CD-ROM) Vol.2010-AL-131 ( No.5 ) 1 - 8 2010

J-GLOBAL
Community QA Question Classification: Is the Asker Looking for Subjective Answers or Not?

Naoyoshi AIKAWA, Tetsuya SAKAI, Hayato YAMANA

WebDBForum2011 2010
Search Engines' Trustworthiness-Current Status

山名早人

Proc.of the 5th Korea-Japan Database Workshop 219 - 240 2010
Reliability Verification of Search Engines' Hit Counts: How to Select a Reliable Hit Count for a Query

Takuya Funahashi, Hayato Yamana

CURRENT TRENDS IN WEB ENGINEERING 6385s 114 - 125 2010

In this paper, we investigate the trustworthiness of search engines' hit counts, the numbers returned as search result counts. Since many studies adopt search engines' hit counts to estimate the popularity of input queries, the reliability of hit counts is indispensable for achieving trustworthy studies. However, hit counts are unreliable because they change when a user clicks the "Search" button more than once, clicks the "Next" button on the search results page, or queries the same term on separate days. In this paper, we analyze the characteristics of hit count transitions by gathering various types of hit counts over two months using 10,000 queries. The results of our study show that the hit counts with the largest search offset just before search engines adjust their hit counts are the most reliable. Moreover, hit counts are the most reliable when they are consistent over approximately a week.

DOI
Cross-media impact on Twitter in Japan

Sayaka Akioka, Norikazu Kato, Yoichi Muraoka, Hayato Yamana

International Conference on Information and Knowledge Management, Proceedings 111 - 118 2010

Twitter, a microblogging service, is attracting attention as a new communication channel. To deepen the understanding of this new service, this paper reports the characteristics of Twitter users in Japan and the impact of media such as publications and TV programs on the Twitter community. To the best of our knowledge, this paper is the first to quantitatively analyze the mutual impact between Twitter and other media. For the analyses, we crawled user profiles whose language setting is Japanese and conducted several analyses with well-known methodologies used in conventional work. We confirmed the characteristics of the collected user profiles: the distributions of the number of friends and the number of follows both follow a power law, and the two numbers are correlated. Besides the collected user profiles, we also utilized closed caption data of TV programs in Japan and other information on media that picked up Twitter. We ran a batch process matching these data from outside Twitter with the collected user profiles, and concluded that although Twitter has already spread widely among Japanese people, media still have a huge impact on the growth of Twitter users. We also conjecture that the impact is not one-sided but a mutual influence between Twitter and other media. © 2010 ACM.

DOI
A Lock-free GCLOCK Page Replacement Algorithm

2 ( 4 ) 32 - 48 2009.12

In this paper, we propose a lock-free variant of the GCLOCK page replacement algorithm, named Nb-GCLOCK. Concurrent access to the buffer management module is a major factor that prevents database scalability to processors. Therefore, we propose a non-blocking scheme for bufferfix operations that fix buffer frames for requested pages without locks by combining Nb-GCLOCK and a wait-free hash table. Our experimental results revealed that our scheme can obtain nearly linear scalability to processors up to 64 processors, although the existing locking-based schemes do not scale beyond 16 processors.
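For readers unfamiliar with the baseline, the following is a minimal single-threaded sketch of a generalized CLOCK (GCLOCK) buffer in Python. It is not the lock-free Nb-GCLOCK of the paper and uses no wait-free hash table; the class name `GClock`, the `max_count` cap, and the choice to start a newly loaded page with a counter of 1 are assumptions made for illustration.

```python
class GClock:
    """Single-threaded generalized CLOCK buffer (illustrative sketch only)."""

    def __init__(self, capacity, max_count=3):
        self.capacity = capacity
        self.max_count = max_count          # assumed cap on the reference counter
        self.pages = [None] * capacity      # page id held by each frame
        self.counts = [0] * capacity        # reference counter of each frame
        self.index = {}                     # page id -> frame number
        self.hand = 0                       # position of the clock hand

    def access(self, page):
        """Return True on a buffer hit, False on a miss (the page is then loaded)."""
        frame = self.index.get(page)
        if frame is not None:
            # Hit: raise the counter so the page survives more sweeps.
            self.counts[frame] = min(self.counts[frame] + 1, self.max_count)
            return True
        frame = self._find_victim()
        old = self.pages[frame]
        if old is not None:
            del self.index[old]
        self.pages[frame] = page
        self.counts[frame] = 1
        self.index[page] = frame
        return False

    def _find_victim(self):
        """Sweep the hand, decrementing counters, until a free or zero-count frame is found."""
        while True:
            frame = self.hand
            self.hand = (self.hand + 1) % self.capacity
            if self.pages[frame] is None or self.counts[frame] == 0:
                return frame
            self.counts[frame] -= 1
```

In a concurrent buffer manager, every call to `access` would contend on the shared hand, counters, and lookup table, which is the kind of bottleneck the abstract says the non-blocking scheme is designed to avoid.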

CiNii
Prediction of GPCR ligands by 2-way prediction method

Hiroto Hyakkoku, Minoru Sugihara, Makiko Suwa, Tsuyoshi Kato, Hayato Yamana, Wataru Fujibuchi

IPSJ SIG technical reports 2009 ( 2 ) 1 - 8 2009.09

G-protein coupled receptors (GPCRs) are important pharmacological targets, and predicting unknown interactions between GPCRs and ligands is one of the most interesting topics in current computational biology. However, the ligands of many GPCRs have not yet been identified experimentally, and it is difficult to predict unknown ligands of GPCRs because of insufficient training data. We have developed a 2-way prediction method based on the support vector machine. In this method, the prediction is performed by using both information of ligands and GPCRs and one can apply this method to the case...

CiNii
Proposal of Word Salad Spam Detection Method using N-gram and Interrupted Collocations

MORIMOTO Kosuke, KATASE Hiroaki, YAMANA Hayato

研究報告情報学基礎（FI） 95 ( 24 ) X1 - X8 2009.07

Information on the Internet has become important because of the explosive growth in Web pages. However, spam pages have also increased explosively, lowering the reliability of information obtained from the Internet. Although there are many spamming methods, in this article we focus on "word salad", which generates text automatically, and we propose an effective method for detecting it. We detect word salad with a score based on Kullback-Leibler divergence calculated with n-grams and interrupted collocations. In our experiment, our method improved the F-value by 0.18 points over the existing method.
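As a rough illustration of scoring text by Kullback-Leibler divergence over n-grams, the Python sketch below compares the bigram distribution of a document with that of a reference corpus. It is only an assumption-laden sketch: the function names, the use of bigrams, and the add-one smoothing are choices made here, and the interrupted collocations used in the paper are not modeled.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return the list of adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))


def kl_score(doc_tokens, ref_tokens):
    """KL divergence of the document's bigram distribution from the reference's.

    The reference distribution uses add-one smoothing over the union of
    observed bigrams (an assumption of this sketch). Higher scores mean the
    document's word order looks less like the reference text.
    """
    doc = Counter(bigrams(doc_tokens))
    ref = Counter(bigrams(ref_tokens))
    vocab = set(doc) | set(ref)
    doc_total = sum(doc.values())
    ref_total = sum(ref.values()) + len(vocab)
    score = 0.0
    for gram, count in doc.items():
        p = count / doc_total
        q = (ref[gram] + 1) / ref_total
        score += p * math.log(p / q)
    return score


reference = "the cat sat on the mat and the dog sat on the rug".split()
natural = "the cat sat on the rug".split()
salad = "rug the on cat mat sat".split()
```

With these toy inputs, the shuffled `salad` text scores higher than the `natural` text, since none of its bigrams occur in the reference. A real detector would need a large reference corpus and a threshold chosen on labeled data.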

CiNii
Resizable-LSH : An Approximate Similarity Search Algorithm for Resizable Range-Search

YAMAZAKI Kunihiro, NAKAMURA Tomohiro, FUNAHASHI Takuya, YAMANA Hayato

研究報告データベースシステム（DBS） 148 ( 22 ) V1 - V8 2009.07

We introduce an efficient algorithm named "Resizable-LSH" for approximate similarity search, which enables the search range to be resized flexibly. Locality-Sensitive Hashing (LSH) has recently drawn attention as an efficient algorithm for approximate nearest neighbor search. LSH adopts hash functions that collide with high probability if two vectors are close, so it finds approximate nearest neighbors quickly even if the dataset is high-dimensional. However, LSH must generate its hash tables in advance, which makes resizing the search range expensive because the hash tables must be regenerated whenever the search range changes. To solve this problem, our proposed Resizable-LSH retrieves not only data whose hash value equals that of the query but also data with nearby hash values, thereby achieving a resizable range search. Our experiments show that Resizable-LSH works about 1,000 times faster than LSH that rebuilds its hash tables for each threshold, with almost the same quality.

CiNii
Resizable-LSH : An Approximate Similarity Search Algorithm for Resizable Range-Search

YAMAZAKI Kunihiro, NAKAMURA Tomohiro, FUNAHASHI Takuya, YAMANA Hayato

情報処理学会研究報告. 情報学基礎研究会報告 95 ( 22 ) 1 - 8 2009.07

We introduce an efficient algorithm named "Resizable-LSH" for approximate similarity search, which enables the search range to be resized flexibly. Locality-Sensitive Hashing (LSH) has recently drawn attention as an efficient algorithm for approximate nearest neighbor search. LSH adopts hash functions that collide with high probability if two vectors are close, so it finds approximate nearest neighbors quickly even if the dataset is high-dimensional. However, LSH must generate its hash tables in advance, which makes resizing the search range expensive because the hash tables must be regenerated whenever the search range changes. To solve this problem, our proposed Resizable-LSH retrieves not only data whose hash value equals that of the query but also data with nearby hash values, thereby achieving a resizable range search. Our experiments show that Resizable-LSH works about 1,000 times faster than LSH that rebuilds its hash tables for each threshold, with almost the same quality.

CiNii
Proposal of Word Salad Spam Detection Method using N-gram and Interrupted Collocations

MORIMOTO Kosuke, KATASE Hiroaki, YAMANA Hayato

情報処理学会研究報告. データベース・システム研究会報告 148 ( 24 ) 1 - 8 2009.07

Information on the Internet has become important because of the explosive growth in Web pages. However, spam pages have also increased explosively, lowering the reliability of information obtained from the Internet. Although there are many spamming methods, in this article we focus on "word salad", which generates text automatically, and we propose an effective method for detecting it. We detect word salad with a score based on Kullback-Leibler divergence calculated with n-grams and interrupted collocations. In our experiment, our method improved the F-value by 0.18 points over the existing method.

CiNii
Reliability Verification of Search Engines' Hit Count using Multi Query

FUNAHASHI Takuya, SONE Hiroaki, YAMANA Hayato

IEICE technical report. Data engineering 109 ( 153 ) 19 - 24 2009.07

A number of studies have used search engines' hit counts. The goal of these studies is to build applications for translation support or natural language processing support, and they assume that the hit count is reliable. We have been working to verify the reliability of search engines' hit counts. In the past, we verified hit counts using only single-keyword queries. The contribution of this paper is to verify hit counts using multi-keyword queries.

CiNii J-GLOBAL
Efficient duplicated URL detection for web crawlers

久保田展行, 上田高徳, 山名早人

DBSJ journal 8 ( 1 ) 83 - 88 2009.06

CiNii J-GLOBAL
Improvement in Speed and Accuracy of Multiple Sequence Alignment Program PRIME (IPSJ Transactions on Bioinformatics Vol.1)

2008 ( 2 ) 2 - 12 2009.04

CiNii
Extending ALT algorithm to use multiple landmarks

MATSUNAGA TAKU, HIRATE YU, YAMANA HAYATO

IPSJ SIG Notes 2009 ( 9 ) 75 - 80 2009.01

Recently, the ALT algorithm was proposed as a speed-up algorithm for computing shortest paths in general graph structures. The ALT algorithm offers a landmark-based heuristic function to estimate distances in A* search. Before computing shortest paths, the ALT algorithm computes the distances between all nodes and the landmarks, and stores them in memory or storage space prepared in advance. However, as the number of landmarks increases, the required space increases linearly. To solve this problem, in this paper, we propose a novel heuristic function for computing shortest paths in general graph structures...
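The baseline ALT idea (A* search, landmarks, triangle inequality) can be sketched in Python as follows. This shows only the standard landmark heuristic, not the novel heuristic function proposed in the paper; the graph representation, the function names, and the assumption of a connected undirected graph with non-negative weights are all choices made for this example.

```python
import heapq


def dijkstra(graph, source):
    """Distances from source to every reachable node (the preprocessing step)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def alt_shortest_path(graph, landmark_dists, source, target):
    """A* search using the landmark lower bound |d(L, t) - d(L, v)|."""

    def h(v):
        # Triangle inequality: each landmark gives a lower bound on d(v, t).
        return max((abs(d[target] - d[v]) for d in landmark_dists), default=0)

    best = {source: 0}
    heap = [(h(source), source)]
    while heap:
        f, u = heapq.heappop(heap)
        if u == target:
            return best[u]
        if f > best[u] + h(u):
            continue  # stale queue entry
        for v, w in graph[u]:
            ng = best[u] + w
            if ng < best.get(v, float("inf")):
                best[v] = ng
                heapq.heappush(heap, (ng + h(v), v))
    return None


# Small undirected example graph: node -> list of (neighbor, weight).
graph = {
    "a": [("b", 1), ("c", 5)],
    "b": [("a", 1), ("c", 2)],
    "c": [("a", 5), ("b", 2), ("d", 1)],
    "d": [("c", 1)],
}
landmarks = [dijkstra(graph, "a"), dijkstra(graph, "d")]
```

Each landmark stores one distance per node, so the table in `landmarks` grows linearly with the number of landmarks, which is the space cost the abstract refers to.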

CiNii J-GLOBAL
Exploiting idle CPU cores to improve file access performance

Takanori Ueda, Yu Hirate, Hayato Yamana

Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, ICUIMC'09 CD-ROM 529 - 535 2009

Many-core CPUs require many parallel computation tasks to reach their full potential because CPU cores become idle if they do not have enough computation tasks. How best to utilize a number of cores in many-core CPUs should be examined. In this paper, we propose exploitation of idle cores for improving file access performance. Idle cores are used to extract file access patterns from access logs and the extracted patterns are used to improve file cache efficiency by reordering the LRU (Least Recently Used) list based on the extracted patterns. Data mining techniques are used to extract access patterns to reduce computation overhead. Our method was evaluated by simulation and also implemented on Linux kernel 2.6.26 as a prototype system. In the simulation experiment, our method improved the cache-hit ratio up to 1.09% on DBT-2 (TPC-C) trace logs. Our prototype implementation on Linux improves DBT-2 performance up to 5.24% on a real machine. Copyright 2009 ACM.

DOI
Analysis of Rank Fluctuation Patterns of Sites Ranked by Commercial Search Engines

吉田泰明, 平手勇宇, 山名早人

DEIM2009 2009
A Study of a Correction Method for Search Hit Counts Using Clustering

舟橋卓也, 平手勇宇, 山名早人

DEIM2009 2009
A Method for Reducing the Number of Extracted Frequent Itemsets by Using Core Itemsets

松崎勝彦, 平手勇宇, 山名早人

DEIM2009 2009
A System for Estimating Concepts from Impression Words

永井洋平, 黒木さやか, 山名早人

信学技報(Webインテリジェンスとインタラクション研究会) 2009
Evaluation of a Web Community Extraction Method Using Propagation of Relevance Between Web Pages

飯村卓也, 平手勇宇, 山名早人

DEIM2009 2009
Reliability Verification of Search Hit Counts for Multi-Keyword Queries

山名早人

信学技報 Vol.109 ( No.153 ) 19 - 24 2009
Extracting the Reasons Topic Words Appear in Blogs and Recommending Detailed Articles on the Topics

中島泰, 黒木さやか, 櫻井宏樹, 山名早人

第15回Webインテリジェンスとインタラクション研究会 2009
Proposal of Word Salad Spam Detection Method using N-gram and Interrupted Collocations

森本浩介, 片瀬弘晶, 山名早人

情報処理学会研究報告(CD-ROM) Vol.2009-DBS-148 ( No.24 ) 1 - 8 2009

J-GLOBAL
Prediction of GPCR ligands by 2-way prediction method

百石弘澄, 杉原稔, 諏訪牧子, 加藤毅, 山名早人, 藤渕航

情報処理学会研究報告(CD-ROM) Vol.2009-BIO-18 ( No.2 ) 1 - 8 2009

J-GLOBAL
Feature Analysis of Wikipedia Article View Counts

曽根広哲, 山名早人

Wikimedia Conference Japan 2009 SIG-SWO-A901-03 2009
QueueLinker: Distributed Producer/Consumer Queue Framework

上田高徳, 片瀬弘晶, 森本浩介, 打田研二, 山名早人

WebDB Forum2009 2009
A Lock-free GCLOCK Page Replacement Algorithm

油井誠, 宮崎純, 植村俊亮, 加藤博一, 山名早人

情報処理学会論文誌トランザクション(CD-ROM) Vol.2 ( No.4 ) 32 - 48 2009

J-GLOBAL
Resizable-LSH: An Approximate Similarity Search Algorithm for Resizable Range-Search

山崎邦弘, 中村智浩, 舟橋卓也, 山名早人

情報処理学会研究報告(CD-ROM) 2009 ( 2 ) 2009

J-GLOBAL
A Scalable Monitoring System for Distributed Environments

Sayaka Akioka, Junichi Ikeda, Takanori Ueda, Yuki Ohno, Midori Sugaya, Yu Hirate, Jiro Katto, Shigeki Goto, Yoichi Muraoka, Hayato Yamana, Tatsuo Nakajima

FIRST INTERNATIONAL WORKSHOP ON SOFTWARE TECHNOLOGIES FOR FUTURE DEPENDABLE DISTRIBUTED SYSTEMS, PROCEEDINGS 32 - + 2009

The total amount of information to process or analyze is rising sharply with the rapid spread of computers and networks. Our project, «Highly scalable monitoring architecture for information explosion», develops a monitoring system that allows observing systems, merging the system logs, and discovering intelligence to share. More concretely, the project builds a total system that maintains, optimizes, and protects itself autonomically. This paper reports the outcomes of the project after the first half of the development period. The rest of the paper is organized as follows. Section 2 describes the concept and details of the monitoring system on a single node, and Section 3 addresses the aggregation of the collected information in distributed environments. Section 4 and Section 5 introduce applications of the monitoring systems. Section 6 summarizes the project and mentions future plans. © 2009 IEEE.

DOI
Profiling Node Conditions of Distributed System with Sequential Pattern Mining

Yu Hirate, Hayato Yamana

FIRST INTERNATIONAL WORKSHOP ON SOFTWARE TECHNOLOGIES FOR FUTURE DEPENDABLE DISTRIBUTED SYSTEMS, PROCEEDINGS 43 - + 2009

Recently, with the widespread use of distributed systems, distributed monitoring systems are needed to manage such systems. However, since the monitoring architecture of a distributed system faces a huge amount of log data coming from local computing nodes, information aggregation is a fundamental scheme for monitoring distributed systems. In this paper, we present a novel approach for extracting computing-node condition profiles by using sequential pattern mining, one of the data mining techniques. The extracted profiles represent node condition patterns that occur frequently in many computing nodes. Thus, the extracted profiles enable the conditions of a distributed system to be summarized as small and easily understandable information.

DOI
The Challenge of Eliminating Storage Bottlenecks in Distributed Systems

Takanori Ueda, Yu Hirate, Hayato Yamana

FIRST INTERNATIONAL WORKSHOP ON SOFTWARE TECHNOLOGIES FOR FUTURE DEPENDABLE DISTRIBUTED SYSTEMS, PROCEEDINGS 49 - 53 2009

One of the most difficult problems in distributed systems is load-balancing. Even if we take care of load-balancing, heavily-loaded nodes often occur while there are still lightly-loaded nodes that have idle memory and idle CPU power. Our idea is to exploit this idle memory and idle CPU power to improve the storage performance of heavily-loaded nodes. Idle memory can be used for caching file data and idle CPU power can be used for extracting file access patterns from file access logs. File access patterns are valuable sources for optimizing a cache strategy. Our project goal is to improve the overall performance of distributed systems by improving storage access performance. This paper gives an overview of this project idea and reports the current status of the project. In addition, we show benchmark results from our prototype cache extension system, which is implemented in Linux Kernel 2.6. The DBT-3 (TPC-H) benchmark results show that our system can increase computer speed by a factor of 6.68.

DOI
Implementing and Evaluating Graph Engine for Large Scale Graphs

MATSUNAGA Taku, KATASE Hiroaki, UEDA Takanori, KUBOTA Nobuyuki, MORIMOTO Kosuke, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 108 ( 329 ) 43 - 43 2008.11

CiNii
Search Engines' Trustworthiness(<Special Issue>Trust Assessment of Web Information)

Yamana Hayato

Journal of Japanese Society for Artificial Intelligence 23 ( 6 ) 752 - 759 2008.11

CiNii J-GLOBAL
Improvement in speed and accuracy of multiple sequence alignment program prime

Shinsuke Yamada, Osamu Gotoh, Hayato Yamana

IPSJ Transactions on Bioinformatics 1 2 - 12 2008.11

DOI CiNii
Dynamic I/O Optimization with Access Pattern Mining at OS Level

UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 2008 ( 88 ) 73 - 78 2008.09

Many-core CPUs improve parallel performance but also raise the problem of a storage performance bottleneck. I/O optimization should be performed at the operating system level because various applications are executed in parallel in many-core CPU environments and I/O optimization requires cross-cutting knowledge about applications. We propose a new method which uses disk access patterns to improve the efficiency of the disk cache replacement algorithm. Our method is now implemented on Linux 2.6.26 and extracts access patterns from file access logs of applications. The experimental results show our method impro...

CiNii J-GLOBAL
Web Community Extraction Method with Web Pages' Relevance Forwarding

IIMURA Takuya, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 2008 ( 88 ) 133 - 138 2008.09

To find information in a large collection of Web pages, several methods for extracting Web communities have been proposed. Past studies succeeded in improving the precision score by applying a strict rule for whether or not to include a certain Web page in a Web community. However, the recall score might worsen because Web pages that should be included in the Web community are left out. In this paper, we propose a Web community extraction method that can improve the recall score without decreasing the precision score. The method adds Web pages that have many links from/to the Web pages in a sam...

CiNii J-GLOBAL
Reliability Verification of Search Engines' Hit Count

FUNAHASHI Takuya, UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 2008 ( 88 ) 139 - 144 2008.09

A number of studies have used search engines' hit counts. The goal of these studies is to build applications for translation support or natural language processing support, and they assume that the hit count is reliable. However, none of the studies has verified the reliability of search engines' hit counts. If the hit count is unreliable, studies using it also become unreliable. The purpose of this paper is to verify the reliability of search engines' hit counts. In this experiment, we used Search APIs provided by Google, Yahoo! Japan and Live Search. Furthermore, we r...

CiNii J-GLOBAL
Analyzing geographical location and number of back-links of web servers all over the world

平手勇宇, 片瀬弘晶, 山名早人

Journal of the DBSJ Vol.7 ( No.2 ) 1 - 6 2008.09

CiNii J-GLOBAL
Message from the MAW 2008 co-chairs

Takahiro Hara, Yanchun Zhang, William K. Cheung, Shengrui Wang, Hayato Yamana, Kin Fun Li, Laurence T. Yang

Proceedings - International Conference on Advanced Information Networking and Applications, AINA 57 2008.09

DOI
Influence of Wikipedia on Search Engine Rankings

SONE Hiroaki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. DE, データ工学 Vol.108 ( No.93 ) 89 - 94 2008.06

Search engines are necessary for finding information on the Internet. One survey reports that users regard information from search engine results as being as believable as information from television. That is, we can understand part of the influence that a certain site has on society by examining search engine rankings. In this paper, we conducted experiments to examine the influence of Wikipedia, which has become synonymous with Internet encyclopedias. We collected the ranking positions of the articles in the Japanese version of Wikipedia by using search engines' APIs. Among all articles in the Japanese version of Wikipedia, about 90% were ranked in the top 10 by Yahoo! JAPAN and Google, and about 70% were ranked in the top 10 by MSN. In the case of Yahoo! JAPAN and MSN, newly created Wikipedia articles tend to appear in the top 10 at first and keep their high rankings. Our experiments confirmed the influence of Wikipedia on search engine rankings.

CiNii J-GLOBAL
OS Level I/O Optimization in the Many-Core Era

UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes Vol.2008 ( No.56 ) 133 - 133 2008.06

Many-core CPUs with a dozen or more cores in one package will definitely appear in the near future. In many-core environments, storage systems will become bottlenecks because many applications will run in parallel and random access to storage will increase. To address this problem, we try to ease storage bottlenecks in software. Specifically, we aim to implement a novel disk cache technique that exploits the file access patterns of applications on the Linux kernel and evaluate it on a real system. In this talk, we present an overview of our research and related work, then show our latest results and our future research direction.

CiNii
Temporal Clustering of Internet News Articles with Excluding Single Articles

NAKAMURA Tomohiro, HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. DE, データ工学 Vol.7 ( No.2 ) 7 - 12 2008.06

CiNii J-GLOBAL
Influence of Wikipedia on Search Engine Rankings

SONE Hiroaki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 108 ( 94 ) 89 - 94 2008.06

Search engines are necessary for finding information on the Internet. One survey reports that users regard information from search engine results as being as believable as information from television. That is, we can understand part of the influence that a certain site has on society by examining search engine rankings. In this paper, we conducted experiments to examine the influence of Wikipedia, which has become synonymous with Internet encyclopedias. We collected the ranking positions of the articles in the Japanese version of Wikipedia by using search engines' APIs. Among all articles in the Japanese version of Wikipedia, about 90% were ranked in the top 10 by Yahoo! JAPAN and Google, and about 70% were ranked in the top 10 by MSN. In the case of Yahoo! JAPAN and MSN, newly created Wikipedia articles tend to appear in the top 10 at first and keep their high rankings. Our experiments confirmed the influence of Wikipedia on search engine rankings.

CiNii
Temporal Clustering of Internet News Articles with Excluding Single Articles

NAKAMURA Tomohiro, HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 108 ( 94 ) 59 - 64 2008.06

Clustering Internet news articles makes it possible to detect various useful information, such as related articles and the latest topic words. This area has been widely researched since the TDT project. Conventional clustering methods have difficulty detecting a single article as its own cluster, even though many single articles exist. In this paper, we propose a method for clustering news articles that excludes single articles in advance by using proper noun information, topographic information and other characteristics that differ between single and non-single articles. In the evaluation, we use half a year of Japanese news articles. Compared to the Single-Link Method, which by itself has difficulty judging whether an article is single, our proposed method improves precision by 10.2% and reduces the computation time to approximately a third.

CiNii
Geographical Location and Number of Back-Links of Web Servers All Over the World

HIRATE Yu, KATASE Hiroaki, YAMANA Hayato

IPSJ SIG Notes 2008 ( 56 ) 25 - 32 2008.06

According to our investigation in Oct. 2005, the number of Web pages all over the world is estimated at 53.7 billion. We have investigated the TLD distribution and language distribution of Web pages based on a dataset of 10.7 billion Web pages. In this paper, as part of our series of Web statistics investigations, we analyzed three kinds of distribution based on the 10.7 billion Web page dataset: the geographical location of Web servers, the number of virtual hosts per Web server, and the number of back-links, i.e. the in-degree, per Web server. Our results show that (1) about 95.5% of Web servers are located in North America, Europe, and Asia, (2) hosts located in Latin America and Eastern Europe have a large number of virtual hosts, and (3) the distribution between the in-degree and the number of Web servers follows a power law.

CiNii J-GLOBAL
Measuring Editor's Trustworthiness in Wikipedia by Using Edit History without Depending on the Edit Frequency

SAKURAI Hiroki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

電子情報通信学会技術研究報告. DE, データ工学 108 ( 93 ) 115 - 120 2008.06

The free Internet encyclopedia Wikipedia contains 490,000 Japanese articles, and its volume of information is huge and useful. However, the trustworthiness of the articles' contents is uncertain because anyone can edit them easily. In other words, since we cannot know exactly who edits the contents, it is difficult to measure the trustworthiness of the articles' contents. To address this problem, a method using the edit history has been proposed for measuring the trustworthiness of articles. However, it is unsuitable for articles and editors with a low edit frequency. Therefore, in this paper, we propose a method for measuring editors' trustworthiness that does not depend on the edit frequency. The proposed method is based on the ratio of an editor's edits that remain in the latest version of the contents. Our evaluation shows that the proposed method rates highly reliable editors high and unreliable editors low, regardless of edit frequency.

CiNii J-GLOBAL
Geographical Location and Number of Back-Links of Web Servers All Over the World

HIRATE Yu, KATASE Hiroaki, YAMANA Hayato

IPSJ SIG Notes 2008 ( 56 ) 25 - 32 2008.06

　View Summary

According to our investigation in Oct. 2005, the number of Web pages worldwide is estimated at 53.7 billion. We have previously investigated the TLD distribution and language distribution of Web pages based on a dataset of 10.7 billion Web pages. In this paper, as part of our series of Web statistics investigations, we analyze three kinds of distribution based on the 10.7 billion Web page dataset: the geographical location of Web servers, the number of virtual hosts per Web server, and the number of back links (i.e., the in-degree) per Web server. Our results show (1) about 95.5% of...

CiNii
OS Level I/O Optimization in the Many-Core Era

UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 2008 ( 56 ) 133 - 133 2008.06

　View Summary

Many-core CPUs with a dozen or more cores on a single chip will certainly appear in the near future. In many-core environments, many applications run in parallel, so random accesses, which HDDs handle particularly poorly, become more frequent and storage becomes an increasingly severe bottleneck. We therefore aim to ease the storage bottleneck in software. Specifically, one of our goals is to implement in Linux a disk cache mechanism that exploits the access patterns of applications and to evaluate it on a real system. In this workshop talk, we outline our research to date and related work, present our latest results, and describe our future research direction.

CiNii J-GLOBAL
Temporal Clustering of Internet News Articles with Excluding Single Articles

NAKAMURA Tomohiro, HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 108 ( 93 ) 59 - 64 2008.06

　View Summary

Clustering of Internet news articles makes it possible to detect various useful information, for example, related articles and the latest topic words. This area has been widely researched since the TDT project. Conventional clustering methods have difficulty detecting a single article as a single cluster even though many single articles exist. In this paper, we propose a method to cluster news articles that excludes single articles in advance by using proper noun information, topographic information and other characteristics that differ between single and non-single articles. In evaluation, we use half a y...

CiNii
Influence of Wikipedia on Search Engine Rankings

SONE Hiroaki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 108 ( 93 ) 89 - 94 2008.06

　View Summary

Search engines are necessary to find information on the Internet. One survey reports that users consider information from search engine results to be as believable as information from television. Thus, by examining search engine rankings, we can understand part of the influence that a given site has on society. In this paper, we conducted experiments to examine the influence of Wikipedia, which has become synonymous with the encyclopedia on the Internet. We collected ranking positions of the articles in the Japanese version of Wikipedia by using sea...

CiNii
Gathering Over 10 Billion of Web Pages and its Applications

YAMANA Hayato

IEICE technical report. Data engineering 108 ( 93 ) 95 - 95 2008.06

　View Summary

The number of Web pages distributed from Web servers is estimated at about 53.7 billion as of Oct. 2005. We gathered 14,456,201,906 Web pages from 5,548 Web servers between Jan. 2004 and July 2006. This was conducted as part of the e-Society project, one of the leading projects of MEXT (Ministry of Education, Culture, Sports, Science and Technology). Speeding up Web crawling conflicts with Web-site-friendly crawling; however, both are indispensable for gathering Web pages. In the project, we have studied and proposed a dynamic delay adjustment scheme for accessing Web servers to prevent...

CiNii
Measuring Editor's Trustworthiness in Wikipedia by Using Edit History without Depending on the Edit Frequency

SAKURAI Hiroki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 108 ( 93 ) 115 - 120 2008.06

　View Summary

Wikipedia, the free encyclopedia on the Internet, contains 490,000 Japanese articles, and its large volume of information is useful. However, the trustworthiness of article contents is uncertain because anyone can edit them easily. In other words, because we cannot tell exactly who edited the contents, it is difficult to measure the trustworthiness of the articles. To address this problem, a method that uses the edit history has been proposed for measuring the trustworthiness of articles. However, it is inapposite compared with the article and the editor with a little freque...

CiNii
Temporal Clustering of Internet News Articles with Excluding Single Articles

NAKAMURA Tomohiro, HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

Technical report of IEICE. PRMU 108 ( 94 ) 59 - 64 2008.06

　View Summary

Clustering of Internet news articles makes it possible to detect various useful information, for example, related articles and the latest topic words. This area has been widely researched since the TDT project. Conventional clustering methods have difficulty detecting a single article as a single cluster even though many single articles exist. In this paper, we propose a method to cluster news articles that excludes single articles in advance by using proper noun information, topographic information and other characteristics that differ between single and non-single articles. In evaluation, we use half a y...

CiNii
Influence of Wikipedia on Search Engine Rankings

SONE Hiroaki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

Technical report of IEICE. PRMU 108 ( 94 ) 89 - 94 2008.06

　View Summary

Search engines are necessary to find information on the Internet. One survey reports that users consider information from search engine results to be as believable as information from television. Thus, by examining search engine rankings, we can understand part of the influence that a given site has on society. In this paper, we conducted experiments to examine the influence of Wikipedia, which has become synonymous with the encyclopedia on the Internet. We collected ranking positions of the articles in the Japanese version of Wikipedia by using sea...

CiNii
Gathering Over 10 Billion of Web Pages and its Applications

YAMANA Hayato

Technical report of IEICE. PRMU 108 ( 94 ) 95 - 95 2008.06

CiNii
Measuring Editor's Trustworthiness in Wikipedia by Using Edit History without Depending on the Edit Frequency

SAKURAI Hiroki, YOSHIDA Yasuaki, HIRATE Yu, YAMANA Hayato

Technical report of IEICE. PRMU 108 ( 94 ) 115 - 120 2008.06

　View Summary

Wikipedia, the free encyclopedia on the Internet, contains 490,000 Japanese articles, and its large volume of information is useful. However, the trustworthiness of article contents is uncertain because anyone can edit them easily. In other words, because we cannot tell exactly who edited the contents, it is difficult to measure the trustworthiness of the articles. To address this problem, a method that uses the edit history has been proposed for measuring the trustworthiness of articles. However, it is inapposite compared with the article and the editor with a little freque...

CiNii
3ZK-10 A System for Finding Shortest Paths Between Web Pages

Matsunaga Taku, Hirate Yu, Yamana Hayato

Proceedings of the National Convention 70 ( 5 ) "5 - 193"-"5-194" 2008.03

　View Summary

According to our investigation in Oct. 2005, the number of Web pages worldwide is estimated at 53.7 billion. We have previously investigated the TLD distribution and language distribution of Web pages based on a dataset of 10.7 billion Web pages. In this paper, as part of our series of Web statistics investigations, we analyze three kinds of distribution based on the 10.7 billion Web page dataset: the geographical location of Web servers, the number of virtual hosts per Web server, and the number of back links (i.e., the in-degree) per Web server. Our results show (1) about 95.5% of...

CiNii
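5L-1 Analysis of TLD and Language Distribution of Web Pages Worldwide (Leading Project e-Society: Web Archive and Web Data Analysis Technology, General Session, Leading Project e-Society)

平手勇宇, 山名早人

Proceedings of the National Convention 70 ( 5 ) "5 - 361"-"5-362" 2008.03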

CiNii
EReM-DiCE: Exploiting Remote Memory for Disk Cache Extension

Takanori UEDA, Yu HIRATE, Hayato YAMANA

Proc. of 1st International Workshop on Storage and I/O Virtualization, Performance, Energy, Evaluation and Dependability (SPEED2008) 2008
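Realization of a Content Delivery System Using the Distributed Meta-P2P Storage "DiMPS"

岡本雄太, 山名早人

DEWS2008 2008
A High-Precision Method for Extracting Expressions That Denote Properties or Parts of Evaluation Targets in Reputation Information

臼渕護, 平手勇宇, 山名早人

14th Annual Meeting of the Association for Natural Language Processing (NLP2008) 2008
Analysis of TLD and Language Distribution of Web Pages Worldwide

平手勇宇, 山名早人

70th IPSJ National Convention 5L-1 ( 5 ) 2008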

J-GLOBAL
Analysis of the Language Distribution of Web Sites Worldwide and the Geographical Location of Links of Web Sites Containing Japanese

童芳

DEWS2008 2008

CiNii
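Collecting and Analyzing Lower-Ranked Results That Cannot Be Obtained from Commercial Search Engine Results

舟橋卓也, 上田高徳, 平手勇宇, 山名早人

DEWS2008 2008
Evaluation of EPCI, a Similar-Text Retrieval System Using Search Engines

田代崇, 上田高徳, 平手勇宇, 山名早人

DEWS2008 2008
Constructing a Reduced Web Link Structure to Speed Up Link Structure Analysis Algorithms

片瀬弘晶, 松永拓, 上田高徳, 田代崇, 平手勇宇, 山名早人

DEWS2008 2008
A Similar Source Code Retrieval System Using Program Code Abstraction

黒木さやか, 上田高徳, 平手勇宇, 山名早人

DEWS2008 2008
A Study of Disk Access Pattern Mining Using System-Call-Level Access Logs

上田高徳, 平手勇宇, 山名早人

DEWS2008 2008
A System for Finding Shortest Paths Between Web Pages

松永拓, 平手勇宇, 山名早人

70th IPSJ National Convention 3ZK-10 ( 5 ) 2008

J-GLOBAL
Online Link Mining Using Shortest-Path Subgraphs Between Web Pages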

松永拓, 平手勇宇, 山名早人

DEWS2008 2008
System for Detecting Auction Fraud Communities in Internet Auctions

Y.Hirate(D3), A.Aiyoshizawa, S.O, Y.Ioku, F.Kido and H.Yamana

Proc. of the 2nd International Conf. on Information Systems, Technology and Management(ICISTM-08) 2008
What's going on in search engine rankings?

Yasuaki Yoshida, Takanori Ueda, Takashi Tashiro, Yu Hirate, Hayato Yamana

2008 22ND INTERNATIONAL WORKSHOPS ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOLS 1-3 1199 - 1204 2008 [Refereed]

　View Summary

Many people use search engines every day to retrieve documents from the Web. Although the social influence of search engine rankings has become significant, ranking algorithms are not disclosed. In this paper, we investigated the rankings of three major search engines by analyzing two kinds of data. One is the weekly ranking snapshots of the top 250 Web pages, which we collected for almost one year by submitting 1,000 pre-selected queries; the other comprises back-linked Web pages gathered by our own Web crawling. As a result, we have confirmed that (1) several top 10 rankings are mutually similar, whereas the Web pages ranked below them are mostly different, (2) ranking transitions have their own characteristics, and (3) each search engine's ranking has its own correlation with the number of back-linked Web pages.
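
One simple way to quantify how similar two rankings are near the top and how they diverge further down is a top-k overlap. The following is a minimal sketch under that assumption; the summary does not state which similarity measure the paper uses, and the function name `top_k_overlap` and the sample URLs are hypothetical.

```python
def top_k_overlap(ranking_a, ranking_b, k):
    """Fraction of URLs shared by the top-k results of two rankings."""
    top_a = set(ranking_a[:k])
    top_b = set(ranking_b[:k])
    return len(top_a & top_b) / k

# Hypothetical result lists from two engines for the same query.
engine_a = ["u1", "u2", "u3", "u4", "u5", "u6"]
engine_b = ["u2", "u1", "u3", "u9", "u8", "u7"]

overlap_top3 = top_k_overlap(engine_a, engine_b, 3)
overlap_top6 = top_k_overlap(engine_a, engine_b, 6)
```

In this toy data the top three results coincide as sets while the overlap halves at depth six, which mirrors the kind of pattern reported in finding (1).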

DOI
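Analysis of Geographical Location and Number of Back-Links of Web Hosts Worldwide

平手勇宇, 片瀬弘晶, 山名早人

IPSJ SIG Technical Reports (DBS) Vol.2008 ( No.56 ) 25 - 32 2008
Identifying the TLD, Language Distribution, and Geographical Location of Web Sites Worldwide

童芳, 平手勇宇, 山名早人

Journal of the Database Society of Japan Vol.7 ( No.1 ) 31 - 36 2008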

J-GLOBAL
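Collecting and Analyzing Lower-Ranked Results That Cannot Be Obtained from Commercial Search Engine Results

舟橋卓也, 上田高徳, 平手勇宇, 山名早人

Journal of the Database Society of Japan Vol.7 ( No.1 ) 37 - 42 2008

J-GLOBAL
Verifying the Reliability of Hit Counts Reported by Commercial Search Engines

舟橋卓也, 上田高徳, 平手勇宇, 山名早人

IPSJ SIG Technical Reports (DBS)/iDB2008 Vol.2008 ( No.88 ) 139 - 144 2008
Constructing a Reduced Web to Speed Up Link Structure Analysis Algorithms

片瀬弘晶, 松永拓, 上田高徳, 田代崇, 平手勇宇, 山名早人

Journal of the Database Society of Japan Vol.7 ( No.1 ) 245 - 250 2008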

J-GLOBAL
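Disk Access Pattern Mining Using System-Call-Level Access Logs

上田高徳, 平手勇宇, 山名早人

Journal of the Database Society of Japan Vol.7 ( No.1 ) 145 - 150 2008

J-GLOBAL
Dynamic OS-Level I/O Optimization Through Access Pattern Mining

上田高徳, 平手勇宇, 山名早人

IPSJ SIG Technical Reports (DBS)/iDB2008 Vol.2008 ( No.88 ) 73 - 78 2008
A Web Community Extraction Method Using Propagation of Relevance Between Web Pages

飯村卓也, 平手勇宇, 山名早人

IPSJ SIG Technical Reports (DBS)/iDB2008 Vol.2008 ( No.88 ) 133 - 138 2008
Gathering Over 10 Billion Web Pages and Its Applications

山名早人

IEICE Technical Report (Data Engineering) Vol.108 ( No.93 ) 95 2008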
Toward the Analysis of over 10 billion Web pages

Hayato YAMANA

Proc. of the 4th Korea-Japan Int'l Database Workshop 2008(KJDB 2008) 239 - 255 2008
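Extracting Link Spam Communities Using Large-Scale Web Link Data

平手勇宇, 山名早人

Rakuten R&D Symposium 2008 2008
Reliability of Search Engines

山名早人

Journal of the Japanese Society for Artificial Intelligence Vol.23 ( No.6 ) 752 - 759 2008
The Challenge of Collecting and Analyzing 10 Billion Web Pages

村岡洋一, 山名早人, 松井くにお, 橋本三奈子, 赤羽匡子, 萩原純一

Joho Shori (IPSJ Magazine) Vol.49 ( No.11 ) 1277 - 1283 2008
Verifying the Reliability of Hit Counts Reported by Commercial Search Engines

舟橋卓也, 上田高徳, 平手勇宇, 山名早人

Journal of the Database Society of Japan Vol.7 ( No.3 ) 31 - 36 2008

J-GLOBAL
Implementation and Evaluation of a Graph Data Processing Engine

松永拓, 片瀬弘晶, 上田高徳, 久保田展行, 森本浩介, 平手勇宇, 山名早人

IEICE Technical Report Vol.108 ( No. 329 ) 43 - 43 2008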
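A Method for Extracting Attribute Expressions of Compound Words from Large-Scale Text

臼渕護, 平手勇宇, 山名早人

Proceedings of the Annual Meeting of the Association for Natural Language Processing 14th 2008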

J-GLOBAL
Web structure in 2005

Yu Hirate, Shin Kato, Hayato Yamana

ALGORITHMS AND MODELS FOR THE WEB-GRAPH 4936 36 - 46 2008

　View Summary

The estimated number of static web pages in Oct 2005 was over 20.3 billion, which was determined by multiplying the average number of pages per web server based on the results of three previous studies, 200 pages, by the estimated number of web servers on the Internet, 101.4 million. However, based on the analysis of 8.5 billion web pages that we crawled by Oct. 2005, we estimate the total number of web pages to be 53.7 billion. This is because the number of dynamic web pages has increased rapidly in recent years. We also analyzed the web structure using 3 billion of the 8.5 billion web pages that we have crawled. Our results indicate that the size of the "CORE," the central component of the bow tie structure, has increased in recent years, especially in the Chinese and Japanese web.
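
The arithmetic behind the two estimates in this summary can be checked directly. The snippet below is only a worked calculation using the figures quoted above; the variable names are illustrative.

```python
pages_per_server = 200             # average from the three previous studies
web_servers = 101_400_000          # estimated Web servers on the Internet
static_estimate = pages_per_server * web_servers

crawl_based_estimate = 53_700_000_000   # estimate from the 8.5 billion page crawl
ratio = crawl_based_estimate / static_estimate
```

The product is 20.28 billion pages, which rounds to the 20.3 billion quoted, and the crawl-based estimate is roughly 2.6 times larger.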

DOI
Optimistic transactional active replication

Hiroshi Horii, Hayato Yamana

Proceedings of the 2nd International Conference on Ubiquitous Information Management and Communication, ICUIMC-2008 94 - 100 2008

　View Summary

Critical database applications require 2-safe replication between at least two sites for disaster-tolerant services. At the same time, they must provide consistent and low-latency results to their clients in normal cases. In this paper, we propose Optimistic Transactional Active Replication (OTAR), which replicates the transaction logs with low latency and provides a consistent view to database applications. The latency of our replication is lower than Passive Replication, and guarantees the serializability of transaction isolation levels that cannot be supported by Active Replication. For our replication, each client sends a transaction request to all replicas and all of the replicas execute the request and optimistically return the result of the transaction to the client. Each replica generates a causality history of the transaction, sent to the client with the result. With the causality histories, the client can make sure that the requested transaction was executed in the same order at all of the replicas and eventually commit it. If the client cannot validate the order, then the client waits for the pessimistic result of the transaction from the replicas. This paper describes the algorithm and its properties. © 2008 ACM.
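
The client-side check described in this summary can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes each replica's reply is a (result, causality history) pair and models a causality history as a tuple of transaction identifiers, which is a hypothetical representation. The function name `validate_optimistic_results` is likewise an assumption.

```python
def validate_optimistic_results(replies):
    """Check optimistic replies from all replicas.

    replies: list of (result, causality_history) pairs, one per replica.
    Returns the agreed result when every replica reports the same result and
    the same causality history; returns None otherwise, in which case the
    client would wait for the pessimistic result.
    """
    if not replies:
        return None
    first_result, first_history = replies[0]
    for result, history in replies[1:]:
        if result != first_result or history != first_history:
            return None
    return first_result

# Both replicas executed t1 then t2: the optimistic result can be accepted.
agreed = validate_optimistic_results([
    ("committed", ("t1", "t2")),
    ("committed", ("t1", "t2")),
])

# The replicas saw different orders: fall back to the pessimistic path.
diverged = validate_optimistic_results([
    ("committed", ("t1", "t2")),
    ("committed", ("t2", "t1")),
])
```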

DOI
Improvement in speed and accuracy of multiple sequence alignment program PRIME

Yamada Shinsuke, Gotoh Osamu, Yamana Hayato

IPSJ SIG technical reports 2007 ( 128 ) 267 - 274 2007.12

　View Summary

Multiple sequence alignments (MSAs) are useful tools in bioinformatics, and many MSA algorithms have been developed. We have developed an MSA program, PRIME, which is one of the most accurate programs. However, PRIME is slower than other leading MSA programs. Therefore, we newly incorporate heuristics into PRIME. The benchmark results indicate that these heuristics contribute to a significant reduction in computational time with only a slight decrease in accuracy. Additionally, we evaluated the effectiveness of an algorithm based on maximal expected accuracy (MEA). Our experiments revealed tha...

CiNii
Improvement in speed and accuracy of multiple sequence alignment program PRIME

Yamada Shinsuke, Gotoh Osamu, Yamana Hayato

IPSJ SIG Notes 2007 ( 128 ) 267 - 274 2007.12

　View Summary

Multiple sequence alignments (MSAs) are useful tools in bioinformatics, and many MSA algorithms have been developed. We have developed an MSA program, PRIME, which is one of the most accurate programs. However, PRIME is slower than other leading MSA programs. Therefore, we newly incorporate heuristics into PRIME. The benchmark results indicate that these heuristics contribute to a significant reduction in computational time with only a slight decrease in accuracy. Additionally, we evaluated the effectiveness of an algorithm based on maximal expected accuracy (MEA). Our experiments revealed tha...

CiNii
Automatic Non-Photorealistic Rendering Based on Adding Freehand Borderlines to Photographs

SAKAMOTO Yuki, YAMANA Hayato

IPSJ SIG Notes 128 1 - 6 2007.08

　View Summary

This paper proposes a new method for automatic non-photorealistic rendering based on adding freehand borderlines to photographs. The proposed method enables various borderline expressions by extracting borderlines from an input image and applying various line patterns drawn with a pencil, a pen, a brush, or a crayon. This results in automatic generation of the picture that the user wants. To extract natural borderlines from a picture, we propose a new method that extracts each borderline as a continuous series, because conventional borderline extraction methods have problems such as splitting borderlines and including short branches. As a result, the proposed method is able to extract natural borderlines, which yields handwriting-like borderline expressions.

CiNii
Automatic Non-Photorealistic Rendering Based on Adding Freehand Borderlines to Photographs

SAKAMOTO Yuki, YAMANA Hayato

IPSJ SIG Notes 2007 ( 84 ) 1 - 6 2007.08

　View Summary

This paper proposes a new method for automatic non-photorealistic rendering based on adding freehand borderlines to photographs. The proposed method enables various borderline expressions by extracting borderlines from an input image and applying various line patterns drawn with a pencil, a pen, a brush, or a crayon. This results in automatic generation of the picture that the user wants. To extract natural borderlines from a picture, we propose a new method that extracts each borderline as a continuous series, because conventional borderline extraction methods have problems such as d...

CiNii J-GLOBAL
Exploiting Remote Memory to Speed-up Random Disk Access

UEDA Takanori, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 2007 ( 79 ) 151 - 156 2007.08

　View Summary

As hard disks tend to be bottleneck devices in current computer architectures, improving hard disk access speed is one of the most effective ways to improve the total performance of computers. However, hardware modification, with or without software modification, is costly. Accordingly, we propose a new acceleration method implemented at the OS kernel level, which requires no modification of hardware or existing applications. Specifically, our method exploits remote memory to extend the local disk cache. We have implemented our method on Linux Kernel 2.6. The PostgreSQL benchmark...

CiNii J-GLOBAL
Detecting Article Errors in English using Search Engines

HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 2007 ( 65 ) 139 - 144 2007.07

　View Summary

Recently, both the need for English and the opportunities to write it have been increasing among non-native English speakers. However, most Japanese people tend to make many errors in English article usage when they write English. In this paper, we propose a method for detecting article errors in English by using search engines. Since search engines index great amounts of text data on Web pages, search-engine-based methods are able to detect errors that conventional corpus-based methods cannot detect. Lapata et al. proposed a method for detecting article errors bas...

CiNii
Quantitative Evaluation and Feature Analysis of Search Engine Rankings

YOSHIDA Yasuaki, UEDA Takanori, TASHIRO Takashi, HIRATE Yu, YAMANA Hayato

IPSJ SIG Notes 2007 ( 65 ) 441 - 446 2007.07

　View Summary

Most people use search engines to retrieve documents on the Web. The social influence of search engine rankings is therefore large; however, the ranking algorithms are not disclosed. In this paper, we investigated the rankings of three major search engines by analyzing ranking data that we collected by submitting 1,000 queries weekly. As a result, we have confirmed that (1) several top-10 rankings of well-known search engines are mutually similar, (2) the ranking transitions differ from one another, and (3) each engine's rankings have correlation with the number o...

CiNii
Detecting Article Errors in English using Search Engines

HIRANO Takayoshi, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 107 ( 131 ) 139 - 144 2007.06

　View Summary

Recently, both the need for English and the opportunities to write it have been increasing among non-native English speakers. However, most Japanese people tend to make many errors in English article usage when they write English. In this paper, we propose a method for detecting article errors in English by using search engines. Since search engines index great amounts of text data on Web pages, search-engine-based methods are able to detect errors that conventional corpus-based methods cannot detect. Lapata et al. proposed a method for detecting article errors bas...

CiNii J-GLOBAL
Quantitative Evaluation and Feature Analysis of Search Engine Rankings

YOSHIDA Yasuaki, UEDA Takanori, TASHIRO Takashi, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 107 ( 131 ) 441 - 446 2007.06

　View Summary

Most people use search engines to retrieve documents on the Web. The social influence of search engine rankings is therefore large; however, the ranking algorithms are not disclosed. In this paper, we investigated the rankings of three major search engines by analyzing ranking data that we collected by submitting 1,000 queries weekly. As a result, we have confirmed that (1) several top-10 rankings of well-known search engines are mutually similar, (2) the ranking transitions differ from one another, and (3) each engine's rankings have correlation with the number o...

CiNii J-GLOBAL
MathBox : Pen-Based Mathematical Expression Input System

糟谷勇児, 山名早人

Technical report of IEICE. PRMU 106 ( 606 ) 1 - 6 2007.03

　View Summary

This paper presents a pen-based mathematical expression input system named MathBox. With MathBox, a user inputs a mathematical expression by repeatedly writing one symbol in a "box". Boxes are shown automatically as the user writes. When a user writes an alphanumeric character, MathBox shows the boxes for the power and the index. When a user writes a fraction line, MathBox shows the boxes for the numerator and the denominator. These interactions enable MathBox to skip recognizing the structure of mathematical expressions. Thus, a user can input mathematical expressions with practical accuracy. The ...

CiNii J-GLOBAL
Image Retrieval with Automatic Labeling with Users' Queries : Growable Search Engine by Users

IGUCHI Shigeru, YAMANA Hayato

Technical report of IEICE. PRMU 106 ( 606 ) 61 - 66 2007.03

　View Summary

This paper proposes an image retrieval system in which images are labeled and search precision improves as the system is used. Image retrieval is categorized into two major types: Text-based Image Retrieval (TBIR), which uses keywords (texts) as queries, and Content-based Image Retrieval (CBIR), which uses images as queries. TBIR returns images related linguistically and semantically, whereas CBIR returns images related in look and feel. Moreover, TBIR requires extra effort for labeling images; CBIR does not need labeling, but it does not take into account effective information of key...

CiNii J-GLOBAL
Disk Access Speed up Using Prefetching Thread

FUKAYAMA Tatsunori, SUGITA Shu, HIRUTA Tomonori, YAMANA Hayato

IPSJ SIG Notes Vol.2007 ( No.17 ) 233 - 238 2007.03

　View Summary

It takes four to ten times more time to get data from a hard disk drive than from DRAM. In this paper, we present a speed-up mechanism that uses a prefetching thread on a multicore system to overcome this relative deterioration of hard disk drive performance. The prefetching thread loads data from the hard disk before the main thread requires that data. When the main thread requires the data, it is found in the disk cache, so it takes no time to get the data. We have confirmed that the prefetching thread reduces the execution time of gzip. The performance of gzip increased by up to 39.2%.
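
The prefetching idea in this summary can be sketched in a few lines. This is an illustration of the general technique only, not the authors' implementation: a helper thread reads the file ahead of the main thread so that the data is likely to be in the operating system's disk cache when the main thread asks for it. The function names, chunk sizes, and the temporary test file are assumptions, and no speed-up is claimed for this toy example.

```python
import os
import tempfile
import threading

def prefetch(path, chunk_size=1 << 20):
    """Read the file once and discard the data so it lands in the OS disk cache."""
    with open(path, "rb") as f:
        while f.read(chunk_size):
            pass

def process_with_prefetch(path, chunk_size=1 << 16):
    """Start a prefetching thread, then consume the file in the main thread."""
    helper = threading.Thread(target=prefetch, args=(path,), daemon=True)
    helper.start()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)  # stand-in for real work such as compression
    helper.join()
    return total

# Small temporary file used only to exercise the sketch.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 300_000)
bytes_seen = process_with_prefetch(tmp.name)
os.remove(tmp.name)
```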

CiNii J-GLOBAL
A Speed-Up Method for Shell Scripts on Multi-Core and SMT Processors

SUGITA SHU, FUKAYAMA TATSUNORI, HIRUTA TOMONORI, TOUNAKA NOBUAKI, YAMANA HAYATO

IPSJ SIG Notes 2007 ( 17 ) 73 - 78 2007.03

　View Summary

The purpose of this study is to show the effectiveness of shell script execution on multi-core and/or SMT (Simultaneous Multi-Threading) processors. Recently, multi-core processors and SMT techniques have become popular even at home and in business. However, using programs or compilers that do not consider parallelism does not yield the benefits of multi-core and multi-threading. Programmers have to write parallel programs to obtain the benefits. Therefore, automatic parallelization techniques have been actively studied. This paper proposes an automatic parallelization scheme for shell script programs on multi-core and/or SMT processors. In our experiment, we confirmed that the automatically parallelized shell script program is 1.4 to 1.8 times faster than the original shell script program.

CiNii J-GLOBAL
Disk Access Speed up Using Prefetching Thread

Fukayama Tatsunori, Sugita Shu, Hiruta Tomonori, Yamana Hayato

IPSJ SIG Notes 2007 ( 17 ) 233 - 238 2007.03

　View Summary

It takes four to ten times more time to get data from a hard disk drive than from DRAM. In this paper, we present a speed-up mechanism that uses a prefetching thread on a multicore system to overcome this relative deterioration of hard disk drive performance. The prefetching thread loads data from the hard disk before the main thread requires that data. When the main thread requires the data, it is found in the disk cache, so it takes no time to get the data. We have confirmed that the prefetching thread reduces the execution time of gzip. The performance of gzip increased by up to 39.2%.

CiNii
MathBox: Interactive Pen-Based Interface for Inputting Mathematical Expressions

Yuji Kasuya, Hayato Yamana

2007 INTERNATIONAL CONFERENCE ON INTELLIGENT USER INTERFACES 274 - 277 2007

　View Summary

Inputting mathematical expressions with a mouse and a keyboard is a troublesome task. Thus, a number of mathematical expression recognition systems capable of recognizing handwritten mathematical expressions to input them into computers have been proposed. Even with these systems, however, structure recognition of mathematical expressions is still difficult. This paper presents MathBox, a new pen-based interface for inputting mathematical expressions into computers. MathBox interactively shows "boxes" in which the user can write one symbol. The boxes are shown along with the user's writing. For example, when the user writes 'x,' the boxes for a power and an index of 'x' and for the next symbol are shown. When the user inputs a fraction line, boxes for the numerator, denominator, and the next symbol are shown. MathBox skips recognizing the structures of expressions, which enables users to write mathematical expressions with practical accuracy.

DOI
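A Method for Detecting Altered Images Using Two-Stage Similar Image Retrieval

馬越健治, 糟谷勇児, 山名早人

DEWS2007 L1-3 2007
Extracting Investment Indicators from Economic Time-Series Data

柳井佳孝, 山名早人

DEWS2007 E9-4 2007
Performance Evaluation of Using Machines on the Network as a Disk Cache

上田高徳, 平手勇宇, 山名早人

DEWS2007 E7-9 2007
Blog Community Extraction and Opinion Leader Discovery Based on Keyword Occurrence

永拓, 平手勇宇, 山名早人

DEWS2007 C3-7 2007
Research Trends in Ranking Bias of Web Search Engines

平手勇宇

Data Engineering Workshop DEWS2007 C7-7 2007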

CiNii
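MathBox: A Handwritten Mathematical Expression Input System

糟谷勇児, 山名早人

IEICE Technical Report PRMU 2007
Image Retrieval Using Keyword Labeling of Images by User Queries: A Search Engine That Gets Smarter with Use

井口茂, 山名早人

IEICE Technical Report PRMU 2007
A Speed-Up Method for Shell Scripts on Multi-Core Processors

山名早人

IPSJ SIG Technical Reports 2007・17 Vol.2007 ( No.17 ) 73 - 78 2007
Improvement of a Method for Extracting Conserved Regions in Alignments Based on Protein Three-Dimensional Structure

山田真介, 山名早人, 野口保

7th Annual Meeting of the Protein Science Society of Japan 7th 2007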

J-GLOBAL
EPCI: Extracting potentially copyright infringement texts from the web

Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu Hirate, Hayato Yamana

16th International World Wide Web Conference, WWW2007 1151 - 1152 2007

　View Summary

In this paper, we propose a new system, called EPCI, that extracts texts potentially infringing copyright from the Web. EPCI extracts them in the following way: (1) generating a set of queries based on a given copyright-reserved seed text, (2) submitting every query to a search engine API, (3) gathering the search result Web pages from the highest rank downward until the similarity between the given seed text and the search result pages falls below a given threshold value, and (4) merging all the gathered pages, then re-ranking them in order of similarity. Our experimental results using 40 seed texts show that EPCI is able to extract 132 potentially copyright-infringing Web pages per given seed text with 94% precision on average.
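
The four steps listed in this summary can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the EPCI implementation: queries are taken to be word n-grams of the seed text, similarity is taken to be word-set Jaccard similarity, and the search engine API is replaced by a hypothetical in-memory function. All names (`jaccard`, `make_queries`, `epci_sketch`, `fake_search`, `PAGES`) are illustrative.

```python
def jaccard(a, b):
    """Word-set Jaccard similarity (an assumed stand-in for the paper's measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def make_queries(seed_text, n=3):
    """Step 1: word n-grams of the seed text serve as queries (assumption)."""
    words = seed_text.split()
    return [" ".join(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))]

def epci_sketch(seed_text, search, threshold):
    """Steps 2-4: query, gather until similarity drops, merge and re-rank."""
    gathered = {}
    for query in make_queries(seed_text):
        for url, page_text in search(query):      # results in rank order
            score = jaccard(seed_text, page_text)
            if score < threshold:
                break                             # stop at the first dissimilar page
            gathered[url] = score                 # dict merge removes duplicates
    return sorted(gathered.items(), key=lambda item: item[1], reverse=True)

# Hypothetical in-memory "search engine" used only for this example.
PAGES = {
    "http://example.com/copy": "the quick brown fox jumps over the lazy dog",
    "http://example.com/partial": "the quick brown fox sleeps",
    "http://example.com/other": "completely unrelated text",
}

def fake_search(query):
    return [(url, text) for url, text in PAGES.items() if query in text]

ranked = epci_sketch("the quick brown fox jumps over the lazy dog", fake_search, 0.3)
```

Keeping the gathered pages in a dictionary keyed by URL handles the merge in step (4), since a page returned by several queries is stored only once before the final sort.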

DOI
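Quantitative Evaluation and Feature Analysis of Commercial Search Engine Rankings

吉田泰明, 上田高徳, 田代崇, 平手勇宇, 山名早人

IPSJ SIG Technical Reports (DBS), Vol.2007 No.65 441 - 446 2007
Detecting Article Errors in English Using Search Engines

平野孝佳, 平手勇宇, 山名早人

IPSJ SIG Technical Reports (DBS), Vol.2007 139 - 144 2007
A Method for Speeding Up Random Disk Access Using Remote Memory

上田高徳, 平手勇宇, 山名早人

IPSJ SIG Technical Reports (ARC), Vol.2007 No.79 151 - 156 2007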
multiple sequence alignment program based on group-to-group sequence alignment algorithm with piecewise linear gap cost

Shinsuke Yamada, Osamu Gotoh, Hayato Yamana

ISMB/ECCB2007, Austria Center Vienna 2007
A Support System for Analyzing Commercial Search Engine Rankings

吉田泰明, 舟橋卓也, 片瀬弘晶, 上田高徳, 平手勇宇, 山名早人

DBWeb2007 2007
Analysis of Hidden Web Pages in a University Domain

平手勇宇, シュティフロマン, 魏小比, 山名早人

Information Education Research Meeting, FY2007 2007
Realizing Large-Scale Distributed Storage Using a P2P File-Sharing Network

岡本雄太, 蛭田智則, 山名早人

Proceedings of the IPSJ National Convention 69th ( 3 ) 2007

J-GLOBAL
The development and evaluation of a prototype system for the inference of genetic networks based on genetic programming

Kouji Tanaka, Hayato Yamana

WMSCI 2007: 11TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL IV, PROCEEDINGS 4 13 - 18 2007

　View Summary

Estimating the mutual interactions in genetic networks mainly means inferring the mutual control relationships among multiple genes from gene expression data. Such correlations are typically expressible as nonlinear simultaneous differential equations. However, most work to date has employed S-systems to express such differential equations, allowing only rough approximations of mass action, so it has been difficult to determine the actual correlations between the genes. Instead, we formulate the mutual interactions as actual simultaneous partial differential equations and automatically determine their structure and coefficients from a given data series using genetic programming (GP). A parallel implementation of the scheme in a Grid environment, using our Jojo Grid programming system for Java, has resulted in precise determination of the equations in many cases within a reasonable time.
1P474 Automatic extraction of conserved region from alignment based on protein structure(23. Bioinformatics, genomics and proteomics (I),Poster Session,Abstract,Meeting Program of EABS &BSJ 2006)

Yamada Shinsuke, Yamada Koutarou, Yamana Hayato, Noguchi Tamotsu

Biophysics 46 ( 2 ) S265 2006.10

DOI CiNii
Domain linker prediction based on position-specific scoring matrix

Takizawa Masatoshi, Yamana Hayato, Noguchi Tamotsu

IPSJ SIG technical reports 2006 ( 99 ) 41 - 47 2006.09

　View Summary

Domain linker prediction plays an important role in efficient protein structure analysis. Since previous domain linker prediction methods have employed a sliding window, they do not explicitly consider the position dependence of amino acids within domain linkers. In this paper, we propose a novel domain linker prediction method that focuses on both ends of the domain linker. Our method employs Support Vector Machines trained on the position dependence of amino acids extracted from the position-specific scoring matrix. As a result of the experiment using data set of coil regions dete...

CiNii J-GLOBAL
Building a terabyte-scale web data collection "NW1000G-04" in the NTCIR-5 WEB task

Masao Takaku, Keizo Oyama, Akiko Aizawa, Haruko Ishikawa, Kengo Minamide, Shin Kato, Hayato Yamana, Junya Hayashi

NII Technical Reports 2006 ( 12 ) 1 - 8 2006.09
Copyright violation detection system for Web texts

TASHIRO TAKASHI, UEDA TAKANORI, HORI TAISUKE, HIRATE Yu, YAMANA HAYATO

IPSJ SIG Notes 2006 ( 78 ) 27 - 33 2006.07

　View Summary

Due to the explosive increase in the number of web pages, the number of copyright-violating web pages, such as pages that reproduce lyrics or news without permission, has also increased. To solve this problem, we propose a system for detecting copyright-violating web pages. The proposed system consists of three steps. First, the system generates search keywords from phrasal units, called "bunsetsu", that are included in the "seed page." Second, using the search keywords generated in the first step, the system gathers candidate copyright-violating web pages by using the Google or Yahoo! web service. F...

CiNii J-GLOBAL
Support system for detecting abuse users in internet auction

HIRATE YU, AIYOSHIZAWA AKIRA, O SHOREI, IOKU YUICHI, KIDO FUYUKO, YAMANA HAYATO

IPSJ SIG Notes 2006 ( 78 ) 367 - 374 2006.07

　View Summary

Due to the recent widespread use of internet auctions, a huge number of users trade with each other. At the same time, damage from fraud committed by abuse users has become a serious problem for internet auction systems. In this paper, we develop a system for detecting abuse users by referring to rating log data, which is a part of the auction log data. Rating log data records the assessments exchanged between sellers and buyers. The system consists of two methods: extracting users who are rated as "Good" abusively, and extracting an abuse users' community starting from one abuse user. Our evaluation shows proposing syst...

CiNii J-GLOBAL
Support system for detecting abuse users in internet auction

HIRATE Yu, AIYOSHIZAWA Akira, O Shorei, IOKU Yuichi, KIDO Fuyuko, YAMANA Hayato

IEICE technical report. Data engineering 106 ( 150 ) 37 - 42 2006.07

　View Summary

Due to the recent widespread use of the internet, many people use internet auction systems and trade with each other. At the same time, damage from fraud committed by abuse users has become a serious problem for internet auction systems. In this paper, we develop a system for detecting abuse users by referring to rating log data, which is a part of the auction log data. Rating log data records the assessments exchanged between sellers and buyers. The system consists of two methods: extracting users who are rated as "Good" abusively, and extracting an abuse users' community starting from one abuse user. Our evaluation shows ...

CiNii J-GLOBAL
Copyright violation detection system for Web texts

TASHIRO Takashi, UEDA Takanori, HORI Taisuke, HIRATE Yu, YAMANA Hayato

IEICE technical report. Data engineering 106 ( 149 ) 23 - 28 2006.07

　View Summary

Due to the explosive increase in the number of web pages, the number of copyright-violating web pages, such as pages that reproduce lyrics or news without permission, has also increased. To solve this problem, we propose a system for detecting copyright-violating web pages. The proposed system consists of three steps. First, the system generates search keywords from phrasal units, called "bunsetsu", that are included in the "seed page." Second, using the search keywords generated in the first step, the system gathers candidate copyright-violating web pages by using the Google or Yahoo! web service. F...

CiNii J-GLOBAL
Hierarchical Clustering Of Feature Vectors at Visual Attentional Points

SAITO Jun, YAMANA Hayato

IPSJ SIG Notes. CVIM 2006 ( 25 ) 57 - 62 2006.03

　View Summary

In content-based image retrieval, classification is needed to improve (i) the speed of retrieval and (ii) the semanticity of retrieval. Our system extracts the most attentional points by using a selective visual attention model, which extracts feature vectors of attentional points in images. Our system then classifies the feature vectors by hierarchical clustering with residuals. An attentional point in an image is an outlier within the image, or special outlier. We propose an extension of the selective attention model to extract temporal outliers with residual vectors, and the method of moving atten...

CiNii J-GLOBAL
Estimation of Shape and Reflectance by using Extended SFS from Multiple Views

KOBAYASHI MASANORI, IGUCHI SHIGERU, YAMANA HAYATO

IPSJ SIG Notes. CVIM 2006 ( 25 ) 391 - 398 2006.03

　View Summary

There exist many methods to reconstruct a 3D model from a real object. However, they have restrictions, such as requiring expensive devices, requiring reference objects, or assuming that the target object is composed of a single material. This paper proposes a new method based on shape from shading using multiple views. The proposed method handles objects composed of multiple materials. It first clusters the reflectances using the input images and then analyzes the 3D shape and the reflectance parameters.

CiNii J-GLOBAL
Estimation of Shape and Reflectance by using Extended SFS from Multiple Views

KOBAYASHI MASANORI, IGUCHI SHIGERU, YAMANA HAYATO

Technical report of IEICE. PRMU 105 ( 674 ) 219 - 226 2006.03

　View Summary

There exist many methods to reconstruct a 3D model from a real object. However, they have restrictions, such as requiring expensive devices, requiring reference objects, or assuming that the target object is composed of a single material. This paper proposes a new method based on shape from shading using multiple views. The proposed method handles objects composed of multiple materials. It first clusters the reflectances using the input images and then analyzes the 3D shape and the reflectance parameters.

CiNii J-GLOBAL
Hierarchical Clustering Of Feature Vectors at Visual Attentional Points

SAITO Jun, YAMANA Hayato

Technical report of IEICE. PRMU 105 ( 673 ) 57 - 62 2006.03

　View Summary

In content-based image retrieval, classification is needed to improve (i) the speed of retrieval and (ii) the semanticity of retrieval. Our system extracts the most attentional points by using a selective visual attention model, which extracts feature vectors of attentional points in images. Our system then classifies the feature vectors by hierarchical clustering with residuals. An attentional point in an image is an outlier within the image, or special outlier. We propose an extension of the selective attention model to extract temporal outliers with residual vectors, and the method of moving att...

CiNii J-GLOBAL
Optimization Technique for Cache Memory Considering Wire Delay

HIRUTA Tomonori, MASUDA Keisuke, YAMANA Hayato

IPSJ SIG Notes Vol.2006 ( No.20 ) 19 - 24 2006.02

　View Summary

The growing gap between processor speed and memory speed makes cache memory more important. However, wire delay in large caches grows as a result of process miniaturization. Therefore, cache memory access will become a bottleneck. This paper proposes an optimization technique for cache memory that considers wire delay. We implement this technique with SimpleScalar 3.0d and evaluate it with SPEC95 CINT and SPEC2000 CINT. As a result, IPC improves by a factor of 1.17 on average.

CiNii J-GLOBAL
Optimization Technique for Cache Memory Considering Wire Delay

HIRUTA Tomonori, MASUDA Keisuke, YAMANA Hayato

IPSJ SIG Notes 2006 ( 20 ) 19 - 24 2006.02

　View Summary

The growing gap between processor speed and memory speed makes cache memory more important. However, wire delay in large caches grows as a result of process miniaturization. Therefore, cache memory access will become a bottleneck. This paper proposes an optimization technique for cache memory that considers wire delay. We implement this technique with SimpleScalar 3.0d and evaluate it with SPEC95 CINT and SPEC2000 CINT. As a result, IPC improves by a factor of 1.17 on average.

CiNii
Recognition of Similar Character Pairs with Two Types of SVMs for Online Mathematical Expression Recognition

KASUYA Yuji, YAMANA Hayato

Technical report of IEICE. Thought and language 105 ( 612 ) 55 - 60 2006.02

　View Summary

Mathematical expression recognition systems, which recognize mathematical expressions and translate them into digital data usable by computers, are needed. However, characters and symbols in mathematical expressions are sometimes similar and difficult to discriminate. This paper proposes a method to recognize similar character pairs with two types of SVM (Support Vector Machine). One is a normal SVM, which uses images of handwriting as input; the other is SVMGDTW, which uses sequences of pen positions. With the proposed method, "γ" and "r" are discriminated with a recognition rate of 86.7%, "ω" and "...

CiNii J-GLOBAL
Content-based Image Retrieval with Selective Visual Attention

SAITO Jun, YAMANA Hayato

Technical report of IEICE. Thought and language 105 ( 612 ) 61 - 66 2006.02

　View Summary

Content-based image retrieval (CBIR) systems have difficulty concentrating information processing on the points that seem important. This is caused by the difficulty of automatically selecting "informative" points. As part of studies of the human and primate brain, selective visual attention has been researched; it models the process that decides the attentional point of an image before eye movements when a subject is presented with an image. To use local information of an image, we propose a CBIR system that uses the feature vector at the most attentional point of the image. We introduce model of select...

CiNii J-GLOBAL
Recognition of Similar Character Pairs with Two Types of SVMs for Online Mathematical Expression Recognition

KASUYA Yuji, YAMANA Hayato

Technical report of IEICE. PRMU 105 ( 614 ) 55 - 60 2006.02

　View Summary

Mathematical expression recognition systems, which recognize mathematical expressions and translate them into digital data usable by computers, are needed. However, characters and symbols in mathematical expressions are sometimes similar and difficult to discriminate. This paper proposes a method to recognize similar character pairs with two types of SVM (Support Vector Machine). One is a normal SVM, which uses images of handwriting as input; the other is SVMGDTW, which uses sequences of pen positions. With the proposed method, "γ" and "r" are discriminated with a recognition rate of 86.7%, "ω" and "w...

CiNii
Content-based Image Retrieval with Selective Visual Attention

SAITO Jun, YAMANA Hayato

Technical report of IEICE. PRMU 105 ( 614 ) 61 - 66 2006.02

　View Summary

Content-based image retrieval (CBIR) systems have difficulty concentrating information processing on the points that seem important. This is caused by the difficulty of automatically selecting "informative" points. As part of studies of the human and primate brain, selective visual attention has been researched; it models the process that decides the attentional point of an image before eye movements when a subject is presented with an image. To use local information of an image, we propose a CBIR system that uses the feature vector at the most attentional point of the image. We introduce model of selecti...

CiNii
Sequential pattern mining with time intervals

Yu Hirate, Hayato Yamana

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3918 LNAI 775 - 779 2006

DOI
Generalized sequential pattern mining with item intervals

Yu Hirate, Hayato Yamana

Journal of Computers 1 ( 3 ) 51 - 60 2006

DOI
A Proposal for an Image Retrieval System Using Selective Attention

斎藤純, 山名早人

信学技報(PRMU） Vol.105 ( No.614 ) 61 - 66 2006
Online Recognition of Similar Mathematical Characters Using SVMs

糟谷勇児, 山名早人

信学技報(PRMU) 2006
A Smart Chip That Identifies Spam Mail

山名早人監修, G.スティックス

日経サイエンス 2006年5月号 2006
Generalization of Sequential Pattern Mining with Time Information

平手勇宇, 山名早人

DEWS2006 2006
Building an English Composition Support System Using a Search Engine

佐藤学, 安藤進, 山名早人

言語処理学会第12回年次大会 12th 664 - 667 2006

J-GLOBAL
Extracting Emotional Expressions with PrefixSpan Considering Distance and Attributes

佐藤一誠, 平手勇宇, 山名早人

DEWS2006 2006
An Image Retrieval System Based on Distances between Learner Residuals

斎藤純, 山名早人

信学技報(PRMU) Vol.105 ( No.673 ) 57 - 62 2006
Estimation of 3D Shape and Reflectance Properties by Extended Multi-View SFS

小林正典, 井口茂, 山名早人

情処研報(CVIM), Vol.2006 ( No.25 ) 391 - 398 2006
Advance Identification of Unwanted Web Communities by Link Structure Analysis

斉田直幸, 山名早人

DEWS2006 2006
Fact of the Web: Analysis of a Five-Billion-Page Web

加藤真, 山名早人

DEWS2006 2006
Automatic Extraction of Conserved Regions Based on Protein 3D Structure

山田晃太郎, 山田真介, 山名早人, 野口保

第６回日本蛋白質科学会年会ポスター番号2P-07 2006

J-GLOBAL
Text Mining using PrefixSpan constrained by Item Interval and Item Attribute

Issei Sato, Yu Hirate, Hayato Yamana

ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops 35 - 38 2006

　View Summary

Applying conventional sequential pattern mining methods to text data extracts many uninteresting patterns, which increases the time to interpret the extracted patterns. To solve this problem, we propose a new sequential pattern mining algorithm by adopting the following two constraints. One is to select sequences with regard to item intervals (the number of items between any two adjacent items in a sequence), and the other is to select sequences with regard to item attributes. Using Amazon customer reviews in the book category, we have confirmed that our method is able to extract patterns faster than the conventional method, and is better able to exclude uninteresting patterns while retaining the patterns of interest.
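
The item-interval constraint described here can be illustrated with a small support counter. This is a brute-force sketch, not the constrained PrefixSpan algorithm from the paper: the function names and the `max_gap` parameter are hypothetical, and the item-attribute constraint is left out.

```python
def occurs(sequence: list[str], pattern: list[str], max_gap: int) -> bool:
    """True if `pattern` appears in `sequence` in order, with at most
    `max_gap` other items between each pair of adjacent pattern items."""

    def search(seq_pos: int, pat_pos: int) -> bool:
        if pat_pos == len(pattern):
            return True
        # The first pattern item may match anywhere; each later item must
        # fall within the gap window that follows the previous match.
        if pat_pos == 0:
            end = len(sequence)
        else:
            end = min(len(sequence), seq_pos + max_gap + 1)
        for i in range(seq_pos, end):
            if sequence[i] == pattern[pat_pos] and search(i + 1, pat_pos + 1):
                return True
        return False

    return search(0, 0)


def support(sequences: list[list[str]], pattern: list[str], max_gap: int) -> int:
    """Number of sequences that contain `pattern` under the gap constraint."""
    return sum(1 for seq in sequences if occurs(seq, pattern, max_gap))
```

The backtracking in `search` matters: an early match of the first item may violate the gap constraint while a later match satisfies it.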

DOI
Support for Detecting Fraudulent Users in Internet Auctions

平手勇宇, 相吉澤明, 翁松齢, 井奥雄一, 木戸冬子, 山名早人

情報研報(DBS） Vol.2006 ( 140(2) ) 367 - 374 2006
An Automatic Copyright Violation Detection System for Web Pages

田代崇, 上田高徳, 堀泰祐, 平手勇宇, 山名早人

情報研報(DBS) Vol.2006 ( 140(2) ) 27 - 33 2006
Domain Linker Prediction Using Sequence Profiles

滝沢雅俊, 山名早人, 野口保

情処研報(BIO） Vol.2006 ( 99 ) 41 - 47 2006
Detecting English Article Errors Using a Search Engine

平野孝佳, 平手勇宇, 山名早人

日本データベース学会Letters Vol.6, No.3 1 - 4 2006
Support for Detecting Fraudulent Users in Internet Auctions

平手勇宇, 相吉澤明, 翁松齢, 井奥雄一, 木戸冬子, 山名早人

日本データベース学会Letters Vol.5 ( 2 ) 77 - 80 2006

J-GLOBAL
An Automatic Copyright Violation Detection System for Texts on the Web

田代崇, 上田高徳, 堀泰祐, 平手勇宇, 山名早人

日本データベース学会Letters Vol.5 ( 2 ) 25 - 28 2006

J-GLOBAL
On the Feasibility of Extracting Copyright-Violating Pages in a University Domain

平手勇宇, 山名早人

平成18年度情報教育研究集会論文集 876 - 879 2006
Web Structure in 2005

Yu Hirate, Hayato Yamana

WAW2006, Banff 2006
Automatic extraction of conserved region from alignment based on protein structure

Shinsuke Yamada, Koutarou Yamada, Hayato Yamana, Tamotsu Noguchi

EABS & BSJ 2006 Poster No. 1P474 2006
Prediction of domain and disordered regions in proteins by fold recognition and secondary structure prediction

Masatoshi Takizawa, Naoko Inoue, Kentaro Tomii, Hayato Yamana, Tamotsu Noguchi

Critical Assessment of Techniques for Protein Structure Prediction Seventh Meeting Poster No.9 2006
Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost

Shinsuke Yamada, Osamu Gotoh, Hayato Yamana

BMC Bioinformatics 7 2006

　View Summary

Background: Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. In the alignment of a family of protein sequences, global MSA algorithms perform better than local ones in many cases, while local ones perform better than global ones when some sequences have long insertions or deletions (indels) relative to others. Many recent leading MSA algorithms have incorporated pairwise alignment information obtained from a mixture of sources into their scoring system to improve accuracy of alignment containing long indels. Results: We propose a novel group-to-group sequence alignment algorithm that uses a piecewise linear gap cost. We developed a program called PRIME, which employs our proposed algorithm to optimize the well-defined sum-of-pairs score. PRIME stands for Profile-based Randomized Iteration MEthod. We evaluated PRIME and some recent MSA programs using BAliBASE version 3.0 and PREFAB version 4.0 benchmarks. The results of benchmark tests showed that PRIME can construct accurate alignments comparable to the most accurate programs currently available, including L-INS-i of MAFFT, ProbCons, and T-Coffee. Conclusion: PRIME enables users to construct accurate alignments without having to employ pairwise alignment information. PRIME is available at http://prime.cbrc.jp/. © 2006 Yamada et al; licensee BioMed Central Ltd.

DOI PubMed
Sequential pattern mining with time intervals

Yu Hirate, Hayato Yamana

ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS 3918 775 - 779 2006

　View Summary

Sequential pattern mining can be used to extract frequent sequences maintaining their transaction order. As conventional sequential pattern mining methods do not consider transaction occurrence time intervals, it is impossible to predict the time intervals of any two transactions extracted as frequent sequences. Thus, from extracted sequential patterns, although users are able to predict what events will occur, they are not able to predict when the events will occur. Here, we propose a new sequential pattern mining method that considers time intervals. Using Japanese earthquake data, we confirmed that our method is able to extract new types of frequent sequences that are not extracted by conventional sequential pattern mining methods.
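
To make the idea of interval-aware patterns concrete, the sketch below counts two-event patterns of the form (event, interval bucket, event). It is a simplified illustration rather than the method of the paper: the bucket edges, the restriction to pairs, and all function names are assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def interval_bucket(dt: float, edges: list[float]) -> int:
    """Index of the first edge that `dt` does not exceed; len(edges) if beyond all."""
    for i, edge in enumerate(edges):
        if dt <= edge:
            return i
    return len(edges)


def interval_pairs(
    sequences: list[list[tuple[float, str]]],
    edges: list[float],
    min_support: int,
) -> dict[tuple[str, int, str], int]:
    """Count (first event, interval bucket, second event) patterns.

    Each sequence is a list of (time, event) records; a pattern is counted
    at most once per sequence, as in ordinary sequence support.
    """
    counts: Counter = Counter()
    for seq in sequences:
        seen = set()
        for (t1, e1), (t2, e2) in combinations(sorted(seq), 2):
            seen.add((e1, interval_bucket(t2 - t1, edges), e2))
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}
```

With bucket edges such as `[5, 20]`, a result like `("A", 0, "B")` reads as "B follows A within 5 time units", which says when the second event tends to occur and not only that it occurs.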

DOI
Contour Extraction using Texture and Non-Texture Distinction

IGUCHI Shigeru, YAMANA Hayato

Technical report of IEICE. PRMU 105 ( 414 ) 13 - 18 2005.11

　View Summary

This paper proposes a technique that, after dividing an image into a texture region and a non-texture region, applies a suitable contour extraction method to each region to improve the accuracy of contour extraction. The most basic approach to extracting contours is edge detection by derivative filters; however, edges do not necessarily coincide with borderlines. Thus, texture analysis is essential for an accurate result. Most conventional studies apply either edge detection or texture analysis to the whole image. In contrast, in this paper, we firstly extract a texture region a...

CiNii J-GLOBAL
Sample Collection System for Online Handwritten Mathematical Expressions written by Digital Pen and Preliminary Recognition Experiments

KASUYA Yuji, YAMANA Hayato

Technical report of IEICE. PRMU 105 ( 374 ) 7 - 12 2005.10

　View Summary

This paper proposes a sample collection system for online handwritten mathematical expressions based on digital pens. Prior online handwritten character recognition systems have used samples collected with pen tablets. However, data from pen tablets are (1) difficult to collect, because users are not familiar with pen tablets, and (2) different from real handwriting, because users have to look at their monitors while writing characters. In contrast, we use digital pens, which are easy to use even for first-time users, and collect samples written by 74 examinees. By recognition experiments following facts...

CiNii J-GLOBAL
C-013 A Consideration on Thread-Level Speculative Execution

SAITO Fumiko, YAMANA Hayato

情報科学技術フォーラム一般講演論文集 4 ( 1 ) 205 - 206 2005.08

CiNii
Sequential Pattern Mining based on Event Intervals

HIRATE YU, KOMATSU SHUNSUKE, YAMANA HAYATO

IPSJ SIG Notes 2005 ( 68 ) 321 - 328 2005.07

　View Summary

In data mining research, sequential pattern mining extracts frequent sequences while keeping the order in which their events occur. Since conventional sequential pattern mining methods do not consider the time intervals between event occurrences, it is impossible to understand the time interval between any two events included in the resulting sequences. In this paper, we propose a new sequential pattern mining method that considers event occurrence time intervals. Our evaluation on earthquake data confirmed that the new method can extract new kinds of frequent sequences that could not be extracted by conventional sequential pattern mining methods.

CiNii
Sequential Pattern Mining based on Event Intervals

HIRATE Yu, KOMATSU Shunsuke, YAMANA Hayato

IEICE technical report. Data engineering 105 ( 172 ) 43 - 48 2005.07

　View Summary

In data mining research, sequential pattern mining extracts frequent sequences while keeping the order in which their events occur. Since conventional sequential pattern mining methods do not consider the time intervals between event occurrences, it is impossible to understand the time interval between any two events included in the resulting sequences. In this paper, we propose a new sequential pattern mining method that considers event occurrence time intervals. Our evaluation on earthquake data confirmed that the new method can extract new kinds of frequent sequences which couldn't extrac...

CiNii J-GLOBAL
From the Search Engine to the Analysis Engine

Yamana Hayato

Journal of Japanese Society for Artificial Intelligence Vol.20 ( No.4 ) 471 - 478 2005.07

CiNii J-GLOBAL
TF2P-growth: Frequent Itemset Mining Algorithm without Any Thresholds

HIRATE YU, IWAHASHI EIGO, YAMANA HAYATO

情報処理学会論文誌データベース（TOD） Vol.46 ( No.SIG 8(TOD 26) ) 60 - 71 2005.06

　View Summary

Conventional frequent itemset mining algorithms require a user-specified minimum support and mine the frequent itemsets whose support values are higher than that minimum support. As it is difficult to predict how many frequent itemsets will be mined with a specified minimum support, the Top-κ mining concept has been proposed. Top-κ mining mines frequent itemsets without a minimum support; instead, it returns the κ most frequent itemsets, ordered by their support values. However, the Top-κ mining concept still requires a threshold κ, so users must decide the value of κ before starting mining. In this paper, we propose a new mining algorithm, called "TF^2P-growth," which does not require any thresholds. This algorithm mines itemsets in descending order of their support values without any thresholds and returns frequent itemsets to users sequentially with a short response time.
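
The behaviour described here (itemsets returned one at a time in descending order of support, with neither a minimum support nor a κ) can be mimicked by a plain best-first enumeration over a max-heap. The sketch below is not TF^2P-growth itself; it only illustrates threshold-free, sequential output and recounts support naively. The function name is hypothetical.

```python
import heapq
from typing import Iterable, Iterator


def itemsets_by_support(
    transactions: Iterable[Iterable[str]],
) -> Iterator[tuple[tuple[str, ...], int]]:
    """Lazily yield (itemset, support) in non-increasing order of support."""
    tx = [frozenset(t) for t in transactions]
    items = sorted({item for t in tx for item in t})

    def support(itemset: tuple[str, ...]) -> int:
        wanted = frozenset(itemset)
        return sum(1 for t in tx if wanted <= t)

    # Heap entries: (negated support, itemset, index of its last item).
    heap = [(-support((item,)), (item,), idx) for idx, item in enumerate(items)]
    heapq.heapify(heap)
    while heap:
        neg_support, itemset, last = heapq.heappop(heap)
        yield itemset, -neg_support
        # Extend only with later items so each itemset is generated once.
        # A superset never has higher support than its subset, so pushing
        # extensions after the pop preserves the output order.
        for idx in range(last + 1, len(items)):
            extended = itemset + (items[idx],)
            s = support(extended)
            if s > 0:
                heapq.heappush(heap, (-s, extended, idx))
```

Because the function is a generator, a caller can stop consuming results at any point, for example with `itertools.islice`, without having chosen a threshold in advance.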

CiNii
10. Productive ICT Academia Project

UEDA Kazunori, OISHI Shinichi, KATTO Jiro, NAKAJIMA Tatsuo, MURAOKA Yoichi, YAMANA Hayato

Journal of Information Processing Society of Japan 46 ( 4 ) 410 - 416 2005.04

CiNii
The Current Status of the Art of the 21st COE Programs in the Information Sciences Field (1) Productive ICT Academia Project

上田和紀, 大石進一, 甲藤二郎, 中島達夫, 村岡洋一, 山名早人

情報処理 46 ( 4 ) 410 - 416 2005.04

CiNii J-GLOBAL
Defense against Buffer Overflow by Segmenting Stack Frame

Hiruta Tomonori, Yamana Hayato

情報処理学会研究報告. SLDM, [システムLSI設計技術] 2005 ( 27 ) 161 - 165 2005.03

　View Summary

In recent years, buffer overflow attacks have been increasing. A buffer overflow is caused by inputting data larger than the data space prepared for variables. The most dangerous buffer overflow is the stack overflow. When a stack overflow occurs, the return address is rewritten and malicious code becomes executable. This paper proposes a defense technique against buffer overflow that segments the stack frame. We implement this technique on the SimpleScalar tool set ver. 3.0d and evaluate it with SPEC CINT95. The evaluation results show that this technique causes a 2.3% performance degradation.

CiNii
Defense against Buffer Overflow by Segmenting Stack Frame

HIRUTA Tomonori, YAMANA Hayato

119 ( 0 ) 161 - 165 2005.03

　View Summary

In recent years, buffer overflow attacks have been increasing. A buffer overflow is caused by inputting data larger than the data space prepared for variables. The most dangerous buffer overflow is the stack overflow. When a stack overflow occurs, the return address is rewritten and malicious code becomes executable. This paper proposes a defense technique against buffer overflow that segments the stack frame. We implement this technique on the SimpleScalar tool set ver. 3.0d and evaluate it with SPEC CINT95. The evaluation results show that this technique causes a 2.3% performance degradation.

CiNii
Defense against Buffer Overflow by Segmenting Stack Frame

Hiruta Tomonori, Yamana Hayato

IEICE technical report. Computer systems 104 ( 738 ) 71 - 75 2005.03

　View Summary

In recent years, buffer overflow attacks have been increasing. A buffer overflow is caused by inputting data larger than the data space prepared for variables. The most dangerous buffer overflow is the stack overflow. When a stack overflow occurs, the return address is rewritten and malicious code becomes executable. This paper proposes a defense technique against buffer overflow that segments the stack frame. We implement this technique on the SimpleScalar tool set ver. 3.0d and evaluate it with SPEC CINT95. The evaluation results show that this technique causes a 2.3% performance degradation.

CiNii J-GLOBAL
MPIETE2: Improvement of the MPI Execution Time Estimator in Prediction Error of Communication Time

IWABUCHI Toshihiro, SUGITA Shu, YAMANA Hayato

IPSJ SIG Notes 2005 ( 19 ) 175 - 180 2005.03

　View Summary

In this paper, we improve the MPI Execution Time Estimator (MPIETE) to reduce the prediction error of communication time. MPIETE, which we proposed previously, is an execution time estimation tool for MPI programs. MPIETE's scheme divides an MPI program into computation blocks and communication blocks and then predicts the total execution time by summing the execution times of the blocks. Since estimating the block execution times is fast, MPIETE can predict the total execution time faster than actually executing the MPI program. However, MPIETE assumes no network contention, which causes errors when predicting the delay time under network contention. In this paper, we improve MPIETE by proposing a new estimation scheme for communication blocks that includes the delay time. The proposed scheme makes it possible to predict performance degradation and to find the number of Processing Units (PUs) at which the target platform achieves its best performance. We evaluated MPIETE2, which improves MPIETE with the proposed scheme, using EP, CG, FT, and MG from the NAS Parallel Benchmarks 2.4. For 2 to 128 PUs, the prediction errors are less than 14%, and the prediction takes 1/4 of the actual execution time. Moreover, MPIETE2 exactly predicts the number of PUs at which the target platform achieves its best performance.
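
The block-summing idea can be illustrated with a toy estimator. Everything in this sketch is an assumption made for illustration and is not taken from MPIETE or MPIETE2: the `Block` type, the latency-plus-bandwidth communication cost, the linear `contention` factor, and the default parameter values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str    # "comp" for computation, "comm" for communication
    work: float  # serial seconds for "comp", bytes per message for "comm"


def estimate(blocks: list[Block], pus: int, latency: float = 1e-4,
             bandwidth: float = 1e8, contention: float = 0.0) -> float:
    """Predicted run time: the sum of the per-block estimates."""
    total = 0.0
    for block in blocks:
        if block.kind == "comp":
            # Assumed perfectly parallel computation.
            total += block.work / pus
        else:
            base = latency + block.work / bandwidth
            # Assumed delay that grows linearly with the number of peers.
            total += base * (1.0 + contention * (pus - 1))
    return total


def best_pu_count(blocks: list[Block], candidates: list[int], **model) -> int:
    """PU count with the smallest predicted run time."""
    return min(candidates, key=lambda p: estimate(blocks, p, **model))
```

With `contention=0.0` the toy model always prefers the largest PU count; a positive value produces the turning point where adding PUs stops helping, which is the kind of behaviour the summary says a contention-free model cannot predict.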

CiNii J-GLOBAL
The Proposal of Tri-Mode Branch Predictor

SAITO FUMIKO, YAMANA HAYATO

IPSJ SIG Notes 2005 ( 19 ) 25 - 30 2005.03

　View Summary

Branch prediction is implemented in recent processors to avoid pipeline stalls. Branch prediction is a kind of speculative execution for control dependences. In recent years, as pipelines have become deeper, the branch misprediction penalty has grown. Thus, the branch misprediction rate must be lowered to raise processor performance. Recently, many branch predictors have been proposed. Among them, hybrid branch predictors composed of multiple pattern history tables (PHTs) show the highest accuracy. The Bi-Mode branch predictor is the best known of the hybrid branch predictors. In the Bi-Mode predictor, the Choice PHT judges the branch bias and selects a Direction PHT (the Taken or NotTaken PHT). This paper focuses on Weakly branches, which the Choice PHT judges as Weakly Taken or Weakly NotTaken and which therefore have no branch bias. To prevent Weakly branches from influencing the Direction PHTs, we propose the "Tri-Mode branch predictor," which adds a Weakly PHT that predicts the Weakly branches. With a 12KB Tri-Mode predictor, the reduction in branch mispredictions relative to the Bi-Mode predictor averages 2.78% in SPECint95 (ref inputs) benchmark simulation.
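
The table selection described in this summary might look roughly like the simulation sketch below. Only the overall structure (a Choice PHT that selects among Taken, NotTaken, and Weakly PHTs) comes from the summary. The 2-bit saturating counters, the table size, the initial counter values, the indexing of the direction tables by PC XOR global history, and the rule of always updating the Choice PHT are assumptions made here to keep the example short.

```python
def bump(counter: int, taken: bool) -> int:
    """Move a 2-bit saturating counter (0..3) toward the branch outcome."""
    return min(3, counter + 1) if taken else max(0, counter - 1)


class TriModeSketch:
    """Toy three-table predictor; the parameters are illustrative only."""

    def __init__(self, bits: int = 10) -> None:
        size = 1 << bits
        self.mask = size - 1
        self.choice = [1] * size     # indexed by PC; 1 and 2 mean "weak"
        self.taken = [2] * size      # Direction PHT for taken-biased branches
        self.not_taken = [1] * size  # Direction PHT for not-taken-biased branches
        self.weak = [1] * size       # extra PHT for branches without a clear bias
        self.history = 0             # global branch history register

    def _select(self, pc: int) -> tuple[list[int], int]:
        bias = self.choice[pc & self.mask]
        index = (pc ^ self.history) & self.mask
        if bias in (1, 2):           # Weakly Taken or Weakly NotTaken
            return self.weak, index
        return (self.taken if bias == 3 else self.not_taken), index

    def predict(self, pc: int) -> bool:
        table, index = self._select(pc)
        return table[index] >= 2

    def update(self, pc: int, taken: bool) -> None:
        # Train only the table that made the prediction, so branches without
        # a bias do not disturb the two direction tables.
        table, index = self._select(pc)
        table[index] = bump(table[index], taken)
        slot = pc & self.mask
        self.choice[slot] = bump(self.choice[slot], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

In this sketch a strictly alternating branch keeps its Choice counter in the two weak states, so it is handled entirely by the `weak` table, while strongly biased branches settle into one of the direction tables.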

CiNii J-GLOBAL
MPIETE2 : Improvement of the MPI Execution Time Estimator in Prediction Error of Communication Time

IWABUCHI Toshihiro, SUGITA Shu, YAMANA Hayato

IPSJ SIG Notes 2005 ( 19 ) 175 - 180 2005.03

　View Summary

In this paper, we improve the MPI Execution Time Estimator (MPIETE) to reduce the prediction error of communication time. MPIETE, which we proposed previously, is an execution time estimation tool for MPI programs. MPIETE's scheme divides an MPI program into computation blocks and communication blocks, and then predicts the total execution time by summing the execution time of each block. Since estimating the block execution time is fast, MPIETE can predict the total execution time faster than actually executing the MPI program. However, MPIETE assumes no network contention. This results in so...

CiNii
Selective Attention System by Residual Information of Predictive Coding

SAITOH Jun, YAMANA Hayato

IPSJ SIG Notes. CVIM 148 235 - 242 2005.03

　View Summary

This paper describes an unsupervised selective attention system that uses the residual information of Predictive Coding. Predictive Coding, the model of the brain's visual pathway by Rao and Ballard, uses a learning rule that minimizes the residuals between the inputs and the internal predictions, which yields a linear coding of subimages by a basis set. The residual-based selective attention model can pick out "informative" subimages from a validation image because they have features that are rarely seen in the training sample subimages. In our experiments, the system learns several similar natural scenes and is then made to recognize a different kind of image. We discuss these experiments and the usefulness of our system.

CiNii
Selective Attention System by Residual Information of Predictive Coding

Saitoh Jun, Yamana Hayato

IPSJ SIG Notes. CVIM 2005 ( 18 ) 235 - 242 2005.03

　View Summary

This paper describes an unsupervised selective attention system that uses the residual information of Predictive Coding. Predictive Coding, the model of the brain's visual pathway by Rao and Ballard, uses a learning rule that minimizes the residuals between the inputs and the internal predictions, which yields a linear coding of subimages by a basis set. The residual-based selective attention model can pick out "informative" subimages from a validation image because they have features that are rarely seen in the training sample subimages. We experiment with our system by having it learn some simi...

CiNii J-GLOBAL
An Efficient Synchronization Scheme Using Speculative Threads on Hyper-Threading Technology

HONDA Dai, SAITO Fumiko, YAMANA Hayato

IPSJ SIG Notes 2005 ( 7 ) 33 - 38 2005.01

　View Summary

Recently, the gap between CPU processing speed and the data transfer speed from main memory has lowered execution speed. Thus, data caching techniques have become more important. Particularly in pointer-based programs, which have nonlinear access patterns, the cache miss rate is very high. To solve this problem, Pre-Execution has been proposed as a cache miss latency tolerance technique that runs one or more helper threads on spare CPU resources ahead of the main computation. This paper proposes a synchronization technique between the main thread and the helper thread. Furthermore, t...

CiNii J-GLOBAL
A Branch Prediction Technique focused on Weak States of Prediction Table

Nakazawa Yukari, Saito Fumiko, Yamana Hayato

IPSJ SIG Notes 2005 ( 7 ) 51 - 56 2005.01

　View Summary

In recent years, as pipelines get deeper and the instruction fetch width and issue width become wider, more accurate branch predictors are needed. Branch predictors predict with 2-bit saturating counters (prediction counters) whose state is changed by the execution result of the branch. Analyzing the prediction accuracy in each state (Strongly Taken, Weakly Taken, Weakly Not-taken and Strongly Not-taken) of the prediction counters shows that the prediction accuracy in the Weak states of the gshare predictor is especially low. We propose the predictor sele...
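The 2-bit saturating counter mentioned in this abstract, and the per-state accuracy analysis it refers to, can be sketched in a few lines of Python. The state encoding, the function names, and the sample outcome sequence are assumptions made for illustration; the sketch models a single counter, not a full gshare predictor.

```python
# Assumed encoding of the four counter states.
STRONGLY_NOT_TAKEN, WEAKLY_NOT_TAKEN, WEAKLY_TAKEN, STRONGLY_TAKEN = range(4)


def predict(state):
    """Predict taken when the counter is in either Taken state."""
    return state >= WEAKLY_TAKEN


def update(state, taken):
    """Move one step toward the actual outcome, saturating at both ends."""
    return min(state + 1, STRONGLY_TAKEN) if taken else max(state - 1, STRONGLY_NOT_TAKEN)


def accuracy_by_state(outcomes, state=WEAKLY_NOT_TAKEN):
    """Tally [correct, total] predictions per counter state for one branch."""
    stats = {s: [0, 0] for s in range(4)}
    for taken in outcomes:
        stats[state][1] += 1
        if predict(state) == taken:
            stats[state][0] += 1
        state = update(state, taken)
    return stats


stats = accuracy_by_state([True, True, True, False, True])
```

Tallies like `stats` make it possible to compare how often predictions made in the Weak states are correct versus those made in the Strong states.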

CiNii J-GLOBAL
PlusDBG: Web community extraction scheme improving both precision and pseudo-recall

N Saida, A Umezawa, H Yamana

WEB TECHNOLOGIES RESEARCH AND DEVELOPMENT - APWEB 2005 3399 938 - 943 2005

　View Summary

This paper proposes PlusDBG to improve both precision and pseudo-recall by extending the conventional Web community extraction scheme. Precision is defined as the percentage of relevant Web pages extracted as members of Web communities and pseudo-recall is defined as the sum of the number of relevant Web pages extracted as members of Web communities. The proposed scheme adopts the new distance parameter defined by the relevance between a Web page and a Web community, and extracts the Web community with higher precision and pseudo-recall. Moreover, we have implemented and evaluated the proposed scheme. Our results confirm that the proposed scheme is able to extract about 3.2-fold larger numbers of members of Web communities than the conventional scheme, while maintaining equivalent precision.
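The two evaluation measures defined in this abstract can be written down as a small Python sketch. It assumes that precision means the share of extracted community members that are relevant and that pseudo-recall is the count of relevant pages extracted; the function names and sample page identifiers are hypothetical.

```python
def precision(extracted, relevant):
    """Percentage of extracted community members that are relevant pages."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return 100.0 * len(extracted & set(relevant)) / len(extracted)


def pseudo_recall(extracted, relevant):
    """Number of relevant pages extracted as community members."""
    return len(set(extracted) & set(relevant))


extracted_pages = ["a", "b", "c", "d"]   # pages placed in Web communities
relevant_pages = {"a", "b", "c", "x"}    # pages judged relevant
p = precision(extracted_pages, relevant_pages)
r = pseudo_recall(extracted_pages, relevant_pages)
```

Because pseudo-recall is a count rather than a ratio, it can be computed without knowing the full set of relevant pages on the Web.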
Googleを超える利口な検索エンジン

山名早人監修, J.モスタファ

日経サイエンス日経サイエンス2005年5月号 2005
迷惑メールを撃退する

山名早人監修, J.グッドマン, D.ベッカーマン, R.ラウンスウェイト

日経サイエンス日経サイエンス2005年7月号 2005
イベント発生間隔を考慮したシーケンシャルパターンマイニング

平手勇宇, 小松俊介, 山名早人

情報研報(DBS） Vol.2005 ( No.68 ) 321 - 328 2005
Search Engines 2005-Guides to the Web-Introduction to Search Engines

山名早人, 村田剛志

情報処理 Vol.46 ( No.9 ) 981 - 987 2005

CiNii J-GLOBAL
三次元情報を利用した保存領域の自動決定

山名早人

産総研生命情報科学人材養成コース最終シンポジウム 40 2005
区分的線形ギャップコストを用いたマルチプルアラインメントアルゴリズムの開発

山田真介, 山名早人, 後藤修

産総研生命情報科学人材養成コース最終シンポジウム、ポスター番号002 2005
スレッドレベル投機的実行に関する考察

斎藤史子, 山名早人

FIT2005,C-1 FIT 2005 2005

J-GLOBAL
FORTE1を利用したドメイン予測法の開発

山名早人

産総研生命情報科学人材養成コース最終シンポジウム 38 2005
デジタルペンを用いた数式サンプル収集システムの紹介と採取サンプルの解析

糟谷勇児, 山名早人

信学技報(PRMU) Vol.105 ( No.374 ) 7 - 12 2005
PRIME - an implementation of a doubly nested randomized iterative refinement strategy with the piecewise linear gap cost

Shinsuke Yamada, Osamu Gotoh

CBRC2005, Poster No.2 2005
テクスチャと非テクスチャの区別を用いた輪郭線抽出

井口茂, 山名早人

信学技報(PRMU) Vol.105 ( No.414 ) 13 - 18 2005
P2Pファイル共有ネットワーク上で動作するメタファイルシステム

山名早人

日本ソフトウェア科学会インターネットテクノロジワークショップ2005論文集 WIT2005 2005
スパイウェア

山名早人監訳, 斎藤純, 平手勇, 糟谷勇児, 柳井佳孝, 蛭田智則, 杉田秀, 井口茂訳

CACM日本語版 Vol.6, No.1 2005
Multiple sequence alignment

Osamu Gotoh, Shinsuke Yamada, Tetsushi Yada

handbook of computational molecular biology 2005
Overview of the NTCIR-5 WEB Navigational Retrieval Subtask 2 (Navi-2)

Keizo Oyama, Masao Takaku, Haruko Ishikawa, Akiko Aizawa, Hayato Yamana

Proc. of NTCIR-5 Workshop 2005
リカレントネットを用いたオンライン文字認識システム

糟谷勇児, 山名早人

情報処理学会全国大会講演論文集 67th ( 2 ) 2005

J-GLOBAL
HMMでの動作認識における類似動作からの特徴部位抽出

井口茂, 山名早人

情報処理学会全国大会講演論文集 67th ( 2 ) 2005

J-GLOBAL
MPIプログラムの簡易実行による実行時間予測手法における通信予測の効率化

杉田秀, 岩淵寿寛, 山名早人

情報処理学会全国大会講演論文集 67th ( 1 ) 2005

J-GLOBAL
相同性検索手法の組み合わせによる検索精度向上

滝沢雅俊, 山田真介, 山名早人

情報処理学会全国大会講演論文集 67th ( 3 ) 2005

J-GLOBAL
3P266 Identification of rigid domains by using complete graph and application to SCOP

Mashiko R, Wako H, Yamana H

Biophysics 44 ( 1 ) S256 2004.11

DOI CiNii
F-033 Parallel Learning Methods of Reinforcement Learning on Shared Memory Multiprocessors

Mori Kouichirou, Yamana Hayato

情報科学技術フォーラム一般講演論文集 3 ( 2 ) 291 - 292 2004.08

CiNii J-GLOBAL
An Efficient Caching Technique Using Speculative Threads on Hyper-Threading Technology

HONDA Dai, SAITO Fumiko, YAMANA Hayato

IPSJ SIG Notes 2004 ( 80 ) 43 - 48 2004.07

　View Summary

Recently, the gap between CPU processing speed and the data transfer speed from main memory has greatly influenced execution speed. Thus, data caching techniques have become more important. However, in pointer-based programs, which have nonlinear access patterns, cache memory does not function effectively. To solve this problem, Pre-Execution is a cache miss latency tolerance technique that uses one or more helper threads running on spare CPU resources ahead of the main computation. This paper proposes a synchronization technique for the helper thread. Furthermore, this paper examines the ...

CiNii J-GLOBAL
A Translation Support System using Search Engines

OSHIKA Hironori, SATOU Manabu, ANDO Susumu, YAMANA Hayato

IPSJ SIG Notes 2004 ( 72 ) 585 - 591 2004.07

　View Summary

This paper proposes a new Japanese-to-English translation support system using search engines. The system uses Google to address some of the problems that non-native speakers of English often encounter when trying to write English sentences. In particular, choosing appropriate English nouns and prepositions is a challenge for Japanese speakers. To solve these problems, we focus on the techniques presented in the book titled "Using Google to Improve Your Translation Skills" written by Susumu Ando, published by Maruzen in 2003. The proposed system is implemented by automatically constructing opti...

CiNii
A Translation Support System using Search Engines

OSHIKA Hironori, SATOU Manabu, YAMANA Hayato

IEICE technical report. Data engineering 104 ( 177 ) 237 - 242 2004.07

　View Summary

This paper proposes a new Japanese-to-English translation support system using search engines. The system uses Google to address some of the problems that non-native speakers of English often encounter when trying to write English sentences. In particular, choosing appropriate English nouns and prepositions is a challenge for Japanese speakers. To solve these problems, we focus on the techniques presented in the book titled "Using Google to Improve Your Translation Skills" written by Susumu Ando, published by Maruzen in 2003. The proposed system is implemented by automatically constructing opti...

CiNii J-GLOBAL
Toward the Exploitation of New Applications based on Web Data

YAMANA Hayato

IPSJ SIG Notes 2004 ( 45 ) 107 - 110 2004.05

　View Summary

The amount of information on the Web is huge: the number of Web pages was estimated at about 9.25 billion in April 2004. Moreover, one billion Web pages will be added to the Web repository every year, an estimate calculated from the average increase over the past two years. It is not too much to say that the huge Web repository holds all kinds of information, knowledge, and know-how, more than a human could learn even by spending a whole lifetime on it. In this paper, we introduce the major research projects concerned with how to crawl this huge number of Web pages, how to keep them up to date, and how to make full use of them.

CiNii
Parallelization of Loops Including Loop Carried Dependences Using Thread Level Speculative Execution

Ishikawa Shunsuke, Yamana Hayato

情報処理学会研究報告. SLDM, [システムLSI設計技術] 2004 ( 33 ) 63 - 68 2004.03

　View Summary

The loop parallelization scheme improves program performance dramatically. However, when the data dependence cannot be analyzed statically, the conventional parallelization scheme assumes that the data dependence exists. Thus, a loop containing an unanalyzed Loop Carried Dependence cannot be parallelized. The thread level speculative execution scheme is known to speed up such loops. In this paper, we propose a scheme that applies speculative execution selectively, only to the portions expected to be sped up effectively, using the overhead parameter required for the book-keeping p...

CiNii
A Fast Learning Method for Reinforcement Learning on Shared Memory Multiprocessors

MORI Kouichirou, YAMANA Hayato

IPSJ SIG Notes. ICS 2004 ( 29 ) 89 - 94 2004.03

　View Summary

In Reinforcement Learning, the agent learns by trial and error, starting from a state without knowledge. Therefore, reinforcement learning has the drawback that learning is slow, and how to learn at high speed is a serious problem. Some methods have been proposed to speed up learning. In these methods, the value function is divided, and each divided value function is assigned to a processor and updated in parallel. However, these methods need to exchange experiences frequently between the divided value functions because of the nature of reinforcement learning. It was a problem in the previou...

CiNii
Parallelization of Loops Including Loop Carried Dependences Using Thread Level Speculative Execution

Ishikawa Shunsuke, Yamana Hayato

IEICE technical report. Computer systems 103 ( 736 ) 19 - 24 2004.03

　View Summary

The loop parallelization scheme improves program performance dramatically. However, when the data dependence cannot be analyzed statically, the conventional parallelization scheme assumes that the data dependence exists. Thus, a loop containing an unanalyzed Loop Carried Dependence cannot be parallelized. The thread level speculative execution scheme is known to speed up such loops. In this paper, we propose a scheme that applies speculative execution selectively, only to the portions expected to be sped up effectively, using the overhead parameter required for the book-keeping p...

CiNii J-GLOBAL
A Fast Learning Method for Reinforcement Learning on Shared Memory Multiprocessors

MORI Kouichirou, YAMANA Hayato

IEICE technical report. Artificial intelligence and knowledge-based processing 103 ( 725 ) 59 - 64 2004.03

　View Summary

In Reinforcement Learning, the agent learns by trial and error, starting from a state without knowledge. Therefore, reinforcement learning has the drawback that learning is slow, and how to learn at high speed is a serious problem. Some methods have been proposed to speed up learning. In these methods, the value function is divided, and each divided value function is assigned to a processor and updated in parallel. However, these methods need to exchange experiences frequently between the divided value functions because of the nature of reinforcement learning. It was a problem in the previou...

CiNii J-GLOBAL
MPIETE : An Execution Time Estimator for MPI Programs

HORII Hiroshi, IWABUCHI Toshihiro, YAMANA Hayato

IPSJ SIG Notes 2004 ( 20 ) 55 - 60 2004.03

　View Summary

In this paper, we propose the MPI Execution Time Estimator (MPIETE), an execution time estimation tool for MPI programs that helps you choose the best-suited computing platform for executing an MPI program. Conventional execution time estimation schemes are not able to model a computing platform or an MPI program perfectly, so none of the parameters of either the computing platform or the MPI program can be reused. In contrast, the proposed scheme makes it possible to reuse all the parameters of both the computing platform and the MPI program even for the estimation on another computing platfo...

CiNii J-GLOBAL
The Branch Predictor referring a BTB Entry Existence

SAITO FUMIKO, YAMANA HAYATO

IPSJ SIG Notes 2004 ( 20 ) 127 - 132 2004.03

　View Summary

Recent processors employ branch prediction to avoid pipeline stalls. Branch prediction is a kind of speculative execution for control dependences. In recent years, as pipelines have become deeper, the branch misprediction penalty has grown. Thus, the branch misprediction rate must be lowered to raise processor performance. Branch prediction predicts a branch direction and a branch target address. The BTB (Branch Target Buffer) registers Taken branches. We found that most branches that do not have a BTB entry are NotTaken branches. We propose a branch predictor that refers to a BTB...

CiNii J-GLOBAL
見たいサイトが一発で出てくる検索エンジンの仕組みとは

山名早人

インターネットマガジン（インプレス） ( No.108 ) 88 - 91 2004
検索エンジンのアーキテクチャ

山名早人

情報の科学と技術 Vol.54 ( No.2 ) 84 - 89 2004

DOI
分岐方向偏向強弱毎の予測表で構成された分岐方向予測機構

斎藤史子, 山名早人

情処研報(ARC) Vol.2004 ( No.20 ) 127 - 132 2004
繰り返し囚人のジレンマゲームを適用したネットオークションモデルの提案と協調行動の観察

久野木彩子, 山名早人

DEWS2004 2004
強化学習並列化による学習の高速化

森紘一郎, 山名早人

情処研報(ICS) Vol.2004 ( No.29 ) 89 - 94 2004
リンク構造を利用した Web ページの更新判別手法

熊谷秀樹

DEWS2004, 3. 4-6 2004

CiNii
ユーザへの応答時間を重視した最頻出kパターン抽出アルゴリズム

平手勇宇, 岩橋永悟, 山名早人

DEWS2004 2004
ユーザの感覚を考慮したWeb検索システムの評価手法

大塚崇志, 江口浩二, 山名早人

DEWS2004 2004
ページ-コミュニティ間の関連性を考慮したWebコミュニティ抽出

斉田直幸, 梅沢晃, 山名早人

第66回情処全大 1U-5 ( 3 ) 2004

J-GLOBAL
トランスポート層の情報を利用したパケットの経路選択

高見進太郎, 山名早人, 廣津登志夫

第66回情処全大 4W-2 ( 3 ) 2004

J-GLOBAL
スレッドレベル投機的実行による依存距離不定運搬依存をもつループの並列化

石川隼輔, 山名早人

情処研報(SLDM) Vol.2004 ( No.33 ) 63 - 68 2004
グループ化された Web ページを用いた検索

梅沢晃

DEWS2004, 3. 4-6 2004

CiNii
MPIプログラムの簡易実行結果を用いた実行時間予測ツールMPIETEの評価

堀井洋, 岩渕寿寛, 山名早人

情処研報(HPC) Vol.2004 ( No.20 ) 55 - 60 2004
Cutting down the amount of communications for frequent pattern mining on Grid

加藤真, 平手勇宇, 岩橋永悟, 山名早人

情報処理学会シンポジウム論文集 2004 ( 6 ) 165 - 166 2004

J-GLOBAL
The Branch Predictor referring a BTB Entry Existence

斎藤史子, 山名早人

情報処理学会シンポジウム論文集 2004 ( 6 ) 261 - 268 2004

J-GLOBAL
An Efficient Algorithm for Mining Top-k Frequent Patterns with a Small Response Time

平手勇宇, 岩橋永悟, 山名早人

2004 CORS/INFORMS International Meeting (2004.5) 2004
A challenge to gather 10 billion of web pages

YAMANA H.

2004 CORS/INFORMS International Meeting 2004

CiNii
サービス指向コンピューティング

山名早人監訳, 石川隼輔, 堀井洋, 岩渕寿寛, 岩橋永悟, 山口正男訳

CACM日本語版 Vol.4 ( No.3 ) 2004
検索エンジンを使った翻訳サポートシステムの構築

大鹿広憲, 佐藤学, 安藤進, 山名早人

DBWS2004 2004
ハイパースレッディング環境における投機的スレッドを用いたキャッシュ効率化

本田大, 斎藤史子, 山名早人

SWoPP2004 2004
extension of group-to-group sequence alignment algorithm under a piecewise linear gap cost

山田真介, 後藤修, 山名早人

Proc. of Intelligent Systems for Molecular Biology 2004 2004
The Branch Predictor Referring a BTB Entry Existence

SAITO FUMIKO, YAMANA HAYATO

45 ( 7 ) 71 - 79 2004

　View Summary

Recent processors employ branch prediction to avoid pipeline stalls. Branch prediction is a kind of speculative execution for control dependences. In recent years, as pipelines have become deeper, the branch misprediction penalty has grown. Thus, the branch misprediction rate must be lowered to raise processor performance. Branch prediction predicts a branch direction and a branch target address. The BTB (Branch Target Buffer) registers Taken branches. We found that most branches that do not have a BTB entry are NotTaken branches. We propose a branch predictor that refers to the existence of a BTB entry. To alleviate aliasing, the proposed predictor updates only the entries of branches whose target addresses are registered in the BTB. In SPECint95 (train), the branch misprediction rate drops by 1.5% on average with an 8 KB Gshare predictor and by 0.4% on average with a 1.5 KB Bi-Mode predictor.

CiNii
TF2P-growth: An Efficient Algorithm for Mining Frequent Patterns without any Thresholds

平手勇宇, 岩橋永悟, 山名早人

IEEE ICDM'04 Workshop on Alternatives Techniques for Data Mining and Knowledge Discovery 2004
Extension of Prrn: implementation of a doubly nested randomized iterative refinement strategy under a piecewise linear gap cost

山田真介, 後藤修, 山名早人

the Fifteenth International Conference on Genome Informatics 2004
Exploitation of Informational Applications - Toward the Global Web Information Archive

Hayato Yamana

Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers 57 ( 12 ) 1632 - 1637 2003.12

DOI
Branch Direction Miss Prediction Rate Diminution in Cooperation with a Selector and Predictors on Hybrid Branch Predictor

SAITO Fumiko, NAKAZAWA Yukari, YAMANA Hayato

IPSJ SIG Notes 154 ( No.84 ) 115 - 120 2003.08

　View Summary

In recent years, as pipelines have grown deeper, more accurate branch direction predictors are needed. Most branch predictors tend to increase the hardware budget to alleviate aliasing. This research proposes a means of reducing the misprediction rate of a hybrid predictor within the same hardware budget. This predictor is called the Hybrid Predictor Referenced Prediction Counter State (Hybrid-RPCS). Generally, low-predictability branches have a high transition rate and no direction bias. For low-predictability branches, the prediction is inverted based on the selector counter state and the predictor counter states. For example, when a Bi-Mode Predictor has a strong-state selector counter choosing the prediction table and a weak-state predictor counter deciding the final prediction, and the prediction is opposite to the selector counter's direction, the misprediction rate is high. In SPECint95 (ref) simulation, the misprediction rate is reduced by up to 1.43% on a 1.5KB Bi-Mode Predictor and by up to 0.16% on a 12KB Combining Predictor.

CiNii
Branch Direction Miss Prediction Rate Diminution in Cooperation with a Selector and Predictors on Hybrid Branch Predictor

Saito Fumiko, Nakazawa Yukari, Yamana Hayato

IPSJ SIG Notes 2003 ( 84 ) 115 - 120 2003.08

　View Summary

In recent years, as pipelines have grown deeper, more accurate branch direction predictors are needed. Most branch predictors tend to increase the hardware budget to alleviate aliasing. This research proposes a means of reducing the misprediction rate of a hybrid predictor within the same hardware budget. This predictor is called the Hybrid Predictor Referenced Prediction Counter State (Hybrid-RPCS). Generally, low-predictability branches have a high transition rate and no direction bias. For low-predictability branches, the prediction is inverted based on the selector counter state and predicto...

CiNii J-GLOBAL
Evaluation of Execution-time Prediction Method of MPI Programs based Simple Execution

IWABUCHI Toshihiro, HORII Hiroshi, YAMANA Hayato

IPSJ SIG Notes 2003 ( 83 ) 131 - 136 2003.08

　View Summary

In this paper, we show evaluation results of our execution-time prediction method, which simply executes an MPI program on 2 PUs. We predict the execution time of NAS Parallel Benchmarks ver. 2.3 on 2-128 PUs. Execution time prediction is an effective technique for determining the optimal number of PUs for a target application. Most existing methods not only predict execution time but also obtain information on various overheads, and hence need a long simulation time. On the other hand, since our purpose is to obtain the execution time only, our method can predict faster than actual execu...

CiNii J-GLOBAL
Parallel FP-growth Algorithm for Frequent Pattern Mining

IWAHASHI Eigo, YAMANA Hayato

IPSJ SIG Notes 2003 ( 71 ) 327 - 333 2003.07

　View Summary

Frequent pattern mining is one of the important problems in data mining research. Apriori is a prominent algorithm followed by many variants. In 2000, FP-growth, which is reported to be faster than Apriori, was proposed. However, many parallel algorithms for frequent pattern mining are still based on Apriori. In this paper, we propose a parallelized version of FP-growth, which accesses disks in parallel and constructs local FP-trees in each local memory. In an evaluation using a 32-node PC cluster, our method is approximately 2 and 130 times faster than sequent...

CiNii
Parallel FP-growth Algorithm for Frequent Pattern Mining

IWAHASHI Eigo, YAMANA Hayato

IEICE technical report. Data engineering 103 ( 191 ) 109 - 114 2003.07

　View Summary

Frequent pattern mining is one of the important problems in data mining research. Apriori is a prominent algorithm followed by many variants. In 2000, FP-growth, which is reported to be faster than Apriori, was proposed. However, many parallel algorithms for frequent pattern mining are still based on Apriori. In this paper, we propose a parallelized version of FP-growth, which accesses disks in parallel and constructs local FP-trees in each local memory. In an evaluation using a 32-node PC cluster, our method is approximately 2 and 130 times faster than sequent...

CiNii J-GLOBAL
Webページ構造を考慮したキーワードによる画像の内容特定

大鹿, 広憲, 山名, 早人

第65回全国大会講演論文集 2003 ( 1 ) 81 - 82 2003.03

CiNii
ユーザの検索履歴を用いた情報検索システムの提案

三浦, 典之, 山名, 早人

第65回全国大会講演論文集 2003 ( 1 ) 115 - 116 2003.03

CiNii
Webページの更新傾向を踏まえた効率的な収集方法の提案

熊谷, 英樹, 山名, 早人

第65回全国大会講演論文集 2003 ( 1 ) 167 - 168 2003.03

CiNii
投機的データプリフェッチを用いたキャッシュ効率化の考察

本田, 大, 山名, 早人

第65回全国大会講演論文集 2003 ( 1 ) 145 - 146 2003.03

CiNii
大規模Webデータからのコミュニティ抽出

梅沢, 晃, 山名, 早人

第65回全国大会講演論文集 2003 ( 1 ) 131 - 132 2003.03

CiNii
分子系統樹構成法に関する最新技術動向

益子理絵, 山田真介, 山名早人

第65回情処全大 1Y-5 ( 1 ) 1 2003

J-GLOBAL
分岐命令実行回数に着目した投機的実行支援情報収集機構の設計とFPGAへの実装

蛭田智則, 山名早人, 佐谷野健二, 小池汎平

第65回情処全大 3ZA-5 1 2003
投機的実行による難並列化ループの高速化

石川隼輔, 山名早人

第65回情処全大 3ZA-4 ( 1 ) 1 2003

J-GLOBAL
投機的データプリフェッチを用いたキャッシュ効率化の考察

本田大, 山名早人

第65回情処全大 3ZA-7 ( 1 ) 1 2003

J-GLOBAL
大規模Webデータからのコミュニティ抽出

梅沢晃, 山名早人

第65回情処全大 4U-1 ( 3 ) 3 2003

J-GLOBAL
リンク構造を用いたWebページ自動分類の精度向上法

大西高裕, 山名早人

第65回情処全大 4ZA-1 ( 3 ) 3 2003

J-GLOBAL
ユーザの検索履歴を用いた情報検索システムの提案

三浦典之, 山名早人

第65回情処全大 3U-1 ( 3 ) 3 2003

J-GLOBAL
マルコフモデルを用いたWebランキング法の評価実験

赤津秀之, 山名早人

第65回情処全大 4ZA-2 3 2003
データ依存命令を対象としたデータ値予測

仲沢由香里, 山名早人

第65回情処全大 3ZA-6 ( 1 ) 1 2003

J-GLOBAL
ゲノムデータベースにおけるアノテーションフィールドを利用したエントリの類似検索

三村徹, 諸岡慎士, 山名早人

第65回情処全大 4U-3 3 2003
アプリケーションのレスポンス時間を用いたPCの性能評価

堀井洋, 山名早人

第65回情処全大 5U-5 ( 1 ) 1 2003

J-GLOBAL
Webページ構造を考慮したキーワードによる画像の内容特定

大鹿広憲, 山名早人

第65回情処全大 3N-1 ( 3 ) 3 2003

J-GLOBAL
Webページの更新傾向を踏まえた効率的な収集方法の提案

熊谷英樹, 山名早人

第65回情処全大 4ZA-4 ( 3 ) 3 2003

J-GLOBAL
Web サーチエンジンの新しい評価手法

大塚崇志

電子情報通信学会第14回データ工学ワークショップDEWS2003 (7-P:Webサーチ,Web応用) 2003

CiNii
MPIプログラムの簡易実行による実行時間予測

岩渕寿寛, 堀井洋, 山名早人

第65回情処全大 5Z-5 ( 1 ) 1 2003

J-GLOBAL
GnutellaにおけるQuery Hitを用いたトラヒック量軽減手法の提案

難波貞暁, 山名早人

第65回情処全大 5W-5 3 2003
IT社会を先導するインターネット－家庭でのインターネットアクセスの現状と今後－

山名早人

電子情報通信学会誌 Vol.86 ( No.5 ) 304 - 310 2003
FP-growthの並列化による頻出パターン抽出の高速化

岩橋永悟, 山名早人

情処研報(DBS) Vol.2003 ( No.71 ) 327 - 334 2003
ハイブリッド予測機構における選択器と予測器の協調による予測ミス率の低減

斎藤史子, 仲沢由香里, 山名早人

情処研報(ARC) Vol.2003 ( No.84 ) 115 - 120 2003
MPIプログラムの簡易実行による実行時間予測の評価

岩渕寿寛, 堀井洋, 山名早人

情処研報(HPC) Vol.2003 ( No.83 ) 131 - 136 2003
「情報」応用の開拓～全世界のWeb情報アーカイブ構築への挑戦～

山名早人

映像情報メディア学会誌 Vol.57 ( No.12 ) 1632 - 1637 2003

DOI
分岐命令に着目した投機的実行支援情報収集機構の設計とFPGAへの実装

蛭田智則, 小池汎平, 佐谷野健二, 山名早人

情報処理学会全国大会講演論文集 65th ( 1 ) 2003

J-GLOBAL
マルコフモデルを使用したWebランキングの評価実験

赤津秀之, 山名早人

情報処理学会全国大会講演論文集 65th ( 3 ) 2003

J-GLOBAL
P2P方式における検索効率の改善手法の評価

難波貞暁, 山名早人

情報処理学会全国大会講演論文集 65th ( 3 ) 2003

J-GLOBAL
ゲノムデータベースにおけるエントリの関連性検索

三村徹, 諸岡慎士, 山名早人

情報処理学会全国大会講演論文集 65th ( 3 ) 2003

J-GLOBAL
Hybrid Branch Predictors Evaluation on Prediction Accuracy

Saito Fumiko, Kitamura Takeshi, Yamana Hayato

IPSJ SIG Notes 2002 ( 112 ) 89 - 94 2002.11

　View Summary

In recent years, branch predictors with multiple prediction tables, which are called "Hybrid Predictors" in this paper, have been proposed. Hybrid Predictors are classified into two categories: the "Combining Predictor" and the "De-Aliased Predictor". The difference between them is the means of selecting a prediction table. The "Combining Predictor" selects a prediction table by confidence. The "De-Aliased Predictor" selects a prediction table by branch direction bias. Although the prediction accuracy in "Combining Predictor" is the highest, "Combining Pr...

CiNii J-GLOBAL
Necessity for Confidence in Multiple PHT Branch Predictors

Saito Fumiko, Hiruta Tomonori, Yamana Hayato

IPSJ SIG Notes 2002 ( 81 ) 55 - 60 2002.08

　View Summary

In recent years, branch predictors with multiple PHTs (Pattern History Tables) have been proposed. In this paper, branch predictors with multiple PHTs are classified into two categories. One is the "Hybrid Predictor"; the other is what this paper calls the "Multiple PHT Predictor". The difference between the "Hybrid Predictor" and the "Multiple PHT Predictor" is PHT selection confidence. In this paper, we compare the "Hybrid Predictor" with the "Multiple PHT Predictor". As a result, if the "Hybrid Predictor" is the same size as the "Multiple PHT Predictor", the "Hybrid Predictor" predicts branch directions better tha...

CiNii J-GLOBAL
A Proposal of the Branch Prediction Technique based on the Transition Rate

Umezawa Akira, Yamana Hayato

IPSJ SIG Notes 2002 ( 37 ) 25 - 30 2002.05

　View Summary

To raise processing speed, today's processors adopt pipelining to extract instruction-level parallelism. However, a pipeline stall occurs when a conditional branch exists. Various studies have been done to raise the accuracy of branch prediction. In this paper, we propose a new branch prediction technique based on the transition rate, specifically the number of consecutive times a branch goes in the same direction. The proposed scheme targets branches that are classified as difficult to predict. We applied the prop...

CiNii J-GLOBAL
Search Pattern Modeling based on its Search Interval

Suzuki Shunsuke, Yamana Hayato

IPSJ SIG Notes Vol.2002 ( No.28 ) 103 - 110 2002.03

　View Summary

Conventional search engines search based on a pre-generated index. Thus, when several users search with the same query, the search engine returns the same result even if they want different results. To solve this problem, in this paper we propose a user modeling scheme based on the user's search pattern to infer the user's intention. We have classified the interval between a search and a re-search into two patterns. Furthermore, we have classified 91% of users' queries into nine patterns. Using these patterns, search engines will be able to return results that suit the user's intention.

CiNii
Search Pattern Modeling based on its Search Interval

Suzuki Shunsuke, Yamana Hayato

IPSJ SIG Notes 2002 ( 28 ) 103 - 110 2002.03

　View Summary

Conventional search engines search based on a pre-generated index. Thus, when several users search with the same query, the search engine returns the same result even if they want different results. To solve this problem, in this paper we propose a user modeling scheme based on the user's search pattern to infer the user's intention. We have classified the interval between a search and a re-search into two patterns. Furthermore, we have classified 91% of users' queries into nine patterns. Using these patterns, search engines will be able to return the re...

CiNii J-GLOBAL
Effectiveness of Caching in Web Robots

熊谷, 英樹, 山名, 早人

第64回全国大会講演論文集 2002 ( 1 ) 49 - 50 2002.03

CiNii
Latest Trends in Modeling Brain-like Information Processing

齋藤, 雅浩, 山名, 早人

第64回全国大会講演論文集 2002 ( 1 ) 223 - 224 2002.03

CiNii
An Efficient Speculative Execution Scheme for Loops

Ishikawa Shunsuke, Yamana Hayato

IPSJ SIG Notes Vol.2002 ( No.22 ) 121 - 126 2002.03

In this paper, we propose an efficient speculative execution scheme for loops, and confirm the usefulness of the scheme using the compress program from the SPECcpu95 benchmark. Since loops generally account for a large portion of total execution time, loop parallelization can improve program performance dramatically. However, when data dependences cannot be analyzed statically, conventional parallelization schemes assume that a dependence exists. For this reason, such a loop cannot be parallelized even if a loop-carried dependence (LCD) occurs dynamically only once in 10,000 times. Speculative execution is known to speed up such loops. In this paper, we propose a scheme that applies speculative execution selectively, only to the portions expected to be sped up effectively, by taking into account the overhead of the book-keeping process required when speculation fails. Such overhead has not been considered in conventional speculative execution schemes. The proposed scheme enables selective speculative execution using the book-keeping overhead parameter, the probability that an LCD exists, and the timing at which speculative execution is initiated. At the present stage, however, the execution speed falls to one third. To solve this problem, we also propose a new speculative execution scheme.

CiNii J-GLOBAL
Latest Trends in Research on Brain-like Information Processing

齋藤雅浩, 山名早人

第64回情処全大 5P-3 ( 2 ) 2002

J-GLOBAL
A Survey of Site Characteristics and Usefulness by Checking Backlinks

高見進太郎, 山名早人

第64回情処全大 3X-3 ( 3 ) 2002

J-GLOBAL
Web Ranking Using a Markov Model

赤津秀之, 山名早人

第64回情処全大 3X-6 ( 3 ) 2002

J-GLOBAL
Estimating the Number of Web Pages in Japan, Taking into Account the Skew in Page Counts across Domains

西村真幸, 山名早人

第64回情処全大 2X-6 ( 3 ) 2002

J-GLOBAL
An Attempt at Automatically Extracting Paper Files from the Web

田伏真之, 山名早人

第64回情処全大 4Y-6 ( 3 ) 2002

J-GLOBAL
A Study of Efficient Collection Methods Based on the Update Frequency and Access Frequency of Web Pages

熊谷英樹, 山名早人

第64回情処全大 4X-6 ( 3 ) 2002

J-GLOBAL
Development of a Local Structure Prediction Method Using Structural Profiles

山田真介, 富井健太郎, 太田元規, 秋山泰, 山名早人

第2回日本蛋白質科学会年会ポスター 2p-141 2002
The Latest Technical Trends in Speculative Execution

Saito Fumiko, Yamana Hayato

IPSJ SIG Notes 2001 ( 116 ) 67 - 72 2001.11

Instruction-level speculative execution schemes are classified into branch prediction, which alleviates control dependence, and data prediction, which alleviates data dependence. In this paper, we summarize 36 papers on branch prediction and 27 papers on data prediction published in HPCA from 1996 to 2001 and in ISCA, MICRO, and ASPLOS from 1996 to 2000. As a general trend, until 1998 more than half of the research on speculative execution was related to branch prediction. Since 1997, however, research on data prediction has increased.

CiNii J-GLOBAL
An Estimation Scheme of the Execution Time for MPI Programs using Measured Primitives

Horii Hiroshi, Yamana Hayato

IPSJ SIG Notes 2001 ( 102 ) 61 - 66 2001.10

In this paper, we propose a scheme for estimating the execution time of MPI programs, and confirm the usefulness of the scheme using the NAS Parallel Benchmarks (NPB) ver 2.3. The scheme estimates the execution time of an MPI program by dividing it into a computation part and a communication part. To estimate the computation time, we divide an MPI program into blocks that have a loop structure, measure the execution time of every block, and estimate the total execution time. To estimate the communication time, we measure the communication time with the same message size which is...

CiNii J-GLOBAL
Search Engine Google

YAMANA Hayato, KONDO Hidekazu

Journal of Information Processing Society of Japan Vol.42 ( No.8 ) 775 - 780 2001.08

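Google is famous as the search engine that holds the largest amount of information in the world. After starting as a research project in the Computer Science Department at Stanford University, Google received a total investment of 25 million dollars from the two major venture capital firms of Silicon Valley, and was founded as a company in September 1998 by Larry (Lawrence) Page and Sergey Brin, who were then 25-year-old doctoral students.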

CiNii J-GLOBAL
Invited Talk 2: Information Retrieval Technology of the Search Engine Google (15th AI Symposium: WWW Information Retrieval and Information Integration)

山名早人

AIシンポジウム 15 21 - 26 2001.07

CiNii
A Study of the Unlimited Speculative Execution Scheme for Sequential Loops

山名, 早人

第62回全国大会講演論文集 2001 ( 1 ) 163 - 164 2001.03

CiNii
An Effective Method for Applying Speculative Execution to Loops

山名早人

情報処理学会第６２回全国大会 5R-4 ( 1 ) 2001

J-GLOBAL
Invited Paper: Information Retrieval Technology of the Search Engine Google

山名早人

第15回AIシンポジウム SIG-J-A101 21 - 26 2001

J-GLOBAL
Database Frontiers (12): Search Engines and High-Speed Page Collection Technology: The Distributed WWW Robot Experiment

山名早人

Bit 32 ( 12 ) 72 - 79 2000.12

CiNii J-GLOBAL
2000-ARC-139-28 Unlimited Speculative Execution for Loops

YAMANA Hayato, Koike Hanpei

IPSJ SIG Notes 2000 ( 74 ) 163 - 168 2000.08

This paper discusses how to apply "Unlimited Speculative Execution" to loops. A task-level speculative execution scheme called "Unlimited Speculative Execution" is applied to loops that cannot be parallelized because of memory ambiguity or control dependences. In this paper, loops are classified into nine categories to clarify which loops the scheme is applicable to. Moreover, we discuss the results of applying the scheme to the SPEC95int compress program.

CiNii J-GLOBAL
An Experiment in Collecting Domestic WWW Data with a Distributed WWW Robot

山名早人

ACM SIGMOD Japanシンポジウム講演集 2000
Current State and Issues of Wide-Area Distributed Computing: The Distributed WWW Robot as an Example

山名早人

北海道地域ネットワーク協議会シンポジウム2000／北海道地域ネットワーク協議会 95 - 102 2000
Survey Research on Super-Compiler Technology

平成11年度先導調査研究報告書／新エネルギー・産業総合開発機構 2000
Applying Unlimited Speculative Execution to Loops

山名

情処研報 2 - 5 2000

CiNii
Preliminary Evaluation of a Distributed WWW Robot and a Study of Speeding It Up

山名

日本ソフトウェア科学会The Third Workshop on Internet Technology 2000

CiNii
Research and Development of a Wide-Area Distributed Search Robot for the Internet

村岡

第19回IPA技術発表会 2000

CiNii
Distributed WWW Robot Experiment

山名早人

Bit,共立出版 2000
Status of the Distributed WWW Robot Experiment (Special Feature: Prospects for the Next-Generation Internet)

山名早人

機械振興 32 ( 8 ) 61 - 67 1999.08

CiNii
User Support on Narrowing Retrieval using the Unlimited Speculative Search Service.

山名早人, 小池汎平, 児玉祐悦, 坂根広史, 山口喜教

人工知能学会知識ベースシステム研究会資料 43rd ( 43 ) 93 - 98 1999.03

CiNii J-GLOBAL
Speculative Control/Data Dependence Graph and Java Jog-time Analyzer : A Preliminary Evaluation for Java Virtual Accelerator

KOIKE HANPEI, YAMANA HAYATO, YAMAGUCHI YOSHINORI

Transactions of Information Processing Society of Japan 40 ( 1 ) 32 - 41 1999.02

The authors are investigating the feasibility of the Java Virtual Accelerator, a run-time parallelizing interpreter/JIT compiler system which speeds up Java execution through parallelizing emulation. To realize parallelizing emulation, automatic extraction of parallelism from sequential binary programs is important. We developed the "speculative control-data dependence graph" model to relieve the control and data dependence constraints inherent in sequential programs. A speculative control-data dependence graph is constructed by measuring the prediction rate for both control and data values during a test run, and replacing highly predictable arcs with predict-confirm nodes. The Java Jog-time Analyzer (JJA) was developed for experiments with this model. JJA analyzes control and data dependences statically while class files are loaded, and its intermediate code interpreter invokes data and branch prediction modules and gathers run-time statistics every time a basic block boundary is crossed. Run-time statistics such as block execution counts, prediction rates, critical path execution time, and average parallelism, as well as a plot of the dependence graphs, are shown at the end of the execution. In this paper, several experimental results with JJA are shown.

CiNii J-GLOBAL
Evaluation of Communication Mechanisms for Distributed Memory Parallel Computers in Wavefront Computation

SAKANE Hirofumi, KODAMA Yuetsu, TATEBE Osamu, KOIKE Hanpei, YAMANA Hayato, YAMAGUCHI Yoshinori, YUBA Yoshitsugu

Transactions of Information Processing Society of Japan 40 ( 5 ) 2281 - 2292 1999

In this paper, we discuss efficient parallel execution of a dense-matrix problem, considering the trade-offs between fine-grain and coarse-grain communication on distributed-memory machines. The solution of a triangular system of equations involves data dependencies between consecutive iterations of the outer loop. The dependencies can be naturally resolved and processed in parallel by wavefront computation. Two ways of parallelizing are presented: the element-wise fine-grain approach and the coarse-grain approach. We implemented these algorithms on both the EM-X and the AP1000+. The fine-grain support mechanisms of the EM-X had a great effect on the performance of the element-wise method for relatively small problem sizes, while the RISC processors employed in the AP1000+ gave the coarse-grain method high performance for larger sizes.

CiNii
Preliminary Evaluation of the Unlimited Speculative Search Service on Parallel Computers.

山名早人, 小池汎平, 児玉祐悦, 坂根広史, 山口喜教

情報処理学会シンポジウム論文集 99 ( 6 ) 216 1999

J-GLOBAL
Distributed WWW robot experiment.

山名早人

機械振興 32 ( 8 ) 61 - 67 1999

J-GLOBAL
Encyclopedia of Business Administration, 2nd Edition

中央経済社 1999
Research and Development of a Wide-Area Distributed Cooperative Search Robot for the Internet

IPA第18回技術発表会論文集／情報処理振興事業協会 18 71 - 78 1999
Experimental Status of the Distributed WWW Robot and Future Issues

インターネットコンファレンス99論文集／日本ソフトウェア科学会 14, p.141 1999
Design of Automatic Parallelizing Intermediate Code Interpreter

KOIKE HANPEI, YAMANA HAYATO, YAMAGUCHI YOSHINORI

40 ( SIG10 ) 64 - 74 1999

In this paper, we discuss the design of an intermediate code interpreter that executes a sequential program in parallel using speculation. We propose software techniques that enable efficient parallel speculative execution without hardware support, such as a checkpoint execution mechanism that establishes an appropriate parallel execution granularity, and an efficient implementation of speculative memory operations that minimizes the overhead of searching, recording, and mutual exclusion. Experimental results showing the basic performance of these techniques are also presented. From the experiments, we confirmed that a speculative intermediate code interpreter that achieves speedup can be implemented by adopting the software techniques described in this paper.

CiNii
An Attempt at a Distributed WWW Robot that Collects All Domestic WWW Data in 24 Hours

山名早人, 田村健人, 森英雄, 黒田洋介, 西村英樹, 浅井勇夫, 楠本博之, 篠田陽一, 村岡洋一

Proceedings of NORTH Internet Symposium 1999 1999

J-GLOBAL
Fast speculative search engine on the highly parallel computer EM-X

Hayato Yamana, Hanpei Koike, Yuetsu Kodama, Hirofumi Sakane, Yoshinori Yamaguchi

SIGIR Forum (ACM Special Interest Group on Information Retrieval) 390 - 390 1998.12

DOI
New trends of information retrieval in multimedia and Internet. Globally distributed cooperative search robot for Internet.

山名早人

Computer Today 15 ( 5 ) 4 - 9 1998.09

CiNii J-GLOBAL
A Study of Adopting the unlimited Speculative Execution on Multigrain Parallelizing Compilers

YAMANA Hayato, KOIKE Hanpei, KODAMA Yuetsu, SAKANE Hirohumi, YAMAGUCHI Yoshinori

IPSJ SIG Notes 98 ( 70 ) 19 - 24 1998.08

This paper discusses the effectiveness of unlimited speculative execution and how to adopt the scheme in multigrain parallelizing compilers. Multigrain parallelizing compilers exploit parallelism among coarse-grain tasks such as loops, medium-grain tasks such as loop iterations, and near-fine-grain tasks such as statements. When the unlimited speculative execution scheme is adopted in multigrain parallelizing compilers, code that is not parallelized because of memory ambiguity or control dependences can be parallelized. In this paper, loops are classified into nine categori...

CiNii J-GLOBAL
A Study of Unlimited Speculative Execution on Multigrain Parallel Processing

YAMANA Hayato, KOIKE Hanpei, KODAMA Yuetsu, SAKANE Hirofumi, YAMAGUCHI Yoshinori

全国大会講演論文集 56 ( 1 ) 297 - 298 1998.03

CiNii J-GLOBAL
New Trends in Information Retrieval

山名早人

1998

CiNii
Performance Evaluation for Building a Distributed Web Robot

山名早人

第9回データ工学ワークショップ (DEWS'98), March 1998

CiNii
Speculative Control-Data Dependence Graph and Java Jog-time Analyzer-A Step toward Java Virtual Accelerator.

小池汎平, 山名早人, 山口喜教

情報処理学会シンポジウム論文集 98 ( 7 ) 1998

J-GLOBAL
WWW Information Collection by Distributed Robots

山名早人

第9回データ工学ワークショップ(DEWS'98), 電子情報通信学会データ工学専門委員会 1998

CiNii
A Survey of World Wide Web Search Engines

Yamana Hayato

コンピュータソフトウェア 14 ( 5 ) 503 - 510 1997.09

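As of January 1997, about 830,000 organizations and about 16 million computers around the world are connected to the Internet, and more than 100 million pages of information, ranging from academic papers to hobbies, are published from WWW servers. To make effective use of this vast amount of information, it is essential to find the pages that contain the needed information instantly and accurately. WWW information retrieval services that provide this function began to appear around 1994, and there are now more than 100 of them. This article describes the current state of WWW information retrieval services and their problems.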

CiNii J-GLOBAL
Parallel Execution of Radix Sort Programs Using Fine-grain Communication

KODAMA Yuetsu, SAKANE Hirofumi, SATO Mitsuhisa, YAMANA Hayato, SAKAI Shuichi, YAMAGUCHI Yoshinori

Transactions of Information Processing Society of Japan 38 ( 9 ) 1726 - 1735 1997.09

EM-X is a highly parallel computer with distributed memory. It supports fine-grain communication with a fixed size of two words on the instruction execution pipeline. It achieves high communication throughput by overlapping remote memory access with thread execution, and tolerates communication latency by rapidly switching threads. We developed an 80-processor EM-X system and are evaluating its architectural features on it. In this paper, we execute radix sort programs to evaluate the parallel performance of the EM-X and compare the results with those of other parallel computers. The results show that fine-grain communication achieves very good scalability, while coarse-grain message passing decreases performance on a large number of processors because of network contention.

CiNii
Parallelization and Performance Evaluation of Sparse Matrix Computation in The EM-X Multiprocessor

SATO Mitsuhisa, KODAMA Yuetsu, SAKANE Hirofumi, YAMANA Hayato, SAKAI Shuichi, YAMAGUCHI Yoshinori

Transactions of Information Processing Society of Japan 38 ( 9 ) 1761 - 1770 1997.09

In this paper, we describe the parallelization of a sparse matrix computation, the CG (Conjugate Gradient method) kernel taken from the NAS parallel benchmark suite, for the EM-X multiprocessor. The dataflow mechanism of the EM-X supports fine-grain communication very efficiently, providing low-latency communication and a flexible message-passing facility. We compare the performance of sparse matrix-vector multiplication using complete exchange communication, element-wise remote update, and element-wise remote read with multithreading. The measurements taken on the EM-X indicate the effectiveness of fine-grain communication, which enables efficient element-wise access. Fine-grain communication is effective when the problem size per PE becomes small in large-scale multiprocessor systems. The complete exchange version suffers from its bandwidth limitation, and the performance of the element-wise remote read version is degraded by the overhead of context switching for multithreading.

CiNii J-GLOBAL
Using the Unlimited Speculative Execution on WWW Information Retrieval

YAMANA Hayato, KOIKE Hanpei, KODAMA Yuetsu, TODA Kenji, YAMAGUCHI Yoshinori

IEICE technical report. Computer systems 97 ( 226 ) 69 - 74 1997.08

This paper explains how to use the Unlimited Speculative Execution scheme to accelerate information retrieval on the World Wide Web. The Unlimited Speculative Execution scheme uses lightly loaded processors to speculatively execute tasks whose initiation has not yet been decided. The goal of this research is to present a fast WWW information retrieval system based on the Unlimited Speculative Execution scheme. As the platform, we use the EM-X parallel computer, which consists of 80 processors.

CiNii J-GLOBAL
Developing information industry. Trends of WWW information retrieval service.

山名早人

機械振興 30 ( 8 ) 54 - 63 1997.08

CiNii J-GLOBAL
Attractiveness of the Internet.

山名早人

CIAJ Journal (Communications and Information Network Association of Japan) 37 ( 3 ) 1997

J-GLOBAL
Fine-grain parallel processing of a dense-matrix problem on EM-X-Efficient execution of wavefront parallelism.

坂根広史, 児玉祐悦, 小池汎平, 佐藤三久, 山名早人, 坂井修一, 山口喜教

並列処理シンポジウム論文集 1997 1997

J-GLOBAL
Superspeed computer application technology for elucidating complicated phenomena.

関口智嗣, 佐藤三久, 山名早人

国立機関原子力試験研究成果報告書 36(1995) 1997

J-GLOBAL
Experience with fine-grain communication in EM-X multiprocessor for parallel sparse matrix computation

M Sato, Y Kodama, H Sakane, H Yamana, S Sakai, Y Yamaguchi

11TH INTERNATIONAL PARALLEL PROCESSING SYMPOSIUM, PROCEEDINGS 242 - 248 1997

Sparse matrix problems require a communication paradigm different from those used in conventional distributed-memory multiprocessors. We present in this paper how fine-grain communication can help obtain high performance on the experimental distributed-memory multiprocessor EM-X, developed at ETL, which can handle fine-grain communication very efficiently. The sparse matrix kernel Conjugate Gradient (CG) is selected for the experiments. Among the steps in CG, we focus on the sparse matrix-vector multiplication. Several communication methods are developed for performance comparison, including coarse-grain and fine-grain implementations. Fine-grain communication allows exact data access in an unstructured problem, reducing the amount of communication. While CG presents bottlenecks in the form of a large number of fine-grain remote reads, the multithreaded principle of execution is designed to tolerate such latency. Experimental results indicate that the performance of the fine-grain read implementation is comparable to that of the coarse-grain implementation on 64 processors. The results demonstrate that fine-grain communication can be a viable and efficient approach to unstructured sparse matrix problems on large-scale distributed-memory multiprocessors.
Fine-grain multithreading with the EM-X multiprocessor

Andrew Sohn, Yuetsu Kodama, Jui Ku, Mitsuhisa Sato, Hirofumi Sakane, Hayato Yamana, Shuichi Sakai, Yoshinori Yamaguchi

Annual ACM Symposium on Parallel Algorithms and Architectures 189 - 198 1997.01

Multithreading aims to tolerate latency by overlapping communication with computation. This report explicates the multithreading capabilities of the EM-X distributed-memory multiprocessor through empirical studies. The EM-X provides hardware supports for fine-grain multithreading, including a by-passing mechanism for direct remote reads and writes, hardware FIFO thread scheduling, and dedicated instructions for generating fixed-sized communication packets. Bitonic sorting and Fast Fourier Transform are selected for experiments. Parameters that characterize the performance of multithreading are investigated, including the number of threads, the number of thread switches, the run length, and the number of remote reads. Experimental results indicate that the best communication performance occurs when the number of threads is two to four. FFT yielded over 95% overlapping due to a large amount of computation and communication parallelism across threads. Even in the absence of thread computation parallelism, multithreading helps overlap over 35% of the communication time for bitonic sorting.

DOI
Performance Evaluation for a Matrix Operation Benchmark on EM-X Multiprocessor

SAKANE HIROFUMI, KODAMA YUETSU, SATO MITSUHISA, YAMANA HAYATO, SAKAI SHUICHI, YAMAGUCHI YOSHINORI

IPSJ SIG Notes 96 ( 80 ) 239 - 244 1996.08

In this paper, we discuss an implementation of the LINPACK benchmark parallelized on the EM-X multiprocessor and evaluate its performance, focusing on the floating-point operations, in which a regular repetitive pattern occurs. It is important to overlap communication and calculation as much as possible, and to consider the relationship between the broadcast algorithms and load balancing. Exploiting the potential for reducing the number of memory accesses and adopting the multi-column simultaneous elimination technique, we also further accelerated the innermost loop code we had already reported for optimization on a...

CiNii J-GLOBAL
Message-based efficient remote memory access on a highly parallel computer EM-X

Y Kodama, H Sakane, M Sato, H Yamana, S Sakai, Y Yamaguchi

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E79D ( 8 ) 1065 - 1071 1996.08

Communication latency is central to multiprocessor design. This study presents the design principles of the EM-X distributed-memory multiprocessor for tolerating communication latency. The EM-X overlaps computation with communication for latency tolerance by multithreading. In particular, we present two types of hardware support for remote memory access: (1) priority-based packet scheduling for thread invocation, and (2) direct remote memory access. The priority-based scheduling policy extends a FIFO-ordered thread invocation policy to adapt to different computational needs. The direct remote memory access is designed to overlap remote memory operations with thread execution. The 80-processor prototype of the EM-X was developed and has been operational since December 1995. We execute several programs on the machine and evaluate how effectively the EM-X overlaps computation with communication to tolerate communication latency for high-performance parallel computing.

CiNii
Enjoy the WWW!

YAMANA Hayato

The Journal of the Institute of Electronics, Information, and Communication Engineers 79 ( 1 ) 65 - 67 1996.01

CiNii J-GLOBAL
Parallel execution of radix sort program on a highly parallel computer EM-X.

児玉祐悦, 坂根広史, 佐藤三久, 山名早人, 坂井修一, 山口喜教

並列処理シンポジウム論文集 1996 1996

J-GLOBAL
Application technology of superspeed computer for elucidation of complicated phenomena.

関口智嗣, 佐藤三久, 山名早人

国立機関原子力試験研究成果報告書 35(1994) 1996

J-GLOBAL
Survey of Speculative Execution and the Effect of Task-level Speculation

Yamana Hayato, Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirohumi, Sakai Shuichi, Yamaguchi Yoshinori

全国大会講演論文集 51 ( 6 ) 75 - 76 1995.09

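We report a survey of speculative execution covering 1994 through July 1995, and show the effectiveness of the inter-task speculative execution that we have proposed. For the survey up to 1994, please refer to reference [2]. Table 1 lists the surveyed papers, and Figure 1 shows the recent trend in the number of papers on speculative execution. As Figure 1 shows, papers on speculative execution have increased rapidly since around 1991, when VLIW and superscalar processors began to appear. These studies fall into three categories: (1) studies of the instruction-level parallelism inherent in programs, (2) speculative execution on superscalar/VLIW processors, and (3) speculative execution on parallel computers. In the early 1990s, papers in category (1) were the most common, but papers in category (2) then increased rapidly. Among the 1994-95 papers, those on branch prediction and predicated execution account for 70% of the total, while papers on architectural implementation methods, which were common in 1989-93, have decreased sharply. This report focuses on branch prediction and predicated execution, which are currently the hottest topics.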

CiNii J-GLOBAL
Decreasing the Control Overhead of the Unlimited Speculative Execution

Yamana Hayato, Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirohumi, Sakai Shuichi, Yamaguchi Yoshinori

IPSJ SIG Notes 95 ( 80 ) 33 - 40 1995.08

This paper discusses how to decrease the control overhead of speculative task execution on multiprocessors. First, we implemented unlimited speculative execution on the EM-4 multiprocessor. Second, we classified the overhead by source. After measuring each class of overhead, we confirmed that neither the broadcast latency nor the overhead of initiating tasks is a major factor. Instead, the overhead of receiving and manipulating broadcast control data is the major factor. When the factor is decreased by 1/4, the speedup ratio increases up to 3 and we will have 1...

CiNii J-GLOBAL
Parallelization and Performance of Sparse Matrix Computation in The EM-X Multiprocessor

Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirofumi, Yamana Hayato, Sakai Shuichi, Yamaguti Yoshinori

IPSJ SIG Notes 95 ( 80 ) 209 - 216 1995.08

In this paper, we describe the parallelization of a sparse matrix computation, CG (Conjugate Gradient method) kernel taken from NAS parallel benchmark suite, for the EM-X multiprocessor. EM-X is a distributed-memory multiprocessor. Dataflow mechanism of EM-X supports fine-grain communication very efficiently, which provides low latency communication, and flexible message-passing with direct remote memory access. We compare the performance of sparse matrix vector multiplications by the complete exchange communication and by the element-wise remote memory access with multithreading. The measu...

CiNii J-GLOBAL
A Distributed Control Scheme of Macrotask-level Speculative Execution on the EM-4 Multiprocessor

Yamana Hayato, Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirohumi, Sakai Shuichi, Yamaguchi Yoshinori

Transactions of Information Processing Society of Japan 36 ( 7 ) 1578 - 1588 1995.07

The purpose of this paper is to propose a new fast control scheme for macrotasks with speculation. A macrotask is a coarse-grain task that serves as the unit of speculation. Previous work has reported that the speedup ratio is 12 to 630 times compared with conventional execution schemes without speculation when both the speculation depth and the computational resources are infinite, which is called the oracle model. The control overhead, however, prevents the speedup from attaining the theoretical speedup ratio. Thus, a control scheme with small overhead is desired. The distributed control scheme a...

CiNii
A macrotask-level unlimited speculative execution on multiprocessors

Hayato Yamana, Mitsuhisa Sato, Yuetsu Kodama, Hirofumi Sakane, Shuichi Sakai, Yoshinori Yamaguchi

Proceedings of the International Conference on Supercomputing Part F129361 328 - 337 1995.07

The purpose of this paper is to propose a new fast execution scheme for FORTRAN programs. The proposed scheme enables fast initiation of a macrotask when its data dependences are satisfied, even if the control flow has not yet reached it. Previous schemes for parallelizing a program that includes conditional branches have a number of problems: 1) although the theoretical speedup ratio is up to N when N conditional branches are jumped on either a VLIW or a superscalar machine, N is limited by the number of ALUs on a chip; 2) since conventional control schemes use a few processors to control macrotasks, the overhead of controlling them is large. The proposed scheme solves these problems: 1) it enables speculative execution between coarse-grain tasks, i.e., macrotasks, on multiprocessors by jumping many conditional branches; 2) a distributed control scheme is proposed and implemented on the EM-4 multiprocessor to decrease the control overhead of macrotasks. Preliminary evaluations show that the control overhead of the proposed scheme is smaller than that of the other control schemes. Moreover, it is confirmed that the distributed control can be implemented in software when the average macrotask execution time is larger than 14.4 μs on the EM-4 multiprocessor.

DOI
A SPECULATIVE EXECUTION SCHEME OF MACROTASKS FOR PARALLEL-PROCESSING SYSTEMS

H YAMANA, T YASUE, Y ISHII, Y MURAOKA

SYSTEMS AND COMPUTERS IN JAPAN 26 ( 6 ) 1 - 15 1995.06

DOI
A Method for Calculating the Execution Time of Singly Nested Doacross Loops on Distributed Shared-Memory Parallel Computers

山名,安江, 村岡,山口

電子情報通信学会論文誌 J78-D-1/2 170 - 178 1995
Parallelization and Performance of Sparse Matrix Computation in The EM-X Multiprocessor.

佐藤三久, 児玉祐悦, 坂根広史, 山名早人, 坂井修一, 山口喜教

情報処理学会研究報告 95 ( 80(ARC-113) ) 1995

J-GLOBAL
The EM-X parallel computer: Architecture and basic performance

Y KODAMA, H SAKANE, M SATO, H YAMANA, S SAKAI, Y YAMAGUCHI

22ND ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, PROCEEDINGS 14 - 23 1995

Latency tolerance is essential in achieving high performance on parallel computers for remote function calls and fine-grained remote memory accesses. EM-X supports interprocessor communication on an execution pipeline with small and simple packets. It can create a packet in one cycle, and receive a packet from the network in the on-chip buffer without interruption. EM-X invokes threads on packet arrival, minimizing the overhead of thread switching. It can tolerate communication latency by using efficient multi-threading and optimizing packet flow of fine grain communication. EM-X also supports the synchronization of two operands, direct remote memory read/write operations and flexible packet scheduling with priority. This paper describes distinctive features of the EM-X architecture and reports the performance of small synthetic programs and larger more realistic programs.

DOI
An Evaluation of Doacross among Loops on the EM-4 Multiprocessor

YAMANA Hayato, SATO Mitsuhisa, KODAMA Yuetsu, SAKANE Hirohumi, SAKAI Shuichi, YAMAGUCHI Yoshinori

全国大会講演論文集 48 ( 6 ) 19 - 20 1994.03

　View Summary

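Doacross [Cytr86] and Pipelining [PaKL80] have been proposed as schemes for executing loops other than Doall loops on parallel computers. However, these schemes were originally designed for tightly coupled parallel computers, and they cannot deliver sufficient performance on loosely coupled parallel computers, in which processors exchange data by message passing. This is due to the following problems, where N denotes the number of loop iterations. (1) In Doacross, the interprocessor communication delay is added to the total execution time (N-1) times, so no speedup is obtained unless this delay is sufficiently small. (2) In Pipelining, the execution time Ts of each statement is added to the total execution time (N-1) times, so no speedup is obtained unless Ts is sufficiently small. On a loosely coupled parallel computer, processors exchange data by message passing, so it is difficult to make the communication delay small. Moreover, Ts includes the time for data input/output with other processors, so it is also difficult to make Ts small. In contrast, the Doacross among loops proposed in this report reduces the effect that the interprocessor communication delay and the execution time Ts of each statement have on the total execution time. ...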

CiNii J-GLOBAL
Optimization of network interface in a processing element for a parallel computer EM-X

Sakane Hirofumi, Kodama Yuetsu, Sato Mitsuhisa, Yamana Hayato, Sakai Shuichi, Yamaguchi Yoshinori

IPSJ SIG Notes 94 ( 13 ) 105 - 112 1994.01

　View Summary

This paper discusses some of the design parameters for the network interface of a single chip processor EMC-Y to achieve high throughput and high performance. We are currently designing the EMC-Y, a processing element of a parallel computer EM-X. The design parameters include the arbitration method in the network switch, memory access priority and the size of internal FIFOs. To optimize parallel execution performance, we have examined the parameters of the network interface by using a register transfer level simulator of the EM-X.

CiNii J-GLOBAL
An Eager Evaluation Scheme among Macrotasks for Parallel Processing Systems

山名,安江, 石井,村岡

電子情報通信学会論文誌 J77-D-1/5 343 - 353 1994
Automatic Optimization of the Doacross among Loops Scheme on the EM-X Parallel Computer

山名早人

情報処理学会研究報告書 106 17 - 24 1994

CiNii
Automatic Tuning of Loop-Doacross Execution Scheme on the EM-4 Multiprocessor.

山名早人, 佐藤三久, 児玉祐悦, 坂根広史, 坂井修一, 山口喜教

情報処理学会研究報告 94 ( 50(ARC-106) ) 1994

J-GLOBAL
A Distributed Controlling Scheme of the Multistage Speculative Execution on the EM-4 Multiprocessor.

山名早人, 佐藤三久, 児玉祐悦, 坂根広史, 坂井修一, 山口喜教

並列処理シンポジウム論文集 1994 1994

J-GLOBAL
Survey of Today’s Speculative Execution Schemes and a Proposal of Unlimited Speculative Execution Scheme.

山名早人, 佐藤三久, 児玉祐悦, 坂根広史, 坂井修一, 山口喜教

情報処理学会研究報告 94 ( 66(ARC-107) ) 1994

J-GLOBAL
Fundamental performance evaluation of a processing element EMC-Y for a parallel computer.

坂根広史, 児玉祐悦, 佐藤三久, 山名早人, 坂井修一, 山口喜教

情報処理学会研究報告 94 ( 66(ARC-107) ) 1994

J-GLOBAL
The Current State of Speculative Execution and a Proposal of the Unlimited Speculative Execution Scheme

山名

情報処理学会研究報告 107 105 - 112 1994

CiNii
An Experimental Evaluation of the Multi-stage Speculative Execution Scheme on the EM-4 Multiprocessor

Yamana Hayato, Sato Mitsuhisa, Kodama Yuetsu, Sakane Hirohumi, Sakai Shuichi, Yamaguchi Yoshinori

IPSJ SIG Notes 93 ( 72 ) 105 - 112 1993.08

　View Summary

The purpose of this paper is to evaluate a new fast execution scheme for programs with speculative execution on the EM-4 multiprocessor. Conventional schemes for parallelizing programs that include conditional branches have some problems. The multi-stage speculative execution scheme (1) solves the side-effects problem, (2) decreases the number of processors needed to run the speculative execution scheme, (3) jumps multi-stage conditional branches, and (4) suits general multiprocessor systems. An experimental evaluation shows that the measured speedup ratio is 50% of the theoretical rat...

CiNii J-GLOBAL
An Implementation of Sparse BLAS-3 on a Distributed Memory Parallel Machine AP1000

Utino Satosi, Hagiwara Junichi, Yasue Toshiaki, Yamana Hayato, Muraoka Yoichi

全国大会講演論文集 46 ( 1 ) 59 - 60 1993.03

　View Summary

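We parallelized BLAS-3 (Basic Linear Algebra Subroutines Level 3) for sparse matrices and implemented it on the Fujitsu AP1000 distributed-memory parallel computer. BLAS-3 consists of the following subroutines: matrix products (GEMM, SYMM); rank-k and rank-2k updates of symmetric matrices (SYRK, SYR2K); products with triangular matrices (TRMM); and the solution of systems of linear equations with a triangular coefficient matrix and multiple right-hand sides (TRSM). Of these, the five other than TRSM have matrix-matrix multiplication as their main operation, so the method used to parallelize sparse matrix multiplication is central to the implementation. Compared with parallelizing dense matrix multiplication, parallelizing sparse matrix multiplication raises the following problems. 1. In computing the sparse matrix product C ← C + AB, updating C takes time because sparse matrices are stored in compressed form. 2. When the matrices are partitioned among cells by the method described later, the submatrices held by the cells (PEs) differ in size, so the memory needed to store them and the communication volume are unbalanced. This paper therefore proposes an execution order for the computation that resolves problem 1, and a communication method that resolves problem 2 when the computation is executed in that order. We then evaluate the approach using a GEMM routine (product of nonsymmetric matrices) implemented on the basis of the proposed method. The sparse matrices are assumed to be stored under the following conditions. To keep the program general-purpose ...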

CiNii J-GLOBAL
An Optimal Data Transfer Order of Doacross Loops

YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 46 ( 6 ) 35 - 36 1993.03

　View Summary

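This report determines the data communication order that minimizes the execution time of DOACROSS loops. Previous research on DOACROSS loop execution has used the execution time of arithmetic instructions (hereafter, computation time) and the latency of data communication (hereafter, communication latency) as the parameters that represent processor capability. However, when a DOACROSS loop is executed on a multiprocessor that can process computation and communication in parallel, the communication pitch must be considered in addition to these parameters. The communication pitch is the time interval between successive data inputs/outputs between a processor and the interconnection network. When the communication pitch is larger than the interval at which data communications occur, communication becomes the bottleneck of the total execution time. This is because data communications cannot be started (hereafter, issued) at intervals shorter than the communication pitch, so their issue is delayed. In this case, the actual execution time becomes larger than the conventional theoretical value. We show that, in such cases, the execution time can be reduced by changing the communication order instead of sending data to other processors in definition order.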

CiNii J-GLOBAL
A Macro-tasking Scheme for Eager Evaluation

YAMANA Hayato, YASUE Toshiaki, ISHII Yoshihiko, MURAOKA Yoichi

全国大会講演論文集 46 ( 6 ) 37 - 38 1993.03

　View Summary

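This report describes how macrotasks are constructed in our previously proposed eager evaluation scheme among macrotasks. Eager evaluation is a scheme that proceeds with execution beyond conditional branch statements in a program. Macrotask generation has two goals: (1) avoiding the side-effect problem caused by double definition of variables, and (2) reducing the number of processors needed for speculative execution. Side effects of eager evaluation arise when the same data is defined twice during eager evaluation. In this paper, to avoid double definition, macrotasks are constructed so that the data dependences on each macrotask are unique regardless of control. Next, to reduce the number of processors used at run time, macrotask generation uses the relationship between data dependence and control dependence, and forms a single macrotask from any portion in which fusing macrotasks does not lose the benefit of eager evaluation. The method is novel in that it takes data dependence into account, whereas conventional macrotask generation methods considered only control dependence.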

CiNii J-GLOBAL
A flow‐executing scheme for DOACROSS loops on dynamic dataflow machines

Yoshihiko Ishii, Hayato Yamana, Toshiaki Yasue, Yoichi Muraoka

Systems and Computers in Japan 24 ( 4 ) 1 - 12 1993

DOI
The inner expression of HAREDAS: The compiler development environment for multi-architecture compiler for massive parallel computing

YASUE Toshiaki, KANEKO Masanori, HAGIWARA Junichi, TAHARA Ayumu, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 45 ( 5 ) 335 - 336 1992.09

　View Summary

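HAREDAS is a development environment aimed at developing a massively parallelizing multi-architecture compiler. This paper describes the internal representation of HAREDAS and how eager evaluation is represented on it. One approach to massive parallelization is to use eager evaluation to extract the parallelism implicit in existing languages. Eager evaluation is a speedup technique that removes precedence constraints other than data dependences by changing the control dependences in a program. However, conventional eager evaluation has been realized only as an auxiliary means of compensating for insufficient parallelism in instruction-level scheduling. Because HAREDAS can handle eager evaluation generally at the internal-representation level, the parallelism that eager evaluation can expose can be used effectively. Section 2 describes the structure of HAREDAS. Section 3 explains the structure and features of the internal representation, and Section 4 details how eager evaluation is represented and manipulated.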

CiNii J-GLOBAL
A Description and its Application of the Data Dependence between Loops for : HAREDAS

KANEKO Masanori, YASUE Toshiaki, HAGIWARA Junichi, TAHARA Ayumu, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 45 ( 5 ) 337 - 338 1992.09

　View Summary

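This paper describes how inter-loop dependences are described in the multi-architecture compiler development environment HAREDAS, and gives application examples. Dependence vectors have conventionally been used to characterize data dependences in a program. However, dependence vectors describe data dependences within a single loop or outside loops, and no general notation had been defined for characterizing data dependences between loops. This paper defines inter-loop dependence vectors and describes a method for describing data dependences between loops. It also shows that inter-loop dependence vectors make it easier than conventional methods to determine whether loops can be fused, and describes how to apply them to fuse loops without losing the parallelism of each loop.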

CiNii J-GLOBAL
A Distributed Control Scheme of the Multi-stage Jumping Execution of Conditional Branches for Macrotasks

YAMANA Hayato, YASUE Toshiaki, ISHII Yoshihiko, MURAOKA Yoichi

全国大会講演論文集 45 ( 6 ) 121 - 122 1992.09

　View Summary

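This report proposes a distributed control method for macrotasks as an effective way to control macrotasks in the multi-stage speculative execution scheme using eager evaluation. Of the data dependences and control dependences in a program, the multi-stage speculative execution scheme starts executing a task, called a macrotask, once its data dependences are guaranteed, and later selects the macrotasks whose control has been resolved on the basis of control dependences. The issue in realizing this scheme on a real multiprocessor is reducing the various overheads incurred at run time. Starting execution before control is resolved gives rise to two run-time overheads: (1) overhead due to increased memory bandwidth, and (2) control overhead from controlling a large number of macrotasks. To reduce overhead (2), this paper proposes a macrotask control method that adds dedicated macrotask-control hardware to each processor and eliminates centralized control. Problem (1) is a macrotask scheduling problem and is left for future work.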

CiNii J-GLOBAL
An optimized vectorization scheme for multiply nested loops

SHINKAI Masashi, YASUE Toshiaki, KANEKO Masanori, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 44 ( 5 ) 93 - 94 1992.02

　View Summary

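To achieve optimal vectorization of multiply nested loops, this paper proposes a vectorization method based on two analysis policies: (1) making loops tightly nested starting from the inner loop, and (2) aggressive loop distribution. Conventional vectorization methods for multiply nested loops cannot achieve optimal vectorization because (1) they make loops tightly nested starting from the outer loop and therefore cannot distribute loops sufficiently, and (2) their evaluation of the gains and losses of loop distribution is incomplete. This paper proposes analysis methods that solve these problems and quantitatively evaluates the method on a real machine (Fujitsu VP2200).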

CiNii J-GLOBAL
A Control Scheme of Multistage Eager Evaluation for Multiprocessor System

ISHII Yoshihiko, YASUE Toshiaki, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 44 ( 6 ) 27 - 28 1992.02

　View Summary

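This paper describes the difference between control of eager evaluation over a single stage of tasks (single-stage eager evaluation control) and control of eager evaluation across multiple stages of tasks (multi-stage eager evaluation control) on multiprocessor systems. We have previously proposed single-stage and multi-stage eager evaluation control in the context of a concrete multiprocessor system (the parallel processing system Harray). In this paper, we express both forms of control in temporal logic and generalize them. We then use this temporal logic to reason about the differences between them. Through this reasoning, we also argue the correctness of the single-stage and multi-stage eager evaluation control that we have proposed for the concrete multiprocessor system.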

CiNii J-GLOBAL
A Parallelizing and Executing Scheme of FORTRAN Programs with Eager Evaluation

YAMANA Hayato, YASUE Toshiaki, MURAOKA Yoichi

全国大会講演論文集 44 ( 6 ) 29 - 30 1992.02

　View Summary

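This report proposes a program parallelization method and an execution scheme using eager evaluation, as a way to execute FORTRAN programs at high speed on multiprocessors. As methods for parallelizing programs that include conditional branches, methods that determine the earliest execution conditions of tasks and execution schemes that go beyond control dependences have been proposed. However, they have two problems: (1) determining the earliest execution conditions alone does not yield sufficient parallelism, and (2) the target programs are limited and no execution scheme is proposed. To address these problems, we have proposed a speculative execution scheme using flow graph unfolding and an n-stage eager evaluation control scheme for conditional branches using data-driven execution. This paper generalizes these methods and discusses the theoretical speedup.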

CiNii J-GLOBAL
An Execution Scheme for DO-loops on Distributed Memory Machines.

萩原純一, 安江俊明, 金子正教, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 92 ( 172(CPSY92 9-19) ) 1992

J-GLOBAL
A Scheme to Reduce the Access Rate to Shared Memory for Multiprocessor System.

山名早人, 村岡洋一

電子情報通信学会技術研究報告 92 ( 172(CPSY92 9-19) ) 1992

J-GLOBAL
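Muraoka Yoichi Laboratory, Department of Electronics and Communication Engineering, School of Science and Engineering, Waseda University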

田渕仁浩, 山名早人, 早稲田大学理工学部電子通信学科村岡洋一研究室・D3, 早稲田大学理工学部電子通信学科村岡洋一研究室・D3

人工知能学会誌 = Journal of Japanese Society for Artificial Intelligence 6 ( 3 ) 440 - 440 1991.05

CiNii
Parallel execution scheme of conditional branches with graph unfolding for the parallel processing system - Harray

Hayato Yamana, Toshiaki Yasue, Jun Kohdate, Yoichi Muraoka

Bulletin of Centre for Informatics (Waseda University) 12 8 - 18 1991.03

J-GLOBAL
A Parallel Execution Scheme of Conditional Branches and its Evaluation for the Parallel Processing System : Harray

YAMANA Hayato, YASUE Toshiaki, Kohdate Jun, MURAOKA Yoichi

全国大会講演論文集 42 ( 6 ) 60 - 61 1991.02

　View Summary

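This report discusses reducing program execution time by processing the conditional branches in a program in parallel, and presents a performance prediction for the conditional-branch parallel processing method on our proposed parallel processing system Harray. There have been many attempts to process conditional branches in parallel, mainly on VLIW computers. However, because these schemes do not target large-scale parallel computers, the number of eager evaluation stages across conditional branches is small, and the parallelism obtained is also small. In contrast, Harray has on the order of 1000 processing elements, so it increases the number of eager evaluation stages and extracts sufficient parallelism from the program. We have previously proposed flow graph unfolding as a method for increasing the number of eager evaluation stages. Flow graph unfolding does not synchronize at conditional branch points; it simultaneously executes the operations on all control flows that diverge according to whether the condition holds, and later selects the flow that becomes valid on the basis of control. Evaluations so far have confirmed a speedup of 1.5 to 5.2 times for the portions subject to flow graph unfolding. This paper first (1) shows the speedup from parallel processing of conditional branches using simulation results for several scientific computing programs, and then (2) presents the results of evaluating how much speedup from flow graph unfolding can be expected for a program as a whole.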

CiNii J-GLOBAL
A Control Scheme of Processing Element for the Parallel Processing System : Harray

ISHIZAKI Kazuaki, ISHII Yoshihiko, HAGIMOTO Takeshi, YAMANA Hayato, MURAOKA Yoichi

全国大会講演論文集 42 ( 6 ) 62 - 63 1991.02

　View Summary

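This report describes the control method within a processing element (PE) during speculative execution on the parallel processing system Harray. Speculative execution is a scheme that performs eager evaluation beyond control dependences, which are one of the factors that prevent parallel execution of programs. Eager evaluation in conventional pipelined processors and similar designs was performed within a single processor, so its scope was small. In Harray, speculative execution uses multiple processors and performs eager evaluation over multiple stages. The issue here is how to control the PEs involved, which span multiple PEs, when a branch is resolved. If control is centralized in one place, the overhead cannot be ignored with on the order of 1000 processors. We have therefore proposed a scheme in which execution control is distributed to each PE. This report first defines a control unit called an Activation Set as the unit of speculative execution. Next, it outlines the control method, independent for each PE, used during speculative execution with Activation Sets. It then presents the concrete processing procedure within a PE.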

CiNii J-GLOBAL
A Control Scheme of Processing Element for the Parallel Processing System. Harray.

石井吉彦, 石崎一明, 萩本猛, 山名早人, 村岡洋一

情報処理学会全国大会講演論文集 43rd ( 6 ) 1991

J-GLOBAL
Prototype FORTRAN Compiler for Parallel Processing System -Harray-.

安江俊明, 神舘淳, 山名早人, 村岡洋一

並列処理シンポジウム論文集 1991 1991

J-GLOBAL
A Parallel Execution Scheme of Conditional Branches Using Eager Evaluation for the Parallel Processing System-Harray.

山名早人, 石崎一明, 安江俊明, 村岡洋一

情報処理学会研究報告 91 ( 64(ARC-89) ) 1991

J-GLOBAL
A Scheme to Reduce the Access Rate to Shared Memory for the Parallel Processing System -Harray-.

山名早人, 大段智志, 村岡洋一

並列処理シンポジウム論文集 1991 1991

J-GLOBAL
Loop Parallelizing Scheme:Dependent-flow Loop.

金子正教, 中里倫明, 安江俊明, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 91 ( 130(CPSY91 4-33) ) 1991

J-GLOBAL
A Construction of Execution Unit for Parallel Processing System. Harray.

萩本猛, 山名早人, 村岡洋一

情報処理学会全国大会講演論文集 43rd ( 6 ) 1991

J-GLOBAL
A Network Construction of Parallel Processor for Eager Evaluation.

石崎一明, 安江俊明, 山名早人, 村岡洋一

情報処理学会研究報告 91 ( 64(ARC-89) ) 1991

J-GLOBAL
A Scheduling Scheme for the Parallel Processing System. Harray. A Task Restructuring for the Reduction of Inter-Processor Communication and Synchronization.

萩原純一, 安江俊明, 山名早人, 村岡洋一

情報処理学会全国大会講演論文集 43rd ( 6 ) 1991

J-GLOBAL
An environment for dataflow program development of parallel processing system‐harray

Hayato Yamana, Jun Kohdate, Toshiaki Yasue, Yoichi Muraoka

Systems and Computers in Japan 22 ( 8 ) 26 - 38 1991

　View Summary

This paper considers the dataflow program development environment for the system programmer who develops the compiler, and proposes a method to improve debugging efficiency. The conventional debugging methods are either (1) to monitor the packets in the dataflow ring, or (2) to identify the function containing a bug. The former has unsolved problems, such as determining when to start data monitoring and presenting a large amount of information to the user. The latter has the problem that debugging is impossible at the dataflow level. This paper aims to solve those problems; detailed debugging is executed in software, not on the real machine. Information is presented on a dataflow graph so that debugging information is shown systematically. As the development environment, the parallel processing system Harray proposed by the authors is considered. The proposed system employs a two-stage process in which the first step is to identify the macro-block (a task unit in Harray) containing the bug, and the second step is the detailed debugging of that macro-block. The debugging within the macro-block is executed in software, and debugging efficiency is improved by (1) diagram representation for easier visual recognition, and (2) a backward tracing function.

DOI
Dataflow program developing environment for the parallel processing system -Harray-.

安江俊明, 神舘淳, 山名早人, 村岡洋一

並列処理シンポジウム論文集 73 ( 6 ) p569 - 579 1990.06

CiNii J-GLOBAL
Implementation of color management scheme on data driven computer.

石井吉彦, 安江俊明, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 90 ( 143(CPSY90 12-37) ) 1990

J-GLOBAL
A construction of switching unit for parallel processing system -Harray-.

石崎一明, 山名早人, 村岡洋一

情報処理学会研究報告 90 ( 90(ARC-85) ) 1990

J-GLOBAL
The organization of global memory for the parallel processing system -Harray-.

山名早人, 片山啓, 草野義博, 村岡洋一

並列処理シンポジウム論文集 1990 1990

J-GLOBAL
A macro-block controlling scheme of parallel processing system. Harray.

山名早人, 安江俊明, 神館淳, 村岡洋一

電子情報通信学会技術研究報告 90 ( 143(CPSY90 12-37) ) 1990

J-GLOBAL
Evaluation of color management on parallel processing system -Harray-.

石井吉彦, 安江俊明, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 90 ( 11(CPSY90 1-4) ) 1990

J-GLOBAL
A method of function calls. parallel processing system -Harray-.

石崎一明, 神舘淳, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 90 ( 11(CPSY90 1-4) ) 1990

J-GLOBAL
A FORTRAN compiler for parallel processing system. Harray.

安江俊明, 神館淳, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 90 ( 143(CPSY90 12-37) ) 1990

J-GLOBAL
Parallel processing system -Harray-

H. Yamana, Y. Kusano, T. Yasue, J. Kohdate, T. Hagiwara, Y. Muraoka

Computing Systems in Engineering 1 ( 1 ) 111 - 130 1990

DOI
A Method of Color Management for Parallel Processing System -Harray-

Ishii Yoshihiko, Yasue Toshiaki, Yamana Hayato, Muraoka Yoichi

全国大会講演論文集 39 ( 3 ) 1798 - 1799 1989.10

　View Summary

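This paper proposes a color management scheme for the parallel processing system Harray [1]. Harray adopts dynamic data-driven execution within a processing unit called a macro-block [1]. Dynamic data-driven execution uses colors for loop processing. However, because colors are finite, resource management of colors is necessary. For color resource management, that is, reclaiming and reassigning colors, a conventional scheme proposes generating a new loop when colors overflow [2]. However, the overhead of loop generation is large. In addition, since machine resources are finite, using more colors than the machine resources allow cannot be expected to improve processing speed. In other words, it suffices to use a number of colors commensurate with the machine resources. Based on these points, this paper proposes a loop processing scheme that does not use more colors than necessary and reduces the overhead of reclaiming and reassigning colors. We first perform dataflow analysis on the loop body to find the required number of colors (denoted L). We then present a loop processing scheme that avoids color overflow and reclaims and reassigns colors limited by L. This report covers the case where L is no greater than the machine resources.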

CiNii J-GLOBAL
A Run-time Error Handling Scheme for Parallel Processing System -Harray-

Hagimoto Takeshi, Kusano Yoshihiro, Yamana Hayato, Muraoka Yoichi

全国大会講演論文集 39 ( 3 ) 1800 - 1801 1989.10

　View Summary

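We have proposed Harray (HARRAY: Hybrid ARRAY), a parallel processing system for scientific computing. Harray is a parallel processing system with 1024 processing elements that aims to execute programs written in FORTRAN for scientific computing at high speed. Harray's execution scheme is the CD flow scheme, in which a program is divided at compile time into units called macro-blocks, with control flow between macro-blocks and dataflow within them. For dataflow programs, we have confirmed that applying gate postponement, described later, improves execution speed by about a factor of three under the assumption of unlimited machine resources. Since machine resources are in fact finite, Harray applies gate postponement to the parts of a program where the parallelism is smaller than the machine resources, to speed up those parts. However, when a control gate exists in order to avoid a run-time error, applying gate postponement may cause a run-time error in the eagerly evaluated part. Such a run-time error is not caused by an error in the user's program, so it cannot be reported to the user. Therefore, when a run-time error that may be caused by gate postponement occurs, it is necessary to determine whether its cause is gate postponement or an error in the program. This paper, for the case where a run-time error that may be caused by gate postponement occurs, ...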

CiNii J-GLOBAL
A Structure Handling Scheme of Parallel Processing System -Harray-

Yamana Hayato, Kusano Yoshihiro, Muraoka Yoichi

全国大会講演論文集 39 ( 3 ) 1802 - 1803 1989.10

　View Summary

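This paper describes how the structure handling scheme [2] is realized in the parallel processing system Harray [1]. Harray adopts the CD flow (Controlled Dataflow) scheme [3] as its execution scheme. In the CD flow scheme, control-flow control is performed between processing units called macro-blocks, and dataflow execution is performed within macro-blocks. Dataflow execution has no concept of storage, but in building an actual computer, a structure memory for storing large structures is indispensable. For structure handling, I-structures [4] and other approaches have been proposed. However, these schemes strictly implement the single-assignment rule of the dataflow model and have the restriction that a value may be referenced multiple times but defined only once. Consequently, on a double definition the structure must be copied, which incurs overhead. For Harray, we have proposed a structure handling scheme [2] that allows multiple definitions and references to the structure memory (hereafter called global memory) and eliminates the copy overhead on double definition. In this scheme, access order is guaranteed for definitions and references that span multiple macro-blocks; definitions and references closed within a macro-block are not covered. This is because, when data defined in a macro-block is used within that same macro-block, the defined data ... This paper begins with the global memory of Harray and ...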

CiNii J-GLOBAL
A structure handling scheme of parallel processing system -Harray-.

山名早人, 草野義博, 村岡洋一

情報処理学会全国大会講演論文集 39th ( 3 ) 1989

J-GLOBAL
Visual environment for lower level program development of parallel processing system -Harray-.

安江俊明, 神舘淳, 萩原孝, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 89 ( 19(CPSY89 1-5) ) 1989

J-GLOBAL
Evaluation of unfolded flow graph for the parallel processing system -Harray-.

荻原孝, 山名早人, 神館純, 村岡洋一

情報処理学会研究報告 89 ( 30(ARC-75) ) 1989

J-GLOBAL
A controlled dataflow mechanism of parallel processing system - Harray.

山名早人, 草野義博, 神舘淳, 安江敏明, 村岡洋一

電子情報通信学会技術研究報告 89 ( 168(CPSY89 45-58) ) 1989

J-GLOBAL
Parallel processing system HARE.

萩原孝, 山名早人, 丸島敏一, 村岡洋一

BIT (Tokyo) 21 ( 4 ) 1989

J-GLOBAL
A compiling algorithm of parallel processing system - Harray with graph unfolding scheme.

神舘淳, 安江俊明, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 89 ( 168(CPSY89 45-58) ) 1989

J-GLOBAL
A PRECEDING ACTIVATION SCHEME WITH GRAPH UNFOLDING FOR THE PARALLEL PROCESSING SYSTEM HARRAY

H YAMANA, T HAGIWARA, J KOHDATE, Y MURAOKA

PROCEEDINGS : SUPERCOMPUTING 89 675 - 684 1989

　View Summary

The purpose of this work is to propose and evaluate the preceding activation scheme with graph unfolding, which translates a Fortran program into a dataflow graph and executes it efficiently. The problems in restructuring a Fortran program into a dataflow graph are that a Fortran program is not written under a single-assignment rule and that it has an explicit control flow. These problems result in little parallelism because many gate operations, such as T/F gates, are introduced in the dataflow graph to synchronize the data movement. Therefore, discarding these gate operations is the key to exposing parallelism in a Fortran program. The preceding activation scheme with graph unfolding is proposed to discard these gate operations. The result of the performance evaluation by the 'Harray' software simulator is presented. It is shown that the execution speed with the proposed scheme for flow graphs without backward branches is about 1.5 times as fast as that with the extended activation scheme, which initiates execution only after it is confirmed that a basic block will be selected at a conditional branch. Moreover, the execution speed is 2.7 times as fast as that with the extended activation scheme if a flow graph including backward branches is unfolded by the proposed scheme.
A Construction of Waiting Memory for Parallel Processing System -Harray-

Yamana Hayato, Kusano Yoshihiro, Hagiwara Takashi, Muraoka Yoichi

全国大会講演論文集 37 ( 1 ) 65 - 66 1988.09

　View Summary

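We have proposed the parallel processing system Harray, intended mainly for scientific computing. Harray incorporates dataflow execution to fully extract the parallelism inherent in programs. In dataflow execution, speeding up the waiting memory (WM), which governs node firing control, is key to speeding up the whole system. This paper describes the organization of the waiting memory WM used in the Harray prototype and gives a simple evaluation using a software simulator.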

CiNii J-GLOBAL
The Completion Detection of Macro-Block in Parallel Computer System -Harray-

Kusano Yoshihiro, Hagiwara Takashi, Yamana Hayato, Muraoka Yoichi

全国大会講演論文集 37 ( 1 ) 67 - 68 1988.09

　View Summary

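In Harray, the dataflow multiprocessor system for scientific computing that we have proposed, the concept of a macro-block is used to divide the tasks assigned to each processor element. A macro-block is a part of a program divided according to certain criteria, as in Fig. 1, and Harray assigns tasks to processor elements in units of macro-blocks. Inside a macro-block, computation proceeds under data-driven control so that parallelism is extracted naturally, and control-flow control is introduced between macro-blocks to form a hierarchical control structure. This approach compensates for drawbacks of data-driven control such as the increase in control instructions. However, various problems arise when tasks are assigned in units of macro-blocks. One of them is the need to detect macro-block completion quickly. This paper therefore describes a method for fast detection of macro-block completion and gives a simple evaluation.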

CiNii J-GLOBAL
A construction of processing element in a parallel processing system -Harray-.

山名早人, 丸島敏一, 草野義博, 村岡洋一

情報処理学会研究報告 88 ( 19(CA-70) ) 1988

J-GLOBAL
Evaluation of processing element in parallel computer system-Harray.

草野義博, 山名早人, 丸島敏一, 村岡洋一

情報処理学会全国大会講演論文集 36th ( 1 ) 1988

J-GLOBAL
Execution mechanism of parallel processing system -Harray-.

丸島敏一, 山名早人, 萩原孝, 草野義博, 村岡洋一

情報処理学会研究報告 88 ( 4(CA-69/MC-48) ) 1988

J-GLOBAL
Evaluation of a simulated parallel processing system - Harray - .

山名早人, 萩原孝, 草野義博, 村岡洋一

情報処理学会研究報告 88 ( 79(ARC-73) ) 1988

J-GLOBAL
A design concept of the compiler for parallel processing system-Harray-.

萩原孝, 山名早人, 村岡洋一

電子情報通信学会技術研究報告 88 ( 155 ) 1988

J-GLOBAL
EXPERIENCE USING THE RESTRUCTURING COMPILER PARAFRASE.

Toshikazu Marushima, Takashi Hagiwara, Hayato Yamana, Yoichi Muraoka

Bulletin of Centre for Informatics (Waseda University) 5 69 - 77 1987.03

J-GLOBAL
A construction of processing element in parallel scientific computer system -Harray-.

山名早人, 丸島敏一, 萩原孝, 村岡洋一

情報処理学会全国大会講演論文集 34th ( 1 ) 1987

J-GLOBAL
A scheme of macro blocking for parallel scientific computer system -Harray-.

萩原孝, 山名早人, 丸島敏一, 村岡洋一

情報処理学会全国大会講演論文集 34th ( 1 ) 1987

J-GLOBAL
RetweetReputation: A Bias-Free Method for Evaluating the Content of Twitter Posts

藤木紫乃, 矢野博也, 山名早人

DEIM2011 A10-3
Sequential Pattern Mining with Time Interval

Yu Hirate, Hayato Yamana

Proc. of PAKDD2006

DOI


Industrial Property Rights

Authentication system, authentication program, and authentication method

山名早人, 工藤雅士

Patent

J-GLOBAL
Abbreviation generation system

Japanese Patent No. 6135867

石川開, 土田正明, 大西貴士, 山名早人, 及川孝徳

Patent

J-GLOBAL
Memorization degree estimation device and memorization degree estimation program

Japanese Patent No. 6032638

山名早人, 苑田翔吾, 浅井洋樹

Patent

J-GLOBAL
Dictionary creation support device, dictionary creation support method, and dictionary creation support program

Japanese Patent No. 5648890

立石健二, 細見格, 山名早人

Patent

J-GLOBAL
Dictionary creation support device, dictionary creation support method, and dictionary creation support program

立石健二, 細見格, 山名早人

Patent

J-GLOBAL
Method for detecting fraudsters in network transactions

山名早人, 平出勇宇, 相吉澤明, 木戸冬子

Patent

J-GLOBAL


Syllabus

Bachelor Thesis A

School of Fundamental Science and Engineering

2026 spring semester
Computer Science and Engineering Laboratory B

School of Fundamental Science and Engineering

2026 spring semester
Computer Science and Engineering Laboratory A [S Grade]

School of Fundamental Science and Engineering

2026 fall semester
Computer Science and Engineering Laboratory B [S Grade]

School of Fundamental Science and Engineering

2026 spring semester
Computer Science and Engineering Laboratory A (2)

School of Fundamental Science and Engineering

2026 fall semester
Information Retrieval

School of Fundamental Science and Engineering

2026 fall semester
Advanced Information Communication Technology and Open Innovation

School of Fundamental Science and Engineering

2026 spring semester
IoT System Design

School of Fundamental Science and Engineering

2026 spring semester
Data Mining

School of Fundamental Science and Engineering

2026 spring semester
Project Research B

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis B (Spring Semester)

School of Fundamental Science and Engineering

2026 spring semester
Bachelor Thesis B

School of Fundamental Science and Engineering

2026 fall semester
Project Research A

School of Fundamental Science and Engineering

2026 spring semester
Bachelor Thesis A (Intensive Course)

School of Fundamental Science and Engineering

2026 intensive course (spring and fall)
Bachelor Thesis A [S Grade]

School of Fundamental Science and Engineering

2026 spring semester
Biology for Computer Science

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis A (Fall Semester)

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis B

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis B [S Grade]

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis A (Fall Semester)

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis A

School of Fundamental Science and Engineering

2026 spring semester
Bachelor Thesis B (Spring Semester)

School of Fundamental Science and Engineering

2026 spring semester
Communications and Computer Engineering Laboratory A [S Grade]

School of Fundamental Science and Engineering

2026 fall semester
Communications and Computer Engineering Laboratory A

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis A

School of Fundamental Science and Engineering

2026 spring semester
Bachelor Thesis A (Fall Semester)

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis A

School of Fundamental Science and Engineering

2026 spring semester
Communications and Computer Engineering Laboratory B

School of Fundamental Science and Engineering

2026 spring semester
Advanced Information Communication Technology and Open Innovation

School of Fundamental Science and Engineering

2026 spring semester
IoT System Design

School of Fundamental Science and Engineering

2026 spring semester
Project Research B

School of Fundamental Science and Engineering

2026 fall semester
Advanced Information Communication Technology and Open Innovation

School of Fundamental Science and Engineering

2026 spring semester
Information Retrieval

School of Fundamental Science and Engineering

2026 fall semester
Data Mining

School of Fundamental Science and Engineering

2026 spring semester
IoT System Design

School of Fundamental Science and Engineering

2026 spring semester
Project Research A

School of Fundamental Science and Engineering

2026 spring semester
Biology for Computer Science

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis B (Spring Semester)

School of Fundamental Science and Engineering

2026 spring semester
Bachelor Thesis B

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis A (Intensive Course)

School of Fundamental Science and Engineering

2026 intensive course (spring and fall)
Bachelor Thesis A (Fall Semester)

School of Fundamental Science and Engineering

2026 fall semester
Communications and Computer Engineering Laboratory B [S Grade]

School of Fundamental Science and Engineering

2026 spring semester
Graduation Thesis A (Fall)[For students enrolled before 2022]

School of Fundamental Science and Engineering

2026 fall semester
Graduation Thesis B (Spring) [S Grade]

School of Fundamental Science and Engineering

2026 spring semester
Graduation Thesis B (Fall) [S Grade]

School of Fundamental Science and Engineering

2026 fall semester
Graduation Thesis B (Fall)

School of Fundamental Science and Engineering

2026 fall semester
Computer Science and Communications Engineering Laboratory A [S Grade]

School of Fundamental Science and Engineering

2026 fall semester
Graduation Thesis B (Spring)

School of Fundamental Science and Engineering

2026 spring semester
Computer Science and Communications Engineering Laboratory A

School of Fundamental Science and Engineering

2026 fall semester
Project Research Fall

School of Fundamental Science and Engineering

2026 fall semester
Project Research Spring

School of Fundamental Science and Engineering

2026 spring semester
Operating Systems

School of Fundamental Science and Engineering

2026 spring semester
Computer Science and Communications Engineering Laboratory B

School of Fundamental Science and Engineering

2026 spring semester
Introduction to Computers and Networks

School of Fundamental Science and Engineering

2026 spring semester
Graduation Thesis A (Fall)

School of Fundamental Science and Engineering

2026 fall semester
Data Mining

School of Fundamental Science and Engineering

2026 spring semester
Information Retrieval

School of Fundamental Science and Engineering

2026 fall semester
Graduation Thesis A (Spring) [S Grade]

School of Fundamental Science and Engineering

2026 spring semester
Graduation Thesis A (Fall) [S Grade]

School of Fundamental Science and Engineering

2026 fall semester
Graduation Thesis A (Spring)

School of Fundamental Science and Engineering

2026 spring semester
Graduation Thesis A (Fall)[S Grade][For students enrolled before 2022]

School of Fundamental Science and Engineering

2026 fall semester
Graduation Thesis A (Spring)[For students enrolled before 2022]

School of Fundamental Science and Engineering

2026 spring semester
Graduation Thesis A (Spring)[S Grade][For students enrolled before 2022]

School of Fundamental Science and Engineering

2026 spring semester
Bachelor Thesis B [S Grade]

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis B

School of Fundamental Science and Engineering

2026 fall semester
Bachelor Thesis B (Spring Semester)

School of Fundamental Science and Engineering

2026 spring semester
Bachelor Thesis A [S Grade]

School of Fundamental Science and Engineering

2026 spring semester
IoT System Design

Graduate School of Fundamental Science and Engineering

2026 spring semester
Seminar on Parallel and Distributed Architecture B

Graduate School of Fundamental Science and Engineering

2026 fall semester
Seminar on Parallel and Distributed Architecture A

Graduate School of Fundamental Science and Engineering

2026 spring semester
Advanced Project Study(Autumn)

Graduate School of Fundamental Science and Engineering

2026 fall semester
Advanced Project Study(Spring)

Graduate School of Fundamental Science and Engineering

2026 spring semester
Advanced Information Communication Technology and Open Innovation

Graduate School of Fundamental Science and Engineering

2026 spring semester
Special Laboratory B in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2026 fall semester
Advances in Bioinformatics

Graduate School of Fundamental Science and Engineering

2026 fall semester
Special Laboratory A in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2026 spring semester
Information Retrieval

Graduate School of Fundamental Science and Engineering

2026 fall semester
Research on Bioinformatics

Graduate School of Fundamental Science and Engineering

2026 full year
Research on Parallel and Distributed Architecture

Graduate School of Fundamental Science and Engineering

2026 full year
Advanced Information Communication Technology and Open Innovation

Graduate School of Fundamental Science and Engineering

2026 spring semester
Seminar on Parallel and Distributed Architecture C

Graduate School of Fundamental Science and Engineering

2026 spring semester
Seminar on Bioinformatics D

Graduate School of Fundamental Science and Engineering

2026 fall semester
Seminar on Bioinformatics C

Graduate School of Fundamental Science and Engineering

2026 spring semester
Seminar on Bioinformatics B

Graduate School of Fundamental Science and Engineering

2026 fall semester
Seminar on Bioinformatics A

Graduate School of Fundamental Science and Engineering

2026 spring semester
Seminar on Parallel and Distributed Architecture D

Graduate School of Fundamental Science and Engineering

2026 fall semester
Seminar on Parallel and Distributed Architecture C

Graduate School of Fundamental Science and Engineering

2026 spring semester
Seminar on Parallel and Distributed Architecture B

Graduate School of Fundamental Science and Engineering

2026 fall semester
Seminar on Parallel and Distributed Architecture A

Graduate School of Fundamental Science and Engineering

2026 spring semester
Special Laboratory B in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2026 fall semester
Special Laboratory A in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2026 spring semester
Information Retrieval

Graduate School of Fundamental Science and Engineering

2026 fall semester
Research on Bioinformatics

Graduate School of Fundamental Science and Engineering

2026 full year
Research on Parallel and Distributed Architecture

Graduate School of Fundamental Science and Engineering

2026 full year
Seminar on Bioinformatics C

Graduate School of Fundamental Science and Engineering

2026 spring semester
Seminar on Bioinformatics B

Graduate School of Fundamental Science and Engineering

2026 fall semester
Seminar on Bioinformatics A

Graduate School of Fundamental Science and Engineering

2026 spring semester
Seminar on Bioinformatics D

Graduate School of Fundamental Science and Engineering

2026 fall semester
Seminar on Parallel and Distributed Architecture D

Graduate School of Fundamental Science and Engineering

2026 fall semester
IoT System Design

Graduate School of Creative Science and Engineering

2026 spring semester
Special Seminar B in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2026 fall semester
Research on Parallel and Distributed Architecture

Graduate School of Fundamental Science and Engineering

2026 full year
Research on Bioinformatics

Graduate School of Fundamental Science and Engineering

2026 full year
Special Seminar A in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2026 spring semester
IoT System Design

Graduate School of Advanced Science and Engineering

2026 spring semester
Intermediate Level of Client Side Web Programming 01

Global Education Center

2025 spring quarter
Applied Information Literacy in the Era of Finance Digital Transformation

Global Education Center

2025 summer quarter
Information Literacy Basics in the Era of Finance Digital Transformation

Global Education Center

2025 spring quarter
Bachelor Thesis A (Intensive Course)

School of Fundamental Science and Engineering

2025 intensive course (spring and fall)
Bachelor Thesis A

School of Fundamental Science and Engineering

2025 spring semester
Bachelor Thesis B [S Grade]

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis B

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis A [S Grade]

School of Fundamental Science and Engineering

2025 spring semester
Bachelor Thesis B (Spring Semester)

School of Fundamental Science and Engineering

2025 spring semester
Bachelor Thesis A (Fall Semester)

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis A

School of Fundamental Science and Engineering

2025 spring semester
Computer Science and Engineering Laboratory B [S Grade]

School of Fundamental Science and Engineering

2025 spring semester
Computer Science and Engineering Laboratory A [S Grade]

School of Fundamental Science and Engineering

2025 fall semester
Computer Science and Engineering Laboratory B

School of Fundamental Science and Engineering

2025 spring semester
Computer Science and Engineering Laboratory A (2)

School of Fundamental Science and Engineering

2025 fall semester
Data Mining

School of Fundamental Science and Engineering

2025 spring semester
Advanced Information Communication Technology and Open Innovation

School of Fundamental Science and Engineering

2025 spring semester
IoT System Design

School of Fundamental Science and Engineering

2025 spring semester
Information Retrieval

School of Fundamental Science and Engineering

2025 fall semester
Project Research B

School of Fundamental Science and Engineering

2025 fall semester
Project Research A

School of Fundamental Science and Engineering

2025 spring semester
Biology for Computer Science

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis B (Spring Semester)

School of Fundamental Science and Engineering

2025 spring semester
Bachelor Thesis A (Fall Semester)

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis B

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis A

School of Fundamental Science and Engineering

2025 spring semester
Communications and Computer Engineering Laboratory B [S Grade]

School of Fundamental Science and Engineering

2025 spring semester
Communications and Computer Engineering Laboratory B

School of Fundamental Science and Engineering

2025 spring semester
Communications and Computer Engineering Laboratory A

School of Fundamental Science and Engineering

2025 fall semester
Advanced Information Communication Technology and Open Innovation

School of Fundamental Science and Engineering

2025 spring semester
IoT System Design

School of Fundamental Science and Engineering

2025 spring semester
Advanced Information Communication Technology and Open Innovation

School of Fundamental Science and Engineering

2025 spring semester
Information Retrieval

School of Fundamental Science and Engineering

2025 fall semester
Data Mining

School of Fundamental Science and Engineering

2025 spring semester
Bachelor Thesis B

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis B (Spring Semester)

School of Fundamental Science and Engineering

2025 spring semester
Bachelor Thesis A (Intensive Course)

School of Fundamental Science and Engineering

2025 intensive course (spring and fall)
Biology for Computer Science

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis B (Spring Semester)

School of Fundamental Science and Engineering

2025 spring semester
Bachelor Thesis B [S Grade]

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis A [S Grade]

School of Fundamental Science and Engineering

2025 spring semester
Bachelor Thesis A (Fall Semester)

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis A (Fall Semester)

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis B

School of Fundamental Science and Engineering

2025 fall semester
Bachelor Thesis A

School of Fundamental Science and Engineering

2025 spring semester
Communications and Computer Engineering Laboratory A [S Grade]

School of Fundamental Science and Engineering

2025 fall semester
Graduation Thesis B (Spring) [S Grade]

School of Fundamental Science and Engineering

2025 spring semester
Computer Science and Communications Engineering Laboratory A [S Grade]

School of Fundamental Science and Engineering

2025 fall semester
Graduation Thesis B (Fall)

School of Fundamental Science and Engineering

2025 fall semester
Graduation Thesis B (Spring)

School of Fundamental Science and Engineering

2025 spring semester
Computer Science and Communications Engineering Laboratory A

School of Fundamental Science and Engineering

2025 fall semester
IoT System Design

School of Fundamental Science and Engineering

2025 spring semester
Project Research B

School of Fundamental Science and Engineering

2025 fall semester
Project Research A

School of Fundamental Science and Engineering

2025 spring semester
Project Research Spring

School of Fundamental Science and Engineering

2025 spring semester
Operating Systems

School of Fundamental Science and Engineering

2025 spring semester
Data Mining

School of Fundamental Science and Engineering

2025 spring semester
Computer Science and Communications Engineering Laboratory B

School of Fundamental Science and Engineering

2025 spring semester
Project Research Fall

School of Fundamental Science and Engineering

2025 fall semester
Introduction to Computers and Networks

School of Fundamental Science and Engineering

2025 spring semester
Graduation Thesis A (Fall) [S Grade]

School of Fundamental Science and Engineering

2025 fall semester
Graduation Thesis A (Spring)

School of Fundamental Science and Engineering

2025 spring semester
Graduation Thesis A (Fall)

School of Fundamental Science and Engineering

2025 fall semester
Graduation Thesis A (Spring) [S Grade]

School of Fundamental Science and Engineering

2025 spring semester
Information Retrieval

School of Fundamental Science and Engineering

2025 fall semester
Graduation Thesis B (Fall) [S Grade]

School of Fundamental Science and Engineering

2025 fall semester
IoT System Design

Graduate School of Fundamental Science and Engineering

2025 spring semester
Master's Thesis (Department of Computer Science and Communications Engineering)

Graduate School of Fundamental Science and Engineering

2025 full year
Advanced Information Communication Technology and Open Innovation

Graduate School of Fundamental Science and Engineering

2025 spring semester
Master's Thesis (Department of Computer Science and Communications Engineering)

Graduate School of Fundamental Science and Engineering

2025 full year
Seminar on Parallel and Distributed Architecture A

Graduate School of Fundamental Science and Engineering

2025 spring semester
Seminar on Parallel and Distributed Architecture D

Graduate School of Fundamental Science and Engineering

2025 fall semester
Seminar on Parallel and Distributed Architecture C

Graduate School of Fundamental Science and Engineering

2025 spring semester
Seminar on Parallel and Distributed Architecture B

Graduate School of Fundamental Science and Engineering

2025 fall semester
Advanced Project Study (Autumn)

Graduate School of Fundamental Science and Engineering

2025 fall semester
Advanced Project Study (Spring)

Graduate School of Fundamental Science and Engineering

2025 spring semester
Advanced Information Communication Technology and Open Innovation

Graduate School of Fundamental Science and Engineering

2025 spring semester
Advances in Bioinformatics

Graduate School of Fundamental Science and Engineering

2025 fall semester
Special Laboratory B in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2025 fall semester
Special Laboratory A in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2025 spring semester
Information Retrieval

Graduate School of Fundamental Science and Engineering

2025 fall semester
Research on Bioinformatics

Graduate School of Fundamental Science and Engineering

2025 full year
Research on Parallel and Distributed Architecture

Graduate School of Fundamental Science and Engineering

2025 full year
Seminar on Bioinformatics D

Graduate School of Fundamental Science and Engineering

2025 fall semester
Seminar on Bioinformatics C

Graduate School of Fundamental Science and Engineering

2025 spring semester
Seminar on Parallel and Distributed Architecture D

Graduate School of Fundamental Science and Engineering

2025 fall semester
Seminar on Parallel and Distributed Architecture C

Graduate School of Fundamental Science and Engineering

2025 spring semester
Seminar on Parallel and Distributed Architecture B

Graduate School of Fundamental Science and Engineering

2025 fall semester
Seminar on Parallel and Distributed Architecture A

Graduate School of Fundamental Science and Engineering

2025 spring semester
Special Laboratory B in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2025 fall semester
Special Laboratory A in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2025 spring semester
Information Retrieval

Graduate School of Fundamental Science and Engineering

2025 fall semester
Research on Bioinformatics

Graduate School of Fundamental Science and Engineering

2025 full year
Research on Parallel and Distributed Architecture

Graduate School of Fundamental Science and Engineering

2025 full year
Seminar on Bioinformatics B

Graduate School of Fundamental Science and Engineering

2025 fall semester
Seminar on Bioinformatics A

Graduate School of Fundamental Science and Engineering

2025 spring semester
Seminar on Bioinformatics D

Graduate School of Fundamental Science and Engineering

2025 fall semester
Seminar on Bioinformatics C

Graduate School of Fundamental Science and Engineering

2025 spring semester
Seminar on Bioinformatics B

Graduate School of Fundamental Science and Engineering

2025 fall semester
Seminar on Bioinformatics A

Graduate School of Fundamental Science and Engineering

2025 spring semester
IoT System Design

Graduate School of Creative Science and Engineering

2025 spring semester
Special Seminar B in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2025 fall semester
Research on Bioinformatics

Graduate School of Fundamental Science and Engineering

2025 full year
Research on Parallel and Distributed Architecture

Graduate School of Fundamental Science and Engineering

2025 full year
Special Seminar A in Computer Science and Communications Engineering

Graduate School of Fundamental Science and Engineering

2025 spring semester
IoT System Design

Graduate School of Advanced Science and Engineering

2025 spring semester


Teaching Experience

Client side Web Programming

Waseda University
Data Mining

Waseda University
Operating Systems

Waseda University
Electronic Circuits

Waseda University
Logic Circuits

Waseda University
Programming

Waseda University
Information Retrieval

Waseda University


Sub-affiliation

Faculty of Science and Engineering Graduate School of Fundamental Science and Engineering
Affiliated organization Global Education Center

Research Institute

2025 - 2026   Center for Data Science   Concurrent Researcher
2025   Center for Higher Education Studies   Concurrent Researcher
2024 - 2026   Waseda Research Institute for Science and Engineering   Concurrent Researcher
2024 - 2026   Waseda Center for a Carbon Neutral Society   Concurrent Researcher
2024 - 2026   Research Organization for Open Innovation Strategy   Concurrent Researcher

Internal Special Research Projects

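Research on Secure Big Data Analysis

2025

In the Society 5.0 era, everything is connected and a wide variety of data forms the foundation of society. To promote the use of such data, it is essential to resolve issues of personal information and privacy protection. One promising technique is homomorphic encryption (HE), which allows all operations to be carried out while the data remains encrypted. However, this approach still faces obstacles to practical use in terms of execution time and the set of operations that can be computed. This study aimed to efficiently realize discontinuous functions, which cannot be computed directly under homomorphic encryption. Specifically, we extended the method previously proposed by our laboratory, a lookup table (LUT) based technique for realizing discontinuous functions in the integer-oriented BFV/BGV schemes, to the CKKS scheme, which handles real-number arithmetic, and proposed a new method that operates non-interactively. Adopting the CKKS scheme improved both the flexibility of LUT construction and computational efficiency. Evaluation experiments confirmed that the proposed method offers a flexible trade-off between accuracy and latency: enlarging the LUT improves the accuracy of the function output, so users can choose the balance between computational accuracy and processing speed that suits their application. Compared with a conventional method that has a wider plaintext space, the proposed method reduced latency by up to 19% and mean absolute percentage error (MAPE) by up to 81.1%.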
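
The relationship between LUT size and output accuracy described above can be illustrated with a plaintext-only sketch. No encryption is performed here, and this is not the laboratory's CKKS method: the function `target`, the interval [0, 4), and the bin-centre sampling are all assumptions chosen for illustration.

```python
import math


def target(x):
    """Hypothetical discontinuous function: a square root with a jump at x = 2."""
    return math.sqrt(x) + (1.0 if x < 2.0 else 3.0)


def build_lut(func, lo, hi, size):
    """Sample func at the centre of `size` equal-width bins covering [lo, hi)."""
    width = (hi - lo) / size
    return [func(lo + (i + 0.5) * width) for i in range(size)]


def lookup(table, lo, hi, x):
    """Return the entry of the bin that contains x (index clamped to the table)."""
    width = (hi - lo) / len(table)
    index = min(max(int((x - lo) / width), 0), len(table) - 1)
    return table[index]


def mape(func, table, lo, hi, samples=1000):
    """Mean absolute percentage error of the table over evenly spaced queries."""
    total = 0.0
    for k in range(samples):
        x = lo + (k + 0.5) * (hi - lo) / samples
        exact = func(x)
        total += abs((lookup(table, lo, hi, x) - exact) / exact)
    return 100.0 * total / samples


if __name__ == "__main__":
    # Larger tables cost more to evaluate but approximate the function better.
    for size in (8, 64, 512):
        table = build_lut(target, 0.0, 4.0, size)
        print(size, mape(target, table, 0.0, 4.0))
```

In an encrypted setting the table size would also drive latency, which is the other side of the trade-off the summary reports.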
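Toward New Computation Methods for the Practical Use of Fully Homomorphic Encryption, a Post-Quantum Cryptosystem

2024 Ruixiao Li

Fully homomorphic encryption (FHE) is a scheme that protects data privacy when computation is performed on third-party servers, as in cloud computing. However, applying FHE incurs high computational cost and restricts arithmetic to additions and multiplications. This study therefore worked on realizing arbitrary functions. Specifically, arbitrary functions were realized by storing the relationship between inputs and outputs in a precomputed lookup table (LUT). Two kinds of LUT methods already existed, using (1) bit-wise FHE and (2) word-wise FHE. With the former, the computational complexity is O(s·2^d·m), so performance degrades rapidly as the number of bits grows; here m is the number of inputs to the function, and d and s are the bit lengths of the input and output, respectively. The latter solves this problem but is limited to functions of at most two inputs, and is hard to accelerate once the integer domain size exceeds 2N (N being the number of elements packed into a single ciphertext). This study proposed a 'non-interactive model that requires no decryption', which evaluates arbitrary multivariate functions through homomorphic table lookup with word-wise FHE. The proposed computation method (1) reduces the computational complexity to O(2^d·m/l), where l is the element size of the FHE packing; (2) extends the input and output domain sizes; and (3) reduces latency through multithreading, enabled by the proposed multidimensional tables. In experiments with 16 threads, a 10-bit two-input function and a 5-bit three-input function were evaluated in about 90.5 seconds and 105.5 seconds, respectively. Compared with naive LUT-based computation using bit-wise FHE, this is a speedup of 3.2 times and 23.1 times when evaluating 2-bit and 3-bit three-input functions.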
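
As a rough plaintext analogue of a table lookup that uses only additions and multiplications, the sketch below flattens a multi-input function into one table and selects an entry with an indicator vector. It is a generic illustration, not the proposed word-wise FHE protocol: the helper names are hypothetical, and under FHE the indicator would have to be derived from the ciphertext rather than built from a known index.

```python
from itertools import product


def build_table(func, bits, n_inputs):
    """Precompute func for every combination of n_inputs values of `bits` bits each."""
    values = range(1 << bits)
    return [func(*args) for args in product(values, repeat=n_inputs)]


def flat_index(args, bits):
    """Map a tuple of inputs to its position in the flattened table."""
    index = 0
    for a in args:
        index = (index << bits) | a
    return index


def one_hot(index, size):
    """Indicator vector with a single 1 at `index` (known in plaintext here)."""
    return [1 if i == index else 0 for i in range(size)]


def select(table, indicator):
    """Pick one table entry using only multiplications and additions."""
    return sum(t * b for t, b in zip(table, indicator))


if __name__ == "__main__":
    # Integer division is an example of an operation outside plain add/multiply.
    table = build_table(lambda a, b: a // (b + 1), bits=3, n_inputs=2)
    print(len(table), select(table, one_hot(flat_index((7, 2), 3), len(table))))
```

The table holds 2^(d·m) entries for m inputs of d bits each, which is why the input bit length dominates the cost of any lookup-based approach.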
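Research on Applications of Big Data Analysis

2024

In the Society 5.0 era, in which everything is connected and data of great variety and volume forms the foundation, resolving personal information and privacy issues is indispensable for promoting data use. One technology that addresses this is encryption that allows any processing to be performed on data while it remains encrypted. In fiscal 2024, to improve both the processing speed and the security of such encryption, this study pursued acceleration by combining it with a trusted execution environment. Specifically, we proposed a new combination of homomorphic encryption (HE) and a trusted execution environment (TEE), called 'HE & Plain in TEE', which balances latency, accuracy, and data protection. In the proposed method, all computation is executed inside the TEE; operations unsuited to HE are processed in plaintext, and the remaining operations are processed with HE. We quantitatively evaluated the latency and accuracy of convolutional neural network (CNN) inference (using the CIFAR-10 dataset, which classifies images into 10 classes), compared the results with existing methods, and qualitatively discussed the data protection provided. According to the evaluation, the proposed method reduced latency by 90.2% and improved accuracy by 2.2% compared with the most secure existing method. Compared with the fastest existing method, latency increases by 12.5%, but the proposed method has the advantage of protecting the code (no modification of the code is permitted). The results were presented at WAHC 2024, a top-level international conference for HE research.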
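Toward New Computation Methods for Practical Privacy Protection

2023

This study aimed at the practical use of homomorphic encryption, a technology that addresses personal information and privacy issues by encrypting data and allowing any processing to be performed on it while encrypted. Because homomorphic encryption supports only addition and multiplication, complex computations such as square roots and division cannot be performed. To address this, we studied a method, called the lookup table (LUT) method, that stores the correspondence between a function's inputs and outputs in a table in advance and obtains the (encrypted) output for an (encrypted) input value. This makes arbitrary operations possible. Previously proposed LUTs are realized at the bit level and require time on the order of 2^N, where N is the number of bits. The proposed method instead targets higher speed by using word-level homomorphic encryption. Prior work on word-level LUTs, however, cannot realize general-purpose operations, because the range of possible input values is limited to twice the ciphertext space and because the number of function inputs is limited. The proposed method therefore aims at a general-purpose technique for functions with any number of inputs and with no restriction on the range of each input. In proof-of-concept experiments using the BFV scheme of the Microsoft SEAL library, we realized (1) a one-input 12-bit function in 0.14 seconds, (2) a one-input 18-bit function in 2.53 seconds, (3) a two-input 6-bit function in 0.17 seconds, and (4) a three-input 4-bit function in 0.20 seconds. We also implemented Swish, an activation function frequently used in machine learning, and confirmed a 1.2-fold speedup over the existing method, while ReLU achieved a 1.9-fold speedup.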
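Research on Big Data Analysis Infrastructure

2023

In the Society 5.0 era, in which everything is connected and data of great variety and volume forms the foundation, resolving personal information and privacy issues is indispensable for promoting data use. One technology that addresses this is encryption that allows any processing to be performed on data while it remains encrypted. If personal information and sensitive data can be processed while encrypted, major social change can be expected. In fiscal 2023, this study compared libraries that implement such encryption and examined their technical strengths and weaknesses. Specifically, we targeted the most widely used libraries in this field: IBM's HElib, Microsoft's Microsoft SEAL, and OpenFHE from the homomorphic encryption standardization body. Each library has its own advantages and disadvantages, and it is often unclear which library should be used in which situation. The aim was to compare the three libraries and clarify these trade-offs. To evaluate them, we implemented convolutional neural network (CNN) inference with each library and analyzed the execution time. Microsoft SEAL achieved the shortest execution time, at most 56% of that of OpenFHE and at most 17% of that of HElib, showing that it is suited to cases where fast processing is the goal. In terms of ease of use, however, OpenFHE is superior to the other libraries, because the developer (1) does not need to handle the scale-alignment step called rescaling, and (2) can choose the key-switching method best suited to the characteristics of the computation when using the CKKS scheme. On the other hand, using the automatic rescaling of (1) makes execution about five times longer than when the programmer performs rescaling manually, so it is important to choose a library according to the user's needs.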
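Research on Fundamental Technologies for Big Data Utilization

2022

Big data utilization is indispensable in the Society 5.0 era. In fiscal 2022, as research on fundamental technologies for big data utilization, we examined whether persistent homology can be applied to elucidating the internal representations of deep neural networks (DNNs). Specifically, we showed experimentally that persistent homology can detect overfitting in DNNs. We also showed that applying the method to pruning (reducing a DNN's internal representation) makes it possible to tune accuracy against processing time. In the evaluation, when 95% of the internal representation (edges) was removed, the method retained 12% higher accuracy than GMP, a common pruning method.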
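
For orientation, the baseline side of that comparison can be sketched as a single magnitude-based pruning step, assuming that GMP here refers to gradual magnitude pruning. This is not the persistent-homology method of the study; `magnitude_prune` is a hypothetical helper operating on a flat list of weights.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    n_prune = round(len(weights) * sparsity)
    # Indices ordered from the smallest to the largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


if __name__ == "__main__":
    # Removing 95% of 20 edges leaves only the single largest-magnitude weight.
    weights = [0.01 * (i + 1) * (-1) ** i for i in range(20)]
    print(magnitude_prune(weights, 0.95))
```

A gradual scheme would call such a step repeatedly during training while raising `sparsity` toward its final value.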
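Research on Advancing Information Recommendation Systems

2021

We conducted research on information recommendation. The fiscal 2021 work targeted recommendation of points of interest (POIs), that is, recommendation based on geographic information for places such as restaurants and museums. We used a graph convolutional network (GCN) as the basis and aimed at still higher recommendation performance. In a GCN, the number of learnable parameters grows, which makes model training difficult and limits practicality. In contrast, this study extracts each user's activity areas in advance and reflects only geographic neighbors in the GCN, which lowers model complexity and achieves higher recommendation performance. In evaluation experiments, the method improved Recall@10 over the baseline model by 4.98% on the Yelp dataset and by 3.82% on the Gowalla dataset.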
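
For readers unfamiliar with the metric, Recall@10 can be computed as in the minimal sketch below. The function names and the dictionary-based data layout are assumptions for illustration and are not taken from the study's evaluation code.

```python
def recall_at_k(recommended, relevant, k=10):
    """Fraction of a user's held-out relevant items that appear in the top-k list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(recommended[:k]) & relevant) / len(relevant)


def mean_recall_at_k(recommendations, ground_truth, k=10):
    """Average Recall@k over users that have at least one relevant item."""
    users = [u for u, items in ground_truth.items() if items]
    if not users:
        return 0.0
    total = sum(recall_at_k(recommendations.get(u, []), ground_truth[u], k) for u in users)
    return total / len(users)


if __name__ == "__main__":
    recs = {"u1": ["cafe", "museum", "park"], "u2": ["bar", "zoo"]}
    truth = {"u1": ["museum", "temple"], "u2": ["zoo"]}
    print(mean_recall_at_k(recs, truth, k=10))
```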
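Research on Information Recommendation Technology

2020

This study addressed methods for realizing 'temporally diverse recommendation'. In social networking services (SNS) that incorporate location information, so-called LBSNs (Location-Based Social Networking), the key question is how to recommend points of interest (POIs) that match individual preferences. Examples of POIs include restaurants and tourist spots. Focusing on POI visiting times, we studied a recommendation method that increases the diversity of visiting times. Specifically, POIs are recommended so as to match the distribution of the target user's active hours, and for candidate POIs we use statistical information on how often each POI is visited in each time slot, so that the recommended POIs show diversity in visiting time. Experiments on the Gowalla dataset showed that, compared with the conventional USG algorithm, the proposed method improved temporal diversity by 25.9% while limiting the loss in accuracy to 7.9%.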
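Research on Advancing Active Authentication

2019

In recent years, biometric authentication has been adopted in mobile devices such as smartphones and in bank ATMs. However, biometric authentication alone, such as fingerprints or veins, makes it difficult to guarantee security. As technology that goes beyond such biometric authentication, this study carried out foundational research on several authentication techniques. One targets keypad input at ATMs and similar terminals; the other targets smartphones. For keypad input we used features of the fingers during input (length, skeletal structure, etc.), and for smartphone input we used features of touch strokes during input (speed, pressure, etc.). In each case, the feasibility of passive authentication (automatic authentication from usage behavior, without any explicit authentication action by the user) was demonstrated through external publications.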
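Research on Privacy Protection for Content Published on the Internet

2018

In fiscal 2018, we defined web pages that deliberately try to obtain private information as 'untrustworthy web pages' and conducted basic research on methods for identifying them. Specifically, to extract features characteristic of untrustworthy web pages, we quantified 'external content dependence' as an index. This rests on the idea that a page whose content depends largely on external content is not a page whose real purpose is to publish information. Using this index together with the features used in prior work improved classification accuracy by up to 3.8%.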
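Toward Quantifying Logical Thinking Ability Using Online Handwriting Data from About 1,000 People

2017

In recent years, fostering logical thinking ability has been emphasized from primary education onward. Using geometry problems in mathematics as the subject matter, this study carried out a feasibility study on whether logical thinking ability can be quantified, and also researched and developed fundamental technology for analyzing handwritten sequence data. In the feasibility study, 19 participants answered three geometry problems, and we attempted to classify the answers automatically into the two or three solution types anticipated in advance. Classification was performed with an SVM, but because of the small amount of handwritten answer data, the model could not be trained well. On the other hand, in judging whether an answer was a guess, an accuracy of 0.83 was achieved.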
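Research on Privacy Protection for the Web

2017

The spread of false information, including fake news, on the Web and social networking services has become a social problem. Its causes fall into two categories: (1) users themselves deliberately post false information, and (2) an ID is hijacked and used by a malicious party. This fiscal year, for the former, we began collecting Web information. For the latter, given that much network access now comes from smart devices, we improved the accuracy of the 'active authentication technology' we have been developing, a method that recognizes in real time whether the user is the legitimate owner from swipe patterns during smartphone use. By adding touch pressure as a feature to the conventional method, we achieved an EER of 0.83%.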
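
The equal error rate (EER) quoted above is the operating point at which the false acceptance rate and the false rejection rate coincide. A minimal sketch of how it can be approximated from classifier scores follows; the score lists and the convention that a higher score means 'more likely the legitimate user' are assumptions for illustration.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when a sample is accepted if its score >= threshold."""
    frr = sum(s < threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in impostor) / len(impostor)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER as the mean of FAR and FRR where the two are closest."""
    best_gap, best_eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, threshold)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.7, 0.4]    # legitimate user's swipes
    impostor_scores = [0.1, 0.2, 0.3, 0.6]   # other users' swipes
    print(equal_error_rate(genuine_scores, impostor_scores))
```

Only observed scores are tried as thresholds, so the result is an approximation when FAR and FRR never become exactly equal.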
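Realizing Tailor-Made Learning Support Using Large-Scale Online Handwriting Data

2016

As preliminary research toward the ultimate goal of quantifying logical thinking ability from handwriting data, this study investigated (1) classification of answers from handwriting data for geometry problems and (2) word segmentation of written text. For answer classification, we first ran an experiment with 19 participants and examined their writing process while solving geometry problems. We found that annotations written on figures fall into four types: (a) auxiliary lines, (b) angle annotations, (c) marks on sides (indicating parallelism or equal length), and (d) lengths and ratios of line segments. Next, using these primitives, we applied the k-means method to classify whether the solution strategies 'inscribed angle theorem', 'sum of interior angles', 'congruence', 'isosceles triangle', 'equilateral triangle', and 'similarity' could be identified, and found that this is possible with an average F-measure of 0.7. For word segmentation, we proposed a method based on sequence labeling with a neural network. With this method, word segmentation is possible without a dictionary and with only a small amount of training data, demonstrating its applicability to handwriting data, for which recognition accuracy is insufficient and data is scarce.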
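Research on Privacy Protection for Social Networking Services

2016

In connection with privacy protection for social networking services (SNS), we studied active authentication on smartphones. Specifically, we built a mechanism that determines in real time, from finger actions such as swipes on a smartphone, whether the user is the legitimate owner. Whereas prior work realized active authentication on the basis of static training of a classifier, the proposed method learns continuously using the online classifier AROW and, to avoid false detection caused by momentary atypical actions, sets a sliding window (at 13-stroke intervals) at fixed intervals and authenticates by majority vote. This achieved an EER of 1.9%.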
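
The majority-vote step over a 13-stroke sliding window can be sketched as below. The per-stroke labels are assumed to come from some classifier (AROW in the study, which is not implemented here), and `windowed_decisions` is a hypothetical helper rather than the system's actual interface.

```python
from collections import deque


def windowed_decisions(stroke_labels, window=13):
    """Yield one majority-vote decision per full window of per-stroke labels.

    stroke_labels: iterable of booleans, True meaning the stroke was classified
    as belonging to the legitimate user.
    """
    recent = deque(maxlen=window)
    for label in stroke_labels:
        recent.append(bool(label))
        if len(recent) == window:
            # Accept only if more than half of the recent strokes look genuine.
            yield sum(recent) > window // 2


if __name__ == "__main__":
    # A single atypical stroke does not flip the decision.
    labels = [True] * 12 + [False] + [True] * 5
    print(list(windowed_decisions(labels)))
```

Voting over a window trades a short detection delay for robustness against isolated misclassified strokes.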
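Discovering Learning Difficulties from Online Handwriting Data

2015 浅井洋樹

This project pursued two goals: discovering learners' stumbling points by analyzing strokes made with a digital pen while answering mathematics problems, and quantifying how well items such as kanji have been memorized in memorization tasks. For the former, targeting algebraic expansion problems and geometry problems (both at junior high school level), we showed that answers can be classified automatically by solution strategy. For the latter, we succeeded in quantifying the degree of memorization through stroke analysis during kanji writing tests.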
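Discovering Learning Difficulties through Data Mining That Considers Event Intervals

2013

This fiscal year, the project assumed 'subjects such as Japanese and social studies, in which memorization is important'; that is, we addressed learning difficulties related to memory. Specifically, we studied a method for judging whether a participant remembers the items to be memorized and built it as a system, and we also studied a method for making handwriting input more efficient. [Automatic judgment of memory] Memorizing things is indispensable for living in society. With tablet devices beginning to be introduced in education, the design of efficient memory support systems for tablets is needed. We therefore first surveyed learners' subjective and objective degrees of memorization. Specifically, using the degree of memorization one week later, we examined whether objective memorization matches the learner's own perception. We found that about 20% of the items learners judged as 'memorized' at study time were in fact forgotten one week later. We then focused on online handwriting data obtainable on tablets and proposed a system that supports memorization by objectively estimating, from that data, how well a memory has settled. The system classifies each learned item as either settled memory, which will not be forgotten in the future, or unsettled memory, which may be forgotten in the near future. In an experiment with 12 participants, unsettled-memory items were classified with a precision of 100% and a recall of 95%. The system allows unsettled memories to be studied efficiently, suggesting that it can improve the efficiency of memorization learning. [Efficient handwriting input] We proposed a method that speeds up handwriting input by dynamically predicting the intended word even when each kanji of the word is written only partway. In the evaluation, the same text was entered with the proposed and existing methods, and input time and stroke count were recorded. The proposed method did not reduce input time, but it did reduce the number of strokes compared with the existing method. Based on these results, we plan to continue research on discovering learning difficulties.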
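
The precision and recall figures reported for the unsettled-memory classifier follow the usual definitions, sketched below. The label encoding (True for 'unsettled') and the toy data are assumptions for illustration, not the study's data.

```python
def precision_recall(predicted, actual, positive=True):
    """Return (precision, recall) for the class given by `positive`."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # 20 truly unsettled items and 10 settled ones; the classifier flags 19 of
    # the unsettled items and none of the settled ones.
    actual = [True] * 20 + [False] * 10
    predicted = [True] * 19 + [False] + [False] * 10
    print(precision_recall(predicted, actual))
```

A precision of 100% means every item flagged as unsettled really was forgotten later, while a recall of 95% means a small share of forgotten items went unflagged.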
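Discovering Learning Difficulties from Online Handwriting Data

2012

This study aimed to estimate a participant's 'degree of memorization', using kanji writing and reading tasks and multiple-choice questions from the National Center Test for University Admissions. Conventionally, applications such as vocabulary or kanji drill apps interpret a correct answer as 'memorized'. However, a participant has not necessarily memorized the item; there are cases in which the answer happened to be correct although the memory was vague. If such cases can be distinguished, the participant's learning state can be grasped more accurately, contributing greatly to the development of learning materials. Application to distance education, including the CourseN@vi system promoted by the university, is also possible. Research on mechanically inferring learning state has been conducted to enable efficient instruction, and so far it has largely assumed desktop environments. Meanwhile, schools are actively introducing tablet devices, and learning-state estimation on tablets is expected to grow in importance. This study therefore proposed a method for estimating learning state from information obtained from a tablet device. Specifically, we ran a preliminary experiment on whether graded evaluation of memorization is possible. Thirty university students from diverse fields (15 men and 15 women) took part. The problems were Kanji Kentei Level 8 questions as easy kanji writing problems, Level 2 and Pre-1 questions as difficult kanji writing problems, Level 8 questions as easy kanji reading problems, and Level 2 and Pre-1 questions as difficult kanji reading problems. The results showed that (i) correct answers can be separated into two classes, 'completely memorized' and 'answered correctly while hesitating', with a precision of about 80% and a recall of about 51% for kanji writing problems, and a precision of about 64% and a recall of about 52% for kanji reading problems; (ii) the extent to which the degree of memorization is reflected in online handwriting features differs by task; and (iii) even when participants judged an item to be 'completely memorized', the online handwriting features differed considerably. In addition, using handwriting information together with the learner's accompanying facial expressions and movements as features, we attempted to estimate 'whether the learner struggled while solving the problem' and 'whether the learner answered with confidence', with the same 30 participants and multiple-choice questions from the National Center Test. For estimating whether the learner struggled and whether the learner was confident, we achieved average precision of 75.3% and 66.7% and average recall of 74.8% and 60.1%, respectively. Future work includes developing a memory support application based on this work and designing and developing a machine learner capable of efficient learning for use in that application.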
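Peer-to-Peer Memory Provision Service

2008 上田　高徳

Storage is indispensable for storing and analyzing large volumes of data. Storage performance strongly affects overall system performance, yet hard disk drives (HDDs), which are at a disadvantage in access speed, are still used as storage. Because of their physical structure, HDD access performance is hard to improve and has fallen behind gains in CPU performance. Meanwhile, multi-core CPUs with several cores on one chip have appeared in recent years, and many-core CPUs with a dozen or more cores on a chip are certain to follow. In such a many-core environment, applications run in parallel, so concentrated access to storage becomes a problem. Parallel access increases the frequency of random access, at which HDDs are especially weak, and storage is expected to become even more of a bottleneck. With the coming many-core era in view, this study considered reducing the storage bottleneck in software. Specifically, we devised a new disk cache mechanism that uses data mining and implemented it at the OS level, avoiding modification of hardware and applications and reducing the bottleneck at low cost. As a result, we showed that cache performance can be improved by using access patterns extracted from file access logs. In the experiments, access logs of a TPC-C-equivalent benchmark were collected at the system call level, and access patterns were extracted with a data mining algorithm. We confirmed that filtering out sequential access shortens mining time and at the same time allows good-quality access patterns to be extracted. We also confirmed that reordering the LRU list using the access patterns lowers the cache miss rate. The main contributions are a new mining method for making disk caches more efficient and a filtering method for performing the mining efficiently.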
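
The idea of reordering an LRU list with mined access patterns can be sketched as follows. This is a toy user-space model, not the OS-level implementation of the study: the class `PatternAwareLRU`, the `patterns` dictionary (block to blocks expected to follow it), and the promotion rule are all assumptions for illustration.

```python
from collections import OrderedDict


class PatternAwareLRU:
    """LRU cache that protects blocks a mined pattern expects to be used next."""

    def __init__(self, capacity, patterns=None):
        self.capacity = capacity
        self.patterns = patterns or {}
        self.blocks = OrderedDict()  # least recently used block first
        self.misses = 0

    def access(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)
        else:
            self.misses += 1
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # evict the LRU block
            self.blocks[block] = True
        # Reorder: move cached followers of this block away from the eviction end.
        for follower in self.patterns.get(block, ()):
            if follower in self.blocks:
                self.blocks.move_to_end(follower)


def count_misses(trace, capacity, patterns=None):
    cache = PatternAwareLRU(capacity, patterns)
    for block in trace:
        cache.access(block)
    return cache.misses


if __name__ == "__main__":
    trace = ["B", "X", "A", "Y", "Z", "B"]
    print(count_misses(trace, 3), count_misses(trace, 3, {"A": ["B"]}))
```

On this trace the plain LRU evicts `B` before it is reused, whereas the hint that `B` tends to follow `A` keeps it cached and saves one miss.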
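Research on Compiler-Cooperative Instruction-Level Speculative Execution

2004 斎藤　史子

Recent processors use branch prediction so that control dependences between instructions do not stall pipeline processing. Branch prediction enables execution past unresolved branch instructions (speculative execution). Meanwhile, the deepening of instruction pipelines in recent years has increased the branch misprediction penalty, so reducing the misprediction rate has become an unavoidable issue for improving processor performance. Various branch predictors have been proposed to date. Among them, hybrid branch predictors composed of multiple prediction tables are known to be highly accurate. In a hybrid predictor, the way the selector chooses a prediction table affects prediction accuracy. Conventional hybrid predictors choose the prediction table according to the bias of the selector's counter state. For example, the Combining predictor selects the prediction table with the higher prediction confidence, and the Bimode predictor selects the prediction table according to branch bias. As an entirely new branch prediction (speculative execution) method, this study proposed, for such hybrid branch predictors, a table selection technique that depends not on the bias of the selector's counter state but on the strength of that bias. This technique is called the Confidence-Selector. In an evaluation using SPECint95 (ref input), a benchmark commonly used in this field, on the SimpleScalar simulator, the misprediction rate was reduced by an average of 5.65% relative to a 12 KB Combining predictor and by an average of 0.57% relative to a 12 KB Bimode predictor.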
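
To make the terms 'counter state' and 'strength of bias' concrete, the sketch below models a 2-bit saturating counter and a conventional Combining-style selector. It illustrates only the baseline mechanism as commonly described, not the proposed Confidence-Selector; the class and function names are hypothetical.

```python
class SaturatingCounter:
    """2-bit saturating counter: states 0-1 predict False, states 2-3 predict True."""

    def __init__(self, state=1):
        self.state = state

    def predict(self):
        return self.state >= 2

    def is_strong(self):
        # States 0 and 3 are the strongly biased ends; 1 and 2 are weak states.
        return self.state in (0, 3)

    def update(self, outcome):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        self.state = min(3, self.state + 1) if outcome else max(0, self.state - 1)


def combining_predict(selector, pred_a, pred_b):
    """Use table A's prediction when the selector leans toward A, otherwise B's."""
    return pred_a if selector.predict() else pred_b


def combining_update(selector, pred_a, pred_b, taken):
    """Train the selector only when the two tables disagree."""
    if pred_a != pred_b:
        selector.update(pred_a == taken)


if __name__ == "__main__":
    selector = SaturatingCounter(1)          # weakly prefers table B
    print(combining_predict(selector, True, False))
    combining_update(selector, True, False, taken=True)   # table A was right
    print(combining_predict(selector, True, False), selector.is_strong())
```

A selector of this kind looks only at which side the counter leans toward; `is_strong` exposes the additional information (weak versus strong state) that a strength-based selection could use.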
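Research on Load Distribution Methods for an Internet-Distributed Ultra-High-Speed Data Collection and Search System

2000

As one approach to collecting data quickly from widely distributed WWW servers, we have been researching and developing a 'distributed WWW robot', in which multiple WWW robots (programs that automatically collect data from WWW servers) are placed across the Internet and operate cooperatively. By the end of fiscal 1998, we had evaluated robots distributed across 5 sites and confirmed the effectiveness of the distributed WWW robot. In this study, to evaluate practicality, we further enlarged the scale of the experiment and analyzed in detail the results of an experiment using distributed WWW robots placed at 17 sites and targeting 6,500 WWW servers (4.65 million URLs). The results showed that, compared with collecting everything from a single site, distribution with our proposed load equalization achieves a speedup of 6.3 to 286 times. In particular, the data transfer speeds between the 17 distributed WWW robots and the 6,500 WWW servers differed by a factor of 2 to 710, 67.5 on average, even for the same WWW server. Consequently, 'random distribution', which assigns target WWW servers to the distributed WWW robots at random, causes a large load imbalance, and load equalization becomes necessary. Even among the 15 distributed WWW robots that remain after excluding the two whose data transfer speed was extremely slow in this experiment (owing to proxy performance and network routing), the speed difference was 1.3 to 710 times, 9.6 times on average. In addition, the WWW servers for which the transfer speeds of the fastest and slowest WWW robots differed by less than a factor of 10 made up 71% of the total, and those within a factor of 20 made up 91%, so most speed differences fall within a range of about 20 times. This study showed that, to make the distributed WWW robot still faster, a distribution method that can equalize load with the variation in Internet data transfer as the upper bound is important.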
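
Why speed-unaware assignment leads to imbalance can be seen in a small sketch that compares a fixed round-robin assignment with a greedy, speed-aware one. The greedy rule is a generic heuristic chosen for illustration and is not the load equalization method of the study; the sizes and speeds are made-up numbers.

```python
def round_robin_loads(sizes, speeds):
    """Assign server s to robot s % R regardless of transfer speed."""
    loads = [0.0] * len(speeds)
    for s, size in enumerate(sizes):
        r = s % len(speeds)
        loads[r] += size / speeds[r][s]
    return loads


def greedy_loads(sizes, speeds):
    """Assign each server to the robot that would finish it earliest."""
    loads = [0.0] * len(speeds)
    # Handle the largest servers first so they are placed while robots are idle.
    for s in sorted(range(len(sizes)), key=lambda i: -sizes[i]):
        r = min(range(len(speeds)), key=lambda j: loads[j] + sizes[s] / speeds[j][s])
        loads[r] += sizes[s] / speeds[r][s]
    return loads


if __name__ == "__main__":
    sizes = [100, 100, 100, 100]             # data volume per WWW server
    speeds = [[1, 10, 1, 10],                # speeds[r][s]: robot r to server s
              [10, 1, 10, 1]]
    print(max(round_robin_loads(sizes, speeds)), max(greedy_loads(sizes, speeds)))
```

The completion time of the whole crawl is set by the most heavily loaded robot, so the quantity to compare is the maximum load.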
