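Researcher Details - Nozomu Togawa

Graduated from the School of Science and Engineering, Waseda University, in 1992. Completed the doctoral program at the Graduate School of Science and Engineering of the same university in 1997 and received a Doctor of Engineering degree in the same year. After serving as a research associate and lecturer at Waseda University and as an associate professor at The University of Kitakyushu, among other posts, he has been a professor at the Faculty of Science and Engineering, Waseda University, since 2009. Since 2022 he has concurrently served as Chief Scientific Officer (CSO) of Quanmatic Inc. His specialties include integrated system design, quantum computing, and security.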

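Career

Sep 2024 - Present: Senior Dean, Faculty of Science and Engineering, Waseda University
Oct 2022 - Present: Chief Scientific Officer (CSO), Quanmatic Inc. (concurrent post)
Apr 2009 - Present: Professor, Faculty of Science and Engineering, Waseda University
Sep 2018 - Sep 2024: Dean, School of Fundamental Science and Engineering and Graduate School of Fundamental Science and Engineering, Waseda University
Apr 2005 - Mar 2009: Associate Professor, Faculty of Science and Engineering, Waseda University
Apr 2001 - Mar 2005: Associate Professor, Faculty of Environmental Engineering, The University of Kitakyushu
Apr 2000 - Mar 2001: Lecturer, Advanced Research Institute for Science and Engineering, Waseda University
Apr 1997 - Mar 2000: Research Associate, School of Science and Engineering, Waseda University
Apr 1994 - Mar 1997: Research Fellow (DC1), Japan Society for the Promotion of Science (JSPS)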

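Education

Apr 1994 - Mar 1997: Doctoral Program, Department of Electrical Engineering, Graduate School of Science and Engineering, Waseda University
Apr 1992 - Mar 1994: Master's Program, Department of Electrical Engineering, Graduate School of Science and Engineering, Waseda University
Apr 1988 - Mar 1992: Department of Electronics and Communication Engineering, School of Science and Engineering, Waseda University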

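Committee Memberships

Jun 2020 - Present: Member, IEICE Technical Committee on VLSI Design Technologies
Jan 2017 - Present: Member, Cybersecurity Task Force, Ministry of Internal Affairs and Communications
Apr 2018 - May 2025: Member, Research and Development Strategy Expert Committee, National center of Incident readiness and Strategy for Cybersecurity (NISC)
Jan 2022 - Dec 2023: Chair, IEEE Circuits and Systems Society, Japan Joint Chapter
Jun 2020 - Jun 2022: Special Committee Member, IEICE Engineering Sciences Society
Apr 2018 - Mar 2022: Member, IPSJ Special Interest Group on Intelligent Transport Systems and Smart Community
Jan 2020 - Dec 2021: Vice Chair, IEEE Circuits and Systems Society, Japan Joint Chapter
Jan 2013 - Jan 2021: Secretary, Steering Committee, IEEE/ACM Asia South Pacific Design Automation Conference
Jun 2019 - Jun 2020: Chair, IEICE Technical Committee on VLSI Design Technologies
Jan 2018 - Dec 2019: Secretary, IEEE Circuits and Systems Society, Japan Joint Chapter
Jun 2018 - May 2019: Vice Chair, IEICE Technical Committee on VLSI Design Technologies
Apr 2015 - Mar 2019: Editor-in-Chief, IPSJ Transactions on System LSI Design Methodology
Jun 2014 - May 2017: Special Committee Member, IEICE Engineering Sciences Society
Apr 2013 - Mar 2017: Member, IPSJ Special Interest Group on System and LSI Design Methodology
May 2010 - May 2016: Expert Member, IEICE Technical Committee on VLSI Design Technologies
May 2011 - Jun 2014: Secretary, IEICE Accreditation Committee
Apr 2013 - Mar 2014: Chair, Fundamentals Group, IPSJ Journal Editorial Committee
Apr 2011 - Mar 2013: Secretary, IPSJ Special Interest Group on System LSI Design Methodology
May 2010 - May 2012: Secretary, IEICE Engineering Sciences Society
Apr 2005 - May 2012: Expert Member, IEICE Technical Committee on Reconfigurable Systems
May 2008 - May 2011: Member, IEICE Accreditation Committee
Apr 2009 - Mar 2011: Member, IPSJ Special Interest Group on System LSI Design Methodology
May 2008 - May 2010: Secretary, IEICE Technical Committee on VLSI Design Technologies
May 2003 - May 2008: Expert Member, IEICE Technical Committee on VLSI Design Technologies
May 2004 - May 2006: Member, Editorial Committee of the Journal of the IEICE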

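Professional Memberships

ACM
IEEE
Information Processing Society of Japan (IPSJ)
The Institute of Electronics, Information and Communication Engineers (IEICE)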

Research Areas

Computer systems / Information security

Research Keywords

Integrated system design
Quantum computing
Information security

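Awards

Best Paper Award
Dec 2025, The 12th International Conference on Internet of Things: Systems, Management and Security (IoTSMS-2025). "Automated Security Compliance Evaluation Using Hierarchical RAG for IoT Devices with Large-Scale Documentation"
Recipients: Yuka Ikegami, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa

Best Paper Award
Sep 2024, 11th International Conference on Internet of Things: Systems, Management and Security (IOTSMS 2024). "Initial Seeds Generation Using LLM for IoT Device Fuzzing"
Recipients: Hibiki Nakanishi, Kota Hisafuru, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa

Telecom System Technology Encouragement Award
Mar 2024, The Telecommunications Advancement Foundation
Recipients: Kento Hasegawa, Kazuki Yamashita, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa

SCOPE Results Deployment Promotion Award
Dec 2022, Ministry of Internal Affairs and Communications
Recipient: Nozomu Togawa

Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology: Prize for Science and Technology (Research Category)
Apr 2018, Ministry of Education, Culture, Sports, Science and Technology. "Research on innovative design technologies for integrated circuits and their security applications"
Recipient: Nozomu Togawa

Best Paper Award
Sep 2017, IEEE ICCE-Berlin. "A robust scan-based side-channel attack method against HMAC-SHA-256 circuits"
Recipients: Daisuke Oku, Masao Yanagisawa, Nozomu Togawa

Best Paper Award
Oct 2016, IEEE International SoC Conference. "A high-performance circuit design algorithm using data dependent approximation"
Recipients: Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

Telecom System Technology Award
Mar 2011, The Telecommunications Advancement Foundation
Recipient: Nozomu Togawa

Funai Academic Award, Funai Foundation for Information Technology
Apr 2010

Marubun Research Encouragement Award, Marubun Research Promotion Foundation
Mar 2010

IEEE DAC/ISSCC Student Design Contest, 1st Place
Jul 2006

Takeda Research Encouragement Award (Excellence Prize), The Takeda Foundation
Dec 2001

Niwa Memorial Award (21st, fiscal year 1997), Niwa Memorial Foundation
Feb 1998

9th Ando Incentive Prize for the Study of Electronics, Foundation of Ando Laboratory
Jun 1996

Research Encouragement Award, 8th IEICE Karuizawa Workshop on Circuits and Systems
Apr 1996

Azusa Ono Memorial Award for Academic Achievement (fiscal year 1995), Waseda University
Mar 1996

Isao Okawa Memorial Award (fiscal year 1995), Waseda University
Mar 1996

Best Paper Award (IEEE Asia and South Pacific Design Automation Conference 1995)
Aug 1995

10th Telecom System Technology Student Award, The Telecommunications Advancement Foundation
Mar 1995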

Papers

Mitigating Precision Errors in Quantum Annealing via Coefficient Reduction of Embedded Hamiltonians
Kentaro Ohno, Nozomu Togawa
IEEE Transactions on Quantum Engineering, 2026

QUBO Simplification by Singular Value Decomposition and Coefficient Elimination for Ising Machines
Shinnosuke Inaba, Takeru Ota, Nozomu Togawa
IEEE Access, 2026

Hardware-Trojan Detection Using GraphGPS.
Sho Yoshimi, Yuka Ikegami, Nozomu Togawa
ICCE, pp. 1-2, 2026 [Refereed]
Role: Corresponding author

Bit-Width Reduction for Ising Models Proving the Ground State by Topologically Isomorphic n-Partition.
Shu Tomita, Raio Aoki, Shinnosuke Inaba, Masashi Tawada, Nozomu Togawa
ICCE, pp. 1-3, 2026 [Refereed]
Role: Last author

Dynamic Variable Selection for subQUBO Annealing Using Hamming Weight.
Yoshihito Saito, Koki Mita, Nozomu Togawa
ICCE, pp. 1-4, 2026 [Refereed]
Role: Last author

Content-Path Optimization for TV News Programs.
Shoma Kaji, Risa Kaneko, Yuta Hagio, Masaru Miyazaki, Siya Bao, Nozomu Togawa
ICCE, pp. 1-3, 2026 [Refereed]
Role: Last author

Constraint-Parameterized Spin-Variable Reduction Method for QAP.
Raio Aoki, Takeru Ota, Tatsuhiko Shirai, Nozomu Togawa
ICCE, pp. 1-3, 2026 [Refereed]
Role: Last author

Leveraging Non-Exact QUBO Formulation With Quantum Annealing to Optimize 5G Base Station Power Selection
Chihiro Dogo, Kazuhiro Saito, Masashi Tawada, Nozomu Togawa
IEEE Access, 2026 [Refereed]
Role: Last author

Initial Seeds Generation Based on Communication Logs Using LLM for IoT Device Fuzzing.
Hibiki Nakanishi, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
IEICE Trans. Inf. Syst. 109(1), pp. 152-164, 2026 [Refereed]
Role: Corresponding author

Anomalous Behavior Detection in IoT Devices Utilizing Embedded Representations of Power Consumption Waveforms
Ryusei Eda, Nozomu Togawa
IEEE Access, 2026 [Refereed]
Role: Corresponding author

QUBO Simplification Method for Improving Solution Convergence Speed Using an Ising Machine
Shinnosuke Inaba, Takeru Ota, Chihiro Dogo, Kazuhiro Saito, Nozomu Togawa
2025 IEEE International Conference on Quantum Computing and Engineering (QCE), pp. 492-493, Aug 2025 [Refereed]
Role: Last author

Node-Wise Hardware Trojan Detection Based on Graph Learning
Kento Hasegawa, Kazuki Yamashita, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
IEEE Transactions on Computers 74(3), pp. 749-761, Mar 2025 [Refereed]
Role: Last author

Course Selection Optimization using pVSQA on Quantum Computers.
Takeru Ota, Tatsuhiko Shirai, Nozomu Togawa
ICCE-Berlin, pp. 207-210, 2025 [Refereed]
Role: Last author

Predicting Dynamic Travel Time Between Points of Interest Using Machine Learning Models.
Shoma Kaji, Toshinori Takayama, Siya Bao, Nozomu Togawa
ICCE-Berlin, pp. 197-200, 2025 [Refereed]
Role: Last author

A Correction Method Using Multiple Trained Models for Graph-Learning Based Hardware-Trojan Detection.
Sho Yoshimi, Yuka Ikegami, Nozomu Togawa
ICCE-Berlin, pp. 26-29, 2025 [Refereed]
Role: Last author

Security Conformance Scoring for IoT Devices Through Documentation Analysis Using Large Language Models.
Yuka Ikegami, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
ICCE-Berlin, pp. 21-25, 2025 [Refereed]
Role: Last author
Citations (Scopus): 1

LLM-based Fuzzing Method Using UI-based Input Value Generation for IoT Devices.
Hibiki Nakanishi, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Nozomu Togawa
ICCE-Berlin, pp. 16-20, 2025 [Refereed]
Role: Last author

Anomalous IoT Behavior Detection Utilizing Time-Series Embedded Representation.
Ryusei Eda, Nozomu Togawa
ICCE-Berlin, pp. 10-15, 2025 [Refereed]
Role: Last author
Citations (Scopus): 1

Anomalous IoT Behavior Detection Based on SARIMA Reference Waveform.
Ryusei Eda, Nozomu Togawa
31st IEEE International Symposium on On-Line Testing and Robust System Design (IOLTS), pp. 1-5, 2025 [Refereed]
Role: Last author

Prioritizing Vulnerability Assessment Items for IoT Devices Based on Suitability Evaluation Using LLMs.
Yuka Ikegami, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
IEICE Transactions on Information & Systems 108(12), pp. 1556-1569, 2025 [Refereed]
Role: Last author

Gen-Power2: Improved Anomaly Detection in IoT Devices Utilizing Generated Power Waveforms.
Ryusei Eda, Kota Hisafuru, Nozomu Togawa
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 108(12), pp. 1598-1611, 2025 [Refereed]
Role: Last author

Enriching Experiences Through Shared Moments: Travel Recommendation for Heterogeneous Users Using Ising Machines.
Siya Bao, Nozomu Togawa
VTC2025-Spring, pp. 1-7, 2025 [Refereed]
Role: Last author

Compressed Space Quantum Approximate Optimization Algorithm for Constrained Combinatorial Optimization
Tatsuhiko Shirai, Nozomu Togawa
IEEE Transactions on Quantum Engineering, 2025 [Refereed]
Role: Last author
Citations (Scopus): 3

An Ising Machine Approach to the Personalized Course Selection Problem
Takeru Ota, Keisuke Fukada, Nozomu Togawa
IEEE Access, 2025 [Refereed]
Role: Last author

Evaluation of Hardware-Trojan Detection by Ensemble Learning Model for Circuits Inserted with a Trojan by an Automated Framework.
Sho Yoshimi, Yuka Ikegami, Ryotaro Negishi, Kota Hisafuru, Nozomu Togawa
ICCE, pp. 1-4, 2025 [Refereed]
Role: Last author
Citations (Scopus): 1

Searching Candidate Sequences for RNA Aptamers Using Quantum Computation.
Sora Tomita, Tatsuhiko Shirai, Michiaki Hamada, Tatsuo Adachi, Nozomu Togawa
ICCE, pp. 1-2, 2025 [Refereed]
Role: Last author

Personalized Course Selection Optimization Using QAOA.
Takeru Ota, Keisuke Fukada, Tatsuhiko Shirai, Nozomu Togawa
ICCE, pp. 1-2, 2025 [Refereed]
Role: Last author

SubQUBO Annealing Based on Efficient Binary Variable Selection for Combinatorial Optimization Problems with One-Hot Constraints.
Tatsuya Noguchi, Keisuke Fukada, Siya Bao, Nozomu Togawa
ICCE, pp. 1-2, 2025 [Refereed]
Role: Last author
Citations (Scopus): 2

Performance Comparison of the LLM Models on LLM-Based Seed Generation Method for IoT Device Fuzzing.
Hibiki Nakanishi, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
ICCE, pp. 1-5, 2025 [Refereed]
Role: Last author
Citations (Scopus): 1

Solving Task-Select Problems Using a Quantum Annealer.
Koki Mita, Yuta Yachi, Keisuke Fukada, Nozomu Togawa
ICCE, pp. 1-2, 2025 [Refereed]
Role: Last author

Shoe-Based PDR System for Two-Dimensional Tracking.
Dai Kajimoto, Yusuke Tanaka, Siya Bao, Nozomu Togawa
ICCE, pp. 1-2, 2025 [Refereed]
Role: Last author
Citations (Scopus): 1

Machine Learning-Based Estimation of Dynamic Travel Times between Points of Interest.
Shoma Kaji, Dai Kajimoto, Tatsuya Noguchi, Toshinori Takayama, Siya Bao, Nozomu Togawa
ICCE, pp. 1-2, 2025 [Refereed]
Role: Last author
Citations (Scopus): 2

An Ising Machine-Based Hybrid Optimization Method Using Constraint Conversion and Correction.
Kinya Iwata, Masashi Tawada, Nozomu Togawa
ICCE, pp. 1-2, 2025 [Refereed]
Role: Last author

Autonomous Hardware-Trojan Generation Method Using Reinforcement Learning for Random Forest-Based Hardware-Trojan Detection.
Yuka Ikegami, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Nozomu Togawa
ICCE, pp. 1-6, 2025 [Refereed]
Role: Last author

Verification of Internal Memory Readout by Voltage Fault Attack on Automotive ECUs.
Ryusei Eda, Kota Hisafuru, Katsuhiko Sato, Yuya Adachi, Nozomu Togawa
ICCE, pp. 1-4, 2025 [Refereed]
Role: Last author

Anomalous IoT Behavior Detection by LSTM-Based Power Waveform Prediction.
Ryusei Eda, Nozomu Togawa
IoTBDS, pp. 345-352, 2025 [Refereed]
Role: Last author
Citations (Scopus): 2

Automating the Assessment of Japanese Cyber-Security Technical Assessment Requirements Using Large Language Models.
Kento Hasegawa, Yuka Ikegami, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
IoTBDS, pp. 305-312, 2025 [Refereed]
Role: Last author
Citations (Scopus): 1

Automated Test Input Generation Based on Web User Interfaces via Large Language Models.
Kento Hasegawa, Hibiki Nakanishi, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
IoTBDS, pp. 297-304, 2025 [Refereed]
Role: Last author
Citations (Scopus): 2

Hybrid Iterative Annealing Method Using a Quantum Annealer and a Classical Computer and Its Evaluation.
Keisuke Fukada, Tatsuhiko Shirai, Nozomu Togawa
IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 108(3), pp. 460-472, 2025 [Refereed]
Role: Last author

Linearizing Binary Optimization Problems Using Variable Posets for Ising Machines
Kentaro Ohno, Nozomu Togawa
IEEE Transactions on Emerging Topics in Computing 13(1), pp. 250-261, Jan 2025 [Refereed]
Role: Last author

Hybrid subQUBO Annealing With a Correction Process for Multi-Day Intermodal Trip Planning
Tatsuya Noguchi, Keisuke Fukada, Siya Bao, Nozomu Togawa
IEEE Access 13, pp. 19716-19727, 2025 [Refereed]
Role: Last author

Large-Sized VQE Performance Profiling in Quantum Chemistry Using a Multi-Node Quantum Simulator.
Keisuke Fukada, Tatsuhiko Shirai, Mikio Morita, Yoshinori Tomita, Koichi Kimura, Nozomu Togawa
QCE, pp. 432-433, 2024 [Refereed]
Role: Last author

Personalized Course Selection Optimization Using an Ising Machine.
Takeru Ota, Keisuke Fukada, Nozomu Togawa
QCE, pp. 430-431, 2024 [Refereed]
Role: Last author
Citations (Scopus): 4

Non-zero Coefficients Removing Method to Improve the Ising Machine Solving Performance.
Kinya Iwata, Masashi Tawada, Nozomu Togawa
QCE, pp. 428-429, 2024 [Refereed]
Role: Last author
Citations (Scopus): 3

Optimization of Base Station Power Supply Selection by Quantum Annealing.
Chihiro Dogo, Kazuhiro Saito, Nozomu Togawa
QCE, pp. 408-409, 2024 [Refereed]
Role: Last author
Citations (Scopus): 5

Variable Reduction Method for Quadratic Three-Dimensional Assignment Problem with FMQA.
Sora Tomita, Tatsuhiko Shirai, Nozomu Togawa
QCE, pp. 404-405, 2024 [Refereed]
Role: Last author

QUBO Coefficient Dynamic Ratio Shrinking Method for Quantum Annealers.
Yuta Yachi, Masashi Tawada, Nozomu Togawa
QCE, pp. 384-385, 2024 [Refereed]
Role: Last author

Multi-Day Intermodal Trip Planning Using subQUBO Annealing with Correction Processing.
Tatsuya Noguchi, Keisuke Fukada, Siya Bao, Nozomu Togawa
QCE, pp. 380-381, 2024 [Refereed]
Role: Last author
Citations (Scopus): 1

A Novel Classical-Ising Hybrid Annealing Method with QUBO Model Cutting.
Yuta Atobe, Masashi Tawada, Nozomu Togawa
MWSCAS, pp. 1154-1157, 2024 [Refereed]
Role: Last author
Citations (Scopus): 1

Prioritizing Vulnerability Assessment Items Using LLM Based on IoT Device Documentations.
Yuka Ikegami, Ryotaro Negishi, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
IOTSMS, pp. 147-152, 2024 [Refereed]
Role: Last author
Citations (Scopus): 4

Initial Seeds Generation Using LLM for IoT Device Fuzzing.
Hibiki Nakanishi, Kota Hisafuru, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Kazuo Hashimoto, Nozomu Togawa
IOTSMS, pp. 5-10, 2024 [Refereed]
Role: Last author
Citations (Scopus): 5

QUBO Formulation Using Sequence Pair With Search Space Restriction for Rectangle Packing Problem
Akihisa Okada, Keisuke Otaki, Tadayoshi Matsumori, Hiroaki Yoshida, Kotaro Terada, Nozomu Togawa
IEEE Access, 2024 [Refereed]
Role: Last author

Anomalous IoT Behavior Detection by Generated Power Waveforms with Hyper-parameter Tuning.
Ryusei Eda, Kota Hisafuru, Nozomu Togawa
IOLTS, pp. 1-3, 2024 [Refereed]
Role: Last author
Citations (Scopus): 4

Toward Practical Benchmarks of Ising Machines: A Case Study on the Quadratic Knapsack Problem
Kentaro Ohno, Tatsuhiko Shirai, Nozomu Togawa
IEEE Access, 2024 [Refereed]
Role: Last author
Citations (Scopus): 8

Ising Machine Approach to the Lecturer–Student Assignment Problem
Sora Tomita, Tatsuhiko Shirai, Nozomu Togawa
IEEE Access, 2024 [Refereed]
Role: Last author
Citations (Scopus): 2

A GPU-Based Ising Machine With a Multi-Spin-Flip Capability for Constrained Combinatorial Optimization
Satoru Jimbo, Tatsuhiko Shirai, Nozomu Togawa, Masato Motomura, Kazushi Kawamura
IEEE Access 12, pp. 43660-43673, 2024 [Refereed]
Citations (Scopus): 6

Postprocessing Variationally Scheduled Quantum Algorithm for Constrained Combinatorial Optimization Problems
Tatsuhiko Shirai, Nozomu Togawa
IEEE Transactions on Quantum Engineering, 2024 [Refereed]
Role: Last author
Citations (Scopus): 7

Optimization of Practical Time-Dependent Vehicle Routing Problem by Ising Machines.
Yui Tsuyumine, Kenichi Masuda, Takeshi Hachikawa, Tsuyoshi Haga, Yuta Yachi, Tatsuhiko Shirai, Masashi Tawada, Nozomu Togawa
ICCE, pp. 1-5, 2024 [Refereed]
Role: Last author
Citations (Scopus): 3

Time-Dependent Multi-Objective Trip Planning by Ant Colony Optimization with Route API.
Etsushi Saeki, Siya Bao, Toshinori Takayama, Nozomu Togawa
ICCE, pp. 1-2, 2024 [Refereed]
Role: Last author
Citations (Scopus): 3

Evaluation of Ensemble Learning Models for Hardware-Trojan Identification at Gate-level Netlists.
Ryotaro Negishi, Nozomu Togawa
ICCE, pp. 1-6, 2024 [Refereed]
Role: Last author
Citations (Scopus): 11

An Interaction Coefficient Control Method for Setting Initial Solutions to Ising Machines.
Soma Kawakami, Kentaro Ohno, Dema Ba, Satoshi Yagi, Junji Teramoto, Nozomu Togawa
ICCE, pp. 1-2, 2024 [Refereed]
Role: Last author

Carrying-Mode-Free Stair Ascent and Descent Estimation using Smartphones.
Dai Kajimoto, Etsushi Saeki, Siya Bao, Nozomu Togawa
ICCE, pp. 1-6, 2024 [Refereed]
Role: Last author

Gen-Power: Anomaly Detection in IoT Devices Utilizing Generated Power Waveforms.
Kota Hisafuru, Nozomu Togawa
ICCE, pp. 1-6, 2024 [Refereed]
Role: Last author
Citations (Scopus): 2

Hybrid Iterative Annealing Method Using a Quantum Annealer and a Classical Computer.
Keisuke Fukada, Tatsuhiko Shirai, Nozomu Togawa
ICCE 108(3), pp. 1-6, 2024 [Refereed]
Role: Last author
Citations (Scopus): 1

An Anomalous Behavior Detection Method Utilizing IoT Power Waveform Shapes.
Kota Hisafuru, Kazunari Takasaki, Nozomu Togawa
IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 107(1), pp. 75-86, Jan 2024 [Refereed]
Role: Last author

Hardware-Trojan Detection at Gate-Level Netlists Using a Gradient Boosting Decision Tree Model and Its Extension Using Trojan Probability Propagation.
Ryotaro Negishi, Tatsuki Kurihara, Nozomu Togawa
IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 107(1), pp. 63-74, Jan 2024 [Refereed]
Role: Last author

Giving a Quasi-Initial Solution to Ising Machines by Controlling External Magnetic Field Coefficients.
Soma Kawakami, Kentaro Ohno, Dema Ba, Satoshi Yagi, Junji Teramoto, Nozomu Togawa
IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 107(1), pp. 52-62, Jan 2024 [Refereed]
Role: Last author

Ising-Machine-Based Solver for Constrained Graph Coloring Problems.
Soma Kawakami, Yosuke Mukasa, Siya Bao, Dema Ba, Junya Arai, Satoshi Yagi, Junji Teramoto, Nozomu Togawa
IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 107(1), pp. 38-51, Jan 2024 [Refereed]
Role: Last author

Hybrid Optimization Method Using Simulated-Annealing-Based Ising Machine and Quantum Annealer
Shuta Kikuchi, Nozomu Togawa, Shu Tanaka
Journal of the Physical Society of Japan, Dec 2023 [Refereed]
Citations (Scopus): 6

An Efficient Combined Bit-Width Reducing Method for Ising Models.
Yuta Yachi, Masashi Tawada, Nozomu Togawa
IEICE Trans. Inf. Syst. 106(4), pp. 495-508, Apr 2023 [Refereed]
Role: Last author

Multi-Spin-Flip Engineering in an Ising Machine.
Tatsuhiko Shirai, Nozomu Togawa
IEEE Trans. Computers 72(3), pp. 759-771, Mar 2023 [Refereed]
Role: Last author
Citations (Scopus): 8

R-HTDetector: Robust Hardware-Trojan Detection Based on Adversarial Training.
Kento Hasegawa, Seira Hidano, Kohei Nozawa, Shinsaku Kiyomoto, Nozomu Togawa
IEEE Trans. Computers 72(2), pp. 333-345, Feb 2023 [Refereed]
Role: Last author
Citations (Scopus): 42

Multi-Day Intermodal Travel Planning for Urban Cities Using Ising Machines
Siya Bao, Nozomu Togawa
IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, pp. 54-60, 2023 [Refereed]
Role: Last author
Abstract:

The multi-day intermodal travel planning problem (MITPP) is an optimization problem (OP) that generates the optimal sequences of points of interest (POIs) and hotels while searching for the most suitable transport modes between POIs and hotels. Conventional methods and solvers using von Neumann computers provide good approximate solutions to OPs, but the computation time grows exponentially when dealing with large-scale or complex OPs. Meanwhile, Ising machines or quantum annealing machines are non-von Neumann computers that are designed to solve complex OPs. In this paper, we focus on solving the MITPP by a two-phase Ising-based method. The first phase, POI clustering, generates POI clusters for sightseeing days, and the second phase, POI routing, generates travel routes for each day with the optimal transport modes. Practical factors such as POI satisfaction, POI duration, hotel fee, and transportation fee are included in the MITPP. We map these elements onto quadratic unconstrained binary optimization (QUBO) models. For evaluation, we use a real-world dataset in Sapporo, Japan. Empirical results confirm that the proposed method can effectively solve the MITPP both in terms of solution quality and execution time and outperforms a conventional solver, a conventional method, and the latest Ising-based method.

Citations (Scopus): 3

Smart Device-Based PDR Methods for Indoor Localization
Siya Bao, Nozomu Togawa
Machine Learning for Indoor Localization and Navigation, pp. 27-48, Jan 2023
Abstract:

Smart devices, such as smartphones and smartwatches, are indispensable nowadays for everyone’s daily life due to their mobility and powerful computation capability. Sensors embedded in these devices are relatively low-cost and convenient to carry. Consequently, leveraging the sensors embedded in smart devices has provided new opportunities for indoor PDR developments. In this chapter, we first introduce various types of smart devices and device-based carrying modes. We then describe the types and functionalities of sensors built into these devices, as well as common steps and evaluation metrics in smart device-based PDR methods. Several methods are summarized based on the usage of smart devices, sensors, techniques, and performances. Lastly, we present challenges and issues that remain for current smart device-based PDR methods.

An Ising-Machine-Based Solver of Vehicle Routing Problem With Balanced Pick-Up
Siya Bao, Masashi Tawada, Shu Tanaka, Nozomu Togawa
IEEE Transactions on Consumer Electronics 70(1), pp. 445-459, 2023 [Refereed]
Role: Last author
Abstract:

Vehicle routing applications are ubiquitous in the field of pick-up and delivery service. We focus on the vehicle routing problem with balanced pick-up, called VRPBP, which originates from the package pick-up service. The aim of the problem is not only to efficiently explore the shortest travel route but also to balance loads between depots and vehicles. These problems can be regarded as optimization problems, and recent developments in Ising machines, including quantum annealing machines, bring us a new opportunity to solve complex real-world optimization problems. In this paper, a two-phase method and a three-phase method using Ising machines are proposed for solving the VRPBP. As the applicability of current Ising machines is limited due to the small size of Ising spins and connectivities, we partition the complex problem into two or three sub-problems, and the key elements of each sub-problem are mapped onto quadratic unconstrained binary optimization (QUBO) models to fit in the structure of the Ising machines. We first compared the performances of the Ising machine on the standard TSP and CVRP datasets with a conventional state-of-the-art solver and three conventional methods. Then, we evaluated the performances of the proposed methods compared with five conventional methods for solving the VRPBP. The results confirm the effectiveness of the two proposed methods in solving vehicle-routing-related optimization problems.

Citations (Scopus): 14

QuDASH: Quantum-Inspired Rate Adaptation Approach for DASH Video Streaming.
Bo Wei, Hang Song, Makoto Nakamura, Koichi Kimura, Nozomu Togawa, Jiro Katto
IEEE Access 11, pp. 118462-118473, 2023 [Refereed]
Citations (Scopus): 5

Trip Planning Based on subQUBO Annealing.
Tatsuya Noguchi, Keisuke Fukada, Siya Bao, Nozomu Togawa
IEEE Access 11, pp. 100383-100395, 2023 [Refereed]
Role: Last author
Citations (Scopus): 9

Spin-Variable Reduction Method for Handling Linear Equality Constraints in Ising Machines
Tatsuhiko Shirai, Nozomu Togawa
IEEE Transactions on Computers, 2023 [Refereed]
Role: Last author
Citations (Scopus): 11

Dynamical Process of a Bit-Width Reduced Ising Model With Simulated Annealing.
Shuta Kikuchi, Nozomu Togawa, Shu Tanaka
IEEE Access 11, pp. 95493-95506, 2023 [Refereed]
Citations (Scopus): 6

Fast and Accurate Smartglass Angles Inference Based on Periodic Behavior in Walking.
Dai Sato, Nozomu Togawa
ICCE, pp. 1-6, 2023 [Refereed]

Fast Hyperparameter Tuning for Ising Machines.
Matthieu Parizy, Norihiro Kakuko, Nozomu Togawa
ICCE, pp. 1-6, 2023 [Refereed]
Citations (Scopus): 6

A Quasi-Initial Solution Giving Method for Ising Machines by Controlling External Magnetic Field Coefficients.
Soma Kawakami, Kentaro Ohno, Dema Ba, Satoshi Yagi, Junji Teramoto, Nozomu Togawa
ICCE, pp. 1-6, 2023 [Refereed]
Citations (Scopus): 2

A Constrained Graph Coloring Solver Based on Ising Machines.
Soma Kawakami, Yosuke Mukasa, Siya Bao, Dema Ba, Junya Arai, Satoshi Yagi, Junji Teramoto, Nozomu Togawa
ICCE, pp. 1-6, 2023 [Refereed]
Citations (Scopus): 2

Cardinality Constrained Portfolio Optimization on an Ising Machine.
Matthieu Parizy, Przemyslaw Sadowski, Nozomu Togawa
SOCC, pp. 1-6, 2022 [Refereed]
Citations (Scopus): 5

Effective Hardware-Trojan Feature Extraction Against Adversarial Attacks at Gate-Level Netlists.
Kazuki Yamashita, Tomohiro Kato, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima, Nozomu Togawa
IOLTS, pp. 1-7, 2022 [Refereed]
Citations (Scopus): 8

An Anomalous Behavior Detection Method for IoT Devices Based on Power Waveform Shapes.
Kota Hisafuru, Kazunari Takasaki, Nozomu Togawa
IOLTS, pp. 1-7, 2022 [Refereed]
Citations (Scopus): 3

Hardware-Trojan Detection at Gate-level Netlists using Gradient Boosting Decision Tree Models.
Ryotaro Negishi, Tatsuki Kurihara, Nozomu Togawa
ICCE-Berlin, pp. 1-6, 2022 [Refereed]
Citations (Scopus): 13

Autonomous driving system with feature extraction using a binarized autoencoder.
Kota Hisafuru, Ryotaro Negishi, Soma Kawakami, Dai Sato, Kazuki Yamashita, Keisuke Fukada, Nozomu Togawa
FPT, pp. 1-4, 2022 [Refereed]

Multi-Objective Trip Planning Based on Ant Colony Optimization Utilizing Trip Records.
Etsushi Saeki, Siya Bao, Toshinori Takayama, Nozomu Togawa
IEEE Access 10, pp. 127825-127844, 2022 [Refereed]
Citations (Scopus): 14

Hybrid Annealing Method Based on subQUBO Model Extraction With Multiple Solution Instances.
Yuta Atobe, Masashi Tawada, Nozomu Togawa
IEEE Transactions on Computers 71(10), pp. 2606-2619, 2022 [Refereed]
Role: Last author
Citations (Scopus): 33

Hardware-Trojan Detection Based on the Structural Features of Trojan Circuits Using Random Forests.
Tatsuki Kurihara, Nozomu Togawa
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 105-A(7), pp. 1049-1060, 2022 [Refereed]

QUBO Matrix Distorting Method for Consumer Applications.
Tomokazu Yoshimura, Tatsuhiko Shirai, Masashi Tawada, Nozomu Togawa
ICCE, pp. 1-6, 2022 [Refereed]
Citations (Scopus): 4

Efficient Coefficient Bit-Width Reduction Method for Ising Machines.
Yuta Yachi, Yousuke Mukasa, Masashi Tawada, Nozomu Togawa
ICCE, pp. 1-6, 2022 [Refereed]
Citations (Scopus): 5

A PDR Method Using Smartglasses Reducing Accumulated Errors by Detecting User's Stop Motions.
Dai Sato, Nozomu Togawa
ICCE, pp. 1-2, 2022 [Refereed]
Citations (Scopus): 2

Carrying-mode Free Indoor Positioning Using Smartphone and Smartwatch and Its Evaluations.
Tomoya Wakaizumi, Nozomu Togawa
J. Inf. Process. 30, pp. 52-65, 2022 [Refereed]

An Anomalous Behavior Detection Method Based on Power Analysis Utilizing Steady State Power Waveform Predicted by LSTM
Kazunari Takasaki, Ryoichi Kida, Nozomu Togawa
Proceedings - 2021 IEEE 27th International Symposium on On-Line Testing and Robust System Design, IOLTS 2021, pp. 1-7, Jun 2021 [Refereed]
Abstract:

Hardware security issues have emerged in recent years as Internet of Things (IoT) devices have rapidly spread. Power analysis is one of the methods to detect anomalous operations, but it is hard to apply it to IoT devices where an operating system and various software programs are running and hence its power waveforms become more complex. In this paper, we propose an anomalous behavior detection method utilizing application-specific power behaviors extracted by steady-state power waveform, which is generated by LSTM (long short-term memory). The proposed method is based on extracting application-specific power behaviors by predicting steady-state power waveforms. At that time, by using LSTM, we can effectively predict steady-state power waveforms, even if they include one or more cycled waveforms and/or they are composed of many complex waveforms. In the experiment, we implement three normal application programs and one anomalous application program on a single board computer and apply the proposed method to it. The experimental results demonstrate that the proposed method successfully detects the anomalous power behavior of an anomalous application program, while the existing method cannot.

Citations (Scopus): 4

Hardware-trojan classification based on the structure of trigger circuits utilizing random forests
Tatsuki Kurihara, Nozomu Togawa
Proceedings - 2021 IEEE 27th International Symposium on On-Line Testing and Robust System Design, IOLTS 2021, pp. 1-4, Jun 2021 [Refereed]
Abstract:

Recently, with the spread of Internet of Things (IoT) devices, embedded hardware devices have been used in a variety of everyday electrical items. Due to the increased demand for embedded hardware devices, some of the IC design and manufacturing steps have been outsourced to third-party vendors. Since malicious third-party vendors may insert malicious circuits, called hardware Trojans, into their products, an effective hardware Trojan detection method is strongly required. In this paper, we propose 25 hardware-Trojan features based on the structure of trigger circuits for machine-learning-based hardware Trojan detection. Combining the proposed features with 11 existing hardware-Trojan features, we utilize 36 hardware-Trojan features in total for classification. We then classify the nets in an unknown netlist into a set of normal nets and Trojan nets using a random-forest classifier. The experimental results demonstrate that the average true positive rate (TPR) becomes 63.6% and the average true negative rate (TNR) becomes 100.0%. They improve the average TPR by 14.7 points while keeping the average TNR compared to existing state-of-the-art methods. In particular, the proposed method successfully finds Trojan nets in several benchmark circuits that are not found by the existing method.

Citations (Scopus): 34

Data Augmentation for Machine Learning-Based Hardware Trojan Detection at Gate-Level Netlists
Kento Hasegawa, Seira Hidano, Kohei Nozawa, Shinsaku Kiyomoto, Nozomu Togawa
Proceedings - 2021 IEEE 27th International Symposium on On-Line Testing and Robust System Design, IOLTS 2021, pp. 1-4, Jun 2021 [Refereed]
Abstract:

Due to the rapid growth in the information and telecommunications industries, an untrusted vendor might compromise the complicated supply chain by inserting hardware Trojans (HTs). Although hardware Trojan detection methods at gate-level netlists employing machine learning have been developed, the training dataset is insufficient. In this paper, we propose a data augmentation method for machine-learning-based hardware Trojan detection. Our proposed method replaces a gate in a hardware Trojan circuit with logically equivalent gates. The experimental results demonstrate that our proposed method successfully enhances the classification performance with all the classifiers in terms of the true positive rates (TPRs).

DOI

Scopus

Citations (Scopus): 6
An Approach to the Vehicle Routing Problem with Balanced Pick-up Using Ising Machines

Siya Bao, Masashi Tawada, Shu Tanaka, Nozomu Togawa

2021 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2021 - Proceedings April 2021 [peer-reviewed]

Vehicle routing problems (VRPs) can be solved as optimization problems. Practical applications of VRPs arise in various areas including manufacturing, supply chains, and tourism. Conventional approaches using von Neumann computers obtain good approximate solutions to these optimization problems, but they incur high computation costs for large-scale or complex problems due to combinatorial explosion. In contrast, Ising machines and quantum annealing machines are non-von Neumann computers designed to solve complex optimization problems. In this paper, we propose an Ising-machine-based approach for the vehicle routing problem with balanced pick-up (VRPBP). The VRPBP is motivated by real-world postal item pick-up services. Our approach includes various features of VRP variants. We propose a two-phase approach to solve the VRPBP, and the key elements in each phase are mapped onto quadratic unconstrained binary optimization (QUBO) forms. Specifically, the first phase is a clustering phase, which extends the knapsack problem with additional distance and load-balancing concerns. The second phase is mapped to the traveling salesman problem. Our approach is evaluated in terms of solution quality and computation time in comparison with conventional approaches.

DOI

Scopus

Citations (Scopus): 13
Mapping induced subgraph isomorphism problems to Ising models and its evaluations by an Ising machine

Natsuhito Yoshimura, Masashi Tawada, Shu Tanaka, Junya Arai, Satoshi Yagi, Hiroyuki Uchiyama, Nozomu Togawa

IEICE Transactions on Information and Systems E104.D ( 4 ) 481 - 489 April 2021 [peer-reviewed]

Ising machines have attracted attention as they are expected to solve combinatorial optimization problems at high speed with Ising models corresponding to those problems. The induced subgraph isomorphism problem is a decision problem that determines whether a specific graph structure is included in a whole graph. The problem can be represented by equality constraints in terms of a combinatorial optimization problem. By using the penalty functions corresponding to the equality constraints, we can apply an Ising machine to the induced subgraph isomorphism problem. The problem appears in many practical settings, for example, finding a particular malicious circuit in a device or a particular network structure of chemical bonds in a compound. However, because the number of spin variables in current Ising machines is limited, reducing the number of spin variables is a major concern. Here, we propose an efficient Ising model mapping method to solve the induced subgraph isomorphism problem by Ising machines. Our proposed method theoretically solves the induced subgraph isomorphism problem. Furthermore, the number of spin variables in the Ising model generated by our proposed method is theoretically smaller than that of the conventional method. Experimental results demonstrate that our proposed method can successfully solve the induced subgraph isomorphism problem by using Ising-model-based simulated annealing and a real Ising machine.
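
As a generic illustration of how an equality constraint becomes a penalty function (not the paper's specific mapping), the sketch below expands the one-hot constraint "exactly one of n binary variables is 1" into QUBO coefficients. It uses the identity (Σx_i − 1)² = −Σx_i + 2Σ_{i<j} x_i x_j + 1, which holds because x_i² = x_i for binary variables. The function names are hypothetical.

```python
from itertools import combinations


def one_hot_penalty_qubo(n: int):
    """Return (coefficients, offset) for the penalty (sum_i x_i - 1)^2."""
    q = {(i, i): -1 for i in range(n)}                      # linear terms on the diagonal
    q.update({(i, j): 2 for i, j in combinations(range(n), 2)})  # pairwise terms
    return q, 1                                             # constant offset


def qubo_energy(q, offset, x):
    """Evaluate offset + sum over (i, j) of q[i, j] * x_i * x_j for a binary vector x."""
    return offset + sum(c * x[i] * x[j] for (i, j), c in q.items())
```

The penalty is zero exactly when the constraint holds and positive otherwise, so adding it (scaled by a sufficiently large weight) to an objective steers the minimizer toward feasible assignments.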

DOI

Scopus

Citations (Scopus): 3
Solving constrained slot placement problems using an Ising machine and its evaluations

Sho Kanamaru, Kazushi Kawamura, Shu Tanaka, Yoshinori Tomita, Nozomu Togawa

IEICE Transactions on Information and Systems E104D ( 2 ) 226 - 236 February 2021 [peer-reviewed]

Ising machines have attracted attention because they are expected to obtain better solutions to various combinatorial optimization problems at high speed by mapping the problems to natural phenomena. A slot-placement problem is a combinatorial optimization problem, regarded as a quadratic assignment problem, which relates to optimal logic-block placement in a digital circuit as well as optimal delivery planning. Here, we propose a mapping to the Ising model for solving a slot-placement problem with additional constraints, called a constrained slot-placement problem, where several item pairs must be placed within a given distance. Since the behavior of Ising machines is stochastic and we map the problem to an Ising model that uses the penalty method, the obtained solution does not always satisfy the slot-placement constraint, unlike conventional methods such as conventional simulated annealing. To resolve this problem, we propose an interpretation method in which a feasible solution is generated by post-processing procedures. We measured the execution time of an Ising machine and compared it with the execution time of simulated annealing when solutions with almost the same accuracy are obtained. As a result, we found that the Ising machine is faster than the simulated annealing that we implemented.

DOI

Scopus

Citations (Scopus): 4
An Indoor Positioning Method using Smartphone and Smartwatch Independent of Carrying Modes

Tomoya Wakaizumi, Nozomu Togawa

Digest of Technical Papers - IEEE International Conference on Consumer Electronics 2021-January January 2021 [peer-reviewed]

A pedestrian dead reckoning (PDR) method is one of the positioning methods for indoor environments; it estimates a user's position by using sensors such as acceleration and angular velocity sensors. When a smartphone is used as a PDR sensor device, it can be carried in various modes, such as holding it directly or carrying it inside a pocket. How to deal with these various carrying modes is a major concern in PDR using a smartphone. In this paper, we propose a PDR method based on a combination of a smartphone and a smartwatch. By synchronizing smartphone and smartwatch sensors effectively, the proposed method reduces drift errors and thus estimates the user's position more accurately than using a smartphone alone. Furthermore, even when the user carries the smartphone in various carrying modes, the proposed method still realizes accurate PDR. The experimental results demonstrate that the positioning errors are reduced by approximately 87.5% on average compared to the existing method.

DOI

Scopus

Citations (Scopus): 7
Visiting-Route Recommendation in Amusement Parks and its Evaluations by an Ising Machine

Yosuke Mukasa, Tomoya Wakaizumi, Shu Tanaka, Nozomu Togawa

Digest of Technical Papers - IEEE International Conference on Consumer Electronics 2021-January 1 - 6 January 2021 [peer-reviewed]

In an amusement park, an attraction-visiting route considering the waiting time and traveling time improves visitors' satisfaction and experience. We focus on Ising machines to solve the problem, which are recently expected to solve combinatorial optimization problems at high speed by mapping the problems to Ising models or quadratic unconstrained binary optimization (QUBO) models. We propose a mapping of the visiting-route recommendation problem in amusement parks to a QUBO model for solving it using Ising machines. By using an actual Ising machine, we could obtain feasible solutions 15 times faster with almost the same accuracy as the simulated annealing method for the visiting-route recommendation problem.

DOI

Scopus

Citations (Scopus): 3
Reducing Writing Energy Consumption for Non-Volatile Registers Utilizing Frequent Patterns of Sequential Bits on RISC-V Architecture

Shota Matsuno, Masashi Tawada, Nozomu Togawa

Digest of Technical Papers - IEEE International Conference on Consumer Electronics 2021-January 1 - 6 January 2021 [peer-reviewed]

Single-board computers have spread widely and are used in a variety of situations, where they may be required to operate under low-energy conditions or with an unstable power supply. Utilizing non-volatile memory (NVM), which retains data without power, is one effective solution to this issue. However, compared to volatile memory such as SRAM and DRAM, NVM consumes more energy in write operations. In this paper, we propose an effective energy reduction method for the RISC-V architecture, targeting a type of NVM called spin-transfer torque RAM (STT-RAM). First, we thoroughly investigate the bit patterns written to registers in the RISC-V architecture for various typical application programs and find that most of them can be classified into three patterns, in which most bits of the written 32-bit data are 0s. Second, we propose an energy-reduced register-writing method utilizing these frequent bit patterns. In this method, when the data to be written falls into one of the three frequent bit patterns above, we write only the bit pattern type into the extra bits and do not write the actual data into the registers, which greatly reduces the write energy of NVM registers. Experimental results on the RISC-V architecture demonstrate that the energy consumption is reduced by 12.5%-53.8% by using our proposed method compared to the baseline architecture.

DOI

Scopus
An Ising machine-based solver for visiting-route recommendation problems in amusement parks

Yosuke MUKASA, Tomoya WAKAIZUMI, Shu TANAKA, Nozomu TOGAWA

IEICE Transactions on Information and Systems E104D ( 10 ) 1592 - 1600 2021

In an amusement park, an attraction-visiting route considering the waiting time and traveling time improves visitors' satisfaction and experience. We focus on Ising machines to solve the problem, which are recently expected to solve combinatorial optimization problems at high speed by mapping the problems to Ising models or quadratic unconstrained binary optimization (QUBO) models. We propose a mapping of the visiting-route recommendation problem in amusement parks to a QUBO model for solving it using Ising machines. By using an actual Ising machine, we could obtain feasible solutions one order of magnitude faster with almost the same accuracy as the simulated annealing method for the visiting-route recommendation problem.

DOI

Scopus

Citations (Scopus): 9
Multi-day Travel Planning Using Ising Machines for Real-world Applications.

Siya Bao, Masashi Tawada, Shu Tanaka, Nozomu Togawa

24th IEEE International Intelligent Transportation Systems Conference(ITSC) 3704 - 3709 2021 [peer-reviewed]

DOI

Scopus

Citations (Scopus): 14
A PDR Method Combining Smartphone and Smartwatch based on Multi-Scenario Map Matching.

Tomoya Wakaizumi, Nozomu Togawa

GCCE 308 - 309 2021 [peer-reviewed]

DOI

Scopus
An autonomous driving system utilizing image processing accelerated by FPGA.

Kazunari Takasaki, Kota Hisafuru, Ryotaro Negishi, Kazuki Yamashita, Keisuke Fukada, Tomoya Wakaizumi, Nozomu Togawa

FPT 1 - 4 2021 [peer-reviewed]

DOI

Scopus

Citations (Scopus): 4
Toward Learning Robust Detectors from Imbalanced Datasets Leveraging Weighted Adversarial Training.

Kento Hasegawa, Seira Hidano, Shinsaku Kiyomoto, Nozomu Togawa

CANS 392 - 411 2021 [peer-reviewed]

DOI

Scopus
A Three-Stage Annealing Method Solving Slot-Placement Problems Using an Ising Machine.

Keisuke Fukada, Matthieu Parizy, Yoshinori Tomita, Nozomu Togawa

IEEE Access 9 134413 - 134426 2021 [peer-reviewed]

DOI

Scopus

Citations (Scopus): 12
Experimental evaluations of parallel tempering on an Ising machine

Yosuke Mukasa, Shu Tanaka, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 14 27 - 29 2021 [peer-reviewed]

Ising machines have recently attracted much attention because they are expected to solve combinatorial optimization problems efficiently. We focus on an Ising machine whose algorithm is based on parallel tempering (PT), and experimentally evaluate the performance of the Ising machine for MIN-CUT problems. Experimental results show that the Ising machine outperforms a famous graph partitioning solver in terms of the quality of solution and the time-to-target-solution.

DOI

Scopus

Citations (Scopus): 1
Performance Comparison of Typical Binary-Integer Encodings in an Ising Machine

Kensuke Tamura, Tatsuhiko Shirai, Hosho Katsura, Shu Tanaka, Nozomu Togawa

IEEE Access 9 81032 - 81039 2021 [peer-reviewed]

The differences in performance among binary-integer encodings in an Ising machine, which can solve combinatorial optimization problems, are investigated. Many combinatorial optimization problems can be mapped to find the lowest-energy (ground) state of an Ising model or its equivalent model, the Quadratic Unconstrained Binary Optimization (QUBO). Since the Ising model and QUBO consist of binary variables, they often express integers as binary when using Ising machines. A typical example is the combinatorial optimization problem under inequality constraints. Here, the quadratic knapsack problem is adopted as a prototypical problem with an inequality constraint. It is solved using typical binary-integer encodings: one-hot encoding, binary encoding, and unary encoding. Unary encoding shows the best performance for large-sized problems.
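
The three encodings named in the abstract above can be sketched as decoders from bit vectors to integers. This is a minimal illustration under commonly used definitions (one-hot: the position of the single 1; binary: a weighted sum with powers of two; unary: the count of 1s); the exact formulations in the paper may differ, and the function names are hypothetical.

```python
def decode_one_hot(bits):
    """One-hot: exactly one bit is 1 and its index is the integer value."""
    if sum(bits) != 1:
        raise ValueError("one-hot vector must contain exactly one 1")
    return bits.index(1)


def decode_binary(bits):
    """Binary: value = sum of 2^i * x_i, least significant bit first."""
    return sum(b << i for i, b in enumerate(bits))


def decode_unary(bits):
    """Unary: value = number of bits set to 1."""
    return sum(bits)


def bits_needed(k_max: int):
    """Bits required by each encoding to represent integers 0..k_max (k_max >= 1)."""
    return {"one_hot": k_max + 1, "binary": k_max.bit_length(), "unary": k_max}
```

For example, the integer 3 can be written as `[0, 0, 0, 1, 0, 0]` (one-hot), `[1, 1, 0]` (binary), or `[1, 1, 1, 0, 0]` (unary). Binary needs the fewest variables, while one-hot additionally requires a constraint that exactly one bit is set.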

DOI

Scopus

Citations (Scopus): 48
Generating adversarial examples for hardware-Trojan detection at gate-level netlists

Kohei Nozawa, Kento Hasegawa, Seira Hidano, Shinsaku Kiyomoto, Kazuo Hashimoto, Nozomu Togawa

Journal of Information Processing 29 236 - 246 2021 [peer-reviewed]

Recently, the great demand for integrated circuits (ICs) has driven third parties to be involved in IC design and manufacturing steps. At the same time, the threat of third parties injecting a malicious circuit, called a hardware Trojan, has been increasing. Machine learning is one of the powerful solutions for detecting hardware Trojans. However, a weakness of such machine-learning-based classification methods against adversarial examples (AEs), which cause misclassification by adding perturbations to input samples, has been reported. This paper first proposes a framework for generating adversarial examples for hardware-Trojan detection at gate-level netlists utilizing neural networks. The proposed framework replaces hardware Trojan circuits with logically equivalent ones, making them difficult to detect. Second, we propose a Trojan-net concealment degree (TCD) and a modification evaluating value (MEV) as measures of the amount of modifications. Finally, based on the MEV, we pick up adversarial modification patterns to apply to the circuits against hardware-Trojan detection. The experimental results using benchmarks demonstrate that the proposed framework decreases the true positive rate (TPR) by a maximum of 30.15 points.

DOI

Scopus

Citations (Scopus): 16
A route recommendation method considering individual user’s preferences by Monte-Carlo tree search and its evaluations

Yuta Ishizaki, Yurie Koyama, Toshinori Takayama, Nozomu Togawa

Journal of Information Processing 29 81 - 92 2021 [peer-reviewed]

As smartphones and tablets have become widespread, route recommendation and guidance services have become commonplace. Conventional route recommendation and guidance services try to give the best routes in terms of route length, time required, and train/bus fares, but different users are given the same route when they input the same parameters. However, each user has various preferences regarding safety and comfort. It is strongly desirable to reflect the user’s preferences in route recommendation and recommend the most preferable route to every user. Since a user’s preferences are extremely vague and complicated, how to evaluate them in route recommendation is one of the key problems. In this paper, we propose a route recommendation method, called the P-UCT method, that considers individual users’ preferences by utilizing Monte-Carlo tree search. In the proposed method, we first extract route features based on the route recommendation history of every user and construct a route evaluator based on a Support Vector Machine (SVM). After that, the method generates a random route from a start point to an end point by Monte-Carlo tree search. The route evaluator determines how well every generated route matches the user’s preferences. By repeating the evaluation, the method obtains the route that is expected to be closest to the user’s preferences. Experimental results demonstrate that the proposed method outperforms the existing method in terms of the average evaluation scores. They also demonstrate that the proposed method provides a recommended route reflecting the user’s individual preferences even if it learns from the recommended route history of areas in different situations.

DOI

Scopus

Citations (Scopus): 6
A capacitance measurement device for running hardware devices and its evaluations

Makoto Nishizawa, Kento Hasegawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E103A ( 9 ) 1018 - 1027 September 2020 [peer-reviewed]

In the IoT (Internet-of-Things) era, the number and variety of hardware devices are continuously increasing. Several IoT devices are utilized in infrastructure equipment. How to maintain such IoT devices is a serious concern. Capacitance measurement is one of the powerful ways to detect anomalous states in the structure of hardware devices. In particular, measuring capacitance while the hardware device is running is a major challenge, but no such research has been proposed so far. This paper proposes a capacitance measuring device that measures device capacitance in operation. We first combine an AC (alternating current) voltage signal with the DC (direct current) supply voltage signal to generate a fluctuating signal. We supply the fluctuating signal to the target device instead of the DC supply voltage. By effectively filtering the observed current in the target device, the filtered current becomes proportional to the capacitance value, and thus we can measure the target device capacitance even when it is running. We have implemented the proposed capacitance measuring device on a printed wiring board with a size of 95 mm × 70 mm and evaluated the power consumption and accuracy of the capacitance measurement. The experimental results demonstrate that the power consumption of the proposed capacitance measuring device in low-power mode is reduced by 65% relative to measuring mode and that the proposed device successfully measures capacitance at a resolution of 0.002 µF.

DOI

Scopus

Citations (Scopus): 1
Trojan-net classification for gate-level hardware design utilizing boundary net structures

Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Information and Systems E103D ( 7 ) 1618 - 1622 July 2020 [peer-reviewed]

Cybersecurity has become a serious concern in our daily lives. The malicious functions inserted into hardware devices have been well known as hardware Trojans. In this letter, we propose a hardware-Trojan classification method at gate-level netlists utilizing boundary net structures. We first use a machine-learning-based hardware-Trojan detection method and classify the nets in a given netlist into a set of normal nets and a set of Trojan nets. Based on the classification results, we investigate the net structures around the boundary between normal nets and Trojan nets, and extract the features of the nets mistakenly identified to be normal nets or Trojan nets. Finally, based on the extracted features of the boundary nets, we again classify the nets in a given netlist into a set of normal nets and a set of Trojan nets. The experimental results demonstrate that our proposed method outperforms an existing machine-learning-based hardware-Trojan detection method in terms of its true positive rate.

DOI

Scopus

Citations (Scopus): 9
An Anomalous Behavior Detection Method for IoT Devices by Extracting Application-Specific Power Behaviors

Kazunari Takasaki, Kento Hasegawa, Ryoichi Kida, Nozomu Togawa

Proceedings - 2020 26th IEEE International Symposium on On-Line Testing and Robust System Design, IOLTS 2020 1 - 4 July 2020 [peer-reviewed]

With the widespread use of Internet of Things (IoT) devices in recent years, we utilize a variety of hardware devices in our daily life. On the other hand, hardware security issues are emerging. Power analysis is one of the methods to detect anomalous operations, but it is hard to apply it to IoT devices where an operating system and various software programs are running. In this paper, we propose an anomalous behavior detection method for an IoT device by extracting application-specific power behaviors. First, we measure a power consumption of an IoT device, and obtain the power waveform. Next, we extract an application-specific power waveform by eliminating a steady factor from the obtained power waveform. Finally, we extract feature values from the application-specific power waveform and detect an anomalous behavior by utilizing the local outlier factor (LOF) method. The experimental results using a single board computer demonstrate that the proposed method successfully detects the anomalous power behavior of an anomalous application program.

DOI

Scopus

Citations (Scopus): 2
Evaluation on Hardware-Trojan Detection at Gate-Level IP Cores Utilizing Machine Learning Methods

Tatsuki Kurihara, Kento Hasegawa, Nozomu Togawa

Proceedings - 2020 26th IEEE International Symposium on On-Line Testing and Robust System Design, IOLTS 2020 1 - 4 July 2020 [peer-reviewed]

Recently, with the spread of Internet of Things (IoT) devices, embedded hardware devices have been used in a variety of everyday electrical items. Due to the increased demand for embedded hardware devices, some of the IC design and manufacturing steps have been outsourced to third-party vendors. Since malicious third-party vendors may insert hardware Trojans into their products, developing an effective hardware Trojan detection method is strongly required. In this paper, we evaluate hardware Trojan detection methods using neural networks and random forests at gate-level intellectual property (IP) cores that contain more than 10,000 nets. First, we extract 11 features for each net in a given netlist, and learn them with neural networks and random forests. Then, we classify the nets in an unknown netlist into a set of normal nets and Trojan nets based on the learned classifiers. The experimental results demonstrate that the average true positive rate becomes 84.6% and the average true negative rate becomes 95.1%, which is sufficiently high accuracy compared to existing evaluations.

DOI

Scopus

Citations (Scopus): 34
Designing stochastic number generators sharing a random number source based on the randomization function

Masashi Tawada, Nozomu Togawa

NEWCAS 2020 - 18th IEEE International New Circuits and Systems Conference, Proceedings 271 - 274 June 2020 [peer-reviewed]

In this study, we propose a novel stochastic number generator architecture and prove that the resulting circuit can deliver independent stochastic numbers and improve the accuracy of the calculation results obtained using some recent conventional stochastic computing-based arithmetic circuits. This study is motivated by the increasingly important role of stochastic computing in various fields, such as the digital circuit design, where the stochastic number generators are responsible for a significant share of the hardware cost.

DOI

Scopus

Citations (Scopus): 2
Document-level sentiment classification in Japanese by stem-based segmentation with category and data-source information

Siya Bao, Nozomu Togawa

Proceedings - 14th IEEE International Conference on Semantic Computing, ICSC 2020 311 - 314 February 2020 [peer-reviewed]

Existing studies focus on text information while ignoring category and data source information, both of which are shown in this paper to be important in interpreting sentiments in travel comments. Furthermore, the unique linguistic characteristics of Japanese make it difficult to apply conventional token-based word segmentation methods to Japanese comments directly. In this paper, we propose a method of stem-based segmentation based on Japanese linguistic characteristics and incorporate it with category and data source information into a hierarchical network model for document-level sentiment classification. Empirical results show that our proposed model outperforms existing models on a real-world dataset.

DOI

Scopus

Citations (Scopus): 1
Multi-Resolutional Image Format Using Stochastic Numbers and Its Hardware Implementation

Ryota Ishikawa, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

2020 IEEE 11th Latin American Symposium on Circuits and Systems, LASCAS 2020 1 - 4 February 2020 [peer-reviewed]

The popularization of IoT devices has made image processing very common for users. Image formats used in hardware abound since there is a wide variety of IoT devices. Converting image formats in hardware is relatively complicated compared with other calculations. This paper focuses on the conversion of image resolution, especially image reduction. By expressing images with stochastic numbers, this paper proposes an image format in which a single piece of data can be treated as an image at several resolutions. From experimental evaluations, we found that the proposed image format enables image reduction by pixel averaging to be implemented in hardware at lower cost than conventional pixel averaging using binary numbers. Also, image magnification using the proposed image format can restore the original image, while conventional image magnification cannot.

DOI

Scopus
Adversarial examples for hardware-Trojan detection at gate-level netlists

Kohei Nozawa, Kento Hasegawa, Seira Hidano, Shinsaku Kiyomoto, Kazuo Hashimoto, Nozomu Togawa

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 11980 LNCS 341 - 359 2020 [peer-reviewed]

Recently, due to the increase in outsourcing of integrated circuit (IC) design and manufacturing, the threat of third parties injecting a malicious circuit, called a hardware Trojan, has been increasing. Machine learning is known to produce powerful models for detecting hardware Trojans. However, it has recently been reported that such machine-learning-based detection is weak against adversarial examples (AEs), which cause misclassification by adding perturbations to input data. Referring to the existing studies on adversarial examples, most of which are discussed in the field of image processing, this paper first proposes a framework for generating adversarial examples for hardware-Trojan detection at gate-level netlists utilizing neural networks. The proposed framework replaces hardware Trojan circuits with logically equivalent circuits, making them difficult to detect. Second, we define the Trojan-net concealment degree (TCD) as a possibility of misclassification, and the modification evaluating value (MEV) as a measure of the amount of modifications. Third, judging from the MEV, we pick up adversarial modification patterns to apply to the circuits against hardware-Trojan detection. The experimental results using benchmarks demonstrate that the proposed framework decreases the true positive rate (TPR) by at most 30.15 points.

DOI

Scopus

Citations (Scopus): 12
How to Reduce the Bit-width of an Ising Model by Adding Auxiliary Spins

Daisuke Oku, Masashi Tawada, Shu Tanaka, Nozomu Togawa

IEEE Transactions on Computers 71 ( 1 ) 223 - 234 2020 [peer-reviewed]

Annealing machines have been developed as non-von Neumann computers aimed at solving combinatorial optimization problems efficiently. To use annealing machines for solving combinatorial optimization problems, we have to represent the objective function and constraints by an Ising model, which is a theoretical model in statistical physics. Further, it is necessary to transform the Ising model according to the hardware limitations. In the transformation, the process of effectively reducing the bit-widths of coefficients in the Ising model has hardly been studied so far. Thus, when we consider the Ising model with a large bit-width, a naive method, which means right bit-shift, has to be applied. Since it is expected that obtaining highly accurate solutions is difficult by the naive method, it is necessary to construct a method for efficiently reducing the bit-width. This paper proposes methods for reducing the bit-widths of interaction and external magnetic field coefficients in the Ising model and proves that the reduction gives theoretically the same ground state of the original Ising model. The experimental evaluations also demonstrate the effectiveness of our proposed methods.
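
The concern about the naive right bit-shift can be illustrated with a toy example. The sketch below assumes the energy convention H(s) = Σ_{i<j} J_ij s_i s_j + Σ_i h_i s_i with spins s_i ∈ {−1, +1} (sign conventions vary between papers), shifts coefficient magnitudes while keeping their signs, and finds ground states by brute force. The instance and all names are hypothetical, and the paper's proposed auxiliary-spin method is not shown.

```python
from itertools import product


def shift_coeff(c: int, k: int) -> int:
    """Right-shift the magnitude of c by k bits, keeping its sign."""
    sign = -1 if c < 0 else 1
    return sign * (abs(c) >> k)


def energy(J, h, s):
    """Assumed convention: H(s) = sum J[i,j]*s_i*s_j + sum h[i]*s_i."""
    pair = sum(c * s[i] * s[j] for (i, j), c in J.items())
    field = sum(c * s[i] for i, c in h.items())
    return pair + field


def ground_states(J, h, n):
    """Brute-force all 2^n spin configurations and return the minimizers."""
    states = list(product((-1, 1), repeat=n))
    best = min(energy(J, h, s) for s in states)
    return [s for s in states if energy(J, h, s) == best]


# Toy two-spin instance: a 3-bit interaction and a 1-bit external field.
J = {(0, 1): 4}
h = {0: 1, 1: 0}
J_shift = {key: shift_coeff(c, 1) for key, c in J.items()}
h_shift = {key: shift_coeff(c, 1) for key, c in h.items()}
```

In this toy instance the original model has the single ground state `(-1, 1)`, but after a one-bit shift the external field is truncated to zero and `(1, -1)` becomes a ground state as well, so the shifted model no longer identifies the original optimum uniquely.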

DOI

Scopus

Citations (Scopus): 21
Guiding Principle for Minor-Embedding in Simulated-Annealing-Based Ising Machines

Tatsuhiko Shirai, Shu Tanaka, Nozomu Togawa

IEEE Access 8 210490 - 210502 2020 [peer-reviewed]

We propose a novel type of minor-embedding (ME) in simulated-annealing-based Ising machines. The Ising machines can solve combinatorial optimization problems. Many combinatorial optimization problems are mapped to find the ground (lowest-energy) state of the logical Ising model. When connectivity is restricted on Ising machines, ME is required for mapping from the logical Ising model to a physical Ising model, which corresponds to a specific Ising machine. Herein we discuss the guiding principle of ME design to achieve a high performance in Ising machines. We derive the proposed ME based on a theoretical argument of statistical mechanics. The performance of the proposed ME is compared with two existing types of MEs for different benchmarking problems. Simulated annealing shows that the proposed ME outperforms existing MEs for all benchmarking problems, especially when the distribution of the degree in a logical Ising model has a large standard deviation. This study validates the guiding principle of using statistical mechanics for ME to realize fast and high-precision solvers for combinatorial optimization problems.

DOI

Scopus

Citations (Scopus): 17
A new LDPC code decoding method: Expanding the scope of Ising machines

Masashi Tawada, Shu Tanaka, Nozomu Togawa

Digest of Technical Papers - IEEE International Conference on Consumer Electronics 2020-January 1 - 6 January 2020 [peer-reviewed]

Low-density parity-check (LDPC) codes have previously been considered as combinatorial optimization problems (COPs) with respect to their decoding. However, after defining them as such, no study has gone so far as to convert the LDPC code into a quadratic unconstrained binary optimization (QUBO) problem. Thus, a new method is created: one that converts the LDPC code to a QUBO problem, inputs the QUBO problem into Ising machines (computers based on the Ising model that are designed to solve QUBO problems), obtains the QUBO solution, and converts it to an LDPC solution. By utilizing an actual Ising machine, LDPC solutions with a code length of 256 bits have been obtained with an accuracy of 93.9% at an average annealing time of 214.0 ms. The benefit of this methodology goes beyond its theoretical contribution of obtaining LDPC solutions more accurately. Only a few years have passed since Ising machines were developed. Therefore, this method expands the current scope of studies involving Ising machines, helping current and future researchers unlock their full range of capabilities and possibilities.

DOI

Scopus

Citations (Scopus): 5
Theory of Ising Machines and a Common Software Platform for Ising Machines

Shu Tanaka, Yoshiki Matsuda, Nozomu Togawa

Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC 2020-January 659 - 666 January 2020 [peer-reviewed]

Ising machines are a new type of non-von Neumann computer that specializes in solving combinatorial optimization problems efficiently. The input form of Ising machines is the energy function of the Ising model or the quadratic unconstrained binary optimization form, and Ising machines operate to search for a condition that minimizes the energy function. We describe the theory of Ising machines and the present status of Ising machines, software for Ising machines, and applications using Ising machines.

DOI

Scopus

16

被引用数

(Scopus)
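
As a small worked illustration of the two input forms named in the abstract above, the Ising energy function and the quadratic unconstrained binary optimization form are related by a change of variables. The symbols J_ij, h_i, s_i and x_i are notation assumed here for illustration, and sign conventions differ between references, so this is a sketch rather than the formulation used in the paper.

```latex
\begin{align*}
% Ising form, spins s_i \in \{-1, +1\}, symmetric couplings J_{ij} = J_{ji}
H(s) &= \sum_{i<j} J_{ij}\, s_i s_j + \sum_i h_i\, s_i \\
% Substitute s_i = 2 x_i - 1 with binary x_i \in \{0, 1\}
s_i s_j &= 4 x_i x_j - 2 x_i - 2 x_j + 1 \\
% Resulting binary (QUBO) form plus a constant offset
H(x) &= \sum_{i<j} 4 J_{ij}\, x_i x_j
      + \sum_i \Bigl( 2 h_i - 2 \sum_{j \neq i} J_{ij} \Bigr) x_i
      + \sum_{i<j} J_{ij} - \sum_i h_i
\end{align*}
```

The constant offset does not change which state minimizes the energy, so either form can be handed to a solver that accepts it.
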
FPGA-based Heterogeneous Solver for Three-Dimensional Routing

Kento Hasegawa, Ryota Ishikawa, Makoto Nishizawa, Kazushi Kawamura, Masashi Tawada, Nozomu Togawa

Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC 2020-January 11 - 12, January 2020 [Peer-reviewed]

A heuristic algorithm is one approach to solving an NP-hard problem. To enhance the capability of such a system, heterogeneous computing is often adopted. In this paper, we propose an FPGA-based heterogeneous solver for three-dimensional routing. The proposed system is implemented on multiple FPGA boards and a single-board computer. The experimental results demonstrate that the proposed system outperforms a single-FPGA system.
Scalable stochastic number duplicators for accuracy-flexible arithmetic circuit design

Ryota Ishikawa, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 13 10 - 20, 2020 [Peer-reviewed]

Stochastic computing is a computation method which can implement arithmetic operations by simple logic circuits. This method uses stochastic numbers, whose values are defined by the appearance rates of 1's in their bit streams. By the nature of stochastic computing, changing the length of the input stochastic numbers changes the whole circuit's accuracy. However, in some implementations with re-convergence paths, the circuit itself causes errors, i.e., the length of the input stochastic numbers does not change that circuit's accuracy. This paper proposes a stochastic number duplicator whose outputs differ every time and are all independent. This stochastic number duplicator has a scalable structure obtained by changing the number of flip-flops for bit re-arrangement. From the experimental evaluations and discussions, we clarify that the proposed stochastic number duplicator enables accuracy-flexible circuits.

Citations (Scopus): 1
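
The re-convergence problem described in the abstract above can be illustrated in software. The following is a minimal sketch, assuming unipolar stochastic numbers and AND-gate multiplication; the function names are hypothetical, and the freshly generated independent stream only stands in for what a hardware duplicator is meant to provide. It does not model the flip-flop-based duplicator proposed in the paper.

```python
import random


def to_stochastic(value: float, length: int, rng: random.Random) -> list[int]:
    """Encode a value in [0, 1] as a bit stream whose rate of 1s approximates it."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def sn_value(bits: list[int]) -> float:
    """Decode a stochastic number: the fraction of 1s in the stream."""
    return sum(bits) / len(bits)


def sn_multiply(a: list[int], b: list[int]) -> list[int]:
    """Bitwise AND; this multiplies the values only if the streams are independent."""
    return [x & y for x, y in zip(a, b)]


rng = random.Random(0)  # fixed seed so the run is repeatable
x = to_stochastic(0.5, 4096, rng)
independent_copy = to_stochastic(0.5, 4096, rng)

# Re-convergent use: the same stream feeds both inputs, so the result is x, not x * x.
shared = sn_value(sn_multiply(x, x))
# Independent streams with the same value: the result approximates x * x.
separate = sn_value(sn_multiply(x, independent_copy))

print("x =", sn_value(x), "shared =", shared, "separate =", separate)
```

Making the stream longer does not repair the shared case, which matches the abstract's remark that input length does not change the accuracy of such a circuit.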
A travel decision support algorithm: Landmark activity extraction from Japanese travel comments

Siya Bao, Masao Yanagisawa, Nozomu Togawa

Studies in Computational Intelligence 849 109 - 123, 2020 [Peer-reviewed]

To help people smoothly and efficiently make travel decisions, we utilize the advantages of travel comments posted by thousands of other travelers. In this paper, we analyze the feasibility of exploring landmark activity queries and representative examples from Japanese travel comments. Contributions in this paper include a framework for extracting activity concerned keywords and queries, quantifying the relationship between landmark activities and comment contents. An evaluation of activity-example extraction is conducted in two case studies through 18,939 travel comments.

Implementation of a ROS-Based Autonomous Vehicle on an FPGA Board

Kento Hasegawa, Kazunari Takasaki, Makoto Nishizawa, Ryota Ishikawa, Kazushi Kawamura, Nozomu Togawa

2019 International Conference on Field-Programmable Technology (ICFPT), December 2019
Error correction system using stochastic numbers in symmetric channels and Z channels

Ryota Ishikawa, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

2019 26th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2019 578 - 581, November 2019

In the fields of image processing and neural networks, the increasing amount of data and the complexity of functions are making circuits more precise, which makes errors in circuits relatively large. When upper bits of binary signals flip due to noise, the value increases or decreases drastically. On the other hand, if stochastic numbers are used, the change in their values is the same whichever bit flips, since all the bits have the same weight. Therefore, stochastic computing, a computation method based on stochastic numbers, is attracting interest. Stochastic computing does have error tolerance, but cannot restore the bit stream if the bits are erroneous. Here, we focus on evaluating the error-free value from the bit error rate and the erroneous value. We have previously proposed a method to correct errors of stochastic numbers by measuring the bit error rate and filtering the values properly in symmetric channels. In this paper, we propose a method that can correct errors in Z channels as well as symmetric channels. From experimental evaluations, in environments with a high bit error rate, this proposal gives a better peak signal-to-noise ratio than conventional error correction coding.

Citations (Scopus): 2
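
One simple way to recover an error-free value from a measured bit error rate, as the abstract above describes in general terms, is to invert the expected effect of the channel on the rate of 1s. The sketch below is an assumption-laden illustration and not the filter used in the paper: the function names are hypothetical, the bit error rate is assumed to be known exactly, and the Z channel is assumed to flip only 1 to 0.

```python
def correct_symmetric(observed: float, ber: float) -> float:
    """Invert a binary symmetric channel in which every bit flips with probability ber."""
    if not 0.0 <= ber < 0.5:
        raise ValueError("ber must be in [0, 0.5) for the inversion to be well defined")
    # Expected observed rate: observed = p * (1 - ber) + (1 - p) * ber
    # Solving for p gives:     p = (observed - ber) / (1 - 2 * ber)
    estimate = (observed - ber) / (1.0 - 2.0 * ber)
    return min(1.0, max(0.0, estimate))  # clamp to the valid value range


def correct_z(observed: float, ber: float) -> float:
    """Invert a Z channel, assumed here to flip only 1 -> 0 with probability ber."""
    if not 0.0 <= ber < 1.0:
        raise ValueError("ber must be in [0, 1)")
    # Expected observed rate: observed = p * (1 - ber), so p = observed / (1 - ber)
    return min(1.0, max(0.0, observed / (1.0 - ber)))


# A stochastic number of value 0.3 after each channel, in expectation.
print(correct_symmetric(0.3 * 0.9 + 0.7 * 0.1, ber=0.1))
print(correct_z(0.3 * 0.8, ber=0.2))
```

Because every bit of a stochastic number carries the same weight, the correction acts on the decoded value rather than on individual bits, which is why the bit stream itself is not restored.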
A multiple coefficients trial method to solve combinatorial optimization problems for simulated-annealing-based Ising machines

Kota Takehara, Daisuke Oku, Yoshiki Matsuda, Shu Tanaka, Nozomu Togawa

IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2019- 64 - 69, September 2019

When solving a combinatorial optimization problem with Ising machines, we have to formulate it as an energy function of the Ising model. Here, how to determine the penalty coefficients in the energy function is a great concern if it includes constraint terms. In this paper, we focus on the traveling salesman problem (TSP), one of the combinatorial optimization problems with equality constraints. First, we investigate the relationship between the penalty coefficient and the accuracy of solutions in a TSP. Based on this, we propose a method to obtain a quasi-optimum TSP solution, which is called the multiple coefficients trial method. In our proposed method, we use an Ising machine to solve a TSP while changing the penalty coefficient every trial, so that the TSP solutions converge very fast in total. Compared to naive methods using simulated-annealing-based Ising machines, we confirmed that our proposed method can reduce the total number of annealing iterations needed to obtain a quasi-optimum solution in 32-city TSPs to between 1/10 and 1/1000.

Citations (Scopus): 11
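
To make the role of the penalty coefficient in the abstract above concrete, the sketch below evaluates a commonly used binary energy function for the TSP: tour length plus a penalty on violated equality constraints. This formulation, the function name tsp_energy, and the three-city distance matrix are assumptions for illustration; the paper's exact energy function and its trial schedule are not reproduced here.

```python
def tsp_energy(x, dist, penalty):
    """Energy of assignment x, where x[t][i] = 1 means city i is visited at step t."""
    n = len(dist)
    # Objective term: distance between the cities chosen at consecutive steps (cyclic).
    tour_length = sum(
        dist[i][j] * x[t][i] * x[(t + 1) % n][j]
        for t in range(n)
        for i in range(n)
        for j in range(n)
    )
    # Equality constraints: exactly one city per step and exactly one step per city.
    one_city_per_step = sum((sum(x[t]) - 1) ** 2 for t in range(n))
    one_step_per_city = sum(
        (sum(x[t][i] for t in range(n)) - 1) ** 2 for i in range(n)
    )
    return tour_length + penalty * (one_city_per_step + one_step_per_city)


dist = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]   # hypothetical symmetric distances
tour = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # feasible: visit cities 0, 1, 2 in order
empty = [[0, 0, 0] for _ in range(3)]      # infeasible: no city is ever visited

for penalty in (0.5, 10.0):
    print(penalty, tsp_energy(tour, dist, penalty), tsp_energy(empty, dist, penalty))
```

With the small coefficient the infeasible all-zero state has lower energy than the valid tour, while with the large coefficient the valid tour wins. A coefficient that is too large, on the other hand, tends to dominate the tour-length term during annealing, which is the kind of trade-off that motivates trying several coefficients.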
Efficient Ising model mapping for induced subgraph isomorphism problems using Ising machines

Natsuhito Yoshimura, Masashi Tawada, Shu Tanaka, Junya Arai, Satoshi Yagi, Hiroyuki Uchiyama, Nozomu Togawa

IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2019- 227 - 232, September 2019

Ising machines have attracted attention as they are expected to solve combinatorial optimization problems at high speed with Ising models corresponding to those problems. An induced subgraph isomorphism problem is a decision problem that determines whether a specific graph structure is included in a whole graph or not. The problem can be represented by equality constraints. By using penalty functions corresponding to the equality constraints, we can apply an Ising machine to the induced subgraph isomorphism problem. The induced subgraph isomorphism problem appears in many practical problems, for example, finding a particular malicious circuit in a device or a particular network structure of chemical bonds in a compound. However, due to the limited number of spin variables in current Ising machines, reducing the number of spin variables is a major concern. Here, we propose an efficient Ising model mapping method to solve the induced subgraph isomorphism problem by Ising machines. Our proposed method theoretically solves the induced subgraph isomorphism problem. Furthermore, the number of spin variables in the Ising model generated by our proposed method is theoretically smaller than that of the conventional method. Experimental results demonstrate that our proposed method can successfully solve the induced subgraph isomorphism problem using Ising-model-based simulated annealing.

Citations (Scopus): 8
A route recommendation method based on personal preferences by Monte-Carlo tree search

Yuta Ishizaki, Toshinori Takayama, Nozomu Togawa

IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2019- 404 - 409, September 2019

In this paper, we propose a route recommendation method, called P-UCT method, considering individual user's preferences utilizing Monte-Carlo tree search. In the proposed method, we firstly extract route features based on the route recommendation history of every user and construct a route evaluator based on Support Vector Machine (SVM). After that, the method generates a random route from a start point to an end point by Monte-Carlo tree search. The route evaluator determines how well every generated route matches the user's preferences. By repeating the evaluation, the method obtains the route, which must be closest to the user's preferences. Experimental results demonstrate that the proposed method outperforms the existing method from the viewpoint of the average evaluation scores.

Citations (Scopus): 2
Error Correction Coding of Stochastic Numbers Using BER Measurement

Ryota Ishikawa, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design, IOLTS 2019 243 - 246, July 2019

In electric circuits, errors are unavoidable. When upper bits of binary signals flip due to noise, the value increases or decreases drastically. On the other hand, if stochastic numbers are used, the change in their values is the same whichever bit flips, since all the bits have the same weight. Therefore, stochastic computing, a computation method based on stochastic numbers, is attracting interest. Stochastic computing does have error tolerance, but cannot restore the bit stream if the bits are erroneous. Here, this paper focuses on evaluating the error-free value from the bit error rate and the erroneous value. We propose a method to correct errors of stochastic numbers by measuring the bit error rate and filtering the values properly. From experimental evaluations, in environments with errors of more than 21%, this proposal gives a better peak signal-to-noise ratio than conventional error correction coding.

Citations (Scopus): 1
Empirical Evaluation on Anomaly Behavior Detection for Low-Cost Micro-Controllers Utilizing Accurate Power Analysis

Kento Hasegawa, Kiyoshi Chikamatsu, Nozomu Togawa

2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design, IOLTS 2019 54 - 57, July 2019

Since hardware/software vendors produce their IoT products easily and inexpensively, they often outsource their designs to third-party vendors where malicious third-party vendors can have a chance to insert software Trojans as well as 'hardware Trojans' into their IoT devices. How to tackle the issue becomes a serious concern these days. In this paper, we propose an anomaly behavior detection method utilizing accurate power analysis for low-cost micro-controllers. Our method accurately measures power consumption of the target device, and then classifies its waveform into the sleep-mode part, in which a micro-controller saves power, and into the active-mode part, in which a micro-controller works in a normal operation. After that, we obtain the duration time and consumed power from each active-mode period as feature values. Finally, we detect abnormal behavior based on the obtained feature values utilizing an outlier detection method. In our experiments, we empirically evaluate the proposed method utilizing two types of micro-controllers, and the experimental results demonstrate that our proposed method successfully detects abnormal behaviors.

Citations (Scopus): 4
A fully-connected Ising model embedding method and its evaluation for CMOS annealing machines

Oku, D., Terada, K., Hayashi, M., Yamaoka, M., Tanaka, S., Togawa, N.

IEICE Transactions on Information and Systems E102D ( 9 ), 2019

Citations (Scopus): 31
Personalized Landmark Recommendation for Language-Specific Users by Open Data Mining

Siya Bao, Masao Yanagisawa, Nozomu Togawa

Studies in Computational Intelligence 791 107 - 121, 2019 [Peer-reviewed]

This paper proposes a personalized landmark recommendation algorithm aiming at exploring new insights into the determinants of landmark satisfaction prediction. We gather 1,219,048 user-generated comments in Tokyo, Shanghai and New York from four travel websites. We find that users have diverse satisfaction on landmarks. Based on those findings, we propose an effective algorithm for personalized landmark satisfaction prediction. Our algorithm provides the top-6 landmarks with the highest satisfaction to users for a one-day trip plan. Our proposed algorithm performs better than previous studies from the viewpoints of landmark recommendation and landmark satisfaction prediction.

Citations (Scopus): 1
Bicycle behavior recognition using 3-axis acceleration sensor and 3-axis gyro sensor equipped with smartphone

Usami, Y., Ishikawa, K., Takayama, T., Yanagisawa, M., Togawa, N.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E102A ( 8 ) 953 - 965, 2019 [Peer-reviewed]

Citations (Scopus): 3
Static Error Analysis and Optimization of Faithfully Truncated Adders for Area-Power Efficient FIR Designs.

Jinghao Ye, Nozomu Togawa, Masao Yanagisawa, Youhua Shi

IEEE International Symposium on Circuits and Systems, ISCAS 2019, Sapporo, Japan, May 26-29, 2019 2019-May 1 - 4, 2019 [Peer-reviewed]

Citations (Scopus): 6
Efficient Ising Model Mapping to Solving Slot Placement Problem.

Sho Kanamaru, Daisuke Oku, Masashi Tawada, Shu Tanaka, Masato Hayashi, Masanao Yamaoka, Masao Yanagisawa, Nozomu Togawa

IEEE International Conference on Consumer Electronics, ICCE 2019, Las Vegas, NV, USA, January 11-13, 2019 1 - 6, 2019 [Peer-reviewed]

Citations (Scopus): 12
An FPGA Implementation Method based on Distributed-register Architectures.

Koichi Fujiwara, Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

IPSJ Trans. System LSI Design Methodology 12 38 - 41, 2019 [Peer-reviewed]
A robust indoor/outdoor detection method based on spatial and temporal features of sparse GPS measured positions

Iwata, S., Ishikawa, K., Takayama, T., Yanagisawa, M., Togawa, N.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E102A ( 6 ) 860 - 865, 2019 [Peer-reviewed]

Citations (Scopus): 3
A multiple cyclic-route generation method with route length constraint considering point-of-interests

Nishimura, T., Ishikawa, K., Takayama, T., Yanagisawa, M., Togawa, N.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E102A ( 4 ) 641 - 653, 2019 [Peer-reviewed]
Bicycle behavior recognition using sensors equipped with smartphone

Yuri Usami, Kazuaki Ishikawa, Toshinori Takayama, Masao Yanagisawa, Nozomu Togawa

IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2018-, December 2018

It becomes possible to prevent accidents beforehand by predicting dangerous riding behavior based on recognition of bicycle behaviors. In this paper, we propose a bicycle behavior recognition method using a three-axis acceleration sensor and three-axis gyro sensor equipped with a smartphone. We focus on the periodic handlebar motions for balancing while running a bicycle and reduce the sensor noises caused by them. After that, we use machine learning for recognizing the bicycle behaviors, effectively utilizing the motion features in bicycle behavior recognition. The experimental results demonstrate that the proposed method accurately recognizes the four bicycle behaviors of stop, run straight, turn right, and turn left and its F-measure becomes around 0.9 while the F-measure of the existing method just reaches 0.6-0.8.

Citations (Scopus): 18
2n RRR: Improved stochastic number duplicator based on bit re-arrangement

Ryota Ishikawa, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

2018 New Generation of CAS, NGCAS 2018 182 - 185, December 2018

In the fields of machine learning and image processing, low-cost circuits with low energy are required instead of extreme precision, and stochastic computing (SC), a type of approximate computing, is attracting attention. In SC, stochastic numbers (SNs), bit streams whose values are the appearance rates of 1's, are used. SC enables calculations with simple circuits. To make the calculation results correct, duplication of an SN (generating an SN with the same value) is required when using an SN with the same value. The conventional SN duplicator composed of a flip-flop (FF) has a problem that the output SN only depends on the input SN. Therefore, if the FF-based duplicator is used in a circuit with re-convergence paths, the output SN becomes erroneous. This paper proposes an SN duplicator, 2n RRR, that can produce more independent outputs through its improved flexibility of bit re-arrangement. With this duplicator, the errors of the hyperbolic tangent function are reduced by up to 50% compared to the duplicator that we proposed previously. Also, the circuit area is reduced by up to more than 99.9% compared to the implementation of binary computing.

Citations (Scopus): 1
Hardware Trojan detection and classification based on logic testing utilizing steady state learning

Masaru Oya, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E101A ( 12 ) 2308 - 2319, December 2018

Modern digital integrated circuits (ICs) are often designed and fabricated by third parties and tools, which can make IC design/fabrication vulnerable to malicious modifications. The malicious circuits are generally referred to as hardware Trojans (HTs) and they are considered to be a serious security concern. In this paper, we propose a logic-testing based HT detection and classification method utilizing steady state learning. We first observe that HTs are hidden while applying random test patterns in a short time but most of them can be activated in a very long-term random circuit operation. Hence it is very natural that we learn steady signal-transition states of every suspicious Trojan net in a netlist by performing short-term random simulation. After that, we simulate or emulate the netlist for a very long time by giving random test patterns and obtain a set of signal-transition states. By discovering correlation between them, our method detects HTs and finds out their behavior. HTs sometimes do not affect primary outputs but just leak information over side channels. Our method can be successfully applied to those types of HTs. Experimental results demonstrate that our method can successfully identify all the real Trojan nets to be Trojan nets and all the normal nets to be normal nets, while other existing logic-testing HT detection methods cannot detect some of them. Moreover, our method can successfully detect HTs even if they are not really activated during long-term random simulation. Our method also correctly guesses the HT behavior utilizing signal transition learning.

Citations (Scopus): 2
Personalized landmark recommendation algorithm based on language-specific satisfaction prediction using heterogeneous open data sources

Siya Bao, Masao Yanagisawa, Nozomu Togawa

Proceedings - 2018 10th International Conference on Computational Intelligence and Communication Networks, CICN 2018 70 - 76, August 2018

This paper proposes a personalized landmark recommendation algorithm based on the prediction of users' satisfaction on landmarks. We have accumulated 270,239 user-generated comments from the travel websites Ctrip, Jaran and TripAdvisor for 196 landmarks in Tokyo, Japan. We find that users do have different satisfaction on landmarks depending on their commonly used languages and travel websites. Then we establish a database for landmarks with abundant and accurate landmark type and landmark satisfaction information. Finally, we propose an effective personalized landmark satisfaction prediction algorithm based on users' landmark type, language and travel website preferences. After that, the landmarks with the top-6 highest satisfaction are provided to the user for a one-day visit plan in Tokyo. Experimental results demonstrate that the proposed algorithm can recommend landmarks that fit the user's preferences and that it also successfully predicts the user's landmark satisfaction with a low error rate of less than 7%, which is superior to previous studies.
Robust AES circuit design for delay variation using suspicious timing error prediction

Yuki Yahagi, Masao Yanagisawa, Nozomu Togawa

Proceedings - International SoC Design Conference 2017, ISOCC 2017 101 - 102, May 2018 [Peer-reviewed]

This paper proposes a robust AES (advanced encryption standard) circuit for delay variation. In our proposed AES circuit, suspicious timing error prediction circuits (STEPCs) and their associating gating circuit are incorporated into a normal AES circuit to predict timing errors. STEPCs are inserted between inter-module connections and thus we can monitor almost all of the signal paths between registers and effectively prevent timing errors. The simulation results demonstrate that our AES circuit with STEPCs can be overclocked by up to 1.66X with just 8.05% area overheads.

A selector-based FFT processor and its FPGA implementation

Yuya Hirai, Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

Proceedings - International SoC Design Conference 2017, ISOCC 2017 88 - 89, May 2018 [Peer-reviewed]

Fast Fourier transform (FFT) is used in various applications such as signal processing, and a high-speed FFT processor is in strong demand. In this paper, we propose a high-speed FFT processor based on selector logics. The selector-based FFT processor is constructed by focusing on the subtract-multiplication operations and partly applying selector logics to them. Furthermore, we implement the selector-based FFT processor on a Xilinx FPGA. Experimental results show that our proposed FFT processor can improve the processing speed by up to 21% and also reduce the number of LUTs by up to 33% compared with a naive FFT processor.
A loop structure optimization targeting high-level synthesis of fast number theoretic transform

Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

Proceedings - International Symposium on Quality Electronic Design, ISQED 2018- 106 - 111, May 2018 [Peer-reviewed]

Multiplication with a large number of digits is heavily used when processing data encrypted by a fully homomorphic encryption, which is a bottleneck in computation time. An algorithm utilizing fast number theoretic transform (FNTT) is known as a high-speed multiplication algorithm and the further speeding up is expected by implementing the FNTT process on an FPGA. A high-level synthesis tool enables efficient hardware implementation even for FNTT with a large number of points. In this paper, we propose a methodology for optimizing the loop structure included in a software description of FNTT so that the performance of the synthesized FNTT processor can be maximized. The loop structure optimization is considered in terms of loop flattening and trip count reduction. We implement a 65,536-point FNTT processor with the loop structure optimization on an FPGA, and demonstrate that it can be executed 6.9 times faster than the execution on a CPU.

Citations (Scopus): 22
A stayed location estimation method for sparse GPS positioning information based on positioning accuracy and short-time cluster removal

Sae Iwata, Tomoyuki Nitta, Toshinori Takayama, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E101A ( 5 ) 831 - 843, May 2018 [Peer-reviewed]

Cell phones with GPS function as well as GPS loggers are widely used and users' geographic information can be easily obtained. However, battery consumption in these mobile devices is still a main concern, so GPS positioning data cannot be obtained very frequently. In this paper, a stayed location estimation method for sparse GPS positioning information is proposed. After generating initial clusters from a sequence of measured positions, the effective radius is set for every cluster based on positioning accuracy and the clusters are merged effectively using it. After that, short-time clusters are removed temporarily but measured positions included in them are not removed. Then the clusters are merged again, taking all the measured positions into consideration. This process is performed twice, in other words, two-stage short-time cluster removal is performed, and finally accurate stayed location estimation is realized even when the GPS positioning interval is five minutes or more. Experiments demonstrate that the total distance error between the estimated stayed location and the true stayed location is reduced by more than 33% and also that the proposed method much improves the F1 measure compared to conventional state-of-the-art methods.

Citations (Scopus): 3
A Trojan-invalidating Circuit Based on Signal Transitions and Its FPGA Implementation

Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 2018-, April 2018

Recently, high-functioning hardware devices such as smart TVs and smart phones have been widely used in our daily lives. To keep up with the rapid advance of these high technologies, reconfigurable hardware devices such as FPGAs (Field Programmable Gate Arrays) have been used in final products. Under these circumstances, the risk that malfunctions may be inserted into hardware devices has arisen. The malfunctions inserted into hardware devices are known as hardware Trojans. How to detect them has become a serious concern in hardware production. In this paper, we design a Trojan-infected cryptographic circuit as well as a Trojan-invalidating circuit, and implement them on an FPGA board. To begin with, we design an AES cryptographic circuit. Secondly, we insert a hardware Trojan into the AES cryptographic circuit. Finally, we design a Trojan-invalidating circuit and insert it into a suspicious Trojan net in the Trojan-infected cryptographic circuit. After that, we implement the circuits on an FPGA board. The experimental results demonstrate that the Trojan-invalidating circuit adequately deactivates the suspicious Trojan net in the Trojan-infected cryptographic circuit.
A hardware-Trojan classification method utilizing boundary net structures

Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

2018 IEEE International Conference on Consumer Electronics, ICCE 2018 2018- 1 - 4, March 2018 [Peer-reviewed]

Recently, cybersecurity has become a serious concern for us. For example, the threats of hardware Trojans (malfunctions inserted into hardware devices) have appeared. Since hardware vendors often outsource parts of their hardware products to third-party vendors, the risk of hardware-Trojan insertion has been increased. Especially in the hardware design step, malicious vendors have a chance to insert hardware Trojans easily. In this paper, we propose a hardware-Trojan classification method utilizing boundary net structures. To begin with, we use a machine-learning-based hardware-Trojan detection method and classify the nets in a given netlist into a set of normal nets and that of Trojan nets. Based on the classification, we investigate the nets around the boundary between normal nets and Trojan nets and extract the features of the nets identified to be normal nets or Trojan nets mistakenly. Finally, using the classification results of machine-learning-based hardware-Trojan detection and the extracted features of the boundary nets, we classify the nets in a given netlist into a set of normal nets and that of Trojan nets again. The experimental results demonstrate that our method outperforms an existing machine-learning-based hardware-Trojan detection method in terms of true positive rate.

Citations (Scopus): 31
Road-illuminance level inference across road networks based on Bayesian analysis

Siya Bao, Masao Yanagisawa, Nozomu Togawa

2018 IEEE International Conference on Consumer Electronics, ICCE 2018 2018- 1 - 6, March 2018 [Peer-reviewed]

This paper proposes a road-illuminance level inference method based on the naive Bayesian analysis. We investigate quantities and types of road lights and landmarks with a large set of roads in real environments and reorganize them into two safety classes, safe or unsafe, with seven road attributes. Then we carry out data learning using three types of datasets according to different groups of the road attributes. Experimental results demonstrate that the proposed method successfully classifies a set of roads with seven attributes into safe ones and unsafe ones with the accuracy of more than 85%, which is superior to other machine-learning based methods and a manual-based method.

Scan-based side-channel attack against HMAC-SHA-256 circuits based on isolating bit-transition groups using scan signatures

Daisuke Oku, Masao Yanagisawa, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 11 16 - 28, February 2018 [Peer-reviewed]

A scan chain is used by scan-path test, one of the design-for-test techniques, which can control and observe internal registers in an LSI chip. On the other hand, attention has focused on scan-based side-channel attacks, which can restore secret information by exploiting the scan data obtained from a scan chain inside the crypto chip during cryptographic processing. In this paper, we propose a scan-based attack method against a hash generator circuit called HMAC-SHA-256. Our proposed method is composed of three steps. First, we isolate 64 bit-transition groups from scan data using scan signatures based on the property of the HMAC-SHA-256 algorithm. Second, we classify these 64 bit-transition groups into 32 pairs. Last, we find the correspondence between the scan data and the internal registers in the target HMAC-SHA-256 circuit. Our proposed method restores the secret information by the three steps above, even if the scan chain includes registers other than those of the target hash generator circuit and hence becomes too long. Experimental results show that our proposed method successfully restores two secret keys of the HMAC-SHA-256 circuit using up to 425 input messages in 7.5 hours.

Citations (Scopus): 4
Designing hardware Trojans and their detection based on a SVM-based approach

Tomotaka Inoue, Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

Proceedings of International Conference on ASIC 2017- 811 - 814, January 2018 [Peer-reviewed]

Since hardware production has become inexpensive and international, hardware vendors often outsource their products to third-party vendors. In this situation, malicious vendors can easily insert malfunctions (also known as 'hardware Trojans') into their products. In this paper, we experimentally evaluate a machine-learning-based hardware-Trojan detection method using several hardware Trojans we designed. To begin with, we design three types of hardware Trojans and insert them into simple RS232 transceiver circuits. After that, we learn known netlists, where we know beforehand which nets are Trojan ones or normal ones, using a machine-learning-based hardware-Trojan detection method with a support vector machine (SVM) classifier. Finally, we classify the nets in the designed hardware-Trojan-inserted netlists into a set of Trojan nets and that of normal nets using the learned classifier. The experimental results demonstrate that the hardware-Trojan detection method with the SVM-based approach can detect some of the hardware Trojans we designed.

Citations (Scopus): 38
A low cost and high speed CSD-based symmetric transpose block FIR implementation

Jinghao Ye, Youhua Shi, Nozomu Togawa, Masao Yanagisawa

Proceedings of International Conference on ASIC 2017- 311 - 314, January 2018 [Peer-reviewed]

In this paper, a low cost and high speed CSD-based symmetric transpose block FIR design was proposed for low cost digital signal processing. First, the existing area-efficient CSD-based multiplier was optimized by considering the reusability and the symmetry of coefficients for area reduction. Second, the position of the input register was changed for high speed transpose block FIR processing in which half of the number of required multipliers can be saved. When compared with the existing block FIR designs, the proposed FIR design can increase the data rate from 238.66 MHz to 373.13 MHz while saving 10.89% area and 21.30% energy consumption as well.

DOI

Scopus

Citations (Scopus): 8
Floorplan-driven high-level synthesis using volatile/non-volatile registers for hybrid energy-harvesting systems

Daiki Asai, Masao Yanagisawa, Nozomu Togawa

Proceedings of International Conference on ASIC 2017- 64 - 67 January 2018 [Peer-reviewed]

In this paper, we propose a floorplan-driven high-level synthesis algorithm utilizing both volatile and non-volatile registers for hybrid energy-harvesting systems. In our algorithm, we first introduce the idea of safety line candidates. Based on them, we perform safety-line (SL) scheduling so that no operation crosses a safety line candidate, and then perform volatile/non-volatile register binding so that all the data crossing the safety line candidates are stored in non-volatile registers. We can then safely restore all the data and restart the circuit operation from any safety line candidate, even if a power shut-off occurs while the circuit is running. Experimental results show that our algorithm reduces the average latency by 30.76% and the average energy consumption by 24.94% compared to the naive algorithm when sufficient energy is given (normal mode). Experimental results also show that our algorithm reduces the average latency by 30.58% compared to the naive algorithm by reducing rollback execution when only a small amount of energy is given (energy-harvesting mode).

DOI

Scopus
Soft error tolerant latch designs with low power consumption (invited paper)

Saki Tajima, Nozomu Togawa, Masao Yanagisawa, Youhua Shi

Proceedings of International Conference on ASIC 2017- 52 - 55 January 2018 [Peer-reviewed]

As semiconductor technology continues to scale down, reliability has become more critical than ever before. Unlike traditional hard errors, which are caused by permanent physical damage and cannot be recovered in the field, soft errors are caused by radiation or voltage/current fluctuations that lead to transient changes in internal node states; thus they can be viewed as temporary errors. However, because soft errors occur unpredictably, it is desirable to develop soft-error-tolerant designs. For this reason, soft-error-tolerant design techniques have attracted great research interest. In this paper, we explain the soft error mechanism and then review existing soft-error-tolerant design techniques, with particular emphasis on the SEH family because it achieves low power consumption as well as small performance overhead.

DOI

Scopus

Citations (Scopus): 2
Detecting the Existence of Malfunctions in Microcontrollers Utilizing Power Analysis.

Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

24th IEEE International Symposium on On-Line Testing And Robust System Design, IOLTS 2018, Platja D'Aro, Spain, July 2-4, 2018 97 - 102 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 8
An Effective Stochastic Number Duplicator and Its Evaluations Using Composite Arithmetic Circuits.

Ryota Ishikawa, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

24th IEEE International Symposium on On-Line Testing And Robust System Design, IOLTS 2018, Platja D'Aro, Spain, July 2-4, 2018 53 - 56 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 3
Hardware Trojan Detection Utilizing Machine Learning Approaches.

Kento Hasegawa, Youhua Shi, Nozomu Togawa

17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications / 12th IEEE International Conference On Big Data Science And Engineering, TrustCom/BigDataSE 2018, New York, NY, USA, August 1-3, 2018 1891 - 1896 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 42
Message from the Editor-in-Chief

Togawa, N.

IPSJ Transactions on System LSI Design Methodology 11 1 - 1 2018 [Peer-reviewed]

DOI

Scopus
An Ising model mapping to solve rectangle packing problem.

Kotaro Terada, Daisuke Oku, Sho Kanamaru, Shu Tanaka, Masato Hayashi, Masanao Yamaoka, Masao Yanagisawa, Nozomu Togawa

2018 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan, April 16-19, 2018 1 - 4 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 25
A relaxed bit-write-reducing and error-correcting code for non-volatile memories

Kojo, T., Tawada, M., Yanagisawa, M., Togawa, N.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E101A ( 7 ) 1045 - 1052 2018 [Peer-reviewed]

DOI

Scopus
A Low Power Soft Error Hardened Latch with Schmitt-Trigger-Based C-Element.

Saki Tajima, Nozomu Togawa, Masao Yanagisawa, Youhua Shi

IEICE Transactions 101-A ( 7 ) 1025 - 1034 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 7
Extension and Performance/Accuracy Formulation for Optimal GeAr-Based Approximate Adder Designs.

Ken Hayamizu, Nozomu Togawa, Masao Yanagisawa, Youhua Shi

IEICE Transactions 101-A ( 7 ) 1014 - 1024 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 2
Stochastic number duplicators based on bit re-arrangement using randomized bit streams

Ishikawa, R., Tawada, M., Yanagisawa, M., Togawa, N.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E101A ( 7 ) 1002 - 1013 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 6
A Multiple Cyclic-Route Generation Method for Strolling Based on Point-of-Interests.

Tensei Nishimura, Kazuaki Ishikawa, Toshinori Takayama, Masao Yanagisawa, Nozomu Togawa

8th IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2018, Berlin, Germany, September 2-5, 2018 1 - 2 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 2
Robust Indoor/Outdoor Detection Method based on Sparse GPS Positioning Information.

Sae Iwata, Kazuaki Ishikawa, Toshinori Takayama, Masao Yanagisawa, Nozomu Togawa

8th IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2018, Berlin, Germany, September 2-5, 2018 1 - 4 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 11
Designing Subspecies of Hardware Trojans and Their Detection Using Neural Network Approach.

Tomotaka Inoue, Kento Hasegawa, Yuki Kobayashi, Masao Yanagisawa, Nozomu Togawa

8th IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2018, Berlin, Germany, September 2-5, 2018 1 - 4 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 18
Landmark Seasonal Travel Distribution and Activity Prediction Based on Language-specific Analysis.

Siya Bao, Masao Yanagisawa, Nozomu Togawa

IEEE International Conference on Big Data, Big Data 2018, Seattle, WA, USA, December 10-13, 2018 3628 - 3637 2018 [Peer-reviewed]

DOI

Scopus
Capacitance Measurement of Running Hardware Devices and its Application to Malicious Modification Detection.

Makoto Nishizawa, Kento Hasegawa, Nozomu Togawa

2018 IEEE Asia Pacific Conference on Circuits and Systems, APCCAS 2018, Chengdu, China, October 26-30, 2018 362 - 365 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 6
Empirical evaluation and optimization of hardware-Trojan classification for gate-level netlists based on multi-layer neural networks

Hasegawa, K., Yanagisawa, M., Togawa, N.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E101A ( 12 ) 2320 - 2326 2018 [Peer-reviewed]

DOI

Scopus

Citations (Scopus): 12
An accurate indoor positioning algorithm using particle filter based on the proximity of bluetooth beacons

Ryoya Momose, Tomoyuki Nitta, Masao Yanagisawa, Nozomu Togawa

2017 IEEE 6th Global Conference on Consumer Electronics, GCCE 2017 2017- 1 - 5 December 2017 [Peer-reviewed]

Indoor positioning without GPS is one of the most important problems in indoor pedestrian navigation. In this paper, we propose an accurate indoor positioning algorithm using a particle filter based on a floormap, where we use the proximity of Bluetooth beacons as well as acceleration and geomagnetic sensors. In designing the likelihood function of the particle filter, we effectively use the proximity of the Bluetooth beacons, which gives only a rough distance to the target beacon but is more stable than conventional RSSI-based distance estimation. In addition, by effectively utilizing a floormap, the accumulated positioning errors due to the acceleration and geomagnetic sensors are greatly reduced. Moreover, when designing the likelihood function, we can also take into account the case where the radio waves from the Bluetooth beacons are blocked by obstacles. Experimental results demonstrate that our algorithm can reduce the indoor positioning errors by up to 79% compared to several conventional algorithms.

DOI

Scopus

Citations (Scopus): 15
A stayed location estimation method for sparse GPS positioning information

Sae Iwata, Tomoyuki Nitta, Toshinori Takayama, Masao Yanagisawa, Nozomu Togawa

2017 IEEE 6th Global Conference on Consumer Electronics, GCCE 2017 2017- 1 - 5 December 2017 [Peer-reviewed]

Cell phones with GPS functions as well as GPS loggers are widely used, and we can easily obtain users' geographic information. However, battery consumption in these mobile devices is still a main concern, so GPS positioning data cannot be obtained very frequently. In this paper, we propose a stayed location estimation method for sparse GPS positioning data. After generating initial clusters from a sequence of measured positions, we set the effective radius for every cluster based on positioning accuracy and use it to merge the clusters effectively. After that, we temporarily remove short-time clusters but do not remove the measured positions included in them. Then we merge the clusters again, taking all the measured positions into consideration. We perform this process twice, i.e., we perform two-stage short-time cluster removal, and finally realize accurate stayed location estimation even when the GPS positioning interval is five minutes or more. Experiments demonstrate that the total distance error between the estimated stayed location and the true stayed location is reduced by more than 50% compared to a conventional state-of-the-art method.

DOI

Scopus

Citations (Scopus): 3
Personalized one-day travel with multi-nearby-landmark recommendation

Siya Bao, Masao Yanagisawa, Nozomu Togawa

IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2017- 239 - 242 December 2017 [Peer-reviewed]

Travel route recommendation can strongly influence users' satisfaction and the success of tourism businesses. This paper proposes a personalized travel recommendation algorithm with time planning. We use landmark categorization and region clustering to obtain effective elements. Then we build a travel map to generate all possible travel routes. Our proposed algorithm achieves higher precision in landmark recommendation and time planning than previous algorithms.

DOI

Scopus

Citations (Scopus): 5
A robust scan-based side-channel attack method against HMAC-SHA-256 circuits

Daisuke Oku, Masao Yanagisawa, Nozomu Togawa

IEEE International Conference on Consumer Electronics - Berlin, ICCE-Berlin 2017- 79 - 84 December 2017 [Peer-reviewed]

A scan-based side-channel attack, which can restore secret information by exploiting the scan data obtained from scan chains inside a chip during its processing, is still a real threat to crypto circuits as well as hash generator circuits. In this paper, we propose a scan-based attack method against a hash generator circuit called HMAC-SHA-256. Our proposed method restores the secret information by finding the correspondence between the scan data obtained from a scan chain and the internal registers in the target HMAC-SHA-256 circuit, even if the scan chain includes registers other than those of the target hash generator circuit and the attacker does not know the hash generation timing precisely. Experimental results show that our proposed method successfully restores two secret keys of the HMAC-SHA-256 circuit in at most 6 hours.

DOI

Scopus

Citations (Scopus): 8
A bitwidth-aware high-level synthesis algorithm using operation chainings for tiled-DR architectures

Kotaro Terada, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E100A ( 12 ) 2911 - 2924 December 2017 [Peer-reviewed]

As application hardware must be designed and implemented in a short time, high-level synthesis is an increasingly essential EDA technique. In the deep-submicron era, interconnection delays are not negligible even in high-level synthesis, and thus distributed-register and -controller architectures (DR architectures) have been proposed to cope with this problem. It is also beneficial to take data bitwidth into account in high-level synthesis. In this paper, we propose a bitwidth-aware high-level synthesis algorithm using operation chainings targeting Tiled-DR architectures. Our proposed algorithm optimizes the bitwidths of functional units and utilizes vacant tiles by adding extra functional units to realize effective operation chainings, generating high-performance circuits without increasing the total area. Experimental results show that our proposed algorithm reduces the overall latency by up to 47% compared to the conventional approach without area overhead, by eliminating unnecessary bitwidths and adding efficient extra FUs for Tiled-DR architectures.

DOI

Scopus
A safe and comprehensive route finding algorithm for pedestrians based on lighting and landmark conditions

Siya Bao, Tomoyuki Nitta, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E100A ( 11 ) 2439 - 2450 November 2017 [Peer-reviewed]

In this paper, we propose a safe and comprehensive route finding algorithm for pedestrians based on lighting and landmark conditions. Safety and comprehensiveness can be predicted by five possible indicators: (1) lighting conditions, (2) landmark visibility, (3) landmark effectiveness, (4) turning counts along a route, and (5) road widths. We first investigate the impact of these five indicators on pedestrians' perceptions of safety and comprehensiveness during route finding. After that, a route finding algorithm for pedestrians is proposed. In the algorithm, we design a score based on indicators (1), (2), (3), and (5) above and also introduce a turning count reduction strategy for indicator (4), and thereby find a safe and comprehensive route. In particular, we design the daytime score and the nighttime score differently and find an appropriate route depending on the time period. Experimental simulation results demonstrate that the proposed algorithm obtains higher scores than several existing algorithms. We also demonstrate, in accordance with questionnaire results, that the proposed algorithm is able to find safe and comprehensive routes for pedestrians in real environments.

DOI

Scopus

Citations (Scopus): 5
Effective write-reduction method for MLC non-volatile memory

Masashi Tawada, Shinji Kimura, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 1 - 4 September 2017 [Peer-reviewed]

Recently, the demand for non-volatile memory in embedded systems has increased because normally-off and power-gating technologies can be applied to it. However, non-volatile memories have lower endurance than volatile memories. When data is appropriately encoded with a write-reduction code, the endurance of non-volatile memory can be enhanced by writing the encoded data into the memory. We propose a highly effective write-reduction method for multi-level cell (MLC) non-volatile memory, focusing on the write-reduction code (WRC) as the optimal bit-write reduction method. The WRC can be applied only to single-level cell non-volatile memory. The proposed method generates a cell-write reduction code based on the WRC; each cell holds multiple bits of data. Our proposed method reduces cell writes by 31.6% compared to the conventional method.

DOI

Scopus
Trojan-feature extraction at gate-level netlists and its application to hardware-Trojan detection using random forest classifier

Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 1 - 4 September 2017 [Peer-reviewed]

Recently, due to the increase of outsourcing in IC design, it has been reported that malicious third-party vendors often insert hardware Trojans into their ICs. How to detect them is a major concern in the IC design process. The features of hardware-Trojan infected nets (or Trojan nets) in ICs often differ from those of normal nets. To classify all the nets in netlists designed by third-party vendors into Trojan ones and normal ones, we have to extract effective Trojan features from Trojan nets. In this paper, we first propose 51 Trojan features which describe Trojan nets from netlists. Based on the importance values obtained from the random forest classifier, we extract the best set of 11 Trojan features out of the 51 features, which can effectively detect Trojan nets while maximizing the F-measures. Using the 11 extracted Trojan features, the machine-learning-based hardware-Trojan classifier achieves up to 100% true positive rate as well as 100% true negative rate on several TrustHUB benchmarks and obtains an average F-measure of 74.6%, which are the best values among existing machine-learning-based hardware-Trojan detection methods.
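The rates quoted in this abstract can be computed from per-net classification counts. The following Python sketch is illustrative only: it assumes the F-measure is the usual harmonic mean of precision and recall (F1), and the function name `trojan_net_metrics` and the example counts are hypothetical, not taken from the paper.

```python
def trojan_net_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute per-net detection metrics from confusion-matrix counts.

    tp: Trojan nets classified as Trojan   fn: Trojan nets missed
    tn: normal nets classified as normal   fp: normal nets flagged as Trojan
    """
    tpr = tp / (tp + fn) if tp + fn else 0.0        # true positive rate (recall)
    tnr = tn / (tn + fp) if tn + fp else 0.0        # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    # F-measure assumed to be F1: harmonic mean of precision and recall.
    if precision + tpr:
        f_measure = 2 * precision * tpr / (precision + tpr)
    else:
        f_measure = 0.0
    return {"tpr": tpr, "tnr": tnr, "precision": precision, "f_measure": f_measure}


# Hypothetical counts for one netlist: 10 Trojan nets, 90 normal nets.
metrics = trojan_net_metrics(tp=8, fn=2, tn=85, fp=5)
```

Because Trojan nets are typically a small minority of all nets, a classifier can score high true positive and true negative rates and still have a modest F-measure, which is why the abstract reports both.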

DOI

Scopus

Citations (Scopus): 211
Trojan-feature extraction at gate-level netlists and its application to hardware-Trojan detection using random forest classifier

Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 100-A ( 12 ) 2857 - 2868 September 2017 [Peer-reviewed]

Recently, due to the increase of outsourcing in IC design, it has been reported that malicious third-party vendors often insert hardware Trojans into their ICs. How to detect them is a major concern in the IC design process. The features of hardware-Trojan infected nets (or Trojan nets) in ICs often differ from those of normal nets. To classify all the nets in netlists designed by third-party vendors into Trojan ones and normal ones, we have to extract effective Trojan features from Trojan nets. In this paper, we first propose 51 Trojan features which describe Trojan nets from netlists. Based on the importance values obtained from the random forest classifier, we extract the best set of 11 Trojan features out of the 51 features, which can effectively detect Trojan nets while maximizing the F-measures. Using the 11 extracted Trojan features, the machine-learning-based hardware-Trojan classifier achieves up to 100% true positive rate as well as 100% true negative rate on several TrustHUB benchmarks and obtains an average F-measure of 74.6%, which are the best values among existing machine-learning-based hardware-Trojan detection methods.

DOI

Scopus

Citations (Scopus): 211
Hardware Trojans classification for gate-level netlists using multi-layer neural networks

Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design, IOLTS 2017 227 - 232 September 2017 [Peer-reviewed]

Recently, due to the increase of outsourcing in IC design and manufacturing, it has been reported that malicious third-party IC vendors often insert hardware Trojans into their products. Especially at the IC design step, detecting hardware Trojans is strongly required because malicious third-party vendors can easily insert them into their products. In this paper, we propose a machine-learning-based hardware-Trojan detection method for gate-level netlists using multi-layer neural networks. First, we extract 11 Trojan-net feature values for each net in a netlist. After that, we classify the nets in an unknown netlist into a set of Trojan nets and a set of normal nets using multi-layer neural networks. We obtained up to 100% true positive rate with our proposed method.

DOI

Scopus

Citations (Scopus): 140
Hardware Trojan detection and classification based on steady state learning

Masaru Oya, Masao Yanagisawa, Nozomu Togawa

2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design, IOLTS 2017 215 - 220 September 2017 [Peer-reviewed]

In this paper, we propose a logic-testing-based HT detection and classification method utilizing steady state learning. We first observe that HTs stay hidden while random test patterns are applied for a short time, but most of them can be activated in very long-term random circuit operation. Hence it is natural to learn the steady signal-transition states of every suspicious Trojan net in a netlist by performing short-term random simulation. After that, we simulate or emulate the netlist for a very long time by giving random test patterns and obtain a set of signal-transition states. By discovering the correlation between them, our method detects HTs and identifies their behavior. Experimental results demonstrate that our method successfully identifies all the real Trojan nets as Trojan nets and all the normal nets as normal nets, while other existing logic-testing HT detection methods fail to detect some of them.

DOI

Scopus

Citations (Scopus): 1
A Floorplan Aware High-Level Synthesis Algorithm with Body Biasing for Delay Variation Compensation

Koki Igawa, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E100A ( 7 ) 1439 - 1451 July 2017 [Peer-reviewed]

In this paper, we propose a floorplan-aware high-level synthesis algorithm with body biasing for delay variation compensation, which minimizes the average leakage energy of manufactured chips. In order to realize floorplan-aware high-level synthesis, we utilize the huddle-based distributed register architecture (HDR architecture). The HDR architecture divides the chip area into small partitions called huddles, and we can control the body bias voltage of every huddle. During high-level synthesis, we iteratively obtain the expected leakage energy of every huddle when applying a body bias voltage. A huddle with smaller expected leakage energy contributes more to reducing the expected leakage energy of the entire circuit but can increase the latency. We assign control-data flow graph (CDFG) nodes in non-critical paths to the huddles with larger expected leakage energy and those in critical paths to the huddles with smaller expected leakage energy. We expect to minimize the entire leakage energy in a manufactured chip without increasing its latency. Experimental results show that our algorithm reduces the average leakage energy by up to 39.7% without latency or yield degradation compared with typical-case design with body biasing.

DOI

Scopus
A Hardware-Trojan Classification Method Using Machine Learning at Gate-Level Netlists Based on Trojan Features

Kento Hasegawa, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E100A ( 7 ) 1427 - 1438 July 2017 [Peer-reviewed]

Due to the increase of outsourcing by IC vendors, we face a serious risk that malicious third-party vendors can very easily insert hardware Trojans into their IC products. However, detecting hardware Trojans is very difficult because today's ICs are huge and complex. In this paper, we propose a hardware-Trojan classification method for gate-level netlists to identify hardware-Trojan infected nets (or Trojan nets) using a support vector machine (SVM) or a neural network (NN). First, we extract five hardware-Trojan features from each net in a netlist. These feature values are so complicated that we cannot give simple, fixed threshold values to them. Hence, second, we represent them as a five-dimensional vector and learn them using an SVM or an NN. Finally, we can successfully classify all the nets in an unknown netlist into Trojan ones and normal ones based on the learned classifiers. We have applied our machine-learning-based hardware-Trojan classification method to Trust-HUB benchmarks. The results demonstrate that our method increases the true positive rate compared to the existing state-of-the-art results in most cases. In some cases, our method achieves a true positive rate of 100%, which shows that all the Trojan nets in an unknown netlist are completely detected by our method.

DOI

Scopus

Citations (Scopus): 60
Efficient Multiplexer Networks for Field-Data Extractors and Their Evaluations

Koki Ito, Kazushi Kawamura, Yutaka Tamiya, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E100A ( 4 ) 1015 - 1028 April 2017 [Peer-reviewed]

As seen in stream data processing, it is often necessary to extract a particular data field from bulk data, for which we can use a field-data extractor. In particular, an (M, N)-field-data extractor reads out any consecutive N bytes from an M-byte register by connecting its input/output using multiplexers (MUXs). However, the number of required MUXs grows rapidly as the input/output byte widths increase. It is known that partitioning a MUX network reduces the number of MUXs. In this paper, we first consider a multi-layered MUX network, which is generated by repeatedly partitioning a MUX network into a collection of single-layered MUX networks. We show that the multi-layered MUX network is equivalent to the barrel shifter from which redundant MUXs and wires are removed, and we prove that the number of required MUXs becomes the smallest among MUX-network-partitioning based field-data extractors. Next, we propose a rotator-based MUX network for a field-data extractor, which is based on reading out particular data in an input register into a rotator. The byte width of the rotator is the same as that of its output register, and hence we no longer require any extra wires or MUXs. By rotating the input data appropriately, we finally obtain correctly ordered data in the output register. Experimental results show that a multi-layered MUX network reduces the number of gates required to construct a field-data extractor by up to 97.0% compared with a naive approach, with a delay of 1.8 ns-2.3 ns. A rotator-based MUX network with a control circuit also reduces the number of gates required to construct a field-data extractor by up to 97.3% compared with a naive approach, with a delay of 2.1 ns-2.9 ns.
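The behavior of an (M, N)-field-data extractor, and the barrel-shifter view of it mentioned in the abstract, can be modeled in software with a short Python sketch. This is a behavioral illustration under assumed names (`extract_naive`, `extract_barrel`), not the authors' circuit: it keeps the full barrel shifter and does not remove the redundant MUXs and wires that the paper eliminates.

```python
def extract_naive(register: bytes, offset: int, n: int) -> bytes:
    """Reference behavior: read n consecutive bytes starting at offset."""
    if not (1 <= n <= len(register) and 0 <= offset <= len(register) - n):
        raise ValueError("field does not fit in the register")
    return register[offset:offset + n]


def extract_barrel(register: bytes, offset: int, n: int) -> bytes:
    """Same result, computed as layers of 2-to-1 selections.

    Layer k either passes its input through or shifts it by 2**k bytes,
    controlled by bit k of the offset, as in a barrel shifter.
    """
    m = len(register)
    if not (1 <= n <= m and 0 <= offset <= m - n):
        raise ValueError("field does not fit in the register")
    data = list(register)
    shift = 1
    while shift < m:
        if offset & shift:
            # Each output byte i selects input byte i + shift (0 if past the end).
            data = [data[i + shift] if i + shift < m else 0 for i in range(m)]
        shift <<= 1
    return bytes(data[:n])


register = bytes(range(16))           # a 16-byte register: M = 16
field = extract_barrel(register, 5, 4)  # N = 4 bytes starting at byte 5
```

The layered version needs about log2(M) selection stages instead of one wide selector per output byte, which is the intuition behind the MUX-count reduction reported above.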

DOI

Scopus
Trojan-net feature extraction and its application to hardware-Trojan detection for gate-level netlists using random forest

Hasegawa, K., Yanagisawa, M., Togawa, N.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E100A ( 12 ) 2017

DOI

Scopus

Citations (Scopus): 12
A Bit-Write-Reducing and Error-Correcting Code Generation Method by Clustering ECC Codewords for Non-Volatile Memories

Tatsuro Kojo, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E99A ( 12 ) 2398 - 2411 December 2016 [Peer-reviewed]

Non-volatile memories are attracting attention as a promising alternative in memory design. Data stored in them may still be corrupted due to crosstalk and radiation. We can restore the data by using error-correcting codes, which require extra bits to correct bit errors. Further, non-volatile memories consume ten to a hundred times more energy than normal memories in bit-writing. When we configure them using error-correcting codes, it is therefore necessary to reduce the number of written bits. In this paper, we propose a method to generate a bit-write-reducing code with error-correcting ability. We first pick an error-correcting code which can correct t-bit errors. We cluster its codewords and generate a cluster graph satisfying the S-bit flip conditions. We assign a data value to be written to each cluster. In other words, we generate a one-to-many mapping from each data value to the codewords in the cluster. We prove that, if the cluster graph is a complete graph, every data value in a memory cell can be rewritten into another data value by flipping at most S bits while keeping the error-correcting ability at t bits. We further propose an efficient method to cluster error-correcting codewords. Experimental results show that the bit-write-reducing and error-correcting codes generated by our proposed method efficiently reduce energy consumption. This paper proposes the world's first theoretically near-optimal bit-write-reducing code with error-correcting ability based on efficient coding theories.

DOI

Scopus

Citations (Scopus): 1
A Highly-Adaptable and Small-Sized In-Field Power Analyzer for Low-Power IoT Devices

Ryosuke Kitayama, Takashi Takenaka, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E99A ( 12 ) 2348 - 2362 December 2016 [Peer-reviewed]

Power analysis for IoT devices is strongly required to protect against attacks from malicious attackers. It is also very important to reduce the power consumption of IoT devices itself. In this paper, we propose a highly-adaptable and small-sized in-field power analyzer for low-power IoT devices. The proposed power analyzer has the following advantages: (A) The proposed power analyzer realizes signal-averaging noise reduction with synchronization signal lines and thus can reduce noise over a wide frequency range; (B) The proposed power analyzer partitions a long-term power analysis process into several analysis segments and measures the voltages and currents of each analysis segment using a small amount of data memory. By combining these analysis segments, we can obtain long-term analysis results; (C) The proposed power analyzer has two amplifiers that amplify current signals adaptively depending on their magnitude. Hence the maximum readable current can be increased while keeping the minimum readable current small enough. Since none of (A), (B), and (C) requires complicated mechanisms or circuits, the proposed power analyzer is implemented on just a 2.5 cm x 3.3 cm board, which is the smallest size among existing power analyzers for IoT devices. We have measured the power and energy consumption of the AES encryption process on an IoT device and demonstrated that the proposed power analyzer has measurement errors of only up to 1.17% compared to a high-precision oscilloscope.

DOI

Scopus

Citations (Scopus): 1
Hardware-Trojans Rank: Quantitative Evaluation of Security Threats at Gate-Level Netlists by Pattern Matching

Masaru Oya, Noritaka Yamashita, Toshihiko Okamura, Yukiyasu Tsunoo, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E99A ( 12 ) 2335 - 2347 December 2016 [Peer-reviewed]

Since digital ICs today are often designed and fabricated by third parties at any phase, we must eliminate the risk that malicious attackers implement Hardware Trojans (HTs) on them. In particular, attackers can easily insert HTs during the design phase. This paper proposes an HT rank, which is a new quantitative analysis criterion against HTs at gate-level netlists. We have carefully analyzed all the gate-level netlists in the Trust-HUB benchmark suite and found several Trojan net features in them. Then we design three types of Trojan points: feature point, count point, and location point. By assigning these points to every net and summing them up, we obtain the maximum Trojan point in a gate-level netlist. This point gives our HT rank. The HT rank can be calculated from net features alone, and we do not perform any logic simulation or random test. When all the gate-level netlists in the Trust-HUB, ISCAS85, ISCAS89 and ITC99 benchmark suites as well as several OpenCores designs and HT-free and HT-inserted AES netlists are ranked by our HT rank, we can completely distinguish HT-inserted ones (whose HT rank is ten or more) from HT-free ones (whose HT rank is nine or less). The HT rank is the world's first quantitative criterion which distinguishes HT-inserted netlists from HT-free ones in all the gate-level netlists in Trust-HUB, ISCAS85, ISCAS89, and ITC99.
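The ranking rule described in this abstract can be sketched in a few lines of Python. Only the structure (three point types summed per net, the maximum taken over the netlist, and the threshold of ten) comes from the abstract; the function names, the net names, and the point values below are hypothetical, and the pattern matching that would actually assign the points is not modeled.

```python
from typing import Dict, Tuple

# The abstract reports HT-inserted netlists at rank ten or more.
HT_RANK_THRESHOLD = 10

# Hypothetical per-net points: (feature point, count point, location point).
NetPoints = Dict[str, Tuple[int, int, int]]


def ht_rank(net_points: NetPoints) -> int:
    """Return the maximum summed Trojan point over all nets in a netlist."""
    return max((sum(points) for points in net_points.values()), default=0)


def is_ht_inserted(net_points: NetPoints) -> bool:
    """Flag a netlist as HT-inserted when its HT rank reaches the threshold."""
    return ht_rank(net_points) >= HT_RANK_THRESHOLD


# Made-up example netlists, for illustration only.
suspicious = {"n1": (2, 1, 0), "n2": (6, 3, 2), "n3": (0, 0, 1)}
clean = {"a": (3, 2, 1), "b": (0, 1, 0)}
```

Because the rank is the maximum over nets rather than a total, a single highly suspicious net is enough to flag a netlist regardless of its size.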

DOI

Scopus

Citations (Scopus): 13
Multi-scenario high-level synthesis for dynamic delay variation and its evaluation on FPGA platforms

Koki Igawa, Masao Yanagisawa, Nozomu Togawa

IEICE ELECTRONICS EXPRESS 13 ( 18 ) 20160641 September 2016 [Refereed]

Multi-scenario high-level synthesis for distributed register/controller architectures has been proposed targeting static delay variation. In this paper, we extend it and propose a floorplan-driven high-level synthesis algorithm which can be applied to dynamic delay variation by effectively using an error prediction technique, where pre-error registers are introduced to local registers in every circuit block. Experimental results show that the proposed algorithm using two and three scenarios on an FPGA chip reduces the number of required control steps by 17.6% and 25.5% on average, respectively, compared to worst-case high-level synthesis, at the expense of increased lookup tables and flip-flops. Moreover, we implement a multi-scenario elliptic-wave-filter (EWF) circuit with three scenarios synthesized by our proposed algorithm onto an FPGA chip and run it in an environment with varying supply voltages, which cause dynamic delay variation. The FPGA implementation experiments also demonstrate that the EWF circuit runs effectively on the real FPGA chip. As far as we know, this is the world's first experiment in which a multi-scenario circuit runs in a real dynamic delay variation environment.

Bi-Partitioning Based Multiplexer Network for Field-Data Extractors

Koki Ito, Kazushi Kawamura, Yutaka Tamiya, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E99A ( 7 ) 1410 - 1414 July 2016 [Refereed]

An (M,N)-field-data extractor reads out any consecutive N bytes from an M-byte register by connecting its input/output using a multiplexer (MUX) network. It is used in packet analysis and/or stream data processing for video/audio data. In this letter, we propose an efficient MUX network for an (M,N)-field-data extractor. By bi-partitioning a simple MUX network into an upper one and a lower one, we can theoretically reduce the number of required MUXs without increasing the MUX network depth. Experimental results show that we can reduce the gate count by up to 92% compared to a naive approach.
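As a behavioral reference for the definition above, the following sketch models what an (M,N)-field-data extractor computes, without modeling the MUX network itself. The function names are hypothetical, and the candidate count assumes a straightforward design in which each output byte selects among all input bytes it could ever receive; the paper's own baseline may differ.

```python
def extract_field(register: bytes, start: int, n: int) -> bytes:
    """Read out n consecutive bytes of an M-byte register, beginning at byte `start`."""
    m = len(register)
    if n < 1 or not 0 <= start <= m - n:
        raise ValueError("field does not fit inside the register")
    return register[start:start + n]


def candidates_per_output_byte(m: int, n: int) -> int:
    """Number of input bytes that can reach any one output byte.

    Output byte j receives input byte start + j for start in 0 .. m - n,
    so a direct selector per output byte needs m - n + 1 inputs (assumption).
    """
    return m - n + 1


# An (8,4) extractor reading the field that starts at byte 3
field = extract_field(bytes(range(8)), start=3, n=4)
```

The selector width grows with M - N, which is why restructuring the MUX network matters as register sizes increase.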

Citations (Scopus): 1

Interconnection-Delay and Clock-Skew Estimate Modelings for Floorplan-Driven High-Level Synthesis Targeting FPGA Designs

Koichi Fujiwara, Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E99A ( 7 ) 1294 - 1310 July 2016 [Refereed]

Recently, high-level synthesis techniques for FPGA designs (FPGA-HLS techniques) are strongly required in various applications. Both interconnection delays and clock skews have a large impact on circuit performance implemented onto FPGA, which indicates the need for floorplan-driven FPGA-HLS algorithms considering them. To appropriately estimate interconnection delays and clock skews at HLS phase, a reasonable model to estimate them becomes essential. In this paper, we demonstrate several experiments to characterize interconnection delays and clock skews in FPGA and propose novel estimate models called "IDEF" and "CSEF". In order to evaluate our models, we integrate them into a conventional floorplan-driven FPGA-HLS algorithm. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the latency by up to 22% compared with conventional approaches.

A Multi-Scenario High-Level Synthesis Algorithm for Variation-Tolerant Floorplan-Driven Design

Koki Igawa, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E99A ( 7 ) 1278 - 1293 July 2016 [Refereed]

In order to tackle a process-variation problem, we can define several scenarios, each of which corresponds to a particular LSI behavior, such as a typical-case scenario and a worst-case scenario. By designing a single LSI chip which realizes multiple scenarios simultaneously, we can have a process-variation-tolerant LSI chip. In this paper, we propose a multi-scenario high-level synthesis algorithm for variation-tolerant floorplan-driven design targeting new distributed-register architectures, called HDR architectures. We assume two scenarios, a typical-case scenario and a worst-case scenario, and realize them onto a single chip. We first schedule/bind each of the scenarios independently. After that, we commonize the scheduling/binding results for the typical-case and worst-case scenarios and thus generate a commonized area-minimized floorplan result. At that time, we can explicitly take into account interconnection delays by using distributed-register architectures. Experimental results show that our algorithm reduces the latency of the typical-case scenario by up to 50% without increasing the latency of the worst-case scenario, compared with several existing methods.

Indoor Navigation Based on Real-time Direction Information Generation Using Wearable Glasses

Ryota Iwanaji, Tomoyuki Nitta, Kazuaki Ishikawa, Masao Yanagisawa, Nozomu Togawa

2016 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS-ASIA (ICCE-ASIA) 2016 [Refereed]

Indoor areas such as office buildings and railway stations have almost no landmarks and how to navigate pedestrians to their destination points there is one of the big challenges. In this paper, we propose an indoor navigation system based on real-time direction information generation using wearable glasses. The proposed system effectively calculates a pedestrian's direction in a real-time manner using sensors and superimposes the direction to proceed on wearable glasses. Hence it navigates pedestrians to their right direction without using landmarks. Experiments demonstrate the effectiveness of the proposed navigation system.
A High-level Synthesis Algorithm for FPGA Designs Optimizing Critical Path with Interconnection-delay and Clock-skew Consideration

Koichi Fujiwara, Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

2016 INTERNATIONAL SYMPOSIUM ON VLSI DESIGN, AUTOMATION AND TEST (VLSI-DAT) 2016 [Refereed]

High-level synthesis for FPGA designs (FPGA-HLS) is recently required in various applications. Since wire delays are becoming a design bottleneck in FPGA, we need to handle interconnection delays and clock skews in FPGA-HLS flow. In this paper, we propose an FPGA-HLS algorithm optimizing critical path with interconnection-delay and clock-skew consideration. By utilizing HDR architecture, we floorplan circuit modules in HLS flow and, based on the result, estimate interconnection delays and clock skews. To reduce the critical-path delay(s) of a circuit, we propose two novel methods for FPGA-HLS. Experimental results demonstrate that our algorithm can improve circuit performance by up to 24% compared with conventional approaches.
Rotator-Based Multiplexer Network Synthesis for Field-Data Extractors

Koki Ito, Kazushi Kawamura, Yutaka Tamiya, Masao Yanagisawa, Nozomu Togawa

2016 29TH IEEE INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (SOCC) 194 - 199 2016 [Refereed]

As seen in stream data processing, it is often necessary to extract a particular data field from bulk data, for which we can use a field-data extractor. In particular, an (M,N)-field-data extractor reads out any consecutive N bytes from an M-byte register by connecting its input/output using multiplexers (MUXs). However, the number of required MUXs grows rapidly as the input/output byte lengths increase. It is known that partitioning a MUX network reduces the number of MUXs. In this paper, we first consider a multi-layered MUX network, which is generated by repeatedly partitioning a MUX network into a collection of single-layered MUX networks. We show that the multi-layered MUX network is equivalent to a barrel shifter from which redundant MUXs and wires are removed, and we prove that its number of required MUXs is the smallest among MUX-network-partitioning based field-data extractors. Next, we propose a rotator-based MUX network for a field-data extractor, which reads out particular data in an input register to a rotator. The size of the rotator is the same as that of its output register, and hence we no longer require any extra wires or MUXs. By rotating the input data appropriately, we finally obtain correctly ordered data in the output register. Experimental results show that our rotator-based MUX network reduces the number of gates required to implement a field-data extractor by up to 33% compared with one using a multi-layered MUX network.
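The barrel-shifter view mentioned above can be sketched behaviorally: the start offset is consumed one bit at a time, and each bit enables one layer that shifts by a power of two. This is a plain logarithmic shifter written for illustration, with a hypothetical function name; it neither removes the redundant MUXs and wires nor implements the rotator-based network proposed in the paper.

```python
def barrel_extract(register: bytes, start: int, n: int) -> bytes:
    """Extract n bytes starting at `start` using log2(M) conditional shift layers."""
    m = len(register)
    if n < 1 or not 0 <= start <= m - n:
        raise ValueError("field does not fit inside the register")
    data = list(register)
    shift = 1
    while shift < m:
        if start & shift:
            # One layer of 2-to-1 MUXs: every position takes the byte
            # `shift` places further along; vacated positions are zero-filled.
            data = data[shift:] + [0] * shift
        shift <<= 1
    return bytes(data[:n])


# Offset 5 = 0b101 enables the shift-by-1 and shift-by-4 layers
result = barrel_extract(bytes(range(8)), start=5, n=3)
```

With M = 8 there are three layers regardless of N, so depth grows logarithmically in the register size.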

Citations (Scopus): 1

In-situ Trojan Authentication for Invalidating Hardware-Trojan Functions

Masaru Oya, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

PROCEEDINGS OF THE SEVENTEENTH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN ISQED 2016 152 - 157 2016 [Refereed]

Because we do not know who will create hardware Trojans (HTs), or when and where they will be inserted, it is very difficult to correctly and completely detect all the real HTs in untrusted ICs. It is therefore desirable to incorporate in-situ HT invalidating functions into untrusted ICs as a countermeasure against HTs. This paper proposes an in-situ Trojan authentication technique for gate-level netlists to avoid security leakage. In the proposed approach, an untrusted IC operates in authentication mode and normal mode. In the authentication mode, an embedded Trojan authentication circuit monitors the bit-flipping count of a suspicious Trojan net within a pre-defined constant number of clock cycles and identifies whether it is a real Trojan or not. If the authentication condition is satisfied, the suspicious Trojan net is validated. Otherwise, it is invalidated and the HT functions are masked. By doing this, even untrusted netlists with HTs can still be used in the normal mode without security leakage. By setting an appropriate authentication condition using training sets from the Trust-HUB gate-level benchmarks, the proposed technique successfully invalidates only the HTs in the training sets. Furthermore, by embedding the in-situ Trojan authentication circuit into a Trojan-inserted AES crypto netlist, the netlist can run securely and correctly even if HTs exist, with an area overhead of just 1.5% and no delay overhead.
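The authentication step can be sketched in software as counting value changes of one net over a fixed window of clock cycles. The abstract does not state the exact authentication condition, so this sketch assumes that a net is validated when it toggles at least `min_flips` times within the window; the names, the threshold, and the masking value are all hypothetical.

```python
def count_bit_flips(samples):
    """Count value changes between consecutive clock-cycle samples of a net."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def authenticate_net(samples, window, min_flips):
    """Assumed condition: validate the net if it flips often enough in the window."""
    return count_bit_flips(samples[:window]) >= min_flips


def net_output(value, validated, safe_value=0):
    """In normal mode, pass a validated net through and mask an invalidated one."""
    return value if validated else safe_value


# A frequently toggling net versus a nearly constant one (hypothetical traces)
active = authenticate_net([0, 1, 0, 1, 1, 0], window=6, min_flips=3)
quiet = authenticate_net([0, 0, 0, 0, 1, 0], window=4, min_flips=1)
```

The rationale for this assumed condition is that trigger nets of Trojans are typically designed to stay quiet during normal operation, so a low flip count is treated as suspicious.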

Citations (Scopus): 6

A Delay Variation and Floorplan Aware High-level Synthesis Algorithm with Body Biasing

Koki Igawa, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

PROCEEDINGS OF THE SEVENTEENTH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN ISQED 2016 75 - 80 2016 [Refereed]

In this paper, we propose a delay variation and floorplan aware high-level synthesis algorithm with body biasing, which minimizes the average leakage energy of manufactured chips. To realize a floorplan-oriented high-level synthesis, we utilize a huddle-based distributed register architecture (HDR architecture), one of the DR architectures. The HDR architecture divides the chip area into small partitions called huddles, and we can control a body bias voltage for every huddle. During high-level synthesis, we iteratively obtain the expected leakage energy for every huddle when applying a body bias voltage. A huddle with smaller expected leakage energy contributes to reducing the expected leakage energy of the entire circuit but can increase the latency. We assign CDFG nodes in critical paths to the huddles with larger expected leakage energy and those in non-critical paths to the huddles with smaller expected leakage energy. We thereby expect to minimize the entire leakage energy in a manufactured chip without increasing its latency. Experimental results show that our algorithm reduces the average leakage energy by up to 38.9% without latency and yield degradation compared with typical-case design with body biasing.

Citations (Scopus): 1

A High-performance Circuit Design Algorithm using Data Dependent Approximation

Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

2016 INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC) 95 - 96 2016 [Refereed]

This paper proposes a high-performance circuit design algorithm using input data dependent approximation. In our algorithm, STEPCs (Suspicious Timing Error Prediction Circuits) are utilized for identifying the paths to be optimized inside a circuit efficiently. Experimental results targeting a set of basic adders show that our algorithm can achieve performance increase by up to 11.1% within the error rate of 2.1% compared to a conventional design technique.

Hash-Table and Balanced-Tree Based FIB Architecture for CCN Routers

Kenta Shimazaki, Takashi Aoki, Takahiro Hatano, Takuya Otsuka, Akihiko Miyazaki, Toshitaka Tsuda, Nozomu Togawa

2016 INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC) 67 - 68 2016 [Refereed]

Recently, content centric networking (CCN) has attracted attention as a next-generation network in which every router forwards a packet to another router and also functions as a server. A CCN router has a forwarding table called the FIB (Forwarding Information Base), but its table look-up can become a bottleneck. In this paper, we propose an FIB data structure for CCN routers which can reduce the number of comparisons in its look-up table. Our proposed FIB is composed of a Bloom filter and a hash table, and each hash entry is connected to a balanced binary search tree. With our FIB, the number of comparisons does not increase much even if hash collisions occur. Experimental results demonstrate the effectiveness of the proposed FIB over several existing methods.
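The three-part structure described above (Bloom filter, hash table, per-entry search tree) can be sketched as follows. This is an illustration under several assumptions: the class and parameter names are hypothetical, a sorted list searched with `bisect` stands in for the balanced binary search tree (both need a logarithmic number of comparisons per bucket), and only exact-name matching is shown, not the prefix matching a real CCN router performs.

```python
import bisect
import hashlib


class SketchFIB:
    """Bloom filter in front of a hash table whose buckets are kept sorted."""

    def __init__(self, num_buckets=8, bloom_bits=256, bloom_hashes=3):
        self.num_buckets = num_buckets
        self.bloom_bits = bloom_bits
        self.bloom_hashes = bloom_hashes
        self.bloom = 0  # bit array stored in a Python int
        self.buckets = [[] for _ in range(num_buckets)]  # sorted (name, face) pairs

    def _hash(self, name, seed):
        digest = hashlib.sha256(f"{seed}:{name}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def _bloom_positions(self, name):
        return [self._hash(name, s) % self.bloom_bits
                for s in range(1, self.bloom_hashes + 1)]

    def insert(self, name, face):
        for pos in self._bloom_positions(name):
            self.bloom |= 1 << pos
        bucket = self.buckets[self._hash(name, 0) % self.num_buckets]
        i = bisect.bisect_left(bucket, (name,))
        if i < len(bucket) and bucket[i][0] == name:
            bucket[i] = (name, face)      # update an existing entry
        else:
            bucket.insert(i, (name, face))  # keep the bucket sorted

    def lookup(self, name):
        # Bloom filter: a zero bit proves the name was never inserted.
        if any(not (self.bloom >> pos) & 1 for pos in self._bloom_positions(name)):
            return None
        bucket = self.buckets[self._hash(name, 0) % self.num_buckets]
        i = bisect.bisect_left(bucket, (name,))  # binary search within the bucket
        if i < len(bucket) and bucket[i][0] == name:
            return bucket[i][1]
        return None


fib = SketchFIB(num_buckets=1)  # a single bucket forces every name to collide
fib.insert("/waseda/video", 2)
fib.insert("/waseda/audio", 1)
```

Setting `num_buckets=1` makes the collision case visible: all names land in one bucket, yet a look-up still needs only a logarithmic number of comparisons in the bucket size.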

Citations (Scopus): 3

Scalable and Small-Sized Power Analyzer Design with Signal-Averaging Noise Reduction for Low-Power IoT Devices

Ryosuke Kitayama, Takashi Takenaka, Masao Yanagisawa, Nozomu Togawa

2016 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) 978 - 981 2016 [Refereed]

Power analysis of IoT devices is strongly required to reduce power consumption and realize secure communications. In this paper, we propose a scalable and small-sized power analyzer with signal-averaging noise reduction for low-power IoT devices. The proposed power analyzer reduces noise over a wide frequency range by using a signal averaging method and is implemented on a board of just 2 cm x 3 cm, which is the smallest among existing power analyzers for IoT devices. It further has the following advantages: (a) It has a two-level amplifier that amplifies current signals adaptively depending on their magnitude. Hence the maximum readable current can be increased while keeping the minimum readable current small enough. (b) If long-time analysis is required, it can be partitioned into several analysis segments. The proposed power analyzer can measure the currents and voltages of each analysis segment using a small amount of data memory. After that, by combining these analysis segments using a timer module, we can obtain long-time analysis results. We have analyzed the power and energy consumption of AES block cipher encryption processes on an IoT device and demonstrated that the proposed power analyzer has only a 1.8% measurement error compared with a high-precision oscilloscope.
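Signal averaging itself is a standard technique and can be sketched in a few lines: several recordings of the same repeated operation, assumed here to be already aligned to a common trigger, are averaged sample by sample. The function name and the traces are hypothetical, and the analyzer's hardware implementation is not modeled.

```python
def signal_average(traces):
    """Average aligned traces point by point.

    Each trace is one recording of the same repeated operation, assumed to
    start at the same trigger so that sample k means the same instant in all.
    """
    if not traces:
        raise ValueError("at least one trace is required")
    length = len(traces[0])
    if any(len(t) != length for t in traces):
        raise ValueError("all traces must have the same length")
    return [sum(column) / len(traces) for column in zip(*traces)]


# Two hypothetical noisy recordings of a flat 2.0 mA current
averaged = signal_average([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
```

For uncorrelated zero-mean noise, averaging N traces reduces the noise amplitude by roughly a factor of the square root of N, independently of the noise frequency.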

Redesign for Untrusted Gate-level Netlists

Masaru Oya, Masao Yanagisawa, Nozomu Togawa

2016 IEEE 22ND INTERNATIONAL SYMPOSIUM ON ON-LINE TESTING AND ROBUST SYSTEM DESIGN (IOLTS) 219 - 220 2016 [Refereed]

This paper proposes a redesign technique which redesigns untrusted netlists into trusted netlists. Our approach consists of two phases: a detection phase and an invalidation phase. The detection phase picks up suspicious hardware Trojans (HTs) by pattern matching. The invalidation phase modifies the suspicious HTs so that they are not activated. In the invalidation phase, one of three invalidation techniques is selected by analyzing the location of suspicious malicious nets. Applying the invalidation technique appropriately to the nets can correctly invalidate HTs. In our results, the proposed technique successfully invalidates HTs on several Trust-HUB benchmarks without HT activations. The results clearly demonstrate that our redesign technique is very effective in removing HT risks.

Hardware Trojans Classification for Gate-level Netlists based on Machine Learning

Kento Hasegawa, Masaru Oya, Masao Yanagisawa, Nozomu Togawa

2016 IEEE 22ND INTERNATIONAL SYMPOSIUM ON ON-LINE TESTING AND ROBUST SYSTEM DESIGN (IOLTS) 203 - 206 2016 [Refereed]

Recently, we face a serious risk that malicious third-party vendors can very easily insert hardware Trojans into their IC products, while it is very difficult to analyze huge and complex ICs. In this paper, we propose a hardware-Trojan classification method to identify hardware-Trojan infected nets (or Trojan nets) using a support vector machine (SVM). Firstly, we extract five hardware-Trojan features from each net in a netlist. Secondly, since we cannot effectively assign simple and fixed threshold values to them to detect hardware Trojans, we represent them as a five-dimensional vector and learn them using an SVM. Finally, we can successfully classify all the nets in an unknown netlist into Trojan ones and normal ones based on the learned SVM classifier. We have applied our SVM-based hardware-Trojan classification method to Trust-HUB benchmarks, and the results demonstrate that our method can greatly increase the true positive rate compared to the existing state-of-the-art results in most of the cases. In some cases, our method achieves a true positive rate of 100%, which shows that all the Trojan nets in a netlist are completely detected by our method.

Citations (Scopus): 166

Pedestrian Navigation Based on Landmark Recognition Using Glass-type Wearable Devices

Ryoya Yano, Tomoyuki Nitta, Kazuaki Ishikawa, Masao Yanagisawa, Nozomu Togawa

2016 IEEE 5TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS 1 - 2 2016 [Refereed]

In this paper, we propose a pedestrian navigation system based on landmark recognition. Our proposed system utilizes a glass-type wearable device and gives a correspondence between a map and a real-world landscape. By recognizing a landmark position effectively, a pedestrian can easily know where to turn at each turning position and hence he/she can reach his/her goal without losing his/her way. Experimental results demonstrate that the proposed system can increase the landmarks which pedestrians can recognize and thus gives comprehensive navigation effectively.

Comprehensive Deformed Map Generation for Wristwatch-type Wearable Devices Based on Landmark-based Partitioning

Keisuke Kono, Tomoyuki Nitta, Kazuaki Ishikawa, Masao Yanagisawa, Nozomu Togawa

2016 IEEE 5TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS 1 - 2 2016 [Refereed]

Recently, wristwatch-type wearable devices have been developed and geographic information services have become widely available on them. In this paper, we propose a comprehensive deformed map generation algorithm for wristwatch-type wearable devices. Our algorithm first normalizes a pedestrian route to 0 degrees, 45 degrees, or 90 degrees so that the pedestrian can see the route without tilting the wristwatch-type wearable device on his/her wrist. Second, our algorithm partitions the normalized map so that several landmarks overlap between the partitioned sub-maps. Hence the sub-maps can be displayed at a large scale on wristwatch-type wearable devices, and the pedestrian can recognize his/her location even when the displayed sub-map changes. Experiments demonstrate the effectiveness of our deformed map generation algorithm on wristwatch-type wearable devices.
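The first step, normalizing route directions, might look like the following sketch. It assumes that each route segment keeps its length and that its heading is snapped to the nearest multiple of 45 degrees; the paper's actual normalization rule and map deformation are not specified in the abstract, and the function name and coordinates are hypothetical.

```python
import math


def snap_route(points, step_deg=45):
    """Rebuild a polyline so that every segment heading is a multiple of step_deg."""
    if not points:
        return []
    snapped = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
        heading = math.radians(round(angle / step_deg) * step_deg)
        px, py = snapped[-1]
        snapped.append((px + length * math.cos(heading),
                        py + length * math.sin(heading)))
    return snapped


# A route heading roughly east and then roughly north (hypothetical coordinates)
route = snap_route([(0, 0), (10, 1), (11, 11)])
```

After snapping, the first segment runs exactly along the horizontal axis and the second exactly along the vertical axis, so the route can be read without rotating the display.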

Citations (Scopus): 4

A Safe and Comprehensive Route Finding Method for Pedestrian Based on Lighting and Landmark

Siya Bao, Tomoyuki Nitta, Kazuaki Ishikawa, Masao Yanagisawa, Nozomu Togawa

2016 IEEE 5TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS 1 - 5 2016 [Refereed]

This paper proposes a safe and comprehensive route finding method for pedestrians. We evaluate five factors that relieve pedestrians' fear of darkness. Based upon the evaluation, we propose a comprehensive route finding method that takes road width and the reduction of turning points into consideration. The experimental results in real outdoor environments under different lighting situations confirm that the proposed method can obtain safe and comprehensive routes for pedestrians.

Citations (Scopus): 12

Implementation Evaluation of Scan-based Attack against a Trivium Cipher Circuit

Daisuke Oku, Masao Yanagisawa, Nozomu Togawa

2016 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS) 220 - 223 2016 [Refereed]

Scan-path test, which is one of the design-for-test techniques using a scan chain, can control and observe internal registers in an LSI chip. However, attackers can also use it to retrieve secret information from cipher circuits. Recently, scan-based attacks using a scan chain inside an LSI chip have been reported, which can restore secret information by analyzing the scan data during cryptographic processing. In this paper, we pick up a scan-based attack method against a Trivium cipher, one of the synchronous stream ciphers, and evaluate it using the FPGA platform called SASEBO-GII. We implement the Trivium cipher on the FPGA chip and perform the scan-based attack against it. We demonstrate that the scan-based attack can successfully restore the secret information in the FPGA chip within several minutes, even if the FPGA chip contains several circuits other than the Trivium cipher circuit, which reveals that the scan-based attack against the Trivium cipher is not only a simulation threat but a real threat.

Citations (Scopus): 1

Scan-Based Side-Channel Attack on the Camellia Block Cipher Using Scan Signatures

Huiqian Jiang, Mika Fujishiro, Hirokazu Kodera, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A ( 12 ) 2547 - 2555 December 2015 [Refereed]

Camellia is a block cipher jointly developed by Mitsubishi and NTT of Japan. It is designed to be suitable for both software and hardware implementations. One of the design-for-test techniques using scan chains is called scan-path test, in which testers can observe and control the registers inside the LSI chip directly in order to check whether the LSI chip operates correctly. Recently, a scan-based side-channel attack has been reported which retrieves the secret information from the cryptosystem using scan chains. In this paper, we propose a scan-based attack method on the Camellia cipher using scan signatures. Our proposed method is based on the equivalent transformation of the Camellia algorithm and the possible key candidate reduction in order to retrieve the secret key. Experimental results show that our proposed method successfully retrieves its 128-bit secret key using 960 plaintexts even if the scan chain includes the Camellia cipher and other circuits, and also successfully retrieves its secret key on the SASEBO-GII board, which is a side-channel attack standard evaluation board.

Citations (Scopus): 3

A Hardware-Trojans Identifying Method Based on Trojan Net Scoring at Gate-Level Netlists

Masaru Oya, Youhua Shi, Noritaka Yamashita, Toshihiko Okamura, Yukiyasu Tsunoo, Satoshi Goto, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A ( 12 ) 2537 - 2546 December 2015 [Refereed]

Outsourcing IC design and fabrication is one of the effective solutions to reduce design cost, but it may cause severe security risks. In particular, malicious outside vendors may implement Hardware Trojans (HTs) on ICs. When we focus on the IC design phase, we cannot assume an HT-free netlist or a Golden netlist, and it is too difficult to identify whether a given netlist is HT-free or not. In this paper, we propose a score-based hardware-Trojan identifying method at gate-level netlists without using a Golden netlist. Our proposed method does not directly detect HTs themselves in a gate-level netlist but instead detects a net included in HTs, which is called a Trojan net. Firstly, we observe Trojan nets from several HT-inserted benchmarks and extract several of their features. Secondly, we give scores to the extracted Trojan net features and sum them up for each net in the benchmarks. Then we can find a score threshold to classify HT-free and HT-inserted netlists. Based on these scores, we can successfully classify HT-free and HT-inserted netlists in all the Trust-HUB gate-level benchmarks and ISCAS85 benchmarks as well as HT-free and HT-inserted AES gate-level netlists. Experimental results demonstrate that our method successfully identifies all the HT-inserted gate-level benchmarks as "HT-inserted" and all the HT-free gate-level benchmarks as "HT-free" in approximately three hours for each benchmark.

Citations (Scopus): 12

ECC-Based Bit-Write Reduction Code Generation for Non-Volatile Memory

Masashi Tawada, Shinji Kimura, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A ( 12 ) 2494 - 2504 December 2015 [Refereed]

Non-volatile memory has many advantages such as high density and low leakage power, but it consumes more writing energy than SRAM. It is therefore necessary to reduce writing energy in non-volatile memory design. In this paper, we propose write-reduction codes based on error-correcting codes and reduce writing energy in non-volatile memory by decreasing the number of written bits. When data is written into memory cells, we do not write it directly but encode it into a codeword. In our write-reduction codes, every data word corresponds to an information vector in an error-correcting code, and an information vector corresponds not to a single codeword but to a set of write-reduction codewords. Given the data to be written and the current memory bits, we can deterministically select a particular write-reduction codeword corresponding to the data, such that the maximum number of flipped bits is theoretically minimized. Then the number of bits written into memory cells will also be minimized. Experimental results demonstrate that we have achieved a writing-bits reduction of 51% on average and an energy reduction of 33% on average compared to non-encoded memory.
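The selection step, choosing among several codewords for the same data the one closest to the current memory contents, can be illustrated with a deliberately simple stand-in code. The sketch below does not use the paper's ECC-based construction; it uses a flag-bit inversion scheme (in the spirit of Flip-N-Write) in which each k-bit data word has two codewords, plain and inverted. All names are hypothetical.

```python
def hamming(a, b):
    """Number of bit positions in which two words differ."""
    return bin(a ^ b).count("1")


def candidates(data, k):
    """Stand-in codeword set: k data bits plus one flag bit (LSB).

    Flag 0 stores the data as is; flag 1 stores it inverted.
    """
    mask = (1 << k) - 1
    plain = data << 1
    inverted = ((~data & mask) << 1) | 1
    return [plain, inverted]


def select_codeword(data, current, k):
    """Pick the codeword for `data` that flips the fewest bits of `current`."""
    return min(candidates(data, k), key=lambda c: hamming(c, current))


def decode(word, k):
    """Recover the data word from a stored codeword."""
    mask = (1 << k) - 1
    payload = word >> 1
    return (~payload & mask) if word & 1 else payload


# Memory currently holds 0b11110; writing data 0b0000 directly would flip 4 bits
chosen = select_codeword(0b0000, current=0b11110, k=4)
```

Because the two candidates are bitwise complements over k + 1 bits, at most half of the stored bits (rounded down) ever need to change in this stand-in scheme.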

Citations (Scopus): 2

Code Generation Limiting Maximum and Minimum Hamming Distances for Non-Volatile Memories

Tatsuro Kojo, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A ( 12 ) 2484 - 2493 December 2015 [Refereed]

Data stored in non-volatile memories may be corrupted due to crosstalk and radiation, but we can restore the data by using error-correcting codes. However, non-volatile memories consume a large amount of energy in writing. How to reduce the maximum number of written bits even when using error-correcting codes is one of the challenges in non-volatile memory design. In this paper, we first propose Doughnut code, which is based on state encoding limiting maximum and minimum Hamming distances. After that, we propose a code expansion method, which improves maximum and minimum Hamming distances. When we apply our code expansion method to Doughnut code, we can obtain a code which reduces maximum-flipped bits and has error-correcting ability equal to Hamming code. Experimental results show that the proposed code efficiently reduces the number of maximum-writing bits.
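The two quantities the abstract bounds can be measured for any candidate code with a short helper. This sketch only evaluates a given set of codewords; it does not construct Doughnut code or apply the expansion method, and the example code and function names are hypothetical.

```python
from itertools import combinations


def distance_bounds(codewords):
    """Return (minimum, maximum) pairwise Hamming distance of a code."""
    distances = [bin(a ^ b).count("1") for a, b in combinations(codewords, 2)]
    if not distances:
        raise ValueError("at least two codewords are required")
    return min(distances), max(distances)


def correctable_errors(min_distance):
    """A code with minimum distance d corrects floor((d - 1) / 2) bit errors."""
    return (min_distance - 1) // 2


# Hypothetical 3-bit code in which every pair of codewords differs in 2 bits
d_min, d_max = distance_bounds([0b000, 0b011, 0b101, 0b110])
```

The minimum distance governs error-correcting ability, while the maximum distance bounds how many bits can flip when one codeword overwrites another, which is the write cost the abstract aims to limit.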

Citations (Scopus): 2

A floorplan-driven high-level synthesis algorithm with multiple-operation chainings based on path enumeration

Kotaro Terada, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 2015- 2129 - 2132 July 2015 [Refereed]

As process technologies advance, interconnection delays are not negligible even in high-level synthesis and regular-distributed-register (RDR) architecture has been proposed to cope with this problem. In this paper, we propose a floorplan-driven high-level synthesis algorithm using multiple-operation chainings composed of two or more operations, and reduce the overall latency targeting RDR architecture. Our algorithm enumerates multiple-operation-chaining path candidates before performing scheduling/binding. Based on them, we find out optimal ones taking into account RDR floorplan information. Experimental results show that our algorithm successfully reduces the latency by up to 30.4% compared to the conventional approaches.
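The enumeration step can be sketched as a depth-first search over a data-flow graph that collects every path of two or more operations whose summed delay fits in one clock period. The graph, the delays, and the function name are hypothetical, and the sketch ignores interconnection delays and RDR floorplan information, which the paper takes into account when choosing among candidates.

```python
def chaining_candidates(successors, delay, clock_period):
    """Enumerate operation paths (length >= 2) that fit within one clock period.

    successors maps an operation to the operations that consume its result;
    delay maps each operation to its (assumed) functional-unit delay.
    """
    results = []

    def extend(path, total):
        if len(path) >= 2:
            results.append(tuple(path))
        for nxt in successors.get(path[-1], ()):
            new_total = total + delay[nxt]
            if new_total <= clock_period:
                extend(path + [nxt], new_total)

    for op in delay:
        if delay[op] <= clock_period:
            extend([op], delay[op])
    return results


# Hypothetical chain a -> b -> c with delays 1, 1, 2 and a clock period of 3
paths = chaining_candidates({"a": ["b"], "b": ["c"]},
                            {"a": 1, "b": 1, "c": 2}, clock_period=3)
```

Here the full chain a, b, c needs 4 time units and is rejected, while the two shorter chains remain as candidates for scheduling and binding.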

Citations (Scopus): 1

An Effective Suspicious Timing-Error Prediction Circuit Insertion Algorithm Minimizing Area Overhead

Shinnosuke Yoshida, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A ( 7 ) 1406 - 1418 July 2015 [Refereed]

As process technologies advance, timing-error correction techniques have become important as well. A suspicious timing-error prediction (STEP) technique has been proposed recently, which predicts timing errors by monitoring the middle points, or check points, of several speed-paths in a circuit. However, if we insert STEP circuits (STEPCs) in the middle points of all the paths from primary inputs to primary outputs, we need many STEPCs and thus incur too much area overhead. How to determine these check points is very important. In this paper, we propose an effective STEPC insertion algorithm minimizing area overhead. Our proposed algorithm moves the STEPC insertion positions to minimize inserted STEPC counts. We apply a max-flow and min-cut approach to determine the optimal positions of inserted STEPCs and reduce the required number of STEPCs to 1/10-1/80 and their area to 1/5-1/8 compared with a naive algorithm. Furthermore, our algorithm realizes 1.12X-1.5X overclocking compared with just inserting STEPCs into several speed-paths.

Citations (Scopus): 3

A Floorplan-Driven High-Level Synthesis Algorithm for Multiplexer Reduction Targeting FPGA Designs

Koichi Fujiwara, Kazushi Kawamura, Shin-ya Abe, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A ( 7 ) 1392 - 1405 July 2015 [Refereed]

Recently, high-level synthesis (HLS) techniques for FPGA designs are required in various applications such as computerized stock trading and reconfigurable network processing. In HLS for FPGA designs, we need to consider module floorplan and reduce multiplexer cost concurrently. In this paper, we propose a floorplan-driven HLS algorithm for multiplexer reduction targeting FPGA designs. By utilizing distributed-register architectures called HDR, we can easily consider module floorplan in HLS. In order to reduce multiplexer cost, we propose two novel binding methods called datapath-oriented scheduling/FU binding and datapath-oriented register binding. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the number of slices by up to 47% and latency by up to 22% compared with conventional approaches while the number of required control steps is almost the same.

Citations (Scopus): 2

An Energy-Efficient Floorplan Driven High-Level Synthesis Algorithm for Multiple Clock Domains Design

Shin-ya Abe, Youhua Shi, Kimiyoshi Usami, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A ( 7 ) 1376 - 1391 July 2015 [Refereed]

In this paper, we first propose an HDR-mcd architecture, which integrates periodically all-in-phase based multiple clock domains and multi-cycle interconnect communication into high-level synthesis. In HDR-mcd, an entire chip is divided into several huddles. Huddles can realize synchronization between different clock domains in which interconnection delay should be considered during high-level synthesis. Next, we propose a high-level synthesis algorithm for HDR-mcd, which can reduce energy consumption by optimizing configuration and placement of huddles. Experimental results show that the proposed method achieves 32.5% energy-saving compared with the existing single clock domain based methods.

Citations (Scopus): 1

A High-Level Synthesis Algorithm with Inter-Island Distance Based Operation Chainings for RDR Architectures

Kotaro Terada, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A (7) 1366-1375, July 2015 [Peer-reviewed]

In deep-submicron era, interconnection delays are not negligible even in high-level synthesis and regular-distributed-register architectures (RDR architectures) have been proposed to cope with this problem. In this paper, we propose a high-level synthesis algorithm using operation chainings which reduces the overall latency targeting RDR architectures. Our algorithm consists of three steps: The first step enumerates candidate operations for chaining. The second step introduces maximal chaining distance (MCD), which gives the maximal allowable inter-island distance on RDR architecture between chaining candidate operations. The last step performs list-scheduling and binding simultaneously based on the results of the two preceding steps. Our algorithm enumerates feasible chaining candidates and selects the best ones for RDR architecture. Experimental results show that our proposed algorithm reduces the latency by up to 40.0% compared to the original approach, and by up to 25.0% compared to a conventional approach. Our algorithm also reduces the number of registers and the number of multiplexers compared to the conventional approaches in some cases.

Citations (Scopus): 1

Energy-efficient High-level Synthesis for HDR Architecture with Multi-stage Clock Gating

Akasaka Hiroyuki, Abe Shin-ya, Yanagisawa Masao, Togawa Nozomu

Information and Media Technologies 10 (1) 1-7, 2015

With the miniaturization and higher performance of current and future LSIs, demand for portable devices has increased considerably, and problems of battery runtime and device overheating have arisen. In addition, as the LSI design process shrinks, the ratio of interconnection delay to gate delay has continued to increase. High-level synthesis that estimates interconnection delays and reduces energy consumption is therefore essential. In this paper, we propose a high-level synthesis algorithm based on HDR architectures (huddle-based distributed register architectures) utilizing multi-stage clock gating. By increasing the number of clock gating stages in each huddle, we increase the number of control steps at which we can apply clock gating to registers. We can then determine the clock gating configuration that optimizes energy consumption. Experimental results demonstrate that our proposed algorithm reduces energy consumption by up to 27.7% compared with conventional algorithms.

Fast source optimization by clustering algorithm based on lithography properties

Masashi Tawada, Takaki Hashimoto, Keishi Sakanushi, Shigeki Nojima, Toshiya Kotani, Masao Yanagisawa, Nozomu Togawa

DESIGN-PROCESS-TECHNOLOGY CO-OPTIMIZATION FOR MANUFACTURABILITY IX 9427, 2015 [Peer-reviewed]

Lithography is a technology for forming circuit patterns on a wafer. UV light diffracted by a photomask forms optical images on a photoresist, and the photoresist is melted where the amount of exposed UV light exceeds a threshold. The UV light diffracted by the photomask passes through a lens and exposes the photoresist on the wafer, and its light and dark areas generate patterns on the photoresist. As the technology node advances, the feature sizes on the photoresist become much smaller. Diffracted UV light is dispersed on the wafer, and exposing photoresists has become more difficult. Exposure source optimization (SO) techniques for optimizing the illumination shape have therefore been studied. Although an exposure source has hundreds of grid-points, all previous works deal with them one by one. They therefore consume too much running time, which greatly increases design time. Reducing the number of parameters to be optimized in SO is the key to decreasing source optimization time. In this paper, we propose a variation-resilient and high-speed cluster-based exposure source optimization algorithm. We focus on image log slope (ILS) and use it for generating clusters. When an optical image formed by a source shape has a small ILS value at an EPE (edge placement error) evaluation point, dose/focus variation strongly affects the EPE value. When it has a large ILS value at an evaluation point, dose/focus variation affects the EPE value less. In our algorithm, we cluster several grid-points with similar ILS values and reduce the number of parameters to be simultaneously optimized in SO. Our clustering algorithm is composed of two steps: in STEP 1, we cluster grid-points into four groups based on the ILS values of grid-points at each evaluation point; in STEP 2, we generate super clusters from the clusters generated in STEP 1. We consider the set of grid-points in each cluster to be a single light source element. As a result, we can solve the SO problem very quickly. Experimental results demonstrate that our algorithm achieves a speed-up over a conventional algorithm while maintaining the EPE values.

Citations (Scopus): 1

A Floorplan-Aware High-Level Synthesis Technique with Delay-Variation Tolerance

Kazushi Kawamura, Yuta Hagio, Youhua Shi, Nozomu Togawa

PROCEEDINGS OF THE 2015 IEEE INTERNATIONAL CONFERENCE ON ELECTRON DEVICES AND SOLID-STATE CIRCUITS (EDSSC) 122-125, 2015 [Peer-reviewed]

To realize a better trade-off between performance and yield rate in recent LSI designs, it is necessary to deal with the increasing ratio of interconnect delay as well as delay variation. In this paper, a novel floorplan-aware high-level synthesis technique with delay-variation tolerance is proposed. By utilizing floorplan-driven architectures, interconnect delays can be estimated and then handled even in high-level synthesis. Our technique makes it possible to realize two scheduling/binding results (a non-delayed result and a delayed result) simultaneously on a chip with small area/performance overhead, and either of them can be selected according to the post-silicon delay variation. Experimental results demonstrate that our technique can reduce delayed scheduling/binding latency by up to 32.3% compared with conventional approaches.
Scan-based Side-channel Attack against Symmetric Key Ciphers Using Scan Signatures

Mika Fujishiro, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

PROCEEDINGS OF THE 2015 IEEE INTERNATIONAL CONFERENCE ON ELECTRON DEVICES AND SOLID-STATE CIRCUITS (EDSSC) 309-312, 2015 [Peer-reviewed]

There are a number of studies on a side-channel attack which uses information exploited from the physical implementation of a cryptosystem. A scan-based side-channel attack utilizes scan chains, one of design-for-test techniques and retrieves the secret information inside the cryptosystem. In this paper, scan based side-channel attack methods against symmetric key ciphers such as block ciphers and stream ciphers using scan signatures are presented to show the risk of scan-based attacks.
Partitioning-Based Multiplexer Network Synthesis for Field-Data Extractors

Koki Ito, Yutaka Tamiya, Masao Yanagisawa, Nozomu Togawa

2015 28TH IEEE INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (SOCC) 263-268, 2015 [Peer-reviewed]

As seen in packet analysis of TCP/IP offload engine and stream data processing for video/audio data, it is necessary to extract a particular data field from bulk data, where we can use a field-data extractor. Particularly, an (M, N)-field-data extractor reads out any consecutive N bytes from an M-byte register by connecting its input/output using multiplexers. However, the number of required multiplexers increases too much as the input/output byte lengths increase. How to reduce the number of its required multiplexers is a major challenge. In this paper, we propose an efficient multiplexer network synthesis method for an (M, N)-field-data extractor. Our method is based on inserting an (N + B - 1)-byte virtual intermediate register into a multiplexer network and partitioning it into an upper network and a lower network. Our method theoretically reduces the number of required multiplexers without increasing the multiplexer network depth. We also propose how to determine the size of the virtual intermediate register that minimizes the number of required multiplexers. Experimental results show that our method reduces the required number of gates to implement a field-data extractor by up to 92% compared with the one using a naive multiplexer network.
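To make the (M, N)-field-data extractor described above concrete, here is a minimal behavioral sketch in Python. The function names `extract_field` and `naive_mux_inputs` are hypothetical, and the cost model (one (M - N + 1)-input selector per output byte) is only an assumption about what a naive network could look like; it does not model the paper's partitioned network or its virtual intermediate register.

```python
def extract_field(register: bytes, offset: int, n: int) -> bytes:
    """Behavioral model: read N consecutive bytes from an M-byte register."""
    m = len(register)
    if n < 1 or n > m or not 0 <= offset <= m - n:
        raise ValueError("field does not fit inside the register")
    return register[offset:offset + n]


def naive_mux_inputs(m: int, n: int) -> int:
    """Assumed naive cost: output byte i may come from any of the
    (M - N + 1) source bytes offset + i, so N selectors of that width."""
    return n * (m - n + 1)


register = bytes(range(8))              # an 8-byte register holding 0..7
print(extract_field(register, 2, 3))    # bytes at positions 2, 3, 4
print(naive_mux_inputs(8, 3))
```

Under this assumed cost model the selector input count grows roughly with M times N, which matches the abstract's remark that the naive network becomes too large as the byte lengths increase.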

Citations (Scopus): 3

A Process-Variation-Aware Multi-Scenario High-Level Synthesis Algorithm for Distributed-Register Architectures

Koki Igawa, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

2015 28TH IEEE INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (SOCC) 7-12, 2015 [Peer-reviewed]

In order to tackle a process-variation problem, we can define several scenarios, each of which corresponds to a particular LSI behavior, such as a typical-case scenario and a worst-case scenario. By designing a single LSI chip which realizes multiple scenarios simultaneously, we can have a process-variation-tolerant LSI chip. In this paper, we propose a process-variation-aware low-latency and multi-scenario high-level synthesis algorithm targeting new distributed-register architectures, called HDR architectures. We assume two scenarios, a typical-case scenario and a worst-case scenario, and realize them onto a single chip. We first schedule/bind each of the scenarios independently. After that, we commonize the scheduling/binding results for the typical-case and worst-case scenarios and thus generate a commonized area-minimized floorplan result. Experimental results show that our algorithm reduces the latency of the typical-case scenario by up to 50% without increasing the latency of the worst-case scenario, compared with several existing methods.

Citations (Scopus): 2

A Floorplan-Driven High-Level Synthesis Algorithm with Multiple-Operation Chainings based on Path Enumeration

Kotaro Terada, Masao Yanagisawa, Nozomu Togawa

2015 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) 2129-2132, 2015 [Peer-reviewed]

As process technologies advance, interconnection delays are not negligible even in high-level synthesis, and the regular-distributed-register (RDR) architecture has been proposed to cope with this problem. In this paper, we propose a floorplan-driven high-level synthesis algorithm using multiple-operation chainings composed of two or more operations, which reduces the overall latency targeting the RDR architecture. Our algorithm enumerates multiple-operation-chaining path candidates before performing scheduling/binding. Based on them, we find optimal ones taking into account RDR floorplan information. Experimental results show that our algorithm successfully reduces the latency by up to 30.4% compared to the conventional approaches.

Citations (Scopus): 1

Bit-Write-Reducing and Error-Correcting Code Generation by Clustering Error-Correcting Codewords for Non-Volatile Memories

Tatsuro Kojo, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

2015 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD) 682-689, 2015 [Peer-reviewed]

Non-volatile memories are attracting attention as a promising alternative in memory design. Data stored in them may still be corrupted by crosstalk and radiation. We can restore the data by using error-correcting codes, which require extra bits to correct bit errors. Further, non-volatile memories consume ten to a hundred times more energy than normal memories in bit-writing. When we configure them using error-correcting codes, it is therefore necessary to reduce the number of written bits. In this paper, we propose a method to generate a bit-write-reducing code with error-correcting ability. We first pick an error-correcting code which can correct t-bit errors. We cluster its codewords and generate a cluster graph satisfying the S-bit flip conditions. We assign a data value to be written to each cluster. In other words, we generate a one-to-many mapping from each data value to the codewords in the cluster. We prove that, if the cluster graph is a complete graph, every data value in a memory cell can be rewritten into another data value by flipping at most S bits while keeping the error-correcting ability at t bits. We further propose an efficient method to cluster error-correcting codewords. Experimental results demonstrate that, when we apply our bit-write-reducing code to MediaBench applications, it can reduce writing-bit counts by up to 28.2% and the energy consumption of non-volatile memory cells by up to 27.9% compared to existing error-correcting codes with the same error-correcting ability. This paper proposes the world's first theoretically near-optimal bit-write-reducing code with error-correcting ability based on efficient coding theories.

Citations (Scopus): 2

Effective Parallel Algorithm for GPGPU-Accelerated Explicit Routing Optimization

Ko Kikuta, Eiji Oki, Naoaki Yamanaka, Nozomu Togawa, Hidenori Nakazato

2015 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM) 1-6, 2015 [Peer-reviewed]

The recent development of network technologies that offer centralized control of explicit routes opens the door to the online optimization of explicit routing. For this kind of Traffic Engineering optimization, raising the calculation speeds by using multi-core processors with effective parallel algorithms is a key goal. This paper proposes an effective parallel algorithm for General purpose Programming on Graphic Processing Unit (GPGPU); its massively parallel style promises strong acceleration of calculation speed. The proposed algorithm parallelizes not only the search method of the Genetic Algorithm, but also its fitness functions, which calculate the network congestion ratio, so as to fully utilize the power of modern GPGPUs. Concurrently, each execution is designed for thread-block execution on the GPU with consideration of thread occupancy, local resources, and SIMT execution to maximize GPU performance. Evaluations show that the proposed algorithm offers, on average, a nine fold speedup compared to the conventional CPU approach.
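As a rough illustration of the fitness evaluation mentioned in the abstract above, the sketch below computes a network congestion ratio as the maximum link utilization over all links. That definition is common in traffic engineering but is an assumption here, as are the function name `congestion_ratio` and the dictionary-based data layout. The code runs sequentially on a CPU and does not reproduce the paper's genetic search or its GPU parallelization.

```python
def congestion_ratio(routes, demands, capacity):
    """Return the maximum link utilization for a set of explicit routes.

    routes:   flow id -> list of nodes along its explicit path (hypothetical layout)
    demands:  flow id -> traffic volume
    capacity: directed link (u, v) -> link capacity
    """
    load = {link: 0.0 for link in capacity}
    for flow, path in routes.items():
        # Each consecutive node pair on the path is one directed link.
        for link in zip(path, path[1:]):
            load[link] += demands[flow]
    return max(load[link] / capacity[link] for link in capacity)


capacity = {("a", "b"): 10.0, ("b", "c"): 10.0, ("a", "c"): 10.0}
demands = {"f1": 4.0, "f2": 5.0}
spread = {"f1": ["a", "b", "c"], "f2": ["a", "c"]}          # flows on different links
stacked = {"f1": ["a", "b", "c"], "f2": ["a", "b", "c"]}    # both flows share links
print(congestion_ratio(spread, demands, capacity))
print(congestion_ratio(stacked, demands, capacity))
```

A search procedure such as a genetic algorithm would call a function like this once per candidate routing, which is why the abstract treats parallelizing the fitness function as important.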

Citations (Scopus): 3

A Landmark-based Route Recommendation Method for Pedestrian Walking Strategies

Siya Bao, Tomoyuki Nitta, Daisuke Shindou, Masao Yanagisawa, Nozomu Togawa

2015 IEEE 4TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE) 672-673, 2015 [Peer-reviewed]

This paper proposes a landmark-based route recommendation method for enjoyable walking atmosphere strategies by accumulating and analyzing geographical information. We utilize landmark categorization and region clustering to obtain effective elements. Experimental results demonstrate that our proposed method improves walking environment quality and confirm that it is applicable in both urban and rural areas.

Citations (Scopus): 6

A Visible Corner-Landmark Based Route Finding Algorithm for Pedestrian Navigation

Kengo Takeda, Tomoyuki Nitta, Daisuke Shindou, Masao Yanagisawa, Nozomu Togawa

2015 IEEE 4TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE) 601-602, 2015 [Peer-reviewed]

Although many GPS-based pedestrian navigation systems have been released, their instructions at decision points are not sufficient. This is mainly due to a lack of landmark information, which may cause pedestrians to pass decision points or misunderstand when to turn. This paper proposes a visible corner-landmark based route finding algorithm. The proposed algorithm is based on visibility edges for landmarks and can obtain a pedestrian route that has visible landmarks at its corner points. Experiments demonstrate that the proposed algorithm can maximize the visible corner landmarks included in the generated routes.

Citations (Scopus): 3

A Score-Based Classification Method for Identifying Hardware-Trojans at Gate-Level Netlists

Masaru Oya, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

2015 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE) 465-470, 2015 [Peer-reviewed]

Recently, digital ICs are often designed by outside vendors to reduce design costs in the semiconductor industry, which introduces a severe risk that malicious attackers insert Hardware Trojans (HTs) into them. Since the IC design phase generates only a single design result, an RT-level or gate-level netlist for example, we cannot assume an HT-free netlist or a Golden netlist, and so it is very difficult to identify whether a generated netlist is HT-free or HT-inserted. In this paper, we propose a score-based classification method for identifying HT-free or HT-inserted gate-level netlists without using a Golden netlist. Our proposed method does not directly detect HTs themselves in a gate-level netlist but instead detects a net included in HTs, which is called a Trojan net. First, we observe Trojan nets in several HT-inserted benchmarks and extract several of their features. Second, we give scores to the extracted Trojan net features and sum them up for each net in the benchmarks. We can then find a score threshold to classify HT-free and HT-inserted netlists. Based on these scores, we can successfully classify HT-free and HT-inserted netlists in all the Trust-HUB gate-level benchmarks. Experimental results demonstrate that our method successfully identifies all the HT-inserted gate-level benchmarks as "HT-inserted" and all the HT-free gate-level benchmarks as "HT-free" in approximately three hours for each benchmark.
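The scoring idea in this abstract (score each net by the Trojan-net features it shows, then compare against a threshold) can be sketched as follows. Everything concrete here is hypothetical: the feature names, the weights in `FEATURE_SCORES`, the threshold, and the rule that a netlist is flagged when its highest-scoring net reaches the threshold. None of these values are taken from the paper.

```python
# Hypothetical feature weights; the paper's actual features are not given here.
FEATURE_SCORES = {
    "large_fan_in": 2,
    "far_from_outputs": 1,
    "low_switching": 1,
}


def net_score(features):
    """Sum the scores of the features observed on one net."""
    return sum(FEATURE_SCORES.get(name, 0) for name in features)


def classify_netlist(nets, threshold):
    """Flag the netlist if any net's summed score reaches the threshold.

    nets: net name -> set of feature names observed on that net
    """
    max_score = max((net_score(f) for f in nets.values()), default=0)
    return "HT-inserted" if max_score >= threshold else "HT-free"


suspicious = {
    "n1": {"large_fan_in"},
    "n2": {"large_fan_in", "far_from_outputs", "low_switching"},
}
clean = {"n1": {"low_switching"}, "n2": set()}
print(classify_netlist(suspicious, threshold=4))
print(classify_netlist(clean, threshold=4))
```

The appeal of this style of classifier is that it needs no reference (Golden) netlist, only the netlist under test and a threshold calibrated on known benchmarks.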
A Bit-Write Reduction Method based on Error-Correcting Codes for Non-Volatile Memories

Masashi Tawada, Shinji Kimura, Masao Yanagisawa, Nozomu Togawa

2015 20TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC) 496-501, 2015 [Peer-reviewed]

Non-volatile memory has many advantages over SRAM. However, one of its largest problems is that it consumes a large amount of energy in writing. In this paper, we propose a bit-write reduction method based on error correcting codes for non-volatile memories. When a data is written into a memory cell, we do not write it directly but encode it into a codeword. We focus on error-correcting codes and generate new codes called write-reduction codes. In our write-reduction codes, each data corresponds to an information vector in an error-correcting code and an information vector corresponds not to a single codeword but a set of write-reduction codewords. Given a writing data and current memory bits, we can deterministically select a particular write-reduction codeword corresponding to a data to be written, where the maximum number of flipped bits are theoretically minimized. Then the number of writing bits into memory cells will also be minimized. We perform several experimental evaluations and demonstrate up to 72% energy reduction.
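The selection step described above, where one data value maps to several codewords and the write picks the one closest to the current cell contents, can be illustrated with a small sketch. The codebook below is a toy chosen only for readability; it is not an error-correcting code and is not taken from the paper, and the names `hamming_distance` and `select_codeword` are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def select_codeword(current_bits: int, candidates: list[int]) -> int:
    """Pick the candidate that flips the fewest bits of the current contents.

    Ties are broken by the smaller codeword value so the choice is deterministic.
    """
    return min(candidates, key=lambda c: (hamming_distance(current_bits, c), c))


# Toy one-to-many mapping: each data value owns a set of 4-bit codewords.
CODEBOOK = {
    0: [0b0000, 0b1111],
    1: [0b0011, 0b1100],
}

current = 0b0111
for data in (0, 1):
    chosen = select_codeword(current, CODEBOOK[data])
    flips = hamming_distance(current, chosen)
    print(f"data={data} -> write {chosen:04b} ({flips} bit flips)")
```

With a single codeword per data value, writing 0 over `0111` would flip three bits (to `0000`); the extra codeword `1111` brings that down to one, which is the effect the write-reduction codes aim for.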

Citations (Scopus): 6

Improved Monitoring-Path Selection Algorithm for Suspicious Timing Error Prediction based Timing Speculation

Shinnosuke Yoshida, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

PROCEEDINGS OF 2015 IEEE 11TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) 1-4, 2015 [Peer-reviewed]

As process technology scales down, timing speculation techniques such as Razor and STEP have emerged as alternative solutions to reduce the margins required by various variation effects. Unlike Razor, STEP is a prediction-based timing speculation method that predicts suspicious timing errors before they actually appear, and thus it can yield greater performance improvement. In this paper, an improved monitoring-path selection algorithm for STEP-based timing speculation is proposed, in which candidate monitoring-paths are selected based on short-path removal and path length estimation. Experimental results show that the proposed algorithm realizes an average of 1.71X overclocking compared with worst-case based designs.

A low-power soft error tolerant latch scheme

Saki Tajima, Youhua Shi, Nozomu Togawa, Masao Yanagisawa

PROCEEDINGS OF 2015 IEEE 11TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) 1-4, 2015 [Peer-reviewed]

As process technology continues scaling, low power and reliability of integrated circuits are becoming more critical than ever before. Particularly, due to the reduction of node capacitance and operating voltage for low power consumption, it makes the circuits more sensitive to high-energy particles induced soft errors. In this paper, a soft-error tolerant latch called TSPC-SEH is proposed for soft error tolerance with low power consumption. The simulation results show that the proposed TSPC-SEH latch can achieve up to 42% power consumption reduction and 54% delay improvement compared to the existing soft error tolerant SEH and DICE designs.

Citations (Scopus): 1

Small-Sized and Noise-Reducing Power Analyzer Design for Low-Power IoT Devices

Ryosuke Kitayama, Takashi Takenaka, Masao Yanagisawa, Nozomu Togawa

PROCEEDINGS OF 2015 IEEE 11TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) 1-4, 2015 [Peer-reviewed]

Power analysis for IoT devices is strongly required to reduce power consumption and realize secure communications, and this calls for a small-sized power analyzer that can reduce noise over a wide frequency range. In this paper, we propose a small-sized and noise-reduced power analyzer for IoT devices. We utilize a signal averaging method to reduce noise over a wide frequency range. How to synchronize the power analyzer with the target IoT device then becomes the key problem. We solve this problem by (a) using synchronization signals generated by a general-purpose I/O interface of a microprocessor and (b) introducing a data-order correction process. We analyze the power/energy consumption of the encryption process of the LED block cipher on the IoT device and obtain an average power of 146.3 mW and an energy of 3.84 mJ. The proposed power analyzer is implemented on a board of only 5 cm x 5 cm, yet these results have only 5% error compared to a high-precision oscilloscope.
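Signal averaging, as named in the abstract above, can be sketched in a few lines: repeated traces of the same operation are aligned on a trigger and averaged point by point, so uncorrelated noise tends to cancel while the repeated signal remains. The sketch assumes the traces are already synchronized and of equal length (the part the paper solves in hardware); the function name `average_traces`, the waveform, and the noise level are illustrative assumptions.

```python
import random


def average_traces(traces):
    """Point-wise average of equal-length, trigger-aligned traces."""
    if not traces:
        raise ValueError("need at least one trace")
    length = len(traces[0])
    if any(len(trace) != length for trace in traces):
        raise ValueError("all traces must have the same length")
    return [sum(samples) / len(samples) for samples in zip(*traces)]


# Synthetic demo: one underlying power waveform plus Gaussian noise per run.
rng = random.Random(0)
signal = [100.0, 150.0, 150.0, 120.0, 100.0]   # hypothetical values in mW
noisy_runs = [[s + rng.gauss(0.0, 10.0) for s in signal] for _ in range(1000)]
averaged = average_traces(noisy_runs)

worst_single = max(abs(a - s) for a, s in zip(noisy_runs[0], signal))
worst_averaged = max(abs(a - s) for a, s in zip(averaged, signal))
print(f"worst error, single run: {worst_single:.2f} mW")
print(f"worst error, averaged:   {worst_averaged:.2f} mW")
```

For independent noise the standard deviation of the average shrinks with the square root of the number of traces, which is why accurate synchronization between analyzer and target matters: misaligned traces would blur the signal rather than the noise.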

Citations (Scopus): 2

Image Synthesis Circuit Design Using Selector-logic-based Alpha Blending and Its FPGA Implementation

Keita Igarashi, Masao Yanagisawa, Nozomu Togawa

PROCEEDINGS OF 2015 IEEE 11TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) 1-4, 2015 [Peer-reviewed]

Alpha blending is one of image synthesis techniques, which synthesizes a new image by summing up weighted input images and realizes transparent effect. In this paper, we focus on alpha blending using selector logics and implement it on an FPGA board. By applying selector logics to the alpha blending operation, its total product terms are decreased and thus a circuit size and circuit delay are improved simultaneously. In our implementation, original pixel values are stored into a memory on the FPGA board and then a new pixel value is synthesized based on input transmittance factors. We realize approximately 23% speed-up and 8% area reduction simultaneously using selector-logic based alpha blending.
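For reference, the arithmetic that alpha blending performs on one pixel can be written as a weighted sum of two inputs. The sketch below is a plain integer software model assuming 8-bit channels and an 8-bit weight; the name `alpha_blend` is hypothetical, and the code does not model the selector-logic circuit that the paper uses to implement the operation.

```python
def alpha_blend(fg: int, bg: int, alpha: int) -> int:
    """Blend two 8-bit pixel values; alpha=255 keeps fg, alpha=0 keeps bg."""
    for name, value in (("fg", fg), ("bg", bg), ("alpha", alpha)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must be in 0..255")
    # Weighted sum of the two inputs; adding 127 rounds to the nearest integer.
    return (alpha * fg + (255 - alpha) * bg + 127) // 255


print(alpha_blend(200, 100, 255))   # fully opaque foreground
print(alpha_blend(200, 100, 0))     # fully transparent foreground
print(alpha_blend(200, 100, 128))   # roughly half and half
```

The two multiplications and the final addition are the product terms a hardware implementation has to realize, so they are the natural target for the circuit-level optimization the abstract describes.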

Citations (Scopus): 4

Clock Skew Estimate Modeling for FPGA High-level Synthesis and Its Application

Koichi Fujiwara, Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

PROCEEDINGS OF 2015 IEEE 11TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) 1-4, 2015 [Peer-reviewed]

Recently, high-level synthesis (HLS) techniques for FPGA designs are required in various applications. Clock network in FPGA has already been built before implementing any circuits, which may lead a large impact of clock skews and then degrade operation frequency. In this paper, we formulate a clock skew estimate model for FPGA-HLS (CSEF). CSEF is an accurate model to estimate clock skews in HLS flow. CSEF is then integrated into a floorplan-aware HLS algorithm targeting FPGA designs. Experimental results demonstrate that our HLS algorithm can realize FPGA designs which reduce the latency by up to 19% compared with conventional approaches.

Citations (Scopus): 4

Scan-Based Side-Channel Attack on the LED Block Cipher Using Scan Signatures

Mika Fujishiro, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E97A (12) 2434-2442, December 2014 [Peer-reviewed]

The LED (Light Encryption Device) block cipher, a lightweight block cipher, is very compact in hardware. Its encryption process is composed of AES-like rounds. Recently, a scan-based side-channel attack has been reported which retrieves the secret information inside a cryptosystem by utilizing scan chains, a design-for-test technique. In this paper, a scan-based attack method on the LED block cipher using scan signatures is proposed. In our proposed method, we focus on a particular 16-bit position in scanned data obtained from an LED LSI chip and retrieve its secret key using scan signatures. Experimental results show that our proposed method successfully retrieves its 64-bit secret key using 36 plaintexts on average if the scan chain is only connected to the LED block cipher. These experimental results also show that the key is successfully retrieved even if the scan chain includes 130,000 additional 1-bit data items.

Citations (Scopus): 4

Scan-Based Attack against Trivium Stream Cipher Using Scan Signatures

Mika Fujishiro, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E97A (7) 1444-1451, July 2014 [Peer-reviewed]

Trivium is a synchronous stream cipher using three shift registers. It is designed to have a simple structure and runs at high speed. A scan-based side-channel attack retrieves secret information using scan chains, one of design-for-test techniques. In this paper, a scan-based side-channel attack method against Trivium using scan signatures is proposed. In our method, we reconstruct a previous internal state in Trivium one by one from the internal state just when a ciphertext is generated. When we retrieve the internal state, we focus on a particular 1-bit position in a collection of scan chains and then we can attack Trivium even if the scan chain includes other registers than internal state registers in Trivium. Experimental results show that our proposed method successfully retrieves a plaintext from a ciphertext generated by Trivium.
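The step "reconstruct a previous internal state one by one" relies on Trivium's state update being invertible. The sketch below shows one forward clock and its inverse on a 288-bit state. The tap positions are written from the published Trivium specification as best recalled and should be checked against the specification before any serious use; the function names are hypothetical, and nothing here models scan chains or scan signatures.

```python
import random


def trivium_step(s):
    """One Trivium clock on a 288-bit state (s[0] is s1). Returns (state, z)."""
    t1 = s[65] ^ s[92]
    t2 = s[161] ^ s[176]
    t3 = s[242] ^ s[287]
    z = t1 ^ t2 ^ t3                       # keystream bit
    t1 ^= (s[90] & s[91]) ^ s[170]
    t2 ^= (s[174] & s[175]) ^ s[263]
    t3 ^= (s[285] & s[286]) ^ s[68]
    # Shift each of the three registers by one and feed in the new bits.
    return [t3] + s[0:92] + [t1] + s[93:176] + [t2] + s[177:287], z


def trivium_unstep(n):
    """Recover the state that preceded n by undoing one clock."""
    # All bits except the three shifted-out ones are just shifted copies.
    old = n[1:93] + [0] + n[94:177] + [0] + n[178:288] + [0]
    # Solve the three feedback equations for the shifted-out bits.
    old[92] = n[93] ^ old[65] ^ (old[90] & old[91]) ^ old[170]
    old[176] = n[177] ^ old[161] ^ (old[174] & old[175]) ^ old[263]
    old[287] = n[0] ^ old[242] ^ (old[285] & old[286]) ^ old[68]
    return old


rng = random.Random(0)
state = [rng.randint(0, 1) for _ in range(288)]
later = state
for _ in range(100):
    later, _bit = trivium_step(later)
recovered = later
for _ in range(100):
    recovered = trivium_unstep(recovered)
print(recovered == state)
```

Because every clock can be undone this way, an attacker who learns the full internal state at any one moment can walk it backwards, which is why leaking register contents through a scan chain is so damaging for this cipher.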

Citations (Scopus): 7

Foreword: Special section on VLSI design and CAD algorithms

Yamada, A., Higami, Y., Takagi, K., Amagasaki, M., Ikeda, M., Ishihara, T., Ito, K., Usami, K., Okada, K., Kajihara, S., Kaneko, M., Kawaguchi, H., Kimura, S., Kurokawa, A., Shibata, Y., Seto, K., Song, T., Takashima, Y., Takahashi, A., Takenaka, T., Togawa, N., Tomiyama, H., Nakatake, S., Nakamura, Y., Hashimoto, M., Hamaguchi, K., Higuchi, H., Hirose, T., Fukuda, D., Matsumoto, T., Miura, Y., Minato, S.-I., Minami, F., Yamashita, S., Yuminaka, Y., Yoshikawa, M., Watanabe, T.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E97A (12) 2366-2366, 2014
A Delay-variation-aware High-level Synthesis Algorithm for RDR Architectures

Hagio Yuta, Yanagisawa Masao, Togawa Nozomu

Information and Media Technologies 9 (4) 446-455, 2014

As device feature size drops, interconnection delays often exceed gate delays. We have to incorporate interconnection delays even in high-level synthesis. Using RDR architectures is one of the effective solutions to this problem. At the same time, process and delay variation also becomes a serious problem which may result in several timing errors. How to deal with this problem is another key issue in high-level synthesis. In this paper, we propose a delay-variation-aware high-level synthesis algorithm for RDR architectures. We first obtain a non-delayed scheduling/binding result and, based on it, we also obtain a delayed scheduling/binding result. By adding several extra functional units to vacant RDR islands, we can have a delayed scheduling/binding result so that its latency is not much increased compared with the non-delayed one. After that, we similarize the two scheduling/binding results by repeatedly modifying their results. We can finally realize non-delayed and delayed scheduling/binding results simultaneously on RDR architecture with almost no area/performance overheads and we can select either one of them depending on post-silicon delay variation. Experimental results show that our algorithm successfully reduces delayed scheduling/binding latency by up to 42.9% compared with the conventional approach.

Scan-based attack on the LED block cipher using scan signatures

Mika Fujishiro, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 1460-1463, 2014 [Peer-reviewed]

The LED (Light Encryption Device) block cipher, a lightweight block cipher, is very compact in hardware. Its encryption process is composed of AES-like rounds. Recently, a scan-based side-channel attack has been reported which retrieves the secret information inside a cryptosystem by utilizing scan chains, a design-for-test technique. In this paper, a scan-based attack method on the LED block cipher using scan signatures is proposed. In our proposed method, we focus on a particular 16-bit position in scanned data obtained from an LED LSI chip and retrieve its secret key using scan signatures. Experimental results show that our proposed method successfully retrieves its 64-bit secret key using 73 plaintexts on average if the scan chain is only connected to the LED block cipher. These experimental results also show that the key is successfully retrieved even if the scan chain includes some 4000 additional 1-bit registers. © 2014 IEEE.

Citations (Scopus): 4

Linear and bi-linear interpolation circuits using selector logics and their evaluations

Masashi Shio, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 1436-1439, 2014 [Peer-reviewed]

Interpolation is a technique that estimates a value between existing data points, and it is often used for image scaling and distortion correction. Linear interpolation interpolates in-between values by linearly connecting two known values. Bi-linear interpolation interpolates a value linearly from its four surrounding points. Both are widely used in practice. In this paper, we propose high-speed and small-sized linear and bi-linear interpolation circuits based on selector logics. The proposed linear and bi-linear interpolation circuits reduce carry propagation delays by using selector logics and thereby realize fast and small-sized circuits. We have implemented our linear interpolation circuit and bi-linear interpolation circuits in several ways and evaluated each of them. We find that a selector-based bi-linear interpolation circuit in which partial products are summed up by using the arithmetic operator reduces area by up to 42% and delay by up to 18% compared with a conventional design. © 2014 IEEE.
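The two operations described in this abstract can be stated compactly in software. This is a floating-point reference model only, with hypothetical function names `lerp` and `bilerp`; it says nothing about the selector-logic circuit structure that the paper proposes.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a (t=0) and b (t=1)."""
    return a + (b - a) * t


def bilerp(p00: float, p10: float, p01: float, p11: float,
           tx: float, ty: float) -> float:
    """Bi-linear interpolation from four surrounding points.

    p00 and p10 are the two points on one row, p01 and p11 on the next row;
    tx and ty are the fractional positions along each axis in [0, 1].
    """
    top = lerp(p00, p10, tx)
    bottom = lerp(p01, p11, tx)
    return lerp(top, bottom, ty)


print(lerp(10.0, 20.0, 0.25))
print(bilerp(0.0, 10.0, 20.0, 30.0, 0.5, 0.5))
```

Bi-linear interpolation is three linear interpolations in sequence, so a fast linear interpolation unit directly benefits the bi-linear case as well.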

Citations (Scopus): 12

Throughput Driven Check Point Selection in Suspicious Timing Error Prediction based Designs

Hiroaki Igarashi, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

2014 IEEE 5TH LATIN AMERICAN SYMPOSIUM ON CIRCUITS AND SYSTEMS (LASCAS) 1-4, 2014 [Peer-reviewed]

In this paper, a throughput-driven design technique is proposed, in which a suspicious timing error prediction circuit is inserted to monitor the signal transitions at some selected check points. Unlike previous works where timing errors are detected after their occurrence, the proposed method uses the real intermediate signal transitions for timing error prediction. The check point selection affects both the maximal operation frequency and the suspicious timing error overestimation rate, both of which have an effect on the overall throughput, so an analysis of the check point selection is also given. In our work, the circuit can be overclocked by a factor of 2 or more with negligible area overhead while always guaranteeing correct output.

DOI

Scopus
Scan-based Attack on the LED Block Cipher Using Scan Signatures

Mika Fujishiro, Masao Yanagisawa, Nozomu Togawa

2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) 1460 - 1463 2014 [Refereed]

The LED (Light Encryption Device) block cipher, a lightweight block cipher, is very compact in hardware. Its encryption process is composed of AES-like rounds. Recently, a scan-based side-channel attack has been reported which retrieves the secret information inside a cryptosystem by utilizing scan chains, a design-for-test technique. In this paper, a scan-based attack method on the LED block cipher using scan signatures is proposed. In our proposed method, we focus on a particular 16-bit position in scanned data obtained from an LED LSI chip and retrieve its secret key using scan signatures. Experimental results show that our proposed method successfully retrieves its 64-bit secret key using 73 plaintexts on average if the scan chain is only connected to the LED block cipher. These experimental results also show that the key is successfully retrieved even if the scan chain includes some 4000 additional 1-bit registers.

DOI

Scopus

Citations (Scopus): 4
Linear and Bi-linear Interpolation Circuits Using Selector Logics and their Evaluations

Masashi Shio, Masao Yanagisawa, Nozomu Togawa

2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) 1436 - 1439 2014 [Refereed]

Interpolation is a technique that estimates a value between existing data points and is often used for image scaling and distortion correction. Linear interpolation estimates in-between values by linearly connecting two known values. Bi-linear interpolation estimates a value linearly from its four surrounding points. Both are widely used in practice. In this paper, we propose high-speed and small-sized linear and bi-linear interpolation circuits based on selector logics. The proposed circuits reduce carry propagation delays by using selector logics and thereby realize fast and small-sized circuits. We have implemented our linear and bi-linear interpolation circuits in several ways and evaluated each of them. We find that a selector-based bi-linear interpolation circuit whose partial products are summed up by using the arithmetic operator reduces area by up to 42% and delay by up to 18% compared with a conventional design.
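
The two operations this abstract describes can be written down in a few lines. The following is a minimal software sketch of the arithmetic only, not of the selector-logic circuits proposed in the paper; the function names `lerp` and `bilerp` and the corner naming are assumptions made for illustration.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation: the point a fraction t (0..1) of the way from a to b."""
    return a + (b - a) * t


def bilerp(p00: float, p10: float, p01: float, p11: float,
           tx: float, ty: float) -> float:
    """Bi-linear interpolation from four surrounding values.

    p00, p10 are the two values on one row (x = 0 and x = 1);
    p01, p11 are the two values on the next row.
    """
    top = lerp(p00, p10, tx)        # interpolate along x on the first row
    bottom = lerp(p01, p11, tx)     # interpolate along x on the second row
    return lerp(top, bottom, ty)    # interpolate between the rows along y


if __name__ == "__main__":
    print(lerp(10.0, 20.0, 0.25))                     # 12.5
    print(bilerp(0.0, 10.0, 20.0, 30.0, 0.5, 0.5))    # 15.0
```

Each bi-linear result costs three linear interpolations, each of which involves a multiplication, which is presumably why the carry propagation in the partial-product sums is the delay the paper targets.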

DOI

Scopus

Citations (Scopus): 12
In-situ Timing Monitoring Methods for Variation-Resilient Designs

Youhua Shi, Nozomu Togawa

2014 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS) 735 - 738 2014 [Refereed]

With technology scaling, process, voltage, and temperature (PVT) variations pose great challenges to integrated circuit designs. Conventionally, LSI circuits are designed by adding a pessimistic timing margin to guarantee "always correct" operations even under worst-case conditions. However, due to the increasing PVT variations, an unacceptably large design guard band must be reserved to avoid timing errors on critical paths, which leads to very inefficient designs in terms of power and performance. For this reason, in-situ timing monitoring techniques have gained great research interest. In this paper, we review existing variation-resilient design techniques with particular emphasis on in-situ timing monitoring techniques, including both detection-based and prediction-based methods. The effectiveness of in-situ timing monitoring techniques is discussed. Finally, we show an example of an in-situ timing monitoring technique called STEP with applications to general pipeline designs.

DOI

Scopus
Secure scan design using improved random order and its evaluations

Masaru Oya, Yuta Atobe, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

2014 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS) 555 - 558 2014 [Refereed]

Scan test using scan chains is one of the most important DFT techniques. However, scan-based attacks have been reported which can retrieve the secret key in crypto circuits by using scan chains. A secure scan architecture is strongly required to protect scan chains from scan-based attacks. This paper proposes an improved version of random order as a secure scan architecture. In improved random order, a scan chain is partitioned into multiple sub-chains. The structure of the scan chain changes dynamically by selecting a sub-chain to scan out. Testability and security of the proposed improved random order are also discussed in the paper, and the implementation results demonstrate the effectiveness of the proposed method.

DOI

Scopus

Citations (Scopus): 8
A Write-Reducing and Error-Correcting Code Generation Method for Non-Volatile Memories

Tatsuro Kojo, Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

2014 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS) 304 - 307 2014 [Refereed]

Data stored in non-volatile memories may be corrupted by crosstalk and radiation, but they can be restored by using error-correcting codes. However, non-volatile memories consume a large amount of energy in writing. Reducing the number of written bits even when using error-correcting codes is one of the challenges in non-volatile memory design. In this paper, we propose a new write-reducing and error-correcting code, called Doughnut code. Doughnut code is based on state encoding that limits maximum and minimum Hamming distances. We then propose a code expansion method, which improves minimum and maximum Hamming distances by expanding a write-reducing code. When we apply our code expansion method to Doughnut code, we obtain a write-reducing code whose error-correcting ability is equal to that of Hamming code. Experimental results show that the proposed write-reducing code reduces the number of written bits by up to 36% compared to Hamming code.

DOI

Scopus

Citations (Scopus): 3
An Area-Overhead-Oriented Monitoring-Path Selection Algorithm for Suspicious Timing Error Prediction

Shinnosuke Yoshida, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

2014 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS) 300 - 303 2014 [Refereed]

As process technologies advance, the importance of timing error correction techniques is increasing as well. In this paper, we propose an area-overhead-oriented monitoring-path selection algorithm for suspicious timing error prediction circuits (STEPCs). A STEPC predicts timing errors by monitoring the middle points of several speed-paths in a circuit. However, we need many STEPCs with a high area overhead to predict timing errors in an overall circuit. Our proposed method moves the STEPC insertion positions to minimize the number of inserted STEPCs. We apply a max-flow and min-cut approach to determine the optimal positions of inserted STEPCs. Our proposed algorithm reduces the required number of STEPCs to 1/19 and their area to 1/5 compared with a naive algorithm. Furthermore, our algorithm realizes 2.25X overclocking compared with just inserting STEPCs into several speed-paths.

DOI

Scopus

Citations (Scopus): 1
Scan-based Side-Channel Attack on Camellia Cipher Using Scan Signatures

Huiqian Jiang, Mika Fujishiro, Hirokazu Kodera, Masao Yanagisawa, Nozomu Togawa

2014 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS) 252 - 255 2014 [Refereed]

Camellia, a block cipher jointly developed by Mitsubishi and NTT of Japan, is suitable for both software and hardware implementations and more secure than the AES cipher. One design-for-test technique using scan chains is the scan-path test, in which testers can observe and control registers inside the LSI chip directly. Recently, a scan-based side-channel attack has been reported which retrieves secret information from a cryptosystem using scan chains. In this paper, we propose a scan-based attack method on the Camellia cipher using scan signatures. Our proposed method is based on equivalent transformation of the Camellia algorithm and key pattern reduction in order to retrieve the secret key. Experimental results show that our proposed method successfully retrieves its 128-bit secret key using 960 plaintexts if the scan chain is only connected to the Camellia cipher, and also successfully retrieves its key on SASEBO-GII, which is a side-channel attack standard evaluation board.

DOI

Scopus

Citations (Scopus): 3
A Floorplan-Driven High-Level Synthesis Algorithm with Operation Chainings Using Chaining Enumeration

Kotaro Terada, Masao Yanagisawa, Nozomu Togawa

2014 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS) 248 - 251 2014 [Refereed]

In the deep-submicron era, interconnection delays are not negligible even in high-level synthesis, and the RDR (Regular-Distributed-Register) architecture has been proposed to cope with this problem. In this paper, we propose a high-level synthesis algorithm using operation chainings which reduces the overall latency targeting RDR architectures. Our algorithm consists of three steps. The first step enumerates candidates for chaining. The second step introduces maximal chaining distance (MCD), which gives the maximum allowable distance on the RDR architecture between chaining candidate operations. The last step performs list-scheduling and binding simultaneously using the results of the two preceding steps. Our algorithm enumerates feasible chaining candidates and selects the best ones for the RDR architecture. Experimental results show that our algorithm reduces the latency by up to 28.6%, the number of registers by up to 37.5%, and the number of multiplexers by up to 25.0% compared to the conventional approaches.

DOI

Scopus

Citations (Scopus): 1
A Floorplan-Aware High-level Synthesis Algorithm for Multiplexer Reduction Targeting FPGA Designs

Koichi Fujiwara, Shinya Abe, Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

2014 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS) 244 - 247 2014 [Refereed]

Recently, high-level synthesis (HLS) techniques for FPGA designs are required in various applications such as computerized stock trading and reconfigurable network processing. In HLS for FPGA designs, we need to consider the module floorplan and reduce multiplexer cost concurrently. In this paper, we propose a floorplan-aware HLS algorithm for multiplexer reduction targeting FPGA designs. By utilizing a distributed-register architecture called HDR, we can easily consider the module floorplan in HLS. In order to reduce multiplexer cost, we propose two novel binding methods called datapath-oriented scheduling/FU binding and datapath-oriented register binding. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the number of slices by up to 47% and circuit delay by up to 16% compared with the conventional approach.

DOI

Scopus

Citations (Scopus): 2
A delay-variation-aware high-level synthesis algorithm for RDR architectures

Yuta Hagio, Masao Yanagisawa, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 7 81 - 90 2014 [Refereed]

As device feature size drops, interconnection delays often exceed gate delays. We have to incorporate interconnection delays even in high-level synthesis. Using RDR architectures is one of the effective solutions to this problem. At the same time, process and delay variation also becomes a serious problem which may result in several timing errors. How to deal with this problem is another key issue in high-level synthesis. In this paper, we propose a delay-variation-aware high-level synthesis algorithm for RDR architectures. We first obtain a non-delayed scheduling/binding result and, based on it, we also obtain a delayed scheduling/binding result. By adding several extra functional units to vacant RDR islands, we can have a delayed scheduling/binding result so that its latency is not much increased compared with the non-delayed one. After that, we similarize the two scheduling/binding results by repeatedly modifying their results. We can finally realize non-delayed and delayed scheduling/binding results simultaneously on RDR architecture with almost no area/performance overheads and we can select either one of them depending on post-silicon delay variation. Experimental results show that our algorithm successfully reduces delayed scheduling/binding latency by up to 42.9% compared with the conventional approach.

DOI

Scopus

Citations (Scopus): 5
Energy-efficient high-level synthesis for HDR architecture with multi-stage clock gating

Hiroyuki Akasaka, Shin-Ya Abe, Masao Yanagisawa, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 7 74 - 80 2014 [Refereed]

With the miniaturization and rising performance of current and future LSIs, demand for portable devices has increased considerably. In particular, problems of battery runtime and device overheating have arisen. In addition, as the LSI design process shrinks, the ratio of interconnection delay to gate delay has continued to increase. High-level synthesis that estimates interconnection delays and reduces energy consumption is essential. In this paper, we propose a high-level synthesis algorithm based on HDR architectures (huddle-based distributed register architectures) utilizing multi-stage clock gating. By increasing the number of clock gating stages in each huddle, we increase the number of control steps at which we can apply clock gating to registers. We can determine the configuration of the clock gating with optimized energy consumption. The experimental results demonstrate that our proposed algorithm reduces energy consumption by up to 27.7% compared with conventional algorithms.

DOI

Scopus
Floorplan Driven Architecture and High-Level Synthesis Algorithm for Dynamic Multiple Supply Voltages

Shin-ya Abe, Youhua Shi, Kimiyoshi Usami, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E96A ( 12 ) 2597 - 2611 December 2013 [Refereed]

In this paper, we propose an adaptive voltage huddle-based distributed-register architecture (AVHDR architecture), which integrates dynamic multiple supply voltages and interconnection delay into high-level synthesis. In AVHDR architecture, voltages can be dynamically assigned for energy reduction. In other words, low supply voltages are assigned to non-critical operations, and leakage power is cut off by turning off the power supply to the sleeping functional units. Next, an AVHDR-based high-level synthesis algorithm is proposed. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, the modules in each huddle can be placed close to each other and the corresponding AVHDR architecture can be generated and optimized with floorplanning information. Experimental results show that on average our algorithm achieves 43.9% energy-saving compared with conventional algorithms.

DOI

Scopus

Citations (Scopus): 2
A High-Speed Trace-Driven Cache Configuration Simulator for Dual-Core Processor L1 Caches

Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E96A ( 6 ) 1283 - 1292 June 2013 [Refereed]

Recently, multi-core processors have been used very often in embedded systems. Since the application programs running on embedded systems are very limited, there must exist an optimal cache memory configuration in terms of power and area. Simulating application programs on various cache configurations is one of the best options to determine the optimal one. Multi-core cache configuration simulation, however, is much more complicated and takes much more time than single-core cache configuration simulation. In this paper, we propose a very fast dual-core L1 cache configuration simulation algorithm. We first propose a new data structure where just a single data structure represents two or more multi-core cache configurations with different cache associativities. After that, we propose a new multi-core cache configuration simulation algorithm using our new data structure associated with new theorems. Experimental results demonstrate that our algorithm obtains exact simulation results but runs 20 times faster than a conventional approach.

DOI

Scopus
Scan-based Attack against DES and Triple DES Cryptosystems Using Scan Signatures

Kodera Hirokazu, Yanagisawa Masao, Togawa Nozomu

Information and Media Technologies 8 ( 3 ) 867 - 874 2013

A scan-path test is one of the useful design-for-test techniques, in which testers can observe and control registers inside the target LSI chip directly. On the other hand, the risk of side-channel attacks against cryptographic LSIs and modules has been pointed out. In particular, scan-based attacks which retrieve secret keys by analyzing scan data obtained from scan chains have been attracting attention. In this paper, we propose two scan-based attack methods against DES and Triple DES using scan signatures. Our proposed methods are based on focusing on particular bit-column-data in a set of scan data and observing their changes when giving several plaintexts. Based on this property, we introduce the idea of a scan signature first and apply it to DES cryptosystems. In DES cryptosystems, we can retrieve secret keys by partitioning the S-BOX process into eight independent sub-processes and reducing the number of the round key candidates from 2⁴⁸ to 2⁶ × 8 = 512. In Triple DES cryptosystems, three secret keys are used to encrypt plaintexts. Then we retrieve them one by one, using the similar technique as in DES cryptosystems. Although some problems occur when retrieving the second/third secret key, our proposed method effectively resolves them. Our proposed methods can retrieve secret keys even if a scan chain includes registers except a crypto module and attackers do not know when the encryption is really done in the crypto module. Experimental results demonstrate that we successfully retrieve the secret keys of a DES cryptosystem using at most 32 plaintexts and that of a Triple DES cryptosystem using at most 36 plaintexts.
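
The candidate-count reduction quoted in this abstract can be checked with a few lines of arithmetic. This is a minimal sketch that assumes only the standard DES structure (a 48-bit round key feeding eight S-boxes with 6-bit inputs each); the constant names are illustrative and are not taken from the paper, and the sketch does not model the scan-signature attack itself.

```python
# Standard DES round structure: eight S-boxes, each fed by 6 round-key bits.
NUM_SBOXES = 8
KEY_BITS_PER_SBOX = 6

# Treating the round key as one 48-bit unknown: all combinations at once.
joint_candidates = 2 ** (NUM_SBOXES * KEY_BITS_PER_SBOX)

# Treating each S-box as an independent sub-process: 2^6 candidates per
# S-box, examined one S-box at a time, so the counts add instead of multiply.
partitioned_candidates = NUM_SBOXES * 2 ** KEY_BITS_PER_SBOX

if __name__ == "__main__":
    print(joint_candidates)          # 281474976710656 (2^48)
    print(partitioned_candidates)    # 512
```

The saving comes entirely from independence: once each S-box can be tested separately, the search cost is a sum of eight small searches rather than their product.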

DOI CiNii
Energy-efficient High-level Synthesis for HDR Architectures with Clock Gating Based on Concurrency-oriented Scheduling

Akasaka Hiroyuki, Abe Shin-ya, Yanagisawa Masao, Togawa Nozomu

Information and Media Technologies 8 ( 4 ) 913 - 923 2013

With the miniaturization of LSIs and their increasing performance, demand for highly functional portable devices has grown significantly. At the same time, battery lifetime and device overheating are becoming major design problems hampering further LSI integration. On the other hand, the ratio of interconnection delay to gate delay has continued to increase as device feature size decreases. We have to estimate interconnection delays and reduce energy consumption even in a high-level synthesis stage. In this paper, we propose a high-level synthesis algorithm for huddle-based distributed-register architectures (HDR architectures) with clock gatings based on concurrency-oriented scheduling/functional unit binding. We assume coarse-grained clock gatings to huddles and we focus on the number of control steps, or gating steps, at which we can apply the clock gating to registers in every huddle. We propose two methods to increase gating steps. One is to schedule and bind operations so that they are performed at the same timing. By adjusting the clock gating timings in a high-level synthesis stage, we expect to enhance the effect of clock gatings more than by applying clock gatings after logic synthesis. The other is to synthesize huddles such that each synthesized huddle includes registers which have similar or identical clock gating timings. At this time, we determine the clock gating timings to minimize all energy consumption including clock tree energy. The experimental results show that our proposed algorithm reduces energy consumption by a maximum of 23.8% compared with several conventional algorithms.

DOI CiNii
Concurrent faulty clock detection for crypto circuits against clock glitch based DFA

Hiroaki Igarashi, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 1432 - 1435 2013 [Refereed]

In this paper, a concurrent faulty clock detection method is proposed for crypto circuits against clock glitch based differential fault analysis (DFA). In the proposed method, a non-logic buffer-based delay chain is inserted, and then by monitoring the delay along the delay chain, a possible clock glitch based DFA can be detected. Experimental results on an AES circuit show that the proposed method can successfully detect clock glitch based attacks, and the required area overhead is only 0.47%, which is much smaller than in previous works.

DOI

Scopus

Citations (Scopus): 19
High-Level Synthesis with Post-Silicon Delay Tuning for RDR Architectures

Yuta Hagio, Masao Yanagisawa, Nozomu Togawa

2013 INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC) 194 - 197 2013 [Refereed]

In this paper, we propose a high-level synthesis algorithm with post-silicon delay tuning for RDR architectures. We first obtain a non-delayed scheduling/binding result and a delayed scheduling/binding result. By adding several extra functional units to vacant RDR islands, we have a delayed scheduling/binding result so that its latency cannot be increased compared with the non-delayed one. After that, we similarize the two scheduling/binding results by repeatedly modifying their results. We can finally realize non-delayed and delayed scheduling/binding results simultaneously on RDR architecture with almost no area/performance overheads and we can select either one of them depending on post-silicon delay variation. Experimental results show that our algorithm successfully reduces delayed scheduling/binding latency by up to 42.9% compared with the conventional approach.
An Energy-efficient High-level Synthesis Algorithm Incorporating Interconnection Delays and Dynamic Multiple Supply Voltages

Shin-ya Abe, Youhua Shi, Kimiyoshi Usami, Masao Yanagisawa, Nozomu Togawa

2013 INTERNATIONAL SYMPOSIUM ON VLSI DESIGN, AUTOMATION, AND TEST (VLSI-DAT) 1 - 4 2013 [Refereed]

In this paper, we propose an adaptive voltage huddle-based distributed-register architecture (AVHDR architecture) that integrates dynamic multiple supply voltages and interconnection delays into high-level synthesis. Next, we propose a high-level synthesis algorithm for AVHDR architectures. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, huddles, each of which abstracts modules placed close to each other, are naturally generated using floorplanning. Low-supply voltages are assigned to non-critical operations, and leakage power is cut off by turning off the power supply to the sleeping functional units. Experimental results show that our algorithm achieves 50% energy-saving compared with conventional algorithms.

DOI

Scopus
Secure scan design with dynamically configurable connection

Yuta Atobe, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

Proceedings of IEEE Pacific Rim International Symposium on Dependable Computing, PRDC 256 - 262 2013 [Refereed]

Scan test is a powerful test technique which can control and observe the internal states of the circuit under test through scan chains. However, it has been reported that it is possible to retrieve secret keys from cryptographic LSIs through scan chains. Therefore, new secure test methods are required to satisfy both testability and security requirements. In this paper, a secure scan design is proposed that meets adequate security requirements as a countermeasure against scan-based attacks while still maintaining high testability like normal scan testing. In our method, the internal scan chain is divided into several sub-chains, and the connection order of the sub-chains can be dynamically changed. In addition, this paper proposes how to decide the connection order of those sub-chains so that it cannot be identified by an attacker. The proposed method is implemented on an AES circuit to show its effectiveness, and a security analysis is also given to show how the proposed approach can be used as a countermeasure against known scan-based attacks.

DOI

Scopus

Citations (Scopus): 35
Suspicious Timing Error Prediction with In-Cycle Clock Gating

Youhua Shi, Hiroaki Igarashi, Nozomu Togawa, Masao Yanagisawa

PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED 2013) 335 - 340 2013 [Refereed]

Conventionally, circuits are designed with a pessimistic timing margin to solve delay variation problems, which guarantees "always correct" operations. However, because such a worst-case condition rarely occurs, the traditional pessimistic design method is becoming one of the main obstacles for designers to achieve higher performance and/or ultra-low power consumption. By monitoring timing error occurrence during circuit operation, adaptive timing error detection and recovery methods have recently gained wide interest as a promising solution. As an extension of existing research, in this paper, we propose a suspicious timing error prediction method for performance or energy efficiency improvement in pipeline designs. Experimental results show that, compared with typical margin designs, the proposed method can 1) achieve up to 1.41X throughput improvement with in-situ timing error prediction ability; and 2) allow the design to be overclocked by up to 1.88X with "always correct" outputs.

DOI

Scopus

Citations (Scopus): 17
A partial redundant fault-secure high-level synthesis algorithm for RDR architectures

Kazushi Kawamura, Sho Tanaka, Masao Yanagisawa, Nozomu Togawa

Proceedings - IEEE International Symposium on Circuits and Systems 1736 - 1739 2013 [Refereed]

In this paper, we propose a partial redundant fault-secure high-level synthesis algorithm for RDR architectures, where we duplicate a part of the original CDFG and maximize its reliability under a timing constraint. First, our algorithm allocates additional functional units to vacant spaces on RDR islands for recomputation and increases the number of duplicated operation nodes. Second, it minimizes the number of inserted comparator nodes through re-scheduling/re-binding the recomputation CDFG's nodes. As a result, we obtain a scheduled/bound recomputation CDFG and a renewed functional unit allocation with high reliability. Experimental results demonstrate that our algorithm improves reliability by up to 52% compared with the conventional approach.

DOI

Scopus
Concurrent Faulty Clock Detection for Crypto Circuits against Clock Glitch based DFA

Hiroaki Igarashi, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

2013 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) 1432 - 1435 2013 [Refereed]

In this paper, a concurrent faulty clock detection method is proposed for crypto circuits against clock glitch based differential fault analysis (DFA). In the proposed method, a non-logic buffer-based delay chain is inserted, and then by monitoring the delay along the delay chain, a possible clock glitch based DFA can be detected. Experimental results on an AES circuit show that the proposed method can successfully detect clock glitch based attacks, and the required area overhead is only 0.47%, which is much smaller than in previous works.

DOI

Scopus

Citations (Scopus): 19
Energy Evaluation for Two-level On-chip Cache with Non-Volatile Memory on Mobile Processors

Shota Matsuno, Masashi Tawada, Masao Yanagisawa, Shinji Kimura, Nozomu Togawa, Tadahiko Sugibayashi

2013 IEEE 10TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) 1 - 4 2013 [Refereed]

As the leakage power of traditional SRAM becomes larger, the ratio of static energy to the total energy of the memory architecture also becomes larger. Non-volatile memory (NVM) has many advantages over SRAM, such as high density, low leakage power, and non-volatility, but consumes too much write energy. In this paper, we evaluate the energy consumption of a two-level cache that partly uses NVM on mobile processors and confirm that it effectively reduces energy consumption.

DOI

Scopus

Citations (Scopus): 1
Scan-based Attack against Trivium Stream Cipher Independent of Scan Structure

Mika Fujishiro, Masao Yanagisawa, Nozomu Togawa

2013 IEEE 10TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) 1 - 4 2013 [Refereed]

Trivium is a synchronous stream cipher that uses three shift registers and runs at high speed with a simple structure. A scan-based side-channel attack retrieves secret information using scan chains, a design-for-test technique. In this paper, a scan-based side-channel attack method against Trivium using scan signatures is proposed. In our method, we focus on a particular 1-bit position in a collection of scan chains, so that we can attack Trivium even if the scan chain includes registers other than the internal state registers of Trivium. Experimental results show that our proposed method successfully retrieves a plaintext from a ciphertext.

DOI

Scopus

Citations (Scopus): 6
Scan-based attack against DES and Triple DES cryptosystems using scan signatures

Hirokazu Kodera, Masao Yanagisawa, Nozomu Togawa

Journal of Information Processing 21 ( 3 ) 572 - 579 2013 [Refereed]

A scan-path test is one of the useful design-for-test techniques, in which testers can observe and control registers inside the target LSI chip directly. On the other hand, the risk of side-channel attacks against cryptographic LSIs and modules has been pointed out. In particular, scan-based attacks which retrieve secret keys by analyzing scan data obtained from scan chains have been attracting attention. In this paper, we propose two scan-based attack methods against DES and Triple DES using scan signatures. Our proposed methods are based on focusing on particular bit-column-data in a set of scan data and observing their changes when giving several plaintexts. Based on this property, we introduce the idea of a scan signature first and apply it to DES cryptosystems. In DES cryptosystems, we can retrieve secret keys by partitioning the S-BOX process into eight independent sub-processes and reducing the number of the round key candidates from 2⁴⁸ to 2⁶ × 8 = 512. In Triple DES cryptosystems, three secret keys are used to encrypt plaintexts. We retrieve them one by one, using a technique similar to that used for DES cryptosystems. Although some problems occur when retrieving the second/third secret key, our proposed method effectively resolves them. Our proposed methods can retrieve secret keys even if a scan chain includes registers outside the crypto module and attackers do not know when the encryption is actually performed in the crypto module. Experimental results demonstrate that we successfully retrieve the secret keys of a DES cryptosystem using at most 32 plaintexts and those of a Triple DES cryptosystem using at most 36 plaintexts.

DOI

Scopus

2

Citations

(Scopus)
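
The candidate count quoted in the DES abstract above can be checked by hand. The worked equation below is only an illustration and assumes the standard DES round structure, in which the 48-bit round key is split into eight 6-bit pieces, one per S-box; it is not taken from the paper itself.

```latex
\underbrace{2^{48}}_{\text{joint search over the 48-bit round key}}
\;\longrightarrow\;
\underbrace{8 \times 2^{6}}_{\text{eight independent 6-bit searches}}
= 8 \times 64 = 512
```

Because each S-box can be examined independently, the searches add rather than multiply, which is why the candidate count drops from 2^48 to 512.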
Energy-efficient High-level Synthesis for HDR Architectures with Clock Gating Based on Concurrency-oriented Scheduling.

Hiroyuki Akasaka, Shin-ya Abe, Masao Yanagisawa, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 6 101 - 111 2013 [Peer-reviewed]

With the miniaturization of LSIs and their increasing performance, demand for highly functional portable devices has grown significantly. At the same time, battery lifetime and device overheating have become major design problems that hamper further LSI integration. On the other hand, the ratio of interconnection delay to gate delay has continued to increase as device feature size decreases. We have to estimate interconnection delays and reduce energy consumption even in the high-level synthesis stage. In this paper, we propose a high-level synthesis algorithm for huddle-based distributed-register architectures (HDR architectures) with clock gating, based on concurrency-oriented scheduling and functional unit binding. We assume coarse-grained clock gating applied to huddles and focus on the number of control steps, or gating steps, at which clock gating can be applied to the registers in every huddle. We propose two methods to increase gating steps. The first schedules and binds operations so that they are performed at the same time. By adjusting the clock gating timings in the high-level synthesis stage, we expect to enhance the effect of clock gating more than by applying it after logic synthesis. The second synthesizes huddles such that each synthesized huddle includes registers with similar or identical clock gating timings. At this time, we determine the clock gating timings so as to minimize the total energy consumption, including clock tree energy. The experimental results show that our proposed algorithm reduces energy consumption by a maximum of 23.8% compared with several conventional algorithms.

DOI CiNii
A thermal-aware high-level synthesis algorithm for RDR architectures through binding and allocation

Kazushi Kawamura, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E96-A ( 1 ) 312 - 321 2013 [Peer-reviewed]

With process technology scaling, heat in ICs is becoming a serious issue. Since high temperature adversely affects reliability, design costs, and leakage power, it is necessary to incorporate thermal-aware synthesis into IC design flows. In particular, hot spots, where a chip is locally overheated, are a serious concern, and reducing the peak temperature inside a chip is very important. On the other hand, the increase in average interconnect delays is also becoming a serious issue. By using RDR architectures (Regular-Distributed-Register architectures), the interconnect delays can be easily estimated and their influence can be greatly reduced even in high-level synthesis. In this paper, we propose a thermal-aware high-level synthesis algorithm for RDR architectures. The RDR architecture divides the entire chip into islands of uniform area. Our algorithm balances the energy consumption among islands through re-binding to functional units. By allocating additional functional units to vacant areas on islands, our algorithm further balances the energy consumption among islands and thus reduces the peak temperature. Experimental results demonstrate that our algorithm reduces the peak temperature by up to 9.1% compared with the conventional approach. Copyright © 2013 The Institute of Electronics, Information and Communication Engineers.

DOI

Scopus

1

Citations

(Scopus)
Scan-Based Attack on AES through Round Registers and Its Countermeasure

Youhua Shi, Nozomu Togawa, Masao Yanagisawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E95-A ( 12 ) 2338 - 2346 December 2012 [Peer-reviewed]

Scan-based side-channel attacks on hardware implementations of cryptographic algorithms have been shown to pose a serious security threat. Unlike existing scan-based attacks, we observe that not only the secret-related registers but also some non-secret registers can be misused to help an attacker retrieve secret keys. In this paper, we first present a scan-based side-channel attack method on AES that makes use of the round counter registers, which have received no attention in previous work, to show the potential security threat in designs with scan chains. We then discuss the requirements for secure DFT and propose a secure scan scheme for crypto hardware implementations that preserves all the advantages and simplicity of traditional scan test while significantly improving security with negligible design overhead.

DOI

Scopus

1

Citations

(Scopus)
A Locality-Aware Hybrid NoC Configuration Algorithm Utilizing the Communication Volume among IP Cores

Seungju Lee, Masao Yanagisawa, Nozomu Togawa

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E95-A ( 9 ) 1538 - 1549 September 2012 [Peer-reviewed]

Network-on-chip (NoC) architectures have emerged as a promising solution to the lack of scalability in multi-processor systems-on-chips (MPSoCs). With the explosive growth in the usage of multimedia applications, NoCs are expected to serve as multimedia servers supporting multi-class services. In this paper, we propose a configuration algorithm for a hybrid bus-NoC architecture together with simulation results. Our target architecture, called busmesh NoC (BMNoC), is a generalized version of a hybrid NoC with local buses. In our BMNoC configuration algorithm, cores with a heavy communication volume between them are mapped to a cluster node (CN) and connected by a local bus. CNs communicate with each other via edge switches (ESes) and mesh routers (MRs). With this hierarchical communication network, our proposed algorithm improves latency compared with conventional methods. Results for several realistic applications show better performance than earlier studies and demonstrate the feasibility of our proposed algorithm.

DOI

Scopus

1

Citations

(Scopus)
Energy-efficient High-level Synthesis for HDR Architectures

Abe Shin-ya, Yanagisawa Masao, Togawa Nozomu

Information and Media Technologies 7 ( 4 ) 1319 - 1330 2012

As battery runtime and overheating problems in portable devices can no longer be ignored, energy-aware LSI design is strongly required. Moreover, interconnection delay should be explicitly considered because it exceeds gate delay as semiconductor devices are downsized. We must take account of energy efficiency and interconnection delays even in high-level synthesis. In this paper, we first propose a huddle-based distributed-register architecture (HDR architecture), an island-based distributed-register architecture for multi-cycle interconnect communications in which we can develop several energy-saving techniques. Next, we propose an energy-efficient high-level synthesis algorithm for HDR architectures focusing on multiple supply voltages. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, huddles, each of which is composed of functional units, registers, a controller, and level converters, are generated naturally from the floorplanning results. By assigning a high supply voltage to critical huddles and a low supply voltage to non-critical huddles, we finally obtain energy-efficient floorplan-aware high-level synthesis. Experimental results show that our algorithm achieves 45% energy saving compared with the conventional distributed-register architectures and conventional algorithms.

DOI CiNii
Dynamically Changeable Secure Scan Architecture against Scan-Based Side Channel Attack

Yuta Atobe, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

2012 International SoC Design Conference (ISOCC) 155 - 158 2012 [Peer-reviewed]

Scan test, one of the useful design-for-testability techniques, is effective for LSIs that include cryptographic circuits. It can observe and control the internal states of the circuit under test by using a scan chain. However, a scan chain presents a significant risk of information leakage through scan-based attacks, which retrieve the secret keys of cryptographic LSIs. In this paper, we propose a secure scan architecture that resists scan-based attacks while retaining high testability. In our method, scan data is dynamically changed by adding a latch to any FFs in the scan chain. We show that, with the proposed method, neither the secret key nor the testability of an RSA circuit implementation is compromised, which demonstrates the effectiveness of the method.

DOI

Scopus

42

Citations

(Scopus)
Energy-efficient High-level Synthesis for HDR Architectures with Clock Gating

Hiroyuki Akasaka, Masao Yanagisawa, Nozomu Togawa

2012 International SoC Design Conference (ISOCC) 135 - 138 2012 [Peer-reviewed]

With the miniaturization of LSIs and their increasing performance, demand for highly functional portable devices has grown significantly. At the same time, problems with battery runtime and device overheating have arisen. On the other hand, the ratio of interconnection delay to gate delay has continued to increase as device feature size decreases. We have to estimate the interconnection delay and reduce energy consumption even in the high-level synthesis stage. Recently, an HDR architecture and its associated power-optimized high-level synthesis algorithm have been proposed, which can effectively estimate interconnection delays by introducing the idea of "huddles" into an LSI chip. The algorithm utilizes multiple supply voltages and achieves power-optimized LSI synthesis but does not take clock gating into account. In this paper, we propose a high-level synthesis algorithm based on HDR architectures that utilizes clock gating. First, we focus on the number of control steps at which clock gating can be applied to registers. Second, we synthesize the huddles such that each synthesized huddle includes registers with similar or exactly the same clock gating timings. The experimental results show that our proposed algorithm reduces energy consumption by a maximum of 14.9% compared with the conventional algorithm.

DOI

Scopus

2

Citations

(Scopus)
A Novel BMNoC Configuration Algorithm Utilizing Communication Volume and Locality among Cores

Seungju Lee, Nozomu Togawa, Takashi Aoki, Akira Onozawa

2012 IEEE International Symposium on Circuits and Systems (ISCAS 2012) 1668 - 1671 2012 [Peer-reviewed]

Network-on-chip (NoC) architectures have emerged as a promising solution to the lack of scalability in multi-processor systems-on-chips (MPSoCs). In this paper, we propose a novel BMNoC configuration algorithm together with simulation results. Our BMNoC configuration algorithm analyzes the data traffic of the target application and determines, based on communication volume and locality, which cores should be placed in each cluster. Furthermore, the simulation results show better latency than earlier studies and demonstrate the feasibility of BMNoC.

DOI

Scopus
An Energy-efficient High-level Synthesis Algorithm for Huddle-based Distributed-Register Architectures

Shin-ya Abe, Masao Yanagisawa, Nozomu Togawa

2012 IEEE International Symposium on Circuits and Systems (ISCAS 2012) 576 - 579 2012 [Peer-reviewed]

In this paper, we first propose a huddle-based distributed-register architecture (HDR architecture), an island-based distributed-register architecture for multi-cycle interconnect communications where we can develop several energy-saving techniques. Next, we propose an energy-efficient high-level synthesis algorithm for HDR architectures focusing on multiple supply voltages. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, huddles, each of which is composed of functional units, registers, controller, and level converters, are very naturally generated using floorplanning results. By assigning high supply voltage to critical huddles and low supply voltage to non-critical huddles, we can finally have energy-efficient floorplan-aware high-level synthesis. Experimental results show that our algorithm achieves 45% energy-saving compared with the conventional distributed-register architectures and conventional algorithms.

DOI

Scopus

8

Citations

(Scopus)
State Dependent Scan Flip-Flop with Key-Based Configuration against Scan-Based Side Channel Attack on RSA Circuit

Yuta Atobe, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

2012 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) 607 - 610 2012 [Peer-reviewed]

Scan test is one of the useful design-for-testability techniques and can detect circuit failures efficiently. However, it has been reported that secret keys can be retrieved from cryptographic LSIs through scan chains. Testability and security therefore conflict with each other, and an efficient design-for-testability circuit is needed that satisfies both testability and security requirements. In this paper, we propose a secure scan architecture against scan-based attacks that achieves high security without compromising testability. In our method, the scan structure is dynamically changed by adding a latch to any FFs in the scan chain. We analyze an RSA circuit implementation to show the effectiveness of the proposed method and discuss how our approach resists scan-based attacks.

DOI

Scopus

19

Citations

(Scopus)
Weighted Adders with Selector Logics for Super-resolution and Its FPGA-based Evaluation

Hiromine Yoshihara, Masao Yanagisawa, Nozomu Togawa

2012 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) 603 - 606 2012 [Peer-reviewed]

Super-resolution is a technique that removes noise from observed images and restores their high frequencies. We focus on reconstruction-based super-resolution. Reconstruction has a large computation cost since it requires many images. In this paper, we propose a fast weighted adder for reconstruction-based super-resolution. From the viewpoint of reducing partial products, we propose two approaches to speed up a weighted adder. First, we use selector logics to halve its partial products. Second, we propose a weight-range limiting method that utilizes a negative term. By applying our proposed approaches to a weighted adder, we can reduce carry propagations and obtain a faster circuit than conventional ones. Experimental evaluations demonstrate that our weighted adder improves performance by a maximum of 29.9% and saves a maximum of 592 LUTs compared to conventional implementations.

DOI

Scopus

1

Citations

(Scopus)
Scan-based Attack against DES Cryptosystems Using Scan Signatures

Hirokazu Kodera, Masao Yanagisawa, Nozomu Togawa

2012 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) 599 - 602 2012 [Peer-reviewed]

With the high integration of LSIs in recent years, the importance of design-for-test techniques has been increasing. A scan-path test is one of the useful design-for-test techniques, in which testers can directly observe and control registers inside the target LSI chip. On the other hand, the risk of side-channel attacks against cryptographic LSIs and modules has been pointed out. In particular, scan-based attacks, which retrieve secret keys by analyzing scan data obtained from scan chains, have been attracting attention. In this paper, we propose a scan-based attack method against DES using scan signatures. Our proposed method focuses on particular bit-column data in a set of scan data and observes how they change when several plaintexts are given. We can retrieve secret keys by partitioning the S-BOX process into eight independent sub-processes and reducing the number of round key candidates from 2^48 to 2^6 × 8 = 512. Our proposed method can retrieve secret keys even if a scan chain includes registers other than those of the crypto module and attackers do not know when the encryption is actually performed in the crypto module. Experimental results demonstrate that we successfully retrieve the secret keys of a DES cryptosystem using at most 32 plaintexts.

DOI

Scopus

32

Citations

(Scopus)
A Hybrid NoC Architecture Utilizing Packet Transmission Priority Control Method

Seungju Lee, Nozomu Togawa, Yusuke Sekihara, Takashi Aoki, Akira Onozawa

2012 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) 404 - 407 2012 [Peer-reviewed]

Network-on-chip (NoC) architectures have emerged as a promising solution to the lack of scalability in multi-processor systems-on-chips (MPSoCs). With the explosive growth in the usage of multimedia applications, NoCs are expected to serve as multimedia servers supporting multi-class services. Recently, a busmesh NoC (BMNoC) has been proposed. The BMNoC architecture, which analyzes the data traffic and takes the locality between cores into account, improves system performance in terms of latency compared with conventional NoCs. In this paper, we propose a novel BMNoC that utilizes packet transmission priority control methods. Our proposed BMNoC is a generalized and simplified version of a hybrid NoC composed of local buses and global mesh routers. Results for several realistic applications show better performance than previous studies and demonstrate the feasibility of our proposed architecture.

DOI

Scopus

4

Citations

(Scopus)
Robust Secure Scan Design Against Scan-Based Differential Cryptanalysis

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEEE Transactions on Very Large Scale Integration (VLSI) Systems 20 ( 1 ) 176 - 181 January 2012 [Peer-reviewed]

Scan technology carries the potential risk of being misused as a "side channel" to leak the secrets of crypto cores. The existing scan-based attacks can be viewed as a kind of differential cryptanalysis, which takes advantage of scan chains to observe the bit changes between pairs of chosen plaintexts so as to identify the secret keys. To address this design/test challenge, this paper proposes a robust secure scan structure design for crypto cores as a countermeasure against scan-based attacks, maintaining high security without compromising testability.

DOI

Scopus

24

Citations

(Scopus)
Energy-efficient high-level synthesis for HDR architectures

Shin-Ya Abe, Masao Yanagisawa, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 5 106 - 117 2012 [Peer-reviewed]

As battery runtime and overheating problems in portable devices can no longer be ignored, energy-aware LSI design is strongly required. Moreover, interconnection delay should be explicitly considered because it exceeds gate delay as semiconductor devices are downsized. We must take account of energy efficiency and interconnection delays even in high-level synthesis. In this paper, we first propose a huddle-based distributed-register architecture (HDR architecture), an island-based distributed-register architecture for multi-cycle interconnect communications in which we can develop several energy-saving techniques. Next, we propose an energy-efficient high-level synthesis algorithm for HDR architectures focusing on multiple supply voltages. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, huddles, each of which is composed of functional units, registers, a controller, and level converters, are generated naturally from the floorplanning results. By assigning a high supply voltage to critical huddles and a low supply voltage to non-critical huddles, we finally obtain energy-efficient floorplan-aware high-level synthesis. Experimental results show that our algorithm achieves 45% energy saving compared with the conventional distributed-register architectures and conventional algorithms. © 2012 Information Processing Society of Japan.

DOI

Scopus

13

Citations

(Scopus)
A fast weighted adder by reducing partial product for reconstruction in super-resolution

Hiromine Yoshihara, Masao Yanagisawa, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 5 96 - 105 2012 [Peer-reviewed]

In recent years, it has become necessary to convert conventional low-resolution images to high-resolution ones at low cost. Super-resolution is a technique that removes noise from observed images and restores their high frequencies. We focus on reconstruction-based super-resolution. Reconstruction has a large computation cost since it requires many images. In this paper, we propose a fast weighted adder for reconstruction-based super-resolution. From the viewpoint of reducing partial products, we propose two approaches to speed up a weighted adder. First, we use selector logics to halve its partial products. Second, we propose a weight-range limiting method that utilizes a negative term. By applying our proposed approaches to a weighted adder, we can reduce carry propagations and obtain a faster circuit than conventional ones. Experimental evaluations demonstrate that our weighted adder reduces its delay time by a maximum of 25.29% and its area to a maximum of 1/3, compared to conventional implementations. © 2012 Information Processing Society of Japan.

DOI

Scopus
MH4 : multiple-supply-voltages aware high-level synthesis for high-integrated and high-frequency circuits for HDR architectures

Shin-ya Abe, Youhua Shi, Masao Yanagisawa, Nozomu Togawa

IEICE Electronics Express 9 ( 17 ) 1414 - 1422 2012 [Peer-reviewed]

In this paper, we propose a multiple-supply-voltage-aware high-level synthesis algorithm for HDR architectures that realizes high-speed and highly efficient circuits. We propose three new techniques, namely virtual area estimation, virtual area adaptation, and floorplanning-directed huddling, and integrate them into our HDR architecture synthesis algorithm. Virtual area estimation/adaptation effectively estimates a huddle area by gradually reducing it during iterations, which improves the convergence of our algorithm. Floorplanning-directed huddling determines huddle composition very effectively by performing floorplanning and functional unit assignment inside huddles simultaneously. Experimental results show that our algorithm achieves about 29% run-time saving compared with the conventional algorithms, and that, even under a very tight clock constraint, it obtains a solution that our original algorithm cannot obtain.

DOI

Scopus

14

Citations

(Scopus)
Greedy Algorithm for the On-Chip Decoupling Capacitance Optimization to Satisfy the Voltage Drop Constraint

Mikiko Sode Tanaka, Nozomu Togawa, Masao Yanagisawa, Satoshi Goto

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E94-A ( 12 ) 2482 - 2489 December 2011 [Peer-reviewed]

With the progress of process technology in recent years, low-voltage power supplies have become predominant. As a result, the voltage margin has decreased, and on-chip decoupling capacitance optimization that satisfies the voltage drop constraint has become more important. In addition, reducing the on-chip decoupling capacitance area reduces the chip area and therefore manufacturing costs. Hence, we propose an algorithm that satisfies the voltage drop constraint and at the same time minimizes the total on-chip decoupling capacitance area. The proposed algorithm uses the idea of a network algorithm, in which the path that has the most influence on the voltage drop is found. The voltage drop is improved by adding on-chip capacitance to a node on that path. The proposed algorithm is efficient and adds on-chip capacitance where it has the greatest influence on the voltage drop. Experimental results demonstrate that, with the proposed algorithm, a real-size power/ground network can be optimized in just a few minutes, which is quite practical. Compared with the conventional algorithm, we confirmed that the total on-chip decoupling capacitance area of the power/ground network can be reduced by about 40 to 50%.

DOI

Scopus
Speeding-up exact and fast FIFO-based cache configuration simulation

Masashi Tawada, Masao Yanagisawa, Nozomu Togawa

IEICE Electronics Express 8 ( 14 ) 1161 - 1167 July 2011 [Peer-reviewed]

The number of sets, block size, and associativity determine a processor's cache configuration. Particularly in embedded systems, the cache configuration can be optimized since the target applications are very limited. Recently, the CRCB method has been proposed for LRU-based (Least Recently Used-based) cache configuration simulation, which can calculate cache hit/miss counts accurately and very quickly while the three parameters are changed. However, many recent processors use FIFO-based (First-In-First-Out-based) caches instead of LRU-based caches because of their lower hardware cost. In this paper, we propose a method for speeding up cache configuration simulation for embedded applications that use FIFO as the cache replacement policy. We first prove several properties of FIFO-based caches and then propose a simulation method that can process two or more FIFO-based cache configurations with different cache associativities simultaneously. Experimental results show that our proposed method obtains accurate cache hit/miss counts and runs up to 32% faster than the conventional simulators.

DOI

Scopus

3

Citations

(Scopus)
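
As a point of reference for the FIFO-based cache configuration simulation described in the entry above, the following is a minimal baseline sketch of what such a simulator has to count. It is not the method proposed in the paper: it simply replays the trace once per associativity, and the function names `simulate_fifo_cache` and `sweep_associativity` as well as the byte-address trace format are assumptions made for illustration.

```python
from collections import deque


def simulate_fifo_cache(addresses, num_sets, block_size, associativity):
    """Count (hits, misses) for a set-associative cache with FIFO replacement.

    Hypothetical helper for illustration; addresses are byte addresses.
    """
    # One queue of block tags per set; the leftmost tag is the oldest entry.
    sets = [deque() for _ in range(num_sets)]
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % num_sets
        tag = block // num_sets
        ways = sets[index]
        if tag in ways:
            # Unlike LRU, a hit does not change the eviction order under FIFO.
            hits += 1
        else:
            misses += 1
            if len(ways) == associativity:
                ways.popleft()  # evict the block that entered the set first
            ways.append(tag)
    return hits, misses


def sweep_associativity(addresses, num_sets, block_size, associativities):
    """Naive sweep: replay the whole trace once per associativity."""
    return {
        ways: simulate_fifo_cache(addresses, num_sets, block_size, ways)
        for ways in associativities
    }


if __name__ == "__main__":
    trace = [0, 16, 0, 32, 0]  # blocks 0, 1, 0, 2, 0 with 16-byte blocks
    print(sweep_associativity(trace, num_sets=1, block_size=16,
                              associativities=[1, 2, 4]))
```

The naive sweep costs one full pass over the trace per configuration. The abstract's contribution is precisely to avoid that by handling several associativities in a single pass, which this sketch does not attempt.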
Greedy Optimization Algorithm for the Power/Ground Network Design to Satisfy the Voltage Drop Constraint

Mikiko Sode Tanaka, Nozomu Togawa, Masao Yanagisawa, Satoshi Goto

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E94-A ( 4 ) 1082 - 1090 April 2011 [Peer-reviewed]

With the progress of process technology in recent years, low-voltage power supplies have become predominant. As a result, the voltage margin has decreased, and power/ground design that satisfies the voltage drop constraint has become more important. In addition, reducing the total power/ground wiring area and the number of layers reduces manufacturing and design costs. We therefore propose an algorithm that satisfies the voltage drop constraint and at the same time minimizes the total power/ground wiring area. The proposed algorithm uses the idea of a network algorithm [1], in which the edge that has the most influence on the voltage drop is found. The voltage drop is improved by changing the resistance of that edge. The proposed algorithm is efficient and effectively updates the edge with the greatest influence on the voltage drop. Experimental results confirm that, compared with the conventional algorithm, the total power/ground wiring area can be reduced by about 1/3. The experimental data also show that the proposed algorithm satisfies the voltage drop constraint on data for which the conventional algorithm cannot.

DOI

Scopus

3

Citations

(Scopus)
A Fast Selector-Based Subtract-Multiplication Unit and Its Application to Butterfly Unit

Tsukamoto Youhei, Yanagisawa Masao, Ohtsuki Tatsuo, Togawa Nozomu

Information and Media Technologies 6 ( 2 ) 276 - 285 2011

Large-scale network and multimedia application LSIs include application-specific arithmetic units. A multiply-accumulator (MAC) unit, one of these optimized units, arranges partial products and decreases carry propagations. However, there is no method similar to MAC for executing "subtract-multiplication". In this paper, we propose a high-speed subtract-multiplication unit that decreases the latency of a subtract operation by bit-level transformation using selector logics. By using bit-level transformation, its partial products are calculated directly. The proposed subtract-multiplication units can be applied to any type of system that uses subtract-multiplications, and a butterfly operation in FFT is one of their suitable applications. We apply them effectively to Radix-2 butterfly units and Radix-4 butterfly units. Experimental results show that our proposed operation units using selector logics improve performance by up to 13.92% compared to a conventional approach.

DOI CiNii
Exact, Fast and Flexible L1 Cache Configuration Simulation for Embedded Systems

Tawada Masashi, Yanagisawa Masao, Ohtsuki Tatsuo, Togawa Nozomu

Information and Media Technologies 6 ( 4 ) 1076 - 1091 2011

Since the target applications running on an embedded processor are very limited in embedded systems, we can optimize its cache configuration based on the number of sets, block size, and associativity. An extremely fast cache configuration simulation method, CRCB (Configuration Reduction approach by the Cache Behavior), has recently been proposed, which can calculate cache hit/miss counts accurately for possible cache configurations when the three parameters above are changed. The CRCB method assumes an LRU-based (Least Recently Used-based) cache, but many recent processors use a FIFO-based (First In First Out-based) cache or a PLRU-based (Pseudo LRU-based) cache because of its lower hardware cost. In this paper, we propose exact and fast L1 cache configuration simulation algorithms for embedded applications that use PLRU or FIFO as the cache replacement policy. First, we prove that the CRCB method can be applied not only to LRU but also to other cache replacement policies including FIFO and PLRU. Second, we prove several properties of FIFO- and PLRU-based caches and propose associated cache simulation algorithms that can accurately simulate more than one cache configuration with different cache associativities simultaneously for FIFO or PLRU. Finally, experimental results demonstrate that our cache configuration simulation algorithms obtain accurate cache hit/miss counts and run up to 249 times faster than a conventional cache simulator.

DOI CiNii
Exact and Fast L1 Cache Configuration Simulation for Embedded Systems with FIFO/PLRU Cache Replacement Policies

Masashi Tawada, Masao Yanagisawa, Tatsuo Ohtsuki, Nozomu Togawa

2011 International Symposium on VLSI Design, Automation and Test (VLSI-DAT) 247 - 250 2011 [Peer-reviewed]

Since target applications in embedded systems are limited, we can optimize their cache configuration. A very fast and exact cache simulation algorithm, CRCB, has recently been proposed. CRCB assumes LRU as the cache replacement policy, but FIFO- or PLRU-based caches are often used because of their low hardware cost. This paper proposes exact and fast L1 cache simulation algorithms for PLRU- or FIFO-based caches. First, we prove that CRCB can be applied to FIFO and PLRU. Next, we show several properties of FIFO- and PLRU-based caches and propose their associated cache-simulation speed-up algorithms. Experiments demonstrate that our algorithms run up to 300 times faster than a well-known cache simulator.

Exact, fast and flexible L1 cache configuration simulation for embedded systems

Masashi Tawada, Masao Yanagisawa, Tatsuo Ohtsuki, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 4 166 - 181 2011 [Peer-reviewed]

Since the target applications running on an embedded processor are very limited in embedded systems, we can optimize its cache configuration based on the number of sets, block size, and associativity. An extremely fast cache configuration simulation method, CRCB (Configuration Reduction approach by the Cache Behavior), has recently been proposed, which can calculate cache hit/miss counts accurately for possible cache configurations when the three parameters above are changed. The CRCB method assumes an LRU-based (Least Recently Used-based) cache, but many recent processors use a FIFO-based (First In First Out-based) cache or a PLRU-based (Pseudo LRU-based) cache because of its lower hardware cost. In this paper, we propose exact and fast L1 cache configuration simulation algorithms for embedded applications that use PLRU or FIFO as the cache replacement policy. First, we prove that the CRCB method can be applied not only to LRU but also to other cache replacement policies including FIFO and PLRU. Second, we prove several properties of FIFO- and PLRU-based caches and propose associated cache simulation algorithms that can accurately simulate more than one cache configuration with different cache associativities simultaneously for FIFO or PLRU. Finally, experimental results demonstrate that our cache configuration simulation algorithms obtain accurate cache hit/miss counts and run up to 249 times faster than a conventional cache simulator. © 2011 Information Processing Society of Japan.

DOI

Scopus
A fault-secure high-level synthesis algorithm for RDR architectures

Sho Tanaka, Masao Yanagisawa, Tatsuo Ohtsuki, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 4 150 - 165 2011 [Peer-reviewed]

As device feature size decreases, the reliability improvement against soft errors becomes quite necessary. A fault-secure system, in which concurrent error detection is realized, is one of the solutions to this problem. On the other hand, average interconnection delays exceed gate delays which leads to a serious timing closure problem. By using regular-distributed-register architecture (RDR architecture), we can estimate interconnection delays very accurately and their influence can be much reduced even in behavioral-level design. In this paper, we propose a fault-secure high-level synthesis algorithm for an RDR architecture. In fault-secure high-level synthesis, a recomputation CDFG as well as a normal-computation CDFG must be scheduled to control steps and bound to functional units. Firstly, our algorithm re-uses vacant areas on RDR islands to allocate new function units additionally for the recomputation CDFG. Secondly, we propose an efficient edge-break algorithm which considers comparison nodes' scheduling/binding. We can have small-latency scheduling/binding for both the normal CDFG and recomputation CDFG. Our algorithm reduces the required control steps by up to 53% compared with the conventional approach. © 2011 Information Processing Society of Japan.

DOI

Scopus

Cited by 12 (Scopus)
A fast selector-based subtract-multiplication unit and its application to butterfly unit

Youhei Tsukamoto, Masao Yanagisawa, Tatsuo Ohtsuki, Nozomu Togawa

IPSJ Transactions on System LSI Design Methodology 4 60 - 69 2011 [Peer-reviewed]

Large-scale network and multimedia application LSIs include application-specific arithmetic units. A multiply-accumulator unit (MAC unit), which is one of these optimized units, arranges partial products and decreases carry propagations. However, there is no method similar to MAC to execute "subtract-multiplication". In this paper, we propose a high-speed subtract-multiplication unit that decreases the latency of a subtract operation by bit-level transformation using selector logics. By using bit-level transformation, its partial products are calculated directly. The proposed subtract-multiplication units can be applied to any type of system using subtract-multiplications, and a butterfly operation in FFT is one of their suitable applications. We apply them effectively to Radix-2 butterfly units and Radix-4 butterfly units. Experimental results show that our proposed operation units using selector logics improve the performance by up to 13.92% compared to a conventional approach. © 2011 Information Processing Society of Japan.
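As a rough illustration of where a subtract-multiplication arises, the sketch below writes a radix-2 decimation-in-frequency butterfly as one addition and one `(a - b) * w` step, then uses it in a small recursive FFT. It only shows the arithmetic; it does not model the bit-level selector-logic transformation proposed in the paper. The function names and the power-of-two input length are assumptions.

```python
import cmath


def radix2_dif_butterfly(a, b, w):
    """Radix-2 DIF butterfly: a sum and a subtract-multiplication."""
    return a + b, (a - b) * w


def fft_dif(x):
    """Recursive radix-2 DIF FFT; len(x) is assumed to be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    sums, diffs = [], []
    for k in range(half):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        s, d = radix2_dif_butterfly(x[k], x[k + half], w)
        sums.append(s)
        diffs.append(d)
    out = [0] * n
    out[0::2] = fft_dif(sums)   # even-indexed outputs
    out[1::2] = fft_dif(diffs)  # odd-indexed outputs
    return out
```

Every butterfly in this structure performs exactly one subtract-multiplication, which is why shortening that operation affects the whole transform.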

DOI

Scopus

Cited by 1 (Scopus)
Scan vulnerability in elliptic curve cryptosystems

Ryuta Nara, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IPSJ Transactions on System LSI Design Methodology 4 47 - 59 2011 [Peer-reviewed]

A scan-path test is one of the most important testing techniques, but it can be used as a side-channel attack against a cryptography circuit. Scan-based attacks are techniques to decipher a secret key using scanned data obtained from a cryptography circuit. Public-key cryptography, such as RSA and elliptic curve cryptosystem (ECC), is extensively used but conventional scan-based attacks cannot be applied to it, because it has a complicated algorithm as well as a complicated architecture. This paper proposes a scan-based attack which enables us to decipher a secret key in ECC. The proposed method is based on detecting intermediate values calculated in ECC. We focus on a 1-bit sequence which is specific to some intermediate values. By monitoring the 1-bit sequence in the scan path, we can find out the register position specific to the intermediate value in it and we can know whether this intermediate value is calculated or not in the target ECC circuit. By using several intermediate values, we can decipher a secret key. The experimental results demonstrate that a secret key in a practical ECC circuit can be deciphered using 29 points over the elliptic curve E within 40 seconds. © 2011 Information Processing Society of Japan.

DOI

Scopus

Cited by 6 (Scopus)
Scan-Based Side-Channel Attack against RSA Cryptosystems Using Scan Signatures

Ryuta Nara, Kei Satoh, Masao Yanagisawa, Tatsuo Ohtsuki, Nozomu Togawa

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E93A ( 12 ) 2481 - 2489 December 2010 [Peer-reviewed]

Scan-based side-channel attacks retrieve a secret key in a cryptography circuit by analyzing scanned data. Since they must be considerable threats to a cryptosystem LSI, we have to protect cryptography circuits from them. RSA is one of the most important cryptography algorithms because it effectively realizes a public-key cryptography system. RSA is extensively used, but conventional scan-based side-channel attacks cannot be applied to it because it has a complicated algorithm. This paper proposes a scan-based side-channel attack which enables us to retrieve a secret key in an RSA circuit. The proposed method is based on detecting intermediate values calculated in an RSA circuit. We focus on a 1-bit time-sequence which is specific to some intermediate values. By monitoring the 1-bit time-sequence in the scan path, we can find out the register position specific to the intermediate value and we can know whether this intermediate value is calculated or not in the target RSA circuit. We can retrieve a secret key one bit by one bit from MSB to LSB. The experimental results demonstrate that a 1,024-bit secret key used in the target RSA circuit can be retrieved using 30.2 input messages within 98.3 seconds and its 2,048-bit secret key can be retrieved using 34.4 input messages within 634.0 seconds.

DOI

Scopus

Cited by 80 (Scopus)
Improved Launch for Higher TDF Coverage With Fewer Test Patterns

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 29 ( 8 ) 1294 - 1299 August 2010 [Peer-reviewed]

Due to the limitations of scan structure, the second vector in transition delay test is usually applied either by shift operation or by functional launch, which possibly results in unsatisfying transition delay fault (TDF) coverage. To overcome such a limitation for higher TDF coverage, a novel improved launch delay test technique that combines the pros of launch-on-shift and launch-on-capture tests is introduced in this paper. The proposed method can achieve near perfect TDF coverage with fewer test patterns without the need for a global fast scan enable signal. Experimental results on ISCAS89 and ITC99 benchmark circuits are included to show the effectiveness of the proposed method.

DOI

Scopus
State-dependent Changeable Scan Architecture against Scan-based Side Channel Attacks

Ryuta Nara, Hiroshi Atobe, Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2010 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS 1867 - 1870 2010 [Peer-reviewed]

Scan test is a powerful and popular test technique because it can control and observe the internal states of the circuit under test. However, scan path would be used to discover the internals of crypto hardware, which presents a significant security risk of information leakage. An interesting design-for-test technique by inserting inverters into the internal scan path to complicate the scan structure has been recently presented. Unfortunately, it still carries the potential of being attacked through statistical analysis of the information scanned out from chips. Therefore, in this paper we propose secure scan architecture, called dynamic variable secure scan, against scan-based side channel attack. The modified scan flip-flops are state-dependent, which could cause the output of each State-dependent Scan FF to be inverted or not so as to make it more difficult to discover the internal scan architecture.

DOI

Scopus

Cited by 10 (Scopus)
Performance-driven High-level Synthesis with floorplan for GDR Architectures and its Evaluation

Akira Ohchi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2010 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS 921 - 924 2010 [Peer-reviewed]

In this paper, we propose a high-level synthesis method targeting a generalized distributed-register architecture in which we introduce shared/local registers and global/local controllers. In our architecture, functional units on a critical path use local registers and local controllers, and functional units on a non-critical path use shared registers and a global controller. Our method is based on iterative improvement of scheduling/binding and floorplanning. Using the iterative flow, we obtain a generalized distributed-register architecture where its scheduling/binding as well as floorplanning are simultaneously optimized. Experimental results show that an 8.6% performance improvement can be achieved compared to the conventional high-performance method.

DOI

Scopus

Cited by 4 (Scopus)
Scan-Based Attack against Elliptic Curve Cryptosystems

Ryuta Nara, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2010 15TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC 2010) 402 - 407 2010 [Peer-reviewed]

Scan-based attacks are techniques to decipher a secret key using scanned data obtained from a cryptography circuit. Public-key cryptography, such as RSA and elliptic curve cryptosystem (ECC), is extensively used but conventional scan-based attacks cannot be applied to it, because it has a complicated algorithm as well as a complicated architecture. This paper proposes a scan-based attack which enables us to decipher a secret key in ECC. The proposed method is based on detecting intermediate values calculated in ECC. By monitoring the 1-bit sequence in the scan path, we can find out the register position specific to the intermediate value in it and we can know whether this intermediate value is calculated or not in the target ECC circuit. By using several intermediate values, we can decipher a secret key. The experimental results demonstrate that a secret key in a practical ECC circuit can be deciphered using 29 points over the elliptic curve E within 40 seconds.

DOI

Scopus

Cited by 73 (Scopus)
VLSI Implementation of a Fast Intra Prediction Algorithm for H.264/AVC Encoding

Youhua Shi, Kenta Tokumitsu, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

PROCEEDINGS OF THE 2010 IEEE ASIA PACIFIC CONFERENCE ON CIRCUIT AND SYSTEM (APCCAS) 1139 - 1142 2010 [Peer-reviewed]

Intra-frame coding is one of the most important technologies in H.264/AVC, which made significant contributions to the enhancement of coding efficiency of H.264/AVC at the cost of computation complexity. To address this problem, in this paper we present an efficient VLSI implementation of a computation efficient intra prediction algorithm for H.264/AVC encoding. Unlike most of existing fast intra-mode selection techniques, in the proposed method the directional differences are computed using a few selected original pixels to obtain the candidate modes with the minimal direction cost. The proposed method is hardware-friendly and provides more processing parallelism for H.264 intra-frame encoding with less overhead and less power consumption, which is expected to be utilized as a favourable accelerator hardware module in a real-time HDTV (1920x1080p) H.264 encoder.

DOI

Scopus

Cited by 2 (Scopus)
A Fast Selector-Based Subtract-Multiplication Unit and Its Application to Radix-2 Butterfly Unit

Youhei Tsukamoto, Masao Yanagisawa, Tatsuo Ohtsuki, Nozomu Togawa

PROCEEDINGS OF THE 2010 IEEE ASIA PACIFIC CONFERENCE ON CIRCUIT AND SYSTEM (APCCAS) 1083 - 1086 2010 [Peer-reviewed]

Large-scale network and multimedia application LSIs include application-specific arithmetic units. A multiply-accumulator unit (MAC unit), which is one of these optimized units, arranges partial products and decreases carry propagations. However, there is no method similar to MAC to execute "subtract-multiplication". In this paper, we propose a high-speed subtract-multiplication unit that decreases the latency of a subtract operation by bit-level transformation using selector logics. By using bit-level transformation, its partial products are calculated directly. The proposed subtract-multiplication units can be applied to any type of system using subtract-multiplications, and a butterfly operation in FFT is one of their suitable applications. Experimental results show that our proposed arithmetic units using selector logics improve the performance by 13.92% compared to a conventional approach.

DOI

Scopus
BusMesh NoC: A Novel NoC Architecture Comprised of Bus-based Connection and Global Mesh Routers

SeungJu Lee, Masao Yanagisawa, Tatsuo Ohtsuki, Nozomu Togawa

PROCEEDINGS OF THE 2010 IEEE ASIA PACIFIC CONFERENCE ON CIRCUIT AND SYSTEM (APCCAS) 712 - 715 2010 [Peer-reviewed]

Network-on-chip (NoC) architectures have emerged as a promising solution to the lack of scalability in multi-processor systems-on-chips (MPSoCs). In this paper, a BusMesh network-on-chip (BMNoC) architecture is proposed, together with simulation results. It is comprised of bus-based connections and global mesh routers to enhance the performance of on-chip communication. Furthermore, running MPEG-4, H.264, and a hybrid application mixing MPEG-4 and H.264 on our architecture illustrates better performance than earlier studies and the feasibility of BMNoC.

DOI

Scopus

Cited by 5 (Scopus)
A Two-Level Cache Design Space Exploration System for Embedded Applications

Nobuaki Tojo, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E92A ( 12 ) 3238 - 3247 December 2009 [Peer-reviewed]

Recently, a two-level cache, consisting of an L1 cache and an L2 cache, is commonly used in a processor. Particularly in an embedded system whereby a single application or a class of applications is repeatedly executed on a processor, its cache configuration can be customized such that an optimal one is achieved. An optimal two-level cache configuration which minimizes overall memory access time or memory energy consumption can be obtained by varying the three cache parameters: the number of sets, a line size, and an associativity, for the L1 cache and the L2 cache. In this paper, we first extend the L1 cache simulation algorithm so that we can explore two-level cache configurations. Second, we propose two-level cache design space exploration algorithms: CRCB-T1 and CRCB-T2, each of which is based on applying the Cache Inclusion Property to two-level cache configurations. Each of the proposed algorithms realizes exact cache simulation but decreases the number of cache hit/miss judgments by a factor of several thousands. Experimental results show that, by using our approach, the number of cache hit/miss judgments required to optimize cache configurations is reduced to 1/50-1/5500 compared to the exhaustive approach. As a result, our proposed approach runs an average of 1398.25 times faster in total compared to the exhaustive approach. Our proposed cache simulation approach achieves the world's fastest two-level cache design space exploration.

DOI

Scopus
A Scan-Based Attack Based on Discriminators for AES Cryptosystems

Ryuta Nara, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E92A ( 12 ) 3229 - 3237 December 2009 [Peer-reviewed]

A scan chain is one of the most important testing techniques, but it can be used for side-channel attacks against a cryptography LSI. We focus on scan-based attacks, in which scan chains are targeted for side-channel attacks. The conventional scan-based attacks only consider a scan chain composed solely of the registers in a cryptography circuit. However, a cryptography LSI usually uses many circuits such as memories, microprocessors and other circuits. This means that the conventional attacks cannot be applied to a practical scan chain composed of various types of registers. In this paper, a scan-based attack which enables deciphering the secret key in an AES cryptography LSI composed of an AES circuit and other circuits is proposed. By focusing on the bit pattern of a specific register and monitoring its change, our scan-based attack eliminates the influence of registers included in circuits other than AES. Our attack does not depend on the scan chain architecture, and it can decipher practical AES cryptography LSIs.

DOI

Scopus

Cited by 51 (Scopus)
Floorplan-Aware High-Level Synthesis for Generalized Distributed-Register Architectures

Akira Ohchi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E92A ( 12 ) 3169 - 3179 December 2009 [Peer-reviewed]

As device feature size decreases, interconnection delay becomes the dominating factor of circuit total delay. Distributed-register architectures can reduce the influence of interconnection delay. They may, however, increase circuit area because they require many local registers. Moreover, original distributed-register architectures do not consider control signal delay, which may be the bottleneck in a circuit. In this paper, we propose a high-level synthesis method targeting a generalized distributed-register architecture in which we introduce shared/local registers and global/local controllers. Our method is based on iterative improvement of scheduling/binding and floorplanning. First, we prepare shared-register groups with global controllers, each of which corresponds to a single functional unit. As iterations proceed, we use local registers and local controllers for functional units on a critical path. Shared-register groups physically located close to each other are merged into a single group. Accordingly, global controllers are merged. Finally, our method obtains a generalized distributed-register architecture where its scheduling/binding as well as floorplanning are simultaneously optimized. Experimental results show that the area is decreased by 4.7% while maintaining the performance of the circuit equal with that using original distributed-register architectures.

DOI

Scopus

Cited by 8 (Scopus)
X-Handling for Current X-Tolerant Compactors with More Unknowns and Maximal Compaction

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E92A ( 12 ) 3119 - 3127 December 2009 [Peer-reviewed]

This paper presents a novel X-handling technique, which removes the effect of unknowns on the compacted test response with a maximal compaction ratio. The proposed method is combined with current X-tolerant compactors and inserts masking cells on scan paths to selectively mask X's. By doing this, the number of unknown responses in each scan-out cycle can be reduced to a reasonable level such that the target X-tolerant compactor can tolerate them with guaranteed error detection. It guarantees no test loss due to the effect of X's, and achieves the maximal compaction that the target response compactor can provide as well. Moreover, because the masking cells are only inserted on the scan paths, there is no performance degradation of the designs. Experimental results demonstrate the effectiveness of the proposed method.

DOI

Scopus
Unified Dual-Radix Architecture for Scalable Montgomery Multiplications in GF(P) and GF(2^n)

Kazuyuki Tanimura, Ryuta Nara, Shunitsu Kohara, Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E92A ( 9 ) 2304 - 2317 September 2009 [Peer-reviewed]

Modular multiplication is the most dominant arithmetic operation in elliptic curve cryptography (ECC), which is a type of public-key cryptography. A Montgomery multiplier is commonly used to compute the modular multiplications and requires scalability because the bit length of operands varies depending on the security level. In addition, ECC is performed in GF(P) or GF(2^n), and a unified architecture for multipliers in GF(P) and GF(2^n) is required. However, in previous works, changing the frequency is necessary to deal with the delay-time difference between GF(P) and GF(2^n) multipliers because the critical path of the GF(P) multiplier is longer. This paper proposes a unified dual-radix architecture for scalable Montgomery multiplications in GF(P) and GF(2^n). The proposed architecture unifies four parallel radix-2^16 multipliers in GF(P) and a radix-2^64 multiplier in GF(2^n) into a single unit. Applying a lower radix to the GF(P) multiplier shortens its critical path and makes it possible to compute the operands in the two fields using the same multiplier at the same frequency, so that clock dividers to deal with the delay-time difference are not required. Moreover, the parallel architecture in GF(P) reduces the clock cycles increased by the dual-radix approach. Consequently, the proposed architecture computes a GF(P) 256-bit Montgomery multiplication in 0.28 μs. The implementation result shows that the area of the proposal is almost the same as that of previous works: 39 kgates.
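For readers unfamiliar with the operation this architecture accelerates, here is a minimal sketch of textbook Montgomery multiplication in a prime field. It processes whole operands with R = 2^k and does not model the paper's scalable word-serial datapath, its dual-radix split, or the binary-field mode. The function names are hypothetical, and Python 3.8 or later is assumed for the modular inverse via `pow`.

```python
def montgomery_setup(p, k):
    """Return (R, p_neg_inv) for odd modulus p and R = 2**k > p.

    p_neg_inv satisfies p * p_neg_inv == -1 (mod R).
    """
    if p % 2 == 0 or (1 << k) <= p:
        raise ValueError("need odd p and 2**k > p")
    r = 1 << k
    p_neg_inv = (-pow(p, -1, r)) % r
    return r, p_neg_inv


def montgomery_multiply(a_bar, b_bar, p, k, p_neg_inv):
    """Return a_bar * b_bar * R^-1 mod p for inputs already below p."""
    mask = (1 << k) - 1
    t = a_bar * b_bar
    m = ((t & mask) * p_neg_inv) & mask  # makes t + m*p divisible by R
    u = (t + m * p) >> k                 # exact division by R is a shift
    return u - p if u >= p else u        # single conditional subtraction
```

Operands are first mapped into Montgomery form as `a * R mod p`; multiplying two such values with `montgomery_multiply` keeps the result in that form, and multiplying by 1 maps it back. The appeal for hardware is that the reduction needs only masks, shifts, and one conditional subtraction instead of a division by p.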

DOI

Scopus
An L1 Cache Design Space Exploration System for Embedded Applications

Nobuaki Tojo, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E92A ( 6 ) 1442 - 1453 June 2009 [Peer-reviewed]

In an embedded system where a single application or a class of applications is repeatedly executed on a processor, its cache configuration can be customized such that an optimal one is achieved. We can have an optimal cache configuration which minimizes overall memory access time by varying the three cache parameters: the number of sets, a line size, and an associativity. In this paper, we first propose two cache simulation algorithms: CRCB1 and CRCB2, based on the Cache Inclusion Property. They realize exact cache simulation but decrease the number of cache hit/miss judgments dramatically. We further propose three more cache design space exploration algorithms: CRMF1, CRMF2, and CRMF3, based on our experimental observations. They can find an almost optimal cache configuration from the viewpoint of access time. By using our approach, the number of cache hit/miss judgments required for optimizing cache configurations is reduced to 1/10-1/50 compared to conventional approaches. As a result, our proposed approach runs an average of 3.2 times faster and a maximum of 5.3 times faster in total compared to the fastest approach proposed so far. Our proposed cache simulation approach achieves the world's fastest cache design space exploration when optimizing total memory access time.

DOI

Scopus
Design-for-Secure-Test for Crypto Cores

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

ITC: 2009 INTERNATIONAL TEST CONFERENCE 618 - 618 2009 [Peer-reviewed]

Scan technology carries the potential of being misused as a "side channel" to leak out the secret information of crypto cores. To address such a design challenge, this paper proposes a design-for-secure-test (DFST) solution for crypto cores by adding a stimuli-launched flip-flop into the traditional scan flip-flop to maintain the high test quality without compromising the security.

DOI

Scopus

Cited by 7 (Scopus)
Exact and Fast L1 Cache Simulation for Embedded Systems

Nobuaki Tojo, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

PROCEEDINGS OF THE ASP-DAC 2009: ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 2009 817 - 822 2009 [Peer-reviewed]

In recent years, the gap between the cycle time of processors and memory access time has been increasing. One of the solutions to solve this problem is to use a cache. But just using a large cache may not reduce the total memory access time. We can have an optimal cache configuration which minimizes overall memory access time by varying the three cache parameters: a cache set size, a line size, and an associativity. In this paper, we propose two exact cache simulation algorithms: CRCB1 and CRCB2, based on Cache Inclusion Property. They realize exact cache simulation but increase simulation speed dramatically. By using our approach, the number of cache hit/miss judgments required for simulating all the cache configurations is reduced to 31.4%-93.6% compared to conventional approaches. As a result, our proposed approach totally runs an average of 1.8 times faster and a maximum of 3.3 times faster compared to the fastest approach proposed so far. Our proposed exact cache simulation approach achieves the world fastest L1 cache simulation.
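The inclusion property mentioned above can be illustrated with the classic LRU stack-distance idea: for a fixed number of sets and line size, an access that hits at LRU depth d in a set hits in every cache whose associativity is at least d, so one pass over the trace yields hit counts for all associativities. The sketch below shows only that general idea; it is not CRCB1 or CRCB2, and the function name and trace format (a list of byte addresses) are assumptions.

```python
def lru_hits_all_assocs(trace, num_sets, line_size, max_assoc):
    """Return hits, where hits[a] is the LRU hit count for associativity a.

    Index 0 is unused. One pass covers a = 1..max_assoc.
    """
    # Per-set LRU stack, most recently used tag first.
    stacks = [[] for _ in range(num_sets)]
    hits = [0] * (max_assoc + 1)
    for addr in trace:
        block = addr // line_size
        stack = stacks[block % num_sets]
        tag = block // num_sets
        if tag in stack:
            depth = stack.index(tag) + 1
            # A hit at this depth is a hit for every larger associativity.
            for a in range(depth, max_assoc + 1):
                hits[a] += 1
            stack.remove(tag)
        stack.insert(0, tag)
        del stack[max_assoc:]  # tags deeper than max_assoc never matter
    return hits
```

For the trace `[0, 1, 0, 2, 0]` with one set, the pass reports no hits for a direct-mapped cache and two hits for a two-way cache, without simulating the two configurations separately.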

DOI

Scopus

Cited by 25 (Scopus)
A Unified Test Compression Technique for Scan Stimulus and Unknown Masking Data with No Test Loss

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E91A ( 12 ) 3514 - 3523 December 2008 [Peer-reviewed]

This paper presents a unified test compression technique for scan stimulus and unknown masking data with seamless integration of test generation, test compression and all unknown response masking for high quality manufacturing test cost reduction. Unlike prior test compression methods, the proposed approach considers the unknown responses during the test pattern generation procedure, and then selectively encodes the less specified bits (either 1s or 0s) in each scan slice for compression while at the same time masking the unknown responses before sending them to the response compactor. The proposed test scheme could dramatically reduce test data volume as well as the number of required test channels by using only c tester channels to drive N internal scan chains, where c = ⌈log_2 N⌉ + 2. In addition, because all the unknown responses could be exactly masked before entering into the response compactor, test loss due to unknown responses would be eliminated. Experimental results on both benchmark circuits and larger designs indicated the effectiveness of the proposed technique.

DOI

Scopus
Floorplan-driven high-level synthesis for distributed/shared-register architectures

Akira Ohchi, Shunitsu Kohara, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IPSJ Transactions on System LSI Design Methodology 1 78 - 90 August 2008 [Peer-reviewed]

In this paper, we propose a high-level synthesis method targeting distributed/shared-register architectures. Our method repeats (1) scheduling/FU binding, (2) register allocation, (3) register binding, and (4) module placement. By feeding back floorplan information from (4) to (1), our method obtains a distributed/shared-register architecture where its scheduling/binding as well as floorplanning are simultaneously optimized. Experimental results show that the area is decreased by 13.2% while maintaining the performance of the circuit equal with that using distributed-register architectures. © 2008 Information Processing Society of Japan.

DOI

Scopus

Cited by 8 (Scopus)
Low power LDPC code decoder architecture based on intermediate message compression technique

Kazunori Shimizu, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E91A ( 4 ) 1054 - 1061 April 2008 [Peer-reviewed]

Reducing the power dissipation of an LDPC code decoder is a major challenge in applying it to practical digital communication systems. In this paper, we propose a low power LDPC code decoder architecture based on an intermediate message-compression technique with the following features: (i) An intermediate message compression technique enables the decoder to reduce the required memory capacity and write power dissipation. (ii) A clock-gated shift-register-based intermediate message memory architecture enables the decoder to decompress the compressed messages in a single clock cycle while reducing the read power dissipation. The combination of the above two techniques enables the decoder to reduce the power dissipation while keeping the decoding throughput. The simulation results show that the proposed architecture improves the power efficiency by up to 52% and 18% compared to that of the decoder based on the overlapped schedule and the rapid convergence schedule without the proposed techniques, respectively.

DOI

Scopus
A secure test technique for pipelined advanced encryption standard

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E91D ( 3 ) 776 - 780 March 2008 [Peer-reviewed]

In this paper, we presented a Design-for-Secure-Test (DFST) technique for pipelined AES to guarantee both the security and the test quality during testing. Unlike previous works, the proposed method can keep all the secrets inside and provide high test quality and fault diagnosis ability as well. Furthermore, the proposed DFST technique can significantly reduce test application time, test data volume, and test generation effort as additional benefits.

DOI

Scopus

Cited by 3 (Scopus)
Floorplan-Driven High-Level Synthesis for Distributed/Shared-Register Architectures

Ohchi Akira, Kohara Shunitsu, Togawa Nozomu, Yanagisawa Masao, Ohtsuki Tatsuo

Information and Media Technologies 3 ( 4 ) 691 - 703 2008

In this paper, we propose a high-level synthesis method targeting distributed/shared-register architectures. Our method repeats (1) scheduling/FU binding, (2) register allocation, (3) register binding, and (4) module placement. By feeding back floorplan information from (4) to (1), our method obtains a distributed/shared-register architecture where its scheduling/binding as well as floorplanning are simultaneously optimized. Experimental results show that the area is decreased by 13.2% while maintaining the performance of the circuit equal with that using distributed-register architectures.

DOI CiNii
High-level synthesis algorithms with floorplaning for distributed/shared-register architectures

Akira Ohchi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2008 INTERNATIONAL SYMPOSIUM ON VLSI DESIGN, AUTOMATION AND TEST (VLSI-DAT), PROCEEDINGS OF TECHNICAL PROGRAM 164 - 167 2008 [Peer-reviewed]

In this paper, we propose a high-level synthesis method targeting distributed/shared-register architectures. Our method repeats (1) scheduling/FU binding, (2) register allocation, (3) register binding, and (4) module placement. By feeding back floorplan information from (4) to (1), our method obtains a distributed/shared-register architecture where its scheduling/binding as well as floorplanning are simultaneously optimized. Experimental results show that the area is decreased by 13.6% while maintaining the performance of the circuit equal with that using distributed-register architectures.
Scalable unified dual-radix architecture for Montgomery multiplication in GF(P) and GF(2^n)

Kazuyuki Tanimura, Ryuta Nara, Shunitsu Kohara, Kazunori Shimizu, Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2008 ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2 667 - 672 2008 [Peer-reviewed]

Modular multiplication is the most dominant arithmetic operation in elliptic curve cryptography (ECC), which is a type of public-key cryptography. Montgomery multiplication is commonly used as a technique for modular multiplication and requires scalability since the bit length of operands varies depending on the security level. Also, ECC is performed in GF(P) or GF(2^n), and unified architectures for GF(P) and GF(2^n) multipliers are needed. However, in previous works, changing frequency or dual-radix architecture is necessary to deal with the delay-time difference between the GF(P) and GF(2^n) circuits of the multiplier because the critical path of the GF(P) circuit is longer. This paper proposes a scalable unified dual-radix architecture for Montgomery multiplication in GF(P) and GF(2^n). The proposed architecture unifies 4 parallel radix-2^16 multipliers in GF(P) and a radix-2^64 multiplier in GF(2^n) into a single unit. Applying a lower radix to the GF(P) multiplier shortens its critical path and makes it possible to compute the operands in the two fields using the same multiplier at the same frequency, so that clock dividers to deal with the delay-time difference are not required. Moreover, the parallel architecture in GF(P) reduces the clock cycles increased by the dual-radix approach. Consequently, the proposed architecture computes a GF(P) 256-bit Montgomery multiplication in 0.23 μs.

Citations (Scopus): 4
GECOM: Test data compression combined with all unknown response masking

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2008 ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2 537 - 542 2008 [Refereed]

This paper introduces GECOM technology, a novel test compression method with seamless integration of test GEneration, test COmpression (i.e., integrated compression of scan stimulus and masking bits), and all-unknown scan response Masking for manufacturing test cost reduction. Unlike most prior methods, the proposed method considers the unknown responses during the ATPG procedure and selectively encodes the specified bits (either 1s or 0s) in scan slices for compression, while at the same time masking the unknown responses before sending them to the response compactor. The proposed GECOM technology consists of the GECOM architecture and the GECOM ATPG technique. In the GECOM architecture, for a circuit with N internal scan chains, only c tester channels, where c = ⌈log₂ N⌉ + 2, are required. GECOM ATPG generates test patterns for the GECOM architecture, so that not only can the scan inputs be efficiently compressed but all the unknown responses are also masked. Experimental results on both benchmark circuits and real industrial designs indicate the effectiveness of the proposed GECOM technique.

Citations (Scopus): 5
Unknown Response Masking with Minimized Observable Response Loss and Mask Data

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2008 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS 2008), VOLS 1-4 1779 - + 2008 [Refereed]

This paper presents a new unknown response masking technique to minimize the effect on test loss due to over-masking. Unlike previous works where the scan responses are masked before entering the response compactor, the proposed method could mask the Xs when they are transformed on the scan path. Meanwhile, the masking cells are inserted along the scan paths, thus they would have no degradation on the performance of the designs. In addition, the test data required to mask unknown responses is only one bit for each test pattern. Experimental results show the effectiveness of the proposed method.

Dynamically Reconfigurable Architecture for Multi-Rate Compatible Regular LDPC Decoding

Akiyuki Nagashima, Yuta Imai, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2008 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS 2008), VOLS 1-4 705 - 708 2008 [Refereed]

Recently, demand for high-speed wireless network service on mobile devices has been increasing rapidly. Error-correcting codes are used to enhance network communication quality. In particular, LDPC (Low Density Parity Check) codes show high throughput and achieve information rates very close to the Shannon limit. In this paper, we propose a dynamically reconfigurable architecture for multi-rate compatible regular LDPC decoding. Our proposed decoder deals with multi-rate codes by introducing a multi-rate compatible first and second minimum searching unit. The proposed decoder shows better throughput over a wide range of S/N ratios than conventional fixed-rate LDPC decoders.

Citations (Scopus): 1
FIR Filter Design on Flexible Engine/Generic ALU Array and Its Dedicated Synthesis Algorithm

Ryo Tamura, Masayuki Honma, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki, Makoto Satoh

2008 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS 2008), VOLS 1-4 701 - + 2008 [Refereed]

Reconfigurable processors are those whose contexts are dynamically reconfigured while they are working. We focus on a reconfigurable processor called FE-GA (Flexible Engine/Generic ALU array) for digital media processing. Currently, FE-GA does not have a dedicated behavioral synthesis tool. In this paper, we design FIR filters and propose an algorithm to map them onto FE-GA automatically. Given the order and coefficients of an FIR filter, the algorithm generates dedicated assembly code which represents the filter for FE-GA. Then an editor called FEEditor reads the generated assembly code and implements the corresponding FIR filter on FE-GA. The proposed algorithm achieves automatic mapping of FIR filters of all orders within the range of the FE-GA architecture specification. Furthermore, it is proved that FIR filtering executes in the minimum number of cycles if there is no thread switching.

Citations (Scopus): 5
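A Study on Hardware Implementation of a Metadata Generation System for MPEG-A Photo Player on Mobile Devices

元橋雅人, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report VLD2006 ( 552 ) 31 - 36 March 2007

Imaging devices have become widespread in recent years, and the amount of multimedia content held by users keeps growing. Attention is therefore shifting from techniques that compress images and video to techniques that handle them efficiently (search, grouping, and so on). This work focuses on digital photo albums that search and group digital photos, and proposes a way to bring that functionality to mobile devices. At present, metadata generation, which extracts content features (metadata) used for search and similar tasks, is a bottleneck, and the functionality has not been implemented on mobile devices. In addition, anticipating that mobile device suppliers will offer various services based on metadata, this work targets the metadata generation part. The metadata generation system designed here generates the metadata standardized by MPEG-A Photo Player. The system assumes execution on a mobile device and aims to generate metadata within 1 second; to that end, we propose reducing the computational load of the metadata generation method and accelerating it with dedicated hardware. Specifically, the load is reduced by replacing clustering, the largest bottleneck in metadata generation, by speeding up the Radon transform, and by downscaling the image. Combining this reduction with dedicated hardware, metadata is generated from an image of about 2 megapixels in 0.87 seconds, which meets the 1-second constraint.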

A Data Cache Configuration Optimization System for Application Processors and Its Evaluation

堀内一央, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report VLD2006 March 2007

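A SIMD Instruction Synthesis Method Supporting Nested Loops for Optimized Design of SIMD Processor Cores

中島裕貴, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report VLD2006 ( 551 ) 13 - 18 March 2007

SPADES, a HW/SW cosynthesis system targeting processors with a packed SIMD instruction set, requires a parallelizing compiler for such processors. The compiler first assumes a virtual processor equipped with every hardware unit that can be added to the processor core targeted by SPADES. On this virtual processor, it extracts as much of the instruction-level parallelism of the input application as possible using packed SIMD instructions, and outputs assembly code scheduled to minimize execution time. The compiler output gives the initial configuration of the processor to be synthesized. This paper proposes a parallelization algorithm for nested loop structures, which forms the core of the parallelizing compiler. The proposed method extracts instruction-level parallelism from nested loops in the input application and makes it possible to generate packed SIMD instructions, so that accelerated assembly code is output. We evaluate the effectiveness of the proposed method through computer experiments.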

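A Hardware/Software Partitioning Framework for SIMD Processor Cores

大東真崇, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report VLD2006 ( 551 ) 7 - 12 March 2007

This paper proposes a HW/SW partitioning framework for SPADES, a hardware/software (HW/SW) cosynthesis system targeting SIMD processor cores. SPADES aims to automatically synthesize application-specific processor cores whose area and performance are neither excessive nor insufficient. HW/SW partitioning, the core of SPADES, starts from a processor core configuration that executes the application as fast as possible, then successively removes hardware units attached to the core, such as functional units and registers, to search for the optimal configuration. The proposed framework enables this configuration search and yields a small-area SIMD processor core configuration that meets the performance requirement of the application. Because the framework is built from modules, it can be modified and extended easily. We evaluate the proposed framework through computer experiments and report the results.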

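A Processing Unit Optimization Method in SIMD Processor Core Design

繁田裕之, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report VLD2006 ( 551 ) 1 - 6 March 2007

Processors that run applications such as image and audio processing can be faster, smaller, and lower in power consumption than general-purpose processors when they have functional units specialized for the target application. We call such functional units with special datapaths processing units. Designing processing units for each application takes considerable time, so a system that can synthesize them automatically is needed. This paper proposes a method for optimizing the processing units added to the processor core in the automatic synthesis of application-specific SIMD processor cores. The proposed method clusters multiple arithmetic and logic operation nodes in the Control Data Flow Graph (CDFG) of the application and generates processing units automatically. The generated processing units are incorporated into the processor core, and the method searches for the optimal configuration while reconfiguring the processing units together with optimizing the processor core. We evaluate the proposed method through computer experiments and report the results.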

Power-Efficient LDPC Code Decoder Architecture

Kazunori Shimizu, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

ISLPED'07: PROCEEDINGS OF THE 2007 INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN 359 - 362 2007 [Refereed]

This paper proposes a power-efficient LDPC decoder architecture which features (1) a FIFO-buffering-based rapid convergence schedule, which enables the decoder to increase the decoding throughput without increasing the required number of memory bits, and (2) an intermediate message compression technique based on a clock-gated shift register, which reduces the read and write power dissipation for the intermediate messages. Simulation results show that the proposed decoder achieves 1.66 times faster decoding throughput and improves the power efficiency (defined as the power dissipation per Mbps) by up to 52% compared to the decoder based on the conventional overlapped schedule.

Citations (Scopus): 3
Design for secure test - A case study on pipelined Advanced Encryption Standard

Youhua Shi, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

2007 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11 149 - 152 2007 [Refereed]

Cryptography plays an important role in the security of data transmission. To ensure the correctness of crypto hardware, testing must be conducted at fabrication and in the field. However, state-of-the-art scan-based test techniques, to achieve high test quality, need to increase the testability of the circuit under test, which carries the potential of being misused to reveal the secret information of the crypto hardware. Thus, developing efficient test strategies for crypto hardware that achieve high test quality without compromising security becomes an important task. In this paper, we discuss the development of a Design-for-Secure-Test (DFST) technique for pipelined AES to overcome the above conflict between security and test quality in testing crypto hardware. Unlike previous works, the proposed method can keep all the secrets inside while providing high test quality and fault diagnosis ability. Furthermore, the proposed DFST technique can significantly reduce test application time, test data volume, and test generation effort as additional benefits.

An XML-Based CDFG Manipulation Framework: CoDaMa

小原俊逸, 史又華, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report VLD2006-97 19 - 24 January 2007

Design of a Digit-Serial Multiplier over GF(2^m) for Elliptic Curve Cryptography

奈良竜太, 小原俊逸, 清水一範, 戸川望, 池永剛, 柳澤政生, 後藤敏, 大附辰夫

IEICE Technical Report VLD2006-89 ( 455 ) 25 - 30 January 2007

Power-efficient LDPC decoder architecture based on accelerated message-passing schedule

Kazunori Shimizu, Tatsuyuki Ishikawa, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E89A ( 12 ) 3602 - 3612 December 2006 [Refereed]

In this paper, we propose a power-efficient LDPC decoder architecture based on an accelerated message-passing schedule. The proposed decoder architecture is characterized as follows: (i) Partitioning a pipelined operation so that intermediate messages are not read and written simultaneously enables the accelerated message-passing schedule to be implemented with single-port SRAMs. (ii) FIFO-based buffering reduces the number of SRAM banks and words of the LDPC decoder based on the accelerated message-passing schedule. The proposed LDPC decoder keeps a single message for each non-zero bit in a parity check matrix, as a classical schedule does, while achieving the accelerated message-passing schedule. Implementation results in 0.18 μm CMOS technology show that the proposed decoder architecture reduces the area of the LDPC decoder by 43% and the power dissipation by 29% compared to the conventional architecture based on the accelerated message-passing schedule.

Citations (Scopus): 5
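A Forwarding Unit Optimization Method for Application Processors

日浦敏俊, 小原俊逸, 史又華, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ( VLD2006-80 ) November 2006
Implementation of a Dynamically Reconfigurable Multi-Rate LDPC Decoder

今井優太, 清水一範, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report RECONF2006-43 ( 393 ) 35 - 40 November 2006

Demand for high-speed communication in wireless environments, such as delivery of large content to mobile phones and other mobile terminals, has been growing in recent years. As the quality of wireless communication must be improved, error-correcting codes are used as one means. This paper proposes a decoder whose target application is LDPC (Low Density Parity Check) codes, a newer class of error-correcting codes that can be decoded in linear time and has better decoding performance than conventional error-correcting codes. The proposed decoder uses the min-sum algorithm and supports decoding of multi-rate codewords according to the noise characteristics of the channel. We propose and design a multi-rate-compatible minimum and second-minimum value search circuit. As a result, the decoder achieves better decoding throughput over a wide range of S/N ratios than conventional fixed-rate decoders.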

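A Simplified Map Generation Method for Pedestrian Navigation Considering Visibility on Small Screens and Ease of Wayfinding

二宮直也, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ITS2006-34 ( 266 ) 53 - 58 September 2006

With the spread of location-based services and Internet services on mobile phones, the use of map services for pedestrians is expanding. Accordingly, techniques for automatically generating simplified maps suited to mobile terminals with small displays are being actively studied. Conventional methods, which are based on making road shapes horizontal or vertical and quantizing intersection angles, can generate well-designed simplified maps resembling a grid, but such maps are not necessarily ones on which users are unlikely to get lost. This paper proposes a simplification algorithm that generates simplified maps reflecting human criteria for judging direction and the effect of intersection shapes on pedestrians, aiming at maps that are both easy to see on a small screen such as a mobile phone and hard to get lost with. We applied the method to road network data with about 400 nodes and confirmed that simplified maps are generated.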

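A Route Search Method Reflecting Pedestrian Preferences for Indoor Pedestrian Navigation

荒井亨, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ITS2006-33 ( 266 ) 47 - 52 September 2006

Services that guide users to a destination using a mobile phone are currently available. However, these route guidance services cover only outdoor spaces. This paper targets indoor spaces and constructs network data specialized for indoor spaces such as underground shopping malls and department stores. Furthermore, assuming that an indoor pedestrian navigation service is used on a mobile terminal, we propose a route search method specialized for indoor spaces, with the goal of providing optimal routes that reflect pedestrian preferences. To demonstrate the effectiveness of the proposed method, we carry out two kinds of simulation experiments and show that it can output routes that are optimal for the user.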

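A Destination Determination Method Considering User Preferences and Congestion for Indoor Pedestrian Navigation

小林和馬, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ITS2006-32 ( 266 ) 41 - 45 September 2006

Against the background of advances in mobile communication networks, pedestrian navigation systems using portable information terminals have been actively studied in recent years. In such systems, navigation that reflects user preferences is expected to improve usability. This paper therefore proposes a destination estimation method that responds to detailed user requests, for a navigation system that infers restaurants in indoor environments such as department stores. Based on the user's food preferences and history information and on the times when restaurants are crowded, the system obtains candidate destinations that satisfy the user's preferences and determines the destination through interaction between the user and the system. We built an experimental system based on the method and confirmed its effectiveness.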

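A Lane-by-Lane Traffic Congestion Detection Method Using Inter-Vehicle and Road-to-Vehicle Communication

大高宏介, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ITS2006-18 ( 265 ) 19 - 24 September 2006

With recent advances in ITS technology, positioning accuracy and route guidance in car navigation systems are improving, but the accuracy of estimating travel time from origin to destination is still insufficient, and how to obtain accurate congestion information remains a challenge. In particular, differences in congestion between lanes have a large effect, so congestion must be detected per lane, for example at intersections where lanes differ in congestion, without causing the problems of existing congestion detection methods. This paper therefore proposes a method that uses inter-vehicle and road-to-vehicle communication to detect congestion information per lane in real time at each intersection on ordinary roads. Communication starts from a beacon and inter-vehicle communication is repeated to collect vehicle information, from which the time required to pass through the congestion is calculated. We also show the effectiveness of the method by simulation.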

Design of a Motion Estimation Unit in a DSP for H.264 Encoding

高橋豊和, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ( VLD2006 ) June 2006
An Area/Delay Estimation Method for Application Processors

山崎大輔, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report CAS2006-1 ( VLD2006-14, SIP2006-24 ) June 2006
Selective low-care coding: A means for test data compression in circuits with multiple scan chains

YH Shi, N Togawa, S Kimura, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E89A ( 4 ) 996 - 1004 April 2006 [Refereed]

This paper presents a test input data compression technique, Selective Low-Care Coding (SLC), which can be used to significantly reduce input test data volume as well as the external test channel requirement for multiscan-based designs. In the proposed SLC scheme, we explored the linear dependencies of the internal scan chains, and instead of encoding all the specified bits in test cubes, only a smaller amount of specified bits are selected for encoding, thus greater compression can be expected. Experiments on the larger benchmark circuits show drastic reduction in test data volume with corresponding savings on test application time can be indeed achieved even for the well-compacted test set.

Citations (Scopus): 2
Partially-parallel LDPC decoder achieving high-efficiency message-passing schedule

K Shimizu, T Ishikawa, N Togawa, T Ikenaga, S Goto

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E89A ( 4 ) 969 - 978 April 2006 [Refereed]

In this paper, we propose a partially-parallel LDPC decoder which achieves a high-efficiency message-passing schedule. The proposed LDPC decoder is characterized as follows: (i) The column operations follow the row operations in a pipelined architecture to ensure that the row and column operations are performed concurrently. (ii) The proposed parallel pipelined bit functional unit enables the column operation module to compute every message in each bit node which is updated by the row operations. These column operations can be performed without extending the single iterative decoding delay when the row and column operations are performed concurrently. Therefore, the proposed decoder performs the column operations more frequently in a single iterative decoding, and achieves a high-efficiency message-passing schedule within the limited decoding delay time. Hardware implementation on an FPGA and simulation results show that the proposed partially-parallel LDPC decoder improves the decoding throughput and bit error performance with a small hardware overhead.

Citations (Scopus): 9
Hardware architecture of efficient message-passing schedule based on modified min-sum algorithm for decoding LDPC codes

清水一範, 石川達之, 戸川望, 池永剛, 後藤敏

Proc. Synthesis and System Integration of Mixed Technologies (SASIMI 2006) April 2006
A pipelined functional unit generation method in HW/SW cosynthesis for SIMD processor cores

小原俊逸, 栗原輝, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

Proc. Synthesis and System Integration of Mixed Technologies (SASIMI 2006) April 2006
A Data Cache Configuration Optimization Method for Application Processors

堀内一央, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

Proceedings of the 19th IEICE Karuizawa Workshop on Circuits and Systems 19 583 - 588 April 2006

An LDPC Decoder Using a High-Efficiency Message-Passing Schedule with FIFO Buffers

清水一範, 石川達之, 戸川望, 池永剛, 後藤敏

Proceedings of the 19th IEICE Karuizawa Workshop on Circuits and Systems 19 211 - 216 April 2006

A fast elliptic curve cryptosystem LSI embedding word-based Montgomery multiplier

J Uchida, N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON ELECTRONICS E89C ( 3 ) 243 - 249 March 2006 [Refereed]

Elliptic curve cryptosystems are expected to become the next standard for public-key cryptosystems. The security level of elliptic curve cryptosystems depends on the difficulty of the discrete logarithm problem on elliptic curves. The security level of an elliptic curve cryptosystem with a 160-bit public key is equivalent to that of an RSA system with a 1024-bit public key. We propose an elliptic curve cryptosystem LSI architecture embedding word-based Montgomery multipliers. Montgomery multiplication is an efficient method for finite field multiplication. We can design a scalable architecture for an elliptic curve cryptosystem by selecting the structure of the word-based Montgomery multipliers. Experimental results demonstrate the effectiveness and efficiency of the proposed architecture. In the hardware evaluation using a 0.18 μm CMOS library, the high-speed design using 126 Kgates with 20 x 8-bit multipliers achieved an operation time of 3.6 ms for a 160-bit point multiplication.

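A Positioning Method Using Road Traffic Signs in a Map Information Delivery System for Pedestrians

中口智史, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ( ITS2005-114 ) March 2006
A Pipeline Configuration Optimization Method in Automatic Synthesis of SIMD Processor Cores

栗原輝, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report 105 ( VLD2005-115, ICD2005-232 ) 43 - 48 March 2006

This paper proposes a method for optimizing the pipeline configuration of the processor core in the automatic synthesis of application-specific SIMD processor cores. SIMD multiplication and multiply-accumulate operations, which are frequent in applications such as image processing, have long execution delays and often determine the operating frequency of the processor. In the automatic synthesis of SIMD processor cores that can execute SIMD operations, making SIMD operations multi-cycle can be expected to yield a faster core configuration with a small area overhead. However, because of hazards, increasing the number of pipeline stages does not necessarily shorten the application execution time, so it is desirable to search for the optimal pipeline configuration under a given execution time constraint. The proposed method first defines multiple pipeline configurations with different numbers of stages. Next, for every defined pipeline configuration, it optimizes the types and numbers of hardware units to be added so that the given execution time constraint is met. Finally, it selects the configuration with the smallest area among all resulting configurations and outputs it as the optimal pipeline configuration. We evaluated the effectiveness of the method through computer experiments.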

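Improvement of a Network Processor Supporting Dynamic Flows and Its Evaluation

田淵英孝, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ( VLD2005-112, ICD2005-229 ) March 2006
A HW/SW Partitioning System with a Design Navigation Mechanism for System LSI Design

小島洋平, 戸川望, 橘昌良, 柳澤政生, 大附辰夫

IEICE Technical Report 105 ( VLD2005-111, ICD2005-228 ) 19 - 24 March 2006

This paper proposes a HW/SW partitioning system for system LSI design. The system has a database that changes the number and types of IPs it enumerates according to the constraints, and a design navigation function that interactively assists the designer in making improvements when no solution satisfies the constraints. Reusing IPs in the database reduces the newly designed portion and shortens the development period, and reducing the number of enumerated IPs also shortens the search time. In addition, design navigation makes it possible to examine bottlenecks and easily improvable architectures in a short time. We confirmed the effectiveness of these features through computer experiments.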

A Fast Handoff Method Minimizing the Number of Handoff Messages for High-Speed Mobile Nodes

伊藤光司, 戸川望, 柳澤政生, 大附辰夫

IEICE Technical Report ( IN2005-222 ) March 2006
ASIC implementation of LDPC decoder accelerating message-passing schedule

清水一範, 石川達之, 戸川望, 池永剛, 後藤敏

IEEE International Solid State Circuits Conference (ISSCC), DAC/ISSCC2006 Student Design Contest (Conceptual Category: 1st Place Winner), San Francisco February 2006
Special section on VLSI Design and CAD Algorithms

Onodera, H., Ikeda, M., Ishihara, T., Isshiki, T., Inoue, K., Okada, K., Kajihara, S., Kaneko, M., Kawaguchi, H., Kimura, S., Kuga, M., Kurokawa, A., Sato, T., Shibuya, T., Shiraishi, Y., Takagi, K., Takahashi, A., Takeuchi, Y., Togawa, N., Tomiyama, H., Nakamura, Y., Hamaguchi, K., Miura, Y., Minato, S.-I., Yamaguchi, R., Yamada, M., Yuminaka, Y., Watanabe, T., Hashimoto, M., Miyazaki, M.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E89-A ( 12 ) 3377 - 3377 2006
A parallel LSI architecture for LDPC decoder improving message-passing schedule

Kazunori Shimizu, Tatsuyuki Ishikawa, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

2006 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, PROCEEDINGS 5099 - 5102 2006 [Refereed]

This paper proposes a parallel LSI architecture for an LDPC decoder which improves the message-passing schedule. The proposed LDPC decoder is characterized as follows: (i) The column operations follow the row operations in a pipelined architecture to ensure that the row and column operations are performed concurrently. (ii) The proposed parallel pipelined bit functional unit enables the decoder to perform every column operation using the messages which are updated by the row operations. These column operations can be performed without extending the single iterative decoding delay. Hardware implementation and simulation results show that the proposed decoder improves the decoding throughput and bit error performance with a small hardware overhead.

FCSCAN: An efficient multiscan-based test compression technique for test cost reduction

Youhua Shi, Nozomu Togawa, Shinji Kimura, Masao Yanagisawa, Tatsuo Ohtsuki

ASP-DAC 2006: 11TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, PROCEEDINGS 653 - 658 2006 [Refereed]

This paper proposes a new multiscan-based test input data compression technique by employing a Fan-out Compression Scan Architecture (FCSCAN) for test cost reduction. The basic idea of FCSCAN is to target the minority specified bits (either 1 or 0) in scan slices for compression. Due to the low specified-bit density in the test cube set, FCSCAN can significantly reduce the input test data volume and the number of required test channels, and so reduce test cost. The FCSCAN technique is easy to implement with small hardware overhead and does not need any special ATPG for test generation. In addition, based on a theoretical analysis of compression efficiency, improved procedures are also proposed for FCSCAN to achieve further compression. Experimental results on both benchmark circuits and one real industrial design indicate that a drastic reduction in test cost can indeed be achieved.

An interface-circuit synthesis method with configurable processor core in IP-based SoC designs

Shunitsu Kohara, Naoki Tomono, Jumpei Uchida, Yuichiro Miyaoka, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

ASP-DAC 2006: 11TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, PROCEEDINGS 594 - 599 2006 [Refereed]

In SoC designs, efficient communication between the hardware IPs and the on-chip processor becomes very important; however, the interface is usually affected by the processor core specification. Thus, in this paper, we focus on developing an efficient interface circuit architecture for communication between the on-chip processor and embedded hardware IP cores. We also propose a method to synthesize it. Experimental results from designing an MPEG-4 encoding application show that our method obtains optimal interface circuits and works well.

Memory-efficient accelerating schedule for LDPC decoder

Kazunori Shimizu, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

2006 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS 1317 - + 2006 [Refereed]

This paper proposes a memory-efficient accelerating schedule for LDPC decoder. Important properties of the proposed techniques are as follows: (i) Partitioning a pipelined operation not to read and write intermediate messages simultaneously enables the accelerated message-passing schedule to be implemented with single-port memories. (ii) FIFO-based buffering reduces the number of memory banks and words for the decoder based on the accelerated message-passing schedule. The proposed decoder reduces the memories for intermediate messages by half compared to the conventional one based on the accelerated message-passing schedule.

FCSCAN: An efficient multiscan-based test compression technique for test cost reduction

Youhua Shi, Nozomu Togawa, Shinji Kimura, Masao Yanagisawa, Tatsuo Ohtsuki

ASP-DAC 2006: 11TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, PROCEEDINGS 653 - 658 2006年

　概要を見る

This paper proposes a new multiscan-based test input data compression technique by employing a Fan-out Compression Scan Architecture (FCSCAN) for test cost reduction. The basic idea of FCSCAN is to target the minority specified 1 or 0 bits (either 1 or 0) in scan slices for compression. Due to the low specified bit density in test cube set, FCSCAN can significantly reduce input test data volume and the number of required test channels so as to reduce test cost. The FCSCAN technique is easy to be implemented with small hardware overhead and does not need any special ATPG for test generation. In addition, based on the theoretical compression efficiency analysis, improved procedures are also proposed for the FCSCAN to achieve further compression. Experimental results on both benchmark circuits and one real industrial design indicate that drastic reduction in test cost can be indeed achieved.
An interface-circuit synthesis method with configurable processor core in IP-based SoC designs

Shunitsu Kohara, Naoki Tomono, Jumpei Uchida, Yuichiro Miyaoka, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

ASP-DAC 2006: 11TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, PROCEEDINGS 594 - 599 2006年

　概要を見る

In SoC designs, efficient communication between the hardware IPs and the on-chip processor becomes very important; however, the interface is usually affected by the processor core specification. In this paper, we therefore focus on developing an efficient interface circuit architecture for communication between the on-chip processor and embedded hardware IP cores. We also propose a method to synthesize it. Experimental results from designing an MPEG-4 encoder application show that our method obtains optimal interface circuits and works well.
MPEG-4形状符号化/復号化に対応したDSP組み込み向け専用演算器の設計

古宇多朋史, 小原俊逸, 史又華, 戸川望, 柳澤政生, 大附辰夫

情報処理学会組込みシステムシンポジウム2006論文集(ESS2006) 2006年
連携処理を考慮したネットワークプロセッサ合成システム

中山敬史, 戸川望, 柳澤政生, 大附辰夫

情報処理学会DAシンポジウム2006論文集 2006年
レジスタ分散・共有併用型アーキテクチャを対象としたフロアプランを考慮した高位合成手法

大智輝, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会DAシンポジウム2006論文集 2006年
SIMD型プロセッサコアの自動合成のためのパイプライン演算ユニット生成手法

栗原輝, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会論文誌 vol. 47 ( no. 6 ) 2006年
H.264符号化向けDSPにおける動き予測演算器の設計

高橋豊和, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 CAS2006-10 ( VLD2006-23, SIP2006-33 ) 13 - 18 2006年

　概要を見る

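H.264/AVC achieves high coding efficiency, but encoding requires a large amount of computation, more than 90% of which is taken up by motion estimation. The main causes are the features introduced to improve coding efficiency: multiple reference frames, variable block sizes, and quarter-pixel motion compensation. To speed this up, integer-precision motion estimation architectures that support multiple reference frames and variable block sizes have been proposed. However, these architectures require irregular memory accesses when the search position moves, so performance is hard to improve in applications such as DSP-embedded units where memory bandwidth is limited. To address this problem, this paper proposes a DSP-embedded integer-precision motion estimation architecture that uses pixel subsampling. Pixel subsampling thins out the pixels used in the computation and is generally used to reduce hardware cost. By changing the subsampling pattern from the usual chessboard pattern to vertical stripes, the proposed architecture reduces the data-load cycles of the functional unit and speeds up motion estimation. When operated at 200 MHz, the proposed architecture can perform motion estimation on CIF images at 86.5 fps.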

CiNii
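The subsampling idea in the abstract above can be illustrated with a minimal Python sketch. It computes a sum of absolute differences (SAD) over a block using either a chessboard or a vertical-stripe pattern. The function names and the `columns_touched` measure are hypothetical, introduced here only as a rough proxy for how much data has to be loaded; they are not taken from the paper's architecture.

```python
def sad(cur, ref, pattern):
    """Sum of absolute differences over the pixels selected by pattern(x, y)."""
    total = 0
    for y, row in enumerate(cur):
        for x, pix in enumerate(row):
            if pattern(x, y):
                total += abs(pix - ref[y][x])
    return total


def full(x, y):
    return True


def chessboard(x, y):
    # Keep every other pixel, offset by one on alternate rows.
    return (x + y) % 2 == 0


def vertical_stripes(x, y):
    # Keep every other column in full.
    return x % 2 == 0


def columns_touched(pattern, width, height):
    """Hypothetical proxy for load cost: columns containing at least one used pixel."""
    return sum(1 for x in range(width)
               if any(pattern(x, y) for y in range(height)))


cur_block = [[10] * 4 for _ in range(4)]
ref_block = [[7] * 4 for _ in range(4)]
```

Both patterns use half of the pixels, but on a 4x4 block the chessboard pattern touches all four columns while the stripe pattern touches only two, which is the intuition behind the reduced load cycles.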
A parallel LSI architecture for LDPC decoder improving message-passing schedule

Kazunori Shimizu, Tatsuyuki Ishikawa, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

2006 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, PROCEEDINGS 5099 - + 2006年

　概要を見る

This paper proposes a parallel LSI architecture for an LDPC decoder that improves the message-passing schedule. The proposed LDPC decoder is characterized as follows: (i) The column operations follow the row operations in a pipelined architecture so that the row and column operations are performed concurrently. (ii) The proposed parallel pipelined bit functional unit enables the decoder to perform every column operation using the messages that have been updated by the row operations. These column operations can be performed without extending the delay of a single decoding iteration. Hardware implementation and simulation results show that the proposed decoder improves the decoding throughput and bit error performance with a small hardware overhead.
FCSCAN: An efficient multiscan-based data compression technique for test cost reduction

史又華, 戸川望, 木村晋二, 柳澤政生, 大附辰夫

Proc. IEEE Asia and South Pacific Design Automation Conference 2006 (ASP-DAC 2006) 653 - 658 2006年01月
An interface-circuit synthesizer with configurable processor core in IP-based SOC design

小原俊逸, 友野直紀, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

Proc. IEEE Asia and South Pacific Design Automation Conference 2006 (ASP-DAC 2006) 594 - 599 2006年01月
重回帰分析により得られた1次式によるインダクタンスを考慮した配線遅延の見積り

鈴木康成, マルタディナタアンワル, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 105 ( VLD2005-72 ) 67 - 72 2005年12月

　概要を見る

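In the DSM (deep submicron technology) era, floorplans, wire resistance, and similar factors must be considered during high-level design. In addition, the effect of inductance cannot be ignored when estimating global interconnect delay, which is done repeatedly in high-level design. This paper describes a method for estimating global interconnect delay that takes inductance into account. We estimate the time for the step response of a driver, RLC interconnect, and load model to reach 50% (the 50% delay). The proposed estimate uses a first-order (linear) expression obtained in advance by multiple regression analysis with the element values as explanatory variables. The method is applicable when time of flight dominates the delay, and its estimates differ from values computed with SPICE by at most about 15% and by about 2.5% on average.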

CiNii
レジスタ分散・共有アーキテクチャを対象としたフロアプラン指向高位合成手法

大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 105 ( VLD2005-66 ) 31 - 36 2005年12月

　概要を見る

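As LSI process technology continues to shrink, interconnect delay has grown relative to gate delay, so the floorplan must be considered even at the high-level synthesis stage. A distributed-register architecture can eliminate the impact of interconnect delay on circuit performance by using register-to-register data transfers, but it increases the number of registers and therefore the area. This paper targets a distributed/shared-register architecture that combines the distributed-register and shared-register types, and proposes a high-level synthesis method that iterates (1) scheduling, (2) register allocation, (3) register binding, and (4) module placement, feeding the floorplan information obtained in (4) back so that the solution converges. The method reduces area while maintaining circuit performance equivalent to that of a distributed-register architecture. Computer experiments demonstrate the effectiveness of the proposed method.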

CiNii
SIMD型プロセッサの自動合成におけるパイプライン演算ユニット生成手法

栗原輝, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会DAシンポジウム2005論文集 25 - 30 2005年08月

CiNii
画像処理向けシステムLSI設計における設計ナビゲーションを考慮したHW/SW分割システム

小島洋平, 戸川望, 橘昌良, 柳澤政生, 大附辰夫

情報処理学会DAシンポジウム2005論文集 19 - 24 2005年08月
Reconfigurable adaptive FEC system based on Reed-Solomon code with interleaving

K Shimizu, N Togawa, T Ikenaga, S Goto

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E88D ( 7 ) 1526 - 1537 2005年07月 [査読有り]

　概要を見る

This paper proposes a reconfigurable adaptive FEC system based on Reed-Solomon (RS) code with interleaving. In adaptive FEC schemes, the error correction capability t is changed dynamically according to the communication channel condition. For a given error correction capability t, we can implement an optimal RS decoder composed of the minimum hardware units for that t. If the hardware units of the RS decoder can be reduced for a given t, we can embed as large a deinterleaver as possible into the RS decoder for each t. Dynamically reconfiguring the RS decoder with its expanded deinterleaver for each error correction capability t allows us to decode larger interleaved codes, which are more robust against burst errors. In a reliable transport protocol, experimental results show that our system achieves up to 65% lower packet error rate and 5.9% higher data transmission throughput compared to the adaptive FEC scheme on a conventional fixed hardware system. In an unreliable transport protocol, our system achieves up to 76% better bit error performance with a higher code rate compared to the adaptive FEC scheme on a conventional fixed hardware system.

DOI

Scopus

3

被引用数

(Scopus)
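To see why a deeper deinterleaver makes an RS code more robust to burst errors, the following Python sketch models a plain block interleaver and counts how many symbols of each codeword a channel burst hits. The block-interleaver layout, the function names, and the parameters (`depth`, `n`, a burst of 8 symbols) are illustrative assumptions, not details of the system described in the abstract.

```python
def interleave(symbols, depth, n):
    """Write `depth` codewords of length n row by row, read them out column by column."""
    assert len(symbols) == depth * n
    return [symbols[r * n + c] for c in range(n) for r in range(depth)]


def deinterleave(symbols, depth, n):
    """Inverse of interleave()."""
    assert len(symbols) == depth * n
    return [symbols[c * depth + r] for r in range(depth) for c in range(n)]


def errors_per_codeword(burst_start, burst_len, depth):
    """Count the symbols hit in each codeword by one burst on the interleaved stream."""
    hit = [0] * depth
    for pos in range(burst_start, burst_start + burst_len):
        hit[pos % depth] += 1  # stream position pos belongs to codeword pos % depth
    return hit


data = list(range(60))  # 4 codewords of 15 symbols each
assert deinterleave(interleave(data, 4, 15), 4, 15) == data

spread = errors_per_codeword(burst_start=10, burst_len=8, depth=4)
unspread = errors_per_codeword(burst_start=10, burst_len=8, depth=1)
```

With depth 4 the 8-symbol burst leaves two symbol errors in each codeword, which an RS code with t = 2 can correct; without interleaving all eight errors fall in one codeword.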
A SIMD instruction set and functional unit synthesis algorithm with SIMD operation decomposition

N Togawa, K Tachikake, Y Miyaoka, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E88D ( 7 ) 1340 - 1349 2005年07月 [査読有り]

　概要を見る

This paper focuses on SIMD processor synthesis and proposes a SIMD instruction set/functional unit synthesis algorithm. Given an initial assembly code and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with optimal SIMD functional units. It also synthesizes a SIMD instruction set. The input initial assembly code is assumed to run on a full-resource SIMD processor (virtual processor) which has all the possible SIMD functional units. In our algorithm, we introduce the SIMD operation decomposition and apply it to the initial assembly code and the full-resource SIMD processor. By gradually reducing SIMD operations or decomposing SIMD operations, we can finally find a processor core with small area under the given timing constraint. The promising experimental results are also shown.

DOI

Scopus

1

被引用数

(Scopus)
Reconfigurable adaptive FEC system based on Reed-Solomon code with interleaving

IEICE Trans. on Information and Systems E88-D ( 7 ) 1538 - 1545 2005年07月

DOI

Scopus

3

被引用数

(Scopus)
A SIMD instruction set and functional unit synthesis algorithm with SIMD operation decomposition

IEICE Trans. on Information and Systems E88-D ( 7 ) 1340 - 1349 2005年07月

DOI

Scopus

1

被引用数

(Scopus)
レジスタ分散型アーキテクチャを対象とするフロアプランとタイミング制約を考慮した高位合成手法

田中真, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会論文誌 46 ( 6 ) 1383 - 1394 2005年05月
Sub-operation parallelism optimization in SIMD processor core synthesis

H Kawazu, J Uchida, Y Miyaoka, N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E88A ( 4 ) 876 - 884 2005年04月 [査読有り]

　概要を見る

A b-bit SIMD functional unit has n k-bit sub-functional units in itself, where b = k × n. It can execute n-parallel k-bit operations. However, not all the b-bit functional units in a processor core necessarily execute n-parallel operations. Depending on the application program, some of them execute only n/2-parallel operations or even n/4-parallel operations. This means that we can modify a b-bit SIMD functional unit so that it has n/2 k-bit sub-functional units or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called its sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. Our proposed algorithm gradually reduces the sub-operation parallelism of a SIMD functional unit while the timing constraint on execution time is satisfied. Thereby, we can finally find a processor core with small area under the given timing constraint. We expect to obtain processor core configurations of smaller area under the same timing constraint than with a conventional system. Promising experimental results are also shown.

DOI

Scopus
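The gradual reduction of sub-operation parallelism described above can be sketched as a greedy loop. This is only an illustration under stated assumptions: halving one unit per step, picking the largest area saving, and the toy `toy_time` and `toy_area` cost models are all hypothetical and are not the cost functions or search order of the published algorithm.

```python
def greedy_reduce(parallelism, exec_time, area, time_limit):
    """Halve the parallelism of one unit at a time while exec_time stays within time_limit."""
    current = dict(parallelism)
    while True:
        best_saving, best_trial = 0, None
        for unit, n in current.items():
            if n == 1:
                continue
            trial = dict(current, **{unit: n // 2})
            if exec_time(trial) > time_limit:
                continue  # this reduction would violate the timing constraint
            saving = area(current) - area(trial)
            if saving > best_saving:
                best_saving, best_trial = saving, trial
        if best_trial is None:
            return current
        current = best_trial


# Toy model (hypothetical): each unit runs `count` operations of `width` sub-operations.
WORKLOAD = {"add": [(4, 10)], "mul": [(2, 10)]}


def toy_time(p):
    # An operation of width w on a unit with parallelism n takes ceil(w / n) cycles.
    return sum(count * -(-width // p[unit])
               for unit, ops in WORKLOAD.items() for width, count in ops)


def toy_area(p):
    return 10 * sum(p.values())


result = greedy_reduce({"add": 4, "mul": 4}, toy_time, toy_area, time_limit=20)
```

In this toy workload the multiplier only ever runs 2-parallel operations, so its parallelism drops from 4 to 2 without changing the cycle count, while the adder must stay at 4.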
IP再利用を考慮したシステムLSI設計におけるインタフェース回路生成システム

小原俊逸, 友野直紀, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会第18回回路とシステム軽井沢ワークショップ論文集 581 - 586 2005年04月
SIMD型プロセッサコア向けHW/SW協調合成システムにおけるパイプライン演算ユニット生成手法

栗原輝, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会第18回回路とシステム軽井沢ワークショップ論文集 575 - 580 2005年04月
A selective care bits coding method for test data compression

史又華, 戸川望, 木村晋二, 柳澤政生, 大附辰夫

電子情報通信学会第18回回路とシステム軽井沢ワークショップ論文集 241 - 246 2005年04月
信頼度の伝播効率を改善する部分並列LDPC復号器の実装と評価

清水一範, 石川達之, 戸川望, 池永剛, 後藤敏

電子情報通信学会第18回回路とシステム軽井沢ワークショップ論文集 181 - 186 2005年04月

CiNii
インダクタンスを考慮した配線遅延の近似式による見積もり

鈴木康成, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会第18回回路とシステム軽井沢ワークショップ論文集 1 - 6 2005年04月
ネットワークプロセッサ合成システムの改良とその評価

升本英行, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2004 2005年03月
動的フローに適応したネットワークプロセッサ設計とその評価

細田宗一郎, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2004 ( 709 ) 79 - 84 2005年03月

　概要を見る

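This paper proposes a network processor that adapts to dynamic flows. The proposed network processor has a hardware mechanism that identifies the bottleneck among its processing units based on command-queue information and the cycle counts required by each processing unit, selects the processing to perform next so as to improve overall throughput, and executes it. The Dynamic-Micro Packet Processor, which combines this processing-state management hardware with the instructions and datapaths required by each processing unit in the network processor, parallelizes the bottleneck processing as needed and thereby improves the throughput of the whole network processor. We implement the proposed architecture in VHDL, a hardware description language, and evaluate its effectiveness through computer experiments.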

CiNii
ワードベースモンゴメリ乗算器を搭載した高速楕円曲線暗号LSI

内田純平, 奈良竜太, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2004 ( 708 ) 5 - 10 2005年03月

　概要を見る

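Elliptic curve cryptography is attracting attention as a next-generation public-key cryptosystem. Its security rests on the discrete logarithm problem on elliptic curves, so a 160-bit key provides security comparable to that of 1024-bit RSA. This paper proposes an elliptic curve cryptography LSI equipped with a word-based Montgomery multiplier. Montgomery multiplication is an algorithm for multiplication over a finite field, and by changing the configuration of the word-based Montgomery multiplier, a small-area, high-speed elliptic curve cryptography LSI can be realized. We evaluate the proposed LSI through computer experiments. In the evaluation, an elliptic curve cryptography LSI with an 8-bit multiplier in a 0.18 μm process used 126 Kgates and took 3.6 ms to execute a 160-bit scalar multiplication.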

CiNii
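As a software illustration of word-based Montgomery multiplication, here is a minimal Python sketch that processes one operand in 8-bit words, matching the 8-bit multiplier and 160-bit operand sizes mentioned above. It assumes integer arithmetic modulo an odd modulus, which may differ from the field used in the LSI; the function names and the example modulus are hypothetical. It needs Python 3.8 or later for `pow(n, -1, m)`.

```python
def montgomery_multiply(a, b, n, word_bits=8, num_words=20):
    """Return a * b * R^(-1) mod n, where R = 2**(word_bits * num_words).

    Requires n odd, n < R, and 0 <= a, b < n.
    """
    mask = (1 << word_bits) - 1
    # n_prime = -n^(-1) mod 2^word_bits, precomputed once per modulus
    n_prime = (-pow(n, -1, 1 << word_bits)) & mask
    t = 0
    for i in range(num_words):
        a_i = (a >> (word_bits * i)) & mask   # i-th word of a
        t += a_i * b
        m = ((t & mask) * n_prime) & mask     # makes t + m * n divisible by 2^word_bits
        t = (t + m * n) >> word_bits
    return t - n if t >= n else t


def to_montgomery(x, n, word_bits=8, num_words=20):
    """Map x to its Montgomery form x * R mod n."""
    return (x << (word_bits * num_words)) % n


n = (1 << 160) - 47            # an odd 160-bit modulus chosen only for illustration
a, b = 123456789, 987654321
prod_m = montgomery_multiply(to_montgomery(a, n), to_montgomery(b, n), n)
product = montgomery_multiply(prod_m, 1, n)   # convert back out of Montgomery form
```

Each loop iteration needs only word-by-operand products and a shift, which is why the word size is the natural knob for trading area against speed in hardware.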
面積制約を考慮したマルチスレッドプロセッサの合成手法

麻生雄一, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 2005年03月
Sub-operation parallelism optimization in SIMD processor synthesis and its experimental evaluations

N Togawa, Y Miyaoka, H Kawazu, M Yanagisawa, J Uchida, T Ohtsuki

2005 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), VOLS 1-6, CONFERENCE PROCEEDINGS 3499 - 3502 2005年 [査読有り]

　概要を見る

In this paper, we propose a sub-operation parallelism optimization algorithm in SIMD processor synthesis. Given an initial assembly code and timing constraints, our algorithm synthesizes a processor core with sub-operation parallelism optimization for SIMD functional units. First we consider an initial processor which has sufficient hardware units for executing an initial assembly code. An initial processor core includes the maximum sub-operation parallelism for each SIMD functional unit. By gradually reducing sub-operation parallelism, we can finally have a processor core with small area meeting the given timing constraints. We show the effectiveness of our proposed algorithm through experimental results.

DOI

Scopus
Partially-parallel LDPC decoder based on high-efficiency message-passing algorithm

K Shimizu, T Ishikawa, N Togawa, T Ikenaga, S Goto

2005 IEEE International Conference on Computer Design: VLSI in Computers & Processors, Proceedings 503 - 510 2005年 [査読有り]

　概要を見る

This paper proposes a partially-parallel LDPC decoder based on a high-efficiency message-passing algorithm. Our proposed partially-parallel LDPC decoder performs the column operations for bit nodes in conjunction with the row operations for check nodes. The bit functional unit with a pipelined architecture in our LDPC decoder allows us to perform, in parallel, column operations for every bit node connected to each of the check nodes that are updated by the row operations. Our proposed LDPC decoder improves the timing at which the column operations are performed, and accordingly improves the message-passing efficiency within the limited number of decoding iterations. We implemented the proposed partially-parallel LDPC decoder on an FPGA and simulated its decoding performance. Practical simulation shows that our proposed LDPC decoder reduces the number of decoding iterations and improves the bit error performance with a small hardware overhead.

DOI

Scopus

32

被引用数

(Scopus)
Low power test compression technique for designs with multiple scan chains

YH Shi, N Togawa, S Kimura, M Yanagisawa, T Ohtsuki

14TH ASIAN TEST SYMPOSIUM, PROCEEDINGS 386 - 389 2005年 [査読有り]

　概要を見る

This paper presents a new DFT technique that can significantly reduce test data volume as well as scan-in power consumption for multiscan-based designs. It can also help to reduce test time and tester channel requirements with small hardware overhead. In the proposed approach, we start with a pre-computed test cube set and fill the don't-cares with proper values for joint reduction of test data volume and scan power consumption. In addition, we explore the linear dependencies of the scan chains to construct a fanout structure using only inverters to achieve further compression. Experimental results for the larger ISCAS'89 benchmarks show the efficiency of the proposed technique.

DOI

Scopus

17

被引用数

(Scopus)
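How the choice of don't-care values affects scan-in switching can be shown with a small Python sketch. It uses adjacent fill (each X repeats the nearest specified bit before it), a common fill heuristic substituted here for illustration; the abstract does not say which fill rule the paper uses, and the function names are hypothetical.

```python
def adjacent_fill(cube, default="0"):
    """Replace each 'X' with the last specified bit seen; leading X's copy the first one."""
    last = next((bit for bit in cube if bit != "X"), default)
    filled = []
    for bit in cube:
        if bit != "X":
            last = bit
        filled.append(last)
    return "".join(filled)


def transitions(bits):
    """Number of adjacent bit pairs that differ, a simple proxy for scan-in switching."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


cube = "X1XX0X1"
filled = adjacent_fill(cube)
```

For this cube the fill gives `1111001` with two transitions, whereas filling every X with 0 would give `0100001` with three.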
Reconfigurable adaptive FEC system with interleaving

Kazunori Shimizu, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

ASP-DAC 2005: PROCEEDINGS OF THE ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2 1252 - 1255 2005年 [査読有り]

　概要を見る

This paper proposes a reconfigurable adaptive FEC system with interleaving. For adaptive FEC schemes, we can implement an optimal RS decoder composed of the minimum hardware units for any given error correction capability t. If the hardware units of the RS decoder can be reduced for a given t, we can embed as large a deinterleaver as possible into the RS decoder for each t. Dynamically reconfiguring the RS decoder with its expanded deinterleaver for each t allows us to decode larger interleaved codes, which are more robust against burst errors. Our reconfigurable adaptive FEC system with interleaving achieves a better packet error rate and higher throughput than fixed hardware systems.
A processor core synthesis system in IP-based SoC design

Naoki Tomono, Shunitsu Kohara, Jumpei Uchida, Yuichiro Miyaoka, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

ASP-DAC 2005: PROCEEDINGS OF THE ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2 286 - 291 2005年 [査読有り]

　概要を見る

This paper proposes a new design methodology for SoCs that reuse hardware IPs. In our approach, after system-level HW/SW partitioning, we use IPs for the hardware parts but synthesize a new processor core instead of reusing a processor core IP. The system executes hardware and software efficiently in parallel by taking into account the response time of each hardware IP, obtained by the proposed calculation algorithm. We can use optimal hardware IPs selected by the proposed hardware IP selection algorithm. The experimental results show the effectiveness of our new design methodology.
Sub-operation parallelism optimization in SIMD processor synthesis and its experimental evaluations

Nozomu Togawa, Hideki Kawazu, Jumpei Uchida, Yuichiro Miyaoka, Masao Yanagisawa, Tatsuo Ohtsuki

Proceedings - IEEE International Symposium on Circuits and Systems 3499 - 3502 2005年

　概要を見る

In this paper, we propose a sub-operation parallelism optimization algorithm in SIMD processor synthesis. Given an initial assembly code and timing constraints, our algorithm synthesizes a processor core with sub-operation parallelism optimization for SIMD functional units. First we consider an initial processor which has sufficient hardware units for executing an initial assembly code. An initial processor core includes the maximum sub-operation parallelism for each SIMD functional unit. By gradually reducing sub-operation parallelism, we can finally have a processor core with small area meeting the given timing constraints. We show the effectiveness of our proposed algorithm through experimental results. © 2005 IEEE.

DOI

Scopus
Sub-operation parallelism optimization in SIMD processor core synthesis

Hideki Kawazu, Jumpei Uchida, Yuichiro Miyaoka, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E88-A ( 4 ) 876 - 883 2005年

　概要を見る

A b-bit SIMD functional unit has n k-bit sub-functional units in itself, where b = k × n. It can execute n-parallel k-bit operations. However, not all the b-bit functional units in a processor core necessarily execute n-parallel operations. Depending on the application program, some of them execute only n/2-parallel operations or even n/4-parallel operations. This means that we can modify a b-bit SIMD functional unit so that it has n/2 k-bit sub-functional units or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called its sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. Our proposed algorithm gradually reduces the sub-operation parallelism of a SIMD functional unit while the timing constraint on execution time is satisfied. Thereby, we can finally find a processor core with small area under the given timing constraint. We expect to obtain processor core configurations of smaller area under the same timing constraint than with a conventional system. Promising experimental results are also shown. Copyright © 2005 The Institute of Electronics, Information and Communication Engineers.

DOI

Scopus
A Processor Core Synthesis System in IP-based SoC Design

友野直紀, 小原俊逸, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

Proceedings of the ASP-DAC 2005 2005年01月
レジスタ分散型アーキテクチャを対象とするフロアプランを考慮した高位合成手法

田中真, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2004 2004年12月

CiNii
FPGA-based reconfigurable adaptive FEC

K Shimizu, J Uchida, Y Miyaoka, N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E87A ( 12 ) 3036 - 3046 2004年12月

　概要を見る

In this paper, we propose a reconfigurable adaptive FEC system. In adaptive FEC schemes, the error correction capability t is changed dynamically according to the communication channel condition. If a particular error correction capability t is given, we can implement an FEC decoder which is optimal for t by taking the number of operations into consideration. Thus, reconfiguring the optimal FEC decoder dynamically for each error correction capability allows us to maximize the throughput of each decoder within a limited hardware resource. Based on this concept, our reconfigurable adaptive FEC system can reduce the packet dropping rate more efficiently than conventional fixed hardware systems. We can improve data transmission throughput for a reliable transport protocol. Practical simulation results are also shown.
High-level power optimization based on thread partitioning

J Uchida, N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E87A ( 12 ) 3075 - 3082 2004年12月

　概要を見る

This paper proposes a thread partitioning algorithm for low-power high-level synthesis. The algorithm is applied to high-level synthesis systems in which circuit blocks that behave in parallel (threads) can be described explicitly. First, it focuses on a local register file RF in a thread. It partitions the thread into two sub-threads, one of which has RF and the other of which does not have RF. The partitioned sub-threads need to be synchronized with each other to preserve the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. We can then synthesize a low-power circuit with a low area overhead compared to the original circuit. Experimental results demonstrate the effectiveness and efficiency of the algorithm.
A sub-operation parallelism optimization algorithm in HW/SW partitioning for SIMD processor cores

川津秀樹, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

SASIMI2004 483 - 490 2004年10月
IP再利用を考慮したシステムLSIにおけるプロセッサコア合成システム

友野直紀, 小原俊逸, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会 DAシンポジウム 2004 19 - 24 2004年07月
フロアプランとタイミング制約に基づくレジスタ間データ転送を考慮した高位合成手法

田中真, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会 DAシンポジウム 2004 283 - 288 2004年07月
SIMD型プロセッサコア向けHW/SW分割におけるSIMD型演算最適化手法

川津秀樹, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会第17回回路とシステム(軽井沢)ワークショップ 579 - 584 2004年04月
A hardware/software cosynthesis algorithm for processors with heterogeneous datapaths

Y Miyaoka, N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E87A ( 4 ) 830 - 836 2004年04月

　概要を見る

This paper proposes a hardware/software cosynthesis algorithm for processors with heterogeneous registers. Given a CDFG corresponding to an application program and a timing constraint, the algorithm generates a processor configuration that minimizes the area of the processor, together with an assembly code for the processor. First, the algorithm configures a datapath which can execute several DFG nodes with data dependency in one cycle. The datapath can execute the application program in the fewest cycles. A branch-and-bound algorithm is applied, and all candidate numbers of functional units and memory banks are tried. For an assumed number of functional units and memory banks, an appropriate number of heterogeneous registers and their connections to functional units and registers are explored. The experimental results show the effectiveness and efficiency of the algorithm.
ネットワークプロセッサ合成システム

松浦努, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2003-145 55 - 60 2004年03月
HW/SW分割システムにおける仮想IP類推手法

小田雄一, 内田純平, 宮岡祐一郎, 戸川望, 橘昌良, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2003-158 ( 703 ) 47 - 52 2004年03月

　概要を見る

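This paper proposes a method that infers new IPs from existing IPs by focusing on the degree of parallelism and the repetitiveness of the processing. The method targets image processing applications such as MPEG-4, JPEG, and JPEG2000. It analyzes the processes that make up these applications (DCT, quantization, and so on) and applies to each either an inference method based on the degree of parallelism or an inference method based on repetitiveness. Applying the proposed method widens the solution search space. As a result, solutions with the same processing time and smaller area can be obtained compared with solutions built only from existing IPs. We implemented the proposed method on a computer and report its effectiveness.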

CiNii
Packed SIMD型命令を持つプロセッサ合成システムのためのリターゲッタブルコンパイラ

加藤久晴, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2003-157 41 - 46 2004年03月
面積制約を考慮したCAMプロセッサ最適化手法

石川裕一朗, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2003-152 13 - 18 2004年03月
インターリーブを考慮したReconfigurable Adaptive FEC

清水一範, 内田純平, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2003-151 ( 703 ) 7 - 12 2004年03月

　概要を見る

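This paper proposes a Reconfigurable Adaptive FEC that takes interleaving into account. By dynamically optimizing the hardware configuration of the error correction circuit for each error correction capability, the interleaving depth that the hardware can support can be set per correction capability. This makes it possible to generate codes that are more robust to burst errors than with conventional fixed hardware such as ASICs. We propose a hardware configuration that combines decoder optimization for error correction with interleaving, and evaluate its effectiveness by computer simulation.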

CiNii
携帯機器を対象としたJava動的コンパイラにおけるプロファイリングシステム

船田雅史, 内田純平, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告，2004-MBL-28 2004 ( 21 ) 55 - 62 2004年03月

　概要を見る

This paper proposes a lightweight profiling system for a Java dynamic compiler targeting handheld devices. During execution of the Java virtual machine, the system detects the methods that the application invokes frequently (hot methods). A hot method is compiled into native code by the compiler and stored in a heap area. Because the profiler determines the heap area used for native code, garbage collection can be invoked less often. The proposed profiler runs with an overhead of about 3% of the processing time of the Java virtual machine. By keeping garbage collection to about twice that of the original virtual machine and compiling the methods identified by profiling, we achieved a speedup of about 7 times on average.

CiNii
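The hot-method idea above can be sketched with an invocation counter and a fixed budget for native code. This is a simplified model written in Python for brevity: the class name, the threshold, and the byte budget are hypothetical, and the actual profiler's detection rule and heap policy are not described in the abstract.

```python
from collections import Counter


class HotMethodProfiler:
    """Counts invocations and decides when a method should be compiled to native code."""

    def __init__(self, threshold, code_budget):
        self.counts = Counter()
        self.threshold = threshold      # invocations before a method counts as hot
        self.code_budget = code_budget  # bytes of heap reserved for native code (assumed)
        self.used = 0
        self.compiled = set()

    def on_invoke(self, method, native_size):
        """Record one invocation; return True if the method should be compiled now."""
        if method in self.compiled:
            return False
        self.counts[method] += 1
        if self.counts[method] < self.threshold:
            return False
        if self.used + native_size > self.code_budget:
            return False  # compiling would overflow the native-code area
        self.used += native_size
        self.compiled.add(method)
        return True


profiler = HotMethodProfiler(threshold=3, code_budget=100)
f_decisions = [profiler.on_invoke("f", native_size=60) for _ in range(4)]
g_decisions = [profiler.on_invoke("g", native_size=60) for _ in range(3)]
```

Capping the space given to compiled code is one simple way to keep native code from crowding the heap, which is the kind of pressure that would otherwise trigger extra garbage collections.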
Alternative Run-Length Coding through scan chain reconfiguration for joint minimization of test data volume and power consumption in scan test

YH Shi, S Kimura, N Togawa, M Yanagisawa, T Ohtsuki

13TH ASIAN TEST SYMPOSIUM, PROCEEDINGS 432 - 437 2004年 [査読有り]

　概要を見る

Test data volume and scan power are two major concerns in SoC test. In this paper we present an alternative run-length coding method through scan chain reconfiguration to reduce both test data volume and scan-in power consumption. The proposed method analyzes the compatibility of the internal scan cells for a given test set and then divides the scan cells into compatible classes. To extract the compatible scan cells we apply a heuristic algorithm for the graph coloring problem, and then a simple greedy algorithm is used to configure the scan chain for the minimization of scan power. Experimental results for the larger ISCAS'89 benchmarks show that the proposed approach leads to highly reduced test data volume with significant power savings during scan test.

DOI

Scopus

2

被引用数

(Scopus)
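Grouping scan cells into compatible classes can be viewed as coloring a conflict graph, as the abstract above notes. The Python sketch below uses first-fit greedy coloring as a stand-in heuristic; the paper's own heuristic is not specified here, and the function names and the string encoding of test cubes are assumptions made for illustration.

```python
def conflict(col_a, col_b):
    """Two scan cells conflict if some test cube specifies opposite values for them."""
    return any(a != "X" and b != "X" and a != b for a, b in zip(col_a, col_b))


def compatible_classes(test_cubes):
    """Group scan cells so that no two cells in a class conflict (first-fit coloring).

    test_cubes: equal-length strings over '0', '1', 'X', one character per scan cell.
    Returns a list of classes, each a list of scan cell indices.
    """
    n_cells = len(test_cubes[0])
    columns = [[cube[i] for cube in test_cubes] for i in range(n_cells)]
    classes = []
    for cell in range(n_cells):
        for members in classes:
            if not any(conflict(columns[cell], columns[m]) for m in members):
                members.append(cell)
                break
        else:
            classes.append([cell])  # no existing class fits, so open a new one
    return classes


cubes = ["1X0X", "X10X", "1X01"]
classes = compatible_classes(cubes)
```

Cells in the same class never need opposite values in any cube, so they can share one value per test vector; here cells 0, 1, and 3 form one class and cell 2 stands alone.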
Instruction set and functional unit synthesis for SIMD processor cores

N Togawa, K Tachikake, Y Miyaoka, M Yanagisawa, T Ohtsuki

ASP-DAC 2004: PROCEEDINGS OF THE ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 743 - 750 2004年 [査読有り]

　概要を見る

This paper focuses on SIMD processor synthesis and proposes a SIMD instruction set/functional unit synthesis algorithm. Given an initial assembly code and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with optimal SIMD functional units. It also synthesizes a SIMD instruction set. The input initial assembly code is assumed to run on a full-resource SIMD processor (virtual processor) which has all the possible SIMD functional units. In our algorithm, we introduce the SIMD operation decomposition and apply it to the initial assembly code and the full-resource SIMD processor. By gradually reducing SIMD operations or decomposing SIMD operations, we can finally find a processor core with small area under the given timing constraint. The promising experimental results are also shown.
A cosynthesis algorithm for application specific processors with heterogeneous datapaths

Y Miyaoka, N Togawa, M Yanagisawa, T Ohtsuki

ASP-DAC 2004: PROCEEDINGS OF THE ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 250 - 255 2004年 [査読有り]

This paper proposes a hardware/software cosynthesis algorithm for processors with heterogeneous registers. Given a CDFG corresponding to an application program and a timing constraint, the algorithm generates a processor configuration that minimizes the area of the processor, together with an assembly code for the processor. First, the algorithm configures a datapath which can execute several DFG nodes with data dependency in one cycle. The datapath can execute the application program in the fewest cycles. A branch-and-bound algorithm is applied and all numbers of functional units and memory banks are tried. For an assumed number of functional units and memory banks, an appropriate number of heterogeneous registers and the connections between functional units and registers are explored. The experimental results show the effectiveness and efficiency of the algorithm.
A thread partitioning algorithm in low power high-level synthesis

J Uchida, N Togawa, M Yanagisawa, T Ohtsuki

ASP-DAC 2004: PROCEEDINGS OF THE ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 74 - 79 2004年 [査読有り]

This paper proposes a thread partitioning algorithm in low power high-level synthesis. The algorithm is applied to high-level synthesis systems in which we can explicitly describe circuit blocks that behave in parallel (threads). First it focuses on a local register file RF in a thread. It partitions a thread into two sub-threads, one of which has RF and the other does not have RF. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. Then we can synthesize a low power circuit with a low area overhead, compared to the original circuit. Experimental results demonstrate the effectiveness and efficiency of the algorithm.
A reconfigurable adaptive FEC system for reliable wireless communications

K Shimizu, N Togawa, T Ikenaga, M Yanagisawa, S Goto, T Ohtsuki

PROCEEDINGS OF THE 2004 IEEE ASIA-PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS, VOL 1 AND 2 13 - 16 2004年

This paper proposes a reconfigurable adaptive FEC system. For adaptive FEC schemes, we can implement an FEC decoder which is optimal for error correction capability t by taking the number of operations into consideration. Reconfiguring the optimal FEC decoder dynamically for each t allows us to maximize the throughput of each decoder within a limited hardware resource. Our system can reduce packet dropping rate more efficiently than conventional fixed hardware systems for a reliable transport protocol.
Experimental evaluation of high-level energy optimization based on thread partitioning

J Uchida, Y Miyaoka, N Togawa, M Yanagisawa, T Ohtsuki

PROCEEDINGS OF THE 2004 IEEE ASIA-PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS, VOL 1 AND 2 161 - 164 2004年

This paper presents a thread partitioning algorithm for high-level synthesis systems which generate low energy circuits. In the algorithm, we partition a thread into two sub-threads, one of which has RF and the other does not have RF. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. We achieve 33% energy reduction when we apply our proposed algorithm to a JPEG encoder.
An efficient algorithm/architecture codesign for image encoders

J Choi, N Togawa, T Ikenaga, S Goto, M Yanagisawa, T Ohtsuki

2004 47TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, CONFERENCE PROCEEDINGS 469 - 472 2004年

We describe the optimization of a complex video encoder system based on a target architecture. We implemented an MPEG-4 encoder using a hardware/software codesign approach, with hardware and software mapped together based on a target architecture. We proposed a target architecture template and an optimization methodology. In our design flow, we searched for the bottleneck module constraining the system. After investigating the computational complexity, quality, and simplicity of candidate algorithms, we chose the best algorithm for hardware implementation and then mapped the selected algorithm onto hardware with different architectures to determine which architecture is best for the algorithm and for its components. We chose one of the architectures that meets the constraints and made tradeoffs among speed, chip area, and memory bandwidth across the different architectures. The proposed system architecture reduced design decisions and iterations and provided flexible and scalable systems. The evaluations resulted in effective optimization of the motion estimation module and better tradeoffs that optimized the overall system.
Reducing test data volume for multiscan-based designs through single/sequence mixed encoding

Y Shi, S Kimura, N Togawa, M Yanagisawa, T Ohtsuki

2004 47TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, CONFERENCE PROCEEDINGS 445 - 448 2004年

This paper presents a new test data compression technique for multiscan-based designs through dictionary-based encoding on single scan-inputs or sequences of scan-inputs. In spite of its simplicity, it achieves significant reduction in test data volume. Unlike some previous approaches to test data compression, our approach eliminates the need for additional synchronization and handshaking between the CUT and the ATE, so it is especially suitable for integration into a low-cost test scheme for SoC test. In addition, in contrast to previous dictionary-based coding techniques, the proposed approach can achieve satisfactory reduction in test data volume even for a CUT with a small number of scan chains. Experimental results showed that the proposed test scheme works particularly well for the large ISCAS'89 benchmarks.
A hardware/software partitioning algorithm for processor cores with packed SIMD-type instructions

N Togawa, K Tachikake, Y Miyaoka, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E86A ( 12 ) 3218 - 3224 2003年12月

This letter proposes a new hardware/software partitioning algorithm for processor cores with SIMD instructions. Given a compiled assembly code including SIMD instructions and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with a new assembly code. Firstly, we assume for each operation type a super SIMD functional unit which can execute all the SIMD instructions. Secondly, we remove SIMD instructions or "sub-functions" from each super functional unit, one by one, while the timing constraint is satisfied. At the same time, we update the assembly code so that it can run on the new processor configuration. By repeating this process, we finally find a SIMD functional unit configuration as well as a processor core architecture. Promising experimental results are also shown.
A retargetable simulator generator for DSP processor cores with packed SIMD-type instructions

N Togawa, K Kasahara, Y Miyaoka, J Choi, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E86A ( 12 ) 3099 - 3109 2003年12月

A packed SIMD type operation, or a SIMD operation, is a set of n-parallel b/n-bit sub-operations executed by a modified b-bit functional unit. Such a functional unit is called a SIMD functional unit, and a processor core which can execute SIMD operations is called a SIMD processor core. SIMD operations can be effectively applied to image processing applications. This paper focuses on hardware/software cosynthesis of SIMD processor cores and particularly proposes a new simulator generator which simulates pipelined instructions for a SIMD processor. Generally, a SIMD functional unit has many options, so there can be many different SIMD functional unit instances. However, since our hardware/software cosynthesis system synthesizes a special-purpose processor core for an input application program, it uses very limited SIMD functional unit instances. In the proposed approach, we consider a SIMD operation to be a set of SIMD sub-operations. By adding up the appropriate SIMD sub-operations, we construct a single SIMD operation. Then a SIMD functional unit behavior can be characterized by a collection of SIMD operations. This approach has the advantage that, if we have a small number of behavior libraries for SIMD sub-operations, we can instantiate a particular SIMD functional unit behavior. Experimental results demonstrate the effectiveness of the proposed approach.
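To make the definition of a packed SIMD operation in the abstract above concrete, the following minimal sketch models one such operation in software: a b-bit word is treated as n independent lanes of b/n bits, and each lane is added with wraparound. The function name `packed_add` and the choice of wraparound (rather than saturating) addition are assumptions for illustration, not the behavior library described in the paper.

```python
def packed_add(x: int, y: int, b: int = 32, n: int = 4) -> int:
    """Add two b-bit words as n independent lanes of b // n bits each.

    Carries do not propagate between lanes; each lane wraps around.
    Hypothetical helper for illustration only.
    """
    if b % n != 0:
        raise ValueError("b must be divisible by n")
    width = b // n
    mask = (1 << width) - 1
    result = 0
    for lane in range(n):
        shift = lane * width
        a_lane = (x >> shift) & mask
        b_lane = (y >> shift) & mask
        # One sub-operation: a (b // n)-bit addition confined to this lane.
        result |= ((a_lane + b_lane) & mask) << shift
    return result


# Four 8-bit lanes: the 0xFF lane wraps to 0x00 without disturbing its neighbor.
four_lanes = packed_add(0x01FF0203, 0x01010101, b=32, n=4)
# One 32-bit lane: an ordinary 32-bit addition, so the carry does propagate.
one_lane = packed_add(0x01FF0203, 0x01010101, b=32, n=1)
```

Comparing the two calls shows the only difference between the packed and the ordinary operation: whether the carry chain is cut at lane boundaries.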
面積制約付きCAMプロセッサ合成手法

石川裕一朗, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD2003-89 ( 478 ) 115 - 120 2003年11月

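Our laboratory is building a system that synthesizes processors using CAM (content addressable memory). The current system takes as input an application description written in C and outputs the optimal hardware configuration of a processor that executes the application. In this paper, we add an area-constraint function to the current system. The proposed method reduces processor area by replacing part of the CAM with RAM. The system can derive the amount of CAM to replace with RAM that minimizes execution time while satisfying the area constraint. Computer experiments confirmed that the system outputs a processor configuration that executes the input application fastest while satisfying the area constraint.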
面積制約を考慮したCAMプロセッサ向けハードウェア/ソフトウェア協調設計手法

石川裕一朗, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 IE2003-98 ( 380 ) 83 - 88 2003年10月

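We are building a hardware/software cosynthesis system for processors that use CAM (content addressable memory). The current system takes as input an application description written in C and outputs the optimal hardware configuration of a processor that executes the application. In this paper, we extend the current system and propose a hardware/software cosynthesis system for CAM processors with an added area-constraint function. The proposed method derives the number of CAM words that minimizes execution time while satisfying the area constraint, and reduces processor area by replacing part of the CAM with RAM. Computer experiments confirmed that the system outputs a processor configuration that executes the input application fastest while satisfying the area constraint.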
FPGAを用いたReconfigurable Adaptive FECの実装と評価

清水一範, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 Reconf2003-9 2003年09月
分岐距離による再送手法選択式マルチキャスト

山田泰弘, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 CQ2003-58 ( 290 ) 29 - 32 2003年09月

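In IP multicast environments, reliable multicast, which ensures transfer reliability, is being actively studied. Various retransmission methods have been proposed so far, but the network conditions and multicast group sizes for which each is suited differ. In this work, we propose a reliable multicast protocol that adapts to time-varying network conditions and numbers of receiving hosts by selecting the retransmission method suited to the situation. The proposed protocol decides the retransmission method using a metric called branch distance, the number of hops to another neighboring receiving host. Simulations comparing the proposed protocol with several retransmission methods used on their own confirmed that the proposed protocol reduces the time required to complete delivery.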
公共空間におけるハンドオフ時間短縮を考慮したBluetoothネットワークの手順に関する一検討

寺崎暁, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 CQ2003-58 25 - 28 2003年09月
動的再構成可能システムによるAdaptive FECの実装

清水一範, 戸川望, 柳澤政生, 大附辰夫

情報処理学会 DAシンポジウム 2003 25 - 30 2003年07月
システムLSI設計における定性的側面を考慮したハードウェア/ソフトウェア分割システム

小田雄一, 宮岡祐一郎, 戸川望, 橘昌良, 柳澤政生, 大附辰夫

情報処理学会 DAシンポジウム 2003 169 - 174 2003年07月
冗長記述を利用したVHDLへの透かし埋め込み手法

久保ゆきこ, 戸川望, 柳澤政生, 大附辰夫

情報処理学会 DAシンポジウム 2003 37 - 42 2003年07月
System Architecture based on Hardware/Software Codesign for Optimization of Video Encoders

崔鎮求, 戸川望, 柳澤政生, 大附辰夫

The 2003 International Technical Conference on Circuits/Systems, Computers and Communications 2003年06月
A hardware/software cosynthesis system for processor cores with content addressable memories

N Togawa, T Totsuka, T Wakui, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E86A ( 5 ) 1082 - 1092 2003年05月

Content addressable memory (CAM) is one of the functional memories which realize word-parallel equivalence search. Since a CAM unit is generally used in a particular application program, we consider that appropriate design for CAM units is required depending on the requirements of the application program. This paper proposes a hardware/software cosynthesis system for CAM processors. The input of the system is an application program written in C including CAM functions and a constraint on execution time (or CAM processor area). Its output is a hardware description of a synthesized processor and a binary code executed on it. Based on the branch-and-bound method, the system determines which CAM functions are realized in hardware and which in software, meeting the given timing constraint (or area constraint) while minimizing the CAM processor area (or execution time of the application program). We expect that this realizes an optimal CAM processor design for an application program. Experimental results for several application programs show that we can obtain a CAM processor of minimum area that meets the given timing constraint.
ネットワークスイッチング処理を対象としたCAMプロセッサ自動合成システム

田中英夫, 戸川望, 柳澤政生, 大附辰夫

回路とシステム(軽井沢)ワークショップ 435 - 440 2003年04月
不規則なデータパスを持つプロセッサのハードウェア/ソフトウェア協調合成手法

宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

回路とシステム(軽井沢)ワークショップ 441 - 446 2003年04月
An Instruction-Set Simulator Generator for SIMD Processor Cores

宮岡祐一郎, 戸川望, 笠原亨介, 崔鎮求, 柳澤政生, 大附辰夫

Proceedings of SASIMI2003 160 - 167 2003年04月
閾値検索機能付きCAMプロセッサの最適化手法

戸塚崇夫, 宮岡祐一郎, 石川裕一朗, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-158 19 - 24 2003年03月
SIMD型プロセッサコア向けHW/SW分割におけるSIMD型演算最適化手法

太刀掛宏一, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-157 13 - 18 2003年03月
高位合成システムにおけるスレッド分割を用いた低消費電力化手法

内田純平, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-156 7 - 12 2003年03月
A hardware/software partitioning algorithm for SIMD processor cores

K Tachikake, N Togawa, Y Miyaoka, J Choi, M Yanagisawa, T Ohtsuki

ASP-DAC 2003: PROCEEDINGS OF THE ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 135 - 140 2003年 [査読有り]

This paper proposes a new hardware/software partitioning algorithm for processor cores with SIMD instructions. Given a compiled assembly code including SIMD instructions, a timing constraint on execution time, and available hardware units, the proposed algorithm synthesizes an area-optimized processor core with a new assembly code. Firstly, we assume an initial processor core on which the input assembly code can run with the shortest execution time. Secondly, we remove the hardware units added to the processor core one by one while the timing constraint is satisfied. At the same time, we update the assembly code so that it can run on the new processor configuration. By repeating this process, we finally obtain a processor core architecture with small area under the given timing constraint. We expect that we can obtain a processor core which has appropriate SIMD functional units for running the input application program. Promising experimental results are also shown.

Cited by (Scopus): 2
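The reduction loop outlined in the abstract above (start from the fastest configuration, then remove hardware units one by one while the timing constraint holds) can be sketched as follows. This is only an illustration under stated assumptions: the unit names, the area table `AREA`, the penalty table `PENALTY`, and the toy `toy_exec_time` model are hypothetical, and the rule of removing the largest removable unit first is one possible heuristic, not necessarily the one used in the paper. The paper's step of regenerating the assembly code is represented here only by re-evaluating the execution time.

```python
from typing import Callable, Dict, FrozenSet, Set

# Hypothetical area cost of each optional hardware unit.
AREA: Dict[str, int] = {"simd_add": 5, "simd_mul": 20, "mac": 8}
# Hypothetical extra cycles incurred when a unit is absent.
PENALTY: Dict[str, int] = {"simd_add": 40, "simd_mul": 100, "mac": 30}
BASE_CYCLES = 100


def toy_exec_time(config: FrozenSet[str]) -> int:
    """Toy timing model: base cycles plus a penalty per missing unit."""
    return BASE_CYCLES + sum(p for u, p in PENALTY.items() if u not in config)


def reduce_units(
    units: Set[str],
    exec_time: Callable[[FrozenSet[str]], int],
    area: Dict[str, int],
    time_limit: int,
) -> Set[str]:
    """Greedily drop units while the timing constraint stays satisfied."""
    config = set(units)
    if exec_time(frozenset(config)) > time_limit:
        raise ValueError("even the full configuration misses the time limit")
    while True:
        removable = [
            u for u in sorted(config)
            if exec_time(frozenset(config - {u})) <= time_limit
        ]
        if not removable:
            return config
        # Assumed heuristic: remove the unit that saves the most area.
        config.remove(max(removable, key=lambda u: area[u]))


final_config = reduce_units(set(AREA), toy_exec_time, AREA, time_limit=180)
```

With these toy numbers the loop first drops `mac`, then `simd_add`, and stops because removing `simd_mul` would exceed the limit.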
ハードウェアIPの応答時間を考慮したプロセッサコアのハードウェア/ソフトウェア分割手法

田川博規, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-136 ( 609 ) 37 - 42 2003年01月

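This paper proposes a hardware/software partitioning method for processor cores that considers the response time of hardware IPs. We have proposed a system LSI design approach that first decides which hardware IPs to use according to the target application and then synthesizes a processor core with neither excess nor shortage of function and performance. To synthesize a processor core with an appropriate configuration in this approach, hardware/software partitioning that considers the response time of hardware IPs is effective. The proposed method extends an existing method by considering hardware IP response time at the instruction level, which allows the processor core and the hardware IPs to execute independent tasks in parallel efficiently. We evaluate the proposed method through computer experiments and show the effectiveness of this design approach.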
ハードウェアIPの応答時間を考慮したプロセッサコア合成システム

小原俊逸, 田川博規, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-135 ( 609 ) 31 - 36 2003年01月

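This paper proposes a processor core synthesis system that considers the response time of hardware IPs, and a system LSI design framework that uses it. In system LSI design using hardware IPs, IPs with necessary and sufficient performance for the performance required of the system LSI are not always available. Therefore, after hardware/software partitioning of the system, IPs are used for the functions realized in hardware, while the processor core that runs the software is not taken from IP but is automatically synthesized with neither excess nor shortage of performance for the application. We design a JPEG encoder following the proposed framework and show the effectiveness of the proposed synthesis system and framework through computer experiments.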
MPEG-4コアプロファイル符号化向けDSP

石本剛, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-134 25 - 30 2003年01月
A high-level energy-optimizing algorithm for system VLSIs based on area/time/power estimation

S Noda, N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E85A ( 12 ) 2655 - 2666 2002年12月

This paper proposes a high-level energy-optimizing algorithm which can synthesize low energy system VLSIs. Given an initial system hardware obtained from an abstract behavioral description, the proposed algorithm applies three energy reduction techniques to it: 1) reducing supply voltage, 2) selecting lower energy modules, and 3) applying gated clocks. By incorporating our area/delay/power estimation, the proposed algorithm can obtain low energy system VLSIs meeting the constraints of area, delay, and execution time. The proposed algorithm has been incorporated into a high-level synthesis system, and experimental results demonstrate the effectiveness and efficiency of the algorithm.
An algorithm and a flexible architecture for fast block-matching motion estimation

J Choi, N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E85A ( 12 ) 2603 - 2611 2002年12月

Motion estimation can choose the most suitable algorithm for different kinds of motion types, formats, and characteristics. The video encoding system can be optimized for quality, speed, and power consumption. In this paper, we propose a reconfigurable approach to a motion estimation algorithm and hardware architecture. The proposed algorithm determines the motion type and then selects an adapted block-matching algorithm for different kinds of motion sequences. The quality of our algorithm is better than that of the TSS and the BBGDS algorithms, or comparable to the better of the two, and the computational complexity of our algorithm is significantly less than that of the TSS. We also propose a hardware architecture for realizing two kinds of motion estimation in the same hardware. We implemented the flexible and reconfigurable hardware architecture using an address generator unit, a delay unit, and parameters, together with the hardware description language VHDL and the Synopsys synthesis design tools. We analyze the performance of the algorithm and present an adapted algorithm for a low-cost real-time application.
閾値検索機能を持つCAMプロセッサの自動合成システム

戸塚崇夫, 石川裕一朗, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-113 ( 476 ) 197 - 192 2002年11月

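We propose an automatic synthesis system for processors that use content addressable memory (CAM). The system takes as input an application program written in C that uses the functions of a CAM unit, and outputs a hardware description of the CAM processor and binary code that runs on it. By synthesizing from ten types of CAM cell arrays with different functions, including equivalence search and threshold search, a CAM processor suited to the application can be designed. We built the system on a computer and confirmed that a hardware description of the CAM processor and binary code are obtained from an input application program.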
動的再構成可能システムによるプロトコルブースタの実装

清水一範, 陳暁梅, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-103 127 - 132 2002年11月
ストリーミングを主目的としたアクセスネットワークでの最大許容遅延を考慮した制御方式

柳澤政生, 佐藤隆之, 戸川望, 大附辰夫

電子情報通信学会技術報告，MoMuC 2-Jul ( 251 ) 13 - 18 2002年07月

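This paper considers an access network environment using Bluetooth whose main purpose is streaming. Wireless links generally lose more packets than wired links, which degrades streaming data. The human ear does not perceive delays of about 100 ms. We therefore consider an allowable delay for the data: if an error can be corrected within the allowable delay, correcting it should suppress the degradation of streaming data. In an access network environment using Bluetooth, the link on the server side is generally faster than the Bluetooth link on the terminal side. The extra traffic on the server-side link caused by error correction and similar processing is therefore expected to have little effect on the streaming data, and we consider this approach well suited to such environments. Using these characteristics, this paper introduces a metric called the maximum allowable delay and proposes performing retransmission and bandwidth control to transmit high-quality streaming data. We also show its effectiveness through computer experiments.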
仮想IP類推機構を有する動画像処理向けシステムVLSIのためのハードウェア/ソフトウェア分割システム

小田雄一, 磯田新平, 戸川望, 橘昌良, 柳澤政生, 大附辰夫

情報処理学会 DAシンポジウム 2002 173 - 178 2002年07月
Packed SIMD 型命令を持つプロセッサを対象としたハードウェア/ソフトウェア協調合成システムのための並列化コンパイル手法

鈴木伸治, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-78 ( 168 ) 79 - 84 2002年06月

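SPADE-simd, a hardware/software cosynthesis system for processors with packed SIMD type instructions, requires a parallelizing compiler targeting such processors. The parallelizing compiler first assumes a virtual processor that has all the hardware that can be added to the processor core targeted by SPADE-simd. On the virtual processor, it extracts as much instruction-level parallelism from the input application as possible using packed SIMD type instructions and outputs assembly code. The output of the parallelizing compiler gives the initial configuration of the processor to be synthesized. This paper proposes the two algorithms at the core of the parallelizing compiler: an intra-register SIMD parallelization algorithm and an instruction merging algorithm. The intra-register SIMD parallelization algorithm extracts the instruction-level parallelism of the input application and packs and aligns low-precision data within registers so that packed SIMD type instructions can be used. The instruction merging algorithm merges multiple packed SIMD type instructions into one instruction, making it possible to use packed SIMD type instructions that include shift and saturation operations as well as packed SIMD type multiply-accumulate instructions. The proposed method extracts the instruction-level parallelism of the application using packed SIMD type instructions and outputs faster assembly code. We implement the parallelizing compiler on a computer using the proposed method and evaluate its effectiveness.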
Packed SIMD型命令セットを持った画像処理プロセッサのためのハードウェア/ソフトウェア分割手法

太刀掛宏一, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2002-53 ( 168 ) 85 - 90 2002年06月

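This paper proposes a hardware/software partitioning method in SPADE-simd, a hardware/software cosynthesis system for image processing processors with a packed SIMD type instruction set. Image processing operates on data whose bit width is small compared with the basic bit width of the processor. A packed SIMD type instruction uses one functional unit to operate on multiple small-bit-width data items, so applying it to image processing speeds up processing through data-parallel operation. For every packed SIMD type instruction in the application program, the instruction's execution function is assigned to one of the corresponding packed SIMD type functional units in the processor core. The hardware configuration of each packed SIMD type functional unit can be changed according to the combination of instructions assigned to it. By appropriately deciding the instruction execution functions of the packed SIMD type functional units added to the processor, the hardware cost of the functional units and of the processor core can be reduced. We evaluate the proposed method through computer experiments and report the results.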
A Software/Hardware Codesign for MPEG Encoder

崔鎮求, 戸川望, 柳澤政生, 大附辰夫

FIT(Forum on Information Technology)2002 2002年06月
System-level Function and Architecture Codesign for Optimization of MPEG Encoder

崔鎮求, 戸川望, 柳澤政生, 大附辰夫

ITC-CSCC'02 2002年06月
モバイル環境における一対多通信 −シミュレーションによるFTPとSRMの比較−

佐藤隆之, 柳生健吾, 戸川望, 大附辰夫

電子情報通信学会技術報告，MoMuC 2-Jun 33 - 38 2002年05月
ディジタル信号処理向けプロセッサのためのシミュレータ生成手法

笠原亨介, 戸川望, 柳澤政生, 大附辰夫

情報処理学会論文誌 vol.43 No.5 1202 - 1213 2002年05月
Packed SIMD型命令を持つプロセッサを対象としたハードウェア/ソフトウェア協調合成システムのためのハードウェアユニット生成手法

宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会論文誌 vol.43 No.5 ( 5 ) 1191 - 1201 2002年05月

This paper proposes a hardware unit generation algorithm for a hardware/software cosynthesis system of digital signal processors with packed SIMD type instructions. Given a set of instructions, the proposed algorithm extracts a set of subfunctions required by the hardware unit and generates more than one architecture candidate for hardware units. The algorithm also outputs the estimated area and delay of each of the generated hardware units. The execution time of the proposed algorithm is very short and thus it can be easily incorporated into the processor core synthesis system. Experimental results demonstrate the effectiveness and efficiency of the algorithm.
DSPプロセッサコアのハードウェア/ソフトウェア協調合成システムのための演算語長縮小化手法

田川博規, 嶋下和宏, 戸川望, 柳澤政生, 大附辰夫

回路とシステム軽井沢ワークショップ 429 - 434 2002年04月
High-level area/delay/power estimation for low power system VLSIs with gated clocks

S Noda, N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E85A ( 4 ) 827 - 834 2002年04月

In high-level synthesis for system VLSIs, power consumption is efficiently reduced by applying gated clocks. Since using gated clocks reduces power consumption but increases area and delay, estimating the tradeoff between power and area/delay when applying gated clocks is very important. In this paper, we discuss the amount of variance of area, delay, and power caused by applying gated clocks. We propose a simple gate-level circuit model and estimation equations. We vary parameters in our proposed circuit model and evaluate power consumption by back-annotating gate-level simulation results to the original circuit. This paper also proposes a conditional expression for applying gated clocks. The expression shows whether or not we can reduce power consumption by applying gated clocks. We confirm the accuracy of the proposed estimation equations by experiments.
制御処理ハードウェア高位合成のためのコントロールデータフローグラフ変形手法

石井哲雄, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2001-165 ( 695 ) 41 - 48 2002年03月

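We propose a control data flow graph transformation method that reduces hardware execution time in a high-level synthesis system for control-dominated hardware. From an application program description, the high-level synthesis system outputs, as an initial hardware candidate, an initial state transition graph that has the minimum number of states and runs fast with fast functional units assigned. The proposed transformation method takes two approaches to achieve (1) a reduction in the number of execution clock cycles of the initial state transition graph and (2) a reduction in its clock period. (1) is achieved by reducing path length through duplication of partial control data flow graphs. (2) is achieved by pipelined execution that distributes the memory read and the operation within a single arithmetic operation across different iterations. Computer experiments show the effectiveness of the proposed method.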
IP再利用を考慮した動画像処理システムVLSI向けハードウェア/ソフトウェア分割設計支援システム

磯田新平, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2001-164 ( 695 ) 33 - 40 2002年03月

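This paper proposes a hardware/software codesign system for video processing VLSIs that (1) has an analogy-based enumeration mechanism and (2) considers qualitative metrics. The system accumulates previously designed IPs in a database and uses them for analogy-based enumeration. Analogy-based enumeration is the function of inferring, from an IP in the database, how its area and delay change when its degree of parallelism is varied. Qualitative metrics are metrics other than area, time, and power, such as ease of connection to other hardware components. Introducing (1) virtually increases the number of solution candidates, making it easier to obtain solutions that satisfy area and time constraints. Introducing (2) makes it possible to choose a candidate with low design cost from among solution candidates with equivalent quantitative metrics. We built a system including these functions and show its effectiveness through computer experiments.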
Packed SIMD 型演算器を持つディジタル信号処理プロセッサのためのリターゲッタブルシミュレータ生成手法

笠原亨介, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2001-162 24 - 17 2002年03月
VLSI architecture for a flexible motion estimation with parameters

J Choi, N Togawa, M Yanagisawa, T Ohtsuki

ASP-DAC/VLSI DESIGN 2002: 7TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE AND 15TH INTERNATIONAL CONFERENCE ON VLSI DESIGN, PROCEEDINGS 452 - 457 2002年 [査読有り]

If motion estimation can choose the most suitable algorithm according to the changing characteristics of input image signals, we gain benefits: improved quality and performance, reduced power consumption, and an optimized system. In this paper we propose a reconfigurable approach to motion estimation algorithm and architecture. The proposed algorithm determines the motion type and then selects an adapted algorithm in order to improve the quality and performance of images. We implemented the flexible and reconfigurable architecture in hardware with an address generator unit, a delay unit, and parameters. Our architecture supports more than one block-matching algorithm, with parameters provided to optimize the system. We are implementing our architecture using a hardware description language (VHDL) and synthesis design tools. We analyze the performance of the architecture and present its adaptation to the algorithm for a low-cost real-time application.

Cited by (Scopus): 4
An algorithm of hardware unit generation for processor core synthesis with packed SIMD type instructions

Y Miyaoka, J Choi, N Togawa, M Yanagisawa, T Ohtsuki

APCCAS 2002: ASIA-PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS, VOL 1, PROCEEDINGS 171 - 176 2002年 [査読有り]

Let us consider synthesizing a processor core with SIMD instructions by a hardware/software cosynthesis system. The system is required to configure functional units executing SIMD instructions and obtain the area and delay of the functional units to evaluate the synthesized processor core. This paper proposes a hardware unit generation algorithm for a hardware/software cosynthesis system of processors with SIMD instructions. Given a set of instructions to be executed by a hardware unit and constraints on the area and delay of the hardware unit, the proposed algorithm extracts a set of subfunctions required by the hardware unit and generates more than one architecture candidate for the hardware unit. The algorithm also outputs the estimated area and delay of each of the generated hardware units. The execution time of the proposed algorithm is very short and thus it can be easily incorporated into the processor core synthesis system. Experimental results demonstrate the effectiveness and efficiency of the algorithm.
An algorithm of hardware unit generation for processor core synthesis with packed SIMD type instructions

Y. Miyaoka, J. Choi, N. Togawa, M. Yanagisawa, T. Ohtsuki

IEEE Asia-Pacific Conference on Circuits and Systems, Proceedings, APCCAS 1 171 - 176 2002年

　概要を見る

The authors consider the synthesis of a processor core with SIMD instructions by a hardware/software cosynthesis system. The system is required to configure functional units executing SIMD instructions and obtain the area and delay of the functional units to evaluate the synthesized processor core. This paper proposes a hardware unit generation algorithm for a hardware/software cosynthesis system of processors with SIMD instructions. Given a set of instructions to be executed by a hardware unit and constraints for area and delay of the hardware unit, the proposed algorithm extracts a set of subfunctions to be required by the hardware unit and generates more than one architecture candidates for the hardware unit. The algorithm also outputs the estimated area and delay of each of the generated hardware units. The execution time of the proposed algorithm is very short and thus it can be easily incorporated into the processor core synthesis system. Experimental results demonstrate effectiveness and efficiency of the algorithm.

DOI

Scopus
ロジック入力用レベルシフトコンパレーター設計考察

宮崎英敏, 戸川望, 柳澤政生, 大附辰夫, 茨木栄武, 新谷悟

電子回路研究会，ETC-02-16 13 - 17 2002年01月
システムVLSIのための高位面積/遅延/消費電力見積もりに基づく低消費電力指向高位合成手法

野田真一, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2001-144 ( 577 ) 93 - 100 2002年01月

　概要を見る

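This paper proposes a high-level synthesis system that can synthesize low-power system VLSIs while satisfying area, delay, and execution-time constraints. Three power-reduction techniques are adopted: (1) lowering the supply voltage, (2) selecting low-power modules, and (3) gated clocks. In general, applying these three techniques reduces power consumption but increases area, delay, and execution time. By predicting the changes in area, delay, and execution time, the proposed method synthesizes hardware that consumes less power than the initial hardware while still satisfying each constraint. Computer experiments confirm that power consumption is reduced.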

CiNii
メモリとのインターフェース仕様を考慮した演算語長縮小に基づくプロセッサコアのハードウェア/ソフトウェア協調合成システム

嶋下和宏, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2001-110 ( 467 ) 127 - 132 2001年11月

　概要を見る

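We aim to reduce processor area by shrinking the operation word length from n bits to n/2 bits. In general, obtaining an n-bit result then requires executing at least two operation instructions. Suppose, however, that the internal variables of the application program are only n/2 bits long; in that case a single instruction yields the result. Our hardware/software cosynthesis system has so far assumed that the data length of the application program equals the operation word length of the processor core. This paper proposes a method for reducing the operation word length. Depending on the precision required by the internal variables, the method iteratively replaces each n-bit operation instruction with one, or with two or more, n/2-bit operation instructions.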

CiNii
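The word-length reduction in the entry above, where an n-bit operation is replaced by one or by two n/2-bit operations, can be illustrated with a minimal Python sketch. It assumes an unsigned addition as the operation and models an n/2-bit add instruction with carry. The names `add_halfword` and `add_fullword` and the default n = 16 are hypothetical and are not taken from the paper.

```python
def add_halfword(a: int, b: int, carry_in: int, half_bits: int) -> tuple[int, int]:
    """Model one n/2-bit add instruction: return (sum, carry_out)."""
    mask = (1 << half_bits) - 1
    total = (a & mask) + (b & mask) + carry_in
    return total & mask, total >> half_bits


def add_fullword(a: int, b: int, n: int = 16) -> int:
    """Obtain an n-bit sum from two n/2-bit add instructions (low half, then high half)."""
    half = n // 2
    mask = (1 << half) - 1
    # First instruction: add the low halves.
    lo, carry = add_halfword(a & mask, b & mask, 0, half)
    # Second instruction: add the high halves plus the carry from the low half.
    hi, _ = add_halfword(a >> half, b >> half, carry, half)
    return (hi << half) | lo


# When both operands are known to fit in n/2 bits, one instruction is enough.
small_sum, _ = add_halfword(0x12, 0x34, 0, 8)
# Otherwise two n/2-bit instructions are needed for the n-bit result.
wide_sum = add_fullword(0x12FF, 0x0001)
```

The sketch only shows why the instruction count depends on the data length of the variables; how the paper decides which instructions to replace is not modeled here.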
A new hardware/software partitioning algorithm for DSP processor cores with two types of register files

N Togawa, T Sakurai, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E84A ( 11 ) 2802 - 2807 2001年11月

　概要を見る

This letter proposes a hardware/software partitioning algorithm for digital signal processor cores with two register files. Given a compiled assembly code and a timing constraint of execution time, the proposed algorithm generates a processor core configuration with a new assembly code running on the generated processor core. The proposed algorithm considers two register files and determines the number of registers in each of register files. Moreover the algorithm considers two or more types of functional units for each arithmetic or logical operation and assigns functional units with small area to a processor core without causing performance penalty. A generated processor core will have small area compared with processor cores which have a single register file or those which consider only one type of functional units for each operation. The experimental results demonstrate the effectiveness and efficiency of the proposed algorithm.
Area and delay estimation in hardware/software cosynthesis for digital signal processor cores

N Togawa, Y Kataoka, Y Miyaoka, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E84A ( 11 ) 2639 - 2647 2001年11月

　概要を見る

Hardware/software partitioning is one of the key processes in a hardware/software cosynthesis system for digital signal processor cores. In hardware/software partitioning, area and delay estimation of a processor core plays an important role since the hardware/software partitioning process must determine which part of a processor core should be realized by hardware units and which part should be realized by a sequence of instructions based on execution time of an input application program and area of a synthesized processor core. This paper proposes area and delay estimation equations for digital signal processor cores. For area estimation, we show that total area for a processor core can be derived from the sum of area for a processor kernel and area for additional hardware units. Area for a processor kernel can be mainly obtained by minimum area for a processor kernel and overheads for adding hardware units and registers. Area for a hardware unit can be mainly obtained by its type and operation bit width. For delay estimation, we show that critical path delay for a processor core can be derived from the delay of a hardware unit which is on the critical path in the processor core. Experimental results demonstrate that errors of area estimation are less than 2% and errors of delay estimation are less than 2 ns when comparing estimated area and delay with logic-synthesized area and delay.
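The structure of the estimation described in this abstract can be sketched in Python as follows. This is only an illustration of the stated decomposition (kernel area plus added-unit area, and delay taken from a unit on the critical path). The abstract does not give the actual equations, so every coefficient, the linear dependence on bit width, and all names below (`HardwareUnit`, `estimate_area`, `estimate_delay`) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class HardwareUnit:
    kind: str                      # unit type, e.g. "adder" or "multiplier"
    bit_width: int                 # operation bit width
    on_critical_path: bool = False


# Hypothetical coefficients; real values would be fitted to logic-synthesis results.
AREA_PER_BIT = {"adder": 10.0, "multiplier": 120.0}
DELAY_PER_BIT_NS = {"adder": 0.125, "multiplier": 0.25}
KERNEL_MIN_AREA = 5000.0
OVERHEAD_PER_UNIT = 50.0
OVERHEAD_PER_REGISTER = 20.0


def estimate_area(units: list[HardwareUnit], num_registers: int) -> float:
    """Total area = kernel area (minimum + overheads) + area of the added units."""
    kernel = (KERNEL_MIN_AREA
              + OVERHEAD_PER_UNIT * len(units)
              + OVERHEAD_PER_REGISTER * num_registers)
    added = sum(AREA_PER_BIT[u.kind] * u.bit_width for u in units)
    return kernel + added


def estimate_delay(units: list[HardwareUnit]) -> float:
    """Critical path delay taken from the slowest unit marked as on the critical path."""
    return max((DELAY_PER_BIT_NS[u.kind] * u.bit_width
                for u in units if u.on_critical_path), default=0.0)


core_units = [HardwareUnit("adder", 16),
              HardwareUnit("multiplier", 16, on_critical_path=True)]
core_area = estimate_area(core_units, num_registers=8)
core_delay = estimate_delay(core_units)
```

Keeping the kernel term separate from the per-unit term mirrors the decomposition in the abstract, so each part could be calibrated independently.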
A Hardware/Software Cosynthesis System for CAM Processors

戸川望, 涌井達彦, 柳澤政生, 大附辰夫

SASIMI 2001 2001年10月

CiNii
Packed SIMD型命令を持つプロセッサを対象としたハードウェア／ソフトウェア協調合成システムのためのハードウェアユニット生成手法

宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会 DAシンポジウム 2001 223 - 228 2001年07月

CiNii
ディジタル信号処理向けプロセッサのためのシミュレータ生成手法

笠原亨介, 戸川望, 柳澤政生, 大附辰夫

情報処理学会 DAシンポジウム 2001 137 - 142 2001年07月

CiNii
Implementation of Motion Estimation IP Core for MPEG Encoder

崔鎮求, 戸川望, 柳澤政生, 大附辰夫

ITC-CSCC 2001 2001年07月

CiNii
An area/time optimizing algorithm in high-level synthesis of control-based hardwares

N Togawa, M Ienaga, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E84A ( 5 ) 1166 - 1176 2001年05月

　概要を見る

This paper proposes an area/time optimizing algorithm in a high-level synthesis system for control-based hardware. Given a call graph whose nodes correspond to control flows of an application program, the algorithm generates a set of state-transition graphs which represents the input call graph under area and timing constraints. In the algorithm, first, state-transition graphs which satisfy only the timing constraint are generated, and second, they are transformed so that they satisfy the area constraint. Since the algorithm is directly applied to control-flow graphs, it can deal with control flows such as bit-wise processes and conditional branches. Further, the algorithm synthesizes more than one hardware architecture candidate from a single call graph for an application program. Designers of an application program can select several good hardware architectures among the candidates depending on multiple design criteria. Experimental results for several control-based hardware designs demonstrate the effectiveness and efficiency of the algorithm.
ディジタル信号処理向けプロセッサコアのPacked SIMD型ハードウェアユニット生成手法

宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2001-2 ( 45 ) 7 - 13 2001年05月

　概要を見る

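Packed SIMD instructions, which perform n operations of b/n bits each on a single b-bit functional unit, are effective for applications such as image processing. When a processor core with packed SIMD instructions is synthesized by a hardware/software cosynthesis system, the system must configure packed SIMD functional units that can execute the required instructions and estimate their area and delay quickly. This paper therefore proposes a packed SIMD hardware unit generation method that quickly configures multiple hardware units. The method takes as input the set of instructions to be executed by one hardware unit and the area and delay constraints on the generated unit. It extracts the subfunctions the hardware unit requires, enumerates multiple hardware unit configurations by combining hardware that implements those subfunctions, and outputs estimated area and delay values. We implement the proposed method on a computer, show results of applying it to packed SIMD functional units, and evaluate its effectiveness.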

CiNii
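As a software illustration of what a packed SIMD instruction computes, the following Python sketch adds n sub-words of b/n bits each that are packed into one b-bit word, without letting carries cross lane boundaries. It uses the well-known masking technique for packed addition on ordinary integers and is not the hardware unit generated by the method above. The function name `packed_add` and its defaults (b = 32, n = 4) are hypothetical.

```python
def packed_add(a: int, b: int, width: int = 32, lanes: int = 4) -> int:
    """Add `lanes` independent sub-words packed into one `width`-bit word (wrap-around per lane)."""
    lane_bits = width // lanes
    # Mask with the top bit of every lane set, e.g. 0x80808080 for 4 lanes of 8 bits.
    high = 0
    for i in range(lanes):
        high |= 1 << (i * lane_bits + lane_bits - 1)
    full = (1 << width) - 1
    low = full & ~high
    # Add the lower bits of each lane; any carry stops at the lane's top bit.
    partial = (a & low) + (b & low)
    # Fold in the top bits with XOR so that no carry leaves the lane.
    return (partial ^ ((a ^ b) & high)) & full


# Four 8-bit lanes: 0x01+0x01, 0xFF+0x01, 0x7F+0x01, 0x10+0x7F, each modulo 256.
bytes_sum = packed_add(0x01FF7F10, 0x0101017F)
# Two 16-bit lanes using the same 32-bit word.
halves_sum = packed_add(0xFFFF0001, 0x00010001, width=32, lanes=2)
```

Changing `lanes` reuses the same 32-bit datapath for a different sub-word size, which is the property that makes a single b-bit unit serve several packed instruction variants.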
Gated Clockによる低消費電力化システムVLSIの高位面積／遅延／消費電力見積り

野田真一, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会第14回回路とシステム(軽井沢)ワークショップ 591 - 596 2001年04月
ソフトIPのための保護アルゴリズム

堀川哲郎, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会第14回回路とシステム(軽井沢)ワークショップ 411 - 416 2001年04月
システムLSIを対象としたハードウェア／ソフトウェア分割システム

小田龍之介, 磯田新平, 戸川望, 橘昌良, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2000-140 ( 646 ) 37 - 42 2001年03月

　概要を見る

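This paper proposes a hardware/software partitioning system for system LSIs realized through the reuse of design assets. The system consists of an architecture database and a feasible-assignment subsystem. The architecture database associates algorithm names with existing design instances. An application is represented as a functional block diagram, which consists of functional modules and the connections among them; each functional module is given an algorithm name. The feasible-assignment subsystem takes the execution-time constraint of the application and the functional block diagram as input, and enumerates all assignments of existing design instances to modules that satisfy the time constraint. An example application with an MPEG-4 encoder as input is presented.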

CiNii
画像処理を対象としたPacked SIMD型命令セットを持つプロセッサのハードウェア／ソフトウェア協調合成システムにおける並列化Cコンパイラ

野々垣直浩, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2000-139 ( 646 ) 31 - 36 2001年03月

　概要を見る

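Many general-purpose and digital signal processors provide extended instruction sets to improve the performance of image processing applications. These extensions consist mainly of added packed SIMD instructions, which let an application exploit its data parallelism. Although many kinds of packed SIMD instructions exist, a given application uses only a limited subset. Synthesizing the processor in an application-specific manner can therefore greatly reduce the area of a processor that satisfies the execution-time constraint. This paper proposes a hardware/software cosynthesis system for processors with a packed SIMD instruction set, together with the in-register SIMD parallelization algorithm of its compiler. The system takes an application description written in C and application data as input, and outputs a hardware description of the processor, object code that runs on the processor, and a software environment. The parallelizing compiler generates object code for a processor to which every hardware unit that can be added to the processor core has been added. The compiler extracts instruction-level parallelism and in-register SIMD parallelism, aiming to minimize execution time. We report the results of evaluating the compiler through computer experiments.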

CiNii
制御処理ハードウェアの高位合成システムのための面積／遅延見積もり手法

余田貴幸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告 2001-SLDM-100-4,pp.25-32 ( 12 ) 25 - 32 2001年02月

　概要を見る

This paper proposes an area/delay estimation technique for a high-level synthesis system targeting control-flow-based hardware. The estimation takes as input the state-transition graph generated by area/time optimization, one of the components of the system, and outputs the estimated area and delay for that graph. The technique obtains area and delay estimates that include the control part of the hardware by formulating estimation equations that depend on the number of operations, the number of states, and the types of operations. The technique is applied to several control-based application programs, including Huffman coding, to evaluate its effectiveness.

CiNii
RC等価回路に基づくクロストーク低減配線手法

曽根原理仁, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告 2001-SLDM-100-3,pp.17-24 17 - 24 2001年02月

CiNii
Area/delay estimation for digital signal processor cores

Y Miyaoka, Y Kataoka, N Togawa, M Yanagisawa, T Ohtsuki

PROCEEDINGS OF THE ASP-DAC 2001: ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 2001 156 - 161 2001年 [査読有り]

　概要を見る

Hardware/software partitioning is one of the key processes in a hardware/software cosynthesis system for digital signal processor cores. In hardware/software partitioning, area and delay estimation of a processor core plays an important role since the hardware/software partitioning process must determine which part of a processor core should be realized by hardware units and which part should be realized by a sequence of instructions based on execution time of an input application program and area of a synthesized processor core. This paper proposes area and delay estimation equations for digital signal processor cores. For area estimation, we show that total area for a processor core can be derived from the sum of area for a processor kernel and area for additional hardware units. Area for a processor kernel can be mainly obtained by minimum area for a processor kernel and overheads for adding hardware units and registers. Area for a hardware unit can be mainly obtained by its type and operation bit width. For delay estimation, we show that critical path delay for a processor core can be derived from the delay of a hardware unit which is on the critical path in the processor core. Experimental results demonstrate that errors of area estimation are less than 2% and errors of delay estimation are less than 2ns when comparing estimated area and delay with logic-synthesized area and delay.

DOI

Scopus

Citations: 6 (Scopus)
発見的算法と分枝限定法を用いた時間的予測に基づくリソースバインディング

中村洋, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2000-119 ( 532 ) 17 - 24 2001年01月

　概要を見る

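This paper proposes a resource binding method for datapath design of digital signal processing hardware that derives a solution based on a prediction of computation time. The method combines a heuristic resource binder with a branch-and-bound resource binder. First, it predicts how the computation time changes as the number of resources assigned by the heuristic binder is varied, with the remaining resources assigned by the branch-and-bound binder. Based on that prediction, it determines the number of resources to assign with the heuristic binder so that the computation-time constraint given by the designer is satisfied, and performs that assignment. The remaining resources are then assigned by the branch-and-bound binder, yielding the final resource binding solution. Computer experiments confirm the effectiveness of the method.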

CiNii
FPGAを用いた動的再構成可能システムを対象とするスケジューリング手法

石飛貴志, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2000-115 33 - 40 2001年01月

CiNii
パラメータ付けされた動的再構成可能システムとその応用

香西伸治, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2000-114 ( 531 ) 25 - 32 2001年01月

　概要を見る

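Dynamically reconfigurable systems, which rewrite part of their logic while the system is running, have been studied in recent years. Conventional dynamically reconfigurable systems have a fixed hardware configuration, so when one is applied to a particular application its functions are either redundant or insufficient. To solve these problems, this work proposes a parameterized dynamically reconfigurable system. The proposed system consists of a PCI interface part, a functional part that executes operations, and a control part that controls them. The device configuration of the functional part, such as processing speed, circuit size, reconfiguration time, and pin count, and the connection scheme between devices can be varied according to the application. Each device parameter is assigned a cost according to its performance, which makes it possible to determine the device parameters, and the system configuration built from those devices, that give the fastest processing within a cost constraint specified by the user. Image coding and packet processing are taken as example applications to evaluate the effectiveness of the system.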

CiNii
CAM processor synthesis based on behavioral descriptions

N Togawa, T Wakui, T Yoden, M Terajima, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E83A ( 12 ) 2464 - 2473 2000年12月

　概要を見る

CAM (Content Addressable Memory) units are generally designed so that they can be applied to a variety of application programs. However, if a particular application runs on CAM units, some functions in the CAM units may be used often and other functions may never be used. We consider that an appropriate design for CAM units is required depending on the requirements of a given application program. This paper proposes a CAM processor synthesis system based on behavioral descriptions. The input of the system is an application program written in C including CAM functions, and its output is hardware descriptions of a synthesized processor and a binary code executed on it. Since the system determines the functions in CAM units and synthesizes a CAM processor depending on the requirements of an application program, we expect that a synthesized CAM processor can execute the application program with small processor area and delay. Experimental results demonstrate its efficiency and effectiveness.
CAMプロセッサを対象とするハードウェア／ソフトウェア協調合成システム

涌井達彦, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2000-84 89 - 94 2000年11月

CiNii
機能メモリを使用したプロセッサの面積／遅延見積もり手法

余傅達彦, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD2000-83 83 - 88 2000年11月
制御処理ハードウェアの高位合成システムのための面積／時間最適化アルゴリズム

家長真行, 戸川望, 柳澤政生, 大附辰夫

情報処理学会DAシンポジウム2000 27 - 32 2000年07月
A Behavioral Synthesis System for Processors with Content Addressable Memories

涌井達彦, 余傅達彦, 寺島信, 戸川望, 柳澤政生, 大附辰夫

Proc.SASIMI2000 56 - 63 2000年04月
システムVLSIの動作合成におけるレイアウト面積・遅延見積もり手法

諏訪勝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会第13回回路とシステム（軽井沢）ワークショップ 125 - 130 2000年04月

CiNii
歩行者を対象とした地図データ配信システムにおける専用プロセッサの設計と評価

伊澤義貴, 濱未希子, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD99-267 ( 658 ) 15 - 22 2000年03月

　概要を見る

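A map data delivery system for pedestrians consists of a map server that stores vector map data, base stations equipped with antennas, and multiple mobile terminals carried by pedestrians. A pedestrian obtains position information via GPS and sends it to the map server through a base station. The map server sends map data centered on that position to the mobile terminal through the base station. This paper addresses the map data search problem: given the position of a mobile terminal, find the map data contained in, or intersecting, a query rectangle centered on that position. As dedicated map server processors that handle this search at high speed, we propose a map server processor that uses content addressable memory and a map server processor with dedicated hardware. Each processor is described in a hardware description language and logic-synthesized to estimate hardware cost and critical path delay, and we report the results of evaluating processor performance.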

CiNii
FPGAを用いた動的再構成可能システムと暗号化アルゴリズムへの応用

羽切崇, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD99-266 2000年03月

CiNii
A hardware/software cosynthesis system for digital signal processor cores with two types of register files

N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E83A ( 3 ) 442 - 451 2000年03月

　概要を見る

In digital signal processing, the bit width of intermediate variables should be longer than that of input and output variables in order to execute intermediate operations with high precision. A processor core for digital signal processing is then required to have two types of register files, one of which is used by input and output variables and the other by intermediate variables. This paper proposes a hardware/software cosynthesis system for digital signal processor cores with two types of register files. Given an application program and its data, the system synthesizes a hardware description of a processor core, an object code running on the processor core, and software environments. A synthesized processor core can be composed of a processor kernel, multiple data memory buses, hardware loop units, addressing units, and multiple functional units. Furthermore, it can have two types of register files, RF1 and RF2. The bit width and number of registers in RF1 or RF2 will be determined based on a given application program. Thus a synthesized processor core will have small area while keeping high precision of intermediate operations, compared with a processor core with only one register file. The experimental results demonstrate the effectiveness of the proposed system.
An area/time optimizing algorithm in high-level synthesis for control-based hardwares

Nozomu Togawa, Masayuki Ienaga, Masao Yanagisawa, Tatsuo Ohtsuki

Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC 309 - 312 2000年 [査読有り]

　概要を見る

This paper proposes an area/time optimizing algorithm in high-level synthesis for control-based hardwares. Given a call graph whose node corresponds to a control flow of an application program, the algorithm generates a set of state-transition graphs which represents the input call graph under area and timing constraint. In the algorithm, first state-transition graphs which satisfy only timing constraint are generated and second they are transformed so that they can satisfy area constraint. Since the algorithm is directly applied to control-flow graphs, it can deal with control flows such as bit-wise processes and conditional branches. Further, the algorithm synthesizes more than one hardware architecture candidates from a single call graph for an application program. Designers of an application program can select several good hardware architectures among candidates depending on multiple design criteria. Experimental results for several control-based hardwares demonstrate effectiveness and efficiency of the algorithm. © 2000 IEEE.

DOI

Scopus

Citations: 6 (Scopus)
A hardware/software partitioning algorithm for digital signal processor cores with two types of register files

N Togawa, T Sakurai, M Yanagisawa, T Ohtsuki

2000 IEEE ASIA-PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS 544 - 547 2000年

　概要を見る

This paper proposes a hardware/software partitioning algorithm for digital signal processor cores with two register files. Given a compiled assembly code and a timing constraint of execution time, the proposed algorithm generates a processor core configuration with a new assembly code running on the generated processor core. The proposed algorithm considers two register files and determines the number of registers in each of register files. Moreover the algorithm considers two or more functional units for each arithmetic or logical operation and assigns functional units with small area to a processor core without causing performance penalty. A generated processor core will have small area compared with processor cores which have a single register file or those which have only one functional unit for each operation. The experimental results demonstrate the effectiveness and efficiency of the proposed algorithm.
A Behavioral Synthesis System for Processors with Content Addressable Memories

Proc. SASIMI 2000 56 - 63 2000年
An area/time optimizing algorithm in high-level synthesis for control-based hardwares

戸川望, 家長真行, 柳澤政生, 大附辰夫

Proceedings of IEEE Asia and South Pacific Design Automation Conference 2000 (ASP-DAC 2000) 2000年01月
A simultaneous placement and routing algorithm for FPGAs with power optimization

Journal of Circuits, Systems and Computers 9;1,2 99 - 112 1999年12月

DOI
A hardware/software cosynthesis system for digital signal processor cores

N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E82A ( 11 ) 2325 - 2337 1999年11月

　概要を見る

This paper proposes a hardware/software cosynthesis system for digital signal processor cores and a hardware/software partitioning algorithm which is one of the key issues for the system. The target processor has a VLIW-type core which can be composed of a processor kernel, multiple data memory buses (X-bus and Y-bus), hardware loop units, addressing units, and multiple functional units. The processor kernel includes five pipeline stages (RISC-type kernel) or three pipeline stages (DSP-type kernel). Given an application program written in the C language and a set of application data, the system synthesizes a processor core by selecting an appropriate kernel (RISC-type or DSP-type kernel) and required hardware units according to the application program/data and the hardware costs. The system also generates the object code for the application program and a software environment (compiler and simulator) for the processor core. The experimental results demonstrate that the system synthesizes processor cores effectively according to the features of an application program and the synthesized processor cores execute most application programs with the minimum number of clock cycles compared with several existing processors.
2種類のレジスタファイルを持つディジタル信号処理向けプロセッサのハードウェア/ソフトウェア分割手法

電子情報通信学会技術報告 VLD99-76 1999年11月
ディジタル信号処理向けプロセッサコアの面積/遅延見積り手法

片岡義治, 吉澤大, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD99-75 ( 475 ) 1 - 8 1999年11月

　概要を見る

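In a hardware/software cosynthesis system for digital signal processors with two types of register files, hardware/software partitioning needs estimates of the execution time of the application program and of the area of the generated processor core as evaluation values. To obtain these estimates, the estimation equations must be derived by using the system to vary the hardware units and analyzing the results of logic-synthesizing the resulting processor core descriptions with a logic synthesis tool. This paper reports how the area and delay estimation equations for processor cores are derived, together with verification results. For area estimation, we first show that the area of a processor core can be expressed as the sum of the area of the processor kernel and the area of the hardware units added to the kernel. We further note that the area of the processor kernel can be separated into a part that depends on the added hardware units and a part that depends on the number of general-purpose registers. The processor core area estimated with the derived equation is within about 2% of the area obtained by logic synthesis. For delay estimation, we show that the error can be reduced by deriving an estimation equation for each functional unit that makes up the critical path. The clock period of a processor core estimated with the derived equation is within 2 ns of the clock period obtained by logic synthesis.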

CiNii
A hardware/software cosynthesis system for digital signal processor cores

IEICE Trans. on Fundamentals of Electronics, Communications and Computer Sciences E82-A;11 2325 - 2337 1999年11月
制御処理ハードウェアの高位合成システムのための面積/時間最適化アルゴリズム

家長真行, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術報告 VLD99-66 ( 317 ) 15 - 22 1999年09月

　概要を見る

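This paper proposes an area/time optimizing algorithm for a high-level synthesis system targeting control-based hardware. The algorithm takes as input a call graph and the set of control-flow graphs that make up the call graph, and synthesizes a set of state-transition graphs representing the whole call graph under area and timing constraints. It first constructs state-transition graphs that satisfy only the timing constraint, and then transforms them so that they satisfy the area constraint. Because the algorithm operates directly on control-flow graphs, it can handle control processing such as bit-wise operations and conditional branches. Moreover, from a single call graph representing the whole application program, it can enumerate multiple hardware candidates that satisfy the area and timing constraints. The algorithm is applied to several control-based application programs, including Huffman coding, to evaluate its effectiveness.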

CiNii
制御処理を主体としたハードウェア記述生成手法

横山正幸

情報処理学会DAシンポジウム'99論文集 1999年07月

CiNii
制御処理を主体としたハードウェアを対象とする高位合成システムとその適用

情報処理学会DAシンポジウム'99論文集 1999年07月
2種類のレジスタファイルを持ったディジタル信号処理向けプロセッサのハードウェア/ソフトウェア協調合成システム

電子情報通信学会第12回回路とシステム軽井沢ワークショップ論文集 1999年04月
分枝限定法に基づく最適解を保証するリソースバインディング手法

情報処理学会論文誌 40;4 1565 - 1577 1999年04月
FPGAを用いた再構成可能システムとその応用

電子情報通信学会技術研究報告 VLD98;143 1999年03月
A depth-constrained technology mapping algorithm for logic-blocks composed of tree-structured LUTs

N Togawa, K Ara, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E82A ( 3 ) 473 - 482 1999年03月

　概要を見る

This paper proposes a fast depth-constrained technology mapping algorithm for logic-blocks composed of tree-structured lookup tables. First, we propose a technology mapping algorithm which minimizes the number of logic-blocks if an input Boolean network is a tree. Second, we propose a technology mapping algorithm which minimizes logic depth for any input Boolean network. Finally, we combine those two technology mapping algorithms and propose an algorithm which realizes technology mapping whose depth is bounded by a given upper bound d(c). Experimental results demonstrate the effectiveness and efficiency of the proposed algorithm.
A simultaneous placement and global routing algorithm for FPGAS with power optimization

N Togawa, K Ukai, M Yanagisawa, T Ohtsuki

JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS 9 ( 1-2 ) 99 - 112 1999年02月 [査読有り]

　概要を見る

This paper proposes a simultaneous placement and global routing algorithm for FPGAs with power optimization. The algorithm is based on hierarchical bipartitioning of layout regions and sets of logic-blocks. When bipartitioning a layout region, pseudo-blocks are introduced to preserve connections if there exist connections between bipartitioned logic-block sets. A global route is represented by a sequence of pseudo-blocks. Since pseudo-blocks and logic-blocks can be dealt with equally, placement and global routing are processed simultaneously. The algorithm gives weights to nets with high switching probabilities and attempts to assign the blocks connected by weighted nets to the same region. Thus their length is shortened and the power consumption of a whole circuit can be reduced. The experimental results demonstrate the effectiveness and efficiency of the algorithm.

DOI
2種類のレジスタファイルを持ったディジタル信号処理向けプロセッサのハードウェア/ソフトウェア協調合成システムとその並列化コンパイラ

中村剛, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 FTS98;132 71 - 78 1999年02月

　概要を見る

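To maintain high precision in digital signal processing, intermediate results must have a larger bit width than the input sequence. If a digital signal processor has two types of register files, it can realize digital signal processing applications with a small hardware area while maintaining precision. This paper proposes a hardware/software cosynthesis system, and its parallelizing compiler, for digital signal processors with two types of register files of different register bit widths. The system takes a C description of the application program and application data as input, and outputs a hardware description of the processor, object code that runs on the processor, and a software environment. From the given C description, the parallelizing compiler outputs assembly code that runs on a processor equipped with every hardware unit assumed by the target architecture. In doing so it extracts as much of the application's parallelism as possible, aiming to minimize execution time. It also assigns variables to the two types of register files based on two data types, generating assembly code that maintains precision. We report the results of evaluating the system through computer experiments.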

CiNii
A hardware/software partitioning algorithm for processor cores of digital signal processing

N Togawa, T Sakurai, M Yanagisawa, T Ohtsuki

PROCEEDINGS OF ASP-DAC '99 335 - 338 1999年 [査読有り]

　概要を見る

A hardware/software cosynthesis system for processor cores of digital signal processing has been developed. This paper focuses on a hardware/software partitioning algorithm which is one of the key issues in the system. Given an input assembly code generated by the compiler in the system, the proposed hardware/software partitioning algorithm first determines the types and the numbers of required hardware units, such as multiple functional units, hardware loop units, and particular addressing units, for a processor core (initial resource allocation). Second, the hardware units determined at initial resource allocation are reduced one by one while the assembly code meets a given timing constraint (configuration of a processor core). The execution time of the assembly code becomes longer but the hardware costs for a processor core to execute it become smaller. Finally, it outputs an optimized assembly code and a processor configuration. Experimental results demonstrate that the system synthesizes processor cores effectively according to the features of an application program/data.

Scopus citations: 2
FPGAのマクロブロックを対象とした配置概略配線同時処理手法

情報処理学会研究報告 98-DA;90 1998年12月
A high-level synthesis system for digital signal processing based on data-flow graph enumeration

N Togawa, T Hisaki, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E81A ( 12 ) 2563 - 2575 1998年12月

This paper proposes a high-level synthesis system for datapath design of digital signal processing hardware. The system consists of four phases: (1) DFG (data-flow graph) generation, (2) scheduling, (3) resource binding, and (4) HDL (hardware description language) generation. In (1), the system does not generate only one best DFG representing a given behavioral description of the hardware, but multiple good DFGs representing it. In (2) and (3), several synthesis tools can be incorporated into the system depending on the required objectives. Thus we can obtain multiple datapath candidates for a behavioral description, together with their area and performance evaluation. In (4), the best datapath design is selected among those candidates and its hardware description is generated. The experimental results of applying the system to several benchmarks show its effectiveness and efficiency.
Maple-opt: A performance-oriented simultaneous technology mapping, placement, and global routing algorithm for FPGA's

N Togawa, M Yanagisawa, T Ohtsuki

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 17 ( 9 ) 803 - 818 1998年09月 [査読有り]

A new field programmable gate array (FPGA) design algorithm, Maple-opt, is proposed for technology mapping, placement, and global routing subject to a given upper bound of critical signal path delay. The basic procedure of Maple-opt is viewed as top-down hierarchical bipartition of a layout region. In each bipartitioning step, technology mapping onto logic blocks of FPGA's, their placement, and global routing are determined simultaneously, which leads to a more congestion-balanced layout for routing. In addition, Maple-opt is capable of estimating a lower bound of the delay for a constrained path and of extracting critical paths based on the difference between the lower bounds and given constraint values in each bipartitioning step. Two delay-reduction procedures for the critical paths are applied: routing delay reduction and logic-block delay reduction. The routing delay reduction is done by assigning each constrained path to a single subregion when bipartitioning a region. The logic-block delay reduction is done by mapping each constrained path onto a smaller number of logic blocks. Experimental results for benchmark circuits demonstrate that Maple-opt reduces the maximum number of tracks per channel by a maximum of 38% compared with existing algorithms while satisfying almost all the path delay constraints.

Scopus citations: 9
最適解を保証するリソースバインディング手法

情報処理学会DAシンポジウム'98論文集 1998年07月
A fast scheduling algorithm based on gradual time-frame reduction for datapath synthesis

N Togawa, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E81A ( 6 ) 1231 - 1241 1998年06月

This paper proposes a fast scheduling algorithm based on gradual time-frame reduction for datapath synthesis of digital signal processing hardware. The objective of the algorithm is to minimize the costs for functional units and registers and to maximize connectivity under given computation time and initiation interval. Incorporating the connectivity in a scheduling stage can reduce multiplexer counts in resource binding. The algorithm maximizes connectivity while maintaining low time complexity and obtains datapath designs with small total hardware costs in the high-level synthesis environment. The algorithm also resolves inter-iteration data dependencies and thus realizes pipelined datapaths. The experimental results demonstrate that the proposed algorithm reduces the multiplexer counts after resource binding while maintaining low costs for functional units and registers compared with eight conventional schedulers.
分布定数回路の遅延感度解析に基づくクロック配線最適化手法

中嶋雄一郎, 鈴木将貴, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告 98-DA;88 ( 43 ) 21 - 28 1998年05月

In synchronous digital circuits, the clock signal must arrive at every flip-flop simultaneously, both for the circuit to operate stably and correctly and for reasons such as power consumption. Particularly in recent integrated circuits operating at high clock frequencies, circuit performance depends directly on the quality of the clock tree layout, so more accurate delay estimation is needed. This paper establishes a delay sensitivity analysis for clock trees whose branch wires are modeled as distributed constant circuits, and proposes a clock tree optimization algorithm based on this analysis. To compute delays accurately, the algorithm calculates the delay sensitivity for each sink of the clock tree based on the distributed constant circuit model, and then determines the optimal length of each wire by linear programming from the calculated values. Experimental results show the efficiency and effectiveness of the algorithm.

An FPGA layout reconfiguration algorithm based on global routes for engineering changes in system design specifications

N Togawa, K Hagi, M Yanagisawa, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E81A ( 5 ) 873 - 884 1998年05月

Rapid system prototyping is one of the main applications for field-programmable gate arrays (FPGAs). At the stage of rapid system prototyping, design specifications can often be changed since they cannot be determined completely. In this paper, layout design change is focused on and a layout reconfiguration algorithm is proposed for FPGAs. The target FPGA architecture is developed for transport processing. In order to implement a wider variety of circuits flexibly, it has three-input lookup tables (LUTs) as minimum logic cells. Since its logic granularity is finer than that of conventional FPGAs, it requires more routing resources to connect them and minimization of routing congestion is indispensable. In layout reconfiguration, the main problem is to add LUTs to initial layouts. Our algorithm consists of two steps: for given placement and global routing of LUTs, in Step 1 an added LUT is placed while allowing its position to overlap that of a preplaced LUT; then in Step 2 preplaced LUTs are moved to their adjacent positions so that the overlap of the LUT positions can be resolved. Global routes are updated corresponding to reconfiguration of placement. The algorithm keeps routing congestion small by evaluating global routes directly both in Steps 1 and 2. Especially in Step 2, if the minimum number of preplaced LUTs are moved to their adjacent positions, our algorithm minimizes routing congestion. Experimental results demonstrate that, if the number of added LUTs is at most 20% of the number of initial LUTs, our algorithm generates reconfigured layouts whose routing congestion is as small as that obtained by executing a conventional placement and global routing algorithm. Run time of our algorithm is within approximately one second.
ツリー状に接続されたLUTを対象とした深さ制約付きテクノロジーマッピング手法

電子情報通信学会第11回回路とシステム軽井沢ワークショップ論文集 1998年04月
パイプラインプロセッサのハードウェア記述自動生成手法

電子情報通信学会技術研究報告 VLD97;117 1998年03月
ディジタル信号処理向けプロセッサの自動合成システムにおける並列化コンパイラ

電子情報通信学会技術研究報告 VLD97;116 1998年03月
ディジタル信号処理向けプロセッサのハードウェア/ソフトウェア協調合成システム

戸川望, 桜井崇志, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD97;115 ( 576 ) 17 - 24 1998年03月

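This paper proposes a hardware/software cosynthesis system for digital signal processors, together with the hardware/software partitioning algorithm at its core. The synthesized processor is a VLIW-type processor that executes multiple instructions simultaneously. In addition to the processor kernel, it can include a multiple data memory bus configuration (X bus and Y bus), hardware loops, addressing units, and multiple functional units. Appropriate hardware can be selected from among these according to the application program, its data, and the hardware cost. Moreover, by adopting either a five-stage pipeline (called RISC type) or a three-stage pipeline (DSP type), the system synthesizes processors for applications ranging from those suited to RISC architectures to those suited to DSP architectures. Results of evaluating the system through computational experiments are reported.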

平成9年度（第21回）丹羽記念賞

丹羽記念会 1998年02月
An incremental placement and global routing algorithm for field-programmable gate arrays

N Togawa, K Hagi, M Yanagisawa, T Ohtsuki

PROCEEDINGS OF THE ASP-DAC '98 - ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 1998 WITH EDA TECHNO FAIR '98 519 - 526 1998年 [査読有り]

Rapid system prototyping is one of the main applications for field-programmable gate arrays (FPGAs). At the stage of rapid system prototyping, design specifications can often be changed since they cannot always be determined completely. In this paper, layout design change is focused on and a layout reconfiguration algorithm is proposed for FPGAs. In layout reconfiguration, the main problem is to add LUTs to initial layouts. Our algorithm consists of two steps: for given placement and global routing of LUTs, Step 1 places an added LUT while allowing its position to overlap that of a preplaced LUT; then Step 2 moves preplaced LUTs to their adjacent positions so that the overlap of the LUT positions can be resolved. Global routes are updated corresponding to reconfiguration of placement. The algorithm keeps routing congestion small by evaluating global routes directly both in Steps 1 and 2. Especially in Step 2, if the minimum number of preplaced LUTs are moved to their adjacent positions, our algorithm minimizes routing congestion. Experimental results demonstrate the effectiveness and efficiency of the algorithm.

A high-level synthesis system for digital signal processing based on enumerating data-flow graphs

N Togawa, T Hisaki, M Yanagisawa, T Ohtsuki

PROCEEDINGS OF THE ASP-DAC '98 - ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 1998 WITH EDA TECHNO FAIR '98 265 - 274 1998年 [査読有り]

This paper proposes a high-level synthesis system for datapath design of digital signal processing hardware. The system consists of four phases: (1) DFG (data-flow graph) generation, (2) scheduling, (3) resource binding, and (4) HDL (hardware description language) generation. In (1), the system does not generate only one best DFG representing a given behavioral description of the hardware, but multiple good DFGs representing it. In (2) and (3), several synthesis tools can be incorporated into the system depending on the required objectives. Thus we can obtain multiple datapath candidates for a behavioral description, together with their area and performance evaluation. In (4), the best datapath design is selected among those candidates and its hardware description is generated. The experimental results of applying the system to several benchmarks show its effectiveness and efficiency.

A Fast Scheduling Algorithm Based on Gradual Time-Frame Reduction for Datapath Synthesis

IEICE Trans on Fundamentals of Electronics, Communications and Computer Sciences E81-A/6 1231 - 1240 1998年
A simultaneous placement and global routing algorithm for FPGAs with power optimization

N Togawa, K Ukai, M Yanagisawa, T Ohtsuki

APCCAS '98 - IEEE ASIA-PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS 125 - 128 1998年

This paper proposes a simultaneous placement and global routing algorithm for FPGAs with power optimization. The algorithm is based on hierarchical bipartitioning of layout regions and sets of logic-blocks. When bipartitioning a layout region, pseudo-blocks are introduced to preserve connections if there exist connections between bipartitioned logic-block sets. A global route is represented by a sequence of pseudo-blocks. Since pseudo-blocks and logic-blocks can be dealt with equally, placement and global routing are processed simultaneously. The algorithm gives weights to the nets with high switching probabilities and assigns the blocks connected by weighted nets to the same region. Thus their length is shortened and the power consumption of a whole circuit can be reduced. The experimental results demonstrate the effectiveness and efficiency of the algorithm.
Maple-opt: A performance-oriented simultaneous technology mapping, placement, and global routing algorithm for FPGA's

Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 17 ( 9 ) 803 - 818 1998年

A new field programmable gate array (FPGA) design algorithm, Maple-opt, is proposed for technology mapping, placement, and global routing subject to a given upper bound of critical signal path delay. The basic procedure of Maple-opt is viewed as top-down hierarchical bipartition of a layout region. In each bipartitioning step, technology mapping onto logic blocks of FPGA's, their placement, and global routing are determined simultaneously, which leads to a more congestion-balanced layout for routing. In addition, Maple-opt is capable of estimating a lower bound of the delay for a constrained path and of extracting critical paths based on the difference between the lower bounds and given constraint values in each bipartitioning step. Two delay-reduction procedures for the critical paths are applied: routing delay reduction and logic-block delay reduction. The routing delay reduction is done by assigning each constrained path to a single subregion when bipartitioning a region. The logic-block delay reduction is done by mapping each constrained path onto a smaller number of logic blocks. Experimental results for benchmark circuits demonstrate that Maple-opt reduces the maximum number of tracks per channel by a maximum of 38% compared with existing algorithms while satisfying almost all the path delay constraints. © 1998 IEEE.

Scopus citations: 9
A performance-oriented circuit partitioning algorithm with logic-block replication for multi-FPGA systems

N Togawa, M Sato, T Ohtsuki

JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS 7 ( 5 ) 373 - 393 1997年10月 [査読有り]

In this paper, we extend the circuit partitioning algorithm which we had proposed for multi-FPGA systems and present a new algorithm in which the delay of each critical signal path is within a specified upper bound imposed on it. The core of the presented algorithm is recursive bipartitioning of a circuit. The bipartitioning procedure consists of three stages: (0) detection of critical paths; (1) bipartitioning of a set of primary inputs and outputs; and (2) bipartitioning of a set of logic-blocks. In (0), the algorithm computes the lower bounds of delays for paths with path delay constraints and detects the critical paths based on the difference between the lower and upper bounds dynamically in every bipartitioning procedure. The delays of the critical paths are reduced with higher priority. In (1), the algorithm attempts to assign the primary inputs and outputs on each critical path to one chip so that the critical path does not cross between chips. Finally in (2), the algorithm not only decreases the number of crossings between chips but also assigns the logic-blocks on each critical path to one chip by exploiting a network flow technique. The algorithm has been implemented and applied to MCNC PARTITIONING 93 benchmark circuits. The experimental results demonstrate that it resolves almost all path delay constraints while maintaining the maximum number of required I/O blocks per chip small compared with conventional algorithms.

連想メモリを搭載したハードウェアエンジンによる故障回路並列故障シミュレーションの高速化手法

福山誠一郎, 戸川望, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告 CPSY97;76 ( 103 ) 81 - 88 1997年10月

Running parallel-fault simulation on a hardware engine equipped with CAM (Content Addressable Memory) is known to reduce simulation time compared with using a serial computer. This is because CAM functions such as word-parallel equivalence search and word-parallel writing allow all faulty circuits to be simulated in parallel on the CAM. However, as the target circuit grows larger, it may not be possible to store all faulty circuits on the CAM at once. Conventionally, the faulty circuits are divided into groups small enough to be stored on the CAM at one time and processed group by group, but the communication this requires between the CAM and the host computer becomes a bottleneck that limits simulation speed. This paper proposes a method that reduces the delay caused by communication with the host computer by providing communication RAM on the CAM-based hardware engine, and reports the results of evaluating the proposed method by computational experiments.

Fast scheduling and allocation algorithms for entropy CODEC

K Suzuki, N Togawa, M Sato, T Ohtsuki

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E80D ( 10 ) 982 - 992 1997年10月

Entropy coding/decoding are implemented on FPGAs as a fast and flexible system in which high-level synthesis technologies are key issues. In this paper, we propose scheduling and allocation algorithms for behavioral descriptions of entropy CODEC. The scheduling algorithm employs a control-flow graph as input and finds a solution with minimal hardware cost and execution time by merging nodes in the control-flow graph. The allocation algorithm assigns operations to operators with various bit lengths. As a result, register-transfer level descriptions are efficiently obtained from behavioral descriptions of entropy CODEC with complicated control flow and variable bit lengths. Experimental results demonstrate that our algorithms synthesize the same circuits as manually designed within one second.
A performance-oriented simultaneous placement and global routing algorithm for transport-processing FPGAs

N Togawa, M Sato, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E80A ( 10 ) 1795 - 1806 1997年10月

In layout design of transport-processing FPGAs, it is required that not only routing congestion is kept small but also circuits implemented on them operate with higher operation frequency. This paper extends the proposed simultaneous placement and global routing algorithm for transport-processing FPGAs whose objective is to minimize routing congestion and proposes a new algorithm in which the length of each critical signal path (path length) is limited within a specified upper bound imposed on it (path length constraint). The algorithm is based on hierarchical bipartitioning of layout regions and LUT (LookUp Table) sets to be placed. In each bipartitioning, the algorithm first searches the paths with tighter path length constraints by estimating their path lengths. Second, the algorithm proceeds with the bipartitioning so that the path lengths of critical paths can be reduced. The algorithm is applied to transport-processing circuits and compared with conventional approaches. The results demonstrate that the algorithm satisfies the path length constraints for 11 out of 13 circuits, though it increases routing congestion by an average of 20%. After detailed routing, it achieves 100% routing for all the circuits and decreases a circuit delay by an average of 23%.
機能メモリを使用したプロセッサを対象とするハードウェア/ソフトウェア協調合成システム

電子情報通信学会技術研究報告 CPSY98;85 1997年09月
ディジタル信号処理を対象とした高位合成システムにおける高速なスケジューリングアルゴリズム

情報処理学会DAシンポジウム'97論文集 1997年07月
スケッチレイアウトシステムにおけるBGAパッケージ配線手法

回路実装学会誌 12;4 241 - 246 1997年07月

FPGAを対象とした低消費電力指向配置・概略配線同時処理手法

鵜飼薫, 戸川望, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告 VLD97;42 ( 141 ) 191 - 198 1997年06月

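Attempts to increase the versatility of system LSIs by using FPGAs as the interface between a core chip and peripheral LSIs have been attracting attention. However, FPGAs consume more power than gate arrays and similar devices, so reducing the power consumption of a whole system LSI requires low-power design methods targeting FPGAs. This paper proposes a simultaneous placement and global routing algorithm for FPGAs aimed at reducing power consumption. The algorithm is based on hierarchical bipartitioning of layout regions and of the sets of logic blocks to be placed. When there are connections between partitioned regions, pseudo-blocks are generated to preserve the connections between blocks. Because routes are represented by pseudo-blocks, they can be handled in the same way as logic blocks, and placement and global routing are processed simultaneously. The algorithm gives weights to nets with high switching probabilities and assigns the blocks connected by those nets to the same region, which shortens the wire length of such nets and reduces the power consumption of the whole circuit. Results of evaluating the effectiveness of the algorithm through computational experiments are reported.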

システム設計仕様部分的変更を実現する概略配線径路を考慮したFPGA向けレイアウト再構成手法

電子情報通信学会第10回回路とシステム軽井沢ワークショップ論文集 1997年04月
A circuit partitioning algorithm with path delay constraints for multi-FPGA systems

N Togawa, M Sato, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E80A ( 3 ) 494 - 505 1997年03月

In this paper, we extend the circuit partitioning algorithm which we have proposed for multi-FPGA systems and present a new algorithm in which the delay of each critical signal path is within a specified upper bound imposed on it. The core of the presented algorithm is recursive bipartitioning of a circuit. The bipartitioning procedure consists of three stages: 0) detection of critical paths; 1) bipartitioning of a set of primary inputs and outputs; and 2) bipartitioning of a set of logic-blocks. In 0), the algorithm computes the lower bounds of delays for paths with path delay constraints and detects the critical paths based on the difference between the lower and upper bounds dynamically in every bipartitioning procedure. The delays of the critical paths are reduced with higher priority. In 1), the algorithm attempts to assign the primary inputs and outputs on each critical path to one chip so that the critical path does not cross between chips. Finally in 2), the algorithm not only decreases the number of crossings between chips but also assigns the logic-blocks on each critical path to one chip by exploiting a network flow technique. The algorithm has been implemented and applied to MCNC PARTITIONING 93 benchmark circuits. The experimental results demonstrate that it resolves almost all path delay constraints while maintaining the maximum number of required I/O blocks per chip small compared with conventional algorithms.
スケッチレイアウトシステムにおけるBGAパッケージ配線手法

電子情報通信学会VLSI設計技術研究会 VLD96;96 1997年03月
接続コストの最小化を目的とした高速アロケーション手法

加藤健吉, 戸川望, 佐藤政生, 大附辰夫

電子情報通信学会VLSI設計技術研究会 VLD96;96 ( 556 ) 1 - 8 1997年03月

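This paper proposes a fast allocation algorithm for datapath design of digital signal processing hardware. The algorithm aims to minimize the connection cost in a register-transfer level architecture and is based on (1) register allocation by minimum-weight matching on a bipartite graph, (2) functional unit allocation based on connection probabilities, and (3) register reallocation based on connection probabilities. Because minimizing the connection cost is the objective throughout (1), (2), and (3), the algorithm can compute near-optimal solutions relatively quickly. Results of implementing the algorithm on a computer and evaluating it are reported.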

A Circuit Partitioning Algorithm with Replication Capability for Multi-FPGA Systems

IEICE Trans. on Fundamentals of Electronics, Communications and Computer Sciences E78-A/13 1118 - 1123 1997年
A performance-oriented circuit partitioning algorithm with logic-block replication for multi-FPGA systems

Nozomu Togawa, Masao Sato, Tatsuo Ohtsuki

Journal of Circuits, Systems and Computers 7 ( 5 ) 373 - 393 1997年

In this paper, we extend the circuit partitioning algorithm which we had proposed for multi-FPGA systems and present a new algorithm in which the delay of each critical signal path is within a specified upper bound imposed on it. The core of the presented algorithm is recursive bipartitioning of a circuit. The bipartitioning procedure consists of three stages: (0) detection of critical paths; (1) bipartitioning of a set of primary inputs and outputs; and (2) bipartitioning of a set of logic-blocks. In (0), the algorithm computes the lower bounds of delays for paths with path delay constraints and detects the critical paths based on the difference between the lower and upper bounds dynamically in every bipartitioning procedure. The delays of the critical paths are reduced with higher priority. In (1), the algorithm attempts to assign the primary inputs and outputs on each critical path to one chip so that the critical path does not cross between chips. Finally in (2), the algorithm not only decreases the number of crossings between chips but also assigns the logic-blocks on each critical path to one chip by exploiting a network flow technique. The algorithm has been implemented and applied to MCNC PARTITIONING 93 benchmark circuits. The experimental results demonstrate that it resolves almost all path delay constraints while maintaining the maximum number of required I/O blocks per chip small compared with conventional algorithms.

Simultaneous placement and global routing for transport-processing FPGA layout

N Togawa, M Sato, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E79A ( 12 ) 2140 - 2150 1996年12月

Transport-processing FPGAs have been proposed for flexible telecommunication systems. Since those FPGAs have finer granularity of logic functions to implement circuits on them, the amount of routing resources tends to increase. In order to keep routing congestion small, it is necessary to execute placement and routing simultaneously. This paper proposes a simultaneous placement and global routing algorithm for transport-processing FPGAs whose primary objective is minimizing routing congestion. The algorithm is based on hierarchical bipartition of layout regions and sets of LUTs (LookUp Tables) to be placed. It achieves bipartitioning which leads to small routing congestion by applying a network flow technique to it and computing a maximum flow and a minimum cut. If there exist connections between bipartitioned LUT sets, pairs of pseudo-terminals are introduced to preserve the connections. A sequence of pseudo-terminals represents a global route of each net. As a result, both placement of LUTs and global routing are determined when hierarchical bipartitioning procedures are finished. The proposed algorithm has been implemented and applied to practical transport-processing circuits. The experimental results demonstrate that it decreases routing congestion by an average of 37% compared with a conventional algorithm and achieves 100% routing for the circuits for which the conventional algorithm causes unrouted nets.
Dharmaアーキテクチャに基づくＦＰＧＡチップの試作

マイクロエレクトロニクス研究開発機構第１５回研究交流会 1996年12月
Scheduling and Allocation Algorithms for Entropy CODEC

SUZUKI K.

Proceedings of the Sixth Workshop on Synthesis and System Integration of Mixed Technologies (SASIMI'96) 149 - 154 1996年11月

パス長制約を考慮した通信処理用FPGA向け配置・概略配線同時処理手法

戸川望, 佐藤政生, 大附辰夫

情報処理学会設計自動化研究会 DA96;81 ( 299 ) 9 - 16 1996年10月

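Layout synthesis for transport-processing FPGAs requires circuit designs that keep routing congestion small and also enable high-speed operation. This paper extends the simultaneous placement and global routing algorithm previously proposed for transport-processing FPGAs with the objective of minimizing routing congestion, and proposes an algorithm that takes as a constraint an upper bound on the length of each timing-critical signal path (path length) and keeps each path length within that bound. The algorithm is based on hierarchical bipartitioning. Each bipartitioning procedure consists of three phases: (0) evaluation of path length constraints, (1) bipartitioning of a set of terminals, and (2) bipartitioning of a set of LUTs. Phase (0) searches for the paths with tighter path length constraints, and phases (1) and (2) are executed so that the lengths of those paths are reduced with higher priority, with the aim of satisfying the path length constraints. Results of applying the algorithm to several transport-processing circuits in evaluation experiments are reported.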

高位合成システムを用いた画像符号化アルゴリズムのハードウェア合成法

情報処理学会DAシンポジウム'96論文集 1996年08月
データパス設計を対象とした高位合成システム

戸川望

情報処理学会DAシンポジウム'96論文集 1996年08月

安藤研究所第9回安藤博記念学術奨励賞

1996年06月
通信処理用FPGAを対象とした配置・概略配線同時処理手法

戸川望, 佐藤政生, 大附辰夫

情報処理学会設計自動化研究会 DA96;80 15 - 22 1996年05月

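This paper proposes a simultaneous placement and global routing algorithm for transport-processing FPGAs whose objective is to minimize routing congestion. The algorithm is based on hierarchical bipartitioning of layout regions and of the sets of LUTs (LookUp Tables) to be placed. It applies a network flow technique and computes a minimum cut from a maximum flow to obtain bipartitions with small routing congestion. Connections required between bipartitioned LUT sets are preserved by virtually introduced terminals called pseudo-terminals. A sequence of pseudo-terminals represents a global route. As a result, both placement and global routing are determined when the hierarchical bipartitioning is finished. The effectiveness of the algorithm is evaluated through computational experiments.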

プリント配線板を対象とした二層均等化スペーシング手法

金井宏和, 戸川望, 佐藤政生, 大附辰夫

情報処理学会設計自動化研究会 DA96;80 ( 51 ) 9 - 14 1996年05月

In the design of printed wiring boards, parts are first placed and wires are then routed among them, so it is not always possible to route all wires among the placed parts. Spacing is the process, performed prior to routing, of moving the placed parts so that all wires can be routed. This paper proposes a spacing algorithm for double-sided printed wiring boards that use surface-mount technology, in which parts are placed and wires are routed on both sides of the board. The algorithm relocates the preplaced parts so that each routing region receives the space needed for routing in proportion to the number of wires among the parts. If the initial layout has design rule violations such as overlaps of preplaced parts, the algorithm resolves them by securing the space needed for routing and removing the overlaps at the same time. The algorithm was implemented on a computer, and experimental results show that it is effective.

電子情報通信学会第8回回路とシステム軽井沢ワークショップ研究奨励賞

1996年04月
A simultaneous technology mapping, placement, and global routing algorithm for FPGAs with path delay constraints

N Togawa, M Sato, T Ohtsuki

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E79A ( 3 ) 321 - 329 1996年03月 [査読有り]

In this paper, we propose a new FPGA design algorithm, Maple-opt, in which technology mapping, placement, and global routing are executed so that the delay of each critical signal path in an input circuit is within a specified upper bound imposed on it. The basic algorithm of Maple-opt is top-down hierarchical bi-partitioning of regions. Technology mapping onto logic-blocks of FPGAs, their placement, and global routing are determined simultaneously in each hierarchical process. This simultaneity leads to a less congested layout for routing. In addition, Maple-opt computes a lower bound of delay for each path with a constraint value and determines critical paths dynamically in each hierarchical process, based on the difference between the lower bound and the constraint value. Two delay reduction processes are executed for the critical paths; one is routing delay reduction and the other is logic-block delay reduction. Routing delay reduction is realized such that, when bi-partitioning a region, each constrained path is assigned to one subregion. Logic-block delay reduction is realized such that each constrained path is mapped onto fewer logic-blocks. Experimental results for some benchmark circuits show its efficiency and effectiveness.
A simultaneous placement and global routing algorithm with path length constraints for transport-processing FPGAs

N Togawa, M Sato, T Ohtsuki

PROCEEDINGS OF THE ASP-DAC '97 - ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE 1997 569 - 578 1996年 [査読有り]

In layout design of transport-processing FPGAs, it is required that not only routing congestion is kept small but also circuits implemented on them operate with higher operation frequency. This paper extends the proposed simultaneous placement and global routing algorithm for transport-processing FPGAs whose objective is to minimize routing congestion and proposes a new algorithm in which the length of each critical signal path (path length) is limited within a specified upper bound imposed on it (path length constraint). The algorithm is based on hierarchical bipartitioning of layout regions and LUT (LookUp Table) sets to be placed. Each bipartitioning procedure consists of three phases: (0) estimation of path lengths, (1) bipartitioning of a set of terminals, and (2) bipartitioning of a set of LUTs. After searching the paths with tighter path length constraints by estimating path lengths in (0), (1) and (2) are executed so that their path lengths are reduced with higher priority and thus path length constraints are not violated. The algorithm has been implemented and applied to transport-processing circuits compared with conventional approaches. The results demonstrate that the algorithm resolves path length constraints for 11 out of 13 circuits, though it increases routing congestion by an average of 20%. After detailed routing, it achieves 100% routing for all the circuits and decreases a circuit delay by an average of 23%.

A performance-oriented circuit partitioning algorithm with logic-block replication for multi-FPGA systems

N Togawa, M Sato, T Ohtsuki

APCCAS '96 - IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS '96 294 - 297 1996年

This paper proposes a circuit partitioning algorithm in which the delay of each critical signal path is within a specified upper bound. Its core is recursive bipartitioning of a circuit which consists of three stages: 0) detection of critical paths; 1) bipartitioning of a set of primary inputs and outputs; and 2) bipartitioning of a set of logic-blocks. In 0), the algorithm detects the critical paths based on their lower bounds of delays. The delays of the critical paths are reduced with higher priority. In 1), the algorithm attempts to assign the primary input and output on each critical path to one chip. In 2), the algorithm not only decreases the number of crossings between chips but also assigns the logic-blocks on each critical path to one chip by exploiting a network flow technique with logic-block replication. The experimental results demonstrate that it resolves almost all path delay constraints while keeping the maximum number of required I/O blocks per chip small compared with conventional algorithms.
Maple-opt: a simultaneous technology mapping, placement, and global routing algorithm for FPGAs with performance optimization.

Nozomu Togawa, Masao Sato, Tatsuo Ohtsuki

Proceedings of the 1995 Conference on Asia Pacific Design Automation, Makuhari Messe, Chiba, Japan, August 29 - September 1, 1995 319 - 327 1995年 [査読有り]

MAPLE - A SIMULTANEOUS TECHNOLOGY MAPPING, PLACEMENT, AND GLOBAL ROUTING ALGORITHM FOR FIELD-PROGRAMMABLE GATE ARRAYS

N TOGAWA, M SATO, T OHTSUKI

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E77A ( 12 ) 2028 - 2038 1994年12月 [査読有り]

Technology mapping algorithms for LUT (Look Up Table) based FPGAs have been proposed to transfer a Boolean network into logic-blocks. However, since those algorithms take no layout information into account, they do not always lead to excellent results. In this paper, a simultaneous technology mapping, placement and global routing algorithm for FPGAs, Maple, is presented. Maple is an extended version of a simultaneous placement and global routing algorithm for FPGAs, which is based on recursive partition of layout regions and block sets. Maple inherits its basic process and executes the technology mapping simultaneously in each recursive process. Therefore, the mapping can be done with the placement and global routing information. Experimental results for some benchmark circuits demonstrate its efficiency and effectiveness.
MAPLE - A SIMULTANEOUS TECHNOLOGY MAPPING, PLACEMENT, AND GLOBAL ROUTING ALGORITHM FOR FPGAS

N TOGAWA, M SATO, T OHTSUKI

APCCAS '94 - 1994 IEEE ASIA-PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS 554 - 559 1994年 [査読有り]
A SIMULTANEOUS PLACEMENT AND GLOBAL ROUTING ALGORITHM FOR FPGAS

N TOGAWA, M SATO, T OHTSUKI

1994 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL 1 A483 - A486 1994年 [査読有り]

A simultaneous technology mapping, placement, and global routing algorithm for field-programmable gate arrays.

Nozomu Togawa, Masao Sato, Tatsuo Ohtsuki

Proceedings of the 1994 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 1994, San Jose, California, USA, November 6-10, 1994 156 - 163 1994年 [査読有り]


書籍等出版物

ハードウェアトロイ検知 ―半導体設計情報に潜むハードウェア版マルウェアの見つけ方―

戸川望, 長谷川健人, 永田真一( 担当：共著)

オーム社 2024年11月 ISBN: 9784274232688
CMOS VLSI 回路設計応用編

ウェスト,ハリス著, 宇佐美公良, 池田誠, 小林和淑監訳, 戸川望他分担共訳( 担当：共訳)

丸善出版 2014年01月 ISBN: 9784621087206
組込みシステム概論

戸川望編著

CQ出版 2008年02月 ISBN: 9784789845502

研究シーズ

共同研究・競争的資金等の研究課題

プライバシを担保した機械学習モデルによる設計工程ハードウェアトロイ検知

日本学術振興会科学研究費助成事業

研究期間:

2025年04月

-

2028年03月

戸川望
量子・古典ハイブリッドテストベッドの利用環境整備

戦略的イノベーション創造プログラム（SIP）第三期

研究期間:

2023年10月

-

2028年03月
再構成アクセラレータのための近似最適化手法

日本学術振興会科学研究費助成事業

研究期間:

2023年04月

-

2026年03月

木村晋二, 戸川望, 孫鶴鳴

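This fiscal year, we surveyed the literature on data representations for reconfigurable accelerators, ad hoc design methods for approximate arithmetic units, and systematic synthesis methods for approximate circuits. Because approximate arithmetic units are optimized under a trade-off between error and metrics such as power, error evaluation is essential. For ad hoc approximate multiplier design, we focused on the compressors that accumulate each column of partial products in a multiplication. We performed Pareto optimization and evaluation of three designs and presented the results at an international conference: a multiplier built from compressors that compress their inputs into two outputs of the same weight; a multiplier that uses a newly proposed partial-product compressor and reduces error by combining different compressor types; and an unbiased multiplier whose errors occur in the positive and negative directions with equal probability. We also designed and evaluated an 8-point approximate DCT (Discrete Cosine Transform) circuit based on optimizing signed binary numbers, in which each digit carries a sign, and presented it at an international conference. When multiplying by the constant DCT coefficients, a signed binary representation turns multiplication by a run of consecutive 1s into a single subtraction, so approximating each coefficient by the simplest possible signed binary number, chosen according to its effect on the output, greatly reduces the hardware resources of the computation. Toward automatic synthesis of approximate circuits, we studied exact synthesis, which implements a given logic function with the provably minimum number of elements, and proposed an exact synthesis method for majority operations, whose output is 1 when at least two of three inputs are 1. Exact synthesis checks, in increasing order of element count, whether any structure can realize the target function. When the element count is large, checking the structures cluster by cluster is more efficient than checking all of them at once, so we proposed clustering by the levels of each element's inputs, showed that it reduces total synthesis time compared with other clustering methods, and published the result in a journal.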
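The summary above notes that, in a signed binary representation, multiplying by a run of consecutive 1s reduces to a single subtraction (for example, 7 = 8 - 1). The sketch below illustrates that property with the standard non-adjacent form for positive integer constants. It is an illustration only: the project's coefficient approximation and DCT circuit are not reproduced, and the function names `to_signed_digits` and `multiply_by_constant` are hypothetical.

```python
def to_signed_digits(n):
    """Return the non-adjacent signed-digit form of a positive integer.

    Digits are listed least-significant first and each is -1, 0, or +1.
    """
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)  # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def multiply_by_constant(x, digits):
    """Multiply x by the encoded constant using only shifts, adds, and subtracts."""
    acc = 0
    for i, d in enumerate(digits):
        if d == 1:
            acc += x << i
        elif d == -1:
            acc -= x << i
    return acc


print(to_signed_digits(7))                           # 0b0111 becomes 8 - 1
print(multiply_by_constant(5, to_signed_digits(7)))  # (5 << 3) - 5
```

Each non-zero digit costs one adder or subtractor in a constant multiplier, which is why fewer non-zero digits translate into less hardware.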
量子計算及びイジング計算システムの統合型研究開発

NEDO 高効率･高速処理を可能とするAIチップ・次世代コンピューティングの技術開発

研究期間:

2021年04月

-

2026年03月
攻撃に耐性を持つ機械学習モデルによる設計工程ハードウェアトロイ検知

日本学術振興会科学研究費助成事業

研究期間:

2022年04月

-

2025年03月

戸川望, 木村晋二

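This research targets integrated-circuit design data at the register-transfer level, the logic level, and similar levels. It uses machine "learning" of hardware Trojans to evolve machine-learning models, with the goal of establishing a technique that "classifies each signal line as Trojan or non-Trojan" in design data that contains unknown hardware Trojans or perturbed hardware Trojans (unknown design data). It further aims to clarify attacks that "deceive" the machine-learning model itself and to build hardware-Trojan detection that is theoretically "hard to deceive".
To achieve this goal, in fiscal year 2023 we used the optimization of hardware-Trojan "features" carried out in fiscal year 2022 and the "perturbation" of hardware Trojans to construct, from the defender's standpoint, attack-resistant machine-learning models.
Building on these results, in fiscal year 2023 we added "perturbations" to circuit information at the targeted design stage. The perturbations (1) preserve circuit functionality and (2) change the signal-line features that make up a hardware Trojan. We confirmed that such perturbations degrade the classification performance of the machine-learning model. We then extracted the signal-line features that are unchanged by these performance-degrading perturbations, that is, the perturbation-robust features, and built a new machine-learning model from them. In doing so, we devised model generation by data augmentation and an adversarial training method for hardware Trojans, and constructed and evaluated a machine-learning model that is theoretically resistant to attacks. We also applied this process to the various hardware-Trojan big data sets held by the principal investigator and colleagues and evaluated the results.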
地理空間情報を自在に操るイジング計算機の新展開

科学技術振興機構戦略的な研究開発の推進戦略的創造研究推進事業 CREST

研究期間:

2019年

-

2024年

戸川望

本研究はSociety5.0の実現に不可欠な「地理空間情報処理」の高度化に焦点をあて,ノイマン型コンピューティング技術によるプログラムパラダイムを抜本的に変革し,地理空間情報処理向けイジングプログラミングを確立します.多種制約付き多地点最適巡回経路探索など多くの地理空間情報処理問題をイジング模型にマッピング,実イジング計算機にエンベッドし,実規模かつ実制約を持つ地理空間情報処理問題を解法します.
次世代アクセラレータ基盤に係る研究開発

戦略的イノベーション創造プログラム（SIP）第二期

研究期間:

2019年10月

-

2023年03月
機械学習による集積回路設計データ中のハードウェアトロイ検知

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

2019年04月

-

2022年03月

戸川望, 木村晋二

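Today, integrated-circuit design and manufacturing are actively outsourced to reduce cost, and "hardware Trojans", malicious circuits deliberately inserted by malicious external designers or manufacturers, have been identified as a realistic threat. In particular, a hardware Trojan inserted into IC design data can cause a serious incident through only a minor modification of that data. At present a cat-and-mouse game continues: whenever a countermeasure against hardware Trojans is developed, a hardware Trojan that evades it is developed in turn. In this research, we established a technique that detects both known and unknown hardware Trojans by actively learning various features of hardware Trojans in IC design data.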
再構成アクセラレータにおけるデータ形式最適化と精度保証

日本学術振興会科学研究費助成事業基盤研究(B)

研究期間:

2018年04月

-

2021年03月

木村晋二, 戸川望

平成30年度は、画像コーデックにおける中間データの圧縮方法および CNN (Convolutional Neural Network) のデータ形式の最適化に関する研究を行った。とくに、画像データの組込み圧縮法と、畳み込み演算における Approximate Computing 手法の研究を行った。環境整備としては、CNN の開発環境の caffe での誤差解析のための環境整備ならびにその上での評価実験と、FPGA の高位合成環境の整備とテストを行った。また、データ形式を決めた場合の誤差の伝播と蓄積の計算法についての文献調査や理論的な検討を行った。
まず、画像処理のための画像データの組込み圧縮法に関して、情報ロスを含む Lossy圧縮法の提案を行った。本手法は、量子化と可変長符号化を組合せた方式に基づいており、メモリバンド幅を大幅に削減することを可能とした。
同時に、CNNの誤差を許容できる性質を用いて、演算の一部を簡略化し、回路の面積、遅延、電力を削減する Approximate Computing 手法の研究を行い、乗算回路の新たな Approximate 手法をいくつか提案した。乗算では部分積の加算を繰返すが、下位側を OR で近似計算する手法や、部分積の順序を入れ替えることで精度を保持したまま回路を簡単化する手法の検討を行った。
また、Ubuntu で caffe に基づく環境を用い、浮動小数点データに対する種々のデータ形式の適用と、乗算と加算における演算誤差の解析方法および誤差の伝播方法の研究を行った。これまでに、浮動小数点数のカスタマイズについて、仮数部のビット数を3 に、また指数部のビット数を5 に削減してもトレーニングを含め十分な認識精度が得られるという結果を得ている。ここではそれを発展させ、動的なデータ形式の変更に関する検討を行った。
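The summary above reports that a customized floating-point format with a 3-bit mantissa and a 5-bit exponent still gave sufficient recognition accuracy. The following is a minimal sketch of what rounding a value to such a format can look like. Everything beyond the two bit widths is an assumption made for illustration: an IEEE-style exponent bias, an implicit leading 1, saturation on overflow, flush-to-zero on underflow, finite inputs only, and Python's built-in `round` (round half to even). The function name `quantize` is hypothetical and this is not the project's tooling.

```python
import math


def quantize(x, mantissa_bits=3, exponent_bits=5):
    """Round a finite float to a toy format with the given field widths."""
    if x == 0.0:
        return 0.0
    bias = 2 ** (exponent_bits - 1) - 1   # assumed IEEE-style bias (15 for 5 bits)
    scale = 2 ** mantissa_bits
    m, e = math.frexp(abs(x))             # abs(x) == m * 2**e with 0.5 <= m < 1
    m, e = m * 2.0, e - 1                 # renormalize so that 1 <= m < 2
    m = round(m * scale) / scale          # keep mantissa_bits fractional bits
    if m == 2.0:                          # rounding carried into the next power of two
        m, e = 1.0, e + 1
    if e > bias:                          # too large: saturate to the largest value
        m, e = 2.0 - 1.0 / scale, bias
    elif e < 1 - bias:                    # too small: flush to zero
        return math.copysign(0.0, x)
    return math.copysign(m * 2.0 ** e, x)


for value in (1.0, 0.1, 3.14159, -3.14159, 1e10, 1e-10):
    print(value, "->", quantize(value))
```

Applying a function like this to weights and activations, and comparing accuracy against the full-precision baseline, is one simple way to explore how few bits a network tolerates.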
イジングマシン共通ソフトウェア基盤の研究開発

NEDO 高効率･高速処理を可能とするAIチップ・次世代コンピューティングの技術開発

研究期間:

2018年09月

-

2021年03月

戸川望
設計・製造におけるチップの脆弱性検知手法の研究開発

総務省・ICT重点技術の研究開発プロジェクト

研究期間:

2019年09月

-

2020年03月
超微細加工技術にも適応する抽象LSIモデルの構築と高位・物理統合化LSI合成技術

科学研究費助成事業(早稲田大学) 科学研究費助成事業(基盤研究(B))

研究期間:

2010年

-

2012年

戸川望

本研究では,第一に超微細加工プロセスによって製造されるLSI にも適応すべく,レジスタ-制御回路-機能モジュール間に結び付きの概念を導入し,LSI 内部の構成要素を物理的な結合と論理的な結合で抽象化した抽象LSI モデルを構築した.構築した抽象LSI モデルを導入することで,きわめて見通し良く高位設計と物理設計とをインターフェースすることが可能となる.次にこの抽象LSI モデルの上で,高位合成と物理合成とを統合化する新たなLSI 自動合成技術を構築しアルゴリズム化した.シミュレーション実験ならびに一部チップ試作により提案構築した技術の優位性を確認した.
認知科学とプリント基板配線を応用した屋内経路探索とその表示手法

科学研究費助成事業(早稲田大学) 科学研究費助成事業(挑戦的萌芽研究)

研究期間:

2009年

-

2011年

戸川望

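Mobile communication services are growing rapidly thanks to advances in mobile networks and to smaller, more powerful computers. Indoor pedestrian navigation, however, has stagnated because position tracking is difficult indoors and because, unlike outdoor environments, no road network exists. In this research, we apply printed circuit board routing to build a modeling method for pedestrian navigation systems in indoor spaces, and use it to build a total navigation system that includes route search and route guidance grounded in cognitive science. We carry out and evaluate operational experiments in a real environment.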
選択論理を利用した超高速な差積演算回路とその実応用回路の設計

産学が連携した研究開発成果の展開研究成果展開事業研究成果最適展開支援プログラム(A-STEP) 探索タイプ

研究期間:

2010年

　

　

戸川望

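The difference-product operation appears frequently as a basic operation in important applications such as the fast Fourier transform and super-resolution processing, but its computation time is long because the subtraction must precede the multiplication. To address this, the applicant succeeded in expanding the difference-product operation in binary and expressing its intermediate terms in the form $x_i z_i + y_i \bar{z}_i$ (where $x_i$, $y_i$, and $z_i$ are 1-bit variables). This form, called selector logic, produces no carry; computing it quickly as preprocessing halves the number of intermediate terms, and so reduces the difference-product computation time or its area by up to one half. Building on this result, this project first accelerated the butterfly operation, the basic operation of the fast Fourier transform, and then made the weighted addition, the basic operation of super-resolution processing, faster and smaller. The butterfly operation became at least 11% faster, and the weighted addition became at least 25% faster with up to 50% less area. Faster and smaller basic operations of this kind are expected to bring major advances, for example in next-generation digital television broadcasting, through the OFDM (orthogonal frequency-division multiplexing) scheme used in digital television standards, super-resolution processing, and 3D image processing.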
ディープサブミクロン技術を想定した次世代高位合成システムに関する研究

科学研究費助成事業(早稲田大学) 科学研究費助成事業(若手研究(B))

研究期間:

2006年

-

2008年

戸川望

システムLSI設計技術は,配線幅が90nmや65nmといったディープサブミクロン時代に突入し,その設計生産性を向上するには,システムの動作レベルからシステムLSIを自動設計することを可能とした「高位合成技術」が極めて有効である.本研究ではディープサブミクロン技術を想定した物理設計指向の高位合成アルゴリズムを提案しならびに高位合成フローを構築した.構築した高位合成フローを利用することにより,従来の設計フローに比較して,30%以上の動作速度向上を確認した.
高速大容量ネットワークプロセッサ設計システムに関する研究

科学研究費助成事業(北九州市立大学) 科学研究費助成事業(若手研究(B))

研究期間:

2003年

-

2005年

戸川望

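The goal of this research is to build a system environment in which a computer automatically designs high-speed, high-capacity network processors dedicated to network environments. The proposed network processor design system builds a model called a virtual network processor inside the computer, thereby bringing the concept of hardware/software co-design from system LSI design into processor design. In fiscal year 2005, we worked on software design and software synthesis in the network processor design system under construction. We focused in particular on developing algorithms that automatically synthesize processor hardware from an abstract processor description and on automatically generating a software environment for the dedicated processor, and we also evaluated the proposed system as a whole. The research proceeded in the following steps.
(1) Building on the results of fiscal years 2003 and 2004, we devised and established an algorithm that automatically synthesizes the software environment running on the network processor. In particular, with cooperative network processing in mind, we built, systematized, and evaluated a system configuration and network model in which a single network application is divided into multiple subprocesses that are processed cooperatively. In the evaluation experiments, the wider design space opened up by cooperative processing improved network performance by about 20% over the existing model.
(2) We implemented the network processors designed with the system environment built so far in a hardware description language and, together with the software built in (1), concretely evaluated their area, operating frequency, and other performance metrics.
The remaining research issues are to eliminate the architectural bottleneck of the proposed network model and to design a more scalable system.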
FPGAを対象とした動的再構成可能システムとその設計環境に関する研究

科学研究費助成事業(早稲田大学) 科学研究費助成事業(奨励研究(A))

研究期間:

2000年

-

2001年

戸川望

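An FPGA (field-programmable gate array) is a general term for LSI (large-scale integration) devices into which designers can electrically write circuit functions at their own desks. Since practical FPGA devices were announced in the mid-1980s, research on the devices themselves and on their application environments has attracted attention. This research focuses on dynamically reconfigurable FPGAs, whose implemented circuit functions can change while the FPGA is operating. Its first goal is to devise and build a dynamically reconfigurable system composed of multiple dynamically reconfigurable FPGAs, memory, and the surrounding peripheral circuits. Its second goal is to devise and build an environment (a hardware high-level synthesis environment) in which a computer automatically synthesizes, from a behavioral-level algorithm, the hardware to be implemented on dynamically reconfigurable FPGAs and systems. To these ends, in fiscal year 2001, continuing from the previous year, we (1) surveyed and studied how to organize dynamically reconfigurable systems and at the same time (2) studied a hardware synthesis environment for dynamically reconfigurable systems.
(1) Assuming a dynamically reconfigurable system composed of an FPGA for control processing, an FPGA for arithmetic processing, memory, and peripheral circuits, we implemented real applications, including network protocol processing and in particular network encryption and protocol boosters, and verified their operation and performance. Experiments on real hardware confirmed that the FPGA hardware produces the same outputs as software simulation, which demonstrates the validity of the dynamically reconfigurable system proposed and built in this research.
(2) We then built a hardware synthesis environment for dynamically reconfigurable systems. The environment consists of a compiler subsystem, an area/time/power optimization subsystem, and a hardware generation subsystem. Given a behavioral description of the hardware in C and area and time constraints, it synthesizes multiple hardware implementations that satisfy the constraints while keeping low both the power consumption and the total energy consumed by the application program. We built the environment on a computer and ran evaluation experiments, which confirmed that it enumerates multiple hardware implementations that trade off area against time and keep power and total energy consumption low.
Together with the results of previous years, the results of (1) and (2) make it possible to obtain dynamically reconfigurable hardware almost automatically from a behavioral description of an algorithm. This means that prototypes for algorithm evaluation can be produced extremely quickly, which we believe contributes to the advancement of information technology.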
画像処理向け組込みプロセッサのハードウェア/ソフトウェア協調設計手法に関する研究

科学研究費助成事業(早稲田大学) 科学研究費助成事業(奨励研究(A))

研究期間:

1998年

-

1999年

戸川望

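In fiscal year 1999, as part of building a hardware/software co-design method for embedded processors for image processing, we studied a method for automatically synthesizing the set of dedicated software programs. In addition, to evaluate the processor-hardware synthesis method and its software environment comprehensively on real hardware, we built a field-programmable gate array (FPGA) design environment and implemented several hardware designs on FPGA chips.
1. Building a method for automatically synthesizing the set of dedicated software programs
Hardware/software co-design of embedded processors for image processing automatically synthesizes processor hardware specialized for a given application program together with the software environment that runs on that processor. This year we built a retargetable assembler and a retargetable simulator as the dedicated software environment. Both can vary their functions to match the automatically synthesized processor hardware, and we confirmed that they can therefore support a wide range of processors, from ordinary RISC microprocessors to DSP (digital signal processing) processors.
2. Building the FPGA design environment and applying it
To evaluate on real hardware the processor-hardware synthesis method and software environment built so far, we built an FPGA design environment. Its distinguishing feature is that it can use multiple Xilinx XC4000-series FPGA chips and can be extended freely to match the scale of the hardware to be implemented. We implemented small processors and application programs on two FPGA chips and demonstrated its effectiveness.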
高信頼IoT社会を実現する分散型基盤アーキテクチャの研究開発

NEDO エネルギー・環境新技術先導プログラム

戸川望
極低エネルギー化を実現する統合化システムLSI設計技術

NEDO 先導的産業技術創出事業

戸川望
IoT部品・機器・ネットワークの階層横断セキュリティ技術の研究開発

総務省戦略的情報通信研究開発推進事業（SCOPE）

戸川望
設計工程に侵入したハードウェアトロイの検出と耐ハードウェアトロイ設計技術の研究開発

総務省戦略的情報通信研究開発推進事業（SCOPE）

戸川望
不揮発メモリの書込みビット数を厳密に最小化する符号化とノーマリオフ計算への応用

科学研究費助成事業(早稲田大学) 科学研究費助成事業(挑戦的研究(萌芽))

戸川望
セレクタ論理を利用し部分積項数を半減する差積演算回路設計とその画像処理応用

科学研究費助成事業(早稲田大学) 科学研究費助成事業(挑戦的萌芽研究)

戸川望
大域的超低エネルギー化を実現するLSI抽象モデルと上位下位統合化LSI設計技術

科学研究費助成事業(早稲田大学) 科学研究費助成事業(基盤研究(B))

戸川望

Misc

機械学習を用いた蟻コロニー最適化による多目的時間依存オリエンテーリング問題の解法

梶翔真, 梶本大, 野口竜弥, 高山敏典, 鮑思雅, 戸川望

マルチメディア，分散，協調とモバイル(DICOMO2025)シンポジウム2025論文集 2025 192 - 200 2025年06月

本稿では，機械学習を用いた移動時間予測モデルを蟻コロニー最適化 (Ant Colony Optimization: ACO) に統合することで，多目的時間依存オリエンテーリング問題 (Multi-Objective Time-Dependent Orienteering Problem: MOTDOP) を高速に求解するアルゴリズムを提案する．提案手法は，機械学習を用いた移動時間予測モデルによって移動時間を高速かつ高精度に予測する．さらに，ACO の経路構築の際に各移動時間取得ステップに移動時間予測モデルを組み込むことで，移動時間の変動を反映した経路探索を実現し，動的かつ多目的な旅程最適化を可能とする．評価実験では，京都市内の POI を対象としたデータセットで評価を実施し，詳細経路探索 API を用いて ACO による求解を行う従来手法に比べ，提案手法は約 550 倍～ 600 倍の計算時間短縮を達成しつつ，同等のスコアを維持できることを明らかにした．また，提案手法がユーザが重視する価値観に基づき柔軟な経路設計が可能であることも示され，提案手法の実用性と柔軟性を確認した．
経路改良を含めた部分アニーリングによる容量制約付き配送計画問題の解法

原島夏希, 白井達彦, 矢田部彰宏, 戸川望

情報処理学会研究報告(Web) 2025 ( ITS-100 ) 2025年

消費電力波形の形状に基づくクラスタリングによるIoTデバイス異常動作検知手法

川村竜太, 江田琉聖, 中西響, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 400(VLD2024 103-140) ) 2025年

オートエンコーダの再構成誤差を用いたFPGA異常動作検出手法

五百森颯, 江田琉聖, 木田良一, 小笠原恒雄, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 400(VLD2024 103-140) ) 2025年

量子アニーリングマシンの最適問題サイズの上限と部分アニーリングによる容量制約付き配送計画問題の求解性能評価

原島夏希, 白井達彦, 矢田部彰宏, 戸川望

情報処理学会論文誌ジャーナル(Web) 66 ( 6 ) 2025年

複数の機械学習モデルにおける動的移動時間の予測性能分析

梶翔真, 鮑思雅, 高山敏典, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 77(CAS2025 1-16) ) 2025年

IoTデバイスのドキュメントに基づくLLMを用いたセキュリティ適合性スコアリング

池上裕香, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 77(CAS2025 1-16) ) 2025年

時系列埋め込み表現を用いたIoTデバイス異常動作検知手法

江田琉聖, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 77(CAS2025 1-16) ) 2025年

LLMによるIoTデバイスのUIに基づく入力値生成を用いたファジング手法

中西響, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 77(CAS2025 1-16) ) 2025年

イジングマシンを用いた格子点削除法によるsubQUBO構築の評価

三田光希, 深田佳佑, 太田岳, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 77(CAS2025 1-16) ) 2025年

pVSQAを用いた履修最適化の一検討

太田岳, 白井達彦, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 77(CAS2025 1-16) ) 2025年

多様なIoTデバイスを対象としたLLMによるセキュリティ適合性自動評価手法の検証

池上裕香, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 99(ISEC2025 16-71) ) 2025年

IoTデバイスのファジングのためのLLMベース生成型ミューテーション手法

中西響, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 99(ISEC2025 16-71) ) 2025年

ハミング重みを考慮した部分QUBOの動的変数選択手法

齋藤善仁, 三田光希, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 260(VLD2025 17-66) ) 2025年

IoTデバイスのブラックボックスファジングのためのLLMベース初期シード選択手法

中西響, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 248(HWS2025 61-71) ) 2025年

イジングマシンのための特異値分解と係数除去によるQUBO簡略化

稲葉慎之助, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 260(VLD2025 17-66) ) 2025年

LLMを用いた消費電力解析によるIoTデバイス異常動作検知の一検討

江田琉聖, 中西響, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 248(HWS2025 61-71) ) 2025年

トポロジー同型n分割による基底状態を保証したイジングモデルビット幅削減手法

冨田柊, 青木来生, 稲葉慎之助, 多和田雅師, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 260(VLD2025 17-66) ) 2025年

イジングマシンと大規模言語モデルによる複数日旅程計画問題へのアプローチ

田中綺珠, 梶翔真, 太田岳, 池上裕香, 鮑思雅, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 260(VLD2025 17-66) ) 2025年

FMAのためのランク学習を用いたQUBO構築手法

太田岳, 白井達彦, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 260(VLD2025 17-66) ) 2025年

制約パラメータ化を用いたスピン変数削減手法

青木来生, 太田岳, 白井達彦, 戸川望

電子情報通信学会技術研究報告(Web) 125 ( 260(VLD2025 17-66) ) 2025年

量子アニーリングによる経路探索APIを用いた複数日インターモーダル旅程最適化

野口竜弥, 深田佳佑, 鮑思雅, 高山敏典, 戸川望

情報処理学会研究報告(Web) 2025 ( ITS-100 ) 2025年

ETSI TS 103 701に基づくLLMを用いたセキュリティテストの該非判定

池上裕香, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

コンピュータセキュリティシンポジウム2024論文集 24 - 31 2024年10月

近年，IoTデバイスの脆弱性検査の自動化が期待されるが，多種多様なIoTデバイスにとって機械的に検査項目を設定することは難しい．我々は大規模言語モデル (LLM)を用いた検査項目の該非判定手法を提案しており， JPCERT/CCが公開しているIoTセキュリティチェックリストを通してその有効性を確認している．国際的に用いられるテスト仕様として，消費者向けIoTデバイスに適用されるセキュリティ規定が定められた欧州規格ETSI EN 303 645に基づいて作成されたETSI TS 103 701がある．本稿では，LLMを用いた検査項目の該非判定手法にETSI TS 103 701を適用し評価する．提案手法によりETSI TS 103 701における各テストグループの該非を判定した結果，提案手法はチェックリストとしてETSI TS 103 701を適用した場合でも検査対象デバイスの仕様に適したテストグループの該非を判定することを確認した．
In this paper, we apply ETSI TS 103 701 to a method for determining the suitability of vulnerability assessment items using LLMs. As a result, it is confirmed that we can successfully determine the suitability of each test group in ETSI TS 103 701.
IoTデバイスのファジングのための通信ログを用いたLLMベースシード生成自動化

中西響, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

コンピュータセキュリティシンポジウム2024論文集 16 - 23 2024年10月

近年，IoT(Internet of Things)デバイスの普及によりデバイス間で個人情報やインフラ等の重要情報が扱われることに伴い，IoTデバイスのセキュリティ対策が必要不可欠となっている．ファジングは，IoTデバイスの有効な脆弱性発見手法の1つであるが，従来手法は初期シード生成を手作業で行うためコストがかかるという点で課題がある．我々は，初期シード生成に大規模言語モデルを用いる手法(LLMベースシード生成手法)を提案しているが，テスト対象デバイスの仕様を手作業で指定する必要があった．本稿では，LLMベースシード生成手法において，通信ログを用いることでテスト対象デバイスの仕様抽出を自動化し手法を改良する．改良したLLMベースシード生成手法をさまざまな脆弱性を持つIoTデバイスに適用した結果，通信ログを用いることで，手動による仕様抽出なしにデバイスの操作や入力フォーマットに適したシードを自動生成し，手動による従来の初期シードによるファジングでは検出できなかったクラッシュの検出に成功した．
We propose an LLM-based seed generation method using communication logs for IoT devices fuzzing. The experimental evaluation results showed the effectiveness of the proposed method.
ハイパーパラメータチューニングを導入した生成電力波形によるIoTデバイス異常動作検知手法の評価

江田琉聖, 木田良一, 小笠原恒雄, 戸川望

コンピュータセキュリティシンポジウム2024論文集 1561 - 1568 2024年10月

近年，Internet of Things(IoT)デバイスの普及に伴い，ハードウェアデバイスに実装される部品またはプログラムに起因するセキュリティ課題が増加している． OS上でアプリケーションが実行されるIoTデバイスでは，OSやハードウェアによる定常的な消費電力とアプリケーションによる消費電力が重なり，複雑な消費電力波形になる．消費電力波形を用いて異常動作を検知するには，測定した電力波形から定常的な消費電力を差し引き，アプリケーション電力波形のみを抽出する必要がある．定常的な消費電力を含むIoTデバイスの異常動作検知手法として，我々は波形生成に基づく手法を提案している．波形生成に基づく手法は，定常的な消費電力を機械学習を用いて除去し，電力波形の潜在的な特徴量を抽出することで自動的に異常動作を検知する．本稿では，2種類のデバイスに対して波形生成に基づく手法を適用し，その有効性を評価する．実験の結果，シングルボードコンピュータとFPGAの両方において，従来手法で検知できなかった異常動作を波形生成に基づく手法で検知できた．
In this paper, we apply the waveform generation-based IoT anomaly detection method to two types of devices and evaluate its effectiveness. Experimental results show that the method can detect anomalous behaviors successfully in both a single-board computer and an FPGA while the recent state-of-the-art method cannot detect them.
Ashiraseデバイスを用いた2次元PDR

梶本大, 鮑思雅, 田中裕介, 戸川望

マルチメディア，分散，協調とモバイルシンポジウム2024論文集 2024 1490 - 1495 2024年06月

GPS (Global Positioning System)をはじめとして，我々は日常的に自己位置を推定しているが，環境によりGPSを利用できず，センサを用いたPDR (Pedestrian Dead Reckoning)等の相対的測位手法が必要となる．特に視覚障がい者の方々はスマートフォンなど視覚情報を用いたナビゲーションを利用することが困難であり，他の情報を用いたナビゲーションデバイスを利用する必要がある．本稿では，Ashiraseデバイスを利用した精度が高い2次元PDRを提案する．Ashiraseデバイスは視覚障がい者の靴に装着されるもので，Ashiraseデバイスを足元で振動させることで視覚情報以外の情報をフィードバックすることで視覚障がい者のナビゲーションを実現する．Ashiraseデバイスは靴に装着されるためスマートフォンと比較し姿勢の自由度が低い．そこで提案手法は，Ashiraseデバイスから取得される加速度センサならびにジャイロセンサの情報を利用することで自己位置推定する．この際，センサから得られる値の誤差の蓄積を減少するため，閾値の設定やゼロセット，またIMUフィルタを適用し位置推定の精度を向上させた．評価実験の結果，提案手法は推定誤差を少なく開始地点からの距離ならびに座標を推定できた．
部分アニーリングによる容量制約付き配送計画問題の求解性能評価

原島夏希, 白井達彦, 矢田部彰宏, 戸川望

情報処理学会研究報告(Web) 2024 ( ITS-96 ) 2024年

解空間内の状態遷移を容易にする冗長符号を活用したQUBO変換手法

多和田雅師, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 390(VLD2023 99-140) ) 2024年

自動化フレームワークにより生成されたトロイ回路を対象とした機械学習によるハードウェアトロイ識別の評価

吉見尚, 根岸良太郎, 久古幸汰, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 390(VLD2023 99-140) ) 2024年

IoTデバイスのドキュメントに基づくLLMを用いた脆弱性検査項目の該非判定の拡張

池上裕香, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 110(CAS2024 1-26) ) 2024年

車載ECUに対する電圧フォールト攻撃による内部メモリの読み出し検証

江田琉聖, 久古幸汰, 佐藤勝彦, 足立勇弥, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 110(CAS2024 1-26) ) 2024年

LLMを用いた初期シード生成によるIoTデバイスのファジングの有効性評価

中西響, 長谷川健人, 披田野清良, 福島和英, 橋本和夫, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 110(CAS2024 1-26) ) 2024年

解品質向上を目的とする非零係数項の除去によるQUBO簡略化手法

岩田錦哉, 多和田雅師, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 110(CAS2024 1-26) ) 2024年

イジングマシンを用いた履修最適化

太田岳, 深田佳佑, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 110(CAS2024 1-26) ) 2024年

FMAによる3次元2次割当問題の変数削減に関する一考察

富田空, 白井達彦, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 110(CAS2024 1-26) ) 2024年

量子計算を用いた機能性核酸塩基配列の導出

富田空, 白井達彦, 浜田道昭, 安達健朗, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 247(VLD2024 27-75) ) 2024年

制約違反を前提としたイジングマシンと補正処理によるハイブリッド組合せ最適化手法

岩田錦哉, 多和田雅師, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 247(VLD2024 27-75) ) 2024年

オートエンコーダによるアプリケーション電力波形抽出を利用した電力解析に基づく異常動作検知手法

江田琉聖, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 247(VLD2024 27-75) ) 2024年

QAOAを用いた履修最適化の一検討

太田岳, 深田佳佑, 白井達彦, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 247(VLD2024 27-75) ) 2024年

補正処理を導入した部分QUBOアニーリングによる複数日インターモーダル旅程最適化

野口竜弥, 深田佳佑, 鮑思雅, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 110(CAS2024 1-26) ) 2024年

1-hot制約付き問題に対する1-hot変数選択法を用いた部分QUBOアニーリング手法

野口竜弥, 深田佳佑, 鮑思雅, 戸川望

電子情報通信学会技術研究報告(Web) 124 ( 247(VLD2024 27-75) ) 2024年

部分QUBOアニーリングを用いたインターモーダル旅程最適化

野口竜弥, 深田佳佑, 鮑思雅, 戸川望

情報処理学会研究報告(Web) 2024 ( ITS-96 ) 2024年

ランダムフォレストに基づくハードウェアトロイ検知に対する強化学習を用いた自律的なハードウェアトロイ生成手法

池上裕香, 根岸良太郎, 長谷川健人, 披田野清良, 福島和英, 戸川望

コンピュータセキュリティシンポジウム2023論文集 684 - 690 2023年10月

近年，IC設計工程でハードウェアトロイ（HT）と呼ばれる不正回路が挿入されるリスクが深刻となっている．HTの検知には機械学習を用いる手法が有効であるとされるが，どのようなHTの検知が可能か，あるいは検知が難しいか検討が不十分である．検知の難しいHTを効率的に生成する必要がある．本稿ではランダムフォレストに基づくHT検知を対象に，強化学習を用いた自律的なHT生成手法を提案する．提案手法では強化学習を用いることにより，効率的で自律的なHT生成が可能となる．提案手法で用いる強化学習アルゴリズムは，新たなHTの生成とHT検知精度の観測を繰り返し，HT検知において検知精度を最小化するよう，HT構成時のパラメータ変更の方策を学習する．ランダムフォレストに基づくHT検知を対象とした評価実験の結果，提案手法はHT検知精度を低下させるHTを構成することを確認した．
In this paper, we propose a hardware-Trojan (HT) generation method utilizing reinforcement learning against random-forest-based HT detection. The proposed method learns how to change the parameterized HTs so as to minimize the HT detection rate. Experimental evaluation results confirm the effectiveness of the proposed method.
複数の機械学習モデルを用いたアンサンブル学習によるハードウェアトロイ識別手法

根岸良太郎, 戸川望

コンピュータセキュリティシンポジウム2023論文集 676 - 683 2023年10月

IoT (Internet-of-Things)機器は，私たちの日常生活に広く普及しており，コスト削減のための設計・製造の外注が一般的になっている．一方，信頼性の乏しいベンダーによりハードウェアトロイ (HT)と呼ばれる悪意のある回路が挿入されるリスクが指摘されている．本稿では，ゲートレベルネットリストを対象に，複数の機械学習モデル (ランダムフォレスト，XGBoost，LightGBM，CatBoost)を用いたアンサンブル学習によるHT識別を行い，識別性能を評価する．提案手法では機械学習による識別結果を再度，機械学習モデルに学習することにより識別精度の向上を目指す．評価実験の結果，提案手法は従来の単体の機械学習モデルを用いる手法に比べ高い精度でHTを識別することを確認した．
IoT (Internet-of-Things) devices are widely used in our daily lives, and outsourcing of design and manufacturing has become common to reduce costs. On the other hand, there is a risk of malicious circuits called hardware Trojans (HTs) being inserted by unreliable vendors. In this paper, we propose an HT detection method using multiple ensemble learning models (random forest, XGBoost, LightGBM, and CatBoost). The experimental results showed that the proposed method identifies HTs at a higher rate than the conventional method using a single machine-learning model.
生成電力波形によるIoT異常動作検知手法

久古幸汰, 戸川望

コンピュータセキュリティシンポジウム2023論文集 668 - 675 2023年10月

近年，Internet of Things (IoT) デバイスの普及に伴い，サプライチェーン上に第三者が介入する機会が増加し，IoT デバイスのセキュリティ課題が顕在化している．電力解析により IoT デバイスの異常動作を検知する手法が知られているが，IoT デバイスは OS やハードウェア自体が電力を消費し定常的に電力を消費するため，異常動作検知にはこのような定常的な電力を除去する必要がある．従来，異常動作検知では，定常状態の電力を手動で除去しており，また電力波形の特定の特徴から異常動作を検知していたため，十分に異常動作を検知できない場合があった．本稿では，機械学習により定常状態の位置を推定することでアプリケーション電力波形を生成し，その波形から潜在的な特徴量を抽出することで自動的に異常動作を検知する手法を提案する．実験の結果，従来の IoT 異常動作検知手法では検知出来なかった電力波形に対し，提案手法を適用することで異常動作の検知に成功した．
In this paper, we propose an anomalous behavior detection method in Internet-of-Things devices by analyzing power consumption. Experimental evaluations show that the proposed method detects anomalous application behaviors successfully, while the recent state-of-the-art method cannot detect them.
イジングモデルの係数削減による量子イジングマシンの出力改善の評価

谷地悠太, 多和田雅師, 戸川望

DAシンポジウム2023論文集 2023 141 - 148 2023年08月

量子イジングマシンは，イジングモデルを入力として取り，その基底状態を探索することで組合せ最適化問題を解く．しかし，入力したイジングモデルの係数と量子イジングマシンが解法する係数に誤差が発生する．本誤差の存在により，量子イジングマシンは入力したイジングモデルの基底状態を必ずしも出力しない．本稿では，イジングモデルの係数が量子イジングマシンの基底状態を得る確率に与える影響を示す．入力するイジングモデルの係数の絶対値における最大値と最小値の比率が大きくなるにつれ，量子イジングマシンが基底状態を得る確率が減少することを実験で示した．加えて，同実験結果をもとに，量子イジングマシンにイジングモデルを入力した場合に発生する誤差の大きさを推定する方法を検討した．さらに，入力するイジングモデルの係数の絶対値における最大値と最小値の比率を削減することで，量子イジングマシンが基底状態を得る確率が上昇することを実験で示した．上記実験結果は，量子イジングマシンが基底状態を得る確率の上昇のためのアプローチとして，入力するイジングモデルの係数の削減が有効である可能性を示している．
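The abstract above relates the ratio between the largest and smallest absolute coefficient values of an Ising model to the probability of reaching the ground state. The sketch below shows, for a toy model, how that ratio and the exact ground state can be computed by brute force. It makes several assumptions for illustration: the energy convention E(s) = sum_i h_i s_i + sum_{i<j} J_ij s_i s_j, coefficients stored in plain dictionaries, and hypothetical function names. It does not model hardware error or the paper's coefficient-reduction method.

```python
from itertools import product


def coefficient_ratio(h, J):
    """Ratio of the largest to the smallest non-zero absolute coefficient."""
    magnitudes = [abs(c) for c in list(h.values()) + list(J.values()) if c != 0]
    return max(magnitudes) / min(magnitudes)


def energy(spins, h, J):
    """Ising energy under the assumed convention, with spins[i] in {-1, +1}."""
    linear = sum(h[i] * spins[i] for i in h)
    quadratic = sum(J[i, j] * spins[i] * spins[j] for (i, j) in J)
    return linear + quadratic


def ground_state(n, h, J):
    """Exhaustively search all 2**n spin assignments (practical only for small n)."""
    return min(product((-1, 1), repeat=n), key=lambda s: energy(s, h, J))


h = {0: 1.0, 1: -0.5}   # toy local fields
J = {(0, 1): 4.0}       # toy coupling
best = ground_state(2, h, J)
print(coefficient_ratio(h, J), best, energy(best, h, J))
```

A brute-force reference like this is useful only for checking small instances; it gives the exact ground state against which a machine's output can be compared.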
量子ビット読み出し時間を削減するトポロジ周期性活用のマイナ埋め込み手法

多和田雅師, 戸川望

DAシンポジウム2023論文集 2023 167 - 172 2023年08月

マイナ埋め込みは量子アニーリングの実行時にレイテンシ増加を引き起こす．レイテンシの削減を目指すために，全結合グラフからハードウェアトポロジへのマイナ埋め込みパタンを事前に準備する戦略が存在する．既存の手法では，入力された論理イジングモデルを全結合グラフとして扱い，実行時のマイナ埋め込みを省略するためにマイナ埋め込みパタンを生成する．我々は，量子ビットの個々のばらつきが既存の手法の読み出し時間を増加させることを発見した．本稿では，量子ビットの個々のばらつきを考慮に入れ，読み出し時間を最小化する全結合グラフのマイナ埋め込み手法を提案する．提案手法では，量子アニーリングマシンのハードウェアトポロジに周期性が存在することに注目し，元のマイナ埋め込みパタンをユニットセルごとにシフトさせて読み出し時間が最小となるマイナ埋め込みパタンを探索する．計算機実験により，提案した手法は既存手法と比較して，量子アニーリングの実行時間の一部である読み出し時間を削減することが確認された．
ACOによる時間変化に対応した旅行計画最適化手法

佐伯越志, 鮑思雅, 高山敏典, 戸川望

マルチメディア，分散，協調とモバイルシンポジウム2023論文集 2023 490 - 503 2023年06月

観光産業の振興と情報科学技術の発展によって，ユーザの旅行計画を補助する技術の開発が進んでいる．旅行計画では，人気度や費用など複数の目的関数を同時に最適化することで，ユーザが満足する経路を生成する必要がある．さらに，ユーザに旅行の詳細な情報を与え，ユーザが行動しやすい旅行経路を生成するには，時間依存で変化する移動時間や観光地の価値を考慮するべきである．例えば，移動に公共交通機関を利用する場合，時刻表や移動経路によって出発時刻に依存して移動時間が変化する．観光地の価値についても，夜景が綺麗な観光地や，イベントを開催する観光地，営業時間の存在など，訪問時間によって価値が変化する．本稿では，旅行計画における時間変化する価値を考慮し，複数の目的関数を最適化できる，時間依存多目的旅行計画問題最適化手法を提案する．提案手法は，蟻コロニー最適化において複数の目的関数を異なる重みで考慮する蟻を設定し，フェロモンに時間属性を付加することで時間依存多目的旅行計画問題を解法する．特に，タイムスタンプ付きの過去のユーザの旅行履歴を利用することで時間依存の観光地の価値に対応し，詳細経路 API を利用して時間変動する移動時間に対応する．その上で，詳細経路 API 利用時の応答時間の増加を想定し，API 呼出回数を削減する工夫を導入する．評価実験により，提案手法は既存手法に対し，より時間変化する価値を最適化した旅行経路を生成した．
歩行特性を利用したスマートフォン階段昇降推定

梶本大, 佐伯越志, 鮑思雅, 戸川望

マルチメディア，分散，協調とモバイルシンポジウム2023論文集 2023 329 - 335 2023年06月

GPS (Global Positioning System) をはじめとして，我々は日常的に自己位置を推定している．しかし，GPS を利用できない環境の場合，携帯端末のセンサを用いた PDR (Pedestrian Dead Reckoning) 等の相対的測位手法が必要となる．特に複雑な屋内空間において，歩行者は水平方向に移動するだけでなく垂直方向にも移動する．このとき，エレベータやエスカレータのように歩行者の揺れや振動が少ない移動手段だけではなく，階段のような歩行者に不規則に揺れや振動が生じる場合にも，正確に垂直方向の移動を推定する必要がある．本稿では，スマートフォンを利用した階段昇降推定手法を提案する．提案手法は，歩行者の歩行特性を利用してフロアの水平部分を検出し気圧センサの誤差を解消することで，高い精度で階段中のフロアを推定する．さらに，気圧センサの値がスマートフォンの姿勢に左右されない特性を利用することで，スマートフォンの姿勢によらない階段昇降推定を実現する．評価実験の結果，提案手法は既存手法と比較して，推定誤差を低減し階段昇降を推定できた．
ハードウェアトロイとその検出技術

長谷川健人, 福島和英, 戸川望

電子情報通信学会誌 106 ( 3 ) 2023年

LSTMを用いた定常状態を含む消費電力波形予測に基づくIoTデバイス異常動作検知手法

江田琉聖, 久古幸汰, 根岸良太郎, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 402(VLD2022 73-122) ) 2023年

消費電力波形の極値クラスタリングを用いた文法的推論に基づくIoTデバイス異常動作検知手法

中西響, 久古幸汰, 根岸良太郎, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 402(VLD2022 73-122) ) 2023年

実環境回路に挿入されたハードウェアトロイを対象としたグラフ学習によるゲートレベルハードウェアトロイ識別

池上裕香, 山下一樹, 長谷川健人, 福島和英, 清本晋作, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 402(VLD2022 73-122) ) 2023年

J-GLOBAL
消費電力波形データから抽出した特徴量群に対する主成分分析に基づくIoTデバイス異常動作検知手法

矢部拓真, 久古幸汰, 根岸良太郎, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 402(VLD2022 73-122) ) 2023年

J-GLOBAL
イジングマシンによる配送計画の最適化-古典コンピュータによる計算結果との比較-

露峰祐衣, 増田健一, 北田智之, 八川剛志, 羽賀剛, 谷地悠太, 白井達彦, 多和田雅師, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2023 2023年

J-GLOBAL
量子アニーリングマシンと古典計算機を組み合せたハイブリッドイテレーティブアニーリングによる組合せ最適化問題の評価

深田佳佑, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 97(CAS2023 1-29) ) 2023年

J-GLOBAL
動的周波数割当問題に対するイジングマシンの求解性能評価

岩田錦哉, 多和田雅師, 齋藤和広, 山田秀昭, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 97(CAS2023 1-29) ) 2023年

J-GLOBAL
イジングマシンを用いたハイブリッドアニーリングによる容量制約付き配送計画問題の解法

原島夏希, 川上蒼馬, 矢田部彰宏, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 97(CAS2023 1-29) ) 2023年

J-GLOBAL
ハードウェア版マルウェアハードウェア・トロイとFPGA

長谷川健人, 福島和英, 戸川望

インターフェース 49 ( 10 ) 2023年

J-GLOBAL
生成電力波形によるIoT異常動作検知手法の改良

久古幸汰, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 258(VLD2023 30-79) ) 2023年

J-GLOBAL
ハードウェアトロイ識別における機械学習モデルのループ最適化手法

根岸良太郎, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 258(VLD2023 30-79) ) 2023年

J-GLOBAL
ハイブリッドアニーリングを用いた動的周波数割当問題の求解性能評価

岩田錦哉, 多和田雅師, 齋藤和広, 山田秀昭, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 258(VLD2023 30-79) ) 2023年

J-GLOBAL
相互作用の調整によるイジングマシンへの初期解擬似導入手法

川上蒼馬, 大野乾太郎, 巴徳瑪, 八木哲志, 寺本純司, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 258(VLD2023 30-79) ) 2023年

J-GLOBAL
補正処理を導入した部分QUBOアニーリングによる複数日旅程最適化

野口竜弥, 深田佳佑, 鮑思雅, 戸川望

電子情報通信学会技術研究報告(Web) 123 ( 258(VLD2023 30-79) ) 2023年

J-GLOBAL
部分QUBOアニーリングによる複数日旅程最適化問題の解法

野口竜弥, 深田佳佑, 鮑思雅, 戸川望

情報処理学会研究報告(Web) 2023 ( ITS-092 ) 2023年

J-GLOBAL
グラフ学習を用いたゲートレベルハードウェアトロイ識別に対するメンバシップ推論攻撃

山下, 一樹, 長谷川, 健人, 披田野, 清良, 福島, 和英, 戸川, 望

コンピュータセキュリティシンポジウム2022論文集 408 - 415 2022年10月

A hardware Trojan is a circuit with malicious functions inserted by a malicious third party during IC design or manufacturing, and the threat it poses has been growing. Detection methods based on graph learning over gate-level netlists, using graph neural networks (GNNs), have been reported to be effective. Meanwhile, the danger of membership inference attacks against machine learning models has been pointed out. Such an attack infers whether a given piece of data was used as training data for a classifier. If it succeeds, the training data leaks to a malicious third party: the privacy of the data providers is violated, and the attacker learns which Trojans were used for training. This paper proposes a membership inference attack that targets a GNN-based hardware-Trojan classifier. The proposed method first obtains the outputs of the target classifier when it is given hardware Trojans contained in the training data and when it is given hardware Trojans contained in the test data. It then learns the difference between these outputs to reveal the structure of the hardware Trojans contained in the training data. Evaluation experiments confirmed that, for the target hardware-Trojan classifier, the hardware-Trojan data used for training and for testing can be distinguished with an AUC of 0.988.
実環境を想定したハードウェアトロイ回路を対象としたXGBoostによるハードウェアトロイ識別

根岸, 良太郎, 長谷川, 健人, 福島, 和英, 清本, 晋作, 戸川, 望

コンピュータセキュリティシンポジウム2022論文集 416 - 423 2022年10月

Information technology devices have become deeply embedded in people's lives, and demand for them grows every year. Demand for the design and manufacturing of ICs, which are essential to these devices, is likewise growing every year, and semiconductor vendors have come to outsource part of their design and manufacturing. Outsourcing IC design and manufacturing to third parties reduces costs, but it has been pointed out that design or manufacturing by untrustworthy vendors risks the insertion of hardware Trojans. This paper applies an XGBoost-based hardware-Trojan detection method to the Trust-HUB benchmark circuits and to practical hardware-Trojan netlists that assume a real-world environment, and evaluates the classification results.
消費電力波形の形状を考慮した IoT デバイス異常動作検知手法の評価

久古, 幸汰, 戸川, 望

コンピュータセキュリティシンポジウム2022論文集 424 - 431 2022年10月

With the spread of Internet of Things (IoT) devices, security issues in hardware devices have been increasing, and detecting their anomalous behavior has become important. One existing approach analyzes power consumption and detects anomalous behavior from the duration and the consumed energy of an operation, but it ignores the shape of the time series and therefore cannot distinguish power waveforms that have similar duration and energy but different shapes. To distinguish such waveforms, an anomalous behavior detection method that applies the shape-based distance (SBD) measure to the shape of the time-series data (the SBD anomalous behavior detection method) has been proposed. This paper applies the SBD anomalous behavior detection method to combinations of two types of devices and two types of applications and evaluates its effectiveness. In the experiments, the method successfully detected anomalous behavior in every combination.
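For reference, the shape-based distance named in the abstract above is commonly defined as one minus the maximum normalized cross-correlation over all alignments of two series. The following is a minimal pure-Python sketch of that definition only; it does not reproduce the detection pipeline of the paper, and any preprocessing such as z-normalization is assumed to be done by the caller.

```python
import math


def sbd(x, y):
    """Shape-based distance: 1 - max over shifts of the normalized cross-correlation.

    Close to 0 when y is a shifted copy of x. Returns 1.0 if either series is all zeros.
    """
    n, m = len(x), len(y)
    norm = math.sqrt(sum(v * v for v in x)) * math.sqrt(sum(v * v for v in y))
    if norm == 0.0:
        return 1.0
    best = -math.inf
    # shift s pairs x[i] with y[i - s]; only overlapping indices contribute
    for s in range(-(m - 1), n):
        cc = sum(x[i] * y[i - s] for i in range(max(0, s), min(n, m + s)))
        best = max(best, cc)
    return 1.0 - best / norm
```

Two power waveforms with the same duration and energy but different shapes give a clearly positive distance, whereas a time-shifted copy of the same waveform gives a distance near zero.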
歩行時の加速度の周期性によるスマートグラス端末姿勢推定手法—ITS研究会交通センシング,通信,情報処理,一般

佐藤大生, 戸川望

電気学会研究会資料. ITS / ITS研究会 [編] 2022 ( 16-21・25-31 ) 51 - 56 2022年09月
ACOによる多目的要求に対応した旅行計画最適化手法

佐伯, 越志, 鮑, 思雅, 高山, 敏典, 戸川, 望

マルチメディア，分散，協調とモバイルシンポジウム2022論文集 2022 1556 - 1569 2022年07月

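With the promotion of the tourism industry and advances in information science and technology, trip planning services are being developed. The trip planning these services address must optimize several objective functions at once, such as satisfaction and cost, to generate a route that satisfies the user. Because many past users have planned similar or partially similar itineraries, how to reuse the travel routes of past users is a major key to trip planning. To satisfy users' requirements, this paper proposes a trip planning optimization method that is based on the multi-objective orienteering problem and explicitly uses the travel routes of past users. By using ant colony optimization, the proposed method produces trip plans that explicitly reflect those routes. It then solves the multi-objective orienteering problem by varying the behavior of the ants according to the different objective functions. In evaluation experiments, the proposed method generated travel routes that are closer to those of past travelers and better satisfy the user's requirements than the existing method.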
イジング計算機のためのマルチスピンフリップ法とその応用

白井達彦, 戸川望

電子情報通信学会技術研究報告(Web) 121 ( 342(VLD2021 49-75) ) 2022年

J-GLOBAL
スマートフォンとスマートウォッチを併用したPDRによる進行方向補正

若泉朋弥, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2022 2022年

J-GLOBAL
外部磁場の調整によるイジングマシンへの初期解擬似導入手法

川上蒼馬, 巴徳瑪, 大野乾太郎, 八木哲志, 寺本純司, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 283(VLD2022 19-55) ) 2022年

J-GLOBAL
イジングマシンを繰り返し用いるイテレーティブアニーリング手法と組合せ最適化問題の評価

深田佳佑, パリジマチュー, 富田憲範, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 283(VLD2022 19-55) ) 2022年

J-GLOBAL
基底状態の破壊を検出可能な係数分割によるイジングモデルのビット幅削減手法

谷地悠太, 多和田雅師, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 283(VLD2022 19-55) ) 2022年

J-GLOBAL
イジングモデル係数へのノイズ付与によるイジングマシン高精度化手法

吉村友和, 白井達彦, 多和田雅師, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 283(VLD2022 19-55) ) 2022年

J-GLOBAL
イジング計算機向け変数消去法

白井達彦, 戸川望

日本物理学会講演概要集(CD-ROM) 77 ( 2 ) 2022年

J-GLOBAL
イジングモデルにおける外部磁場係数への乱数加算による熱浴法

吉村友和, 白井達彦, 多和田雅師, 戸川望

回路とシステムワークショップ論文集(CD-ROM) 35th 2022年

J-GLOBAL
GNNExplainerによるグラフ学習を用いたハードウェアトロイ識別の評価

山下一樹, 長谷川健人, 披田野清良, 福島和英, 戸川望

回路とシステムワークショップ論文集(CD-ROM) 35th 2022年

J-GLOBAL
量子アニーリングによるフォトニック結晶レーザーの構造最適化

井上卓也, 関優也, 田中宗, 戸川望, 石崎賢司, 野田進

応用物理学会秋季学術講演会講演予稿集(CD-ROM) 83rd 2022年

J-GLOBAL
量子アニーリングマシンと非量子イジングマシンを利用したハイブリッド最適化手法に向けた解析

菊池脩太, 戸川望, 田中宗

日本物理学会講演概要集(CD-ROM) 77 ( 2 ) 2022年

J-GLOBAL
イジング計算機向けマルチスピンフリップ法

白井達彦, 戸川望

日本物理学会講演概要集(CD-ROM) 77 ( 1 ) 2022年

J-GLOBAL
歩行時の加速度の周期性によるスマートグラス端末姿勢推定手法

佐藤大生, 戸川望

電子情報通信学会技術研究報告(Web) 122 ( 190(ITS2022 6-9) ) 2022年

J-GLOBAL
係数分割によるイジングモデルのビット幅削減の検討

谷地悠太, 多和田雅師, 戸川望

情報処理学会研究報告(Web) 2022 ( SLDM-199 ) 2022年

J-GLOBAL
ビット幅削減イジングモデルのシミュレーテッドアニーリングにおける動的プロセスの解析

菊池脩太, 戸川望, 田中宗

日本物理学会講演概要集(CD-ROM) 77 ( 1 ) 2022年

J-GLOBAL
イジングマシンを用いた複数日にまたがる観光地選出手法

鮑思雅, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2022 2022年

J-GLOBAL
実環境を想定したトロイ回路を対象としたランダムフォレストによるハードウェアトロイ識別

栗原樹, 長谷川健人, 福島和英, 清本晋作, 戸川望

9 - 16 2021年10月

CiNii
多出力LSTMによる定常状態電力波形推定を利用した消費電力解析にもとづくデバイスの異常動作検知手法

高崎和成, 木田良一, 金子博一, 戸川望

17 - 24 2021年10月

CiNii
グラフ学習によるゲートレベルハードウェアトロイ識別と評価

山下一樹, 長谷川健人, 披田野清良, 清本晋作, 戸川望

1 - 8 2021年10月

CiNii
ストカスティック数を用いた絶対値関数及び不連続関数の実装と評価

石川遼太, 多和田雅師, 戸川望

DAシンポジウム2021論文集 ( 2021 ) 65 - 70 2021年08月

CiNii
端末姿勢の安定性により蓄積誤差を低減するスマートグラスPDR手法

佐藤, 大生, 戸川, 望

マルチメディア，分散協調とモバイルシンポジウム2021論文集 2021 ( 1 ) 565 - 573 2021年06月

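Pedestrian navigation with smart glasses displays information directly in the field of view, so the user does not need to look down at a screen, and navigation becomes more intuitive and easier to follow. Pedestrian navigation requires self-localization, for which the Global Positioning System (GPS) is widely used, but positioning accuracy drops markedly indoors, underground, and among high-rise buildings. Pedestrian Dead Reckoning (PDR) is a positioning method for indoor spaces that estimates the current position with the device's sensors. PDR needs no external infrastructure and can be deployed at low cost, but because it estimates position relative to the initial position, error accumulates as the walking distance grows. Reducing this accumulated error is important for making PDR practical. This paper proposes a PDR method that reduces accumulated error in pedestrian navigation with smart glasses. The proposed method exploits the fact that the vertically downward acceleration of the smart glasses approaches gravitational acceleration when the pedestrian stands still facing forward. It uses this to reduce the accumulated error in the device's angle relative to the horizontal plane and, as a result, the accumulated error of PDR. Evaluation experiments showed that the proposed method reduces the error of the estimated position by up to about 13%.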

CiNii
スマートフォンとスマートウォッチを併用したPDR手法の地図情報を利用した高精度化

若泉, 朋弥, 戸川, 望

マルチメディア，分散協調とモバイルシンポジウム2021論文集 2021 ( 1 ) 555 - 564 2021年06月

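With the spread of smartphones, navigation systems for pedestrians are widely used. These systems determine the current position with GPS, but GPS positioning is difficult indoors and underground. Pedestrian Dead Reckoning (PDR) has been proposed as an indoor position estimation method. PDR obtains sensor data from a device worn or carried by the pedestrian and estimates the current position by tracing the path from the starting point. We previously proposed a PDR method that uses a smartphone together with a smartwatch and confirmed through evaluation experiments that it can estimate the user's position without external infrastructure and regardless of where the smartphone is carried, while also reducing the estimation error. However, an error in the heading remains at each change of direction and accumulates, so the estimation error becomes large in environments with many left and right turns. This paper proposes a method for improving the accuracy of the PDR method that combines a smartphone and a smartwatch. The proposed method additionally incorporates map information and introduces map matching to improve the accuracy of the previous method. An evaluation confirmed the effectiveness of the proposed method.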

CiNii
モンテカルロ木探索を用いたユーザ個人の嗜好を考慮した経路推薦手法とその高速化

石崎雄太, 高山敏典, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2021 2021年

J-GLOBAL
スマートフォンとスマートウォッチを併用したPDRによる屋内位置推定の実験評価

若泉朋弥, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2021 2021年

J-GLOBAL
内閣府SIPにおける光・量子を活用したSociety 5.0実現化技術企業現場が活用可能な量子計算機サービスの提供へ向けて

戸川望, 田中宗

Optronics ( 477 ) 2021年

J-GLOBAL
スマートグラスを用いたPDRによる屋内位置推定の評価

佐藤大生, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2021 2021年

J-GLOBAL
ニューラルネットワークを用いたハードウェアトロイ識別に対するデータ拡張と敵対的学習の応用と評価

野澤康平, 披田野清良, 清本晋作, 戸川望

コンピュータセキュリティシンポジウム2020論文集 1206 - 1213 2020年10月

CiNii
スマートフォンとスマートウォッチを併用したPDRによる屋内位置推定

若泉朋弥, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2020論文集 2020 ( 2020 ) 1290 - 1302 2020年06月

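With the spread of smartphones, navigation systems for pedestrians are widely used. These systems generally determine the user's current position with GPS (Global Positioning System), but GPS accuracy drops indoors and underground, so a positioning method that can replace GPS is needed.
PDR (Pedestrian Dead Reckoning) is one method for estimating the current position indoors; it predicts the current position from walking data, such as acceleration and angular velocity, that the pedestrian obtains from sensors. PDR has the advantage of a low deployment cost because it uses no external infrastructure. On the other hand, PDR with a smartphone suffers from accumulated error and from the difficulty of keeping the orientation of the device (called the mode) fixed during positioning. Many studies have considered the device mode, but most methods fix the device to a single mode and cannot be called practical. This paper proposes a PDR method that uses a smartphone together with a smartwatch. By combining sensor data obtained from both the smartphone and the smartwatch, the proposed method realizes PDR that supports various smartphone modes without external infrastructure. It also reduces drift error and achieves accurate position estimation. In experiments, the proposed method reduced the position estimation error by about 87% on average compared with existing methods.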

CiNii
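The relative-position update that underlies PDR, as described in the abstract above, can be illustrated with a short sketch. It shows only the generic dead-reckoning step, not the smartphone and smartwatch method of the paper; step lengths and headings are assumed to have been estimated already from the sensors, and headings are assumed to be in radians measured from the +x axis.

```python
import math


def dead_reckon(start, steps):
    """Accumulate (step_length_m, heading_rad) pairs from a start position.

    Returns the list of positions visited, beginning with the start.
    Every position depends on all earlier steps, which is why heading and
    step-length errors accumulate as the walking distance grows.
    """
    x, y = start
    track = [(x, y)]
    for length, heading in steps:
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        track.append((x, y))
    return track
```

For example, one 1 m step along +x followed by one 1 m step along +y ends near (1, 1); a constant heading bias applied to both steps would rotate the whole track, which is the kind of drift the proposed methods aim to reduce.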
モンテカルロ木探索を用いたユーザ個人の嗜好を考慮した経路推薦手法の高速化

石崎雄太, 高山敏典, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2020論文集 ( 2020 ) 1303 - 1310 2020年06月

CiNii
メタヒューリスティクスの制約なし二次形式二値変数最適化問題への適用 (システム数理と応用)

多和田雅師, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 470 ) 43 - 48 2020年03月

CiNii
イジング計算機による3次元直方体パッキング問題の解法 (VLSI設計技術)

金丸翔, 寺田晃太朗, 川村一志, 田中宗, 富田憲範, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 443 ) 173 - 178 2020年03月

CiNii
トリガ回路の性質にもとづく特徴量を利用したニューラルネットワークによるハードウェアトロイ識別 (VLSI設計技術)

井上智貴, 長谷川健人, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 443 ) 227 - 232 2020年03月

CiNii
イジングマシンを用いたアミューズメントパークの経路最適化手法 (VLSI設計技術)

武笠陽介, 若泉朋弥, 田中宗, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 443 ) 167 - 172 2020年03月

CiNii
乱数化関数を用いた乱数生成回路を共有するストカスティック数生成器 (VLSI設計技術)

多和田雅師, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 443 ) 163 - 166 2020年03月

CiNii
グリッド配線問題に対するQUBO定式化手法

川村一志, 田中宗, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2020 2020年

J-GLOBAL
温度効果を用いたイジングマシンにおける埋込アルゴリズム

白井達彦, 田中宗, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2020 2020年

J-GLOBAL
ブロックチェーンを用いたデータ管理基盤のFlashAirによる実装

長谷川健人, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2020 2020年

J-GLOBAL
3次元直方体パッキング問題のQUBOモデルマッピング

金丸翔, 寺田晃太朗, 川村一志, 田中宗, 富田憲範, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2020 2020年

J-GLOBAL
ニューラルネットワークを用いたハードウェアトロイ検出における局所性の応用に関する一考察

井上智貴, 長谷川健人, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2020 2020年

J-GLOBAL
イジングマシン分野の研究開発の現状と今後~ハード・ソフト・アプリケーション・理論~

田中宗, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2020 2020年

J-GLOBAL
トリガ回路の性質にもとづく特徴量を利用したランダムフォレストによるハードウェアトロイ識別

栗原樹, 戸川望

電子情報通信学会技術研究報告(Web) 120 ( 211(HWS2020 25-41) ) 2020年

J-GLOBAL
アプリケーション電力抽出に基づくIoTデバイスの異常動作検知の評価

高崎和成, 木田良一, 戸川望

電子情報通信学会技術研究報告(Web) 120 ( 211(HWS2020 25-41) ) 2020年

J-GLOBAL
ハードウェアトロイ識別における敵対的サンプル用改変の体系的分類

野澤康平, 長谷川健人, 披田野清良, 清本晋作, 戸川望

電子情報通信学会技術研究報告(Web) 120 ( 211(HWS2020 25-41) ) 2020年

J-GLOBAL
多様なアルゴリズムを用いた配置配線パズルの協調システム

若泉朋弥, 高崎和成, 谷地悠太, 吉村友和, 西澤誠人, 多和田雅師, 戸川望

電子情報通信学会技術研究報告(Web) 120 ( 234(VLD2020 11-38) ) 2020年

J-GLOBAL
スマートフォンとスマートウォッチを併用したPDRによる屋内位置推定

若泉朋弥, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2020 ( 1 ) 2020年

J-GLOBAL
モンテカルロ木探索を用いたユーザ個人の嗜好を考慮した経路推薦手法の高速化

石崎雄太, 高山敏典, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2020 ( 1 ) 2020年

J-GLOBAL
イジングマシンにおける整数バイナリ変換の性能比較

田村健祐, 白井達彦, 桂法称, 田中宗, 戸川望

日本物理学会講演概要集 75.1 2351 - 2351 2020年

DOI CiNii
イジングモデルによる類似誘導部分グラフ同型問題の解法 (VLSI設計技術) -- (デザインガイア2019 : VLSI設計の新しい大地)

吉村夏一, 多和田雅師, 田中宗, 新井淳也, 巴徳瑪, 八木哲志, 内山寛之, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 282 ) 103 - 108 2019年11月

CiNii
ストカスティック計算におけるステップ関数の実装と評価 (VLSI設計技術) -- (デザインガイア2019 : VLSI設計の新しい大地)

石川遼太, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 282 ) 69 - 74 2019年11月

CiNii
量子アニーリングエミュレータのためのデータ構造

植田圭, 戸川望, 木村晋二

DAシンポジウム2019論文集 ( 2019 ) 39 - 44 2019年08月

CiNii
低密度パリティ検査符号復号問題を制約なし二次形式二値変数最適化問題に変換した解法

多和田雅師, 田中宗, 戸川望

DAシンポジウム2019論文集 ( 2019 ) 45 - 50 2019年08月

CiNii
スリープ状態をもつ組込みシステムを対象とした電力解析にもとづく異常動作検知とその実証的評価

長谷川健人, 近松聖, 戸川望

DAシンポジウム2019論文集 ( 2019 ) 93 - 98 2019年08月

CiNii
ストカスティック数を用いた再帰的分割による解像度解釈可変な画像形式 (VLSI設計技術)

石川遼太, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 154 ) 71 - 76 2019年07月

CiNii
モンテカルロ木探索によるユーザ個人の嗜好を考慮した経路推薦手法とその評価

石崎雄太, 高山敏典, 戸川望

マルチメディア,分散協調とモバイルシンポジウム2019論文集 ( 2019 ) 854 - 862 2019年06月

CiNii
動的な歩幅更新をベースとするマップマッチングによるPDR手法

西村天晴, 高山敏典, 柳澤政生, 戸川望

マルチメディア,分散協調とモバイルシンポジウム2019論文集 ( 2019 ) 1663 - 1669 2019年06月

CiNii
スマートフォン搭載センサを用いた自転車の挙動認識の向上

宇佐見友理, 石川和明, 高山敏典, 柳澤政生, 戸川望

マルチメディア,分散協調とモバイルシンポジウム2019論文集 ( 2019 ) 1670 - 1675 2019年06月

CiNii
凍結ビットパタン分岐によるリストサイズ2のポーラ符号高速リストデコーダ

会沢優花, 多和田雅師, 井手口裕太, 神谷典史, 戸川望

電子情報通信学会技術研究報告 118 ( 457(VLD2018 93-143) ) 2019年

J-GLOBAL
周辺ネットの特徴量を考慮した二段階のニューラルネットワークによるハードウェアトロイ検出手法

井上智貴, 長谷川健人, 戸川望

情報処理学会研究報告(Web) 2019 ( ARC-235 ) 2019年

J-GLOBAL
SAベースのイジングマシンにより巡回セールスマン問題を高速解法するための多種軽量係数試行法

竹原康太, 於久太祐, 松田佳希, 田中宗, 戸川望

情報処理学会研究報告(Web) 2019 ( ARC-235 ) 2019年

J-GLOBAL
実イジング計算機による制約付きスロット配置問題の解法

金丸翔, 川村一志, 田中宗, 戸川望

情報処理学会研究報告(Web) 2019 ( ARC-235 ) 2019年

J-GLOBAL
BER測定を用いたストカスティック数の誤り訂正

石川遼太, 多和田雅師, 柳澤政生, 戸川望

情報処理学会研究報告(Web) 2019 ( ARC-235 ) 2019年

J-GLOBAL
イジング計算機による誘導部分グラフ同型問題の解法

吉村夏一, 多和田雅師, 田中宗, 新井淳也, 八木哲志, 内山寛之, 戸川望

情報処理学会研究報告(Web) 2019 ( ARC-235 ) 2019年

J-GLOBAL
モンテカルロ木探索によるユーザ個人の嗜好を考慮した経路推薦手法とその評価

石崎雄太, 高山敏典, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2019 ( 1 ) 2019年

J-GLOBAL
動的な歩幅更新をベースとするマップマッチングによるPDR手法

西村天晴, 高山敏典, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2019 ( 1 ) 2019年

J-GLOBAL
スマートフォン搭載センサを用いた自転車の挙動認識の向上

宇佐見友理, 石川和明, 高山敏典, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2019 ( 1 ) 2019年

J-GLOBAL
ストカスティック数を用いたZ通信路の誤り訂正

石川遼太, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 119 ( 76(CPSY2019 1-16)(Web) ) 2019年

J-GLOBAL
非正規挿入デバイス検知のための電気容量監視装置とその実験的評価

西澤誠人, 長谷川健人, 戸川望

電子情報通信学会技術研究報告 119 ( 260(HWS2019 57-66) ) 2019年

J-GLOBAL
ストカスティック数を用いたステップ関数の実装と評価

石川遼太, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 119 ( 282(VLD2019 29-53) ) 2019年

J-GLOBAL
IoTデバイス管理基盤の一考察

長谷川健人, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2019 2019年

J-GLOBAL
ハードウェアセキュリティにおけるAI活用と攻撃

長谷川健人, 野澤康平, 披田野清良, 清本晋作, 橋本和夫, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2019 2019年

J-GLOBAL
ハードウェアトロイ検出方法の実装と課題

永田真一, 高橋功次, 近藤信一, 大屋優, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2019 2019年

J-GLOBAL
ニューラルネットワークを用いたハードウェアトロイ識別に対する敵対的サンプル攻撃の実証評価

野澤康平, 長谷川健人, 披田野清良, 清本晋作, 橋本和夫, 戸川望

電子情報通信学会技術研究報告 119 ( 260(HWS2019 57-66) ) 2019年

J-GLOBAL
2ⁿRRR : 高度な並び替えにより誤り耐性を強化したストカスティック数複製器 (VLSI設計技術) -- (デザインガイア2018 : VLSI設計の新しい大地)

石川遼太, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 118 ( 334 ) 95 - 100 2018年12月

CiNii
高位合成時のモジュール分割におけるバッファコスト最小化問題とその解法

大場諒介, 川村一志, 田宮豊, 柳澤政生, 戸川望

DAシンポジウム2018論文集 ( 2018 ) 63 - 68 2018年08月

CiNii
低電力化電気容量検出装置を用いた動作中の不正デバイス検知

西澤誠人, 長谷川健人, 柳澤政生, 戸川望

DAシンポジウム2018論文集 ( 2018 ) 112 - 117 2018年08月

CiNii
マイクロコントローラのスリープ状態に着目した消費電力にもとづく悪意のある機能の発現検知

長谷川健人, 柳澤政生, 戸川望

DAシンポジウム2018論文集 ( 2018 ) 118 - 123 2018年08月

CiNii
スマートフォン搭載3軸加速度センサと3軸ジャイロセンサを用いた自転車の挙動認識

宇佐見友理, 石川和明, 高山敏典, 柳澤政生, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2018論文集 ( 2018 ) 32 - 42 2018年06月

CiNii
POIを考慮した経路長指定の複数巡回経路探索手法

西村天晴, 石川和明, 高山敏典, 柳澤政生, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2018論文集 ( 2018 ) 1612 - 1621 2018年06月

CiNii
再収斂による計算誤りに耐性を持つストカスティック数複製器を用いた活性化関数の実装と評価 (VLSI設計技術)

石川遼太, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 118 ( 83 ) 167 - 172 2018年06月

CiNii J-GLOBAL
イジング計算機によるスロット配置問題の解法

金丸翔, 於久太祐, 多和田雅師, 田中宗, 林真人, 山岡雅直, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 118 ( 85(MSS2018 1-36) ) 161 - 166 2018年06月

J-GLOBAL
亜種ハードウェアトロイの設計とそのニューラルネットワークを用いた検出

井上智貴, 長谷川健人, 小林悠記, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 118 ( 85(MSS2018 1-36) ) 173 - 178 2018年06月

J-GLOBAL
リーク削減による低消費電力SRAMの設計—A low power SRAM design with leakage power reduction

伊藤卓, 戸川望, 柳澤政生, 史又華

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 31 197 - 202 2018年05月

CiNii
低周波圧電エネルギーハーベスティングにおけるMOSs SP-SSHI手法—MOSs SP-SSHI for low frequency piezoelectric energy harvesting

杉山貴紀, 戸川望, 柳澤政生, 史又華

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 31 86 - 91 2018年05月

CiNii
CNNに対する概算加算器の適用と評価—Application and evaluation of CNN with approximate adders

井上雄太, 戸川望, 柳澤政生, 史又華

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 31 191 - 196 2018年05月

CiNii
効率的なストカスティック数複製器と合成関数回路を用いたその評価 (ディペンダブルコンピューティング) -- (組込み技術とネットワークに関するワークショップETNET2018)

石川遼太, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 117 ( 480 ) 209 - 214 2018年03月

CiNii J-GLOBAL
鍵長128ビット,192ビット,256ビットの軽量暗号CLEFIAに対するスキャンベース攻撃手法 (ディペンダブルコンピューティング) -- (組込み技術とネットワークに関するワークショップETNET2018)

於久太祐, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 117 ( 480 ) 251 - 256 2018年03月

CiNii J-GLOBAL
鍵長128ビット,192ビット,256ビットの軽量暗号CLEFIAに対するスキャンベース攻撃手法 (コンピュータシステム) -- (組込み技術とネットワークに関するワークショップETNET2018)

於久太祐, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 117 ( 479 ) 251 - 256 2018年03月

CiNii
複数エリアへの近接度を用いたパーティクルフィルタによる屋内測位手法の適用

百瀬凌也, 石川和明, 柳澤政生, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2018 ROMBUNNO.A‐14‐7 2018年03月

J-GLOBAL
凍結ビットパタンの偏りを利用した高速Polar符号復号器とそのハードウェア実装の検討

多和田雅師, 神谷典史, 井手口裕太, 井上浩明, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2018 ROMBUNNO.A‐1‐12 2018年03月

J-GLOBAL
LSIの配線問題―DAシンポジウムの配線問題解法コンテスト―2 機械学習とFPGAを用いた配線問題解法への取り組み

川村一志, 長谷川健人, 多和田雅師, 戸川望

情報処理 59 ( 3 ) 228 - 231 2018年02月

J-GLOBAL
CNNに対する概算加算器の適用と評価

井上雄太, 戸川望, 柳澤政生, SHI Youhua

回路とシステムワークショップ論文集(CD-ROM) 31st 2018年

J-GLOBAL
組合せ最適化処理に向けた革新的アニーリングマシンの研究開発

山岡雅直, 川畑史郎, TSAI Jawshen, 中村泰信, 河原林健一, 戸川望, 田中宗, 小林哲則

電子情報通信学会大会講演論文集(CD-ROM) 2018 2018年

J-GLOBAL
スマートフォン搭載3軸加速度センサと3軸ジャイロセンサを用いた自転車の挙動認識

宇佐見友理, 石川和明, 高山敏典, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2018 ( 1 ) 2018年

J-GLOBAL
POIを考慮した経路長指定の複数巡回経路探索手法

西村天晴, 石川和明, 高山敏典, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2018 ( 1 ) 2018年

J-GLOBAL
経路選択履歴を用いたモンテカルロ木探索による推薦経路探索手法

百瀬凌也, 石川和明, 高山敏典, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2018 ( 1 ) 2018年

J-GLOBAL
スパイチップはあるのかハードウェアセキュリティの必要性

長谷川健人, 戸川望

情報処理 60 ( 1 ) 2018年

J-GLOBAL
低周波圧電エネルギーハーベスティングにおけるMOSs SP‐SSHI手法

杉山貴紀, 戸川望, 柳澤政生, SHI Youhua

回路とシステムワークショップ論文集(CD-ROM) 31st ROMBUNNO.A2‐1 2018年

J-GLOBAL
リーク削減による低消費電力SRAMの設計

伊藤卓, 戸川望, 柳澤政生, SHI Youhua

回路とシステムワークショップ論文集(CD-ROM) 31st ROMBUNNO.C4‐3 2018年

J-GLOBAL
トリガ条件の異なるハードウェアトロイの設計とSVMを用いた検出 (VLSI設計技術) -- (デザインガイア2017 : VLSI設計の新しい大地)

井上智貴, 長谷川健人, 小林悠記, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 117 ( 273 ) 133 - 138 2017年11月

CiNii
暗号回路に挿入されたハードウェアトロイとその抑止回路のFPGA実装 (VLSI設計技術) -- (デザインガイア2017 : VLSI設計の新しい大地)

長谷川健人, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 117 ( 273 ) 139 - 144 2017年11月

CiNii
静的な定数を係数とする乱数生成器を使用しないストカスティック論理回路 (VLSI設計技術) -- (デザインガイア2017 : VLSI設計の新しい大地)

多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 117 ( 273 ) 121 - 124 2017年11月

CiNii
環境発電動作を想定した揮発・不揮発レジスタ併用型フロアプラン指向高位合成手法

浅井大輝, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 57 - 62 2017年08月

CiNii
高ポイント高速数論変換に対する高位合成のためのループ構造最適化

川村一志, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 63 - 68 2017年08月

CiNii
スキャンシグネチャを用いた周辺回路を含む軽量暗号CLEFIAに対するスキャンベース攻撃

於久太祐, 多和田雅師, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 116 - 121 2017年08月

CiNii
不揮発性メモリを対象とした低書き込みメモリ暗号化手法

多和田雅師, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 122 - 126 2017年08月

CiNii
ネットの周辺情報を考慮した機械学習によるハードウェアトロイ識別

長谷川健人, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 127 - 132 2017年08月

CiNii
20KスピンCMOSアニーリングマシンを対象とした完全結合イジングモデルマッピング手法と評価

寺田晃太朗, 田中宗, 林真人, 山岡雅直, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 163 - 168 2017年08月

CiNii
セレクタ論理を適用したFFTプロセッサのFPGA実装評価

平井勇也, 川村一志, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 180 - 185 2017年08月

CiNii
遅延変動に対しロバストなAES暗号回路の設計

矢作裕基, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 210 - 215 2017年08月

CiNii
乱数によるビット並び替えに基づくストカスティック数複製器

石川遼太, 多和田雅師, 柳澤政生, 戸川望

DAシンポジウム2017論文集 ( 2017 ) 169 - 174 2017年08月

CiNii
近接度を用いたパーティクルフィルタによる高精度屋内測位手法

百瀬凌也, 新田知之, 柳澤政生, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2017論文集 ( 2017 ) 514 - 522 2017年06月

CiNii
疎な GPS 測位情報を対象にした測位精度と短時間滞在除去に基づく滞在地推定手法

岩田紗瑛, 新田知之, 高山敏典, 柳澤政生, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2017論文集 ( 2017 ) 523 - 531 2017年06月

CiNii
C-elementを用いたソフトエラー耐性をもつSHCラッチの設計

田島咲季, 戸川望, 柳澤政生, 史又華

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 30 214 - 219 2017年05月

CiNii
内部ノードを利用したソフトエラー検出ラッチの設計

中垣直道, 戸川望, 柳澤政生, 史又華

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 30 220 - 225 2017年05月

CiNii
最大エラー距離に基づくGeAr回路の最適化

早水謙, 戸川望, 柳澤政生, 史又華

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 30 7 - 12 2017年05月

CiNii
自己動力型スイッチング磁気変圧回路を用いたエネルギーハーベスティングシステム

川合洋平, 戸川望, 柳澤政生, 史又華

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 30 1 - 6 2017年05月

CiNii
連続してハッシュ値を出力しないHMAC-SHA-256回路へのスキャンベース攻撃手法 (ディペンダブルコンピューティング) -- (組込み技術とネットワークに関するワークショップETNET2017)

於久太祐, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 511 ) 129 - 134 2017年03月

CiNii
ネットの特徴量を用いた多層ニューラルネットワークによるハードウェアトロイ識別 (ディペンダブルコンピューティング) -- (組込み技術とネットワークに関するワークショップETNET2017)

長谷川健人, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 511 ) 135 - 140 2017年03月

CiNii
連続してハッシュ値を出力しないHMAC-SHA-256回路へのスキャンベース攻撃手法 (コンピュータシステム) -- (組込み技術とネットワークに関するワークショップETNET2017)

於久太祐, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 510 ) 129 - 134 2017年03月

CiNii
ネットの特徴量を用いた多層ニューラルネットワークによるハードウェアトロイ識別 (コンピュータシステム) -- (組込み技術とネットワークに関するワークショップETNET2017)

長谷川健人, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 510 ) 135 - 140 2017年03月

CiNii
ニューラルネットワークにもとづく概算回路設計手法 (VLSI設計技術)

川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 478 ) 25 - 30 2017年03月

CiNii
PDRの測位誤差補正のためのマルチシナリオ化マップマッチング手法 (画像工学)

岩名地良太, 新田知之, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 464 ) 387 - 392 2017年02月

CiNii
PDRの測位誤差補正のためのマルチシナリオ化マップマッチング手法 (マルチメディアストレージコンシューマエレクトロニクスヒューマンインフォメーションメディア工学映像表現&コンピュータグラフィックス)

岩名地良太, 新田知之, 柳澤政生, 戸川望

映像情報メディア学会技術報告 = ITE technical report 41 ( 5 ) 387 - 392 2017年02月

CiNii
高精度ストカスティック演算のためのFSM設計 (VLSI設計技術)

多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 415 ) 171 - 174 2017年01月

CiNii
エラー距離を考慮した概算累算回路の設計

早水謙, 戸川望, 柳澤政生, SHI Youhua

電子情報通信学会大会講演論文集(CD-ROM) 2017 2017年

J-GLOBAL
20KスピンCMOSアニーリングマシンを対象とした完全結合イジングモデルマッピング手法

寺田晃太朗, 田中宗, 林真人, 山岡雅直, 柳澤政生, 戸川望

日本物理学会講演概要集(CD-ROM) 72 ( 2 ) 2017年

J-GLOBAL
残留磁気を考慮したマルチシナリオ化マップマッチング手法

岩名地良太, 石川和明, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 117 ( 347(ITS2017 12-60) ) 2017年

J-GLOBAL
腕時計型ウェアラブル端末と略地図を用いたPDRの検討

河野圭亮, 石川和明, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 117 ( 347(ITS2017 12-60) ) 2017年

J-GLOBAL
眼鏡型ウェアラブル端末を用いた顔の回転に対応可能なPDR手法

矢野椋也, 石川和明, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 117 ( 347(ITS2017 12-60) ) 2017年

J-GLOBAL
近接度を用いたパーティクルフィルタによる高精度屋内測位手法

百瀬凌也, 新田知之, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2017 ( 1 ) 2017年

J-GLOBAL
眼鏡型ウェアラブル端末を用いた屋外歩行者ナビゲーションに適したPDR手法

矢野椋也, 新田知之, 石川和明, 柳澤政生, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2017 2017年

J-GLOBAL
腕時計型ウェアラブル端末向け略地図への現在地表示と自動切り替えの検討

河野圭亮, 新田知之, 石川和明, 柳澤政生, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2017 2017年

J-GLOBAL
GPS測位情報にSVMを適用した屋内外判定手法

岩田紗瑛, 石川和明, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 117 ( 347(ITS2017 12-60) ) 2017年

J-GLOBAL
道路照明灯とランドマークを用いた道路照度レベルの評価法

BAO Siya, YANAGISAWA Masao, TOGAWA Nozomu

電子情報通信学会大会講演論文集(CD-ROM) 2017 2017年

J-GLOBAL
NFC給電のみで動作可能なpH測定デバイスの提案

宮林駿, 逢坂哲彌, 多和田雅師, 戸川望, 片岡孝介, 朝日透, 岩田浩康, 隼田大輝, 岩瀬英治, 藤枝俊宣, 武岡真司, 大橋啓之, 佐藤慎, 黒岩繁樹, 門間聰之

ロボティクス・メカトロニクス講演会講演概要集 2017 ( 0 ) 1A1 - L10 2017年

Skin-attachable devices are essential to realizing personalized skin health through continuous monitoring of an individual's skin surface pH. This paper describes an approach to measuring skin surface pH anytime and anywhere, simply by holding an NFC-enabled phone over a pH-sensor device that operates solely on NFC energy harvesting. Because NFC supplies the power in place of a battery, the proposed device can be made smaller, lighter, and thinner, so it can be attached to the skin with an ultrathin polymer film called a nanosheet. In addition, a low-power circuit is proposed that implements a constant-current circuit and a wireless communication function.

DOI CiNii
セレクタ論理に帰着させたバタフライ演算器のFPGA実装評価 (VLSI設計技術) -- (デザインガイア2016 : VLSI設計の新しい大地)

伊東光希, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 330 ) 67 - 72 2016年11月

CiNii
CCNルータのためのコンテンツのネーム長による分割ハッシュテーブルと平衡木によるFIBの構築 (VLSI設計技術) -- (デザインガイア2016 : VLSI設計の新しい大地)

島崎健太, 右近祐太, 宮崎昭彦, 津田俊隆, 中里秀則, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 330 ) 123 - 128 2016年11月

CiNii
動作中のIoTデバイスに対する電気容量変化の測定を用いた不正改変検知装置の設計 (VLSI設計技術) -- (デザインガイア2016 : VLSI設計の新しい大地)

北山遼育, 竹中崇, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 330 ) 129 - 134 2016年11月

CiNii
経年劣化を考慮したフロアプラン統合化高位合成手法 (VLSI設計技術) -- (デザインガイア2016 : VLSI設計の新しい大地)

井川昂輝, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 330 ) 141 - 146 2016年11月

CiNii
スキャンシグネチャを用いたスキャンデータ解析に基づくHMAC-SHA-256ハッシュ回路のスキャンベース攻撃

於久太祐, 多和田雅師, 柳澤政生, 戸川望

DAシンポジウム2016論文集 2016 ( 2 ) 2 - 7 2016年09月

CiNii
Random Forestを用いたネットリスト特徴選択と機械学習によるハードウェアトロイ識別

長谷川健人, 柳澤政生, 戸川望

DAシンポジウム2016論文集 2016 ( 3 ) 8 - 13 2016年09月

CiNii
リードソロモン符号に基づいたマルチレベルセル不揮発性メモリ書き込み削減

多和田雅師, 柳澤政生, 戸川望

DAシンポジウム2016論文集 2016 ( 31 ) 163 - 168 2016年09月

CiNii
EDA研究の観点から (小特集 VDECとLSI設計研究・教育 : LSI設計試作のコモディティ化20年の歩みと今後)

戸川望

電子情報通信学会誌 = The journal of the Institute of Electronics, Information and Communication Engineers 99 ( 9 ) 901 - 906 2016年09月

CiNii
歩行者の方向判断基準を用いた腕時計型ウェアラブル端末向け略地図生成手法

河野圭亮, 新田知之, 石川和明, 柳澤政生, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2016論文集 ( 2016 ) 411 - 418 2016年07月

CiNii
眼鏡型ウェアラブル端末を用いたランドマーク確認に基づく屋外歩行者ナビゲーション

矢野椋也, 新田知之, 石川和明, 柳澤政生, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2016論文集 ( 2016 ) 419 - 427 2016年07月

CiNii
歩行者の視点情報に基づく屋内経路案内

岩名地良太, 新田知之, 石川和明, 柳澤政生, 戸川望

マルチメディア，分散協調とモバイルシンポジウム2016論文集 ( 2016 ) 1748 - 1756 2016年07月

CiNii
Trivium暗号回路に対するスキャンベース攻撃の実装評価 (システム数理と応用)

於久太祐, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 96 ) 7 - 12 2016年06月

CiNii
ニューラルネットを利用したネットリストの特徴にもとづくハードウェアトロイ識別 (システム数理と応用)

長谷川健人, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 96 ) 1 - 6 2016年06月

CiNii
ニューラルネットを利用したネットリストの特徴にもとづくハードウェアトロイ識別 (信号処理)

長谷川健人, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 95 ) 1 - 6 2016年06月

CiNii
Trivium暗号回路に対するスキャンベース攻撃の実装評価 (信号処理)

於久太祐, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 95 ) 7 - 12 2016年06月

CiNii
高速かつ低電力なソフトエラー耐性をもつFast-SEHラッチの設計

田島咲季, 史又華, 戸川望, 柳澤政生

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 29 220 - 224 2016年05月

CiNii
DFGのクリティカルパス最適化に基づく演算チェイニングを用いたRDRアーキテクチャ対象高位合成手法 (VLSI設計技術)

寺田晃太朗, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 21 ) 41 - 46 2016年05月

CiNii
A-6-4 FPGA実装を考慮したセレクタ論理型ボリュームレンダリング回路の改良と評価(A-6.VLSI設計技術,一般セッション)

五十嵐啓太, 柳澤政生, 戸川望

電子情報通信学会基礎・境界ソサイエティ/NOLTAソサイエティ大会講演論文集 2016 78 - 78 2016年03月

CiNii
A-6-5 クリティカルパス最適化フロアプラン指向FPGA高位合成手法のアプリケーション適用評価(A-6.VLSI設計技術,一般セッション)

藤原晃一, 川村一志, 柳澤政生, 戸川望

電子情報通信学会基礎・境界ソサイエティ/NOLTAソサイエティ大会講演論文集 2016 79 - 79 2016年03月

CiNii
FPGA実装を考慮したセレクタ論理型ボリュームレンダリング回路の設計評価 (VLSI設計技術)

五十嵐啓太, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 115 ( 465 ) 119 - 124 2016年02月

CiNii
動的遅延ばらつきに対する適応性を考慮したフロアプラン指向高位合成手法の検討 (VLSI設計技術)

井川昂輝, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 115 ( 398 ) 209 - 214 2016年01月

CiNii
冗長符号化を用いたマルチレベルセル不揮発性メモリ書き込み量削減

多和田雅師, 木村晋二, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 398(VLD2015 77-110) ) 2016年

J-GLOBAL
クリティカルパス最適化フロアプラン指向FPGA高位合成手法のアプリケーション適用評価

藤原晃一, 川村一志, 柳澤政生, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2016 2016年

J-GLOBAL
FPGA実装を考慮したセレクタ論理型ボリュームレンダリング回路の改良と評価

五十嵐啓太, 柳澤政生, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2016 2016年

J-GLOBAL
VDECとLSI設計研究・教育 4.EDA研究の観点から

戸川望

電子情報通信学会誌 99 ( 9 ) 2016年

J-GLOBAL
タイミングエラー耐性を持つAES暗号回路の設計

吉田慎之介, SHI Youhua, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 465(VLD2015 111-141) ) 2016年

J-GLOBAL
フロアプラン指向高位合成を用いたレジスタ分散型アーキテクチャ回路のFPGA実装

藤原晃一, 川村一志, 五十嵐啓太, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 465(VLD2015 111-141) ) 2016年

J-GLOBAL
制御回路を考慮したローテータベースマルチプレクサネットワークによるフィールドデータ抽出器の評価

伊東光希, 川村一志, 田宮豊, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 465(VLD2015 111-141) ) 2016年

J-GLOBAL
不揮発メモリを対象に最悪書き込みビット数削減と誤り訂正を両立する一対多符号構成手法

古城辰朗, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会論文誌 A(Web) J99-A ( 8 ) 2016年

J-GLOBAL
歩行者の視点情報に基づく屋内経路案内

岩名地良太, 新田知之, 石川和明, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2016 ( 1 ) 2016年

J-GLOBAL
歩行者の方向判断基準を用いた腕時計型ウェアラブル端末向け略地図生成手法

河野圭亮, 新田知之, 石川和明, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2016 ( 1 ) 2016年

J-GLOBAL
演算ビット幅に基づく演算チェイニングを用いたRDRアーキテクチャ向け性能指向高位合成手法

寺田晃太朗, 柳澤政生, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2016 2016年

J-GLOBAL
悪意ある機能を無効化する内部ハードウェアトロイ認証

大屋優, SHI Youhua, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 465(VLD2015 111-141) ) 2016年

J-GLOBAL
眼鏡型ウェアラブル端末を用いたランドマーク確認に基づく屋外歩行者ナビゲーション

矢野椋也, 新田知之, 石川和明, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2016 ( 1 ) 2016年

J-GLOBAL
15nmプロセスにおける低電力な耐ソフトエラーラッチの設計 (VLSI設計技術)

田島咲季, 史又華, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 115 ( 338 ) 123 - 127 2015年12月

CiNii
タイミングエラー予測回路によるデータ依存最適化回路設計とそのFPGA評価 (VLSI設計技術)

川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 115 ( 338 ) 183 - 188 2015年12月

CiNii
A-3-7 不揮発メモリを対象に最悪書込みビット数削減と誤り訂正を両立する一対多符号構成手法(A-3.VLSI設計技術,一般セッション)

古城辰朗, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会基礎・境界ソサイエティ/NOLTAソサイエティ大会講演論文集 2015 52 - 52 2015年08月

CiNii
A-9-2 低電力なソフトエラー耐性をもつNew-SEHラッチの設計(A-9.信頼性,一般セッション)

田島咲季, 史又華, 戸川望, 柳澤政生

電子情報通信学会基礎・境界ソサイエティ/NOLTAソサイエティ大会講演論文集 2015 106 - 106 2015年08月

CiNii
クラスタリングによる書き込みビット数削減と誤り訂正を実現する不揮発メモリを対象とした符号の構成手法

古城辰朗, 多和田雅師, 柳澤政生, 戸川望

DAシンポジウム2015論文集 ( 2015 ) 11 - 16 2015年08月

CiNii
演算チェイニングの候補列挙・選択アルゴリズムを用いたフロアプラン指向高位合成手法

寺田晃太朗, 柳澤政生, 戸川望

DAシンポジウム2015論文集 ( 2015 ) 17 - 22 2015年08月

CiNii
基板バイアス制御による遅延ばらつき補償および配線遅延を考慮した低エネルギーオーバーヘッド指向の高位合成手法

井川昂輝, 史又華, 柳澤政生, 戸川望

DAシンポジウム2015論文集 ( 2015 ) 23 - 28 2015年08月

CiNii
ローテータベースマルチプレクサネットワークによるフィールドデータ抽出器の構成手法

伊東光希, 川村一志, 田宮豊, 柳澤政生, 戸川望

DAシンポジウム2015論文集 ( 2015 ) 29 - 34 2015年08月

CiNii
セレクタ論理に帰着させたアルファブレンド演算器を用いた画像間合成回路のFPGA実装

五十嵐啓太, 柳澤政生, 戸川望

DAシンポジウム2015論文集 ( 2015 ) 143 - 148 2015年08月

CiNii
低電力IoTデバイスを対象とするノイズに強く測定の長さが制約されない小型電力解析装置の設計

北山遼育, 竹中崇, 柳澤政生, 戸川望

DAシンポジウム2015論文集 ( 2015 ) 161 - 166 2015年08月

CiNii
クロックグリッチに基づく故障解析に耐性を持つAES暗号回路 (VLSI設計技術)

平野大輔, 史又華, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 115 ( 21 ) 51 - 55 2015年05月

CiNii
クロックグリッチに基づく故障解析に耐性を持つAES暗号回路

平野大輔, 史又華, 戸川望, 柳澤政生

情報処理学会研究報告. SLDM, [システムLSI設計技術] 2015 ( 10 ) 1 - 5 2015年05月

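In recent years, fault analysis has become a threat as an attack on cryptographic circuits. Faults can be induced in a circuit by laser irradiation, voltage variation, clock glitches, and other means; clock glitches have attracted attention because they are easy to implement and control. Existing countermeasures include spatial redundancy, which triplicates the circuit and compares the results, and temporal redundancy, which performs the same operation twice and compares the results. However, these methods incur a large area overhead or a large time overhead. This paper proposes an AES cipher circuit that can quickly detect the clock glitches that enable fault analysis while keeping the area overhead to 4.9%.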

CiNii
製造ばらつきと配線遅延を同時に考慮した低レイテンシ指向のマルチシナリオ高位合成の評価 (ディペンダブルコンピューティング)

井川昂輝, 阿部晋矢, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 507 ) 155 - 160 2015年03月

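As a solution to ever-increasing process variation and interconnect delay, we have proposed a multi-scenario high-level synthesis method targeting the HDR architecture. Dividing the whole chip into regions called huddles, within which interconnect delay has no effect, makes it possible to predict interconnect delay appropriately at the high-level synthesis stage. In addition, the delay variation of functional units caused by process variation is handled as scenarios. A Typical scenario, for the case where functional-unit delay is typical, and a Worst scenario, for the worst case, are synthesized together onto a single chip, and the scenario is switched according to the characteristics of the manufactured chip, which achieves both high yield and high performance. The proposed method minimizes the number of control steps in each scenario and reduces the overall area through a process called commonization, which aligns inter-huddle data communication and inter-module connections across scenarios. In this paper, computational experiments compare the latency under each operating condition with that of the conventional method. We also compute the probability that a chip can operate under the Typical scenario from the delay distribution of the functional units and evaluate the expected latency. We confirmed that the proposed method reduces the expected latency by up to 35% compared with the conventional method.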

CiNii
鍵長に依存しないLED暗号に対するスキャンベース攻撃 (ディペンダブルコンピューティング)

藤代美佳, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 507 ) 149 - 154 2015年03月

スマートカード等において利用される軽量ブロック暗号にLED暗号があり,LED暗号へのスキャンベース攻撃が報告されている.しかし,この手法ではLED暗号の鍵長を64ビットとしており他の鍵長を考慮していない.他の鍵長の場合,秘密鍵を解読できない.本稿では,LED暗号の鍵長が64ビットより大きい場合のスキャンベース攻撃手法を提案する.計算機実験では,暗号回路のみをスキャンチェインに含む場合,提案手法を用いて平均145個の平文で128ビットの秘密鍵を復元可能と確認した.

CiNii
鍵長に依存しないLED暗号に対するスキャンベース攻撃 (コンピュータシステム)

藤代美佳, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 506 ) 149 - 154 2015年03月

スマートカード等において利用される軽量ブロック暗号にLED暗号があり,LED暗号へのスキャンベース攻撃が報告されている.しかし,この手法ではLED暗号の鍵長を64ビットとしており他の鍵長を考慮していない.他の鍵長の場合,秘密鍵を解読できない.本稿では,LED暗号の鍵長が64ビットより大きい場合のスキャンベース攻撃手法を提案する.計算機実験では,暗号回路のみをスキャンチェインに含む場合,提案手法を用いて平均145個の平文で128ビットの秘密鍵を復元可能と確認した.

CiNii
低電力耐ソフトエラーラッチの設計 (VLSI設計技術)

田島咲季, 史又華, 戸川望, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 476 ) 55 - 60 2015年03月

近年のLSIの微細化に伴い,ソフトエラーによる信頼性の低下が問題視されている.フリップフロップの多重化等の様々なソフトエラー対策が提案されてきたが,多重化による面積・電力の増大が問題である.そこで,本稿では既存のSEHラッチに低電力化技術であるTSPC (True Single Phase Clock)を取り入れた,低電力耐ソフトエラーラッチを提案する.レイアウトを設計し,HSPICEシミュレーションによりTSPC-SEHラッチと従来のSEHラッチ,DICEラッチと比較し,ソフトエラー耐性を損なわずに電力を最大42%削減し,54%の動作速度向上を達成した.

DOI CiNii
ゲートレベルネットリストを対象としたスコアに基づくハードウェアトロイ識別手法 (VLSI設計技術)

大屋優, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 476 ) 165 - 170 2015年03月

近年,企業はデジタルICの製造コストを削減するために,チップの製造をサードパーティに外注するようになった.サードパーティが製造に関わるようになり,ハードウェアトロイ(HTs)の挿入が問題視されるようになった.設計段階ではRegister Transfer Level (RTL)やゲートレベルのネットリストが1つだけ生成されるため,Goldenネットリストを仮定することはできない.以上の背景から生成されたネットリストにHTsが挿入されているか否かを識別するのは極めて困難である.本稿では,Goldenネットリストを使わずにゲートレベルのネットリストに対してスコアに基づいたHTsの有無を識別する手法を提案する.提案手法は,HTsに含まれるネット(トロイネットと呼ぶ)の特徴に注目し,トロイネットを検出することでHTsを検出する.提案手法はTrust-HUBのAbstraction Gate Levelで公開されている全てのゲートレベルのネットリストに対してHTsの有無を分類することに成功した.提案手法にかかる時間は高々数時間程度である.

CiNii
鍵長に依存しないLED暗号に対するスキャンベース攻撃

藤代美佳, 柳澤政生, 戸川望

情報処理学会研究報告. SLDM, [システムLSI設計技術] 2015 ( 47 ) 1 - 6 2015年02月

スマートカード等において利用される軽量ブロック暗号に LED 暗号があり,LED 暗号へのスキャンベース攻撃が報告されている.しかし,この手法では LED 暗号の鍵長を 64 ビットとしており他の鍵長を考慮していない.他の鍵長の場合,秘密鍵を解読できない.本稿では,LED 暗号の鍵長が 64 ビットより大きい場合のスキャンベース攻撃手法を提案する.計算機実験では,暗号回路のみをスキャンチェインに含む場合,提案手法を用いて平均 145 個の平文で 128 ビットの秘密鍵を復元可能と確認した.

CiNii
製造ばらつきと配線遅延を同時に考慮した低レイテンシ指向のマルチシナリオ高位合成の評価

井川昂輝, 阿部晋矢, 柳澤政生, 戸川望

情報処理学会研究報告. SLDM, [システムLSI設計技術] 2015 ( 48 ) 1 - 6 2015年02月

増大を続ける製造ばらつきや配線遅延への解決策として,HDR アーキテクチャを対象としたマルチシナリオ高位合成手法を提案している.チップ全体をハドルと呼ばれる配線遅延の影響のない範囲に分割することで高位合成段階における適切な配線遅延の予測が可能となる.加えて製造ばらつきによる演算器の遅延ばらつきをシナリオとして扱う.演算器の遅延が Typical ケースの場合の Typical シナリオ,Worst ケースの場合の Worst シナリオを同時に 1 つのチップ上に高位合成し,製造されたチップの特性に応じてシナリオを切り替えることで高い歩留りと高い性能の両立が可能となる.提案手法は各シナリオの動作コントロールステップ数を最小化し,ハドル間データ通信やモジュール間結線をシナリオ間で揃える共通化と呼ばれる処理により全体の面積を削減する.本稿では,計算機実験により各動作条件におけるレイテンシを従来手法と比較し評価する.また,演算器の遅延分布から Typical シナリオで動作可能な確率を算出し,レイテンシの期待値も評価する.提案手法は従来手法と比較し,レイテンシの期待値を最大 35% 削減できることを確認した.

CiNii
A-3-1 FPGA向けフロアプラン指向高位合成手法のための配線遅延モデリング(A-3.VLSI設計技術,一般セッション)

藤原晃一, 柳澤政生, 戸川望

電子情報通信学会総合大会講演論文集 2015 80 - 80 2015年02月

CiNii
A-3-8 FPGAを対象としたセレクタ論理型アルファブレンディング回路の実装と評価(A-3.VLSI設計技術,一般セッション)

五十嵐啓太, 柳澤政生, 戸川望

電子情報通信学会総合大会講演論文集 2015 87 - 87 2015年02月

CiNii
トロイネットの特徴に基づくハードウェアトロイ検出手法 (VLSI設計技術)

大屋優, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 426 ) 157 - 162 2015年01月

近年,企業はチップの製造コストを削減するために,チップの製造をサードパーティに外注するようになった.サードパーティが製造に関わるようになり,ハードウェアトロイの挿入が問題視されるようになった.特に設計段階では容易にハードウェアトロイを挿入することができる.ゲートレベルのネットリストに対してハードウェアトロイ検出手法を適用する場合,Goldenネットリストを持っておらず,挿入されているハードウェアトロイを活性化させないという条件下でハードウェアトロイを検出できる手法は存在しない.本稿では,Goldenネットリストが無く,ハードウェアトロイを活性化させなくても,ハードウェアトロイを検出する手法を提案する.提案手法は,ハードウェアトロイに含まれるネット(トロイネットと呼ぶ)の特徴に注目し,トロイネットを検出することでハードウェアトロイを検出する.トロイネットの特徴は9個あり,これらの特徴に一致するネットに重みづけを行うことで,トロイネットを検出する.提案手法はTrust-HUBのAbstraction Gate Levelで公開されているハードウェアトロイの挿入されている全てのゲートレベルのネットリストに対してトロイネットを検出した.加えて,2個のネットリストを除いて,誤検出なくトロイネットのみを検出することに成功した.提案手法にかかる時間は高々数十分程度である.

CiNii
トロイネットの特徴に基づくハードウェアトロイ検出手法 (コンピュータシステム)

大屋優, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 427 ) 157 - 162 2015年01月

近年,企業はチップの製造コストを削減するために,チップの製造をサードパーティに外注するようになった.サードパーティが製造に関わるようになり,ハードウェアトロイの挿入が問題視されるようになった.特に設計段階では容易にハードウェアトロイを挿入することができる.ゲートレベルのネットリストに対してハードウェアトロイ検出手法を適用する場合,Goldenネットリストを持っておらず,挿入されているハードウェアトロイを活性化させないという条件下でハードウェアトロイを検出できる手法は存在しない.本稿では,Goldenネットリストが無く,ハードウェアトロイを活性化させなくても,ハードウェアトロイを検出する手法を提案する.提案手法は,ハードウェアトロイに含まれるネット(トロイネットと呼ぶ)の特徴に注目し,トロイネットを検出することでハードウェアトロイを検出する.トロイネットの特徴は9個あり,これらの特徴に一致するネットに重みづけを行うことで,トロイネットを検出する.提案手法はTrust-HUBのAbstraction Gate Levelで公開されているハードウェアトロイの挿入されている全てのゲートレベルのネットリストに対してトロイネットを検出した.加えて,2個のネットリストを除いて,誤検出なくトロイネットのみを検出することに成功した.提案手法にかかる時間は高々数十分程度である.

CiNii
トロイネットの特徴に基づくハードウェアトロイ検出手法

大屋優, 史又華, 柳澤政生, 戸川望

情報処理学会研究報告. SLDM, [システムLSI設計技術] 2015 ( 28 ) 1 - 6 2015年01月

近年,企業はチップの製造コストを削減するために,チップの製造をサードパーティに外注するようになった.サードパーティが製造に関わるようになり,ハードウェアトロイの挿入が問題視されるようになった.特に設計段階では容易にハードウェアトロイを挿入することができる.ゲートレベルのネットリストに対してハードウェアトロイ検出手法を適用する場合,Golden ネットリストを持っておらず,挿入されているハードウェアトロイを活性化させないという条件下でハードウェアトロイを検出できる手法は存在しない.本稿では,Golden ネットリストが無く,ハードウェアトロイを活性化させなくても,ハードウェアトロイを検出する手法を提案する.提案手法は,ハードウェアトロイに含まれるネット (トロイネットと呼ぶ) の特徴に注目し,トロイネットを検出することでハードウェアトロイを検出する.トロイネットの特徴は 9 個あり,これらの特徴に一致するネットに重みづけを行うことで,トロイネットを検出する.提案手法は Trust-HUB の Abstraction Gate Level で公開されているハードウェアトロイの挿入されている全てのゲートレベルのネットリストに対してトロイネットを検出した.加えて,2 個のネットリストを除いて,誤検出なくトロイネットのみを検出することに成功した.提案手法にかかる時間は高々数十分程度である.

CiNii
不揮発メモリを対象に最悪書込みビット数削減と誤り訂正を両立する一対多符号構成手法

古城辰朗, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会大会講演論文集(CD-ROM) 2015 2015年

J-GLOBAL
回路面積を考慮した不揮発性メモリ書き込み削減符号生成手法

多和田雅師, 木村晋二, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 338(VLD2015 38-76) ) 2015年

J-GLOBAL
CCNルータのためのハッシュテーブルと平衡木の併用によるメモリアクセスを削減したFIBの構築

島崎健太, 青木孝, 羽田野孝裕, 大塚卓哉, 宮崎昭彦, 津田俊隆, PARK Yong-Jin, 戸川望

電子情報通信学会技術研究報告 115 ( 338(VLD2015 38-76) ) 2015年

J-GLOBAL
配線遅延とクロックスキューを利用したフロアプラン指向FPGA高位合成手法

藤原晃一, 川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 338(VLD2015 38-76) ) 2015年

J-GLOBAL
SVMを利用したネットリストの特徴に基づくハードウェアトロイ識別

長谷川健人, 大屋優, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 338(VLD2015 38-76) ) 2015年

J-GLOBAL
ゲートレベルネットリストの脆弱性を表現する指標

大屋優, SHI Youhua, 山下哲孝, 岡村利彦, 角尾幸保, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 115 ( 338(VLD2015 38-76) ) 2015年

J-GLOBAL
低電力なソフトエラー耐性をもつNew-SEHラッチの設計

田島咲季, SHI Youhua, 戸川望, 柳澤政生

電子情報通信学会大会講演論文集(CD-ROM) 2015 2015年

J-GLOBAL
低電力IoTデバイスを対象としたノイズ低減機能を持つ小型電力解析装置の設計

北山遼育, 竹中崇, 柳澤政生, 戸川望

回路とシステムワークショップ論文集(CD-ROM) 28 2015年

J-GLOBAL
立地を考慮した可視ランドマークに基づく屋外歩行者案内

竹田健吾, 柳澤政生, 戸川望, 新田知之, 進藤大介

情報処理学会シンポジウムシリーズ(CD-ROM) 2015 ( 1 ) 2015年

J-GLOBAL
マルチプレクサ木分割によるフィールドデータ抽出器の構成手法 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

伊東光希, 川村一志, 柳澤政生, 戸川望, 田宮豊

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 197 - 202 2014年11月

TCP/IPオフロードエンジンのパケット解析や動画データエンコード・デコード回路のストリームデータ処理に見られるように,動的にフィールド位置が変わるデータから一部のデータを効率良く取り出す処理が必要となる.これは入出力となるレジスタを多数のマルチプレクサを用いて接続することによって実現されるが,入出力レジスタのバイト長が増大すると,必要となるマルチプレクサ数も増大し,いかにマルチプレクサ数を削減するかが大きな課題となる.マルチプレクサを用いて,Mバイト長データを収めるレジスタの任意オフセットから連続したNバイトを読み出す回路をフィールドデータ抽出器と呼ぶ.本稿では,入出力レジスタの接続に仮想中間レジスタを設けることで,マルチプレクサ木の段数を変えずにマルチプレクサ数を削減する回路構成を提案する.また,マルチプレクサ数が最小となるような仮想中間レジスタサイズを定める手法も提案する.提案手法を論理合成して評価したところ,仮想中間レジスタを設ける前と比べてゲート数が最大92%削減できることを確認した.

CiNii
回路面積を考慮したSuspicious Timing Error Prediction回路の挿入位置決定手法の改良と評価 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

吉田慎之介, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 57 - 62 2014年11月

近年,半導体技術の進展に伴いタイミングエラー発生の危険性が増加している.STEPはタイミングエラーを事前に予測できる手法であるが,STEP回路を挿入する位置が重要である.このような背景から、回路面積を考慮したSTEP回路の挿入位置決定手法を提案した.本手法ではSTEP回路の個数を削減するために短いパスを無視するが,長いパスまで無視する可能性があった.また,短いパスに合わせて位置ラベルを付けるため,STEP回路の挿入位置がパスの後半に偏る可能性があった.本稿ではSTEP回路の挿入位置決定手法で用いる,短いパスの探索方法とラベル付けの方法を改良する.パスの長さを推定することで短いパスのみを無視できるため,これまでSTEP回路を挿入しなかった長いパスで発生するタイミングエラーが予測できる.また,任意の長さのパスに合わせたラベル付けもできるため,チェックポイントがパスの後半となることを防ぐ.改良した手法を複数の回路に対して適用し,最大動作周波数の向上を図る.実験結果よりSTEP回路を入れない場合と比較して,最大動作周波数を平均1.71倍に向上させることができた.改良前の手法と比較すると,最大動作周波数を平均1.15倍に向上させることができた.

CiNii
HDR-mcvを対象とした複数クロックドメインおよび複数電源電圧による低電力化高位合成手法 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

阿部晋矢, 史又華, 宇佐美公良, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 203 - 208 2014年11月

低電力かつ高速なLSIの設計へ向け,配線遅延を考慮しながら複数クロックドメイン,複数電源電圧を同時に適用可能なHDR-mcvおよび高位合成手法が提案された.従来手法はクロックおよび電圧をハドルと呼ぶ区画毎に割り当てるが,クロックツリー数の増加による消費エネルギーのオーバヘッドが無視できない.提案手法はクロックに同期する論理,および演算回路に対し独立に電圧を割り当てることで,クロックツリー数を増加せずに複数クロックドメインと複数電源電圧を同時適用する.計算機実験結果により,提案手法は従来のHDR-mcvアーキテクチャを対象とした高位合成アルゴリズムと比較し50%程度消費エネルギーを削減し,最終的に従来のレジスタ分散型アーキテクチャと比較し提案手法は60%程度消費エネルギーを削減できることを確認した.

CiNii
HDRアーキテクチャを対象とした製造ばらつき耐性と低レイテンシを両立可能なマルチシナリオ高位合成手法 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

井川昂輝, 阿部晋矢, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 105 - 110 2014年11月

半導体プロセスの継続的な微細化により,製造ばらつきや配線遅延がLSI設計に与える影響が増加している.これらに対し,製造ばらつきに応じてLSI動作に複数のシナリオを想定し,しかも配線遅延を考慮した高位合成手法の構築が有力な解となる.本稿では,分散レジスタアーキテクチャモデルの1つとしてHDRアーキテクチャを対象に,製造ばらつき耐性と低レイテンシを両立するマルチシナリオ高位合成手法を提案する.提案手法では使用するすべての演算器の遅延がTypicalケースの場合,Worstケースの場合の2つのシナリオを想定し,これらのシナリオを同時にLSI上に高位合成する.HDRアーキテクチャを前提にハドルによるモジュールの抽象化により,レイアウトに起因する問題の複雑度を軽減し,TypicalシナリオとWorstシナリオで可能な限り共通化したスケジューリング/バインディングを実行することで2つのシナリオを同時に最適化する.計算機実験により,従来手法と比較しTypicalシナリオのレイテンシを平均33%,最大39%削減できることを確認した.

CiNii
不揮発メモリを対象とした最大ハミング距離と最小ハミング距離を制約した符号による書き込み手法のエネルギー評価 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

古城辰朗, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 221 - 226 2014年11月

デバイスの微細化によってメモリに保存されている値が破壊されるリスクが増大する.メモリの値を破壊から守る手法として誤り訂正符号を利用することが挙げられる.誤り訂正符号はメモリに書き込むビット数が多いため,書き込みエネルギーが大きくなるという欠点があり,書き込みビット数削減と誤り訂正を同時に実現する符号が必要とされる.我々は書き込みビット数削減と誤り訂正を同時に実現する符号としてドーナツ符号を提案した.また,符号拡張手法を提案し,ドーナツ符号に符号拡張手法を適用した拡張ドーナツ符号を提案した.拡張ドーナツ符号は,書き込みビット数を削減し,同時に誤り訂正を実現した符号である.本稿では,我々が提案した拡張ドーナツ符号について,エンコーダ・デコーダならびに不揮発メモリのエネルギーを評価する.評価実験の結果,拡張ドーナツ符号を用いたメモリはハミング符号を用いたメモリに比べて最大32%のエネルギーが削減された.

CiNii
不揮発メモリの書き込み削減手法のための小面積なエンコーダ/デコーダ回路構成 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

多和田雅師, 木村晋二, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 227 - 232 2014年11月

不揮発メモリはリーク電力が非常に小さい,電源が落ちていても情報を保持できるといった性質から次世代メモリとして注目されている.一方で不揮発メモリには書き込みエネルギーが大きい,書き換え回数に上限があるという問題がある.書き込みエネルギーの削減とウェアレベリングを行う手法としてビットレベルでの書き込み削減手法が存在する.ハミング符号より生成した冗長符号を用いてメモリに保存する値を符号化して書き込む手法が提案されている.従来手法の回路構成では符号化のためのエンコーダ,デコーダの規模が大きくなる欠点がある.本稿では書き込み削減手法に適した符号構成を行うことでエンコーダ,デコーダの面積を小さくする手法を提案する.メモリに保存したいビットシーケンスをエンコードせずにエンコード後のベクトルとみなしても書き込みに必要な情報が得られる.メモリに保存されているベクトルを誤り訂正すると,デコードせずにシンドロームが元のビットシーケンスが持つ情報と一致する.その結果,小面積のエンコーダ,デコーダが構成できる.提案手法によりエンコーダとデコーダを設計した結果,従来手法と比較して面積が削減されることを確認する.

CiNii
タイミングエラー予測回路による再構成可能デバイス上でのデータ依存最適化回路設計 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

川村一志, 阿部晋矢, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 51 - 56 2014年11月

LSI内部の各パス遅延は入力データに応じて様々に変動する.この性質を利用することで,計算精度をわずかに落としながらも高速に動作するLSIの設計が可能になる.本稿では,入力データ群にもとづき特定された最適化すべきパスをリコンフィギュレーションし最適化する,新たな回路設計アルゴリズムを提案する.提案アルゴリズムは最適化対象の回路にタイミングエラー予測回路を挿入し動作させることで被最適化パスを特定,動的に再構成し与えられたエラー制約内で動作クロック周期の最小化を図る.本アルゴリズムを加算器に対して適用した結果,通常のクリティカルパス最小化の設計と比較し,2.1%以下のエラーを許容する制約下で最大18.5%の高速化に成功した.

CiNii
タイミングエラーへの耐性を持つフリップフロップ設計 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

鈴木大渡, 史又華, 戸川望, 宇佐美公良, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 45 - 50 2014年11月

集積回路の微細化の影響により,回路のばらつきが大きくなっており,設計に必要な電源電圧やクロック周波数のマージンが増大している.マージンの緩和のため,タイミングエラーへの耐性を持つ回路の構造が盛んに研究されている.本稿では,フリップフロップの動作とラッチの動作を動的に切り替えることによりタイミングエラー耐性を実現するTime Borrowing Flip-Flop(TBFF)のトランジスタレベルの構造を2通り提案した.また,HSPICEシミュレーションによる評価を行い,従来手法と比較して消費エネルギーを最大20.6%削減できることを示した.

CiNii
FPGAの配線遅延特性を利用したフロアプラン指向高位合成手法 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

藤原晃一, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 99 - 104 2014年11月

近年,画像処理や通信プロトコル処理などデータを高速処理する必要がある場面で,高位合成を利用したFPGA設計が増加している.しかし,LSIプロセスの微細化に伴って配線遅延のボトルネックが深刻化しており,FPGAにおいても例外では無い.また,FPGAではマルチプレクサ(MUX)が回路の遅延・面積において大きなボトルネックである.高位合成を利用したFPGA設計では,高位合成段階で配線遅延の考慮とMUXの削減を同時に実現することが強く求められる.FPGAは種類によって配線遅延特性が異なるため,配線遅延を見積もる際にはFPGAの配線遅延特性を考慮する必要がある.本稿では,高位合成段階でMUXを削減・制限した上で,FPGAの配線遅延特性を考慮したフロアプラン指向高位合成手法を提案する.提案手法はバインディングにおいてMUXの削減・制限を行い,FPGAにおけるマルチプレクサのボトルネックを解決する.また,レジスタ分散型アーキテクチャの1つであるHDRアーキテクチャを用いて,高位合成段階でモジュールの配置を行う.フロアプランの際に,FPGAでの配線遅延特性を考慮した配線遅延距離を用いることで,適切にFPGAでの配線遅延を見積もると共に,クリティカルパス遅延の小さいフロアプラン結果を実現する.提案手法は,従来手法と比較して配線遅延特性の顕著なFPGAにおいて,スライス数を同程度にした上でレイテンシーを最大6%,平均3%削減した.

CiNii
DTMOSを用いたサブスレッショルド回路の高速化設計 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

福留祐治, 史又華, 戸川望, 宇佐美公良, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 117 - 121 2014年11月

サブスレッショルド領域で回路を動作させることで低電力化は実現されるが,同時に速度が劣化するトレードオフの関係にある.本稿ではサブスレッショルド領域において低電力で高速化を実現するため,DTMOSを用いたサブスレッショルド回路の高速化設計を行い,トランジスタレベルのシミュレーションの結果,30〜45%高速化し,Vdd=0.2V,0.3Vにおいて平均15%低エネルギー化したことを示す.

CiNii
ハードウェアトロイに含まれるネットに着目したハードウェアトロイ検出手法 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

大屋優, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 135 - 140 2014年11月

近年チップの製造をサードパーティに外注するようになり,ハードウェアトロイが挿入される可能性が高まってきた.特に設計段階では簡単にハードウェアトロイを挿入することができる.ゲートレベルのネットリストに対してハードウェアトロイ検出手法を適用する場合,我々はGoldenネットリストを持っておらず,挿入されているハードウェアトロイを活性化するという条件下でハードウェアトロイを検出する手法が存在するのみである.本稿では,Goldenネットリストが無く,ハードウェアトロイを活性化させなくてもハードウェアトロイを検出する手法として,低スイッチング確率のネット(LSLGネットと呼ぶ)の検出を通じてハードウェアトロイを検出する手法を提案する.LSLGネットはネットリストに含まれるネットの数%であるにも関わらず,Trust-HUBのAbstraction Gate Levelで公開されているハードウェアトロイが挿入されている全てのゲートレベルのネットリストに対して,ハードウェアトロイの一部を検出することに成功した.提案手法にかかる時間は高々十数分程度である.

CiNii
遅延ばらつき許容量を最適化するRDRアーキテクチャ向け高位合成手法 (VLSI設計技術) -- (デザインガイア2014 : VLSI設計の新しい大地)

萩尾勇太, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 328 ) 209 - 214 2014年11月

本稿では,遅延ばらつきの許容量を調整でき,なおかつレイテンシが増大しない範囲内で遅延ばらつき許容量を最大化するRDRアーキテクチャ向け高位合成手法を提案する.遅延ばらつきによるタイミング違反が発生しない場合と発生した場合の2通りのスケジューリング,バインディングを想定し,チップ製造後に発生した遅延ばらつきに応じて動作を選択する.入力としてばらつき率を与えることで,ばらつきの許容量の目標値を設定できる.ばらつき率を変化させながら複数回スケジューリング/バインディングを行うことで,レイテンシが増大しない範囲内で遅延ばらつき耐性を最大化するスケジューリング/バインディング解を求める.また,RDRアーキテクチャの空き領域を利用しここに演算器を追加することで,遅延ばらつきによるタイミング違反が発生した場合でも実行時間の最小化を図る.さらに,2通りのスケジューリング,バインディング結果に類似化という考えを導入することでチップ面積を最小化する.計算機実験により,提案手法は従来手法と比較して遅延ばらつき発生時の実行時間を最大16.7%削減,遅延ばらつき耐性を最大24%向上させることを確認した.

CiNii
タイミングエラーへの耐性を持つフリップフロップ設計 (ディペンダブルコンピューティング) -- (デザインガイア2014 : VLSI設計の新しい大地)

鈴木大渡, 史又華, 戸川望, 宇佐美公良, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 329 ) 45 - 50 2014年11月

集積回路の微細化の影響により,回路のばらつきが大きくなっており,設計に必要な電源電圧やクロック周波数のマージンが増大している.マージンの緩和のため,タイミングエラーへの耐性を持つ回路の構造が盛んに研究されている.本稿では,フリップフロップの動作とラッチの動作を動的に切り替えることによりタイミングエラー耐性を実現するTime Borrowing Flip-Flop(TBFF)のトランジスタレベルの構造を2通り提案した.また,HSPICEシミュレーションによる評価を行い,従来手法と比較して消費エネルギーを最大20.6%削減できることを示した.

CiNii
FPGAの配線遅延特性を利用したフロアプラン指向高位合成手法 (ディペンダブルコンピューティング) -- (デザインガイア2014 : VLSI設計の新しい大地)

藤原晃一, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 329 ) 99 - 104 2014年11月

近年,画像処理や通信プロトコル処理などデータを高速処理する必要がある場面で,高位合成を利用したFPGA設計が増加している.しかし,LSIプロセスの微細化に伴って配線遅延のボトルネックが深刻化しており,FPGAにおいても例外では無い.また,FPGAではマルチプレクサ(MUX)が回路の遅延・面積において大きなボトルネックである.高位合成を利用したFPGA設計では,高位合成段階で配線遅延の考慮とMUXの削減を同時に実現することが強く求められる.FPGAは種類によって配線遅延特性が異なるため,配線遅延を見積もる際にはFPGAの配線遅延特性を考慮する必要がある.本稿では,高位合成段階でMUXを削減・制限した上で,FPGAの配線遅延特性を考慮したフロアプラン指向高位合成手法を提案する.提案手法はバインディングにおいてMUXの削減・制限を行い,FPGAにおけるマルチプレクサのボトルネックを解決する.また,レジスタ分散型アーキテクチャの1つであるHDRアーキテクチャを用いて,高位合成段階でモジュールの配置を行う.フロアプランの際に,FPGAでの配線遅延特性を考慮した配線遅延距離を用いることで,適切にFPGAでの配線遅延を見積もると共に,クリティカルパス遅延の小さいフロアプラン結果を実現する.提案手法は,従来手法と比較して配線遅延特性の顕著なFPGAにおいて,スライス数を同程度にした上でレイテンシーを最大6%,平均3%削減した.

CiNii
DTMOSを用いたサブスレッショルド回路の高速化設計 (ディペンダブルコンピューティング) -- (デザインガイア2014 : VLSI設計の新しい大地)

福留祐治, 史又華, 戸川望, 宇佐美公良, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 329 ) 117 - 121 2014年11月

サブスレッショルド領域で回路を動作させることで低電力化は実現されるが,同時に速度が劣化するトレードオフの関係にある.本稿ではサブスレッショルド領域において低電力で高速化を実現するため,DTMOSを用いたサブスレッショルド回路の高速化設計を行い,トランジスタレベルのシミュレーションの結果,30〜45%高速化し,Vdd=0.2V,0.3Vにおいて平均15%低エネルギー化したことを示す.

CiNii
ハードウェアトロイに含まれるネットに着目したハードウェアトロイ検出手法 (ディペンダブルコンピューティング) -- (デザインガイア2014 : VLSI設計の新しい大地)

大屋優, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 329 ) 135 - 140 2014年11月

近年チップの製造をサードパーティに外注するようになり,ハードウェアトロイが挿入される可能性が高まってきた.特に設計段階では簡単にハードウェアトロイを挿入することができる.ゲートレベルのネットリストに対してハードウェアトロイ検出手法を適用する場合,我々はGoldenネットリストを持っておらず,挿入されているハードウェアトロイを活性化するという条件下でハードウェアトロイを検出する手法が存在するのみである.本稿では,Goldenネットリストが無く,ハードウェアトロイを活性化させなくてもハードウェアトロイを検出する手法として,低スイッチング確率のネット(LSLGネットと呼ぶ)の検出を通じてハードウェアトロイを検出する手法を提案する.LSLGネットはネットリストに含まれるネットの数%であるにも関わらず,Trust-HUBのAbstraction Gate Levelで公開されているハードウェアトロイが挿入されている全てのゲートレベルのネットリストに対して,ハードウェアトロイの一部を検出することに成功した.提案手法にかかる時間は高々十数分程度である.

CiNii
遅延ばらつき許容量を最適化するRDRアーキテクチャ向け高位合成手法 (ディペンダブルコンピューティング) -- (デザインガイア2014 : VLSI設計の新しい大地)

萩尾勇太, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 329 ) 209 - 214 2014年11月

本稿では,遅延ばらつきの許容量を調整でき,なおかつレイテンシが増大しない範囲内で遅延ばらつき許容量を最大化するRDRアーキテクチャ向け高位合成手法を提案する.遅延ばらつきによるタイミング違反が発生しない場合と発生した場合の2通りのスケジューリング,バインディングを想定し,チップ製造後に発生した遅延ばらつきに応じて動作を選択する.入力としてばらつき率を与えることで,ばらつきの許容量の目標値を設定できる.ばらつき率を変化させながら複数回スケジューリング/バインディングを行うことで,レイテンシが増大しない範囲内で遅延ばらつき耐性を最大化するスケジューリング/バインディング解を求める.また,RDRアーキテクチャの空き領域を利用しここに演算器を追加することで,遅延ばらつきによるタイミング違反が発生した場合でも実行時間の最小化を図る.さらに,2通りのスケジューリング,バインディング結果に類似化という考えを導入することでチップ面積を最小化する.計算機実験により,提案手法は従来手法と比較して遅延ばらつき発生時の実行時間を最大16.7%削減,遅延ばらつき耐性を最大24%向上させることを確認した.

CiNii
マルチプレクサ木分割によるフィールドデータ抽出器の構成手法

伊東光希, 川村一志, 柳澤政生, 戸川望, 田宮豊

研究報告システムとLSIの設計技術（SLDM） 2014 ( 39 ) 1 - 6 2014年11月

TCP/IP オフロードエンジンのパケット解析や動画データエンコード・デコード回路のストリームデータ処理に見られるように，動的にフィールド位置が変わるデータから一部のデータを効率良く取り出す処理が必要となる．これは入出力となるレジスタを多数のマルチプレクサを用いて接続することによって実現されるが，入出力レジスタのバイト長が増大すると，必要となるマルチプレクサ数も増大し，いかにマルチプレクサ数を削減するかが大きな課題となる．マルチプレクサを用いて，M バイト長データを収めるレジスタの任意オフセットから連続した N バイトを読み出す回路をフィールドデータ抽出器と呼ぶ．本稿では，入出力レジスタの接続に仮想中間レジスタを設けることで，マルチプレクサ木の段数を変えずにマルチプレクサ数を削減する回路構成を提案する．また，マルチプレクサ数が最小となるような仮想中間レジスタサイズを定める手法も提案する．提案手法を論理合成して評価したところ，仮想中間レジスタを設ける前と比べてゲート数が最大 92％削減できることを確認した．

As seen in packet analysis of TCP/IP offload engines and stream data processing of encoders/decoders for video data, it is often necessary to extract part of the data from data whose field positions change dynamically, where we can use a field-data extractor. Particularly, an (M, N) field-data extractor reads out any consecutive N bytes from an M-byte register by connecting its input/output using multiplexers. However, the number of required multiplexers increases too much as the input/output byte lengths increase. How to reduce the number of its required multiplexers is a major challenge. In this paper, we propose an efficient multiplexer-tree configuration method for an (M, N) field-data extractor. Our method is based on inserting an (N + B - 1)-byte virtual intermediate-register into a multiplexer tree and partitioning it into an upper tree and a lower tree. Then our method theoretically reduces the number of required multiplexers without increasing the multiplexer-tree depth. We also propose how to determine the size of the virtual intermediate-register that minimizes the number of required multiplexers. Experimental results show that our method reduces the required number of gates to implement a field-data extractor by up to 92% compared with the one using a naive multiplexer-tree configuration.

CiNii
不揮発メモリを対象とした最大ハミング距離と最小ハミング距離を制約した符号による書き込み手法のエネルギー評価

古城辰朗, 多和田雅師, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 34 ) 1 - 6 2014年11月

デバイスの微細化によってメモリに保存されている値が破壊されるリスクが増大する．メモリの値を破壊から守る手法として誤り訂正符号を利用することが挙げられる．誤り訂正符号はメモリに書き込むビット数が多いため，書き込みエネルギーが大きくなるという欠点があり，書き込みビット数削減と誤り訂正を同時に実現する符号が必要とされる．我々は書き込みビット数削減と誤り訂正を同時に実現する符号としてドーナツ符号を提案した．また，符号拡張手法を提案し，ドーナツ符号に符号拡張手法を適用した拡張ドーナツ符号を提案した．拡張ドーナツ符号は，書き込みビット数を削減し，同時に誤り訂正を実現した符号である．本稿では，我々が提案した拡張ドーナツ符号について，エンコーダ・デコーダならびに不揮発メモリのエネルギーを評価する．評価実験の結果，拡張ドーナツ符号を用いたメモリはハミング符号を用いたメモリに比べて最大 32％のエネルギーが削減された．

Data stored in non-volatile memories may be destroyed due to crosstalk and radiation, but we can restore the data by using error-correcting codes. However, non-volatile memories consume a large amount of energy in writing. How to reduce writing bits even when using error-correcting codes is one of the challenges in non-volatile memory design. We have proposed a Doughnut code, which is a new bit-write-reducing and error-correcting code. In addition, we have proposed a code expansion method. When we apply our code expansion method to Doughnut code, we can obtain expanded Doughnut codes. Expanded Doughnut codes are error-correcting codes which can reduce the number of writing bits. In this paper, we demonstrate experimental evaluations from the viewpoint of energy reduction of our proposed expanded Doughnut codes. Experimental results show that the write-reducing code reduces energy consumption by up to 32% compared to Hamming code.

CiNii
HDR-mcvを対象とした複数クロックドメインおよび複数電源電圧による低電力化高位合成手法

阿部晋矢, 史又華, 宇佐美公良, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 40 ) 1 - 6 2014年11月

低電力かつ高速な LSI の設計へ向け，配線遅延を考慮しながら複数クロックドメイン，複数電源電圧を同時に適用可能な HDR-mcv および高位合成手法が提案された．従来手法はクロックおよび電圧をハドルと呼ぶ区画毎に割り当てるが，クロックツリー数の増加による消費エネルギーのオーバヘッドが無視できない．提案手法はクロックに同期する論理，および演算回路に対し独立に電圧を割り当てることで，クロックツリー数を増加せずに複数クロックドメインと複数電源電圧を同時適用する．計算機実験結果により，提案手法は従来の HDR-mcv アーキテクチャを対象とした高位合成アルゴリズムと比較し 50％程度消費エネルギーを削減し，最終的に従来のレジスタ分散型アーキテクチャと比較し提案手法は 60％程度消費エネルギーを削減できることを確認した．

An HDR-mcv architecture, which integrates multiple supply voltages and multiple clock domains into high-level synthesis and enables us to estimate interconnection delay effects during high-level synthesis, has been proposed with the corresponding synthesis algorithm. They assign voltages and clock frequencies to huddles, which are the partitions for interconnection delay estimation during high-level synthesis. However, the voltage and clock assignment may have some energy overheads due to the increased clock trees. In this paper, we propose a new HDR-mcv architecture in which supply voltages are assigned to functional logics and clock synchronization logics separately. Next, we propose a high-level synthesis algorithm for the architecture, which can assign clock frequencies and supply voltages on the basis of the placement and energy information. Experimental results show that the proposed method achieves 50% energy-saving compared with the conventional HDR-mcv architecture and 60% energy-saving compared with the existing high-level synthesis methods.

CiNii
不揮発メモリの書き込み削減手法のための小面積なエンコーダ/デコーダ回路構成

多和田雅師, 木村晋二, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 35 ) 1 - 6 2014年11月

不揮発メモリはリーク電力が非常に小さい，電源が落ちていても情報を保持できるといった性質から次世代メモリとして注目されている．一方で不揮発メモリには書き込みエネルギーが大きい，書き換え回数に上限があるという問題がある．書き込みエネルギーの削減とウェアレベリングを行う手法としてビットレベルでの書き込み削減手法が存在する．ハミング符号より生成した冗長符号を用いてメモリに保存する値を符号化して書き込む手法が提案されている．従来手法の回路構成では符号化のためのエンコーダ，デコーダの規模が大きくなる欠点がある．本稿では書き込み削減手法に適した符号構成を行うことでエンコーダ，デコーダの面積を小さくする手法を提案する．メモリに保存したいビットシーケンスをエンコードせずにエンコード後のベクトルとみなしても書き込みに必要な情報が得られる．メモリに保存されているベクトルを誤り訂正すると，デコードせずにシンドロームが元のビットシーケンスが持つ情報と一致する．その結果，小面積のエンコーダ，デコーダが構成できる．提案手法によりエンコーダとデコーダを設計した結果，従来手法と比較して面積が削減されることを確認する．

Non-volatile memory has many advantages such as low leakage power and non-volatility. However, there are problems that a non-volatile memory consumes a large amount of energy in writing and that the maximum number of bit re-writings is limited. We have proposed a Hamming-code based bit-write reduction method using data encoding/decoding, but its encoder/decoder becomes too large. In this paper, we propose a small-sized encoder/decoder circuit design for the bit-write reduction codes. In this design, we simplify data encoding/decoding by using code redundancy. Experimental results show the efficiency of our encoder/decoder design.

CiNii
回路面積を考慮したSuspicious Timing Error Prediction 回路の挿入位置決定手法の改良と評価

吉田慎之介, 史又華, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 3 ) 1 - 6 2014年11月

近年，半導体技術の進展に伴いタイミングエラー発生の危険性が増加している．STEP はタイミングエラーを事前に予測できる手法であるが，STEP 回路を挿入する位置が重要である．このような背景から、回路面積を考慮した STEP 回路の挿入位置決定手法を提案した．本手法では STEP 回路の個数を削減するために短いパスを無視するが，長いパスまで無視する可能性があった．また，短いパスに合わせて位置ラベルを付けるため，STEP 回路の挿入位置がパスの後半に偏る可能性があった．本稿では STEP 回路の挿入位置決定手法で用いる，短いパスの探索方法とラベル付けの方法を改良する．パスの長さを推定することで短いパスのみを無視できるため，これまで STEP 回路を挿入しなかった長いパスで発生するタイミングエラーが予測できる．また，任意の長さのパスに合わせたラベル付けもできるため，チェックポイントがパスの後半となることを防ぐ．改良した手法を複数の回路に対して適用し，最大動作周波数の向上を図る．実験結果より STEP 回路を入れない場合と比較して，最大動作周波数を平均 1.71 倍に向上させることができた．改良前の手法と比較すると，最大動作周波数を平均 1.15 倍に向上させることができた．

As process technologies advance, process and delay variation causes a complex timing design and in-situ timing error correction techniques are strongly required. Suspicious timing error prediction (STEP) predicts timing errors by monitoring checkpoints by STEP circuits (STEPCs) and how to insert checkpoints is very important. We have proposed a network-flow-based checkpoint insertion algorithm for STEP. However, our algorithm may ignore long paths and insert checkpoints near the output. In this paper, we improve how to ignore short paths and set labels by estimating path lengths. Then, we can ignore only short paths and insert checkpoints near the center of all long paths. We evaluate our algorithm by applying it to four benchmark circuits. Experimental results show that our proposed algorithm realizes an average of 1.71X overclocking compared with just inserting no STEPC. Furthermore, our improved algorithm realizes an average of 1.15X overclocking compared with our original algorithm.

CiNii
ハードウェアトロイに含まれるネットに着目したハードウェアトロイ検出手法

大屋優, 史又華, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 24 ) 1 - 6 2014年11月

近年チップの製造をサードパーティに外注するようになり，ハードウェアトロイが挿入される可能性が高まってきた．特に設計段階では簡単にハードウェアトロイを挿入することができる．ゲートレベルのネットリストに対してハードウェアトロイ検出手法を適用する場合，我々は Golden ネットリストを持っておらず，挿入されているハードウェアトロイを活性化するという条件下でハードウェアトロイを検出する手法が存在するのみである．本稿では，Golden ネットリストが無く，ハードウェアトロイを活性化させなくてもハードウェアトロイを検出する手法として，低スイッチング確率のネット（LSLG ネットと呼ぶ）の検出を通じてハードウェアトロイを検出する手法を提案する．LSLG ネットはネットリストに含まれるネットの数％であるにも関わらず，Trust-HUB の Abstraction Gate Level で公開されているハードウェアトロイが挿入されている全てのゲートレベルのネットリストに対して，ハードウェアトロイの一部を検出することに成功した．提案手法にかかる時間は高々十数分程度である．

Recently, digital ICs are designed by outside vendors to reduce design costs in the semiconductor industry. This circumstance introduces risks that malicious attackers implement Hardware Trojans (HTs) into ICs. HTs are easily inserted in particular during the design phase, but HT detection is too difficult during this phase. This is why we have to assume Golden netlists and activation of HTs in previous researches. This paper proposes an HT detection method through detecting LSLG nets, which have low switching probabilities. Our approach does not assume Golden netlists nor activation of HTs. We successfully find out that all HT-inserted gate-level netlists from Trust-HUB benchmarks include a small number of LSLG nets. It takes approximately ten minutes to detect LSLG nets in each benchmark.

CiNii
タイミングエラーへの耐性を持つフリップフロップ設計

鈴木大渡, 史又華, 戸川望, 宇佐美公良, 柳澤政生

研究報告システムとLSIの設計技術（SLDM） 2014 ( 1 ) 1 - 6 2014年11月

集積回路の微細化の影響により，回路のばらつきが大きくなっており，設計に必要な電源電圧やクロック周波数のマージンが増大している．マージンの緩和のため，タイミングエラーへの耐性を持つ回路の構造が盛んに研究されている．本稿では，フリップフロップの動作とラッチの動作を動的に切り替えることによりタイミングエラー耐性を実現する Time Borrowing Flip-Flop(TBFF) のトランジスタレベルの構造を 2 通り提案した．また，HSPICE シミュレーションによる評価を行い，従来手法と比較して消費エネルギーを最大 20.6%削減できることを示した．

Under the influence of the miniaturization of integrated circuits, the variation of the operating conditions of a circuit becomes larger, and the margins of the supply voltage and the clock frequency necessary for a design increase. To mitigate these margins, circuit structures with timing error tolerance are being actively studied. In this paper, we propose two new Time Borrowing Flip-Flops (TBFF) in transistor level to realize timing error tolerance by switching from flip-flop to latch dynamically. HSPICE simulation results show that the proposed TBFF can achieve up to 28.1% power reduction when compared with existing works.

CiNii
HDRアーキテクチャを対象とした製造ばらつき耐性と低レイテンシを両立可能なマルチシナリオ高位合成手法

井川昂輝, 阿部晋矢, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 15 ) 1 - 6 2014年11月

半導体プロセスの継続的な微細化により，製造ばらつきや配線遅延が LSI 設計に与える影響が増加している．これらに対し，製造ばらつきに応じて LSI 動作に複数のシナリオを想定し，しかも配線遅延を考慮した高位合成手法の構築が有力な解となる．本稿では，分散レジスタアーキテクチャモデルの 1 つとして HDR アーキテクチャを対象に，製造ばらつき耐性と低レイテンシを両立するマルチシナリオ高位合成手法を提案する．提案手法では使用するすべての演算器の遅延が Typical ケースの場合，Worst ケースの場合の 2 つのシナリオを想定し，これらのシナリオを同時に LSI 上に高位合成する．HDR アーキテクチャを前提にハドルによるモジュールの抽象化により，レイアウトに起因する問題の複雑度を軽減し，Typical シナリオと Worst シナリオで可能な限り共通化したスケジューリング／バインディングを実行することで 2 つのシナリオを同時に最適化する．計算機実験により，従来手法と比較し Typical シナリオのレイテンシを平均 33％，最大 39％削減できることを確認した．

In this paper, we propose a process-variation-tolerant and low-latency multi-scenario high-level synthesis algorithm for HDR architectures. We assume two scenarios, which are a typical-case scenario and a worst-case scenario, and realize them on a single chip. By using distributed-register architectures called HDR architectures, we can take into account interconnection delays in high-level synthesis. We first schedule/bind each of the scenarios independently. After that, we commonize a typical-case scenario and a worst-case scenario and synthesize a commonized scheduling/binding result. Experimental results show that our algorithm reduces the latency of typical-case scenario by up to 33% compared with previous methods.

CiNii
DTMOSを用いたサブスレッショルド回路の高速化設計

福留祐治, 史又華, 戸川望, 宇佐美公良, 柳澤政生

研究報告システムとLSIの設計技術（SLDM） 2014 ( 21 ) 1 - 5 2014年11月

サブスレッショルド領域で回路を動作させることで低電力化は実現されるが，同時に速度が劣化するトレードオフの関係にある．本稿ではサブスレッショルド領域において低電力で高速化を実現するため，DTMOS を用いたサブスレッショルド回路の高速化設計を行い，トランジスタレベルのシミュレーションの結果，30～45％高速化し，Vdd＝0.2Ｖ, 0.3Ｖにおいて平均 15％低エネルギー化したことを示す．

Low power consumption is achieved by operating circuits in the sub-threshold region. However, in the sub-threshold region, the operating speed becomes slow, and the tradeoff between power and speed should be considered carefully. In this work, we present DTMOS implementations to realize high speed and low power in the sub-threshold region. Transistor level simulation results show that the operating speed can be improved by 30%-45%, and on average 15% energy reduction can be achieved when Vdd ranges 0.2-0.3V.

CiNii
FPGAの配線遅延特性を利用したフロアプラン指向高位合成手法

藤原晃一, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 14 ) 1 - 6 2014年11月

近年，画像処理や通信プロトコル処理などデータを高速処理する必要がある場面で，高位合成を利用した FPGA 設計が増加している．しかし，LSI プロセスの微細化に伴って配線遅延のボトルネックが深刻化しており，FPGA においても例外では無い．また，FPGA ではマルチプレクサ（MUX）が回路の遅延・面積において大きなボトルネックである．高位合成を利用した FPGA 設計では，高位合成段階で配線遅延の考慮と MUX の削減を同時に実現することが強く求められる．FPGA は種類によって配線遅延特性が異なるため，配線遅延を見積もる際には FPGA の配線遅延特性を考慮する必要がある．本稿では，高位合成段階で MUX を削減・制限した上で，FPGA の配線遅延特性を考慮したフロアプラン指向高位合成手法を提案する．提案手法はバインディングにおいて MUX の削減・制限を行い，FPGA におけるマルチプレクサのボトルネックを解決する．また，レジスタ分散型アーキテクチャの 1 つである HDR アーキテクチャを用いて，高位合成段階でモジュールの配置を行う．フロアプランの際に，FPGA での配線遅延特性を考慮した配線遅延距離を用いることで，適切に FPGA での配線遅延を見積もると共に，クリティカルパス遅延の小さいフロアプラン結果を実現する．提案手法は，従来手法と比較して配線遅延特性の顕著な FPGA において，スライス数を同程度にした上でレイテンシーを最大 6％，平均 3％削減した．

Recently, high-level synthesis (HLS) techniques for FPGA designs are required in applications such as image processing and computerized stock trading. With recent process scaling in FPGAs, interconnection delays become dominant in total circuit delays even though I/O buffers and wire buffers are provided, and each FPGA has different interconnection delay characteristics. We need to consider interconnection delays based on interconnection delay characteristics in FPGA designs. In this paper, we propose a floorplan-aware high-level synthesis algorithm utilizing interconnection delay characteristics targeting FPGA designs. Our target architecture is HDR, one of distributed-register architectures, and then we can estimate interconnection delays correctly by utilizing interconnection delay characteristics in an FPGA chip. Further, we reduce multiplexers generated and also limit the total number of inputs to multiplexers in HLS process. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the latency by up to 6% compared with our previous approach.

CiNii
遅延ばらつき許容量を最適化するRDRアーキテクチャ向け高位合成手法

萩尾勇太, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 41 ) 1 - 6 2014年11月

本稿では，遅延ばらつきの許容量を調整でき，なおかつレイテンシが増大しない範囲内で遅延ばらつき許容量を最大化する RDR アーキテクチャ向け高位合成手法を提案する．遅延ばらつきによるタイミング違反が発生しない場合と発生した場合の 2 通りのスケジューリング，バインディングを想定し，チップ製造後に発生した遅延ばらつきに応じて動作を選択する．入力としてばらつき率を与えることで，ばらつきの許容量の目標値を設定できる．ばらつき率を変化させながら複数回スケジューリング/バインディングを行うことで，レイテンシが増大しない範囲内で遅延ばらつき耐性を最大化するスケジューリング/バインディング解を求める．また，RDRアーキテクチャの空き領域を利用しここに演算器を追加することで，遅延ばらつきによるタイミング違反が発生した場合でも実行時間の最小化を図る．さらに，2 通りのスケジューリング，バインディング結果に類似化という考えを導入することでチップ面積を最小化する．計算機実験により，提案手法は従来手法と比較して遅延ばらつき発生時の実行時間を最大16.7％削減，遅延ばらつき耐性を最大24％向上させることを確認した．

In this paper, we propose a high-level synthesis algorithm with delay variation tolerance optimization for RDR architectures. We first obtain a non-delayed scheduling/binding result and a delayed scheduling/binding result independently. When we obtain two scheduling/binding results, we use two variation rates, the typical variation rate and the worst variation rate, and maximize them without increasing the latency. By adding several extra functional units to vacant RDR islands, we have a delayed scheduling/binding result so that its latency cannot be increased compared with the non-delayed one. After that, we similarize the two scheduling/binding results by repeatedly modifying their results. We can finally realize non-delayed and delayed scheduling/binding results simultaneously on RDR architecture with almost no area/performance overheads and we can select either one of them depending on post-silicon delay variation. Experimental results show that our algorithm successfully reduces delayed scheduling/binding latency by up to 16.7% compared with the conventional approach.

CiNii
タイミングエラー予測回路による再構成可能デバイス上でのデータ依存最適化回路設計

川村一志, 阿部晋矢, 史又華, 柳澤政生, 戸川望

研究報告システムとLSIの設計技術（SLDM） 2014 ( 2 ) 1 - 6 2014年11月

LSI 内部の各パス遅延は入力データに応じて様々に変動する．この性質を利用することで，計算精度をわずかに落としながらも高速に動作する LSI の設計が可能になる．本稿では，入力データ群にもとづき特定された最適化すべきパスをリコンフィギュレーションし最適化する，新たな回路設計アルゴリズムを提案する．提案アルゴリズムは最適化対象の回路にタイミングエラー予測回路を挿入し動作させることで被最適化パスを特定，動的に再構成し与えられたエラー制約内で動作クロック周期の最小化を図る．本アルゴリズムを加算器に対して適用した結果，通常のクリティカルパス最小化の設計と比較し，2.1 ％以下のエラーを許容する制約下で最大 18.5％の高速化に成功した．The propagation delay along each path inside an LSI widely varies depending on input data, and this property can be exploited to design high-performance approximation circuit with a negligible error rate. In this paper, we propose a novel approximation circuit design algorithm, which identifies paths to be optimized based on input data and reconfigures these paths. Our algorithm first identifies the optimized paths by incorporating timing error prediction circuits into a target circuit and running them in practice. These paths are then dynamically reconfigured within an accuracy constraint with the objective of maximizing its performance. Experimental results targeting a set of basic adders show that our algorithm can achieve performance increase by up to 18.5% within acceptable error of 2.1% compared with conventional design techniques.
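
The data dependence that this abstract relies on can be seen in a plain ripple-carry adder, where the length of the active carry chain changes with the operands. The Python sketch below is an illustration only and is not the algorithm proposed in the paper; the function name `longest_carry_chain` and its `width` parameter are assumptions introduced here.

```python
def longest_carry_chain(a: int, b: int, width: int) -> int:
    """Longest run of bit positions that one carry travels through when
    a and b are added by a ripple-carry adder with carry-in 0."""
    longest = 0
    chain = 0   # length of the carry chain ending at the current bit
    carry = 0   # carry entering the current bit position
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        if a_i & b_i:                  # generate: a new carry starts here
            chain, carry = 1, 1
        elif (a_i ^ b_i) and carry:    # propagate: the incoming carry moves on
            chain += 1
        else:                          # the carry is absorbed, or none exists
            chain, carry = 0, 0
        longest = max(longest, chain)
    return longest


# Same adder, different operands, different active path length.
worst = longest_carry_chain(0b1111, 0b0001, 4)   # carry ripples through all bits
quiet = longest_carry_chain(0b0101, 0b0010, 4)   # no carry is ever generated
```

Inputs that only exercise short chains settle early, which is the kind of slack an approximate design can trade against a small error rate.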

CiNii
迷いにくい可視ランドマークに基づく屋外歩行者ナビゲーションシステム

竹田健吾, 柳澤政生, 戸川望, 新田知之, 進藤大介, 田中清貴

組込みシステムシンポジウム2014論文集 2014 102 - 107 2014年10月

CiNii
曲線道路を滑らかに接続する道路ネットワーク整形手法

折原照崇, 柳澤政生, 戸川望, 新田知之, 進藤大介, 田中清貴

組込みシステムシンポジウム2014論文集 2014 69 - 74 2014年10月

CiNii
歩行通路形状を考慮した可視グラフに基づく屋内環境ナビゲーションシステム

町田理, 柳澤政生, 戸川望, 新田知之, 進藤大介, 田中清貴

組込みシステムシンポジウム2014論文集 2014 96 - 101 2014年10月

CiNii
可変パイプラインのローカルなパルス生成による低消費エネルギー化手法 (VLSI設計技術)

新井孝将, 史又華, 戸川望, 宇佐美公良, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 231 ) 7 - 12 2014年10月

モバイル端末において性能向上による消費エネルギーの増加が問題となっており,様々な低消費エネルギー化手法が提案されている.その一つである可変パイプライン段数(Variable Stages Pipeline:VSP)では,LDS-cell(Latch D-FF Selector cell)という特殊なセルを用いてグリッチを緩和することができる.しかし,クロックがLowのときに発生するグリッチに対しては緩和できないという問題があった.本稿では既存の可変パイプライン段数手法に対し,LE(Low Energy)モード時にクロックゲーティングを適用し,ローカルなパルス生成によりデータパス上のグリッチを更に抑制し,消費エネルギーを削減する手法を提案する.実際に乗算器に提案手法を実装し,従来のVSPと比較して3.08%消費エネルギーを削減することができた.

CiNii
可変パイプラインのローカルなパルス生成による低消費エネルギー化手法 (画像工学)

新井孝将, 史又華, 戸川望, 宇佐美公良, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 233 ) 7 - 12 2014年10月

モバイル端末において性能向上による消費エネルギーの増加が問題となっており,様々な低消費エネルギー化手法が提案されている.その一つである可変パイプライン段数(Variable Stages Pipeline:VSP)では,LDS-cell(Latch D-FF Selector cell)という特殊なセルを用いてグリッチを緩和することができる.しかし,クロックがLowのときに発生するグリッチに対しては緩和できないという問題があった.本稿では既存の可変パイプライン段数手法に対し,LE(Low Energy)モード時にクロックゲーティングを適用し,ローカルなパルス生成によりデータパス上のグリッチを更に抑制し,消費エネルギーを削減する手法を提案する.実際に乗算器に提案手法を実装し,従来のVSPと比較して3.08%消費エネルギーを削減することができた.

CiNii
可変パイプラインのローカルなパルス生成による低消費エネルギー化手法 (集積回路)

新井孝将, 史又華, 戸川望, 宇佐美公良, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 232 ) 7 - 12 2014年10月

モバイル端末において性能向上による消費エネルギーの増加が問題となっており,様々な低消費エネルギー化手法が提案されている.その一つである可変パイプライン段数(Variable Stages Pipeline:VSP)では,LDS-cell(Latch D-FF Selector cell)という特殊なセルを用いてグリッチを緩和することができる.しかし,クロックがLowのときに発生するグリッチに対しては緩和できないという問題があった.本稿では既存の可変パイプライン段数手法に対し,LE(Low Energy)モード時にクロックゲーティングを適用し,ローカルなパルス生成によりデータパス上のグリッチを更に抑制し,消費エネルギーを削減する手法を提案する.実際に乗算器に提案手法を実装し,従来のVSPと比較して3.08%消費エネルギーを削減することができた.

CiNii
可変パイプラインのローカルなパルス生成による低消費エネルギー化手法

新井孝将, 史又華, 戸川望, 宇佐美公良, 柳澤政生

研究報告システムとLSIの設計技術（SLDM） 2014 ( 2 ) 1 - 6 2014年09月

モバイル端末において性能向上による消費エネルギーの増加が問題となっており，様々な低消費エネルギー化手法が提案されている．その一つである可変パイプライン段数 (Variable Stages Pipeline:VSP) では，LDS-cell (Latch D-FF Selector cell) という特殊なセルを用いてグリッチを緩和することができる．しかし，クロックが Low のときに発生するグリッチに対しては緩和できないという問題があった．本稿では既存の可変パイプライン段数手法に対し，LE(Low Energy) モード時にクロックゲーティングを適用し，ローカルなパルス生成によりデータパス上のグリッチを更に抑制し，消費エネルギーを削減する手法を提案する．実際に乗算器に提案手法を実装し，従来の VSP と比較して 3.08％消費エネルギーを削減することができた．The increase of energy consumption due to improved performance has become a problem in the mobile terminal, and various low energy design techniques have been proposed. Variable Stages Pipeline(VSP) technique is one of them, which can reduce glitches by using a special LDS-cell(Latch D-FF selector-cell). However, glitches that occur during the low clock phase will still be propagated to next stages. In this paper, we propose a method for variable stages pipeline designs by applying local pulse generation and clock gating in LE mode for further energy reduction. We implemented the proposed method to a multiplier and experimental results show that the energy is reduced by 3.08% when compared to conventional VSP.

CiNii
A-3-12 フロアプラン統合化アーキテクチャを対象とした低面積指向フォールトセキュア高位合成(A-3.VLSI設計技術,一般セッション)

川村一志, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2014 56 - 56 2014年09月

CiNii
A-3-13 ばらつき耐性を最大化するRDRアーキテクチャ向け高位合成手法(A-3.VLSI設計技術,一般セッション)

萩尾勇太, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2014 57 - 57 2014年09月

CiNii
A-17-10 可視グラフに基づく屋内環境ナビゲーションシステムにおける出発地点・目的地点の決定手法(A-17.ITS,一般セッション)

町田理, 柳澤政生, 戸川望, 新田知之, 進藤大介, 田中清貴

電子情報通信学会ソサイエティ大会講演論文集 2014 119 - 119 2014年09月

CiNii
A-17-11 可視グラフに基づくセンサとBluetooth Beaconを用いたモバイル端末向け屋内現在位置測位(A-17.ITS,一般セッション)

藤田博, 柳澤政生, 戸川望, 新田知之, 進藤大介, 田中清貴

電子情報通信学会ソサイエティ大会講演論文集 2014 120 - 120 2014年09月

CiNii
A-17-12 特徴点抽出と3次ベジェ曲線によるリンク列曲線化アルゴリズムの検討(A-17.ITS,一般セッション)

折原照崇, 柳澤政生, 戸川望, 新田知之, 進藤大介, 田中清貴

電子情報通信学会ソサイエティ大会講演論文集 2014 121 - 121 2014年09月

CiNii
ネットワーク型マルチFPGAシステムを対象としたタスク割り当て手法

片野弘規, 戸川望, 青木孝, 関原悠介, 中西衛

DAシンポジウム2014論文集 2014 21 - 26 2014年08月

CiNii
遅延ばらつき許容量を調整できるRDRアーキテクチャ向け高位合成手法

萩尾勇太, 柳澤政生, 戸川望

DAシンポジウム2014論文集 2014 55 - 60 2014年08月

CiNii
フロアプランを考慮したマルチプレクサ削減FPGA高位合成手法

藤原晃一, 阿部晋矢, 川村一志, 柳澤政生, 戸川望

DAシンポジウム2014論文集 2014 109 - 114 2014年08月

CiNii
セレクタ論理を適用したバイリニア補間演算器を用いた画像拡大縮小回路のFPGA実装と評価

五十嵐啓太, 柳澤政生, 戸川望

DAシンポジウム2014論文集 2014 169 - 174 2014年08月

CiNii
Suspicious Timing Error Prediction を用いた回路全体の遅延ばらつきに対するロバスト設計

吉田慎之介, 史又華, 柳澤政生, 戸川望

DAシンポジウム2014論文集 2014 61 - 66 2014年08月

CiNii
多段演算チェイニングを利用した配線遅延を考慮した高位合成手法

寺田晃太朗, 柳澤政生, 戸川望

DAシンポジウム2014論文集 2014 115 - 120 2014年08月

CiNii
4-to-1セレクタ論理及び2-to-1セレクタ論理を利用したバイリニア補間演算器の設計と評価

塩雅史, 柳澤政生, 戸川望

DAシンポジウム2014論文集 2014 163 - 168 2014年08月

画像の補間処理は，ディジタル画像処理における基本技術の 1 つである．その応用技術としては，画像の解像度変換，フレームレート変換などがあげられる．これらの技術は近年の携帯電話やタブレット PC などの表示デバイスの小型化により効率化が求められている．バイリニア補間は補間演算の 1 つであり，周囲 4 つの値から線形的に値を補間する．画像の拡大・縮小など実用的に用いられることも多い．本稿では，バイリニア補間演算をビットレベル式変形し 2-to-1 セレクタ及び 4-to-1 セレクタに帰着させることで，桁上げ伝搬遅延を削減し高速化したセレクタ論理帰着型バイリニア補間演算器を提案する．セレクタ論理帰着型バイリニア補間演算器において，2-to-1 セレクタのみに帰着する方法，4-to-1 セレクタへの帰着を併用する方法など，複数の方法で実装し評価・比較した． Image interpolation is one of the basic techniques in digital image processing, and in recent years it has often been used in mobile phones and tablet PCs. Bi-linear interpolation is one of the interpolation operations and interpolates a value linearly from the four surrounding values. In this paper, we propose a high-speed bi-linear interpolation circuit that reduces carry propagation delay by using 2-to-1 and 4-to-1 selector logics. Our bi-linear interpolation circuit is designed by using (1) only 2-to-1 selector logics and (2) a combination of 2-to-1 and 4-to-1 selector logics. We have implemented our bi-linear circuits and evaluated them. Experiments demonstrate their effectiveness and efficiency.
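
One way to see how bi-linear interpolation can be reduced to 4-to-1 selections is the bit-level identity sketched below in Python. It is a minimal sketch under stated assumptions, not the circuit from the paper: the weights `x` and `y` are taken to be `n`-bit unsigned fixed-point fractions, the result is left scaled by `2**(2*n)`, and all function names are hypothetical.

```python
def bilinear_reference(p00, p10, p01, p11, x, y, n):
    """Direct bi-linear interpolation with n-bit fractions x, y.
    The result is scaled by 2**(2*n), so everything stays an integer."""
    one = 1 << n
    return ((one - x) * (one - y) * p00 + x * (one - y) * p10
            + (one - x) * y * p01 + x * y * p11)


def bit_terms(v, n):
    """(weight, bit) pairs for v, plus one correction pair (1, 0).
    With them, v = sum(w * b) and 2**n - v = sum(w * (1 - b))."""
    terms = [(1 << k, (v >> k) & 1) for k in range(n)]
    terms.append((1, 0))
    return terms


def bilinear_selector(p00, p10, p01, p11, x, y, n):
    """Same value, built only from 4-to-1 selections and shifted additions."""
    select = {(0, 0): p00, (1, 0): p10, (0, 1): p01, (1, 1): p11}
    acc = 0
    for wx, bx in bit_terms(x, n):
        for wy, by in bit_terms(y, n):
            # One 4-to-1 selector per bit pair replaces four multiplications.
            acc += wx * wy * select[(bx, by)]
    return acc
```

Each partial product is one of the four pixel values shifted by a power of two, so no multiplier is needed; how the partial products are then summed is a separate design choice.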

CiNii
演算チェイニング候補列挙に基づく配線遅延を考慮した高位合成手法

寺田晃太朗, 柳澤政生, 戸川望

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 27 440 - 445 2014年08月

CiNii
フロアプランを考慮したマルチプレクサ入力数制限FPGA向け高位合成手法 (スマートインフォメディアシステム)

藤原晃一, 阿部晋矢, 川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 126 ) 219 - 224 2014年07月

近年,証券取引など短期間で回路の修正・仕様変更が必要となる場面で,高位合成を利用したFPGA設計が増加している.高位合成を用いたFPGA設計では,モジュールの配置とマルチプレクサのコストを考慮する必要がある.本稿では,フロアプランを考慮した上で,マルチプレクサの入力数を制限するFPGA向け高位合成手法を提案する.提案手法は,モジュールの配置とマルチプレクサの入力数を同時に考慮した新しいFPGA向け高位合成手法である.提案手法では,レジスタ分散型アーキテクチャの1つであるHDRアーキテクチャを用いて,高位合成段階でモジュールの配置を行い,配線遅延を見積もる.またレジスタバインディングではマルチプレクサの入力数の制限を実現する.その結果,スライス数・遅延の削減を図る.提案手法を計算機上に実装し,従来手法と比較した結果,クリティカルパス遅延を同程度にした上でスライス数を最大33%,平均13%削減した.

CiNii
フロアプランを考慮したマルチプレクサ入力数制限FPGA向け高位合成手法 (システム数理と応用)

藤原晃一, 阿部晋矢, 川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 125 ) 219 - 224 2014年07月

近年,証券取引など短期間で回路の修正・仕様変更が必要となる場面で,高位合成を利用したFPGA設計が増加している.高位合成を用いたFPGA設計では,モジュールの配置とマルチプレクサのコストを考慮する必要がある.本稿では,フロアプランを考慮した上で,マルチプレクサの入力数を制限するFPGA向け高位合成手法を提案する.提案手法は,モジュールの配置とマルチプレクサの入力数を同時に考慮した新しいFPGA向け高位合成手法である.提案手法では,レジスタ分散型アーキテクチャの1つであるHDRアーキテクチャを用いて,高位合成段階でモジュールの配置を行い,配線遅延を見積もる.またレジスタバインディングではマルチプレクサの入力数の制限を実現する.その結果,スライス数・遅延の削減を図る.提案手法を計算機上に実装し,従来手法と比較した結果,クリティカルパス遅延を同程度にした上でスライス数を最大33%,平均13%削減した.

CiNii
フロアプランを考慮したマルチプレクサ入力数制限FPGA向け高位合成手法 (回路とシステム)

藤原晃一, 阿部晋矢, 川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 122 ) 219 - 224 2014年07月

近年,証券取引など短期間で回路の修正・仕様変更が必要となる場面で,高位合成を利用したFPGA設計が増加している.高位合成を用いたFPGA設計では,モジュールの配置とマルチプレクサのコストを考慮する必要がある.本稿では,フロアプランを考慮した上で,マルチプレクサの入力数を制限するFPGA向け高位合成手法を提案する.提案手法は,モジュールの配置とマルチプレクサの入力数を同時に考慮した新しいFPGA向け高位合成手法である.提案手法では,レジスタ分散型アーキテクチャの1つであるHDRアーキテクチャを用いて,高位合成段階でモジュールの配置を行い,配線遅延を見積もる.またレジスタバインディングではマルチプレクサの入力数の制限を実現する.その結果,スライス数・遅延の削減を図る.提案手法を計算機上に実装し,従来手法と比較した結果,クリティカルパス遅延を同程度にした上でスライス数を最大33%,平均13%削減した.

CiNii
フロアプランを考慮したマルチプレクサ入力数制限FPGA向け高位合成手法 (VLSI設計技術)

藤原晃一, 阿部晋矢, 川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 123 ) 219 - 224 2014年07月

近年,証券取引など短期間で回路の修正・仕様変更が必要となる場面で,高位合成を利用したFPGA設計が増加している.高位合成を用いたFPGA設計では,モジュールの配置とマルチプレクサのコストを考慮する必要がある.本稿では,フロアプランを考慮した上で,マルチプレクサの入力数を制限するFPGA向け高位合成手法を提案する.提案手法は,モジュールの配置とマルチプレクサの入力数を同時に考慮した新しいFPGA向け高位合成手法である.提案手法では,レジスタ分散型アーキテクチャの1つであるHDRアーキテクチャを用いて,高位合成段階でモジュールの配置を行い,配線遅延を見積もる.またレジスタバインディングではマルチプレクサの入力数の制限を実現する.その結果,スライス数・遅延の削減を図る.提案手法を計算機上に実装し,従来手法と比較した結果,クリティカルパス遅延を同程度にした上でスライス数を最大33%,平均13%削減した.

CiNii
スキャンチェイン長に依存しないLED暗号に対するスキャンベース攻撃 (VLSI設計技術)

藤代美佳, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 454 ) 31 - 36 2014年03月

スマートカード等において需要が高い軽量ブロック暗号にLED暗号があり,LED暗号へのスキャンベース攻撃が報告されている.しかし,この手法ではスキャンチェイン長が3万ビット前後の場合,秘密鍵を正しく解読できるとは限らない.本稿では,スキャンチェイン長に依存しないLED暗号へのスキャンベース攻撃手法を提案する.計算機実験では,暗号回路のみをスキャンチェインに含む場合,提案手法を用いて平均36個の平文で64ビットの秘密鍵を復元可能と確認した.スキャンチェインに暗号回路以外の回路が接続されていることを想定して,スキャンデータ上にランダムなビット値を付加してスキャンチェイン長を13万ビットまで変化させた場合にも,秘密鍵が解読できることを確認した.

CiNii
故障解析に耐性を持つラッチを利用したAES暗号回路 (VLSI設計技術)

史又華, 谷口寛彰, 戸川望, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 454 ) 37 - 42 2014年03月

暗号技術は複雑な数学的理論を安全性の根拠としているため安全性が高いとされている.しかし近年,暗号アルゴリズムに対してではなく,暗号回路そのものに攻撃を仕掛ける故障解析が脅威となってきている.故障解析にはレーザーや異常電圧やクロックグリッチが使用されるが,攻撃の容易さからクロックグリッチによる攻撃が注目されている.クロックグリッチによる故障を検出するためにはラウンド間での故障を検出する必要があるが,そのための実装方法として,回路を三重化して比較する空間冗長化や,同じ処理を2回行って比較する時間冗長化が存在する.前者は3倍以上の面積オーバーヘッドが存在し,後者は最大で2倍の時間オーバーヘッドが存在するという問題点がある.本稿ではラッチを用いたAES暗号回路を提案する.提案手法では小面積,高速でクロックグリッチによる故障解析に耐性を持たせることを可能にした.提案手法は,攻撃者にとって意味があるクロックグリッチにおいて,データレジスタにおける故障の検出率を100%とするとともに,データレジスタに一度故障が起きた場合でも最終的な暗号処理結果を100%正しくすることを可能にした.

CiNii
改良ランダムオーダースキャンによるセキュアスキャン設計とその評価 (VLSI設計技術)

大屋優, 跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 454 ) 43 - 48 2014年03月

大規模集積回路のテスト容易化設計の1つであるスキャンチェインを利用したスキャンテストが一般的に行われる.反面スキャンチェインを利用して,暗号回路の秘密鍵が解読されるなどのスキャンベース攻撃が問題となっている.スキャンチェインをスキャンベース攻撃から保護するために,セキュアスキャンアーキテクチャの必要性が高まってきた.セキュアスキャンアーキテクチャは,テスト性を保証すると共にスキャンベース攻撃からスキャンチェインを保護する必要がある.本稿では,セキュアスキャンアーキテクチャとして改良ランダムオーダースキャンを提案する.改良ランダムオーダースキャンは,ランダムオーダースキャンを改良したものであり,スキャンチェインの構造を動的に変化させ,スキャンベース攻撃からスキャンチェインを保護する.スキャンチェインを複数のサブチェインに分割し,イネーブル信号でスキャンアウトさせるサブチェインを次々と選択することで,スキャンチェインの構造が動的に変化する.改良ランダムオーダースキャンの安全性とテスト性を議論し,また計算機実験により面積オーバーヘッドが小さいセキュアスキャンアーキテクチャであることを示す.

CiNii
クラスタリングによる高速なリソグラフィ照明形状最適化 (VLSI設計技術)

多和田雅師, 柳澤政生, 戸川望, 橋本隆希, 坂主圭史, 野嶋茂樹, 小谷敏也

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 454 ) 105 - 110 2014年03月

リソグラフィでは光をフォトマスクに通してウェハ上にパターンを生成する.近年の微細化により生成されるパターンはフォトマスクと異なってしまい,フォトマスクや照明形状を最適化する処理が必要となる.フォトマスク最適化に対し照明形状最適化は高速化に関して研究が少ない.提案手法ではクラスタリングを用いて照明を表現するパラメータの総数を性能が劣化することなく削減する.最適化に関係するパラメータを削減し高速に解を求める.

CiNii
サブスレッショルド回路における遅延・エネルギーの温度依存性に関する実験および考察 (VLSI設計技術)

櫛田浩樹, 史又華, 戸川望, 宇佐美公良, 柳澤政生

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 454 ) 147 - 151 2014年03月

バッテリー式の無線ネットワーク機器では,消費エネルギーの削減を重視するため,供給電圧を下げる設計手法が広く用いられる.サブスレッショルド回路においては,しきい値下の電圧で制御することで大幅なエネルギー削減を達成できるが,性能の低下や環境変動による遅延ばらつきの問題が生じる.本稿ではスーパーパイプラインを用いたサブスレッショルド乗算器を実装し,性能向上とリークエネルギー削減による全体の消費エネルギー削減を確認した.さらに,温度変動による回路の遅延・エネルギーの温度依存性について実験し,考察を行った.

CiNii
局所性を考慮したマルチFPGAシステム向けタスクマッピング手法 (VLSI設計技術)

片野弘規, 李昇周, 戸川望, 青木孝, 関原悠介, 中西衛

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 416 ) 143 - 148 2014年01月

ネットワーク状に構成されたマルチFPGAシステムが提案されている.これは複数枚のボードから構成され,各ボードには5つのFPGAチップとボード間通信用ルータFPGAチップがリング状に接続されている.アプリケーションや状況に応じてボードを追加することができ,柔軟性と拡張性に優れた特徴を持っている.このようなマルチFPGAシステムでは,タスクグラフを構成する各タスクのマルチFPGAシステムへのマッピングが,マルチFPGAシステム上で実行されるアプリケーションの性能とコストを決定づける重要な要素となる.本研究ではマルチFPGAシステムを対象としたタスクマッピング手法を提案する.提案するタスクマッピング手法では,各FPGAチップが持つ局所性を考慮して,タスクグラフ中で通信量が最大となるタスクを中心として周囲を探索し,これらのタスクはできるだけ同一あるいは隣接したFPGAチップに割り当てていくものである.また提案したタスクマッピング手法の検証と評価を行った.

CiNii
局所性を考慮したマルチFPGAシステム向けタスクマッピング手法

片野弘規, 李昇周, 戸川望, 青木孝, 関原悠介, 中西衛

研究報告システムLSI設計技術（SLDM） 2014 ( 26 ) 1 - 6 2014年01月

ネットワーク状に構成されたマルチ FPGA システムが提案されている．これは複数枚のボードから構成され，各ボードには 5 つの FPGA チップとボード間通信用ルータ FPGA チップがリング状に接続されている．アプリケーションや状況に応じてボードを追加することができ，柔軟性と拡張性に優れた特徴を持っている．このようなマルチ FPGA システムでは，タスクグラフを構成する各タスクのマルチ FPGA システムへのマッピングが，マルチ FPGA システム上で実行されるアプリケーションの性能とコストを決定づける重要な要素となる．本研究ではマルチ FPGA システムを対象としたタスクマッピング手法を提案する．提案するタスクマッピング手法では，各 FPGA チップが持つ局所性を考慮して，タスクグラフ中で通信量が最大となるタスクを中心として周囲を探索し，これらのタスクはできるだけ同一あるいは隣接した FPGA チップに割り当てていくものである．また提案したタスクマッピング手法の検証と評価を行った． Recently, a scalable and reconfigurable multi-FPGA system has been proposed which consists of two or more boards, each of which consists of one router FPGA chip and five general-purpose FPGA chips. The five general-purpose FPGA chips are connected to form a ring and the router FPGA chip performs inter-board communications. How to map a task graph onto such a multi-FPGA system is one of the challenging problems. In this paper, we propose a task mapping algorithm for a multi-FPGA system. Since the multi-FPGA system has a hierarchical structure, we have to find locality in a given task graph. In our proposed algorithm, we focus on the communication rate between tasks and try to assign tasks with heavy communication between them to the same FPGA chip one by one. Experimental results demonstrate the effectiveness of our proposed algorithm.
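
The idea of placing heavily communicating tasks on the same or a neighbouring FPGA chip can be sketched as a simple greedy procedure. The Python code below is a hedged sketch and not the published algorithm: it models a single board as a ring of `num_chips` chips, and the per-chip `capacity` (counted in tasks), the tie-breaking rules, and all function names are assumptions made for illustration.

```python
def ring_distance(a, b, num_chips):
    """Hop count between chips a and b on a ring."""
    d = abs(a - b)
    return min(d, num_chips - d)


def place(task, preferred, assignment, load, num_chips, capacity):
    """Put task on the free chip nearest to `preferred` on the ring."""
    free = [c for c in range(num_chips) if load[c] < capacity]
    if not free:
        raise ValueError("not enough capacity for all tasks")
    chip = min(free, key=lambda c: (ring_distance(c, preferred, num_chips), load[c], c))
    assignment[task] = chip
    load[chip] += 1


def greedy_map(tasks, comm, num_chips=5, capacity=2):
    """comm maps (task_u, task_v) to a communication volume."""
    assignment, load = {}, [0] * num_chips

    def least_loaded():
        return min(range(num_chips), key=lambda c: (load[c], c))

    # Visit edges from the heaviest to the lightest communication volume.
    for (u, v), _ in sorted(comm.items(), key=lambda kv: -kv[1]):
        for task, partner in ((u, v), (v, u)):
            if task in assignment:
                continue
            # Stay close to an already placed partner, else open a quiet chip.
            preferred = assignment[partner] if partner in assignment else least_loaded()
            place(task, preferred, assignment, load, num_chips, capacity)
    for task in tasks:  # tasks without any communication edge
        if task not in assignment:
            place(task, least_loaded(), assignment, load, num_chips, capacity)
    return assignment


def communication_cost(assignment, comm, num_chips=5):
    """Volume-weighted ring distance, a simple proxy for mapping quality."""
    return sum(vol * ring_distance(assignment[u], assignment[v], num_chips)
               for (u, v), vol in comm.items())
```

For example, with `comm = {("a", "b"): 10, ("c", "d"): 8, ("b", "c"): 1}` the two heavy pairs share a chip each and only the light edge crosses between neighbouring chips.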

CiNii
信頼性と時間オーバーヘッド間のトレードオフを考慮した面積制約にもとづくRDRアーキテクチャ向けフォールトセキュア高位合成手法 (ディペンダブルコンピューティングデザインガイア2013 : VLSI設計の新しい大地)

川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 321 ) 129 - 134 2013年11月

半導体の微細化技術の進展に伴い,ソフトエラーに起因する信頼性の低下,及び配線遅延の相対的増大が問題となっている.信頼性の低下を克服する手法のひとつに並行誤り検出を用いたフォールトセキュア設計手法があり,演算処理の部分的な二重化を考えることで信頼性とオーバーヘッドのトレードオフを考慮した設計が可能となる.本稿では,小さいオーバーヘッドで大きな信頼性向上が得られるよう高位合成段階での適用を前提とし,設計手法を提案する.提案手法のポイントは三点あり,第一にRDRアーキテクチャを対象とすることで高位合成段階で配線遅延を考慮できるようにする.第二に面積制約を通常計算用に用意したRDRアーキテクチャとすることで面積オーバーヘッドなくフォールトセキュア設計を実現する.第三に与えられた時間制約のもとで信頼性の最大化を目指す.提案手法を計算機上に実装し,従来手法と比較した結果,時間及び面積オーバーヘッドなく最大44%の信頼性向上を達成した.さらに,面積オーバーヘッドの増大を許容しなくとも50%程度の時間オーバーヘッドを許容することで演算処理の完全な二重化が実現可能であることを示した.

CiNii
チェックポイント観測によるタイミングエラー予測手法 (ディペンダブルコンピューティングデザインガイア2013 : VLSI設計の新しい大地)

五十嵐博昭, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 321 ) 39 - 44 2013年11月

プロセス技術の微細化によりLSIのタイミング設計が難しくなっており,タイミングエラー対策手法の重要性が高まっている.既存のタイミングエラー検出手法はエラー訂正に再実行が必要であったり,複雑な構造を持つためタイミング設計が難しい.我々はより訂正コストが小さく簡単な構造を持つタイミングエラー対策手法としてSTEPを提案している.STEPではチェックポイントと呼ばれるパス中の観測点をチェックすることでタイミングエラー発生の可能性を検出する.STEPはタイミングエラー予測手法であるため誤検出が発生し,誤検出の削減が大きな課題である.本稿ではチェックポイントの最適化により誤検出を削減する手法を提案する.実験結果より,動作可能周波数が最大で2.4倍となり,スループットは最大で約45%向上した.

CiNii
不揮発メモリを対象とした書き込み削減手法のエネルギー評価 (ディペンダブルコンピューティングデザインガイア2013 : VLSI設計の新しい大地)

多和田雅師, 木村晋二, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 321 ) 141 - 146 2013年11月

近年の高集積化に伴い消費電力全体に対するリーク電力の割合が高まっている.不揮発メモリはリーク電力をほとんど消費しないため次世代のメモリとして期待されている.不揮発メモリは通常のメモリより書き込み時に電力を消費する問題がある.不揮発メモリの書き込み電力を低減するためには,書き込みビット数を削減する手法が考えられる.メモリの値をある値から違う値へ書き換えるとき,実際に保存する値を符号化することで,本来書き換えるビット数よりも実際に書き込むビット数を少なくすることができる.本稿では不揮発メモリを対象とした書き込みビット数削減手法のエネルギーを評価する.

CiNii
HDR-mcdを対象としたクロックエネルギー優位な高位合成と実験評価 (ディペンダブルコンピューティングデザインガイア2013 : VLSI設計の新しい大地)

阿部晋矢, 史又華, 宇佐美公良, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 321 ) 263 - 268 2013年11月

LSI全体に占めるクロック信号によるエネルギー消費の割合は大きく,マルチクロックドメイン,クロックゲーティングなどが提案された.本稿では,マルチクロックドメイン指向HDR-mcdアーキテクチャを対象としたクロックエネルギー削減に向けた高位合成手法を提案する.提案手法は1クロック内の通信が保障されるハドルと呼ぶ区画を利用し,配線遅延の影響を予測,異なるクロック間の同期を考慮した高位合成を実現する.クロックはハドル毎に割り当て,資源制約と時間制約を満たす範囲で低い周波数のクロックを割り当てることで低電力化する.計算機実験により提案手法はクロックゲーティングのみを考慮した従来手法と比較し,クロックツリーのエネルギーを30%程度削減でき,全体のエネルギーを25%程度削減できることを確認した.

CiNii
HDR-mcdを対象としたクロックエネルギー優位な高位合成と実験評価

阿部晋矢, 史又華, 宇佐美公良, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 47 ) 1 - 6 2013年11月

LSI 全体に占めるクロック信号によるエネルギー消費の割合は大きく，マルチクロックドメイン，クロックゲーティングなどが提案された．本稿では，マルチクロックドメイン指向 HDR-mcd アーキテクチャを対象としたクロックエネルギー削減に向けた高位合成手法を提案する．提案手法は 1 クロック内の通信が保障されるハドルと呼ぶ区画を利用し，配線遅延の影響を予測，異なるクロック間の同期を考慮した高位合成を実現する．クロックはハドル毎に割り当て，資源制約と時間制約を満たす範囲で低い周波数のクロックを割り当てることで低電力化する．計算機実験により提案手法はクロックゲーティングのみを考慮した従来手法と比較し，クロックツリーのエネルギーを 30％程度削減でき，全体のエネルギーを 25％程度削減できることを確認した． In this paper, we propose a clock energy-efficient high-level synthesis algorithm for HDR-mcd architecture. In HDR-mcd, an entire chip is divided into several huddles. Huddles can realize synchronization between different clock domains in which interconnection delay is required and should be considered during high-level synthesis. In our iterative improvement based algorithm, low-frequency clocks are assigned to non-critical huddles under resource and latency constraints for energy efficiency improvement. Experimental results show that the proposed method achieves 20% clock energy-saving and 10% total energy-saving compared with the existing methods considering clock gating.

CiNii
不揮発メモリを対象とした書き込み削減手法のエネルギー評価

多和田雅師, 木村晋二, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 26 ) 1 - 6 2013年11月

近年の高集積化に伴い消費電力全体に対するリーク電力の割合が高まっている．不揮発メモリはリーク電力をほとんど消費しないため次世代のメモリとして期待されている．不揮発メモリは通常のメモリより書き込み時に電力を消費する問題がある．不揮発メモリの書き込み電力を低減するためには，書き込みビット数を削減する手法が考えられる．メモリの値をある値から違う値へ書き換えるとき，実際に保存する値を符号化することで，本来書き換えるビット数よりも実際に書き込むビット数を少なくすることができる．本稿では不揮発メモリを対象とした書き込みビット数削減手法のエネルギーを評価する． Non-volatile memory has many advantages over SRAM, such as high density, low leakage power, and non-volatility. However, one of its largest problems is that it consumes a large amount of energy in writing. It is therefore necessary to reduce the number of written bits and thus decrease the writing energy. We have proposed a memory writing reduction method based on error correcting codes. When data is written into a memory, we do not write it directly but encode it into a codeword. The number of bits written into memory is then limited for each data write. In this paper, we demonstrate several experimental evaluations from the viewpoints of energy reduction and discuss the effectiveness of our proposed writing-reduction codes.
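
To make the idea of "encode before writing so that fewer bits flip" concrete, the Python sketch below uses plain inversion coding (in the spirit of the well-known Flip-N-Write scheme), which is a simpler technique swapped in for illustration and is not the error-correcting-code based method of the paper. One extra flag bit per word is assumed, and all names are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def encode_write(stored: int, data: int, n: int) -> int:
    """Choose the (n+1)-bit cell contents that represent the n-bit `data`
    while flipping as few bits of the current contents `stored` as possible.
    Bit n is the flag: 0 means stored as is, 1 means stored inverted."""
    mask = (1 << n) - 1
    plain = data & mask                       # flag = 0
    inverted = ((~data) & mask) | (1 << n)    # flag = 1
    return min(plain, inverted, key=lambda cell: hamming(stored, cell))


def decode(cell: int, n: int) -> int:
    """Recover the n-bit data word from the (n+1)-bit cell contents."""
    mask = (1 << n) - 1
    payload = cell & mask
    return (~payload) & mask if (cell >> n) & 1 else payload
```

The two candidate codewords differ in every one of their `n + 1` bits, so their distances to the stored word add up to `n + 1`, and the better one never flips more than `(n + 1) // 2` bits per write.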

CiNii
信頼性と時間オーバーヘッド間のトレードオフを考慮した面積制約にもとづくRDRアーキテクチャ向けフォールトセキュア高位合成手法

川村一志, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 24 ) 1 - 6 2013年11月

半導体の微細化技術の進展に伴い，ソフトエラーに起因する信頼性の低下，及び配線遅延の相対的増大が問題となっている．信頼性の低下を克服する手法のひとつに並行誤り検出を用いたフォールトセキュア設計手法があり，演算処理の部分的な二重化を考えることで信頼性とオーバーヘッドのトレードオフを考慮した設計が可能となる．本稿では，小さいオーバーヘッドで大きな信頼性向上が得られるよう高位合成段階での適用を前提とし，設計手法を提案する．提案手法のポイントは三点あり，第一に RDR アーキテクチャを対象とすることで高位合成段階で配線遅延を考慮できるようにする．第二に面積制約を通常計算用に用意した RDR アーキテクチャとすることで面積オーバーヘッドなくフォールトセキュア設計を実現する．第三に与えられた時間制約のもとで信頼性の最大化を目指す．提案手法を計算機上に実装し，従来手法と比較した結果，時間及び面積オーバーヘッドなく最大 44％の信頼性向上を達成した．さらに，面積オーバーヘッドの増大を許容しなくとも 50％程度の時間オーバーヘッドを許容することで演算処理の完全な二重化が実現可能であることを示した． With process technology scaling, decreasing reliability caused by soft errors as well as increasing average interconnection delays are becoming serious issues. The fault-secure design technique which utilizes concurrent error detection is one of the approaches to overcome reliability degradation, and we can design systems based on a trade-off between reliability and several kinds of overhead by giving a partial redundancy to operations. In this paper, we propose a partially redundant fault-secure high-level synthesis algorithm for RDR architectures. Our proposed algorithm receives a fixed area constraint and various time constraints as inputs, and aims at maximizing reliability under them. Experimental results demonstrate that our algorithm improves reliability by up to 44% with zero time and area overhead compared with the conventional approach. They also show that we can realize complete duplication of operations with zero area overhead and about 50% time overhead.

CiNii
チェックポイント観測によるタイミングエラー予測手法

五十嵐博昭, 史又華, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 8 ) 1 - 6 2013年11月

プロセス技術の微細化により LSI のタイミング設計が難しくなっており，タイミングエラー対策手法の重要性が高まっている．既存のタイミングエラー検出手法はエラー訂正に再実行が必要であったり，複雑な構造を持つためタイミング設計が難しい．我々はより訂正コストが小さく簡単な構造を持つタイミングエラー対策手法として STEP を提案している．STEP ではチェックポイントと呼ばれるパス中の観測点をチェックすることでタイミングエラー発生の可能性を検出する．STEP はタイミングエラー予測手法であるため誤検出が発生し，誤検出の削減が大きな課題である．本稿ではチェックポイントの最適化により誤検出を削減する手法を提案する．実験結果より，動作可能周波数が最大で 2.4 倍となり，スループットは最大で約 45％向上した． Due to advanced process technologies, timing design of LSIs has become more difficult and the importance of timing error countermeasure techniques is increasing as well. Existing timing error detection/correction methods have difficulties in timing design since they have complex structures. Furthermore, their error correction is realized by re-run operation, which results in low throughput. We have proposed a suspicious timing error prediction method (STEP method) which predicts timing errors and corrects them with a simple structure. STEP is based on checking timing errors by observing several checkpoints on signal paths. Since STEP is a timing error prediction method, we may have false positives, and reducing them is one of the largest problems. In this paper, we propose a method to reduce the false positives by optimizing the checkpoints. The experimental results show that the operational frequency is increased by up to 2.4 times and the throughput is improved by up to 45%.

CiNii
スキャンシグネチャを用いたLED暗号へのスキャンベース攻撃 (画像工学)

藤代美佳, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 237 ) 47 - 52 2013年10月

LED (Light Encryption Device)暗号は,センサーネットワークやRFIDタグ等において需要が高い最軽量ブロック暗号の1つで,ハードウェアへ小面積で実装可能である.AESに似た構造であり,分割・転置等を実行するラウンド処理と鍵との加算を繰り返す.一方で,暗号LSIに対するサイドチャネル攻撃の一種であるスキャンベース攻撃の危険性が報告されている.スキャンベース攻撃ではテスト用のスキャンチェインを用いて暗号LSIの秘密情報を取得する.本稿では,スキャンシグネチャを用いたLED暗号へのスキャンベース攻撃手法を提案する.提案手法は,特定の平文を入力したLSIから取得したスキャンデータをXOR加算し,特定のビット列に着目することで秘密鍵を部分ごとに解読する手法である.スキャンチェインのレジスタの種類や数,接続順が分からなくても解読可能である.計算機実験では,提案手法を用いて平均73個の平文で64ビットの秘密鍵を復元可能と確認した.スキャンチェインに他の回路が含まれていることを想定し,スキャンデータにランダムなビット値を付加してスキャンチェイン長を4096ビットまで変化させた場合にも,秘密鍵が解読できることを確認した.

CiNii
スキャンシグネチャを用いたLED暗号へのスキャンベース攻撃 (集積回路)

藤代美佳, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 236 ) 47 - 52 2013年10月

LED (Light Encryption Device)暗号は,センサーネットワークやRFIDタグ等において需要が高い最軽量ブロック暗号の1つで,ハードウェアへ小面積で実装可能である.AESに似た構造であり,分割・転置等を実行するラウンド処理と鍵との加算を繰り返す.一方で,暗号LSIに対するサイドチャネル攻撃の一種であるスキャンベース攻撃の危険性が報告されている.スキャンベース攻撃ではテスト用のスキャンチェインを用いて暗号LSIの秘密情報を取得する.本稿では,スキャンシグネチャを用いたLED暗号へのスキャンベース攻撃手法を提案する.提案手法は,特定の平文を入力したLSIから取得したスキャンデータをXOR加算し,特定のビット列に着目することで秘密鍵を部分ごとに解読する手法である.スキャンチェインのレジスタの種類や数,接続順が分からなくても解読可能である.計算機実験では,提案手法を用いて平均73個の平文で64ビットの秘密鍵を復元可能と確認した.スキャンチェインに他の回路が含まれていることを想定し,スキャンデータにランダムなビット値を付加してスキャンチェイン長を4096ビットまで変化させた場合にも,秘密鍵が解読できることを確認した.

CiNii
製造後遅延調整機能を持つRDRアーキテクチャ向け高位合成手法の評価

萩尾勇太, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 9 ) 1 - 6 2013年09月

LSI の微細加工技術の進歩により配線遅延の拡大や製造時の遅延ばらつきによるタイミング違反が問題となっている．とりわけ配線遅延がゲート遅延と比較して相対的に増加しており高位合成段階でいかに配線遅延を取り扱うかが鍵となる．また，製造時の遅延ばらつきに対応するために，従来は過剰なマージンの挿入，統計的静的遅延解析などが適用されてきたが，性能低下しない手法としてチップ製造後の回路チューニングが提案されている．このような背景に基づき，配線遅延の拡大や製造時の遅延ばらつきの双方に対応した高位合成として，製造後遅延調整機能を持つ RDR アーキテクチャ向け高位合成手法を提案した．本稿では，提案手法の有効性を検証するため計算機実験をし，従来手法と比較することで提案手法を評価する．また，回路面積を最小化するために提案手法では類似化のステップを設けているが，その有効性についても検証する．計算機実験により，提案手法は従来手法と比較して遅延ばらつき発生時の実行時間を最大 42.9％削減できることを確認した． As device feature size drops, interconnection delays often exceed gate delays. We have to incorporate interconnection delays even in high-level synthesis. Using RDR architectures is one of the effective solutions to this problem. At the same time, process and delay variation also becomes a serious problem which may result in several timing errors. How to deal with this problem is another key issue in high-level synthesis. Thus, we have proposed a high-level synthesis algorithm with post-silicon delay tuning for RDR architectures. In this paper, we evaluate our high-level synthesis algorithm comparing several existing algorithms considering several situations. Experimental results show that our algorithm successfully reduces delayed scheduling/binding latency by up to 42.9% compared with the conventional approach.

CiNii
製造後遅延調整機能を持つRDRアーキテクチャ向け高位合成手法の評価(システムLSIの応用とその要素技術,専用プロセッサ,プロセッサ,DSP,画像処理技術,及び一般)

萩尾勇太, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 113 ( 235 ) 41 - 46 2013年09月

LSIの微細加工技術の進歩により配線遅延の拡大や製造時の遅延ばらつきによるタイミング違反が問題となっている.とりわけ配線遅延がゲート遅延と比較して相対的に増加しており高位合成段階でいかに配線遅延を取り扱うかが鍵となる.また,製造時の遅延ばらつきに対応するために,従来は過剰なマージンの挿入,統計的静的遅延解析などが適用されてきたが,性能低下しない手法としてチップ製造後の回路チューニングが提案されている.このような背景に基づき,配線遅延の拡大や製造時の遅延ばらつきの双方に対応した高位合成として,製造後遅延調整機能を持つRDRアーキテクチャ向け高位合成手法を提案した.本稿では,提案手法の有効性を検証するため計算機実験をし,従来手法と比較することで提案手法を評価する.また,回路面積を最小化するために提案手法では類似化のステップを設けているが,その有効性についても検証する.計算機実験により,提案手法は従来手法と比較して遅延ばらつき発生時の実行時間を最大42.9%削減できることを確認した.

CiNii
スキャンシグネチャを用いたLED暗号へのスキャンベース攻撃(システムLSIの応用とその要素技術,専用プロセッサ,プロセッサ,DSP,画像処理技術,及び一般)

藤代美佳, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 113 ( 235 ) 47 - 52 2013年09月

LED (Light Encryption Device)暗号は,センサーネットワークやRFIDタグ等において需要が高い最軽量ブロック暗号の1つで,ハードウェアへ小面積で実装可能である.AESに似た構造であり,分割・転置等を実行するラウンド処理と鍵との加算を繰り返す.一方で,暗号LSIに対するサイドチャネル攻撃の一種であるスキャンベース攻撃の危険性が報告されている.スキャンベース攻撃ではテスト用のスキャンチェインを用いて暗号LSIの秘密情報を取得する.本稿では,スキャンシグネチャを用いたLED暗号へのスキャンベース攻撃手法を提案する.提案手法は,特定の平文を入力したLSIから取得したスキャンデータをXOR加算し,特定のビット列に着目することで秘密鍵を部分ごとに解読する手法である.スキャンチェインのレジスタの種類や数,接続順が分からなくても解読可能である.計算機実験では,提案手法を用いて平均73個の平文で64ビットの秘密鍵を復元可能と確認した.スキャンチェインに他の回路が含まれていることを想定し,スキャンデータにランダムなビット値を付加してスキャンチェイン長を4096ビットまで変化させた場合にも,秘密鍵が解読できることを確認した.

CiNii
セレクタ論理を利用したバイリニア補間演算器設計と評価(システムLSIの応用とその要素技術,専用プロセッサ,プロセッサ,DSP,画像処理技術,及び一般)

塩雅史, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 113 ( 235 ) 53 - 58 2013年09月

バイリニア補間は補間演算の1 つであり,周囲4 つの値から線形的に値を補間する.画像の拡大・縮小など実用的に用いられることも多い.本稿では,バイリニア補間演算をビットレベル式変形しセレクタ論理に帰着させることで,桁上げ伝搬遅延を削減し高速化したセレクタ論理帰着型バイリニア補間演算器を提案する.セレクタ論理帰着型バイリニア補間演算器において,セレクタ論理によって生成された部分積を算術演算子を用いて加算,桁上げ先読み加算器を用いて加算など,複数の方法で実装し評価・比較した.

CiNii
A-17-10 滑らかな接続表現のための道路ネットワーク整形手法(A-17.ITS,一般セッション)

折原照崇, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2013 143 - 143 2013年09月

CiNii
A-3-6 故障差分解析に耐性を持つデータ修復可能なAES暗号回路(A-3.VLSI設計技術,一般セッション)

谷口寛彰, 史又華, 戸川望, 柳澤政生

電子情報通信学会ソサイエティ大会講演論文集 2013 49 - 49 2013年09月

CiNii
A-3-5 トロイパスによるハードウェアトロイ検出の一手法(A-3.VLSI設計技術,一般セッション)

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2013 48 - 48 2013年09月

CiNii
スキャンシグネチャを利用したブロック暗号に対するスキャンベース攻撃

戸川望, 小寺博和

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 26 197 - 202 2013年07月

CiNii
RDRアーキテクチャを対象とした時間・面積制約にもとづくフォールトセキュア高位合成手法

川村一志, 柳澤政生, 戸川望

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 26 454 - 459 2013年07月

CiNii
IL1およびIL2キャッシュに不揮発メモリを利用した二階層キャッシュにおける消費エネルギーの評価 (システム数理と応用)

松野翔太, 多和田雅師, 柳澤政生, 木村晋二, 戸川望, 杉林直彦

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 121 ) 89 - 94 2013年07月

オンチップ・メモリによく利用されるSRAMは,高速かつ動作電力が低いが,微細化とともに構造に起因するリーク電力が増大し,無視できなくなってきた.一方,不揮発メモリはリーク電力が小さいという特性を持つ.さらに電源をOFFにしても記憶内容が保持されるため,ノーマリオフへの活用が期待されている.しかし,書き込みエネルギーが大きいなどの欠点がある.本稿では,二階層キャッシュの一部に不揮発メモリを利用したときに,書き込みエネルギーが大きいという欠点があっても,消費エネルギーが削減できることを確認した.
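
The trade-off described in this abstract (higher write energy against lower leakage) can be written as a first-order energy model. The Python sketch below is an assumption-laden illustration, not the evaluation method of the paper: the model `reads * e_read + writes * e_write + leak_power * runtime` and every numeric value in the usage example are made up purely to show the arithmetic.

```python
def cache_energy(reads, writes, e_read, e_write, leak_power, runtime):
    """First-order energy model: dynamic access energy plus leakage over time."""
    return reads * e_read + writes * e_write + leak_power * runtime


def max_writes_for_saving(reads, runtime, sram, nvm):
    """Largest write count for which the NVM cache still uses less energy.
    `sram` and `nvm` are (e_read, e_write, leak_power) tuples; returns None
    if NVM wins for any write count or never wins under this model."""
    extra_per_write = nvm[1] - sram[1]
    saved_elsewhere = (reads * (sram[0] - nvm[0])
                       + runtime * (sram[2] - nvm[2]))
    if extra_per_write <= 0 or saved_elsewhere <= 0:
        return None
    # NVM wins while writes * extra_per_write < saved_elsewhere.
    return (saved_elsewhere - 1) // extra_per_write


# Hypothetical integer parameters in arbitrary units (not measured data).
SRAM = (1, 1, 5)    # cheap writes, high leakage
NVM = (1, 10, 1)    # expensive writes, low leakage
sram_total = cache_energy(1000, 100, *SRAM, runtime=1000)
nvm_total = cache_energy(1000, 100, *NVM, runtime=1000)
```

Under these made-up parameters the NVM cache comes out ahead because leakage dominates the total; with a much larger write count the comparison would reverse.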

CiNii
最大ハミング距離を制限した符号とこれを用いた不揮発メモリの書き込み削減手法 (システム数理と応用)

多和田雅師, 木村晋二, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 121 ) 95 - 100 2013年07月

近年の高集積化に伴い消費電力全体に対するリーク電力の割合が高まっている.不揮発メモリはリーク電力をほとんど消費しないため次世代のメモリとして期待されている.不揮発メモリは通常のメモリより書き込み時に電力を消費する問題がある.不揮発メモリの書き込み電力を低減するためには,書き込みビット数を削減する手法が考えられる.メモリの値をある値から違う値へ書き換えるとき,実際に保存する値を符号化することで,本来書き換えるビット数よりも実際に書き込むビット数を少なくすることができる.最大ハミング距離を制限した符号により,書き込みビット数を削減する手法を提案する.符号間の最大ハミング距離を制限する符号を生成し,一回の値の書き込みで反転するビット数を制限することで書き込みビット数を削減する.

CiNii
IL1およびIL2キャッシュに不揮発メモリを利用した二階層キャッシュにおける消費エネルギーの評価 (回路とシステム)

松野翔太, 多和田雅師, 柳澤政生, 木村晋二, 戸川望, 杉林直彦

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 118 ) 89 - 94 2013年07月

オンチップ・メモリによく利用されるSRAMは,高速かつ動作電力が低いが,微細化とともに構造に起因するリーク電力が増大し,無視できなくなってきた.一方,不揮発メモリはリーク電力が小さいという特性を持つ.さらに電源をOFFにしても記憶内容が保持されるため,ノーマリオフへの活用が期待されている.しかし,書き込みエネルギーが大きいなどの欠点がある.本稿では,二階層キャッシュの一部に不揮発メモリを利用したときに,書き込みエネルギーが大きいという欠点があっても,消費エネルギーが削減できることを確認した.

CiNii
最大ハミング距離を制限した符号とこれを用いた不揮発メモリの書き込み削減手法 (回路とシステム)

多和田雅師, 木村晋二, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 118 ) 95 - 100 2013年07月

近年の高集積化に伴い消費電力全体に対するリーク電力の割合が高まっている.不揮発メモリはリーク電力をほとんど消費しないため次世代のメモリとして期待されている.不揮発メモリは通常のメモリより書き込み時に電力を消費する問題がある.不揮発メモリの書き込み電力を低減するためには,書き込みビット数を削減する手法が考えられる.メモリの値をある値から違う値へ書き換えるとき,実際に保存する値を符号化することで,本来書き換えるビット数よりも実際に書き込むビット数を少なくすることができる.最大ハミング距離を制限した符号により,書き込みビット数を削減する手法を提案する.符号間の最大ハミング距離を制限する符号を生成し,一回の値の書き込みで反転するビット数を制限することで書き込みビット数を削減する.

CiNii
IL1およびIL2キャッシュに不揮発メモリを利用した二階層キャッシュにおける消費エネルギーの評価 (信号処理)

松野翔太, 多和田雅師, 柳澤政生, 木村晋二, 戸川望, 杉林直彦

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 120 ) 89 - 94 2013年07月

オンチップ・メモリによく利用されるSRAMは,高速かつ動作電力が低いが,微細化とともに構造に起因するリーク電力が増大し,無視できなくなってきた.一方,不揮発メモリはリーク電力が小さいという特性を持つ.さらに電源をOFFにしても記憶内容が保持されるため,ノーマリオフへの活用が期待されている.しかし,書き込みエネルギーが大きいなどの欠点がある.本稿では,二階層キャッシュの一部に不揮発メモリを利用したときに,書き込みエネルギーが大きいという欠点があっても,消費エネルギーが削減できることを確認した.

CiNii
IL1およびIL2キャッシュに不揮発メモリを利用した二階層キャッシュにおける消費エネルギーの評価 (VLSI設計技術)

松野翔太, 多和田雅師, 柳澤政生, 木村晋二, 戸川望, 杉林直彦

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 119 ) 89 - 94 2013年07月

オンチップ・メモリによく利用されるSRAMは,高速かつ動作電力が低いが,微細化とともに構造に起因するリーク電力が増大し,無視できなくなってきた.一方,不揮発メモリはリーク電力が小さいという特性を持つ.さらに電源をOFFにしても記憶内容が保持されるため,ノーマリオフへの活用が期待されている.しかし,書き込みエネルギーが大きいなどの欠点がある.本稿では,二階層キャッシュの一部に不揮発メモリを利用したときに,書き込みエネルギーが大きいという欠点があっても,消費エネルギーが削減できることを確認した.

CiNii
特定形状を考慮した視認性の良いエリア略地図生成手法

折原照崇, 柳澤政生, 戸川望

マルチメディア、分散協調とモバイルシンポジウム2013論文集 2013 2036 - 2043 2013年07月

CiNii
VNS：可視グラフに基づく屋内環境ナビゲーションシステム

町田理, 町田直哉, 柳澤政生, 戸川望

マルチメディア、分散協調とモバイルシンポジウム2013論文集 2013 688 - 701 2013年07月

CiNii
モバイル端末におけるセンサ利用型現在位置測位の精度評価

藤田博, 柳澤政生, 戸川望

マルチメディア、分散協調とモバイルシンポジウム2013論文集 2013 175 - 181 2013年07月

CiNii
ランドマーク表示歩行者向けナビゲーションシステム

岩田裕樹, 柳澤政生, 戸川望

マルチメディア、分散協調とモバイルシンポジウム2013論文集 2013 702 - 716 2013年07月

CiNii
セレクタ論理を利用した線形補間演算器設計と評価 (VLSI設計技術)

塩雅史, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 30 ) 49 - 54 2013年05月

補間演算とは,得られたデータ列から範囲内の値を推定する演算であり,画像の拡大・縮小や歪みの補正などに用いられる.線形補間は補間演算の1つであり,2つの値を直線的に結ぶことで補間を実行する.演算量が多い多項式補間の中では比較的演算量が少ないため,実用的に用いられることも多い.本稿では,線形補間演算をビットレベル式変形しセレクタ論理に帰着させることで,セレクタ論理帰着型線形補間演算器を提案する.提案するセレクタ論理帰着型線形補間演算器は,セレクタ論理を用いることで桁上げ伝搬遅延を削減し演算の高速化を実現する.セレクタ論理帰着型線形補間演算器において,セレクタ論理によって生成された部分積を算術演算子を用いて加算,桁上げ先読み加算器を用いて加算など,複数の方法で実装し評価・比較した.その結果,セレクタ論理を用いず補間演算全体を単純な算術演算子を用いて設計した場合と比較し,補間演算に対しセレクタ論理によって生成された部分積を算術演算子を用いて加算した場合,遅延時間が最大16%削減されることを確認した.

CiNii
スキャンシグネチャを用いたストリーム暗号Triviumへのスキャンベース攻撃手法 (VLSI設計技術)

藤代美佳, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 30 ) 61 - 66 2013年05月

Triviumは同期式ストリーム暗号の1つで,ストリーム暗号評価プロジェクトeSTREAMにて推奨アルゴリズムに選定されたものである.3本のシフトレジスタから構成され,内部演算はビット同士のAND演算とXOR演算のみであり,構造が単純で高速に動作する.一方,暗号LSIに対するサイドチャネル攻撃の1つに,テスト用のスキャンチェインを利用して暗号解読するスキャンベース攻撃がある.Triviumへのスキャンベース攻撃は,スキャンチェインを用いて取得したスキャンデータから回路内部の状態を求め,求めた内部状態を元にキーストリームと平文を復元する.Triviumへのスキャンベース攻撃に対し従来手法では,暗号回路の内部レジスタのみがスキャンチェインに含まれていることが前提であり,周辺回路のレジスタがスキャンチェインに含まれている場合,スキャンチェインの内部構造を解析できない.一般に,LSIチップには暗号回路と共に複数の回路が同一スキャンチェインに含まれることが多く,従来手法を実際の攻撃手法として利用することは難しい.本稿では,スキャンチェインの構造に依存しないTriviumへのスキャンベース攻撃手法を提案する.提案手法では,1ビットレジスタ値の入力・動作サイクル数に対する変化がそのレジスタ固有の値になることを利用しTriviumの内部状態を復元する.計算機実験により,スキャンチェインに周辺回路が含まれる場合でも,提案手法を用いてTriviumの内部状態を復元し,平文を復元できることを確認した.

CiNii
RDRアーキテクチャを対象とした時間及び面積オーバーヘッドのないフォールトセキュア高位合成手法 (VLSI設計技術)

川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 30 ) 67 - 72 2013年05月

本論文では,RDRアーキテクチャを対象とした時間及び面積オーバーヘッドのないフォールトセキュア高位合成手法を提案する.提案手法は,与えられた時間制約,面積制約のもとで演算処理を部分的に二重化し,ソフトエラーに起因するフォールトを検出することで信頼性の向上を図る.計算機実験の結果から,提案手法は従来手法と比較して,時間及び面積のオーバーヘッドなく信頼性を最大37.73%向上させることに成功した.

CiNii
RDRアーキテクチャを対象とした時間及び面積オーバーヘッドのないフォールトセキュア高位合成手法

川村一志, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 12 ) 1 - 6 2013年05月

In this paper, we propose a fault-secure high-level synthesis method for RDR architectures with zero time and area overhead. Under given time and area constraints, the method partially duplicates operations and improves reliability by detecting faults caused by soft errors. Experimental results show that the proposed method improves reliability by up to 37.73% with no time or area overhead compared with the conventional method.

CiNii
セレクタ論理を利用した線形補間演算器設計と評価

塩雅史, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 9 ) 1 - 6 2013年05月

Interpolation estimates values within the range of a given data sequence and is used for image scaling, distortion correction, and similar tasks. Linear interpolation is one such technique: it interpolates by connecting two known values with a straight line. Because its computational cost is comparatively low among polynomial interpolation methods, which are generally expensive, it is widely used in practice. In this paper, we propose a selector-logic-based linear interpolator obtained by transforming the linear interpolation operation at the bit level and reducing it to selector logic. The proposed interpolator uses selector logic to reduce carry-propagation delay and thereby speeds up the operation. We implemented the interpolator in several ways, for example summing the partial products generated by the selector logic with arithmetic operators or with carry-lookahead adders, and evaluated and compared the results. Compared with a design that implements the entire interpolation with simple arithmetic operators and no selector logic, the design that sums the selector-generated partial products with arithmetic operators reduces delay by up to 16%.

CiNii
スキャンシグネチャを用いたストリーム暗号Triviumへのスキャンベース攻撃手法

藤代美佳, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 11 ) 1 - 6 2013年05月

Trivium is a synchronous stream cipher selected as a recommended algorithm by the eSTREAM stream cipher evaluation project. It consists of three shift registers, and its internal operations are only bitwise AND and XOR, so its structure is simple and it runs at high speed. Meanwhile, the scan-based attack is a side-channel attack on cryptographic LSIs that breaks the cipher by exploiting the scan chains provided for testing. A scan-based attack against Trivium derives the internal state of the circuit from scan data obtained through the scan chain, and then recovers the keystream and the plaintext from that internal state. The conventional scan-based attack against Trivium assumes that the scan chain contains only the internal registers of the cipher circuit; when registers of peripheral circuits are also included, it cannot analyze the internal structure of the scan chain. In general, an LSI chip often places several circuits on the same scan chain as the cipher circuit, so the conventional method is difficult to use as a practical attack. In this paper, we propose a scan-based attack against Trivium that uses scan signatures and does not depend on the structure of the scan chain. The proposed method recovers the internal state of Trivium by exploiting the fact that the way a 1-bit register value changes with respect to the input and the number of operating cycles is unique to that register. Computer experiments confirm that, even when the scan chain includes peripheral circuits, the proposed method recovers the internal state of Trivium and then the plaintext.

CiNii
2コアアーキテクチャを対象とするトレースベースキャッシュシミュレーションの精度評価 (ディペンダブルコンピューティング組込み技術とネットワークに関するワークショップETNET2013)

多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 : 信学技報 112 ( 482 ) 85 - 90 2013年03月

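Simulating cache behavior with cycle accuracy while an application runs on a processor is generally time-consuming. Instead, one can obtain a memory access trace from a cycle-accurate simulation that assumes a particular cache configuration, and then run a trace-based simulation of cache behavior on that trace, which shortens simulation time dramatically. Here, trace-based cache simulation means simulating how the cache behaves under the assumption that the processor issues memory accesses exactly as recorded in the trace. In a multi-core architecture, however, the memory accesses themselves change in principle with the assumed cache configuration. When trace-based simulation is applied to a multi-core architecture and the cache configuration assumed when the trace was obtained differs from the configuration being simulated, the trace-based result does not match the cycle-accurate result. In this paper, we evaluate how closely trace-based simulation agrees with cycle-accurate simulation when these two cache configurations differ.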
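
As a rough illustration of what trace-based cache simulation means, the sketch below replays a recorded list of byte addresses on a single set-associative cache with LRU replacement and counts hits and misses. It is a single-core toy under assumed parameters (the function name, LRU policy, and example trace are not from the report). It does not model the multi-core effect studied here, where the trace itself would change with the cache configuration.

```python
from collections import OrderedDict


def simulate_cache(trace, num_sets, ways, line_size):
    """Replay byte addresses on a set-associative LRU cache; return (hits, misses)."""
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = misses = 0
    for addr in trace:
        block = addr // line_size
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)         # mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return hits, misses


# The same trace can be replayed on many configurations without re-running the program.
trace = [0, 64, 0, 128, 64]
print(simulate_cache(trace, num_sets=1, ways=2, line_size=64))
print(simulate_cache(trace, num_sets=1, ways=4, line_size=64))
```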

CiNii
フロアプランを考慮したマルチクロックドメイン指向の低電力化高位合成手法 (コンピュータシステム組込み技術とネットワークに関するワークショップETNET2013)

阿部晋矢, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 : 信学技報 112 ( 481 ) 115 - 120 2013年03月

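In this paper, we propose HDR-mcd, an extension of the HDR architecture for multiple clock domains, and then a low-power high-level synthesis method for HDR-mcd that is oriented to multiple clock domains. The method follows a synthesis flow that feeds floorplan information back and refines the result iteratively. It uses partitions called huddles, within which communication is guaranteed to complete in one clock cycle, to predict the effect of interconnection delay and to account for synchronization between different clocks. A clock is assigned to each huddle, and power is reduced by assigning the lowest-frequency clock that satisfies the resource and time constraints. Computer experiments confirm that the proposed method reduces energy consumption by about 25% compared with a conventional distributed-register architecture that considers only a single clock.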

CiNii
2コアアーキテクチャを対象とするトレースベースキャッシュシミュレーションの精度評価 (コンピュータシステム組込み技術とネットワークに関するワークショップETNET2013)

多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 : 信学技報 112 ( 481 ) 85 - 90 2013年03月

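Simulating cache behavior with cycle accuracy while an application runs on a processor is generally time-consuming. Instead, one can obtain a memory access trace from a cycle-accurate simulation that assumes a particular cache configuration, and then run a trace-based simulation of cache behavior on that trace, which shortens simulation time dramatically. Here, trace-based cache simulation means simulating how the cache behaves under the assumption that the processor issues memory accesses exactly as recorded in the trace. In a multi-core architecture, however, the memory accesses themselves change in principle with the assumed cache configuration. When trace-based simulation is applied to a multi-core architecture and the cache configuration assumed when the trace was obtained differs from the configuration being simulated, the trace-based result does not match the cycle-accurate result. In this paper, we evaluate how closely trace-based simulation agrees with cycle-accurate simulation when these two cache configurations differ.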

CiNii
フロアプランを考慮したマルチクロックドメイン指向の低電力化高位合成手法

阿部晋矢, 史又華, 柳澤政生, 戸川望

研究報告組込みシステム（EMB） 2013 ( 20 ) 1 - 6 2013年03月

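In this paper, we propose HDR-mcd, an extension of the HDR architecture for multiple clock domains, and then a low-power high-level synthesis method for HDR-mcd that is oriented to multiple clock domains. The method follows a synthesis flow that feeds floorplan information back and refines the result iteratively. It uses partitions called huddles, within which communication is guaranteed to complete in one clock cycle, to predict the effect of interconnection delay and to account for synchronization between different clocks. A clock is assigned to each huddle, and power is reduced by assigning the lowest-frequency clock that satisfies the resource and time constraints. Computer experiments confirm that the proposed method reduces energy consumption by about 25% compared with a conventional distributed-register architecture that considers only a single clock.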

CiNii
2コアアーキテクチャを対象とするトレースベースキャッシュシミュレーションの精度評価

多和田雅師, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2013 ( 15 ) 1 - 6 2013年03月

Simulating cache behavior with cycle accuracy while an application runs on a processor is generally time-consuming. Instead, one can obtain a memory access trace from a cycle-accurate simulation that assumes a particular cache configuration, and then run a trace-based simulation of cache behavior on that trace, which shortens simulation time dramatically and allows many cache configurations to be evaluated in a short time. Here, trace-based cache simulation means simulating how the cache behaves under the assumption that the processor issues memory accesses exactly as recorded in the trace. In a multi-core architecture, however, the memory accesses themselves change in principle with the assumed cache configuration. When trace-based simulation is applied to a multi-core architecture and the cache configuration assumed when the trace was obtained differs from the configuration being simulated, the trace-based result does not match the cycle-accurate result. In this paper, we evaluate how closely trace-based simulation agrees with cycle-accurate simulation when these two cache configurations differ, using several benchmark applications.

CiNii
2コアアーキテクチャを対象とするトレースベースキャッシュシミュレーションの精度評価

多和田雅師, 柳澤政生, 戸川望

研究報告組込みシステム（EMB） 2013 ( 15 ) 1 - 6 2013年03月

Simulating cache behavior with cycle accuracy while an application runs on a processor is generally time-consuming. Instead, one can obtain a memory access trace from a cycle-accurate simulation that assumes a particular cache configuration, and then run a trace-based simulation of cache behavior on that trace, which shortens simulation time dramatically and allows many cache configurations to be evaluated in a short time. Here, trace-based cache simulation means simulating how the cache behaves under the assumption that the processor issues memory accesses exactly as recorded in the trace. In a multi-core architecture, however, the memory accesses themselves change in principle with the assumed cache configuration. When trace-based simulation is applied to a multi-core architecture and the cache configuration assumed when the trace was obtained differs from the configuration being simulated, the trace-based result does not match the cycle-accurate result. In this paper, we evaluate how closely trace-based simulation agrees with cycle-accurate simulation when these two cache configurations differ, using several benchmark applications.

CiNii
A-17-13 ランドマークを用いた携帯端末向けユニバーサルナビゲーションシステム(A-17.ITS)

石田健, 柳澤政生, 戸川望

電子情報通信学会総合大会講演論文集 2013 244 - 244 2013年03月

CiNii
A-17-7 携帯端末画面で表示するわかりやすい略地図経路の検討(A-17.ITS)

辻和貴, 柳澤政生, 戸川望

電子情報通信学会総合大会講演論文集 2013 238 - 238 2013年03月

CiNii
B-6-56 センサネットワーク低消費電力化のためのSMACプロトコルduty cycle最適化手法と実装評価(B-6.ネットワークシステム)

大岸龍司, 柳澤政生, 戸川望

電子情報通信学会総合大会講演論文集 2013 ( 2 ) 56 - 56 2013年03月

CiNii
HDRアーキテクチャを対象とした多段階クロックゲーティングを用いた低電力化高位合成手法

赤坂宏行, 阿部晋矢, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2013 ( 3 ) 2013年

J-GLOBAL
フロアプラン統合化アーキテクチャを対象とした複数クロックドメインおよび複数電源電圧による低電力化高位合成手法

阿部晋矢, SHI Youhua, 宇佐美公良, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2013 ( 3 ) 2013年

J-GLOBAL
製造後遅延調整機能を持つRDRアーキテクチャ向け高位合成手法

萩尾勇太, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2013 ( 3 ) 2013年

J-GLOBAL
ハミング符号を用いた冗長化による不揮発メモリの書き込み削減手法

多和田雅師, 木村晋二, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2013 ( 3 ) 2013年

J-GLOBAL
ストリーム暗号Triviumに対するスキャンチェインの構造に依存しないスキャンベース攻撃手法

藤代美佳, 柳澤政生, 戸川望

回路とシステムワークショップ論文集(CD-ROM) 26 2013年

J-GLOBAL
セレクタ論理を利用した線形補間演算器の実装と評価

塩雅史, 柳澤政生, 戸川望

回路とシステムワークショップ論文集(CD-ROM) 26 2013年

J-GLOBAL
SAAV:AVHDRアーキテクチャを対象とした動的複数電源電圧指向の低電力化高位合成手法

阿部晋矢, 史又華, 宇佐美公良, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 24 ) 1 - 6 2012年11月

The Adaptive Voltages Huddle-based Distributed-Register (AVHDR) architecture, a platform that integrates dynamic multiple supply voltages and interconnection delays into high-level synthesis, and a high-level synthesis method targeting it have been proposed. The conventional method follows a synthesis flow that feeds floorplan information back and refines the result iteratively, and it adopts virtual-area-based iterative refinement to improve convergence. However, virtual areas may incur area and interconnection-delay overheads. In this paper, we propose virtual area adaptation, which reduces these overheads as the iteration proceeds. Experimental results confirm that the proposed method reduces energy consumption by up to 6.2% compared with the conventional high-level synthesis algorithm for AVHDR architectures, and ultimately by up to 65.7% compared with conventional high-level synthesis algorithms.

CiNii
HDRアーキテクチャを対象とした同時実行指向スケジューリングを用いたクロック設計考慮低電力化高位合成手法

赤坂宏行, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 23 ) 1 - 6 2012年11月

As LSIs become smaller and more capable, demand for portable devices has grown, bringing problems of battery life and device heating. In addition, as LSI design processes shrink, the ratio of interconnection delay to gate delay continues to increase. High-level synthesis that reduces power consumption and predicts interconnection delay is therefore required. In this paper, we apply concurrency-oriented scheduling to HDR architectures and propose a method that forms huddles so that the total energy consumption, including that of the clock tree, is minimized. Focusing on increasing the number of control steps in which clock gating shuts off the clock, we perform scheduling that increases the number of operations executed at the same time. Aligning clock-gating timings at the high-level synthesis stage makes clock gating more effective than applying it after logic synthesis. Furthermore, we determine the clock-gating timings so that energy is minimized with the clock tree's energy consumption included. Computer experiments confirm that the proposed method reduces energy consumption by up to 21.2% compared with conventional methods.

CiNii
Camellia暗号回路に対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 10 ) 1 - 6 2012年11月

Camellia is a symmetric-key block cipher that offers higher resistance to cryptanalysis than AES and processing performance equivalent to AES. Because encryption and decryption can share the same processing and no arithmetic operations are used, it can be implemented in hardware with a small number of gates, which makes it highly practical. Camellia has an 18-round Feistel structure that repeats the round function 18 times. Meanwhile, scan-based attacks have been reported that identify the secret key from scan data obtained through the scan chains used for scan-path testing. However, no scan-based attack against Camellia has been reported. In this paper, we propose a scan-based attack against Camellia. The proposed method obtains the two sets of scan data produced when two specific plaintexts are input to a Camellia LSI, and XORs them to remove the influence of the S-function in the round function. It then focuses on specific bit strings of the XORed scan data and observes the changes in the corresponding registers to identify the four equivalent keys up to the fourth round, and uses the equivalent keys of the third and fourth rounds to recover the secret key. Software-based experiments show that the proposed method can recover the secret key of Camellia.

CiNii
島内消費電力量見積もりにもとづく温度特性を考慮したRDRアーキテクチャ向け高位合成手法

川村一志, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 3 ) 1 - 6 2012年11月

As semiconductor scaling advances, heat generation inside IC chips, and in particular hot spots (regions with locally high temperature), has become a problem. At the same time, scaling has made interconnection delay dominant over gate delay, so interconnection delay must be considered at the high-level synthesis stage. To address both issues, a thermal-aware high-level synthesis method has been proposed for the RDR (regular-distributed-register) architecture, which supports interconnection-delay-aware design. In this paper, we improve the formula used by the conventional method to estimate the energy consumption within each island, and propose a thermal-aware high-level synthesis method for RDR architectures based on this estimate. Because the RDR architecture divides the chip into islands of equal area, the proposed method focuses on the number of times operations are executed to equalize energy consumption among islands and thereby lowers the hot-spot temperature. It also estimates the effect of registers and multiplexers on heat generation inside the chip and places additional functional units in unused regions of the RDR architecture to minimize the hot-spot temperature. Computer experiments confirm that the proposed method reduces the hot-spot temperature by up to 15.51% compared with the conventional method (the English abstract of this report gives 15.42%).

CiNii
鍵ベース構成のState Dependent Scan Flip-Flopを用いたセキュアスキャンアーキテクチャ

跡部悠太, 史又華, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 9 ) 1 - 6 2012年11月

Cryptographic LSIs are used to perform confidential operations, so they must themselves be secure. Scan testing is a design-for-test technique with high fault coverage, and it has become more important as LSIs have grown in scale; however, scan-based attacks against various cipher circuits have been reported. SDSFF (State Dependent Scan Flip-Flop) has therefore been proposed as a secure scan architecture that preserves testability while providing high security against scan-based attacks. In SDSFF, the timing at which the value of the latch attached to a scan flip-flop is updated is an important issue. In this paper, we propose an update timing that enables online testing without sacrificing security. The proposed update timing is determined by comparing the values of arbitrary flip-flops on the scan chain with a key value fixed at circuit design time. We implemented the proposed method on RSA, AES, and DES cipher circuits and evaluated it. The experimental results show that the method is effective for a variety of cipher circuits.

CiNii
島内消費電力量見積もりにもとづく温度特性を考慮したRDRアーキテクチャ向け高位合成手法

川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 112 ( 320 ) 13 - 18 2012年11月

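As semiconductor scaling advances, heat generation inside IC chips, and in particular hot spots (regions with locally high temperature), has become a problem. At the same time, scaling has made interconnection delay dominant over gate delay, so interconnection delay must be considered at the high-level synthesis stage. To address both issues, a thermal-aware high-level synthesis method has been proposed for the RDR architecture, which supports interconnection-delay-aware design. In this paper, we improve the formula used by the conventional method to estimate the energy consumption within each island, and propose a thermal-aware high-level synthesis method for RDR architectures based on this estimate. Because the RDR architecture divides the chip into islands of equal area, the proposed method focuses on the number of times operations are executed to equalize energy consumption among islands and thereby lowers the hot-spot temperature. It also estimates the effect of registers and multiplexers on heat generation inside the chip and places additional functional units in unused regions of the RDR architecture to minimize the hot-spot temperature. Computer experiments confirm that the proposed method reduces the hot-spot temperature by up to 15.51% compared with the conventional method.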

CiNii
鍵ベース構成のState Dependent Scan Flip-Flopを用いたセキュアスキャンアーキテクチャ

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 112 ( 320 ) 45 - 50 2012年11月

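Cryptographic LSIs are used to perform confidential operations, so they must themselves be secure. Scan testing is a design-for-test technique with high fault coverage, and it has become more important as LSIs have grown in scale; however, scan-based attacks against various cipher circuits have been reported. SDSFF (State Dependent Scan Flip-Flop) has therefore been proposed as a secure scan architecture that preserves testability while providing high security against scan-based attacks. In SDSFF, the timing at which the value of the latch attached to a scan flip-flop is updated is an important issue. In this paper, we propose an update timing that enables online testing. The proposed update timing is determined by comparing the values of arbitrary flip-flops on the scan chain with a value fixed at circuit design time. We implemented the proposed method on RSA, AES, and DES cipher circuits and evaluated it. The experimental results show that the method is effective for a variety of cipher circuits.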

CiNii
Camellia暗号回路に対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 112 ( 320 ) 51 - 56 2012年11月

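Camellia is a symmetric-key block cipher that offers higher resistance to cryptanalysis than AES and processing performance equivalent to AES. Because encryption and decryption can share the same processing and no arithmetic operations are used, it can be implemented in hardware with a small number of gates, which makes it highly practical. Camellia has an 18-round Feistel structure that repeats the round function 18 times. Meanwhile, scan-based attacks have been reported that identify the secret key from scan data obtained through the scan chains used for scan-path testing. However, no scan-based attack against Camellia has been reported. In this paper, we propose a scan-based attack against Camellia. The proposed method obtains the two sets of scan data produced when two specific plaintexts are input to a Camellia LSI, and XORs them to remove the influence of the S-function in the round function. It then focuses on specific bit strings of the XORed scan data and observes the changes in the corresponding registers to identify the four equivalent keys up to the fourth round, and uses the equivalent keys of the third and fourth rounds to recover the secret key. Software-based experiments show that the proposed method can recover the secret key of Camellia.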

CiNii
HDRアーキテクチャを対象とした同時実行指向スケジューリングを用いたクロック設計考慮低電力化高位合成手法

赤坂宏行, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 112 ( 320 ) 129 - 134 2012年11月

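As LSIs become smaller and more capable, demand for portable devices has grown, bringing problems of battery life and device heating. In addition, as LSI design processes shrink, the ratio of interconnection delay to gate delay continues to increase. High-level synthesis that reduces power consumption and predicts interconnection delay is therefore required. In this paper, we apply concurrency-oriented scheduling to HDR architectures and propose a method that forms huddles so that the total energy consumption, including that of the clock tree, is minimized. Focusing on increasing the number of control steps in which clock gating shuts off the clock, we perform scheduling that increases the number of operations executed at the same time. Aligning clock-gating timings at the high-level synthesis stage makes clock gating more effective than applying it after logic synthesis. Furthermore, we determine the clock-gating timings so that energy is minimized with the clock tree's energy consumption included. Computer experiments confirm that the proposed method reduces energy consumption by up to 21.2% compared with conventional methods.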

CiNii
SAAV:AVHDRアーキテクチャを対象とした動的複数電源電圧指向の低電力化高位合成手法

阿部晋矢, 史又華, 宇佐美公良, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 112 ( 320 ) 135 - 140 2012年11月

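The Adaptive Voltages Huddle-based Distributed-Register (AVHDR) architecture, a platform that integrates dynamic multiple supply voltages and interconnection delays into high-level synthesis, and a high-level synthesis method targeting it have been proposed. The conventional method follows a synthesis flow that feeds floorplan information back and refines the result iteratively, and it adopts virtual-area-based iterative refinement to improve convergence. However, virtual areas may incur area and interconnection-delay overheads. In this paper, we propose virtual area adaptation, which reduces these overheads as the iteration proceeds. Experimental results confirm that the proposed method reduces energy consumption by up to 6.2% compared with the conventional high-level synthesis algorithm for AVHDR architectures, and ultimately by up to 65.7% compared with conventional high-level synthesis algorithms.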

CiNii
島内消費電力量見積もりにもとづく温度特性を考慮したRDRアーキテクチャ向け高位合成手法

川村一志, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 112 ( 321 ) 13 - 18 2012年11月

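As semiconductor scaling advances, heat generation inside IC chips, and in particular hot spots (regions with locally high temperature), has become a problem. At the same time, scaling has made interconnection delay dominant over gate delay, so interconnection delay must be considered at the high-level synthesis stage. To address both issues, a thermal-aware high-level synthesis method has been proposed for the RDR architecture, which supports interconnection-delay-aware design. In this paper, we improve the formula used by the conventional method to estimate the energy consumption within each island, and propose a thermal-aware high-level synthesis method for RDR architectures based on this estimate. Because the RDR architecture divides the chip into islands of equal area, the proposed method focuses on the number of times operations are executed to equalize energy consumption among islands and thereby lowers the hot-spot temperature. It also estimates the effect of registers and multiplexers on heat generation inside the chip and places additional functional units in unused regions of the RDR architecture to minimize the hot-spot temperature. Computer experiments confirm that the proposed method reduces the hot-spot temperature by up to 15.51% compared with the conventional method.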

CiNii
鍵ベース構成のState Dependent Scan Flip-Flopを用いたセキュアスキャンアーキテクチャ

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 112 ( 321 ) 45 - 50 2012年11月

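Cryptographic LSIs are used to perform confidential operations, so they must themselves be secure. Scan testing is a design-for-test technique with high fault coverage, and it has become more important as LSIs have grown in scale; however, scan-based attacks against various cipher circuits have been reported. SDSFF (State Dependent Scan Flip-Flop) has therefore been proposed as a secure scan architecture that preserves testability while providing high security against scan-based attacks. In SDSFF, the timing at which the value of the latch attached to a scan flip-flop is updated is an important issue. In this paper, we propose an update timing that enables online testing. The proposed update timing is determined by comparing the values of arbitrary flip-flops on the scan chain with a value fixed at circuit design time. We implemented the proposed method on RSA, AES, and DES cipher circuits and evaluated it. The experimental results show that the method is effective for a variety of cipher circuits.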

CiNii
Camellia暗号回路に対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 112 ( 321 ) 51 - 56 2012年11月

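Camellia is a symmetric-key block cipher that offers higher resistance to cryptanalysis than AES and processing performance equivalent to AES. Because encryption and decryption can share the same processing and no arithmetic operations are used, it can be implemented in hardware with a small number of gates, which makes it highly practical. Camellia has an 18-round Feistel structure that repeats the round function 18 times. Meanwhile, scan-based attacks have been reported that identify the secret key from scan data obtained through the scan chains used for scan-path testing. However, no scan-based attack against Camellia has been reported. In this paper, we propose a scan-based attack against Camellia. The proposed method obtains the two sets of scan data produced when two specific plaintexts are input to a Camellia LSI, and XORs them to remove the influence of the S-function in the round function. It then focuses on specific bit strings of the XORed scan data and observes the changes in the corresponding registers to identify the four equivalent keys up to the fourth round, and uses the equivalent keys of the third and fourth rounds to recover the secret key. Software-based experiments show that the proposed method can recover the secret key of Camellia.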

CiNii
HDRアーキテクチャを対象とした同時実行指向スケジューリングを用いたクロック設計考慮低電力化高位合成手法

赤坂宏行, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 112 ( 321 ) 129 - 134 2012年11月

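As LSIs become smaller and more capable, demand for portable devices has grown, bringing problems of battery life and device heating. In addition, as LSI design processes shrink, the ratio of interconnection delay to gate delay continues to increase. High-level synthesis that reduces power consumption and predicts interconnection delay is therefore required. In this paper, we apply concurrency-oriented scheduling to HDR architectures and propose a method that forms huddles so that the total energy consumption, including that of the clock tree, is minimized. Focusing on increasing the number of control steps in which clock gating shuts off the clock, we perform scheduling that increases the number of operations executed at the same time. Aligning clock-gating timings at the high-level synthesis stage makes clock gating more effective than applying it after logic synthesis. Furthermore, we determine the clock-gating timings so that energy is minimized with the clock tree's energy consumption included. Computer experiments confirm that the proposed method reduces energy consumption by up to 21.2% compared with conventional methods.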

CiNii
SAAV:AVHDRアーキテクチャを対象とした動的複数電源電圧指向の低電力化高位合成手法

阿部晋矢, 史又華, 宇佐美公良, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 112 ( 321 ) 135 - 140 2012年11月

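The Adaptive Voltages Huddle-based Distributed-Register (AVHDR) architecture, a platform that integrates dynamic multiple supply voltages and interconnection delays into high-level synthesis, and a high-level synthesis method targeting it have been proposed. The conventional method follows a synthesis flow that feeds floorplan information back and refines the result iteratively, and it adopts virtual-area-based iterative refinement to improve convergence. However, virtual areas may incur area and interconnection-delay overheads. In this paper, we propose virtual area adaptation, which reduces these overheads as the iteration proceeds. Experimental results confirm that the proposed method reduces energy consumption by up to 6.2% compared with the conventional high-level synthesis algorithm for AVHDR architectures, and ultimately by up to 65.7% compared with conventional high-level synthesis algorithms.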

CiNii
鍵ベース構成のState Dependent Scan Flip-Flopを用いたセキュアスキャンアーキテクチャのRSA暗号回路への実装 (集積回路)

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 : 信学技報 112 ( 247 ) 95 - 100 2012年10月

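Scan testing is a widely used design-for-test technique with high fault coverage. However, it has been pointed out that the secret key of a cryptographic LSI may be recovered through the scan chains used for scan testing. SDSFF (State Dependent Scan Flip-Flop) has therefore been proposed as a secure scan architecture that preserves testability while providing high security against scan-based attacks. In SDSFF, the timing at which the value of the latch attached to a scan flip-flop is updated is an important issue. In this paper, we propose an update timing that enables online testing. The proposed update timing is determined by comparing the values of arbitrary flip-flops on the scan chain with a value fixed at circuit design time. We implemented the proposed secure scan architecture on an RSA cipher circuit and evaluated it. The experimental results show that the area overhead with 100 SDSFFs is at most 0.555%, which is smaller than that of conventional methods.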

CiNii
鍵ベース構成の State Dependent Scan Flip-Flop を用いたセキュアスキャンアーキテクチャのRSA暗号回路への実装

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. ICD, 集積回路 112 ( 247 ) 95 - 100 2012年10月

CiNii
鍵ベース構成の State Dependent Scan Flip-Flop を用いたセキュアスキャンアーキテクチャのRSA暗号回路への実装

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. SIP, 信号処理 : IEICE technical report 112 ( 246 ) 95 - 100 2012年10月

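Scan testing is a widely used design-for-test technique with high fault coverage. However, it has been pointed out that the secret key of a cryptographic LSI may be recovered through the scan chains used for scan testing. SDSFF (State Dependent Scan Flip-Flop) has therefore been proposed as a secure scan architecture that preserves testability while providing high security against scan-based attacks. In SDSFF, the timing at which the value of the latch attached to a scan flip-flop is updated is an important issue. In this paper, we propose an update timing that enables online testing. The proposed update timing is determined by comparing the values of arbitrary flip-flops on the scan chain with a value fixed at circuit design time. We implemented the proposed secure scan architecture on an RSA cipher circuit and evaluated it. The experimental results show that the area overhead with 100 SDSFFs is at most 0.555%, which is smaller than that of conventional methods.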

CiNii
鍵ベース構成の State Dependent Scan Flip-Flop を用いたセキュアスキャンアーキテクチャのRSA暗号回路への実装

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 112 ( 245 ) 95 - 100 2012年10月

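Scan testing is a widely used design-for-test technique with high fault coverage. However, it has been pointed out that the secret key of a cryptographic LSI may be recovered through the scan chains used for scan testing. SDSFF (State Dependent Scan Flip-Flop) has therefore been proposed as a secure scan architecture that preserves testability while providing high security against scan-based attacks. In SDSFF, the timing at which the value of the latch attached to a scan flip-flop is updated is an important issue. In this paper, we propose an update timing that enables online testing. The proposed update timing is determined by comparing the values of arbitrary flip-flops on the scan chain with a value fixed at circuit design time. We implemented the proposed secure scan architecture on an RSA cipher circuit and evaluated it. The experimental results show that the area overhead with 100 SDSFFs is at most 0.555%, which is smaller than that of conventional methods.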

CiNii
鍵ベース構成の State Dependent Scan Flip-Flop を用いたセキュアスキャンアーキテクチャのRSA暗号回路への実装

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. IE, 画像工学 112 ( 248 ) 95 - 100 2012年10月

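Scan testing is a widely used design-for-test technique with high fault coverage. However, it has been pointed out that the secret key of a cryptographic LSI may be recovered through the scan chains used for scan testing. SDSFF (State Dependent Scan Flip-Flop) has therefore been proposed as a secure scan architecture that preserves testability while providing high security against scan-based attacks. In SDSFF, the timing at which the value of the latch attached to a scan flip-flop is updated is an important issue. In this paper, we propose an update timing that enables online testing. The proposed update timing is determined by comparing the values of arbitrary flip-flops on the scan chain with a value fixed at circuit design time. We implemented the proposed secure scan architecture on an RSA cipher circuit and evaluated it. The experimental results show that the area overhead with 100 SDSFFs is at most 0.555%, which is smaller than that of conventional methods.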

CiNii
A-17-7 略地図を用いた迷いにくい歩行者向けナビゲーション(A-17.ITS,一般セッション)

辻和貴, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2012 166 - 166 2012年08月

CiNii
A-3-4 クロックの立下りを利用した耐故障攻撃AES暗号回路(A-3.VLSI設計技術,一般セッション)

五十嵐博昭, 史又華, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2012 51 - 51 2012年08月

CiNii
A-3-1 キャッシュ構成の高速シミュレーションを利用したIL1およびUL2キャッシュに不揮発メモリ(A-3.VLSI設計技術,一般セッション)

松野翔太, 多和田雅師, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2012 48 - 48 2012年08月

CiNii
A-3-5 Feedback付きState Dependent Scan Flip-Flopを用いたセキュアスキャンアーキテクチャ(A-3.VLSI設計技術,一般セッション)

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2012 52 - 52 2012年08月

CiNii
高集積かつ高周波な回路に対応した複数電源電圧指向の高位合成手法

阿部晋矢, 柳澤政生, 戸川望

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 25 160 - 165 2012年07月

CiNii
複数のキャッシュ構成を同時に表現するデータ構造とこれを用いた高速で正確な2コアキャッシュシミュレーション

多和田雅師, 柳澤政生, 戸川望

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 25 414 - 419 2012年07月

CiNii
State Dependent Scan Flip Flop を用いたRSA暗号回路へのセキュアスキャンアーキテクチャの実装

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. SIP, 信号処理 : IEICE technical report 112 ( 115 ) 115 - 120 2012年06月

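Scan testing, a representative design-for-test technique, connects the flip-flops (FFs) inside an LSI in series so that they can be freely controlled and observed from outside, which enables efficient fault detection. Meanwhile, scan-based attacks, which use the scan chains provided for scan testing to recover the secret key of a cryptographic LSI, have been attracting attention. Testability and security are generally conflicting properties, but circuit designs that achieve both are needed. In this paper, we propose a secure scan architecture that retains the testability of scan testing while resisting scan-based attacks. The proposed method adds latches to arbitrary FFs in the scan chain and uses past FF values to turn the scan data into data that an attacker cannot decipher. Because the FF values change, the scan data can be changed dynamically. Even though the data are undecipherable to an attacker, the tester knows the structure of the added circuitry and can therefore perform the same tests as with ordinary scan testing. We implemented the proposed secure scan architecture on an RSA cipher circuit and evaluated it.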

CiNii
State Dependent Scan Flip Flop を用いたRSA暗号回路へのセキュアスキャンアーキテクチャの実装

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. MSS, システム数理と応用 : IEICE technical report 112 ( 116 ) 115 - 120 2012年06月

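Scan testing, a representative design-for-test technique, connects the flip-flops (FFs) inside an LSI in series so that they can be freely controlled and observed from outside, which enables efficient fault detection. Meanwhile, scan-based attacks, which use the scan chains provided for scan testing to recover the secret key of a cryptographic LSI, have been attracting attention. Testability and security are generally conflicting properties, but circuit designs that achieve both are needed. In this paper, we propose a secure scan architecture that retains the testability of scan testing while resisting scan-based attacks. The proposed method adds latches to arbitrary FFs in the scan chain and uses past FF values to turn the scan data into data that an attacker cannot decipher. Because the FF values change, the scan data can be changed dynamically. Even though the data are undecipherable to an attacker, the tester knows the structure of the added circuitry and can therefore perform the same tests as with ordinary scan testing. We implemented the proposed secure scan architecture on an RSA cipher circuit and evaluated it.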

CiNii
State Dependent Scan Flip Flop を用いたRSA暗号回路へのセキュアスキャンアーキテクチャの実装

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. CAS, 回路とシステム 112 ( 113 ) 115 - 120 2012年06月

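Scan testing, a representative design-for-test technique, connects the flip-flops (FFs) inside an LSI in series so that they can be freely controlled and observed from outside, which enables efficient fault detection. Meanwhile, scan-based attacks, which use the scan chains provided for scan testing to recover the secret key of a cryptographic LSI, have been attracting attention. Testability and security are generally conflicting properties, but circuit designs that achieve both are needed. In this paper, we propose a secure scan architecture that retains the testability of scan testing while resisting scan-based attacks. The proposed method adds latches to arbitrary FFs in the scan chain and uses past FF values to turn the scan data into data that an attacker cannot decipher. Because the FF values change, the scan data can be changed dynamically. Even though the data are undecipherable to an attacker, the tester knows the structure of the added circuitry and can therefore perform the same tests as with ordinary scan testing. We implemented the proposed secure scan architecture on an RSA cipher circuit and evaluated it.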

CiNii
State Dependent Scan Flip Flop を用いたRSA暗号回路へのセキュアスキャンアーキテクチャの実装

跡部悠太, 史又華, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 112 ( 114 ) 115 - 120 2012年06月

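Scan testing, a representative design-for-test technique, connects the flip-flops (FFs) inside an LSI in series so that they can be freely controlled and observed from outside, which enables efficient fault detection. Meanwhile, scan-based attacks, which use the scan chains provided for scan testing to recover the secret key of a cryptographic LSI, have been attracting attention. Testability and security are generally conflicting properties, but circuit designs that achieve both are needed. In this paper, we propose a secure scan architecture that retains the testability of scan testing while resisting scan-based attacks. The proposed method adds latches to arbitrary FFs in the scan chain and uses past FF values to turn the scan data into data that an attacker cannot decipher. Because the FF values change, the scan data can be changed dynamically. Even though the data are undecipherable to an attacker, the tester knows the structure of the added circuitry and can therefore perform the same tests as with ordinary scan testing. We implemented the proposed secure scan architecture on an RSA cipher circuit and evaluated it.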

CiNii
HDRアーキテクチャを対象とした高速かつ効率的な複数電源電圧指向の高位合成手法

阿部晋矢, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 2 ) 1 - 6 2012年05月

With the emergence of highly integrated, highly functional LSI fabrication technologies, LSI design must take energy efficiency and interconnection delay into account. Multiple supply voltages, one of the low-power techniques, are more effective the earlier in the design flow they are considered. High-level synthesis must also look ahead to floorplanning, a later design stage, and account for the effect of interconnection delay. The HDR architecture has been proposed as a platform that integrates multiple supply voltages and interconnection delays into high-level synthesis. In this paper, we propose a fast and efficient multiple-supply-voltage-aware high-level synthesis method for HDR architectures. To obtain solutions quickly and efficiently, we propose two techniques: "highly convergent area estimation" (called "virtual area estimation" in the English abstract) and "floorplanning-directed huddling". The former reduces the oscillation of area during iterations, which hindered convergence in the conventional method. The latter determines huddle composition efficiently by deciding which functional units belong to each huddle at the same time as floorplanning. Experimental results show that the proposed method reduces run time by about 40% compared with the conventional method.

CiNii
HDRアーキテクチャを対象とした高速かつ効率的な複数電源電圧指向の高位合成手法

阿部晋矢, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 112 ( 71 ) 7 - 12 2012年05月

高集積,高機能なLSI加工技術の出現により,エネルギー効率と配線遅延を意識したLSI設計が求められる.低電力化技術の1つである複数電源電圧は,設計の上位工程で意識するほど効果が高い.また,設計の下位工程であるフロアプランまで意識し,配線遅延の影響を考えた高位合成が必要となっている.複数電源電圧と配線遅延を高位合成に統合するプラットフォームとしてHDRアーキテクチャが提案された.本稿では,HDRアーキテクチャを対象に高速かつ効率的な複数電源電圧指向の高位合成を提案する.高速かつ効率的に解を得るため,「高収束な面積見積もり」と「フロアプラン指向ハドル合成」を提案する.「高収束な面積見積もり」は,従来手法において収束の妨げとなっていた反復中の面積の振動を削減する.「フロアプラン指向ハドル合成」は,ハドルに所属する演算器をフロアプランと同時に決定することで効率的にハドルの構成を決定する.計算機実験結果より提案手法は従来手法と比較し,約40%実行時間が削減された.

CiNii
A-3-7 多数ビデオ入力に対する画像認識ハードウェアの制御方式の提案(A-3.VLSI設計技術,一般セッション)

大塚卓哉, 細谷英一, 青木孝, 小野澤晃, 李昇周, 戸川望

電子情報通信学会総合大会講演論文集 2012 91 - 91 2012年03月

CiNii
2コアプロセッサL1キャッシュ構成の正確で高速なシミュレーション手法

多和田雅師, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 3 ) 1 - 6 2012年02月

近年，複数のコアをもつ組込みプロセッサが増えている．アプリケーションが限定される組込みシステムでは，速度や電力，面積の点で最適なキャッシュメモリが存在する．限定されたアプリケーションに対して複数のキャッシュ構成それぞれで動作シミュレーションを行うことで，キャッシュメモリ設計時に最適なキャッシュ構成を判定できる．マルチコアキャッシュ構成のシミュレーションは複雑になりシングルコアキャッシュ構成のシミュレーションよりも時間がかかってしまう．シングルコアプロセッサのキャッシュ構成ではシミュレーションの高速化手法が研究されているが，マルチコアプロセッサのキャッシュ構成ではシミュレーション高速化手法の研究は進んでいない．本稿では 2 コアプロセッサ L1 キャッシュのキャッシュ構成シミュレーションの高速化手法を提案する．マルチコアプロセッサではキャッシュコヒーレンシプロトコルがあり，複数の似たキャッシュ構成であっても内部状態が異なる場合が多い．そこでキャッシュコヒーレンシプロトコルの状態遷移とキャッシュ連想度に関する性質を利用することで 1 つのデータ構造で連想度の異なる複数のキャッシュ構成を表現する手法を提案する．複数のキャッシュ構成を 1 つのデータ構造で表し探索や更新の範囲を少なくすることで，シミュレーションの高速化を図る．Recently, multiple-core processors are used in embedded systems very often. Since application programs running are much limited on embedded systems, there must exist an optimal cache memory in terms of power and area. Simulating application programs on various cache configurations is one of the best options to determine the optimal cache configuration. Multicore cache configuration simulation, however, is much more complicated and takes much more time than single-core cache configuration simulation. In this paper, we propose a very fast two-core L1 cache configuration simulation algorithm. We first propose a new data structure where just a single data structure represents two or more multicore cache configurations with different cache associativities. After that, we propose a new multicore cache configuration simulation algorithm using our new data structure associated with new theorems.

CiNii
RDRアーキテクチャを対象とした部分2重化フォールトセキュア高位合成手法

田中翔, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 4 ) 1 - 6 2012年02月

半導体の微細化技術の向上に伴い，ソフトエラーによる信頼性低下が問題となっている．そのため，LSI にエラー検出機能を組み込むフォールトセキュア設計の必要性が高まっている．一方，微細化技術の向上によりゲート遅延より配線遅延が支配的となったため，高位合成段階で配線遅延を予測する必要が生じている．配線長が不定である従来のレジスタ集中型アーキテクチャに対し，レジスタをチップ内に均等に配置することで配線長を一定とする RDR アーキテクチャが提案されている．本稿では RDR アーキテクチャを対象とした，部分 2 重化によるフォールトセキュア高位合成手法を提案する．提案手法では入力 CDFG の演算ノードを一部 2 重化することで，レイテンシ制約内で信頼性を最大化する．RDR アーキテクチャで生じる空き領域をフォールトセキュア設計に利用することで面積効率を向上させると同時に，2 重化可能な演算ノード数を増加させる．続いて，挿入比較ノード数を最小化するスケジューリング・バインディングを行うことで余分な演算器動作を抑制し，信頼性向上を図る．計算機実験により提案手法は，フォールトセキュア設計を利用しない手法と比して最大 57% 信頼性を向上させるフォールトセキュア高位合成が可能であることを確認した．

As device feature size decreases, improving reliability against soft errors becomes necessary. A fault-secure system, in which concurrent error detection is realized, is one of the solutions to this problem. On the other hand, the average interconnect delay exceeds the gate delay, which leads to the timing closure problem. By using a regular-distributed-register architecture (RDR architecture), we can estimate interconnection delays very accurately, and the influence of interconnects can be much reduced even at the behavioral level. In this paper, we propose a partial redundant fault-secure high-level synthesis algorithm for an RDR architecture. In fault-secure high-level synthesis, a re-computation CDFG, which is a part of the normal-computation CDFG, must be scheduled and bound to functional units. Firstly, our algorithm re-uses vacant areas on RDR islands to allocate additional functional units for the re-computation CDFG. Secondly, we propose a scheduling algorithm which minimizes the number of inserted comparator nodes. We show the effectiveness of the proposed algorithm through experimental results. Our algorithm reduces the soft error rate by an average of 57% compared with the non-fault-secure approach.

CiNii
セレクタ論理を利用した高速補間演算器設計

岩田愛実, 吉原弘峰, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2012 ( 7 ) 1 - 6 2012年02月

補間演算は既知のデータ列を基にして各区間の範囲内を埋める数値または関数を求める演算で，画像の拡大，縮小や魚眼画像の補正といった処理に利用される．キュービックスプライン補間は周囲 4 点から 3 次関数を用いることで補間を行うため精度が高く，より滑らかな補間ができるため実用的に用いられる．しかし，キュービックスプライン補間では扱う既知データが多く，計算が複雑なために処理に時間がかかる．そのため，補間演算処理をリアルタイムに行うには演算の高速化が必要である．本稿では，補間演算器にセレクタ論理を組み込むことで桁上げ伝搬遅延を削減し，演算器を高速化する手法を提案する．周囲 2 点を基に補間を行う線形補間では，算術演算子を用いて設計した従来の線形補間演算器に比べ，遅延時間は最大15％削減された．キュービックスプライン補間演算では，従来のキュービックスプライン補間演算器に比べ，遅延時間は最大 25％削減された．Interpolation is a technique that fills the gaps between existing data, which is often applied to image scaling and superresolution. Cubic spline interpolation, one of the interpolation techniques, obtains a cubic function based on the four existing points and fills their gaps very smoothly and precisely. However, it takes a lot of time because it requires many data and complex calculation. Speeding-up cubic spline interpolation is the key to realize a practical image scaling system. In this paper, we firstly focus on linear interpolation and propose a high-speed linear interpolation circuit based on "selector logics." Secondly, we propose a high-speed cubic spline interpolation circuit composed of our proposed linear interpolation circuits. Experimental results demonstrate that our linear interpolation circuit improves the performance by 15% and that our cubic interpolation circuit improves the performance by 25 %, compared to a conventional interpolation design.

CiNii
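The two interpolation schemes named in the abstract above (linear interpolation from 2 neighbouring points, cubic interpolation from 4 neighbouring points) can be illustrated in software as follows. This is only a numerical sketch: the abstract does not state which cubic kernel is used, so the Catmull-Rom weights here are an assumption, and the function names `lerp`, `cubic_catmull_rom`, and `upscale_row` are hypothetical. The selector-logic circuit proposed in the paper is not modelled.

```python
def lerp(p0, p1, t):
    """Linear interpolation between two neighbouring samples (0 <= t <= 1)."""
    return p0 + (p1 - p0) * t


def cubic_catmull_rom(p0, p1, p2, p3, t):
    """Cubic interpolation between p1 and p2 using four neighbouring samples.

    Assumption: Catmull-Rom weights; the paper's exact kernel is not given.
    """
    return 0.5 * (
        2.0 * p1
        + (-p0 + p2) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t ** 3
    )


def upscale_row(row, factor):
    """Upscale a 1-D row of samples by an integer factor, clamping at the edges."""
    n = len(row)

    def sample(j):
        # Clamp the index so that border pixels are repeated.
        return row[min(max(j, 0), n - 1)]

    out = []
    for i in range((n - 1) * factor + 1):
        k, r = divmod(i, factor)
        t = r / factor
        out.append(
            cubic_catmull_rom(sample(k - 1), sample(k), sample(k + 1), sample(k + 2), t)
        )
    return out
```

For example, `upscale_row([0, 10], 2)` inserts one interpolated sample between the two inputs. Each output sample costs several multiplications and additions, which is the computational load the abstract refers to when it motivates a faster arithmetic unit.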
2コアプロセッサL1キャッシュ構成の正確で高速なシミュレーション手法

多和田雅師, 柳澤政生, 戸川望

研究報告組込みシステム（EMB） 2012 ( 3 ) 1 - 6 2012年02月

近年，複数のコアをもつ組込みプロセッサが増えている．アプリケーションが限定される組込みシステムでは，速度や電力，面積の点で最適なキャッシュメモリが存在する．限定されたアプリケーションに対して複数のキャッシュ構成それぞれで動作シミュレーションを行うことで，キャッシュメモリ設計時に最適なキャッシュ構成を判定できる．マルチコアキャッシュ構成のシミュレーションは複雑になりシングルコアキャッシュ構成のシミュレーションよりも時間がかかってしまう．シングルコアプロセッサのキャッシュ構成ではシミュレーションの高速化手法が研究されているが，マルチコアプロセッサのキャッシュ構成ではシミュレーション高速化手法の研究は進んでいない．本稿では 2 コアプロセッサ L1 キャッシュのキャッシュ構成シミュレーションの高速化手法を提案する．マルチコアプロセッサではキャッシュコヒーレンシプロトコルがあり，複数の似たキャッシュ構成であっても内部状態が異なる場合が多い．そこでキャッシュコヒーレンシプロトコルの状態遷移とキャッシュ連想度に関する性質を利用することで 1 つのデータ構造で連想度の異なる複数のキャッシュ構成を表現する手法を提案する．複数のキャッシュ構成を 1 つのデータ構造で表し探索や更新の範囲を少なくすることで，シミュレーションの高速化を図る．Recently, multiple-core processors are used in embedded systems very often. Since application programs running are much limited on embedded systems, there must exist an optimal cache memory in terms of power and area. Simulating application programs on various cache configurations is one of the best options to determine the optimal cache configuration. Multicore cache configuration simulation, however, is much more complicated and takes much more time than single-core cache configuration simulation. In this paper, we propose a very fast two-core L1 cache configuration simulation algorithm. We first propose a new data structure where just a single data structure represents two or more multicore cache configurations with different cache associativities. After that, we propose a new multicore cache configuration simulation algorithm using our new data structure associated with new theorems.

CiNii
RDRアーキテクチャを対象とした部分2重化フォールトセキュア高位合成手法

田中翔, 柳澤政生, 戸川望

研究報告組込みシステム（EMB） 2012 ( 4 ) 1 - 6 2012年02月

半導体の微細化技術の向上に伴い，ソフトエラーによる信頼性低下が問題となっている．そのため，LSI にエラー検出機能を組み込むフォールトセキュア設計の必要性が高まっている．一方，微細化技術の向上によりゲート遅延より配線遅延が支配的となったため，高位合成段階で配線遅延を予測する必要が生じている．配線長が不定である従来のレジスタ集中型アーキテクチャに対し，レジスタをチップ内に均等に配置することで配線長を一定とする RDR アーキテクチャが提案されている．本稿では RDR アーキテクチャを対象とした，部分 2 重化によるフォールトセキュア高位合成手法を提案する．提案手法では入力 CDFG の演算ノードを一部 2 重化することで，レイテンシ制約内で信頼性を最大化する．RDR アーキテクチャで生じる空き領域をフォールトセキュア設計に利用することで面積効率を向上させると同時に，2 重化可能な演算ノード数を増加させる．続いて，挿入比較ノード数を最小化するスケジューリング・バインディングを行うことで余分な演算器動作を抑制し，信頼性向上を図る．計算機実験により提案手法は，フォールトセキュア設計を利用しない手法と比して最大 57% 信頼性を向上させるフォールトセキュア高位合成が可能であることを確認した．

As device feature size decreases, improving reliability against soft errors becomes necessary. A fault-secure system, in which concurrent error detection is realized, is one of the solutions to this problem. On the other hand, the average interconnect delay exceeds the gate delay, which leads to the timing closure problem. By using a regular-distributed-register architecture (RDR architecture), we can estimate interconnection delays very accurately, and the influence of interconnects can be much reduced even at the behavioral level. In this paper, we propose a partial redundant fault-secure high-level synthesis algorithm for an RDR architecture. In fault-secure high-level synthesis, a re-computation CDFG, which is a part of the normal-computation CDFG, must be scheduled and bound to functional units. Firstly, our algorithm re-uses vacant areas on RDR islands to allocate additional functional units for the re-computation CDFG. Secondly, we propose a scheduling algorithm which minimizes the number of inserted comparator nodes. We show the effectiveness of the proposed algorithm through experimental results. Our algorithm reduces the soft error rate by an average of 57% compared with the non-fault-secure approach.

CiNii
セレクタ論理を利用した高速補間演算器設計

岩田愛実, 吉原弘峰, 柳澤政生, 戸川望

研究報告組込みシステム（EMB） 2012 ( 7 ) 1 - 6 2012年02月

補間演算は既知のデータ列を基にして各区間の範囲内を埋める数値または関数を求める演算で，画像の拡大，縮小や魚眼画像の補正といった処理に利用される．キュービックスプライン補間は周囲 4 点から 3 次関数を用いることで補間を行うため精度が高く，より滑らかな補間ができるため実用的に用いられる．しかし，キュービックスプライン補間では扱う既知データが多く，計算が複雑なために処理に時間がかかる．そのため，補間演算処理をリアルタイムに行うには演算の高速化が必要である．本稿では，補間演算器にセレクタ論理を組み込むことで桁上げ伝搬遅延を削減し，演算器を高速化する手法を提案する．周囲 2 点を基に補間を行う線形補間では，算術演算子を用いて設計した従来の線形補間演算器に比べ，遅延時間は最大15％削減された．キュービックスプライン補間演算では，従来のキュービックスプライン補間演算器に比べ，遅延時間は最大 25％削減された．Interpolation is a technique that fills the gaps between existing data, which is often applied to image scaling and superresolution. Cubic spline interpolation, one of the interpolation techniques, obtains a cubic function based on the four existing points and fills their gaps very smoothly and precisely. However, it takes a lot of time because it requires many data and complex calculation. Speeding-up cubic spline interpolation is the key to realize a practical image scaling system. In this paper, we firstly focus on linear interpolation and propose a high-speed linear interpolation circuit based on "selector logics." Secondly, we propose a high-speed cubic spline interpolation circuit composed of our proposed linear interpolation circuits. Experimental results demonstrate that our linear interpolation circuit improves the performance by 15% and that our cubic interpolation circuit improves the performance by 25 %, compared to a conventional interpolation design.

CiNii
2コアプロセッサL1キャッシュ構成の正確で高速なシミュレーション手法

多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 111 ( 462 ) 13 - 18 2012年02月

近年,複数のコアをもつ組込みプロセッサが増えている.アプリケーションが限定される組込みシステムでは,速度や電力,面積の点で最適なキャッシュメモリが存在する.限定されたアプリケーションに対して複数のキャッシュ構成それぞれで動作シミュレーションを行うことで,キャッシュメモリ設計時に最適なキャッシュ構成を判定できる.マルチコアキャッシュ構成のシミュレーションは複雑になりシングルコアキャッシュ構成のシミュレーションよりも時間がかかってしまう.シングルコアプロセッサのキャッシュ構成ではシミュレーションの高速化手法が研究されているが,マルチコアプロセッサのキャッシュ構成ではシミュレーション高速化手法の研究は進んでいない.本稿では2コアプロセッサL1キャッシュのキャッシュ構成シミュレーションの高速化手法を提案する.マルチコアプロセッサではキャッシュコヒーレンシプロトコルがあり,複数の似たキャッシュ構成であっても内部状態が異なる場合が多い.そこでキャッシュコヒーレンシプロトコルの状態遷移とキャッシュ連想度に関する性質を利用することで1つのデータ構造で連想度の異なる複数のキャッシュ構成を表現する手法を提案する.複数のキャッシュ構成を1つのデータ構造で表し探索や更新の範囲を少なくすることで,シミュレーションの高速化を図る.

CiNii
RDRアーキテクチャを対象とした部分2重化フォールトセキュア高位合成手法

田中翔, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 111 ( 462 ) 19 - 24 2012年02月

半導体の微細化技術の向上に伴い,ソフトエラーによる信頼性低下が問題となっている.そのため,LSIにエラー検出機能を組み込むフォールトセキュア設計の必要性が高まっている.一方,微細化技術の向上によりゲート遅延より配線遅延が支配的となったため,高位合成段階で配線遅延を予測する必要が生じている.配線長が不定である従来のレジスタ集中型アーキテクチャに対し,レジスタをチップ内に均等に配置することで配線長を一定とするRDRアーキテクチャが提案されている.本稿ではRDRアーキテクチャを対象とした,部分2重化によるフォールトセキュア高位合成手法を提案する.提案手法では入力CDFGの演算ノードを一部2重化することで,レイテンシ制約内で信頼性を最大化する.RDRアーキテクチャで生じる空き領域をフォールトセキュア設計に利用することで面積効率を向上させると同時に, 2重化可能な演算ノード数を増加させる.続いて,挿入比較ノード数を最小化するスケジューリング・バインディングを行うことで余分な演算器動作を抑制し,信頼性向上を図る.計算機実験により提案手法は,フォールトセキュア設計を利用しない手法と比して最大57%信頼性を向上させるフォールトセキュア高位合成が可能であることを確認した.

CiNii
セレクタ論理を利用した高速補間演算器設計

岩田愛実, 吉原弘峰, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 111 ( 462 ) 37 - 42 2012年02月

補間演算は既知のデータ列を基にして各区間の範囲内を埋める数値または関数を求める演算で,画像の拡大,縮小や魚眼画像の補正といった処理に利用される.キュービックスプライン補間は周囲4点から3次関数を用いることで補間を行うため精度が高く,より滑らかな補間ができるため実用的に用いられる.しかし,キュービックスプライン補間では扱う既知データが多く,計算が複雑なために処理に時間がかかる.そのため,補間演算処理をリアルタイムに行うには演算の高速化が必要である.本稿では,補間演算器にセレクタ論理を組み込むことで桁上げ伝搬遅延を削減し,演算器を高速化する手法を提案する.周囲2点を基に補間を行う線形補間では,算術演算子を用いて設計した従来の線形補間演算器に比べ,遅延時間は最大15%削減された.キュービックスプライン補間演算では,従来のキュービックスプライン補間演算器に比べ,遅延時間は最大25%削減された.

CiNii
2コアプロセッサL1キャッシュ構成の正確で高速なシミュレーション手法

多和田雅師, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 111 ( 461 ) 13 - 18 2012年02月

近年,複数のコアをもつ組込みプロセッサが増えている.アプリケーションが限定される組込みシステムでは,速度や電力,面積の点で最適なキャッシュメモリが存在する.限定されたアプリケーションに対して複数のキャッシュ構成それぞれで動作シミュレーションを行うことで,キャッシュメモリ設計時に最適なキャッシュ構成を判定できる.マルチコアキャッシュ構成のシミュレーションは複雑になりシングルコアキャッシュ構成のシミュレーションよりも時間がかかってしまう.シングルコアプロセッサのキャッシュ構成ではシミュレーションの高速化手法が研究されているが,マルチコアプロセッサのキャッシュ構成ではシミュレーション高速化手法の研究は進んでいない.本稿では2コアプロセッサL1キャッシュのキャッシュ構成シミュレーションの高速化手法を提案する.マルチコアプロセッサではキャッシュコヒーレンシプロトコルがあり,複数の似たキャッシュ構成であっても内部状態が異なる場合が多い.そこでキャッシュコヒーレンシプロトコルの状態遷移とキャッシュ連想度に関する性質を利用することで1つのデータ構造で連想度の異なる複数のキャッシュ構成を表現する手法を提案する.複数のキャッシュ構成を1つのデータ構造で表し探索や更新の範囲を少なくすることで,シミュレーションの高速化を図る.

CiNii
RDRアーキテクチャを対象とした部分2重化フォールトセキュア高位合成手法

田中翔, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 111 ( 461 ) 19 - 24 2012年02月

半導体の微細化技術の向上に伴い,ソフトエラーによる信頼性低下が問題となっている.そのため,LSIにエラー検出機能を組み込むフォールトセキュア設計の必要性が高まっている.一方,微細化技術の向上によりゲート遅延より配線遅延が支配的となったため,高位合成段階で配線遅延を予測する必要が生じている.配線長が不定である従来のレジスタ集中型アーキテクチャに対し,レジスタをチップ内に均等に配置することで配線長を一定とするRDRアーキテクチャが提案されている.本稿ではRDRアーキテクチャを対象とした,部分2重化によるフォールトセキュア高位合成手法を提案する.提案手法では入力CDFGの演算ノードを一部2重化することで,レイテンシ制約内で信頼性を最大化する.RDRアーキテクチャで生じる空き領域をフォールトセキュア設計に利用することで面積効率を向上させると同時に, 2重化可能な演算ノード数を増加させる.続いて,挿入比較ノード数を最小化するスケジューリング・バインディングを行うことで余分な演算器動作を抑制し,信頼性向上を図る.計算機実験により提案手法は,フォールトセキュア設計を利用しない手法と比して最大57%信頼性を向上させるフォールトセキュア高位合成が可能であることを確認した.

CiNii
セレクタ論理を利用した高速補間演算器設計

岩田愛実, 吉原弘峰, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 111 ( 461 ) 37 - 42 2012年02月

補間演算は既知のデータ列を基にして各区間の範囲内を埋める数値または関数を求める演算で,画像の拡大,縮小や魚眼画像の補正といった処理に利用される.キュービックスプライン補間は周囲4点から3次関数を用いることで補間を行うため精度が高く,より滑らかな補間ができるため実用的に用いられる.しかし,キュービックスプライン補間では扱う既知データが多く,計算が複雑なために処理に時間がかかる.そのため,補間演算処理をリアルタイムに行うには演算の高速化が必要である.本稿では,補間演算器にセレクタ論理を組み込むことで桁上げ伝搬遅延を削減し,演算器を高速化する手法を提案する.周囲2点を基に補間を行う線形補間では,算術演算子を用いて設計した従来の線形補間演算器に比べ,遅延時間は最大15%削減された.キュービックスプライン補間演算では,従来のキュービックスプライン補間演算器に比べ,遅延時間は最大25%削減された.

CiNii
センサネットワーク低消費電力化のためのS-MACプロトコルduty cycle最適化手法

大岸龍司, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2012 ( 1 ) 2012年

J-GLOBAL
空間認知を利用した歩行者のための屋内ナビゲーションシステム設計

杉岡基行, 柳澤政生, 戸川望, 石川知夏, 新垣紀子, 小野澤晃

情報処理学会シンポジウムシリーズ(CD-ROM) 2012 ( 1 ) 2012年

J-GLOBAL
可視グラフによる屋内環境モデル化に基づく屋内環境向けナビゲーションシステム

町田直哉, 柳澤政生, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2012 ( 1 ) 2012年

J-GLOBAL
HDRアーキテクチャを対象としたクロックゲーティングを用いた低電力化高位合成手法

赤坂宏行, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2012 ( 5 ) 2012年

J-GLOBAL
温度特性を考慮したRDRアーキテクチャ向け高位合成手法

川村一志, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2012 ( 5 ) 2012年

J-GLOBAL
State Dependent Scan Flip Flopを用いた暗号回路へのセキュアスキャンアーキテクチャの実装

跡部悠太, SHI Youhua, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2012 ( 5 ) 2012年

J-GLOBAL
動的複数電源電圧およびフロアプラン統合化アーキテクチャを対象とした低電力化高位合成手法

阿部晋矢, 宇佐美公良, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2012 ( 5 ) 2012年

J-GLOBAL
2コアプロセッサを対象とする正確で高速なヘテロL1キャッシュシミュレーション

多和田雅師, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2012 ( 5 ) 2012年

J-GLOBAL
組込みプロセッサのための超高速なオンチップメモリ最適化技術(継続)

戸川望

電気通信普及財団研究調査報告書(CD-ROM) ( 27 ) 2012年

J-GLOBAL
クロックグリッチを利用した故障攻撃に対するカウンタを用いた耐タンパAES暗号回路

五十嵐博昭, SHI Youhua, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2012 ( 5 ) 2012年

J-GLOBAL
キャッシュ構成の高速シミュレーションを利用した不揮発メモリによる二階層キャッシュ構成の評価

松野翔太, 多和田雅師, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2012 ( 5 ) 2012年

J-GLOBAL
スキャンシグネチャを用いたTriple DESに対するスキャンベース攻撃手法 (VLSI設計技術)

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 : 信学技報 111 ( 324 ) 7 - 12 2011年11月

テスト容易化技術の1つであるスキャンパステストは,LSIのレジスタを外部から直接観測・制御することが可能であるためLSIの検証に非常に役立つ.一方で,暗号モジュールや暗号LSIに対するサイドチャネル攻撃の危険性が指摘されており,その中でもスキャンパステストで使用するテスト用スキャンチェインから取得可能なスキャンデータから秘密鍵を解読するスキャンベース攻撃が注目されている.従来研究として,共通鍵暗号DESやAES,公開鍵暗号RSAや楕円曲線暗号に対するスキャンベース攻撃手法が提案されているが,共通鍵暗号Triple DESに対するスキャンベース攻撃手法は報告されていない.本稿では,共通鍵暗号Triple DESに対するスキャンシグネチャを用いたスキャンベース攻撃手法を提案する.提案手法では,暗号LSIに複数の平文を入力したときのスキャンデータの特定のビット列に着目し,対応するレジスタの変化を観察することで秘密鍵を解読する.暗号LSI以外のレジスタがスキャンチェインに含まれる場合や,暗号LSIの動作タイミングが不明な場合でも秘密鍵の解読が可能となる.Triple DESは暗号化のために秘密鍵を3つ使用するため,最初に解読した秘密鍵を用いて他の秘密鍵の解読を行うことで3つの秘密鍵の解読を実行する.提案手法では,多くても43個の平文でTriple DESの秘密鍵解読をできる結果が得られた.

CiNii
スキャンシグネチャを用いたTriple DESに対するスキャンベース攻撃手法 (ディペンダブルコンピューティング)

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告 : 信学技報 111 ( 325 ) 7 - 12 2011年11月

テスト容易化技術の1つであるスキャンパステストは,LSIのレジスタを外部から直接観測・制御することが可能であるためLSIの検証に非常に役立つ.一方で,暗号モジュールや暗号LSIに対するサイドチャネル攻撃の危険性が指摘されており,その中でもスキャンパステストで使用するテスト用スキャンチェインから取得可能なスキャンデータから秘密鍵を解読するスキャンベース攻撃が注目されている.従来研究として,共通鍵暗号DESやAES,公開鍵暗号RSAや楕円曲線暗号に対するスキャンベース攻撃手法が提案されているが,共通鍵暗号Triple DESに対するスキャンベース攻撃手法は報告されていない.本稿では,共通鍵暗号Triple DESに対するスキャンシグネチャを用いたスキャンベース攻撃手法を提案する.提案手法では,暗号LSIに複数の平文を入力したときのスキャンデータの特定のビット列に着目し,対応するレジスタの変化を観察することで秘密鍵を解読する.暗号LSI以外のレジスタがスキャンチェインに含まれる場合や,暗号LSIの動作タイミングが不明な場合でも秘密鍵の解読が可能となる.Triple DESは暗号化のために秘密鍵を3つ使用するため,最初に解読した秘密鍵を用いて他の秘密鍵の解読を行うことで3つの秘密鍵の解読を実行する.提案手法では,多くても43個の平文でTriple DESの秘密鍵解読をできる結果が得られた.

CiNii
スキャンシグネチャを用いた Triple DES に対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 111 ( 325 ) 7 - 12 2011年11月

CiNii
スキャンシグネチャを用いた Triple DES に対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 111 ( 324 ) 7 - 12 2011年11月

CiNii
スキャンシグネチャを用いたTriple DESに対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

研究報告システムLSI設計技術（SLDM） 2011 ( 2 ) 1 - 6 2011年11月

テスト容易化技術の 1 つであるスキャンパステストは，LSI のレジスタを外部から直接観測・制御することが可能であるため LSI の検証に非常に役立つ．一方で，暗号モジュールや暗号 LSI に対するサイドチャネル攻撃の危険性が指摘されており，その中でもスキャンパステストで使用するテスト用スキャンチェインから取得可能なスキャンデータから秘密鍵を解読するスキャンベース攻撃が注目されている．従来研究として，共通鍵暗号 DES や AES，公開鍵暗号 RSA や楕円曲線暗号に対するスキャンベース攻撃手法が提案されているが，共通鍵暗号 Triple DES に対するスキャンベース攻撃手法は報告されていない．本稿では，共通鍵暗号 Triple DES に対するスキャンシグネチャを用いたスキャンベース攻撃手法を提案する．提案手法では，暗号 LSI に複数の平文を入力したときのスキャンデータの特定のビット列に着目し，対応するレジスタの変化を観察することで秘密鍵を解読する．暗号 LSI 以外のレジスタがスキャンチェインに含まれる場合や，暗号 LSI の動作タイミングが不明な場合でも秘密鍵の解読が可能となる．Triple DES は暗号化のために秘密鍵を 3 つ使用するため，最初に解読した秘密鍵を用いて他の秘密鍵の解読を行うことで 3 つの秘密鍵の解読を実行する．提案手法では，多くても 43 個の平文で Triple DES の秘密鍵解読をできる結果が得られた．

Scan-path test is one of the useful design-for-test techniques, which can observe and control registers inside LSIs. On the other hand, a scan-based attack which retrieves secret keys from scanned data is considered to be one of the strongest side-channel attacks. In this paper, a scan-based attack method against Triple DES cryptosystems using a "scan signature" is proposed. In our method, several plaintexts are input into a Triple DES module and an attacker obtains scanned data. Then, the attacker observes a specific bit line (scan signature) of these scanned data to retrieve a secret key. The Triple DES algorithm uses three secret keys. The first secret key can be retrieved in the same way as a secret key is retrieved from a DES module. How to retrieve the second and third secret keys is the main concern. In our proposed method, we retrieve the second and third secret keys by using the retrieved first key and setting an appropriate scan signature. Experimental results show that our proposed method successfully retrieves three secret keys in a Triple DES module using up to 43 plaintexts.

CiNii
スキャンチェイン構造に依存しないDESに対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. SIP, 信号処理 : IEICE technical report 111 ( 257 ) 61 - 66 2011年10月

近年,暗号モジュールや暗号LSIに対するサイドチャネル攻撃の危険性が指摘されている.その中でもスキャンチェインから取得できるスキャンデータによって秘密鍵を解読するスキャンベース攻撃が注目されている.スキャンベース攻撃では,スキャンチェインから容易にスキャンデータを取得できる性質を利用して秘密鍵を解読する.本稿では,スキャンチェインのレジスタ接続順や暗号LSIの動作するタイミングの特定が不要となるDESに対するスキャンベース攻撃手法を提案する.提案手法では,暗号LSIに複数の平文を入力したときの暗号化処理中のスキャンデータの特定の1ビットに着目し,対応するレジスタの変化を観察することで秘密鍵を解読する.暗号LSI以外のレジスタがスキャンチェインに含まれた場合でも秘密鍵の解読が可能となるため,より現実的な条件でスキャンベース攻撃による秘密鍵解読が可能となる.

CiNii
HDRアーキテクチャを対象とした複数電源電圧指向の低電力化高位合成手法

阿部晋矢, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. SIP, 信号処理 : IEICE technical report 111 ( 257 ) 95 - 100 2011年10月

携帯機器の駆動時間や発熱が問題となる現代,低電力化を意識したLSI設計が必要である.半導体の微細化技術の向上のため,ゲート遅延に対する配線遅延の割合が増加し,配線遅延を考慮した設計も必要である.システムLSIの設計手法として高位合成があるが,低電力化と配線遅延の双方を意識した高位合成としてHDRアーキテクチャを対象とした複数電源電圧指向の高位合成がある.しかし,これはスケジューリング/FUバインディングの際,直接的に消費エネルギーを最小化するのではなく,実行時間の最小化を目的とすることで2次的に消費エネルギーを削減している.本稿では,HDRアーキテクチャを対象とした,複数電源電圧を考慮した消費エネルギーの最小化を目的とするスケジューリング/FUバインディングを提案する.計算機実験により提案手法は,従来のレジスタ分散型アーキテクチャと比較して最大45.1%程度消費エネルギーを削減でき,従来のHDRアーキテクチャを対象とした手法と比較して最大15.9%程度消費エネルギーを削減できることを確認した.

CiNii
スキャンチェイン構造に依存しないDESに対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. IE, 画像工学 111 ( 259 ) 61 - 66 2011年10月

近年,暗号モジュールや暗号LSIに対するサイドチャネル攻撃の危険性が指摘されている.その中でもスキャンチェインから取得できるスキャンデータによって秘密鍵を解読するスキャンベース攻撃が注目されている.スキャンベース攻撃では,スキャンチェインから容易にスキャンデータを取得できる性質を利用して秘密鍵を解読する.本稿では,スキャンチェインのレジスタ接続順や暗号LSIの動作するタイミングの特定が不要となるDESに対するスキャンベース攻撃手法を提案する.提案手法では,暗号LSIに複数の平文を入力したときの暗号化処理中のスキャンデータの特定の1ビットに着目し,対応するレジスタの変化を観察することで秘密鍵を解読する.暗号LSI以外のレジスタがスキャンチェインに含まれた場合でも秘密鍵の解読が可能となるため,より現実的な条件でスキャンベース攻撃による秘密鍵解読が可能となる.

CiNii
HDRアーキテクチャを対象とした複数電源電圧指向の低電力化高位合成手法

阿部晋矢, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. IE, 画像工学 111 ( 259 ) 95 - 100 2011年10月

携帯機器の駆動時間や発熱が問題となる現代,低電力化を意識したLSI設計が必要である.半導体の微細化技術の向上のため,ゲート遅延に対する配線遅延の割合が増加し,配線遅延を考慮した設計も必要である.システムLSIの設計手法として高位合成があるが,低電力化と配線遅延の双方を意識した高位合成としてHDRアーキテクチャを対象とした複数電源電圧指向の高位合成がある.しかし,これはスケジューリング/FUバインディングの際,直接的に消費エネルギーを最小化するのではなく,実行時間の最小化を目的とすることで2次的に消費エネルギーを削減している.本稿では,HDRアーキテクチャを対象とした,複数電源電圧を考慮した消費エネルギーの最小化を目的とするスケジューリング/FUバインディングを提案する.計算機実験により提案手法は,従来のレジスタ分散型アーキテクチャと比較して最大45.1%程度消費エネルギーを削減でき,従来のHDRアーキテクチャを対象とした手法と比較して最大15.9%程度消費エネルギーを削減できることを確認した.

CiNii
スキャンチェイン構造に依存しないDESに対するスキャンベース攻撃手法

小寺博和, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. ICD, 集積回路 111 ( 258 ) 61 - 66 2011年10月

近年,暗号モジュールや暗号LSIに対するサイドチャネル攻撃の危険性が指摘されている.その中でもスキャンチェインから取得できるスキャンデータによって秘密鍵を解読するスキャンベース攻撃が注目されている.スキャンベース攻撃では,スキャンチェインから容易にスキャンデータを取得できる性質を利用して秘密鍵を解読する.本稿では,スキャンチェインのレジスタ接続順や暗号LSIの動作するタイミングの特定が不要となるDESに対するスキャンベース攻撃手法を提案する.提案手法では,暗号LSIに複数の平文を入力したときの暗号化処理中のスキャンデータの特定の1ビットに着目し,対応するレジスタの変化を観察することで秘密鍵を解読する.暗号LSI以外のレジスタがスキャンチェインに含まれた場合でも秘密鍵の解読が可能となるため,より現実的な条件でスキャンベース攻撃による秘密鍵解読が可能となる.

CiNii
HDRアーキテクチャを対象とした複数電源電圧指向の低電力化高位合成手法

阿部晋矢, 柳澤政生, 戸川望

電子情報通信学会技術研究報告. ICD, 集積回路 111 ( 258 ) 95 - 100 2011年10月

携帯機器の駆動時間や発熱が問題となる現代,低電力化を意識したLSI設計が必要である.半導体の微細化技術の向上のため,ゲート遅延に対する配線遅延の割合が増加し,配線遅延を考慮した設計も必要である.システムLSIの設計手法として高位合成があるが,低電力化と配線遅延の双方を意識した高位合成としてHDRアーキテクチャを対象とした複数電源電圧指向の高位合成がある.しかし,これはスケジューリング/FUバインディングの際,直接的に消費エネルギーを最小化するのではなく,実行時間の最小化を目的とすることで2次的に消費エネルギーを削減している.本稿では,HDRアーキテクチャを対象とした,複数電源電圧を考慮した消費エネルギーの最小化を目的とするスケジューリング/FUバインディングを提案する.計算機実験により提案手法は,従来のレジスタ分散型アーキテクチャと比較して最大45.1%程度消費エネルギーを削減でき,従来のHDRアーキテクチャを対象とした手法と比較して最大15.9%程度消費エネルギーを削減できることを確認した.

CiNii
A-3-1 動きベクトルを考慮した遅延オーバーヘッドのないハードウェア向き適応的並列補間手法(A-3.VLSI設計技術,一般セッション)

栗岡大生, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2011 75 - 75 2011年08月

CiNii
A-3-3 セレクタ論理帰着型重み付き加算器を用いた超解像処理と比較実験(A-3.VLSI設計技術,一般セッション)

吉原弘峰, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2011 77 - 77 2011年08月

CiNii
A-3-11 2コアプロセッサアーキテクチャを対象とする正確なキャッシュ構成シミュレーションの高速化に対する一考察(A-3.VLSI設計技術,一般セッション)

多和田雅師, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2011 85 - 85 2011年08月

CiNii
A-3-13 共有バス方式とバスマトリクス方式を用いたネットワークプロセッサのバス競合の性能比較評価(A-3.VLSI設計技術,一般セッション)

出口健介, 柳澤政生, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2011 87 - 87 2011年08月

CiNii
超解像技術におけるセレクタ論理帰着型重み付き加算による再構築処理ハードウェア設計 (システム設計)

吉原弘峰, 柳澤政生, 戸川望

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 24 ( 266 ) 431 - 436 2011年08月

CiNii
歩行者ナビゲーションのための屋内環境での空間認知

杉岡基行, 柳澤政生, 戸川望

マルチメディア、分散協調とモバイルシンポジウム2011論文集 2011 1065 - 1079 2011年06月

CiNii
屋内環境モデル化と柔軟な歩行経路生成手法

町田直哉, 柳澤政生, 戸川望

マルチメディア、分散協調とモバイルシンポジウム2011論文集 2011 1057 - 1064 2011年06月

CiNii
セレクタ論理帰着型重み付き加算器を用いた超解像処理

吉原弘峰, 柳澤政生, 大附辰夫, 戸川望

研究報告システムLSI設計技術（SLDM） 2011 ( 6 ) 1 - 6 2011年05月

近年，大型テレビやパソコンが普及し，高解像度の液晶ディスプレイで動画像を視聴する機会が増えている．低コストで低解像度の画像を高解像度の画像に変換する必要がある．そこで注目されている技術が超解像処理である．超解像処理とは，観測画像のノイズを取り除き，高周波数成分を復元する技術である．その中でも，被写体本来の輝度を再現できる再構築型超解像が注目されている．再構築型超解像は複数の低解像度画像から 1 枚の高解像度画像を生成する．再構築型超解像処理では複数の画像を利用するため，再構築処理の計算コストが大きい．一方，扱う画像情報の増加や組み込む機器の小型・薄型化により，超解像処理を行う専用演算器の小面積化・高速化が求められている．そこで，本稿では再構築型超解像処理の再構築処理が重み付き加算に帰結できることを利用し，再構築処理にセレクタ論理を組み込むことで，桁上げ伝搬遅延を削減，超解像処理を効率化する手法を提案する．算術演算子を用いて設計した従来の重み付き加算器に比べ，遅延時間は 13%，回路面積は 32% 削減された．

In recent years, the popularity of television sets and computers with large screens has led to more opportunities to watch moving pictures on high-resolution liquid crystal displays (LCDs), where it is necessary to convert low-resolution images to high-resolution ones at low cost. Super-resolution is a technique to remove the noise of observed images and restore their high-frequency components. We focus on reconstruction-based super-resolution because it can restore the original brightness of the subject. It produces a high-resolution image from a set of low-resolution images. Reconstruction requires large computation cost because it requires many images. Meanwhile, it is necessary to improve the performance of arithmetic circuits specific to reconstruction-based super-resolution, since the reconstruction-based algorithms need more information on images. In this paper, we propose a reconstruction-based super-resolution using a weighted adder. Our weighted adder is implemented by using selector logics so that we can reduce carry propagations and improve the performance of reconstruction-based super-resolution. Finally, experimental results demonstrate that our proposed weighted adder circuit improves the performance by 13% and reduces the area by 32%, compared to conventional weighted adders.

CiNii
セレクタ論理帰着型重み付き加算器を用いた超解像処理

吉原弘峰, 柳澤政生, 大附辰夫, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 111 ( 40 ) 27 - 32 2011年05月

近年,大型テレビやパソコンが普及し,高解像度の液晶ディスプレイで動画像を視聴する機会が増えている.低コストで低解像度の画像を高解像度の画像に変換する必要がある.そこで注目されている技術が超解像処理である.超解像処理とは,観測画像のノイズを取り除き,高周波数成分を復元する技術である.その中でも,被写体本来の輝度を再現できる再構築型超解像が注目されている.再構築型超解像は複数の低解像度画像から1枚の高解像度画像を生成する.再構築型超解像処理では複数の画像を利用するため,再構築処理の計算コストが大きい.一方,扱う画像情報の増加や組み込む機器の小型・薄型化により,超解像処理を行う専用演算器の小面積化・高速化が求められている.そこで,本稿では再構築型超解像処理の再構築処理が重み付き加算に帰結できることを利用し,再構築処理にセレクタ論理を組み込むことで,桁上げ伝搬遅延を削減,超解像処理を効率化する手法を提案する.算術演算子を用いて設計した従来の重み付き加算器に比べ,遅延時間は13%,回路面積は32%削減された.

CiNii
Scan Vulnerability in Elliptic Curve Cryptosystems (IPSJ Transactions on System LSI Design Methodology Vol.4)

NARA RYUTA, TOGAWA NOZOMU, YANAGISAWA MASAO

情報処理学会論文誌論文誌トランザクション 2010 ( 2 ) 47 - 59 2011年04月

CiNii
DS-2-4 エッジ情報を用いたAngularイントラ予測モード高速決定手法(DS-2.HEVCの標準化に向けた新規映像符号化技術,シンポジウムセッション)

徳満健太, 蝶野慶一, 矢崎健太, 仙田裕三, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会総合大会講演論文集 2011 ( 2 ) "S - 7" 2011年02月

CiNii
柔軟な置換ポリシをもつ2階層キャッシュの正確で高速なシミュレーション手法

多和田雅師, 柳澤政生, 大附辰夫, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 110 ( 432 ) 13 - 18 2011年02月

通常,多階層キャッシュにおいてL1キャッシュは置換ポリシとしてLRUを持つが,下位階層のキャッシュの置換ポリシはハードウェアコストの低いFIFOなどを用いることが普通である.本稿ではL1キャッシュでLRUをキャッシュ置換ポリシとし,L2キャッシュでFIFOをキャッシュ置換ポリシとして持つ2階層キャッシュの高速なシミュレーションの手法を提案する.提案手法はL1命令キャッシュ,L1データキャッシュの一方を固定し,L2キャッシュを含めたキャッシュシミュレーションを複数回行う.キャッシュの性質を利用し,結果を正しく予測できるシミュレーションを省略することで高速化する.計算機実験により手法の有効性を評価する.

CiNii
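The cache organization described in the abstract above (an L1 cache with LRU replacement backed by an L2 cache with FIFO replacement) can be modelled by a straightforward reference simulator such as the one below. This is a minimal sketch of the brute-force baseline only, run once per configuration; it does not implement the paper's technique of skipping simulations whose results can be predicted. The class and function names are hypothetical, a single unified L1 is assumed instead of separate instruction and data caches, and the L2 is assumed to be accessed only on L1 misses.

```python
from collections import OrderedDict


class SetAssocCache:
    """Set-associative cache model with LRU or FIFO replacement."""

    def __init__(self, num_sets, ways, block_size, policy):
        assert policy in ("LRU", "FIFO")
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        self.policy = policy
        # One ordered dict per set; the first key is the next eviction victim.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Access a byte address; return True on hit, False on miss."""
        block = addr // self.block_size
        lines = self.sets[block % self.num_sets]
        if block in lines:
            self.hits += 1
            if self.policy == "LRU":
                lines.move_to_end(block)  # refresh recency; FIFO keeps insertion order
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recent (LRU) or oldest (FIFO) block
        lines[block] = True
        return False


def simulate_two_level(trace, l1, l2):
    """Run an address trace through L1, forwarding L1 misses to L2."""
    for addr in trace:
        if not l1.access(addr):
            l2.access(addr)
    return l1.misses, l2.misses
```

Running this once for every candidate L1/L2 configuration is what makes exhaustive exploration slow, which is the cost the abstract sets out to reduce.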
スクラッチパッドメモリとコード配置最適化による低消費エネルギーASIP合成手法

嶋田吉倫, 史又華, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 110 ( 432 ) 25 - 30 2011年02月

本稿ではVLIW型ASIPを対象としたハードウェア/ソフトウェア(HW/SW)ASIP協調合成システムSPADESにおける消費エネルギー削減手法を提案する.ASIPにおいて命令メモリが占める消費エネルギーの割合は大きく,命令メモリの消費エネルギー削減が課題となっている.そこで我々は,SPADESを対象としたスクラッチパッドメモリアーキテクチャと,コード配置最適化手法を提案する.提案するスクラッチパッドメモリアーキテクチャは,プログラムカウンタによりスクラッチパッドメモリへ配置するデータを判別する.コード配置最適化手法は,アプリケーションCFGから消費エネルギー最小となるコード配置とスクラッチパッドメモリのサイズを決定する.これにより命令メモリのアクセス数を削減し,消費エネルギーを削減することができる.計算機実験により,メモリを含むプロセッサ全体で平均47.9%の消費エネルギー削減を確認した.

CiNii
組込みプロセッサのための超高速なオンチップメモリ最適化技術

戸川望

電気通信普及財団研究調査報告書(CD-ROM) ( 26 ) 2011年

J-GLOBAL
複数電源電圧および複数サイクルレジスタ間通信指向の低電力化高位合成手法

阿部晋矢, 柳澤政生, 戸川望

情報処理学会シンポジウム論文集 2011 ( 5 ) 2011年

J-GLOBAL
FIFOをキャッシュ置換ポリシとする正確なキャッシュ構成シミュレーションの高速化

多和田雅師, 柳澤政生, 大附辰夫, 戸川望

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 110 ( 317 ) 55 - 60 2010年11月

一般にプロセッサのキャッシュ構成はセット数,ブロックサイズ,連想度のパラメータが存在する.組込みシステムでは対象とするアプリケーションが限定されているため,そのキャッシュ構成を最適化することができる.対象アプリケーションに対しキャッシュ置換ポリシとしてLRUを仮定し,これら3つのキャッシュパラメータを変化させたときのキャッシュヒット/ミス数を正確に,かつきわめて高速にシミュレーションする手法としてCRCB手法が提案されている.ところが多くのキャッシュは,キャッシュハードウェアのオーバヘッド削減のためより簡易なキャッシュ置換ポリシとしてFIFOを持つ.本稿では組込みアプリケーションを対象にFIFOをキャッシュ置換ポリシとして持つキャッシュ構成シミュレーションの高速化アルゴリズムを提案する.FIFOに対し,キャッシュの性質を利用することで,連想度が異なる複数のキャッシュ構成を一括してシミュレーションしヒット/ミスを判定する手法を提案する.計算機実験の結果,従来のFIFOを対象とするキャッシュ構成シミュレータに対し平均18%高速に,複数のキャッシュ構成のヒット/ミス数を正確に判定できた.

CiNii
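For the LRU case mentioned in the abstract above, simulating several associativities in one pass rests on a well-known property: with the number of sets and the block size fixed, the contents of a k-way LRU set are always the top k entries of a single per-set recency stack. The sketch below uses that property to count misses for every associativity from 1 to `max_ways` in a single pass over the trace. It is the classic stack-distance idea, not the CRCB method and not the FIFO algorithm proposed in the paper. FIFO does not have this inclusion property in general, which is one reason it needs separate treatment. The function name and parameters are hypothetical.

```python
def lru_misses_all_ways(trace, num_sets, block_size, max_ways):
    """Return {ways: miss count} for LRU caches with 1..max_ways ways.

    One recency stack per set (most recently used first) stands in for
    all associativities at once.
    """
    stacks = [[] for _ in range(num_sets)]
    depth_hist = [0] * max_ways  # depth_hist[d] = accesses found at stack depth d
    total = 0
    for addr in trace:
        total += 1
        block = addr // block_size
        stack = stacks[block % num_sets]
        if block in stack:
            depth = stack.index(block)
            stack.pop(depth)
            depth_hist[depth] += 1  # a hit for every cache with more than `depth` ways
        stack.insert(0, block)
        del stack[max_ways:]  # deeper blocks would miss in every simulated cache

    misses = {}
    hits = 0
    for ways in range(1, max_ways + 1):
        hits += depth_hist[ways - 1]
        misses[ways] = total - hits
    return misses
```

For instance, `lru_misses_all_ways([1, 2, 1, 3, 1], num_sets=1, block_size=1, max_ways=2)` reports the miss counts of the 1-way and 2-way caches from the same pass.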
FIFOをキャッシュ置換ポリシとする正確なキャッシュ構成シミュレーションの高速化

多和田雅師, 柳澤政生, 大附辰夫, 戸川望

電子情報通信学会技術研究報告. VLD, VLSI設計技術 110 ( 316 ) 55 - 60 2010年11月

一般にプロセッサのキャッシュ構成はセット数,ブロックサイズ,連想度のパラメータが存在する.組込みシステムでは対象とするアプリケーションが限定されているため,そのキャッシュ構成を最適化することができる.対象アプリケーションに対しキャッシュ置換ポリシとしてLRUを仮定し,これら3つのキャッシュパラメータを変化させたときのキャッシュヒット/ミス数を正確に,かつきわめて高速にシミュレーションする手法としてCRCB手法が提案されている.ところが多くのキャッシュは,キャッシュハードウェアのオーバヘッド削減のためより簡易なキャッシュ置換ポリシとしてFIFOを持つ.本稿では組込みアプリケーションを対象にFIFOをキャッシュ置換ポリシとして持つキャッシュ構成シミュレーションの高速化アルゴリズムを提案する.FIFOに対し,キャッシュの性質を利用することで,連想度が異なる複数のキャッシュ構成を一括してシミュレーションしヒット/ミスを判定する手法を提案する.計算機実験の結果,従来のFIFOを対象とするキャッシュ構成シミュレータに対し平均18%高速に,複数のキャッシュ構成のヒット/ミス数を正確に判定できた.

CiNii
A-3-6 RSA暗号に対するスキャンベース攻撃の評価実験(A-3.VLSI設計技術,一般セッション)

奈良竜太, 柳澤政生, 大附辰夫, 戸川望

電子情報通信学会ソサイエティ大会講演論文集 2010 68 - 68 2010年08月

CiNii
未来を切り拓く最先端 VLSI テクノロジー : 1.メディア処理における超低消費電力SoC技術

後藤敏, 池永剛, 吉村猛, 木村晋二, 戸川望

情報処理 51 ( 7 ) 837 - 845 2010年07月

CiNii
一般化レジスタ分散アーキテクチャを対象とした高位合成手法とその評価

大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 110 ( 36 ) 19 - 24 2010年05月

半導体微細化に伴い,配線遅延はゲート遅延と比較し相対的に増加しており,高位合成の段階においても配置・配線を考慮することが望まれる.本稿では,レジスタ・演算器間,制御回路・演算器間の配線遅延に着目した一般化レジスタ分散型アーキテクチャを対象とした高位合成手法を提案する.対象アーキテクチャは配線遅延がボトルネックとなる演算器に専用のレジスタ・制御回路を付加することにより,回路性能が向上し,それ以外の演算器はレジスタ・制御回路を共有することにより,面積を削減可能となる.本手法では,レジスタ・制御回路の構成が,入力アプリケーションと制約から配置・配線情報をフィードバックする高位合成フロー中に自動的に決定される.計算機実験により,提案手法は従来の配置・配線を考慮した高位合成手法と比較し,4.9%の性能向上,10.0%の面積削減を達成した.

CiNii
道路標識とランドマークを用いた歩行者位置特定システムと実地調査による評価

児島伴幸, 山根和也, 柳澤政生, 大附辰夫, 戸川望

情報処理学会論文誌 51 ( 3 ) 899 - 913 2010年03月

携帯電話を用いた歩行者の位置特定は一般的に携帯電話に搭載されたGPS（携帯GPSと呼ぶ）を用いているが，携帯GPSはマルチパスなどの影響により測位誤差が生じる可能性がある．一方，携帯GPSの測位誤差を調べた調査結果が公開されていることが少ない．本論文ではまず都市部と住宅地の両方が存在する高田馬場駅周辺において携帯GPSの測位誤差を調査した．調査の結果，携帯GPSは最大で80m程度の測位誤差が生じた．都市部における80mの測位誤差は道路2.3本分の誤差に対応するため，歩行者に混乱を与えかねない．次に，携帯GPSの測位誤差を0に近づけるため，道路標識とランドマークを用いて携帯GPSの測位を補正する位置特定手法を提案する．既存インフラである道路標識・ランドマークと，近い将来に社会インフラ化される携帯GPSを用いるため，インフラ設備を最小限に抑えることができる．提案手法は利用者の現在地を道路標識の位置と同一視し，利用者が見つけた道路標識の位置を知ることにより，利用者の位置を特定するものである．処理の流れは携帯GPSにより大まかな位置を特定した後に，利用者が見つけた道路標識を選択することにより現在地候補を5個以下に絞る．現在地候補の近辺に存在するランドマークを選択することにより唯一の現在地を特定する．提案手法をCGI環境で実装し，NTTドコモ社とKDDI社の携帯電話を用いて評価実験した．実地調査を通じて98%の精度で利用者の現在地を特定できることを実証し，提案手法が有効な手法であることを確認した．

Mobile-GPS is generally used for pedestrian positioning on mobile devices such as mobile phones and PDAs. Positioning errors of mobile-GPS might be caused by several factors, such as "multipath"; however, they have not been investigated sufficiently. In this paper, we first investigate positioning errors of mobile-GPS at Takadanobaba station and its environs, which have both urban and residential areas. Our investigation results show that mobile-GPS can have positioning errors of approximately 80 meters at the maximum. Secondly, we propose a highly accurate pedestrian positioning method using road traffic signs and landmarks. Our proposed method does not require any infrastructure construction, as we already have the infrastructure of road traffic signs, landmarks and mobile-GPS on mobile devices. Assuming that a user is positioned at the traffic sign, our proposed method determines the user position by finding out several nearby road traffic signs and sending their colors and shapes to a server. Our method starts by locating the approximate position of a user using mobile-GPS. Next, it locates the user position by selecting road traffic signs and landmarks. Our method is implemented with CGI and investigated using mobile phones of NTT Docomo and au by KDDI. In this investigation, the accuracy of the method was 98%, and we confirmed the effectiveness of the proposed method.

CiNii
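The processing flow in the abstract above (coarse position from mobile-GPS, then candidate narrowing by the road sign the user selects, then a unique position from a nearby landmark) can be sketched as a two-stage filter. Everything concrete here is an assumption made for illustration: the function names, the dictionary layout of `signs` and `landmarks`, the use of planar coordinates in metres, and both radius values are hypothetical and do not come from the paper.

```python
import math


def distance_m(a, b):
    """Planar distance in metres between two (x, y) points on a local grid."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def locate(gps_xy, sign_kind, landmark_name, signs, landmarks,
           gps_radius_m=100.0, landmark_radius_m=50.0):
    """Return the position of the unique matching road sign, or None.

    Step 1: keep signs of the selected kind within the GPS error radius.
    Step 2: keep only candidates that have the selected landmark nearby.
    """
    candidates = [
        s for s in signs
        if s["kind"] == sign_kind and distance_m(s["xy"], gps_xy) <= gps_radius_m
    ]
    landmark_positions = [l["xy"] for l in landmarks if l["name"] == landmark_name]
    narrowed = [
        s for s in candidates
        if any(distance_m(s["xy"], p) <= landmark_radius_m for p in landmark_positions)
    ]
    # The user is treated as standing at the sign, as in the abstract.
    return narrowed[0]["xy"] if len(narrowed) == 1 else None
```

A caller would pass the GPS fix together with the user's sign and landmark selections; a `None` result means the selections did not single out one sign and another selection round would be needed.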
常時着用型センサ"ビジネス顕微鏡"による組織変革

荒宏視, 佐藤信夫, 矢野和男, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 109 ( 462 ) 43 - 47 2010年03月

先端のセンサネット技術と組織論・心理学の融合による,新しい組織行動支援技術を示す.

CiNii
歩行者の現在地認識に基づく道路標識とランドマークを用いた位置特定システムの改良とシミュレーション評価

児島伴幸, 柳澤政生, 大附辰夫, 戸川望

電子情報通信学会技術研究報告. ITS 109 ( 414 ) 153 - 158 2010年02月

近年,歩行者の位置特定には携帯電話に搭載されたGPS(携帯GPSと呼ぶ)を用いるが,携帯GPSは都市部においてマルチパスなどの影響により数100m程度の測位誤差が生じる可能性がある.現在地が地図上で数100m程度離れてプロットされると,歩行者は現在地を地図上から認識し難い.我々は携帯GPS,道路標識,ランドマークを用いた,歩行者位置特定システムを構築している.本稿では,まず,このシステムを改良し,歩行者が地図上から現在地を認識し易い位置特定システムを提案する.本システムは,歩行者が見つけた道路標識,ランドマークを用いて道路標識の位置を特定し,歩行者が見つけた道路標識とランドマークの位置を地図上に表示する.歩行者は地図上に表示された道路標識とランドマークを用いて地図と現実をマッピングするため,地図上から現在地を認識し易い.次に,近くに存在する同一の道路標識を1つのクラスタとして考えることと,高層ビル街とそれ以外に分けて各種閾値を定義することによって,歩行者による道路標識の選択回数を都市部の高層ビル街以外において1回,高層ビル街において2回以下に抑え,ユーザビリティを向上させる.最後に,シミュレーション実験を通じて,改良手法が有効な手法であることを確認した.

CiNii
歩行者の現在地認識に基づく道路標識とランドマークを用いた位置特定システムの改良とシミュレーション評価

児島伴幸, 柳澤政生, 大附辰夫, 戸川望

電子情報通信学会技術研究報告. IE, 画像工学 109 ( 415 ) 153 - 158 2010年02月

近年,歩行者の位置特定には携帯電話に搭載されたGPS(携帯GPSと呼ぶ)を用いるが,携帯GPSは都市部においてマルチパスなどの影響により数100m程度の測位誤差が生じる可能性がある.現在地が地図上で数100m程度離れてプロットされると,歩行者は現在地を地図上から認識し難い.我々は携帯GPS,道路標識,ランドマークを用いた,歩行者位置特定システムを構築している.本稿では,まず,このシステムを改良し,歩行者が地図上から現在地を認識し易い位置特定システムを提案する.本システムは,歩行者が見つけた道路標識,ランドマークを用いて道路標識の位置を特定し,歩行者が見つけた道路標識とランドマークの位置を地図上に表示する.歩行者は地図上に表示された道路標識とランドマークを用いて地図と現実をマッピングするため,地図上から現在地を認識し易い.次に,近くに存在する同一の道路標識を1つのクラスタとして考えることと,高層ビル街とそれ以外に分けて各種閾値を定義することによって,歩行者による道路標識の選択回数を都市部の高層ビル街以外において1回,高層ビル街において2回以下に抑え,ユーザビリティを向上させる.最後に,シミュレーション実験を通じて,改良手法が有効な手法であることを確認した.

DOI CiNii
アドホックネットワークにおけるクラスタの接続性とクラスタヘッドの負荷分散を考慮したルーティング (アドホックネットワーク)

板橋裕介, 戸川望, 柳澤政生

電子情報通信学会技術研究報告 109 ( 381 ) 85 - 90 2010年01月

CiNii
複数のグループを持つ無線アドホックネットワークにおける衝突回避型マルチキャストプロトコル (アドホックネットワーク)

竹内博是, 戸川望, 柳澤政生

電子情報通信学会技術研究報告 109 ( 381 ) 95 - 100 2010年01月

CiNii
部分マッチングを考慮しMISO構造に対応した専用演算器合成手法

橋本識弘, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 109 ( 395 ) 89 - 94 2010年01月

近年,アプリケーションに特化した専用プロセッサの需要が伸びているが,アプリケーションごとにプロセッサを設計するのは時間的なコストを必要とするため,アプリケーションに専用の演算器を持つプロセッサの自動合成システムが求められている.本稿では,アプリケーションに特化した専用演算器合成手法を提案する.提案手法は,MISO(Multiple Input, Single Output)構造の専用演算器を生成することを可能とする.さらに専用演算器内の不要な演算に対して,0もしくは1を入力とし演算を実行させないことで,専用演算器とCDFG(Control Data Flow Graph)のサブグラフが部分的に一致している場合にも,専用演算器で演算を実行させる部分マッチングを可能とした.提案手法により,既存手法と比較して52%の実行時間を削減することができた.
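
The partial matching idea in this abstract (feeding 0 or 1 into an unneeded operation so that a dedicated unit still covers a smaller subgraph) can be sketched in Python as follows. The multiply-add unit `mac_unit` is a hypothetical example of a MISO unit chosen for illustration; the abstract does not say which units the method actually generates.

```python
def mac_unit(a, b, c):
    """Hypothetical dedicated MISO unit: three inputs, one output, computes a * b + c."""
    return a * b + c

def add_via_unit(x, y):
    # Partial match: only the addition is needed, so the multiplier is
    # neutralized by feeding the constant 1 (x * 1 == x).
    return mac_unit(x, 1, y)

def mul_via_unit(x, y):
    # Partial match: only the multiplication is needed, so the adder is
    # neutralized by feeding the constant 0 (p + 0 == p).
    return mac_unit(x, y, 0)
```

The same unit therefore serves subgraphs that match it fully (`a * b + c`) and subgraphs that match only one of its operations.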

CiNii
部分マッチングを考慮しMISO構造に対応した専用演算器合成手法

橋本識弘, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 109 ( 394 ) 89 - 94 2010年01月

近年,アプリケーションに特化した専用プロセッサの需要が伸びているが,アプリケーションごとにプロセッサを設計するのは時間的なコストを必要とするため,アプリケーションに専用の演算器を持つプロセッサの自動合成システムが求められている.本稿では,アプリケーションに特化した専用演算器合成手法を提案する.提案手法は,MISO (Multiple Input, Single Output)構造の専用演算器を生成することを可能とする.さらに専用演算器内の不要な演算に対して,0もしくは1を入力とし演算を実行させないことで,専用演算器とCDFG (Control Data Flow Graph)のサブグラフが部分的に一致している場合にも,専用演算器で演算を実行させる部分マッチングを可能とした.提案手法により,既存手法と比較して52%の実行時間を削減することができた.

CiNii
部分マッチングを考慮しMISO構造に対応した専用演算器合成手法

橋本識弘, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 109 ( 393 ) 89 - 94 2010年01月

近年,アプリケーションに特化した専用プロセッサの需要が伸びているが,アプリケーションごとにプロセッサを設計するのは時間的なコストを必要とするため,アプリケーションに専用の演算器を持つプロセッサの自動合成システムが求められている.本稿では,アプリケーションに特化した専用演算器合成手法を提案する.提案手法は,MISO(Multiple Input, Single Output)構造の専用演算器を生成することを可能とする.さらに専用演算器内の不要な演算に対して,0もしくは1を入力とし演算を実行させないことで,専用演算器とCDFG(Control Data Flow Graph)のサブグラフが部分的に一致している場合にも,専用演算器で演算を実行させる部分マッチングを可能とした.提案手法により,既存手法と比較して52%の実行時間を削減することができた.

CiNii
組み込みアプリケーションを対象とした2階層キャッシュメモリにおけるキャッシュ/バス構成最適化手法

渡辺信太, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2010 ( 7 ) 2010年

J-GLOBAL
RDRアーキテクチャを対象としたフォールトセキュア高位合成手法

田中翔, 柳澤政生, 大附辰夫, 戸川望

情報処理学会シンポジウム論文集 2010 ( 7 ) 2010年

J-GLOBAL
FIFOとPLRUをキャッシュ置換ポリシとする高速なキャッシュ構成シミュレーション手法

多和田雅師, 柳澤政生, 大附辰夫, 戸川望

情報処理学会シンポジウム論文集 2010 ( 7 ) 2010年

J-GLOBAL
MANETにおけるSIPサーバレスシステム

下坂知輝, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウムシリーズ(CD-ROM) 2010 ( 1 ) 2010年

J-GLOBAL
携帯電話GPSの測位誤差測定に基づく道路標識とランドマークを用いた位置特定システムの改良

田口真史, 児島伴幸, 柳澤政生, 大附辰夫, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2010 ( 1 ) 2010年

J-GLOBAL
道路標識とランドマークを用いた歩行者位置特定システムと実地調査による評価

児島伴幸, 山根和也, 柳澤政生, 大附辰夫, 戸川望

情報処理学会論文誌ジャーナル(CD-ROM) 51 ( 3 ) 2010年

J-GLOBAL
歩行者の現在地認識に基づく道路標識とランドマークを用いた位置特定システムの改良とシミュレーション評価(ITS画像処理,映像メディア,視覚および一般)

児島伴幸, 柳澤政生, 大附辰夫, 戸川望

映像情報メディア学会技術報告 34 ( 0 ) 153 - 158 2010年

近年,歩行者の位置特定には携帯電話に搭載されたGPS(携帯GPSと呼ぶ)を用いるが,携帯GPSは都市部においてマルチパスなどの影響により数100m程度の測位誤差が生じる可能性がある.現在地が地図上で数100m程度離れてプロットされると,歩行者は現在地を地図上から認識し難い.我々は携帯GPS,道路標識,ランドマークを用いた,歩行者位置特定システムを構築している.本稿では,まず,このシステムを改良し,歩行者が地図上から現在地を認識し易い位置特定システムを提案する.本システムは,歩行者が見つけた道路標識,ランドマークを用いて道路標識の位置を特定し,歩行者が見つけた道路標識とランドマークの位置を地図上に表示する.歩行者は地図上に表示された道路標識とランドマークを用いて地図と現実をマッピングするため,地図上から現在地を認識し易い.次に,近くに存在する同一の道路標識を1つのクラスタとして考えることと,高層ビル街とそれ以外に分けて各種閾値を定義することによって,歩行者による道路標識の選択回数を都市部の高層ビル街以外において1回,高層ビル街において2回以下に抑え,ユーザビリティを向上させる.最後に,シミュレーション実験を通じて,改良手法が有効な手法であることを確認した.

DOI CiNii
組み込みアプリケーションを対象とした2階層ユニファイドキャッシュのシミュレーション手法

小林優太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 109 ( 316 ) 37 - 42 2009年11月

本稿では組み込みアプリケーションを対象として,パラメータによって変化したL1命令キャッシュ,L1データキャッシュ,L2ユニファイドキャッシュのヒット数およびミス数を正確かつ高速に算出する手法を提案する.本手法はL1命令(データ)キャッシュ-L2命令(データ)キャッシュを対象としたシミュレーションを複数回繰り返すことで,2階層ユニファイドキャッシュを対象としたシミュレーションを可能とした.さらに,キャッシュが持つ性質を利用し,シミュレーションするキャッシュ構成を省略することで,全探索手法と比較して,最大3662.93倍の高速化を確認した.
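
As a point of reference for what is being counted here, the following Python sketch simulates one fixed configuration of an L1 instruction cache, an L1 data cache, and a unified L2 cache and reports hits and misses. It is a plain baseline simulator, assuming LRU replacement and a trace of `("I" | "D", address)` pairs; it does not implement the speed-up technique of the abstract, and the names `Cache` and `simulate` are illustrative.

```python
from collections import OrderedDict

class Cache:
    """Set-associative cache with LRU replacement (an assumed policy)."""
    def __init__(self, num_sets, assoc, line_size):
        self.num_sets, self.assoc, self.line_size = num_sets, assoc, line_size
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = self.misses = 0

    def access(self, addr):
        block = addr // self.line_size
        lines = self.sets[block % self.num_sets]
        if block in lines:
            lines.move_to_end(block)      # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.assoc:
            lines.popitem(last=False)     # evict the least recently used line
        lines[block] = True
        return False

def simulate(trace, l1i, l1d, l2):
    """Run a trace of (kind, addr) pairs; L1 misses are forwarded to the unified L2."""
    for kind, addr in trace:
        l1 = l1i if kind == "I" else l1d
        if not l1.access(addr):
            l2.access(addr)
    return (l1i.hits, l1i.misses), (l1d.hits, l1d.misses), (l2.hits, l2.misses)
```

An exhaustive search would repeat `simulate` once per parameter combination, which is the cost the proposed method reduces.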

CiNii
2階層キャッシュメモリにおけるシミュレーションベースのバス幅最適化手法

渡辺信太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 109 ( 316 ) 43 - 48 2009年11月

本稿では組み込みアプリケーションを対象とし,2階層キャッシュメモリにおけるバス幅とキャッシュ構成のシミュレーションベースの最適化手法を提案する.まず,キャッシュのヒット/ミス判定とバス幅の最適化を独立して考えることができることを示す.キャッシュのヒット/ミス判定はCRCB手法を適用することで効率的に探索する.バス幅の最適化はキャッシュとバスの持つ性質を利用することで効率的な探索を可能とする.本手法の評価として,総メモリアクセス時間最小または総消費エネルギー最小となるようなキャッシュ・バス構成を探索するシステムを構築し,単純な全探索と比較して最大で835.91倍高速化した.

CiNii
組み込みアプリケーションを対象とした2階層ユニファイドキャッシュのシミュレーション手法

小林優太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 109 ( 315 ) 37 - 42 2009年11月

本稿では組み込みアプリケーションを対象として,パラメータによって変化したL1命令キャッシュ,L1データキャッシュ,L2ユニファイドキャッシュのヒット数およびミス数を正確かつ高速に算出する手法を提案する.本手法はL1命令(データ)キャッシュ-L2命令(データ)キャッシュを対象としたシミュレーションを複数回繰り返すことで,2階層ユニファイドキャッシュを対象としたシミュレーションを可能とした.さらに,キャッシュが持つ性質を利用し,シミュレーションするキャッシュ構成を省略することで,全探索手法と比較して,最大3662.93倍の高速化を確認した.

CiNii
2階層キャッシュメモリにおけるシミュレーションベースのバス幅最適化手法

渡辺信太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 109 ( 315 ) 43 - 48 2009年11月

本稿では組み込みアプリケーションを対象とし,2階層キャッシュメモリにおけるバス幅とキャッシュ構成のシミュレーションベースの最適化手法を提案する.まず,キャッシュのヒット/ミス判定とバス幅の最適化を独立して考えることができることを示す.キャッシュのヒット/ミス判定はCRCB手法を適用することで効率的に探索する.バス幅の最適化はキャッシュとバスの持つ性質を利用することで効率的な探索を可能とする.本手法の評価として,総メモリアクセス時間最小または総消費エネルギー最小となるようなキャッシュ・バス構成を探索するシステムを構築し,単純な全探索と比較して最大で835.91倍高速化した.

CiNii
セレクタ論理を用いた高速な差積演算器の設計とバタフライ演算への応用 (画像工学)

塚本洋平, 戸川望, 柳澤政生

電子情報通信学会技術研究報告 109 ( 227 ) 101 - 106 2009年10月

CiNii
セレクタ論理を用いた高速な差積演算器の設計とバタフライ演算への応用

塚本洋平, 戸川望, 柳澤政生, 大附辰夫, 外村元伸

研究報告システムLSI設計技術（SLDM） 2009 ( 18 ) 1 - 6 2009年10月

システム LSI は通信，動画像，音声処理などの複雑で規模の大きな演算を高速に処理するために特定の計算に特化した専用演算器を搭載してきた．その一つが積和演算を行う MAC 演算器である．これは部分積加算を拡張することで桁上げ伝播遅延を削減でき，結果として乗算 1 回分と同等の遅延時間で計算できる．一方差積演算に注目すると，部分積が決定するのに減算の桁上げ遅延を待たねばならず全体の遅延は減算と乗算 2 つの遅延の合計となる．本稿ではこの問題に対し差積演算の部分積を適切にまとめたものがセレクタ回路の計算と等価となることに注目し，セレクタ論理を用いて部分積を高速に生成し差積演算の速度を向上する手法を提案する．次に設計した差積演算器を FFT におけるバタフライ演算に組み込むことを考える．FFT は無線通信，動画像処理などの分野で高サンプル数の演算が求められており，それらに対応するために高速なバタフライ演算器が必要である．これに対しバタフライ演算のクリティカルパスは複素減算，乗算演算でありこれに上述の差積演算回路を適用することで高速化できることを示す．

Large-scale network and multimedia application LSIs include application-specific arithmetic circuits. A multiply-accumulator (MAC), one of these optimized circuits, extends partial-product addition and reduces carry propagation. However, there is no comparable method for executing subtract-multiplication. In this paper, we propose a high-speed subtract-multiplier that decreases the latency of the subtract operation by bit-level transformation using selector logic. Partial products are calculated directly by bit-level transformation, and their total number is compressed to approximately half. The proposed subtract-multiplier can be applied to any kind of system that uses subtract-multiplications, and the butterfly operation in FFT is a suitable application. Experimental results show that our proposed butterfly operation circuit improves performance by 33.0% compared to a conventional one.
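
To show where the subtract-multiply operation sits inside an FFT, here is a minimal Python sketch of a radix-2 decimation-in-frequency FFT built from butterflies. It is a software illustration only, assuming a power-of-two input length; the function names are illustrative and the selector-logic circuit itself is not modeled.

```python
import cmath

def butterfly(a, b, w):
    """Radix-2 butterfly: a sum, and a difference multiplied by the twiddle factor w.

    The second output, (a - b) * w, is the subtract-multiply on the critical path.
    """
    return a + b, (a - b) * w

def fft_dif(x):
    """Recursive decimation-in-frequency FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    sums, diffs = [], []
    for k in range(half):
        w = cmath.exp(-2j * cmath.pi * k / n)
        s, d = butterfly(x[k], x[k + half], w)
        sums.append(s)
        diffs.append(d)
    out = [0] * n
    out[0::2] = fft_dif(sums)    # even-indexed frequency bins
    out[1::2] = fft_dif(diffs)   # odd-indexed frequency bins
    return out
```

Every butterfly performs one subtract-multiply, so an N-point transform performs (N/2)·log2(N) of them, which is why speeding up that single operation matters.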

CiNii
セレクタ論理を用いた高速な差積演算器の設計とバタフライ演算への応用

塚本洋平, 戸川望, 柳澤政生, 大附辰夫, 外村元伸

電子情報通信学会技術研究報告. SIP, 信号処理 : IEICE technical report 109 ( 226 ) 101 - 106 2009年10月

CiNii
IEEE802.11nに対応した高効率列処理演算器による高スループットイレギュラーLDPC復号器の実装と評価

長島諒侑, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 109 ( 201 ) 51 - 56 2009年09月

近年携帯電話や無線LANといった無線端末が普及し放送もデジタル化するなど,無線通信の利用が急激に加速している.通信環境の変動しやすい無線端末において,高い通信品質を保つことが大きな課題となっている.LDPC(Low Density Parity Check)符号は高い誤り訂正能力を持つため次世代の誤り訂正符号として注目され,2006年にはIEEE802.11nで規格化された.本稿では,IEEE802.11nの規格に準拠したイレギュラーな検査行列の復号が可能な復号器を提案する.提案する復号器は列処理演算の並列性に着目することで符号化率や符号長が変化しても加算回路を共有し演算器の使用率を向上させる.提案復号器では行方向,列方向ともに並列性を持たせることで高スループット化を実現する.さらに符号化率が高くなるにつれて列処理演算器の演算並列度を向上させることができるとともに,短い符号長でも演算器の使用率の低下を抑えることで従来手法よりも高いスループットを実現できる.FPGA上に実装した結果,既存の最高性能の復号器に比較して最高スループットで約28%の向上が見られ,符号長1296では2倍以上のスループットの向上が確認できた.
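
The column operation whose adders are shared can be illustrated with the textbook variable-node update of sum-product LDPC decoding: one total is computed per column, and each outgoing message is that total minus the corresponding incoming message. The Python sketch below shows this standard update only; it is an assumption that the decoder follows this form, and the function name is illustrative.

```python
def column_update(channel_llr, row_msgs):
    """Textbook column (variable-node) update for one column of the check matrix.

    channel_llr: log-likelihood ratio received from the channel for this bit.
    row_msgs: messages arriving from the rows (check nodes) connected to this column.
    Returns the a-posteriori LLR and the messages sent back to each row.
    """
    total = channel_llr + sum(row_msgs)        # one shared summation per column
    outgoing = [total - m for m in row_msgs]   # exclude each row's own contribution
    return total, outgoing
```

Because the number of inputs per column differs in an irregular check matrix, the single shared summation is the part that benefits from a flexible adder structure.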

CiNii
ディジタルメディア向け動的再構成型プロセッサFE-GAへのDFGマッピングとその自動化手法

田村亮, 戸川望, 柳澤政生, 大附辰夫, 佐藤真琴

電子情報通信学会技術研究報告. VLD, VLSI設計技術 109 ( 201 ) 57 - 62 2009年09月

近年のディジタル機器は多種多様で,膨大なデータを短時間で処理する事が要求されている.この変化に対応するべく,日立製作所は動的再構成型プロセッサFE-GA(Flexible Engine/Generic ALU Array)を開発・推進している.本稿では,FE-GAに様々なDFG(Data Flow Graph)をマッピングする手法を提案する.提案した手法では,格子状に配置された演算セルアレイ上に配置配線を行う.配置配線に際しては,マッピングしたノードの出力が確保されるよう行うセル封鎖判定や,データの到着タイミングを合わせるサイクル数調整などを行っている.さらにFE-GAの特徴であるスレッド切り替えを用いて,1面に収まりきらないDFGを分割する事で任意の大きさのDFGのマッピングを実現している.この提案手法では,FE-GAのアーキテクチャ制限の範囲内において,加算器で構成されたDFGを自動的にFE-GA上へマッピングすることに成功した.

CiNii
ビットレベル処理を考慮したセレクタ帰着型重み付き加算器

原智昭, 戸川望, 柳澤政生, 大附辰夫, 外村元伸

電子情報通信学会技術研究報告. VLD, VLSI設計技術 109 ( 34 ) 7 - 12 2009年05月

複数の入力値に総和が1になる重みを付加する重み付き加算演算がある.この重み付き加算演算は複数枚の画像の重ね合わせ処理に使用されている.本稿では,重み付き加算器の演算式を式変形し,セレクタ論理に帰着させることにより桁上げ伝播遅延を削減した重み付き加算演算器を提案する.評価実験の結果,提案した重み付き加算演算器は,算術演算子を用いた演算器に比べ,速度優先設計で17%演算高速化を確認した.
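
One way to see how a weighted addition with weights summing to 1 reduces to selections is the two-input case with an n-bit weight W: W·x + (2^n − W)·y equals y plus, for each bit i of W, either x or y shifted left by i. The Python sketch below checks this identity; it is an illustration under that two-input assumption and is not claimed to be the exact transformation used in the paper.

```python
def weighted_add_selector(x, y, w, n):
    """Compute w*x + (2**n - w)*y using only bitwise selection, shifts and additions.

    w is an n-bit integer weight; the result is scaled by 2**n
    (divide by 2**n to obtain the blended value).
    """
    assert 0 <= w < 2 ** n
    acc = y                                # accounts for the "+1" in 2**n - w
    for i in range(n):
        bit = (w >> i) & 1
        acc += (x if bit else y) << i      # selector: bit i of w picks x or y
    return acc
```

No multiplier appears: each term is produced by a 2-to-1 selector, and only the final summation of the selected terms remains.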

CiNii
Odd-Even Turn Model を対象としたNoCの負荷分散による遅延時間削減手法

脇田慎吾, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 478 ) 153 - 158 2009年03月

Network-on-Chip(NoC)ではノード間通信の品質維持のために送信元ノードから宛先ノードへパケットを転送する際の平均遅延時間を低く抑える必要がある.NoCで用いられる適応型ルーティングは,経路候補を選択するRouting Functionと,その候補の中からトラフィックの分布に対して通信遅延時間を最小にする候補を使用経路に決定するSelection Functionとで構成される.現時点で主流となっているRouting FunctionにOdd-Even Turn Modelがあるが,この手法は負荷の分布を考慮した経路選択を行わないため,負荷が集中しているチャネルの使用が避けられなくなる場合があり,その結果遅延が大きくなるという問題がある.そこで本稿では,Odd-Even Turn Modelを対象としたNoCの負荷分散による遅延時間削減手法を提案する.提案手法はOdd-Even Turn Modelの経路選択方法の特徴から予めトラフィックが集中すると予測される箇所を進入制限領域なる領域に定め,進入制限領域内のチャネルをパケット転送に用いる経路候補に含ませる頻度を低くすることで,トラフィックの分散とそれに伴う遅延時間の削減を可能とする.
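
For readers unfamiliar with the Odd-Even Turn Model, the Python sketch below encodes its turn restrictions as commonly stated (no east-to-north or east-to-south turns in even columns, no north-to-west or south-to-west turns in odd columns) together with a simple load-based choice among candidate outputs. The load-based `select_output` is a generic placeholder and not the restricted-region method proposed in the abstract; all names are illustrative.

```python
# Turns forbidden by the Odd-Even Turn Model, keyed by column parity.
# A turn is written (current heading, new heading).
FORBIDDEN_TURNS = {
    "even": {("E", "N"), ("E", "S")},
    "odd": {("N", "W"), ("S", "W")},
}

def turn_allowed(column, heading, new_heading):
    """Return True if turning from heading to new_heading is legal in this column."""
    parity = "even" if column % 2 == 0 else "odd"
    return (heading, new_heading) not in FORBIDDEN_TURNS[parity]

def select_output(candidates, load):
    """Generic selection function: pick the candidate direction with the lowest load."""
    return min(candidates, key=lambda direction: load[direction])
```

The routing function yields the legal candidates and the selection function picks one; the abstract's proposal changes how often channels inside a restricted region appear among those candidates.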

CiNii
連携処理を考慮したネットワークプロセッサへの処理割り当て手法

齊藤啓太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 478 ) 147 - 152 2009年03月

近年のネットワークの広帯域化・多機能化により,汎用プロセッサよりも高速・低消費電力であり,かつASICよりも柔軟性が高いネットワークプロセッサが注目されている.ネットワークプロセッサでは,処理を複数の部分処理に分割して処理エンジン同士で通信しながらパケットを処理していく連携処理が用いられることがあるが,この際には処理の分割や処理の処理エンジンへの割り当てが必要となる.ネットワークプロセッサに求められる柔軟性と高速性を損なわないためには処理割り当てに要する時間は短く,かつハードウェアの性能を限界まで引き出すことが求められる.本稿ではバックトラッキングを用いたネットワークプロセッサへの処理割り当て手法を提案する.提案手法では既存の負荷分散に着目した手法や処理エンジン間での通信コストに着目した手法を導入し,それらの手法で得られる解の周辺に探索範囲を限定することで,短時間での割り当てと解の精度の向上を目指す.提案手法を実装して計算機実験を行い,既存手法との比較によってよりスループットの高い解が得られることを示す.

CiNii
命令メモリアクセス数削減に基づく低エネルギーASIP合成手法

小林優太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 108 ( 413 ) 147 - 152 2009年01月

本稿ではVLIW型ASIPを対象としたハードウェア/ソフトウェア(HW/SW)ASIP協調合成システムSPADESにおける消費エネルギー削減手法を提案する.ASIPにおいて命令メモリが占める消費エネルギーの割合は大きく,命令メモリの消費エネルギー削減が課題となっている.そこで我々は,1命令に逐次実行される複数の命令を格納する垂直結合命令を考え,垂直結合命令探索手法を提案する.提案する垂直結合命令探索手法はスケジューリング済みCDFGから消費エネルギー削減量が最大となるように垂直結合命令を決定する.これにより命令メモリのアクセス数を削減し,消費エネルギーを削減することができる.計算機実験により,メモリを含むプロセッサ全体で平均41.9%の消費エネルギー削減を確認した.

CiNii
組み込みシステム向けMPSoCのためのマルチレイヤ構造をとるバスアーキテクチャ最適化手法

吉田陽信, 戸川望, 柳澤政生, 大附辰夫, 橘昌良

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 108 ( 413 ) 141 - 146 2009年01月

マルチレイヤ構造をとるバスアーキテクチャを対象とし,特定のアプリケーションに適した構成を選択するためのバスアーキテクチャ最適化手法を提案する.入力としてプロセッサシミュレータから取得したアプリケーションのトレースデータと時間制約を与え,まずメモリアクセス競合を考慮せずにトレースデータから求めたデータ転送時間によって制約を満たす可能性のある構成を限定する.その後,各構成についてメモリアクセス競合を考慮したスケジューリングをすることで,制約を満たすか否かを判定をする.この時,面積の小さい構成から大きい構成の順に探索することにより面積を最小とする構成を能率良く発見することができる.計算機実験を行った結果からマルチレイヤ構造のバスを面積が同等と考えられる共有バスと比較し,有効性を確認した.また提案する探索範囲削減手法は一般的な全探索手法と比較し,8.55倍高速に最適解を求められることを示した.
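
The two-stage search in this abstract (prune with a contention-free estimate, then check candidates in ascending area order) can be sketched in Python as below. The configuration dictionaries and the two timing callbacks are hypothetical stand-ins for the trace-based estimate and the contention-aware scheduler, which the abstract does not detail.

```python
def find_min_area_config(configs, deadline, lower_bound_time, scheduled_time):
    """Return the smallest-area configuration that meets the deadline, or None.

    lower_bound_time(c): cheap estimate that ignores memory access conflicts.
    scheduled_time(c):   expensive estimate that accounts for conflicts.
    """
    # Stage 1: a configuration whose optimistic time already misses the
    # deadline can never satisfy it, so it is dropped without scheduling.
    feasible = [c for c in configs if lower_bound_time(c) <= deadline]
    # Stage 2: try survivors from small to large area; the first one that
    # passes the detailed check is the minimum-area solution.
    for c in sorted(feasible, key=lambda c: c["area"]):
        if scheduled_time(c) <= deadline:
            return c
    return None
```

Searching in ascending area order means the expensive scheduler stops at the first success instead of evaluating every configuration.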

CiNii
アプリケーションプロセッサのための高速かつ最適なパイプライン構成を持つSIMD演算ユニット合成手法

渡辺隆行, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 108 ( 413 ) 99 - 104 2009年01月

組み込みシステム向けのプロセッサに採用されるアプリケーションプロセッサには,低面積,高性能に加え,高い設計生産性が要求される.本稿では,アプリケーションプロセッサを対象としたハードウェア/ソフトウェア(HW/SW)協調合成システムSPADESにおいて,プロセッサコアに付加可能なSIMD演算ユニットを高速かつ最適なパイプライン構成で合成する手法を提案する.提案手法では,SIMD演算ユニットを遅延時間が最小となるようにパイプライン化し,SIMD演算ユニットがプロセッサコアのクリティカルパスとならない場合,クリティカルパス遅延を違反しない範囲内でSIMD演算ユニットの面積増加量が最小となる位置にパイプラインレジスタを挿入することで,従来手法よりもパイプラインレジスタ挿入時の面積増加量を抑えられる.高速に最適解を求められるため,プロセッサのアーキテクチャ構成を探索する場合にも有効である.本手法を組み込んだSIMD演算ユニット生成システムでは最終的にSIMD演算ユニットのHDL記述を生成する.計算機実験により,本手法の有効性を評価し結果を報告する.

CiNii
フロアプランを考慮した高位合成のための高速なモジュール配置手法

佐藤亘, 大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 108 ( 413 ) 93 - 98 2009年01月

近年のLSI設計プロセスの微細化に伴い,配線遅延がゲート遅延に対し相対的に増加してきている.そのため,高位合成の段階においてフロアプランを考慮する必要がある.LSI設計プロセスの微細化の一方で,Time to marketの条件が厳しく設計に割ける時間が短くなってきているため,フロアプランを考慮した高位合成を短時間で実行することが望まれる.本稿では,高位合成とフロアプランを繰り返し実行する環境の中で,高位合成の情報を利用した高速なモジュール配置手法を提案する.本手法はイタレーションしている高位合成を対象としてスケジューリング/FUバインディング工程で得られる情報を利用した構築的手法によって高速かつモジュール間の配線遅延を考慮した配置を実行する.計算機実験によって,対象とする高位合成システムに本手法を組み込んだ場合,システム全体の実行時間を平均で98%削減した.

CiNii
命令メモリアクセス数削減に基づく低エネルギーASIP合成手法

小林優太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 108 ( 414 ) 147 - 152 2009年01月

本稿ではVLIW型ASIPを対象としたハードウェア/ソフトウェア(HW/SW)ASIP協調合成システムSPADESにおける消費エネルギー削減手法を提案する.ASIPにおいて命令メモリが占める消費エネルギーの割合は大きく,命令メモリの消費エネルギー削減が課題となっている.そこで我々は,1命令に逐次実行される複数の命令を格納する垂直結合命令を考え,垂直結合命令探索手法を提案する.提案する垂直結合命令探索手法はスケジューリング済みCDFGから消費エネルギー削減量が最大となるように垂直結合命令を決定する.これにより命令メモリのアクセス数を削減し,消費エネルギーを削減することができる.計算機実験により,メモリを含むプロセッサ全体で平均41.9%の消費エネルギー削減を確認した.

CiNii
組み込みシステム向けMPSoCのためのマルチレイヤ構造をとるバスアーキテクチャ最適化手法

吉田陽信, 戸川望, 柳澤政生, 大附辰夫, 橘昌良

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 108 ( 414 ) 141 - 146 2009年01月

マルチレイヤ構造をとるバスアーキテクチャを対象とし,特定のアプリケーションに適した構成を選択するためのバスアーキテクチャ最適化手法を提案する.入力としてプロセッサシミュレータから取得したアプリケーションのトレースデータと時間制約を与え,まずメモリアクセス競合を考慮せずにトレースデータから求めたデータ転送時間によって制約を満たす可能性のある構成を限定する.その後,各構成についてメモリアクセス競合を考慮したスケジューリングをすることで,制約を満たすか否かを判定をする.この時,面積の小さい構成から大きい構成の順に探索することにより面積を最小とする構成を能率良く発見することができる.計算機実験を行った結果からマルチレイヤ構造のバスを面積が同等と考えられる共有バスと比較し,有効性を確認した.また提案する探索範囲削減手法は一般的な全探索手法と比較し,8.55倍高速に最適解を求められることを示した.

CiNii
アプリケーションプロセッサのための高速かつ最適なパイプライン構成を持つSIMD演算ユニット合成手法

渡辺隆行, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 108 ( 414 ) 99 - 104 2009年01月

組み込みシステム向けのプロセッサに採用されるアプリケーションプロセッサには,低面積,高性能に加え,高い設計生産性が要求される.本稿では,アプリケーションプロセッサを対象としたハードウェア/ソフトウェア(HW/SW)協調合成システムSPADESにおいて,プロセッサコアに付加可能なSIMD演算ユニットを高速かつ最適なパイプライン構成で合成する手法を提案する.提案手法では,SIMD演算ユニットを遅延時間が最小となるようにパイプライン化し,SIMD演算ユニットがプロセッサコアのクリティカルパスとならない場合,クリティカルパス遅延を違反しない範囲内でSIMD演算ユニットの面積増加量が最小となる位置にパイプラインレジスタを挿入することで,従来手法よりもパイプラインレジスタ挿入時の面積増加量を抑えられる.高速に最適解を求められるため,プロセッサのアーキテクチャ構成を探索する場合にも有効である.本手法を組み込んだSIMD演算ユニット生成システムでは最終的にSIMD演算ユニットのHDL記述を生成する.計算機実験により,本手法の有効性を評価し結果を報告する.

CiNii
フロアプランを考慮した高位合成のための高速なモジュール配置手法

佐藤亘, 大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 108 ( 414 ) 93 - 98 2009年01月

近年のLSI設計プロセスの微細化に伴い,配線遅延がゲート遅延に対し相対的に増加してきている.そのため,高位合成の段階においてフロアプランを考慮する必要がある.LSI設計プロセスの微細化の一方で,Time to marketの条件が厳しく設計に割ける時間が短くなってきているため,フロアプランを考慮した高位合成を短時間で実行することが望まれる.本稿では,高位合成とフロアプランを繰り返し実行する環境の中で,高位合成の情報を利用した高速なモジュール配置手法を提案する.本手法はイタレーションしている高位合成を対象としてスケジューリング/FUバインディング工程で得られる情報を利用した構築的手法によって高速かつモジュール間の配線遅延を考慮した配置を実行する.計算機実験によって,対象とする高位合成システムに本手法を組み込んだ場合,システム全体の実行時間を平均で98%削減した.

CiNii
アプリケーションプロセッサのための高速かつ最適なパイプライン構成を持つ SIMD 演算ユニット合成手法

渡辺隆行, 戸川望, 柳澤政生, 大附辰夫

研究報告システムLSI設計技術（SLDM） 2009 ( 7 ) 99 - 104 2009年01月

組み込みシステム向けのプロセッサに採用されるアプリケーションプロセッサには，低面積，高性能に加え，高い設計生産性が要求される．本稿では，アプリケーションプロセッサを対象としたハードウェア／ソフトウェア (HW/SW) 協調合成システム SPADES において，プロセッサコアに付加可能な SIMD 演算ユニットを高速かつ最適なパイプライン構成で合成する手法を提案する．提案手法では，SIMD 演算ユニットを遅延時間が最小となるようにパイプライン化し，SIMD 演算ユニットがプロセッサコアのクリティカルパスとならない場合，クリティカルパス遅延を違反しない範囲内で SIMD 演算ユニットの面積増加量が最小となる位置にパイプラインレジスタを挿入することで，従来手法よりもパイプラインレジスタ挿入時の面積増加量を抑えられる．高速に最適解を求められるため，プロセッサのアーキテクチャ構成を探索する場合にも有効である．本手法を組み込んだ SIMD 演算ユニット生成システムでは最終的に SIMD 演算ユニットの HDL 記述を生成する．計算機実験により，本手法の有効性を評価し結果を報告する．

Small area, high performance and high productivity are required for application-specific processors in embedded systems. This paper proposes a fast SIMD processing unit synthesis method with optimal pipeline architecture applied to a processor core in a hardware/software (HW/SW) co-synthesis system, SPADES, for application-specific processors. In the proposed method, if a pipelined SIMD processing unit with minimum delay is not on the critical path of a processor core, pipeline registers are inserted at the optimal positions, which cause the minimum area increase within the critical path delay of the processor core. Therefore, it can reduce the area increase compared with the conventional method. Since the proposed method finds the optimal solution quickly, it is also effective for exploring processor architecture configurations. Finally, the SIMD processing unit generation system in which the proposed method is embedded generates an HDL description of a SIMD processing unit. The experimental results show the effectiveness of this method.

CiNii
フロアプランを考慮した高位合成のための高速なモジュール配置手法

佐藤亘, 大智輝, 戸川望, 柳澤政生, 大附辰夫

研究報告システムLSI設計技術（SLDM） 2009 ( 7 ) 93 - 98 2009年01月

近年の LSI 設計プロセスの微細化に伴い配線遅延がゲート遅延に対し相対的に増加してきている．そのため，高位合成の段階においてフロアプランを考慮する必要がある．LSI 設計プロセスの微細化の一方で，Time to market の条件が厳しく設計に割ける時間が短くなってきているため，フロアプランを考慮した高位合成を短時間で実行することが望まれる．本稿では，高位合成とフロアプランを繰り返し実行する環境の中で，高位合成の情報を利用した高速なモジュール配置手法を提案する．本手法はイタレーションしている高位合成を対象としてスケジューリング/ FU バインディング工程で得られる情報を利用した構築的手法によって高速かつモジュール間の配線遅延を考慮した配置を実行する．計算機実験によって，対象とする高位合成システムに本手法を組み込んだ場合，システム全体の実行時間を平均で 98% 削減した．

As device feature size decreases, interconnect delay becomes the dominating factor of total delay. Therefore, it is necessary to consider the floorplan at the high-level synthesis stage. While device feature size decreases, time-to-market requirements are severe, and designs must be completed in a short time. It is therefore desirable to execute floorplan-aware high-level synthesis in a short time. In this paper, we propose a high-speed module placement algorithm that uses information from high-level synthesis, for a system that executes high-level synthesis and floorplanning repeatedly. Using a constructive method based on information from the scheduling/FU binding process, the algorithm quickly performs placement that takes interconnect delay between modules into account. We show the effectiveness of the proposed algorithm through experimental results.

CiNii
命令メモリアクセス数削減に基づく低エネルギー ASIP 合成手法

小林優太, 戸川望, 柳澤政生, 大附辰夫

研究報告システムLSI設計技術（SLDM） 2009 ( 7 ) 147 - 152 2009年01月

本稿では VLIW 型 ASIP を対象としたハードウェア/ソフトウェア（HW/SW） ASIP 協調合成システム SPADES における消費エネルギー削減手法を提案する．ASIP において命令メモリが占める消費エネルギーの割合は大きく，命令メモリの消費エネルギー削減が課題となっている．そこで我々は，1 命令に逐次実行される複数の命令を格納する垂直結合命令を考え，垂直結合命令探索手法を提案する．提案する垂直結合命令探索手法はスケジューリング済み CDFG から消費エネルギー削減量が最大となるように垂直結合命令を決定する．これにより命令メモリのアクセス数を削減し，消費エネルギーを削減することができる．計算機実験により，メモリを含むプロセッサ全体で平均 41.9% の消費エネルギー削減を確認した．

In this paper, we propose an energy-efficient ASIP synthesis method based on reducing instruction memory accesses. Since the instruction memory is one of the main energy consumers in an ASIP, reducing the energy consumed in the instruction memory is an important problem. We propose a vertical combined instruction that stores two or more sequentially issued instructions in a single instruction. We then propose a method to synthesize vertical combined instructions from a scheduled CDFG. Since the number of instruction memory accesses is reduced, the energy consumption can also be reduced. Experimental results confirm a reduction of approximately 41.9% in energy consumption for the whole processor system including memories.

CiNii
組み込みシステム向け MPSoC のためのマルチレイヤ構造をとるバスアーキテクチャ最適化手法

吉田陽信, 戸川望, 柳澤政生, 大附辰夫, 橘昌良

研究報告システムLSI設計技術（SLDM） 2009 ( 7 ) 141 - 146 2009年01月

マルチレイヤ構造をとるバスアーキテクチャを対象とし，特定のアプリケーションに適した構成を選択するためのバスアーキテクチャ最適化手法を提案する．入力としてプロセッサシミュレータから取得したアプリケーションのトレースデータと時間制約を与え，まずメモリアクセス競合を考慮せずにトレースデータから求めたデータ転送時間によって制約を満たす可能性のある構成を限定する．その後，各構成についてメモリアクセス競合を考慮したスケジューリングをすることで，制約を満たすか否かを判定をする．この時，面積の小さい構成から大きい構成の順に探索することにより面積を最小とする構成を能率良く発見することができる．計算機実験を行った結果からマルチレイヤ構造のバスを面積が同等と考えられる共有バスと比較し，有効性を確認した．また提案する探索範囲削減手法は一般的な全探索手法と比較し，8.55 倍高速に最適解を求められることを示した．

In this paper, we propose an on-chip bus optimization algorithm for a multi-layer bus architecture. Our algorithm efficiently searches for an optimal selection of the number and bit-size of buses, the CPU-bus connection topology, and the priority of each CPU, subject to the time constraint for given embedded applications. It is necessary to estimate the running time of applications while taking into consideration the effect of memory access conflicts. Before taking the effect of memory access conflicts into consideration, our approach removes configurations that violate the constraints. By reducing the design space in this way, we can obtain an optimal configuration in a shorter time. Our algorithm is 8.55 times faster than the exhaustive approach.

CiNii
命令メモリアクセス数削減に基づく低エネルギーASIP合成手法

小林優太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 412 ) 147 - 152 2009年01月

本稿ではVLIW型ASIPを対象としたハードウェア/ソフトウェア(HW/SW)ASIP協調合成システムSPADESにおける消費エネルギー削減手法を提案する.ASIPにおいて命令メモリが占める消費エネルギーの割合は大きく,命令メモリの消費エネルギー削減が課題となっている.そこで我々は,1命令に逐次実行される複数の命令を格納する垂直結合命令を考え,垂直結合命令探索手法を提案する.提案する垂直結合命令探索手法はスケジューリング済みCDFGから消費エネルギー削減量が最大となるように垂直結合命令を決定する.これにより命令メモリのアクセス数を削減し,消費エネルギーを削減することができる.計算機実験により,メモリを含むプロセッサ全体で平均41.9%の消費エネルギー削減を確認した.

CiNii
組み込みシステム向けMPSoCのためのマルチレイヤ構造をとるバスアーキテクチャ最適化手法

吉田陽信, 戸川望, 柳澤政生, 大附辰夫, 橘昌良

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 412 ) 141 - 146 2009年01月

マルチレイヤ構造をとるバスアーキテクチャを対象とし,特定のアプリケーションに適した構成を選択するためのバスアーキテクチャ最適化手法を提案する.入力としてプロセッサシミュレータから取得したアプリケーションのトレースデータと時間制約を与え,まずメモリアクセス競合を考慮せずにトレースデータから求めたデータ転送時間によって制約を満たす可能性のある構成を限定する.その後,各構成についてメモリアクセス競合を考慮したスケジューリングをすることで,制約を満たすか否かを判定をする.この時,面積の小さい構成から大きい構成の順に探索することにより面積を最小とする構成を能率良く発見することができる.計算機実験を行った結果からマルチレイヤ構造のバスを面積が同等と考えられる共有バスと比較し,有効性を確認した.また提案する探索範囲削減手法は一般的な全探索手法と比較し,8.55倍高速に最適解を求められることを示した.

CiNii
アプリケーションプロセッサのための高速かつ最適なパイプライン構成を持つSIMD演算ユニット合成手法

渡辺隆行, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 412 ) 99 - 104 2009年01月

組み込みシステム向けのプロセッサに採用されるアプリケーションプロセッサには,低面積,高性能に加え,高い設計生産性が要求される.本稿では,アプリケーションプロセッサを対象としたハードウェア/ソフトウェア(HW/SW)協調合成システムSPADESにおいて,プロセッサコアに付加可能なSIMD演算ユニットを高速かつ最適なパイプライン構成で合成する手法を提案する.提案手法では,SIMD演算ユニットを遅延時間が最小となるようにパイプライン化し,SIMD演算ユニットがプロセッサコアのクリティカルパスとならない場合,クリティカルパス遅延を違反しない範囲内でSIMD演算ユニットの面積増加量が最小となる位置にパイプラインレジスタを挿入することで,従来手法よりもパイプラインレジスタ挿入時の面積増加量を抑えられる.高速に最適解を求められるため,プロセッサのアーキテクチャ構成を探索する場合にも有効である.本手法を組み込んだSIMD演算ユニット生成システムでは最終的にSIMD演算ユニットのHDL記述を生成する.計算機実験により,本手法の有効性を評価し結果を報告する.

CiNii
フロアプランを考慮した高位合成のための高速なモジュール配置手法

佐藤亘, 大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 412 ) 93 - 98 2009年01月

近年のLSI設計プロセスの微細化に伴い,配線遅延がゲート遅延に対し相対的に増加してきている.そのため,高位合成の段階においてフロアプランを考慮する必要がある.LSI設計プロセスの微細化の一方で,Time to marketの条件が厳しく設計に割ける時間が短くなってきているため,フロアプランを考慮した高位合成を短時間で実行することが望まれる.本稿では,高位合成とフロアプランを繰り返し実行する環境の中で,高位合成の情報を利用した高速なモジュール配置手法を提案する.本手法はイタレーションしている高位合成を対象としてスケジューリング/FUバインディング工程で得られる情報を利用した構築的手法によって高速かつモジュール間の配線遅延を考慮した配置を実行する.計算機実験によって,対象とする高位合成システムに本手法を組み込んだ場合,システム全体の実行時間を平均で98%削減した.

CiNii
楕円曲線暗号に対するスキャンベース攻撃

奈良竜太, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2009 ( 7 ) 2009年

J-GLOBAL
道路標識とランドマークを用いた歩行者位置特定システムと実地調査による評価

児島伴幸, 山根和也, 柳澤政生, 大附辰夫, 戸川望

情報処理学会シンポジウムシリーズ(CD-ROM) 2009 ( 1 ) 2009年

J-GLOBAL
ルータの負荷分散と制御パケット数削減を目的としたエニーキャスト経路選択手法

横田雅之, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. NS, ネットワークシステム 108 ( 359 ) 13 - 18 2008年12月

エニーキャスト通信では,クライアント側はエニーキャストアドレスを指定するだけで特定のアプリケーションを提供する複数のサーバの中から最適なサーバと自動的に通信することができる.通信時間が最小になるサーバへの経路選択は経路のホップ数やサーバの処理時間などで判断するが,多数のクライアントが存在する場合,ネットワークの輻輳によるルータの負荷も考慮しなければならない.本稿では,ルータの負荷分散と制御パケット数削減を目的としたエニーキャスト経路選択手法を提案する.提案手法はCore-Based Tree Method (CBT)を基に,負荷が集中するリッジルータとそれに隣接する複数のリッジルータを構成要素とするPartial木を構築する.リッジルータの負荷状況に応じて構築されるPartial木を用いて経路を選択することでリッジルータの負荷が分散し,負荷情報をやり取りする制御パケット数も削減されるため,ネットワーク全体のトラヒックが小さくなり,全体の通信時間を短くできる.シミュレーションによって提案手法の有効性を示し,想定環境におけるPartial木の最適数を考察する.

CiNii
高速移動体のためのNEMOを用いた高速ハンドオフ手法

田中敦樹, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. NS, ネットワークシステム 108 ( 359 ) 89 - 94 2008年12月

車や電車内部からインターネットを利用する手法としてNEMO (Network Mobility)が提案されている.NEMO方式は理論値で54Mbpsの通信速度が実現可能であるが,ハンドオフに伴い通信不能時間が発生してしまう.F-HMIPv6をNEMOに適用することで,ハンドオフ時間を削減可能であるが,新幹線のような高速移動体で適用できない.本稿では電車内に設置されているMR (Mobile Router)が電車の速度情報などを利用し,AP (Access Point)からのL2ビーコンを受信前に次に使用予定のIPアドレスを取得することで,高速移動体において適用可能な高速ハンドオフ手法を提案する.提案手法により,新幹線のような高速移動体でもMAP (Mobile Anchor Point)を用いた高速ハンドオフが可能となる.提案手法の有効性をNetwork Simulator 2 (NS-2)で検証する.

CiNii
レジスタ分散型アーキテクチャを対象としたフロアプラン指向高位合成のためのマルチプレクサ削減手法

遠藤哲弥, 大智輝, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 111 ) 145 - 150 2008年11月

リソース共有型の高位合成において，バインディング結果として演算器やレジスタの入力側にマルチプレクサが挿入される．マルチプレクサ数の増加は回路の面積増加や性能低下の原因となるため，高位合成の段階で考慮する必要がある．本稿では，レジスタ分散型アーキテクチャを対象としたフロアプラン指向高位合成のためのマルチプレクサ削減手法を提案する．提案手法は，データ転送回数テーブルを利用したスケジューリング / FU バインディング手法，演算ノードの割当コントロールステップならびに割当 FU の局所変更を行う FU Connect Reduction 手法，モジュール間ポート再割当を行う Port Re-Assignment 手法，の 3 手法によりマルチプレクサ数を削減する．対象とする高位合成に提案手法を組み込む事で，平均で 15.4% のマルチプレクサ数，9.9% の面積が削減でき有効性を確認した．

In high-level synthesis for resource-shared architectures, multiplexers are inserted between registers and functional units as a result of binding. Multiplexer reduction is necessary for the area and performance of the synthesized circuit. In this paper, we propose multiplexer reduction algorithms in floorplan-aware high-level synthesis for distributed-register architectures. These algorithms can reduce the number of multiplexers in conventional high-level synthesis. We show the effectiveness of the proposed algorithms through experimental results.

CiNii
組み込みシステムの2階層キャッシュとスクラッチパッドメモリのシミュレーション手法

東條信明, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 111 ) 97 - 102 2008年11月

本稿では複数の 2 階層キャッシュ構成およびスクラッチパッドメモリを含めたメモリ構成のシミュレーション手法を提案する．本手法は，アプリケーションソースコードを入力とし，メインメモリ，スクラッチパッドメモリ，L1，L2 キャッシュからなるメモリ階層を，キャッシュの性質を利用した手法を導入することで正確かつ高速にシミュレーションし，各構成のキャッシュヒット数およびキャッシュミス数を求めることを目的としている．また，評価のために総メモリアクセス時間あるいは総メモリ消費エネルギーが最小となるように SPM ・キャッシュ構成を最適化するシステムを実装し，その有効性について確認した．

In an embedded system where a single application or a class of applications is repeatedly executed on a processor, its memory configuration can be customized such that an optimal one is achieved. We can obtain an optimal two-level cache and scratch pad memory configuration which minimizes overall memory access time or energy consumption by varying seven parameters: the number of sets of the L1/L2 caches, the line size of the L1/L2 caches, the associativity of the L1/L2 caches, and the size of the scratch pad memory. In this paper, we propose two-level cache and scratch pad memory design space exploration algorithms: CRCB-T and CRCB-S. Our proposed approach runs up to 3172.94 times faster than the conventional exhaustive approach.

CiNii
クロスバスイッチを用いたS-Box切替によるAES暗号処理回路のパワーマスキング手法

川畑伸幸, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 111 ) 61 - 66 2008年11月

共通鍵暗号規格の一つである AES は専用処理ハードウェアが搭載された IC チップ等の組込み機器上での使用例が多く，格納された共通鍵は外部に対して秘密であることが前提とされている．しかし，暗号処理演算中に発生する物理量を解析して共通鍵を解読するサイドチャネル攻撃と呼ばれる攻撃法が提案されその危険性が指摘されている．中でも電力差分解析攻撃 (DPA) は最も危険性が懸念されている攻撃法の一つであり，DPA への耐性を考慮した専用ハードウェアの設計が要求されている．本稿では， AES の SubBytes 処理にて複数の S-Box 回路を用いて並列処理させる場合に，クロスバスイッチを用いて消費電力の異なる複数の S-Box をランダムで切り替え消費電力を攪拌する手法を提案する．提案手法の実装をして評価および結果を報告する．

AES is one of the common key cryptosystems often used on embedded systems, IC chips, and other devices. Their common key must be kept secret from others. However, it can be deciphered by side channel attacks, which crack cryptosystems by analyzing physical quantities generated during encryption processing. Among side channel attacks, differential power analysis (DPA) is known as one of the most dangerous attack methods. AES circuits therefore need to be designed with resistance to DPA in mind. To design a DPA-resistant AES circuit, we propose a power masking SubBytes circuit which switches among several S-Boxes, each of which has a different power consumption. We present our evaluation and results.

CiNii
暗号回路における動的に構造変化するセキュアスキャンアーキテクチャ

跡部浩士, 奈良竜太, 史又華, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 111 ) 55 - 59 2008年11月

スキャンテストはスキャンチェインを用いた手法で一般的かつ強力なテスト手法である．しかし，スキャンチェインは外部から回路内部の情報を取得できるため，暗号回路においては有効な攻撃手段となりえる．本稿ではスキャンベース攻撃に対するセキュアスキャンアーキテクチャを提案する．スキャンチェイン内のランダムな場所にインバータを挿入し，スキャンチェインの構造を複雑にする防御手法が提案されているが，回路設計時にインバータを挿入する場所が固定されてしまうため，その特徴を利用し攻撃される可能性がある．したがって，設計した後にもスキャンチェインの構造を動的に変化させる必要があると考えられる．我々はラッチを用いて過去の FF の状態を利用することで次のスキャン FF への出力を変化させる状態依存スキャン FF (SDSFF) を提案する．このスキャン FF を用いることでスキャンチェインの構造を動的に変化させることが可能であり，コントローラを必要としないため面積オーバーヘッドも少ない．AES 暗号回路に提案手法を実装し，評価を行った．

Scan test is a powerful and popular test technique because it can control and observe the internal states of the circuit under test. However, scan chains can be used to discover the internals of crypto hardware, which presents a significant security risk of information leakage. An interesting design-for-test technique that inserts inverters into the internal scan chains to complicate the scan structure has recently been presented. Unfortunately, it still carries the potential of being attacked through statistical analysis of the information scanned out from chips. Therefore, in this paper we propose a secure scan architecture, called dynamic variable secure scan, against scan-based side channel attacks. The modified scan flip-flops are state-dependent (SDSFFs): the output of each SDSFF may or may not be inverted, which makes it more difficult to discover the internal scan architecture. We analyzed an AES implementation to show the effectiveness of the proposed method and discuss how our approach resists scan-based side channel attacks.

CiNii
周辺回路を含むAES-LSIヘのスキャンベース攻撃

奈良竜太, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 111 ) 49 - 53 2008年11月

暗号 LSI に対するサイドチャネル攻撃の危険性が指摘されているなか，スキャンチェインを利用して秘密鍵を解読するスキャンベース攻撃が注目されている．スキャンチェインは必須の LSI テスト技術である一方，LSI 内部のレジスタを直接観測できるため，暗号回路の秘密鍵解読に利用されている．従来のスキャンベース攻撃は暗号回路だけのレジスタだけで構成されたスキャンチェインにのみ有効であり，周辺回路のレジスタを考慮していない欠点があった．そこで本稿では暗号回路以外のレジスタがスキャンチェインに含まれていても秘密鍵を解読する手法を提案する．特定のレジスタに着目し，その値の変化を見ることで秘密鍵を解析する．他のレジスタに影響を受けないため，スキャンチェインの構成に依存しない．そのため，周辺回路を含んだ，より現実に近い暗号 LSI に対しスキャンベース攻撃することができる．

The threat of side-channel attacks against cryptography LSIs has been pointed out. In particular, scan-based attacks, which use the scan chain, are attracting attention. Scan chains are one of the most important testing techniques, but they can also be used for attacks against cryptography LSIs. Conventional scan-based attacks only consider scan chains made up of the registers of cryptography circuits. However, a cryptography LSI usually contains many IPs such as memories, microprocessors, and other circuits. Because a real scan chain consists of many kinds of registers, it is unclear whether conventional scan-based attacks succeed. In this paper, we propose a scan-based attack that can recover the secret key of an AES-LSI that includes other IPs. By focusing on the bit pattern of a specific register and monitoring its changes, our method eliminates the influence of the registers of other circuits. Therefore, our scan-based attack does not depend on the architecture of the scan chain, and it can attack realistic cryptography LSIs that include other IPs.

CiNii
歩行者向けデフォルメ地図生成のための並列処理ハードウェアエンジンの設計

荒幡明, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 111 ) 43 - 48 2008年11月

現在携帯端末へ地図情報を配信するサービスが普及しているが，それらの地図の多くは PC 用の地図であり，微細な携帯端末用ディスプレイでの表示には適していない．地図情報は性質上リアルタイムの更新を必要とするため，あらかじめ視認性の高いデフォルメ化された地図を作成しておくのは現実的ではない．そのため地図情報を自動的にデフォルメ化する技術が多数開発・提案されているが，デフォルメ化処理をサーバ上で処理するとデフォルメ化地図をユーザの嗜好に合わせるのは困難である．携帯電話の処理能力ではデフォルメ化処理は処理量が大きく，現実的には実行不可能な程の処理時間を必要とする．本稿では携帯電話向けデフォルメ地図生成処理向け並列処理ハードウェアエンジンを提案する．携帯電話上でデフォルメ地図生成処理を可能とするために，処理量の削減に加えて，処理の並列化と並列処理ハードウェアエンジンを提案し，デフォルメ地図生成処理のボトルネック部分をハードウェア処理することで処理時間を短縮した．提案した並列処理ハードウェアエンジンによりデフォルメ地図生成処理は携帯電話上で 1 秒以内で処理出来る．

Recently, services that distribute map information to mobile devices have become popular; however, those maps are generally intended for PC use and are not suitable for the small displays of mobile devices. Because map information by its nature has to be updated in real time, it is not realistic to prepare easy-to-read deformed maps in advance. For that reason, many techniques for automatically deforming map data have been proposed, but it is difficult to tailor a deformed map to the user's preferences when the map is processed on servers. On the other hand, map deformation involves a large amount of processing, and with the processing power of mobile devices it requires a practically infeasible processing time. In this paper, we propose a parallel processing hardware engine for map deformation on mobile devices. We reduce processing time by processing the bottleneck of map deformation in hardware. With the proposed parallel processing hardware engine, map deformation can be processed within 1 second on a mobile phone.

CiNii
高効率列処理演算器によるマルチレート対応高スループットイレギュラーLDPC復号器の実装と評価

長島諒侑, 今井優太, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 111 ) 37 - 42 2008年11月

Wireless communication is expanding rapidly as wireless terminals such as mobile phones and wireless LANs spread and broadcasting goes digital. Maintaining high communication quality on wireless terminals, whose channel conditions fluctuate easily, is a major challenge. LDPC (Low Density Parity Check) codes have high error-correcting capability, have attracted attention as next-generation error-correcting codes, and are specified in IEEE 802.11n. This paper proposes a multi-rate decoder that can decode the irregular parity-check matrices defined in IEEE 802.11n. By exploiting the parallelism of column operations, the decoder shares adder circuits across code rates and code lengths and raises the utilization of its arithmetic units. Parallelism exists in both the row and column directions; because existing work does not exploit row-direction parallelism, the proposed decoder exploits both to achieve high throughput. In addition, the parallelism of the column-operation units increases as the code rate rises, and the drop in unit utilization at short code lengths is suppressed, so throughput exceeds that of conventional methods. Compared with existing work, the proposed method reduces area by 12.5% and improves throughput by 81%.
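
As a rough illustration of the column operation that this decoder parallelizes, the sketch below performs the usual variable-node (column) update of LLR-based LDPC decoding: the channel LLR and all incoming check-to-variable messages are added once, and each outgoing message is obtained by subtracting the message that arrived on the same edge. The shared total is what lets one adder structure serve columns of different weight. The function names and the message layout are assumptions made for illustration; this is not the hardware design of the paper.

```python
def column_update(channel_llr, incoming):
    """Variable-node (column) update for one column of the parity-check matrix.

    channel_llr: LLR of the received bit
    incoming:    check-to-variable messages, one per nonzero entry in the column
    Returns (outgoing messages, posterior LLR).
    """
    # One shared sum for the whole column, whatever the column weight is.
    total = channel_llr + sum(incoming)
    # Each outgoing message excludes the message received on the same edge.
    outgoing = [total - m for m in incoming]
    return outgoing, total


def hard_decision(posterior_llr):
    # Assumed convention: a non-negative LLR means bit 0.
    return 0 if posterior_llr >= 0 else 1
```

Because the total is computed once per column, a column of weight 2 and a column of weight 11 differ only in how many terms enter the sum.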

CiNii
レジスタ分散型アーキテクチャを対象としたフロアプラン指向高位合成のためのマルチプレクサ削減手法

遠藤哲弥, 大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 298 ) 145 - 150 2008年11月

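In resource-sharing high-level synthesis, binding inserts multiplexers on the input side of functional units and registers. Because a growing number of multiplexers increases circuit area and degrades performance, it must be considered at the high-level synthesis stage. This paper proposes a multiplexer reduction method for floorplan-oriented high-level synthesis targeting distributed-register architectures. The method reduces the number of multiplexers with three techniques: scheduling/FU binding that uses a data-transfer count table, FU Connect Reduction, which locally changes the control step and the FU assigned to operation nodes, and Port Re-Assignment, which reassigns ports between modules. Incorporating the method into the target high-level synthesis flow reduced the number of multiplexers by 15.4% and the area by 9.9% on average, confirming its effectiveness.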

CiNii
組み込みシステムの2階層キャッシュとスクラッチパッドメモリのシミュレーション手法

東條信明, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 298 ) 97 - 102 2008年11月

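This paper proposes a simulation method for memory configurations that include multiple two-level cache configurations and a scratchpad memory. Taking application source code as input, the method simulates a memory hierarchy consisting of main memory, scratchpad memory (SPM), and L1 and L2 caches accurately and quickly by introducing techniques that exploit cache properties, with the aim of obtaining the number of cache hits and cache misses for each configuration. For evaluation, we implemented a system that optimizes the SPM/cache configuration so that total memory access time or total memory energy consumption is minimized, and confirmed its effectiveness.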
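
To make the quantity being computed concrete, here is a minimal sketch of counting hits and misses for one cache configuration over an address trace. It is a plain set-associative LRU simulation of a single cache level, with an assumed function name and an assumed LRU replacement policy; the speed-up techniques, the second cache level, and the scratchpad handling of the paper are not reproduced.

```python
from collections import OrderedDict


def simulate_cache(addresses, cache_size, block_size, associativity):
    """Count (hits, misses) of a set-associative LRU cache for an address trace.

    Sizes are in bytes; cache_size must be a multiple of block_size * associativity.
    """
    num_sets = cache_size // (block_size * associativity)
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)         # mark as most recently used
        else:
            misses += 1
            if len(lines) >= associativity:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return hits, misses
```

Running this once per candidate configuration is the brute-force baseline that a faster method would aim to beat.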

CiNii
クロスバスイッチを用いた S-Box 切替によるAES暗号処理回路のパワーマスキング手法

川畑伸幸, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 298 ) 61 - 66 2008年11月

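AES, one of the symmetric-key cryptographic standards, is widely used in embedded devices such as IC chips equipped with dedicated processing hardware, and the stored key is assumed to remain secret from the outside. However, side-channel attacks, which recover the key by analyzing physical quantities emitted during cryptographic computation, have been proposed and their danger has been pointed out. Differential power analysis (DPA) is regarded as one of the most dangerous of these attacks, and dedicated hardware must be designed with DPA resistance in mind. For the case where the AES SubBytes step is processed in parallel by multiple S-Box circuits, this paper proposes a method that uses a crossbar switch to switch randomly among multiple S-Boxes with different power consumption, thereby scrambling the power consumption. We implement the proposed method and report the evaluation results.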

CiNii
暗号回路における動的に構造変化するセキュアスキャンアーキテクチャ

跡部浩士, 奈良竜太, 史又華, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 298 ) 55 - 59 2008年11月

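Scan test, which uses scan chains, is a common and powerful test technique. However, because a scan chain allows internal circuit information to be obtained from outside, it can become an effective means of attack on cryptographic circuits. This paper proposes a secure scan architecture against scan-based attacks. A defense that inserts inverters at random positions in the scan chain to complicate its structure has been proposed, but the inverter positions are fixed at design time, and an attacker may exploit that property. The structure of the scan chain therefore needs to change dynamically even after design. We propose the state-dependent scan FF (SDSFF), which uses a latch to hold a past FF state and changes the output to the next scan FF accordingly. With this scan FF the scan-chain structure can be changed dynamically, and because no controller is needed the area overhead is small. We implemented the proposed method on an AES cryptographic circuit and evaluated it.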

CiNii
周辺回路を含むAES-LSIへのスキャンベース攻撃

奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 298 ) 49 - 53 2008年11月

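Side-channel attacks on cryptographic LSIs are a recognized threat, and scan-based attacks, which recover the secret key through the scan chain, have drawn particular attention. Scan chains are an essential LSI test technique, but because they allow internal registers to be observed directly, they can be exploited to recover the secret key of a cryptographic circuit. Conventional scan-based attacks work only when the scan chain consists solely of registers of the cryptographic circuit and do not account for registers of peripheral circuits. This paper proposes a method that recovers the secret key even when the scan chain contains registers other than those of the cryptographic circuit. The method focuses on a specific register and analyzes the key by observing how its value changes. Because it is unaffected by other registers, it does not depend on the scan-chain configuration. As a result, scan-based attacks can be mounted on more realistic cryptographic LSIs that include peripheral circuits.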

CiNii
歩行者向けデフォルメ地図生成のための並列処理ハードウェアエンジンの設計

荒幡明, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 298 ) 43 - 48 2008年11月

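Services that deliver map information to mobile devices are now widespread, but most of these maps are designed for PCs and are not suited to the small displays of mobile devices. Because map information by nature requires real-time updates, preparing highly legible deformed (simplified) maps in advance is impractical. Many techniques for automatically deforming map information have therefore been developed and proposed, but when the deformation is performed on a server it is difficult to tailor the deformed map to each user's preferences. On the other hand, deformation is computationally heavy for a mobile phone and requires a processing time that is infeasible in practice. This paper proposes a parallel-processing hardware engine for deformed-map generation on mobile phones. To make deformed-map generation possible on a mobile phone, we reduce the amount of processing, parallelize it, and execute the bottleneck portion in hardware, which shortens the processing time. With the proposed engine, deformed-map generation completes within 1 second on a mobile phone.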

CiNii
高効率列処理演算器によるマルチレート対応高スループットイレギュラーLDPC復号器の実装と評価

長島諒侑, 今井優太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 298 ) 37 - 42 2008年11月

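Wireless communication is expanding rapidly as wireless terminals such as mobile phones and wireless LANs spread and broadcasting goes digital. Maintaining high communication quality on wireless terminals, whose channel conditions fluctuate easily, is a major challenge. LDPC (Low Density Parity Check) codes have high error-correcting capability, have attracted attention as next-generation error-correcting codes, and are specified in IEEE 802.11n. This paper proposes a decoder that can decode the irregular parity-check matrices defined in IEEE 802.11n. By exploiting the parallelism of column operations, the decoder shares adder circuits across code rates and code lengths and raises the utilization of its arithmetic units. Parallelism exists in both the row and column directions; because existing work does not exploit row-direction parallelism, the proposed decoder exploits both to achieve high throughput. In addition, the parallelism of the column-operation units increases as the code rate rises, and the drop in unit utilization at short code lengths is suppressed, so throughput exceeds that of conventional methods. Compared with existing work, the proposed method reduces area by 12.5% and improves throughput by 81%.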

CiNii
レジスタ分散型アーキテクチャを対象としたフロアプラン指向高位合成のためのマルチプレクサ削減手法

遠藤哲弥, 大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 108 ( 299 ) 145 - 150 2008年11月

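In resource-sharing high-level synthesis, binding inserts multiplexers on the input side of functional units and registers. Because a growing number of multiplexers increases circuit area and degrades performance, it must be considered at the high-level synthesis stage. This paper proposes a multiplexer reduction method for floorplan-oriented high-level synthesis targeting distributed-register architectures. The method reduces the number of multiplexers with three techniques: scheduling/FU binding that uses a data-transfer count table, FU Connect Reduction, which locally changes the control step and the FU assigned to operation nodes, and Port Re-Assignment, which reassigns ports between modules. Incorporating the method into the target high-level synthesis flow reduced the number of multiplexers by 15.4% and the area by 9.9% on average, confirming its effectiveness.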

CiNii
組み込みシステムの2階層キャッシュとスクラッチパッドメモリのシミュレーション手法

東條信明, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 108 ( 299 ) 97 - 102 2008年11月

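This paper proposes a simulation method for memory configurations that include multiple two-level cache configurations and a scratchpad memory. Taking application source code as input, the method simulates a memory hierarchy consisting of main memory, scratchpad memory (SPM), and L1 and L2 caches accurately and quickly by introducing techniques that exploit cache properties, with the aim of obtaining the number of cache hits and cache misses for each configuration. For evaluation, we implemented a system that optimizes the SPM/cache configuration so that total memory access time or total memory energy consumption is minimized, and confirmed its effectiveness.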

CiNii
クロスバスイッチを用いた S-Box 切替によるAES暗号処理回路のパワーマスキング手法

川畑伸幸, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 108 ( 299 ) 61 - 66 2008年11月

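AES, one of the symmetric-key cryptographic standards, is widely used in embedded devices such as IC chips equipped with dedicated processing hardware, and the stored key is assumed to remain secret from the outside. However, side-channel attacks, which recover the key by analyzing physical quantities emitted during cryptographic computation, have been proposed and their danger has been pointed out. Differential power analysis (DPA) is regarded as one of the most dangerous of these attacks, and dedicated hardware must be designed with DPA resistance in mind. For the case where the AES SubBytes step is processed in parallel by multiple S-Box circuits, this paper proposes a method that uses a crossbar switch to switch randomly among multiple S-Boxes with different power consumption, thereby scrambling the power consumption. We implement the proposed method and report the evaluation results.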

CiNii
暗号回路における動的に構造変化するセキュアスキャンアーキテクチャ

跡部浩士, 奈良竜太, 史又華, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 108 ( 299 ) 55 - 59 2008年11月

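Scan test, which uses scan chains, is a common and powerful test technique. However, because a scan chain allows internal circuit information to be obtained from outside, it can become an effective means of attack on cryptographic circuits. This paper proposes a secure scan architecture against scan-based attacks. A defense that inserts inverters at random positions in the scan chain to complicate its structure has been proposed, but the inverter positions are fixed at design time, and an attacker may exploit that property. The structure of the scan chain therefore needs to change dynamically even after design. We propose the state-dependent scan FF (SDSFF), which uses a latch to hold a past FF state and changes the output to the next scan FF accordingly. With this scan FF the scan-chain structure can be changed dynamically, and because no controller is needed the area overhead is small. We implemented the proposed method on an AES cryptographic circuit and evaluated it.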

CiNii
周辺回路を含むAES-LSIへのスキャンベース攻撃

奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 108 ( 299 ) 49 - 53 2008年11月

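Side-channel attacks on cryptographic LSIs are a recognized threat, and scan-based attacks, which recover the secret key through the scan chain, have drawn particular attention. Scan chains are an essential LSI test technique, but because they allow internal registers to be observed directly, they can be exploited to recover the secret key of a cryptographic circuit. Conventional scan-based attacks work only when the scan chain consists solely of registers of the cryptographic circuit and do not account for registers of peripheral circuits. This paper proposes a method that recovers the secret key even when the scan chain contains registers other than those of the cryptographic circuit. The method focuses on a specific register and analyzes the key by observing how its value changes. Because it is unaffected by other registers, it does not depend on the scan-chain configuration. As a result, scan-based attacks can be mounted on more realistic cryptographic LSIs that include peripheral circuits.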

CiNii
歩行者向けデフォルメ地図生成のための並列処理ハードウェアエンジンの設計

荒幡明, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 108 ( 299 ) 43 - 48 2008年11月

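Services that deliver map information to mobile devices are now widespread, but most of these maps are designed for PCs and are not suited to the small displays of mobile devices. Because map information by nature requires real-time updates, preparing highly legible deformed (simplified) maps in advance is impractical. Many techniques for automatically deforming map information have therefore been developed and proposed, but when the deformation is performed on a server it is difficult to tailor the deformed map to each user's preferences. On the other hand, deformation is computationally heavy for a mobile phone and requires a processing time that is infeasible in practice. This paper proposes a parallel-processing hardware engine for deformed-map generation on mobile phones. To make deformed-map generation possible on a mobile phone, we reduce the amount of processing, parallelize it, and execute the bottleneck portion in hardware, which shortens the processing time. With the proposed engine, deformed-map generation completes within 1 second on a mobile phone.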

CiNii
高効率列処理演算器によるマルチレート対応高スループットイレギュラーLDPC復号器の実装と評価

長島諒侑, 今井優太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング : IEICE technical report 108 ( 299 ) 37 - 42 2008年11月

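Wireless communication is expanding rapidly as wireless terminals such as mobile phones and wireless LANs spread and broadcasting goes digital. Maintaining high communication quality on wireless terminals, whose channel conditions fluctuate easily, is a major challenge. LDPC (Low Density Parity Check) codes have high error-correcting capability, have attracted attention as next-generation error-correcting codes, and are specified in IEEE 802.11n. This paper proposes a decoder that can decode the irregular parity-check matrices defined in IEEE 802.11n. By exploiting the parallelism of column operations, the decoder shares adder circuits across code rates and code lengths and raises the utilization of its arithmetic units. Parallelism exists in both the row and column directions; because existing work does not exploit row-direction parallelism, the proposed decoder exploits both to achieve high throughput. In addition, the parallelism of the column-operation units increases as the code rate rises, and the drop in unit utilization at short code lengths is suppressed, so throughput exceeds that of conventional methods. Compared with existing work, the proposed method reduces area by 12.5% and improves throughput by 81%.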

CiNii
Floorplan-Driven High-Level Synthesis for Distributed/Shared-Register Architectures (IPSJ Transactions on System LSI Design Methodology Vol.1)

OHCHI AKIRA, KOHARA SHUNITSU, TOGAWA NOZOMU

情報処理学会論文誌論文誌トランザクション 2008 ( 1 ) 78 - 90 2008年11月

CiNii
MANETにおけるGPSの位置情報を用いたハイブリッド型ルーティングプロトコル (アドホックネットワーク)

三浦俊祐, 戸川望, 柳澤政生

電子情報通信学会技術研究報告 108 ( 251 ) 17 - 22 2008年10月

CiNii
特集「組込みシステム工学」の編集にあたって

平山雅之, 戸川望

情報処理学会論文誌 49 ( 10 ) 3450 - 3450 2008年10月

CiNii
SIMD型プロセッサコアの面積/遅延見積り

山崎大輔, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会論文誌 49 ( 10 ) 3462 - 3481 2008年10月

Automatic synthesis of an ASIP (Application Specific Instruction Processor) determines the configuration best suited to a target application and designs the hardware and software parts of the processor at the same time. If logic synthesis were run for every candidate configuration during the search, the search would take an enormous amount of time, so area/delay estimates must be used as the evaluation metric to explore quickly without logic synthesis. Estimation accuracy also matters, because a large gap between the estimates and the logic-synthesis values may lead the search to an inadequate solution. This paper proposes area/delay estimation equations for SIMD processor cores that follow changes in the configuration of the SIMD functional units and addressing units. The equations are derived by partitioning the processor core and its attached hardware units into functional parts and parameterizing them, and they yield the area and delay of a desired architecture without logic synthesis. The relative error between the estimated core area and the logic-synthesis value was 2.25% on average, and the delay error was 0.54 ns on average.

CiNii
再構成型プロセッサFE-GAへのデータフローグラフマッピング手法

本間雅行, 田村亮, 戸川望, 柳澤政生, 大附辰夫, 佐藤真琴

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 224 ) 7 - 12 2008年09月

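Recent digital devices must process huge amounts of highly diverse data in a short time. Reconfigurable processors, which can operate many arithmetic units in parallel, are a new architecture that meets this demand. Here we focus on FE-GA (Flexible Engine/Generic ALU array), a dynamically reconfigurable processor for digital media processing. Development tools for FE-GA are not yet established. This paper therefore proposes an FE-GA mapping algorithm that eases design for FE-GA and reduces development cost. Given a particular data flow graph (DFG) as input, the algorithm generates and converts a mapping onto FE-GA and automatically produces FE-GA assembly code. Loading this generated assembly code into a dedicated tool called FEEditor automates the mapping. The proposed method places and routes nodes one by one onto the FE-GA arithmetic cell array in level order, from the input side of the DFG toward the output side. The first node is mapped preferentially to the upper left, and each subsequent node is positioned according to the location of the node that produces its input data. Repeating this process completes the mapping. We applied the method to eight DFGs and computed cycle counts and execution times; mapping succeeded for all of them.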
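
The level-order placement described in this abstract can be sketched as follows. This is only a toy model under stated assumptions: the cell array is a plain rows x cols grid, "near the producer" is taken to mean the free cell with the smallest Manhattan distance to the first producer, and routing and assembly generation are left out. The function name `map_dfg` and these rules are hypothetical and are not taken from the paper.

```python
def map_dfg(nodes, edges, rows, cols):
    """Place DFG nodes on a rows x cols grid in level order.

    nodes: list of node names; edges: list of (producer, consumer) pairs.
    Returns {node: (row, col)}. Raises ValueError if the grid is too small.
    """
    preds = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)

    level = {}

    def get_level(n):
        # Level = longest distance from a primary input of the DFG.
        if n not in level:
            level[n] = 0 if not preds[n] else 1 + max(get_level(p) for p in preds[n])
        return level[n]

    order = sorted(nodes, key=lambda n: (get_level(n), nodes.index(n)))
    free = [(r, c) for r in range(rows) for c in range(cols)]
    place = {}
    for n in order:
        # Input nodes gravitate to the upper-left corner; other nodes
        # are anchored on the cell of their first producer.
        anchor_r, anchor_c = place[preds[n][0]] if preds[n] else (0, 0)
        cell = min(free, key=lambda rc: (abs(rc[0] - anchor_r) + abs(rc[1] - anchor_c), rc))
        free.remove(cell)
        place[n] = cell
    return place
```

Processing nodes in level order guarantees that every producer already has a position when its consumer is placed.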

CiNii
ディジタルメディア向け動的再構成型プロセッサFE-GAへのFFTマッピングとその自動化手法

田村亮, 本間雅行, 戸川望, 柳澤政生, 大附辰夫, 佐藤真琴

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 224 ) 13 - 18 2008年09月

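As a first step toward a CAD tool that maps applications automatically onto the dynamically reconfigurable processor FE-GA (Flexible Engine/Generic ALU array), this paper restricts the application to FFT and considers mapping various FFTs onto FE-GA. First, we design an FFT on FE-GA. Second, we analyze the resulting FE-GA assembly code and, based on that analysis, propose an automated mapping method. Within the architectural limits of FE-GA, the proposed method succeeded in automatically mapping FFTs with an arbitrary number of data points.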

CiNii
ビットレベル式変形によるセレクタ帰着型バタフライ演算器の設計と評価

名村健, 戸川望, 柳澤政生, 大附辰夫, 外村元伸

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 224 ) 31 - 36 2008年09月

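Arithmetic circuits reduced to selector logic have been proposed as one approach to high-speed arithmetic circuit design. To meet the high performance demands of FFT, this paper proposes a selector-based butterfly unit that reduces carry propagation by transforming the butterfly equations at the bit level. Evaluation experiments confirmed that, in designs with an area constraint, the proposed butterfly circuit is 37.1% to 49.8% faster than a conventional structure designed with arithmetic operators.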

CiNii
認知科学を応用した微小画面向け略地図生成手法とその統計的評価

二宮直也, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会論文誌. A, 基礎・境界 = The transactions of the Institute of Electronics, Information and Communication Engineers. A 91 ( 9 ) 869 - 882 2008年09月

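With the spread of mobile-phone location services and Internet services, the use of map services for pedestrians is expanding. Accordingly, techniques for automatically generating simplified maps that work well on mobile terminals with small displays are being actively studied. Conventional methods, based on making road shapes horizontal or vertical and quantizing intersection angles, can generate well-designed, grid-like simplified maps, but such maps are not necessarily ones on which users are unlikely to get lost. This paper focuses on cognitive science, namely how pedestrians perceive road shapes and landmarks, and proposes a simplified-map generation method that reflects these findings. The method first uses a route search result to extract road network data along the route. Second, it simplifies the extracted road network data based on cognitive science. Finally, it draws the simplified map with vector graphics. A questionnaire after the evaluation experiment confirmed that about 70% to 90% of users of maps generated by the proposed method rated them "good" in terms of legibility and ease of wayfinding.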

CiNii
道路ネットワーク分割に基づく高速エリア略地図生成手法

松本和也, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 108 ( 171 ) 25 - 30 2008年07月

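As mobile phones become more capable and compact, the use of map services based on GPS and navigation systems is expanding, and demand is expected to grow from urban to suburban areas. However, displaying a map on a mobile terminal with a small screen and limited processing power requires generating a simplified map suited to that screen. This paper proposes a simplified-map generation method applicable not only to urban areas, which consist mainly of straight lines, but also to suburban areas, which contain both straight lines and curves. The method divides the road network of the whole area into several groups and, for each group, thins out the data while straightening or curving the roads, so that the generated map suits both urban and suburban areas. Applying the method to input data for ten urban and ten suburban locations confirmed that it reduces the data volume and produces easy-to-read simplified maps for urban areas and for curve-rich suburban areas alike.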

CiNii
屋内環境におけるユーザの経路嗜好調査とこれに基づく経路探索手法

山岸敬弘, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 108 ( 171 ) 31 - 36 2008年07月

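In recent years mobile communication services have expanded greatly with the spread of mobile phones and are being put into practical use. However, research on pedestrian navigation services that has reached the practical stage is limited to outdoor environments. This paper focuses on navigation services in indoor environments, which are structurally more complex than outdoor ones, and proposes a route search method that reflects user preferences, with the aim of providing the best route for each user. We first propose network data tailored to the target indoor environment, built using a visibility graph. We then survey which preference items should be incorporated and show that just over 70% of respondents want the "shortest route" and that more than 80%, especially among elderly respondents, have a preference regarding the "means of moving between floors," such as stairs, elevators, and escalators. In addition, just over 60% wanted routes that avoid crowds. We therefore propose a route search method that takes into account the "shortest route," the "means of moving between floors," and the time-dependent factor of "congestion." To demonstrate its effectiveness we conducted a field survey, and the results of several kinds of simulation experiments show that the route best suited to each user is output.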

CiNii
歩行者ナビゲーションにおける道路標識を用いた位置特定システムのための撮影状況に依存した認識度調査

児島伴幸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 108 ( 171 ) 37 - 42 2008年07月

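The spread of GPS has made pedestrian navigation systems possible, but in urban areas GPS can suffer positioning errors of several hundred meters because of the ionosphere, multipath, and other effects. In an urban area such an error corresponds to several streets and may confuse pedestrians. To bring the positioning error down to a few meters or less, we have proposed a GPS position correction system that uses road signs, an existing infrastructure, and a mobile-phone camera. In our proposal, the user's rough position is obtained from the GPS coordinates received by the mobile phone, and the detailed position is determined by matching a sign photographed with the phone camera against a map database. One important subsystem of the GPS position correction system is the road sign recognition system. Road sign recognition systems are being developed for automobiles, but development for mobile phones has barely begun. Using two road sign recognition systems under development in our group, this paper analyzes road signs in images actually taken with a mobile-phone camera and investigates how the recognition rate depends on shooting conditions. In particular, we show that the phone light at night and backlighting due to weather change the recognition rate of the road sign recognition systems.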

CiNii
セレクタ論理を用いたバタフライ演算器の設計

名村健, 戸川望, 柳澤政生, 大附辰夫, 外村元伸

電子情報通信学会技術研究報告. VLD, VLSI設計技術 108 ( 22 ) 25 - 30 2008年05月

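Arithmetic circuits that use selector logic have been proposed as a way to speed up arithmetic processing. This paper proposes a new butterfly circuit structure that supports a variable number of points: the butterfly equations of the FFT are transformed and reduced to selector logic, which reduces carry propagation. Evaluation experiments confirmed that, in a speed-oriented design, the proposed butterfly unit is 21.8% faster than a conventional butterfly structure designed with arithmetic operators.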

CiNii
セレクタ論理を用いたバタフライ演算器の設計

名村健, 戸川望, 柳澤政生, 大附辰夫, 外村元伸

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 38 ) 25 - 30 2008年05月

Arithmetic circuits that use selector logic have been proposed as a way to speed up arithmetic processing. This paper proposes a new radix-2 butterfly circuit structure that supports a variable number of points: the butterfly equations of the FFT are transformed and reduced to selector logic, which reduces carry propagation compared with conventional butterfly circuits. Evaluation experiments confirmed that, in a speed-oriented design, the proposed butterfly unit is 21.8% faster than a conventional butterfly structure designed with arithmetic operators.
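
For readers unfamiliar with the operation being accelerated, the sketch below shows the textbook radix-2 butterfly and how it is reused inside a recursive decimation-in-time FFT. It models only the arithmetic (one complex multiplication, one addition, one subtraction), written with ordinary arithmetic operators; the selector-logic transformation that the paper applies at the circuit level is not modeled, and the function names are illustrative.

```python
import cmath


def butterfly(a, b, w):
    """Radix-2 butterfly: returns (a + w*b, a - w*b) for twiddle factor w."""
    t = w * b
    return a + t, a - t


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out
```

An n-point transform executes (n/2)·log2(n) butterflies, which is why the speed of this one unit dominates FFT throughput.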

CiNii
アプリケーションプロセッサのL1キャッシュ最適化手法 (第21回回路とシステム軽井沢ワークショップ論文集) -- (メモリ最適化)

東條信明, 戸川望, 柳澤政生

回路とシステム軽井沢ワークショップ論文集 21 243 - 248 2008年04月

CiNii
エニーキャストにおけるルータの負荷に基づく経路選択手法

横田雅之, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. NS, ネットワークシステム 107 ( 443 ) 13 - 18 2008年04月

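In anycast communication, a client only has to specify an anycast address to communicate automatically with the most suitable server among multiple servers providing a particular application. The route to the most suitable server is chosen based on factors such as hop count and server processing time, but when many clients exist, router load caused by network congestion must also be considered. This paper proposes a route selection method for anycast based on router load. Building on the Core-Based Tree (CBT) method, the proposed method selects routes by constructing not only a CBT tree but also a Cover tree whose components are ridge routers. Because it considers router load as well as server processing time, it can select routes with lower router load than existing methods even when network traffic increases. We evaluate the method by simulation and show its effectiveness.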

CiNii
MAPドメイン間移動のためのハンドオフ時間とパケットロスの削減手法

田中敦樹, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. NS, ネットワークシステム 107 ( 443 ) 41 - 46 2008年04月

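In recent years there has been demand for Internet access from moving vehicles such as express trains and highway buses. To support such environments, we propose a method for reducing handoff time and packet loss that remains applicable when moving between MAP domains. In the proposed method, the AR (Access Router) obtains in advance the IP-layer parameters of the AP (Access Point) to which the mobile terminal is likely to connect next, which makes it possible to reduce handoff time and packet loss even when moving between MAP domains. Simulation of the method with NS-2 confirmed that, for both inter-MAP-domain and intra-MAP-domain movement, the handoff time is kept short enough to continue VoIP communication and the packet loss rate is held to about 50% of that of FMIPv6.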

CiNii
LAMR : アドホックネットワークにおける負荷分散を考慮したマルチパスルーティング

清水悠司, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. NS, ネットワークシステム 107 ( 443 ) 51 - 56 2008年04月

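Conventional routing protocols for ad hoc networks have two problems: they always select the shortest route without considering node load, and, being single-path, they are vulnerable to link breaks. We therefore propose LAMR, a protocol that builds disjoint multipaths with controlled node load by combining LASR, a protocol focused on reducing and distributing node load, with SMR, a protocol that constructs disjoint multipaths. In the proposed method, during route construction an RREQ discard algorithm compares the number of packets sent per unit time with that of neighboring nodes, and intermediate nodes selectively discard received RREQs. Then, from the multipaths built from the RREQs that arrive, the two most disjoint routes are selected and the routes are established. Simulation of the proposed method confirmed an improved packet delivery ratio and a reduction and distribution of control packets.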
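
The final step, choosing the two most disjoint routes among the candidates, can be illustrated with a short sketch. It assumes that "most disjoint" means sharing the fewest intermediate nodes and that ties are broken by total route length; both rules, and the function names, are assumptions for illustration and may differ from the actual LAMR selection rule.

```python
from itertools import combinations


def shared_nodes(route_a, route_b):
    # Source and destination belong to every route, so compare intermediates only.
    return len(set(route_a[1:-1]) & set(route_b[1:-1]))


def pick_two_routes(routes):
    """Return the pair of routes that share the fewest intermediate nodes.

    Ties are broken in favor of the smaller combined route length.
    """
    return min(
        combinations(routes, 2),
        key=lambda pair: (shared_nodes(*pair), len(pair[0]) + len(pair[1])),
    )
```

Two routes with no shared intermediate node cannot both be broken by the failure of a single relay, which is the property the multipath design relies on.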

CiNii
応用指向型動的再構成可能ネットワークプロセッサアーキテクチャとその最適化手法

大田元則, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ICD, 集積回路 107 ( 511 ) 47 - 52 2008年03月

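This paper proposes an application-oriented, dynamically reconfigurable network processor architecture and a method for optimizing it. The proposed architecture consists of Micro Packet Processors (MPPs) for input/output processing, application-specific hardware (HW), and Dynamic Micro Packet Processors (Dynamic MPPs). While the network processor is processing packets, the bottleneck of each process is detected and the Dynamic MPPs parallelize processing as needed, which raises the throughput of the network processor. To find the optimal configuration of the architecture, which varies with the application, when designing a network processor for a real application, we propose and build a network simulator and evaluate the optimal number of Dynamic MPPs for the application. With DES encryption as the target application, the optimal number of Dynamic MPPs was found to be six. A comparison with existing products further showed the advantage of the proposed architecture.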

CiNii
応用指向型動的再構成可能ネットワークプロセッサアーキテクチャとその最適化手法 (VLSI設計技術)

大田元則, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 107 ( 508 ) 47 - 52 2008年03月

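This paper proposes an application-oriented, dynamically reconfigurable network processor architecture and a method for optimizing it. The proposed architecture consists of Micro Packet Processors (MPPs) for input/output processing, application-specific hardware (HW), and Dynamic Micro Packet Processors (Dynamic MPPs). While the network processor is processing packets, the bottleneck of each process is detected and the Dynamic MPPs parallelize processing as needed, which raises the throughput of the network processor. To find the optimal configuration of the architecture, which varies with the application, when designing a network processor for a real application, we propose and build a network simulator and evaluate the optimal number of Dynamic MPPs for the application. With DES encryption as the target application, the optimal number of Dynamic MPPs was found to be six. A comparison with existing products further showed the advantage of the proposed architecture.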

CiNii
命令メモリビット幅削減に基づく低エネルギーASIP合成手法

小原俊逸, 史又華, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ICD, 集積回路 107 ( 509 ) 25 - 30 2008年03月

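This paper proposes an energy reduction method based on reducing the instruction memory bit width in a hardware/software co-synthesis system for ASIPs. A VLIW processor can issue instructions in parallel, but its instruction memory bit width becomes long, which needlessly increases power and energy consumption. Reducing the instruction memory bit width of a VLIW processor is therefore expected to enable a processor with high performance and high energy efficiency. The instruction memory bit width depends on the instruction encoding format, which consists of an opcode and a group of operands. The opcode bit width depends on the number of instructions in the instruction set, and the operand bit width depends on the number of general-purpose registers. To reduce the opcode bit width, we introduce the concept of a combined instruction, which treats multiple instructions issued simultaneously in the VLIW slots as a single instruction. We built an ASIP synthesis system composed of three algorithms: an opcode bit width reduction algorithm, an operand bit width reduction algorithm, and an energy minimization algorithm. Experimental results confirmed a 9% to 12% reduction in energy consumption for the whole processor including memory.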
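
The dependence of the instruction word width on instruction count and register count can be made concrete with a small calculation. The sketch assumes a plain binary encoding (a field that distinguishes n values needs ceil(log2 n) bits) and a fixed number of register operands per slot; the second function is one possible reading of the combined-instruction idea, in which a single opcode field indexes the combinations actually used. The function names and both encodings are illustrative assumptions, not the paper's format.

```python
from math import ceil, log2


def field_bits(count):
    """Bits needed to distinguish `count` values (at least 1 bit)."""
    return max(1, ceil(log2(count)))


def instruction_word_bits(num_instructions, num_registers, operands_per_slot, slots):
    """Word width when every VLIW slot carries its own opcode and operands."""
    opcode = field_bits(num_instructions)
    operand = field_bits(num_registers)
    return slots * (opcode + operands_per_slot * operand)


def combined_word_bits(num_combinations, num_registers, operands_per_slot, slots):
    """Word width when one shared opcode selects a whole combination of slot instructions."""
    operand = field_bits(num_registers)
    return field_bits(num_combinations) + slots * operands_per_slot * operand
```

Under these assumptions, trimming the instruction set or the register file shortens every word fetched from instruction memory, which is where the energy saving would come from.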

CiNii
命令メモリビット幅削減に基づく低エネルギーASIP合成手法 (VLSI設計技術)

小原俊逸, 史又華, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告 107 ( 506 ) 25 - 30 2008年03月

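This paper proposes an energy reduction method based on reducing the instruction memory bit width in a hardware/software co-synthesis system for ASIPs. A VLIW processor can issue instructions in parallel, but its instruction memory bit width becomes long, which needlessly increases power and energy consumption. Reducing the instruction memory bit width of a VLIW processor is therefore expected to enable a processor with high performance and high energy efficiency. The instruction memory bit width depends on the instruction encoding format, which consists of an opcode and a group of operands. The opcode bit width depends on the number of instructions in the instruction set, and the operand bit width depends on the number of general-purpose registers. To reduce the opcode bit width, we introduce the concept of a combined instruction, which treats multiple instructions issued simultaneously in the VLIW slots as a single instruction. We built an ASIP synthesis system composed of three algorithms: an opcode bit width reduction algorithm, an operand bit width reduction algorithm, and an energy minimization algorithm. Experimental results confirmed a 9% to 12% reduction in energy consumption for the whole processor including memory.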

CiNii
レジスタ分散型アーキテクチャを対象とした高位合成のためのマルチプレクサ削減手法

遠藤哲弥, 大智輝, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 2 ) 85 - 90 2008年01月

As LSI process technology shrinks, interconnect delay is growing relative to gate delay and becoming the dominant part of total delay. The total numbers of gates and wires per unit area are also increasing, and with them the number of multiplexers needed for wiring control. A distributed-register architecture can reduce the impact of interconnect delay on circuit performance by using register-to-register data transfer, but the increase in the total number of wires required for connections between registers leads to an increase in the number of multiplexers required. This paper proposes a multiplexer reduction method for a high-level synthesis system targeting distributed-register architectures. The method reduces the number of multiplexers required by optimizing port assignment for the wiring connections between functional units and local registers. Computational experiments showed that incorporating the method into the target high-level synthesis flow reduced the number of multiplexers by 10.9% and the area by 4.9% on average, confirming its effectiveness.
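
Why port assignment affects the multiplexer count can be seen in a small example. The sketch assumes a functional unit with two input ports that executes commutative operations, so the two source registers of each operation may be swapped; a greedy pass picks, for each operation, the orientation that adds fewer new sources to the ports. The greedy rule and the function names are assumptions for illustration and are not the algorithm of the paper.

```python
def assign_ports(operations):
    """Greedy port assignment for commutative operations bound to one functional unit.

    operations: list of (src_a, src_b) register pairs.
    Returns (left_sources, right_sources), the sets of registers wired to each port.
    """
    left, right = set(), set()
    for a, b in operations:
        # Number of new multiplexer inputs each orientation would add.
        keep = (a not in left) + (b not in right)
        swap = (b not in left) + (a not in right)
        if swap < keep:
            a, b = b, a
        left.add(a)
        right.add(b)
    return left, right


def mux_inputs(left, right):
    # A port fed by a single source needs no multiplexer at all.
    return sum(len(port) for port in (left, right) if len(port) > 1)
```

For the operations (r1, r2), (r2, r1), (r1, r3), a fixed left/right assignment wires two registers to the left port and three to the right, while the greedy pass leaves the left port with r1 alone and the right port with r2 and r3.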

CiNii
アプリケーションプロセッサのＬ１データキャッシュ最適化手法

東條信明, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 2 ) 155 - 160 2008年01月

Data and instruction caches are a major factor in improving the performance of embedded processors. This paper proposes an L1 data cache optimization method that selects, from among multiple cache configurations, the one suited to a particular application. Given application source code and constraints as input, the method finds a cache configuration specialized for the target application, using the average access time of the whole memory hierarchy, consisting of main memory and an L1 cache, as the evaluation metric. It handles an area constraint by introducing CRMF (Configuration Reduction approach by the Miss Factor) and CRCB (Configuration Reduction approach by the Cache Behavior), and it optimizes cache size, block size, and associativity to obtain the L1 data cache configuration best suited to the application. We applied the method to MediaBench, compared the results of computational experiments with an existing method, and confirmed its effectiveness.
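
The selection criterion can be sketched with the standard average-memory-access-time formula, hit time plus miss rate times miss penalty, evaluated over candidate configurations that satisfy an area limit. The dictionary keys, the function names, and the exhaustive search are assumptions for illustration; the CRMF and CRCB pruning steps named in the abstract are not reproduced here.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time for a single cache level in front of main memory."""
    return hit_time + miss_rate * miss_penalty


def best_config(configs, area_limit, miss_penalty):
    """Pick the feasible configuration with the lowest average access time.

    configs: list of dicts with keys 'size', 'block', 'assoc', 'area',
             'hit_time' and 'miss_rate' (miss rates would come from simulation).
    Raises ValueError if no configuration fits the area limit.
    """
    feasible = [c for c in configs if c["area"] <= area_limit]
    return min(
        feasible,
        key=lambda c: average_access_time(c["hit_time"], c["miss_rate"], miss_penalty),
    )
```

A larger or more associative cache usually lowers the miss rate but raises hit time and area, so the best choice depends on the application's access pattern and on the constraint.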

CiNii
アプリケーションプロセッサのカーネル記述自動生成手法

日浦敏宏, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2008 ( 2 ) 161 - 166 2008年01月

This paper proposes a method for automatically generating the processor kernel description in SPADES, a hardware/software (HW/SW) co-design system that automatically synthesizes processor cores specialized for an application. Application processors, which are at the heart of modern embedded systems, must be low in cost, small in area, high in performance, and producible in a short time. A main way to raise the performance of an application processor is to add hardware units such as SIMD functional units, MAC units, hardware loop units, addressing units, and Y data memory, and it is important to be able to add these units on demand for each target application. The proposed method divides the processor core into parts that each implement one function and generates each functional part so that it can accommodate the hardware units. Merging the descriptions of the generated functional parts yields the description of the processor core. From an XML description averaging about 9% of the size of the generated VHDL, the method generated the VHDL description of a processor core within roughly 3 seconds.

CiNii
レジスタ分散型アーキテクチャを対象とした高位合成のためのマルチプレクサ削減手法

遠藤哲弥, 大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 107 ( 419 ) 7 - 12 2008年01月

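As LSI process technology shrinks, interconnect delay is growing relative to gate delay. The total numbers of gates and wires per unit area are also increasing, and with them the number of multiplexers needed for wiring control. A distributed-register architecture can reduce the impact of interconnect delay on circuit performance by using register-to-register data transfer, but the increase in the total number of wires required for connections between registers leads to an increase in the number of multiplexers required. This paper proposes a multiplexer reduction method for a high-level synthesis system targeting distributed-register architectures. The method reduces the number of multiplexers required by optimizing port assignment for the wiring connections between functional units and local registers. Computational experiments showed that incorporating the method into the target high-level synthesis flow reduced the number of multiplexers by 10.9% and the area by 4.9% on average, confirming its effectiveness.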

CiNii
アプリケーションプロセッサのL1データキャッシュ最適化手法

東條信明, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 107 ( 419 ) 77 - 82 2008年01月

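This paper proposes a cache optimization method for selecting, from among multiple cache configurations, the one suited to a particular application. Given application source code and constraints as input, the method finds a cache configuration specialized for the target application, using the average access time of the whole memory hierarchy, consisting of main memory and an L1 cache, as the evaluation metric. It optimizes cache size, block size, and associativity, aiming for the L1 data cache configuration best suited to the application. We compared the results of computational experiments with an existing method and confirmed the effectiveness of the proposed method.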

CiNii
アプリケーションプロセッサのカーネル記述自動生成手法

日浦敏宏, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 107 ( 419 ) 83 - 88 2008年01月

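This paper proposes a method for automatically generating the processor kernel description in SPADES, a hardware/software (HW/SW) co-design system for application processors. Application processors, which are at the heart of modern embedded systems, must be low in cost, small in area, high in performance, and producible in a short time. A main way to raise the performance of an application processor is to add hardware units such as SIMD functional units, MAC units, hardware loop units, addressing units, and Y data memory, and it is important to be able to add these units on demand. The proposed method divides the processor core into parts that each implement one function and generates each functional part so that it can accommodate the hardware units. Merging the descriptions of the generated functional parts yields the description of the processor core. From an XML description averaging about 9% of the size of the generated VHDL, the method generated the VHDL description of a processor core within roughly 3 seconds.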

CiNii
レジスタ分散型アーキテクチャを対象とした高位合成のためのマルチプレクサ削減手法

遠藤哲弥, 大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 107 ( 417 ) 7 - 12 2008年01月

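As LSI process technology shrinks, interconnect delay is growing relative to gate delay. The total numbers of gates and wires per unit area are also increasing, and with them the number of multiplexers needed for wiring control. A distributed-register architecture can reduce the impact of interconnect delay on circuit performance by using register-to-register data transfer, but the increase in the total number of wires required for connections between registers leads to an increase in the number of multiplexers required. This paper proposes a multiplexer reduction method for a high-level synthesis system targeting distributed-register architectures. The method reduces the number of multiplexers required by optimizing port assignment for the wiring connections between functional units and local registers. Computational experiments showed that incorporating the method into the target high-level synthesis flow reduced the number of multiplexers by 10.9% and the area by 4.9% on average, confirming its effectiveness.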

CiNii
アプリケーションプロセッサのL1データキャッシュ最適化手法

東條信明, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 107 ( 417 ) 77 - 82 2008年01月

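This paper proposes a cache optimization method for selecting, from among multiple cache configurations, the one suited to a particular application. Given application source code and constraints as input, the method finds a cache configuration specialized for the target application, using the average access time of the whole memory hierarchy, consisting of main memory and an L1 cache, as the evaluation metric. It optimizes cache size, block size, and associativity, aiming for the L1 data cache configuration best suited to the application. We compared the results of computational experiments with an existing method and confirmed the effectiveness of the proposed method.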

CiNii
アプリケーションプロセッサのカーネル記述自動生成手法

日浦敏宏, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 107 ( 417 ) 83 - 88 2008年01月

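This paper proposes a method for automatically generating the processor kernel description in SPADES, a hardware/software (HW/SW) co-design system for application processors. Application processors, which are at the heart of modern embedded systems, must be low in cost, small in area, high in performance, and producible in a short time. A main way to raise the performance of an application processor is to add hardware units such as SIMD functional units, MAC units, hardware loop units, addressing units, and Y data memory, and it is important to be able to add these units on demand. The proposed method divides the processor core into parts that each implement one function and generates each functional part so that it can accommodate the hardware units. Merging the descriptions of the generated functional parts yields the description of the processor core. From an XML description averaging about 9% of the size of the generated VHDL, the method generated the VHDL description of a processor core within roughly 3 seconds.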

CiNii
レジスタ分散型アーキテクチャを対象とした高位合成のためのマルチプレクサ削減手法

遠藤哲弥, 大智輝, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 107 ( 415 ) 7 - 12 2008年01月

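As LSI process technology shrinks, interconnect delay is growing relative to gate delay. The total numbers of gates and wires per unit area are also increasing, and with them the number of multiplexers needed for wiring control. A distributed-register architecture can reduce the impact of interconnect delay on circuit performance by using register-to-register data transfer, but the increase in the total number of wires required for connections between registers leads to an increase in the number of multiplexers required. This paper proposes a multiplexer reduction method for a high-level synthesis system targeting distributed-register architectures. The method reduces the number of multiplexers required by optimizing port assignment for the wiring connections between functional units and local registers. Computational experiments showed that incorporating the method into the target high-level synthesis flow reduced the number of multiplexers by 10.9% and the area by 4.9% on average, confirming its effectiveness.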

CiNii
アプリケーションプロセッサのL1データキャッシュ最適化手法

東條信明, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 107 ( 415 ) 77 - 82 2008年01月

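This paper proposes a cache optimization method for selecting, from among multiple cache configurations, the one suited to a particular application. Given application source code and constraints as input, the method finds a cache configuration specialized for the target application, using the average access time of the whole memory hierarchy, consisting of main memory and an L1 cache, as the evaluation metric. It optimizes cache size, block size, and associativity, aiming for the L1 data cache configuration best suited to the application. We compared the results of computational experiments with an existing method and confirmed the effectiveness of the proposed method.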

CiNii
アプリケーションプロセッサのカーネル記述自動生成手法

日浦敏宏, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 107 ( 415 ) 83 - 88 2008年01月

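This paper proposes a method for automatically generating the processor kernel description in SPADES, a hardware/software (HW/SW) co-design system for application processors. Application processors, which are at the heart of modern embedded systems, must be low in cost, small in area, high in performance, and producible in a short time. A main way to raise the performance of an application processor is to add hardware units such as SIMD functional units, MAC units, hardware loop units, addressing units, and Y data memory, and it is important to be able to add these units on demand. The proposed method divides the processor core into parts that each implement one function and generates each functional part so that it can accommodate the hardware units. Merging the descriptions of the generated functional parts yields the description of the processor core. From an XML description averaging about 9% of the size of the generated VHDL, the method generated the VHDL description of a processor core within roughly 3 seconds.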

CiNii
屋内用歩行者ナビゲーションにおける階層構造表現に特化した経路表示手法

廣石敏行, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウムシリーズ(CD-ROM) 2008 ( 1 ) 2008年

J-GLOBAL
マルチサイクル配線遅延を考慮したフロアプラン指向高位合成手法

大智輝, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2008 ( 7 ) 2008年

J-GLOBAL
セキュリティ処理向け動的再構成可能ネットワークプロセッサの設計

阿部拓野, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2008 ( 7 ) 2008年

J-GLOBAL
SIMD型プロセッサコアの面積/遅延見積り

山崎大輔, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会論文誌ジャーナル(CD-ROM) 49 ( 10 ) 2008年

J-GLOBAL
位置特定のための携帯電話向け道路標識認識アルゴリズム

宮川了, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウムシリーズ(CD-ROM) 2008 ( 1 ) 2008年

J-GLOBAL
歩行者向けデフォルメ地図生成ハードウェアエンジンの設計

荒幡明, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2007 ( 114 ) 175 - 180 2007年11月

The spread of mobile phones has raised demand for diverse applications. Services that deliver map information to mobile devices are now widespread, but most of these maps are designed for PCs and are not suited to the small displays of mobile devices. Because map information by nature requires real-time updates, preparing highly legible deformed (simplified) maps in advance is impractical. Many techniques for automatically deforming map information have therefore been developed and proposed, but when the deformation is performed on a server it is difficult to tailor the deformed map to each user's preferences. On a mobile phone, in turn, the deformation is computationally heavy and increases response time and power consumption. This paper proposes a hardware engine for deformed-map generation on mobile phones. Focusing on image processing for mobile devices, we analyze one deformation process as a baseline, extract its bottleneck, and design appropriate hardware. Embedding the designed arithmetic unit in a mobile phone allows the process to run with one half to one fifth of the conventional amount of processing.

CiNii
AESにおける合成体SubBytes向けパワーマスキング乗算回路の設計

川畑伸幸, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2007 ( 114 ) 109 - 114 2007年11月

AES, one of the symmetric-key cryptographic standards, is mainly used in embedded devices such as IC chips. The key must not become known to third parties, but attacks that recover it by measuring and analyzing physical quantities emitted during cryptographic computation (side-channel attacks) have been devised and their danger has been pointed out. Among them, differential power analysis (DPA) is known as the most realistic and powerful attack and is expected to become a threat. Assuming use on embedded devices such as IC chips, this paper focuses on the AES SubBytes circuit built on composite-field theory. We propose a new power-masking multiplier over a Galois extension field whose power consumption varies randomly for the same operation. Using this multiplier, we build the composite-field inversion circuit and apply it to the composite-field SubBytes circuit, giving a small-area, DPA-resistant design specialized for the composite-field SubBytes circuit at the module level. We implement the proposed method and report the evaluation results.

CiNii
歩行者向けデフォルメ地図生成ハードウェアエンジンの設計

荒幡明, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 107 ( 336 ) 61 - 66 2007年11月

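As mobile phones have become widespread, demand for diverse applications is growing. Services that deliver map information to mobile terminals are now common, but most of these maps are designed for PCs and are poorly suited to the small displays of mobile terminals. Because map information by nature requires real-time updates, preparing highly legible deformed (simplified) maps in advance is impractical. Many techniques for automatically deforming map information have therefore been developed and proposed, but when deformation is performed on a server it is difficult to tailor the deformed map to each user's preferences, while on a mobile phone the processing load is large and increases response time and power consumption. This paper proposes a hardware engine for generating deformed maps on mobile phones. Focusing on image processing for mobile terminals, we analyze one deformation process as a baseline, extract its bottleneck, and design appropriate hardware. Embedding the designed arithmetic unit in a mobile phone allows the processing to be executed with one half to one fifth of the conventional processing load.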

CiNii
歩行者向けデフォルメ地図生成ハードウェアエンジンの設計

荒幡明, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング 107 ( 339 ) 61 - 66 2007年11月

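As mobile phones have become widespread, demand for diverse applications is growing. Services that deliver map information to mobile terminals are now common, but most of these maps are designed for PCs and are poorly suited to the small displays of mobile terminals. Because map information by nature requires real-time updates, preparing highly legible deformed (simplified) maps in advance is impractical. Many techniques for automatically deforming map information have therefore been developed and proposed, but when deformation is performed on a server it is difficult to tailor the deformed map to each user's preferences, while on a mobile phone the processing load is large and increases response time and power consumption. This paper proposes a hardware engine for generating deformed maps on mobile phones. Focusing on image processing for mobile terminals, we analyze one deformation process as a baseline, extract its bottleneck, and design appropriate hardware. Embedding the designed arithmetic unit in a mobile phone allows the processing to be executed with one half to one fifth of the conventional processing load.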

CiNii
列処理演算法に着目したマルチレート対応イレギュラーLDPC符号復号器

今井優太, 清水一範, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. RECONF, リコンフィギャラブルシステム : IEICE technical report 107 ( 342 ) 19 - 24 2007年11月

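In recent years, as broadcasting goes digital and mobile phones and music players gain capacity and functionality, the use of digital content over wireless links has been accelerating rapidly. Maintaining high communication quality under easily fluctuating channel conditions is a major challenge. LDPC (Low Density Parity Check) codes offer strong error-correcting capability and are regarded as promising next-generation codes; they have been studied extensively and are standardized in IEEE802.11n. This paper proposes a small-area, variable-code-rate decoder that can follow large changes in the wireless environment and conforms to the parity-check matrices defined in IEEE802.11n. This is achieved by sharing the adders used in column processing across code rates. In addition, the method raises the decoder's degree of parallelism as the code rate increases, achieving higher throughput than conventional methods.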

CiNii
AESにおける合成体 SubBytes 向けパワーマスキング乗算回路の設計

川畑伸幸, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 107 ( 335 ) 37 - 42 2007年11月

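AES, one of the standard common-key (symmetric-key) ciphers, is used mainly in embedded devices such as IC chips. The common key must not become known to third parties, yet side-channel attacks, which recover the key by measuring and analyzing physical quantities emitted during cryptographic processing, have been devised and their danger has been pointed out. Among them, differential power analysis (DPA) is known as the most practical and powerful attack and is expected to remain a threat. Assuming use on embedded devices such as IC chips, this paper focuses on the AES SubBytes circuit based on composite-field theory. We propose a new extension-field arithmetic circuit whose power consumption varies randomly for the same operation and apply it to the composite-field SubBytes circuit, yielding a module-level, small-area-oriented, DPA-resistant design specialized for the composite-field SubBytes circuit. We implement the proposed method and report the evaluation results.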

CiNii
AESにおける合成体 SubBytes 向けパワーマスキング乗算回路の設計

川畑伸幸, 奈良竜太, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. DC, ディペンダブルコンピューティング 107 ( 338 ) 37 - 42 2007年11月

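AES, one of the standard common-key (symmetric-key) ciphers, is used mainly in embedded devices such as IC chips. The common key must not become known to third parties, yet side-channel attacks, which recover the key by measuring and analyzing physical quantities emitted during cryptographic processing, have been devised and their danger has been pointed out. Among them, differential power analysis (DPA) is known as the most practical and powerful attack and is expected to remain a threat. Assuming use on embedded devices such as IC chips, this paper focuses on the AES SubBytes circuit based on composite-field theory. We propose a new extension-field arithmetic circuit whose power consumption varies randomly for the same operation and apply it to the composite-field SubBytes circuit, yielding a module-level, small-area-oriented, DPA-resistant design specialized for the composite-field SubBytes circuit. We implement the proposed method and report the evaluation results.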

CiNii
組み込みシステム解剖室(第2回)ICカード対応自動改札システムの解剖

戸川望

インタ-フェ-ス 33 ( 11 ) 149 - 156 2007年11月

CiNii
組み込みシステム解剖室(新連載・第1回)ディジタル一眼レフ・カメラを解剖する

戸川望

インタ-フェ-ス 33 ( 10 ) 148 - 154 2007年10月

CiNii
移動体を対象としたアプリケーションとデータサイズによる階層型 Network Mobility の負荷分散方式

月木英治, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告高度交通システム（ITS） 2007 ( 90 ) 65 - 70 2007年09月

Network Mobility (NEMO) has been proposed as a way to use the Internet from inside cars and trains. However, packet loss and delay during handover, together with load concentration on specific routers, degrade communication quality. This paper proposes a load-balancing method for multi-layered Mobility Anchor Points (MAPs) based on application type and data size. Because the method does not depend on the speed of the mobile node (MN), it can balance load across MAPs even when all MNs move at the same speed, as inside a train. It can also guarantee a certain delay time for applications such as IP telephony and video streaming. We verify the effectiveness of the method using the network simulator OPNET.

CiNii
歩行者ナビゲーションにおけるＧＰＳ誤差補正のための道路標識による現在位置測位手法

本多聖人, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告高度交通システム（ITS） 2007 ( 90 ) 71 - 76 2007年09月

GPS positioning accuracy degrades significantly when signals are blocked by buildings, affected by multipath, or when few satellites are available. We have been proposing a positioning system aimed at correcting GPS error for pedestrian navigation. The system consists of existing infrastructure and devices: a camera-equipped cellular phone, GPS, and a map server. The user photographs a road traffic sign with the phone camera and sends the picture to the map server. The map server identifies the sign and, because the user must be somewhere from which the sign is visible, narrows the GPS error range and determines the user's position. One of the most important processes in the system is sign identification, which must be both fast and accurate for the sake of system response and correct positioning. To achieve this, we propose an identification technique that exploits the observation that a user who is actively trying to be located will photograph the sign near the center of the image. On receiving a sign image, the map server identifies the sign by color recognition, shape recognition using the Hough transform, and symbol recognition using template matching. The image is reduced in resolution to cut the amount of transmitted data so that the response time from sending the image to receiving the result stays within 3 seconds. Color recognition is made more accurate by giving higher weight to pixels near the image center. Shape recognition speeds up the Hough transform by using edge directions and, for circular signs, by exploiting the fact that the sign lies near the center of the image. Template matching is sped up by narrowing the candidates according to the sign's color and shape.

CiNii
歩行者ナビゲーションにおける携帯電話カメラ機能とランドマークを利用した位置補正手法

本多聖人, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告高度交通システム（ITS） 2007 ( 90 ) 77 - 82 2007年09月

The use of pedestrian navigation services on mobile phones has been expanding in recent years. User positioning in these services generally relies on GPS and mobile-phone base-station information. In urban areas, however, large errors arise from causes such as multipath and a reduced number of trackable satellites. This paper proposes a new positioning method that aims at highly accurate positioning at low cost. The user photographs road traffic signs with a camera-equipped mobile phone and sends the pictures to a server, which narrows the position candidates from the identified combination of signs and returns the position to the user. If several candidates remain, the server presents information on landmarks around the user, chosen with visibility in mind, and has the user confirm them to determine the correct position. Because the method does not rely on GPS or base-station information alone, it removes the influence of the error factors found in urban areas. We confirmed the effectiveness of the proposed method through model experiments in a real environment.

CiNii
進路方向によって異なる混雑度を考慮した旅行時間算出手法

大高宏介, 戸川望, 柳澤政生, 大附辰夫

電気学会研究会資料. ITS, ITS研究会 2007 ( 15 ) 15 - 20 2007年09月

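Travel time from origin to destination along a vehicle's route is currently calculated from the link travel times provided by VICS and various telematics services, but these link travel times are not accurate enough, which lowers the accuracy of the result. This paper proposes a method that calculates travel time more accurately than conventional methods. Based on travel histories collected from running vehicles, the method stores each link travel time separately for each direction of travel, which makes it possible to account for congestion that differs between turning right or left and going straight, and to cover ordinary roads as well as expressways. We verify the effectiveness of the method using a traffic-flow simulator.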
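
The idea of storing a link's travel time separately for each direction of travel can be illustrated with a small sketch. This is not the authors' implementation: the class name `LinkTravelTimes`, the direction labels, and the use of a plain mean over recorded histories are all assumptions made for illustration.

```python
from collections import defaultdict


class LinkTravelTimes:
    """Mean travel time per (link, direction) pair (hypothetical structure)."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def record(self, link, direction, seconds):
        # One observation from a vehicle's travel history.
        key = (link, direction)
        self._sum[key] += seconds
        self._count[key] += 1

    def mean(self, link, direction):
        key = (link, direction)
        if key not in self._count:
            raise KeyError(key)
        return self._sum[key] / self._count[key]

    def route_time(self, route):
        # route: list of (link, direction taken when leaving the link)
        return sum(self.mean(link, d) for link, d in route)


table = LinkTravelTimes()
table.record("A", "left", 60.0)      # left-turn queue on link A is slow
table.record("A", "left", 80.0)
table.record("A", "straight", 30.0)  # going straight on the same link is fast
table.record("B", "straight", 40.0)
total = table.route_time([("A", "left"), ("B", "straight")])
```

A table keyed by link alone would blend the 30-second and 70-second cases on link A into one figure; keying by direction keeps them apart.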

CiNii
移動体を対象としたアプリケーションとデータサイズによる階層型 Network Mobility の負荷分散方式

月木英治, 戸川望, 柳澤政生, 大附辰夫

電気学会研究会資料. ITS, ITS研究会 2007 ( 15 ) 65 - 70 2007年09月

CiNii
歩行者ナビゲーションにおけるGPS誤差補正のための道路標識による現在位置測位手法

大平英貴, 戸川望, 柳澤政生, 大附辰夫

電気学会研究会資料. ITS, ITS研究会 2007 ( 15 ) 71 - 76 2007年09月

CiNii
歩行者ナビゲーションにおける携帯電話カメラ機能とランドマークを利用した位置補正手法

本多聖人, 戸川望, 柳澤政生, 大附辰夫

電気学会研究会資料. ITS, ITS研究会 2007 ( 15 ) 77 - 82 2007年09月

CiNii
進路方向によって異なる混雑度を考慮した旅行時間算出手法

大高宏介, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告高度交通システム（ITS） 2007 ( 90 ) 15 - 20 2007年09月

Travel time from origin to destination along a vehicle's route is currently calculated from the link travel times provided by VICS and various telematics services, but these link travel times are not accurate enough, which lowers the accuracy of the result. This paper proposes a method that calculates travel time more accurately than conventional methods. Based on travel histories collected from running vehicles, the method stores each link travel time separately for each direction of travel, which makes it possible to account for congestion that differs between turning right or left and going straight, and to cover ordinary roads as well as expressways. We verify the effectiveness of the method using a traffic-flow simulator.

CiNii
移動体を対象としたアプリケーションとデータサイズによる階層型 Network Mobility の負荷分散方式

月木英治, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 107 ( 216 ) 21 - 26 2007年09月

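Network Mobility (NEMO) has been proposed as a way to use the Internet from inside cars and trains. However, packet loss and delay during handover, together with load concentration on specific routers, degrade communication quality. This paper proposes a load-balancing method for multi-layered Mobility Anchor Points (MAPs) based on application type and data size. Because the method does not depend on the speed of the mobile node (MN), it can balance load across MAPs even when all MNs move at the same speed, as inside a train. It can also guarantee a certain delay time for applications such as IP telephony and video streaming. We verify the effectiveness of the method using the network simulator OPNET.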

CiNii
歩行者ナビゲーションにおけるGPS誤差補正のための道路標識による現在位置測位手法

大平英貴, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 107 ( 216 ) 27 - 32 2007年09月

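GPS positioning accuracy degrades significantly when signals are blocked by buildings, affected by multipath, or when few satellites are available. We have been proposing a positioning system aimed at correcting GPS error. The system consists of existing infrastructure and devices: a camera-equipped cellular phone, GPS, and a map server. The map server identifies a road sign photographed with the phone and, because the user must be somewhere from which the sign is visible, narrows the GPS error range and determines the user's position. One of the most important processes in the system is sign identification, which must be both fast and accurate for the sake of system response and correct positioning. To achieve this, we propose an identification technique that exploits the observation that a user who is actively trying to be located will photograph the sign near the center of the image. On receiving a sign image, the map server identifies the sign by color recognition, shape recognition using the Hough transform, and symbol recognition using template matching. The image is reduced in resolution to cut the amount of transmitted data so that the response time stays within 3 seconds. Color recognition is made more accurate by giving higher weight to pixels near the image center. Shape recognition speeds up the Hough transform by using edge directions and, for circular signs, by exploiting the fact that the sign lies near the center of the image. Template matching is sped up by narrowing the candidates according to the sign's color and shape.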

CiNii
歩行者ナビゲーションにおける携帯電話カメラ機能とランドマークを利用した位置補正手法

本多聖人, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 107 ( 216 ) 33 - 38 2007年09月

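The use of pedestrian navigation services on mobile phones has been expanding in recent years. User positioning in these services generally relies on GPS and mobile-phone base-station information. In urban areas, however, large errors arise from causes such as multipath and a reduced number of trackable satellites. In the proposed method, the user photographs road traffic signs with a camera-equipped mobile phone and sends the pictures to a server, which narrows the position candidates from the identified combination of signs and returns the position to the user. If several candidates remain, the server presents information on landmarks around the user, chosen with visibility in mind, and has the user confirm them to determine the correct position. Because the method does not rely on GPS or base-station information alone, it removes the influence of the error factors found in urban areas. We confirmed the effectiveness of the proposed method through model experiments in a real environment.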

CiNii
進路方向によって異なる混雑度を考慮した旅行時間算出手法

大高宏介, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 107 ( 215 ) 15 - 20 2007年09月

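Travel time from origin to destination along a vehicle's route is currently calculated from the link travel times provided by VICS and various telematics services, but these link travel times are not accurate enough, which lowers the accuracy of the result. This paper proposes a method that calculates travel time more accurately than conventional methods. Based on travel histories collected from running vehicles, the method stores each link travel time separately for each direction of travel, which makes it possible to account for congestion that differs between turning right or left and going straight, and to cover ordinary roads as well as expressways. We verify the effectiveness of the method using a traffic-flow simulator.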

CiNii
GF(2^n)及びGF(P)におけるスケーラブル双基数ユニファイド型モンゴメリ乗算器

谷村和幸, 奈良竜太, 小原俊逸, 史又華, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CAS, 回路とシステム 107 ( 101 ) 43 - 48 2007年06月

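Montgomery multiplication is commonly used for modular multiplication, the dominant operation in elliptic curve cryptography, a form of public-key cryptography. Because the operand bit length varies with the required cryptographic strength, Montgomery multipliers need to be scalable. Elliptic curve cryptography operates over either GF(2^n) or GF(P), so scalable unified multipliers that handle both fields have also been proposed. However, the circuit for GF(P) has a longer delay than the one for GF(2^n), so one must either change the operating frequency per field or use a smaller radix for GF(P) only, at the cost of more clock cycles. This paper proposes a scalable dual-radix unified Montgomery multiplier for GF(2^n) and GF(P). The proposed architecture integrates a radix-2^16 GF(P) multiplier with four-way parallelism and a radix-2^64 GF(2^n) multiplier into a single unit. The dual-radix design reduces the delay gap between GF(2^n) and GF(P), and parallelization offsets the resulting increase in clock cycles on the low-radix side. As a result, the design became the fastest scalable unified Montgomery multiplier.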
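
For readers unfamiliar with the underlying operation, the following is a minimal software sketch of textbook Montgomery multiplication over GF(P) using Python integers. It does not model the dual-radix, word-serial hardware described in the abstract; the function names and the choice R = 2^k are assumptions for illustration. The modular inverse via `pow(n, -1, r)` requires Python 3.8 or later.

```python
def montgomery_setup(n, k):
    """Precompute constants for an odd modulus n < R, where R = 2**k."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r      # n * n_prime ≡ -1 (mod R)
    return r, n_prime


def mont_mul(a_bar, b_bar, n, k, n_prime):
    """Montgomery product: a_bar * b_bar * R^(-1) mod n, without dividing by n."""
    mask = (1 << k) - 1
    t = a_bar * b_bar
    m = ((t & mask) * n_prime) & mask   # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k                # exact division by R
    return u - n if u >= n else u       # single conditional subtraction


def mod_mul(a, b, n, k):
    """Compute a*b mod n by converting to and from Montgomery form."""
    r, n_prime = montgomery_setup(n, k)
    a_bar, b_bar = (a * r) % n, (b * r) % n
    c_bar = mont_mul(a_bar, b_bar, n, k, n_prime)
    return mont_mul(c_bar, 1, n, k, n_prime)


p = 2**255 - 19                         # example prime modulus
x, y = 123456789, 987654321
product = mod_mul(x, y, p, 256)
```

The conversions into Montgomery form dominate a single multiplication; the scheme pays off when many multiplications are chained, as in elliptic curve scalar multiplication.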

CiNii
再構成型プロセッサFE-GAへのフィルタマッピングとその自動化手法

本間雅行, 戸川望, 柳澤政生, 大附辰夫, 佐藤真琴

電子情報通信学会技術研究報告. CAS, 回路とシステム 107 ( 100 ) 67 - 72 2007年06月

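Recent digital equipment must process large volumes of highly varied data in a short time. Reconfigurable processors, which can run many arithmetic units in parallel, are a new architecture that meets this demand. Here we focus on FE-GA (Flexible Engine/Generic ALU array), a dynamically reconfigurable processor for digital media processing. Development tools for FE-GA are not yet established. As a first step toward a tool that maps applications onto FE-GA automatically, this paper takes up the FIR filter, a basic building block of digital media processing, and designs a circuit on FE-GA that implements it. We further propose a mapping method: given the order and coefficients of an n-th order FIR filter, it automatically generates FE-GA-specific assembly code. Loading this generated assembly code into a dedicated tool called FEEditor automates the mapping. Within the limits of the FE-GA architecture specification, the proposed method automates the mapping of FIR filters of every order and, provided no thread switching occurs, produces a mapping with the minimum number of cycles.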

CiNii
GF(2^n)及びGF(P)におけるスケーラブル双基数ユニファイド型モンゴメリ乗算器

谷村和幸, 奈良竜太, 小原俊逸, 史又華, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. SIP, 信号処理 107 ( 105 ) 43 - 48 2007年06月

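Montgomery multiplication is commonly used for modular multiplication, the dominant operation in elliptic curve cryptography, a form of public-key cryptography. Because the operand bit length varies with the required cryptographic strength, Montgomery multipliers need to be scalable. Elliptic curve cryptography operates over either GF(2^n) or GF(P), so scalable unified multipliers that handle both fields have also been proposed. However, the circuit for GF(P) has a longer delay than the one for GF(2^n), so one must either change the operating frequency per field or use a smaller radix for GF(P) only, at the cost of more clock cycles. This paper proposes a scalable dual-radix unified Montgomery multiplier for GF(2^n) and GF(P). The proposed architecture integrates a radix-2^16 GF(P) multiplier with four-way parallelism and a radix-2^64 GF(2^n) multiplier into a single unit. The dual-radix design reduces the delay gap between GF(2^n) and GF(P), and parallelization offsets the resulting increase in clock cycles on the low-radix side. As a result, the design became the fastest scalable unified Montgomery multiplier.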

CiNii
GF(2^n)及びGF(P)におけるスケーラブル双基数ユニファイド型モンゴメリ乗算器

谷村和幸, 奈良竜太, 小原俊逸, 史又華, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 107 ( 103 ) 43 - 48 2007年06月

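Montgomery multiplication is commonly used for modular multiplication, the dominant operation in elliptic curve cryptography, a form of public-key cryptography. Because the operand bit length varies with the required cryptographic strength, Montgomery multipliers need to be scalable. Elliptic curve cryptography operates over either GF(2^n) or GF(P), so scalable unified multipliers that handle both fields have also been proposed. However, the circuit for GF(P) has a longer delay than the one for GF(2^n), so one must either change the operating frequency per field or use a smaller radix for GF(P) only, at the cost of more clock cycles. This paper proposes a scalable dual-radix unified Montgomery multiplier for GF(2^n) and GF(P). The proposed architecture integrates a radix-2^16 GF(P) multiplier with four-way parallelism and a radix-2^64 GF(2^n) multiplier into a single unit. The dual-radix design reduces the delay gap between GF(2^n) and GF(P), and parallelization offsets the resulting increase in clock cycles on the low-radix side. As a result, the design became the fastest scalable unified Montgomery multiplier.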

CiNii
再構成型プロセッサFE-GAへのフィルタマッピングとその自動化手法

本間雅行, 戸川望, 柳澤政生, 大附辰夫, 佐藤真琴

電子情報通信学会技術研究報告. SIP, 信号処理 107 ( 104 ) 67 - 72 2007年06月

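Recent digital equipment must process large volumes of highly varied data in a short time. Reconfigurable processors, which can run many arithmetic units in parallel, are a new architecture that meets this demand. Here we focus on FE-GA (Flexible Engine/Generic ALU array), a dynamically reconfigurable processor for digital media processing. Development tools for FE-GA are not yet established. As a first step toward a tool that maps applications onto FE-GA automatically, this paper takes up the FIR filter, a basic building block of digital media processing, and designs a circuit on FE-GA that implements it. We further propose a mapping method: given the order and coefficients of an n-th order FIR filter, it automatically generates FE-GA-specific assembly code. Loading this generated assembly code into a dedicated tool called FEEditor automates the mapping. Within the limits of the FE-GA architecture specification, the proposed method automates the mapping of FIR filters of every order and, provided no thread switching occurs, produces a mapping with the minimum number of cycles.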

CiNii
再構成型プロセッサFE-GAへのフィルタマッピングとその自動化手法

本間雅行, 戸川望, 柳澤政生, 大附辰夫, 佐藤真琴

電子情報通信学会技術研究報告. VLD, VLSI設計技術 107 ( 102 ) 67 - 72 2007年06月

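Recent digital equipment must process large volumes of highly varied data in a short time. Reconfigurable processors, which can run many arithmetic units in parallel, are a new architecture that meets this demand. Here we focus on FE-GA (Flexible Engine/Generic ALU array), a dynamically reconfigurable processor for digital media processing. Development tools for FE-GA are not yet established. As a first step toward a tool that maps applications onto FE-GA automatically, this paper takes up the FIR filter, a basic building block of digital media processing, and designs a circuit on FE-GA that implements it. We further propose a mapping method: given the order and coefficients of an n-th order FIR filter, it automatically generates FE-GA-specific assembly code. Loading this generated assembly code into a dedicated tool called FEEditor automates the mapping. Within the limits of the FE-GA architecture specification, the proposed method automates the mapping of FIR filters of every order and, provided no thread switching occurs, produces a mapping with the minimum number of cycles.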

CiNii
楕円曲線暗号に適したGF(2^m)上のSIMD型MSD乗算器の設計

奈良竜太, 清水一範, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2007 ( 39 ) 57 - 61 2007年05月

Elliptic curve cryptosystem (ECC) hardware is often required to handle multiple key lengths depending on the application. Digit-serial multipliers can perform the GF(2^m) multiplications that make up ECC at high speed, but they are suited to operands of fixed length and have not fit cryptosystems that handle multiple key lengths. This paper therefore proposes a SIMD MSD multiplier over GF(2^m) suited to ECC. By running the MSD multiplier, a type of digit-serial multiplier, as parallel SIMD operations matched to the data length, elliptic curve scalar multiplication can be processed at high speed. Because the proposed multiplier handles all five field lengths recommended by NIST, it reduces area to as little as one third of that of five separate MSD multipliers at comparable processing speed. Moreover, since two multiplications can be processed simultaneously by SIMD operation for short key lengths, it achieves up to about twice the processing speed of a conventional MSD multiplier.
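
As background, the following sketch shows what a most-significant-digit-first (MSD) digit-serial multiplication in GF(2^m) computes, written in software with Python integers as bit vectors. It illustrates only the arithmetic, not the SIMD hardware organization of the proposed multiplier; the function names and the default digit size of 4 bits are assumptions.

```python
def gf2m_reduce(x, poly, m):
    """Reduce bit-vector polynomial x modulo the degree-m polynomial poly."""
    for bit in range(x.bit_length() - 1, m - 1, -1):
        if (x >> bit) & 1:
            x ^= poly << (bit - m)      # cancel the term of degree `bit`
    return x


def clmul(a, b):
    """Carry-less (XOR) product of two bit-vector polynomials."""
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a <<= 1
        b >>= 1
    return acc


def msd_multiply(a, b, poly, m, digit=4):
    """Multiply a and b in GF(2^m), consuming `digit` bits of b per step, MSD first."""
    steps = -(-m // digit)              # ceil(m / digit)
    digit_mask = (1 << digit) - 1
    acc = 0
    for i in range(steps - 1, -1, -1):
        d = (b >> (i * digit)) & digit_mask
        # Horner step: shift the accumulator by one digit, add a*d, reduce.
        acc = gf2m_reduce((acc << digit) ^ clmul(a, d), poly, m)
    return acc


# AES field GF(2^8) with x^8 + x^4 + x^3 + x + 1, used here only as a small check.
result = msd_multiply(0x57, 0x83, 0x11B, 8)
```

A larger digit size means fewer steps per multiplication at the cost of a wider partial-product unit, which is the area/speed trade-off digit-serial designs expose.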

CiNii
楕円曲線暗号に適したGF(2^m)上のSIMD型MSD乗算器の設計

奈良竜太, 清水一範, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 107 ( 32 ) 25 - 29 2007年05月

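Elliptic curve cryptosystem (ECC) hardware is often required to handle multiple key lengths depending on the application. Digit-serial multipliers can perform the GF(2^m) multiplications that make up ECC at high speed, but they are suited to operands of fixed length and have not fit cryptosystems that handle multiple key lengths. This paper therefore proposes a SIMD MSD multiplier over GF(2^m) suited to ECC. By running the MSD multiplier, a type of digit-serial multiplier, as parallel SIMD operations matched to the data length, elliptic curve scalar multiplication can be processed at high speed. Because the proposed multiplier handles all five field lengths recommended by NIST, it reduces area to as little as one third of that of five separate MSD multipliers at comparable processing speed. Moreover, since two multiplications can be processed simultaneously by SIMD operation for short key lengths, it achieves up to about twice the processing speed of a conventional MSD multiplier.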

CiNii
高効率Message-Passingスケジュールを用いたLDPC復号器の低消費電力化 (第20回回路とシステム軽井沢ワークショップ論文集) -- (アーキテクチャ設計と低電力化)

清水一範, 戸川望, 池永剛

回路とシステム軽井沢ワークショップ論文集 20 331 - 336 2007年04月

CiNii
モバイルユーザの目的地への方向性を考慮した楕円領域検索手法

山本隆之, 戸川望, 柳澤政生, 大附辰夫

電気学会研究会資料. ITS, ITS研究会 2007 ( 1 ) 25 - 30 2007年03月

CiNii
モバイルユーザの目的地への方向性を考慮した楕円領域検索手法

山本隆之, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 106 ( 616 ) 25 - 30 2007年03月

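Location-dependent information services, which determine a mobile user's current position with techniques such as GPS and provide information about the surroundings, are now being rolled out. Many existing systems, however, present the user with so much information that it is hard to separate what is really needed from what is not. Narrowing down the information returned by a search is therefore a challenge. As a way of narrowing information effectively for mobile users, this paper proposes an elliptical-region search method that takes the direction toward the destination into account. The method uses as its search range an ellipse whose two foci are the user's current position and destination, so that the search reflects the user's heading toward the destination. Comparative experiments against existing systems demonstrate the effectiveness of the proposed method.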
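
The geometric test behind such a search region is short: a point lies inside an ellipse exactly when the sum of its distances to the two foci does not exceed the major-axis length. The sketch below assumes planar coordinates and a hypothetical `stretch` parameter that sets the major axis as a multiple of the distance between current position and destination; the abstract does not say how the authors size the ellipse. `math.dist` requires Python 3.8 or later.

```python
import math


def in_ellipse(point, current, destination, major_axis):
    """True if point is inside the ellipse with foci current and destination."""
    return math.dist(point, current) + math.dist(point, destination) <= major_axis


def ellipse_search(pois, current, destination, stretch=1.2):
    """Return names of points of interest inside the elliptical search region.

    stretch (assumed parameter) is the major-axis length divided by the
    distance between the two foci; it must be greater than 1.
    """
    major_axis = stretch * math.dist(current, destination)
    return [name for name, pos in pois
            if in_ellipse(pos, current, destination, major_axis)]


pois = [("cafe", (5.0, 1.0)),    # slightly off the direct path
        ("bank", (5.0, 6.0)),    # far to the side
        ("shop", (-4.0, 0.0))]   # behind the user
hits = ellipse_search(pois, current=(0.0, 0.0), destination=(10.0, 0.0))
```

Compared with a circle around the current position, the ellipse drops places behind the user, such as "shop" here, while keeping places along the way to the destination.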

CiNii
無線センサネットワークにおけるエネルギー消費削減のためのクラスタリング手法

廣瀬文昭, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. NS, ネットワークシステム 106 ( 577 ) 41 - 46 2007年03月

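In recent years, wireless sensor networks have come to be expected as a new communication infrastructure technology for the ubiquitous society. Conventional network protocols such as TCP/IP were designed with throughput and latency as top priorities, and energy consumption was secondary. Protocols for sensor networks, however, must be designed with energy saving as the top priority, because physically managing the power supply of nodes is difficult and nodes are required to be small and cheap. This paper therefore proposes a communication protocol for sensor networks that aims to extend the lifetime of the whole network and to raise the data acquisition rate at the base station. It extends LEACH[2], a protocol for cluster-based sensor networks, by cutting redundant communication overhead and forming optimal clusters among nodes to make communication more efficient.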

CiNii
エニーキャストにおけるサーバ処理時間を考慮した経路選択手法

楊夏, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. NS, ネットワークシステム 106 ( 577 ) 381 - 386 2007年03月

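This paper analyzes the current approach of selecting the best server (the one with the shortest delay) based on path delay, and proposes a server selection method for anycast that also takes server processing time and processing capacity into account. Because the proposed method considers server processing time in addition to path delay, it can select a more suitable server than the conventional method even when network traffic increases. A quantitative evaluation with OPNET showed that path delay was reduced by 30% or more, and that the gain in throughput was also better than with the conventional method.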

CiNii
アプリケーションに特化した動的再構成可能なネットワークプロセッサ

大田元則, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2007 ( 7 ) 2007年

J-GLOBAL
応用指向動的再構成なネットワークプロセッサ設計手法

大田元則, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2007 ( 8 ) 2007年

J-GLOBAL
柔軟かつ適応的な通信処理システムLSI設計に関する研究

戸川望

電気通信普及財団研究調査報告書(CD-ROM) ( 22 ) 2007年

J-GLOBAL
SIMDプロセッサコアの面積/遅延見積もり手法

山崎大輔, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2007 ( 8 ) 2007年

J-GLOBAL
HW/SW協調合成におけるASIPの面積/遅延見積もり手法

山崎大輔, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2007 ( 7 ) 2007年

J-GLOBAL
GF(2^m)上のSIMD型MSD乗算器を用いた楕円曲線暗号回路の実装

奈良竜太, 清水一範, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2007 ( 7 ) 2007年

J-GLOBAL
楕円曲線暗号用SIMD型MSD乗算器の設計

奈良竜太, 清水一範, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会シンポジウム論文集 2007 ( 8 ) 2007年

J-GLOBAL
歩行者ナビゲーションにおける微小画面での視認性とユーザの迷いにくさを考慮した略地図生成方法

二宮直也, 戸川望, 柳沢政生, 大附辰夫

情報処理学会研究報告高度交通システム（ITS） 2006 ( 103 ) 111 - 116 2006年09月

With the spread of location information services and Internet services on cellular phones, the use of map services for pedestrians has expanded. Accordingly, techniques for automatically generating deformed (simplified) maps that work well on mobile devices with small displays are being actively studied. Existing techniques, which are based on making road shapes horizontal or vertical and quantizing intersection angles, can produce well-designed, grid-like maps, but such maps are not necessarily ones on which users are unlikely to get lost. This paper proposes a simplification algorithm that generates deformed maps reflecting human criteria for judging direction and the influence of intersection shape on pedestrians, aiming at maps that are legible even on a very small display such as a cellular phone and on which users do not easily lose their way. We applied the method to road-network data with about 400 nodes and confirmed that deformed maps are generated.

CiNii
車車間・路車間通信を用いた車線別の渋滞情報の検出手法

大高宏介, 戸川望, 柳澤政生, 大附辰夫

電気学会研究会資料. ITS, ITS研究会 2006 ( 20 ) 19 - 24 2006年09月

CiNii
車車間・路車間通信を用いた車線別の渋滞情報の検出手法

大高宏介, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告高度交通システム（ITS） 2006 ( 103 ) 19 - 24 2006年09月

As ITS technology has evolved in recent years, the positioning accuracy and route guidance of car navigation systems have been improving. However, the estimated time required from origin to destination is still not accurate enough, and how to obtain accurate congestion information remains an issue. Because congestion that differs from lane to lane has a particularly large effect, it is necessary to detect congestion individually for each lane, for example at intersections, without incurring the problems of existing congestion detection methods. This paper proposes a method that uses vehicle-to-vehicle and road-to-vehicle communication to detect congestion information lane by lane in real time at each intersection on ordinary roads. Communication starts from a beacon and vehicle-to-vehicle communication is repeated to collect vehicle information, from which the time taken to pass through the congestion is calculated. We show the effectiveness of the method by simulation.

CiNii
車車間・路車間通信を用いた車線別の渋滞情報の検出手法

大高宏介, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 106 ( 265 ) 19 - 24 2006年09月

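As ITS technology has evolved in recent years, the positioning accuracy and route guidance of car navigation systems have been improving. However, the estimated time required from origin to destination is still not accurate enough, and how to obtain accurate congestion information remains an issue. Because congestion that differs from lane to lane has a particularly large effect, it is necessary to detect congestion individually for each lane, for example at intersections, without incurring the problems of existing congestion detection methods. This paper proposes a method that uses vehicle-to-vehicle and road-to-vehicle communication to detect congestion information lane by lane in real time at each intersection on ordinary roads. Communication starts from a beacon and vehicle-to-vehicle communication is repeated to collect vehicle information, from which the time taken to pass through the congestion is calculated. We show the effectiveness of the method by simulation.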

CiNii
H.264/AVC符号化向けDSPにおける動き予測演算器の設計

高橋豊和, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 106 ( 114 ) 13 - 18 2006年06月

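H.264/AVC achieves high coding efficiency but requires a large amount of processing for encoding, more than 90% of which is motion estimation. The main causes are the multiple reference frames, variable block sizes, and quarter-pixel motion compensation introduced to improve coding efficiency. To speed this up, integer-pixel motion estimation architectures supporting multiple reference frames and variable block sizes have been proposed. These architectures, however, need irregular memory accesses when the search position moves, which makes it hard to improve performance in applications such as DSP embedding where memory bandwidth is limited. To address this problem, this paper proposes a DSP-embedded integer-pixel motion estimation architecture based on pixel subsampling. Pixel subsampling thins out the pixels used in the computation and is generally used to reduce hardware cost. The proposed architecture changes the subsampling pattern from the usual chessboard pattern to vertical stripes, which reduces the data-load cycles of the arithmetic unit and speeds up motion estimation. Running at 200MHz, the proposed architecture can perform motion estimation on CIF images at 86.5fps.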
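
The difference between the two subsampling patterns can be seen in a small software sketch of the sum of absolute differences (SAD), the usual block-matching cost. The code assumes that "vertical stripes" means keeping every other pixel column; the function names and the 4x4 sample blocks are illustrative and say nothing about the actual datapath of the proposed architecture.

```python
def sad(cur, ref, keep):
    """Sum of absolute differences over the pixels selected by keep(x, y)."""
    return sum(abs(c - r)
               for y, (cur_row, ref_row) in enumerate(zip(cur, ref))
               for x, (c, r) in enumerate(zip(cur_row, ref_row))
               if keep(x, y))


def chessboard(x, y):
    # Conventional pattern: alternate pixels, offset on every other row.
    return (x + y) % 2 == 0


def vertical_stripes(x, y):
    # Assumed stripe pattern: the same columns are kept on every row.
    return x % 2 == 0


def full(x, y):
    return True


cur = [[10, 20, 30, 40] for _ in range(4)]
ref = [[12, 20, 33, 40] for _ in range(4)]
sad_full = sad(cur, ref, full)
sad_chess = sad(cur, ref, chessboard)
sad_stripes = sad(cur, ref, vertical_stripes)
```

Both patterns use half of the pixels. With stripes the selected columns are identical on every row, so skipped columns never have to be loaded, whereas the chessboard pattern touches every column.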

CiNii
H.264/AVC符号化向けDSPにおける動き予測演算器の設計

高橋豊和, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. SIP, 信号処理 106 ( 116 ) 13 - 18 2006年06月

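H.264/AVC achieves high coding efficiency but requires a large amount of processing for encoding, more than 90% of which is motion estimation. The main causes are the multiple reference frames, variable block sizes, and quarter-pixel motion compensation introduced to improve coding efficiency. To speed this up, integer-pixel motion estimation architectures supporting multiple reference frames and variable block sizes have been proposed. These architectures, however, need irregular memory accesses when the search position moves, which makes it hard to improve performance in applications such as DSP embedding where memory bandwidth is limited. To address this problem, this paper proposes a DSP-embedded integer-pixel motion estimation architecture based on pixel subsampling. Pixel subsampling thins out the pixels used in the computation and is generally used to reduce hardware cost. The proposed architecture changes the subsampling pattern from the usual chessboard pattern to vertical stripes, which reduces the data-load cycles of the arithmetic unit and speeds up motion estimation. Running at 200MHz, the proposed architecture can perform motion estimation on CIF images at 86.5fps.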

CiNii
H.264/AVC符号化向けDSPにおける動き予測演算器の設計

高橋豊和, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CAS, 回路とシステム 106 ( 112 ) 13 - 18 2006年06月

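H.264/AVC achieves high coding efficiency but requires a large amount of processing for encoding, more than 90% of which is motion estimation. The main causes are the multiple reference frames, variable block sizes, and quarter-pixel motion compensation introduced to improve coding efficiency. To speed this up, integer-pixel motion estimation architectures supporting multiple reference frames and variable block sizes have been proposed. These architectures, however, need irregular memory accesses when the search position moves, which makes it hard to improve performance in applications such as DSP embedding where memory bandwidth is limited. To address this problem, this paper proposes a DSP-embedded integer-pixel motion estimation architecture based on pixel subsampling. Pixel subsampling thins out the pixels used in the computation and is generally used to reduce hardware cost. The proposed architecture changes the subsampling pattern from the usual chessboard pattern to vertical stripes, which reduces the data-load cycles of the arithmetic unit and speeds up motion estimation. Running at 200MHz, the proposed architecture can perform motion estimation on CIF images at 86.5fps.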

CiNii
HW/SW協調合成におけるアプリケーションプロセッサの面積/遅延見積もり手法

山崎大輔, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 106 ( 113 ) 1 - 6 2006年06月

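This paper proposes a method for estimating the area and delay of application processors that accommodates changes in the number of pipeline stages and in the control structure. In HW/SW co-synthesis of a processor, the configuration best suited to the target application is determined, and the hardware and software parts of the processor are designed simultaneously. If logic synthesis were run on every candidate configuration to judge which is best, the search would take an enormous amount of time, so estimated area and delay must be used as the evaluation metrics to allow fast exploration without logic synthesis. Moreover, if the estimates used in architecture exploration deviate greatly from the logic synthesis results, the search may fail to reach an appropriate solution, so accurate estimation is important. The proposed method divides the processor core into functional parts, parameterizes them, and derives estimation formulas by analyzing logic synthesis results. With the derived formulas, the relative error between the estimated core area and the logic synthesis value averaged 1.13%, and the delay error averaged 0.14ns.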

CiNii
HW/SW協調合成におけるアプリケーションプロセッサの面積/遅延見積もり手法

山崎大輔, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. SIP, 信号処理 106 ( 115 ) 1 - 6 2006年06月

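This paper proposes a method for estimating the area and delay of application processors that accommodates changes in the number of pipeline stages and in the control structure. In HW/SW co-synthesis of a processor, the configuration best suited to the target application is determined, and the hardware and software parts of the processor are designed simultaneously. If logic synthesis were run on every candidate configuration to judge which is best, the search would take an enormous amount of time, so estimated area and delay must be used as the evaluation metrics to allow fast exploration without logic synthesis. Moreover, if the estimates used in architecture exploration deviate greatly from the logic synthesis results, the search may fail to reach an appropriate solution, so accurate estimation is important. The proposed method divides the processor core into functional parts, parameterizes them, and derives estimation formulas by analyzing logic synthesis results. With the derived formulas, the relative error between the estimated core area and the logic synthesis value averaged 1.13%, and the delay error averaged 0.14ns.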

CiNii
HW/SW協調合成におけるアプリケーションプロセッサの面積/遅延見積もり手法

山崎大輔, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. CAS, 回路とシステム 106 ( 111 ) 1 - 6 2006年06月

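This paper proposes a method for estimating the area and delay of application processors that accommodates changes in the number of pipeline stages and in the control structure. In HW/SW co-synthesis of a processor, the configuration best suited to the target application is determined, and the hardware and software parts of the processor are designed simultaneously. If logic synthesis were run on every candidate configuration to judge which is best, the search would take an enormous amount of time, so estimated area and delay must be used as the evaluation metrics to allow fast exploration without logic synthesis. Moreover, if the estimates used in architecture exploration deviate greatly from the logic synthesis results, the search may fail to reach an appropriate solution, so accurate estimation is important. The proposed method divides the processor core into functional parts, parameterizes them, and derives estimation formulas by analyzing logic synthesis results. With the derived formulas, the relative error between the estimated core area and the logic synthesis value averaged 1.13%, and the delay error averaged 0.14ns.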

CiNii
設計事例 FIFOバッファによる高効率Message-Passingスケジュールを用いたLDPC復号器

清水一範, 石川達之, 戸川望

回路とシステム軽井沢ワークショップ論文集 19 211 - 216 2006年04月

CiNii
アプリケーションプロセッサのデータキャッシュ構成最適化手法 (リコンフィギャラブルとキャッシュ最適化)

堀内一央, 小原俊逸, 戸川望

回路とシステム軽井沢ワークショップ論文集 19 583 - 588 2006年04月

CiNii
歩行者ナビゲーションシステムにおける携帯電話カメラ機能を利用した位置補正手法

中口智史, 戸川望, 柳澤政生, 大附辰夫

電気学会研究会資料. ITS, ITS研究会 2006 ( 1 ) 25 - 30 2006年03月

CiNii
歩行者ナビゲーションシステムにおける携帯電話カメラ機能を利用した位置補正手法

中口智史, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ITS 105 ( 688 ) 25 - 30 2006年03月

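Positioning for pedestrian navigation currently relies in general on GPS or mobile-phone base-station information. In urban areas, however, large errors can arise from factors such as multipath and a reduced number of trackable satellites. The method proposed in this paper uses the now widespread camera-equipped mobile phone to photograph and recognize road traffic signs in the street, aiming at low-cost, highly accurate positioning that makes use of existing infrastructure. We confirmed the effectiveness of the proposed method through model experiments in a real environment.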

CiNii
動的フローに適応したネットワークプロセッサの改良とその評価

田淵英孝, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 105 ( 644 ) 25 - 30 2006年03月

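A network processor adaptive to dynamic flows consists of processors that perform fixed processing and processors whose processing changes dynamically, and can dynamically parallelize whichever processing becomes the bottleneck. As an improvement to such a network processor, this paper proposes a CAM module for network processors. The proposed module splits the routing table by prefix length according to the distribution of IP prefix lengths observed on networks and is implemented as a combination of CAM and TCAM, with the aim of reducing area and raising the throughput of the whole network processor. We implemented the proposed architecture in VHDL, a hardware description language, ran computer experiments, and show its effectiveness by comparing area and the number of threads needed to reach a line rate of 1Gbps. In the evaluation, area improved by 11.9% in a 0.35μm process, and the number of packets processed improved by up to 16.3%.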
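
Splitting a routing table by prefix length turns longest-prefix matching into a series of exact-match lookups, which is what makes a plain CAM usable for part of the table. The sketch below is a software analogy only, with one dictionary per prefix length standing in for an exact-match memory; the class name, the IPv4-only masking, and the sample routes are assumptions, and the CAM/TCAM partitioning of the actual module is not modeled.

```python
import ipaddress


class PrefixTables:
    """IPv4 longest-prefix match using one exact-match table per prefix length."""

    def __init__(self):
        self.tables = {}   # prefix length -> {network address as int: next hop}

    def add(self, cidr, next_hop):
        net = ipaddress.ip_network(cidr)
        table = self.tables.setdefault(net.prefixlen, {})
        table[int(net.network_address)] = next_hop

    def lookup(self, addr):
        a = int(ipaddress.ip_address(addr))
        # Try the longest prefixes first; the first hit is the best match.
        for length in sorted(self.tables, reverse=True):
            mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
            hop = self.tables[length].get(a & mask)
            if hop is not None:
                return hop
        return None


routes = PrefixTables()
routes.add("0.0.0.0/0", "default")
routes.add("10.0.0.0/8", "A")
routes.add("10.1.0.0/16", "B")
hop_specific = routes.lookup("10.1.2.3")
hop_general = routes.lookup("10.2.0.1")
hop_default = routes.lookup("192.168.1.1")
```

In hardware the per-length tables can be searched in parallel rather than in sequence; the loop here only makes the priority order explicit.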

CiNii
動的フローに適応したネットワークプロセッサの改良とその評価

田淵英孝, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

電子情報通信学会技術研究報告. ICD, 集積回路 105 ( 646 ) 25 - 30 2006年03月

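A network processor adaptive to dynamic flows consists of processors that perform fixed processing and processors whose processing changes dynamically, and can dynamically parallelize whichever processing becomes the bottleneck. As an improvement to such a network processor, this paper proposes a CAM module for network processors. The proposed module splits the routing table by prefix length according to the distribution of IP prefix lengths observed on networks and is implemented as a combination of CAM and TCAM, with the aim of reducing area and raising the throughput of the whole network processor. We implemented the proposed architecture in VHDL, a hardware description language, ran computer experiments, and show its effectiveness by comparing area and the number of threads needed to reach a line rate of 1Gbps. In the evaluation, area improved by 11.9% in a 0.35μm process, and the number of packets processed improved by up to 16.3%.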

CiNii
柔軟かつ適応的な通信処理システムLSI設計に関する研究

戸川望

電気通信普及財団研究調査報告書(CD-ROM) ( 21 ) 2006年

J-GLOBAL
レジスタ分散・共有アーキテクチャを対象としたフロアプラン指向高位合成手法

大智輝, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2005 ( 121 ) 73 - 78 2005年11月

As LSI process technology scales down, interconnect delay has been growing relative to gate delay, and the floorplan must be considered even at the high-level synthesis stage. A distributed-register architecture can remove the effect of interconnect delay on circuit performance by using register-to-register data transfer, but the number of registers grows and circuit area increases. This paper targets a distributed/shared-register architecture, which combines distributed and shared registers, and proposes a high-level synthesis method that repeats (1) scheduling, (2) register allocation, (3) register binding, and (4) module placement, feeding the floorplan information obtained in (4) back into the flow until the solution converges. The method can reduce area while maintaining circuit performance equal to that of a distributed-register architecture. Computer experiments demonstrate the effectiveness of the proposed method.

CiNii
重回帰分析により得られた1次式によるインダクタンスを考慮した配線遅延の見積り

鈴木康成, マルタディナタアンワル, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2005 ( 121 ) 109 - 114 2005年11月

In the DSM (Deep SubMicron technology) era, high-level design must take the floorplan, interconnect resistance, and similar factors into account. It has also been shown that the effect of inductance on clock, power, bus, and macroblock interconnect is considerable, so inductance cannot be ignored when estimating global interconnect delay, which is done repeatedly during high-level design. This paper describes a method for estimating global interconnect delay with inductance taken into account. We estimate the time for the step response of a driver / RLC interconnect / load model to reach 50% (the 50% delay). The proposed estimate uses a first-order (linear) expression obtained in advance by multiple regression analysis with the element values as explanatory variables. The method is applicable when time of flight dominates the delay, and its estimates differ from SPICE results by about 15% at most and about 2.5% on average. It achieved higher accuracy with less computation than a conventional method.

レジスタ分散・共有アーキテクチャを対象としたフロアプラン指向高位合成手法

大智輝, 戸川望, 柳澤政生, 大附辰夫

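電子情報通信学会技術研究報告. VLD, VLSI設計技術 105 ( 442 ) 31 - 36 2005年11月

As LSI design processes shrink, interconnect delay grows relative to gate delay, so floorplan information must be considered even at the high-level synthesis stage. Distributed-register architectures reduce the influence of interconnect delay on circuit performance by using register-to-register data transfer, but the larger number of registers increases circuit area. This paper proposes a high-level synthesis method targeting a distributed/shared-register architecture, which combines distributed and shared registers. The method iterates (1) scheduling, (2) register allocation, (3) register binding, and (4) module placement, feeding the floorplan information obtained in (4) back into the next iteration until the solution converges. It reduces circuit area while maintaining performance equal to that of distributed-register architectures. Computational experiments show the effectiveness of the proposed method.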

重回帰分析により得られた1次式によるインダクタンスを考慮した配線遅延の見積り

鈴木康成, アンワルマルタディナタ, 戸川望, 柳澤政生, 大附辰夫

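電子情報通信学会技術研究報告. VLD, VLSI設計技術 105 ( 442 ) 67 - 72 2005年11月

In the deep-submicron (DSM) era, high-level design must take floorplanning, interconnect resistance, and similar factors into account. Inductance also cannot be ignored when estimating global interconnect delay, which is done repeatedly during high-level design. This paper describes a method for estimating global interconnect delay that accounts for inductance. It estimates the 50% delay, i.e., the time for the step response of a driver-RLC interconnect-load model to reach 50%. The estimate is given by a linear equation obtained in advance by multiple regression analysis with the element values as explanatory variables. The method is applicable when time of flight dominates the delay, and its estimates differ from SPICE results by about 15% at most and about 2.5% on average.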

Asia and South Pacific Design Automation Conference 2005(ASP-DAC 2005, アジア・南太平洋設計自動化会議2005)

戸川望

電子情報通信学会誌 88 ( 4 ) 303 - 303 2005年04月

Sum-Product アルゴリズムによる信頼度情報の伝播を改善する部分並列LDPC復号器の実装と評価

清水一範, 石川達之, 戸川望, 池永剛, 後藤敏

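電子情報通信学会技術研究報告. VLD, VLSI設計技術 104 ( 709 ) 73 - 78 2005年03月

This paper proposes a partially parallel LDPC decoder that improves the propagation of reliability information in the Sum-Product algorithm. The proposed decoder performs column processing in step with the row processing of the Sum-Product algorithm. The column processing module pipelines all bit nodes belonging to the check nodes updated by the row processing executed in parallel, which increases the number of times reliability information is propagated. We implemented a partially parallel LDPC decoder using this propagation scheme on an FPGA; the evaluation confirmed that it improves both the number of decoding iterations and the decoding performance for LDPC codes.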

Sum-Product アルゴリズムによる信頼度情報の伝播を改善する部分並列LDPC復号器の実装と評価

清水一範, 石川達之, 戸川望, 池永剛, 後藤敏

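電子情報通信学会技術研究報告. ICD, 集積回路 104 ( 711 ) 73 - 78 2005年03月

This paper proposes a partially parallel LDPC decoder that improves the propagation of reliability information in the Sum-Product algorithm. The proposed decoder performs column processing in step with the row processing of the Sum-Product algorithm. The column processing module pipelines all bit nodes belonging to the check nodes updated by the row processing executed in parallel, which increases the number of times reliability information is propagated. We implemented a partially parallel LDPC decoder using this propagation scheme on an FPGA; the evaluation confirmed that it improves both the number of decoding iterations and the decoding performance for LDPC codes.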

A Hardware/Software Cosynthesis Algorithm for Processors with Heterogeneous Datapaths

MIYAOKA Yuichiro, TOGAWA Nozomu, YANAGISAWA Masao, OHTSUKI Tatsuo

IEICE transactions on fundamentals of electronics, communications and computer sciences 87 ( 4 ) 830 - 836 2004年04月

This paper proposes a hardware/software cosynthesis algorithm for processors with heterogeneous registers. Given a CDFG corresponding to an application program and a timing constraint, the algorithm generates a processor configuration that minimizes the processor area, together with assembly code for that processor. First, the algorithm configures a datapath that can execute several data-dependent DFG nodes in one cycle. This datapath executes the application program in the fewest cycles. A branch-and-bound algorithm is then applied, trying every number of functional units and memory banks. For each assumed number of functional units and memory banks, an appropriate number of heterogeneous registers and the connections to functional units and registers are explored. The experimental results show the effectiveness and efficiency of the algorithm.

面積制約を考慮したCAMプロセッサ向けハードウェア／ソフトウェア協調設計手法

石川裕一朗, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2003 ( 105 ) 175 - 180 2003年10月

We have been building a hardware/software cosynthesis system for processors with a content addressable memory (CAM). The current system takes an application program written in C as input and outputs an optimal hardware configuration of a CAM processor that executes it. This paper extends the system with area constraints. The proposed method derives the number of CAM words that minimizes execution time while meeting the area constraint, and reduces the processor area by replacing part of the CAM with RAM according to that number. Experimental results for practical application programs show that the system outputs the processor configuration that executes the input application fastest while meeting the area constraint.

高位合成システムにおけるスレッド分割を用いた低消費電力化手法

内田純平, 戸川望, 柳澤政生, 大附辰夫

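電子情報通信学会技術研究報告. VLD, VLSI設計技術 102 ( 684 ) 7 - 12 2003年02月

This paper proposes a low-power design method based on thread partitioning in a high-level synthesis system. The method targets high-level synthesis systems in which circuit blocks operating in parallel (threads) can be described explicitly. For a thread that implements a datapath, the method focuses on local registers and partitions the thread so that each such register is generated in only one of the resulting sub-threads. Synchronous communication is then required to preserve data dependencies, so the sub-threads have wait states for it. Applying gated clocks to sub-threads in the wait state achieves an effective power reduction with little circuit-area overhead compared with the unpartitioned thread. Computational experiments confirm the reduction in power consumption.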

SIMD型プロセッサコア向けHW/SW分割におけるSIMD型演算最適化手法

太刀掛宏一, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

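電子情報通信学会技術研究報告. VLD, VLSI設計技術 102 ( 684 ) 13 - 18 2003年02月

A SIMD operation consists of an arithmetic operation, a shift operation, and a saturation operation, which are called the sub-operations of the SIMD operation. Executing all sub-operations within one clock cycle is fast, but makes the SIMD functional unit complex. Assigning each sub-operation to its own SIMD functional unit simplifies each unit, but increases the application execution time. This paper proposes a SIMD operation optimization method for HW/SW partitioning targeting SIMD processor cores. The input is a timing constraint and an application written on the assumption that the processor core executes each SIMD operation in one clock cycle. The method splits one SIMD operation in the application into two operations: an arithmetic operation with bit extension and a bit-reduction operation with shift. It keeps splitting as long as the timing constraint is met, thereby reducing the hardware cost of the processor core. We evaluate the method through computational experiments and report the results.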
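The split described above can be illustrated with a small model of one packed operation. This is a sketch under assumptions that are not in the abstract: 16-bit signed lanes, addition as the arithmetic sub-operation, an arithmetic right shift, and the hypothetical function names used below.

```python
LANE_BITS = 16  # assumed lane width
LO, HI = -(1 << (LANE_BITS - 1)), (1 << (LANE_BITS - 1)) - 1


def saturate(v):
    """Clamp a wide intermediate value to the signed lane range."""
    return max(LO, min(HI, v))


def simd_fused(a, b, shift):
    """One-cycle model: add, shift, and saturate each lane in a single op."""
    return [saturate((x + y) >> shift) for x, y in zip(a, b)]


def simd_add_extended(a, b):
    """First half of the split: add with bit extension (no clamping)."""
    return [x + y for x, y in zip(a, b)]  # Python ints model the wider lanes


def simd_shift_reduce(wide, shift):
    """Second half: shift, then reduce back to the lane width by saturating."""
    return [saturate(v >> shift) for v in wide]


a = [30000, -30000, 100, 7]
b = [30000, -30000, 28, -9]
fused = simd_fused(a, b, 1)
split = simd_shift_reduce(simd_add_extended(a, b), 1)
```

Both paths give the same lane values. The two-step version uses two simpler units and takes an extra step per operation, which is the trade-off the method weighs against the timing constraint.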

閾値検索機能付きCAMプロセッサの最適化手法

戸塚崇夫, 宮岡祐一郎, 石川裕一朗, 戸川望, 柳澤政生, 大附辰夫

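電子情報通信学会技術研究報告. VLD, VLSI設計技術 102 ( 684 ) 19 - 24 2003年02月

To make effective use of a CAM (content addressable memory) with exact-match search and threshold search functions, such as greater-than-or-equal and less-than-or-equal search, circuits that process the search results in parallel are needed around the CAM cell array. The best type of CAM cell array and the best configuration of peripheral circuits differ for each application and its required performance, so they must be designed individually. This paper proposes an optimization method for CAM processors, i.e., processors that use a CAM. The method extracts the CAM functions from the application description and synthesizes a CAM processor. Then, based on branch and bound, it obtains the optimal CAM processor configuration that meets the application execution-time constraint by replacing CAM cell arrays with smaller-area ones and by substituting software for some functions. Introducing an improved hardware configuration tree is expected to shorten the search time for hardware configurations. We report and evaluate experimental results for the method.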

高位合成システムにおけるスレッド分割を用いた低消費電力化手法

内田純平, 戸川望, 柳澤政生, 大附辰夫

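電子情報通信学会技術研究報告. ICD, 集積回路 102 ( 686 ) 7 - 12 2003年02月

This paper proposes a low-power design method based on thread partitioning in a high-level synthesis system. The method targets high-level synthesis systems in which circuit blocks operating in parallel (threads) can be described explicitly. For a thread that implements a datapath, the method focuses on local registers and partitions the thread so that each such register is generated in only one of the resulting sub-threads. Synchronous communication is then required to preserve data dependencies, so the sub-threads have wait states for it. Applying gated clocks to sub-threads in the wait state achieves an effective power reduction with little circuit-area overhead compared with the unpartitioned thread. Computational experiments confirm the reduction in power consumption.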

SIMD型プロセッサコア向けHW/SW分割におけるSIMD型演算最適化手法

太刀掛宏一, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

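電子情報通信学会技術研究報告. ICD, 集積回路 102 ( 686 ) 13 - 18 2003年02月

A SIMD operation consists of an arithmetic operation, a shift operation, and a saturation operation, which are called the sub-operations of the SIMD operation. Executing all sub-operations within one clock cycle is fast, but makes the SIMD functional unit complex. Assigning each sub-operation to its own SIMD functional unit simplifies each unit, but increases the application execution time. This paper proposes a SIMD operation optimization method for HW/SW partitioning targeting SIMD processor cores. The input is a timing constraint and an application written on the assumption that the processor core executes each SIMD operation in one clock cycle. The method splits one SIMD operation in the application into two operations: an arithmetic operation with bit extension and a bit-reduction operation with shift. It keeps splitting as long as the timing constraint is met, thereby reducing the hardware cost of the processor core. We evaluate the method through computational experiments and report the results.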

閾値検索機能付きCAMプロセッサの最適化手法

戸塚崇夫, 宮岡祐一郎, 石川裕一朗, 戸川望, 柳澤政生, 大附辰夫

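電子情報通信学会技術研究報告. ICD, 集積回路 102 ( 686 ) 19 - 24 2003年02月

To make effective use of a CAM (content addressable memory) with exact-match search and threshold search functions, such as greater-than-or-equal and less-than-or-equal search, circuits that process the search results in parallel are needed around the CAM cell array. The best type of CAM cell array and the best configuration of peripheral circuits differ for each application and its required performance, so they must be designed individually. This paper proposes an optimization method for CAM processors, i.e., processors that use a CAM. The method extracts the CAM functions from the application description and synthesizes a CAM processor. Then, based on branch and bound, it obtains the optimal CAM processor configuration that meets the application execution-time constraint by replacing CAM cell arrays with smaller-area ones and by substituting software for some functions. Introducing an improved hardware configuration tree is expected to shorten the search time for hardware configurations. We report and evaluate experimental results for the method.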

ハードウェアIPの応答時間を考慮したプロセッサコアのハードウェア／ソフトウェア分割手法

田川博規, 小原俊逸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2003 ( 7 ) 93 - 98 2003年01月

This paper proposes a hardware/software partitioning algorithm for processor cores that considers the response time of hardware IPs. We have been proposing a system LSI design approach that first determines the hardware IPs to use for the target application and then synthesizes a processor core with neither excess nor shortage of functionality and performance. To synthesize a processor core with an appropriate configuration under this approach, hardware/software partitioning that considers the response time of the hardware IPs is effective. The proposed algorithm extends an existing method by considering hardware IP response time at the instruction level, which lets the processor core and the hardware IPs execute independent tasks efficiently in parallel. Experimental results show the effectiveness of the algorithm and of the design approach.

ハードウェアIPの応答時間を考慮したプロセッサコア合成システム

小原俊逸, 田川博規, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2003 ( 7 ) 87 - 92 2003年01月

This paper proposes a processor core synthesis system that considers the response time of hardware IPs, and a framework for system LSI design built on it. When designing a system LSI with hardware IPs, IPs whose performance is necessary and sufficient for the required system performance are not always available. Our approach is therefore as follows: after system-level hardware/software partitioning, we use IPs for the functions implemented in hardware, but for the processor core that runs the software we do not use an IP; instead, we automatically synthesize a processor core whose performance is just enough for the application. We design a JPEG encoder within the proposed framework, and the experimental results demonstrate the effectiveness of the synthesis system and the framework.

MPEG－4コアプロファイル符号化に対応した専用演算器を持つDSP

石本剛, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2003 ( 7 ) 81 - 86 2003年01月

This paper proposes a DSP with dedicated functional units for MPEG-4 core profile encoding. The proposed DSP can encode QCIF video at 30 fps using the core profile, which no other LSI for MPEG-4 had achieved. It has four dedicated hardware units: a shape coding unit, a padding unit, a quantization unit, and a variable-length coding unit. These units execute the computationally intensive processes of MPEG-4 core profile encoding. They take the form of functional units that the DSP can operate directly. To realize memory access for bitstreams of arbitrary length, the DSP uses a bitstream load unit and a bitstream store unit. These units speed up reading and writing the variable-length codes output by the shape coding unit and the variable-length coding unit, and make it possible to incorporate these coding units into the processor. The DSP thus combines the performance of dedicated hardware with the flexibility of a processor. We evaluated the DSP through a chip prototype, which was fabricated in a 0.35 μm CMOS process and operates at 40 MHz.

MPEG-4コアプロファイル符号化に対応した専用演算器を持つDSP

石本剛, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

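電子情報通信学会技術研究報告. CPSY, コンピュータシステム 102 ( 611 ) 25 - 30 2003年01月

This paper proposes a DSP with dedicated functional units for MPEG-4 core profile encoding. The proposed DSP can encode QCIF video at 30 fps using the core profile. It has four dedicated hardware units: a shape coding unit, a padding unit, a quantization unit, and a variable-length coding unit. These units execute the computationally intensive processes of MPEG-4 core profile encoding. They take the form of functional units that the DSP can operate directly. To realize memory access for bitstreams of arbitrary length, the DSP uses a bitstream load unit and a bitstream store unit. These units speed up writing the codes output by the shape coding unit and the variable-length coding unit to memory, and make it possible to incorporate these coding units into the processor. The DSP thus combines the performance of dedicated hardware with the flexibility of a processor. We evaluated the DSP through a chip prototype, which was fabricated in a 0.35 μm process and operates at 40 MHz.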

MPEG-4コアプロファイル符号化に対応した専用演算器を持つDSP

石本剛, 宮岡祐一郎, 戸川望, 柳澤政生, 大附辰夫

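電子情報通信学会技術研究報告. VLD, VLSI設計技術 102 ( 609 ) 25 - 30 2003年01月

This paper proposes a DSP with dedicated functional units for MPEG-4 core profile encoding. The proposed DSP can encode QCIF video at 30 fps using the core profile. It has four dedicated hardware units: a shape coding unit, a padding unit, a quantization unit, and a variable-length coding unit. These units execute the computationally intensive processes of MPEG-4 core profile encoding. They take the form of functional units that the DSP can operate directly. To realize memory access for bitstreams of arbitrary length, the DSP uses a bitstream load unit and a bitstream store unit. These units speed up writing the codes output by the shape coding unit and the variable-length coding unit to memory, and make it possible to incorporate these coding units into the processor. The DSP thus combines the performance of dedicated hardware with the flexibility of a processor. We evaluated the DSP through a chip prototype, which was fabricated in a 0.35 μm process and operates at 40 MHz.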

A High-Level Energy-Optimizing Algorithm for System VLSIs Based on Area/Time/Power Estimation

NODA Shinichi, TOGAWA Nozomu, YANAGISAWA Masao, OHTSUKI Tatsuo

IEICE transactions on fundamentals of electronics, communications and computer sciences 85 ( 12 ) 2655 - 2666 2002年12月

This paper proposes a high-level energy-optimizing algorithm that synthesizes low-energy system VLSIs. Given an initial system hardware obtained from an abstract behavioral description, the algorithm applies three energy reduction techniques to it: 1) reducing the supply voltage, 2) selecting lower-energy modules, and 3) applying gated clocks. By incorporating our area/delay/power estimation, the algorithm obtains low-energy system VLSIs that meet constraints on area, delay, and execution time. The algorithm has been incorporated into a high-level synthesis system, and experimental results demonstrate its effectiveness and efficiency.

High-Level Area/Delay/Power Estimation for Low Power System VLSIs with Gated Clocks

NODA Shinichi, TOGAWA Nozomu, YANAGISAWA Masao, OHTSUKI Tatsuo

IEICE transactions on fundamentals of electronics, communications and computer sciences 85 ( 4 ) 827 - 834 2002年04月

In high-level synthesis for system VLSIs, power consumption can be reduced efficiently by applying gated clocks. Since gated clocks reduce power consumption but increase area and delay, it is very important to estimate the trade-off between power and area/delay that they introduce. In this paper, we discuss how much area, delay, and power change when gated clocks are applied. We propose a simple gate-level circuit model and estimation equations. We vary the parameters of the circuit model and evaluate power consumption by back-annotating gate-level simulation results to the original circuit. This paper also proposes a conditional expression for applying gated clocks, which shows whether gated clocks can reduce power consumption. We confirm the accuracy of the proposed estimation equations by experiments.

Packed SIMD 型演算器を持つディジタル信号処理プロセッサのためのリターゲッタブルシミュレータ生成手法

笠原亨介, 戸川望, 柳澤政生, 大附辰夫

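電子情報通信学会技術研究報告. VLD, VLSI設計技術 101 ( 695 ) 17 - 24 2002年03月

Hardware/software cosynthesis of a processor core with packed SIMD instructions requires a simulator dedicated to the synthesized core. However, because there are many kinds of packed SIMD instructions, descriptions of them for the simulator cannot be prepared in advance. This paper proposes a method that generates the packed SIMD instruction descriptions needed for simulation by combining behavioral descriptions of the instructions' partial functions. The method makes it possible to simulate processor cores with arbitrary packed SIMD instructions. We implemented the method on a computer, present the results of generating simulators for processors with packed SIMD instructions, and evaluate its effectiveness.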

システムVLSIのための高位面積／遅延／消費電力見積もりに基づく低消費電力指向高位合成手法

野田真一, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2002 ( 5 ) 169 - 176 2002年01月

This paper proposes a high-level synthesis system that can synthesize low-power system VLSIs under constraints on area, delay, and execution time. The system first obtains an initial hardware from an abstract behavioral description and then applies three power reduction techniques to it: 1) reducing the supply voltage, 2) selecting lower-power modules, and 3) applying gated clocks. These techniques generally reduce power dissipation but may increase the area, delay, and/or execution time of the synthesized hardware. By estimating the changes in area, delay, and execution time, the proposed power optimization algorithm synthesizes hardware that consumes less power than the initial hardware while meeting each constraint. Experimental results confirm the reduction in power consumption and demonstrate the effectiveness and efficiency of the algorithm.

制御処理ハードウェアの高位合成システムにおける面積／遅延見積もり手法

余田貴幸, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2001 ( 12 ) 25 - 32 2001年02月

This paper proposes an area/delay estimation technique for a high-level synthesis system targeting control-dominated hardware. The estimator takes as input the state-transition graph built by area/time optimization, one of the components of the system, and outputs estimated area and delay values for that graph. By formulating estimation equations that depend on the number of operations, the number of states, and the types of operations, the technique obtains area and delay estimates that include the control part of the hardware. We apply the technique to several control-dominated application programs, including Huffman coding, and the experimental results demonstrate its effectiveness and efficiency.

発見的算法と分枝限定法を用いた計算時間予測に基づくリソースバインディング手法

中村洋, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 2001 ( 2 ) 65 - 72 2001年01月

This paper proposes a resource binding algorithm, driven by computation-time estimation, for datapath design of digital signal processing hardware in a high-level synthesis system. The algorithm combines a heuristic-based resource binder with a branch-and-bound-based resource binder. First, it predicts how the computation time changes as the number of resources assigned by the heuristic binder varies, with the remaining resources assigned by the branch-and-bound binder. Based on this prediction, it determines the number of resources to assign with the heuristic binder so that the computation-time constraint given by the designer is satisfied, and performs those assignments. The branch-and-bound binder then determines the remaining assignments, giving the final resource binding. Experimental results demonstrate the effectiveness and efficiency of the algorithm.
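The selection step can be sketched as follows. Everything named here is hypothetical: `predict_time` stands in for the paper's computation-time prediction, and the exponential model in the example is only a placeholder, not the authors' estimator. The sketch also assumes that leaving more assignments to branch and bound is preferable whenever the time budget allows.

```python
def choose_heuristic_count(total, budget, predict_time):
    """Return the smallest k (assignments fixed by the heuristic binder)
    whose predicted run time fits the budget, or None if none does."""
    for k in range(total + 1):
        if predict_time(k) <= budget:
            return k
    return None


def toy_predict_time(k, total=10, branching=2.0, unit=0.001):
    """Placeholder model: branch-and-bound cost grows exponentially with
    the number of assignments left to it (total - k)."""
    return unit * branching ** (total - k)


# With a 0.1 s budget, find how many of 10 assignments the heuristic takes.
k = choose_heuristic_count(10, 0.1, toy_predict_time)
```

After `k` is chosen, the heuristic binder would fix `k` assignments and the branch-and-bound binder would decide the rest.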

発見的算法と分枝限定法を用いた計算時間予測に基づくリソースバインディング手法

中村洋, 戸川望, 柳澤政生, 大附辰夫

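電子情報通信学会技術研究報告. CPSY, コンピュータシステム 100 ( 534 ) 17 - 24 2001年01月

This paper proposes a resource binding algorithm, driven by computation-time estimation, for datapath design of digital signal processing hardware. The algorithm combines a heuristic-based resource binder with a branch-and-bound-based resource binder. First, it predicts how the computation time changes as the number of resources assigned by the heuristic binder varies, with the remaining resources assigned by the branch-and-bound binder. Based on this prediction, it determines the number of resources to assign with the heuristic binder so that the computation-time constraint given by the designer is satisfied, and performs those assignments. The branch-and-bound binder then determines the remaining assignments, giving the final resource binding. Computational experiments confirm the effectiveness of the algorithm.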

発見的算法と分枝限定法を用いた計算時間予測に基づくリソースバインディング手法

中村洋, 戸川望, 柳澤政生, 大附辰夫

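電子情報通信学会技術研究報告. VLD, VLSI設計技術 100 ( 532 ) 17 - 24 2001年01月

This paper proposes a resource binding algorithm, driven by computation-time estimation, for datapath design of digital signal processing hardware. The algorithm combines a heuristic-based resource binder with a branch-and-bound-based resource binder. First, it predicts how the computation time changes as the number of resources assigned by the heuristic binder varies, with the remaining resources assigned by the branch-and-bound binder. Based on this prediction, it determines the number of resources to assign with the heuristic binder so that the computation-time constraint given by the designer is satisfied, and performs those assignments. The branch-and-bound binder then determines the remaining assignments, giving the final resource binding. Computational experiments confirm the effectiveness of the algorithm.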

機能メモリを使用したプロセッサの面積/遅延見積り手法 (デザインガイヤ2000) -- (VLSIの設計/検証/テスト及び一般)

余傳達彦, 戸川望, 柳澤政生, 大附辰夫

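電子情報通信学会技術研究報告 100 ( 475 ) 83 - 88 2000年11月

A hardware/software cosynthesis system for processors with functional memory needs estimates of the processor's area and delay. In this method, we first analyzed the logic circuits obtained by synthesizing processor descriptions with a logic synthesis tool and predicted what kind of function the area and delay estimation equations would be. Based on this prediction, we varied the processor description, ran logic synthesis, and derived the area/delay estimation equations. Compared with the area and delay values after logic synthesis, the derived equations have errors within 2.7% for area and 3.8% for delay.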

CAMプロセッサを対象とするハードウェア/ソフトウェア協調合成システム (デザインガイヤ2000) -- (VLSIの設計/検証/テスト及び一般)

涌井達彦, 戸川望, 柳澤政生, 大附辰夫

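電子情報通信学会技術研究報告 100 ( 475 ) 89 - 94 2000年11月

This paper proposes a hardware/software cosynthesis system for processors that use a CAM (a functional memory with exact-match search). The system takes as input an application program written in C that uses CAM functions, together with area and time constraints, and outputs a logic-synthesizable hardware description of a CAM processor, consisting of a CAM and a microprocessor unit, that satisfies the constraints, as well as binary code that runs on the CAM processor. Using branch and bound, the system decides whether each functional module responsible for the CAM's parallel processing is implemented in hardware or replaced by software. When an application program and a time constraint were given to our implementation of the system, it produced a hardware description and binary code for a CAM processor satisfying the constraint.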

機能メモリを使用したプロセッサの面積/遅延見積り手法

余傳達彦, 戸川望, 柳澤政生, 大附辰夫

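電子情報通信学会技術研究報告. ICD, 集積回路 100 ( 474 ) 83 - 88 2000年11月

A hardware/software cosynthesis system for processors with functional memory needs estimates of the processor's area and delay. In this method, we first analyzed the logic circuits obtained by synthesizing processor descriptions with a logic synthesis tool and predicted what kind of function the area and delay estimation equations would be. Based on this prediction, we varied the processor description, ran logic synthesis, and derived the area/delay estimation equations. Compared with the area and delay values after logic synthesis, the derived equations have errors within 2.7% for area and 3.8% for delay.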

機能メモリを使用したプロセッサの面積/遅延見積り手法

余傳達彦, 戸川望, 柳澤政生, 大附辰夫

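電子情報通信学会技術研究報告. VLD, VLSI設計技術 100 ( 473 ) 83 - 88 2000年11月

A hardware/software cosynthesis system for processors with functional memory needs estimates of the processor's area and delay. In this method, we first analyzed the logic circuits obtained by synthesizing processor descriptions with a logic synthesis tool and predicted what kind of function the area and delay estimation equations would be. Based on this prediction, we varied the processor description, ran logic synthesis, and derived the area/delay estimation equations. Compared with the area and delay values after logic synthesis, the derived equations have errors within 2.7% for area and 3.8% for delay.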

ディジタル信号処理向けプロセッサコアの面積/遅延見積り手法 (デザインガイア'99--VLSIの設計/検証/テスト及び一般)

片岡義治, 吉澤大, 戸川望, 柳澤政生, 大附辰夫

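電子情報通信学会技術研究報告 99 ( 479 ) 1 - 8 1999年11月

In a hardware/software cosynthesis system for digital signal processors with two types of register files, hardware/software partitioning is evaluated using estimates of the application program's execution time and of the area of the generated processor core. To obtain these estimates, estimation equations must be derived by analyzing logic synthesis results for processor core descriptions generated by the system with varying hardware units. This paper reports how the area and delay estimation equations for the processor core are derived and how they were verified. For area, we first show that the processor core area can be expressed as the sum of the areas of the processor kernel and of the hardware units added to it. We further note that the kernel area can be separated into a part that depends on the added hardware units and a part that depends on the number of general-purpose registers. The resulting area estimates are within about 2% of the areas after logic synthesis. For delay, we show that deriving an estimation equation for each functional unit on the critical path reduces the error. The resulting estimates of the clock period are within 2 ns of the clock periods after logic synthesis.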

2種類のレジスタファイルを持つディジタル信号処理向けプロセッサのハードウェア/ソフトウェア分割手法 (デザインガイア'99--VLSIの設計/検証/テスト及び一般)

桜井崇志, 戸川望, 柳澤政生, 大附辰夫

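電子情報通信学会技術研究報告 99 ( 479 ) 9 - 16 1999年11月

This paper proposes a hardware/software partitioning method for digital signal processors with two types of register files of different bit widths. The inputs are the assembly code compiled from the application program, the results of analyzing the application program with application data, and an execution-time constraint; the outputs are a processor architecture and assembly code that runs on it. The synthesized processor is a VLIW-type processor that executes multiple instructions simultaneously, and consists of a processor kernel, two types of register files, and several hardware units. The hardware units can include hardware loops, addressing units, multiple functional units, and a multiple data memory bus configuration. The register files can be of two types with different bit widths; assigning each variable in the application program to a register of appropriate width reduces the hardware cost of the register files. For functional units, several types are provided for the same operation; selecting a unit with appropriate hardware cost and delay for the application reduces the hardware cost of the functional units. We report the results of evaluating the method through computational experiments.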

制御処理ハードウェアの高位合成システムのための面積/時間最適化アルゴリズム

家長真行, 戸川望, 柳澤政生, 大附辰夫

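電子情報通信学会技術研究報告. VLD, VLSI設計技術 99 ( 317 ) 15 - 22 1999年09月

This paper proposes an area/time optimization algorithm for a high-level synthesis system targeting control-dominated hardware. The algorithm takes as input a call graph and the set of control flow graphs that make it up, and synthesizes a set of state-transition graphs representing the whole call graph under area and time constraints. It first builds state-transition graphs that satisfy only the time constraint, and then transforms them to satisfy the area constraint. Because the algorithm manipulates control flow graphs directly, it can handle control processing such as bit manipulation and conditional branches; moreover, from a single call graph representing the whole application program, it can enumerate multiple hardware candidates that satisfy the area and time constraints. We apply the algorithm to several control-dominated application programs, including Huffman coding, and evaluate its effectiveness.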

制御処理ハードウェアの高位合成システムのための面積/時間最適化アルゴリズム

家長真行, 戸川望, 柳澤政生

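電子情報通信学会技術研究報告 99 ( 317 ) 15 - 22 1999年09月

This paper proposes an area/time optimization algorithm for a high-level synthesis system targeting control-dominated hardware. The algorithm takes as input a call graph and the set of control flow graphs that make it up, and synthesizes a set of state-transition graphs representing the whole call graph under area and time constraints. It first builds state-transition graphs that satisfy only the time constraint, and then transforms them to satisfy the area constraint. Because the algorithm manipulates control flow graphs directly, it can handle control processing such as bit manipulation and conditional branches; moreover, from a single call graph representing the whole application program, it can enumerate multiple hardware candidates that satisfy the area and time constraints. We apply the algorithm to several control-dominated application programs, including Huffman coding, and evaluate its effectiveness.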

FPGAを用いた動的再構成可能システムとその応用

長谷川洋平, 戸川望, 柳澤政生, 大附辰夫

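電子情報通信学会技術研究報告. VLD, VLSI設計技術 98 ( 625 ) 17 - 24 1999年03月

An FPGA is a kind of gate array: a reconfigurable device on which the user can program digital circuits in the field. With FPGAs, one can design flexible hardware whose circuits can be rewritten repeatedly, much like software. In recent years, dynamically reconfigurable systems, which rewrite circuits on FPGAs while the system is running, have been studied. This paper proposes mFPS2, a dynamically reconfigurable system aimed at fast execution of large-scale digital signal processing applications. mFPS2 consists of four FPGAs, two of which are dynamically reconfigurable, and is connected to a host computer via a PCI bus. Building dynamic reconfiguration into the system makes it possible to run applications larger than the physical circuit capacity. As an application example, a JPEG encoder implemented on mFPS2 ran twice as fast as software processing on a computer.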

FPGAを用いた動的再構成可能システムとその応用

長谷川洋平, 戸川望, 柳澤政生, 大附辰夫

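電子情報通信学会技術研究報告. ICD, 集積回路 98 ( 626 ) 17 - 24 1999年03月

An FPGA is a kind of gate array: a reconfigurable device on which the user can program digital circuits in the field. With FPGAs, one can design flexible hardware whose circuits can be rewritten repeatedly, much like software. In recent years, dynamically reconfigurable systems, which rewrite circuits on FPGAs while the system is running, have been studied. This paper proposes mFPS2, a dynamically reconfigurable system aimed at fast execution of large-scale digital signal processing applications. mFPS2 consists of four FPGAs, two of which are dynamically reconfigurable, and is connected to a host computer via a PCI bus. Building dynamic reconfiguration into the system makes it possible to run applications larger than the physical circuit capacity. As an application example, a JPEG encoder implemented on mFPS2 ran twice as fast as software processing on a computer.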

2種類のレジスタファイルを持ったディジタル信号処理向けプロセッサのハードウエア/ソフトウエア協調合成システムとその並列化コンパイラ

中村剛, 戸川望, 柳澤政生, 大附辰夫

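情報処理学会研究報告. 設計自動化研究会報告 99 ( 12 ) 113 - 120 1999年02月

To maintain high arithmetic precision in digital signal processing, intermediate results must have a larger bit width than the input sequence. If a digital signal processor has two types of register files, it can maintain precision while implementing digital signal processing applications in a small hardware area. This paper proposes a hardware/software cosynthesis system, and its parallelizing compiler, for digital signal processors with two types of register files of different bit widths. The system takes a C description of the application program and application data as input, and outputs a hardware description of the processor, object code that runs on it, and a software environment. From the given C description, the parallelizing compiler outputs assembly code that runs on a processor having all the hardware units assumed in the target architecture. In doing so, it extracts as much of the application's parallelism as possible, aiming to minimize execution time. It also assigns variables to the two types of register files according to two data types, generating assembly code that preserves arithmetic precision. We report the results of evaluating the system through computational experiments.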

機能メモリを使用したプロセッサを対象とするハードウェア/ソフトウェア協調合成システム

寺島信, 戸川望, 柳澤政生, 大附辰夫

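電子情報通信学会技術研究報告. CPSY, コンピュータシステム 98 ( 291 ) 31 - 38 1998年09月

This paper proposes a hardware/software cosynthesis system for processors that use a CAM (a functional memory with exact-match search). The system takes as input an application program written in C that uses CAM functions, and outputs a logic-synthesizable hardware description of a CAM processor, consisting of a CAM and a microprocessor unit, together with binary code that runs on the CAM processor. With this system, hardware and software that execute application programs using the CAM's parallel processing at high speed can be designed in a short time. We synthesized the generated hardware descriptions and report the results of evaluating their performance on a computer.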

機能メモリを使用したプロセッサを対象とするハードウェア/ソフトウェア協調合成システム

寺島信, 戸川望, 柳澤政生, 大附辰夫

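情報処理学会研究報告. 設計自動化研究会報告 98 ( 87 ) 83 - 90 1998年09月

This paper proposes a hardware/software cosynthesis system for processors that use a CAM (a functional memory with exact-match search). The system takes as input an application program written in C that uses CAM functions, and outputs a logic-synthesizable hardware description of a CAM processor, consisting of a CAM and a microprocessor unit, together with binary code that runs on the CAM processor. With this system, hardware and software that execute application programs using the CAM's parallel processing at high speed can be designed in a short time. We synthesized the generated hardware descriptions and report the results of evaluating their performance on a computer.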

An FPGA Layout Reconfiguration Algorithm Based on Global Routes for Engineering Changes in System Design Specifications

TOGAWA Nozomu, HAGI Kayoko, YANAGISAWA Masao, OHTSUKI Tatsuo

IEICE transactions on fundamentals of electronics, communications and computer sciences 81 ( 5 ) 873 - 884 1998年05月

Rapid system prototyping is one of the main applications of field-programmable gate arrays (FPGAs). At the rapid prototyping stage, design specifications often change because they cannot be determined completely in advance. This paper focuses on layout design changes and proposes a layout reconfiguration algorithm for FPGAs. The target FPGA architecture was developed for transport processing. To implement a wider variety of circuits flexibly, it uses three-input lookup tables (LUTs) as its minimum logic cells. Since its logic granularity is finer than that of conventional FPGAs, it requires more routing resources to connect the cells, and minimizing routing congestion is indispensable. In layout reconfiguration, the main problem is adding LUTs to an initial layout. Our algorithm consists of two steps. Given the placement and global routing of LUTs, Step 1 places an added LUT, allowing its position to overlap that of a preplaced LUT; Step 2 then moves preplaced LUTs to adjacent positions so that the overlap is resolved. Global routes are updated as the placement is reconfigured. The algorithm keeps routing congestion small by evaluating global routes directly in both steps. In particular, in Step 2, if the minimum number of preplaced LUTs are moved to their adjacent positions, the algorithm minimizes routing congestion. Experimental results demonstrate that, if the number of added LUTs is at most 20% of the number of initial LUTs, the algorithm generates reconfigured layouts whose routing congestion is as small as that obtained by running a conventional placement and global routing algorithm. The run time of the algorithm is within approximately one second.

ツリー構造を持つ論理ブロックを対象としたテクノロジマッピング手法

荒宏視

電子情報通信学会技術研究報告 VLD97;104 1997年12月

A Performance-Oriented Simultaneous Placement and Global Routing Algorithm for Transport-Processing FPGAs

TOGAWA Nozomu, SATO Masao, OHTSUKI Tatsuo

IEICE transactions on fundamentals of electronics, communications and computer sciences 80 ( 10 ) 1795 - 1806 1997年10月

In the layout design of transport-processing FPGAs, not only must routing congestion be kept small, but the implemented circuits must also operate at a higher frequency. This paper extends our previously proposed simultaneous placement and global routing algorithm for transport-processing FPGAs, whose objective is to minimize routing congestion, and proposes a new algorithm in which the length of each critical signal path (path length) is kept within a specified upper bound (path length constraint). The algorithm is based on hierarchical bipartitioning of layout regions and of the sets of LUTs (lookup tables) to be placed. In each bipartitioning, the algorithm first searches for the paths with tighter path length constraints by estimating their path lengths. It then carries out the bipartitioning so that the path lengths of critical paths are reduced. The algorithm is applied to transport-processing circuits and compared with conventional approaches. The results demonstrate that it satisfies the path length constraints for 11 out of 13 circuits, though it increases routing congestion by an average of 20%. After detailed routing, it achieves 100% routing for all the circuits and decreases circuit delay by an average of 23%.

A Circuit Partitioning Algorithm with Path Delay Constraints for Multi-FPGA Systems

TOGAWA Nozomu, SATO Masao, OHTSUKI Tatsuo

IEICE transactions on fundamentals of electronics, communications and computer sciences 80 ( 3 ) 494 - 505 1997年03月

In this paper, we extend the circuit partitioning algorithm that we previously proposed for multi-FPGA systems and present a new algorithm in which the delay of each critical signal path is kept within a specified upper bound. The core of the algorithm is recursive bipartitioning of a circuit. The bipartitioning procedure consists of three stages: 0) detection of critical paths; 1) bipartitioning of a set of primary inputs and outputs; and 2) bipartitioning of a set of logic blocks. In 0), the algorithm computes lower bounds on the delays of paths with path delay constraints and, in every bipartitioning procedure, dynamically detects the critical paths based on the difference between the lower and upper bounds. The delays of the critical paths are reduced with higher priority. In 1), the algorithm attempts to assign the primary inputs and outputs on each critical path to one chip so that the critical path does not cross between chips. Finally, in 2), the algorithm not only decreases the number of crossings between chips but also assigns the logic blocks on each critical path to one chip by exploiting a network flow technique. The algorithm has been implemented and applied to the MCNC PARTITIONING 93 benchmark circuits. The experimental results demonstrate that it resolves almost all path delay constraints while keeping the maximum number of required I/O blocks per chip small compared with conventional algorithms.

CiNii
A fast scheduling algorithm in high-level synthesis system for digital signal processing

TOGAWA N.

Proc. IPSJ DA Symposium '97 167 - 172 1997年

CiNii
Simultaneous Placement and Global Routing for Transport-Processing FPGA Layout

TOGAWA Nozomu, SATO Masao, OHTSUKI Tatsuo

IEICE transactions on fundamentals of electronics, communications and computer sciences 79 ( 12 ) 2140 - 2150 1996年12月

Transport-processing FPGAs have been proposed for flexible telecommunication systems. Because these FPGAs implement circuits with finer-grained logic functions, the amount of routing resources required tends to increase. To keep routing congestion small, placement and routing need to be executed simultaneously. This paper proposes a simultaneous placement and global routing algorithm for transport-processing FPGAs whose primary objective is to minimize routing congestion. The algorithm is based on hierarchical bipartitioning of layout regions and of the sets of LUTs (look-up tables) to be placed. It obtains a bipartition that leads to small routing congestion by applying a network flow technique and computing a maximum flow and a minimum cut. If connections exist between the bipartitioned LUT sets, pairs of pseudo-terminals are introduced to preserve them. A sequence of pseudo-terminals represents the global route of each net. As a result, both the placement of LUTs and the global routing are determined when the hierarchical bipartitioning procedure finishes. The proposed algorithm has been implemented and applied to practical transport-processing circuits. The experimental results show that it decreases routing congestion by an average of 37% compared with a conventional algorithm and achieves 100% routing for circuits on which the conventional algorithm leaves nets unrouted.

CiNii
プリント配線板を対象とした二層均等化スペーシング手法 (<特集> レイアウトと一般)

金井宏和, 戸川望, 佐藤政生, 大附辰夫

情報処理学会研究報告. 設計自動化研究会報告 96 ( 51 ) 9 - 14 1996年05月

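In printed wiring board design, components are placed first and then the connections between them are routed. Consequently, it is not always possible to route all connections between the placed components. Spacing is the process of moving placed components, prior to routing, so that routing becomes feasible. This paper proposes a spacing method for two-layer printed wiring boards using surface mount technology, in which components are placed and routed on both sides of the board. The method repositions components so that each routing region receives the clearance required by the number of wires passing between components. When the placed components violate design rules, for example by overlapping, the method resolves the violations by securing the required clearance in the routing regions while removing the overlaps. The method was implemented on a computer and its effectiveness was confirmed.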

CiNii
パス遅延制約を考慮したマルチFPGA用回路分割手法

電子情報通信学会第9回回路とシステム軽井沢ワークショップ論文集 1996年04月
条件分岐構造を持つコントロールデータフローグラフの時間制約スケジューリング手法

石渡宏明, 戸川望, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 95 ( 561 ) 31 - 36 1996年03月

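In high-level synthesis of LSIs, a behavioral description that contains conditional branches requires a scheduling method that takes those branches into account. When scheduling a control data flow graph (CDFG) with conditional branches, functional units can be shared not only between operations executed at different times but also between operations with different execution conditions. This paper proposes a time-constrained scheduling method that exploits the mutual exclusiveness of operations, a property specific to scheduling CDFGs with conditional branches. The method first searches for sets of operations that have different execution conditions and can share resources, assigns them to the same control step, and then schedules the remaining operations. Computational experiments confirm that the method obtains nearly optimal solutions in practical time.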

CiNii
イタレーション間データ依存制約を考慮したパイプライン化DSPスケジューリング手法

西田浩一, 戸川望, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 95 ( 561 ) 37 - 44 1996年03月

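In high-level synthesis of LSIs for DSP, scheduling of data flow graphs plays a central role. In many DSP applications, such as digital filters, processing centers on delaying signals with delay elements and operating on them with functional units, so scheduling must satisfy inter-iteration data dependency constraints. For DSP that requires high-speed processing, realizing pipelined data paths is also important for improving throughput. This paper proposes a time-constrained scheduling method that handles multicycle and pipelined functional units, satisfies inter-iteration data dependency constraints, and supports the synthesis of pipelined data paths. The method first enumerates the candidate control steps to which each operation in the data flow graph can be assigned, and then gradually reduces the candidates to obtain the final schedule. Computational experiments show that the method obtains nearly optimal solutions for practical examples in less than one second.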
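
The first step described above, enumerating the control steps to which each operation could be assigned, can be illustrated with the standard ASAP/ALAP computation. This is a minimal sketch, not the paper's method: the function name `candidate_steps`, the input format, and the tiny example graph are assumptions made for illustration, and inter-iteration dependencies and pipelined units are not modeled.

```python
def candidate_steps(ops, deps, latency, num_steps):
    """Return {op: (asap, alap)}: earliest and latest start step (1-indexed).

    ops:       list of operation names (hypothetical input format)
    deps:      list of (producer, consumer) data dependencies
    latency:   {op: number of control steps the operation occupies}
    num_steps: time constraint (total number of control steps)
    """
    preds = {op: [] for op in ops}
    succs = {op: [] for op in ops}
    for u, v in deps:
        preds[v].append(u)
        succs[u].append(v)

    # Topological order (Kahn's algorithm); the list grows while we iterate.
    indeg = {op: len(preds[op]) for op in ops}
    order = [op for op in ops if indeg[op] == 0]
    for op in order:
        for s in succs[op]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)

    # ASAP: start right after the slowest predecessor finishes.
    asap = {}
    for op in order:
        asap[op] = max((asap[p] + latency[p] for p in preds[op]), default=1)

    # ALAP: finish before the earliest successor must start.
    alap = {}
    for op in reversed(order):
        limit = min((alap[s] for s in succs[op]), default=num_steps + 1)
        alap[op] = limit - latency[op]

    for op in ops:
        if asap[op] > alap[op]:
            raise ValueError(f"time constraint too tight for {op}")
    return {op: (asap[op], alap[op]) for op in ops}


# Two 2-step multiplications feed an addition, which feeds another addition.
ranges = candidate_steps(
    ops=["m1", "m2", "a1", "a2"],
    deps=[("m1", "a1"), ("m2", "a1"), ("a1", "a2")],
    latency={"m1": 2, "m2": 2, "a1": 1, "a2": 1},
    num_steps=5,
)
print(ranges)
```

A scheduler in the spirit of the abstract would then shrink each `(asap, alap)` range step by step according to a resource cost; that elimination rule is not given in the source and is left out here.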

CiNii
エントロピーCODECの高位合成手法

鈴木克青, 戸川望, 佐藤政生, 大附辰夫

情報処理学会研究報告. 設計自動化研究会報告 96 ( 16 ) 25 - 30 1996年02月

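In image coding, entropy encoding and decoding can be realized as fast and flexible hardware by combining a high-level synthesis CAD system with FPGAs. This paper proposes a scheduling and allocation method that takes a behavioral description of an entropy CODEC as input and forms the core of such a CAD system. The scheduling step takes a control flow graph (CFG) as input and merges the nodes of the CFG to obtain a result with short execution time and low hardware cost. The allocation step allows each operation to be assigned to functional units of different bit lengths. With this processing, an RT-level description can be obtained efficiently from a behavioral description of an entropy CODEC that has many conditional branches and variables of differing bit lengths. The method was implemented on a computer, and the results of evaluation experiments are reported.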

CiNii
韓国の学校教育制度 : 教員養成制度を中心に

黄義一, 林冨烈, 畑克明

島根大学教育学部紀要. 教育科学 29 77 - 91 1995年12月

CiNii
動作記述からのデータフローグラフ生成手法

川田容子, 戸川望, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 95 ( 307 ) 55 - 62 1995年10月

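This paper proposes a method for generating data flow graphs (DFGs) from a behavioral description consisting of arithmetic expressions without control structures, as the first stage of high-level synthesis for DSP. Because the DFG strongly influences the subsequent synthesis results, it is important to generate a graph that reflects design requirements such as time and area. The proposed method parallelizes operations by transforming the structure of the DFG so that a given time constraint is satisfied. From the DFGs that satisfy the time constraint, it generates several candidate DFGs together with their estimated area. Using each of the generated DFGs as input to the subsequent synthesis is expected to yield results that better satisfy the design requirements. The method was implemented on a computer, and the results of evaluation experiments are reported.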

CiNii
リソースアロケーションを考慮したデータパス・スケジューリング手法

西田浩一, 戸川望, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告. VLD, VLSI設計技術 95 ( 307 ) 63 - 70 1995年10月

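In high-level synthesis for DSP data paths, scheduling of data flow graphs plays a central role. In scheduling, it is desirable to estimate as accurately as possible the amount of hardware for the resources assigned during allocation, and to assign each operation to a control step so that this amount is minimized. In addition, pipelining, which overlaps the execution of operations, is important in high-speed DSP applications such as image processing. This paper proposes a time-constrained scheduling method that supports pipelining and aims to minimize the cost of both functional units and registers. Unlike force-directed scheduling, which assigns each operation at once to the control step that appears best, the method gradually removes, for each operation, the control step that appears worst until a final solution is obtained. Computational experiments show that the method obtains near-optimal solutions for practical examples in less than one second.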

CiNii
動作記述からのデータフローグラフ生成手法

川田容子, 戸川望, 佐藤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 1995 ( 99 ) 137 - 144 1995年10月

In this paper, we propose an algorithm for generating data flow graphs (DFGs) from a behavioral description that consists of algebraic expressions without control structures. DFG generation is the first task in high-level synthesis for DSP design. Since the results of synthesis depend heavily on the DFG structure, it is important to consider design requirements such as time and area when generating DFGs. In the proposed technique, we transform the DFG structure and parallelize operations without increasing resource cost so that the DFG satisfies a given time constraint. Among the generated DFGs that satisfy the time constraint, multiple DFGs are chosen based on their estimated resource costs. Better results can be obtained by preparing multiple DFGs and synthesizing each of them. The technique was implemented on a computer, and experimental results show that multiple DFGs with low resource costs are obtained from a practical behavioral description.

CiNii
リソースアロケーションを考慮したデータパス・スケジューリング手法

西田浩一, 戸川望, 佐藤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 1995 ( 99 ) 145 - 152 1995年10月

In high-level synthesis for DSP data paths, scheduling of data flow graphs plays a primary role. In scheduling, the amount of hardware resources should be estimated as precisely as possible, and operations should be assigned to control steps so that this amount is minimized. In addition, pipelining, which overlaps operations, is necessary in high-speed DSP applications such as image processing. In this paper, we propose a time-constrained scheduling algorithm that handles pipelining and minimizes both functional unit and register costs. Unlike force-directed scheduling, which assigns each operation at once to the control step that appears best, our algorithm gradually eliminates, for each operation in each iteration, the control step regarded as the worst in terms of resource cost. Finally, each operation is assigned to one control step. Experimental results for practical DSP data flow graphs show that our algorithm obtains near-optimal solutions in less than one second.

CiNii
マルチFPGAを対象とした階層的回路分割手法

戸川望, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告. CAS, 回路とシステム 95 ( 106 ) 69 - 76 1995年06月

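This paper proposes a circuit partitioning method that divides an input circuit across multiple FPGA chips. The method is based on recursive bipartitioning of the circuit. In each bipartitioning, a suitable partitioning position is searched for by repeatedly computing a minimum cut with a network flow technique. This partitioning keeps the number of signal lines crossing between chips as small as possible, which makes it possible to reduce the number of blocks that handle external input and output (I/O blocks). In the process, replication of the blocks that implement the logic functions of the FPGA (logic blocks) arises naturally from the viewpoint of reducing the number of signal lines. The method was implemented on a computer, and the results of evaluation experiments are reported.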
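
The basic building block mentioned above, a minimum cut obtained from a network flow computation, can be sketched as follows. This is only an illustration under simplifying assumptions: nets are modeled as weighted two-pin edges, a single source/sink pair is fixed in advance, and the textbook Edmonds-Karp algorithm is used. The function name and the example circuit are hypothetical; size balancing, repeated cuts, and logic-block replication from the paper are not modeled.

```python
from collections import deque


def min_cut_bipartition(n, edges, source, sink):
    """Split nodes 0..n-1 into two sides separated by a minimum cut.

    edges: list of (u, v, capacity) for undirected connections.
    Returns (cut_value, set_of_nodes_on_the_source_side).
    """
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += c

    def bfs_parents():
        # Breadth-first search over edges with remaining capacity.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if parent[sink] == -1:
            break  # no augmenting path left: the flow is maximum
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Nodes still reachable from the source form one side of the cut.
    source_side = {v for v in range(n) if parent[v] != -1}
    return flow, source_side


# Two tightly connected clusters {0,1,2} and {3,4,5} joined by two weak links.
edges = [(0, 1, 3), (1, 2, 3), (0, 2, 3),
         (3, 4, 3), (4, 5, 3), (3, 5, 3),
         (2, 3, 1), (1, 4, 1)]
cut_value, side = min_cut_bipartition(6, edges, source=0, sink=5)
print(cut_value, sorted(side))
```

Here the cut value corresponds to the number of signal lines that would cross between the two chips if each side were assigned to one chip.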

CiNii
A circuit partitioning algorithm with replication capability for multi-FPGA systems

TOGAWA N.

IEICE Trans. Fundamentals 78 ( 12 ) 1765 - 1776 1995年

CiNii
ロングラインに対応した階層的FPGA配線手法

戸川望, 佐藤政生, 大附辰夫

情報処理学会論文誌 35 ( 12 ) 2785 - 2796 1994年12月

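FPGAs (field-programmable gate arrays) are programmable devices with relatively high integration density and have become important particularly for rapid prototyping of systems. An FPGA provides wiring segments suited to different purposes, such as local lines and long lines. FPGA design therefore needs flexible methods that use these wiring segments effectively. In addition, because an FPGA is programmed through switch elements, signal delay tends to become large, so design methods capable of delay control are important. This paper addresses routing within FPGA design and proposes a hierarchical FPGA routing method that supports flexible routing structures, long lines in particular, and realizes delay control. The method is based on recursively bipartitioning the region and determining the positions at which nets cross each cut line by linear assignment. Linear assignment based on appropriate costs is applied in two phases to determine the wiring segments on the cut line through which each net passes. The assignment attaches an allowable routing delay to each net and aims to route within that bound, which makes delay control possible. The method is applied to several benchmark circuits to show its effectiveness.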

CiNii
Maple : LUT-FPGAを対象としたテクノロジーマッピング・配置・概略配線同時処理手法

佐藤政生, 戸川望, 大附辰夫

電子情報通信学会技術研究報告. CPSY, コンピュータシステム 94 ( 257 ) 41 - 48 1994年09月

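This paper takes the view that the purpose of technology mapping for FPGAs lies in a design that extends through placement and routing, and proposes Maple, a method that performs technology mapping, placement, and global routing simultaneously. Maple extends a hierarchical simultaneous placement and global routing method for FPGAs and is based on recursive partitioning of the layout region and of the set of blocks. Maple inherits the basic processing of that method and also performs technology mapping within each recursive step. The mapping can therefore reflect the placement and global routing results at each level of the hierarchy. In particular, it produces mappings that are effective in reducing the number of tracks required per channel for routing. Technology mapping uses maximum flow in a network, and together with the placement and global routing processing the whole procedure runs fast. Maple was implemented on a computer and its effectiveness is shown.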

CiNii
節点がレベル付けされたグラフの最小枝交差描画問題に関する一考察

松本英幸, 佐々木整, 梅田達也, 戸川望, 佐藤政生, 竹谷誠, 大附辰夫

情報処理学会研究報告アルゴリズム（AL） 1994 ( 82 ) 49 - 56 1994年09月

A graph is called a multi-level graph when each of its nodes is assigned a hierarchical level. This paper discusses the problem of determining the order of nodes on each level so as to minimize the number of edge crossings when a multi-level graph is drawn in the plane, where each edge is drawn as a straight line segment. The problem is known to include NP-complete problems, and previous research has concentrated on approximate methods. The first part of this paper summarizes previous work and presents several theorems. The problem is then formulated as a 0-1 integer linear program and solved optimally using binary decision diagrams (BDDs). Finally, the proposed method is implemented on a computer and experimental results are reported.

CiNii
FPGA 用のレイアウト手法

佐藤政生, 戸川望

情報処理 35 ( 6 ) 535 - 540 1994年06月

CiNii
パス長制約を考慮したFPGA配置概略配線同時処理手法

戸川望, 佐藤政生, 大附辰夫

情報処理学会論文誌 35 ( 5 ) 1994年

J-GLOBAL
ロングラインに対応した階層的FPGA配線手法

戸川望, 佐藤政生, 大附辰夫

情報処理学会論文誌 35 ( 12 ) 1994年

J-GLOBAL
BDDを用いたマンハッタン配線問題の解法

梅田達也, 戸川望, 佐藤政生, 大附辰夫

情報処理学会研究報告アルゴリズム（AL） 1993 ( 88 ) 35 - 42 1993年10月

Finding the maximum set of nets that can be connected by Manhattan wires without crossing on a plane is NP-hard. This problem can be regarded as the problem of finding a maximum independent set (MIS) in graph theory. This paper presents a method for finding an MIS using binary decision diagrams (BDDs). Because the variable ordering of a BDD is very important in this method, we consider how to determine variable orderings suited to Manhattan wiring. We also discuss the properties of problems for which our method is effective.

CiNii
FPGAを対象とした階層的概略詳細配線手法

戸川望, 粟島亨, 金子一哉, 佐藤政生, 大附辰夫

電子情報通信学会論文誌. A, 基礎・境界 76 ( 9 ) 1312 - 1321 1993年09月

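FPGAs are attracting attention as devices that fill the gap between gate arrays and PLAs. Because FPGAs are user-programmable and allow a desired circuit to be designed in a short time, they are important particularly in fields such as system prototyping. This means that FPGA design methods are required above all to be fast. In addition, because an FPGA is programmed through memory elements or switches, signal delay tends to become large, so FPGA design methods must also pay attention to delay control. This paper addresses routing within FPGA design and proposes a hierarchical global and detailed routing method that is fast and realizes delay control. The hierarchical routing method is based on a fast procedure that recursively bipartitions the region and determines the positions at which nets cross each cut line by linear assignment. Two-phase linear assignment also determines the track through which each net crossing the cut line passes, so that global routing and detailed routing are handled in a single pass, making the processing faster still. Delay control is realized by attaching priorities to nets and routing high-priority nets short in preference to others. The method is applied to several benchmark circuits to show its effectiveness.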
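
The linear assignment step mentioned above, choosing where each net crosses a cut line, can be illustrated with a toy example. This is a sketch under assumptions, not the paper's formulation: the cost (priority times distance from a preferred crossing position) and all names are hypothetical, and the assignment is solved by brute force over permutations, which is only workable for a handful of nets. A practical router would use a polynomial-time assignment algorithm instead.

```python
from itertools import permutations


def assign_crossings(nets, slots):
    """Assign each net a distinct crossing slot on the cut line.

    nets:  {name: (preferred_position, priority)}  (hypothetical cost model)
    slots: list of available crossing positions on the cut line
    Returns (total_cost, {name: slot}) for the cheapest assignment.
    """
    names = sorted(nets)
    best_cost, best = None, None
    for chosen in permutations(slots, len(names)):
        # High-priority nets pay more for being moved away from their
        # preferred position, so they tend to get the shortest routes.
        cost = sum(nets[n][1] * abs(nets[n][0] - s)
                   for n, s in zip(names, chosen))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, dict(zip(names, chosen))
    return best_cost, best


# Nets "a" and "b" both prefer position 2; "a" has the higher priority.
nets = {"a": (2, 3), "b": (2, 1), "c": (5, 1)}
cost, assignment = assign_crossings(nets, slots=[2, 3, 5])
print(cost, assignment)
```

In this example the high-priority net keeps its preferred position and the low-priority net is displaced by one slot, which mirrors the priority-based delay control described in the abstract.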

CiNii
配置概略配線同時処理手法

戸川望, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告. ED, 電子デバイス 93 ( 216 ) 53 - 60 1993年09月

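This paper proposes a layout design method for FPGAs that integrates placement and global routing. The method is based on fast hierarchical processing by recursive bipartitioning of the region. By drawing each cut line perpendicular to the routing direction, a global route can be specified by its crossing positions on the cut lines. Virtual blocks are introduced on the cut lines, and their sequence represents a global route. Virtual blocks are equivalent to the logic blocks to be placed, so both kinds of blocks can be processed together. This amounts to simultaneous placement and global routing. The method was implemented on a computer and its effectiveness is shown.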

CiNii
ロングラインに対応した階層的FPGA配線手法

曽根原理仁, 戸川望, 柳澤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 1993 ( 22 ) 17 - 24 1993年03月

Field-programmable gate arrays (FPGAs) have been attracting attention. They provide several kinds of wiring segments for connections, such as local lines and long lines, so wiring design should use these segments as efficiently as possible. This paper presents a top-down hierarchical FPGA routing algorithm that supports long lines. It is based on fast top-down bipartitioning and linear assignment. The algorithm assigns the nets crossing each bipartitioning line (cut line) to wiring segments, including long lines, by two-phase linear assignment. It realizes delay control by attaching upper bounds on delay to specific terminal pairs of nets and routing so that these bounds are not exceeded, with the aim of minimizing critical path delay. Experimental results show delay improvements of up to about 30% compared with a conventional method.

CiNii
遅延を考慮したFPGA配線手法

戸川望, 粟島亨, 金子一哉, 佐藤政生, 大附辰夫

電子情報通信学会技術研究報告 92 ( 485(VLD92 91-102) ) 1993年

J-GLOBAL
タイミング制約を考慮したFPGA配置概略配線同時処理手法

戸川望

DAシンポジウム'93論文集 137 - 142 1993年

CiNii
FPGAを対象としたトップダウン配線手法の実装と評価

粟島亨, 戸川望, 金子一哉, 佐藤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 1992 ( 43 ) 223 - 228 1992年05月

A top-down routing algorithm for field-programmable gate arrays (FPGAs) is proposed. The algorithm extends a top-down global routing algorithm for sea-of-gates and similar devices, which is based on fast top-down bipartitioning of the routing region. During each partitioning, the algorithm uses two-phase linear assignment to determine not only the global position but also the detailed position, that is, the track, of each net crossing the cut line. A separate detailed routing phase is therefore no longer necessary. Experimental results show that the algorithm is approximately two times faster than the CGE algorithm, which is based on a traditional two-phase approach (the Japanese abstract gives the speed-up as 40% to 60%), while the routing completion rates are comparable.

CiNii
配置問題に対する分枝限定法の階層的適用とその評価

粟島亨, 金子一哉, 戸川望, 佐藤政生, 大附辰夫

情報処理学会研究報告システムLSI設計技術（SLDM） 1992 ( 43 ) 179 - 184 1992年05月

This report describes a hierarchical placement method based on branch-and-bound and its evaluation by computational experiments. The method makes a large module placement problem hierarchical by clustering the modules and applies branch-and-bound placement at each hierarchical level. We first examine how to set the upper limits on the number of clusters and the cluster size used when building the hierarchy. Iterative improvement by multi-partitioning is then applied as the final step at each hierarchical level. In the experiments, with both the number of clusters and the cluster size limited to at most 12 and the problem divided into two hierarchical levels, a suboptimal solution of a problem with about 80 modules was obtained in a relatively short time.

CiNii
遅延を考慮したFPGAレイアウトシステム

戸川望, 粟島亨, 金子一哉, 大附辰夫

電子情報通信学会大会講演論文集 1992 ( Shuki Pt 1 ) 1992年

J-GLOBAL

産業財産権

計算方法、計算システム、及びプログラム

特許第7829925号

戸川望, 白井達彦

特許権

J-GLOBAL
計算方法、計算システム、及びプログラム

特許第7795203号

戸川望, 多和田雅師, 跡部悠太

特許権

J-GLOBAL
組合せ最適化方法、組合せ最適化装置、及びプログラム

大野乾太郎, 白井達彦, 戸川望

特許権

J-GLOBAL
情報処理システム、情報処理方法、及びプログラム

古賀純隆, 武笠陽介, 戸川望

特許権

J-GLOBAL
アニーリング処理装置、アニーリング処理方法及びプログラム

川村一志, 神保聡, 戸川望, 白井達彦

特許権

J-GLOBAL
量子計算方法、量子計算システム、及びプログラム

戸川望, 白井達彦

特許権

J-GLOBAL
エネルギー関数の最小値探索装置、エネルギー関数の最小値探索方法、及びプログラム。

大野乾太郎, 巴徳瑪, 八木哲志, 寺本純司, 川上蒼馬, 戸川望

特許権

J-GLOBAL
組合せ最適化装置、組合せ最適化方法、およびプログラム

特許第7632848号

巴徳瑪, 新井淳也, 八木哲志, 寺本純司, 川上蒼馬, 武笠陽介, 鮑思雅, 戸川望

特許権

J-GLOBAL
計算方法、計算装置、及びプログラム

特許第7578277号

戸川望, 多和田雅師, 於久太祐, 田中宗

特許権

J-GLOBAL
学習装置、学習方法及び学習プログラム

特許第7576790号

披田野清良, 清本晋作, 野澤康平, 戸川望

特許権

J-GLOBAL
検知装置、学習装置、検知方法及び検知プログラム

特許第7565561号

長谷川健人, 披田野清良, 清本晋作, 戸川望

特許権

J-GLOBAL
計算方法、計算システム、及びプログラム

戸川望, 白井達彦

特許権

J-GLOBAL
エネルギー関数の最小値探索装置、エネルギー関数の最小値探索方法、及びプログラム

大野乾太郎, 寺本純司, 八木哲志, 巴徳瑪, 川上蒼馬, 戸川望

特許権

J-GLOBAL
計算方法、計算システム、及びプログラム

戸川望, 多和田雅師, 跡部悠太

特許権

J-GLOBAL
計算方法、計算システム、及びプログラム

戸川望, 白井達彦

特許権

J-GLOBAL
ハードウエアトロイ検出方法、ハードウエアトロイ検出装置及びハードウエアトロイ検出用プログラム

特許第7410476号

永田真一, 高橋功次, 戸川望, 大屋優

特許権

J-GLOBAL
組合せ最適化装置、組合せ最適化方法、およびプログラム

巴徳瑪, 新井淳也, 八木哲志, 寺本純司, 川上蒼馬, 武笠陽介, 鮑思雅, 戸川望

特許権

J-GLOBAL
処理装置、処理方法及び処理プログラム

特許第7285516号

巴徳瑪, 内山寛之, 八木哲志, 新井淳也, 吉村夏一, 多和田雅師, 田中宗, 戸川望

特許権

J-GLOBAL
学習装置、学習方法及び学習プログラム

特許第7223372号

披田野清良, 清本晋作, 長谷川健人, 戸川望

特許権

J-GLOBAL
検知装置、学習装置、検知方法及び検知プログラム

長谷川健人, 披田野清良, 清本晋作, 戸川望

特許権

J-GLOBAL
検出方法及び検出装置

特許第7136439号

戸川望, 長谷川健人

特許権

J-GLOBAL
計算方法、計算装置、及びプログラム

戸川望, 多和田雅師, 於久太祐, 田中宗

特許権

J-GLOBAL
学習装置、学習方法及び学習プログラム

披田野清良, 清本晋作, 野澤康平, 戸川望

特許権

J-GLOBAL
処理装置、処理方法及び処理プログラム

新井淳也, 巴徳瑪, 八木哲志, 吉村夏一, 多和田雅師, 戸川望

特許権

J-GLOBAL
ハードウエアトロイ検出方法、ハードウエアトロイ検出装置及びハードウエアトロイ検出用プログラム

永田真一, 高橋功次, 戸川望, 大屋優

特許権

J-GLOBAL
測定装置、ナビゲーションシステム、測定方法及びプログラム

特許第6867254号

戸川望, 矢野椋也, 石川和明

特許権

J-GLOBAL
学習装置、学習方法及び学習プログラム

披田野清良, 清本晋作, 長谷川健人, 戸川望

特許権

J-GLOBAL
処理装置、処理方法及び処理プログラム

巴徳瑪, 内山寛之, 八木哲志, 新井淳也, 吉村夏一, 多和田雅師, 田中宗, 戸川望

特許権

J-GLOBAL
検出方法及び検出装置

戸川望, 長谷川健人

特許権

J-GLOBAL
ハードウェアトロイの検出方法、ハードウェアトロイの検出プログラム、およびハードウェアトロイの検出装置

特許第6566576号

戸川望, 大屋優

特許権

J-GLOBAL
測定装置、ナビゲーションシステム、測定方法及びプログラム

戸川望, 矢野椋也, 石川和明

特許権

J-GLOBAL
辞書検索方法、装置、およびプログラム

右近祐太, 宮崎昭彦, 島▲崎▼ 健太, 多和田雅師, 津田俊隆, 中里秀則, 戸川望

特許権

J-GLOBAL
集積回路の正常化方法、正常化回路、及び集積回路

戸川望, 大屋優

特許権

J-GLOBAL
辞書検索方法および装置

青木孝, 羽田野孝裕, 大塚卓哉, 宮崎昭彦, 島▲崎▼ 健太, 戸川望, 朴容震, 津田俊隆

特許権

J-GLOBAL
ハッシュ関数計算装置および方法

青木孝, 宮崎昭彦, 羽田野孝裕, 戸川望, 島崎健太, 津田俊隆, 朴容震

特許権

J-GLOBAL
ハッシュ関数計算装置および方法

青木孝, 宮崎昭彦, 羽田野孝裕, 戸川望, 島崎健太, 津田俊隆, 朴容震

特許権

J-GLOBAL
ハードウェアトロイの検出方法、ハードウェアトロイの検出プログラム、およびハードウェアトロイの検出装置

戸川望, 大屋優

特許権

J-GLOBAL
ハードウェアトロイの検出方法、ハードウェアトロイの検出プログラム、およびハードウェアトロイの検出装置

戸川望, 大屋優

特許権

J-GLOBAL
資源再配置装置、資源再配置方法およびプログラム

青木孝, 右近祐太, 関原悠介, 戸川望

特許権

J-GLOBAL
画像処理システムの構成装置および構成方法

特許第5697102号

小野澤晃, 青木孝, 戸川望, 李昇周

特許権

J-GLOBAL
計算機システム、ルータ装置、パケット転送方法およびプログラム

関原悠介, 青木孝, 戸川望, 李昇周

特許権

J-GLOBAL
信号処理装置および信号処理方法

史又華, 戸川望, 柳澤政生, 五十嵐博昭

特許権

J-GLOBAL
信号処理装置および信号処理方法

史又華, 戸川望, 柳澤政生, 五十嵐博昭

特許権

J-GLOBAL
故障攻撃検出回路および暗号処理装置

戸川望, 五十嵐博昭, 史又華

特許権

J-GLOBAL
計算システム、処理装置、及び計算システムにおける内部負荷分散方法

小野澤晃, 青木孝, 戸川望, 李昇周

特許権

J-GLOBAL
画像処理システムの構成装置および構成方法

小野澤晃, 青木孝, 戸川望, 李昇周

特許権

J-GLOBAL
重み付き加算演算器および加算演算方法

外村元伸, 戸川望, 原智昭, 大附辰夫

特許権

J-GLOBAL
LDPC符号検出装置及びLDPC符号検出方法

特許第4519694号

清水一範, 戸川望, 池永剛, 後藤敏

特許権

J-GLOBAL
複素数の積和演算装置および積和演算方法

名村健, 戸川望, 大附辰夫, 外村元伸

特許権

J-GLOBAL
複素数の積和演算装置および積和演算方法

名村健, 戸川望, 大附辰夫, 外村元伸

特許権

J-GLOBAL
略地図生成装置および略地図生成方法

二宮直也, 戸川望, 大附辰夫, 廣津亜弥子

特許権

J-GLOBAL
位置特定システム及び位置特定方法

大附辰夫, 戸川望, 中口智史

特許権

J-GLOBAL
LDPC符号検出装置及びLDPC符号検出方法

清水一範, 戸川望, 池永剛, 後藤敏

特許権

J-GLOBAL
位置情報提供システム及び位置情報提供方法

戸川望, 馬場孝明, 大附辰夫

特許権

J-GLOBAL
通信装置及び通信方法

戸川望, 清水一範, 大附辰夫, 池永剛, 後藤敏

特許権

J-GLOBAL

現在担当している科目

Master's Thesis (Department of Computer Science and Communications Engineering)

大学院基幹理工学研究科

2026年通年
修士論文（情報・通信）

大学院基幹理工学研究科

2026年通年
IoTシステム設計

大学院基幹理工学研究科

2026年春学期
情報システム設計研究

大学院基幹理工学研究科

2026年通年
Digital System Design

大学院基幹理工学研究科

2026年冬クォーター
Seminar on Information System Design A

大学院基幹理工学研究科

2026年春学期
Seminar on Information System Design D

大学院基幹理工学研究科

2026年秋学期
Seminar on Information System Design C

大学院基幹理工学研究科

2026年春学期
Seminar on Information System Design B

大学院基幹理工学研究科

2026年秋学期
Special Laboratory B in Computer Science and Communications Engineering

大学院基幹理工学研究科

2026年秋学期
Special Laboratory A in Computer Science and Communications Engineering

大学院基幹理工学研究科

2026年春学期
Digital System Design

大学院基幹理工学研究科

2026年冬クォーター
Research on Information System Design

大学院基幹理工学研究科

2026年通年
情報システム設計演習D

大学院基幹理工学研究科

2026年秋学期
情報システム設計演習C

大学院基幹理工学研究科

2026年春学期
情報システム設計演習B

大学院基幹理工学研究科

2026年秋学期
情報システム設計演習A

大学院基幹理工学研究科

2026年春学期
情報理工・情報通信特別実験A

大学院基幹理工学研究科

2026年春学期
情報理工・情報通信特別実験B

大学院基幹理工学研究科

2026年秋学期
ディジタルシステム設計

大学院基幹理工学研究科

2026年冬クォーター
IoTシステム設計

大学院創造理工学研究科

2026年春学期
情報理工・情報通信特別演習Ｂ

大学院基幹理工学研究科

2026年秋学期
情報理工・情報通信特別演習Ａ

大学院基幹理工学研究科

2026年春学期
情報システム設計研究

大学院基幹理工学研究科

2026年通年
IoTシステム設計

大学院先進理工学研究科

2026年春学期
体育各部２年目　（航空部）

グローバル・エデュケーション・センター

2026年通年
体育各部１年目　（航空部）

グローバル・エデュケーション・センター

2026年通年
情報通信基礎　【前年度成績S評価者用】

基幹理工学部

2026年春学期
情報通信基礎

基幹理工学部

2026年春学期
卒業論文Ａ（秋学期）

基幹理工学部

2026年秋学期
卒業論文Ａ

基幹理工学部

2026年春学期
卒業論文Ａ（秋学期）　18前再

基幹理工学部

2026年秋学期
卒業論文Ａ　18前再

基幹理工学部

2026年春学期
卒業論文Ａ　18前再　【前年度成績S評価者用】

基幹理工学部

2026年春学期
情報理工学実験Ｂ【前年度成績S評価者用】

基幹理工学部

2026年春学期
卒業論文Ｂ　18前再　【前年度成績S評価者用】

基幹理工学部

2026年秋学期
卒業論文Ｂ（春学期）　18前再

基幹理工学部

2026年春学期
卒業論文Ｂ　18前再

基幹理工学部

2026年秋学期
情報理工学実験Ｂ

基幹理工学部

2026年春学期
情報理工学実験Ａ　【前年度成績S評価者用】

基幹理工学部

2026年秋学期
IoTシステム設計

基幹理工学部

2026年春学期
IoTシステム設計

基幹理工学部

2026年春学期
ディジタルシステム設計

基幹理工学部

2026年冬クォーター
情報理工学実験Ａ

基幹理工学部

2026年秋学期
プロジェクト研究Ａ

基幹理工学部

2026年春学期
卒業論文Ｂ

基幹理工学部

2026年秋学期
プロジェクト研究Ｂ

基幹理工学部

2026年秋学期
卒業論文Ａ　（集中）

基幹理工学部

2026年集中（春・秋学期）
電子回路　　【前年度成績S評価者用】

基幹理工学部

2026年秋学期
電子回路

基幹理工学部

2026年秋学期
卒業論文Ｂ（春学期）

基幹理工学部

2026年春学期
卒業論文Ｂ　18前再

基幹理工学部

2026年秋学期
卒業論文Ａ

基幹理工学部

2026年春学期
卒業論文Ａ　18前再　【前年度成績S評価者用】

基幹理工学部

2026年春学期
卒業論文Ａ（秋学期）　18前再

基幹理工学部

2026年秋学期
情報通信実験Ｂ【前年度成績S評価者用】

基幹理工学部

2026年春学期
卒業論文Ａ　18前再

基幹理工学部

2026年春学期
情報通信実験Ａ　【前年度成績S評価者用】

基幹理工学部

2026年秋学期
情報通信実験Ａ

基幹理工学部

2026年秋学期
電子回路　【前年度成績S評価者用】

基幹理工学部

2026年秋学期
電子回路

基幹理工学部

2026年秋学期
情報通信実験Ｂ

基幹理工学部

2026年春学期
ディジタルシステム設計

基幹理工学部

2026年冬クォーター
プロジェクト研究Ｂ

基幹理工学部

2026年秋学期
IoTシステム設計

基幹理工学部

2026年春学期
卒業論文Ｂ

基幹理工学部

2026年秋学期
プロジェクト研究Ａ

基幹理工学部

2026年春学期
卒業論文Ａ（秋学期）

基幹理工学部

2026年秋学期
卒業論文Ｂ（春学期）

基幹理工学部

2026年春学期
卒業論文Ａ　（集中）

基幹理工学部

2026年集中（春・秋学期）
卒業論文Ｂ　18前再　【前年度成績S評価者用】

基幹理工学部

2026年秋学期
卒業論文Ｂ（春学期）　18前再

基幹理工学部

2026年春学期
Graduation Thesis B (Fall) [S Grade]

基幹理工学部

2026年秋学期
Graduation Thesis A　(Fall)【For students enrolled before 2022】

基幹理工学部

2026年秋学期
Graduation Thesis B (Fall)

基幹理工学部

2026年秋学期
Graduation Thesis B (Spring) [S Grade]

基幹理工学部

2026年春学期
Project Research Spring

基幹理工学部

2026年春学期
Project Research Fall

基幹理工学部

2026年秋学期
Digital System Design

基幹理工学部

2026年冬クォーター
Introduction to Computers and Networks

基幹理工学部

2026年春学期
Computer Science and Communications Engineering Laboratory B

基幹理工学部

2026年春学期
Graduation Thesis A (Spring) [S Grade]

基幹理工学部

2026年春学期
Graduation Thesis A　(Fall)[S Grade]【For students enrolled before 2022】

基幹理工学部

2026年秋学期
Graduation Thesis A (Fall)

基幹理工学部

2026年秋学期
Graduation Thesis A (Spring)

基幹理工学部

2026年春学期
Graduation Thesis A (Fall) [S Grade]

基幹理工学部

2026年秋学期
Graduation Thesis A　(Spring)[S Grade]【For students enrolled before 2022】

基幹理工学部

2026年春学期
Computer Science and Communications Engineering Laboratory A

基幹理工学部

2026年秋学期
Graduation Thesis A　(Spring)【For students enrolled before 2022】

基幹理工学部

2026年春学期
Graduation Thesis B (Spring)

基幹理工学部

2026年春学期
Computer Science and Communications Engineering Laboratory A [S Grade]

基幹理工学部

2026年秋学期

他学部・他研究科等兼任情報

附属機関・学校グローバル・エデュケーション・センター
理工学術院大学院基幹理工学研究科

学内研究所・附属機関兼任歴

2025年

-

2026年

データ科学センター兼任センター員
2024年

-

2026年

理工学術院総合研究所兼任研究員
2024年

-

2026年

各務記念材料技術研究所兼任研究員
2024年

-

2026年

カーボンニュートラル社会研究教育センター兼任センター員
2024年

-

2026年

リサーチ・イノベーション・センター　オープンイノベーション推進セクション兼任センター員
2023年

-

2025年

次世代コンピューティング基盤研究所プロジェクト研究所所長

特定課題制度（学内資金）

量子計算機のための組合せ最適化共通アルゴリズム基盤の研究開発

2025年

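Quantum computers are expected to provide effective computing systems for particular fields and application programs compared with conventional classical computing systems. However, they have two problems: (1) the number of bits available for computation is small compared with the number required by application programs, so practical-scale application programs cannot be run directly on a quantum computer; and (2) quantum computers are computing engines still under development, and the desired results are not always obtained because of quantum states, noise, and similar effects. It is essential to cover these problems from the software side while drawing out as much of the computing power of quantum computers as possible. Focusing on combinatorial optimization problems and with practical problems in mind, this research aimed to build an algorithmic foundation for a variety of combinatorial optimization tasks targeting quantum annealing machines, NISQ devices, and fault-tolerant quantum computers, including early-stage ones. We worked on methods for simplifying problems, methods for extracting subproblems from the original problem, and the combination of these techniques to solve real problems. In particular, together with several research and development projects, we evaluated the effectiveness of the foundational techniques studied above and examined their applicability to real problems.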
多様な量子計算機に適合する組合せ最適化アルゴリズムの研究開発

2024年

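Quantum computers can deliver extremely fast computation and highly accurate solutions for particular fields and application programs compared with conventional classical computing systems. However, they have two problems: (1) the number of bits available for computation is small compared with the number required by application programs, so practical-scale application programs cannot be run directly on a quantum computer; and (2) quantum computers are computing engines still under development, and the desired results are not always obtained because of quantum states, noise, and similar effects. It is essential to cover these problems from the software side while drawing out as much of the computing power of quantum computers as possible. This research therefore focuses on combinatorial optimization problems and, with practical problems in mind, builds an algorithmic foundation for a variety of combinatorial optimization tasks targeting quantum annealing machines, NISQ devices, and fault-tolerant quantum computers. We aimed to maximize the performance of each machine while organizing algorithms into those common to quantum computers and those specific to individual quantum computers. First, taking the itinerary optimization problem as a motif, we studied an approach that extracts subproblems from the original problem and optimizes each subproblem. We then proposed several correction steps applied after the subproblem solutions are mapped back to the original problem, and carried out theoretical justification and experimental evaluation. Solving itinerary optimization problems for several regions on a real quantum computer confirmed the effectiveness of the proposed approach. We also studied an approach that uses quantum states to restrict the search space of an optimization problem and obtain feasible solutions. We constructed the quantum circuits and evaluated them on a quantum computer simulator, which confirmed the effectiveness of the method.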
機械学習を用いた自動プログラミングによる量子アニーリング計算機の実応用

2023年

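The Cabinet Office's Vision of Quantum Future Society (published in April 2022) sets out to bring quantum technology into the whole socioeconomic system, including drug discovery and healthcare, materials, finance, and manufacturing, creating growth opportunities for Japanese industry and addressing social issues. Meanwhile, the quantum annealing computer is attracting attention as one of the quantum technologies closest to industrial application, and it is expected to solve, quickly and in real time, the combinatorial optimization problems found in many industrial applications. However, solving a combinatorial optimization problem by quantum annealing requires expressing the target problem as an Ising model, and many combinatorial optimization problems are inherently impossible to express as Ising models. This research focuses on the programming of quantum annealing computers through the use of machine learning, and aims to build a mechanism that automatically approximates real problems and real physical phenomena sufficiently well by the quadratic form that underlies the Ising model. Solving the original problem on a quantum annealing computer using the resulting quadratic form would make it possible to solve, very quickly and accurately, a wide range of combinatorial optimization problems based on real problems and real physical phenomena that have been considered difficult so far. This fiscal year, we verified the validity of the approach on several example problems.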
機械学習を用いた自動プログラミングによる量子アニーリングの応用

2023年

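Among quantum technologies, the quantum annealing computer internally holds an Ising model composed of a set of spins, and it is expected to solve combinatorial optimization problems quickly by finding the ground state (minimum-energy state) of that model. While it is close to industrial application and is highlighted in the Cabinet Office's Vision of Quantum Future Society, using a quantum annealing computer requires reducing the combinatorial optimization problem to an energy function expressed as a quadratic form of binary explanatory variables (hereafter simply explanatory variables), which corresponds to programming the quantum annealing computer. Expressing real problems and real physical phenomena as a quadratic form of explanatory variables is extremely difficult or impossible. This research focuses on the programming of quantum annealing computers through machine learning, builds a mechanism that automatically approximates real problems and real physical phenomena sufficiently well by a quadratic form, and applies it to specific industrial problems to evaluate its effectiveness. In particular, we carried out research and development of foundational technology that complements the NEDO quantum AI project adopted this fiscal year and serves to evaluate the effectiveness of the proposed technology.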
機械学習と量子アニーリングによる多様な組合せ最適化問題の解法

2022年

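Industrial fields that realize Society 5.0 involve many combinatorial optimization problems, and solving them quickly and in real time is said to be the greatest difficulty. Ising machines, including quantum annealing machines, are attracting attention, and ways of solving various combinatorial optimization problems with them are being studied. However, solving a combinatorial optimization problem on an Ising machine requires expressing the target problem as an Ising model, and many combinatorial optimization problems are inherently impossible to express as Ising models. To address this issue, this research carried out a basic study on combining machine learning with quantum annealing, with the aim of solving a variety of real problems in the industrial fields that realize Society 5.0. As a result, we identified a path toward applying quantum annealing machines to a broad range of combinatorial optimization problems.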
イジング計算機による「新しい生活様式」の実現

2021年

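Industrial fields that realize Society 5.0 involve many combinatorial optimization problems, and solving them quickly and in real time is said to be the greatest difficulty. Ising machines, including quantum annealing machines, are attracting attention, and their practical use is anticipated. To realize the "new lifestyle" set out by the Ministry of Health, Labour and Welfare, this research formulated lifestyles and behavior patterns that avoid crowded places, close-contact settings, and closed spaces (the "three Cs") as combinatorial optimization problems, with the aim of solving them on Ising machines. We developed QUBO formulations and evaluated them on real Ising machines. The evaluation confirmed advantages in computation time and convergence over conventional methods.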
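
To make the idea of a QUBO formulation concrete, the following toy example spreads people over time slots so that as few pairs as possible share a slot. It is a hypothetical illustration, not the formulation used in the research: the problem, the penalty weight, and the function names are assumptions, and the QUBO is minimized by exhaustive search rather than on an Ising machine.

```python
from itertools import product


def build_qubo(num_people, num_slots, penalty):
    """Build a QUBO for x[p][t] = 1 if person p uses time slot t.

    Energy = penalty * sum_p (sum_t x[p][t] - 1)^2      (one slot per person)
           + sum_t sum_{p<q} x[p][t] * x[q][t]          (pairs sharing a slot)
    Returns (coefficients {(i, j): weight} with i <= j, constant offset).
    """
    def index(p, t):
        return p * num_slots + t

    q = {}

    def add(i, j, w):
        key = (min(i, j), max(i, j))
        q[key] = q.get(key, 0) + w

    for p in range(num_people):
        for t in range(num_slots):
            # Expanding (sum_t x - 1)^2 with x^2 = x gives -1 on the diagonal
            # and +2 on each pair of slots of the same person, plus 1.
            add(index(p, t), index(p, t), -penalty)
            for t2 in range(t + 1, num_slots):
                add(index(p, t), index(p, t2), 2 * penalty)
    for t in range(num_slots):
        for p in range(num_people):
            for p2 in range(p + 1, num_people):
                add(index(p, t), index(p2, t), 1)
    return q, penalty * num_people


def energy(q, offset, x):
    return offset + sum(w * x[i] * x[j] for (i, j), w in q.items())


def brute_force(q, offset, n):
    # Exhaustive search stands in for the Ising machine in this sketch.
    return min((energy(q, offset, x), x) for x in product((0, 1), repeat=n))


q, offset = build_qubo(num_people=4, num_slots=2, penalty=4)
best_energy, best_x = brute_force(q, offset, n=8)
print(best_energy, best_x)
```

The penalty weight must be large enough that violating the one-slot-per-person constraint never pays off; here a weight of 4 is sufficient for this small instance.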
イジング計算機による地理空間情報処理問題の高速解法

2020年

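This research took up several geospatial information processing problems, which are essential to realizing Society 5.0, and solved them quickly on Ising machines. Based on real problems, we considered a pickup-and-delivery route search problem with multiple collection points and capacity constraints on the vehicles, and a route search problem for amusement parks. We mapped these problems effectively onto Ising models and solved them on Ising machines. Comparison with existing methods confirmed the effectiveness of the Ising model mapping and of solving on Ising machines.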
イジング計算機による組合せ最適化問題の高速解法の可能性

2019年

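According to Cabinet Office materials, in the industrial fields that realize Society 5.0, such as mobility, finance, and drug discovery, the greatest difficulty is solving many combinatorial optimization problems quickly and in real time, and success is said to depend on how quickly and in real time (near-)optimal solutions can be found for hard classes of combinatorial optimization problems such as NP-hard problems. Meanwhile, Ising machines, including quantum annealing machines, are attracting attention as a decisive non-von Neumann computing technology. An Ising machine solves combinatorial optimization problems quickly by exploiting physical phenomena, and D-Wave in Canada and, in Japan, Hitachi, NTT, Fujitsu, NEC, Toshiba, and others have announced Ising machines one after another. However, most of the combinatorial optimization problems targeted by these existing Ising machines are simple problems that suit the machines, such as graph max-cut, and realistic combinatorial optimization problems have not yet been solved. Moreover, attention currently centers on the physical characteristics of Ising machines (for example, lengthening the coherence time of multiple spins), and their practical application receives little attention. In other words, speed-ups and power savings from Ising machines are still at the stage of sample evaluation, and this is the greatest problem. Against this background, this research aimed to bring a breakthrough to Ising machines by solving, quickly, the realistic combinatorial optimization problems that are essential to realizing Society 5.0. In fiscal 2019, with the aim of supplementing results obtained under other research funding, we took on a number of problems that had not previously been solved on Ising machines, including optimal solution of the quadratic assignment problem, the rectangle packing problem, the cuboid packing problem that extends rectangle packing to three dimensions, and the graph isomorphism problem, and obtained a certain level of results.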
不安定な環境発電でも永続動作可能とする超低エネルギーでロバストな集積回路設計技術

2018年

木村晋二, 多和田雅師, 川村一志

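In the IoT (Internet of Things) era, when things are networked and operated everywhere, stable power cannot be supplied from the power grid, and local production and consumption of energy, that is, driving circuits by energy harvesting from sources such as sunlight and vibration, becomes essential. Building on a base architecture called the distributed-register architecture, this research reduced the design margin of integrated circuits and constructed integrated circuit design technology that is robust against short-term and long-term delay variation. The technology integrates high-level design and physical design through the distributed-register architecture, which reduces the design margin and lowers energy consumption. It handles short-term delay variation by embedding delay monitoring circuits and long-term delay variation by building in multiple design scenarios. As a result, we established the prospects for each of these individual integrated circuit design techniques.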
FPGAデバイスに侵入したハードウェアトロイ検知技術の構築

2018年

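In general, the design and manufacturing of integrated circuits currently make active use of outsourcing to reduce design and manufacturing costs. If a malicious designer or manufacturer is involved in the design and manufacturing process, there is in principle a risk that malicious circuit components not intended by the designer (called hardware Trojans) are inserted into IoT devices. This research aims to detect hardware Trojan circuits in integrated circuits such as FPGA devices (reconfigurable circuit devices). We first examined the characteristics of hardware Trojans in integrated circuit devices including FPGAs. We found that hardware Trojans (1) have locally high fan-in (number of input lines) and (2) are located close to primary inputs, among other properties. Based on these findings, we trained a classifier on hardware Trojans and used it to identify hardware Trojans successfully in unknown integrated circuits.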
機械学習による複合的なIoTデバイスの異常検知・回復技術の構築

2017年

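IoT (Internet of Things) devices are built from many large-scale integrated circuits (LSIs). If a malicious designer or manufacturer is involved in the design and manufacturing process, there is in principle a risk that malicious circuit components not intended by the designer (called hardware Trojans) are inserted into IoT devices. To operate IoT devices safely and securely, there is a strong need to realize secure IoT devices by detecting malicious circuit components as early as possible and removing them. By making active use of machine learning, this research succeeded in detecting malicious circuits in IoT devices with high accuracy, and also succeeded in discovering malicious behavior by measuring the power consumption of IoT devices.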
機械学習を用いた設計工程ハードウェアトロイ検出手法の構築

2016年

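In general, if a malicious designer or manufacturer is involved in the design and manufacturing of large-scale integrated circuits (LSIs), there is in principle a risk of hardware Trojan insertion. An LSI containing a hardware Trojan, and any system using it, may have its functions disabled or destroyed, or may leak confidential information. To respond adaptively to unknown hardware Trojan circuits, this research established a design-stage hardware Trojan detection technique using machine learning. For unknown hardware Trojans, it achieves a detection rate above 80% on some examples.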
不揮発メモリのための書込みビット数を厳密に最小化する符号化とノーマリオフ計算応用

2016年

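Non-volatile memory plays a central role in normally-off computing, but the energy to write a bit to non-volatile memory is one to two orders of magnitude or more larger than the energy to read a bit, so success depends on how far the number of bits written to non-volatile memory can be reduced. To address this, we succeeded in constructing a coding scheme that strictly minimizes the number of written bits by encoding the data. In this research, we first constructed a code that minimizes written bits with a configuration suited to normally-off computing applications. In particular, we added error correction capability for the written bits and succeeded in constructing a coding scheme that provides write reduction and error correction at the same time.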
環境発電の不安定な微小電力で永続動作する超低エネルギー・ロバスト集積回路設計技術

2016年

木村晋二

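This research aimed to construct circuit design technology under which an integrated circuit continues to operate robustly despite variation in its operation, and worked mainly on the following two points. First, we realized circuit operation as multiple scenarios and constructed a design technique that monitors delay variation during operation, switches the circuit to an appropriate scenario when delay variation occurs, and thus always operates under the best scenario. Second, we focused on circuit aging, in particular NBTI (negative bias temperature instability), and constructed a design technique that takes aging into account. By power-gating the most suitable locations in the circuit, we succeeded in keeping the delay variation caused by aging small.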
ノーマリオフ計算のための書込みビットを厳密に最小化する書込み削減符号の構築

2015年

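Non-volatile memory plays a central role in normally-off computing, but the energy to write a bit to non-volatile memory is one to two orders of magnitude or more larger than the energy to read a bit, so success depends on how far the number of bits written to non-volatile memory can be reduced. To address this, we succeeded in constructing a coding scheme that strictly minimizes the number of written bits by encoding the data. In this research, we first constructed a code that minimizes written bits with a configuration suited to normally-off computing applications, and built a non-volatile memory system consisting of an encoder, a decoder, and memory cells. We further constructed a code that provides error correction capability together with bit-write reduction. Memory energy was reduced by up to about 40%.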
自然エネルギーで半永久的に動作し続けるレジリエント集積回路設計技術

2015年

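In a society centered on natural energy, keeping integrated circuits driven by natural-energy power generation running depends on (A) design technology that eliminates waste in the energy consumption of integrated circuits to the utmost, and (B) design technology for integrated circuits that operate even when the power supplied from natural energy fluctuates. Using the fact that (1) normal operation of an integrated circuit is equivalent to the correct connection state of its circuit elements, this research constructed (1-1) circuit technology that monitors the connection state and (1-2) circuit technology that repairs the connection state when an abnormality is found. We further constructed (2) circuit design technology that incorporates this monitoring and repair of the connection state. Applying it to a variety of applications reduced the performance margin by more than 20%.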
書込みビット数を1桁削減する書込み削減符号の構築とノーマリオフ技術への展開

2014年

多和田雅師

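Normally-off computing is a computing paradigm in which power is off by default, and non-volatile memory plays the central role in it. The write energy of non-volatile memory is far larger than that of ordinary volatile memory. The greatest issue in realizing normally-off computing is therefore how far the number of bits written to non-volatile memory can be reduced. For non-volatile memory, this research constructed a method that minimizes write energy by first encoding the data and then minimizing the distance between codewords. Experiments confirmed that the method reduces the number of bits written to non-volatile memory by up to 75% and the write energy by up to 33%.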
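
The general idea of reducing written bits by encoding can be illustrated with a simple, well-known scheme in which each data word has two codewords (the data itself, or its bitwise inverse plus a flag bit) and the one closer in Hamming distance to the currently stored word is written. This is a generic sketch in the style of flip-based write reduction, not the code constructed in this research; all names are assumptions.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def encode(data, stored, width):
    """Pick the codeword for `data` that flips the fewest stored bits.

    Codewords are width+1 bits: the top bit is a flag that says whether
    the payload is stored inverted.
    """
    mask = (1 << width) - 1
    plain = data & mask                        # flag = 0, payload = data
    flipped = (~data & mask) | (1 << width)    # flag = 1, payload = ~data
    return min((plain, flipped), key=lambda cw: hamming(cw, stored))


def decode(codeword, width):
    """Recover the data word from a codeword."""
    mask = (1 << width) - 1
    payload = codeword & mask
    inverted = (codeword >> width) & 1
    return (~payload & mask) if inverted else payload


stored = 0b0_0000_0000          # current memory contents (flag + 8 bits)
codeword = encode(0b1111_1110, stored, width=8)
bits_written = hamming(codeword, stored)
print(bin(codeword), bits_written, bin(decode(codeword, 8)))
```

Because the two candidate codewords differ in every bit, their distances to the stored word always sum to width + 1, so at most half of the bits (rounded down) are ever rewritten in this scheme.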
世界最速を達成するメニーコアプロセッサのキャッシュ構成シミュレータの研究開発

2013年

多和田　雅師

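Today, almost every electrical appliance around us, including digital televisions, hard disk recorders, mobile phones, automobiles, air conditioners, and rice cookers, contains embedded processors large and small. The performance and price of embedded processors are closely tied to our affluent, safe, and secure lives. With advances in semiconductor process technology, the trend in embedded processors has shifted from a single processor core to many-core processors that integrate multiple processor cores.

High-performance many-core processors contain cache memory. Cache memory is a memory system that sits between the processor and external memory such as SDRAM to compensate for the performance gap between them. Mainly because of growth in cache size itself and the increase in leakage current caused by semiconductor scaling, the cache accounts for up to 60% to 80% of the total processor area and up to 50% to 70% of its power consumption. Put in extreme terms, it is now the cache memory that determines the price and performance of a many-core embedded processor. The memory configuration of a many-core processor consists of an L1 cache private to each processor core and L2 and L3 caches shared by multiple cores, and is far more complex than that of a single processor. Knowing accurately how the caches of a many-core embedded processor behave for a given application program contributes greatly to determining its price and performance.

Against this background, this research discovered and proved mathematical properties specific to the caches of many-core processors and applied them to develop very fast cache configuration simulation technology for many-core processors. The results are summarized in the following two points.

(1) Cache configuration simulation can be carried out by running a single-configuration cache simulation multiple times, but this may take an impractical amount of time. Simulating multiple cache configurations together at the same time shortens the execution time. Doing so requires a data structure that represents multiple cache configurations simultaneously. If a data structure can be built such that searching and updating one structure performs the search and update for multiple cache configurations, fast cache configuration simulation becomes possible. This research therefore focused on cache associativity and proposed a method for representing multiple cache configurations with different associativities in a single data structure.

(2) We implemented the data structure proposed in (1) on a computer and built a cache configuration simulator for many-core processors. We confirmed that the simulator determines cache hits and misses exactly and runs 20 times faster than conventional cache configuration simulation.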
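
The idea of covering several associativities with one data structure can be illustrated with the classic LRU stack property: with the number of sets and the line size fixed, an access hits in an a-way LRU cache exactly when the block sits among the a most recently used blocks of its set. The sketch below uses this textbook property for a single-core, single-level cache; it is not the data structure proposed in the research, and the function name and parameters are assumptions.

```python
def hits_per_associativity(addresses, num_sets, line_size, max_ways):
    """Count LRU hits for every associativity 1..max_ways in one pass.

    Returns a list whose element k-1 is the hit count of a k-way cache
    with the given number of sets and line size.
    """
    stacks = [[] for _ in range(num_sets)]   # per set, most recent first
    hits = [0] * (max_ways + 1)              # index 0 is unused
    for addr in addresses:
        block = addr // line_size
        stack = stacks[block % num_sets]
        if block in stack:
            depth = stack.index(block)       # 0 = most recently used
            # Every cache with more than `depth` ways still holds the block.
            for ways in range(depth + 1, max_ways + 1):
                hits[ways] += 1
            stack.remove(block)
        stack.insert(0, block)
        del stack[max_ways:]                 # deeper entries never hit
    return hits[1:]


trace = [0, 1, 0, 2, 1, 0]
result = hits_per_associativity(trace, num_sets=1, line_size=1, max_ways=3)
print(result)
```

One pass over the address trace therefore yields exact hit counts for all associativities up to `max_ways`, instead of one simulation run per configuration.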
世界最速を達成する階層キャッシュ構成シミュレータの研究開発

2011, 多和田雅師

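1. Research background. Today, nearly every electrical product around us, including digital televisions, hard-disk recorders, mobile phones, automobiles, air conditioners, and rice cookers, contains embedded processors large or small. The performance and price of embedded processors have become closely tied to our affluent, safe, and secure lives. High-performance embedded processors carry memory referred to as on-chip memory. On-chip memory is a memory system that sits between the processor and external memory such as DRAM to compensate for the gap between their performance. However, mainly because of growing on-chip memory sizes and the increase in leakage current caused by semiconductor scaling, on-chip memory occupies up to 60% to 80% of the total area of an embedded processor and likewise accounts for up to 50% to 70% of its power consumption. Put bluntly, it is now on-chip memory that determines the price and performance of an embedded processor, and understanding its behavior contributes greatly to determining them. On-chip memory generally consists of (1) an L1 (level-1) cache, (2) an L2 (level-2) cache, and (3) scratchpad memory. Assuming that a specific program runs on the embedded processor (for a digital television, for example, the decoding of digital broadcasts), this research aims to simulate very quickly and exactly how many data hits and misses occur in on-chip memory as a total of seven parameters of components (1) to (3) are each varied from their minimum to their maximum values, and to obtain the optimal on-chip memory configuration from the results. The target was a speedup of more than 1000 times over the on-chip memory simulator regarded as the fastest in the world. 2. Summary of results. The research was carried out in two stages: Stage 1 (L1 cache/L2 cache) and Stage 2 (L1 cache/L2 cache/scratchpad memory). [Stage 1] (Optimization by ultra-fast simulation of the L1 and L2 caches) Among the components of on-chip memory, Stage 1 first takes up simulation of the L1 and L2 caches, computing the exact numbers of hits and misses as six parameters are varied in order to search for the parameters that minimize memory access time or energy consumption. The applicant had been the first in the world to identify universal mathematical properties of cache memory in L1 cache simulation. The idea was that applying these properties to both the L1 and L2 caches would enable ultra-fast simulation-based minimization of memory access time or energy consumption. Based on this consideration, the following property was found and a fast on-chip memory optimization technique built on it was devised. [Property 1] A two-level L1/L2 cache configuration whose L1 cache has associativity 1 has the same number of cache misses as a one-level L1 cache with the same configuration. Based on this property, the CRCB-T method was devised, a speedup technique that exactly simulates the numbers of cache hits and misses of two-level L1/L2 cache configurations. Evaluation on a computer confirmed a speedup of 1465 times over conventional techniques. [Stage 2] (Optimization by ultra-fast simulation of the L1 cache, L2 cache, and scratchpad memory) The on-chip memory of an embedded processor has an L1 cache, L2 cache, and scratchpad memory structure. Stage 2 addresses optimization of all seven parameters by fast simulation of the entire on-chip memory, including the scratchpad memory as well as the L1 and L2 caches. By extending the Stage 1 results to scratchpad memory and identifying mathematical properties of the scratchpad alone, the goal is ultimately a speedup of 100 to 1000 times over the simulation-based on-chip memory optimization method previously regarded as the fastest.
For the L1 cache, L2 cache, and scratchpad memory configuration that incorporates a scratchpad, the following property was found in addition to Property 1 above. It is simple and clear, yet it is an extremely important property that can be applied to all on-chip memories as an invariant principle. [Property 2] Data held in a scratchpad memory of smaller capacity is always contained in a scratchpad memory of larger capacity. Based on this property, the CRCB-S method was devised, a speedup technique that exactly simulates the numbers of cache hits and misses of L1 cache, L2 cache, and scratchpad memory configurations. Evaluation on a computer confirmed a speedup of about 3173 times over conventional techniques. As a result, simulation-based on-chip memory optimization that took several months even with the fastest existing technique can be completed within a few hours with the proposed technique. This means that the world's fastest cache configuration simulation has been achieved.
Attacks on cryptographic LSIs and their defenses for a safe and secure electronic society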
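Property 2 means the contents of scratchpads of different sizes are nested, so every size can be evaluated in one pass. The sketch below is not the CRCB-S method. It assumes a static allocation that fills the scratchpad with the most frequently accessed equal-size blocks, a policy the abstract does not state, and under that assumption computes the number of accesses served by the scratchpad for every capacity with a single prefix sum. Names are hypothetical.

```python
from collections import Counter
from itertools import accumulate

def scratchpad_hits_for_all_sizes(block_trace, max_blocks):
    """Return a list whose entry k is the accesses served by a k-block scratchpad."""
    counts = Counter(block_trace)
    # Rank blocks by access count (ties broken by block id). The first k ranked
    # blocks are the assumed contents of a k-block scratchpad, so they are nested.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    per_block = [n for _, n in ranked[:max_blocks]]
    per_block += [0] * (max_blocks - len(per_block))   # sizes beyond the working set
    return [0] + list(accumulate(per_block))
```

For the trace `["a", "b", "a", "c", "a", "b"]` and `max_blocks=4`, the result is `[0, 3, 5, 6, 6]`: one entry per capacity from 0 to 4 blocks, obtained without re-simulating each capacity.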

2010

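In recent years, scan-based attacks, which recover the secret key of an LSI (large-scale integrated circuit) implementing cryptographic processing by exploiting its test scan path, have attracted attention. A scan path is a design-for-testability technique that connects the registers inside an LSI in series so that they can be controlled and observed directly from outside the chip, and scan-path testing greatly improves LSI test efficiency. At the same time, a scan-based attack exploits the fact that the scan path exposes the register outputs of an operating LSI, analyzing the operating state of the cryptographic circuit in order to recover the secret key. The difficulty of a scan-based attack is that even when the attacker obtains scan data during cryptographic operation, the correspondence between the scan data and the registers is unknown. Several methods have been proposed to address this, but all of them have two major problems. (1) They work only when the scan path consists solely of registers in the cryptographic circuit and cannot accommodate registers of peripheral circuits. (2) They target the symmetric-key ciphers DES and AES and cannot recover secret keys of public-key cryptosystems. Against this background, this research proposed a new scan-based attack method that recovers the secret key even when registers outside the cryptographic circuit are included in the scan path, and that can also recover secret keys of RSA and elliptic curve cryptography, both well-known public-key cryptosystems. The proposed method focuses on the sequence of changes of a specific 1-bit register that holds an intermediate value computed during encryption. Over a sufficient number of inputs, the sequence of that one bit of the intermediate value is close to random and is unique to that intermediate result (it is called a discriminator or scan signature). The scan data is analyzed by checking whether the scan signature is present in it. Computer simulations and evaluation experiments on an FPGA board showed that secret keys of more than 128 bits can be recovered with at most several hundred plaintexts for each of AES, RSA, and elliptic curve cryptography. In addition, to protect cryptographic LSIs from the proposed scan-based attack, a new scan-path defense that obstructs analysis of the scan data, the state-dependent scan register technique, was proposed.
Research on the design of adaptive processors for communication processing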
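The scan-signature idea, predicting one intermediate bit across many plaintexts and checking whether that bit sequence appears anywhere in the scan data, can be sketched on a toy model. Everything below is an illustrative assumption and not the attack as implemented in this research: the "cipher" is a single random 8-bit substitution of plaintext XOR key, the scan chain is 64 bits long, unrelated registers are modeled as random bits, and all names are hypothetical.

```python
import random

rng = random.Random(2010)        # fixed seed so the demo is repeatable
SBOX = list(range(256))
rng.shuffle(SBOX)                # toy nonlinear substitution, not a real cipher

def intermediate(plaintext, key):
    """Toy 8-bit intermediate value held in a register during encryption."""
    return SBOX[plaintext ^ key]

def scan_dump(plaintext, key, chain_len, positions):
    """One scan-out: unrelated register bits plus the intermediate bits at hidden positions."""
    bits = [rng.getrandbits(1) for _ in range(chain_len)]
    value = intermediate(plaintext, key)
    for i, pos in enumerate(positions):
        bits[pos] = (value >> i) & 1
    return bits

def recover_key(plaintexts, dumps):
    """Return every key guess whose predicted bit sequence (signature) is a scan column."""
    columns = {tuple(dump[j] for dump in dumps) for j in range(len(dumps[0]))}
    return [key for key in range(256)
            if tuple(intermediate(p, key) & 1 for p in plaintexts) in columns]

SECRET_KEY = 0xA7
POSITIONS = rng.sample(range(64), 8)                  # unknown to the attacker
PLAINTEXTS = [rng.randrange(256) for _ in range(48)]
DUMPS = [scan_dump(p, SECRET_KEY, 64, POSITIONS) for p in PLAINTEXTS]
```

The attacker never uses `POSITIONS`: `recover_key(PLAINTEXTS, DUMPS)` only asks whether some column of the scan data equals the predicted signature, which is why registers from peripheral circuits in the chain do not prevent the analysis in this toy setting.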

2005

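Communication processors, or network processors, are a relatively new type of processor: dedicated processors specialized for communication processing, typified by packet switching. They have so far been used where fast packet switching is the main concern, such as communication processing in backbone networks. In end-user equipment led by information appliances, however, we consider it essential for a communication processor to adaptively provide, in addition to (1) conventional simple switching, (2) encoding and decoding of multimedia information, (3) encryption and decryption of personal content, (4) firewall functions, and (5) QoS (Quality of Service) control. Against this background, this research worked on the design of a dedicated processor for communication processing whose application processing can be changed adaptively. The proposed communication processor is a collection of processor cores with heterogeneous structures and has a mechanism that adaptively changes the on-board programs according to the packet-processing load in order to balance the processing. To guarantee packet delay control, it provides a bus arbitration mechanism based on packet priority (a QoS arbiter), and each processor core is connected to a shared bus equipped with this QoS arbiter. As a result, the architecture is expected to realize processing (1) to (5) together with reliable packet delay control. Through (1) construction of the basic architecture of the adaptive processor for communication processing and (2) construction of an automated design environment framework for it, it was confirmed that the proposed processor, by greatly improving scalability over existing network processors, achieves a performance improvement of about 10% to 20%, switching throughput close to 3 Gbps, and encrypted communication throughput exceeding 250 Mbps.
Research on high-level synthesis methods for control-flow-dominated hardware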
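Priority-based bus arbitration can be sketched in software as a priority queue of pending bus requests. The abstract gives no details of the QoS arbiter, so the following is only a behavioral illustration under stated assumptions: a larger number means a higher packet priority, and requests of equal priority are granted in arrival order. The class and method names are hypothetical.

```python
import heapq
from itertools import count

class QosArbiter:
    """Grants the shared bus to the pending request with the highest packet priority."""

    def __init__(self):
        self._pending = []       # heap of (-priority, arrival order, core id)
        self._arrival = count()  # tie-breaker: earlier requests win

    def request(self, core_id, priority):
        """Register a bus request from a processor core for a packet of given priority."""
        heapq.heappush(self._pending, (-priority, next(self._arrival), core_id))

    def grant(self):
        """Return the core that gets the bus next, or None if nothing is pending."""
        if not self._pending:
            return None
        return heapq.heappop(self._pending)[2]
```

With requests from `core0` (priority 1), `core1` (3), `core2` (3), and `core3` (2), successive calls to `grant()` return `core1`, `core2`, `core3`, `core0`, and then `None`.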

1999

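In general, when application programs composed of bit-level operations or conditional branches, such as image encoding and decoding, protocol processing, or cryptographic processing, are implemented in dedicated hardware, the bit-level operations and conditional branches can execute in parallel, so they run faster than on a microprocessor. We consider that a high-level synthesis system that automates the design of dedicated hardware dominated by control processing must, first, be able to synthesize hardware that implements control processing such as bit-level operations and conditional branches and, second, provide an environment that offers multiple design candidates for the behavioral specification given by the designer and evaluates them to find the optimal design. On this basis, this research proposed a high-level synthesis system for control-dominated hardware that synthesizes a hardware description in a hardware description language from a behavioral description in C. The system takes as input a behavioral description of the application program in C together with application data, and outputs a hardware description that implements the application program. For the input application program, it enumerates multiple hardware designs that satisfy the timing and area constraints. The system consists of three steps: (i) code optimization, (ii) area/time optimization, and (iii) hardware description generation. Code optimization takes the application program as input and represents it internally as a call graph and control flow graphs. Area/time optimization derives multiple hardware candidates that satisfy the timing and area constraints from the call graph and control flow graphs. Finally, hardware description generation outputs a hardware description for each candidate obtained by area/time optimization. This research further focused on area/time optimization, the core of the system, and proposed and built an algorithm for it. The proposed algorithm takes as input a call graph and the set of control flow graphs that make up the call graph, and synthesizes, under the area and timing constraints, a set of state transition graphs representing the whole call graph. It first constructs state transition graphs that satisfy only the timing constraint and then, for as long as the timing constraint remains satisfied, transforms them so that the area constraint is met, which yields multiple hardware candidates. The algorithm has the following features. (1) By manipulating control flow graphs directly, it can handle control processing such as bit-level operations and conditional branches. (2) From a single call graph representing the whole application program, it can enumerate multiple hardware candidates that satisfy the area and timing constraints. When the proposed area/time optimization algorithm was built into the system and applied to control-dominated application programs, multiple hardware designs with an area versus execution time trade-off were synthesized. Moreover, relative to manually designed hardware, the synthesized designs ranged from smaller to larger area and from shorter to longer execution time.
Research on hardware/software co-design methods for embedded processors for image processing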

1998

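An embedded processor for image processing is a processor integrated into a system LSI dedicated to image processing. Such processors have many image-processing units not found in conventional general-purpose microprocessors, and the central question is how to combine these units to build the processor. Moreover, obtaining an optimal design requires enumerating many candidate processor architectures and running the application software on each processor. In other words, what is needed is a method that automatically designs the hardware of the embedded processor and the software that runs on it at the same time (a hardware/software co-design method). We have previously studied automated hardware design based on enumeration of architecture candidates, achieving optimal architecture design by using a computer to search a wider range of the architectural solution space. This research aimed to extend that concept to hardware/software co-design of embedded processors for image processing. Against this background, this research built a method by which a computer automatically designs processors dedicated to image processing application software such as encoding and decoding of video or high-definition images, feature extraction, and enhancement and restoration. The method follows these steps. (1) First, a processor model is defined that includes every hardware unit envisaged for the processor (multiply-accumulate units, addressing units, hardware looping circuits, and multiple register files), and the given application program is compiled for this model. The processor hardware that executes the resulting assembly code has a higher hardware cost but the shortest execution time. (2) Next, parts implemented by hardware units are gradually replaced by software. The resulting assembly code takes progressively longer to execute, but the area required for the processor hardware shrinks. (3) From the processor configuration obtained in (2), a register-transfer-level (RT-level) processor description written in the hardware description language VHDL is synthesized. The resulting processor description can be synthesized with commercial logic synthesis tools, and the method made it possible to obtain embedded processors for image processing very quickly and at low cost.
Research on hardware/software partitioning methods for digital signal processors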

1997

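A DSP (Digital Signal Processor) is an indispensable device for today's digital signal processing, typified by high-definition image processing. Hardware/software partitioning (HW/SW partitioning) for a DSP is the problem of deciding which parts inside the DSP are implemented in hardware and which in software, and it determines the price, area, and performance of the DSP itself and, in turn, of the digital signal processing system that contains it. Automating HW/SW partitioning by computer is a new line of research that began about five years ago, and so far results have been reported only for general microprocessors. Because a DSP has many signal-processing units that microprocessors lack, its HW/SW partitioning problem differs fundamentally from that for microprocessors. We consider that concepts from established research on automated integrated-circuit design, in particular high-level design by enumeration of design candidates for DSP datapaths, can be applied to HW/SW partitioning for DSPs. Against this background, this research considered a system in which a computer automatically synthesizes DSP hardware dedicated to a group of application programs, targeting image information processing applications such as video encoding and decoding, feature extraction, and enhancement and restoration, and built the HW/SW partitioning method that forms the core of the system. The method is based on the following steps. (1) First, a processor model is defined that includes every hardware unit envisaged for the DSP hardware (multiply-accumulate units, addressing units, and hardware looping circuits), and the given application program is compiled for this model. The resulting assembly code incurs a higher hardware cost but has the shortest execution time. (2) Next, parts implemented by hardware units are gradually replaced by software. The resulting assembly code takes progressively longer to execute, but the area required for the DSP hardware shrinks. (3) Repeating this process until the timing constraint would be violated makes it possible to synthesize a small-area processor core that meets the timing constraint and is optimized for the application program. The method makes it possible to build and evaluate DSP hardware suited to a group of application programs in a short time, and thus to build DSPs, and signal processing systems containing them, that implement the latest image information processing algorithms in a short period. Publications: March 1998, 戸川望, 桜井崇志, 柳澤政生, 大附辰夫, "ディジタル信号処理向けプロセッサのハードウェア/ソフトウェア協調合成システム" (A hardware/software cosynthesis system for digital signal processors), IEICE Technical Report, VLD97-115. March 1998, 川崎隆志, 戸川望, 柳澤政生, 大附辰夫, "ディジタル信号処理向けプロセッサの自動合成システムにおける並列化コンパイラ" (A parallelizing compiler in an automatic synthesis system for digital signal processors), IEICE Technical Report, VLD97-116. March 1998, 濱辺雅哉, 能勢敦, 戸川望, 柳澤政生, 大附辰夫, "パイプラインプロセッサのハードウェア記述自動生成手法" (An automatic hardware description generation method for pipelined processors), IEICE Technical Report, VLD97-117.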
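Steps (1) to (3) above can be sketched as a greedy loop: start with every hardware unit present, then move units to software one at a time until the next move would break the timing constraint. This is a simplified illustration, not the method built in this research. In particular, the rule for choosing which unit to move next (most area saved per cycle of slowdown) and the cost model (a fixed area and a fixed time penalty per unit) are assumptions, and all names and numbers are hypothetical.

```python
def partition(units, base_time, time_limit):
    """Greedy HW/SW partitioning sketch.

    units maps a unit name to (area, extra cycles if done in software instead);
    every penalty is assumed to be positive.
    Returns (units kept in hardware, total hardware area, execution time).
    """
    in_hardware = dict(units)    # step (1): start with every unit in hardware
    time = base_time             # shortest execution time, highest hardware cost
    while in_hardware:
        # Step (2): pick the unit whose removal saves the most area per extra cycle.
        name, (area, penalty) = max(
            in_hardware.items(), key=lambda item: item[1][0] / item[1][1]
        )
        # Step (3): stop as soon as the next move would violate the timing constraint.
        if time + penalty > time_limit:
            break
        time += penalty
        del in_hardware[name]
    total_area = sum(area for area, _ in in_hardware.values())
    return sorted(in_hardware), total_area, time
```

With the made-up inputs `{"mac": (50, 400), "addressing": (20, 50), "hw_loop": (10, 20)}`, a base time of 1000 cycles, and a limit of 1100 cycles, the loop moves the looping circuit and the addressing unit to software and keeps the multiply-accumulate unit, returning `(["mac"], 50, 1070)`.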
