Updated on 2025/03/14


GOTO, Satoshi
Faculty of Science and Engineering
Job title
Professor Emeritus
Doctor of Engineering
He worked for NEC Laboratories for 32 years and became a professor at Waseda University in 2002. His research interests are multimedia and its LSI design.

Research Experience

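  • 2013

    - : Visiting Professor, Fudan University, China

  • 2012

    - : Visiting Professor, Tsinghua University, China

  • 2009

    - : Visiting Professor, Shanghai Jiao Tong University

  • 2002

    - : Professor, Graduate School of Information, Production and Systems, Waseda University

  • 2000

    : General Manager, Research Headquarters, NEC Corporation

  • 1998

    : Director and General Manager, NEC Corporation

  • 1990

    : Director, Information Media Research Laboratories, NEC Corporation

  • 1984

    : Research Section Manager and Department Manager, NEC Corporation

  • 1970

    : Researcher, Central Research Laboratories, NEC Corporation

  • 1976

    : Visiting Scholar, University of California, Berkeley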


Education Background


    Waseda University   Faculty of Science and Engineering   Electronics and Communication  


    Waseda University   Graduate School of Science and Engineering   Electronics and Communication Engineering  


    Waseda University   School of Science and Engineering  


    Waseda University  

Committee Memberships

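  • 2006

    Institute of Electronics, Information and Communication Engineers (IEICE)  Auditor

  • 2006

    Institute of Image Electronics Engineers of Japan (IIEEJ)  Regular Member

    Information Processing Society of Japan (IPSJ)  Regular Member

    IEICE  Fellow

    National Institute of Advanced Industrial Science and Technology (AIST): Member, Research Unit Evaluation Committee

    Ministry of Internal Affairs and Communications (MIC): Expert Evaluation Committee Member, Strategic Information and Communications R&D Promotion Programme

    IEICE: Director for Editing, Director for Research, Auditor

    Japanese Society for Artificial Intelligence (JSAI)  Regular Member

    IEEE  Life Fellow

    VLSI-DAT: Co-General Chairperson, Technical Program Chair

    ISOCC: Co-General Chairperson

    DAC: TPC Chairperson

    ASPDAC: General Chairperson

    Ministry of Education, Culture, Sports, Science and Technology (MEXT): Member, Council for Science and Technology; Member, 21st Century COE Committee; CREST Research Area Advisor

    JSAI: Director, Auditor

    IEEE: CAS Board Member

    ISCCSP: Co-General Chairperson

    ITC-CSCC: Co-General Chairperson

    ASICON: Co-General Chairperson

    ICCAD: General Chairperson, TPC Chairperson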



Research Areas

  • Communication and network engineering / Electron device and electronic equipment / Control and system engineering

Research Interests

  • System information (knowledge) processing

  • Cryptography and security

  • Electronic devices and integrated circuits

  • Digital circuits

  • Multimedia

  • System LSI

  • Circuit design and CAD

  • CAD



Awards

  • STARC Research Award

  • ASICON Distinguished Performance Award

  • IEEE VLSI-DAT Best Paper

  • Semiconductor of the Year, Excellence Award

  • IEEE VLSI Symposium Best Paper

  • ISIC Design Award

  • IEEE Life Fellow

  • ISLPED Low Power Design Contest 3rd Prize

  • CSPA Best Paper Award

  • 10th LSI IP Design Award, IP Excellence Award

  • International SoC Design Conference (ISOCC) Silver Prize

  • 9th LSI IP Design Award, IP Award

  • 8th LSI IP Design Award, IP Award

  • DAC/ISSCC Design Contest 1st Place

  • IEICE Fellow

  • IEEE Golden Jubilee Awards

  • IEEE Fellow

  • JSAI Achievement Award

  • IEICE Achievement Award




Books and Other Publications

  • Mobile Computing

    ASCII Corporation  2000

  • Latest VLSI Development, Design and CAD

    Mimatsu Data  1994

  • Printed Wiring Design Technology Using CAD

    Mimatsu Data  1989

  • AI Technology

    Ohmsha  1986

  • VLSI Layout

    North Holland  1985

  • Design Methodologies

    North Holland  1985

  • Applications of Mathematical Programming: Combinatorial Problems in LSI Design

    Sangyo Tosho  1979



Presentations

  • Memory Organization in HEVC Motion Compensation for HEVC Luma and Chroma Parallel

    IEICE General Conference

    Presentation date: 2013

  • Human detection method by hybrid characteristic information

    IEICE General Conference

    Presentation date: 2013

  • Depth map generation based on edge compensation method

    IEICE General Conference

    Presentation date: 2013

  • New technology trend on Video Codec LSI

    IEICE Technical Committee on VLSI Design Technologies (VLD)

    Presentation date: 2013

  • A 3DIC Design for Low Power Motion Estimation Processor

    IEICE General Conference

    Presentation date: 2013

  • Proposed Bypass algorithms of HEVC CABAC Decoder

    IEICE General Conference

    Presentation date: 2013

  • An Efficient Hardware Architecture For Intra Prediction Module in HEVC

    IEICE General Conference

    Presentation date: 2013

  • A High Efficient VLSI Architecture for HEVC CABAC Decoder

    IEICE General Conference

    Presentation date: 2013

  • Mode filtering algorithm for HEVC fractional motion estimation

    IEICE General Conference

    Presentation date: 2013

  • Depth map generation based on the matching cost reliability

    IEICE General Conference

    Presentation date: 2013

  • Human Detection of Still Image by using Color Difference Gradient Method

    IEICE General Conference

    Presentation date: 2012

  • Multi View Video Encoding based on Depth Information for Stereo Video Sequence

    IEICE 2011 Society Conference (EIC2011)

    Presentation date: 2011

  • Hole Filling Method for Free View Synthesis based on the Past Frame

    IEICE 2011 Society Conference (EIC2011)

    Presentation date: 2011

  • Ultra-low-power technology for media processing


    Presentation date: 2010

  • Low Power Design for Video Compression and Processing


    Presentation date: 2010

  • A Sorting-Based Architecture of Finding the First Two Minimum Values

    Presentation date: 2010

  • A High-Parallelism Reconfigurable Permutation Network for IEEE 802.11n/802.16e LDPC Decoder

    Presentation date: 2010

  • An adaptive reference frame compression scheme for video decoding

    Presentation date: 2010

  • A novel three-step error concealment method for H.264/AVC

    Presentation date: 2010

  • Turbo decoding based on LDPC message passing algorithm

    Presentation date: 2010

  • A Low-power and Performance-aware DVB-S2 LDPC Decoder with Layered Scheduling

    Presentation date: 2010

  • Low complexity decoding with frame-skipping for surveillance video

    Presentation date: 2010

  • Ultra-low-power technology for multimedia processing


    Presentation date: 2009

  • Face image extraction method using IMAPCAR for counting customers


    Presentation date: 2009

  • Recent Advance in Video Processing Technology


    Presentation date: 2009

  • How to achieve Ultra Low Power Video Processing


    Presentation date: 2009

  • Ambient SoC Initiative

    Workshop on future Electronics Engineer 

    Presentation date: 2009

  • An Efficient Decoding Algorithm for ISDB-S2

    Presentation date: 2009

  • Low-power technology for media processing


    Presentation date: 2009

  • Implementation of LDPC decoder for 802.16e

    Presentation date: 2009

  • Five Pattern Adaptive Down-sampling Method for Variable Block Size Video Coding

    Presentation date: 2009

  • An Multi-rate LDPC decoder system on FPGA

    Presentation date: 2009

  • An efficient motion vector coding scheme based on prioritized reference decision

    Presentation date: 2009

  • Prediction-based Center-bias Fast Fractional Motion Estimation Algorithm for H.264/AVC

    Presentation date: 2008

  • High Throughput Rate-1/2 Partially-Parallel Irregular LDPC Decoder

    Presentation date: 2008

  • A Fast Mode Decision Algorithm for H.264/AVC Intra Prediction

    Presentation date: 2008

  • Level C+ Bandwidth Reduction Method for MPEG-2 to H.264 Transcoding

    Presentation date: 2008

  • Group-Based Prediction Scheme on Multiple Reference Frame Fractional Motion Estimation in H.264/AVC

    Presentation date: 2008

  • Adaptive Spatial EC Based on Numerical Measures of Edge Statistical Model

    Presentation date: 2008

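  • Design of a low-power, high-efficiency message-passing LDPC decoder suited to long code lengths

    Presentation date: 2008

  • High-speed multi-rate design of an irregular LDPC decoder using a high-efficiency message-passing schedule

    Presentation date: 2008

  • Power reduction method for UEP media systems based on H.264/AVC error concealment

    Presentation date: 2008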

  • Rate Estimation Using Linear programming in RDO of H.264/AVC

    Presentation date: 2008

  • Diamond Web-grid Search Algorithm for H.264/AVC Motion Estimation

    Presentation date: 2008

  • An Adaptive Error Concealment Order in H.264/AVC

    Presentation date: 2008

  • A Mode Reduction Based Fast Integer Motion Estimation Algorithm for HDTV

    Presentation date: 2008

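  • Area reduction method for a high-efficiency message-passing irregular LDPC decoder

    Presentation date: 2008

  • Proposal of a high-speed, low-power FFT circuit for OFDM wireless communication

    Presentation date: 2007

  • Computational complexity reduction method for deblocking filter processing in H.264

    IEICE General Conference

    Presentation date: 2007

  • Computational complexity reduction method for media processing systems using unequal error protection

    Presentation date: 2007

  • Unequal error protection scheme based on the importance of multimedia data

    Presentation date: 2007

  • Partially parallel irregular LDPC decoder using a high-efficiency message-passing schedule

    Presentation date: 2007

  • 0.3 mW, 1.4 mm² motion estimation processor LSI for mobile applications

    Presentation date: 2006

  • High-throughput LDPC decoder using a memory capacity reduction method

    Presentation date: 2006

  • System-in-Silicon (SiS) technology and its application to a motion search engine

    Presentation date: 2006

  • Real-time human extraction LSI for mobile terminals using depth information

    Presentation date: 2006

  • Memory reduction method for LDPC decoders using the Min-Sum algorithm

    Presentation date: 2005

  • Proposal of a human extraction algorithm using depth information for videophone images on mobile terminals, and its hardware implementation

    Presentation date: 2005

  • Background concealment using depth information for videophone images on mobile phones

    Presentation date: 2005

  • Implementation and evaluation of a partially parallel LDPC decoder that improves the propagation of reliability information in the Sum-Product algorithm

    Presentation date: 2005

  • Variable-speed fast-forward algorithm for MPEG video and its hardware implementation

    Presentation date: 2005

  • What artificial intelligence aims for

    Presentation date: 2005

  • Ultra-low-complexity motion estimation algorithm and implementation of a dedicated processor

    Presentation date: 2005

  • Low-power RSA cryptography LSI

    Presentation date: 2005

  • Partially parallel LDPC decoder based on a parity-check matrix that takes decoding performance into account

    Presentation date: 2005

  • Between theory and practice in the field of circuits and systems

    Presentation date: 2005

  • LSI implementation of elliptic curve cryptography

    IEICE 8th System LSI Workshop

    Presentation date: 2004

  • Implementation of an elliptic curve cryptography circuit with an N bit-wise Montgomery multiplier

    IEICE 2004 Society Conference

    Presentation date: 2004

  • Remote sensing system using μT-Engine and its applications

    Presentation date: 2004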

  • Paradigm Shift of SoC Design

    15th VLSI Design and CAD Symposium 

    Presentation date: 2004

  • On a method for reducing the number of multiplications in multi-bit multiplication

    IEICE 7th System LSI Workshop

    Presentation date: 2003

  • Knowledge clusters and business creation

    IEICE Design Gaia

    Presentation date: 2003

  • Paradigm Shift of System LSI Design

    System LSI workshop 

    Presentation date: 2003

  • System LSI Cluster Project at Kitakyushu

    SoC Design Conference 

    Presentation date: 2003


Research Projects

  • Strategic Innovation Program (MEXT)

    Grants and Funding

  • Regional Innovation Strategy Support Program (MEXT)

  • Research on Encoder LSI for Super Hi-Vision (KAKENHI, Grant-in-Aid for Scientific Research (A))

  • Academia based New Industry Incubation Project (project support type)

    Grants and Funding

  • Research on Ultra Low Power Media Processing SoC: CREST (principal investigator)

    JST Basic Research Programs (Core Research for Evolutional Science and Technology: CREST)

  • Semiconductor Technology Academic Research Center (STARC): Research on Encoder LSI for Super Hi-Vision

    Cooperative Research

  • Research on ICT Application LSI IP and Its Advanced Design Technology: Knowledge Cluster (principal investigator)

  • Regional Innovation Cluster Project: Research on ICT Application LSI IP (MEXT)

  • International Research and Education Center for Ambient SoC: Global COE (program leader) (MEXT/JST)

    R and D Consignment Program

  • Fujitsu Laboratories

  • Research on Design and Implementation Technology for Ultra-Large-Scale LSI (KAKENHI, Grant-in-Aid for Scientific Research (A)) (MEXT/JST)

  • Special Coordination Funds for Promoting Science and Technology: Fundamental Software for System LSI Design (MEXT/JST)

  • Renesas Electronics

  • University of California, Berkeley

    International Joint Research Projects

  • Fudan University (China)

    International Joint Research Projects

  • Knowledge Cluster Project: Research on Application SoC (MEXT)

  • Toyota Motor Corporation

    Cooperative Research

  • Tsinghua University (China)

    International Joint Research Projects

  • National Taiwan University

    International Joint Research Projects

  • Shanghai Jiao Tong University

    International Joint Research Projects

  • NEC Corporation

    Cooperative Research

  • Renesas Micro Systems

    Cooperative Research

  • Toshiba Corporation

    Cooperative Research



Papers

  • Real-Time Refinement Method for Foreground Objects Detectors Using Super Fast Resolution-Free Tracking System

    Axel Beaugendre, Satoshi Goto

    Moving objects, or more generally foreground objects, are the simplest objects in computer vision after the pixel. Indeed, a moving object can be defined by only four integers: either two pairs of coordinates, or a pair of coordinates and a size. In fixed-camera scenes, moving objects (or blobs) can be extracted quite easily, but the methods that produce them cannot tell whether a blob corresponds to residual background noise, to a single target, or to an occlusion among several targets that are close enough together to fuse into a single blob. In this paper we propose a novel method to refine moving object detection results, so as to obtain as many blobs as there are targets in the scene, by using a tracking system for additional information. Knowing whether a blob is in proximity to a tracker allows us to remove noise blobs, keep the rest, and handle occlusions when more than one tracker lies on a blob. The results show that the refinement is an efficient tool for separating good blobs from noise blobs and is accurate enough to support tracking based on moving objects. The tracking process is a resolution-free system able to reach speeds of 20,000 fps even for UHDTV sequences. The refinement process itself runs in real time, at more than 2,000 fps in difficult situations. Several tests are presented to show the efficiency of the noise removal and to confirm that the refinement tracking system is independent of the video resolution.


  • A 995Mpixels/s 0.2nJ/pixel fractional motion estimation architecture in HEVC for UHD

    He Gang, Dajiang Zhou, Satoshi Goto

    IEEE A-SSCC     301 - 304  2014


  • Fast Prediction Unit Selection and Mode selection for HEVC Intra Prediction

    Heming Sun, Dajiang Zhou, Peilin Liu, Satoshi Goto

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E97-A ( No.2 )  2014


  • Low-complexity rate-distortion optimization algorithms for HEVC intra prediction

    Zhe Sheng, Dajiang Zhou, Heming Sun, Satoshi Goto

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   8325 ( 1 ) 541 - 552  2014

    HEVC achieves better coding efficiency than prior standards, but at the cost of dramatically increased complexity. The increase is especially severe for intra prediction, owing to a highly flexible quad-tree coding structure and a large number of prediction modes. The encoder employs rate-distortion optimization (RDO) to select the optimal coding mode, and RDO accounts for a large portion of intra encoding complexity. Moreover, HEVC depends more strongly on RDO than H.264/AVC does. To reduce the computational complexity and to implement a real-time system, this paper presents two low-complexity RDO algorithms for HEVC intra prediction. The structure of RDO is simplified by the proposed rate and distortion estimators, and some hardware-unfriendly modules are simplified. Compared with the original RDO procedure, the two proposed algorithms reduce RDO time by 46% and 64% respectively, with acceptable coding efficiency loss. © 2014 Springer International Publishing.


  • A High Performance HEVC De-Blocking Filter and SAO Architecture for UHDTV Decoder

    Jiayi Zhu, Dajiang Zhou, Satoshi Goto


    High Efficiency Video Coding (HEVC) is the next-generation video compression standard. The in-loop filter is an important component of HEVC and is composed of two parts, the deblocking filter (DBF) and sample adaptive offset (SAO). In this article, we propose a high-performance in-loop filter architecture for HEVC that integrates both the deblocking filter and SAO. Several ideas are adopted to achieve this. First, SAO is processed on drifted blocks, which suits the output pattern of the deblocking filter and eases the coupling of the deblocking filter and SAO. Second, the luma and chroma samples of each 4 x 4 block are organized in the same memory storage unit and processed simultaneously to raise parallelism. Third, in both the deblocking filter and SAO, the calculation core is implemented in combinational logic and data storage is implemented in register groups. The calculation core processes data continuously, which greatly raises the utilization of the DBF core and the SAO core. Fourth, a task-level pipeline for processing 8 x 8 blocks is employed between the deblocking filter and SAO. By these means, a high-performance in-loop filter including both the deblocking filter and SAO is achieved without any intermediate storage or circuit. It takes only four cycles to finish the deblocking filter and SAO of one 8 x 8 block. The implementation results show that the proposed solution can be synthesized to 240 MHz with 65 nm technology, so it can process up to 3.84 Gpixels/s. UHDTV 4320p (7680 x 4320) @ 60 fps decoding can be realized at a working frequency of 124.4 MHz with the proposed architecture.
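    The throughput figures quoted in this abstract follow from simple arithmetic on the cycle count per block. The sketch below reproduces them under the assumption that throughput is counted as one sample per pixel position; the function and constant names are illustrative and do not come from the paper.

```python
def pixels_per_cycle(block_width: int, block_height: int, cycles_per_block: int) -> float:
    """Average number of pixels finished per clock cycle."""
    return block_width * block_height / cycles_per_block


def peak_throughput(clock_hz: float, px_per_cycle: float) -> float:
    """Peak throughput in pixels per second at the given clock frequency."""
    return clock_hz * px_per_cycle


def required_clock_hz(width: int, height: int, fps: int, px_per_cycle: float) -> float:
    """Clock frequency needed to sustain a given video format."""
    return width * height * fps / px_per_cycle


# Figures quoted in the abstract: one 8 x 8 block every 4 cycles, 240 MHz synthesis.
PX_PER_CYCLE = pixels_per_cycle(8, 8, 4)
PEAK = peak_throughput(240e6, PX_PER_CYCLE)
CLOCK_4320P60 = required_clock_hz(7680, 4320, 60, PX_PER_CYCLE)

if __name__ == "__main__":
    print(f"{PX_PER_CYCLE:.0f} pixels/cycle")
    print(f"{PEAK / 1e9:.2f} Gpixels/s at 240 MHz")
    print(f"{CLOCK_4320P60 / 1e6:.1f} MHz for 7680 x 4320 @ 60 fps")
```

    With these inputs the sketch gives 16 pixels per cycle, 3.84 Gpixels/s at 240 MHz, and roughly 124.4 MHz for 4320p at 60 fps, matching the numbers stated above.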


  • A Dual-Mode Deblocking Filter Design for HEVC and H.264/AVC

    Muchen Li, Jinjia Zhou, Dajiang Zhou, Xiao Peng, Satoshi Goto


    As the successor to the H.264/AVC video compression standard, High Efficiency Video Coding (HEVC) will play an important role in video coding. In its deblocking filter, HEVC inherits the basic properties of H.264/AVC and adds some new features. Based on these differences, this paper introduces a novel dual-mode deblocking filter architecture that supports both the HEVC and H.264/AVC standards. For HEVC, the proposed filtering scheme based on a symmetric unified-cross unit (SUCU) greatly reduces the design complexity; as a result, processing a 16 x 16 block takes 24 clock cycles. For H.264/AVC, a 16 x 16 macroblock (MB) takes 48 clock cycles. In synthesis, the proposed architecture occupies 41.6k equivalent gates at a frequency of 200 MHz in an SMIC 65 nm library, which satisfies the throughput requirement of Super Hi-Vision (SHV) at 60 fps. With the filter reuse scheme, the unified design for the two standards saves 30% of the gate count in the filter part compared with dedicated designs. In addition, the total power consumption can be reduced by 57.2% with a skipping mode used when edges need not be filtered.


  • An Integrated Hole-Filling Algorithm for View Synthesis

    Wenxin Yu, Weichen Wang, Minghui Wang, Satoshi Goto



    Multi-view video can provide users with three-dimensional (3-D) and virtual reality perception through multiple viewing angles. In recent years, depth image-based rendering (DIBR) has been generally used to synthesize virtual view images in free viewpoint television (FTV) and 3-D video. To conceal the zero-region more accurately and improve the quality of a virtual view synthesized frame, an integrated hole-filling algorithm for view synthesis is proposed in this paper. The proposed algorithm contains five parts: an algorithm for distinguishing different regions, foreground and background boundary detection, texture image isophotes detection, a textural and structural isophote prediction algorithm, and an in-painting algorithm with gradient priority order. Based on the texture isophote prediction with a geometrical principle and the in-painting algorithm with a gradient priority order, the boundary information of the foreground is considerably clearer and the texture information in the zero-region can be concealed much more accurately than in previous works. The vision quality mainly depends on the distortion of the structural information. Experimental results indicate that the proposed algorithm improves not only the objective quality of the virtual image, but also its subjective quality considerably; human vision is also clearly improved based on the subjective results. In particular, the algorithm ensures the boundary contours of the foreground objects and the textural and structural information.


  • All-Zero Block-Based Optimization for Quadtree-Structured Prediction and Residual Encoding in High Efficiency Video Coding

    Guifen Tian, Xin Jin, Satoshi Goto



    High Efficiency Video Coding (HEVC) outperforms H.264 High Profile with bitrate saving of about 43%, mostly because block sizes for hybrid prediction and residual encoding are recursively chosen using a quadtree structure. Nevertheless, the exhaustive quadtree-based partition is not always necessary. This paper takes advantage of all-zero residual blocks at every quadtree depth to accelerate the prediction and residual encoding processes. First, we derive a near-sufficient condition to detect variable-sized all-zero blocks (AZBs). For these blocks, discrete cosine transform (DCT) and quantization can be skipped. Next, using the derived condition, we propose an early termination technique to reduce the complexity for motion estimation (ME). More significantly, we present a two-dimensional pruning technique based on AZBs to constrain prediction units (PU) that contribute negligibly to rate-distortion (RD) performance. Experiments on a wide range of videos with resolution ranging from 416 x 240 to 4k x 2k, show that the proposed scheme can reduce computational complexity for the HEVC encoder by up to 70.46% (50.34% on average), with slight loss in terms of the peak signal-to-noise ratio (PSNR) and bitrate. The proposal also outperforms other state-of-the-art methods by achieving greater complexity reduction and improved bitrate performance.


  • Low Power Video Decoding with Adaptive Granularity in Temporal Scalability

    Wenxin Yu, Xin Jin, Satoshi Goto

    The Journal of the Institute of Image Electronics Engineers of Japan   Vol.42 ( No.2 ) 249 - 261  2013

    This paper proposes a low power video decoding with adaptive granularity in temporal scalability. This proposal can be applied to reduce the computational complexity of H.264/AVC decoder with acceptable loss of the video quality, and make the single layer bit stream sources much more flexible for various terminal devices. Proposed low power decoding process consists of four proposed algorithms, the reference frame index decision algorithm, motion vector composition algorithm, block-partition decision algorithm and the adaptive selecting algorithm for skipped frames. The experiment results show that the reduction rate of the decoding time decreases when the number of skipped frames increases, and the loss of the video quality increases at the same time. The PSNR loss in the B frame skipping is much smaller than the PSNR loss in the P frame skipping. In the fixed frame skipping cases, the 2/3 P frame is skipped with 60% decoding time reduction and 2.73 dB average PSNR loss in the all filling comparison or 1.59 dB average PSNR loss in the corresponding comparison. Analyzing the relation between motion vector information and the video quality loss of the corresponding frames in probability shows that the proposed adaptive skipping scheme reduces quality loss by skipping the frames with slight movements and keeping the frames with strong movements. Based on the adaptive skipping scheme, the average PSNR is improved 0.68 dB in the all filling comparison or 0.60 dB in the corresponding comparison compared with the fixed frame skipping scheme with almost the same reduction rate of the decoding time.


  • Bidirectional Local Template Patterns: An effective and discriminative feature for pedestrian detection

    Jiu Xu, Ning Jiang, Satoshi Goto

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E96-A ( No.6 ) 1195 - 1203  2013


  • Low complexity Merge candidate decision for fast HEVC encoding

    Muchen Li, Keiichi Chono, Satoshi Goto

    IEEE International Conference on Multimedia and Expo (ICME)     1 - 6  2013


  • A 24.5-53.6pJ/pixel 4320p 60fps H.264/AVC intra-frame video encoder chip in 65nm CMOS

    Dajiang Zhou, Gang He, Wei Fei, Zhixiang Chen, Jinjia Zhou, Satoshi Goto

    Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC     73 - 74  2013


    An H.264/AVC intra-frame video encoder is implemented in 65nm CMOS. With an efficient intra prediction design, its maximum throughput reaches 1991Mpixels/s for 7680×4320p 60fps video, 9.4x to 32x faster than previous designs. The encoder also incorporates a 1.41Gbins/s CABAC architecture that has been enhanced by 31%. Moreover, low energy consumption is achieved by the high parallelism and hardware efficiency of this design. 1080p 30fps encoding dissipates only 2mW at 0.8V and 9MHz. © 2013 IEEE.


  • A 6.72-Gb/s, 8pJ/bit/iteration WPAN LDPC Decoder in 65nm CMOS

    Zhixiang CHEN, Xiao PENG, Xiongxin ZHAO, Leona OKAMURA, Dajiang ZHOU, Satoshi GOTO

    Asia and South Pacific Design Automation Conference (ASP-DAC)    2013



  • An FPGA-Based 4K UHDTV H.264/AVC Video Decoder

    Yue Pan, Dajiang Zhou, Satoshi Goto

    IEEE International Conference on Multimedia and Expo (ICME)     1 - 4  2013


  • A mode filtering algorithm for accelerating HEVC FME

    Yiming Cao, Satoshi Goto

    2013 IEEE International Workshop on Multimedia Signal Processing, MMSP 2013     218 - 223  2013


    In the high efficiency video coding (HEVC), motion estimation (ME) is an essential process for inter prediction. Compared to integer ME (IME), fractional ME (FME) occupies much more encoding time in the fast search case. To reduce the computational complexity of FME, we propose a fast algorithm based on mode filtering. In the proposal, according to results of IME and neighboring largest coding units (LCU), 1 or 2 levels are selected from the original 4-level quad-tree structure, and only the selected levels will process FME. Specifically, quad-tree separation is adopted to remove the interdependence between IME and FME. Experiments on a wide range of video sequences show that the proposed algorithm can reduce up to 42.9% and an average of 33.4% total encoding time for HEVC with little loss in PSNR and rate. © 2013 IEEE.


  • Design and Analysis Strategies for 3D Based Multimedia 2k*4k Decoder with Stacked Memory

    Nan Duan, Satoshi Goto

    ITC-CSCC    2013

  • PUF-based One-Time-Pad with Majority Judgment and Dual Integrity Verification

    ITC-CSCC    2013


    Jiayi Zhu, Dajiang Zhou, Gang He, Satoshi Goto


    The upcoming video compression standard, High Efficiency Video Coding (HEVC), reduces bit rates by 50% when encoding video sequences at the same picture quality as H.264/AVC. In the in-loop filter (LF) part of HEVC, sample adaptive offset (SAO) is newly added and the de-blocking filter (DBF) has changed considerably. Constructing a high-speed, low-cost VLSI architecture for HEVC SAO and the de-blocking filter is therefore a challenge.
    In this article, we propose an HEVC LF architecture composed of a fully utilized de-blocking filter and SAO. Block-based SAO and DBF are employed in this architecture to achieve a seamless pipeline between them. The implementation results show that it can be synthesized to 240 MHz with 65 nm technology. Thus this solution can process 3.84 Gpixels/s and support 4320p (7680 x 4320) @ 120 fps decoding.


  • An 115mW 1Gbps Bit-Serial Layered LDPC Decoder for WiMAX

    Xiongxin Zhao, Zhixiang Chen, Xiao Peng, Dajiang Zhou, Satoshi Goto

    IEICE Trans. on Fundamentals: VLSI Design and CAD Algorithms (Accepted)    2013


  • A 6.72-Gb/s, 8pJ/bit/iteration WPAN LDPC Decoder in 65nm CMOS

    Zhixiang CHEN, Xiao PENG, Xiongxin ZHAO, Leona OKAMURA, Dajiang ZHOU, Satoshi GOTO

    Asia and South Pacific Design Automation Conference (ASP-DAC)    2013


  • An FPGA-Based 4K UHDTV H.264/AVC Video Decoder

    Yue Pan, Dajiang Zhou, Satoshi Goto

    IEEE International Conference on Multimedia and Expo (ICME)     1 - 4  2013


  • 4K UHDTV H.264/AVC Video Decoding & Displaying Based on FPGA

    Yue Pan, Dajiang Zhou, Satoshi Goto

    IEEE International Workshop on Multimedia Signal Processing (MMSP)    2013

  • A 995Mpixels/s 0.2nJ/pixel fractional motion estimation architecture in HEVC for Ultra-HD

    Gang He, Dajiang Zhou, Zhixiang Chen, Tianruo Zhang, Satoshi Goto

    Proceedings of the 2013 IEEE Asian Solid-State Circuits Conference, A-SSCC 2013     301 - 304  2013

     View Summary

    This paper presents a fractional motion estimation (FME) design in high efficiency video coding (HEVC) for ultra-high-definition video (Ultra-HD). To reduce complexity and achieve high throughput, the design is co-optimized in algorithm and hardware architecture. Bilinear quarter-pixel approximation, together with a 5T12S search pattern, is proposed to reduce the complexity of the interpolation and search process. Furthermore, we introduce an exhaustive size-Hadamard transform (ES-HAD) to improve coding quality and to determine the best transform size without resorting to complex transform coding. In addition, a data reuse method for ES-HAD is applied to reduce the hardware overhead. The design is implemented in a 65 nm CMOS chip and verified by an FPGA-based evaluation system. It achieves 995 Mpixels/s for 7680×4320 30 fps encoding, at least 4.7 times faster than previous designs. Its power dissipation is 198.6 mW at 188 MHz, with a power efficiency of 0.2 nJ/pixel. Despite the high complexity of HEVC, the chip achieves a 56% improvement in power efficiency over previous works in H.264. © 2013 IEEE.
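
    The throughput and efficiency figures quoted in this summary can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that throughput is frame width × height × frame rate and that the per-pixel figure is power divided by throughput.

```python
def pixel_rate(width, height, fps):
    """Pixels processed per second for a given resolution and frame rate."""
    return width * height * fps


def energy_per_pixel_nj(power_mw, pixels_per_second):
    """Energy per pixel in nanojoules, from power in milliwatts."""
    joules_per_pixel = (power_mw * 1e-3) / pixels_per_second
    return joules_per_pixel * 1e9


rate = pixel_rate(7680, 4320, 30)
print(rate)  # 995328000, i.e. about 995 Mpixels/s
print(round(energy_per_pixel_nj(198.6, rate), 3))  # close to the quoted 0.2 nJ/pixel
```

    Under these assumptions, both the 995 Mpixels/s and the 0.2 nJ/pixel figures follow from the stated resolution, frame rate and power dissipation.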


  • Content adaptive hierarchical decision of variable coding block sizes in high efficiency video coding for high resolution videos

    Guifen Tian, Xin Jin, Satoshi Goto

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   E96-A ( 4 ) 780 - 789  2013

     View Summary

    Quadtree-based variable-block-size prediction makes the biggest contribution to the dramatically improved coding efficiency of the new video coding standard, HEVC. However, this technique accounts for about 75-80% of the computational complexity of an HEVC encoder. This paper proposes an adaptive scheme that exploits temporal, spatial and transform-domain features to speed up the original quadtree-based prediction, targeting high-resolution videos. First, before encoding starts, the utilization ratio of each coding depth is analyzed so that rarely adopted coding depths can be skipped at frame level. Second, a texture complexity (TC) measurement is applied to filter out non-contributing coding blocks for each largest coding unit (LCU). In this step, a dynamic threshold setting approach is proposed to make the filtering adaptable to videos and coding parameters. Third, during the encoding process, the sum of absolute quantized residual coefficients (SAQC) is used as the criterion to prune useless coding blocks for both LCUs and 32 × 32 blocks. With the proposed scheme, motion estimation is performed for prediction blocks within a narrowed range. Experiments show that the proposed scheme outperforms existing works and speeds up the original HEVC encoder by up to 61.89% and by 33.65% on average for 4kx2k video sequences. Meanwhile, the peak signal-to-noise ratio (PSNR) degradation and bit increment are trivial. Copyright © 2013 The Institute of Electronics, Information and Communication Engineers.


  • A 1.59Gpixel/s Motion Estimation Processor with -211-to-211 Search Range for UHDTV Video Encoder

    Jinjia Zhou, Dajiang Zhou, Gang He, Satoshi Goto

    Symposium on VLSI Circuits     287 - 288  2013

  • Lossless Embedded Compression Using Multi-mode DPCM & Averaging Prediction for HEVC-like Video Codec

    Guo Li, Dajiang Zhou, Satoshi Goto

    European Signal Proc. Conf.(EUSIPCO)    2013


    Wenxin Yu, Weichen Wang, Gang He, Satoshi Goto


     View Summary

    A combined hole-filling approach with spatial and temporal prediction is presented in this paper. Depth image-based rendering (DIBR) is generally used to synthesize virtual view images in free viewpoint television (FTV) and three-dimensional (3-D) video. A limited number of original camera views and depth maps are used to generate the additional virtual views in the synthesis process. One of the main problems in DIBR is that some regions are occluded by foreground objects in the original views, and these regions appear as holes in the generated virtual views, especially in view extrapolation cases. The proposed algorithm fills the holes caused by disoccluded regions and inaccurate depth values. It combines spatial and temporal prediction, and its performance is much better and more stable than that of previous work. The experimental results show that the proposed method considerably improves the quality of the virtual views compared with previous work, in both objective and subjective comparisons.


  • Gradient Local Binary Patterns for Human Detection

    Ning Jiang, Jiu Xu, Wenxin Yu, Satoshi Goto


     View Summary

    In recent years, local pattern based features have attracted increasing interest in object detection and recognition systems. The Local Binary Pattern (LBP) feature is widely used in texture classification and face detection, but the original definition of LBP is not suitable for human detection. In this paper, we propose a novel feature set named gradient local binary patterns (GLBP), comprising Original GLBP and Improved GLBP, for human detection. Experiments on the INRIA dataset show that the proposed GLBP feature is more discriminative than histogram of orientated gradient (HOG), histogram of template (HOT) and Semantic Local Binary Patterns (S-LBP) under the same training method. In our experiments, the window size is fixed, which means the performance can be further improved by boosting and cascade methods. Moreover, the computation of the GLBP feature is parallel, which makes it easy to accelerate in hardware. These factors make the GLBP feature feasible for real-time human detection.


  • A High-performance CABAC Encoder Architecture for HEVC and H.264/AVC

    Jinjia Zhou, Dajiang Zhou, Wei Fei, Satoshi Goto

    International Conference on Image Processing(ICIP)     1568 - 1572  2013


  • Joint feature based rain detection and removal from videos

    Xinwei Xue, Xin Jin, Chenyuan Zhang, Satoshi Goto

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   E96-A ( 6 ) 1195 - 1203  2013

     View Summary

    Adverse weather, such as rain or snow, can cause difficulties in the processing of video streams. Because the appearance of raindrops can affect the performance of human tracking and reduce the efficiency of video compression, the detection and removal of rain is a challenging problem in outdoor surveillance systems. In this paper, we propose a new algorithm for rain detection and removal based on both spatial and wavelet domain features. Our system involves fewer frames during detection and removal, and is robust to moving objects in the rain. Experimental results demonstrate that the proposed algorithm outperforms existing approaches in terms of subjective and objective quality. Copyright © 2013 The Institute of Electronics, Information and Communication Engineers.


  • A High-Performance H.264/AVC Intra Prediction Architecture for Ultra-HD Video Applications

    Gang He, Dajiang Zhou, Wei Fei, Zhixiang Chen, Jinjia Zhou, Satoshi Goto

    IEEE Transactions on VLSI   Vol.99  2013


  • Correlated Noise Reduction for Electromagnetic Analysis

    Hongying Liu, Xin Jin, Yukiyasu Tsunoo, Satoshi Goto


     View Summary

    Electromagnetic emissions leak confidential data of cryptographic devices. Electromagnetic Analysis (EMA) exploits such emissions for cryptanalysis. The performance of EMA decreases dramatically when correlated noise, which is caused by the interference of the clock network and exhibits strong correlation with the encryption signal, is present in the acquired EM signal. In this paper, three techniques are proposed to reduce the correlated noise. Based on the observation that the clock signal has a high variance at the signal edges, the first technique, single-sample Singular Value Decomposition (SVD), extracts the clock signal with only one EM sample. The second technique, multi-sample SVD, is capable of suppressing the clock signal with a short sampling length. The third, averaged subtraction, is suitable for estimating correlated noise when background samplings are included. Experiments on the EM signal during AES encryption on FPGA and ASIC implementations demonstrate that the proposed techniques increase SNR by as much as 22.94 dB, and the success rates of EMA show that the data-independent information is retained and the performance of EMA is improved.


  • High-Parallel Performance-Aware LDPC Decoder IP Core Design for WiMAX

    Xiongxin ZHAO, Zhixiang CHEN, Xiao PENG, Dajiang ZHOU, Satoshi GOTO

    IEEE MWSCAS    2013


  • A high-performance CABAC encoder architecture for HEVC and H.264/AVC

    Jinjia Zhou, Dajiang Zhou, Wei Fei, Satoshi Goto

    2013 IEEE International Conference on Image Processing, ICIP 2013 - Proceedings     1568 - 1572  2013

     View Summary

    This paper presents a high-performance context adaptive binary arithmetic coding (CABAC) architecture for next-generation UHDTV applications. Its maximum throughput is enhanced by 31%∼34% with the proposed pre-normalization (prenorm.), hybrid path coverage (HPC), bypass bin splitting (BPBS) and state dual-transition (SDT) schemes. Both the HEVC and H.264/AVC formats are supported by the architecture through a dual-standard binarization design. The proposed CABAC architecture has been silicon-proven in a 65 nm video encoder chip. It delivers 4.27∼4.40 bins/cycle with synthesized and measured clock rates of 401.5 MHz and 330 MHz, respectively. A high performance of 1.452 Gbin/s is therefore achieved for real-time UHDTV encoding. © 2013 IEEE.
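
    The quoted 1.452 Gbin/s appears to follow from the upper bins-per-cycle figure and the measured clock rate. The short sketch below shows that arithmetic; the function name is hypothetical, and pairing 4.40 bins/cycle with the 330 MHz measured clock is an assumption about how the figure was derived.

```python
def bin_rate_gbin_per_s(bins_per_cycle, clock_mhz):
    """Bin throughput in Gbin/s from bins per cycle and clock rate in MHz."""
    return bins_per_cycle * clock_mhz / 1000.0


# Upper and lower bins/cycle figures at the measured 330 MHz clock.
print(round(bin_rate_gbin_per_s(4.40, 330), 3))
print(round(bin_rate_gbin_per_s(4.27, 330), 3))
```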


  • A dual mode De-Blocking filter design for HEVC and H.264/AVC

    Muchen Li, Jinjia Zhou, Xiao Peng, Dajiang Zhou, Satoshi Goto

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   E96-A ( 6 ) 1366 - 1375  2013


  • Low complexity Merge candidate decision for fast HEVC encoding

    Muchen Li, Keiichi Chono, Satoshi Goto

    IEEE International Conference on Multimedia and Expo (ICME)     1 - 6  2013


  • Multi-scale Bidirectional Local Template Patterns for Real-time Human Detection

    Jiu Xu, Ning Jiang, Xinwei Xue, Heming Sun, Wenxin Yu, Satoshi Goto


     View Summary

    In this paper, a feature named multi-scale bidirectional local template patterns (MBLTP) is proposed for human detection. As an extension of bidirectional local template patterns (BLTP), MBLTP not only integrates the textural and gradient information according to the four predefined templates but also calculates information for additional feature vectors by adjusting the scale of the training samples. These additional feature vectors contain multi-scale information on the samples, which can make the feature more discriminative than its original form. Experimental results on the INRIA dataset show that the detection rate of the proposed MBLTP feature exceeds those of other features such as the multi-level histogram of orientated gradient (multi-level HOG), multi-scale block histogram of template (MB-HOT), and HOG-LBP. Moreover, to make the feature meet real-time requirements, an implementation based on a graphics processing unit (GPU) is adopted to accelerate the calculation.


  • A 115 mW 1 Gbps Bit-Serial Layered LDPC Decoder for WiMAX

    Xiongxin Zhao, Xiao Peng, Zhixiang Chen, Dajiang Zhou, Satoshi Goto


     View Summary

    Structured quasi-cyclic low-density parity-check (QC-LDPC) codes have been adopted in many wireless communication standards, such as WiMAX, Wi-Fi and WPAN. To fully support variable code rate (multi-rate) and variable code length (multi-length) implementation for universal applications, the partial-parallel layered LDPC decoder architecture is straightforward and widely used in decoder design. In this paper, we propose a highly parallel LDPC decoder architecture for WiMAX systems with a dedicated ASIC design. Unlike the block-by-block decoding schedule in most partial-parallel layered architectures, all the messages within each layer are updated simultaneously in the proposed fully-parallel layered decoder architecture. Meanwhile, the message updating is performed in bit-serial style to reduce hardware complexity. A 6-bit implementation is adopted in the decoder chip, since simulations demonstrate that 6-bit quantization is the best trade-off between performance and complexity. Moreover, a two-layer concurrent processing technique is proposed to further increase the parallelism for low code rates. Implementation results show that the decoder chip saves 22.2% of storage bits and takes only 24∼48 clock cycles per iteration for all the code rates defined in the WiMAX standard. It occupies 3.36 mm² in the SMIC 65 nm CMOS process, and achieves 1056 Mbps throughput at 1.2 V, 110 MHz and 10 iterations with 115 mW power consumption, which corresponds to a power efficiency of 10.9 pJ/bit/iteration. The power efficiency is improved by 63.6% in normalized comparison with the state-of-the-art WiMAX LDPC decoder.
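
    The power-efficiency figure in this summary can be reproduced from the quoted power, throughput and iteration count. The sketch below is a minimal check; the function name is hypothetical, and it assumes the metric is power divided by throughput and then by the number of iterations.

```python
def energy_pj_per_bit_per_iteration(power_mw, throughput_mbps, iterations):
    """Energy per decoded bit per iteration, in picojoules."""
    joules_per_bit = (power_mw * 1e-3) / (throughput_mbps * 1e6)
    return joules_per_bit / iterations * 1e12


# 115 mW, 1056 Mbps and 10 iterations, as quoted in the summary.
print(round(energy_pj_per_bit_per_iteration(115, 1056, 10), 2))
```

    Under that assumption the result rounds to the 10.9 pJ/bit/iteration stated above.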


  • Pedestrian Detection Using Gradient Local Binary Patterns

    Ning Jiang, Jiu Xu, Satoshi Goto


     View Summary

    In recent years, local pattern based features have attracted increasing interest in object detection and recognition systems. The Local Binary Pattern (LBP) feature is widely used in texture classification and face detection, but the original definition of LBP is not suitable for human detection. In this paper, we propose a novel feature named gradient local binary patterns (GLBP) for human detection. In this feature, the original 256 local binary patterns are reduced to 56 patterns. These 56 patterns, named uniform patterns, are used to generate a 56-bin histogram. The gradient value of each pixel is used as its weight when computing the values of the 56 bins, whereas LBP-based features always use the same weight in the histogram calculation. Experiments on the INRIA dataset show that the proposed GLBP feature is more discriminative than histogram of orientated gradient (HOG), Semantic Local Binary Patterns (S-LBP) and histogram of template (HOT). In our experiments, the window size is fixed, which means the performance can be further improved by boosting methods. Moreover, the computation of the GLBP feature is parallel, which makes it easy to accelerate in hardware. These factors make the GLBP feature feasible for real-time pedestrian detection.
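
    The reduction from 256 codes to 56 and the gradient-weighted histogram can be illustrated with a short sketch. It is an assumption of this sketch, not something the summary states, that the 56 patterns are the 8-bit codes with exactly two circular 0/1 transitions (one run of 1s and one run of 0s). The names `transitions`, `TWO_TRANSITION_CODES` and `weighted_histogram` are hypothetical.

```python
def transitions(code, bits=8):
    """Count 0/1 changes when the bits of `code` are read as a circle."""
    count = 0
    for i in range(bits):
        a = (code >> i) & 1
        b = (code >> ((i + 1) % bits)) & 1
        count += a != b
    return count


# Assumed definition of the 56 patterns: exactly two circular transitions.
TWO_TRANSITION_CODES = [c for c in range(256) if transitions(c) == 2]
BIN_INDEX = {code: i for i, code in enumerate(TWO_TRANSITION_CODES)}


def weighted_histogram(codes, gradient_magnitudes):
    """Accumulate each pixel's gradient magnitude into the bin of its code.

    Pixels whose code is not one of the selected patterns are ignored.
    """
    hist = [0.0] * len(TWO_TRANSITION_CODES)
    for code, weight in zip(codes, gradient_magnitudes):
        if code in BIN_INDEX:
            hist[BIN_INDEX[code]] += weight
    return hist


print(len(TWO_TRANSITION_CODES))  # 56
```

    Counting confirms that this assumed definition yields 56 bins: the run of 1s has 8 possible starting positions and 7 possible lengths.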


  • Encoder-Unconstrained User Interactive Partial Decoding Scheme

    Chen Liu, Xin Jin, Tianruo Zhang, Satoshi Goto


     View Summary

    High-definition (HD) videos have become increasingly popular on portable devices in recent years. Because of the resolution mismatch between HD video sources and the relatively low-resolution screens of portable devices, HD videos are usually fully decoded and then down-sampled (FDDS) for display, which not only increases the cost in both computational power and memory bandwidth, but also loses the details of the video content. In this paper, an encoder-unconstrained partial decoding scheme for H.264/AVC is presented to solve the problem by decoding only the region related to the object of interest (OOI), which is defined by users. A simplified compression-domain tracking method is utilized to ensure that the OOI is located in the center of the display area. The decoded partial area (DPA) adaptation, reference block relocation (RBR) and co-located temporal Intra prediction (CTIP) methods are proposed to improve the visual quality of the DPA with low complexity. The simulation results show that the proposed partial decoding scheme provides an average decoding time reduction of 50.16% compared to the full decoding process. The displayed region also presents the original HD granularity of the OOI. The proposed partial decoding scheme is especially useful for displaying HD video on devices for which battery life is a crucial factor.


  • Framework of a Contour Based Depth Map Coding Method

    Minghui Wang, Xun He, Xin Jin, Satoshi Goto


     View Summary

    Stereo-view and multi-view video formats are heavily investigated topics given their vast application potential. The Depth Image Based Rendering (DIBR) system has been developed to improve Multiview Video Coding (MVC). In this system, a depth image is introduced to synthesize virtual views on the decoder side. A depth image is a piecewise image consisting of sharp contours and smooth interiors. Contours in a depth image are more important than interiors in the view synthesis process. To improve the quality of the synthesized views and reduce the bitrate of the depth image, a contour-based coding strategy is proposed. First, the depth image is divided into layers by different depth value intervals. Then regions, which are defined as the basic coding unit in this work, are segmented from each layer. Each region is further divided into the contour and the interior. Two different procedures are employed to code contours and interiors respectively. A vector-based strategy is applied to code the contour lines. Straight lines in contours cost few bits since they are regarded as vectors. Pixels that lie outside straight lines are coded one by one. Depth values in the interior of a region are modeled by a linear or nonlinear formula, whose coefficients are obtained by regression. This process is called interior painting. Unlike conventional block-based coding methods, the residue between the original frame and the reconstructed frame (obtained by contour rebuilding and interior painting) is not sent to the decoder. In this proposal, the contour is coded losslessly whereas the interior is coded lossily. Experimental results show that the proposed Contour Based Depth map Coding (CBDC) achieves better performance than JMVC (the reference software of MVC) in high-quality scenarios.


  • Advanced error concealment for 1Seg video communication considering error propagation inside a frame

    Jun Wang, Satoshi Goto


     View Summary

    Transmission of compressed video over error-prone channels may result in packet losses or errors, which can significantly degrade the image quality. Such degradation becomes even worse in 1Seg video broadcasting, which is widely used in Japan and Brazil for mobile phone TV services recently, where errors are drastically increased and the lost areas are contiguous. Therefore, the errors in the earlier concealed MBs (macro blocks) may propagate to the MBs later to be concealed inside the same frame (spatial domain). The error concealment (EC) is used to recover the lost data by the redundancy in videos. Aiming at spatial error propagation (SEP) reduction, this paper proposes an SEP-reduction-based EC (SEPEC). In SEPEC, besides the mismatch distortion in the current MB, the potential propagated mismatch distortion in the following to be concealed MBs is also minimized. Also, two extensions of SEPEC, i.e. SEPEC with refined search and SEPEC with multiple layer match, are discussed. Compared with previous work, the experiments show SEPEC achieves much better performance of video recovery and excellent trade-off between quality and computation in 1Seg broadcasting. (C) 2012 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.


  • Hilbert Transform-Based Workload Prediction and Dynamic Frequency Scaling for Power-Efficient Video Encoding

    Xin Jin, Satoshi Goto


     View Summary

    With the popularity of mobile devices with embedded video cameras, real-time video encoding on hand-held devices is becoming increasingly common. Reducing the power consumption of real-time video encoding, so as to extend battery life at the same encoding performance, is very important for improving the quality of service. Although some workload estimation techniques have been developed for video decoding to reduce power consumption in video playback applications, they cannot be transferred efficiently to video encoding because the compressed information cannot be retrieved before encoding and the future input video content is often nondeterministic. In this paper, a workload estimation scheme targeting video encoding applications is proposed. Based on the definition of video encoding workload and an analysis of its features, a Hilbert transform-based workload estimation model is proposed to predict the overall variation trend of the encoding workload and so overcome workload fluctuations and nondeterministic content variations, e.g., burst motion. The effectiveness of the proposed algorithm is demonstrated on two H.264/AVC encoders on a PC and an embedded platform by encoding different video contents at different bit rates. The proposed algorithm provides a negligible deadline missing ratio of around 4.8%, which is much lower than that of previous solutions, together with platform and content robustness. Compared with previous solutions, the proposed algorithm provides up to 61.69% power reduction under the same performance constraint.


  • An advanced hierarchical motion estimation scheme with lossless frame recompression and early-level termination for beyond high-definition video coding

    Xuena Bao, Dajiang Zhou, Peilin Liu, Satoshi Goto

    IEEE Transactions on Multimedia   14 ( 2 ) 237 - 249  2012.04

     View Summary

    In this paper, we present a hardware-efficient fast algorithm with a lossless frame recompression scheme and early-level termination strategy for large search range (SR) motion estimation (ME) utilized in beyond high-definition video encoder. To achieve high ME quality for hierarchical motion search, we propose an advanced hierarchical ME scheme which processes the multiresolution motion search with an efficient refining stage. This enables high data and hardware reuse for much lower bandwidth and memory cost, while achieving higher ME quality than previous works. In addition, a lossless frame recompression scheme based on this ME algorithm is presented to further reduce bandwidth. A hierarchical memory organization as well as a leveling two-step data fetching strategy is applied to meet the constraint of random access for the hierarchical motion search structure. Also, a leveling compression strategy that allows a lower level to refer to a higher one for compression is proposed to efficiently reduce the bandwidth. Furthermore, an early-level termination method suitable for the hierarchical ME structure is also applied. This method terminates redundant high-level motion searches by establishing thresholds based on the current block mode and motion search level; it also applies early refinement termination in order to avoid unnecessary refinement for high levels. Experimental results show that the total scheme has a much lower bit rate increase compared with previous works, especially for high motion sequences, while achieving a considerable saving of memory and bandwidth cost for a large SR of [-128,127]. © 2011 IEEE.


  • Cluster Generation and Network Component Insertion for Topology Synthesis of Application-Specific Network-on-Chips

    Wei Zhong, Takeshi Yoshimura, Bei Yu, Song Chen, Sheqin Dong, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E95C ( 4 ) 534 - 545  2012.04

     View Summary

    Network-on-Chips (NoCs) have been proposed as a solution for addressing the global communication challenges in System-on-Chip (SoC) architectures that are implemented in nanoscale technologies. For the use of NoCs to be feasible in today's industrial designs, a custom-tailored, power-efficient NoC topology that satisfies the application characteristics is required. In this work, we present a design methodology that automates the synthesis of such application-specific NoC topologies. We present a method which integrates partitioning into the floorplanning phase to explore optimal clustering of cores during floorplanning with minimized link and switch power consumption. Based on the size of applications, we also present an Integer Linear Programming method and a heuristic method to place switches and network interfaces on the floorplan. Then, a power and timing aware path allocation algorithm is carried out to determine the connectivity across different switches. We perform experiments on several SoC benchmarks and present a comparison with the latest work. For small applications, the NoC topologies synthesized by our method show large improvements in power consumption (27.54%), hop-count (4%) and running time (66%) on average. For large applications, the synthesized topologies show large improvements in power consumption (31.77%), hop-count (29%) and running time (94.18%) on average.



  • Linear optimal one-sided single-detour algorithm for untangling twisted bus

    Tao Lin, Sheqin Dong, Song Chen, Satoshi Goto

    ASP-DAC 2012     151 - 156  2012

  • Adaptive Coding Unit Size Decision Algorithm for HEVC Intra Coding

    Guifen Tian, Satoshi Goto

    Picture Coding Symposium(PCS2012)    2012


  • De-blocking filter for HEVC with skipping mode

    Muchen Li, Jinjia Zhou, Xiao Peng, Satoshi Goto

    International Technical Conference on Circuits/Systems, Computers and Communications(ITC-CSCC2012)    2012

  • A low-complexity HEVC intra prediction algorithm based on level and mode filtering

    Heming Sun, Dajiang Zhou, Satoshi Goto

    IEEE International Conference on Multimedia and Expo (ICME2012)    2012


  • Human Detection Algorithm Based on Hybrid Information in Images

    Hiroki Ichikawa, Satoshi Goto

    Image Electronics and Visual Computing Workshop (IEVC2012)(Accepted)    2012

  • DVB-T2 LDPC Decoder with Perfect Conflict Resolution

    Xiongxin Zhao, Zhixiang Chen, Xiao Peng, Dajiang Zhou, Satoshi Goto

    IPSJ   5   23 - 31  2012


  • LDPC Coded MIMO Cooperative Communication System With Relay Selection

    Nanfan Qiu, Xiao Peng, Yichao Lu, Satoshi Goto

    SASIMI    2012

  • An Optimized MC Interpolation Architecture for HEVC

    Guo Zhengyan, Dajiang Zhou, Satoshi Goto

    37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012)    2012


  • DVB-T2 LDPC decoder with perfect conflict resolution

    Xiongxin Zhao, Zhixiang Chen, Xiao Peng, Dajiang Zhou, Satoshi Goto

    2012 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2012 - Proceedings of Technical Papers    2012

     View Summary

    In this paper we focus on the resolution of the message updating conflict problem in layered algorithm for DVB-T2 LDPC decoders. Unlike the previous resolutions, we directly implement the layered algorithm without modifying the parity-check matrices (PCM) or the decoding algorithm. DVB-T2 LDPC decoder architecture is also proposed in this paper with two new techniques which guarantee conflict-free layered decoding. The PCM Rearrange technique reduces the number of conflicts and eliminates all of data dependency problems between layers to ensure high pipeline efficiency. The Layer Division technique deals with all remaining conflicts with a well-designed decoding schedule. Experiment results show that compared to state-of-the-art works we achieve a slight error-correcting performance gain for DVB-T2 LDPC codes. © 2012 IEEE.


  • Electromagnetic Analysis Enhancement Based on Near-Field Scan

    Hongying Liu, Yukiyasu Tsunoo, Yibo Fan, Bin Hu, Satoshi Goto

    Journal of Signal Processing   16 ( 3 ) 241 - 250  2012


  • Pedestrian Detection Using Gradient Local Binary Patterns

    Ning Jiang, Jiu Xu, Satoshi Goto

    IEICE TRANS. Fundamentals   E95-A ( 8 ) 1280 - 1287  2012


  • A 1991 Mpixels/s Intra Prediction Architecture for Super Hi-Vision H.264/AVC Encoder

    Gang He, Dajiang Zhou, Jinjia Zhou, Satoshi Goto

    European Signal Processing Conference (EUSIPCO2012)    2012

  • Interlaced asymmetric search range assignment for bidirectional motion estimation

    Jinjia Zhou, Dajiang Zhou, Satoshi Goto

    International Conference on Image Processing (ICIP2012)    2012


  • Compression Algorithm for Multi-View Video Coding Using a Depth Map

    Yohei Matsui, Satoshi Goto

    Image Electronics and Visual Computing Workshop (IEVC2012)(Accepted)    2012

  • Envelope detection based workload prediction for partial decoding scheme

    Chen Liu, Xin Jin, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems, Proceedings, APCCAS     53 - 56  2012

     View Summary

    In this paper, a novel envelope-detection-based workload prediction method is proposed. The workload estimate is based on the envelope of the difference between adjacent previous actual workloads, and the envelope is obtained by taking the absolute value and applying a low-pass filter. To further improve the performance of the envelope detection, negative truncation is introduced, which ignores negative differences. The method is proposed for the partial decoding scheme and works together with Dynamic Voltage Frequency Scaling (DVFS) to reduce the power consumption of H.264/AVC video decoding. It provides accurate estimates with a deadline miss rate of only 0.66%, which is much lower than that of previous methods. Power measurements on the evaluation board show that, with the proposed workload prediction and DVFS, power is reduced by about 41.65% and energy by about 10.16%. Partial decoding combined with the proposed workload prediction and DVFS reduces energy by about 61.15% compared to full decoding without DVFS. © 2012 IEEE.
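The envelope step in the summary above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the first-order low-pass filter, the smoothing factor `alpha`, and the rule that the prediction equals the last actual workload plus the envelope are all hypothetical choices made for the example.

```python
def predict_workloads(workloads, alpha=0.5, truncate_negative=True):
    """Predict the next workload after each observed one.

    The envelope is a first-order low-pass filter (assumed form) over the
    rectified difference between adjacent actual workloads. With
    truncate_negative=True, negative differences are ignored; otherwise
    the absolute value is used.
    """
    predictions = []
    envelope = 0.0
    for i in range(1, len(workloads)):
        diff = workloads[i] - workloads[i - 1]
        rectified = max(diff, 0.0) if truncate_negative else abs(diff)
        envelope = alpha * envelope + (1.0 - alpha) * rectified
        # Assumed combination rule: last actual workload plus the envelope.
        predictions.append(workloads[i] + envelope)
    return predictions


if __name__ == "__main__":
    history = [10, 12, 11, 15]  # hypothetical per-frame workloads
    print(predict_workloads(history))
    print(predict_workloads(history, truncate_negative=False))
```

Adding the envelope as a margin on top of the last workload biases the prediction upward, which is one plausible way to keep the deadline miss rate low when the result drives DVFS.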


  • Enhanced Moving Object Detection Using Tracking System for Video Surveillance Purposes

    Axel Beaugendre, Chenyuan Zhang, Jiu Xu, Satoshi Goto

    Visual Communication and Image Processing (VCIP2012)    2012


  • An optimization scheme for quadtree-structured prediction and residual encoding in HEVC

    Guifen Tian, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems, Proceedings, APCCAS     547 - 550  2012

     View Summary

    In High Efficiency Video Coding (HEVC), block sizes for hybrid prediction and residual encoding are recursively selected using a quadtree structure. This technique incurs enormous computational complexity, yet the exhaustive quadtree-based partition is not always necessary. This paper exploits all-zero residual blocks to accelerate the HEVC encoding process. A near-sufficient condition is derived to detect variable-sized all-zero blocks. For these blocks, DCT and quantization can be skipped. Moreover, a novel PU pruning technique based on all-zero blocks is presented to constrain prediction units (PUs) that contribute little to RD performance. Experiments on a wide range of videos show that the proposed scheme reduces the computational complexity of the HEVC encoder by up to 73.42%, and by 53.37% on average, with only trivial loss in PSNR and rate. © 2012 IEEE.


  • De-blocking filter for HEVC with skipping mode

    Muchen Li, Jinjin Zhou, Xiao Peng, Satoshi Goto

    International Technical Conference on Circuits/Systems, Computers and Communications(ITC-CSCC)    2012

  • Interlaced asymmetric search range assignment for bidirectional motion estimation

    Jinjia Zhou, Dajiang Zhou, Satoshi Goto

    Proceedings - International Conference on Image Processing, ICIP     1557 - 1560  2012

     View Summary

    Bidirectional motion estimation significantly enhances video coding efficiency, but its huge complexity is also a critical problem for implementation. This paper presents an interlaced asymmetric search range assignment (IASRA) algorithm. By applying a large and a small search range to the two reference directions and switching the assignment of these two search ranges once per macroblock, the total complexity of bidirectional motion estimation can be reduced by nearly half with a slight drop in coding efficiency. IASRA also has the flexibility to be combined with existing fast algorithms and architectures for further complexity saving. We demonstrate this feature by combining IASRA with the state-of-the-art IMNPDR and PMRME architectures, which results in 33% to 42% complexity reduction with less than 1% bit rate increase. © 2012 IEEE.
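The alternating assignment described in the IASRA summary can be illustrated with a short sketch. It is not the authors' implementation: the range values (32 and 8), the function names, and the full-search cost model (candidates proportional to the window area) are assumptions made for the example.

```python
def assign_search_ranges(mb_index, large=32, small=8):
    """Return (forward_range, backward_range) for one macroblock.

    The large and small ranges swap reference directions on every
    macroblock. The values 32 and 8 are hypothetical.
    """
    if mb_index % 2 == 0:
        return large, small
    return small, large


def full_search_candidates(search_range):
    """Candidate positions in a square window of +/- search_range (assumed cost model)."""
    return (2 * search_range + 1) ** 2


def relative_complexity(large=32, small=8):
    """Cost of one large plus one small window, relative to two large windows."""
    asymmetric = full_search_candidates(large) + full_search_candidates(small)
    symmetric = 2 * full_search_candidates(large)
    return asymmetric / symmetric


if __name__ == "__main__":
    for mb in range(4):
        print(mb, assign_search_ranges(mb))
    print(round(relative_complexity(), 3))
```

Under this assumed cost model the asymmetric assignment costs a little over half of searching both directions with the large range, which is in line with the "nearly half" reduction quoted in the summary.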


  • Compression Algorithm for Multi-View Video Coding Using a Depth Map

    Yohei Matsui, Satoshi Goto

    Image Electronics and Visual Computing Workshop (IEVC2012)    2012

  • DVB-T2 LDPC Decoder with Perfect Conflict Resolution

    Xiongxin Zhao, Zhixiang Chen, Xiao Peng, Dajiang Zhou, Satoshi Goto

    IPSJ   5   23 - 31  2012


  • LDPC Coded MIMO Cooperative Communication System With Relay Selection

    Nanfan Qiu, Xiao Peng, Yichao Lu, Satoshi Goto

    SASIMI    2012

  • Stereo Matching with Pixel Classification and Reliable Disparity Propagation

    Weichen Wang, Satoshi Goto

    International Symposium on Circuits and Systems (ISCAS2012)    2012


  • Leakage-aware performance-driven TSV-planning based on network flow algorithm in 3D ICs.

    Kan Wang, Sheqin Dong, Yuchun Ma, Satoshi Goto, Jason Cong

    ISQED 2012     129 - 136  2012


  • Secret recovery from electromagnetic emissions

    Hongying Liu, Yibo Fan, Satoshi Goto

    Advanced Science Letters   7   182 - 186  2012

     View Summary

    Electromagnetic emissions leak confidential data of cryptographic devices. The electromagnetic emission has been reported as an important side channel for cryptanalysis. Electromagnetic Analysis (EMA) exploits the external radiation of cryptographic devices during encryption to reveal secret keys. The performance of EMA depends on the acquired signals to a large extent. To protect the devices from attacks, noises are introduced in the side channel either by unintentional interference from surroundings or elaborate design from engineers. Thus the secret recovery becomes difficult and even unavailable. In this paper, we propose two signal processing techniques to counteract both of these noises. The bandpass filtering and independent component analysis are widely used in other areas. We demonstrate their applications to EMA against the encryption algorithms on application-specific integrated circuit. With these techniques, the secret keys are extracted successfully and rapidly. © 2012 American Scientific Publishers. All Rights Reserved.


  • A 4320p 60fps H.264/AVC intra-frame encoder chip with 1.41Gbins/s CABAC

    Dajiang Zhou, Gang He, Wei Fei, Zhixiang Chen, Jinjia Zhou, Satoshi Goto

    IEEE Symposium on VLSI Circuits, Digest of Technical Papers     154 - 155  2012

     View Summary

    An H.264/AVC intra-frame video encoder is implemented in 65nm CMOS. With an efficient intra prediction design, its maximum throughput reaches 1991Mpixels/s for 7680x4320p 60fps video, 9.4x to 32x faster than previous designs. The encoder also incorporates a 1.41Gbins/s CABAC architecture that has been enhanced by 31%. Moreover, low energy consumption is achieved by the high parallelism and hardware efficiency of this design. 1080p30 encoding dissipates only 2mW at 0.8V and 9MHz. © 2012 IEEE.
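The throughput figure in the summary above follows directly from the frame size and frame rate. The short check below reproduces it; the 1920x1080 frame size used for the 1080p30 case is an assumption (an encoder may pad the height to a multiple of 16), and the function name is illustrative.

```python
def pixel_rate(width, height, fps):
    """Pixels per second for a given frame size and frame rate."""
    return width * height * fps


shv_rate = pixel_rate(7680, 4320, 60)
hd_rate = pixel_rate(1920, 1080, 30)  # assumed visible frame size for 1080p30

# Megapixels per second needed for 7680x4320 at 60 fps.
shv_mpixels = shv_rate / 1e6

# Average pixels handled per clock cycle for 1080p30 at 9 MHz.
hd_pixels_per_cycle = hd_rate / 9e6

if __name__ == "__main__":
    print(f"{shv_mpixels:.1f} Mpixels/s")
    print(f"{hd_pixels_per_cycle:.2f} pixels/cycle")
```

The first value rounds to the 1991 Mpixels/s quoted in the summary; the second suggests that roughly seven pixels are processed per cycle on average in the low-power 1080p30 mode.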


  • A low-complexity HEVC intra prediction algorithm based on level and mode filtering

    Heming Sun, Dajiang Zhou, Satoshi Goto

    Proceedings - IEEE International Conference on Multimedia and Expo     1085 - 1090  2012

     View Summary

    HEVC achieves a better coding efficiency relative to prior standards, but also involves increased complexity. For intra prediction, complexity is especially intensive due to a highly flexible coding unit structure and a large number of prediction modes. This paper presents a low-complexity intra prediction algorithm for HEVC. A fast preprocessing stage based on a simplified cost model is proposed. Based on its results, a level filtering scheme reduces the number of prediction unit levels that require fine processing from 5 to 2. To supply level filtering decision with appropriate thresholds, a fast training method is also designed. A mode filtering scheme further reduces the maximum number of angular modes to be evaluated from 34 to 9. Complexity reduction from HM 3.0 is over 50% and stable for various sequences, which makes the proposed algorithm suitable for real-time applications. The corresponding bit rate increase is lower than 2.5%. © 2012 IEEE.


  • A Comparative Study of Electric Side Channel and Magnetic Side Channel

    Hongying Liu, Yibo Fan, Yukiyasu Tsunoo, Satoshi Goto

    International Workshop on Information Security Applications(WISA2012)    2012

  • Distributed Punctured LDPC Coding Scheme Using Shuffled Decoding For MIMO Relay Channels

    Nanfan Qiu, Xiao Peng, Yichao Lu, Satoshi Goto

    European Signal Processing Conference (EUSIPCO2012)    2012

  • Enhanced Moving Object Detection Using Tracking System for Video Surveillance Purposes

    Axel Beaugendre, Chenyuan Zhang, Jiu Xu, Satoshi Goto

    Visual Communication and Image Processing (VCIP2012)(Accepted)    2012


  • Edge Detection Algorithm for Depth Map Generation in Free View Video

    Akira Inoue, Satoshi Goto

    Image Electronics and Visual Computing Workshop (IEVC2012)(Accepted)    2012

  • Envelope Detection Based Workload Prediction for Partial Decoding Scheme

    Chen LIU, Xin Jin, Satoshi Goto

    IEEE Asia Pacific Conference on Circuits and Systems (APCCAS2012)    2012


  • Motion Robust Rain Detection and Removal from Videos

    Xinwei Xue, Xin Jin, Chenyuan Zhang, Satoshi Goto


     View Summary

    Weather such as rain and snow makes captured videos difficult to process. Because rain drops can degrade human tracking and reduce the efficiency of video compression, detecting and removing rain is a challenging problem in outdoor surveillance systems. In this paper, we propose a new algorithm for rain detection based on joint spatial and wavelet domain features. The approach is robust to videos with moving objects in the rain. Experimental results demonstrate better subjective quality than existing approaches.


  • An integrated Hole-filling algorithm for view synthesis

    Wenxin Yu, Weichen Wang, Zhengyan Guo, Satoshi Goto

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   7674   80 - 92  2012

     View Summary

    Multi-view video gives users a 3-D, virtual-reality perception through its multiple viewing angles. In recent years, depth image-based rendering (DIBR) has been widely used to synthesize virtual view images in free viewpoint television (FTV) and three-dimensional (3-D) video. To conceal the zero-region more accurately and improve the quality of the synthesized virtual view frame, an integrated hole-filling algorithm for view synthesis is proposed in this paper. It contains five parts: a region distinguishing algorithm, foreground and background boundary detection, texture image isophote detection, a textural and structural isophote prediction algorithm, and an in-painting algorithm with gradient priority order. Based on the texture isophote prediction with geometrical principle and the in-painting algorithm with gradient priority order, the boundary of the foreground is much clearer and the texture in the zero-region is concealed much more accurately than in previous work. The visual quality mainly depends on the distortion of the structural information. Experimental results show that the proposed algorithm improves both the objective and the subjective quality of the virtual image. In particular, it preserves the boundary contours of the foreground objects as well as the textural and structural information. © 2012 Springer-Verlag.


  • An Optimization Scheme for Quadtree-Structured Prediction and Residual Encoding in HEVC

    Guifen Tian, Satoshi Goto

    IEEE Asia Pacific Conference on Circuits and Systems (APCCAS2012)    2012


  • Pedestrian Detection Based on Bidirectional Local Template Patterns

    Jiu Xu, Ning Jiang, Satoshi Goto

    European Signal Processing Conference (EUSIPCO)    2012

  • Edge Detection Algorithm for Depth Map Generation in Free View Video

    Akira Inoue, Satoshi Goto

    Image Electronics and Visual Computing Workshop (IEVC2012)    2012

  • An Efficient Majority-Logic Based Message-Passing Algorithm for Non-Binary LDPC Decoding

    Yichao Lu, Xiao Peng, Zhixiang Chen, Satoshi Goto

    IEEE Asia Pacific Conference on Circuits and Systems (APCCAS 2012)    2012


  • An Optimized MC Interpolation Architecture for HEVC

    Guo Zhengyan, Dajiang Zhou, Satoshi Goto

    37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012)    2012


  • Pedestrian Detection Based on Bidirectional Local Template Patterns

    Jiu Xu, Ning Jiang, Satoshi Goto

    European Signal Processing Conference (EUSIPCO2012)    2012

  • Motion robust rain detection and removal from videos

    Xinwei Xue, Xin Jin, Chenyuan Zhang, Satoshi Goto

    IEEE International Workshop on Multimedia Signal Processing (MMSP2012)    2012


  • An Optimization Scheme for Quadtree-Structured Prediction and Residual Encoding in HEVC

    Guifen Tian, Satoshi Goto

    IEEE Asia Pacific Conference on Circuits and Systems (APCCAS2012)(Accepted)    2012


  • Linear optimal one-sided single-detour algorithm for untangling twisted bus

    Tao Lin, Sheqin Dong, Song Chen, Satoshi Goto

    Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC     151 - 156  2012

     View Summary

    We consider the one-sided single-detour problem of untangling twisted nets for printed circuit board bus routing. An optimal dynamic-programming-based O(n^3) algorithm was proposed in previous work, where n is the number of nets. In this paper, we propose an optimal O(n) untangling algorithm that does not consider capacity, and we further modify it to consider capacity. Experimental results show that our algorithms run much faster than the previous work owing to their low time complexity. © 2012 IEEE.


  • Linear optimal one-sided single-detour algorithm for untangling twisted bus

    Tao Lin, Sheqin Dong, Song Chen, Satoshi Goto

    ASPDAC2012     151 - 156  2012

  • A KLT-Based Approach for Occlusion Handling in Human Tracking

    Chenyuan Zhang, Jiu Xu, Axel Beaugendre, Satoshi Goto

    2012 PICTURE CODING SYMPOSIUM (PCS)     337 - 340  2012

     View Summary

    Occlusions significantly affect the results of human tracking. This paper proposes a novel occlusion detection and handling algorithm based mainly on the KLT (Kanade-Lucas-Tomasi) method. Instead of using KLT as a tracker, we apply it to occlusion detection to enhance tracking stability. A method combining particle filter tracking with occlusion detection is proposed. Depending on the detection result, our method decides whether to update the appearance model and whether to apply the occlusion handling strategy. Our occlusion detector combines color information, the KLT feature tracker, and the directions of feature points. Additionally, the occlusion handling strategy is based on information from the detection. Moreover, the algorithm can also solve the drift problem. Experimental results on well-known datasets show that our method achieves better performance and robustness in occlusion detection and handling.


  • Voltage island-driven power optimization for application specific network-on-chip design

    Kan Wang, Sheqin Dong, Satoshi Goto

    Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI     171 - 176  2012

     View Summary

    In this paper, a voltage island aware framework is proposed for low power design of application specific NoC (LPASNoC). Through a three-phase processing including voltage island generation, VI-driven floorplanning and post-floorplan processing, the total power consumption, design cost and total wire length can be optimized. Experimental results show that compared to traditional ASNoC, the proposed method can reduce total core power by about 34.5% and chip area by about 26.8% without increasing communication power. Copyright 2012 ACM.


  • All Zero Block Detection Algorithms for Residual Quadtree Encoding in HEVC

    Guifen Tian, Satoshi Goto

    International Technical Conference on Circuits/Systems, Computers and Communications(ITC-CSCC2012)    2012


  • Human Detection Algorithm Based on Hybrid Information in Images

    Hiroki Ichikawa, Satoshi Goto

    Image Electronics and Visual Computing Workshop (IEVC2012)(Accepted)    2012

  • Envelope Detection Based Workload Prediction for Partial Decoding Scheme

    Chen LIU, Xin Jin, Satoshi Goto

    IEEE Asia Pacific Conference on Circuits and Systems (APCCAS2012)(Accepted)    2012


  • A 2Gpixel/s H.264/AVC HP/MVC video decoder chip for Super Hi-Vision and 3DTV/FTV applications

    Dajiang Zhou, Jinjia Zhou, Jiayi Zhu, Peilin Liu, Satoshi Goto

    Digest of Technical Papers - IEEE International Solid-State Circuits Conference   55   224 - 225  2012

     View Summary

    8Kx4K Super Hi-Vision (SHV) offers a significantly enhanced visual experience relative to 1080p, and is on its way to being the next digital TV standard. In addition, advanced 3DTV specifications involving a large number of camera views are targeted by emerging applications such as free-viewpoint TV (FTV). This paper presents a single-chip design that supports real-time H.264 decoding of SHV or up to 32 HD views. The design of the chip involved 3 key challenges: 1) Data dependencies of video coding algorithms restrict the degree of hardware parallelism. For SHV, each macroblock (MB) should be processed in less than 40 cycles at 300MHz, which is difficult to meet with a single pipeline; 2) due to the massive design and verification effort for video decoders, a scalable architecture that allows the maximum reuse of existing IP is desirable; and 3) the DRAM bandwidth requirements are always a bottleneck in high-throughput video decoders. © 2012 IEEE.
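The per-macroblock cycle budget mentioned in the summary can be checked with a few lines of arithmetic. The sketch assumes a 7680x4320 frame at 60 fps; the frame rate is not stated in the summary and is an assumption here, as are the function and constant names.

```python
MB_SIZE = 16  # H.264 macroblocks cover 16x16 luma samples


def cycles_per_macroblock(width, height, fps, clock_hz):
    """Average clock cycles available per macroblock at a given clock rate."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    return clock_hz / (mbs_per_frame * fps)


# 60 fps is an assumed frame rate for Super Hi-Vision.
budget = cycles_per_macroblock(7680, 4320, fps=60, clock_hz=300e6)

if __name__ == "__main__":
    print(f"{budget:.2f} cycles per macroblock")
```

Under these assumptions the budget comes out just below 40 cycles, consistent with the constraint quoted in the summary.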


  • A 18.42 Times Faster Video Encoder Based on Retinex Theory

    Xin Jin, Satoshi Goto

    2012 PICTURE CODING SYMPOSIUM (PCS)     473 - 476  2012

     View Summary

    Because of the limitation in the imaging system, videos captured by mobile devices are usually noisy. In this paper, a noise robust fast video encoding scheme is proposed based on Retinex theory. The reflectance image is estimated in a co-analysis module by Retinex theory, within which the content similarity is detected to conduct the encoding process. Experimental results demonstrate that the proposed encoding scheme provides an average of 18.42 times higher encoding speed relative to the existing approaches without sacrificing the subjective quality.


  • Adaptive Coding Unit Size Decision Algorithm for HEVC Intra Coding

    Guifen Tian, Satoshi Goto

    Picture Coding Symposium(PCS2012)    2012



  • De-blocking filter for HEVC with skipping mode

    Muchen Li, Jinjin Zhou, Xiao Peng, Satoshi Goto

    International Technical Conference on Circuits/Systems, Computers and Communications(ITC-CSCC2012)    2012


  • De-blocking filter design for HEVC and H.264/AVC

    Muchen Li, Jinjia Zhou, Dajiang Zhou, Xiao Peng, Satoshi Goto

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   7674   273 - 284  2012

     View Summary

    As the successor of H.264/AVC, HEVC inherits the basic property of H.264/AVC and gives some new features. This paper introduces a novel dual-standard de-blocking filter architecture which could support both of the HEVC and H.264/AVC standards. It takes 48 clock cycles for H.264/AVC and 24 cycles for HEVC for every 16×16 block. The proposed unified-cross based processing order greatly reduces the design complexity. The proposed architecture occupies 43.3k equivalent gate count at frequency of 200MHz in SMIC 65nm library, which could satisfy the throughput requirement of quad-full high definition (QFHD) on 60fps for H.264/AVC and super hi-vision (SHV) on 60fps for HEVC. In addition, the total power consumption could be reduced by 37.8% in skipping mode when the edges need not be filtered. © 2012 Springer-Verlag.
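The cycle counts in the summary can be related to the 200 MHz clock with a quick calculation. The sketch assumes SHV means 7680x4320 and QFHD means 3840x2160, and that the filter is the only consumer of these cycles; the function name is illustrative.

```python
def required_clock_hz(width, height, fps, cycles_per_block, block=16):
    """Clock rate needed to filter every 16x16 block of every frame in real time."""
    blocks_per_frame = (width // block) * (height // block)
    return blocks_per_frame * fps * cycles_per_block


# HEVC: 24 cycles per 16x16 block, SHV (assumed 7680x4320) at 60 fps.
hevc_shv_hz = required_clock_hz(7680, 4320, 60, cycles_per_block=24)

# H.264/AVC: 48 cycles per 16x16 block, QFHD (assumed 3840x2160) at 60 fps.
h264_qfhd_hz = required_clock_hz(3840, 2160, 60, cycles_per_block=48)

if __name__ == "__main__":
    print(f"HEVC SHV:   {hevc_shv_hz / 1e6:.1f} MHz")
    print(f"H.264 QFHD: {h264_qfhd_hz / 1e6:.1f} MHz")
```

Both requirements fall below 200 MHz under these assumptions, which matches the throughput claim in the summary.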


  • Human Detection Algorithm Based on Hybrid Information in Images

    Hiroki Ichikawa, Satoshi Goto

    Image Electronics and Visual Computing Workshop (IEVC2012)    2012

  • Interlaced asymmetric search range assignment for bidirectional motion estimation

    Jinjia Zhou, Dajiang Zhou, Satoshi Goto

    International Conference on Image Processing (ICIP2012)     1557 - 1560  2012


  • An efficient majority-logic based message-passing algorithm for non-binary LDPC decoding

    Yichao Lu, Nanfan Qiu, Zhixiang Chen, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems, Proceedings, APCCAS     479 - 482  2012

     View Summary

    This paper presents a majority-logic based message-passing algorithm for decoding non-binary LDPC codes. Recently, many majority-logic decoding (MLGD) algorithms have aimed at reducing the computational complexity of decoding non-binary LDPC codes. Inspired by one-step majority-logic decoding and the q-ary sum-product algorithm, we devise a novel iterative double-reliability-based (IDRB) MLGD algorithm that achieves an efficient trade-off between decoding complexity and error performance. The proposed algorithm remarkably improves error-correcting capability, yet requires only integer operations and finite field operations. Simulation results on two NB-LDPC codes show a significant coding gain over IHRB- and ISRB-MLGD with a limited complexity increase. © 2012 IEEE.


  • A 6.72-Gb/s 8 pJ/bit/iteration IEEE 802.15.3c LDPC Decoder Chip

    Zhixiang Chen, Xiao Peng, Xiongxin Zhao, Leona Okamura, Dajiang Zhou, Satoshi Goto


     View Summary

    In this paper, we introduce an LDPC decoder design for decoding a length-672 multi-rate code adopted in the IEEE 802.15.3c standard. The proposed decoder features high performance in both data rate and power efficiency. A macro-layer level fully parallel layered decoding architecture is proposed to support the throughput requirement in the standard. For the proposed decoder, it takes only 4 clock cycles to process one decoding iteration. While parallelism increases, the chip routing congestion problem becomes more severe because a more complicated interconnection network is needed for message passing during the decoding process. This problem is nicely solved by our proposed efficient message permutation scheme utilizing exploited parity check matrix features. The proposed message permutation network features high compatibility and zero-logic-gate VLSI implementation, which contribute to the remarkable improvements in both area utilization ratio and total gate count. Meanwhile, frame-level pipeline decoding is applied in the design to shorten the critical path. To verify the above techniques, the proposed decoder is implemented on a chip fabricated using Fujitsu 65 nm 1P12L LVT CMOS process. The chip occupies a core area of 1.30 mm² with an area utilization ratio of 86.3%. According to the measurement results, working at 1.2 V, 400 MHz and 10 iterations, the proposed decoder delivers a 6.72 Gb/s data throughput and dissipates a power of 537.6 mW, resulting in an energy efficiency of 8.0 pJ/bit/iteration. Moreover, a decoder of the same architecture but with no pipeline stage for low-profile application is also implemented and evaluated at post-layout level.
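The headline figures in the summary are mutually consistent, as the following check shows. It is a sketch that assumes one 672-bit codeword completes every (cycles per iteration x iterations) clock cycles; the constant and function names are illustrative.

```python
CODE_LENGTH_BITS = 672
CLOCK_HZ = 400e6
CYCLES_PER_ITERATION = 4
ITERATIONS = 10
POWER_W = 0.5376  # 537.6 mW


def throughput_bps(code_length, clock_hz, cycles_per_iter, iterations):
    """Bits per second, assuming one codeword per cycles_per_iter * iterations cycles."""
    return code_length * clock_hz / (cycles_per_iter * iterations)


def energy_per_bit_per_iteration(power_w, throughput, iterations):
    """Joules spent per decoded bit per iteration."""
    return power_w / throughput / iterations


tp = throughput_bps(CODE_LENGTH_BITS, CLOCK_HZ, CYCLES_PER_ITERATION, ITERATIONS)
energy = energy_per_bit_per_iteration(POWER_W, tp, ITERATIONS)

if __name__ == "__main__":
    print(f"{tp / 1e9:.2f} Gb/s")
    print(f"{energy * 1e12:.1f} pJ/bit/iteration")
```

Under this assumption the calculation reproduces the 6.72 Gb/s throughput and the 8.0 pJ/bit/iteration energy efficiency quoted above.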


  • A 98 GMACs/W 32-Core Vector Processor in 65 nm CMOS

    Xun He, Xin Jin, Minghui Wang, Dajiang Zhou, Satoshi Goto


     View Summary

    This paper presents a high-performance dual-issue 32-core SIMD platform for image and video processing. The SIMD cores support 8/16-bit SIMD MAC instructions and vertical vector access. Eight cores with a 4-port L2 cache are connected by a CIB bus as a cluster. Four clusters are connected by a mesh network. This hierarchical network provides more than 192 GB/s of low-latency inter-core bandwidth on average. The 4-port L2 cache architecture is also designed to provide 192 GB/s of L2 cache bandwidth. To reduce coherence operations in large-scale SMP, an application-specific protocol is proposed. Compared with MOESI, 67.8% of L1 cache energy can be saved in the 32-core case. The whole system, including 32 vector cores, 256 KB L2 cache, 64-bit DDRII PHY and two PLL units, occupies 25 mm² in 65 nm CMOS. It achieves a peak performance of 375 GMACs and 98 GMACs/W at 1.2 V.


  • Greedy Algorithm for the On-Chip Decoupling Capacitance Optimization to Satisfy the Voltage Drop Constraint

    Mikiko Sode Tanaka, Nozomu Togawa, Masao Yanagisawa, Satoshi Goto


     View Summary

    With the progress of process technology in recent years, low voltage power supplies have become quite predominant. With this, the voltage margin has decreased and therefore the on-chip decoupling capacitance optimization that satisfies the voltage drop constraint becomes more important. In addition, the reduction of the on-chip decoupling capacitance area will reduce the chip area and, therefore, manufacturing costs. Hence, we propose an algorithm that satisfies the voltage drop constraint and at the same time minimizes the total on-chip decoupling capacitance area. The proposed algorithm uses the idea of the network algorithm where the path which has the most influence on voltage drop is found. Voltage drop is improved by adding on-chip capacitance to a node on the path. The proposed algorithm is efficient and adds the on-chip capacitance where it has the greatest influence on the voltage drop. Experimental results demonstrate that, with the proposed algorithm, a real-size power/ground network can be optimized in just a few minutes, which is quite practical. Compared with the conventional algorithm, we confirmed that the total on-chip decoupling capacitance area of the power/ground network can be reduced by about 40% to 50%.

    DOI CiNii

  • Efficient motion vector prediction algorithm using pattern matching

    Zhenxing Chen, Satoshi Goto


     View Summary

    The state-of-the-art median prediction scheme is widely used for predicting motion vectors (MVs) in recent video standards. By exploiting the spatial correlations among MVs, median prediction scheme predicts MV for current block from three neighboring blocks. When MV is obtained from motion estimation, MV difference (MVD) is calculated and then transmitted. This process for predicting MV and calculating MVD is known as MV coding process. For MV coding, the performance depends on how efficient both the spatial and the temporal correlations among MVs are being exploited. Median prediction scheme applies a sophisticated way including some special rules to exploit the spatial correlations, however the temporal correlations among successive MVs are not exploited. In this paper, a new algorithm named MV pattern matching (MV-PM) exploiting both the spatial and temporal correlations is proposed. Various kinds of experimental results show that the proposed MV-PM algorithm outperforms the median prediction and the other related prediction schemes. (C) 2011 Elsevier Inc. All rights reserved.

    DOI CiNii
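The baseline median prediction that the summary above compares against can be sketched as below. This illustrates only the conventional scheme, not the proposed MV-PM algorithm. The neighbor names (`left`, `top`, `top_right`) and the sample vectors are assumptions, and the special rules the summary mentions (for example, unavailable neighbors) are omitted.

```python
from statistics import median


def median_predictor(left, top, top_right):
    """Component-wise median of three neighboring motion vectors (x, y)."""
    pred_x = median([left[0], top[0], top_right[0]])
    pred_y = median([left[1], top[1], top_right[1]])
    return (pred_x, pred_y)


def mv_difference(mv, predictor):
    """Motion vector difference (MVD) that would be transmitted."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


if __name__ == "__main__":
    # Hypothetical neighboring MVs and the MV found by motion estimation.
    predictor = median_predictor((4, -2), (6, 0), (5, 3))
    print(predictor)
    print(mv_difference((7, 1), predictor))
```

Because only the difference is coded, a predictor closer to the true motion vector yields a smaller MVD, which is why exploiting temporal as well as spatial correlation can improve coding efficiency.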

  • Watermarking for HDR Image Robust to Tone Mapping

    Xinwei Xue, Takao Jinno, Xin Jin, Masahiro Okuda, Satoshi Goto


     View Summary

    High Dynamic Range (HDR) images have been widely applied in daily applications. However, HDR is a special format, and HDR images need to be pre-processed by tone mapping operators for display. Since the visual quality of HDR images is very sensitive to luminance value variations, conventional watermarking methods for low dynamic range (LDR) images are not suitable and may even cause catastrophic visible distortion. Currently, few methods for HDR image watermarking have been proposed. In this paper, two watermarking schemes targeting HDR images are proposed, which are based on μ-Law and bilateral filtering, respectively. Both the subjective and objective qualities of watermarked images are greatly improved by the two methods. Moreover, the proposed methods also show higher robustness against tone mapping operations.

    DOI CiNii

  • Low bit rate overhead based reference modification for error resilient video coding

    Jun Wang, Satoshi Goto

    IEICE ELECTRONICS EXPRESS   8 ( 20 ) 1689 - 1697  2011.10

     View Summary

    Errors are common during video transmission. To improve error resilience, the reference modification (RM) approach at the video encoder replaces the original reference with a modified one that is less sensitive to errors. This incurs a bit rate overhead compared with a conventional encoder that uses the original reference, so reducing the overhead while keeping the error resilience performance is a major issue. A low bit rate overhead RM approach is proposed in this paper, in which decoder information is exploited during encoding. In the proposal, the modified reference is more strongly correlated with the original reference, which reduces the bit rate overhead. In addition, the proposal achieves better image recovery than existing works and remains compatible with standard decoders. Overall, the proposal provides the best trade-off among error resiliency, coding efficiency, and standard compatibility.

    DOI CiNii

  • Composite Model-Based DC Dithering for Suppressing Contour Artifacts in Decompressed Video

    Xin Jin, Satoshi Goto, King Ngi Ngan

    IEEE TRANSACTIONS ON IMAGE PROCESSING   20 ( 8 ) 2110 - 2121  2011.08

     View Summary

    Because of the outstanding contribution in improving compression efficiency, block-based quantization has been widely accepted in state-of-the-art image/video coding standards. However, false contour artifacts are introduced, which result in reducing the fidelity of the decoded image/video especially in terms of subjective quality. In this paper, a block-based decontouring method is proposed to reduce the false contour artifacts in the decoded image/video by automatically dithering its direct current (DC) value according to a composite model established between gradient smoothness and block-edge smoothness. Feature points on the model with the corresponding criteria in suppressing contour artifacts are compared to show a good consistency between the model and the actual processing effects. Discrete cosine transform (DCT)-based block level contour artifacts detection mechanism ensures the blocks within the texture region are not affected by the DC dithering. Both the implementation method and the algorithm complexity are analyzed to present the feasibility in integrating the proposed method into an existing video decoder on an embedded platform or system-on-chip (SoC). Experimental results demonstrate the effectiveness of the proposed method both in terms of subjective quality and processing complexity in comparison with the previous methods.

    DOI PubMed CiNii

  • A 530 Mpixels/s 4096x2160@60fps H.264/AVC High Profile Video Decoder Chip

    Dajiang Zhou, Jinjia Zhou, Xun He, Jiayi Zhu, Ji Kong, Peilin Liu, Satoshi Goto

    IEEE JOURNAL OF SOLID-STATE CIRCUITS   46 ( 4 ) 777 - 788  2011.04

     View Summary

    The increased resolution of Quad Full High Definition (QFHD) offers significantly enhanced visual experience. However, the corresponding huge data throughput of up to 530 Mpixels/s greatly challenges the design of real-time video decoder VLSI with the extensive requirement on both DRAM bandwidth and computational power. In this work, a lossless frame recompression technique and a partial MB reordering scheme are proposed to save the DRAM access of a QFHD video decoder chip. Besides, pipelining and parallelization techniques such as NAL/slice-parallel entropy decoding are implemented to efficiently enhance its computational power. The chip supporting H.264/AVC high profile is fabricated in 90 nm CMOS and verified. It delivers a maximum throughput of 4096x2160@60fps, which is at least 4.3 times higher than the state-of-the-art. DRAM bandwidth requirement is reduced by typically 51%, which fits the design into a 64-bit LPDDR SDRAM interface and results in 58% DRAM power saving. Meanwhile, the core energy is saved by 54% by pipelining and parallelization.

    DOI CiNii

  • Greedy Optimization Algorithm for the Power/Ground Network Design to Satisfy the Voltage Drop Constraint

    Mikiko Sode Tanaka, Nozomu Togawa, Masao Yanagisawa, Satoshi Goto


     View Summary

    With the process technological progress in recent years, low voltage power supplies have become quite predominant. With this, the voltage margin has decreased and therefore the power/ground design that satisfies the voltage drop constraint becomes more important. In addition, the reduction of the power/ground total wiring area and the number of layers will reduce manufacturing and designing costs. So, we propose an algorithm that satisfies the voltage drop constraint and at the same time, minimizes the power/ground total wiring area. The proposed algorithm uses the idea of a network algorithm [1] where the edge which has the most influence on voltage drop is found. Voltage drop is improved by changing the resistance of the edge. The proposed algorithm is efficient and effectively updates the edge with the greatest influence on the voltage drop. From experimental results, compared with the conventional algorithm, we confirmed that the total wiring area of the power/ground was reducible by about 1/3. Also, the experimental data shows that the proposed algorithm satisfies the voltage drop constraint in the data whereas the conventional algorithm cannot.

    DOI CiNii

  • Optimized 2-D SAD Tree Architecture of Integer Motion Estimation for H.264/AVC

    Yibo Fan, Xiaoyang Zeng, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E94C ( 4 ) 411 - 418  2011.04

     View Summary

    Integer Motion Estimation (IME) accounts for much of the computation in an H.264/AVC video encoder. The 2-D SAD tree IME architecture provides very high performance for the encoder, and it has been used by many video codec designs. This paper proposes an optimized hardware design of 2-D SAD tree IME. Firstly, a new hardware architecture is proposed to reduce on-chip memory size. Secondly, a new search pattern is proposed to fully use memory bandwidth and reduce external memory access. Thirdly, the data-path is redesigned, and the performance is greatly improved. In order to compare with other IME designs, an IME design supporting D1 size at 30 fps with search range [+/- 32, +/- 32] is implemented. The hardware cost of this design is 118 KGates and 8 Kb SRAM, and the maximum clock frequency is 200 MHz. Compared to the original 2-D SAD tree IME, our design saves 87.5% of on-chip memory and achieves 3 times the performance of the original. Our design provides a new way to design a low cost and high performance IME for H.264/AVC encoders.

    DOI CiNii
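
    For readers unfamiliar with the cost function behind IME, the following is a minimal software sketch of a sum-of-absolute-differences (SAD) full search over a square search window. It illustrates only the matching criterion, not the 2-D SAD tree hardware architecture of the paper; the frame contents, block size, and function names are assumptions chosen for illustration.

```python
def sad(cur_block, ref_frame, ref_y, ref_x):
    """Sum of absolute differences between cur_block and the same-size
    region of ref_frame whose top-left corner is (ref_y, ref_x)."""
    n = len(cur_block)
    return sum(abs(cur_block[i][j] - ref_frame[ref_y + i][ref_x + j])
               for i in range(n) for j in range(n))

def full_search(cur_block, ref_frame, top, left, search_range):
    """Try every integer offset (dx, dy) within +/- search_range around
    (top, left) and return (best_cost, (dx, dy))."""
    n = len(cur_block)
    height, width = len(ref_frame), len(ref_frame[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > height or x + n > width:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur_block, ref_frame, y, x)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best

# Hypothetical 16x16 reference frame in which every pixel value is unique.
ref = [[y * 16 + x for x in range(16)] for y in range(16)]
# Current 4x4 block copied from rows 6..9, columns 7..10 of the reference.
cur = [row[7:11] for row in ref[6:10]]
# Search around (top=5, left=5); the true displacement is dx=2, dy=1.
best_cost, best_mv = full_search(cur, ref, top=5, left=5, search_range=3)
```

    The nested loops make the cost of full search visible: the number of SAD evaluations grows with the square of the search range, which is what dedicated architectures parallelize.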

  • Multiple Region-of-Interest Based H.264 Encoder with a Detection Architecture in Macroblock Level Pipelining

    Tianruo Zhang, Chen Liu, Minghui Wang, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E94C ( 4 ) 401 - 410  2011.04

     View Summary

    This paper proposes a region-of-interest (ROI) based H.264 encoder and the VLSI architecture of the ROI detection algorithm. In ROI based video coding system, pre-processing unit to detect ROI should only introduce low computational complexity overhead due to the low power requirement. The Macroblocks (MBs) in ROIs are detected sequentially in the same order of H.264 encoding to satisfy the MB level pipelining of ROI detector and H.264 encoder. ROI detection is performed in a novel estimation-and-verification process with an ROI contour template. Proposed architecture can be configured to detect either single ROI or multiple ROIs in each frame and the throughput of single detection mode is 5.5 times of multiple detection mode. 98.01% and 97.89% of MBs in ROIs can be detected in single and multiple detection modes respectively. Hardware cost of proposed architecture is only 4.68k gates. Detection speed is 753 fps for CIF format video at the operation frequency of 200 MHz in multiple detection mode with power consumption of 0.47 mW. Compared with previous fast ROI detection algorithms for video coding application, the proposed architecture obtains more accurate and smaller ROI. Therefore, more efficient ROI based computation complexity and compression efficiency optimization can be implemented in H.264 encoder.

    DOI CiNii

  • Cache Based Motion Compensation Architecture for Quad-HD H.264/AVC Video Decoder

    Jinjia Zhou, Dajiang Zhou, Gang He, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E94C ( 4 ) 439 - 447  2011.04

     View Summary

    In this paper, we present a cache based motion compensation (MC) architecture for Quad-HD H.264/AVC video decoder. With the significantly increased throughput requirement, VLSI design for MC is greatly challenged by the huge area cost and power consumption. Moreover, the long memory system latency leads to performance drop of the MC pipeline. To solve these problems, three optimization schemes are proposed in this work. Firstly, a high-performance interpolator based on Horizontal-Vertical Expansion and Luma-Chroma Parallelism (HVE-LCP) is proposed to efficiently increase the processing throughput to at least over 4 times as the previous designs. Secondly, an efficient cache memory organization scheme (4Sx4) is adopted to improve the on-chip memory utilization, which contributes to memory area saving of 25% and memory power saving of 39~49%. Finally, by employing a Split Task Queue (STQ) architecture, the cache system is capable of tolerating much longer latency of the memory system. Consequently, the cache idle time is saved by 90%, which contributes to reducing the overall processing time by 24~40%. When implemented with SMIC 90 nm process, this design costs a logic gate count and on-chip memory of 108.8k and 3.1kB respectively. The proposed MC architecture can support real-time processing of 3840x2160@60 fps with less than 166 MHz.

    DOI CiNii

  • A 530 Mpixels/s Intra Prediction Architecture for Ultra High Definition H.264/AVC Encoder

    Gang He, Dajiang Zhou, Jinjia Zhou, Tianruo Zhang, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E94C ( 4 ) 419 - 427  2011.04

     View Summary

    Intra coding in H.264/AVC significantly enhances video compression efficiency. However, due to the high data dependency of intra prediction in H.264, the use of both pipelining and parallel processing techniques is limited. Moreover, it is difficult to get high hardware utilization and throughput because of the long block/MB-level reconstruction loops. This paper proposes a high-performance intra prediction architecture that can support H.264/AVC high profile. The proposed MB/block co-reordering can avoid data dependency and improve pipeline utilization. Therefore, the timing constraint of real-time 4096 x 2160 encoding can be achieved with negligible quality loss. A 16 x 16 prediction engine and an 8 x 8 prediction engine work in parallel for prediction and coefficient generation. A reordering interlaced reconstruction is also designed for the fully pipelined architecture. It takes only 160 cycles to process one macroblock (MB). Hardware utilization of prediction and reconstruction modules is almost 100%. Furthermore, a PE-reusable 8 x 8 intra predictor and hybrid SAD & SATD mode decision are proposed to save hardware cost. The design is implemented in 90 nm CMOS technology with 113.2k gates and can encode 4096 x 2160 video sequences at 60 fps with an operation frequency of 332 MHz.

    DOI CiNii
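
    The throughput figures quoted in the summary above can be cross-checked with simple arithmetic. The sketch below is a reader-side check, not code from the paper; it assumes 16 x 16 macroblocks and frame dimensions that are multiples of 16 (true for 4096 x 2160), and the function names are illustrative.

```python
MB_SIZE = 16  # H.264/AVC macroblock edge length in pixels

def pixel_rate(width, height, fps):
    """Pixels that must be processed per second."""
    return width * height * fps

def macroblock_rate(width, height, fps):
    """Macroblocks per second, assuming dimensions divisible by MB_SIZE."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    return mbs_per_frame * fps

def required_clock_hz(width, height, fps, cycles_per_mb):
    """Minimum clock frequency for a pipeline spending cycles_per_mb per MB."""
    return macroblock_rate(width, height, fps) * cycles_per_mb

pixels_per_second = pixel_rate(4096, 2160, 60)      # about 530 Mpixels/s
clock_hz = required_clock_hz(4096, 2160, 60, 160)   # just under 332 MHz
```

    With 160 cycles per macroblock, the required clock comes out at 331.776 MHz, which is consistent with the 332 MHz operating frequency and the 530 Mpixels/s in the title.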

  • Encoder adaptable difference detection for low power video compression in surveillance system

    Xin Jin, Satoshi Goto

    SIGNAL PROCESSING-IMAGE COMMUNICATION   26 ( 3 ) 130 - 142  2011.03

     View Summary

    As a state-of-the-art video compression technique, H.264/AVC has been deployed in many surveillance cameras to improve the compression efficiency. However, it induces very high coding complexity, and thus high power consumption. In this paper, a difference detection algorithm is proposed to reduce the computational complexity and power consumption in surveillance video compression by automatically distributing the video data to different modules of the video encoder according to their content similarity features. Without any requirement in changing the encoder hardware, the proposed algorithm provides high adaptability to be integrated into the existing H.264 video encoders. An average of over 82% of overall encoding complexity can be reduced regardless of whether or not the H.264 encoder itself has employed fast algorithms. No loss is observed in both subjective and objective video quality. (C) 2011 Elsevier B.V. All rights reserved.

    DOI CiNii

  • A Novel Depth-Image Based View Synthesis Scheme for Multiview and 3DTV

    Xun He, Xin Jin, Minghui Wang, Satoshi Goto

    MMM   Part I, LNCS 6523   161 - 170  2011


  • A 16-65 Cycles/MB H.264/AVC Motion Compensation Architecture for Quad-HD Applications

    Jinjia Zhou, Dajiang Zhou, Gang He, Satoshi Goto

    2011 European Signal Processing Conference (EUSIPCO)     728 - 733  2011

  • A 1 Gbin/s CABAC Encoder for H.264/AVC

    Wei Fei, Dajiang Zhou, Satoshi Goto

    2011 European Signal Processing Conference (EUSIPCO)    2011

  • Ultra Low Power QC-LDPC Decoder with High Parallelism

    Ying Cui, Xiao Peng, Zhixiang Chen, Xiongxin Zhao, Yichao Lu, Dajiang Zhou, Satoshi Goto

    24th IEEE International SOC Conference (SOCC2011)    2011

  • Real-time Human Tracking by Detection based on HOG and Particle Filter

    Jiu Xu, Axel Beaugendre, Satoshi Goto

    2011 6th International Conference on Computer Sciences and Convergence Information Technology (ICCIT)    2011

  • Framework of contour based depth map coding system

    Minghui Wang, Xun He, Xin Ji, Satoshi Goto

    2011 Pacific-Rim Conference on Multimedia (PCM 2011)    2011

  • A new optimized FME hardware structure for QFHD video

    He Qian, Satoshi Goto

    CSPA     108 - 111  2011


  • A Macro-Layer Level Fully Parallel Layered LDPC Decoder SoC for IEEE 802.15.3c Application

    Zhixiang Chen, Xiao Peng, Xiongxin Zhao, Qian Xie, Leona Okamura, Dajiang Zhou, Satoshi Goto

    2011 IEEE International Symposium on VLSI Design, Automation and Test (VLSI-DAT 2011)    2011


  • Bilateral filtering based watermarking for high dynamic range image

    Xinwei Xue, Masahiro Okuda, Satoshi Goto

    2011 International Symposium on Intelligent Signal Processing and Communications Systems: "The Decade of Intelligent and Green Signal Processing and Communications", ISPACS 2011    2011

     View Summary

    As the future of digital photography, high dynamic range imaging (HDRI) has been widely applied in daily applications. However, since the visual quality of HDR images is very sensitive to luminance value variations, the conventional watermarking methods present inefficiency because of introducing many visible image distortions. In this paper, a watermarking scheme targeting HDR images is proposed, which is based on bilateral filtering. Watermarks are embedded into detail part of HDR image which is the residual image between the HDR image and the bilateral filtered image. The objective quality of the watermarked images can be significantly improved by an average of 20dB in PSNR with corresponding improvement in subjective quality. It also shows higher robustness against tone mapping operations. © 2011 IEEE.


  • A high throughput CABAC encoder design

    Wei Fei, Dajiang Zhou, Satoshi Goto

    Proceedings - 2011 IEEE 7th International Colloquium on Signal Processing and Its Applications, CSPA 2011     99 - 102  2011

     View Summary

    In this paper, we propose a full hardware encoder architecture for context-based adaptive binary arithmetic coding (CABAC) for Super Hi-vision data that aims to increase the throughput of the encoder. CABAC is a crucial part of H.264/AVC main profile that provides a great compression ratio at the expense of high computational complexity. Due to the data dependence between bit-wise processing steps, the throughput of the encoder is limited. Some techniques have been proposed in the latest encoder architecture designs to improve the speed to meet the needs of QFHD or 3DHD applications. For the Super Hi-vision (4320p) case, however, a throughput of more than 1Gbps is required, while the current designs can only reach a throughput of around 660Mbps. As a result, frame parallelism is a usual but hardware-costly way to close the throughput gap. Moreover, frame parallelism also causes frame delay, which is critical in real-time systems. This design tries to avoid frame parallelism and save power by encoding 4 bins per cycle using only one core, while working at a frequency of 264MHz. The technology used for synthesis is SMIC 90nm. Two main ideas are applied in this design to realize this high throughput. © 2011 IEEE.


  • Shift Obtaining in a Fast Encoder of Frame-compatible Format for 3D Distribution

    Zhuoying Zeng, Xin Jin, Satoshi Goto

    International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC)    2011

  • A High Parallel Macro Block Level Layered LDPC Decoding Architecture based on Dedicated Matrix Reordering

    Qian Xie, Qian He, Xiao Peng, Ying Cui, Zhixiang Chen, Dajiang Zhou, Satoshi Goto

    SiPS 2011    2011


  • A fast encoder of frame-compatible format based on content similarity for 3D delivery

    Zhuoying Zeng, Xin Jin, Satoshi Goto

    Visual Communications and Image Processing (VCIP 2011)    2011


  • Application-specific Network-on-Chip synthesis: Cluster generation and network component insertion

    Wei Zhong, Bei Yu, Song Chen, Takeshi Yoshimura, Sheqin Dong, Satoshi Goto

    ISQED2011     144 - 149  2011


  • Floorplanning driven Network-on-Chip synthesis for 3-D SoCs

    Wei Zhong, Song Chen, Fei Ma, Takeshi Yoshimura, Satoshi Goto

    ISCAS2011     1203 - 1206  2011


  • A 98 GMACs/W 32-core vector processor in 65nm CMOS

    Xun He, Dajiang Zhou, Xin Jin, Satoshi Goto

    Proceedings of the International Symposium on Low Power Electronics and Design     373 - 378  2011

     View Summary

    This paper presents a high-performance dual-issue 32-core SIMD platform for image and video processing. Eight cores with a 4-port L2 cache are connected by a CIB bus as a cluster. Four clusters are connected by a mesh network. The proposed hierarchical network can provide 192 GB/s inter-core communication bandwidth on average. To reduce coherence operations in large-scale SMP, an application-specified protocol is proposed. Compared with MOESI, 67.8% of L1 cache energy can be saved in the 32-core case. It can achieve a peak performance of 375 GMACs and 98 GMACs/W in 65 nm CMOS. © 2011 IEEE.


  • Correlation Power Analysis with Companding Methods

    Hongying Liu, Satoshi Goto, Yukiyasu Tsunoo

    International Conference on Advances in Control Engineering and Information Science (CEIS 2011)    2011


  • A study on channel polarization and polar coding

    Yichao Lu, Satoshi Goto

    IEEE 9th International Conference on ASIC (ASICON 2011)    2011

  • Block-based Codebook Model with Oriented-Gradient Feature for Real-time Foreground Detection

    Jiu Xu, Ning Jiang, Satoshi Goto

    2011 IEEE 13th International Workshop on Multimedia Signal Processing(MMSP)    2011


  • Region-interior painting in Contour Based Depth map Coding System

    Minghui Wang, Xun He, Xin Jin, Satoshi Goto

    2011 International Symposium on Intelligent Signal Processing and Communications Systems: "The Decade of Intelligent and Green Signal Processing and Communications", ISPACS 2011    2011

     View Summary

    Depth Image Based Rendering (DIBR) is a coding strategy for Multiview Video Coding (MVC). In DIBR, depth video is used to represent an additional dimension of the traditional 2D video (texture video). The physical meaning of depth video is different from that of texture video, but traditional block based coding systems use the same procedure to code both. A new coding system dedicated to depth video coding, called the Contour Based Depth map Coding system (CBDC), is proposed in this paper. To match the sharp-edge and smooth-interior features of depth maps, the proposed CBDC performs contour coding and interior coding separately. This paper focuses on the interior coding part. Linear painting modes and a non-linear painting mode are introduced and compared. Experimental results indicate that the proposed painting method achieves PSNR performance equivalent to the block based coding strategy (JMVC). © 2011 IEEE.


  • Object tracking by detection for video surveillance systems based on modified codebook foreground detection and particle filter

    Jiu Xu, Chenyuan Zhang, Satoshi Goto

    2011 International Symposium on Intelligent Signal Processing and Communications Systems: "The Decade of Intelligent and Green Signal Processing and Communications", ISPACS 2011    2011

     View Summary

    In this paper, a novel approach is proposed to achieve multi-object tracking in video surveillance systems using a tracking by detection method. For the foreground object detection part, we implement a modified codebook model. First, the block-based model upgrades the pixel-based codebook model to block level, thus improving the processing speed and reducing memory. Moreover, by adding the orientation and magnitude of the block gradient, the codebook model contains not only color information but also texture features, in order to further reduce noise and obtain more complete foreground regions. For the tracking part, we further utilize the data from the foreground detection: a color-edge-texture histogram is built by calculating the local binary pattern of the edges of the foreground objects, which performs well in describing the shape and texture of the objects. Finally, occlusion handling strategies are applied in order to overcome the occlusion problems during tracking. Experimental results on different data sets show that our method has better performance and good real-time ability. © 2011 IEEE.


  • Novel and efficient min cut based voltage assignment in gate level

    Tao Lin, Sheqin Dong, Song Chen, Yuchun Ma, Ou He, Satoshi Goto

    ISQED2011     150 - 155  2011

  • Design of turbo codes without 4-cycles in Tanner graph representation for message passing algorithm

    Ying Cui, Xiao Peng, Satoshi Goto

    Proceedings - 2011 IEEE 7th International Colloquium on Signal Processing and Its Applications, CSPA 2011     108 - 111  2011

     View Summary

    Turbo code and Low Density Parity Check (LDPC) code are both recommended as FEC code in many communication standards owing to their impressive error correcting performance. Aiming at the common architecture which can decode both of the two codes, this paper describes the method of decoding turbo codes with message passing algorithm which is conventionally used for LDPC codes. In this method, turbo codes are viewed as block codes and the sparse parity check matrices are constructed through Smith Decomposition or GBT (Generator matrix based transformation). In order to guarantee decoding performance, we propose the criterion of turbo codes with no 4-cycles, which is mathematically proved and demonstrated by the simulation results. © 2011 IEEE.
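
    The 4-cycle condition mentioned in the summary above has a simple characterization on a parity check matrix: the Tanner graph contains a 4-cycle exactly when two check rows share ones in at least two columns. The sketch below checks that property for small binary matrices; it is an illustrative check under that standard definition, not the construction (Smith Decomposition or GBT) used in the paper, and the example matrices are made up.

```python
from itertools import combinations

def has_4_cycle(parity_check_matrix):
    """Return True if any two rows of the binary matrix have ones in two or
    more common columns, i.e. the Tanner graph contains a cycle of length 4."""
    for row_a, row_b in combinations(parity_check_matrix, 2):
        shared_columns = sum(a & b for a, b in zip(row_a, row_b))
        if shared_columns >= 2:
            return True
    return False

# Hypothetical matrices: the first two rows of h_with share columns 0 and 1.
h_with = [[1, 1, 0, 0],
          [1, 1, 1, 0],
          [0, 0, 1, 1]]
h_without = [[1, 1, 0, 0],
             [0, 1, 1, 0],
             [0, 0, 1, 1]]

result_with = has_4_cycle(h_with)
result_without = has_4_cycle(h_without)
```

    Short cycles correlate the messages exchanged during message passing, which is why avoiding them matters for decoding performance.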


  • A Sorting-based Architecture of Finding the First Two Minimum Values for LDPC Decoding

    Qian Xie, Zhixiang Chen, Xiao Peng, Satoshi Goto

    CSPA     95 - 98  2011


  • A Low-Complexity Reliability Based Message Passing Algorithm for Nonbinary LDPC Decoding

    Yichao Lu, Xiao Peng, Satoshi Goto

    International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC)    2011

  • Electromagnetic Analysis Enhancement with Signal Processing Techniques

    Hongying Liu, Yukiyasu Tsunoo, Satoshi Goto

    16th Australasian Conference on Information Security and Privacy (ACISP 2011)     456 - 461  2011


  • A 98 GMACs/W 32-Core Vector Processor in 65nm CMOS

    Xun He, Dajiang Zhou, Xin Jin, Satoshi Goto

    International Symposium on Low Power Electronics and Design (ISLPED) 2011    2011


  • µ-Law Based Watermarking for HDR Image Robust to Tone Mapping

    Xinwei Xue, Takao Jinno, Masahiro Okuda, Satoshi Goto

    Asia Pacific Signal and Information Processing Association, APSIPA ASC 2011    2011

  • Low power parallel surveillance video encoding system based on joint power-speed scheduling

    Xin Jin, Satoshi Goto

    2011 IEEE Visual Communications and Image Processing, VCIP 2011    2011

     View Summary

    In this paper, a low power parallel surveillance video encoding system based on joint power-speed scheduling is proposed. The relative relationships among the CPU statuses, total power consumption and encoding speed are analyzed and modeled for multi-core processors. Based on the power directional graph and the relative encoding speed model, the working statuses of the cores are controlled jointly adapting to video encoding workload to minimize the total power consumption. It provides more than 20% power reduction compared with the latest existing system without a penalty on the encoding speed. © 2011 IEEE.


  • A 115mW 1Gbps QC-LDPC decoder ASIC for WiMAX in 65nm CMOS

    Xiao Peng, Zhixiang Chen, Xiongxin Zhao, Dajiang Zhou, Satoshi Goto

    2011 Proceedings of Technical Papers: IEEE Asian Solid-State Circuits Conference 2011, A-SSCC 2011     317 - 320  2011

     View Summary

    Structured quasi-cyclic low-density parity-check (QC-LDPC) code is a part of many emerging wireless communication standards, such as WiMAX, WiFi and WPAN. This paper presents a high parallel decoder architecture for the QC-LDPC codes and the corresponding decoder ASIC for WiMAX system. Through utilizing the proposed fully parallel layered scheduling architecture, the decoder chip saves 22.2% memory bits and takes 24-48 clock cycles per iteration for different code rates. It occupies 3.36 mm² in SMIC 65nm CMOS, and realizes 1Gbps (1056Mbps) throughput at 1.2V, 110MHz and 10 iterations with the power 115mW and power efficiency 10.9pJ/bit/iteration. The energy/bit/iteration reduces 63.6% in normalized comparison with the state-of-art publication. © 2011 IEEE.
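
    The headline numbers in the summary above are mutually consistent under one assumption that the summary does not state: that throughput is counted in coded bits for the longest WiMAX (IEEE 802.16e) LDPC codeword, 2304 bits, at the best case of 24 cycles per iteration. The following sketch is a reader-side check under that assumption, with illustrative function names.

```python
CODEWORD_BITS = 2304  # assumption: longest IEEE 802.16e LDPC codeword

def throughput_bps(codeword_bits, clock_hz, iterations, cycles_per_iteration):
    """Coded bits decoded per second when one codeword takes
    iterations * cycles_per_iteration clock cycles."""
    return codeword_bits * clock_hz / (iterations * cycles_per_iteration)

def energy_pj_per_bit_per_iteration(power_w, bits_per_second, iterations):
    """Energy per decoded bit per iteration, in picojoules."""
    return power_w / bits_per_second / iterations * 1e12

rate = throughput_bps(CODEWORD_BITS, 110e6, iterations=10, cycles_per_iteration=24)
energy = energy_pj_per_bit_per_iteration(0.115, rate, iterations=10)
```

    Under these assumptions the rate works out to 1056 Mbps and the energy to roughly 10.9 pJ/bit/iteration, matching the quoted figures; at 48 cycles per iteration the same formula gives half the throughput.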


  • A Multiple Bits Output Ring-Oscillator PUF

    Hu Bin, Satoshi Goto, Yukiyasu Tsunoo

    Intelligent Signal Processing and Communication Systems (ISPACS 2011)    2011


  • A Study on the Improvement of Ring-Oscillator PUF

    Hu Bin, Yukiyasu Tsunoo, Satoshi Goto

    SCIS    2011

  • Shift Obtaining in a Fast Encoder of Frame-compatible Format for 3D Distribution

    Zhuoying Zeng, Xin Jin, Satoshi Goto

    International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC)    2011


    Zhuoying Zeng, Xin Jin, Satoshi Goto


     View Summary

    Frame-compatible format, which packs two neighbouring views into one frame, is considered a promising solution for 3D distribution in existing systems. In this paper, a fast Intra encoder for frame-compatible format coding is designed based on content similarity to reduce the computational complexity. Statistical analysis shows that, given a relative shift, the two packed views are highly correlated in prediction, so the prediction mode of the first coded view can serve as the reference for predicting the second view. The proposed scheme lets each MB of the second view evaluate only the candidate Intra modes, and the candidate directions of each, that are indicated by the reference MB in the first view. An average of 75.69% complexity reduction in encoding the second view can be achieved with comparable compression efficiency. Requiring no change to the decoder of the existing system and little effort to adopt, the proposed algorithm improves the efficiency of frame-compatible format coding and contributes to the delivery of 3D service.

  • Floorplanning driven network-on-chip synthesis for 3-D SoCs

    Wei Zhong, Song Chen, Fei Ma, Takeshi Yoshimura, Satoshi Goto

    Proceedings - IEEE International Symposium on Circuits and Systems     1203 - 1206  2011

     View Summary

    As technology advances, 3-D stacking of silicon layers is emerging as a promising approach to address the integration challenges faced by current System-on-Chips (SoCs). Designing efficient Network-on-Chips (NoCs) is necessary to handle the 3-D interconnect complexity. In this paper, we present a four-stage synthesis approach to determine the power-performance efficient 3-D NoC topology for the application. First, we propose an algorithm to explore optimal clustering of cores during 3-D floorplanning. Then, an Integer Linear Programming (ILP) algorithm is proposed to place switches and network interfaces on the 3-D floorplan. Thirdly, a power and timing aware path allocation algorithm is carried out to determine the connectivity across different switches. Last, a min-cost max-flow based algorithm is proposed for Through-Silicon Via (TSV) assignment to minimize the link power consumption. Experimental results show the effectiveness of the proposed algorithm. © 2011 IEEE.


  • Network flow-based simultaneous retiming and slack budgeting for low power design

    Bei Yu, Sheqin Dong, Yuchun Ma, Tao Lin, Yu Wang, Song Chen, Satoshi Goto

    ASPDAC2011     473 - 478  2011


  • Application-specific Network-on-Chip synthesis: Cluster generation and network component insertion

    Wei Zhong, Bei Yu, Song Chen, Takeshi Yoshimura, Sheqin Dong, Satoshi Goto

    ISQED2011     144 - 149  2011


  • A cascade detector for rapid face detection

    Ning Jiang, Wenxin Yu, Shaopeng Tang, Satoshi Goto

    Proceedings - 2011 IEEE 7th International Colloquium on Signal Processing and Its Applications, CSPA 2011     155 - 158  2011

    In recent years, the LBP-feature-based SVM detector and the Haar-feature-based cascade detector have been the two main types of efficient detectors for face detection. In this paper, we propose to improve the performance of the Haar-feature-based cascade detector. First, we describe a new feature for the cascade detector, called the Separate Haar Feature. Second, we describe a new decision algorithm for cascade detection that improves the detection rate. There are three key contributions. The first is the "Separate Haar Feature", which adds a don't-care area between the rectangles of a Haar feature. The second is an algorithm for selecting the best width for this don't-care area. The third is a new decision algorithm that bases its decision on more than a single stage result in the cascade detection process, which improves the detection rate. © 2011 IEEE.


  • Region of interest oriented fast mode decision for depth map coding in DIBR

    Minghui Wang, Chen Liu, Tianruo Zhang, Xin Jin, Satoshi Goto

    Proceedings - 2011 IEEE 7th International Colloquium on Signal Processing and Its Applications, CSPA 2011     177 - 180  2011

    In depth image based rendering (DIBR), the depth map is a new type of video sequence to be compressed. The synthesized result of DIBR is highly dependent on the quality of the depth map, and depth maps have characteristics that differ from those of color maps, so some improvement is required. Higher efficiency and preservation of edge quality are two critical issues for depth compression. In this paper, region of interest (ROI) detection is introduced to meet these requirements. We adopt an ROI detection method that distinguishes the moving and relatively complex part (foreground) of each frame. By using different mode decision strategies in ROI and non-ROI regions, the computing burden is reduced while performance is maintained. Experimental results show that the proposed strategy saves 40% of the computation of ME on average with an acceptable performance loss. © 2011 IEEE.


  • A new optimized FME hardware structure for QFHD video

    He Qian, Satoshi Goto

    CSPA     108 - 111  2011


  • Correlation Power Analysis with Companding Methods

    Hongying Liu, Satoshi Goto, Yukiyasu Tsunoo

    International Conference on Advances in Control Engineering and Information Science (CEIS 2011)    2011


  • The switching glitch power leakage model

    Hongying Liu, Guoyu Qian, Yukiyasu Tsunoo, Satoshi Goto

    Journal of Software   6 ( 9 ) 1787 - 1794  2011

    Power analysis attacks are based on analyzing the power consumption of cryptographic devices while they perform encryption operations. Correlation Power Analysis (CPA) attacks exploit the linear relation between the known power consumption and the predicted power consumption of cryptographic devices to recover keys. CPA has been one of the effective side channel attacks that threaten the security of CMOS circuits. However, few works consider the leakage of glitches at the logic gates. In this paper, we present a new power consumption model, namely the Switching Glitch (SG) power leakage model, which considers not only the data dependent switching activities but also the glitch power consumption in CMOS circuits. Additionally, from a theoretical point of view, we show how to estimate the glitch factor. Experiments against an AES implementation validate the proposed model. Compared with CPA based on the Hamming Distance model, the number of power traces needed to recover keys is reduced by as much as 28.9%. © 2011 ACADEMY PUBLISHER.


  • A Novel Depth-Image Based View Synthesis Scheme for Multiview and 3DTV

    Xun He, Xin Jin, Minghui Wang, Satoshi Goto

    MMM   Part I, LNCS 6523   161 - 170  2011


  • Adaptive low power decoding process with temporal prediction method for common video

    Wenxin Yu, Ning Jiang, Xin Jin, Satoshi Goto

    Proceedings - 2011 IEEE 7th International Colloquium on Signal Processing and Its Applications, CSPA 2011     163 - 166  2011

    This paper introduces an adaptive low power decoding process with a temporal prediction method for common video. The method reduces decoding time and decoding power consumption by skipping the decoding of some frames and thereby reducing the frame rate. Because it uses temporal prediction, it differs from the fixed frame skipping scheme in the temporal scalable decoding process with frame rate down conversion method (TSDP) [2]. The method considers the video quality loss incurred when the current frame is skipped and chooses the skipping scheme with the minimum cost, so it can also be used for common video. Compared with TSDP, the video quality (PSNR) is improved by about 0.01 - 2.4 dB in the experimental common video cases. The decoding time reduction depends on the number of skipped frames, and is about 65% - 86% relative to the frame rate reduction in the experimental cases. © 2011 IEEE.


  • Proposed optimization for AdaBoost-based face detection

    Xu Jiu, Satoshi Goto


    In this paper, a novel approach is proposed for face detection in still images based on the AdaBoost algorithm. First, face candidates are detected by the AdaBoost algorithm. Because many factors can affect detection, such as image size, illumination, and noise, some non-face windows might be detected as face candidates, and some faces might be missed. To solve these problems and achieve better performance, we make use of skin color information in the YCbCr color space together with the edge information of the color image. In this way, we are able to remove some non-faces that have been wrongly detected as faces and to add some faces that may have been missed. Experimental results show that this method can improve the hit rate and reduce false alarms.


  • A Low-Complexity Reliability Based Message Passing Algorithm for Nonbinary LDPC Decoding

    Yichao Lu, Xiao Peng, Satoshi Goto

    International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC)    2011

  • Electromagnetic Analysis Enhancement with Signal Processing Techniques

    Hongying Liu, Yukiyasu Tsunoo, Satoshi Goto

    16th Australasian Conference on Information Security and Privacy (ACISP 2011)     456 - 461  2011


  • A 16-65 Cycles/MB H.264/AVC Motion Compensation Architecture for Quad-HD Applications

    Jinjia Zhou, Dajiang Zhou, Gang He, Satoshi Goto

    2011 European Signal Processing Conference (EUSIPCO)     728 - 733  2011

  • High-parallel LDPC decoder with power gating design

    Ying Cui, Xiao Peng, Yu Jin, Peilin Liu, Shinji Kimura, Satoshi Goto

    Proceedings of International Conference on ASIC     21 - 24  2011

    Leakage power is growing comparable to dynamic power dissipation as a result of technology trends, and thus it has become an important issue in low-power circuit design. As a popular technique for standby power reduction, power gating is applied to high-parallel LDPC decoder for WiMAX standard. The clustered-block processing engine (CBPE) array are divided into 9 power domains, and they are switched on or off according to different code lengths of LDPC code defined in WiMAX standard. As CBPE array occupies about 70% of the decoder system, the dedicated power gating strategy is very effective in shorter code length case since more power domains can be switched off. At shortest code length, power gating design brings about 55% power reduction compared to that of longest code length. © 2011 IEEE.


  • µ-Law Based Watermarking for HDR Image Robust to Tone Mapping

    Xinwei Xue, Takao Jinno, Masahiro Okuda, Satoshi Goto

    Asia Pacific Signal and Information Processing Association, APSIPA ASC 2011    2011

  • Adaptive raster scan for slice/frame coding

    Xin Jin, Satoshi Goto

    2011 IEEE Visual Communications and Image Processing, VCIP 2011    2011

    In this paper, new raster scan orders other than the conventional horizontal scan from the top-left corner of the image/slice are proposed to fully exploit content correlation from the regions right to and below the current coding unit. By being applied to each slice/frame adaptively with the conventional one, features influencing the scan order selection are investigated. Different adaptive scan schemes are tested and compared. Up to 3.4% bit-rate reduction can be achieved for Intra coding with limited increase in complexity. © 2011 IEEE.


  • A Multiple Bits Output Ring-Oscillator PUF

    Hu Bin, Satoshi Goto, Yukiyasu Tsunoo

    Intelligent Signal Processing and Communication Systems (ISPACS 2011)    2011


  • Adaptive temporal scalable decoding scheme with temporal and spatial prediction method

    Wenxin Yu, Xin Jin, Satoshi Goto

    2011 International Symposium on Intelligent Signal Processing and Communications Systems: "The Decade of Intelligent and Green Signal Processing and Communications", ISPACS 2011    2011

    This paper introduces an adaptive temporal scalable decoding scheme with temporal and spatial prediction method. This method can be used to reduce the decoding time and reduce the decoding power consumption by skipping the decoding process of some frames and reducing the frame rate. With the spatial prediction and the temporal prediction, it is different from the certain frame skipping scheme in the temporal scalable decoding process with frame rate down conversion method (TSDP) [2]. This method chooses the skipping scheme which causes minimum cost by considering the video quality loss when the current frame is skipped and the movements of the current frame, so this method can be also used in the common video cases well. It is compared with the process which only used the temporal prediction method [3], the video quality (PSNR) is improved about 0.94.6 dB in the experimental common video cases by using the combined method. It can reduce the decoding time based on the number of the skipped frames, and can also get about 65%-86% reduction which compares with the frame rate reduction in the experimental cases. © 2011 IEEE.


  • Generic Permutation Network for QC-LDPC Decoder

    Xiao Peng, Xiongxin Zhao, Zhixiang Chen, Fumiaki Maehara, Satoshi Goto


    The permutation network plays an important role in the reconfigurable QC-LDPC decoder for most modern wireless communication systems with multiple code rates and various code lengths. This paper presents the generic permutation network (GPN) for the reconfigurable QC-LDPC decoder. Compared with conventional permutation networks, this proposal breaks through restrictions on the input number, such as a power of 2 or other limited numbers, and can optimize the network for any application in demand. Moreover, the proposed scheme greatly reduces latency because of fewer stages and an efficient control signal generating algorithm. In addition, the proposed network possesses a high degree of parallelism, which enables several groups of data to be cyclically shifted simultaneously. The synthesis results using 90 nm technology demonstrate that this architecture can be implemented with a gate count of 18.3k for the WiMAX standard at a frequency of 600 MHz, and 10.9k for the WiFi standard at a frequency of 800 MHz.


  • Accurate Human Detection by Appearance and Motion

    Shaopeng Tang, Satoshi Goto

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS   E93D ( 10 ) 2728 - 2736  2010.10

    In this paper, a human detection method is developed. An appearance based detector and a motion based detector are proposed respectively. A multi scale block histogram of template feature (MB-HOT) is used to detect human by the appearance. It integrates the gray value information and the gradient value information, and represents the relationship of three blocks. Experiment on INRIA dataset shows that this feature is more discriminative than other features, such as histogram of orientation gradient (HOG). A motion based feature is also proposed to capture the relative motion of human body. This feature is calculated in optical flow domain and experimental result in our dataset shows that this feature outperforms other motion based features. The detection responses obtained by two features are combined to reduce the false detection. Graphic process unit (GPU) based implementation is proposed to accelerate the calculation of two features, and make it suitable for real time applications.

  • Self-adjustable offset min-sum algorithm for ISDB-S2 LDPC decoder

    Wen Ji, Makoto Hamaminato, Hiroshi Nakayama, Satoshi Goto

    IEICE ELECTRONICS EXPRESS   7 ( 17 ) 1283 - 1289  2010.09

    In this paper, a novel self-adjustable offset min-sum LDPC decoding algorithm is proposed for ISDB-S2 applications. We present for the first time a uniform approximation of the check node operation through mathematical induction on the Jacobian logarithm, and show theoretically that the offset value depends mainly on the difference between the two most unreliable inputs from the bit nodes, which makes the offset value adjustable during the iterative decoding procedure. Simulation results for all 11 code rates of ISDB-S2 demonstrate that the proposed method achieves an average gain of 0.15 dB compared to min-sum based algorithms, and requires only 1.21% of the computational complexity of BP-based algorithms in the best case.

  • A Bandwidth Optimized, 64 Cycles/MB Joint Parameter Decoder Architecture for Ultra High Definition H.264/AVC Applications

    Jinjia Zhou, Dajiang Zhou, Xun He, Satoshi Goto


    In this paper, VLSI architecture of a joint parameter decoder is proposed to realize the calculation of motion vector (MV), intra prediction mode (IPM) and boundary strength (BS) for ultra high definition H.264/AVC applications. For this architecture, a 64-cycle-per-MB pipeline with simplified control modes is designed to increase system throughput and reduce hardware cost. Moreover, in order to save memory bandwidth, the data which includes the motion information for the co-located picture and the last decoded line, is pre-processed before being stored to DRAM. A partition based storage format is applied to condense the MB level data, while variable length coding based compression method is utilized to reduce the data size in each partition. Experimental results show our design is capable of real-time 3840 x 2160@60 fps decoding at less than 133 MHz, with 37.2k logic gates. Meanwhile, by applying the proposed scheme, 85-98% bandwidth saving is achieved, compared with storing the original information for every 4 x 4 block to DRAM.

  • Histogram of Template for Pedestrian Detection

    Shaopeng Tang, Satoshi Goto


    In this paper, we propose a novel feature named histogram of template (HOT) for human detection in still images. For every pixel of an image, various templates are defined, each of which contains the pixel itself and two of its neighboring pixels. If the texture and gradient values of the three pixels satisfy a pre-defined formula, the central pixel is regarded to meet the corresponding template for this formula. Histograms of pixels meeting various templates are calculated for a set of formulas, and combined to be the feature for detection. Compared to the other features, the proposed feature takes texture as well as the gradient information into consideration. Besides, it reflects the relationship between 3 pixels, instead of focusing on only one. Experiments for human detection are performed on INRIA dataset, which shows the proposed HOT feature is more discriminative than histogram of orientated gradient (HOG) feature, under the same training method.

  • Permutation Network for Reconfigurable LDPC Decoder Based on Banyan Network

    Xiao Peng, Zhixiang Chen, Xiongxin Zhao, Fumiaki Maehara, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E93C ( 3 ) 270 - 278  2010.03

    Since the structured quasi-cyclic low-density parity-check (QC-LDPC) codes for most modern wireless communication systems include multiple code rates, various block lengths, and correspondingly different sizes of submatrices in the parity check matrix (PCM), a reconfigurable LDPC decoder is desirable, and the permutation network must accommodate any input number (IN) and shift number (SN) for cyclic shift. In this paper, we propose a novel permutation network architecture for reconfigurable QC-LDPC decoders based on the Banyan network. We prove that the Banyan network has the nonblocking property for cyclic shift when the IN is a power of 2, and give the control signal generating algorithm. By introducing a bypass network, we put forward a nonblocking scheme for any IN and SN. In addition, we present the hardware design of the control signal generator, which greatly reduces hardware complexity and latency. The synthesis results using the TSMC 0 mu m pin library demonstrate that the proposed permutation network can be implemented with an area of 0.546 mm² and a frequency of 292 MHz.


  • A High Performance and Low Bandwidth Multi-Standard Motion Compensation Design for HD Video Decoder

    Xianmin Chen, Peilin Liu, Dajiang Zhou, Jiayi Zhu, Xingguang Pan, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E93C ( 3 ) 253 - 260  2010.03

    Motion compensation is widely used in many video coding standards. Due to its bandwidth requirement and complexity, motion compensation is one of the most challenging parts in the design of a high definition video decoder. In this paper, we propose a high performance and low bandwidth motion compensation design which supports the H.264/AVC, MPEG-1/2, and Chinese AVS standards. We introduce a 2-dimensional cache that greatly reduces the external bandwidth requirement. Similarities among the 3 standards are also explored to reduce hardware cost. We also propose a block-pipelining strategy to conceal the long latency of external memory access. Experimental results show that our motion compensation design reduces the bandwidth by 74% on average and can decode a 1920x1088@30 fps video stream in real time at 80 MHz.


  • High security countermeasure method against differential power analysis attack

    Ying Zhou, Guoyu Qian, Yueying Xing, Hongying Liu, Yukiyasu Tsunoo, Satoshi Goto

    SCIS 2010    2010

  • CPA Attack with Switching Distance Model on AES ASIC Implementation

    Hongying Liu, Guoyu Qian, Yukiyasu Tsunoo, Satoshi Goto

    SCIS 2010    2010

  • Fixed Outline Multi-Bend Bus Driven Floorplanning

    Wenxu Sheng, Sheqin Dong, Yuliang Wu, Satoshi Goto

    ISQED2010     632 - 637  2010


  • A Lossless Frame Recompression Scheme for Reducing DRAM Power in Video Encoding

    Xuena Bao, Dajiang Zhou, Satoshi Goto

    ISCAS 2010     677 - 680  2010


  • A Set of Separate Haar Features for Rapid Face Detection

    Ning Jiang, Yijun Lu, Shaopeng Tang, Satoshi Goto

    ITC-CSCC 2010     128 - 131  2010

  • A Novel Three-Step Temporal Error Concealment Method for H.264/AVC

    Hao Sun, Jun Wang, Satoshi Goto

    ITC-CSCC 2010     103 - 106  2010

  • Self-adjustable offset min-sum algorithm for ISDB-S2 LDPC decoder

    Wen Ji, Makoto Hamaminato, Hiroshi Nakayama, Satoshi Goto

    IEICE Electron. Express   7 ( 17 ) 1283 - 1289  2010

  • Low power surveillance video coding system

    Xin Jin, Kun Ba, Satoshi Goto

    ICME 2010     1156 - 1157  2010


  • Encoder Adaptable Difference Detection for Low Power Video Compression in Surveillance System

    Xin Jin, Satoshi Goto

    PCM 2010     285 - 296  2010


  • A Bandwidth Reduction Scheme and its VLSI Implementation for H.264/AVC Motion Vector Decoding

    Jinjia Zhou, Dajiang Zhou, Gang He, Satoshi Goto

    PCM 2010     52 - 61  2010


  • Hilbert Transform based Workload Estimation for Low Power Surveillance Video Compression

    Xin Jin, Satoshi Goto

    ICIP 2010     4461 - 4464  2010


  • Intra prediction architecture for H.264/AVC QFHD encoder

    Gang He, Dajiang Zhou, Jinjia Zhou, Satoshi Goto

    28th Picture Coding Symposium, PCS 2010     450 - 453  2010

    This paper proposes a high-performance intra prediction architecture that can support H.264/AVC high profile. The proposed MB/block co-reordering can avoid data dependency and improve pipeline utilization. Therefore, the timing constraint of real-time 4kx2k encoding can be achieved with negligible quality loss. 16×16 prediction engine and 8×8 prediction engine work in parallel for prediction and coefficients generating. A reordering interlaced reconstruction is also designed for fully pipelined architecture. It takes only 160 cycles to process one macroblock (MB). Hardware utilization of prediction and reconstruction modules is almost 100%. Furthermore, PE-reusable 8×8 intra predictor and hybrid SAD & SATD mode decision are proposed to save hardware cost. The design is implemented by 90nm CMOS technology with 113.2k gates and can encode 4kx2k video sequences at 60 fps with operation frequency of 310MHz. © 2010 IEEE.


  • Efficient VLSI Architectures for Ultra High Definition H.264/AVC Deblocking Filter

    Jinjia Zhou, Satoshi Goto

    ASP-DAC2010    2010

  • Content similarity based early skip mode decision for low power surveillance video compression

    Xin Jin, Satoshi Goto

    Final Program and Abstract Book - 4th International Symposium on Communications, Control, and Signal Processing, ISCCSP 2010    2010

    As a state-of-the-art video compression technique, H.264/AVC has been deployed in many surveillance cameras to improve the compression efficiency. However, it induces very high coding complexity, and thus high power consumption. In this paper, an early skip mode decision algorithm is proposed to reduce the computational complexity and power consumption in surveillance video compression by detecting the content similarity using chrominance and motion features. The overall computational complexity of the whole video encoding system is significantly reduced by 84%. No loss is observed in both of subjective and objective video quality. ©2010 IEEE.


  • Histogram of template for human detection

    Shaopeng Tang, Satoshi Goto

    ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings     2186 - 2189  2010

    In this paper, we propose a novel feature named histogram of template (HOT) for human detection in still images. For every pixel of an image, various templates are defined, each of which contains the pixel itself and two of its neighboring pixels. If the intensity and gradient values of the three pixels satisfy a pre-defined function, the central pixel is regarded to meet the corresponding template for this function. Histograms of pixels meeting various templates are calculated for a set of functions, and combined to be the feature for detection. Compared to the other features, the proposed feature takes intensity as well as the gradient information into consideration. Besides, it reflects the relationship between 3 pixels, instead of focusing on only one. Experiments for human detection are performed on INRIA dataset, which shows the proposed HOT feature is more discriminative than histogram of orientated gradient (HOG) feature, under the same training method. ©2010 IEEE.


  • The Hybrid of dynamic and static allocation directory for cache coherence

    Gang He, Xun He, Satoshi Goto

    ITC-CSCC 2010    2010

  • A constant rate bandwidth reduction architecture with adaptive compression mode decision for video decoding

    Liu Song, Dajiang Zhou, Xin Jin, Satoshi Goto

    EUSIPCO-2010     2017 - 2021  2010

  • A Novel Hardware-friendly Self-adjustable Offset Min-sum Algorithm for ISDB-S2 LDPC Decoder

    Wen Ji, Makoto Hamaminato, Hiroshi Nakayama, Satoshi Goto

    EUSIPCO 2010     1394 - 1398  2010

  • System-level low power design for ultra high definition H.264/AVC video decoder

    Dajiang Zhou, Jinjia Zhou, Xun He, Ji Kong, Jiayi Zhu, Peilin Liu, Satoshi Goto

    ACM ISLPED 2010    2010

  • Adaptive solution of temporal scalable decoding process with frame rate conversion method for surveillance video

    Wenxin Yu, Xin Jin, Satoshi Goto

    ISPACS 2010 - 2010 International Symposium on Intelligent Signal Processing and Communication Systems, Proceedings    2010

    This paper proposes an adaptive solution of temporal scalable decoding process with frame rate conversion method for surveillance video. It realizes an adaptive skipping scheme in the temporal scalable decoding process [2] based on the content of the pictures. By analyzing the probabilistic relationship between motion vector energy and the video quality loss of the same frame, it chooses a suitable form of the motion vector value to quantify the video quality loss caused by skipping a frame. It then uses a selection algorithm based on the energy accumulation principle to realize adaptive frame skipping. With this frame rate down conversion algorithm, the PSNR is improved by about 0.2-1.4 dB (compared with the fixed frame skipping scheme [2]) in different skipping cases. The loss in decoding time reduction is less than 5% in the worst case, and only 0 ∼ 2% in most cases. © 2010 IEEE.


  • Enhanced Temporal Error Concealment for 1Seg Video Broadcasting

    Jun Wang, Yichun Tang, Satoshi Goto

    MMM2010    2010


  • Floorplanning and topology generation for application-specific network-on-chip

    Bei Yu, Sheqin Dong, Song Chen, Satoshi Goto

    ASP-DAC2010    2010


  • A 530Mpixels/s 4096x2160@60fps H.264/AVC high profile video decoder chip

    Dajiang Zhou, Jinjia Zhou, Xun He, Ji Kong, Jiayi Zhu, Peilin Liu, Satoshi Goto

    IEEE Symposium on VLSI Circuits 2010     171 - 172  2010


  • Ultra-low-power SoC technology for media processing

    Satoshi Goto, Takeshi Ikenaga, Takeshi Yoshimura, Shinji Kimura, Nozomu Togawa

    IPSJ Magazine (Joho Shori)   51 ( 7 ) 837 - 845  2010

  • An approach of using different positions of double registers to protect AES hardware structure from DPA

    Ying Zhou, Guoyu Qian, Yueying Xing, Hongying Liu, Satoshi Goto, Yukiyasu Tsunoo

    3rd International Symposium on Electronic Commerce and Security, ISECS 2010     223 - 227  2010

    Advanced Encryption Standard (AES) is a widely used symmetric cryptographic algorithm. Differential power analysis attack (DPA) is a powerful side-channel attack method which has been successfully used to attack AES. As a lot of people focus on revealing the key, more and more people become interested in how to defend DPA attack. In this paper, we propose a method by adding a set of register and a random generator to defend DPA attack. The proposed method has been implemented on SASEBO board. It can process data with 2.56Gbps at 200MHz frequency. And we use the DPA attack to prove it can effectively defend the normal DPA attack. © 2010 IEEE.


  • Correlation Power Analysis based on Switching Glitch Model

    Hongying Liu, Guoyu Qian, Satoshi Goto, Yukiyasu Tsunoo

    WISA 2010    2010


  • High profile intra prediction architecture for UHD H.264 Decoder

    Xun He, Dajiang Zhou, Jinjia Zhou, Satoshi Goto

    IPSJ Transactions on System LSI Design Methodology   3   303 - 313  2010

    This paper presents a new architecture for high profile intra prediction in H.264/AVC video coding standard. Our goal is to design an Intra prediction engine for 4Kx2K@60fps Ultra High Definition (UHD) Decoder. The proposed architecture can provide very stable throughput, which can predict any H.264 intra prediction mode within 66 cycles. Compared with previous design, this feature can guarantee the whole decoding pipeline to work efficiently. The intra prediction engine is divided into two parallel pipelines, one is used for 4x4 block prediction loops and the other is used to prepare data for MB loops. It can overlap data preparing time with prediction time, which can finish data loading and storing within 2 cycles. Comparing with MB pipeline only architecture, it can achieve more than 3.2 times higher throughput with 29.8 K gates cost. The proposed architecture is verified to work at 175 MHz for our UHD Decoder by using TSMC 90 G. © 2010 Information Processing Society of Japan.

  • Temporal scalable decoding process with frame rate conversion method for surveillance video

    Wenxin Yu, Xin Jin, Satoshi Goto

    PCM 2010     297 - 308  2010


  • ROI based Complexity Reduction Algorithm for Region-of-Interest based H.264 Encoding

    Tianruo Zhang, Minghui Wang, Chen Liu, Satoshi Goto

    APCCAS 2010    2010


  • Human Tracking System for Automatic Video Surveillance with Particle Filters

    Axel Beaugendre, Hiroyoshi Miyano, Eiki Ishidera, Satoshi Goto

    APCCAS 2010    2010


  • A High-Speed CPA Attack based on Wave Integral

    Guoyu Qian, Ying Zhou, Yueying Xing, Hongying Liu, Yukiyasu Tsunoo, Satoshi Goto

    SCIS 2010    2010

  • Hierarchical low complexity decoding with frame-skipping for surveillance video

    Wenxin Yu, Xin Jin, Satoshi Goto

    ITC-CSCC 2010     1152 - 1155  2010

  • High Parallel Variation Banyan Network Based Permutation Network for Reconfigurable LDPC Decoder

    Xiao Peng, Zhixiang Chen, Xiongxin Zhao, Fumiaki Maehara, Satoshi Goto

    ASAP 2010     233 - 238  2010

  • Fast Inter Mode Decision Algorithm Based on Residual Feature

    Chen Liu, Tianruo Zhang, Xin Jin, Minghui Wang, Satoshi Goto

    Journal of the Institute of Image Electronics Engineers of Japan   39 ( 5 ) 663 - 671  2010

     View Summary

    H.264/AVC introduces variable block size motion estimation (VBSME), which imposes a huge computational cost on the encoder. In this paper, a novel fast inter mode decision algorithm for H.264/AVC is proposed. The proposed algorithm evaluates the modes based on residual features. The residual is obtained after the motion search of the P16 X 16 mode or the P8 X 8 mode. Based on the extracted residual feature, complexity and similarity are then evaluated for the inter mode decision. According to the similarity between different sub-blocks and the complexity of each sub-block, the most probable inter modes for the current block are chosen to be conducted. In the worst case, the proposed inter mode decision scheme conducts only 4 modes, which is much more efficient than conducting all 8 modes as in the conventional approach. Simulation results show that, compared to JM14.1, the proposed algorithm achieves on average 57.98% and 55.72% time savings on CIF and 720p sequences respectively, with an equivalent 0.219 dB PSNR drop and 5.55% bit rate increase for CIF and a 0.107 dB PSNR drop and 3.53% bit rate increase for 720p. Compared to an existing inter mode decision algorithm, the proposed algorithm achieves 10.68% and 13.26% time reduction on CIF and 720p sequences respectively, with less performance loss. © 2010, The Institute of Image Electronics Engineers of Japan. All rights reserved.

  • A Macroblock Homogeneity Detection Method and its Application for Block Size Decision in H.264/AVC

    Guifen Tian, Tianruo Zhang, Satoshi Goto

    Journal of the Institute of Image Electronics Engineers of Japan   39 ( 5 ) 672 - 681  2010

     View Summary

    The variable block sizes for intra and inter coding in H.264/AVC achieve significant coding gain compared with coding a macroblock (MB) with a fixed size. However, an extremely heavy computational burden is incurred when the Rate Distortion Optimization (RDO) process runs in a brute-force search manner to select the optimal coding block. This paper proposes an MB homogeneity detection method to accelerate H.264/AVC intra and inter coding. The luminance values of all pixels in an MB are used to calculate an entropy feature, which is defined as the MB's spatial homogeneity. Based on the homogeneity judgment, the 16 X 16 or 4 X 4 block size is selected for intra coding. Meanwhile, either the large blocks in {16 X 16, 16 X 8, 8 X 16} or the sub-blocks in {8 X 8, 8 X 4, 4 X 8, 4 X 4} are chosen for inter coding. In particular, a cost function is defined to select a near-optimal threshold for choosing the block size. The proposed methods are verified on a wide range of video sequences with different spatial and motion characteristics. Extensive simulations demonstrate that a consistent encoding gain is achieved for all videos with different motion and spatial features. Encoding complexity for intra coding alone can be reduced by 31%-34%, and the time saving for inter mode decision is 43.7%-58.7%, both with negligible loss in bitrate and PSNR. © 2010, The Institute of Image Electronics Engineers of Japan. All rights reserved.

  • Encoder Adaptable Difference Detection for Low Power Video Compression in Surveillance System

    Xin Jin, Satoshi Goto

    PCM 2010     285 - 296  2010


  • An Efficient Frame Loss Error Concealment Scheme With Tentative Projection Based on H.264/AVC

    Hao Sun, Peilin Liu, Jun Wang, Satoshi Goto

    PCM 2010     394 - 404  2010


  • Multi scale block histogram of template feature for pedestrian detection

    Shaopeng Tang, Satoshi Goto

    ICIP 2010     3493 - 3496  2010


  • Error concealment considering error propagation inside a frame

    Jun Wang, Yichun Tang, Hao Sun, Satoshi Goto

    2010 IEEE International Workshop on Multimedia Signal Processing, MMSP2010     394 - 399  2010

     View Summary

    Transmission of compressed video over error-prone channels may result in packet losses or errors, which can significantly degrade image quality. The degradation becomes even worse in 1Seg video broadcasting, which is widely used in Japan and Brazil for mobile phone TV service, where errors are drastically increased and lost areas are contiguous. Therefore, the errors in earlier concealed MBs (macroblocks) may propagate to MBs concealed later inside the same frame (spatial domain). Error concealment (EC) is used to recover the lost data by exploiting the redundancy in videos. Aiming at spatial error propagation (SEP) reduction, this paper proposes a SEP reduction based EC (SEPEC). In SEPEC, besides the mismatch distortion in the current MB, the potential propagated mismatch distortion in the following MBs to be concealed is also minimized. Two extensions of SEPEC, SEPEC with refined search and SEPEC with multiple layer match, are also discussed. Compared with previous work, the experiments show that SEPEC achieves much better video recovery performance and an excellent trade-off between quality and computation cost in 1Seg broadcasting. ©2010 IEEE.


  • Low power parallel encoding system for video surveillance applications

    Xin Jin, Kun Ba, Satoshi Goto

    2010 International SoC Design Conference, ISOCC 2010     229 - 232  2010

     View Summary

    In this paper, an encoding system is proposed to further reduce the power consumption in parallel video encoding based on difference detection and Hilbert transform based workload estimation (HTWE). The proposed parallel encoding system provides an average of 40% computational complexity reduction for parallel encoding without objective performance loss. For the multi-core platform with frequency and voltage scalability, up to 78% of power reduction can be achieved. ©2010 IEEE.


  • Partial decoding scheme for H.264/AVC decoder

    Chen Liu, Xin Jin, Tianruo Zhang, Satoshi Goto

    ISPACS 2010 - 2010 International Symposium on Intelligent Signal Processing and Communication Systems, Proceedings    2010

     View Summary

    In this paper, a region of interest (ROI) oriented partial H.264/AVC decoding scheme is proposed to enable low-power and low-complexity display of HD video content on low-resolution screens. The proposed scheme is a macroblock-level partial decoding, which employs several strategies to maintain good video quality in the ROI. The simulation results show that the proposed scheme provides better visual quality and very limited PSNR loss in the ROI, while achieving much lower computational complexity and power consumption. On average, it reduces the decoding time by about 40.66%, with a PSNR loss of up to 0.77 dB in the ROI among the test cases. The partially decoded sequences maintain more details in the ROI and thus provide a better user experience. This proposal is especially useful for portable devices. © 2010 IEEE.


  • Difference detection based early mode termination for depth map coding in MVC

    Minghui Wang, Xin Jin, Satoshi Goto

    28th Picture Coding Symposium, PCS 2010     502 - 505  2010

     View Summary

    Depth map coding is a new topic in multiview video coding (MVC) following the development of depth-image-based rendering (DIBR). Since a depth map is monochromatic and has less texture than a color map, a fast algorithm is both necessary and possible to reduce the computational burden of the encoder. This paper proposes a difference detection based early mode termination strategy. The difference detection (DD) algorithms are categorized into reconstructed frame based (RDD) and original frame based (ODD). A simplified ODD (sODD) strategy is also proposed. Early mode termination based on each of these three DD algorithms is implemented and evaluated in the Joint Multiview Video Coding (JMVC) reference software version 8.0. Simulation results indicate that the RDD based one has no performance loss and reduces runtime by 25% on average. The ODD and sODD based ones save 54.3% and 43.6% of runtime respectively, with an acceptable R-D performance loss. © 2010 IEEE.


  • A Noise Detection-Filtration Method for Improving the SR of DPA Attack

    Yueying Xing, Ying Zhou, Guoyu Qian, Hongying Liu, Satoshi Goto, Yukiyasu Tsunoo

    SCIS 2010    2010

  • Efficient VLSI Architectures for Ultra High Definition H.264/AVC Deblocking Filter

    Jinjia Zhou, Satoshi Goto

    ASP-DAC2010    2010

  • An adaptive bandwidth reduction scheme for video coding

    Liu Song, Dajiang Zhou, Xin Jin, Peilin Liu, Satoshi Goto

    ISCAS 2010     401 - 404  2010


  • Bus via reduction based on floorplan revising

    Ou He, Sheqin Dong, Jinian Bian, Satoshi Goto, Chung-Kuan Cheng

    Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI     9 - 14  2010

     View Summary

    As a global interconnection, bus is critical for chip performance in deep submicron technology. Reducing bus routing vias will facilitate the lithography and give bus routing a higher yield and also a higher performance. In this paper, we present a floorplan revising method to minimize the number of reducible routing vias with a controllable loss on the chip area and wirelength. Therefore, it is easy to make a proper tradeoff between via reduction and revising loss. Experiments show that our method reaches a 96.2% and 93.5% reduction of routing vias, which is close to 100% and runs fast. Besides, our revising is friendly to all third-party floorplanners, which can be applied to any existing floorplans to reduce vias. It is also scalable to larger benchmarks. © 2010 ACM.


  • Ultra low bit rate video coding for surveillance system

    Muchen Li, Xin Jin, Satoshi Goto

    1st International Conference on Green Circuits and Systems, ICGCS 2010     611 - 614  2010

     View Summary

    In surveillance systems, ultra low bit rate video coding is necessary to reduce storage and bandwidth consumption, especially for high definition video coding. This paper proposes a new architecture that reduces the bit rate further than H.264/AVC. An algorithm is proposed that detects the difference between adjacent frames and skips the static frames in the original video. Moreover, this paper presents an adaptive coding method that codes the active regions with traditional slices and the background with the proposed S slice. The S slice contains fewer syntax elements than the conventional P slice. The simulation results demonstrate that a great bit reduction is achieved by these two schemes. © 2010 IEEE.


  • A heuristic strategy for upper-body pose estimation in image

    Yijun Lu, Ning Jiang, Shaopeng Tang, Satoshi Goto

    ITC-CSCC 2010     124 - 127  2010

  • The Hybrid of dynamic and static allocation directory for cache coherence

    Gang He, Xun He, Satoshi Goto

    ITC-CSCC 2010    2010

  • A constant rate bandwidth reduction architecture with adaptive compression mode decision for video decoding

    Liu Song, Dajiang Zhou, Xin Jin, Satoshi Goto

    EUSIPCO-2010     2017 - 2021  2010

  • A Novel Hardware-friendly Self-adjustable Offset Min-sum Algorithm for ISDB-S2 LDPC Decoder

    Wen Ji, Makoto Hamaminato, Hiroshi Nakayama, Satoshi Goto

    EUSIPCO 2010     1394 - 1398  2010

  • Adaptive solution of temporal scalable decoding process with frame rate conversion method for surveillance video

    Wenxin Yu, Xin Jin, Satoshi Goto

    The 18th International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS2010)    2010


  • CPA Attack with Switching Distance Model on AES ASIC Implementation

    Hongying Liu, Guoyu Qian, Yukiyasu Tsunoo, Satoshi Goto

    SCIS 2010    2010

  • An Early Stopping Criterion for Decoding LDPC Codes in WiMAX and WiFi Standards

    Zhixiang Chen, Xiongxin Zhao, Xiao Peng, Dajiang Zhou, Satoshi Goto

    ISCAS 2010     473 - 476  2010


  • AES key recovery based on Switching Distance model

    Hongying Liu, Guoyu Qian, Satoshi Goto, Yukiyasu Tsunoo

    3rd International Symposium on Electronic Commerce and Security, ISECS 2010     218 - 222  2010

     View Summary

    As one of the effective side-channel attacks that threaten the security of cryptographic devices, Correlation Power Analysis (CPA) attacks exploit the linear relation between the known power consumption and the predicted power consumption of cryptographic devices to recover keys. A robust cryptographic algorithm should withstand cryptanalysis of both software and hardware implementations. Research has focused on the security examination of AES (Advanced Encryption Standard). In this paper, we present a CPA attack with the Switching Distance model against an AES implementation on ASIC. Compared with the Hamming Distance leakage model, the number of power traces needed to recover keys is reduced by as much as 25%. This result deserves more attention from security experts. © 2010 IEEE.


  • An Advanced Hierarchical Motion Estimation Scheme With Lossless Frame Recompression For Ultra High Definition Video Coding

    Xuena Bao, Dajiang Zhou, Peilin Liu, Satoshi Goto

    ICME 2010     820 - 825  2010


  • 2 μW AES Core with DPA Attack-Countermeasure

    Yibo Fan, Yukiyasu Tsunoo, Satoshi Goto

    ACM ISLPED 2010    2010

  • Correlation Power Analysis based on Switching Glitch Model

    Hongying Liu, Guoyu Qian, Satoshi Goto, Yukiyasu Tsunoo

    WISA 2010    2010


  • A high parallelism LDPC decoder with an early stopping criterion for WiMax and WiFi application

    Zhixiang Chen, Xiongxin Zhao, Xiao Peng, Dajiang Zhou, Satoshi Goto

    IPSJ Transactions on System LSI Design Methodology   3   292 - 302  2010

     View Summary

    In this paper we propose a synthesizable LDPC decoder IP core for WiMax and WiFi applications. Two new techniques are applied in the proposed decoder to improve the decoding performance. Firstly, a high parallelism permutation network (PN) is proposed to perform the circulant shift according to the parity check matrix (PCM) defined in WiMax and WiFi standards. By using the proposed PN, at most, four independent code frames with small code length are decoded concurrently, which largely improves the decoding throughput (2-4 times). Secondly, a fast early stopping criterion specialized for WiMax and WiFi LDPC code is proposed to reduce the average iteration number. Unlike the early works, by utilizing our proposed stopping criterion, the decoding will be stopped when all the information bits of a code frame are corrected even if there are still some errors in redundant part. Experiment results show that, it can reduce up to 20% iteration numbers compared to popular used stopping criterion. © 2010 Information Processing Society of Japan.

  • Region-of-Interest based Preprocessing for H.264/AVC Encoding

    Minghui Wang, Tianruo Zhang, Chen Liu, Satoshi Goto

    Journal of the Institute of Image Electronics Engineers of Japan   39 ( 5 ) 682 - 691  2010

     View Summary

    H.264/AVC achieves a low bit-rate video stream that meets the requirements of video communication. The problem with H.264/AVC is its large computational burden. Thus a fast algorithm should be adopted to reduce the computational burden and fit the limited power of mobile devices. This paper uses a region-of-interest (ROI) detector to locate an “important” region and applies unequal coding in the encoder engine according to the ROI. Several coding parameters, including the quantization parameter (QP), the candidates for mode decision, the number of reference frames and the search range of motion estimation, are adaptively adjusted at the macroblock (MB) level. This design is decoding-friendly. Experimental results show that a large amount of computation is saved and the subjective visual quality is kept or even improved. © 2010, The Institute of Image Electronics Engineers of Japan. All rights reserved.

  • A Bandwidth Reduction Scheme and its VLSI Implementation for H.264/AVC Motion Vector Decoding

    Jinjia Zhou, Dajiang Zhou, Gang He, Satoshi Goto

    PCM 2010     52 - 61  2010


  • Hilbert Transform based Workload Estimation for Low Power Surveillance Video Compression

    Xin Jin, Satoshi Goto

    ICIP 2010     4461 - 4464  2010


  • Data conflict resolution for layered LDPC decoding algorithm by selective recalculation

    Wen Ji, Makoto Hamaminato, Hiroshi Nakayama, Satoshi Goto

    Proceedings - 2010 3rd International Congress on Image and Signal Processing, CISP 2010   6   2985 - 2989  2010

     View Summary

    Layered LDPC decoding algorithm is known to achieve high Bit Error Rate (BER) performance and high throughput for LDPC decoders. However, for ISDB-S2 (Integrated Services Digital Broadcasting via Satellite - Second Generation) LDPC decoder, applying layered algorithm directly will result in data conflict problem. In this paper, a novel selective recalculation method is proposed to solve the data conflict problem. It determines the inaccurately calculated values based on a recalculation decision rule, and correct them accordingly. By applying this selective recalculation method, the layered algorithm can achieve conflict free BER performance. Simulation results demonstrate that the proposed method can achieve 0.1dB gain for the code with most conflicts in ISDB-S2, under the same BER performance compared to the previous strategy to solve data conflict problem. ©2010 IEEE.


  • A dynamic slice-resize algorithm for fast H.264/AVC parallel encoder

    Kun Ba, Xin Jin, Satoshi Goto

    ISPACS 2010 - 2010 International Symposium on Intelligent Signal Processing and Communication Systems, Proceedings    2010

     View Summary

    With the development of multi-core systems, parallel encoders with fast algorithms are widely used to improve video encoding speed. In this paper, a dynamic slice-resize algorithm is proposed to balance workload and increase encoding speed for a slice-level parallel encoder. The algorithm retrieves the encoding workload from previously encoded frames and dynamically changes the slice size for the following frames to balance the encoding workload among slices. By applying the proposed algorithm, an average encoding speed improvement of 21% can be achieved for a parallel video encoder without loss in objective performance. © 2010 IEEE.


  • Interactive partial video decoding for viewing resolution adaptation

    Chen Liu, Xin Jin, Tianruo Zhang, Satoshi Goto

    2010 International SoC Design Conference, ISOCC 2010     244 - 247  2010

     View Summary

    High-definition (HD) and super-high-definition (SHD) videos are becoming more and more popular in various video applications, including video capture and playback on portable devices. Because of the resolution mismatch between HD/SHD video and the low-resolution screens of portable devices, the video is fully decoded and then down-sampled for display, which wastes both computational power and memory bandwidth. In this paper, a user-defined region of interest (ROI) oriented partial decoding scheme for H.264/AVC is proposed to achieve low power and good subjective visual quality of decoding/display for viewing resolution adaptation. The proposed partial decoding scheme is based on adaptive ROI tracking. Motion-vector (MV) adaptation is adopted to maintain the visual quality in the ROI. The simulation results show that the displayed partially decoded sequence provides better subjective visual quality, more details, and a very small PSNR drop in the ROI compared to the original one. The partial decoder achieves about a 48.34% decoding time reduction on average, which means lower computational complexity and power consumption. The proposal is especially useful for displaying HD video on portable devices in which battery life is a crucial factor. ©2010 IEEE.


  • Floorplanning and topology generation for application-specific network-on-chip

    Bei Yu, Sheqin Dong, Song Chen, Satoshi Goto

    ASP-DAC2010    2010


  • A revisit to voltage partitioning problem

    Tao Lin, Sheqin Dong, Bei Yu, Song Chen, Satoshi Goto

    Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI     115 - 118  2010

     View Summary

    We revisit the voltage partitioning problem when the mapped voltages of functional units are predetermined. If energy consumption is estimated by the formulation E = CV², a published work claimed this problem was NP-hard. We clarify that it is polynomially solvable and then propose an optimal algorithm whose time complexity is O(nk + k²d), the best so far, where n, k, and d are respectively the numbers of functional units, available supply voltages, and voltages employed in the final design. In reality, when leakage power is considered, the energy-voltage curve is not simply monotonically increasing, and there is still no optimal polynomial-time algorithm. However, under the assumption that the energy-voltage curve is quasiconvex, which is also a good approximation of the actual situation, the optimal solution can be obtained in time O(nk²). Experimental results show that our algorithms are more efficient than previous works. © 2010 ACM.


  • A Lossless Frame Recompression Scheme for Reducing DRAM Power in Video Encoding

    Xuena Bao, Dajiang Zhou, Satoshi Goto

    ISCAS 2010     677 - 680  2010


  • Self-adjustable offset min-sum algorithm for ISDB-S2 LDPC decoder

    Wen Ji, Makoto Hamaminato, Hiroshi Nakayama, Satoshi Goto

    IEICE Electron. Express   Vol. 7, No. 17   1283 - 2010  2010

  • Low power surveillance video coding system

    Xin Jin, Kun Ba, Satoshi Goto

    ICME 2010     1156 - 1157  2010


  • A BER performance-aware early termination scheme for layered LDPC decoder

    Xiongxin Zhao, Zhixiang Chen, Xiao Peng, Dajiang Zhou, Satoshi Goto

    IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation     416 - 419  2010

     View Summary

    This paper presents a novel early termination scheme for layered LDPC decoder. By solving the bit error rate (BER) performance degradation which will occur when other early termination schemes are applied in layered LDPC decoder, the proposed method achieves very fast termination speed without BER performance loss. It is the best solution for BER performance-aware layered LDPC decoders, such as satellite video broadcasting applications. ©2010 IEEE.


  • Rapid face detection using a multi-mode cascade and Separate Haar Feature

    Ning Jiang, Yijun Lu, Shaopeng Tang, Satoshi Goto

    ISPACS 2010 - 2010 International Symposium on Intelligent Signal Processing and Communication Systems, Proceedings    2010

     View Summary

    In this paper, we first describe a new feature for fast and accurate face detection, called the Separate Haar Feature. Second, we describe a multi-mode detection algorithm to improve the detection rate. There are three key contributions. The first is the "Separate Haar Feature", which adds a don't-care area between the rectangles of a Haar Feature and yields some more efficient features. The second is the algorithm for selecting the best width for this don't-care area and the false alarm rate for each stage in "Learning". The width selection algorithm is proposed to reduce the total number of learning features and thus the memory used. We also use a smaller false alarm rate for each stage and fewer stages when training the detector, which improves the hit rate within the same detection time. Finally, we propose a multi-mode detection algorithm in the cascade detection process to improve the detection rate. © 2010 IEEE.


  • Intra Prediction Architecture for H.264/AVC QFHD Encoder

    Gang He, Dajiang Zhou, Jinjia Zhou, Satoshi Goto

    PCS2010     450 - 453  2010


  • Voltage and Level-Shifter Assignment Driven Floorplanning

    Bei Yu, Sheqin Dong, Song Chen, Satoshi Goto


     View Summary

    Low Power Design has become a significant requirement when the CMOS technology entered the nanometer era. Multiple-Supply Voltage (MSV) is a popular and effective method for both dynamic and static power reduction while maintaining performance. Level shifters may cause area and Interconnect Length Overhead (ILO), and should be considered at both floorplanning and post-floorplanning stages. In this paper, we propose a two phases algorithm framework, called VLSAF, to solve voltage and level shifter assignment problem. At floorplanning phase, we use a convex cost network flow algorithm to assign voltage and a minimum cost flow algorithm to handle level-shifter assignment. At post-floorplanning phase, a heuristic method is adopted to redistribute white spaces and calculate the positions and shapes of level shifters. The experimental results show VLSAF is effective.

  • A 48 Cycles/MB H.264/AVC Deblocking Filter Architecture for Ultra High Definition Applications

    Dajiang Zhou, Jinjia Zhou, Jiayi Zhu, Satoshi Goto


     View Summary

    In this paper, a highly parallel deblocking filter architecture for H.264/AVC is proposed to process one macroblock in 48 clock cycles and give real-time support to QFHD@60 fps sequences at less than 100 MHz. Four edge filters, organized in two groups for simultaneously processing vertical and horizontal edges, are applied in this architecture to enhance its throughput. As parallelism increases, pipeline hazards arise owing to the latency of the edge filters and the data dependency of the deblocking algorithm. To solve this problem, a zig-zag processing schedule is proposed to eliminate the pipeline bubbles. The data path of the architecture is then derived according to the processing schedule and optimized through data flow merging, so as to minimize the cost of logic and internal buffer. Meanwhile, the architecture's data input rate is designed to be identical to its throughput, and the transmission order of the input data also matches the zig-zag processing schedule. Therefore no intercommunication buffer is required between the deblocking filter and its preceding component for speed matching or data reordering. As a result, only one 24x64 two-port SRAM is required as an internal buffer in this design. When synthesized with the SMIC 130 nm process, the architecture costs a gate count of 30.2 k, which is competitive considering its high performance.

  • An Efficient Motion Vector Coding Scheme Based on Prioritized Reference Decision

    Dajiang Zhou, Jinjia Zhou, Satoshi Goto


     View Summary

    In the latest video coding frameworks, the efficiency of motion vector (MV) coding is becoming increasingly important because of the growing bit rate portion of motion information. However, neither the conventional median predictor nor newer schemes such as the minimum bit rate prediction (MBP) scheme and the hybrid scheme can effectively eliminate the local redundancy of motion vectors. In this paper, we present the prioritized reference decision scheme for efficient motion vector coding, based on the H.264/AVC framework. This scheme makes use of a boolean indicator to specify whether or not the median predictor is to be used for the current MV. If not, the median prediction is considered unsuitable for the current MV, and this information is used to refine the possible space of a group of reference MVs, including 4 neighboring MVs and the zero MV. This group of MVs is organized as a prioritized list so that the reference MV with the highest priority is selected as the prediction value. Furthermore, the boolean indicators are coded into the modified code words of mb_type and sub_mb_type, so as to reduce the overhead. By applying the proposed scheme, the structure and applicability problems of the state-of-the-art MBP scheme are overcome. Experimental results show that the proposed scheme achieves a considerable reduction of bits for MVDs compared with the conventional median prediction algorithm. It also achieves better and much more stable performance than MBP-based MV coding.

  • An Ultra-Low Bandwidth Design Method for MPEG-2 to H.264/AVC Transcoding

    Xianghui Wei, Takeshi Ikenaga, Satoshi Goto


     View Summary

    Motion estimation (ME) is a computation and data intensive module in video coding system. The search window reuse methods play a critical role in bandwidth reduction by exploiting the data locality in video coding system. In this paper, a search window reuse method (Level C+) is proposed for MPEG-2 to H.264/AVC transcoding. The proposed method is designed for ultra-low bandwidth application, while the on-chip memory is not a main constraining factor. By loading search window for the motion estimation unit (MEU) and applying motion vector clipping processing, each MB in MEU can utilize both horizontal and vertical search reuse. A very low bandwidth level (R-alpha < 2) can be achieved with an acceptable on-chip memory.

  • HDTV1080p H.264/AVC Encoder Chip Design and Performance Analysis

    Zhenyu Liu, Yang Song, Ming Shao, Shen Li, Lingfeng Li, Shunichi Ishiwata, Masaki Nakagawa, Satoshi Goto, Takeshi Ikenaga

    IEEE JOURNAL OF SOLID-STATE CIRCUITS   44 ( 2 ) 594 - 608  2009.02

     View Summary

    An H.264/AVC baseline-profile real-time encoder for HDTV-1080p at 30 fps is proposed in this paper. On the basis of the specifications and algorithm optimizations, the dedicated hardware engines and one 32-bit Media embedded Processor (MeP) equipped with hardware extensions are mapped into the three-stage macroblock pipelining system architecture. This paper describes the design considerations for the chief components, including high-throughput integer motion estimation, data-reusing fractional motion estimation, and hardware-friendly mode reduction for intra prediction. The 11.5 Gbps 64 Mb System-in-Silicon DRAM is embedded to alleviate the external memory bandwidth. Using TSMC one-poly six-metal 0.18 μm CMOS technology, the prototype chip is implemented with 1140 k logic gates and 108.3 KB internal SRAM. The SoC core occupies 27.1 mm² die area and consumes 1.41 W at 200 MHz execution speed in typical work conditions.

  • Parallel HD Encoding on CELL

    Xun He, Xiangzhong Fang, Ci Wang, Satoshi Goto

    ISCS2009    2009


  • Prioritized reference decision for efficient motion vector coding

    Dajiang Zhou, Jinjia Zhou, Satoshi Goto

    ISCAS 2009    2009


  • Composite modeling of optical flow for artifacts reduction

    Xin Jin, Satoshi Goto, King Ni Ngan

    ICME2009    2009



    Jinjia Zhou, Dajiang Zhou, Hang Zhang, Yu Hong, Peilin Liu, Satoshi Goto

    ICME2009    2009


  • An Indoor Localization System with RFID Passive Tags

    Hongying Liu, Satoshi Goto, Junhuai Li

    UCS'2009    2009

  • Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector

    Tianruo Zhang, Chen Liu, Minghui Wang, Satoshi Goto

    MMSP    2009


  • Multi-voltage and level-shifter assignment driven floorplanning

    Bei Yu, Sheqin Dong, Satoshi Goto

    ASICON 2009 - Proceedings 2009 8th IEEE International Conference on ASIC     1264 - 1267  2009

    As technology scales, low power design has become a significant requirement for SoC designers. Among the existing techniques, Multiple-Supply Voltage (MSV) is a popular and effective method to reduce both dynamic and static power. Besides, level shifters consume area and delay, and should be considered during floorplanning. In this paper, we present a new floorplanning system, called MVLSAF, to solve the multi-voltage and level-shifter assignment problem. We use a convex cost network flow algorithm to assign an arbitrary number of legal working voltages and a minimum cost flow algorithm to handle level-shifter assignment. The experimental results show MVLSAF is effective. ©2009 IEEE.


  • Difference Detection with Encoder Adaptability for Low Complexity Surveillance Video Compression

    Xin Jin, Satoshi Goto

    ISPACS 2009    2009


  • A Memory Efficient Check Message Quantization Scheme for LDPC decoder

    Zhixiang Chen, Xiongxin Zhao, Satoshi Goto

    ITC-CSCC2009    2009


    Xiongxin ZHAO, Zhixiang CHEN, Satoshi Goto

    ITC-CSCC2009    2009

  • Pre-processor of the region-of-interest based H.264 encoder for low power application

    Minghui Wang, Tianruo Zhang, Satoshi Goto

    ASICON    2009


  • A highly efficient inverse transform architecture for multi-standard HDTV decoder

    Hang Zhang, Peilin Liu, Yu Hong, Dajiang Zhou, Satoshi Goto

    ASICON    2009


  • An ultra-low bandwidth design method for MPEG-2 to H.264/AVC transcoding

    Xianghui Wei, Takeshi Ikenaga, Satoshi Goto

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   E92-A ( 4 ) 1072 - 1079  2009

    Motion estimation (ME) is a computation and data intensive module in video coding system. The search window reuse methods play a critical role in bandwidth reduction by exploiting the data locality in video coding system. In this paper, a search window reuse method (Level C+) is proposed for MPEG-2 to H.264/AVC transcoding. The proposed method is designed for ultra-low bandwidth application, while the on-chip memory is not a main constraining factor. By loading search window for the motion estimation unit (MEU) and applying motion vector clipping processing, each MB in MEU can utilize both horizontal and vertical search reuse. A very low bandwidth level (Rα < 2) can be achieved with an acceptable on-chip memory. Copyright © 2009 The Institute of Electronics, Information and Communication Engineers.

  • Optical flow based DC surface compensation for artifacts reduction

    Xin Jin, Satoshi Goto, King Ni Ngan

    PCS 2009    2009



  • A Study of Effect of Oscilloscope Parameters on DPA Attacks

    Guoyu Qian, Yibo Fan, Yukiyasu Tsunoo, Satoshi Goto

    2009 International Conference on Information Security and Privacy    2009


    Zhenxing CHEN, Satoshi GOTO

    CISP 2009    2009


  • Implementation of LDPC decoder for 802.16e

    Xiao PENG, Satoshi GOTO

    ASICON    2009


  • VLSI architecture of a low complexity face detection algorithm for real-time video encoding

    Tianruo Zhang, Minghui Wang, Chen Liu, Satoshi Goto

    ASICON    2009


  • A Weighted Statistical Analysis of DPA Attack on an ASIC AES Implementation

    Guoyu QIAN, Ying ZHOU, Yueying XING, Yibo FAN, Yukiyasu TSUNOO, Satoshi GOTO

    ASICON    2009


  • High Profile Intra Prediction Architecture for H.264

    Xun He, Dajiang Zhou, Jinjia Zhou, Satoshi Goto

    International SoC Design Conference    2009


  • A High-Parallelism Reconfigurable Permutation Network for IEEE 802.11n and 802.16e LDPC Decoder

    Zhixiang Chen, Xiongxin Zhao, Xiao Peng, Dajiang Zhou, Satoshi Goto

    ISPACS 2009    2009


  • The Study and Application of Tree-based RFID Complex Event Detection Algorithm

    Hongying Liu, Satoshi Goto, Junhuai Li

    ISECS'2009    2009

  • Pedestrian Detection with an Ensemble of Localized Features

    Shaopeng TANG, Satoshi GOTO

    ISCAS 2009    2009


  • Side match distortion based adaptive error concealment order for 1Seg video broadcasting application

    Jun Wang, Yichun Tang, Shen Li, Shunichi Ishiwata, Satoshi Goto

    ISCAS 2009    2009


  • Integrated Interlayer Via Planning and Pin Assignment for 3D ICs

    Xu He, Sheqin Dong, Xianlong Hong, Satoshi Goto

    ACM SLIP 2009    2009


  • Voltage-Island Driven Floorplanning Considering Level-Shifter Positions

    Bei Yu, Sheqin Dong, Song Chen, Satoshi Goto

    ACM GLSVLSI 2009     51 - 59  2009


  • Multi-Voltage and Level-Shifter Assignment Driven Floorplanning

    Bei Yu, Sheqin Dong, Satoshi Goto

    ASICON2009    2009


  • An Spatial Error Propagation Reduction based Temporal Error Concealment for 1Seg Video Broadcasting

    Jun Wang, Yichun Tang, Satoshi Goto

    ISPACS 2009    2009


  • A 1080p@60fps multi-standard video decoder chip designed for power and cost efficiency in a system perspective

    Dajiang Zhou, Zongyuan You, Jiayi Zhu, Ji Kong, Yu Hong, Xianmin Chen, Xuewen He, Chen Xu, Hang Zhang, Jinjia Zhou, Ning Deng, Peilin Liu, Satoshi Goto

    Symp. VLSI Circuits 2009    2009

  • Random Walk Algorithm for Large Thermal RC Network Analysis

    Jun Guo, Sheqin Dong, Satoshi Goto

    ASICON    2009


  • A 360Mbin/s CABAC Decoder for H.264/AVC Level 5.1 Applications

    Yu Hong, Peilin Liu, Hang Zhang, Zongyuan You, Dajiang Zhou, Satoshi Goto

    ISOCC 2009    2009


  • A New Architecture for High Performance Intra Prediction in H.264 Decoder

    Xun He, Dajiang Zhou, Jinjia Zhou, Satoshi Goto

    ISPACS 2009    2009


  • Residual analysis based fast inter mode decision for H.264/AVC

    Chen Liu, Tianruo Zhang, Xin Jin, Minghui Wang, Satoshi Goto

    ISPACS 2009 - 2009 International Symposium on Intelligent Signal Processing and Communication Systems, Proceedings     339 - 342  2009

    H.264/AVC introduces the variable block size macroblock mode for motion estimation, which brings huge computational cost. In this paper, a novel fast inter macroblock mode decision algorithm for H.264/AVC is proposed. The proposed algorithm evaluates the modes based on residual information. The residual is obtained after the motion search of P16x16 mode or P8x8 mode. Then the characteristics of the blocks are extracted by the defined conversion, and the complexity and similarity are calculated to decide the next mode to evaluate. The mode with the minimal R-D cost is chosen as the final mode. The simulation results show that the proposed algorithm achieves 65.95% time saving, which is 1.5 times that of other algorithms, with less PSNR loss and bit rate increase. ©2009 IEEE.


  • A High Performance LDPC Decoder for IEEE802.11n Standard

    Wen Ji, Yuta Abe, Takeshi Ikenaga, Satoshi Goto

    ASP-DAC2009     127 - 128  2009


  • Human detection using motion and appearance based feature

    Shaopeng Tang, Satoshi Goto

    ICICS 2009 - Conference Proceedings of the 7th International Conference on Information, Communications and Signal Processing    2009

    An approach to detecting moving and standing humans in video is proposed in this paper. Human detection in video is a difficult problem because of the motion of the human, the camera and the background. To detect moving humans, dense optical flow is calculated from two consecutive frames to represent human motion. A motion based feature is extracted from the optical flow field. It not only represents the global motion caused by the boundary of the human body, but also contains the local motion caused by the limbs. The motion based feature is combined with a histogram of template feature, which is designed to detect standing humans, to form the final feature for detection. Experiments on the CAS dataset show that this feature has more discriminative ability than other motion based features. Besides, this feature is easier to accelerate in hardware, which makes it suitable for real time applications. ©2009 IEEE.


  • A New DCT-Domain Distortion Model for MB-Level Quality Control

    Xun He, Xianmin Chen, Peilin Liu, Satoshi Goto


    For many applications like video surveillance and digital cinema, it is desirable to encode video content with constant video quality. However, although constant quantization parameter is used in video encoder, quality fluctuation in macroblock level is still very large and standard deviation of PSNR is typically about 1.8dB. Since distortion is introduced in the process of quantization of DCT coefficients, we propose a new DCT-domain distortion model which utilizes the truncated tailing bits in quantization to estimate distortion in macroblock level, and QP of each block is determined by the estimated PSNR. Experimental results show that our proposed model can accurately estimate distortion of each MB before implementing quantization and the corresponding RMSE is only about 0.3dB (less than 1% of the actual value). Compared with previous works, our proposed algorithm can achieve 0.57dB enhancement in quality stabilization.



    Minghui WANG, Tianruo ZHANG, Chen LIU, Satoshi GOTO

    PCS 2009    2009


  • Block-pipelining cache for motion compensation in high definition H.264/AVC video decoder

    Xianmin Chen, Peilin Liu, Jiayi Zhu, Dajiang Zhou, Satoshi Goto

    ISCAS 2009    2009


  • An efficient motion vector coding scheme based on prioritized reference decision

    Dajiang Zhou, Jinjia Zhou, Satoshi Goto

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   E92-A ( 8 ) 1978 - 1985  2009

    In the latest video coding frameworks, efficiency of motion vector (MV) coding is becoming increasingly important because of the growing bit rate portion of motion information. However, neither the conventional median predictor, nor the newer schemes such as the minimum bit rate prediction scheme and the hybrid scheme, can effectively eliminate the local redundancy of motion vectors. In this paper, we present the prioritized reference decision scheme for efficient motion vector coding, based on the H.264/AVC framework. This scheme makes use of a boolean indicator to specify whether the median predictor is to be used for the current MV or not. If not, the median prediction is considered not suitable for the current MV, and this information is used for refining the possible space of a group of reference MVs including 4 neighboring MVs and the zero MV. This group of MVs is organized to be a prioritized list so that the reference MV with highest priority is to be selected as the prediction value. Furthermore, the boolean indicators are coded into the modified code words of mb-type and sub-mb-type, so as to reduce the overhead. By applying the proposed scheme, the structure and the applicability problems with the state-of-the-art MBP scheme have been overcome. Experimental result shows that the proposed scheme achieves a considerable reduction of bits for MVDs, compared with the conventional median prediction algorithm. It also achieves a better and much stabler performance than MBP-based MV coding. Copyright © 2009 The Institute of Electronics, Information and Communication Engineers.

  • A High Speed Deblocking Filter Architecture for H.264/AVC

    Jinjia Zhou, Dajiang Zhou, Xun He, Satoshi Goto

    International SoC Design Conference    2009


  • A 64-cycle-per-MB joint parameter decoder architecture for ultra high definition H.264/AVC applications

    Jinjia Zhou, Dajiang Zhou, Xun He, Satoshi Goto

    ISPACS 2009    2009


  • A high throughput LDPC decoder design based on novel delta-value message-passing schedule

    Wen Ji, Xing Li, Takeshi Ikenaga, Satoshi Goto

    IPSJ Transactions on System LSI Design Methodology   2   122 - 130  2009

    In this paper, we propose a partially-parallel irregular LDPC decoder for the IEEE 802.11n standard targeting high throughput applications. The proposed decoder has several merits: (i) The decoder is designed based on a novel delta-value based message passing algorithm which improves the decoding throughput by redundant computation removal. (ii) Techniques such as binary sorting, parallel column operation, and high performance pipelining are used to further speed up the message-passing procedure. The synthesis result in TSMC 0.18 μm CMOS technology demonstrates that for (648,324) irregular LDPC code, our decoder can achieve an 8 times increase in throughput, reaching 418 Mbps at the frequency of 200 MHz. © 2009 Information Processing Society of Japan.

  • A QP and partition-size statistic based fuzzy algorithm for fast inter&intra mode decision in video coding

    Zhenxing Chen, Satoshi Goto

    Proceedings of the 2009 2nd International Congress on Image and Signal Processing, CISP'09    2009

    The inter motion estimation and intra motion estimation are the most important and widely used prediction techniques in various kinds of video standards. However, inter and intra predictions are rarely considered together since the contents used for reference by these 2 predictions are different. As most of the previous works focus on either inter or intra, or reluctantly combine inter and intra together, this paper proposes a fast mode decision algorithm that considers both inter and intra organically, by taking advantage of the fact that both inter and intra have hierarchical partition sizes. Moreover, in contrast to traditional mode decision algorithms that either enable or disable one specific mode, this paper proposes a fuzzy variable called "effort" which can adaptively adjust the computation paid to each inter or intra mode. This "effort" is defined as a fractional value ranging from 0 to 1. In this paper, first the correlation between partition-size selection and QP is described, then the proposed algorithm is presented by describing how to calculate and implement this "effort". Experimental results show that about 42 to 67% of the computational complexity can be saved by adopting the proposed fast mode decision algorithm. ©2009 IEEE.


  • A high speed deblocking filter architecture for H.264/AVC

    Jinjia Zhou, Dajiang Zhou, Xun He, Satoshi Goto

    2009 International SoC Design Conference, ISOCC 2009     63 - 66  2009

    In this paper, a high speed deblocking filter architecture for H.264/AVC is proposed to process one macroblock in 48 clock cycles and give real-time support to QFHD (3840x2160)@60fps sequences at less than 100MHz. 4 edge filters organized in 2 groups for simultaneously processing vertical and horizontal edges are applied in this architecture to enhance its throughput. While parallelism increases, pipeline hazards arise owing to the latency of edge filters and data dependency of deblocking algorithm. To solve this problem, a zig-zag processing schedule is proposed to eliminate the pipeline bubbles. Data path of the architecture is then derived according to the processing schedule and optimized through data flow merging, so as to minimize the cost of logic and internal buffer. Meanwhile, the architecture's data input rate is designed to be identical to its throughput, while the transmission order of input data can also match the zig-zag processing schedule. Therefore no intercommunication buffer is required between the deblocking filter and its previous component for speed matching or data reordering. As a result, only one 24x64 two-port SRAM as internal buffer is required in this design. When synthesized with SMIC 130nm process, the architecture costs a gate count of 30.2k, which is competitive considering its high performance. ©2009 IEEE.


  • A 64-cycle-per-MB joint parameter decoder architecture for ultra high definition H.264/AVC applications

    Jinjia Zhou, Dajiang Zhou, Xun He, Satoshi Goto

    ISPACS 2009 - 2009 International Symposium on Intelligent Signal Processing and Communication Systems, Proceedings     49 - 52  2009

    In this paper, VLSI architecture of a joint parameter decoder is proposed to realize the calculation of motion vector (MV), intra prediction mode (IPM) and boundary strength (BS) for ultra high definition H.264/AVC applications. To achieve an efficient design, compact data storage formats are proposed for SRAM size saving and DRAM bandwidth reduction. Moreover, a 64-cycle-per-MB pipeline with simplified control modes is designed to enhance the throughput. Experimental results show the proposed architecture is capable of real-time QFHD@60fps decoding at less than 133MHz, with 26.7k logic gates and 3.6kB SRAM. ©2009 IEEE.


  • A High Performance Partially-Parallel Irregular LDPC Decoder Based on Sum-Delta Message Passing Schedule

    Wen Ji, Yuta Abe, Takeshi Ikenaga, Satoshi Goto


    In this paper, we propose a partially-parallel irregular LDPC decoder based on the IEEE 802.11n standard targeting high throughput and small area applications. The design is based on a novel sum-delta message passing algorithm characterized as follows: (i) Decoding throughput is greatly improved by utilizing the difference value between the updated and the original value to remove redundant computations. (ii) Registers and memory are optimized to store only the frequently used messages to decrease the hardware cost. (iii) Techniques such as binary sorting, parallel column operation, high performance pipelining are used to further speed up the message passing procedure. The synthesis result in TSMC 0.18 μm CMOS technology demonstrates that for (648,324) irregular LDPC code, our decoder achieves 7.5X improvement in throughput, which reaches 402 Mbps at the frequency of 200 MHz, with 11% area reduction. The synthesis result also demonstrates the competitiveness to the fully-parallel regular LDPC decoders in terms of the tradeoff between throughput, area and power.

  • Standard Deviation and Intra Prediction Mode Based Adaptive Spatial Error Concealment (SEC) in H.264/AVC

    Jun Wang, Lei Wang, Takeshi Ikenaga, Satoshi Goto


    Transmission of compressed video over error prone channels may result in packet losses or errors, which can significantly degrade the image quality. Therefore an error concealment scheme is applied at the video receiver side to mask the damaged video. Considering there are 3 types of MBs (Macro Blocks) in a natural video frame, i.e., Textural MB, Edged MB, and Smooth MB, this paper proposes an adaptive spatial error concealment which can choose 3 different methods for these 3 different MBs. As criteria for choosing the appropriate method, 2 factors are taken into consideration. Firstly, the standard deviation of our proposed edge statistical model is exploited. Secondly, a new feature of the latest video compression standard H.264/AVC, i.e., the intra prediction mode, is also considered for criterion formulation. Compared with previous works, which are only based on deterministic measurement, the proposed method achieves the best image recovery. Subjective and objective image quality evaluations in experiments confirmed this.


  • Motion feature and Hadamard coefficient-based fast multiple reference frame motion estimation for H.264

    Zhenyu Liu, Lingfeng Li, Yang Song, Shen Li, Satoshi Goto, Takeshi Ikenaga


    In the state-of-the-art video coding standard, H.264/AVC, the encoder is allowed to search for its prediction signals among a large number of reference pictures that have been decoded and stored in the decoder to enhance its coding efficiency. Therefore, the computation complexity of the motion estimation (ME) increases linearly with the number of reference pictures. Many fast multiple reference frame ME algorithms have been proposed, whose performance, however, will be considerably degraded in the hardwired encoder design due to the macroblock (MB) pipelining architecture. Considering the limitations of the traditional four-stage MB pipelining architecture, two fast multiple reference frame ME algorithms are proposed here. First, on the basis of mathematical analysis, which reveals that the efficiency of multiple reference frames will be degraded by the relative motion between the camera and the objects, for the slow-moving MB, the authors adopt the multiple reference frames but reduce their search range. On the other hand, for the fast-moving MB, the first previous reference frame is used with the full search range during the ME processing. The mutually exclusive feature between the large search range and the multiple reference frames makes the computation saving performance of the proposed algorithm insensitive to the nature of video sequence. Second, following the Hadamard transform coefficient-based all_zeros block early detection algorithm, two early termination criteria are proposed. These methods ensure the pronounced computation saving efficiency when the encoded video has strong spatial homogeneity or temporal stationarity. Experimental results show that 72.7%-93.7% computation can be saved by the proposed fast algorithms with an average of 0.0899 dB coding quality degradation. Moreover, these fast algorithms can be combined with fast block matching algorithms to further improve their speedup performance.

  • Adaptive search range algorithms for variable block size motion estimation in H.264/AVC

    Zhenxing Chen, Yang Song, Takeshi Ikenaga, Satoshi Goto


    Compared with search pattern motion estimation (ME) algorithms, adaptive search range (ASR) algorithms are more fundamental, regular and flexible. In variable block size motion estimation (VBSME), ASR algorithms can be applied on a whole frame (frame level), on an entire macroblock which includes up to forty-one blocks (macroblock level), or on a single block (block level). On the other hand, in H.264/AVC, not the motion vectors (MVs) but the motion vector differences (MVDs) are coded, and the median motion vector predictors (median-MVPs) are used to place the search centers. In this sense, the search windows (SWs) can be thought of as centered at the positions pointed to by the median-MVPs, while the search ranges (SRs) play the role of limiting the MVDs. Thus it is reasonable to consider using MVDs to predict SRs. In this paper, three MVD based SR prediction algorithms are proposed, one at the MB level and two at the block level. VBSME based experiments are carried out to assess the proposed algorithms. The three proposed algorithms are compared with the previously proposed one given in [8] in terms of encoding quality and computational complexity.

  • A high-speed design of Montgomery multiplier

    Yibo Fan, Takeshi Ikenaga, Satoshi Goto


    With the increase of key length used in public cryptographic algorithms such as RSA and ECC, the speed of Montgomery multiplication becomes a bottleneck. This paper proposes a high speed design of Montgomery multiplier. Firstly, a modified scalable high-radix Montgomery algorithm is proposed to reduce critical path. Secondly, a high-radix clock-saving dataflow is proposed to support high-radix operation and one clock cycle delay in dataflow. Finally, a hardware-reused architecture is proposed to reduce the hardware cost and a parallel radix-16 design of data path is proposed to accelerate the speed. By using HHNEC 0.25 μm standard cell library, the implementation results show that the total cost of Montgomery multiplier is 130 KGates, the clock frequency is 180 MHz and the throughput of 1024-bit RSA encryption is 352 kbps. This design is suitable to be used in high speed RSA or ECC encryption/decryption. As a scalable design, it supports any key-length encryption/decryption up to the size of on-chip memory.

  • Reconfigurable variable block size motion estimation architecture for search range reduction algorithm

    Yibo Fan, Takeshi Ikenaga, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E91C ( 4 ) 440 - 448  2008.04

    Variable Block Size Motion Estimation (VBSME) costs a lot of computation during video coding. Search range reduction algorithm is widely used to reduce computational cost of motion estimation. Current VBSME designs are not suitable for this algorithm. This paper proposes a reconfigurable design of VBSME which can be efficiently used with search range reduction algorithm. While using proposed design, n x m reference MBs form an MB array which can be processed in parallel. n and m can be configured according to the new search range shape calculated by algorithm. In this way, the parallelism of proposed design is very flexible and can be adapted to any search range shape. The hardware resource is also fully used while performing VBSME. There are two primary reconfigurable modules in this design: PEGA (PE Group Array) and SAD comparator. By using TSMC 0.18 μm standard cell library, the implementation results show that the hardware cost of design which uses 16 PEGs (PE Groups) is about 179 K Gates, the clock frequency is 167 MHz.

  • An irregular search window reuse scheme for MPEG-2 to H.264 transcoding

    Xiang-Hui Wei, Shen Li, Yang Song, Satoshi Goto


    Motion estimation (ME) is a computation-intensive module in a video coding system. In MPEG-2 to H.264 transcoding, reusing the motion vector (MV) from MPEG-2 as the search center in the H.264 encoder is a simple but effective technique to simplify ME processing. However, directly applying the MPEG-2 MV as the search center makes it difficult to apply data reuse methods in hardware design, because of the irregular overlapping of search windows between successive macroblocks (MBs). In this paper, we propose a search window reuse scheme for transcoding, especially for HDTV application. By utilizing the similarity between neighboring MVs, the overlapping area of search windows can be regularized. Experimental results show that our method achieves an average 93.1% search window reuse rate in HDTV720p sequences with almost no video quality degradation. Compared to a transcoding method without any data reuse scheme, the bandwidth of the proposed method is reduced to 40.6%.

    DOI CiNii

  • An Unequal Secure Encryption Scheme For H.264/AVC Video Compression Standard

    Yibo FAN, Jidong WANG, Takeshi IKENAGA, Yukiyasu TSUNOO, Satoshi GOTO

    IEICE Trans. Fundamentals   E91-A ( 1 ) 12 - 21  2008

    DOI CiNii

  • A Hardware-Oriented High Precision Motion Vector Prediction Scheme for MPEG-2 to H.264 Transcoding

    Xianghui WEI, Wenming TANG, Guifen TIAN, Takeshi IKENAGA, Satoshi GOTO

    CISP     278 - 282  2008


  • A Power-saving 1Gbps Irregular LDPC Decoder based on High-efficiency Message Passing

    Wenming TANG, Wen JI, Xianghui WEI, Takeshi IKENAGA, Satoshi GOTO

    ITC-CSCC    2008

  • A Low-cost Reconfigurable Architecture for AES Algorithm

    Yibo FAN, Takeshi IKENAGA, Satoshi GOTO

    ICICS 2008    2008

  • A Motion Vector Difference Based Self-Incremental Adaptive Search Range Algorithm for Variable Block Size Motion Estimation

    Zhenxing CHEN, Qin LIU, Takeshi IKENAGA, Satoshi GOTO

    ICIP2008    2008


  • A Low Bandwidth Integer Motion Estimation Module for MPEG-2 to H.264 Transcoding

    Xianghui WEI, Wenming TANG, Guifen TIAN, Takeshi IKENAGA, Satoshi GOTO

    APCCAS2008    2008


  • A Frequency-Based Fast Block Type Decision Algorithm for intra Prediction in H.264/AVC High Profile

    Tianruo ZHANG, Guifen TIAN, Takeshi IKENAGA, Satoshi GOTO

    APCCAS2008    2008

  • A Low-cost LSI design of AES against DPA attack by hiding power information

    Yibo FAN, Takeshi IKENAGA, Yukiyasu TSUNOO, Satoshi GOTO

    The 21st Workshop on Circuits and Systems in Karuizawa    2008

  • Reconfigurable variable block size motion estimation architecture for search range reduction algorithm

    Yibo Fan, Takeshi Ikenaga, Satoshi Goto

    IEICE Transactions on Electronics   E91-C ( 4 ) 440 - 448  2008

     View Summary

    Variable Block Size Motion Estimation (VBSME) costs a lot of computation during video coding. Search range reduction algorithm is widely used to reduce computational cost of motion estimation. Current VBSME designs are not suitable for this algorithm. This paper proposes a reconfigurable design of VBSME which can be efficiently used with search range reduction algorithm. While using proposed design, n x m reference MBs form an MB array which can be processed in parallel. n and m can be configured according to the new search range shape calculated by algorithm. In this way, the parallelism of proposed design is very flexible and can be adapted to any search range shape. The hardware resource is also fully used while performing VBSME. There are two primary reconfigurable modules in this design: PEGA (PE Group Array) and SAD comparator. By using TSMC 0.18 μm standard cell library, the implementation results show that the hardware cost of design which uses 16 PEGs (PE Groups) is about 179 K Gates, the clock frequency is 167 MHz. Copyright © 2008 The Institute of Electronics, Information and Communication Engineers.

    DOI CiNii

  • An Adaptive Spatial Error Concealment(SEC) with More Accurate MB Type Decision in H.264/AVC

    Jun WANG, Lei WANG, Takeshi IKENAGA, Satoshi GOTO

    VIE2008    2008


  • Optimized 2-D SAD Tree Architecture of Integer Motion Estimation for H.264/AVC

    Yibo FAN, Takeshi IKENAGA, Satoshi GOTO

    VLSI-SoC 2008    2008

  • High Throughput Partially-Parallel Irregular LDPC Decoder Based on Delta-Value Message-Passing Schedule

    Wen JI, Xing LI, Takeshi IKENAGA, Satoshi GOTO

    VLSI-DAT   ( 220 ) 223  2008


  • An Extended Small Diamond Search Algorithm for Fast Block Motion Estimation

    Chang-Uk JEONG, Takeshi IKENAGA, Satoshi GOTO

    ITC-CSCC    2008

  • VLSI Oriented Group-based Algorithm for Multiple Reference Fractional Motion Estimation in H.264/AVC

    Wenqi You, Yao Ma, Yang Song, Yan Zhuang, Takeshi Ikenaga, Satoshi Goto

    SIP2008    2008

  • Fast VBSME design using reconfigurable hardware architecture and search range reduction algorithm

    Yibo Fan, Takeshi Ikenaga, Satoshi Goto

    SIP 2008    2008

  • A hardware architecture of CABAC encoding and decoding with dynamic pipeline for H.264/AVC

    Lingfeng Li, Yang Song, Shen Li, Takeshi Ikenaga, Satoshi Goto


     View Summary

    This paper presents a compact hardware architecture of Context-Based Adaptive Binary Arithmetic Coding (CABAC) codec for H.264/AVC. The similarities between encoding algorithm and decoding algorithm are explored to achieve remarkable hardware reuse. System-level hardware/software partition is conducted to improve overall performance. Meanwhile, the characteristics of CABAC algorithm are utilized to implement dynamic pipeline scheme, which increases the processing throughput with very small hardware overhead. Proposed architecture is implemented under 0.18 μm technology. Results show that the core area of proposed design is 0.496 mm² when the maximum clock frequency is 230 MHz. It is estimated that the proposed architecture can support CABAC encoding or decoding for HD1080i resolution at a speed of 30 frame/s.


  • A Novel Rate Control Algorithm for H.264/AVC

    Zhao MIN, Takeshi IKENAGA, Satoshi GOTO

    ITC-CSCC    2008

  • A cost-efficient partially-parallel irregular LDPC decoder based on sum-delta message passing algorithm

    Ji Wen, Yuta Abe, Takeshi Ikenaga, Satoshi Goto

    Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI     207 - 212  2008

     View Summary

    A partially-parallel decoder architecture for irregular LDPC code targeting high throughput and low cost applications is proposed. The design is based on a novel sum-delta message passing algorithm that facilitates the decoding throughput by removing redundant computations and decreases the hardware cost by optimizing the storage. Techniques such as binary sorting, parallel column operation, and high performance pipelining are used to further speed up the message passing procedure. The synthesis result in TSMC 0.18 μm CMOS technology demonstrates that for (648,324) irregular LDPC code, our decoder achieves 7.5X improvement in throughput, which reaches 402 Mbps at the frequency of 200 MHz, with 11% area reduction. Copyright 2008 ACM.


  • HyMacs: Hybrid memory access optimization based on custom-instruction scheduling

    Kang Zhao, Jinian Bian, Sheqin Dong, Yang Song, Satoshi Goto

    Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI     89 - 94  2008

     View Summary

    This paper presents an efficient hybrid memory access optimization system called HyMacs, which integrates hardware and software optimization strategies in embedded system design. First, HyMacs features a pre-configuration stage which is equipped with a memory configuration algorithm to satisfy area constraints. Then a custom instruction generation process is integrated in the system via a seed-growth algorithm under intelligent guide functions. The custom instructions help reduce the overall memory access latency and thus relieve the burden on the system through hardware mode. Finally, a data-dependency-driven scheduling algorithm is also integrated to compress the overall latency through access mode conversion. We have tested the system on a set of commonly used benchmarks, and compared the results with the previous memory access system MACCESS-opt proposed in DAC'05. The experimental results indicate a 20% improvement in total memory access latency reduction compared with MACCESS-opt, where the custom instruction generation and scheduling contribute about 15% and 5% respectively. Copyright 2008 ACM.


  • A Block Type Decision Algorithm for H.264/AVC Intra Prediction Based on Entropy Feature

    Guifen TIAN, Tianruo ZHANG, Takeshi IKENAGA, Satoshi GOTO

    APCCAS2008    2008


  • Symmetry constraint based on mismatch analysis for analog layout in SOI technology

    Jiayi Liu, Sheqin Dong, Xianlong Hong, Yibo Wang, Ou He, Satoshi Goto

    ACM Great Lakes Symposium on VLSI 2008    2008


  • An Adaptive Spatial Error Concealment for H.264/AVC Video Stream

    Jun WANG, Lei WANG, Takeshi IKENAGA, Satoshi GOTO

    SIGMAP2008    2008

  • A Hardware/Software Co-solution to Achieving High Throughput Required by Motion Estimation Part in H.264/AVC HDTV Real-time Application

    Zhenxing CHEN, Takeshi IKENAGA, Satoshi GOTO

    VLSI Design, Automation and Test    2008


  • A Novel Fast Block Type Decision Algorithm for Intra Prediction in H.264/AVC High Profile

    Tianruo ZHANG, Guifen TIAN, Takeshi IKENAGA, Satoshi GOTO

    The 21st Workshop on Circuits and Systems in Karuizawa     121 - 125  2008

  • High Throughput VLSI Architecture of a Fast Mode Decision Algorithm for H.264/AVC Intra Prediction

    Tianruo Zhang, Shen Li, Guifen Tian, Takeshi Ikenaga, Satoshi Goto


     View Summary

    Intra coding in H.264/AVC has significantly enhanced the video compression efficiency. However, computation complexity increases due to the rate-distortion (RD) based mode decision. This paper proposes a new fast mode decision algorithm in H.264/AVC intra prediction and its VLSI architecture. A new edge-detection pattern is proposed, and the edge-detection technique and the spatial mode prediction technique are combined to reduce the intra 4x4 candidate mode number from 9 to an average of 2.42. This algorithm is the only hardware-oriented algorithm which can reduce the number of 4x4 candidate modes to less than 4. The VLSI architecture of the intra mode decision module is designed with TSMC 0.18 μm CMOS technology. A maximum frequency of 285 MHz is achieved and 13.1k gates are required. High frequency, efficient processing cycle reduction and small area make this design an excellent accelerator for an HDTV 1080p@30fps real-time encoder.


  • Rate estimation of RD optimization for intra mode decision of H.264/AVC

    Yan Zhuang, Takeshi Ikenaga, Satoshi Goto

    Proceedings - 1st International Congress on Image and Signal Processing, CISP 2008   2   100 - 104  2008

     View Summary

    H.264 is the latest international video coding standard. It is a joint work by the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. H.264 contains a number of new features which make it compress video sequences much more effectively than previous standards and can provide more flexibility for application to a wide variety of network environments. But the H.264 standard also brings much more computation, especially when the rate-distortion optimization function is on. In order to reduce the complexity of the rate part of the rate-distortion cost, we propose a rate estimation method that avoids performing the actual entropy coding. The proposed estimation method is based on the CAVLC tables, and is changed adaptively according to total_coeff. The final result shows that the proposed estimation method can reduce the encoding computation by 50% in intra coding with negligible degradation of coding performance. ©2008 IEEE.


  • A high quality fast motion estimation algorithm for H.264/AVC

    Wenqi You, Yang Song, Takeshi Ikenaga, Satoshi Goto

    Proceedings - 1st International Congress on Image and Signal Processing, CISP 2008   1   375 - 379  2008

     View Summary

    This paper proposes a novel computational structure for Sum of Absolute Difference (SAD) in the Fast Full Search (FFS) algorithm to elide part of the SAD value dependency. This structure adopts an SAD Accumulating Termination Algorithm (SAD-ATA) and an Early Loop Termination Decision (ELTD). SAD-ATA is used to analyze and determine whether or not to skip the following operations of successive SAD accumulation, and ELTD controls and reduces the number of SAD comparison loops in FFS ME. The Rate-Distortion (RD) based simulation results show that the proposed algorithm accelerates ME processing by more than 92% and 20% on average compared with Full Search (FS) and Fast Full Search (FFS) in H.264 JM12.4, respectively. Meanwhile, the impact on video quality is negligible. © 2008 IEEE.
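
    The general idea of stopping SAD accumulation early can be sketched as follows. This is a minimal illustration of partial-sum termination inside a full search, not the paper's SAD-ATA or ELTD; the function names and the row-wise check granularity are assumptions made for the example.

```python
def sad_with_early_exit(cur, ref, best_so_far):
    """Accumulate SAD row by row; give up once it cannot beat best_so_far."""
    total = 0
    for cur_row, ref_row in zip(cur, ref):
        total += sum(abs(c - r) for c, r in zip(cur_row, ref_row))
        if total >= best_so_far:
            return None  # remaining rows are skipped for this candidate
    return total


def full_search(cur, candidates):
    """Return (index, SAD) of the best-matching candidate block."""
    best_index, best_sad = None, float("inf")
    for index, ref in enumerate(candidates):
        sad = sad_with_early_exit(cur, ref, best_sad)
        if sad is not None:
            best_index, best_sad = index, sad
    return best_index, best_sad
```

    Because a candidate is rejected only when its partial sum already reaches the current minimum, this form of termination does not change which candidate wins.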


  • Bandwidth Reduction Schemes for MPEG-2 to H.264 Transcoding Design

    Xianghui WEI, Wenming TANG, Guifen TIAN, Yan ZHUANG, Takeshi IKENAGA, Satoshi GOTO

    EUSIPCO 2008    2008

  • Efficiency-Complexity curve based method for evaluating adaptive search range algorithms in motion estimation

    Zhenxing Chen, Takeshi Ikenaga, Satoshi Goto


     View Summary

    The well-known Rate-Distortion (RD) curve and the average Peak Signal to Noise Ratio (PSNR) difference between two RD curves (Delta PSNR) are widely used for evaluating how well a fast motion estimation (ME) algorithm performs in encoding efficiency. Besides this measure of encoding efficiency, there usually exists another parameter, such as ME execution time or average search point number for processing one macroblock (ASP/MB), to evaluate the complexity of the fast ME algorithm. On the other hand, adaptive search range (ASR) ME algorithms have proved to be more fundamental, regular and controllable than normal fixed search pattern (FSP) ME algorithms. Thus, specifically for evaluating ASR ME algorithms, an Efficiency-Complexity (EC) curve based method is proposed and discussed in this paper.


  • An efficient fast mode decision algorithm for H.264/AVC intra prediction

    Guifen Tian, Tianruo Zhang, Xianghui Wei, Takeshi Ikenaga, Satoshi Goto


     View Summary

    Rate distortion optimization (RDO) based spatial intra prediction is an important new feature in H.264/AVC. It chooses the best mode for each MB by exhaustively examining all possible modes. Thus, drastically increased computation complexity is involved. This paper proposes an efficient fast intra mode decision algorithm. The Prewitt operator is used to extract each block's two most dominant edge directions. Based on these two directions, only a few modes are chosen as candidates. In this way, the computational load caused by RDO can be reduced remarkably. Experimental results show that more than 70% computation reduction can be achieved, with little loss of visual quality. Moreover, only addition and comparison operations are used, which makes the proposed algorithm well suited to hardware implementation.
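
    A rough sketch of edge-direction estimation with the Prewitt operator is shown below. It is a simplification under stated assumptions: it classifies a block into one coarse direction rather than extracting the two most dominant directions described in the abstract, and the function name and the labels are hypothetical.

```python
def prewitt_dominant_direction(block):
    """Classify the dominant edge of a square pixel block (list of rows)."""
    size = len(block)
    gx_total = gy_total = 0
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            # Horizontal gradient: right column minus left column.
            gx = (block[y - 1][x + 1] + block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - block[y][x - 1] - block[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (block[y + 1][x - 1] + block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - block[y - 1][x] - block[y - 1][x + 1])
            gx_total += abs(gx)
            gy_total += abs(gy)
    if gx_total == 0 and gy_total == 0:
        return "flat"
    # A strong horizontal gradient indicates a vertical edge; ties go to horizontal.
    return "vertical" if gx_total > gy_total else "horizontal"
```

    The sketch uses only additions, subtractions, absolute values, and comparisons, which is the property that makes this kind of pre-analysis attractive for hardware.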


  • An MRF model-based approach to the detection of rectangular shape objects in color images

    Yangxing Liu, Takeshi Ikenaga, Satoshi Goto

    SIGNAL PROCESSING   87 ( 11 ) 2649 - 2658  2007.11

     View Summary

    Rectangular shape object detection in color images is a critical step of many image recognition systems. However, there are few reports on this matter. In this paper, we propose a hierarchical approach, which combines a global contour-based line segment detection algorithm and a Markov random field (MRF) model, to extract rectangular shape objects from real color images. Firstly, we use an elaborate edge detection algorithm to obtain an image edge map and accurate edge pixel gradient information (magnitude and direction). Then line segments are extracted from the edge map and some neighboring parallel segments are merged into a single line segment. Finally all segments lying on the boundary of unknown rectangular shape objects are labeled via an MRF model built on line segments. Experimental results show that our method is robust in locating multiple rectangular shape objects simultaneously with respect to different size, orientation and color. (c) 2007 Elsevier B.V. All rights reserved.


  • Efficient fully-parallel LDPC decoder design with improved simplified min-sum algorithms

    Qi Wang, Kazunori Shimizu, Takeshi Ikenaga, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E90C ( 10 ) 1964 - 1971  2007.10

     View Summary

    In this paper we introduce an area and power efficient fully-parallel LDPC decoder design, which keeps the BER performance while consuming less hardware resources and lower power compared with conventional decoders. For this decoder, we first propose two improved simplified min-sum algorithms, which enable the decoder to reduce the hardware implementation complexity and area: hardware consumption of the check operation module is reduced by 40%, with a negligible performance loss compared with the general min-sum algorithm. To reduce the power dissipation of the decoder, we also propose a power-saving strategy according to which the message evolution halts as soon as the parity-check condition is satisfied. This strategy reduces power by more than 50% under good channel conditions. The synthesis result in 0.18 μm CMOS technology shows our decoder based on (648,540) irregular LDPC code of WLAN (802.11n) protocol achieves 8 10 [Mbps] throughput with 283 [mW] power consumption.
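
    For orientation, the general min-sum check-node update and the parity-check halting condition mentioned above can be sketched as follows. This shows the baseline min-sum rule only, not the paper's improved simplified variants; the function names and data layout are assumptions for illustration.

```python
def min_sum_check_update(incoming):
    """Check-to-variable messages for one check node (needs >= 2 inputs).

    Each output excludes its own input: the sign is the product of the
    other inputs' signs, the magnitude is the minimum of their magnitudes.
    """
    outgoing = []
    for i in range(len(incoming)):
        others = incoming[:i] + incoming[i + 1:]
        sign = 1
        for value in others:
            if value < 0:
                sign = -sign
        outgoing.append(sign * min(abs(value) for value in others))
    return outgoing


def parity_satisfied(hard_bits, check_rows):
    """True when every check (a list of bit indices) has even parity."""
    return all(sum(hard_bits[j] for j in row) % 2 == 0 for row in check_rows)
```

    A decoder loop would typically call `parity_satisfied` after each iteration and stop iterating once it returns True, which is the halting behaviour the power-saving strategy relies on.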


  • Lossless VLSI oriented full computation reusing algorithm for H.264/AVC fractional motion estimation

    Ming Shao, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga


     View Summary

    Fractional Motion Estimation (FME) is an advanced feature adopted in H.264/AVC video compression standard with quarter-pixel accuracy. Although FME could gain considerably higher encoding efficiency, sub-pixel interpolation and sum of absolute transformed difference (SATD) computation, as main parts of FME, increase the computation complexity a lot. To reduce the complexity of FME, this paper proposes a full computation reusable VLSI oriented algorithm. Through exploiting the similarity among motion vectors (MVs) of partitions in the same macroblock (MB), temporary computation results can be fully reused. Furthermore, a simple and effective searching method is adopted to make the proposed method more suitable for VLSI implementation. Experiment results show that up to 80% add operations and 85% internal reference frame memory access operations are saved without any degradation in the coding quality.

    DOI CiNii

  • Lossy strict multilevel successive elimination algorithm for fast motion estimation

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto


     View Summary

    This paper presents a simple and effective method to further reduce the search points in the multilevel successive elimination algorithm (MSEA). Because the calculated SEA values of the best matching search points are much smaller than the current minimum SAD, we can simply increase the calculated SEA values to increase the elimination ratio without much affecting the coding quality. Compared with the original MSEA algorithm, the proposed strict MSEA algorithm (SMSEA) provides an average 6.52 times speedup. Compared with other lossy fast ME algorithms such as TSS and DS, the proposed SMSEA maintains more stable image quality. In practice, the proposed technique can also be used in the fine granularity SEA (FGSEA) algorithm and the calculation process is almost the same.
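
    The elimination test can be illustrated with a single-level sketch. It relies on the standard bound |sum(cur) - sum(ref)| <= SAD(cur, ref); the `strictness` multiplier is a hypothetical stand-in for "increasing the calculated SEA values", and the multilevel sub-block sums of MSEA are omitted.

```python
def block_sum(block):
    return sum(sum(row) for row in block)


def sad(cur, ref):
    return sum(abs(c - r) for cr, rr in zip(cur, ref) for c, r in zip(cr, rr))


def sea_search(cur, candidates, strictness=1.0):
    """Return (best index, best SAD, number of full SAD computations).

    strictness == 1.0 is the lossless elimination test; values above 1.0
    eliminate more candidates and may discard the true best match.
    """
    cur_sum = block_sum(cur)
    best_index, best_sad, sad_count = None, float("inf"), 0
    for index, ref in enumerate(candidates):
        bound = abs(cur_sum - block_sum(ref))  # lower bound on the SAD
        if strictness * bound >= best_sad:
            continue  # eliminated without computing the SAD
        sad_count += 1
        value = sad(cur, ref)
        if value < best_sad:
            best_index, best_sad = index, value
    return best_index, best_sad, sad_count
```

    With `strictness=1.0` the result equals that of a full search; raising it trades matching accuracy for fewer SAD computations, which is the lossy behaviour the abstract describes.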


  • A VLSI architecture design of an edge based fast intra prediction mode decision algorithm for h.264/avc

    Shen Li, Xianghui Wei, Takeshi Ikenaga, Satoshi Goto

    Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI     20 - 24  2007

     View Summary

    The intra-frame coding in H.264/AVC has made significant contribution to the enhancement of coding efficiency. However it brings about a heavy computation burden in the rate distortion based (RD) mode decision (MD) process. Although the real-time encoding of 1280×720p signals is realized in recent works with existing algorithms, for higher resolution e.g. 1920×1088p some hardware-oriented fast algorithms are necessary. Yet so far few of the many proposed fast MD algorithms have seen successful hardware implementation. This paper presents a novel VLSI design (15.8k gates@200MHz, with TSMC CMOS 0.18 μm technology) of an edge based fast intra MD algorithm which can constantly reduce about 66% of the RD related computation with a negligible quality loss. It is expected to be utilized as a favorable accelerator hardware module in a real-time HDTV (1920×1088p) H.264 encoder or MPEG2-H.264 transcoder. Copyright 2007 ACM.


  • Hardware Architecture Design of CABAC Codec for H.264/AVC

    Lingfeng LI, Yang SONG, Takeshi IKENAGA, Satoshi GOTO

    International Symposium on VLSI Design, Automation & Test (VLSI-DAT 2007)     248 - 251  2007


  • Small Deadspace Fixed-Outline Floorplanner with Soft Modules

    Yukio YAMAKOSHI, Takeshi YOSHIMURA, Takeshi IKENAGA, Satoshi GOTO

    The 20th Workshop on Circuits and Systems in Karuizawa     295 - 300  2007

  • Fragmented Edges Grouping for Monocular Building Extraction

    Jing WANG, Makoto IWATA, Hirokazu KOIZUMI, Hideo SHIMAZU, Satoshi GOTO, Takeshi IKENAGA

    The 20th Workshop on Circuits and Systems in Karuizawa     557 - 562  2007

  • An Efficient Encryption Scheme for H.264 Format Video Streams

    Jidong WANG, Yibo FAN, Takeshi IKENAGA, Satoshi GOTO

    The 20th Workshop on Circuits and Systems in Karuizawa     631 - 636  2007

  • Enhanced Strict Multilevel Successive Elimination Algorithm for Fast Motion Estimation

    Yang SONG, Zhenyu LIU, Takeshi IKENAGA, Satoshi GOTO

    IEEE ISCAS’07     3659 - 3662  2007

  • An Irregular Search Window Reuse Scheme for Motion Estimation in MPEG-2 to H.264 Transcoding

    Xianghui WEI, Shen LI, Yang SONG, Satoshi GOTO

    IEEE ISCAS’07     1987 - 1990  2007

  • A Survey of Video Encryption Methods

    Yibo FAN, Jidong WANG, Takeshi IKENAGA, Satoshi GOTO

    Proc. of the 2nd International Ph.D. Student Workshop on SOC (IPS)     17 - 20  2007

  • Partially-Parallel Irregular LDPC Decoder based on Improved Message Passing Schedule

    Xing LI, Kazunori SHIMIZU, Zhen QIU, Takeshi IKENAGA, Satoshi GOTO

    IEEE International Midwest Symposium on Circuits and Systems (MWSCAS) 2007     1473 - 1476  2007


  • A High Throughput Multiple Transform Architecture for H.264/AVC Fidelity Range Extensions

    Yao MA, Yang SONG, Takeshi IKENAGA, Satoshi GOTO

    Journal of Semiconductor Technology and Science   Vol.7 ( No.4 )  2007

  • Hardware Reuse Architecture for High-Radix Scalable Montgomery Multiplier

    Yibo Fan, Xiaoyang Zeng, Takeshi Ikenaga, Satoshi Goto

    2E2-1, Symposium on Cryptography and Information Security (SCIS2007)    2007

  • A real-time parallel architecture for human face detection based on the Algorithm Architecture Adequation approach

    Dmitriev IVAN, Grandpierre THIERRY, Akil MOHAMED, Ghorayeb HICHAM, Satoshi GOTO, Takeshi IKENAGA

    The 20th Workshop on Circuits and Systems in Karuizawa     253 - 258  2007

  • Lossless VLSI oriented full computation reusing algorithm for H.264/AVC fractional motion estimation

    Ming Shao, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   E90-A ( 4 ) 756 - 763  2007

     View Summary

    Fractional Motion Estimation (FME) is an advanced feature adopted in H.264/AVC video compression standard with quarter-pixel accuracy. Although FME could gain considerably higher encoding efficiency, sub-pixel interpolation and sum of absolute transformed difference (SATD) computation, as main parts of FME, increase the computation complexity a lot. To reduce the complexity of FME, this paper proposes a full computation reusable VLSI oriented algorithm. Through exploiting the similarity among motion vectors (MVs) of partitions in the same macroblock (MB), temporary computation results can be fully reused. Furthermore, a simple and effective searching method is adopted to make the proposed method more suitable for VLSI implementation. Experiment results show that up to 80% add operations and 85% internal reference frame memory access operations are saved without any degradation in the coding quality. Copyright © 2007 The Institute of Electronics, Information and Communication Engineers.

    DOI CiNii

  • Adaptive Edge Detection Pre-Process Multiple Reference Frames Motion Estimation in H.264/AVC

    Yiqing Huang, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga

    2007 International Conference on Communications, Circuits and Systems (ICCCAS 2007)     787 - 791  2007

  • Two-Steps Cross-Diamond Fast Search Algorithm on Motion Estimation in H.264

    Qin Liu, Seiichiro Hiratsuka, Satoshi Goto, Takeshi Ikenaga

    2007 International Conference on Communications, Circuits and Systems (ICCCAS 2007)     782 - 786  2007

  • VLSI Friendly Edge Gradient Detection Based Multiple Reference Frames Motion Estimation Optimization for H.264/AVC

    Zhenyu Liu, Yiqing Huang, Yang Song, Lingfeng Li, Satoshi Goto, Takeshi Ikenaga

    15th European Signal Processing Conference (EUSIPCO 2007)     1809 - 1813  2007

  • Mixed Bus Width Architecture for Low Cost AES VLSI Design

    Yibo Fan, Jidong Wang, Takeshi Ikenaga, Satoshi Goto

    ASICON2007    2007


  • Floorplanning with Constraint Extraction based on Interconnecting Information Analysis

    Jiayi Liu, Sheqin Dong, Xianlong Hong, Satoshi Goto

    ASICON2007    2007


  • A partial scramble scheme for H.264 video

    Jidong Wang, Yibo Fan, Takeshi Ikenaga, Satoshi Goto

    ASICON 2007 - 2007 7th International Conference on ASIC Proceeding     802 - 805  2007

     View Summary

    There are many encryption algorithms for general data such as text data. But they are unsuitable for the encryption of video data because of the real-time constraint of video applications. In this paper, a partial scramble scheme is proposed. The main feature of this scheme is making use of the characteristics of an H.264/AVC video. Instead of encrypting all the video data, some important parameters of the H.264/AVC video such as the motion vector difference (MVD) and trailing ones (T1) of the CAVLC module are scrambled in this scheme. There are 3 scramble levels in this scheme. For different scramble levels, MVDs and T1s of different MBs in a frame are scrambled according to different parts of the key stream generated by a stream cipher. The advantage of this scheme is that it adds very small computational overhead to the H.264/AVC coding process. Hence it is fast enough to be used for real-time video applications. This scheme can be utilized in mobile and entertainment applications. © 2007 IEEE.


  • Rectangle Region Based Stereo Matching for Building Reconstruction

    Jing WANG, Toru MIYAZAKI, Hirokazu KOIZUMI, Makoto IWATA, Jongwha CHONG, Hiroyuki YAGYU, Hideo SHIMAZU, Takeshi IKENAGA, Satoshi GOTO

    Journal of Ubiquitous Convergence Technology   Vol.1 ( No.1 )  2007

  • Low-Power Partial Distortion Sorting Fast Motion Estimation Algorithms and VLSI Implementation

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto

    IEICE Trans. Inf. & Syst.   E90-D ( 1 ) 108 - 117  2007


  • Report on DPA Analysis of eSTREAM Candidate Ciphers

    久門 亨, 角尾 幸保, 深澤 宏, 庄司 陽彦, 後藤 敏, 池永剛

    3E4-5, Symposium on Cryptography and Information Security (SCIS2007)    2007

  • No Compression Ratio Reduction H.264 Video Scrambling

    Jidong Wang, Yibo Fan, Xiaoyang Zeng, Takeshi Ikenaga, Satoshi Goto

    3B3-1, Symposium on Cryptography and Information Security (SCIS2007)    2007

  • Inter Search Mode Reduction Based Parallel Propagate Partial SAD Architecture for Variable Block Size Motion Estimation in H.264/AVC

    Yiqing HUANG, Zhenyu LIU, Yang SONG, Satoshi GOTO, Takeshi IKENAGA

    The 20th Workshop on Circuits and Systems in Karuizawa     609 - 614  2007

  • A High-speed Design of Montgomery Multiplier

    Yibo FAN, Takeshi IKENAGA, Satoshi GOTO

    The 20th Workshop on Circuits and Systems in Karuizawa     137 - 142  2007

  • Ultra Low-Complexity Fast Variable Block Size Motion Estimation Algorithm in H.264/AVC

    Yang SONG, Zhenyu LIU, Takeshi IKENAGA, Satoshi GOTO

    International Conference on Multimedia & Expo (ICME 07)     376 - 379  2007

  • Adaptive Edge Detection Pre-Process Multiple Reference Frames Motion Estimation in H.264/AVC

    Yiqing Huang, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga

    2007 International Conference on Communications, Circuits and Systems (ICCCAS 2007)     787 - 791  2007

  • Two-Steps Cross-Diamond Fast Search Algorithm on Motion Estimation in H.264

    Qin Liu, Seiichiro Hiratsuka, Satoshi Goto, Takeshi Ikenaga

    2007 International Conference on Communications, Circuits and Systems (ICCCAS 2007)     782 - 786  2007

  • Proposal of a High-Quality Fisheye Image Correction Algorithm Based on Cubic Interpolation and a Dedicated Hardware Engine

    森 隆寛, 外村 元伸, 大住 勇治, 後藤 敏, 池永 剛

    画像電子学会誌 (Journal of IIEEJ)   Vol.36 ( No.9 )  2007


  • Power Reduction through Specific Instruction Scheduling based on Hardware/Software Co-Design

    Kang Zhao, Jinian Bian, Chenqian Jiang, Sheqin Dong, Satoshi Goto

    ASICON2007    2007


  • A New Video Encryption Scheme for H.264/AVC

    Yibo FAN, Jidong WANG, Takeshi IKENAGA, Yukiyasu TSUNOO, Satoshi GOTO

    PCM 2007    2007

  • VLSI architecture for variable block size motion estimation in H.264/AVC with low cost memory organization

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto

    2006 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2006 - Proceedings of Technical Papers     89 - 92  2007

     View Summary

    A 1-D full search variable block size motion estimation (VBSME) architecture is presented in this paper. By properly choosing the partial sum of absolute differences (SAD) registers and scheduling the add operations, the architecture can be implemented with simple control logic and a regular workflow. Moreover, only one single-port SRAM is required to store the search area, which reduces the SRAM hardware cost by 72.7%. The design is realized in TSMC 0.18 μm 1P6M technology with a hardware cost of 67.6K gates. Under typical working conditions (1.8 V, 25°C), a clock frequency of 266 MHz can be achieved. © 2006 IEEE.


  • An Efficient Encryption Scheme for H.264 Format Video Streams

    Jidong WANG, Yibo FAN, Takeshi IKENAGA, Satoshi GOTO

    The 20th Workshop on Circuits and Systems in Karuizawa     631 - 636  2007

  • Power-Efficient LDPC Code Decoder Architecture

    Kazunori Shimizu, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto


     View Summary

    This paper proposes a power-efficient LDPC decoder architecture which features (1) a FIFO-buffering-based rapid convergence schedule, which enables the decoder to accelerate the decoding throughput without increasing the required number of memory bits, and (2) an intermediate message compression technique based on a clock-gated shift register, which reduces the read and write power dissipation for the intermediate messages. Simulation results show that the proposed decoder achieves 1.66 times faster decoding throughput and improves the power efficiency (defined as the power dissipation per Mbps) by up to 52% compared to the decoder based on the conventional overlapped schedule.


  • A Hardware-Oriented Hybrid Fast Mode Decision Algorithm for H.264/AVC Intra Encoder

    Tianruo ZHANG, Shen LI, Heng-Yao Lin, Takeshi IKENAGA, Satoshi GOTO

    SIP Symposium    2007

  • Hardware friendly adaptive search range algorithm for variable block size motion estimation in H.264/AVC

    Zhenxing CHEN, Yang SONG, Takeshi IKENAGA, Satoshi GOTO

    ISPACS 2007    2007

  • A Novel Dynamic Search Range Decision Method for Variable Block Size Motion Estimation in H.264/AVC

    Zhenxing CHEN, Yang SONG, Takeshi IKENAGA, Satoshi GOTO

    ICICS 2007    2007


  • Motion-Content Based Search Range Prediction in Variable Block Size Motion Estimation

    Zhenxing CHEN, Yang SONG, Takeshi IKENAGA, Satoshi GOTO

    The 20th Workshop on Circuits and Systems in Karuizawa     615 - 619  2007

  • Power-Saved 1.35Gbps Irregular LDPC Decoder based on Simplified Min-Sum Algorithm

    Qi WANG, Kazunori SHIMIZU, Takeshi IKENAGA, Satoshi GOTO

    International Symposium on VLSI Design, Automation & Test (VLSI-DAT 2007)     95 - 98  2007


  • Cost-Efficient Partially-Parallel Irregular LDPC Decoder with Message Passing Schedule

    Xing LI, Yuta ABE, Kazunori SHIMIZU, Zhen QIU, Takeshi IKENAGA, Satoshi GOTO

    International Symposium on Integrated Circuits 2007 (ISIC)     578 - 551  2007


  • VLSI friendly edge gradient detection based multiple reference frames motion estimation optimization for H.264/AVC

    LIU Z. Y.

    Proc. 2007 European Signal Processing Conference, Sept.     1809 - 1813  2007


  • Parallel CBM Algorithm and LSI Architecture for a JPEG 2000 Encoder for Digital Cinema

    伊東 健, 中村 創, 後藤 敏, 池永 剛

    画像電子学会誌 (Journal of IIEEJ)   Vol.36 ( No.9 )  2007


  • Mixed Bus Width Architecture for Low Cost AES VLSI Design

    Yibo Fan, Jidong Wang, Takeshi Ikenaga, Satoshi Goto

    ASICON2007    2007


  • A 41mW VGA@30fps Quadtree Video Encoder for Video Surveillance Systems

    Qin Liu, Seiichiro Hiratsuka, Satoshi Goto, Takeshi Ikenaga

    ASICON2007    2007


  • Cost efficient propagate partial SAD architecture for integer motion estimation in H.264/AVC

    Yiqing Huang, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga

    ASICON 2007 - 2007 7th International Conference on ASIC Proceeding     782 - 785  2007

     View Summary

    The latest video coding standard, H.264/AVC, covers a wide range of applications from QCIF to HDTV. For HDTV, subsampling is widely adopted to reduce hardware cost with little video quality degradation. Moreover, experiments show that the contribution of small inter search modes to video quality is trivial, so mode reduction can further reduce hardware cost. This paper proposes a cost-efficient Propagate Partial SAD architecture for HDTV applications. The highly pipelined design of the proposed architecture makes it robust to high speed impact. Compared with the widely used SAD Tree structure, the proposed structure, which adopts subsampling and inter search mode reduction, reduces hardware cost by 23.88% on average with negligible video quality loss. With the TSMC 0.18 μm CMOS 1P6M standard cell library, the maximum clock speed of this design is 233 MHz under worst-case working conditions (1.62 V, 125°C). © 2007 IEEE.


  • Unequal error protected transmission with dynamic classification in H.264/AVC

    Jun Wang, Shen Li, Kazunori Shimizu, Takeshi Ikenaga, Satoshi Goto

    ASICON 2007    2007


  • Low-power partial distortion sorting fast motion estimation algorithms and VLSI implementations

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto

    IEICE Transactions on Information and Systems   E90-D ( 1 ) 108 - 117  2007

     View Summary

    This paper presents two hardware-friendly, low-power-oriented fast motion estimation (ME) algorithms and their VLSI implementations. The basic idea of the proposed partial distortion sorting (PDS) algorithm is to disable the search points that have larger partial distortions during the ME process and keep only those with smaller ones. To further reduce the computation overhead, a simplified local PDS (LPDS) algorithm is also presented. Experiments show that the PDS and LPDS algorithms can provide almost the same image quality as full search with only 36.7% of the computation complexity. The two proposed algorithms can be integrated into different FSBMA architectures to save power. In this paper, the 1-D inter ME architecture [12] is used as a detailed example. Under the worst working conditions (1.62 V, 125°C) and a 166 MHz clock frequency, the PDS algorithm reduces power consumption by 33.3% with 4.05 K gates of extra hardware cost, and the LPDS reduces power consumption by 37.8% with 1.73 K gates of overhead. Copyright © 2007 The Institute of Electronics, Information and Communication Engineers.


  • Fragmented Edges Grouping for Monocular Building Extraction

    Jing WANG, Makoto IWATA, Hirokazu KOIZUMI, Hideo SHIMAZU, Satoshi GOTO, Takeshi IKENAGA

    The 20th Workshop on Circuits and Systems in Karuizawa     557 - 562  2007

  • Enhanced Strict Multilevel Successive Elimination Algorithm for Fast Motion Estimation

    Yang SONG, Zhenyu LIU, Takeshi IKENAGA, Satoshi GOTO

    IEEE ISCAS’07     3659 - 3662  2007

  • A 1.41W H.264/AVC Real-Time Encoder SoC for HDTV1080p

    Zhenyu Liu, Yang Song, Ming Shao, Shen Li, Lingfeng Li, Shunichi Ishiwata

    2007 Symposium on VLSI Circuits, Digest of Technical Papers     12 - 13  2007

     View Summary

    An H.264/AVC baseline-profile real-time encoder for HDTV-1080p at 30 fps is implemented with dedicated hardware engines and one 32-bit Media embedded Processor (MeP) equipped with hardware extensions. An 11.5 Gbps 64 Mb System-in-Silicon DRAM is embedded to alleviate the external memory bandwidth. With TSMC 0.18 μm CMOS technology, the SoC core occupies a 27.1 mm² die area and consumes 1.41 W at 200 MHz under typical working conditions.


  • Side-Channel Attack Experiments on INSTAC-32 with an RSA Implementation

    深澤 宏, 東 邦彦, 後藤 敏, 池永 剛, 角尾 幸保, 久門 亨, 庄司 陽彦

    3E3-3, Symposium on Cryptography and Information Security (SCIS2007)    2007

  • Hardware Reuse Architecture for High-Radix Scalable Montgomery Multiplier

    Yibo Fan, Xiaoyang Zeng, Takeshi Ikenaga, Satoshi Goto

    2E2-1, Symposium on Cryptography and Information Security (SCIS2007)    2007

  • Low Power LDPC Decoder Design based on Accelerated Message-Passing Schedule

    Kazunori SHIMIZU, Nozomu TOGAWA, Takeshi IKENAGA, Satoshi GOTO

    The 20th Workshop on Circuits and Systems in Karuizawa     331 - 336  2007

  • A real-time parallel architecture for human face detection based on the Algorithm Architecture Adequation approach

    Dmitriev IVAN, Grandpierre THIERRY, Akil MOHAMED, Ghorayeb HICHAM, Satoshi GOTO, Takeshi IKENAGA

    The 20th Workshop on Circuits and Systems in Karuizawa     253 - 258  2007

  • A motion vector prediction scheme for MPEG-2 to H.264 transcoding based on smoothness of motion vector field

    Xiang Hui Wei, Shen Li, Satoshi Goto


     View Summary

    In MPEG-2 to H.264 transcoding, the MPEG-2 motion vector (MV) is often reused as the search center to simplify the H.264 motion estimation (ME) module. In this paper, we show that the motion vector predictor (MVP) from neighboring sub-blocks is a more accurate search center than the MPEG-2 MV when the MPEG-2 MV field is non-smooth. A criterion based on relative motion is proposed to evaluate the smoothness of the MPEG-2 MV field, and a hardware-oriented MV prediction scheme is also proposed. Experimental results show that the proposed MV prediction scheme with a relatively small search range can approach the performance of the full search algorithm. Compared with the method that uses only the MPEG-2 MV, the proposed approach significantly improves the accuracy of motion prediction, especially in sequences with fast motion and complicated backgrounds.

  • An improved inter frame error concealment in H.264/AVC

    Jun Wang, Lei Wang, Shen Li, Takeshi Ikenaga, Satoshi Goto

    ASICON 2007    2007


  • Content-based complexity reduction methods for MPEG-2 to H.264 transcoding

    Shen Li, Lingfeng Li, Takeshi Ikenaga, Shunichi Ishiwata, Masataka Matsui, Satoshi Goto

    IEICE Transactions on Information and Systems   E90-D ( 1 ) 90 - 98  2007

     View Summary

    The coexistence of MPEG-2 and its powerful successor H.264/AVC has created a huge need for MPEG-2/H.264 video transcoding. However, a traditional transcoder, in which an MPEG-2 decoder is simply cascaded to an H.264 encoder, requires huge computational power due to the complicated rate-distortion-based mode decision process in H.264. This paper proposes a 2-D Sobel filter based motion vector domain method and a DCT domain method to measure macroblock complexity and realize content-based H.264 candidate mode decision. A new local edge based fast INTRA prediction mode decision method is also adopted to boost the encoding efficiency. Simulation results confirm that with the proposed methods the computational burden of a traditional transcoder can be reduced by 20–30% with only a negligible bit-rate increase for a wide range of video sequences. Copyright © 2007 The Institute of Electronics, Information and Communication Engineers.


  • Lossless VLSI oriented full computation reusing algorithm for H.264/AVC fractional motion estimation

    Ming Shao, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga

    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences   E90-A ( 4 ) 756 - 763  2007

     View Summary

    Fractional Motion Estimation (FME) with quarter-pixel accuracy is an advanced feature adopted in the H.264/AVC video compression standard. Although FME yields considerably higher encoding efficiency, its main parts, sub-pixel interpolation and sum of absolute transformed difference (SATD) computation, greatly increase the computation complexity. To reduce the complexity of FME, this paper proposes a VLSI-oriented algorithm with full computation reuse. By exploiting the similarity among the motion vectors (MVs) of partitions in the same macroblock (MB), temporary computation results can be fully reused. Furthermore, a simple and effective search method is adopted to make the proposed method more suitable for VLSI implementation. Experimental results show that up to 80% of add operations and 85% of internal reference frame memory access operations are saved without any degradation in coding quality. Copyright © 2007 The Institute of Electronics, Information and Communication Engineers.


  • Hardware-efficient propagate partial sad architecture for variable block size motion estimation in H.264/AVC

    Zhenyu Liu, Yiqing Huang, Yang Song, Satoshi Goto, Takeshi Ikenaga

    Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI     160 - 163  2007

     View Summary

    A hardware-efficient, high-speed architecture for variable block size motion estimation in H.264 is presented in this paper. By compressing the propagated data and optimizing the processing element and adder tree circuits in the pipeline, this architecture achieves more hardware-efficient datapath logic. Compared with the original Propagate Partial SAD structure, 12.1% of the hardware cost can be saved. With the TSMC 0.18 μm CMOS 1P6M standard cell library, the maximum clock speed of this design is 227 MHz under worst-case working conditions (1.62 V, 125°C). With a 48x32 search range, the maximum throughput of our design is 147786 MB/s, which is sufficient for real-time encoding of VGA-resolution frames with 4 reference frames at 30 Hz. Copyright 2007 ACM.


  • Hardware architecture design of CABAC Codec for H.264/AVC

    Lingfeng Li, Yang Song, Takeshi Ikenaga, Satoshi Goto

    2007 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2007 - Proceedings of Technical Papers     248 - 251  2007

     View Summary

    This paper presents a hardware architecture for a Context-Based Adaptive Binary Arithmetic Coding (CABAC) codec in the H.264/AVC main profile. The similarities between the encoding and decoding algorithms are exploited to achieve hardware reuse. Meanwhile, a dynamic pipeline scheme is adopted to increase the throughput, and the characteristics of the CABAC algorithm are used to reduce pipeline latency. The proposed codec design is implemented in TSMC 0.18 μm technology. Results show that the equivalent gate count is 33.2k at a maximum frequency of 230 MHz. It is estimated that the proposed CABAC codec can process input binary symbols at 135 Mb/s for encoding and 90 Mb/s for decoding. © 2007 IEEE.


  • Small Deadspace Fixed-Outline Floorplanner with Soft Modules

    Yukio YAMAKOSHI, Takeshi YOSHIMURA, Takeshi IKENAGA, Satoshi GOTO

    The 20th Workshop on Circuits and Systems in Karuizawa     295 - 300  2007

  • An Irregular Search Window Reuse Scheme for Motion Estimation in MPEG-2 to H.264 Transcoding

    Xianghui WEI, Shen LI, Yang SONG, Satoshi GOTO

    IEEE ISCAS’07     1987 - 1990  2007

  • VLSI oriented fast multiple reference frame motion estimation algorithm for H.264/AVC

    Zhenyu Liu, Lingfeng Li, Yang Song, Takeshi Ikenaga, Satoshi Goto


     View Summary

    In the H.264/AVC standard, motion estimation can be performed on multiple reference frames (MRF) to improve video coding performance. In a real-time VLSI encoder, the heavy computation of fractional motion estimation (FME) forces integer motion estimation (IME) and FME to be scheduled in two macroblock (MB) pipeline stages, which makes many fast MRF algorithms inefficient at reducing computation. In this paper, two algorithms are provided to reduce the computation of FME and IME. First, by analyzing the block's Hadamard transform coefficients, the all-zero case after quantization can be accurately detected. For a block detected as all-zero, FME processing in the remaining frames can be eliminated. Second, because a fast-moving object blurs its edges in the image, the effect of MRF on aliasing is weakened. The first reference frame is therefore enough for fast-motion MBs, and MRF is processed only on slow-motion MBs with a small search range. The computation of IME is also greatly reduced by this algorithm. Experimental results show that 61.4%-76.7% of the computation can be saved with coding quality similar to that of the reference software. Moreover, the provided fast algorithms can be combined with fast block matching algorithms to further improve performance.

  • Partially-Parallel Irregular LDPC Decoder based on Improved Message Passing Schedule

    Xing LI, Kazunori SHIMIZU, Zhen QIU, Takeshi IKENAGA, Satoshi GOTO

    IEEE International Midwest Symposium on Circuits and Systems ( MWSCAS) 2007     1473 - 1476  2007


  • Scale Invariance Based Salient Contour Detection

    Jing Wang, Takeshi Ikenaga, Satoshi Goto

    Journal of the Institute of Image Electronics Engineers of Japan   36 ( 5 ) 710 - 720  2007

     View Summary

    Contour detection is a fundamental step in scene analysis and interpretation. However, because contours are often located in richly textured backgrounds, it remains a difficult task in realistic vision. Multi-scale analysis makes it clear that the edge responses of real object contours are relatively stable across scales, while those from noise or textured backgrounds are not. In this paper, a salient contour detection method is proposed based on the scale invariance of the piecewise linear approximation of real object contours. First, an image pyramid is efficiently constructed by repeatedly smoothing and sub-sampling the image. Second, piecewise linear approximations of the contours at multiple scales are extracted, and a collinear line grouping process is then applied to improve the connectivity of the contours. Third, new salient line segments are generated based on an analysis of the stability of line segments across scales. Experimental results show that the proposed method effectively improves the connectivity and saliency of contour detection compared with the former method. © 2007, The Institute of Image Electronics Engineers of Japan. All rights reserved.


  • Power-efficient LDPC decoder architecture based on accelerated message-passing schedule

    Kazunori Shimizu, Tatsuyuki Ishikawa, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto


     View Summary

    In this paper, we propose a power-efficient LDPC decoder architecture based on an accelerated message-passing schedule. The proposed decoder architecture is characterized as follows: (i) Partitioning a pipelined operation so that intermediate messages are not read and written simultaneously enables the accelerated message-passing schedule to be implemented with single-port SRAMs. (ii) FIFO-based buffering reduces the number of SRAM banks and words of the LDPC decoder based on the accelerated message-passing schedule. The proposed LDPC decoder keeps a single message for each non-zero bit in the parity check matrix, as in a classical schedule, while achieving the accelerated message-passing schedule. Implementation results in 0.18 μm CMOS technology show that the proposed decoder architecture reduces the area of the LDPC decoder by 43% and the power dissipation by 29% compared to the conventional architecture based on the accelerated message-passing schedule.


  • A VLSI architecture for variable block size motion estimation in H.264/AVC with low cost memory organization

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto


     View Summary

    A one-dimensional (1-D) full search variable block size motion estimation (VBSME) architecture is presented in this paper. By properly choosing the partial sum of absolute differences (SAD) registers and scheduling the addition operations, the architecture can be implemented with simple control logic and a regular workflow. Moreover, only one single-port SRAM is used to store the search area data. The design is realized in TSMC 0.18 μm 1P6M technology with a hardware cost of 67.6K gates. Under typical working conditions (1.8 V, 25°C), a clock frequency of 266 MHz can be achieved.

  • A fine-grain scalable and low memory cost variable block size motion estimation architecture for H.264/AVC

    Zhenyu Liu, Yang Song, Takeshi Ikenaga, Satoshi Goto

    IEICE TRANSACTIONS ON ELECTRONICS   E89C ( 12 ) 1928 - 1936  2006.12

     View Summary

    A full search variable block size motion estimation (VBSME) architecture with integer pixel accuracy is proposed in this paper. The proposed architecture has the following features: (1) By widening the data path from the search area memories, m processing element groups (PEGs) can be scheduled to work in parallel and be fully utilized, where m is a factor of sixteen. Each PEG has sixteen processing elements (PEs) and costs just 8.5K gates. This feature gives users more flexibility in trading off hardware cost against performance. (2) Based on pipelining and multi-cycle data path techniques, the architecture can work at a high clock frequency. (3) The number of memory partitions is greatly reduced. When sixteen PEGs are adopted, only two memory partitions are required for search area data storage. Therefore, both the system hardware cost and power consumption can be reduced. A 16-PEG design with a 48 x 32 search range has been implemented in TSMC 0.18 μm CMOS technology. Under typical working conditions, its maximum clock frequency is 261 MHz. Compared with the previous 2-D architecture [9], about 13.4% of the hardware cost and 5.7% of the power consumption can be saved.

  • Partially-parallel LDPC decoder achieving high-efficiency message-passing schedule

    K Shimizu, T Ishikawa, N Togawa, T Ikenaga, S Goto


     View Summary

    In this paper, we propose a partially-parallel LDPC decoder which achieves a high-efficiency message-passing schedule. The proposed LDPC decoder is characterized as follows: (i) The column operations follow the row operations in a pipelined architecture to ensure that the row and column operations are performed concurrently. (ii) The proposed parallel pipelined bit functional unit enables the column operation module to compute every message in each bit node which is updated by the row operations. These column operations can be performed without extending the single iterative decoding delay when the row and column operations are performed concurrently. Therefore, the proposed decoder performs the column operations more frequently in a single iterative decoding, and achieves a high-efficiency message-passing schedule within the limited decoding delay time. Hardware implementation on an FPGA and simulation results show that the proposed partially-parallel LDPC decoder improves the decoding throughput and bit error performance with a small hardware overhead.


  • Scalable VLSI architecture for variable block size integer motion estimation in H.264/AVC

    Y Song, ZY Liu, S Goto, T Ikenaga


     View Summary

    Because of the data correlation in the motion estimation (ME) algorithm of the H.264/AVC reference software, it is difficult to implement an efficient ME hardware architecture. To make parallel processing feasible, four modified hardware-friendly ME workflows are proposed in this paper. Based on these workflows, a scalable full search ME architecture is presented, which has the following characteristics: (1) The sum of absolute differences (SAD) results of 4 x 4 sub-blocks are accumulated and reused to calculate the SADs of bigger sub-blocks. (2) The number of PE groups is configurable. For a search range of M x N pixels, where M is the width and N is the height, up to M PE groups can be configured to work in parallel with a peak processing speed of N x 16 clock cycles to complete a full search variable block size ME (VBSME). (3) Only conventional single-port SRAM is required, which makes this architecture suitable for standard-cell-based implementation. A design with 8 PE groups has been realized in TSMC 0.18 μm CMOS technology. The core area is 2.13 mm x 1.60 mm and the clock frequency is 228 MHz under typical conditions (1.8 V, 25°C).

  • A hardware implementation of a content-based motion estimation algorithm for real-time MPEG-4 video coding

    S Li, T Ikenaga, H Takeda, M Matsui, S Goto


     View Summary

    Power efficiency and real-time processing capability are two major issues in today's mobile video applications. We propose a novel Motion Estimation (ME) engine for power-efficient real-time MPEG-4 video coding based on our previously proposed content-based ME algorithm [8],[13]. By alternating between Full Search (FS) and Three Step Search (TSS) according to the nature of the video content, this algorithm keeps the visual quality very close to that of FS with only 3% of its computational power. We designed a flexible Block Matching (BM) Unit with a 16-PE SIMD data path so that the adaptive ME can be performed at a much lower clock frequency and hardware cost than previous FS-based work. To reduce the energy cost caused by excessive external memory access, on-chip SRAM is also used and optimized for parallel processing in the BM Unit. The ME engine is fabricated in TSMC 0.18 μm technology. When processing QCIF (15 fps) video, the estimated power is 2.88 mW at 4.16 MHz (supply voltage: 1.62 V). It is believed to be a favorable contribution to video encoder LSI design for mobile applications.

  • A contour-based robust algorithm for text detection in color images

    YX Liu, S Goto, T Ikenaga


     View Summary

    Text detection in color images has become an active research area in the past few decades. In this paper, we present a novel approach to accurately detect text in color images, possibly with a complex background. The proposed algorithm is based on the combination of connected component and texture feature analysis of unknown text region contours. First, we use an elaborate color image edge detection algorithm to extract all possible text edge pixels. Connected component analysis is performed on these edge pixels to detect the external contour and possible internal contours of potential text regions. The gradient and geometrical characteristics of each region contour are carefully examined to construct candidate text regions and classify part of the non-text regions. Then each candidate text region is verified with texture features derived from the wavelet domain. Finally, the expectation-maximization algorithm is introduced to binarize each text region to prepare data for recognition. In contrast to previous approaches, our algorithm combines the efficiency of connected-component-based methods with the robustness of texture-based analysis. Experimental results show that our proposed algorithm is robust in text detection with respect to different character sizes, orientations, colors, and languages, and can provide reliable text binarization results.

  • High-throughput decoder for low-density parity-check code

    Tatsuya Ishikawa, Kazunori Shimizu, Takeshi Ikenaga, Satoshi Goto

    Proc. of IEEE ASP-DAC2006    2006

  • A Strict Successive Elimination Algorithm for Fast Motion Estimation

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto

    19th Karuizawa Workshop on Circuits and Systems    2006

  • A Pipeline Parallel Tree Architecture for Full Search Variable Block Size Motion Estimation in H.264/AVC

    Zhenyu Liu, Yang Song, Takeshi Ikenaga, Satoshi Goto

    Picture Coding Symposium (PCS 2006)    2006

  • VLSI Architecture for Variable Block Size Motion Estimation in H.264/AVC with Low Cost Memory Organization

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto

    International Symposium on VLSI Design, Automation & Test (VLSI-DAT 2006)    2006

  • A 232MHz Variable Block Size Integer Motion Estimation Processor with System-in-Silicon DRAM for H.264/AVC

    Zhenyu Liu, Yang Song, Satoshi Goto, Takeshi Ikenaga, Kouichi Kumagai, Yoshihiro Mabuchi, Kenji Yoshida

    IEEE Symposium on Low-Power and High-Speed Chips (COOL Chips IX)    2006

  • Real-time human extraction LSI for mobile terminals using depth information

    有門 智弘, 平塚誠一郎, 後藤 敏, 池永 剛

    Journal of the Institute of Image Electronics Engineers of Japan   35 ( 5 ) 453 - 460  2006

  • Robust Scalable Video Transmission using Object-Oriented Unequal Loss Protection over Internet

    Zhen Qiu, Takeshi Ikenaga, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems    2006


  • Enhanced Partial Distortion Sorting Fast Motion Estimation Algorithm for Low-Power Applications

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto

    IEEE Asia Pacific Conference on Circuits and Systems (APCCAS’06)    2006


  • A Parallel LSI Architecture for LDPC Decoder Improving Message-Passing Schedule

    Kazunori Shimizu, Tatsuyuki Ishikawa, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

    IEEE International Symposium on Circuits and Systems (ISCAS2006)    2006

  • Geometrical, Physical and Text/Symbol Analysis Based Approach of Traffic Sign Detection System

    Yangxing Liu, Takeshi Ikenaga, Satoshi Goto

    IEEE Intelligent Vehicle Symposium(IV2006)     238 - 243  2006

  • Multi-Resolution Analysis Based Salient Contour Extraction

    Jing Wang, Kazuo Kunieda, Makoto Iwata, Hirokazu Koizumi, Hideo Shimazu, Takeshi Ikenaga, Satoshi Goto

    IEEE International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2006)    2006

  • Complexity Based Fast Coding Mode Decision for MPEG-2 / H.264 Video Transcoding

    Shen Li, Lingfeng Li, Takeshi Ikenaga, Shunichi Ishiwata, Masataka Matsui, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems (APCCAS2006)    2006


  • ASIC Implementation of LDPC Decoder Accelerating Message-Passing Schedule

    Kazunori Shimizu, Tatsuya Ishikawa, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

    Student Design Contest, IEEE International Solid-State Circuits Conference (ISSCC)    2006

  • Hardware Architecture of Efficient Message-Passing Schedule based on Modified Min-Sum Algorithm for Decoding LDPC Codes

    Kazunori Shimizu, Tatsuyuki Ishikawa, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

    The 13th Workshop on Synthesis And System Integration of Mixed Information technologies (SASIMI 2006)    2006

  • A strict successive algorithm for fast motion estimation

    Yang Song, Zhenyu Liu, Takeshi Ikenaga, Satoshi Goto

    Karuizawa workshop on Circuits and Systems,IEICE    2006

  • Low-Pass Filter Based VLSI Oriented Variable Block Size Motion Estimation Algorithm for H.264

    Zhenyu Liu, Yang Song, Takeshi Ikenaga, Satoshi Goto

    IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2006)    2006

  • 61.5mW 2048-bit RSA Cryptographic Co-processor LSI based on N bit-wised Modular Multiplier

    Toru Hisakado, Nobuyuki Kobayashi, Satoshi Goto, Takeshi Ikenaga, Kunihiko Higashi, Ichiro Kitao, Yukiyasu Tsunoo

    International Symposium on VLSI Design, Automation & Test (VLSI-DAT 2006)    2006


  • Memory-efficient accelerating schedule for LDPC decoder

    Kazunori Shimizu, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems, Proceedings, APCCAS     1317 - 1320  2006

    This paper proposes a memory-efficient accelerating schedule for LDPC decoder. Important properties of the proposed techniques are as follows: (i) Partitioning a pipelined operation not to read and write intermediate messages simultaneously enables the accelerated message-passing schedule to be implemented with single-port memories. (ii) FIFO-based buffering reduces the number of memory banks and words for the decoder based on the accelerated message-passing schedule. The proposed decoder reduces the memories for intermediate messages by half compared to the conventional one based on the accelerated message-passing schedule. ©2006 IEEE.


  • Novel Hybrid Approach of Color Image Segmentation

    Yangxing Liu, Takeshi Ikenaga, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems    2006


  • High-Throughput LDPC Decoder for long code-length

    Tatsuyuki Ishikawa, Kazunori Shimizu, Takeshi Ikenaga, Satoshi Goto

    International Symposium on VLSI Design, Automation & Test (VLSI-DAT 2006)    2006


  • Content-based motion estimation VLSI design for real-time MPEG4 video coding

    Shen Li, Satoshi Goto, Takeshi Ikenaga, Hideki Takeda, Masataka Matsui

    IEICE Trans.on Fundamentals   E89-A ( 4 ) 932 - 940  2006


  • Geometric Primitives Detection In Aerial Image

    Jing WANG, Satoshi Goto, Kazuo KUNIEDA, Makoto IWATA, Hirokazu KOIZUMI, Hideo SHIMAZU, Takeshi IKENAGA

    IEEE International Conference on Cognitive Informatics (ICCI2006)    2006


  • An Efficient and Accurate Approach of Detecting Elliptical Objects in Color Images

    Yangxing Liu, Takeshi Ikenaga, Satoshi Goto

    IEEE 8th International Conference on Signal Processing (ICSP2006)    2006

  • System-in-Silicon architecture and its application to an H.264/AVC motion estimation for 1080HDTV

    Kouichi Kumagai, Changqi Yang, Satoshi Goto, Takeshi Ikenaga, Yoshihiro Mabuchi, Kenji Yoshida

    IEEE International Solid-State Circuits Conference (ISSCC)    2006

  • LDPC decoder using a high-efficiency message-passing schedule with FIFO buffers

    清水一範, 石川達之, 戸川 望, 池永 剛, 後藤 敏

    Karuizawa Workshop on Circuits and Systems, IEICE    2006

  • Feature-Based Early-Termination Approach for Multi-Frame Motion Estimation of H.264/AVC

    Lingfeng Li, Shen Li, Takeshi Ikenaga, Shunichi Ishiwata, Masataka Matsui, Satoshi Goto

    Picture Coding Symposium (PCS 2006)    2006

  • High Performance VLSI Architecture of Fractional Motion Estimation in H.264 for HDTV

    Changqi Yang, Satoshi Goto, Takeshi Ikenaga

    IEEE International Symposium on Circuits and Systems (ISCAS2006)    2006

  • A Novel Approach of Rectangular Shape Object Detection in Color Images Based on An MRF Model

    Yangxing Liu, Satoshi Goto, Takeshi Ikenaga

    IEEE International Conference on Cognitive Informatics (ICCI2006)     386 - 393  2006


  • Object-oriented unequal loss protection with product codes for wireless video transmission

    Zhen Qiu, Takeshi Ikenaga, Satoshi Goto

    IET Conference Publications   ( 525 ) 221  2006

    This paper presents an object-oriented unequal loss protection (OOULP) scheme for robust wireless scalable 3D-SPIHT encoded video transmission using product codes. In our proposed OOULP system, the embedded video bitstreams of all concerned video objects are packetized independently for transmission over a wireless network. The scheme employs regular LDPC codes and Reed-Solomon codes to deal effectively with burst errors over the wireless network. For optimal unequal loss protection, a coarse-to-fine strategy is used to assign unequal rate allocation to each object within the video: (a) coarse allocation: the inter-object rate allocation assigns the rate budget among different objects according to their priority level; (b) fine allocation: the intra-object rate allocation provides byte-wise protection to the embedded bitstream of each object to minimize its individual mean distortion. Our proposed framework allows users to independently access and manipulate the objects of interest. The simulation experiment shows that the proposed framework is robust in hostile network conditions and can therefore provide reasonable visual quality for different video objects according to their importance.


  • A CABAC Encoding Core with Dynamic Pipeline for H.264/AVC Main Profile

    Lingfeng Li, Yang Song, Takeshi Ikenaga, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems (APCCAS2006)    2006


  • High Performance Chip Design on H.264/AVC Integer Motion Estimation for 1080HDTV Based on SiS Multi-Chip Architecture

    Changqi Yang, Kouichi Kumagai, Yoshihiro Mabuchi, Kenji Yoshida, Takeshi Ikenaga, Satoshi Goto

    Picture Coding Symposium (PCS 2006)    2006

  • An Ultra-Low Complexity Motion Estimation Algorithm and its Implementation of Specific Processor

    Seiichiro Hiratsuka, Satoshi Goto, Takeshi Ikenaga

    IEEE International Symposium on Circuits and Systems (ISCAS2006)    2006

  • High Speed Floorplanner with Soft Module

    Yukio Yamakoshi, Takeshi Yoshimura, Takeshi Ikenaga, Satoshi Goto

    International SoC Design Conference (ISOCC2006)    2006

  • Fully Automatic Approach of Color Image Edge Detection

    Yangxing Liu, Takeshi Ikenaga, Satoshi Goto

    IEEE International Conference on Systems, Man, and Cybernetics    2006


  • A New Multiscale Line Detection Approach for Aerial Image with Complex Scene

    Jing Wang, Kazuo Kunieda, Makoto Iwata, Hirokazu Koizumi, Hideo Shimazu, Takeshi Ikenaga, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems (APCCAS2006)    2006


  • A novel hybrid approach of color image segmentation

    Yangxing Liu, Takeshi Ikenaga, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems, Proceedings, APCCAS     1863 - 1866  2006

    Color image segmentation is probably the most important task in image analysis and understanding. In this paper, we present a novel approach to segment color images by integrating region clustering results with edge detection results. In contrast to existing region-based clustering methods, we do not cluster all pixels in an image at one time. We divide the clustering process into two steps. First, we cluster only the reliable pixels, whose colors are not affected by shadow or highlight, to get more reasonable initial clustering results. Then we cluster the remaining unreliable pixels into the classes obtained in the previous step or into new classes. To avoid over-segmenting an image, edge detection results and spatial information are utilized to merge neighboring regions whose common boundary consists largely of weak edges. Experimental results demonstrate the efficacy of our algorithm in segmenting color images without any prior knowledge. ©2006 IEEE.


  • A VLSI Architecture Design of an Edge Based Fast Intra Prediction Mode Decision Algorithm for H.264/AVC

    Shen Li, Xianghui Wei, Satoshi Goto, Takeshi Ikenaga

    ACM GLSVLSI    2006


  • A selective video encryption scheme for MPEG compression standard

    G Liu, T Ikenaga, S Goto, T Baba


    With the increase of commercial multimedia applications using digital video, the security of video data becomes more and more important. Although several techniques have been proposed to protect video data, they provide limited security or introduce significant overhead. This paper proposes a video security scheme for the MPEG video compression standard, which includes two methods: DCEA (DC Coefficient Encryption Algorithm) and "Event Shuffle." DCEA aims to encrypt groups of codewords of DC coefficients. The feature of this method is the use of data permutation to scatter the ciphertexts of additional codes in DC codewords. These additional codes are first encrypted by a block cipher. With the combination of these algorithms, the method provides enough security for the important DC component of MPEG video data. "Event Shuffle" aims to encrypt the AC coefficients. The prominent feature of this method is a shuffling of AC events generated after the DCT transformation and quantization stages. Experimental results show that these methods introduce no bit overhead to the MPEG bit stream while achieving low processing overhead for the MPEG codec.


  • DPA against stream ciphers

    久門 亨, 角尾 幸保, 後藤 敏, 池永 剛

    IEICE SCIS 2006    2006

  • A power disturbance circuit for A5/1 resistant to power analysis attack

    Wei Dai, Tohru Hisakado, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga, Yukiyasu Tsunoo

    IEICE SCIS 2006    2006

  • Loss Free VLSI Oriented Full Computation Reusing Algorithm for H.264 Fractional Motion

    Ming Shao, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga

    19th Karuizawa Workshop on Circuits and Systems    2006

  • Low-complexity video compression method based on adaptive trees

    平塚 誠一郎, 後藤 敏, 馬場 孝明, 池永 剛

    IEICE Transactions (Japanese Edition)   D, Vol. J89-D ( No. 6 ) 1297 - 1305  2006

  • An Efficient Hardware Architecture for Full-Search Variable Block Size Motion Estimation in H.264/AVC

    Seung-Man Pyen, Kyeong-Yuk Min, Jong-Wha Chong, Satoshi Goto

    ISVC    2006

  • A novel approach of rectangular shape object detection in color images based on an MRF model

    Yangxing Liu, Takeshi Ikenaga, Satoshi Goto

    Proceedings of the 5th IEEE International Conference on Cognitive Informatics, ICCI 2006   1   386 - 393  2006

    Rectangular shape object detection in color images is a critical step of many image recognition systems. However, there are few reports on this matter. In this paper, we propose a hierarchical approach, which combines a global contour based line segment detection algorithm and a Markov Random Field (MRF) model, to extract rectangular shape objects from real color images. First, we use an elaborate edge detection algorithm to obtain the image edge map and accurate edge pixel gradient information (magnitude and direction). Then line segments are extracted from the edge map and some neighboring parallel segments are merged into a single line segment. Finally, all segments lying on the boundary of unknown rectangular shape objects are labeled via an MRF model built on line segments. Experimental results show that our method is robust in locating multiple rectangular shape objects simultaneously with respect to different size, orientation and color. © 2006 IEEE.


  • A 0.3mW 1.4mm2 motion estimation processor for mobile video application

    Seiichiro Hiratsuka, Satoshi Goto, Takeshi Ikenaga

    2006 IEEE Asian Solid-State Circuits Conference, ASSCC 2006     103 - 106  2006

    Motion estimation (ME) is a key process in video encoding systems. Since it requires a huge amount of computation, many algorithms and LSI architectures have been proposed to reduce it. Conventional LSIs, however, are not sufficient for mobile applications, which require both flexibility and low power dissipation. This paper describes an application specific instruction processor (ASIP) LSI for ME processing. It has a dedicated unit for SAD (sum of absolute differences) operations. By applying our proposed ultra-low complexity ME algorithm, named ULCMEA, it can reduce power while keeping high flexibility. A chip capable of operating at 80 MHz was fabricated using TSMC 0.18-μm CMOS technology. 15K logic gates and 32 Kbit of SRAM have been integrated into a 1.4 mm² chip. Typical power dissipation is 0.3 mW for QCIF 15 frame/sec ME processing. © 2006 IEEE.
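
    For readers unfamiliar with the SAD metric mentioned in this summary, the following is a minimal software sketch of a sum-of-absolute-differences computation between two pixel blocks. The function name `sad` and the sample block values are illustrative assumptions; the sketch does not describe the chip's dedicated SAD unit or the ULCMEA algorithm.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D pixel blocks.

    Hypothetical helper for illustration only; blocks are lists of rows.
    """
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same number of rows")
    total = 0
    for row_a, row_b in zip(block_a, block_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
    return total


# Made-up 2x2 blocks: a current block and one candidate from a reference frame.
current = [[10, 12], [14, 16]]
candidate = [[11, 12], [10, 20]]

print(sad(current, candidate))  # 1 + 0 + 4 + 4 = 9
```

    A block-matching search evaluates this cost for each candidate position and keeps the candidate with the smallest value.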


  • Robust scalable video transmission using object-oriented unequal loss protection over internet

    Zhen Qiu, Takeshi Ikenaga, Satoshi Goto

    IEEE Asia-Pacific Conference on Circuits and Systems, Proceedings, APCCAS     1583 - 1586  2006

    This paper describes our work on an object-oriented unequal loss protection (OOULP) framework for robust scalable 3D-SPIHT encoded video transmission over the Internet. In the OOULP scheme, the embedded video bitstreams of all concerned video objects are packetized into multiple description packet streams for transmission over a packet-erasure-prone network. Our proposed method allows users to independently access and manipulate the objects of interest. The simulation experiment shows that the framework is simple, fast and robust in hostile network conditions and can therefore provide reasonable video quality for different video objects according to their priority levels. ©2006 IEEE.


  • A VLSI array processing oriented fast Fourier transform algorithm and hardware implementation

    ZY Liu, Y Song, T Ikenaga, S Goto


    Many parallel Fast Fourier Transform (FFT) algorithms adopt a multi-stage architecture to increase performance. However, data permutation between stages consumes considerable memory and processing time. An FFT array processing mapping algorithm is proposed in this paper to overcome this drawback. In this algorithm, an arbitrary number 2^k of butterfly units (BUs) can be scheduled to work in parallel on n = 2^s data (k = 0, 1, ..., s - 1). Because no inter-stage data transfer is required, memory consumption and system latency are both greatly reduced. Moreover, as the number of BUs increases, not only does throughput increase linearly, but system latency also decreases linearly. This array processing oriented architecture provides a flexible tradeoff between hardware cost and system performance. In theory, the system latency is (s × 2^(s-k)) × t_clk, and the throughput is n / (s × 2^(s-k) × t_clk), where t_clk is the system clock period. Based on this mapping algorithm, several 18-bit word-length 1024-point FFT processors implemented with TSMC 0.18 μm CMOS technology are given to demonstrate its scalability and high performance. The core area of the 4-BU design is 2.991 × 1.121 mm² and the clock frequency is 326 MHz under typical conditions (1.8 V, 25 °C). This processor completes a 1024-point FFT calculation in 7.839 μs.
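
    The latency and throughput expressions in this summary can be checked with a few lines of arithmetic. The sketch below simply evaluates them for the 4-BU, 1024-point design (s = 10, k = 2) at the stated 326 MHz clock; the function name `fft_array_timing` and its signature are assumptions made for illustration, not part of the paper.

```python
def fft_array_timing(s, k, clock_hz):
    """Evaluate the summary's formulas for n = 2**s points and 2**k butterfly units.

    Hypothetical helper: returns (cycles, latency in seconds, samples per second).
    """
    if not 0 <= k <= s - 1:
        raise ValueError("k must satisfy 0 <= k <= s - 1")
    n = 2 ** s
    cycles = s * 2 ** (s - k)      # latency in clock periods: s x 2^(s-k)
    latency_s = cycles / clock_hz  # multiply by t_clk = 1 / clock_hz
    throughput = n / latency_s     # n / (s x 2^(s-k) x t_clk)
    return cycles, latency_s, throughput


cycles, latency_s, throughput = fft_array_timing(s=10, k=2, clock_hz=326e6)
print(cycles)                        # 2560 clock periods
print(f"{latency_s * 1e6:.3f} us")   # about 7.853 microseconds
```

    Under these formulas, doubling the number of butterfly units (raising k by one) halves the cycle count, which is the linear scaling the summary describes.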


  • A highly parallel architecture for deblocking filter in H.264/AVC

    LF Li, S Goto, T Ikenaga


    This paper presents a highly parallel architecture for deblocking filter in H.264/AVC. We adopt various parallel schemes in memory sub-system and datapath. A 2-dimensional parallel memory scheme is employed to support efficient parallel access in both horizontal and vertical directions in order to speed up the whole filtering process. This parallel memory also eliminates the need for a transpose circuit. In the datapath, an algorithm optimization is performed to implement parallel filtering with hardware reuse. Pipeline techniques are also adopted to improve the throughput of filtering operations. Our design is implemented under TSMC 0.18 μm technology. Results show that the core size is 0.82 × 1.13 mm² when the maximum frequency is 230 MHz. Compared to other existing architectures, our design has advantages in both speed and area.


  • Content-based motion estimation with extended temporal-spatial analysis

    S Li, Y Jiang, T Ikenaga, S Goto


    In adaptive motion estimation, spatial-temporal correlation based motion type inference has been recognized as an effective way to guide the motion estimation strategy adjustment according to video contents. However, the complexity and the reliability of those methods remain two crucial problems. In this paper, a motion vector field model is introduced as the basis for a new spatial-temporal correlation based motion type inference method. For each block, Full Search with Adaptive Search Window (ASW) and Three Step Search (TSS), as two search strategy candidates, can be employed alternatively. Simulation results show that the proposed method can constantly reduce the dynamic computational cost to as low as 3% to 4% of that of Full Search (FS), while remaining a closer approximation to FS in terms of visual quality than other fast algorithms for various video sequences. Due to its efficiency and reliability, this method is expected to be a favorable contribution to the mobile video communication where low power real-time video coding is necessary.


  • Reconfigurable adaptive FEC system based on Reed-Solomon code with interleaving

    K Shimizu, N Togawa, T Ikenaga, S Goto


    This paper proposes a reconfigurable adaptive FEC system based on Reed-Solomon (RS) code with interleaving. In adaptive FEC schemes, the error correction capability t is changed dynamically according to the communication channel condition. For a given error correction capability t, we can implement an optimal RS decoder composed of the minimum hardware units for that t. If the hardware units of the RS decoder can be reduced for any given error correction capability t, we can embed as large a deinterleaver as possible into the RS decoder for each t. Reconfiguring the RS decoder embedded with the expanded deinterleaver dynamically for each error correction capability t allows us to decode larger interleaved codes, which are more robust to burst errors. In a reliable transport protocol, experimental results show that our system achieves up to 65% lower packet error rate and 5.9% higher data transmission throughput compared to the adaptive FEC scheme on a conventional fixed hardware system. In an unreliable transport protocol, our system achieves up to 76% better bit error performance with a higher code rate compared to the adaptive FEC scheme on a conventional fixed hardware system.


  • An Efficient Deblocking Filter Architecture with 2-Dimensional Parallel Memory for H.264/AVC

    Lingfeng Li, Satoshi Goto, Takeshi Ikenaga

    IEEE, ASP-DAC 2005    2005

  • Dynamic reconfiguration in motion estimation

    Divya Krishna Murthy, Shen Li, Satoshi Goto, Tomoyashi Sato, Prakash S. Murthy, Vivek T. D.

    Asia and South Pacific International Conference on Embedded SoCs    2005

  • Partially-parallel LDPC decoder based on high-efficiency message-passing algorithm

    K Shimizu, T Ishikawa, N Togawa, T Ikenaga, S Goto

    2005 IEEE International Conference on Computer Design: VLSI in Computers & Processors, Proceedings     503 - 510  2005

    This paper proposes a partially-parallel LDPC decoder based on a high-efficiency message-passing algorithm. Our proposed partially-parallel LDPC decoder performs the column operations for bit nodes in conjunction with the row operations for check nodes. Bit functional unit with pipeline architecture in our LDPC decoder allows us to perform column operations for every bit node connected to each of check nodes which are updated by the row operations in parallel. Our proposed LDPC decoder improves the timing when the column operations are performed, accordingly it improves the message-passing efficiency within the limited number of iterations for decoding. We implemented the proposed partially-parallel LDPC decoder on an FPGA, and simulated its decoding performance. Practical simulation shows that our proposed LDPC decoder reduces the number of iterations for decoding, and it improves the bit error performance with a small hardware overhead.


  • Surface curvature based automatic human face feature extraction

    Jing Wang, Yanning Zhang, Satoshi Goto

    International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS 2005)    2005

  • A locally adaptive subsampling algorithm for software based motion estimation

    Seiichiro Hiratsuka, Satoshi Goto, Takeshi Ikenaga

    IEEE International Symposium on Circuits and Systems (ISCAS2005)    2005


  • A VLSI architecture for motion compensation interpolation in H.264/AVC

    Yang Song, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga

    6th International Conference on ASIC(ASICON2005)    2005

  • A robust algorithm for text detection in color image

    Yangxing Liu, Satoshi Goto, Takeshi Ikenaga

    8th International Conference on Document Analysis and Recognition (ICDAR)    2005


  • An accurate and low complexity approach of detecting circular shape objects in still color images

    Yangxing Liu, Satoshi Goto, Takeshi Ikenaga

    IEEE International Conference on Image Processing (ICIP)    2005


  • Elliptic curve cryptography LSI with a high-performance GF arithmetic unit

    小林伸行, 久門亨, 内田純平, 後藤敏, 池永剛, 角尾幸保

    SCIS2005    2005

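  • Implementation and evaluation of a partially-parallel LDPC decoder that improves the propagation efficiency of reliability information

    清水一範, 石川達之, 戸川望, 池永剛, 後藤敏

    Karuizawa Workshop on Circuits and Systems, IEICE    2005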

  • Motion estimation algorithm modification and implementation in H.264/AVC

    Yang Song, Zhenyu Liu, Satoshi Goto, Takeshi Ikenaga

    IEICE Karuizawa Workshop on Circuits and Systems    2005

  • Challenges facing Japan's semiconductor industry and proposals for a new revival (日本の半導体産業の課題と新たな復活への提言)

    Satoshi Goto

    JEITA Review   6 ( 10 ) 2 - 7  2005


  • Proposals for the revival of Japan's semiconductor industry (日本の半導体産業の復活への提言)

    Satoshi Goto

    Innovation Tsushin (Innovation通信)   9 ( Autumn ) 6 - 6  2005

  • A mixed design flow for FPGA prototyping of design with scan circuits

    Lifeng Li, Eko Fajar, Ken-ichi Kurimoto, Satoshi Goto

    6th International Conference on ASIC(ASICON2005)    2005

  • Real-time automatic moving object extraction and tracing based on improved active contour model

    Yunchuan Geng, Yangxing Liu, Satoshi Goto, Takeshi Ikenaga

    International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS 2005)    2005

  • Content-based Motion Estimation VLSI design for real time MPEG-4 video coding

    Shen Li, Satoshi Goto, Takeshi Ikenaga, Hideki Takeda, Masataka Matsui

    IEICE Karuizawa Workshop on Circuits and Systems    2005

  • An MRF model based algorithm of triangular shape object detection in color images

    Yangxing Liu, Satoshi Goto, Takeshi Ikenaga

    International Conference on Intelligent Computing (ICIC)    2005

  • Inter-symbol interference suppression employing sub-carrier group selection for OFDM-TDD transmit diversity

    Fumiaki Maehara, Takeshi Ikenaga, Fumio Takahata, Satoshi Goto

    2nd International Symposium on Wireless Communications Systems 2005, ISWCS 2005 - Conference Proceedings   2005   515 - 519  2005

     View Summary

    This paper proposes an inter-symbol interference (ISI) suppression scheme using the sub-carrier group selection for orthogonal frequency division multiplexing (OFDM) time division duplex (TDD) transmit diversity. Although transmit diversity is quite effective in improving the transmission performance, the ISI due to the long multipath delays causes the error floor and leads to a loss of the space diversity benefit. To cope with this degradation, the periodic time domain waveform obtained by the even-numbered sub-carrier group is used for the extension of the guard interval. Moreover, the odd-numbered sub-carrier group is also applied to exploit the frequency diversity benefit, which is realized by carrying out the simple frequency conversion at the receiver. To keep the constant transmission rate even in the sub-carrier group transmission, the high-level modulation scheme with power enhancement is applied. The computer simulation results show that the proposed scheme not only prevents the error floor due to the ISI but also achieves the space and frequency diversity benefit without the extension of the guard interval. © 2005 IEEE.


  • The LSI of the future (未来のLSI)

    Satoshi Goto

    Journal of the IEICE (電子情報通信学会誌)   88 ( 10 ) 790 - 794  2005

  • Reconfigurable adaptive FEC system based on Reed-Solomon code with interleaving

    Kazunori Shimizu, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

    IEICE Transactions on Information and Systems   E88-D ( 7 ) 1526 - 1537  2005

     View Summary

    This paper proposes a reconfigurable adaptive FEC system based on Reed-Solomon (RS) code with interleaving. In adaptive FEC schemes, error correction capability t is changed dynamically according to the communication channel condition. For given error correction capability t, we can implement an optimal RS decoder composed of minimum hardware units for each t. If the hardware units of the RS decoder can be reduced for any given error correction capability t, we can embed as large deinterleaver as possible into the RS decoder for each t. Reconfiguring the RS decoder embedded with the expanded deinterleaver dynamically for each error correction capability t allows us to decode larger interleaved codes which are more robust error correction codes to burst errors. In a reliable transport protocol, experimental results show that our system achieves up to 65% lower packet error rate and 5.9% higher data transmission throughput compared to the adaptive FEC scheme on a conventional fixed hardware system. In an unreliable transport protocol, our system achieves up to 76% better bit error performance with higher code rate compared to the adaptive FEC scheme on a conventional fixed hardware system. Copyright © 2005 The Institute of Electronics, Information and Communication Engineers.


  • Dynamic reconfiguration in motion estimation

    Divya Krishna Murthy, Shen Li, Satoshi Goto, Tomoyashi Sato, Prakash S. Murthy, Vivek T. D.

    Asia and South Pacific International Conference on Embedded SoCs    2005

  • Partially-parallel LDPC decoder based on high-efficiency message-passing algorithm

    Kazunori Shimizu, Tatsuyuki Ishikawa, Nozomu Togawa, Takeshi Ikenaga, Satoshi Goto

    Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors   2005   503 - 510  2005

     View Summary

    This paper proposes a partially-parallel LDPC decoder based on a high-efficiency message-passing algorithm. Our proposed partially-parallel LDPC decoder performs the column operations for bit nodes in conjunction with the row operations for check nodes. Bit functional unit with pipeline architecture in our LDPC decoder allows us to perform column operations for every bit node connected to each of check nodes which are updated by the row operations in parallel. Our proposed LDPC decoder improves the timing when the column operations are performed, accordingly it improves the message-passing efficiency within the limited number of iterations for decoding. We implemented the proposed partially-parallel LDPC decoder on an FPGA, and simulated its decoding performance. Practical simulation shows that our proposed LDPC decoder reduces the number of iterations for decoding, and it improves the bit error performance with a small hardware overhead. © 2005 IEEE.


  • Content-based Motion Estimation with Extended Temporal-Spatial Analysis

    Shen Li, Yong Jiang, Takeshi Ikenaga, Satoshi Goto

    IEEE MWSCAS     461 - 464  2004

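  • Montgomery modular multiplier for public-key cryptography (公開鍵暗号向けモンゴメリ乗算剩余演算器)

    Toru Hisakado, Nobuyuki Kobayashi, Takeshi Ikenaga, Satoshi Goto

    IEICE 8th System LSI Workshop     343 - 346  2004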

  • A reconfigurable adaptive FEC system for reliable wireless communications

    Kazunori Shimizu, Nozomu Togawa, Takeshi Ikenaga, Masao Yanagisawa, Satoshi Goto, Tatsuo Ohtsuki

    IEEE proc.APCCAS2004     13 - 16  2004

  • Video Coding Algorithm based on adaptive Tree for Low Electricity Consumption

    Seiichiro Hiratsuka, Satoshi Goto, Takaaki Baba, Takeshi Ikenaga

    IEEE Proc. APCCAS2004    2004

  • A Low Complexity Variable Block Size Motion Estimation Algorithm for Video Telephony Communication

    Yong Jiang, Shen Li, Satoshi Goto

    IEEE MWSCAS     465 - 468  2004

  • An Efficient Algorithm/Architecture Codesign for Image Encoders

    Jinku Choi, Satoshi Goto, Takeshi Ikenaga

    IEEE MWSCAS    2004

  • N bit-wise modular multiplier architecture for public key cryptography

    T Hisakado, N Kobayashi, T Ikenaga, T Baba, S Goto, K Higashi, I Kitao, Y Tsunoo


     View Summary

    As the information society progresses, we rely more and more on secure digital information processing. Cryptography plays an important role wherever unwanted eavesdropping or falsification has to be prevented. Public-key schemes, including RSA, require a huge number of arithmetic operations, most of which are modular multiplications with a very large bit width. This operation takes a long time, so a hardware implementation is advantageous. We propose a hardware implementation of an N-bit-wise multiplier. It performs the operation at twice the original speed for the same circuit size, or reduces the circuit size to approximately 60% for the same processing time. The proposed architecture contributes to higher performance and a smaller chip size for encryption systems.

  • System LSI education strategy at Waseda University

    Satoshi Goto

    VLSI circuits and Systems Education Program Workshop    2004

  • A Stable Multi-level Partitioning Algorithm Using Adaptive Connectivity Threshold

    Jin-Kuk Kim, Jong-Wha Chong, Satoshi Goto

    IEEE Proc. APCCAS2004    2004

  • Trellis-Coded Space-Time Linear Constellation Precoding Codes

    Shouyin Liu, Jong-Wha Chong, Satoshi Goto

    IEEE VTC Spring 2004    2004

  • Future reconfigurable computing architecture

    Satoshi Goto

    IEEE, ASPDAC 2004    2004

  • Characteristics of Dipole antenna delay

    Yuji Tanabe, Tomoki Uwano, Satoshi Goto, Tatsuo Ohtsuki, Takaaki Baba

    2004 Korea-Japan AP/EMC/EMT Joint Conference    2004

  • No Bit overhead MPEG Video Scrambling based on Event Shuffle in Frequency Domain

    Gang Liu, Satoshi Goto, Takaaki Baba, Takeshi Ikenaga

    IEEE Proc. APCCAS2004    2004

  • ATM technology for 21st Century

    Satoshi Goto

    Iranian Conference on Electrical Engineering    2001

  • Broadband networking technology for the 21st century

    Satoshi Goto

    NEC R&D    2001

  • Photo technology for next generation

    Satoshi Goto

    ITU, Telecom2000    2000

  • The evolving information network and its impact on management

    Satoshi Goto

    ITU, TELECOM2000    2000

  • Linear optimal one-sided single-detour algorithm for untangling twisted bus

    Tao Lin, Sheqin Dong, Song Chen, Satoshi Goto

    IEEE ASP-DAC 2012     151 - 156




Internal Special Research Projects

  • Research on platform design technology for image-processing LSIs (画像処理LSIプラトフォーム設計技術の研究)


     View Summary

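    Current information and communication technology handles many kinds of multimedia information (text, audio, still images, and video), yet it applies the same processing scheme across the board, with little regard for the type, characteristics, or importance of the data. For example, when multimedia data is transmitted, it is compressed, encrypted, and error-correction coded as separate steps, regardless of its content. Because each step runs on separate, independent hardware, this leads to redundant computation, more hardware, and more software processing; the overall equipment grows in scale, and power consumption cannot be reduced substantially. Moreover, as media diversify and grow in volume, the power needed for processing increases exponentially under technologies that merely extend existing approaches, so innovative technology is needed.

    To realize an ultra-low-power media processing SoC, the research combined optimal partitioning among the image, cryptography, and error-correction coding schemes, algorithm optimization methods, and hardware/software implementation optimization methods, with the goal of reducing power to 1/100 of that of conventional technology.

    (1) At the scheme and algorithm level, we addressed image compression, the most computationally demanding part of media processing. We devised a method that predicts image motion and dynamically changes the processor frequency according to the required amount of computation, implemented it on a multicore system, and reduced power consumption by 46% on average and by up to 78% in a surveillance application. For video conferencing, we introduced a region-of-interest (RoI) scheme that accurately detects and compresses the RoI (the face), lowering quality outside the RoI while keeping the RoI at high quality; this reduced encoder computation by 76% on average.

    (2) In chip prototyping, we fabricated LSIs for video encoding and decoding, error-correction coding, and cryptography, and reduced power consumption substantially. For video encoding, we developed an HDTV (H.264) encoder LSI that uses about 50% less power than the best previous design. For video decoding, we prototyped a 4096x2160 (H.264) decoder LSI and confirmed a power reduction of about 60% compared with conventional designs. For error-correction coding, we developed an LDPC decoder LSI that uses about 90% less power than conventional designs. For cryptography, we developed an AES LSI and confirmed a power reduction of about 50%.

    Notable results include the following.

    1. Low-power video encoder system: We implemented a video encoder based on our own motion-difference method on a multicore processor and prototyped a system with a control scheme that dynamically changes the frequency (voltage) of each core according to the required amount of computation. It achieved a power reduction of 48% on average and 78% at best compared with the conventional method. The work was presented at ICME 2010 and published in SPIC (March 2011 issue). Demonstrations were given at ISLPED 2011 and ICME 2010, and the work received an outstanding paper award at ISOCC 2010.

    2. 4Kx2K video decoder LSI: We developed an H.264 High Profile decoder LSI for 4096x2160 video that processes 60 frames per second and reduces power by about 60% compared with conventional designs. The work was presented at the IEEE VLSI Symposium 2010 and published in IEEE JSSC (April 2011 issue) (24). It received an outstanding design award at ISLPED 2010 and the best student paper award at VLSI Symposia 2011.

    3. STP engine for human detection: Renesas implemented a human-detection algorithm developed at Waseda University on a PC (Core2Duo@3GHz) and on an STP (Stream Transpose, a dynamically reconfigurable processor), and measured the energy consumed. The STP implementation was shown to reduce energy by about 98% compared with the PC. Numerous demonstrations were given, including at ISLPED 2011.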

  • Research on design and implementation technology for next-generation system LSIs (次世代システムLSI設計実装技術の研究)


     View Summary

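    We developed an algorithm for a motion-estimation (M.E.) encoder for the H.264 full-HD format (1920×1080 pixels), together with a design architecture that integrates it into an SiS using the MeP media processor, which was licensed from a company. We then prototyped a chip to realize, as an SiS, an H.264 full-HD encoder equipped with the M.E. processing engine. The chip was completed in October 2006.

    The chip uses a 0.18 μm CMOS process; the die measures 5.44 mm x 4.98 mm, with 1.14 M gates of logic and 108.3 KB of SRAM. A 64 Mb SiDRAM is used, connected to the ASIC through a 23 Gb/s interface. Adopting the MeP processor made system-wide control easy to implement. Entropy coding and the deblocking filter were implemented in hardware and connected, together with the encoder engine, over a 64-bit AMBA bus. The encoder chip operates at 200 MHz and consumes 1.4 W including the DRAM; it can be regarded as the world's first encoder chip developed for H.264 full HD. A paper was accepted at VLSI Circuits 2007 and is scheduled for presentation in June 2007.

    These technologies can be applied to low-power mobile terminals and digital home appliances, and we plan to pursue commercialization with the companies that co-developed them. In addition, our proposal "Research on an ultra-low-power media processing SoC", which aims to reduce power by integrating the image-processing LSI, cryptography LSI, and coding LSI technologies we had been researching independently, was adopted under the JST Strategic Basic Research Programs (CREST type), in the research area "Technology innovation and integration aimed at ultra-low power consumption in information processing systems" (FY2006 to FY2011).

    Helped by the dissemination of results from this project, joint research themes with major IDMs and venture companies are taking shape. Specifically, the aim is to reach 1/100 of conventional power consumption by unifying image compression, cryptographic processing, and coding. Because it pursues ultra-low power from a whole-system perspective as well as through innovation in each technology element, the work has attracted considerable attention. There are high expectations that technology originating at the university will be developed through shared roles between venture companies located in Kitakyushu Science and Research Park and major IDMs, and will lead to next-generation products.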