Updated on 2024/04/20

 
ISHIKAWA, Hiroshi
 
Affiliation
Faculty of Science and Engineering, School of Fundamental Science and Engineering
Job title
Professor
Degree
Ph.D. in Computer Science ( New York University (USA) )
Master of Science ( Kyoto University )

Research Experience

  • 2016
    -
    Now

    National Institute of Informatics   Visiting Professor

  • 2010
    -
    Now

    Waseda University   Department of Computer Science and Engineering   Professor

  • 2009
    -
    2013

    JST   PRESTO (Sakigake) Researcher

  • 2010
     
     

    Nagoya City University   Department of Information and Biological Sciences   Professor

  • 2005
    -
    2010

    Nagoya City University   Department of Information and Biological Sciences   Associate Professor

  • 2004
    -
    2005

    Nagoya City University   Department of Information and Biological Sciences   Assistant Professor

  • 2000
    -
    2001

    New York University   Courant Institute of Mathematical Sciences   Associate Research Scientist

Education Background

  •  
    -
    2000

    New York University   Computer Science  

  •  
    -
    1993

    Kyoto University   Graduate School, Division of Natural Science   Mathematics  

  •  
    -
    1991

    Kyoto University   Faculty of Science   Mathematics  

Committee Memberships

  • 2020
     
     

    European Conference on Computer Vision (ECCV2020)  Area Chair

  • 2018
    -
    2020

    Asian Conference on Computer Vision (ACCV2020)  Program Chair

  • 2016
    -
    2020

    IPSJ SIG-CVIM: Computer Vision and Image Media  Managing Member

  • 2016
    -
    2020

    IPSJ Transactions on Computer Vision Applications  Associate Editor in Chief

  • 2019
     
     

    IEEE International Conference on Computer Vision (ICCV2019)  Area Chair

  • 2018
    -
    2019

    Computer Graphics International 2019 (CGI’19)  Conference Co-Chair

  • 2017
    -
    2018

    IPSJ 80th National Convention Executive Committee  Vice Chair

  • 2016
    -
    2018

    The Institute of Image Electronics Engineers of Japan  Member of the Board of Directors

  • 2017
     
     

    IEEE International Conference on Computer Vision (ICCV2017)  Area Chair

  • 2016
    -
    2017

    Fifteenth IAPR International Conference on Machine Vision Applications (MVA2017)  General Chair

  • 2011
    -
    2017

    IEICE PRMU: Pattern Recognition and Media Understanding  Expert Committee Member

  • 2016
     
     

    Asian Conference on Computer Vision (ACCV2016)  Area Chair

  • 2016
     
     

    Conference Editorial Board, Meeting on Image Recognition and Understanding (MIRU2016)  Editor in Chief

  • 2013
    -
    2016

    IEEE Transactions on Pattern Analysis and Machine Intelligence  Associate Editor

  • 2015
     
     

    IEEE International Conference on Computer Vision (ICCV2015)  Area Chair

  • 2015
     
     

    Conference Editorial Board, Meeting on Image Recognition and Understanding (MIRU2015)  Associate Editor in Chief

  • 2014
    -
    2015

    Fourteenth IAPR International Conference on Machine Vision Applications (MVA2015)  General Chair

  • 2012
    -
    2015

    IET Computer Vision  Editorial Board Member

  • 2014
     
     

    Asian Conference on Computer Vision (ACCV2014)  Area Chair

  • 2014
     
     

    Conference Editorial Board, Meeting on Image Recognition and Understanding (MIRU2014)  Associate Editor in Chief

  • 2013
    -
     

    IEICE MI: Medical Imaging  Expert Committee Member

  • 2013
    -
     

    IEICE Technical Committee on Medical Imaging  Expert Committee Member

  • 2013
     
     

    Meeting on Image Recognition and Understanding (MIRU2013)  Area Chair

  • 2013
     
     

    IEEE Conference on Computer Vision and Pattern Recognition (CVPR2013)  Area Chair

  • 2011
    -
    2013

    International Conference on Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR 2011, 2013)  Program Committee Member

  • 2010
    -
    2013

    IPSJ Transactions on Computer Vision Applications  Associate Editor

  • 2009
    -
    2012

    IEICE Transactions on Information and Systems  Editorial Board Member

  • 2009
    -
    2012

    IEICE Transactions on Information and Systems (Japanese Edition)  Editorial Board Member

  • 2008
    -
    2012

    IEEE Computer Society Workshop on Perceptual Organization in Computer Vision (POCV2008, 2010, 2012)  Program Committee Member

  • 2011
     
     

    Meeting on Image Recognition and Understanding (MIRU2011)  Area Chair

  • 2011
     
     

    IEEE Conference on Computer Vision and Pattern Recognition (CVPR2011)  Area Chair

  • 2007
    -
    2011

    IEEE International Conference on Computer Vision (ICCV 2007, 2009, 2011)  Program Committee Member

  • 2010
    -
     

    International Journal of Computer Vision  Editorial Board Member

  • 2010
     
     

    Asian Conference on Computer Vision (ACCV2010)  Program Committee Member

  • 2010
     
     

    Meeting on Image Recognition and Understanding (MIRU2010)  Area Chair

  • 2008
    -
    2010

    European Conference on Computer Vision (ECCV 2008, 2010)  Program Committee Member

  • 2007
    -
    2010

    IPSJ SIG-CVIM: Computer Vision and Image Media  Organizing Committee Member

  • 2007
    -
    2010

    IPSJ Special Interest Group on Computer Vision and Image Media (CVIM) Steering Committee  Steering Committee Member

  • 2009
     
     

    Meeting on Image Recognition and Understanding (MIRU2009)  Area Chair

  • 2008
     
     

    Short Program: Graph Cuts and Related Discrete or Continuous Optimization Problems at IPAM, UCLA.  Organizing Committee Member

  • 2008
     
     

    International Conference on Pattern Recognition (ICPR2008)  Program Committee Member

  • 2008
     
     

    Meeting on Image Recognition and Understanding  Area Chair

  • 2006
    -
    2007

    Tenth IAPR Conference on Machine Vision Applications (MVA2007).  Local Arrangement Chair

  • 2005
    -
     

    IAPR Conference on Machine Vision Applications (MVA)  Organizing Committee Member

Professional Memberships

  •  
     
     

    ACM

  •  
     
     

    IEEE

  •  
     
     

    The Institute of Electronics, Information and Communication Engineers

  •  
     
     

    Information Processing Society of Japan

Research Areas

  • Intelligent robotics

Research Interests

  • Artificial Intelligence, Deep Learning

  • Computer Vision, Discrete Optimization, Pattern Analysis

Awards

  • 75th Annual IEICE Best Paper Award

    2019.06   The Institute of Electronics, Information and Communication Engineers  

  • Innovative Technologies 2016 Special Prize for Culture

    2016.10   Ministry of Economy, Trade and Industry  

  • MIRU Nagao Award (Best Paper Award)

    2009.07  

  • Young Author Award

    2006.12   IEEE Computer Society Japan Chapter  

  • MIRU2006 Excellent Paper Award

    2006.07  

  • Harold Grad Memorial Prize (Courant Institute of Mathematical Sciences)

    2000.04   Courant Institute of Mathematical Sciences, New York University  

 

Papers

  • Optimization-Based Data Generation for Photo Enhancement

    M. Omiya, Y. Horiuchi, E. Simo-Serra, S. Iizuka, H. Ishikawa

    New Trends in Image Restoration and Enhancement Workshop (NTIRE2019) at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2019)    2019.06  [Refereed]

  • Temporal Distance Matrices for Squat Classification

    R. Ogata, E. Simo-Serra, S. Iizuka, H. Ishikawa

    Fifth International Workshop on Computer Vision in Sports (CVsports) at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2019)    2019.06  [Refereed]

  • Re-staining Pathology Images by FCNN

    M. Fujitani, Y. Mochizuki, S. Iizuka, E. Simo-Serra, H. Kobayashi, C. Iwamoto, K. Ohuchida, M. Hashizume, H. Hontani, H. Ishikawa

    The 16th International Conference on Machine Vision Applications (MVA 2019)    2019.05  [Refereed]

    Citations (Scopus): 7
  • Spectral Normalization and Relativistic Adversarial Training for Conditional Pose Generation with Self-Attention

    Y. Horiuchi, E. Simo-Serra, S. Iizuka, H. Ishikawa

    The 16th International Conference on Machine Vision Applications (MVA 2019)    2019.05  [Refereed]

    Citations (Scopus): 6
  • Learning Photo Enhancement by Black-Box Model Optimization Data Generation

    M. Omiya, E. Simo-Serra, S. Iizuka, H. Ishikawa

    SIGGRAPH Asia 2018 Technical Briefs    2018.12  [Refereed]

    Citations (Scopus): 6
  • Real-Time Data-Driven Interactive Rough Sketch Inking

    E. Simo-Serra, S. Iizuka, H. Ishikawa

    ACM Transactions on Graphics (Proc. of SIGGRAPH2018)   37 ( 4 )  2018.08  [Refereed]

    Citations (Scopus): 40
  • Medical Image Segmentation by Higher-Order Graph Cuts (in Japanese)

    H. Ishikawa, Y. Kitamura

    画像ラボ   ( July 2018 issue ) 21 - 26  2018.07

  • Learning to restore deteriorated line drawing

    Kazuma Sasaki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    Visual Computer   34 ( 6-8 ) 1077 - 1085  2018.06  [Refereed]

     View Summary

    We propose a fully automatic approach to restore aged old line drawings. We decompose the task into two subtasks: the line extraction subtask, which aims to extract line fragments and remove the paper texture background, and the restoration subtask, which fills in possible gaps and deterioration of the lines to produce a clean line drawing. Our approach is based on a convolutional neural network that consists of two sub-networks corresponding to the two subtasks. They are trained as part of a single framework in an end-to-end fashion. We also introduce a new dataset consisting of manually annotated sketches by Leonardo da Vinci which, in combination with a synthetic data generation approach, allows training the network to restore deteriorated line drawings. We evaluate our method on challenging 500-year-old sketches and compare with existing approaches with a user study, in which it is found that our approach is preferred 72.7% of the time.

    Citations (Scopus): 13
  • Flooding-based segmentation for contactless hand biometrics oriented to mobile devices

    Gonzalo Bailador, Belén Ríos-Sánchez, Raúl Sánchez-Reillo, Hiroshi Ishikawa, Carmen Sánchez-Ávila

    IET Biometrics   7 ( 5 ) 431 - 438  2018.05  [Refereed]

    Citations (Scopus): 4
  • Multi-label Fashion Image Classification with Minimal Human Supervision

    Naoto Inoue, Edgar Simo-Serra, Toshihiko Yamasaki, Hiroshi Ishikawa

    Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017   2018-   2261 - 2267  2018.01  [Refereed]

     View Summary

    We tackle the problem of multi-label classification of fashion images, learning from noisy data with minimal human supervision. We present a new dataset of full body poses, each with a set of 66 binary labels corresponding to the information about the garments worn in the image obtained in an automatic manner. As the automatically-collected labels contain significant noise, we manually correct the labels for a small subset of the data, and use these correct labels for further training and evaluation. We build upon a recent approach that both cleans the noisy labels and learns to classify, and introduce simple changes that can significantly improve the performance.

    Citations (Scopus): 43
  • What Makes a Style: Experimental Analysis of Fashion Prediction

    Moeko Takagi, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017   2018-   2247 - 2253  2018.01  [Refereed]

     View Summary

    In this work, we perform an experimental analysis of the differences between how humans and machines see and distinguish fashion styles. For this purpose, we propose a new expert-curated dataset for fashion style prediction, which consists of 14 different fashion styles, each with roughly 1,000 images of worn outfits. The dataset, with a total of 13,126 images, captures the diversity and complexity of modern fashion styles. We perform an extensive analysis of the dataset by benchmarking a wide variety of modern classification networks, and also perform an in-depth user study with both fashion-savvy and fashion-naive users. Our results indicate that, although classification networks are able to outperform naive users, they are still far from the performance of savvy users, for whom it is important to consider not only texture and color but also subtle differences in the combination of garments.

    Citations (Scopus): 41
  • Mastering sketching: Adversarial augmentation for structured prediction

    Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    ACM Transactions on Graphics   37 ( 1 )  2018.01  [Refereed]

     View Summary

    We present an integral framework for training sketch simplification networks that convert challenging rough sketches into clean line drawings. Our approach augments a simplification network with a discriminator network, training both networks jointly so that the discriminator network discerns whether a line drawing is real training data or the output of the simplification network, which, in turn, tries to fool it. This approach has two major advantages: first, because the discriminator network learns the structure in line drawings, it encourages the output sketches of the simplification network to be more similar in appearance to the training sketches. Second, we can also train the networks with additional unsupervised data: by adding rough sketches and line drawings that are not corresponding to each other, we can improve the quality of the sketch simplification. Thanks to a difference in the architecture, our approach has advantages over similar adversarial training approaches in stability of training and the aforementioned ability to utilize unsupervised training data. We show how our framework can be used to train models that significantly outperform the state of the art in the sketch simplification task, despite using the same architecture for inference. We also present an approach to optimize for a single image, which improves accuracy at the cost of additional computation time. Finally, we show that, using the same framework, it is possible to train the network to perform the inverse problem, i.e., convert simple line sketches into pencil drawings, which is not possible using the standard mean squared error loss. We validate our framework with two user tests, in which our approach is preferred to the state of the art in sketch simplification 88.9% of the time.

    Citations (Scopus): 83
  • Medical Image Segmentation by Higher-Order Energy Minimization

    Y. Kitamura, H. Ishikawa

    IEICE Transactions on Information and Systems (Japanese Edition)   J101-D ( 1 ) 3 - 26  2018.01  [Refereed]  [Invited]

  • Stain Conversion of Pathology Images by a CNN Trained on a Dataset That Accounts for Image Similarity (Medical Imaging) (in Japanese)

    M. Fujitani, Y. Mochizuki, S. Iizuka, E. Simo-Serra, H. Ishikawa

    IEICE Technical Report   117 ( 281 ) 9 - 14  2017.11

  • Adaptive Energy Selection for Content-Aware Image Resizing

    K. Sasaki, Y. Nagahama, Z. Ze, S. Iizuka, E. Simo-Serra, Y. Mochizuki, H. Ishikawa

    Fourth Asian Conference on Pattern Recognition (ACPR2017)    2017.11  [Refereed]

  • Turning Black-and-White Photographs into Color with Artificial Intelligence (in Japanese)

    H. Ishikawa

    画像ラボ   ( October 2017 issue ) 14 - 21  2017.10

  • Banknote portrait detection using convolutional neural network

    Ryutaro Kitagawa, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Matsuki, Naotake Natori, Hiroshi Ishikawa

    Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017     440 - 443  2017.07  [Refereed]

     View Summary

    Banknotes generally have different designs according to their denominations. Thus, if characteristics of each design can be recognized, they can be used for sorting banknotes according to denominations. The portrait on a banknote is one such characteristic that can be used for classification. A sorting system for banknotes can be designed that recognizes the portrait in each banknote and sorts it accordingly. In this paper, our aim is to automate the configuration of such a sorting system by automatically detecting portraits in sample banknotes, so that it can be quickly deployed in a new target country. We use Convolutional Neural Networks to detect portraits in a completely new set of banknotes, robustly to variation in the ways they are shown, such as the size and the orientation of the face.

    Citations (Scopus): 9
  • Unsupervised video object segmentation by supertrajectory labeling

    Masahiro Masuda, Yoshihiko Mochizuki, Hiroshi Ishikawa

    Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017     448 - 451  2017.07  [Refereed]

     View Summary

    We propose a novel approach to unsupervised video segmentation based on the trajectories of Temporal Super-pixels (TSPs). We cast the segmentation problem as a trajectory-labeling problem and define a Markov random field on a graph in which each node represents a trajectory of TSPs, which we minimize using a new two-stage optimization method we developed. The adoption of the trajectories as basic building blocks brings several advantages over conventional superpixel-based methods, such as more expressive potential functions, temporal coherence of the resulting segmentation, and a drastically reduced number of MRF nodes. The most important effect is, however, that it allows more robust segmentation of the foreground that is static in some frames. The method is evaluated on a subset of the standard SegTrack benchmark and yields competitive results against the state-of-the-art methods.

    Citations (Scopus): 1
  • Multiple-organ segmentation by graph cuts with supervoxel nodes

    Toshiya Takaoka, Yoshihiko Mochizuki, Hiroshi Ishikawa

    Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017     424 - 427  2017.07  [Refereed]

     View Summary

    Improvement in medical imaging technologies has made it possible for doctors to directly look into patients' bodies in ever finer details. However, since only the cross-sectional image can be directly seen, it is essential to segment the volume into organs so that their shape can be seen as 3D graphics of the organ boundary surfaces. Segmentation is also important for quantitative measurement for diagnosis. Here, we introduce a novel higher-precision method to segment multiple organs using graph cuts within medical images such as CT-scanned images. We utilize supervoxels instead of voxels as the units of segmentation, i.e., the nodes in the graphical model, and design the energy function to minimize accordingly. We use the SLIC supervoxel algorithm and verify the performance of our energy-minimization-based segmentation algorithm by comparing its results with the ground truth.

    Citations (Scopus): 3
  • Message from the chairs

    Hiroshi Ishikawa, Ryuzo Okada, Norimichi Ukita, Greg Mori

    Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017    2017.07  [Refereed]

  • Globally and Locally Consistent Image Completion

    Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    ACM Transactions on Graphics   36 ( 4 )  2017.07  [Refereed]

     View Summary

    We present a novel approach for image completion that results in images that are both locally and globally consistent. With a fully-convolutional neural network, we can complete images of arbitrary resolutions by filling in missing regions of any shape. To train this image completion network to be consistent, we use global and local context discriminators that are trained to distinguish real images from completed ones. The global discriminator looks at the entire image to assess if it is coherent as a whole, while the local discriminator looks only at a small area centered at the completed region to ensure the local consistency of the generated patches. The image completion network is then trained to fool both context discriminator networks, which requires it to generate images that are indistinguishable from real ones with regard to overall consistency as well as in details. We show that our approach can be used to complete a wide variety of scenes. Furthermore, in contrast with the patch-based approaches such as PatchMatch, our approach can generate fragments that do not appear elsewhere in the image, which allows us to naturally complete the images of objects with familiar and highly specific structures, such as faces.

    Citations (Scopus): 1664
  • Guest Editorial: Machine Vision Applications

    Yasuyo Kita, Hiroshi Ishikawa, Takeshi Masuda

    International Journal of Computer Vision   122 ( 2 ) 191 - 192  2017.04  [Refereed]

    Citations (Scopus): 5
  • Joint Gap Detection and Inpainting of Line Drawings

    Kazuma Sasaki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)     5768 - 5776  2017  [Refereed]

     View Summary

    We propose a novel data-driven approach for automatically detecting and completing gaps in line drawings with a Convolutional Neural Network. In the case of existing inpainting approaches for natural images, masks indicating the missing regions are generally required as input. Here, we show that line drawings have enough structures that can be learned by the CNN to allow automatic detection and completion of the gaps without any such input. Thus, our method can find the gaps in line drawings and complete them without user interaction. Furthermore, the completion realistically conserves thickness and curvature of the line segments. All the necessary heuristics for such realistic line completion are learned naturally from a dataset of line drawings, where various patterns of line completion are generated on the fly as training pairs to improve the model generalization. We evaluate our method qualitatively on a diverse set of challenging line drawings and also provide quantitative results with a user study, where it significantly outperforms the state of the art.

    Citations (Scopus): 34
  • Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification

    Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    ACM Transactions on Graphics   35 ( 4 )  2016.07  [Refereed]

     View Summary

    We present a novel technique to automatically colorize grayscale images that combines both global priors and local image features. Based on Convolutional Neural Networks, our deep network features a fusion layer that allows us to elegantly merge local information dependent on small image patches with global priors computed using the entire image. The entire framework, including the global and local priors as well as the colorization model, is trained in an end-to-end fashion. Furthermore, our architecture can process images of any resolution, unlike most existing approaches based on CNN. We leverage an existing large-scale scene classification database to train our model, exploiting the class labels of the dataset to more efficiently and discriminatively learn the global priors. We validate our approach with a user study and compare against the state of the art, where we show significant improvements. Furthermore, we demonstrate our method extensively on many different types of images, including black-and-white photography from over a hundred years ago, and show realistic colorizations.

    Citations (Scopus): 636
  • Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup

    Edgar Simo-Serra, Satoshi Iizuka, Kazuma Sasaki, Hiroshi Ishikawa

    ACM Transactions on Graphics   35 ( 4 )  2016.07  [Refereed]

     View Summary

    In this paper, we present a novel technique to simplify sketch drawings based on learning a series of convolution operators. In contrast to existing approaches that require vector images as input, we allow the more general and challenging input of rough raster sketches such as those obtained from scanning pencil sketches. We convert the rough sketch into a simplified version which is then amenable to vectorization. This is all done in a fully automatic way without user intervention. Our model consists of a fully convolutional neural network which, unlike most existing convolutional neural networks, is able to process images of any dimensions and aspect ratio as input, and outputs a simplified sketch which has the same dimensions as the input image. In order to teach our model to simplify, we present a new dataset of pairs of rough and simplified sketch drawings. By leveraging convolution operators in combination with efficient use of our proposed dataset, we are able to train our sketch simplification model. Our approach naturally overcomes the limitations of existing methods, e.g., vector images as input and long computation time; and we show that meaningful simplifications can be obtained for many different test cases. Finally, we validate our results with a user study in which we greatly outperform similar approaches and establish the state of the art in sketch simplification of raster images.

    Citations (Scopus): 159
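  • Automatic Recognition of Ground Features in Earth Observation Satellite Imagery (in Japanese)

    Ryosuke Nakamura, Tomohiro Ishii, Hirokazu Nosato, Hidenori Sakanashi, Edgar Simo-Serra, Yoshihiko Mochizuki, Satoshi Iizuka, Hiroshi Ishikawa

    Proceedings of the Annual Conference of JSAI   JSAI2016   1B24  2016.06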

  • Data-Dependent Higher-Order Clique Selection for Artery-Vein Segmentation by Energy Minimization

    Yoshiro Kitamura, Yuanzhong Li, Wataru Ito, Hiroshi Ishikawa

    International Journal of Computer Vision   117 ( 2 ) 142 - 158  2016.04  [Refereed]

     View Summary

    We propose a novel segmentation method based on energy minimization of higher-order potentials. We introduce higher-order terms into the energy to incorporate prior knowledge on the shape of the segments. The terms encourage certain sets of pixels to be entirely in one segment or the other. The sets can for instance be smooth curves in order to help delineate pulmonary vessels, which are known to run in almost straight lines. The higher-order terms can be converted to submodular first-order terms by adding auxiliary variables, which can then be globally minimized using graph cuts. We also determine the weight of these terms, or the degree of the aforementioned encouragement, in a principled way by learning from training data with the ground truth. We demonstrate the effectiveness of the method in a real-world application in fully-automatic pulmonary artery-vein segmentation in CT images.

    Citations (Scopus): 18
  • Inference and Learning of Graphical Models: Theory and Applications in Computer Vision and Image Analysis

    Chaohui Wang, Nikos Komodakis, Hiroshi Ishikawa, Olga Veksler, Endre Boros

    Computer Vision and Image Understanding   143   52 - 53  2016.02  [Refereed]

    Citations (Scopus): 1
  • Psoas Major Muscle Segmentation Using Higher-Order Shape Prior

    Tsutomu Inoue, Yoshiro Kitamura, Yuanzhong Li, Wataru Ito, Hiroshi Ishikawa

    Medical Computer Vision: Algorithms for Big Data   9601   116 - 124  2016  [Refereed]

     View Summary

    We propose a novel segmentation method based on higher-order graph cuts which enables the utilization of prior knowledge regarding anatomical shapes. We applied the method to the segmentation of psoas major muscles by using combinations of logistic curves that represent their shapes. The higher-order terms consisting of variables (voxels) just inside or outside of the estimated shapes are added to the energy function to encourage the segmentation results to fit the shapes. We verified the effectiveness of the method with 20 abdominal CT images. By comparing the segmentation results to the ground truth data prepared by a clinical expert, we validated the method, which achieved a Jaccard similarity coefficient (JSC) of 75.4% (right major) and 77.5% (left major). We also confirmed that the proposed method worked well for thick CT images.

    Citations (Scopus): 5
  • Fashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction

    Edgar Simo-Serra, Hiroshi Ishikawa

    2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)     298 - 307  2016  [Refereed]

     View Summary

    We propose a novel approach for learning features from weakly-supervised data by joint ranking and classification. In order to exploit data with weak labels, we jointly train a feature extraction network with a ranking loss and a classification network with a cross-entropy loss. We obtain high-quality compact discriminative features with few parameters, learned on relatively small datasets without additional annotations. This enables us to tackle tasks with specialized images not very similar to the more generic ones in existing fully-supervised datasets. We show that the resulting features in combination with a linear classifier surpass the state-of-the-art on the Hipster Wars dataset despite using features only 0.3% of the size. Our proposed features significantly outperform those obtained from networks trained on ImageNet, despite being 32 times smaller (128 single-precision floats), trained on noisy and weakly-labeled data, and using only 1.5% of the number of parameters.

    Citations (Scopus): 125
  • Room Reconstruction from a Single Spherical Image by Higher-order Energy Minimization

    Kosuke Fukano, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Akihiro Sugimoto, Hiroshi Ishikawa

    2016 23rd International Conference on Pattern Recognition (ICPR)     1768 - 1773  2016  [Refereed]

     View Summary

    We propose a method for understanding a room from a single spherical image, i.e., reconstructing and identifying structural planes forming the ceiling, the floor, and the walls in a room. A spherical image records the light that falls onto a single viewpoint from all directions and does not require correlating geometrical information from multiple images, which facilitates robust and precise reconstruction of the room structure. In our method, we detect line segments from a given image, and classify them into two groups: segments that form the boundaries of the structural planes and those that do not. We formulate this problem as a higher-order energy minimization problem that combines the various measures of likelihood that one, two, or three line segments are part of the boundary. We minimize the energy with graph cuts to identify segments forming boundaries, from which we estimate the structural planes in 3D. Experimental results on synthetic and real images confirm the effectiveness of the proposed method.

    Citations (Scopus): 10
  • Detection by Classification of Buildings in Multispectral Satellite Imagery

    Tomohiro Ishii, Edgar Simo-Serra, Satoshi Iizuka, Yoshihiko Mochizuki, Akihiro Sugimoto, Hiroshi Ishikawa, Ryosuke Nakamura

    2016 23rd International Conference on Pattern Recognition (ICPR)     3344 - 3349  2016  [Refereed]

     View Summary

    We present an approach for the detection of buildings in multispectral satellite images. Unlike 3-channel RGB images, satellite imagery contains additional channels corresponding to different wavelengths. Approaches that do not use all channels are unable to fully exploit these images for optimal performance. Furthermore, care must be taken due to the large bias in classes, e.g., most of the Earth is covered in water and thus it will be dominant in the images. Our approach consists of training a Convolutional Neural Network (CNN) from scratch to classify multispectral image patches taken by satellites as whether or not they belong to a class of buildings. We then adapt the classification network to detection by converting the fully-connected layers of the network to convolutional layers, which allows the network to process images of any resolution. The dataset bias is compensated by subsampling negatives and tuning the detection threshold for optimal performance. We have constructed a new dataset using images from the Landsat 8 satellite for detecting solar power plants and show our approach is able to significantly outperform the state-of-the-art. Furthermore, we provide an in-depth evaluation of the seven different spectral bands provided by the satellite images and show it is critical to combine them to obtain good results.

    DOI

    Scopus

    36
    Citation
    (Scopus)
  • Room Shape Estimation from a Single Spherical Image by Higher-Order Energy Minimization

    深野 昂祐, 望月 義彦, 石川 博

    IPSJ SIG Technical Reports, CVIM [Computer Vision and Image Media]   2015 ( 9 ) 1 - 7  2015.05

     View Summary

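    In this paper, we propose a method for reconstructing the shape of a simple room from a single spherical image. The wide field of view of a spherical image allows the room structure to be recognized more robustly. The shape of the room is represented by a set of line segments that bound rectangular planes such as the walls, the ceiling, and the floor. The proposed method uses higher-order energy minimization to classify the detected line segments into boundary segments and non-boundary segments. The boundary segments are then used to estimate the planes that make up the room, such as the walls, the ceiling, and the floor. Experiments on real images verified that the planes making up the room can be estimated correctly.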

    CiNii

  • Mathematics of Computer Vision

    H. Ishikawa

    Sugaku   67 ( 2 ) 190 - 202  2015.04  [Refereed]  [Invited]

  • Multi-organ Segmentation by Minimization of Higher-order Energy for CT Boundary

    Asuka Okagawa, Yuji Oyamada, Yoshihiko Mochizuki, Hiroshi Ishikawa

    2015 14th IAPR International Conference on Machine Vision Applications (MVA)     547 - 550  2015  [Refereed]

     View Summary

    In medical image analysis, segmentation of medical images such as Computed Tomography (CT) volumetric images is necessary for further medical image analysis and computer aided intervention. We propose a method for medical image segmentation by higher-order energy minimization. Specifically, we introduce a higher-order term that describes the continuity around the edge points of a CT image. The parameters of the energy terms are determined according to various conditional probabilities learned from sample data with the ground truth. Then we minimize the energy using graph cuts and evaluate the effectiveness of the introduction of the term into the traditional energy.

    DOI

    Scopus

    1
    Citation
    (Scopus)
  • Multiple-organ Segmentation Based on Spatially-divided Neighboring Data Energy

    Minato Morita, Asuka Okagawa, Yuji Oyamada, Yoshihiko Mochizuki, Hiroshi Ishikawa

    2015 14th IAPR International Conference on Machine Vision Applications (MVA)     158 - 161  2015  [Refereed]

     View Summary

    Medical image segmentation, e.g., Computed Tomography (CT) volume segmentation, is necessary for further medical image analysis and computer aided intervention. In the standard energy minimization scheme for medical image segmentation, three terms exist in the energy: the data term, the Potts smoothing term, and the probabilistic atlas term. In this paper, we propose a novel potential function that extends the data term. The discriminability of the existing data term, which fully depends on how distinctive the objects of interest appear on the CT volume, has a problem when some of the objects have similar or identical CT values. We overcome this limitation by considering the CT values of a pair of neighboring voxels. By increasing the number of voxels evaluated, the data term becomes more discriminative even if some objects of interest have similar CT values. We also propose to learn the probability of the neighboring data term for each sub-region, not for each voxel. The proposed neighboring data term can be regarded as combining the standard data term and the probabilistic atlas.

    DOI

    Scopus

  • Three-DoF Pose Estimation of Asteroids by Appearance-based Linear Regression with Divided Parameter Space

    Naoki Kobayashi, Yuji Oyamada, Yoshihiko Mochizuki, Hiroshi Ishikawa

    2015 14th IAPR International Conference on Machine Vision Applications (MVA)     551 - 554  2015  [Refereed]

     View Summary

    We present an appearance-based linear regression method for pose estimation from a single image of an asteroid, which can have any pose in the full space of three degree-of-freedom rotation parameters. The method is characterized by its division of the parameter space into multiple regions. Given a large number of training images with known pose parameters, we learn the relationship between the images and the pose parameters, separately for each parameter region, using the standard linear pose estimation. We also create a common subspace such that, when projected to it, the difference between images in the same parameter region tends to collapse. In estimating the pose of an input image, we project it onto the common subspace to determine the parameter region. We apply the method for pose estimation from asteroid images and report the experimental results.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Surface Object Recognition with CNN and SVM in Landsat 8 Images

    Tomohiro Ishii, Ryosuke Nakamura, Hidemoto Nakada, Yoshihiko Mochizuki, Hiroshi Ishikawa

    2015 14th IAPR International Conference on Machine Vision Applications (MVA)     341 - 344  2015  [Refereed]

     View Summary

    The Landsat series of earth observation satellites sends so much image data every day that it is hard to analyze manually. Thus an effective application of machine learning techniques to automatically analyze such data is called for. In surface object recognition, which is one of the important applications of such data, the distribution of a specific object on the surface is surveyed. In this paper, we propose and compare two methods for surface object recognition, one using the convolutional neural network (CNN) and the other using the support vector machine (SVM). In our experiments, CNN showed higher performance than SVM. In addition, we observed that the number of negative samples has an influence on the performance, and it is necessary to select this number carefully for practical use.

    DOI

    Scopus

    38
    Citation
    (Scopus)
  • 3D Multi-Organ Segmentation by Higher-Order Energy Minimization

    H. Ishikawa

    INNERVISION   29 ( 11 ) 52 - 52  2014.11  [Invited]

  • Simultaneous Multi-Organ Segmentation Using a First-Order Data Term

    森田皆人, 岡川明日翔, 小滝将太, 望月義彦, 小山田雄仁, 石川博

    SIG Technical Reports: Computer Vision and Image Media (CVIM)   2014 ( 16 ) 1 - 7  2014.05

     View Summary

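    We propose a method for simultaneous segmentation of multiple organs using a CRF. In conventional methods, the term governing the interaction between neighboring pixels is based only on prior probabilities, whereas the proposed method uses a first-order data term based on conditional probabilities that depend on the input image. We report the results of segmentation experiments using the first-order data term learned from a dataset with ground truth.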

    CiNii

  • Analysis of Generic Object Recognition Methods Using Convolutional Neural Networks

    石井智大, 望月義彦, 小山田雄仁, 石川博

    SIG Technical Reports: Computer Vision and Image Media (CVIM)   2014 ( 14 ) 1 - 8  2014.05

     View Summary

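    In generic object recognition, methods based on deep learning have recently attracted attention, and one of them, the Convolutional Neural Network (CNN), has shown particularly good results. However, which CNN architectures are useful for image recognition has not been established theoretically, and practical know-how is currently required. In this study, we analyze the factors that affect recognition accuracy in CNN-based generic object recognition. Specifically, for the method of Krizhevsky et al., we analyze how the parameters of the convolutional layers affect recognition accuracy, and we examine how changes to the training procedure affect it.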

    CiNii

  • Light Source Direction Estimation Using Multi-View Photometric Images

    岩野俊介, 小山田雄仁, 望月義彦, 石川博

    SIG Technical Reports: Computer Vision and Image Media (CVIM)   2014 ( 13 ) 1 - 5  2014.05

     View Summary

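    In reconstructing the 3D shape of objects and scenes from images, the two representative approaches, multi-view stereo and photometric stereo, cannot estimate an accurate 3D shape when both the camera position relative to the object and the lighting environment vary. More accurate reconstruction requires estimating the light source direction relative to the object for each image. In this study, we propose a method that estimates, from the input images alone, the light source directions needed to reconstruct 3D shape from multi-view photometric images, and we report the results of preliminary experiments on synthetic images.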

    CiNii

  • Coronary Lumen and Plaque Segmentation from CTA Using Higher-Order Shape Prior

    Yoshiro Kitamura, Yuanzhong Li, Wataru Ito, Hiroshi Ishikawa

    MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION - MICCAI 2014, PT I   8673   339 - +  2014  [Refereed]

     View Summary

    We propose a novel segmentation method based on multi-label graph cuts utilizing higher-order potentials to impose shape priors. Each higher-order potential is defined with respect to a candidate shape, and takes a low value if and only if most of the voxels inside the shape are foreground and most of those outside are background. We apply this technique to coronary lumen and plaque segmentation in CT angiography, exploiting the prior knowledge that the vessel walls tend to be tubular, whereas calcified plaques are more likely globular. We use the Hessian analysis to detect the candidate shapes and introduce corresponding higher-order terms into the energy. Since each higher-order term has any effect only when its highly specific condition is met, we can add many of them at possible locations and sizes without severe side effects. We show the effectiveness of the method by testing it on the standardized evaluation framework presented at MICCAI segmentation challenge 2012. The method achieved values comparable to the best in each of the sensitivity and positive predictive value, placing it at the top in average rank.

    DOI

    Scopus

    17
    Citation
    (Scopus)
  • Higher-Order Clique Reduction Without Auxiliary Variables

    Hiroshi Ishikawa

    2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)     1362 - 1369  2014  [Refereed]

     View Summary

    We introduce a method to reduce most higher-order terms of Markov Random Fields with binary labels into lower-order ones without introducing any new variables, while keeping the minimizer of the energy unchanged. While the method does not reduce all terms, it can be used with existing techniques that transform arbitrary terms (by introducing auxiliary variables) to improve the speed. The method eliminates a higher-order term in the polynomial representation of the energy by finding the value assignment to the variables involved that cannot be part of a global minimizer and increasing the potential value only when that particular combination occurs by the exact amount that makes the potential of lower order. We also introduce a faster approximation that forgoes the guarantee of exact equivalence of the minimizer in favor of speed. With experiments on the same field of experts dataset used in previous work, we show that the roof-dual algorithm after the reduction labels significantly more variables and the energy converges more rapidly.

    DOI

    Scopus

    17
    Citation
    (Scopus)
  • A HOG-BASED HAND GESTURE RECOGNITION SYSTEM ON A MOBILE DEVICE

    Lukas Prasuhn, Yuji Oyamada, Yoshihiko Mochizuki, Hiroshi Ishikawa

    2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP)     3973 - 3977  2014  [Refereed]

     View Summary

    We propose a HOG-based hand gesture recognition system running on a mobile device. The input is a video of a hand gesture taken by a mobile device. The input data is compared with a database of hand gesture images, which was synthesized with rotation variation. The comparison is based on HOG features, and the gesture corresponding to the best-matched image is returned as the result. The recognition algorithm is implemented on a client-server system. The proposed system is applied to the American Sign Language (ASL) alphabet recognition problem. The experimental results show that the proposed recognition algorithm improves HOG's robustness under rotation change. We also compare the processing time under different network configurations.

    DOI

    Scopus

    38
    Citation
    (Scopus)
  • Adaptive higher-order submodular potentials for pulmonary artery-vein segmentation

    Y. Kitamura, Y. Li, W. Ito, H. Ishikawa

    Fifth International Workshop on Pulmonary Image Analysis.    2013  [Refereed]

  • Non-Submodular Energy Minimization by a Multi-Label Extension of the QPBO Algorithm

    T. Windheuser, H. Ishikawa, D. Cremers

    Meeting on Image Recognition and Understanding (MIRU2012)    2012.08  [Refereed]

  • Generalized Roof Duality for Multi-Label Optimization: Optimal Lower Bounds and Persistency

    Thomas Windheuser, Hiroshi Ishikawa, Daniel Cremers

    COMPUTER VISION - ECCV 2012, PT VI   7577   400 - 413  2012  [Refereed]

     View Summary

    We extend the concept of generalized roof duality from pseudo-boolean functions to real-valued functions over multi-label variables. In particular, we prove that an analogue of the persistency property holds for energies of any order with any number of linearly ordered labels. Moreover, we show how the optimal submodular relaxation can be constructed in the first-order case.

    DOI

    Scopus

    11
    Citation
    (Scopus)
  • Transformation of General Binary MRF Minimization to the First-Order Case

    Hiroshi Ishikawa

    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE   33 ( 6 ) 1234 - 1249  2011.06  [Refereed]

     View Summary

    We introduce a transformation of general higher-order Markov random field with binary labels into a first-order one that has the same minima as the original. Moreover, we formalize a framework for approximately minimizing higher-order multilabel MRF energies that combines the new reduction with the fusion-move and QPBO algorithms. While many computer vision problems today are formulated as energy minimization problems, they have mostly been limited to using first-order energies, which consist of unary and pairwise clique potentials, with a few exceptions that consider triples. This is because of the lack of efficient algorithms to optimize energies with higher-order interactions. Our algorithm challenges this restriction that limits the representational power of the models so that higher-order energies can be used to capture the rich statistics of natural scenes. We also show that some minimization methods can be considered special cases of the present framework, as well as comparing the new method experimentally with other such techniques.

    DOI

    Scopus

    102
    Citation
    (Scopus)
  • Latest Trends in Energy Minimization with Graph Cuts

    H. Ishikawa

    IEICE Technical Report   110 ( 121 ) 45 - 50  2010.07  [Invited]

  • Extension of Higher-Order Graph Cuts by Variable Flipping

    H. Ishikawa

    Meeting on Image Recognition and Understanding (MIRU2010)     2076 - 2083  2010.07  [Refereed]

  • What is pattern? -- Non-symbolic computation and information measure of general objects

    H. Ishikawa

    Information-Based Induction Sciences Workshop (IBIS2009)     24 - 45  2009  [Invited]

    CiNii

  • Toward Automatic Pattern Discovery as Optimization

    H. Ishikawa

    IEICE Technical Report   109 ( 344 ) 49 - 54  2009

  • Higher-Order Clique Reduction in Binary Graph Cut

    Hiroshi Ishikawa

    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4     2985 - 2992  2009  [Refereed]

     View Summary

    We introduce a new technique that can reduce any higher-order Markov random field with binary labels into a first-order one that has the same minima as the original. Moreover, we combine the reduction with the fusion-move and QPBO algorithms to optimize higher-order multi-label problems. While many vision problems today are formulated as energy minimization problems, they have mostly been limited to using first-order energies, which consist of unary and pairwise clique potentials, with a few exceptions that consider triples. This is because of the lack of efficient algorithms to optimize energies with higher-order interactions. Our algorithm challenges this restriction that limits the representational power of the models, so that higher-order energies can be used to capture the rich statistics of natural scenes. To demonstrate the algorithm, we minimize a third-order energy, which allows clique potentials with up to four pixels, in an image restoration problem. The problem uses the Fields of Experts model, a learned spatial prior of natural images that has been used to test two belief propagation algorithms capable of optimizing higher-order energies. The results show that the algorithm exceeds the BP algorithms in both optimization performance and speed.

    DOI

    Scopus

    99
    Citation
    (Scopus)
  • Higher-Order Gradient Descent by Fusion-Move Graph Cut

    Hiroshi Ishikawa

    2009 IEEE 12TH INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)     568 - 574  2009  [Refereed]

     View Summary

    Markov Random Field is now ubiquitous in many formulations of various vision problems. Recently, optimization of higher-order potentials became practical using higher-order graph cuts: the combination of i) the fusion move algorithm, ii) the reduction of higher-order binary energy minimization to first-order, and iii) the QPBO algorithm. In the fusion move, it is crucial for the success and efficiency of the optimization to provide proposals that fit the energies being optimized. For higher-order energies, it is even more so because they have a richer class of null potentials. In this paper, we focus on the efficiency of the higher-order graph cuts and present a simple technique for generating proposal labelings that makes the algorithm much more efficient, which we empirically show using examples in stereo and image denoising.

    DOI

    Scopus

    26
    Citation
    (Scopus)
  • Graph Cuts (Tutorial)

    H. Ishikawa

    IPSJ SIG Technical Reports   2007-CVIM-158 ( 31 ) 193 - 204  2007  [Invited]

  • Sparse, Recursive, and Hierarchical Representation of Patterns

    H. Ishikawa

    Meeting on Image Recognition and Understanding (MIRU2007)     726 - 731  2007

  • Representation and Measure of Structural Information

    H. Ishikawa

    arXiv    2007

  • Total absolute Gaussian curvature for stereo prior

    Hiroshi Ishikawa

    COMPUTER VISION - ACCV 2007, PT II, PROCEEDINGS   4844   537 - 548  2007  [Refereed]

     View Summary

    In spite of the great progress in stereo matching algorithms, the prior models they use, i.e., the assumptions about the probability to see each possible surface, have not changed much in three decades. Here, we introduce a novel prior model motivated by psychophysical experiments. It is based on minimizing the total sum of the absolute value of the Gaussian curvature over the disparity surface. Intuitively, it is similar to rolling and bending a flexible paper to fit to the stereo surface, whereas the conventional prior is more akin to spanning a soap film. Through controlled experiments, we show that the new prior outperforms the conventional models when compared in an equal setting.

    DOI

    Scopus

    3
    Citation
    (Scopus)
  • Higher-Order Stereo Prior Suggested by the Human Visual System

    H. Ishikawa

    Meeting on Image Recognition and Understanding (MIRU2006)     128 - 134  2006  [Refereed]

  • Rethinking the prior model for stereo

    H Ishikawa, D Geiger

    COMPUTER VISION - ECCV 2006, PT 3, PROCEEDINGS   3953   526 - 537  2006  [Refereed]

     View Summary

    Sometimes called the smoothing assumption, the prior model of a stereo matching algorithm is the algorithm's expectation on the surfaces in the world. Any stereo algorithm makes assumptions about the probability to see each surface that can be represented in its representation system. Although the past decade has seen much continued progress in stereo matching algorithms, the prior models used in them have not changed much in three decades: most algorithms still use a smoothing prior that minimizes some function of the difference of depths between neighboring sites, sometimes allowing for discontinuities.
    However, one system seems to use a very different prior model from all other systems: the human vision system. In this paper, we first report the observations we made in examining human disparity interpolation using stereo pairs with sparse identifiable features. Then we mathematically analyze the implication of using current prior models and explain why the human system seems to use a model that is not only different but in a sense diametrically opposite from all current models. Finally, we propose two candidate models that reflect the behavior of human vision. Although the two models look very different, we show that they are closely related.

    DOI

    Scopus

    18
    Citation
    (Scopus)
  • Illusory volumes in human stereo perception

    H Ishikawa, D Geiger

    VISION RESEARCH   46 ( 1-2 ) 171 - 178  2006.01  [Refereed]

     View Summary

    Any complete theory of human stereopsis must model not only how the correspondences between locations in the two views are determined and the depths are recovered from their disparity, but also how the ambiguity arising from such factors as noise, periodicity, and large regions of constant intensity is resolved and missing data are interpolated. In investigating this process of recovering surface structure from sparse disparity information, using stereo pairs with sparse identifiable features, we made an observation that contradicts all extant models. It suggests the inadequacy of retinotopic representation in modeling surface perception in this stage. We also suggest a possible alternative theory, which is a minimization of the modulus of Gaussian curvature. (c) 2005 Elsevier Ltd. All rights reserved.

    DOI

    Scopus

    5
    Citation
    (Scopus)
  • Higher-dimensional Segmentation by Minimum-cut Algorithm

    H. Ishikawa, D. Geiger

    IAPR Conference on Machine Vision Applications (MVA2005)     488 - 491  2005  [Refereed]

  • Finding tree structures by grouping symmetries

    H Ishikawa, D Geiger, R Cole

    TENTH IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS 1 AND 2, PROCEEDINGS     1132 - 1139  2005  [Refereed]

     View Summary

    The representation of objects in images as tree structures is of great interest to vision, as they can represent articulated objects such as people as well as other structured objects like arteries in human bodies, roads, circuit board patterns, etc. Tree structures are often related to the symmetry axis representation of shapes, which captures their local symmetries. Algorithms have been introduced to detect (i) open contours in images in quadratic time, (ii) closed contours in images in cubic time, and (iii) tree structures from contours in quadratic time. The algorithms are based on dynamic programming and Single Source Shortest Path algorithms. However, in this paper, we show that the problem of finding tree structures in images in a principled manner is a much harder problem. We argue that the optimization problem of finding tree structures in images is essentially equivalent to a variant of the Steiner Tree problem, which is NP-hard. Nevertheless, an approximate polynomial-time algorithm for this problem exists: we apply a fast implementation of the Goemans-Williamson approximate algorithm to the problem of finding a tree representation after an image is transformed by a local symmetry mapping. Examples of extracting tree structures from images illustrate the idea and applicability of the approximate method.

    DOI

    Scopus

    8
    Citation
    (Scopus)
  • Exact optimization for Markov random fields with convex priors

    H Ishikawa

    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE   25 ( 10 ) 1333 - 1336  2003.10  [Refereed]

     View Summary

    We introduce a method to solve exactly a first order Markov Random Field optimization problem in more generality than was previously possible. The MRF shall have a prior term that is convex in terms of a linearly ordered label set. The method maps the problem into a minimum-cut problem for a directed graph, for which a globally optimal solution can be found in polynomial time. The convexity of the prior function in the energy is shown to be necessary and sufficient for the applicability of the method.

    DOI

    Scopus

    412
    Citation
    (Scopus)
  • Globally optimal regions and boundaries as minimum ratio weight cycles

    IH Jermyn, H Ishikawa

    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE   23 ( 10 ) 1075 - 1088  2001.10  [Refereed]

     View Summary

    We describe a new form of energy functional for the modeling and identification of regions in images. The energy is defined on the space of boundaries in the image domain and can incorporate very general combinations of modeling information both from the boundary (intensity gradients, etc.) and from the interior of the region (texture, homogeneity, etc.). We describe two polynomial-time digraph algorithms for finding the global minima of this energy. One of the algorithms is completely general, minimizing the functional for any choice of modeling information. It runs in a few seconds on a 256x256 image. The other algorithm applies to a subclass of functionals, but has the advantage of being extremely parallelizable. Neither algorithm requires initialization.

    DOI

    Scopus

    133
    Citation
    (Scopus)
  • Region extraction from multiple images

    H Ishikawa, IH Jermyn

    EIGHTH IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOL I, PROCEEDINGS     509 - 516  2001  [Refereed]

     View Summary

    We present a method for region identification in multiple images. A set of regions in different images and the correspondences on their boundaries can be thought of as a boundary in the multi-dimensional space formed by the product of the individual image domains. We minimize an energy functional on the space of such boundaries, thereby identifying simultaneously both the optimal regions in each image and the optimal correspondences on their boundaries. We use a ratio form for the energy functional, thus enabling the global minimization of the energy functional using a polynomial time graph algorithm, among other desirable properties. We choose a simple form for this energy that favours boundaries that lie on high intensity gradients in each image, while encouraging correspondences between boundaries in different images that match intensity values. The latter tendency is weighted by a novel heuristic energy that encourages the boundaries to lie on disparity or optical flow discontinuities, although no dense optical flow or disparity map is computed.

    DOI

  • Multi-scale Feature Selection in Stereo

    H. Ishikawa

    IEEE Computer Society Conference on Computer Vision and Pattern Recognition. (CVPR'99)     1132 - 1137  1999  [Refereed]

    DOI CiNii

  • Mapping image restoration to a graph problem

    H Ishikawa, D Geiger

    PROCEEDINGS OF THE IEEE-EURASIP WORKSHOP ON NONLINEAR SIGNAL AND IMAGE PROCESSING (NSIP'99)     890 - 894  1999  [Refereed]

     View Summary

    We propose a graph optimization method for the restoration of gray-scale images. We consider an arbitrary noise model for each pixel location. We also consider a smoothness constraint where the potentials between neighboring pixels are convex functionals. We show how to map this problem to a directed flow graph. Then, a globally optimal solution is obtained via the use of the maximum-flow algorithm. The algorithm runs in polynomial time with respect to the size of the image.

  • Globally Optimal Regions and Boundaries

    I. H. Jermyn, H. Ishikawa

    IEEE International Conference on Computer Vision (ICCV'99)     904 - 910  1999  [Refereed]

    DOI

  • Occlusions, Discontinuities, and Epipolar Lines in Stereo

    H. Ishikawa, D. Geiger

    European Conference on Computer Vision (ECCV'98)     232 - 248  1998  [Refereed]

    DOI CiNii

    Scopus

    106
    Citation
    (Scopus)
  • Segmentation by grouping junctions

    H Ishikawa, D Geiger

    1998 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS     125 - 131  1998  [Refereed]

     View Summary

    We propose a method for segmenting gray-value images. By segmentation, we mean a map from the set of pixels to a small set of levels such that each connected component of the set of pixels with the same level forms a relatively large and "meaningful" region. The method finds a set of levels with associated gray values by first finding junctions in the image and then seeking a minimum set of threshold values that preserves the junctions. Then it finds a segmentation map that maps each pixel to the level with the closest gray value to the pixel data, within a smoothness constraint. For a convex smoothing penalty, we show the globally optimal solution for an energy function that fits the data can be obtained in polynomial time, by a novel use of the maximum-flow algorithm. Our approach is in contrast to a view in computer vision where segmentation is driven by intensity gradient, usually not yielding closed boundaries.

    DOI

  • Higher-Order Graph Cuts

    H. Ishikawa

    Meeting on Image Recognition and Understanding (MIRU2009)     7 - 14  [Refereed]

Books and Other Publications

  • "Graph Cuts—Combinatorial Optimization in Vision" (Chapter 2), Olivier Lezoray and Leo Grady ed."Image Processing and Analysis with Graphs: Theory and Practice"

    Hiroshi Ishikawa( Part: Joint author)

    CRC Press  2012.07 ISBN: 9781439855072

  • Practical Handbook of Medical Image Analysis

    石川 博( Part: Joint author)

    オーム社  2012

  • "Optimizing Multi-Label MRFs with Convex and Truncated Convex Priors" (Chapter 4), Andrew Blake, Pushmeet Kohli, and Carsten Rother ed. "Markov Random Fields for Vision and Image Processing"

    Hiroshi Ishikawa, Olga Veksler( Part: Joint author)

    MIT Press  2011.09 ISBN: 9780262015776

  • "Graph Cuts" (Chapter 2, pp. 39-74), 斎藤英雄 and 八木康史 ed. "Computer Vision Frontier Guide 1: Level Set, Graph Cut, Particle Filter, Tensor, AdaBoost"

    石川 博( Part: Joint author)

    アドコム・メディア  2008.12 ISBN: 9784915851346

  • “Local Feature Selection and Global Energy Optimization in Stereo,” (Chapter 22, pp. 411-430), Rustam Stolkin ed. "Scene Reconstruction, Pose Estimation and Tracking"

    H. Ishikawa, D. Geiger( Part: Joint author)

    I-Tech Education and Publishing  2007

Presentations

  • The Future of Computer Graphics in the Age of AI and Social Media

    Hiroshi Ishikawa  [Invited]

    Computer Graphics International  (Calgary) 

    Presentation date: 2019.06

  • Higher-Order Random Fields for Image Segmentation

    Hiroshi Ishikawa  [Invited]

    Computer Graphics International  (Calgary) 

    Presentation date: 2019.06

  • Structured Prediction by Fully Convolutional Deep Neural Networks

    Hiroshi Ishikawa  [Invited]

    Irish Machine Vision and Image Processing Conference  (Belfast)  Irish Pattern Recognition and Classification Society

    Presentation date: 2018.08

  • Image Completion by CNN with Global and Local Consistency

    Hiroshi Ishikawa  [Invited]

    SIAM Conference on Imaging Science  (Bologna)  Society for Industrial and Applied Mathematics

    Presentation date: 2018.06

  • Image Transformation with Deep Learning

    Hiroshi Ishikawa  [Invited]

    Seventh Symposium on Biometrics, Recognition and Authentication  (Tokyo) 

    Presentation date: 2017.11

  • Mathematical Models of Vision and Structured Prediction Problem

    Hiroshi Ishikawa  [Invited]

    Tohoku University Graduate School of Information Sciences Seminar  (Sendai) 

    Presentation date: 2017.10

  • What you see is made up in your head

    Hiroshi Ishikawa  [Invited]

    (Yamagata) 

    Presentation date: 2017.09

  • Rules and Models versus Data and Machine Learning in Graphics and Vision

    Hiroshi Ishikawa  [Invited]

    Computer Graphics International  (Yokohama) 

    Presentation date: 2017.06

  • Frontiers of Image Processing and Computer Graphics by Deep Learning

    Hiroshi Ishikawa  [Invited]

    Computer Graphics International  (Yokohama) 

    Presentation date: 2017.06

  • Image Generation by Deep Learning

    Hiroshi Ishikawa  [Invited]

    (Tokyo) 

    Presentation date: 2017.06

  • Toward a mathematical model of perception

    Hiroshi Ishikawa  [Invited]

    (Tokyo) 

    Presentation date: 2016.10

  • Automatic Image Colorization and Rough Sketch Cleanup by Deep Learning

    Hiroshi Ishikawa  [Invited]

    (Tsukuba) 

    Presentation date: 2016.09

  • Automatic Image Colorization and Rough Sketch Cleanup by Deep Learning

    Hiroshi Ishikawa  [Invited]

    ACCV2016 Area Chairs Workshop  (Keelung) 

    Presentation date: 2016.08

  • MAP Estimation of Markov Random Fields with Some Applications in Medical Imaging

    Hiroshi Ishikawa  [Invited]

    Probabilistic Graphical Model Workshop: Sparsity, Structure and High-dimensionality  (Tachikawa) 

    Presentation date: 2016.03

  • Graph Cuts: How far can you go with quadratic submodular minimization?

    Hiroshi Ishikawa  [Invited]

    18th Information-Based Induction Sciences Workshop  (Tsukuba) 

    Presentation date: 2015.11

  • Higher-Order Graph Cuts and Medical Image Segmentation

    Hiroshi Ishikawa  [Invited]

    The workshop on mathematical and computational methods in biomedical imaging and image analysis  (Auckland) 

    Presentation date: 2015.11

  • Maximum A Posteriori Estimation of Markov Random Fields in Vision

    Hiroshi Ishikawa  [Invited]

    (Tokyo) 

    Presentation date: 2015.11

  • Discrete Optimization in Vision

    Hiroshi Ishikawa  [Invited]

    27th RAMP Symposium  (Hamamatsu) 

    Presentation date: 2015.10

  • Vision and Recognition as Optimization

    Hiroshi Ishikawa  [Invited]

    6th Cryptography Frontier Seminar  (Noumi) 

    Presentation date: 2015.03

  • Higher-order Graph Cuts

    Hiroshi Ishikawa  [Invited]

    ACCV2014 Area Chairs Workshop  (Singapore) 

    Presentation date: 2014.09

  • Graph Cuts: Since then

    Hiroshi Ishikawa  [Invited]

    Meeting on Image Recognition and Understanding  (Tokyo) 

    Presentation date: 2013.07

  • Maximum A Posteriori Estimation in Higher-order Markov Random Fields

    Hiroshi Ishikawa  [Invited]

    SIG-FPAI, Japanese Society for Artificial Intelligence  (Yokohama) 

    Presentation date: 2012.11

  • Proposal Selection in Higher-order Graph Cuts

    Hiroshi Ishikawa  [Invited]

    25th European Conference on Operational Research  (Vilnius) 

    Presentation date: 2012.07

  • Higher-order Optimization and Learning in Vision--A Bottom-up Approach--

    Hiroshi Ishikawa  [Invited]

    PRMU/IBISML/CVIM  (Fukuoka) 

    Presentation date: 2010.09

  • Latest Trends in Energy Minimization with Graph Cuts

    Hiroshi Ishikawa  [Invited]

    IEICE Medical Imaging  (Tokushima) 

    Presentation date: 2010.07

  • What is pattern? -- Non-symbolic computation and information measure of general objects

    Hiroshi Ishikawa  [Invited]

    12th Information-Based Induction Sciences Workshop  (Fukuoka) 

    Presentation date: 2009.10

  • A Practical Introduction to Graph Cut

    Hiroshi Ishikawa  [Invited]

    The 3rd Pacific-Rim Symposium on Image and Video Technology  (Tokyo) 

    Presentation date: 2009.01

  • What is pattern? -- On a theory of grounded computation

    Hiroshi Ishikawa  [Invited]

    RIMS Seminar, Kyoto University  (Kyoto) 

    Presentation date: 2008.11

  • Theory and Application of Graph Cuts

    Hiroshi Ishikawa  [Invited]

    The 14th Symposium on Sensing via Image Information  (Yokohama) 

    Presentation date: 2008.06

  • Organizing higher-order cliques by sparse representation

    Hiroshi Ishikawa  [Invited]

    IPAM Short Program “Graph Cuts and Related Discrete or Continuous Optimization Problems”  (Los Angeles) 

    Presentation date: 2008.02

  • Graph algorithms in computer vision

    Hiroshi Ishikawa  [Invited]

    Mathematical Aspects of Image Processing and Computer Vision  (Sapporo) 

    Presentation date: 2007.11

  • Graph Cuts

    Hiroshi Ishikawa  [Invited]

    IPSJ SIG-CVIM  (Kagoshima) 

    Presentation date: 2007.03

  • Embedded Graph Algorithms for Computer Vision

    Hiroshi Ishikawa  [Invited]

    DIMACS Workshop on Graph Theoretic Methods in Computer Vision  (New Brunswick) 

    Presentation date: 1999.05

Research Projects

  • Structure of Image Space and Image-Transformation Learning Systems

    Project Year :

    2020.04
    -
    2025.03
     

  • Mathematical Foundations of Multidisciplinary Computational Anatomy

    Project Year :

    2014.06
    -
    2019.03
     

    We aimed to construct statistical models that help in understanding the human body, using medical images of many patients acquired with a variety of modalities. To this end, we studied the following topics and obtained results on each: (1) development of a new imaging principle for magnetic resonance (MR) and improvement of existing diffusion MR imaging methods; (2) a mathematical foundation for measuring distances between images, and registration methods for medical images that have a variety of spatial resolutions and are captured at different times; (3) methods that map organ regions based on both the shapes and the textures of their surfaces; (4) applications of higher-order graph cuts and deep neural networks to medical image analysis; and (5) information geometry as a mathematical foundation, which helps to analyze the characteristics of statistical models constructed from a small amount of training data.

  • Analysis of geological activities of asteroids using high definition shape models

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research

    Project Year :

    2013.04
    -
    2016.03
     

    HIRATA Naru, DEMURA Hirohide, SUGITA Seiji, NARUSE Keitarou, OHTA Naoya, TAKAHASHI Shigeo, YAGUCHI Yuichi, ISHIKAWA Hiroshi, OGAWA Naoko

    We developed a method for high-resolution 3D shape reconstruction of asteroids that combines the SPC and SfM methods, and a 3D geographical information system for asteroid data analysis. Using this research environment, we analyzed image data of the asteroid Itokawa taken by the Hayabusa spacecraft to reveal new views of a small rubble-pile asteroid. We describe a history of the resurfacing of Itokawa by impact events. The results suggest that Itokawa underwent a global resurfacing event several Myr ago, which is consistent with evidence from samples recovered from Itokawa.

  • Enhanced Generalizability by Structural Model Learning

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research

    Project Year :

    2013.04
    -
    2015.03
     

    ISHIKAWA Hiroshi

    In a convolutional neural network (CNN), features that are invariant under translation, which is essential in image recognition tasks, can be learned by jointly training the neurons that map to one another under translation so that they share the same weight values. Aiming at a similar effect for general transformations, we conducted theoretical research toward applying a theory that can answer the question of whether patterns are present in raw data, by uniformly defining algebraic representations of structures and their semantics in the data space. In addition, as an example application of learning algorithms, we compared ground-object recognition algorithms for Landsat images based on CNNs and support vector machines (SVMs).

  • Approximate optimization and learning of higher-order energy

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research

    Project Year :

    2012.04
    -
    2015.03
     

    ISHIKAWA Hiroshi

    We realized an algorithm that approximately minimizes non-submodular multi-label energies. We also made it possible to minimize binary higher-order energies faster and with less memory by reducing them, in certain cases, to first-order energies without adding auxiliary variables. As an application of higher-order energies, we used them for pulmonary artery-vein segmentation, representing the shapes of pulmonary blood vessels by higher-order potentials. We also improved algorithms for segmenting the coronary lumen and plaques in CT angiography, again using higher-order shape priors.

  • 3D Multi-Organ Segmentation by Higher-Order Energy Minimization

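    Grants-in-Aid for Scientific Research (Waseda University)  Grants-in-Aid for Scientific Research (Scientific Research on Innovative Areas (Research in a proposed research area))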

    Project Year :

    2012
    -
    2013
     

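    We developed a new algorithm that minimizes higher-order energies for multi-organ segmentation. Unlike conventional methods, it requires no auxiliary variables. This result was accepted to CVPR2014. We also examined a method for selecting the best segmentation result by evaluating the candidate results. Each segmentation result was evaluated by computing its likelihood from statistics obtained from training data. We defined the quality of a segmentation result as its rate of agreement with the ground-truth data, and examined the correlation between the likelihood and the agreement rate. In addition, we applied higher-order energy minimization to a segmentation task that separates pulmonary blood vessels in CT images into arteries and veins. This was presented at the PIA2013 workshop. We also developed an algorithm for measuring the degree of arteriosclerosis in the heart from CT images, which was accepted to the international conference MICCAI2014.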

  • Construction of a Basic Theory of Non-Symbolic Computation and Its Application to Structure Learning

    Strategic Basic Research Programs  Strategic Basic Research Programs (PRESTO)

    Project Year :

    2009
    -
    2013
     

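    We aim to bridge the concepts of information in two worlds: the real world, which is represented by analog information such as images, video, and various kinds of measurement data that have been increasing markedly in recent years, and the cyber world, which is typified by the Internet and described digitally. To that end, we represent the concept of computation, which is the basis for describing structure in general, directly in non-symbolic spaces, and construct a theory that treats information with complex structure in a unified way. We also aim to apply the theory to finding patterns in high-dimensional data such as images.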

  • Research on Pattern Discovery by Structural Information Representation

    Grants-in-Aid for Scientific Research (Nagoya City University)  Grants-in-Aid for Scientific Research (Exploratory Research)

    Project Year :

    2007
    -
    2009
     

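    In information science, the relationship between abstracted information and general objects in the real world, that is, the encoding, is regarded as arbitrary. Conversely, the concept of the information contained in a general object has meaning only within a restricted totality of objects that can be encoded. However, as that totality grows, it becomes difficult to define a concept of information that applies in common to all of it. Thinking about information outside the world of symbols represented by bits raises further problems, such as the loss of arbitrariness of the encoding and a mismatch between the regularity of general objects and the regularity of codes. In this research, we formulated a new concept, "pattern representation by diagrams and their sections," and proposed a representation of non-symbolic information that is unified and highly general and that can separate the structural information of a pattern from its implementation-dependent part. The representation is defined relative to a set of maps that characterize the space containing the pattern; we showed that when it is defined relative to the maps that characterize the natural numbers (the map that gives 0 and the successor map), the set of all maps representable by it coincides with the set of all partial recursive functions. In addition, while exploring the relationship with optimization problems with a view to pattern discovery using this representation, we recognized the importance of finding higher-order correlation structures in high-dimensional data, and also examined the formulation of a new surface prior model in computer vision. To carry out the minimization of the absolute Gaussian curvature, the surface prior model suggested by this work, with the latest and more capable optimization techniques, we studied the minimization of higher-order energies with graph cuts. As a result, we discovered a method for converting any higher-order binary energy into a first-order one, and also developed a multi-label energy minimization method that applies it repeatedly.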

 

Sub-affiliation

  • Faculty of Science and Engineering   Graduate School of Fundamental Science and Engineering

Research Institute

  • 2022
    -
    2024

    Waseda Research Institute for Science and Engineering   Concurrent Researcher