Updated 2022/08/10


Edgar Simo-Serra
Affiliation
Faculty of Science and Engineering, School of Fundamental Science and Engineering
Title
Associate Professor
Homepage

Concurrent Appointments

  • Faculty of Political Science and Economics   School of Political Science and Economics

Education

  • 2011.08 - 2015.07

    BarcelonaTech (UPC)   PhD in Automatic Control, Robotics and Vision   Institute of Industrial and Control Engineering

  • 2005.08 - 2011.05

    BarcelonaTech (UPC)   Superior Industrial Engineering

Degree

  • 2015.07   BarcelonaTech (UPC)   PhD in Automatic Control, Robotics and Vision

Career

  • 2021.04 - Present

    Waseda University   Faculty of Science and Engineering   Associate Professor

  • 2018.09 - 2021.03

    Japan Science and Technology Agency (JST)   PRESTO Researcher (concurrent)

  • 2018.09 - 2021.03

    Waseda University   Faculty of Science and Engineering   Lecturer

  • 2018.04 - 2018.08

    Japan Science and Technology Agency (JST)   PRESTO Researcher (full-time)

  • 2017.04 - 2018.03

    Waseda University, Research Institute for Science and Engineering   Lecturer

  • 2015.08 - 2017.03

    Waseda University, Graduate School of Fundamental Science and Engineering   Research Assistant Professor

  • 2011.09 - 2015.07

    BarcelonaTech (UPC)   PhD student (FPI fellowship)


Research Areas

  • Intelligent informatics

Research Keywords

  • Computer Graphics

  • Computer Vision

  • Machine Learning

Papers

  • Generating Digital Painting Lighting Effects via RGB-space Geometry

    Lvmin Zhang, Edgar Simo-Serra, Yi Ji, Chunping Liu

    ACM Transactions on Graphics (Presented at SIGGRAPH)   39 ( 2 )   2020.01   [Refereed]

    DOI

  • DeepRemaster: Temporal Source-Reference Attention Networks for Comprehensive Video Enhancement

    Satoshi Iizuka, Edgar Simo-Serra

    ACM Transactions on Graphics (SIGGRAPH Asia)   38 ( 6 )   2019.11   [Refereed]

    DOI

  • Real-Time Data-Driven Interactive Rough Sketch Inking

    Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    ACM Transactions on Graphics (SIGGRAPH)    2018.08   [Refereed]

    DOI

  • Mastering sketching: Adversarial augmentation for structured prediction

    Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    ACM Transactions on Graphics   37 ( 1 )   2018.01   [Refereed]

    We present an integral framework for training sketch simplification networks that convert challenging rough sketches into clean line drawings. Our approach augments a simplification network with a discriminator network, training both networks jointly so that the discriminator network discerns whether a line drawing is real training data or the output of the simplification network, which, in turn, tries to fool it. This approach has two major advantages: first, because the discriminator network learns the structure in line drawings, it encourages the output sketches of the simplification network to be more similar in appearance to the training sketches. Second, we can also train the networks with additional unsupervised data: by adding rough sketches and line drawings that do not correspond to each other, we can improve the quality of the sketch simplification. Thanks to a difference in the architecture, our approach has advantages over similar adversarial training approaches in training stability and the aforementioned ability to utilize unsupervised training data. We show how our framework can be used to train models that significantly outperform the state of the art in the sketch simplification task, despite using the same architecture for inference. We also present an approach to optimize for a single image, which improves accuracy at the cost of additional computation time. Finally, we show that, using the same framework, it is possible to train the network to perform the inverse problem, i.e., convert simple line sketches into pencil drawings, which is not possible using the standard mean squared error loss. We validate our framework with two user tests, in which our approach is preferred to the state of the art in sketch simplification 88.9% of the time.

    DOI
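
    The adversarial augmentation described in the abstract combines a supervised reconstruction loss with a discriminator term. The following is an illustrative sketch only, not the authors' code: the function names and the weight `alpha` are assumptions, and toy vectors stand in for the actual networks and images.

```python
import math

def supervised_loss(pred, target):
    """Mean squared error between the simplified output and the clean drawing."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def discriminator_loss(d_real, d_fake):
    """The discriminator learns to score real line drawings as 1 and
    simplification-network outputs as 0 (standard GAN cross-entropy)."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def simplifier_loss(pred, target, d_fake, alpha=1.0):
    """The simplification network minimizes reconstruction error while
    trying to fool the discriminator (pushing d_fake toward 1)."""
    return supervised_loss(pred, target) + alpha * -math.log(d_fake)
```

    For unpaired data only the discriminator term applies, which is what lets the framework also use rough sketches and line drawings that do not correspond to each other.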

  • Joint gap detection and inpainting of line drawings

    Kazuma Sasaki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017    5768 - 5776   2017.11   [Refereed]

    We propose a novel data-driven approach for automatically detecting and completing gaps in line drawings with a Convolutional Neural Network. In the case of existing inpainting approaches for natural images, masks indicating the missing regions are generally required as input. Here, we show that line drawings have enough structure that can be learned by the CNN to allow automatic detection and completion of the gaps without any such input. Thus, our method can find the gaps in line drawings and complete them without user interaction. Furthermore, the completion realistically conserves the thickness and curvature of the line segments. All the necessary heuristics for such realistic line completion are learned naturally from a dataset of line drawings, where various patterns of line completion are generated on the fly as training pairs to improve model generalization. We evaluate our method qualitatively on a diverse set of challenging line drawings and also provide quantitative results with a user study, where it significantly outperforms the state of the art.

    DOI

  • 3D Human Pose Tracking Priors using Geodesic Mixture Models

    Edgar Simo-Serra, Carme Torras, Francesc Moreno-Noguer

    International Journal of Computer Vision   122 ( 2 ) 388 - 408   2017.04   [Refereed]

    We present a novel approach for learning a finite mixture model on a Riemannian manifold in which Euclidean metrics are not applicable and one needs to resort to geodesic distances consistent with the manifold geometry. For this purpose, we draw inspiration from a variant of the expectation-maximization algorithm that uses a minimum message length criterion to automatically estimate the optimal number of components from multivariate data lying in a Euclidean space. In order to use this approach on Riemannian manifolds, we propose a formulation in which each component is defined on a different tangent space, thus avoiding the problems associated with the loss of accuracy produced when linearizing the manifold with a single tangent space. Our approach can be applied to any type of manifold for which it is possible to estimate its tangent space. Additionally, we consider using shrinkage covariance estimation to improve the robustness of the method, especially when dealing with very sparsely distributed samples. We evaluate the approach on a number of situations, going from data clustering on manifolds to combining pose and kinematics of articulated bodies for 3D human pose tracking. In all cases, we demonstrate remarkable improvement compared to several chosen baselines.

    DOI

  • Globally and locally consistent image completion

    Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    ACM Transactions on Graphics   36 ( 4 )   2017   [Refereed]

    We present a novel approach for image completion that results in images that are both locally and globally consistent. With a fully-convolutional neural network, we can complete images of arbitrary resolutions by filling in missing regions of any shape. To train this image completion network to be consistent, we use global and local context discriminators that are trained to distinguish real images from completed ones. The global discriminator looks at the entire image to assess if it is coherent as a whole, while the local discriminator looks only at a small area centered at the completed region to ensure the local consistency of the generated patches. The image completion network is then trained to fool both context discriminator networks, which requires it to generate images that are indistinguishable from real ones with regard to overall consistency as well as details. We show that our approach can be used to complete a wide variety of scenes. Furthermore, in contrast with patch-based approaches such as PatchMatch, our approach can generate fragments that do not appear elsewhere in the image, which allows us to naturally complete images of objects with familiar and highly specific structures, such as faces.

    DOI
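
    The local discriminator described above only ever sees a fixed-size crop centered on the completed region. As a toy illustration (the function name and the list-of-rows image format are assumptions, not the paper's code), such a crop could be taken as follows:

```python
def local_patch(image, center, size):
    """Crop a size x size patch centered on the completed region, clamped so
    the crop stays inside the image.  `image` is a list of rows and `center`
    is a (row, col) coordinate inside the filled-in region."""
    h, w = len(image), len(image[0])
    half = size // 2
    top = min(max(center[0] - half, 0), h - size)
    left = min(max(center[1] - half, 0), w - size)
    return [row[left:left + size] for row in image[top:top + size]]
```

    The global discriminator would see the whole `image` while the local one sees only `local_patch(image, center, size)`; the completion network is trained to fool both.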

  • Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification

    Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    ACM Transactions on Graphics   35 ( 4 )   2016.07   [Refereed]

    We present a novel technique to automatically colorize grayscale images that combines both global priors and local image features. Based on Convolutional Neural Networks, our deep network features a fusion layer that allows us to elegantly merge local information dependent on small image patches with global priors computed using the entire image. The entire framework, including the global and local priors as well as the colorization model, is trained in an end-to-end fashion. Furthermore, our architecture can process images of any resolution, unlike most existing CNN-based approaches. We leverage an existing large-scale scene classification database to train our model, exploiting the class labels of the dataset to more efficiently and discriminatively learn the global priors. We validate our approach with a user study and compare against the state of the art, where we show significant improvements. Furthermore, we demonstrate our method extensively on many different types of images, including black-and-white photography from over a hundred years ago, and show realistic colorizations.

    DOI
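
    The fusion layer described above merges one image-level feature vector with the feature vector at every spatial location. A minimal sketch of that operation, using plain nested lists instead of real network tensors (the function and argument names are illustrative assumptions):

```python
def fuse(local_features, global_features):
    """Fusion-layer sketch: replicate the image-level global feature vector
    at every spatial location and concatenate it with the local features.
    `local_features` is an H x W grid of feature vectors (lists);
    `global_features` is a single vector for the whole image."""
    return [[cell + global_features for cell in row] for row in local_features]
```

    Because the same global vector is appended at every location, the output keeps the spatial resolution of `local_features`, which is what lets the network process images of any resolution.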

  • Learning to simplify: Fully convolutional networks for rough sketch cleanup

    Edgar Simo-Serra, Satoshi Iizuka, Kazuma Sasaki, Hiroshi Ishikawa

    ACM Transactions on Graphics   35 ( 4 )   2016.07   [Refereed]

    In this paper, we present a novel technique to simplify sketch drawings based on learning a series of convolution operators. In contrast to existing approaches that require vector images as input, we allow the more general and challenging input of rough raster sketches such as those obtained from scanning pencil sketches. We convert the rough sketch into a simplified version which is then amenable to vectorization. This is all done in a fully automatic way without user intervention. Our model consists of a fully convolutional neural network which, unlike most existing convolutional neural networks, is able to process images of any dimensions and aspect ratio as input, and outputs a simplified sketch which has the same dimensions as the input image. In order to teach our model to simplify, we present a new dataset of pairs of rough and simplified sketch drawings. By leveraging convolution operators in combination with efficient use of our proposed dataset, we are able to train our sketch simplification model. Our approach naturally overcomes the limitations of existing methods, e.g., vector images as input and long computation time, and we show that meaningful simplifications can be obtained for many different test cases. Finally, we validate our results with a user study in which we greatly outperform similar approaches and establish the state of the art in sketch simplification of raster images.

    DOI

  • Fashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction

    Edgar Simo-Serra and Hiroshi Ishikawa

    Conference on Computer Vision and Pattern Recognition (CVPR)    2016.06   [Refereed]

    DOI

  • Neuroaesthetics in Fashion: Modeling the Perception of Fashionability

    Edgar Simo-Serra, Sanja Fidler, Francesc Moreno-Noguer, Raquel Urtasun

    Conference on Computer Vision and Pattern Recognition (CVPR)    2015.06   [Refereed]

    DOI

  • Discriminative learning of deep convolutional feature point descriptors

    Edgar Simo-Serra, Eduard Trulls, Luis Ferraz, Iasonas Kokkinos, Pascal Fua, Francesc Moreno-Noguer

    Proceedings of the IEEE International Conference on Computer Vision   2015   118 - 126   2015.02   [Refereed]

    Deep learning has revolutionized image-level tasks such as classification, but patch-level tasks, such as correspondence, still rely on hand-crafted features, e.g. SIFT. In this paper we use Convolutional Neural Networks (CNNs) to learn discriminant patch representations and in particular train a Siamese network with pairs of (non-)corresponding patches. We deal with the large number of potential pairs with the combination of a stochastic sampling of the training set and an aggressive mining strategy biased towards patches that are hard to classify. By using the L2 distance during both training and testing we develop 128-D descriptors whose Euclidean distances reflect patch similarity, and which can be used as a drop-in replacement for any task involving SIFT. We demonstrate consistent performance gains over the state of the art, and generalize well against scaling and rotation, perspective transformation, non-rigid deformation, and illumination changes. Our descriptors are efficient to compute and amenable to modern GPUs, and are publicly available.

    DOI
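
    The aggressive mining strategy mentioned in the abstract biases training towards negative pairs that are hard to classify, i.e., non-corresponding patches whose descriptors are currently closest in L2 distance. A minimal sketch under that reading (function names and the pair format are illustrative assumptions, not the authors' code):

```python
def l2(a, b):
    """Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def mine_hard_negatives(neg_pairs, keep):
    """Keep only the `keep` non-corresponding descriptor pairs with the
    SMALLEST distance -- the ones the network currently confuses most."""
    return sorted(neg_pairs, key=lambda pair: l2(pair[0], pair[1]))[:keep]
```

    The kept pairs would then be pushed apart during training, while corresponding pairs are pulled together under the same L2 metric used at test time.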

  • DaLI: Deformation and Light Invariant Descriptor

    Edgar Simo-Serra, Carme Torras, Francesc Moreno-Noguer

    International Journal of Computer Vision (IJCV)   115 ( 2 ) 136 - 154   2015.02   [Refereed]

    DOI

  • Kinematic Synthesis using Tree Topologies

    Edgar Simo-Serra and Alba Perez-Gracia

    Mechanism and Machine Theory (MAMT)   72   94 - 113   2014.02   [Refereed]

    DOI

  • A Joint Model for 2D and 3D Pose Estimation from a Single Image

    Edgar Simo-Serra, Ariadna Quattoni, Carme Torras, Francesc Moreno-Noguer

    Conference on Computer Vision and Pattern Recognition (CVPR)    2013.06   [Refereed]

    DOI

  • Single Image 3D Human Pose Estimation from Noisy Observations

    Edgar Simo-Serra, Arnau Ramisa, Guillem Alenyà, Carme Torras, Francesc Moreno-Noguer

    Conference on Computer Vision and Pattern Recognition (CVPR)    2012.06   [Refereed]

    DOI

  • Two-stage Discriminative Re-ranking for Large-scale Landmark Retrieval

    Shuhei Yokoo, Kohei Ozaki, Edgar Simo-Serra, Satoshi Iizuka

    Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)    2020.06   [Refereed]

  • Regularized Adversarial Training for Single-shot Virtual Try-On

    Kotaro Kikuchi, Kota Yamaguchi, Edgar Simo-Serra, Tetsunori Kobayashi

    International Conference on Computer Vision Workshops (ICCVW)    2019.11   [Refereed]

  • Understanding the Effects of Pre-training for Object Detectors via Eigenspectrum

    Yosuke Shinya, Edgar Simo-Serra, Taiji Suzuki

    International Conference on Computer Vision Workshops (ICCVW)    2019.10   [Refereed]

  • Virtual Thin Slice: 3D Conditional GAN-based Super-resolution for CT Slice Interval

    Akira Kudo, Yoshiro Kitamura, Yuanzhong Li, Satoshi Iizuka, Edgar Simo-Serra

    International Conference on Medical Image Computing and Computer Assisted Intervention Workshops (MICCAIW)    2019.10   [Refereed]

    DOI

  • Analysis of Pre-training Effects for Object Detection CNNs Based on Eigenvalue Distributions

    Yosuke Shinya, Edgar Simo-Serra, Taiji Suzuki

    22nd Meeting on Image Recognition and Understanding (MIRU)    2019.08

  • Temporal Distance Matrices for Workout Form Assessment

    Ryoji Ogata, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    22nd Meeting on Image Recognition and Understanding (MIRU, short oral)    2019.07   [Refereed]

  • Regularizing Adversarial Training for Single-shot Object Placement

    Kotaro Kikuchi, Kota Yamaguchi, Edgar Simo-Serra, Tetsunori Kobayashi

    22nd Meeting on Image Recognition and Understanding (MIRU, short oral)    2019.07   [Refereed]

  • DeepRemaster: Digital Remastering of Video Using Temporal Source-Reference Attention

    Satoshi Iizuka, Edgar Simo-Serra

    Visual Computing / Graphics and CAD Joint Symposium (oral)    2019.06   [Refereed]

  • Optimization-Based Data Generation for Photo Enhancement

    Mayu Omiya, Yusuke Horiuchi, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)    2019.06   [Refereed]

  • Temporal Distance Matrices for Squat Classification

    Ryoji Ogata, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)    2019.06   [Refereed]

  • Re-staining Pathology Images by FCNN

    Masayuki Fujitani, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Hirokazu Kobayashi, Chika Iwamoto, Kenoki Ohuchida, Makoto Hashizume, Hidekata Hontani, Hiroshi Ishikawa

    International Conference on Machine Vision Applications (MVA)    2019.05   [Refereed]

    DOI

  • Spectral Normalization and Relativistic Adversarial Training for Conditional Pose Generation with Self-Attention

    Yusuke Horiuchi, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    International Conference on Machine Vision Applications (MVA)    2019.05   [Refereed]

    DOI

  • Learning Photo Enhancement by Black-Box Model Optimization Data Generation

    Mayu Omiya, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    SIGGRAPH Asia Technical Brief    2018.11   [Refereed]

    DOI

  • Stain Conversion of Pathology Images Using FCNNs

    Masayuki Fujitani, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Hirokazu Kobayashi, Chika Iwamoto, Kenoki Ohuchida, Makoto Hashizume, Hidekata Hontani, Hiroshi Ishikawa

    21st Meeting on Image Recognition and Understanding (MIRU, oral)    2018.08   [Refereed]

  • Reflection Removal from Images by Joint Estimation of the Background and Reflection Components

    Ryosuke Sato, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    21st Meeting on Image Recognition and Understanding (MIRU, oral)    2018.08   [Refereed]

  • Multi-class Segmentation of Aerial Photographs Using Recurrent Convolutional Neural Networks

    Hiroki Takahashi, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    21st Meeting on Image Recognition and Understanding (MIRU)    2018.08

  • High-quality Automatic Photo Enhancement by Learning Correction Parameters

    Mayu Omiya, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    21st Meeting on Image Recognition and Understanding (MIRU)    2018.08

  • Recognition and Speed-up of Mail Label Detection Using SSD

    Ryoji Ogata, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    21st Meeting on Image Recognition and Understanding (MIRU)    2018.08

  • Learning to restore deteriorated line drawing

    Kazuma Sasaki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    Visual Computer   34 ( 6-8 ) 1077 - 1085   2018.06   [Refereed]

    We propose a fully automatic approach to restore aged line drawings. We decompose the task into two subtasks: the line extraction subtask, which aims to extract line fragments and remove the paper texture background, and the restoration subtask, which fills in possible gaps and deterioration of the lines to produce a clean line drawing. Our approach is based on a convolutional neural network that consists of two sub-networks corresponding to the two subtasks. They are trained as part of a single framework in an end-to-end fashion. We also introduce a new dataset consisting of manually annotated sketches by Leonardo da Vinci which, in combination with a synthetic data generation approach, allows training the network to restore deteriorated line drawings. We evaluate our method on challenging 500-year-old sketches and compare with existing approaches with a user study, in which it is found that our approach is preferred 72.7% of the time.

    DOI

  • Special Issue "Image Processing for Manga and Line Drawings": Automatic Inking of Rough Sketches

    Edgar Simo-Serra, Satoshi Iizuka

    Journal of the Institute of Image Information and Television Engineers, May 2018 issue    2018.05

  • Multi-modal joint embedding for fashion product retrieval

    A. Rubio, Longlong Yu, E. Simo-Serra, F. Moreno-Noguer

    Proceedings - International Conference on Image Processing, ICIP    400 - 404   2018.02   [Refereed]

    Finding a product in the fashion world can be a daunting task. Every day, e-commerce sites are updated with thousands of images and their associated metadata (textual information), deepening the problem, akin to finding a needle in a haystack. In this paper, we leverage both the images and textual metadata and propose a joint multi-modal embedding that maps both the text and images into a common latent space. Distances in the latent space correspond to similarity between products, allowing us to effectively perform retrieval in this latent space, which is both efficient and accurate. We train this embedding using large-scale real world e-commerce data by both minimizing the distance between related products and using auxiliary classification networks that encourage the embedding to have semantic meaning. We compare against existing approaches and show significant improvements in retrieval tasks on a large-scale e-commerce dataset. We also provide an analysis of the different metadata.

    DOI
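
    Retrieval in the joint latent space described above reduces to nearest-neighbour search: once text and images are embedded into the common space, products are ranked by distance to the query. A toy sketch with plain vectors (the embedding networks are omitted, and all names here are illustrative assumptions):

```python
def distance(a, b):
    """Euclidean distance in the shared latent space."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def retrieve(query_embedding, catalog, k=3):
    """Rank catalog items (id, embedding) by distance to the query
    embedding and return the ids of the k closest products."""
    ranked = sorted(catalog, key=lambda item: distance(query_embedding, item[1]))
    return [item_id for item_id, _ in ranked[:k]]
```

    Because the space is shared, `query_embedding` can come from either modality, which is what makes text-to-image and image-to-image retrieval interchangeable.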

  • Multi-modal Embedding for Main Product Detection in Fashion

    LongLong Yu, Edgar Simo-Serra, Francesc Moreno-Noguer, Antonio Rubio

    Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017    2236 - 2242   2018.01   [Refereed]

    We present an approach to detect the main product in fashion images by exploiting the textual metadata associated with each image. Our approach is based on a Convolutional Neural Network and learns a joint embedding of object proposals and textual metadata to predict the main product in the image. We additionally use several complementary classification and overlap losses in order to improve training stability and performance. Our tests on a large-scale dataset taken from eight e-commerce sites show that our approach outperforms strong baselines and is able to accurately detect the main product in a wide diversity of challenging fashion images.

    DOI

  • What Makes a Style: Experimental Analysis of Fashion Prediction

    Moeko Takagi, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017    2247 - 2253   2018.01   [Refereed]

    In this work, we perform an experimental analysis of the differences of both how humans and machines see and distinguish fashion styles. For this purpose, we propose an expert-curated new dataset for fashion style prediction, which consists of 14 different fashion styles each with roughly 1,000 images of worn outfits. The dataset, with a total of 13,126 images, captures the diversity and complexity of modern fashion styles. We perform an extensive analysis of the dataset by benchmarking a wide variety of modern classification networks, and also perform an in-depth user study with both fashion-savvy and fashion-naïve users. Our results indicate that, although classification networks are able to outperform naive users, they are still far from the performance of savvy users, for which it is important to not only consider texture and color, but subtle differences in the combination of garments.

    DOI

  • Multi-label Fashion Image Classification with Minimal Human Supervision

    Naoto Inoue, Edgar Simo-Serra, Toshihiko Yamasaki, Hiroshi Ishikawa

    Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017    2261 - 2267   2018.01   [Refereed]

    We tackle the problem of multi-label classification of fashion images, learning from noisy data with minimal human supervision. We present a new dataset of full body poses, each with a set of 66 binary labels corresponding to the information about the garments worn in the image obtained in an automatic manner. As the automatically-collected labels contain significant noise, we manually correct the labels for a small subset of the data, and use these correct labels for further training and evaluation. We build upon a recent approach that both cleans the noisy labels and learns to classify, and introduce simple changes that can significantly improve the performance.

    DOI

  • Adaptive Energy Selection For Content-Aware Image Resizing

    Kazuma Sasaki, Yuya Nagahama, Zheng Ze, Satoshi Iizuka, Edgar Simo-Serra, Yoshihiko Mochizuki, Hiroshi Ishikawa

    Asian Conference on Pattern Recognition (ACPR)    2017.11   [Refereed]

    DOI

  • Stain Conversion of Pathology Images by a CNN Trained on a Dataset That Accounts for Image Similarity

    Masayuki Fujitani, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    Technical Committee on Healthcare and Medical Information Communication Technology (MICT)    2017.11

  • Classification of Fashion Outfits Using Deep Learning

    Moeko Takagi, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    20th Meeting on Image Recognition and Understanding (MIRU)    2017.08

  • Building Segmentation in Aerial Photographs Using Fully Convolutional Networks with a Recurrent Structure

    Hiroki Takahashi, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    20th Meeting on Image Recognition and Understanding (MIRU, oral)    2017.08   [Refereed]

  • Banknote portrait detection using convolutional neural network

    Ryutaro Kitagawa, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Matsuki, Naotake Natori, Hiroshi Ishikawa

    Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017    440 - 443   2017.07   [Refereed]

    Banknotes generally have different designs according to their denominations. Thus, if characteristics of each design can be recognized, they can be used for sorting banknotes according to denominations. The portrait in a banknote is one such characteristic that can be used for classification. A sorting system for banknotes can be designed that recognizes the portrait in each banknote and sorts it accordingly. In this paper, our aim is to automate the configuration of such a sorting system by automatically detecting portraits in sample banknotes, so that it can be quickly deployed in a new target country. We use Convolutional Neural Networks to detect portraits in a completely new set of banknotes, robust to variation in the ways they are shown, such as the size and the orientation of the face.

    DOI

  • Deep Mario

    Ryutaro Kitagawa, Edgar Simo-Serra, Satoshi Iizuka, Yoshihiko Mochizuki, Hiroshi Ishikawa

    Visual Computing / Graphics and CAD Joint Symposium (oral)    2017.06   [Refereed]

  • Automatic Photo Enhancement Using a Regression-based Correction Model

    Mayu Omiya, Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa

    Visual Computing / Graphics and CAD Joint Symposium (oral)    2017.06   [Refereed]

  • Room reconstruction from a single spherical image by higher-order energy minimization

    Kosuke Fukano, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Akihiro Sugimoto, Hiroshi Ishikawa

    Proceedings - International Conference on Pattern Recognition    1768 - 1773   2017.04   [Refereed]

    We propose a method for understanding a room from a single spherical image, i.e., reconstructing and identifying structural planes forming the ceiling, the floor, and the walls in a room. A spherical image records the light that falls onto a single viewpoint from all directions and does not require correlating geometrical information from multiple images, which facilitates robust and precise reconstruction of the room structure. In our method, we detect line segments from a given image, and classify them into two groups: segments that form the boundaries of the structural planes and those that do not. We formulate this problem as a higher-order energy minimization problem that combines the various measures of likelihood that one, two, or three line segments are part of the boundary. We minimize the energy with graph cuts to identify segments forming boundaries, from which we estimate the structural planes in 3D. Experimental results on synthetic and real images confirm the effectiveness of the proposed method.

    DOI

  • Detection by classification of buildings in multispectral satellite imagery

    Tomohiro Ishii, Edgar Simo-Serra, Satoshi Iizuka, Yoshihiko Mochizuki, Akihiro Sugimoto, Hiroshi Ishikawa, Ryosuke Nakamura

    Proceedings - International Conference on Pattern Recognition    3344 - 3349   2017.04   [Refereed]

    We present an approach for the detection of buildings in multispectral satellite images. Unlike 3-channel RGB images, satellite imagery contains additional channels corresponding to different wavelengths. Approaches that do not use all channels are unable to fully exploit these images for optimal performance. Furthermore, care must be taken due to the large bias in classes, e.g., most of the Earth is covered in water and thus it will be dominant in the images. Our approach consists of training a Convolutional Neural Network (CNN) from scratch to classify multispectral image patches taken by satellites as whether or not they belong to a class of buildings. We then adapt the classification network to detection by converting the fully-connected layers of the network to convolutional layers, which allows the network to process images of any resolution. The dataset bias is compensated by subsampling negatives and tuning the detection threshold for optimal performance. We have constructed a new dataset using images from the Landsat 8 satellite for detecting solar power plants and show our approach is able to significantly outperform the state-of-the-art. Furthermore, we provide an in-depth evaluation of the seven different spectral bands provided by the satellite images and show it is critical to combine them to obtain good results.

    DOI

  • BASS: Boundary-Aware Superpixel Segmentation

    Antonio Rubio, Longlong Yu, Edgar Simo-Serra, Francesc Moreno-Noguer

    Proceedings - International Conference on Pattern Recognition    2824 - 2829   2017.04   [Refereed]

    We propose a new superpixel algorithm based on exploiting the boundary information of an image, as objects in images can generally be described by their boundaries. Our proposed approach initially estimates the boundaries and uses them to place superpixel seeds in the areas in which they are more dense. Afterwards, we minimize an energy function in order to expand the seeds into full superpixels. In addition to standard terms such as color consistency and compactness, we propose using the geodesic distance which concentrates small superpixels in regions of the image with more information, while letting larger superpixels cover more homogeneous regions. By both improving the initialization using the boundaries and coherency of the superpixels with geodesic distances, we are able to maintain the coherency of the image structure with fewer superpixels than other approaches. We show the resulting algorithm to yield smaller Variation of Information metrics in seven different datasets while maintaining Undersegmentation Error values similar to the state-of-the-art methods.

    DOI

  • Multi-Modal Fashion Product Retrieval

    Antonio Rubio, Longlong Yu, Edgar Simo-Serra, Francesc Moreno-Noguer

    The 6th Workshop on Vision and Language    2017.04   [Refereed]

    DOI

  • Banknote Portrait Detection Using Convolutional Neural Networks

    Ryutaro Kitagawa, Yoshihiko Mochizuki, Satoshi Iizuka, Edgar Simo-Serra, Naotake Natori, Hiroshi Matsuki, Hiroshi Ishikawa

    19th Meeting on Image Recognition and Understanding (MIRU)    2016.08

  • Automatic Completion of Line Drawings Using Fully Convolutional Networks

    Kazuma Sasaki, Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa

    19th Meeting on Image Recognition and Understanding (MIRU)    2016.08

  • Structured Prediction with Output Embeddings for Semantic Image Annotation

    Ariadna Quattoni, Arnau Ramisa, Pranava Swaroop Madhyastha, Edgar Simo-Serra, Francesc Moreno-Noguer

    Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT, Short)    2016.06   [Refereed]

  • Automatic Recognition of Ground Objects in Earth Observation Satellite Imagery

    Ryosuke Nakamura, Tomohiro Ishii, Hirokazu Nosato, Hidenori Sakanashi, Edgar Simo-Serra, Yoshihiko Mochizuki, Satoshi Iizuka, Hiroshi Ishikawa

    Annual Conference of the Japanese Society for Artificial Intelligence (JSAI)    2016.06

  • Understanding Human-Centric Images: From Geometry to Fashion

    Edgar Simo-Serra

    PhD Thesis    2015.07   [Refereed]

  • Efficient Monocular Pose Estimation for Complex 3D Models

    Antonio Rubio, Michael Villamizar, Luis Ferraz, Adrián Peñate-Sánchez, Arnau Ramisa, Edgar Simo-Serra, Alberto Sanfeliu, Francesc Moreno-Noguer

    International Conference on Robotics and Automation (ICRA)    2015.05   [Refereed]

    DOI

  • Lie Algebra-Based Kinematic Prior for 3D Human Pose Tracking

    Edgar Simo-Serra, Carme Torras, Francesc Moreno-Noguer

    International Conference on Machine Vision Applications (MVA)    2015.05   [Refereed]

    DOI

  • A High Performance CRF Model for Clothes Parsing

    Edgar Simo-Serra, Sanja Fidler, Francesc Moreno-Noguer, Raquel Urtasun

    Asian Conference on Computer Vision (ACCV)    2014.11   [Refereed]

    DOI

  • Geodesic Finite Mixture Models

    Edgar Simo-Serra, Carme Torras, Francesc Moreno-Noguer

    British Machine Vision Conference (BMVC)    2014.09   [Refereed]

    DOI

  • Kinematic Synthesis of Multi-Fingered Robotic Hands for Finite and Infinitesimal Tasks

    Edgar Simo-Serra, Alba Perez-Gracia, Hyosang Moon, Nina Robson

    Advances in Robot Kinematics (ARK)    2012.06   [Refereed]

    DOI

  • Design of Non-Anthropomorphic Robotic Hands for Anthropomorphic Tasks

    Edgar Simo-Serra, Francesc Moreno-Noguer, Alba Perez-Gracia

    ASME International Design Engineering Technical Conferences (IDETC)    2011.07   [Refereed]

    DOI

  • Kinematic Model of the Hand using Computer Vision

    Edgar Simo-Serra

    Degree Thesis    2011.05   [Refereed]


Misc

  • Kinematic Synthesis Using Tree Topologies (vol 72, pg 94, 2013)

    Edgar Simo-Serra, Alba Perez-Gracia

    Mechanism and Machine Theory   73   314 - 315  March 2014

    Other

    DOI

Awards

  • Selected for "Remarkable Contributions to Science and Technology 2018" (Nice Step Researchers)

    November 2018   Ministry of Education, Culture, Sports, Science and Technology (MEXT)

  • Best Paper Award

    October 2017   International Conference on Computer Vision - Computer Vision for Fashion Workshop (ICCV-CVF)

  • Special Doctoral Award

    September 2017   BarcelonaTech

  • Innovative Technologies 2016 Special Prize "Culture"

    October 2016   Ministry of Economy, Trade and Industry (METI)

  • Best Doctoral Thesis Award

    October 2016   Catalan Association for Artificial Intelligence (ACIA)

  • ACIA award to best diffusion of artificial intelligence research 2015

    October 2015   Catalan Association for Artificial Intelligence (ACIA)


    Award for diffusion of artificial intelligence research to the general public.

  • Best paper award at the International Conference on Machine Vision Applications (MVA) 2015

    May 2015

    Best paper at the International Conference on Machine Vision Applications (MVA).


Research Projects (Joint Research and Competitive Funding)

  • Augmenting Content Creation with Interactive Personalization AI

    Japan Science and Technology Agency (JST)

    Project period:

    April 2018
    -
    March 2021
     

  • Interactive Content Creation Using Artificial Intelligence with Unsupervised Learning

    Japan Science and Technology Agency (JST)

    Project period:

    December 2016
    -
    March 2018
     

Lectures and Oral Presentations

  • Image Processing Techniques Using Deep Learning

    Edgar Simo-Serra  [Invited]

    Nice Step Researchers 2018 Lecture

    Presented: June 2019

  • Inking Support for Rough Sketches with Interactive Neural Networks

    Edgar Simo-Serra  [Invited]

    JST CREST HCI for Machine Learning Symposium

    Presented: March 2019

  • Supporting Content Creation with Interactive Neural Networks

    Edgar Simo-Serra  [Invited]

    173rd CGVI Research Meeting

    Presented: March 2019

  • Towards Augmenting Creative Processes with Machine Learning

    Edgar Simo-Serra  [Invited]

    Computer Science Student Workshop

    Presented: August 2018

  • Automatic Line Drawing via Adversarial Data Augmentation

    Edgar Simo-Serra  [Invited]

    Visual Computing Symposium

    Presented: June 2018

  • Smart Inker: Inking Support for Rough Sketches

    Edgar Simo-Serra  [Invited]

    Visual Computing Symposium

    Presented: June 2018

  • Semi-Supervised Learning of Sketch Simplification

    Edgar Simo-Serra  [Invited]

    The Deep Learning Workshop 2018

    Presented: March 2018

  • Fundamentals of Deep Learning and Toward Its Adoption

    Edgar Simo-Serra  [Invited]

    IEICE General Conference

    Presented: March 2018

  • Image Translation Using Deep Networks

    Edgar Simo-Serra  [Invited]

    69th Stereo Club Tokyo Regular Meeting (Spring) + Advanced Visual Expression Workshop

    Presented: March 2018

  • Automatic Colorization of Black-and-White Photographs Using Deep Networks

    Edgar Simo-Serra  [Invited]

    3D Forum

    Presented: January 2018

  • Exploiting the Web to Understand Fashion

    Edgar Simo-Serra  [Invited]

    ICCV2017 Computer Vision for Fashion Workshop

    Presented: October 2017

  • Leveraging the Web for Fashion and Image Understanding

    Edgar Simo-Serra  [Invited]

    Workshop on E-Commerce and Entertainment Computing (ECEC)

    Presented: September 2017

  • Image Translation with Deep Learning

    Edgar Simo-Serra  [Invited]

    CVIM September 2017 Research Meeting

    Presented: September 2017

  • Image Generation with Deep Learning

    Edgar Simo-Serra  [Invited]

    IAIP Regular Workshop: Big Data and Image Recognition

    Presented: July 2017

  • Frontiers of Image Processing and Computer Graphics by Deep Learning

    Edgar Simo-Serra  [Invited]

    Computer Graphics International 2017

    Presented: June 2017

  • Image Generation with Deep Learning

    Edgar Simo-Serra  [Invited]

    42nd Optics Symposium

    Presented: June 2017

  • Towards Mastering the Image: Deep Learning for Computer Graphics

    Edgar Simo-Serra  [Invited]

    Frontiers of Images and Artificial Intelligence

    Presented: January 2017

  • Image Generation with Fully Convolutional Neural Networks

    Edgar Simo-Serra  [Invited]

    4th MSR Intern Alumni Meetup

    Presented: December 2016

  • Frontiers of Image Generation with Deep Learning (Applications)

    Edgar Simo-Serra  [Invited]

    3rd Autumn Meeting of the Federation of Imaging Societies

    Presented: November 2016

  • Understanding Human-Centric Images: From Geometry to Fashion

    Edgar Simo-Serra  [Invited]

    19th International Conference of the Catalan Association for Artificial Intelligence

    Presented: October 2016

  • Fashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction

    Edgar Simo-Serra  [Invited]

    MIRU2016: Symposium on Image Recognition and Understanding

    Presented: August 2016

  • Learning to Simplify Sketches

    Edgar Simo-Serra  [Invited]

    DCAJ Presentation “Industrial Application of Content Technology in Japan”

    Presented: July 2016

  • Automatic Line Drawing of Rough Sketches with Fully Convolutional Neural Networks

    Edgar Simo-Serra  [Invited]

    Visual Computing Symposium

    Presented: June 2016


 

Courses Currently Taught


 

Committee Memberships

  • April 2018
    -
    March 2022

    IPSJ Special Interest Group on Computer Vision and Image Media (SIG-CVIM), Technical Committee Member