Updated on 2025/08/27

SAKAI, Yuta
 
Affiliation
Faculty of Science and Engineering, School of Creative Science and Engineering
Job title
Research Associate

Research Experience

  • 2024.04 - Now

    Waseda University

Professional Memberships

  • 日本品質管理学会 (The Japanese Society for Quality Control)

  • 日本経営工学会 (Japan Industrial Management Association)

Research Areas

  • Intelligent informatics / Statistical science / Social systems engineering

Research Interests

  • Data analysis

  • Industrial and management engineering

  • Machine learning

  • Data science

Awards

  • Best Paper Award, The 19th ANQ Congress 2021

    2021.10  

    Winner: Yuki Tsuboi, Yuta Sakai, Satoshi Suzuki, Masayuki Goto

 

Papers

  • Bayesian Optimization Method for Business Policy Decisions Considering Input-dependent Variance

    Taiga Yoshikawa, Yuta Sakai, Tianxiang Yang, Masayuki Goto

    Journal of Japan Industrial Management Association   75 ( 2 ) 49 - 59  2024  [Refereed]

    DOI J-GLOBAL

    Scopus

  • Multiple treatment effect estimation for business analytics using observational data

    Yuki Tsuboi, Yuta Sakai, Ryotaro Shimizu, Masayuki Goto

    Cogent Engineering   11 ( 1 )  2024  [Refereed]

    To correctly evaluate the effects of treatments, conducting randomized controlled trials (RCTs) is a reasonable approach. However, because it is generally difficult to implement RCTs for all treatments, methods to estimate the treatment effects using observational data have been actively studied and used in various decision-making processes. Observational data accumulated in business activities and elsewhere contains the results of various previously implemented treatments, and correctly estimating the effects of any given treatment without separating the impacts of other treatments is challenging. Against this background, this paper proposes a method to estimate the effects of multiple treatments of various types while considering various causal relationships. Specifically, the proposal is a variational inference method that estimates the effect of multiple treatments using four latent factors estimated from observations, making assumptions that are independent of the type and number of treatments. The proposed method makes it possible to appropriately estimate the effects of measures even in situations with complex causal relationships. In addition, in situations where measures with continuous parameters are being implemented, it is possible to estimate the effects of measures that have not been implemented in the past by treating the content of the measures as a continuous variable.

    DOI

    Scopus (8 citations)

  • A Semi-Supervised Learning Model for Predicting User Attributes Based on Ladder Network

    Mizuki TAKEUCHI, Taichi IMAFUKU, Yuta SAKAI, Masayuki GOTO

    Total Quality Science   9 ( 2 ) 109 - 120  2023.12  [Refereed]  [Invited]

    DOI

  • A Robust Estimation Method for Conditional Average Treatment Effects Taking Account of Selection Bias Based on Causal Tree

    Yuki Tsuboi, Yuta Sakai, Satoshi Suzuki, Masayuki Goto

    情報処理学会論文誌ジャーナル(Web)   64 ( 9 ) 1399 - 1412  2023.09  [Refereed]

    DOI J-GLOBAL

  • Purchasing Behavior Analysis Model that Considers the Relationship Between Topic Hierarchy and Item Categories

    Yuta Sakai, Yui Matsuoka, Masayuki Goto

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   13316 LNCS   344 - 358  2022  [Refereed]

    With the spread of e-commerce (EC) sites, it has become important for companies to analyze the user preferences contained in accumulated purchase history data and to use them in marketing measures. Topic models are well known as a method for analyzing user preferences from purchase history data, and models that assume a hierarchy of topics have been proposed as extensions. The previously proposed PAM (Pachinko Allocation Model) is a highly expressive model in which all upper and lower topics are connected by a network, so the relationships between multiple topics can be analyzed. However, PAM is easily affected by the initial values of its learning parameters, and it is difficult to obtain stable topics, so the interpretation of the estimated topics becomes unstable. It is dangerous to make business decisions based on the interpretation of such unstable results. Therefore, in this research, instead of using a hierarchy of topics estimated from the user's purchasing behavior, we use the hierarchical “product categories” that the EC site defines for managing items. On this basis, we propose a method that enables hierarchical topic analysis and is useful for studying measures. Finally, the proposed method is applied to the evaluation history data of an actual EC site to analyze user preferences and show its usefulness.

    DOI

    Scopus

  • An Extension of Semi-supervised Boosting to Multi-valued Classification Problems

    Yuta Sakai, Kazuki Yasui, Kenta Mikawa, Masayuki Goto

    Total Quality Science   6 ( 2 ) 60 - 69  2021.02  [Refereed]  [Invited]

    DOI

Misc

  • Proposal of a Comprehensive Analysis Framework for Risk Using Text Data of Securities Reports

    KARINO Yusei, FUJIWARA Daiki, CHOMEI Shogo, SAKAI Yuta, GOTO Masayuki

    Proceedings of the Annual Conference of JSAI   JSAI2025   1O3GS1003 - 1O3GS1003  2025

    A comprehensive understanding of corporate risks is essential for investment decisions. The "Business and Other Risks" section in Securities Reports provides detailed insights into potential risks, but the large volume of text makes cross-company analysis labor-intensive. To improve efficiency, various studies have applied machine learning to risk analysis. However, many lack systematic frameworks, as they do not incorporate pre-existing risk classifications or industry-wide commonalities, limiting the validity of results. This study proposes a method that utilizes 12 risk categories proposed by Noda, based on national risk management standards, to systematically extract risk-related text. By applying BERTopic, we classify industry-common and firm-specific risks, allowing for both a comprehensive overview and a focused understanding of individual corporate risks. An analysis using Securities Reports for Fiscal Year 2023 demonstrates the effectiveness of the proposed approach in enabling systematic and efficient risk identification.

    DOI

  • Proposal of an Off-Policy Evaluation Method Considering the Entire Reward Distribution in Large Action Spaces.

    TAKASHI Taishiro, SAKAI Yuta, GOTO Masayuki

    Proceedings of the Annual Conference of JSAI   JSAI2025   2S5GS202 - 2S5GS202  2025

    In counterfactual machine learning, off-policy evaluation (OPE) aims to estimate the true performance of decision-making policies using logged data. Traditional estimators evaluate policy performance based solely on rewards directly induced by the policy. However, in real-world scenarios like e-commerce recommendation systems, users often take actions (e.g., purchases) outside the recommended list, leading to unaccounted rewards. To address this, estimators must evaluate performance beyond the recommended items. Existing methods struggle as action spaces grow, with accuracy deteriorating in large-scale environments. For example, e-commerce platforms may have action spaces ranging from thousands to millions of items, requiring robust methods to maintain accuracy. This study proposes a novel estimator extending the OffCEM framework to mitigate accuracy degradation, achieving high performance in large action spaces. Theoretical analysis and experiments show that the proposed method outperforms previous estimators, delivering enhanced accuracy in large-scale settings.

    DOI

  • Proposal of An Explainable Multi-class Classification Method using Neural Networks with ECOC Method

    SAKAI Yuta, MIKAWA Kenta, GOTO Masayuki

    Proceedings of the Annual Conference of JSAI   JSAI2025   3M5OS7b05 - 3M5OS7b05  2025

    With the advancement of information technology, data utilization has expanded, and multi-class classification tasks like image classification have become crucial. While machine learning models have achieved high accuracy, their opacity poses challenges, spurring the development of explainable AI (XAI). Current XAI methods, such as heatmaps highlighting influential input features or techniques quantifying feature importance for local explanations, primarily interpret input-output relationships. However, they fail to elucidate the structural relationships between multiple classes and provide limited global interpretability, often restricted to identifying predictive features. This study proposes a novel XAI approach leveraging the ECOC method to interpret category groupings that enhance model identification in multi-class classification. By decomposing the problem into multiple classification tasks, this approach offers insights into the ease of classification and the similarities among categories, advancing the interpretability of machine learning models.

    DOI

  • A Classification Model for Activities of Daily Living Based on Point Cloud Data Observed by Millimeter-Wave Radar

    布目 悠人, 阪井 優太, 山極 綾子, 後藤 正幸, 新井 俊宏

    日本計算機統計学会シンポジウム論文集   38   136 - 139  2024.10

  • CATE Estimation for Flash Sales of Used Fashion Items with Imbalanced Outcome Variables

    木村 恵悟, 阪井 優太, 後藤 正幸, 佐々木 北都

    日本計算機統計学会シンポジウム論文集   38   94 - 97  2024.10

  • A Clustering Model Considering Customers' Browsing Frequency and Preference Diversity

    松岡 龍汰, 阪井 優太, 後藤 正幸, 山下 遥

    日本計算機統計学会シンポジウム論文集   38   128 - 131  2024.10

  • An Analysis Model of Email-Opening Factors Considering the Effect of Continuous Email Delivery

    阪井 優太, 後藤 正幸, 清水 良太郎

    日本計算機統計学会シンポジウム論文集   38   54 - 57  2024.10

  • Proposal of a Price Prediction Model for Used Smartphones Using Training Data Selection

    森川 卓哉, 阪井 優太, 後藤 正幸

    日本計算機統計学会シンポジウム論文集   38   78 - 81  2024.10

  • An Effective Data Augmentation Method for Self-Supervised Learning Models on Tabular Data

    朝海柊, 長命祥吾, 藤原大喜, 阪井優太, 後藤正幸

    日本経営工学会秋季大会予稿集(Web)   2024  2024

    J-GLOBAL

  • Hierarchical Multi-label Classification Model Adapted to Training Data with Different Layers of Correct Labels

    MIYAJIMA Kengo, NUNOME Yuto, SAKAI Yuta, GOTO Masayuki

    Proceedings of the Annual Conference of JSAI   JSAI2024   2B1GS202 - 2B1GS202  2024

    Multi-label classification in document data is the task of correctly assigning multiple class labels to each document. However, there is often a semantic hierarchical structure among the assigned labels, and considering the hierarchical structure can improve the accuracy of label prediction. The Multi-label Box Model (MBM) has been proposed as a multi-label classification model that takes into account the semantic hierarchical structure among labels, and its effectiveness has been demonstrated when class labels of all layers are assigned to training data. However, real-world document data posted on user-contributed websites often lack class labels for all layers of the hierarchy. If such data is used to train MBM, the accuracy of label prediction is reduced. In this study, we propose a framework for learning MBM after complementing labels of missing hierarchies by introducing Bidirectional Encoder Representations from Transformers (BERT). The effectiveness of the proposed method is also demonstrated through evaluation experiments, which compare the accuracy of the conventional method and the proposed method when applied to data with missing labels of some hierarchies.

    DOI J-GLOBAL

  • Analysis of Customer Purchasing Behavior and Reviews Using an Embedding Space

    布目 悠人, 阪井 優太, 後藤 正幸

    日本計算機統計学会シンポジウム論文集   37   46 - 49  2023.11

  • Price Factor Analysis Model for Used Smartphone Products Based on Machine Learning

    森川卓哉, 竹内瑞生, 阪井優太, 後藤正幸

    人工知能学会全国大会論文集(Web)   37th   2L6GS304 - 2L6GS304  2023

    In recent years, more and more used smartphones have been bought and sold through online sales services, and it is desirable to use the large amount of transaction data accumulated through these transactions when listing and purchasing used smartphones. Used equipment buyers can use this data to analyze market price trends and the factors that determine those prices, which can lead to optimal purchase strategies and pricing. However, used smartphones are handled by various sales services. For such a problem, if a prediction model could be constructed that explains the selling price from various characteristics, it would be possible to understand which factors contribute to the selling price. In this study, we analyze price determinants using a gradient boosting model, which offers high accuracy and interpretability, with the help of explainable AI. In this analysis, it is undesirable to apply a single price factor analysis model that cannot account for differences among sales services, as previous analyses of the used equipment market have done. Therefore, we propose an analytical model that applies explainable AI techniques to capture the price determinants that differ among sales services. The proposed model is applied to actual sales data of used smartphones, and the results are discussed.

    DOI J-GLOBAL

  • Adversarial Training with Data Selection which Improves the Accuracy and Reduces the Computational Complexity of Domain Adaptation

    木村恵悟, 中村太祐, 阪井優太, 後藤正幸

    人工知能学会全国大会論文集(Web)   37th   2A6GS203 - 2A6GS203  2023

    In general, machine learning does not guarantee accurate prediction when the statistical structure of the features differs between the training data and the prediction data, but it is sometimes necessary to apply a prediction model to a target whose structure may differ from that of its training data. In recent years, studies addressing this challenge have been actively conducted. One of them is Adversarial Discriminative Domain Adaptation (ADDA), a classification model that uses the adversarial training of Generative Adversarial Networks (GAN). ADDA has the drawback that it uses all data in a mini-batch, including data that are harmful to training. In this study, we propose an improved version of ADDA that applies Top-k training, a technique from related GAN research. This enables ADDA to select useful data based on internal outputs, and the prediction accuracy is expected to increase. The experimental results show that the proposed method yields significant improvements in both accuracy and training time.

    DOI J-GLOBAL

  • A Study on Review Analysis Based on Using Query-focused Summarization Model

    中村太祐, 阪井優太, 後藤正幸

    情報理論とその応用シンポジウム予稿集(CD-ROM)   46th  2023

    J-GLOBAL

  • Adversarial Counter Factual Regression based on Importance Weighted Sampling

    今福太一, 阪井優太, 後藤正幸

    情報理論とその応用シンポジウム予稿集(CD-ROM)   46th  2023

    J-GLOBAL

  • A Study on Analysis Model for Action History Data Based on Self- and Semi-supervised Learning

    竹内瑞生, 阪井優太, 後藤正幸

    情報理論とその応用シンポジウム予稿集(CD-ROM)   46th  2023

    J-GLOBAL

  • A Study on High-dimensional Data Visualization Methods Based on Angles

    阪井優太, 三川健太, 後藤正幸

    情報理論とその応用シンポジウム予稿集(CD-ROM)   46th  2023

    J-GLOBAL

  • A Study on Domain Adaptation by Adversarial Training Using Selected Data

    木村 恵悟, 坪井 優樹, 阪井 優太, 後藤 正幸

    日本計算機統計学会シンポジウム論文集   36   44 - 47  2022.11

  • A Study on a Treatment Effect Estimation Method for Multiple Measures

    坪井 優樹, 阪井 優太, 清水 良太郎, 後藤 正幸

    第84回全国大会講演論文集   2022 ( 1 ) 689 - 690  2022.02

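    When companies implement marketing measures, verifying their effects properly and linking the results to sound decision-making is a very important issue, and expectations are rising for data-driven marketing based on effect verification using behavioral history data. In recent years, multiple types of measures are often carried out in the same period, and methods that can correctly estimate and verify the effect of each measure even in this situation are needed for planning marketing strategy. In this study, we therefore use a method that can estimate the effects of multiple types of measures simultaneously to estimate and analyze the effects of banner advertisements on a fashion EC site.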

  • A Study on Active Learning Research Trends and Issues in Regression / Classification Problems

    阪井優太, 小林学, 後藤正幸

    情報処理学会全国大会講演論文集   84th ( 2 ) 23 - 24  2022

    J-GLOBAL

  • A Relationship Analysis Model between Products based on Store Sales Data by Machine Learning Approach with SHAP Values

    石倉滉大, 阪井優太, 吉開朋弘, 後藤正幸

    情報理論とその応用シンポジウム予稿集(CD-ROM)   45th  2022

    J-GLOBAL

  • Multiple Treatment Effect Estimation for E-commerce Marketing Using Observational Data

    坪井優樹, 阪井優太, 清水良太郎, 後藤正幸

    情報理論とその応用シンポジウム予稿集(CD-ROM)   45th  2022

    J-GLOBAL

  • A Proposal of Business Decision-Making Model by Bayesian Optimization Considering Input-Dependent Dispersion

    良川太河, 阪井優太, YANG Tianxiang, 後藤正幸

    情報理論とその応用シンポジウム予稿集(CD-ROM)   45th  2022

    J-GLOBAL

  • An Analytical Model of Users’ Switching Possibilities by Using Predicted Time to Credit Card Users

    大久保亮吾, 阪井優太, 立花徹也, 長坂典香, 後藤正幸

    情報理論とその応用シンポジウム予稿集(CD-ROM)   45th  2022

    J-GLOBAL

  • A Study on a Knowledge Graph-Based Recommender System Considering Differences in Latent Relationships

    中村太祐, 阪井優太, 後藤正幸

    日本経営工学会秋季大会予稿集(Web)   2022  2022

    J-GLOBAL

  • A Study of Model Predicting User Attributes Based on Semi-supervised Learning by Ladder Network

    竹内瑞生, 今福太一, 阪井優太, 後藤正幸

    人工知能学会全国大会論文集(Web)   36th   1A4GS201 - 1A4GS201  2022

    In recent years, marketing that uses attribute information associated with member accounts of online services has become widespread. However, the majority of users are non-members who use services without registering for an account, and it is difficult to implement measures using attribute information for these non-member users. To deal with this situation, semi-supervised learning is an effective way to increase the number of users with attribute information: attributes are predicted from the history data of member users who have attribute information, together with the history data of non-member users. One such semi-supervised learning method is the Ladder Network, a neural network based model that adds and removes noise. This model provides highly accurate prediction for image data and is also considered useful for predicting user attributes from history data, where the feature vector is high-dimensional. However, this method cannot be applied when the label takes ordered values, such as the user's age category. In this study, we propose an extended model based on the Ladder Network that incorporates a mechanism for appropriately predicting the user's attribute information. We also conduct an evaluation experiment using actual browsing history data to show the effectiveness of the proposed method.

    DOI J-GLOBAL

  • A Conditional Average Treatment Effect Estimation Method Based on Causal Tree Considering Selection Bias

    坪井優樹, 阪井優太, 鈴木佐俊, 後藤正幸

    日本経営工学会春季大会予稿集(Web)   2021  2021

    J-GLOBAL

  • A Study on a Purchasing Behavior Analysis Model Considering Topic Hierarchy

    松岡佑以, 平野洋介, 阪井優太, 後藤正幸

    日本経営工学会春季大会予稿集(Web)   2021  2021

    J-GLOBAL

  • A Method for Assigning Keywords to Academic Papers Considering the Probability Distribution of Topic Membership

    阪井優太, 浅見怜, 後藤正幸

    日本経営工学会春季大会予稿集(Web)   2021  2021

    J-GLOBAL

  • A Study on a Method for Estimating Conditional Average Treatment Effects Taking Account of Selection Bias Based on Causal Tree

    坪井優樹, 阪井優太, 鈴木佐俊, 後藤正幸

    人工知能学会全国大会論文集(Web)   35th   3G2GS2h04 - 3G2GS2h04  2021

    It is important for companies to verify the effects of their marketing measures and to make the right decisions. To verify these effects correctly from observational data, causal inference is used. Recent causal inference studies often allocate subjects to two groups, treat them differently, and then estimate the Conditional Average Treatment Effect (CATE) to better understand causal mechanisms. CATE makes it possible to identify the group of users for whom taking measures is effective. Causal Tree, which offers high interpretability and is useful for analyzing the factors that affect measures, has been proposed as a CATE estimation method. However, because of selection bias, this method cannot be used when subjects are allocated to the two groups on purpose. Therefore, we propose a method based on Causal Tree for estimating CATE that takes selection bias into account. Finally, we evaluate the precision of the CATE estimates through simulation experiments. In addition, we apply the proposed method to actual data and show its usefulness.

    DOI CiNii J-GLOBAL

  • Zero-shot Generative Model Considering Attribute Uncertainty

    阪井優太, 三川健太, 後藤正幸

    電子情報通信学会技術研究報告(Web)   120 ( 300(PRMU2020 38-68) )  2020

    J-GLOBAL

  • An Analytical Model of the Customer Purchase Factor Based on Conditional VAE Learned of Web Browsing Data

    川上達也, 阪井優太, 山下遥, 後藤正幸

    人工知能学会全国大会論文集(Web)   34th   1I3GS204 - 1I3GS204  2020

    Due to the accumulation of browsing history data on EC sites, Web marketing techniques are of growing significance. Most previous studies analyzed differences in overall browsing pages between purchasing and non-purchasing sessions by constructing a discriminative model and proposed measures for all users. However, it is difficult to utilize this model when considering personalized measures for each user. In this situation, a generative model, which infers browsing-behavior conditioned by whether a user purchases or not, is effective. Conditional VAE infers the data from the label and features of input data. In this paper, we apply Conditional VAE to browsing history data and identify important pages by generating a pseudo session assuming that a user in a non-purchasing session purchases. We propose a method to analyze important browsing pages that contribute to each user's purchase. We clarify the effectiveness of our proposed method by using real browsing history data.

    DOI CiNii J-GLOBAL

  • Construction of Demand Forecast Model of Tokyo Taxi Based on Probe Data Analysis

    飯塚玲夫, 小野雄生, 野中賢也, 阪井優太, 後藤正幸

    人工知能学会全国大会論文集(Web)   34th   2I5GS204 - 2I5GS204  2020

    We construct a machine learning based decision support model that can help taxi drivers dispatch their vehicles appropriately by utilizing probe data of taxis in Tokyo. Traditionally, taxi dispatch has relied on the driver's experience and intuition. The number of customers depends on the knowledge gained through many years of experience. For example, sometimes many taxis are waiting for customers, and sometimes many customers wait in a long queue for a taxi. In addition, there are differences in the transport distance depending on the location. However, in a given situation, not all drivers know the places with high demand. Therefore, it is desirable to build a model that is easy for drivers to understand and that enables efficient acquisition of customers regardless of the driver's experience. In this situation, we propose an analysis model that supports drivers' decisions based on taxi probe data.

    DOI CiNii J-GLOBAL

  • An extension method of semi-supervised boosting to multiclass classification

    阪井優太, 安井一貴, 三川健太, 後藤正幸

    人工知能学会全国大会論文集(Web)   33rd   4A3J105 - 4A3J105  2019

    In recent years, semi-supervised learning, which classifies test data into the correct category using not only training (labeled) data but also a large amount of unlabeled data, has attracted attention. However, in the semi-supervised learning setting, classification accuracy deteriorates when the distribution of labeled data is biased. SemiBoost is a semi-supervised learning method that addresses this problem. SemiBoost is a binary classification method, however, and cannot be extended directly to multi-class classification. In this research, we propose a way to extend SemiBoost to multi-class classification using the concept of the Error Correcting Output Code (ECOC) method. To verify the effectiveness of the proposed method, we conduct simulation experiments using the UCI Machine Learning Repository.

    DOI CiNii J-GLOBAL

 

Internal Special Research Projects

  • Research on Verifying the Effects of EC Marketing Measures Based on Machine Learning Models

    2024   後藤正幸, 三川健太

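    In recent years, companies operating e-commerce (EC) sites with many users have been implementing a variety of business measures. Statistical causal inference, which verifies the effect of a measure using past observational data collected while the measure was in place, is being actively used. However, treatment effect estimation in standard causal inference focuses on the effect of a single measure, whereas in practice it is preferable to account for the effects of multiple measures and to estimate them for groups of users with similar characteristics or for individual users. This research therefore proposed a model that takes multiple measures into account and can estimate treatment effects for each user. In this research, effect estimation and analysis are carried out using observational data from email campaigns on an actual EC site. Email campaigns have distinctive characteristics: because the cost of running them is small, emails are sent to users at high frequency, and many different kinds of email are mixed together, one for each department that wants to run a promotion. We proposed a model that analyzes how this large volume of email affects users' email opening. The research showed that, in addition to a user's most recent opening behavior, the number of emails sent one week or one month earlier affects the email open rate. This shows the importance of controlling the volume of email, something that can be managed independently of user behavior. These results were presented at the 38th Symposium of the Japanese Society of Computational Statistics (日本計算機統計学会第38回シンポジウム). We also proposed an analysis method for data with diverse categories. The method reduces data from various categories to a low-dimensional space and visualizes it, so that the similarity between categories can be assessed visually. By treating attribute information, such as email campaigns and users, as categories, past behavioral features can be visualized per category, which is expected to help in understanding the diversity of preferences. These results were presented at ANQ Congress 2024.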