I am interested in solving problems in real-world applications with simple ML architectures.

  • G. Kim, T. Hong, M. Yim, J. Nam, J. Park, J. Yim, W. Hwang, S. Yun, D. Han, and S. Park, “OCR-free Document Understanding Transformer”, Proceedings of the European Conference on Computer Vision (ECCV) (to appear), 2022.
  • G. Kim, W. Hwang, M. Seo, and S. Park, “Semi-Structured Query Grounding for Document-Oriented Databases with Deep Retrieval and Its Application to Receipt and POI Matching”, Proceedings of the AAAI-22 Workshop on Knowledge Discovery from Unstructured Data in Financial Services, 2022.
  • W. Hwang, H. Lee, J. Yim, G. Kim, and M. Seo, “Cost-effective End-to-end Information Extraction for Semi-structured Document Images”, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
  • M. Naito, S. Yokoi, G. Kim, and H. Shimodaira, “Revisiting Additive Compositionality: AND, OR and NOT Operations with Word Embeddings”, Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, 2021.
  • S. Park, G. Kim, J. Lee, J. Cha, J. Kim, and H. Lee, “Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model”, Proceedings of the 28th International Conference on Computational Linguistics (COLING), 2020.
  • M. Mizutani, A. Okuno, G. Kim, and H. Shimodaira, “Stochastic Neighbor Embedding of Multimodal Relational Data for Image-Text Simultaneous Visualization”, arXiv preprint, 2020.
  • G. Kim, A. Okuno, K. Fukui, and H. Shimodaira, “Representation Learning with Weighted Inner Product for Universal Approximation of General Similarities”, Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI), 2019.
  • G. Kim, K. Fukui, and H. Shimodaira, “Segmentation-free Compositional n-gram Embedding”, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2019.
  • A. Okuno, G. Kim, and H. Shimodaira, “Graph Embedding with Shifted Inner Product Similarity and Its Improved Approximation Capability”, Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.
  • J. Baek, G. Kim, J. Lee, S. Park, D. Han, S. Yun, S. J. Oh, and H. Lee, “What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis”, Proceedings of the 2019 IEEE International Conference on Computer Vision (ICCV), 2019 (Oral presentation).
  • G. Kim, A. Okuno, and H. Shimodaira, “Embedding Words into Pseudo-Euclidean Space”, Proceedings of the 25th Annual Meeting of the Association for Natural Language Processing (in Japanese), 2019.
  • T. Tanaka, A. Okuno, K. Fukui, G. Kim, and H. Shimodaira, “Image Tag Estimation Using Multiscale k-Nearest Neighbors”, Presentation at the 22nd Information-Based Induction Sciences Workshop (in Japanese), 2019.
  • G. Kim, K. Fukui, and H. Shimodaira, “Word-like Character n-gram Embedding”, Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text, 2018.
  • G. Kim, K. Fukui, T. Hada, and H. Shimodaira, “Segmentation-free Word Embedding with Word Dictionary”, Presentation at the 20th Information-Based Induction Sciences Workshop (in Japanese), 2017.

Invited Talks

  • You have been there!: Identifying a store from a receipt image, DEVIEW. 2021.
  • Michinoku Communication Science Seminar, Tohoku University. May 2019.

Selected Honors & Awards

  • Young Researcher Award of the Twenty-fifth Annual Meeting of the Association for Natural Language Processing.
  • Seiwa International Students Scholarship. 2019.
  • Korea-Japan Joint Government Scholarship. 2013–2018.
    • Covered admission fees, tuition, and living costs for one year of preliminary education and four years of Bachelor’s studies.


  • Research Scientist, NAVER Corp. (Apr. 2020–)
    • Working on document processing and machine learning towards real-world applications.
  • Mathematical Statistics Team, RIKEN Center for Advanced Intelligence Project (Sep. 2017–Feb. 2020)
  • Shimodaira Lab. (Statistics and Machine Learning), Kyoto University (Apr. 2017–Mar. 2020)
  • CLOVA OCR Team, NAVER Corp. (Aug. 2018–Oct. 2018 and Aug. 2019–Sep. 2019)