Research

My research interests span machine learning, covering both its underlying principles and its practical applications. I specialize in building robust, generalizable machine learning systems, including vision and language models, for real-world use.

Publications / Preprints List

Please see my Google Scholar or Semantic Scholar for a full list.

  • D. Kim, T. Hong, M. Yim, Y. Kim, and G. Kim, “On Web-based Visual Corpus Construction for Visual Document Understanding”, Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2023. Paper / GitHub Stars
  • G. Kim, S. Yokoo, S. Seo, A. Osanai, Y. Okamoto, and Y. Baek, “On Text Localization in End-to-End OCR-Free Document Understanding Transformer Without Text Localization Supervision”, Proceedings of the International Conference on Document Analysis and Recognition Workshops, 2023. Paper (available at ICDAR 2023 Workshops Proceedings Part I) / Slide
  • G. Kim, T. Hong, M. Yim, J. Nam, J. Park, J. Yim, W. Hwang, S. Yun, D. Han, and S. Park, “OCR-free Document Understanding Transformer”, Proceedings of the European Conference on Computer Vision (ECCV), 2022. Paper / Slide / Poster / GitHub Stars / PyPI Package Downloads
  • G. Kim, W. Hwang, M. Seo, and S. Park, “Semi-Structured Query Grounding for Document-Oriented Databases with Deep Retrieval and Its Application to Receipt and POI Matching”, Proceedings of the AAAI-22 Workshop on Knowledge Discovery from Unstructured Data in Financial Services, 2022. Paper
  • W. Hwang, H. Lee, J. Yim, G. Kim, and M. Seo, “Cost-effective End-to-end Information Extraction for Semi-structured Document Images”, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021. Paper
  • M. Naito, S. Yokoi, G. Kim, and H. Shimodaira, “Revisiting Additive Compositionality: AND, OR and NOT Operations with Word Embeddings”, Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, 2021. Paper
  • S. Park, G. Kim, J. Lee, J. Cha, J. Kim, and H. Lee, “Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model”, Proceedings of the 28th International Conference on Computational Linguistics (COLING), 2020. Paper / GitHub
  • M. Mizutani, A. Okuno, G. Kim, and H. Shimodaira, “Stochastic Neighbor Embedding of Multimodal Relational Data for Image-Text Simultaneous Visualization”, arXiv preprint, 2020. Paper
  • G. Kim, A. Okuno, K. Fukui, and H. Shimodaira, “Representation Learning with Weighted Inner Product for Universal Approximation of General Similarities”, Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI), 2019. Paper / GitHub / Slide
  • G. Kim, K. Fukui, and H. Shimodaira, “Segmentation-free Compositional n-gram Embedding”, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2019. Paper / GitHub
  • A. Okuno, G. Kim, and H. Shimodaira, “Graph Embedding with Shifted Inner Product Similarity and Its Improved Approximation Capability”, Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 2019. Paper / GitHub
  • J. Baek, G. Kim, J. Lee, S. Park, D. Han, S. Yun, S. J. Oh, and H. Lee, “What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis”, Proceedings of the 2019 IEEE International Conference on Computer Vision (ICCV), 2019. Paper / GitHub Stars
    • Selected for oral presentation: 4.3% (187/4303)
  • G. Kim, A. Okuno, and H. Shimodaira, “Embedding Words into Pseudo-Euclidean Space”, Proceedings of the 25th Annual Meeting of the Association for Natural Language Processing (in Japanese), 2019. Paper
  • G. Kim, K. Fukui, and H. Shimodaira, “Word-like Character n-gram Embedding”, Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text, 2018. Paper / GitHub

Invited Talks

  • “Recent Advances in Document AI”, Korea University. Mar. 2023.
  • “Recent Advances in Document AI”, Kookmin University. Dec. 2022.
  • “OCR-Free Document Understanding Transformer”, Microsoft. Nov. 2022.
  • “Identifying a store from a receipt image”, Developer Conference DEVIEW. 2021. Session Link
  • “Representation Learning with Weighted Inner Product for Universal Approximation of General Similarities”, Michinoku Communication Science Seminar, Tohoku University. May 2019. Session Link

Career

  • Technical Leader and Applied Research Scientist at NAVER Cloud Corp. (May 2023-)
    • Working on LLM-based solutions and products through research and software engineering (Web Page)
    • Managed and led several R&D projects, e.g., Cream
  • Applied Research Scientist at NAVER Corp. (Apr. 2020-Apr. 2023)
    • Worked on research and software engineering for Document AI family of solutions and products (Web Demo)
    • Managed and led several R&D projects, e.g., Donut and Webvicob
  • Shimodaira Lab. (Statistics and Machine Learning), Kyoto University (Apr. 2017-Mar. 2020)
  • Mathematical Statistics Team, RIKEN Center for Advanced Intelligence Project (Sep. 2017-Feb. 2020)
    • Worked on several representation-learning projects as a part-time researcher / trainee
    • Advisor: Prof. Hidetoshi Shimodaira
  • CLOVA OCR Team, NAVER Corp. (Aug. 2018-Oct. 2018 and Aug. 2019-Sep. 2019)
    • Worked on several OCR-related projects as a research intern
    • Advisor: Dr. Hwalsuk Lee

Education

Selected Honors, Awards & Services

  • Young Researcher Award of the Twenty-fifth Annual Meeting of the Association for Natural Language Processing. 2019.
  • Seiwa International Students Scholarship. 2019.
  • Korea-Japan Joint Government Scholarship. 2013–2018.
    • Admission and tuition fees, and living costs covered for a year of preliminary education and four years of Bachelor’s studies
  • Served as a reviewer for the NAACL 2022 Industry Track, EMNLP 2022 Industry Track, ACL 2023 Industry Track, and IEEE Access
Last updated on 23.09.20