Research
I am interested in solving problems in real-world applications with simple ML architectures 🤗
- D. Kim, T. Hong, M. Yim, Y. Kim, and G. Kim, “On Web-based Visual Corpus Construction for Visual Document Understanding”, Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2023 (to appear).
- G. Kim, T. Hong, M. Yim, J. Nam, J. Park, J. Yim, W. Hwang, S. Yun, D. Han, and S. Park, “OCR-free Document Understanding Transformer”, Proceedings of the European Conference on Computer Vision (ECCV), 2022.
- G. Kim, W. Hwang, M. Seo, and S. Park, “Semi-Structured Query Grounding for Document-Oriented Databases with Deep Retrieval and Its Application to Receipt and POI Matching”, Proceedings of the AAAI-22 Workshop on Knowledge Discovery from Unstructured Data in Financial Services, 2022.
- W. Hwang, H. Lee, J. Yim, G. Kim, and M. Seo, “Cost-effective End-to-end Information Extraction for Semi-structured Document Images”, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
- M. Naito, S. Yokoi, G. Kim, and H. Shimodaira, “Revisiting Additive Compositionality: AND, OR and NOT Operations with Word Embeddings”, Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, 2021.
- S. Park, G. Kim, J. Lee, J. Cha, J. Kim, and H. Lee, “Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model”, Proceedings of the 28th International Conference on Computational Linguistics (COLING), 2020.
- M. Mizutani, A. Okuno, G. Kim, and H. Shimodaira, “Stochastic Neighbor Embedding of Multimodal Relational Data for Image-Text Simultaneous Visualization”, arXiv preprint, 2020.
- G. Kim, A. Okuno, K. Fukui, and H. Shimodaira, “Representation Learning with Weighted Inner Product for Universal Approximation of General Similarities”, Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI), 2019.
- G. Kim, K. Fukui, and H. Shimodaira, “Segmentation-free Compositional n-gram Embedding”, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2019.
- A. Okuno, G. Kim, and H. Shimodaira, “Graph Embedding with Shifted Inner Product Similarity and Its Improved Approximation Capability”, Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.
- J. Baek, G. Kim, J. Lee, S. Park, D. Han, S. Yun, S. J. Oh, and H. Lee, “What Is Wrong with Scene Text Recognition Model Comparisons? Dataset and Model Analysis”, Proceedings of the 2019 IEEE International Conference on Computer Vision (ICCV), 2019 (Oral presentation).
  - Paper
  - GitHub
  - Acceptance rate : 25.0% (1077/4304)
  - Selected as an oral presentation : 4.3% (187/4303)
- G. Kim, A. Okuno, and H. Shimodaira, “Embedding Words into Pseudo-Euclidean Space”, Proceedings of the 25th Annual Meeting of the Association for Natural Language Processing (in Japanese), 2019.
  - Paper
  - Selected to receive both Young Researcher Award and Best Poster Award
- T. Tanaka, A. Okuno, K. Fukui, G. Kim, and H. Shimodaira, “Image Tag Estimation Using Multiscale k-Nearest Neighbors”, Presentation at the 22nd Information-Based Induction Sciences Workshop (in Japanese), 2019.
- G. Kim, K. Fukui, and H. Shimodaira, “Word-like Character n-gram Embedding”, Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text, 2018.
- G. Kim, K. Fukui, T. Hada, and H. Shimodaira, “Segmentation-free Word Embedding with Word Dictionary”, Presentation at the 20th Information-Based Induction Sciences Workshop (in Japanese), 2017.
Invited Talks
- “Recent Advances in Document AI”, Korea University. Mar. 2023.
- “Recent Advances in Document AI”, Kookmin University. Dec. 2022.
- “OCR-Free Document Understanding Transformer”, Visual Document Intelligence Team Reading Group, Microsoft. Nov. 2022.
- “Identifying a store from a receipt image”, Developer Conference DEVIEW. 2021.
- “Representation Learning with Weighted Inner Product for Universal Approximation of General Similarities”, Michinoku Communication Science Seminar, Tohoku University. May 2019.
Selected Honors, Awards & Services
- Young Researcher Award of the Twenty-fifth Annual Meeting of the Association for Natural Language Processing. 2019.
- Seiwa International Students Scholarship. 2019.
- Korea-Japan Joint Government Scholarship. 2013–2018.
  - Covers admission fees, tuition, and living costs for one year of preliminary education and four years of Bachelor’s studies
- Reviewer for the NAACL 2022 Industry Track, EMNLP 2022 Industry Track, and ACL 2023 Industry Track
Career
- Applied Research Scientist at NAVER Corp. (Apr. 2020-)
- Mathematical Statistics Team, RIKEN Center for Advanced Intelligence Project (Sep. 2017-Feb. 2020)
  - As a research part-timer / trainee
  - Advisor : Prof. Hidetoshi Shimodaira
- Shimodaira Lab. (Statistics and Machine Learning), Kyoto University (Apr. 2017-Mar. 2020)
  - Advisor : Prof. Hidetoshi Shimodaira
- CLOVA OCR Team, NAVER Corp. (Aug. 2018-Oct. 2018 and Aug. 2019-Sep. 2019)
  - As a research intern
  - Advisor : Dr. Hwalsuk Lee
Education
- Master of Informatics, Graduate School of Informatics, Kyoto University (Apr. 2018-Mar. 2020)
  - Major : Systems Science
  - Laboratory : Shimodaira Lab. (Statistics and Machine Learning)
  - Advisor : Prof. Hidetoshi Shimodaira
- Bachelor of Engineering, School of Informatics and Mathematical Science, Kyoto University (Apr. 2014-Mar. 2018)
  - Major : Informatics and Mathematical Science (Applied Mathematics and Physics Course)
  - Laboratory : Shimodaira Lab. (Statistics and Machine Learning)
  - Advisor : Prof. Hidetoshi Shimodaira