Research interests
My research interests lie at the intersection of machine learning and natural language processing. I am particularly interested in word embedding, a component of neural network architectures for statistical language modelling that maps the words or phrases of a corpus to vectors of real numbers. Currently, I am working on cross-lingual word embedding, which consists of embedding multiple languages into a single semantic space. I am not restricted to these topics, however: I enjoy working on multiple languages or, more specifically, connecting languages and transferring resources from one language to another.
Education
- Korea Advanced Institute of Science and Technology (KAIST), Ph.D., Computer Science, 2015 - Present
- Korea University (KU), M.S., Computer Science, 2012 - 2014
- University of Science and Technology Houari Boumediene (USTHB), B.S., Mathematics and Computer Science, 2006 - 2010
Work Experience
-
Graduate Assistant, Users & Information Lab (U&I Lab), KAIST, Daejeon, Korea, 2016 - Present
- Lab Assistant
- Teaching Assistant
-
Teaching Assistant, Data Science Expert Training Course, Daejeon, Korea, 06/2017 - 08/2017
- Team and workshop manager
- Assisted team members in publishing their work at a workshop
-
Workshop Instructor, Issac KAIST, KAIST, Daejeon, Korea, 2017
- Java and Web Development Workshop, May
- Python Programming Workshop, October
-
Graduate Assistant, Semantic Web Research Center (SWRC), KAIST, Daejeon, Korea, 2015 - 2016
- Lab Assistant
- Implemented an Entity Linking module (for Korean and English) used in the OKBQA-3 hackathon 2016, and a pipeline for Named Entity Recognition (NER) and Named Entity Disambiguation (NED)
- Participated in building DBpedia Korea using the pipeline and module mentioned above
-
Graduate Assistant, Intelligent Information System Lab, Korea University, Sejong, Korea, 2012 - 2014
- Lab Assistant
- Participated in the national project BK21+
-
Intern, HSBC Middle East (IT department), Algiers, Algeria, 06/2012 - 08/2012
- Volunteer, FOREM, Algiers, Algeria, 2004 - Present
Publications
-
H. Park, K. Kwon, Abdel-ilah Z Khiati, J. Lee, In-Jeong Chung. "Agglomerative Hierarchical Clustering for Information Retrieval Using Latent Semantic Index". SocialCom, 2015. PDF link
-
Abdel-Ilah Z Khiati, D. Kang, H. Park, K. Kwon, In-Jeong Chung. "Agglomerative Hierarchical Clustering Using Latent Semantic Analysis in Information Retrieval". Domestic conference, KIPS, 2014. PDF link
-
H. Park, Abdel-Ilah Z Khiati, D. Kang, K. Kwon, In-Jeong Chung. "Collaborative Movie Recommendation Method Using Sentiment Analysis". Domestic conference, KIPS, 2014. PDF link
-
D. Wang, Abdel-Ilah Z Khiati, J. Sohn, B. Joo, In-Jeong Chung. "An Improved Method for Measurement of Gross National Happiness Using Social Network Services". HumanCom, 2013. PDF link
-
"OSGI for the management and implementation of dynamic applications". B.S. Thesis, USTHB, 2010.
Skills
- Laboratory/Research Skills: Data processing, Statistical analysis, Team management, Teaching assistance, etc.
-
Languages:
- English (Fluent)
- French (Native)
- Arabic (Native)
- Korean (Intermediate)
- Programming Languages: Java, Python, Matlab, etc.
References
- Alice Oh, Associate Professor, Computer Science, KAIST, alice.oh (at) kaist.edu
- Chung In-Jeong, Professor, Computer Science, Korea University, chung (at) korea.ac.kr
Research Statement
My research interests lie in natural language processing (NLP) and machine learning, where I have worked on statistical approaches and computational models for extracting semantic information from text. I am particularly interested in closing the gap between widely spoken languages such as English and other European languages, for which there exists an abundance of NLP resources and technologies, and minority languages, such as Arabic and Korean, that often lack even the most basic NLP resources and tools.
During my master's studies, I worked at the Intelligent Information System Laboratory (IIS Lab) as a graduate assistant under the supervision of Prof. Chung In-Jeong. My research area lay between NLP and machine learning. In the laboratory, I was involved in several projects, such as Brain Korea 21 Plus (BK21+), a human resource development program funded by the Korean Ministry of Education. My master's thesis was about retrieving snippets from search engines, embedding them using Latent Semantic Analysis (LSA), and clustering the resulting vectors using Agglomerative Hierarchical Clustering (AHC). By the end of my master's, I had published a paper that explains how to use LSA efficiently for clustering and topic analysis.
In early 2015, I started pursuing a doctorate in Computer Science at the Korea Advanced Institute of Science and Technology (KAIST) under the supervision of Prof. Key-Sun Choi. I also joined the Semantic Web Research Center (SWRC), where I worked as a graduate assistant. My research focused on NLP and the Semantic Web, and I developed an interest in tasks that extract information from, or add information to, knowledge bases such as DBpedia. More precisely, I worked on Entity Linking, a task that relies on a knowledge base (KB) to identify entities in a given text and then link them to their corresponding entries in the KB. Most of these entities are highly ambiguous, so a strong model is needed to ensure that every named entity is correctly linked using its context. I developed a model that successfully finds and disambiguates named entities in Korean and achieves results on par with state-of-the-art models such as AGDISTIS in English. The model was part of a project and was also used in several applications, including OKBQA and DBpedia Korea.
In the second year of my doctorate, I decided to shift my focus to a less application-oriented laboratory, and so I joined the Users and Information Laboratory (U&I Lab) under the supervision of Prof. Alice Oh. My current research topics are multi-lingual word embedding and its application to other NLP tasks such as cross-lingual transfer. Recently, I have been working on an idea that combines traditional embedding methods such as Dice Aligner with current state-of-the-art methods such as Word2Vec to obtain a better representation for multi-lingual embeddings. I am also using my work to collaborate with other researchers on different tasks such as topic analysis. One current collaboration involves clustering political tweets in three languages (English, French, and German) and then analyzing the results to identify politicians who have similar or opposing points of view.
Finally, my goal is to develop new techniques, or improve existing ones, that could serve as a bridge between different languages, especially the less fortunate languages that are lacking in research development. This kind of transfer can benefit many NLP tasks, from machine translation to other tasks that rely on a multi-lingual environment.
Contact
Khiati Abdel-ilah Zakaria
E-mail: zakria.ai (at) kaist.ac.kr
Tel: (+82)-10-5521-8905
Ph.D. Student
Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, Korea