This repo is now archived and will no longer be maintained (some of the notebooks may no longer work in Google Colaboratory because it has moved to TensorFlow 2). For an updated and maintained version of this tutorial, use hybridnlp, which offers revised and new notebooks as well as an upcoming book!
This repo contains the notebooks and overall materials of the HybridNLP 2018 tutorial (http://expertsystemlab.com/hybridNLP18/).
Overall description of the tutorial: Many different artificial intelligence techniques can be used to explore and exploit the large document corpora that are available inside organizations and on the Web. While natural language is symbolic in nature and the first approaches in the field were based on symbolic and rule-based methods, like ontologies, semantic networks and knowledge bases, many of the most widely used methods are currently based on statistical approaches. Each of these two main schools of thought in natural language processing, knowledge-based and statistical, has its limitations and strengths, and there is an increasing trend toward combining them in complementary ways to get the best of both worlds.
This tutorial covers the foundations and modern practical applications of knowledge-based and statistical methods, techniques and models, and their combination for exploiting large document corpora. The tutorial first focuses on the foundations that can be used for this purpose, including knowledge graphs and word embeddings, and then shows how these techniques can be effectively combined in NLP tasks (and other data modalities in addition to text) related to research and commercial projects in which the instructors currently participate.
- Sign in to your Google account and go to “Hello, Colaboratory”: https://colab.research.google.com
- Download the tutorial notebooks from the tutorial repo on GitHub: https://github.com/HybridNLP2018/tutorial
- Open the notebooks (warning: some of the notebooks, e.g. notebook 08, may take a while to load data and/or model weights)
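
As an alternative to downloading and uploading each notebook by hand, Colaboratory can usually open a notebook directly from a public GitHub repository through a URL of the form `https://colab.research.google.com/github/<org>/<repo>/blob/<branch>/<path>`. The sketch below builds such a URL for this repo. The helper name `colab_url`, the branch name `master`, and the notebook file name are assumptions made for illustration; check the repository for the actual branch and file names.

```python
from urllib.parse import quote

# Prefix Colaboratory uses for notebooks hosted on GitHub.
COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github"


def colab_url(org: str, repo: str, notebook_path: str, branch: str = "master") -> str:
    """Build a URL that opens a GitHub-hosted notebook in Colaboratory.

    The default branch name is an assumption; pass the real one if it differs.
    """
    # Percent-encode the path but keep "/" so notebooks in subfolders still resolve.
    safe_path = quote(notebook_path.lstrip("/"))
    return f"{COLAB_GITHUB_PREFIX}/{org}/{repo}/blob/{branch}/{safe_path}"


# "example_notebook.ipynb" is a hypothetical file name, not one taken from the repo.
print(colab_url("HybridNLP2018", "tutorial", "example_notebook.ipynb"))
```

Pasting the printed URL into a browser where you are signed in to your Google account should open the notebook in Colaboratory, provided the file exists at that path.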
If this work is relevant for your research, please cite the following paper:
Ronald Denaux and Jose Manuel Gomez-Perez. 2019. Vecsigrafo: Corpus-based Word-Concept Embeddings. Semantic Web (2019), 1–28. https://doi.org/10.3233/SW-190361
You can use the following BibTeX entry:
@article{Denaux2019Vecsigrafo,
  title={Vecsigrafo: Corpus-based Word-Concept Embeddings},
  author={Ronald Denaux and Jose Manuel Gomez-Perez},
  journal={Semantic Web},
  year={2019},
  pages={1--28},
  doi={10.3233/SW-190361}
}
We gratefully acknowledge funding from the EU Research and Innovation Horizon 2020 programme (projects DANTE-700367 and TRIVALENT-740934) and the Spanish Centre for the Development of Industrial Technology, CDTI (project GRESLADIX-IDI-20160805).