The task is to generate a contextual description from tabular data that has no fixed schema. The work involved curating a dataset using NLP-based heuristics and a look-up algorithm. This approach achieved 90% success, surpassing the performance of state-of-the-art natural language processing models such as SEMPRE by 40-50%. We also ran baseline models based on the Copy Mechanism (Christopher D. Manning et al.). To evaluate the process, we devised a deep learning model, an Attention-Based Copy Mechanism, that learns to copy unseen vocabulary, has an alignment matrix, and learns vector representations for tables.
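To make the copy idea concrete, here is a minimal sketch of a single decoding step in one common formulation of a copy mechanism, where a generation distribution over a fixed vocabulary is mixed with an attention distribution over source tokens (here, table cells). It is not the model from this project: the function `copy_step`, the mixing weight `p_gen`, the toy vocabulary, and all scores are hypothetical values chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def copy_step(vocab, gen_logits, source_tokens, attn_scores, p_gen):
    """One decoding step that mixes generating and copying (illustrative only).

    vocab         -- fixed output vocabulary
    gen_logits    -- decoder scores, one per vocabulary word
    source_tokens -- table cells the decoder may copy from
    attn_scores   -- unnormalised attention scores, one per source token
    p_gen         -- assumed probability of generating rather than copying
    """
    gen_probs = softmax(gen_logits)
    attn = softmax(attn_scores)

    # Start from the scaled generation distribution over the fixed vocabulary.
    final = {word: p_gen * p for word, p in zip(vocab, gen_probs)}

    # Add copy mass; tokens outside the vocabulary get their own entry,
    # which is how unseen words can still be produced.
    for token, a in zip(source_tokens, attn):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * a
    return final, attn


# Toy example: both table cells are out-of-vocabulary.
vocab = ["the", "population", "is", "<unk>"]
cells = ["Pune", "3124458"]
dist, attn = copy_step(vocab, [0.1, 0.2, 0.1, 0.0], cells, [0.5, 3.0], p_gen=0.3)
best = max(dist, key=dist.get)
print(best)
```

In this sketch, stacking the `attn` vector from every decoding step gives a matrix with one row per output word and one column per table cell, which is one way to read an alignment matrix.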

Keywords: TensorFlow, Python, NLTK, spaCy.

Slides

Won 2nd prize, for the work done during the internship, in the poster competition organized by Tata Research Development and Design Centre, Pune (India) at Chennai Mathematical Institute.