Simple is powerful

Eric Lam | Voidful
UDIC LAB Member

Contact

Github : https://github.com/voidful
Email : [email protected]
Medium : https://medium.com/@voidful.stack
LinkedIn : https://www.linkedin.com/in/voidful/
Twitter : https://twitter.com/voidful_stack
Facebook : https://www.facebook.com/voidful.nlp/

Skill

Natural Language Processing, Machine Learning, Crawler, Web Framework, Data Mining
Front-end (React), Adobe Illustrator Design, Backend, Android Apps

Demo Page

WebPage

Project

TFkit

Github
🤖📇 Transformers kit - NLP library for different downstream tasks, built on the Hugging Face project

NLPrep

Github
🍳 NLPrep - download and pre-processing data for nlp tasks

nlp2go

Github
🏃 hosting NLP models for demo purposes

Best AI Award In MAKENTU 2019

Talk to a poster: it answers related questions using machine reading comprehension and information retrieval.

Medical Record Analytic

Extract knowledge from medical records, whose wording varies with each doctor's way of expressing things.

Zero-Shot End-to-End Cantonese Speech Recognition

The main problem is the lack of Cantonese corpora. Currently trying to solve it with transfer learning.

Machine Reading Comprehension

Dataset List
Kept up to date with the newest datasets and models

BERT For Sentence Generation

Colab Trial
Trial of fine-tuning BERT for sentence generation with different approaches: generating one by one, generating all at once, and generating from an LSTM.

Cipher

Cipher Github
Open-source Android app that lets you hash your password before typing it.
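
The idea of hashing a password before typing it can be sketched as follows. This is only an illustration in Python, not Cipher's actual scheme: the function name `hashed_password`, the use of PBKDF2 with the site name as salt, the iteration count, and the output length are all assumptions.

```python
import base64
import hashlib


def hashed_password(master_password, site, length=16):
    """Derive a site-specific password from a master password.

    Hypothetical helper: the site name serves as the salt, so the same
    master password yields a different result on each site.
    """
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        master_password.encode("utf-8"),
        site.encode("utf-8"),
        100_000,  # iteration count, an arbitrary choice for this sketch
    )
    # Encode the 32-byte digest as text and keep the first `length` characters.
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]
```

The result is deterministic, so the same master password and site always give the same string to type.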

Python NLP Preprocessing Toolkit

nlp2 Github
Python library that helps with text mining and preprocessing, with unit tests and detailed documentation.

OOV Word Extraction

Phraseg Github
A different approach to extracting new phrases from short text. It uses conditional probability rather than PMI and entropy. Compared with those approaches, it places fewer limits on the size of the input corpus and needs less computation.
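
A minimal sketch of conditional-probability phrase extraction is shown below. It is not Phraseg's actual algorithm: the function name `extract_phrases`, the bigram-only model, and the `threshold` and `min_count` parameters are assumptions made for illustration. Adjacent tokens are joined whenever P(next | current) is high enough.

```python
from collections import Counter


def extract_phrases(sentences, threshold=0.6, min_count=2):
    """Join adjacent tokens whose P(next | current) is at least `threshold`.

    `sentences` is a list of token lists. Returns a Counter of phrases.
    """
    unigrams = Counter()  # counts of tokens that have a successor
    bigrams = Counter()   # counts of adjacent token pairs
    for tokens in sentences:
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))

    phrases = Counter()
    for tokens in sentences:
        current = [tokens[0]] if tokens else []
        for prev, nxt in zip(tokens, tokens[1:]):
            pair_count = bigrams[(prev, nxt)]
            cond_prob = pair_count / unigrams[prev]
            if pair_count >= min_count and cond_prob >= threshold:
                current.append(nxt)  # extend the running phrase
            else:
                if len(current) > 1:
                    phrases[" ".join(current)] += 1
                current = [nxt]      # start a new candidate
        if len(current) > 1:
            phrases[" ".join(current)] += 1
    return phrases
```

Only pair and token counts are needed, which is why a sketch like this can run on a small corpus with little computation.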

Active Learning For Question Answering

ActiveBag Github
Uses fastText to train a multi-classifier and selects samples for retraining by voting, entropy, and clustering, so that high accuracy can be reached with less labeling effort.
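
The entropy criterion can be sketched as below. This is an illustration only, not ActiveBag's code: the names `entropy` and `select_for_labeling` are hypothetical, and the class probabilities are assumed to come from an already trained classifier. The samples the model is least sure about are sent for labeling first.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a class probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, k):
    """Return the ids of the k samples with the highest prediction entropy.

    `predictions` maps a sample id to its list of class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda sample_id: entropy(predictions[sample_id]),
        reverse=True,
    )
    return ranked[:k]
```

Voting and clustering would be separate selection criteria; only the entropy part is shown here.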

Knowledge Extraction with Wiki Dump

Mining Wiki dump data to get plain text, synonyms from redirects, translations from language links, and relationships from categories.

Web Crawler For Well-known Hong Kong & Taiwan Websites

Collects corpora for NLP tasks. Based on Scrapy, it crawls all text from 19 well-known websites.