Information extraction with Python

Speaker

jiawei chen / English

Tags

machine learning, natural language processing, data analysis

Abstract

This talk presents a named entity recognition (NER) system for extracting attributes and values, such as person, company, place, or time, from various kinds of text data. I will show how to combine several Python tools to build this system. First, BRAT, an annotation tool written in Python, is used to create a custom annotated corpus. Second, Python is used to drive CRFsuite and train a Conditional Random Fields (CRF) model that labels our text data. The labeling results are then analyzed further with pandas and scikit-learn.
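
As a rough illustration of the step between annotation and training, the sketch below converts text-bound annotations in BRAT's standoff format (tab-separated lines such as "T1<TAB>Person 0 11<TAB>Alice Smith") into per-token BIO labels, the kind of sequence labeling a CRF is usually trained on. It is not the speaker's implementation: the helper names parse_brat_ann and to_bio, the sample sentence, the entity types, and the whitespace tokenization are all assumptions made for this example.

```python
import re


def parse_brat_ann(ann_text):
    """Collect (start, end, label) for text-bound ('T') annotations.

    Hypothetical helper: other annotation kinds (relations, events,
    notes) and discontinuous spans are skipped in this sketch.
    """
    spans = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue
        _ann_id, type_and_offsets, _surface = line.split("\t", 2)
        if ";" in type_and_offsets:
            continue  # discontinuous span, ignored here
        label, start, end = type_and_offsets.split(" ")
        spans.append((int(start), int(end), label))
    return spans


def to_bio(text, spans):
    """Whitespace-tokenize `text` and give each token a BIO tag.

    Assumption: whitespace tokenization is good enough for the example;
    a real corpus would likely need a proper tokenizer.
    """
    tagged = []
    for match in re.finditer(r"\S+", text):
        tag = "O"
        for start, end, label in spans:
            # The token belongs to a span if their character ranges overlap.
            if match.start() < end and match.end() > start:
                prefix = "B" if match.start() <= start else "I"
                tag = f"{prefix}-{label}"
                break
        tagged.append((match.group(), tag))
    return tagged


# Made-up sample document and annotations (illustration only).
text = "Alice Smith joined Acme in 2015"
ann = (
    "T1\tPerson 0 11\tAlice Smith\n"
    "T2\tCompany 19 23\tAcme\n"
    "T3\tTime 27 31\t2015\n"
)

tagged = to_bio(text, parse_brat_ann(ann))
for token, tag in tagged:
    print(token, tag)
```

Each (token, tag) pair could then be expanded into a feature dictionary (for example lowercase form, capitalization, neighboring tokens) before being passed to a CRF trainer. Which features work best depends on the corpus.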

About the speaker


A search engineer who enjoys studying machine learning and natural language processing.

Title

Search engineer