Information extraction from webpages
Project Description

This work is part of an NLP project. Tasks: 1. scrape the internet for advertisements about a product; 2. fine-tune a pre-trained language model to extract structured information from these ads.
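To give a rough idea of how the two tasks connect, the sketch below turns a scraped ad page into plain text and pairs it with the kind of structured record a fine-tuned model might be trained to produce. It uses only the Python standard library. The names `VisibleTextParser`, `html_to_text`, and `AdRecord`, the field names, and the sample HTML are all made up for illustration and are not part of the project specification; the actual scraping (e.g. with a webdriver) and model fine-tuning are not shown.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text that appears outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML page as one space-joined string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


@dataclass
class AdRecord:
    # Hypothetical target schema; the real fields depend on the product.
    product: str
    price: str


# Made-up sample page standing in for a scraped advertisement.
sample_html = """<html><head><style>p {color: red}</style></head>
<body><h1>Used bicycle</h1><p>Price: HKD 800</p><script>track()</script></body></html>"""

ad_text = html_to_text(sample_html)

# A hand-written label of the kind that could serve as a fine-tuning example.
label = AdRecord(product="Used bicycle", price="HKD 800")
training_pair = (ad_text, label)
```

Pairs like `training_pair` (input text, target record) are one plausible shape for the fine-tuning data in task 2; the real format would depend on the model and framework chosen.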

Supervisor
LIN Fangzhen
Quota
2
Course type
UROP1100
Applicant's Roles

Mostly Python programming, plus gaining an understanding of pre-trained language models such as BERT.

Applicant's Learning Objectives

Programming with Python, WebDriver, and TensorFlow.

Complexity of the project
Moderate