Learning to Compute Word Embeddings On the Fly
Title:
Learning to Compute Word Embeddings On the Fly
Journal:
MONTREAL AI SYMPOSIUM 2017
Year:
2017
Description:
Words in natural language follow a Zipfian distribution whereby some words are frequent but most are rare. Learning representations for words in the "long tail" of this distribution requires enormous amounts of data. Representations of rare words trained directly on end-tasks are usually poor, requiring us to pre-train embeddings on external data, or treat all rare words as out-of-vocabulary words with a unique representation.