
Non-Autoregressive Neural Machine Translation

Authors: Anonymous authors

Abstract: Existing approaches to neural machine translation condition each output word on previously generated outputs. We introduce a model that avoids this autoregressive property and produces its outputs in parallel, allowing an order of magnitude lower latency during inference. Through knowledge distillation, the use of input token fertilities as a latent variable, and policy gradient fine-tuning, we achieve this at a cost of as little as 2.0 BLEU points relative to the autoregressive Transformer network used as a teacher. We demonstrate substantial cumulative improvements associated with each of the three aspects of our training strategy, and validate our approach on IWSLT 2016 English–German and two WMT language pairs. By sampling fertilities in parallel at inference time, our non-autoregressive model achieves near-state-of-the-art performance of 29.8 BLEU on WMT 2016 English–Romanian.

URL: https://arxiv.org/abs/1711.02281

Notes: goal : approximate the autoregressive model with a non-autoregressive one

needs : improve parallelism of decoding (autoregressive generation emits one token at a time, so inference latency grows with output length)

challenge : (1) the conditional-independence assumption between output tokens is not valid in general; (2) the target length must be estimated before decoding; (3) the multimodality problem (one source sentence has many valid translations)

method :
- conditional independence --> multi-head self-attention
- target length estimation --> estimate its distribution (via token fertilities)
- multi-modality --> model fertility as a latent variable (using a mid-level representation from another pretrained model) + sequence-level knowledge distillation from the autoregressive model; a rough sketch of the fertility mechanism follows
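
As a rough illustration of the fertility mechanism, here is a minimal PyTorch sketch (the function and argument names are hypothetical, not the authors' code): each source token's encoder state is copied as many times as its predicted fertility, and the concatenated copies form the decoder input, which both fixes the target length and lets every position be predicted in parallel.

```python
import torch

def fertility_copy(enc_out: torch.Tensor, fertility_logits: torch.Tensor) -> torch.Tensor:
    """Build the decoder input for one sentence by copying encoder states.

    enc_out:          (src_len, d_model) encoder states.
    fertility_logits: (src_len, max_fertility + 1) scores over copy counts.
    Returns (sum of fertilities, d_model); the first dimension is the
    predicted target length, so all positions can be decoded at once.
    """
    # Greedy fertilities shown for brevity; the paper instead samples
    # several fertility sequences in parallel and rescores the decoded
    # candidates with the autoregressive teacher ("noisy parallel decoding").
    fertility = fertility_logits.argmax(dim=-1)         # (src_len,)
    return enc_out.repeat_interleave(fertility, dim=0)  # copy state i fertility[i] times
```

Sequence-level knowledge distillation then trains the parallel decoder on the teacher's beam-search outputs instead of the ground truth, which removes much of the multimodality the decoder would otherwise have to fit.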

XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings

Authors: A. Royer et al.

Abstract: Style transfer usually refers to the task of applying color and texture information from a specific style image to a given content image while preserving the structure of the latter. Here we tackle the more generic problem of semantic style transfer: given two unpaired collections of images, we aim to learn a mapping between the corpus-level style of each collection, while preserving semantic content shared across the two domains. We introduce XGAN ("Cross-GAN"), a dual adversarial autoencoder, which captures a shared representation of the common domain semantic content in an unsupervised way, while jointly learning the domain-to-domain image translations in both directions. We exploit ideas from the domain adaptation literature and define a semantic consistency loss which encourages the model to preserve semantics in the learned embedding space. We report promising qualitative results for the task of face-to-cartoon translation. The cartoon dataset we collected for this purpose is in the process of being released as a new benchmark for semantic style transfer.

URL: https://arxiv.org/abs/1711.05139

Notes: goal : semantic style transfer (the two domains are very different in pixel space but have much in common semantically)

needs : conventional DiscoGAN/CycleGAN do not work when the two domains are very different in pixel space --> really why? (their pixel-level cycle-consistency loss preserves low-level structure, which is too restrictive for large shape changes)

challenge : (1) the two domains are very different in pixel space; (2) the mapping is many-to-many --> why is this a problem? (pixel-level cycle consistency assumes an approximately one-to-one mapping)

method :
- different pixel space --> semantic consistency constraint (a translation must preserve the shared semantic embedding)
- many-to-many --> teacher loss (helps by reusing pretrained semantic knowledge); a sketch of both losses follows
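
A minimal sketch of the two constraints, assuming hypothetical encoder/decoder/teacher modules (not the authors' code): the semantic consistency loss re-embeds a translated image and penalizes drift in the shared embedding space, and the teacher loss pulls the embedding toward the features of a fixed pretrained network.

```python
import torch
import torch.nn.functional as F

def semantic_consistency_loss(enc_src, dec_tgt, enc_tgt, x):
    """An image and its cross-domain translation should share one embedding."""
    z = enc_src(x)                   # embed x in the shared semantic space
    x_translated = dec_tgt(z)        # render the embedding in the other domain
    z_back = enc_tgt(x_translated)   # re-embed the translation
    # Any embedding-space distance works; squared L2 shown here.
    return F.mse_loss(z_back, z)

def teacher_loss(enc_src, teacher_net, x):
    """Keep the learned embedding close to a frozen pretrained network's
    features, reusing its semantic knowledge (e.g. a face-recognition
    network for the face-to-cartoon task)."""
    with torch.no_grad():
        target = teacher_net(x)      # teacher features are not updated
    return F.mse_loss(enc_src(x), target)
```

In the paper these terms are added to the usual reconstruction, domain-adversarial, and GAN losses of the dual autoencoder.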