A curated list of text-based image manipulation methods
Name | Links |
---|---|
102 Category Flower | Images Captions |
Caltech-UCSD Birds-200-2011 | Images Captions |
CoDraw | Data |
DeepFashion | Images Captions |
i-CLEVR | Data |
Multi-Modal-CelebA-HQ | Data |
Note on the date column: if a paper was posted to a preprint server (such as arXiv) before being accepted at a conference, the date shown is that of the first preprint release.
Date | Title | Venue | Citations | Paper | Code |
---|---|---|---|---|---|
2023/01 | Muse: Text-To-Image Generation via Masked Generative Transformers | - | N/A | arXiv Project Page | PyTorch |
2023/01 | FICE: Text-Conditioned Fashion Image Editing With Guided GAN Inversion | - | N/A | arXiv | - |
2023/01 | SEGA: Instructing Diffusion using Semantic Dimensions | - | N/A | arXiv | - |
2022/12 | CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics | AAAI | N/A | arXiv | PyTorch (Official) |
2022/12 | SINE: SINgle Image Editing with Text-to-Image Diffusion Models | - | N/A | arXiv | PyTorch (Official) |
2022/12 | The Stable Artist: Steering Semantics in Diffusion Latent Space | - | N/A | arXiv | - |
2022/11 | Null-text Inversion for Editing Real Images using Guided Diffusion Models | - | N/A | arXiv Official Page | PyTorch (Official) |
2022/11 | InstructPix2Pix: Learning to Follow Image Editing Instructions | - | N/A | arXiv Official Page | PyTorch (Official) |
2022/11 | DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization | - | N/A | arXiv | - |
2022/11 | Interactive Image Manipulation with Complex Text Instructions | WACV | N/A | arXiv | - |
2022/11 | Target-Free Text-guided Image Manipulation | AAAI | N/A | arXiv | - |
2022/11 | Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models | - | N/A | arXiv | - |
2022/11 | Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation | - | N/A | arXiv Project Page | - |
2022/10 | ManiCLIP: Multi-Attribute Face Manipulation from Text | - | N/A | arXiv | - |
2022/10 | Towards Open-Ended Text-to-Face Generation, Combination and Manipulation | ACMMM | N/A | ACM | - |
2022/10 | CLIP-PAE: Projection-Augmentation Embedding to Extract Relevant Features for a Disentangled, Interpretable, and Controllable Text-Guided Image Manipulation | - | N/A | arXiv | - |
2022/10 | One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulation | NeurIPS | N/A | arXiv | - (Official) |
2022/10 | Imagic: Text-Based Real Image Editing with Diffusion Models | - | N/A | arXiv Official Page | - |
2022/10 | Assessment of Image Manipulation Using Natural Language Description: Quantification of Manipulation Direction | ICIP | N/A | IEEE | - |
2022/10 | LDEdit: Towards Generalized Text Guided Image Manipulation via Latent Diffusion Models | BMVC | N/A | arXiv | - |
2022/10 | UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image | - | N/A | arXiv | PyTorch |
2022/10 | DiffEdit: Diffusion-based semantic image editing with mask guidance | - | N/A | arXiv | - |
2022/10 | Bridging CLIP and StyleGAN through Latent Alignment for Image Editing | - | N/A | arXiv | - |
2022/09 | Language-based Image Manipulation Built on Language-Guided Ranking | IEEE Transactions on Multimedia | N/A | IEEE | - |
2022/09 | StyleGAN-based CLIP-guided Image Shape Manipulation | CBMI | N/A | ACM | - |
2022/08 | Prompt-to-Prompt Image Editing with Cross Attention Control | - | N/A | arXiv Official Page | PyTorch (Official) |
2022/08 | DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | CVPR (2023) | N/A | arXiv Project Page | Dataset (Official) PyTorch |
2022/07 | Towards Counterfactual Image Manipulation via CLIP | ACMMM | N/A | arXiv | PyTorch (Official) |
2022/07 | Cross-modal Representation Learning and Relation Reasoning for Bidirectional Adaptive Manipulation | IJCAI | N/A | IJCAI | - |
2022/06 | DE-Net: Dynamic Text-guided Image Editing Adversarial Networks | AAAI | N/A | arXiv | PyTorch (Official) |
2022/05 | Generative Adversarial Network Including Referring Image Segmentation For Text-Guided Image Manipulation | ICASSP | N/A | IEEE | - |
2022/04 | Paired-D++ GAN for image manipulation with text | Machine Vision and Applications | N/A | Springer | - |
2022/04 | ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation | CVPR | N/A | arXiv | - (Official) |
2022/04 | VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance | ECCV | N/A | arXiv | PyTorch (Official) |
2022/04 | Referring Object Manipulation of Natural Images with Conditional Classifier-free Guidance | ECCV | N/A | ECVA | PyTorch (Official) |
2022/04 | Text2LIVE: Text-Driven Layered Image and Video Editing | ECCV | N/A | arXiv | PyTorch (Official) |
2022/03 | FlexIT: Towards Flexible Semantic Image Translation | CVPR | N/A | arXiv | PyTorch (Official) |
2022/03 | EnvEdit: Environment Editing for Vision-and-Language Navigation | CVPR | N/A | arXiv | PyTorch (Official) |
2022/03 | AnyFace: Free-style Text-to-Face Synthesis and Manipulation | CVPR | N/A | arXiv | - |
2022/02 | FEAT: Face Editing with Attention | - | N/A | arXiv | PyTorch |
2022/02 | Learning by Imagination: A Joint Framework for Text-based Image Manipulation and Change Captioning | IEEE Transactions on Multimedia | N/A | IEEE | - |
2022/02 | Name Your Style: An Arbitrary Artist-aware Image Style Transfer | CVPR (2023, Workshop) | N/A | arXiv CvF | PyTorch (Official) |
2022/02 | Interactive Image Generation with Natural-Language Feedback | AAAI | N/A | AAAI | - |
2021/12 | CLIPstyler: Image Style Transfer with a Single Text Condition | CVPR | N/A | arXiv | PyTorch (Official) |
2021/12 | Embedding Arithmetic for Text-driven Image Transformation | CVPR (2022, O-DRUM Workshop) | N/A | arXiv CvF | PyTorch (Official) |
2021/12 | CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields | CVPR | N/A | arXiv | PyTorch (Official) |
2021/12 | HairCLIP: Design Your Hair by Text and Reference Image | CVPR | N/A | arXiv | PyTorch (Official) |
2021/12 | More Control for Free! Image Synthesis with Semantic Diffusion Guidance | WACV (2023) | N/A | arXiv | PyTorch (Official) |
2021/12 | StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and Manipulation | WACV | N/A | arXiv | PyTorch (Official) |
2021/12 | GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models | ICML (2022) | N/A | arXiv MLR | PyTorch (Official) |
2021/12 | Generative Adversarial Network for Text-to-Face Synthesis and Manipulation with Pretrained BERT Model | IEEE FG | N/A | IEEE | - |
2021/11 | LatteGAN: Visually Guided Language Attention for Multi-Turn Text-Conditioned Image Manipulation | IEEE Access | N/A | IEEE | PyTorch (Official) |
2021/11 | Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model | CVPR (2022) | N/A | arXiv | PaddlePaddle (Official) PyTorch (Official) |
2021/11 | Blended Diffusion for Text-driven Editing of Natural Images | CVPR (2022) | N/A | arXiv CvF | PyTorch (Official) |
2021/11 | SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing | CVPR | N/A | arXiv | - |
2021/10 | DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation | CVPR | N/A | arXiv | PyTorch (Official) |
2021/10 | Language-Guided Global Image Editing via Cross-Modal Cyclic Mechanism | ICCV | N/A | CvF | PyTorch (Official) |
2021/10 | Each Attribute Matters: Contrastive Attention for Sentence-based Image Editing | BMVC | N/A | arXiv | (Official) |
2021/09 | Segmentation-Aware Text-Guided Image Manipulation | ICIP | N/A | IEEE | - |
2021/09 | Talk-to-Edit: Fine-Grained Facial Editing via Dialog | ICCV | N/A | arXiv | PyTorch (Official) |
2021/06 | Text-Guided Human Image Manipulation via Image-Text Shared Space | IEEE TPAMI | N/A | IEEE | - |
2021/06 | Grounded, Controllable and Debiased Image Completion with Lexical Semantics | CVPR (Causality in Vision Workshop) | N/A | CvF | - |
2021/06 | Learning by Planning: Language-Guided Global Image Editing | CVPR | N/A | arXiv | PyTorch (Official) |
2021/03 | StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery | ICCV | N/A | arXiv | PyTorch (Official) |
2021/03 | Text-Guided Style Transfer-Based Image Manipulation Using Multimodal Generative Models | IEEE TPAMI | N/A | IEEE | - |
2021/02 | Zero-Shot Text-to-Image Generation | ICML | N/A | arXiv MLR | PyTorch (Official) |
2020/12 | TediGAN: Text-Guided Diverse Face Image Generation and Manipulation | CVPR | N/A | arXiv | PyTorch (Official) |
2020/10 | Learning Cross-Modal Representations for Language-Based Image Manipulation | ICIP | N/A | ResearchGate | - |
2020/10 | Text-Guided Image Inpainting | ACMMM | N/A | ACM | - |
2020/10 | Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation | NeurIPS | N/A | arXiv | (Official) |
2020/10 | A Benchmark and Baseline for Language-Driven Image Editing | ACCV | N/A | arXiv | - (Official) |
2020/09 | SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning | EMNLP | N/A | arXiv | PyTorch (Official) |
2020/08 | Describe What to Change: A Text-guided Unsupervised Image-to-Image Translation Approach | ACMMM | N/A | arXiv | PyTorch (Official) |
2020/08 | Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions | ECCV | N/A | arXiv | PyTorch (Official) |
2020/08 | Text as Neural Operator: Image Manipulation by Text Instruction | ACMMM (2021) | N/A | arXiv ACM | PyTorch (Official) |
2020/08 | IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning | ACMMM | N/A | ACM | PyTorch (Official) |
2020/06 | Customizable GAN: A Method for Image Synthesis of Human Controllable | IEEE Access | N/A | IEEE | - |
2020/05 | Scones: Towards Conversational Authoring of Sketches | IUI | N/A | arXiv | TensorFlow (Official) |
2020/04 | Text-Guided Neural Image Inpainting | ACMMM | N/A | arXiv | PyTorch (Official) |
2020/02 | FACT: Fused Attention for Clothing Transfer with Generative Adversarial Networks | AAAI | N/A | AAAI | - |
2020/02 | Grounded and Controllable Image Completion by Incorporating Lexical Semantics | - | N/A | arXiv | - |
2020/02 | Image-to-Image Translation with Text Guidance | - | N/A | arXiv | - |
2020/01 | Progressive Semantic Image Synthesis via Generative Adversarial Network | VCIP | N/A | IEEE | - |
2019/12 | Image Manipulation with Natural Language using Two-sided Attentive Conditional Generative Adversarial Network | Neural Networks (2021) | N/A | arXiv | - |
2019/12 | ManiGAN: Text-Guided Image Manipulation | CVPR | N/A | arXiv | PyTorch (Official) |
2019/12 | Controlling Style and Semantics in Weakly-Supervised Image Generation | ECCV | N/A | arXiv | PyTorch (Official) |
2019/09 | Controllable Text-to-Image Generation | NeurIPS | N/A | arXiv | PyTorch (Official) TensorFlow |
2019/09 | Multi-mapping Image-to-Image Translation via Learning Disentanglement | NeurIPS | N/A | arXiv | PyTorch (Official) |
2019/08 | SIMGAN: Photo-Realistic Semantic Image Manipulation Using Generative Adversarial Networks | ICIP | N/A | IEEE Author | - |
2019/05 | Eevee: Transforming Images by Bridging High-level Goals and Low-level Edit Operations | CHI | N/A | ACM | - |
2019/04 | Text Guided Person Image Synthesis | CVPR | N/A | arXiv | - |
2019/03 | Bilinear Representation for Language-based Image Editing Using Conditional Generative Adversarial Networks | ICASSP | N/A | arXiv | PyTorch (Official) |
2019/03 | Language-based Colorization of Scene Sketches | SIGGRAPH Asia | N/A | ACM Author | TensorFlow (Official) |
2018/12 | Paired-D GAN for Semantic Image Synthesis | ACCV | N/A | Springer Author | PyTorch (Official) |
2018/12 | Sequential Attention GAN for Interactive Image Editing via Dialogue | ACMMM | N/A | arXiv | - |
2018/11 | Keep Drawing It: Iterative Language-based Image Generation and Editing | NeurIPS (ViGIL Workshop) | N/A | arXiv | - |
2018/11 | Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction | ICCV | N/A | arXiv | PyTorch (Official) |
2018/10 | Cross-Modal Style Transfer | ICIP | N/A | IEEE | PyTorch (Official) |
2018/10 | Learning to Globally Edit Images with Textual Description | - | N/A | arXiv | PyTorch (Official) |
2018/10 | Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language | NeurIPS | N/A | arXiv | PyTorch (Official) |
2018/08 | Language Guided Fashion Image Manipulation with Feature-wise Transformations | ECCV (Workshop) | N/A | arXiv | - |
2018/08 | LUCSS: Language-based User-customized Colourization of Scene Sketches | - | N/A | arXiv | - |
2018/07 | Semantic Image Synthesis via Conditional Cycle-Generative Adversarial Networks | ICPR | N/A | IEEE ResearchGate | - |
2018/07 | Semantics Images Synthesis and Resolution Refinement Using Generative Adversarial Networks | CSPS | N/A | Springer | - |
2018/05 | MC-GAN: Multi-conditional Generative Adversarial Network for Image Synthesis | BMVC | N/A | arXiv | PyTorch (Official) |
2018/04 | Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation | ECCV | N/A | arXiv | PyTorch (Official) |
2018/04 | Learning to Color from Language | NAACL | N/A | arXiv | PyTorch (Official) |
2017/12 | CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication | ACL | N/A | arXiv | PyTorch (Official) |
2017/12 | Interactive Image Manipulation with Natural Language Instruction Commands | NeurIPS (ViGIL Workshop) | N/A | arXiv | - |
2017/11 | Language-Based Image Editing with Recurrent Attentive Models | CVPR | N/A | arXiv | TensorFlow (Official) |
2017/10 | Be Your Own Prada: Fashion Synthesis with Structural Coherence | ICCV | N/A | arXiv | Torch (Official) |
2017/07 | Semantic Image Synthesis via Adversarial Learning | ICCV | N/A | arXiv | PyTorch |
2016/05 | Generative Adversarial Text to Image Synthesis | ICML | N/A | arXiv | Torch (Official) |
Pull requests that add resources are welcome.