Serious Discussion - Artists can search these databases for links to their work and flag them for removal


Thread author
Staff Member
Jan 8, 2011

HaveIBeenTrained uses CLIP retrieval to search the LAION-5B and LAION-400M image datasets. These are currently the largest public text-to-image datasets, and they are used to train models such as Stable Diffusion and Imagen, among many others.

Text-to-image datasets are typically shared as files that resemble enormous spreadsheets. Their main columns are:
  1. a link to an image on the internet
  2. a caption that describes that image, like:
    "Platform mp3 Album by Holly Herndon"
When it's time to train a generative AI system, organizations like Stability use those datasets to download the images from their links and present them to the model with their captions.
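
To make the spreadsheet picture concrete, here is a rough Python sketch of reading link/caption pairs. The column names `url` and `caption`, the example.com links, and the second row are placeholders I made up for illustration; the real dataset files use their own format and carry more columns.

```python
import csv
import io

# Hypothetical two-column layout. Real dataset files have more columns
# and may use different names; the URLs below are placeholders.
SAMPLE = (
    "url,caption\n"
    "https://example.com/cover.jpg,Platform mp3 Album by Holly Herndon\n"
    "https://example.com/cat.png,A cat sleeping on a sofa\n"
)

def iter_pairs(text):
    """Yield (url, caption) tuples from CSV text with 'url' and 'caption' columns."""
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        yield row["url"], row["caption"]

pairs = list(iter_pairs(SAMPLE))
for url, caption in pairs:
    # A training pipeline would download the image at `url` at this point
    # and present it to the model together with `caption`.
    print(url, "->", caption)
```

Note that the file itself only holds links and text. The images are fetched later, at training time.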

With HaveIBeenTrained, artists can search these databases for links to their work and flag them for removal.
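
As a simplified illustration of "search and flag", the sketch below matches links by website domain and then drops the flagged links from a list of rows. This is only an assumption about how an opt-out list could be applied; the actual site searches by text or image, and the function names, domains, and rows here are all hypothetical.

```python
from urllib.parse import urlparse

# Placeholder rows of (url, caption); not taken from any real dataset.
rows = [
    ("https://artist.example/work1.jpg", "Painting of a harbor"),
    ("https://other.example/photo.jpg", "Street photo at night"),
    ("https://artist.example/work2.jpg", "Sketch of a heron"),
]

def links_from_domain(rows, domain):
    """Return the URLs in `rows` whose host is exactly `domain`."""
    return [url for url, _ in rows if urlparse(url).hostname == domain]

def without_flagged(rows, flagged):
    """Return the rows whose URL is not in the `flagged` set."""
    return [(url, cap) for url, cap in rows if url not in flagged]

# An artist flags every link that points at their own site.
flagged = set(links_from_domain(rows, "artist.example"))
remaining = without_flagged(rows, flagged)
print(len(flagged), "flagged,", len(remaining), "remaining")
```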

We're incorporating new datasets as they are released, and we're also partnering with other organizations that collect and use image links, so HaveIBeenTrained can serve as a one-time opt-out tool that applies to every dataset used to train generative AI art tools.

Our solution builds upon retrieval tools [1,2,3] created by the LAION community that enable efficient search through very large collections of image-text pairs, based on kNN indices that are pre-computed using CLIP models pre-trained by OpenAI and LAION.
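
For anyone wondering what a kNN search over embeddings looks like, here is a minimal brute-force sketch in plain Python. It is not the LAION tooling: random vectors stand in for CLIP embeddings, the 64-dimension size is arbitrary, and the search compares the query against every vector, whereas the real tools rely on pre-computed indices to stay fast on billions of entries.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn(query, embeddings, k):
    """Return the indices of the k embeddings most similar to `query`, best first."""
    order = sorted(
        range(len(embeddings)),
        key=lambda i: cosine(query, embeddings[i]),
        reverse=True,
    )
    return order[:k]

random.seed(0)
dim = 64  # arbitrary size for the demo
# Random vectors standing in for image embeddings (assumption for illustration).
embeddings = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
# A query that is a slightly noisy copy of item 42, like a near-duplicate image.
query = [v + random.gauss(0, 0.01) for v in embeddings[42]]

top = knn(query, embeddings, k=5)
print(top)
```

The nearest neighbour of the noisy query is item 42 itself, which is the same idea as uploading an image and getting back the closest matches from the dataset.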

Related thread: Discussion Thread - AI training site stole his photos, then sued when he complained: Robert Kneschke's story