AI & Your Music

Is My Music in AI Training Data?

AI music models learn from huge collections of songs. Some of those collections are public, which means you can actually check whether your catalog is in them.

The short answer

You can check whether your music is in the public datasets used to train AI music models in under a minute, for free. Paste your song titles, search your name, or import your catalog from Spotify, and we cross-reference it against roughly 13.8 million tracks in the datasets that have circulated among AI developers. A match tells you your work is sitting in data that AI builders can reach. It does not tell you a specific model trained on it, because the companies building these models have not disclosed exactly what they used.

What "AI training data" means for music

To build a model that can generate or analyze music, developers need enormous amounts of song data. A few of the datasets they use are public. The two largest that matter for artists are LAION-DISCO-12M, a collection of about 12 million music tracks and their metadata, and the Free Music Archive. Together that is roughly 13.8 million tracks.

Important nuance: these public datasets mostly hold metadata and links (titles, artist names, album, duration), not the actual audio files. So a match means your song is catalogued in a dataset built for AI research and shared among developers. It is a strong signal of exposure, not a signed confession from any one company.

How to check if your music is in them

There are three ways to run the free check, depending on how much you want to look at:

  • Search by name. Enter your artist or songwriter name to see every track of yours that appears in the data.
  • Paste a list of titles. Drop in a set of song titles and check them in one pass.
  • Import from Spotify. Pull an artist's catalog or a playlist and check the whole thing at once.

Every option is free, with no account required. The same check also shows which of those songs are registered with the U.S. Copyright Office, because registration is what decides whether you can do anything about a match.
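For readers curious how a title check like this can work in principle, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of the actual matching logic: the row layout (artist, title, dataset), the function names, and the sample rows are all hypothetical, and a real check would likely need fuzzier matching than this exact-key lookup.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents, drop bracketed qualifiers and punctuation.

    Limitation of this sketch: characters outside a-z and 0-9 are discarded,
    so titles in non-Latin scripts would need a different normalization.
    """
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove qualifiers such as "(Remastered)" or "[Live]".
    text = re.sub(r"[\(\[].*?[\)\]]", " ", text.lower())
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return " ".join(text.split())


def build_index(rows):
    """Map normalized (artist, title) keys to the datasets they appear in."""
    index = {}
    for row in rows:
        key = (normalize(row["artist"]), normalize(row["title"]))
        index.setdefault(key, set()).add(row["dataset"])
    return index


def check_catalog(artist, titles, index):
    """Return {title: sorted dataset names} for the titles that match."""
    matches = {}
    for title in titles:
        found = index.get((normalize(artist), normalize(title)))
        if found:
            matches[title] = sorted(found)
    return matches


# Made-up sample rows standing in for dataset metadata (hypothetical).
sample_rows = [
    {"artist": "Example Artist", "title": "Night Drive (Remastered)",
     "dataset": "LAION-DISCO-12M"},
    {"artist": "Example Artist", "title": "Night Drive",
     "dataset": "Free Music Archive"},
    {"artist": "Someone Else", "title": "Café Song",
     "dataset": "Free Music Archive"},
]

index = build_index(sample_rows)
result = check_catalog("Example Artist", ["Night Drive", "Unreleased Demo"], index)
print(result)
```

Because the datasets mostly hold metadata rather than audio, a lookup like this compares names and titles only. Two different songs with the same artist and title would look identical to it, which is one reason a match is a signal rather than proof.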

What a match means, and what it does not

A match means: your track is in a public dataset that AI developers can access. That is worth knowing. It is the reason to take the next step.

A match does not mean a specific AI company trained on your song, that you are owed money today, or that anyone broke the law. Those questions are being fought out in court right now. What you control is whether you are positioned to act if and when the answer comes back.

What to do if you find a match

Register the songs that are exposed and unregistered. Both paths that could pay a creator depend on it. The proposed CLEAR Act (not yet law) would let registered copyright owners pursue penalties when AI companies fail to disclose their use of that work. And statutory damages for copyright infringement run up to $150,000 per work for willful infringement, but they are generally available only when the work was registered before the infringement began or within three months of first publication. An unregistered song is locked out of both, no matter how clearly it shows up in the data.

The check is free. Filing a registration costs $65 per work at copyright.gov. Knowing which songs to prioritize is the whole point of running the check first.
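The prioritization above comes down to a simple filter plus some arithmetic, sketched here in Python. The `Song` class, its field names, and the sample catalog are hypothetical and exist only for illustration; the $65 figure is the per-work filing cost quoted above.

```python
from dataclasses import dataclass

FILING_FEE_USD = 65  # per-work filing cost quoted in the text above


@dataclass
class Song:
    title: str
    in_dataset: bool   # appeared in a public dataset (the "match")
    registered: bool   # registered with the U.S. Copyright Office


def to_register(songs):
    """Songs that are exposed in the data but not yet registered."""
    return [s for s in songs if s.in_dataset and not s.registered]


# Hypothetical catalog for illustration only.
catalog = [
    Song("Night Drive", in_dataset=True, registered=False),
    Song("Morning Light", in_dataset=True, registered=True),
    Song("Unreleased Demo", in_dataset=False, registered=False),
    Song("Last Train", in_dataset=True, registered=False),
]

priority = to_register(catalog)
total_cost = FILING_FEE_USD * len(priority)

print([s.title for s in priority])
print(f"Estimated filing cost: ${total_cost}")
```

In this made-up catalog, two of the four songs are both exposed and unregistered, so they are the ones to file first.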

Frequently asked

Does a match mean an AI company stole my song?

No. A match means your track appears in a public dataset that AI developers can access. It is not proof that any specific company trained on it. It is a signal worth knowing, and a reason to make sure you are registered, because registration is what lets you act if your song ever was used.

Which datasets do you check against?

We cross-reference your catalog against LAION-DISCO-12M (about 12 million music tracks) and the Free Music Archive, roughly 13.8 million tracks in total. These are public datasets that have circulated among AI music developers.

Is it free to check?

Yes. The AI training-data check is free for your whole catalog, every song, with no signup. You can search by artist or songwriter name, paste a list of titles, or import an artist or playlist from Spotify.

My music is not in the data. Am I safe?

Not necessarily. These are the public datasets we can see. AI companies also use private and licensed data they have not disclosed. The absence of a match is not a guarantee your work was never used, which is another reason registration matters: it protects you regardless of which dataset a song ended up in.

Check your own music free

See which of your songs appear in the public AI training datasets, and which are registered with the U.S. Copyright Office. Free, no signup.