Discussion about this post

Declan Dunn

Makes it clearer in many ways, thanks. Snippets are an interesting take on this, though there are already decisions on snippets. This argument is in line with the thinking in Authors Guild v. Google, as far as snippets and fair use go.

Didn't consider the length of what's being output. It is all snippets. Even friends who've had SEO content scraped would find sections, but not whole articles, verbatim.

And I never considered this as memory. You know the legal verbiage game much better than I do, I think. It's that whole output thing, but if it's pieces, that's sort of fair use.

The Thomson Reuters summary judgement against the fair use defense surprised me, as Ross legally bought the data. Thomson sold it through a third party.

And those legal headnotes, as I read it, are considered something like CliffsNotes, though the judge found enough to merit upholding the copyright. What I read are basically snippet arguments.

There's also the negative business impact, and a competitor working around a license that was turned down, which add much more to the decision than just content inside an AI system.

But that's what it is: it wasn't because of output, it was a violation on the input. And so much data has been purchased from third parties for so many years, especially with LLMs being insatiable.

I would have thought the third party purchase might be one way around this.
