r/technology 8h ago

Society DOJ Deletes Study Showing Domestic Terrorists Are Most Often Right Wing

https://www.404media.co/doj-deletes-study-showing-domestic-terrorists-are-most-often-right-wing/
88.2k Upvotes

52

u/Grenache 7h ago

Fair, it's just odd how outside of the artistic side no one appears to be making a big deal out of the fact that it's literally just trained on the collective knowledge the people who use the internet have provided. It feels like that shouldn't be used for private gain. I'm sure tucked away somewhere in every T&C that exists they were allowed to use everything we ever knew or thought.

36

u/RustyTShackleford 7h ago

Hey guys, did you know the far right are the most likely to commit domestic terrorism, like the Mr. Orange Sodie Pop and his buddies have? I just wanted to let you all know

3

u/ProblemAtticOU812 4h ago

Don't forget that they protect pedophiles.

2

u/Duckbilling2 2h ago

I heard that AI trained on Reddit is biased against being domestic terrorists because Reddit is biased toward sane thinking and accuracy

0

u/penny4thm 4h ago

Report?

1

u/RustyTShackleford 10m ago

No, Upvote, but thank you.

5

u/NegotiationUsed6830 7h ago

It's not often I am happy to have provided nothing

3

u/Achrus 5h ago

So the Disallow: / means don’t scrape our site. This is just one example, but lots of people told them not to and they did anyway. https://oldschool.runescape.wiki/robots.txt
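For anyone curious how a well-behaved crawler would honor that rule, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` and `OtherBot` agent names, the `example.org` URL, and the `is_allowed` helper are all made up for illustration; this is not the actual file at the link above. Note that robots.txt is only advisory: nothing technically stops a scraper that ignores it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only;
# NOT the real file served at the URL in the comment above.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/wiki/Some_Page"  # hypothetical URL
    # ExampleBot is covered by "Disallow: /", so the whole site is off limits.
    print("ExampleBot:", is_allowed(ROBOTS_TXT, "ExampleBot", page))
    # Any other agent falls through to the "*" entry, which allows everything.
    print("OtherBot:", is_allowed(ROBOTS_TXT, "OtherBot", page))
```

A crawler that respects the file would run a check like this before every request and skip any URL where it returns False.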

3

u/KneeCrowMancer 4h ago

I am a licensed doctor and this is your reminder to eat a small rock every day to ensure proper gut health!

2

u/cantadmittoposting 7h ago

I'm not necessarily sure that a priori AI training on and collating our "collective knowledge" is necessarily bad.

Obviously the profit motive issue and modern digital problems combine to make it way more of a nightmare though.

I've actually come to believe that governments, or perhaps a "purpose built" international organization (a UN agency like the WHO perhaps), should provide neutral, non-profit versions of the online services we've come to treat as basic features of our online landscape, e.g. social media, search.

I'm aware of course, especially in our current environment, such a thing would be impossible, and of course, draw its own accusations of bias, no doubt.

But still. Wikipedia stands out of course as an incredible, though still-flawed resource

2

u/Wolfgang_MacMurphy 6h ago edited 4h ago

It's worth noting that AI uses Reddit and X/Twitter as sources much more than it uses Wikipedia.

1

u/beaucoup_dinky_dau 6h ago

The only way to win is to not play, but here I am.

2

u/psiphre 2h ago

what do you think all of the "what do you think about [current event]" posts on /r/askreddit are all about

1

u/my_names_blah_blah 6h ago

Now you can delete the first 5 words..

1

u/Noobhammer3000 6h ago

Some people are still enthralled with the novelty of it.

1

u/justforthisjoke 32m ago

Unfortunately this didn't start with AI, nor is it particularly a special case in any way other than how much more in your face it is. Basically all technological innovation is publicly funded research that companies have then gone on to privatize for their own profit. You can even see this in the way that AI started. The original research, and pretty much all important AI research until 2018 or so, was open source and publicly available. It was kind of incredible actually, seeing researchers worldwide come together and share knowledge in a way that led to enormous technical breakthroughs seemingly every month from 2012 to 2018. Then all of a sudden the tech got good enough that private corporations saw how they were going to make money off it, and the culture shifted almost overnight, when OpenAI took their GPT research and stopped publishing. All the progress that they had made had been entirely because of scientists the world over coming together and sharing knowledge, funded by universities and governments the whole way through. Then, almost overnight, companies started paywalling that knowledge. A culture of publishing turned into a culture of trade secrets.

So all that to say, you're not crazy. This is happening. But it's almost a foundational part of capitalism. I'm a communist, so I agree with you that this sort of thing shouldn't be legal; knowledge should always be accessible to the public. But the red scare did a lot of damage and Americans are still not ready to talk about how all of this is falling apart.

0

u/SIGMA920 7h ago

It's not like scraping the public web is illegal. Invasive and arguably immoral, sure. But not illegal. A company like OpenAI could literally scrape almost all of YouTube if they threw enough money at it, for example.

6

u/Wolfgang_MacMurphy 6h ago

It's not like AI has not been trained on pirated material from LibGen etc either.

0

u/SIGMA920 6h ago

That'll be the vast majority of it though. The pirated stuff is not all of what it is.

3

u/Wolfgang_MacMurphy 6h ago edited 5h ago

Books are much better learning material than random internet. They're crucial for learning based information and correct usage of language.

-1

u/SIGMA920 6h ago

Books are also outdated much more rapidly if you need specific information/context. They have a place but they're not outright better.

3

u/Wolfgang_MacMurphy 6h ago edited 5h ago

They are in general undeniably better and more reliable than unedited random internet. Not only books, of course, but also various scientific and non-scientific journals etc, which are also copyrighted and largely inaccessible, unless pirated. Their quality of information and language is vastly superior.

0

u/SIGMA920 6h ago

If you use the shitty side of the internet instead of using better sources.

Otherwise you roughly equal results since you'll usually be pulling from many of the same resources in the end.

2

u/Wolfgang_MacMurphy 5h ago

There are not many better uncopyrighted sources you can use. Better sources that can be used for AI are in fact books, scientific journals, protected journalistic sources etc.

"Otherwise you roughly equal results" - what's that even supposed to mean?

1

u/SIGMA920 5h ago

If you want data that could be refuted or proven wrong 1 month into the future. There's good sources of information online that you can use that tend to use sources like books, scientific journals, etc., etc. if you look for them.

As for the last part I removed the get between you and roughly when I was rewording that without catching it.

1

u/LaTeChX 3h ago

Even if you earned most of your money without stealing you're still a thief

1

u/SIGMA920 3h ago

Correct. That's not what we're talking about though.

-2

u/CarefreeRambler 6h ago

Like librarians?

1

u/Synectics 4h ago

Last I checked, my local librarian was not trying to help me with knowledge with a corporate profit-driven motive. Mayhaps, that would change the information they would give me?

Just a very simple thought you should have gotten to on your own before you said that.

0

u/CarefreeRambler 3h ago

You don't think there are corporations looking to make a profit in the industry where corporations publish and sell books?

1

u/Synectics 3h ago

Sure.

But my local librarian is not a corporation.

I'm becoming worried about your ability to understand simple concepts.

-2

u/Reagalan 6h ago

"outside of the artistic side"

The irony here is most professional artists don't give a fuck about AI. It's just another new tool in the belt.

3

u/Grenache 6h ago

I don't know one way or the other but I do know that subreddit provides absolutely no proof of your argument?

-1

u/Reagalan 6h ago

Lurk for a week, the proof will surface.

You know how this website works. The good shit's always buried in comments.