Discussion: Running out of data to train AI programs.
RonO
2024-12-11 22:17:12 UTC
https://www.nature.com/articles/d41586-024-03990-2

The claim in this article is that soon the AI programmers will run out
of data to train their AI on. If they want to improve their AI, they
will have to create their own data, but how are they going to do that?

My guess is that they will identify what data would be most beneficial
to have and try to generate it. That could, for example, direct medical
research toward generating useful data.

They could also spend decades weeding through the data that has already
been used and throwing out the trash data. They could also go through
the scientific experiments in a field, keep the good data but remove
the researchers' conclusions, see what conclusions the AI comes up
with, cross-check those conclusions to see whether the original
researchers missed anything, and use the result to train some other AI.
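
From what I have read, that kind of weeding is roughly what the big AI
labs already do with heuristic quality filters and deduplication before
training. A crude sketch of the idea in Python; the thresholds here are
made up for illustration, and real pipelines use many more signals:

import hashlib

def looks_like_trash(doc: str) -> bool:
    # Crude quality heuristics; real filters use many more signals.
    words = doc.split()
    if len(words) < 50:                     # too short to be informative
        return True
    alpha = sum(c.isalpha() for c in doc)
    if alpha / max(len(doc), 1) < 0.6:      # mostly symbols or markup
        return True
    if len(set(words)) / len(words) < 0.3:  # highly repetitive text
        return True
    return False

def dedupe_and_filter(corpus):
    # Drop exact duplicates and low-quality documents.
    seen = set()
    for doc in corpus:
        digest = hashlib.sha1(doc.encode("utf-8")).hexdigest()
        if digest in seen or looks_like_trash(doc):
            continue
        seen.add(digest)
        yield doc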

Ron Okimoto
erik simpson
2024-12-11 22:45:28 UTC
Post by RonO
https://www.nature.com/articles/d41586-024-03990-2
The claim in this article is that soon the AI programmers will run out
of data to train their AI on. If they want to improve their AI, they
will have to create their own data, but how are they going to do that?
[snip]
Why not let AI create its own data? Then we wouldn't have to worry
about it. And think of the great literature and movies!
RonO
2024-12-11 23:02:46 UTC
Post by erik simpson
[snip]
Why not let AI create its own data? Then we wouldn't have to worry
about it. And think of the great literature and movies!
That results in increased bogus output by the AI.

AIs start to "hallucinate" when fed data generated by other AIs,
according to one article I recall reading. I think one poster responded
that AI hallucination can be caused by other factors too. Maybe the AIs
generate data in a format that they were not trained to handle, and
that causes issues in assimilating AI-generated data.
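
That degradation has been described as "model collapse", and the flavor
of it is easy to see with a toy example: fit a simple model to data,
sample new "data" from the model, refit, and repeat. Each generation
sees only a finite sample of the previous model, so estimation error
compounds and the original distribution is gradually lost. A sketch in
Python (the sample size and generation count are arbitrary):

import random
import statistics

# Start from "real" data: draws from a standard normal distribution.
data = [random.gauss(0, 1) for _ in range(200)]

for generation in range(20):
    # "Train" a model: estimate mean and stdev from the current data.
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    print(f"gen {generation:2d}: mean={mu:+.3f} stdev={sigma:.3f}")
    # The next generation trains only on this model's own output.
    data = [random.gauss(mu, sigma) for _ in range(200)]

# The estimates drift further from the original (0, 1) with each
# generation; rare, tail-of-the-distribution events are the first
# casualties.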

Ron Okimoto
erik simpson
2024-12-11 23:33:20 UTC
Post by RonO
[snip]
That results in increased bogus output by the AI.
AIs start to "hallucinate" when fed data generated by other AIs,
according to one article I recall reading. I think one poster responded
that AI hallucination can be caused by other factors too. Maybe the AIs
generate data in a format that they were not trained to handle, and
that causes issues in assimilating AI-generated data.
Ron Okimoto
I was joking about AI -> AI generating anything sensible. AI is very
useful for detecting patterns in all kinds of data. Applying it to
weather data could save millions if not billions of dollars. AI
navel-gazing isn't going to work.
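
To make the pattern-detection point concrete, here is a toy anomaly
flagger over a synthetic temperature series in Python. It is just a
rolling z-score, nothing like a real forecasting system, and all the
numbers are invented:

import math
import random

def flag_anomalies(series, window=24, threshold=3.0):
    # Flag points more than `threshold` rolling standard deviations
    # away from the mean of the preceding `window` observations.
    flagged = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        mean = sum(past) / window
        var = sum((x - mean) ** 2 for x in past) / (window - 1)
        sd = max(math.sqrt(var), 1e-9)
        if abs(series[i] - mean) / sd > threshold:
            flagged.append(i)
    return flagged

# Synthetic hourly temperatures: a daily cycle plus noise...
temps = [10 + 5 * math.sin(2 * math.pi * h / 24) + random.gauss(0, 0.5)
         for h in range(240)]
temps[180] += 15                 # ...and one bad sensor spike
print(flag_anomalies(temps))     # should report index 180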
LDagget
2024-12-12 02:49:05 UTC
Post by erik simpson
[snip]
I was joking about AI -> AI generating anything sensible. AI is very
useful for detecting patterns in all kinds of data. Applying it to
weather data could save millions if not billions of dollars. AI
navel-gazing isn't going to work.
The problem remains the lack of intelligence.
For this I cite heavy promoters of AI.
Martin Harran
2024-12-12 14:17:34 UTC
Post by erik simpson
[snip]
I was joking about AI -> AI generating anything sensible. AI is very
useful for detecting patterns in all kinds of data. Applying it to
weather data could save millions if not billions of dollars. AI
navel-gazing isn't going to work.
I realise you were being tongue-in-cheek, but I would actually worry
about the possibility of this. We see something along the same lines in
Internet tracking, where the things you like are constantly monitored
and you are shown ever more of the same or similar content. This is a
particular problem in social media, where people are fed more and more
material that reinforces their political and cultural biases. I shudder
to think what sort of stuff we might get if AI turns to a feeding
frenzy on its own output, bearing in mind how people generally are
becoming less and less adept at sifting out the utter rubbish :(
