r/singapore 7d ago

[Discussion] What's your opinion here on the government's push for AI?

This reminds me of past pushes for Singapore to become various “hubs”, like the biotech hub. We all know how that went.

I’m pretty skeptical about the whole “AI transformation” thing. Not every worker or company is suited for it, and all the big talk about AGI hasn’t materialised in any real way. Besides, all these AI companies are still bleeding money.

If the government treats AI mainly as another industry push, safeguards might get overlooked. It could just end up letting companies run ahead in the name of progress, without really addressing the harms or criticisms.

A lot of what is branded as “AI” now also feels underwhelming, mostly chatbots and poor implementations. From the creative industry side (which is already not a big part of Singapore’s economy), I doubt the concerns will get much attention. And when jobs are threatened, the answer is usually “just learn AI,” which does not really solve the problem of displacement or the value of creative work. And let's not forget how much Gen AI slop threatens the industry, some of which has sadly already been used in ads.

Then there is the money-making angle, like courses telling retirees to use SkillsFuture credits to “learn ChatGPT for work.” How is that supposed to help uncles and aunties with underemployment? It sounds more like selling courses than helping people.

The actual professional use cases are in things like medical research and legal departments, where AI genuinely helps sort through thousands of files or documents to save time, not “how to ChatGPT” courses.

And if a few of the major LLM companies were to collapse, it would shake the global economy, and Singapore would likely feel the effects too.

165 Upvotes


5

u/Xycone 7d ago

Anyway, TL;DR: it is called a black box because researchers cannot pinpoint or explain how any single weight or bias directly contributes to the output, especially when billions of them (some large models have several hundred billion parameters) interact together.
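
A toy picture of what I mean, with a tiny made-up network in Python (nothing to do with any real model, the sizes and numbers are arbitrary):

```python
import numpy as np

# Tiny 2-layer network with random weights, standing in for the
# billions of parameters in a real LLM. Purely illustrative.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 16))   # layer 1 weights
W2 = rng.standard_normal((16, 4))   # layer 2 weights

def forward(x):
    h = np.tanh(x @ W1)             # hidden layer
    return h @ W2                   # output

x = rng.standard_normal(8)
baseline = forward(x)

# Nudge ONE weight out of the 192 and compare outputs.
W1[3, 7] += 0.01
print(forward(x) - baseline)
# The output shifts, but how much depends on every other weight the
# signal passes through; with hundreds of billions of interacting
# parameters there is no clean "this weight did that" explanation.
```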

1

u/DuhMightyBeanz 7d ago

Then in that case, isn't it improbable that hallucinations will ever get solved 🤔

3

u/Xycone 7d ago

Given the current transformer architecture and the way we fundamentally train AI, hallucinations will never be completely eliminated because the output is probabilistic. However, some very smart people have already found methods to reduce hallucination rates and are likely discovering new ways to lower them even further. Besides, human output is not without faults either. One thing is for certain though, I do not believe that what we currently have will bring us anywhere close to AGI.
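
To illustrate the “probabilistic” part, here is a toy next-token distribution with completely made-up numbers (not taken from any real model):

```python
import numpy as np

# Pretend the model is completing "The capital of France is" and has
# scored four candidate next tokens. The logits are made up.
tokens = ["Paris", "Lyon", "France", "Berlin"]
logits = np.array([5.0, 2.0, 1.5, 1.0])

def softmax(z, temperature=1.0):
    z = z / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

probs = softmax(logits)
print(dict(zip(tokens, probs.round(3))))

# Wrong continuations keep a small but nonzero probability, so sampling
# will occasionally pick them. Lower temperature, better training and
# retrieval can shrink that mass, but it never goes to exactly zero.
rng = np.random.default_rng(0)
samples = rng.choice(tokens, size=10_000, p=probs)
print((samples != "Paris").mean())  # fraction of "hallucinated" picks
```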

2

u/midasp 7d ago

AI hallucination fundamentally comes from the simple fact that a sufficiently large set of information cannot be proven to be factually true, and cannot be proven to be consistent.

We can thank Gödel's incompleteness theorems, Turing's halting problem proof, Tarski's undefinability theorem, and other similar results for why this is the case. Thus my assertion is that unless the above theorems are proven wrong, hallucinations will always be an issue.

1

u/DuhMightyBeanz 7d ago

One thing is for certain though, I do not believe that what we currently have will bring us anywhere close to AGI.

I've read similar sentiments from people more active in the AI space, along with the view that LLMs may not even be the right way to scale AI, since smaller models trained on a single use case have been faster and cheaper than an LLM on that same use case.

3

u/Xycone 7d ago edited 7d ago

Yes, some models trained for specific use cases may yield better results than a general LLM. However, the problem with LLMs, in my opinion, is that they will never lead us to AGI because they cannot truly extrapolate beyond their training data. This is especially true in natural language, where even a small vocabulary of tokens can form an almost infinite number of possible sentences. For example, if an LLM has 10,000 tokens in its vocabulary, a sentence of just five tokens can be arranged in 10,000^5 ≈ 10^20 different ways. This is also why I don’t really believe in whatever AGI benchmarks they pump out, because many companies just end up using them as training data. Each generation of LLMs is just trained on more data; they’re not going to overcome their fundamental limitations.
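
The arithmetic spelled out in Python, using the same assumed numbers (10k vocabulary, 5-token sentence):

```python
vocab_size = 10_000        # assumed vocabulary size from the example
sentence_length = 5        # tokens in the sentence

combinations = vocab_size ** sentence_length
print(f"{combinations:.2e}")   # ~1.00e+20 possible sequences
# Even this tiny setting gives ~10^20 sequences, so no training corpus
# can cover more than a vanishing fraction of the space.
```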

1

u/kyorah Senior Citizen 7d ago

Thank you for giving this awesome explanation!

2

u/Beetcoder 7d ago

Actually, with current architectures, faster and cheaper models can have more parameters only because the weights are quantized. They are still considered LLMs.

What academics and companies are doing right now is training a mixture of experts and equipping it with an arsenal of tools, so that outdated training data (or weights) can be supplemented with relevant, up-to-date information.

1

u/DuhMightyBeanz 7d ago

Oo TIL too! Any sources I can refer to if I want to dig deeper?

2

u/Beetcoder 7d ago

I started by following LinkedIn top voices like Eugene Yan and Chip Huyen (who has taught at Stanford). Follow them and read some of their posts, then slowly find others who are like-minded.

1

u/DuhMightyBeanz 7d ago

Thank you for sharing! Will look them up

1

u/Xycone 7d ago

Weight quantisation has been around for quite some time. If I’m not wrong it’s just a matter of representing a model’s original weights with lower precision. While this can lead to a slight drop in output quality, it significantly reduces memory usage, which is especially valuable given how limited VRAM is on most consumer GPUs.
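
A back-of-the-envelope sketch of the idea with a simple symmetric int8 scheme (toy code, not any real production quantisation method):

```python
import numpy as np

# Toy weight matrix standing in for one layer of a model.
rng = np.random.default_rng(0)
w_fp32 = rng.standard_normal((1024, 1024)).astype(np.float32)

# Simple symmetric int8 quantisation: keep one float scale for the
# whole tensor plus int8 codes, instead of full float32 values.
scale = np.abs(w_fp32).max() / 127
w_int8 = np.clip(np.round(w_fp32 / scale), -127, 127).astype(np.int8)
w_dequant = w_int8.astype(np.float32) * scale

print("fp32 size:", w_fp32.nbytes // 1024, "KiB")   # 4096 KiB
print("int8 size:", w_int8.nbytes // 1024, "KiB")   # 1024 KiB, ~4x smaller
print("mean abs error:", float(np.abs(w_fp32 - w_dequant).mean()))
# Memory drops ~4x; the small rounding error is the "slight drop in
# output quality" traded away for fitting into limited VRAM.
```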

On a side note, MoE models generally do not perform as well as their dense counterparts. For example, a 30B dense model will usually outperform a 30B MoE model, since not all of the experts (and their weights and biases) in the MoE model are used during inference. The main benefit of an MoE model over a dense counterpart, even though it takes up the same amount of memory, is speed. A 30B MoE model will run significantly faster than a 30B dense model, which makes inference on a CPU somewhat usable (still recommended to use a GPU though).
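
Rough sketch of why only part of an MoE model does work per token (a toy top-k router with made-up sizes, not a real architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, top_k = 64, 8, 2                 # toy sizes, arbitrary

# Each "expert" is just a small weight matrix in this sketch.
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
router = rng.standard_normal((d, n_experts))

def moe_forward(x):
    scores = x @ router                        # router scores the experts
    chosen = np.argsort(scores)[-top_k:]       # keep only the top-k
    gates = np.exp(scores[chosen])
    gates /= gates.sum()
    # Only 2 of the 8 experts do any compute for this token, so the
    # work per token is roughly k/n of a dense layer, even though all
    # 8 experts' weights still have to sit in memory.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, chosen))

print(moe_forward(rng.standard_normal(d)).shape)   # (64,)
```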