ChatGPT is rubbish

Posts: 1754
Full Member
Topic starter
 
[#13531653]

ChatGPT seems to be highly rated, and AI is seen as the future. However, how can it be used if it can't answer simple questions correctly?

I asked v5.0 to name the US states ending in the letter o.

It states that 'There's only one U.S. state that ends in the letter "o" - Ohio. It's got the distinction all to itself'.

Do Colorado, Idaho and New Mexico not exist as US states anymore?

How is this the future?


 
Posted : 08/08/2025 11:47 am
Posts: 23334
Free Member
 
name the US states ending in the letter o
 
ChatGPT said:

Only one U.S. state name ends with the letter “o”: Ohio.

 
You said:
what about colorado
 
ChatGPT said:

You’re right — Colorado also ends in “o.”

So, the complete list of U.S. states ending with the letter “o” is:

  • Colorado

  • Ohio

 
You said:
and idaho
 
ChatGPT said:

Exactly — Idaho makes it three.

US states ending in “o”:

  • Colorado

  • Idaho

  • Ohio

 
You said:
and new mexico
 
ChatGPT said:

Yep — New Mexico belongs on the list too.

Final list of U.S. states ending in “o”:

  • Colorado

  • Idaho

  • New Mexico

  • Ohio

 
You said:
list the US states ending in o
 
ChatGPT said:

Here’s the full list of U.S. states whose names end with the letter “o”:

  • Colorado

  • Idaho

  • New Mexico

  • Ohio


 
Posted : 08/08/2025 11:52 am
sirromj reacted
Posts: 23334
Free Member
 

progress....


 
Posted : 08/08/2025 11:52 am
Posts: 7431
Full Member
 

Google's AI answer includes a disclaimer 🙂 
"how many US states end in the letter O

The United States has 4 states that end with the letter "O". They are: 
  • Colorado
  • Idaho
  • New Mexico
  • Ohio 
AI responses may include mistakes."

 
Posted : 08/08/2025 11:55 am
Posts: 9136
Full Member
 

At least they did not claim that spaghetti-o belongs on the list.


 
Posted : 08/08/2025 11:56 am
Posts: 9010
Free Member
 

I dunno, some people are much more concerned about other aspects of it. Stealing copyrighted artwork, for example. Or how about 24-hour surveillance (and data for training) via wearable AI tech? 

Or that the very name of OpenAI, alluding to openness and the sharing of knowledge (especially if you've ever been involved in the open source software community), is far from the reality of the corporation.


 
Posted : 08/08/2025 11:57 am
ChrisL reacted
 DrJ
Posts: 14006
Full Member
 

I’ve been using it quite a bit lately to help translate some French legal documents and mostly it’s been very useful but occasionally it gets the translation completely wrong. Not in some detailed grammatical way - more like it is looking at a completely different text. 


 
Posted : 08/08/2025 12:18 pm
Posts: 901
Full Member
 

It amazes me how often it produces completely false information. When I've played around with it to verify facts about a particular area of historical knowledge that is very familiar to me, and of which countless books, documents and reports exist online - a huge dataset - it inexcusably gets dates wrong and mixes up the characters in events; it's not even close on what is a well-defined historical record. Amusingly, I've told it it's wrong and it admits it made a mistake, tries again, and still gets it wrong again... and again... and again. It rewrites history, gets it completely wrong and presents it as fact.

The more worrying bit is that without deep historical knowledge, i.e. anything beyond a normal awareness, the first incorrect answer would be taken as fact. It may create a new generation who do the standard 5 mins of research online and declare themselves experts who don't realise how wrong they're getting it...

Not convinced as of yet... it needs to be more transparent on verifying validity of sources, and provide an estimation of certainty of its output for the end user.


 
Posted : 08/08/2025 12:45 pm
TheGingerOne reacted
Posts: 988
Free Member
 

I'm using Claude to generate code and graphs for investment purposes and I'm very impressed. First attempt told me what Python modules to install and the graphs produced looked OK, but on closer inspection were not correct so I told it:

That code ran ok but the figures produced are different to the ones I get from the TradingView website. Your code for AAPL produced a MACD of -1.3071 and a Signal of -0.5633 but TradingView produced a MACD of -7.83 and a Signal of -5.61  

and the reply was:

You're seeing different values because TradingView uses Simple Moving Averages (SMA) for MACD calculation by default, while my code uses Exponential Moving Averages (EMA). This is a common source of confusion!

Let me update the code to match TradingView's default calculation

Job jobbed.
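As an aside, the EMA/SMA discrepancy Claude describes is easy to sketch in plain Python. The 12/26/9 parameters below are the standard MACD defaults, assumed rather than taken from the post, and the claim that TradingView defaults to SMA is Claude's, not verified here:

```python
# Minimal MACD sketch: the only change between the two variants is the
# moving-average function, which is exactly the discrepancy described above.
def ema(values, n):
    # exponential moving average with smoothing factor k = 2 / (n + 1)
    k = 2 / (n + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(v * k + out[-1] * (1 - k))
    return out

def sma(values, n):
    # simple moving average over a trailing window of up to n values
    return [sum(values[max(0, i - n + 1):i + 1]) / min(i + 1, n)
            for i in range(len(values))]

def macd(close, fast=12, slow=26, signal=9, ma=ema):
    # MACD line = fast MA minus slow MA; signal line = MA of the MACD line
    line = [f - s for f, s in zip(ma(close, fast), ma(close, slow))]
    return line, ma(line, signal)
```

Running `macd(prices)` and `macd(prices, ma=sma)` over the same closing prices gives two different sets of figures, which would account for the mismatch against the website.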


 
Posted : 08/08/2025 1:16 pm
Posts: 20662
Full Member
 

Posted by: endoverend

Not convinced as of yet... it needs to be more transparent on verifying validity of sources,

Quite often it just makes up its own sources. There are several examples I've seen where it's cited references that simply don't exist.

 

Posted by: endoverend

It may create a new generation who do the standard 5 mins of research online and declare themselves an expert who don't realise how wrong they're getting it...

I remember my uni days of combing through the archives, finding obscure German chemistry experiments from the early 1900s, and having to verify and cite the reference (obviously the lecturer knew it and expected you to cite it; failure to cite it, or getting it wrong, was a clear case of failing to do the research, or of copying someone else).

Now, everyone spends 3 minutes on Google and produces an answer which is unquestionably wrong (sometimes dangerously so) on a worryingly large number of occasions.

You actually have to be quite good at what you're researching to know that it's wrong and question it again. Anyone without that knowledge is just going to go "ah well AI says it's [thing]..."

The mis-information and dis-information potentials are pretty horrifying.


 
Posted : 08/08/2025 1:27 pm
sirromj reacted
Posts: 7431
Full Member
 

Posted by: endoverend

It may create a new generation who do the standard 5 mins of research online and declare themselves an expert

That came long before the AI stuff.


 
Posted : 08/08/2025 1:41 pm
Posts: 7619
Full Member
 

My company has introduced its own internal ChatGPT. It's the ChatGPT LLM, using RAG (retrieval-augmented generation) to answer queries based on our internal documents.

We are actually measured on how often we use it - essentially we are training the AI model for the company so no doubt some of us can be replaced.  Pretty dystopian.
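For anyone unfamiliar with the term, RAG just means retrieving a relevant internal document and feeding it to the model as context. A toy sketch, with invented document names and texts, and word overlap standing in for a real embedding search:

```python
# Toy RAG retrieval step: pick the internal document that best matches the
# query, then build a prompt that grounds the model in that document.
# Document names and contents here are invented for illustration.
DOCS = {
    "expenses.md": "Submit expense claims within 30 days via the finance portal.",
    "leave.md": "Annual leave requests need manager approval in advance.",
}

def retrieve(query):
    # crude relevance score: count of shared lowercase words
    q = set(query.lower().split())
    return max(DOCS, key=lambda name: len(q & set(DOCS[name].lower().split())))

def build_prompt(query):
    name = retrieve(query)
    return f"Answer using only this context.\n[{name}] {DOCS[name]}\nQuestion: {query}"

print(build_prompt("how do I submit expense claims"))
```

A production system would replace `retrieve` with a vector search over embedded document chunks, but the prompt-assembly shape is the same.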


 
Posted : 08/08/2025 1:51 pm
Posts: 91165
Free Member
 

essentially we are training the AI model for the company so no doubt some of us can be replaced.

They won't replace staff.  They will have the staff do something else. This has been demonstrated over the last 200 years.  Most things that people used to do have been replaced by machines, but most of us still have jobs.

I think the main risk is that low paid low skill staff could be replaced, but someone will think of something else for them to do.  The question is, will it be worse than what the AI now does?


 
Posted : 08/08/2025 2:03 pm
Posts: 78461
Full Member
 

I tried.

 
 

 
Posted : 08/08/2025 2:13 pm
Posts: 12365
Full Member
 

Ask it about Hawaii Five-O


 
Posted : 08/08/2025 2:18 pm
oldnpastit reacted
Posts: 8400
Full Member
 

What's worrying / annoying is that it gives objectively wrong answers with such utter conviction. 

I've been doing that for years!

 

Ask it to answer the question before last, that should be interesting

 

 


 
Posted : 08/08/2025 2:31 pm
Posts: 7431
Full Member
 

I've asked ChapGPT (that's the posh, English version 😀 ) and it still says there is only one state. So I got it to work out that there are four, and told it to remember this as people are mocking it. It says it will remember in future.


 
Posted : 08/08/2025 2:31 pm
Posts: 78461
Full Member
 

That's a thing in itself.  It's supposed to be machine learning - why isn't it learning from its own previous mistakes?

Granted there's a (probably fairly high) risk that people will deliberately contaminate it, but in the example above I didn't give it any new information other than telling it it was wrong.


 
Posted : 08/08/2025 2:39 pm
Posts: 13509
Full Member
 

I've just done the same and once it got the right decision I asked it why it got it wrong:

 

That’s a fair question — and a good one.

Here’s what happened:

1. Initial Oversight

In my first answer, I only listed Colorado because I defaulted to an internal mental list and didn’t go through all 50 states systematically. That’s a common failure mode when a model (or a human) relies on recall rather than verification.

2. False Confidence

When you challenged me, I did a partial correction and found Idaho and Ohio, getting the count up to 3 — but I still missed New Mexico. That shows I was still relying too much on memory and a quick scan, not a complete and methodical check.

3. Assumption Error

“New Mexico” is a two-word name, and it’s easy to subconsciously file it under "New" rather than consider its actual last letter ("o"). So despite seeing the full name, I didn’t pay enough attention to the spelling of the last letter — a classic “surface-level check” mistake.


Why this matters:

It’s a great example of how even simple questions require careful step-by-step reasoning, especially when accuracy matters. Your persistence in asking me to double-check was exactly what was needed — and I appreciate that.

If you ever see a model (or a person) give a fast answer and seem confident — it’s still worth verifying. Especially when something feels off.

Thanks again for holding me to a higher standard.

 
 

 
Posted : 08/08/2025 3:19 pm
Posts: 6754
Free Member
 

Because you'd need to retrain the model itself, which is the most expensive part of running them.

I wonder how long before we'll see paid ads in all the answers from ChatGPT. I mean, Google's already gone massively downhill with all the paid sponsorship.


 
Posted : 08/08/2025 3:20 pm
 mrmo
Posts: 10720
Free Member
 

All LLMs do is determine the next word on the basis of probability. The word "apple" could follow "the", but so could "orange". It's a bullshit machine. Sometimes it guesses right, often it doesn't. If you believe Orwell, 2+2=5.
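That description can be made literal in a few lines. This toy next-word sampler, with an invented vocabulary and probabilities, picks each word purely by probability and has no notion of whether the sentence it builds is true:

```python
import random

# Toy "bullshit machine": each next word is drawn from a probability table.
# Nothing here checks truth; plausible-sounding output is the only goal.
NEXT = {
    "the":    {"apple": 0.5, "orange": 0.5},
    "apple":  {"fell": 1.0},
    "orange": {"fell": 1.0},
}

def generate(start, seed=0):
    rng = random.Random(seed)
    words = [start]
    while words[-1] in NEXT:
        table = NEXT[words[-1]]
        words.append(rng.choices(list(table), weights=list(table.values()))[0])
    return " ".join(words)

print(generate("the"))  # "the apple fell" or "the orange fell", per the dice
```

A real LLM conditions on the whole preceding context rather than just the last word, but the sampling step is the same idea at enormous scale.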


 
Posted : 08/08/2025 3:27 pm
ChrisL reacted
Posts: 7431
Full Member
 

I just used it to fix a SQL script and it did a great job 😀


 
Posted : 08/08/2025 3:32 pm
Posts: 12365
Full Member
 

Posted by: mrmo

All LLMs do is determine the next word on the basis of probability. The word apple could follow the, but so could orange. It’s a bullshit machine.

Exactly. An LLM isn't intelligent in the sense of understanding the meaning of language, it's just very good at identifying highly probable responses to prompts.


 
Posted : 08/08/2025 3:40 pm
Posts: 7619
Full Member
 

Posted by: molgrips

They won't replace staff.  They will have the staff do something else. This has been demonstrated over the last 200 years.  Most things that people used to do have been replaced by machines, but most of us still have jobs.

I think the main risk is that low paid low skill staff could be replaced, but someone will think of something else for them to do.  The question is, will it be worse than what the AI now does?

They've already started: they've thinned out management, as each manager is now supposed to be able to handle more direct reports because of AI.

My fear is that they'll look to replace the expensive tenured staff who actually know the answers to most scenarios without having to rely on AI, and replace them with cheaper, less experienced staff, letting AI cover the gaps.


 
Posted : 08/08/2025 3:56 pm
Posts: 4593
Free Member
 

I wonder how long before we'll see paid ads in all the answers from ChatGPT. I mean, Google's already gone massively downhill with all the paid sponsorship.

IMO it will be about 0.00001 seconds after OpenAI figure out how to do it and sell the ads.

OpenAI is haemorrhaging money and has no realistic path to profitability, unless Mr Altman can convince enough government bodies to embed ChatGPT into their infrastructure and then yank up the price.

They'll be desperate to monetize it in any way they can. For regular users, it's just a commodity - people will happily jump ship to Gemini or Claude if they feel they offer better value.

GPT-5 was supposed to be the big reveal, the gamechanger. But it doesn't look like it's much of an improvement on 4, although I believe it is cheaper if you're a power user, and it has smaller versions that can run locally on (relatively) accessible home hardware (i.e. computers that cost about £3k). So it's a step forward, but it won't have the cash flooding in.


 
Posted : 08/08/2025 4:01 pm
Posts: 18593
Free Member
 

If you consider the volume of erroneous shite on the Net, it's not surprising that code developed to skim it (AI) produces unreliable answers. My browser's AI clearly skims the first results and summarises them, when the truth may lie a few pages beyond.


 
Posted : 08/08/2025 4:49 pm
Posts: 78461
Full Member
 

Posted by: richmtb

They've already started: they've thinned out management, as each manager is now supposed to be able to handle more direct reports because of AI.

IME the opposite is happening.  The thinning is occurring at the bottom because who needs junior staff anymore?

I can't say as I'm overly shocked either.  A company is unlikely to thin the herd by getting rid of managers when it's the managers who are deciding who gets made redundant.  Also IME.


 
Posted : 08/08/2025 5:54 pm
 10
Posts: 1506
Full Member
 

Do Colorado, Idaho and New Mexico not exist as US states anymore?

According to some of the people I meet Colorado actually ends in an 'a'


 
Posted : 08/08/2025 6:30 pm
Posts: 7619
Full Member
 

Posted by: Cougar

A company is unlikely to thin the herd by getting rid of managers when it's the managers who are deciding who gets made redundant.  Also IME.

It's a big company (a multinational IT vendor) and it's got a lot of management layers. To their credit, a lot of the staff reductions focus on management bloat first, but I get the distinct impression they will start to go further.


 
Posted : 08/08/2025 6:50 pm
Posts: 894
Free Member
 

As with all current AI-related things, it's less about the AI and more about how you phrase your query. There is a real skill involved in formatting a query correctly to get the answers you need; when done properly, it is a very useful tool for research and many other uses.  

Mrs H1ghland3r heads up the AI development team at a major IT firm, and they have their own models that are being trained to be used for document review, coding and technical documentation.  They are currently aggressively hiring people specifically to correctly format AI queries. As was said earlier, the jobs won't be lost, they'll just change.

In this instance, if you ask 'Can you methodically check and list all US states that end with the letter O' then it gets the right answer first time.

The key word in the query is 'methodically', as this instructs the AI to bypass the various shortcuts it uses to reduce processing time (and cost) and do a full review of the data to answer the question.
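The check itself is trivial to do deterministically, which is rather the point. A few lines of Python (the 50-state list is standard knowledge, not from the thread) reproduce the answer the model had to be coaxed into:

```python
# The deterministic check an LLM "guesses" at: filter the 50 state names
# by their last letter. No probability involved.
STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]

ends_in_o = [s for s in STATES if s.lower().endswith("o")]
print(ends_in_o)  # ['Colorado', 'Idaho', 'New Mexico', 'Ohio']
```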


 
Posted : 08/08/2025 6:57 pm
ossify reacted
 mrmo
Posts: 10720
Free Member
 

Posted by: H1ghland3r
As with all current AI related things, it's less about the AI and more about how you phrase your query.

Which is fine, if you understand that, but the average user of Facebook or Google? No chance. That LLMs are bullshit machines really is an issue. In the right context, they can be useful tools. Though, as an observation, at the current costs I can see a lot of companies walking away. The pricing models are a joke.


 
Posted : 08/08/2025 7:11 pm
Posts: 894
Free Member
 

Posted by: mrmo

Which is fine, if you understand that, but the average user of Facebook or Google? No chance. That LLMs are bullshit machines really is an issue. In the right context, they can be useful tools. Though, as an observation, at the current costs I can see a lot of companies walking away. The pricing models are a joke.

Which is also part of the perception problem: no-one in AI gives two figs what the 'freemium' users and the public are doing with the AI models, which are generally 2-3 versions behind the cutting edge. They are all focussed on corporate development, as that's where the money is; only after that will it filter down to consumers. Not even the models Google is shoehorning into every search query are where their focus is, which is part of the problem. They want everyone to be thinking about incorporating AI into everything, but the current state of public-facing AI is so bad it's ruining the perception of it as a tool.

I also have issues with them even using the term 'AI', as it's nothing of the sort. It's just a natural-language search engine based on a snapshot of data from a given point in time. As the US state question highlights, it can't even update its own parameters when a mistake is pointed out to it; mistakes have to be reviewed and corrected by real people at OpenAI or wherever the models are trained.


 
Posted : 08/08/2025 7:22 pm
 PJay
Posts: 4997
Free Member
 

Possibly a case of user error here but the article highlights some very real risks with AI.

https://www.independent.co.uk/news/health/chatgpt-medical-advice-hospital-b2803992.html


 
Posted : 08/08/2025 7:26 pm
Posts: 894
Free Member
 

Posted by: PJay

Possibly a case of user error here but the article highlights some very real risks with AI.

Yep, none of this sort of thing is helping, but I'd be curious to see EXACTLY what he asked to get that answer. I suspect the answer he got is more to do with what he asked than anything else. Like I said before, businesses are hoovering up people trained to ask properly formatted queries. It's more like writing an SQL query against a database than chatting with a person.

 


 
Posted : 08/08/2025 7:37 pm
 beej
Posts: 4210
Full Member
 

I've mentioned before that I work for Microsoft, and, somewhat relevant to the discussion on corporate/enterprise use, we've just published a story with my customer SSE. 

Compared to public ChatGPT there's a huge amount of extra work to build something that's reliable enough to use in a corporate environment, particularly when it's public facing. Some companies get this, some less so.

https://www.microsoft.com/en/customers/story/24657-sse-microsoft-copilot-studio

 


 
Posted : 08/08/2025 7:47 pm
 mrmo
Posts: 10720
Free Member
 

Posted by: beej

Compared to public ChatGPT there's a huge amount of extra work to build something that's reliable enough to use in a corporate environment, particularly when it's public facing. Some companies get this, some less so.

My employer has only just granted access to Copilot for a small subset of users, after many hours chatting to MS because of this. Still a long way to go, though, before it gets rolled out to users at any level, and even longer until every possible complication is understood. There are business rules around sales, and until everyone is 100% certain that users will not see information that they are not allowed to see, Copilot stays off.


 
Posted : 08/08/2025 8:41 pm
Posts: 20662
Full Member
 

Posted by: mrmo

My employer has only just granted access to Copilot for a small subset of users, after many hours chatting to MS because of this. Still a long way to go, though, before it gets rolled out to users at any level, and even longer until every possible complication is understood. There are business rules around sales, and until everyone is 100% certain that users will not see information that they are not allowed to see, Copilot stays off.

Absolute opposite at my place: CoPilot is being touted as the greatest thing since sliced bread. We're being encouraged to use it for all sorts - "try it out". Lots of breathlessly exciting articles (which sound exactly like they've been written by AI) on the intranet about how it can summarise documents, write your meeting minutes, organise your calendar...

It all *sounds* great:
oh why bother reading that long document, I'll get Copilot to summarise it...
oh why bother listening to the meeting, I'll get Copilot to write it up...

Until you realise that Copilot doesn't know the relationships between the people in the meeting, doesn't understand the technical stuff, mishears and misunderstands words, phrases and sometimes whole contexts, can't do sarcasm, and is literally just providing live speech-to-text, with occasional wild errors as it misunderstands a regional accent.


 
Posted : 08/08/2025 9:07 pm
 10
Posts: 1506
Full Member
 

Possibly a case of user error here but the article highlights some very real risks with AI.

I've shared something similar before, but "AI psychosis" is also quite troubling. 


 
Posted : 08/08/2025 9:21 pm
 beej
Posts: 4210
Full Member
 

crazy-legs - All you've said is true; it's an accelerator, not an answer. Verify the outputs, check the references.

mrmo - it can only see what you could see anyway. The main issue is that it can find things you didn't know you had access to. Most organisations aren't great at classifying information and controlling who has access.

 


 
Posted : 08/08/2025 9:28 pm
 mrmo
Posts: 10720
Free Member
 

Posted by: beej
mrmo - it can only see what you could see anyway. The main issue is that it can find things you didn't know you had access to. Most organisations aren't great at classifying information and controlling who has access.

We primarily use D365 F&O and there is paranoia around the permission models in place, a mix of Entra/AD on and off premises, and security within D365, PowerBI and SQL. So you can imagine how something might fall through a gap.


 
Posted : 08/08/2025 9:42 pm
 beej
Posts: 4210
Full Member
 

Posted by: mrmo

Posted by: beej
mrmo - it can only see what you could see anyway. The main issue is that it can find things you didn't know you had access to. Most organisations aren't great at classifying information and controlling who has access.

We primarily use D365 F&O and there is paranoia around the permission models in place, a mix of Entra/AD on and off premises, and security within D365, PowerBI and SQL. So you can imagine how something might fall through a gap.

Gulp. Good luck! I'd be recommending an external specialist or MS services for that one.

I wouldn't ask ChatGPT how to solve it.

 


 
Posted : 08/08/2025 10:26 pm
Posts: 7556
Free Member
 

Posted by: H1ghland3r

As with all current AI related things, it's less about the AI and more about how you phrase your query.

This.

I've been using it (Copilot) a fair bit at work. The most successful instances are when I drip-feed requests and slowly build up to what I need.

My wife has been using AI to check maths problems and sometimes it makes terrible mistakes.


 
Posted : 08/08/2025 10:54 pm
Posts: 33967
Full Member
 

Posted by: desperatebicycle

AI responses may include mistakes."

Hallucinations without the use of psychedelic substances! How can I do that?


 
Posted : 09/08/2025 12:50 am
Posts: 33967
Full Member
 

Posted by: endoverend

The more worrying bit is that without a deep historical knowledge, ie: a normal awareness... the first incorrect answer would be taken as fact. It may create a new generation who do the standard 5 mins of research online and declare themselves an expert who don't realise how wrong they're getting it...

Sounds like your average Republican voter…


 
Posted : 09/08/2025 12:53 am