ChatGPT is rubbish

Posts: 1754
Full Member
Topic starter
 
[#13531653]

ChatGPT seems to be highly rated, and AI is seen as the future. However, how can it be used if it can't answer simple questions correctly?

I asked v5.0 to name the US states ending in the letter o.

It states that 'There's only one U.S. state that ends in the letter "o" - Ohio. It's got the distinction all to itself'.

Do Colorado, Idaho and New Mexico not exist as US states anymore?

How is this the future?


 
Posted : 08/08/2025 11:47 am
Posts: 23334
Free Member
 
name the US states ending in the letter o
 
ChatGPT said:

Only one U.S. state name ends with the letter “o”: Ohio.

 
You said:
what about colorado
 
ChatGPT said:

You’re right — Colorado also ends in “o.”

So, the complete list of U.S. states ending with the letter “o” is:

  • Colorado

  • Ohio

 
You said:
and idaho
 
ChatGPT said:

Exactly — Idaho makes it three.

US states ending in “o”:

  • Colorado

  • Idaho

  • Ohio

 
You said:
and new mexico
 
ChatGPT said:

Yep — New Mexico belongs on the list too.

Final list of U.S. states ending in “o”:

  • Colorado

  • Idaho

  • New Mexico

  • Ohio

 
You said:
list the US states ending in o
 
ChatGPT said:

Here’s the full list of U.S. states whose names end with the letter “o”:

  • Colorado

  • Idaho

  • New Mexico

  • Ohio


 
Posted : 08/08/2025 11:52 am
sirromj reacted
Posts: 23334
Free Member
 

progress....


 
Posted : 08/08/2025 11:52 am
Posts: 7431
Full Member
 

Google's AI answer includes a disclaimer 🙂 
"how many US states end in the letter O

The United States has 4 states that end with the letter "O". They are: 
  • Colorado
  • Idaho
  • New Mexico
  • Ohio 
AI responses may include mistakes."

 
Posted : 08/08/2025 11:55 am
Posts: 9136
Full Member
 

At least they did not claim that spaghetti-o belongs on the list.


 
Posted : 08/08/2025 11:56 am
Posts: 9010
Free Member
 

I dunno, some people are much more concerned about other aspects of it. Stealing copyrighted artwork, for example. Or how about 24-hour surveillance (and data for training) via wearable AI tech? 

Or that the very name of OpenAI, alluding to openness and the sharing of knowledge (especially if you've ever been involved in the open source software community), is far from the reality of the corporation.


 
Posted : 08/08/2025 11:57 am
ChrisL reacted
 DrJ
Posts: 14006
Full Member
 

I’ve been using it quite a bit lately to help translate some French legal documents and mostly it’s been very useful but occasionally it gets the translation completely wrong. Not in some detailed grammatical way - more like it is looking at a completely different text. 


 
Posted : 08/08/2025 12:18 pm
Posts: 901
Full Member
 

It amazes me how often it produces completely false information. When I've played around with it to verify facts about a particular area of historical knowledge that is very familiar to me, and of which countless books, documents and reports exist online - a huge dataset - it inexcusably gets dates wrong and mixes up the characters in events; it's not even close on what is a well-defined historical record. Amusingly, I've told it it's wrong and it admits it made a mistake, tries again, and still gets it wrong again... and again... and again. It rewrites history, gets it completely wrong and presents it as fact.

The more worrying bit is that without deep historical knowledge, i.e. anything beyond a normal awareness, the first incorrect answer would be taken as fact. It may create a new generation who do the standard 5 mins of research online and declare themselves experts who don't realise how wrong they're getting it...

Not convinced as of yet... it needs to be more transparent on verifying validity of sources, and provide an estimation of certainty of its output for the end user.


 
Posted : 08/08/2025 12:45 pm
TheGingerOne reacted
Posts: 988
Free Member
 

I'm using Claude to generate code and graphs for investment purposes and I'm very impressed. First attempt told me what Python modules to install and the graphs produced looked OK, but on closer inspection were not correct so I told it:

That code ran ok but the figures produced are different to the ones I get from the TradingView website. Your code for AAPL produced a MACD of -1.3071 and a Signal of -0.5633 but TradingView produced a MACD of -7.83 and a Signal of -5.61  

and the reply was:

You're seeing different values because TradingView uses Simple Moving Averages (SMA) for MACD calculation by default, while my code uses Exponential Moving Averages (EMA). This is a common source of confusion!

Let me update the code to match TradingView's default calculation

Job jobbed.
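As an aside, the EMA/SMA discrepancy Claude describes is easy to sketch in plain Python. The 12/26/9 parameters below are the standard MACD defaults, assumed rather than taken from the post, and the claim that TradingView defaults to SMA is Claude's, not verified here:

```python
# Minimal MACD sketch: the only change between the two variants is the
# moving-average function, which is exactly the discrepancy described above.
def ema(values, n):
    # exponential moving average with smoothing factor k = 2 / (n + 1)
    k = 2 / (n + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(v * k + out[-1] * (1 - k))
    return out

def sma(values, n):
    # simple moving average over a trailing window of up to n values
    return [sum(values[max(0, i - n + 1):i + 1]) / min(i + 1, n)
            for i in range(len(values))]

def macd(close, fast=12, slow=26, signal=9, ma=ema):
    # MACD line = fast MA minus slow MA; signal line = MA of the MACD line
    line = [f - s for f, s in zip(ma(close, fast), ma(close, slow))]
    return line, ma(line, signal)
```

Running `macd(prices)` and `macd(prices, ma=sma)` over the same closing prices gives two different sets of figures, which would account for the mismatch against the website.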


 
Posted : 08/08/2025 1:16 pm
Posts: 20662
Full Member
 

Posted by: endoverend

Not convinced as of yet... it needs to be more transparent on verifying validity of sources,

Quite often it just makes up its own sources. There are several examples I've seen where it's cited references that simply don't exist.

 

Posted by: endoverend

It may create a new generation who do the standard 5 mins of research online and declare themselves an expert who don't realise how wrong they're getting it...

I remember my uni days of combing through the archives, finding obscure German chemistry experiments from the early 1900s, and having to verify and cite the reference (obviously the lecturer knew it and expected you to cite it; failure to cite it, or getting it wrong, was a clear case of failing to do the research, or of copying someone else).

Now, everyone spends 3 minutes on Google and produces an answer which is unquestionably wrong (sometimes dangerously so) on a worryingly large number of occasions.

You actually have to be quite good at what you're researching to know that it's wrong and question it again. Anyone without that knowledge is just going to go "ah well AI says it's [thing]..."

The mis-information and dis-information potentials are pretty horrifying.


 
Posted : 08/08/2025 1:27 pm
sirromj reacted
Posts: 7431
Full Member
 

Posted by: endoverend

It may create a new generation who do the standard 5 mins of research online and declare themselves an expert

That came long before the AI stuff.


 
Posted : 08/08/2025 1:41 pm
Posts: 7619
Full Member
 

My company has introduced its own internal ChatGPT. It's the ChatGPT LLM, using RAG (retrieval-augmented generation) to answer queries based on our internal documents.

We are actually measured on how often we use it - essentially we are training the AI model for the company so no doubt some of us can be replaced.  Pretty dystopian.
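For anyone unfamiliar with the term, RAG just means retrieving a relevant internal document and feeding it to the model as context. A toy sketch, with invented document names and texts, and word overlap standing in for a real embedding search:

```python
# Toy RAG retrieval step: pick the internal document that best matches the
# query, then build a prompt that grounds the model in that document.
# Document names and contents here are invented for illustration.
DOCS = {
    "expenses.md": "Submit expense claims within 30 days via the finance portal.",
    "leave.md": "Annual leave requests need manager approval in advance.",
}

def retrieve(query):
    # crude relevance score: count of shared lowercase words
    q = set(query.lower().split())
    return max(DOCS, key=lambda name: len(q & set(DOCS[name].lower().split())))

def build_prompt(query):
    name = retrieve(query)
    return f"Answer using only this context.\n[{name}] {DOCS[name]}\nQuestion: {query}"

print(build_prompt("how do I submit expense claims"))
```

A production system would replace `retrieve` with a vector search over embedded document chunks, but the prompt-assembly shape is the same.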


 
Posted : 08/08/2025 1:51 pm
Posts: 91165
Free Member
 

essentially we are training the AI model for the company so no doubt some of us can be replaced.

They won't replace staff.  They will have the staff do something else. This has been demonstrated over the last 200 years.  Most things that people used to do have been replaced by machines, but most of us still have jobs.

I think the main risk is that low paid low skill staff could be replaced, but someone will think of something else for them to do.  The question is, will it be worse than what the AI now does?


 
Posted : 08/08/2025 2:03 pm
Posts: 78461
Full Member
 

I tried.

 
 

 
Posted : 08/08/2025 2:13 pm
Posts: 12365
Full Member
 

Ask it about Hawaii Five-O


 
Posted : 08/08/2025 2:18 pm
oldnpastit reacted
Posts: 8400
Full Member
 

What's worrying / annoying is that it gives objectively wrong answers with such utter conviction. 

I've been doing that for years!

 

Ask it to answer the question before last, that should be interesting

 

 


 
Posted : 08/08/2025 2:31 pm
Posts: 7431
Full Member
 

I've asked ChapGPT (that's the posh, English version 😀 ) and it still says there is only one state. So I got it to work out that there are four, and told it to remember this as people are mocking it. It says it will remember in future.


 
Posted : 08/08/2025 2:31 pm
Posts: 78461
Full Member
 

That's a thing in itself.  It's supposed to be machine learning - why isn't it learning from its own previous mistakes?

Granted there's a (probably fairly high) risk that people will deliberately contaminate it, but in the example above I didn't give it any new information other than telling it it was wrong.


 
Posted : 08/08/2025 2:39 pm
Posts: 13509
Full Member
 

I've just done the same and once it got the right decision I asked it why it got it wrong:

 

That’s a fair question — and a good one.

Here’s what happened:

1. Initial Oversight

In my first answer, I only listed Colorado because I defaulted to an internal mental list and didn’t go through all 50 states systematically. That’s a common failure mode when a model (or a human) relies on recall rather than verification.

2. False Confidence

When you challenged me, I did a partial correction and found Idaho and Ohio, getting the count up to 3 — but I still missed New Mexico. That shows I was still relying too much on memory and a quick scan, not a complete and methodical check.

3. Assumption Error

“New Mexico” is a two-word name, and it’s easy to subconsciously file it under "New" rather than consider its actual last letter ("o"). So despite seeing the full name, I didn’t pay enough attention to the spelling of the last letter — a classic “surface-level check” mistake.


Why this matters:

It’s a great example of how even simple questions require careful step-by-step reasoning, especially when accuracy matters. Your persistence in asking me to double-check was exactly what was needed — and I appreciate that.

If you ever see a model (or a person) give a fast answer and seem confident — it’s still worth verifying. Especially when something feels off.

Thanks again for holding me to a higher standard.

 
 

 
Posted : 08/08/2025 3:19 pm
Posts: 6754
Free Member
 

Because you'd need to retrain the model itself, which is the most expensive part of running them.

I wonder how long before we'll see paid ads in all the answers from ChatGPT. I mean, Google's already gone massively downhill with all the paid sponsorship.


 
Posted : 08/08/2025 3:20 pm
 mrmo
Posts: 10720
Free Member
 

All LLMs do is determine the next word on the basis of probability. The word "apple" could follow "the", but so could "orange". It's a bullshit machine. Sometimes it guesses right, often it doesn't. If you believe Orwell, 2+2=5.
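That description can be made literal in a few lines. This toy next-word sampler, with an invented vocabulary and probabilities, picks each word purely by probability and has no notion of whether the sentence it builds is true:

```python
import random

# Toy "bullshit machine": each next word is drawn from a probability table.
# Nothing here checks truth; plausible-sounding output is the only goal.
NEXT = {
    "the":    {"apple": 0.5, "orange": 0.5},
    "apple":  {"fell": 1.0},
    "orange": {"fell": 1.0},
}

def generate(start, seed=0):
    rng = random.Random(seed)
    words = [start]
    while words[-1] in NEXT:
        table = NEXT[words[-1]]
        words.append(rng.choices(list(table), weights=list(table.values()))[0])
    return " ".join(words)

print(generate("the"))  # "the apple fell" or "the orange fell", per the dice
```

A real LLM conditions on the whole preceding context rather than just the last word, but the sampling step is the same idea at enormous scale.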


 
Posted : 08/08/2025 3:27 pm
ChrisL reacted
Posts: 7431
Full Member
 

I just used it to fix a SQL script and it did a great job 😀


 
Posted : 08/08/2025 3:32 pm
Posts: 12365
Full Member
 

Posted by: mrmo

All LLMs do is determine the next word on the basis of probability. The word apple could follow the, but so could orange. It’s a bullshit machine.

Exactly. An LLM isn't intelligent in the sense of understanding the meaning of language, it's just very good at identifying highly probable responses to prompts.


 
Posted : 08/08/2025 3:40 pm
Posts: 7619
Full Member
 

Posted by: molgrips

They won't replace staff.  They will have the staff do something else. This has been demonstrated over the last 200 years.  Most things that people used to do have been replaced by machines, but most of us still have jobs.

I think the main risk is that low paid low skill staff could be replaced, but someone will think of something else for them to do.  The question is, will it be worse than what the AI now does?

They've already started: they've thinned out management, as each manager is now supposed to be able to handle more direct reports because of AI.

My fear is that they'll look to replace the expensive tenured staff who actually know the answers to most scenarios without having to rely on AI, and replace them with cheaper, less experienced staff, letting AI cover the gaps.


 
Posted : 08/08/2025 3:56 pm
Posts: 4593
Free Member
 

I wonder how long before we'll see paid ads in all the answers from ChatGPT. I mean, Google's already gone massively downhill with all the paid sponsorship.

IMO it will be about 0.00001 seconds after OpenAI figure out how to do it and sell the ads.

OpenAI is haemorrhaging money and has no realistic path to profitability, unless Mr Altman can convince enough government bodies to embed ChatGPT into their infrastructure and then yank up the price.

They'll be desperate to monetize it in any way they can. For regular users, it's just a commodity - people will happily jump ship to Gemini or Claude if they feel they offer better value.

GPT-5 was supposed to be the big reveal, the gamechanger. But it doesn't look like it's much of an improvement on 4, although I believe it is cheaper if you're a power user, and it has smaller versions that can run locally on (relatively) accessible home hardware (i.e. computers that cost about £3k). So it's a step forward, but it won't have the cash flooding in.


 
Posted : 08/08/2025 4:01 pm
Posts: 18593
Free Member
 

If you consider the volume of erroneous shite on the Net, it's not surprising that code developed to skim it (AI) produces unreliable answers. My browser's AI clearly skims the first results and summarises them, when the truth may lie a few pages beyond.


 
Posted : 08/08/2025 4:49 pm
Posts: 78461
Full Member
 

Posted by: richmtb

They've already started: they've thinned out management, as each manager is now supposed to be able to handle more direct reports because of AI.

IME the opposite is happening.  The thinning is occurring at the bottom because who needs junior staff anymore?

I can't say as I'm overly shocked either.  A company is unlikely to thin the herd by getting rid of managers when it's the managers who are deciding who gets made redundant.  Also IME.


 
Posted : 08/08/2025 5:54 pm
 10
Posts: 1506
Full Member
 

Do Colorado, Idaho and New Mexico not exist as US states anymore?

According to some of the people I meet Colorado actually ends in an 'a'


 
Posted : 08/08/2025 6:30 pm
Posts: 7619
Full Member
 

Posted by: Cougar

A company is unlikely to thin the herd by getting rid of managers when it's the managers who are deciding who gets made redundant.  Also IME.

It's a big company (a multinational IT vendor) and it's got a lot of management layers. To their credit, a lot of the staff reductions focus on management bloat first, but I get the distinct impression they will start to go further.


 
Posted : 08/08/2025 6:50 pm
Posts: 894
Free Member
 

As with all current AI-related things, it's less about the AI and more about how you phrase your query. There is a real skill involved in formatting a query correctly to get the answers you need; when done properly, it is a very useful tool for research and many other uses.  

Mrs H1ghland3r heads up the AI development team at a major IT firm, and they have their own models that are being trained to be used for document review, coding and technical documentation.  They are currently aggressively hiring people specifically to correctly format AI queries. As was said earlier, the jobs won't be lost, they'll just change.

In this instance, if you ask 'Can you methodically check and list all US states that end with the letter O' then it gets the right answer first time.

The key word in the query is 'methodically', as this instructs the AI to bypass the various shortcuts it uses to reduce processing time (and cost) and do a full review of the data to answer the question.
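The check itself is trivial to do deterministically, which is rather the point. A few lines of Python (the 50-state list is standard knowledge, not from the thread) reproduce the answer the model had to be coaxed into:

```python
# The deterministic check an LLM "guesses" at: filter the 50 state names
# by their last letter. No probability involved.
STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]

ends_in_o = [s for s in STATES if s.lower().endswith("o")]
print(ends_in_o)  # ['Colorado', 'Idaho', 'New Mexico', 'Ohio']
```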


 
Posted : 08/08/2025 6:57 pm
ossify reacted
 mrmo
Posts: 10720
Free Member
 

Posted by: H1ghland3r
As with all current AI related things, it's less about the AI and more about how you phrase your query.

Which is fine, if you understand that, but the average user of Facebook or Google? No chance. That LLMs are bullshit machines really is an issue. In the right context, they can be useful tools. Though, as an observation, at the current costs I can see a lot of companies walking away. The pricing models are a joke.


 
Posted : 08/08/2025 7:11 pm
Posts: 894
Free Member
 

Posted by: mrmo

Which is fine, if you understand that, but the average user of Facebook or Google? No chance. That LLMs are bullshit machines really is an issue. In the right context, they can be useful tools. Though, as an observation, at the current costs I can see a lot of companies walking away. The pricing models are a joke.

Which is also part of the perception problem: no-one in AI gives two figs what the 'freemium' users and the public are doing with the AI models, which are generally 2-3 versions behind the cutting edge. They are all focussed on corporate development, as that's where the money is; only after that will it filter down to consumers. Not even the models Google is shoehorning into every search query are where their focus is, which is part of the problem. They want everyone to be thinking about incorporating AI into everything, but the current state of public-facing AI is so bad it's ruining the perception of it as a tool.

I also have issues with them even using the term 'AI', as it's nothing of the sort. It's just a natural-language search engine based on a snapshot of data from a given point in time. As the US state question highlights, it can't even update its own parameters when a mistake is pointed out to it; mistakes have to be reviewed and corrected by real people at OpenAI or wherever the models are trained.


 
Posted : 08/08/2025 7:22 pm
 PJay
Posts: 4997
Free Member
 

Possibly a case of user error here but the article highlights some very real risks with AI.

https://www.independent.co.uk/news/health/chatgpt-medical-advice-hospital-b2803992.html


 
Posted : 08/08/2025 7:26 pm
Posts: 894
Free Member
 

Posted by: PJay

Possibly a case of user error here but the article highlights some very real risks with AI.

Yep, none of this sort of thing is helping, but I'd be curious to see EXACTLY what he asked to get that answer. I suspect the answer he got is more to do with what he asked than anything else. Like I said before, businesses are hoovering up people trained to ask properly formatted queries. It's more like writing an SQL query against a database than chatting with a person.

 


 
Posted : 08/08/2025 7:37 pm
 beej
Posts: 4210
Full Member
 

I've mentioned before that I work for Microsoft, and, somewhat relevant to the discussion on corporate/enterprise use, we've just published a story with my customer SSE. 

Compared to public ChatGPT there's a huge amount of extra work to build something that's reliable enough to use in a corporate environment, particularly when it's public facing. Some companies get this, some less so.

https://www.microsoft.com/en/customers/story/24657-sse-microsoft-copilot-studio

 


 
Posted : 08/08/2025 7:47 pm
 mrmo
Posts: 10720
Free Member
 

Posted by: beej

Compared to public ChatGPT there's a huge amount of extra work to build something that's reliable enough to use in a corporate environment, particularly when it's public facing. Some companies get this, some less so.

My employer has only just granted access to Copilot for a small subset of users, after many hours chatting to MS because of this. Still a long way to go, though, before it gets rolled out to users at any level, and even longer until every possible complication is understood. There are business rules around sales, and until everyone is 100% certain that users will not see information that they are not allowed to see, Copilot stays off.


 
Posted : 08/08/2025 8:41 pm
Posts: 20662
Full Member
 

Posted by: mrmo

My employer has only just granted access to Copilot for a small subset of users, after many hours chatting to MS because of this. Still a long way to go, though, before it gets rolled out to users at any level, and even longer until every possible complication is understood. There are business rules around sales, and until everyone is 100% certain that users will not see information that they are not allowed to see, Copilot stays off.

Absolute opposite at my place: CoPilot is being touted as the greatest thing since sliced bread. We're being encouraged to use it for all sorts - "try it out". Lots of breathlessly exciting articles (which sound exactly like they've been written by AI) on the intranet about how it can summarise documents, write your meeting minutes, organise your calendar...

It all *sounds* great:
oh why bother reading that long document, I'll get Copilot to summarise it...
oh why bother listening to the meeting, I'll get Copilot to write it up...

Until you realise that Copilot doesn't know the relationships between the people in the meeting, doesn't understand the technical stuff, mishears and misunderstands words, phrases and sometimes whole contexts, can't do sarcasm, and is literally just providing live speech-to-text, with occasional wild errors as it misunderstands a regional accent.


 
Posted : 08/08/2025 9:07 pm
 10
Posts: 1506
Full Member
 

Possibly a case of user error here but the article highlights some very real risks with AI.

I've shared something similar before, but "AI psychosis" is also quite troubling. 


 
Posted : 08/08/2025 9:21 pm
 beej
Posts: 4210
Full Member
 

crazy-legs - All you've said is true; it's an accelerator, not an answer. Verify the outputs, check the references.

mrmo - it can only see what you could see anyway. The main issue is that it can find things you didn't know you had access to. Most organisations aren't great at classifying information and controlling who has access.

 


 
Posted : 08/08/2025 9:28 pm
 mrmo
Posts: 10720
Free Member
 

Posted by: beej
mrmo - it can only see what you could see anyway. The main issue is that it can find things you didn't know you had access to. Most organisations aren't great at classifying information and controlling who has access.

We primarily use D365 F&O and there is paranoia around the permission models in place, a mix of Entra/AD on and off premises, and security within D365, PowerBI and SQL. So you can imagine how something might fall through a gap.


 
Posted : 08/08/2025 9:42 pm
 beej
Posts: 4210
Full Member
 

Posted by: mrmo

Posted by: beej
mrmo - it can only see what you could see anyway. The main issue is that it can find things you didn't know you had access to. Most organisations aren't great at classifying information and controlling who has access.

We primarily use D365 F&O and there is paranoia around the permission models in place, a mix of Entra/AD on and off premises, and security within D365, PowerBI and SQL. So you can imagine how something might fall through a gap.

Gulp. Good luck! I'd be recommending an external specialist or MS services for that one.

I wouldn't ask ChatGPT how to solve it.

 


 
Posted : 08/08/2025 10:26 pm
Posts: 7556
Free Member
 

Posted by: H1ghland3r

As with all current AI related things, it's less about the AI and more about how you phrase your query.

This.

I've been using it (Copilot) a fair bit at work. The most successful instances are when I drip-feed requests and slowly build up to what I need.

My wife has been using AI to check maths problems and sometimes it makes terrible mistakes.


 
Posted : 08/08/2025 10:54 pm
Posts: 33967
Full Member
 

Posted by: desperatebicycle

AI responses may include mistakes."

Hallucinations without the use of psychedelic substances! How can I do that?


 
Posted : 09/08/2025 12:50 am
Posts: 33967
Full Member
 

Posted by: endoverend

The more worrying bit is that without a deep historical knowledge, ie: a normal awareness... the first incorrect answer would be taken as fact. It may create a new generation who do the standard 5 mins of research online and declare themselves an expert who don't realise how wrong they're getting it...

Sounds like your average Republican voter…


 
Posted : 09/08/2025 12:53 am