Jack Riddle[Any/All]

@ jackr @lemmy.dbzer0.com

Posts

2
Comments

202
Joined

10 mo. ago

Profile picture drawn by Paws and Claws and licensed under the Creative Commons Attribution Sharealike 4.0 International license(cc-by-sa 4.0)
currently migrating my main account to anise@quokk.au

3d ago

"Just ask ChatGPT" (Art by Shave_your_eyebrows)
Jump
1

Jack Riddle[Any/All] @lemmy.dbzer0.com 3d ago
what text are you reading that has a 0% error rate?
as I said, the text has a 0% error rate about the contents of the text, which is what the LLM is summarising, and to which it adds it's own error rate. Then you read that and add your error rate.
the question is can we make a system that has an error rate that is close to or lower than a person's
can we???
could you read and summarize 75 novels with a 0% error rate?
why… would I want that? I read novels because I like reading novels? I also think that on summaries LLMs are especially bad, since there is no distinction between "important" and "unimportant" in the architecture. The point of a summary is to only get the important points, so it clashes.
provide a page reference to all of the passages written in iambic pentameter?
no LLM can do this. LLMs are notoriously bad at doing any analysis of this kind of style element because of their architecture. why would you pick this example
Meanwhile an LLM could produce a summary, with citations generated and tracked by non-AI systems, with an error rate comparable to a human (assuming the human was given a few months to work on the problem) in seconds.
I still have not seen any evidence for this, and it still does not adress the point that the summary would be pretty much unreadable

3d ago

"Just ask ChatGPT" (Art by Shave_your_eyebrows)

Jack Riddle[Any/All] @lemmy.dbzer0.com 3d ago

The study of this in academia

you are linking to an arxiv preprint. I do not know these researchers. there is nothing that indicates to me that this source is any more credible than a blog post.

has found that LLM hallucination rate can be dropped to almost nothing

where? It doesn't seem to be in this preprint, which is mostly a history of RAG and mentions hallucinations only as a problem affecting certain types of RAG more than other types. It makes some relative claims about accuracy that suggest including irrelevant data might make models more accurate. It doesn't mention anything about “hallucination rate being dropped to almost nothing”.

(less than a human)

you know what has a 0% hallucination rate about the contents of a text? the text

You can see in the images I posted that it both answered the question and also correctly cited the source which was the entire point of contention.

this is anecdotal evidence, and also not the only point of contention. Another point was, for example, that ai text is horrible to read. I don't think RAG(or any other tacked-on tool they've been trying for the past few years) fixes that.

Jack Riddle[Any/All]

@ jackr @lemmy.dbzer0.com

Posts

2
Comments

202
Joined

10 mo. ago

Profile picture drawn by Paws and Claws and licensed under the Creative Commons Attribution Sharealike 4.0 International license(cc-by-sa 4.0)
currently migrating my main account to anise@quokk.au

Jack Riddle[Any/All]

"Just ask ChatGPT" (Art by Shave_your_eyebrows)

"Just ask ChatGPT" (Art by Shave_your_eyebrows)

"Just ask ChatGPT" (Art by Shave_your_eyebrows)

Stubsack: weekly thread for sneers not worth an entire post, week ending 22nd February 2026

Stubsack: weekly thread for sneers not worth an entire post, week ending 22nd February 2026

Stubsack: weekly thread for sneers not worth an entire post, week ending 22nd February 2026

"Just ask ChatGPT" (Art by Shave_your_eyebrows)

"Just ask ChatGPT" (Art by Shave_your_eyebrows)

At least one of the 7 letters has still some relevance to my life.

Stubsack: weekly thread for sneers not worth an entire post, week ending 22nd February 2026

Stubsack: weekly thread for sneers not worth an entire post, week ending 22nd February 2026

feddit.org's Zionist bar problem: community ban(s) vote

#microblogmemes

gender rule

English Has Used the Word Milk for Plant Milks Since the Year 1200 Rule

gender rule

The carrot family

easy solution

feddit.org's Zionist bar problem: community ban(s) vote

feddit.org's Zionist bar problem: community ban(s) vote

The solution to AI lying is… wikipedia but worse, apparently

Found a pretty good blog post on our friends in the wild