kyphysics
After an inauspicious debut by Google last month (Feb. 8th), when Bard (Alphabet's rival chatbot to ChatGPT, from Microsoft-backed OpenAI) gave an incorrect answer to an astronomy-related question, Bard seems to have flopped again in public testing this past week:
https://fortune.com/2023/03/28/google-chatbot-bard-would-fail-sats-exam/
Fortune sourced practice SAT math questions from online learning resources and found that Bard got anywhere from 50% to 75% of them wrong—even when multiple-choice answers were provided.
Often Bard gave answers which were not even a multiple-choice option, though it sometimes got them correct when asked the same question again. . .
Bard’s first written language test with Fortune came back with around 30% correct answers, often needing to be asked the questions twice for the A.I. to understand.
Even when it was wrong, Bard’s tone is confident, frequently framing responses as: “The correct answer is”—which is a common feature of large language models.
The more Bard was asked language-based questions by Fortune—around 45 in total—the less frequently it struggled to understand or needed the question to be repeated.
On reading tests, Bard similarly performed better than it did in math—getting around half the answers correct on average.