“There are things known and there are things unknown and in between are the doors of perception.” — Aldous Huxley
I’m Huxley Westemeier (’26), and welcome to “The Sift,” a weekly opinion column focused on the impacts and implications of new technologies.
______________________________________________________
As finals week approaches, I’m sure more students will rely on AI-powered services like Quizlet or ChatGPT to help with exam preparation. AI is easier to access than ever, especially after the public release of ChatGPT Search as a Chrome extension on Halloween 2024. I understand: it might seem easier or more efficient to ask ChatGPT or another service to generate an outline or a few ideas for a final essay, or to help you understand some math homework. And with the junior U.S. History Research Paper coming up in January and papers in various classes on the horizon, it might seem even more efficient to use ChatGPT to find a few sources to bolster your argument. After all, now that it can search the internet, wouldn’t it be the best drafting and source-finding chatbot ever? I’m here to explain why using ChatGPT as a search engine might not be a great idea, especially in academic contexts.
Any good source-finding tool, whether an AI like ChatGPT or a database like JSTOR, must be able to cite and credit the source you choose accurately. On a database like JSTOR, which has access to its entire library of content, it’s easy to press a button above the PDF you’re reading and format a citation in MLA or Chicago as needed. But because ChatGPT searches the open internet instead of staying within a set collection, it quickly runs into issues. Certain news organizations, like The New York Times, use rules on their websites that ban web scrapers and chatbots like ChatGPT from accessing their pages. Paywalls, account requirements, and other blocking techniques can also cause the chatbot to lose access. This means that while ChatGPT might be able to find the article’s name, it often “hallucinates” the actual content: it either makes something up entirely or quotes the wrong source.
I picked random articles from Wired, The New York Times, The Verge, and OpenAI’s research paper website and asked ChatGPT to provide the original article and a summary. I also included a recent article from The Sift.
- Wired: Identified it as Wired but gave the wrong link and an inaccurate summary.
- The New York Times: Identified it as Popular Science, gave the wrong link, and summarized a random Popular Science article.
- The Verge: Identified it as Business Insider, gave the wrong link, and summarized the linked Business Insider article.
- OpenAI research paper: Identified it correctly but linked to commentary by Surviving the Singularity (a commentary page on the harms of AI) while giving an accurate summary.
- The Rubicon: Identified it as The Rubicon, gave the correct link, and provided an accurate summary.
These results were surprising to me.
My article from The Rubicon was the only example of ChatGPT Search accurately identifying, linking, and summarizing a source. This is likely because The Rubicon doesn’t have an AI blocker or paywall like the other sites. A website creator can publish a permission rule, in a robots.txt file or in the page’s HTML, that tells web crawlers like ChatGPT’s to skip the page, which likely explains why ChatGPT wasn’t able to access The New York Times or The Verge. The other examples either misattributed the article to another outlet (NYT to Popular Science, The Verge to Business Insider) or gave misleading information.
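For the curious, the most common opt-out mechanism is a short plain-text file called robots.txt placed at the root of a website. The snippet below is a minimal sketch, not any particular site’s actual file: OpenAI publicly documents “GPTBot” as one of the user-agent names for its crawlers, but the exact rules each site publishes vary.

```
# robots.txt — minimal sketch of an AI-crawler opt-out (hypothetical example).
# "GPTBot" is a crawler user agent OpenAI documents; sites may list others too.
User-agent: GPTBot
Disallow: /
```

Well-behaved crawlers check this file before fetching pages, so a blanket Disallow rule like this is one reason a chatbot can sometimes name an article that it cannot actually read.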
To all SPA juniors beginning to draft their research papers in January, please don’t use AI to find sources. A quick search on an actual academic database like JSTOR, or through the vast array of resources available to SPA students in the library, would be far more reliable.
The takeaway: ChatGPT is unreliable at identifying and citing sources.