
Navigating Online Survey Fraud: Lessons from a Study on Open Educational Resources in Spanish for Specific Purposes
By Christine Rickabaugh, Open Education Librarian, University of Arkansas; Dr. Diana Galarreta-Aima, Associate Professor of Spanish, James Madison University; Dr. Andrea Nate, Associate Professor of Spanish, University of North Alabama
DOI: https://www.doi.org/10.69732/MEQX4277
What started as a straightforward survey about teaching resources quickly became an unexpected lesson in online fraud
Introduction: The Promise and Peril of Online Surveys in Language Teaching Research
As Spanish faculty and an open education librarian, we were eager to learn how colleagues across the country use Open Educational Resources (OER) in Spanish for Specific Purposes (SSP) courses. Our survey seemed straightforward enough: SSP programs have been growing rapidly in U.S. universities over the past two decades (Martin et al., 2017), and we wanted to understand how faculty were incorporating freely available, openly licensed materials into their specialized Spanish courses, whether business Spanish, medical Spanish, or other professional applications.
Our initial expectations were modest but hopeful: we anticipated responses from perhaps 80 Spanish faculty who teach SSP courses, and we sought to gauge their understanding of OER, current usage patterns, and barriers to adoption. We designed what we thought was a straightforward survey hosted on Qualtrics, included a small incentive to encourage participation, and prepared for a typical academic data collection experience. What we got instead was an eye-opening lesson in the darker side of online research—and a story we feel compelled to share with our community.
Literature Review: The Double-Edged Sword of Digital Data Collection
Online surveys have become the backbone of educational research, particularly since the COVID-19 pandemic accelerated our migration to digital platforms. For language educators, they offer unique advantages: we can reach geographically dispersed colleagues, accommodate busy academic schedules, and collect data cost-effectively. SSP research in particular benefits from online methodologies because our community is relatively small and scattered across institutions (Gil de Montes Garín & Oliva Sanz, 2023).
However, recent studies paint an increasingly troubling picture of online survey integrity. A 2024 study analyzing 31 fraud detection strategies found that usable survey responses declined from 75% to 10% in recent years due to sophisticated AI-powered bots and fraudulent respondents (Pinzón et al., 2024). Even more concerning, research targeting specialized populations, precisely what we were doing with SSP faculty, shows fraud rates exceeding 80% (Bell & Gift, 2023). AI has made these problems worse: a recent study found that over a third of participants admitted to using ChatGPT to help answer survey questions (Zhang et al., 2025). As language teachers who use AI tools in our classrooms, we found ourselves dealing with AI-generated responses in our research, an ironic twist we had not anticipated. We’re not alone in facing these challenges. Other researchers have reported similar experiences: one COVID-19 study found that most of its responses were fake (Nur et al., 2023), and another received nearly 1,000 responses in a single hour, a red flag in itself (Wang et al., 2023).
Launching the Survey: When Enthusiasm Meets Reality
Our survey design followed standard academic practices. We created a comprehensive questionnaire covering faculty demographics, institutional contexts, current OER awareness and usage, barriers to adoption, and specific needs related to SSP instruction. We included questions about different SSP domains: business, healthcare, law enforcement, and social services, reflecting the diversity we see in our field (Gil de Montes Garín & Oliva Sanz, 2023).
We cast a wide net for participants, sharing our survey through professional organizations like ACTFL, posting in Spanish educator Facebook groups, reaching out via LinkedIn, and asking colleagues to help spread the word. Following standard practice, we offered a $10 Amazon gift card incentive, with winners selected from the first 10 completed responses. The survey launched on Monday, January 6, 2025, just after winter break, and within the first month we had 58 responses, an encouraging start. The excitement lasted only until we began examining the data more closely and conducting focus groups.
The Red Flags: When Something Seems Too Good to Be True
The problems became clear during our first focus group. We had invited four survey participants, selected based on availability indicated via a Google Form, to join a Microsoft Teams meeting on March 3, offering additional compensation for their time. The session raised red flags immediately. The meeting room was repeatedly accessed hours before the scheduled time. Once the session began, participants refused to turn on their cameras, and even after multiple prompts they hesitated or declined to verbally confirm their consent to participate in the discussion. When they did respond, answers were vague, nonsensical, or entirely unconnected to our questions. None would clarify their institutional affiliations or name the type of SSP course they claimed to teach. After twelve minutes of fruitless engagement, we ended the meeting early. The combination of unwillingness to engage, a lack of identifiable information, and nonsensical or evasive answers strongly suggested that the participants were not who they claimed to be.
Detection Techniques: Playing Digital Detective
Immediately following the failed focus group, our research team held an emergency meeting to assess the integrity of the overall dataset. We began by closely examining the metadata associated with each survey submission. Using Qualtrics’ built-in tools and external lookups, we reviewed IP addresses and performed manual internet searches for every respondent who had not used an institutional email address. It quickly became apparent that many survey entries were suspicious. A significant number of responses originated from outside the U.S., including several with IP addresses located in Nigeria, well outside our target population of U.S.-based faculty teaching Spanish for Specific Purposes. We adopted a systematic protocol: any response submitted from outside the United States was removed from the dataset. Likewise, if a respondent could not be verified as a faculty member at an educational institution through directory confirmation or a professional presence online, we excluded the record from analysis. Each removed survey was exported to PDF and securely archived in a protected cloud-based folder, ensuring a complete audit trail. In total, we deemed 23 of the 104 submissions inauthentic and removed them, leaving 81 validated responses in our final dataset. We also rescheduled focus group sessions exclusively with participants whose faculty credentials could be confirmed through reliable methods.
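For readers who prefer to script this kind of screening, the following is a minimal sketch of our exclusion protocol in Python with pandas. The column names are hypothetical (a country value derived from the platform’s geolocation metadata and a verified_faculty column filled in during manual directory checks); your platform’s export will use different names. The point is less the specific checks than the audit trail: every excluded record is written out before analysis.

```python
import pandas as pd

# Hypothetical export columns: "ResponseId", "country", "email", "verified_faculty".
df = pd.read_csv("survey_export.csv")

# Flag responses whose geolocation metadata places them outside the U.S.
outside_us = df["country"].str.upper() != "US"

# Flag respondents we could not confirm as faculty ("verified_faculty" is
# assumed to be a True/False column recorded during manual directory checks).
verified = df["verified_faculty"].fillna(False).astype(bool)
unverified = ~verified

removed = df[outside_us | unverified]
kept = df[~(outside_us | unverified)]

# Preserve an audit trail of everything excluded before analysis.
removed.to_csv("removed_responses_audit.csv", index=False)
kept.to_csv("validated_responses.csv", index=False)

print(f"Kept {len(kept)} of {len(df)} responses; removed {len(removed)}.")
```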
Looking closer at the data only confirmed our suspicions. We saw:
- Baffling email addresses: Instead of the typical .edu or familiar personal accounts, many used strings of random letters and numbers that didn’t pass the “smell test.”
- Batch submission patterns: In theory, a cluster of responses after a survey announcement isn’t suspicious. But when three surveys were completed and submitted in the same minute, we raised our eyebrows. The odds of that happening organically, especially among busy faculty, were slim to none. (A short screening sketch for these red flags follows this list.)
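As a rough illustration of how these two red flags can be checked programmatically, here is a sketch in Python with pandas. The column names (EndDate for the submission timestamp, email) are assumptions modeled on a typical survey export, and the “three in the same minute” threshold simply mirrors what we observed.

```python
import re
import pandas as pd

# Hypothetical columns: "EndDate" (submission timestamp) and "email".
df = pd.read_csv("survey_export.csv", parse_dates=["EndDate"])

# Batch submissions: three or more surveys finished within the same minute.
minute = df["EndDate"].dt.floor("min")
df["flag_batch"] = minute.groupby(minute).transform("size") >= 3

# Email smell test: no .edu domain and a local part that looks like a
# random string of letters and digits (e.g., "xk93jq2r@gmail.com").
random_like = re.compile(r"^[a-z]*\d+[a-z\d]*@", re.IGNORECASE)
not_edu = ~df["email"].str.endswith(".edu", na=False)
looks_random = df["email"].str.contains(random_like, na=True)  # missing email counts as suspicious
df["flag_email"] = not_edu & looks_random

print(df.loc[df["flag_batch"] | df["flag_email"], ["email", "EndDate"]])
```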
We struggled to balance preserving our original methodology with retaining as many legitimate responses as possible. Our primary safeguard was geographic restriction with IP verification: we limited participation to United States-based respondents, implemented IP geolocation checks, and then confirmed the identities of potential focus group participants through Google searches.
Broader Implications: What This Means for Language Technology Research
Our experience reflects broader threats to research integrity in language education. As we increasingly rely on digital tools for pedagogy and research, we must acknowledge that the same technologies creating opportunities are also creating vulnerabilities.
For SSP research specifically, our small, specialized community may be particularly vulnerable to fraud (Gordon et al., 2024). When fraudsters target surveys seeking “Spanish faculty” or “business language instructors,” they’re not just submitting random responses—they’re potentially skewing our understanding of field needs, resource usage, and pedagogical practices.
The rise of generative AI presents additional challenges. If language educators are using ChatGPT and similar tools in their teaching (Hellmich et al., 2024; Qu & Wu, 2024), we must assume bad actors are using the same technologies to generate plausible survey responses. A fraudster could potentially prompt an AI system: “Respond to this survey as if you’re a Spanish business professor at a U.S. university who uses Open Educational Resources.”
This has profound implications for evidence-based practice in our field. If fraudulent data contaminates research about OER adoption, technology integration, or pedagogical preferences, we risk making curricular and policy decisions based on false information.
Recommendations: Protecting Our Research Community
Drawing on emerging research about online survey fraud (Rodriguez & Oppenheimer, 2024; Zhang et al., 2022), we’d like to pay it forward with some practical tips for anyone in the research community venturing into online data collection:
Before you launch your survey:
- Budget real time for fraud detection: Plan on spending just as much time cleaning your data as collecting it—maybe even more.
- Know your audience, and your risks: Small or specialized populations (like Spanish for Specific Purposes faculty) are magnets for fraud, especially when incentives are on the line.
- Build in validation: Seed your survey with cross-checks, questions that let you spot inconsistencies or catch bots in the act. For example, ask the same question twice in different ways (e.g., “How long have you been teaching at the college level?” and “How many years of college teaching experience do you have?”). Add an attention-check question like “Please select ‘Strongly Agree’ for this question to show you’re paying attention.” Look for patterns like straight-lining (choosing the same answer down an entire scale). Ask a mandatory open-ended question that only someone in the field could answer meaningfully (bots will write a nonsensical answer). A minimal validation sketch appears after this list.
- Expect the unexpected: If you’re offering gift cards or similar incentives, brace yourself for a tsunami of responses, not all of them legitimate.
- Plan fraud detection from the start: decide on methods such as IP tracking, validation questions, and verification procedures before launch, and document them (see the IRB section below for protocol and consent language).
- Select a survey platform with advanced respondent tracking and fraud detecting capabilities: It’s important to choose a survey tool that allows for IP address logging or other identification methods, which can help flag suspicious patterns like rapid submissions or geographic outliers. Platforms like Qualtrics, SurveyMonkey, and LimeSurvey all offer some variation of IP tracking or metadata collection that can aid researchers in spotting potential fraudulent responses. Verify these capabilities before selecting a platform and launching any data collection. Be sure to comply with any privacy regulations when logging personal data such as IP addresses.
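The validation checks described above (paired questions, an attention check, and straight-lining) can be scored automatically once the data are exported. The sketch below assumes hypothetical column names (years_teaching, years_experience, attention_check, and Likert items likert_1 through likert_8); adjust to match your own instrument.

```python
import pandas as pd

# Hypothetical columns: paired questions "years_teaching" and "years_experience",
# an "attention_check" item, and a Likert scale stored as "likert_1" ... "likert_8".
df = pd.read_csv("survey_export.csv")

# Paired questions should roughly agree (here, within one year).
df["flag_inconsistent"] = (df["years_teaching"] - df["years_experience"]).abs() > 1

# Attention check: anything other than the instructed answer is a flag.
df["flag_attention"] = df["attention_check"].str.strip().str.lower() != "strongly agree"

# Straight-lining: identical answers across every item of the scale.
likert_cols = [f"likert_{i}" for i in range(1, 9)]
df["flag_straightline"] = df[likert_cols].nunique(axis=1) == 1

print(df[["flag_inconsistent", "flag_attention", "flag_straightline"]].sum())
```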
When designing your survey:
- Use domain-specific screening questions: Ask about classroom realities or OER licensing quirks that only real SSP instructors would know. Bots (and most fraudsters) will struggle.
- Make it clear, not easy for bots: Streamline questions for genuine faculty, but use wording and logic that throw off automated or inattentive responses. For example, ask “In the last academic year, how many Spanish for Specific Purposes (SSP) courses did you teach? If none, please enter ‘0’.” Real faculty will know their teaching load; bots may give numbers that don’t make sense (e.g., “200”). A plausibility-check sketch follows below.
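Extending the same idea, a short plausibility check can flag answers that no practicing instructor would give. The column names (ssp_courses_taught, ssp_context) and the thresholds below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical columns: the numeric item "ssp_courses_taught" and a
# mandatory open-ended item "ssp_context".
df = pd.read_csv("survey_export.csv")

# A real yearly teaching load falls in a narrow range; "200" does not.
courses = pd.to_numeric(df["ssp_courses_taught"], errors="coerce")
df["flag_load"] = courses.isna() | (courses < 0) | (courses > 12)

# Very short answers, or answers with none of the field's vocabulary,
# rarely come from SSP faculty.
keywords = ["spanish", "oer", "medical", "business", "syllabus", "course"]
text = df["ssp_context"].fillna("").str.lower()
df["flag_open_ended"] = (text.str.len() < 20) | ~text.str.contains("|".join(keywords))
```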
Technical and logistical safeguards:
- Add CAPTCHA and anti-bot checks: Tools like Google’s reCAPTCHA help, even if some savvy fraudsters can still sneak through.
- Delay your incentives: Don’t pay out right away. Let your verification process finish before distributing any rewards.
- Gift items, not gift cards: Consider compensating participants with gift items (or gift cards to specific stores, like Starbucks), rather than digital cash or generic gift cards, like Amazon.
- Verify institutional affiliation: Depending on the goals of your research, you may consider requiring institutional email addresses; they are far easier to check and, at the very least, slow down would-be fakers. Alternatively, consider a platform that lets you collect identifying information separately from the main survey. Qualtrics, SurveyMonkey, Jotform, and LimeSurvey offer features that (a) manage email invitations or institutional verification through a collector or login system while individual responses remain anonymous, (b) let you upload a list of institutional emails for survey distribution without ever seeing which individual gave which responses, or (c) separate personal data from response data through unique codes, where a code generated in the identity-verification portion is then entered in a separate portion that collects the survey responses (see the sketch after this list).
- Check IP addresses: Use your survey platform (or external tools) to flag or block submissions that don’t match your target geography or demographic.
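For option (c) above, the sketch below shows one way to issue unique access codes and later keep only responses that carry a valid code, so identity information never sits in the same file as the answers. File names and column names are hypothetical.

```python
import secrets
import pandas as pd

# Step 1: issue one hard-to-guess code per verified participant.
verified = pd.read_csv("verified_participants.csv")  # names and emails only
verified["access_code"] = [secrets.token_hex(4).upper() for _ in range(len(verified))]
verified.to_csv("codes_to_email.csv", index=False)

# Step 2 (later): keep only anonymous responses that carry an issued code.
# The code list and the response file are never merged on identity.
responses = pd.read_csv("survey_export.csv")
valid = responses["access_code"].isin(set(verified["access_code"]))
responses[valid].drop(columns=["access_code"]).to_csv("validated_responses.csv", index=False)
```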
During and After Data Collection: Staying One Step Ahead
Once your survey is live, keep your eyes wide open—things can go sideways quickly. Here’s what we learned about staying alert during and after data collection:
While your survey is running:
- Monitor response patterns in real time: Keep an eye on timestamps, response volumes, and other red flags. A sudden flood of submissions, especially at odd hours, might not be cause for celebration. (A monitoring sketch follows this list.)
- Set up alerts for strange activity: If your platform allows it, enable notifications for duplicate IPs or unusually fast completions.
- Be ready to pivot fast: Fraud can escalate quickly. Adjust your settings, pause incentive distribution, or refine your eligibility criteria as needed.
- Keep a paper (or digital) trail: Honor your future self by documenting decisions, flagged responses, and steps you take to protect your data.
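A lightweight way to monitor a live survey is to run a small script against a periodic export. The sketch below flags duplicate IP addresses and implausibly fast completions; the column names follow Qualtrics-style exports (other platforms differ), and the three-minute cutoff is an assumption you should calibrate by piloting your own survey.

```python
import pandas as pd

# Hypothetical Qualtrics-style columns: "IPAddress" and "Duration (in seconds)".
df = pd.read_csv("latest_export.csv")

# Duplicate IP addresses across supposedly different respondents.
dupes = df[df.duplicated("IPAddress", keep=False)]

# Implausibly fast completions (threshold is an assumption, not a standard).
fast = df[df["Duration (in seconds)"] < 180]

if len(dupes) or len(fast):
    print(f"Review needed: {len(dupes)} duplicate-IP rows, {len(fast)} fast completions.")
```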
After the collection is done:
- Assume there’s fraud in the mix: Don’t take your dataset at face value. If it seems too good to be true, it probably is.
- Create a validation workflow: Whether it’s verifying emails, cross-checking demographic responses, or analyzing IP data, have a plan and apply it methodically (a minimal workflow sketch follows this list).
- Ask for help if needed: If you’re unsure what to do with a suspicious dataset, reach out to colleagues in cybersecurity, IT, or research design. You’re not alone in this.
- Be transparent in your reporting: Include a brief section in your write-up or publication about your fraud-screening methods—even a few lines can help normalize this kind of rigor.
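To make that workflow methodical and reproducible, one option is to combine whatever flags you have computed into a single disposition per response and save the result as a screening log. The cutoffs below (two or more flags to exclude, one to review) are assumptions, not a standard; the value lies in recording the decisions.

```python
import pandas as pd

# Assumes a file where earlier checks added Boolean columns named "flag_*".
df = pd.read_csv("flagged_export.csv")
flag_cols = [c for c in df.columns if c.startswith("flag_")]
df["n_flags"] = df[flag_cols].sum(axis=1)

# Assumed rule of thumb: 0 flags keep, 1 flag manual review, 2+ exclude.
df["disposition"] = pd.cut(
    df["n_flags"],
    bins=[-1, 0, 1, float("inf")],
    labels=["keep", "review", "exclude"],
)
df.to_csv("screening_log.csv", index=False)  # reproducible record of decisions
print(df["disposition"].value_counts())
```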
For the broader field:
- Talk about it—even when it’s awkward: Sharing our missteps helps others avoid them. It’s part of good scholarly practice.
- Build shared resources: We need checklists, templates, and case studies for identifying fraud—why not create them together?
- Think about collaborative data collection: Coordinating with other researchers (especially in small fields like SSP) can help mitigate fraud and increase transparency.
- Push for better tools: Our platforms need to do more. Let’s advocate for built-in fraud detection features, rather than relying on DIY strategies.
Working with Institutional Review Boards (IRB):
- Include fraud detection in your protocol: Document your planned fraud detection methods, like IP tracking, validation questions, and verification procedures, in your IRB application rather than making edits later.
- Update your consent language: Include specific language about fraud detection and participant verification. Consider adding statements like: “We may verify your eligibility and contact information to ensure data integrity. Participants who cannot be verified as meeting study criteria will not receive compensation, and their data may be excluded from analysis” (Leibbrand, 2021).
- Plan for non-payment scenarios: IRBs now expect researchers to clearly outline conditions under which participants won’t be compensated due to suspected fraud. State this upfront in your consent forms (Institutional Review Board, 2024).
Conclusion: Lessons Learned and the Path Forward
Our OER in SSP survey taught us uncomfortable truths about contemporary online research. What began as an innocent inquiry into resource-sharing practices became a crash course in digital fraud detection. While frustrating, this experience ultimately made us better researchers and, we hope, can help our colleagues avoid similar pitfalls.
We want to emphasize that we’re not advocating against online surveys—they remain essential tools for language education research. However, we must approach them with the same methodological rigor we bring to classroom research, recognizing that data integrity requires active protection.
For the SSP community specifically, our experience highlights both our vulnerability as a small, specialized field and our resilience. Despite the challenges, we eventually collected valid data that informed our understanding of OER adoption patterns. More importantly, we learned that sharing methodological challenges strengthens our research community.
We encourage others to adopt proactive fraud prevention measures, but more importantly, to share their own experiences—both successes and failures—in navigating digital research challenges. Only through open dialogue about these issues can we maintain the integrity that makes our research valuable to language educators worldwide.
The digital transformation of language education brings tremendous opportunities, but it also requires us to develop new forms of methodological sophistication. Our survey fraud experience was initially discouraging, but ultimately it reminded us why rigorous research practices matter: because the decisions we make based on our findings affect real students in real classrooms.
Let’s continue this conversation. Share your own experiences with online research challenges, and let’s work together to protect the integrity of language education research in an increasingly digital world.
References
Bell, A. M., & Gift, T. (2023). Fraud in online surveys: Evidence from a nonprobability, subpopulation sample. Journal of Experimental Political Science, 10(1), 148–153. https://doi.org/10.1017/XPS.2022.8
Gil de Montes Garín, L., & Oliva Sanz, C. (2023). Study of the course syllabuses for the training of teachers of Spanish for specific purposes in Spanish universities. Ibérica, (45), 267–288. https://doi.org/10.17398/2340-2784.45.267
Gordon, J. H., Fujinaga-Gordon, K., & Sherwin, C. (2024). Fraudulent online survey respondents may disproportionately threaten validity of research in small target populations. Health Expectations, 27(3), e14099. https://doi.org/10.1111/hex.14099
Hellmich, E. A., Vinall, K., Brandt, Z. M., Chen, S., & Sparks, M. M. (2024). ChatGPT in language education: Centering learner voices. Technology in Language Teaching & Learning, 6(3), 1741. https://doi.org/10.29140/tltl.v6n3.1741
Institutional Review Board, University of Wisconsin–Milwaukee. (2024, January). Tip sheet on preventing fraudulent responses and bots in online studies. University of Wisconsin–Milwaukee. https://uwm.edu/irb/wp-content/uploads/sites/127/2024/01/Tip-Sheet-on-Preventing-Bots.pdf
Leibbrand, C. (2021, April 19). CSDE science core tips: The challenges of survey fraud and tools for combating fraud. Center for Studies in Demography and Ecology. https://csde.washington.edu/news-events/identifying-and-preventing-fraud-in-survey-research-lessons-from-someone-who-has-really-been-there/
Nur, A. A., Leibbrand, C., Curran, S. R., Votruba-Drzal, E., & Gibson-Davis, C. (2023). Managing and minimizing online survey questionnaire fraud: Lessons from the Triple C project. International Journal of Social Research Methodology, 27(5), 613–619. https://doi.org/10.1080/13645579.2023.2229651
Pinzón, N., Koundinya, V., Galt, R. E., Dowling, W. O., Baukloh, M., C., N., Schohr, T., Roche, L. M., Ikendi, S., Cooper, M., Parker, L. E., & Pathak, T. B. (2024). AI-powered fraud and the erosion of online survey integrity: An analysis of 31 fraud detection strategies. Frontiers in Research Metrics and Analytics, 9, 1432774. https://doi.org/10.3389/frma.2024.1432774
Qu, K., & Wu, X. (2024). ChatGPT as a CALL tool in language education: A study of hedonic motivation adoption models in English learning environments. Education and Information Technologies, 29, 19471–19503. https://doi.org/10.1007/s10639-024-12598-y
Rodriguez, C., & Oppenheimer, D. M. (2024). Creating a bottleneck for malicious AI: Psychological methods for bot detection. Behavior Research Methods, 56, 6258–6275. https://doi.org/10.3758/s13428-024-02357-9
Wang, J., Calderon, G., Hager, E. R., Edwards, L. V., Berry, A. A., Liu, Y., Dinh, J., Summers, A. C., Connor, K. A., Collins, M. E., Prichett, L., Marshall, B. R., & Johnson, S. B. (2023). Identifying and preventing fraudulent responses in online public health surveys: Lessons learned during the COVID-19 pandemic. PLOS Global Public Health, 3(8), e0001452. https://doi.org/10.1371/journal.pgph.0001452
Zhang, S., Xu, J., & Alvero, A. (2025). Generative AI meets open-ended survey responses: Research participant use of AI and homogenization. Sociological Methods & Research, 54(3), 1197–1242. https://doi.org/10.1177/00491241251327130
Zhang, Z., Zhu, S., Mink, J., Xiong, A., Song, L., & Wang, G. (2022). Beyond bot detection: Combating fraudulent online survey takers. In Proceedings of the ACM Web Conference 2022 (WWW ’22) (pp. 699–709). Association for Computing Machinery. https://doi.org/10.1145/3485447.3512230