In the world of market research, data plays a pivotal role. It empowers businesses to connect with consumers, offer valuable products and services, and gain a competitive edge. However, not all collected data is valuable: inaccurate, incomplete, inconsistent, or skewed information can lead to poor decision-making. Data quality has long been a concern in our industry, and the emergence of Artificial Intelligence (AI) has heightened those concerns, particularly for open-ended responses. At the same time, AI has introduced new potential for identifying and mitigating poor-quality open-ended responses.
Pre-survey Checks
To prevent fraudsters and eliminate duplicates in real time
At Canadian Viewpoint, we recognize these challenges, and we have developed a comprehensive process and set of strategies for detecting and preventing fraud. To maintain the integrity of our data, we’ve implemented strict standards for verifying our panelists’ names and addresses, including the issuance of physical incentive cheques to their Canadian addresses. Additionally, we employ bot detection techniques to flag responses with a high likelihood of being generated by automated bots. Our use of digital fingerprinting ensures a unique identity for each computer, preventing users from participating in the same survey multiple times. We apply GEO IP checks and CAPTCHA to ensure that data quality is consistently upheld to the highest standards.
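As an illustration of how digital fingerprinting can prevent repeat participation, the sketch below hashes a handful of browser and device attributes into a stable identifier and rejects entries that have been seen before. This is a minimal, hypothetical example; the attribute names and the in-memory store are assumptions, not a description of our production system, which combines many more signals.

```python
import hashlib

# Fingerprints already seen in this survey (a real system would
# persist these per-survey, e.g. in a database).
seen: set[str] = set()

def device_fingerprint(user_agent: str, screen: str,
                       timezone: str, language: str) -> str:
    """Combine browser/device attributes into a stable SHA-256 hash."""
    raw = "|".join([user_agent, screen, timezone, language])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def is_duplicate(fingerprint: str) -> bool:
    """Return True if this device has already entered the survey."""
    if fingerprint in seen:
        return True
    seen.add(fingerprint)
    return False
```

In practice a fingerprint built from only four attributes would collide too often; real implementations mix in many more signals (fonts, canvas rendering, plugins) and weigh them probabilistically.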
In-survey Checks
To enhance the quality of collected open-ended responses
Over the years, we’ve come to realize that in our efforts to maintain data integrity, we’re not just combating fraudsters; we’re also facing a growing trend of survey participants relying on AI to generate their open-ended responses. We have deployed methods to accurately identify AI-generated, internet-sourced, or nonsensical participant responses. In-survey quality control is vital for upholding data quality standards, and we achieve it through a combination of NLP-based approaches, manual validation, and survey programming methods.
Clean Survey Data
We maintain data integrity by monitoring respondent engagement and behaviour throughout the study. This includes a series of checks on open-ended answers to identify inattentive and fraudulent respondents and remove them from the dataset.
Tracking Time Stamps
We record the duration of each response, enabling us to measure how long respondents spend on open-ended questions and to identify ‘speeders’: participants who move through the questionnaire at an unusually rapid pace.
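A simple way to operationalize a speeder check, sketched below under the assumption that per-respondent completion times are available in seconds, is to flag anyone whose time falls below a fraction of the median. The 30% cutoff is an illustrative assumption, not a fixed industry rule.

```python
from statistics import median

def flag_speeders(durations: dict[str, float],
                  threshold: float = 0.3) -> list[str]:
    """Flag respondents whose completion time is below a fraction
    (default 30%) of the median time across all respondents."""
    med = median(durations.values())
    cutoff = med * threshold
    return [rid for rid, secs in durations.items() if secs < cutoff]
```

Using the median rather than the mean keeps the cutoff robust to a few very slow respondents who would otherwise inflate the baseline.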
Detecting Copy-and-Paste Answers
We implement a text classification algorithm to identify copy-and-paste responses.
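One lightweight signal that can feed such an algorithm, shown here as a hypothetical heuristic rather than our actual classifier, is the ratio of answer length to time spent on the question: text appearing faster than any plausible typing speed was almost certainly pasted. The 8-characters-per-second limit is an assumed figure for illustration.

```python
def likely_pasted(answer: str, seconds_on_question: float,
                  max_cps: float = 8.0) -> bool:
    """Heuristic: an answer 'typed' faster than max_cps characters
    per second was most likely pasted into the text box."""
    if seconds_on_question <= 0:
        return True  # no measurable typing time at all
    return len(answer) / seconds_on_question > max_cps
```

A production classifier would combine signals like this with front-end paste events and text features, since a slow paste of a short answer would evade a timing check alone.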
Filtering Out Gibberish Responses
We carefully assess responses to open-ended questions, promptly removing respondents who provide gibberish, nonsensical, profane, or offensive content. We also identify and remove straightliners: respondents who rush through a survey by following the same pattern or selecting the same answer for every question.
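Both checks can be approximated with simple rules, sketched below purely for illustration. The gibberish test here uses a vowel-ratio heuristic (keyboard mash like "sdfgsdfg" has almost no vowels), and the straightliner test flags grids where every item got the same rating; the thresholds are assumptions, and real screening layers far more sophisticated NLP on top.

```python
import re

def looks_like_gibberish(text: str) -> bool:
    """Flag answers with no recognizable word structure: too short,
    a single repeated character, or a vowel-free keyboard mash."""
    cleaned = re.sub(r"[^a-z]", "", text.lower())
    if len(cleaned) < 3:
        return True
    if len(set(cleaned)) == 1:           # e.g. "aaaaaa"
        return True
    vowels = sum(ch in "aeiou" for ch in cleaned)
    return vowels / len(cleaned) < 0.2   # e.g. "sdfgsdfg"

def is_straightliner(grid_answers: list[int]) -> bool:
    """Flag respondents who pick the same option for every grid item."""
    return len(grid_answers) > 1 and len(set(grid_answers)) == 1
```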
Identifying AI-Generated Duplicate Responses
To maintain data authenticity, we employ an algorithm that detects and addresses duplicate responses generated by artificial intelligence.
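One common building block for this kind of detection, shown here as an illustrative sketch rather than our proprietary algorithm, is near-duplicate matching: AI-generated answers reused across accounts tend to be almost identical after normalization. The example below compares every pair of responses with Python's standard-library `difflib.SequenceMatcher`; the 0.9 similarity threshold is an assumption.

```python
from difflib import SequenceMatcher
from itertools import combinations

def near_duplicates(responses: dict[str, str],
                    threshold: float = 0.9) -> list[tuple[str, str]]:
    """Return pairs of respondent IDs whose open-ended answers are
    near-identical after case/whitespace normalization."""
    def normalize(text: str) -> str:
        return " ".join(text.lower().split())

    pairs = []
    for (id_a, a), (id_b, b) in combinations(responses.items(), 2):
        if SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold:
            pairs.append((id_a, id_b))
    return pairs
```

Pairwise comparison is quadratic in the number of respondents, so at scale this is typically replaced by hashing or embedding-based approaches, but the underlying idea is the same.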
We have established and consistently upheld a quality control process for over 40 years. Our clients place their trust in us to deliver reliable data while fostering enduring and dependable relationships with our panelists, all the while upholding privacy policies. Our data quality process involves a mix of automated checks, anti-fraud technology, and human inspections. By using both machine learning and human oversight, we make sure that our collected data maintains the utmost quality throughout the entire respondent cycle, from registration to in-survey completion. As part of this process, our project managers diligently review the data throughout the project’s lifecycle, examining open-ended responses and removing any data that appears to be fraudulent.