Python Musings #4: Why You Shouldn’t Use Google Forms for Collecting Data: Simulating Spam Attacks with Selenium
Want to share your content on python-bloggers? click here.
It’s no secret: Google Forms is one of the most popular tools for making surveys, sign-up lists, contact forms and the like. The forms can be custom designed, and the data is stored and formatted quite nicely. But what you may not know is that with a little bit of Selenium and some nonsense data, anyone can sully your data collection with junk responses. You can waste a lot of time cleaning the data or, even worse, end up with data that is totally unusable.
In this blog post I’m going to give you the code for a simple spam attack on Google Forms with Selenium, along with a few strategies that can help you avoid such attacks should you continue to use Google Forms.
Disclaimer: The following is purely demonstrative and meant solely for educational purposes. Please don’t be lame and actually do this to anyone.
Step 1: Setting up Selenium.
I’ve said it a few times already, but in case you don’t know: learning how to use Selenium only takes a little more than 10 minutes. So if you haven’t already, please check out TheCodex’s video on how to use Selenium with ChromeDriver.
(I haven’t been sponsored- this is just a great resource for knowing the basics)
Step 2: The Form
This is the sample form that we will be using. It just asks for a name and some standard contact info. Sure, it’s possible to spam this form by manually submitting it over and over, but who has the stamina to do that?
Before we go on to the actual script, let’s find out how to get some phony contact info.
Step 3: Getting Phony Data
You might think that you need to be pretty creative to make something convincing. Thankfully the internet is there to take care of that!
If you do a quick search for fake name, email, address and phone number generators, you will find dozens of websites that will do that for you; I’ll leave that to you to figure out.
Step 4: The code
The script essentially finds the XPath of each form field, sends the matching piece of data to it, and repeats this for every record with a single for loop.
I put a sleep after loading the page and another before submitting in each iteration, so that there are no issues with loading the site or submitting the data.
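from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By


def fill_my_form(names, emails, addresses, numbers):
    # Submit the form once per record, using a fresh browser session each time.
    for i in range(len(names)):
        # Selenium 4 takes the driver path through a Service object.
        browser = webdriver.Chrome(service=Service("PathtoChromeDriver"))
        browser.get('link to form')
        sleep(3)  # give the form time to load

        # Locate each input field by its XPath (placeholders: fill in your own).
        # find_element(By.XPATH, ...) replaces find_element_by_xpath, which was removed in Selenium 4.
        nam = browser.find_element(By.XPATH, 'xpath for name field')
        emal = browser.find_element(By.XPATH, 'xpath for email field')
        add = browser.find_element(By.XPATH, 'xpath for address field')
        pn = browser.find_element(By.XPATH, '//*[@id="mG61Hd"]/div[2]/div/div[2]/div[4]/div/div/div[2]/div/div[1]/div/div[1]/input')

        # Type the i-th record into the fields.
        nam.send_keys(names[i])
        emal.send_keys(emails[i])
        add.send_keys(addresses[i])
        pn.send_keys(numbers[i])

        submit = browser.find_element(By.XPATH, '//*[@id="mG61Hd"]/div[2]/div/div[3]/div[1]/div/div/span/span')
        sleep(4)  # pause before submitting
        submit.click()
        browser.quit()  # close the window and stop the driver process


# All four lists must have the same length.
random_names = ["List of Random Names"]
random_emails = ["needs to be the same length as random_names"]
random_addresses = ["ibid."]
rand_numbers = ["ibid."]

fill_my_form(random_names, random_emails, random_addresses, rand_numbers)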
After running this code, our phony data has been entered. Easy as that.
Still want to use Google Forms? Remember to implement these workarounds.
Even after seeing this vulnerability in Google Forms, you may still want to use it (hey, I still want to use it too!).
With this in mind, here’s a list of some best practices for securing your form against an attack that can be put together in 10 minutes.
1. Limit Responses to one per user
Limiting your responses to one per user is a great way to secure your form from a spam attack.
The only con is that your target audience must have Google accounts to sign in. If they use another service provider and have no account associated with Google, then you’ve lost a response.
2. Shuffle the Question Order
Shuffling the question order will change the structure of your form slightly and will help thwart an automated attack. It will either cause the attack to crash or will put the wrong responses in the wrong fields, which can then be picked out using some Regex.
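As a rough illustration of the regex clean-up mentioned above, here is a minimal sketch that flags responses whose values do not fit the field they landed in. The field names (`name`, `email`, `phone`), both patterns and the sample rows are assumptions made up for this example; tune them to whatever your own form collects.

```python
import re

# Hypothetical patterns: deliberately loose checks, not full validators.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,20}$")


def misplaced_fields(row):
    """Return the names of fields whose value does not fit the expected shape."""
    problems = []
    if not EMAIL_RE.match(row["email"]):
        problems.append("email")
    if not PHONE_RE.match(row["phone"]):
        problems.append("phone")
    # A "name" that looks like an email or a phone number probably landed in the wrong box.
    if EMAIL_RE.match(row["name"]) or PHONE_RE.match(row["name"]):
        problems.append("name")
    return problems


# Made-up sample responses: the second one has its answers in the wrong fields.
responses = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": "555-010-0199"},
    {"name": "bob@example.com", "email": "12 Main Street", "phone": "Bob Smith"},
]

# Keep only the rows with at least one suspicious field for manual review.
flagged = [row for row in responses if misplaced_fields(row)]
```

Rows that end up in `flagged` still deserve a manual look before deleting them, since loose patterns like these can also trip on unusual but genuine answers.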
More Best Practices
There are other methods to protect against spam attacks, such as making pseudo-CAPTCHAs and adding password protection. You can check out xFanatical’s blog on the topic, where he goes in depth on how best to secure your forms.
Conclusion
Selenium is a powerful tool for web automation that is openly available and very easy to learn and use.
With that in mind, it is important to know what can be done with tools like this and what potential threats exist.
Hope you enjoyed this blog! I’ll see you next time!