ianhg Posted January 5, 2013

Hi,

Sorry if this is a stupid question, but I am trying to stop users entering URLs into a text area on a form. I am using this code:

    //list of bad values
    $very_bad = array('to:','cc:','bcc:','content-type:','mime-version:','multipart-mixed:','content-transfer-encoding:');
    //if any of the bad strings are in submitted value return an empty string
    foreach ($very_bad as $v) {
        if (stripos($value, $v) !== false) return '';
    }

Can I add the URL fragments to the array, like this?

    $very_bad = array('to:','cc:','bcc:','content-type:','mime-version:','multipart-mixed:','content-transfer-encoding:','www.','/url]','http://','https://');
    //if any of the bad strings are in submitted value return an empty string
    foreach ($very_bad as $v) {
        if (stripos($value, $v) !== false) return '';
    }

If so, would this work?

Thanks

HartleySan Posted January 6, 2013

It would and wouldn't work, if you catch my drift. For example, let's say that a user entered the following URL:

    http://www.example.com/

If you ran it through your filter, you'd be left with the following:

    example.com/

I imagine that you don't want that either though, in which case, you'll need to use a regex to get rid of the whole line, beyond just the "http://" and "www." parts.

This could get a bit tricky though, as stuff that shouldn't be erased will be. For example, "www." will also apply to the following:

    "What a cute kitty cat. Awwwwwwww."

Just some food for thought.

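To make the false-positive point concrete, here is a minimal sketch (not from the thread). The helper name `contains_url()` and its pattern are hypothetical, chosen only for illustration, and the pattern is far from a complete URL matcher. It compares the plain substring test with a regex that requires either a scheme or a "www." at a word boundary followed by something domain-like. Note also that the loop as posted returns an empty string for the whole field as soon as any listed substring is found, so a single hit blanks the entire value.

```php
<?php
// Hypothetical helper for illustration only; not part of the original form code.
function contains_url(string $text): bool
{
    // An explicit scheme, or "www." at a word boundary, followed by a
    // dotted, domain-like token. "~" is used as the delimiter so "/" needs no escaping.
    $pattern = '~\b(?:https?://|www\.)[a-z0-9-]+(?:\.[a-z0-9-]+)+~i';
    return preg_match($pattern, $text) === 1;
}

$comment = 'What a cute kitty cat. Awwwwwwww.';

// The substring test from the $very_bad loop flags this harmless comment.
var_dump(stripos($comment, 'www.') !== false);

// The word-boundary regex does not, but still catches a real URL.
var_dump(contains_url($comment));
var_dump(contains_url('See http://www.example.com/ for details'));
```

The trade-off is the usual one: a stricter pattern means fewer false positives, but bare domains such as "example.com" would slip through this sketch.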
ianhg (Author) Posted January 6, 2013

Thanks HartleySan for the reply, much appreciated. I see where you are coming from. So I guess the regex would need to be done separately and not included in the $very_bad array. Thanks again, I will do a bit more research. I am very new to all this.

ianhg (Author) Posted January 21, 2013

I am still struggling with this. But I have just noticed that even though the input has been scrubbed when I use echo, the data sent via email has not been scrubbed. Can anyone tell me what I have done wrong or missed? Thanks.

Code for the email:

    //create body
    $body = "Name: {$scrubbed['contact_name']} \n"
          . "Address: {$scrubbed['address1']} \n"
          . "Address: {$scrubbed['address2']} \n"
          . "City/Town:{$scrubbed['town_city']} \n"
          . "County: {$scrubbed['county']} \n"
          . "Post Code: {$scrubbed['post_code']} \n"
          . "email: {$scrubbed['email_address']} \n"
          . "Repeat email:{$scrubbed['repeat_email']} \n"
          . "Telephone: {$scrubbed['telephone_no']} \n"
          . "Comments: {$scrubbed['additional_information']}";
    $body = wordwrap($body, 170);

Larry Posted January 21, 2013

There's nothing obviously wrong with the code you've posted. It depends upon where $scrubbed comes from.

ianhg (Author) Posted January 22, 2013

Quote (Larry): "There's nothing obviously wrong with the code you've posted. It depends upon where $scrubbed comes from."

Thanks Larry. Basically I have been trying to stop spammers entering URLs into the input fields on a web form. That is still ongoing at the moment (so any pointers would be appreciated), but while looking into this I found that although the contact form removes some bad stuff when I echo the results to a thank-you page, the email body still contains HTML tags that had been removed. So I must be missing something. I am using HTML5 on the site. This is the full PHP code:

    //check for form submission
    if (isset($_POST['submitted'])) {

        function spam_scrubber($value) {
            //list of bad values
            $very_bad = array('to:','cc:','bcc:','content-type:','mime-version:','multipart-mixed:','content-transfer-encoding:');
            //if any of the bad strings are in submitted value return an empty string
            foreach ($very_bad as $v) {
                if (stripos($value, $v) !== false) return '';
            }
            //replace any newline characters with spaces
            $value = str_replace(array("\r","\n","%0a","%0d"), ' ', $value);
            return trim($value);
        } //end of spam scrubber function

        //clean form data
        $scrubbed = array_map('spam_scrubber', $_POST);

        // form validation
        if (!empty($scrubbed['RadioGroup1']) && !empty($scrubbed['contact_name'])
            && !empty($scrubbed['company_name']) && !empty($scrubbed['address1'])
            && !empty($scrubbed['town_city']) && !empty($scrubbed['county'])
            && !empty($scrubbed['post_code']) && !empty($scrubbed['email_address'])
            && !empty($scrubbed['repeat_email']) && !empty($scrubbed['telephone_no'])) {

            //create body
            $body = "Market Sector:{$scrubbed['RadioGroup1']} \n"
                  . "Name: {$scrubbed['contact_name']} \n"
                  . "Company:{$scrubbed['company_name']} \n"
                  . "Address: {$scrubbed['address1']} \n"
                  . "Address: {$scrubbed['address2']} \n"
                  . "City/Town:{$scrubbed['town_city']} \n"
                  . "County: {$scrubbed['county']} \n"
                  . "Post Code: {$scrubbed['post_code']} \n"
                  . "email: {$scrubbed['email_address']} \n"
                  . "Repeat email:{$scrubbed['repeat_email']} \n"
                  . "Telephone: {$scrubbed['telephone_no']} \n"
                  . "Comments: {$scrubbed['additional_information']}";
            $body = wordwrap($body, 200);
            mail($mailuser, 'Contact Form from Aerospace UK', $body, "From:{$scrubbed['email_address']}");

            // Clear $_POST (so that the form's not sticky):
            $_POST = array();
        } else {
            echo "<h3>Sorry. You did not properly fill out the form. Please try again.</h3>";
        }
    }

    // *** This is where you would post a comment to inform visitor of the data sent, etc ***
    echo "Your information input has been sent <br><br>";
    echo "This is what you sent <br><br>";
    echo "Market Sector:\"" . $scrubbed["RadioGroup1"] . "\" <br>";
    echo "Your Name: \"" . $scrubbed["contact_name"] . "\" <br>";
    echo "Your Company: \"" . $scrubbed["company_name"] . "\" <br>";
    echo "Your Address: \"" . $scrubbed["address1"] . "\" <br>";
    echo "Your Address: \"" . $scrubbed["address2"] . "\" <br>";
    echo "Your Town/City: \"" . $scrubbed["town_city"] . "\" <br>";
    echo "Your County: \"" . $scrubbed["county"] . "\" <br>";
    echo "Your Post Code: \"" . $scrubbed["post_code"] . "\" <br>";
    echo "Your email: \"" . $scrubbed["email_address"] . "\" <br>";
    echo "Repeat email: \"" . $scrubbed["repeat_email"] . "\" <br>";
    echo "Your Telephone: \"" . $scrubbed["telephone_no"] . "\" <br>";
    echo "Additional Information Requested: \"" . $scrubbed["additional_information"] . "\" <br>";

ianhg (Author) Posted January 22, 2013

Ah, maybe this is in the wrong place:

    // Clear $_POST (so that the form's not sticky):
    $_POST = array();

Larry Posted January 23, 2013

I don't see anything in your code that would remove HTML tags.

ianhg (Author) Posted January 23, 2013

Quote (Larry): "I don't see anything in your code that would remove HTML tags."

Hi Larry,

Sorry, but I don't think I explained myself very well. The code above does not attempt to remove URLs; I am still working on that aspect. I posted the code because I noticed that spam_scrubber was removing bad stuff when I used echo, but it's not removing bad stuff from the emails sent. I was wondering if I have something in the wrong place. Sorry to be a pain. Thanks for the replies.

ianhg (Author) Posted January 27, 2013

Hi, I have added this code:

    //strip_tags
    foreach ($_POST as $key => $value) {
        $_POST[$key] = is_array($key) ? $_POST[$key] : strip_tags($_POST[$key]);
    }
    //htmlentities
    foreach ($_POST as $key => $value) {
        $_POST[$key] = is_array($key) ? $_POST[$key] : htmlentities($_POST[$key]);
    }

which appears to be working fine. Can I use something similar to check if any URLs have been entered into text fields? So it would check $_POST and remove any URLs posted before being passed on to spam_scrubber? Sorry if the terminology is not correct. Thanks for the replies so far.

Larry Posted January 28, 2013

First of all, you don't need to use both. In fact, htmlentities() will have very little left to do after you've already done strip_tags(). To check if URLs had been entered, you would have to use a regular expression. Instead of doing these things before spam_scrubber(), I would add them to spam_scrubber().

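As a rough sketch of what folding those steps into spam_scrubber() might look like (an illustration, not code from the thread): the function body below reuses the $very_bad list and newline handling already posted, while the is_array() guard, the simple URL pattern, and the sample input array are assumptions added so the example runs on its own.

```php
<?php
// Sketch only: tag stripping and URL replacement folded into spam_scrubber().
function spam_scrubber($value)
{
    // Assumption: nested arrays (e.g. checkbox groups) are passed through untouched.
    if (is_array($value)) {
        return $value;
    }

    // List of bad values, as in the posted function.
    $very_bad = array('to:', 'cc:', 'bcc:', 'content-type:', 'mime-version:',
        'multipart-mixed:', 'content-transfer-encoding:');
    foreach ($very_bad as $v) {
        if (stripos($value, $v) !== false) {
            return '';
        }
    }

    // Remove HTML tags, then replace anything that starts with a URL scheme.
    // The pattern is a deliberately simple assumption, not a full URL matcher.
    $value = strip_tags($value);
    $value = preg_replace('~\b(?:https?|ftp)://\S+~i', 'removed url', $value);

    // Replace newline characters with spaces, as in the posted function.
    $value = str_replace(array("\r", "\n", '%0a', '%0d'), ' ', $value);
    return trim($value);
}

// Hypothetical input standing in for $_POST.
$scrubbed = array_map('spam_scrubber', array(
    'contact_name' => "<b>Ann</b>\n",
    'comments'     => 'Visit http://example.com/page now',
    'subject'      => 'Bcc: someone@example.com',
));
print_r($scrubbed);
```

Keeping everything inside one function means both the echoed values and the email body are built from the same cleaned $scrubbed array.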
ianhg (Author) Posted January 28, 2013

Larry, thanks, I so appreciate the advice. I am trying the regular expression using preg_match, which is returning either 0 or 1. This is what I have now; clearly it is not working properly.

    foreach ($_POST as $key => $value) {
        $_POST[$key] = is_array($key) ? $_POST[$key] : (preg_match_all('/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', $_POST[$key], $matches));
        return (isset($matches[1])) ? str_replace($matches[1], "", $key) : $key;
    }

Larry Posted January 28, 2013

Where are you using that code? In the spam_scrubber() function or outside of it?

ianhg (Author) Posted January 28, 2013

Quote (Larry): "Where are you using that code? In the spam_scrubber() function or outside of it?"

Outside... spam_scrubber is working fine and I don't want to cock it up :-) The idea being that it would remove anything before spam_scrubber runs.

Larry Posted January 28, 2013

Okay, then your use of `return` outside of any function makes no sense whatsoever.

HartleySan Posted January 28, 2013

ianhg, this thread has become rather convoluted. I had to go back to your original post to even remember what you wanted.

If I can ask, why do you even need to delete URLs from user-submitted data? Is it really that essential? Assuming it is, I'd do a quick search for how to delete URLs from text using a regex. There are a ton of Stack Overflow threads on the topic. Just delete the URLs with a regex and be done with it.

If you're going to do that though, I'd warn the user about it as well as submit the modified text back to them so that they can re-edit their post, if need be.

ianhg (Author) Posted January 28, 2013

Quote (HartleySan): "ianhg, this thread has become rather convoluted. I had to go back to your original post to even remember what you wanted. If I can ask, why do you even need to delete URLs from user-submitted data? Is it really that essential? Assuming it is, I'd do a quick search for how to delete URLs from text using a regex. There are a ton of Stack Overflow threads on the topic. Just delete the URLs with a regex and be done with it. If you're going to do that though, I'd warn the user about it as well as submit the modified text back to them so that they can re-edit their post, if need be."

Thanks, and sorry for rambling. I am trying to teach myself, and it is slow progress at 65, but I will take the advice and appreciate your and Larry's guidance. I have found his books very helpful.

HartleySan Posted January 28, 2013

You don't have to take my advice. I was just confused about what was going on more than anything. It just seems like you could use the preg_replace function once in your code and your problem is solved.

ianhg (Author) Posted January 29, 2013

Quote (HartleySan): "You don't have to take my advice. I was just confused about what was going on more than anything. It just seems like you could use the preg_replace function once in your code and your problem is solved."

Thanks HartleySan, 'use the preg_replace function once' helped me. For what it is worth, this is the code. I know it's not perfect, but it appears to be working.

    //url removal
    $pattern = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';
    $replacement = 'removed url';
    foreach ($_POST as $key => $value) {
        $_POST[$key] = is_array($key) ? $_POST[$key] : preg_replace($pattern, $replacement, $_POST[$key]);
    }

Appreciate the guidance from both Larry and yourself.

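One detail in the loop above may be worth a second look: is_array($key) tests the array key, which is always a string or an integer, so that guard never fires. Testing the value is presumably what was intended. The sketch below reuses the pattern and replacement text from the post; the $input array is a hypothetical stand-in for $_POST so the example can run by itself.

```php
<?php
// Pattern and replacement text copied from the post above.
$pattern = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';
$replacement = 'removed url';

// Hypothetical stand-in for $_POST.
$input = array(
    'comments' => 'Cheap coats at http://spam.example/buy?id=1, honest',
    'options'  => array('a', 'b'),
);

foreach ($input as $key => $value) {
    // Test the value, not the key: array values are left as they are,
    // string values have anything matching the pattern replaced.
    $input[$key] = is_array($value)
        ? $value
        : preg_replace($pattern, $replacement, $value);
}

print_r($input);
```

Because the pattern requires a scheme such as http://, a bare "www.example.com" would not be replaced by this loop.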
HartleySan Posted January 29, 2013

Good, good, glad to hear it. That's a heck of a regex too, but I imagine it's pretty complete. Thanks for sharing your solution.

ianhg (Author) Posted July 27, 2013

Hi, back again folks.

The script works well, but I am getting some web forms like the one below. My question is: how do I write some PHP to stop the same input being entered in every field, and possibly reject the form? ('removed domain' is the regex working :-)

    Name: canada goose outlet
    Company:parksaleonline@126.com
    Address: removed domain
    Address: canada goose outlet
    City/Town:canada goose outlet
    County: canada goose outlet
    Post Code: canadagoos
    email: canada goose outlet
    Repeat email:canada goose outlet
    Comments: Business Lying About Lake O to Win Hearts and Minds: The Eric DraperStory canada goose outlet removed domain

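One possible way to spot submissions like this is to count how often the same value turns up across the fields. The sketch below is only an illustration: the helper name has_repeated_values() is hypothetical, and the threshold of 3 is an arbitrary assumption, picked so that a matching email and repeat email (two identical values) would not trigger it on their own.

```php
<?php
// Hypothetical helper: true when the same non-empty value appears in at
// least $limit fields. The default of 3 is an assumption, not a tested figure.
function has_repeated_values(array $fields, int $limit = 3): bool
{
    $values = array();
    foreach ($fields as $value) {
        // Only compare non-empty strings, ignoring case and surrounding spaces.
        if (is_string($value) && trim($value) !== '') {
            $values[] = strtolower(trim($value));
        }
    }
    if ($values === array()) {
        return false;
    }
    // array_count_values() maps each distinct value to its number of occurrences.
    return max(array_count_values($values)) >= $limit;
}

// Sample data modelled on the submission quoted above.
$spam = array(
    'contact_name' => 'canada goose outlet',
    'address2'     => 'canada goose outlet',
    'town_city'    => 'canada goose outlet',
    'county'       => 'canada goose outlet',
);
var_dump(has_repeated_values($spam));
```

A form handler could call this on the scrubbed values and show the "did not properly fill out the form" message instead of sending the email when it returns true.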
HartleySan Posted July 27, 2013

It's a bit tricky, and you always have to be careful about being too restrictive.

I would validate email addresses as email addresses. For that, I recommend the filter_var function in PHP. For things like the post code, I'd only allow digits (and maybe spaces and hyphens). And wherever possible, I'd provide select inputs for certain fields. For example, if you can restrict the County field to a reasonable number of counties, I would do a select list. That way, you can restrict the number of valid inputs to a limited set of integers.

Even with all that, you'll still get people inputting invalid data, and there's really not much you can do about that. I suppose if you're getting it a lot, you should question your whole site and product in general, as it might be causing people to not care enough to take the time to fill out the form properly (sorry if that sounded a bit harsh).

Anyway, please let us know if you have any success with all that.

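A minimal sketch of those three checks follows, under stated assumptions: validate_contact() is a hypothetical helper, the county list is made up, and the post code rule also allows letters on the assumption that the form serves UK addresses. The field names match those used earlier in the thread. filter_var() with FILTER_VALIDATE_EMAIL is the standard PHP call and returns false for an invalid address.

```php
<?php
// Hypothetical list of allowed counties; a real form would supply its own.
$counties = array('Kent', 'Surrey', 'Essex');

// Hypothetical helper: returns the names of the fields that failed validation.
function validate_contact(array $data, array $counties): array
{
    $errors = array();

    // Email must be a syntactically valid address.
    if (filter_var($data['email_address'] ?? '', FILTER_VALIDATE_EMAIL) === false) {
        $errors[] = 'email_address';
    }
    // Assumption: 3 to 10 letters, digits, spaces or hyphens.
    if (!preg_match('/^[A-Z0-9 -]{3,10}$/i', $data['post_code'] ?? '')) {
        $errors[] = 'post_code';
    }
    // County must be one of the values offered by the select list.
    if (!in_array($data['county'] ?? '', $counties, true)) {
        $errors[] = 'county';
    }
    return $errors;
}

$errors = validate_contact(array(
    'email_address' => 'canada goose outlet',
    'post_code'     => 'ME14 1XX',
    'county'        => 'canada goose outlet',
), $counties);
print_r($errors);
```

The select list still has to be checked on the server like this, since a script posting directly to the form can send any value it likes.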
ianhg (Author) Posted July 28, 2013

Thanks for your reply. I see what you mean about being too restrictive. As an example, UK post codes use letters and numbers, while Zip Codes in the States are integers. I am currently using JavaScript to validate, but of course that does not work if the browser has JavaScript turned off. I just hate stupid emails from web forms selling SEO or trying to link to a domain to sell you some rubbish. Thanks again.
