
Legacy:Chongqing Page/Discuss

There are two types of "bans" (actually, read-only states) imposed by the automatic spam filter:

  • Temporary Bans are imposed for adding several, but not too many, links to a page. The page (with the links) is saved, and the user is put on read-only for ten minutes for each link he or she added.
  • Permanent Bans are imposed for adding too many links to a page. The changes to the page are discarded, and the user is put on read-only permanently.

I'm intentionally keeping the exact number of links required to trigger either of those cases vague. I'd rather not have spammers fine-tune their spamming attempts using that information.  ;-)

In both cases, an email containing the time, IP address, modified page name, and submitted page content is dispatched to the wiki admins. This allows them to react quickly to temporary bans (to revert the changes and make the ban permanent if a spammer was caught, or to undo the automatic ban on false positives), and it serves as proof when filing complaints with the spammers' ISPs.
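
For illustration, the core of such a trap might look like the Perl sketch below. Everything here is a placeholder: the helper routines and the thresholds are made up for the example and are not the wiki's real values.

  # Hypothetical sketch of the link-threshold spam trap; BanReadOnly and
  # NotifyAdmins stand in for whatever the real script does.
  use strict;
  use warnings;

  my $TempThreshold = 5;    # placeholder value: temporary read-only
  my $PermThreshold = 15;   # placeholder value: permanent read-only

  sub CheckNewLinks {
      my ($oldText, $newText, $ip) = @_;
      my $oldLinks = () = $oldText =~ m!https?://!gi;
      my $newLinks = () = $newText =~ m!https?://!gi;
      my $added    = $newLinks - $oldLinks;

      if ($added >= $PermThreshold) {
          BanReadOnly($ip, 0);                  # 0 = no timeout, permanent
          NotifyAdmins($ip, 'permanent', $added);
          return 0;                             # discard the changes
      }
      if ($added >= $TempThreshold) {
          BanReadOnly($ip, $added * 600);       # ten minutes per added link
          NotifyAdmins($ip, 'temporary', $added);
      }
      return 1;                                 # save the page
  }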

If somebody would like to be added to the list of users receiving those notification emails, drop Mychaeel a line.

See also Chongqing Page/Statistics.

Discussion

Tarquin: Just for added fun, I have hacked our page database so even the old revision of the page this guy edited leads to chongqed.org bwahahahaha :D

RavuAlHemio: :D You're kicking the spammers anywhere you can, huh? :D

anonymous: Yes indeed, folks, go on treating those *******s the hard way. I had real trouble with spammers myself; glad to see some action.

Foxpaw: The ones I just posted didn't really have any keywords, none legible on my PC anyway, so I just replaced the keywords with the URLs of the sites.

MythOpus: This spamming is making me mad. Grr. I suggest we keep it open to the public to edit as they see fit, but I think we should have people sign up for the privilege. We could really make use of the passwords in the preferences menu for this; it would help with the spam problem, and if it continues, we could easily find out EXACTLY who is doing it. Of course, that may go against what this wiki was meant for, but still... Desperate Times, Desperate Measures.

Graphik: Hello Myth. :)

Times aren't desperate, just annoying. ;) I thought of the same thing, but in the end it is like you said, that's not what a wiki is meant for. Eventually they'll give up; it's very unsatisfying to see your work undone again and again. We already can find out who it is without registration, and I just banned one person today. :)

MythOpus: Note to all Spammers... Bow now, or bow later... muhahahahaphmuhahaMPH!!

Mychaeel: I don't really believe in technical means to stop spammers, but since those Asian spam links are easy enough to spot and never used anywhere else on the Wiki, I've added a bit of code to the script that redirects people trying to save a page which contains such a link. Try it yourself...  :-)

Foxpaw: The spamming has been intense lately. Is there a known reason for the surge in spamming? I don't remember it happening at all as recently as a year ago, but it seems within the last week or so it's been several spamvertisements a day.

Mychaeel: I think the one and only reason is that spammers have discovered wikis. Now, true to their "I'm the only thing that matters in the world" attitude, they hook onto wikis everywhere in order to milk them for whatever benefit they see in doing so until they're dead and useless – like a deadly virus. Fortunately for us, their limited view of the world doesn't quite match reality, so there's still hope for the social rest of the world.

Tarquin: Does Google's bot see old revisions of pages?

Mychaeel: Our robots.txt wasn't properly set up to prevent this when I checked (only disallowing access to /cgi-bin/wiki.pl?action=, not /wiki?action=). I've fixed this.
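
For reference, the fixed robots.txt then needs to cover both URL forms, roughly:

  User-agent: *
  Disallow: /cgi-bin/wiki.pl?action=
  Disallow: /wiki?action=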

OverloadUT: I noticed that if you check the "minor edit" box when editing a page, it does not jump to the top of the list on the Recent Changes page. If a spammer figured this out, couldn't they check that box on all their spammed pages, and it would take much longer to find them? Or is there some sort of protection against this?

Tarquin: Yes: set your preferences to see minor edits! :)

Mychaeel: The chongqing tool I added today makes it pretty simple to chongqify stuff added by a spammer; however, it doesn't register those keywords at http://chongqed.org. The chongqing links still serve their purpose, but they lead to an unimpressive "The keyword [...] does not seem to be in the database" page.

Foxpaw: Just thought of something – this page is going to get huge, and sufficiently huge pages seem to time out when you try to save them. That could become a problem as this page grows. The chongqing tool was very handy though, and just in time, as this spamming binge was enormous.

Parallax: PageRank on Google is based on the number of links into the page that does the linking. We should have every single page in the wiki contain a link to this Chongqing page so that it gets a very, very high PageRank. Such a link to this page could be very discreet. That would mean this page would have more authority when it comes to destroying the PageRank of the spammers.

Tarquin: Yup. We could add it to the footer (in a small font) or something.

Foxpaw: Not only would that greatly increase loading times for pages on the wiki, but I don't think that it would work. Search engine spiders probably only recognize a given link/keyword combination once per domain – otherwise spammers wouldn't have to spam wikis and blogs; they could just spam one domain (that they controlled) a whole ton of times.

Mychaeel: I don't think anybody's talking about adding the links on this page to the footer – it's about adding a link to this page to the footer. That really wouldn't increase loading times a lot.

Wormbo: I've just removed over 50% of the links because they were duplicates. Please check that first if you add them so we can keep the page size within reasonable limits. Does linking with a URL as the link text actually help?

Halz: Yeah, those URL links under 'Miscellaneous' are less effective chongqing links, particularly as they all point to the same page, rather than specific parts of the chongqed.org database. Also, links like Effexor and weight loss are keywords which you haven't told chongqed.org about. To chongq effectively, you have to submit the spammer to chongqed.org, wait for the keywords to be accepted into the database, and then link using the keywords. So a link like Business Transcription is best. However, any links to chongqed.org will help boost its general ranking. Also, as Foxpaw was saying, links from various different domains are better than lots of links from just one page, so if you folks have your own homepage, put a link there too!

Savannahlion: Yep, I've been Chongqed. Initially, this <insert appropriate insult here> started by dumping a few links in the Sandbox. Simple enough: I banned him. Problem solved for a few months.

Then he returns, and creates two pages in his interest. Deleted them. A few days later he assaults about eight or so pages with piles of his links. I call him Mr. Seo, as a hint. Unfortunately, the tactic of simply Chongqing his links won't work too well. He's creating wiki links within existing text without regard for context. Links within code where the [ and ] would be misinterpreted. A spam link within the words Counter-Strike, which is woefully outside of the context. Even a link to his spam within a reference to Google. Completely random words.

I created a patch to prevent his lame additions. This was before I realized Tarquin had solved the problem, but after 4 hours of site clean-up, coding, patching, and testing, and 16 hours at work.

I'm going to look into creating a Chongq page to support all of this, but the details of implementation are kind of hazy; I'm going to have to mull them over after work. Thank you, Tarquin, for pointing me in the right direction. But I would still like to pick your brain personally. IRC, ICQ, something.

Tarquin: A Chongq page simply links to the Chongq site to damage the spammer's Google rating. I'm not in a position at the moment to be on IRC often, so you're best off reading up at the Chongq site to see how it works, or catching another wiki admin on our channel. Good luck dealing with this guy. Does his IP change a lot? You can always try banning him. Or you can hack the wiki code to refuse to save edits containing certain words or URLs.


Zxanphorian: OMG to all of those sites!

Tarquin: I'm thinking we should perhaps post on BuF to ask people to help the wiki fight spammers by putting a link to this page on their websites. What do you all think?

Foxpaw: Hrmm. That would boost the PageRank of this page, but not of Chongqed.org. If possible, copying the text from this page to a Google-indexed page on their own site would probably work better. That way Chongqed.org gets a higher cumulative rank, as opposed to those referral links being divided between two sites (here and Chongqed.org).

At least, I don't THINK that Google has a hierarchical sort of system where a link to this page would be recognized as an indirect link to Chongqed.org.

Tarquin: We're not trying to raise the rating of Chongqed.org; we're trying to connect the keywords the spammers care about with the link to Chongqed.org. So yeah, copying this content works too, but according to the guys at Chongqed.org, making this page highly linked-to will be good too.

Foxpaw: Err, well, yes, but like I said, I don't think that linking here from elsewhere will do anything except raise the PageRank of this page, with the keywords used to link here. I wouldn't expect it to have any effect on the PageRank of Chongqed.org in combination with the keywords here, as I don't think that Google counts indirect references like that.

However, if they say it will work, I'll take their word for it.

ElMuerte: We need some magic for this page, e.g.:

  • links on this page are read from an arbitrary file
  • to add new chongqing links, use an addition script; this script will:
    • check for duplicates
    • ... do more? ...
  • clearly mark the link to the add script, like: "If you want to add spam to this site, please use the following link"

This makes things easier:

  • no duplicates
  • no high loading/saving/editing time for this page
  • easier to catch new spam on this page

Mychaeel: I was thinking about extending the Chongq My Links! script with duplicate-removal functionality – I just didn't get around to doing it yet. (Now I've got my computer set up at home again, so chances are rising that I'll find the time soon.) – Anyway, I find it amusing that this page is the main target for spammers these days...  :p

Foxpaw: It seems like the logical page to spam, if you think about it. Not only do you get the normal "benefit" of posting your links, but you remove any of your links that may have been chongqed previously too.

Mychaeel: True... though that explanation implies that spammers are that smart, which hardly appears plausible to me (just look at the recent spam attempts – the link tags were so incompetently crafted that they didn't even work). – Maybe this sentence from the edit page header is what makes spammers paste their links here: "Note that old page revisions aren't indexed by Google, but the Chongqing Page is."

Joe@Chongqed: Hi. You guys are doing some good chongqing here, thanks. I just wanted to add to what Tarquin said we said about linking (which I had forgotten myself). Increasing the PageRank of this page gives the links here more credibility and should help their PageRanks too; links from a page with higher PageRank are more valuable. As mentioned above, links from your footer or sidebar on every page would help the PageRank of this page. It would also help the PageRank of chongqed.org if you linked from every page, but the purpose of chongqed.org is to remove junk from wikis. While more links are great, we don't want to clutter up people's wikis, so we are happy with how you guys are doing it.

I also want to invite you guys to our chongqed wiki in case you missed it; it's a newer addition. We use it to discuss chongqing ideas, spammers, antispam protection, etc. It has been up long enough to attract spammers, though; it shows how really stupid spammers are that they attack chongqed.org – not out of revenge, just regular spamming. We have been learning a lot more about spammers recently. So far, all the ones that have attacked us have found the pages by searching for other spammers' URLs. They let the other spammers do the dirty work of finding pages where their spam will stay long enough for Google to see it. So by keeping a clean wiki you should attract fewer spammers. Because we list lots of spammer URLs (though not links), our page appears to spammers to be a good target. Although we hate spam, we like being spammed on our wiki; it makes chongqing them that much easier and more fun.

Graphik: Thanks for the information on your fine service.

I tried to post this once before, but there was an edit conflict; this page was spammed. :rolleyes:

I'm not sure if the links I reverted should be chongqed from here. They were rather 'adult' in nature and might violate BU's 'no links to porn' hosting condition (in spirit, anyway).

Mychaeel: Chongqed links point to http://chongqed.org, not porn (unless Chongqed.org has changed its agenda since last time I checked). So just chongq ahead!

Graphik: Done. Damn, this page is going to be long after a while, so even my cable connection (let alone a dial-up user's) will start to struggle with it. I suggest that we create a subpage to contain the spammed links. The Chongq My Links! tool can automagically add them to the subpage instead of linking back here to add them. <insert better-thought-out variant of that idea here>

Mychaeel: Moved the discussion to a subpage.

Graphik: An excellent idea, but that doesn't solve the problem of having to load the parent page to chongq links.

Mychaeel: Right, but neither would keeping the discussion on the main page and putting the links on the subpage...

Graphik: Ah, I believe there is a misunderstanding. The discussion being on this page is an excellent idea, as I said.

I was suggesting that the Chongq My Links tool not be located on the same page as the spammed links repository. That way, the tool could automatically add the links to the Chongqing Page without the user ever loading it.

Mychaeel: Auto-adding links to the Chongqing Page through the Chongq My Links! tool isn't in yet (not as trivial as it may sound), but I've added functionality to sort and remove duplicates from the list: Just copy the existing chongqing links into the second textbox and press "Sort and Remove Duplicates".


Mychaeel: The most recent spammer added the following (badly transcribed) Russian comment amidst his spam links: "Ne udaliyt – derju pod kontrolem" (followed by a great many exclamation marks). My father, who knows Russian, tells me that this can be translated as "Don't remove – I'm keeping it under control!"  :D

captaink: If you Google the quote, it leads to a Russian SEO forum. o_0

Zxanphorian: lol


xX)(Xx: When I try to edit a page (ZoneInfo (UT)), I get sent to the Chongqing page, even though I'm not spamming. I even get sent here when I try to preview what I've done. I think it might be because there is already an off-wiki link there? Or perhaps it doesn't like my IP, hehe. Well, anyway, I'm not trying to spam; is it possible to find out why this is happening?

Tarquin: The problem is the phrase 'bGravity...Zone' (without the dots). We set the script up to block a guy who kept writing 'GraViTy...Z'. I'll fix it tomorrow.

xX)(Xx: Thanks Tarquin :)

Damn spammers :(

Wormbo: The guy was here again. This time he spammed "GRav.IT.Z" and various dots on several pages and from several IPs.

BTW: The Chongq My Links tool sorts case-insensitively, but removes duplicates case-sensitively based on that sorting. The result is that we get the same keywords repeating, e.g. "sports betting", "Sports Betting", "sports betting", "Sports Betting" and so on.
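
A fix would be to remove duplicates case-insensitively as well, keeping the first spelling of each keyword; a minimal Perl sketch:

  use strict;
  use warnings;

  my @links = ('sports betting', 'Sports Betting', 'Effexor', 'sports betting');

  # Sort case-insensitively, then drop entries already seen in lowercased form.
  my %seen;
  my @unique = grep { !$seen{lc $_}++ }
               sort { lc($a) cmp lc($b) } @links;

  print "$_\n" for @unique;   # Effexor, sports betting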


Mosquito: This one has to be the worst yet.

If it weren't for the quick action of the BU admins, it might not have stopped until the whole wiki was toast.

Foxpaw: Erm... I noticed I ended up reverting some pages to their spammed state. I was reverting in batches of 30 or so pages simultaneously, so there was some delay between when I checked the Recent Changes list and when I reverted the page.

Mosquito: Fair enough. Though when it came down to it, the speed at which the whole spamming was happening was incredible. I would revert one page and then 3 more would be spammed. Coincidentally enough, I had the Platoon theme playing on my playlist when I loaded the recent changes for the first time.

Tarquin: I expect the spammer was running a bot. :(

EntropicLqd: Couldn't we update the Edit/Save page functionality to randomly generate the names of the submission form and form elements and store them in the user's session? Then the update script could compare the fields received with the names held in the session, and if they don't match, the update is not performed. If the labels on the buttons were either images (with generated names) or selected from a larger pool of possible labels (e.g. Save, Update, Store Changes – you get the idea), then you'd effectively disable a bot, as it would have nothing to work with. Hopefully I've explained what I mean.

El Muerte: Here's an implementation idea that would stop bots. Add a hidden field that contains a hash of the page name and some other stuff. When an edit is submitted, the hash is checked to see if it's correct. This way you always have to visit the edit page and can't use an automated POST script, e.g. hash = md5(pageTitle+userAddr+secretKey). When the submission doesn't have the correct hash, the edit page is popped up again with your edited text, so if you get assigned a new IP in the meantime, your edit won't be lost.
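
A minimal sketch of that idea (Digest::MD5 is a standard Perl module; the secret and the routine names are made up for the example):

  use strict;
  use warnings;
  use Digest::MD5 qw(md5_hex);

  my $SecretKey = 'some long random server-side secret';

  # Rendered into the edit form as a hidden field.
  sub EditToken {
      my ($pageTitle, $userAddr) = @_;
      return md5_hex($pageTitle . $userAddr . $SecretKey);
  }

  # Checked when the edit is submitted; on a mismatch, show the edit page
  # again with the submitted text so nothing is lost.
  sub TokenValid {
      my ($pageTitle, $userAddr, $token) = @_;
      return $token eq EditToken($pageTitle, $userAddr);
  }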

Mychaeel: That's a pretty smart idea, I think. Right now most of the spam seems to be blocked by the link filter, but I'll look into implementing the hash idea as well.


Mychaeel: I've added code that automatically puts people who add too many links to a page on read-only (with a "low" threshold which only causes a temporary ban and a "high" one that causes an immediate permanent ban). It also dispatches an email to the wiki admin in those cases to ensure they can react in a timely manner if necessary.

Foxpaw: How many links is "too many?" It seems like a person who was refactoring a page or merging two pages might add a number of links in a single edit and trip the ban.

Mychaeel: I've currently defined 15 new external links as "too many." (Between 5 and 14 new external links, the user is temporarily put on read-only; no harm done by that. Just sit it out or wait until an admin undoes the auto-ban before it times out.) I doubt many pages even contain that number of external links.

Mychaeel: ...and already caught one tonight who tried to add 451 links to Brush Preservation...  :D

Tarquin: I don't think pages have that many external links. At most I usually see maybe up to 5 in the External Links section. Most links on wiki pages are internal. Good work Mych! :D

strider: Brilliant idea Mychaeel!

Mychaeel: This afternoon, the spam trap caught a spammer who added the modest amount of 12 links to Using the Wiki and thus got temporarily banned at first. The automatic notification gave me the opportunity to make the ban permanent and revert the changes within minutes.

Mychaeel: By the way, I'm getting down to business about the "file an official complaint" part. I've done so for the last two spammers, and we'll see what the response is.

Wormbo: I can't say which of these two pieces of good news I like better. :D

strider: You can now add rel=nofollow to anchor tags so that search engines won't count the external links in their rankings. Wikipedia has already added this feature, and doing this might help kill spam.
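
In a Perl wiki script, that would amount to a one-line change in whatever routine renders external links; a hypothetical sketch:

  # Render an external URL as a link that search engines won't credit.
  sub ExternalLink {
      my ($url, $text) = @_;
      return qq{<a rel="nofollow" href="$url">$text</a>};
  }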

Mychaeel: Spammers don't spam because spam is effective; it's sufficient that somebody tells them it is. (That's especially evident in those spam attacks on our wiki with code that didn't even render as links...) Reducing the effectiveness of spam won't help, because the effectiveness of spam is a non-issue to spammers. They believe it's effective (or are made to believe it by people who sell them spamming services and software), and that's enough. Spam has become a self-sufficient phenomenon.

That said, the spam filter just caught another one adding 536 links to "Inside The Death Chamber - Exploring Executions" before any damage could be done. Funnily enough, we don't even have that page.

Wormbo: I've removed the wiki link markup to "Inside The Death Chamber - Exploring Executions" from your comment. Someone created that page only with the word "You" on it. After I deleted it, it was created again with the same content.

Mychaeel: Caught three more in the course of the afternoon before they could do damage.

Mosquito: Have you considered passing this around to other wikis, I'd imagine some are getting hit pretty hard these days.

Mychaeel: Actually, I pinched the basic idea for this filter from MeatBall:ShotgunSpam, so other wikis are probably already aware of this idea. Our implementation just adds auto-banning and email notifications.

Mychaeel: One of the ISPs I sent a complaint to has sent a reply:

Dear Sir/Madam,

We have already issued a warning to the user to ensure that such activity is not repeated in future. Hence we would request you to consider this case as closed. The Trouble Ticket Number for this complaint is '2973'.

Please do contact us if the incident repeats again.

Assuring you the best of our services.

Thanking you.

Yours Sincerely,
Antiabuse Support
E-mail: Antiabuse.Support@relianceinfo.com
Phone: 91-(022)-30388464

Mychaeel: ...and one more:

Hi,

we will take the necessary actions to stop this kind of illegal activity.

Ystävällisin terveisin / Best Regards

EUnet Finland
tel. +358 9 4243 3205
fax +358 9 4243 0601
Linnoitustie 4 B (Alto)
02600 Espoo


EntropicLqd: Is the Recent Changes page indexed? It occurred to me that the recent page reverts were not so much an attempt at vandalising the wiki pages as merely a way of getting a single link listed many times by placing the target link in the change summary. Might be worth preventing URLs from being placed in the change description.

Mychaeel: Pages whose URLs look like http://wiki.beyondunreal.com/wiki?... are exempt from indexing, and after reversal, the spammy summary is removed from Recent Changes too. Of course, spammers wouldn't even care about that subtlety if we put it on a flashing banner across the editing page.

Perhaps you're right about that idea, though I'm unclear why someone would want a plain-text URL spammed somewhere; it's not like any search engine would consider that a "link" that increases the URL target's PageRank.

I could add a bit of code that prevents a page from being saved (and redirects the user elsewhere to an explanation) when a URL is placed in the summary field (and put a hint not to do that in the caption text below).
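
Something along these lines, say (a sketch; the routine name and hook point are hypothetical):

  # Called before saving; a true result redirects the user to the
  # explanation page instead of accepting the edit.
  sub SummaryIsSpammy {
      my ($summary) = @_;
      return $summary =~ m!(?:https?|ftp)://!i;
  }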

By the same token, I'll probably extend the link counting algorithm to include prior page revisions created by the same user as well. I had anticipated this development for a while – it's a well-established fact that spammers' motivation to spam is not impaired by even the most coarse hints that it's ultimately futile even if it manages to get past any technical measures, as demonstrated by waves of spam emails crafted to bypass spam filters. I just hadn't been able to get myself to actually implement that.

...which reminds me of shooting off two complaints about those recent spammers.

Birelli: I think the last two are actually the same spammer. Two spams to a *.*.su, both for gambling, (presumably) spamming in an unusual way – it's probably the same one. And given that one IP is on the East Coast of the US and the other in Australia, it's a fair bet that it's a redirect.


Joe@Chongqed: Hi. I saw your Temporary_Read-Only page. I like the idea. I added it to our page discussing spam protection methods. I am calling it AutoBan for now; do you have any better name? You mentioned above that the idea comes from ShotGunSpam. Is this method unique enough to deserve a section of its own, or should I just add it to our description of ShotGunSpam?


MythOpus: A while ago I caught a spammer who edited the actual hyperlinks of links that were already on a few pages. I don't think the auto-ban covers that, so is there any way we can shut those spamming possibilities down? Also, it may be a little over the top, but a captcha system might be a good idea. Or we can fake one :) Make a fake little picture, put it on the editing pages, and spam bots would have to type in the special number. If someone gets a spam bot in, we can simply change the pic to something else and change the pass?

zugy: Hmmm, I wonder... It can't be too hard to generate a security code thing, can it? You see them everywhere these days...

Draconx: Guah! That temp ban thing raped me :( I revert the Unreal Engine Versions page which has twenty-six external links on it and get owned for 4h 20m :(

Mychaeel: I had undone your temporary ban within minutes of getting the notification email. Sorry for the trouble.  :-)

zugy: Bummer – I was gonna revert that myself except 1) I didn't know how and 2) I thought it might be an admin-only kind of thing...


Guest: Mych, have you considered making this page auto-update when a ban or other relevant event took place? I know a feature like this wouldn't code itself, but it seems like you spend a lot of time updating the table.

MythOpus: Mych, are you having the time of your life or what?

Mychaeel: Actually, manually keeping track of the auto-bans gets tiring after the first hundred or so. I'd rather see spammers discouraged or (gasp!) on their way to understanding that what they're doing is wrong on so many levels.

MythOpus: I'll be waiting for the day when that happens. Sadly, most spammers these days are getting paid big bucks to do what they do. It's a shame. You should consider the auto-update idea though. I don't think it would be too hard to manage?

Mychaeel: Mostly I want to confirm that it's really a spamming attempt and not a false positive before I log anything anywhere.

T1: I have an idea that may help slow down people writing bots that change large numbers of pages with only single links, thereby avoiding the autoban. You could prevent non-registered users from doing many edits at once by only allowing 1 edit every 10 seconds. Basically, you would store a global "last edit time", and every time a non-registered user makes an edit, check if there has been an edit in the last 10 seconds by a non-registered user (it doesn't matter if it's the same or a different one). If so, you'd get a new page that said "please wait X seconds", but it would still have the textbox with what you wrote so you don't lose your submission. This would prevent the situation some referred to farther up the page where there was a bot spamming faster than wikizens could revert. The non-registered bot would be slower, but it wouldn't affect non-registered users who aren't spamming very much. You might say that the bot would then simply register before spamming, but you could put a ten-second limit on registering also; therefore, each time it made an edit, it would still be slower, because the autoban or a good wikizen would probably catch it. The only problem I foresee is people who don't register and want to help against a bot; they may not be able to do much except slow it down if they get a lucky shot into the minimal gap between the end of the ten seconds and the next bot attack.

Also, if the ENTIRE text of the page is deleted and replaced with more than two or three links and no non-links, isn't that an obvious spam attempt that should be auto-detected, even if it's fewer links than the standard autoban link limit?
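
A sketch of the throttling part of that idea, file-based and ignoring race conditions (the timestamp path is made up):

  use strict;
  use warnings;

  my $MinInterval   = 10;                          # seconds between anonymous edits
  my $TimeStampFile = '/tmp/wiki-last-anon-edit';  # hypothetical location

  # Returns false if another anonymous edit happened too recently; the
  # caller should then re-show the edit form with the text preserved.
  sub AnonymousEditAllowed {
      my $now  = time();
      my $last = 0;
      if (open my $in, '<', $TimeStampFile) {
          $last = <$in> || 0;
          close $in;
      }
      return 0 if $now - $last < $MinInterval;
      if (open my $out, '>', $TimeStampFile) {
          print $out $now;
          close $out;
      }
      return 1;
  }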

craze: I think the wiki just got a large dose of spam. Tried to see if my help desk question got answered, and would ya look at that: massive page edits, all with ad links in 'em...

T1: Another bot prevention method: if someone adds the same text to more than two pages, autotempban. Happens a third time? It's an autopermaban. Now, if only someone would implement these things, it'd slow down/inconvenience/prevent some spammers. Also, the statistics shouldn't be on this page, because then probably no one notices actual discussion here; they think it's just the stats being updated.
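
The "same text on several pages" check could be sketched like this (counts held in memory only; a real version would persist them and expire old entries):

  use strict;
  use warnings;
  use Digest::MD5 qw(md5_hex);

  my %AddCount;   # (address, text hash) -> number of pages hit

  # Returns how many pages this address has added this exact text to; the
  # caller applies the policy (temp ban above two pages, permanent ban on
  # a further repetition).
  sub RepeatedAdditions {
      my ($ip, $addedText) = @_;
      return ++$AddCount{ $ip . ':' . md5_hex($addedText) };
  }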

El Muerte: I think the spam check needs a fix to also scan for badly formed links. In this case the bot should have gotten a temp ban because it submitted 6 links, but the links were not correct UnrealWiki markup.

Tarquin: I'm locking the wiki for a bit.

Wormbo: Hmm, all those changes came from different IPs. A coordinated distributed spam attack? I think I smell a botnet.

Tarquin: I'm looking at our spam detection code, and Mych's use of regexps is beyond me :(

Mychaeel: The spam trap did spring – once for each of the 41 different IP addresses the spammer used.

Currently, the spam trap is very simple. Yes, we can make it more sophisticated, but in the end it'll always be just a small Perl script, while spammers will always possess some level of human intelligence (even though the fact that this small Perl script of ours manages to outsmart about 99% of them speaks volumes about it). So, while we can put huge amounts of brains into creating a script which catches an even higher percentage of spammers, we'll still never reach the point where we can be sure it'll catch all spammers. I don't think we can currently get any better effort/success ratio than we have already.

Several trusted wikizens here have been given a secret URL to lock the wiki in case of emergency. Why did nobody use it?

T1: I know Perl myself, and I'd love to lend a hand in improving the script. In my opinion there is nothing that cannot be improved upon, and I find programming much more fun than wikignoming. 99% is still not 100%. Besides, the hard drive on my main PC is messed up, and I have absolutely NOTHING to do right now, so I'd absolutely love to do some semi-constructive programming.

Also, I'm checking for other wikis that got hit by the same spammer, and I've found these so far:

http://www.intertwingly.net/wiki/pie/FrontPage

http://wiki.python.org/moin/AtAudioSprint?action=diff

http://wiki.43folders.com/index.php/Special:Recentchanges

http://wiki.kde.org/tiki-index.php?PHPSESSID=4942f9c1fedf2525cddb6c6e878ea769

Tarquin: Mych, my thought was actually to make the trap simpler – to check purely for URLs, rather than URL links. But if it caught the spammer... cool :) BTW, what does '&#' do in the regexp? (line 4388)

T1: '&#'????? That's not even in the man page.

Mychaeel: It looks for the string "&#" ;-) – those pattern matches are part of the "old" spam detection code which was added when wikis were flooded by "Chinese spam." The auto-banning URL filter is somewhere else and indeed simply looks for occurrences of /…http:/i.

craze: Although I know very little on the subject, how about having the spam detection script (when tripped) begin to traceroute all edits within the next X seconds/X minutes and check for matching IPs near the end of the list, just to make sure it's not just one person with a network of IPs attacking the wiki... (this probably makes no sense...)

El Muerte: How about using DNSBL to check the host? The wiki doesn't have that many edits, so the overhead might be low. Although in the last case it wouldn't have stopped the spammer, since he was using a zombie network.
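
A DNSBL check needs no extra modules in Perl; a sketch (the blocklist zone is only an example, not a recommendation):

  use strict;
  use warnings;
  use Socket qw(inet_aton);

  # Reverse the IPv4 octets and look them up in the blocklist zone;
  # any answer at all means the address is listed.
  sub IsBlacklisted {
      my ($ip, $zone) = @_;
      $zone ||= 'sbl-xbl.spamhaus.org';   # example zone
      my $reversed = join '.', reverse split /\./, $ip;
      return defined inet_aton("$reversed.$zone") ? 1 : 0;
  }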

Switch`: How about adding user registration and blocking anonymous posting? Maybe it's not exactly in the spirit of a wiki, but this is "The Unreal Engine Documentation Site" for developers by developers first, wiki second.

craze: Ahh, that's the word I was thinking of: zombie. Just check for a common IP towards the end after it starts going through ISPs. If I got this right, IPs towards the end of the list would start to match (except the last one) in the traceroute; then, if it starts matching and the anti-spam script has been tripped, just discard the edit.


MythOpus: Another spam attempt just occurred. The spammer mass-joined and logged in with several usernames and link-spammed some pages. Apparently, he thought he could trick us by deleting some of the links on the pages so as to make us think he had deleted all of them. When will the maturity level of these 'geniuses' level off at normal standards?

Wormbo: Did we already discuss those number/letter combinations presented as an image that the user has to type in? We could add something like that to the edit page for each IP that hasn't properly responded to an image yet. After this verification, the user or IP won't have to do it again. If verification fails, the user will only see the preview page (so the edits aren't lost).

T1: Same spammer as last time.

http://wiki.kde.org/

They had such an image verification system; spammers hit them just as hard as us last time, if not harder.


Wormbo: We need a way to revert pages with many links. Wiki Integration/Browser Sidebar got overwritten with spam, and attempts to revert it result in permanent read-only, without any changes to the page.

Mychaeel: Short-term solution: Admins are now exempt from the link threshold (should have been that way from the start). Long-term solution: Reverting a page to a previous version should be allowed for anyone, no matter what it does to the number of links.


Mychaeel: That "display: none"-type spam is blocked now as well.
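
Catching that kind of hidden-link spam takes little more than a pattern test; a simplified sketch:

  # Reject edits that try to hide spam with inline CSS,
  # e.g. a <div style="display:none"> stuffed with links.
  sub HidesContent {
      my ($text) = @_;
      return $text =~ /display\s*:\s*none/i;
  }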


Tarquin: A lot of the links here go to 404s on the Chongqing site.

Kartoshka: 404s because whoever added the links to the Chongqing page didn't submit them to chongqed.org?