Once I get that upgrade to 36-hour days, I will tackle that. – Mychaeel
Legacy:Project Discussion
This is a page for general discussion about the site.
You can also contact wiki contributors on:
Moved discussions:
- Legacy:Project Discussion/Versions: how to deal with differences in UnrealEd 1 / 2 / 2.6. Discuss.
- Wiki Public Relations: discuss publicity for the project here
- Subtopic Discussion
- Wiki Growth: musings on stuff
Contents
- 1 BU Hosted Site Updates Link
- 2 Encyclopaedia vs. Tutorials
- 3 Other projects
- 4 Maps on the Wiki
- 5 Time Zone Offset
- 6 Namespaces
- 7 Personal Attribution
- 8 Ghost Links
- 9 www.unrealwiki.com
- 10 Google Search
- 11 Legenda
- 12 Reversion
- 13 Minor bug with Google Search and Page Templates
- 14 Proper HTTP responses
- 15 Spammers and lower-case page names
- 16 Spammers and "display: none"
- 17 Orphan Pages
- 18 Bots that mess up pages
- 19 Liandri Archives
- 20 Related Topics
BU Hosted Site Updates Link
Follow BU's Hosted Site Updates link about UCPP and you'll find yourself on Home Page (which mentions nothing on the topic).
Perhaps the link should point to Featured Pages?
Tarquin: Yeah, I'm not sure what to do about that. I've linked to Featured Pages quite high up on the home page. On the other hand, Featured Pages doesn't say much to a new visitor. Any further thoughts on this?
El Muerte: A "news" page? That contains a short intro and the recent "Hosted Site Updates" entries with links to the corresponding pages.
Tarquin: I've created News. It currently redirects to Featured Pages, but we can improve on it later, or simply move Featured Pages.
Encyclopaedia vs. Tutorials
Encyclopaedia vs. Tutorials: What should this project look like?
The project started covering the UScript class tree, as there's an obvious topic hierarchy, whereas with tutorials the divisions are a bit more blurry. Particularly, the (arbitrary) division of tutorials into beginner/intermediate/advanced isn't conducive to finding information when the material is split over several pages.
A real beginner mapper could get lost in all the knowledge of a complex encyclopedia. Ideas (mostly now implemented):
- mini 'real' tutorials at the end of each section. For example, at the end of the effects page a little article 'Now follow this tutorial to make an exploding wall with smoke'. That's what beginners are looking for, and that's the best way to learn.
- example: the page on the Mover class stands as a reference document, but a page that explains what a mover is and the basics of using it is needed too, for example Create A Mover.
- similar to disambiguation pages on Wikipedia: technical pages such as Mover begin with links to tutorials on the subject. Put either a tutorial or a link to one at the beginning of each page; then, after practice, explain the theory. Advanced disciples would just skip the tutorial part.
Novice and pro mappers browse differently, because they will be visualizing the process from a different perspective. Refactoring redundant and duplicated information is time-consuming.
Other projects
unrealscript@yahoogroups
The mailing list at yahoo has generated loads of compiled Q&A documents, which are partially archived at yahoo & at http://www.chessmess.com when it's up.
I've suggested on that list that they could be imported into the Wiki. There's some duplication & it might be an idea to refactor things into expositions rather than Q&A format. Anyway, I was thinking of inviting the mods from that group to discuss the idea here, wondering if anyone had opinions on the subject.
Maps on the Wiki
Mychaeel: Do I see a new trend here...? After Creating DM-Quadroid (which is more or less meant to be a tutorial), we recently got DM-Encarceration. Is that something we should encourage, or should we rather ask people not to use the Unreal Wiki for that purpose?
EntropicLqd: If the sole aim of the Wiki is to provide a single point of reference for everything related to doing things with the Unreal engine then I would say not. In which case the Creating DM-Quadroid page should definitely be renamed to Tutorial DM-Quadroid cos that's what it is (more or less). If the Wiki has a secondary goal of allowing the contributors to strut their stuff and pimp their l33t h4x0r sk1llz then it's fine.
Tarquin: Early on I had considered that we could host mappers, but that would be a separate branch where mappers get their own subdomain. I'm no longer sure that's a good idea. I don't know if QAPete wants to host mappers directly from BU (he's not at the moment, AFAIK), and I don't know whether that's because he doesn't feel he has the space or because he doesn't want too many hosted sites. In the second case, a "United Association of Mappers" would be a single hosted site with subdomains. Anyway, all that's more or less been dropped. I suggested that "leet" mappers could do "masterclasses", where they take one of their famous maps and explain how they made it: the techniques, the ideas, the thought process behind the design. With UT2003, all the "leetness" is reset, so it'll be a while before some great maps arise and become well-known enough. Another idea was to give several mappers a simple map, like one box room or a corridor, say "make this leet", and then showcase the different results. I don't yet have a feel for static mesh mapping, but I think there may be less possibility for architectural creativity now, unless one plays with Maya. So that idea might be down the pan too. To close this lengthy ramble, I would say we should only showcase things if the principal aim is to demonstrate or instruct, in which case 2D screenshots are important too. Entropic, maybe we should rename those map tutorial pages so they don't carry a map name – "Simple DM Map Tutorial" for example – so people don't get the wrong idea.
EntropicLqd: Given the above comments, those pages definitely need renaming. Something a bit shorter and snappier would be nice though – maybe Tutorial DM (UT) and Tutorial DM – assuming I have the naming conventions right. It's a shame to lose the name, as I'm intending to make the DM tutorials build equivalent maps, but no matter; it is of very little consequence. I'll tweak the name of what will become the DD tutorial. I'm still playing my way through the single-player game of UT2003, so I've not even bothered to look at the editor in any real depth. I figured I'd do well to get a feel for the new game first. If I had to give a single-word summary it would be passable. Most of the maps are adequate but none have really grabbed me as stunning.
Mychaeel: Bump. At least mappers should put their map project pages below their personal page.
Flashman: <RANT> ...which I did, then Tarq comes along and informs me that the projects page I created should be under Category:Legacy Journal instead?
This made sense until I read this topic. (Props to Mychaeel for the bump.) You see, IMHO (not that it holds any sway fact-wise, but it might give an idea of what the fresh contributor typically thinks...) UWiki is a place for the various minds of the UEd community to come together, share knowledge, and advise each other, but in addition to that it can be (with the use of the personal pages) a place where peeps come in search of a texture artist to help them in their latest project, or where the 1337 mappers among us (myself most definitely not in that bracket) can see other people's efforts and, if they wish, offer honest advice and support to help kick the community as a whole along to greater things. This sort of feedback cannot come from your average play-tests or reviewers, because it will provide more insight into technical issues alongside artistic considerations, and it will come from the minds of not just one, but possibly many contributors. This is a focus I reckon the Wiki could have, but doesn't need any modification for.
I don't mean that the zips of every map someone wants feedback about should be hosted here – just information about the project, one or two screenies perhaps, and a discussion below. If you're in any doubt about the sort of page I mean, look at my attempt: flashman/projects. People could then put a link to an FTP where the map is hosted, or ask people who want to see a copy to submit their email addy. </RANT>
Mychaeel: Putting those map pages in the mappers' areas under Category:Legacy Journal sounds good to me (better than making them subpages of their personal pages). There's only a single level of subpages though, so different maps would all have to be on the same page. (I just dislike top-level map pages.)
EntropicLqd: Alternatively you could create multiple journal pages and just prefix the page name with your name. Having a single journal page will be unworkable in the long run, as it will get far too long to have any real meaning. My single-flag CTF mod journal will be huge once I've finished the mod. Personally, I wouldn't mind seeing a Map Ideas equivalent of the Mod Ideas page. The attempt to formalise a way of specifying a map design would be an interesting exercise in its own right. I'm not so sure about actually hosting pages for completed maps – surely BU has an area for that (and if not, maybe it needs one). I realise I abuse the Wiki slightly because I create pages that allow me to keep a single version of a "doc" in one place, accessible from both home and work, but in the case of my current EntropicLqd/EntropicMapInCreepMode page, once I've finished the spec the page will vanish. It's a shame there is only one level of sub-pages.
Tarquin: It is perhaps strange to have the "journal" pages as subpages elsewhere. That's just the way it grew. I'm currently wondering if we should encourage pages for comment/opinion/debate to be non-subpages – see for example the Curse of Static Meshes page.
Time Zone Offset
RoninLord: It would make it easier to have the time zone offset from UTC rather than local server time. Everyone knows their UTC offset off the top of their head, and Time Zone offset on its own usually refers to UTC offset. While you only have to do it once, it would make it easier for new users.
Mychaeel: I doubt many users know what UTC is to start with. The "Preferences" page displays the current server time directly over the time zone offset just in order to make it easy to figure out the offset: If server time is 6 PM and your local time is 2 PM (check your watch), the offset is obviously 4 hours.
RoninLord: Well, it took a little time to work out for me – I had to calculate for the next day. You're probably right that not many people know of UTC, but as a coder and network admin I couldn't survive without it :)
Namespaces
DJPaul: Bing! Take Actor (UT) as an example; that's obviously a page for UT. What do we do for the UT2003 version? I'm seeing a lot of plain Actor page naming in this case, and I think that's a bad idea – not specifically for the Actor class, but what happens if we bring along a Deus Ex version? Is that Actor (DX)? And if it is, is UT2003 therefore the main game – i.e. should unsuffixed pages be presumed to be about UT2003?
Mychaeel: Yes, that's how it is handled at the moment. (If somebody makes a list of all class page names relating to UT that don't have a "(UT)" suffix yet, we could batch-rename them.)
Personal Attribution
Mychaeel: Some newly created pages contain statements that clearly attribute that page to its initial creator. It's customary here though that MeatBall:DocumentMode contributions are anonymous (just like they are "public domain" in terms of our copyright statement).
Personal attribution also inhibits other people's freedom of editing a page, which is generally in contrast to the Wiki idea and, in the Unreal Wiki, only used where we copied previously existing static documents with the author's permission.
How should we go about this?
Tarquin: One possibility is to rewrite as "Page contributors: *list*". This is done on some Meatball pages when different things are refactored into a single document. People would be honour-bound to only add their names to the list if they'd made a major contribution to the tutorial or article. But I'd rather people did things the other way round: maybe keep a list of which areas they've worked on on their personal page and keep the articles themselves clean.
Mychaeel: Yes... I think that of all the options (except keeping documents anonymous altogether), asking people to keep a list of pages they've worked on on their personal page if they want to, while keeping the documents themselves clean, is the best.
Any "list of authors" on a page implies that everybody who's not on that list didn't contribute anything, and that raises the question what and how much somebody has to change on a page to be eligible for inclusion in that list. Add a new paragraph? At a single, but very useful bit of information? Fix a typo that completely changed the meaning of a paragraph? Fix a typo that just looked bad? (That's a rhetoric question, none to be answered – that'd be futile anyway, which is my point.)
Sobiwan: The way I see it, a Wiki is a public effort by definition. Personal recognition, posterity, notoriety and other forms of ego boosting run counter to the goal of collecting public information. Keep it anonymous, don't worry about who gets bragging rights, and focus only on getting accurate and concise information into the Wiki.
SuperApe: I'm in agreement. That's why I tentatively added the Content Ownership subsection to the Discussion on New Contributors Quick Start. Should that be moved to the main body of that page?
Ghost Links
Mychaeel: Creating links to non-existing pages on "topic" or "table of contents" pages without the actual intention of creating that page is, I believe, counterproductive.
The problem is similar to that of creating sticky postings in a forum: Those page links soon become just part of the normality that's perceived as such by the regulars and so, as time goes by, it becomes increasingly likely that the corresponding page will actually never be created. The title "rough draft" on Project Copyright has a similar problem. (I am going to remove it shortly, but I'll leave it there for demonstration purposes.)
Note that linking to a non-existing page from within the flow of another page – like linking to Bot from Bot Support – is a different matter and not subject to this statement of mine.
Sweavo: I agree wholeheartedly. A table of contents is a means of indexing information that is there. If there's no information then it's just noise. However I got caught out posting a Delete Me tag to such a page that had only been up one minute! So it's probably worth allowing such pages a day's grace before deleting them!
www.unrealwiki.com
Mychaeel: The domain http://www.unrealwiki.com is fully functional now as a direct alternative to http://wiki.beyondunreal.com (not an HTTP-level redirect or an HTML frameset hack). Thanks to DJPaul for funding the domain registration for the past year and the next.
The redirect from http://www.unrealwiki.com still leads to http://wiki.beyondunreal.com/wiki/ – I haven't figured out yet how to make it depend on the host name and redirect to http://www.unrealwiki.com/wiki/ only if "www.unrealwiki.com" was originally entered in the browser's address bar. I wouldn't like to automatically redirect everybody to www.unrealwiki.com yet since most people's cookies still refer to wiki.beyondunreal.com and won't work without logging in anew for the new domain.
Tarquin: I don't know enough about the technical side to form an opinion. As for our canonical address:
- http://wiki.beyondunreal.com/
  - promotes BU
  - if we move from BU / BU dies, we're in trouble
- http://www.unrealwiki.com/
  - is easy to remember
  - if we move from BU / BU dies, there's no problem
  - if we run out of money, we're in trouble
What do other people think? I'm cool whichever way :)
Mychaeel: Well... I don't think that running out of money will be a problem anytime soon, given that it doesn't really cost a fortune to keep the domain up. (For that matter, if BeyondUnreal were to suddenly cease to exist we'd be in trouble anyway – the Unreal Wiki's bandwidth certainly costs more than the domain.)
DJPaul: http://www.unrealwiki.com/Unreal_Engine/ would be nicer still. You could mod_redirect http://www.unrealwiki.com/wiki/Unreal_Engine/ to the aforementioned address, and do a similar thing for just http://www.unrealwiki.com/.
Does BU use Apache as a web server? If so, I can write up a .htaccess redirect if the server has mod_redirect installed and give it to you (just two lines).
Mychaeel: I would not want http://www.unrealwiki.com/Whatever to be a redirect to http://wiki.beyondunreal.com/wiki/Whatever – that'd still mean the latter address would eventually appear in the browser's address bar.
It is quite possible to internally direct http://www.unrealwiki.com/Whatever (and http://wiki.beyondunreal.com/Whatever) without the "/wiki" part directly to the Wiki script and have it return the corresponding page. The technical problem, however, is that there are regular directories and files on http://www.unrealwiki.com as well – and such file requests should obviously not be directed to the Wiki script but handled by the server directly. Right now, the "/wiki" part serves to clearly distinguish page requests from regular file requests.
Since all page names on the Wiki are bound to start with a digit or a capital letter (and all regular files and directories are lowercase by convention), that could serve to distinguish a page request from a regular file request – but then again right now it's also perfectly legal to enter the address http://www.unrealwiki.com/wiki/mychaeel (note the small "m") to get to my home page, and the corresponding "new" request http://www.unrealwiki.com/mychaeel would instead lead to a "file not found" error.
Thinking about it, though, that's not too much of an issue – people should simply not link to pages in anything other than their original capitalization (that is, with a capital initial in any case).
DJPaul: What about moving the regular directories and files to a subdirectory, which the webserver then checks for in the URL and doesn't rewrite the URL if this new directory is present?
Tarquin: It's a fairly small number of subdirectories that are accessed from the browser. In fact, /wiki-ext/ is the only one I can think of. It's easier to just list them than to move files about.
Mychaeel: Speaking in terms of URLs, we have:
- images/ for images,
- cgi-bin/ for certain scripts such as the image uploader,
- wiki-ext/ for images and styles, and
- dl.php for mirrored downloads.
Those "virtual" directories could indeed be explicitly excluded from the internal forwarding to the Wiki script. The downside is that there are some more common requests for files on a web server's domain root that would also be forwarded to the Wiki script even though the proper response would not be having the Wiki script return a "Page not found" page, such as (but not necessarily limited to)
favicon.ico
(requested by browsers under certain circumstances)robots.txt
(requested by search engine spiders)
I think it'd be best to limit the requests that are directed to the Wiki script to those that start with a capital letter or a digit and forward everything else to the web server's default handling. (And maybe we should modify the Wiki script so that it will always redirect the user's browser to the "canonical" representation of a page name – initial caps for every word – unless it is given that way in the request already.)
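Combining DJPaul's offer above with that capital-letter rule could look roughly like the following .htaccess sketch – purely illustrative, assuming Apache with mod_rewrite and assuming the Wiki script answers at /cgi-bin/wiki.cgi?PageName (both are assumptions, not the site's confirmed setup):

  RewriteEngine On

  # Serve real files and directories (images/, cgi-bin/, wiki-ext/, dl.php) untouched.
  RewriteCond %{REQUEST_FILENAME} -f [OR]
  RewriteCond %{REQUEST_FILENAME} -d
  RewriteRule ^ - [L]

  # Anything starting with a capital letter or digit is treated as a wiki page
  # name and handed to the wiki script; everything else (favicon.ico,
  # robots.txt misses, etc.) falls through to Apache's default handling.
  RewriteRule ^([A-Z0-9].*)$ /cgi-bin/wiki.cgi?$1 [QSA,L]

The QSA flag keeps any original query string intact alongside the page name.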
DJPaul: Yes. Of course you would have to run a script over the wiki to ensure there are no lower-case or number-first page names already existing in the wiki, and modify the create-new-page script to enforce upper-case page names. BTW, I do plan to renew the domain in a few days.
Mychaeel: The Wiki script enforces capital initials for page names already. There might be an arcane way to create pages starting with a lowercase letter, but that's more a bug than a feature (given that you can't successfully link to one such page within the wiki) that could be either ignored or fixed.
DJPaul: This presumes BeyondUnreal runs a case-sensitive filesystem - does it?
Mychaeel: It actually presumes that the Wiki script is case-sensitive. That, though, happens to depend in turn on the underlying file system, and that's indeed case-sensitive on BeyondUnreal's Linux server.
Mychaeel: I've just renewed the registration of unrealwiki.com for another two years. If somebody feels like sending me a share of the 22 USD I paid, let me know ;-)
Mychaeel: unrealwiki.com remains securely in our hands until (at least) 2010. (I can hardly believe my previous statement is two years old now.)
Google Search
Wormbo: I just tried using Google to find anything about UnrealScript, but the Wiki was not listed at all. Explicitly searching for "site:wiki.beyondunreal.com <put anything here>" doesn't seem to find anything either.
Tarquin: That would explain why our Wiki Site Stats dropped a couple of months ago. I wonder if Google have changed something like their response to robots.txt. I don't think anything has been changed here. We have a problem.
Wormbo: Unreal Wiki still doesn't show up in Google, Lycos or AllTheWeb searches. I only checked the sites Opera displays by default, but I'm quite sure all others don't list the wiki either.
EntropicLqd: Doing a search for Unreal Wiki using AltaVista gives the wiki as result number 1. Doing the same on Google gives a zillion links to articles that link to us, but no direct ones (I got bored after 10 result pages). Same with Yahoo (looks like they use Google, tbh). web.ask.com also lists the Wiki home page. It's a conspiracy I tell you.
EntropicLqd: I just submitted both the unreal wiki and the UDN to the ODP project that google runs. We'll see if that does the trick.
Tarquin: We're still not on Google :( Is something still blocking the bot, or is it just a question of waiting patiently?
Mychaeel: We have regular visits of "Mediapartners-Google/2.1 (+http://www.googlebot.com/bot.html)", but I believe that's only related to the Google ads at the bottom of the page. – In addition, I've found visits from "Googlebot/2.1 (+http://www.googlebot.com/bot.html)" in recent logs which in turn I believe is indeed Google's spider. Maybe we're soon in Google again...
EntropicLqd: Google still haven't added this site to their directory – it's a conspiracy I tell you. Also, I dropped a mail (2003-Dec-10) to the http://www.unrealtournament.com/ site asking for the Unreal Wiki to be added to their community links page. So far no joy – might be worth a couple of other people mailing them. It's a conspiracy I tell you. Also, the Unreal Wiki still doesn't show up in Google. Does Google block wiki-based sites in the same way that they block "blogs"?
EntropicLqd: The new Unreal Tournament site has a link now for submitting fan sites so I've re-submitted the Wiki. It broke the form which is a bit of a bummer but I'll keep checking the site to see if the request got through. I'll get the Wiki added yet.
Tarquin: UT.com has mucked up our URL :(
Wormbo: Most likely because you added "http://" to it. I saw other links also having this problem.
Tarquin: Still no sign of us on Google :(
Tarquin: January 2004: I've brought the matter up at BU. Apparently they blocked the Google spider a while back. They're letting it back in, so it's hopefully just a matter of time before we're back on Google: [1]
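For reference, a block of that kind typically looks like the following in robots.txt (illustrative only – not necessarily what BU's file actually contained); removing the Disallow line, or the whole record, lets the spider back in:

  User-agent: Googlebot
  Disallow: /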
El Muerte TDS: we're back on google, the page rank could be better, but I think that'll take some time.
Tarquin: I don't think we are. Those are things that link to us, not our own pages. Try [2] for example. Could someone email QAPete about this? My email is out at the moment.
El Muerte TDS: wtf, I'm pretty sure I had a hit this morning
Mychaeel: Well, there are some Unreal Wiki pages on Google – try [3]. Very few, but perhaps it'll get better with time. – Anyway, short of blocking GoogleBot's IP address there's little QAPete could do to block it, and I don't think he's doing it since we're getting GoogleBot hits on a regular basis, probably related to the Google ads on the site.
Legenda
El Muerte TDS: It would be useful if every page that has some requirement (game/patch dependent) had a box with the requirement list. Something like:
Requirements: UT2003 + Epic Bonus Pack #1, UT2004
Would be nice if this could be a special tag, that puts a box in the upper right corner:
<requirements>
* UT2003 + Epic Bonus Pack #1
* UT2004
</requirements>
Tarquin: Do we really need special markup? Why not just write near the top of the page:
Requirements:
- UT2003 + Epic Bonus Pack #1
- UT2004
El Muerte TDS: this just looks better:
http://redir.elmuerte.com/junk/wikireq.png
I feel as if this should be MANDATORY information for wiki pages – force editors to specify. They could set a preference to say "well, I'm nearly always talking about UT2003", which would set the info for them when they write new pages. The trouble we have is that content authors do not have to specify what version their info applies to, and yet they are the ones who can most easily provide that information. Anyone else who comes along later will have a harder time figuring it out than the original author.
Reversion
Foxpaw: After reverting a number of pages I noticed it took like 4 clicks to revert each one. Would it be feasible to add a "revert to this revision" link on recent changes and/or the view other revisions list? It might also discourage wiki vandals if there's a link right there showing how easy it is to revert changes.
Mychaeel: I see your point, but we have to be careful not to make it too simple lest we encourage another type of spamming – the reversion of a page to an older version. We could, however, create an additional admin-only script that bulk-reverts pages to the last revision before all edits done from a given IP address.
Graphik: My suggestion: a list of edits compiled from the diff logs that one can selectively revert, similar to the way Image Uploads works. Or does UseMod have a feature similar to Windows XP's System Restore?
Tarquin: My suggestion: on the history page, a link on each item that says "restore this version". Or a link at the foot of an old revision display, saying "restore this version". Admin-only, like Mych said. But Mych's bulk IP revert idea is good too, if it's possible to code it.
Minor bug with Google Search and Page Templates
Foxpaw: Apparently, Google is following links to non-existent pages, receiving the "This page doesn't exist yet." template, and then including those non-existent pages in the search.
Proper HTTP responses
El Muerte: I'd like some improvements in the response headers of the wiki. Currently the wiki is very inefficient:
- Non-existing pages produce a "200 OK" response instead of a "404 Not Found". Producing the 404 makes the wiki friendlier for search engines (no incorrect results), and it also makes things easier for tools like the UnrealWiki plugin for UnCodeX (a HEAD request is then all that's needed to check whether a page exists).
- No Last-Modified response header. Last-Modified is very useful for caching; without it, pages will never be cached. If the wiki engine also correctly responds to the If-Modified-Since request header, the performance of the wiki will improve a lot because cached versions of pages can be used. This has a positive effect on both client and server performance (and I'm sure BU would like that).
Another thing that might be interesting is to produce a Sitemap for Google. It saves Google a lot of time digging through the site since it already has a map of the site containing info on which pages have been modified. I noticed a significant performance increase on my sites by giving Google a sitemap to use. Too bad other search engines don't have the same feature; Google's spider is very friendly compared to some (e.g. Yahoo Slurp, MSNbot, AskJeeves – listed from worst to least terrible).
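To illustrate the header points, here is a minimal sketch of a CGI script that answers that way – purely illustrative, with hypothetical helpers (PageExists, PageMTime, RenderPage) standing in for the wiki's real page handling, not UseMod's actual internals:

  #!/usr/bin/perl
  # Sketch only: the three helpers below are stand-ins, not the wiki's real code.
  use strict;
  use warnings;
  use POSIX qw(strftime);

  sub PageExists { my ($id) = @_; return -f "pages/$id.db" }
  sub PageMTime  { my ($id) = @_; return (stat "pages/$id.db")[9] }
  sub RenderPage { my ($id) = @_; return "<html><body>$id</body></html>\n" }

  my $page = $ENV{QUERY_STRING} || '';

  if (!PageExists($page)) {
      # 404 instead of 200 for missing pages: search engines drop them, and a
      # plain HEAD request is enough to check whether a page exists.
      print "Status: 404 Not Found\r\n";
      print "Content-Type: text/html\r\n\r\n";
      print "<html><body>This page doesn't exist yet.</body></html>\n";
      exit;
  }

  # Last-Modified in RFC 1123 form, e.g. "Sun, 06 Nov 1994 08:49:37 GMT".
  my $mtime = PageMTime($page);
  my $last_modified = strftime("%a, %d %b %Y %H:%M:%S GMT", gmtime($mtime));

  # If the client's cached copy carries the same timestamp, a bodyless 304 is
  # all that needs to be sent (exact string comparison is a common
  # simplification of If-Modified-Since handling).
  if (($ENV{HTTP_IF_MODIFIED_SINCE} || '') eq $last_modified) {
      print "Status: 304 Not Modified\r\n\r\n";
      exit;
  }

  print "Content-Type: text/html\r\n";
  print "Last-Modified: $last_modified\r\n\r\n";
  print RenderPage($page);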
Tarquin: I can't look into either of those right now because of this.
El Muerte: I'm not allowed to view that forum.
Tarquin: Oops, sorry. Basically, since BU moved hosts, I can't log into FTP from any OS X client. I'll need to set up a copy of the wiki on my linux system before I can start looking at this.
Tarquin: El, could you check this: http://wiki.beyondunreal.com/cgi-bin/fourohfour.cgi Is it producing a 404 response for you? I've not done this in Perl before.
El Muerte: yes, it returns a 404 error
fyfe: For generating the Google Sitemap I would recommend Google Site Crawler; it'll crawl the wiki and generate a sitemap file for you. Here's a project file that should do the trick...
<?xml version="1.0" encoding="UTF-8"?>
<!-- Project Export: beyondunreal.com [wiki] -->
<!-- Exported by GSiteCrawler v1.12 rev. 260 -->
<!-- 13/10/2006 / 19:16 on \\MARS by Andrew -->
<!-- Contains: complete project -->
<GSiteCrawlerProject>
  <TABLE Name="Projects">
    <RECORD>
      <ProjectName>S:beyondunreal.com [wiki]</ProjectName>
      <CaseSensitive>N:1.</CaseSensitive>
      <ExtentionList>S:asp,aspx,cfm,cgi,do,htm,html,jsp,mv,mvc,php,php5,phtml,pl,py,shtml</ExtentionList>
      <MainUrl>S:http://wiki.beyondunreal.com/</MainUrl>
      <RemoveSlash>N:1.</RemoveSlash>
      <Sitemap_DateInclude>N:1.</Sitemap_DateInclude>
      <Sitemap_PriorityInclude>N:1.</Sitemap_PriorityInclude>
      <Sitemap_FrequencyInclude>N:1.</Sitemap_FrequencyInclude>
      <RemoveHtmlComments>N:1.</RemoveHtmlComments>
      <ActionError404>N:1.</ActionError404>
      <CrawlPriority>N:100.</CrawlPriority>
    </RECORD>
  </TABLE>
  <TABLE Name="URLBan">
    <RECORD><Url>S:http://wiki.beyondunreal.com/cgi-bin/</Url></RECORD>
    <RECORD><Url>S:http://wiki.beyondunreal.com/wiki-ext/</Url></RECORD>
    <RECORD><Url>S:http://wiki.beyondunreal.com/wiki?action=</Url></RECORD>
  </TABLE>
  <TABLE Name="ParamDrop">
  </TABLE>
  <TABLE Name="ParamRemove">
    <RECORD><ParamText>S:osCsid</ParamText></RECORD>
    <RECORD><ParamText>S:PhpSessId</ParamText></RECORD>
    <RECORD><ParamText>S:PhpSessionId</ParamText></RECORD>
    <RECORD><ParamText>S:s</ParamText></RECORD>
    <RECORD><ParamText>S:Session</ParamText></RECORD>
    <RECORD><ParamText>S:SessionId</ParamText></RECORD>
    <RECORD><ParamText>S:SID</ParamText></RECORD>
    <RECORD><ParamText>S:XTCsid</ParamText></RECORD>
  </TABLE>
  <TABLE Name="URLS">
    <RECORD>
      <Url>S:http://wiki.beyondunreal.com/</Url>
      <Include>N:1.</Include>
      <Crawl>N:1.</Crawl>
      <CurrentDate>D:2006-10-13T18:07:39+00:00</CurrentDate>
      <Priority>N:1.</Priority>
      <Manual>N:1.</Manual>
      <Frequency>N:1.</Frequency>
      <Title>S:UnrealWiki: Home Page</Title>
      <FullUrl>S:http://wiki.beyondunreal.com/</FullUrl>
      <UrlHash>N:1349269650.</UrlHash>
      <DateLastChange>D:2006-10-13T18:07:39+00:00</DateLastChange>
      <PageSize>N:12735.</PageSize>
      <PageHash>N:858873500.</PageHash>
      <DateLastCrawl>D:2006-10-13T19:07:39+00:00</DateLastCrawl>
      <TimeDownloadSecs>N:5.698</TimeDownloadSecs>
      <TimeParseSecs>N:0.04</TimeParseSecs>
    </RECORD>
  </TABLE>
</GSiteCrawlerProject>
<!-- End of file -->
Spammers and lower-case page names
Wormbo: It seems the spammers have found a way to create pages starting with a lower-case character. These pages are not accessible directly, but the (diff) link displays them and fortunately it seems they can be deleted via "Edit/Rename pages".
Tarquin: yeah, looks like a bug in UseMod. Page names should always get an initial upper-case character. I'll try to look into fixing this at some point – but ongoing FTP problems with BU make this difficult.
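A sketch of the kind of normalisation meant here (names are illustrative, not UseMod's actual internals) – applied to every incoming page id before saving, it would keep lower-case page names from being created in the first place:

  # Sketch only -- NormalizePageId is a hypothetical helper, not UseMod code.
  sub NormalizePageId {
      my ($id) = @_;
      $id =~ s/^\s+|\s+$//g;   # trim surrounding whitespace
      return ucfirst($id);     # force a capital initial: "mychaeel" -> "Mychaeel"
  }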
Wormbo: I just deleted two lower-case pages. Mych, could you have a look at this? Maybe renaming the 'title' and 'text' fields as mentioned in IRC a while ago should block automated spam bots for a while.
Wormbo: Removed one more lower-case page name and some regular spam.
Spammers and "display: none"[edit]
Mychaeel: We have a simple filter that redirects any attempt to save a page whose page source contains "display: none" to the Chongqing Page. However, some spammers seem to manage to sneak past that filter occasionally. I'm frankly at a loss right now as to how – even if I just load the revision they saved and try to save it myself, the filter works on me. Does anybody have any ideas or guesses?
MythOpus: I did a few tests on my personal page using "display: none" and was happy to see that the wiki won't lock you out, and sad to see that even the preview button redirects you, hehe. After a few tests I thought that browser settings might allow spammers to spam like that, so I fired up Mozilla and disabled JavaScript, images and some other stuff. It still redirected me. Then I thought that perhaps if you embed the HTML into a wiki tag or something, the parser will ignore it and continue on. I don't think that works though. Now I'm thinking maybe there are certain characters that are ignored and deleted after the wiki scans the update for the bad code. For example, if the % sign was ignored and deleted, 'd%i%s%' etc. would be saved as 'dis'...
Wormbo: Oh right, browsers discard any null characters they encounter and process the remaining text as if the nulls were never there. Perl allows and understands \0 chars, doesn't it? That would at least be a reason why the regex doesn't match.
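If that theory holds, the effect would look roughly like this (a sketch, not the wiki's actual filter code):

  # Sketch only -- contains_spam() is illustrative, not UseMod's real filter.
  sub contains_spam {
      my ($text) = @_;
      # A plain match is defeated by null bytes embedded in the keyword
      # ("d\0isplay: none"), even though browsers drop the nulls when rendering:
      #   return $text =~ /display\s*:\s*none/i;

      # Stripping nulls and other control characters before matching closes
      # that particular hole:
      $text =~ tr/\x00-\x08\x0b\x0c\x0e-\x1f//d;
      return $text =~ /display\s*:\s*none/i;
  }

  print contains_spam("d\0isplay: none") ? "spam\n" : "clean\n";   # prints "spam"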
And while we're at it, add another (maybe hidden) checkbox to the delete image page that says "ban me". Or maybe create a "restore image" feature or restrict deleting images to admins. I have a feeling random people are deleting images just for fun.
MythOpus: Yeah, I think the images should be set up like normal pages, where a deleted image stays in 'memory' for a while so that it can be restored, unless it was marked as 'Permanent Delete' by an admin. Random people ARE deleting random images for fun >_<
Mychaeel: There's an "undelete image" feature here (even for bulk undeletions): http://wiki.beyondunreal.com/cgi-bin/images-restore.cgi – We do have image revisions, but the "delete image" feature is more like the "delete page" feature than anything else; it'd be pointless to delete images in a way that leaves them accessible.
Wormbo: Good to know. I've restored some of the deleted images.
Wormbo: I just deleted about 75 pages created by spambots and reverted a few more pages. All of them had "display: none" spam on them.
Wormbo: Seems like the fix didn't work...
Mychaeel: Seems so. I'm going to add some debug code in the hope to find out what exactly is submitted next time.
MythOpus: How does the redirection work? Does it do it with the browser (i.e. meta-redirect stuff) or some other way? I would think that if you were using the browser to redirect and the bot wasn't using an actual 'browser', then the redirection wouldn't work on it. And I'm sure that with a custom browser you could bypass most anything.
Wormbo: Redirection means that instead of saving the page and sending it as the response, an HTTP redirect to the Chongqing Page is sent and nothing is saved. However, it seems that the spam detection script no longer works: some spambots just replaced several pages with a lot of links.
Wormbo: Site is locked for now as bots are still active.
Mychaeel: Small wonder our locks, spam filters etc. didn't do anything: The spammers were posting to wiki-umw.cgi rather than wiki.cgi. I've disabled that possibility. – Now that leaves me with the question how wiki-umw.cgi got there... does anybody remember putting it there?
Wormbo: We just got a new lower-case named page. Anyone care to investigate? *bump*
Tarquin: Attempting to create a lowercase page should send you to the New Contributors page, same as if you try to save a new page without editing the default text.
Wormbo: Site locked again after multiple bots bombed three pages with similar spam links: http://wiki.beyondunreal.com/wiki?action=history&id=ScratchPad
MythOpus: I just became aware (I've been clueless for a while, apparently) that you can use direct JavaScript in IE (and most likely in other browsers) – you can type JavaScript into the URL box and it'll run as if it were part of the current site. Could that somehow be related to the bots bypassing the security measures?
Wormbo: Spammers seem to bypass the filters once again.
Orphan Pages
Fyfe: Is there any way to check the wiki for orphaned pages? (An orphaned page is one that isn't linked to by any other page.)
Wormbo: Try clicking on the page title.
Bots that mess up pages
Wormbo: Recently there seem to be some bots that cut off page content at the first semicolon and replace plus signs with spaces. Basically they behave as if they've never heard of URL encoding. Can this somehow be blocked? What about a captcha for users without a user name?
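For comparison, this is what correct form encoding looks like – a small Perl illustration (unrelated to the wiki's own code) using the URI::Escape module: a well-behaved client escapes literal '+' and ';' before submitting, so they survive the round trip instead of turning into spaces or truncating the value.

  use URI::Escape qw(uri_escape uri_unescape);

  my $value   = "a + b; c";
  my $encoded = uri_escape($value);     # "a%20%2B%20b%3B%20c"
  print uri_unescape($encoded), "\n";   # prints "a + b; c" again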
Liandri Archives
Wormbo: I think we should take advantage of the other BU-hosted wiki, the Liandri Archives, and link general non-development articles there instead of duplicating content here. This goes especially for pages like Unreal Tournament 3, where stuff like vehicle information should be left to the Liandri Archives while we supply details about the dev tools and features related to modding.