08 September 2006

Java and CMS

I mentioned rather briefly my interest in Java-powered CMS (here). There are not many wiki engines written in Java, possibly because it's more demanding. Java code runs inside a virtual machine, which is responsible for converting the available data into a readable webpage. My impression, which could be wrong, is that such pages are more robust and less prone to unintended results when loading, since they are designed to interface directly with a virtual machine. In contrast, programs written in PHP or Perl create another layer of interface by prompting the website's host to generate a page.
(I tried to discuss this in the prior post on CMS applications linked above. Basically, most CMS applications either generate static pages, which are created as stand-alone HTML files, or else they follow the database format, in which case every distinguishing trait of each page in the website is saved as a field in a database record. The latter design is usually more efficient in terms of memory and searching, and is essential for very large sites like Wikipedia. In either case, however, the CMS application that powers the website must generate a file--temporary or permanent--that is read as HTML.)

Another reason why Java-based CMS's might be better is that they do not launch a new server process every time a user interacts with the application. Suppose, for example, the application is a WikiEngine accessed by a large number of users. Each time a user wants to preview her new post, a CGI application has to launch a new process; a Java application only needs to spawn a new thread.
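To make this concrete, here is a minimal sketch (my own illustration, not taken from any particular wiki engine) of a Java servlet handling a preview request. The servlet container keeps one running process and dispatches each incoming request on a worker thread, instead of forking a new process the way a classic CGI script would. The class and parameter names (PreviewServlet, "wikitext") are made up.

// A hypothetical servlet: one instance, many request threads.
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class PreviewServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String wikitext = req.getParameter("wikitext"); // the user's draft text
        resp.setContentType("text/html");
        PrintWriter out = resp.getWriter();
        // Echo the draft back as escaped HTML so the user can preview it.
        out.println("<html><body><pre>" + escape(wikitext) + "</pre></body></html>");
    }

    // Minimal HTML escaping so markup in the draft displays literally.
    private static String escape(String s) {
        if (s == null) return "";
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}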

CGI versus Java: not a valid comparison!

It has to be pointed out that the dichotomy between CGI and Java is not valid. CGI is, after all, an open application programming interface (API); Java is a programming language. One can create a CGI application that is powered by Java, although this is not common. Generally speaking, Perl or PHP is used for programming CGI applications, while Java applets are used for programs that run in the visitor's web browser.

However, in researching this essay, it became apparent that Java (unlike Perl or PHP) can replace many of the functions of a CGI application, while executing those functions in ways that are preferable to (and logically exclusive of) the CGI API. Conversely, most CMS's in common use were created in Perl or PHP (not Java!) because those languages are easily learned by people with a casual familiarity with HTML. Also, it is often unnecessary to have a costly Java application when mere HTML with a little JavaScript will do fine.
__________________________________________________________
There are quite a few Java-powered WikiEngines, mostly database-oriented. Courtesy of WikiMatrix, I am aware of Clearspace, Corendal, Ikewiki, JAMWiki, JSPWiki, SnipSnap, and XWiki. In addition to these, there are some systems developed for large organizations, such as SamePage, which I have ignored. XWiki (samples) seems to be oriented to professional developers, and I don't think it's really feasible for my purposes.

Ikewiki is a semantic wiki developed in Salzburg, Austria. Semantic wikis (SW's) differ from the usual type in that they impose a formal logical structure on the data itself. So far I have found no implementations.

Examination of wikis created with these engines has been extremely time-consuming, but let me make some quick notes. Clearspace is a commercial product ($29/user) from Jive Software. It's evidently used for the BBC's website, TechRepublic forums, CNET forums, and Amazon.

JAMWiki is an interesting concept: it's a WikiEngine that aims for feature parity with MediaWiki (the most commonly implemented of all, and used for Wikipedia). So far, the selection of implementations is very slim indeed. Janne Jalkanen created JSPWiki to develop and advertise coding techniques, but it's spartan and closely tied to its purpose of demonstrating JSP.


05 September 2006

List of Wiki Engines

(The Varieties of Wiki Experience)

While WikiIndex lists scores of wiki engines, I wanted to narrow down the list of wiki engines to those that have widespread use or prominent implementations.

All of the wiki engines listed below, except for Everything2, are free and open source.
Wiki Engine | Description | Top Wiki(s)

Bitweaver | CMS related to TikiWiki; written in PHP; modular; high-traffic, custom web development | VOIP-Info
Clearspace | Jive Software; written in Java | BBC news forums, CNet
CLiki | Written in Common Lisp | TUNES Wiki
Corendal | Corendal; written in Java; no wiki syntax to learn, a WYSIWYG rich text editor is used instead | collaborative
DokuWiki | Splitbrain Software; written in PHP; mostly collaborative | Romapedia
Everything2 | Custom wiki engine for the Everything Development Company; programming language probably Perl | Everything2
JAMWiki | Written in Java; feature parity with MediaWiki | OLAT
MediaWiki | Open source; most widely used; developed specifically for Wikipedia; written in PHP | 1911Encyclopedia, Althistory, AnswerWiki, Banknote Wiki, C Language, CFD Wiki, Chainki, Changemakers, Christianity KB, Corpsknowpedia, SourceWatch, Uncyclopedia, Wikipedia
MoinMoin | Written in Python | Edubuntu, FedoraProject, GnomeLiveWiki, Handhelds.org, Python, TechnoratiDeveloper, Ubuntu
PmWiki | Written in PHP; mostly used for non-reference sites | CenterForestResearch, ITmission (Linux), Leo Laporte
PukiWiki | Japanese; written in PHP | Mostly Japanese wikis
TiddlyWiki | Written in JavaScript; weak on anti-spam and other security features; no preview; entire site stored in a single HTML file (that's how it's possible to be written in JavaScript) | BoliviaWiki, Reasoning Well
XWiki | Written in Java; enterprise wiki used mainly in France | collaborative
"Collaborative" means that users have used their installations on organization intranets, as opposed to general access reference wikis.

BitWeaver and MediaWiki are database-oriented; the other wiki engines listed above are file-oriented. The difference is that, with a file-oriented wiki engine, each entry is its own file. In contrast, with a database, each entry is a record in a table belonging to the backend.
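To illustrate the difference, here is a hedged sketch in Java (not drawn from any actual wiki engine; the directory, table, and column names are invented) of what the two storage styles look like in code:

// File-oriented vs. database-oriented page storage, in miniature.
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PageStore {
    // File-oriented: the page title maps directly to a file on disk.
    static String loadFromFile(String title) throws Exception {
        return Files.readString(Path.of("pages", title + ".txt"));
    }

    // Database-oriented: the page is one record in a (hypothetical) "pages" table.
    static String loadFromDatabase(Connection db, String title) throws Exception {
        try (PreparedStatement st =
                 db.prepareStatement("SELECT body FROM pages WHERE title = ?")) {
            st.setString(1, title);
            try (ResultSet rs = st.executeQuery()) {
                return rs.next() ? rs.getString("body") : null;
            }
        }
    }
}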

I was interested in some of the various alternatives, one of which was the commercial wiki engine Clearspace ($29/user). I was surprised to learn that the BBC website is (evidently) powered by this product. Clearspace is written in Java, which is very interesting to me for several reasons I'll discuss later. Another product that looked attractive to me is Corendal.


SOURCES & ADDITIONAL READING: Gmap Package, Bitweaver organization; Wiki Popularity results, Wiki Creole; Comparison of WikiEngines (results for this table), WikiMatrix;


01 September 2006

On the Varieties of Wiki Experience

The Wiki* is a type of website in which many users can edit pages. Perhaps the best example of a Wiki is Wikipedia, a collaborative encyclopedia in many languages. The English version of Wikipedia has over 1.7 million entries, and there are 186 languages for Wikipedia at the time of this writing. This is made possible by the fact that there are thousands of people posting entries in Wikipedia on every imaginable subject. Wikipedia also has a structure of editing rules that tend to weed out poorly written or misleading articles over time. As a result, while some Wikipedia articles are flawed, they are subject to constant scrutiny and deliberation (here's an example).

In addition to Wikipedia, there are numerous reference wikis (list). A major distinguishing feature of wikis is the choice of wiki engine, which is basically a database front end for the display of information. I was fortunate to discover this website listing a large number of wiki engines available for creators of wikis. Oddly, this site--though vast--does not include a very important, famous wiki, Everything2. Everything2 has its own, custom wiki engine.

PmWiki and MediaWiki are two of the most frequently used engines; the latter is available for free download. PmWiki has various "skins" that allow a wiki manager to select the look and feel of the site. However, both can be readily modified to look very different from the familiar Wikipedia format (e.g., Neogia--MediaWiki; OpenSceneGraph--PmWiki). In some cases, such as Wikia, a single wiki implementation is scaled up to support 3,000+ distinct wikis hosted at the same site (almost exactly the same as Blogger, only with wikis instead of blogs).

What makes a wiki different from other multi-page websites is supposedly that so many people can edit it, although one can also enable many posters on one's blog, so that's not a crucial difference. Another difference is that wikis are "quick": entries can usually be typed in quickly, without knowledge of HTML. And since the original WikiWikiWeb (c. 1994) is older than the earliest blogs, it's reasonable to point out that the blog is a derivative of the wiki concept: a multi-page website that can be generated quickly merely by typing in the content (as opposed to individually coding each page in HTML, as one does with static web pages). Now Blogger offers similar ease of use.

What I regard as the definitive distinction is a cluster of basic features, including the ability to save prior drafts of each page, and the fact that pages are organized in a non-hierarchical fashion. So, for example, in a traditional website like the old Library of Congress country studies site, there is a top page that branches out to the list of countries, then to the topic headings for each country study (e.g., Oman). In contrast, one can navigate this blog by clicking links all day, but that's merely an option I've tried to encourage in the year or so I've been posting here. This blog has a linear structure, organized by month. Blogger has kindly included a search engine (top) and I incorporated another (right sidebar, main index). But in a large wiki site, where pages are added by many people, links to the page or searches are the only reliable navigation methods. The main structure of the site takes the form of the database tables, which may not even exist for certain wiki engines.

* The term "wiki" is derived from a Hawaiian colloquialism for "quick."

See also, " Wikis used for Reference," "Java and CMS," and "List of Wiki Engines,"


15 June 2005

Data Freeway

FTP stands for File Transfer Protocol. It's the standard for transferring files (of any format) between different operating systems. It allows a webmaster to communicate with the server hosting her homepage.

FTP clients
are programs that allow a webmaster to upload or edit the files of a web page. Unlike "normal" files like your term paper (that you wrote in MS Word), the files that are incorporated in your webpage reside on another computer—an FTP server. FTP servers are also called hosts. You are only going to need an FTP client if you have a web page. Even then, you may not need one: this site can be maintained without an FTP client. Most personal web publishers, like Nucleus CMS, Movable Type, bBlog, WordPress, b2evolution, boastMachine, Radio, and Drupal* have file uploading built in. Likewise, Macromedia Dreamweaver has an FTP client built in.

However, these are often inadequate. Movable Type only allows one to upload files; taking them down or editing them, or re-arranging the file system (like, for example, putting images in subdirectories) is impossible. I've never used the other publishing CGI applications, so I can't comment about them.
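For the curious, here is a rough sketch of what a stand-alone FTP client does on your behalf. It uses the Apache Commons Net library purely for illustration (my choice, not something any of the products above are built on, as far as I know), and the host, login, and file names are of course made up:

// Hypothetical upload session: connect, make a subdirectory, upload, delete, disconnect.
import java.io.FileInputStream;
import org.apache.commons.net.ftp.FTP;
import org.apache.commons.net.ftp.FTPClient;

public class UploadExample {
    public static void main(String[] args) throws Exception {
        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com");            // the host serving your web pages
        ftp.login("username", "password");
        ftp.setFileType(FTP.BINARY_FILE_TYPE);     // needed for images

        // The kind of housekeeping built-in publishers often can't do:
        ftp.makeDirectory("images");               // create a subdirectory
        try (FileInputStream in = new FileInputStream("photo.jpg")) {
            ftp.storeFile("images/photo.jpg", in); // upload into it
        }
        ftp.deleteFile("old-photo.jpg");           // take an old file down

        ftp.logout();
        ftp.disconnect();
    }
}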

For those of you who—like me—always want free stuff, there is a free download of an excellent FTP client available:

So now you know. I've been using this one for several months and I think it's superb.
* List of web publishers and links via Wikipedia


14 June 2005

On a Foray into HTML-3

SunSoft Java[*] and Netscape JavaScript[*] are often mentioned together. They're both programming languages commonly associated with the internet. The similar names are misleading, however, and they refer to very different things. In this blog post and others, I'm going to refer to a program and its elements as "code." You could say programs are written with code. I also will use the term "compiler." This is a program that reads code written in a high-level language and translates it into a lower-level form so the computer can do what it's supposed to do.

Java was developed around the same time as Mosaic, the first widely used Web browser, and released in the mid-1990s. Most computers supplied since then have a "Java virtual machine" (JVM), a program that executes compiled Java code. This allows the same Java code to be run by any browser anywhere, any time, regardless of the computer on which one is browsing the web. The VM is common to all browsers, regardless of flavor (this is not STRICTLY true!).
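The classic example: the source file below is compiled once (javac Hello.java produces the bytecode file Hello.class), and that same bytecode can then be run by any machine's JVM (java Hello), whatever the operating system. This is my own trivial illustration, not anything from the post itself:

// Compile with: javac Hello.java    Run anywhere with: java Hello
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello from the Java virtual machine");
    }
}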

An application is any program that you need a computer for, such as word processing or managing a database. A small Java application embedded in a web page and run by the browser is called an applet. An applet can do pretty much anything that a conventional application can do; so, for example, this list of applets includes calculators, graphers, and simulators; an MP3 player; chat rooms, email programs, and spam blockers have also been written in Java.
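For what it's worth, here is about the smallest possible applet: a made-up example of a little Java program that a web page embeds with an <applet> tag and the browser's JVM runs:

// A hypothetical minimal applet that just draws a line of text in the page.
import java.applet.Applet;
import java.awt.Graphics;

public class HelloApplet extends Applet {
    @Override
    public void paint(Graphics g) {
        g.drawString("Hello from an applet", 20, 20);
    }
}
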
What about JavaScript? JavaScript was created by Netscape as a simple set of commands that all browsers would recognize. Unlike Java, which is a completely separate programming language designed for stand-alone applications, JavaScript programs, or scripts, are usually embedded directly in HTML files. The script executes when the user's browser opens the HTML file.

JavaScript allows the person visiting your website to interact with the site. A simple script might let the visitor select the background color of the page. Another script can detect the user's operating system and browser type, then give instructions appropriate to the user's particular computer. A third type evaluates user input. Drop-down menus and combo boxes are the kinds of things you can build with JavaScript.

(To be continued.)


On a Foray into HTML-2

This post has been edited for accuracy

So, to recap: the Web and the Internet are similar, and it's reasonable for people to use them as synonyms. It's just that the Web is what individual computer users have created with HTML, in the medium of the Internet. The Internet is older; it's the foundation and building material of the Web.

Web pages are created with HTML, which is simply a file type that can be read by a browser. Web pages are "made" of HTML; HTML is a high-level computer language that explains to the browsers visiting the site how to display the text and images hosted at the website.

In addition to the HTML files that the browser reads, there are elements that the browser is told to display. Web browsers are designed to "read" (recognize the format and display accurately) JPEG images (*.jpg), GIF images (*.gif), TIF (*.tif), and bitmaps (*.bmp). They can also recognize other types of files, which I'll describe in a moment.

In addition to HTML files, the above-mentioned image files, and Java or JavaScript files, you can post pretty much any type of file you want on your website. However, in order to read things like an MS Word document or an Acrobat PDF, visitors need to have the corresponding software installed on their computers. Hence the popularity of Adobe Acrobat: the software for reading PDF files is free; people pay for the software that creates *.pdf files. These files will display in a new window of the browser, or in a window spawned by the operating system (e.g., Windows or Mac OS will launch MS Excel if you open an Excel file at a website).

WHAT ARE SOME OTHER FILE TYPES YOU CAN HAVE?
You can have MPEGs, which are files that are either audio, video, or both. MPEG refers to a standards committee (like you needed to know that!), and this committee keeps issuing new formats. MPEG-2 is the standard used for most *.mpg files. A variant is MPEG-4, which was modified to create the Windows Media Video (*.wmv) format; Apple QuickTime (*.mov) is a third. These file types can be created by different software packages, and they can be played back by freely distributed playback software. Like Adobe Acrobat, the player is usually free, and the computer's operating system must spawn the player for the file to be seen. The file formats are mutually incompatible, although some players can play more than one format.

In addition to these, there is Macromedia Flash/FlashPlayer. This is like the others, except that Flash allows one to create a digital image by manipulating objects in the Flash software; it's like MS PowerPoint, with the ability to animate the presentation and upload it to the web. Flash files (*.swf) are typically viewed as an animated graphic within the web page; it's not usually necessary to spawn a new window for playback. As a result, one can combine animated and non-animated elements in a single page. Also, Flash is very easy to use, in my opinion.
COOL STUFF I NOTICED LATER: Here's a blog post about new features available in the latest release of Flash (hat tip to Wikipedia's Flash entry).
(To be continued)


13 June 2005

On a Foray into HTML

Some terms of art for the web:

Some of you are going to hear technical language used here that is quite intimidating. A case in point is the jargon associated with web pages, the internet, and so on. The fact that many of these terms have multiple meanings doesn't make it easier, but let us hope this does.

First, many people surfing the internet may be a little confused by the terms "internet" and "web." These are almost, but not quite, synonyms. The internet is a network of networks that was connected (at least initially) through the telephone lines, using signals much like voice transmission. Computers exchanged data over these connections using a universal standard called TCP/IP, which grew out of the ARPANET, a project begun in 1969 by the Advanced Research Projects Agency (ARPA), a branch of the Department of Defense. Much later, a format called HTML was developed that allowed web browsers to take data sent over the network and convert it into a graphical display, such as a web page. Web browsers were invented at about the same time as HTML; the most influential early graphical browser, Mosaic, came from the National Center for Supercomputing Applications (NCSA). It's easy to see why browsers and HTML had to be developed together: a browser had to be able to translate incoming data into an image that could be displayed, and there had to be a standard format that every browser could understand.

The internet was initially useful to computer terminals connected to mainframes, running arcane software like FTP, Usenet, and Gopher. I recall having a lot of friends who were familiar with these services and talked about them a lot, and finding it inconceivable that these things would ever amount to anything but costly nerd toys. In 1993, however, Mosaic emerged as the first widely used graphical browser, thereby creating--in a stroke--the world of interconnected hypertext we know as the "Web."

(To be continued.)
