		World-Wise Web? 
		By Richard Waters
		Financial Times, March 4, 2008
 
		Edited by Andy Ross 
	A new wave of AI, built on a collection of technologies that 
	includes natural language processing, image recognition, and expert systems, 
	may lead to intelligent machines. Thinking Machines founder Danny Hillis: "I 
	had some hope you could just put everything into some big neural network 
	that would just start to think, but it doesn't take long working in AI to 
	realise it's much more complex than that."
 
 The basic building block 
	for this new technology movement is the semantic web, the brainchild of Sir 
	Tim Berners-Lee, who invented the present World Wide Web. Sir Tim imagined a 
	new web formed by linking the data contained inside the documents. That way 
	the data, not just the documents, would become accessible to machines.
 
 This semantic web is the product of a set of core standards promoted by 
	the World Wide Web Consortium, the organisation that Sir Tim leads. Now some 
	supporters say enough pieces are in place to make the first semantic web 
	services a reality.
 
 But there are some big obstacles. At the heart of 
	the problem is the need to make information on the web "understandable" to 
	machines, so that it can be extracted, processed and made useful. To make 
	this possible, machine-readable "tags" need to be attached to each piece of 
	data to describe what type of information it represents.
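 
 As a rough illustration of what such tagging makes possible, the sketch below attaches a type tag to each piece of data and then extracts values by tag. The tag names and the sample records are invented for this example. They are not taken from any W3C standard or from the article.

```python
from typing import NamedTuple


class Tagged(NamedTuple):
    tag: str    # what kind of information this is (illustrative vocabulary)
    value: str  # the raw text as it appears in the document


# Hypothetical tagged data, as it might be lifted from a web page.
page = [
    Tagged("person", "Tim Berners-Lee"),
    Tagged("organisation", "World Wide Web Consortium"),
    Tagged("date", "1989"),
    Tagged("person", "Danny Hillis"),
]


def extract(items, tag):
    """Return every value whose tag matches, preserving document order."""
    return [item.value for item in items if item.tag == tag]


people = extract(page, "person")
print(people)
```

 Once the tags exist, a program can select data by type without having to interpret the surrounding prose.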
 
 Attaching 
	these tags to every piece of information on the web is a huge task. Without 
	new semantic services, there is no incentive to undertake the laborious work 
	of tagging data, but creating the services is pointless unless the data 
	exist in the first place. To try to overcome the problem, the semantic web 
	depends on a set of "ontologies", or dictionaries that help to create common 
	definitions that can be universally applied. These are designed to establish 
	a basic common level of understanding about language to allow machines to do 
	their work.
 
 A technology first developed for use in AI is natural 
	language processing. Even simple words or concepts can mean very different 
	things to different people, and their meaning changes depending on 
	circumstances. While the human mind can make the necessary adjustments, 
	computers that follow strict rules about language find it hard to grasp the 
	many context-specific meanings.
 
 Companies trying to employ natural 
	language processing maintain that technical advances in recent years have at 
	last given it a level of practical application. By using software to "read" 
	text, services such as Powerset aim to add tags to data automatically. The 
	natural language approach also raises the possibility of new applications, 
	such as answering a user's questions directly.
 
 Powerset is using technology licensed from Parc to try to solve the problems 
	of natural language processing. The software is based on ideas similar to 
	those in quantum physics. A number of potential meanings for all the 
	elements in the text are allowed to co-exist as equally accurate during the 
	"reading", until the most likely answer is singled out at the end.
 
 Combining this approach with other techniques of data analysis can lift the 
	accuracy level further. One method relies on predicting the meaning of a 
	word based on the probabilities of its proximity to other words in the text. 
	As words do not appear in random sequences, the fact that one word has been 
	used in a sentence increases the chance that a particular other word will 
	also turn up.
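 
 The idea of keeping several candidate meanings alive and then choosing the most probable one from the neighbouring words can be sketched in a few lines. This is a toy model only, not Powerset's or Parc's software: the training snippets, the sense labels and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical sense-labelled snippets, invented for illustration.
TRAINING = {
    "bank/finance": [
        "deposit money in the bank account",
        "the bank approved the loan and the account",
    ],
    "bank/river": [
        "fishing from the river bank",
        "the river flooded the muddy bank",
    ],
}


def context_counts(snippets):
    """Count how often each word appears alongside the target word."""
    counts = Counter()
    for text in snippets:
        counts.update(w for w in text.split() if w != "bank")
    return counts


COUNTS = {sense: context_counts(s) for sense, s in TRAINING.items()}
VOCAB = {word for counts in COUNTS.values() for word in counts}


def score_senses(sentence):
    """Keep every sense alive and give each one a log-probability score."""
    scores = {}
    for sense, counts in COUNTS.items():
        total = sum(counts.values())
        score = 0.0
        for word in sentence.split():
            if word == "bank":
                continue
            # Add-one smoothing so an unseen word does not rule a sense out.
            score += math.log((counts[word] + 1) / (total + len(VOCAB)))
        scores[sense] = score
    return scores


def most_likely_sense(sentence):
    """Single out the best-scoring sense only at the end."""
    scores = score_senses(sentence)
    return max(scores, key=scores.get)


print(most_likely_sense("she opened an account at the bank"))
print(most_likely_sense("we walked along the river to the bank"))
```

 With this toy data the first sentence resolves to the finance sense and the second to the river sense, because "account" and "river" each make one reading more probable than the other.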
 
 Most observers expect the impact of the technology to be felt in 
	stages. The early advances are likely to be incremental improvements. Search 
	engines should return higher quality results, and services that rely on 
	personalization should make better guesses about your preferences, while 
	targeted advertising systems should become more accurate.
 
	The Charms of Wikipedia 
	By Nicholson Baker
	The New York Review of Books, Volume 55, Number 4, March 20, 2008
 
	Edited by Andy Ross 
	Wikipedia: The Missing Manual
	By John Broughton
	Pogue Press/O'Reilly, 477 pages
 
	Wikipedia is just an incredible thing. It has 2.2 million articles and it's 
	very often the first hit in a Google search. It was constructed, in less 
	than eight years, by strangers who disagreed about all kinds of things but 
	who were drawn to a shared, not-for-profit purpose.
 
 It worked and 
	grew because it tapped into the heretofore unmarshaled energies of the 
	uncredentialed. This was an effort to build something that made sense apart 
	from one's own opinion, something that helped the whole human cause roll 
	forward.
 
 Wikipedia was the point of convergence for the self-taught 
	and the expensively educated. All everyone knew was that the end product had 
	to make legible sense and sound encyclopedic. The need for the outcome of 
	all edits to fit together as readable, unemotional sentences muted natural 
	antagonisms. Wikipedians see vandalism as a problem, but a Diogenes-minded 
	observer would submit that Wikipedia would never have been the prodigious 
	success it has been without its demons.
 
 Co-founder Jimmy "Jimbo" 
	Wales: "The main thing about Wikipedia is that it is fun and addictive."
 
 John Broughton: "This Missing Manual helps you avoid beginners' blunders 
	and gets you sounding like a pro from your first edit."
 
	SAP NetWeaver TREX 
	Wikipedia:
	
	TREX search engine 
	TREX is a search engine in the SAP NetWeaver integrated technology platform 
	produced by SAP AG. The TREX engine is a standalone component that can be 
	used in a range of system environments but is used primarily as an integral 
	part of such SAP products as Enterprise Portal, Knowledge Warehouse, and 
	Business Intelligence (BI, formerly SAP Business Information Warehouse). In 
	SAP NetWeaver BI, the TREX engine powers the BI Accelerator, which is a 
	plug-in appliance for enhancing the performance of online analytical 
	processing. The name "TREX" stands for Text Retrieval and information 
	EXtraction, but it is not a registered trade mark of SAP and is not used in 
	marketing collateral.
 
	Search functions
 TREX supports various kinds of 
	text search, including exact search, boolean search, wildcard search, 
	linguistic search (grammatical variants are normalized for the index search) 
	and fuzzy search (input strings that differ by a few letters from an index 
	term are normalized for the index search). Result sets are ranked using term 
	frequency-inverse document frequency (tf-idf) weighting, and results can 
	include snippets with the search terms highlighted.
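 
 The excerpt does not give the exact weighting formula TREX uses, so the following is only a textbook tf-idf ranking sketch. The three sample documents and the function names are invented for the example.

```python
import math
from collections import Counter

# Hypothetical mini-corpus, invented for illustration.
DOCS = {
    "doc1": "search engine index search",
    "doc2": "text mining and classification",
    "doc3": "engine performance and memory",
}


def tokenize(text):
    return text.lower().split()


def tf_idf_scores(query, docs):
    """Score each document as the sum of tf * idf over the query terms."""
    tokenized = {name: tokenize(body) for name, body in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        score = 0.0
        for term in tokenize(query):
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized.values() if term in other)
            if df == 0:
                continue  # a term found nowhere contributes nothing
            tf = counts[term] / len(words)
            idf = math.log(n_docs / df)
            score += tf * idf
        scores[name] = score
    return scores


def rank(query, docs):
    """Return matching document names, best score first."""
    scores = tf_idf_scores(query, docs)
    hits = [name for name in scores if scores[name] > 0]
    return sorted(hits, key=lambda name: scores[name], reverse=True)


print(rank("search engine", DOCS))
```

 Here "search" is rarer across the corpus than "engine", so the document that contains it twice ranks ahead of the one that only mentions "engine".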
 
 TREX supports 
	text mining and classification using a vector space model. Groups of 
	documents can be classified using query-based classification, example-based 
	classification, or a combination of these plus keyword management.
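 
 A minimal sketch of example-based classification in a vector space model follows. It assumes plain word-count vectors and cosine similarity, which is one common choice and not necessarily what TREX implements. The categories and example texts are hypothetical.

```python
import math
from collections import Counter

# Hypothetical example documents per category, invented for illustration.
EXAMPLES = {
    "finance": ["invoice payment bank account", "bank loan interest payment"],
    "sports": ["football match goal score", "tennis match final score"],
}


def vectorize(text):
    """Represent a text as a sparse vector of word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def summed_vector(texts):
    """Add up the example vectors; cosine ignores the overall scale."""
    total = Counter()
    for text in texts:
        total.update(vectorize(text))
    return total


CATEGORY_VECTORS = {label: summed_vector(t) for label, t in EXAMPLES.items()}


def classify(text):
    """Assign the category whose example vector is closest in angle."""
    vec = vectorize(text)
    return max(CATEGORY_VECTORS, key=lambda label: cosine(vec, CATEGORY_VECTORS[label]))


print(classify("payment from bank"))
print(classify("final score of the match"))
```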
 
 TREX supports structured data search not only for document metadata but also 
	for mass business data and data in SAP business objects. Indexes for 
	structured data are implemented compactly using data compression and the 
	data can be aggregated in linear time, to enable large volumes of data to be 
	processed entirely in memory.
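 
 The excerpt does not say which compression scheme is used. As an assumption, the sketch below uses dictionary encoding, a common technique for in-memory column data, and shows how a grouped sum can then be computed in a single linear pass. The column names and figures are invented.

```python
def dictionary_encode(column):
    """Replace each value with a small integer code plus a lookup table."""
    dictionary = []  # code -> original value
    index = {}       # original value -> code
    codes = []
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def sum_by_group(dictionary, codes, amounts):
    """Aggregate in one pass: each row costs one array update."""
    totals = [0] * len(dictionary)
    for code, amount in zip(codes, amounts):
        totals[code] += amount
    return dict(zip(dictionary, totals))


# Hypothetical business data, invented for illustration.
regions = ["EMEA", "APAC", "EMEA", "AMER", "APAC", "EMEA"]
revenue = [100, 250, 50, 300, 150, 25]

dictionary, codes = dictionary_encode(regions)
totals = sum_by_group(dictionary, codes, revenue)
print(totals)
```

 Storing short integer codes instead of repeated strings keeps the column compact, and the aggregation touches each row exactly once.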
 
	Sir Tim: Google could be superseded 
	
	By Jonathan Richards
	Times Online, March 12, 2008
 
	Edited by Andy Ross 
	Google may eventually be displaced as the pre-eminent brand on the internet 
	by a company that harnesses the power of next-generation web technology, 
	says Tim Berners-Lee. The web of the future would allow any piece of 
	information — such as a photo or a bank statement — to be linked to any 
	other.
 
 Tim Berners-Lee said that in the same way, the "current craze" 
	for social networking sites would eventually be superseded by networks that 
	connected all types of things. The semantic web will enable direct 
	connectivity between much lower-level pieces of information (a written 
	street address and a map, for instance), which in turn will give rise to new 
	services.
 
 Tim Berners-Lee: "Using the semantic web, you can build 
	applications that are much more powerful than anything on the regular web. 
	Imagine if two completely separate things — your bank statements and your 
	calendar — spoke the same language and could share information with one 
	another. You could drag one on top of the other and a whole bunch of dots 
	would appear showing you when you spent your money."
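 
 The bank statement and calendar example amounts to joining two data sets on a shared, commonly understood field, in this case the date. The sketch below shows that join with invented records. It is an illustration of the idea only, not semantic web technology as such.

```python
from collections import defaultdict
from datetime import date

# Hypothetical bank statement rows: (day, payee, amount).
statement = [
    (date(2008, 3, 3), "Bookshop", 24.50),
    (date(2008, 3, 5), "Train ticket", 61.00),
    (date(2008, 3, 5), "Lunch", 12.75),
]

# Hypothetical calendar: day -> event.
calendar = {
    date(2008, 3, 5): "Conference in London",
    date(2008, 3, 7): "Dentist",
}


def spending_by_event(statement, calendar):
    """Total the amounts spent on days that have a calendar event."""
    totals = defaultdict(float)
    for day, _payee, amount in statement:
        event = calendar.get(day)
        if event is not None:
            totals[event] += amount
    return dict(totals)


result = spending_by_event(statement, calendar)
print(result)
```

 The join works only because both sources represent the date in the same way, which is the kind of shared language the quotation describes.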
 
 Tim Berners-Lee 
	invented the World Wide Web in 1989 while a fellow at CERN in Switzerland. 
	Asked about the type of application that the Google of the future would 
	develop, he said it would likely be a type of mega-mashup, where information 
	is taken from one place and made useful in another context using the web.
 
 Tim Berners-Lee is now a director of the Web Science Research 
	Initiative, a collaborative project between the Massachusetts Institute of 
	Technology and the University of Southampton.
 
	  
		