Semantic Technology: A New Context
Posted: January 15, 2007
Semantic search is actually just one facet of what’s possible using semantic technology. There are many more uses and implementations that are generally not discussed and frequently passed over altogether.
This doesn’t mean that they are any less valid. It’s just that the companies developing these technologies are, for the most part, search engine companies looking to apply them to online applications over gargantuan databases of millions, if not billions, of websites.
So let’s have a look at some other practical uses of the technology that think slightly further out of the box.
404 error pages
How irritating is it when you hit a 404 page for an article which would have contained everything about the subject you were looking for? It’s very annoying and the chances are it hasn’t been deleted forever but just moved and then not yet re-indexed (if it ever will be). If the article is very old then it can be extremely difficult to find in a website with a mass of content.
By using semantic technology we can do a number of things to aid that lost user. If the URL has been rewritten to include the title, and the referring page contains good content, we can come up with the most likely pages they should have been directed to. We simply compare the referring content and the referring link against the website’s database or XML site map. This helps to ensure that your users, even if lost, will usually find what they’re looking for.
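To make that comparison a bit more concrete, here is a minimal sketch in Python. It is only a rough illustration: plain word overlap (cosine similarity over word counts) stands in for proper semantic analysis, and the names suggest_pages and site_index, the stopword list and the choice to count URL words double are all my own assumptions. The site_index dictionary is a stand-in for whatever your database or XML site map holds.

```python
import re
from collections import Counter
from math import sqrt
from urllib.parse import urlparse

# Assumed stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "on", "html", "php"}

def tokens(text):
    """Lower-case word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def suggest_pages(missing_url, referrer_text, site_index, limit=3):
    """Rank known pages against the dead URL and the referring content.

    site_index is a hypothetical {url: title or text} mapping.
    """
    # Words from the rewritten URL count double (an arbitrary weighting).
    slug = tokens(urlparse(missing_url).path)
    query = Counter(slug * 2 + tokens(referrer_text))
    scored = [(cosine(query, Counter(tokens(text))), url)
              for url, text in site_index.items()]
    scored.sort(reverse=True)
    # Drop pages that share no words at all with the query.
    return [url for score, url in scored[:limit] if score > 0]
```

You would call this from your 404 handler with the requested URL and the text of the referring page, then list the returned URLs as "did you mean" links.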
Increasingly, content is being tagged and the structure of content is improving thanks to the advent of Web 2.0’s social standards and astute webmasters/SEOs. There is in fact a veritable goldmine of data which is available for analysis by your website or blog statistics packages. Is it being used though? Not so much (obligatory Borat quote dealt with).
Think of the data generally gathered by your statistics package:
Referring website pages.
Search engine referrals, with the keywords of the query used.
These sources are both rich for use in semantic relationship analysis. The referring links are likely going to be from articles or opinion pieces of some type whilst the search engine referrals will include the search query that was used by the user to find your page.
If this data is properly focused we can show not just where your traffic is arriving from but what your traffic is arriving from. We can use the referring pages and search engine queries to work out the context of the referring pages and their keyword densities, and break traffic down into categories and topics. In the simplest case we can suggest the proportion of negative to positive response traffic. Tagging your articles and selecting keywords for SEO can be greatly eased by looking at this data, seeing what areas already perform well and strengthening those. I believe there are many other uses in this area, but as always I want my readers to think a bit for themselves and come up with other possibilities. The point, though, is that knowing the context your traffic puts you in is an invaluable resource.
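As a starting point, here is a small Python sketch of the very first step: pulling the search words out of the referrer URLs in your logs and counting them. The parameter names in QUERY_PARAMS are assumptions about how the engines pass the query along, so check them against your own logs, and the function names are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed query-string parameter names; adjust to match your own referrers.
QUERY_PARAMS = ("q", "p", "query")

def search_terms(referrer):
    """Return the search words carried in a referrer URL, or [] if there are none."""
    params = parse_qs(urlparse(referrer).query)
    for name in QUERY_PARAMS:
        if name in params:
            return params[name][0].lower().split()
    return []

def keyword_profile(referrers):
    """Count how often each search word brought a visitor to the site."""
    counts = Counter()
    for referrer in referrers:
        counts.update(search_terms(referrer))
    return counts
```

Calling keyword_profile(referrers).most_common(10) gives a rough picture of the words your visitors arrive with. Grouping those words into categories, or scoring referring pages as positive or negative, would sit on top of this and is not attempted here.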
DySeTagging (Dynamic Semantic Tagging [Dice – Tagging])
Dice Tagging is kind of a joke. As my first commenter from the last article will realize, I’m making up crap acronyms for fun because terms like Web 2.0 tend to make me cringe (yes, I realize I’ve used it). The only recent acronym I actually use is probably AJAX.
Anyway, this Dice stuff is clever. There are reportedly a number of groups working on something similar to what I’m going to talk about – including DARPA. The premise is that the web server itself has a semantic module, and on the load of any web page or document it analyses the context of the page and generates tags to define it which are then added to the header information.
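To give a feel for the idea, here is a toy Python sketch. It is not how any of those groups do it: simple word frequency stands in for real context analysis, I am assuming that "header information" means a keywords meta tag in the HTML head, and the function names and stopword list are invented for the example.

```python
import re
from collections import Counter
from html import escape

# Assumed stopword list; a real module would use a far larger one.
STOPWORDS = {"the", "and", "for", "that", "this", "with", "are", "was", "you", "from"}

def generate_tags(page_text, limit=5):
    """Pick the most frequent meaningful words on the page as tags."""
    words = re.findall(r"[a-z]{3,}", page_text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]

def keywords_meta(tags):
    """Render the tags as a keywords meta element for the page head."""
    return '<meta name="keywords" content="%s">' % escape(", ".join(tags), quote=True)
```

A server module would run generate_tags on the page body as it is served and write the result of keywords_meta into the head before sending the response.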
This saves a lot of load on the poor search engine at the other end and on you at your end, and enables you to be responsible for your own tagging system rather than having tags assigned to you by an illiterate engine programmed by a kid on an OLPC.
Make your own mind up. As usual, I’m trying vainly to ignite some sparks in other developers and thinkers out there who can take the technology where it needs to go. I wish I had the time to spend on all the projects I think of, but unfortunately I don’t, which is half the reason I have this blog now. A lot of what I write is playing Devil’s Advocate and is meant to produce a reaction! So please give me some 🙂