
First Aid Quiz, St. John Ambulance News and The pitfalls of web capturing

My First Aid Quiz web site has a page which displays relevant first aid and St. John Ambulance news. Whilst I have control over the first section, the St. John Ambulance news stories are taken from, and linked to, the St. John Ambulance News web page. This is done automatically by a script that runs daily.
To achieve this the script scans the web page for key markers that identify each of the stories and then extracts the relevant information. Whilst this works well, the problem is that if the site changes, as the St. John Ambulance site has done recently, then the script is unable to identify the appropriate sections. In the best case it simply misses the information; in the worst case it could create a badly corrupted news entry, although that depends upon how the script is written and how much the site has changed.
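As a rough sketch of the technique, this is the general shape of such a script in Python. The markup pattern here is entirely made up for illustration; the real St. John Ambulance page uses different markup, and, as described above, any change to it is exactly what breaks this kind of scraper.

```python
import re

def extract_stories(html):
    """Extract (title, link) pairs from a news page.

    Assumes (purely for this example) that each story is marked up as:
        <div class="news-item"><a href="LINK">TITLE</a></div>
    If the site changes this structure, the pattern matches nothing
    and the stories are silently missed.
    """
    pattern = re.compile(
        r'<div class="news-item"><a href="([^"]+)">([^<]+)</a></div>'
    )
    return [(title, link) for link, title in pattern.findall(html)]

# A fabricated sample page to demonstrate the idea:
sample = (
    '<div class="news-item"><a href="/news/1">First story</a></div>'
    '<div class="news-item"><a href="/news/2">Second story</a></div>'
)
print(extract_stories(sample))
```

In practice the page would be fetched over HTTP on the daily run; the fragile part is the pattern, not the fetch.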
The change to the St. John Ambulance web site was quite small, but it was enough to mean that stories were not being captured for just under a month. This is not a big deal for this site really, but for others it could be a bigger problem.

There is a solution, but it depends upon how the third-party site is managed, and it is not currently possible for the St. John Ambulance web site. If a site publishes an RSS feed, then tools such as RSS readers can combine multiple feeds into a single page without relying on the page layout staying the same. You can see the RSS feed for my WatkissOnline blog at:
RSS Feed for WatkissOnline blog entries.

The Firefox web browser includes an RSS reader, although I’m really looking for a way to read RSS through programming means; I’ve not yet found a news feed reader that meets my requirements.
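I haven’t settled on a tool, but as a sketch of what reading RSS programmatically looks like, Python’s standard library can parse a feed directly. The feed content below is a made-up sample, not the real WatkissOnline feed; the point is that RSS gives a stable, structured format, so this code keeps working even when the site’s page layout changes.

```python
import xml.etree.ElementTree as ET

# Fabricated RSS 2.0 sample for illustration.
RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>WatkissOnline</title>
    <item>
      <title>Example entry</title>
      <link>https://www.example.com/entry</link>
    </item>
  </channel>
</rss>"""

def read_feed(xml_text):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]

print(read_feed(RSS_SAMPLE))
```

In a real script the XML would be fetched from the feed URL, but the parsing step stays the same regardless of how the site’s HTML is styled.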