Things To Do
This page is a list of design and coding things to do for the Gallery, largely unsorted and unorganised... Much of it is just wish-list.
There is a page of pending fixes, housekeeping, etc, to be done.
This page content is English-only.
Possible Student Projects
Note that there are some large and fairly self-contained tasks which
might be suitable for an undergraduate end-of-year project,
for which some support would be provided
as well as acknowledgement of the student's contribution. Work done
(eg code) would be made publicly available under the same conditions
as the rest of the Gallery; eg open source code under the BSD licence.
In outline, possible projects include:
- 3D real-time interactive walkthrough of the Gallery,
as if visiting and wandering around a physical gallery
being able to inspect stills and movies "hung" on the walls, etc,
with rendering of comparable quality to games consoles,
or in Web browsers or mobile phones with Java.
The Gallery view should be accessible to as many users as is reasonably
possible, not requiring unreasonably expensive bandwidth or CPU,
or at least a graceful degradation of experience in limited environments.
One possible delivery mechanism would be Java WebStart.
- Integration of charging mechanism, preferably widely supported such
as VISA or PayPal IPN or EntroPay, to enable instant donations or
the ability to pay for faster access (and thus Gallery upgrades!)
especially when the system is very busy.
- Extension of exhibit-upload and management features,
with the following features:
- Runs off-line but connects as needed to upload exhibits,
update its off-line index, etc, retrying as needed,
and efficient with bandwidth for users on slow/expensive connections.
- Allows the user to rename entire directories full of exhibits
ready for bulk upload, verifying names, magic numbers, etc,
and possibly moving to an "uploaded" directory when done
and providing the user with a local catalogue of all their own images.
- Allows entry and editing of all aux data about an exhibit,
such as description, location, etc,
possibly with locator on world map and/or mockup catalogue page.
All edits and uploads still to be approved by the maintainer.
- Usable by contributors to manage their own collection of images,
a local index and full thumbnails off-line,
plus any uploaded-but-not-yet approved exhibits,
eg for picture research.
- Usable by the maintainer to examine all pending exhibits and approve
or not and possibly create a list of files to be moved into the
Gallery proper in one operation run off-line,
eg on the Gallery master server's file server.
This implies that the maintainer has "super" rights;
these could possibly be attached to the "ANON" contributor ID.
(This final step must at least initially NOT be available on-line
for security reasons, or possibly only over HTTPS and/or
from "trusted" IP addresses.)
- Can run over HTTPS or be otherwise authenticated on each operation
if need be.
- Allows for edit and update of existing exhibits already uploaded by that
- Possible explicit support for author "stories"/travelogues.
- Allows synchronisation of on-line and off-line aux data and maybe
exhibit data (so the user gets a handy indexed copy of their images
wherever they want). This synchronisation to be partial or full with
clear progress indication and the possibility to cancel and keep any
progress already made.
- As many of these features as is reasonable are provided
through the light-weight Web interface,
so a contributor arriving at an Internet cafe with their digital camera
can achieve many of the same end results in a similar way,
and in any case the look-and-feel and general operations should stay
similar so that a user can switch back and forth easily and with as few
unexpected (eg security and architectural) consequences as possible.
- Any agreed set of the to-do items described elsewhere.
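As an illustration of the "verifying names, magic numbers, etc" step in the upload tool outlined above, here is a minimal sketch in Java. The class and method names (MagicNumberSketch, sniff, suffixMatchesContent) are hypothetical and not taken from the Gallery code. Only the well-known JPEG, PNG and GIF signatures are checked; a real tool would need to cover every media type that the Gallery accepts.

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical helper: guesses a file type from its leading "magic" bytes. */
final class MagicNumberSketch {
    // Standard file signatures for three common still-image formats.
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] GIF = {'G', 'I', 'F', '8'};

    /** Returns true if data begins with every byte of prefix. */
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) { return false; }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    /** Returns "jpg", "png" or "gif", or null if the header is not recognised. */
    static String sniff(byte[] header) {
        if (startsWith(header, JPEG)) { return "jpg"; }
        if (startsWith(header, PNG)) { return "png"; }
        if (startsWith(header, GIF)) { return "gif"; }
        return null;
    }

    /** True if the file name's suffix agrees with what the header bytes indicate. */
    static boolean suffixMatchesContent(String fileName, byte[] header) {
        String type = sniff(header);
        if (type == null) { return false; }
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (type.equals("jpg")) {
            return lower.endsWith(".jpg") || lower.endsWith(".jpeg");
        }
        return lower.endsWith("." + type);
    }
}
```

An off-line client could run such a check over the first few bytes of each file before queueing it, so that a mislabelled file is rejected without spending any upload bandwidth.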
These are the things I'd like to do first for various reasons.
Ideally, this should only ever contain a handful of items,
preferably quick to resolve,
and not especially categorised.
The first items in this list are probably the most likely or important.
- Include version number in site and make it available programmatically.
- Integrate with PayPal IPN and/or EntroPay to take payment.
- Fix "infinite redirections" problem that hurts IE6:
GET /_virtual/ByCategory/places-and-sights/Su/Ss/S-/Sc/archive-11.HTM HTTP/1.1
HTTP/1.1 302 Moved Temporarily
Date: Wed, 10 Mar 2004 17:11:02 GMT
Server: Apache Tomcat/4.1.27 (HTTP/1.1 Connector)
- IPv6 support and ECM assuming that IPv6/48 ~ IPv4/24, eg see:
- Fix "Recent additions to the Gallery" filter on search page.
- Add select-by maximum-bytes and minimum-megapixels/resolution/orientation/Hz filters, eg:
- Store (and add it) local-only (ie per server) CTR-by-layout-style stats.
- Index and allow select by EXIF camera make and model data.
- Finish off light-weight Scorer computation front-end:
- Fetch & prime with server's best Scorers (also indicates base names to work on)
- Implement callback & send best to server
- Check that doChunk() logic is correct for Mini
- Have server cache calib set for a while (1m-1h)?
- Install as JWS on master
- Stop work after ~1hr with no progress
- Include LocalProps "emergency cutoff" to prevent background Scorer work.
- Write new tool(s) to:
- Move a complete set of "approved" exhibits safely into place.
- Set file permissions.
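The IPv6 item in the list above assumes that an IPv6 /48 is roughly comparable to an IPv4 /24. One way to apply that assumption is to reduce each client address to its prefix before using it as a key, for example when grouping or rate-limiting clients. The following is a minimal sketch only: the class IpPrefixSketch and its methods are hypothetical, not existing Gallery code, and it uses only the standard java.net.InetAddress API.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: maps an address to an IPv4 /24 or IPv6 /48 prefix key. */
final class IpPrefixSketch {
    /** Returns a copy of addr with only the first prefixBits bits kept, the rest zeroed. */
    static byte[] mask(byte[] addr, int prefixBits) {
        byte[] out = new byte[addr.length];
        for (int i = 0; i < addr.length; i++) {
            int bitsLeft = prefixBits - 8 * i;
            if (bitsLeft >= 8) {
                out[i] = addr[i];                                   // whole byte is inside the prefix
            } else if (bitsLeft > 0) {
                out[i] = (byte) (addr[i] & (0xFF << (8 - bitsLeft))); // partial byte
            }                                                       // else: stays zero
        }
        return out;
    }

    /** Returns a key such as "192.0.2.0/24" for IPv4, or the /48 network for IPv6. */
    static String prefixKey(InetAddress address) {
        byte[] raw = address.getAddress();
        int bits = (raw.length == 4) ? 24 : 48; // the assumed /24 ~ /48 equivalence
        try {
            return InetAddress.getByAddress(mask(raw, bits)).getHostAddress() + "/" + bits;
        } catch (UnknownHostException e) {
            // Cannot happen: the masked array has the same (legal) length as the input.
            throw new IllegalStateException(e);
        }
    }

    /** Convenience overload; expects a numeric literal so that no DNS lookup is made. */
    static String prefixKey(String ipLiteral) {
        try {
            return prefixKey(InetAddress.getByName(ipLiteral));
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + ipLiteral, e);
        }
    }
}
```

With this sketch, two clients in the same IPv6 /48 produce the same key, just as two clients in the same IPv4 /24 do, so any per-address logic can be shared between the two protocols.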
This is where the bulk of pending things should be filed.
The first items in this list are probably the most likely or important.
This should be(come) roughly categorised by area, eg upload or tree-display.
User Experience, Interactivity, New Features (M)
- Colour-code some of the sections such as Toilets/Food/Travel, eg given this quote from a BBC News item:
Colour coding saves reading time by a third, he says. It's common to use black text on yellow background for flying information (departures, arrivals), yellow text on black for bathroom facilities, green for exits and blue for food and retail.
- SVG support (FF now supports it natively).
- RAW support (eg with jrawio for the Sony DSC-F828 .srf files).
- Improve Aloha Earth. Still to do (and suggestions from testers):
- Integrate with higher-resolution images for high zoom,
eg from satellite sources such as http://www.sat.dundee.ac.uk/pdus.html.
- Integrate with live images, eg weather sat overlay.
- Integrate with Google Earth/Maps or similar.
- (RF) a search box with country names
- (RF) do make it clear that page buttons are such
- (RF) include number of pages near exhibits number
- (RF) allow the user to specify the number of images per page
- Add HREF listener to JWS uploader HTML items.
- Integrate catalogue with Google Maps or similar.
a little like the clues given by JIndexer,
based on the lexicon of words collected by JIndexer or otherwise.
This should not take lots of space (ie time to download or execute),
- On search page show words NOT found in index,
and closest ones to them alphabetically.
- Improve thumbnail quality by ensuring that JPEG thumbnails do not contain any thumbnail
from the original image (eg via the metadata).
- To search add select by (less than) size, eg: 1kB, 32kB, 1MB, 32MB, 100MB up to max exhibit size.
- To search add image select by (more than) total pixels, eg: 1kpix ... 10Mpix.
- By country filter/search in catalogue and Aloha Earth and starting with user's location.
Implies maintaining map of ccTLD x country prefix in places-and-sights x i18ned name.
- Add map coordinate filter/search to catalogue and Aloha Earth.
- Add "not" (ie invert) to each search select option.
- A Computer-Generated Images (CGI) section,
including maybe games screenshots,
and other synthetic imagery (eg maybe visualisations, graphs).
(Also possibly computer-games packshots in Mechanoids section.)
Thanks to Level80 for provoking
- Add property/ies in GenProps to override default livery colours,
text colour, etc.
- Automatically extract more text/information from exhibits
for indexing and display
- OCR of still images: eg http://www.java-channel.org/display.jsp?id=c_10678
- Speech-to-text from audio/video.
- Extend/improve automatic detection with list of User-Agent regexes in GenProps.
- Add number of exhibits at/below current node in tree and against sub-nodes
in tree-based views.
- Show "bg" images tiled, possibly as their own backdrop.
- Enhance jIndexer with a strict-AND mode (where all words must show)
which should be very fast and only return relevant results,
and an auto-AND mode which tries strict-AND and falls back to
normal all-words mode, and make that the default for user queries.
- Add ability for contributors to add annotated virtual collections
(eg "stories" or travelogues) under their name on the site
and edit these via a Web interface including mainly their own exhibits,
thus enabling (for example) people travelling to upload digital pics
from Internet cafes and provide annotations and descriptions
in a particular order (from which a visitor can click through to the
catalogue page for more detail on particular items)
and edit or add to at any time, for example when they come back.
Must be robust enough to survive some tweaking of names by the Gallery
These can then be seen as 'features' perhaps.
- BG suggests: ``Another idea I had for your pg2k site: you have this
'find an exhibit at random' button which I think is very cool (it keeps
people hooked to the site). So you could extend the concept so that people
could choose an exhibit totally at random or amongst the best/worst rated,
not rated, new, etc. exhibits.'' On the search page this might choose given
the filter criteria, and keywords if any, and might automatically exclude
poorly-rated exhibits by default, for example (which helps my LRU cache?).
We might also make the random function non-uniform, ie more likely to pic
Structural/Architectural Changes (M)
- Reload cached by-word index on restart of server.
- Try BZIP2 and LZMA compression methods for very large RPC frames.
- Add generic getProperties(hash, name, subtype) to data pipeline
to allow efficient run-time distribution of IP maps, i18n updates, etc,
and allow possibility of sending patch rather than whole file.
- Consider NTP-like (automatic or manual) hierarchy,
so data (etc) can be retrieved/exchanged in tree structure.
- Add facility on-line for anyone to suggest i18n text,
missing or extant (maybe log as an event).
- Add equals() and hashCode() for sort and filter classes,
including anonymous ones.
- Consider creating alternative caching pipeline element
that assumes that it has enough space for all exhibits
and keeps a ZIP-file representation of the exhibits and
- Investigate Open GIS Consortium
as source of APIs, data format standards, data for the maps,
and partners to link with to improve user experience.
- Allow cache to extend start of fetch for a request so as to be able
to cache the results of a request.
- Create automatic to-do pages to help with maintenance for such items as:
- Places-and-sights items without any (or very broad) location.
- Duplicate exhibits by hash.
- Most frequently-used i18n items (and locale) missing a translation.
- Most popular subjects/keywords/etc to indicate possible new exhibits.
- Provide read/write tests for all media types.
- Periodically (asynchronously) pull in "current" event set of particular period
when it has been requested from upstream of simple cache.
This keeps it somewhat up to date at no huge cost,
especially avoiding the requester being blocked unnecessarily.
- Exclude redirection to mirrors showing a different AEP hashcode to us,
eg to cope with transition while new exhibits are being propagated.
Do this via new hashcode global (Number) variable.
- Consider vote-limiting by IP globally limiting fraction over one or more days.
- Add (possibly to GenProps) run-time map of IP address to country code,
at least for the countries we know anything about or where we have mirrors.
This should also have a proximity map (eg closest to RIPE/uk) by country code,
and a default "best" region/country to serve from (eg ARIN/us),
and a set of weightings to replace the static ones in the GeoProximity enum.
This should allow much more accurate targeting of clients to servers
than with the static information.
- Create location/load-sensitive Java-based DNS service for gallery entries.
- Record stats on user visit duration and pages viewed.
- Record stats on bandwidth and downloads by URI.
- Do unit tests of system behaviour under heavy load.
- Do automated (HttpUnit?) tests for upload and other parts of the site.
- Use JTidy to vet/sanitise any external or comment text to be embedded
in Gallery pages (probably only if the text contains at least < or &);
this would include section text for example.
- Make FullSortedExhibitSet usage more efficient in not having to reset
(sort) expression every time.
- Attempt to have cache read ahead (a la UNIX bread() algorithm) so that
client should not be blocked waiting for transfer of data into cache
other than possibly on first read.
- Make Observer in AbstractFilterBean (a) embedded (b) hold only
a SoftReference to the AFB so that the DataSourceBean reference
to the Observer does not stop it being GCed (c) possibly unregister
the Observer upon GC of the AFB to clean up state as quickly as
possible (d) maybe allow a cache to be memory sensitive and an Observer.
- Disable cacheing or force revalidation of pages if the user has a session,
since that session might be being used for a changed language (i18n)
or sticky tree pages, for example, which cacheing might mess up.
- Extend the normal filter base to (a) optionally be memory sensitive
if very quick to construct and (b) throw away some or all of their
content or make it memory sensitive if unused for some minimum time
or a large multiple of the time to compute its value (say 1/1000th)
to automatically reclaim resources slowly even if the exhibit set
does not change.
- Deal with all the "TODO"s and "FIXME"s in the code.
- Put extra choose/upload box near top of page when loading a series
to avoid having to scroll to the bottom of the page each time.
- Show links to items already uploaded;
allow some designated users to see all pending uploads.
- Allow direct upload from another URL, eg where contributor
already has photos on their own site.
- In JWS uploader:
- Allow stand-alone use as app uploading to master.
- Compute/show upload Bps, estimated time-to-complete all queued upload,
and improve progress-meter behaviour.
- Show progress while adding (many) files for upload.
- Fix resetting of number-in-series to 1 when perhaps not appropriate.
- Exhibit upload:
- Complete upload passwords in shared db with user able to reset passwd.
- Collect location information.
- Save exhibit properties and accession files.
- Add listing of similar extant exhibit names a la search engine.
- When checking for name clashes in upload,
if possible warn of names that differ only in number, author, suffix
or main words.
- Prioritising/categorising all the tasks on this page!
Including breaking up into sub-groups such as upload- or search- related.
- Make "exhibit range" leader at top of tree pages XXX ... YYY possibly
more like in home page with the text linked or normalised in some way.
- As a further potential optimisation of tree display,
where all the leaves below a node would fit onto a single page for display,
fold them into that node directly
(though allow the sub-nodes to exist for consistency).
- Improve layout (etc) of Tree JSP output, such as labelling the buttons better,
putting the number of entries in brackets after a directory a la Yahoo, etc.
- Do lazy tree-building to allow much faster initial access to large trees
(eg at 20030616, "all" exhibits of circa 11,000 entries takes 40s on rosehip to rebuild)
probably by building a customised object rather than a raw Map.
This can be transparent to the Tree display JSP.
Possibly have the AllExhibitsProperties generate a canonical set of lower-cased names
and/or prefixes so that we don't need lots of copies of the same values in memory.
- Re-enable Amazon advertising.
- Modify ExhibitDataFileSource to try to reload any new static data cache
(prepared offline) immediately rather than imposing a minimum time
before retrying. Possibly apply in conjunction with new minimum
delay which is multiple (10x) time taken to actually load data
to minimise stress on CPU and disc.
- Allow uploader (or admin user, maybe filtered by IP address)
to view (some or all) pending items in some way,
including details such as size and timestamp,
to make it easy to see what is pending and if something has already
- Show uploader quick selection of "similar" exhibits based on stuff
in their saved info.
- Add code to expire/zap redundant thumbnails in server-side cache.
- JUNIT COMMAND-LINE AND/OR IN-SERVLET TESTING,
including tests for cache-clearing Observer/Observable in DataSourceBean.
- Add capability in search engine to restrict search by date, size, dimensions,
"has comment/description", has location, etc.
Probably just wish-list pie-in-the-sky material: may never happen.
The first items in this list are probably the most likely or important.
This should be(come) roughly categorised by area, eg upload or tree-display.
User Experience, Interactivity, New Features (L)
- Add table of URLs for more info on each media type where possible,
- Possibly change alt text of thumbnail to first word (or two) of exhibit
to help partially-sighted visitors for example.
- Add a recent/popular searches item to make good use of any search cache
providing it does not hurt privacy.
- Add instant messaging (IM) or chat to Gallery,
possibly a "room" per catalogue page so you can go meet at the Gallery.
Consider http://myjxta2.jxta.org/ as J2EE compat.
- Have /notFound.jsp respond with a MIME type that the client can handle,
specifically not with a relatively-expensive HTML page to
a .txt page request or a client that accepts only WML for example.
- Redirect all queries via a lookup cache keyed to the data hash;
this can be keyed on an equality/hash test on SearchFilterBean for example.
- Try harder to provide some sort of progress report during exhibit upload.
- Have upload2.jsp verify that file has
correct magic number, etc, maybe even display it back to
- Add upload bandwidth measurement on system, complete with running total
(and initial estimate based on outgoing bandwidth limit, if any).
Misc, Admin and Maintenance (L)
- Possibly add optional tie-breaker for identical (or near-identical)
scoring items in JIndexer with Comparator and/or have sort-by
"Goodness" or relevance or date switch on search.
- Avoid enabling page cacheing of "standard" (eg catalogue) pages when the
site is busy as it stops language switching from working properly...
- Write a standard logger interface to write to System.err or
the servlet-container's log as appropriate.
- Write tool to find out which properties in the common bundle have
not had translations supplied so that I can ask for them;
log most common property/language missing entry combinations.
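The missing-translations tool in the last item above could start from something as simple as a set difference between the keys of the common bundle and those of one translated bundle. The sketch below assumes that both bundles can be loaded as java.util.Properties; the class name MissingTranslationsSketch and the example keys are hypothetical.

```java
import java.util.Properties;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper: lists keys in the base bundle that a translation lacks. */
final class MissingTranslationsSketch {
    /**
     * Returns, in alphabetical order, every key present in base
     * but absent from translated.
     */
    static SortedSet<String> missingKeys(Properties base, Properties translated) {
        SortedSet<String> missing = new TreeSet<>(base.stringPropertyNames());
        missing.removeAll(translated.stringPropertyNames());
        return missing;
    }
}
```

A usage example, with made-up keys:

```java
Properties common = new Properties();
common.setProperty("search.button", "Search");
common.setProperty("upload.title", "Upload an exhibit");

Properties french = new Properties();
french.setProperty("search.button", "Chercher");

// Contains only "upload.title": the key still awaiting a French translation.
SortedSet<String> todo = MissingTranslationsSketch.missingKeys(common, french);
```

Logging the most common missing property/language combinations, as the item suggests, would need run-time counts as well; this sketch covers only the static comparison.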
Unsorted list of things to do
No special order.
- Get Last-modified working on catalogue pages to reduce load.
- Write HttpUnit tests for pages.
- Upgrade to Tomcat 4.1.19 or greater.
- HttpUnit tests of cacheing.
Also make use of and set E-tags in the headers.
- Add check in upload that shows when a new first "main" word is about
to be used as that is especially important.
- Add query caching for a small number of identical successive queries;
maybe always hold the last 100 (with very long queries truncated or forbidden)
with any beyond that held in a memory-sensitive cache.
- Attempt to speed and focus search by having search default to *all* words
(and make sure that JIndexer lookup is optimised for this case),
with a user-selectable alternative of *most* words (the current)
which might be the automatic fall-back if no results are found with the default.
This could be very fast if jindexer is optimised for the "all" case.
Currently jindexer only ensures that the entries that match all words come first
as opposed to excluding all other items; maybe this should change.
- Add _by/Estd/ for Aloha Earth replacement and scrap old Aloha Earth.
- Map in ZAPIX replacement front page.
- Index items by attribute word and by discovered attributes at run-time
(eg all .mpg get attribute "video" and all .jpg could get "still").
Include "new" automatic attribute for most recently-added exhibits.
- Add landscape/portrait image selector/filter.
- Eliminate most unused attributes and rename odd items to further prune list.
- Possibly tune JIndexer to stop adding in results of words in the search
list when the results could not possibly make a difference to the final
result (usually very common words), since this may mean that any optional
(and maybe slow) filter we supply can be called far less often.
- Have user able to register browser (with system generated unique ID)
to remember search prefs, remember "favourite" exhibits,
show new related images each time in
"Similar exhibits" section, etc. This unique ID and browser
registration can be shared with other browsers at user request.
We might do some of this in a session if extant.
- Have a link on each word in an exhibit name to pick out more with that
same feature, at least as a primary criterion. Eg pick on a normal
word to find more with that word, or an attribute to find more with
that attribute, or the author to find more by that author, etc...
- See comments for search/catalogue improvement in
- To avoid ambiguity, prevent last component of name (main or attribute)
being an integer.
- Suggestion for new category of buildings:
Date: Tue, 19 Mar 2002 17:14:38 -0000
From: "Louise Hartup"
May I suggest that you add a category of buildings. I also get involved
with marketing (for my sins) and we're always looking for good business
images. You could also do people at work and that sort of thing though I
doubt you'd be interested based on the nature of the pictures on your site.
...and in a further email...
With regard to the buildings I was thinking of the sort of arty pictures you
now see in most marketing material which photograph buildings in an unusual
way (am I making myself clear???! - not really methinks!). I need images
that capture different industries for example, telecoms, mining, farming,
oil production, financial services, education, police, drug rehabilitation,
prisons, community care, chemical etc. etc. The focus again is perhaps an
unusual picture that captures something of the essence of the
industry/people who work in said industry, preferably using strong colours.
You've no idea how hard it is to find a decent picture of a mobile phone (I
should imagine you're thinking I'm sadder and sadder!). For my sins, I once
worked for PwC and they commissioned a library of images for use in their
material which were absolutely amazing, difficult for me to describe, but
one was of the inside of a beautiful library, there were sand dunes and
unusual architecture - I think they called them world images. Anyway I now
work for a tiny company and am trying to help them pull some marketing stuff
together (this is NOT my forte!). I had a delve around ages ago and found
nothing that would help. However yesterday I found your site and your
pictures have definitely livened up my essays.
- Add filters with weightings/sliders from
"very important" through to "very important not so".
- Add accession year and EXIF data to vote correlations.
- Self-sizing aka cache.
- Log sticky visitor ccTLD/region/proximity.
- Allow selective Urchin tracking by user region/country...
- Add search filter for "best" exhibits.
- Cache/load serialised DSB by-word index.
- Add indexing support for Unicode chars > 0xff.
- Flashing hotlink img.
- Formalise links to external mechanisms to sell high-res images,
such as with shareit.com and AHD's magnet photo, eg have
property in new properties file with external link and another
with a simple largely-language-free summary of the features of
the purchasable version. (We can then have i18n as we will for the
- Find out why sometimes a small thumbnail is generated when a large
one is not, and possibly forbid the situation.
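The search item earlier in this list proposes defaulting to an *all* words match and falling back automatically to *most* words when nothing is found. The following sketch shows that control flow over a toy in-memory word index. It is not how JIndexer works internally: AutoAndSearchSketch, its methods and the exhibit names used with it are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical toy index: all-words search with a most-words fall-back. */
final class AutoAndSearchSketch {
    // Maps each lower-cased word to the (sorted) names of exhibits indexed under it.
    private final Map<String, Set<String>> index = new HashMap<>();

    void add(String exhibit, String... words) {
        for (String w : words) {
            index.computeIfAbsent(w.toLowerCase(Locale.ROOT), k -> new TreeSet<>()).add(exhibit);
        }
    }

    private static Set<String> normalise(List<String> query) {
        Set<String> words = new LinkedHashSet<>();
        for (String w : query) { words.add(w.toLowerCase(Locale.ROOT)); }
        return words;
    }

    /** Strict-AND: only exhibits indexed under every query word, in name order. */
    List<String> allWords(List<String> query) {
        Set<String> result = null;
        for (String w : normalise(query)) {
            Set<String> hits = index.getOrDefault(w, Collections.<String>emptySet());
            if (result == null) { result = new TreeSet<>(hits); } else { result.retainAll(hits); }
            if (result.isEmpty()) { break; } // no further word can add results
        }
        return (result == null) ? new ArrayList<String>() : new ArrayList<>(result);
    }

    /** Most-words: exhibits matching any word; those matching more words come first. */
    List<String> mostWords(List<String> query) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String w : normalise(query)) {
            for (String exhibit : index.getOrDefault(w, Collections.<String>emptySet())) {
                counts.merge(exhibit, 1, Integer::sum);
            }
        }
        List<String> names = new ArrayList<>(counts.keySet()); // already in name order
        // Stable sort by descending match count, so ties keep name order.
        names.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        return names;
    }

    /** Auto-AND: try all-words first, fall back to most-words if that finds nothing. */
    List<String> search(List<String> query) {
        List<String> strict = allWords(query);
        return strict.isEmpty() ? mostWords(query) : strict;
    }
}
```

The early exit in allWords() is what makes the strict mode cheap: as soon as the running intersection is empty, the remaining words need not be looked up at all.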