Archive/Usage/Offline images
Note: This page is obsolete. It is preserved for historical reference only.
Note: This feature is currently not supported. The WMF tarballs are severely out of date (December 2012). See the following for more info:
Once the WMF provides a more up-to-date tarball, I'll resume work on this feature. If you are interested in this feature, please send inquiries to the xmldatadumps mailing list or comment on Bugzilla.
Overview
XOWA displays images by downloading them online, as specified in Dev/File/Setup. However, XOWA can also be set up to use offline sources. If you have the time and storage space, you can download all the images for your wiki and run them from a hard drive. Note that the full tarball set for English Wikipedia is over 2 TB and may take 15 days to download and unzip. Even a small wiki like Simple Wikipedia requires 56 GB. Please prepare accordingly.
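Given these sizes, it can help to check free disk space before starting a multi-day download. The following is a minimal Python sketch, not part of XOWA. The two size constants come from the figures above. The `has_room` helper and its `headroom` factor are assumptions made for illustration: a factor of 2 leaves room for both the tarball and its unpacked contents, and may be more than you need.

```python
import shutil

GB = 1000 ** 3
TB = 1000 ** 4

# Sizes quoted above; treat them as rough lower bounds.
SIMPLE_WIKIPEDIA_BYTES = 56 * GB
ENGLISH_WIKIPEDIA_BYTES = 2 * TB


def has_room(path, required_bytes, headroom=2.0):
    """Return True if the drive holding `path` has enough free space.

    `headroom` is an assumed safety factor, not an XOWA setting.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes * headroom


if __name__ == "__main__":
    # "." is a placeholder; pass the drive that will hold the images.
    print("Room for Simple Wikipedia:", has_room(".", SIMPLE_WIKIPEDIA_BYTES))
```

On Windows, pass the target drive (for example `"W:\\"`) instead of `"."`.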
Download tarball
- Navigate to the Wikimedia image dump site at http://ftpmirror.your.org/pub/wikimedia/imagedumps/tarballs/fulls/
- Select the tarball for your wiki and download it.
Unzip tarball
- If you are on a Windows system, the current version of 7-Zip does not handle UTF-8 filenames (it defaults to CP-1252). You will need to download the latest 7-Zip alpha (I used 9.27).
- You can try the latest alpha here: http://sourceforge.net/projects/sevenzip/forums/forum/45797/topic/6063877
- Unzip the tarball to the hard drive's root.
- For example, on a Windows 7 system with a hard drive mounted at W, unzip to W:\. This will create a directory called W:\wikipedia\commons\.
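As an alternative to 7-Zip, the unpacking step can be sketched with Python's standard `tarfile` module, which lets you state the filename encoding explicitly. This is a hedged illustration only: `extract_tarball` is a hypothetical helper, and it has not been tested against the WMF tarballs. On Windows, member names containing characters that the filesystem rejects may still fail to extract.

```python
import tarfile
from pathlib import Path


def extract_tarball(tar_path, dest_root):
    """Unpack `tar_path` under `dest_root` and return the number of files.

    Member names are decoded as UTF-8 so non-ASCII file titles survive.
    """
    dest = Path(dest_root)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path, mode="r:*", encoding="utf-8") as tar:
        file_count = sum(1 for member in tar.getmembers() if member.isfile())
        # Use the safer "data" extraction filter where this Python has it.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **extra)
    return file_count
```

A call such as `extract_tarball("my_wiki.tar", "W:\\")` would unpack to the drive root; `my_wiki.tar` is a placeholder name, not an actual tarball filename.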
Change config file
Two changes will have to be made to the config file:
commons source
- Search for the following:
.set('src_http_commons', 'http://upload.wikimedia.org/wikipedia/commons/' , 'commons.wikimedia.org' ).ext_rules_('img_only').owner
- Replace it with the following:
.set('src_http_commons', 'W:\wikipedia\commons\' , 'commons.wikimedia.org' ).fsys_('wnt').tarball_('y').ext_rules_('img_only').owner
wiki source
- Search for the following:
.set('src_http_~{wiki_key}', 'http://upload.wikimedia.org/~{wiki_type_name}/~{wiki_lang}/' , '~{wiki_key}').ext_rules_('img_only').owner
- Replace it with the following:
.set('src_http_~{wiki_key}', 'W:\~{wiki_type_name}\~{wiki_lang}\' , '~{wiki_key}').fsys_('wnt').tarball_('y').ext_rules_('img_only').owner
Note: .fsys_('wnt') is not needed on Linux systems. It is a "hack" that lets Windows systems handle File titles containing characters that are invalid on Windows NT filesystems (such as " or :).
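The two search-and-replace edits follow the same pattern: swap the upload URL for a local directory, then insert the extra calls before `.ext_rules_`. A minimal Python sketch of that pattern follows. `localize_source` is a hypothetical helper, not an XOWA tool, and it works on a single line of text because the config file's name and location are not given here. The `windows` flag reflects the note above that `.fsys_('wnt')` applies only to Windows.

```python
def localize_source(line, url, local_dir, windows=True):
    """Point one `.set(...)` source line at a local tarball directory.

    Returns the line unchanged if `url` or `.ext_rules_(` is not present,
    so running it twice on the same line does no harm.
    """
    if url not in line or ".ext_rules_(" not in line:
        return line
    flags = (".fsys_('wnt')" if windows else "") + ".tarball_('y')"
    line = line.replace(url, local_dir, 1)
    return line.replace(".ext_rules_(", flags + ".ext_rules_(", 1)


if __name__ == "__main__":
    original = (
        ".set('src_http_commons', "
        "'http://upload.wikimedia.org/wikipedia/commons/' , "
        "'commons.wikimedia.org' ).ext_rules_('img_only').owner"
    )
    print(localize_source(
        original,
        "http://upload.wikimedia.org/wikipedia/commons/",
        "W:\\wikipedia\\commons\\",
    ))
```

The same call covers the wiki source line if you pass the `~{wiki_type_name}` URL and its local counterpart. Back up the config file before applying any change.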
Known limitations
- The tarballs do not come with thumbnails for video or PDF files. XOWA will show them as blank.
- These thumbnails are generated with other programs. Future versions of XOWA will try to generate them.
- ImageMagick and Inkscape will improperly identify sizes for certain SVGs.
- For example, on en.wikipedia.org/wiki/Earth, Earth-Moon.svg is 512x320.
- ImageMagick incorrectly identifies it as 1000x1000.
- Inkscape incorrectly identifies it as 1991x316. It provides an X,Y of -825,-2.9, which doesn't help.
- Inkscape will improperly show artifacts in certain SVGs.
- For example, on en.wikipedia.org/wiki/France, France_location_map-Regions_and_departements.svg will have random black lines around Paris.
Offline thumbnails
Dumping all the thumbnails in a wiki is currently being implemented. See Dev/Design/Offline_files