<spanclass="mw-cite-backlink"><ahref="#cite_ref-data_storage_format_0-0">^</a></span><spanclass="reference-text">Choose one of the following: (default is <code>.gz</code>)</span>
<spanclass="reference-text"><b>text</b>: fastest for reading but has no compression. Simple Wikipedia will be 300 MB</span>
</li>
<li>
<spanclass="reference-text"><b>gzip</b>: (default) fast for reading and has compression. Simple Wikipedia will be 100 MB</span>
</li>
<li>
<spanclass="reference-text"><b>bzip2</b>: very slow for reading but has best compression. Simple Wikipedia will be 85 MB (Note: The performance is very noticeable. Please try this with Simple Wikipedia first before using on a large wiki.)</span>
<spanclass="mw-cite-backlink"><ahref="#cite_ref-dump_server_urls_1-0">^</a></span><spanclass="reference-text">Enter a list of server urls separated by a comma and newline.</span>
<spanclass="reference-text">Note that servers are prioritized from left-to-right. In the default example, <b>your.org</b> will be tried first. If it is offline, then the next server -- <b>dumps.wikimedia.org</b> -- will be tried, etc.</span>
<spanclass="reference-text">See <ahref="http://xowa.org/home/wiki/App/Import/Download/Dump_servers.html"id="xolnki_2"title="App/Import/Download/Dump servers"class="xowa-visited">App/Import/Download/Dump_servers</a> for more info</span>
<spanclass="mw-cite-backlink"><ahref="#cite_ref-import_bz2_by_stdout_2-0">^</a></span><spanclass="reference-text"><b>NOTE 1: this option only applies if the "Custom wiki commands" option is <code>wiki.download,wiki.import</code> (wiki.unzip must be removed)</b><br>
Select the method for importing a wiki dump bz2 file. (default is <code>checked</code>)</span>
<ul>
<li>
<spanclass="reference-text"><b>checked</b> : import through a native process's stdout. This will be faster, but may not work on all Operating Systems. A 95 MB file takes 85 seconds</span>
</li>
<li>
<spanclass="reference-text"><b>unchecked</b>: import though Apache Common's Java bz2 compression library. This will be slower, but will work on all Operating Systems. A 95 MB file takes 215 seconds.</span>
</li>
</ul><spanclass="reference-text"><b>NOTE 2: lbzip2 (Many thanks to Anselm for making this suggestion, as well as compiling the data to support it. See <ahref="http://sourceforge.net/p/xowa/tickets/263/?limit=10&page=6#f2fb/dcb6"rel="nofollow"class="external free">http://sourceforge.net/p/xowa/tickets/263/?limit=10&page=6#f2fb/dcb6</a>)</b> Linux users should consider using lbzip2, as lbzip2 has significant performance differences (30% in many cases).</span>
<spanclass="mw-cite-backlink"><ahref="#cite_ref-3">^</a></span><spanclass="reference-text">Process used to decompress bz2 by stdout. Recommended: Operating System default</span>
<spanclass="reference-text">For fast imports, but high disk space usage, use <code>wiki.download,wiki.unzip,wiki.import</code></span>
</li>
<li>
<spanclass="reference-text">For slow imports, but low disk space usage, use <code>wiki.download,wiki.import</code></span>
</li>
</ul><spanclass="reference-text"><b>Long version:</b> Enter a list of commands separated by a comma. Valid commands are listed below. Note that simple.wikipedia.org is used for all examples, but the commands apply to any wiki.</span>
<ul>
<li>
<spanclass="reference-text"><code>wiki.download</code>: downloads the wiki data dump from the dump server</span>
</li>
</ul>
<dl>
<dd>
<spanclass="reference-text">A file will be generated in "/xowa/wiki/simple.wikipedia.org/simplewiki-latest-pages-articles.xml.bz2"</span>
</dd>
</dl>
<ul>
<li>
<spanclass="reference-text"><code>wiki.unzip</code>: unzips an xml file from the wiki data dump</span>
</li>
</ul>
<dl>
<dd>
<spanclass="reference-text">A file will be created for "/xowa/wiki/simple.wikipedia.org/simplewiki-latest-pages-articles.xml" (assuming the corresponding .xml.bz2 exists)</span>
</dd>
<dd>
<spanclass="reference-text">If this step is omitted, then XOWA will read directly from the .bz2 file. Although this will use less space (no .xml file to unzip), it will be significantly slower. <b>Also, due to a program limitation, the progress percentage will not be accurate. It may hover at 99.99% for several minutes</b></span>
</dd>
</dl>
<ul>
<li>
<spanclass="reference-text"><code>wiki.import</code>: imports the xml file</span>
</li>
</ul>
<dl>
<dd>
<spanclass="reference-text">A wiki will be imported from "/xowa/wiki/simple.wikipedia.org/simplewiki-latest-pages-articles.xml"</span>
</dd>
</dl><spanclass="reference-text">The following lists possible combinations:</span>
<spanclass="reference-text">This is the default. Note that this will be the fastest to set up, but will take more space. For example, English Wikipedia will set up in 5 hours and require at least 45 GB of temp space</span>
<spanclass="reference-text">This will read directly from the bz2 file. Note that this will use the least disk space, but will take more time. For example, English Wikipedia will set up in 8 hours but will only use 5 GB of temp space</span>
<spanclass="mw-cite-backlink"><ahref="#cite_ref-download_xowa_common_css_5-0">^</a></span><spanclass="reference-text">Affects the xowa_common.css in /xowa/user/anonymous/wiki/wiki_name/html/. Occurs when importing a wiki. (default is <code>checked</code>)</span>
<spanclass="reference-text"><b>checked</b> : downloads xowa_common.css from the Wikimedia servers. Note that this stylesheet will be the latest copy but it may cause unexpected formatting in XOWA.</span>
</li>
<li>
<spanclass="reference-text"><b>unchecked</b>: (default) copies xowa_common.css from /xowa/bin/any/html/html/import/. Note that this stylesheet is the one XOWA is coded against. It is the most stable, but will not have the latest logo</span>
<spanclass="mw-cite-backlink"><ahref="#cite_ref-delete_xml_file_6-0">^</a></span><spanclass="reference-text">(Only relevant for wiki.unzip) Choose one of the following: (default is <code>checked</code>)</span>
<spanclass="mw-cite-backlink"><ahref="#cite_ref-page_rank-iteration_max_7-0">^</a></span><spanclass="reference-text">Specify one of the following: (default is <code>0</code>)</span>
<spanclass="reference-text"><b>(number greater than 1)</b>: page rank will be calculated until it is finished or maximum number of interations are reached. For more info, see <ahref="http://xowa.org/home/wiki/Help/Features/Search/Build.html"id="xolnki_3"title="Help/Features/Search/Build">Help/Features/Search/Build</a></span>
<spanclass="mw-cite-backlink">^ <sup><ahref="#cite_ref-layout_text_max_8-0">a</a></sup><sup><ahref="#cite_ref-layout_text_max_8-1">b</a></sup><sup><ahref="#cite_ref-layout_text_max_8-2">c</a></sup></span><spanclass="reference-text">Enter a number in MB to represent the cutoff for generating sets of page databases as one file or many files (default is <code>1500</code>)<br>
<spanclass="reference-text"><b>text</b>: These are Wikitext databases and have entries like ''italics''. They have <code>-text-</code> in their file name.</span>
</li>
<li>
<spanclass="reference-text"><b>html</b>: These the html-dump databases and have entries like <i>italics</i>. They have <code>-html-</code> in their file name</span>
</li>
<li>
<spanclass="reference-text"><b>file</b>: These are image databases which have the raw binary images. They have <code>-file-</code> in their file name</span>
<spanclass="reference-text">For small wikis, XOWA generates one database for the entire wiki. For example, Simple Wikipedia will just have "simple.wikipedia.org-text.xowa". This way is preferred as it is simpler.</span>
<spanclass="reference-text">For large wikis, XOWA generates many databases for the entire wiki. For example, English Wikipedia will have "en.wikipedia.org-text-ns.000.xowa", "en.wikipedia.org-text-ns.000-db.002.xowa", etc. This way is necessary, because some file-systems don't support large databases. For example, creating an "en.wikipedia.org-text.xowa" file will generate a 20 GB file. This 20 GB file will generally fail on flash drives (FAT32), as well as Android (SQLite library allows 2 GB max)</span>
</li>
</ul><spanclass="reference-text"><br>
These options can force XOWA to generate a wiki using either one database (Simple Wikipedia style) or many databases (English Wikipedia style). It does this by using a cutoff for the XML database dump<br>
For example, 1500 means that a wiki with a dump file size of 1.5 GB or less will generate a single file. Any wiki with a dump file size larger than 1.5 GB will generate multiple files.</span>
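<dl>
<dd>
<span class="reference-text">As a minimal sketch of that rule (illustrative Python, not XOWA's actual code), the decision amounts to a single comparison; the list below gives the practical settings:</span>
<pre>
def single_file(dump_size_mb, cutoff_mb=1500):
    # 0 forces "many files"; a huge value such as 999999 forces "one file"
    return dump_size_mb &lt;= cutoff_mb

single_file(300)    # small wiki  -> True  (one .xowa database)
single_file(20000)  # large wiki  -> False (split -ns/-db .xowa databases)
</pre>
</dd>
</dl>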
<ul>
<li>
<spanclass="reference-text">If you always want to generate a set with only one file, set the value to a large number like 999,999 (999 GB)</span>
</li>
<li>
<spanclass="reference-text">If you always want to generate a set with multiple files, set the value to 0.</span>
</li>
<li>
<spanclass="reference-text">Otherwise, set the value to a cutoff. Wikis below that cutoff will be "single file"; wikis above it will be "multiple files"</span>
<spanclass="mw-cite-backlink"><ahref="#cite_ref-12">^</a></span><spanclass="reference-text">Decompress zip file(needed for importing dumps) . Recommended: <ahref="http://7-zip.org/"rel="nofollow"class="external text">7-zip</a></span>
<li><ahref="http://dumps.wikimedia.org/backup-index.html"title="Get wiki datababase dumps directly from Wikimedia">Wikimedia dumps</a></li>
<li><ahref="https://archive.org/search.php?query=xowa"title="Search archive.org for XOWA files">XOWA @ archive.org</a></li>
<li><ahref="http://en.wikipedia.org"title="Visit Wikipedia (and compare to XOWA!)">English Wikipedia</a></li>
</ul>
</div>
</div>
<divclass="portal"id='xowa-portal-donate'>
<h3>Donate</h3>
<divclass="body">
<ul>
<li><ahref="https://archive.org/donate/index.php"title="Support archive.org!">archive.org</a></li><!-- listed first due to recent fire damages: http://blog.archive.org/2013/11/06/scanning-center-fire-please-help-rebuild/ -->