Fares Salem 4 years ago
commit 21a3ee7204

@@ -23,19 +23,15 @@ Here are our official chapters. Let us know if you are interested in [starting o
* [Bhubaneswar](https://www.facebook.com/groups/pwlbbsr/)
* [Boston](http://www.meetup.com/Papers-We-Love-Boston-Cambridge/)
* [Brasilia](http://www.meetup.com/papers-we-love-bsb)
* [Boulder](http://www.meetup.com/Papers-We-Love-Boulder/)
* [Bucharest](http://www.meetup.com/papers-we-love-bucharest/)
* [Buenos Aires](https://paperswelove.org/buenos-aires/)
* [Cairo](http://www.meetup.com/Papers-We-Love-Cairo/)
* [Chattanooga](http://www.meetup.com/Papers-We-Love-Chattanooga/)
* [Chicago](http://www.meetup.com/papers-we-love-chicago/)
* [Columbus, Ohio](http://www.meetup.com/Papers-We-Love-Columbus/)
* [Dallas](http://www.papersdallas.com/)
* [Gothenburg](https://www.meetup.com/Papers-We-Love-Gothenburg/)
* [Guadalajara](https://www.facebook.com/pwlgdl/)
* [Hamburg](http://www.meetup.com/Papers-We-Love-Hamburg/)
* [Hyderabad](http://www.meetup.com/papers-we-love-hyderabad/)
* [Iasi](http://www.meetup.com/Papers-We-Love-Iasi/)
* [Iowa City](https://www.meetup.com/techcorridorio)
* [Kathmandu](https://www.facebook.com/groups/PapersWeLoveKathmandu/)
* [Kyiv](https://www.facebook.com/groups/PapersWeLoveKyiv)
@@ -43,18 +39,11 @@ Here are our official chapters. Let us know if you are interested in [starting o
* [London](http://www.meetup.com/papers-we-love-london)
* [Los Angeles](http://www.meetup.com/papers-we-love-la)
* [Madrid](http://www.meetup.com/Papers-We-Love-Madrid/)
* [Medellín](https://www.meetup.com/paperswelovemde/)
* [Montreal](http://www.meetup.com/Papers-We-Love-Montreal/)
* [Mumbai](https://www.meetup.com/Papers-We-Love-Mumbai/)
* [Munich](http://www.meetup.com/Papers-We-Love-Munich/)
* [New York City](http://www.meetup.com/papers-we-love/)
* [Paris](http://www.meetup.com/Papers-We-Love-Paris/)
* [Philadelphia](http://www.meetup.com/Papers-We-Love-Philadelphia/)
* [Portland](http://www.meetup.com/Papers-We-Love-PDX/)
* [Porto](https://www.meetup.com/Papers-We-Love-Porto)
* [Pune](http://www.meetup.com/Doo-Things)
* [Raleigh-Durham](https://www.meetup.com/Papers-We-Love-Raleigh-Durham/)
* [Reykjavík](http://www.meetup.com/Papers-We-Love-Reykjavik)
* [Rio de Janeiro](https://www.meetup.com/pt-BR/papers-we-love-rio-de-janeiro/)
* [San Diego](http://www.meetup.com/Papers-We-Love-San-Diego/)
* [San Francisco](http://www.meetup.com/papers-we-love-too/)
@@ -119,6 +108,18 @@ Reading a paper is not the same as reading a blogpost or a novel. Here are a few
* Love a Paper - [@loveapaper](https://twitter.com/loveapaper)
### Download papers
Open your favourite terminal and run:
```bash
$ ./scripts/download.sh
```
This will scrape markdown files for links to PDFs and download papers to their respective directories.
See [README.md](./scripts/README.md) for more options.
## Contributing Guidelines
Please take a look at our [CONTRIBUTING.md](https://github.com/papers-we-love/papers-we-love/blob/master/.github/CONTRIBUTING.md) file.

@@ -3,7 +3,7 @@
## External Papers
* [Top 10 algorithms in data mining](http://www.cs.uvm.edu/~icdm/algorithms/10Algorithms-08.pdf)
* [Top 10 algorithms in data mining](https://www.researchgate.net/publication/29467751_Top_10_algorithms_in_data_mining)
While it is difficult to identify a definitive top 10, this paper covers 10 highly influential data mining/machine learning algorithms.

@@ -0,0 +1,22 @@
# Scripts
Scripts for working with repository content.
## Download Utility
A convenience script to download papers. It scrapes the README.md files for links to PDFs and downloads them to their respective directories.
The download utility is idempotent and can be run multiple times safely.
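The URL-scraping step can be sketched in isolation. The pipeline below mirrors what the download script does internally; `sample.md` and the links in it are made-up examples:

```shell
# Create a sample markdown file with one PDF link and one non-PDF link (illustrative).
printf '%s\n' '* [Paper](https://example.com/paper.pdf)' '* [Site](https://example.com/)' > sample.md
# Extract URLs, keep only those containing .pdf, and strip the trailing markdown ')'.
grep -Eo 'https?://[^ ]+' sample.md | grep '\.pdf' | tr -d ')'
# Prints: https://example.com/paper.pdf
```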
### Usage
Open your favourite terminal and run:
```bash
$ ./scripts/download.sh
```
Optionally, to download specific topics specify their directories as arguments:
```bash
$ ./scripts/download.sh android concurrency
```
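The idempotency comes from `wget --no-clobber`, which skips any file that already exists on disk. A minimal pure-shell sketch of the same skip-if-present behavior (the `fetch` helper and file names are hypothetical, for illustration only):

```shell
# Hypothetical stand-in for the script's wget call: skip the download when
# the target file (the basename of the URL) already exists, like --no-clobber.
fetch() {
  local file
  file=$(basename "$1")
  if [ -e "${file}" ]; then
    echo "already there: ${file}"    # a second run is a no-op
  else
    echo "would download: $1"
  fi
}

touch paper.pdf
fetch https://example.com/paper.pdf   # prints: already there: paper.pdf
fetch https://example.com/other.pdf   # prints: would download: https://example.com/other.pdf
```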

@@ -0,0 +1,46 @@
#!/bin/bash

# Guard clause: check that the required binaries are installed.
which wget > /dev/null || { echo "Error: wget not installed." ; exit 1 ; }
which egrep > /dev/null || { echo "Error: egrep not installed." ; exit 1 ; }

# Recursively traverse directories in the repo, scraping markdown files for URLs
# that point at PDFs. Downloads the PDFs into their respective directories.
download_for_directory() {
  cd "$1" || { echo "Error: directory not found." ; exit 1 ; }
  for f in *; do
    if [[ -d ${f} ]]; then
      download_for_directory "${f}" &
    fi
  done

  # Scrape URLs from markdown files: extract URLs, keep only those containing
  # .pdf, and strip the trailing ')' left over from markdown link syntax.
  urls=$(cat *.md 2> /dev/null | egrep -o 'https?://[^ ]+' | grep '\.pdf' | tr -d ')')

  # ${urls} is deliberately unquoted so word splitting yields one URL per iteration.
  for url in ${urls}; do
    # Ignore empty URLs
    if [[ -n ${url} ]]; then
      wget "${url}" --no-clobber --quiet --timeout=5 --tries=2
    fi
  done

  # Wait for the subdirectory downloads spawned above before returning.
  wait
  cd ..
  echo "$1 done."
}

# If no directories are supplied, iterate over the entire repo.
if [[ "$#" -eq 0 ]]; then
  REPO_ROOT_DIR="$(dirname "$0")/.."
  download_for_directory "${REPO_ROOT_DIR}"
else
  # Iterate over the specified directories.
  for dir in "$@"; do
    download_for_directory "${dir}"
  done
fi

# Wait for any remaining child processes to terminate.
wait