mirror of
https://github.com/crioux/turbo-linecount.git
synced 2024-10-27 17:24:01 +00:00
Compare commits
5 Commits
Author | SHA1 | Date | |
---|---|---|---|
|
9bf2210d40 | ||
|
4ca4808a26 | ||
|
edea23afe6 | ||
|
7a2a4a601c | ||
|
82062d4e63 |
34
README.md
34
README.md
@ -5,14 +5,14 @@ turbo-linecount 1.0 Copyright 2015, Christien Rioux
|
||||
|
||||
*turbo-linecount* is a tool that simply counts the number of lines in a file, as fast as possible. It reads the file in large chunks into several threads and quickly scans the file for line endings.
|
||||
|
||||
Many times, you have to count the number of lines in text file on disk. The typical solution is to use 'wc -l' on the command line. 'wc' uses buffered streams to process the file, which has its advantages, but it is slower than direct memory mapped file access. You can't 'pipe' to
|
||||
Many times, you have to count the number of lines in text file on disk. The typical solution is to use `wc -l` on the command line. `wc -l` uses buffered streams to process the file, which has its advantages, but it is slower than direct memory mapped file access. You can't 'pipe' to *turbo-linecount* however. This may change in a future release.
|
||||
|
||||
How much faster is *turbo-linecount*? About 8 times faster than `wc` and 5 times faster than the naive Python implementation.
|
||||
How much faster is *turbo-linecount*? About 8 times faster than `wc -l` and 5 times faster than the naive Python implementation.
|
||||
|
||||
To use *turbo-linecount*, just run the command line:
|
||||
|
||||
```
|
||||
lc <file>
|
||||
tlc <file>
|
||||
```
|
||||
|
||||
where *\<file\>* is the path to the file of which you'd like to count the lines.
|
||||
@ -21,8 +21,8 @@ where *\<file\>* is the path to the file of which you'd like to count the lines.
|
||||
To get help with *turbo-linecount*:
|
||||
|
||||
```
|
||||
lc -h
|
||||
usage: lc [options] <file>
|
||||
tlc -h
|
||||
usage: tlc [options] <file>
|
||||
-h --help print this usage and exit
|
||||
-b --buffersize <BUFFERSIZE> size of buffer per-thread to use when reading (default is 1MB)
|
||||
-t --threadcount <THREADCOUNT> number of threads to use (defaults to number of cpu cores)
|
||||
@ -54,16 +54,16 @@ Cygwin
|
||||
|
||||
### Testing
|
||||
|
||||
Testing cmake against `wc` and `python` can be done with the test scripts. To generate some random test files, run `create_testfiles.sh`, and four test files, one 10MB, one 100MB, one 1GB, and one 10GB file will be created. Feel free to delete these when you're done testing to save space.
|
||||
Testing cmake against `wc -l` and `python` can be done with the test scripts. To generate some random test files, run `create_testfiles.sh`, and four test files, one 10MB, one 100MB, one 1GB, and one 10GB file will be created. Feel free to delete these when you're done testing to save space.
|
||||
|
||||
To run the test, run `compare_testfiles.sh`. This will generate output as such:
|
||||
|
||||
```
|
||||
Timing for tlc
|
||||
lc: test_10MB.txt 0.006s
|
||||
lc: test_100MB.txt 0.015s
|
||||
lc: test_1GB.txt 0.127s
|
||||
lc: test_10GB.txt 1.196s
|
||||
tlc: test_10MB.txt 0.006s
|
||||
tlc: test_100MB.txt 0.015s
|
||||
tlc: test_1GB.txt 0.127s
|
||||
tlc: test_10GB.txt 1.196s
|
||||
Timing for python
|
||||
python: test_10MB.txt 0.025s
|
||||
python: test_100MB.txt 0.084s
|
||||
@ -85,9 +85,11 @@ Performance on Windows and Mac OS X is excellent for all file sizes. Performance
|
||||
* 1TB SSD hard drive
|
||||
* 16GB Memory
|
||||
|
||||
| File Size | `tlc` | `python` | `wc -l` |
|
||||
|-----------|---|---|---|---|---|
|
||||
| 10MB | 0.006s | 0.025s (4.2x) | 0.012s (2.0x) |
|
||||
| 100MB | 0.015s | 0.084s (5.6x) | 0.100s (6.7x) |
|
||||
| 1GB | 0.127s | 0.661s (5.2x) | 0.933s (7.3x) |
|
||||
| 10GB | 1.196s | 6.165s (5.15x) | 9.857s (8.2x) |
|
||||
```
|
||||
| File Size | `tlc` | `python` | `wc -l` |
|
||||
|-----------|--------|----------------|----------------|
|
||||
| 10MB | 0.006s | 0.025s (4.2x) | 0.012s (2.0x) |
|
||||
| 100MB | 0.015s | 0.084s (5.6x) | 0.100s (6.7x) |
|
||||
| 1GB | 0.127s | 0.661s (5.2x) | 0.933s (7.3x) |
|
||||
| 10GB | 1.196s | 6.165s (5.15x) | 9.857s (8.2x) |
|
||||
```
|
||||
|
Loading…
Reference in New Issue
Block a user