# Data Compression
* :scroll: [Data Compression](data-compression.pdf)
> This paper surveys a variety of data compression methods spanning almost 40 years of research, from the work of Shannon, Fano and Huffman in the 1940s to a technique developed in 1986.
## Scientific Data Compression
* :scroll: [Fast Error-bounded Lossy HPC Data Compression with SZ](fast_error_bounded_Lossy_hpc_data_compression_with_sz.pdf)
> This paper introduces the first version of SZ, which achieves data reduction through regression-based prediction of data points.
* :scroll: [Significantly Improving Lossy Compression for Scientific Data Sets Based on Multidimensional Prediction and Error-Controlled Quantization](Significantly_Improving_Lossy_Compression_for_Scientific_Data_Sets_Based_on_Multidimensional_Prediction_and_Error-Controlled_Quantization.pdf)
> This work is known as SZ-1.4. Here SZ employs multidimensional prediction, so data with more than one dimension is no longer linearized into a single dimension before compression. This preserves more data locality and thus improves the compression ratio.
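
The two ideas named in the title of the entry above, multidimensional prediction and error-controlled quantization, can be illustrated with a minimal sketch. This is not the SZ implementation: the 2D Lorenzo-style predictor is an assumed stand-in for the paper's predictor, the function names are hypothetical, and the entropy-coding stage that a real compressor applies to the integer codes is omitted.

```python
import math


def _predict(recon, i, j):
    """2D Lorenzo-style prediction from already reconstructed neighbours.

    Neighbours outside the grid are treated as 0.0 (an assumption of this sketch).
    """
    left = recon[i][j - 1] if j > 0 else 0.0
    up = recon[i - 1][j] if i > 0 else 0.0
    diag = recon[i - 1][j - 1] if i > 0 and j > 0 else 0.0
    return left + up - diag


def lorenzo_quantize_2d(grid, error_bound):
    """Quantize prediction residuals so each value is off by at most error_bound."""
    rows, cols = len(grid), len(grid[0])
    step = 2.0 * error_bound  # bin width; rounding leaves at most half a bin of error
    codes = [[0] * cols for _ in range(rows)]
    recon = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Predict from reconstructed values, which is all the decoder will have.
            pred = _predict(recon, i, j)
            codes[i][j] = round((grid[i][j] - pred) / step)
            recon[i][j] = pred + codes[i][j] * step
    return codes, recon


def lorenzo_dequantize_2d(codes, error_bound):
    """Rebuild the grid from the integer codes alone."""
    rows, cols = len(codes), len(codes[0])
    step = 2.0 * error_bound
    recon = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            recon[i][j] = _predict(recon, i, j) + codes[i][j] * step
    return recon


# Usage: a smooth 2D field stays within the requested bound after a round trip.
grid = [[math.sin(0.1 * i) + math.cos(0.1 * j) for j in range(8)] for i in range(6)]
codes, recon = lorenzo_quantize_2d(grid, error_bound=1e-3)
worst = max(abs(a - b) for ra, rb in zip(grid, recon) for a, b in zip(ra, rb))
print(worst <= 1e-3 + 1e-12)
```

Because the prediction uses neighbours in both directions, smooth 2D data tends to produce small integer codes, which is what makes the later lossless stage effective.
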
* :scroll: [Error-Controlled Lossy Compression Optimized for High Compression Ratios of Scientific Datasets](Error-Controlled_Lossy_Compression_Optimized_for_High_Compression_Ratios_of_Scientific_Datasets.pdf)
> This work is known as SZ-2.0. The authors propose an online selection tool for two predictors: the mean-integrated Lorenzo predictor and a linear-regression-based predictor. Users can choose the predictor that yields the larger compression ratio and the higher prediction accuracy.
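
Choosing between two predictors, as the entry above describes, can be sketched roughly as follows. This is a simplified 1D illustration under stated assumptions, not the paper's method: the plain previous-value predictor stands in for the mean-integrated Lorenzo predictor, errors are measured on the original values of a whole block, and every function name is hypothetical.

```python
def lorenzo_error(block):
    """Total absolute error when each value is predicted by its predecessor."""
    return sum(abs(block[k] - block[k - 1]) for k in range(1, len(block)))


def regression_error(block):
    """Total absolute error of a least-squares line fitted to the block."""
    n = len(block)
    mean_x = (n - 1) / 2
    mean_y = sum(block) / n
    sxx = sum((k - mean_x) ** 2 for k in range(n))
    sxy = sum((k - mean_x) * (y - mean_y) for k, y in enumerate(block))
    slope = sxy / sxx if sxx else 0.0  # single-point blocks get a flat line
    intercept = mean_y - slope * mean_x
    return sum(abs(y - (intercept + slope * k)) for k, y in enumerate(block))


def choose_predictor(block):
    """Pick the predictor with the smaller total error; ties go to Lorenzo."""
    if lorenzo_error(block) <= regression_error(block):
        return "lorenzo"
    return "regression"


# Usage: a steep ramp suits the fitted line, a step edge suits previous-value prediction.
print(choose_predictor([0.0, 10.0, 20.0, 30.0, 40.0]))
print(choose_predictor([5.0, 5.0, 5.0, 5.0, 9.0, 9.0, 9.0, 9.0]))
```

The design point this illustrates is that more accurate prediction leaves smaller residuals to encode, so picking the better predictor per block is a proxy for picking the better compression ratio.
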
* :scroll: [Fixed-Rate Compressed Floating-Point Arrays](fixed-rate_compressed_floating_point_arrays.pdf)
* :scroll: [FPC: A High-Speed Compressor for Double-Precision Floating-Point Data](fpc_a_high_speed_compressor_for_double_precision_floating_point_data.pdf)