papers-we-love_papers-we-love/machine_learning/README.md
NewAlexandria 39fd04bdce Math papers from original isomorphisms PR (#587)
* Add gitter for community.

* Update CODE_OF_CONDUCT.md

* Add statecharts paper in a new systems modeling category (#565)

* Rename "paradigm" and "plt" folders for findability (#561)

* rename "language-paradigm" folder for findability

lang para pluralize

* rename PLT => languages-theory

* fixed formatting

* group pattern-* related papers (#564)

* combine clustering algo into pattern matching

* rename stringology with the pattern_ prefix

* improved the README header info for paper related to patterns

* consolidate org-sim and sw-eng dirs (#567)

* consolidate org-sim and sw-eng dirs
* typo and links

* Fixed link (#568)

* Update README.md
* Fixed A Unified Theory of Garbage Collection link

* Verification faults dirs (#566)

* consolidate program verification and program fault detection listings.
* faults and validation gets header info

* self-similarity by Tom Leinster

Again on the topic of renormalisation. Dr Leinster has a nice, simple picture of self-similarity.

* added new papers in Machine Learning dir.  fixed-up references
Truncation of Wavelet Matrices
Understanding Deep Convolutional Networks
General self-similarity: an overview

cleanup url files (wrong repo format)

* what has sphere packing to do with compression?

• role of E8 & Leech lattice in optimal codes
• mathematically best compression was never used
• ikosahedron

* surfaces ∑

I show this paper to college freshmen because
• it’s pictorial
• it’s about an object you mightn’t have considered mathematical
• no calculus, crypto, ML, or pretentious notation
• it’s short
• it’s a classification proof: “How can it be that you know something about _all possible_ X, even the xϵX you haven’t seen yet?’

* good combinatorics

Programmers are used to counting boring things. Why not count something more interesting for a change?

* added commentaries from commit messages.  more consistent formatting.

* graphs

Programmers work with graphs often (file system, greplin, trees, "graph isomorphism problem" (who cares) ).   But have you ever tried to construct a simpler building-block (basis) with which graphs could be built? Or at least a different building block to build the same old things.

This <10-page paper also uses 𝔰𝔩₂(ℂ), a simple mathematical object you haven’t heard of, but which is a nice lead-in to an area of real mathematics—rep theory—that (1) contains actual insights (1a) that you aren’t using (2) is simple (3) isn’t pretentious.

* from dominoes to hexagons

why is this super-smart guy interested in such simple drawings?

* sorting

You do sorting all the time. Are there smart ways to organise sub-sorts?

* distributed robots!!

Robots! And varying your dimensionality across a space. But also — distributed robots!

* knitting

Get into knitting.

Learn a data structure that needs to be embedded in 3D to do its thing.

Break your mind a bit.

* female genius

* On “On Invariants of Manifolds”

2 pages about how notation and algorithms are inferior to clarity and simplicity.

* pretty robots

You’ll understand calculus better after looking at these pretty 75 pages.

* Farey

Have another look at ye olde Int class.

* renormalisation

Stéphane Mallat thinks renormalisation has something to do with why deep nets work.

* the torus trick, applied

In Simons Foundation’s interview by Michael Hartley Freedman of Robion Kirby, Freedman mentions this paper in which MHF applied RK’s “torus trick” to compression via wavelets.

* renormalisation

Here is a video of a master (https://press.princeton.edu/titles/5669.html) talking about renormalisation. Which S Mallat has suggested is key to why deep learning works.

* Cartan triality + Milnor fibre

This is a higher-level paper, but still a survey (so more readable). It ties together disparate areas like Platonic solids (A-D-E), Milnor’s exceptional fibre, and algebra.

It has pictures and you’ll get a better sense of what mathematics is like from skimming it.

* Create see.machine.learning

* tropical geometry

Recently there have been some papers posted about tropical geometry of neural nets. Tropical is also said to be derived from CS. This is a good introduction.

* rename papers accordingly, and add descriptive info

remove dup maths papers

* fixed crappy explanations

* improved the annotations for papers in the Machine Learning readme

* remediated descriptive wording for papers in the mathematics section

* removed local copy and added link to Conway Zip Proof

* removed local copy and added link to Packing of Spheres - Sloane

* removed local copy and added link to Algebraic Topo - Hatcher

* removed local copy and added link to Topo of Numbers - Hatcher

* removed local copy and added link to Young Tableaux - Yong

* removed local copy and added link to Elements of A Topo

* removed local copy and added link to Truncation of Wavelet Matrices

Co-authored-by: Zeeshan Lakhani <202820+zeeshanlakhani@users.noreply.github.com>
Co-authored-by: Wiktor Czajkowski <wiktor.czajkowski@gmail.com>
Co-authored-by: keddad <keddad@yandex.ru>
Co-authored-by: i <isomorphisms@sdf.org>
2019-12-25 23:36:58 -05:00

# Machine Learning
## External Papers
* [Top 10 algorithms in data mining](http://www.cs.uvm.edu/~icdm/algorithms/10Algorithms-08.pdf)
Although any top-10 list is debatable, this paper surveys 10 very important data mining and machine learning algorithms.
* [A Few Useful Things to Know about Machine Learning](http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf)
As the title says, it contains many useful tips and gotchas for machine learning.
* [Random Forests](https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf)
The initial paper on random forests.
* [Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data](http://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers)
The paper introducing conditional random fields as a framework for building probabilistic models.
* [Support-Vector Networks](http://rd.springer.com/content/pdf/10.1007%2FBF00994018.pdf)
The initial paper on support-vector networks for classification.
* [The Fast Johnson-Lindenstrauss Transform and Approximate Nearest Neighbors](https://www.cs.princeton.edu/~chazelle/pubs/FJLT-sicomp09.pdf)
The Johnson-Lindenstrauss lemma says that, for `n` points in `d` dimensions, there is a distribution over matrices of size `k x d`, where `k = O(1/eps^2 log n)`, such that with high probability a matrix `A` drawn from it preserves all pairwise distances up to a factor of `1 ± eps` (i.e. `(1-eps) * ||x-y|| < ||Ax - Ay|| < (1+eps) * ||x-y||`). This was the first paper to show that you can apply such a Johnson-Lindenstrauss transform (JLT) in fewer than `O(kd)` operations (i.e. you don't need to do the full matrix multiplication). The authors use the faster transform to construct one of the fastest known approximate nearest neighbor algorithms.
*Ailon, Nir, and Bernard Chazelle. "The fast Johnson-Lindenstrauss transform and approximate nearest neighbors." SIAM Journal on Computing 39.1 (2009): 302-322. Available: https://www.cs.princeton.edu/~chazelle/pubs/FJLT-sicomp09.pdf*
* [Applications of Machine Learning to Location Data](http://www.berkkapicioglu.com/wp-content/uploads/2013/11/thesis_final.pdf)
Using machine learning to design and analyze novel algorithms that leverage location data.
* ["Why Should I Trust You?" Explaining the Predictions of Any Classifier](http://www.kdd.org/kdd2016/papers/files/rfp0573-ribeiroA.pdf)
This paper introduces a technique for explaining the predictions of any classifier in an interpretable manner.
* [Multiple Narrative Disentanglement: Unraveling *Infinite Jest*](http://dreammachin.es/p1-wallace.pdf)
Uses an unsupervised approach to natural language processing that classifies narrators in David Foster Wallace's 1,000-page novel.
* [ImageNet Classification with Deep Convolutional Neural Networks](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
This paper introduces AlexNet, a neural network architecture that dramatically improved on the state of the art in image classification and is widely regarded as a breakthrough moment for deep learning.
* [Interpretable machine learning: definitions, methods, and applications](https://arxiv.org/pdf/1901.04592.pdf)
This paper introduces the foundations of the rapidly emerging field of interpretable machine learning.
* [Distilling the Knowledge in a Neural Network](https://arxiv.org/pdf/1503.02531.pdf)
This seminal paper introduces a method to distill information from an ensemble of neural networks into a single model.
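
The distance-preservation guarantee described in the Johnson-Lindenstrauss entry above can be checked empirically. The following is a minimal sketch, not the algorithm from the paper: it uses a dense Gaussian matrix (one standard choice of JL distribution) and the plain `O(kd)` matrix-vector product that the fast transform avoids. The function names and the sizes `d`, `k`, `n` are illustrative assumptions.

```python
import math
import random


def gaussian_projection(k, d, rng):
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (a dense JL map)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def apply(matrix, x):
    """Full matrix-vector product: O(k * d) operations."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def distance(x, y):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def worst_distortion(points, matrix):
    """Largest relative error | ||Ax - Ay|| / ||x - y|| - 1 | over all pairs."""
    projected = [apply(matrix, p) for p in points]
    worst = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ratio = distance(projected[i], projected[j]) / distance(points[i], points[j])
            worst = max(worst, abs(ratio - 1.0))
    return worst


# Illustrative sizes (assumptions): 10 random points in 500 dimensions,
# projected down to 200 dimensions.
rng = random.Random(0)
d, k, n = 500, 200, 10
points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
A = gaussian_projection(k, d, rng)
print(f"worst pairwise distortion: {worst_distortion(points, A):.3f}")
```

The printed value plays the role of `eps` for this particular draw of `A`; increasing `k` should, on average, make it smaller.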
## Hosted Papers
* :scroll: **[A Sparse Johnson-Lindenstrauss Transform](dimensionality_reduction/a-sparse-johnson-lindenstrauss-transform.pdf)**
The JLT is still computationally expensive for many applications, so one goal is to minimize the number of operations needed for the matrix multiplication `Ax`. This paper showed that an `O(k log d)` algorithm (e.g. `(log(d))^2`) may be attainable by showing that very sparse, structured random matrices can provide the *JL* guarantee on pairwise distances.
*Dasgupta, Anirban, Ravi Kumar, and Tamás Sarlós. "A sparse Johnson-Lindenstrauss transform." Proceedings of the forty-second ACM symposium on Theory of computing. ACM, 2010. Available: [arXiv:1004.4240](http://arxiv.org/abs/1004.4240)*
* :scroll: **[Towards a unified theory of sparse dimensionality reduction in Euclidean space](dimensionality_reduction/toward-a-unified-theory-of-sparse-dimensionality-reduction-in-euclidean-space.pdf)**
This paper attempts to lay out a generic mathematical framework (in terms of convex analysis and functional analysis) for sparse dimensionality reduction. The first author is a Fields Medalist who is interested in taking techniques for Banach spaces and applying them to this problem. It is a very technical paper that attempts to answer the question, "when does a sparse embedding exist deterministically?" (i.e. without drawing random matrices).
*Bourgain, Jean, and Jelani Nelson. "Toward a unified theory of sparse dimensionality reduction in Euclidean space." arXiv preprint arXiv:1311.2542; Accepted in an AMS Journal but unpublished at the moment (2013). Available: http://arxiv.org/abs/1311.2542*
* :scroll: **[Truncation of Wavelet Matrices: Edge Effects and the Reduction of Topological Control](https://reader.elsevier.com/reader/sd/pii/0024379594000395?token=EB0AA78D59A9648480596F018EFB72E0A02FD5FA70326B24B9D501E1A6869FE72CC4D97FA9ACC8BAB56060D6C908EC83)** by Freedman
In this paper, Michael Hartley Freedman applies Robion Kirby's “torus trick”, via wavelets, to the problem of compression.
* :scroll: **[Understanding Deep Convolutional Networks](https://github.com/papers-we-love/papers-we-love/blob/master/machine_learning/Understanding-Deep-Convolutional-Networks.pdf)** by Mallat
Stéphane Mallat proposes a model by which renormalisation can identify self-similar structures in deep networks. [This video of Curt McMullen discussing renormalization](https://www.youtube.com/watch?v=_qjPFF5Gv1I) can help with more context.
* :scroll: **[General self-similarity: an overview](https://github.com/papers-we-love/papers-we-love/blob/master/machine_learning/General-self-similarity--an-overview.pdf)** by Leinster
Dr Leinster's paper provides a concise, straightforward picture of self-similarity and its role in renormalization.
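
To illustrate why sparsity helps in the sparse Johnson-Lindenstrauss entry above, here is a simplified sketch of a sparse random sign embedding. It is not the construction from the Dasgupta, Kumar, and Sarlós paper: it assumes a `k x d` matrix with exactly `s` non-zeros per column, each `±1/sqrt(s)`, so applying it costs on the order of `s` operations per non-zero input coordinate instead of `k`. The names and the values of `d`, `k`, `s` are illustrative assumptions.

```python
import math
import random


def sparse_sign_columns(k, d, s, rng):
    """For each of the d input coordinates, pick s distinct target rows and signs.

    This encodes a k x d matrix with exactly s non-zeros per column,
    each equal to +1/sqrt(s) or -1/sqrt(s).
    """
    value = 1.0 / math.sqrt(s)
    columns = []
    for _ in range(d):
        rows = rng.sample(range(k), s)
        columns.append([(r, value if rng.random() < 0.5 else -value) for r in rows])
    return columns


def sparse_apply(columns, k, x):
    """Compute Ax touching only s matrix entries per non-zero coordinate of x."""
    y = [0.0] * k
    for xi, column in zip(x, columns):
        if xi == 0.0:
            continue
        for row, value in column:
            y[row] += value * xi
    return y


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(a * a for a in v))


# Illustrative sizes (assumptions): embed a 2000-dimensional vector into
# 256 dimensions using 8 non-zeros per column.
rng = random.Random(1)
d, k, s = 2000, 256, 8
columns = sparse_sign_columns(k, d, s, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
ratio = norm(sparse_apply(columns, k, x)) / norm(x)
print(f"norm ratio after sparse embedding: {ratio:.3f}")
```

A ratio close to 1 means the vector's length, and hence distances between such vectors, is approximately preserved for this draw.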