I am a beginning graduate student in CS and I am transferring from my field of complexity theory to machine learning.

One thing I cannot help but notice (after starting out a month ago) is that machine learning papers published in NIPS and elsewhere have absolutely terrible, downright atrocious, indecipherable math.

Right now I am reading a “popular paper” called Generative Adversarial Nets, and I am hit with walls of unclear math.

  • The paper begins with defining a generator distribution p_g over x, but what set is x contained in? What dimension is x? What does the distribution p_g look like? If it is unknown, then say so.

  • Then it says, “we define a prior on input noise variables p_z(z)”. So is z the variable, or p_z(z)? Why is the distribution written as a function of z here, but not for p_g?

  • Then the authors define a mapping to “data space”, G(z;theta_g), where G is claimed to be differentiable (a very strong claim, yet no proof is given; we just need to accept it), and theta_g is a parameter (in what set or space?).

  • Are G and D functions? If so, what are the domains and ranges of these functions? These are basic details from high school algebra around the world.

When I got to the proof of proposition 1, I burst out in laughter!!!!! This proof would fail any 1st year undergraduate math student at my university. (How was this paper written by 8 people, statisticians no less?)

  • First, what does it mean for G to be fixed? Fixed with respect to what?

  • The proof attempts to define a mapping, y to a log(y) + b log(1-y). First of all, writing 1D constants, a, b, as a pair (a,b) in R^2 is simply bizarre. Subtracting the set {0, 0} from R^2, instead of the set {(0,0)} containing the pair, is wrong from the perspective of set theory.

  • The map should be written with $\mapsto$ instead of $\to$ (just look at ANY math textbook, or even Wikipedia), so it is also notationally incorrect.

  • Finally, Supp(p_data) and Supp(p_g) are never defined anywhere.

  • The proof seems to be using a simple 1D differentiation argument. Say so at the beginning. And please do not differentiate over the closed set [0,1]. The derivatives are not well defined at the boundary (you know?).
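
For reference, here is roughly how I would expect the 1D argument from the last few points to be written out. This is only my own sketch, not the paper's proof: I am assuming a, b >= 0 with (a,b) not equal to (0,0), and the convention 0 log(0) = 0, neither of which is spelled out in the points above. The name f is mine.

```latex
% Sketch under the stated assumptions: a, b >= 0, (a,b) != (0,0), 0 log 0 := 0.
Fix $(a,b) \in \mathbb{R}^2 \setminus \{(0,0)\}$ with $a, b \ge 0$, and define
\[
  f \colon [0,1] \to \mathbb{R} \cup \{-\infty\}, \qquad
  y \mapsto a \log(y) + b \log(1-y).
\]

\textbf{Case $a, b > 0$.} On the open interval $(0,1)$, $f$ is differentiable with
\[
  f'(y) = \frac{a}{y} - \frac{b}{1-y}, \qquad
  f''(y) = -\frac{a}{y^2} - \frac{b}{(1-y)^2} < 0 .
\]
So $f$ is strictly concave on $(0,1)$, and $f'(y) = 0$ holds if and only if
$a(1-y) = by$, that is, $y^\ast = \frac{a}{a+b} \in (0,1)$.
Since $f(y) \to -\infty$ as $y \to 0^+$ and as $y \to 1^-$,
the point $y^\ast$ is the unique maximizer of $f$ on $[0,1]$.

\textbf{Case $a = 0$, $b > 0$.} Then $f(y) = b \log(1-y)$ is strictly decreasing,
so the maximum is attained at $y = 0 = \frac{a}{a+b}$.

\textbf{Case $a > 0$, $b = 0$.} Then $f(y) = a \log(y)$ is strictly increasing,
so the maximum is attained at $y = 1 = \frac{a}{a+b}$.

In every case the maximizer on $[0,1]$ is $y^\ast = \frac{a}{a+b}$.
```

Note that differentiation is only ever used on the open interval (0,1); the endpoints are handled by the limit and monotonicity arguments, which sidesteps the boundary issue.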

I seriously could not continue with this paper. My advisor warned me about the field lacking rigor and I did not believe him, but now I do. Does anyone else feel the same way?

( https://www.reddit.com/r//comments/9l7j46/d_why_do_machine_learning_papers_have_such/)

