Pretty much any intro book on queueing theory or stochastic processes will cover this, e.g., Ross, Stochastic Processes, or Kleinrock, Queueing Systems.
For an outline of a proof that memoryless interarrival times lead to an exponential distribution:
Let G(x) = P(X > x) = 1 - F(x). Now, if the distribution is memoryless,
G(s+t) = G(s)G(t)
i.e., the probability that X > s+t equals the probability that X > s, times the probability that, given it is already greater than s, it is also greater than s+t. The memoryless property says that this second (conditional) probability equals the probability that a fresh r.v. with the same distribution exceeds t: P(X > s+t | X > s) = P(X > t) = G(t).
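As a quick sanity check of the identity above, here is a minimal Python sketch. It only verifies the easy direction (an exponential r.v. does satisfy G(s+t) = G(s)G(t)) and is not part of the proof. The rate, s, t, sample size, and the helper names are all illustrative assumptions, not anything from Ross.

```python
import math
import random


def survival_exponential(x, rate):
    """G(x) = P(X > x) for an exponential r.v. with the given rate."""
    return math.exp(-rate * x)


def empirical_conditional_survival(samples, s, t):
    """Estimate P(X > s + t | X > s) from a list of samples."""
    survivors = [x for x in samples if x > s]
    if not survivors:
        raise ValueError("no samples exceed s")
    return sum(1 for x in survivors if x > s + t) / len(survivors)


# Illustrative values only (assumptions for this sketch).
rate, s, t = 0.5, 1.0, 2.0

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.expovariate(rate) for _ in range(200_000)]

# Exact check of G(s+t) = G(s)G(t).
lhs = survival_exponential(s + t, rate)
rhs = survival_exponential(s, rate) * survival_exponential(t, rate)

# Simulated check of P(X > s+t | X > s) = P(X > t).
conditional = empirical_conditional_survival(samples, s, t)
unconditional = survival_exponential(t, rate)

print("G(s+t) =", lhs, " G(s)G(t) =", rhs)
print("P(X > s+t | X > s) ~", conditional, " P(X > t) =", unconditional)
```

The first pair of numbers should agree up to floating-point rounding, and the second pair up to sampling noise.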
To quote Ross:
"The only solutions of the above equation that satisfy any sort of reasonable conditions, (such as monotonicity, right or left continuity, or even measurability), are of the form:"
G(x) = exp(-ax) for some suitable value of a.
With a > 0, this is exactly the survival function of the exponential distribution, i.e., F(x) = 1 - exp(-ax).
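For the curious, the step the quote skips over can be sketched as follows. This is only an outline, assuming right continuity (which G = 1 - F has automatically, since F is a CDF) and assuming 0 < G(1) < 1 to rule out degenerate cases. The integers m and n are introduced here just for the argument.

```latex
\begin{align*}
G\!\left(\tfrac{m}{n}\right)
  &= G\!\left(\tfrac{1}{n} + \dots + \tfrac{1}{n}\right)
   = G\!\left(\tfrac{1}{n}\right)^{m}
  && \text{(apply } G(s+t) = G(s)G(t) \text{ repeatedly)} \\
G(1)
  &= G\!\left(\tfrac{1}{n}\right)^{n}
  \;\Longrightarrow\;
  G\!\left(\tfrac{1}{n}\right) = G(1)^{1/n}
  && \text{(take } m = n \text{)} \\
G\!\left(\tfrac{m}{n}\right)
  &= G(1)^{m/n}
  && \text{(so } G(x) = G(1)^{x} \text{ for rational } x \ge 0 \text{)} \\
G(x)
  &= G(1)^{x} = e^{-ax},
  \qquad a = -\ln G(1) > 0
  && \text{(extend to real } x \ge 0 \text{ by right continuity)}
\end{align*}
```

Note that G(1) = G(1/2)^2 is nonnegative, so the logarithm is well defined once G(1) > 0.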