The question asks two things: (1) how to show that the maximum $X_{(n)}$ converges, in the sense that $(X_{(n)}-b_n)/a_n$ converges in distribution to the standard Gumbel distribution for suitably chosen sequences $(a_n)$ and $(b_n)$, and (2) how to find such sequences.
The first is well-known and documented in the original papers on the Fisher-Tippett-Gnedenko theorem (FTG). The second appears to be more difficult; that is the issue addressed here.
Please note, to clarify some assertions appearing elsewhere in this thread, that
The maximum does not converge to anything: it diverges (albeit extremely slowly).
There appear to be different conventions concerning the Gumbel distribution. I will adopt the convention that the CDF of a reversed Gumbel distribution is, up to scale and location, given by $1-\exp(-\exp(x))$. A suitably standardized maximum of iid Normal variates converges to a reversed Gumbel distribution.
Intuition
When the $X_i$ are iid with common distribution function $F$, the distribution of the maximum $X_{(n)}$ is
$$F_n(x) = \Pr(X_{(n)} \le x) = \Pr(X_1 \le x)\Pr(X_2 \le x)\cdots\Pr(X_n \le x) = F^n(x).$$
When the support of $F$ has no upper bound, as with a Normal distribution, the sequence of functions $F^n$ marches forever to the right without limit:
Partial graphs of $F^n$ for $n = 1, 2, 2^2, 2^4, 2^8, 2^{16}$ are shown.
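As a rough numerical illustration of this rightward drift, the sketch below evaluates the CDF of the maximum at a few fixed points for the same values of n, taking F to be the standard Normal distribution. It relies only on Python's standard library; the helper name `cdf_of_max` is invented for this illustration.

```python
from statistics import NormalDist

F = NormalDist().cdf  # standard Normal distribution function


def cdf_of_max(x, n):
    """CDF of the maximum of n iid standard Normal variates: F(x)**n."""
    return F(x) ** n


ns = [1, 2, 2**2, 2**4, 2**8, 2**16]
for x in (0.0, 2.0, 4.0):
    values = [cdf_of_max(x, n) for n in ns]
    # At any fixed x the values shrink toward 0 as n grows,
    # which is what "marching to the right" means.
    print(x, ["%.4g" % v for v in values])
```

For very large n the power underflows to 0.0 at moderate x, which is harmless here but worth remembering before taking logarithms of these values.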
To study the shapes of these distributions, we can shift each one back to the left by some amount $b_n$ and rescale it by $a_n$ to make them comparable.
Each of the previous graphs has been shifted to place its median at 0 and to make its interquartile range of unit length.
FTG asserts that sequences $(a_n)$ and $(b_n)$ can be chosen so that these distribution functions converge pointwise at every $x$ to some extreme value distribution, up to scale and location. When $F$ is a Normal distribution, the particular limiting extreme value distribution is a reversed Gumbel, up to location and scale.
Solution
It is tempting to emulate the Central Limit Theorem by standardizing $F_n$ to have zero mean and unit variance. This is inappropriate, though, in part because FTG applies even to (continuous) distributions that have no first or second moments. Instead, use a percentile (such as the median) to determine the location and a difference of percentiles (such as the IQR) to determine the spread. (This general approach should succeed in finding $a_n$ and $b_n$ for any continuous distribution.)
For the standard Normal distribution, this turns out to be easy! Let $0 < q < 1$. A quantile of $F_n$ corresponding to $q$ is any value $x_q$ for which $F_n(x_q) = q$. Recalling the definition $F_n(x) = F^n(x)$, the solution is
$$x_{q;n} = F^{-1}(q^{1/n}).$$
Therefore we may set
$$b_n = x_{1/2;n},\quad a_n = x_{3/4;n} - x_{1/4;n};\quad G_n(x) = F_n(a_n x + b_n).$$
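A minimal sketch of this construction for a standard Normal F, using only Python's standard library. The function names (`quantile_of_max`, `norming_constants`, `G`) are hypothetical and chosen for this illustration. It checks numerically that the standardized CDF takes the values 1/4, 1/2 and 3/4 where the construction says it should.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # F is taken to be the standard Normal distribution


def quantile_of_max(q, n):
    """q-quantile of the maximum of n variates: F^{-1}(q^{1/n})."""
    return STD_NORMAL.inv_cdf(q ** (1.0 / n))


def norming_constants(n):
    """Return (a_n, b_n): the IQR and the median of the maximum's distribution."""
    b_n = quantile_of_max(0.5, n)
    a_n = quantile_of_max(0.75, n) - quantile_of_max(0.25, n)
    return a_n, b_n


def G(x, n):
    """Standardized CDF of the maximum: F(a_n * x + b_n) ** n."""
    a_n, b_n = norming_constants(n)
    return STD_NORMAL.cdf(a_n * x + b_n) ** n


n = 2**8
a_n, b_n = norming_constants(n)
# Standardized locations of the lower and upper quartiles.
lower = (quantile_of_max(0.25, n) - b_n) / a_n
upper = (quantile_of_max(0.75, n) - b_n) / a_n
print(G(lower, n), G(0.0, n), G(upper, n))  # quartile levels of the standardized CDF
print(upper - lower)                        # IQR of the standardized CDF
```

Swapping `STD_NORMAL` for another continuous distribution object with `cdf` and `inv_cdf` methods should carry over unchanged, since nothing in the construction is specific to the Normal case.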
Because, by construction, the median of $G_n$ is $0$ and its IQR is $1$, the median of the limit of $G_n$ (which is some version of a reversed Gumbel) must be $0$ and its IQR must be $1$. Let the scale parameter be $\beta$ and the location parameter be $\alpha$. Since the median is $\alpha + \beta\log\log(2)$ and the IQR is readily found to be $\beta\left(\log\log(4) - \log\log(4/3)\right)$, the parameters must be
$$\alpha = \frac{\log\log 2}{\log\log(4/3) - \log\log(4)};\quad \beta = \frac{1}{\log\log(4) - \log\log(4/3)}.$$
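These two values are easy to check numerically. The sketch below evaluates them and confirms that, under the CDF convention adopted in this answer with location α and scale β, the median is 0 and the IQR is 1. The helper names are invented for this illustration.

```python
import math


def loglog(t):
    """log(log(t)) for t > 1."""
    return math.log(math.log(t))


# Parameters from the median-0, IQR-1 conditions in the text.
beta = 1.0 / (loglog(4) - loglog(4 / 3))
alpha = loglog(2) / (loglog(4 / 3) - loglog(4))


def reversed_gumbel_cdf(x):
    """CDF under this answer's convention: 1 - exp(-exp((x - alpha) / beta))."""
    return 1.0 - math.exp(-math.exp((x - alpha) / beta))


def reversed_gumbel_quantile(q):
    """Inverse of reversed_gumbel_cdf: alpha + beta * log(-log(1 - q))."""
    return alpha + beta * math.log(-math.log(1.0 - q))


median = reversed_gumbel_quantile(0.5)
iqr = reversed_gumbel_quantile(0.75) - reversed_gumbel_quantile(0.25)
print(alpha, beta)
print(median, iqr, reversed_gumbel_cdf(0.0))
```

Numerically, α is roughly 0.233 and β roughly 0.636.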
It is not necessary for $a_n$ and $b_n$ to be exactly these values: they need only approximate them, provided the limit of $G_n$ is still this reversed Gumbel distribution. Straightforward (but tedious) analysis for a standard normal $F$ indicates that the approximations
$$a'_n = \frac{\log\left(4\log^2(2)/\log^2(4/3)\right)}{2\sqrt{2\log(n)}},\quad b'_n = \sqrt{2\log(n)} - \frac{\log(\log(n)) + \log\left(4\pi\log^2(2)\right)}{2\sqrt{2\log(n)}}$$
will work fine (and are as simple as possible).
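One way to get a feel for how good these approximations are is to compare them with the exact quantile-based constants. The sketch below does so for a standard Normal F using only the standard library. The function names and the two diagnostics (the scale ratio and the location gap measured in units of the exact scale) are my own choices for illustration, not part of the analysis above.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def exact_constants(n):
    """(a_n, b_n) from the quartiles and median of F^n, for standard Normal F."""
    def x(q):
        return STD_NORMAL.inv_cdf(q ** (1.0 / n))
    return x(0.75) - x(0.25), x(0.5)


def approx_constants(n):
    """(a'_n, b'_n) from the closed-form approximations; requires n >= 2."""
    root = math.sqrt(2.0 * math.log(n))
    ln2_sq = math.log(2) ** 2        # natural log of 2, squared
    ln43_sq = math.log(4 / 3) ** 2   # natural log of 4/3, squared
    a = math.log(4.0 * ln2_sq / ln43_sq) / (2.0 * root)
    b = root - (math.log(math.log(n)) + math.log(4.0 * math.pi * ln2_sq)) / (2.0 * root)
    return a, b


def diagnostics(n):
    """Return (a'_n / a_n, (b'_n - b_n) / a_n)."""
    a, b = exact_constants(n)
    a_approx, b_approx = approx_constants(n)
    return a_approx / a, (b_approx - b) / a


for k in (1, 6, 11, 16):
    ratio, gap = diagnostics(2**k)
    # For valid approximations the ratio should tend to 1 and the gap to 0.
    print(2**k, ratio, gap)
```

The agreement improves only slowly with n, which is consistent with the slow divergence of the maximum noted at the outset.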
The light blue curves are partial graphs of $G_n$ for $n = 2, 2^6, 2^{11}, 2^{16}$ using the approximate sequences $a'_n$ and $b'_n$. The dark red line graphs the reversed Gumbel distribution with parameters $\alpha$ and $\beta$. The convergence is clear (although the rate of convergence for negative $x$ is noticeably slower).
References
B. V. Gnedenko, On The Limiting Distribution of the Maximum Term in a Random Series. In Kotz and Johnson, Breakthroughs in Statistics Volume I: Foundations and Basic Theory, Springer, 1992. Translated by Norman Johnson.