Adaptive Optics is a general technology for correcting images of objects seen through an inhomogeneous medium. Originally developed by astronomers and the military to study objects outside the earth’s atmosphere, the technology now has applications in many different fields, from biological imaging to thermonuclear fusion.
AO SYSTEM AT CHICAGO

How do you build an AO system?
Making the Deformable Mirror
Building the Wavefront Sensor
         Choosing the Detector
         Calculating the Wavefront
         Working out the Reconstructor Matrix
         Noise Gain

AO system as a Feedback Loop
Non-Common Path Errors


Temperature fluctuations in the earth’s atmosphere cause local changes in the refractive index of air. These changes cause the twinkling of stars and the mirage seen when driving on a hot road. Light from a star comes through space as a perfectly flat and smooth wavefront. As the light passes through the atmosphere, the wavefront, although smooth on small scales, becomes "crinkled", degrading the resolution of a ground-based telescope.

One way to overcome this problem is to send the telescope into space, as was done with the highly successful Hubble Space Telescope. Adaptive Optics tries to obtain similar imaging performance from the ground by measuring the wavefront in real time and then correcting the distortion using a deformable mirror. The minimum elements of any Adaptive Optics (AO) system are therefore a device for actively correcting the wavefront (aka “Deformable mirror”), a device for providing correction signals (aka “Wavefront Sensor”) and a computer to control the whole system.

Although invented by an astronomer, Adaptive Optics has applications in any field requiring active correction of the wavefront before the detector. It is already used for retinal imaging, optical coherence tomography, deep tissue imaging, laser communications and a wide range of DoD applications. Although most papers on Adaptive Optics tend to be very mathematical, building Adaptive Optics systems is both easier and more affordable than you might think.
[Return to top]  [Return to home page]

How do you Build an AO System?

[Figure: photo of group]
When my group started building AO systems in the early 1990s, there were very few components available to build an AO system and we had to build the deformable mirror, wavefront sensor and even the high speed computer ourselves. On the plus side, it did give us a lot of experience and this also meant that we were able to make many different types of deformable mirrors with different geometries, including some new designs. The original system (called “ChAOS”) now resides in the Adler Planetarium in Chicago as a historical exhibit. It still looks good.

Making the Deformable Mirror

[Figure: close-up of DM]
Our deformable mirrors were built using tubes of piezoelectric material about 1 inch long. These tubes change their length by a few microns when a voltage is applied across the inner and outer surfaces of the tube. A glass ball is used to interface the PZT tube with the faceplate. The actuators were assembled in a baseplate and glued together so that the tops of the glass spheres were all in a plane. A thin, flat faceplate 1 mm thick was held in a vacuum chuck and glued onto the actuators, producing a deformable mirror that was flat to about a micron and could be moved over a range of 4 microns. Making these deformable mirrors is well within the capabilities of a skilled amateur and full details of how these mirrors are made are given in Mike Smutko's paper. A modest deformable mirror can be built for a few hundred dollars. Even lower cost mirrors can be made from bimorph plates. Newer devices, using MEMS technology, can be bought ready-made for a few thousand dollars from ThorLabs. A photograph of a collection of our DMs is shown below. These have 91 or 201 actuators. The best DMs that we make use a quasi-hex geometry, which reduces waffle pattern noise.
[Figure: collection of deformable mirrors]
Driving all these deformable mirrors is probably harder than making the mirror. Each actuator, whether in a piezo, bimorph or MEMS device, requires a few hundred volts to operate, and amplifiers operating at these voltages are both expensive and power hungry. However, the piezo actuators are built from a ceramic material and are basically capacitors. This enabled us to use a multiplexed drive system in which a single high voltage amplifier drove up to 16 actuators. It took 5 microseconds to charge each piezo, so the shape of the mirror could be updated in 100 microseconds.

Building the Wavefront Sensor

    Although other types of wavefront sensor (shearing interferometers, Mach-Zehnder interferometers, pyramid prisms, curvature sensors) may have significant advantages for your application, the classical wavefront sensor is the Shack-Hartmann sensor. This device measures the gradient of the wavefront across the pupil rather than the wavefront itself and consists of a lattice of small lenses (lenslets) arranged in a regular grid and a detector. The pupil of the telescope is usually imaged onto the lenslet array so that, when the telescope is pointed at a bright star, an array of images of the star is formed in the focal plane of the lenslets. A detector, usually a CCD, digitizes these images and the position of each image is determined by a computer. If the wavefront at the telescope entrance pupil is flat, and the optics are perfect, the stellar images will form a regular array on the detector. In practice, each image will be displaced by an amount that is approximately proportional to the average gradient of the wavefront across its subaperture. The sensor measures this average gradient, sampled at the subaperture positions. It is usual to place the wavefront sensor after the deformable mirror so that we actually measure the difference between the mean slope of the atmosphere and the mean slope of the DM, averaged over any subaperture. We can also simplify the mathematics and reduce some systematic errors if we image the deformable mirror onto the lenslet array so that an image of the deformable mirror actuators is accurately mapped onto the corners of the subapertures.
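As a rough illustration of how the position of each image might be determined, here is a minimal Python sketch that uses an intensity-weighted centroid. The function names, the 16 x 16 pixel window and the synthetic Gaussian spot are assumptions made for the example; they are not taken from ChAOS or from any particular camera software.

```python
import numpy as np

def spot_centroid(subimage):
    """Intensity-weighted centroid (x, y) of one subaperture image, in pixels."""
    img = np.asarray(subimage, dtype=float)
    total = img.sum()
    ys, xs = np.indices(img.shape)
    return (xs * img).sum() / total, (ys * img).sum() / total

def spot_displacement(subimage, reference_xy):
    """Shift of the spot from its reference (flat wavefront) position, in pixels."""
    cx, cy = spot_centroid(subimage)
    return cx - reference_xy[0], cy - reference_xy[1]

# Synthetic Gaussian spot (assumed shape), shifted by (+0.5, -0.25) pixels
# from the centre of a 16 x 16 pixel window.
ys, xs = np.indices((16, 16))
x0, y0 = 7.5 + 0.5, 7.5 - 0.25
spot = np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2 * 1.5 ** 2))

dx, dy = spot_displacement(spot, (7.5, 7.5))
print(f"measured shift: dx={dx:.3f}, dy={dy:.3f} pixels")
```

The displacement in pixels still has to be scaled to a wavefront slope using the pixel size and the lenslet focal length, and a real sensor would also need background subtraction before the centroid is taken.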

Choosing the Detector

If you are not photon limited, a commercial CCD or similar camera with sufficient pixels to adequately sample the Shack-Hartmann spots (say 4 to 8 CCD pixels across the FWHM of the image from a given subaperture) will meet the needs of a low bandwidth AO system. Unless you are using a very poor detector, or a silly algorithm, to work out the position, oversampling does not significantly increase the accuracy and usually results in a lower read-out speed, increased read-out noise and possibly higher cost. The detector usually has a fixed read noise for every pixel read out (hence the name). If photon noise or read-out speed is a significant problem, then you must trade the number of pixels per subimage against the linearity of the measurement. The more pixels you have, the more linear the slope measurements but, usually, the higher the measurement noise and the longer the read-out time. Adequate bandwidth is just as important as low noise.

In the limit, you use the detector as a quad cell and arrange that the lenslet images of the source are at the corners of adjacent pixels. While the quad cell minimizes the number of pixels used to measure the position, the downside is that
(1) the measurement error is non-linear – the quad cell is most sensitive when the spot is centered at the 4 corners of adjacent pixels.
(2) the gain, even when the spot is centered on the corners of the pixel, (signal out/movement of the spot) depends on the size and shape of the spot.
(3) alignment issues become important.
Because the normal AO system is a feedback device, this non-linearity is not very important, since the servo is always trying to move the actuators to force the spots to their null position. Similarly, a change in spot size only affects the gain of the system, which can, in principle, be compensated for by the control system. However, you do have to match the registration of the lenslets and detector pixels very carefully, so that there is an integral number of pixels between each subaperture. For some reason, the lenslet manufacturers do not often do this and you may have to re-image and re-scale the lenslets onto the detector pixels. The simplest low noise AO systems (such as ChAOS) carefully image the DM actuators onto the corners of the subapertures and then arrange re-imaging and re-scaling optics between the lenslet array and detector. The Shack-Hartmann sensor is thus not without its problems.
Its most significant advantage is perhaps that its operation is easy to understand in principle.
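Points (1) and (2) above can be made concrete with a small Python sketch. It assumes a circular Gaussian spot and pixels much larger than the spot, so that the normalized left/right signal reduces to an error function; both the model and the function names are illustrative assumptions rather than a description of a real detector.

```python
import math

def quad_cell_signal(offset, sigma):
    """Normalized (right - left) / (right + left) signal for a Gaussian spot.

    offset is the spot position relative to the pixel boundary and sigma is the
    Gaussian width of the spot, both in the same units (assumed model).
    """
    return math.erf(offset / (sigma * math.sqrt(2.0)))

def null_gain(sigma):
    """Signal per unit spot movement when the spot sits on the pixel boundary."""
    return math.sqrt(2.0 / math.pi) / sigma

for sigma in (0.5, 1.0):
    small = quad_cell_signal(0.1, sigma)
    large = quad_cell_signal(1.0, sigma)
    print(f"sigma={sigma}: gain at null={null_gain(sigma):.3f}, "
          f"signal(0.1)={small:.3f}, signal(1.0)={large:.3f}")
```

In this model, halving the spot width doubles the gain at the null, and the signal saturates once the spot has moved by more than about its own width, which is why the response is only linear close to the null position.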

Calculating the Wavefront

  In this section I will assume that you have built a Shack Hartmann wavefront sensor that allows you to image the DM actuator pattern accurately onto the lenslet array and that you have a two dimensional detector to image the spots. After some processing of the CCD data you will end up with a series of measurements of the average slope of the wavefront across each subaperture. This is called the slope vector, although, in almost all calculations, what we actually need is the phase difference across the subaperture, which is the slope times the subaperture size, because most ways of reconstructing the wavefront from this data rely on a model relating these phase difference measurements to the phase of the wavefront at the edges of the subaperture.

The first step is to define the phase differences in terms of the phase points. The figure shows one such arrangement, called the Fried geometry, often used with the Shack-Hartmann sensor.
[Figure: geometry diagram with numbered actuator positions]
In this model, the first x phase difference is given by 0.5(2+5) - 0.5(1+4), where the integers correspond to actuator positions in the diagram. Different AO systems may have different geometries, but we can always define all the slopes measured by the wavefront sensor in terms of the actuator positions as a matrix equation:
$$ s = A\,\phi $$
where s is a vector containing a list of all the phase differences, \phi is the vector of phases and A is a very sparse matrix called the geometry matrix. There exists a generalized inverse matrix, usually called A+, which is a non-sparse matrix relating the phases to the phase differences:
$$ \phi = A^{+} s $$
This matrix gives us a recipe for calculating the phases from the phase differences: we take all the phase differences that we measure and weight each one with a number that depends on the phase position j and the slope number i:
$$ \phi_j = \sum_i A^{+}_{ji}\, s_i $$
There has been considerable discussion, often heated, over what the “best” coefficients are, but these differences are often only important when the source is faint and we are trying to optimize the signal/noise of the system. We should note that most gradient and curvature wavefront sensors do not sense all possible modes. For instance, the Fried geometry, discussed above, cannot measure the so-called "Waffle mode". The Waffle mode occurs when the actuators are moved in a checkerboard pattern (all white squares change position while the black squares remain stationary). This pattern can build up over time; it is reduced by using different actuator geometries or by filtering the control signals so as to suppress this mode. We should also note that the geometry model is only an approximation to the wavefront. Even if the actuator positions reflect the true positions of the wavefront at these points, there are high frequency components in the wavefront that cause system errors. This error (usually called "aliasing") can be reduced by proper design but introduces an additional term into the error budget.

Working out the reconstruction matrix

[Figure: derivation of phase points]
We need to work out an inverse to the geometry matrix. Starting with the original matrix equation relating slopes to phase points, we note that A is not square (there are about twice as many slopes as phase points) so that we cannot invert A directly. If we multiply both sides by A transpose we get a square matrix A transpose A and, following the steps shown in the figure, we can at least write down an equation for the phase points in terms of the slopes:
$$ A^{T} s = A^{T} A\,\phi \quad\Rightarrow\quad \phi = (A^{T} A)^{-1} A^{T} s = A^{+} s $$
This reconstruction matrix is usually called the A+ matrix.

We have, of course, still not shown that we can invert A transpose A and, indeed, if you try this the inversion will fail. To understand why, and what to do about it, we should look at the simple 3x3 array drawn in the last section. The first x slope is given by the Fried geometry as 0.5(2+5) - 0.5(1+4). If we add a constant to either (2 and 4) or (1 and 5) the resulting slope is unchanged, even though the shape of the wavefront is very different. This is just an artifact of the original equation but is the reason for the "Waffle mode": you get the same slopes if you piston all the odd or all the even phase points. You can overcome this mathematically for the inversion problem either by setting one odd and one even point to zero, or (better) by setting the sum of all the odd points, and the sum of all the even points, to zero, the so-called minimum norm solution. The mode is still undetected and can build up with time, but we will discuss this issue in the section after next.
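The invisibility of the waffle mode is easy to check numerically. The Python sketch below applies the Fried phase difference formula to a 3x3 grid of phase points; the function name and the choice of y increasing down the rows are assumptions made for the example.

```python
import numpy as np

def fried_differences(phase):
    """x and y phase differences across each subaperture of a grid of phase points."""
    p = np.asarray(phase, dtype=float)
    tl, tr = p[:-1, :-1], p[:-1, 1:]   # top-left and top-right corners
    bl, br = p[1:, :-1], p[1:, 1:]     # bottom-left and bottom-right corners
    dx = 0.5 * (tr + br) - 0.5 * (tl + bl)
    dy = 0.5 * (bl + br) - 0.5 * (tl + tr)
    return dx, dy

rows, cols = np.indices((3, 3))
# Piston the "odd" points 1, 3, 5, 7, 9 of the 3x3 grid: a checkerboard pattern.
waffle = np.where((rows + cols) % 2 == 0, 1.0, 0.0)
# A plain x tilt for comparison.
tilt = cols.astype(float)

waffle_dx, waffle_dy = fried_differences(waffle)
tilt_dx, tilt_dy = fried_differences(tilt)
print("waffle x differences:\n", waffle_dx)
print("tilt x differences:\n", tilt_dx)
```

Every phase difference produced by the checkerboard is zero, so the sensor cannot distinguish it from a flat wavefront, whereas the tilt gives a unit x difference in every subaperture.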

 The first thing to do is to define how your actuator positions produce the slopes you hope to measure.
If you use Mathematica you can define this matrix, called the geometry matrix, for an m x n array of points as follows. The definition also adds in the extra two rows needed to provide the minimum norm solution.

          [Figure: Mathematica definition of the geometry matrix]

We will try this for a 3x3 actuator matrix, setting m=3 and n=3.
The geometry matrix produces an ordered list of slopes from the phases at the actuator positions.
In this example, the geometry matrix, A,  looks as follows:

                                          [Figure: geometry matrix for the 3 x 3 example]

The first row is the first phase difference. The actuator positions run from 1 to 9 left to right along the top, so this matrix says that the first x phase difference is given as 0.5(2+5)-0.5(1+4), where the integer numbers are the actuator positions. There are four x slopes and then four y slopes. The last two rows at the bottom add up the phases of all the odd points and all the even points and represent the unobserved Waffle mode terms. The matrix A transpose A is given in Mathematica as

                                       [Figure: A transpose A matrix]

and the final A+ matrix is given by

                     [Figure: A+ matrix values]

In this A+ matrix each row represents a phase point, which is a sum of the 8 slope values, each multiplied by a weighting factor. The last two columns are the waffle mode pistons, which are set to zero and not used. Although we have not shown it here, this is the least squares estimator of the phases from the phase differences. You must be careful about the numbering of your actuator positions and phase differences, so that the geometry matrix you use to calculate the least squares solution corresponds to the actuator positions. Phase difference measurements round the edges and corners should be treated slightly differently, but this example illustrates the general principles of reconstruction.
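For readers without Mathematica, the same construction can be sketched in Python with NumPy. This is not a transcription of the Mathematica definition above: the function name, the row ordering (all x differences, then all y differences, then the two checkerboard sums) and the sign convention for y are assumptions chosen to match the description in the text.

```python
import numpy as np

def fried_geometry(m, n):
    """Geometry matrix for an m x n grid of phase points, numbered row by row.

    Rows: (m-1)(n-1) x differences, (m-1)(n-1) y differences, then two rows
    summing the two checkerboard classes (the unobserved waffle terms).
    """
    idx = np.arange(m * n).reshape(m, n)
    rows = []
    for direction in ("x", "y"):
        for r in range(m - 1):
            for c in range(n - 1):
                tl, tr = idx[r, c], idx[r, c + 1]
                bl, br = idx[r + 1, c], idx[r + 1, c + 1]
                row = np.zeros(m * n)
                if direction == "x":
                    row[[tr, br]] = 0.5
                    row[[tl, bl]] = -0.5
                else:
                    row[[bl, br]] = 0.5
                    row[[tl, tr]] = -0.5
                rows.append(row)
    rr, cc = np.indices((m, n))
    parity = ((rr + cc) % 2).ravel()
    rows.append((parity == 0).astype(float))   # "odd" points 1, 3, 5, ... for 3x3
    rows.append((parity == 1).astype(float))   # "even" points 2, 4, 6, ... for 3x3
    return np.array(rows)

A = fried_geometry(3, 3)
A_plus = np.linalg.inv(A.T @ A) @ A.T   # least squares reconstructor
print(A.shape, A_plus.shape)
print(A[0])
```

With the two extra rows included, A transpose A is no longer singular and the direct inverse succeeds. In use, the measured vector would be the 8 phase differences followed by two zeros for the waffle terms.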

If you have an adequate number of pixels/subaperture, you can (in principle) simply poke up each actuator in turn and measure the resulting phase differences, thus measuring the geometry A matrix rather than calculating it.  The measured A matrix can then be used to calculate the reconstructor M matrix. This turns out to be trickier than it sounds and, if you are starting off in AO, it may be faster to put in the required opto-mechanical adjustments and simplify the maths.


 Noise Gain

Once you have the A matrix, you can work out the noise gain of the system. We can show that the noise gain for the least squares A+ reconstructor is given by:

                                                 Noise gain = Tr[Inverse[Transpose[ae].ae]]/(m x n)

Where ae is the geometry matrix. The mean square error in calculating the wavefront is given by this number times the mean square error of measuring the phase difference across the subaperture. Note that the error is defined in terms of a variance and that the key parameter is not the mean square slope error but rather the mean square slope error x subaperture size^2.

The noise gain does not increase rapidly with number of actuators. This is crucial for the whole approach and is certainly not true for a one dimensional array of actuators.  If we had a single line of phase differences and set the phase at one end to zero, the phase at the other end is simply the sum of the phase differences along the line. In this case, if the noise between phase difference measurements is uncorrelated, the variance of the phase estimate at the other end will then be the sum of the variances of the individual phase difference measurements and the noise gain will increase directly with the number of phase differences.

However, because we are measuring phase differences in two directions, as a vector, there are a number of different paths joining two points in the array and many of the paths are independent. In fact if we have an N x N array of phase points the number of different loops is nearly equal to N, and we can show that the noise gain only increases logarithmically with the number of phases. This is a feature of using a weighted sum of all the phase differences in two dimensions. Different reconstruction algorithms have different noise propagators. For instance, curvature sensors, which measure a scalar field, have much worse noise gains for large numbers of actuators.
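The contrast between the one and two dimensional cases can be checked numerically. The Python sketch below evaluates the trace formula given earlier in this section for Fried geometry grids of increasing size and for a single line of phase points with one end pinned. The geometry construction (including the two checkerboard rows) and the function names are assumptions made for the example, not the original Mathematica code.

```python
import numpy as np

def fried_geometry(m, n):
    """Fried geometry matrix with two extra checkerboard-sum rows (assumed layout)."""
    idx = np.arange(m * n).reshape(m, n)
    rows = []
    for direction in ("x", "y"):
        for r in range(m - 1):
            for c in range(n - 1):
                tl, tr = idx[r, c], idx[r, c + 1]
                bl, br = idx[r + 1, c], idx[r + 1, c + 1]
                row = np.zeros(m * n)
                plus, minus = ([tr, br], [tl, bl]) if direction == "x" else ([bl, br], [tl, tr])
                row[plus] = 0.5
                row[minus] = -0.5
                rows.append(row)
    rr, cc = np.indices((m, n))
    parity = ((rr + cc) % 2).ravel()
    rows.append((parity == 0).astype(float))
    rows.append((parity == 1).astype(float))
    return np.array(rows)

def chain_geometry(n):
    """Single line of n phase points: first phase pinned, then n - 1 differences."""
    a = np.zeros((n, n))
    a[0, 0] = 1.0
    for k in range(1, n):
        a[k, k - 1], a[k, k] = -1.0, 1.0
    return a

def noise_gain(a):
    """Tr[Inverse[Transpose[a].a]] divided by the number of phase points."""
    return np.trace(np.linalg.inv(a.T @ a)) / a.shape[1]

for side in (4, 8, 16):
    points = side * side
    print(f"{points:4d} phase points: 2-D gain = {noise_gain(fried_geometry(side, side)):.3f}, "
          f"1-D gain = {noise_gain(chain_geometry(points)):.1f}")
```

For the line of n phase points the formula gives (n + 1)/2, growing directly with the number of points, while the two dimensional values change only slowly as the grid grows.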

Adaptive Optics as a Feedback Loop

It is important to realize that we are actually building a sampled feedback loop, outlined below. The wavefront sensor attempts to measure the slopes of the difference between the atmospheric wavefront and deformable mirror, integrated over the subapertures and over a sample time period t. This slope vector is then multiplied by a matrix M to obtain an updated vector (or map) of the difference between the wavefront and the positions of the DM actuators. M can be one of a large number of different matrices, from sparse iterative matrices to dense matrices based on the statistics of the wavefront.

                [Figure: AO as a feedback loop]

The reconstructor matrix M provides error signals to the actuator controller and can also control the spatial smoothing of the deformable mirror. The most common reconstructor used in adaptive optics is probably the least squares reconstructor. This minimizes the fitting error between the atmosphere and DM but assumes that there is no correlation between the phases of adjacent actuators. The lack of spatial smoothing has been a serious criticism of this reconstructor. Atmospheric turbulence has the form of a fractal and has much more power at large spatial scales, so the idea that the atmospheric wavefront can have big swings between adjacent phase points is physically unrealistic. An enormous amount of time and effort has gone into working out the optimum M matrix assuming given atmospheric wavefront statistics. We should note, however, that what we actually measure is the difference between the atmosphere and the DM. This naturally takes out the high power at large spatial scales (in fact that is how AO works) and the correct statistics are not those of the atmosphere, phi_atm, but those of phi_atm - phi_DM, which are, very approximately, uncorrelated between actuator positions. The least squares solution is therefore a better solution than would first appear. Whatever M matrix is used, the new phase differences between atmospheric wavefront and deformable mirror are fed to the control loop of each actuator. This controller provides the required actuator position. We often use a simple first order difference equation for this step.

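$$ p_{k+1} = p_k + g\,\Delta\phi_k $$

where p_k is the position of an actuator at sample k, \Delta\phi_k is the newly reconstructed difference between the atmospheric wavefront and the deformable mirror at that actuator, and g is the loop gain. This says that we just add a weighted value of the new difference measurement to each of the original actuator positions to obtain a new position for the next sample.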

    The bandwidth of the system is controlled by the sample frequency and the loop gain of the system. For the least squares reconstructor, the servo bandwidth is given by [equation: servo bandwidth] and the noise gain of the system by [equation: noise gain], where the noise term in the latter is proportional to the measurement noise of the wavefront sensor. We should note that the system noise gain increases rapidly as g approaches 2, at which point the servo becomes unstable. For this reason the loop gain is rarely set above one.
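The behaviour near g = 2 can be seen in a toy simulation of a single actuator. The Python sketch below assumes the simple update rule described above (new position = old position + g x measured difference), a static wavefront, no loop delay and white measurement noise. Under those assumptions the residual variance works out to g/(2 - g) times the measurement noise variance; this ratio is derived from the toy model and is not claimed to be the expression from the original page.

```python
import numpy as np

def simulate_residual_variance(g, n_steps=100_000, noise_sigma=1.0, seed=0):
    """Variance of the residual error of a one-actuator integrator loop (toy model)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_sigma, n_steps)
    wavefront = 0.0          # static wavefront, so the residual is pure noise
    position = 0.0
    residuals = np.empty(n_steps)
    for k in range(n_steps):
        residual = wavefront - position
        residuals[k] = residual
        measured = residual + noise[k]       # the sensor sees residual plus noise
        position = position + g * measured   # first order difference equation
    return residuals[1000:].var()            # discard the start-up transient

def predicted_noise_gain(g):
    """Residual variance / measurement variance for the delay-free toy loop."""
    return g / (2.0 - g)

for g in (0.3, 1.0, 1.8):
    print(f"g={g}: simulated={simulate_residual_variance(g):.3f}, "
          f"predicted={predicted_noise_gain(g):.3f}")
```

In this model the loop is stable only for 0 < g < 2, and the noise passed to the mirror rises steeply as g approaches 2. A real system has read-out and computation delays, which reduce the usable gain further.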

The approach used in ChAOS was to have a set of precomputed M matrices, each one optimized for a range of different observing conditions (source brightness and atmospheric seeing) and to change this matrix and the servo gain factor independently, by trial and error, so as to obtain the best system performance.

There exist general trades between how much computing we need to do (M can be sparse) and the sample frequency.
Under conditions of high turbulence, significant scintillation exists, including singularities in the wavefront, and different approaches must be used.


The performance of an AO system depends on how well you can calibrate out systematic effects. It is essential to have some point source (often a single mode fiber) at a convenient position in the AO system so that you can measure and remove systematic errors in the system.

Non-Common Path Errors

Looking at the feedback loop diagram above, we see that we actually control the wavefront reflected from the beamsplitter. Any differences in aberration introduced into the beam as it passes through the beamsplitter (such as astigmatism because the plate is in a converging beam) are not detected and reduce the final image quality. It is usual to introduce additional optics, such as a thinly wedged plate, into the beam before the wavefront sensor to correct for this effect. Non-common path errors are generally difficult to eliminate and are often responsible for poor correction of the system.

6:18 pm Feb 23 2010 Edward Kibblewhite