A Raman spectrum (or any other spectrum) is a plot of light intensity against wavelength (generally in Angstroms/nanometres)... When working with 8-bit grayscale images, the intensity is a single value per pixel, and using grayscale avoids the issues involved in manipulating RGB (where several RGB codes can map to the same colour/wavelength).
If one were to align the bottom axis of a 300Kpx sensor (640 x 480 = 307,200 pixels) with the spectrum, then you could, hypothetically, align the visible-NIR wavelengths (the ones a CMOS/CCD camera can actually see, from about 400nm to around 900nm) so that you get one wavelength (nm) per pixel.
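That pixel-to-wavelength mapping is just a linear function. A minimal sketch (assuming the hypothetical 1nm/pixel alignment starting at 400nm; the real slope and offset would come from the neon calibration described below):

```python
def pixel_to_wavelength(pixel, start_nm=400.0, nm_per_pixel=1.0):
    """Map a pixel column index to a wavelength in nm, assuming linear dispersion."""
    return start_nm + pixel * nm_per_pixel

# With the hypothetical 1 nm/pixel alignment:
print(pixel_to_wavelength(0))    # 400.0 nm
print(pixel_to_wavelength(500))  # 900.0 nm
```

The defaults are placeholders; once the neon peaks are located, `start_nm` and `nm_per_pixel` get replaced by fitted values.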
You then align the long axis vertically against that, so that over the full 640px you split it into 3 strips - 3 x 210px (which leaves 2 x 5px for the separators) - which would be the pixels you'd expose to the 3 different light sources (two derived from the 532nm laser - one passed through the reference sample, the other through the analyte+reference - and one from a neon bulb).
We know the spectral characteristics of a neon bulb, which allows us to calibrate the spectrometer: its peaks are at known wavelengths and, given we are using a single CMOS/CCD sensor and a single grating, those peaks will sit at the same positions in the other two parts of the picture. Because the scattered wavelengths (whether dispersed by a reflective grating or a transmission grating) increase across the sensor in an essentially linear fashion for most of our purposes (i.e. they increase in a way we can predict), we could (and many companies do) use CMOS/CCD sensors to collect the spectrum, assigning either one pixel or a bunch of pixels to each wavelength (depending on the sensor's resolution and the range required).
Thus, a pixel's position along the short axis of the CMOS/CCD image correlates effectively (within limits) to its wavelength in nm. So if we plot pixel position on the bottom axis (corrected against the known spectral response of the neon bulb, with wavelengths in nm assigned from that), then we can plot the intensity of the light at that wavelength by merely reading the intensity at that pixel position (realistically, you'd average all the pixels in that column and use the average intensity).
Therefore, we graph the average intensity of every pixel column (from a 2-300px high image) along the bottom axis (assigning every pixel position a wavelength in nm), plotting pixel position/wavelength v intensity.
As the nearest equivalent hardware - which includes neither self-alignment on every shot nor a reference run alongside the analyte on every shot - costs several thousand US$, this could make for quite an interesting amateur project (especially given the commercial units use essentially the same hardware in essentially the same way).
PS That Python Imaging Library looks fucking interesting, I'll browse through that, nice one

I wonder if I should build the software/desktop application on XULRunner? It would allow for real time comparison with the various free-access Raman spectral libraries...
EDIT
I was thinking about it: once we get the image from TWAIN (which we can now do with Java using a workaround), we can then play with it...
Say for example we are using a 532nm laser and a 550nm edge filter - that means only the remainder of the visible spectrum (above 550nm) is likely to be of any interest whatsoever...
Now, let us also say we are using a 640x480px (300Kpx) CMOS/CCD camera module (which, being outdated, is cheap as shit).
Now, we want to split the beam (once we narrow it - I've gone into that elsewhere) to give us a Raman spectrum of both the reference sample (solvent only) and the analytical sample (solvent + analyte). We also want to calibrate the spectrometer, and a neon bulb uses fuck-all power and costs a couple of bucks.
Ok, that being so, we'll need some orange glass (an alternative edge filter for >550nm is orange glass from Schott) to remove reflected 532nm light and Rayleigh-scattered light. The anti-Stokes light is shifted to shorter wavelengths, so only the longer-wavelength Stokes-shifted light will be acquired. It is apparently best to collect this perpendicular to the beam, so put a mirror on the other side to improve the amount of Stokes-shifted light collected through the >550nm edge filter. Pass that light into optical fibre (cheap as chips in short lengths; it is available on auction sites), then into the slitless spectrometer.
Now, the spectrometer in question is going to be the tricky bit - it will have one grating and three light sources (the Stokes-scattered light from both the reference and analytical samples, plus the calibrating light from the neon bulb). Each of these will enter the spectrometer at a different point, and the dispersed light will pass between separating walls, which extend all the way to the CCD/CMOS sensor but still allow us to collect 3 sub-images on one image.
Now, given that the visible spectrum runs from about 400nm to approximately 800nm, and we have nothing under 550nm, I suggest we use the long side of the sensor (640px x 480px) to collect the spectrum, limiting it to 500px (n.b. 250nm/500px = 0.5nm per pixel resolution, which will be shit-hot when it is made to work). As the grating is the same for all three light sources, and the light source for both the reference and analytical samples is the same laser, everything is directly comparable.
First off, in order to work with the image, we'll have to identify where our sub-pictures are, then pass that information to the program (Java - easier to build & distribute, i.e. it is free whereas other options are not, plus there are far more tutorials on doing this in Java than in most languages).
We then use that data to instruct the program to iterate through the pixels of each sub-image, using an algorithm to determine the intensity/luminosity of each pixel and keeping it in memory for each column (from y1:y150 and x1:x500 - three 150px strips, plus separators, fit within the 480px short axis). Then we can do something cool and average out the luminosity/intensity of x1-x500 by adding together the values of each column y1:y150 and dividing by 150. That will reduce the effects of stray pixels and noise dramatically.
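That column-averaging step can be sketched in a few lines (pure Python here for illustration, though the post proposes Java for the actual app; the 2D list stands in for one grayscale sub-image, rows y down, columns x across):

```python
def column_averages(subimage):
    """Average the grayscale intensity of each column of a 2D sub-image.

    subimage is a list of rows, each row a list of pixel intensities (0-255).
    Returns one averaged intensity per column, which damps stray pixels/noise.
    """
    n_rows = len(subimage)
    n_cols = len(subimage[0])
    return [sum(row[x] for row in subimage) / n_rows for x in range(n_cols)]

# Tiny 3-row, 4-column example: each column collapses to its mean.
img = [[10, 20, 30, 40],
       [12, 22, 32, 42],
       [14, 24, 34, 44]]
print(column_averages(img))  # [12.0, 22.0, 32.0, 42.0]
```

On the real sensor the rows would be the 150px strip height and the columns the 500px spectral axis; the logic is identical.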
We keep that array in memory for all 3 sub-images. Then we look at the intensities of the calibration source (the neon light) and, in accordance with its known spectral peaks, assign absolute wavelengths to the x-axis of all 3 images (not quite linear, but close enough - if one were interested enough, it could be worked out mathematically).
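A minimal sketch of that calibration step, assuming we can locate two known neon emission lines in the calibration sub-image (585.25nm and 640.22nm are real neon lines; the pixel positions below are made up purely for illustration):

```python
def linear_calibration(px1, nm1, px2, nm2):
    """Fit wavelength = slope * pixel + intercept from two known peak positions."""
    slope = (nm2 - nm1) / (px2 - px1)
    intercept = nm1 - slope * px1
    return lambda px: slope * px + intercept

# Hypothetical: the 585.25nm neon line lands on pixel 70, the 640.22nm line on pixel 180.
to_nm = linear_calibration(70, 585.25, 180, 640.22)
axis = [to_nm(px) for px in range(500)]  # a wavelength for every spectral pixel
```

Because all three strips share one grating and one sensor, the same `to_nm` mapping applies to the reference and analytical sub-images too. More neon lines and a least-squares fit would handle the "not quite linear" part.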
We could then draw a graph of average luminosity/intensity (y) v wavelength (essentially a linear progression) for all 3 sub-pictures; in addition, having subtracted the relevant (x,y) values of the solvent spectrum from the analytical spectrum, we would have a good Raman spectrum of the analyte itself.
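The solvent subtraction is just a point-by-point difference of the two averaged spectra (clamping negatives to zero is my assumption here, so noise can't drive the result below the baseline - it's not stated in the post):

```python
def subtract_reference(analytical, reference):
    """Subtract the solvent-only spectrum from the solvent+analyte spectrum.

    Both inputs are per-column average intensities on the same wavelength axis.
    Negative differences (pure noise) are clamped to zero - an assumption,
    not part of the original write-up.
    """
    return [max(a - r, 0) for a, r in zip(analytical, reference)]

analytical = [5, 9, 30, 9, 5]   # solvent + analyte
reference  = [5, 8, 10, 8, 6]   # solvent only
print(subtract_reference(analytical, reference))  # [0, 1, 20, 1, 0]
```

What survives the subtraction is, within noise, the analyte's own Raman signal.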
Provided we can access TWAIN and it lets us process the images from the spectrometer without first saving them, reopening them and going through a shitload of extra (manual) manipulation, this "COULD" work... In fact, it should work - the per-pixel luminosity algorithm above is the same idea used to convert RGB pictures to 8-bit grayscale, the use of CCD/CMOS sensors for this job is VERY OLD NEWS, and provided we can get the resolution worked out, it would be one hell of an analytical instrument (about the size of an iPhone). Even better, it may be possible to power the laser, the neon light and the CCD/CMOS camera from a USB 2.0 port, which would make it hell portable.
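For reference, that standard RGB-to-grayscale conversion is a weighted sum per pixel - these are the ITU-R BT.601 luma weights, the same ones PIL uses for its 8-bit 'L' mode:

```python
def rgb_to_gray(r, g, b):
    """Convert one RGB pixel to an 8-bit grayscale intensity (ITU-R BT.601 weights)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(rgb_to_gray(255, 255, 255))  # 255 (white)
print(rgb_to_gray(0, 0, 0))        # 0   (black)
print(rgb_to_gray(0, 128, 0))      # 75  (mid green)
```

Green is weighted heaviest because the eye (and, conveniently, most CMOS sensors) is most sensitive there.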