A long time ago I did a blog post on how to do AR on the iPhone. The code was written pre iOS4.0 and used the UIGetScreenImage function. I’ve now updated the code to use the new 4.0 features.
The code is available here
With the advent of iOS4.0 we now have proper access to the real time camera stream and the use of UIGetScreenImage is not allowed anymore.
To access the camera we now use the AV Foundation framework. There are a number of really good resources on how to use this. The WWDC video “Session 409 - Using the Camera with AV Foundation”, along with the sample code from that session, is a great resource (registered iPhone devs can download all the WWDC videos and sample code from iTunes by clicking here: https://developer.apple.com/videos/wwdc/2010/).
To access the camera we need to add the following frameworks to our application:
- AVFoundation
- CoreVideo
- CoreMedia
The first step is to create an AVCaptureSession
// Create the AVCapture Session
session = [[AVCaptureSession alloc] init];
If you want to show what the camera is seeing then you can create a preview layer - this is handy if you just want to draw stuff on top of the camera preview.
// create a preview layer to show the output from the camera
AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:session];
// use bounds rather than frame - the layer lives in previewView's own coordinate space
previewLayer.frame = previewView.bounds;
[previewView.layer addSublayer:previewLayer];
In the code above I’ve assumed that you’ve created a view called previewView in your view hierarchy and positioned it where you want the camera preview to appear. The preview image will be aspect scaled to fit in the frame you specify.
The next step is to get hold of the capture device - here we just ask the OS to give us the default device that supports video. This will normally be the back camera. You can use the AVCaptureDevice class to query what devices are available using the devicesWithMediaType: method.
// Get the default camera device
AVCaptureDevice* camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
We now need to create a capture input with our camera
// Create an AVCaptureInput with the camera device
NSError *error=nil;
AVCaptureInput* cameraInput = [[AVCaptureDeviceInput alloc] initWithDevice:camera error:&error];
And configure the capture output. This is a bit more complicated. We allocate an instance of AVCaptureVideoDataOutput and set ourselves as the delegate to receive sample buffers - our class will need to implement the AVCaptureVideoDataOutputSampleBufferDelegate protocol. We need to provide a dispatch queue to run the output delegate on - you can't use the default queue. We also set our desired pixel format - there are a couple of recommended output formats: kCVPixelFormatType_32BGRA on all devices, kCVPixelFormatType_422YpCbCr8 on the 3G device and kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange on the 3GS and 4 devices. In this example we'll just use kCVPixelFormatType_32BGRA.
// Set the output
AVCaptureVideoDataOutput* videoOutput = [[AVCaptureVideoDataOutput alloc] init];
// create a queue to run the capture on
dispatch_queue_t captureQueue=dispatch_queue_create("captureQueue", NULL);
// setup our delegate
[videoOutput setSampleBufferDelegate:self queue:captureQueue];
// configure the pixel format
videoOutput.videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:[NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA], (id)kCVPixelBufferPixelFormatTypeKey,
nil];
We can also configure the resolution of the images we want to capture.
Preset | iPhone 3G | iPhone 3GS | iPhone 4 (back) | iPhone 4 (front) |
AVCaptureSessionPresetHigh | 400x304 | 640x480 | 1280x720 | 640x480 |
AVCaptureSessionPresetMedium | 400x304 | 480x360 | 480x360 | 480x360 |
AVCaptureSessionPresetLow | 400x306 | 192x144 | 192x144 | 192x144 |
AVCaptureSessionPreset640x480 | NA | 640x480 | 640x480 | 640x480 |
AVCaptureSessionPreset1280x720 | NA | NA | 1280x720 | NA |
AVCaptureSessionPresetPhoto | NA | NA | NA | NA |
// and the size of the frames we want
[session setSessionPreset:AVCaptureSessionPresetMedium];
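The preset you pick decides how much data arrives with every frame, which matters if you plan to process the pixels yourself. As a rough sketch in plain C (which you can call from Objective-C as-is), this works out the minimum size of one frame in the 32BGRA format used above. The function name bgra_frame_bytes is made up for this example, and the real buffer may be larger if its rows are padded.

```c
#include <stddef.h>

/* Hypothetical helper: minimum number of bytes in one 32BGRA frame.
 * Each pixel takes 4 bytes (blue, green, red, alpha).
 * The actual pixel buffer can be bigger than this if rows are padded. */
size_t bgra_frame_bytes(size_t width, size_t height)
{
    return width * height * 4;
}
```

For example, a 480x360 frame (AVCaptureSessionPresetMedium on the 3GS and 4) comes to 691,200 bytes, while a 1280x720 frame comes to 3,686,400 bytes, so there are over five times as many bytes to crunch per frame.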
Finally we need to add the input and outputs to the session and start it running:
// Add the input and output
[session addInput:cameraInput];
[session addOutput:videoOutput];
// Start the session
[session startRunning];
We’ll now start to receive images from the capture session in our implementation of captureOutput:didOutputSampleBuffer:fromConnection:.
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
To access the video data we need to use some of the functions from CoreVideo and CoreMedia. First we get hold of the image buffer from the sample buffer. We then need to lock the image buffer and then we can access its data.
// this is the image buffer
CVImageBufferRef cvimgRef = CMSampleBufferGetImageBuffer(sampleBuffer);
// Lock the image buffer
CVPixelBufferLockBaseAddress(cvimgRef,0);
// access the data
size_t width=CVPixelBufferGetWidth(cvimgRef);
size_t height=CVPixelBufferGetHeight(cvimgRef);
// get the raw image bytes
uint8_t *buf=(uint8_t *) CVPixelBufferGetBaseAddress(cvimgRef);
size_t bprow=CVPixelBufferGetBytesPerRow(cvimgRef);
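// .... do something useful with the image here .....
// unlock the image buffer once we have finished with the data
CVPixelBufferUnlockBaseAddress(cvimgRef,0);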
With the image data you have a number of options. You can turn it into a UIImage:
CGColorSpaceRef colorSpace=CGColorSpaceCreateDeviceRGB();
CGContextRef context=CGBitmapContextCreate(buf, width, height, 8, bprow, colorSpace, kCGBitmapByteOrder32Little|kCGImageAlphaNoneSkipFirst);
CGImageRef image=CGBitmapContextCreateImage(context);
CGContextRelease(context);
CGColorSpaceRelease(colorSpace);
UIImage *resultUIImage=[UIImage imageWithCGImage:image];
CGImageRelease(image);
Or you can just process the data directly. The image will be stored in BGRA format, so each row of image data has pixels consisting of 4 bytes each - blue,green,red,alpha, blue,green,red,alpha, blue,green,red,alpha etc…
If you’re doing some image processing then you’ll probably not bother with creating a UIImage and just go straight for crunching the raw bytes.
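To give an idea of what crunching the raw bytes can look like, here is a minimal sketch in plain C (callable from Objective-C as-is) that converts a BGRA buffer to grayscale. This is not the code from the demo project. The function name bgra_to_gray and its parameters are made up for the example, and it assumes the caller has allocated a tightly packed output buffer of width * height bytes. Note that it steps from row to row using the bytes-per-row value rather than width * 4, since a row can have padding at the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a 32BGRA image to 8-bit grayscale.
 *   src         - first byte of the BGRA data (buf in the code above)
 *   width       - image width in pixels
 *   height      - image height in pixels
 *   bytesPerRow - distance in bytes between the start of each row
 *                 (bprow in the code above); may be more than width * 4
 *   dst         - caller-allocated output, width * height bytes, no padding
 */
void bgra_to_gray(const uint8_t *src, size_t width, size_t height,
                  size_t bytesPerRow, uint8_t *dst)
{
    for (size_t y = 0; y < height; y++) {
        /* start of row y - skips over any padding on the previous rows */
        const uint8_t *row = src + y * bytesPerRow;
        for (size_t x = 0; x < width; x++) {
            const uint8_t *p = row + x * 4;
            unsigned b = p[0];
            unsigned g = p[1];
            unsigned r = p[2];
            /* p[3] is alpha, which we ignore */
            /* integer approximation of 0.299*R + 0.587*G + 0.114*B */
            dst[y * width + x] = (uint8_t)((77 * r + 150 * g + 29 * b) >> 8);
        }
    }
}
```

In the delegate method you would call something like this with buf, width, height and bprow while the image buffer is still locked.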
I’ve put together a simple demo project here. The image processing is just a demo - don’t try and use it for anything important.