One of the last bastions of Flash and native apps is the processing of video from outside sources such as webcams. Adding this functionality to HTML5 does not seem difficult at all.
I don’t have much experience in designing these kinds of specs (though I did request <audio> and <video> elements some two and a half years ago), but here are some design notes that seem to make sense:
Get outside video
Create a specific src value for video, for instance src="webcam", to get image data from a currently installed webcam [1]. The user-agent can mediate the presence of cameras and the routing of sources. This gives the user-agent a way to get device video into the web application.
Besides augmented reality, this could be used for most webcam-related applications on websites, but those would need further facilities for retrieving and transmitting the video stream.
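A minimal sketch of how a user-agent might tell the proposed magic value apart from an ordinary URL. The `VideoSource` type and the `resolveVideoSource` function are made up for illustration; nothing here is an existing browser API.

```typescript
// Hypothetical result of interpreting a <video> src attribute.
type VideoSource =
  | { kind: "device"; device: "webcam" }
  | { kind: "url"; url: string };

// "webcam" is the magic value proposed in the text;
// every other value is treated as an ordinary media URL.
function resolveVideoSource(src: string): VideoSource {
  if (src.trim().toLowerCase() === "webcam") {
    return { kind: "device", device: "webcam" };
  }
  return { kind: "url", url: src };
}
```

Note that under this sketch a media file whose relative URL is literally "webcam" could no longer be addressed, which is one reason a magic value may not be the best design.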
Get at the frames
To get at the raw video data, it would seem fitting to add a method to HTMLVideoElement (like the one the canvas 2D context already has) that returns an ImageData object for the current frame of video. When the video is paused, this would return the paused frame; when the video is playing, it would return whichever frame happens to be current, so that applications can retrieve frames of video as fast as they can process them.
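A rough sketch of what polling for frames could look like. The method name `getImageData()` on the video object is an assumption borrowed from the canvas context, and `FrameGrabbingVideo`, `FrameData`, `meanBrightness` and `pollFrames` are hypothetical names; the brightness calculation just stands in for real per-frame work.

```typescript
// Minimal stand-in for the ImageData shape: RGBA bytes, row by row.
interface FrameData {
  width: number;
  height: number;
  data: Uint8ClampedArray;
}

// Hypothetical: a video element with the proposed method. Not a real DOM API.
interface FrameGrabbingVideo {
  paused: boolean;
  getImageData(): FrameData;
}

// Example per-frame work: mean brightness over all pixels (0..255).
function meanBrightness(frame: FrameData): number {
  let sum = 0;
  for (let i = 0; i < frame.data.length; i += 4) {
    sum += (frame.data[i] + frame.data[i + 1] + frame.data[i + 2]) / 3;
  }
  return sum / (frame.width * frame.height);
}

// Grab whichever frame is current, process it, and repeat
// until the video pauses or maxFrames results have been collected.
function pollFrames(video: FrameGrabbingVideo, maxFrames: number): number[] {
  const results: number[] = [];
  while (!video.paused && results.length < maxFrames) {
    results.push(meanBrightness(video.getImageData()));
  }
  return results;
}
```

In a real page the loop would have to yield between iterations, for example with a timer, so that playback can advance; frames that arrive while one is still being processed are simply skipped.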
Alternatively, register a callback function on the <video> element to which every video frame is pushed.
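The callback variant might look roughly like this. `FramePushingVideo`, `addFrameCallback` and `deliver` are invented names standing in for the <video> element and its decoder; this is a sketch of the idea, not an existing interface.

```typescript
// Minimal stand-in for the ImageData shape: RGBA bytes, row by row.
interface FrameData {
  width: number;
  height: number;
  data: Uint8ClampedArray;
}

type FrameCallback = (frame: FrameData, index: number) => void;

// Hypothetical stand-in for a <video> element that pushes frames to listeners.
class FramePushingVideo {
  private callbacks: FrameCallback[] = [];
  private frameCount = 0;

  // Register a function that receives every decoded frame.
  addFrameCallback(callback: FrameCallback): void {
    this.callbacks.push(callback);
  }

  // Called by the (imaginary) decoder each time a frame is ready.
  deliver(frame: FrameData): void {
    const index = this.frameCount++;
    for (const callback of this.callbacks) {
      callback(frame, index);
    }
  }
}
```

Unlike polling, every frame reaches the callback in order, together with its index, which is what an application would need if it wants to store all frames rather than sample them.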
If you can reliably extract all video frames and store them locally, you may even be able to build a non-linear video editing application.
Process and redraw
Processing the frames to create an augmented reality is left as an exercise for the reader.
Ideally each frame of video could also be rendered into a canvas where the client could draw other primitives on top of the video frames. This seems to be necessary for the augmented part of the augmented reality.
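To make the "draw on top" step concrete, here is a small sketch that paints a rectangle outline straight into a frame's pixel data, the kind of marker an augmented reality demo might put around a detected object. `FrameData`, `Rgba` and `strokeRect` are illustrative names; in a browser the canvas 2D context's own drawing calls would do this job.

```typescript
// Minimal stand-in for the ImageData shape: RGBA bytes, row by row.
interface FrameData {
  width: number;
  height: number;
  data: Uint8ClampedArray;
}

type Rgba = [number, number, number, number];

// Draw the outline of a rectangle into the frame's pixels,
// clipping anything that falls outside the frame.
function strokeRect(
  frame: FrameData, x: number, y: number, w: number, h: number, color: Rgba
): void {
  const setPixel = (px: number, py: number): void => {
    if (px < 0 || py < 0 || px >= frame.width || py >= frame.height) return;
    frame.data.set(color, (py * frame.width + px) * 4);
  };
  for (let px = x; px < x + w; px++) {
    setPixel(px, y);         // top edge
    setPixel(px, y + h - 1); // bottom edge
  }
  for (let py = y; py < y + h; py++) {
    setPixel(x, py);         // left edge
    setPixel(x + w - 1, py); // right edge
  }
}
```

The frame is modified in place, so the same buffer can be handed on for display once all overlays have been drawn.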
Update: This already seems to be possible by putting the image back into a canvas, but I don’t think that would sync up the audio properly.
So all that is needed is the addition of an extra source and an extra method to the <video> element. Doesn’t seem like that much, does it?
[1] Though there may be better ways than a ‘magic value’ for the source attribute.