5 Frame sizes for advanced users

This section covers resizing: which frame size you should capture, and whether and how you should resize afterwards. It is meant for people who are familiar with basic capturing and want to know a bit more about it. Subjects like bandwidth, Nyquist sampling, the ITU-601 standard, capture card scalers and determining the active capture window are treated in some detail. It also explains why these subjects matter for the capture process, and how they influence the recommended capture size and the resizing that follows.

If you know about DVD resizing and you think that resizing analogue captures works the same way, then you should read this section very carefully. It is important to realize that the card always samples at the same fixed sample rate and downsizes (resamples) the picture afterwards according to the settings provided by the driver and the capture application, regardless of which capture size you request. For many practical reasons all cards sample at a rate much higher than the bandwidth of the video signal. These reasons include maintaining line sync and having proper ADC anti-alias filtering, to name two. If you think about it, this makes sense: supporting many rates would be expensive. What this means is that if your application/capture process allows more than one size for a given TV system, the same image will be sampled at a fixed rate initially, and scaled to the size you requested afterwards.

This would be fine if the card/drivers scaled and then cropped/added borders to hit the requested 'standard' size. However, they don't seem to do this. In addition, depending on which driver is used, a given card does not even sample/capture the same amount of the analogue picture. Thankfully, for a given driver, you seem to get the same section of the analogue picture, regardless of the pixel frame size you requested.

To summarize: card/driver combinations capture a constant part of the analogue picture/signal. If the driver offers more than one frame size option, then the constant frame size is simply resized to the requested frame size. This means, if you don't ask for the 'native' size for the card/driver, and don't resize after capture, the picture will be distorted.

This section explains how to resize to the correct target size. How this should be done depends on a number of things: whether you live in a PAL or an NTSC country (subsection 5.1 explains a few things about the PAL and NTSC standards), your capture size, your active capture window (discussed in subsection 5.4) and the end format. Subsection 5.5 gives the minimal recommended capture size. Subsection 5.6 discusses how to determine the correct resizing settings.

5.1 The guideline for digitizing

This subsection devotes a few words to the ITU recommendation Rec. ITU-R BT.601-5 (a guideline for the digital coding of the analogue TV signal). This information is necessary to understand how you may (1) resize your captured video. One needs pixels for digital storage. Therefore one needs a guideline which prescribes how the analogue signals can be described digitally. This guideline is given as the Recommendation ITU-R BT.601-5. Some important issues in this guideline:

References:
13.5 MHz: Why 13.5 MHz?
ITU-R BT.601-5 recommendation: The guideline for converting analogue signal into computer pixels.
Der Karl's Capture Karten aspect ratio fuer Dummies: Der Karl's Capture Card Aspect Ratio for Dummies (in German).
Capture-Cards and aspect-ratio for Dummies: Der Karl's Capture Card Aspect Ratio for Dummies (translated by Arachnotron).
Leopold's Home Video Formats Page: Some information on various video formats.
Video Signal Standards and Conversion Page: Great site about video signal standards with a lot of links!
Bandwidth Versus Video Resolution: Performance requirements for various video standards.
Video Basics: About the fundamentals of analogue video.
EBU R92-1999: Active picture area and picture centering in analogue and digital 625/50 television systems.
EBU Technical Information I15-1998: Testing for conformity with ITU-R Recommendations BT.601 and BT.656.

5.2 Sampling and Nyquist's theorem

To understand the "basic resize rules" in the introduction of this section, a few words will be said about sampling and the internal workings of capture cards. As already mentioned, TV transmission is analogue. The process of digitizing it in the capture card is called sampling, analogue to digital conversion, or simply digitizing. Mathematically it just means that a waveform is discretized into a certain number of parts, and these parts are called samples. As mentioned above, the sample rate is always the same (for bt8x8/cx2388x cards, for example, NTSC: 28.64 MHz and PAL: 35.48 MHz (2)), independent of the chosen capture size. The sample rate can't be changed by the user. After sampling, the clip is resized to the resolution which you chose for capturing. (How the resize is done is controlled by the driver.)

One of the reasons for this high sample rate is the need to satisfy Nyquist sampling. Nyquist's theorem, also known as Shannon's theorem, describes the minimal sample rate needed to sample an analogue signal so that it can be reconstructed again without information loss. This sample rate must be at least twice the bandwidth of the signal. There are two conditions to be met for Nyquist to be valid:

  1. The analogue signal must be filtered to remove all frequencies above half the sample rate (noise, for instance). A low pass filter called an anti-aliasing filter (3) does this. Each capture device has one built in, so you need not worry about that.
  2. The playback device must use an ideal low pass filter to recreate the signal from the samples. This filter is non-causal and infinite, which is not possible in practice.

The last condition is never met in practice. That is, the ideal reconstruction filter to recreate the signal doesn't exist. This implies that the optimal capture size is higher than twice the bandwidth. How much higher? We don't know. To be safe, try to cap as high as possible.

So, Nyquist should only be taken as a guideline here (that is, as an absolute minimum). Take for example a standard VHS tape. The bandwidth of the VCR signal is 3 MHz. You need at least 2*3 = 6 MHz to sample it. But don't worry; all modern capping cards are sampling well above 6, 12 or 24 MHz (2). However, when scaling down (using the driver itself by selecting a suitable capture size, or using an editing program like VirtualDub or AviSynth), make sure that you don't resize below twice the bandwidth. So for our VHS tape, don't go below 6 MHz * 52 µs = 312 pixels horizontally (PAL). The optimal final size, however, will be larger than this.
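The VHS arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name and parameters are made up for this example, and the result is the absolute minimum discussed above, not a recommended size.

```python
import math


def nyquist_min_width(bandwidth_mhz: float, active_line_us: float) -> int:
    """Smallest horizontal size (in pixels) that still samples the
    active line at twice the signal bandwidth.

    MHz multiplied by µs gives a plain sample count, so no unit
    conversion is needed.
    """
    samples = 2.0 * bandwidth_mhz * active_line_us
    # Round away float noise before taking the ceiling.
    return math.ceil(round(samples, 6))


# VHS on PAL: 3 MHz bandwidth, about 52 µs of active picture per line.
print(nyquist_min_width(3.0, 52.0))
```

For the VHS case this prints 312, the same number as the hand calculation.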

In fact, Nyquist's theorem doesn't even fully apply here. Nyquist is about a complete analogue-digital-analogue cycle. Most people break off halfway through the cycle to do something different. The correct cycle: analogue signal from VHS -> digitized -> fed into a DAC (digital to analogue converter) to reconstruct the original analogue signal. An incorrect cycle: VHS -> digitized -> MPEG2 -> RGB. In the second path the original signal is not reconstructed. Instead, the samples are lossily compressed, and the result is later converted into a completely different type of analogue signal.

References:
Sampling and Nyquist's Theorem for Audio and Video: This one is simple and also addresses playback.
Basic Signal Processing: Nice document about signal reconstruction. 
The Scientist and Engineer's Guide to Digital Signal Processing: Have a look at chapter 3 which is about analogue to digital (and digital to analogue) conversion. Great non-technical introduction.

5.2.1 Sample rates of some standard formats

It is important to understand that the size of the frame directly relates to the sampling rate used to create the pixels. If a 53.333 µs analogue line is sampled at 14.32 MHz, you get about 764 pixels. If this same line is sampled at 13.5 MHz, you get 720 pixels. But both cover the same area. This concept of sampling can be reversed to understand how digital is turned back into analogue. If a device knows the standard, it knows how to create the pixels or what to do with them.

Here are some 'standard' sampling rates: (Note: sample rate (in MHz) * width (in µs) = pixels.)

Standard                            Sample rate (MHz)   Width (µs)   Pixels
DV/DVD                              13.5                53.333       720
DVD                                 13.5                52.148       704
SVCD                                9                   53.333       480
VCD (1)/CVD                         6.75                52.148       352
PAL TV on a PC                      14.769 (2)          52           768
1/2 (or 1/4 (1)) PAL TV on a PC     7.3845 (2)          52           384
NTSC TV on a PC                     12.3064 (2)         52.666       648
1/2 (or 1/4 (1)) NTSC TV on a PC    6.153 (2)           52.666       324

(1) VCD, 1/4 PAL and 1/4 NTSC have half the height of the other "standards"
(2) There are no devices that sample to or from PC pixels. The rates are derived from the fact that PC pixels are the same height as width, and screens have an aspect of 4:3. Given this, a full frame of PAL TV of 576 lines plays on a PC in a 768x576 frame. NTSC TV plays on a PC in a 648x486 frame.
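As a quick cross-check of the rule "sample rate (in MHz) * width (in µs) = pixels", the rows of the table can be recomputed with a short Python sketch. The names STANDARDS and pixel_width are only illustrative, and rounding to the nearest whole pixel is an assumption made here to absorb the rounded values in the table.

```python
# (name, sample rate in MHz, width in µs, pixels listed in the table)
STANDARDS = [
    ("DV/DVD", 13.5, 53.333, 720),
    ("DVD", 13.5, 52.148, 704),
    ("SVCD", 9.0, 53.333, 480),
    ("VCD/CVD", 6.75, 52.148, 352),
    ("PAL TV on a PC", 14.769, 52.0, 768),
    ("1/2 PAL TV on a PC", 7.3845, 52.0, 384),
    ("NTSC TV on a PC", 12.3064, 52.666, 648),
    ("1/2 NTSC TV on a PC", 6.153, 52.666, 324),
]


def pixel_width(rate_mhz: float, width_us: float) -> int:
    """Pixels per line: MHz times µs is a plain sample count."""
    return round(rate_mhz * width_us)


for name, rate, width, listed in STANDARDS:
    computed = pixel_width(rate, width)
    status = "ok" if computed == listed else "MISMATCH"
    print(f"{name:22s} {rate:8.4f} MHz * {width:7.3f} µs = {computed:4d} ({status})")
```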

References:
Digital video resolutions: A Quick Guide to Digital Video Resolution and Aspect Ratio Conversions.

5.3 Capture card scalers

If you still capture at a size which is too low (for whatever reason), the image may be degraded too much because of lousy vertical scalers. This subsection gives some examples for different capture devices. Some size thresholds are given below which image quality rapidly deteriorates (usually because the vertical scaler destroys the sharpness).

A VHS source is captured at multiple sizes, and the blurriness is compared by looking at a common frame (a frame with a lot of detail is chosen). Below, screenshots are given for only two sizes. The first one has about the same quality as the one captured at full size; the second one is the first size which is noticeably blurrier.

If your capture device is not listed, you can do the following test to check it yourself:
Capture a vhs-source at different horizontal sizes: 720, 704, 640, 480, 400, 384, 368 and 352 (for PAL and NTSC). Find a common frame which contains a lot of detail (books, rasters, text, etc.). Compare the different sizes, and find the smallest one which has the same quality as the one at full PAL or NTSC. Never capture below this size, except if you have to (due to hardware requirements for example).

5.3.1 Capture card scalers (PAL)

bt8x8:

[Screenshots: bt878 at 400x576 (upsized to 768x576); bt878 at 384x576 (upsized to 768x576)]

There is a clear difference in blurriness when capping at different sizes. Starting from 480x576, the image becomes blurrier. 400x576 is still acceptable, but lower sizes are very blurred.

So, if possible, make sure that you capture at 400x576 or higher.

cx2388x:

[Screenshots: cx23881 at 720x576 (downsized to 384x576); cx23881 at 384x576]

The difference between 720x576 and 384x576 is minimal. There is no poor vertical resizer at work when capping at low sizes. Thus, you can use 384x576.

SAA71xx:

[Screenshots: SAA7134 at 704x576 (downsized to 384x576); SAA7134 at 384x576]

The difference between 704x576 and 384x576 is minimal. There is no poor vertical resizer at work when capping at low sizes. Thus, you can use 384x576.

ati cards with Theater chip:

[Screenshots: ATI AIW Radeon (Rage Theater chip) at 480x576; ATI AIW Radeon (Rage Theater chip) at 400x576]

There is a clear difference in blurriness when capping at different sizes. 480x576 is the smallest acceptable size.

Note that not all ATI cards use a Theater chip; the 'ATI TV Wonder VE', for example, uses a BT878 chip.

References:
bt8x8 Data Sheet: Have a look at the 100119a.pdf document.
Conexant cx2388x Data Sheet
Philips SAA7108 / 7113 Data Sheets
ATI theater 200: Some features.

5.3.2 Capture card scalers (NTSC)

bt8x8:

[Screenshots: bt878 at 368x480; bt878 at 356x480]

There is a clear difference in blurriness when capping at different sizes. Starting from 400x480, the image becomes blurrier. 368x480 is still acceptable, but lower sizes are very blurred.

So, if possible, make sure that you capture at 368x480 or higher.

cx2388x:

No screenshots available. I guess the same conclusion holds as in the PAL case: there are no lousy vertical resizers at work when capping at lower sizes. Thus, you can use 320x480.

SAA71xx:

No screenshots available. I guess the same conclusion holds as in the PAL case: there are no lousy vertical resizers at work when capping at lower sizes. Thus, you can use 320x480.

ati cards with Theater chip:

[Screenshots: ATI Radeon AIW 8500DV at 400x480; ATI Radeon AIW 8500DV at 368x480]

There is a clear difference in blurriness when capping at different sizes. 400x480 is the smallest acceptable size.

Note that not all ATI cards use a Theater chip; the 'ATI TV Wonder VE', for example, uses a BT878 chip.

References:
bt8x8 Data Sheet: Have a look at the 100119a.pdf document.
Conexant cx2388x Data Sheet
Philips SAA7108 / 7113 Data Sheets
ATI theater 200: Some features.

5.4 Active capture window

Capture cards only capture/sample a portion of the analogue signal (this portion is called the active capture window). As stated in subsection 5.1, for PAL the active picture is contained in about 52 µs of 576 active scan lines. Looking at various capture chips/drivers (see table below), it turns out that the part which is actually captured lies between 51.56 µs and 53.333 µs. The former capture window is missing a part of the image (for 720x576 this results in 6 missing pixels), and the latter contains black borders (for 720x576 this results in a border of 18 black pixels).
Similarly, for NTSC the active picture is contained in about 52.6555 µs of 486 active scan lines. Looking at various capture chips/drivers (see table below), it turns out that the part which is actually captured lies between 50.96 µs and 53.333 µs. The former capture window is missing a part of the image (for 720x480 this results in 23 missing pixels), and the latter contains black borders (for 720x480 this results in a border of 9 black pixels).
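The missing-pixel and black-border counts quoted above can be reproduced with a small Python sketch. It rests on an assumption that is not stated in the text: the difference between the captured width and the active picture width is expressed in 13.5 MHz (ITU-601) pixels. Under that assumption the numbers come out as quoted; the function name is purely illustrative.

```python
ITU601_RATE_MHZ = 13.5  # assumed pixel clock for a 720-wide frame


def window_error_pixels(captured_us: float, active_us: float,
                        rate_mhz: float = ITU601_RATE_MHZ) -> int:
    """Positive result: pixels of black border (window wider than the picture).
    Negative result: pixels of picture that are missing (window too narrow)."""
    return round((captured_us - active_us) * rate_mhz)


# PAL, about 52 µs of active picture
print(window_error_pixels(51.56, 52.0))      # narrowest listed window
print(window_error_pixels(53.333, 52.0))     # widest listed window

# NTSC, about 52.6555 µs of active picture
print(window_error_pixels(50.96, 52.6555))
print(window_error_pixels(53.333, 52.6555))
```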

In order to resize, you will need to know how much of the TV picture your capture device captures. The following table lists tested capture card/driver combinations. Card manufacturer does not matter, only the capture chip and driver matter. If your device is not listed, there is a test procedure (determining the active capture window) to determine your values. This test procedure can be followed if you have a dvd player and dvd burner, or you have a dvd player that can read svcd's and a cd burner. If you are unable to do this test, and your card/driver combination is not listed in the table, you should have a look at section 4 to determine an approximate capture window.

Card, Driver                                        Capture width (µs)

PAL:
BT878, BTWincap v5.3.6.1                            52.03
BT878, Hauppauge WDM v3.35 b 21125                  51.56
BT878, Iulabs universal WDM v3.1.28.36              51.56
CX23881, Hauppauge WDM v2.75.21070                  51.56
SAA7108, Nvidia WDM v30.82                          53.33
SAA7113, Terratec Cameo Grabster 200 USB v3.05      53.33
SAA7134, Terratec Cinergy TV 400 WDM v1.2.0.5       52.15
ATI's Rage Theater chip                             52.22
Any ITU-601 compliant DV Device                     53.33

NTSC:
BT878, BTWincap v5.3.6.1                            52.80
BT878, Hauppauge WDM v3.35 b 21125                  50.96
BT878, Iulabs universal WDM v3.1.28.36              50.96
CX23881, Hauppauge WDM v2.75.21070                  50.96
SAA7113, Terratec Cameo Grabster 200 USB v3.05      53.33
SAA7134, Terratec Cinergy TV 400 WDM v1.2.0.5       52.15
Any ITU-601 compliant DV Device                     53.33
ATI's Theater 200 chip                              52.15
ATI's Rage Theater chip                             52.96

PAL-60:
BT878, BTWincap v5.3.6.1                            52.00
SAA7134, Terratec Cinergy TV 400 WDM v1.2.0.5       52.15

References:
Table of 'active capture windows' for different chips/drivers determined by various cappers.

5.5 Recommended capture size

What size should you capture at, and what codec should be used at that size? After reading 3.5 (about bandwidth), 5.2 (Nyquist) and 5.3 (capture card scalers), we have enough information to answer this question. The basic rules are:

CVD/VCD: you should capture at DVD size and resize to the target size. If you capture directly at the target size, too much quality is lost.

DVD/SVCD: you should capture at the target size. If, for example, it isn't possible to capture at dvd-size, you should capture at a lower size and go for a different format like SVCD or CVD. The reason is that in this case the difference between the capture size and the target size is small, and the quality will be degraded too much when an extra resize is necessary. For example, instead of capturing and denoising at 768x576 (for PAL) and resizing to 720x576 it is qualitywise better to capture and denoise at the target size 720x576.

XviD/DivX: Two cases will be considered here: good quality high-size capping and good quality low-size capping. The latter can be used when you have a "slow" pc, or you don't have much hard disk space.

good quality high-size capping:

You should capture at least twice the bandwidth (with a maximal size of 768 for PAL and 720 for NTSC). Thus if you capture using a horizontal size of 768, 720 or 704 (for PAL, with a vertical size of 576) or 720, 704 or 640 (for NTSC, with vertical size of 480), it is good enough. There are some important issues:

good quality low-size capping:

If you have a "slow" pc, or you don't have much hard disk space, you have to choose a lower capture size. The recommended sizes are

Device                        PAL (with vertical downsize)   NTSC (with vertical downsize)
bt8x8                         400x576                        368x480
cx2388x                       384x576                        320x480
saa71xx                       384x576                        320x480
ati cards with theater chip   480x576                        400x480

As explained in section 5.2, you should capture at least twice the bandwidth (with a maximal size of 768 for PAL and 720 for NTSC), and if your hardware/software can't handle it (you get a lot of dropped frames during capture, or you simply can't capture at the recommended size) you should try lower sizes. Make sure that also the final size (after post processing) is above twice the bandwidth, and no step in between should go below that.

However, that's not always possible when using the capture sizes in the table above. From the table in section 3.5:

Source     Pixels
PAL-TV     520
NTSC-TV    440
SVHS       520
VHS        312

When capping from VHS, the condition is satisfied. However, when capping from SVHS or TV you either have to cap at higher sizes, or you have to accept losing some information.
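The comparison above can be written out as a small Python sketch for the PAL low-size capture widths. The dictionary and function names are illustrative only, and comparing the PAL widths against the PAL-TV, SVHS and VHS entries is an assumption made for this example.

```python
# Recommended low-size PAL capture widths per device (from the table above)
LOW_SIZE_PAL_WIDTH = {
    "bt8x8": 400,
    "cx2388x": 384,
    "saa71xx": 384,
    "ati cards with theater chip": 480,
}

# Pixels needed to stay at twice the bandwidth (from section 3.5)
MIN_WIDTH = {"PAL-TV": 520, "SVHS": 520, "VHS": 312}


def sources_fully_captured(width: int) -> list[str]:
    """Sources whose twice-the-bandwidth minimum fits in the given width."""
    return sorted(name for name, needed in MIN_WIDTH.items() if width >= needed)


for device, width in LOW_SIZE_PAL_WIDTH.items():
    print(f"{device}: {width} pixels is enough for {sources_fully_captured(width)}")
```

With these numbers every device reports only VHS, which is the situation described above.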

Huffyuv/MJPEG (at 18 or 19): If you are using a mjpeg codec for capturing (at quality 18 or 19) and you want good quality, you should do that at high sizes. At lower sizes (SVCD or CVD) you should use Huffyuv instead.

5.6 Resizing

If you care about getting the aspect ratio correct, and you capture at a standard size, resizing is almost always required. This is because capture card/driver combinations almost never capture exactly the complete picture, but mostly either a bit too little or a bit too much. In addition, they scale (instead of pad (4)) to the size you requested. This means that asking for a size such as 768, 720, 704, 640, etc. almost always gets you a slightly distorted picture. If your application/driver allows a custom capture size, you can simply capture the right size for your card and skip this step. Regardless, you will have to calculate the 'correct' size. Your destination size should be divisible by 16, since motion estimation uses 16x16 pixel-sized macroblocks. It is assumed that your device captures 576 lines for PAL and 480 for NTSC. That's right, all devices crop six NTSC lines to get 480 lines (5).

There are two methods for determining the correct size. The first is short and works well for standard destination sizes. It may also require you to add a small border to hit your desired standard. The second will give you the tools to figure out any custom options you may want. In general, the first method will only work if you are able to cap at arbitrary sizes. If not, then most of the time you are stuck with the second method.

first method:

  1. Find your approximate active capture width in µs in the table in section 5.4.
  2. Find the sample rate for your destination standard in the table in section 5.2.1.
  3. Multiply the two numbers together.
  4. Round to an even whole number (the reason is that the colour format of the capture is YUY2, and the chroma (UV) is shared between two horizontal pixels (6)).
     Now you have the pixel size your device captures, expressed in the pixels of your destination standard. Resize to this size regardless of your capture frame size. If this would cause you to 'up-size', you may want to consider a smaller frame standard. If you can do a custom capture size, you can use this.
  5. Add borders or crop to exactly hit your standard frame size, and round to the nearest mod-16 size.
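The first method can be sketched in Python as follows. This is a minimal illustration, not part of any capture tool: the function names are made up, and the final mod-16 rounding of step 5 is left out because the standard widths used here are already fixed.

```python
def round_to_even(value: float) -> int:
    """Nearest even whole number (YUY2 shares chroma between two pixels)."""
    return 2 * round(value / 2)


def first_method(capture_us: float, dest_rate_mhz: float,
                 dest_width_px: int) -> tuple[int, int]:
    """Return (resize_width, adjust).

    resize_width: what the device captures, in destination-standard pixels.
    adjust: pixels to add (positive) or crop (negative) to reach the
            standard frame width.
    """
    resize_width = round_to_even(capture_us * dest_rate_mhz)
    return resize_width, dest_width_px - resize_width


# 53.33 µs capture window, NTSC CVD (6.75 MHz, 352 pixels wide)
print(first_method(53.33, 6.75, 352))

# 53.33 µs capture window, NTSC TV on a PC (12.3064 MHz, 648 pixels wide)
print(first_method(53.33, 12.3064, 648))
```

Both calls report a resize width 8 pixels wider than the standard, so 8 pixels have to be cropped.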

second method:

  1. Select a suitable capture size (guidelines are given at the start of this section).
  2. Find the capture width of your destination standard (in µs) in the table in section 5.2.1.
  3. Find your approximate active capture width (in µs) in the table in section 5.4.
  4. Divide the first number by the second, and multiply the result by the horizontal capture size.
  5. Now you have the capture width of your destination standard (in µs) expressed in the pixels of your capture. Add borders or crop to get to this size.
  6. Resize to your standard frame size.
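A minimal Python sketch of the second method is given here. The function name is illustrative, and rounding to the nearest whole pixel is an assumption; depending on the colour format you may prefer an even number.

```python
def second_method(capture_width_px: int, capture_us: float,
                  dest_us: float) -> tuple[int, int]:
    """Return (needed_width, adjust).

    needed_width: the destination standard's line width (in µs) expressed
                  in pixels of the capture.
    adjust: pixels to add as border (positive) or crop (negative) before
            the final resize to the standard frame size.
    """
    needed_width = round(dest_us / capture_us * capture_width_px)
    return needed_width, needed_width - capture_width_px


# 720x576 capture with a 53.333 µs window, target: PAL TV on a PC (52 µs)
print(second_method(720, 53.333, 52.0))

# 720x480 capture with a 53.33 µs window, target: NTSC CVD (52.148 µs)
print(second_method(720, 53.33, 52.148))
```

The first call gives a width of 702 pixels and the second 704 pixels; the cropped frame is then resized to the standard frame size in step 6.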

Some examples:

1) Say you want to make a DivX/XviD (PAL). Full PAL TV on a PC gives 52 µs * 14.7692 MHz = 768 pixels, ending with 768x576. Your "SAA7108, Nvidia with a WDM driver" has a capture window of 53.33 µs. So, how many PC sized pixels fit in the capture window? 53.33 µs * 14.7692 MHz = 788 pixels. 

So you have two options here (however, if you can't capture at arbitrary sizes you only have one option):

a) Cap pixels that are at or close to the target size. That's not always possible.

b) Cap at another resolution and resize to the correct pixel size afterwards. If you want that, cap high, say 720x576.
However, your card caps 53.33 µs, so you need fewer pixels to make up the difference with the 52 µs a PC needs for correct AR. How many?
(52 / 53.333) * 720 = 702 in total.

So, remove 18 black pixels of your 720x576 cap. You now have 52 µs of info in 702 pixels. Resize the resulting 702x576 to 768x576 or a scaling of it.

2) Say you want to make a DivX/XviD (NTSC). Full NTSC TV on a PC gives 52.6555 µs * 12.3064 MHz = 648 pixels, ending with 648x480.
Your "SAA7108, Nvidia with a WDM driver" has a capture window of 53.33 µs. So, how many PC sized pixels fit in there? 53.33 µs * 12.3064 MHz = 656 pixels. 

So you have two options (if you can't capture at arbitrary sizes you only have one option) here:

a) Cap pixels that are at or close to the target size. So, try capping at 656x480 and remove 8 pixels to get 648x480. Add or remove 8 pixels to obtain a width which is divisible by 16.

b) Cap at another resolution and resize to the correct pixel size afterwards. If you want that, cap high, say 720x480.
However, your card caps 53.33 µs, so you need fewer pixels to make up the difference with the 52.6555 µs a PC needs for correct AR. How many?
(52.6555 / 53.333) * 720 = 711 in total (712 rounding it to an even number).

So, remove 8 black pixels of your 720x480 cap. You now have 52.6555 µs of info in 712 pixels. Resize the resulting 712x480 to 648x480 or a scaling of it.

3) Say you want to make a SVCD (NTSC). NTSC SVCD gives 53.333 µs * 9 MHz = 480 pixels, ending with 480x480.
Your card has a capture window of 51 µs. So, how many SVCD sized pixels fit in there? 51 µs * 9 MHz = 459 pixels. That is not a nice number to cap. Depending on the codec, you should always cap mod 2, 4 or 8.

So you have two options (if you can't capture at arbitrary size you only have one option) here:

a) Cap pixels that are at or close to the target size. So, try capping at 460x480 and pad to 480x480.

b) Cap at another size and resize to the correct pixel size afterwards. If you want that, cap high, say 720x480.
However, your card caps only 51 µs, so you need extra pixels to make up the difference with the 53.333 µs an SVCD needs for correct AR. How many?
(53.333 / 51) * 720 = 752 in total.

So, add 16 extra black pixels to each side of your 720x480 cap. You now have 53.333 µs of info in 752 pixels. Resize the resulting 752x480 horizontally to 480x480 and again, you have SVCD with correct AR.

4) Say you want to make a CVD (NTSC). NTSC CVD gives 52.148 µs * 6.75 MHz = 352 pixels, ending with 352x480.
Your card has a capture window of 53.33 µs. So, how many CVD sized pixels fit in there? 53.33 µs * 6.75 MHz = 360 pixels.

So you have two options (if you can't capture at arbitrary size you only have one option) here:

a) Cap pixels that are at or close to the target size. So, try capping at 360x480 and remove 8 pixels to get 352x480. However, since 360 is pretty close to the bandwidth for VHS (VHS: 354, TV: 440), it is recommended to capture at a higher size. Have a look at option (b).

b) Cap at another size and resize to the correct pixel size afterwards. If you want that, cap high, say 720x480.
However, your card caps 53.33 µs, so you need fewer pixels to make up the difference with the 52.148 µs a CVD needs for correct AR. How many?
(52.148 / 53.33) * 720 = 704 in total.

So, remove 16 black pixels of your 720x480 cap. You now have 52.148 µs of info in 704 pixels. Resize the resulting 704x480 horizontally to 352x480 and again, you have CVD with correct AR.

References:
Der Karl's Capture Karten aspect ratio fuer Dummies: Der Karl's Capture Card Aspect Ratio for Dummies (in German).
Capture-Cards and aspect-ratio for Dummies: Der Karl's Capture Card Aspect Ratio for Dummies (translated by Arachnotron).


Footnotes:
(1) This may give the impression ITU is the only/correct way to do it. It is merely a way to standardize things, nothing else. Cards that do it differently are not doing it wrong.
(2) Have a look at the bt8x8 datasheet. SAA71xx devices: 27 MHz (PAL and NTSC). ATI devices: ?
(3) Pre-aliasing: undersampling results in spectral overlap, with the consequence that high frequencies will foldover and appear as low frequencies. Imperfect reconstruction results in post-aliasing. Source: page 20-22 Basic Signal processing.
(4) Padding to a certain size means "adding black pixels until the size is reached".
(5) Datasheets don't mention this explicitly, but all evidence points to it: your image would be degraded very much if it was resized instead of cropped. With bt_tweaker you can choose which lines you want to capture.
(6) reference: 4:2:2 formats




English version last edited on: 06/13/2004 | First release: n/a | Authors: Version4Team | Content by Doom9.org