Hi everyone,
I am new to the world of image and video processing and have been going through some documents to understand some basics. At multiple places, I see references to "half and quarter sample" processing.
I am unable to understand what half or quarter here means.
Is it midway, or a quarter of the way, between two samples?
Can anyone give a good example, please?
Below is a snippet where I see such a reference:
first generating the values of one or two
neighboring samples at half-sample positions using six-tap
filtering, rounding the intermediate results,
Sounds like interpolating. What is the context?
I second that. In fact, that's more or less word-for-word what I was going to reply.
It is interpolating. It is spatial, though, in case that is any source of confusion to the OP.
I'm guessing - or rather Google is - that the snippet came from here:
http://www.uta.edu/faculty/krrao/dip/Courses/EE535...
This looks to be a good explanation of the 6-tap filters used in H.264 and their spatial orientation:
Hello all,
Thanks for the replies. The snippet is from an HEVC tutorial.
Anyway, I do understand what interpolation is in image processing context.
But what I am not able to understand is "half sample" or "quarter sample".
This part is confusing me.
From the paper whose link I posted:
Fig. 8 (a) Half-pel pixel values; (b) Quarter-pel pixel values.
The integer-pels are your streamed luma samples. Once the surrounding integer-pel values are available, the half-pel values can be interpolated using the defined filter. After that calculation, the quarter-pel values can be derived from the integer- and half-pel values. So... integer-pel = an actual luma sample; half-pel = a value halfway between luma samples; quarter-pel = a value a quarter of the way between luma samples.
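To make that concrete, here is a rough sketch of the kind of arithmetic involved, loosely following the H.264-style six-tap half-pel filter [1, -5, 20, 20, -5, 1] (shown in 1-D only; the real standard applies it horizontally and vertically, and the sample data here is made up for illustration):

```python
def clip(v, lo=0, hi=255):
    """Clamp an interpolated value back into the 8-bit sample range."""
    return max(lo, min(hi, v))

def half_pel(samples, i):
    """Half-pel value midway between integer samples i and i+1,
    computed from the six integer samples centred on that gap
    with the six-tap filter [1, -5, 20, 20, -5, 1] and rounding."""
    E, F, G, H, I, J = (samples[i + k] for k in range(-2, 4))
    return clip((E - 5*F + 20*G + 20*H - 5*I + J + 16) >> 5)

def quarter_pel(a, b):
    """Quarter-pel value: rounded average of two neighbouring
    integer/half-pel values (a simple bilinear step)."""
    return (a + b + 1) >> 1

row = [10, 12, 50, 200, 60, 14, 11, 10]   # one row of integer-pel luma samples
h = half_pel(row, 3)                      # midway between row[3] and row[4]
q = quarter_pel(row[3], h)                # a quarter of the way past row[3]
print(h, q)                               # prints "153 177"
```

The key point is that the half- and quarter-pel values are never transmitted; they are reconstructed on the fly from the integer samples whenever a motion vector points between pixels.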
Thanks. Do you know why half pel or quarter pels are needed?
After all, the display is always in terms of whole pixels.
I believe that the primary purpose is for more realistic motion.
If you have an object with linear movement from one frame to the next, there is no reason to expect it to move an exact integer number of pixels. Rather, it might move -0.22 pixels in the vertical dimension and 2.71828 pixels in the horizontal.
If you are tracking such movement in order to capitalize on the quality of the rendering in the previous frame (and the bits spent to accomplish it) - basically to say "do this block like the previous frame, only shifted horizontally/vertically by some amount" - it makes sense to express that movement with whatever accuracy strikes a good balance between:
1) The accuracy (improvement) of predicting a block via motion vectors
2) The bits spent encoding those motion vectors
3) The computational cost of finding that motion vector and resampling/shifting
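This is why codecs like H.264 and HEVC store luma motion vectors in quarter-pel units. A small sketch of that convention (the helper names here are made up for illustration):

```python
def to_quarter_pel(displacement):
    """Round a real-valued displacement (in pixels) to the nearest
    quarter-pel and return the integer the encoder would code."""
    return round(displacement * 4)

def split_mv(mv):
    """Split a quarter-pel motion vector into its integer-pixel part
    and its fractional phase 0..3 (0 = integer, 2 = half-pel).
    The phase selects which interpolated sample grid to read from."""
    return mv >> 2, mv & 3

mvx = to_quarter_pel(2.71828)   # ~2.718 px snaps to 11 quarter-pels (2.75 px)
mvy = to_quarter_pel(-0.22)     # -0.22 px snaps to -1 quarter-pel (-0.25 px)
print(mvx, split_mv(mvx))       # prints "11 (2, 3)"
print(mvy, split_mv(mvy))       # prints "-1 (-1, 3)"
```

So the true sub-pixel motion is quantized to the nearest quarter-pel, and the decoder uses the fractional phase to pick the right half/quarter-pel interpolated samples - which is exactly the trade-off in points 1-3 above: finer units predict better but cost more bits and more interpolation work.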
-k