Thesis presentation 08022016
![Page 1: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/1.jpg)
Visual attention and perception models for assessing quality in 2D and 3D stereoscopic video
Juan Pedro López Velasco - [email protected]
José Manuel Menéndez García - [email protected]
Universidad Politécnica de Madrid
Madrid, 8th February 2016
![Page 2: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/2.jpg)
Index
• Introduction
• Objectives and Work Development
• Visual discomfort prediction in 3D stereoscopic video
• Visual Attention Model for Video Quality Assessment
• Conclusions and Future work
• Merits
![Page 3: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/3.jpg)
Introduction
• Quality of Experience (QoE) is defined as the degree of delight or annoyance of the user of an application or service, in this case multimedia services.
• QoE must be estimated at different stages of the video broadcasting dataflow and for a variety of sources, both 2D and 3D.
![Page 4: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/4.jpg)
Scenarios
• CONTENT CREATION PHASE: Visual comfort assessment (3D)
• COMPRESSION PHASE: Visual attention and saliency models (2D)
![Page 5: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/5.jpg)
The most important thing in video quality assessment is… the final user.
![Page 6: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/6.jpg)
Objectives and Work Development
![Page 7: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/7.jpg)
Objectives (I)
For visual comfort assessment (3D):
• Detecting empirically the main sources of visual discomfort in 3D stereoscopic video through subjective assessment.
• Quantifying the situations in sequences where visual discomfort is most likely to occur.
• Analyzing motion, distribution of parallax and disparity change in pairs of sequences in order to develop tools that correspond to human perception.
• Demonstrating with sequences that the results obtained in subjective assessment can be predicted by measuring objective parameters and characteristics.
![Page 8: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/8.jpg)
Work Development (I)
• Preliminary subjective assessment
• Determination of visual discomfort sources
• Characterization of video sequences
• Statistical analysis, new subjective assessment, metrics development and drawing conclusions
![Page 9: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/9.jpg)
Objectives (II)
For visual attention and saliency models (2D):
• Improving objective quality metrics by applying visual attention models, which weight regions of interest to obtain results closer to the response of the human eye.
• Determining accurate visual attention models, specific to each sequence, which predict the areas most likely to be observed by the user.
• Weighting the saliency factors by means of subjective assessment. These saliency factors are motion, level of detail, face detection and pixel position.
• Demonstrating the improvement of objective metrics for measuring quality and artifacts in the sequence when the developed visual attention model is applied (Advanced Blur metric).
![Page 10: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/10.jpg)
Work Development (II)
• Determining factors: motion, face detection, level of detail and position
• Subjective assessment with artificially impaired sequences
• Weighting these factors in order of importance
• Visual attention model generation
• Application of the model in objective metrics (Advanced Blur metric)
![Page 11: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/11.jpg)
Visual discomfort prediction in 3D stereoscopic video
![Page 12: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/12.jpg)
Introduction to Stereoscopy
• Stereoscopic 3D video perception is based on capturing two different but highly correlated video signals, one to feed each of the viewer’s eyes.
• One signal is received by the left eye and the other by the right eye. The brain fuses the left and right views.
• 3D video imitates human binocular vision (natural view).
• The cyclopean eye is an imaginary eye situated midway between the two eyes.
![Page 13: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/13.jpg)
Disparity and Parallax
• Disparities are the differences between the angles subtended between pairs of features.
• Parallax is created by disparities. It is positive, negative or zero, depending on the position of the object with respect to the screen.
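The sign convention can be illustrated with a minimal sketch. It assumes the common definition of screen parallax as the horizontal position of a feature in the right view minus its position in the left view; the function name and the tolerance parameter are hypothetical and do not come from the slides.

```python
def classify_parallax(x_left, x_right, tol=0.0):
    """Classify screen parallax from the horizontal positions (in pixels)
    of the same feature in the left and right views.

    Assumed convention: parallax = x_right - x_left.
    """
    parallax = x_right - x_left
    if parallax > tol:
        return "positive"   # object perceived behind the screen
    if parallax < -tol:
        return "negative"   # object perceived in front of the screen
    return "zero"           # object perceived on the screen plane


for left, right in [(100, 110), (100, 90), (100, 100)]:
    print(left, right, classify_parallax(left, right))
```

The tolerance is only there so that sub-pixel noise in a real disparity estimate could be treated as zero parallax.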
![Page 14: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/14.jpg)
Example of 3D disparity
![Page 15: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/15.jpg)
Accommodation-Vergence conflict
• Viewing an object on a stereoscopic display:
– The eyes accommodate (focus) on the screen.
– But the eyes rotate to fixate the apparent object (vergence).
– An inconsistency between the two occurs (derived from stereopsis).
• This effect is the accommodation-vergence conflict.
![Page 16: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/16.jpg)
Problem description
• Disparity may offer an incredible experience, BUT with large differences in 3D disparity the eyes may have difficulty focusing on objects, causing visual discomfort, annoyance and headache.
• The eyes focus on objects: accommodation needs enough time to adapt to changes for correct vision of 3D video (hence the importance of motion).
• Common sources of visual discomfort:
– Excessive binocular parallax (especially negative)
– Accommodation and vergence mismatches (AVM)
![Page 17: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/17.jpg)
Accommodation-Vergence Mismatches (AVM)
• AVM is one of the most frequent sources of visual discomfort in 3DTV.
• When the position of the objects changes (parallax), the accommodation is constant but the vergence changes.
• The crystalline lens must adapt quickly to the change.
Figure panels: Near distance object | Far distance object
![Page 18: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/18.jpg)
Zone of Comfort
• Zone of Comfort (ZoC) is a term introduced by Percival (1892) to define the relationship between the vergence distance and the distance to the screen (accommodation distance).
• Previous studies focused on static images (Shibata, 2011).
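The relationship between vergence distance and accommodation distance can be made concrete with a small geometric sketch. It is not taken from the slides: it assumes simple similar-triangle viewing geometry, an eye separation of 65 mm as a typical value, and parallax measured on the screen in metres (positive behind the screen). The function names are hypothetical, and no comfort threshold is implied.

```python
def vergence_distance(view_dist_m, parallax_m, eye_sep_m=0.065):
    """Distance (m) from the viewer to the fused point.

    Similar triangles give Z = D * e / (e - p), where D is the viewing
    distance, e the eye separation and p the on-screen parallax
    (p > 0 behind the screen, p < 0 in front of it).
    """
    if parallax_m >= eye_sep_m:
        return float("inf")  # lines of sight are parallel or diverge
    return view_dist_m * eye_sep_m / (eye_sep_m - parallax_m)


def av_mismatch_dioptres(view_dist_m, parallax_m, eye_sep_m=0.065):
    """Accommodation-vergence mismatch expressed in dioptres (1/m)."""
    z = vergence_distance(view_dist_m, parallax_m, eye_sep_m)
    return abs(1.0 / view_dist_m - 1.0 / z)


# Example: screen at 2.5 m, object shown with 65 mm of negative parallax.
print(vergence_distance(2.5, -0.065))
print(av_mismatch_dioptres(2.5, -0.065))
```

With zero parallax the vergence distance equals the screen distance and the mismatch vanishes, which is the conflict-free case.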
![Page 19: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/19.jpg)
Work methodology
1. Characterization of individual video sequences (Sequence: Motion, Depth map, Distribution of parallax)
2. Combination of video pair sequences (Sequence 1, Sequence 2): wide range of transition cases
3. Subjective assessment with pairs of sequences for transition analysis
4. Analysis of when visual discomfort happens
![Page 20: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/20.jpg)
Characterization of video sequences
• Tools for characterization:
– Depth maps: using SAD (Sum of Absolute Differences) techniques.
– Histograms of parallax information (based on depth map information).
– Diagrams of TI (Temporal Information) and SI (Spatial Information) variation.
Figure label: SAD
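As an illustration of SAD matching, the following minimal sketch estimates a horizontal disparity for each pixel of one scanline by comparing a small block in the left view against shifted blocks in the right view. It is a toy version under stated assumptions, not the tool used in the thesis: the block size, search range and function names are hypothetical, and the disparity sign follows x_right - x_left.

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences between two equally sized blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def disparity_row(left, right, block=3, max_disp=4):
    """Per-pixel disparity for one scanline using SAD block matching.

    For every block centred at x in the left row, the shift d in
    [-max_disp, max_disp] with the lowest SAD in the right row is kept.
    """
    half = block // 2
    n = len(left)
    disparities = []
    for x in range(half, n - half):
        ref = left[x - half:x + half + 1]
        best_d, best_cost = 0, None
        for d in range(-max_disp, max_disp + 1):
            lo, hi = x + d - half, x + d + half + 1
            if lo < 0 or hi > n:
                continue  # candidate block falls outside the row
            cost = sad(ref, right[lo:hi])
            if best_cost is None or cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


left_row = [0, 0, 0, 10, 50, 10, 0, 0, 0, 0]
right_row = [0, 0, 0, 0, 0, 10, 50, 10, 0, 0]  # same feature, 2 px to the right
print(disparity_row(left_row, right_row))
```

Flat regions match equally well at several shifts, so only textured areas give a reliable estimate; a real depth-map tool would need some regularisation on top of this.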
![Page 21: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/21.jpg)
Case study: Sequence Palco HD
• The separation of the virtual cameras exceeds the average interpupillary distance. The human eye adapts to the change produced by negative parallax, but… an abrupt transition generates discomfort.
Figure label: Progressive temporal parallax variation
![Page 22: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/22.jpg)
Subjective Assessment
• Analysis of changes / transitions between pairs of video sequences to determine a preliminary ZoC.
• Analysis of transitions between scenes:
– Selection of sequences with different values of SI (Spatial Information) and TI (Temporal Information): two-dimensional information.
– Selection of sequences with different values of spatial and temporal parallax variance (negative, parallax): three-dimensional information.
• Test conditions (following Recommendations ITU-R BT.500 and ITU-T P.910):
– 74 observers
– 65-inch television
– Observation distance: 2.5 m
– HD sequences
– 5-point annoyance scale

MOS Scale

| Score | Annoyance derived from transition | Quality of Experience |
|---|---|---|
| 5 | Very comfortable | Excellent Experience |
| 4 | Comfortable | Good Experience |
| 3 | Mildly uncomfortable | No visual discomfort |
| 2 | Uncomfortable | Visual discomfort |
| 1 | Extremely uncomfortable | High visual discomfort |
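A short sketch of how scores on this 5-point scale can be summarised follows. The score list is invented for illustration, the function names are hypothetical, and the threshold of 2 follows the scale above, where scores 2 and 1 are the ones labelled as visual discomfort.

```python
def mean_opinion_score(scores):
    """MOS: arithmetic mean of the individual observer scores."""
    return sum(scores) / len(scores)


def discomfort_share(scores, threshold=2):
    """Fraction of observers whose score indicates visual discomfort
    (score <= threshold on the 5-point scale)."""
    return sum(1 for s in scores if s <= threshold) / len(scores)


# Hypothetical scores given by five observers to one transition.
scores = [5, 4, 2, 1, 3]
print(mean_opinion_score(scores))
print(discomfort_share(scores))
```

Reporting the share of low scores alongside the mean helps when a transition is fine for most observers but clearly uncomfortable for a sizeable minority.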
![Page 23: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/23.jpg)
Results of subjective assessment
![Page 24: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/24.jpg)
Transition: Angel to Ladder (I)
40% of the observers gave a score indicating visual discomfort.
![Page 25: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/25.jpg)
Transition: Angel to Ladder (II)
Parallax variation in pixels
![Page 26: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/26.jpg)
Transition: “Spaceship” to “Astronaut”
Negative parallax on the right side of the first video changes to a negative/positive combination.
![Page 27: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/27.jpg)
Transition: “Station” to “Itaca3d”
This is the worst-scored transition in the tests.
↑↑Motion
Hyperstereoscopy!
![Page 28: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/28.jpg)
Transition: “Boxers” to “Dance”
Negative parallax is located in different areas, causing less annoyance for observers.
![Page 29: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/29.jpg)
Transition: “Hall” to “Laboratory”
Both videos have negative parallax and window violation → low scores.
Window violation!
![Page 30: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/30.jpg)
Conclusions
• After subjective assessment, results indicate the need to evaluate both the static disparity and the dynamic variation of the stereoscopic image, in terms of motion.
• ZoC is affected by motion in the scene. The state of the art must be updated to offer results from tests with dynamic sequences.
• Visual discomfort can be avoided by locating objects in positive parallax, BUT that implies a consequent decrease of QoE.
• Negative parallax must be controlled to generate soft variations:
– Fast variation of negative parallax is usually the main source of visual discomfort, especially when the transition leads to content with a completely different disparity diagram.
– Hyperstereoscopy alone (i.e. pixels with negative parallax with disparities higher than 5) in the sequence is not enough to detect visual discomfort; it is the transition that provokes the discomfort.
• Positive parallax is recommended for its tolerance to visual discomfort and the consequent.
![Page 31: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/31.jpg)
Future work
Following the conclusions obtained from detecting the main sources of visual discomfort:
• Developing recommendations and guidelines for 3D content creators.
• Generating tools for automatic detection of discomfort in 3D videos.
![Page 32: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/32.jpg)
Visual Attention Model for Video Quality Assessment
![Page 33: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/33.jpg)
Contents
• Introduction: Problem description
• Calibration of the visual attention model
– Artificially impaired video sequences generation: analysis of video characteristics by regions; creation of masks based on ROI’s
• Results and examples with test sequences
• Advanced blur metric
– Application to real video sequences (encoded in H.264 at different bitrates)
• Conclusions
![Page 34: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/34.jpg)
Problem description (I)
• Assessing video quality is still a complex task.
• Video Quality Assessment needs to correspond to human perception.
• Visual attention is focused on concrete regions (ROI’s) of an image, as demonstrated with fixation maps and eye-tracking.
Figure panels: Original image | Fixation map | Image with visual attention weights
![Page 35: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/35.jpg)
Problem description (II)
• Most pixel-based metrics do not show enough correlation between objective and subjective results.
• Algorithms need to correspond to human perception when analyzing quality in a video sequence.
• For example, these four frames have the same MSE.
• Video quality metrics should correlate with visual attention and psychovisual models adapted to concrete artifacts and their visualization.
Frames shown: High blocking | High blurring (defocus) | Salt and pepper noise | JPEG encoding
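The point that visually different distortions can share the same MSE can be reproduced with a toy example. The pixel values below are invented and the helper names are hypothetical; the sketch only shows that a small error spread over every pixel and a large error concentrated in one pixel give identical MSE and PSNR.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equally sized pixel lists."""
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB for 8-bit content."""
    err = mse(reference, distorted)
    return float("inf") if err == 0 else 10 * math.log10(peak ** 2 / err)


reference = [100] * 16                       # flat 4x4 patch, flattened
uniform_shift = [p + 4 for p in reference]   # small error on every pixel
single_spike = [116] + reference[1:]         # one large, localized error

print(mse(reference, uniform_shift), mse(reference, single_spike))
print(psnr(reference, uniform_shift), psnr(reference, single_spike))
```

A viewer would likely judge the two patches very differently, which is the motivation for weighting errors by where and how they appear.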
![Page 36: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/36.jpg)
Visual Attention Features
• According to the context-aware saliency detection model proposed by Goferman et al. [GOFERMAN-1, 2012], image regions of interest are detected based on four principles of human attention supported by psychological evidence:
– Low-level characteristics affecting each individual pixel, such as color and contrast.
– Global considerations, which suppress frequently occurring features while maintaining features that deviate from the norm.
– Visual organization rules, which state that visual forms may possess one or several centers of gravity about which the form is organized.
– High-level factors, such as recognition of human faces or concrete objects. This factor could be content dependent, but human faces generate specific patterns in the human retina that increase the probability of being perceived, related to psychological and cognitive features.
![Page 37: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/37.jpg)
Example of artificially impaired sequences
• Impaired area (with blocking artifact) located in the human faces ROI.
• The effect is exaggerated in this example, but it is common in real life.
![Page 38: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/38.jpg)
Work methodology
• Objectives:
– Calibration of the influence of features (ROI) for determining the visual attention model.
– Creation of Advanced Blur metrics.
• Methodology for the Visual Attention Model:
– Selection of ROI’s: motion, faces, spatial detail and position.
– Creation of masks for artificially impaired sequences (adapted to a concrete artifact: blurring).
– Subjective assessment: opinions of users (MOS scaled).
– Search for inconsistencies between subjective assessment (MOS obtained) and pixel-based objective metrics (PSNR), to weight the influence of each feature.
• Advanced Blur metric: loss of energy (blur) adapted to visual attention.
• Tests: once the visual attention model is generated, it is tested with real sequences (distorted by the effect of H.264 encoding).
![Page 39: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/39.jpg)
Scheme of artificially impaired video sequences generation
Diagram blocks: Original video sequence → Distortion → Impaired video sequence; Feature Mask and Inverse Feature Mask → Artificially impaired sequence
(2 sequences for each distortion: one case and the opposite case, as seen in the next example)
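The scheme can be sketched as a per-pixel selection between the original and the fully distorted frame. This is a minimal illustration assuming binary masks and frames stored as nested lists; the function names and the tiny 2x2 frames are hypothetical.

```python
def apply_mask(original, distorted, mask):
    """Take the distorted pixel where the mask is set, else the original."""
    return [
        [d if m else o for o, d, m in zip(o_row, d_row, m_row)]
        for o_row, d_row, m_row in zip(original, distorted, mask)
    ]


def invert_mask(mask):
    """Inverse feature mask: selects everything outside the ROI."""
    return [[not m for m in row] for row in mask]


original = [[10, 20], [30, 40]]
distorted = [[11, 22], [33, 44]]          # whole frame after the distortion
feature_mask = [[True, False], [False, False]]

impaired_in_roi = apply_mask(original, distorted, feature_mask)
impaired_outside_roi = apply_mask(original, distorted, invert_mask(feature_mask))
print(impaired_in_roi)
print(impaired_outside_roi)
```

Producing both variants from the same distortion is what allows the pair to be compared in the subjective test.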
![Page 40: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/40.jpg)
Impairment and artifacts insertion process
Diagram blocks: Original video sequence → Artifact Distortion → Impaired video sequence
• Blocking: simulated with an 8x8 mosaic filter
• Blurring: simulated with a Gaussian low-pass filter
• Ringing: simulated with a JPEG coding filter
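A mosaic filter for simulating blocking can be sketched as block-wise averaging. The slides do not specify the filter beyond its 8x8 size, so the averaging choice, the function name and the small test frame (filtered with 2x2 blocks for brevity) are assumptions.

```python
def mosaic(frame, block=8):
    """Replace every block x block tile by its mean value.

    Tiles at the right and bottom borders may be smaller than `block`.
    """
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            mean = sum(frame[y][x] for y in rows for x in cols) / (len(rows) * len(cols))
            for y in rows:
                for x in cols:
                    out[y][x] = mean
    return out


frame = [
    [0, 2, 10, 10],
    [2, 4, 10, 10],
    [1, 1, 5, 7],
    [1, 1, 7, 5],
]
print(mosaic(frame, block=2))
```

Averaging removes all detail inside a tile while keeping sharp tile borders, which is the visual signature of blocking.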
![Page 41: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/41.jpg)
Creation of masks based on ROI’s (I)
• Types of regions of interest for masks: Motion, Spatial Detail, Faces, Position, Color
Diagram blocks: Original video sequence → Feature Detection → Feature Mask and Inverse Feature Mask
![Page 42: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/42.jpg)
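Motion mask
• For motion detection, temporal information (TI) in consecutive frames is examined: a pixel belongs to the motion mask when its value changes between consecutive frames.
$$\mathrm{Pix}_i(x,y) \in \mathrm{Mask}_{frame}, \quad \text{if } F_i(x,y) - F_{i-1}(x,y) \neq 0$$
Figure panels: Original frame | Motion mask based on TI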
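A frame-difference motion mask can be sketched in a few lines. The version below marks a pixel when its value differs between two consecutive frames; the `threshold` parameter is a hypothetical addition for tolerating noise, and its default of 0 corresponds to any non-zero difference.

```python
def motion_mask(prev_frame, curr_frame, threshold=0):
    """Binary mask that is True where the pixel changed between frames."""
    return [
        [abs(c - p) > threshold for p, c in zip(p_row, c_row)]
        for p_row, c_row in zip(prev_frame, curr_frame)
    ]


prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 12, 10], [10, 10, 40]]
print(motion_mask(prev_frame, curr_frame))
print(motion_mask(prev_frame, curr_frame, threshold=5))
```

On real, noisy video a zero threshold would flag almost every pixel, so some tolerance is usually needed in practice.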
![Page 43: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/43.jpg)
Spatial Detail Mask
• Textures, edges and objects in motion can hide or highlight certain impairments, such as blocking or blurring artifacts.
• The Canny algorithm is used to create binary masks that separate homogeneous areas from high-frequency areas.
Figure panels: Original frame | Spatial detail mask based on Canny algorithm
![Page 44: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/44.jpg)
Pixel Position Masks
• The image is divided into 9 sections (Nojiri, 2009).
• Objective: analyzing the influence of pixel position by areas.
• Three types of masks are created depending on the regions:
Figure panels: Nojiri’s sections distribution | Corner mask | Lateral mask | Central mask
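The three position masks can be sketched by splitting the frame into a 3x3 grid. The assignment of grid cells is an assumption made for illustration: the four corner cells form the corner mask, the middle cell the central mask, and the four remaining edge cells the lateral mask. The function name is hypothetical.

```python
def position_masks(height, width):
    """Return (corner, lateral, central) binary masks for a 3x3 grid."""
    corner = [[False] * width for _ in range(height)]
    lateral = [[False] * width for _ in range(height)]
    central = [[False] * width for _ in range(height)]
    for y in range(height):
        row_band = 3 * y // height      # 0, 1 or 2
        for x in range(width):
            col_band = 3 * x // width   # 0, 1 or 2
            if row_band == 1 and col_band == 1:
                central[y][x] = True
            elif row_band != 1 and col_band != 1:
                corner[y][x] = True
            else:
                lateral[y][x] = True
    return corner, lateral, central


corner, lateral, central = position_masks(9, 9)
print(sum(map(sum, corner)), sum(map(sum, lateral)), sum(map(sum, central)))
```

Every pixel belongs to exactly one of the three masks, so impairments can be placed in one region type at a time.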
![Page 45: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/45.jpg)
Facial Mask
• Face detection uses the Haar algorithm included in the OpenCV libraries, which is based on a boosted cascade of simple features.
Figure panels: Face detection | Face mask
![Page 46: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/46.jpg)
Subjective assessment for calibration
• Results based on subjective tests are analyzed to demonstrate the validity of the test sequences. Spatial detail is analyzed in these 3 sequences.
• MOS scale is used: 5 (excellent) to 1 (poor).
Sequences shown: “News Report”: Faces | “Barrier”: Motion | “Crowd”: Pixel Position

News Report (impairment located in Faces ROI)

| FR Metric | H.264 75 Mbps | H.264 500 Kbps | D. | Inv. |
|---|---|---|---|---|
| PSNR | 47.93 | 37.58 | 46.82 | 34.52 |
| Blur | 0.44 | 3.63 | 0.38 | 5.17 |
| MSE | 0.67 | 1.93 | 0.10 | 2.30 |
| MOS | 4.81 | 1.54 | 1.33 | 3.78 |

Barrier (impairment located in Motion ROI)

| FR Metric | H.264 75 Mbps | H.264 500 Kbps | D. | Inv. |
|---|---|---|---|---|
| PSNR | 49.82 | 33.19 | 39.85 | 34.24 |
| Blur | 0.27 | 8.36 | 1.97 | 6.24 |
| MSE | 0.51 | 3.34 | 0.359 | 2.98 |
| MOS | 4.77 | 1.33 | 3.11 | 3.89 |

Crowd (impairment located in Position ROI’s)

| FR Metric | H.264 75 Mbps | H.264 500 Kbps | Center D. | Center Inv. | Lateral D. | Lateral Inv. | Corner D. | Corner Inv. |
|---|---|---|---|---|---|---|---|---|
| PSNR | 34.33 | 25.34 | 30.74 | 26.82 | 33.87 | 26.00 | 35.95 | 25.88 |
| Blur | 3.44 | 22.55 | 6.27 | 15.33 | 2.60 | 19.44 | 0.95 | 22.47 |
| MSE | 3.55 | 8.76 | 2.30 | 6.21 | 1.21 | 7.30 | 0.64 | 7.87 |
| MOS | 4.68 | 1.22 | 1.44 | 2.44 | 3.78 | 1.33 | 4.11 | 1.22 |
![Page 47: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/47.jpg)
Calibration of Faces
• Distortion is located in the human faces ROI.
• Subjective MOS values are lower (1.33) than when the distortion is located in the rest of the picture and faces appear sharp (3.78).
• This is inconsistent with the behavior of objective metrics: PSNR (46.82 vs. 34.52) or MSE (0.10 vs. 2.30).
![Page 48: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/48.jpg)
Calibration of Motion and Position
• A similar situation occurs when analyzing motion in the “Barrier” sequence: the MOS is inconsistent with the objective metrics.
• Inconsistencies in corner regions between MOS and objective metrics, such as PSNR, for the sequence “Crowd”.
• Inconsistencies in spatial detail areas, less
![Page 49: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/49.jpg)
Relative influence of factors
• After subjective assessment, the following order of influence was established:
Faces > Central > Motion > Detail > Lateral > Corner
![Page 50: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/50.jpg)
Example of psychovisual model defined (I)
Frame from sequence “News Report”
![Page 51: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/51.jpg)
Example of psychovisual model defined (II)
Motion Mask Spatial Details Mask
Pixel Position Mask Faces Mask
![Page 52: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/52.jpg)
Advanced Blur metric
• The Blur metric calculates the loss of energy caused by compressing a video sequence with transforms such as the DCT. It compares the gradient of the reference image with that of the distorted image.
• Advanced Blur includes the effect of the visual attention model.
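Blur:
$$\mathrm{Blur} = \frac{1}{W \cdot H} \sum_{j=0}^{W-1} \sum_{i=0}^{H-1} \left[ E\big(G(f_{ref}(i,j))\big) - E\big(G(f_{cod}(i,j))\big) \right]$$
Advanced Blur:
$$\mathrm{AdvBlur} = \sum_{j=0}^{W-1} \sum_{i=0}^{H-1} psy(i,j) \left[ E\big(G(f_{ref}(i,j))\big) - E\big(G(f_{cod}(i,j))\big) \right]$$
$$psy(i,j) = \frac{coef_{MOT}(i,j) + coef_{DET}(i,j) + coef_{POS}(i,j) + coef_{FACES}(i,j)}{W \cdot H \cdot \sum_{c=0}^{3} coef_{MAX}(c)}$$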
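The weighting idea can be sketched as follows: each pixel's energy loss is multiplied by a psychovisual weight obtained by adding the motion, detail, position and faces coefficients and normalising by the frame size and the sum of the maximum coefficients. The sketch assumes the gradient-energy maps of the reference and coded frames are already computed; all coefficient values and function names are made up for illustration and are not the calibrated weights of the thesis.

```python
def psy_weights(coef_maps, coef_max):
    """Per-pixel weight: sum of feature coefficients / (W * H * sum of maxima)."""
    height, width = len(coef_maps[0]), len(coef_maps[0][0])
    norm = width * height * sum(coef_max)
    return [
        [sum(m[i][j] for m in coef_maps) / norm for j in range(width)]
        for i in range(height)
    ]


def blur(e_ref, e_cod):
    """Unweighted mean energy loss between reference and coded frames."""
    height, width = len(e_ref), len(e_ref[0])
    total = sum(e_ref[i][j] - e_cod[i][j] for i in range(height) for j in range(width))
    return total / (width * height)


def advanced_blur(e_ref, e_cod, psy):
    """Energy loss weighted by the psychovisual map."""
    return sum(
        psy[i][j] * (e_ref[i][j] - e_cod[i][j])
        for i in range(len(e_ref))
        for j in range(len(e_ref[0]))
    )


# Hypothetical 2x2 gradient-energy maps and coefficient maps.
e_ref = [[9.0, 8.0], [7.0, 6.0]]
e_cod = [[5.0, 8.0], [6.0, 2.0]]
coef_max = [3.0, 2.0, 1.0, 4.0]            # maxima for MOT, DET, POS, FACES
motion = [[3.0, 3.0], [0.0, 0.0]]
detail = [[2.0, 0.0], [2.0, 0.0]]
position = [[1.0, 1.0], [1.0, 1.0]]
faces = [[4.0, 0.0], [0.0, 0.0]]

psy = psy_weights([motion, detail, position, faces], coef_max)
print(blur(e_ref, e_cod), advanced_blur(e_ref, e_cod, psy))
```

With this normalisation, a frame in which every pixel carries the maximum coefficient for every feature gives the same value as the unweighted Blur, so the two metrics stay on a comparable scale.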
![Page 53: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/53.jpg)
Test with real sequences
• Real sequences encoded at different bitrates:
– H.264: 6 Mbps – 500 Kbps (HD sequences)
Sequences shown: Umbrella | Boxers | Tree Branches | Phone Call
![Page 54: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/54.jpg)
Results (I)
• Results for each sequence compared with MOS (subjective opinion), the PCC (Pearson Correlation Coefficient) between each metric and MOS, and the improvement from the conventional Blur metric to the Advanced Blur metric.

| Sequence | Value | 6 Mbps | 4 Mbps | 1 Mbps | 500 Kbps | PCC | Δ(Adv.Blur-Blur) |
|---|---|---|---|---|---|---|---|
| Boxers | Blur | 0.650 | 0.920 | 3.040 | 6.880 | -0.953 | 2.97% |
| | Adv Blur | 1.340 | 1.480 | 2.000 | 2.660 | -0.983 | |
| | MOS | 4.778 | 4.111 | 2.444 | 1.333 | | |
| Hall | Blur | 0.790 | 3.280 | 14.180 | 27.230 | -0.982 | 1.40% |
| | Adv Blur | 2.440 | 3.490 | 6.880 | 9.670 | -0.996 | |
| | MOS | 4.889 | 4.111 | 2.667 | 1.556 | | |
| Phone Call | Blur | 1.950 | 2.260 | 3.460 | 4.490 | -0.990 | 0.94% |
| | Adv Blur | 1.640 | 1.780 | 1.990 | 2.170 | -0.999 | |
| | MOS | 4.889 | 4.000 | 2.444 | 1.333 | | |
| Tree Branches | Blur | 11.920 | 17.360 | 22.380 | 20.120 | -0.863 | 13.24% |
| | Adv Blur | 6.150 | 8.030 | 9.790 | 12.090 | -0.996 | |
| | MOS | 4.889 | 3.778 | 2.556 | 1.550 | | |
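The correlation figures can be checked with a plain Pearson computation. The sketch below uses the Boxers values from the results above; the helper name is hypothetical, and reading the improvement as the difference between the absolute PCC values is an interpretation that happens to match the reported percentages.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Boxers at 6 Mbps, 4 Mbps, 1 Mbps and 500 Kbps.
mos = [4.778, 4.111, 2.444, 1.333]
blur_values = [0.650, 0.920, 3.040, 6.880]
adv_blur_values = [1.340, 1.480, 2.000, 2.660]

pcc_blur = pearson(blur_values, mos)
pcc_adv = pearson(adv_blur_values, mos)
gain_percent = (abs(pcc_adv) - abs(pcc_blur)) * 100
print(round(pcc_blur, 3), round(pcc_adv, 3), round(gain_percent, 2))
```

The correlations are negative because both metrics grow as quality drops, so it is the magnitude of the coefficient that indicates agreement with MOS.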
![Page 55: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/55.jpg)
Results (II)
![Page 56: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/56.jpg)
Conclusions
• Most pixel-based algorithms are not adapted to the subjective response of the human eye.
• Subjective tests revealed the importance of some concrete regions.
• Visual attention models obtain better correlations when they weight regions of interest (ROI) and are adapted to concrete artifacts.
• The use of visual attention models improves objective metrics (Advanced Blur metric) by up to 13% compared to conventional methods.
![Page 57: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/57.jpg)
Conclusions and Future Work
![Page 58: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/58.jpg)
Conclusions
• ZoC is affected by motion in the scene. The state of the art must be updated to offer results from tests with dynamic sequences. Motion is a key factor in visual discomfort.
• Visual discomfort can be avoided by locating objects in positive parallax, BUT that implies a decrease of QoE:
– Negative parallax must be controlled to generate soft variations.
– Positive parallax is recommended for its tolerance to visual discomfort and the consequent.
• Subjective tests revealed the importance of concrete ROI’s.
• Visual attention models obtain better correlations when they weight regions of interest (ROI) and are adapted to concrete artifacts.
• The use of visual attention models improves objective metrics (Advanced Blur metric) by up to 13% compared to conventional methods.
![Page 59: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/59.jpg)
Future work
• Development and patenting of a system for automating Quality of Experience measurement for content generation (measuring visual discomfort).
• Developing recommendations and guidelines for 3D content creators.
• Improvement of the visual attention model with more low-, medium- and high-level features, such as color.
• Advanced metrics adapted to other artifacts, such as blocking.
• Development of No-Reference metrics including visual attention models.
![Page 60: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/60.jpg)
Merits
![Page 61: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/61.jpg)
Publications (I)
Peer-reviewed international journal articles (1)
• López, J. P., Rodrigo, J. A., Jiménez, D., & Menéndez, J. M. (2013). Stereoscopic 3D video quality assessment based on depth maps and video motion. EURASIP Journal on Image and Video Processing, 2013(1), 1-14. December 2013. Impact Factor: 0.74. JCR Indexed.
Peer-reviewed international conference papers (9)
• López, J. P., Rodrigo, J. A., Jimenez, D., & Menendez, J. M. Subjective quality assessment in stereoscopic video based on analyzing parallax and disparity. Consumer Electronics (ICCE), 2015 IEEE International Conference on. Las Vegas (U.S.A.), January 2015.
• López, J. P., Rodrigo, J. A., Jimenez, D., & Menendez, J. M. Proposal for characterization of 3DTV video sequences describing parallax information. In Consumer Electronics (ICCE), 2015 IEEE International Conference on. Las Vegas (U.S.A.), January 2015.
• López, J. P., Slanina, M., Arnaiz, L., & Menéndez, J. M. Subjective quality assessment in scalable video for measuring impact over device adaptation. In EUROCON, 2013 IEEE (pp. 162-169). Zagreb (Croatia), July 2013.
• López, J. P., Rodrigo, J. A., Jimenez, D., & Menendez, J. M. Insertion of Impairments in Test Video Sequences for Quality Assessment Based on Psychovisual Characteristics. Artificial Intelligence, Modelling and Simulation, International Conference on. Madrid, November 2014.
• López, J. P., Rodrigo, J. A., Jimenez, D., & Menendez, J. M. Definition of masks related to psychovisual features for Video Quality Assessment. In Consumer Electronics (ISCE), 2015 IEEE International Symposium on (pp. 1-2). Madrid, June 2015.
![Page 62: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/62.jpg)
Publications (II)
• López, J. P., Jimenez, D., Cerezo, A., & Menéndez, J. M. No-reference algorithms for video quality assessment based on artifact evaluation in MPEG-2 and H.264 encoding standards. IFIP/IEEE International Symposium on. IEEE. Ghent (Belgium), May 2013.
• Rodrigo, J. A., López, J. P., Jiménez Bermejo, D., & Menendez Garcia, J. M. (2013). Automatic 3DTV Quality Assessment Based On Depth Perception Analysis. Nem Summit 2013 Proceedings, 69-74. Nantes (France), October 2013.
• López, J.P., Jiménez, D., Díaz, M., & Menéndez, J.M. Metrics for the objective quality assessment in high definition digital video. IASTED International Conference on Signal Processing, Pattern Recognition and Applications (SPPRA). 2008.
• López, J.P., Díaz, M., Jiménez, D., & Menéndez, J. M. Tiling effect in quality assessment in high definition digital television. 12th IEEE International Symposium on Consumer Electronics - ISCE2008, ISBN: 978-1-4244-2422-1, Vilamoura, April 2008.
Book chapters (1)
• López, J.P. Video Quality Assessment. Video Compression, Ed. InTech, ISBN: 978-953-51-0422-3, March 2012.
Other peer-reviewed international conference papers (5)
Peer-reviewed national journal articles (1)
![Page 63: Presentación Tesis 08022016](https://reader036.vdocumento.com/reader036/viewer/2022062901/58f2eab01a28ab00798b45ef/html5/thumbnails/63.jpg)
Research projects
• ACTIVA. Ministerio de Industria, Turismo y Comercio (FIT-330300-2007-42).
• BUSCAMEDIA: hacia una adaptación semántica de medios digitales multirred-multiterminal. [2009-2012].
• CIUDAD2020: Hacia un nuevo modelo de ciudad inteligente sostenible. [2011-2014].
• COST Action IC1105: 3D-ConTourNet 3D Content Creation, Coding and Transmission over Future Media Networks.
• EPSIS. Entretenimiento y publicidad segmentada en entornos inmersivos. Ministerio Economía y Competitividad [2011-2013].
• FURIA 2009. Futura red integrada audiovisual. Ministerio de Industria, Turismo y Comercio (TSI-020301-2009-33) [2009-10]
• HBB4ALL Hybrid Broadcast Broadband TV For All. [2013-2016]
• HORFI-Radar MIMO de banda ultra ancha. TEC2012-38402-C04-01 HORFI.
• ICT 2020. Ministerio de Industria, Turismo y Comercio (TSI-020302-2011-23). [2011-2013]
• IMMERSIVE TV: Una aproximación a los medios inmersivos. Ministerio de Industria, Turismo y Comercio [2010-2012].
• ITACA 3D. Plataforma de creación, producción y distribución de video estereoscópico de entretenimiento para la visualización de televisión en 3D a través de broadcast. Ministerio de Industria, Turismo y Comercio (TSI-020110-2009-396).
• MELISMAS - Generación automática de mensajes en lengua de signos para aplicaciones sanitarias. Ministerio de Economía y Competitividad (RTC-2014-2762-1). [2014-16]
• Palco HD. Convergencia de plataformas digitales hacia la HD y medidas de calidad asociadas. Ministerio de Industria, Turismo y Comercio. [2007-2009]
• PALCO HD2. Ministerio de Industria, Turismo y Comercio. [2009-2011].
• PLEASE Plataforma de alta eficiencia avanzada para distribución de contenidos [2014-15].
• PRO-TVD-CM: Proyecto Integral de Investigación en Televisión Digital (S0505/TIC-0398). [2005-2009]
• S3D: Equipo servidor-editor de vídeo 3D realizado en colaboración con las empresas Overon y Aicox.
• SIRENA: SIstemas y tecnologías 3D Media sobre Internet del Futuro y REdes de difusión de NuevA generación. Ministerio de Economía y Competitividad (IPT-2011-1269-430000). [2011-2013]