
Commit 953d6bd

Merge pull request #5 from surbhimadan/master
updated documents including readme, usage and performance section
2 parents 39edcfb + e58b319 commit 953d6bd

9 files changed

Lines changed: 178 additions & 105 deletions

.gitignore

Lines changed: 0 additions & 1 deletion
@@ -1,7 +1,6 @@
11
*.jpg
22
*.JPG
33
*.BMP
4-
*.PNG
54
*.o
65
*.bin
76
*.log

README.md

Lines changed: 11 additions & 104 deletions
@@ -1,9 +1,17 @@
11
# Intel® Library for Video Super Resolution (Intel® Library for VSR) README
2-
Video Super Resolution coverts video from low resolution to high resolution using traditional image processing or AI-based methods.
2+
Video Super Resolution converts video from low resolution to high resolution using traditional image processing or AI-based methods. Intel Library for Video Super Resolution consists of a few different algorithms, including machine learning and deep learning implementations, to offer a balance between quality and performance.
33

4-
RAISR (Rapid and Accurate Image Super Resolution) algorithm (https://arxiv.org/pdf/1606.01299.pdf) is public AI-based VSR algorithm. The algorithm provides better quality results than standard (bicubic) algorithms and a good performance vs quality trade-off as compared to DL-based algorithms like EDSR.
4+
We have enhanced the public RAISR (Rapid and Accurate Image Super Resolution) algorithm, an AI-based super resolution algorithm (https://arxiv.org/pdf/1606.01299.pdf), to achieve better visual quality and beyond real-time performance for 2x and 1.5x upscaling on Intel® Xeon® platforms and Intel® GPUs. Enhanced RAISR provides better quality results than standard (bicubic) algorithms and a good performance versus quality trade-off compared to compute-intensive DL-based algorithms.
5+
6+
Enhanced RAISR is provided as an FFmpeg plugin inside of a Docker container (the Docker container supports CPU only) to help ease testing and deployment burdens. This project is developed using C++ and takes advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) on the Intel® Xeon® Scalable Processor family and OpenCL support on Intel® GPUs.
7+
8+
## Latest News
9+
- July 2024: Released performance data for the algorithm and pipeline on Intel® Xeon® Scalable processors as well as EC2 Intel instances deployed on AWS Cloud. See details in "performance.md".
10+
11+
- April 2024: Intel Library for Video Super Resolution algorithm now available on AWS. See the repository for details on how video super resolution works on the AWS service at https://github.com/aws-samples/video-super-resolution-too . Technical details including video quality comparisons and performance information are available in a joint Intel / AWS white paper available at https://www.intel.com/content/www/us/en/content-details/820769/aws-compute-video-super-resolution-powered-by-the-intel-library-for-video-super-resolution.html
12+
13+
- Feb 2024 : AWS and Intel announced collaboration to release Intel Library for VSR on AWS Cloud at the Mile High Video 2024 conference, technical details available at https://dl.acm.org/doi/10.1145/3638036.3640290
514

6-
We have enhanced the public RAISR algorithm to achieve better visual quality and beyond real-time performance for 2x and 1.5x upscaling on Intel® Xeon® platforms and Intel® GPUs. The Intel Library for VSR is provided as an FFmpeg plugin inside of a Docker container(Docker container only for CPU) to help ease testing and deployment burdens. This project is developed using C++ and takes advantage of Intel® Advanced Vector Extension 512 (Intel® AVX-512) where available and newly added Intel® AVX-512FP16 support on Intel® Xeon® 4th Generation (Sapphire Rapids) and added OpenCL support on Intel® GPUs.
715

816
## How to build
917
Please see "How to build.md" to build via scripts or manually.
@@ -80,107 +88,6 @@ ffmpeg -init_hw_device vaapi=va -init_hw_device opencl=ocl@va -hwaccel vaapi -hw
8088
platform <int> ..FV....... select the platform (from 0 to INT_MAX) (default 0)
8189
device <int> ..FV....... select the device (from 0 to INT_MAX) (default 0)
8290

83-
## Advanced Usage ( through Exposed Parameters )
84-
The FFmpeg plugin for Intel Library for VSR exposes a number of parameters that can be changed for advanced customization
85-
### threadcount (only for CPU)
86-
Allowable values (1,120), default (20)
87-
88-
Changes the number of software threads used in the algorithm. Values 1..120 will operate on segments of an image such that efficient threading can greatly increase the performance of the upscale. The value itself is the number of threads allocated.
89-
### filterfolder
90-
Allowable values: (Any folder path containing the 4 required filter files: Qfactor_cohbin_2_8/10, Qfactor_strbin_2_8/10, filterbin_2_8/10, config), default (“filters_2x/filters_lowres”)
91-
92-
Changing the way RAISR is trained (using different parameters and datasets) can alter the way RAISR's ML-based algorithms do upscale. For the current release, provides 3 filters for 2x upscaling and 2 filters for 1.5x upscaling, current the 1.5x upscaling only support 8-bit. And for each filter you can find the training informantion in filternotes.txt of each filter folder.The following is a brief introduction to the usage scenarios of each filter.
93-
<table border="1">
94-
<tbody>
95-
<tr>
96-
<th rowspan=2>Upscaling</th>
97-
<th rowspan=2>Filters</th>
98-
<th rowspan=2>Resolution (recommendation)</th>
99-
<th rowspan=2>Usage</th>
100-
<th colspan=2>Effect</th>
101-
</tr>
102-
<tr>
103-
<th rowspan=1>1pass</th>
104-
<th rowspan=1>2pass</th>
105-
</tr>
106-
<tr>
107-
<td rowspan=3>2x(support 8-bit and 10-bit)</td>
108-
<td >filters_lowres</td>
109-
<td >low resolution
110-
360p->720p,540p->1080p</td>
111-
<td >filterfolder=filters_2x/filters_lowres:passes=1/2</td>
112-
<td >2x upscaling</td>
113-
<td >2x upscaling and sharpening</td>
114-
</tr>
115-
<tr>
116-
<td >filters_highres</td>
117-
<td >high resolution
118-
1080p->4k</td>
119-
<td >filterfolder=filters_2x/filters_highres:passes=1/2</td>
120-
<td >2x upscaling and sharpening</td>
121-
<td >2x upscaling and more sharpening than 1st pass</td>
122-
</tr>
123-
<tr>
124-
<td >filters_denoise</td>
125-
<td >no limitation</td>
126-
<td >filterfolder=filters_2x/filters_denoise:passes=2:mode=2</td>
127-
<td >denosing only for input</td>
128-
<td >2x upscaling and sharpening</td>
129-
</tr>
130-
<tr>
131-
<td rowspan=2>1.5x(only support 8-bit)</td>
132-
<td >filters_highres</td>
133-
<td >high resolution
134-
720p->1080p</td>
135-
<td >filterfolder=filters_1.5x/filters_highres:passes=1:ratio=1.5</td>
136-
<td >1.5x upscaling and sharpening</td>
137-
<td >N/A</td>
138-
</tr>
139-
<tr>
140-
<td >filters_denoise</td>
141-
<td >no limitation</td>
142-
<td >filterfolder=filters_1.5x/filters_denoise:passes=2:mode=2:ratio=1.5</td>
143-
<td >denosing only for input</td>
144-
<td >1.5x upscaling and sharpening </td>
145-
</tr>
146-
</tbody>
147-
</table>
148-
149-
Please see the examples under the "Evaluating the Quality" section above where we suggest 3 command lines based upon preference.
150-
Note that for second pass to work, the filter folder must contain 3 additional files: Qfactor_cohbin_2_8/10_2, Qfactor_strbin_2_8/10_2, filterbin_2_8/10_2
151-
### bits
152-
Allowable values (8: 8-bit depth, 10: 10-bit depth), default (8)
153-
154-
The library supports 8 and 10-bit depth input. Use HEVC encoder to encoder yuv420p10le format.
155-
```
156-
./ffmpeg -y -i [10bits video clip] -vf "raisr=threadcount=20:bits=10" -c:v libx265 -preset medium -crf 28 -pix_fmt yuv420p10le output_10bit.mp4
157-
```
158-
### range
159-
Allowable values (video: video range, full: full range), default (video)
160-
161-
The library caps color within video/full range.
162-
```
163-
./ffmpeg -y -i [image/video file] -vf "raisr=threadcount=20:range=full" outputfile
164-
```
165-
### blending
166-
Allowable values (1: Randomness, 2: CountOfBitsChanged), default (2 ). For GPU only support 2:CountOfBitsChanged blending.
167-
168-
The library holds two different functions which blend the initial (cheap) upscaled image with the RAISR filtered image. This can be a means of removing any aggressive or outlying artifacts that get introduced by the filtered image.
169-
### passes
170-
Allowable values (1,2), default(1)
171-
172-
`passes=2` enables a second pass. Adding a second pass can further enhance the output image quality, but doubles the time to upscale. Note that for second pass to work, the filter folder must contain 3 additional files: Qfactor_cohbin_2_8/10_2, Qfactor_strbin_2_8/10_2, filterbin_2_8/10_2
173-
### mode
174-
Allowable values (1,2), default(1). Requires flag passes=2”
175-
176-
Dictates which pass the upscaling should occur in. Some filters have the best results when it is applied on a high resolution image that was upscaled during a first pass by using mode=1. Alternatively, the Intel Library for VSR can apply filters on low resolution images during the first pass THEN upscale the image in the second pass if mode=2, for a different outcome.
177-
```
178-
./ffmpeg -i /input_files/input.mp4 -vf "raisr=threadcount=20:passes=2:mode=2" -pix_fmt yuv420p /output_files/out.yuv
179-
```
180-
### asm
181-
Allowable values ("avx512fp16", "avx512","avx2","opencl"), default("avx512fp16")
182-
183-
The VSR Library requires an x86 processor which has the Advanced Vector Extensions 2 (AVX2) available. AVX2 was first introduced into the Intel Xeon roadmap with Haswell in 2015. Performance can be further increased if the newer AVX-512 Foundation and Vector Length Extensions are available. AVX512 was introduced into the Xeon Scalable Processors (Skylake gen) in 2017. Performance improves again with the introduction of AVX-512FP16, which uses _Float16 instead of float(32bit) with minimal precision and visual quality loss. AVX-512FP16 was introduced into the 4th gen Xeon (Sapphire Rappids) in 2022. The VSR Library will always check for the highest available ISA first, then fallback according to what is available (AVX-512FP16/AVX512/AVX2). However if the use case requires it, this asm parameter allows the default behavior to be changed. User can also choose opencl if the opencl is supported in their system.
18491

18592
# How to Contribute
18693
We welcome community contributions to the Open Visual Cloud repositories. If you have any idea how to improve the project, please share it with us.

docs/advanced usage

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
1+
## Advanced Usage ( through Exposed Parameters )
2+
The FFmpeg plugin for Enhanced RAISR exposes a number of parameters that can be changed for advanced customization.
3+
### threadcount (only for CPU)
4+
Allowable values (1,120), default (20)
5+
6+
Sets the number of software threads used by the algorithm. With values from 1 to 120, each thread operates on a segment of the image, so efficient threading can greatly increase the performance of the upscale. The value itself is the number of threads allocated.
7+
### filterfolder
8+
Allowable values: (Any folder path containing the 4 required filter files: Qfactor_cohbin_2_8/10, Qfactor_strbin_2_8/10, filterbin_2_8/10, config), default (“filters_2x/filters_lowres”)
9+
10+
Changing the way Enhanced RAISR is trained (using different parameters and datasets) can alter the way the ML-based algorithms upscale. The current release provides 3 filters for 2x upscaling and 2 filters for 1.5x upscaling; currently, 1.5x upscaling supports only 8-bit. For each filter, you can find the training information in the filternotes.txt file of its filter folder. The following is a brief introduction to the usage scenarios of each filter.
11+
<table border="1">
12+
<tbody>
13+
<tr>
14+
<th rowspan=2>Upscaling</th>
15+
<th rowspan=2>Filters</th>
16+
<th rowspan=2>Resolution (recommendation)</th>
17+
<th rowspan=2>Usage</th>
18+
<th colspan=2>Effect</th>
19+
</tr>
20+
<tr>
21+
<th rowspan=1>1pass</th>
22+
<th rowspan=1>2pass</th>
23+
</tr>
24+
<tr>
25+
<td rowspan=3>2x(support 8-bit and 10-bit)</td>
26+
<td >filters_lowres</td>
27+
<td >low resolution
28+
360p->720p,540p->1080p</td>
29+
<td >filterfolder=filters_2x/filters_lowres:passes=1/2</td>
30+
<td >2x upscaling</td>
31+
<td >2x upscaling and sharpening</td>
32+
</tr>
33+
<tr>
34+
<td >filters_highres</td>
35+
<td >high resolution
36+
1080p->4k</td>
37+
<td >filterfolder=filters_2x/filters_highres:passes=1/2</td>
38+
<td >2x upscaling and sharpening</td>
39+
<td >2x upscaling and more sharpening than 1st pass</td>
40+
</tr>
41+
<tr>
42+
<td >filters_denoise</td>
43+
<td >no limitation</td>
44+
<td >filterfolder=filters_2x/filters_denoise:passes=2:mode=2</td>
45+
<td >denoising only for input</td>
46+
<td >2x upscaling and sharpening</td>
47+
</tr>
48+
<tr>
49+
<td rowspan=2>1.5x(only support 8-bit)</td>
50+
<td >filters_highres</td>
51+
<td >high resolution
52+
720p->1080p</td>
53+
<td >filterfolder=filters_1.5x/filters_highres:passes=1:ratio=1.5</td>
54+
<td >1.5x upscaling and sharpening</td>
55+
<td >N/A</td>
56+
</tr>
57+
<tr>
58+
<td >filters_denoise</td>
59+
<td >no limitation</td>
60+
<td >filterfolder=filters_1.5x/filters_denoise:passes=2:mode=2:ratio=1.5</td>
61+
<td >denoising only for input</td>
62+
<td >1.5x upscaling and sharpening </td>
63+
</tr>
64+
</tbody>
65+
</table>
66+
67+
Please see the examples under the "Evaluating the Quality" section above where we suggest 3 command lines based upon preference.
68+
Note that for the second pass to work, the filter folder must contain 3 additional files: Qfactor_cohbin_2_8/10_2, Qfactor_strbin_2_8/10_2, filterbin_2_8/10_2.
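The file requirements above can be checked before launching FFmpeg. The following is a minimal sketch, not part of the library: the helper names are hypothetical, and it assumes that "Qfactor_cohbin_2_8/10" means `Qfactor_cohbin_2_8` for 8-bit input and `Qfactor_cohbin_2_10` for 10-bit input. It only covers the file names listed in this section.

```python
from __future__ import annotations

from pathlib import Path

# File-name stems taken from the list in this section.
FILTER_STEMS = ("Qfactor_cohbin_2", "Qfactor_strbin_2", "filterbin_2")


def required_filter_files(bits: int = 8, passes: int = 1) -> list[str]:
    """Return the file names a filter folder is expected to contain.

    Assumption: the "_8/10" notation selects the bit depth of the input.
    """
    if bits not in (8, 10):
        raise ValueError("bits must be 8 or 10")
    if passes not in (1, 2):
        raise ValueError("passes must be 1 or 2")
    names = [f"{stem}_{bits}" for stem in FILTER_STEMS] + ["config"]
    if passes == 2:
        # Second-pass files carry an extra "_2" suffix.
        names += [f"{stem}_{bits}_2" for stem in FILTER_STEMS]
    return names


def missing_filter_files(folder: str, bits: int = 8, passes: int = 1) -> list[str]:
    """Return the expected file names that are absent from `folder`."""
    root = Path(folder)
    return [name for name in required_filter_files(bits, passes)
            if not (root / name).is_file()]
```

Calling `missing_filter_files("filters_2x/filters_lowres", bits=8, passes=2)` returns an empty list when the folder is ready for a two-pass run under these assumptions.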
69+
### bits
70+
Allowable values (8: 8-bit depth, 10: 10-bit depth), default (8)
71+
72+
The model supports 8-bit and 10-bit depth input. Use an HEVC encoder to encode the yuv420p10le format.
73+
```
74+
./ffmpeg -y -i [10bits video clip] -vf "raisr=threadcount=20:bits=10" -c:v libx265 -preset medium -crf 28 -pix_fmt yuv420p10le output_10bit.mp4
75+
```
76+
### range
77+
Allowable values (video: video range, full: full range), default (video)
78+
79+
The implementation caps color values within the selected range (video or full).
80+
```
81+
./ffmpeg -y -i [image/video file] -vf "raisr=threadcount=20:range=full" outputfile
82+
```
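To illustrate what capping to a range means, here is a small sketch for luma samples. The limits are the conventional ones for limited (video) range, 16 to 235 at 8-bit and scaled by 4 at 10-bit. Whether the library applies exactly these limits is an assumption, and the function names are hypothetical.

```python
def luma_limits(bits: int = 8, value_range: str = "video") -> tuple:
    """Return (low, high) luma limits for the given bit depth and range.

    Assumption: "video" means the conventional limited range
    (16..235 at 8-bit, 64..940 at 10-bit); "full" uses every code value.
    """
    if bits not in (8, 10):
        raise ValueError("bits must be 8 or 10")
    if value_range == "full":
        return 0, (1 << bits) - 1
    if value_range == "video":
        shift = bits - 8
        return 16 << shift, 235 << shift
    raise ValueError("value_range must be 'video' or 'full'")


def cap_luma(value: int, bits: int = 8, value_range: str = "video") -> int:
    """Clamp a single luma sample into the selected range."""
    low, high = luma_limits(bits, value_range)
    return max(low, min(high, value))
```

With these limits, an 8-bit sample of 250 is capped to 235 in video range and left unchanged in full range.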
83+
### blending
84+
Allowable values (1: Randomness, 2: CountOfBitsChanged), default (2). The GPU path supports only 2: CountOfBitsChanged blending.
85+
86+
The implementation holds two different functions which blend the initial (cheap) upscaled image with the Enhanced RAISR filtered image. This can be a means of removing any aggressive or outlying artifacts that get introduced by the filtered image.
87+
### passes
88+
Allowable values (1,2), default(1)
89+
90+
`passes=2` enables a second pass. Adding a second pass can further enhance the output image quality, but doubles the time to upscale. Note that for second pass to work, the filter folder must contain 3 additional files: Qfactor_cohbin_2_8/10_2, Qfactor_strbin_2_8/10_2, filterbin_2_8/10_2
91+
### mode
92+
Allowable values (1,2), default(1). Requires the flag `passes=2`.
93+
94+
Dictates which pass the upscaling should occur in. Some filters give the best results when they are applied to a high-resolution image that was upscaled during the first pass, which is what mode=1 does. Alternatively, with mode=2, Enhanced RAISR applies filters to the low-resolution image during the first pass and then upscales the image in the second pass, for a different outcome.
95+
```
96+
./ffmpeg -i /input_files/input.mp4 -vf "raisr=threadcount=20:passes=2:mode=2" -pix_fmt yuv420p /output_files/out.yuv
97+
```
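The `-vf` strings in the examples can also be assembled programmatically. The sketch below builds the option string from the parameters documented in this section and rejects values outside the listed ranges. The function name is hypothetical, and treating `mode=2` without `passes=2` as an error is an interpretation of the "requires" note above, not confirmed library behavior.

```python
def build_raisr_filter(threadcount: int = 20, bits: int = 8,
                       passes: int = 1, mode: int = 1) -> str:
    """Build a `raisr=...` filter string for FFmpeg's -vf option.

    Only options that differ from the documented defaults are emitted,
    except threadcount, which the examples always pass explicitly.
    """
    if not 1 <= threadcount <= 120:
        raise ValueError("threadcount must be between 1 and 120")
    if bits not in (8, 10):
        raise ValueError("bits must be 8 or 10")
    if passes not in (1, 2):
        raise ValueError("passes must be 1 or 2")
    if mode not in (1, 2):
        raise ValueError("mode must be 1 or 2")
    if mode == 2 and passes != 2:
        # Interpretation of "Requires the flag passes=2".
        raise ValueError("mode=2 requires passes=2")

    options = [f"threadcount={threadcount}"]
    if bits != 8:
        options.append(f"bits={bits}")
    if passes != 1:
        options.append(f"passes={passes}")
    if mode != 1:
        options.append(f"mode={mode}")
    return "raisr=" + ":".join(options)
```

For example, `build_raisr_filter(passes=2, mode=2)` produces the same filter string as the command above.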
98+
### asm
99+
Allowable values ("avx512fp16", "avx512","avx2","opencl"), default("avx512fp16")
100+
101+
Enhanced RAISR requires an x86 processor with Intel® Advanced Vector Extensions 2 (Intel® AVX2). Intel AVX2 was first introduced into the Intel Xeon roadmap with Haswell in 2015. Performance can be further increased if the newer Intel® AVX512 Foundation and Vector Length Extensions are available. Intel AVX512 was introduced into the Xeon Scalable Processors (Skylake gen) in 2017. Performance improves again with FP16 for Intel AVX512, which uses _Float16 instead of float (32-bit) with minimal loss of precision and visual quality. FP16 was introduced into the 4th Gen Intel Xeon Scalable Processors (formerly known as Sapphire Rapids) in 2022. The implementation always checks for the highest available Instruction Set Architecture (ISA) first, then falls back according to what is available. However, if the use case requires it, the asm parameter allows the default behavior to be changed. Users can also choose opencl if OpenCL is supported on their system.
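The fallback order described above can be sketched as a simple preference list. This illustrates the selection logic only and is not the library's detection code: the function name is hypothetical, and the caller is assumed to supply the set of ISAs the machine supports, using the same names as the asm parameter.

```python
# Highest-performing ISA first, as described in the text.
ISA_PREFERENCE = ("avx512fp16", "avx512", "avx2")
ALLOWED_ASM = ISA_PREFERENCE + ("opencl",)


def select_asm(available, requested=None):
    """Pick the asm value to use.

    `available` is the set of ISA names the machine supports (assumed to
    be detected elsewhere). `requested` mirrors the asm parameter and
    overrides the default choice when given.
    """
    if requested is not None:
        if requested not in ALLOWED_ASM:
            raise ValueError(f"asm must be one of {ALLOWED_ASM}")
        return requested
    for isa in ISA_PREFERENCE:
        if isa in available:
            return isa
    raise RuntimeError("at least AVX2 is required")
```

On a machine that reports `{"avx2", "avx512"}`, the sketch selects `avx512`; passing `requested="avx2"` overrides that choice.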

docs/images/Pipeline_flow.png

26 KB
30.3 KB
88.3 KB

docs/images/RAISR_AWS.png

43.9 KB

docs/images/RAISR_baremetal.png

48.6 KB
