The simplest possible video has a single frame, and this is how scikit-video interprets images. The following example loads an image and inspects its shape:
import skvideo.io
# a frame from the bigbuckbunny sequence
vid = skvideo.io.vread("vid_luma_frame1.png")
T, M, N, C = vid.shape
print("Number of frames: %d" % (T,))
print("Number of rows: %d" % (M,))
print("Number of cols: %d" % (N,))
print("Number of channels: %d" % (C,))
Running this code yields this output:
Number of frames: 1
Number of rows: 720
Number of cols: 1280
Number of channels: 3
As you can see, the 1280x720 image loads without problems and is treated as an RGB video with 1 frame.
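Because a loaded image is a 1-frame video, the frame itself is the first entry along the time axis. The sketch below uses a zero-filled NumPy array as a stand-in for the array returned by vread (an assumption, so that the snippet runs without the image file or ffmpeg); the variable names are illustrative only.

```python
import numpy as np

# Stand-in for the array returned by skvideo.io.vread(...) above:
# 1 frame, 720 rows, 1280 columns, 3 channels. (Assumed shape and dtype.)
vid = np.zeros((1, 720, 1280, 3), dtype=np.uint8)

# Index the time axis to get a plain (rows, cols, channels) image.
frame = vid[0]
print(frame.shape)  # (720, 1280, 3)
```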
If you’d like to upscale this image during loading, you can run the following:
import skvideo.io
# upscale frame from the bigbuckbunny sequence by a factor of 2
vid = skvideo.io.vread("vid_luma_frame1.png",
    outputdict={
        "-sws_flags": "bilinear",
        "-s": "2560x1440"
    }
)
T, M, N, C = vid.shape
print("Number of frames: %d" % (T,))
print("Number of rows: %d" % (M,))
print("Number of cols: %d" % (N,))
print("Number of channels: %d" % (C,))
Running this code yields this output:
Number of frames: 1
Number of rows: 1440
Number of cols: 2560
Number of channels: 3
Notice that the scaling algorithm is selected by name through the “-sws_flags” option, here “bilinear”. Other scaler options that ffmpeg/avconv support can be passed in the same way.
Note that although ffmpeg/avconv supports relative scaling, scikit-video doesn’t support that yet. Future support can be added by parsing the video filter “-vf” commands, so that scikit-video is aware of the buffer size expected from the ffmpeg/avconv subprocess.
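Until relative scaling is supported, one workaround is to compute the absolute target size yourself and pass it through “-s”. The following is a minimal sketch under the assumption that the source width and height are already known; absolute_size is a hypothetical helper, not part of scikit-video, and “bicubic” is used only as an example of another ffmpeg scaler name.

```python
def absolute_size(width, height, factor):
    """Return a "WxH" size string for a frame scaled by `factor`.

    Hypothetical helper for illustration; not part of scikit-video.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Round to the nearest pixel. Some codecs also require even
    # dimensions, which this sketch does not enforce.
    new_w = int(round(width * factor))
    new_h = int(round(height * factor))
    return "%dx%d" % (new_w, new_h)

# Options in the same form as the upscaling example above.
outputdict = {
    "-sws_flags": "bicubic",
    "-s": absolute_size(1280, 720, 2),
}
print(outputdict["-s"])  # 2560x1440
```

The resulting dictionary can then be passed as the outputdict argument to vread, exactly as in the upscaling example.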
Of course, images can be written just as easily as they can be read.
import skvideo.io
import numpy as np
# create random data, sized 1280x720
image = np.random.random(size=(720, 1280))*255
print("Random image, shape (%d, %d)" % image.shape)
skvideo.io.vwrite("output.png", image)
Again, the output:
Random image, shape (720, 1280)
First, notice that the shape of the image is height x width. scikit-video always interprets image and video arrays as height first, then width, which is the standard matrix (row, column) layout. Second, notice that images do not need to be in the 4-dimensional video format to be written. scikit-video interprets the shapes (1, M, N), (M, N), and (M, N, C) as images, where M is the height, N is the width, and C is the number of channels. Internally, scikit-video standardizes shapes and datatypes so that data is read and written accurately through the ffmpeg/avconv subprocess.
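To make the shape handling concrete, here is a minimal sketch of how the three image shapes could be mapped to the 4-dimensional (T, M, N, C) layout. The helper to_video_shape is hypothetical, and its rule for telling (M, N, C) apart from (T, M, N), treating a last axis of size 1 to 4 as channels, is an assumption for illustration rather than a statement of scikit-video's internal logic.

```python
import numpy as np

def to_video_shape(data):
    """Reshape image-like input to (T, M, N, C). Hypothetical helper."""
    data = np.asarray(data)
    if data.ndim == 2:
        # (M, N): one grayscale frame.
        m, n = data.shape
        return data.reshape(1, m, n, 1)
    if data.ndim == 3:
        a, b, c = data.shape
        if c in (1, 2, 3, 4):
            # Small last axis: assume (M, N, C), one multi-channel frame.
            return data.reshape(1, a, b, c)
        # Otherwise assume (T, M, N): grayscale frames.
        return data.reshape(a, b, c, 1)
    if data.ndim == 4:
        # Already (T, M, N, C).
        return data
    raise ValueError("expected 2, 3, or 4 dimensions, got %d" % data.ndim)

print(to_video_shape(np.zeros((720, 1280))).shape)     # (1, 720, 1280, 1)
print(to_video_shape(np.zeros((1, 720, 1280))).shape)  # (1, 720, 1280, 1)
print(to_video_shape(np.zeros((720, 1280, 3))).shape)  # (1, 720, 1280, 3)
```

All three inputs end up as a single frame with explicit height, width, and channel axes, which gives a fixed frame size to work with downstream.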