I use my Samsung Galaxy J7 Nxt mobile phone to shoot my videos. I do
not use a separate audio recorder (because I do not have
one). I use only the front camera, because it has fixed
focus. I mount the phone on a tripod in landscape orientation
while shooting. The tripod is an invaluable tool.
I use a variety of free tools to create the superimposed
pictures: Inkscape (for 2D drawings), GIMP
(occasionally for touching things up), and ArtOfIllusion (for
photorealistic 3D animations/stills). These are all artists'
tools, and are often not
enough for the mathematical objects I need to create. Then I
write my own program to create the basic shapes, which are then
loaded into one of these for the artistic finishing touches. My
programs are mostly written in R and J. All the screencasts are
done in Kazam (no audio, since my laptop audio input does not work).
After shooting the parts I use the free software kdenlive to do
the post-processing.
My videos are nowhere even close to what a good educational video
should be. Even then, the process to produce them is somewhat
intricate. I shall split up the details in a number of
sections.
A good educational video, in my opinion, should have 5
characteristics. I decided upon these after comparing various
online educational videos (BBC, Khan Academy,
ThreeBlueOneBrown, FilmMakersIQ and others). These guidelines are
for educational video makers like myself, who do not have much control over
the topic to be presented, and cannot afford
expensive visuals (a close-up video of a blue whale deep in the Pacific does not
need my guidelines to make itself popular!).
-
The video should show the presenter talking to the viewer. This is
because human facial expressions and gestures constitute one of
the most potent languages known to mankind. What is more, most
of us actually enjoy reading this language. Simply by talking
right into the camera, the presenter can convey so much more
information than is possible with voice alone. If there is one
reason why Khan Academy videos suck, while BBC videos don't, this
is it. Also, it is not the beauty of the presenter's face that
attracts the viewer. It is the natural gestures that go with
normal oral communication (eye contact, hand movement, facial
expressions) that count. So a screencast with a talking face
that does not look at the audience does not help at all.
-
The video must show variation. The video occupies
only an insignificantly small rectangle in front of the
viewer. There are plenty of interesting things happening
before the viewer's eyes outside that rectangle: birds flying, trees
moving, people moving around. The content of the video has to
compete with all these distractions to keep the viewer's
attention glued to itself. Nobody likes to stare at some dumb old
scene all the time. So it is important to vary the visuals.
-
The video must be well punctuated. Just as a book
should have chapters and sections and subsections and bulleted
items for easy navigation, so should an educational video. Suitable
audio jingles, fades and change of background should be used for
this purpose.
-
A video must be restful to the eyes. Educational
videos like the ones shown in threeblueonebrown are based on
catchy themes that naturally attract the viewer. But when you
want to cover typical coursework in a video, you cannot always
rely on the topic being catchy. Often you'll need to drag your
audience through details that are not so appetizing. When you
suspect that this is the case, your video must move slowly,
allowing longer fades, holding the visuals longer on screen, and
occasionally injecting soothing visuals like birds chirping.
-
A video must use Foley and reverse-Foley effects.
A Foley effect means "you hear what you see". If a paper is
crumpled, you must hear the corresponding sound. Most often the
real sound is so feeble, that it is not audible. Then a similar
sound must be added later during editing. This apparently silly
thing adds unbelievably to the charm of a video. However, even
more important for an educational video is what I have called
the reverse-Foley effect: the audience must see what they
hear. Information flows to the viewer via two channels, audio and
video. It is too easy for the presenter to get carried away by a
topic, and just keep on talking. Then more information is
channeled via the audio and the video merely shows the presenter
standing. Such a skew puts strain on the audience, leading to
quick fatigue. A good video must distribute the info more or less
symmetrically over the audio and video channels.
An educational video typically uses three types of parts, based on
the denseness of the information presented:
-
Motivation: Here the presenter talks emotionally to motivate the
audience. Not much information is transmitted, only the
enthusiasm.
-
Derivation: Here the presenter is providing lots of
information. Indeed, so much that the presenter himself might
find it difficult to keep track of things. A typical example is
where a mathematical derivation is being done.
-
Discussion: Here the information content is moderate. The presenter can
rattle off the whole thing easily from memory, but the audience
will not be able to follow that easily.
For the first type, it is enough to show the presenter in close
up (or medium shot to show hand gestures). No visual other than
the body language is generally called for.
For the second type, the best way is to show the presenter in
front of a black/white board, and record the entire teaching
session.
The third type is the most interesting, and offers the maximum in
terms of what an educational video can do. The presenter keeps on
talking while various diagrams, animations, etc. are superimposed
in a synchronized manner. This is where the editing phase becomes
tough. But if you can manage it, it is surely worth it. Indeed,
the aim of an educational video maker should be to avoid the
second type as much as possible, and replace it with the third
type.
Making a video is a somewhat long process. So careful planning
or scripting helps. This includes what to say, what to
show, deciding upon camera angles and backgrounds,
etc. While scripting may look like a good idea (and professional
video makers cannot think of video making without a script), it
is nevertheless a source of trouble in itself.
- First, it takes
up a LOT of extra time and energy. Being a teacher by profession I can
easily lecture for an hour on a familiar topic. The ideas and
expressions come rolling automatically, and I can make
appropriate diagrams and derivations on the fly. It is just like
moving your facial muscles in a coordinated way while eating. You
just do it naturally, without thinking. But if you are asked to describe
all those muscle movements in advance, you'll be hard put.
- Second, not being an actor, I find it difficult to repeat
words from a prepared script without appearing mechanical. And
losing my spontaneity is the last thing I want.
But still making a script has its definite advantages. The most
important of these is the ability to do a "coverage shoot", which I shall discuss
later. But all in all, I generally prefer to work without a
script.
I start by chalking out the following points:
- the stills/animations to be superimposed. If there is a
challenging one, then I make it first, just to be sure that it
is possible. Also, this gives me an idea of the amount of screen
space that will be needed by it.
- the general flow of ideas, e.g., answering questions like: should I start with a
definition, or arrive at the definition after some
motivation?
- Location: where should I shoot? On the roof, or in the drawing room? Or
all over the house?
Shall I need the whiteboard? How much of each type (motivation, discussion and derivation) do I need?
The shooting part, surprisingly, is the least troublesome phase
of the whole process of making an educational video. It is done
almost in real time (e.g., if I am shooting a 30 min video,
I'll typically finish the shooting in 45 min or so). This is
partly because I use spontaneous flow instead of following a
script. So the experience is more like acting on stage and not
acting in a movie. The speed of the shooting is also partly due
to the fact that the fan has to remain switched off during
the shoot (for the sake of audio quality), and I do not enjoy
sweating in front of the camera.
Here is how I do the shooting.
First I mount the phone on the tripod. Landscape mode, front
camera. Then I think about what I want to say in the first
shot. I keep it simple: one idea per shot. If the idea changes,
then so must the shot. That provides a natural punctuation. I
look into the camera (hard to do, as my eyes like to look at my
image on the phone, and not at the camera lens, which is a tiny
inconspicuous dot near the margin). I stand pretty close to the
camera, so that my voice is clearly picked up. This causes my
head to look more bulbous than it actually is (Hey, now you know
that I am much more handsome than I look in the videos).
Once I believe that I have finished my first idea, I stop recording
(stop, not pause, because I like to keep my files small, as it
helps me avoid loading problems later).
Once my first shot is over, I quickly think about a natural
continuation to the next idea, and shoot that in
a different setting (different camera angle, different corner
of the room etc). I think that I should use some 4 different
settings all through the video. Each setting should be for one
type of idea, e.g., motivation, derivation, critical
thinking. But I have not tried out this type-to-setting mapping
in any of my videos yet.
This is a smart technique that I picked up while working, before I
learned the term "coverage shoot". It means shooting the same
thing from multiple angles. Then later you might mix those
different angles during editing. You see this all the time in
movies: a dialogue between A and B is partly shown over A's
shoulder, partly over B's shoulder, and partly from the
side. However, in an educational video of the type discussed
here, the presenter has to look straight at the audience (i.e.,
into the camera) all the time. That does not leave much scope for
exciting coverage shoots. There are two exceptions:
- First, you may link up two shots nicely using an idea like
coverage shoot. Imagine moving from a motivational shot to a
derivation shot. The motivational shot ends with you saying
"Let's look at the proof." Start the derivation shot with
precisely the same sentence. While editing show the motivational
shot only up to "Let's look at the..." and immediately start the
derivation shot with "...proof."
- Sometimes, you want to say words that are carefully
chosen. Say there are five such sentences and you do not want to
change background setting during them. So you need to say those
sentences in one go. But if the sentences are pre-worded, you
cannot rely on your spontaneity here. In such a case, you might
find it difficult to say more than two sentences at a time. Here
coverage shoot helps. Say the first two sentences before the
camera. Then move the camera closer to or away from you
retaining the same angle. Say the 2nd and 3rd sentences (the 2nd
sentence gets repeated). Again move the camera back to the
original position and say the 3rd and 4th sentences, and so
on. Then while editing jump from one shot to the other during
the overlap sentences. Such cuts are unobtrusive, and it appears
that you are speaking all the 5 sentences smartly in one go.
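The overlapping pattern described above can be written down as a small shot plan. This is only an illustrative sketch in Python (not the R or J used elsewhere for the animations); the function name `overlapping_shots` and the parameter `per_shot` are hypothetical names introduced for this example.

```python
def overlapping_shots(sentences, per_shot=2):
    """Split sentences into shots in which each shot repeats the last
    sentence of the previous one, giving an overlap to cut on."""
    if per_shot < 2:
        raise ValueError("need at least 2 sentences per shot to overlap")
    shots = []
    start = 0
    while True:
        shots.append(sentences[start:start + per_shot])
        if start + per_shot >= len(sentences):
            break
        start += per_shot - 1  # step back one sentence to create the overlap
    return shots


if __name__ == "__main__":
    plan = overlapping_shots(["S1", "S2", "S3", "S4", "S5"])
    for number, shot in enumerate(plan, start=1):
        # Alternate between the two camera distances, same angle.
        where = "original position" if number % 2 == 1 else "camera moved"
        print(f"Shot {number} ({where}): {' + '.join(shot)}")
```

For five sentences this gives four shots (1+2, 2+3, 3+4, 4+5), and each cut can be placed anywhere inside the repeated sentence.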
I do all my screencasts using the free software Kazam. Since my
laptop audio input does not work, I record the audio separately
using the phone. In fact, here I do something like a coverage
shoot. I keep the video camera of my phone focused on me,
while I run Kazam on my laptop. Then I say "O..K" very clearly,
while typing the letters on screen. While editing, this helps me
to sync the camera video with the screencast.
Also, since I have both footage of my face and the screencast, I
can move between them during editing. But I have found that
making the screencast 50% transparent and superposing it on the
footage of my face keeps the best of both worlds.
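The arithmetic behind the "O..K" sync and the 50% superposition is simple enough to sketch. The Python below is only an illustration under stated assumptions: the marker times are made-up readings, the function names are hypothetical, and the blend is the ordinary weighted average of two pixels, which may differ from what the editor does internally.

```python
def sync_offset(marker_in_camera, marker_in_screencast):
    """Seconds to add to a screencast time to get the matching camera time."""
    return marker_in_camera - marker_in_screencast


def to_camera_time(screencast_time, offset):
    """Map a moment in the screencast onto the camera footage."""
    return screencast_time + offset


def blend(top, bottom, opacity=0.5):
    """Weighted average of two RGB pixels; opacity is the weight of `top`."""
    return tuple(round(opacity * t + (1 - opacity) * b)
                 for t, b in zip(top, bottom))


if __name__ == "__main__":
    # Hypothetical readings: "O..K" is heard 12.4 s into the phone footage
    # and the letters appear 3.1 s into the screencast.
    offset = sync_offset(12.4, 3.1)
    print("shift the screencast by", round(offset, 2), "seconds")
    # A screencast pixel laid at 50% over a pixel of the face footage.
    print(blend((200, 100, 0), (100, 100, 100)))
```

Once the offset is known, every later event in the screencast can be located in the camera footage with `to_camera_time`.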
PIP (or Picture In Picture) stands for all the little
pictures/animations that are superimposed on the main video. They are
what makes an educational video stand out. They are essential
for the discussion shots. I prefer to use PIPs with
transparent background and 50% transparency. This, I believe, makes them merge better
with the video, and also saves screen space as I need not find a
separate screen space for myself. I use different techniques to
suit different requirements:
- Still images: These are pop-ups like a
graph or a formula that appear somewhere on screen. I use Inkscape to create
them. There is a "TexText" plugin, which turns LaTeX
formulae into images.
- 2D Animations: I tried to use a free software called Synfig
for this purpose initially. But it is too crude for my taste. So
I have abandoned it. Now I write an R script for each
animation. The script dumps a sequence of images in a separate
folder. I generate 30 frames for each second of animation. Most
animations are exactly 1 sec in duration.
- 3D animations: I use the free software ArtOfIllusion for
this purpose. Again, I use this software to generate a sequence
of images and dump them into a separate folder.
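The "dump a sequence of images into a folder" idea for the 2D animations can be sketched as follows. This is not the R script mentioned above: it is a Python stand-in using only the standard library, it writes SVG frames (which have a transparent background by default) rather than raster images, and the moving dot, the function names and the file naming scheme are all assumptions made for the example.

```python
import math
import tempfile
from pathlib import Path


def frame_svg(t, width=320, height=240):
    """Return one frame as SVG text; t runs from 0 (start) to 1 (end)."""
    x = 20 + t * (width - 40)                        # dot moves left to right
    y = height / 2 - 80 * math.sin(2 * math.pi * t)  # along one sine period
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
        f'<circle cx="{x:.1f}" cy="{y:.1f}" r="8" fill="red"/></svg>'
    )


def write_frames(folder, fps=30, seconds=1.0):
    """Write fps*seconds numbered frames into `folder`; return their paths."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    n_frames = round(fps * seconds)
    paths = []
    for i in range(n_frames):
        t = i / (n_frames - 1) if n_frames > 1 else 0.0
        path = folder / f"frame_{i:03d}.svg"
        path.write_text(frame_svg(t), encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        frames = write_frames(tmp)   # 30 frames for a 1 second animation
        print(len(frames), "frames, first:", frames[0].name)
```

Zero-padded frame numbers keep the files in the right order when the folder is loaded as an image sequence; the frames would still need converting to a raster format if the editor does not accept SVG.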
ArtOfIllusion is an artists' program. Often I need to model
objects that are complicated but have simple mathematical
descriptions. One example is the soap film frame (or the film
itself). Then I first create an .obj file, which is a super
simple format for describing 3D objects. It is just a list of
points followed by a list of triangular faces. Such a file may
be created using a text editor (or output from R or J). Then I
import it inside ArtOfIllusion, and choose the artistic details
(camera angle, lighting, texture, colour etc).
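To make the .obj description concrete, here is a minimal sketch that writes such a file for a simple mathematical surface. It is in Python rather than R or J, and the surface (the saddle z = x*y), the grid resolution and the function name `saddle_obj` are illustrative assumptions, not one of the models used in the videos. The only format facts relied on are that `v x y z` lines list points, `f i j k` lines list triangles, and indices start at 1.

```python
def saddle_obj(n=10, size=1.0):
    """Return OBJ text for the surface z = x*y on an n-by-n grid of cells."""
    lines = []
    # Points: one "v x y z" line per grid node, row by row.
    for i in range(n + 1):
        for j in range(n + 1):
            x = -size + 2 * size * i / n
            y = -size + 2 * size * j / n
            lines.append(f"v {x:.4f} {y:.4f} {x * y:.4f}")

    def index(i, j):
        return i * (n + 1) + j + 1  # OBJ indices start at 1, not 0

    # Faces: each grid cell is split into two triangles.
    for i in range(n):
        for j in range(n):
            a, b = index(i, j), index(i + 1, j)
            c, d = index(i + 1, j + 1), index(i, j + 1)
            lines.append(f"f {a} {b} {c}")
            lines.append(f"f {a} {c} {d}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = saddle_obj(n=2)
    print(text)
    # To save it for importing into a 3D program:
    # open("saddle.obj", "w", encoding="utf-8").write(saddle_obj())
```

Keeping the geometry in a generated text file means the mathematics stays exact, while camera angle, lighting and texture are still chosen by hand after import.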
I use the free software kdenlive for all the editing.
Editing for me mostly means dumping all the shots on the
timeline, and chopping off the extra bits at the two ends (where
I extend my hand to stop recording). Occasionally, I need to
remove a little coughing or faltering. Since my distance from
camera is not always the same, the audio volume tends to differ
from shot to shot. I manually adjust the volume to achieve consistency.
This much is pretty easy. The hard part is to insert the PIPs.
For this I first play a shot in my editor, carefully marking out
all the time points where PIPs are to be added. So I get a list
of time points together with brief descriptions of what I intend
to put there. Then I write an R program to generate all the
I generally steer clear of special effects, because they are
rather time consuming and end up creating an alienating
environment that is not desirable in an educational video. Here
are the few special effects that I do use occasionally:
-
Spatial sync: Sometimes I point to or look at a PIP during
the video, as if it is a real object floating in the air. For these, I roughly
decide upon the position and then point or look at that place
approximately while shooting. Later I place the PIP in that
place while editing.
-
Green screening: This is making some part of a video
transparent, so that something else shows behind it. Most
YouTube tutorial videos scared me about the requirements for
this effect. It seemed that one needs a sophisticated lighting
arrangement and professional green screens to achieve this. But it
turned out to be pretty easy to implement using ordinary home lighting
(daylight or fluorescent light) and a piece of green cloth I
bought from the local tailor shop. However, green screening does
add an overhead during editing.
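The idea behind green screening can be shown on a handful of pixels. The sketch below is a deliberately crude Python illustration, not how kdenlive or any other editor implements its keying: it treats a pixel as "green screen" when its green channel exceeds both red and blue by more than a threshold, and the name `chroma_key` and the default `margin=40` are assumptions chosen for the example.

```python
def chroma_key(pixels, margin=40):
    """Turn RGB pixels into RGBA, making strongly green pixels transparent.

    A pixel counts as green screen when its green channel exceeds both
    red and blue by more than `margin` (an assumed, tunable threshold).
    """
    keyed = []
    for r, g, b in pixels:
        is_green = g - max(r, b) > margin
        keyed.append((r, g, b, 0 if is_green else 255))  # alpha 0 = see-through
    return keyed


if __name__ == "__main__":
    cloth = (30, 200, 40)    # a pixel of the green cloth
    skin = (180, 150, 120)   # a pixel of the presenter
    print(chroma_key([cloth, skin]))
```

Uneven home lighting makes the cloth's green vary from pixel to pixel, so a threshold like `margin` usually needs some trial and error, which is part of the editing overhead mentioned above.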