This notebook splits a video into consecutive frame-count–based chunks and saves them into a subfolder named after the source video.
Example: user/xyz/documents/videos/example.mp4 → outputs into user/xyz/documents/videos/example/ as example_chunk01.mp4, example_chunk02.mp4, …
Overview
What this does
Creates an output subdirectory with the base filename of your video.
Splits the video into consecutive chunks, each containing a user-defined number of frames.
Writes a final remainder chunk if the video length isn’t divisible by your chunk size.
Names chunks as <video_stem>_chunkXX.mp4 (e.g., example_chunk01.mp4).
Why frames (not time)?
Splitting by frame count gives every chunk except possibly the last exactly the same number of frames, which keeps splits deterministic for annotation, ML, or analysis workflows.
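As a rough illustration of what a frame-based split produces, the sketch below computes the chunk sizes you would expect for a given total frame count. `expected_chunk_sizes` is a hypothetical helper that is not part of this notebook, and it assumes the total frame count is known in advance.

```python
from typing import List


def expected_chunk_sizes(total_frames: int, chunk_num_frame: int) -> List[int]:
    """Return the frame count of each chunk, in order (hypothetical helper)."""
    if total_frames < 0:
        raise ValueError("total_frames must be non-negative.")
    if chunk_num_frame <= 0:
        raise ValueError("chunk_num_frame must be a positive integer.")
    # Number of full chunks, plus whatever is left over for the last chunk.
    full_chunks, remainder = divmod(total_frames, chunk_num_frame)
    sizes = [chunk_num_frame] * full_chunks
    if remainder:
        sizes.append(remainder)  # final, shorter chunk
    return sizes


print(expected_chunk_sizes(2500, 1000))  # [1000, 1000, 500]
```

A 2500-frame video split at 1000 frames per chunk therefore yields two full chunks and one 500-frame remainder chunk.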
Requirements
Python 3.8+
OpenCV (cv2) installed with a working backend (FFmpeg or GStreamer, depending on OS).
Sufficient disk space to write the chunked files.
Tip (macOS/Linux): If your OpenCV build lacks codecs, install or enable FFmpeg. On macOS, brew install ffmpeg can help; on Linux, install it via your package manager.
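One way to see which video backends your OpenCV build reports is to inspect the text returned by `cv2.getBuildInformation()`. The sketch below is a minimal example: `backend_enabled` is a hypothetical helper, and it assumes the build summary contains lines of the form `FFMPEG: YES ...`, which is typical but may differ between builds.

```python
def backend_enabled(build_info: str, backend: str) -> bool:
    """Return True if the build summary has a '<backend>: YES ...' line.

    Hypothetical helper; the line format is an assumption about the summary text.
    """
    prefix = backend.upper() + ":"
    for line in build_info.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith(prefix):
            return stripped[len(prefix):].strip().upper().startswith("YES")
    return False


try:
    import cv2  # optional here, so the helper above stays usable without OpenCV
except ImportError:
    cv2 = None

if cv2 is not None:
    info = cv2.getBuildInformation()
    for name in ("FFMPEG", "GStreamer"):
        status = "enabled" if backend_enabled(info, name) else "not reported as enabled"
        print(f"{name}: {status}")
```

If neither backend is reported as enabled, reading or writing .mp4 files is likely to fail regardless of the codec you choose.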
Install OpenCV (if needed)
Uncomment and run the cell below if you need to install OpenCV. If you’re in a restricted network, install locally on your machine.
#!pip install --upgrade opencv-python
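# If you need an FFmpeg-enabled backend, install FFmpeg system-wide
# (e.g., `brew install ffmpeg` on macOS, or your Linux package manager).
# Note: `pip install ffmpeg` does not install the FFmpeg binaries.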
1) Set Your Inputs
input_path: Absolute or relative path to your video file.
chunk_num_frame: Number of frames per chunk (positive integer).
codec: FourCC code for output (default mp4v; try avc1 or H264 if available on your system for smaller files).
from pathlib import Path
# >>> EDIT THESE <<<
input_path = Path("/Users/souvikmandal/Documents/example.mp4")
chunk_num_frame = 1000
codec = "mp4v" # alternatives: "avc1", "H264" (requires proper system codecs)
# No edits needed below
input_path = input_path.expanduser().resolve()
input_path
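The `codec` string must be exactly four characters, because a FourCC is four bytes packed into one integer. The sketch below shows that packing in plain Python (least significant byte first, which is how OpenCV's FourCC helper combines the characters). `fourcc_code` is a hypothetical name used only for illustration; the notebook itself relies on `cv2.VideoWriter_fourcc`.

```python
def fourcc_code(codec: str) -> int:
    """Pack a four-character code into a 32-bit integer, first character lowest."""
    if len(codec) != 4:
        raise ValueError(f"FourCC must be exactly 4 characters, got {codec!r}")
    # Character i occupies bits 8*i .. 8*i+7.
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(codec))


print(hex(fourcc_code("mp4v")))  # 0x7634706d
```

Checking the length up front gives a clearer error than the one raised when a three- or five-character string is unpacked into the FourCC call.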
2) Core Function
The function below reads frames sequentially and writes chunk files with the same FPS and resolution as the source video.
from pathlib import Path
import sys

import cv2


def split_video_by_frames(input_path: Path, chunk_num_frame: int, codec: str = "mp4v") -> Path:
    """Split a video into consecutive chunks by frame count.

    Args:
        input_path: Path to the input video.
        chunk_num_frame: Number of frames per chunk (must be > 0).
        codec: FourCC for output encoding (e.g., 'mp4v', 'avc1', 'H264').

    Returns:
        Path to the output directory where chunks are saved.
    """
    if not input_path.exists() or not input_path.is_file():
        raise FileNotFoundError(f"Input file not found: {input_path}")
    if chunk_num_frame <= 0:
        raise ValueError("chunk_num_frame must be a positive integer.")

    stem = input_path.stem          # e.g., "example"
    parent_dir = input_path.parent  # e.g., user/xyz/documents/videos
    output_dir = parent_dir / stem  # e.g., user/xyz/documents/videos/example
    output_dir.mkdir(parents=True, exist_ok=True)

    cap = cv2.VideoCapture(str(input_path))
    if not cap.isOpened():
        raise RuntimeError(f"Could not open video: {input_path}")

    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    if fps <= 0 or width <= 0 or height <= 0:
        print("[WARN] Could not read video metadata reliably. Proceeding with defaults if possible.", file=sys.stderr)

    fourcc = cv2.VideoWriter_fourcc(*codec)
    chunk_idx = 0
    frames_in_current_chunk = 0
    total_frames = 0
    writer = None

    def start_new_writer(index: int):
        nonlocal writer, frames_in_current_chunk
        out_path = output_dir / f"{stem}_chunk{index:02d}.mp4"
        # Fall back to 30 FPS if the source FPS could not be read.
        writer = cv2.VideoWriter(str(out_path), fourcc, fps if fps > 0 else 30.0, (width, height))
        if not writer.isOpened():
            raise RuntimeError(f"Could not open writer for: {out_path}")
        frames_in_current_chunk = 0
        print(f"[INFO] Writing: {out_path}")

    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # end of video
            if writer is None:
                # Open a writer only once a frame is available, so no empty
                # trailing file is left when the frame count divides evenly.
                chunk_idx += 1
                start_new_writer(chunk_idx)
            writer.write(frame)
            frames_in_current_chunk += 1
            total_frames += 1
            if frames_in_current_chunk >= chunk_num_frame:
                writer.release()
                writer = None  # the next frame starts a new chunk
    finally:
        # Release resources even if reading or writing fails part-way.
        if writer is not None:
            writer.release()
        cap.release()

    print("\n[SUMMARY]")
    print(f" Input video: {input_path}")
    print(f" Output dir : {output_dir}")
    print(f" Total frames processed: {total_frames}")
    return output_dir
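Because chunks are filled strictly in order, a frame index in the source video maps to a chunk number and an offset within that chunk by integer division. The sketch below illustrates this for annotation workflows; `locate_frame` is a hypothetical helper, and it assumes every source frame was decoded and written in order with none dropped.

```python
from typing import Tuple


def locate_frame(frame_index: int, chunk_num_frame: int) -> Tuple[int, int]:
    """Map a 0-based source frame index to (1-based chunk number, 0-based offset)."""
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative.")
    if chunk_num_frame <= 0:
        raise ValueError("chunk_num_frame must be a positive integer.")
    chunk_zero_based, offset = divmod(frame_index, chunk_num_frame)
    return chunk_zero_based + 1, offset  # chunk files are numbered from 01


print(locate_frame(2345, 1000))  # (3, 345)
```

With 1000 frames per chunk, source frame 2345 would be frame 345 of the third chunk file.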
3) Run the Splitter
Run the cell below to split your video using the parameters defined earlier.
out_dir = split_video_by_frames(input_path, chunk_num_frame, codec)
out_dir
4) Verify Outputs
List the chunked files to confirm.
sorted(list(out_dir.glob("*.mp4")))
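Chunk numbers are zero-padded to two digits, so a plain alphabetical sort only matches chunk order up to 99 chunks. For longer videos, a sketch like the one below sorts by the numeric chunk index instead. `sort_chunks` is a hypothetical helper, and it assumes the `<video_stem>_chunkXX.mp4` naming used in this notebook.

```python
import re
from pathlib import Path
from typing import Iterable, List

# Captures the numeric index in names such as "example_chunk07.mp4".
_CHUNK_RE = re.compile(r"_chunk(\d+)\.mp4$")


def sort_chunks(paths: Iterable[Path]) -> List[Path]:
    """Sort chunk files by numeric index; unrecognized names go last."""
    def key(path: Path):
        match = _CHUNK_RE.search(path.name)
        if match:
            return (0, int(match.group(1)), path.name)
        return (1, 0, path.name)

    return sorted(paths, key=key)


names = ["example_chunk100.mp4", "example_chunk11.mp4", "example_chunk02.mp4"]
print([p.name for p in sort_chunks(Path(n) for n in names)])
# ['example_chunk02.mp4', 'example_chunk11.mp4', 'example_chunk100.mp4']
```

In the notebook this could be called as `sort_chunks(out_dir.glob("*.mp4"))`.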