File compression glossary

Every important file compression term explained

0
  • 7-Zip (7z)

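    7-Zip is an open-source file archiver; its native 7z archive format often achieves higher compression ratios than ZIP, and is often competitive with RAR, typically by using the LZMA or LZMA2 algorithm.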

A
  • Algorithm

    A specific set of rules or procedures used to compress and decompress data. Common compression algorithms include DEFLATE, LZMA, and Huffman coding.

  • Archive

    A collection of one or more files that have been grouped together for easier storage and transmission. Archives are often compressed but not necessarily so.

  • Arithmetic Coding

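    An entropy coding method that encodes an entire message as a single fractional number in the interval [0, 1), narrowing the interval with each symbol according to that symbol's probability. Practical implementations use fixed-precision integer arithmetic rather than floating-point numbers.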

  • Audio Compression

    Methods (lossy or lossless) used to reduce the size of audio data. MP3 and AAC are lossy formats, FLAC is lossless, and WAV usually stores uncompressed audio.

  • AV1

    An open-source, royalty-free video coding format developed by the Alliance for Open Media.
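
    The interval idea behind the Arithmetic Coding entry above can be sketched in a few lines. This is only a toy illustration, assuming a fixed, hypothetical symbol table (probs) and exact fractions instead of the fixed-precision integer arithmetic that real coders use; the function names are made up for this example.

```python
from fractions import Fraction

def build_ranges(probs):
    """Give each symbol its own [start, end) slice of the unit interval."""
    ranges, start = {}, Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (start, start + p)
        start += p
    return ranges

def encode(message, probs):
    """Narrow [0, 1) once per symbol and return a number inside the result."""
    ranges = build_ranges(probs)
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        start, end = ranges[sym]
        low, width = low + width * start, width * (end - start)
    return low + width / 2  # any value inside the final interval would do

def decode(code, length, probs):
    """Recover `length` symbols by repeatedly locating `code` in a slice."""
    ranges = build_ranges(probs)
    out = []
    for _ in range(length):
        for sym, (start, end) in ranges.items():
            if start <= code < end:
                out.append(sym)
                code = (code - start) / (end - start)  # zoom into the slice
                break
    return "".join(out)

# Hypothetical probabilities; frequent symbols get wider slices.
probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
message = "abac"
code = encode(message, probs)
print(code, decode(code, len(message), probs))
```

    The decoder here needs the message length passed in separately; real formats usually signal the end with a dedicated symbol or a stored length.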

B
  • Bandwidth

    The maximum rate of data transfer across a given path. Compression helps reduce the amount of bandwidth needed to transmit files.

  • Batch Processing

    Processing or compressing multiple files or tasks at once, often in an automated fashion.

  • Bitrate

    The amount of data processed per unit of time, commonly used in audio or video compression. It can be constant (CBR), variable (VBR) or average (ABR).

  • Block-Based Compression

    A compression method that processes data in blocks (chunks) rather than a continuous stream. Examples include certain implementations of LZ-based algorithms.

  • Brotli

    A lossless data compression algorithm developed by Google and widely used for HTTP compression. It typically achieves better compression ratios than gzip on text-based web content such as HTML, CSS, and JavaScript.

  • Bzip2

    A popular open-source, lossless data compression tool and file format that applies the Burrows–Wheeler transform followed by Huffman coding.
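
    As a rough illustration of the Bzip2 and Block-Based Compression entries above, Python's standard bz2 module exposes the bzip2 block size through its compresslevel argument. The sample data below is made up for the example.

```python
import bz2

# Hypothetical sample data: highly repetitive, so it compresses well.
data = b"compression glossary " * 2000

# compresslevel 1-9 selects the bzip2 block size (100 kB up to 900 kB).
small_blocks = bz2.compress(data, compresslevel=1)
large_blocks = bz2.compress(data, compresslevel=9)

# Decompression restores the exact original bytes.
restored = bz2.decompress(large_blocks)

print(len(data), len(small_blocks), len(large_blocks))
```

    For inputs smaller than one block, as here, the block size makes little difference; it matters mainly for large files.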

C
  • Checksum

    A value used to verify the integrity of a file before and after compression or transfer.

  • Chunking

    Splitting a large file into smaller parts (chunks) for easier uploading, downloading, or parallel processing.

  • Cloud-Based Compression

    Performing file compression and decompression using servers in the cloud rather than local hardware.

  • Codec

    Short for "coder-decoder", it refers to the software or hardware that compresses and decompresses data. The term is most commonly used in audio and video file handling.

  • Compression

    The process of encoding information using fewer bits than the original representation, resulting in a smaller file size.

  • Container Format

    A type of file format that can contain various types of data, such as audio, video, and metadata. Examples include MP4, MKV, and AVI for video files.

  • Content-Encoding

    An HTTP header indicating the compression method (such as gzip or br) applied to the transmitted message body.

  • CRC (Cyclic Redundancy Check)

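    An error-detecting code computed from a block of data. Formats such as ZIP, gzip, and PNG store a CRC-32 value so that corruption can be detected after decompression or transfer.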
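
    The Checksum and CRC entries above can be illustrated with Python's standard library. This is a minimal sketch using made-up data: it compares a CRC-32 and a SHA-256 digest before and after a compression round trip, then shows that a single flipped bit changes the CRC.

```python
import hashlib
import zlib

payload = b"example file contents " * 100  # hypothetical file data
compressed = zlib.compress(payload)

# Record integrity values before compression or transfer.
crc_before = zlib.crc32(payload)
sha_before = hashlib.sha256(payload).hexdigest()

# After decompression the values should match exactly.
restored = zlib.decompress(compressed)
crc_after = zlib.crc32(restored)
sha_after = hashlib.sha256(restored).hexdigest()

# Simulate corruption by flipping one bit in a copy of the data.
damaged = bytearray(payload)
damaged[10] ^= 0x01
crc_damaged = zlib.crc32(bytes(damaged))

print(crc_before == crc_after, crc_before == crc_damaged)
```

    A CRC is designed to catch accidental errors; a cryptographic hash such as SHA-256 is the usual choice when deliberate tampering is a concern.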

D
  • Data Deduplication

    A process that eliminates redundant copies of data to reduce storage requirements.

  • Decompression

    The process of restoring compressed data to a usable form: an exact copy of the original for lossless compression, or an approximation of it for lossy compression.

  • Deflate

    A commonly used lossless data compression algorithm that combines LZ77 and Huffman coding, used in formats like ZIP and gzip.

  • Delta Encoding

    A compression technique that stores differences between sequential data rather than the full data.

  • Dictionary-Based Compression

    Compression methods (like LZ77/LZ78) that build and reference a "dictionary" of data segments to reduce redundancy.
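
    The Delta Encoding entry above is easy to show in code. The sketch below stores the first value followed by successive differences; the function names and the sample readings are hypothetical.

```python
def delta_encode(values):
    """Keep the first value, then store each difference from the previous one."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    values, total = [], 0
    for d in deltas:
        total += d
        values.append(total)
    return values

readings = [1000, 1002, 1005, 1005, 1009]  # hypothetical sensor readings
deltas = delta_encode(readings)
print(deltas)  # [1000, 2, 3, 0, 4]
```

    The small differences are not smaller on their own, but they tend to compress much better when passed on to an entropy coder or a general-purpose compressor.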

E
  • Encryption

    Securing data so that it can only be accessed or decrypted by authorized parties, often used alongside compression but distinct from it.

  • Entropy

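    A measure of the randomness (unpredictability) of data. Low-entropy data is more redundant and compresses better; for a given symbol model, Shannon entropy gives a theoretical lower bound on the average number of bits per symbol that lossless coding can achieve.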

  • Entropy Coding

    A technique (like Huffman or Arithmetic coding) that compresses by assigning shorter codes to more frequent symbols.

  • Error Recovery

    Features that allow partial recovery of compressed data even if some portions are corrupted.
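
    To make the Entropy entry above concrete, the following sketch estimates Shannon entropy in bits per byte from byte frequencies. It treats each byte as an independent symbol, which is a simplifying assumption; the function name is made up for this example.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Estimate entropy in bits per byte (0.0 to 8.0) from byte frequencies."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

repetitive = b"a" * 1000           # one symbol only: no uncertainty
mixed = b"ab" * 500                # two equally likely symbols
random_like = os.urandom(100_000)  # close to the 8-bit maximum

for label, blob in [("repetitive", repetitive), ("mixed", mixed), ("random", random_like)]:
    print(label, round(shannon_entropy(blob), 3))
```

    Because this estimate ignores ordering, data such as the "abab..." sample compresses far better in practice than its one bit per byte would suggest.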

F
  • File Conversion

    Changing a file from one format to another (e.g., converting WAV to MP3), which sometimes involves compression as part of the process.

  • File Format

    A standard way that information is encoded for storage in a computer file (e.g., JPG, PNG, MP3, PDF).

  • File Header

    Metadata at the beginning of a compressed file containing information about the compression method and file properties.
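
    As an illustration of the File Header entry above, many compressed formats begin with a fixed "magic number" that identifies them. The sketch below checks the first bytes of some in-memory samples against a small, deliberately incomplete table; the sniff function and the table are hypothetical helpers for this example.

```python
import bz2
import gzip
import io
import lzma
import zipfile

# Leading signature bytes for a few common formats (not an exhaustive list).
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ\x00": "xz",
    b"PK\x03\x04": "zip",
    b"7z\xbc\xaf\x27\x1c": "7z",
}

def sniff(header: bytes) -> str:
    """Guess the format from the first few bytes of a file."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return "unknown"

# Build small samples in memory with the standard library.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("hello.txt", "hello")

samples = {
    "gzip": gzip.compress(b"hello"),
    "bzip2": bz2.compress(b"hello"),
    "xz": lzma.compress(b"hello"),
    "zip": zip_buf.getvalue(),
}

for expected, blob in samples.items():
    print(expected, sniff(blob[:8]))
```

    Checking the header is generally more reliable than trusting a file extension, although an empty ZIP archive starts with a different signature than the one listed here.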

G
  • Gzip

    A software application and file format for compression/decompression based on the DEFLATE algorithm, often used on Unix-like systems.

H
  • H.264 / H.265

    Video coding standards, also known as AVC (H.264) and HEVC (H.265). H.265 is the successor to H.264 and typically delivers comparable image quality at a lower bitrate.

  • HTTP Compression

    The compression of web content before transmission to reduce bandwidth usage and loading times.

  • Huffman Coding

    A technique for lossless data compression that uses variable-length codes based on the frequency of occurrence for each data element.
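
    The Huffman Coding entry above can be sketched with a priority queue. This is a compact teaching version, not how production codecs store their tables: it builds the code table directly while merging nodes, and the function names are made up for this example.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return {symbol: bit string}; frequent symbols get shorter codes."""
    if not text:
        return {}
    freq = Counter(text)
    # Heap entries: (weight, unique tiebreaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def huffman_decode(bits, codes):
    """Decode by matching prefixes; works because no code prefixes another."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

text = "abracadabra"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
print(codes, len(encoded), "bits vs", 8 * len(text))
```

    The exact bit patterns depend on how ties are broken, but the total encoded length is the same for any valid Huffman tree built from these frequencies.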

I
  • Image Compression

    Methods (lossy or lossless) used to reduce the size of image files like TIFF, JPEG, PNG, or WebP.

J
K
L
  • Lossless Compression

    A type of compression where the original data can be perfectly reconstructed from the compressed data (e.g., PNG, FLAC, or ZIP).

  • Lossy Compression

    A type of compression that discards some data, resulting in reduced quality or resolution to achieve smaller file sizes (e.g., JPEG, MP3).
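
    The difference between the two entries above can be shown with a toy example. The lossless path uses zlib and restores every byte; the lossy path simply drops the low 8 bits of each value, a crude stand-in for the far more sophisticated quantization used by real lossy codecs. The sample values are hypothetical.

```python
import zlib

samples = [0, 13, 250, 1021, 4000, 65535]  # hypothetical 16-bit sample values

# Lossless: the round trip returns exactly the original bytes.
raw = b"".join(s.to_bytes(2, "big") for s in samples)
lossless_ok = zlib.decompress(zlib.compress(raw)) == raw

# Lossy (toy quantization): keep only the top 8 bits of each sample.
quantized = [s >> 8 for s in samples]   # half the storage per sample
approx = [q << 8 for q in quantized]    # reconstruction, not the original
errors = [s - a for s, a in zip(samples, approx)]

print(lossless_ok)
print(approx)
print(errors)
```

    The discarded detail cannot be recovered from the quantized values, which is why repeated lossy re-encoding degrades quality further.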

M
  • Metadata

    Data that provides information about other data, such as file attributes (author, date created, etc.). Sometimes removed or optimized to reduce file size.

  • MIME Type

    A standard identifier for file formats and content types, used in web servers and email systems.

  • Multi-Pass Encoding

    A compression technique (often used in video) where the encoder analyzes the data in multiple passes to achieve better optimization.

N
O
P
  • PDF Compression

    Specific techniques used to reduce PDF file sizes, such as image downsampling, font subsetting (embedding only the glyphs actually used), and removing unnecessary metadata.

  • Progressive Loading

    A technique where compressed data is structured to allow partial decompression and viewing before the entire file is processed.

Q
  • Quality Factor

    A setting in lossy compression that dictates the level of detail retained (and therefore file size). Commonly used in JPEG compression.

R
  • RAR

    A proprietary archive format that provides robust compression and optional error recovery features.

  • Resampling

    Changing the resolution or sampling rate of images or audio. Resampling to a lower value (downsampling) is a common way to achieve smaller file sizes.

S
  • Stream Compression

    Real-time compression of data as it's being transmitted, common in network applications.

  • SFX (Self-Extracting Archive)

    An executable archive that bundles the decompression code with the compressed data, allowing users to extract files without installing a separate decompression tool.
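
    The Stream Compression entry above can be approximated with zlib's incremental objects, which accept data chunk by chunk instead of requiring the whole input up front. The generator names and the simulated log lines are assumptions made for this sketch.

```python
import zlib

def compress_stream(chunks):
    """Yield compressed pieces as input chunks arrive."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:  # the compressor may buffer and return nothing yet
            yield piece
    yield compressor.flush()  # emit whatever is still buffered

def decompress_stream(pieces):
    """Yield decompressed data as compressed pieces arrive."""
    decompressor = zlib.decompressobj()
    for piece in pieces:
        out = decompressor.decompress(piece)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail

# Simulated incoming data: 1000 short log lines.
incoming = [b"line %d of a log stream\n" % i for i in range(1000)]
restored = b"".join(decompress_stream(compress_stream(incoming)))
print(len(b"".join(incoming)), len(restored))
```

    Neither side ever needs to hold the complete input in memory, which is what makes this approach suitable for network connections and very large files.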

T
  • TAR

    Short for Tape Archive, a common Unix archiving format that bundles files without compressing them, which is why it is often paired with a compressor such as gzip (e.g., .tar.gz).

  • Transcoding

    The process of converting from one encoding to another, often involving lossy compression in multimedia files.
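
    To illustrate the TAR entry above, Python's tarfile module can write a gzip-compressed tar archive in one step. The sketch below works entirely in memory; the file names and contents are hypothetical.

```python
import io
import tarfile

# Hypothetical files to bundle.
files = {"notes.txt": b"first file\n", "data/values.csv": b"a,b\n1,2\n"}

# Write a .tar.gz archive: tar bundles the files, gzip compresses the bundle.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name, content in files.items():
        info = tarfile.TarInfo(name=name)
        info.size = len(content)
        tar.addfile(info, io.BytesIO(content))

# Read it back and recover each member.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    names = tar.getnames()
    restored = {n: tar.extractfile(n).read() for n in names}

print(names)
```

    Because the whole bundle is compressed as one stream, extracting a single member generally means decompressing everything that precedes it.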

U
V
  • Video Compression

    Reducing the size of video files using codecs like H.264, H.265 (HEVC), or VP9, which often includes spatial and temporal compression techniques.

  • VP8/VP9

    Open, royalty-free video coding formats. VP8 was created by On2 Technologies (2008) and released as open source by Google in 2010 after it acquired On2; VP9 (2013), developed by Google, provides better compression efficiency for high-resolution streaming.

W
X
Y
Z
  • ZIP

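    A common archive and compression format supported natively on Windows, macOS, and most Linux distributions. Each file in the archive is compressed individually, most often with DEFLATE.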
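
    A short sketch of the ZIP entry above using Python's zipfile module. It writes two made-up members to an in-memory archive with DEFLATE, then verifies the stored CRCs and reads one member back.

```python
import io
import zipfile

buf = io.BytesIO()

# Write an archive; each member is compressed separately with DEFLATE.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("readme.txt", "hello " * 500)
    zf.writestr("empty.txt", "")

# Reopen it, check integrity, and inspect per-member sizes.
with zipfile.ZipFile(buf) as zf:
    first_bad = zf.testzip()  # None means every stored CRC matched
    readme = zf.getinfo("readme.txt")
    text = zf.read("readme.txt").decode()
    for info in zf.infolist():
        print(info.filename, info.file_size, info.compress_size)
```

    Since members are compressed independently, a single file can be extracted without reading the rest of the archive.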