Deduplication FAQ - Strawberry Manuals

How does deduplication work?

Strawberry first generates a cryptographic hash, also known as a fingerprint, for each file and then detects duplicates by comparing these fingerprints: files with identical content produce the same fingerprint. Once a duplicate is identified, it is deleted and replaced with a hard link to the other file. For any editing application that works with Strawberry (e.g. Adobe© Premiere© Pro, Avid MediaComposer…), this hard link is no different from the deleted file, as it has the same binary content. Additionally, Strawberry ensures that the hard link has the same name as the deleted file, which keeps paths consistent and prevents “media offline” errors. Please have a look at our website for more information.
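
The process described above can be illustrated with a minimal sketch. This is not Strawberry's implementation: the use of SHA-256, the function names fingerprint and deduplicate, and the ".dedup-tmp" suffix are all assumptions made for illustration. The sketch hashes every file below a folder, keeps the first file seen for each hash, and replaces every later file with the same hash by a hard link that has the same name and path.

```python
from __future__ import annotations

import hashlib
import os
from pathlib import Path


def fingerprint(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file (SHA-256 is an assumption)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large media files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def deduplicate(root: Path) -> list[Path]:
    """Replace duplicate files under root with hard links; return the replaced paths."""
    first_seen: dict[str, Path] = {}
    replaced: list[Path] = []
    # Materialise the file list first so temporary links created below are not scanned.
    files = sorted(p for p in root.rglob("*") if p.is_file() and not p.is_symlink())
    for path in files:
        original = first_seen.setdefault(fingerprint(path), path)
        if original is path:
            continue  # first file with this fingerprint, keep it as-is
        if os.path.samefile(original, path):
            continue  # already a hard link to the same data
        # Create the link under a temporary name, then swap it into place so the
        # original path and file name never disappear.
        temp = path.with_name(path.name + ".dedup-tmp")
        os.link(original, temp)
        os.replace(temp, path)
        replaced.append(path)
    return replaced
```

Calling deduplicate(Path("/some/folder")) on a hypothetical folder returns the paths that were turned into hard links. Hard links only work within a single file system, so this sketch assumes that all files live on the same volume.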

Which requirements must a file fulfil to get deduplicated?

The file must have been fingerprinted by Strawberry.
The file must be a duplicate, meaning it must have the same essence fingerprint as at least one other file.
The file must be on production storage in a Strawberry-managed project or in an ingest library. Files in external libraries as well as files in the archive cannot be deduplicated.
The file must be at least 1 megabyte in size.
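
The four requirements above can be read as a single eligibility check. The following sketch is only an illustration: the FileRecord type, the location labels, and the interpretation of 1 megabyte as 1,000,000 bytes are assumptions and do not reflect Strawberry's internal data model.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Mapping, Optional

MIN_SIZE_BYTES = 1_000_000  # assumption: 1 megabyte taken as 10**6 bytes
ELIGIBLE_LOCATIONS = {"project", "ingest_library"}  # hypothetical labels


@dataclass
class FileRecord:
    """Hypothetical description of one file known to the system."""
    path: str
    essence_fingerprint: Optional[str]  # None if the file was never fingerprinted
    location: str  # "project", "ingest_library", "external_library" or "archive"
    size_bytes: int


def is_eligible(record: FileRecord, fingerprint_counts: Mapping[str, int]) -> bool:
    """Return True only if all four requirements from the list are met."""
    if record.essence_fingerprint is None:
        return False  # not fingerprinted
    if fingerprint_counts.get(record.essence_fingerprint, 0) < 2:
        return False  # no other file shares this fingerprint
    if record.location not in ELIGIBLE_LOCATIONS:
        return False  # external libraries and the archive are excluded
    return record.size_bytes >= MIN_SIZE_BYTES


records = [
    FileRecord("/projects/a/clip.mov", "abc", "project", 5_000_000),
    FileRecord("/ingest/clip.mov", "abc", "ingest_library", 5_000_000),
    FileRecord("/archive/clip.mov", "abc", "archive", 5_000_000),
    FileRecord("/projects/a/tiny.txt", "def", "project", 200),
]
counts = Counter(r.essence_fingerprint for r in records if r.essence_fingerprint)
eligible = [r.path for r in records if is_eligible(r, counts)]
```

In this example only the first two records qualify: the archived copy is in an excluded location, and the small text file has no duplicate and is below the size limit.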

How often does the deduplication job run?

The deduplication job runs once per night at around 4 a.m. server time.

When are deduplication reports generated?

Deduplication reports are generated as soon as the nightly deduplication job has deduplicated files. If the job does not deduplicate any files, no new report is generated.