
Video Deduplication

Bayanat integrates with Benetech's video deduplication tool (JusticeAI). Matches that the tool generates for video Bulletins already in Bayanat can be imported, and Bayanat then turns them into Bulletin-to-Bulletin relationships based on the distance between the videos.

Importing Matches

From the Bayanat root directory, activate the virtual environment:

bash
source .venv/bin/activate

Import the CSV file generated by the generate_matches.py script:

bash
flask dedup-import /path/to/matches.csv

The command asks whether to overwrite existing matches.

Processing Matches

The deduplication tool provides a distance value (0 to 0.7) for each match between two videos.

Bayanat uses the following criteria to create relationships:

  • Potentially Duplicate: distance below 0.3
  • Potentially Related: distance between 0.3 and 0.5

These thresholds are based on SJAC's testing and may be adjusted.
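The criteria above can be sketched as a small Python function. This is an illustration only, not Bayanat's actual code: the name `classify_distance` and the constant names are hypothetical. The text does not say which side the boundary values fall on, so treating exactly 0.3 and exactly 0.5 as "Potentially Related" is an assumption, as is returning `None` for larger distances.

```python
from typing import Optional

# Thresholds taken from the criteria above; the constant names are hypothetical.
DUPLICATE_THRESHOLD = 0.3
RELATED_THRESHOLD = 0.5


def classify_distance(distance: float) -> Optional[str]:
    """Map a match distance to a relationship label, or None if no criterion applies."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if distance < DUPLICATE_THRESHOLD:
        return "Potentially Duplicate"
    # Assumption: both boundary values (0.3 and 0.5) count as "Potentially Related".
    if distance <= RELATED_THRESHOLD:
        return "Potentially Related"
    # Assumption: a match beyond 0.5 does not produce a relationship.
    return None
```

Keeping the thresholds in named constants reflects the note that they may be adjusted: changing a cut-off means editing one value.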

Bayanat looks up both Bulletins in each match. If no relationship exists between them yet, a new one is created based on the criteria above.
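The lookup-and-create step can be modelled with plain in-memory data, as a rough sketch rather than Bayanat's implementation. Everything named here is hypothetical: the function `process_matches`, the `(id, id, distance)` tuple shape, the set of known Bulletin ids, and the dictionary of relationships. Skipping matches whose Bulletins are missing, ignoring self-matches, and the boundary handling at 0.3 and 0.5 are assumptions.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Hypothetical match shape: (bulletin id, bulletin id, distance).
Match = Tuple[int, int, float]


def process_matches(
    matches: Iterable[Match],
    bulletin_ids: Set[int],
    relationships: Dict[FrozenSet[int], str],
) -> int:
    """Create a relationship for each match whose Bulletins exist and are unrelated.

    Mutates `relationships` in place and returns the number of entries created.
    """
    created = 0
    for first, second, distance in matches:
        # Look up both Bulletins. Assumption: skip the match if either is missing.
        if first not in bulletin_ids or second not in bulletin_ids:
            continue
        # A frozenset key makes the pair order-independent: (1, 2) equals (2, 1).
        pair = frozenset((first, second))
        # Assumption: a Bulletin matched against itself is ignored.
        if len(pair) < 2:
            continue
        # An existing relationship is left untouched.
        if pair in relationships:
            continue
        if distance < 0.3:
            relationships[pair] = "Potentially Duplicate"
        elif distance <= 0.5:
            relationships[pair] = "Potentially Related"
        else:
            # Assumption: larger distances create no relationship.
            continue
        created += 1
    return created
```

Keying relationships by an unordered pair means a match listed as A-to-B is recognised as already related when the stored relationship is B-to-A.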