Bayanat has a simple integration with Benetech's video deduplication tool (JusticeAI). You can import matches generated by the tool for video Bulletins already in Bayanat. Bayanat will then process these matches into Bulletin-to-Bulletin relationships according to the distance between the videos.
In the root directory of the Bayanat installation, first activate the virtual environment and set the FLASK_APP variable:
source env/bin/activate
export FLASK_APP=run.py
You can then use the following command to import a CSV file which was generated by the generate_matches.py script:
flask dedup-import /path/to/csv
The command will ask whether to overwrite existing matches in the database.
Benetech's deduplication tool provides a distance value for each match between two videos. This value is between 0 and 0.7.
Once the matches have been imported into Bayanat, they can be viewed and processed in the Deduplication Dashboard.
The criteria which Bayanat uses to add relationships in the database are as follows:
Potentially Duplicate: for matches with a distance below 0.3
Potentially Related: for matches with a distance between 0.3 and 0.5
These criteria are based on SJAC's own testing of the deduplication tool's output so far. They might be adjusted in the future as we test and verify more matches.
Bayanat will look up the two Bulletins in each match. If no relationship exists between the two Bulletins, a new relationship based on the above criteria will be created in the database.
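The criteria above can be illustrated with a minimal Python sketch. This is not Bayanat's actual code: the names classify_match and plan_relationships are hypothetical, matches are modelled as simple (bulletin_id, bulletin_id, distance) tuples, and two behaviours are assumptions rather than documented facts: a distance of exactly 0.5 is treated as Potentially Related, and matches with a distance above 0.5 produce no relationship.

```python
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple

# Thresholds taken from the criteria described above.
DUPLICATE_THRESHOLD = 0.3
RELATED_THRESHOLD = 0.5


def classify_match(distance: float) -> Optional[str]:
    """Map a match distance to a relationship label, or None if no label applies."""
    if distance < 0:
        raise ValueError("distance must not be negative")
    if distance < DUPLICATE_THRESHOLD:
        return "Potentially Duplicate"
    # Assumption: the upper bound of the 0.3-0.5 range is inclusive.
    if distance <= RELATED_THRESHOLD:
        return "Potentially Related"
    # Assumption: matches above 0.5 do not create a relationship.
    return None


def plan_relationships(
    matches: Iterable[Tuple[int, int, float]],
    existing: Set[FrozenSet[int]],
) -> List[Tuple[int, int, str]]:
    """Return the relationships that would be created for the given matches.

    `existing` holds the pairs of Bulletin IDs that are already related;
    those pairs are skipped, mirroring the lookup step described above.
    """
    planned = []
    for first_id, second_id, distance in matches:
        label = classify_match(distance)
        if label is None:
            continue
        if frozenset((first_id, second_id)) in existing:
            continue
        planned.append((first_id, second_id, label))
    return planned
```

For example, given the matches (1, 2, 0.1), (2, 3, 0.4), (3, 4, 0.65) and (4, 5, 0.2), with Bulletins 4 and 5 already related, this sketch would plan a Potentially Duplicate relationship for Bulletins 1 and 2 and a Potentially Related relationship for Bulletins 2 and 3, and skip the other two matches.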